From 8413fe3e7fdfb412baa9e31b3f1d2eaf6d21a373 Mon Sep 17 00:00:00 2001 From: Weili Qian Date: Fri, 12 Jan 2024 18:25:45 +0800 Subject: crypto: hisilicon/qm - support get device state Support get device current state. The value 0 indicates that the device is busy, and the value 1 indicates that the device is idle. When the device is in suspended, 1 is returned. Signed-off-by: Weili Qian Signed-off-by: Herbert Xu --- Documentation/ABI/testing/debugfs-hisi-hpre | 7 +++++++ Documentation/ABI/testing/debugfs-hisi-sec | 7 +++++++ Documentation/ABI/testing/debugfs-hisi-zip | 7 +++++++ 3 files changed, 21 insertions(+) (limited to 'Documentation/ABI') diff --git a/Documentation/ABI/testing/debugfs-hisi-hpre b/Documentation/ABI/testing/debugfs-hisi-hpre index 8e8de49c5cc6..6ed9258605c7 100644 --- a/Documentation/ABI/testing/debugfs-hisi-hpre +++ b/Documentation/ABI/testing/debugfs-hisi-hpre @@ -111,6 +111,13 @@ Description: QM debug registers(regs) read hardware register value. This node is used to show the change of the qm register values. This node can be help users to check the change of register values. +What: /sys/kernel/debug/hisi_hpre//qm/qm_state +Date: Jan 2024 +Contact: linux-crypto@vger.kernel.org +Description: Dump the state of the device. + 0: busy, 1: idle. + Only available for PF, and take no other effect on HPRE. + What: /sys/kernel/debug/hisi_hpre//hpre_dfx/diff_regs Date: Mar 2022 Contact: linux-crypto@vger.kernel.org diff --git a/Documentation/ABI/testing/debugfs-hisi-sec b/Documentation/ABI/testing/debugfs-hisi-sec index deeefe2c735e..403f5de96318 100644 --- a/Documentation/ABI/testing/debugfs-hisi-sec +++ b/Documentation/ABI/testing/debugfs-hisi-sec @@ -91,6 +91,13 @@ Description: QM debug registers(regs) read hardware register value. This node is used to show the change of the qm register values. This node can be help users to check the change of register values. +What: /sys/kernel/debug/hisi_sec2//qm/qm_state +Date: Jan 2024 +Contact: linux-crypto@vger.kernel.org +Description: Dump the state of the device. + 0: busy, 1: idle. + Only available for PF, and take no other effect on SEC. + What: /sys/kernel/debug/hisi_sec2//sec_dfx/diff_regs Date: Mar 2022 Contact: linux-crypto@vger.kernel.org diff --git a/Documentation/ABI/testing/debugfs-hisi-zip b/Documentation/ABI/testing/debugfs-hisi-zip index 593714afaed2..2394e6a3cfe2 100644 --- a/Documentation/ABI/testing/debugfs-hisi-zip +++ b/Documentation/ABI/testing/debugfs-hisi-zip @@ -104,6 +104,13 @@ Description: QM debug registers(regs) read hardware register value. This node is used to show the change of the qm registers value. This node can be help users to check the change of register values. +What: /sys/kernel/debug/hisi_zip//qm/qm_state +Date: Jan 2024 +Contact: linux-crypto@vger.kernel.org +Description: Dump the state of the device. + 0: busy, 1: idle. + Only available for PF, and take no other effect on ZIP. + What: /sys/kernel/debug/hisi_zip//zip_dfx/diff_regs Date: Mar 2022 Contact: linux-crypto@vger.kernel.org -- cgit v1.2.3 From e2b67859ab6efd4458bda1baaee20331a367d995 Mon Sep 17 00:00:00 2001 From: Damian Muszynski Date: Fri, 2 Feb 2024 18:53:16 +0800 Subject: crypto: qat - add heartbeat error simulator Add a mechanism that allows to inject a heartbeat error for testing purposes. A new attribute `inject_error` is added to debugfs for each QAT device. Upon a write on this attribute, the driver will inject an error on the device which can then be detected by the heartbeat feature. Errors are breaking the device functionality thus they require a device reset in order to be recovered. This functionality is not compiled by default, to enable it CRYPTO_DEV_QAT_ERROR_INJECTION must be set. Signed-off-by: Damian Muszynski Reviewed-by: Giovanni Cabiddu Reviewed-by: Lucas Segarra Fernandez Reviewed-by: Ahsan Atta Reviewed-by: Markas Rapoportas Signed-off-by: Mun Chun Yep Signed-off-by: Herbert Xu --- Documentation/ABI/testing/debugfs-driver-qat | 26 ++++++++ drivers/crypto/intel/qat/Kconfig | 14 ++++ drivers/crypto/intel/qat/qat_common/Makefile | 2 + .../crypto/intel/qat/qat_common/adf_common_drv.h | 1 + .../crypto/intel/qat/qat_common/adf_heartbeat.c | 6 -- .../crypto/intel/qat/qat_common/adf_heartbeat.h | 18 +++++ .../intel/qat/qat_common/adf_heartbeat_dbgfs.c | 52 +++++++++++++++ .../intel/qat/qat_common/adf_heartbeat_inject.c | 76 ++++++++++++++++++++++ .../crypto/intel/qat/qat_common/adf_hw_arbiter.c | 25 +++++++ 9 files changed, 214 insertions(+), 6 deletions(-) create mode 100644 drivers/crypto/intel/qat/qat_common/adf_heartbeat_inject.c (limited to 'Documentation/ABI') diff --git a/Documentation/ABI/testing/debugfs-driver-qat b/Documentation/ABI/testing/debugfs-driver-qat index b2db010d851e..bd6793760f29 100644 --- a/Documentation/ABI/testing/debugfs-driver-qat +++ b/Documentation/ABI/testing/debugfs-driver-qat @@ -81,3 +81,29 @@ Description: (RO) Read returns, for each Acceleration Engine (AE), the number : Number of Compress and Verify (CnV) errors and type of the last CnV error detected by Acceleration Engine N. + +What: /sys/kernel/debug/qat__/heartbeat/inject_error +Date: March 2024 +KernelVersion: 6.8 +Contact: qat-linux@intel.com +Description: (WO) Write to inject an error that simulates an heartbeat + failure. This is to be used for testing purposes. + + After writing this file, the driver stops arbitration on a + random engine and disables the fetching of heartbeat counters. + If a workload is running on the device, a job submitted to the + accelerator might not get a response and a read of the + `heartbeat/status` attribute might report -1, i.e. device + unresponsive. + The error is unrecoverable thus the device must be restarted to + restore its functionality. + + This attribute is available only when the kernel is built with + CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION=y. + + A write of 1 enables error injection. + + The following example shows how to enable error injection:: + + # cd /sys/kernel/debug/qat__ + # echo 1 > heartbeat/inject_error diff --git a/drivers/crypto/intel/qat/Kconfig b/drivers/crypto/intel/qat/Kconfig index c120f6715a09..02fb8abe4e6e 100644 --- a/drivers/crypto/intel/qat/Kconfig +++ b/drivers/crypto/intel/qat/Kconfig @@ -106,3 +106,17 @@ config CRYPTO_DEV_QAT_C62XVF To compile this as a module, choose M here: the module will be called qat_c62xvf. + +config CRYPTO_DEV_QAT_ERROR_INJECTION + bool "Support for Intel(R) QAT Devices Heartbeat Error Injection" + depends on CRYPTO_DEV_QAT + depends on DEBUG_FS + help + Enables a mechanism that allows to inject a heartbeat error on + Intel(R) QuickAssist devices for testing purposes. + + This is intended for developer use only. + If unsure, say N. + + This functionality is available via debugfs entry of the Intel(R) + QuickAssist device diff --git a/drivers/crypto/intel/qat/qat_common/Makefile b/drivers/crypto/intel/qat/qat_common/Makefile index 6908727bff3b..5915cde8a7aa 100644 --- a/drivers/crypto/intel/qat/qat_common/Makefile +++ b/drivers/crypto/intel/qat/qat_common/Makefile @@ -53,3 +53,5 @@ intel_qat-$(CONFIG_PCI_IOV) += adf_sriov.o adf_vf_isr.o adf_pfvf_utils.o \ adf_pfvf_pf_msg.o adf_pfvf_pf_proto.o \ adf_pfvf_vf_msg.o adf_pfvf_vf_proto.o \ adf_gen2_pfvf.o adf_gen4_pfvf.o + +intel_qat-$(CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION) += adf_heartbeat_inject.o diff --git a/drivers/crypto/intel/qat/qat_common/adf_common_drv.h b/drivers/crypto/intel/qat/qat_common/adf_common_drv.h index f06188033a93..0baae42deb3a 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_common_drv.h +++ b/drivers/crypto/intel/qat/qat_common/adf_common_drv.h @@ -90,6 +90,7 @@ void adf_exit_aer(void); int adf_init_arb(struct adf_accel_dev *accel_dev); void adf_exit_arb(struct adf_accel_dev *accel_dev); void adf_update_ring_arb(struct adf_etr_ring_data *ring); +int adf_disable_arb_thd(struct adf_accel_dev *accel_dev, u32 ae, u32 thr); int adf_dev_get(struct adf_accel_dev *accel_dev); void adf_dev_put(struct adf_accel_dev *accel_dev); diff --git a/drivers/crypto/intel/qat/qat_common/adf_heartbeat.c b/drivers/crypto/intel/qat/qat_common/adf_heartbeat.c index 13f48d2f6da8..f88b1bc6857e 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_heartbeat.c +++ b/drivers/crypto/intel/qat/qat_common/adf_heartbeat.c @@ -23,12 +23,6 @@ #define ADF_HB_EMPTY_SIG 0xA5A5A5A5 -/* Heartbeat counter pair */ -struct hb_cnt_pair { - __u16 resp_heartbeat_cnt; - __u16 req_heartbeat_cnt; -}; - static int adf_hb_check_polling_freq(struct adf_accel_dev *accel_dev) { u64 curr_time = adf_clock_get_current_time(); diff --git a/drivers/crypto/intel/qat/qat_common/adf_heartbeat.h b/drivers/crypto/intel/qat/qat_common/adf_heartbeat.h index b22e3cb29798..24c3f4f24c86 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_heartbeat.h +++ b/drivers/crypto/intel/qat/qat_common/adf_heartbeat.h @@ -19,6 +19,12 @@ enum adf_device_heartbeat_status { HB_DEV_UNSUPPORTED, }; +/* Heartbeat counter pair */ +struct hb_cnt_pair { + __u16 resp_heartbeat_cnt; + __u16 req_heartbeat_cnt; +}; + struct adf_heartbeat { unsigned int hb_sent_counter; unsigned int hb_failed_counter; @@ -35,6 +41,9 @@ struct adf_heartbeat { struct dentry *cfg; struct dentry *sent; struct dentry *failed; +#ifdef CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION + struct dentry *inject_error; +#endif } dbgfs; }; @@ -51,6 +60,15 @@ void adf_heartbeat_status(struct adf_accel_dev *accel_dev, enum adf_device_heartbeat_status *hb_status); void adf_heartbeat_check_ctrs(struct adf_accel_dev *accel_dev); +#ifdef CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION +int adf_heartbeat_inject_error(struct adf_accel_dev *accel_dev); +#else +static inline int adf_heartbeat_inject_error(struct adf_accel_dev *accel_dev) +{ + return -EPERM; +} +#endif + #else static inline int adf_heartbeat_init(struct adf_accel_dev *accel_dev) { diff --git a/drivers/crypto/intel/qat/qat_common/adf_heartbeat_dbgfs.c b/drivers/crypto/intel/qat/qat_common/adf_heartbeat_dbgfs.c index 2661af6a2ef6..5cd6c2d6f90a 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_heartbeat_dbgfs.c +++ b/drivers/crypto/intel/qat/qat_common/adf_heartbeat_dbgfs.c @@ -155,6 +155,43 @@ static const struct file_operations adf_hb_cfg_fops = { .write = adf_hb_cfg_write, }; +static ssize_t adf_hb_error_inject_write(struct file *file, + const char __user *user_buf, + size_t count, loff_t *ppos) +{ + struct adf_accel_dev *accel_dev = file->private_data; + size_t written_chars; + char buf[3]; + int ret; + + /* last byte left as string termination */ + if (count != 2) + return -EINVAL; + + written_chars = simple_write_to_buffer(buf, sizeof(buf) - 1, + ppos, user_buf, count); + if (buf[0] != '1') + return -EINVAL; + + ret = adf_heartbeat_inject_error(accel_dev); + if (ret) { + dev_err(&GET_DEV(accel_dev), + "Heartbeat error injection failed with status %d\n", + ret); + return ret; + } + + dev_info(&GET_DEV(accel_dev), "Heartbeat error injection enabled\n"); + + return written_chars; +} + +static const struct file_operations adf_hb_error_inject_fops = { + .owner = THIS_MODULE, + .open = simple_open, + .write = adf_hb_error_inject_write, +}; + void adf_heartbeat_dbgfs_add(struct adf_accel_dev *accel_dev) { struct adf_heartbeat *hb = accel_dev->heartbeat; @@ -171,6 +208,17 @@ void adf_heartbeat_dbgfs_add(struct adf_accel_dev *accel_dev) &hb->hb_failed_counter, &adf_hb_stats_fops); hb->dbgfs.cfg = debugfs_create_file("config", 0600, hb->dbgfs.base_dir, accel_dev, &adf_hb_cfg_fops); + + if (IS_ENABLED(CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION)) { + struct dentry *inject_error __maybe_unused; + + inject_error = debugfs_create_file("inject_error", 0200, + hb->dbgfs.base_dir, accel_dev, + &adf_hb_error_inject_fops); +#ifdef CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION + hb->dbgfs.inject_error = inject_error; +#endif + } } EXPORT_SYMBOL_GPL(adf_heartbeat_dbgfs_add); @@ -189,6 +237,10 @@ void adf_heartbeat_dbgfs_rm(struct adf_accel_dev *accel_dev) hb->dbgfs.failed = NULL; debugfs_remove(hb->dbgfs.cfg); hb->dbgfs.cfg = NULL; +#ifdef CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION + debugfs_remove(hb->dbgfs.inject_error); + hb->dbgfs.inject_error = NULL; +#endif debugfs_remove(hb->dbgfs.base_dir); hb->dbgfs.base_dir = NULL; } diff --git a/drivers/crypto/intel/qat/qat_common/adf_heartbeat_inject.c b/drivers/crypto/intel/qat/qat_common/adf_heartbeat_inject.c new file mode 100644 index 000000000000..a3b474bdef6c --- /dev/null +++ b/drivers/crypto/intel/qat/qat_common/adf_heartbeat_inject.c @@ -0,0 +1,76 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright(c) 2023 Intel Corporation */ +#include + +#include "adf_admin.h" +#include "adf_common_drv.h" +#include "adf_heartbeat.h" + +#define MAX_HB_TICKS 0xFFFFFFFF + +static int adf_hb_set_timer_to_max(struct adf_accel_dev *accel_dev) +{ + struct adf_hw_device_data *hw_data = accel_dev->hw_device; + + accel_dev->heartbeat->hb_timer = 0; + + if (hw_data->stop_timer) + hw_data->stop_timer(accel_dev); + + return adf_send_admin_hb_timer(accel_dev, MAX_HB_TICKS); +} + +static void adf_set_hb_counters_fail(struct adf_accel_dev *accel_dev, u32 ae, + u32 thr) +{ + struct hb_cnt_pair *stats = accel_dev->heartbeat->dma.virt_addr; + struct adf_hw_device_data *hw_device = accel_dev->hw_device; + const size_t max_aes = hw_device->get_num_aes(hw_device); + const size_t hb_ctrs = hw_device->num_hb_ctrs; + size_t thr_id = ae * hb_ctrs + thr; + u16 num_rsp = stats[thr_id].resp_heartbeat_cnt; + + /* + * Inject live.req != live.rsp and live.rsp == last.rsp + * to trigger the heartbeat error detection + */ + stats[thr_id].req_heartbeat_cnt++; + stats += (max_aes * hb_ctrs); + stats[thr_id].resp_heartbeat_cnt = num_rsp; +} + +int adf_heartbeat_inject_error(struct adf_accel_dev *accel_dev) +{ + struct adf_hw_device_data *hw_device = accel_dev->hw_device; + const size_t max_aes = hw_device->get_num_aes(hw_device); + const size_t hb_ctrs = hw_device->num_hb_ctrs; + u32 rand, rand_ae, rand_thr; + unsigned long ae_mask; + int ret; + + ae_mask = hw_device->ae_mask; + + do { + /* Ensure we have a valid ae */ + get_random_bytes(&rand, sizeof(rand)); + rand_ae = rand % max_aes; + } while (!test_bit(rand_ae, &ae_mask)); + + get_random_bytes(&rand, sizeof(rand)); + rand_thr = rand % hb_ctrs; + + /* Increase the heartbeat timer to prevent FW updating HB counters */ + ret = adf_hb_set_timer_to_max(accel_dev); + if (ret) + return ret; + + /* Configure worker threads to stop processing any packet */ + ret = adf_disable_arb_thd(accel_dev, rand_ae, rand_thr); + if (ret) + return ret; + + /* Change HB counters memory to simulate a hang */ + adf_set_hb_counters_fail(accel_dev, rand_ae, rand_thr); + + return 0; +} diff --git a/drivers/crypto/intel/qat/qat_common/adf_hw_arbiter.c b/drivers/crypto/intel/qat/qat_common/adf_hw_arbiter.c index da6956699246..65bd26b25abc 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_hw_arbiter.c +++ b/drivers/crypto/intel/qat/qat_common/adf_hw_arbiter.c @@ -103,3 +103,28 @@ void adf_exit_arb(struct adf_accel_dev *accel_dev) csr_ops->write_csr_ring_srv_arb_en(csr, i, 0); } EXPORT_SYMBOL_GPL(adf_exit_arb); + +int adf_disable_arb_thd(struct adf_accel_dev *accel_dev, u32 ae, u32 thr) +{ + void __iomem *csr = accel_dev->transport->banks[0].csr_addr; + struct adf_hw_device_data *hw_data = accel_dev->hw_device; + const u32 *thd_2_arb_cfg; + struct arb_info info; + u32 ae_thr_map; + + if (ADF_AE_STRAND0_THREAD == thr || ADF_AE_STRAND1_THREAD == thr) + thr = ADF_AE_ADMIN_THREAD; + + hw_data->get_arb_info(&info); + thd_2_arb_cfg = hw_data->get_arb_mapping(accel_dev); + if (!thd_2_arb_cfg) + return -EFAULT; + + /* Disable scheduling for this particular AE and thread */ + ae_thr_map = *(thd_2_arb_cfg + ae); + ae_thr_map &= ~(GENMASK(3, 0) << (thr * BIT(2))); + + WRITE_CSR_ARB_WT2SAM(csr, info.arb_offset, info.wt2sam_offset, ae, + ae_thr_map); + return 0; +} -- cgit v1.2.3 From f5419a4239af8b3951f990c83d0d8c865a485475 Mon Sep 17 00:00:00 2001 From: Damian Muszynski Date: Fri, 2 Feb 2024 18:53:22 +0800 Subject: crypto: qat - add auto reset on error Expose the `auto_reset` sysfs attribute to configure the driver to reset the device when a fatal error is detected. When auto reset is enabled, the driver resets the device when it detects either an heartbeat failure or a fatal error through an interrupt. This patch is based on earlier work done by Shashank Gupta. Signed-off-by: Damian Muszynski Reviewed-by: Ahsan Atta Reviewed-by: Markas Rapoportas Reviewed-by: Giovanni Cabiddu Signed-off-by: Mun Chun Yep Signed-off-by: Herbert Xu --- Documentation/ABI/testing/sysfs-driver-qat | 20 ++++++++++++ .../intel/qat/qat_common/adf_accel_devices.h | 1 + drivers/crypto/intel/qat/qat_common/adf_aer.c | 11 ++++++- .../crypto/intel/qat/qat_common/adf_common_drv.h | 1 + drivers/crypto/intel/qat/qat_common/adf_sysfs.c | 37 ++++++++++++++++++++++ 5 files changed, 69 insertions(+), 1 deletion(-) (limited to 'Documentation/ABI') diff --git a/Documentation/ABI/testing/sysfs-driver-qat b/Documentation/ABI/testing/sysfs-driver-qat index bbf329cf0d67..6778f1fea874 100644 --- a/Documentation/ABI/testing/sysfs-driver-qat +++ b/Documentation/ABI/testing/sysfs-driver-qat @@ -141,3 +141,23 @@ Description: 64 This attribute is only available for qat_4xxx devices. + +What: /sys/bus/pci/devices//qat/auto_reset +Date: March 2024 +KernelVersion: 6.8 +Contact: qat-linux@intel.com +Description: (RW) Reports the current state of the autoreset feature + for a QAT device + + Write to the attribute to enable or disable device auto reset. + + Device auto reset is disabled by default. + + The values are:: + + * 1/Yy/on: auto reset enabled. If the device encounters an + unrecoverable error, it will be reset automatically. + * 0/Nn/off: auto reset disabled. If the device encounters an + unrecoverable error, it will not be reset. + + This attribute is only available for qat_4xxx devices. diff --git a/drivers/crypto/intel/qat/qat_common/adf_accel_devices.h b/drivers/crypto/intel/qat/qat_common/adf_accel_devices.h index 4a3c36aaa7ca..0f26aa976c8c 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_accel_devices.h +++ b/drivers/crypto/intel/qat/qat_common/adf_accel_devices.h @@ -402,6 +402,7 @@ struct adf_accel_dev { struct adf_error_counters ras_errors; struct mutex state_lock; /* protect state of the device */ bool is_vf; + bool autoreset_on_error; u32 accel_id; }; #endif diff --git a/drivers/crypto/intel/qat/qat_common/adf_aer.c b/drivers/crypto/intel/qat/qat_common/adf_aer.c index cd273b31db0e..b3d4b6b99c65 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_aer.c +++ b/drivers/crypto/intel/qat/qat_common/adf_aer.c @@ -204,6 +204,14 @@ const struct pci_error_handlers adf_err_handler = { }; EXPORT_SYMBOL_GPL(adf_err_handler); +int adf_dev_autoreset(struct adf_accel_dev *accel_dev) +{ + if (accel_dev->autoreset_on_error) + return adf_dev_aer_schedule_reset(accel_dev, ADF_DEV_RESET_ASYNC); + + return 0; +} + static void adf_notify_fatal_error_worker(struct work_struct *work) { struct adf_fatal_error_data *wq_data = @@ -215,10 +223,11 @@ static void adf_notify_fatal_error_worker(struct work_struct *work) if (!accel_dev->is_vf) { /* Disable arbitration to stop processing of new requests */ - if (hw_device->exit_arb) + if (accel_dev->autoreset_on_error && hw_device->exit_arb) hw_device->exit_arb(accel_dev); if (accel_dev->pf.vf_info) adf_pf2vf_notify_fatal_error(accel_dev); + adf_dev_autoreset(accel_dev); } kfree(wq_data); diff --git a/drivers/crypto/intel/qat/qat_common/adf_common_drv.h b/drivers/crypto/intel/qat/qat_common/adf_common_drv.h index 10891c9da6e7..57328249c89e 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_common_drv.h +++ b/drivers/crypto/intel/qat/qat_common/adf_common_drv.h @@ -87,6 +87,7 @@ int adf_ae_stop(struct adf_accel_dev *accel_dev); extern const struct pci_error_handlers adf_err_handler; void adf_reset_sbr(struct adf_accel_dev *accel_dev); void adf_reset_flr(struct adf_accel_dev *accel_dev); +int adf_dev_autoreset(struct adf_accel_dev *accel_dev); void adf_dev_restore(struct adf_accel_dev *accel_dev); int adf_init_aer(void); void adf_exit_aer(void); diff --git a/drivers/crypto/intel/qat/qat_common/adf_sysfs.c b/drivers/crypto/intel/qat/qat_common/adf_sysfs.c index d450dad32c9e..4e7f70d4049d 100644 --- a/drivers/crypto/intel/qat/qat_common/adf_sysfs.c +++ b/drivers/crypto/intel/qat/qat_common/adf_sysfs.c @@ -204,6 +204,42 @@ static ssize_t pm_idle_enabled_store(struct device *dev, struct device_attribute } static DEVICE_ATTR_RW(pm_idle_enabled); +static ssize_t auto_reset_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + char *auto_reset; + struct adf_accel_dev *accel_dev; + + accel_dev = adf_devmgr_pci_to_accel_dev(to_pci_dev(dev)); + if (!accel_dev) + return -EINVAL; + + auto_reset = accel_dev->autoreset_on_error ? "on" : "off"; + + return sysfs_emit(buf, "%s\n", auto_reset); +} + +static ssize_t auto_reset_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct adf_accel_dev *accel_dev; + bool enabled = false; + int ret; + + ret = kstrtobool(buf, &enabled); + if (ret) + return ret; + + accel_dev = adf_devmgr_pci_to_accel_dev(to_pci_dev(dev)); + if (!accel_dev) + return -EINVAL; + + accel_dev->autoreset_on_error = enabled; + + return count; +} +static DEVICE_ATTR_RW(auto_reset); + static DEVICE_ATTR_RW(state); static DEVICE_ATTR_RW(cfg_services); @@ -291,6 +327,7 @@ static struct attribute *qat_attrs[] = { &dev_attr_pm_idle_enabled.attr, &dev_attr_rp2srv.attr, &dev_attr_num_rps.attr, + &dev_attr_auto_reset.attr, NULL, }; -- cgit v1.2.3 From ce133a22123055f5f988499cd9ac7953d2bf0677 Mon Sep 17 00:00:00 2001 From: Weili Qian Date: Wed, 7 Feb 2024 17:51:00 +0800 Subject: crypto: hisilicon/qm - obtain stop queue status The debugfs files 'dev_state' and 'dev_timeout' are added. Users can query the current queue stop status through these two files. And set the waiting timeout when the queue is released. dev_state: if dev_timeout is set, dev_state indicates the status of stopping the queue. 0 indicates that the queue is stopped successfully. Other values indicate that the queue stops fail. If dev_timeout is not set, the value of dev_state is 0; dev_timeout: if the queue fails to stop, the queue is released after waiting dev_timeout * 20ms. Signed-off-by: Weili Qian Signed-off-by: Herbert Xu --- Documentation/ABI/testing/debugfs-hisi-hpre | 15 ++++ Documentation/ABI/testing/debugfs-hisi-sec | 15 ++++ Documentation/ABI/testing/debugfs-hisi-zip | 15 ++++ drivers/crypto/hisilicon/debugfs.c | 5 ++ drivers/crypto/hisilicon/qm.c | 108 +++++++++++++++++++++------- include/linux/hisi_acc_qm.h | 6 ++ 6 files changed, 138 insertions(+), 26 deletions(-) (limited to 'Documentation/ABI') diff --git a/Documentation/ABI/testing/debugfs-hisi-hpre b/Documentation/ABI/testing/debugfs-hisi-hpre index 6ed9258605c7..d4e16ef9ac9a 100644 --- a/Documentation/ABI/testing/debugfs-hisi-hpre +++ b/Documentation/ABI/testing/debugfs-hisi-hpre @@ -118,6 +118,21 @@ Description: Dump the state of the device. 0: busy, 1: idle. Only available for PF, and take no other effect on HPRE. +What: /sys/kernel/debug/hisi_hpre//qm/dev_timeout +Date: Feb 2024 +Contact: linux-crypto@vger.kernel.org +Description: Set the wait time when stop queue fails. Available for both PF + and VF, and take no other effect on HPRE. + 0: not wait(default), others value: wait dev_timeout * 20 microsecond. + +What: /sys/kernel/debug/hisi_hpre//qm/dev_state +Date: Feb 2024 +Contact: linux-crypto@vger.kernel.org +Description: Dump the stop queue status of the QM. The default value is 0, + if dev_timeout is set, when stop queue fails, the dev_state + will return non-zero value. Available for both PF and VF, + and take no other effect on HPRE. + What: /sys/kernel/debug/hisi_hpre//hpre_dfx/diff_regs Date: Mar 2022 Contact: linux-crypto@vger.kernel.org diff --git a/Documentation/ABI/testing/debugfs-hisi-sec b/Documentation/ABI/testing/debugfs-hisi-sec index 403f5de96318..6c6c9a6e150a 100644 --- a/Documentation/ABI/testing/debugfs-hisi-sec +++ b/Documentation/ABI/testing/debugfs-hisi-sec @@ -98,6 +98,21 @@ Description: Dump the state of the device. 0: busy, 1: idle. Only available for PF, and take no other effect on SEC. +What: /sys/kernel/debug/hisi_sec2//qm/dev_timeout +Date: Feb 2024 +Contact: linux-crypto@vger.kernel.org +Description: Set the wait time when stop queue fails. Available for both PF + and VF, and take no other effect on SEC. + 0: not wait(default), others value: wait dev_timeout * 20 microsecond. + +What: /sys/kernel/debug/hisi_sec2//qm/dev_state +Date: Feb 2024 +Contact: linux-crypto@vger.kernel.org +Description: Dump the stop queue status of the QM. The default value is 0, + if dev_timeout is set, when stop queue fails, the dev_state + will return non-zero value. Available for both PF and VF, + and take no other effect on SEC. + What: /sys/kernel/debug/hisi_sec2//sec_dfx/diff_regs Date: Mar 2022 Contact: linux-crypto@vger.kernel.org diff --git a/Documentation/ABI/testing/debugfs-hisi-zip b/Documentation/ABI/testing/debugfs-hisi-zip index 2394e6a3cfe2..a22dd6942219 100644 --- a/Documentation/ABI/testing/debugfs-hisi-zip +++ b/Documentation/ABI/testing/debugfs-hisi-zip @@ -111,6 +111,21 @@ Description: Dump the state of the device. 0: busy, 1: idle. Only available for PF, and take no other effect on ZIP. +What: /sys/kernel/debug/hisi_zip//qm/dev_timeout +Date: Feb 2024 +Contact: linux-crypto@vger.kernel.org +Description: Set the wait time when stop queue fails. Available for both PF + and VF, and take no other effect on ZIP. + 0: not wait(default), others value: wait dev_timeout * 20 microsecond. + +What: /sys/kernel/debug/hisi_zip//qm/dev_state +Date: Feb 2024 +Contact: linux-crypto@vger.kernel.org +Description: Dump the stop queue status of the QM. The default value is 0, + if dev_timeout is set, when stop queue fails, the dev_state + will return non-zero value. Available for both PF and VF, + and take no other effect on ZIP. + What: /sys/kernel/debug/hisi_zip//zip_dfx/diff_regs Date: Mar 2022 Contact: linux-crypto@vger.kernel.org diff --git a/drivers/crypto/hisilicon/debugfs.c b/drivers/crypto/hisilicon/debugfs.c index 06e67eda409f..cd67fa348ca7 100644 --- a/drivers/crypto/hisilicon/debugfs.c +++ b/drivers/crypto/hisilicon/debugfs.c @@ -1112,6 +1112,7 @@ DEFINE_DEBUGFS_ATTRIBUTE(qm_atomic64_ops, qm_debugfs_atomic64_get, void hisi_qm_debug_init(struct hisi_qm *qm) { struct dfx_diff_registers *qm_regs = qm->debug.qm_diff_regs; + struct qm_dev_dfx *dev_dfx = &qm->debug.dev_dfx; struct qm_dfx *dfx = &qm->debug.dfx; struct dentry *qm_d; void *data; @@ -1140,6 +1141,10 @@ void hisi_qm_debug_init(struct hisi_qm *qm) debugfs_create_file("status", 0444, qm->debug.qm_d, qm, &qm_status_fops); + + debugfs_create_u32("dev_state", 0444, qm->debug.qm_d, &dev_dfx->dev_state); + debugfs_create_u32("dev_timeout", 0644, qm->debug.qm_d, &dev_dfx->dev_timeout); + for (i = 0; i < ARRAY_SIZE(qm_dfx_files); i++) { data = (atomic64_t *)((uintptr_t)dfx + qm_dfx_files[i].offset); debugfs_create_file(qm_dfx_files[i].name, diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index 3b015482b4e6..41dff28326f1 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -236,6 +236,12 @@ #define QM_DEV_ALG_MAX_LEN 256 + /* abnormal status value for stopping queue */ +#define QM_STOP_QUEUE_FAIL 1 +#define QM_DUMP_SQC_FAIL 3 +#define QM_DUMP_CQC_FAIL 4 +#define QM_FINISH_WAIT 5 + #define QM_MK_CQC_DW3_V1(hop_num, pg_sz, buf_sz, cqe_sz) \ (((hop_num) << QM_CQ_HOP_NUM_SHIFT) | \ ((pg_sz) << QM_CQ_PAGE_SIZE_SHIFT) | \ @@ -2037,43 +2043,25 @@ static void qp_stop_fail_cb(struct hisi_qp *qp) } } -/** - * qm_drain_qp() - Drain a qp. - * @qp: The qp we want to drain. - * - * Determine whether the queue is cleared by judging the tail pointers of - * sq and cq. - */ -static int qm_drain_qp(struct hisi_qp *qp) +static int qm_wait_qp_empty(struct hisi_qm *qm, u32 *state, u32 qp_id) { - struct hisi_qm *qm = qp->qm; struct device *dev = &qm->pdev->dev; struct qm_sqc sqc; struct qm_cqc cqc; int ret, i = 0; - /* No need to judge if master OOO is blocked. */ - if (qm_check_dev_error(qm)) - return 0; - - /* Kunpeng930 supports drain qp by device */ - if (test_bit(QM_SUPPORT_STOP_QP, &qm->caps)) { - ret = qm_stop_qp(qp); - if (ret) - dev_err(dev, "Failed to stop qp(%u)!\n", qp->qp_id); - return ret; - } - while (++i) { - ret = qm_set_and_get_xqc(qm, QM_MB_CMD_SQC, &sqc, qp->qp_id, 1); + ret = qm_set_and_get_xqc(qm, QM_MB_CMD_SQC, &sqc, qp_id, 1); if (ret) { dev_err_ratelimited(dev, "Failed to dump sqc!\n"); + *state = QM_DUMP_SQC_FAIL; return ret; } - ret = qm_set_and_get_xqc(qm, QM_MB_CMD_CQC, &cqc, qp->qp_id, 1); + ret = qm_set_and_get_xqc(qm, QM_MB_CMD_CQC, &cqc, qp_id, 1); if (ret) { dev_err_ratelimited(dev, "Failed to dump cqc!\n"); + *state = QM_DUMP_CQC_FAIL; return ret; } @@ -2082,8 +2070,9 @@ static int qm_drain_qp(struct hisi_qp *qp) break; if (i == MAX_WAIT_COUNTS) { - dev_err(dev, "Fail to empty queue %u!\n", qp->qp_id); - return -EBUSY; + dev_err(dev, "Fail to empty queue %u!\n", qp_id); + *state = QM_STOP_QUEUE_FAIL; + return -ETIMEDOUT; } usleep_range(WAIT_PERIOD_US_MIN, WAIT_PERIOD_US_MAX); @@ -2092,6 +2081,49 @@ static int qm_drain_qp(struct hisi_qp *qp) return 0; } +/** + * qm_drain_qp() - Drain a qp. + * @qp: The qp we want to drain. + * + * If the device does not support stopping queue by sending mailbox, + * determine whether the queue is cleared by judging the tail pointers of + * sq and cq. + */ +static int qm_drain_qp(struct hisi_qp *qp) +{ + struct hisi_qm *qm = qp->qm; + struct hisi_qm *pf_qm = pci_get_drvdata(pci_physfn(qm->pdev)); + u32 state = 0; + int ret; + + /* No need to judge if master OOO is blocked. */ + if (qm_check_dev_error(pf_qm)) + return 0; + + /* HW V3 supports drain qp by device */ + if (test_bit(QM_SUPPORT_STOP_QP, &qm->caps)) { + ret = qm_stop_qp(qp); + if (ret) { + dev_err(&qm->pdev->dev, "Failed to stop qp!\n"); + state = QM_STOP_QUEUE_FAIL; + goto set_dev_state; + } + return ret; + } + + ret = qm_wait_qp_empty(qm, &state, qp->qp_id); + if (ret) + goto set_dev_state; + + return 0; + +set_dev_state: + if (qm->debug.dev_dfx.dev_timeout) + qm->debug.dev_dfx.dev_state = state; + + return ret; +} + static int qm_stop_qp_nolock(struct hisi_qp *qp) { struct hisi_qm *qm = qp->qm; @@ -2319,7 +2351,31 @@ static int hisi_qm_uacce_start_queue(struct uacce_queue *q) static void hisi_qm_uacce_stop_queue(struct uacce_queue *q) { - hisi_qm_stop_qp(q->priv); + struct hisi_qp *qp = q->priv; + struct hisi_qm *qm = qp->qm; + struct qm_dev_dfx *dev_dfx = &qm->debug.dev_dfx; + u32 i = 0; + + hisi_qm_stop_qp(qp); + + if (!dev_dfx->dev_timeout || !dev_dfx->dev_state) + return; + + /* + * After the queue fails to be stopped, + * wait for a period of time before releasing the queue. + */ + while (++i) { + msleep(WAIT_PERIOD); + + /* Since dev_timeout maybe modified, check i >= dev_timeout */ + if (i >= dev_dfx->dev_timeout) { + dev_err(&qm->pdev->dev, "Stop q %u timeout, state %u\n", + qp->qp_id, dev_dfx->dev_state); + dev_dfx->dev_state = QM_FINISH_WAIT; + break; + } + } } static int hisi_qm_is_q_updated(struct uacce_queue *q) diff --git a/include/linux/hisi_acc_qm.h b/include/linux/hisi_acc_qm.h index 720f10874a66..2d14742ad729 100644 --- a/include/linux/hisi_acc_qm.h +++ b/include/linux/hisi_acc_qm.h @@ -163,6 +163,11 @@ struct qm_dev_alg { const char *alg; }; +struct qm_dev_dfx { + u32 dev_state; + u32 dev_timeout; +}; + struct dfx_diff_registers { u32 *regs; u32 reg_offset; @@ -191,6 +196,7 @@ struct qm_debug { struct dentry *debug_root; struct dentry *qm_d; struct debugfs_file files[DEBUG_FILE_NUM]; + struct qm_dev_dfx dev_dfx; unsigned int *qm_last_words; /* ACC engines recoreding last regs */ unsigned int *last_words; -- cgit v1.2.3 From 2ecd43413d7668d67b9b8a56f882aa1ea12b8a62 Mon Sep 17 00:00:00 2001 From: Giovanni Cabiddu Date: Mon, 12 Feb 2024 13:05:09 +0000 Subject: Documentation: qat: fix auto_reset section Remove unneeded colon in the auto_reset section. This resolves the following errors when building the documentation: Documentation/ABI/testing/sysfs-driver-qat:146: ERROR: Unexpected indentation. Documentation/ABI/testing/sysfs-driver-qat:146: WARNING: Block quote ends without a blank line; unexpected unindent. Fixes: f5419a4239af ("crypto: qat - add auto reset on error") Reported-by: Stephen Rothwell Closes: https://lore.kernel.org/linux-kernel/20240212144830.70495d07@canb.auug.org.au/T/ Signed-off-by: Giovanni Cabiddu Signed-off-by: Herbert Xu --- Documentation/ABI/testing/sysfs-driver-qat | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation/ABI') diff --git a/Documentation/ABI/testing/sysfs-driver-qat b/Documentation/ABI/testing/sysfs-driver-qat index 6778f1fea874..96020fb051c3 100644 --- a/Documentation/ABI/testing/sysfs-driver-qat +++ b/Documentation/ABI/testing/sysfs-driver-qat @@ -153,7 +153,7 @@ Description: (RW) Reports the current state of the autoreset feature Device auto reset is disabled by default. - The values are:: + The values are: * 1/Yy/on: auto reset enabled. If the device encounters an unrecoverable error, it will be reset automatically. -- cgit v1.2.3