summaryrefslogtreecommitdiff
path: root/drivers/md/dm-crypt.c
AgeCommit message (Collapse)AuthorFilesLines
2024-07-10dm-crypt: support for per-sector NVMe metadataMikulas Patocka1-23/+24
Support per-sector NVMe metadata in dm-crypt. This commit changes dm-crypt, so that it can use NVMe metadata to store authentication information. We can put dm-crypt directly on the top of NVMe device, without using dm-integrity. This commit improves write throughput twice, becase the will be no writes to the dm-integrity journal. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-07-10dm-crypt: limit the size of encryption requestsMikulas Patocka1-3/+29
There was a performance regression reported where dm-crypt would perform worse on new kernels than on old kernels. The reason is that the old kernels split the bios to NVMe request size (that is usually 65536 or 131072 bytes) and the new kernels pass the big bios through dm-crypt and split them underneath. If a big 1MiB bio is passed to dm-crypt, dm-crypt processes it on a single core without parallelization and this is what causes the performance degradation. This commit introduces new tunable variables /sys/module/dm_crypt/parameters/max_read_size and /sys/module/dm_crypt/parameters/max_write_size that specify the maximum bio size for dm-crypt. Bios larger than this value are split, so that they can be encrypted in parallel by multiple cores. If these variables are '0', a default 131072 is used. Splitting bios may cause performance regressions in other workloads - if this happens, the user should increase the value in max_read_size and max_write_size variables. max_read_size: 128k 2399MiB/s 256k 2368MiB/s 512k 1986MiB/s 1024 1790MiB/s max_write_size: 128k 1712MiB/s 256k 1651MiB/s 512k 1537MiB/s 1024k 1332MiB/s Note that if you run dm-crypt inside a virtual machine, you may need to do "echo numa >/sys/module/workqueue/parameters/default_affinity_scope" to improve performance. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com>
2024-06-14block: remove the blk_integrity_profile structureChristoph Hellwig1-1/+1
Block layer integrity configuration is a bit complex right now, as it indirects through operation vectors for a simple two-dimensional configuration: a) the checksum type of none, ip checksum, crc, crc64 b) the presence or absence of a reference tag Remove the integrity profile, and instead add a separate csum_type flag which replaces the existing ip-checksum field and a new flag that indicates the presence of the reference tag. This removes up to two layers of indirect calls, remove the need to offload the no-op verification of non-PI metadata to a workqueue and generally simplifies the code. The downside is that block/t10-pi.c now has to be built into the kernel when CONFIG_BLK_DEV_INTEGRITY is supported. Given that both nvme and SCSI require t10-pi.ko, it is loaded for all usual configurations that enabled CONFIG_BLK_DEV_INTEGRITY already, though. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240613084839.1044015-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-14dm-integrity: use the nop integrity profileChristoph Hellwig1-2/+2
Use the block layer built-in nop profile instead of duplicating it. Tested by: $ dd if=/dev/urandom of=key.bin bs=512 count=1 $ cryptsetup luksFormat -q --type luks2 --integrity hmac-sha256 \ --integrity-no-wipe /dev/nvme0n1 key.bin $ cryptsetup luksOpen /dev/nvme0n1 luks-integrity --key-file key.bin and then doing mkfs.xfs and simple I/O on the mount file system. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Milan Broz <gmazyland@gmail.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240613084839.1044015-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-04-23dm-crypt: don't set WQ_CPU_INTENSIVE for WQ_UNBOUND crypt_queueMike Snitzer1-1/+5
Fix crypt_queue's use of WQ_UNBOUND to _not_ use WQ_CPU_INTENSIVE because it is meaningless with WQ_UNBOUND. Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-04-23dm-crypt: stop constraining max_segment_size to PAGE_SIZEMike Snitzer1-10/+2
This change effectively reverts commit 586b286b110e ("dm crypt: constrain crypt device's max_segment_size to PAGE_SIZE") and relies on block core's late bio-splitting to ensure that dm-crypt's encryption bios are split accordingly if they exceed the underlying device's limits (e.g. max_segment_size). Commit 586b286b110e was applied as a 4.3 fix for the benefit of stable@ kernels 4.0+ just after block core's late bio-splitting was introduced in 4.3 with commit 54efd50bfd873 ("block: make generic_make_request handle arbitrarily sized bios"). Given block core's late bio-splitting it is past time that dm-crypt make use of it. Also, given the recent need to revert meaningful progress that was attempted during the 6.9 merge window (see commit bff4b74625fe Revert "dm: use queue_limits_set") this change allows DM core to safely make use of queue_limits_set() without risk of breaking dm-crypt on NVMe. Though it should be noted this commit isn't a prereq for reinstating DM core's use of queue_limits_set() because blk_validate_limits() was made less strict with commit b561ea56a264 ("block: allow device to have both virt_boundary_mask and max segment size"). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-04-16dm-crypt: export sysfs of all workqueuesyangerkun1-7/+21
Once there is a heavy IO load, so many encrypt/decrypt work will occupy all of the cpu, which may lead to the poor performance for other service. So the improved visibility and controls over dm-crypt workqueues, as was offered with commit a2b8b2d97567 ("dm crypt: export sysfs of kcryptd workqueue"), seems necessary. By exporting dm-crypt's workqueues in sysfs, the entry like cpumask/max_active and so on can help us to limit the CPU usage for encrypt/decrypt work. However, commit a2b8b2d97567 did not consider that DM table reload will call .ctr before .dtr, so the reload for dm-crypt failed because the same sysfs name was present. This was the original need for commit 48b0777cd93d ("Revert "dm crypt: export sysfs of kcryptd workqueue""). Reintroduce the use of WQ_SYSFS, and use it for both the IO and crypt workqueues, but make the workqueue names include a unique id (via ida) to allow both old and new sysfs entries to coexist. Signed-off-by: yangerkun <yangerkun@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-04-16dm-crypt: add the optional "high_priority" flagMikulas Patocka1-10/+25
When WQ_HIGHPRI was used for the dm-crypt kcryptd workqueue it was reported that dm-crypt performs badly when the system is loaded[1]. Because of reports of audio skipping, dm-crypt stopped using WQ_HIGHPRI with commit f612b2132db5 (Revert "dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues"). But it has since been determined that WQ_HIGHPRI provides improved performance (with reduced latency) for highend systems with much more resources than those laptop/desktop users which suffered from the use of WQ_HIGHPRI. As such, add an option "high_priority" that allows the use of WQ_HIGHPRI for dm-crypt's workqueues and also sets the write_thread to nice level MIN_NICE (-20). This commit makes it optional, so that normal users won't be harmed by it. [1] https://listman.redhat.com/archives/dm-devel/2023-February/053410.html Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-03-02dm-crypt: Convert from tasklet to BH workqueueTejun Heo1-1/+5
The only generic interface to execute asynchronously in the BH context is tasklet; however, it's marked deprecated and has some design flaws. To replace tasklets, BH workqueue support was recently added. A BH workqueue behaves similarly to regular workqueues except that the queued work items are executed in the BH context. This commit converts dm-crypt from tasklet to BH workqueue. It backfills tasklet code that was removed with commit 0a9bab391e33 ("dm-crypt, dm-verity: disable tasklets") and tweaks to use BH workqueue. Like a regular workqueue, a BH workqueue allows freeing the currently executing work item. Converting from tasklet to BH workqueue removes the need for deferring bio_endio() again to a work item, which was buggy anyway. I tested this lightly with "--perf-no_read_workqueue --perf-no_write_workqueue" + some code modifications, but would really -appreciate if someone who knows the code base better could take a look. Signed-off-by: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/82b964f0-c2c8-a2c6-5b1f-f3145dc2c8e5@redhat.com [snitzer: rebase ontop of commit 0a9bab391e33 reduced this commit's changes] Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-20dm crypt: Fix IO priority lost when queuing write biosHongyu Jin1-0/+1
Since dm-crypt queues writes to a different kernel thread (workqueue), the bios will dispatch from tasks with different io_context->ioprio settings and blkcg than the submitting task, thus giving incorrect ioprio to the io scheduler. Get the original IO priority setting via struct dm_crypt_io::base_bio and set this priority in the bio for write. Link: https://lore.kernel.org/dm-devel/alpine.LRH.2.11.1612141049250.13402@mail.ewheeler.net Signed-off-by: Hongyu Jin <hongyu.jin@unisoc.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-20dm crypt: remove redundant state settings after waking upLizhe1-1/+0
The task status has been set to TASK_RUNNING in schedule(). No need to set again here. Signed-off-by: Lizhe <sensor1010@163.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-20dm-crypt, dm-integrity, dm-verity: bump target versionMike Snitzer1-1/+1
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-20dm-verity, dm-crypt: align "struct bvec_iter" correctlyMikulas Patocka1-2/+2
"struct bvec_iter" is defined with the __packed attribute, so it is aligned on a single byte. On X86 (and on other architectures that support unaligned addresses in hardware), "struct bvec_iter" is accessed using the 8-byte and 4-byte memory instructions, however these instructions are less efficient if they operate on unaligned addresses. (on RISC machines that don't have unaligned access in hardware, GCC generates byte-by-byte accesses that are very inefficient - see [1]) This commit reorders the entries in "struct dm_verity_io" and "struct convert_context", so that "struct bvec_iter" is aligned on 8 bytes. [1] https://lore.kernel.org/all/ZcLuWUNRZadJr0tQ@fedora/T/ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-20dm-crypt: recheck the integrity tag after a failureMikulas Patocka1-16/+73
If a userspace process reads (with O_DIRECT) multiple blocks into the same buffer, dm-crypt reports an authentication error [1]. The error is reported in a log and it may cause RAID leg being kicked out of the array. This commit fixes dm-crypt, so that if integrity verification fails, the data is read again into a kernel buffer (where userspace can't modify it) and the integrity tag is rechecked. If the recheck succeeds, the content of the kernel buffer is copied into the user buffer; if the recheck fails, an integrity error is reported. [1] https://people.redhat.com/~mpatocka/testcases/blk-auth-modify/read2.c Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-20dm-crypt: don't modify the data when using authenticated encryptionMikulas Patocka1-0/+6
It was said that authenticated encryption could produce invalid tag when the data that is being encrypted is modified [1]. So, fix this problem by copying the data into the clone bio first and then encrypt them inside the clone bio. This may reduce performance, but it is needed to prevent the user from corrupting the device by writing data with O_DIRECT and modifying them at the same time. [1] https://lore.kernel.org/all/20240207004723.GA35324@sol.localdomain/T/ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-02-02dm-crypt, dm-verity: disable taskletsMikulas Patocka1-36/+2
Tasklets have an inherent problem with memory corruption. The function tasklet_action_common calls tasklet_trylock, then it calls the tasklet callback and then it calls tasklet_unlock. If the tasklet callback frees the structure that contains the tasklet or if it calls some code that may free it, tasklet_unlock will write into free memory. The commits 8e14f610159d and d9a02e016aaf try to fix it for dm-crypt, but it is not a sufficient fix and the data corruption can still happen [1]. There is no fix for dm-verity and dm-verity will write into free memory with every tasklet-processed bio. There will be atomic workqueues implemented in the kernel 6.9 [2]. They will have better interface and they will not suffer from the memory corruption problem. But we need something that stops the memory corruption now and that can be backported to the stable kernels. So, I'm proposing this commit that disables tasklets in both dm-crypt and dm-verity. This commit doesn't remove the tasklet support, because the tasklet code will be reused when atomic workqueues will be implemented. [1] https://lore.kernel.org/all/d390d7ee-f142-44d3-822a-87949e14608b@suse.de/T/ [2] https://lore.kernel.org/lkml/20240130091300.2968534-1-tj@kernel.org/ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: 39d42fa96ba1b ("dm crypt: add flags to optionally bypass kcryptd workqueues") Fixes: 5721d4e5a9cdb ("dm verity: Add optional "try_verify_in_tasklet" feature") Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-01-08mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDERKirill A. Shutemov1-1/+1
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-17dm-crypt: start allocating with MAX_ORDERMikulas Patocka1-1/+1
Commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") changed the meaning of MAX_ORDER from exclusive to inclusive. So, we can allocate compound pages with up to 1 << MAX_ORDER pages. Reflect this change in dm-crypt and start trying to allocate compound pages with MAX_ORDER. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-01Merge tag 'for-6.7/dm-changes' of ↵Linus Torvalds1-12/+14
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Update DM core to directly call the map function for both the linear and stripe targets; which are provided by DM core - Various updates to use new safer string functions - Update DM core to respect REQ_NOWAIT flag in normal bios so that memory allocations are always attempted with GFP_NOWAIT - Add Mikulas Patocka to MAINTAINERS as a DM maintainer! - Improve DM delay target's handling of short delays (< 50ms) by using a kthread to check expiration of IOs rather than timers and a wq - Update the DM error target so that it works with zoned storage. This helps xfstests to provide proper IO error handling coverage when testing a filesystem with native zoned storage support - Update both DM crypt and integrity targets to improve performance by using crypto_shash_digest() rather than init+update+final sequence - Fix DM crypt target by backfilling missing memory allocation accounting for compound pages * tag 'for-6.7/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm crypt: account large pages in cc->n_allocated_pages dm integrity: use crypto_shash_digest() in sb_mac() dm crypt: use crypto_shash_digest() in crypt_iv_tcw_whitening() dm error: Add support for zoned block devices dm delay: for short delays, use kthread instead of timers and wq MAINTAINERS: add Mikulas Patocka as a DM maintainer dm: respect REQ_NOWAIT flag in normal bios issued to DM dm: enhance alloc_multiple_bios() to be more versatile dm: make __send_duplicate_bios return unsigned int dm log userspace: replace deprecated strncpy with strscpy dm ioctl: replace deprecated strncpy with strscpy_pad dm crypt: replace open-coded kmemdup_nul dm cache metadata: replace deprecated strncpy with strscpy dm: shortcut the calls to linear_map and stripe_map
2023-10-31dm crypt: account large pages in cc->n_allocated_pagesMikulas Patocka1-3/+12
The commit 5054e778fcd9c ("dm crypt: allocate compound pages if possible") changed dm-crypt to use compound pages to improve performance. Unfortunately, there was an oversight: the allocation of compound pages was not accounted at all. Normal pages are accounted in a percpu counter cc->n_allocated_pages and dm-crypt is limited to allocate at most 2% of memory. Because compound pages were not accounted at all, dm-crypt could allocate memory over the 2% limit. Fix this by adding the accounting of compound pages, so that memory consumption of dm-crypt is properly limited. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: 5054e778fcd9c ("dm crypt: allocate compound pages if possible") Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-10-31dm crypt: use crypto_shash_digest() in crypt_iv_tcw_whitening()Eric Biggers1-7/+1
Simplify crypt_iv_tcw_whitening() by using crypto_shash_digest() instead of an init+update+final sequence. This should also improve performance. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-10-30Merge tag 'hardening-v6.7-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "One of the more voluminous set of changes is for adding the new __counted_by annotation[1] to gain run-time bounds checking of dynamically sized arrays with UBSan. - Add LKDTM test for stuck CPUs (Mark Rutland) - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo) - Refactor more 1-element arrays into flexible arrays (Gustavo A. R. Silva) - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem Shaikh) - Convert group_info.usage to refcount_t (Elena Reshetova) - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva) - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas Bulwahn) - Fix randstruct GCC plugin performance mode to stay in groups (Kees Cook) - Fix strtomem() compile-time check for small sources (Kees Cook)" * tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (56 commits) hwmon: (acpi_power_meter) replace open-coded kmemdup_nul reset: Annotate struct reset_control_array with __counted_by kexec: Annotate struct crash_mem with __counted_by virtio_console: Annotate struct port_buffer with __counted_by ima: Add __counted_by for struct modsig and use struct_size() MAINTAINERS: Include stackleak paths in hardening entry string: Adjust strtomem() logic to allow for smaller sources hardening: x86: drop reference to removed config AMD_IOMMU_V2 randstruct: Fix gcc-plugin performance mode to stay in group mailbox: zynqmp: Annotate struct zynqmp_ipi_pdata with __counted_by drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by irqchip/imx-intmux: Annotate struct intmux_data with __counted_by KVM: Annotate struct kvm_irq_routing_table with __counted_by virt: acrn: Annotate struct vm_memory_region_batch with __counted_by hwmon: Annotate struct gsc_hwmon_platform_data with __counted_by sparc: Annotate struct cpuinfo_tree with __counted_by isdn: kcapi: replace deprecated strncpy with strscpy_pad isdn: replace deprecated strncpy with strscpy NFS/flexfiles: Annotate struct nfs4_ff_layout_segment with __counted_by nfs41: Annotate struct nfs4_file_layout_dsaddr with __counted_by ...
2023-10-23dm crypt: replace open-coded kmemdup_nulJustin Stitt1-2/+1
kzalloc() followed by strncpy() on an expected NUL-terminated string is just kmemdup_nul(). Let's simplify this code (while also dropping a deprecated strncpy() call [1]). Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://github.com/KSPP/linux/issues/90 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-10-06dm crypt: Fix reqsize in crypt_iv_eboiv_genHerbert Xu1-1/+2
A skcipher_request object is made up of struct skcipher_request followed by a variable-sized trailer. The allocation of the skcipher_request and IV in crypt_iv_eboiv_gen is missing the memory for struct skcipher_request. Fix it by adding it to reqsize. Fixes: e3023094dffb ("dm crypt: Avoid using MAX_CIPHER_BLOCKSIZE") Cc: <stable@vger.kernel.org> #6.5+ Reported-by: Tatu Heikkilä <tatu.heikkila@gmail.com> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-02dm crypt: Annotate struct crypt_config with __counted_byKees Cook1-1/+1
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct crypt_config. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: dm-devel@redhat.com Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230915200344.never.272-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-09bio-integrity: update the payload size in bio_integrity_add_page()Jinyoung Choi1-1/+0
Previously, the bip's bi_size has been set before an integrity pages were added. If a problem occurs in the process of adding pages for bip, the bi_size mismatch problem must be dealt with. When the page is successfully added to bvec, the bi_size is updated. The parts affected by the change were also contained in this commit. Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jinyoung Choi <j-young.choi@samsung.com> Tested-by: "Martin K. Petersen" <martin.petersen@oracle.com> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20230803024956epcms2p38186a17392706650c582d38ef3dbcd32@epcms2p3 Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-30Merge tag 'v6.5-p1' of ↵Linus Torvalds1-4/+11
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Add linear akcipher/sig API - Add tfm cloning (hmac, cmac) - Add statesize to crypto_ahash Algorithms: - Allow only odd e and restrict value in FIPS mode for RSA - Replace LFSR with SHA3-256 in jitter - Add interface for gathering of raw entropy in jitter Drivers: - Fix race on data_avail and actual data in hwrng/virtio - Add hash and HMAC support in starfive - Add RSA algo support in starfive - Add support for PCI device 0x156E in ccp" * tag 'v6.5-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (85 commits) crypto: akcipher - Do not copy dst if it is NULL crypto: sig - Fix verify call crypto: akcipher - Set request tfm on sync path crypto: sm2 - Provide sm2_compute_z_digest when sm2 is disabled hwrng: imx-rngc - switch to DEFINE_SIMPLE_DEV_PM_OPS hwrng: st - keep clock enabled while hwrng is registered hwrng: st - support compile-testing hwrng: imx-rngc - fix the timeout for init and self check KEYS: asymmetric: Use new crypto interface without scatterlists KEYS: asymmetric: Move sm2 code into x509_public_key KEYS: Add forward declaration in asymmetric-parser.h crypto: sig - Add interface for sign/verify crypto: akcipher - Add sync interface without SG lists crypto: cipher - On clone do crypto_mod_get() crypto: api - Add __crypto_alloc_tfmgfp crypto: api - Remove crypto_init_ops() crypto: rsa - allow only odd e and restrict value in FIPS mode crypto: geniv - Split geniv out of AEAD Kconfig option crypto: algboss - Add missing dependency on RNG2 crypto: starfive - Add RSA algo support ...
2023-06-30Merge tag 'for-6.5/dm-changes' of ↵Linus Torvalds1-15/+36
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Update DM crypt to allocate compound pages if possible - Fix DM crypt target's crypt_ctr_cipher_new return value on invalid AEAD cipher - Fix DM flakey testing target's write bio corruption feature to corrupt the data of a cloned bio instead of the original - Add random_read_corrupt and random_write_corrupt features to DM flakey target - Fix ABBA deadlock in DM thin metadata by resetting associated bufio client rather than destroying and recreating it - A couple other small DM thinp cleanups - Update DM core to support disabling block core IO stats accounting and optimize away code that isn't needed if stats are disabled - Other small DM core cleanups - Improve DM integrity target to not require so much memory on 32 bit systems. Also only allocate the recalculate buffer as needed (and increasingly reduce its size on allocation failure) - Update DM integrity to use %*ph for printing hexdump of a small buffer. Also update DM integrity documentation - Various DM core ioctl interface hardening. Now more careful about alignment of structures and processing of input passed to the kernel from userspace. Also disallow the creation of DM devices named "control", "." or ".." - Eliminate GFP_NOIO workarounds for __vmalloc and kvmalloc in DM core's ioctl and bufio code * tag 'for-6.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (28 commits) dm: get rid of GFP_NOIO workarounds for __vmalloc and kvmalloc dm integrity: scale down the recalculate buffer if memory allocation fails dm integrity: only allocate recalculate buffer when needed dm integrity: reduce vmalloc space footprint on 32-bit architectures dm ioctl: Refuse to create device named "." or ".." dm ioctl: Refuse to create device named "control" dm ioctl: Avoid double-fetch of version dm ioctl: structs and parameter strings must not overlap dm ioctl: Avoid pointer arithmetic overflow dm ioctl: Check dm_target_spec is sufficiently aligned Documentation: dm-integrity: Document an example of how the tunables relate. Documentation: dm-integrity: Document default values. Documentation: dm-integrity: Document the meaning of "buffer". Documentation: dm-integrity: Fix minor grammatical error. dm integrity: Use %*ph for printing hexdump of a small buffer dm thin: disable discards for thin-pool if no_discard_passdown dm: remove stale/redundant dm_internal_{suspend,resume} prototypes in dm.h dm: skip dm-stats work in alloc_io() unless needed dm: avoid needless dm_io access if all IO accounting is disabled dm: support turning off block-core's io stats accounting ...
2023-06-28Merge tag 'mm-stable-2023-06-24-19-15' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull mm updates from Andrew Morton: - Yosry Ahmed brought back some cgroup v1 stats in OOM logs - Yosry has also eliminated cgroup's atomic rstat flushing - Nhat Pham adds the new cachestat() syscall. It provides userspace with the ability to query pagecache status - a similar concept to mincore() but more powerful and with improved usability - Mel Gorman provides more optimizations for compaction, reducing the prevalence of page rescanning - Lorenzo Stoakes has done some maintanance work on the get_user_pages() interface - Liam Howlett continues with cleanups and maintenance work to the maple tree code. Peng Zhang also does some work on maple tree - Johannes Weiner has done some cleanup work on the compaction code - David Hildenbrand has contributed additional selftests for get_user_pages() - Thomas Gleixner has contributed some maintenance and optimization work for the vmalloc code - Baolin Wang has provided some compaction cleanups, - SeongJae Park continues maintenance work on the DAMON code - Huang Ying has done some maintenance on the swap code's usage of device refcounting - Christoph Hellwig has some cleanups for the filemap/directio code - Ryan Roberts provides two patch series which yield some rationalization of the kernel's access to pte entries - use the provided APIs rather than open-coding accesses - Lorenzo Stoakes has some fixes to the interaction between pagecache and directio access to file mappings - John Hubbard has a series of fixes to the MM selftesting code - ZhangPeng continues the folio conversion campaign - Hugh Dickins has been working on the pagetable handling code, mainly with a view to reducing the load on the mmap_lock - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from 128 to 8 - Domenico Cerasuolo has improved the zswap reclaim mechanism by reorganizing the LRU management - Matthew Wilcox provides some fixups to make gfs2 work better with the buffer_head code - Vishal Moola also has done some folio conversion work - Matthew Wilcox has removed the remnants of the pagevec code - their functionality is migrated over to struct folio_batch * tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits) mm/hugetlb: remove hugetlb_set_page_subpool() mm: nommu: correct the range of mmap_sem_read_lock in task_mem() hugetlb: revert use of page_cache_next_miss() Revert "page cache: fix page_cache_next/prev_miss off by one" mm/vmscan: fix root proactive reclaim unthrottling unbalanced node mm: memcg: rename and document global_reclaim() mm: kill [add|del]_page_to_lru_list() mm: compaction: convert to use a folio in isolate_migratepages_block() mm: zswap: fix double invalidate with exclusive loads mm: remove unnecessary pagevec includes mm: remove references to pagevec mm: rename invalidate_mapping_pagevec to mapping_try_invalidate mm: remove struct pagevec net: convert sunrpc from pagevec to folio_batch i915: convert i915_gpu_error to use a folio_batch pagevec: rename fbatch_count() mm: remove check_move_unevictable_pages() drm: convert drm_gem_put_pages() to use a folio_batch i915: convert shmem_sg_free_table() to use a folio_batch scatterlist: add sg_set_folio() ...
2023-06-19dm-crypt: use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGNCatalin Marinas1-1/+1
ARCH_DMA_MINALIGN represents the minimum (static) alignment for safe DMA operations while ARCH_KMALLOC_MINALIGN is the minimum kmalloc() objects alignment. Link: https://lkml.kernel.org/r/20230612153201.554742-10-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Isaac J. Manjarres <isaacmanjarres@google.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jerry Snitselaar <jsnitsel@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Saravana Kannan <saravanak@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-16dm crypt: fix crypt_ctr_cipher_new return value on invalid AEAD cipherMikulas Patocka1-1/+1
If the user specifies invalid AEAD cipher, dm-crypt should return the error returned from crypt_ctr_auth_spec, not -ENOMEM. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-16dm crypt: allocate compound pages if possibleMikulas Patocka1-14/+35
It was reported that allocating pages for the write buffer in dm-crypt causes measurable overhead [1]. Change dm-crypt to allocate compound pages if they are available. If not, fall back to the mempool. [1] https://listman.redhat.com/archives/dm-devel/2023-February/053284.html Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-06-09dm crypt: Avoid using MAX_CIPHER_BLOCKSIZEHerbert Xu1-4/+11
MAX_CIPHER_BLOCKSIZE is an internal implementation detail and should not be relied on by users of the Crypto API. Instead of storing the IV on the stack, allocate it together with the crypto request. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-06-01dm-crypt: use __bio_add_page to add single page to clone bioJohannes Thumshirn1-2/+1
crypt_alloc_buffer() already allocates enough entries in the clone bio's vector, so adding a page to the bio can't fail. Use __bio_add_page() to reflect this. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/f9a4dee5e81389fd70ffc442da01006538e55aca.1685532726.git.johannes.thumshirn@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-11dm: add helper macro for simple DM target module init and exitYangtao Li1-13/+1
Eliminate duplicate boilerplate code for simple modules that contain a single DM target driver without any additional setup code. Add a new module_dm() macro, which replaces the module_init() and module_exit() with template functions that call dm_register_target() and dm_unregister_target() respectively. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-04-11dm: push error reporting down to dm_register_target()Yangtao Li1-7/+1
Simplifies each DM target's init method by making dm_register_target() responsible for its error reporting (on behalf of targets). Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-03-24Merge tag 'for-6.3/dm-fixes' of ↵Linus Torvalds1-6/+10
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM thin to work as a swap device by using 'limit_swap_bios' DM target flag (initially added to allow swap to dm-crypt) to throttle the amount of outstanding swap bios. - Fix DM crypt soft lockup warnings by calling cond_resched() from the cpu intensive loop in dmcrypt_write(). - Fix DM crypt to not access an uninitialized tasklet. This fix allows for consistent handling of IO completion, by _not_ needlessly punting to a workqueue when tasklets are not needed. - Fix DM core's alloc_dev() initialization for DM stats to check for and propagate alloc_percpu() failure. * tag 'for-6.3/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm stats: check for and propagate alloc_percpu failure dm crypt: avoid accessing uninitialized tasklet dm crypt: add cond_resched() to dmcrypt_write() dm thin: fix deadlock when swapping to thin device
2023-03-09dm crypt: avoid accessing uninitialized taskletMike Snitzer1-6/+9
When neither "no_read_workqueue" nor "no_write_workqueue" are enabled, tasklet_trylock() in crypt_dec_pending() may still return false due to an uninitialized state, and dm-crypt will unnecessarily do io completion in io_queue workqueue instead of current context. Fix this by adding an 'in_tasklet' flag to dm_crypt_io struct and initialize it to false in crypt_io_init(). Set this flag to true in kcryptd_queue_crypt() before calling tasklet_schedule(). If set crypt_dec_pending() will punt io completion to a workqueue. This also nicely avoids the tasklet_trylock/unlock hack when tasklets aren't in use. Fixes: 8e14f610159d ("dm crypt: do not call bio_endio() from the dm-crypt tasklet") Cc: stable@vger.kernel.org Reported-by: Hou Tao <houtao1@huawei.com> Suggested-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-03-06dm crypt: add cond_resched() to dmcrypt_write()Mikulas Patocka1-0/+1
The loop in dmcrypt_write may be running for unbounded amount of time, thus we need cond_resched() in it. This commit fixes the following warning: [ 3391.153255][ C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [dmcrypt_write/2:2897] ... [ 3391.387210][ C12] Call trace: [ 3391.390338][ C12] blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.395970][ C12] blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.401085][ C12] blk_mq_submit_bio+0x398/0x550 [ 3391.405856][ C12] submit_bio_noacct+0x308/0x380 [ 3391.410630][ C12] dmcrypt_write+0x1e4/0x208 [dm_crypt] [ 3391.416005][ C12] kthread+0x130/0x138 [ 3391.419911][ C12] ret_from_fork+0x10/0x18 Reported-by: yangerkun <yangerkun@huawei.com> Fixes: dc2676210c42 ("dm crypt: offload writes to thread") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-22Merge tag 'for-6.3/dm-changes' of ↵Linus Torvalds1-55/+62
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Fix DM cache target to free background tracker work items, otherwise slab BUG will occur when kmem_cache_destroy() is called. - Improve 2 of DM's shrinker names to reflect their use. - Fix the DM flakey target to not corrupt the zero page. Fix dm-flakey on 32-bit hughmem systems by using bvec_kmap_local instead of page_address. Also, fix logic used when imposing the "corrupt_bio_byte" feature. - Stop using WQ_UNBOUND for DM verity target's verify_wq because it causes significant Android latencies on ARM64 (and doesn't show real benefit on other architectures). - Add negative check to catch simple case of a DM table referencing itself. More complex scenarios that use intermediate devices to self-reference still need to be avoided/handled in userspace. - Fix DM core's resize to only send one uevent instead of two. This fixes a race with udev, that if udev wins, will cause udev to miss uevents (which caused premature unmount attempts by systemd). - Add cond_resched() to workqueue functions in DM core, dn-thin and dm-cache so that their loops aren't the cause of unintended cpu scheduling fairness issues. - Fix all of DM's checkpatch errors and warnings (famous last words). Various other small cleanups. * tag 'for-6.3/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits) dm: remove unnecessary (void*) conversion in event_callback() dm ioctl: remove unnecessary check when using dm_get_mdptr() dm ioctl: assert _hash_lock is held in __hash_remove dm cache: add cond_resched() to various workqueue loops dm thin: add cond_resched() to various workqueue loops dm: add cond_resched() to dm_wq_requeue_work() dm: add cond_resched() to dm_wq_work() dm sysfs: make kobj_type structure constant dm: update targets using system workqueues to use a local workqueue dm: remove flush_scheduled_work() during local_exit() dm clone: prefer kvmalloc_array() dm: declare variables static when sensible dm: fix suspect indent whitespace dm ioctl: prefer strscpy() instead of strlcpy() dm: avoid void function return statements dm integrity: change macros min/max() -> min_t/max_t where appropriate dm: fix use of sizeof() macro dm: avoid 'do {} while(0)' loop in single statement macros dm log: avoid multiple line dereference dm log: avoid trailing semicolon in macro ...
2023-02-14dm: avoid split of quoted strings where possibleHeinz Mauelshagen1-2/+1
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: add missing empty linesHeinz Mauelshagen1-0/+3
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm crypt: correct 'foo*' to 'foo *'Heinz Mauelshagen1-9/+9
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: correct block comments format.Heinz Mauelshagen1-3/+4
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: address indent/space issuesHeinz Mauelshagen1-1/+1
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: avoid initializing static variablesHeinz Mauelshagen1-1/+1
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: avoid assignment in if conditionsHeinz Mauelshagen1-3/+6
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: change "unsigned" to "unsigned int"Heinz Mauelshagen1-24/+24
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: prefer kmap_local_page() instead of deprecated kmap_atomic()Heinz Mauelshagen1-12/+12
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: add missing SPDX-License-IndentifiersHeinz Mauelshagen1-0/+1
'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has deprecated its use. Suggested-by: John Wiele <jwiele@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>