summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2021-05-26f2fs: compress: fix to disallow temp extensionChao Yu1-4/+12
This patch restricts to configure compress extension as format of: [filename + '.' + extension] rather than: [filename + '.' + extension + (optional: '.' + temp extension)] in order to avoid to enable compression incorrectly: 1. compress_extension=so 2. touch file.soa 3. touch file.so.tmp Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-26f2fs: let's allow compression for mmap filesJaegeuk Kim1-2/+0
This patch allows to compress mmap files. E.g., for so files. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-26f2fs: add MODULE_SOFTDEP to ensure crc32 is included in the initramfsChao Yu1-0/+1
As marcosfrm reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213089 Initramfs generators rely on "pre" softdeps (and "depends") to include additional required modules. F2FS does not declare "pre: crc32" softdep. Then every generator (dracut, mkinitcpio...) has to maintain a hardcoded list for this purpose. Hence let's use MODULE_SOFTDEP("pre: crc32") in f2fs code. Fixes: 43b6573bac95 ("f2fs: use cryptoapi crc32 functions") Reported-by: marcosfrm <marcosfrm@gmail.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-26f2fs: return success if there is no work to doTom Rix1-1/+1
Static analysis reports this problem file.c:3206:2: warning: Undefined or garbage value returned to caller return err; ^~~~~~~~~~ err is only set if there is some work to do. Because the loop returns immediately on an error, if all the work was done, a 0 would be returned. Instead of checking the unlikely case that there was no work to do, change the return of err to 0. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: compress: clean up parameter of __f2fs_cluster_blocks()Chao Yu1-20/+13
Previously, in order to reuse __f2fs_cluster_blocks(), f2fs_is_compressed_cluster() assigned a compress_ctx type variable, which is used to pass few parameters (cc.inode, cc.cluster_size, cc.cluster_idx), it's wasteful to allocate such large space in stack. Let's clean up parameters of __f2fs_cluster_blocks() to avoid that. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: compress: remove unneeded f2fs_put_dnode()Chao Yu1-1/+0
If we don't initialize dn.inode_page for f2fs_get_block(), f2fs_get_block() will call f2fs_put_dnode() itself, so let's remove unneeded f2fs_put_dnode() in f2fs_vm_page_mkwrite(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: atgc: fix to set default age thresholdChao Yu1-0/+1
Default age threshold value is missed to set, fix it. Fixes: 093749e296e2 ("f2fs: support age threshold based garbage collection") Reported-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: Prevent swap file in LFS modeShin'ichiro Kawasaki1-0/+6
The kernel writes to swap files on f2fs directly without the assistance of the filesystem. This direct write by kernel can be non-sequential even when the f2fs is in LFS mode. Such non-sequential write conflicts with the LFS semantics. Especially when f2fs is set up on zoned block devices, the non-sequential write causes unaligned write command errors. To avoid the non-sequential writes to swap files, prevent swap file activation when the filesystem is in LFS mode. Fixes: 4969c06a0d83 ("f2fs: support swap file w/ DIO") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: stable@vger.kernel.org # v5.10+ Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: fix to avoid racing on fsync_entry_slab by multi filesystem instancesChao Yu3-10/+23
As syzbot reported, there is an use-after-free issue during f2fs recovery: Use-after-free write at 0xffff88823bc16040 (in kfence-#10): kmem_cache_destroy+0x1f/0x120 mm/slab_common.c:486 f2fs_recover_fsync_data+0x75b0/0x8380 fs/f2fs/recovery.c:869 f2fs_fill_super+0x9393/0xa420 fs/f2fs/super.c:3945 mount_bdev+0x26c/0x3a0 fs/super.c:1367 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x86/0x270 fs/super.c:1497 do_new_mount fs/namespace.c:2905 [inline] path_mount+0x196f/0x2be0 fs/namespace.c:3235 do_mount fs/namespace.c:3248 [inline] __do_sys_mount fs/namespace.c:3456 [inline] __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3433 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae The root cause is multi f2fs filesystem instances can race on accessing global fsync_entry_slab pointer, result in use-after-free issue of slab cache, fixes to init/destroy this slab cache only once during module init/destroy procedure to avoid this issue. Reported-by: syzbot+9d90dad32dd9727ed084@syzkaller.appspotmail.com Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: restructure f2fs page.private layoutChao Yu11-109/+146
Restruct f2fs page private layout for below reasons: There are some cases that f2fs wants to set a flag in a page to indicate a specified status of page: a) page is in transaction list for atomic write b) page contains dummy data for aligned write c) page is migrating for GC d) page contains inline data for inline inode flush e) page belongs to merkle tree, and is verified for fsverity f) page is dirty and has filesystem/inode reference count for writeback g) page is temporary and has decompress io context reference for compression There are existed places in page structure we can use to store f2fs private status/data: - page.flags: PG_checked, PG_private - page.private However it was a mess when we using them, which may cause potential confliction: page.private PG_private PG_checked page._refcount (+1 at most) a) -1 set +1 b) -2 set c), d), e) set f) 0 set +1 g) pointer set The other problem is page.flags has no free slot, if we can avoid set zero to page.private and set PG_private flag, then we use non-zero value to indicate PG_private status, so that we may have chance to reclaim PG_private slot for other usage. [1] The other concern is f2fs has bad scalability in aspect of indicating more page status. So in this patch, let's restructure f2fs' page.private as below to solve above issues: Layout A: lowest bit should be 1 | bit0 = 1 | bit1 | bit2 | ... | bit MAX | private data .... | bit 0 PAGE_PRIVATE_NOT_POINTER bit 1 PAGE_PRIVATE_ATOMIC_WRITE bit 2 PAGE_PRIVATE_DUMMY_WRITE bit 3 PAGE_PRIVATE_ONGOING_MIGRATION bit 4 PAGE_PRIVATE_INLINE_INODE bit 5 PAGE_PRIVATE_REF_RESOURCE bit 6- f2fs private data Layout B: lowest bit should be 0 page.private is a wrapped pointer. After the change: page.private PG_private PG_checked page._refcount (+1 at most) a) 11 set +1 b) 101 set +1 c) 1001 set +1 d) 10001 set +1 e) set f) 100001 set +1 g) pointer set +1 [1] https://lore.kernel.org/linux-f2fs-devel/20210422154705.GO3596236@casper.infradead.org/T/#u Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: add cp_error check in f2fs_write_compressed_pagesChao Yu1-0/+6
This patch adds cp_error check in f2fs_write_compressed_pages() like we did in f2fs_write_single_data_page() Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14f2fs: compress: rename __cluster_may_compressChao Yu1-4/+4
This patch renames __cluster_may_compress() to cluster_has_invalid_data() for better readability. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-14Merge tag 'f2fs-5.13-rc1-fix' of ↵Linus Torvalds5-47/+56
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs fixes from Jaegeuk Kim: "This fixes some critical bugs such as memory leak in compression flows, kernel panic when handling errors, and swapon failure due to newly added condition check" * tag 'f2fs-5.13-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: f2fs: return EINVAL for hole cases in swap file f2fs: avoid swapon failure by giving a warning first f2fs: compress: fix to assign cc.cluster_idx correctly f2fs: compress: fix race condition of overwrite vs truncate f2fs: compress: fix to free compress page correctly f2fs: support iflag change given the mask f2fs: avoid null pointer access when handling IPU error
2021-05-12f2fs: return EINVAL for hole cases in swap fileJaegeuk Kim1-2/+2
This tries to fix xfstests/generic/495. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: avoid swapon failure by giving a warning firstJaegeuk Kim1-6/+23
The final solution can be migrating blocks to form a section-aligned file internally. Meanwhile, let's ask users to do that when preparing the swap file initially like: 1) create() 2) ioctl(F2FS_IOC_SET_PIN_FILE) 3) fallocate() Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: 36e4d95891ed ("f2fs: check if swapfile is section-alligned") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: compress: fix to assign cc.cluster_idx correctlyChao Yu3-12/+13
In f2fs_destroy_compress_ctx(), after f2fs_destroy_compress_ctx(), cc.cluster_idx will be cleared w/ NULL_CLUSTER, f2fs_cluster_blocks() may check wrong cluster metadata, fix it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: compress: fix race condition of overwrite vs truncateChao Yu1-23/+12
pos_fsstress testcase complains a panic as belew: ------------[ cut here ]------------ kernel BUG at fs/f2fs/compress.c:1082! invalid opcode: 0000 [#1] SMP PTI CPU: 4 PID: 2753477 Comm: kworker/u16:2 Tainted: G OE 5.12.0-rc1-custom #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Workqueue: writeback wb_workfn (flush-252:16) RIP: 0010:prepare_compress_overwrite+0x4c0/0x760 [f2fs] Call Trace: f2fs_prepare_compress_overwrite+0x5f/0x80 [f2fs] f2fs_write_cache_pages+0x468/0x8a0 [f2fs] f2fs_write_data_pages+0x2a4/0x2f0 [f2fs] do_writepages+0x38/0xc0 __writeback_single_inode+0x44/0x2a0 writeback_sb_inodes+0x223/0x4d0 __writeback_inodes_wb+0x56/0xf0 wb_writeback+0x1dd/0x290 wb_workfn+0x309/0x500 process_one_work+0x220/0x3c0 worker_thread+0x53/0x420 kthread+0x12f/0x150 ret_from_fork+0x22/0x30 The root cause is truncate() may race with overwrite as below, so that one reference count left in page can not guarantee the page attaching in mapping tree all the time, after truncation, later find_lock_page() may return NULL pointer. - prepare_compress_overwrite - f2fs_pagecache_get_page - unlock_page - f2fs_setattr - truncate_setsize - truncate_inode_page - delete_from_page_cache - find_lock_page Fix this by avoiding referencing updated page. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: compress: fix to free compress page correctlyChao Yu1-1/+2
In error path of f2fs_write_compressed_pages(), it needs to call f2fs_compress_free_page() to release temporary page. Fixes: 5e6bbde95982 ("f2fs: introduce mempool for {,de}compress intermediate page allocation") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: support iflag change given the maskJaegeuk Kim1-1/+2
In f2fs_fileattr_set(), if (!fa->flags_valid) mask &= FS_COMMON_FL; In this case, we can set supported flags by mask only instead of BUG_ON. /* Flags shared betwen flags/xflags */ (FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \ FS_NODUMP_FL | FS_NOATIME_FL | FS_DAX_FL | \ FS_PROJINHERIT_FL) Fixes: 9b1bb01c8ae7 ("f2fs: convert to fileattr") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11f2fs: avoid null pointer access when handling IPU errorJaegeuk Kim1-2/+2
Unable to handle kernel NULL pointer dereference at virtual address 000000000000001a pc : f2fs_inplace_write_data+0x144/0x208 lr : f2fs_inplace_write_data+0x134/0x208 Call trace: f2fs_inplace_write_data+0x144/0x208 f2fs_do_write_data_page+0x270/0x770 f2fs_write_single_data_page+0x47c/0x830 __f2fs_write_data_pages+0x444/0x98c f2fs_write_data_pages.llvm.16514453770497736882+0x2c/0x38 do_writepages+0x58/0x118 __writeback_single_inode+0x44/0x300 writeback_sb_inodes+0x4b8/0x9c8 wb_writeback+0x148/0x42c wb_do_writeback+0xc8/0x390 wb_workfn+0xb0/0x2f4 process_one_work+0x1fc/0x444 worker_thread+0x268/0x4b4 kthread+0x13c/0x158 ret_from_fork+0x10/0x18 Fixes: 955772787667 ("f2fs: drop inplace IO if fs status is abnormal") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11Merge tag 'for-5.13-rc1-part2-tag' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "Handle transaction start error in btrfs_fileattr_set() This is fix for code introduced by the new fileattr merge" * tag 'for-5.13-rc1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: handle transaction start error in btrfs_fileattr_set
2021-05-11btrfs: handle transaction start error in btrfs_fileattr_setRitesh Harjani1-0/+2
Add error handling in btrfs_fileattr_set in case of an error while starting a transaction. This fixes btrfs/232 which otherwise used to fail with below signature on Power. btrfs/232 [ 1119.474650] run fstests btrfs/232 at 2021-04-21 02:21:22 <...> [ 1366.638585] BUG: Unable to handle kernel data access on read at 0xffffffffffffff86 [ 1366.638768] Faulting instruction address: 0xc0000000009a5c88 cpu 0x0: Vector: 380 (Data SLB Access) at [c000000014f177b0] pc: c0000000009a5c88: btrfs_update_root_times+0x58/0xc0 lr: c0000000009a5c84: btrfs_update_root_times+0x54/0xc0 <...> pid = 24881, comm = fsstress btrfs_update_inode+0xa0/0x140 btrfs_fileattr_set+0x5d0/0x6f0 vfs_fileattr_set+0x2a8/0x390 do_vfs_ioctl+0x1290/0x1ac0 sys_ioctl+0x6c/0x120 system_call_exception+0x3d4/0x410 system_call_common+0xec/0x278 Fixes: 97fc29775487 ("btrfs: convert to fileattr") Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-10Merge tag 'for-5.13-rc1-tag' of ↵Linus Torvalds11-26/+55
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "First batch of various fixes, here's a list of notable ones: - fix unmountable seed device after fstrim - fix silent data loss in zoned mode due to ordered extent splitting - fix race leading to unpersisted data and metadata on fsync - fix deadlock when cloning inline extents and using qgroups" * tag 'for-5.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: initialize return variable in cleanup_free_space_cache_v1 btrfs: zoned: sanity check zone type btrfs: fix unmountable seed device after fstrim btrfs: fix deadlock when cloning inline extents and using qgroups btrfs: fix race leading to unpersisted data and metadata on fsync btrfs: do not consider send context as valid when trying to flush qgroups btrfs: zoned: fix silent data loss after failure splitting ordered extent
2021-05-09Merge tag '5.13-rc-smb3-part3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds3-0/+14
Pull cifs fixes from Steve French: "Three small SMB3 chmultichannel related changesets (also for stable) from the SMB3 test event this week. The other fixes are still in review/testing" * tag '5.13-rc-smb3-part3' of git://git.samba.org/sfrench/cifs-2.6: smb3: if max_channels set to more than one channel request multichannel smb3: do not attempt multichannel to server which does not support it smb3: when mounting with multichannel include it in requested capabilities
2021-05-08Merge tag 'kbuild-v5.13-2' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - Convert sh and sparc to use generic shell scripts to generate the syscall headers - refactor .gitignore files - Update kernel/config_data.gz only when the content of the .config is really changed, which avoids the unneeded re-link of vmlinux - move "remove stale files" workarounds to scripts/remove-stale-files - suppress unused-but-set-variable warnings by default for Clang as well - fix locale setting LANG=C to LC_ALL=C - improve 'make distclean' - always keep intermediate objects from scripts/link-vmlinux.sh - move IF_ENABLED out of <linux/kconfig.h> to make it self-contained - misc cleanups * tag 'kbuild-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits) linux/kconfig.h: replace IF_ENABLED() with PTR_IF() in <linux/kernel.h> kbuild: Don't remove link-vmlinux temporary files on exit/signal kbuild: remove the unneeded comments for external module builds kbuild: make distclean remove tag files in sub-directories kbuild: make distclean work against $(objtree) instead of $(srctree) kbuild: refactor modname-multi by using suffix-search kbuild: refactor fdtoverlay rule kbuild: parameterize the .o part of suffix-search arch: use cross_compiling to check whether it is a cross build or not kbuild: remove ARCH=sh64 support from top Makefile .gitignore: prefix local generated files with a slash kbuild: replace LANG=C with LC_ALL=C Makefile: Move -Wno-unused-but-set-variable out of GCC only block kbuild: add a script to remove stale generated files kbuild: update config_data.gz only when the content of .config is changed .gitignore: ignore only top-level modules.builtin .gitignore: move tags and TAGS close to other tag files kernel/.gitgnore: remove stale timeconst.h and hz.bc usr/include: refactor .gitignore genksyms: fix stale comment ...
2021-05-08smb3: if max_channels set to more than one channel request multichannelSteve French1-0/+3
Mounting with "multichannel" is obviously implied if user requested more than one channel on mount (ie mount parm max_channels>1). Currently both have to be specified. Fix that so that if max_channels is greater than 1 on mount, enable multichannel rather than silently falling back to non-multichannel. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-By: Tom Talpey <tom@talpey.com> Cc: <stable@vger.kernel.org> # v5.11+ Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
2021-05-08smb3: do not attempt multichannel to server which does not support itSteve French1-0/+6
We were ignoring CAP_MULTI_CHANNEL in the server response - if the server doesn't support multichannel we should not be attempting it. See MS-SMB2 section 3.2.5.2 Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-By: Tom Talpey <tom@talpey.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Steve French <stfrench@microsoft.com>
2021-05-08smb3: when mounting with multichannel include it in requested capabilitiesSteve French1-0/+5
In the SMB3/SMB3.1.1 negotiate protocol request, we are supposed to advertise CAP_MULTICHANNEL capability when establishing multiple channels has been requested by the user doing the mount. See MS-SMB2 sections 2.2.3 and 3.2.5.2 Without setting it there is some risk that multichannel could fail if the server interpreted the field strictly. Reviewed-By: Tom Talpey <tom@talpey.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Steve French <stfrench@microsoft.com>
2021-05-07Merge tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-blockLinus Torvalds1-3/+17
Pull block fixes from Jens Axboe: - dasd spelling fixes (Bhaskar) - Limit bio max size on multi-page bvecs to the hardware limit, to avoid overly large bio's (and hence latencies). Originally queued for the merge window, but needed a fix and was dropped from the initial pull (Changheun) - NVMe pull request (Christoph): - reset the bdev to ns head when failover (Daniel Wagner) - remove unsupported command noise (Keith Busch) - misc passthrough improvements (Kanchan Joshi) - fix controller ioctl through ns_head (Minwoo Im) - fix controller timeouts during reset (Tao Chiu) - rnbd fixes/cleanups (Gioh, Md, Dima) - Fix iov_iter re-expansion (yangerkun) * tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block: block: reexpand iov_iter after read/write nvmet: remove unsupported command noise nvme-multipath: reset bdev to ns head when failover nvme-pci: fix controller reset hang when racing with nvme_timeout nvme: move the fabrics queue ready check routines to core nvme: avoid memset for passthrough requests nvme: add nvme_get_ns helper nvme: fix controller ioctl through ns_head bio: limit bio max size RDMA/rtrs: fix uninitialized symbol 'cnt' s390: dasd: Mundane spelling fixes block/rnbd: Remove all likely and unlikely block/rnbd-clt: Check the return value of the function rtrs_clt_query block/rnbd: Fix style issues block/rnbd-clt: Change queue_depth type in rnbd_clt_session to size_t
2021-05-07Merge tag 'io_uring-5.13-2021-05-07' of git://git.kernel.dk/linux-blockLinus Torvalds1-18/+51
Pull io_uring fixes from Jens Axboe: "Mostly fixes for merge window merged code. In detail: - Error case memory leak fixes (Colin, Zqiang) - Add the tools/io_uring/ to the list of maintained files (Lukas) - Set of fixes for the modified buffer registration API (Pavel) - Sanitize io thread setup on x86 (Stefan) - Ensure we truncate transfer count for registered buffers (Thadeu)" * tag 'io_uring-5.13-2021-05-07' of git://git.kernel.dk/linux-block: x86/process: setup io_threads more like normal user space threads MAINTAINERS: add io_uring tool to IO_URING io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers io_uring: Fix memory leak in io_sqe_buffers_register() io_uring: Fix premature return from loop and memory leak io_uring: fix unchecked error in switch_start() io_uring: allow empty slots for reg buffers io_uring: add more build check for uapi io_uring: dont overlap internal and user req flags io_uring: fix drain with rsrc CQEs
2021-05-07Merge tag 'nfs-for-5.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds31-399/+735
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Add validation of the UDP retrans parameter to prevent shift out-of-bounds - Don't discard pNFS layout segments that are marked for return Bugfixes: - Fix a NULL dereference crash in xprt_complete_bc_request() when the NFSv4.1 server misbehaves. - Fix the handling of NFS READDIR cookie verifiers - Sundry fixes to ensure attribute revalidation works correctly when the server does not return post-op attributes. - nfs4_bitmask_adjust() must not change the server global bitmasks - Fix major timeout handling in the RPC code. - NFSv4.2 fallocate() fixes. - Fix the NFSv4.2 SEEK_HOLE/SEEK_DATA end-of-file handling - Copy offload attribute revalidation fixes - Fix an incorrect filehandle size check in the pNFS flexfiles driver - Fix several RDMA transport setup/teardown races - Fix several RDMA queue wrapping issues - Fix a misplaced memory read barrier in sunrpc's call_decode() Features: - Micro optimisation of the TCP transmission queue using TCP_CORK - statx() performance improvements by further splitting up the tracking of invalid cached file metadata. - Support the NFSv4.2 'change_attr_type' attribute and use it to optimise handling of change attribute updates" * tag 'nfs-for-5.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (85 commits) xprtrdma: Fix a NULL dereference in frwr_unmap_sync() sunrpc: Fix misplaced barrier in call_decode NFSv4.2: Remove ifdef CONFIG_NFSD from NFSv4.2 client SSC code. xprtrdma: Move fr_mr field to struct rpcrdma_mr xprtrdma: Move the Work Request union to struct rpcrdma_mr xprtrdma: Move fr_linv_done field to struct rpcrdma_mr xprtrdma: Move cqe to struct rpcrdma_mr xprtrdma: Move fr_cid to struct rpcrdma_mr xprtrdma: Remove the RPC/RDMA QP event handler xprtrdma: Don't display r_xprt memory addresses in tracepoints xprtrdma: Add an rpcrdma_mr_completion_class xprtrdma: Add tracepoints showing FastReg WRs and remote invalidation xprtrdma: Avoid Send Queue wrapping xprtrdma: Do not wake RPC consumer on a failed LocalInv xprtrdma: Do not recycle MR after FastReg/LocalInv flushes xprtrdma: Clarify use of barrier in frwr_wc_localinv_done() xprtrdma: Rename frwr_release_mr() xprtrdma: rpcrdma_mr_pop() already does list_del_init() xprtrdma: Delete rpcrdma_recv_buffer_put() xprtrdma: Fix cwnd update ordering ...
2021-05-07Merge tag '9p-for-5.13-rc1' of git://github.com/martinetd/linuxLinus Torvalds2-3/+3
Pull 9p updates from Dominique Martinet: "An error handling fix and constification" * tag '9p-for-5.13-rc1' of git://github.com/martinetd/linux: fs: 9p: fix v9fs_file_open writeback fid error check 9p: Constify static struct v9fs_attr_group
2021-05-07Merge branch 'akpm' (patches from Andrew)Linus Torvalds129-421/+131
Merge yet more updates from Andrew Morton: "This is everything else from -mm for this merge window. 90 patches. Subsystems affected by this patch series: mm (cleanups and slub), alpha, procfs, sysctl, misc, core-kernel, bitmap, lib, compat, checkpatch, epoll, isofs, nilfs2, hpfs, exit, fork, kexec, gcov, panic, delayacct, gdb, resource, selftests, async, initramfs, ipc, drivers/char, and spelling" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (90 commits) mm: fix typos in comments mm: fix typos in comments treewide: remove editor modelines and cruft ipc/sem.c: spelling fix fs: fat: fix spelling typo of values kernel/sys.c: fix typo kernel/up.c: fix typo kernel/user_namespace.c: fix typos kernel/umh.c: fix some spelling mistakes include/linux/pgtable.h: few spelling fixes mm/slab.c: fix spelling mistake "disired" -> "desired" scripts/spelling.txt: add "overflw" scripts/spelling.txt: Add "diabled" typo scripts/spelling.txt: add "overlfow" arm: print alloc free paths for address in registers mm/vmalloc: remove vwrite() mm: remove xlate_dev_kmem_ptr() drivers/char: remove /dev/kmem for good mm: fix some typos and code style problems ipc/sem.c: mundane typo fixes ...
2021-05-07treewide: remove editor modelines and cruftMasahiro Yamada118-386/+109
The section "19) Editor modelines and other cruft" in Documentation/process/coding-style.rst clearly says, "Do not include any of these in source files." I recently receive a patch to explicitly add a new one. Let's do treewide cleanups, otherwise some people follow the existing code and attempt to upstream their favoriate editor setups. It is even nicer if scripts/checkpatch.pl can check it. If we like to impose coding style in an editor-independent manner, I think editorconfig (patch [1]) is a saner solution. [1] https://lore.kernel.org/lkml/20200703073143.423557-1-danny@kdrag0n.dev/ Link: https://lkml.kernel.org/r/20210324054457.1477489-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> [auxdisplay] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07fs: fat: fix spelling typo of valuesdingsenjie1-1/+1
vaules -> values Link: https://lkml.kernel.org/r/20210302034817.30384-1-dingsenjie@163.com Signed-off-by: dingsenjie <dingsenjie@yulong.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06Merge tag 'iomap-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-7/+2
Pull more iomap updates from Darrick Wong: "Remove the now unused 'io_private' field from struct iomap_ioend, for a modest savings in memory allocation" * tag 'iomap-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: remove unused private field from ioend
2021-05-06Merge tag 'xfs-5.13-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds18-68/+143
Pull more xfs updates from Darrick Wong: "Except for the timestamp struct renaming patches, everything else in here are bug fixes: - Rename the log timestamp struct. - Remove broken transaction counter debugging that wasn't working correctly on very old filesystems. - Various fixes to make pre-lazysbcount filesystems work properly again. - Fix a free space accounting problem where we neglected to consider free space btree blocks that track metadata reservation space when deciding whether or not to allow caller to reserve space for a metadata update. - Fix incorrect pagecache clearing behavior during FUNSHARE ops. - Don't allow log writes if the data device is readonly" * tag 'xfs-5.13-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: don't allow log writes if the data device is readonly xfs: fix xfs_reflink_unshare usage of filemap_write_and_wait_range xfs: set aside allocation btree blocks from block reservation xfs: introduce in-core global counter of allocbt blocks xfs: unconditionally read all AGFs on mounts with perag reservation xfs: count free space btree blocks when scrubbing pre-lazysbcount fses xfs: update superblock counters correctly for !lazysbcount xfs: don't check agf_btreeblks on pre-lazysbcount filesystems xfs: remove obsolete AGF counter debugging xfs: rename struct xfs_legacy_ictimestamp xfs: rename xfs_ictimestamp_t
2021-05-06hpfs: replace one-element array with flexible-array memberGustavo A. R. Silva1-1/+2
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Also, this helps with the ongoing efforts to enable -Warray-bounds by fixing the following warning: CC [M] fs/hpfs/dir.o fs/hpfs/dir.c: In function `hpfs_readdir': fs/hpfs/dir.c:163:41: warning: array subscript 1 is above array bounds of `u8[1]' {aka `unsigned char[1]'} [-Warray-bounds] 163 | || de ->name[0] != 1 || de->name[1] != 1)) | ~~~~~~~~^~~ [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/109 Link: https://lkml.kernel.org/r/20210326173510.GA81212@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06nilfs2: fix typos in commentsLu Jialin3-4/+4
numer -> number in fs/nilfs2/cpfile.c Decription -> Description in fs/nilfs2/ioctl.c isntance -> instance in fs/nilfs2/the_nilfs.c Link: https://lkml.kernel.org/r/1617942951-14631-1-git-send-email-konishi.ryusuke@gmail.com Link: https://lore.kernel.org/r/20210409022519.176988-1-lujialin4@huawei.com Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06fs/nilfs2: fix misspellings using codespell toolLiu xuzhi1-2/+2
Two typos are found out by codespell tool \ in 2217th and 2254th lines of segment.c: $ codespell ./fs/nilfs2/ ./segment.c:2217 :retured ==> returned ./segment.c:2254: retured ==> returned Fix two typos found by codespell. Link: https://lkml.kernel.org/r/1617864087-8198-1-git-send-email-konishi.ryusuke@gmail.com Signed-off-by: Liu xuzhi <liu.xuzhi@zte.com.cn> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06isofs: fix fall-through warnings for ClangGustavo A. R. Silva1-0/+1
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of just letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Link: https://lkml.kernel.org/r/5b7caa73958588065fabc59032c340179b409ef5.1605896059.git.gustavoars@kernel.org Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06fs/epoll: restore waking from ep_done_scan()Davidlohr Bueso1-0/+6
Commit 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") changed the userspace visible behavior of exclusive waiters blocked on a common epoll descriptor upon a single event becoming ready. Previously, all tasks doing epoll_wait would awake, and now only one is awoken, potentially causing missed wakeups on applications that rely on this behavior, such as Apache Qpid. While the aforementioned commit aims at having only a wakeup single path in ep_poll_callback (with the exceptions of epoll_ctl cases), we need to restore the wakeup in what was the old ep_scan_ready_list() such that the next thread can be awoken, in a cascading style, after the waker's corresponding ep_send_events(). Link: https://lkml.kernel.org/r/20210405231025.33829-3-dave@stgolabs.net Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jason Baron <jbaron@akamai.com> Cc: Roman Penyaev <rpenyaev@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06proc/sysctl: fix function name error in commentszhouchuangao1-1/+1
The function name should be modified to register_sysctl_paths instead of register_sysctl_table_path. Link: https://lkml.kernel.org/r/1615807194-79646-1-git-send-email-zhouchuangao@vivo.com Signed-off-by: zhouchuangao <zhouchuangao@vivo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06proc: delete redundant subset=pid checkAlexey Dobriyan1-4/+0
Two checks in lookup and readdir code should be enough to not have third check in open code. Can't open what can't be looked up? Link: https://lkml.kernel.org/r/YFYYwIBIkytqnkxP@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Alexey Gladkov <gladkov.alexey@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06proc: mandate ->proc_lseek in "struct proc_ops"Alexey Dobriyan1-12/+2
Now that proc_ops are separate from file_operations and other operations it easy to check all instances to have ->proc_lseek hook and remove check in main code. Note: nonseekable_open() files naturally don't require ->proc_lseek. Garbage collect pde_lseek() function. [adobriyan@gmail.com: smoke test lseek()] Link: https://lkml.kernel.org/r/YG4OIhChOrVTPgdN@localhost.localdomain Link: https://lkml.kernel.org/r/YFYX0Bzwxlc7aBa/@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06proc: save LOC in __xlate_proc_name()Alexey Dobriyan1-9/+2
Can't look at this verbosity anymore. Link: https://lkml.kernel.org/r/YFYXAp/fgq405qcy@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06fs/proc/generic.c: fix incorrect pde_is_permanent checkColin Ian King1-1/+1
Currently the pde_is_permanent() check is being run on root multiple times rather than on the next proc directory entry. This looks like a copy-paste error. Fix this by replacing root with next. Addresses-Coverity: ("Copy-paste error") Link: https://lkml.kernel.org/r/20210318122633.14222-1-colin.king@canonical.com Fixes: d919b33dafb3 ("proc: faster open/read/close with "permanent" files") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Greg Kroah-Hartman <gregkh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06Merge tag 'ceph-for-5.13-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds18-684/+524
Pull ceph updates from Ilya Dryomov: "Notable items here are - a series to take advantage of David Howells' netfs helper library from Jeff - three new filesystem client metrics from Xiubo - ceph.dir.rsnaps vxattr from Yanhu - two auth-related fixes from myself, marked for stable. Interspersed is a smattering of assorted fixes and cleanups across the filesystem" * tag 'ceph-for-5.13-rc1' of git://github.com/ceph/ceph-client: (24 commits) libceph: allow addrvecs with a single NONE/blank address libceph: don't set global_id until we get an auth ticket libceph: bump CephXAuthenticate encoding version ceph: don't allow access to MDS-private inodes ceph: fix up some bare fetches of i_size ceph: convert some PAGE_SIZE invocations to thp_size() ceph: support getting ceph.dir.rsnaps vxattr ceph: drop pinned_page parameter from ceph_get_caps ceph: fix inode leak on getattr error in __fh_to_dentry ceph: only check pool permissions for regular files ceph: send opened files/pinned caps/opened inodes metrics to MDS daemon ceph: avoid counting the same request twice or more ceph: rename the metric helpers ceph: fix kerneldoc copypasta over ceph_start_io_direct ceph: use attach/detach_page_private for tracking snap context ceph: don't use d_add in ceph_handle_snapdir ceph: don't clobber i_snap_caps on non-I_NEW inode ceph: fix fall-through warnings for Clang ceph: convert ceph_readpages to ceph_readahead ceph: convert ceph_write_begin to netfs_write_begin ...
2021-05-06Merge tag 'ecryptfs-5.13-rc1-updates' of ↵Linus Torvalds14-63/+73
git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull ecryptfs updates from Tyler Hicks: "Code cleanups and a bug fix - W=1 compiler warning cleanups - Mutex initialization simplification - Protect against NULL pointer exception during mount" * tag 'ecryptfs-5.13-rc1-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: ecryptfs: fix kernel panic with null dev_name ecryptfs: remove unused helpers ecryptfs: Fix typo in message eCryptfs: Use DEFINE_MUTEX() for mutex lock ecryptfs: keystore: Fix some kernel-doc issues and demote non-conformant headers ecryptfs: inode: Help out nearly-there header and demote non-conformant ones ecryptfs: mmap: Help out one function header and demote other abuses ecryptfs: crypto: Supply some missing param descriptions and demote abuses ecryptfs: miscdev: File headers are not good kernel-doc candidates ecryptfs: main: Demote a bunch of non-conformant kernel-doc headers ecryptfs: messaging: Add missing param descriptions and demote abuses ecryptfs: super: Fix formatting, naming and kernel-doc abuses ecryptfs: file: Demote kernel-doc abuses ecryptfs: kthread: Demote file header and provide description for 'cred' ecryptfs: dentry: File headers are not good candidates for kernel-doc ecryptfs: debug: Demote a couple of kernel-doc abuses ecryptfs: read_write: File headers do not make good candidates for kernel-doc ecryptfs: use DEFINE_MUTEX() for mutex lock eCryptfs: add a semicolon
2021-05-06block: reexpand iov_iter after read/writeyangerkun1-3/+17
We get a bug: BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139 Read of size 8 at addr ffff0000d3fb11f8 by task CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted 5.10.0-00843-g352c8610ccd2 #2 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132 show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x110/0x164 lib/dump_stack.c:118 print_address_description+0x78/0x5c8 mm/kasan/report.c:385 __kasan_report mm/kasan/report.c:545 [inline] kasan_report+0x148/0x1e4 mm/kasan/report.c:562 check_memory_region_inline mm/kasan/generic.c:183 [inline] __asan_load8+0xb4/0xbc mm/kasan/generic.c:252 iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139 io_read fs/io_uring.c:3421 [inline] io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943 __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260 io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326 io_submit_sqe fs/io_uring.c:6395 [inline] io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624 __do_sys_io_uring_enter fs/io_uring.c:9013 [inline] __se_sys_io_uring_enter fs/io_uring.c:8960 [inline] __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common arch/arm64/kernel/syscall.c:158 [inline] do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227 el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367 el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383 el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670 Allocated by task 12570: stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121 kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461 kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475 __kmalloc+0x23c/0x334 mm/slub.c:3970 kmalloc include/linux/slab.h:557 [inline] __io_alloc_async_data+0x68/0x9c fs/io_uring.c:3210 io_setup_async_rw fs/io_uring.c:3229 [inline] io_read fs/io_uring.c:3436 [inline] io_issue_sqe+0x2954/0x2d64 fs/io_uring.c:5943 __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260 io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326 io_submit_sqe fs/io_uring.c:6395 [inline] io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624 __do_sys_io_uring_enter fs/io_uring.c:9013 [inline] __se_sys_io_uring_enter fs/io_uring.c:8960 [inline] __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common arch/arm64/kernel/syscall.c:158 [inline] do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227 el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367 el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383 el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670 Freed by task 12570: stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121 kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track+0x38/0x6c mm/kasan/common.c:56 kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355 __kasan_slab_free+0x124/0x150 mm/kasan/common.c:422 kasan_slab_free+0x10/0x1c mm/kasan/common.c:431 slab_free_hook mm/slub.c:1544 [inline] slab_free_freelist_hook mm/slub.c:1577 [inline] slab_free mm/slub.c:3142 [inline] kfree+0x104/0x38c mm/slub.c:4124 io_dismantle_req fs/io_uring.c:1855 [inline] __io_free_req+0x70/0x254 fs/io_uring.c:1867 io_put_req_find_next fs/io_uring.c:2173 [inline] __io_queue_sqe+0x1fc/0x520 fs/io_uring.c:6279 __io_req_task_submit+0x154/0x21c fs/io_uring.c:2051 io_req_task_submit+0x2c/0x44 fs/io_uring.c:2063 task_work_run+0xdc/0x128 kernel/task_work.c:151 get_signal+0x6f8/0x980 kernel/signal.c:2562 do_signal+0x108/0x3a4 arch/arm64/kernel/signal.c:658 do_notify_resume+0xbc/0x25c arch/arm64/kernel/signal.c:722 work_pending+0xc/0x180 blkdev_read_iter can truncate iov_iter's count since the count + pos may exceed the size of the blkdev. This will confuse io_read that we have consume the iovec. And once we do the iov_iter_revert in io_read, we will trigger the slab-out-of-bounds. Fix it by reexpand the count with size has been truncated. blkdev_write_iter can trigger the problem too. Signed-off-by: yangerkun <yangerkun@huawei.com> Acked-by: Pavel Begunkov <asml.silencec@gmail.com> Link: https://lore.kernel.org/r/20210401071807.3328235-1-yangerkun@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>