Age | Commit message (Collapse) | Author | Files | Lines |
|
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull vfs folio updates from Christian Brauner:
"This contains work to port write_begin and write_end to rely on folios
for various filesystems.
This converts ocfs2, vboxfs, orangefs, jffs2, hostfs, fuse, f2fs,
ecryptfs, ntfs3, nilfs2, reiserfs, minixfs, qnx6, sysv, ufs, and
squashfs.
After this series lands a bunch of the filesystems in this list do not
mention struct page anymore"
* tag 'vfs-6.12.folio' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (61 commits)
Squashfs: Ensure all readahead pages have been used
Squashfs: Rewrite and update squashfs_readahead_fragment() to not use page->index
Squashfs: Update squashfs_readpage_block() to not use page->index
Squashfs: Update squashfs_readahead() to not use page->index
Squashfs: Update page_actor to not use page->index
jffs2: Use a folio in jffs2_garbage_collect_dnode()
jffs2: Convert jffs2_do_readpage_nolock to take a folio
buffer: Convert __block_write_begin() to take a folio
ocfs2: Convert ocfs2_write_zero_page to use a folio
fs: Convert aops->write_begin to take a folio
fs: Convert aops->write_end to take a folio
vboxsf: Use a folio in vboxsf_write_end()
orangefs: Convert orangefs_write_begin() to use a folio
orangefs: Convert orangefs_write_end() to use a folio
jffs2: Convert jffs2_write_begin() to use a folio
jffs2: Convert jffs2_write_end() to use a folio
hostfs: Convert hostfs_write_end() to use a folio
fuse: Convert fuse_write_begin() to use a folio
fuse: Convert fuse_write_end() to use a folio
f2fs: Convert f2fs_write_begin() to use a folio
...
|
|
There is only one called of alloc_page_buffers and it doesn't require
__GFP_NOFAIL so drop this allocation mode.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Link: https://lore.kernel.org/r/20240829130640.1397970-1-mhocko@kernel.org
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Almost all callers have a folio now, so change __block_write_begin()
to take a folio and remove a call to compound_head().
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Convert all callers from working on a page to working on one page
of a folio (support for working on an entire folio can come later).
Removes a lot of folio->page->folio conversions.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Most callers have a folio, and most implementations operate on a folio,
so remove the conversion from folio->page->folio to fit through this
interface.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
All callers now have a folio, so pass it in instead of converting
from a folio to a page and back to a folio again. Saves a call
to compound_head().
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
All callers are now converted, delete this compatibility wrapper. Also
fix up some comments which referred to page_mapping.
Link: https://lkml.kernel.org/r/20240423225552.4113447-7-willy@infradead.org
Link: https://lkml.kernel.org/r/20240524181813.698813-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull bdev bd_inode updates from Al Viro:
"Replacement of bdev->bd_inode with sane(r) set of primitives by me and
Yu Kuai"
* tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
RIP ->bd_inode
dasd_format(): killing the last remaining user of ->bd_inode
nilfs_attach_log_writer(): use ->bd_mapping->host instead of ->bd_inode
block/bdev.c: use the knowledge of inode/bdev coallocation
gfs2: more obvious initializations of mapping->host
fs/buffer.c: massage the remaining users of ->bd_inode to ->bd_mapping
blk_ioctl_{discard,zeroout}(): we only want ->bd_inode->i_mapping here...
grow_dev_folio(): we only want ->bd_inode->i_mapping there
use ->bd_mapping instead of ->bd_inode->i_mapping
block_device: add a pointer to struct address_space (page cache of bdev)
missing helpers: bdev_unhash(), bdev_drop()
block: move two helpers into bdev.c
block2mtd: prevent direct access of bd_inode
dm-vdo: use bdev_nr_bytes(bdev) instead of i_size_read(bdev->bd_inode)
blkdev_write_iter(): saner way to get inode and bdev
bcachefs: remove dead function bdev_sectors()
ext4: remove block_device_ejected()
erofs_buf: store address_space instead of inode
erofs: switch erofs_bread() to passing offset instead of block number
|
|
Distinguish these functions from brelse() and __brelse().
Link: https://lkml.kernel.org/r/20240416031754.4076917-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Move the documentation for __brelse() to brelse(), format it as kernel-doc
and update it from talking about pages to folios.
Link: https://lkml.kernel.org/r/20240416031754.4076917-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The extra indentation confused the kernel-doc parser, so remove it. Fix
some other wording while I'm here, and advise the user they need to call
brelse() on this buffer.
__bread_gfp() isn't used directly by filesystems, but the other wrappers
for it don't have documentation, so document it accordingly.
Link: https://lkml.kernel.org/r/20240416031754.4076917-5-willy@infradead.org
Co-developed-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Just the low-hanging fruit...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20240411145346.2516848-2-viro@zeniv.linux.org.uk
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
All callers are passing end_buffer_async_write as this argument, so we can
hardcode references to it within __block_write_full_folio(). That lets us
make end_buffer_async_write() static.
Link: https://lkml.kernel.org/r/20231215200245.748418-15-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert the function to be compatible with writepage_t so that it can be
passed to write_cache_pages() by blkdev. This removes a call to
compound_head(). We can also remove the function export as both callers
are built-in.
Link: https://lkml.kernel.org/r/20231215200245.748418-14-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Clean up the writeback paths".
Most of these patches verge on the trivial, converting filesystems that
just use block_write_full_page() to use mpage_writepages(). But as we saw
with Christoph's earlier patchset, there can be some "interesting"
gotchas, and I clearly haven't tested the majority of filesystems I've
touched here.
Patches 3 & 4 get rid of a lot of stack usage on architectures with larger
page sizes; 1024 bytes on 64-bit systems with 64KiB pages. It starts to
open the door to larger folio sizes on all architectures, but it's
certainly not enough yet.
Patch 14 is kind of trivial, but it's nice to get that simplification in.
This patch (of 14):
This function has been unused since the removal of bdev_write_page().
Link: https://lkml.kernel.org/r/20231215200245.748418-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20231215200245.748418-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
With all users converted, remove the old create_empty_buffers() and rename
folio_create_empty_buffers() to create_empty_buffers().
Link: https://lkml.kernel.org/r/20231016201114.1928083-28-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Extract this useful helper from nilfs_page_get_nth_block()
Link: https://lkml.kernel.org/r/20231016201114.1928083-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Finish the create_empty_buffers() transition", v2.
Pankaj recently added folio_create_empty_buffers() as the folio equivalent
to create_empty_buffers(). This patch set finishes the conversion by
first converting all remaining filesystems to call
folio_create_empty_buffers(), then renaming it back to
create_empty_buffers(). I took the opportunity to make a few
simplifications like making folio_create_empty_buffers() return the head
buffer and extracting get_nth_bh() from nilfs2.
A few of the patches in this series aren't directly related to
create_empty_buffers(), but I saw them while I was working on this and
thought they'd be easy enough to add to this series. Compile-tested only,
other than ext4.
This patch (of 26):
Almost all callers want to know the first BH that was allocated for this
folio. We already have that handy, so return it.
Link: https://lkml.kernel.org/r/20231016201114.1928083-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20231016201114.1928083-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Inline it into __bread_gfp().
Link: https://lkml.kernel.org/r/20230914150011.843330-9-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hui Zhu <teawater@antgroup.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Most of the callers of sb_getblk_gfp() already assumed that they were
passing the entire GFP flags to use. Fix up the two callers that didn't,
and remove the __GFP_NOFAIL from them since they both appear to correctly
handle failure.
Link: https://lkml.kernel.org/r/20230914150011.843330-8-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hui Zhu <teawater@antgroup.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Now that __getblk() is in the right place in the file, it is trivial to
call it from sb_getblk().
Link: https://lkml.kernel.org/r/20230914150011.843330-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hui Zhu <teawater@antgroup.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Move these two functions up in the file for the benefit of the next patch,
and pass in all of the GFP flags to use instead of the partial GFP flags
used by __getblk_gfp().
Link: https://lkml.kernel.org/r/20230914150011.843330-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hui Zhu <teawater@antgroup.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
grow_dev_page() is only called by grow_buffers(). grow_buffers() is only
called by __getblk_slow() and __getblk_slow() is only called from
__getblk_gfp(), so it is safe to move the GFP flags setting all the way
up. With that done, add a new bdev_getblk() entry point that leaves the
GFP flags the way the caller specified them.
[willy@infradead.org: fix grow_dev_page() error handling]
Link: https://lkml.kernel.org/r/ZRREEIwqiy5DijKB@casper.infradead.org
Link: https://lkml.kernel.org/r/20230914150011.843330-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hui Zhu <teawater@antgroup.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Add and use bdev_getblk()", v2.
This patch series fixes a bug reported by Hui Zhu; see proposed
patches v1 and v2:
https://lore.kernel.org/linux-fsdevel/20230811035705.3296-1-teawaterz@linux.alibaba.com/
https://lore.kernel.org/linux-fsdevel/20230811071519.1094-1-teawaterz@linux.alibaba.com/
I decided to go in a rather different direction for this fix, and fix a
related problem at the same time. I don't think there's any urgency to
rush this into Linus' tree, nor have I marked it for stable. Reasonable
people may disagree.
This patch (of 8):
Instead of creating entirely new flags, inherit them from grow_dev_page().
The other callers create the same flags that this function used to
create.
Link: https://lkml.kernel.org/r/20230914150011.843330-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230914150011.843330-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hui Zhu <teawater@antgroup.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Regression and bug fixes for ext4"
* tag 'ext4_for_linus-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix rec_len verify error
ext4: do not let fstrim block system suspend
ext4: move setting of trimmed bit into ext4_try_to_trim_range()
jbd2: Fix memory leak in journal_init_common()
jbd2: Remove page size assumptions
buffer: Make bh_offset() work for compound pages
|
|
If the buffer pointed to by the buffer_head is part of a compound page,
bh_offset() assumes that b_page is the precise page that contains
the data. A recent change to jbd2 inadvertently violated that assumption.
By using page_size(), we support both b_page being set to the head page
(as page_size() will return the size of the entire folio) and the precise
page (as it will return PAGE_SIZE for a tail page).
Fixes: 8147c4c4546f ("jbd2: use a folio in jbd2_journal_write_metadata_buffer()")
Reported-by: Zorro Lang <zlang@kernel.org>
Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Pull block updates from Jens Axboe:
"Pretty quiet round for this release. This contains:
- Add support for zoned storage to ublk (Andreas, Ming)
- Series improving performance for drivers that mark themselves as
needing a blocking context for issue (Bart)
- Cleanup the flush logic (Chengming)
- sed opal keyring support (Greg)
- Fixes and improvements to the integrity support (Jinyoung)
- Add some exports for bcachefs that we can hopefully delete again in
the future (Kent)
- deadline throttling fix (Zhiguo)
- Series allowing building the kernel without buffer_head support
(Christoph)
- Sanitize the bio page adding flow (Christoph)
- Write back cache fixes (Christoph)
- MD updates via Song:
- Fix perf regression for raid0 large sequential writes (Jan)
- Fix split bio iostat for raid0 (David)
- Various raid1 fixes (Heinz, Xueshi)
- raid6test build fixes (WANG)
- Deprecate bitmap file support (Christoph)
- Fix deadlock with md sync thread (Yu)
- Refactor md io accounting (Yu)
- Various non-urgent fixes (Li, Yu, Jack)
- Various fixes and cleanups (Arnd, Azeem, Chengming, Damien, Li,
Ming, Nitesh, Ruan, Tejun, Thomas, Xu)"
* tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux: (113 commits)
block: use strscpy() to instead of strncpy()
block: sed-opal: keyring support for SED keys
block: sed-opal: Implement IOC_OPAL_REVERT_LSP
block: sed-opal: Implement IOC_OPAL_DISCOVERY
blk-mq: prealloc tags when increase tagset nr_hw_queues
blk-mq: delete redundant tagset map update when fallback
blk-mq: fix tags leak when shrink nr_hw_queues
ublk: zoned: support REQ_OP_ZONE_RESET_ALL
md: raid0: account for split bio in iostat accounting
md/raid0: Fix performance regression for large sequential writes
md/raid0: Factor out helper for mapping and submitting a bio
md raid1: allow writebehind to work on any leg device set WriteMostly
md/raid1: hold the barrier until handle_read_error() finishes
md/raid1: free the r1bio before waiting for blocked rdev
md/raid1: call free_r1bio() before allow_barrier() in raid_end_bio_io()
blk-cgroup: Fix NULL deref caused by blkg_policy_data being installed before init
drivers/rnbd: restore sysfs interface to rnbd-client
md/raid5-cache: fix null-ptr-deref for r5l_flush_stripe_to_raid()
raid6: test: only check for Altivec if building on powerpc hosts
raid6: test: make sure all intermediate and artifact files are .gitignored
...
|
|
With all users converted to folio_set_bh(), remove this function.
Link: https://lkml.kernel.org/r/20230713035512.4139457-8-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Sterba <dsterba@suse.com>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Tom Rix <trix@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
block_commit_write() always returns 0, this patch changes it to return
void.
Link: https://lkml.kernel.org/r/20230626055518.842392-3-beanhuo@iokpp.de
Signed-off-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Luís Henriques <ocfs2-devel@oss.oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a new config option that controls building the buffer_head code, and
select it from all file systems and stacking drivers that need it.
For the block device nodes and alternative iomap based buffered I/O path
is provided when buffer_head support is not enabled, and iomap needs a
a small tweak to define the IOMAP_F_BUFFER_HEAD flag to 0 to not call
into the buffer_head code when it doesn't exist.
Otherwise this is just Kconfig and ifdef changes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20230801172201.1923299-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
block_page_mkwrite_return is neither block nor mkwrite specific, and
should not be under CONFIG_BLOCK. Move it to mm.h and rename it to
vmf_fs_error.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20230801172201.1923299-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull misc filesystem updates from Jan Kara:
- Rewrite kmap_local() handling in ext2
- Convert ext2 direct IO path to iomap (with some infrastructure tweaks
associated with that)
- Convert two boilerplate licenses in udf to SPDX identifiers
- Other small udf, ext2, and quota fixes and cleanups
* tag 'fs_for_v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix uninitialized array access for some pathnames
ext2: Drop fragment support
quota: fix warning in dqgrab()
quota: Properly disable quotas when add_dquot_ref() fails
fs: udf: udftime: Replace LGPL boilerplate with SPDX identifier
fs: udf: Replace GPL 2.0 boilerplate license notice with SPDX identifier
fs: Drop wait_unfrozen wait queue
ext2_find_entry()/ext2_dotdot(): callers don't need page_addr anymore
ext2_{set_link,delete_entry}(): don't bother with page_addr
ext2_put_page(): accept any pointer within the page
ext2_get_page(): saner type
ext2: use offset_in_page() instead of open-coding it as subtraction
ext2_rename(): set_link and delete_entry may fail
ext2: Add direct-io trace points
ext2: Move direct-io to use iomap
ext2: Use generic_buffers_fsync() implementation
ext4: Use generic_buffers_fsync_noflush() implementation
fs/buffer.c: Add generic_buffers_fsync*() implementation
ext2/dax: Fix ext2_setsize when len is page aligned
|
|
Most of the callers already have a folio; convert reiserfs_write_end() to
have a folio. Removes a couple of hidden calls to compound_head().
Link: https://lkml.kernel.org/r/20230612210141.730128-10-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove nine hidden calls to compound_head() by using a folio instead of a
page.
Link: https://lkml.kernel.org/r/20230612210141.730128-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Some of the higher layers like iomap takes inode_lock() when calling
generic_write_sync().
Also writeback already happens from other paths without inode lock,
so it's difficult to say that we really need sync_mapping_buffers() to
take any inode locking here. Having said that, let's add
generic_buffers_fsync/_noflush() implementation in buffer.c with no
inode_lock/unlock() for now so that filesystems like ext2 and
ext4's nojournal mode can use it.
Ext4 when got converted to iomap for direct-io already copied it's own
variant of __generic_file_fsync() without lock.
This patch adds generic_buffers_fsync()
& generic_buffers_fsync_noflush() implementations for use in filesystems
like ext2 & ext4 respectively.
Later we can review other filesystems as well to see if we can make
generic_buffers_fsync/_noflush() which does not take any inode_lock() as
the default path.
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <d573408ac8408627d23a3d2d166e748c172c4c9e.1682069716.git.ritesh.list@gmail.com>
|
|
Folio version of create_empty_buffers(). This is required to convert
create_page_buffers() to folio_create_buffers() later in the series.
It removes several calls to compound_head() as it works directly on folio
compared to create_empty_buffers(). Hence, create_empty_buffers() has
been modified to call folio_create_empty_buffers().
Link: https://lkml.kernel.org/r/20230417123618.22094-4-p.raghav@samsung.com
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Folio version of alloc_page_buffers() helper. This is required to convert
create_page_buffers() to folio_create_buffers() later in the series.
alloc_page_buffers() has been modified to call folio_alloc_buffers() which
adds one call to compound_head() but folio_alloc_buffers() removes one
call to compound_head() compared to the existing alloc_page_buffers()
implementation.
Link: https://lkml.kernel.org/r/20230417123618.22094-3-p.raghav@samsung.com
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "convert create_page_buffers to folio_create_buffers".
One of the first kernel panic we hit when we try to increase the block
size > 4k is inside create_page_buffers()[1]. Even though buffer.c
function do not support large folios (folios > PAGE_SIZE) at the moment,
these changes are required when we want to remove that constraint.
This patch (of 4):
The folio version of set_bh_page(). This is required to convert
create_page_buffers() to folio_create_buffers() later in the series.
Link: https://lkml.kernel.org/r/20230417123618.22094-1-p.raghav@samsung.com
Link: https://lkml.kernel.org/r/20230417123618.22094-2-p.raghav@samsung.com
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "Start converting buffer_heads to use folios".
I was hoping that filesystems would convert from buffer_heads to iomap,
but that's not happening particularly quickly. So the buffer_head
infrastructure needs to be converted from being page-based to being
folio-based.
This patch (of 12):
Buffer heads point to the allocation (ie the folio), not the page. This
is currently the same thing for all filesystems that use buffer heads, so
this is a safe transitional step.
Link: https://lkml.kernel.org/r/20221215214402.3522366-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20221215214402.3522366-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It it apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"The first two changes involve files outside of fs/ext4:
- submit_bh() can never return an error, so change it to return void,
and remove the unused checks from its callers
- fix I_DIRTY_TIME handling so it will be set even if the inode
already has I_DIRTY_INODE
Performance:
- Always enable i_version counter (as btrfs and xfs already do).
Remove some uneeded i_version bumps to avoid unnecessary nfs cache
invalidations
- Wake up journal waiters in FIFO order, to avoid some journal users
from not getting a journal handle for an unfairly long time
- In ext4_write_begin() allocate any necessary buffer heads before
starting the journal handle
- Don't try to prefetch the block allocation bitmaps for a read-only
file system
Bug Fixes:
- Fix a number of fast commit bugs, including resources leaks and out
of bound references in various error handling paths and/or if the
fast commit log is corrupted
- Avoid stopping the online resize early when expanding a file system
which is less than 16TiB to a size greater than 16TiB
- Fix apparent metadata corruption caused by a race with a metadata
buffer head getting migrated while it was trying to be read
- Mark the lazy initialization thread freezable to prevent suspend
failures
- Other miscellaneous bug fixes
Cleanups:
- Break up the incredibly long ext4_full_super() function by
refactoring to move code into more understandable, smaller
functions
- Remove the deprecated (and ignored) noacl and nouser_attr mount
option
- Factor out some common code in fast commit handling
- Other miscellaneous cleanups"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (53 commits)
ext4: fix potential out of bound read in ext4_fc_replay_scan()
ext4: factor out ext4_fc_get_tl()
ext4: introduce EXT4_FC_TAG_BASE_LEN helper
ext4: factor out ext4_free_ext_path()
ext4: remove unnecessary drop path references in mext_check_coverage()
ext4: update 'state->fc_regions_size' after successful memory allocation
ext4: fix potential memory leak in ext4_fc_record_regions()
ext4: fix potential memory leak in ext4_fc_record_modified_inode()
ext4: remove redundant checking in ext4_ioctl_checkpoint
jbd2: add miss release buffer head in fc_do_one_pass()
ext4: move DIOREAD_NOLOCK setting to ext4_set_def_opts()
ext4: remove useless local variable 'blocksize'
ext4: unify the ext4 super block loading operation
ext4: factor out ext4_journal_data_mode_check()
ext4: factor out ext4_load_and_init_journal()
ext4: factor out ext4_group_desc_init() and ext4_group_desc_free()
ext4: factor out ext4_geometry_check()
ext4: factor out ext4_check_feature_compatibility()
ext4: factor out ext4_init_metadata_csum()
ext4: factor out ext4_encoding_init()
...
|
|
submit_bh/submit_bh_wbc are non-blocking functions which just submit
the bio and return. The caller of submit_bh/submit_bh_wbc needs to wait
on buffer till I/O completion and then check buffer head's b_state field
to know if there was any I/O error.
Hence there is no need for these functions to have any return type.
Even now they always returns 0. Hence drop the return value and make
their return type as void to avoid any confusion.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/cb66ef823374cdd94d2d03083ce13de844fffd41.1660788334.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
bh_submit_read() has no user anymore, just remove it.
Link: https://lkml.kernel.org/r/20220901133505.2510834-15-yi.zhang@huawei.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Now that all ll_rw_block() users has been replaced to new safe helpers,
we just remove it here.
Link: https://lkml.kernel.org/r/20220901133505.2510834-13-yi.zhang@huawei.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Current ll_rw_block() helper is fragile because it assumes that locked
buffer means it's under IO which is submitted by some other who holds
the lock, it skip buffer if it failed to get the lock, so it's only
safe on the readahead path. Unfortunately, now that most filesystems
still use this helper mistakenly on the sync metadata read path. There
is no guarantee that the one who holds the buffer lock always submit IO
(e.g. buffer_migrate_folio_norefs() after commit 88dbcbb3a484 ("blkdev:
avoid migration stalls for blkdev pages"), it could lead to false
positive -EIO when submitting reading IO.
This patch add some friendly buffer read helpers to prepare replacing
ll_rw_block() and similar calls. We can only call bh_readahead_[]
helpers for the readahead paths.
Link: https://lkml.kernel.org/r/20220901133505.2510834-3-yi.zhang@huawei.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "fs/buffer: remove ll_rw_block()", v2.
ll_rw_block() will skip locked buffer before submitting IO, it assumes
that locked buffer means it is under IO. This assumption is not always
true because we cannot guarantee every buffer lock path would submit IO.
After commit 88dbcbb3a484 ("blkdev: avoid migration stalls for blkdev
pages"), buffer_migrate_folio_norefs() becomes one exceptional case, and
there may be others. So ll_rw_block() is not safe on the sync read path,
we could get false positive EIO return value when filesystem reading
metadata. It seems that it could be only used on the readahead path.
Unfortunately, many filesystem misuse the ll_rw_block() on the sync read
path. This patch set just remove ll_rw_block() and add new friendly
helpers, which could prevent false positive EIO on the read metadata path.
Thanks for the suggestion from Jan, the original discussion is at [1].
patch 1: remove unused helpers in fs/buffer.c
patch 2: add new bh_read_[*] helpers
patch 3-11: remove all ll_rw_block() calls in filesystems
patch 12-14: do some leftover cleanups.
[1]. https://lore.kernel.org/linux-mm/20220825080146.2021641-1-chengzhihao1@huawei.com/
This patch (of 14):
No one use __breadahead_gfp() and sb_breadahead_unmovable() any more,
remove them.
Link: https://lkml.kernel.org/r/20220901133505.2510834-1-yi.zhang@huawei.com
Link: https://lkml.kernel.org/r/20220901133505.2510834-2-yi.zhang@huawei.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Heming Zhao <ocfs2-devel@oss.oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit d4252071b97d ("add barriers to buffer_uptodate and
set_buffer_uptodate") added proper memory barriers to the buffer head
BH_Uptodate bit, so that anybody who tests a buffer for being up-to-date
will be guaranteed to actually see initialized state.
However, that commit didn't _just_ add the memory barrier, it also ended
up dropping the "was it already set" logic that the BUFFER_FNS() macro
had.
That's conceptually the right thing for a generic "this is a memory
barrier" operation, but in the case of the buffer contents, we really
only care about the memory barrier for the _first_ time we set the bit,
in that the only memory ordering protection we need is to avoid anybody
seeing uninitialized memory contents.
Any other access ordering wouldn't be about the BH_Uptodate bit anyway,
and would require some other proper lock (typically BH_Lock or the folio
lock). A reader that races with somebody invalidating the buffer head
isn't an issue wrt the memory ordering, it's a serialization issue.
Now, you'd think that the buffer head operations don't matter in this
day and age (and I certainly thought so), but apparently some loads
still end up being heavy users of buffer heads. In particular, the
kernel test robot reported that not having this bit access optimization
in place caused a noticeable direct IO performance regression on ext4:
fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression
although you presumably need a fast disk and a lot of cores to actually
notice.
Link: https://lore.kernel.org/all/Yw8L7HTZ%2FdE2%2Fo9C@xsang-OptiPlex-9020/
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Fengwei Yin <fengwei.yin@intel.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There are several places in the kernel where wait_on_bit is not followed
by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read).
On architectures with weak memory ordering, it may happen that memory
accesses that follow wait_on_bit are reordered before wait_on_bit and
they may return invalid data.
Fix this class of bugs by introducing a new function "test_bit_acquire"
that works like test_bit, but has acquire memory ordering semantics.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Let's have a look at this piece of code in __bread_slow:
get_bh(bh);
bh->b_end_io = end_buffer_read_sync;
submit_bh(REQ_OP_READ, 0, bh);
wait_on_buffer(bh);
if (buffer_uptodate(bh))
return bh;
Neither wait_on_buffer nor buffer_uptodate contain any memory barrier.
Consequently, if someone calls sb_bread and then reads the buffer data,
the read of buffer data may be executed before wait_on_buffer(bh) on
architectures with weak memory ordering and it may return invalid data.
Fix this bug by adding a memory barrier to set_buffer_uptodate and an
acquire barrier to buffer_uptodate (in a similar way as
folio_test_uptodate and folio_mark_uptodate).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull folio updates from Matthew Wilcox:
- Fix an accounting bug that made NR_FILE_DIRTY grow without limit
when running xfstests
- Convert more of mpage to use folios
- Remove add_to_page_cache() and add_to_page_cache_locked()
- Convert find_get_pages_range() to filemap_get_folios()
- Improvements to the read_cache_page() family of functions
- Remove a few unnecessary checks of PageError
- Some straightforward filesystem conversions to use folios
- Split PageMovable users out from address_space_operations into
their own movable_operations
- Convert aops->migratepage to aops->migrate_folio
- Remove nobh support (Christoph Hellwig)
* tag 'folio-6.0' of git://git.infradead.org/users/willy/pagecache: (78 commits)
fs: remove the NULL get_block case in mpage_writepages
fs: don't call ->writepage from __mpage_writepage
fs: remove the nobh helpers
jfs: stop using the nobh helper
ext2: remove nobh support
ntfs3: refactor ntfs_writepages
mm/folio-compat: Remove migration compatibility functions
fs: Remove aops->migratepage()
secretmem: Convert to migrate_folio
hugetlb: Convert to migrate_folio
aio: Convert to migrate_folio
f2fs: Convert to filemap_migrate_folio()
ubifs: Convert to filemap_migrate_folio()
btrfs: Convert btrfs_migratepage to migrate_folio
mm/migrate: Add filemap_migrate_folio()
mm/migrate: Convert migrate_page() to migrate_folio()
nfs: Convert to migrate_folio
btrfs: Convert btree_migratepage to migrate_folio
mm/migrate: Convert expected_page_refs() to folio_expected_refs()
mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio()
...
|