diff options
author | Filipe Manana <fdmanana@suse.com> | 2023-02-08 17:46:49 +0000 |
---|---|---|
committer | David Sterba <dsterba@suse.com> | 2023-02-15 19:38:55 +0100 |
commit | a724f313f84beb5b63b8844d9ec42a547e4a3c18 (patch) | |
tree | 8fe82b2e7fd20494623d5e2bc7a4964792d64c0c /fs | |
parent | 7b00dfffebd4f3444a3ec04d9e4203b7ac1acb47 (diff) |
btrfs: do unsigned integer division in the extent buffer binary search loop
In the search loop of the binary search function, we are doing a division
by 2 of the sum of the high and low slots. Because the slots are integers,
the generated assembly code for it is the following on x86_64:
0x00000000000141f1 <+145>: mov %eax,%ebx
0x00000000000141f3 <+147>: shr $0x1f,%ebx
0x00000000000141f6 <+150>: add %eax,%ebx
0x00000000000141f8 <+152>: sar %ebx
It's a few more instructions than a simple right shift, because signed
integer division needs to round towards zero. However we know that slots
can never be negative (btrfs_header_nritems() returns an u32), so we
can instead use unsigned types for the low and high slots and therefore
use unsigned integer division, which results in a single instruction on
x86_64:
0x00000000000141f0 <+144>: shr %ebx
So use unsigned types for the slots and therefore unsigned division.
This is part of a small patchset comprised of the following two patches:
btrfs: eliminate extra call when doing binary search on extent buffer
btrfs: do unsigned integer division in the extent buffer binary search loop
The following fs_mark test was run on a non-debug kernel (Debian's default
kernel config) before and after applying the patchset:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdi
MNT=/mnt/sdi
MOUNT_OPTIONS="-o ssd"
MKFS_OPTIONS="-O no-holes -R free-space-tree"
FILES=100000
THREADS=$(nproc --all)
FILE_SIZE=0
umount $DEV &> /dev/null
mkfs.btrfs -f $MKFS_OPTIONS $DEV
mount $MOUNT_OPTIONS $DEV $MNT
OPTS="-S 0 -L 6 -n $FILES -s $FILE_SIZE -t $THREADS -k"
for ((i = 1; i <= $THREADS; i++)); do
OPTS="$OPTS -d $MNT/d$i"
done
fs_mark $OPTS
umount $MNT
Results before applying patchset:
FSUse% Count Size Files/sec App Overhead
2 1200000 0 174472.0 11549868
4 2400000 0 253503.0 11694618
4 3600000 0 257833.1 11611508
6 4800000 0 247089.5 11665983
6 6000000 0 211296.1 12121244
10 7200000 0 187330.6 12548565
Results after applying patchset:
FSUse% Count Size Files/sec App Overhead
2 1200000 0 207556.0 11393252
4 2400000 0 266751.1 11347909
4 3600000 0 274397.5 11270058
6 4800000 0 259608.4 11442250
6 6000000 0 238895.8 11635921
8 7200000 0 211942.2 11873825
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'fs')
-rw-r--r-- | fs/btrfs/ctree.c | 17 | ||||
-rw-r--r-- | fs/btrfs/ctree.h | 2 |
2 files changed, 12 insertions, 7 deletions
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 5f954642860c..a5b6bb54545f 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -853,8 +853,8 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, /* * Search for a key in the given extent_buffer. * - * The lower boundary for the search is specified by the slot number @low. Use a - * value of 0 to search over the whole extent buffer. + * The lower boundary for the search is specified by the slot number @first_slot. + * Use a value of 0 to search over the whole extent buffer. * * The slot in the extent buffer is returned via @slot. If the key exists in the * extent buffer, then @slot will point to the slot where the key is, otherwise @@ -863,18 +863,23 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, * Slot may point to the total number of items (i.e. one position beyond the last * key) if the key is bigger than the last key in the extent buffer. */ -int btrfs_generic_bin_search(struct extent_buffer *eb, int low, +int btrfs_generic_bin_search(struct extent_buffer *eb, int first_slot, const struct btrfs_key *key, int *slot) { unsigned long p; int item_size; - int high = btrfs_header_nritems(eb); + /* + * Use unsigned types for the low and high slots, so that we get a more + * efficient division in the search loop below. + */ + u32 low = first_slot; + u32 high = btrfs_header_nritems(eb); int ret; const int key_size = sizeof(struct btrfs_disk_key); - if (low > high) { + if (unlikely(low > high)) { btrfs_err(eb->fs_info, - "%s: low (%d) > high (%d) eb %llu owner %llu level %d", + "%s: low (%u) > high (%u) eb %llu owner %llu level %d", __func__, low, high, eb->start, btrfs_header_owner(eb), btrfs_header_level(eb)); return -EINVAL; diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 322f2171275d..97897107fab5 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -508,7 +508,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range); int __init btrfs_ctree_init(void); void __cold btrfs_ctree_exit(void); -int btrfs_generic_bin_search(struct extent_buffer *eb, int low, +int btrfs_generic_bin_search(struct extent_buffer *eb, int first_slot, const struct btrfs_key *key, int *slot); /* |