author     Uros Bizjak <ubizjak@gmail.com>                 2022-05-25 16:40:12 +0200
committer  Linus Torvalds <torvalds@linux-foundation.org>  2022-05-26 09:52:53 -0700
commit     3378323bbb9e77643a645f6b1ff10f7bdb9d61e4
tree       debb62114ffa9f58d4a24bcc1d0c48ecc008f9c8
parent     babf0bb978e3c9fce6c4eba6b744c8754fd43d8e
locking/lockref: Use try_cmpxchg64 in CMPXCHG_LOOP macro
Use try_cmpxchg64 instead of cmpxchg64 in the CMPXCHG_LOOP macro.
The x86 CMPXCHG instruction reports success in the ZF flag, so this
change saves a compare after the cmpxchg (and the related move
instruction in front of it). The main loop of lockref_get improves from:
13: 48 89 c1 mov %rax,%rcx
16: 48 c1 f9 20 sar $0x20,%rcx
1a: 83 c1 01 add $0x1,%ecx
1d: 48 89 ce mov %rcx,%rsi
20: 89 c1 mov %eax,%ecx
22: 48 89 d0 mov %rdx,%rax
25: 48 c1 e6 20 shl $0x20,%rsi
29: 48 09 f1 or %rsi,%rcx
2c: f0 48 0f b1 4d 00 lock cmpxchg %rcx,0x0(%rbp)
32: 48 39 d0 cmp %rdx,%rax
35: 75 17 jne 4e <lockref_get+0x4e>
to:
13: 48 89 ca mov %rcx,%rdx
16: 48 c1 fa 20 sar $0x20,%rdx
1a: 83 c2 01 add $0x1,%edx
1d: 48 89 d6 mov %rdx,%rsi
20: 89 ca mov %ecx,%edx
22: 48 c1 e6 20 shl $0x20,%rsi
26: 48 09 f2 or %rsi,%rdx
29: f0 48 0f b1 55 00 lock cmpxchg %rdx,0x0(%rbp)
2f: 75 02 jne 33 <lockref_get+0x33>
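
For readers following the assembly, here is a rough C-level sketch of the
two loop shapes. It is illustrative only: a plain 64-bit word and a bare
increment stand in for lockref->lock_count and the macro's CODE block, and
GCC/Clang atomic builtins stand in for the kernel's cmpxchg64/try_cmpxchg64;
this is not the actual CMPXCHG_LOOP expansion.

#include <stdint.h>

/* Old shape: cmpxchg returns the value found in memory, so the loop
 * needs its own compare afterwards (the cmp/jne pair above). */
static void get_old_style(uint64_t *lock_count)
{
	uint64_t old = __atomic_load_n(lock_count, __ATOMIC_RELAXED);

	for (;;) {
		uint64_t found = __sync_val_compare_and_swap(lock_count, old, old + 1);

		if (found == old)	/* the compare the patch removes */
			break;
		old = found;
	}
}

/* New shape: the try_cmpxchg form reports success directly (ZF on x86)
 * and refreshes 'old' on failure, so no separate compare is needed. */
static void get_new_style(uint64_t *lock_count)
{
	uint64_t old = __atomic_load_n(lock_count, __ATOMIC_RELAXED);

	while (!__atomic_compare_exchange_n(lock_count, &old, old + 1, false,
					    __ATOMIC_RELAXED, __ATOMIC_RELAXED))
		;	/* 'old' now holds the current value; just retry */
}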
[ Michael Ellerman and Mark Rutland confirm that code generation on
powerpc and arm64 respectively is also ok, even though they do not
have a native arch_try_cmpxchg() implementation, and rely on the
default fallback case - Linus ]
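
The generic fallback referred to above is, in essence, a wrapper that builds
try_cmpxchg out of a plain cmpxchg: perform the exchange, write the observed
value back into the caller's expected-value slot on failure, and return a
boolean. A minimal sketch of that shape, using a GCC/Clang builtin in place
of an arch cmpxchg (the kernel's real fallback macro lives in the generic
atomic headers and uses different internal names):

#include <stdbool.h>
#include <stdint.h>

/* Sketch of a try_cmpxchg-style fallback built on a value-returning
 * cmpxchg; illustrative, not the kernel's exact fallback macro. */
static inline bool generic_try_cmpxchg64(uint64_t *ptr, uint64_t *oldp,
					 uint64_t new_val)
{
	uint64_t old = *oldp;
	uint64_t found = __sync_val_compare_and_swap(ptr, old, new_val);

	if (found != old)
		*oldp = found;	/* hand the fresh value back for the retry */
	return found == old;
}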
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman.Long@hp.com
Cc: paulmck@linux.vnet.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r--  lib/lockref.c  9
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/lib/lockref.c b/lib/lockref.c
index 5b34bbd3eba8..c6f0b183b937 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -14,12 +14,11 @@
 	BUILD_BUG_ON(sizeof(old) != 8);					\
 	old.lock_count = READ_ONCE(lockref->lock_count);		\
 	while (likely(arch_spin_value_unlocked(old.lock.rlock.raw_lock))) {	\
-		struct lockref new = old, prev = old;			\
+		struct lockref new = old;				\
 		CODE							\
-		old.lock_count = cmpxchg64_relaxed(&lockref->lock_count,	\
-						   old.lock_count,	\
-						   new.lock_count);	\
-		if (likely(old.lock_count == prev.lock_count)) {	\
+		if (likely(try_cmpxchg64_relaxed(&lockref->lock_count,	\
+						 &old.lock_count,	\
+						 new.lock_count))) {	\
 			SUCCESS;					\
 		}							\
 		if (!--retry)						\
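
To connect the macro fragment above to a complete loop, a rough user-space
rendering of the patched CMPXCHG_LOOP structure might look like the
following, with a bare increment standing in for CODE, a return value for
SUCCESS, and an illustrative retry limit; the kernel additionally checks
arch_spin_value_unlocked() each iteration and falls back to taking the
spinlock. Names and the retry count are hypothetical.

#include <stdbool.h>
#include <stdint.h>

#define CMPXCHG_RETRIES 100	/* illustrative; not the kernel's limit */

/* Hypothetical stand-in for the fast path generated by CMPXCHG_LOOP
 * after this patch; returns false when the caller should fall back to
 * the locked slow path. */
static bool lockref_inc_fast(uint64_t *lock_count)
{
	uint64_t old = __atomic_load_n(lock_count, __ATOMIC_RELAXED);
	int retry = CMPXCHG_RETRIES;

	for (;;) {			/* kernel: while the lock looks unlocked */
		uint64_t new_val = old + 1;	/* the CODE block */

		if (__atomic_compare_exchange_n(lock_count, &old, new_val, false,
						__ATOMIC_RELAXED, __ATOMIC_RELAXED))
			return true;	/* SUCCESS */
		if (!--retry)
			return false;	/* give up, take the spinlock instead */
	}
}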