io_uring: speedup provided buffer handling - ~sima/drm

author	Jens Axboe <axboe@kernel.dk>	2022-03-08 17:46:52 -0700
committer	Jens Axboe <axboe@kernel.dk>	2022-03-10 06:33:14 -0700
commit	cc3cec8367cba76a8ae4c271eba8450f3efc1ba3 (patch)
tree	23a60579ae1cc6438208a64d7be4867f5240de9f /io_uring
parent	e7a6c00dc77aedf27a601738ea509f1caea6d673 (diff)

io_uring: speedup provided buffer handling

In testing high frequency workloads with provided buffers, we spend a lot
of time in allocating and freeing the buffer units themselves. Rather than
repeatedly free and alloc them, add a recycling cache instead. There are
two caches:

- ctx->io_buffers_cache. This is the one we grab from in the submission
  path, and it's protected by ctx->uring_lock. For inline completions, we
  can recycle straight back to this cache and not need any extra locking.

- ctx->io_buffers_comp. If we're not under uring_lock, then we use this
  list to recycle buffers. It's protected by the completion_lock.

On adding a new buffer, check io_buffers_cache. If it's empty, check if we
can splice entries from io_buffers_comp.

This reduces about 5-10% of overhead from provided buffers, bringing it
pretty close to the non-provided path.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
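The two-cache scheme described above can be sketched in user space as follows. This is not the kernel code from the patch: `struct buf`, `struct ctx`, `ctx_init`, `buf_get`, `buf_put_locked`, `buf_put_unlocked` and the `nr_allocs` counter are hypothetical names, pthread mutexes stand in for `uring_lock` and `completion_lock`, and plain singly linked lists stand in for whatever list type the kernel uses. It only illustrates the recycle-then-splice order the message describes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one provided-buffer unit. */
struct buf {
	struct buf *next;
	void *addr;
	unsigned int len;
};

/* Hypothetical stand-in for the ring context. */
struct ctx {
	pthread_mutex_t uring_lock;      /* guards 'cache' */
	pthread_mutex_t completion_lock; /* guards 'comp' */
	struct buf *cache;       /* plays the role of io_buffers_cache */
	struct buf *comp;        /* plays the role of io_buffers_comp */
	unsigned long nr_allocs; /* how often we fell back to the allocator */
};

static void ctx_init(struct ctx *c)
{
	pthread_mutex_init(&c->uring_lock, NULL);
	pthread_mutex_init(&c->completion_lock, NULL);
	c->cache = NULL;
	c->comp = NULL;
	c->nr_allocs = 0;
}

/* Submission path. The caller must hold uring_lock. */
static struct buf *buf_get(struct ctx *c)
{
	struct buf *b;

	if (!c->cache) {
		/* Cache is empty: move every entry over from 'comp'. */
		pthread_mutex_lock(&c->completion_lock);
		c->cache = c->comp;
		c->comp = NULL;
		pthread_mutex_unlock(&c->completion_lock);
	}

	b = c->cache;
	if (b) {
		c->cache = b->next;
		b->next = NULL;
		return b;
	}

	/* Both caches were empty: allocate a fresh unit. */
	c->nr_allocs++;
	return calloc(1, sizeof(*b));
}

/* Inline completion. The caller already holds uring_lock, so no extra lock. */
static void buf_put_locked(struct ctx *c, struct buf *b)
{
	assert(b != NULL);
	b->next = c->cache;
	c->cache = b;
}

/* Completion outside uring_lock: recycle under completion_lock instead. */
static void buf_put_unlocked(struct ctx *c, struct buf *b)
{
	assert(b != NULL);
	pthread_mutex_lock(&c->completion_lock);
	b->next = c->comp;
	c->comp = b;
	pthread_mutex_unlock(&c->completion_lock);
}
```

A caller would take `uring_lock` around `buf_get()` and `buf_put_locked()`, and call `buf_put_unlocked()` from any path that does not hold it. In this sketch the second lock is only taken on the slow path, when the primary cache has run dry.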

Diffstat (limited to 'io_uring')

0 files changed, 0 insertions, 0 deletions

