diff options
author | Zhang Yi <yi.zhang@huawei.com> | 2023-06-06 21:59:23 +0800 |
---|---|---|
committer | Theodore Ts'o <tytso@mit.edu> | 2023-07-10 23:09:21 -0400 |
commit | c2d6fd9d6f35079f1669f0100f05b46708c74b7f (patch) | |
tree | c0fced541c7e1deafe1b68d4540191db21b31f08 /drivers/edac | |
parent | 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5 (diff) |
jbd2: recheck chechpointing non-dirty buffer
There is a long-standing metadata corruption issue that happens from
time to time, but it's very difficult to reproduce and analyse, benefit
from the JBD2_CYCLE_RECORD option, we found out that the problem is the
checkpointing process miss to write out some buffers which are raced by
another do_get_write_access(). Looks below for detail.
jbd2_log_do_checkpoint() //transaction X
//buffer A is dirty and not belones to any transaction
__buffer_relink_io() //move it to the IO list
__flush_batch()
write_dirty_buffer()
do_get_write_access()
clear_buffer_dirty
__jbd2_journal_file_buffer()
//add buffer A to a new transaction Y
lock_buffer(bh)
//doesn't write out
__jbd2_journal_remove_checkpoint()
//finish checkpoint except buffer A
//filesystem corrupt if the new transaction Y isn't fully write out.
Due to the t_checkpoint_list walking loop in jbd2_log_do_checkpoint()
have already handles waiting for buffers under IO and re-added new
transaction to complete commit, and it also removing cleaned buffers,
this makes sure the list will eventually get empty. So it's fine to
leave buffers on the t_checkpoint_list while flushing out and completely
stop using the t_checkpoint_io_list.
Cc: stable@vger.kernel.org
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Tested-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230606135928.434610-2-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Diffstat (limited to 'drivers/edac')
0 files changed, 0 insertions, 0 deletions