path: root/fs/f2fs
Age | Commit message | Author | Files | Lines
2015-01-09 | f2fs: avoid double lock for cp_rwsem | Jaegeuk Kim | 1 | -2/+2
__f2fs_add_link always runs under cp_rwsem. It calls init_inode_metadata, which performs some ACL operations that previously allocated memory with GFP_KERNEL. Under memory pressure, f2fs_write_data_page can then be called, and it also grabs cp_rwsem. As Chao pointed out, this incurs a deadlock:

  Thread #1            Thread #2
  down_read
                       down_write
  down_read  -> here down_read waits forever.

Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
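The interleaving above can be replayed with a toy, single-threaded model of a reader/writer semaphore in which new readers queue behind a waiting writer. This is only a sketch: `toy_rwsem` and its helpers are hypothetical names for illustration, not kernel APIs, and the model ignores everything except the grant/block decision.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not the kernel rw_semaphore). */
struct toy_rwsem {
	int active_readers;
	int waiting_writers;
	bool writer_active;
};

/* Returns true if the read lock is granted, false if the caller would block. */
static bool toy_down_read_try(struct toy_rwsem *s)
{
	/* Readers queue behind any active or waiting writer. */
	if (s->writer_active || s->waiting_writers > 0)
		return false;
	s->active_readers++;
	return true;
}

/* Returns true if the write lock is granted; otherwise the writer is queued. */
static bool toy_down_write_try(struct toy_rwsem *s)
{
	if (s->writer_active || s->active_readers > 0) {
		s->waiting_writers++;
		return false;
	}
	s->writer_active = true;
	return true;
}

/* Replays: T1 down_read, T2 down_write, T1 down_read (nested). */
bool nested_read_deadlocks(void)
{
	struct toy_rwsem s = {0, 0, false};

	bool t1_outer = toy_down_read_try(&s);   /* T1: granted */
	assert(s.active_readers == 1);
	bool t2_write = toy_down_write_try(&s);  /* T2: queued behind T1 */
	bool t1_inner = toy_down_read_try(&s);   /* T1: queued behind T2 */

	/* T1 waits for T2, and T2 waits for T1 to drop its outer read lock. */
	return t1_outer && !t2_write && !t1_inner;
}
```

In this model the nested read can never be granted, because the only thread able to release the outer read lock is the one that is blocked.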
2015-01-09 | f2fs: activate f2fs_trace_ios | Jaegeuk Kim | 3 | -0/+7
This patch activates f2fs_trace_ios. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: activate f2fs_trace_pid | Jaegeuk Kim | 3 | -0/+7
This patch activates f2fs_trace_pid. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: add key functions for f2fs_io_tracer | Jaegeuk Kim | 2 | -0/+104
This patch adds two key functions to trace process ids and IOs. The basic idea is to:
 1. keep process ids (pids) in page->private;
 2. show pids in IO traces.
Later, we can retrieve process information from the IO traces. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: add f2fs_io_tracer support | Jaegeuk Kim | 4 | -0/+59
This patch adds:
 o initial trace.c and trace.h with skeleton functions
 o Kconfig and Makefile to activate this feature
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: use f2fs_io_info to clean up messy parameters during IO path | Jaegeuk Kim | 6 | -66/+87
This patch cleans up parameters on IO paths. The key idea is to add a block address field to f2fs_io_info and then pass this structure as the parameter. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: use ra_meta_pages to simplify readahead code in restore_node_summary | Chao Yu | 1 | -52/+13
Use the more common function ra_meta_pages() with META_POR to readahead node blocks in restore_node_summary() instead of ra_sum_pages(). This simplifies the readahead code there and lets us remove the now unused function ra_sum_pages().

changes from v2:
 o use invalidate_mapping_pages as before, as suggested by Changman Lee.
changes from v1:
 o fix one bug when using truncate_inode_pages_range, which was pointed out by Jaegeuk Kim.

Reviewed-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: merge two uchar variable in struct node_info to reduce memory cost | Chao Yu | 2 | -13/+24
This patch moves one member of struct nat_entry, _flag_, to struct node_info, so that _version_ in struct node_info and _flag_, which are both of unsigned char type, are merged into one 32-bit space in register/memory. The size of nat_entry is thus reduced from 28 bytes to 24 bytes (on a 64-bit machine, from 40 bytes to 32 bytes), and the slab memory used by f2fs is reduced.

changes from v2:
 o update description of memory usage gain for 64-bit machine, as suggested by Changman Lee.
changes from v1:
 o introduce inline copy_node_info() to copy valid data from node info, as suggested by Jaegeuk Kim; it can avoid a bug.

Reviewed-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
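The size reduction comes from structure padding, which a userspace sketch can reproduce. The layouts below are simplified stand-ins assumed for illustration (`toy_list_head` and the `_before`/`_after` struct names are hypothetical, and the field lists are a guess at the relevant members, not copied from the patch); the exact byte counts depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct list_head: two pointers. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

/* Before: flag sits alone in nat_entry and is padded up to 4 bytes. */
struct node_info_before {
	uint32_t nid;
	uint32_t ino;
	uint32_t blk_addr;
	unsigned char version;	/* followed by 3 bytes of padding */
};

struct nat_entry_before {
	struct toy_list_head list;
	unsigned char flag;	/* followed by 3 bytes of padding */
	struct node_info_before ni;
};

/* After: flag shares the padded slot that version already occupied. */
struct node_info_after {
	uint32_t nid;
	uint32_t ino;
	uint32_t blk_addr;
	unsigned char version;
	unsigned char flag;	/* followed by 2 bytes of padding */
};

struct nat_entry_after {
	struct toy_list_head list;
	struct node_info_after ni;
};

/* Returns how many bytes per entry the merged layout saves on this ABI. */
size_t nat_entry_bytes_saved(void)
{
	assert(sizeof(struct nat_entry_after) < sizeof(struct nat_entry_before));
	return sizeof(struct nat_entry_before) - sizeof(struct nat_entry_after);
}
```

With common alignment rules this saves 4 bytes per entry with 4-byte pointers and 8 bytes with 8-byte pointers, which is consistent with the 28 to 24 and 40 to 32 figures quoted in the commit message.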
2015-01-09 | f2fs: readahead contiguous current summary blocks in checkpoint | Chao Yu | 3 | -5/+20
Let's add readahead code for reading contiguous compact/normal summary blocks in checkpoint, then we will gain better performance in mount procedure.

Changes from v1:
 o remove inappropriate 'unlikely' in npages_for_summary_flush.

Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: use missing the use of f2fs_kunmap_page | Jaegeuk Kim | 1 | -2/+1
This patch calls f2fs_kunmap_page which I missed before. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: remove unnecessary call to invalidate inmemory pages | Jaegeuk Kim | 3 | -21/+0
Now we use inmemory pages for atomic write only and provide abort procedure, we don't need to truncate them explicitly. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: fix small discards not to issue redundantly | Jaegeuk Kim | 1 | -3/+5
The ckpt_valid_map and cur_valid_map are synced by seg_info_to_raw_sit. In the case of small discards, the candidates are selected before sync, while fitrim selects candidates after sync. So, for small discards, we need to add candidates only just being obsoleted. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
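One way to read "only just being obsoleted" is: blocks that were valid at the last checkpoint but are no longer valid now. The sketch below expresses that reading on a single 32-bit word of each bitmap. The function name is hypothetical and the formula is an illustration of the idea, not a quotation of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bits set to 1 mean "block is valid".
 * Returns the blocks that were valid at the last checkpoint
 * and have become invalid since then.
 */
uint32_t just_obsoleted(uint32_t ckpt_valid, uint32_t cur_valid)
{
	uint32_t changed = ckpt_valid ^ cur_valid;	/* differs between maps */
	uint32_t result = changed & ckpt_valid;		/* was valid, now is not */

	/* A discard candidate must never cover a currently valid block. */
	assert((result & cur_valid) == 0);
	return result;
}
```

Once the two maps are synced they are identical, so this expression yields no candidates; that is why the selection has to happen before the sync in this reading.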
2015-01-09 | f2fs: change atomic and volatile write policies | Jaegeuk Kim | 7 | -14/+88
This patch adds two new ioctls to release inmemory pages grabbed by atomic writes.
 o f2fs_ioc_abort_volatile_write - If the transaction failed, all the grabbed pages and data should be written.
 o f2fs_ioc_release_volatile_write - This is to enhance the performance of PERSIST mode in sqlite.
In order to avoid huge memory consumption which causes OOM, this patch changes volatile writes to use normal dirty pages instead, with flushing to the disk blocked as long as the system does not suffer from memory pressure. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: don't need to call lock_op and lock_page for abort | Jaegeuk Kim | 1 | -15/+20
We don't need to call lock_op and lock_page at the aborting path. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: fix wrong condition check to trigger f2fs_sync_fs | Jaegeuk Kim | 1 | -1/+1
If there is not enough available memory, we need to trigger f2fs_sync_fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-01-09 | f2fs: remove checking dirty_exceed | Jaegeuk Kim | 1 | -2/+0
We don't need to force to write dirty_exceeded for f2fs_balance_fs_bg. This flag was only meaningful to write bypassing conditions. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: avoid to ra unneeded blocks in recover flow | Chao Yu | 3 | -18/+23
To improve recovery speed, f2fs tries to readahead many contiguous blocks in the warm node segment. But abnormal power-offs do not occur frequently, so when mounting an f2fs image after a normal power-off, reading ahead so many blocks and then invalidating them hurts mount performance instead. It's better to readahead just the first next-block in the normal case. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: introduce is_valid_blkaddr to cleanup codes in ra_meta_pages | Chao Yu | 1 | -27/+26
This patch is a cleanup: it introduces is_valid_blkaddr() to hold the blkaddr verification code, with its upper and lower boundary values, that was previously in ra_meta_pages. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: fix to enable readahead for SSA/CP blocks | Chao Yu | 1 | -2/+15
1. We use zero as the upper boundary value when reading ahead SSA/CP blocks, so verification against the max number fails and readahead is skipped, which causes low performance.
2. The lower boundary value is not accurate for SSA/CP/POR region verification, so these values need to be redefined.
This patch fixes the above issues. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
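The effect of a zero upper boundary is easy to see in a generic range check. The sketch below is not the f2fs is_valid_blkaddr(); `toy_range` and `toy_is_valid_blkaddr` are hypothetical names, and treating the upper bound as exclusive is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: valid addresses are lower <= addr < upper. */
struct toy_range {
	uint32_t lower;
	uint32_t upper;
};

/* Returns true if blkaddr lies inside the region. */
bool toy_is_valid_blkaddr(const struct toy_range *r, uint32_t blkaddr)
{
	assert(r != NULL);
	return blkaddr >= r->lower && blkaddr < r->upper;
}
```

With `upper` left at zero, no address can satisfy `blkaddr < r->upper`, so every readahead request is rejected, matching the skipped readahead described in point 1.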
2014-12-08 | f2fs: use atomic for counting inode with inline_{dir,inode} flag | Chao Yu | 2 | -8/+11
The inline_{dir,inode} stat is increased/decreased concurrently by multiple threads, so the value is not accurate. Let's use an atomic type to count accurately. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
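A userspace analogue of this change can be sketched with C11 atomics. The kernel has its own atomic types and helpers; the `toy_stat` structure and function names below are hypothetical and stand in for them only to show the pattern of replacing a plain counter with an atomic one.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stat block; only one counter is shown. */
struct toy_stat {
	atomic_int inline_inode;
};

void toy_stat_init(struct toy_stat *s)
{
	atomic_init(&s->inline_inode, 0);
}

/* Atomic read-modify-write: concurrent callers cannot lose updates. */
void toy_stat_inc_inline_inode(struct toy_stat *s)
{
	atomic_fetch_add(&s->inline_inode, 1);
}

void toy_stat_dec_inline_inode(struct toy_stat *s)
{
	int old = atomic_fetch_sub(&s->inline_inode, 1);

	/* The counter should never be decremented below zero. */
	assert(old > 0);
}

int toy_stat_read_inline_inode(struct toy_stat *s)
{
	return atomic_load(&s->inline_inode);
}
```

A plain `int` incremented with `count++` from several threads is a separate load, add, and store, so two threads can overwrite each other's update; the atomic versions perform the whole update as one indivisible step.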
2014-12-08 | f2fs: cleanup path to need cp at fsync | Changman Lee | 1 | -36/+43
Added some comments for code readability and cleaned up the if-statement. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: check if inode state is dirty at fsync | Changman Lee | 1 | -6/+19
If inode state is dirty, go straight to write. Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: count the number of inmemory pages | Jaegeuk Kim | 3 | -1/+8
This patch adds counting # of inmemory pages in the page cache. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: release inmemory pages when the file was closed | Jaegeuk Kim | 1 | -0/+9
If file is closed, let's drop inmemory pages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: set page private for inmemory pages for truncation | Jaegeuk Kim | 1 | -0/+2
The inmemory pages should be handled by invalidate_page since they need to be released in the truncation path. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: count inline_xx in do_read_inode | Jaegeuk Kim | 1 | -2/+4
In do_read_inode, if we failed __recover_inline_status, the inode has inline flag without increasing its count. Later, f2fs_evict_inode will decrease the count, which causes -1. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08 | f2fs: do retry operations with cond_resched | Jaegeuk Kim | 4 | -38/+20
This patch revisits retry paths in f2fs. The basic idea is to use cond_resched instead of retrying from the very early stage. Suggested-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-05 | f2fs: call radix_tree_preload before radix_tree_insert | Jaegeuk Kim | 3 | -6/+19
This patch tries to fix:

 BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384
 (radix_tree_node_alloc+0x14/0x74) from [<c033d8a0>] (radix_tree_insert+0x110/0x200)
 (radix_tree_insert+0x110/0x200) from [<c02e8264>] (gc_data_segment+0x340/0x52c)
 (gc_data_segment+0x340/0x52c) from [<c02e8658>] (f2fs_gc+0x208/0x400)
 (f2fs_gc+0x208/0x400) from [<c02e8a98>] (gc_thread_func+0x248/0x28c)
 (gc_thread_func+0x248/0x28c) from [<c0139944>] (kthread+0xa0/0xac)
 (kthread+0xa0/0xac) from [<c0105ef8>] (ret_from_fork+0x14/0x3c)

The reason is that f2fs calls radix_tree_insert under enabled preemption. So, before calling it, we need to call radix_tree_preload. Otherwise, we should use _GFP_WAIT for the radix tree, and use mutex or semaphore to cover the radix tree operations. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-03 | f2fs: use rw_semaphore for nat entry lock | Jaegeuk Kim | 2 | -27/+27
Previously, we used rwlock for nat_entry lock. But now we have a lot of complex operations in set_node_addr (e.g., allocating kernel memory, handling radix_trees, and so on). So, this patch tries to change the spinlock to rw_semaphore to yield CPUs to other threads. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-03 | f2fs: fix missing kmem_cache_free | Jaegeuk Kim | 1 | -1/+1
This patch fixes missing kmem_cache_free when handling errors. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-02 | f2fs: more fast lookup for gc_inode list | Changman Lee | 2 | -19/+34
If there are many inodes that have data blocks in the victim segment, it takes a long time to find an inode in the gc_inode list. Let's use radix_tree to reduce lookup time. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
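To illustrate why a radix tree helps here, the following is a minimal userspace sketch of a fixed-depth radix tree keyed by a small integer. It is not the kernel radix_tree API; all `toy_` names, the 4-bit fanout, and the 16-bit key limit are assumptions chosen to keep the example short. A lookup touches a fixed number of nodes regardless of how many entries are stored, whereas a list search grows with the number of inodes.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_BITS   4				/* key bits consumed per level */
#define TOY_FANOUT (1u << TOY_BITS)
#define TOY_LEVELS 4				/* 4 levels x 4 bits = 16-bit keys */
#define TOY_MAXKEY (1u << (TOY_BITS * TOY_LEVELS))

struct toy_node {
	void *slot[TOY_FANOUT];	/* child nodes, or items at the last level */
};

/* Extracts the slot index for the given level, most significant bits first. */
static unsigned toy_index(unsigned key, int level)
{
	return (key >> (TOY_BITS * (TOY_LEVELS - 1 - level))) & (TOY_FANOUT - 1);
}

/* Inserts item under key; returns 0 on success, -1 on allocation failure. */
int toy_insert(struct toy_node *root, unsigned key, void *item)
{
	struct toy_node *n = root;

	assert(key < TOY_MAXKEY);
	for (int level = 0; level < TOY_LEVELS - 1; level++) {
		unsigned i = toy_index(key, level);

		if (!n->slot[i]) {
			/* Interior nodes are never freed in this sketch. */
			n->slot[i] = calloc(1, sizeof(struct toy_node));
			if (!n->slot[i])
				return -1;
		}
		n = n->slot[i];
	}
	n->slot[toy_index(key, TOY_LEVELS - 1)] = item;
	return 0;
}

/* Returns the item stored under key, or NULL if there is none. */
void *toy_lookup(const struct toy_node *root, unsigned key)
{
	const struct toy_node *n = root;

	if (key >= TOY_MAXKEY)
		return NULL;
	for (int level = 0; level < TOY_LEVELS - 1; level++) {
		n = n->slot[toy_index(key, level)];
		if (!n)
			return NULL;
	}
	return n->slot[toy_index(key, TOY_LEVELS - 1)];
}
```

In the setting of the commit, the key would be the inode number and the item the corresponding gc_inode list entry, so the list can still be walked in order while membership checks go through the tree.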
2014-12-01 | f2fs: cleanup redundant macro | Changman Lee | 1 | -3/+3
We've already made fi and sbi for inode. Let's avoid duplicated work. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-01 | f2fs: fix to return correct error number in f2fs_write_begin | Chao Yu | 1 | -1/+3
Fix the wrong error number in error path of f2fs_write_begin. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-27 | f2fs: cleanup if-statement of phase in gc_data_segment | Changman Lee | 1 | -16/+16
Little cleanup to distinguish each phase easily Signed-off-by: Changman Lee <cm224.lee@samsung.com> [Jaegeuk Kim: modify indentation for code readability] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-25 | f2fs: fix to recover converted inline_data | Jaegeuk Kim | 1 | -0/+3
If an inode has converted inline_data which was written to the disk, we should set its inode flag for further fsync so that this inline_data can be recovered from sudden power off. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-25 | f2fs: make clean the page before writing | Jaegeuk Kim | 1 | -1/+6
If a page is set to be written to the disk, we can make clean the page. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-25 | f2fs: no more dirty_nat_entires when flushing | Changman Lee | 1 | -4/+4
After flushing dirty nat entries, there must be no more dirty nat entries. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-25 | f2fs: check dirty_nat_cnt before flushing nat entries in journal | Changman Lee | 1 | -4/+3
It's meaningless to check dirty_nat_cnt after re-dirtying nat entries in journal. And although there is room for dirty nat entries if dirty_nat_cnt is zero, it's also meaningless to check __has_cursum_space. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-25 | f2fs: fix deadlock during inline_data conversion | Jaegeuk Kim | 1 | -14/+14
A deadlock can occur:

 Thread 1]                            Thread 2]
  - f2fs_write_data_pages             - f2fs_write_begin
    - lock_page(page #0)
                                        - grab_cache_page(page #X)
                                        - get_node_page(inode_page)
                                        - grab_cache_page(page #0)
                                          : to convert inline_data
    - f2fs_write_data_page
      - f2fs_write_inline_data
        - get_node_page(inode_page)

In this case, trying to lock inode_page and page #0 causes a deadlock. In order to avoid this, this patch adds a rule to the locking policy: page #0 must be locked before inode_page.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
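A fixed lock order like the one this commit introduces can be expressed as a rank check: each lock gets a rank, and a thread may only take a lock whose rank is higher than any it already holds. The sketch below is a hypothetical illustration with made-up names; it does not take real locks, it only reports whether an acquisition sequence follows the rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks: lower rank must be taken first. */
enum toy_lock_rank {
	TOY_RANK_NONE = 0,
	TOY_RANK_DATA_PAGE0 = 1,	/* page #0 of the file */
	TOY_RANK_INODE_PAGE = 2,	/* the inode's node page */
};

/* Tracks the highest-ranked lock a thread currently holds. */
struct toy_thread {
	enum toy_lock_rank highest_held;
};

/* Records the lock and returns true if taking it respects the order. */
bool toy_lock(struct toy_thread *t, enum toy_lock_rank rank)
{
	assert(rank != TOY_RANK_NONE);
	if (rank <= t->highest_held)
		return false;	/* would invert the order: potential deadlock */
	t->highest_held = rank;
	return true;
}
```

If every path takes page #0 before inode_page, two threads can never each hold one of the locks while waiting for the other, which is the cycle shown in the diagram.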
2014-11-25 | f2fs: fix typos for the word "destroy" in jump labels | Markus Elfring | 1 | -4/+4
Two jump labels were adjusted in the implementation of the create_node_manager_caches() function because these identifiers contained typos. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-23 | f2fs: fix livelock calling f2fs_iget during f2fs_evict_inode | Jaegeuk Kim | 1 | -1/+10
In f2fs_evict_inode,
  commit_inmemory_pages
    f2fs_gc
      f2fs_iget
        iget_locked
          -> wait for inode free

Here, if the inode is the same as the one being evicted, f2fs will wait forever. Actually, we should not call f2fs_balance_fs during f2fs_evict_inode to avoid this. But commit_inmem_pages calls f2fs_balance_fs by default, even if f2fs_evict_inode wants to free inmemory pages only. Hence, this patch triggers f2fs_balance_fs only when there is something to write.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-23 | f2fs: introduce f2fs_dentry_kunmap to clean up | Jaegeuk Kim | 4 | -24/+18
This patch introduces f2fs_dentry_kunmap to clean up dirty codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-23 | f2fs: fix wrong data structure when create slab | Changman Lee | 1 | -1/+1
nat_entry_set was used when creating the slab for sit_entry_set. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-23 | f2fs: call flush_dcache_page when the page was updated | Jaegeuk Kim | 1 | -0/+1
Whenever f2fs updates mapped pages, it needs to call flush_dcache_page. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-19 | f2fs: write SSA pages under memory pressure | Jaegeuk Kim | 1 | -1/+4
Under memory pressure, we don't need to skip SSA page writes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-19 | f2fs: submit bio for node blocks in the reclaim path | Jaegeuk Kim | 1 | -0/+4
If a node page is requested to be written in the reclaim path, we should submit the bio so that reclaiming it is not left pending. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-19 | f2fs: introduce struct inode_management to wrap inner fields | Chao Yu | 4 | -49/+66
Now in f2fs, we have three inode caches: ORPHAN_INO, APPEND_INO, UPDATE_INO, and we manage the fields related to each inode cache type separately in struct f2fs_sb_info. This makes the code a bit messy, so this patch introduces a new struct inode_management to wrap the inner fields as follows, which makes the code neater.

    /* for inner inode cache management */
    struct inode_management {
            struct radix_tree_root ino_root;        /* ino entry array */
            spinlock_t ino_lock;                    /* for ino entry lock */
            struct list_head ino_list;              /* inode list head */
            unsigned long ino_num;                  /* number of entries */
    };

    struct f2fs_sb_info {
            ...
            struct inode_management im[MAX_INO_ENTRY];      /* manage inode cache */
            ...
    }

Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
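The benefit of the array-of-structs layout is that one helper, parameterized by cache type, can replace per-type copies of the same code. The sketch below shows that pattern in userspace with only the counter field. The `toy_` names are hypothetical, and the order of the enum values is an assumption; only the three type names and MAX_INO_ENTRY come from the commit message.

```c
#include <assert.h>

/* Cache types named in the commit message; the ordering here is assumed. */
enum {
	ORPHAN_INO,
	APPEND_INO,
	UPDATE_INO,
	MAX_INO_ENTRY,
};

/* Reduced stand-in for struct inode_management: counter only. */
struct toy_inode_management {
	unsigned long ino_num;	/* number of entries */
};

/* Reduced stand-in for struct f2fs_sb_info. */
struct toy_sb_info {
	struct toy_inode_management im[MAX_INO_ENTRY];
};

/* One helper serves every cache type by indexing the array. */
void toy_add_ino(struct toy_sb_info *sbi, int type)
{
	assert(type >= 0 && type < MAX_INO_ENTRY);
	sbi->im[type].ino_num++;
}

unsigned long toy_ino_count(const struct toy_sb_info *sbi, int type)
{
	assert(type >= 0 && type < MAX_INO_ENTRY);
	return sbi->im[type].ino_num;
}
```

Adding a fourth cache type in this layout only requires a new enum value before MAX_INO_ENTRY; no new fields or helpers are needed.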
2014-11-19 | f2fs: remove unneeded check code with option in f2fs_remount | Chao Yu | 1 | -2/+2
Because we have checked the contrary condition in case of "if" judgment, we do not need to check the condition again in case of "else" judgment. Let's remove it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-19 | f2fs: avoid unable to restart gc thread in remount | Chao Yu | 2 | -3/+1
In f2fs_remount, we will stop gc thread and set need_restart_gc as true when new option is set without BG_GC, then if any error occurred in the following procedure, we can restore to start the gc thread. But after that, We will fail to restore gc thread in start_gc_thread as BG_GC is not set in new option, so we'd better move this condition judgment out of start_gc_thread to fix this issue. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-18 | f2fs: put the inode page when error was occurred | Jaegeuk Kim | 1 | -4/+6
We should put the inode page when an error occurs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>