summaryrefslogtreecommitdiff
path: root/fs/xfs/xfs_trans_ail.c
AgeCommit message (Collapse)AuthorFilesLines
2012-07-29xfs: re-enable xfsaild idle mode and fix associated racesBrian Foster1-3/+32
xfsaild idle mode logic currently leads to a couple hangs: 1.) If xfsaild is rescheduled in during an incremental scan (i.e., tout != 0) and the target has been updated since the previous run, we can hit the new target and go into idle mode with a still populated ail. 2.) A wake up is only issued when the target is pushed forward. The wake up can race with xfsaild if it is currently in the process of entering idle mode, causing future wake up events to be lost. These hangs have been reproduced and verified as fixed by running xfstests 273 in a loop on a slightly modified upstream kernel. The kernel is modified to re-enable idle mode as previously implemented (when count == 0) and with a revert of commit 670ce93f, which includes performance improvements that make this harder to reproduce. The solution, the algorithm for which has been outlined by Dave Chinner, is to modify xfsaild to enter idle mode only when the ail is empty and the push target has not been moved forward since the last push. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-05-14xfs: move xfsagino_t to xfs_types.hDave Chinner1-1/+0
Untangle the header file includes a bit by moving the definition of xfs_agino_t to xfs_types.h. This removes the dependency that xfs_ag.h has on xfs_inum.h, meaning we don't need to include xfs_inum.h everywhere we include xfs_ag.h. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-05-14xfs: pass shutdown method into xfs_trans_ail_delete_bulkDave Chinner1-2/+3
xfs_trans_ail_delete_bulk() can be called from different contexts so if the item is not in the AIL we need different shutdown for each context. Pass in the shutdown method needed so the correct action can be taken. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-05-14xfs: remove some obsolete comments in xfs_trans_ail.cChristoph Hellwig1-14/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-05-14xfs: on-stack delayed write buffer listsChristoph Hellwig1-72/+57
Queue delwri buffers on a local on-stack list instead of a per-buftarg one, and write back the buffers per-process instead of by waking up xfsbufd. This is now easily doable given that we have very few places left that write delwri buffers: - log recovery: Only done at mount time, and already forcing out the buffers synchronously using xfs_flush_buftarg - quotacheck: Same story. - dquot reclaim: Writes out dirty dquots on the LRU under memory pressure. We might want to look into doing more of this via xfsaild, but it's already more optimal than the synchronous inode reclaim that writes each buffer synchronously. - xfsaild: This is the main beneficiary of the change. By keeping a local list of buffers to write we reduce latency of writing out buffers, and more importably we can remove all the delwri list promotions which were hitting the buffer cache hard under sustained metadata loads. The implementation is very straight forward - xfs_buf_delwri_queue now gets a new list_head pointer that it adds the delwri buffers to, and all callers need to eventually submit the list using xfs_buf_delwi_submit or xfs_buf_delwi_submit_nowait. Buffers that already are on a delwri list are skipped in xfs_buf_delwri_queue, assuming they already are on another delwri list. The biggest change to pass down the buffer list was done to the AIL pushing. Now that we operate on buffers the trylock, push and pushbuf log item methods are merged into a single push routine, which tries to lock the item, and if possible add the buffer that needs writeback to the buffer list. This leads to much simpler code than the previous split but requires the individual IOP_PUSH instances to unlock and reacquire the AIL around calls to blocking routines. Given that xfsailds now also handle writing out buffers, the conditions for log forcing and the sleep times needed some small changes. The most important one is that we consider an AIL busy as long we still have buffers to push, and the other one is that we do increment the pushed LSN for buffers that are under flushing at this moment, but still count them towards the stuck items for restart purposes. Without this we could hammer on stuck items without ever forcing the log and not make progress under heavy random delete workloads on fast flash storage devices. [ Dave Chinner: - rebase on previous patches. - improved comments for XBF_DELWRI_Q handling - fix XBF_ASYNC handling in queue submission (test 106 failure) - rename delwri submit function buffer list parameters for clarity - xfs_efd_item_push() should return XFS_ITEM_PINNED ] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-05-14xfs: implement freezing by emptying the AILChristoph Hellwig1-7/+29
Now that we write back all metadata either synchronously or through the AIL we can simply implement metadata freezing in terms of emptying the AIL. The implementation for this is fairly simply and straight-forward: A new routine is added that asks the xfsaild to push the AIL to the end and waits for it to complete and send a wakeup. The routine will then loop if the AIL is not actually empty, and continue to do so until the AIL is compeltely empty. We keep an inode reclaim pass in the freeze process to avoid having memory pressure have to reclaim inodes that require dirtying the filesystem to be reclaimed after the freeze has completed. This means we can also treat unmount in the exact same way as freeze. As an upside we can now remove the radix tree based inode writeback and xfs_unmountfs_writesb. [ Dave Chinner: - Cleaned up commit message. - Added inode reclaim passes back into freeze. - Cleaned up wakeup mechanism to avoid the use of a new sleep counter variable. ] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-05-14xfs: allow assigning the tail lsn with the AIL lock heldChristoph Hellwig1-7/+15
Provide a variant of xlog_assign_tail_lsn that has the AIL lock already held. By doing so we do an additional atomic_read + atomic_set under the lock, which comes down to two instructions. Switch xfs_trans_ail_update_bulk and xfs_trans_ail_delete_bulk to the new version to reduce the number of lock roundtrips, and prepare for a new addition that would require a third lock roundtrip in xfs_trans_ail_delete_bulk. This addition is also the reason for slightly rearranging the conditionals and relying on xfs_log_space_wake for checking that the filesystem has been shut down internally. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-02-22xfs: cleanup xfs_log_space_wakeChristoph Hellwig1-2/+2
Remove the now unused opportunistic parameter, and use the the xlog_writeq_wake and xlog_reserveq_wake helpers now that we don't have to care about the opportunistic wakeups. Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-02-22xfs: remove xfs_trans_unlocked_itemChristoph Hellwig1-44/+0
There is no reason to wake up log space waiters when unlocking inodes or dquots, and the commit log has no explanation for this function either. Given that we now have exact log space wakeups everywhere we can assume the reason for this function was to paper over log space races in earlier XFS versions. Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-02-22xfs: split tail_lsn assignments from log space wakeupsChristoph Hellwig1-34/+11
Currently xfs_log_move_tail has a tail_lsn argument that is horribly overloaded: it may contain either an actual lsn to assign to the log tail, 0 as a special case to use the last sync LSN, or 1 to indicate that no tail LSN assignment should be performed, and we should opportunisticly wake up at one task waiting for log space even if we did not move the LSN. Remove the tail lsn assigned from xfs_log_move_tail and make the two callers use xlog_assign_tail_lsn instead of the current variant of partially using the code in xfs_log_move_tail and partially opencoding it. Note that means we grow an addition lock roundtrip on the AIL lock for each bulk update or delete, which is still far less than what we had before introducing the bulk operations. If this proves to be a problem we can still add a variant of xlog_assign_tail_lsn that expects the lock to be held already. Also rename the remainder of xfs_log_move_tail to xfs_log_space_wake as that name describes its functionality much better. Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2011-10-18xfs: add AIL pushing tracepointsChristoph Hellwig1-0/+8
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-18xfs: put in missed fix for merge problemAlex Elder1-1/+1
I intended to do this as part of fixing part of the conflict with the merge with Linus' tree, but evidently it didn't get included in the commit. Signed-off-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-10-17Merge branch 'master' of ↵Alex Elder1-35/+48
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux Resolved conflicts: fs/xfs/xfs_trans_priv.h: - deleted struct xfs_ail field xa_flags - kept field xa_log_flush in struct xfs_ail fs/xfs/xfs_trans_ail.c: - in xfsaild_push(), in XFS_ITEM_PUSHBUF case, replaced "flush_log = 1" with "ailp->xa_log_flush++" Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-11xfs: reduce the number of log forces from tail pushingDave Chinner1-13/+20
The AIL push code will issue a log force on ever single push loop that it exits and has encountered pinned items. It doesn't rescan these pinned items until it revisits the AIL from the start. Hence we only need to force the log once per walk from the start of the AIL to the target LSN. This results in numbers like this: xs_push_ail_flush..... 1456 xs_log_force......... 1485 For an 8-way 50M inode create workload - almost all the log forces are coming from the AIL pushing code. Reduce the number of log forces by only forcing the log if the previous walk found pinned buffers. This reduces the numbers to: xs_push_ail_flush..... 665 xs_log_force......... 682 For the same test. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-11xfs: revert to using a kthread for AIL pushingChristoph Hellwig1-32/+41
Currently we have a few issues with the way the workqueue code is used to implement AIL pushing: - it accidentally uses the same workqueue as the syncer action, and thus can be prevented from running if there are enough sync actions active in the system. - it doesn't use the HIGHPRI flag to queue at the head of the queue of work items At this point I'm not confident enough in getting all the workqueue flags and tweaks right to provide a perfectly reliable execution context for AIL pushing, which is the most important piece in XFS to make forward progress when the log fills. Revert back to use a kthread per filesystem which fixes all the above issues at the cost of having a task struct and stack around for each mounted filesystem. In addition this also gives us much better ways to diagnose any issues involving hung AIL pushing and removes a small amount of code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-11xfs: force the log if we encounter pinned buffers in .iop_pushbufChristoph Hellwig1-2/+7
We need to check for pinned buffers even in .iop_pushbuf given that inode items flush into the same buffers that may be pinned directly due operations on the unlinked inode list operating directly on buffers. To do this add a return value to .iop_pushbuf that tells the AIL push about this and use the existing log force mechanisms to unpin it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-11xfs: do not update xa_last_pushed_lsn for locked itemsChristoph Hellwig1-1/+0
If an item was locked we should not update xa_last_pushed_lsn and thus skip it when restarting the AIL scan as we need to be able to lock and write it out as soon as possible. Otherwise heavy lock contention might starve AIL pushing too easily, especially given the larger backoff once we moved xa_last_pushed_lsn all the way to the target lsn. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-08-09xfs: set cursor in xfs_ail_splice() even when AIL was emptyAlex Elder1-38/+29
In xfs_ail_splice(), if a cursor is provided it is updated to point to the last item on the list being spliced into the AIL. But if the AIL was found to be empty, the cursor (if provided) is just initialized instead. There is no reason the empty AIL case needs to be treated any differently. And treating it the same way allows this code to be rearranged a bit, with a somewhat tidier result. Signed-off-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-20xfs: convert AIL cursors to use struct list_headDave Chinner1-53/+26
The list of active AIL cursors uses a roll-your-own linked list with special casing for the AIL push cursor. Simplify this code by replacing the list with standard struct list_head lists, and use a separate list_head to track the active cursors. This allows us to treat the AIL push cursor as a generic cursor rather than as a special case, further simplifying the code. Further, fix the duplicate push cursor initialisation that the special case handling was hiding, and clean up all the comments around the active cursor list handling. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-20xfs: remove confusing ail cursor wrapperDave Chinner1-31/+19
xfs_trans_ail_cursor_set() doesn't set the cursor to the current log item, it sets it to the next item. There is already a function for doing this - xfs_trans_ail_cursor_next() - and the _set function is simply a two line wrapper. Remove it and open code the setting of the cursor in the two locations that call it to remove the confusion. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-20xfs: use a cursor for bulk AIL insertionDave Chinner1-21/+88
Delayed logging can insert tens of thousands of log items into the AIL at the same LSN. When the committing of log commit records occur, we can get insertions occurring at an LSN that is not at the end of the AIL. If there are thousands of items in the AIL on the tail LSN, each insertion has to walk the AIL to find the correct place to insert the new item into the AIL. This can consume large amounts of CPU time and block other operations from occurring while the traversals are in progress. To avoid this repeated walk, use a AIL cursor to record where we should be inserting the new items into the AIL without having to repeat the walk. The cursor infrastructure already provides this functionality for push walks, so is a simple extension of existing code. While this will not avoid the initial walk, it will avoid repeating it tens of thousands of times during a single checkpoint commit. This version includes logic improvements from Christoph Hellwig. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-05-09xfs: fix race condition in AIL push triggerDave Chinner1-6/+10
The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. One is caused by a race condition in determining whether there is a psh in progress or not. The XFS_AIL_PUSHING_BIT is used to determine whether a push is currently in progress. When the AIL push work completes, it checked whether the target changed and cleared the PUSHING bit to allow a new push to be requeued. The race condition is as follows: Thread 1 push work smp_wmb() smp_rmb() check ailp->xa_target unchanged update ailp->xa_target test/set PUSHING bit does not queue clear PUSHING bit does not requeue Now that the push target is updated, new attempts to push the AIL will not trigger as the push target will be the same, and hence despite trying to push the AIL we won't ever wake it again. The fix is to ensure that the AIL push work clears the PUSHING bit before it checks if the target is unchanged. As a result, both push triggers operate on the same test/set bit criteria, so even if we race in the push work and miss the target update, the thread requesting the push will still set the PUSHING bit and queue the push work to occur. For safety sake, the same queue check is done if the push work detects the target change, though only one of the two will will queue new work due to the use of test_and_set_bit() checks. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> (cherry picked from commit e4d3c4a43b595d5124ae824d300626e6489ae857)
2011-05-09xfs: make AIL target updates and compares 32bit safe.Dave Chinner1-3/+4
The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. One of the problems noticed was that updates of the push target are not 32 bit safe as the target is a 64 bit value. We cannot copy a 64 bit LSN without the possibility of corrupting the result when racing with another updating thread. We have function to do this update safely without needing to care about 32/64 bit issues - xfs_trans_ail_copy_lsn() - so use that when updating the AIL push target. Also move the reading of the target in the push work inside the AIL lock, and use XFS_LSN_CMP() for the unlocked comparison during work termination to close read holes as well. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> (cherry picked from commit fd5670f22fce247754243cf2ed41941e5762d990)
2011-05-09xfs: always push the AIL to the targetDave Chinner1-1/+1
The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. One of the problems discovered is a target mismatch between the item pushing loop and the target itself. The push trigger checks for the target increasing (i.e. new target > current) while the push loop only pushes items that have a LSN < current. As a result, we can get the situation where the push target is X, the items at the tail of the AIL have LSN X and they don't get pushed. The push work then completes thinking it is done, and cannot be restarted until the push target increases to >= X + 1. If the push target then never increases (because the tail is not moving), then we never run the push work again and we stall. Fix it by making sure log items with a LSN that matches the target exactly are pushed during the loop. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> (cherry picked from commit cb64026b6e8af50db598ec7c3f59d504259b00bb)
2011-05-09xfs: exit AIL push work correctly when AIL is emptyDave Chinner1-13/+13
The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. The main cause is a regression where a work exit path fails to clear the PUSHING state and recheck the target correctly. Make both exit paths do the same PUSHING bit clearing and target checking when the "no more work to be done" condition is hit. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> (cherry picked from commit ea35a20021f8497390d05b93271b4d675516c654)
2011-04-08xfs: push the AIL from memory reclaim and periodic syncDave Chinner1-2/+48
When we are short on memory, we want to expedite the cleaning of dirty objects. Hence when we run short on memory, we need to kick the AIL flushing into action to clean as many dirty objects as quickly as possible. To implement this, sample the lsn of the log item at the head of the AIL and use that as the push target for the AIL flush. Further, we keep items in the AIL that are dirty that are not tracked any other way, so we can get objects sitting in the AIL that don't get written back until the AIL is pushed. Hence to get the filesystem to the idle state, we might need to push the AIL to flush out any remaining dirty objects sitting in the AIL. This requires the same push mechanism as the reclaim push. This patch also renames xfs_trans_ail_tail() to xfs_ail_min_lsn() to match the new xfs_ail_max_lsn() function introduced in this patch. Similarly for xfs_trans_ail_push -> xfs_ail_push. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com>
2011-04-08xfs: clean up code layout in xfs_trans_ail.cDave Chinner1-136/+118
This patch rearranges the location of functions in xfs_trans_ail.c to remove the need for forward declarations of those functions in preparation for adding new functions without the need for forward declarations. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com>
2011-04-08xfs: convert the xfsaild threads to a workqueueDave Chinner1-60/+73
Similar to the xfssyncd, the per-filesystem xfsaild threads can be converted to a global workqueue and run periodically by delayed works. This makes sense for the AIL pushing because it uses variable timeouts depending on the work that needs to be done. By removing the xfsaild, we simplify the AIL pushing code and remove the need to spread the code to implement the threading and pushing across multiple files. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
2011-03-07xfs: convert xfs_cmn_err to xfs_alert_tagDave Chinner1-1/+1
Continue the conversion of the old cmn_err interface be converting all the conditional panic tag errors to xfs_alert_tag() and then removing xfs_cmn_err(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-20xfs: use AIL bulk delete function to implement single deleteDave Chinner1-65/+0
We now have two copies of AIL delete operations that are mostly duplicate functionality. The single log item deletes can be implemented via the bulk updates by turning xfs_trans_ail_delete() into a simple wrapper. This removes all the duplicate delete functionality and associated helpers. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-20xfs: use AIL bulk update function to implement single updatesDave Chinner1-88/+0
We now have two copies of AIL insert operations that are mostly duplicate functionality. The single log item updates can be implemented via the bulk updates by turning xfs_trans_ail_update() into a simple wrapper. This removes all the duplicate insert functionality and associated helpers. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-20xfs: remove all the inodes on a buffer from the AIL in bulkDave Chinner1-0/+73
When inode buffer IO completes, usually all of the inodes are removed from the AIL. This involves processing them one at a time and taking the AIL lock once for every inode. When all CPUs are processing inode IO completions, this causes excessive amount sof contention on the AIL lock. Instead, change the way we process inode IO completion in the buffer IO done callback. Allow the inode IO done callback to walk the list of IO done callbacks and pull all the inodes off the buffer in one go and then process them as a batch. Once all the inodes for removal are collected, take the AIL lock once and do a bulk removal operation to minimise traffic on the AIL lock. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-20xfs: bulk AIL insertion during transaction commitDave Chinner1-2/+107
When inserting items into the AIL from the transaction committed callbacks, we take the AIL lock for every single item that is to be inserted. For a CIL checkpoint commit, this can be tens of thousands of individual inserts, yet almost all of the items will be inserted at the same point in the AIL because they have the same index. To reduce the overhead and contention on the AIL lock for such operations, introduce a "bulk insert" operation which allows a list of log items with the same LSN to be inserted in a single operation via a list splice. To do this, we need to pre-sort the log items being committed into a temporary list for insertion. The complexity is that not every log item will end up with the same LSN, and not every item is actually inserted into the AIL. Items that don't match the commit LSN will be inserted and unpinned as per the current one-at-a-time method (relatively rare), while items that are not to be inserted will be unpinned and freed immediately. Items that are to be inserted at the given commit lsn are placed in a temporary array and inserted into the AIL in bulk each time the array fills up. As a result of this, we trade off AIL hold time for a significant reduction in traffic. lock_stat output shows that the worst case hold time is unchanged, but contention from AIL inserts drops by an order of magnitude and the number of lock traversal decreases significantly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-03xfs: clean up xfs_ail_delete()Dave Chinner1-20/+7
xfs_ail_delete() has a needlessly complex interface. It returns the log item that was passed in for deletion (which the callers then assert is identical to the one passed in), and callers of xfs_ail_delete() still need to invalidate current traversal cursors. Make xfs_ail_delete() return void, move the cursor invalidation inside it, and clean up the callers just to use the log item pointer they passed in. While cleaning up, remove the messy and unnecessary "/* ARGUSED */" comments around all these functions. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-07-26xfs: drop dmapi hooksChristoph Hellwig1-1/+0
Dmapi support was never merged upstream, but we still have a lot of hooks bloating XFS for it, all over the fast pathes of the filesystem. This patch drops over 700 lines of dmapi overhead. If we'll ever get HSM support in mainline at least the namespace events can be done much saner in the VFS instead of the individual filesystem, so it's not like this is much help for future work. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-02-02xfs: Don't issue buffer IO direct from AIL push V2Dave Chinner1-6/+7
All buffers logged into the AIL are marked as delayed write. When the AIL needs to push the buffer out, it issues an async write of the buffer. This means that IO patterns are dependent on the order of buffers in the AIL. Instead of flushing the buffer, promote the buffer in the delayed write list so that the next time the xfsbufd is run the buffer will be flushed by the xfsbufd. Return the state to the xfsaild that the buffer was promoted so that the xfsaild knows that it needs to cause the xfsbufd to run to flush the buffers that were promoted. Using the xfsbufd for issuing the IO allows us to dispatch all buffer IO from the one queue. This means that we can make much more enlightened decisions on what order to flush buffers to disk as we don't have multiple places issuing IO. Optimisations to xfsbufd will be in a future patch. Version 2 - kill XFS_ITEM_FLUSHING as it is now unused. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-01-21xfs: cleanup up xfs_log_force calling conventionsChristoph Hellwig1-1/+1
Remove the XFS_LOG_FORCE argument which was always set, and the XFS_LOG_URGE define, which was never used. Split xfs_log_force into a two helpers - xfs_log_force which forces the whole log, and xfs_log_force_lsn which forces up to the specified LSN. The underlying implementations already were entirely separate, as were the users. Also re-indent the new _xfs_log_force/_xfs_log_force which previously had a weird coding style. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-01-15xfs: Don't wake the aild once per secondDave Chinner1-8/+11
Now that the AIL push algorithm is traversal safe, we don't need a watchdog function in the xfsaild to catch pushes that fail to make progress. Remove the watchdog timeout and make pushes purely driven by demand. This will remove the once-per-second wakeup that is seen when the filesystem is idle and make laptop power misers happy. Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2009-11-17xfs: copy li_lsn before dropping AIL lockNathaniel W. Turner1-3/+20
Access to log items on the AIL is generally protected by m_ail_lock; this is particularly needed when we're getting or setting the 64-bit li_lsn on a 32-bit platform. This patch fixes a couple places where we were accessing the log item after dropping the AIL lock on 32-bit machines. This can result in a partially-zeroed log->l_tail_lsn if xfs_trans_ail_delete is racing with xfs_trans_ail_update, and in at least some cases, this can leave the l_tail_lsn with a zero cycle number, which means xlog_space_left will think the log is full (unless CONFIG_XFS_DEBUG is set, in which case we'll trip an ASSERT), leading to processes stuck forever in xlog_grant_log_space. Thanks to Adrian VanderSpek for first spotting the race potential and to Dave Chinner for debug assistance. Signed-off-by: Nathaniel W. Turner <nate@houseofnate.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2009-03-29xfs: fix various typosMalcolm Parsons1-2/+2
Signed-off-by: Malcolm Parsons <malcolm.parsons@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2008-10-30[XFS] correctly select first log item to pushDavid Chinner1-1/+1
Under heavy metadata load we are seeing log hangs. The AIL has items in it ready to be pushed, and they are within the push target window. However, we are not pushing them when the last pushed LSN is less than the LSN of the first log item on the AIL. This is a regression introduced by the AIL push cursor modifications. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32409a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-10-30[XFS] Finish removing the mount pointer from the AIL APIDavid Chinner1-20/+21
Change all the remaining AIL API functions that are passed struct xfs_mount pointers to pass pointers directly to the struct xfs_ail being used. With this conversion, all external access to the AIL is via the struct xfs_ail. Hence the operation and referencing of the AIL is almost entirely independent of the xfs_mount that is using it - it is now much more tightly tied to the log and the items it is tracking in the log than it is tied to the xfs_mount. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32353a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] Move the AIL lock into the struct xfs_ailDavid Chinner1-27/+29
Bring the ail lock inside the struct xfs_ail. This means the AIL can be entirely manipulated via the struct xfs_ail rather than needing both the struct xfs_mount and the struct xfs_ail. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32350a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] move the AIl traversal over to a consistent interfaceDavid Chinner1-75/+42
With the new cursor interface, it makes sense to make all the traversing code use the cursor interface and make the old one go away. This means more of the AIL interfacing is done by passing struct xfs_ail pointers around the place instead of struct xfs_mount pointers. We can replace the use of xfs_trans_first_ail() in xfs_log_need_covered() as it is only checking if the AIL is empty. We can do that with a call to xfs_trans_ail_tail() instead, where a zero LSN returned indicates and empty AIL... SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32348a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] Use a cursor for AIL traversal.David Chinner1-56/+151
To replace the current generation number ensuring sanity of the AIL traversal, replace it with an external cursor that is linked to the AIL. Basically, we store the next item in the cursor whenever we want to drop the AIL lock to do something to the current item. When we regain the lock. the current item may already be free, so we can't reference it, but the next item in the traversal is already held in the cursor. When we move or delete an object, we search all the active cursors and if there is an item match we clear the cursor(s) that point to the object. This forces the traversal to restart transparently. We don't invalidate the cursor on insert because the cursor still points to a valid item. If the intem is inserted between the current item and the cursor it does not matter; the traversal is considered to be past the insertion point so it will be picked up in the next traversal. Hence traversal restarts pretty much disappear altogether with this method of traversal, which should substantially reduce the overhead of pushing on a busy AIL. Version 2 o add restart logic o comment cursor interface o minor cleanups SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32347a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30[XFS] Allocate the struct xfs_ailDavid Chinner1-38/+49
Rather than embedding the struct xfs_ail in the struct xfs_mount, allocate it during AIL initialisation. Add a back pointer to the struct xfs_ail so that we can pass around the xfs_ail and still be able to access the xfs_mount if need be. This is th first step involved in isolating the AIL implementation from the surrounding filesystem code. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32346a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-04-18[XFS] replace remaining __FUNCTION__ occurrencesHarvey Harrison1-1/+1
__FUNCTION__ is gcc-specific, use __func__ SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30775a Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18[XFS] Replace custom AIL linked-list code with struct list_headJosef 'Jeff' Sipek1-89/+60
Replace the xfs_ail_entry_t with a struct list_head and clean the surrounding code up. Also fixes a livelock in xfs_trans_first_push_ail() by terminating the loop at the head of the list correctly. SGI-PV: 978682 SGI-Modid: xfs-linux-melb:xfs-kern:30636a Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-03-06[XFS] 977545 977545 977545 977545 977545 977545 xfsaild causing too manyDavid Chinner1-7/+10
wakeups Idle state is not being detected properly by the xfsaild push code. The current idle state is detected by an empty list which may never happen with mostly idle filesystem or one using lazy superblock counters. A single dirty item in the list that exists beyond the push target can result repeated looping attempting to push up to the target because it fails to check if the push target has been acheived or not. Fix by considering a dirty list with everything past the target as an idle state and set the timeout appropriately. SGI-PV: 977545 SGI-Modid: xfs-linux-melb:xfs-kern:30532a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07[XFS] Make xfs_ail_check check less by defaultDavid Chinner1-9/+28
Checking the entire AIL on every insert and remove is prohibitively expensive - the sustained sequntial create rate on a single disk drops from about 1800/s to 60/s because of this checking resulting in the xfslogd becoming cpu bound. By default on debug builds, only check the next and previous entries in the list to ensure they are ordered correctly. If you really want, define XFS_TRANS_DEBUG to use the old behaviour. SGI-PV: 972759 SGI-Modid: xfs-linux-melb:xfs-kern:30372a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>