Age | Commit message (Collapse) | Author | Files | Lines |
|
We currently have a global error message buffer in cmn_err that is
protected by a spin lock that disables interrupts. Recently there
have been reports of NMI timeouts occurring when the console is
being flooded by SCSI error reports due to cmn_err() getting stuck
trying to print to the console while holding this lock (i.e. with
interrupts disabled). The NMI watchdog is seeing this CPU as
non-responding and so is triggering a panic. While the trigger for
the reported case is SCSI errors, pretty much anything that spams
the kernel log could cause this to occur.
Realistically the only reason that we have the intemediate message
buffer is to prepend the correct kernel log level prefix to the log
message. The only reason we have the lock is to protect the global
message buffer and the only reason the message buffer is global is
to keep it off the stack. Hence if we can avoid needing a global
message buffer we avoid needing the lock, and we can do this with a
small amount of cleanup and some preprocessor tricks:
1. clean up xfs_cmn_err() panic mask functionality to avoid
needing debug code in xfs_cmn_err()
2. remove the couple of "!" message prefixes that still exist that
the existing cmn_err() code steps over.
3. redefine CE_* levels directly to KERN_*
4. redefine cmn_err() and friends to use printk() directly
via variable argument length macros.
By doing this, we can completely remove the cmn_err() code and the
lock that is causing the problems, and rely solely on printk()
serialisation to ensure that we don't get garbled messages.
A series of followup patches is really needed to clean up all the
cmn_err() calls and related messages properly, but that results in a
series that is not easily back portable to enterprise kernels. Hence
this initial fix is only to address the direct problem in the lowest
impact way possible.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Recent tests writing lots of small files showed the flusher thread
being CPU bound and taking a long time to do allocations on a debug
kernel. perf showed this as the prime reason:
samples pcnt function DSO
_______ _____ ___________________________ _________________
224648.00 36.8% xfs_error_test [kernel.kallsyms]
86045.00 14.1% xfs_btree_check_sblock [kernel.kallsyms]
39778.00 6.5% prandom32 [kernel.kallsyms]
37436.00 6.1% xfs_btree_increment [kernel.kallsyms]
29278.00 4.8% xfs_btree_get_rec [kernel.kallsyms]
27717.00 4.5% random32 [kernel.kallsyms]
Walking btree blocks during allocation checking them requires each
block (a cache hit, so no I/O) call xfs_error_test(), which then
does a random32() call as the first operation. IOWs, ~50% of the
CPU is being consumed just testing whether we need to inject an
error, even though error injection is not active.
Kill this overhead when error injection is not active by adding a
global counter of active error traps and only calling into
xfs_error_test when fault injection is active.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Change the tag and file name arguments to xfs_error_report() and
xfs_corruption_error() to use a const qualifier.
Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
|
|
xfs_fs_vcmn_err can be called under a spinlock, but does a sleeping memory
allocation to create buffer for it's internal sprintf. Fortunately it's
the only caller of icmn_err, so we can merge the two and have one single
static buffer and spinlock protecting it. While we're at it make sure
we proper __attribute__ format annotations so that the compiler can detect
mismatched format strings.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
|
All the error injection is already enabled through ifdef DEBUG, so kill
the never set second cpp symbol to activate it without the rest of the
debugging infrastructure.
SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31771a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
|
Currently the xfs module init/exit code is a mess. It's farmed out over a
lot of function with very little error checking. This patch makes sure we
propagate all initialization failures properly and clean up after them.
Various runtime initializations are replaced with compile-time
initializations where possible to make this easier. The exit path is
similarly consolidated.
There's now split out function to create/destroy the kmem zones and
alloc/free the trace buffers. I've also changed the ktrace allocations to
KM_MAYFAIL and handled errors resulting from that.
And yes, we really should replace the XFS_*_TRACE ifdefs with a single
XFS_TRACE..
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:31354a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
|
|
No need for xfs to have its own hex dumping routine now that the kernel
has one.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29847a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
|
|
vfs_altfsid was just a pointer to mp->m_fixedfsid so we can trivially
replace it with the latter. vfs_fsid also was identical to m_fixedfsid
through rather obfuscated ways so we can kill it as well and simply its
only user.
SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29506a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
|
|
Patch provided by Eric Sandeen (sandeen@sandeen.net).
SGI-PV: 960897
SGI-Modid: xfs-linux-melb:xfs-kern:28038a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
|
|
The XFS quiet mount logic was inverted making quiet mounts noisy and vice
versa. Fix it.
SGI-PV: 958469
SGI-Modid: xfs-linux-melb:xfs-kern:27520a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Tim Shimmin <tes@sgi.com>
|
|
SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26802a
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
|
|
SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26749a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
|
|
SGI-PV: 951299
SGI-Modid: xfs-linux-melb:xfs-kern:25632a
Signed-off-by: Nathan Scott <nathans@sgi.com>
|
|
equivalents.
SGI-PV: 907752
SGI-Modid: xfs-linux-melb:xfs-kern:24961a
Signed-off-by: Nathan Scott <nathans@sgi.com>
|
|
boilerplate.
SGI-PV: 913862
SGI-Modid: xfs-linux:xfs-kern:23903a
Signed-off-by: Nathan Scott <nathans@sgi.com>
|
|
SGI-PV: 943122
SGI-Modid: xfs-linux:xfs-kern:23901a
Signed-off-by: Nathan Scott <nathans@sgi.com>
|
|
SGI-PV: 936255
SGI-Modid: xfs-linux:xfs-kern:192760a
Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
|
|
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
|