Age | Commit message (Collapse) | Author | Files | Lines |
|
The zlib inflate code has an old micro-optimization based on the
assumption that for pre-increment memory accesses, the compiler will
generate code that fits better into the processor's pipeline than what
would be generated for post-increment memory accesses.
This optimization was already removed in upstream zlib in 2016:
https://github.com/madler/zlib/commit/9aaec95e8211
This optimization causes UB according to C99, which says in section 6.5.6
"Additive operators": "If both the pointer operand and the result point to
elements of the same array object, or one past the last element of the
array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined".
This UB is not only a theoretical concern, but can also cause trouble for
future work on compiler-based sanitizers.
According to the zlib commit, this optimization also is not optimal
anymore with modern compilers.
Replace uses of OFF, PUP and UP_UNALIGNED with their definitions in the
POSTINC case, and remove the macro definitions, just like in the upstream
patch.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Link: http://lkml.kernel.org/r/20200507123112.252723-1-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
inflate_fast() can do either POST INC or PRE INC on its pointers walking
the memory to decompress. Default is PRE INC.
The sout pointer offset was miscalculated in one case as the calculation
assumed sout was a char * This breaks inflate_fast() iff configured to do
POST INC.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 6846ee5ca68d81e6baccf0d56221d7a00c1be18b ("zlib: Fix build of
powerpc boot wrapper") made the new optimized inflate only available on
arch's that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
This patch will again enable the optimization for all arch's by defining
our own endian independent version of unaligned access. As an added
bonus, arch's that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS do a
plain load instead.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit ac4c2a3bbe5db5fc570b1d0ee1e474db7cb22585 broke the build
of all powerpc boot wrappers.
It attempts to add an include of autoconf.h but used the wrong
path for it. It also adds -D__KERNEL__ to our boot wrapper, both
things that we pretty much didn't do on purpose so far.
We want our boot wrapper to remain independent enough of the kernel
for various reasons, one of them being that you can "wrap" an existing
kernel at distro install time which allows to ship one kernel image
and a set of boot wrappers for different platforms, the wrappers
don't have to be built out of the same kernel build tree.
It's also incorrect to do what the patch does in our boot environment
since we may not have a proper alignment exception handler which means
we may not be able to fixup the few cases where an unaligned access will
need SW emulation (depends on the core variant, could be when crossing
page or segment boundaries for example).
This patch fixes it by putting the old code back in and using the
new "fancy" variant only when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
is set, which happens not to be set on powerpc since we don't include
autoconf.h. It also reverts the changes to our boot wrapper Makefile.
This means that x86 should, afaik, keep the optimisations since its
boot wrapper does include autoconf.h and define __KERNEL__ (though I
doubt they make that much different outside of slow embedded processors).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
JFFS2 uses lesser compression ratio and inflate always ends up in "copy
direct from output" case.
This patch tries to optimize the direct copy procedure. Uses
get_unaligned() but only in one place.
The copy loop just above this one can also use this optimization, but I
havn't done so as I have not tested if it is a win there too.
On my MPC8321 this is about 17% faster on my JFFS2 root FS than the
original.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Cc: Roel Kluin <roel.kluin@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix function definitions to be ANSI-compliant:
lib/zlib_inflate/inffast.c:68:1: warning: non-ANSI definition of function 'inflate_fast'
lib/zlib_inflate/inftrees.c:33:1: warning: non-ANSI definition of function 'zlib_inflate_table'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Upgrade the zlib_inflate implementation in the kernel from a patched
version 1.1.3/4 to a patched 1.2.3.
The code in the kernel is about seven years old and I noticed that the
external zlib library's inflate performance was significantly faster (~50%)
than the code in the kernel on ARM (and faster again on x86_32).
For comparison the newer deflate code is 20% slower on ARM and 50% slower
on x86_32 but gives an approx 1% compression ratio improvement. I don't
consider this to be an improvement for kernel use so have no plans to
change the zlib_deflate code.
Various changes have been made to the zlib code in the kernel, the most
significant being the extra functions/flush option used by ppp_deflate.
This update reimplements the features PPP needs to ensure it continues to
work.
This code has been tested on ARM under both JFFS2 (with zlib compression
enabled) and ppp_deflate and on x86_32. JFFS2 sees an approx. 10% real
world file read speed improvement.
This patch also removes ZLIB_VERSION as it no longer has a correct value.
We don't need version checks anyway as the kernel's module handling will
take care of that for us. This removal is also more in keeping with the
zlib author's wishes (http://www.zlib.net/zlib_faq.html#faq24) and I've
added something to the zlib.h header to note its a modified version.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Acked-by: Joern Engel <joern@wh.fh-wedel.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
|