Age | Commit message | Author | Files | Lines
2010-04-01 | Post-release version bump to 0.19.1 (HEAD, master) | Søren Sandmann Pedersen | 1 | -2/+2
2010-04-01 | Pre-release version bump to 0.18.0 | Søren Sandmann Pedersen | 1 | -2/+2
2010-03-24 | Revert "Improve PIXREGION_NIL to return true on degenerated regions." | Matthias Hopf | 1 | -5/+1
This reverts commit ebba1493136a5a0dd7667073165b2115de203eda. Scheduled for re-discussion after stable 0.18 has been released.
2010-03-24 | Improve PIXREGION_NIL to return true on degenerated regions. | Matthias Hopf | 1 | -1/+5
Fixes Novell bug 568811.
2010-03-23 | Post-release version bump to 0.17.15 | Søren Sandmann Pedersen | 1 | -1/+1
2010-03-23 | Pre-release version bump to 0.17.14 | Søren Sandmann Pedersen | 1 | -1/+1
2010-03-23 | Merge remote branch 'ssvb/arm-fixes' | Søren Sandmann Pedersen | 7 | -388/+769
2010-03-22 | ARM: SIMD optimizations moved to a separate .S file | Siarhei Siamashka | 4 | -32/+359
This should be the last step in providing full armv4t compatibility with CPU features runtime autodetection in pixman.
2010-03-22 | ARM: SIMD optimizations updated to use common assembly calling conventions | Siarhei Siamashka | 1 | -102/+67
2010-03-22 | ARM: Helper ARM NEON assembly binding macros moved into a separate header | Siarhei Siamashka | 2 | -248/+329
This is needed for future reuse of the same macros for the other ARM assembly optimizations (armv4t, armv6)
2010-03-22 | ARM: Workaround for a NEON bug in assembler from binutils 2.18 | Siarhei Siamashka | 1 | -6/+6
The problem was reported as bug 25534 against pixman in the freedesktop.org Bugzilla. Link to a patch for binutils: http://sourceware.org/ml/binutils/2008-03/msg00260.html
For pixman the impact is a build failure when using binutils 2.18. Versions 2.19 and higher are fine. Still, some distros may be using older versions of binutils and this is causing problems.
This patch works around the problem by replacing the problematic "vmov a, b" instruction with the equivalent "vorr a, b, b". They even map to the same instruction opcode in the generated code, so the resulting binary is identical with and without the patch.
2010-03-22 | ARM: Use '.object_arch' directive in NEON assembly file | Siarhei Siamashka | 2 | -1/+9
This can be used to override the architecture recorded in the EABI object attribute section. We set a minimum arch to 'armv4'. Binutils documentation recommends to use this directive with the code performing runtime detection of CPU features. Additionally NEON/VFP EABI attributes are suppressed. And the instruction set to use is explicitly set to '.arm'. Configure test for NEON support is also updated to include a bunch of these new directives (if any of these is unsupported by the assembler, it is better to fail configure test than to fail library build). All these changes are required to fix SIGILL problem on armv4t, reported in http://lists.freedesktop.org/archives/pixman/2010-March/000123.html
2010-03-17 | Avoid a potential division-by-zero exception in window-test | Jon TURNEY | 1 | -2/+2
Avoid a division-by-zero exception if the first number returned by rand() is a multiple of 500, causing us to create a zero width pixmap, and then attempt to use get_rand(0) when generating a random stride... Fixes https://bugs.freedesktop.org/attachment.cgi?id=34162
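A minimal sketch of the failure mode, assuming get_rand(n) is implemented as rand() % n. All names below are hypothetical, and the safe variant is only one possible way to rule out a zero width, not necessarily the change made in this commit.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the test's helper: undefined behaviour
 * (typically a SIGFPE) when bound == 0. */
static int
sketch_get_rand (int bound)
{
    return rand () % bound;
}

/* Width in 0..499: zero whenever r is a multiple of 500. */
static int
sketch_width_unsafe (int r)
{
    return r % 500;
}

/* Width in 1..500: never zero, so it is always a valid bound for
 * sketch_get_rand(). */
static int
sketch_width_safe (int r)
{
    return r % 500 + 1;
}
```

With the unsafe variant, a first rand() result of 0, 500, 1000, ... yields a zero width, and any later sketch_get_rand (width) call divides by zero.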
2010-03-17 | Post-release version bump to 0.17.13 | Søren Sandmann Pedersen | 1 | -1/+1
2010-03-17 | Pre-release version bump to 0.17.12 | Søren Sandmann Pedersen | 1 | -1/+1
2010-03-17 | Specialize the fast_composite_scaled_nearest_* scalers to positive x units | Søren Sandmann Pedersen | 1 | -4/+16
This avoids a test in the inner loop, which improves performance especially for tiled sources. On x86-32, I get these results:
Before:
    op=1, src_fmt=20028888, dst_fmt=20028888, speed=306.96 MPix/s (73.18 FPS)
    op=1, src_fmt=20028888, dst_fmt=10020565, speed=102.67 MPix/s (24.48 FPS)
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=324.85 MPix/s (77.45 FPS)
After:
    op=1, src_fmt=20028888, dst_fmt=20028888, speed=332.19 MPix/s (79.20 FPS)
    op=1, src_fmt=20028888, dst_fmt=10020565, speed=110.41 MPix/s (26.32 FPS)
    op=1, src_fmt=10020565, dst_fmt=10020565, speed=363.28 MPix/s (86.61 FPS)
2010-03-17 | Add a FAST_PATH_X_UNIT_POSITIVE flag | Søren Sandmann Pedersen | 2 | -7/+14
This is the common case for a lot of transformed images. If the unit were negative, the transformation would be a reflection which is fairly rare.
2010-03-17 | Use the right format for the OVER_8888_565 fast path | Alexander Larsson | 1 | -1/+1
2010-03-17 | Add specialized fast nearest scalers | Alexander Larsson | 1 | -0/+243
This is a macroized version of SRC/OVER repeat normal/unneeded nearest neighbour scaling instantiated for some common 8888 and 565 formats. Based on work by Siarhei Siamashka
2010-03-17 | Add FAST_PATH_SAMPLES_COVER_CLIP and FAST_PATH_16BIT_SAFE | Alexander Larsson | 2 | -17/+69
FAST_PATH_SAMPLES_COVER_CLIP: This is set if the source sample grid, unrepeated but transformed, completely covers the clip destination. If this is set you can use a simple scaler that doesn't have to care about the repeat mode.
FAST_PATH_16BIT_SAFE: This signifies two things:
1) The size of the src/mask fits in a 16.16 fixed point, so something like:
    max_vx = src_image->bits.width << 16;
is allowed and is guaranteed to not overflow max_vx.
2) When stepping the source space we're guaranteed to never overflow a 16.16 fixed-point variable, even if we step one extra step in the destination space. This means that a loop doing:
    x = vx >> 16;
    vx += unit_x;
    d = src_row[x];
will never overflow vx causing x to be negative. Additionally, if you track vx like above and apply NORMAL repeat after the vx addition with something like:
    while (vx >= max_vx) vx -= max_vx;
this will never overflow vx, even on the final increment that takes vx one past the end of where we will read, which makes the repeat loop safe.
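The fragments above can be put together into one scanline loop. This is a minimal sketch, not pixman's actual scaler: the function name and signature are hypothetical, and it assumes the conditions FAST_PATH_16BIT_SAFE is described as guaranteeing (src_width << 16 fits in 32 bits), a positive unit_x, and a starting vx already in [0, max_vx).

```c
#include <stdint.h>

/* Hypothetical helper: nearest-neighbour sampling of one scanline of
 * 32-bit pixels with NORMAL (tiling) repeat.  vx and unit_x are 16.16
 * fixed-point source coordinates. */
static void
sketch_scale_nearest_row_normal (uint32_t       *dst,
                                 const uint32_t *src_row,
                                 int             src_width,
                                 int             dst_width,
                                 int32_t         vx,
                                 int32_t         unit_x)
{
    /* Assumed not to overflow (the first 16BIT_SAFE guarantee). */
    int32_t max_vx = (int32_t) src_width << 16;
    int i;

    for (i = 0; i < dst_width; i++)
    {
        int x = vx >> 16;       /* integer part selects the source pixel */

        vx += unit_x;           /* step one destination pixel */

        /* NORMAL repeat: wrap back into [0, max_vx).  This also runs
         * after the last pixel, which is the "one extra step" the
         * second guarantee covers. */
        while (vx >= max_vx)
            vx -= max_vx;

        dst[i] = src_row[x];
    }
}
```

Because unit_x is assumed positive, the loop needs no test for vx dropping below zero; only the upper wrap is handled.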
2010-03-17 | Add FAST_PATH_NO_NONE_REPEAT flag | Alexander Larsson | 2 | -3/+8
2010-03-17 | Add CONVERT_8888_TO_8888 and CONVERT_0565_TO_0565 macros | Alexander Larsson | 1 | -0/+4
These are useful for macroization
2010-03-17 | Add CONVERT_0565_TO_8888 macro | Alexander Larsson | 1 | -0/+2
This lets us simplify some fast paths since we get a consistent naming that always has 8888 and gets some value for alpha.
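For illustration, a 0565 to 8888 conversion can be sketched as a function. The name is hypothetical and this is not the macro body from the commit. It assumes the usual r5g6b5 layout, widens each channel by replicating its high bits, and fills alpha with 0xff, which is one plausible reading of "some value for alpha".

```c
#include <stdint.h>

/* Hypothetical function form of a 0565 -> 8888 conversion. */
static uint32_t
sketch_convert_0565_to_8888 (uint16_t s)
{
    uint32_t r = (s >> 11) & 0x1f;      /* 5 bits of red   */
    uint32_t g = (s >> 5)  & 0x3f;      /* 6 bits of green */
    uint32_t b =  s        & 0x1f;      /* 5 bits of blue  */

    /* Widen to 8 bits, copying the top bits into the new low bits so
     * that full intensity maps to 0xff rather than 0xf8 or 0xfc. */
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);

    /* Assumption: the source has no alpha, so treat it as opaque. */
    return 0xff000000u | (r << 16) | (g << 8) | b;
}
```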
2010-03-17 | Ensure that only the low 4 bit of 4 bit pixels are stored. | Søren Sandmann Pedersen | 2 | -9/+15
In some cases we end up trying to use the STORE_4 macro with 8-bit values, which resulted in other pixels getting overwritten. Fix this by always masking the value down to its low 4 bits. This fixes blitters-test on big-endian machines.
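A minimal sketch of why the mask matters. The function below is hypothetical and is not pixman's STORE_4 macro; it assumes two pixels per byte with the even-indexed pixel in the low nibble, whereas the real nibble order depends on endianness.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-bit pixel store. */
static void
sketch_store_4 (uint8_t *line, size_t index, uint32_t value)
{
    uint8_t  v4   = value & 0x0f;       /* keep only the low 4 bits */
    uint8_t *byte = &line[index / 2];

    if (index & 1)
        *byte = (uint8_t) ((*byte & 0x0f) | (v4 << 4));  /* high nibble */
    else
        *byte = (uint8_t) ((*byte & 0xf0) | v4);         /* low nibble  */
}
```

Without the `& 0x0f`, storing an 8-bit value such as 0x1c into the low nibble would also OR its upper bits into the neighbouring pixel.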
2010-03-16 | Fix contact address in configure.ac | Søren Sandmann Pedersen | 1 | -1/+1
2010-03-16 | Add PIXMAN_DEFINE_THREAD_LOCAL() and PIXMAN_GET_THREAD_LOCAL() macros | Søren Sandmann Pedersen | 2 | -21/+79
These macros hide the various types of thread local support. On Linux and Unix, they expand to just __thread. On Microsoft Visual C++, they expand to __declspec(thread). On OS X and other systems that don't have __thread, they expand to a complicated concoction that uses pthread_once() and pthread_getspecific()/pthread_setspecific() to get thread local variables.
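A minimal sketch of the keyword-based variants only, with hypothetical SKETCH_* names rather than pixman's definitions. The accessor yields a pointer on the assumption that a pthread_getspecific()-based fallback, omitted here, could then expose the same interface.

```c
/* Pick the compiler's thread-local storage keyword. */
#if defined(_MSC_VER)
#   define SKETCH_THREAD_LOCAL __declspec(thread)
#elif defined(__GNUC__)
#   define SKETCH_THREAD_LOCAL __thread
#else
#   define SKETCH_THREAD_LOCAL _Thread_local    /* C11 keyword */
#endif

/* Define a zero-initialised, file-local variable with one copy per thread. */
#define SKETCH_DEFINE_THREAD_LOCAL(type, name) \
    static SKETCH_THREAD_LOCAL type name

/* Get a pointer to the calling thread's copy. */
#define SKETCH_GET_THREAD_LOCAL(name) (&name)

SKETCH_DEFINE_THREAD_LOCAL (int, sketch_call_count);

/* Count calls made by the current thread; no locking is needed because
 * each thread only ever touches its own counter. */
static int
sketch_count_call (void)
{
    int *count = SKETCH_GET_THREAD_LOCAL (sketch_call_count);

    return ++*count;
}
```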
2010-03-16 | Add checks for various types of thread local storage. | Søren Sandmann Pedersen | 2 | -1/+81
OS X does not support __thread, so we have to check for it before using it. It does however support pthread_get/setspecific(), so if we don't have __thread, check if those are available.
2010-03-15 | Add Sun cc to thread-local support checks in pixman-compiler.h | Alan Coopersmith | 1 | -1/+1
Clears '#warning: "unknown compiler"' messages when building
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
2010-03-15 | Make .s target asm flag selection more portable | Alan Coopersmith | 1 | -6/+6
The previous code worked in GNU make, but caused a syntax error in Solaris make ( https://bugs.freedesktop.org/show_bug.cgi?id=27062 ) - this seems to work in both, and should hopefully not cause syntax errors in any versions of make not supporting the macro-substitution-in-macro-name feature, just cause the macro to expand to nothing.
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
2010-03-15 | Fix typo: WORDS_BIG_ENDIAN => WORDS_BIGENDIAN in pixman-edge.c | Søren Sandmann Pedersen | 1 | -1/+1
Pointed out by Andreas Falkenhahn on the cairo mailing list.
2010-03-14 | test: Add support for indexed formats to blitters-test | Søren Sandmann Pedersen | 1 | -3/+24
These formats work fine, they just need to have a palette set.
2010-03-14 | pixman.h: Only define stdint types when PIXMAN_DONT_DEFINE_STDINT is undefined | Søren Sandmann Pedersen | 1 | -0/+5
In SPICE, with Microsoft Visual C++, pixman.h is included after another file that defines these types, which causes warnings and errors. This patch allows such code to just define PIXMAN_DONT_DEFINE_STDINT to use its own version of those types.
2010-03-14 | Merge branch 'operator-table' | Søren Sandmann Pedersen | 1 | -74/+103
2010-03-14 | Merge branch 'fast-path-cache' | Søren Sandmann Pedersen | 3 | -22/+93
2010-03-14 | Change operator table to be an array of arrays of four bytes. | Søren Sandmann Pedersen | 1 | -21/+26
This makes gcc generate slightly better code for optimize_operator.
2010-03-14 | Strength reduce certain conjoint/disjoint to their normal counterparts. | Søren Sandmann Pedersen | 1 | -11/+8
This allows us to not test for them later on.
2010-03-14 | Store the operator table more compactly. | Søren Sandmann Pedersen | 1 | -98/+84
The four cases for each operator: none-are-opaque, src-is-opaque, dest-is-opaque, both-are-opaque are packed into one uint32_t per operator. The relevant strength reduced operator can then be found by packing the source-is-opaque and dest-is-opaque into two bits and shifting that number of bytes.
Chris Wilson pointed out a bug in the original version of this commit: dest_is_opaque and source_is_opaque were used as booleans, but their actual values were the results of a logical AND with the FAST_PATH_OPAQUE flag, so the shift value was wildly wrong. The only reason it actually passed the test suite (on x86) was that the compiler computed the shift amount in the cl register, and the low byte of FAST_PATH_OPAQUE happens to be 0, so no shifting actually took place, and the original operator was returned.
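A minimal sketch of the packing scheme and of the fix for the bug described above. The operator set, table contents, flag value and all SK_*/sketch_* names are hypothetical and chosen only for illustration; they are not pixman's definitions.

```c
#include <stdint.h>

/* Hypothetical, reduced operator set. */
enum sketch_op { SK_CLEAR, SK_SRC, SK_DST, SK_OVER, SK_IN, SK_N_OPS };

/* Hypothetical flag bit; like the flag in the message, its low byte is 0. */
#define SKETCH_FLAG_OPAQUE (1u << 13)

/* One byte per case: none opaque, src opaque, dest opaque, both opaque. */
#define SKETCH_PACK(none, src, dest, both)                      \
    ((uint32_t) (none)         | ((uint32_t) (src)  << 8) |     \
     ((uint32_t) (dest) << 16) | ((uint32_t) (both) << 24))

static const uint32_t sketch_operator_table[SK_N_OPS] = {
    [SK_CLEAR] = SKETCH_PACK (SK_CLEAR, SK_CLEAR, SK_CLEAR, SK_CLEAR),
    [SK_SRC]   = SKETCH_PACK (SK_SRC,   SK_SRC,   SK_SRC,   SK_SRC),
    [SK_DST]   = SKETCH_PACK (SK_DST,   SK_DST,   SK_DST,   SK_DST),
    /* OVER with an opaque source never shows the destination: SRC. */
    [SK_OVER]  = SKETCH_PACK (SK_OVER,  SK_SRC,   SK_OVER,  SK_SRC),
    /* IN scales the source by dest alpha; opaque dest makes it SRC. */
    [SK_IN]    = SKETCH_PACK (SK_IN,    SK_IN,    SK_SRC,   SK_SRC),
};

static enum sketch_op
sketch_optimize_operator (enum sketch_op op,
                          uint32_t       src_flags,
                          uint32_t       dst_flags)
{
    /* Normalise each masked flag to 0 or 1 first.  Using the raw result
     * of (flags & SKETCH_FLAG_OPAQUE) here is the bug described above:
     * the shift amount would be far larger than 24 bits. */
    unsigned src_opaque = (src_flags & SKETCH_FLAG_OPAQUE) != 0;
    unsigned dst_opaque = (dst_flags & SKETCH_FLAG_OPAQUE) != 0;
    unsigned index      = src_opaque | (dst_opaque << 1);

    return (enum sketch_op) ((sketch_operator_table[op] >> (8 * index)) & 0xff);
}
```

The lookup is a single shift and mask, so its cost does not depend on the number of operators.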
2010-03-14 | Make the operator strength reduction constant time. | Søren Sandmann Pedersen | 1 | -46/+87
By extending the operator information table to cover all operators we can replace the loop with a table look-up. At the same time, base the operator optimization on the computed flags rather than the ones in the image struct. Finally, as an extra optimization, we no longer ignore the case where there is a mask. Instead we consider the source opaque if both source and mask are opaque, or if the source is opaque and the mask is missing.
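The mask rule in the last two sentences can be sketched as a small predicate. The function name and flag value are hypothetical, and a missing mask is modelled here as a NULL pointer, which is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for the "opaque" fast-path flag. */
#define SKETCH_FLAG_OPAQUE (1u << 13)

/* The source counts as opaque if it is opaque itself and the mask is
 * either missing (NULL) or opaque as well. */
static int
sketch_source_is_opaque (uint32_t src_flags, const uint32_t *mask_flags)
{
    int src_opaque  = (src_flags & SKETCH_FLAG_OPAQUE) != 0;
    int mask_opaque = mask_flags == NULL ||
                      (*mask_flags & SKETCH_FLAG_OPAQUE) != 0;

    return src_opaque && mask_opaque;
}
```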
2010-03-14 | ARM: SIMD: Try without any CFLAGS before forcing -mcpu= | Loïc Minier | 1 | -5/+15
http://bugs.launchpad.net/bugs/535183
2010-03-12 | Eliminate trailing comma in enum | Egor Starkov | 1 | -1/+2
https://bugs.freedesktop.org/show_bug.cgi?id=27050
Pixman is not compiling with c++ compiler. During compilation it gives the following error:
    /usr/include/pixman-1/pixman.h:335: error: comma at end of enumerator list
Signed-off-by: Søren Sandmann Pedersen <ssp@redhat.com>
2010-03-06 | Add a fast path cache | Søren Sandmann Pedersen | 3 | -22/+93
This patch adds a cache in front of the fast path tables to reduce the overhead of pixman_composite(). It is fixed size with move-to-front to make sure the most popular fast paths are at the beginning of the cache. The cache is thread local to avoid locking.
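A minimal sketch of a fixed-size cache with move-to-front. The entry layout, the size of 8 and every name are hypothetical: a single integer key stands in for the real lookup key, and an integer id stands in for the fast path that would be cached. In a real implementation the array would be declared thread local so that no locking is needed.

```c
#include <stdint.h>

#define SKETCH_CACHE_SIZE 8

typedef struct
{
    uint32_t key;           /* stands in for the composite lookup key */
    int      fast_path_id;  /* stands in for the fast path; 0 = empty */
} sketch_cache_entry_t;

/* Would be thread local in practice. */
static sketch_cache_entry_t sketch_cache[SKETCH_CACHE_SIZE];

/* Return the cached id for key, or 0 on a miss.  A hit is moved to the
 * front so that popular entries are found after few comparisons. */
static int
sketch_cache_lookup (uint32_t key)
{
    int i;

    for (i = 0; i < SKETCH_CACHE_SIZE; i++)
    {
        if (sketch_cache[i].fast_path_id != 0 && sketch_cache[i].key == key)
        {
            sketch_cache_entry_t hit = sketch_cache[i];

            /* Shift entries 0..i-1 down one slot, then put the hit first. */
            for (; i > 0; i--)
                sketch_cache[i] = sketch_cache[i - 1];
            sketch_cache[0] = hit;

            return hit.fast_path_id;
        }
    }

    return 0;
}

/* Insert at the front after a miss; the last entry falls off the end. */
static void
sketch_cache_insert (uint32_t key, int fast_path_id)
{
    int i;

    for (i = SKETCH_CACHE_SIZE - 1; i > 0; i--)
        sketch_cache[i] = sketch_cache[i - 1];

    sketch_cache[0].key = key;
    sketch_cache[0].fast_path_id = fast_path_id;
}
```

On a miss the caller would search the full fast path tables and then call sketch_cache_insert() with the result.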
2010-03-05 | Post-release version bump to 0.17.11 | Søren Sandmann Pedersen | 1 | -1/+1
2010-03-05 | Pre-release version bump to 0.17.10 | Søren Sandmann Pedersen | 1 | -1/+1
2010-03-04 | Move __force_align_arg_pointer workaround before composite32() | Søren Sandmann Pedersen | 1 | -18/+18
Since otherwise the workaround won't take effect when you call pixman_image_composite32() directly.
2010-03-04 | Merge branch 'more-flags' | Søren Sandmann Pedersen | 4 | -254/+256
2010-03-03 | test: Remove obsolete comment | Søren Sandmann Pedersen | 1 | -2/+0
2010-03-03 | ARM: added 'neon_composite_over_reverse_n_8888' fast path | Siarhei Siamashka | 2 | -0/+58
This fast path function improves performance of 'poppler' cairo-perf trace.
Benchmark from ARM Cortex-A8 @720MHz
before:
    [ # ]  backend  test     min(s)  median(s)  stddev.  count
    [  0]  image    poppler  38.986  39.158     0.23%    6/6
after:
    [ # ]  backend  test     min(s)  median(s)  stddev.  count
    [  0]  image    poppler  24.981  25.136     0.28%    6/6
2010-03-03 | ARM: added 'neon_composite_src_x888_8888' fast path | Siarhei Siamashka | 2 | -0/+45
This fast path function improves performance of 'gnome-system-monitor' cairo-perf trace.
Benchmark from ARM Cortex-A8 @720MHz
before:
    [ # ]  backend  test                  min(s)  median(s)  stddev.  count
    [  0]  image    gnome-system-monitor  68.838  68.899     0.05%    5/6
after:
    [ # ]  backend  test                  min(s)  median(s)  stddev.  count
    [  0]  image    gnome-system-monitor  53.336  53.384     0.09%    6/6
2010-03-03 | ARM: added 'neon_composite_over_n_8888_8888_ca' fast path | Siarhei Siamashka | 2 | -0/+110
This fast path function improves performance of 'firefox-talos-gfx' cairo-perf trace.
Benchmark from ARM Cortex-A8 @720MHz
before:
    [ # ]  backend  test               min(s)   median(s)  stddev.  count
    [  0]  image    firefox-talos-gfx  139.969  141.176    0.35%    6/6
after:
    [ # ]  backend  test               min(s)   median(s)  stddev.  count
    [  0]  image    firefox-talos-gfx  111.810  112.196    0.23%    6/6
2010-02-24 | Restructure the flags computation in compute_image_info(). | Søren Sandmann Pedersen | 1 | -36/+59
Restructure the code to use switches instead of ifs. This saves a few comparisons and makes the code slightly easier to follow. Also add some comments.