~mattst88/pixman

Age	Commit message	Author	Files	Lines
2012-03-13	MIPS: DSPr2: Added over_n_8888_8888_ca and over_n_8888_0565_ca fast paths. [HEAD, master]	Nemanja Lukic	4	-0/+569
	Performance numbers before/after on MIPS-74kc @ 1GHz
	Referent (before):
	lowlevel-blt-bench:
	  over_n_8888_8888_ca = L1: 8.32 L2: 7.65 M: 6.38 ( 51.08%) HT: 5.78 VT: 5.74 R: 5.84 RT: 4.39 ( 37Kops/s)
	  over_n_8888_0565_ca = L1: 7.40 L2: 6.95 M: 6.16 ( 41.06%) HT: 5.72 VT: 5.52 R: 5.63 RT: 4.28 ( 36Kops/s)
	cairo-perf-trace:
	  [ # ] backend test min(s) median(s) stddev. count
	  [ # ] image: pixman 0.25.3
	  [ 0] image xfce4-terminal-a1 138.223 139.070 0.33% 6/6
	  [ # ] image16: pixman 0.25.3
	  [ 0] image16 xfce4-terminal-a1 132.763 132.939 0.06% 5/6
	Optimized:
	lowlevel-blt-bench:
	  over_n_8888_8888_ca = L1: 19.35 L2: 23.84 M: 13.68 (109.39%) HT: 11.39 VT: 11.19 R: 11.27 RT: 6.90 ( 47Kops/s)
	  over_n_8888_0565_ca = L1: 18.68 L2: 17.00 M: 12.56 ( 83.70%) HT: 10.72 VT: 10.45 R: 10.43 RT: 5.79 ( 43Kops/s)
	cairo-perf-trace:
	  [ # ] backend test min(s) median(s) stddev. count
	  [ # ] image: pixman 0.25.3
	  [ 0] image xfce4-terminal-a1 130.400 131.720 0.46% 6/6
	  [ # ] image16: pixman 0.25.3
	  [ 0] image16 xfce4-terminal-a1 125.830 126.604 0.34% 6/6
2012-03-13	Expand TLS support beyond __thread to __declspec(thread)	Jeremy Huddleston	2	-14/+16
	This code was pretty much copied from a similar commit that I made to xorg-server in April. cf: xorg/xserver: bb4d145bd25e2aee988b100ecf1105ea3b6a40b8 Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13	Disable MMX when incompatible clang is being used.	Jeremy Huddleston	1	-0/+9
	Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13	Silence a warning about unused pixman_have_mmx	Jeremy Huddleston	1	-0/+2
	Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-03-13	Revert "Disable MMX when Clang is being used."	Jeremy Huddleston	1	-3/+0
	This reverts commit 5eb4c12a79b3017ec6cc22ab756f53f225731533.
2012-03-08	Post-release version bump to 0.25.3	Søren Sandmann Pedersen	1	-1/+1

2012-03-08	Pre-release version bump to 0.25.2	Søren Sandmann Pedersen	1	-1/+1

2012-03-08	mmx: Squash a warning by making the argument to ldl_u() const	Søren Sandmann Pedersen	1	-1/+1

2012-03-05	Just use xmmintrin.h when building with Solaris Studio compilers	Alan Coopersmith	1	-0/+4
	Since the Solaris Studio compilers don't have a mode where MMX instructions are available and SSE instructions are not, we can just use the <xmmintrin.h> header directly. Fixes build failure due to Studio not supporting the __gnu_inline__ or __artificial__ attributes. Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Matt Turner <mattst88@gmail.com>
2012-03-04	MIPS: DSPr2: Added mips_dspr2_blt and mips_dspr2_fill routines.	Nemanja Lukic	3	-0/+272
	Performance numbers before/after on MIPS-74kc @ 1GHz
	Referent (before):
	lowlevel-blt-bench:
	  src_n_0565 = L1: 238.14 L2: 233.15 M: 57.88 ( 77.23%) HT: 53.22 VT: 49.99 R: 47.73 RT: 24.79 ( 91Kops/s)
	  src_n_8888 = L1: 190.19 L2: 187.57 M: 28.94 ( 77.23%) HT: 27.91 VT: 27.33 R: 26.64 RT: 14.68 ( 77Kops/s)
	cairo-perf-trace:
	  [ # ] backend test min(s) median(s) stddev. count
	  [ # ] image: pixman 0.25.1
	  [ 0] image gnome-system-monitor 268.460 269.712 0.22% 6/6
	Optimized:
	lowlevel-blt-bench:
	  src_n_0565 = L1:1081.39 L2: 258.22 M:189.59 (252.91%) HT: 60.23 VT: 55.01 R: 53.44 RT: 23.68 ( 89Kops/s)
	  src_n_8888 = L1: 653.46 L2: 113.55 M:135.26 (360.86%) HT: 38.99 VT: 37.38 R: 34.95 RT: 18.67 ( 84Kops/s)
	cairo-perf-trace:
	  [ # ] backend test min(s) median(s) stddev. count
	  [ # ] image: pixman 0.25.1
	  [ 0] image gnome-system-monitor 246.565 246.706 0.04% 6/6
2012-03-01	pixman-access.c: Remove some unused macros	Søren Sandmann Pedersen	1	-9/+0
	The macros related to palette entries: RGB15_TO_ENTRY, RGB24_TO_ENTRY, RGB24_TO_ENTRY_Y are not used anywhere.
2012-03-01	pixman-accessors.h: Delete unused macros	Søren Sandmann Pedersen	1	-15/+0
	The MEMCPY_WRAPPED and ACCESS macros are not used anymore.
2012-03-01	Move fetching for solid bits images to pixman-noop.c	Søren Sandmann Pedersen	2	-28/+27
	This should be a bit faster because it can reuse the scanline on each iteration.
2012-03-01	lowlevel-blt-bench: add in_8_8 and in_n_8_8	Matt Turner	1	-0/+2
	Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-28	Disable implementations mentioned in the PIXMAN_DISABLE environment variable.	Søren Sandmann Pedersen	1	-11/+44
	With this, it becomes possible to do
	    PIXMAN_DISABLE="sse2 mmx" some_app
	which will run some_app without SSE2 and MMX enabled. This is useful for benchmarking, testing and narrowing down bugs.
	The current list of implementations that can be disabled: fast mmx sse2 arm-simd arm-iwmmxt arm-neon mips-dspr2 vmx
	The general and noop implementations can't be disabled because pixman depends on those being available for correct operation.
	Reviewed-by: Matt Turner <mattst88@gmail.com>
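As a rough illustration of the idea, matching an implementation name against a space-separated list such as the value of PIXMAN_DISABLE could look like the sketch below. The helper name `name_in_list` is hypothetical and is not taken from pixman's source; in real use the list would presumably come from `getenv ("PIXMAN_DISABLE")`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not pixman's actual code): returns true when
 * `name` appears as a whole space-separated word in `list`. */
static bool
name_in_list (const char *list, const char *name)
{
    size_t name_len = strlen (name);

    if (list == NULL || name_len == 0)
        return false;

    while (*list != '\0')
    {
        /* Skip any separators before the next word. */
        while (*list == ' ')
            list++;

        /* Length of the current word, up to the next space or the end. */
        size_t word_len = strcspn (list, " ");

        if (word_len == name_len && strncmp (list, name, name_len) == 0)
            return true;

        list += word_len;
    }

    return false;
}
```

Comparing whole words rather than using a substring search keeps "neon" from matching "arm-neon".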
2012-02-25	MIPS: DSPr2: Added fast-paths for SRC operation.	Nemanja Lukic	6	-1/+876
	Following fast-path functions are implemented (routines 4, 5 and 6 utilize same fast-memcpy routine):
	  1. src_x888_8888
	  2. src_8888_0565
	  3. src_0565_8888
	  4. src_0565_0565
	  5. src_8888_8888
	  6. src_0888_0888
	Performance numbers before/after on MIPS-74kc @ 1GHz
	Referent (before):
	lowlevel-blt-bench:
	  src_x888_8888 = L1: 199.35 L2: 96.54 M: 18.87 (100.68%) HT: 17.12 VT: 16.24 R: 15.43 RT: 9.33 ( 61Kops/s)
	  src_8888_0565 = L1: 71.22 L2: 51.95 M: 24.19 ( 96.17%) HT: 20.71 VT: 19.92 R: 18.15 RT: 9.92 ( 63Kops/s)
	  src_0565_8888 = L1: 38.82 L2: 36.22 M: 18.60 ( 73.95%) HT: 14.47 VT: 13.19 R: 12.97 RT: 6.61 ( 49Kops/s)
	  src_0565_0565 = L1: 286.05 L2: 155.02 M: 37.68 (100.54%) HT: 31.08 VT: 28.07 R: 26.26 RT: 11.93 ( 68Kops/s)
	  src_8888_8888 = L1: 454.32 L2: 139.15 M: 19.30 (102.98%) HT: 17.73 VT: 16.08 R: 16.62 RT: 10.45 ( 64Kops/s)
	  src_0888_0888 = L1: 190.47 L2: 106.14 M: 25.26 (101.08%) HT: 21.88 VT: 20.32 R: 18.83 RT: 10.10 ( 63Kops/s)
	cairo-perf-trace:
	  [ # ] backend test min(s) median(s) stddev. count
	  [ # ] image: pixman 0.25.1
	  [ 0] image firefox-asteroids 421.215 421.325 0.01% 4/6
	  [ 1] image firefox-planet-gnome 647.708 648.486 0.13% 6/6
	  [ 2] image gnome-system-monitor 276.073 277.506 0.38% 6/6
	  [ 3] image gnome-terminal-vim 263.866 265.229 0.39% 6/6
	  [ 4] image poppler 123.576 124.003 0.15% 6/6
	Optimized (with these optimizations):
	lowlevel-blt-bench:
	  src_x888_8888 = L1: 369.50 L2: 99.37 M: 27.19 (145.07%) HT: 20.24 VT: 19.48 R: 19.00 RT: 10.22 ( 63Kops/s)
	  src_8888_0565 = L1: 105.65 L2: 67.87 M: 25.41 (101.00%) HT: 20.78 VT: 19.84 R: 18.52 RT: 9.81 ( 63Kops/s)
	  src_0565_8888 = L1: 77.10 L2: 63.04 M: 23.37 ( 92.90%) HT: 20.29 VT: 19.37 R: 18.14 RT: 10.02 ( 63Kops/s)
	  src_0565_0565 = L1: 519.02 L2: 241.32 M: 62.35 (166.34%) HT: 33.74 VT: 27.63 R: 26.12 RT: 11.70 ( 67Kops/s)
	  src_8888_8888 = L1: 390.48 L2: 113.99 M: 30.32 (161.77%) HT: 19.55 VT: 17.05 R: 17.13 RT: 10.19 ( 63Kops/s)
	  src_0888_0888 = L1: 349.74 L2: 156.68 M: 40.68 (162.78%) HT: 25.58 VT: 20.57 R: 20.20 RT: 9.96 ( 63Kops/s)
	cairo-perf-trace:
	  [ # ] backend test min(s) median(s) stddev. count
	  [ # ] image: pixman 0.25.1
	  [ 0] image firefox-asteroids 400.050 400.308 0.04% 6/6
	  [ 1] image firefox-planet-gnome 628.978 629.364 0.07% 6/6
	  [ 2] image gnome-system-monitor 270.247 270.313 0.03% 6/6
	  [ 3] image gnome-terminal-vim 256.413 257.641 0.21% 6/6
	  [ 4] image poppler 119.540 120.023 0.21% 6/6
2012-02-25	MIPS: DSPr2: Basic infrastructure for MIPS architecture	Nemanja Lukic	6	-0/+205
	MIPS DSP instruction set extensions
2012-02-24	lowlevel-blt: add over_x888_n_8888	Matt Turner	1	-0/+1
	Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24	lowlevel-blt: add over_8888_8888	Matt Turner	1	-0/+1
	Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24	Disable MMX when Clang is being used.	Søren Sandmann Pedersen	1	-0/+3
	There are several issues with the Clang compiler and pixman-mmx.c:
	- When not optimizing, it doesn't seem to recognize that an argument to an __always_inline__ function is compile-time constant. This results in this error being produced:
	    fatal error: error in backend: Invalid operand for inline asm constraint 'K'!
	- This inline assembly:
	    asm ("pmulhuw %1, %0\n\t" : "+y" (__A) : "y" (__B) );
	  results in
	    fatal error: error in backend: Unsupported asm: input constraint with a matching output constraint of incompatible type!
	So disable MMX when the compiler is Clang.
2012-02-24	mmx: make load8888 take a pointer to data instead of the data itself	Matt Turner	1	-129/+148
	Allows us to tune how we load data into the vector registers. Signed-off-by: Matt Turner <mattst88@gmail.com> And squashed in: mmx: define and use load8888u function For unaligned loads. Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-24	mmx: make store8888 take uint32_t *dest as argument	Matt Turner	1	-46/+47
	Allows us to tune how we store data from the vector registers. Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-22	Update .gitignore with more demos and tests	Matt Turner	1	-0/+23
	Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-22	mmx: Delete unused function in_over_full_src_alpha()	Søren Sandmann Pedersen	1	-13/+5
	Also a few minor formatting fixes. Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-02-22	mmx: Enable over_x888_8_8888() for x86 as well	Søren Sandmann Pedersen	1	-7/+0
	It used to be slower than the generic code (with the gcc that was current in 2007), but that doesn't seem to be the case anymore:
	  over_x888_8_8888 = L1: 22.97 L2: 22.88 M: 22.27 ( 5.29%) HT: 18.30 VT: 15.81 R: 15.54 RT: 10.35 ( 131Kops/s)
	  over_x888_8_8888 = L1: 53.56 L2: 53.20 M: 50.50 ( 11.99%) HT: 38.60 VT: 31.19 R: 29.00 RT: 17.37 ( 208Kops/s)
	Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-02-21	mmx: fix typo in pix_add_mul on MSVC	Matt Turner	1	-1/+1
	Typo introduced in commit a075a870. Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-21	mmx: Use _mm_shuffle_pi16	Matt Turner	1	-42/+19
	The pshufw x86 instruction is part of Extended 3DNow! and SSE1. The equivalent ARM wshufh instruction was available from the first iwMMXt instruction set. This instruction is already used in the SSE2 code. Reduces code size by ~9%.
	amd64
	  text data bss dec hex filename
	  29925 2240 0 32165 7da5 .libs/libpixman_mmx_la-pixman-mmx.o
	  27237 2240 0 29477 7325 .libs/libpixman_mmx_la-pixman-mmx.o
	x86
	  text data bss dec hex filename
	  27677 1792 0 29469 731d .libs/libpixman_mmx_la-pixman-mmx.o
	  24959 1792 0 26751 687f .libs/libpixman_mmx_la-pixman-mmx.o
	arm
	  text data bss dec hex filename
	  30176 1792 0 31968 7ce0 .libs/libpixman_iwmmxt_la-pixman-mmx.o
	  27384 1792 0 29176 71f8 .libs/libpixman_iwmmxt_la-pixman-mmx.o
	Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-21	mmx: Use _mm_mulhi_pu16	Matt Turner	1	-2/+18
	The pmulhuw x86 instruction is part of Extended 3DNow! and SSE1. The equivalent ARM wmuluh instruction was available from the first iwMMXt instruction set. This instruction is already used in the SSE2 code. Reduces code size by ~5%.
	amd64
	  text data bss dec hex filename
	  31325 2240 0 33565 831d .libs/libpixman_mmx_la-pixman-mmx.o
	  29925 2240 0 32165 7da5 .libs/libpixman_mmx_la-pixman-mmx.o
	x86
	  text data bss dec hex filename
	  29165 1792 0 30957 78ed .libs/libpixman_mmx_la-pixman-mmx.o
	  27677 1792 0 29469 731d .libs/libpixman_mmx_la-pixman-mmx.o
	arm
	  text data bss dec hex filename
	  31632 1792 0 33424 8290 .libs/libpixman_iwmmxt_la-pixman-mmx.o
	  30176 1792 0 31968 7ce0 .libs/libpixman_iwmmxt_la-pixman-mmx.o
	Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-20	mmx: enable over_x888_8_8888 on ARM/iwMMXt	Matt Turner	1	-3/+3
	before: over_x888_8_8888 = L1: 7.63 L2: 7.72 M: 6.44 ( 19.17%) HT: 6.24 VT: 6.11 R: 5.87 RT: 4.61 ( 51Kops/s)
	after : over_x888_8_8888 = L1: 11.88 L2: 11.11 M: 8.70 ( 26.01%) HT: 8.15 VT: 8.07 R: 7.76 RT: 5.62 ( 61Kops/s)
	Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-20	autoconf: use #error instead of error	Matt Turner	1	-4/+4
	We'd rather see the actual #error message than a syntax error in config.log. Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-18	Convert while (w) to if (w) when possible	Matt Turner	2	-8/+8
	Missed in commit 57fd8c37. Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-17	Make sure to run AC_SUBST IWMMXT_CFLAGS	Matt Turner	1	-0/+1
	Allows you to compile without -flax-vector-conversions in your CFLAGS, though -march=iwmmxt2 is still necessary since specifying some other -march= value will override it, and disable iwmmxt. Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-02-16	configure.ac: Add an --enable-libpng option	Jeremy Huddleston	1	-1/+8
	Now there is a way to not link against libpng even if it's available. Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-02-11	Use AC_LANG_SOURCE for iwMMXt configure program	Matt Turner	1	-3/+3
	Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-01-31	Revert "Reject trapezoids where top (botttom) is above (below) the edges"	Søren Sandmann Pedersen	2	-9/+5
	Cairo 1.10 will sometimes generate trapezoids like this, so we can't consider them invalid. Fixes bug 45009, reported by Michael Biebl. This reverts commit 2437ae80e5066dec9fe52f56b016bf136d7cea06.
2012-01-31	iOS Runtime Detection Support For ARM NEON	Bobby Salazar	1	-0/+45
	This patch adds runtime detection support for the ARM NEON fast paths for code compiled with the iOS SDK.
2012-01-10	test: Port composite test over to use new pixel_checker_t object.	Søren Sandmann Pedersen	1	-195/+95
	Also make some tweaks to the way the errors are printed.
2012-01-10	test: Add a new "pixel_checker_t" object.	Søren Sandmann Pedersen	2	-0/+166
	Add a new pixel_checker_t object to test/utils.[ch]. This object should be initialized with a format and can then be used to check whether a given "real" pixel in that format is close enough to a "perfect" pixel given as a double precision ARGB struct. The acceptable deviation is calculated as follows. Each channel of the perfect pixel has 0.004 subtracted from it and is then converted to the format. The resulting value is the minimum value that will be accepted. Similarly, to compute the maximum value, the channel has 0.004 added to it and is then converted to the given format. Checking a pixel is then a matter of splitting it into channels and checking that each is within the computed bounds. The value of 0.004 was chosen because it is the minimum one that will make the existing composite test pass (see next commit). A problem with this value is that it causes 0xFE to be acceptable when the correct value is 1.0, and 0x01 to be acceptable when the correct value is 0. It would be better if, when the result is exactly 0 or exactly 1, an a8r8g8b8 pixel were required to produce exactly 0x00 or 0xff to preserve full black and full white. A deviation value of 0.003 would produce this, but currently this would cause tests with operators that involve divisions to fail.
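The bounds computation described above can be sketched for a single channel as follows. This is only an illustration, not the code in test/utils.c: the names `channel_to_int` and `channel_ok` are hypothetical, the clamping to [0, 1] is an assumption, and the float-to-integer conversion is assumed to multiply by (1 << width) and map the 1.0 case back to the maximum value.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEVIATION 0.004 /* tolerance quoted in the commit message */

/* Hypothetical conversion of a channel in [0, 1] to a `width`-bit
 * integer (width < 32): multiply by (1 << width), truncate, and
 * subtract one when the result is exactly (1 << width), i.e. v == 1.0. */
static uint32_t
channel_to_int (double v, unsigned width)
{
    uint32_t r;

    /* Assumption: out-of-range values are clamped before conversion. */
    if (v < 0.0)
        v = 0.0;
    if (v > 1.0)
        v = 1.0;

    r = (uint32_t) (v * (double) (1u << width));
    r -= r >> width;

    return r;
}

/* Accept `real` when it lies between the converted (perfect - DEVIATION)
 * and the converted (perfect + DEVIATION), inclusive. */
static bool
channel_ok (double perfect, uint32_t real, unsigned width)
{
    uint32_t lo = channel_to_int (perfect - DEVIATION, width);
    uint32_t hi = channel_to_int (perfect + DEVIATION, width);

    return real >= lo && real <= hi;
}
```

Under these assumptions an 8-bit channel with a perfect value of 1.0 accepts 0xFE and 0xFF, and a perfect value of 0 accepts 0x00 and 0x01, which matches the drawback the message points out.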
2012-01-10	Rename color_correct() to round_color()	Søren Sandmann Pedersen	3	-20/+25
	And do the rounding from float to int in the same way cairo does: by multiplying with (1 << width), then subtracting one when the input was 1.0.
2012-01-10	Move the color_correct() function from composite.c to utils.c	Søren Sandmann Pedersen	3	-36/+39

2012-01-10	Get rid of delegates for combiners	Søren Sandmann Pedersen	3	-134/+34
	Add a new function _pixman_implementation_lookup_combiner() that will find a usable combiner given an operator and information about whether the combiner should apply component alpha and whether it should be 64 bit. In pixman-general.c use this function to look up a combiner up front instead of walking the delegate chain for every scanline.
2012-01-10	test/alphamap.c: Make dst and orig_dst more independent of each other	Søren Sandmann Pedersen	1	-27/+32
	When making the copy of the destination, do so separately for the image and the alpha map. This ensures that the alpha channel of the alpha map will be different from the alpha channel of the actual image. Previously, orig_dst would be copied onto dst along with its alpha map, which meant that the alpha map of orig_dst would become the new alpha channel of both dst and dst's alpha map. This meant that the test didn't actually test that the alpha map's alpha channel was actually fetched.
2012-01-10	Fix bugs with alpha maps	Søren Sandmann Pedersen	1	-11/+42
	The alpha channel from the alpha map must be inserted as the new alpha channel when a scanline is fetched from an image. Previously the alpha map would overwrite the buffer instead. This wasn't caught by the alpha map test because it would only verify that the resulting alpha channel was correct, and not pay attention to incorrect color channels.
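To make the difference between inserting and overwriting concrete, here is a minimal sketch that assumes both the fetched scanline and the alpha map scanline are already in a8r8g8b8 form. The function name `insert_alpha` is hypothetical and the real fix in pixman is structured differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: take only the alpha byte from the alpha map
 * and keep the color channels that were fetched from the image itself.
 * Overwriting pixels[i] with alpha_map[i] would lose those channels. */
static void
insert_alpha (uint32_t *pixels, const uint32_t *alpha_map, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        pixels[i] = (alpha_map[i] & 0xff000000u) |
                    (pixels[i] & 0x00ffffffu);
    }
}
```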
2012-01-10	test: In the alphamap test, also test that we get the right red value	Søren Sandmann Pedersen	1	-6/+79
	There is a bug where the red channel of the alpha map of the destination image is used instead of the red channel of the destination image itself.
2012-01-09	Make mmx code compatible with Solaris Studio 12.3 compilers	Alan Coopersmith	1	-19/+38
	Rearranged some of the existing gcc & Intel compiler checks to allow easier sharing of common cases among the compilers. Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2012-01-09	Fix rounding for DIV_UNc()	Søren Sandmann Pedersen	2	-2/+2
	We need to compute floor (a/b * 255 + 0.5), not floor (a / b * 255), so add b/2 to the numerator in the DIV_UNc() macro.
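A small sketch of the rounding fix for 8-bit components, written as functions rather than as pixman's macro. The names `div_un8_truncated` and `div_un8_rounded` are hypothetical; callers are assumed to guarantee b != 0 and a <= b.

```c
#include <stdint.h>

/* floor (a / b * 255): the behaviour before the fix. */
static uint8_t
div_un8_truncated (uint8_t a, uint8_t b)
{
    return (uint8_t) (((uint32_t) a * 255u) / b);
}

/* floor (a / b * 255 + 0.5): adding b / 2 to the numerator rounds to
 * nearest instead of truncating. */
static uint8_t
div_un8_rounded (uint8_t a, uint8_t b)
{
    return (uint8_t) (((uint32_t) a * 255u + b / 2u) / b);
}
```

For a = 1 and b = 4 the exact quotient is 63.75, so the truncating version yields 63 while the rounded version yields 64.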
2012-01-09	Reject trapezoids where top (botttom) is above (below) the edges	Søren Sandmann Pedersen	2	-5/+9
	When a trapezoid has a top/bottom that is above/below the left/right edges, degenerate trapezoids become possible. For example the edge could be very short and close to horizontal. If the bottom edge is far below the bottom point of such a short edge, the result is that the lower right corner of the trapezoid will be extremely far to the left. This kind of trapezoid causes overflows in the rasterization code, so change pixman_trapezoid_valid() to reject them.
2012-01-09	In MUL_UNc() cast to comp2_t	Søren Sandmann Pedersen	1	-1/+1
	Otherwise, when comp1_t is 16 bits wide, we can end up with a signed integer overflow.
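The overflow arises because two 16-bit operands are promoted to (signed) int before being multiplied, and 0xffff * 0xffff does not fit in a 32-bit int. A minimal sketch of a 16-bit normalized multiply with the cast in place is shown below; the function name `mul_un16` is hypothetical, and uint32_t stands in for comp2_t.

```c
#include <stdint.h>

/* Hypothetical 16-bit version of a normalized multiply:
 * computes round (a * b / 65535). */
static uint16_t
mul_un16 (uint16_t a, uint16_t b)
{
    /* Casting one operand to a 32-bit unsigned type makes the product
     * unsigned; without the cast both operands would be promoted to int
     * and large inputs would overflow it. */
    uint32_t t = (uint32_t) a * b + 0x8000u;

    return (uint16_t) ((t + (t >> 16)) >> 16);
}
```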
2012-01-09	Fix a bunch of signed overflow issues	Søren Sandmann Pedersen	6	-22/+39
	In pixman-fast-path.c: (1 << 31) - 1 causes a signed overflow, so change to (1U << n) - 1.
	In pixman-image.c: The check for whether m10 == -m01 will overflow when -m01 == INT_MIN. Instead just check whether the variables are 1 and -1.
	In pixman-utils.c: When the depth of the topmost channel is 0, we can end up shifting by 32.
	In blitters-test.c: Replicating the mask would end up shifting more than 32.
	In region-contains-test.c: Computing the average of two large integers could overflow. Instead add half the difference between them to the first integer.
	In stress-test.c: Masking the value in fake_reader() would sometimes shift by 32. Instead just use the most significant bits instead of the least significant.
	All these issues were found by the IOC tool: http://embed.cs.utah.edu/ioc/
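Two of the patterns mentioned in this message can be sketched in isolation. The function names `low_bits_mask` and `midpoint` are hypothetical and the code is not copied from pixman; `midpoint` assumes 0 <= lo <= hi so that the difference itself cannot overflow.

```c
#include <stdint.h>

/* Mask with the n lowest bits set, for n in [0, 31]. Using 1U keeps the
 * shift and the subtraction in unsigned arithmetic, so n == 31 is well
 * defined, whereas (1 << 31) - 1 overflows a 32-bit signed int. */
static uint32_t
low_bits_mask (unsigned n)
{
    return (1U << n) - 1;
}

/* Average of two integers without computing lo + hi, which could
 * overflow: add half the difference to the first value.
 * Assumes 0 <= lo <= hi. */
static int32_t
midpoint (int32_t lo, int32_t hi)
{
    return lo + (hi - lo) / 2;
}
```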
2012-01-09	Add missing cast in _pixman_edge_multi_init()	Søren Sandmann Pedersen	1	-1/+1
	nx and e->dy are both 32 bit quantities, so a cast is needed to make sure their product is 64 bit before subtracting it from a 64 bit quantity.
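The issue is that multiplying two 32-bit values is done in 32 bits, even when the result is then combined with a 64-bit value. A minimal sketch of the corrected pattern follows; the function and parameter names are illustrative only and do not mirror _pixman_edge_multi_init().

```c
#include <stdint.h>

/* Hypothetical illustration: subtract the product of two 32-bit values
 * from a 64-bit accumulator. */
static int64_t
subtract_product (int64_t ne, int32_t nx, int32_t dy)
{
    /* Casting one operand first forces the multiplication itself to be
     * carried out in 64 bits; (int64_t) (nx * dy) would be too late. */
    return ne - (int64_t) nx * dy;
}
```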