~sandmann/pixman

Age	Commit message	Author	Files	Lines
2013-03-11	test: Add radial-perf-test, a microbenchmark for radial gradients	Søren Sandmann Pedersen	2	-1/+60
	This benchmark renders one of the radial gradients used in the swfdec-youtube cairo trace 500 times and reports the average time it took. V2: Update .gitignore
2013-02-13	Fix to lowlevel-blt-bench	Ben Avison	1	-7/+7
	The source, mask and destination buffers are initialised to 0xCC just after they are allocated. Between each benchmark, there is a pair of memcpys, from the destination buffer to the source buffer and back again (there are no explanatory comments, but presumably this is an effort to flush the caches). However, it has an unintended consequence, which is to change the contents of the buffers on entry to subsequent benchmarks. This means it is not a fair test: for example, with over_n_8888 (featured in the following patches) it reports L2 and even M tests as being faster than the L1 test, because after the L1 test, the source buffer is filled with fully opaque pixels, for which over_n_8888 has a shortcut. The fix here is simply to reverse the order of the memcpys, so source and destination are both filled with 0xCC on entry to all tests.
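The effect of the memcpy order can be illustrated with a small sketch. This is not the lowlevel-blt-bench code: the buffer size and the function names are hypothetical, and it only assumes, as the message describes, that a benchmark leaves its output in the destination buffer while the source buffer still holds the 0xCC fill.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64 /* hypothetical size, far smaller than the real buffers */

/* Old order: dst -> src, then src -> dst. Whatever the previous
 * benchmark wrote into dst ends up in both buffers. */
static void flush_old_order(unsigned char *src, unsigned char *dst)
{
    assert(src != NULL && dst != NULL);
    memcpy(src, dst, BUF_SIZE);
    memcpy(dst, src, BUF_SIZE);
}

/* New order: src -> dst, then dst -> src. The untouched 0xCC fill in
 * src overwrites the benchmark output in dst, so both are reset. */
static void flush_new_order(unsigned char *src, unsigned char *dst)
{
    assert(src != NULL && dst != NULL);
    memcpy(dst, src, BUF_SIZE);
    memcpy(src, dst, BUF_SIZE);
}

/* Returns 1 if every byte of buf equals value, 0 otherwise. */
static int all_bytes_equal(const unsigned char *buf, unsigned char value)
{
    size_t i;

    for (i = 0; i < BUF_SIZE; i++)
    {
        if (buf[i] != value)
            return 0;
    }
    return 1;
}
```

With src filled with 0xCC and dst filled with opaque 0xFF bytes, flush_old_order() leaves 0xFF in both buffers, while flush_new_order() restores 0xCC in both.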
2013-02-13	utils.c: Increase acceptable deviation to 0.0064 in pixel_checker_t	Søren Sandmann Pedersen	1	-1/+1
	The check-formats program reveals that the 8 bit pipeline cannot meet the current 0.004 acceptable deviation specified in utils.c, so we have to increase it. Some of the failing pixels were captured in pixel-test, which with this commit now passes.
	== a4r4g4b4 DISJOINT_XOR a8r8g8b8 ==
	The DISJOINT_XOR operator applied to an a4r4g4b4 source pixel of 0xd0c0 and a destination pixel of 0x5300ea00 results in the exact value:
	    fa = (1 - da) / sa = (1 - 0x53 / 255.0) / (0xd / 15.0) = 0.7782
	    fb = (1 - sa) / da = (1 - 0xd / 15.0) / (0x53 / 255.0) = 0.4096
	    r = fa * (0xc / 15.0) + fb * (0xea / 255.0) = 0.99853
	But when computing in 8 bits, we get:
	    fa8 = ((255 - 0x53) * 255 + 0xdd / 2) / 0xdd = 0xc6
	    fb8 = ((255 - 0xdd) * 255 + 0x53 / 2) / 0x53 = 0x68
	    r8 = (fa8 * 0xcc + 127) / 255 + (fb8 * 0xea + 127) / 255 = 0xfd
	and
	    0xfd / 255.0 = 0.9921568627450981
	for a deviation of 0.00637118610187, which we then have to consider acceptable given the current implementation. By switching to computing the result with
	    r = (fa * s + fb * d + 127) / 255
	rather than
	    r = (fa * s + 127) / 255 + (fb * d + 127) / 255
	the deviation would be only 0.00244961747442, so at some point it may be worth doing either this, or switching to floating point for operators that involve divisions. Note that the conversion from 4 bits to 8 bits does not cause any error in this case because both rounding and bit replication produce an exact result when the number of from-bits divides the number of to-bits.
	== a8r8g8b8 OVER r5g6b5 ==
	When OVER compositing the a8r8g8b8 pixel 0x0f00c300 with the x14r6g6b6 pixel 0x03c0, the true floating point value of the resulting green channel is:
	    0xc3 / 255.0 + (1.0 - 0x0f / 255.0) * (0x0f / 63.0) = 0.9887955
	but when compositing 8 bit values, where the 6-bit green channel is converted to 8 bit through bit replication, the 8-bit result is:
	    0xc3 + ((255 - 0x0f) * 0x3c + 127) / 255 = 251
	which corresponds to a real value of 0.984314. The difference from the true value is 0.004482, which is bigger than the acceptable deviation of 0.004. So, if we were to compute all the CONJOINT/DISJOINT operators in floating point, or otherwise make them more accurate, the acceptable deviation could be set at 0.0045. If we were doing the 6-bit conversion with rounding, (x / 63.0 * 255.0 + 0.5), instead of bit replication, the deviation in this particular case would be only 0.0005, so we may want to consider this at some point.
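The 8-bit arithmetic in this message can be replayed with a short sketch. The helper names are hypothetical and are not pixman functions; the formulas are the ones quoted in the message, and any clamping of fa8 and fb8 that a real combiner might apply is left out because neither factor exceeds 255 in this example.

```c
#include <assert.h>
#include <stdint.h>

/* Rounded 8-bit division, (a * 255 + b / 2) / b, as in the fa8/fb8 lines. */
static uint32_t div_un8(uint32_t a, uint32_t b)
{
    assert(b != 0);
    return (a * 255 + b / 2) / b;
}

/* Rounded 8-bit multiplication, (a * b + 127) / 255, as in the r8 line. */
static uint32_t mul_un8(uint32_t a, uint32_t b)
{
    return (a * b + 127) / 255;
}

/* r8 as computed in the message: each product is rounded separately. */
static uint32_t disjoint_xor_two_roundings(uint32_t s, uint32_t sa,
                                           uint32_t d, uint32_t da)
{
    uint32_t fa8 = div_un8(255 - da, sa);
    uint32_t fb8 = div_un8(255 - sa, da);

    return mul_un8(fa8, s) + mul_un8(fb8, d);
}

/* The suggested alternative: add the products first, then round once. */
static uint32_t disjoint_xor_one_rounding(uint32_t s, uint32_t sa,
                                          uint32_t d, uint32_t da)
{
    uint32_t fa8 = div_un8(255 - da, sa);
    uint32_t fb8 = div_un8(255 - sa, da);

    return (fa8 * s + fb8 * d + 127) / 255;
}

/* Expand a 6-bit channel to 8 bits by bit replication. */
static uint32_t expand_6_to_8_replicate(uint32_t v)
{
    assert(v < 64);
    return (v << 2) | (v >> 4);
}

/* Expand a 6-bit channel to 8 bits by rounding: x / 63.0 * 255.0 + 0.5. */
static uint32_t expand_6_to_8_round(uint32_t v)
{
    assert(v < 64);
    return (uint32_t)(v / 63.0 * 255.0 + 0.5);
}
```

For s = 0xcc, sa = 0xdd, d = 0xea and da = 0x53, the two-rounding form gives 0xfd and the single-rounding form gives 0xfe. For the 6-bit green value 0x0f, bit replication gives 0x3c while rounding gives 0x3d.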
2013-02-13	test: Add new pixel-test regression test	Søren Sandmann Pedersen	2	-0/+268
	This test program contains a table of individual operator/pixel combinations. For each pixel combination, images of various sizes are filled with the pixels and then composited. The result is then verified against the output of do_composite(). If the result doesn't match, detailed error information is printed. The initial 14 pixel combinations currently all fail.
2013-02-13	a1-trap-test: Add tests for operator_name() and format_name()	Søren Sandmann Pedersen	1	-0/+8
	The check-formats.c test depends on the exact format of the strings returned from these functions, so add a test here. a1-trap-test isn't the ideal place, but it seems like overkill to add a new test just for these trivial checks.
2013-02-13	test: Add new check-formats utility	Søren Sandmann Pedersen	3	-2/+355
	Given an operator and two formats, this program will composite and check all pixels where the red and blue channels are 0. That is, if the two formats are a8r8g8b8 and a4r4g4b4, all source pixels matching the mask 0xff00ff00 are composited with the given operator against all destination pixels matching the mask 0xf0f0 and the result is then verified against the do_composite() function that was moved to utils.c earlier. This program reveals that a number of operators and format combinations are not computed to within the precision currently accepted by pixel_checker_t. For example:
	    check-formats over a8r8g8b8 r5g6b5 | grep failed | wc -l
	    30
	reveals that there are 30 pixel combinations where OVER produces insufficiently precise results for the a8r8g8b8 and r5g6b5 formats.
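One way to visit every pixel value that matches a mask is the subset-stepping trick sketched below. Whether check-formats enumerates pixels this way is not stated in the message; the function and type names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type, invoked once per enumerated pixel. */
typedef void (*pixel_visitor_t)(uint32_t pixel, void *data);

/* Visits every 32-bit value whose set bits all lie inside mask and
 * returns how many values were visited. visit may be NULL to count only. */
static uint64_t for_each_pixel_in_mask(uint32_t mask, pixel_visitor_t visit,
                                       void *data)
{
    uint32_t pixel = 0;
    uint64_t count = 0;

    do
    {
        /* Every enumerated value must have no bits outside the mask. */
        assert((pixel & ~mask) == 0);
        if (visit != NULL)
            visit(pixel, data);
        count++;
        /* Step to the next subset of mask; wraps to 0 after mask itself. */
        pixel = (pixel - mask) & mask;
    }
    while (pixel != 0);

    return count;
}
```

The a4r4g4b4 mask 0xf0f0 has 8 set bits, so it yields 256 pixels; the a8r8g8b8 mask 0xff00ff00 has 16 set bits and yields 65536, which gives 256 * 65536 source/destination pairs for that format combination.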
2013-02-13	utils.[ch]: Add pixel_checker_get_masks()	Søren Sandmann Pedersen	2	-0/+24
	This function returns the a, r, g, and b masks corresponding to the pixel checker's format.
2013-02-13	test/utils.[ch]: Add pixel_checker_convert_pixel_to_color()	Søren Sandmann Pedersen	2	-0/+40
	This function takes a pixel in the format corresponding to the pixel checker, and converts to a color_t.
2013-02-13	test: Move do_composite() function from composite.c to utils.c	Søren Sandmann Pedersen	3	-277/+285
	So that it can be used in other tests.
2013-01-29	stresstest: Ensure that the rasterizer is only given alpha formats	Søren Sandmann Pedersen	1	-55/+68
	In c2cb303d33ec11390b93cabd90f0f9, return_if_fail()s were added to prevent the trapezoid rasterizers from being called with non-alpha formats. However, stress-test actually does call the rasterizers with non-alpha formats, but because _pixman_log_error() is disabled in versions with an odd micro number, the errors never materialized. Fix this by changing the argument to random format to an enum of three values DONT_CARE, PREFER_ALPHA, or REQUIRE_ALPHA, and then in the switch that calls the trapezoid rasterizers, pass the appropriate value for the function in question.
2013-01-29	Improve L1 and L2 benchmark tests for caches that don't use allocate-on-write	Ben Avison	1	-6/+25
	In particular this affects single-core ARMs (e.g. ARM11, Cortex-A8), which are usually configured this way. For other CPUs, this should only add a constant time, which will be cancelled out by the EXCLUDE_OVERHEAD runs. The problems were caused by cachelines becoming permanently evicted from the cache, because the code that was intended to pull them back in again on each iteration assumed too long a cache line (for the L1 test) or failed to read memory beyond the first pixel row (for the L2 test). Also, the reloading of the source buffer was unnecessary. These issues were identified by Siarhei in this post: http://lists.freedesktop.org/archives/pixman/2013-January/002543.html
2013-01-27	Use pixman_transform_point_31_16() from pixman_transform_point()	Siarhei Siamashka	1	-3/+3
	Old functions pixman_transform_point() and pixman_transform_point_3d() now become just wrappers for pixman_transform_point_31_16() and pixman_transform_point_31_16_3d(). Eventually their uses should be completely eliminated in the pixman code and replaced with their extended range counterparts. This is needed in order to be able to correctly handle any matrices and parameters that may come to pixman from the code responsible for XRender implementation.
2013-01-27	test: Added matrix-test for testing projective transform accuracy	Siarhei Siamashka	2	-0/+187
	This test uses the __float128 data type, when it is available, to implement a "perfect" reference implementation. The output from pixman_transform_point_31_16() and pixman_transform_point_31_16_affine() is compared with the reference implementation to make sure that rounding errors show up only in a single least significant bit. Platforms and compilers that do not support the __float128 data type can rely on a crc32 checksum of the pseudorandom transform results.
2013-01-25	Tweaks to lowlevel-blt-bench	Ben Avison	1	-1/+3
	This adds two extra tests, src_n_8 and src_8_8, which I have been using to benchmark my ARMv6 changes. I'd also like to propose that it requires an exact test name as the executable's argument, as achieved by this strstr to strcmp change. Without this, it is impossible to only benchmark (for example) add_8_8, add_n_8 or src_n_8, due to those also being substrings of many other test names.
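The difference between the two matching rules is easy to see in isolation. The helpers below are a hypothetical sketch, not the lowlevel-blt-bench source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old behaviour: run a test when the argument occurs anywhere in its name. */
static int matches_substring(const char *test_name, const char *pattern)
{
    assert(test_name != NULL && pattern != NULL);
    return strstr(test_name, pattern) != NULL;
}

/* New behaviour: run a test only when the argument equals its name. */
static int matches_exact(const char *test_name, const char *pattern)
{
    assert(test_name != NULL && pattern != NULL);
    return strcmp(test_name, pattern) == 0;
}
```

With the argument "add_8_8", a longer name such as "add_8_8_8" (used here only for illustration) matches under strstr() but not under strcmp(), so only the exact test is benchmarked.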
2013-01-23	test: Use operator_name() and format_name() in composite.c	Søren Sandmann Pedersen	1	-120/+101
	With the operator_name() and format_name() functions there is no longer any reason for composite.c to have its own table of format and operator names.
2013-01-23	utils.[ch]: Add new format_name() function	Søren Sandmann Pedersen	6	-23/+103
	This function returns the name of the given format code, which is useful for printing out debug information. The function is written as a switch without a default value so that the compiler will warn if new formats are added in the future. The fake formats used in the fast path tables are also recognized. The function is used in alpha_map.c, where it replaces an existing format_name() function, and in blitters-test.c, affine-test.c, and scaling-test.c.
2013-01-23	test/utils.[ch]: Add new function operator_name()	Søren Sandmann Pedersen	5	-6/+78
	This function returns the name of the given operator, which is useful for printing out debug information. The function is done as a switch without a default value so that the compiler will warn if new operators are added in the future. The function is used in affine-test.c, scaling-test.c, and blitters-test.c.
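The "switch without a default" pattern used by operator_name() and format_name() can be sketched as follows. The enum and its three values are hypothetical stand-ins, not the pixman operator list.

```c
#include <assert.h>

/* Hypothetical three-value operator enum for illustration only. */
typedef enum
{
    EXAMPLE_OP_CLEAR,
    EXAMPLE_OP_SRC,
    EXAMPLE_OP_OVER
} example_op_t;

static const char *example_operator_name(example_op_t op)
{
    /* No default label: if a new enumerator is added and not handled
     * here, GCC's -Wswitch (part of -Wall) reports the missing case. */
    switch (op)
    {
    case EXAMPLE_OP_CLEAR:
        return "CLEAR";
    case EXAMPLE_OP_SRC:
        return "SRC";
    case EXAMPLE_OP_OVER:
        return "OVER";
    }

    /* Only reached for values outside the enum. */
    assert(!"value outside example_op_t");
    return "<unknown operator>";
}
```

A default label would silence the warning, which is why the fallback sits after the switch rather than inside it.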
2013-01-22	Convert INCLUDES to AM_CPPFLAGS	Matt Turner	1	-1/+1
	INCLUDES has been deprecated starting with automake 1.13. Convert all occurrences with the recommended AM_CPPFLAGS replacement.
2012-12-18	test: add "src_0565_8888" to lowlevel-blt-bench	Siarhei Siamashka	1	-0/+1

2012-12-13	Add testing of trapezoids to stress-test	Søren Sandmann Pedersen	1	-25/+135
	The entry points add_trapezoids(), rasterize_trapezoid() and composite_trapezoid() are exercised with random trapezoids. This uncovers crashes with stress-test seeds 0x17ee and 0x313c.
2012-12-11	test/utils.[ch]: Add utility function to draw a checkerboard	Søren Sandmann Pedersen	2	-0/+59
	This is useful in demo programs to display the alpha channel.
2012-12-10	test: Workaround unaligned MOVDQA bug (http://gcc.gnu.org/PR55614)	Siarhei Siamashka	1	-0/+12
	Just use SSE2 intrinsics to do unaligned memory accesses as a workaround for this gcc bug related to vector extensions.
2012-12-06	test: Get rid of the obsolete 'prng_rand_N' and 'prng_rand_u32'	Siarhei Siamashka	10	-56/+44
	They are the same as 'prng_rand_n' and 'prng_rand'
2012-12-06	test: Switch to the new PRNG instead of old LCG	Siarhei Siamashka	13	-68/+63
	Wallclock time for running pixman "make check" (compile time not included):
	    ----------------------------+----------------+-----------------------------+
	                                | old PRNG (LCG) |   new PRNG (Bob Jenkins)    |
	           Processor type       +----------------+------------+----------------+
	                                |    gcc 4.5     |  gcc 4.5   | gcc 4.7 (simd) |
	    ----------------------------+----------------+------------+----------------+
	    quad Intel Core i7  @2.8GHz |    0m49.494s   |  0m43.722s |    0m37.560s   |
	    dual ARM Cortex-A15 @1.7GHz |     5m8.465s   |  4m37.375s |    3m45.819s   |
	         IBM Cell PPU   @3.2GHz |    23m0.821s   | 20m38.316s |   16m37.513s   |
	    ----------------------------+----------------+------------+----------------+
	But some tests got a particularly large boost. For example benchmarking and profiling blitters-test on Core i7:
	=== before ===
	    $ time ./blitters-test
	    real 0m10.907s
	    user 0m55.650s
	    sys 0m0.000s
	    70.45% blitters-test blitters-test [.] create_random_image
	    15.81% blitters-test blitters-test [.] compute_crc32_for_image_internal
	     2.26% blitters-test blitters-test [.] _pixman_implementation_lookup_composite
	     1.07% blitters-test libc-2.15.so [.] _int_free
	     0.89% blitters-test libc-2.15.so [.] malloc_consolidate
	     0.87% blitters-test libc-2.15.so [.] _int_malloc
	     0.75% blitters-test blitters-test [.] combine_conjoint_general_u
	     0.61% blitters-test blitters-test [.] combine_disjoint_general_u
	     0.40% blitters-test blitters-test [.] test_composite
	     0.31% blitters-test libc-2.15.so [.] _int_memalign
	     0.31% blitters-test blitters-test [.] _pixman_bits_image_setup_accessors
	     0.28% blitters-test libc-2.15.so [.] malloc
	=== after ===
	    $ time ./blitters-test
	    real 0m3.655s
	    user 0m20.550s
	    sys 0m0.000s
	    41.77% blitters-test.n blitters-test.new [.] compute_crc32_for_image_internal
	    15.77% blitters-test.n blitters-test.new [.] prng_randmemset_r
	     6.15% blitters-test.n blitters-test.new [.] _pixman_implementation_lookup_composite
	     3.09% blitters-test.n libc-2.15.so [.] _int_free
	     2.68% blitters-test.n libc-2.15.so [.] malloc_consolidate
	     2.39% blitters-test.n libc-2.15.so [.] _int_malloc
	     2.27% blitters-test.n blitters-test.new [.] create_random_image
	     2.22% blitters-test.n blitters-test.new [.] combine_conjoint_general_u
	     1.52% blitters-test.n blitters-test.new [.] combine_disjoint_general_u
	     1.40% blitters-test.n blitters-test.new [.] test_composite
	     1.02% blitters-test.n blitters-test.new [.] prng_srand_r
	     1.00% blitters-test.n blitters-test.new [.] _pixman_image_validate
	     0.96% blitters-test.n blitters-test.new [.] _pixman_bits_image_setup_accessors
	     0.90% blitters-test.n libc-2.15.so [.] malloc
2012-12-06	test: Search/replace 'lcg_' -> 'prng_'	Siarhei Siamashka	14	-317/+317
	The 'lcg' prefix is going to be misleading if we replace the PRNG algorithm.
2012-12-06	test: Added a better PRNG (pseudorandom number generator)	Siarhei Siamashka	4	-0/+581
	This adds a fast SIMD-optimized variant of a small noncryptographic PRNG originally developed by Bob Jenkins: http://www.burtleburtle.net/bob/rand/smallprng.html
	The generated pseudorandom data is good enough to pass "Big Crush" tests from TestU01 (http://en.wikipedia.org/wiki/TestU01). SIMD code uses http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html which is a GCC specific extension. There is also a slower alternative code path, which should work with any C compiler.
	The performance of filling buffer with random data:
	    Intel Core i7 @2.8GHz (SSE2) : ~5.9 GB/s
	    ARM Cortex-A15 @1.7GHz (NEON) : ~2.2 GB/s
	    IBM Cell PPU @3.2GHz (Altivec) : ~1.7 GB/s
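For orientation, a scalar generator in the style of Bob Jenkins' small PRNG might look like the sketch below. It is not the pixman code and has no SIMD path; the type and function names are hypothetical, and the rotation constants (27, 17), the 0xf1ea5eed initial value and the 20 warm-up rounds are assumptions recalled from the page linked in the message, so check them there before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state type: four 32-bit words. */
typedef struct
{
    uint32_t a, b, c, d;
} small_prng_t;

/* Rotate a 32-bit word left by k bits (0 < k < 32). */
static uint32_t rotl32(uint32_t x, unsigned k)
{
    assert(k > 0 && k < 32);
    return (x << k) | (x >> (32 - k));
}

/* Advance the state by one step and return the next 32-bit value. */
static uint32_t small_prng_next(small_prng_t *x)
{
    uint32_t e = x->a - rotl32(x->b, 27);

    x->a = x->b ^ rotl32(x->c, 17);
    x->b = x->c + x->d;
    x->c = x->d + e;
    x->d = e + x->a;
    return x->d;
}

/* Seed the state and discard the first outputs to mix the seed in. */
static void small_prng_seed(small_prng_t *x, uint32_t seed)
{
    int i;

    x->a = 0xf1ea5eed;
    x->b = x->c = x->d = seed;
    for (i = 0; i < 20; i++)
        (void)small_prng_next(x);
}
```

Because the state is a plain struct, two generators seeded with the same value produce identical sequences, which is what lets the test suite compare results against stored checksums.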
2012-12-06	test: Change is_little_endian() into inline function	Siarhei Siamashka	2	-10/+6
	Also dropped redundant volatile keyword because any object can be accessed via char* pointer without breaking aliasing rules. The compilers are able to optimize this function to either constant 0 or 1.
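A minimal sketch of such an endianness probe, assuming nothing about the actual pixman implementation beyond what the message says (an inline function reading an object through a character pointer); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 on a little-endian host and 0 on a big-endian one. */
static inline int example_is_little_endian(void)
{
    const uint16_t probe = 0x0102;
    /* Reading any object through an unsigned char pointer is allowed by
     * the aliasing rules, so no volatile qualifier is needed. */
    const unsigned char first = *(const unsigned char *)&probe;

    /* The first byte is 0x02 on little-endian and 0x01 on big-endian. */
    assert(first == 0x01 || first == 0x02);
    return first == 0x02;
}
```

Since the probe is a compile-time constant, an optimizing compiler can usually reduce the whole function to a constant.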
2012-11-20	Round fixed-point multiplication	Søren Sandmann Pedersen	3	-7/+7
	After two fixed-point numbers are multiplied, the result is shifted into place, but up until now pixman has simply discarded the low-order bits instead of rounding to the closest number. Fix that by adding 0x8000 (or 0x2 in one place) before shifting and update the test checksums to match.
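The change from truncation to rounding can be sketched for 16.16 fixed-point values as follows. The type and function names are hypothetical; only the added 0x8000 before the shift is taken from the message. The sketch also relies on right-shifting a negative 64-bit value being an arithmetic shift, which is implementation-defined in C but common.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16.16 fixed-point type. */
typedef int32_t fixed_16_16_t;

/* Old behaviour: the low 16 bits of the product are discarded. */
static fixed_16_16_t fixed_mul_truncate(fixed_16_16_t a, fixed_16_16_t b)
{
    int64_t shifted = ((int64_t)a * b) >> 16;

    assert(shifted >= INT32_MIN && shifted <= INT32_MAX);
    return (fixed_16_16_t)shifted;
}

/* New behaviour: add half a unit (0x8000) before shifting so the
 * result is rounded to the closest representable number. */
static fixed_16_16_t fixed_mul_round(fixed_16_16_t a, fixed_16_16_t b)
{
    int64_t shifted = ((int64_t)a * b + 0x8000) >> 16;

    assert(shifted >= INT32_MIN && shifted <= INT32_MAX);
    return (fixed_16_16_t)shifted;
}
```

For a = 0x18000 (1.5) and b = 1 (the smallest positive value), truncation returns 1 while rounding returns 2; products that are already exact, such as 1.0 * 1.0, are unchanged.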
2012-11-14	test: Fix compiler warnings caused by unused code	Stefan Weil	1	-0/+4
	Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14	pixman: Use uintptr_t in type casts from pointer to integral value	Stefan Weil	2	-3/+3
	These modifications fix lots of compiler warnings for systems where sizeof(unsigned long) != sizeof(void *). This is especially true for MinGW-w64 (64 bit Windows). Signed-off-by: Stefan Weil <sw@weilnetz.de>
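The portability issue can be illustrated with an alignment check, a typical place where a pointer is converted to an integer. This is a hypothetical helper, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if ptr is aligned to the given power-of-two boundary.
 * uintptr_t is wide enough to hold a pointer on every platform that
 * provides it, whereas unsigned long is only 32 bits on 64-bit Windows. */
static int is_aligned(const void *ptr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

Casting through unsigned long instead would truncate the pointer where sizeof(unsigned long) != sizeof(void *), which is what triggers the compiler warnings mentioned in the message.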
2012-10-21	pixman_composite_trapezoids(): don't clip to extents for some operators	Søren Sandmann Pedersen	1	-1/+1
	pixman_composite_trapezoids() is supposed to composite across the entire destination, but it actually only composites across the extent of the trapezoids. For operators such as ADD or OVER this doesn't matter since a zero source has no effect on the destination. But for operators such as SRC or IN, it does matter. So for such operators where a zero source has an effect, don't clip to the trap extents.
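Whether a zero source affects the destination can be checked per operator with the usual premultiplied formulas. The sketch below is a floating-point illustration with hypothetical names and only the four operators named in the message; it is not the pixman combiner code.

```c
#include <assert.h>

/* Hypothetical subset of operators for illustration. */
typedef enum
{
    DEMO_OP_SRC,
    DEMO_OP_OVER,
    DEMO_OP_ADD,
    DEMO_OP_IN
} demo_op_t;

/* Combines one premultiplied channel: s/sa are the source channel and
 * source alpha, d/da the destination channel and destination alpha. */
static double demo_combine_channel(demo_op_t op, double s, double sa,
                                   double d, double da)
{
    switch (op)
    {
    case DEMO_OP_SRC:
        return s;
    case DEMO_OP_OVER:
        return s + (1.0 - sa) * d;
    case DEMO_OP_ADD:
        return (s + d > 1.0) ? 1.0 : s + d;
    case DEMO_OP_IN:
        return s * da;
    }

    assert(!"value outside demo_op_t");
    return d;
}

/* Returns 1 if compositing a fully transparent source leaves d unchanged. */
static int zero_source_preserves_dest(demo_op_t op, double d, double da)
{
    return demo_combine_channel(op, 0.0, 0.0, d, da) == d;
}
```

For OVER and ADD the destination is preserved, so clipping to the trapezoid extents is harmless; for SRC and IN the destination is cleared wherever the source is zero, so the whole destination has to be composited.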
2012-10-01	Add combiner test	Søren Sandmann Pedersen	7	-16/+158
	This test runs the new floating point combiners on random input with divide-by-zero exceptions turned on. With the floating point combiners the only thing we guarantee is that divide-by-zero exceptions are not generated, so change enable_fp_exceptions() to only enable those, and rename accordingly.
2012-10-01	blitters-test: Prepare for floating point	Søren Sandmann Pedersen	1	-1/+3
	Comment out some formats in blitters-test that are going to rely on floating point in some upcoming patches.
2012-10-01	glyph-test: Prepare for floating point	Søren Sandmann Pedersen	1	-2/+5
	In preparation for an upcoming change of the wide pipe to use floating point, comment out some formats in glyph-test that are going to be using floating point and update the CRC32 value to match.
2012-09-29	rotate-test: Call image_endian_swap() in make_image()	Søren Sandmann Pedersen	1	-1/+3
	Otherwise the test fails on big-endian. Tested-by: Matt Turner <mattst88@gmail.com>
2012-09-24	test: Add infinite-loop test	Søren Sandmann Pedersen	2	-0/+40
	This test demonstrates a bug where a certain transformation matrix can result in an infinite loop. It was extracted as a standalone version of "affine-test 212944861". If given the option -nf, the test program will not call fail_after() and therefore potentially run forever.
2012-09-24	affine-test: Print out the transformation matrix when verbose	Søren Sandmann Pedersen	1	-4/+12
	Printing out the translation and scale is a bit misleading because the actual transformation matrix can be modified in various other ways. Instead simply print the whole transformation matrix that is actually used.
2012-09-22	Add rotate-test.c test program	Søren Sandmann Pedersen	2	-0/+112
	This program exercises a bug in pixman-image.c where "-1" and "1" were used instead of the correct "- pixman_fixed_1" and "pixman_fixed_1". With the fast implementation enabled:
	    % ./rotate-test
	    rotate test failed! (checksum=35A01AAB, expected 03A24D51)
	Without it:
	    % env PIXMAN_DISABLE=fast ./rotate-test
	    pixman: Disabled fast implementation
	    rotate test passed (checksum=03A24D51)
	V2: The first version didn't have lcg_srand (testnum) in test_transform().
2012-09-22	Fix bugs in component alpha combiners for separable PDF operators	Søren Sandmann Pedersen	1	-1/+1
	In general, the component alpha version of an operator is supposed to do this:
	- multiply source with mask in all channels
	- multiply mask with source alpha in all channels
	- compute the regular operator in all channels using the mask value whenever source alpha is called for
	The first two steps are usually accomplished with the function combine_mask_ca(), but for operators where source alpha is not used, such as SRC, ADD and OUT, the simpler function combine_mask_value_ca(), which doesn't compute the new mask values, can be used. However, the PDF blend modes generally do make use of source alpha, so they can't use combine_mask_value_ca() as they do now. They have to use combine_mask_ca(). This patch fixes this in combine_multiply_ca() and the CA combiners generated by PDF_SEPARABLE_BLEND_MODE.
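The first two steps described above can be sketched in floating point. The struct and function names are hypothetical and only mirror the roles the message assigns to combine_mask_ca() and combine_mask_value_ca(); the real functions operate on packed 8-bit pixels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical premultiplied color with one value per channel. */
typedef struct
{
    double a, r, g, b;
} example_color_t;

/* Steps 1 and 2: src becomes src * mask, and mask becomes
 * mask * (original source alpha), both per channel. */
static void example_combine_mask_ca(example_color_t *src, example_color_t *mask)
{
    double src_alpha;

    assert(src != NULL && mask != NULL);
    src_alpha = src->a; /* read before src is overwritten */

    src->a *= mask->a;
    src->r *= mask->r;
    src->g *= mask->g;
    src->b *= mask->b;

    mask->a *= src_alpha;
    mask->r *= src_alpha;
    mask->g *= src_alpha;
    mask->b *= src_alpha;
}

/* Step 1 only: enough for operators that never use source alpha,
 * because the mask is left untouched. */
static void example_combine_mask_value_ca(example_color_t *src,
                                          const example_color_t *mask)
{
    assert(src != NULL && mask != NULL);
    src->a *= mask->a;
    src->r *= mask->r;
    src->g *= mask->g;
    src->b *= mask->b;
}
```

An operator that later reads the mask where it would normally read source alpha gets the wrong value if only the second helper ran, which is the bug the message describes for the PDF blend modes.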
2012-09-22	Add PIXMAN_x8b8g8r8 and PIXMAN_a8b8g8r8 formats to scaling-test	Søren Sandmann Pedersen	1	-9/+31
	Update the CRC values based on what the general implementation reports. This reveals a bug in the fast implementation:
	    % env PIXMAN_DISABLE="mmx sse2" ./test/scaling-test
	    pixman: Disabled mmx implementation
	    pixman: Disabled sse2 implementation
	    scaling test failed! (checksum=AA722B06, expected 03A23E0C)
	vs.
	    % env PIXMAN_DISABLE="mmx sse2 fast" ./test/scaling-test
	    pixman: Disabled fast implementation
	    pixman: Disabled mmx implementation
	    pixman: Disabled sse2 implementation
	    scaling test passed (checksum=03A23E0C)
	Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-15	build: Improve win32 build system	Andrea Canciani	1	-3/+7
	Handle cross-directory dependencies using PHONY targets and clean up some redundancies.
2012-08-29	test/utils.c: Use pow(), not powf() in sRGB conversion routines	Søren Sandmann Pedersen	1	-2/+2
	These functions are operating on double precision values, so use pow() instead of powf().
2012-08-26	pixel_checker: Move sRGB conversion into get_limits()	Søren Sandmann Pedersen	1	-15/+13
	The sRGB conversion has to be done every time the limits are being computed. Without this fix, pixel_checker_get_min/max() will produce the wrong results when called from somewhere other than pixel_checker_check().
2012-07-31	glyph-test: Avoid setting solid images as alpha maps.	Søren Sandmann Pedersen	1	-2/+2
	glyph-test would sometimes set a solid image as an alpha map, which is not allowed. When this happened and the debug spew was enabled, messages like this one would be generated:
	    * BUG *
	    In pixman_image_set_alpha_map: The expression !alpha_map || alpha_map->type == BITS was false
	    Set a breakpoint on '_pixman_log_error' to debug
	Fix this by not passing the ALLOW_SOLID flag to create_image() when the resulting image is to be used as an alpha map.
2012-07-31	stress-test: Avoid overflows in clip rectangles	Søren Sandmann Pedersen	1	-0/+5
	The rectangles in the clip region set in set_general_properties() would sometimes overflow, which would lead to messages like these:
	    * BUG *
	    In pixman_region32_union_rect: Invalid rectangle passed
	    Set a breakpoint on '_pixman_log_error' to debug
	when the micro version number of pixman is even. Fix this by detecting the overflow and clamping such that the x2/y2 coordinates are less than INT32_MAX.
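One way to detect the overflow and clamp the far edge of a rectangle is to do the addition in 64 bits, as in the sketch below. The function name is hypothetical and the stress-test code may detect the overflow differently.

```c
#include <assert.h>
#include <stdint.h>

/* Returns origin + size, clamped so that the result stays strictly
 * below INT32_MAX. The sum is formed in 64 bits so it cannot wrap. */
static int32_t clamped_far_edge(int32_t origin, int32_t size)
{
    int64_t edge;

    assert(size >= 0);
    edge = (int64_t)origin + size;
    if (edge >= INT32_MAX)
        edge = (int64_t)INT32_MAX - 1;
    return (int32_t)edge;
}
```

Applied to both axes, x2 = clamped_far_edge(x, width) and y2 = clamped_far_edge(y, height) keep the rectangle valid even for random inputs near the top of the 32-bit range.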
2012-07-30	Add tests to validate new sRGB behavior	Antti S. Lankila	4	-6/+104
	Composite checks random combinations of operations that now also have sRGB sources, masks and destinations, and stress-test validates the read/write primitives.
2012-07-29	Remove unnecessary dst initialization	Antti S. Lankila	1	-9/+0
	The initialization work is already performed correctly in image_init().
2012-07-02	test: Make stress-test more likely to actually composite something	Søren Sandmann Pedersen	1	-16/+55
	stress-test currently almost never composites anything because the clip rectangles and transformations are such that either _pixman_compute_composite_region32() or analyze_extent() will return FALSE. Fix this by:
	- making log_rand() return smaller numbers so that the clip rectangles are more likely to be within the destination image
	- adding rand_x() and rand_y() functions that pick positions within an image and using them for positioning alpha maps and source/mask positions
	- making it less likely that clip regions are used in general
	These changes make the test take longer, so speed it up a little by making most images smaller and by reducing the maximum convolution filter from 17x19 to 3x4. With these changes, stress-test reveals a crash in iteration 0xd39 where fast_composite_tiled_repeat() creates an indexed image without a palette.
2012-07-01	Bilinear interpolation precision is now configurable at compile time	Siarhei Siamashka	2	-2/+22
	Macro BILINEAR_INTERPOLATION_BITS in pixman-private.h selects the number of fractional bits used for bilinear interpolation. scaling-test and affine-test have checksums for 4-bit, 7-bit and 8-bit configurations.
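How a compile-time number of fractional bits feeds into interpolation can be sketched in one dimension. The macro and function names below are hypothetical and the code is not taken from pixman-private.h; it only assumes 16.16 fixed-point coordinates and 8-bit channel values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compile-time setting; the message mentions checksums
 * for 4-bit, 7-bit and 8-bit configurations. */
#ifndef EXAMPLE_BILINEAR_BITS
#define EXAMPLE_BILINEAR_BITS 7
#endif

/* Extracts the top EXAMPLE_BILINEAR_BITS bits of the 16-bit fraction
 * of a non-negative 16.16 fixed-point coordinate. */
static int example_bilinear_weight(int32_t x)
{
    assert(x >= 0);
    return (x >> (16 - EXAMPLE_BILINEAR_BITS)) &
           ((1 << EXAMPLE_BILINEAR_BITS) - 1);
}

/* Interpolates between two 8-bit values with the reduced-precision
 * weight; weight 0 returns left, larger weights move towards right. */
static int example_lerp_un8(int left, int right, int weight)
{
    const int range = 1 << EXAMPLE_BILINEAR_BITS;

    assert(weight >= 0 && weight < range);
    return (left * (range - weight) + right * weight) >> EXAMPLE_BILINEAR_BITS;
}
```

Fewer bits mean fewer distinct weights and therefore different pixel values, which is why the tests need a separate checksum for each configuration.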
2012-06-29	test: support nearest/bilinear scaling in lowlevel-blt-bench	Siarhei Siamashka	1	-1/+62
	Scale factor is selected to be nearly 1x, so that the MPix/s results can be directly compared with the results of non-scaled compositing operations.