Age | Commit message | Author | Files | Lines
2012-12-11 | demos/radial-test: Add checkerboard to display the alpha channel (HEAD, master) | Søren Sandmann Pedersen | 1 | -0/+2
2012-12-11 | demos/conical-test: Use the draw_checkerboard() utility function | Søren Sandmann Pedersen | 1 | -36/+2
Instead of having its own copy.
2012-12-11 | test/utils.[ch]: Add utility function to draw a checkerboard | Søren Sandmann Pedersen | 2 | -0/+59
This is useful in demo programs to display the alpha channel.
2012-12-11 | radial: When comparing t to mindr, use >= rather than > | Søren Sandmann Pedersen | 1 | -3/+3
Radial gradients are conceptually rendered as a sequence of circles generated by linearly extrapolating from the two circles given by the gradient specification. Any circles in that sequence that would end up with a negative radius are not drawn, a condition that is enforced by checking that t * dr is bigger than mindr:

    if (t * dr > mindr)

However, it is legitimate for a circle to have radius exactly 0, so the test should use >= rather than >.

This gets rid of the dots in demos/radial-test except for when the c2 circle has radius 0 and a repeat mode of either NONE or NORMAL. Both those dots correspond to a t value of 1.0, which is outside the defined interval of [0.0, 1.0) and therefore subject to the repeat algorithm. As a result, in the NONE case, a value of 1.0 turns into transparent black. In the NORMAL case, 1.0 wraps around and becomes 0.0 which is red, unlike 0.99 which is blue.

Cc: ranma42@gmail.com
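The corrected condition can be illustrated with a small sketch. The helper name and the use of doubles are assumptions made for illustration; the sketch only models the radius test described above (the circle at parameter t has radius r1 + t * dr, so it is drawable when t * dr >= -r1), not pixman's actual radial gradient code.

```c
#include <stdbool.h>

/* Hypothetical helper: r1 and r2 are the radii of the two circles in the
 * gradient specification. The circle at parameter t has radius
 * r1 + t * dr with dr = r2 - r1, so it has a non-negative radius
 * exactly when t * dr >= mindr, where mindr = -r1. */
static bool circle_is_drawn(double r1, double r2, double t)
{
    double dr = r2 - r1;
    double mindr = -r1;

    /* ">=" rather than ">": a circle of radius exactly 0 is legitimate. */
    return t * dr >= mindr;
}
```

With a strict ">" comparison, the first call in a sequence such as circle_is_drawn(0.0, 10.0, 0.0) would be rejected, which is the zero-radius case the commit fixes.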
2012-12-11 | demos/radial-test: Add zero-radius circles to demonstrate rendering bugs | Søren Sandmann Pedersen | 1 | -1/+9
Add two new gradient columns, one where the start circle has radius 0 and one where the end circle has radius 0. All the new gradients except for one are rendered with a bright dot in the middle. In most but not all cases this is incorrect.

Cc: ranma42@gmail.com
2012-12-10 | test: Workaround unaligned MOVDQA bug (http://gcc.gnu.org/PR55614) | Siarhei Siamashka | 1 | -0/+12
Just use SSE2 intrinsics to do unaligned memory accesses as a workaround for this gcc bug related to vector extensions.
2012-12-10 | Improve performance of combine_over_u | Siarhei Siamashka | 1 | -7/+51
The generic C over_u combiner can be a lot faster with the addition of special shortcuts for 0xFF and 0x00 alpha/mask values. This is already implemented in C and SSE2 fast paths.

Profiling the run of cairo-perf-trace benchmarks with PIXMAN_DISABLE environment variable set to "fast mmx sse2" on Intel Core i7:

=== before ===

    37.32% cairo-perf-trac libpixman-1.so.0.29.1 [.] combine_over_u
    21.37% cairo-perf-trac libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_no_repeat_8888
    13.51% cairo-perf-trac libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_none_a8r8g8b8
     2.96% cairo-perf-trac libpixman-1.so.0.29.1 [.] radial_compute_color
     2.74% cairo-perf-trac libpixman-1.so.0.29.1 [.] fetch_scanline_a8
     2.71% cairo-perf-trac libpixman-1.so.0.29.1 [.] fetch_scanline_x8r8g8b8
     2.17% cairo-perf-trac libpixman-1.so.0.29.1 [.] _pixman_gradient_walker_pixel
     1.86% cairo-perf-trac libcairo.so.2.11200.0 [.] _cairo_tor_scan_converter_generate
     1.57% cairo-perf-trac libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_pad_a8r8g8b8
     0.97% cairo-perf-trac libpixman-1.so.0.29.1 [.] combine_in_reverse_u
     0.96% cairo-perf-trac libpixman-1.so.0.29.1 [.] combine_over_ca

=== after ===

    28.79% cairo-perf-trac libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_no_repeat_8888
    18.44% cairo-perf-trac libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_none_a8r8g8b8
    15.54% cairo-perf-trac libpixman-1.so.0.29.1 [.] combine_over_u
     3.94% cairo-perf-trac libpixman-1.so.0.29.1 [.] radial_compute_color
     3.69% cairo-perf-trac libpixman-1.so.0.29.1 [.] fetch_scanline_a8
     3.69% cairo-perf-trac libpixman-1.so.0.29.1 [.] fetch_scanline_x8r8g8b8
     2.94% cairo-perf-trac libpixman-1.so.0.29.1 [.] _pixman_gradient_walker_pixel
     2.52% cairo-perf-trac libcairo.so.2.11200.0 [.] _cairo_tor_scan_converter_generate
     2.08% cairo-perf-trac libpixman-1.so.0.29.1 [.] bits_image_fetch_bilinear_affine_pad_a8r8g8b8
     1.31% cairo-perf-trac libpixman-1.so.0.29.1 [.] combine_in_reverse_u
     1.29% cairo-perf-trac libpixman-1.so.0.29.1 [.] combine_over_ca
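The kind of shortcut described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the actual combine_over_u code: the function names are hypothetical, pixels are assumed to be premultiplied a8r8g8b8, and the mask handling is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply each 8-bit channel of x by a / 255, with rounding. */
static uint32_t mul_un8x4(uint32_t x, uint8_t a)
{
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t t = ((x >> shift) & 0xff) * a + 0x80;
        result |= (((t + (t >> 8)) >> 8) & 0xff) << shift;
    }
    return result;
}

/* Hypothetical unmasked OVER combiner: dest = src + dest * (1 - src alpha).
 * Assumes premultiplied pixels, so the per-channel sums cannot overflow. */
static void over_u_sketch(uint32_t *dest, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t s = src[i];
        uint8_t a = (uint8_t)(s >> 24);

        if (a == 0xff)
            dest[i] = s;                 /* opaque source: plain copy */
        else if (s != 0)
            dest[i] = s + mul_un8x4(dest[i], (uint8_t)(255 - a));
        /* s == 0: fully transparent source, dest is left untouched */
    }
}
```

The two shortcuts skip the per-channel multiplication entirely, which matters because fully opaque and fully transparent pixels tend to dominate in typical rendering workloads.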
2012-12-08 | Add fast paths for separable convolution | Søren Sandmann Pedersen | 3 | -3/+184
Similar to the fast paths for general affine access, add some fast paths for the separable filter for all combinations of formats x8r8g8b8, a8r8g8b8, r5g6b5, a8 with the four repeat modes. It is easy to see the speedup in the demos/scale program.
2012-12-08 | Add demo program for conical gradients | Søren Sandmann Pedersen | 2 | -0/+136
This new test is derived from radial-test.c and displays conical gradients at various angles. It also demonstrates how PIXMAN_REPEAT_NORMAL is supposed to work when used with a gradient specification where the first stop is not at 0.0: in this case the gradient is supposed to have a smooth transition from the last stop back to the first stop with no sharp transitions. It also shows that the repeat mode is not ignored for conical gradients as one might be tempted to think.
2012-12-08 | Add demos/zone_plate.png | Søren Sandmann Pedersen | 1 | -0/+0
The zone plate image is a useful test case for image scalers because it contains all representable frequencies, so any imperfection in resampling filters will show up as Moire patterns. This version is symmetric around the midpoint of the image, so since rotating it is supposed to be a noop, it can also be used to verify that the resampling filters don't shift the image. V2: Run the file through OptiPNG to cut the size in half, as suggested by Siarhei.
2012-12-08 | demos: Add new demo program, "scale" | Søren Sandmann Pedersen | 3 | -2/+737
This program allows interactively scaling and rotating images using various filters and repeat modes. It uses pixman_filter_create_separable_convolution() to generate the filters.
2012-12-08 | demos/gtk-utils.[ch]: Add pixman_image_from_file() | Søren Sandmann Pedersen | 2 | -0/+69
This function uses GdkPixbuf to load various common formats such as .png and .jpg into a pixman image.
2012-12-08 | Add new pixman_filter_create_separable_convolution() API | Søren Sandmann Pedersen | 4 | -2/+370
This new API is a helper function to create filter parameters suitable for use with PIXMAN_FILTER_SEPARABLE_CONVOLUTION.

For each dimension, given a scale factor, reconstruction and sample filter kernels, and a subsampling resolution, this function will compute a convolution of the two kernels scaled appropriately, then sample that convolution and return the resulting vectors in a form suitable for being used as parameters to PIXMAN_FILTER_SEPARABLE_CONVOLUTION.

The filter kernels offered are the following:

- IMPULSE: Dirac delta function, i.e., point sampling
- BOX: Box filter
- LINEAR: Linear filter, aka. "Tent" filter
- CUBIC: Cubic filter, currently Mitchell-Netravali
- GAUSSIAN: Gaussian function, sigma=1, support=3*sigma
- LANCZOS2: Two-lobed Lanczos filter
- LANCZOS3: Three-lobed Lanczos filter
- LANCZOS3_STRETCHED: Three-lobed Lanczos filter, stretched by 4/3.0. This is the "Nice" filter from Dirty Pixels by Jim Blinn.

The intended way to use this function is to extract scaling factors from the transformation and then pass those to this function to get a filter suitable for compositing with that transformation. The filter kernels can be chosen according to quality and performance tradeoffs.

To get equivalent quality to GdkPixbuf for downscalings, use BOX for both reconstruction and sampling. For upscalings, use LINEAR for reconstruction and IMPULSE for sampling (though note that for upscaling in both X and Y directions, simply using PIXMAN_FILTER_BILINEAR will likely be a better choice).
2012-12-08 | rounding.txt: Describe how SEPARABLE_CONVOLUTION filter works | Søren Sandmann Pedersen | 1 | -0/+33
Add some notes on how to compute the convolution matrices to be used with the SEPARABLE_CONVOLUTION filter.
2012-12-08 | Add new filter PIXMAN_FILTER_SEPARABLE_CONVOLUTION | Søren Sandmann Pedersen | 4 | -3/+149
This filter is a new way to use a convolution matrix for filtering. In contrast to the existing CONVOLUTION filter, this new variant is different in two respects:

- It is subsampled: Instead of just one convolution matrix, this filter chooses between a number of matrices based on the subpixel sample location, allowing the convolution kernel to be sampled at a higher resolution.
- It is separable: Each matrix is specified as the tensor product of two vectors. This has the advantages that many fewer values have to be stored, and that the filtering can be done separately in the x and y dimensions (although the initial implementation doesn't actually do that).

The motivation for this new filter is to improve image downsampling quality. Currently, the best pixman can do is the regular convolution filter which is limited to coarsely sampled convolution kernels. With this new feature, any separable filter can be used at any desired resolution.
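To make the "tensor product of two vectors" concrete, the sketch below expands a pair of 1-D kernels into the full 2-D matrix they represent. The function name is hypothetical, and doubles are used for readability; it says nothing about how pixman stores or applies its fixed-point filter parameters.

```c
#include <stddef.h>

/* Hypothetical helper: expand a separable kernel into its full matrix,
 * matrix[y][x] = ky[y] * kx[x], stored row-major in a caller-provided
 * buffer of height * width doubles. The separable form needs only
 * width + height values instead of width * height. */
static void expand_separable(const double *kx, size_t width,
                             const double *ky, size_t height,
                             double *matrix)
{
    for (size_t y = 0; y < height; y++)
        for (size_t x = 0; x < width; x++)
            matrix[y * width + x] = ky[y] * kx[x];
}
```

For a 7x7 kernel, for instance, the separable form stores 14 values per subsample position rather than 49.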
2012-12-08 | Fix thread safety on mingw-w64 and clang | Benjamin Gilbert | 1 | -1/+1
After finding a working TLS storage class specifier, configure was continuing to test other candidates. This caused it to prefer __declspec(thread) over __thread. However, __declspec(thread) is ignored with a warning by mingw-w64 [1] and silently ignored by clang [2]. The resulting binary behaved as if PIXMAN_NO_TLS was defined.

Bug introduced by a069da6c.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=57591
[2] http://lists.freedesktop.org/archives/pixman/2012-October/002320.html
2012-12-06 | test: Get rid of the obsolete 'prng_rand_N' and 'prng_rand_u32' | Siarhei Siamashka | 10 | -56/+44
They are the same as 'prng_rand_n' and 'prng_rand'
2012-12-06 | test: Switch to the new PRNG instead of old LCG | Siarhei Siamashka | 14 | -69/+65
Wallclock time for running pixman "make check" (compile time not included):

    ----------------------------+----------------+-----------------------------+
                                | old PRNG (LCG) |   new PRNG (Bob Jenkins)    |
           Processor type       +----------------+------------+----------------+
                                |    gcc 4.5     |  gcc 4.5   | gcc 4.7 (simd) |
    ----------------------------+----------------+------------+----------------+
    quad Intel Core i7  @2.8GHz |   0m49.494s    | 0m43.722s  |   0m37.560s    |
    dual ARM Cortex-A15 @1.7GHz |   5m8.465s     | 4m37.375s  |   3m45.819s    |
         IBM Cell PPU   @3.2GHz |   23m0.821s    | 20m38.316s |   16m37.513s   |
    ----------------------------+----------------+------------+----------------+

But some tests got a particularly large boost. For example benchmarking and profiling blitters-test on Core i7:

=== before ===

    $ time ./blitters-test

    real 0m10.907s
    user 0m55.650s
    sys  0m0.000s

    70.45% blitters-test blitters-test [.] create_random_image
    15.81% blitters-test blitters-test [.] compute_crc32_for_image_internal
     2.26% blitters-test blitters-test [.] _pixman_implementation_lookup_composite
     1.07% blitters-test libc-2.15.so  [.] _int_free
     0.89% blitters-test libc-2.15.so  [.] malloc_consolidate
     0.87% blitters-test libc-2.15.so  [.] _int_malloc
     0.75% blitters-test blitters-test [.] combine_conjoint_general_u
     0.61% blitters-test blitters-test [.] combine_disjoint_general_u
     0.40% blitters-test blitters-test [.] test_composite
     0.31% blitters-test libc-2.15.so  [.] _int_memalign
     0.31% blitters-test blitters-test [.] _pixman_bits_image_setup_accessors
     0.28% blitters-test libc-2.15.so  [.] malloc

=== after ===

    $ time ./blitters-test

    real 0m3.655s
    user 0m20.550s
    sys  0m0.000s

    41.77% blitters-test.n blitters-test.new [.] compute_crc32_for_image_internal
    15.77% blitters-test.n blitters-test.new [.] prng_randmemset_r
     6.15% blitters-test.n blitters-test.new [.] _pixman_implementation_lookup_composite
     3.09% blitters-test.n libc-2.15.so      [.] _int_free
     2.68% blitters-test.n libc-2.15.so      [.] malloc_consolidate
     2.39% blitters-test.n libc-2.15.so      [.] _int_malloc
     2.27% blitters-test.n blitters-test.new [.] create_random_image
     2.22% blitters-test.n blitters-test.new [.] combine_conjoint_general_u
     1.52% blitters-test.n blitters-test.new [.] combine_disjoint_general_u
     1.40% blitters-test.n blitters-test.new [.] test_composite
     1.02% blitters-test.n blitters-test.new [.] prng_srand_r
     1.00% blitters-test.n blitters-test.new [.] _pixman_image_validate
     0.96% blitters-test.n blitters-test.new [.] _pixman_bits_image_setup_accessors
     0.90% blitters-test.n libc-2.15.so      [.] malloc
2012-12-06 | test: Search/replace 'lcg_*' -> 'prng_*' | Siarhei Siamashka | 14 | -317/+317
The 'lcg' prefix is going to be misleading if we replace PRNG algorithm.
2012-12-06 | test: Added a better PRNG (pseudorandom number generator) | Siarhei Siamashka | 4 | -0/+581
This adds a fast SIMD-optimized variant of a small noncryptographic PRNG originally developed by Bob Jenkins: http://www.burtleburtle.net/bob/rand/smallprng.html

The generated pseudorandom data is good enough to pass "Big Crush" tests from TestU01 (http://en.wikipedia.org/wiki/TestU01).

SIMD code uses http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html which is a GCC specific extension. There is also a slower alternative code path, which should work with any C compiler.

The performance of filling buffer with random data:

    Intel Core i7 @2.8GHz (SSE2)      : ~5.9 GB/s
    ARM Cortex-A15 @1.7GHz (NEON)     : ~2.2 GB/s
    IBM Cell PPU @3.2GHz (Altivec)    : ~1.7 GB/s
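For reference, a scalar sketch of the generator this commit builds on is shown below. The type and function names are hypothetical; the update step and seeding follow the 32-bit algorithm as published on Bob Jenkins' page linked above. It does not reproduce the SIMD variant or the test suite's actual prng_* interface.

```c
#include <stdint.h>

/* Hypothetical names; scalar form of the small noncryptographic PRNG. */
typedef struct {
    uint32_t a, b, c, d;
} small_prng_t;

/* Rotate left by k bits, 0 < k < 32. */
static uint32_t rot32(uint32_t x, int k)
{
    return (x << k) | (x >> (32 - k));
}

/* Advance the state and return the next 32-bit value. */
static uint32_t small_prng_next(small_prng_t *x)
{
    uint32_t e = x->a - rot32(x->b, 27);

    x->a = x->b ^ rot32(x->c, 17);
    x->b = x->c + x->d;
    x->c = x->d + e;
    x->d = e;
    return x->d;
}

/* Seed the state and discard the first 20 outputs to mix it. */
static void small_prng_seed(small_prng_t *x, uint32_t seed)
{
    x->a = 0xf1ea5eed;
    x->b = x->c = x->d = seed;
    for (int i = 0; i < 20; i++)
        (void)small_prng_next(x);
}
```

The state update uses only additions, XORs and rotations on four 32-bit words, which is presumably what makes it a good fit for running several independent generators side by side in SIMD lanes.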
2012-12-06 | test: Change is_little_endian() into inline function | Siarhei Siamashka | 2 | -10/+6
Also dropped redundant volatile keyword because any object can be accessed via char* pointer without breaking aliasing rules. The compilers are able to optimize this function to either constant 0 or 1.
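A minimal sketch of such a check, assuming nothing about the exact code in test/utils.h beyond what the message describes (the function name carries a suffix to mark it as illustrative):

```c
#include <stdint.h>

/* Inspect the first byte of a known integer through an unsigned char
 * pointer. Character-type access is always permitted by the aliasing
 * rules, so no volatile qualifier is needed, and compilers can fold
 * the whole function to a constant. */
static inline int is_little_endian_sketch(void)
{
    uint32_t probe = 1;

    return *(const unsigned char *)&probe == 1;
}
```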
2012-11-22 | Add text file rounding.txt describing how rounding works | Søren Sandmann Pedersen | 1 | -0/+134
It is not entirely obvious how pixman gets from "location in the source image" to "pixel value stored in the destination". This file describes how the filters work, and in particular how positions are rounded to samples.
2012-11-22 | Convolution filter: round color values instead of truncating | Søren Sandmann Pedersen | 1 | -4/+4
The pixel computed by the convolution filter should be rounded off, not truncated. As a simple example consider a convolution matrix consisting of five times 0x3333. If all five input pixels are 0xff, then the result of truncating will be

    (5 * 0x3333 * 255) >> 16 = 254

But the real value of the computation is (5 * 0x3333 / 65536.0) * 255 = 254.9961, so the error is almost 1. If the user isn't very careful about normalizing the convolution kernel so that it sums to one in fixed point, such error might cause solid images to change color, or opaque images to become translucent.

The fix is simply to round instead of truncate.
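The arithmetic in this example can be checked with a short sketch. The function names are hypothetical and the code models only one 8-bit channel with 16.16 fixed-point weights; it is not the convolution fetcher itself.

```c
#include <stdint.h>

/* Weighted sum of n 8-bit samples with 16.16 fixed-point weights,
 * truncating the fractional part (the old behaviour). */
static uint8_t convolve_truncate(const uint32_t *w, const uint8_t *p, int n)
{
    uint32_t acc = 0;

    for (int i = 0; i < n; i++)
        acc += w[i] * p[i];
    acc >>= 16;
    return acc > 255 ? 255 : (uint8_t)acc;
}

/* Same sum, but adding one half (0x8000) before the shift so the
 * result is rounded to the nearest integer. */
static uint8_t convolve_round(const uint32_t *w, const uint8_t *p, int n)
{
    uint32_t acc = 0;

    for (int i = 0; i < n; i++)
        acc += w[i] * p[i];
    acc = (acc + 0x8000) >> 16;
    return acc > 255 ? 255 : (uint8_t)acc;
}
```

With five weights of 0x3333 and five samples of 0xff, the accumulated value is 16711425, which truncates to 254 but rounds to 255.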
2012-11-20 | Round fixed-point multiplication | Søren Sandmann Pedersen | 4 | -12/+12
After two fixed-point numbers are multiplied, the result is shifted into place, but up until now pixman has simply discarded the low-order bits instead of rounding to the closest number. Fix that by adding 0x8000 (or 0x2 in one place) before shifting and update the test checksums to match.
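A rounded 16.16 multiplication of the kind described here might look like the sketch below. The type and function names are hypothetical stand-ins, and the code assumes the compiler implements right shift of negative values as an arithmetic shift, which is true of common compilers but not guaranteed by the C standard.

```c
#include <stdint.h>

typedef int32_t fixed_16_16;    /* hypothetical stand-in for a 16.16 type */

/* Multiply two 16.16 numbers. The 64-bit product has 32 fractional
 * bits; adding 0x8000 (one half of the final unit) before shifting
 * rounds to the nearest representable value instead of discarding
 * the low-order bits. */
static fixed_16_16 fixed_mul_round(fixed_16_16 a, fixed_16_16 b)
{
    int64_t product = (int64_t)a * b;

    return (fixed_16_16)((product + 0x8000) >> 16);
}
```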
2012-11-14 | test: Fix compiler warnings caused by unused code | Stefan Weil | 1 | -0/+4
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14 | pixman: Use uintptr_t in type casts from pointer to integral value | Stefan Weil | 6 | -95/+95
These modifications fix lots of compiler warnings for systems where sizeof(unsigned long) != sizeof(void *). This is especially true for MinGW-w64 (64 bit Windows). Signed-off-by: Stefan Weil <sw@weilnetz.de>
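As an illustration of the pattern, here is a hypothetical alignment check written with uintptr_t. On 64-bit Windows, unsigned long is 32 bits wide while pointers are 64 bits, so casting a pointer to unsigned long truncates it and triggers warnings; uintptr_t is defined to be wide enough to hold a pointer.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if p sits on a 16-byte boundary.
 * Casting through uintptr_t is safe regardless of how sizeof(long)
 * compares with sizeof(void *). */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}
```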
2012-11-14 | Always use xmmintrin.h for 64 bit Windows | Stefan Weil | 1 | -1/+1
MinGW-w64 uses the GNU compiler and does not define _MSC_VER. Nevertheless, it provides xmmintrin.h and must be handled here like the MS compiler. Otherwise compilation fails due to conflicting declarations. Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-11-14 | MIPS: DSPr2: Added several nearest neighbor fast paths with a8 mask: | Nemanja Lukic | 3 | -0/+214
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -n

Referent (before):

    over_8888_8_0565 = L1: 9.62 L2: 8.85 M: 7.40 ( 39.27%) HT: 5.67 VT: 5.61 R: 5.45 RT: 2.98 ( 22Kops/s)
    over_0565_8_0565 = L1: 7.90 L2: 7.49 M: 6.72 ( 26.75%) HT: 5.24 VT: 5.20 R: 5.06 RT: 2.90 ( 22Kops/s)

Optimized:

    over_8888_8_0565 = L1: 18.51 L2: 16.82 M: 12.13 ( 64.43%) HT: 10.06 VT: 9.88 R: 9.54 RT: 5.63 ( 31Kops/s)
    over_0565_8_0565 = L1: 14.82 L2: 13.94 M: 11.34 ( 45.20%) HT: 9.45 VT: 9.35 R: 9.03 RT: 5.50 ( 31Kops/s)
2012-11-14 | MIPS: DSPr2: Added more fast-paths for OVER operation: | Nemanja Lukic | 3 | -1/+178
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):

    over_n_0565 = L1: 14.48 L2: 21.36 M: 17.57 ( 23.30%) HT: 6.95 VT: 6.44 R: 6.39 RT: 2.16 ( 22Kops/s)
    over_n_8888 = L1: 92.60 L2: 86.13 M: 24.41 ( 64.74%) HT: 8.94 VT: 8.06 R: 8.00 RT: 2.53 ( 25Kops/s)

Optimized:

    over_n_0565 = L1: 27.65 L2: 189.22 M: 58.19 ( 77.12%) HT: 52.80 VT: 49.88 R: 47.53 RT: 23.67 ( 72Kops/s)
    over_n_8888 = L1: 235.99 L2: 230.86 M: 29.09 ( 77.11%) HT: 27.95 VT: 27.24 R: 26.58 RT: 18.10 ( 67Kops/s)
2012-11-14 | MIPS: DSPr2: Added more fast-paths for SRC operation: | Nemanja Lukic | 2 | -0/+142
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):

    src_n_8_8888 = L1: 13.79 L2: 22.47 M: 17.55 ( 58.28%) HT: 6.95 VT: 6.46 R: 6.34 RT: 2.07 ( 20Kops/s)
    src_n_8_8 = L1: 20.22 L2: 20.21 M: 18.20 ( 24.17%) HT: 6.65 VT: 6.22 R: 6.11 RT: 2.03 ( 20Kops/s)

Optimized:

    src_n_8_8888 = L1: 58.31 L2: 53.34 M: 25.69 ( 85.29%) HT: 22.55 VT: 21.44 R: 19.91 RT: 10.34 ( 48Kops/s)
    src_n_8_8 = L1: 102.60 L2: 89.43 M: 65.01 ( 86.32%) HT: 37.87 VT: 37.02 R: 32.43 RT: 12.41 ( 51Kops/s)
2012-11-11 | Allow src and dst to be identical in pixman_f_transform_invert() | Søren Sandmann Pedersen | 1 | -6/+9
It is useful to be able to invert a matrix in place, but currently pixman_f_transform_invert() will produce wrong results if you pass the same matrix as both source and destination. Fix that by inverting into a temporary matrix and then copying that to the destination.
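The "invert into a temporary, then copy" approach can be sketched as below. The mat3 type and function name are hypothetical stand-ins, and the inversion uses the adjugate formula for brevity; this is not a copy of pixman_f_transform_invert().

```c
#include <stdbool.h>

typedef struct {
    double m[3][3];
} mat3;                         /* hypothetical stand-in type */

/* Invert src into dst. dst and src may point to the same matrix,
 * because every entry is computed into tmp before dst is written.
 * Returns false, leaving dst untouched, if src is singular. */
static bool mat3_invert(mat3 *dst, const mat3 *src)
{
    const double (*a)[3] = src->m;
    mat3 tmp;
    double det = a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
               - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
               + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]);

    if (det == 0.0)
        return false;

    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            /* Cofactor of element (j, i); cyclic indices supply the sign. */
            int r0 = (j + 1) % 3, r1 = (j + 2) % 3;
            int c0 = (i + 1) % 3, c1 = (i + 2) % 3;

            tmp.m[i][j] = (a[r0][c0] * a[r1][c1] - a[r0][c1] * a[r1][c0]) / det;
        }
    }

    *dst = tmp;                 /* single copy at the end */
    return true;
}
```

Writing results directly into dst would overwrite entries of src that later cofactors still need whenever the two alias, which is the failure mode the commit message describes.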
2012-11-10 | pixman.h: Add typedefs for pixman_f_transform and pixman_f_vector | Søren Sandmann Pedersen | 1 | -0/+3
2012-11-09 | Fix undeclared variable use and sysctlbyname error handling on ppc | Joshua Root | 1 | -3/+3
Fixes bug 56889.
2012-11-09 | pixman_image_composite: Reduce opaque masks to NULL | Søren Sandmann Pedersen | 1 | -1/+1
When the mask is known to be opaque, we might as well reduce it to NULL to take advantage of the various fast paths that operate on NULL masks.
2012-11-07 | Post-release version bump to 0.29.1 | Søren Sandmann Pedersen | 1 | -2/+2
2012-11-07 | Pre-release version bump to 0.28.0 | Søren Sandmann Pedersen | 1 | -2/+2
2012-10-25 | Post-release version bump to 0.27.5 | Søren Sandmann Pedersen | 1 | -1/+1
2012-10-25 | Pre-release version bump to 0.27.4 | Søren Sandmann Pedersen | 1 | -1/+1
2012-10-25 | MIPS: DSPr2: Added more fast-paths for ADD operation: - add_8888_8888_8888 - add_8_8 - add_8888_8888 | Nemanja Lukic | 2 | -0/+212
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):

    add_8888_8888_8888 = L1: 17.55 L2: 13.35 M: 8.13 ( 93.95%) HT: 6.60 VT: 6.64 R: 6.45 RT: 3.47 ( 26Kops/s)
    add_8_8 = L1: 86.07 L2: 84.89 M: 62.36 ( 90.11%) HT: 36.36 VT: 34.74 R: 29.56 RT: 11.56 ( 52Kops/s)
    add_8888_8888 = L1: 95.59 L2: 73.05 M: 17.62 (101.84%) HT: 15.46 VT: 15.01 R: 13.94 RT: 6.71 ( 42Kops/s)

Optimized:

    add_8888_8888_8888 = L1: 41.52 L2: 33.21 M: 11.97 (138.45%) HT: 10.47 VT: 10.19 R: 9.42 RT: 4.86 ( 32Kops/s)
    add_8_8 = L1: 135.06 L2: 104.82 M: 57.13 ( 82.58%) HT: 34.79 VT: 36.60 R: 28.28 RT: 10.54 ( 51Kops/s)
    add_8888_8888 = L1: 176.36 L2: 67.82 M: 17.48 (101.06%) HT: 15.16 VT: 14.62 R: 13.88 RT: 8.05 ( 45Kops/s)
2012-10-25 | MIPS: DSPr2: Added more fast-paths for ADD operation: - add_0565_8_0565 - add_8888_8_8888 - add_8888_n_8888 | Nemanja Lukic | 2 | -0/+182
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):

    add_0565_8_0565 = L1: 8.89 L2: 8.37 M: 7.35 ( 29.22%) HT: 5.90 VT: 5.85 R: 5.67 RT: 3.31 ( 26Kops/s)
    add_8888_8_8888 = L1: 17.22 L2: 14.17 M: 9.89 ( 65.56%) HT: 7.57 VT: 7.50 R: 7.36 RT: 4.10 ( 30Kops/s)
    add_8888_n_8888 = L1: 17.79 L2: 14.87 M: 10.35 ( 54.89%) HT: 5.19 VT: 4.93 R: 4.92 RT: 1.90 ( 19Kops/s)

Optimized:

    add_0565_8_0565 = L1: 21.72 L2: 20.01 M: 14.96 ( 59.54%) HT: 12.03 VT: 11.81 R: 11.26 RT: 6.33 ( 37Kops/s)
    add_8888_8_8888 = L1: 47.42 L2: 38.64 M: 15.90 (105.48%) HT: 13.34 VT: 13.03 R: 11.84 RT: 6.63 ( 38Kops/s)
    add_8888_n_8888 = L1: 54.83 L2: 42.66 M: 17.36 ( 92.11%) HT: 15.20 VT: 14.82 R: 13.66 RT: 7.83 ( 41Kops/s)
2012-10-25 | MIPS: DSPr2: Added fast-paths for ADD operation: - add_n_8_8 - add_n_8_8888 - add_8_8_8 | Nemanja Lukic | 3 | -0/+284
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):

    add_n_8_8 = L1: 41.37 L2: 37.83 M: 30.38 ( 60.45%) HT: 23.70 VT: 22.85 R: 21.51 RT: 10.32 ( 45Kops/s)
    add_n_8_8888 = L1: 16.01 L2: 14.46 M: 11.64 ( 46.32%) HT: 5.50 VT: 5.18 R: 5.06 RT: 1.89 ( 18Kops/s)
    add_8_8_8 = L1: 13.26 L2: 12.47 M: 11.16 ( 29.61%) HT: 8.09 VT: 8.04 R: 7.68 RT: 3.90 ( 29Kops/s)

Optimized:

    add_n_8_8 = L1: 96.03 L2: 79.37 M: 51.89 (103.31%) HT: 32.59 VT: 31.29 R: 28.52 RT: 11.08 ( 46Kops/s)
    add_n_8_8888 = L1: 53.61 L2: 46.92 M: 23.78 ( 94.70%) HT: 19.06 VT: 18.64 R: 17.30 RT: 9.15 ( 43Kops/s)
    add_8_8_8 = L1: 89.65 L2: 66.82 M: 37.10 ( 98.48%) HT: 22.10 VT: 21.74 R: 20.12 RT: 8.12 ( 41Kops/s)
2012-10-25 | Workaround for FTBFS with gcc 4.6 (http://gcc.gnu.org/PR54965) | Siarhei Siamashka | 1 | -0/+7
GCC 4.6 has problems with force_inline, so just use normal inline instead. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=55630
2012-10-21 | pixman_composite_trapezoids(): don't clip to extents for some operators | Søren Sandmann Pedersen | 2 | -9/+36
pixman_composite_trapezoids() is supposed to composite across the entire destination, but it actually only composites across the extent of the trapezoids. For operators such as ADD or OVER this doesn't matter since a zero source has no effect on the destination. But for operators such as SRC or IN, it does matter. So for such operators where a zero source has an effect, don't clip to the trap extents.
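The distinction between the two groups of operators can be seen on a single 8-bit channel. The helpers below are hypothetical, simplified models of the Porter-Duff arithmetic for premultiplied values, not pixman's combiners.

```c
#include <stdint.h>

/* x * a / 255 with rounding, for 8-bit channel values. */
static uint8_t mul_un8(uint8_t x, uint8_t a)
{
    uint32_t t = (uint32_t)x * a + 0x80;

    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* OVER: s + d * (1 - source alpha) */
static uint8_t op_over(uint8_t s, uint8_t sa, uint8_t d)
{
    return (uint8_t)(s + mul_un8(d, (uint8_t)(255 - sa)));
}

/* ADD: saturating s + d */
static uint8_t op_add(uint8_t s, uint8_t d)
{
    unsigned sum = (unsigned)s + d;

    return sum > 255 ? 255 : (uint8_t)sum;
}

/* SRC: the destination is replaced by the source */
static uint8_t op_src(uint8_t s, uint8_t d)
{
    (void)d;
    return s;
}

/* IN: s * destination alpha */
static uint8_t op_in(uint8_t s, uint8_t da)
{
    return mul_un8(s, da);
}
```

With a zero source (s = 0, sa = 0), op_over and op_add return the destination unchanged, while op_src and op_in return 0. Pixels outside the trapezoid extents therefore still have to be written for the latter group.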
2012-10-21 | pixman_composite_trapezoids(): Factor out extents computation | Søren Sandmann Pedersen | 1 | -40/+53
The computation of the extents rectangle is moved to its own function.
2012-10-21 | Add new pixman_image_create_bits_no_clear() API | Søren Sandmann Pedersen | 4 | -13/+52
When the pixman_image_create_bits() function is given NULL for bits, it will allocate a new buffer and initialize it to zero. However, in some cases, only a small region of the image is actually used; in that case it is wasteful to touch all of the memory. The new pixman_image_create_bits_no_clear() works exactly like _create_bits() except that it doesn't initialize any newly allocated memory.
2012-10-17 | configure.ac: PIXMAN_LINK_WITH_ENV fix | Benny Siegert | 1 | -1/+4
(fixes bug #52101)

On MirBSD, the compiler produces a (harmless) warning when the compiler is called without the standard CFLAGS:

    foo.c:0: note: someone does not honour COPTS correctly, passed 0 times

However, PIXMAN_LINK_WITH_ENV considers _any_ output on stderr as an error, even if the exit status of the compiler is 0. Furthermore, it resets CFLAGS and LDFLAGS at the start. On MirBSD, this will lead to a warning in each test, making all such tests fail. In particular, the pthread_setspecific test fails, thus pixman is compiled without thread support. This leads to compile errors later on, or at least it did when I tried this on pkgsrc.

Re-adding the saved CFLAGS, LDFLAGS and LIBS before the test makes it work.

The second hunk inverts the order of the pthread flag checks. On BSD systems (this is true at least on OpenBSD and MirBSD), both -lpthread and -pthread work but the latter is "preferred", whatever this means.
2012-10-16 | Add missing force_inline to in() function used for C fast paths | Siarhei Siamashka | 1 | -1/+1
2012-10-16 | MIPS: skip runtime detection for DSPr2 if -mdspr2 option is in CFLAGS | Siarhei Siamashka | 1 | -3/+13
This provides a way to enable MIPS DSP ASE optimizations if running under qemu-user (where /proc/cpuinfo contains information about the host processor instead of the emulated one). Can be used for running pixman test suite in qemu-user when having no access to real MIPS hardware.
2012-10-11 | region: Remove overlap argument from pixman_op() | Søren Sandmann Pedersen | 1 | -37/+14
This is used to compute whether the regions in question overlap, but nothing makes use of this information, so it can be removed.
2012-10-11 | region: Formatting fix | Søren Sandmann Pedersen | 1 | -3/+1
The while part of a do/while loop was formatted as if it were a while loop with an empty body. Probably some indent tool misinterpreted the code at some point.