Age | Commit message | Author | Files | Lines
2011-03-12 | Simplify the prototype for iterator initializers. [simplify-iters] | Søren Sandmann Pedersen | 9 | -153/+61
All of the information previously passed to the iterator initializers is now available in the iterator itself, so there is no need to pass it as arguments anymore.
2011-03-12 | Fill out parts of iters in _pixman_implementation_{src,dest}_iter_init() | Søren Sandmann Pedersen | 4 | -11/+24
This makes _pixman_implementation_{src,dest}_iter_init() responsible for filling parts of the information in the iterators. Specifically, the information passed as arguments is stored in the iterator. Also add a height field to pixman_iter_t.
2011-03-12 | In delegate_{src,dest}_iter_init() call delegate directly. | Søren Sandmann Pedersen | 2 | -3/+3
There is no reason to go through _pixman_implementation_{src,dest}_iter_init(), especially since _pixman_implementation_src_iter_init() is doing various other checks that only need to be done once. Also call delegate->src_iter_init() directly in pixman-sse2.c.
2011-03-07 | test: Do endian swapping of the source and destination images. | Søren Sandmann Pedersen | 1 | -0/+4
Otherwise the test fails on big endian. Fix for bug 34767, reported by Siarhei Siamashka.
2011-03-07 | test: In image_endian_swap() use pixman_image_get_format() to get the bpp. | Søren Sandmann Pedersen | 6 | -12/+17
There is no reason to pass in the bpp as an argument; it can be obtained directly from the image.
2011-02-28 | ARM: NEON optimization for bilinear scaled 'src_8888_8888' | Siarhei Siamashka | 2 | -0/+242
Initial NEON optimization for bilinear scaling. It can probably be improved further.
Benchmark on ARM Cortex-A8:
Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=6.70 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=44.27 MPix/s
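The src=20028888 and dst=20028888 fields in the benchmark output appear to be pixman format codes printed in hexadecimal. As a rough sketch, assuming the format layout of this era (bits per pixel in the top byte, then the type, then four bits each for the a, r, g and b channel widths), such a code can be unpacked as follows. The struct and function names are hypothetical and are not part of pixman.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct for illustration; not a pixman type. */
typedef struct {
    unsigned bpp, type, a, r, g, b;
} format_fields_t;

/* Assumed layout: bpp in bits 31..24, type in bits 23..16,
 * then 4 bits each for the a, r, g and b channel widths. */
static format_fields_t decode_format(uint32_t code)
{
    format_fields_t f;

    f.bpp  = (code >> 24) & 0xff;
    f.type = (code >> 16) & 0xff;
    f.a    = (code >> 12) & 0x0f;
    f.r    = (code >>  8) & 0x0f;
    f.g    = (code >>  4) & 0x0f;
    f.b    =  code        & 0x0f;

    /* Sanity check: the channels cannot be wider than the pixel. */
    assert(f.a + f.r + f.g + f.b <= f.bpp);
    return f;
}
```

Under that assumption, 0x20028888 decodes to 32 bits per pixel, type 2, and 8 bits for each of a, r, g and b, which matches the 8888 in the commit title.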
2011-02-28 | SSE2 optimization for bilinear scaled 'src_8888_8888' | Siarhei Siamashka | 1 | -0/+112
A primitive naive implementation of bilinear scaling using SSE2 intrinsics, which only handles one pixel at a time. It is approximately 2x faster than the pixman general compositing path. Single pass processing without an intermediate temporary buffer contributes ~15% and loop unrolling contributes ~20% of this speedup.
Benchmark on Intel Core i7 (x86-64):
Using cairo-perf-trace:
  before: image firefox-planet-gnome 12.566 12.610 0.23% 6/6
  after:  image firefox-planet-gnome 10.961 11.013 0.19% 5/6
Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
  before: op=1, src=20028888, dst=20028888, speed=70.48 MPix/s
  after:  op=1, src=20028888, dst=20028888, speed=165.38 MPix/s
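For orientation, here is a scalar C sketch of what interpolating one a8r8g8b8 pixel involves. It is not the SSE2 code from this commit. The function names and the choice of weights in the range [0, 256] are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Interpolate one 8-bit channel. wx and wy are the horizontal and
 * vertical weights of the right and bottom samples, in [0, 256]. */
static uint32_t lerp_channel(uint32_t tl, uint32_t tr,
                             uint32_t bl, uint32_t br,
                             uint32_t wx, uint32_t wy)
{
    uint32_t top    = tl * (256 - wx) + tr * wx;   /* at most 255 * 256 */
    uint32_t bottom = bl * (256 - wx) + br * wx;

    return (top * (256 - wy) + bottom * wy) >> 16; /* back to 0..255 */
}

/* Blend the four neighbouring a8r8g8b8 pixels, channel by channel. */
static uint32_t bilinear_pixel(uint32_t tl, uint32_t tr,
                               uint32_t bl, uint32_t br,
                               uint32_t wx, uint32_t wy)
{
    uint32_t result = 0;
    int shift;

    assert(wx <= 256 && wy <= 256);

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t c = lerp_channel((tl >> shift) & 0xff, (tr >> shift) & 0xff,
                                  (bl >> shift) & 0xff, (br >> shift) & 0xff,
                                  wx, wy);
        result |= c << shift;
    }
    return result;
}
```

A SIMD version would typically process all four channels of a pixel in one register instead of looping over them.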
2011-02-28 | test: check correctness of 'bilinear_pad_repeat_get_scanline_bounds' | Siarhei Siamashka | 2 | -0/+95
Individual correctness check for the new supplementary function related to bilinear scaling. This test program uses a somewhat wider range of input arguments, which is not covered by other tests.
2011-02-28 | Main loop template for fast single pass bilinear scaling | Siarhei Siamashka | 1 | -0/+432
Can be used for implementing SIMD optimized fast path functions which work with bilinear scaled source images. Similar to the template for nearest scaling main loop, the following types of mask are supported:
  1. no mask
  2. non-scaled a8 mask with SAMPLES_COVER_CLIP flag
  3. solid mask
PAD repeat is fully supported. NONE repeat is partially supported (right now it only works if the source image has an alpha channel or when the alpha channel of the source image does not have any effect on the compositing operation).
2011-02-28 | test: Silence MSVC warnings | Andrea Canciani | 3 | -1/+3
MSVC does not notice non-returning functions (abort() / assert(0)) and warns about paths which end with them in non-void functions:
  c:\cygwin\home\ranma42\code\fdo\pixman\test\fetch-test.c(114) : warning C4715: 'reader' : not all control paths return a value
  c:\cygwin\home\ranma42\code\fdo\pixman\test\stress-test.c(133) : warning C4715: 'real_reader' : not all control paths return a value
  c:\cygwin\home\ranma42\code\fdo\pixman\test\composite.c(431) : warning C4715: 'calc_op' : not all control paths return a value
These warnings can be silenced by adding a return after the termination call.
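The pattern can be sketched as below. The function read_value() is a hypothetical stand-in and is not copied from the test sources; it only shows the shape of a non-void function whose last path ends in a termination call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reader: fetch a value of the given size in bytes. */
static uint32_t read_value(const void *src, int size)
{
    switch (size) {
    case 1:
        return *(const uint8_t *)src;
    case 2:
        return *(const uint16_t *)src;
    case 4:
        return *(const uint32_t *)src;
    default:
        assert(0);  /* unsupported size: terminate in debug builds */
        return 0;   /* never meant to be reached; silences C4715 */
    }
}
```

The extra return has no effect on the supported sizes. It only gives the compiler a value for the path it cannot prove to be dead.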
2011-02-28 | Do not include unused headers | Andrea Canciani | 2 | -3/+0
pixman-combine32.h is included without being used both in pixman-image.c and in pixman-general.c.
2011-02-28 | test: Add Makefile for Win32 | Andrea Canciani | 1 | -0/+73
2011-02-28 | test: Fix tests for compilation on Windows | Andrea Canciani | 3 | -54/+47
The Microsoft C compiler cannot handle subobject initialization and Win32 does not provide snprintf. Work around these limitations by using normal struct initialization and using sprintf (a manual check shows that the buffer size is sufficient).
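A small sketch of both workarounds, assuming that "subobject initialization" refers to C99 designated initializers for nested members. The types and function names are hypothetical, and the buffer-size reasoning assumes an int of at most 32 bits.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types for illustration. */
typedef struct { int x, y; } point_t;
typedef struct { point_t origin; int width, height; } box_t;

/* The C99 form  box_t b = { .origin.x = 1, .origin.y = 2, ... };
 * is replaced by plain positional initialization. */
static box_t make_box(void)
{
    box_t b = { { 1, 2 }, 10, 20 };
    return b;
}

/* sprintf() instead of snprintf(): "op=" (3 chars) plus at most
 * 11 chars for a 32-bit int plus the terminating NUL is 15 bytes,
 * so a 16-byte buffer is sufficient. */
static void format_op(char buf[16], int op)
{
    assert(sizeof(int) <= 4);
    sprintf(buf, "op=%d", op);
}
```

The manual size check in the comment plays the role of the bound that snprintf() would otherwise enforce.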
2011-02-28 | Fix compilation on Win32 | Andrea Canciani | 1 | -2/+4
Makefile.win32 contained a typo and was missing the dependency on the built sources.
2011-02-22 | Post-release version bump to 0.21.7 | Søren Sandmann Pedersen | 1 | -1/+1
2011-02-22 | Pre-release version bump to 0.21.6 | Søren Sandmann Pedersen | 1 | -1/+1
2011-02-22 | Minor fix to the RELEASING file | Søren Sandmann Pedersen | 1 | -2/+2
2011-02-22 | Delete pixman-x64-mmx-emulation.h from pixman/Makefile.am | Søren Sandmann Pedersen | 1 | -1/+1
2011-02-22 | Ensure that tests run as the last step of a build for 'make check' | Siarhei Siamashka | 1 | -1/+1
Previously 'make check' would compile and run tests first, and only then proceed to compiling demos. This was inconvenient because one had to scroll back through the console output to see the test verdict. Swapping the order of the SUBDIRS variable entries in Makefile.am resolves this.
2011-02-18 | sse2: Minor coding style cleanups. | Søren Sandmann Pedersen | 1 | -6/+12
Also make pixman_fill_sse2() static.
2011-02-18 | sse2: Remove pixman-x64-mmx-emulation.h | Søren Sandmann Pedersen | 2 | -273/+0
Also stop including mmintrin.h.
2011-02-18 | sse2: Delete obsolete or redundant comments | Søren Sandmann Pedersen | 1 | -137/+0
2011-02-18 | sse2: Remove all the core_combine_* functions | Søren Sandmann Pedersen | 1 | -356/+157
Now that _mm_empty() is not used anymore, they are no longer different from the sse2_combine_* functions, so they can be consolidated.
2011-02-18 | sse2: Don't compile pixman-sse2.c with -mmmx anymore | Søren Sandmann Pedersen | 1 | -1/+1
It's not necessary now that the file doesn't use MMX instructions.
2011-02-18 | sse2: Delete unused MMX functions and constants and all _mm_empty()s | Søren Sandmann Pedersen | 1 | -211/+0
These are not needed because the SSE2 implementation doesn't use MMX anymore.
2011-02-18 | sse2: Convert all uses of MMX registers to use SSE2 registers instead. | Søren Sandmann Pedersen | 1 | -348/+440
By avoiding use of MMX registers we won't need to call emms all over the place, which avoids various miscompilation issues.
2011-02-18 | Coding style: core_combine_in_u_pixelsse2 -> core_combine_in_u_pixel_sse2 | Søren Sandmann Pedersen | 1 | -5/+5
2011-02-18 | In pixman_image_set_transform() allow NULL for transform | Søren Sandmann Pedersen | 1 | -1/+1
Previously, this would crash unless the existing transform were also NULL.
2011-02-18 | Avoid marking images dirty when properties are reset | Søren Sandmann Pedersen | 1 | -0/+18
When an image property is set to the value it already has, there is no reason to mark the image dirty and incur a recomputation of the flags.
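The early-return pattern this describes might look like the sketch below. The image_t struct, its fields, and image_set_repeat() are hypothetical simplifications and are not pixman's actual image type or setters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified image type. */
typedef struct {
    int  repeat;  /* some property */
    bool dirty;   /* set when derived flags must be recomputed */
} image_t;

static void image_set_repeat(image_t *image, int repeat)
{
    assert(image != NULL);

    /* Same value as before: nothing changes, so leave the image clean. */
    if (image->repeat == repeat)
        return;

    image->repeat = repeat;
    image->dirty  = true;
}
```

Callers that reset the same property on every request then no longer trigger a recomputation each time.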
2011-02-18 | Add new public function pixman_add_triangles() | Søren Sandmann Pedersen | 2 | -17/+51
This allows some more code to be deleted from the X server. The implementation consists of converting to trapezoids, and is shared with pixman_composite_triangles().
2011-02-18 | Optimize adding opaque trapezoids onto a8 destination. | Søren Sandmann Pedersen | 1 | -57/+76
When the source is opaque and the destination is alpha only, we can avoid the temporary mask and just add the trapezoids directly.
2011-02-18 | Add a test program, tri-test | Søren Sandmann Pedersen | 2 | -2/+52
This program tests whether the new triangle support works.
2011-02-15 | Add support for triangles to pixman. | Søren Sandmann Pedersen | 2 | -0/+151
The Render X extension can draw triangles as well as trapezoids, but the implementation has always converted them to trapezoids. This patch moves the X server's triangle conversion code into pixman, where we can reuse the pixman_composite_trapezoids() code.
2011-02-15 | Add a test program for pixman_composite_trapezoids(). | Søren Sandmann Pedersen | 2 | -0/+255
A CRC32 based test program to check that pixman_composite_trapezoids() actually works.
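The general idea of a CRC32 based test is to render with pseudo-random inputs, checksum the resulting pixels, and compare against a known-good value. Below is a generic bitwise CRC-32 (reflected polynomial 0xEDB88320) as a sketch. It is not necessarily the routine used by pixman's test utilities, and the name crc32_update() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic bitwise CRC-32. Pass 0 as crc for the first chunk and the
 * previous return value for each following chunk. */
static uint32_t crc32_update(uint32_t crc, const void *data, size_t len)
{
    const uint8_t *p = data;
    size_t i;
    int bit;

    assert(data != NULL || len == 0);

    crc = ~crc;
    for (i = 0; i < len; i++) {
        crc ^= p[i];
        for (bit = 0; bit < 8; bit++) {
            /* Shift right; apply the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

Because the checksum can be fed chunk by chunk, a test can fold the result of every iteration into one running value and compare only the final number.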
2011-02-15 | Add pixman_composite_trapezoids(). | Søren Sandmann Pedersen | 2 | -1/+98
This function is an implementation of the X server request Trapezoids. That request is what the X backend of cairo is using all the time; by moving it into pixman we can hopefully make it faster.
2011-02-15 | test/Makefile.am: Move all the TEST_LDADD into a new global LDADD. | Søren Sandmann Pedersen | 1 | -34/+1
This gets rid of a bunch of replicated *_LDADD clauses.
2011-02-15 | Add @TESTPROGS_EXTRA_LDFLAGS@ to AM_LDFLAGS | Søren Sandmann Pedersen | 1 | -17/+1
Instead of explicitly adding it to each test program.
2011-02-15 | Move all the GTK+ based test programs to a new subdir, "demos" | Søren Sandmann Pedersen | 15 | -52/+38
This separates the test suite from the assorted GTK+ based test programs. "demos" is somewhat misleading because the programs there are not particularly exciting (with the possible exception of composite-test, which shows off all the compositing operators).
2011-02-15 | SSE2 optimization for nearest scaled over_8888_n_8888 | Siarhei Siamashka | 1 | -0/+118
This operation shows up a little bit in some of the html5 based games from http://www.kesiev.com/akihabara/
=== Cairo trace of the game intro animation for 'Legend of Sadness' ===
  before: [ 0] image firefox-legend-of-sadness 46.286 46.298 0.01% 5/6
  after:  [ 0] image firefox-legend-of-sadness 45.088 45.102 0.04% 6/6
=== Microbenchmark (scaling ~2000x~2000 -> ~2000x~2000) ===
  before:
    translucent: op=3, src=8888, mask=s dst=8888, speed=131.30 MPix/s
    transparent: op=3, src=8888, mask=s dst=8888, speed=132.38 MPix/s
    opaque:      op=3, src=8888, mask=s dst=8888, speed=167.90 MPix/s
  after:
    translucent: op=3, src=8888, mask=s dst=8888, speed=301.93 MPix/s
    transparent: op=3, src=8888, mask=s dst=8888, speed=770.70 MPix/s
    opaque:      op=3, src=8888, mask=s dst=8888, speed=301.80 MPix/s
2011-02-15 | ARM: NEON optimization for nearest scaled over_0565_8_0565 | Siarhei Siamashka | 2 | -0/+19
May be used in some cases for html5 video when hardware acceleration is not available.
2011-02-15 | ARM: NEON optimization for nearest scaled over_8888_8_0565 | Siarhei Siamashka | 2 | -0/+20
May be used in some cases for html5 video when hardware acceleration is not available.
2011-02-15 | ARM: new macro template for using scaled fast paths with a8 mask | Siarhei Siamashka | 1 | -0/+44
2011-02-15 | Better support for NONE repeat in nearest scaling main loop template | Siarhei Siamashka | 4 | -14/+25
The scaling function now gets an extra boolean argument, which is set to TRUE when we are fetching padding pixels for NONE repeat. This makes it possible to decide whether to interpret alpha as 0xFF or 0x00 for such pixels when working with formats which don't have an alpha channel (for example x8r8g8b8 and r5g6b5).
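A minimal sketch of that decision for x8r8g8b8, assuming the output is a8r8g8b8 and that padding pixels are treated as transparent black. The function name and signature are hypothetical and do not match the actual template.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen a scanline of x8r8g8b8 to a8r8g8b8.
 * is_padding is true when the pixels lie outside a NONE-repeat image. */
static void fetch_x888_scanline(uint32_t *dst, const uint32_t *src,
                                int width, bool is_padding)
{
    int i;

    assert(dst != NULL && width >= 0);

    for (i = 0; i < width; i++) {
        if (is_padding)
            dst[i] = 0;                      /* alpha 0x00: transparent */
        else
            dst[i] = src[i] | 0xff000000u;   /* alpha 0xFF: opaque */
    }
}
```

Without the flag, a format with no alpha channel gives the function no way to tell a real black pixel from padding.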
2011-02-15 | Support for a8 and solid mask in nearest scaling main loop template | Siarhei Siamashka | 1 | -15/+151
In addition to the most common case of not having any mask at all, two variants of scaling with mask show up in cairo traces:
  1. non-scaled a8 mask with SAMPLES_COVER_CLIP flag
  2. solid mask
This patch extends the nearest scaling main loop template to also support these cases.
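The three cases can be sketched in plain C as one scanline function that steps through the source in 16.16 fixed point while reading the mask unscaled. This is an illustration under stated assumptions and is not the template itself: the names are hypothetical, the operator is a plain copy of source times mask, and the caller is assumed to guarantee that every sampled source index is in bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { MASK_NONE, MASK_A8, MASK_SOLID } mask_kind_t; /* hypothetical */

/* Multiply each 8-bit channel by alpha / 255, with rounding. */
static uint32_t mul_by_alpha(uint32_t pixel, uint8_t alpha)
{
    uint32_t out = 0;
    int s;

    for (s = 0; s < 32; s += 8) {
        uint32_t t = ((pixel >> s) & 0xff) * alpha + 0x80;
        out |= (((t + (t >> 8)) >> 8) & 0xff) << s;
    }
    return out;
}

/* vx is the starting source x coordinate and unit_x the step per
 * destination pixel, both in 16.16 fixed point. */
static void scaled_nearest_scanline(uint32_t *dst, const uint32_t *src,
                                    const uint8_t *mask, mask_kind_t kind,
                                    int width, int32_t vx, int32_t unit_x)
{
    int i;

    assert(kind == MASK_NONE || mask != NULL);

    for (i = 0; i < width; i++) {
        uint32_t s = src[vx >> 16];   /* nearest source pixel */
        vx += unit_x;

        if (kind == MASK_A8)
            s = mul_by_alpha(s, mask[i]);   /* one mask byte per dst pixel */
        else if (kind == MASK_SOLID)
            s = mul_by_alpha(s, mask[0]);   /* same alpha everywhere */

        dst[i] = s;
    }
}
```

A real template would generate a separate loop per mask type rather than branching per pixel. The branch is kept here only to keep the sketch short.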
2011-02-15 | test: Extend scaling-test to support a8/solid mask and ADD operation | Siarhei Siamashka | 1 | -9/+127
Image width also has been increased because SIMD optimizations typically do more unrolling in the inner loops, and this needs to be tested.
2011-02-15 | Use const modifiers for source buffers in nearest scaling fast paths | Siarhei Siamashka | 3 | -19/+19
2011-02-10 | C fast paths for a simple 90/270 degrees rotation | Siarhei Siamashka | 1 | -0/+292
Depending on the CPU architecture, performance is 1.5 to 4 times slower than a simple nonrotated copy (which would be the ideal case, perfectly utilizing memory bandwidth), but still more than 7 times faster than the general path. This implementation sets a performance baseline for rotation. The use of SIMD instructions may further improve memory bandwidth utilization.
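For reference, the simplest possible 90 degree clockwise rotation of a 32 bpp buffer is sketched below. It is a naive illustration, not the fast path added by this commit, and it makes no attempt at the memory bandwidth considerations mentioned above. The function name and the tightly packed buffer layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a tightly packed src_w x src_h image 90 degrees clockwise.
 * dst must hold src_w * src_h pixels and is src_h wide, src_w tall. */
static void rotate_90(uint32_t *dst, const uint32_t *src,
                      int src_w, int src_h)
{
    int x, y;

    assert(dst != NULL && src != NULL && dst != src);

    for (y = 0; y < src_h; y++) {
        for (x = 0; x < src_w; x++) {
            /* Source column x becomes destination row x; source row y
             * becomes destination column (src_h - 1 - y). */
            dst[x * src_h + (src_h - 1 - y)] = src[y * src_w + x];
        }
    }
}
```

The source is read sequentially while the destination is written with a stride of src_h pixels, which is one reason a rotated copy is slower than a plain copy.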
2011-02-10 | New flags for 90/180/270 rotation | Siarhei Siamashka | 2 | -0/+20
These flags are set when the transform is a simple nonscaled 90/180/270 degrees rotation.
2011-02-10 | test: affine-test updated to stress 90/180/270 degrees rotation more | Siarhei Siamashka | 1 | -4/+30
2011-02-10 | Add pixman-conical-gradient.c to Makefile.win32. | Søren Sandmann Pedersen | 1 | -0/+1
Pointed out by Kirill Tishin.