~podain/pixman - Private pixman repository

Age	Commit message	Author	Files	Lines
2011-04-11	ARM: tweaked horizontal weights update in NEON bilinear scaling code	Siarhei Siamashka	1	-9/+11
	Moving the horizontal interpolation weights update instructions from the beginning of the loop to its end hides some pipeline stalls and improves performance.
2011-04-06	ARM: Tiny improvement in over_n_8888_8888_ca_process_pixblock_head	Søren Sandmann Pedersen	1	-3/+2
	Instead of two instructions, 'mvn d24, d24' and 'mvn d25, d25', use just one: 'mvn q12, q12'. Also move another vmvn instruction into the created pipeline bubble, as pointed out by Siarhei.
2011-04-06	Makefile.am: Put development releases in "snapshots" directory	Søren Sandmann Pedersen	1	-4/+3
	Up until now, all pixman releases, both snapshots and stable releases, were uploaded to the "releases" directory on www.cairographics.org, but it's better to put development snapshots in the "snapshots" directory. This patch changes Makefile.am to do that.
2011-03-22	test: Fix infinite loop in composite	Søren Sandmann Pedersen	1	-4/+4
	When run in PIXMAN_RANDOMIZE_TESTS mode, this test would go into an infinite loop because the loop started at 'seed' but the stop condition was still N_TESTS.
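One way to keep such a loop finite for any starting seed is to bound it by a test count rather than by the seed value. The following is only a sketch under that assumption, not the code from composite.c; `run_one_test`, `run_tests_from` and the value of `N_TESTS` are hypothetical.

```c
#include <stdint.h>

#define N_TESTS 8 /* hypothetical count; the real test uses its own value */

/* Hypothetical stand-in for a single test case; it always passes here. */
static int run_one_test(uint32_t test_seed)
{
    (void)test_seed;
    return 1;
}

/* Runs N_TESTS consecutive tests starting at 'seed' and returns how many
 * passed before the first failure. The loop counts iterations instead of
 * comparing the seed itself against N_TESTS, so it terminates for any
 * starting seed. Unsigned wrap-around of 'seed + i' is well defined. */
static int run_tests_from(uint32_t seed)
{
    int passed = 0;

    for (uint32_t i = 0; i < N_TESTS; ++i) {
        if (!run_one_test(seed + i))
            break;
        ++passed;
    }
    return passed;
}
```

With this shape, `run_tests_from(1)` and `run_tests_from(4000000000u)` both execute the same number of iterations.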
2011-03-22	Add support for the r8g8b8a8 and r8g8b8x8 formats to the tests.	Alexandros Frantzis	4	-2/+28

2011-03-22	Add simple support for the r8g8b8a8 and r8g8b8x8 formats.	Alexandros Frantzis	3	-1/+108
	These formats are particularly useful on big-endian architectures, where RGBA in memory/file order corresponds to r8g8b8a8 as a uint32_t. This is important because RGBA is in some cases the only available choice (for example as a pixel format in OpenGL ES 2.0).
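The byte-order claim can be checked with a small sketch. It assumes the usual pixman naming convention in which channels are listed from the most significant bits to the least significant bits; `pack_r8g8b8a8` and `load_be32` are illustrative helpers, not pixman functions.

```c
#include <stdint.h>

/* Packs four channels in the r8g8b8a8 layout: R occupies the most
 * significant byte of the 32-bit value and A the least significant one. */
static uint32_t pack_r8g8b8a8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8) | (uint32_t)a;
}

/* Interprets four consecutive bytes as a big-endian 32-bit value, which is
 * what a big-endian CPU obtains when it loads those bytes as one uint32_t. */
static uint32_t load_be32(const uint8_t bytes[4])
{
    return ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8) | (uint32_t)bytes[3];
}
```

For bytes stored in R, G, B, A order, `load_be32` yields the same value as `pack_r8g8b8a8` called with those four channels.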
2011-03-19	test: Randomize some tests if PIXMAN_RANDOMIZE_TESTS is set	Søren Sandmann Pedersen	4	-16/+51
	This patch makes composite and stress-test start from a random seed if the PIXMAN_RANDOMIZE_TESTS environment variable is set. Running the test suite in this mode is useful to get more test coverage. Also, in stress-test.c, setting the initial seed now causes threads to be turned off. This makes it much easier to see when something fails.
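A minimal sketch of how a test might choose its starting seed from the environment follows. Only the variable name PIXMAN_RANDOMIZE_TESTS comes from the commit message; the helper names and the use of time() as the random source are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Pure helper: 'env_value' is the result of getenv() (NULL when the
 * variable is unset). Returns 'random_seed' when the variable is set and
 * 'default_seed' otherwise. */
static uint32_t seed_from_env_value(const char *env_value,
                                    uint32_t default_seed,
                                    uint32_t random_seed)
{
    return env_value != NULL ? random_seed : default_seed;
}

/* Wrapper that consults the real environment. The current time is used as
 * a simple, assumed source of randomness. */
static uint32_t choose_initial_seed(uint32_t default_seed)
{
    return seed_from_env_value(getenv("PIXMAN_RANDOMIZE_TESTS"),
                               default_seed,
                               (uint32_t)time(NULL));
}
```

Keeping the decision in a pure helper makes the behaviour easy to check without modifying the process environment.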
2011-03-18	Simplify the prototype for iterator initializers.	Søren Sandmann Pedersen	9	-153/+61
	All of the information previously passed to the iterator initializers is now available in the iterator itself, so there is no need to pass it as arguments anymore.
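The idea of an initializer that needs only the iterator can be sketched as follows. All types and names here (`sketch_iter_t`, `backend_src_iter_init`, `src_iter_init`) are hypothetical and much simpler than pixman's real iterator code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct sketch_iter sketch_iter_t;

struct sketch_iter {
    const uint32_t *bits;  /* source pixels, row-major */
    int stride;            /* pixels per row */
    int x, y, width, height;
    uint32_t *buffer;      /* scanline buffer supplied by the caller */
    const uint32_t *(*get_scanline)(sketch_iter_t *iter);
};

/* Copies the current row into the buffer and advances to the next row. */
static const uint32_t *fetch_scanline(sketch_iter_t *iter)
{
    const uint32_t *row = iter->bits + (size_t)iter->y * (size_t)iter->stride + iter->x;

    for (int i = 0; i < iter->width; ++i)
        iter->buffer[i] = row[i];
    iter->y++;
    return iter->buffer;
}

/* Backend initializer with the simplified prototype: everything it needs
 * is already stored in the iterator. */
static void backend_src_iter_init(sketch_iter_t *iter)
{
    iter->get_scanline = fetch_scanline;
}

/* Generic entry point: stores the arguments in the iterator once, then
 * hands the iterator to the backend. */
static void src_iter_init(sketch_iter_t *iter, const uint32_t *bits, int stride,
                          int x, int y, int width, int height, uint32_t *buffer)
{
    assert(iter != NULL && bits != NULL && buffer != NULL);
    iter->bits = bits;
    iter->stride = stride;
    iter->x = x;
    iter->y = y;
    iter->width = width;
    iter->height = height;
    iter->buffer = buffer;
    backend_src_iter_init(iter);
}
```

Because the generic entry point records the arguments, adding a new field later (such as a height) does not change any backend prototype.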
2011-03-18	Fill out parts of iters in _pixman_implementation_{src,dest}_iter_init()	Søren Sandmann Pedersen	4	-11/+24
	This makes _pixman_implementation_{src,dest}_iter_init() responsible for filling parts of the information in the iterators. Specifically, the information passed as arguments is stored in the iterator. Also add a height field to pixman_iter_t.
2011-03-18	In delegate_{src,dest}_iter_init() call delegate directly.	Søren Sandmann Pedersen	2	-3/+3
	There is no reason to go through _pixman_implementation_{src,dest}_iter_init(), especially since _pixman_implementation_src_iter_init() is doing various other checks that only need to be done once. Also call delegate->src_iter_init() directly in pixman-sse2.c
2011-03-12	ARM: a bit faster NEON bilinear scaling for r5g6b5 source images	Siarhei Siamashka	1	-18/+100
	Instruction scheduling is improved in the code responsible for fetching r5g6b5 pixels and converting them to the intermediate x8r8g8b8 color format used in the interpolation part of the code. A lot of NEON stalls still remain, which can be resolved later by the use of pipelining.
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=10020565, dst=10020565, speed=32.29 MPix/s
	        op=1, src=10020565, dst=20020888, speed=36.82 MPix/s
	after:  op=1, src=10020565, dst=10020565, speed=41.35 MPix/s
	        op=1, src=10020565, dst=20020888, speed=49.16 MPix/s
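For reference, a scalar version of the r5g6b5 to x8r8g8b8 expansion might look like the sketch below. It uses the common bit-replication technique and leaves the unused x byte at zero; both choices are assumptions, and the NEON code in the commit may differ in detail. `convert_0565_to_x888` is an illustrative name.

```c
#include <stdint.h>

/* Expands one r5g6b5 pixel to x8r8g8b8. Each channel is widened by
 * replicating its top bits into the newly created low bits, so that 0
 * maps to 0 and the channel maximum maps to 255. */
static uint32_t convert_0565_to_x888(uint16_t pixel)
{
    uint32_t r5 = (pixel >> 11) & 0x1f;
    uint32_t g6 = (pixel >> 5) & 0x3f;
    uint32_t b5 = pixel & 0x1f;

    uint32_t r8 = (r5 << 3) | (r5 >> 2);
    uint32_t g8 = (g6 << 2) | (g6 >> 4);
    uint32_t b8 = (b5 << 3) | (b5 >> 2);

    return (r8 << 16) | (g8 << 8) | b8;
}
```

Pure white (0xFFFF) expands to 0x00FFFFFF and pure red (0xF800) to 0x00FF0000.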
2011-03-12	ARM: NEON optimization for bilinear scaled 'src_0565_0565'	Siarhei Siamashka	2	-0/+6
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=10020565, dst=10020565, speed=3.30 MPix/s
	after:  op=1, src=10020565, dst=10020565, speed=32.29 MPix/s
2011-03-12	ARM: NEON optimization for bilinear scaled 'src_0565_x888'	Siarhei Siamashka	2	-0/+7
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=10020565, dst=20020888, speed=3.39 MPix/s
	after:  op=1, src=10020565, dst=20020888, speed=36.82 MPix/s
2011-03-12	ARM: NEON optimization for bilinear scaled 'src_8888_0565'	Siarhei Siamashka	2	-0/+8
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=10020565, speed=6.56 MPix/s
	after:  op=1, src=20028888, dst=10020565, speed=61.65 MPix/s
2011-03-12	ARM: use common macro template for bilinear scaled 'src_8888_8888'	Siarhei Siamashka	1	-188/+3
	This is a cleanup of old and now duplicated code. The performance improvement comes mostly from the enabled use of software prefetch, but instruction scheduling is also slightly better.
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=20028888, speed=53.24 MPix/s
	after:  op=1, src=20028888, dst=20028888, speed=74.36 MPix/s
2011-03-12	ARM: NEON: common macro template for bilinear scanline scalers	Siarhei Siamashka	2	-0/+239
	This makes it possible to generate bilinear scanline scaling functions targeting various source and destination color formats. Right now the a8r8g8b8/x8r8g8b8 and r5g6b5 color formats are supported. More formats can be added if needed.
2011-03-12	ARM: new bilinear fast path template macro in 'pixman-arm-common.h'	Siarhei Siamashka	2	-41/+48
	It can be reused in different ARM NEON bilinear scaling fast path functions.
2011-03-12	ARM: assembly optimized nearest scaled 'src_8888_8888'	Siarhei Siamashka	2	-0/+12
	Benchmark on ARM Cortex-A8 r1p3 @500MHz, 32-bit LPDDR @166MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=20028888, speed=44.36 MPix/s
	after:  op=1, src=20028888, dst=20028888, speed=39.79 MPix/s
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=20028888, speed=102.36 MPix/s
	after:  op=1, src=20028888, dst=20028888, speed=163.12 MPix/s
2011-03-12	ARM: common macro for nearest scaling fast paths	Siarhei Siamashka	1	-24/+36
	The code of the nearest scaled 'src_0565_0565' function was generalized and moved to a common macro, so that it can be reused for other fast paths.
2011-03-12	ARM: use prefetch in nearest scaled 'src_0565_0565'	Siarhei Siamashka	1	-2/+25
	Benchmark on ARM Cortex-A8 r1p3 @500MHz, 32-bit LPDDR @166MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=10020565, dst=10020565, speed=75.02 MPix/s
	after:  op=1, src=10020565, dst=10020565, speed=73.63 MPix/s
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=10020565, dst=10020565, speed=176.12 MPix/s
	after:  op=1, src=10020565, dst=10020565, speed=267.50 MPix/s
2011-03-07	test: Do endian swapping of the source and destination images.	Søren Sandmann Pedersen	1	-0/+4
	Otherwise the test fails on big endian. Fix for bug 34767, reported by Siarhei Siamashka.
2011-03-07	test: In image_endian_swap() use pixman_image_get_format() to get the bpp.	Søren Sandmann Pedersen	6	-12/+17
	There is no reason to pass in the bpp as an argument; it can be gotten directly from the image.
2011-02-28	ARM: NEON optimization for bilinear scaled 'src_8888_8888'	Siarhei Siamashka	2	-0/+242
	Initial NEON optimization for bilinear scaling. It can probably be improved further.
	Benchmark on ARM Cortex-A8:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=20028888, speed=6.70 MPix/s
	after:  op=1, src=20028888, dst=20028888, speed=44.27 MPix/s
2011-02-28	SSE2 optimization for bilinear scaled 'src_8888_8888'	Siarhei Siamashka	1	-0/+112
	A primitive, naive implementation of bilinear scaling using SSE2 intrinsics, which only handles one pixel at a time. It is approximately 2x faster than the pixman general compositing path. Single pass processing without an intermediate temporary buffer contributes ~15% of this speedup and loop unrolling contributes ~20%.
	Benchmark on Intel Core i7 (x86-64):
	Using cairo-perf-trace:
	before: image firefox-planet-gnome 12.566 12.610 0.23% 6/6
	after:  image firefox-planet-gnome 10.961 11.013 0.19% 5/6
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=20028888, speed=70.48 MPix/s
	after:  op=1, src=20028888, dst=20028888, speed=165.38 MPix/s
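What "one pixel at a time" bilinear interpolation computes can be shown with a scalar sketch. It assumes 8-bit fractional weights and four 8-bit channels packed in a 32-bit pixel; the function name and the weight precision are assumptions for illustration, and this is not the SSE2 code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Blends the four neighbouring pixels (top-left, top-right, bottom-left,
 * bottom-right) of a sample position. 'distx' and 'disty' are the
 * fractional offsets from the top-left pixel in 1/256 units, in the range
 * [0, 256). The four weights always sum to 65536, so the per-channel sum is
 * shifted right by 16 to return to 8 bits. */
static uint32_t bilinear_interpolate(uint32_t tl, uint32_t tr,
                                     uint32_t bl, uint32_t br,
                                     int distx, int disty)
{
    assert(distx >= 0 && distx < 256);
    assert(disty >= 0 && disty < 256);

    uint32_t w_tl = (uint32_t)((256 - distx) * (256 - disty));
    uint32_t w_tr = (uint32_t)(distx * (256 - disty));
    uint32_t w_bl = (uint32_t)((256 - distx) * disty);
    uint32_t w_br = (uint32_t)(distx * disty);
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t sum = ((tl >> shift) & 0xff) * w_tl +
                       ((tr >> shift) & 0xff) * w_tr +
                       ((bl >> shift) & 0xff) * w_bl +
                       ((br >> shift) & 0xff) * w_br;
        result |= ((sum >> 16) & 0xff) << shift;
    }
    return result;
}
```

With both offsets at zero the result is exactly the top-left pixel, and four identical inputs always produce that same pixel.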
2011-02-28	test: check correctness of 'bilinear_pad_repeat_get_scanline_bounds'	Siarhei Siamashka	2	-0/+95
	Individual correctness check for the new bilinear scaling related supplementary function. This test program uses a bit wider range of input arguments, not covered by other tests.
2011-02-28	Main loop template for fast single pass bilinear scaling	Siarhei Siamashka	1	-0/+432
	Can be used for implementing SIMD optimized fast path functions which work with bilinear scaled source images.
	Similar to the template for nearest scaling main loop, the following types of mask are supported:
	1. no mask
	2. non-scaled a8 mask with SAMPLES_COVER_CLIP flag
	3. solid mask
	PAD repeat is fully supported. NONE repeat is partially supported (right now only works if source image has alpha channel or when alpha channel of the source image does not have any effect on the compositing operation).
2011-02-28	test: Silence MSVC warnings	Andrea Canciani	3	-1/+3
	MSVC does not notice non-returning functions (abort() / assert(0)) and warns about paths which end with them in non-void functions:
	c:\cygwin\home\ranma42\code\fdo\pixman\test\fetch-test.c(114) : warning C4715: 'reader' : not all control paths return a value
	c:\cygwin\home\ranma42\code\fdo\pixman\test\stress-test.c(133) : warning C4715: 'real_reader' : not all control paths return a value
	c:\cygwin\home\ranma42\code\fdo\pixman\test\composite.c(431) : warning C4715: 'calc_op' : not all control paths return a value
	These warnings can be silenced by adding a return after the termination call.
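The pattern of adding a return after the termination call can be illustrated as below. `read_sized` is a hypothetical function written for this example, not the 'reader' function from fetch-test.c.

```c
#include <stdint.h>
#include <stdlib.h>

/* Reads a 1-, 2- or 4-byte unsigned value from 'src', which must point to
 * an object of the matching type. Any other size is a programming error. */
static uint32_t read_sized(const void *src, int size)
{
    switch (size) {
    case 1:
        return *(const uint8_t *)src;
    case 2:
        return *(const uint16_t *)src;
    case 4:
        return *(const uint32_t *)src;
    default:
        abort();
        /* Never reached, but it gives the compiler a returning path and so
         * avoids a "not all control paths return a value" warning. */
        return 0;
    }
}
```

The extra return has no effect at run time because abort() does not return.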
2011-02-28	Do not include unused headers	Andrea Canciani	2	-3/+0
	pixman-combine32.h is included without being used both in pixman-image.c and in pixman-general.c.
2011-02-28	test: Add Makefile for Win32	Andrea Canciani	1	-0/+73

2011-02-28	test: Fix tests for compilation on Windows	Andrea Canciani	3	-54/+47
	The Microsoft C compiler cannot handle subobject initialization and Win32 does not provide snprintf. Work around these limitations by using normal struct initialization and using sprintf (a manual check shows that the buffer size is sufficient).
2011-02-28	Fix compilation on Win32	Andrea Canciani	1	-2/+4
	Makefile.win32 contained a typo and was missing the dependency on the built sources.
2011-02-22	Post-release version bump to 0.21.7	Søren Sandmann Pedersen	1	-1/+1

2011-02-22	Pre-release version bump to 0.21.6 (tag: pixman-0.21.6)	Søren Sandmann Pedersen	1	-1/+1

2011-02-22	Minor fix to the RELEASING file	Søren Sandmann Pedersen	1	-2/+2

2011-02-22	Delete pixman-x64-mmx-emulation.h from pixman/Makefile.am	Søren Sandmann Pedersen	1	-1/+1

2011-02-22	Ensure that tests run as the last step of a build for 'make check'	Siarhei Siamashka	1	-1/+1
	Previously 'make check' would compile and run the tests first, and only then proceed to compiling the demos. This is not very convenient because one has to scroll back through the console output to see the test verdict. Swapping the order of the SUBDIRS variable entries in Makefile.am resolves this.
2011-02-18	sse2: Minor coding style cleanups.	Søren Sandmann Pedersen	1	-6/+12
	Also make pixman_fill_sse2() static.
2011-02-18	sse2: Remove pixman-x64-mmx-emulation.h	Søren Sandmann Pedersen	2	-273/+0
	Also stop including mmintrin.h
2011-02-18	sse2: Delete obsolete or redundant comments	Søren Sandmann Pedersen	1	-137/+0

2011-02-18	sse2: Remove all the core_combine_* functions	Søren Sandmann Pedersen	1	-356/+157
	Now that _mm_empty() is not used anymore, they are no longer different from the sse2_combine_* functions, so they can be consolidated.
2011-02-18	sse2: Don't compile pixman-sse2.c with -mmmx anymore	Søren Sandmann Pedersen	1	-1/+1
	It's not necessary now that the file doesn't use MMX instructions.
2011-02-18	sse2: Delete unused MMX functions and constants and all _mm_empty()s	Søren Sandmann Pedersen	1	-211/+0
	These are not needed because the SSE2 implementation doesn't use MMX anymore.
2011-02-18	sse2: Convert all uses of MMX registers to use SSE2 registers instead.	Søren Sandmann Pedersen	1	-348/+440
	By avoiding use of MMX registers we won't need to call emms all over the place, which avoids various miscompilation issues.
2011-02-18	Coding style: core_combine_in_u_pixelsse2 -> core_combine_in_u_pixel_sse2	Søren Sandmann Pedersen	1	-5/+5

2011-02-18	In pixman_image_set_transform() allow NULL for transform	Søren Sandmann Pedersen	1	-1/+1
	Previously, this would crash unless the existing transform were also NULL.
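One way a setter can accept NULL is to test for it before the argument is ever dereferenced. The sketch below is built on that assumption; the types and the function are hypothetical, plain integers stand in for fixed-point matrix entries, and treating NULL like the identity is an illustrative choice rather than a description of pixman_image_set_transform().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int m[3][3]; /* plain integers stand in for fixed-point entries */
} sketch_transform_t;

typedef struct {
    int has_transform;
    sketch_transform_t transform;
} sketch_xform_image_t;

/* Stores 'transform' in the image. NULL and the identity matrix both mean
 * "no transform". The NULL test comes first, so memcmp() is never called
 * with a null pointer. */
static void sketch_set_transform(sketch_xform_image_t *image,
                                 const sketch_transform_t *transform)
{
    static const sketch_transform_t identity = {
        { { 1, 0, 0 }, { 0, 1, 0 }, { 0, 0, 1 } }
    };

    assert(image != NULL);

    if (transform == NULL ||
        memcmp(transform, &identity, sizeof identity) == 0) {
        image->has_transform = 0;
        return;
    }
    image->transform = *transform;
    image->has_transform = 1;
}
```

Calling the setter with NULL after a real transform has been stored simply clears it.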
2011-02-18	Avoid marking images dirty when properties are reset	Søren Sandmann Pedersen	1	-0/+18
	When an image property is set to the same value that it already is, there is no reason to mark the image dirty and incur a recomputation of the flags.
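The early-return pattern described here can be sketched with a hypothetical image type that has a single property and a dirty flag. The names are illustrative and do not match pixman's internal structures.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int repeat; /* hypothetical property */
    int dirty;  /* set when derived flags must be recomputed */
} sketch_image_t;

/* Sets the property and marks the image dirty only when the value actually
 * changes, so redundant calls do not trigger a recomputation. */
static void sketch_image_set_repeat(sketch_image_t *image, int repeat)
{
    assert(image != NULL);

    if (image->repeat == repeat)
        return;

    image->repeat = repeat;
    image->dirty = 1;
}
```

Setting the property to its current value leaves the dirty flag untouched, while a genuine change sets it.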
2011-02-18	Add new public function pixman_add_triangles()	Søren Sandmann Pedersen	2	-17/+51
	This allows some more code to be deleted from the X server. The implementation consists of converting to trapezoids, and is shared with pixman_composite_triangles().
2011-02-18	Optimize adding opaque trapezoids onto a8 destination.	Søren Sandmann Pedersen	1	-57/+76
	When the source is opaque and the destination is alpha only, we can avoid the temporary mask and just add the trapezoids directly.
2011-02-18	Add a test program, tri-test	Søren Sandmann Pedersen	2	-2/+52
	This program tests whether the new triangle support works.
2011-02-15	Add support for triangles to pixman.	Søren Sandmann Pedersen	2	-0/+151
	The Render X extension can draw triangles as well as trapezoids, but the implementation has always converted them to trapezoids. This patch moves the X server's triangle conversion code into pixman, where we can reuse the pixman_composite_trapezoid() code.