~ds/pixman - Unnamed repository; edit this file to name it for gitweb.

Age	Commit message	Author	Files	Lines
2011-05-17	Fixes due to rebasing orc	David Schleef	4	-9/+10

2011-05-17	Update for splatw0q->splatw3q change	David Schleef	3	-31/+120

2011-05-17	fixes for rebase	David Schleef	1	-72/+72

2011-05-17	hacking	David Schleef	9	-29/+386

2011-05-17	update for current orc	David Schleef	2	-78/+302

2011-05-17	random stuff	David Schleef	2	-7/+45

2011-05-17	Fixes for changes in orc opcodes	David Schleef	1	-9/+31

2011-05-17	Add xor operator	David Schleef	3	-13/+56

2011-05-17	Add atop operator	David Schleef	3	-12/+53

2011-05-17	orc: Add out operator	David Schleef	3	-17/+48

2011-05-17	orc: test program to check orc generated code	David Schleef	2	-1/+173

2011-05-17	orc: fix over operator	David Schleef	1	-4/+2

2011-05-17	orc: Add orc backend	David Schleef	5	-0/+654

2011-05-17	orc: Check for Orc	David Schleef	1	-0/+32

2011-05-17	test: Fix compilation on win32	Andrea Canciani	1	-3/+1
	MSVC complains about uint32_t being used as an expression: composite.c(902) : error C2275: 'uint32_t' : illegal use of this type as an expression
2011-05-09	Check for working mmap()	Dave Yeo	2	-1/+6
	OS/2 doesn't have a working mmap().
2011-05-02	Post-release version bump to 0.23.1	Søren Sandmann Pedersen	1	-2/+2

2011-05-02	Pre-release version bump to 0.22.0	Søren Sandmann Pedersen	1	-2/+2

2011-04-19	Post-release version bump to 0.21.9	Søren Sandmann Pedersen	1	-1/+1

2011-04-19	Pre-release version bump to 0.21.8	Søren Sandmann Pedersen	1	-1/+1

2011-04-18	ARM: Enable bilinear fast paths using scanline functions in pixman-arm-neon-asm-bilinear.S	Taekyun Kim	1	-0/+39
	Enable the fast paths that are supported by the scanline functions in pixman-arm-neon-asm-bilinear.S.
2011-04-18	ARM: NEON scanline functions for bilinear scaling	Taekyun Kim	2	-0/+769
	General fetch->combine->store based bilinear scanline functions. They need further optimization and will eventually be replaced with optimal functions one by one. General functions should be located in pixman-arm-neon-asm-bilinear.S and optimal functions in pixman-arm-neon-asm.S.
	The following general bilinear scanline functions are implemented:
	  over_8888_8888
	  add_8888_8888
	  src_8888_8_8888
	  src_8888_8_0565
	  src_0565_8_x888
	  src_0565_8_0565
	  over_8888_8_8888
	  add_8888_8_8888
2011-04-18	ARM: Common macro for scaled bilinear scanline function with A8 mask	Taekyun Kim	1	-0/+45
	Defining PIXMAN_ARM_BIND_SCALED_BILINEAR_SRC_A8_DST macro for declaration of scaled bilinear scanline functions in common header.
2011-04-18	Offset rendering in pixman_composite_trapezoids() by (x_dst, y_dst)	Søren Sandmann Pedersen	3	-8/+19
	Previously, this function would do coordinate calculations in such a way that (x_dst, y_dst) would only affect the alignment of the source image, but not of the traps, which would always be considered to be in absolute destination coordinates. This is unlike the pixman_image_composite() function which also registers the mask to the destination. This patch makes it so that traps are also offset by (x_dst, y_dst). Also add a comment explaining how this function is supposed to operate, and update tri-test.c and composite-trap-test.c to deal with the new semantics.
2011-04-18	ARM: Add 'neon_composite_over_n_8888_0565_ca' fast path	Søren Sandmann Pedersen	2	-0/+173
	This improves the performance of the firefox-talos-gfx benchmark with the image16 backend. Benchmark on an 800 MHz ARM Cortex A8:
	Before:
	  [ # ]  backend  test               min(s)   median(s)  stddev.  count
	  [ 0]   image16  firefox-talos-gfx  121.773  122.218    0.15%    6/6
	After:
	  [ # ]  backend  test               min(s)   median(s)  stddev.  count
	  [ 0]   image16  firefox-talos-gfx  85.247   85.563     0.22%    6/6
	V2: Slightly better instruction scheduling based on comments from Taekyun Kim.
	V3: Eliminate all stalls from the inner loop. Also based on comments from Taekyun Kim.
2011-04-18	Fix OpenMP not supported case	Gilles Espinasse	1	-20/+27
	PIXMAN_LINK_WITH_ENV did not fail unless -Wall -Werror is used. So even when the compiler did not support OpenMP, USE_OPENMP was defined. Fix that by running the second OpenMP test only when the first AC_OPENMP test finds support.
	configure was tested in these cases:
	  gcc without libgomp support: no openmp option, --enable-openmp and --disable-openmp
	  gcc with libgomp support: no openmp option, --enable-openmp and --disable-openmp
	Not tested with autoconf versions that do not know about openmp (<2.62).
	Warn when --enable-openmp is requested but no support is found.
	Signed-off-by: Gilles Espinasse <g.esp@free.fr>
2011-04-18	Fix missing AC_MSG_RESULT value from Werror test	Gilles Espinasse	1	-1/+1
	Use the correct variable name Signed-off-by: Gilles Espinasse <g.esp@free.fr>
2011-04-11	ARM: pipelined NEON implementation of bilinear scaled 'src_8888_0565'	Siarhei Siamashka	1	-1/+244
	Benchmark on ARM Cortex-A8 r1p3 @600MHz, 32-bit LPDDR @166MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=20028888, dst=10020565, speed=33.59 MPix/s
	  after:  op=1, src=20028888, dst=10020565, speed=46.25 MPix/s
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=20028888, dst=10020565, speed=63.86 MPix/s
	  after:  op=1, src=20028888, dst=10020565, speed=84.22 MPix/s
2011-04-11	ARM: pipelined NEON implementation of bilinear scaled 'src_8888_8888'	Siarhei Siamashka	1	-0/+127
	Performance of the inner loop when working with the data in L1 cache:
	  ARM Cortex-A8: 41 cycles per 4 pixels (no stalls and partial dual issue)
	  ARM Cortex-A9: 48 cycles per 4 pixels (no stalls)
	It might still be possible to improve performance even more on ARM Cortex-A8 with a better use of dual issue.
	Benchmark on ARM Cortex-A8 r1p3 @600MHz, 32-bit LPDDR @166MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=20028888, dst=20028888, speed=40.38 MPix/s
	  after:  op=1, src=20028888, dst=20028888, speed=48.47 MPix/s
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=20028888, dst=20028888, speed=79.68 MPix/s
	  after:  op=1, src=20028888, dst=20028888, speed=93.11 MPix/s
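	For readers unfamiliar with what the bilinear 'src_8888_8888' inner loop computes per pixel, here is a minimal scalar sketch in C. It is not the NEON code from this commit: the helper names bilinear_channel and bilinear_8888 are hypothetical, and the 8-bit fractional weights (0..256) and truncating rounding are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interpolate one 8-bit channel between four
 * neighbouring samples (top-left, top-right, bottom-left, bottom-right).
 * wx and wy are assumed weights in 0..256 toward the right/bottom sample. */
static uint8_t bilinear_channel(uint8_t tl, uint8_t tr,
                                uint8_t bl, uint8_t br,
                                unsigned wx, unsigned wy)
{
    assert(wx <= 256 && wy <= 256);

    /* Horizontal pass: results carry 8 fractional bits. */
    unsigned top = tl * (256 - wx) + tr * wx;
    unsigned bot = bl * (256 - wx) + br * wx;

    /* Vertical pass: 16 fractional bits in total, dropped by the shift. */
    return (uint8_t)((top * (256 - wy) + bot * wy) >> 16);
}

/* Apply the channel interpolation to all four bytes of a 32-bit pixel. */
static uint32_t bilinear_8888(uint32_t tl, uint32_t tr,
                              uint32_t bl, uint32_t br,
                              unsigned wx, unsigned wy)
{
    uint32_t out = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint8_t c = bilinear_channel((uint8_t)(tl >> shift),
                                     (uint8_t)(tr >> shift),
                                     (uint8_t)(bl >> shift),
                                     (uint8_t)(br >> shift),
                                     wx, wy);
        out |= (uint32_t)c << shift;
    }
    return out;
}
```

	A scanline function would call something like bilinear_8888 once per destination pixel, stepping the horizontal weight as it goes; the NEON version processes several pixels per iteration, which is where the cycles-per-4-pixels figures come from.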
2011-04-11	ARM: support different levels of loop unrolling in bilinear scaler	Siarhei Siamashka	1	-8/+76
	Now an extra 'flag' parameter is supported in the bilinear scanline scaling function generation macro. It can be used to enable unrolling to 4 or 8 pixels per loop iteration and to provide save/restore code for the d8-d15 registers.
2011-04-11	ARM: use less ARM instructions in NEON bilinear scaling code	Siarhei Siamashka	1	-41/+38
	This reduces code size and also puts less pressure on the instruction decoder.
2011-04-11	ARM: support for software pipelining in bilinear macros	Siarhei Siamashka	1	-3/+28
	Now it's possible to override the main loop of bilinear scaling code with optimized pipelined implementation.
2011-04-11	ARM: use aligned memory writes in NEON bilinear scaling code	Siarhei Siamashka	1	-14/+35

2011-04-11	ARM: tweaked horizontal weights update in NEON bilinear scaling code	Siarhei Siamashka	1	-9/+11
	Moving the horizontal interpolation weights update instructions from the beginning of the loop to its end makes it possible to hide some pipeline stalls and improve performance.
2011-04-06	ARM: Tiny improvement in over_n_8888_8888_ca_process_pixblock_head	Søren Sandmann Pedersen	1	-3/+2
	Instead of two
	    mvn d24, d24
	    mvn d25, d25
	use just one
	    mvn q12, q12
	Also move another vmvn instruction into the created pipeline bubble, as pointed out by Siarhei.
2011-04-06	Makefile.am: Put development releases in "snapshots" directory	Søren Sandmann Pedersen	1	-4/+3
	Up until now, all pixman releases, both snapshots and releases, were uploaded to the "releases" directory on www.cairographics.org, but it's better to put development snapshots in the "snapshots" directory. This patch changes Makefile.am to do that.
2011-03-22	test: Fix infinite loop in composite	Søren Sandmann Pedersen	1	-4/+4
	When run in PIXMAN_RANDOMIZE_TESTS mode, this test would go into an infinite loop because the loop started at 'seed' but the stop condition was still N_TESTS.
2011-03-22	Add support for the r8g8b8a8 and r8g8b8x8 formats to the tests.	Alexandros Frantzis	4	-2/+28

2011-03-22	Add simple support for the r8g8b8a8 and r8g8b8x8 formats.	Alexandros Frantzis	3	-1/+108
	This format is particularly useful on big-endian architectures, where RGBA in memory/file order corresponds to r8g8b8a8 as an uint32_t. This is important because RGBA is in some cases the only available choice (for example as a pixel format in OpenGL ES 2.0).
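	The memory-order claim can be checked with a small C sketch. The helpers below (load_be32, r8g8b8a8_red, r8g8b8a8_alpha) are hypothetical and not part of the pixman API; load_be32 simply builds the value that a native 32-bit load would produce on a big-endian machine, so the example behaves the same on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not pixman API): assemble four bytes in memory
 * order into the 32-bit value a big-endian CPU would load natively. */
static uint32_t load_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Channel extraction for an r8g8b8a8 pixel held in a uint32_t:
 * red occupies bits 31-24 and alpha occupies bits 7-0. */
static uint8_t r8g8b8a8_red(uint32_t px)
{
    return (uint8_t)(px >> 24);
}

static uint8_t r8g8b8a8_alpha(uint32_t px)
{
    return (uint8_t)(px & 0xff);
}
```

	Given the bytes R, G, B, A in memory or file order, load_be32 yields a word whose top byte is red and whose bottom byte is alpha, which is exactly the r8g8b8a8 layout, so big-endian code can use RGBA data without swizzling.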
2011-03-19	test: Randomize some tests if PIXMAN_RANDOMIZE_TESTS is set	Søren Sandmann Pedersen	4	-16/+51
	This patch makes it so that composite and stress-test will start from a random seed if the PIXMAN_RANDOMIZE_TESTS environment variable is set. Running the test suite in this mode is useful to get more test coverage. Also, in stress-test.c make it so that setting the initial seed causes threads to be turned off. This makes it much easier to see when something fails.
2011-03-18	Simplify the prototype for iterator initializers.	Søren Sandmann Pedersen	9	-153/+61
	All of the information previously passed to the iterator initializers is now available in the iterator itself, so there is no need to pass it as arguments anymore.
2011-03-18	Fill out parts of iters in _pixman_implementation_{src,dest}_iter_init()	Søren Sandmann Pedersen	4	-11/+24
	This makes _pixman_implementation_{src,dest}_iter_init() responsible for filling parts of the information in the iterators. Specifically, the information passed as arguments is stored in the iterator. Also add a height field to pixman_iter_t().
2011-03-18	In delegate_{src,dest}_iter_init() call delegate directly.	Søren Sandmann Pedersen	2	-3/+3
	There is no reason to go through _pixman_implementation_{src,dest}_iter_init(), especially since _pixman_implementation_src_iter_init() is doing various other checks that only need to be done once. Also call delegate->src_iter_init() directly in pixman-sse2.c
2011-03-12	ARM: a bit faster NEON bilinear scaling for r5g6b5 source images	Siarhei Siamashka	1	-18/+100
	Instruction scheduling is improved in the code responsible for fetching r5g6b5 pixels and converting them to the intermediate x8r8g8b8 color format used in the interpolation part of the code. A lot of NEON stalls still remain, which can be resolved later by the use of pipelining.
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=10020565, dst=10020565, speed=32.29 MPix/s
	          op=1, src=10020565, dst=20020888, speed=36.82 MPix/s
	  after:  op=1, src=10020565, dst=10020565, speed=41.35 MPix/s
	          op=1, src=10020565, dst=20020888, speed=49.16 MPix/s
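	As an illustration of the r5g6b5 to x8r8g8b8 conversion step mentioned above, here is a scalar C sketch. The function name convert_0565_to_x888 is hypothetical, and expanding channels by bit replication is an assumption about the conversion, not something stated in this commit; the NEON code performs the equivalent work on several pixels at once.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar helper: expand one r5g6b5 pixel to x8r8g8b8.
 * Each channel is widened to 8 bits by replicating its top bits into
 * the new low bits, so full intensity (0x1f or 0x3f) maps to 0xff. */
static uint32_t convert_0565_to_x888(uint16_t s)
{
    uint32_t r = (s >> 11) & 0x1f;
    uint32_t g = (s >> 5) & 0x3f;
    uint32_t b = s & 0x1f;

    r = (r << 3) | (r >> 2);   /* 5 bits -> 8 bits */
    g = (g << 2) | (g >> 4);   /* 6 bits -> 8 bits */
    b = (b << 3) | (b >> 2);   /* 5 bits -> 8 bits */

    uint32_t out = (r << 16) | (g << 8) | b;
    assert((out >> 24) == 0);  /* the unused x8 byte stays zero */
    return out;
}
```

	In scalar form this is a handful of shifts and ORs per pixel, so in a vectorized loop the ordering of these operations relative to the loads is what determines how many stalls occur.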
2011-03-12	ARM: NEON optimization for bilinear scaled 'src_0565_0565'	Siarhei Siamashka	2	-0/+6
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=10020565, dst=10020565, speed=3.30 MPix/s
	  after:  op=1, src=10020565, dst=10020565, speed=32.29 MPix/s
2011-03-12	ARM: NEON optimization for bilinear scaled 'src_0565_x888'	Siarhei Siamashka	2	-0/+7
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=10020565, dst=20020888, speed=3.39 MPix/s
	  after:  op=1, src=10020565, dst=20020888, speed=36.82 MPix/s
2011-03-12	ARM: NEON optimization for bilinear scaled 'src_8888_0565'	Siarhei Siamashka	2	-0/+8
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=20028888, dst=10020565, speed=6.56 MPix/s
	  after:  op=1, src=20028888, dst=10020565, speed=61.65 MPix/s
2011-03-12	ARM: use common macro template for bilinear scaled 'src_8888_8888'	Siarhei Siamashka	1	-188/+3
	This is a cleanup for old and now duplicated code. The performance improvement mostly comes from the enabled use of software prefetch, but instruction scheduling is also slightly better.
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	  before: op=1, src=20028888, dst=20028888, speed=53.24 MPix/s
	  after:  op=1, src=20028888, dst=20028888, speed=74.36 MPix/s
2011-03-12	ARM: NEON: common macro template for bilinear scanline scalers	Siarhei Siamashka	2	-0/+239
	This allows generating bilinear scanline scaling functions that target various source and destination color formats. Right now the a8r8g8b8/x8r8g8b8 and r5g6b5 color formats are supported. More formats can be added if needed.
2011-03-12	ARM: new bilinear fast path template macro in 'pixman-arm-common.h'	Siarhei Siamashka	2	-41/+48
	It can be reused in different ARM NEON bilinear scaling fast path functions.