~podain/pixman - Private pixman repository

Age	Commit message	Author	Files	Lines
2011-07-12	composite_bench: Beautify output message and some refinement [test_and_perf]	Taekyun Kim	1	-11/+11

2011-07-06	composite_bench: Silence warnings	Taekyun Kim	1	-3/+3

2011-07-05	composite_bench: Better handling of benchmark combinations	Taekyun Kim	1	-170/+103
	Just comment out unnecessary cases from the array
2011-07-05	Benchmark for various kinds of image composition	Taekyun Kim	2	-1/+370

2011-07-04	Makefile.am: Add pixman@lists.freedesktop.org to RELEASE_ANNOUNCE_LIST	Søren Sandmann Pedersen	1	-1/+1

2011-07-04	Post-release version bump to 0.23.3	Søren Sandmann Pedersen	1	-1/+1

2011-07-04	Pre-release version bump to 0.23.2	Søren Sandmann Pedersen	1	-1/+1

2011-06-28	Bilinear REPEAT_NORMAL source line extension for too short src_width	Taekyun Kim	1	-3/+47
	To avoid function call and other calculation overhead, extend the source scanline into a temporary buffer when the source width is too small. The temporary buffer is accessed repeatedly, so the extension cost is very small due to cache effects.
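The extension step described above can be sketched roughly as follows. This is a minimal illustration, not pixman's code: the function name `extend_scanline` and its parameters are hypothetical, and pixels are assumed to be 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of pixman): tile a short source scanline
 * of src_width pixels into a temporary buffer of ext_width pixels, so the
 * inner loop can read a long run without wrapping on every pixel.
 * Assumes src_width > 0. */
void
extend_scanline (const uint32_t *src, size_t src_width,
                 uint32_t *ext, size_t ext_width)
{
    size_t i;

    for (i = 0; i < ext_width; i++)
        ext[i] = src[i % src_width];
}
```

The buffer is filled once per scanline and then read many times, which is where the cache benefit mentioned in the commit message would come from.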
2011-06-28	Enable REPEAT_NORMAL bilinear fast path entries	Taekyun Kim	1	-3/+39

2011-06-28	ARM: Add REPEAT_NORMAL functions to bilinear BIND macros	Taekyun Kim	1	-1/+10
	Now that the bilinear template supports REPEAT_NORMAL, functions for it are added to the PIXMAN_ARM_BIND_SCALED_BILINEAR_ macros. Fast path entries are not enabled yet.
2011-06-28	sse2: Declare bilinear src_8888_8888 REPEAT_NORMAL composite function	Taekyun Kim	1	-0/+5
	Now that the bilinear template supports REPEAT_NORMAL, declare composite functions using it. The function is only declared, not used yet.
2011-06-28	REPEAT_NORMAL support for bilinear fast path template	Taekyun Kim	1	-0/+90
	The basic idea is to break down normal repeat into a set of non-repeat scanline compositions and stitch them together. Bilinear may interpolate between the last and first pixels of the source scanline. In this case, we can use a temporary wrap-around buffer.
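A simplified sketch of the "break down and stitch" idea, under loud assumptions: integer source coordinates, no scaling, and no bilinear interpolation across the wrap point (the case the commit handles with a wrap-around buffer). The type `segment_t` and the function `split_repeat_normal` are hypothetical names, not pixman identifiers.

```c
#include <stddef.h>

/* Hypothetical description of one non-repeating piece of a scanline. */
typedef struct
{
    size_t dst_off;  /* offset of the piece in the destination run */
    size_t src_x;    /* first source column of the piece           */
    size_t len;      /* number of pixels in the piece              */
} segment_t;

/* Split a destination run of `width` pixels, whose first pixel maps to
 * source column src_x (which may be negative or >= src_width), into
 * pieces that never cross the right edge of the source.
 * Assumes src_width > 0. Returns the number of pieces written. */
size_t
split_repeat_normal (size_t src_width, long src_x, size_t width,
                     segment_t *out, size_t max_out)
{
    long   w = (long) src_width;
    size_t x = (size_t) (((src_x % w) + w) % w);  /* wrap into [0, w) */
    size_t done = 0, n = 0;

    while (done < width && n < max_out)
    {
        size_t len = src_width - x;

        if (len > width - done)
            len = width - done;

        out[n].dst_off = done;
        out[n].src_x = x;
        out[n].len = len;
        n++;

        done += len;
        x = 0;  /* every later piece starts at the left edge */
    }

    return n;
}
```

Each piece can then be handed to an ordinary non-repeat scanline function, which is what keeps the repeat logic out of the inner loop.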
2011-06-28	Replace boolean arguments with flags for bilinear fast path template	Taekyun Kim	3	-30/+49
	By replacing boolean arguments with flags, the code becomes more readable and the flags can be extended to do more later. Currently the following flags are defined:
	FLAG_NONE - No flags are turned on.
	FLAG_HAVE_SOLID_MASK - Template will generate solid mask composite functions.
	FLAG_HAVE_NON_SOLID_MASK - Template will generate bits mask composite functions.
	FLAG_HAVE_SOLID_MASK and FLAG_HAVE_NON_SOLID_MASK should be mutually exclusive.
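As an illustration of the flag scheme, the sketch below uses the three flag names from the commit message; the numeric values and the helper `mask_flags_valid` are assumptions made for this example and are not taken from the pixman sources.

```c
#include <stdbool.h>

/* Flag names come from the commit message; the values are assumed. */
enum
{
    FLAG_NONE                = 0,
    FLAG_HAVE_SOLID_MASK     = 1 << 0,
    FLAG_HAVE_NON_SOLID_MASK = 1 << 1
};

/* Hypothetical check: the two mask flags must not be set together. */
bool
mask_flags_valid (int flags)
{
    int both = FLAG_HAVE_SOLID_MASK | FLAG_HAVE_NON_SOLID_MASK;

    return (flags & both) != both;
}
```

Compared with a pair of boolean parameters, a call site written with named flags documents itself, and new bits can be added without changing every signature.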
2011-06-25	test: Make fuzzer-find-diff.pl executable	Søren Sandmann	1	-0/+0

2011-06-25	ARM: Fix two bugs in neon_composite_over_n_8888_0565_ca().	Søren Sandmann	1	-4/+4
	The first bug is that a vmull.u8 instruction would store its result in the q1 register, clobbering the d2 register used later on. The second is that a vraddhn instruction would overwrite d25, corrupting the q12 register used later. Fixing the second bug caused a pipeline bubble where the d18 register would be unavailable for a clock cycle. This is fixed by swapping the instruction with its successor.
2011-06-25	blitters-test: Make common formats more likely to be tested.	Søren Sandmann Pedersen	1	-8/+14
	Move the eight most common formats to the top of the list of image formats and make create_random_image() much more likely to select one of those eight formats. This should help catch more bugs in SIMD optimized operations.
2011-06-23	Silence autoconf warnings	Andrea Canciani	1	-20/+20
	Autoconf 2.68 reports:
	warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
	Every code fragment must be wrapped in [AC_LANG_SOURCE([...])]
2011-06-20	Replace arguments to composite functions with a pointer to a struct	Søren Sandmann Pedersen	11	-1041/+278
	This allows more information, such as flags or the composite region, to be passed to the composite functions.
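A rough sketch of the pattern, assuming a simplified struct: the type `composite_info_t`, its fields, and `composite_src_copy` are hypothetical stand-ins written for this example, not the definitions used in pixman.

```c
#include <stdint.h>

/* Hypothetical parameter block. Strides are in pixels. A new field such
 * as `flags` can be added later without touching any function signature. */
typedef struct
{
    const uint32_t *src_bits;
    int             src_stride;
    uint32_t       *dest_bits;
    int             dest_stride;
    int             src_x, src_y;
    int             dest_x, dest_y;
    int             width, height;
    uint32_t        flags;
} composite_info_t;

/* A composite function now takes one pointer instead of a long list of
 * scalar arguments. This one simply copies the source rectangle. */
void
composite_src_copy (const composite_info_t *info)
{
    int x, y;

    for (y = 0; y < info->height; y++)
    {
        const uint32_t *s = info->src_bits +
            (info->src_y + y) * info->src_stride + info->src_x;
        uint32_t *d = info->dest_bits +
            (info->dest_y + y) * info->dest_stride + info->dest_x;

        for (x = 0; x < info->width; x++)
            d[x] = s[x];
    }
}
```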
2011-06-12	In pixman-general.c rename image_parameters to {src, mask, dest}_image	Søren Sandmann Pedersen	1	-17/+16
	All the fast paths generally use these names as well.
2011-06-12	Replace instances of "dst_" with "dest_"	Søren Sandmann Pedersen	11	-275/+275
	The variables in question were dst_x, dst_y, dst_image. The majority of _x and _y uses were already dest_x and dest_y, while the majority of _image uses were dst_image.
2011-05-31	demos: Comment out some unused variables	Søren Sandmann	2	-1/+7

2011-05-31	sse2: Delete some unused variables	Søren Sandmann	1	-14/+4

2011-05-31	mmx: Delete some unused variables	Søren Sandmann	1	-14/+3

2011-05-29	Include noop in win32 builds	Andrea Canciani	1	-0/+1

2011-05-24	Fix a few typos in pixman-combine.c.template	Nis Martensen	1	-4/+3
	Some equations have too much multiplication with alpha.
2011-05-19	Move NOP src iterator into noop implementation.	Søren Sandmann Pedersen	2	-9/+6
	The iterator for sources where neither RGB nor ALPHA is needed, really belongs in the noop implementation.
2011-05-19	Move NULL iterator into pixman-noop.c	Søren Sandmann Pedersen	2	-19/+17
	Iterating a NULL image returns NULL for all scanlines. We may as well do this in the noop iterator.
2011-05-19	Add a noop src iterator	Søren Sandmann Pedersen	1	-0/+39
	When the image is a8r8g8b8 and not transformed, and the fetched rectangle is within the image bounds, scanlines can be fetched by simply returning a pointer instead of copying the bits.
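The three conditions named above (a8r8g8b8 format, no transform, rectangle within bounds) can be sketched as a guard in front of a plain pointer computation. The type `image_t` and the function `noop_get_scanline` are hypothetical and much simpler than pixman's iterator interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified image description. Stride is in pixels. */
typedef struct
{
    uint32_t *bits;
    int       width, height;
    int       stride;
    bool      is_a8r8g8b8;
    bool      has_transform;
} image_t;

/* Return a pointer straight into the image when no conversion or copy is
 * needed, or NULL when the caller has to fall back to a real fetcher. */
const uint32_t *
noop_get_scanline (const image_t *img, int x, int y, int width)
{
    if (!img->is_a8r8g8b8 || img->has_transform)
        return NULL;

    if (x < 0 || y < 0 || width < 0 ||
        y >= img->height || x > img->width - width)
        return NULL;

    return img->bits + (size_t) y * img->stride + x;
}
```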
2011-05-19	Move noop dest fetching to noop implementation	Søren Sandmann Pedersen	2	-26/+37
	It will at some point become useful to have CPU specific destination iterators. One problem with that is that such iterators should not be used if we can composite directly in the destination image. By moving the noop destination iterator to the noop implementation, we can ensure that it will be chosen before any CPU specific iterator.
2011-05-19	Add a noop composite function for the DST operator	Søren Sandmann Pedersen	2	-2/+19
	The DST operator doesn't actually do anything, so add a noop "fast path" for it, instead of checking in pixman_image_composite32(). The performance tradeoff here is that we get rid of a test for DST in the common case where the operator is not DST, in return for an extra walk over the clip rectangles in the uncommon case where the operator actually is DST.
2011-05-19	Add a "noop" implementation.	Søren Sandmann Pedersen	4	-0/+51
	This new implementation is ahead of all other implementations in the fallback chain and is supposed to contain operations that are "noops", i.e., they don't require any work. For example, it might contain a "fast path" for the DST operator that doesn't actually do anything or an iterator for a8r8g8b8 that just returns a pointer into the image.
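The fallback chain with the noop implementation at its head can be pictured with the sketch below. All names here (`implementation_t`, `lookup`, the operator enum) are hypothetical and chosen for this example; the only behavior taken from the commit messages is that the noop implementation is consulted first and claims the DST operator.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { OP_SRC, OP_OVER, OP_DST } op_t;

typedef struct implementation implementation_t;
struct implementation
{
    const char             *name;
    bool                  (*can_handle) (op_t op);
    const implementation_t *fallback;   /* next implementation to try */
};

/* DST leaves the destination untouched, so the noop layer claims it. */
bool noop_can_handle (op_t op)    { return op == OP_DST; }

/* The general implementation accepts everything. */
bool general_can_handle (op_t op) { (void) op; return true; }

const implementation_t general_imp = { "general", general_can_handle, NULL };
const implementation_t noop_imp    = { "noop", noop_can_handle, &general_imp };

/* Walk the chain from the top and return the first implementation that
 * accepts the operator. */
const implementation_t *
lookup (const implementation_t *top, op_t op)
{
    for (; top != NULL; top = top->fallback)
    {
        if (top->can_handle (op))
            return top;
    }
    return NULL;
}
```

In a fuller chain, CPU specific implementations would sit between the two entries shown here, which is why placing the noop layer first guarantees it wins over them.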
2011-05-17	test: Fix compilation on win32	Andrea Canciani	1	-3/+1
	MSVC complains about uint32_t being used as an expression: composite.c(902) : error C2275: 'uint32_t' : illegal use of this type as an expression
2011-05-09	Check for working mmap()	Dave Yeo	2	-1/+6
	OS/2 doesn't have a working mmap().
2011-05-02	Post-release version bump to 0.23.1	Søren Sandmann Pedersen	1	-2/+2

2011-05-02	Pre-release version bump to 0.22.0 [pixman-0.22.0]	Søren Sandmann Pedersen	1	-2/+2

2011-04-19	Post-release version bump to 0.21.9	Søren Sandmann Pedersen	1	-1/+1

2011-04-19	Pre-release version bump to 0.21.8 [pixman-0.21.8]	Søren Sandmann Pedersen	1	-1/+1

2011-04-18	ARM: Enable bilinear fast paths using scanline functions in pixman-arm-neon-asm-bilinear.S	Taekyun Kim	1	-0/+39
	Enable fast paths which are supported by scanline functions in pixman-arm-neon-asm-bilinear.S
2011-04-18	ARM: NEON scanline functions for bilinear scaling	Taekyun Kim	2	-0/+769
	General fetch->combine->store based bilinear scanline functions. They need further optimization and will eventually be replaced with optimal functions one by one. General functions should be located in pixman-arm-neon-asm-bilinear.S and optimal functions in pixman-arm-neon-asm.S. The following general bilinear scanline functions are implemented:
	over_8888_8888
	add_8888_8888
	src_8888_8_8888
	src_8888_8_0565
	src_0565_8_x888
	src_0565_8_0565
	over_8888_8_8888
	add_8888_8_8888
2011-04-18	ARM: Common macro for scaled bilinear scanline function with A8 mask	Taekyun Kim	1	-0/+45
	Defining PIXMAN_ARM_BIND_SCALED_BILINEAR_SRC_A8_DST macro for declaration of scaled bilinear scanline functions in common header.
2011-04-18	Offset rendering in pixman_composite_trapezoids() by (x_dst, y_dst)	Søren Sandmann Pedersen	3	-8/+19
	Previously, this function would do coordinate calculations in such a way that (x_dst, y_dst) would only affect the alignment of the source image, but not of the traps, which would always be considered to be in absolute destination coordinates. This is unlike the pixman_image_composite() function which also registers the mask to the destination. This patch makes it so that traps are also offset by (x_dst, y_dst). Also add a comment explaining how this function is supposed to operate, and update tri-test.c and composite-trap-test.c to deal with the new semantics.
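One way to picture the new semantics is as a translation of every trapezoid by (x_dst, y_dst) before rendering. The sketch below only illustrates that effect; it is not how the patch is implemented, and the types `fixed_t`, `point_t`, `line_t`, `trap_t` and the function `offset_trap` are hypothetical simplifications using 16.16 fixed-point coordinates.

```c
#include <stdint.h>

typedef int32_t fixed_t;                          /* 16.16 fixed point */
#define INT_TO_FIXED(i) ((fixed_t) ((i) * 65536))

typedef struct { fixed_t x, y; }                   point_t;
typedef struct { point_t p1, p2; }                 line_t;
typedef struct { fixed_t top, bottom; line_t left, right; } trap_t;

/* Shift a trapezoid by an integer destination offset, so that the traps
 * move together with the source alignment. */
void
offset_trap (trap_t *t, int x_dst, int y_dst)
{
    fixed_t dx = INT_TO_FIXED (x_dst);
    fixed_t dy = INT_TO_FIXED (y_dst);

    t->top    += dy;
    t->bottom += dy;

    t->left.p1.x  += dx;  t->left.p1.y  += dy;
    t->left.p2.x  += dx;  t->left.p2.y  += dy;
    t->right.p1.x += dx;  t->right.p1.y += dy;
    t->right.p2.x += dx;  t->right.p2.y += dy;
}
```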
2011-04-18	ARM: Add 'neon_composite_over_n_8888_0565_ca' fast path	Søren Sandmann Pedersen	2	-0/+173
	This improves the performance of the firefox-talos-gfx benchmark with the image16 backend. Benchmark on an 800 MHz ARM Cortex A8:
	Before:
	[ # ]  backend  test               min(s)   median(s)  stddev.  count
	[  0]  image16  firefox-talos-gfx  121.773  122.218    0.15%    6/6
	After:
	[ # ]  backend  test               min(s)   median(s)  stddev.  count
	[  0]  image16  firefox-talos-gfx   85.247   85.563    0.22%    6/6
	V2: Slightly better instruction scheduling based on comments from Taekyun Kim.
	V3: Eliminate all stalls from the inner loop. Also based on comments from Taekyun Kim.
2011-04-18	Fix OpenMP not supported case	Gilles Espinasse	1	-20/+27
	PIXMAN_LINK_WITH_ENV did not fail unless -Wall -Werror was used, so USE_OPENMP was defined even when the compiler did not support OpenMP. Fix that by running the second OpenMP test only when the first AC_OPENMP check finds support.
	configure was tested in these cases:
	gcc without libgomp support: no openmp option, --enable-openmp and --disable-openmp
	gcc with libgomp support: no openmp option, --enable-openmp and --disable-openmp
	Not tested with autoconf versions that do not know OpenMP (<2.62).
	Warn when --enable-openmp is requested but no support is found.
	Signed-off-by: Gilles Espinasse <g.esp@free.fr>
2011-04-18	Fix missing AC_MSG_RESULT value from Werror test	Gilles Espinasse	1	-1/+1
	Use the correct variable name.
	Signed-off-by: Gilles Espinasse <g.esp@free.fr>
2011-04-11	ARM: pipelined NEON implementation of bilinear scaled 'src_8888_0565'	Siarhei Siamashka	1	-1/+244
	Benchmark on ARM Cortex-A8 r1p3 @600MHz, 32-bit LPDDR @166MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=10020565, speed=33.59 MPix/s
	after:  op=1, src=20028888, dst=10020565, speed=46.25 MPix/s
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=10020565, speed=63.86 MPix/s
	after:  op=1, src=20028888, dst=10020565, speed=84.22 MPix/s
2011-04-11	ARM: pipelined NEON implementation of bilinear scaled 'src_8888_8888'	Siarhei Siamashka	1	-0/+127
	Performance of the inner loop when working with the data in L1 cache:
	ARM Cortex-A8: 41 cycles per 4 pixels (no stalls and partial dual issue)
	ARM Cortex-A9: 48 cycles per 4 pixels (no stalls)
	It might still be possible to improve performance even more on ARM Cortex-A8 with better use of dual issue.
	Benchmark on ARM Cortex-A8 r1p3 @600MHz, 32-bit LPDDR @166MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=20028888, speed=40.38 MPix/s
	after:  op=1, src=20028888, dst=20028888, speed=48.47 MPix/s
	Benchmark on ARM Cortex-A8 r2p2 @1GHz, 32-bit LPDDR @200MHz:
	Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
	before: op=1, src=20028888, dst=20028888, speed=79.68 MPix/s
	after:  op=1, src=20028888, dst=20028888, speed=93.11 MPix/s
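The cycle counts and the throughput figures in this message can be related with a small calculation. The sketch below assumes that the inner loop is the only cost, so it gives an upper bound rather than a prediction; the function name `peak_mpix_per_sec` is made up for this example.

```c
/* Upper bound on throughput in MPix/s for a loop that needs `cycles`
 * clock cycles to produce `pixels` pixels at `clock_mhz` MHz, assuming
 * all data is already in L1 cache and nothing else costs time. */
double
peak_mpix_per_sec (double clock_mhz, double cycles, double pixels)
{
    return clock_mhz * pixels / cycles;
}
```

With the Cortex-A8 figure of 41 cycles per 4 pixels, this gives about 58.5 MPix/s at 600 MHz and about 97.6 MPix/s at 1 GHz. The measured results of 48.47 and 93.11 MPix/s are below those bounds, which would be consistent with the 2000x2000 test image not fitting in L1 cache, although the commit message does not state the cause.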
2011-04-11	ARM: support different levels of loop unrolling in bilinear scaler	Siarhei Siamashka	1	-8/+76
	Now an extra 'flag' parameter is supported in the bilinear scanline scaling function generation macro. It can be used to enable unrolling to 4 or 8 pixels per loop iteration and to provide save/restore code for the d8-d15 registers.
2011-04-11	ARM: use fewer ARM instructions in NEON bilinear scaling code	Siarhei Siamashka	1	-41/+38
	This reduces code size and also puts less pressure on the instruction decoder.
2011-04-11	ARM: support for software pipelining in bilinear macros	Siarhei Siamashka	1	-3/+28
	Now it's possible to override the main loop of bilinear scaling code with optimized pipelined implementation.
2011-04-11	ARM: use aligned memory writes in NEON bilinear scaling code	Siarhei Siamashka	1	-14/+35