Age | Commit message | Author | Files | Lines |
|
|
|
|
|
These macros are identical to the ones that Tor Lillqvist posted here:
http://lists.freedesktop.org/archives/pixman/2010-April/000160.html
with one exception: the variable is allocated with calloc() and not
malloc().
Cc: tml@iki.fi
|
|
It is apparently broken. See this:
http://mingw-users.1079350.n2.nabble.com/gcc-4-4-multi-threaded-exception-handling-thread-specifier-not-working-td3440749.html
We'll need to support thread local storage on MinGW32 some other way.
Cc: tml@iki.fi
|
|
The indexed formats have 0 bits of alpha, but can't be considered
opaque because there may be non-opaque colors in the palette.
|
|
|
|
|
|
This line:
    mask = mask | mask >> 8 | mask >> 16 | mask >> 24;
only works when mask has 0s in the lower 24 bits, so add
    mask &= 0xff000000;
before it.
Reported by Todd Rinaldo on the #cairo IRC channel.
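As a minimal sketch of why the extra masking step matters (the helper name replicate_alpha is hypothetical and not taken from pixman), the two statements can be combined like this:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: copy the top (alpha) byte of a 32-bit value
 * into all four byte positions. */
static uint32_t
replicate_alpha (uint32_t mask)
{
    /* Clear the lower 24 bits first; otherwise stray color bits
     * would be OR-ed into the replicated result. */
    mask &= 0xff000000;
    assert ((mask & 0x00ffffff) == 0);

    mask = mask | mask >> 8 | mask >> 16 | mask >> 24;
    return mask;
}
```

With the masking in place, an input such as 0x80123456 yields 0x80808080; without it, the color bits would leak into the result.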
|
|
The tls_name_key variable is passed to tls_name_get(), and the first
time this happens it isn't initialized. tls_name_get() then passes it
on to tls_name_alloc() which passes it on to pthread_setspecific()
leading to undefined behavior.
None of this is actually necessary because there is only one such
variable per thread local variable, so it doesn't need to be passed
as a parameter at all.
All of this was pointed out by Tor Lillqvist on the cairo mailing
list.
|
|
The thread local cache is allocated with malloc(), but we rely on it
being initialized to zero, so allocate it with calloc() instead.
|
|
Use the builtin version instead of defining the types ourselves.
|
|
|
|
|
|
This reverts commit ebba1493136a5a0dd7667073165b2115de203eda.
Scheduled for re-discussion after stable 0.18 has been released.
|
|
Fixes Novell bug 568811.
|
|
|
|
|
|
|
|
This should be the last step in providing full armv4t compatibility
with CPU features runtime autodetection in pixman.
|
|
|
|
This is needed for future reuse of the same macros for the other
ARM assembly optimizations (armv4t, armv6)
|
|
The problem was reported as bug 25534 against pixman in the
freedesktop.org Bugzilla. Link to a patch for binutils:
http://sourceware.org/ml/binutils/2008-03/msg00260.html
For pixman the impact is a build failure when using
binutils 2.18. Versions 2.19 and higher are fine. Still,
some distros may be using older versions of binutils and
this is causing problems.
This patch works around the problem by replacing a problematic
"vmov a, b" instruction with the equivalent "vorr a, b, b". Actually
they even map to the same instruction opcode in the generated
code, so the resulting binary is identical with and without the patch.
|
|
This can be used to override the architecture recorded in the EABI object
attribute section. We set a minimum arch to 'armv4'. Binutils documentation
recommends using this directive with code that performs runtime detection
of CPU features.
Additionally, NEON/VFP EABI attributes are suppressed, and the instruction
set to use is explicitly set to '.arm'.
The configure test for NEON support is also updated to include a bunch of
these new directives (if any of these is unsupported by the assembler,
it is better to fail the configure test than to fail the library build).
All these changes are required to fix SIGILL problem on armv4t, reported in
http://lists.freedesktop.org/archives/pixman/2010-March/000123.html
|
|
Avoid a division-by-zero exception if the first number returned by
rand() is a multiple of 500, causing us to create a zero width pixmap,
and then attempt to use get_rand(0) when generating a random stride...
Fixes https://bugs.freedesktop.org/attachment.cgi?id=34162
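The failure mode can be sketched as follows. This is only an illustration of one way to avoid the zero width, not the actual patch; get_rand here is a stand-in with an assumed signature, and random_width is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the test helper (assumed signature): returns a value
 * in [0, max). Calling it with max == 0 would compute rand () % 0,
 * which is a division by zero. */
static int
get_rand (int max)
{
    assert (max > 0);
    return rand () % max;
}

/* Hypothetical helper: choose a width in [1, 500] instead of [0, 499],
 * so a later get_rand (width) can never be called with 0. */
static int
random_width (void)
{
    return get_rand (500) + 1;
}
```

Shifting the range by one keeps the distribution uniform while guaranteeing that any later call using the width as an upper bound is well defined.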
|
|
|
|
|
|
This avoids a test in the inner loop, which improves performance
especially for tiled sources.
On x86-32, I get these results:
Before:
op=1, src_fmt=20028888, dst_fmt=20028888, speed=306.96 MPix/s (73.18 FPS)
op=1, src_fmt=20028888, dst_fmt=10020565, speed=102.67 MPix/s (24.48 FPS)
op=1, src_fmt=10020565, dst_fmt=10020565, speed=324.85 MPix/s (77.45 FPS)
After:
op=1, src_fmt=20028888, dst_fmt=20028888, speed=332.19 MPix/s (79.20 FPS)
op=1, src_fmt=20028888, dst_fmt=10020565, speed=110.41 MPix/s (26.32 FPS)
op=1, src_fmt=10020565, dst_fmt=10020565, speed=363.28 MPix/s (86.61 FPS)
|
|
This is the common case for a lot of transformed images. If the unit
were negative, the transformation would be a reflection which is
fairly rare.
|
|
|
|
This is a macroized version of SRC/OVER repeat normal/unneeded nearest
neighbour scaling instantiated for some common 8888 and 565 formats.
Based on work by Siarhei Siamashka
|
|
FAST_PATH_SAMPLES_COVER_CLIP:
This is set if the source sample grid, unrepeated but transformed,
completely covers the clip destination. If this is set
you can use a simple scaler that doesn't have to care about the repeat
mode.
FAST_PATH_16BIT_SAFE:
This signifies two things:
1) The size of the src/mask fits in a 16.16 fixed point, so something like:
    max_vx = src_image->bits.width << 16;
is allowed and is guaranteed not to overflow max_vx.
2) When stepping the source space we're guaranteed to never overflow
a 16.16 fixed point variable, even if we step one extra step
in the destination space. This means that a loop doing:
    x = vx >> 16;
    vx += unit_x; d = src_row[x];
will never overflow vx causing x to be negative.
And additionally, if you track vx like above and apply NORMAL repeat
after the vx addition with something like:
while (vx >= max_vx) vx -= max_vx;
This will never overflow the vx even on the final increment that
takes vx one past the end of where we will read, which makes the
repeat loop safe.
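To make the guarantees above concrete, here is a minimal sketch of a nearest-neighbour row scaler with NORMAL repeat. The function name, the fixed_16_16_t typedef, and the assertions are illustrative assumptions; the assertions merely spell out what FAST_PATH_16BIT_SAFE is described as promising, and a non-negative unit_x is assumed.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t;   /* hypothetical name for 16.16 fixed point */

/* Hypothetical scaler: fills dst_width pixels by stepping through
 * src_row in 16.16 fixed point, wrapping around (NORMAL repeat). */
static void
scale_row_nearest_normal (uint32_t *dst, const uint32_t *src_row,
                          int src_width, int dst_width,
                          fixed_16_16_t vx, fixed_16_16_t unit_x)
{
    fixed_16_16_t max_vx;
    int i;

    /* Guarantee 1: the width fits, so the shift cannot overflow. */
    assert (src_width > 0 && src_width <= 0x7fff);
    max_vx = src_width << 16;

    /* Guarantee 2: vx + unit_x cannot overflow, even on the last step. */
    assert (vx >= 0 && vx < max_vx);
    assert (unit_x >= 0 && unit_x <= INT32_MAX - max_vx);

    for (i = 0; i < dst_width; i++)
    {
        int x = vx >> 16;

        vx += unit_x;
        while (vx >= max_vx)
            vx -= max_vx;

        dst[i] = src_row[x];
    }
}
```

Note that the final iteration still increments and wraps vx although the result is never read; this is exactly the "one extra step" the flag is said to make safe.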
|
|
|
|
These are useful for macroization
|
|
This lets us simplify some fast paths since we get a consistent
naming that always has 8888 and gets some value for alpha.
|
|
In some cases we end up trying to use the STORE_4 macro with an 8 bit
value, which resulted in other pixels getting overwritten. Fix this
by always masking the value down to its low 4 bits.
This fixes blitters-test on big-endian machines.
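The effect of the masking can be sketched with a small function. This is not the real STORE_4 macro; the name store_4 and the nibble order (even pixels in the low nibble) are assumptions made for illustration, and pixman's actual order depends on endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4 bpp store: two pixels per byte, even pixel in the
 * low nibble (an assumed layout). */
static void
store_4 (uint8_t *row, int offset, uint32_t value)
{
    /* Keep only 4 bits, so an 8 bit input cannot spill into the
     * neighbouring pixel's nibble. */
    uint8_t v4 = value & 0x0f;
    uint8_t *byte = row + (offset >> 1);

    if (offset & 1)
        *byte = (uint8_t) ((*byte & 0x0f) | (v4 << 4));
    else
        *byte = (uint8_t) ((*byte & 0xf0) | v4);
}
```

Without the `& 0x0f`, storing 0xAB at an even offset would also write 0xA into the neighbouring pixel.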
|
|
|
|
These macros hide the various types of thread local support. On Linux
and Unix, they expand to just __thread. On Microsoft Visual C++, they
expand to __declspec(thread).
On OS X and other systems that don't have __thread, they expand to a
complicated concoction that uses pthread_once() and
pthread_get/setspecific() to get thread local variables.
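A rough sketch of what such a pthread-based fallback could look like is shown below. It is not the macro expansion used by pixman; cache_t, cache_key, cache_make_key and cache_get are hypothetical names, and only standard POSIX calls are used.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct { int entries[16]; } cache_t;   /* hypothetical type */

static pthread_once_t cache_once = PTHREAD_ONCE_INIT;
static pthread_key_t  cache_key;

static void
cache_make_key (void)
{
    /* free () is called on each thread's non-NULL value at thread exit. */
    pthread_key_create (&cache_key, free);
}

/* Returns this thread's cache, allocating it on first use.
 * Returns NULL if the allocation fails. */
static cache_t *
cache_get (void)
{
    cache_t *cache;

    /* The key is created exactly once, before any get/set uses it. */
    pthread_once (&cache_once, cache_make_key);

    cache = pthread_getspecific (cache_key);
    if (!cache)
    {
        /* calloc () so the cache starts out zeroed. */
        cache = calloc (1, sizeof (cache_t));
        if (cache)
            pthread_setspecific (cache_key, cache);
    }
    return cache;
}
```

Because the key is a single file-scope variable initialized through pthread_once(), it never needs to be passed around as a parameter.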
|
|
OS X does not support __thread, so we have to check for it before
using it. It does however support pthread_get/setspecific(), so if we
don't have __thread, check if those are available.
|
|
Clears '#warning: "unknown compiler"' messages when building
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
|
|
The previous code worked in GNU make, but caused a syntax error in Solaris
make ( https://bugs.freedesktop.org/show_bug.cgi?id=27062 ) - this seems to
work in both, and should hopefully not cause syntax errors in any versions
of make not supporting the macro-substitution-in-macro-name feature, just
cause the macro to expand to nothing.
Signed-off-by: Alan Coopersmith <alan.coopersmith@sun.com>
|
|
Pointed out by Andreas Falkenhahn on the cairo mailing list.
|
|
These formats work fine, they just need to have a palette set.
|
|
In SPICE, with Microsoft Visual C++, pixman.h is included after
another file that defines these types, which causes warnings and
errors.
This patch allows such code to just define PIXMAN_DONT_DEFINE_STDINT
to use its own version of those types.
|
|
|
|
|
|
This makes gcc generate slightly better code for optimize_operator.
|
|
This allows us to not test for them later on.
|
|
The four cases for each operator:
none-are-opaque, src-is-opaque, dest-is-opaque, both-are-opaque
are packed into one uint32_t per operator. The relevant strength
reduced operator can then be found by packing the source-is-opaque and
dest-is-opaque into two bits and shifting that number of bytes.
Chris Wilson pointed out a bug in the original version of this commit:
dest_is_opaque and source_is_opaque were used as booleans, but their
actual values were the results of a logical AND with the
FAST_PATH_OPAQUE flag, so the shift value was wildly wrong.
The only reason it actually passed the test suite (on x86) was that
the compiler computed the shift amount in the cl register, and the low
byte of FAST_PATH_OPAQUE happens to be 0, so no shifting actually took
place, and the original operator was returned.
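The packing scheme and the bug can be illustrated with a small sketch. The operator codes, the PACK macro, the table contents and the FLAG_OPAQUE value below are illustrative assumptions rather than pixman's definitions; the important detail is that the masked flags are normalized to 0 or 1 before they are used to build the shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operator codes and flag value, for illustration only. */
enum { OP_CLEAR, OP_SRC, OP_DST, OP_OVER, OP_IN };
#define FLAG_OPAQUE (1u << 13)   /* low byte is 0, as in the story above */

/* One byte per case: none opaque, src opaque, dest opaque, both opaque. */
#define PACK(neither, src, dest, both)                       \
    ((uint32_t) (neither)      | ((uint32_t) (src) << 8) |   \
     ((uint32_t) (dest) << 16) | ((uint32_t) (both) << 24))

static const uint32_t operator_table[] =
{
    [OP_CLEAR] = PACK (OP_CLEAR, OP_CLEAR, OP_CLEAR, OP_CLEAR),
    [OP_SRC]   = PACK (OP_SRC,   OP_SRC,   OP_SRC,   OP_SRC),
    [OP_DST]   = PACK (OP_DST,   OP_DST,   OP_DST,   OP_DST),
    [OP_OVER]  = PACK (OP_OVER,  OP_SRC,   OP_OVER,  OP_SRC),
    [OP_IN]    = PACK (OP_IN,    OP_IN,    OP_SRC,   OP_SRC),
};

static int
optimize_operator (int op, uint32_t src_flags, uint32_t dst_flags)
{
    /* Normalize to 0/1; using the raw AND result here was the bug. */
    int source_is_opaque = (src_flags & FLAG_OPAQUE) != 0;
    int dest_is_opaque   = (dst_flags & FLAG_OPAQUE) != 0;

    /* Two bits select one of the four packed bytes. */
    int shift = 8 * (source_is_opaque | (dest_is_opaque << 1));

    return (int) ((operator_table[op] >> shift) & 0xff);
}
```

For example, OVER with an opaque source reduces to SRC, while OVER with only an opaque destination stays OVER.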
|
|
By extending the operator information table to cover all operators we
can replace the loop with a table look-up. At the same time, base the
operator optimization on the computed flags rather than the ones in
the image struct.
Finally, as an extra optimization, we no longer ignore the case where
there is a mask. Instead we consider the source opaque if both source
and mask are opaque, or if the source is opaque and the mask is
missing.
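The mask rule in the last paragraph amounts to a small predicate. The sketch below uses a hypothetical function name and flag value, and represents a missing mask as a NULL pointer purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_OPAQUE (1u << 13)   /* hypothetical flag value */

/* The source counts as opaque if it is opaque itself and the mask is
 * either missing (NULL here) or opaque as well. */
static bool
source_counts_as_opaque (uint32_t src_flags, const uint32_t *mask_flags)
{
    if (!(src_flags & FLAG_OPAQUE))
        return false;

    return mask_flags == NULL || (*mask_flags & FLAG_OPAQUE) != 0;
}
```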
|
|
http://bugs.launchpad.net/bugs/535183
|