Age | Commit message (Collapse) | Author | Files | Lines |
|
When returning the single box that represents a path, always return it
consistently wound.
|
|
|
|
Add an even simpler sweep-line tessellator for rectangular trapezoids (as
produced by the rectilinear stoker and box filler).
This is so simple it even outperforms pixman's region validation code for the
purposes of path-to-region conversion.
|
|
Avoid a small amount of unnecessary overhead by performing a simple
conversion of the path to traps when it consists solely of simple boxes.
|
|
This tidies the common case which was to call, for example,
_cairo_polygon_line_to(); _cairo_polygon_status();
|
|
For the simple cases where the clip is an unaligned box (or boxes), apply
the clip directly to the geometry and avoid having to use an intermediate
clip-mask.
|
|
|
|
First perform a simple geometric clip to catch the majority of cases where
an unaligned clip has been set outside the operation extents that can be
discarded without having to use an image surface.
This causes a dramatic increase of over 13x for the poppler-bug-12266
trace and little impact elsewhere for more sensible clippers.
|
|
We can ensure that we always produce a clip region when possible by using
the rectilinear tessellator to convert complex, device-aligned polygons to
regions. Prior to using the tessellator, we relied on pixman's region code
which could only handle a union of rectangles.
|
|
For the frequent cases where we know in advance that we are dealing with a
rectilinear path, but can not use the simple region code, implement a
variant of the Bentley-Ottmann tessellator. The advantages here are that
edge comparison is very simple (we only have vertical edges) and there are
no intersection, though possible overlaps. The idea is the same, maintain
a y-x sorted queue of start/stop events that demarcate traps and sweep
through the active edges at each event, looking for completed traps.
The motivation for this was noticing a performance regression in
box-fill-outline with the self-intersection work:
1.9.2 to HEAD^: 3.66x slowdown
HEAD^ to HEAD: 5.38x speedup
1.9.2 to HEAD: 1.57x speedup
The cause of which was choosing to use spans instead of the region handling
code, as the complex polygon was no longer being tessellated.
|
|
|
|
We refactor the surface fallbacks to convert full strokes and fills to the
intermediate polygon representation (as opposed to before where we
returned the trapezoidal representation). This allow greater flexibility
to choose how then to rasterize the polygon. Where possible we use the
local spans rasteriser for its increased performance, but still have the
option to use the tessellator instead (for example, with the current
Render protocol which does not yet have a polygon image).
In order to accommodate this, the spans interface is tweaked to accept
whole polygons instead of a path and the tessellator is tweaked for speed.
Performance Impact
==================
...
Still measuring, expecting some severe regressions.
...
|
|
Handling clip as part of the surface state, as opposed to being part of
the operation state, is cumbersome and a hindrance to providing true proxy
surface support. For example, the clip must be copied from the surface
onto the fallback image, but this was forgotten causing undue hassle in
each backend. Another example is the contortion the meta surface
endures to ensure the clip is correctly recorded. By contrast passing the
clip along with the operation is quite simple and enables us to write
generic handlers for providing surface wrappers. (And in the future, we
should be able to write more esoteric wrappers, e.g. automatic 2x FSAA,
trivially.)
In brief, instead of the surface automatically applying the clip before
calling the backend, the backend can call into a generic helper to apply
clipping. For raster surfaces, clip regions are handled automatically as
part of the composite interface. For vector surfaces, a clip helper is
introduced to replay and callback into an intersect_clip_path() function
as necessary.
Whilst this is not primarily a performance related change (the change
should just move the computation of the clip from the moment it is applied
by the user to the moment it is required by the backend), it is important
to track any potential regression:
ppc:
Speedups
========
image-rgba evolution-20090607-0 1026085.22 0.18% -> 672972.07 0.77%: 1.52x speedup
▌
image-rgba evolution-20090618-0 680579.98 0.12% -> 573237.66 0.16%: 1.19x speedup
▎
image-rgba swfdec-fill-rate-4xaa-0 460296.92 0.36% -> 407464.63 0.42%: 1.13x speedup
▏
image-rgba swfdec-fill-rate-2xaa-0 128431.95 0.47% -> 115051.86 0.42%: 1.12x speedup
▏
Slowdowns
=========
image-rgba firefox-periodic-table-0 56837.61 0.78% -> 66055.17 3.20%: 1.09x slowdown
▏
|
|
Use const to document the read-only nature of the arguments passed to the
callbacks.
|
|
Yikes! The callback could fail so we need to propagate the error status.
|
|
The error paths should be hit very rarely during normal operation, so mark
them as being unlikely so gcc may emit better code.
|
|
Also scan for appendages of simple rectangles.
|
|
Scan the path for a series of consistently wound rectangles.
|
|
The spline decomposition code allocates and stores points in a temporary
buffer which is immediately consumed by the caller. If the caller supplies
a callback that handles each point computed along the spline, then we can
use the point immediately and avoid the allocation.
|
|
Avoid the iterative search for the extreme points by first checking the
most likely arrangement (as produced by cairo_rectangle() under no
transformation).
|
|
Avoid the overhead in sorting the edges within
_cairo_traps_tessellate_convex_quad() by using our prior knowledge that we
have a simple rectangle and construct the trap from the extreme points.
|
|
|
|
|
|
Allocate subsequent path bufs twice as large as the previous buf,
whilst still embedding a small initial buf into cairo_path_fixed_t
that handles the most frequent usage.
|
|
Now, the functions to add new data to a polygon all become void,
and there's a new _cairo_polygon_status call to query the status
at the end of a sequence of operations.
With this change, we fix many callerswhich were previously not
checking the return values of _cairo_polygon functions by adding
only a single call to _cairo_polygon_status rathern than several
new checks.
|
|
Actually assign the result that is tested on the next line...
|
|
This means, we have to malloc only one buffer, not two. Worst case
is that one always draws curves, which fills the arg (point) buffer
six times faster than op buffer. But that's not a big deal since
each op takes 1 byte, while each point takes 8 bytes. So op space
is cheap to spare, so to speak (about 10% memory waste at worst).
|
|
It turns out that this case is extremely common and worth avoiding
the overhead of the path iteration and tessellation code.
The optimization here works only for device-axis-aligned rectangles
It should be possible to generalize this to catch more cases, (such
as any convex quadrilateral with 4 or fewer points).
This fix results in a 1.4-1.8x speedup for the rectangles perf case:
image-rgb rectangles-512 7.80 1.22% -> 4.35 1.62%: 1.79x speedup
▊
image-rgba rectangles-512 7.71 4.77% -> 4.37 0.30%: 1.77x speedup
▊
xlib-rgba rectangles-512 8.78 5.02% -> 5.58 5.54%: 1.57x speedup
▋
xlib-rgb rectangles-512 11.87 2.71% -> 8.75 0.08%: 1.36x speedup
▍
Which conveniently overcomes the ~ 1.3x slowdown we had been seeing
for this case since 1.2. Now, compared to 1.2.6 we see only a speedup:
image-rgba rectangles-512 6.19 0.29% -> 4.37 0.30%: 1.42x speedup
▎
image-rgb rectangles-512 6.12 1.68% -> 4.35 1.62%: 1.41x speedup
▎
xlib-rgba rectangles-512 7.48 1.07% -> 5.58 5.54%: 1.34x speedup
▏
xlib-rgb rectangles-512 10.35 1.03% -> 8.75 0.08%: 1.18x speedup
▏
|
|
|
|
This patch was produced by running git-stripspace on all *.[ch] files
within cairo. Note that this script would have also created all the changes
from the previous commits to remove trailing whitespace.
|
|
This patch was produced with the following (GNU) sed script:
sed -i -r -e 's/^[ \t]+$//'
run on all *.[ch] files within cairo.
|
|
functions and put them in cairo-clip.c.
Rewrite to use new cairo_clip_t functions for manipulating the clip state, change the clip_and_composite_trapezoids call tree to use cairo_clip_t instead of cairo_gstate_t.
Use new cairo_clip_t function to maintain clip state while replaying.
Pass fill rule and tolerance directly, to break gstate dependency.
New function. Set the clip for a surface as specified by the cairo_clip_t.
Move translate_traps() from cairo-gstate.c to here and rename it.
Reviewed by: otaylor
|
|
Fix indentation.
|
|
functions to be named as _cairo_path_fixed functions.
Track name change of cairo_path_real_t and _cairo_path_fixed functions.
|
|
cairo_t, cairo_gstate_t, and cairo_path_real_t into their own header files.
Track changes to header files, reaching into the new private headers where necessary.
|
|
(cairo_append_path): Rename cairo_copy_path_data, cairop_copy_path_data_flat, and cairo_append_path_data to cairo_copy_path, cairo_copy_path_flat, and cairo_append_path.
Add new cairo_path_t, containing a cairo_path_data_t array and an explicit length. Remove CAIRO_PATH_END_PATH terminator from cairo_path_data_t.
Rename the internal path object from cairo_path_t to cairo_path_real_t.
|
|
address.
|
|
|
|
|
|
Library Public License version 2 or 'any later version'
|
|
|
|
cairo_move_to_func, cairo_line_to_func, cairo_curve_to_func, and cairo_close_path_func.
cairo_path.last_move_point and cairo_path.current_point are now fixed-point not doubles for consistency.
Now accept 4 explicit function pointers rather than a structure. Eliminate unnecessary done_path callback.
Track change in _cairo_path_interpret. Code previously in done_path callback is now here immediately after call to _cairo_path_interpret.
Internal _cairo_path API modified to accept fixed-point data everywhere. Much cleaner this way.
Have to convert doubles to fixed-point to track changes in _cairo_path API.
Keep data in fixed-point rather than going through intermediate doubles. Track changes in _cairo_path API.
New function to help when working with freetype.
|
|
close_path instead of add_edge, add_spline, and done_sub_path. Much, much nicer.
Provide cairo_polygon_move_to and cairo_polygon_line_to instead of cairo_polygon_add_point.
Track change in cairo_polygon interface.
|
|
|
|
cairo_set_target_surface, and cairo_set_target_image to act appropriately in the face of non-zero status.
|
|
some point.
|
|
|
|
functionality up into cairo_surface_t instead). Eliminated most of the remaining X datatypes (XFixed, XPointFixed, XLineFixed, XTrapezoid). Fixed some numerical problems relating to pen initialization and intersection calculation.
|
|
|
|
|