author Søren Sandmann <sandmann@redhat.com> 2007-12-08 09:52:39 -0500
committer Søren Sandmann <sandmann@redhat.com> 2007-12-08 09:52:39 -0500
commit 3ef6f915540df92ce20921e696b7ab8110dd4806
tree 2d9aff2e6504746bf8fd1722d91f4bc9a5c3adba /TODO
parent d6391c34f3d5e339da176908d3d6e165875a8723
Process four pixels at a time; some new instructions
Diffstat (limited to 'TODO')
-rw-r--r-- TODO 50
1 files changed, 26 insertions, 24 deletions
diff --git a/TODO b/TODO
index 8a27080..b62b901 100644
--- a/TODO
+++ b/TODO
@@ -75,35 +75,20 @@
A simple, but probably pretty good scheme:
- - Generate two versions of each op, one where everything is aligned, and one where
- alignment is detected on the fly.
+ - Generate two versions of each op, one where everything is aligned,
+ and one where alignment is detected on the fly.
In both cases n_pixels is computed, the number of pixels to handle
- per iteration. In the aligned case, we then just read in that
- many pixels as efficiently as possible. For the first iteration,
- that probably means 2 pixels in many cases, but eventually it
- would be nice to unroll once to get to four pixels.
+ per iteration. In the aligned case, we then just read in that many
+ pixels as efficiently as possible. For the first iteration, that
+ probably means 2 pixels in many cases, but eventually it would be
+ nice to unroll once to get to four pixels.
So both versions have a preamble that reads the source, mask and
destinations into sse registers. Then afterwards, the computations
are the same, then finally two different versions generate the
final write to the destination.
-- Backwards vs. forwards
-
- The code currently in testjit iterates backwards over each line. It
- may be a little better to go forward. This could be done by
-
- - initializing the line to (line + w * bpp)
- - initializing w to -width
- - not having a displacement:
-
- movq (line, w, bpp), xmm0
-
- and
-
- add 2, width
-
- It is important that the register allocator is not too dumb
- EAX is the only register we can use for multiplication
@@ -188,9 +173,6 @@
smaller by compressing fields into uint8_t's. This would cause gdb
to not show registers in enums though.
-- Pixman CPU detection should be generated dynamically. That will get
- rid of the annoying #ifdefs and getisax() stuff.
-
- Public API:
pixman-sse-jit.h:
@@ -220,6 +202,26 @@
DONE:
+- Pixman CPU detection should be generated dynamically. That will get
+ rid of the annoying #ifdefs and getisax() stuff.
+
+ - Backwards vs. forwards
+
+ The code currently in testjit iterates backwards over each line. It
+ may be a little better to go forward. This could be done by
+
+ - initializing the line to (line + w * bpp)
+ - initializing w to -width
+ - not having a displacement:
+
+ movq (line, w, bpp), xmm0
+
+ and
+
+ add 2, width
+
+ Current code iterates forwards.
+
- The memindex/membase should take ops, not reg numbers.
- There should only REG and MEM in the ops. emit_memindex() can