iwmmx notes

author: Søren Sandmann Pedersen <ssp@redhat.com> 2011-07-22 12:33:16 -0400
committer: Søren Sandmann Pedersen <ssp@redhat.com> 2013-07-29 06:21:50 -0400
commit: 541ba5ec8abe3f013121857ac7a953ad07a1c426 (patch)
tree: f889a2a325c6d5b531abf94545283f6395eb797c
parent: 3ee6046da09f518cbb97a69f4914699c1ca91445 (diff)
1 files changed, 84 insertions, 0 deletions
diff --git a/iwmmx-review.txt b/iwmmx-review.txt
new file mode 100644
index 00000000..5ec7c7e9
--- /dev/null
+++ b/iwmmx-review.txt
@@ -0,0 +1,84 @@
+- For symmetry: USE_X86_MMX
+
+- Commit messages need more details and explanations
+
+- If you remove all the MMX instructions from pixman_blt_mmx(), then
+  it isn't MMX-specific anymore and may as well be moved to
+  pixman-general.
+
+  However, I'm not convinced we can rely on GCC or glibc to be
+  optimized for MMX code. On the other hand x86/mmx is basically
+  obsolete, superseded by SSE2.
+
+- I'd like a comment at the top of pixman-mmx.c that it is used for
+  both iwMMX and x86 MMX.
+
+- ???
+
+- For CPUs with both iwmmx and NEON, shouldn't NEON be preferred?
+
+- The data is interesting. One major conclusion is that ARMv6 stuff is
+  worse than the generic C code. Maybe we should simply delete it, or
+  alternatively, find out if it's accessing too much memory or
+  something.
+
+  [need to find out which ones are slower and which ones are faster]
+
+- At this point, MMX is basically obsolete for x86. It has stuck
+  around mainly for OLPC, but OLPC is moving to ARM now.
+
+Comments on patch 2, fix unaligned
+
+- I'm a bit suspicious of the ldq_u code.
+
+  In most cases, that function will be called with something
+  that is at least 32 bit aligned, but the compiler can't know that,
+  so it seems like it will have to do something really bad.
+
+  Or alternatively, it deduces from uint64_t * that the pointer is 64
+  bit aligned, and therefore doesn't generate any realignment code.
+
+  Similar arguments apply to the 32 bit code.
+
+  What does the assembly for those functions look like?
+
+  On x86, at least GCC generates just a movq, but I don't know whether
+  that's because it thinks the pointer is aligned, or because it knows
+  that x86 can cope with unaligned access.
+
+- In many cases, it seems like the fixes for unaligned access actually
+  have the opposite effect. I.e., in all the
+
+	while (w >= 4)
+	      ...
+
+  for the 8 bit ops, there seems to be no guarantee that the source is
+  actually 32 bit aligned. The old code did actually guarantee that.
+
+  Maybe it's better instead to add such checks to
+  mmx_composite_add_8_8().
+
+  [need to make sure all of mmx has gone through this] Question: Does
+  the Armada 610
+
+- For the blt code, maybe the right plan is to add a memcpy() based
+  fallback in -generic, and then only run the accelerated code when
+  the source and destination are both aligned.
+
+  [As an aside, I think it would be worthwhile taking another look at
+  the SSSE3 blt code from Intel.]
+
+  Clearly, deleting all the MMX-specific code except _mm_empty()
+  doesn't make any sense.
+
+- 
+
+Comments on patch 3, enabling iwmmx
+
+- Why GCC 3.4? For x86, the reason was that 3.4 was the first version
+  that generated reasonable code for MMX. The intrinsics were
+  actually available in 3.3, but the generated code was terrible.
+
+- WMMX is selected ahead of NEON
+
+-
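
On the ldq_u question in the comments on patch 2: one way to avoid
depending on what the compiler infers from a uint64_t * is to express
the load through memcpy(), which makes no alignment assumption about
its arguments. The sketch below is not the code from the patch;
load_u64_unaligned() and load_u32_unaligned() are hypothetical names
used only for illustration. Whether this compiles to a single load or
to a realignment sequence depends on the target and compiler, so the
generated assembly would still need to be inspected.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the patch's ldq_u(): read 64 bits from an
 * address that carries no alignment guarantee. A fixed-size memcpy()
 * leaves the choice of load sequence to the compiler. */
static inline uint64_t
load_u64_unaligned (const void *p)
{
    uint64_t v;

    memcpy (&v, p, sizeof v);
    return v;
}

/* Same idea for the 32 bit case. */
static inline uint32_t
load_u32_unaligned (const void *p)
{
    uint32_t v;

    memcpy (&v, p, sizeof v);
    return v;
}
```

Comparing the assembly of these helpers with that of a plain
dereference on both x86 and ARM would show whether the compiler is
assuming alignment or emitting realignment code.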
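
The blt suggestion in the comments on patch 2 (a memcpy() based
fallback, with the accelerated path used only when source and
destination are both aligned) could look roughly like the sketch
below. All names here are hypothetical, the 8 byte alignment
requirement is an assumption, and blt_row_fast() is only a stand-in
for the real MMX/iwMMX copy loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if p is a multiple of align bytes; align must be a power of two. */
static int
is_aligned (const void *p, size_t align)
{
    return ((uintptr_t) p & (align - 1)) == 0;
}

/* Hypothetical stand-in for the accelerated row copy. Real code would
 * use MMX/iwMMX loads and stores; here it just copies 8 byte chunks. */
static void
blt_row_fast (uint8_t *dst, const uint8_t *src, size_t n_bytes)
{
    while (n_bytes >= 8)
    {
        memcpy (dst, src, 8);
        dst += 8;
        src += 8;
        n_bytes -= 8;
    }
    memcpy (dst, src, n_bytes);
}

/* Copy one row. Returns 1 if the accelerated path was used and 0 if
 * the memcpy() fallback was used (assumed 8 byte alignment rule). */
static int
blt_row (uint8_t *dst, const uint8_t *src, size_t n_bytes)
{
    if (is_aligned (src, 8) && is_aligned (dst, 8))
    {
        blt_row_fast (dst, src, n_bytes);
        return 1;
    }

    memcpy (dst, src, n_bytes);
    return 0;
}
```

With this split the accelerated routine never sees a misaligned
pointer, so it needs no realignment code of its own.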