author | Søren Sandmann Pedersen <ssp@redhat.com> | 2011-07-22 12:33:16 -0400 |
---|---|---|
committer | Søren Sandmann Pedersen <ssp@redhat.com> | 2013-07-29 06:21:50 -0400 |
commit | 541ba5ec8abe3f013121857ac7a953ad07a1c426 (patch) | |
tree | f889a2a325c6d5b531abf94545283f6395eb797c | |
parent | 3ee6046da09f518cbb97a69f4914699c1ca91445 (diff) |
iwmmx notes
-rw-r--r-- | iwmmx-review.txt | 84 |
1 files changed, 84 insertions, 0 deletions
diff --git a/iwmmx-review.txt b/iwmmx-review.txt
new file mode 100644
index 00000000..5ec7c7e9
--- /dev/null
+++ b/iwmmx-review.txt
@@ -0,0 +1,84 @@
+- For symmetry: USE_X86_MMX
+
+- Commit messages need more details and explanations
+
+- If you remvoe all the MMX instructions from pixman_blt_mmx(), then
+  it isn't MMX-specific anymore and may as well be moved to
+  pixman-general.
+
+  However, I'm not convinced we can rely on GCC or glibc to be
+  optimized for MMX code. On the other hand x86/mmx is basically
+  obsolete, superseded by SSE2.
+
+- I'd like a comment at the top of pixman-mmx.c that it is used for
+  both iwMMX and x86 MMX.
+
+- ???
+
+- For CPUs with both iwmmx and NEON, shouldn't NEON be preferred?
+
+- The data is interesting. One major conclusion is that ARMv6 stuff is
+  worse than the generic C code. Maybe we should simply delete it, or
+  alternatively, find out if it's accessing too much memory or
+  something.
+
+  [need to find out which ones are slower and which ones are faster]
+
+- At this point, MMX is basically obsolete for x86. It has stuck
+  around mainly for OLPC, but OLPC is moving to ARM now.
+
+Comments on patch 2, fix unaligned
+
+- I'm a bit suspicious of the ldq_u code.
+
+  In most cases, that function will be called with something
+  that is at least 32 bit aligned, but the compiler can't know that,
+  so it seems like it will have to do something really bad.
+
+  Or alternatively, it deduces from uint64_t * that the pointer is 64
+  bit aligned, and therefore doesn't generate any realignment code.
+
+  Similar arguments apply to the 32 bit code
+
+  What does the assembly for those functions look like?
+
+  On x86, at least GCC generates just a movq, but I don't know whether
+  that's because it thinks the pointer is aligned, or because it knows
+  that x86 can cope with unaligned access.
+
+- In many cases, it seems like the fixes for unaligned actually have
+  the opposite effect. Ie., in all the
+
+      while (w >= 4)
+        ...
+
+  for the 8 bit ops, there seems to be no guarantee that the source is
+  actually 32 bit aligned. The old code did actually guarantee that.
+
+  Maybe it's better instead to add such checks to
+  mmx_composite_add_8_8()
+
+  [need to make sure all of mmx has gone through this] Question: Does
+  the Armada 610
+
+- For the blt code, maybe the right plan is to add a memcpy() based
+  fallback in -generic, and then only run the accelerated whenever the
+  source and destinations are both aligned.
+
+  [As an aside, I think it would be worthwhile taking another look at
+  the SSSE3 blt code from Intel again].
+
+  Clearly, deleting all the MMX specific code except _mm_empty()
+  doesn't make any sense.
+
+-
+
+Comments on patch 3, enabling iwmmx
+
+- Why GCC 3.4? The reason for that for x86 was that it was the first
+  version that generated non-retarded code for MMX. The intrinsics
+  were actually available in 3.3, but the code was terribly bad.
+
+- WMMX is selected ahead of NEON
+
+-
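
Illustration (not part of the commit): the notes on patch 2 raise two related points. One is that a load through a `uint64_t *` may let the compiler assume 64-bit alignment. The other is that the blt path could fall back to memcpy() unless source and destination are both aligned. The sketch below shows one way those two ideas might look in portable C. It is not pixman's code: `load_u64_unaligned`, `store_u64_unaligned`, `is_aligned` and `copy_row` are hypothetical names, and the plain 64-bit loop merely stands in for whatever accelerated path the real code would run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an ldq_u()-style helper: load 64 bits from an
 * address with no alignment guarantee. Going through memcpy() means the
 * compiler may not assume any alignment for p; on targets that tolerate
 * unaligned access it is commonly lowered to a single load. */
static uint64_t
load_u64_unaligned (const void *p)
{
    uint64_t v;

    memcpy (&v, p, sizeof v);
    return v;
}

/* Matching hypothetical store helper. */
static void
store_u64_unaligned (void *p, uint64_t v)
{
    memcpy (p, &v, sizeof v);
}

/* Nonzero when p is a multiple of align (align must be a power of two). */
static int
is_aligned (const void *p, size_t align)
{
    return ((uintptr_t) p & (align - 1)) == 0;
}

/* Hypothetical row copy: take the 64-bit path only when both pointers are
 * 8-byte aligned, otherwise fall back to memcpy(). Returns 1 when the
 * 64-bit path was taken and 0 when the fallback was used. */
static int
copy_row (uint8_t *dst, const uint8_t *src, size_t n_bytes)
{
    size_t i = 0;

    assert (dst != NULL && src != NULL);

    if (!is_aligned (dst, 8) || !is_aligned (src, 8))
    {
        memcpy (dst, src, n_bytes);
        return 0;
    }

    /* Stand-in for the accelerated loop: 8 bytes per iteration. */
    for (; i + 8 <= n_bytes; i += 8)
        store_u64_unaligned (dst + i, load_u64_unaligned (src + i));

    /* Copy the remaining 0..7 tail bytes. */
    memcpy (dst + i, src + i, n_bytes - i);
    return 1;
}
```

Comparing the compiler's assembly for `load_u64_unaligned` against a plain `*(uint64_t *) p` dereference on the target in question would be one way to answer the "what does the assembly look like?" question from the notes.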