path: root/test/CodeGen/X86/recip-fastmath.ll
Age | Commit message | Author | Files | Lines
2015-06-22 | [x86] set default reciprocal (division and square root) codegen to match GCC | Sanjay Patel | 1 | -19/+19
D8982 (checked in at http://reviews.llvm.org/rL239001) added command-line options to allow reciprocal estimate instructions to be used in place of divisions and square roots.
This patch changes the default settings for x86 targets to allow that recip codegen (except for scalar division because that breaks too much code) when using -ffast-math or its equivalent.
This matches GCC behavior for this kind of codegen.
Differential Revision: http://reviews.llvm.org/D10396
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240310 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 | make reciprocal estimate code generation more flexible by adding command-line options (3rd try) | Sanjay Patel | 1 | -2/+2
The first try (r238051) to land this was reverted due to ExecutionEngine build failure; that was hopefully addressed by r238788.
The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure; that was hopefully addressed by r238953.
This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option.
The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.
Differential Revision: http://reviews.llvm.org/D8982
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239001 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 | Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)" | Rafael Espindola | 1 | -2/+2
This reverts commit r238842.
It broke -DBUILD_SHARED_LIBS=ON build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238900 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 | make reciprocal estimate code generation more flexible by adding command-line options (2nd try) | Sanjay Patel | 1 | -2/+2
The first try (r238051) to land this was reverted due to bot failures that were hopefully addressed by r238788.
This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option.
The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.
Differential Revision: http://reviews.llvm.org/D8982
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238842 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-23 | Revert "make reciprocal estimate code generation more flexible by adding command-line options" | Rafael Espindola | 1 | -2/+2
This reverts commit r238051.
It broke some bots: http://lab.llvm.org:8011/builders/llvm-ppc64-linux1/builds/18190
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-22 | make reciprocal estimate code generation more flexible by adding command-line options | Sanjay Patel | 1 | -2/+2
This patch adds a class for processing many recip codegen possibilities. The TargetRecip class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option.
The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math.
Differential Revision: http://reviews.llvm.org/D8982
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238051 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 | fixed to test features, not CPU models | Sanjay Patel | 1 | -24/+24
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234413 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-12 | Expose the number of Newton-Raphson iterations applied to the hardware's reciprocal estimate as a parameter (x86). | Sanjay Patel | 1 | -12/+49
This is a follow-on to r221706 and r221731 and discussed in more detail in PR21385.
This patch also loosens the testcase checking for btver2. We know that the "1.0" will be loaded, but we can't tell exactly when, so replace the CHECK-NEXT specifiers with plain CHECKs. The CHECK-NEXT sequence relied on a quirk of post-RA-scheduling that may change independently of anything in these tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221819 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-11 | Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385). | Sanjay Patel | 1 | -0/+72
This is a first step for generating SSE rcp instructions for reciprocal calcs when fast-math allows it. This is very similar to the rsqrt optimization enabled in D5658 (http://reviews.llvm.org/rL220570).
For now, be conservative and only enable this for AMD btver2 where performance improves significantly both in terms of latency and throughput.
We may never enable this codegen for Intel Core* chips because the divider circuits are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21 cycle critical path for the rcp + mul + sub + mul + add estimate.
Follow-on patches may allow configuration of the number of Newton-Raphson refinement steps, add AVX512 support, and enable the optimization for more chips.
More background here: http://llvm.org/bugs/show_bug.cgi?id=21385
Differential Revision: http://reviews.llvm.org/D6175
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221706 91177308-0d34-0410-b5e6-96231b3b80d8
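
The "rcp + mul + sub + mul + add" sequence in the 2014-11-11 entry reads as one Newton-Raphson refinement step applied to a hardware reciprocal estimate, and the 2014-11-12 entry makes the number of such steps configurable. The following is a minimal Python sketch of that arithmetic only, not the code LLVM emits. The function names, the `steps` parameter, and the simulated estimate with a relative error of 2^-12 are all assumptions made for illustration, and the sketch uses double precision where the hardware instructions work in single precision.

```python
def crude_reciprocal(a, rel_error=2.0 ** -12):
    """Hypothetical stand-in for a hardware estimate such as rcpss.

    Returns 1/a perturbed by a fixed relative error. The error size
    is an assumption chosen for illustration.
    """
    return (1.0 / a) * (1.0 + rel_error)


def refine_reciprocal(a, estimate, steps=1):
    """Apply `steps` Newton-Raphson iterations to an estimate of 1/a.

    Each iteration computes x + x * (1 - a * x), which is the
    mul + sub + mul + add sequence that follows the initial estimate.
    """
    x = estimate
    for _ in range(steps):
        t = a * x      # mul
        t = 1.0 - t    # sub
        t = t * x      # mul
        x = x + t      # add
    return x


def relative_error(a, x):
    """Relative error of x as an approximation of 1/a."""
    return abs(a * x - 1.0)


if __name__ == "__main__":
    a = 3.0
    x0 = crude_reciprocal(a)
    for steps in range(3):
        x = refine_reciprocal(a, x0, steps)
        print(steps, relative_error(a, x))
```

Each step roughly squares the relative error, which is why a single step can be enough for single precision while costing four dependent arithmetic operations after the estimate. That dependency chain is what the commit message weighs against the latency of divss.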