[X86][SchedModel] SSE reciprocal square root instruction latencies.

The SSE rsqrt instruction (a fast reciprocal square root estimate) was grouped in the same IIC_SSE_SQRT* scheduling class as the accurate (but very slow) SSE sqrt instruction. For code that uses rsqrt (possibly with Newton-Raphson iterations), this poor scheduling was hurting performance.

This patch splits the rsqrt instruction off from the sqrt instruction scheduling classes and creates new IIC_SSE_RSQRT* classes with latency values based on Agner Fog's instruction tables.

Differential Revision: http://reviews.llvm.org/D5370

Patch by Simon Pilgrim.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218517 91177308-0d34-0410-b5e6-96231b3b80d8
author: Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> 2014-09-26 12:56:44 +0000
committer: Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net> 2014-09-26 12:56:44 +0000
commit: a5ab9baf8319dfef200bef6898039f8d7c3286e3 (patch)
tree: d15f823f0faff168842f8ebd82b29d6c86e6d6ab /lib/Target/X86/X86SchedHaswell.td
parent: a0d5d7aed8e177cea381b3d054d80c212ece9f2c (diff)
1 files changed, 1 insertions, 0 deletions
diff --git a/lib/Target/X86/X86SchedHaswell.td b/lib/Target/X86/X86SchedHaswell.td
index 7bb3569ad33..73a32304302 100644
--- a/lib/Target/X86/X86SchedHaswell.td
+++ b/lib/Target/X86/X86SchedHaswell.td
@@ -129,6 +129,7 @@ defm : HWWriteResPair<WriteFAdd,   HWPort1, 3>;
 defm : HWWriteResPair<WriteFMul,   HWPort0, 5>;
 defm : HWWriteResPair<WriteFDiv,   HWPort0, 12>; // 10-14 cycles.
 defm : HWWriteResPair<WriteFRcp,   HWPort0, 5>;
+defm : HWWriteResPair<WriteFRsqrt, HWPort0, 5>;
 defm : HWWriteResPair<WriteFSqrt,  HWPort0, 15>;
 defm : HWWriteResPair<WriteCvtF2I, HWPort1, 3>;
 defm : HWWriteResPair<WriteCvtI2F, HWPort1, 4>;
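
The commit message mentions code that pairs rsqrt with Newton-Raphson iterations. As a rough illustration of why such code depends on a cheap estimate, the sketch below refines a deliberately imprecise reciprocal square root with one Newton-Raphson step. It is not part of the patch: `rough_rsqrt` and `refine_rsqrt` are hypothetical names, and the injected relative error of 2^-12 is an assumption standing in for the precision of the hardware estimate.

```python
import math


def rough_rsqrt(x: float) -> float:
    """Hypothetical stand-in for the hardware estimate.

    Returns 1/sqrt(x) with a deliberate relative error of 2**-12
    (an assumed figure, used only to make the refinement visible).
    """
    return (1.0 / math.sqrt(x)) * (1.0 + 2.0 ** -12)


def refine_rsqrt(x: float, y: float) -> float:
    """One Newton-Raphson step for f(y) = 1/y**2 - x.

    The update y * (1.5 - 0.5 * x * y * y) needs only multiplies and a
    subtract, so no division or square root is executed.
    """
    return y * (1.5 - 0.5 * x * y * y)


x = 2.0
exact = 1.0 / math.sqrt(x)

y0 = rough_rsqrt(x)       # initial estimate, about 12 bits of precision
y1 = refine_rsqrt(x, y0)  # one refinement step

err0 = abs(y0 - exact) / exact  # relative error before refinement
err1 = abs(y1 - exact) / exact  # relative error after refinement
print(f"relative error before: {err0:.3e}, after: {err1:.3e}")
```

Each step roughly squares the relative error, so the estimate plus one or two steps can replace a full sqrt and divide. That only pays off if the scheduler models the estimate as cheap, which is what giving it its own latency entry achieves.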