nir/algebraic: Add algebraic opt for float comparisons with identical operands.

The flt version could have been added in 56e21647e20d, but our
collective understanding of NaN and comparisons was poor in 2015.  The
new "is_a_number" predicate makes the others possible.

All of the helped shaders in shader-db are either from Mad Max or Skia.
Some of the Skia shaders just get decimated by this change:

instructions helped: shaders/skia/580-4.shader_test FS SIMD8: 81 -> 29 (-64.20%) (scheduled: top-down)

I looked at a couple of those shaders, and they had sequences like:

    vec1 32 ssa_44 = flt32 ssa_32, ssa_32
    vec1 32 ssa_45 = b32csel ssa_44, ssa_43, ssa_0
    vec1 32 ssa_46 = fge32 ssa_32, ssa_32
    vec1 32 ssa_47 = b32csel ssa_46, ssa_0, ssa_45
    vec1 32 ssa_48 = iand ssa_46, ssa_44
    vec1 32 ssa_49 = b32csel ssa_48, ssa_43, ssa_0

ssa_44 is replaced with False.  Then ssa_47 selects between ssa_0 and
ssa_0, so ssa_47 and ssa_46 are eliminated.  ssa_48 is (False && don't
care), so ssa_48 and ssa_49 are eliminated.  After that, many
calculations now involve constants of zero, so they are optimized down
too.  So it continues until there's not much left!

All Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 21072238 -> 21071386 (<.01%)
instructions in affected programs: 33722 -> 32870 (-2.53%)
helped: 146
HURT: 1
helped stats (abs) min: 1 max: 62 x̄: 5.84 x̃: 2
helped stats (rel) min: 0.19% max: 62.35% x̄: 4.09% x̃: 1.07%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.20% max: 0.20% x̄: 0.20% x̃: 0.20%
95% mean confidence interval for instructions value: -7.94 -3.65
95% mean confidence interval for instructions %-change: -5.87% -2.25%
Instructions are helped.

total cycles in shared programs: 856203326 -> 856192238 (<.01%)
cycles in affected programs: 749966 -> 738878 (-1.48%)
helped: 148
HURT: 0
helped stats (abs) min: 1 max: 1226 x̄: 74.92 x̃: 18
helped stats (rel) min: 0.07% max: 49.70% x̄: 2.69% x̃: 0.46%
95% mean confidence interval for cycles value: -104.82 -45.02
95% mean confidence interval for cycles %-change: -4.01% -1.37%
Cycles are helped.

LOST:   4
GAINED: 0

Fossil-db results:

Tiger Lake
Instructions in all programs: 160915223 -> 160898354 (-0.0%)
SENDs in all programs: 6812780 -> 6812780 (+0.0%)
Loops in all programs: 38340 -> 38340 (+0.0%)
Cycles in all programs: 7434144207 -> 7433978462 (-0.0%)
Spills in all programs: 192582 -> 192582 (+0.0%)
Fills in all programs: 304537 -> 304537 (+0.0%)

Ice Lake
Instructions in all programs: 145296298 -> 145279531 (-0.0%)
SENDs in all programs: 6863692 -> 6863692 (+0.0%)
Loops in all programs: 38334 -> 38334 (+0.0%)
Cycles in all programs: 8800257014 -> 8800088384 (-0.0%)
Spills in all programs: 216880 -> 216880 (+0.0%)
Fills in all programs: 334248 -> 334248 (+0.0%)

Skylake
Instructions in all programs: 135891664 -> 135874910 (-0.0%)
SENDs in all programs: 6802946 -> 6802946 (+0.0%)
Loops in all programs: 38331 -> 38331 (+0.0%)
Cycles in all programs: 8444273433 -> 8444130932 (-0.0%)
Spills in all programs: 194839 -> 194839 (+0.0%)
Fills in all programs: 301114 -> 301114 (+0.0%)
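
The six-instruction Skia sequence quoted in the message can be checked by hand with a rough Python model. This is only an illustrative sketch, not NIR code: Python's `<` and `>=` stand in for flt32 and fge32, a conditional expression stands in for b32csel, and `and` stands in for iand on booleans. The function name and the sample values for ssa_43 and ssa_0 are hypothetical; ssa_0 is assumed to be the constant zero, as the message's remark about "constants of zero" suggests.

```python
import math


def run_sequence(ssa_32, ssa_43, ssa_0):
    """Model the quoted sequence; return the two live results (ssa_47, ssa_49)."""
    ssa_44 = ssa_32 < ssa_32                # flt32 ssa_32, ssa_32 (always False)
    ssa_45 = ssa_43 if ssa_44 else ssa_0    # b32csel ssa_44, ssa_43, ssa_0
    ssa_46 = ssa_32 >= ssa_32               # fge32 ssa_32, ssa_32
    ssa_47 = ssa_0 if ssa_46 else ssa_45    # b32csel ssa_46, ssa_0, ssa_45
    ssa_48 = ssa_46 and ssa_44              # iand ssa_46, ssa_44
    ssa_49 = ssa_43 if ssa_48 else ssa_0    # b32csel ssa_48, ssa_43, ssa_0
    return ssa_47, ssa_49


# Hypothetical sample inputs, including NaN and infinities for ssa_32.
for ssa_32 in (1.0, -0.0, math.inf, -math.inf, math.nan):
    assert run_sequence(ssa_32, ssa_43=7.0, ssa_0=0.0) == (0.0, 0.0)
```

Both results equal ssa_0 for every input tried, NaN included, which matches the message's observation that folding flt alone is enough to collapse this particular sequence.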
author: Ian Romanick <ian.d.romanick@intel.com> 2020-08-10 15:34:42 -0700
committer: Ian Romanick <ian.d.romanick@intel.com> 2021-04-02 12:56:18 -0700
commit: e2218620eb5ad18aa64813c580e23fba4e2db0a2 (patch)
tree: c997a4de7e5a40d57efc2a677ffe0643c3ec3a97
parent: 7d0b6b3f8f2e8c98a9478cfca8ac5d9aa0400a1d (diff)
1 files changed, 8 insertions, 3 deletions
diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index c61bcfb21c5..ed6edfa0174 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -968,15 +968,20 @@ optimizations.extend([
    (('iand', 'a@bool16', 1.0), ('b2f', a)),
    (('iand', 'a@bool32', 1.0), ('b2f', a)),
    (('flt', ('fneg', ('b2f', 'a@1')), 0), a), # Generated by TGSI KILL_IF.
-   # Comparison with the same args.  Note that these are not done for
-   # the float versions because NaN always returns false on float
-   # inequalities.
+   # Comparison with the same args.  Note that these are only done for the
+   # float versions when the source must be a number.  Generally, NaN cmp NaN
+   # produces the opposite result of X cmp X.  flt is the outlier.  NaN < NaN
+   # is false, and, for any number X, X < X is also false.
    (('ilt', a, a), False),
    (('ige', a, a), True),
    (('ieq', a, a), True),
    (('ine', a, a), False),
    (('ult', a, a), False),
    (('uge', a, a), True),
+   (('flt', a, a), False),
+   (('fge', 'a(is_a_number)', a), True),
+   (('feq', 'a(is_a_number)', a), True),
+   (('fneu', 'a(is_a_number)', a), False),
    # Logical and bit operations
    (('iand', a, a), a),
    (('iand', a, ~0), a),
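
The reasoning in the new comment (flt folds unconditionally, while fge, feq and fneu need the is_a_number predicate) can be illustrated outside the compiler. The sketch below assumes that Python's float operators follow IEEE 754 in the same way as the NIR opcodes, with `!=` playing the role of fneu; the helper name is hypothetical and nothing here is Mesa code.

```python
import math


def self_compare(x):
    """Return (x < x, x >= x, x == x, x != x) for a float x."""
    return (x < x, x >= x, x == x, x != x)


NUMBERS = [0.0, -0.0, 1.5, -3.0, math.inf, -math.inf]

# For any actual number (infinities included) all four results are fixed,
# which is what the is_a_number-guarded patterns rely on.
for value in NUMBERS:
    assert self_compare(value) == (False, True, True, False)

# NaN flips fge, feq and fneu, but flt is still False, so only the flt
# pattern is safe without a predicate.
assert self_compare(math.nan) == (False, False, False, True)
```

Only the first element of the tuple is the same for numbers and for NaN, which is why `flt` is the one float pattern in the patch without a predicate on `a`.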