Age | Commit message (Collapse) | Author | Files | Lines |
|
This adds new node types for each intrinsic.
For instance, for addv, we have AArch64ISD::UADDV, such that:
(v4i32 (uaddv ...))
is the same as
(v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...))))
that is,
(v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)),
(i32 (int_aarch64_neon_uaddv ...)), ssub)
In a combine, we transform all such across-vector-lanes intrinsics to:
(i32 (extract_vector_elt (uaddv ...), 0))
This has one big advantage: by making the extract_element explicit, we
enable the existing patterns for lane-aware instructions to fire.
This lets us avoid needlessly going through the GPRs. Consider:
uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) {
return vmulq_n_u32(a, vaddvq_u32(b));
}
We now generate:
addv.4s s1, v1
mul.4s v0, v0, v1[0]
instead of the previous:
addv.4s s1, v1
fmov w8, s1
dup.4s v1, w8
mul.4s v0, v1, v0
rdar://20044838
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231840 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Follow up from r231505.
Fix the non-determinism by using a MapVector and reintroduce the AArch64
testcase. Defer deleting the got candidates up to the end and remove
them in a bulk, avoiding linear time removal of each element.
Thanks to Renato Golin for trying it out on other platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231830 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Phabricator review: http://reviews.llvm.org/D8185
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231827 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The dependences are now expose through the new getInterestingDependences
API so we can use that with -analyze too and fix the FIXME.
This lets us remove the test that relied on -debug to check the
dependences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231807 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
them. Note that we still can not lower gc.relocates for invoke statepoints.
Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result.
(Resubmitting this change after not being able to reproduce buildbot failure)
Differential Revision: http://reviews.llvm.org/D7760
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231800 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
This is the sibling patch for the Clang half of this change:
http://reviews.llvm.org/D8088
Differential Revision: http://reviews.llvm.org/D8086
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231794 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This crash occurs due to memory corruption when trying to update dependency
direction based on Constraints.
This crash was observed during lnt regression of Polybench benchmark test case dynprog.
Review: http://reviews.llvm.org/D8059
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231788 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This crash in Dependency analysis is because we assume here that in case of UsefulGEP
both source and destination have the same number of operands which may not be true.
This incorrect assumption results in crash while populating Pairs. Fix the same.
This crash was observed during lnt regression for code such as-
struct s{
int A[10][10];
int C[10][10][10];
} S;
void dep_constraint_crash_test(int k,int N) {
for( int i=0;i<N;i++)
for( int j=0;j<N;j++)
S.A[0][0] = S.C[0][0][k];
}
Review: http://reviews.llvm.org/D8162
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231784 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
comparison to zero width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231761 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
sized types.
We failed to use a marking set to properly handle recursive types, which caused use
to recurse infinitely and eventually overflow the stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231760 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
malformed statepoint intrinsic.
In this situation we would always have already flagged an error on the statepoint intrinsic,
but then we carry on to parse other, related GC intrinsics, and could end up crashing during that
verification when they try to access data from the malformed statepoint.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231759 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
side effects can be constant folded.
ReplaceInstUsesWith needs to return nullptr when the input has no users,
because in that case it does not mutate the program. Otherwise, we can
get stuck in an infinite loop of repeatedly attempting to constant fold
and instruction with no users.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231755 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
They mark the start of a compile unit, so name them .Lcu_*. Using
Section->getLabelBeginName() makes it looks like they mark the start of the
section.
While at it, switch to createTempSymbol to avoid collisions with labels
created in inline assembly. Not sure if a "don't crash" test is worth it.
With this getLabelBeginName is dead, delete it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231750 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
CFLAA didn't know how to properly handle ConstantExprs; it would silently
ignore them. This was a problem if the ConstantExpr is, say, a GEP of a global,
because CFLAA wouldn't realize that there's a global there. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231743 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
We now treat pointers given to ptrtoint and pointers retrieved from
inttoptr as similar to arguments or globals (can alias anything, etc.)
This solves some of the problems we were having with giving incorrect
results.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231741 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
-sanitizer-coverage-block-threshold=0 to actually do something useful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231736 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
It turns out accelerator tables where totally broken if they contained
entries with colliding hashes. The failure mode is pretty bad, as it not
only impacted the colliding entries, but would basically make all the
entries after the first hash collision pointing in the wrong place.
The testcase uses the symbol names that where found to collide during a
clang build.
From a performance point of view, the patch adds a sort and a linear
walk over each bucket contents. While it has a measurable impact on the
accelerator table emission, it's not showing up significantly in clang
profiles (and I'd argue that correctness is priceless :-)).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231732 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231723 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231722 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This fixes a subtle issue that was introduced in r205153.
When reusing a store for the extractelement expansion (to load directly
from it, inserting of going through the stack), later stores to the
same location might have overwritten the data we were expecting to
extract from.
To fix that, we need to explicitly replace the chain going out of the
reused store, so that later stores also have an explicit dependency on
the generated element-extracting loads, and can't clobber them.
rdar://20066785
Differential Revision: http://reviews.llvm.org/D8180
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231721 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
In preparation for a patch where cfi directives get in the way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231720 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
preparation
Fix the double-deletion of AnalysisResolver when delegating through to
Dwarf EH preparation by creating one from scratch. Hopefully the new
pass manager simplifies this.
This reverts commit r229952.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231719 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This also has the advantage of not depending on the brittle getLabelBeginName.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231714 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
immediate when checking if A2_tfrsi is combinable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231710 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Summary:
This removes some duplicated code, and also helps optimization: e.g. in
the test case added, `%idx ULT 128` in `@x` is not currently optimized
to `true` by `-indvars` but will be, after this change.
The only functional change in ths commit is that for add recurrences,
ScalarEvolution::getRange will be more aggressive -- computing the
unsigned (resp. signed) range for a SCEVAddRecExpr will now look at the
NSW (resp. NUW) bits and check for signed (resp. unsigned) overflow.
This can be a strict improvement in some cases (such as the attached
test case), and should be no worse in other cases.
Reviewers: atrick, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8142
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231709 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Summary:
Unused in this commit, but will be used in a subsequent change (D8142)
by a FileCheck test.
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8143
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231708 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231703 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231699 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This was just creating unused labels for .text when the module had no
functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231694 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231693 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This format's easier to understand and update by hand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231686 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231685 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
In the case where just tables are part of the function section, this produces
more readable assembly by avoiding switching to the eh section and back
to .text.
This would also break with non unique section names, as trying to switch to
a unique section actually creates a new one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231677 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Summary:
Code is mostly copied from AArch64 port and modified where needed for Mips.
This handles the "non" legal cases of logical ops. Legal cases are handled by tablegen patterns.
Test Plan:
Make check test logopm.ll
All of test-suite passes at O0/O2 and mips32 r1/r2 with this new change.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: echristo, llvm-commits, aemerson, rfuhler
Differential Revision: http://reviews.llvm.org/D6599
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231665 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This is a candidate for stable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231659 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Also, replaced line with 'target triple' with flag -mtriple on the RUN line.
Removed the data layout string as it is not needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231654 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
As it broke llvm bootstrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231635 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
non-constant clause operands.
Fixing this also exposed a related issue where the landingpad under construction was not
cleaned up when an error was raised, which would cause bad reference errors before the
error could actually be printed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231634 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
For inner one of nested loops, it is more likely to be a hot loop,
and the runtime check can be promoted out from patch 0001, so the
overhead is less, we can try a doubled threshold to unroll more loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231632 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
loop from vectorization.
Runtime unrolling is an expensive optimization which can bring benefit
only if the loop is hot and iteration number is relatively large enough.
For some loops, we know they are not worth to be runtime unrolled.
The scalar loop from vectorization is one of the cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231631 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Runtime unrollng will introduce a runtime check in loop prologue.
If the unrolled loop is a inner loop, then the proglogue will be inside
the outer loop. LICM pass can help to promote the runtime check out if
the checked value is loop invariant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231630 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Summary:
See the two test cases.
; Can fold fcmp with undef on one side by choosing NaN for the undef
; Can fold fcmp with undef on both side
; fcmp u_pred undef, undef -> true
; fcmp o_pred undef, undef -> false
; because whatever you choose for the first undef
; you can choose NaN for the other undef
Reviewers: hfinkel, chandlerc, majnemer
Reviewed By: majnemer
Subscribers: majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D7617
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231626 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
is specified by the user.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231613 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
There were cases where the backend computed a wrong permute mask for a VPERM2X128 node.
Example:
\code
define <8 x float> @foo(<8 x float> %a, <8 x float> %b) {
%shuffle = shufflevector <8 x float> %a, <8 x float> %b, <8 x i32> <i32 undef, i32 undef, i32 6, i32 7, i32 undef, i32 undef, i32 6, i32 7>
ret <8 x float> %shuffle
}
\code end
Before this patch, llc (with -mattr=+avx) emitted the following vperm2f128:
vperm2f128 $0, %ymm0, %ymm0, %ymm0 # ymm0 = ymm0[0,1,0,1]
With this patch, llc emits a vperm2f128 with a correct permute mask:
vperm2f128 $17, %ymm0, %ymm0, %ymm0 # ymm0 = ymm0[2,3,2,3]
Differential Revision: http://reviews.llvm.org/D8119
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231601 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Provide basic support for dynamically loadable coff objects. Only handles a subset of x64 currently.
Patch by Andy Ayers!
Differential Revision: http://reviews.llvm.org/D7793
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231574 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This patch fixes the logic in the DAGCombiner that folds an AND node according
to rule: (and (X (load V)), C) -> (X (load V))
An AND between a vector load 'X' and a constant build_vector 'C' can be folded
into the load itself only if we can prove that the AND operation is redundant.
The algorithm implemented by 'visitAND' firstly computes the splat value 'S'
from C, and then checks if S has the lower 'B' bits set (where B is the size in
bits of the vector element type). The algorithm takes into account also the
'undef' bits in the splat mask.
Unfortunately, the algorithm only worked under the assumption that the size of S
is a multiple of the vector element type. With this patch, we conservatively
avoid folding the AND if the splat bits are not compatible with the vector
element type.
Added X86 test and-load-fold.ll
Differential Revision: http://reviews.llvm.org/D8085
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231563 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE.
This prevents many cases of spilling scalar data between the gpr + simd registers.
At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match).
Differential Revision: http://reviews.llvm.org/D8132
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231554 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
non-temporary enabling options. This is part of removing misched-bench
as an option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231546 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Doing this gets function's low_pc and global variable's locations right
in the output debug info. It also could get right other attributes
that need to be relocated (in linker terms), but I don't know of any
other than the address attributes.
This doesn't fixup low_pc attributes in compile_unit, lexical_block
or inlined subroutine, nor does it get right high_pc attributes
for function. This will come in a subsequent commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231544 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
to disable lane switching if we don't actually have the instruction
set we want to switch to. Models the earlier check above the
conditional for the pass.
The testcase is one that triggered with the assert that's added
as part of the fix, use it to avoid adding a new testcase as it
highlights the same problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231539 91177308-0d34-0410-b5e6-96231b3b80d8
|