Age | Commit message | Author | Files | Lines
2016-07-11 | AMDGPU: fix local stack slot allocation bugs | Nicolai Hähnle | 3 | -2/+35
Summary: The main bug fix here is using the 32-bit encoding of V_ADD_I32 in materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary immediates work. The second part is that we may now require the SegmentWaveByteOffset even when there are initially no stack objects and VGPR spilling isn't enabled, for stack slots that are allocated later. This means that some bits become effectively dead and can be cleaned up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21551
2016-07-11 | AMDGPU: Unify MOVRELSOffset and MOVRELDOffset | Nicolai Hähnle | 4 | -34/+20
Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it misled the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217
2016-07-11 | AMDGPU: Treat texture gather instructions more like other MIMG instructions | Nicolai Hähnle | 3 | -4/+24
Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210
2016-07-11 | [IPRA] Properly compute register usage at call sites. | Chad Rosier | 4 | -8/+11
Differential Revision: http://reviews.llvm.org/D21395 Patch by Vivek Pandya. PR28144 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275087 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [SystemZ] Recognize Load On Condition Immediate (LOCHI/LOCGHI) opportunities | Zhan Jun Liau | 11 | -2/+294
Summary: Add support for the z13 instructions LOCHI and LOCGHI, which conditionally load immediate values. Add target instruction info hooks so that if-conversion will allow predication of LHI/LGHI. Author: RolandF Reviewers: uweigand Subscribers: zhanjunl Committing on behalf of Roland. Differential Revision: http://reviews.llvm.org/D22117 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275086 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [SCCP] Try to follow the DRY principle, use `OpSt`. | Davide Italiano | 1 | -3/+2
Thanks to Eli Friedman for pointing this out in his post-commit review! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275084 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [SLSR] Call getPointerSizeInBits with the correct address space. | Jingyue Wu | 2 | -5/+22
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [PM/IPO] Port LowerTypeTests to the new PassManager. | Davide Italiano | 5 | -17/+39
There's a little bit of churn in this patch because the initialization mechanism is now shared between the old and the new PM. Other than that, it's just a pretty mechanical translation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275082 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [lanai] Add more tests for assembly of conditional ALU ops | Jacques Pienaar | 4 | -5/+363
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275081 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Fix the assertion failure caused by http://reviews.llvm.org/D22118 | Dehao Chen | 2 | -2/+3
Summary: http://reviews.llvm.org/D22118 uses metadata to store the call count, which makes it possible for branch weights to have only one element. Also fix the assertion failure in the inliner by making the instruction type check include the "invoke" instruction. Reviewers: mkuper, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22228 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275079 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [IR] Stop a -Wsign-compare warning from firing | David Majnemer | 1 | -1/+1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275077 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [LowerTypeTests] Don't rely on doInitialization(). | Davide Italiano | 1 | -23/+16
In preparation for porting this pass to the new PM (which has no doInitialization()). Differential Revision: http://reviews.llvm.org/D22223 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Implement callsite-hotness based inline cost for Sample-based PGO | Dehao Chen | 5 | -1/+103
Summary: For sample-based PGO, using BFI to calculate the callsite count is sometimes inaccurate. With a sampling-based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency is miscalculated due to the lack of samples in the cold branches. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A10 are all 100% taken, and callsite has 1000 samples and thus is considered hot. Because the loop's trip count is huge, it's normal that all branches outside the loop have no samples at all. As a result, we can only use static branch probability to derive the frequency of the loop header. Assuming that the static heuristic thinks each branch is 50% taken, the count calculated from BFI will be 1/(2^10) of the actual value. In order to get a more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation-based callsite hotness analysis. A side benefit is that it breaks the dependency from the Inliner to BFI, as the call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275073 91177308-0d34-0410-b5e6-96231b3b80d8
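The arithmetic in the summary above can be checked with a small toy model. This is only a sketch of the reasoning, not LLVM's BFI implementation; the function name `estimated_count` and its parameters are hypothetical and chosen purely for illustration.

```python
# Toy model only: NOT LLVM's BlockFrequencyInfo. It reproduces the arithmetic
# from the commit message under the stated assumption that every guarding
# branch without samples is treated as 50% taken by the static heuristic.

def estimated_count(actual_count: int, guarding_branches: int,
                    taken_prob: float = 0.5) -> float:
    """Count derived when none of the guarding branches has any samples."""
    return actual_count * taken_prob ** guarding_branches


if __name__ == "__main__":
    actual = 1000  # samples attributed to the callsite in the profile
    estimate = estimated_count(actual, guarding_branches=10)
    # The estimate is 1/(2^10) of the actual count, so a hot callsite looks cold.
    print(f"actual={actual} estimated={estimate}")
```

Annotating the count directly on the call instruction, as the commit does, sidesteps this multiplication entirely.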
2016-07-11 | Tune the weight propagation algorithm for sample profile. | Dehao Chen | 2 | -16/+30
Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22180 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275072 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [x86] make some of the tests 256-bit for testing diversity | Sanjay Patel | 1 | -54/+106
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275070 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Add missing include from previous commit | Nirav Dave | 1 | -0/+1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275069 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Fix branch relaxation in 16-bit mode. | Nirav Dave | 17 | -48/+115
Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275068 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [x86] specify triple to avoid bot failures | Sanjay Patel | 1 | -6/+6
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275067 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [Sink] Don't move calls to readonly functions across stores | Nicolai Haehnle | 2 | -2/+118
Summary: Reviewers: hfinkel, majnemer, tstellarAMD, sunfish Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17279 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275066 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | AliasAnalysis: unify getModRefInfo(I, CS) semantics with other overloads | Nicolai Haehnle | 1 | -1/+1
This subtle change to getModRefInfo(Instruction, ImmutableCallSite) is to ensure that the semantics are equal to that of getModRefInfo(CS1, CS2) when the Instruction is a call-site. This is now more in line with getModRefInfo generally: it returns Mod when I modifies a memory location that is accessed (read or written) by CS and Ref when I reads a memory location that is written by CS. From a grep of the code, the only uses of this particular getModRefInfo overload are in MemorySSA and MemCpyOptimizer, and they only care about whether the result is MR_NoModRef or not. Therefore, this change should have no visible effect. Separated out from D17279 upon request. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275065 91177308-0d34-0410-b5e6-96231b3b80d8
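The Mod/Ref rule stated above can be illustrated with a toy model over sets of memory locations. This is a sketch under simplifying assumptions (locations are plain strings, aliasing is exact set membership); the `Access` class and `mod_ref_info` function are hypothetical and are not LLVM's AliasAnalysis API.

```python
# Toy model only: NOT LLVM's AliasAnalysis. Memory locations are modelled as
# strings, and two accesses overlap only if they name the same string.
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def mod_ref_info(i: Access, cs: Access) -> str:
    """Relation of instruction `i` to call site `cs`, per the rule in the commit."""
    # Mod: i modifies a location that cs accesses (reads or writes).
    mod = bool(i.writes & (cs.reads | cs.writes))
    # Ref: i reads a location that cs writes.
    ref = bool(i.reads & cs.writes)
    names = {(False, False): "NoModRef", (True, False): "Mod",
             (False, True): "Ref", (True, True): "ModRef"}
    return names[(mod, ref)]


if __name__ == "__main__":
    store_p = Access(writes=frozenset({"p"}))
    load_p = Access(reads=frozenset({"p"}))
    call_reading_p = Access(reads=frozenset({"p"}))
    call_writing_p = Access(writes=frozenset({"p"}))
    print(mod_ref_info(store_p, call_reading_p))
    print(mod_ref_info(load_p, call_writing_p))
    print(mod_ref_info(load_p, call_reading_p))
```

Two reads of the same location never conflict in this model, which matches the commit's observation that callers only distinguish the no-conflict case from everything else.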
2016-07-11 | [x86] update checks | Sanjay Patel | 1 | -15/+30
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275064 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [X86][SSE] Generalise target shuffle combine of shuffles using variable masks | Simon Pilgrim | 1 | -13/+21
At present the only shuffle with a variable mask we recognise is PSHUFB, which influences whether it's worth the cost of mask creation/loading of a combined target shuffle with a variable mask. This change sets up the infrastructure to support other shuffles in the future but has no effect yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275059 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Provide support for preserving assembly comments | Nirav Dave | 13 | -4/+201
Preserve assembly comments from input in output assembly and add flags to toggle this property. This is on by default for inline assembly and off in llvm-mc. Parsed comments are emitted immediately before an EOL, which generally places them on the expected line. Reviewers: rtrieu, dwmw2, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20020 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275058 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [AMDGPU][llvm-mc] Quickfix for r272748 to enable labels in branch instructions. | Artem Tamazov | 2 | -0/+19
Fixes issue mentioned at: https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra/issues/13. Lit tests added. Differential Revision: http://reviews.llvm.org/D22133 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275054 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [mips][microMIPS] Implement LDC1, SDC1, LDC2, SDC2, LWC1, SWC1, LWC2 and SWC2 instructions and add CodeGen support | Zlatko Buljan | 40 | -81/+816
Differential Revision: http://reviews.llvm.org/D18824 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275050 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | AVX-512: DAG lowering for scalar MIN/MAX commutable ops | Elena Demikhovsky | 2 | -3/+72
DAG lowering was missing for the scalar FMINC, FMAXC nodes. The nodes are generated only in the "unsafe-fp-math" mode. Added tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275048 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [AVX512] Add support for 512-bit ANDN now that all ones build vectors survive long enough to allow the matching. | Craig Topper | 2 | -1/+68
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275046 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | [AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors. | Craig Topper | 15 | -137/+193
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275045 91177308-0d34-0410-b5e6-96231b3b80d8
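Why an immediate of 0xff produces all ones can be seen from a bit-level model of the ternary-logic operation. The sketch below assumes the usual description of VPTERNLOG, in which the three source bits at each position form a 3-bit index into the 8-bit immediate; the helper `ternlog` is hypothetical, models a single element only, and ignores masking.

```python
# Toy bit-level model of a ternary-logic operation on one element.
# Assumption: at each bit position the bits of (a, b, c) form the index
# (a << 2) | (b << 1) | c into imm8, and the selected bit is the result.

def ternlog(a: int, b: int, c: int, imm8: int, width: int = 32) -> int:
    result = 0
    for bit in range(width):
        index = ((((a >> bit) & 1) << 2)
                 | (((b >> bit) & 1) << 1)
                 | ((c >> bit) & 1))
        result |= ((imm8 >> index) & 1) << bit
    return result


if __name__ == "__main__":
    # Every truth-table entry of 0xff is 1, so the inputs do not matter.
    print(hex(ternlog(0x0, 0x12345678, 0xDEADBEEF, 0xFF)))
```

Because the result is independent of the sources, the instruction can materialize an all-ones vector without first loading a constant.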
2016-07-11 | [X86] Add the AVX512 SET0 pseudos to foldMemoryOperandImpl since they are marked for CanFoldAsLoad. | Craig Topper | 2 | -3/+14
I don't really know how to test this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute | Hal Finkel | 4 | -56/+6
Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275042 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Allow BasicBlockEdge to be used in DenseMap | Daniel Berlin | 1 | -0/+21
Summary: Add a DenseMapInfo specialization for BasicBlockEdge Reviewers: hfinkel, chandlerc, majnemer Differential Revision: http://reviews.llvm.org/D22207 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275041 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Pointer-comparison folding should look through returned-argument functions | Hal Finkel | 2 | -0/+35
For functions which are known to return a specific argument, pointer-comparison folding can look through the function calls as part of its analysis. Differential Revision: http://reviews.llvm.org/D9387 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275039 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Teach isDereferenceablePointer to look through returned-argument functions | Hal Finkel | 2 | -1/+9
For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275038 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Teach SCEV to look through returned-argument functions | Hal Finkel | 2 | -0/+23
When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275037 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Teach computeKnownBits to look through returned-argument functions | Hal Finkel | 2 | -3/+21
If a function is known to return one of its arguments, we can use that in order to compute known bits of the return value. Differential Revision: http://reviews.llvm.org/D9397 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275036 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | BasicAA should look through functions with returned arguments | Hal Finkel | 4 | -2/+77
Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't lose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275035 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Add a 'Returned' intrinsic property corresponding to the 'returned' argument attribute | Hal Finkel | 4 | -1/+16
This will be used by the upcoming llvm.noalias intrinsic. Differential Revision: http://reviews.llvm.org/D22201 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275034 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 | Don't use a SmallSet for returned attribute inference | Hal Finkel | 1 | -11/+19
Suggested post-commit by David Majnemer on IRC (following-up on a pre-commit review comment). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275033 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | Add getReturnedArgOperand to Call/InvokeInst, CallSite | Hal Finkel | 5 | -4/+47
In order to make the optimizer smarter about using the 'returned' argument attribute (generally, but motivated by my llvm.noalias intrinsic work), add a utility function to Call/InvokeInst, and CallSite, to make it easy to get the returned call argument (when one exists). P.S. There is already an unfortunate amount of code duplication between CallInst and InvokeInst, and this adds to it. We should probably clean that up separately. Differential Revision: http://reviews.llvm.org/D22204 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275031 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | [X86][SSE] Relax type assertions for matchVectorShuffleAsInsertPS | Simon Pilgrim | 1 | -2/+4
Calls to matchVectorShuffleAsInsertPS only need to ensure the inputs are 128-bit vectors. Only lowerVectorShuffleAsInsertPS needs to ensure that they are v4f32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275028 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | Let FuncAttrs infer the 'returned' argument attribute | Hal Finkel | 4 | -6/+48
A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. Differential Revision: http://reviews.llvm.org/D22202 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275027 91177308-0d34-0410-b5e6-96231b3b80d8
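The property being inferred can be sketched on a toy representation of a function. This is not the FuncAttrs implementation: the encoding (an integer for "returns argument number N", a string for any other returned value) and the name `infer_returned_argument` are assumptions made only for illustration.

```python
# Toy model only: NOT LLVM's FuncAttrs pass. A function is summarised by the
# values its return statements yield: an int means "argument with this index",
# a str stands for any other value (constant, loaded value, call result, ...).
from typing import Optional, Sequence, Union

ReturnValue = Union[int, str]


def infer_returned_argument(return_values: Sequence[ReturnValue]) -> Optional[int]:
    """Index of the argument that every return yields, or None if there is none."""
    candidates = set(return_values)
    if len(candidates) != 1:
        return None  # no returns at all, or the returns disagree
    (only,) = candidates
    return only if isinstance(only, int) else None


if __name__ == "__main__":
    print(infer_returned_argument([0, 0]))      # every path returns argument 0
    print(infer_returned_argument([0, 1]))      # paths return different arguments
    print(infer_returned_argument(["const"]))   # returns something that is not an argument
```

The attribute is attached only when every return path agrees on the same argument, which is what lets later analyses substitute the argument for the call result.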
2016-07-10 | Update the LangRef description of the 'returned' attribute | Hal Finkel | 1 | -6/+7
The description of the 'returned' attribute says that it is only used when code-generating the caller. I'd like to make the optimizer smarter about looking through functions with returned arguments (generally, but motivated by my llvm.noalias work). As David pointed out in the review of D22202, the LangRef should be updated to make its expanded uses clearer. Differential Revision: http://reviews.llvm.org/D22205 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275026 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | [DAG] make isConstantSplatVector() available to the rest of lowering | Sanjay Patel | 3 | -32/+29
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275025 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | AMDGPU/R600: Add implicitarg.ptr intrinsic | Jan Vesely | 7 | -36/+336
Differential Revision: http://reviews.llvm.org/D21622 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275024 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | [X86][SSE] Add support for target shuffle combining to PSHUFLW/PSHUFHW | Simon Pilgrim | 4 | -22/+63
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275022 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | fix documentation comments; NFC | Sanjay Patel | 2 | -21/+13
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275021 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | [x86, SSE, AVX] add tests for icmp+zext (PR28484) | Sanjay Patel | 1 | -1/+191
Note the inconsistent vpbroadcast generation for AVX2; another bug. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275020 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | [X86][SSE] Added tests for combining shuffles to PSHUFLW/PSHUFHW | Simon Pilgrim | 3 | -0/+104
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275019 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | [Support] Make helper function static. NFC. | Benjamin Kramer | 1 | -2/+2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275017 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 | [SystemZ] Utilize Test Data Class instructions. | Marcin Koscielnicki | 13 | -4/+1003
This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275016 91177308-0d34-0410-b5e6-96231b3b80d8