~nh/llvm - Misc LLVM things, mostly radeonsi (AMDGPU)

Age	Commit message	Author	Files	Lines
2016-07-11	AMDGPU: fix local stack slot allocation bugs [latest]	Nicolai Hähnle	3	-2/+35
	Summary: The main bug fix here is using the 32-bit encoding of V_ADD_I32 in materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary immediates work. The second part is that we may now require the SegmentWaveByteOffset even when there are initially no stack objects and VGPR spilling isn't enabled, for stack slots that are allocated later. This means that some bits become effectively dead and can be cleaned up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21551
2016-07-11	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset	Nicolai Hähnle	4	-34/+20
	Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it misled the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217
2016-07-11	AMDGPU: Treat texture gather instructions more like other MIMG instructions	Nicolai Hähnle	3	-4/+24
	Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210
2016-07-11	[IPRA] Properly compute register usage at call sites.	Chad Rosier	4	-8/+11
	Differential Revision: http://reviews.llvm.org/D21395 Patch by Vivek Pandya. PR28144 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275087 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[SystemZ] Recognize Load On Condition Immediate (LOCHI/LOCGHI) opportunities	Zhan Jun Liau	11	-2/+294
	Summary: Add support for the z13 instructions LOCHI and LOCGHI, which conditionally load immediate values. Add target instruction info hooks so that if-conversion will allow predication of LHI/LGHI. Author: RolandF Reviewers: uweigand Subscribers: zhanjunl Committing on behalf of Roland. Differential Revision: http://reviews.llvm.org/D22117 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275086 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[SCCP] Try to follow the DRY principle, use `OpSt`.	Davide Italiano	1	-3/+2
	Thanks to Eli Friedman for pointing this out in his post-commit review! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275084 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[SLSR] Call getPointerSizeInBits with the correct address space.	Jingyue Wu	2	-5/+22
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[PM/IPO] Port LowerTypeTests to the new PassManager.	Davide Italiano	5	-17/+39
	There's a little bit of churn in this patch because the initialization mechanism is now shared between the old and the new PM. Other than that, it's just a pretty mechanical translation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275082 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[lanai] Add more tests for assembly of conditional ALU ops	Jacques Pienaar	4	-5/+363
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275081 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Fix the assertion failure caused by http://reviews.llvm.org/D22118	Dehao Chen	2	-2/+3
	Summary: http://reviews.llvm.org/D22118 uses metadata to store the call count, which makes it possible for branch weights to have only one element. Also fix the assertion failure in the inliner when checking the instruction type, so that it includes "invoke" instructions. Reviewers: mkuper, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22228 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275079 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[IR] Stop a -Wsign-compare warning from firing	David Majnemer	1	-1/+1
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275077 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[LowerTypeTests] Don't rely on doInitialization().	Davide Italiano	1	-23/+16
	In preparation for porting this pass to the new PM (which has no doInitialization()). Differential Revision: http://reviews.llvm.org/D22223 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Implement callsite-hotness based inline cost for Sample-based PGO	Dehao Chen	5	-1/+103
	Summary: For sample-based PGO, using BFI to calculate the callsite count is sometimes inaccurate. This is because, with a sampling-based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency is calculated inaccurately due to the lack of samples in the cold branches. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A10 are all 100% taken, and callsite has 1000 samples and thus is considered hot. Because the loop's trip count is huge, it's normal that all branches outside the loop have no samples at all. As a result, we can only use static branch probability to derive the frequency of the loop header. Assuming that the static heuristic thinks each branch is 50% taken, the count calculated from BFI will be 1/(2^10) of the actual value. In order to get a more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation-based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI, as the call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275073 91177308-0d34-0410-b5e6-96231b3b80d8
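	The 1/(2^10) figure in the commit message above can be checked with a small standalone sketch. This is not LLVM code: `staticScale` and `estimatedCount` are hypothetical helper names, and the model simply assumes that every guarding branch is statically given a 50% taken probability.

```cpp
#include <cassert>

// Hypothetical helper (not an LLVM API): the fraction of the true execution
// count that survives when each of `numGuards` always-taken branches is
// statically assumed to be taken with probability 1/2.
double staticScale(unsigned numGuards) {
  double scale = 1.0;
  for (unsigned i = 0; i < numGuards; ++i)
    scale *= 0.5;  // one 50% guess per guarding branch
  return scale;
}

// Hypothetical helper: the count a purely static estimate would assign to a
// call site whose true count is `trueCount`.
double estimatedCount(double trueCount, unsigned numGuards) {
  assert(trueCount >= 0.0 && "counts are non-negative");
  return trueCount * staticScale(numGuards);
}
```

	With ten guards, `staticScale(10)` is 1/1024, so under this model a call site would be credited with roughly a thousandth of its real count, which is the inaccuracy the commit avoids by annotating the weight directly on the call instruction.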
2016-07-11	Tune the weight propagation algorithm for sample profile.	Dehao Chen	2	-16/+30
	Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22180 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275072 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[x86] make some of the tests 256-bit for testing diversity	Sanjay Patel	1	-54/+106
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275070 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Add missing include from previous commit	Nirav Dave	1	-0/+1
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275069 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Fix branch relaxation in 16-bit mode.	Nirav Dave	17	-48/+115
	Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275068 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[x86] specify triple to avoid bot failures	Sanjay Patel	1	-6/+6
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275067 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[Sink] Don't move calls to readonly functions across stores	Nicolai Haehnle	2	-2/+118
	Summary: Reviewers: hfinkel, majnemer, tstellarAMD, sunfish Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17279 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275066 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	AliasAnalysis: unify getModRefInfo(I, CS) semantics with other overloads	Nicolai Haehnle	1	-1/+1
	This subtle change to getModRefInfo(Instruction, ImmutableCallSite) is to ensure that the semantics are equal to that of getModRefInfo(CS1, CS2) when the Instruction is a call-site. This is now more in line with getModRefInfo generally: it returns Mod when I modifies a memory location that is accessed (read or written) by CS and Ref when I reads a memory location that is written by CS. From a grep of the code, the only uses of this particular getModRefInfo overload are in MemorySSA and MemCpyOptimizer, and they only care about whether the result is MR_NoModRef or not. Therefore, this change should have no visible effect. Separated out from D17279 upon request. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275065 91177308-0d34-0410-b5e6-96231b3b80d8
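	The Mod/Ref rule described in the commit message above can be illustrated with a toy model. This is a sketch only, not the LLVM AliasAnalysis API: `ToyModRef`, `Access`, and `modRefBetween` are hypothetical names, and memory locations are modelled as plain strings.

```cpp
#include <set>
#include <string>

// Toy result type (hypothetical; deliberately not LLVM's enum).
enum ToyModRef { NoModRef = 0, Ref = 1, Mod = 2, ModRef = 3 };

// Toy summary of the memory locations an instruction or call site touches.
struct Access {
  std::set<std::string> reads;
  std::set<std::string> writes;
};

static bool intersects(const std::set<std::string> &a,
                       const std::set<std::string> &b) {
  for (const auto &loc : a)
    if (b.count(loc))
      return true;
  return false;
}

// Mod: I writes a location that CS reads or writes.
// Ref: I reads a location that CS writes.
ToyModRef modRefBetween(const Access &I, const Access &CS) {
  int result = NoModRef;
  if (intersects(I.writes, CS.reads) || intersects(I.writes, CS.writes))
    result |= Mod;
  if (intersects(I.reads, CS.writes))
    result |= Ref;
  return static_cast<ToyModRef>(result);
}
```

	In this model, two accesses that only read the same location yield `NoModRef`, which is the only distinction the two callers named in the commit message are said to care about.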
2016-07-11	[x86] update checks	Sanjay Patel	1	-15/+30
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275064 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[X86][SSE] Generalise target shuffle combine of shuffles using variable masks	Simon Pilgrim	1	-13/+21
	At present the only shuffle with a variable mask we recognise is PSHUFB, which influences whether it's worth the cost of mask creation/loading of a combined target shuffle with a variable mask. This change sets up the infrastructure to support other shuffles in the future but has no effect yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275059 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Provide support for preserving assembly comments	Nirav Dave	13	-4/+201
	Preserve assembly comments from the input in the output assembly, and add flags to toggle this behavior. This is on by default for inline assembly and off in llvm-mc. Parsed comments are emitted immediately before an EOL, which generally places them on the expected line. Reviewers: rtrieu, dwmw2, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20020 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275058 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[AMDGPU][llvm-mc] Quickfix for r272748 to enable labels in branch instructions.	Artem Tamazov	2	-0/+19
	Fixes issue mentioned at: https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra/issues/13. Lit tests added. Differential Revision: http://reviews.llvm.org/D22133 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275054 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[mips][microMIPS] Implement LDC1, SDC1, LDC2, SDC2, LWC1, SWC1, LWC2 and SWC2 instructions and add CodeGen support	Zlatko Buljan	40	-81/+816
	Differential Revision: http://reviews.llvm.org/D18824 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275050 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	AVX-512: DAG lowering for scalar MIN/MAX commutable ops	Elena Demikhovsky	2	-3/+72
	DAG lowering was missing for the scalar FMINC, FMAXC nodes. The nodes are generated only in the "unsafe-fp-math" mode. Added tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275048 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[AVX512] Add support for 512-bit ANDN now that all ones build vectors survive long enough to allow the matching.	Craig Topper	2	-1/+68
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275046 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors.	Craig Topper	15	-137/+193
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275045 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	[X86] Add the AVX512 SET0 pseudos to foldMemoryOperandImpl since they are marked for CanFoldAsLoad.	Craig Topper	2	-3/+14
	I don't really know how to test this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute	Hal Finkel	4	-56/+6
	Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275042 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Allow BasicBlockEdge to be used in DenseMap	Daniel Berlin	1	-0/+21
	Summary: Add a DenseMapInfo specialization for BasicBlockEdge Reviewers: hfinkel, chandlerc, majnemer Differential Revision: http://reviews.llvm.org/D22207 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275041 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Pointer-comparison folding should look through returned-argument functions	Hal Finkel	2	-0/+35
	For functions which are known to return a specific argument, pointer-comparison folding can look through the function calls as part of its analysis. Differential Revision: http://reviews.llvm.org/D9387 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275039 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Teach isDereferenceablePointer to look through returned-argument functions	Hal Finkel	2	-1/+9
	For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275038 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Teach SCEV to look through returned-argument functions	Hal Finkel	2	-0/+23
	When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275037 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Teach computeKnownBits to look through returned-argument functions	Hal Finkel	2	-3/+21
	If a function is known to return one of its arguments, we can use that in order to compute known bits of the return value. Differential Revision: http://reviews.llvm.org/D9397 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275036 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	BasicAA should look through functions with returned arguments	Hal Finkel	4	-2/+77
	Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't lose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275035 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Add a 'Returned' intrinsic property corresponding to the 'returned' argument attribute	Hal Finkel	4	-1/+16
	This will be used by the upcoming llvm.noalias intrinsic. Differential Revision: http://reviews.llvm.org/D22201 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275034 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11	Don't use a SmallSet for returned attribute inference	Hal Finkel	1	-11/+19
	Suggested post-commit by David Majnemer on IRC (following-up on a pre-commit review comment). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275033 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	Add getReturnedArgOperand to Call/InvokeInst, CallSite	Hal Finkel	5	-4/+47
	In order to make the optimizer smarter about using the 'returned' argument attribute (generally, but motivated by my llvm.noalias intrinsic work), add a utility function to Call/InvokeInst, and CallSite, to make it easy to get the returned call argument (when one exists). P.S. There is already an unfortunate amount of code duplication between CallInst and InvokeInst, and this adds to it. We should probably clean that up separately. Differential Revision: http://reviews.llvm.org/D22204 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275031 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	[X86][SSE] Relax type assertions for matchVectorShuffleAsInsertPS	Simon Pilgrim	1	-2/+4
	Calls to matchVectorShuffleAsInsertPS only need to ensure the inputs are 128-bit vectors. Only lowerVectorShuffleAsInsertPS needs to ensure that they are v4f32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275028 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	Let FuncAttrs infer the 'returned' argument attribute	Hal Finkel	4	-6/+48
	A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. Differential Revision: http://reviews.llvm.org/D22202 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275027 91177308-0d34-0410-b5e6-96231b3b80d8
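	As a source-level illustration of the property described above, here is a minimal sketch of a function that always returns its first argument. `fillBytes` is a hypothetical example, not taken from the patch, and whether the inference logic would actually mark this particular function is an assumption.

```cpp
#include <cstddef>

// Hypothetical example: every path returns `dst` unchanged, so the first
// parameter is the kind of argument the 'returned' attribute describes.
char *fillBytes(char *dst, char value, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = value;  // writes through dst, but never changes the pointer
  return dst;
}
```

	Because the result equals `dst`, a caller-side analysis that knows this can reason about the call's result as if it were the argument itself, which is the idea the later "look through returned-argument functions" commits in this log build on.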
2016-07-10	Update the LangRef description of the 'returned' attribute	Hal Finkel	1	-6/+7
	The description of the 'returned' attribute says that it is only used when code-generating the caller. I'd like to make the optimizer smarter about looking through functions with returned arguments (generally, but motivated by my llvm.noalias work). As David pointed out in the review of D22202, the LangRef should be updated to make its expanded uses clearer. Differential Revision: http://reviews.llvm.org/D22205 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275026 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	[DAG] make isConstantSplatVector() available to the rest of lowering	Sanjay Patel	3	-32/+29
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275025 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	AMDGPU/R600: Add implicitarg.ptr intrinsic	Jan Vesely	7	-36/+336
	Differential Revision: http://reviews.llvm.org/D21622 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275024 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	[X86][SSE] Add support for target shuffle combining to PSHUFLW/PSHUFHW	Simon Pilgrim	4	-22/+63
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275022 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	fix documentation comments; NFC	Sanjay Patel	2	-21/+13
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275021 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	[x86, SSE, AVX] add tests for icmp+zext (PR28484)	Sanjay Patel	1	-1/+191
	Note the inconsistent vpbroadcast generation for AVX2; another bug. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275020 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	[X86][SSE] Added tests for combining shuffles to PSHUFLW/PSHUFHW	Simon Pilgrim	3	-0/+104
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275019 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	[Support] Make helper function static. NFC.	Benjamin Kramer	1	-2/+2
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275017 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10	[SystemZ] Utilize Test Data Class instructions.	Marcin Koscielnicki	13	-4/+1003
	This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275016 91177308-0d34-0410-b5e6-96231b3b80d8