diff options
author | Elena Demikhovsky <elena.demikhovsky@intel.com> | 2015-07-09 07:42:48 +0000 |
---|---|---|
committer | Elena Demikhovsky <elena.demikhovsky@intel.com> | 2015-07-09 07:42:48 +0000 |
commit | 43afab3bdb46bc4d3b5568540428920311821891 (patch) | |
tree | 15bf90f2eb1235285e01ab1d803fdba00a094480 /docs | |
parent | f959df643a8759cfdd3e882257bff11682882efd (diff) |
Extended syntax of vector version of getelementptr instruction.
The justification of this change is here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082989.html
According to the current GEP syntax, vector GEP requires that each index must be a vector with the same number of elements.
%A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
In this implementation I let each index be or vector or scalar. All vector indices must have the same number of elements. The scalar value will mean the splat vector value.
(1) %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
or
(2) %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
In all cases the %A type is <4 x i8*>
In the case (2) we add the same offset to all pointers.
The case (1) covers C[B[i]] case, when we have the same base C and different offsets B[i].
The documentation is updated.
http://reviews.llvm.org/D10496
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241788 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- | docs/LangRef.rst | 58 |
1 files changed, 54 insertions, 4 deletions
diff --git a/docs/LangRef.rst b/docs/LangRef.rst index 2e4bcbe7302..ca5c16c0f49 100644 --- a/docs/LangRef.rst +++ b/docs/LangRef.rst @@ -6718,7 +6718,8 @@ Overview: The '``getelementptr``' instruction is used to get the address of a subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs -address calculation only and does not access memory. +address calculation only and does not access memory. The instruction can also +be used to calculate a vector of such addresses. Arguments: """""""""" @@ -6844,12 +6845,61 @@ Example: ; yields i32*:iptr %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0 -In cases where the pointer argument is a vector of pointers, each index -must be a vector with the same number of elements. For example: +Vector of pointers: +""""""""""""""""""" + +The ``getelementptr`` returns a vector of pointers, instead of a single address, +when one or more of its arguments is a vector. In such cases, all vector +arguments should have the same number of elements, and every scalar argument +will be effectively broadcast into a vector during address calculation. + +.. code-block:: llvm + + ; All arguments are vectors: + ; A[i] = ptrs[i] + offsets[i]*sizeof(i8) + %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets + + ; Add the same scalar offset to each pointer of a vector: + ; A[i] = ptrs[i] + offset*sizeof(i8) + %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset + + ; Add distinct offsets to the same pointer: + ; A[i] = ptr + offsets[i]*sizeof(i8) + %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets + + ; In all cases described above the type of the result is <4 x i8*> + +The two following instructions are equivalent: + +.. code-block:: llvm + + getelementptr %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1, + <4 x i32> <i32 2, i32 2, i32 2, i32 2>, + <4 x i32> <i32 1, i32 1, i32 1, i32 1>, + <4 x i32> %ind4, + <4 x i64> <i64 13, i64 13, i64 13, i64 13> + + getelementptr %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1, + i32 2, i32 1, <4 x i32> %ind4, i64 13 + +Let's look at the C code, where the vector version of ``getelementptr`` +makes sense: + +.. code-block:: c + + // Let's assume that we vectorize the following loop: + double *A, B; int *C; + for (int i = 0; i < size; ++i) { + A[i] = B[C[i]]; + } .. code-block:: llvm - %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets, + ; get pointers for 8 elements from array B + %ptrs = getelementptr double, double* %B, <8 x i32> %C + ; load 8 elements from array B into A + %A = call <8 x double> @llvm.masked.gather.v8f64(<8 x double*> %ptrs, + i32 8, <8 x i1> %mask, <8 x double> %passthru) Conversion Operations --------------------- |