Age | Commit message | Author | Files | Lines
2016-11-14 | Fix build since r286752. (HEAD, master) | Tom Stellard | 1 | -1/+2
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@286839 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-11 | Fix build since llvm r286566 and require at least llvm 4.0 | Tom Stellard | 2 | -3/+4
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@286634 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-21 | Provide vstore_half helper to workaround clc restrictions | Jan Vesely | 4 | -26/+75
clang won't accept half precision loads and stores without cl_khr_fp16 since r281904
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@282106 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-16 | configure: Add amdgcn-mesa-mesa3d target | Tom Stellard | 1 | -1/+5
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@281793 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-16 | amdgcn-amdhsa: Add get_num_groups implementation | Tom Stellard | 3 | -0/+14
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@281792 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-16 | amdgcn-amdhsa: Add get_global_size() implementation | Tom Stellard | 2 | -0/+40
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@281791 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-15 | math: Implement tgamma | Aaron Watry | 5 | -0/+77
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@281566 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-15 | math: Implement lgamma | Aaron Watry | 5 | -0/+49
Just use lgamma_r and ignore the value returned in the second argument
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@281565 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-15 | math: Implement lgamma_r | Aaron Watry | 6 | -0/+518
Ported from the amd-builtins branch, which is itself based on the Sun Microsystems implementation.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@281564 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-15 | Add ADDR_SPACE parameter to _CLC_V_V_VP_VECTORIZE | Aaron Watry | 1 | -12/+27
This macro is currently unused, but I plan to use it shortly. The previous form did casts of pointers without an address space, which doesn't work so well for CL 1.x.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@281563 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 | Replace nextafter implementation | Matt Arsenault | 2 | -28/+29
This one passes conformance.
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@280961 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 | Avoid ambiguity in calling atom_add functions. | Jan Vesely | 4 | -4/+4
clang (since r280553) allows pointer casts in function overloads, so we need to disambiguate the second argument. clang might be smarter about overloads in the future (see https://reviews.llvm.org/D24113), but let's be safe in libclc anyway.
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@280871 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-30 | configure.py: Add polaris10 and polaris11 | Niels Ole Salscheider | 1 | -2/+2
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@280121 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-25 | amdgcn: Fix return type of get_num_groups | Matt Arsenault | 5 | -2/+24
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@279723 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-25 | Strip opencl.ocl.version metadata | Matt Arsenault | 1 | -0/+7
This should be uniqued when linking, but right now it creates a lot of metadata spam listing the same version. This should also probably be reporting the compiled version of the user program, which may differ from the library. Currently the library IR files report 1.0 while 1.1/1.2 are the default for user programs.
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@279692 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-24 | amdgcn: Also correct get_local_size type for HSA | Matt Arsenault | 1 | -5/+8
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@279656 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-24 | amdgcn: Fix return type for get_global_size | Matt Arsenault | 5 | -2/+24
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@279644 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-20 | amdgpu: Fix default case value for get_local_size | Matt Arsenault | 2 | -2/+2
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@279359 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-20 | amdgcn: Fix get_local_size IR return type | Matt Arsenault | 5 | -5/+27
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@279350 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-19 | amdgcn: Correct return types to be size_t | Matt Arsenault | 3 | -3/+3
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@279343 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 | Implement vstore_half{,n} | Jan Vesely | 3 | -19/+68
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@278962 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-25 | Make min follow the OCL 1.0 specs | Jan Vesely | 1 | -2/+2
OpenCL 1.0: "Returns y if y < x, otherwise it returns x. If x *and* y are infinite or NaN, the return values are undefined."
OpenCL 1.1+: "Returns y if y < x, otherwise it returns x. If x *or* y are infinite or NaN, the return values are undefined."
The 1.0 version is stricter so use that one.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@276704 91177308-0d34-0410-b5e6-96231b3b80d8
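The rule quoted in the message above can be written out directly. The following is a minimal host-side C sketch, not the libclc source: spec_min is a hypothetical name, and the body only transcribes the sentence "Returns y if y < x, otherwise it returns x."

```c
#include <math.h>   /* for the NAN macro only; no libm function is called */

/* spec_min is a hypothetical name used for illustration; it is not a
 * libclc symbol. The body transcribes the quoted wording:
 * "Returns y if y < x, otherwise it returns x." */
static float spec_min(float x, float y)
{
    return (y < x) ? y : x;
}
```

Because every ordered comparison against NaN is false, this form returns x whenever either argument is NaN: spec_min(1.0f, NAN) gives 1.0f, while spec_min(NAN, 1.0f) gives NaN. Whether the patch itself uses exactly this expression is an assumption.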
2016-07-22 | Implement cbrt builtin | Tom Stellard | 7 | -0/+869
This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@276497 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 | Implement cosh builtin | Tom Stellard | 7 | -0/+370
This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@276496 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 | geometric/floatn.inc: Add vec8 and vec16 types | Tom Stellard | 1 | -0/+16
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@276495 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 | AMDGPU: Implement get_global_offset builtin | Jan Vesely | 9 | -1/+33
Also fix get_global_id to consider offset.
No idea how to add this for ptx, so they are stuck with the old get_global_id implementation.
v2: split to a separate patch
v3: Switch R600 to use implicitarg.ptr
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@276443 91177308-0d34-0410-b5e6-96231b3b80d8
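To illustrate what "consider offset" means for get_global_id, here is a small host-side C model of one NDRange dimension. It is a sketch under stated assumptions, not the libclc code: the struct and function names are hypothetical, and the arithmetic follows the OpenCL execution model, in which a work-item's global ID is group ID times local size, plus local ID, plus the global offset.

```c
#include <stddef.h>

/* Hypothetical host-side model of one NDRange dimension; the struct and
 * function names are illustrative and are not libclc symbols. */
struct work_item_dim {
    size_t group_id;      /* value get_group_id(dim) would return */
    size_t local_size;    /* value get_local_size(dim) would return */
    size_t local_id;      /* value get_local_id(dim) would return */
    size_t global_offset; /* value get_global_offset(dim) would return */
};

/* Global ID computed without the offset (assumed to correspond to the
 * "old" behaviour the commit message refers to). */
static size_t global_id_no_offset(const struct work_item_dim *d)
{
    return d->group_id * d->local_size + d->local_id;
}

/* Global ID including the global work offset. */
static size_t global_id_with_offset(const struct work_item_dim *d)
{
    return global_id_no_offset(d) + d->global_offset;
}
```

For group 2, local size 64, local ID 5 and offset 1000, the first function yields 133 and the second 1133; the two agree only when the offset is zero.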
2016-07-22 | AMDGPU: Use clang intrinsics for workitem builtins | Jan Vesely | 14 | -136/+71
v2: split into 2 patches; use clang builtins for other intrinsics as well
v3: Fix warnings; switch r600 to use implicitarg.ptr
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@276442 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 | ptx: Fix builtin names after clang r274770 | Jan Vesely | 5 | -13/+13
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-By: Aaron Watry <awatry@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@276423 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 | amdgpu: Use right builtn for rsq | Matt Arsenault | 1 | -1/+6
The r600 path has never actually worked since double is not implemented there.
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@276009 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 | R600: Use new barrier intrinsic | Matt Arsenault | 1 | -4/+3
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@275874 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 | Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp | Matt Arsenault | 3 | -3/+3
It didn't really work on r600 to begin with, which should get its own intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@275813 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 | configure: Remove device specific defines | Jan Vesely | 1 | -25/+11
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@273044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 | nvptx: Drop feature defines. | Jan Vesely | 1 | -6/+4
This is now handled by clang
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@273043 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-17 | 64 bit integers are legal in full profile without an extension | Jan Vesely | 2 | -6/+12
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@273042 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-17 | math: Use single precision fmax in sp path | Jan Vesely | 1 | -1/+1
Fixes fdim piglit on Turks
v2: use CL fmax instead of __builtin
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@269807 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 | math: Add erf ported from amd-builtins | Jan Vesely | 4 | -0/+413
The scalar float/double function bodies are a direct copy/paste, aside from the removed (optional) code in float function body that requires subnormals.
reviewers: jvesely
Patch by: Vedran Miletić <rivanvx@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@268766 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 | math: Add fdim implementation | Aaron Watry | 6 | -0/+86
Based on the amd-builtin, but explicitly vectorized for all sizes (not just float4), and includes a vectorized double implementation. Passes piglit (float) tests on pitcairn.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@268708 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-15 | prepare-builtins: Remove call to getGlobalContext() | Tom Stellard | 1 | -1/+1
This function has been removed from LLVM.
Patch By: Laurent Carlier
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@266430 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-07 | [AMDGPU] Implement get_local_size for amdgcn--amdhsa triple | Konstantin Zhuravlyov | 5 | -1/+41
Differential Revision: http://reviews.llvm.org/D18284
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@265713 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-30 | Update copyright year to 2016. | Paul Robinson | 1 | -1/+1
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@264949 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-24 | math: Fix ilogb(double) return type | Aaron Watry | 1 | -1/+1
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@261714 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-23 | math: Add ilogb ported from amd-builtins | Aaron Watry | 6 | -0/+68
The scalar float/double function bodies are a direct copy/paste with usage of the CLC wrappers to vectorize them. This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which are equal to the results of ilogb(0.0f) and ilogb(float nan) respectively.
v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@261639 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-17 | Add .gitignore for build directories | Matt Arsenault | 1 | -0/+13
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@261043 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-17 | amdgcn: Use new workitem intrinsics | Matt Arsenault | 9 | -38/+124
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@261042 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 | Update page to list supported targets | Matt Arsenault | 1 | -2/+2
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@260778 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 | Split sources for amdgcn and r600 | Matt Arsenault | 34 | -38/+75
Most files remain in a common amdgpu directory. Also switches barriers to use convergent, and use llvm.amdgcn.s.barrier. This now requires 3.9/trunk to build amdgcn.
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@260777 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 | configure: Remove llvm 3.6 defines | Jan Vesely | 1 | -3/+3
we require llvm 3.7
reviewer: tstellard
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@260304 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 | configure: Remove cl_khr_fp64 for device that don't support doubles | Jan Vesely | 1 | -5/+5
Also remove definitions if provided by clang (3.7+)
This halves the size of builtin.opt.{cedar,barts}.bc
reviewer: tstellard
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@260303 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 | configure: Introduce per device defines | Jan Vesely | 1 | -11/+24
Make cl_khr_fp64 define per-device. This patch does not change the generated Makefile (for llvm 3.6, 3.7)
v2: Make the device defines per LLVM version, 'all' for all versions
reviewer: tstellard
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@260302 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 | math: Fix log2 vectorization on non-fp64 hw | Jan Vesely | 1 | -0/+2
reviewer: tstellard
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/libclc/trunk@260301 91177308-0d34-0410-b5e6-96231b3b80d8