Age  Commit message  Author  Files  Lines
2013-05-14  Remove redundant implementations in vload* [vload_optimization]  Aaron Watry  2  -66/+21
2013-05-14  Remove redundant implementations in vstore*  Aaron Watry  2  -66/+21
2013-05-14  libclc: Optimize vstore4/8/16 to global for int/uint types.  Aaron Watry  4  -2/+211
R600 probably doesn't support v2/v3 stores and chokes on types that aren't 32-bits in size. These caveats could/should change in the future. For now, the non-optimized implementations for other sizes/types are left intact.
2013-05-14  Fix vload* function attributes. These functions are all readonly, not readnone  Aaron Watry  2  -20/+20
2013-05-14  fix unused assembly... just in case we use it in the future.  Aaron Watry  1  -4/+4
2013-05-14  libclc: Optimize vload4/8/16 from global for int/uint types.  Aaron Watry  4  -3/+208
R600 doesn't seem to support v2/v3 loads, doesn't support constant address space well currently, and chokes on types that aren't 32-bits in size. All of those caveats could/should change in the future. For now, the non-optimized implementations for other sizes/types are left intact.
2013-05-14  libclc: Don't build vload4 from 2x vload2...  Aaron Watry  1  -1/+1
It creates a bunch of extra instructions in the assembly that may not get optimized out.
2013-05-06  Merge branch 'master' of git://people.freedesktop.org/~tstellar/libclc  Aaron Watry  2  -12/+12
Conflicts: generic/include/clc/clc.h generic/lib/SOURCES generic/lib/shared/clamp.inc
2013-04-30  r600: Fix implementations of get_group_id.ll and get_local_size.ll  Tom Stellard  2  -12/+12
2013-04-19  libclc: Implement clz() builtin  Aaron Watry  7  -0/+158
Squashed commit of the following:

commit a0df0a0e86c55c1bdc0b9c0f5a739e5adef4b056
Author: Aaron Watry <awatry@gmail.com>
Date: Mon Apr 15 18:42:04 2013 -0500

    libclc: Rename clz.ll to clz_if.ll to ensure it gets built.

    configure.py treats files that have the same name with the .cl and .ll extensions as overriding eachother. E.g. If you have clz.cl and clz.ll both specified to be built in the same SOURCES file, only the first file listed will actually be built. Since the contents of clz.ll were an interface that is implemented in clz_impl.ll, rename clz.ll to clz_if.ll to make sure that the interface is built.

commit 931b62bed05c58f737de625bd415af09571a6a5a
Author: Aaron Watry <awatry@gmail.com>
Date: Sat Apr 13 12:32:54 2013 -0500

    libclc: llvm assembly implementation of clz

    Untested... currently crashes in the same manner as add_sat.

commit 6ef0b7b0b6d2e5584086b4b9a9243743b2e0538f
Author: Aaron Watry <awatry@gmail.com>
Date: Sat Mar 23 12:35:27 2013 -0500

    libclc: Add stub clz builtin

    For scalar int/uint, attempt to use the clz llvm builtin.. for all others return 0 until an actual implementation is finished.
2013-04-19  libclc: Add clamp(vec, scalar, scalar) and max(vec, scalar)  Aaron Watry  4  -0/+20
For any GENTYPE that isn't scalar, we need to implement a mixed vector/scalar version of clamp/max. This depends on the min() patches I sent to the list a few minutes ago.
2013-04-19  libclc: Implement the min(vec, scalar) version of the min builtin.  Aaron Watry  4  -0/+41
Checks if the current GENTYPE is scalar, and if not, then defines a separate implementation of the function which casts the second arg to vector before proceeding.
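The scalar-to-vector promotion described in this message can be modelled roughly as follows. This is a Python sketch rather than the OpenCL C from the patch: vectors are modelled as lists, and the names `min_vec` and `min_vec_scalar` are hypothetical.

```python
def min_vec(a, b):
    """Element-wise minimum of two equal-length 'vectors' (lists)."""
    if len(a) != len(b):
        raise ValueError("vector widths differ")
    return [x if x < y else y for x, y in zip(a, b)]


def min_vec_scalar(a, s):
    """min(vec, scalar): widen the scalar to a vector, then reuse min_vec.

    Assumption: this mirrors the 'cast the second arg to vector' step
    described in the commit message; it is not the libclc source.
    """
    return min_vec(a, [s] * len(a))
```

The mixed form is only a thin wrapper, so the element-wise logic lives in one place.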
2013-04-19  libclc: implement initial version of min()  Aaron Watry  6  -0/+22
This doesn't handle the integer cases for min(vector, scalar).
2013-04-19  libclc: Rename [add|sub]_sat.ll to [add|sub]_sat_if.ll  Aaron Watry  3  -2/+2
configure.py allows overloading *.cl with *.ll, but will only ever build the first file listed in SOURCES of ${file}.cl and ${file}.ll add_sat, sub_sat, (and the soon to be submitted clz) all define interfaces in ${function_name}.ll which are implemented in ${function_name}_impl.ll. Renaming the interface files is enough to get them to build again, fixing CL usage of these functions. Tested on clover/r600g.
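The "first file listed wins" rule described above can be illustrated with a small sketch. This is not the code from configure.py: `pick_sources`, keying on the path without its extension, and the example paths are all assumptions made for illustration.

```python
import os


def pick_sources(sources):
    """Keep only the first file listed for each base name.

    Hypothetical helper: models the rule that foo.cl and foo.ll
    override each other, so only the first one listed is built.
    """
    seen = set()
    built = []
    for path in sources:
        base, _ext = os.path.splitext(path)
        if base in seen:
            continue  # a same-named .cl/.ll file was already chosen
        seen.add(base)
        built.append(path)
    return built


# Illustrative SOURCES listings before and after the rename.
before = ["integer/add_sat.cl", "integer/add_sat.ll", "integer/add_sat_impl.ll"]
after = ["integer/add_sat.cl", "integer/add_sat_if.ll", "integer/add_sat_impl.ll"]
```

Under this model, `add_sat.ll` is dropped from the `before` listing because `add_sat.cl` comes first, while every file in the `after` listing is kept because no two share a base name.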
2013-04-17  libclc: Initial vstore*() implementation  Aaron Watry  4  -0/+92
Caveats: 1) Does NOT implement half operations. 2) Assumes that cl_khr_byte_addressable_store is available for the char/short store operations.
2013-04-17  libclc: vload memory accesses are const qualified  Aaron Watry  2  -6/+6
2013-04-17  libclc: Initial vload*() implementation  Aaron Watry  4  -0/+86
Everything except for halfN is implemented
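As a rough model of what the vloadN builtins compute: the OpenCL specification describes vloadn(offset, p) as reading n consecutive elements starting at p + offset * n. The Python sketch below models memory as a list; `vload` is an illustrative stand-in, not libclc code.

```python
def vload(n, offset, p):
    """Return n consecutive elements of p starting at index offset * n.

    Sketch only: 'p' is a Python list standing in for a pointer, and
    out-of-range reads raise instead of being undefined behaviour.
    """
    start = offset * n
    if start < 0 or start + n > len(p):
        raise IndexError("vload would read outside the buffer")
    return p[start:start + n]
```

Note that the offset is counted in whole vectors, not in elements, so `vload(4, 1, p)` starts at element 4.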
2013-04-16  libclc: Add mul24() implementation  Aaron Watry  5  -0/+85
2013-04-16  libclc: add mad24() implementation  Aaron Watry  5  -0/+85
The spec requires the first 2 inputs to be within the bounds of a 24-bit integer's possible values. If they're not, then results are implementation defined.
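A minimal Python model of the unsigned variants may help make the 24-bit restriction concrete. It assumes 32-bit wraparound of the result and masks the inputs to their low 24 bits, which is only one possible behaviour for out-of-range inputs, since the result is implementation defined in that case. The names `mul24_u` and `mad24_u` are hypothetical.

```python
MASK24 = 0x00FFFFFF
MASK32 = 0xFFFFFFFF


def mul24_u(x, y):
    """Multiply the low 24 bits of x and y; keep the low 32 bits of the product."""
    return ((x & MASK24) * (y & MASK24)) & MASK32


def mad24_u(x, y, z):
    """mul24_u(x, y) + z, wrapped to 32 bits."""
    return (mul24_u(x, y) + z) & MASK32
```

For inputs that fit in 24 bits this agrees with an ordinary 32-bit multiply-add; only the out-of-range behaviour is a modelling choice.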
2013-04-16  libclc: simplify clamp() by using min()/max()  Aaron Watry  1  -2/+2
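The usual way to express clamp in terms of min and max is shown below as a Python sketch. The commit only states that min()/max() are used; the exact nesting order here is an assumption.

```python
def clamp(x, lo, hi):
    """Limit x to the range [lo, hi] using min/max (assumes lo <= hi)."""
    return min(max(x, lo), hi)
```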
2013-04-15  Merge branch 'rotate'  Aaron Watry  0  -0/+0
2013-04-15  Merge branch 'clz'  Aaron Watry  7  -0/+158
Conflicts: generic/include/clc/clc.h generic/include/clc/integer/gentype.inc generic/lib/SOURCES generic/lib/integer/rotate.inc
2013-04-15  libclc: Rename [add|sub]_sat.ll to [add|sub]_sat_if.ll  Aaron Watry  3  -2/+2
configure.py allows overloading *.cl with *.ll, but will only ever build the first file listed in SOURCES of ${file}.cl and ${file}.ll add_sat, sub_sat, (and the soon to be submitted clz) all define interfaces in ${function_name}.ll which are implemented in ${function_name}_impl.ll. Renaming the interface files is enough to get them to build again, fixing CL usage of these functions. Tested on clover/r600g.
2013-04-15  libclc: Rename clz.ll to clz_if.ll to ensure it gets built. [clz]  Aaron Watry  2  -1/+1
configure.py treats files that have the same name with the .cl and .ll extensions as overriding eachother. E.g. If you have clz.cl and clz.ll both specified to be built in the same SOURCES file, only the first file listed will actually be built. Since the contents of clz.ll were an interface that is implemented in clz_impl.ll, rename clz.ll to clz_if.ll to make sure that the interface is built.
2013-04-13  libclc: llvm assembly implementation of clz  Aaron Watry  5  -11/+151
Untested... currently crashes in the same manner as add_sat.
2013-04-13  libclc: Add clamp(vec,scalar,scalar) and max(vec,scalar)  Aaron Watry  4  -0/+20
For any GENTYPE that isn't scalar, we need to implement a mixed vector/scalar version of clamp/max. This depends on the min() patches I sent to the list a few minutes ago.
2013-04-13  Merge branch 'min'  Aaron Watry  8  -0/+63
2013-04-13  libclc: Implement the min(vec,scalar) version of the min builtin. [min]  Aaron Watry  4  -0/+41
Checks if the current GENTYPE is scalar, and if not, then defines a separate implementation of the function which casts the second arg to vector before proceeding.
2013-04-11  Merge branch 'master' of git://people.freedesktop.org/~awatry/libclc  Aaron Watry  0  -0/+0
2013-04-11  libclc: implement initial version of min()  Aaron Watry  6  -0/+22
This doesn't handle the integer cases for min(vector, scalar).
2013-04-11  libclc: Fix libclc build for LLVM 3.3  Aaron Watry  1  -0/+12
LLVM moved a bunch of IR-related headers for version 3.3. This fixes the libclc build to follow suit.
2013-04-08  Add a another TODO note.  Aaron Watry  1  -0/+3
2013-04-08  Add a TODO note.  Aaron Watry  1  -0/+4
2013-04-08  Simplify rotate implementation a bit..  Aaron Watry  2  -21/+37
Much more understandable/readable as a result, and probably more efficient.
2013-04-08  libclc: implement rotate builtin  Aaron Watry  7  -0/+55
This implementation does a lot of bit shifting and masking. Suffice to say, this is somewhat suboptimal... but it does look to produce correct results (after the piglit tests were corrected for sign extension issues). Someone who knows LLVM better than I could re-write this more efficiently.
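For reference, the shift-and-mask idea behind a rotate can be sketched in Python for a 32-bit unsigned value. This is not the libclc implementation; `rotate_left32` is a hypothetical name, and reducing the shift count modulo 32 is an assumption of the sketch.

```python
def rotate_left32(x, n):
    """Rotate the 32-bit value x left by n bits.

    Bits shifted off the top are brought back in at the bottom.
    """
    x &= 0xFFFFFFFF
    n &= 31  # assumption: shift counts are taken modulo the bit width
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF
```

The final mask is what discards the bits that moved past bit 31, since Python integers do not overflow.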
2013-04-08  libclc: Move max builtin to shared/  Aaron Watry  11  -16/+10
Max(x,y) is available for all integer/floating types.
2013-04-08  libclc: Add clamp() builtin for integer/floating point  Aaron Watry  6  -0/+24
Created under a new shared/ directory for functions which are available for both integer and floating point types.
2013-04-08  libclc: Fix abs_diff builtin integer function  Aaron Watry  2  -1/+2
2013-04-08  libclc: Add max() builtin function  Aaron Watry  10  -0/+28
Adds this function for both int and floating data types.
2013-04-05  configure: Enable building separate libraries for target variants  Tom Stellard  1  -44/+75
2013-03-27  Add a another TODO note. [rotate]  Aaron Watry  1  -0/+3
2013-03-27  Add a TODO note.  Aaron Watry  1  -0/+4
2013-03-27  Simplify rotate implementation a bit..  Aaron Watry  2  -21/+37
Much more understandable/readable as a result, and probably more efficient.
2013-03-23  libclc: Add stub clz builtin  Aaron Watry  6  -0/+18
For scalar int/uint, attempt to use the clz llvm builtin.. for all others return 0 until an actual implementation is finished.
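As a reference for what a finished clz is expected to return, here is a small Python model for 32-bit values. It is not the stub or the LLVM implementation; `clz32` is a hypothetical name, and returning 32 for an input of zero is an assumption of this sketch.

```python
def clz32(x):
    """Count leading zero bits in the 32-bit representation of x."""
    x &= 0xFFFFFFFF
    # bit_length() is the position of the highest set bit (0 for x == 0).
    return 32 - x.bit_length()
```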
2013-03-23  libclc: implement rotate builtin  Aaron Watry  7  -0/+55
This implementation does a lot of bit shifting and masking. Suffice to say, this is somewhat suboptimal... but it does look to produce correct results (after the piglit tests were corrected for sign extension issues). Someone who knows LLVM better than I could re-write this more efficiently.
2013-03-20  libclc: Move max builtin to shared/  Aaron Watry  11  -16/+10
Max(x,y) is available for all integer/floating types.
2013-03-20  libclc: Add clamp() builtin for integer/floating point  Aaron Watry  6  -0/+24
Created under a new shared/ directory for functions which are available for both integer and floating point types.
2013-03-20  libclc: Fix abs_diff builtin integer function  Aaron Watry  2  -1/+2
2013-03-20  libclc: Add max() builtin function  Aaron Watry  10  -0/+28
Adds this function for both int and floating data types.
2013-03-20  configure: Enable building separate libraries for target variants  Aaron Watry  1  -44/+75
From: Tom Stellard <thomas.stellard at amd.com>