|
specific hardware and propose a simple way to use them. We have four extensions
here:
- Gen register regions. This allows us to perform strided loads in the register
file. To implement that on top of OCL, the idea is to encapsulate them in a
function with a side effect. Not really clean, but it works.
- Gen gather from register file. Same idea, but here we simply gather data from
a bunch of registers.
- Vote any/all. This is basically the same idea as in PTX, i.e. uniform
predicates for branches.
- Block read/write. Just to play with uniform load/store messages.
I added a bunch of tests for all that and fixed things here and there to make
them work.
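
For readers unfamiliar with these operations, here is a rough CPU-side sketch in
plain C of what the register region, gather and vote extensions compute. It is
only a model of the semantics under my own assumptions: the function names
(region_read, gather_read, vote_any, vote_all), the flat int array standing in
for the register file and the run_example helper are all hypothetical and are
not the built-ins added by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: the register file is a flat array of ints. */

/* Strided region read: dst[i] = regs[offset + i * stride]. */
void region_read(const int *regs, size_t offset, size_t stride,
                 int *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = regs[offset + i * stride];
}

/* Gather: each lane picks its own register, dst[i] = regs[index[i]]. */
void gather_read(const int *regs, const size_t *index, int *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = regs[index[i]];
}

/* Vote any: true if the predicate holds in at least one lane. */
bool vote_any(const bool *pred, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (pred[i])
            return true;
    return false;
}

/* Vote all: true only if the predicate holds in every lane. */
bool vote_all(const bool *pred, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (!pred[i])
            return false;
    return true;
}

/* Small usage example; the values are made up for illustration. */
void run_example(void)
{
    int regs[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    int dst[4];

    /* Read every second register starting at register 1. */
    region_read(regs, 1, 2, dst, 4);
    assert(dst[0] == 1 && dst[1] == 3 && dst[2] == 5 && dst[3] == 7);

    /* Gather registers in an arbitrary order. */
    size_t index[4] = {7, 0, 3, 3};
    gather_read(regs, index, dst, 4);
    assert(dst[0] == 7 && dst[1] == 0 && dst[2] == 3 && dst[3] == 3);

    /* The vote result is the same for all lanes, so a branch on it is uniform. */
    bool pred[4] = {false, true, false, false};
    assert(vote_any(pred, 4));
    assert(!vote_all(pred, 4));
}
```

The point of the vote functions is that every lane sees the same result, so a
branch taken on it does not diverge.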
|