MiniCL
======
This small project contains a quick-and-dirty implementation of an OpenCL
run-time for gen6 and gen7. It has several limitations, and many features are
not implemented or supported.

How to build
------------
The project uses CMake with three profiles:
1/ Debug (-g)
2/ RelWithDebInfo (-g with optimizations)
3/ Release (only optimizations)

From the root directory of the project:
> mkdir build
> cd build
> ccmake ../ # to configure
> choose the build options you want
> press 'c' to configure and 'g' to generate the build files
> make

How to run
----------
The project comes with several tests. You need to set the environment
variable KISS_KERNEL_PATH to the location of the kernels. They are
located in ./kernels in the miniCL directory.
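
As an illustration of how a test could turn KISS_KERNEL_PATH into a full file
name, here is a minimal sketch in C. The helper names `join_kernel_path` and
`kernel_path`, and the error convention, are assumptions made for this example;
they are not taken from the miniCL sources.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write "<dir>/<file>" into out.
 * Returns 0 on success, -1 if dir is missing/empty or the result does not fit. */
int join_kernel_path(const char *dir, const char *file, char *out, size_t out_size)
{
    assert(file != NULL && out != NULL);
    if (dir == NULL || dir[0] == '\0')
        return -1;
    size_t len = strlen(dir);
    /* Avoid a double slash when the directory already ends with '/'. */
    const char *sep = (dir[len - 1] == '/') ? "" : "/";
    int n = snprintf(out, out_size, "%s%s%s", dir, sep, file);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Hypothetical helper: same as above, with the directory read from
 * the KISS_KERNEL_PATH environment variable. */
int kernel_path(const char *file, char *out, size_t out_size)
{
    return join_kernel_path(getenv("KISS_KERNEL_PATH"), file, out, out_size);
}
```

With this convention, a test would fail early with a clear error when the
variable is unset instead of trying to open a file relative to the current
directory.
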
How it works
------------
The complete code is just a loader for kernels *already* compiled by the Windows
run-time. The Windows team built a stand-alone executable named "TC_Tester.exe"
which builds a binary blob from an OCL kernel.
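
Since the run-time only loads precompiled blobs, its first step is reading
such a file into memory. The sketch below shows one plain way to do that in C.
The function name `load_blob` is hypothetical, the blob is treated as opaque
bytes (its layout is not described here), and an empty file is treated as an
error by assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a malloc'ed buffer.
 * Returns NULL on failure (including an empty file). On success,
 * *size_out holds the byte count and the caller frees the buffer. */
unsigned char *load_blob(const char *path, size_t *size_out)
{
    assert(path != NULL && size_out != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    unsigned char *data = NULL;
    long len = -1;
    if (fseek(f, 0, SEEK_END) == 0)
        len = ftell(f);                      /* file size in bytes */
    if (len > 0 && fseek(f, 0, SEEK_SET) == 0) {
        data = malloc((size_t)len);
        if (data != NULL && fread(data, 1, (size_t)len, f) != (size_t)len) {
            free(data);                      /* short read: give up */
            data = NULL;
        }
    }
    fclose(f);
    if (data != NULL)
        *size_out = (size_t)len;
    return data;
}
```
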
Limitations
-----------
- Some bugs may still be outstanding. I quickly hacked some values to make all
  tests pass. It should not be a big deal, but be prepared to debug nasty bugs
- No support for samplers / textures, but adding it should be rather easy since
  the low-level parts of the code already support them (used in another code
  base)
- No support for events
- It should be possible to push NDRangeKernels into _different_ queues from
  several threads, but this was never tested
- No support for Enqueue*Buffer. I added a straightforward extension to map /
  unmap buffers. This extension, "clIntelMapBuffer", maps directly to
  dri_bo_map, which is really convenient
  [Update]: Added support for EnqueueBuffer through a straightforward copy into
  host-allocated memory after mapping the buffer [Hari Thantry]
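
The map, copy, unmap pattern mentioned in the update above can be sketched as
follows. Everything here is a stand-in: `struct fake_bo`, `fake_bo_map`,
`fake_bo_unmap` and `read_buffer` are hypothetical names that emulate a buffer
object with a plain array, so the example does not reflect the actual
signatures of clIntelMapBuffer or dri_bo_map.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a GPU buffer object. In a real run-time the
 * mapping would come from the driver, not from a plain array. */
struct fake_bo {
    unsigned char storage[64];
    int mapped;
};

void *fake_bo_map(struct fake_bo *bo)   { bo->mapped = 1; return bo->storage; }
void  fake_bo_unmap(struct fake_bo *bo) { bo->mapped = 0; }

/* Read `size` bytes starting at `offset` into host memory:
 * map the buffer, copy, then unmap. Returns 0 on success, -1 if the
 * requested range falls outside the buffer. */
int read_buffer(struct fake_bo *bo, size_t offset, size_t size, void *dst)
{
    assert(bo != NULL && dst != NULL);
    if (offset > sizeof bo->storage || size > sizeof bo->storage - offset)
        return -1;
    const unsigned char *src = fake_bo_map(bo);
    memcpy(dst, src + offset, size);
    fake_bo_unmap(bo);
    return 0;
}
```

A write path would look the same with the direction of the memcpy reversed.
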
Fulsim
------
The code base supports integration with fulsim. While configuring and
compiling the project, you may choose to emulate one specific hardware.
Typically, if you choose EMULATE_IVB while running cmake, the CL run-time will
override all the internal identifiers related to the hardware to emulate an IVB
machine. To run fulsim, follow these steps:
- Get fulsim from our subversion server. We compile versions of it. They are all
  located in https://subversion.jf.intel.com/cag/gen/gpgpu/fulsim/
- Copy the fulsim executables you use to the directory where you are going to
  run the tests
- Get the specific version of libdrm that is able to output AUB files. Take it
  from:
  https://subversion.jf.intel.com/cag/gen/gpgpu/libdrm_fulsim/
  Then compile and install it
- Set up the environment variables for your emulated machine. I put a short
  script to emulate IVB at the root of miniCL. So just type:
  source setup_fulsim.sh LIVE_DEBUG
  LIVE_DEBUG is 0 or 1. When set to 0, the code just runs without stopping. When
  set to 1, the fulsim debugger is opened and you can step into the code and
  debug it easily. Just look at the script: it does nothing more than set
  variables
- Now run a test like:
  ./test_copy_buffer
  Normally, it will open fulsim at some point and you will see it running

Ben Segovia <benjamin.segovia@intel.com>