|
|
LLVM has reworked ParseCommandLineOptions, and calling it now causes a
double free, so remove the unneeded ParseCommandLineOptions call.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
The first patch, 192feb51, was rebased incorrectly and introduced a new
bug. Revert that patch and fix the original bug.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
SIMD_WIDTH.
It makes sense to set CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE to the
corresponding SIMD size. This gives OpenCL applications on Intel hardware a
way to get the SIMD width at runtime and makes some SIMD-width-dependent
optimizations possible.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
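One way an application might use the reported multiple is to pad its global
work size to it. This is a minimal sketch only: round_up_global_size is a
hypothetical helper, and the multiple is assumed to have been queried with
clGetKernelWorkGroupInfo and CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.

```c
#include <stddef.h>

/* Hypothetical helper: pad a global work size up to the next multiple of
 * the preferred work-group size multiple (the SIMD width on this driver).
 * `multiple` must be non-zero. */
size_t round_up_global_size(size_t global_size, size_t multiple)
{
    return (global_size + multiple - 1) / multiple * multiple;
}
```

With a SIMD width of 16, a request for 100 work items would be padded to 112,
and the kernel would need to ignore the extra items.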
|
|
LLVM 3.7 changed the LLVM IR, so two copies would be needed if llvm.memset
and llvm.memcpy were still implemented in LLVM IR. OpenCL C is also clearer.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
|
|
A function that has no parameters must use void explicitly.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
|
|
This fixes the datalayout mismatch warning in LLVM 3.7.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
|
|
Otherwise, createInstructionCombiningPass converts some calls to illegal
instructions in LLVM 3.7, for example in the utests compiler_time_stamp and
test_load_program_from_spir.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
|
|
Move all LLVM-related includes to llvm_includes.hpp.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
|
|
There is no logical relationship between the finish time and the map time,
so remove the condition.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
We failed to handle paths such as -I "/XX X/YY YY/" and defines such as
-DAAA=BBB"CC DDD"EEE in the build options.
We need to take the spaces into account and pass the options correctly
to Clang.
V4:
Fix a minor mistake.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
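The quoting problem described above can be sketched as a small tokenizer.
This is an illustration, not the code from the patch: split_build_options and
its fixed 128-byte token size are hypothetical, and it assumes shell-like
behaviour in which double quotes protect spaces and are then dropped.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch: split a build-option string on whitespace, except
 * inside double quotes. Quote characters are dropped (an assumption that
 * mirrors shell behaviour). Characters beyond 127 per token are discarded.
 * Returns the number of tokens written. */
size_t split_build_options(const char *opts, char tokens[][128],
                           size_t max_tokens)
{
    size_t count = 0;
    const char *p = opts;

    while (*p != '\0' && count < max_tokens) {
        /* Skip the whitespace that separates tokens. */
        while (*p != '\0' && isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;

        size_t len = 0;
        int in_quotes = 0;
        /* A token ends at whitespace that is not inside quotes. */
        while (*p != '\0' && (in_quotes || !isspace((unsigned char)*p))) {
            if (*p == '"')
                in_quotes = !in_quotes;
            else if (len + 1 < 128)
                tokens[count][len++] = *p;
            p++;
        }
        tokens[count][len] = '\0';
        count++;
    }
    return count;
}
```

Under these assumptions, the two examples from the message yield the tokens
-I, /XX X/YY YY/ and -DAAA=BBBCC DDDEEE.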
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
We define our own ArgInfo structure to ease the serialization
of the arguments.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
|
|
Since Linux 4.3, the kernel redefines the values of the MOCS table,
but kernels before 4.3 still use the hardware default values.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
|
|
Reported to fix a ~50% performance regression (in OpenCV 3.0 and
Luxmark 2.1 among others) with v4.3 kernels on Gen9 hardware.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92975
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
Special versions of the Linux kernel and libdrm are needed.
The utests and conformance tests passed.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Junyan He <junyan.he@linux.intel.com>
|
|
Return CL_INVALID_CONTEXT if the context associated with
command_queue and events in event_wait_list are not the same.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Luo Xionghu <xionghu.luo@intel.com>
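The check described above can be pictured roughly as follows. This is a
sketch under assumptions: contexts are reduced to opaque pointers, the
function name is hypothetical, and the two constants only mirror the values
of CL_SUCCESS and CL_INVALID_CONTEXT from CL/cl.h.

```c
#include <stddef.h>

/* Values mirror CL_SUCCESS and CL_INVALID_CONTEXT from CL/cl.h. */
#define SKETCH_CL_SUCCESS 0
#define SKETCH_CL_INVALID_CONTEXT (-34)

/* Hypothetical sketch: every event in the wait list must belong to the
 * same context as the command queue. */
int check_wait_list_contexts(const void *queue_ctx,
                             const void *const *event_ctxs,
                             size_t num_events)
{
    for (size_t i = 0; i < num_events; i++) {
        if (event_ctxs[i] != queue_ctx)
            return SKETCH_CL_INVALID_CONTEXT;
    }
    return SKETCH_CL_SUCCESS;
}
```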
|
|
LLVM 3.7 may generate cast instructions such as "%13 = uitofp i1 %12 to float".
When the dst type is float or double, the corresponding
newXXXimmediate function should be called.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
When p == end (the null terminator byte), don't try to read p + 1:
as this is outside the string, it might be a '%' from a different
object (causing __parse_printf_state(end + 2, end, ...) to be called,
which will fail), or an invalid address.
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: Pan, Xiuli <xiuli.pan@intel.com>
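The bounds rule above can be illustrated with a small scanner. This is not
the Beignet parser: count_conversions is a hypothetical function that only
counts '%' directives, with `end` pointing at the null terminator as in the
message.

```c
#include <stddef.h>

/* Hypothetical sketch: count '%' directives in [begin, end), treating "%%"
 * as a literal percent sign. `end` points at the null terminator. The
 * look-ahead at p[1] only happens when p + 1 is still before `end`, so
 * nothing at or past the terminator is ever read. */
size_t count_conversions(const char *begin, const char *end)
{
    size_t n = 0;
    const char *p = begin;

    while (p < end) {
        if (*p != '%') {
            p++;
            continue;
        }
        if (p + 1 < end && p[1] == '%') {
            p += 2;        /* escaped "%%" */
            continue;
        }
        n++;               /* start of a conversion, or a trailing '%' */
        p++;
    }
    return n;
}
```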
|
|
1. Need to support float.
2. Get the correct element type.
3. Should use ir::TYPE_U8 for byte stores.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
A uint32_t size is not enough for the upcoming larger
GPU memory; Gen9 now supports 4G buffers. Also add an
assertion for invalid sizes.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
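The reason a 32-bit size breaks at 4G can be shown with plain arithmetic.
The two helper names below are hypothetical and serve only to illustrate the
truncation.

```c
#include <stdint.h>

/* Hypothetical helper: what a 64-bit byte count becomes when it is stored
 * in a uint32_t. Unsigned conversion wraps modulo 2^32. */
uint32_t truncated_size(uint64_t bytes)
{
    return (uint32_t)bytes;
}

/* Hypothetical helper: true when the size still fits in 32 bits. */
int size_fits_u32(uint64_t bytes)
{
    return bytes <= UINT32_MAX;
}
```

A 4G buffer is exactly 2^32 bytes, so its truncated size is 0, which is why
the size fields need a wider type.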
|
|
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Luo Xionghu <xionghu.luo@intel.com>
|
|
If the float overflows, the conversion to long/ulong is undefined, so the max and min values of
long/ulong must be used as the return value.
Also refine the saturated conversion from long to other integer types. Use to statement to avoid generating if/else/endif.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
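The saturation rule above, written as host-side C for illustration. This is
a sketch, not the OpenCL C code from the patch: convert_long_sat_sketch is a
hypothetical name, it uses ordinary branches rather than the branch-free form
the message refers to, and mapping NaN to 0 is an assumption here.

```c
#include <stdint.h>

/* Hypothetical sketch of a saturated float -> long conversion. Values
 * outside the range of int64_t are clamped instead of being passed to the
 * cast, whose result would be undefined. 2^63 is exactly representable
 * as a float. */
int64_t convert_long_sat_sketch(float x)
{
    if (x != x)                          /* NaN (assumed to map to 0) */
        return 0;
    if (x >= 9223372036854775808.0f)     /* x >= 2^63 */
        return INT64_MAX;
    if (x <= -9223372036854775808.0f)    /* x <= -2^63 */
        return INT64_MIN;
    return (int64_t)x;                   /* in range: truncates toward zero */
}
```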
|
|
Currently, ByteGather generates IR as:
BYTE_GATHER(16) %109<0>:UD : %96<0,1,0>:UD 0x4:UD
MOV(1) %75<0>:UB : %109<32,8,4>:UB
Fix it to generate IR as:
BYTE_GATHER(16) %109<0>:UD : %96<0,1,0>:UD 0x4:UD
MOV(1) %75<0>:UB : %109<0,1,0>:UB
Otherwise, there is a regression in the local copy propagation optimization,
which uses %109<32,8,4>:UB.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
This is also a gpgpu event, which can cause leaks.
Just release it.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
We get an event out of NDRangeKernel, and we don't release it.
As a gpgpu event, it can also make drm buffers leak. To avoid
potential errors, we just release it.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Refine the event struct so that last_event becomes a list that stores
all uncompleted events, and update them on every queue flush. This
ensures that all events created in the runtime get a chance to update
their status, run their callback functions and then be deleted. This also
fixes the memory leak caused by uncompleted events.
This is a bugfix for https://bugs.freedesktop.org/show_bug.cgi?id=91710
The leaked events with GPU buffers will be unreferenced, causing other
drm buffers to leak and resulting in a severe memory leak.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
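The idea of keeping uncompleted events on a list and sweeping it on each
flush might look roughly like this. The struct and function are hypothetical
and much simpler than the real event code; "running the callback" is reduced
to setting a flag.

```c
#include <stddef.h>

/* Hypothetical, simplified event node. */
typedef struct pending_event {
    int complete;                 /* set once the GPU work has finished */
    int callback_ran;             /* stands in for running user callbacks */
    struct pending_event *next;
} pending_event;

/* Hypothetical sketch: on every flush, walk the list of uncompleted events,
 * finish those that are done and unlink them, so none is left behind.
 * Returns the number of events removed from the list. */
size_t flush_pending(pending_event **head)
{
    size_t removed = 0;
    pending_event **link = head;

    while (*link != NULL) {
        pending_event *ev = *link;
        if (ev->complete) {
            ev->callback_ran = 1;
            *link = ev->next;     /* unlink; a real runtime would release it here */
            ev->next = NULL;
            removed++;
        } else {
            link = &ev->next;     /* still pending: keep it for the next flush */
        }
    }
    return removed;
}
```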
|
|
This should be a typo: we should wait for the gpgpu and create the
node only if the batch buffer is busy.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
This is a drm-related bug. The drm driver changed the point at which it frees its test
userptr to bufmgr destroy (30921483c70c6939f017476eac13da6aa26b3b3c), so we need
a different order for releasing our driver to make sure the test userptr can be freed
with a valid fd.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
|
|
Fix the calculation of the current CPU monotonic raw timestamp in nanoseconds
for enqueued, submitted, start and finished, and send it to the application
based on the parameter queried.
Signed-off-by: Midhun Kodiyath <midhunchandra.kodiyath@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
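The nanosecond conversion itself is simple arithmetic. The sketch below
assumes the seconds and nanoseconds come from a struct timespec filled by
clock_gettime() with CLOCK_MONOTONIC_RAW on Linux; to_nanoseconds is a
hypothetical helper name.

```c
#include <stdint.h>

/* Hypothetical helper: combine the tv_sec and tv_nsec fields of a timespec
 * into a single nanosecond count. */
uint64_t to_nanoseconds(uint64_t sec, uint64_t nsec)
{
    return sec * 1000000000ULL + nsec;
}
```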
|
|
Catch the error: out of host memory.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
let's just keep things simple.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
suboffset() will not set .subnr correctly, as vec1() will get a horizontal
stride 0 register.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Either all or none of the programs specified by input_programs must contain a compiled binary or library
for the device. Otherwise return CL_INVALID_OPERATION.
Correct this condition check.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
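The all-or-none condition can be sketched as a simple count. The function
name and the flag array are hypothetical; in the runtime the flags would come
from inspecting each input program.

```c
#include <stddef.h>

/* Hypothetical sketch: has_binary[i] is non-zero when input program i holds
 * a compiled binary or library for the device. The set is acceptable only
 * if all of the programs have one or none of them does. */
int all_or_none_have_binary(const int *has_binary, size_t num_programs)
{
    size_t with_binary = 0;

    for (size_t i = 0; i < num_programs; i++) {
        if (has_binary[i])
            with_binary++;
    }
    return with_binary == 0 || with_binary == num_programs;
}
```

A caller would return CL_INVALID_OPERATION when this check fails.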
|
|
1. Return CL_INVALID_LINKER_OPTIONS when the options are invalid, using clang to check the options.
2. Return CL_INVALID_OPERATION when the binary types are not the same.
3. When linking failed, CL_LINK_PROGRAM_FAILURE was not returned; fix it.
4. Do not delete the program in genProgramBuildFromLLVM; the program is new and is deleted from the runtime.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
|
|
cl_buffer_get_subdata is sometimes extremely slow in the Linux kernel on skl and chv,
and the slowdown is random. Temporarily disable it and use map/copy/unmap to read instead.
It should be re-enabled once the root cause is found.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
This patch adds 2 new tests to the unit tests. It uses the existing
framework and data structures and tests the llvm/asm dump generation
when these flags (-dump-opt-llvm, -dump-opt-asm) are passed as build
options along with the dump file names.
Methods added:
1) get_build_llvm_info() tests LLVM dump generation
2) get_build_asm_info() tests ASM dump generation
Signed-off-by: Sirisha Gandikota <sirisha.gandikota@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
LLVM provides a powerful string-remapping feature that can be used
to map a string to an input file name, so we no longer need to create
a temporary cl source file.
This patch not only makes things much clearer and avoids the unnecessary
file creation, it also fixes some weird directory-related problems.
Beignet creates the temporary file in the /tmp directory, so
clang searches for include files in that directory by
default, but the developer expects it to search the working directory
first. This causes two weird things:
1. If a .cl file includes a .h file in the current directory, beignet
will not find it.
2. Even if the program adds a "-I." option manually, beignet will search /tmp
first, and if there is a .h file in /tmp/ with the exact same file
name, beignet will use the file located in /tmp.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
There is no NULL pointer check for kernel->program->build_opts.
This causes the utest test_get_arg_info to crash.
In fact, we add the -cl-kernel-arg-info flag for compiling
every time, so the arg info is always available.
But some test cases deliberately unset this flag and expect the ERR
return value, so we really need a check here.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
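A guarded lookup of the flag could be as small as the following. This is a
sketch: has_kernel_arg_info is a hypothetical helper, and only the NULL guard
in front of the string search is the point.

```c
#include <string.h>

/* Hypothetical sketch: report whether the build options request kernel
 * argument info. build_opts may be NULL when the program was built without
 * any options, so it is checked before strstr() is called. */
int has_kernel_arg_info(const char *build_opts)
{
    return build_opts != NULL &&
           strstr(build_opts, "-cl-kernel-arg-info") != NULL;
}
```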
|
|
1. Change the code for null param_value
2. Add the return value check for build option "-cl-kernel-arg-info"
3. Correct one return value typo
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
ENDIF should be treated as barrier-like instruction
in instruction scheduling.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
|
|
Need to take care of the uniform cases.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
We need to test reading and writing of large image 1d buffers.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
We should treat it as a 2D image, as an image 1d buffer may
exceed the 1D image size restriction.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
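Treating a long 1D buffer as a 2D image implies some mapping from a linear
index to a row and column. The sketch below shows one such mapping; the row
width and the names are assumptions, and the layout actually chosen by the
driver is not described in the message.

```c
#include <stddef.h>

typedef struct {
    size_t x;   /* column inside the row */
    size_t y;   /* row number */
} coord2d;

/* Hypothetical sketch: map a linear element index to 2D coordinates for a
 * given row width (in elements). `row_width` must be non-zero. */
coord2d buffer_index_to_2d(size_t index, size_t row_width)
{
    coord2d c;
    c.x = index % row_width;
    c.y = index / row_width;
    return c;
}
```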
|
|
Originally, the dst of simd_shuffle is not uniform, but if it is
optimized to a scalar, just use simd_width=1 to generate sel_op/asm.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|