AgeCommit message (Collapse)AuthorFilesLines
2016-04-19Bump version to 1.1.1Release_v1.1.2Release_v1.1Yang Rong2-1/+4
2016-02-25Backend: Remove useless ParseCommandLineOptionsPan Xiuli1-14/+1
As llvm has refined ParseCommandLineOptions and calling it now causes a double-free problem, remove the useless ParseCommandLineOptions call. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2016-02-24Driver: Fix GPGPU delete bugPan Xiuli1-2/+2
The first patch, 192feb51, went wrong during a rebase and introduced a new bug. Fix the original bug and revert the wrong patch. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2016-01-27runtime: set CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE to kernel's SIMD_WIDTH.Zhigang Gong5-6/+13
It makes sense to set CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE to the corresponding SIMD size. This gives Intel OCL applications a way to get the SIMD width at runtime and makes some SIMD-width-dependent optimizations possible. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
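One optimization this could enable is rounding a global work size up to a multiple of the reported value. The following is a minimal Python sketch, assuming the multiple has already been queried from the runtime; the function name and the sample sizes are hypothetical and not part of beignet.

```python
def round_up_global_size(n_items: int, preferred_multiple: int) -> int:
    """Round n_items up to the next multiple of preferred_multiple.

    preferred_multiple stands in for the value an application would read
    from CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE (assumed here).
    """
    if preferred_multiple <= 0:
        raise ValueError("preferred_multiple must be positive")
    groups = (n_items + preferred_multiple - 1) // preferred_multiple
    return groups * preferred_multiple


if __name__ == "__main__":
    # With a SIMD16 kernel, 1000 work items are padded to 1008.
    print(round_up_global_size(1000, 16))
```

The kernel would then need to ignore the padded work items, for example by comparing the global id against the real item count.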
2016-01-26GBE: use opencl c to implement llvm.memset and llvm.memcpy.Yang Rong8-924/+181
llvm 3.7 changed the llvm IR, so two copies would be needed if llvm.memset and llvm.memcpy were still implemented in llvm IR. OpenCL C is also clearer. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
2016-01-26fix llvm3.7 compiler_function_qualifiers utest fail.Yang Rong1-2/+2
void must be used explicitly if a function has no parameters. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
2016-01-26GBE: Add datalayout and triple to ll files.Yang Rong2-0/+6
It can fix datalayout mismatch warning in llvm3.7. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
2016-01-26GBE: Move createStripAttributesPass before createInstructionCombiningPass.Yang Rong1-1/+1
Otherwise, createInstructionCombiningPass will convert some calls to illegal instructions in llvm3.7, for example in the utests compiler_time_stamp and test_load_program_from_spir. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
2016-01-26GBE: Add llvm3.7 support.Yang Rong19-453/+207
Move all llvm-related includes to llvm_includes.hpp. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
2016-01-25utest: correct a typo in compiler_cl_finish.cppGuo Yejun1-5/+2
there is no logical relationship between the time of finish and map, remove the condition. Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2016-01-25Fix the bug when we pass argument with spaces.Junyan He1-6/+67
We failed to handle paths like -I "/XX X/YY YY/" or defines like -DAAA=BBB"CC DDD"EEE in the build options. We need to account for the spaces here and pass these options correctly to Clang. V4: Fix a minor mistake. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
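The quoting problem can be illustrated with a small tokenizer that only splits on whitespace outside double quotes. This is a Python sketch of the general technique, not the code from the patch; the function name is hypothetical, and whether beignet strips the quotes before passing arguments to Clang is an assumption.

```python
from typing import List


def split_build_options(options: str) -> List[str]:
    """Split a build option string on whitespace outside double quotes.

    Quote characters are dropped (an assumption); quoted spaces are kept.
    """
    args: List[str] = []
    current: List[str] = []
    in_quotes = False
    has_token = False
    for ch in options:
        if ch == '"':
            in_quotes = not in_quotes
            has_token = True  # an empty "" still counts as an argument
        elif ch.isspace() and not in_quotes:
            if has_token:
                args.append("".join(current))
                current, has_token = [], False
        else:
            current.append(ch)
            has_token = True
    if has_token:
        args.append("".join(current))
    return args


if __name__ == "__main__":
    print(split_build_options('-I "/XX X/YY YY/" -DAAA=BBB"CC DDD"EEE'))
```

Splitting naively on spaces would instead break the include path and the define into several meaningless arguments.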
2016-01-19Add the serializeToBin and deserializeFromBin for kernel arg info.Junyan He1-0/+55
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2016-01-19Backend: Use KernelArgument::ArgInfo to replace llvm's arg info.Junyan He3-4/+17
We define our own ArgInfo structure to ease the serialization of the argument. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2016-01-19Backend: Fix the bug of printf in multiple kernels within one file.Junyan He1-9/+18
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2016-01-05SKL: use the hw default value mocs index before linux 4.3.Yang Rong1-1/+15
From linux 4.3 on, the kernel redefined the mocs table's values, but before 4.3 the hw default values were still used. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2016-01-05SKL: Use kernel-defined MOCS values instead of assuming hardware defaults.Francisco Jerez1-2/+2
Reported to fix a ~50% performance regression (in OpenCV 3.0 and Luxmark 2.1 among others) with v4.3 kernels on Gen9 hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92975 Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2015-12-18add Broxton supportGuo Yejun12-10/+208
special versions of linux kernel and libdrm are needed. utest and conformance test PASSED. Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Junyan He <junyan.he@linux.intel.com>
2015-12-18Runtime: return the correct error code in cl_event_check_waitlist.Yang Rong1-2/+4
Return CL_INVALID_CONTEXT if the context associated with command_queue and events in event_wait_list are not the same. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Luo Xionghu <xionghu.luo@intel.com>
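The check described above can be sketched in Python as follows. The error code values match the OpenCL headers; the WaitEvent class and the string context handles are hypothetical stand-ins for the runtime's own structures.

```python
from dataclasses import dataclass
from typing import List

CL_SUCCESS = 0
CL_INVALID_CONTEXT = -34


@dataclass
class WaitEvent:
    context: str  # stand-in for a cl_context handle (hypothetical)


def check_waitlist_context(queue_context: str, wait_list: List[WaitEvent]) -> int:
    """Return CL_INVALID_CONTEXT if any event belongs to another context."""
    for event in wait_list:
        if event.context != queue_context:
            return CL_INVALID_CONTEXT
    return CL_SUCCESS


if __name__ == "__main__":
    events = [WaitEvent("ctx_a"), WaitEvent("ctx_b")]
    print(check_waitlist_context("ctx_a", events))
```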
2015-12-18gbe: fix uitofp instruction issue.Luo Xionghu1-1/+11
llvm 3.7 may generate cast instructions such as "%13 = uitofp i1 %12 to float"; when the dst type is float or double, the corresponding newXXXimmediate function should be called. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-12-18GBE: Don't read past end of printf format stringRebecca N. Palmer1-1/+1
When p == end (the null terminator byte), don't try to read p + 1: as this is outside the string, it might be a '%' from a different object (causing __parse_printf_state(end + 2, end, ...) to be called, which will fail), or an invalid address. Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com> Reviewed-by: Pan, Xiuli <xiuli.pan@intel.com>
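The same kind of bounds check can be shown on a simplified scanner. This Python sketch is not the beignet parser; it only counts conversion specifiers, with a hypothetical function name, and looks at position p + 1 only when that position is still inside the string.

```python
def count_conversions(fmt: str) -> int:
    """Count '%' conversions in fmt, treating '%%' as a literal percent."""
    count = 0
    p, end = 0, len(fmt)
    while p < end:
        if fmt[p] == "%":
            # Only inspect p + 1 when it is still inside the string.
            if p + 1 < end and fmt[p + 1] == "%":
                p += 2  # skip the escaped percent sign
                continue
            count += 1
        p += 1
    return count


if __name__ == "__main__":
    print(count_conversions("value=%d, ratio=%f%%"))
```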
2015-12-18GBE: Fix unaligned load/store issues.Ruiling Song1-4/+5
1. need support float. 2. get correct element type. 3. should use ir::TYPE_U8 for byte store. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2015-12-18drivers: change the buf size to size_tPan Xiuli2-9/+12
A uint32_t size is not enough for the coming larger GPU memory; GEN9 now supports 4G buffers. Also add an assertion for invalid sizes. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
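To see why a 32-bit size is too small, the sketch below emulates uint32_t truncation in Python. It assumes "4G" means 4 GiB (2^32 bytes); the helper name is hypothetical.

```python
UINT32_MAX = 2**32 - 1
FOUR_GIB = 4 * 1024**3  # assumed meaning of "4G"


def truncate_to_uint32(size: int) -> int:
    """Emulate storing a byte count in a uint32_t field."""
    return size & UINT32_MAX


if __name__ == "__main__":
    # A 4 GiB request wraps around to 0 when stored in 32 bits.
    print(truncate_to_uint32(FOUR_GIB))
    # The largest size a uint32_t can hold is one byte short of 4 GiB.
    print(FOUR_GIB - UINT32_MAX)
```

A size_t on a 64-bit host holds the full value, which is the change the commit describes.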
2015-12-18Runtime: add CL_DEVICE_SPIR_VERSIONS to clGetDeviceInfo.Yang Rong3-0/+4
Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Luo Xionghu <xionghu.luo@intel.com>
2015-12-18LibOcl: Fix float convert to long/ulong bug.Yang Rong1-3/+23
If the float overflows, the conversion to long/ulong is undefined, so the max and min values of long/ulong must be used as the return value. Also refine the saturated conversion from long to other integer types. Use to statement to avoid generate if/else/endif. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
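The clamping behaviour can be sketched in Python for the signed case. This is an illustration of the idea, not the LibOcl implementation; the function name is hypothetical, and mapping NaN to 0 is an assumption that the commit message does not state.

```python
import math

LONG_MIN = -(2**63)
LONG_MAX = 2**63 - 1


def float_to_long_clamped(x: float) -> int:
    """Convert x to a 64-bit signed integer, clamping on overflow."""
    if math.isnan(x):
        return 0  # assumption: NaN maps to 0
    if x >= 2.0**63:
        return LONG_MAX
    if x <= -(2.0**63):
        return LONG_MIN
    return int(x)  # in range: truncate toward zero


if __name__ == "__main__":
    print(float_to_long_clamped(1e30))
    print(float_to_long_clamped(-1e30))
    print(float_to_long_clamped(-2.75))
```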
2015-12-18fix uniform case for ByteGatherGuo Yejun1-2/+2
Currently, ByteGather generates IR as:
  BYTE_GATHER(16) %109<0>:UD : %96<0,1,0>:UD 0x4:UD
  MOV(1) %75<0>:UB : %109<32,8,4>:UB
Fix it to generate IR as:
  BYTE_GATHER(16) %109<0>:UD : %96<0,1,0>:UD 0x4:UD
  MOV(1) %75<0>:UB : %109<0,1,0>:UB
Otherwise, there is a regression in the local copy propagation optimization, which uses %109<32,8,4>:UB. Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
2015-12-18utests: event should be releasedPan Xiuli1-0/+1
This is also a gpgpu event, which can cause leaks. Just release it. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-12-18Fix an event leak in create contextPan Xiuli1-0/+1
We get an event out of NDRangeKernel and don't release it. As a gpgpu event it can also make a drm buffer leak, so to avoid a potential error we just release it. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-12-18runtime: refine the last_event in queue to a listPan Xiuli3-24/+55
Refine the event struct so that last_event becomes a list that stores all uncompleted events and is updated on every queue flush. This makes sure all events created in the runtime have a chance to update their status, run their callback functions and then be deleted. It also fixes the memory leak problem caused by uncompleted events. This is a bugfix for https://bugs.freedesktop.org/show_bug.cgi?id=91710 The leaked events with gpu buffers will be unreferenced and cause other drm buffer leaks, resulting in a terrible memory leak. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
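The idea of tracking every uncompleted event and revisiting the list on each flush can be sketched as below. This is a simplified Python model with hypothetical class and attribute names, not the runtime's actual event structures.

```python
class TrackedEvent:
    """Toy event: becomes releasable once complete and callbacks have run."""

    def __init__(self, name: str):
        self.name = name
        self.complete = False
        self.callbacks_run = False

    def update_status(self) -> bool:
        # Run callbacks exactly once, when the event is first seen complete.
        if self.complete and not self.callbacks_run:
            self.callbacks_run = True
        return self.complete


class CommandQueue:
    def __init__(self):
        self.pending = []  # list of uncompleted events, not a single last_event

    def enqueue(self, event: TrackedEvent) -> None:
        self.pending.append(event)

    def flush(self) -> None:
        # Give every pending event a chance to update; drop finished ones.
        self.pending = [e for e in self.pending if not e.update_status()]


if __name__ == "__main__":
    queue = CommandQueue()
    first, second = TrackedEvent("first"), TrackedEvent("second")
    queue.enqueue(first)
    queue.enqueue(second)
    first.complete = True
    queue.flush()
    print([e.name for e in queue.pending])
```

With a single last_event slot, enqueueing the second event would have overwritten the reference to the first one before it could be updated and released.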
2015-12-18Fix gpgpu node related bugPan Xiuli1-1/+1
This should be a typo, we should wait for the gpgpu and create node only if the batch buffer is busy. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-12-18Driver: fix the annoying "Failed to release userptr..." error messagePan Xiuli1-2/+4
It is a drm-related bug. As the drm driver changed the time at which it frees its test userptr to bufmgr destroy (30921483c70c6939f017476eac13da6aa26b3b3c), we need another order for releasing our driver to make sure the test userptr can be freed with a valid fd. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-10-08Bump version to 1.1.1Release_v1.1.1Yang Rong2-1/+4
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
2015-09-24Calculate appropriate timestamps for cl profileMidhun Kodiyath3-4/+71
Fix to calculate the current cpu monotonic raw timestamp in nanoseconds for enqueued, submitted, start and finished, and send this to the application based on the parameter queries. Signed-off-by: Midhun Kodiyath <midhunchandra.kodiyath@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-21should check the return value of cl_program_new.Luo Xionghu1-0/+18
Catch the error: out of host memory. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-21GBE: Minor refine uw1grf(nr, subnr).Ruiling Song1-1/+7
let's just keep things simple. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-21GBE: fix ub1grf(nr, subnr) issue.Ruiling Song1-1/+7
suboffset() will not set .subnr correctly, as vec1() will get a horizontal stride 0 register. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-21Fix clLinkProgram error.Yang Rong2-16/+29
Either all or none of the programs specified by input_programs must contain a compiled binary or library for the device. Otherwise return CL_INVALID_OPERATION. Correct this condition check. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
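The all-or-none rule can be expressed compactly. In this Python sketch the list of booleans is a hypothetical stand-in for "this input program has a compiled binary or library for the device"; the error code value matches the OpenCL headers.

```python
from typing import List

CL_SUCCESS = 0
CL_INVALID_OPERATION = -59


def check_link_inputs(has_binary: List[bool]) -> int:
    """Pass only if all inputs, or none of them, have a binary for the device."""
    if all(has_binary) or not any(has_binary):
        return CL_SUCCESS
    return CL_INVALID_OPERATION


if __name__ == "__main__":
    print(check_link_inputs([True, True]))
    print(check_link_inputs([True, False]))
```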
2015-09-18Fix piglit clLinkProgram fail.Yang Rong7-9/+80
1. Return CL_INVALID_LINKER_OPTIONS for invalid options, using clang to check the options.
2. Return CL_INVALID_OPERATION when the binary types are not the same.
3. When linking fails, CL_LINK_PROGRAM_FAILURE is not returned; fix it.
4. Do not delete the program in genProgramBuildFromLLVM; the program is created and deleted by the runtime.
Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
2015-09-18Don't use cl_buffer_get_subdata in clEnqueueReadBuffer.Yang Rong1-1/+4
cl_buffer_get_subdata is sometimes very, very slow in the linux kernel on skl and chv, and the slowdown is random. So temporarily disable it and use map/copy/unmap to read. It should be re-enabled once the root cause is found. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
2015-09-18GBE: fix build error with LLVM 3.5 and previous version.Zhigang Gong1-1/+6
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-18GBE: add check dumpASMFileName.empty()Ruiling Song1-5/+8
Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-18utests: Added unit tests to test LLVM and ASM dump generation.Sirisha Gandikota1-0/+107
This patch adds 2 new tests to the unit tests. It uses the existing framework and data structures and tests the llvm/asm dump generation when these flags (-dump-opt-llvm, -dump-opt-asm) are passed as build options along with the dump file names. Methods added: 1) get_build_llvm_info() tests LLVM dump generation 2) get_build_asm_info() tests ASM dump generation Signed-off-by: Sirisha Gandikota <sirisha.gandikota@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2015-09-18GBE: Use addRemappedFile to avoid creating temporary cl source file.Zhigang Gong1-30/+10
LLVM provides a powerful string-remapping feature that can map a string to an input file name, so we no longer need to create a temporary cl source file. This patch not only makes things much clearer and avoids the unnecessary file creation, it also fixes some weird directory-related problems. Beignet created the temporary file in the /tmp directory, so clang searched for include files in that directory by default, while the developer expects it to search the working directory first. This caused two weird things: 1. If a .cl file includes a .h file in the current directory, beignet will not find it. 2. Even if the program adds a "-I." option manually, beignet will search /tmp first, and if there is a .h file in /tmp/ with the exact same file name, beignet will use the file located in /tmp. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
2015-09-18Utest: Add -cl-kernel-arg-info to the utest test_get_arg_infoJunyan He1-1/+1
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-18Runtime: Add NULL pointer check in clGetKernelArgInfoJunyan He1-1/+2
There is no NULL pointer check for kernel->program->build_opts. This causes the utest test_get_arg_info to crash. In fact, we add the -cl-kernel-arg-info flag for every compilation, so the arg info is always available. But some test cases deliberately unset this flag and expect the ERR return value, so we really need a check here. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
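The guard can be modelled in Python with None standing in for a NULL build_opts pointer. The function name is hypothetical, and the use of CL_KERNEL_ARG_INFO_NOT_AVAILABLE as the returned error is an assumption based on the OpenCL specification rather than on the patch itself.

```python
from typing import Optional

CL_SUCCESS = 0
CL_KERNEL_ARG_INFO_NOT_AVAILABLE = -19


def arg_info_status(build_opts: Optional[str]) -> int:
    """Check for missing build options before searching them for the flag."""
    # Testing for None first mirrors the NULL pointer check; without it,
    # the substring search below would fail on a missing option string.
    if build_opts is None or "-cl-kernel-arg-info" not in build_opts:
        return CL_KERNEL_ARG_INFO_NOT_AVAILABLE
    return CL_SUCCESS


if __name__ == "__main__":
    print(arg_info_status(None))
    print(arg_info_status("-cl-kernel-arg-info -DX=1"))
```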
2015-09-18Fix clGetKernelArgInfo fail on piglitPan Xiuli2-9/+13
1. Change the code for null param_value. 2. Add the return value check for build option "-cl-kernel-arg-info". 3. Correct one return value typo. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-18GBE: a potential bug in instruction scheduling.Zhigang Gong1-1/+5
ENDIF should be treated as barrier-like instruction in instruction scheduling. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
2015-09-18GBE: one minor bug in OP_SIMD_XXX.Zhigang Gong1-1/+7
Need to take care of the uniform cases. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-18utests: refine image 1d buffer test case.Zhigang Gong2-53/+32
We need to test reading and writing large image 1d buffers. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-18GBE: fix the broken image_1d_buffer write.Zhigang Gong1-1/+13
We should treat it as a 2D image, as an image 1d buffer may exceed the 1D image size restriction. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-09-18correct simd width when dst of simd_shuffle is scalarGuo Yejun1-0/+5
originally, the dst of simd_shuffle is not uniform, but if it is optimized as scalar, just use simd_width=1 to generate sel_op/asm Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>