Age | Commit message | Author | Files | Lines
2014-09-26 | GBE: structurized loop exit needs an extra branching instruction when doing reordering. [HEAD, master] | Zhigang Gong | 3 | -14/+38
When we reorder the BBs and move an unstructured BB out of a structured block, we need to add a BRA to the block. If the exit of the structured block is a loop, we need to append an unconditional BRA right after the predicated BRA. Otherwise, we may lose the correct successor if an unstructured BB is moved next to this BB.
After this patch, with loop optimization enabled, there is no regression in either utests or piglit. But there are still a few regressions in the opencv test suite:
[----------] Global test environment tear-down
[==========] 8 tests from 2 test cases ran. (40041 ms total)
[ PASSED ] 2 tests.
[ FAILED ] 6 tests, listed below:
[ FAILED ] OCL_Photo/FastNlMeansDenoising.Mat/2, where GetParam() = (Channels(2), false)
[ FAILED ] OCL_Photo/FastNlMeansDenoising.Mat/3, where GetParam() = (Channels(2), true)
[ FAILED ] OCL_Photo/FastNlMeansDenoisingColored.Mat/0, where GetParam() = (Channels(3), false)
[ FAILED ] OCL_Photo/FastNlMeansDenoisingColored.Mat/1, where GetParam() = (Channels(3), true)
[ FAILED ] OCL_Photo/FastNlMeansDenoisingColored.Mat/2, where GetParam() = (Channels(4), false)
[ FAILED ] OCL_Photo/FastNlMeansDenoisingColored.Mat/3, where GetParam() = (Channels(4), true)
So let's keep this optimization disabled. I will enable it once all the known issues are fixed.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Luo <xionghu.luo@intel.com>
2014-09-26 | GBE: fix a loop header file including bug. | Zhigang Gong | 1 | -1/+0
function.hpp doesn't need to include the structural_analysis.hpp. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Luo <xionghu.luo@intel.com>
2014-09-26 | Use instruction WHILE to manipulate structure. | Luo Xionghu | 8 | -9/+61
1. The WHILE instruction should be non-schedulable. 2. If this WHILE instruction jumps to an ELSE instruction, the distance needs to be increased by 2. v2: We also need to take care of HSW for the WHILE instruction. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-26 | add handleSelfLoopNode to insert while instruction on Gen IR level. | Luo Xionghu | 4 | -11/+39
v2: disable loop optimization by default because it is still buggy. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-26 | Add Gen IR WHILE. | Luo Xionghu | 3 | -1/+9
Add Gen IR WHILE to mark the structured region. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-18 | GBE/libocl: Add __gen_ocl_get_timestamp() to get timestamp. | Ruiling Song | 18 | -3/+343
Gen provides the tm0 register for intra-kernel profiling. Here we provide an API, __gen_ocl_get_timestamp(), to return the timestamp in TM. The return type is defined as: struct time_stamp { ulong tick; uint event; }; 'tick' is a 64-bit time tick. 'event' stores a value that indicates whether a tmEvent has occurred (non-zero) or not (0). A tmEvent is a time-impacting event, such as a context switch or frequency change, since the last time tm0 was read. I added a sample in kernels/compiler_time_stamp.cl. I hope it helps you understand how to use it. V2: Introduce ir::ARFRegister to avoid direct use of nr/subnr in Gen IR. Rename __gen_ocl_extract_reg to __gen_ocl_region. Rename beignet_get_time_stamp to __gen_ocl_get_timestamp. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
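The timestamp layout described in this commit can be mirrored on the host as a rough sketch. The struct fields follow the commit message; the helper `elapsed_ticks` is hypothetical and not part of beignet. It assumes that tm0 was not read between the two samples, so a non-zero `event` on the second sample means the interval was disturbed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side mirror of the struct named in the commit message:
 * a 64-bit tick and a 32-bit event flag. */
struct time_stamp {
    uint64_t tick;   /* 64-bit time tick taken from tm0 */
    uint32_t event;  /* non-zero if a tmEvent occurred since the last read */
};

/* Hypothetical helper (an assumption, not a beignet API): stores the tick
 * difference in *ticks and returns true only when the second sample reports
 * no tmEvent, i.e. no context switch or frequency change was flagged. */
static bool elapsed_ticks(struct time_stamp start, struct time_stamp end,
                          uint64_t *ticks)
{
    *ticks = end.tick - start.tick;
    return end.event == 0;
}
```

A caller would typically discard or repeat a measurement for which the helper returns false.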
2014-09-18 | GBE/libocl: fix build dependency issue. | Zhigang Gong | 1 | -2/+2
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com>
2014-09-18 | Add long support for printf | Junyan He | 3 | -12/+38
V2: Replace all the long and ulong with int64_t. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-18 | GBE: Output linkModules's error message. | Ruiling Song | 1 | -2/+3
Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Junyan He <junyan.he@linux.intel.com>
2014-09-17 | fix utest memory leak. | Luo Xionghu | 1 | -2/+3
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-17 | fix one bug at cl_get_kernel_workgroup_info. | Luo Xionghu | 1 | -0/+1
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-17 | Revert "improve the build performance of vector type built-in function." | Zhigang Gong | 1 | -39/+6
This patch has to stay pending until the wide integer issue is fixed completely. Although we have a fallback mechanism which tries to build the module again while skipping some passes to avoid the wide integer issue, it is broken now on the master branch, because all the builtin functions are now built statically and that bitcode may already contain i128/i512 etc. This reverts commit 565d1eb00d9a5219c2848b3674e40ac07cb48b89.
2014-09-16 | improve the build performance of vector type built-in function. | Luo Xionghu | 1 | -6/+39
this patch was lost during the libocl merge. resubmit it to improve the vector function performance. please refer to e2db890596eea0a6eb741e11e576a38952f1ed1e for detail. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-16 | remove the LinkOnceAnyLinkage since the libocl is introduced. | Luo Xionghu | 2 | -15/+2
no need to set the LinkOnceAnyLinkage for global variables and functions to avoid redefinition. v2: also enable the VerifierPass. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-16 | Fix the bug where LLVM_LFLAGS fails to be set | Junyan He | 2 | -3/+4
LLVM_LFLAGS is used before the LLVM package is found, which causes CMake to fail to set the correct -L flags and leads to a link error. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-16 | GBE/libocl: fix a regression after libocl change. | Zhigang Gong | 1 | -4/+4
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com>
2014-09-16 | GBE/libocl: add missing vector builtin definition for fma. | Zhigang Gong | 1 | -1/+1
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com>
2014-09-15 | Modify the CMakeList to use the internal PCH first. | Junyan He | 2 | -4/+4
Because we removed the validation of the PCH file, the PCH in the system dir is sometimes incompatible with clang and causes a crash. We should always use the internal PCH when compiling. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-15 | Update NEWS. | Zhigang Gong | 1 | -0/+3
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-15 | Remove out-of-date document. | Zhigang Gong | 3 | -99/+47
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-15 | GBE/libocl: Fix sub_sat corner case. | Ruiling Song | 2 | -2/+2
It seems that the hw returns a wrong result when y is equal to 0x80000000 in sub_sat(int x, int y). So we rewrite it as: add_sat(add_sat(0x7fffffff, x), 1). Also enable the corresponding utest. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
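The identity behind this rewrite can be checked on the host with a small sketch. For y equal to 0x80000000 (INT32_MIN), x - y equals x + 0x7fffffff + 1, and applying the two additions as saturating steps gives the same result as a saturating subtract. The functions below are reference implementations written for illustration; they are assumptions, not the libocl code.

```c
#include <stdint.h>

/* Reference saturating add: compute in 64 bits, then clamp to int32 range. */
static int32_t add_sat(int32_t a, int32_t b)
{
    int64_t r = (int64_t)a + (int64_t)b;
    if (r > INT32_MAX) return INT32_MAX;
    if (r < INT32_MIN) return INT32_MIN;
    return (int32_t)r;
}

/* Reference saturating subtract, used only to check the rewrite. */
static int32_t sub_sat_ref(int32_t x, int32_t y)
{
    int64_t r = (int64_t)x - (int64_t)y;
    if (r > INT32_MAX) return INT32_MAX;
    if (r < INT32_MIN) return INT32_MIN;
    return (int32_t)r;
}

/* The rewrite quoted in the commit message, for the y == INT32_MIN case:
 * x - (-2^31) == x + 0x7fffffff + 1, with each step saturating. */
static int32_t sub_sat_y_min(int32_t x)
{
    return add_sat(add_sat(0x7fffffff, x), 1);
}
```

For negative x the inner add does not saturate and the result is exact; for non-negative x the true difference exceeds INT32_MAX, so both forms clamp to INT32_MAX.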
2014-09-15 | fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression. | Luo Xionghu | 1 | -6/+7
work_group_size_hint should define another variable. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-15 | Update readme. | Zhigang Gong | 1 | -54/+58
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-12 | Enable ICC and CLANG compiler for beignet | Lv Meng | 3 | -70/+39
'COMPILER' selects which compiler to use; the default is GCC. Signed-off-by: Lv Meng <meng.lv@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-12 | GBE: fix multiple files compilation bugs. | Zhigang Gong | 2 | -3/+8
If we want to link multiple files together, and one kernel function needs to refer to kernel functions in other files, we must not mark those functions with the link-once attribute. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com> Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-12 | fix piglit get kernel info FUNCTION ATTRIBUTE fail. | Luo | 13 | -0/+100
The backend needs to return the kernel FUNCTION ATTRIBUTE message to clGetKernelInfo. There are 3 kinds of function attribute so far; the vec_type_hint parameter cannot be returned because LLVM lacks that info. Signed-off-by: Luo <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-12 | runtime: fix build status handling. | Zhigang Gong | 3 | -23/+35
According to the spec, the build status "Returns the build, compile or link status, whichever was performed last on program for device." The previous implementation only considered clBuildProgram and didn't consider compile. Now fix it. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com> Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-12 | runtime: fix program binary type bug. | Zhigang Gong | 1 | -0/+3
If the binary is an executable type, the first byte is zero and we need to set the binary type correctly to CL_PROGRAM_BINARY_TYPE_EXECUTABLE. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com> Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
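The rule stated in this commit, that a zero first byte marks an executable binary, could be sketched as below. The enum values are local stand-ins for the CL_PROGRAM_BINARY_TYPE_* constants (the real ones come from CL/cl.h), and `classify_binary` is a hypothetical name, not the runtime's function. How other binary kinds are recognized is not described in the commit message, so the sketch leaves them as "other".

```c
#include <stddef.h>

/* Local stand-ins so the sketch builds without the OpenCL headers. */
enum binary_kind {
    BIN_OTHER = 0,      /* anything the commit message does not describe */
    BIN_EXECUTABLE = 1  /* stands in for CL_PROGRAM_BINARY_TYPE_EXECUTABLE */
};

/* Hypothetical classifier: a non-empty binary whose first byte is zero is
 * treated as an executable, per the commit message. */
static enum binary_kind classify_binary(const unsigned char *binary,
                                        size_t size)
{
    if (binary == NULL || size == 0)
        return BIN_OTHER;
    if (binary[0] == 0)
        return BIN_EXECUTABLE;
    return BIN_OTHER;
}
```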
2014-09-12 | Update license disclaimer. | Yang Rong | 1 | -32/+28
LunarGLASS has updated its copyright, so update the copyright in llvm_scalarize.cpp. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-11 | GBE: don't enable double by default. | Zhigang Gong | 1 | -1/+5
Actually, we don't fully support double at the moment, so let's disable it for now. This introduces a small incompatibility with the 1.2 spec, which doesn't require the kernel to use the following pragma to enable fp64: #pragma OPENCL EXTENSION cl_khr_fp64 : enable If the application wants to try the partially supported double with beignet under OpenCL 1.2, the application will still need to add the above pragma. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
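As an illustration of what an application would need after this change, here is kernel source held in a host-side C string with the pragma as its first line. Only the pragma comes from the commit message; the kernel name `scale` and its body are hypothetical.

```c
/* OpenCL C source kept in a host string, ready to be handed to the runtime.
 * The pragma must come before any use of the double type. The kernel itself
 * is a made-up example. */
static const char *const fp64_kernel_src =
    "#pragma OPENCL EXTENSION cl_khr_fp64 : enable\n"
    "__kernel void scale(__global double *data, double factor)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    data[i] = data[i] * factor;\n"
    "}\n";
```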
2014-09-11 | GBE: fix a potential memory leak bug. | Zhigang Gong | 1 | -0/+1
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-11 | GBE: Fix a potential segfault. | Zhigang Gong | 1 | -1/+2
When we fail to compile a module, the fileName may be NULL, so we can't access it unconditionally. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-11 | GBE: don't return error if we get an empty module. | Zhigang Gong | 1 | -1/+1
When compiling an empty string, we may get an empty module, which is not an error. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-11 | fix piglit cl-api-set-kernel-arg fail. | Luo Xionghu | 3 | -1/+20
The memory object should be checked for validity against the context's buffers before being set as a kernel argument. v2: rename the function from mem_in_buffers to is_valid_mem, and move the magic header check into it. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-11 | fix clGetKernelWorkGroupInfo built-in kernel fail. | Luo Xionghu | 6 | -0/+73
Add the CL_KERNEL_GLOBAL_WORK_SIZE option for clGetKernelWorkGroupInfo. v2: it should return the max global work size instead of the current work size. This function needs to return CL_INVALID_VALUE if the device is not a custom device or the kernel is not a built-in kernel. We have 3 kinds of built-in kernels for 1d/2d/3d memories; the max global work size is decided by the dimension and memory type. The piglit failure is caused by calling non-built-in kernels, so a patch needs to be sent to piglit later. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-11 | GBE/libocl: Added one missing prototype fma(). | Zhigang Gong | 1 | -0/+1
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-11 | GBE: fix bugs when handling -cl-std option. | Zhigang Gong | 2 | -5/+9
Actually, CLANG does take this option and we should not filter it out. We also change the default option used to create the PCH file to -cl-std=CL1.2. If the user passes in CL1.1, we have to disable PCH. Another change is that if we are CL1.2, we should enable cl_khr_fp64 by default, since from CL1.2 on this extension should be enabled by default. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-10 | Add the switch logic for math conformance fast path | Junyan He | 5 | -12/+20
Modify the __ocl_math_fastpath_flag init value in the backend link stage to switch between fast path and conformance path. V2: Rename the function prototype parameter name. V3: Modify the parameter to boolean and correct some comment words. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-10 | GBE/libocl: fix the wrong prototype of scalar native_powr. | Zhigang Gong | 1 | -1/+1
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Junyan He <junyan.he@linux.intel.com>
2014-09-10 | Fix the issue of -cl-std=CLX.X option. | Junyan He | 3 | -8/+82
The -cl-std= option specifies the minimum version required to compile the source code provided to our API. So we need to check it early and return failure if our platform's version cannot meet the request. In the backend, we just ignore this command line option. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
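The early check described in this commit could look roughly like the sketch below. The function name `check_cl_std`, its parameters, and the 0 / -1 return convention are assumptions made for illustration; the actual runtime would return an OpenCL error code and may parse the option differently.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: looks for "-cl-std=CL<major>.<minor>" in the build
 * options and compares it with the platform version.
 * Returns 0 when the option is absent or the platform is new enough,
 * and -1 when the version string is malformed or the platform is too old. */
static int check_cl_std(const char *options, int plat_major, int plat_minor)
{
    const char *p = options ? strstr(options, "-cl-std=") : NULL;
    int major = 0, minor = 0;

    if (p == NULL)
        return 0;  /* no version requested, nothing to check */
    if (sscanf(p, "-cl-std=CL%d.%d", &major, &minor) != 2)
        return -1; /* malformed version string */
    if (major > plat_major || (major == plat_major && minor > plat_minor))
        return -1; /* platform cannot meet the requested version */
    return 0;
}
```

On a platform reporting version 1.2, a request for CL1.1 or CL1.2 passes this check and a request for CL2.0 is rejected before the source reaches the backend.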
2014-09-10 | Runtime: Implement clGetExtensionFunctionAddressForPlatform. | Zhigang Gong | 2 | -3/+18
It seems that this function is required by latest PyOpenCL. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-10 | Add copyright header for all libocl files. | Junyan He | 31 | -0/+533
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-10 | Use ${PYTHON_EXECUTABLE} to run python scripts. | Yichao Yu | 1 | -4/+4
Signed-off-by: Yichao Yu <yyc1992@gmail.com> Reviewed-by: He Junyan <junyan.he@inbox.com>
2014-09-10 | Update README for the command parser in drm kernel. | Yang Rong | 1 | -0/+8
Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-09 | fix piglit cl-api-get-program-info fail. | Luo Xionghu | 1 | -1/+1
add pointer check. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-05 | Add incompatible PCH Options to avoid compiling failure. | Junyan He | 1 | -1/+14
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-05 | GBE: fallback if we get a wider than i64 constant. | Zhigang Gong | 1 | -0/+4
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Tested-by: Meng, Mengmeng <mengmeng.meng@intel.com>
2014-09-05 | GBE: fix a bug with LLVM 3.3. | Zhigang Gong | 1 | -5/+5
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Tested-by: Meng, Mengmeng <mengmeng.meng@intel.com>
2014-09-05 | Add the missing function prototypes of any() and atom_add() | Junyan He | 2 | -0/+26
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-05 | GBE: avoid one optimization pass to generate wide integer. | Zhigang Gong | 1 | -12/+12
Integer types wider than 64 bits are hard to handle on Gen. Let's try to prevent the ScalarReplAggregates pass from generating such integer types. v2: fix compilation error with LLVM 3.3. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>