Age | Commit message | Author | Files | Lines
2014-11-07 | utests: fix bugs in builtin_tgamma(). [Release_v0.9.x] | Rebecca Palmer | 1 | -2/+7
This patch is based on Rebecca's patch at: https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Fix-pow-erf-tgamma.patch;att=3;bug=768090
It also fixes another bug: the test should not use an absolute error check. It should use ULP and take the strict or non-strict conformance state into account.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2014-11-07 | utests: fix bugs in builtin_pow(). | Rebecca Palmer | 1 | -5/+11
This patch is based on Rebecca's patch at: https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Fix-pow-erf-tgamma.patch;att=3;bug=768090
It also fixes another bug: the test should not use an absolute error check. It should use ULP and take the strict or non-strict conformance state into account.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
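The ULP-based check that the two utest patches above refer to can be sketched roughly as follows. This is only an illustration, not the code from the patches: the function names (ordered_bits, ulp_distance, within_ulp) are hypothetical, and the ULP budget is passed in as a parameter because the log does not state the actual budgets used for strict and non-strict conformance.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Map a float's bit pattern onto a monotonically ordered integer line,
 * so that adjacent representable floats differ by exactly 1.
 * (Hypothetical helper, not taken from the beignet sources.) */
int64_t ordered_bits(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);
    /* Negative floats have the sign bit set; mirror them below zero. */
    return (i < 0) ? (int64_t)INT32_MIN - i : (int64_t)i;
}

/* Distance between two non-NaN floats, counted in representable steps. */
int64_t ulp_distance(float a, float b)
{
    assert(!isnan(a) && !isnan(b));
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return d < 0 ? -d : d;
}

/* max_ulp is the caller's budget, e.g. a smaller value under strict
 * conformance and a larger one otherwise (values are an assumption). */
int within_ulp(float result, float reference, int64_t max_ulp)
{
    return ulp_distance(result, reference) <= max_ulp;
}
```

Unlike a fixed absolute tolerance, this scales with the magnitude of the reference value: two neighbouring floats near 1e30 are far apart in absolute terms but still only 1 ULP apart.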
2014-11-07 | GBE: fix bug in tgamma(). | Rebecca Palmer | 1 | -182/+7
tgamma was actually computing lgamma, a related but very different function.
This patch is from: https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Fix-pow-erf-tgamma.patch;att=3;bug=768090
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2014-11-07 | GBE: fix bug in erf()/erfc(). | Rebecca Palmer | 2 | -14/+290
erf/erfc diverge (instead of converging to 1 or 0) for arguments above about 2.
This patch is from: https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Fix-pow-erf-tgamma.patch;att=3;bug=768090
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2014-11-07 | GBE: fix bug in pow()/pown(). | Rebecca Palmer | 2 | -5/+17
pow/pown ignore the sign of their first argument (e.g. pow(-2,3) gives 8 instead of -8).
This patch is from: https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Fix-pow-erf-tgamma.patch;att=3;bug=768090
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2014-11-07 | GBE: Support more instructions for constant expression handling. | Zhigang Gong | 3 | -19/+125
Add support for the following OPs: FCmp/ICmp/FPToSI/FPToUI/SIToFP/UIToFP.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2014-11-03 | GBE: fix one compilation warning. | Zhigang Gong | 1 | -2/+4
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-11-03 | use env to set environment variables for GBE_BIN_GENERATER | Andreas Beckmann | 1 | -1/+1
cmake interprets OCL_PCM_PATH=... as a command and will enclose it in quotes in case it contains characters requiring protection, e.g. ~
A quoted "FOO=bar" is interpreted by /bin/sh as a command (that does not exist), not as a variable setting for a following command.
Use env to set the variables unambiguously.
Signed-off-by: Andreas Beckmann <anbe@debian.org>
Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
2014-11-03 | fix some typos | Andreas Beckmann | 2 | -4/+4
Signed-off-by: Andreas Beckmann <anbe@debian.org>
Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-15 | Bump to 0.9.3. [Release_v0.9.3] | Zhigang Gong | 2 | -1/+4
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-15 | Remove out-of-date document. | Zhigang Gong | 3 | -99/+47
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-15 | Document fixup. | Zhigang Gong | 1 | -6/+2
For 0.9.x, we only support building with GCC.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-15 | GBE: Fix sub_sat corner case. | Ruiling Song | 2 | -2/+2
It seems that the hardware returns a wrong result when y is equal to 0x80000000 in sub_sat(int x, int y). So we rewrite that case as: add_sat(add_sat(0x7fffffff, x), 1). Also enable the corresponding utest.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
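The rewrite works because subtracting 0x80000000 (INT32_MIN) is the same as adding 2^31, and 2^31 = 0x7fffffff + 1. A host-side sketch can check the identity against a widened reference. The helper names below are hypothetical and the saturating add is a plain C model, not the Gen instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Plain C model of a 32-bit saturating add. */
int32_t add_sat(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Reference saturating subtract, computed in 64 bits and clamped. */
int32_t sub_sat_reference(int32_t x, int32_t y)
{
    int64_t diff = (int64_t)x - (int64_t)y;
    if (diff > INT32_MAX) return INT32_MAX;
    if (diff < INT32_MIN) return INT32_MIN;
    return (int32_t)diff;
}

/* The rewrite from the commit message for the case y == 0x80000000. */
int32_t sub_sat_min_rewrite(int32_t x)
{
    return add_sat(add_sat(INT32_MAX, x), 1);
}

/* Compare the rewrite with the reference on a few boundary inputs. */
void check_rewrite(void)
{
    const int32_t xs[] = { INT32_MIN, -2, -1, 0, 1, 5, INT32_MAX };
    for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; ++i)
        assert(sub_sat_min_rewrite(xs[i]) == sub_sat_reference(xs[i], INT32_MIN));
}
```

For x >= 0 the inner add already saturates at 0x7fffffff (or equals it for x == 0) and the outer add keeps it there. For x < 0 neither add saturates and the result is x + 2^31.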
2014-09-15 | Update readme. | Zhigang Gong | 1 | -54/+58
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-15 | fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression. | Luo Xionghu | 1 | -6/+7
work_group_size_hint should define another variable.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-12 | fix piglit get kernel info FUNCTION ATTRIBUTE fail. | Luo | 13 | -0/+100
The backend needs to return the kernel FUNCTION ATTRIBUTE message to clGetKernelInfo. There are 3 kinds of function attributes so far; the vec_type_hint parameter cannot be returned because LLVM lacks that information.
Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-12 | runtime: fix build status handling. | Zhigang Gong | 3 | -23/+35
According to the spec, the build status "Returns the build, compile or link status, whichever was performed last on program for device." The previous implementation only considered clProgramBuild and did not consider the compile. Now fix it.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-12 | runtime: fix program binary type bug. | Zhigang Gong | 2 | -1/+4
If the binary is an executable type, the first byte is zero and we need to set the binary type correctly to CL_PROGRAM_BINARY_TYPE_EXECUTABLE.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-12 | GBE: fix multiple files compilation bugs. | Zhigang Gong | 2 | -3/+8
If we want to link multiple files together, and one kernel function needs to refer to kernel functions in other files, we must not mark those functions with the linked once attribute.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-12 | Update license disclaimer. | Yang Rong | 1 | -32/+28
LunarGLASS has updated its copyright, so update the copyright in llvm_scalarize.cpp.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-11 | GBE: don't enable double by default. | Zhigang Gong | 1 | -1/+5
Actually, we don't fully support double currently. Let's disable it for now. This introduces a small incompatibility with the 1.2 spec, which doesn't require the kernel to use the following pragma to enable fp64:
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
If the application wants to try the partially supported double with beignet under OpenCL 1.2, it will still need to add the above pragma.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-11 | GBE: fix a potential memory leak bug. | Zhigang Gong | 1 | -0/+1
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-11 | GBE: Fix a potential segfault. | Zhigang Gong | 1 | -1/+2
When we fail to compile a module, the fileName may be NULL, so we can't access it unconditionally.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-11 | fix piglit cl-api-set-kernel-arg fail. | Luo Xionghu | 3 | -1/+20
The memory object should be checked for validity in the context buffers before being set as a kernel argument.
v2: rename the function from mem_in_buffers to is_valid_mem, move the magic header check into it.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-11 | fix clGetKernelWorkGroupInfo built-in kernel fail. | Luo Xionghu | 6 | -0/+73
Add the CL_KERNEL_GLOBAL_WORK_SIZE option for clGetKernelWorkGroupInfo.
v2: should return the max global work size instead of the current work size. This function needs to return CL_INVALID_VALUE if the device is not a custom device or the kernel is not a built-in kernel. We have 3 kinds of built-in kernels for 1d/2d/3d memories; the max global work size is decided by the dimension and memory type. The piglit failure is caused by calling non-built-in kernels, so a patch needs to be sent to piglit later.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-11 | GBE: fix bugs when handling -cl-std option. | Zhigang Gong | 2 | -3/+9
Actually, CLANG does take this option and we should not filter it out. We also change the default option used to create the PCH file to -cl-std=CL1.2. If the user passes in CL1.1, we will have to disable PCH. Another change is that if we are CL1.2, then we should enable cl_khr_fp64 by default, since from CL1.2 on this extension should be enabled by default.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-10 | Fix the issue of -cl-std=CLX.X option. | Junyan He | 3 | -5/+66
The -cl-std= option specifies the minimum version required to compile the source code provided to our API. So we need to check it early, and return failure if our platform's version cannot meet the request. In the backend, we just ignore this command line option.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-10 | Runtime: Implement clGetExtensionFunctionAddressForPlatform. | Zhigang Gong | 2 | -3/+18
It seems that this function is required by the latest PyOpenCL.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-09 | Update README for the command parser in drm kernel. | Yang Rong | 1 | -0/+8
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-09 | GBE: fix some predefined OCL macros. | Zhigang Gong | 1 | -1/+5
Now beignet is a pure OpenCL 1.2 implementation. Set some predefined macros correctly. __OPENCL_C_VERSION__ and __OPENCL_VERSION__ should be 120 by default.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-09 | fix piglit cl-api-get-program-info fail. | Luo Xionghu | 1 | -1/+1
Add pointer check.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-09 | build: fix a CXXFLAGS override bug in backend directory. | Zhigang Gong | 1 | -3/+1
Reported-by: Jérôme
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-09-09 | GBE: Fix a bug in gatherBTI. | Ruiling Song | 1 | -1/+1
needNewBTI is a state that is only valid for the current candidate, so it needs to be reset to its default value for each candidate. This fixes the regression in opencv 3.0:
./opencv_perf_objdetect OCL_Cascade_Image_MinSize_CascadeClassifier.CascadeClassifier
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-09-09 | GBE: initialize BTI structure to zero. | Ruiling Song | 1 | -0/+4
Clear to zero to avoid garbage data, as we do not assign it later for local/constant memory access.
v2: move initialization code into constructor.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-09-05 | GBE: fixup/refine a bug for image1D array's extra binding index handling. | Zhigang Gong | 5 | -16/+36
Due to a hardware limitation on Gen7/Gen75 when sampling a surface with clamp address mode and nearest filter mode on an integer image1Darray type surface, we have to bind one buffer to bti. The previous implementation hard coded it to 128 + original index, and when checking in the driver layer whether a bti is of this type, it assumed that the bti reserved is 3, which is wrong now. This patch fixes those hard coded functions and uses the macros defined in program.h.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-05 | Fix the global string bug for printf. | Junyan He | 1 | -0/+12
When there are multiple printf statements in multiple kernel functions within the same translation unit, and they have the same string parameter, Clang will generate just one global string named .strXXX to represent that string. So when translating the kernel to gen, we cannot unref that global var. Just ignore it to avoid an assert.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-05 | GBE: cleanup image base index related code. | Zhigang Gong | 6 | -42/+0
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2014-09-05 | GBE: Handle bti allocation for internal buffer used by printf. | Ruiling Song | 16 | -34/+96
1. Move the bti/Register map from gbe::Context to ir::Function.
2. Use GlobalVariable instead of 'call' to get the internal buffer (used for printf) base address.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-05 | GBE: Refine bti usage in backend & runtime. | Ruiling Song | 26 | -167/+599
Previously, we simply mapped a 2G surface for memory access, which has an obvious security issue: a user can easily read/write graphics memory that does not belong to them. To prevent this kind of behaviour, we bind each surface to a dedicated bti. The hardware provides automatic bounds checking: an out-of-bounds write is ignored, and an out-of-bounds read simply returns zero.
The idea behind the patch is that, for a load/store instruction, the backend searches through the LLVM use-def chain until it finds where the address comes from. The bti is then saved in ir::Instruction and used for later code generation. In the mixed pointer case, a load/store will access more than one bti.
To simplify some code, '0' is reserved for the constant address space and '1' is reserved for the private address space. Other btis are assigned automatically by the backend.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
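The bounds-check semantics described above (out-of-bounds writes ignored, out-of-bounds reads returning zero) can be modelled in software as follows. This is only a sketch of the stated behaviour, not driver or hardware code: surface_t, surface_read and surface_write are hypothetical names, and a 32-bit element size is assumed for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one surface bound to its own bti:
 * a base pointer plus the number of elements it covers. */
typedef struct {
    uint32_t *data;
    size_t    count;  /* number of 32-bit elements in the surface */
} surface_t;

/* An out-of-bounds read returns zero instead of touching memory. */
uint32_t surface_read(const surface_t *s, size_t index)
{
    assert(s != NULL);
    return (index < s->count) ? s->data[index] : 0u;
}

/* An out-of-bounds write is silently dropped. */
void surface_write(surface_t *s, size_t index, uint32_t value)
{
    assert(s != NULL);
    if (index < s->count)
        s->data[index] = value;
}
```

With one such surface per buffer, an access through a stray index can no longer reach memory that belongs to another buffer, which is the security property the commit message is after.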
2014-09-05 | GBE: Optimize constant load with sampler. | Ruiling Song | 7 | -11/+30
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-05 | GBE: fallback if we get a wider than i64 constant. | Zhigang Gong | 1 | -0/+4
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Tested-by: Meng, Mengmeng <mengmeng.meng@intel.com>
2014-09-05 | GBE: fix a bug with LLVM 3.3. | Zhigang Gong | 1 | -5/+5
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Tested-by: Meng, Mengmeng <mengmeng.meng@intel.com>
2014-09-05 | GBE: avoid one optimization pass to generate wide integer. | Zhigang Gong | 1 | -11/+12
Integer types wider than 64 bits are hard to handle on Gen. Let's try to prevent the ScalarReplAggregates pass from generating such integer types.
v2: fix compilation error with LLVM 3.3.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-09-03 | GBE: adjust preferred vector length. | Zhigang Gong | 1 | -12/+12
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-03 | Fix the global string bug for printf. | Junyan He | 1 | -1/+15
When there are multiple printf statements in multiple kernel functions within the same translation unit, and they have the same string parameter, Clang will generate just one global string named .strXXX to represent that string. So when translating the kernel to gen, we cannot unref that global var. Just ignore it to avoid an assert.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-02 | GBE: fix error in the rootn fastpath function for some special input. | Zhigang Gong | 1 | -3/+12
The fastpath is meant to trade some accuracy for speed, not to generate wrong results. rootn has many special inputs that need to be taken care of before we call the native pow directly. This patch fixes all the pow related failures in the OpenCV 3.0 test suite.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-02 | utests: fix two utest bugs. | Zhigang Gong | 2 | -2/+2
Similar to the bug found by junyan, some events are accessed before being assigned.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
2014-09-02 | Fix a bug for runtime_barrier_list.cpp, event array out of bounds | Junyan He | 1 | -1/+1
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-02 | Fix two bugs. | Yang Rong | 2 | -3/+3
1. An INSERT_REGINSERT_REG typo.
2. Release main_buf in utest sub_buffer_check.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-02 | Two minor fixes. | Yang Rong | 3 | -7/+7
1. Some systems don't define the ulong type, so use unsigned long instead.
2. Use sA, sB... instead of sa, sb... to access vector 16, because sometimes sa, sb will cause a clang error.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>