Age | Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
We add the test case for uniform when doing the bswap.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
To generate SPIR binary, please refer to the page
https://github.com/KhronosGroup/SPIR.
For llvm3.2, the command is "clang -cc1 -emit-llvm-bc -triple
spir-unknown-unknown -cl-std=CL1.2 -include opencl_spir.h
compiler_ceil.cl -o compiler_ceil32.spir"
For llvm3.5, the option -cl-kernel-arg-info is required,
and option -fno-builtin is required to avoid warning.
v2: add missing load_program_from_spir.cpp file.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Change output measurement from time to bandwidth, so we can compare
all benchmark results easily. And change return type of benchmark
from int to double, because int is not precise enough.
v2: Change output measurement from time to bandwidth.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Because of the HW limitation, vertical stride is at
least aligned to 2. For 1D array image, the data has interval.
The image size is just twice as big as the buffer size we think.
Use clEnqueueWriteImage is safe and fix this bug.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
The bug caused because we didn't consider two overflow
cases when type is long, which do not cause carry and
are easy to be ignored.
V2:
Clean some verbose NOPs.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
should use clz function instead of __builtin_clz.
add zero input check.
v2: add signed type test. remove redundant case.
v3: remove printf.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
The two cases are are generated by utest_math_gen.py.
Signed-off-by: Zhu Bingbing <bingbingx.zhu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
the limitation is loosed from page size to cache line size
alignment inside driver, update utest accordingly.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
ulong and uint64_t have different size on i386 and
i386_64, which cause the test case failure.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
this kernl calls the llvm __builtin_clz to generate llvm.clz function
then call the gen instruction clz, different from the test
compiler_clz_int, which use the fbh to implement.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
the built test case is load_program_from_bin_file, it demos how to
generate from source kernel compiler_ceil.cl to binary kernel
compiler_ceil.bin with the standalone compiler for a specific gen
pci id, and also demos how to load and execute the binary kernel
when the compiler is not available in the running system.
btw, please make sure compiler_ceil.bin is really updated if there is
already one there, the safe way is to delete it first.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
At some platforms with old c/c++ environment, C++11 features are not
supported, it results in the failure to build the gbe compiler part
which depends on LLVM/clang using C++11 features.
The way to resolve is to build a standalone gbe compiler within another
feasible system, and build beignet with the already built standalone
gbe compiler by setting USE_STANDALONE_GBE_COMPILER=true. The path of
the standalone compiler is /usr/local/lib/beignet as default or could
be specified by STANDALONE_GBE_COMPILER_DIR.
Once USE_STANDALONE_GBE_COMPILER is given, all the gbe compiler relative
code will not be built any longer, only libcl.so and libgebinterp.so are
built. And libcl.so is special for GEN_PCI_ID, which is queried from the
building machie or could be specified as CMake option.
v2: separate the CMake option name.
update the commit comments.
add back the script for gen pci id, and build driver with it.
v3: add file FindStandaloneGbeCompiler.cmake to make the main cmakefile clean.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
change the keyword from constexpr to const, update the code for
explicit type conversion and std::map's iterator.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
power(x,y) return Nan for x<0 in spec, so add that for powr.
Signed-off-by: Meng Mengmeng <mengmeng.meng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
No need to iterate so many times.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com>
|
|
the overflow intrinsics are introduced since llvm-3.5.
v2: no need to exclude bswap test since llvm-3.3 supports it.
v3: disable the overflow utest in cmake instead of code.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Zhu Bingbing<bingbingx.zhu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
The 2 delete operators work on array pointer.
Signed-off-by: Yan Wang <yan.wang@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
the original case only tested overflow when src1 is 1, add the test of
src1 is max(this could trigger signed type uadd.with.overflow results
incorrect, the fail was fixed in backend already, just update the utest.)
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
two bswap call in one block would trigger nsetc failures.
the fail was fixed in backend already, just update the utest.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Meng Mengmeng <mengmeng.meng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
From application perspective, userptr is transparent. App does not
need to know if userptr is enabled or not, just invokes standard
OpenCL APIs.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
and also refine the code to move time_subtract into utest_helper.hpp/cpp
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
The maximum work group size on BYT is 256.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
due to a stray . at utests/builtin_pow.cpp:79:112.
Reported by "Rebecca N. Palmer" <rebecca_palmer@zoho.com>.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
Reduce work group size from 1024 to 256 to fit all platforms.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
To make the license statement consistent to each other, adjust
all license versions to v2.1+. Thus beignet should have a pure
LGPL v2.1+ license.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
As we can't find the original license of these test cases, we
have to remove them from beignet's unit test cases.
Reported by "Rebecca N. Palmer" <rebecca_palmer@zoho.com>.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
This reverts commit 9e236b18542f2564e399bf13d4d1fbcc48a5ec9f.
|
|
At some systems, function aligned_alloc is not supported.
From Linux Programmer's Manual:
The function aligned_alloc() was added to glibc in version 2.16.
The function posix_memalign() is available since glibc 2.1.91.
V2: add check for return value of posix_memalign
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
This patch is based on Rebecca's patch at:
https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Fix-pow-erf-tgamma.patch;att=3;bug=768090.
And fixed another bug which we should not use an absolute error checking.
We should use ULP and considering the strict conformance or non strict
conformance state.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
This patch is based on Rebecca's patch at:
https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Fix-pow-erf-tgamma.patch;att=3;bug=768090.
And fixed another bug which we should not use an absolute error checking.
We should use ULP and considering the strict conformance or non strict
conformance state.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
erf/erfc diverge (instead of converging to 1 or 0) for arguments above
about 2.
This patch is from:
https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Fix-pow-erf-tgamma.patch;att=3;bug=768090
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
This test case shows that assignment operation in if block seems
does not affect lvalue.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|