Age | Commit message (Collapse) | Author | Files | Lines |
|
This should be a typo, we should wait for the gpgpu and create
node only if the batch buffer is busy.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
This will dump the Gen ASM output to the file specified in the
-dump-opt-asm Link option during the Link program step.
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
|
|
After the curbe allocation refactor, not all kernel arguments
will be allocated unconditional. If some kernel arguments haven't
been used at all, the corresponding arguments will be ignored
at backend thus we may get a -1 offset. On the runtime driver
side, we need check this situation.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
The major motivation is to normalize the curbe payload's
allocation and prepare to use liveness information
to avoid unecessary payload register allocation and avoid
fragments when allocate curbe registers. For an example,
for GBE_CURBE_LOCAL_ID_Y/Z, many one dimention
kernels don't need them. But previous curbe allocation
occurs before the liveness interval computing, thus it
will allocate that curbe anyway. Altough it will be expired
soon but it still need us to prepare those payload at
host side. After this patch, this type of overhead
has been eliminated easily.
Another purpose is to eliminate the ugly curbe patch list
handling in backend. After this patch, the curbe register
handling is much cleaner than before.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
It is a drm related bug. As the drm driver changed the time to free their test
userptr to bufmgr destroy(30921483c70c6939f017476eac13da6aa26b3b3c), we need
anothr order to release our driver to make sure the test userptr can be freed
with a valid fd.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Fix to calculate the current cpu monotonic raw timestamp in nanoseconds
for enqueued,submitted,start and finshed and send this to application
based on the parameter queries.
Signed-off-by: Midhun Kodiyath <midhunchandra.kodiyath@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
this patch allows create 2d image with a cl buffer with zero copy.
v2: should use reference to manage the release the buffer and image.
After being created, the buffer reference count is 2, and image reference
count is 1.
if image is released first, decrease the image reference count and
buffer reference count both, release the bo when the buffer is released
at last;
if buffer is released first, decrease the buffer reference count only,
release the buffer when the image is released.
add CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT in cl_device_info.
v3: move is_image_from_buffer to _cl_mem_image; return
CL_INVALID_IMAGE_SIZE if image size is larger than the buffer.
v4: pitchalignment set to 2.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
catch the error: out of host memery.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
All programs or none programs specified by input_programs contain a compiled binary or library
for the device. Otherwise return CL_INVALID_OPERATION.
Correct this condition check.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
|
|
cl_buffer_get_subdata sometime is very very very slow in linux kernel, in skl and chv,
and it is random. So temporary disable it, use map/copy/unmap to read.
Should re-enable it after find root cause.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
|
|
1. return CL_INVALID_LINKER_OPTIONS when invalid options, using clang to check the options.
2. return CL_INVALID_OPERATION when the binary type is not same.
3. When link fail, will not return CL_LINK_PROGRAM_FAILURE, fix it.
4. Should not delete program in genProgramBuildFromLLVM, the program is new and delete from runtime.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Luo, Xionghu <xionghu.luo@intel.com>
|
|
There is no NULL pointer check for kernel->program->build_opts.
This will cause utest test_get_arg_info crash.
In fact, we will add -cl-kernel-arg-info flag for compiling
ever time, and so the arg info is always avaible.
But some test case deliberately unset this flag and expect the ERR
return value, so we really need a check here.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
1.Change the code for null param_value
2.Add the return value check for build option "-cl-kernel-arg-info"
3.Correct one return value typo
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
initialize the data inside kernel with packed integer vector
V2: call functions from ctx, instead of ctx.registerAllocator
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Part of the plumbing that passes the ASM file name from the compiler options
level down to the emitCode level so that the assembly can be written to that
file
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Laura Ekstrand <laura.d.ekstrand@intel.com>
Reviewed-by: Song, Ruiling <ruiling.song@intel.com>
|
|
Device ID and vendor ID are not same.Set the correct vendor ID.
Signed-off-by: Midhun Kodiyath <midhunchandra.kodiyath@intel.com>
Reviewed-by: Song, Ruiling <ruiling.song@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
get_sub_group_id ranges at [0, 7] for SIMD8 and [0, 15] for SIMD16,
previously we set up the values in kernel payload, now change it
to generate the values inside kernel with packed integer vector.
v2: encapsulate into a function so that others can get the lane id easily.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Some applications forgot to call clWaitForEvents() before calling to
clGetEventProfilingInfo(). Let's update the event's status here.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
The event should have been finished after clFinish, update the event status.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
When the event parament is not NULL, the event will point to a new event, so
need to check address of the event and the wait events.
V2: check the address of the event and the wait events.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
Also fix a spelling bug - s/internel/internal.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
This patch fixed two thread-safe bugs in the builtin-kernel
usage code path.
1. The builtin kernel array itself need to be protected.
2. Each caller need to get a dup of the builtin kernel,
rather than share the same kernel structure.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Basically, it boils down to a difference in getopt(3). The
getopt(3) on (Free)BSD will exit parsing arguments at the
first unknown argument.
Signed-off-by: Koop Mast <kwm@rainbow-runner.nl>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
last_event and current_event should be thread private data.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
On HSW, if we use default 4.0.x kernel without the i915.enable_ppgtt=2
boot argument, then the atomic in L3 will not work and the SLM will not
work neither. We need to test atomic in L3 firstly, if it fails we need
to test SLM again. Otherwise, beignet will not give any error/warning
information for both atomic in L3 and SLM not working case.
v2:
shold set the atomic test result before the second round SLM test.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Luo, Xionghu" <xionghu.luo@intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
The cl device may have different extensions from the
platform. We will add some items based on the platform
extensions.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
The init order of the intel_platform and the intel_extension is
somehow not clear. When some API such as clGetDeviceIDs can pass
NULL as cl_platform_id, we just use the global value intel_platform
as the default but do not care about the init state of the extension.
The init of the extension may be done when the cl device is created.
This is OK if the paltform and the device have the same extensions.
But now because of the fp16, they are not always the same.
Use cl_get_platform_default to replace the global value to ensure that
when default platform is available, the extension is also inited.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
the regression is caused when only enable CL_UNORM_INT8 for CL_RG,
the reason is that during the image copy implementation with internal
kernel, all formats are considerd as integer format, it becomes
unknown since CL_UNSIGNED_INT8 is not enabled yet.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
check the selftest kernel return value, if enqueue kernel failed,
set the flag to not enable atomics the L3 for HSW.
This reverts commit 83f8739b6fc4893fac60145326052ccb5cf653dc.
v2: don't use global variable to pass value from runtime to driver.
v3: add type SELF_TEST_OTHER_FAIL to differentiate from SELF_TEST_ATOMIC_FAIL;
seperate the ATOMIC_FAIL from SLM_FAIL, only SLM_FAIL can be control by
env OCL_IGNORE_SELF_TEST.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Yang, Rong <rong.r.yang@intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Yang, Rong <rong.r.yang@intel.com>
|
|
need minus one when fill '\0' to sizeof char type array.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Reflect recent beignet and Linux changes.
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: "Luo, Xionghu" <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Run a small __local-using kernel in clGetDeviceIDs; if this returns
the wrong result, return CL_DEVICE_NOT_FOUND.
As far as I can see, there's no way to tell in advance (except
unreliably with a global version check) whether __local-using batches
will be accepted...so the easiest solution is probably to just try
running one and see what result we get.
Signed-off-by: Rebecca Palmer <rebecca_palmer@zoho.com>
Reviewed-by: "Luo, Xionghu" <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
This issue should be fixed in KMD in the future. Let's revert it
here.
This reverts commit ef7127c03bd533277afc443b335c37a69927250a.
|
|
v2: use random input data; update comments.
v3: change the image attribute to __write_only.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: "Weng, Chuanbo" <chuanbo.weng@intel.com>
|
|
We can change the image_channel_order to CL_RGBA and
image_channel_data_type to CL_UNSIGNED_INT32 for some special
case, thus 16 bytes can be read by one work item. Bandwidth is
fully used.
v2: merge patch 3 of initializing region0;
remove k dimension in kernel for 2d image.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: "Weng, Chuanbo" <chuanbo.weng@intel.com>
|
|
Cherryview's EU configurations is not decided by pciid, must get from kernel by libdrm.
Thanks for Jeff adding this support in the kernel and libdrm.
V2: Add the warning when can't get configurations.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
uint __gen_ocl_get_simd_id();
return value ranges from 0 to simdsize - 1
V2: use function sel.selReg to refine code
V3: correct the uniform condition in liveness.cpp
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
If the front end label ip exceed 0xffff, then the backend will
use real DW to represent each block's IP address. This is
a dynamic behaviour according to the actual front end's label
number.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Beignet uses drm_intel_gem_bo_context_exec() to flush command queue to
linux drm driver layer. We need to check the return value of that function,
as it may fail when the application uses very large array.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|