author | Zhigang Gong <zhigang.gong@intel.com> | 2014-04-09 18:25:22 +0800 |
---|---|---|
committer | Zhigang Gong <zhigang.gong@intel.com> | 2014-04-16 09:45:50 +0800 |
commit | add15cb38aa2ae0dc8576cb653c8d05584087c5d (patch) | |
tree | fa05c84a3243e65947cf5d7dc1b83ef17e00c028 /src/intel/intel_driver.c | |
parent | d7ad5ee6f79fc28cf82321c8b527ae73da9f10f2 (diff) |
GBE: Optimize read_image performance for CL_ADDRESS_CLAMP.
The previous workaround (needed because of a hardware restriction) used
CL_ADDRESS_CLAMP_TO_EDGE to implement CL_ADDRESS_CLAMP, which is not
very efficient, mainly because of the boundary-checking overhead.
The root cause is that we had to check each pixel's coordinate.
Now we use the LD message to implement CL_ADDRESS_CLAMP instead. For
integer coordinates, no boundary checking is needed. For float
coordinates, we only need to check whether the coordinate is less than
zero, which is much simpler than before.
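The idea can be illustrated with a small CPU-side model in C. This is only a sketch under stated assumptions, not the Beignet implementation: `image1d_t`, `ld_fetch`, `read_image_clamp_int` and `read_image_clamp_float` are hypothetical names, and the model assumes that an LD-style fetch returns a zero border value for any out-of-range integer coordinate. One plausible reason the float path still needs a sign check is that a float-to-int conversion truncates toward zero, so a coordinate such as -0.5 would otherwise land on texel 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 1-D image used only for this illustration. */
typedef struct {
    int width;          /* number of texels */
    const int *texels;  /* texel values */
} image1d_t;

/* Assumed model of an LD-style fetch: any out-of-range integer
 * coordinate yields the border value 0. On real hardware this range
 * check would be done by the fetch itself, not by kernel code. */
static int ld_fetch(const image1d_t *img, int x)
{
    assert(img != NULL && img->texels != NULL);
    if (x < 0 || x >= img->width)
        return 0;
    return img->texels[x];
}

/* Integer coordinates: pass straight through, no extra checks. */
static int read_image_clamp_int(const image1d_t *img, int x)
{
    return ld_fetch(img, x);
}

/* Float coordinates: (int) truncates toward zero, so -0.5f would
 * become 0 and wrongly hit a valid texel. Forcing every negative
 * coordinate to -1 keeps it out of range. Assumes x fits in an int. */
static int read_image_clamp_float(const image1d_t *img, float x)
{
    int ix = (x < 0.0f) ? -1 : (int)x;
    return ld_fetch(img, ix);
}
```

In this model, the only per-pixel work left for float coordinates is a single comparison against zero; the upper bound is handled by the fetch.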
This patch could bring about a 20% to 30% performance gain for LuxMark's
medium and simple scenes.
v2:
simplify the READ_IMAGE0.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Diffstat (limited to 'src/intel/intel_driver.c')
-rw-r--r-- | src/intel/intel_driver.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/src/intel/intel_driver.c b/src/intel/intel_driver.c
index 2a2335bf..cce033f9 100644
--- a/src/intel/intel_driver.c
+++ b/src/intel/intel_driver.c
@@ -135,7 +135,7 @@ intel_driver_memman_init(intel_driver_t *driver)
 {
   driver->bufmgr = drm_intel_bufmgr_gem_init(driver->fd, BATCH_SIZE);
   assert(driver->bufmgr);
-  //drm_intel_bufmgr_gem_set_aub_dump(driver->bufmgr, 1);
+  //drm_intel_bufmgr_gem_set_aub_dump(driver->bufmgr, 1);
   drm_intel_bufmgr_gem_enable_reuse(driver->bufmgr);
 }