path: root/src/intel
Age | Commit message | Author | Files | Lines
2016-10-06 | intel: aubinator: add missing return characters | Lionel Landwerlin | 1 | -5/+5
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-06 | anv: fix GetPhysicalDeviceProperties to return timestampPeriod in ns | Philipp Zabel | 1 | -1/+1
According to chapters 16.5. (Timestamp Queries) and 30.2 (Limits) of the Vulkan Specification 1.0.29, the .limits.timestampPeriod field returned by vkGetPhysicalDeviceProperties is measured in nanoseconds, not in seconds. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
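As a rough illustration of why the unit matters: an application multiplies a timestamp delta by timestampPeriod and expects nanoseconds. The sketch below is not the driver code; the function names and the 12.5 MHz counter frequency used in the usage note are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds per tick for a counter running at frequency_hz.
 * Hypothetical helper: the real driver derives this from device info. */
double timestamp_period_ns(double frequency_hz)
{
   assert(frequency_hz > 0.0);
   return 1e9 / frequency_hz;
}

/* Convert the difference of two timestamp query results to nanoseconds,
 * the way an application would use limits.timestampPeriod. */
double ticks_to_ns(uint64_t begin, uint64_t end, double period_ns)
{
   assert(end >= begin);
   return (double)(end - begin) * period_ns;
}
```

With an assumed 12.5 MHz counter the period is 80 ns, so 250 ticks correspond to 20000 ns. Had the period been reported in seconds, the same multiplication would be off by a factor of 10^9.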
2016-10-05 | intel/blorp: Use documented RECTLIST vertex positions | Nanley Chery | 1 | -3/+3
Use the vertex positions described in the PRMs. This has no effect on rendering but quiets the simulator warnings seen when the vertices appear out of order. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2016-10-05 | anv/meta: Roll clear_image into CmdClearDepthStencilImage | Jason Ekstrand | 1 | -56/+28
It is now the only caller so there's no sense in keeping things split out. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-05 | anv: Use blorp for VkCmdFillBuffer | Jason Ekstrand | 2 | -130/+96
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-05 | intel: aubinator: pack supported generations into an array | Lionel Landwerlin | 1 | -53/+35
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-05 | i965/l3: Add explicit way size calculation for bxt | Ben Widawsky | 1 | -1/+3
There should be no functional change here because Broxton and CHV are both gt1. Without this code however, it might seem like broxton support is missing. While here, put the gt1 check in front to hopefully short-circuit the condition for the mobile cases. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-10-04 | aubinator: use the correct format specifier for printing ptrdiff_t. | Kenneth Graunke | 1 | -1/+1
Fixes more warnings in 32-bit builds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
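For reference, the standard C length modifier for ptrdiff_t is `t`, which stays correct whether ptrdiff_t is 32 or 64 bits wide. A minimal sketch follows; the function name and message text are made up for the example and are not taken from aubinator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format the distance between two pointers into buf.
 * "%td" matches ptrdiff_t on both 32-bit and 64-bit builds, whereas
 * "%ld" or "%d" only match on some of them and trigger -Wformat. */
int format_offset(char *buf, size_t len, const char *base, const char *cur)
{
   assert(cur >= base);
   ptrdiff_t offset = cur - base;
   return snprintf(buf, len, "offset %td", offset);
}
```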
2016-10-04 | aubinator: Use less -RS instead of -r for the implicit pager. | Kenneth Graunke | 1 | -4/+3
From the less man page:

"Warning: when the -r option is used, less cannot keep track of the actual appearance of the screen (since this depends on how the screen responds to each type of control character). Thus, various display problems may result, such as long lines being split in the wrong place."

Lines which are too long to fit in the terminal would be word wrapped, but unfortunately less would get confused about which line it was on, and text would be drawn on top of other text. The most noticeable case was shader assembly, which is frequently too wide for an 80 character terminal, and thus would be drawn on top of the following state packets, making them completely unreadable.

Using -R instead of -r fixes this problem by only allowing color escape sequences. (Notably, Git's implicit pager invocation uses -R.) Unfortunately, it means our "clear to the end of the line" hack for extending the blue bar headers won't work anymore.

Word wrapping usually isn't terribly readable, anyway, so we also add the -S option (chop long lines) to restrict it to the terminal width. (You can hit the left and right arrow keys to scroll sideways.)

Then, for a new blue bar hack, we can use a printf specifier to pad the command packet names to be 80 characters long (arbitrarily), which extends them "far enough" to look good, and doesn't require us to use ioctls to determine the terminal width.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sirisha Gandikota <sirisha.gandikota@intel.com>
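The padding trick described above can be sketched with a left-justified, fixed-width printf conversion. This is only an illustration: the function name, the HEADER_WIDTH macro and the exact escape sequences (ANSI blue background, then reset) are assumptions, not the code from the commit.

```c
#include <assert.h>
#include <stdio.h>

#define HEADER_WIDTH 80   /* arbitrary, as in the commit message */

/* Write a packet name padded with spaces to HEADER_WIDTH columns and
 * wrapped in color escapes, which "less -R" passes through.
 * Names longer than HEADER_WIDTH are not truncated by "%-*s". */
int format_header(char *buf, size_t len, const char *name)
{
   assert(buf != NULL && name != NULL);
   return snprintf(buf, len, "\033[0;44m%-*s\033[0m", HEADER_WIDTH, name);
}
```

Because the bar is produced by padding rather than by a "clear to end of line" control sequence, it no longer depends on the terminal width and survives the switch from -r to -R.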
2016-10-04 | anv/gen7_pipeline: Fix typo in semicolon | Anuj Phogat | 1 | -1/+1
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-04 | anv/gen7_pipeline: Set sample mask field in 3DSTATE_PS | Anuj Phogat | 1 | -0/+3
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-04 | anv/gen7_pipeline: Move ksp{1,2} state setting next to ksp0 | Anuj Phogat | 1 | -3/+2
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-04 | anv/gen8_pipeline: Add an assert to ensure use_alt_mode is not set in prog_data | Anuj Phogat | 1 | -0/+1
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-04 | anv/gen8_pipeline: Fix typo in semicolon | Anuj Phogat | 1 | -1/+1
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-04 | intel/genxml: Keep the value name 'Alternate' uniform across gen75.xml | Anuj Phogat | 1 | -3/+3
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-04 | intel/genxml: Fix typo in gen75.xml | Anuj Phogat | 1 | -1/+1
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-04 | anv/gen7_pipeline: Use MSDISPMODE_PERSAMPLE for non-multisampled fbo | Anuj Phogat | 1 | -1/+2
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-04 | anv/blorp: Handle zero width/height blits in blorp_copy() | Anuj Phogat | 1 | -1/+4
V2: Move the check from copy_buffer_to_image() to blorp_copy(). (Nanley) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-04 | intel/isl: Add an assert to check zero width/height surface | Anuj Phogat | 1 | -0/+3
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-04 | intel: use the correct format specifier for printing uint64_t | Timothy Arceri | 2 | -11/+13
Fixes a bunch of warnings in 32-bit builds. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
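The portable way to print a uint64_t is through the PRI* macros from <inttypes.h>, since uint64_t is `unsigned long` on typical 64-bit Linux builds but `unsigned long long` on 32-bit ones. A small sketch, with a hypothetical function name and output format:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Print a 64-bit address as 16 zero-padded hex digits.
 * PRIx64 expands to the right length modifier for the platform,
 * so the same source builds warning-free on 32-bit and 64-bit. */
int format_address(char *buf, size_t len, uint64_t addr)
{
   assert(buf != NULL);
   return snprintf(buf, len, "0x%016" PRIx64, addr);
}
```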
2016-10-04 | intel: fix compilation warning on gen_get_device_info | Tapani Pälli | 2 | -2/+2
(warning: 'const' type qualifier on return type has no effect) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2016-10-03 | intel/isl: Allow non-2D HiZ surfaces | Jason Ekstrand | 1 | -2/+2
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 | intel/isl: Add a detailed comment about multisampling with HiZ | Jason Ekstrand | 1 | -2/+58
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 | intel/isl: Remove tiling checks from choose_msaa_layout | Jason Ekstrand | 2 | -14/+7
We already do those checks in filter_tiling. There's no good reason to repeat them in choose_msaa_layout. If anything they should have been asserts and not "return false" checks. Also, this check was causing us to outright reject multisampled HiZ surfaces which wasn't intended. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 | intel/isl: Handle HiZ and CCS tiling more directly | Jason Ekstrand | 2 | -16/+16
The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces respectively. There's no reason why we should go through filter_tiling and it's much easier to always get HiZ and CCS right if we just handle them directly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 | intel/isl: Allow multisampling with ISL_FORMAT_HiZ | Jason Ekstrand | 2 | -3/+12
HiZ buffers can be multisampled and, on Broadwell and earlier, simply using interleaved multisampling with a compression block size of 8x4 samples yields the correct HiZ surface size calculations. Unfortunately, choose_msaa_layout was rejecting multisampled HiZ buffers because of format checks. Now that we have a simple helper for determining if a format supports multisampling, that's an easy enough issue to fix. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 | intel/isl: Allow creation of 1-D compressed textures | Jason Ekstrand | 2 | -3/+11
Compressed 1-D textures are not a well-defined thing in either GL or Vulkan. However, auxiliary surfaces are treated as compressed textures in ISL and we can do HiZ and CCS with 1-D, so we need to be able to create them. In order to prevent actually using them (the docs say no), we assert in the state setup code. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 | intel/isl: Fix up asserts in calc_phys_level0_extent_sa | Jason Ekstrand | 1 | -2/+4
The assertion that a format is uncompressed in the multisample layouts isn't quite right. What we really want to assert is that the format supports multisampling which is a bit more complicated query. We also want to assert that it has a block size of 1x1 since we do nothing with the block size in the phys_level0_sa assignment. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 | intel/isl: Add a format_supports_multisampling helper | Jason Ekstrand | 5 | -36/+33
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 | anv/formats: Fix build on gcc-4 and earlier | Ville Syrjälä | 1 | -4/+19
gcc-4 and earlier don't allow compound literals where a constant is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE() when populating the anv_formats[] array. There are a few ways around it: First one would be -std=c89/gnu89, but the rest of the code depends on c99 so it's not really an option. The second option would be to upgrade to gcc-5+ where the compiler behaviour was relaxed a bit [1]. And the third option is just to avoid using compound literals. I chose the last option since it keeps gcc-4 and earlier working. [1] https://gcc.gnu.org/gcc-5/porting_to.html Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Topi Pohjolainen <topi.pohjolainen@intel.com> Fixes: 7ddb21708c80 ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
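The third option can be illustrated with two macros: one that expands to a compound literal (an expression, which older gcc rejects inside a static initializer) and one that expands to a plain brace initializer. Everything here is a simplified stand-in: the types and macro names are hypothetical and only loosely modelled on ISL_SWIZZLE() and anv_formats[].

```c
#include <assert.h>

enum channel { CH_ZERO, CH_ONE, CH_RED, CH_GREEN, CH_BLUE, CH_ALPHA };

struct swizzle { enum channel r, g, b, a; };

struct format_entry { int format; struct swizzle swizzle; };

/* Compound literal: an expression, fine inside function bodies. */
#define SWIZZLE(R, G, B, A) \
   ((struct swizzle) { CH_##R, CH_##G, CH_##B, CH_##A })

/* Plain brace initializer: only valid in an initializer, but accepted
 * as a constant by every compiler, including gcc-4 with -std=c99. */
#define SWIZZLE_INIT(R, G, B, A) { CH_##R, CH_##G, CH_##B, CH_##A }

/* Static table populated without compound literals. */
static const struct format_entry formats[] = {
   { 1, SWIZZLE_INIT(RED, GREEN, BLUE, ALPHA) },
   { 2, SWIZZLE_INIT(BLUE, GREEN, RED, ONE) },
};

struct swizzle format_swizzle(unsigned index)
{
   assert(index < sizeof(formats) / sizeof(formats[0]));
   return formats[index].swizzle;
}

/* The compound-literal macro remains usable at run time. */
struct swizzle identity_swizzle(void)
{
   return SWIZZLE(RED, GREEN, BLUE, ALPHA);
}
```

The cost of this approach is a second macro, or spelled-out initializers, for table entries; the benefit is that the table builds on both old and new compilers.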
2016-10-03 | i965: rename max_ds_* variable to max_tes_* | Timothy Arceri | 3 | -27/+27
Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-03 | i965: rename max_hs_* variables to max_tcs_* | Timothy Arceri | 3 | -27/+27
Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-26 | aubinator: Fix the decoding of values that span two Dwords | Sirisha Gandikota | 1 | -13/+37
Fixed the way the values that span two Dwords are decoded. Based on the start and end indices of the field, the Dwords are fetched and decoded accordingly. v2: rename dw to qw in gen_field_iterator_next and remove extra white space (Anuj) v3: change all instances of dw to qw (Anuj) Earlier, 64-bit fields (such as most pointers on Gen8+) weren't decoded correctly. gen_field_iterator_next seemed to walk one DWord at a time, sets v.dw, and then passes it to field(). So, even though field() takes a uint64_t, we're passing it a uint32_t (which gets promoted, so the top 32 bits will always be zero). This seems pretty bogus... (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
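A sketch of the idea described above: build a 64-bit qword from the dword holding the field's start bit and, if the field crosses a dword boundary, the next one, then shift and mask. The function name and signature are assumptions for illustration and do not mirror gen_field_iterator_next.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [start, end] (inclusive, counted from bit 0 of p[0])
 * from a packet stored as an array of 32-bit dwords.
 * Assumes the field fits in the two dwords starting at start / 32. */
uint64_t decode_field(const uint32_t *p, unsigned start, unsigned end)
{
   assert(start <= end);
   unsigned width = end - start + 1;
   unsigned shift = start % 32;
   assert(shift + width <= 64);

   unsigned index = start / 32;
   uint64_t qw = p[index];

   /* Only touch the next dword when the field actually spans into it. */
   if (end / 32 != index)
      qw |= (uint64_t)p[index + 1] << 32;

   uint64_t mask = width == 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
   return (qw >> shift) & mask;
}
```

Reading a single dword and promoting it to 64 bits, as the earlier code is described as doing, would leave the upper half of a 48-bit pointer field as zero.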
2016-09-25 | aubinator: fix resource leak | Nayan Deshmukh | 1 | -1/+3
CovID: 1373370 Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-23 | anv: Check for VK_WHOLE_SIZE in anv_CmdFillBuffer | Nicolas Koch | 1 | -0/+6
From the Vulkan spec: Size is the number of bytes to fill, and must be either a multiple of 4, or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer. If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a multiple of 4, then the nearest smaller multiple is used. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
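The rule quoted from the spec boils down to a few lines of arithmetic. The sketch below uses a local WHOLE_SIZE constant as a stand-in for VK_WHOLE_SIZE (which is defined as ~0ULL) so that it compiles without the Vulkan headers; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WHOLE_SIZE (~UINT64_C(0))   /* stand-in for VK_WHOLE_SIZE */

/* Resolve the size argument of a buffer fill: WHOLE_SIZE means "from
 * offset to the end of the buffer", rounded down to a multiple of 4. */
uint64_t resolve_fill_size(uint64_t buffer_size, uint64_t offset,
                           uint64_t size)
{
   assert(offset <= buffer_size);
   if (size == WHOLE_SIZE)
      size = (buffer_size - offset) & ~UINT64_C(3);
   return size;
}
```

For a 100-byte buffer and an offset of 10, the remaining 90 bytes round down to 88.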
2016-09-23 | anv: get rid of duplicated values from gen_device_info | Lionel Landwerlin | 6 | -43/+28
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-23 | intel/i965: make gen_device_info mutable | Lionel Landwerlin | 7 | -53/+59
Make gen_device_info a mutable structure so we can update the fields that can be refined by querying the kernel (like subslices and EU numbers). This patch does not make any functional change, it just makes gen_get_device_info() fill a structure rather than returning a const pointer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
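The API shape described above, filling a caller-owned structure instead of returning a const pointer into a static table, might look roughly like this. The struct fields, device IDs and function name are all invented for the example and are not the real gen_device_info layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct device_info {
   int gen;
   unsigned num_subslices;   /* default; may be refined by the caller */
};

/* Hypothetical table of static defaults keyed by a made-up device ID. */
static const struct { int devid; struct device_info info; } defaults[] = {
   { 0x0001, { 7, 2 } },
   { 0x0002, { 8, 3 } },
};

/* Copy the defaults for devid into *info. The caller owns the copy and
 * can overwrite fields with values queried from the kernel. */
bool get_device_info(int devid, struct device_info *info)
{
   assert(info != NULL);
   for (size_t i = 0; i < sizeof(defaults) / sizeof(defaults[0]); i++) {
      if (defaults[i].devid == devid) {
         *info = defaults[i].info;
         return true;
      }
   }
   return false;
}
```

Since each caller gets its own copy, refining `num_subslices` for one device instance leaves the shared defaults untouched.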
2016-09-21 | anv: pipeline: use correct number of thread for compute | Lionel Landwerlin | 1 | -1/+4
Reproduces this commit : commit 0fb85ac08d61d365e67c8f79d6955e9f89543560 Author: Kenneth Graunke <kenneth@whitecape.org> Date: Mon Jun 6 21:37:34 2016 -0700 i965: Use the correct number of threads for compute shaders. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-21 | anv: allocator: correct scratch space for haswell | Lionel Landwerlin | 1 | -1/+21
This reproduces this commit : commit 2213ffdb4bb79856f0556bdf2bfd4bdf57720232 Author: Kenneth Graunke <kenneth@whitecape.org> Date: Mon Jun 6 21:37:34 2016 -0700 i965: Allocate scratch space for the maximum number of compute threads. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-21 | anv: device: calculate compute thread numbers using subslices numbers | Lionel Landwerlin | 6 | -18/+74
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-20 | aubinator: add a custom handler for immediate register load | Lionel Landwerlin | 3 | -3/+47
Transforming this:

0x00c77084: 0x11000001: MI_LOAD_REGISTER_IMM
0x00c77088: 0x0000b020 : Dword 1
    Register Offset: 0x0000b020
0x00c7708c: 0x00880038 : Dword 2
    Data DWord: 8912952

Into this:

0x007880f0: 0x11000001: MI_LOAD_REGISTER_IMM
0x007880f4: 0x0000b020 : Dword 1
    Register Offset: 0x0000b020
0x007880f8: 0x00080040 : Dword 2
    Data DWord: 524352
register L3CNTLREG2 (0xb020) : 0x80040
    SLM Enable: 0
    URB Allocation: 32
    URB Low Bandwidth: 0
    RO Allocation: 32
    RO Low Bandwidth: 0
    DC Allocation: 0
    DC Low Bandwidth: 0

v2: Drop unused arguments (Sirisha)
    Print out register name

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
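To show what such a handler does with the data dword, here is a minimal bitfield decoder. The bit positions are assumptions chosen so that 0x80040 decodes to the values printed above; the authoritative layout lives in the genxml register definitions, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [start, end] (inclusive) from a 32-bit register value. */
uint32_t get_bits(uint32_t value, unsigned start, unsigned end)
{
   assert(start <= end && end < 32);
   unsigned width = end - start + 1;
   uint32_t mask = width == 32 ? ~0u : (1u << width) - 1;
   return (value >> start) & mask;
}

struct l3cntlreg2 {
   uint32_t slm_enable, urb_allocation, urb_low_bandwidth;
   uint32_t ro_allocation, ro_low_bandwidth;
   uint32_t dc_allocation, dc_low_bandwidth;
};

/* ASSUMED field positions, for illustration only. */
struct l3cntlreg2 decode_l3cntlreg2(uint32_t v)
{
   struct l3cntlreg2 r;
   r.slm_enable        = get_bits(v, 0, 0);
   r.urb_allocation    = get_bits(v, 1, 6);
   r.urb_low_bandwidth = get_bits(v, 7, 7);
   r.ro_allocation     = get_bits(v, 14, 19);
   r.ro_low_bandwidth  = get_bits(v, 20, 20);
   r.dc_allocation     = get_bits(v, 21, 26);
   r.dc_low_bandwidth  = get_bits(v, 27, 27);
   return r;
}
```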
2016-09-15 | isl: Finish tiling filtering for Gen6. | Kenneth Graunke | 3 | -5/+15
Gen6 only has one additional restriction over Gen7+, so we just add it to the existing gen7 function (which actually covers later gens too). This should stop FINISHME spew when running GL on Sandybridge. v2: Fix bytes per block vs. bits per block confusion (Jason) and rename function to gen6_filter_tiling (Jason and Chad). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-09-15 | nir: Add a flag to lower_io to force "sample" interpolation | Jason Ekstrand | 1 | -1/+1
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-14 | anv/cmd_buffer: Set the L3 atomic disable mask bit in CHICKEN3 on HSW | Jason Ekstrand | 2 | -0/+2
Without this bit set, the value in "L3 Atomic Disable" won't get applied by the hardware so we won't properly get L3 atomic caching. Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex and 198 of the dEQP-VK.image.atomic_operations.* tests on HSW Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-09-14 | intel/blorp: Stop setting 3DSTATE_DRAWING_RECTANGLE | Jason Ekstrand | 2 | -20/+0
The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT at the GPU initialization time and never sets it again. The GL driver sets it every time the framebuffer changes. Originally, blorp set it to the size of the drawing area but meant we had to set it back in the Vulkan driver. Instead, we can easily just do that in the GL driver's blorp_exec implementation and not set it in blorp core. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-14 | intel/blorp: Emit 3DSTATE_MULTISAMPLE directly | Jason Ekstrand | 2 | -40/+45
Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE. However, now that Vulkan and GL use the same sample positions, we can set up 3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-14 | intel: Move Vulkan sample positions to common code | Jason Ekstrand | 4 | -21/+21
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-13 | aubinator: Remove bogus "end" parameter in gen_disasm_disassemble() | Sirisha Gandikota | 3 | -10/+12
Earlier, the loop pretends to loop over instructions from "start" to "end", but the callers always pass 8192 for end, which is some huge bogus value. The real loop termination condition is send-with-EOT or 0. (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-13 | aubinator: Make gen_disasm_disassemble handle split sends | Sirisha Gandikota | 1 | -7/+12
Skylake adds new SENDS and SENDSC opcodes, which should be handled in the send-with-EOT check. Make an is_send() helper that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken) v2: Make is_send() much crisper; mix declarations and code to make the code compact (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
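The termination rule from this entry and the previous one (stop at a send with EOT, or at a zero opcode) can be sketched as follows. The opcode enum, its numeric values and the instruction struct are hypothetical simplifications; the real disassembler inspects encoded instruction words.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode numbering, for illustration only. */
enum opcode { OP_ILLEGAL = 0, OP_ADD, OP_SEND, OP_SENDC, OP_SENDS, OP_SENDSC };

struct inst {
   enum opcode opcode;
   bool eot;   /* end-of-thread flag, meaningful for sends */
};

/* True for every member of the send family, including split sends. */
bool is_send(enum opcode op)
{
   return op == OP_SEND || op == OP_SENDC ||
          op == OP_SENDS || op == OP_SENDSC;
}

/* Count instructions up to and including the terminating one.
 * There is no "end" bound: the stream must contain a send-with-EOT
 * or a zero opcode, which is the real loop termination condition. */
size_t count_instructions(const struct inst *insts)
{
   size_t n = 0;
   for (;;) {
      const struct inst *i = &insts[n++];
      if (i->opcode == OP_ILLEGAL)
         break;
      if (is_send(i->opcode) && i->eot)
         break;
   }
   return n;
}
```

Without SENDS and SENDSC in is_send(), a program ending in a split send with EOT would run past its last instruction.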
2016-09-13 | aubinator: Simplify print_dword_val() method | Sirisha Gandikota | 1 | -8/+4
Remove the float/dword union and use iter->p[f->start / 32] directly, as the printf format %08x expects a uint32_t (Ken) v2: Make the cleanup much crisper (Ken) Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>