Age | Commit message | Author | Files | Lines |
|
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91337
Cc: 10.6 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
|
|
Cuts another 12% of vc4_uniforms.o, in exchange for computing it at
CSO creation time.
|
|
In exchange for a bit of space and computation in CSO setup, we cut
vc4_uniform.c (draw time) code size by 4.8%.
|
|
The rest of vc4_program.c is about compiling, while this is about
uniform emit at draw time.
|
|
No code generation changes from this, but it'll be useful to have this
next time I go checking -Wdouble-promotion.
|
|
This field should always be set for gen8. In the bdw PRM, Volume 2d:
Command Reference: Structures under INTERFACE_DESCRIPTOR_DATA, DWORD
6, Bits 9:0, Number of Threads in GPGPU Thread Group:
"This field should not be set to 0 even if the barrier is disabled,
since an accurate value is needed for proper pre-emption."
The HSW PRM doesn't mention that it must always be set, but setting it
should not hurt.
Reported-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
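
As an illustration of the field described above, here is a minimal sketch of packing a thread count into bits 9:0 of a descriptor dword. Only the bit range comes from the PRM text quoted in the commit message; the macro and function names are hypothetical and are not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 9:0 of DWORD 6, per the PRM text quoted above.
 * The names here are hypothetical, chosen for illustration only. */
#define GPGPU_THREAD_COUNT_MASK 0x3ffu

/* Return dw6 with the thread-count field replaced by `threads`.
 * The value must be non-zero, since the PRM says the field should
 * not be set to 0 even if the barrier is disabled. */
static uint32_t set_gpgpu_thread_count(uint32_t dw6, uint32_t threads)
{
    assert(threads > 0 && threads <= GPGPU_THREAD_COUNT_MASK);
    return (dw6 & ~GPGPU_THREAD_COUNT_MASK) | threads;
}
```

The other bits of the dword are preserved, so the helper can be applied after the rest of the descriptor has been filled in.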
|
|
|
|
|
|
Cuts another 88 bytes of compiled code.
|
|
Drops 680 bytes of code, from avoiding a bunch of extra updates to the
next pointer in the struct.
|
|
I needed to rewrite this a bit for safety checking in the next commit.
Despite being a static inline of the same thing that was being done, we
lose 36 bytes of code for some reason.
|
|
Now that RCL generation is in the kernel, we don't have any other
callers. Oddly, the compiler generates another 8 bytes of code for
this, but the simplification is worth it.
|
|
Now that we don't resize the CL as we build (it's set up at the top by
vc4_start_draw()), we can store the pointers instead of offsets from
the base. Saves a bit of math in emitting relocs (about 60 bytes of
code).
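
The idea of storing a pointer rather than an offset from the base can be sketched roughly as follows. This is a simplified, hypothetical command list, not the real vc4 structure; it assumes the buffer is sized once up front and never reallocated, which is what makes a stored pointer safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command list; field and function names are assumptions. */
struct cl {
    uint8_t *base;  /* start of the preallocated buffer */
    uint8_t *next;  /* write cursor, kept as a pointer instead of an offset */
    size_t size;    /* fixed capacity, set once up front */
};

static void cl_init(struct cl *cl, uint8_t *storage, size_t size)
{
    cl->base = storage;
    cl->next = storage;
    cl->size = size;
}

/* Append a 32-bit value. No base + offset math is needed on each write. */
static void cl_u32(struct cl *cl, uint32_t v)
{
    assert((size_t)(cl->next - cl->base) + sizeof(v) <= cl->size);
    memcpy(cl->next, &v, sizeof(v)); /* memcpy avoids unaligned stores */
    cl->next += sizeof(v);
}

/* The offset is derived only when something actually asks for it. */
static size_t cl_offset(const struct cl *cl)
{
    return (size_t)(cl->next - cl->base);
}
```

If the buffer could still be resized while it is being built, the stored pointer would dangle after a reallocation, which is why the offset form is needed in that case.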
|
|
|
|
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
|
|
Extend the existing lower_ubo_reference pass to also detect SSBO loads
and lower them to __intrinsic_load_ssbo intrinsics.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Extend the existing lower_ubo_reference pass to also detect SSBO writes
and lower them to __intrinsic_store_ssbo intrinsics.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Since the backing storage for these is shared we cannot ensure that
the value won't change by writes from other threads. Normally SSBO
accesses are not guaranteed to be synchronized with other threads,
except when memoryBarrier is used. So, we might be able to optimize
some SSBO accesses, but for now we always take the safe path and emit
the SSBO access.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Since the backing storage for these is shared we cannot ensure that
the value won't change by writes from other threads. Normally SSBO
accesses are not guaranteed to be synchronized with other threads,
except when memoryBarrier is used. So, we might be able to optimize
some SSBO accesses, but for now we always take the safe path and emit
the SSBO access.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Since the backing storage for these is shared we cannot ensure that
the value won't change by writes from other threads. Normally SSBO
accesses are not guaranteed to be synchronized with other threads,
except when memoryBarrier is used. So, we might be able to optimize
some SSBO accesses, but for now we always take the safe path and emit
the SSBO access.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
If we kill dead assignments we lose the buffer writes.
Also, we never kill UBO declarations even if they are never referenced
by the shader; they are always considered active. Although the spec
does not seem to say this specifically for SSBOs, it is probably implied
since SSBOs are pretty much the same as UBOs, only that you can write
to them.
v2:
- Fix the comment (Jordan)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Otherwise we can lose writes into the buffers backing the variables.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
v2:
- Fix error message (Jordan)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
v2:
- Add space before const (Jordan)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
v2:
- Remove the extra spaces (Jordan)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
v2:
- Fix indentation, used tabs instead of spaces. (Jordan)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Per the GL_ARB_shader_storage_buffer_object extension, shader storage blocks
have the same limitations as uniform blocks.
This patch fixes the corresponding error messages.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Section 4.3.7 "Buffer Variables", GLSL 4.30 spec:
"Buffer variables may only be declared inside interface blocks
(section 4.3.9 “Interface Blocks”), which are then referred to as
shader storage blocks. It is a compile-time error to declare buffer
variables at global scope (outside a block)."
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Section 4.3.7 "Buffer Variables" of the GLSL 4.30 spec:
"Buffer variables cannot have initializers."
v2:
- Rewrite error message (Jordan)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
See GLSL 4.30 spec, section 4.4.5 "Uniform and Shader Storage Block
Layout Qualifiers".
v2:
- Add whitespace in an error message. Delete period '.' at the end of that
error message (Jordan).
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
v2:
- Set MaxShaderStorageBlocks to 8.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
This includes the array of bindings, the current buffer bound to the
GL_SHADER_STORAGE_BUFFER target and a set of general limits and default
values for shader storage buffers.
v2:
- Use spec values for the new defined constants (Jordan)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
This is used to identify shader storage buffer interface blocks where
buffer variables are declared.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Since this now checks if a variable is inside a uniform or a shader
storage block.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
This will be used to identify buffer variables inside shader storage
buffer objects, which are very similar to uniforms except for a few
differences, the most important of which is that they are writable.
Since buffer variables are so similar to uniforms, we will almost always
want them to go through the same paths as uniforms.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
The util/hash_table was intended to be a fast hash table
replacement for the program/hash_table; see 35fd61bd99c1 and 72e55bb6888ff.
This change replaces some more uses of the old hash table.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
|
|
Inspired by (copied from) Marek's commit for egl/x11:
commit 0b56e23e7f3 (egl/dri2: use the correct screen index)
v2: Fix copy/pasta errors.
Cc: 10.6 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
Most of the data stored (duplicated) was unused; for the one field that
is used, follow the approach set by other drivers.
This eliminates the use of legacy (dri1) types.
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
|
|
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
|
|
The former handles O_CLOEXEC (and the lack of it) appropriately.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
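
For context, handling O_CLOEXEC "and the lack of it" usually follows a pattern like the sketch below. This is an illustrative helper with a hypothetical name, not the Mesa function the commit refers to: it tries O_CLOEXEC first and falls back to setting FD_CLOEXEC with fcntl() where the flag is unavailable or rejected.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` so that the descriptor is closed on exec. */
static int open_cloexec(const char *path, int flags)
{
    int fd;

#ifdef O_CLOEXEC
    fd = open(path, flags | O_CLOEXEC);
    if (fd >= 0)
        return fd;
    /* Only fall back when the flag itself was rejected. */
    if (errno != EINVAL)
        return -1;
#endif

    fd = open(path, flags);
    if (fd >= 0) {
        /* Set close-on-exec after the fact. */
        int fdflags = fcntl(fd, F_GETFD);
        if (fdflags == -1 || fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) == -1) {
            close(fd);
            return -1;
        }
    }
    return fd;
}
```

The fallback path leaves a short window between open() and fcntl() in which a concurrent fork/exec could inherit the descriptor, which is why the atomic O_CLOEXEC flag is preferred when it is available.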
|
|
The former handles O_CLOEXEC (and the lack of it) appropriately.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
|