This is [[IanRomanick|IanRomanick]]'s rather lengthy todo list. I basically add stuff to this list whenever I think of or find something that needs doing. It is _not_ a list of things that I have a plan for actually doing. :)

---

## General Optimization

* Modify texmem interface so that mipmaps can be fragmented. The individual mipmaps don't need to be right next to each other in memory. This could be helpful for some chips like Matrox (G200 and G400 family), SiS, and ATIRage128.
* Modify drivers to drop either the back-buffer or the depth-buffer if there is not enough on-card memory. The idea is that the DDX driver will allocate as much memory as it can. Hopefully there is enough for two full size buffers. There obviously has to be enough for the front buffer. The other buffer can be used for either the back-buffer or the depth-buffer. It then exports visuals that use only one extra buffer normally. Visuals that use both the back-buffer and the depth-buffer are exported as "slow." These visuals end up being indirect only.
    * The radeon driver already supports a degenerate version of this, where the back buffer can be disabled. Also, I don't see why you'd need to force the slow visuals indirect, you should be able to do them as software renderbuffers, at least for the case of no hardware back buffer. -- [[AdamJackson|AdamJackson]]
* Add support for frustum culling of display lists. When a display list is compiled, generate a bounding box for it. When the list is rendered, test the bounding box against the current view frustum. If it's outside the view, don't render it. This could help even HW TCL cards and should improve our [[ViewPerf|ViewPerf]] scores! It might be possible to apply this optimization to some other cases as well (e.g., CVA and VBOs).
* Change the way [[GL_SGIS_generate_mipmap|http://oss.sgi.com/projects/ogl-sample/registry/SGIS/generate_mipmap.txt]] is implemented.
    Right now [[GL_SGIS_generate_mipmap|http://oss.sgi.com/projects/ogl-sample/registry/SGIS/generate_mipmap.txt]] is implemented by creating the mipmaps as soon as the base texture is modified. This makes it impossible to implement support in hardware. A better way to do it would be to keep a bit mask with one bit per level. When the base level is modified, each of the lower levels is marked "dirty". If the application provides a texture for a level, it gets marked "clean". At texture upload, the driver has to do something "smart" with the dirty levels. There are some tricky parts. The app can enable mipmap generation, set the base level, disable mipmap generation, and modify the base level. In that case, the base level no longer holds the texture that should be used to generate the lower levels. A number of solutions are available, including falling back to software generation, but I'm not sure which one is best. It will take some experimentation. There is also the problem of doing a [[CopyTexSubImage|CopyTexSubImage]] (or even [[TexSubImage|TexSubImage]]) to a hardware generated level. This is another case where we could fall back to software.

## New / improved extension support

* Add support for [[GL_SGIS_fog_function|http://oss.sgi.com/projects/ogl-sample/registry/SGIS/fog_func.txt]]. This is essentially table based fog. For hardware that supports table based fog, we convert the app specified fog function into the fog table.
* Improve support for [[GL_APPLE_client_storage|http://oss.sgi.com/projects/ogl-sample/registry/APPLE/client_storage.txt]]. Right now this extension really only works as a hack in the R200 driver. However, Mesa should be able to detect if the driver format matches the source texture format. If it is a match, Mesa should keep a pointer to the client's data instead of making a copy. If this were done, we could get at least some benefit from the extension on all drivers.
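
The format-match idea in the GL_APPLE_client_storage item could look roughly like the sketch below. It is only an illustration: `tex_format`, `tex_image`, and `attach_client_pixels` are hypothetical names, not Mesa's real types or entry points, and the mismatch path just copies bytes where a real driver would convert to its own format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified format tags; real Mesa has its own format enums. */
enum tex_format { FMT_RGBA8888, FMT_RGB565, FMT_L8 };

/* Hypothetical per-level image record. */
struct tex_image {
    enum tex_format format; /* format the driver wants to store this level in */
    const void *data;       /* pixels the driver will read at upload time */
    void *owned;            /* non-NULL only when we hold a private copy */
};

/* Attach client pixels to an image. Returns true when the client's pointer
 * was kept (zero-copy), false when a private copy had to be made. */
bool attach_client_pixels(struct tex_image *img, enum tex_format client_format,
                          const void *pixels, size_t size,
                          bool client_storage_enabled)
{
    assert(img != NULL && pixels != NULL);

    /* Drop any copy left over from a previous upload. */
    free(img->owned);
    img->owned = NULL;

    if (client_storage_enabled && client_format == img->format) {
        /* Formats match: the app has promised to keep the data alive. */
        img->data = pixels;
        return true;
    }

    /* Mismatch (or extension disabled): fall back to the usual copy.
     * The memcpy stands in for a real convert-and-store step. */
    img->owned = malloc(size);
    if (img->owned != NULL)
        memcpy(img->owned, pixels, size);
    img->data = img->owned;
    return false;
}
```

The point of the split is that the zero-copy path needs no driver-specific knowledge beyond the format comparison, which is why it could live in core Mesa.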
* Look at adding support for [[GL_SGIX_resample|http://oss.sgi.com/projects/ogl-sample/registry/SGIX/resample.txt]], [[GL_OML_subsample|http://oss.sgi.com/projects/ogl-sample/registry/OML/subsample.txt]], [[GL_OML_resample|http://oss.sgi.com/projects/ogl-sample/registry/OML/resample.txt]], [[GL_SGIX_interlace|http://oss.sgi.com/projects/ogl-sample/registry/SGIX/interlace.txt]], [[GL_OML_interlace|http://oss.sgi.com/projects/ogl-sample/registry/OML/interlace.txt]], [[GL_INGR_interlace_read|http://oss.sgi.com/projects/ogl-sample/registry/INGR/interlace_read.txt]].
* Add support for [[GL_SGIS_texture_color_mask|http://oss.sgi.com/projects/ogl-sample/registry/SGIS/texture_color_mask.txt]].
* Add support for [[GL_ATI_envmap_bumpmap|http://oss.sgi.com/projects/ogl-sample/registry/ATI/envmap_bumpmap.txt]].
    * **Done** in mesa [[mesa: add support for ATI_envmap_bumpmap|http://cgit.freedesktop.org/mesa/mesa/commit/?id=114152e068ec919feb0a57a1259c2ada970b9f02]]

    In terms of advancing OpenGL on Linux, there's no real point to implementing this extension. However, by implementing this extension Wine can implement DirectX EMBM without having to use fragment programs. This is a good thing since the only open-source driver that supports fragment programs is the i915. As another twist, it should be possible to add support to the texture_env_combine-to-fragment program conversion code for this extension. Then any future card that supports [[GL_ARB_fragment_program|http://oss.sgi.com/projects/ogl-sample/registry/ARB/fragment_program.txt]] would automatically get support.
* Add support for [[GL_NV_texture_env_combine4|http://oss.sgi.com/projects/ogl-sample/registry/NV/texture_env_combine4.txt]].
    * **Done** in mesa [[mesa: enable GL_NV_texture_env_combine4 for sw drivers|http://cgit.freedesktop.org/mesa/mesa/commit/?id=d4757cd02aeebe1a3072f35b5134ad5e278e3a6f]]

    I've been told that there are quite a few apps that support this extension. That's not terribly surprising to me.
    It has been supported by Nvidia hardware since the original TNT. As another twist, it should be possible to add support to the texture_env_combine-to-fragment program conversion code for this extension. Then any future card that supports [[GL_ARB_fragment_program|http://oss.sgi.com/projects/ogl-sample/registry/ARB/fragment_program.txt]] would automatically get support.
* Implement GLX_SUN_init_threads. Right now libGL dynamically determines if an application is multithreaded. There is a small (very small) race condition in this code. If an application were to tell libGL in advance that it was going to be multithreaded, we could completely avoid that race. What might be more interesting is to have an extension where applications could tell libGL that they were going to be exclusively single-threaded. That way we could build specialized DRI drivers that don't do some of the locking that is otherwise required. That could help the performance of some single-threaded, CPU-bound applications. At this point it may not be worth the effort. With Hyperthreading and multicore CPUs becoming more common, more and more applications are going to be multithreaded. A [[manual page|http://www.mail-archive.com/oglbase-discuss@corp.sgi.com/msg00113.html]] for the SUN extension is available and [[Google|http://www.google.com/search?as_q=&num=50&hs=Gfx&hl=en&client=firefox-a&rls=org.mozilla%3Aen-US%3Aofficial&btnG=Google+Search&as_epq=&as_oq=GLX_SUN_init_threads+SUN_init_threads&as_eq=&lr=&as_ft=i&as_filetype=&as_qdr=all&as_occt=any&as_dt=i&as_sitesearch=&as_rights=&safe=images]] turns up many hits.
* Implement [[GL_EXT_texture_filter_anisotropic|http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_filter_anisotropic.txt]] using "rip maps" in software Mesa. [[Rip maps|http://www.sgi.com/products/software/performer/brew/anisotropic.html]] are a cheap way to implement anisotropic texture filtering. Having some sort of anisotropic filtering in software Mesa would be nice.
    As an added bonus, hardware that doesn't implement anisotropic filtering could re-use most of the swrast infrastructure. There would be some trickery involved, and the performance wouldn't be that great. However, it _would_ be better than nothing.

## Hardware Specific

* Add support to Matrox driver for storing depth-buffer in AGP memory. Starting with the G400, the depth and stencil buffers can be stored in AGP memory. There are two things that are particularly interesting about this. First, it can enable display modes for which there is not enough on-card memory to store the front, back, and depth buffers. This means that 3D acceleration would be possible at 1600x1200x24-bit on a 16MB card. In addition, having the depth-buffer in AGP memory would free up on-card memory for textures. It would be very interesting to me to see how this would impact performance of some applications.
* Add support for [[GL_EXT_texture_filter_anisotropic|http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_filter_anisotropic.txt]] to the Matrox driver. We already know how to enable it. That's not the problem. The problem is that not all G400-class hardware supports it quite right. The G400 and G450 need to use both texture units to implement it. The G550 does not. Before MGA DRM 3.2 we could not tell the difference between a G4x0 and a G550. Using the `MGA_PARAM_CARD_TYPE` parameter to `mga_getparam` we can now do this. I'd be willing to forgo support for this extension on the G4x0 cards. We just need to detect the presence of a G550 and enable it there.
* Add support to SiS driver for storing depth-buffer in AGP memory.
* Add support for [[GL_ATI_envmap_bumpmap|http://oss.sgi.com/projects/ogl-sample/registry/ATI/envmap_bumpmap.txt]] to the Radeon driver.
* Add support for [[GL_ATI_envmap_bumpmap|http://oss.sgi.com/projects/ogl-sample/registry/ATI/envmap_bumpmap.txt]] to the R200 driver. I'm not 100% sure how this works on the R200.
    I suspect that it may require a two-pass fragment shader.
* Add support for [[GL_ATI_envmap_bumpmap|http://oss.sgi.com/projects/ogl-sample/registry/ATI/envmap_bumpmap.txt]] to the R300 driver. The R300 can implement this via a fragment program.
* Add support for [[GL_ATI_envmap_bumpmap|http://oss.sgi.com/projects/ogl-sample/registry/ATI/envmap_bumpmap.txt]] to the Matrox driver.
* Add support for [[GL_ATI_envmap_bumpmap|http://oss.sgi.com/projects/ogl-sample/registry/ATI/envmap_bumpmap.txt]] to the SiS driver.
* Add support for [[GL_ATI_envmap_bumpmap|http://oss.sgi.com/projects/ogl-sample/registry/ATI/envmap_bumpmap.txt]] to the Intel (i915) driver. The i915 can implement this via a fragment program, and the i830 can implement this natively (I think).
* Add support for [[GL_ATI_envmap_bumpmap|http://oss.sgi.com/projects/ogl-sample/registry/ATI/envmap_bumpmap.txt]] to the [[Unichrome|CLE266]] driver.
* Add support for GL_MESA_ycbcr_texture to the 3dfx driver.
* Add support for GL_MESA_ycbcr_texture to the [[S3Virge|S3Virge]] driver (_ha-ha!_).
* Add support for more depth / stencil modes to the SiS driver. The file sis_reg.h leads me to believe that, in addition to the usual 16/0 and 24/8 modes, the SiS hardware can support 15/1, 32/0, 31/1, 30/2, and 28/4 modes. Mesa can't support the 32/0 mode, but all the rest should be doable. The trick is that this would require changes to both the client-side driver and the server-side driver. Care would be needed to prevent breaking the combination of an old client-side driver with a new server-side driver.
* Investigate adding support for [[GL_EXT_texture_filter_anisotropic|http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_filter_anisotropic.txt]] to the SiS driver. There are some bits in sis_reg.h (e.g., `MASK_TextureAnisotropyRatio`) that lead me to believe that the SiS hardware has some form of support for anisotropic texture filtering.
    Just looking at the bits in the header file, it's not 100% clear how it might be implemented. My guess is that the anisotropy value is converted to an integer (4.0? 3.1? 2.2?) and stored in the low 4 bits of `REG_3D_Texture[01]Mip`. Then, `MASK_Texture0AnisotropicEnable` is set in `REG_3D_TEnable2`. Is that it?
    * Using the patch at [[http://people.freedesktop.org/~anholt/sis-anisotropic.diff|http://people.freedesktop.org/~anholt/sis-anisotropic.diff]] and testing all 15 values of the aniso field using texfilt, no visual difference was seen. --[[EricAnholt|EricAnholt]]
* Add support for GL_EXT_texture_compression_s3tc to the SiS driver. There are bits in sis_reg.h for S3TC support.
* Add support for GL_ARB_texture_env_combine to the [[S3Savage|S3Savage]] driver. The [[Windows driver|http://www.delphi3d.net/hardware/viewreport.php?report=800]] supports GL_ARB_texture_env_combine, and it appears that there is enough information in savage_3d_reg.h to implement it. The meaning of some of the bits in savageRegTexBlendCtrl will take some experimentation and guesswork.
* Add support for mirrored texture wrap modes to the [[S3Savage|S3Savage]] driver. Savage4 has bits to support some form of mirrored texture wrapping. It's probably GL_ARB_texture_mirrored_repeat. Investigate, implement, and test.
    * The wrapping field actually has only 3 documented values, all of which we use (we're cheating on clamping, as several drivers do, where we use a `_CLAMP_TO_EDGE` value for `_CLAMP`). I tried sticking the fourth possible value of the field in the GL_MIRRORED_REPEAT case (arbitrarily), and got the results at [[sis-badwrap.png|http://people.freedesktop.org/~anholt/savage-badwrap.png]] (top window is hardware, bottom is indirect). Looks kinda like MIRROR_CLAMP_TO_EDGE but flipped. Actually, the plain CLAMP wrapping shows issues on my Savage4, as you can see in the 2nd and 3rd boxes at the top.
      --[[EricAnholt|EricAnholt]]
* Add support for GL_EXT_fog_coord to the [[S3Savage|S3Savage]] driver. The [[Windows driver|http://www.delphi3d.net/hardware/viewreport.php?report=800]] supports GL_EXT_fog_coord, but I'm not quite sure that all the needed information is in savage_3d_reg.h and savage_bci.h. It looks like savageRegFogCtrl (savage_3d_reg.h) and [[FogMode|FogMode]] (savage_bci.h) play roles, but I don't see exactly how. There appears to be some code in the driver to support fog coordinates, but it doesn't look complete.
* Add support for GL_ARB_point_parameters to the [[S3Savage|S3Savage]] driver. The [[Windows driver|http://www.delphi3d.net/hardware/viewreport.php?report=800]] supports GL_EXT_point_parameters (and the ARB version is virtually identical). It doesn't look like the Savage hardware can do points at all. My guess is that large points will have to be converted to polygons of some sort.
* Add support for 1-bit stencil buffer to [[S3Savage|S3Savage]] driver (Savage3D). The only stencil configuration that the old Savage3D can support is a 15/1 depth/stencil mode. There is currently no support for it, but there should be. The Savage4 may also support this mode, but it's not entirely clear from looking at the code or headers.
* Enhance the set of used texture formats in the [[S3Savage|S3Savage]] driver. The Savage4 hardware (and a few others) can directly support `GL_LUMINANCE4_ALPHA4` (`TFT_A4L4`). This may require some changes to core Mesa. Some Savage4 hardware can also directly support `GL_INTENSITY8` (`TFT_I8`). Code will need to be added to detect hardware that is known to support it correctly, and only use that mode on that hardware.
* Investigate texture filter modes in the [[Unichrome|CLE266]] driver. The file via_3d_reg.h has some interesting bits in the `HC_HTXnFL*` group. The most interesting are `HC_HTXnFLSe_Sharp` (could be related to SGIS_sharpen_texture), `HC_HTXnFLSe_Flat_Gaussian_Cubic`, and `HC_HTXnFLDs_Ani`.
    In any case, it would be useful to figure out what all the various bits do (by way of experimentation) and document it in the header file.
* Add support for GL_EXT_blend_func_separate to the [[Unichrome|CLE266]] driver.
* Add support for GL_EXT_blend_color to the [[Unichrome|CLE266]] driver. It appears that the blend color could be supported the same way that `GL_ZERO` and `GL_ONE` are supported.
* Add support for GL_EXT_blend_minmax to the [[Unichrome|CLE266]] driver. It looks like setting the source blend factor to 1, the destination blend factor to 0, and using `HC_HABLCa_maxSrcDst` or `HC_HABLCa_minSrcDst` should do the trick.
* Add support for GL_EXT_blend_equation_separate to the [[Unichrome|CLE266]] driver. Once support for GL_EXT_blend_minmax is added, support for this extension can also be added.
* Add support for GL_EXT_blend_subtract to the [[Unichrome|CLE266]] driver. The [[Windows drivers|http://www.delphi3d.net/hardware/viewreport.php?report=962]] support this on KM400 / KN400 (and later?) chips. It's not 100% clear to me how this might be done, but there are some possibilities that need to be explored. One thing to try would be the `HC_HALBCbias_*` values. Another thing to try would be `HC_XC_OPCp5` (or using a 3 for that value instead of 2). The last obvious thing to try is `HC_HABLCop_MASK`. This could toggle add and subtract mode. A separate "subtract reverse" mode isn't needed because the hardware is flexible enough to just switch the operands. I actually think that this is the most likely candidate.
* Add support for [[GL_EXT_texture_filter_anisotropic|http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_filter_anisotropic.txt]] to the Intel (i810) driver. Looking at i810_3d_reg.h, it appears that the i810 can do the same sort of anisotropic filtering that the i830 can.
* Add support for GL_ATI_texture_env_combine3 to the Intel (i810) driver. This is a little tricky, but it should be mostly doable.
    The i810 has two texture units, but three texture combine stages. All of the extra operations can be split into two operations. As long as a new op is only used in one of the texture units, it should be fine.
* Add support for GL_EXT_secondary_color to the Intel (i810) driver.
* Add support for GL_EXT_fog_coord to the Intel (i810) driver.
* Add support for GL_ATI_texture_env_combine3 to the Intel (i830) driver. The driver can natively do `GL_MODULATE_ADD_ATI`. The other modes can be emulated by using multiple combine stages.
* Optimize generation of texture combine stages in the Intel (i830) driver. The i830 has several `MODULATE_AND_ADD` modes and a `BLEND_AND_ADD` mode. These can be used to reduce the number of combine stages. By itself this isn't useful. However, it can be used to free up instruction slots for texture environments that are expanded by GL_ATI_texture_env_combine3 support.
* Investigate the "missing" combine modes in the Intel (i830) driver. In the defines for the texture combine modes, there is a gap between `TEXBLENDOP_MODULATE` and `TEXBLENDOP_ADD`. The odds are pretty good that these missing values do *something*. The trick is to figure out what they do and if it's useful.
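
The "split into two operations" idea from the i810 and i830 GL_ATI_texture_env_combine3 items can be checked with plain arithmetic. The sketch below is not driver code and touches no hardware registers. It models one colour channel with inputs in [0, 1], ignores the RGB/alpha scale factors, and uses the three combine functions as I read them from the extension spec (`Arg0*Arg2 + Arg1`, `Arg0*Arg2 + Arg1 - 0.5`, `Arg0*Arg2 - Arg1`). All function names are made up for illustration.

```c
#include <assert.h>

/* Each combine stage clamps its result to [0, 1]. */
float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Single-stage forms of the three GL_ATI_texture_env_combine3 modes. */
float modulate_add(float a0, float a1, float a2)
{
    return clamp01(a0 * a2 + a1);
}

float modulate_signed_add(float a0, float a1, float a2)
{
    return clamp01(a0 * a2 + a1 - 0.5f);
}

float modulate_subtract(float a0, float a1, float a2)
{
    return clamp01(a0 * a2 - a1);
}

/* Two-stage emulation: the first stage is a plain MODULATE, the second
 * stage applies ADD, ADD_SIGNED, or SUBTRACT to the previous result. */
float stage_modulate(float a0, float a2)
{
    return clamp01(a0 * a2);
}

float two_stage_add(float a0, float a1, float a2)
{
    return clamp01(stage_modulate(a0, a2) + a1);
}

float two_stage_signed_add(float a0, float a1, float a2)
{
    return clamp01(stage_modulate(a0, a2) + a1 - 0.5f);
}

float two_stage_subtract(float a0, float a1, float a2)
{
    return clamp01(stage_modulate(a0, a2) - a1);
}

/* Compare with a small tolerance, since a compiler may fuse the multiply
 * and add in the single-stage form. */
int nearly_equal(float x, float y)
{
    float d = x - y;
    return d < 1e-6f && d > -1e-6f;
}
```

Because the product of two values in [0, 1] is itself in [0, 1], the intermediate clamp after the MODULATE stage never changes anything, which is why the split is lossless under these assumptions. It would not be lossless if a scale factor were applied in the first stage.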