This month, the latest driver releases appear to be Forceware 295.73 for GeForce and Catalyst 12.x preview 8.96 for Radeon.
Catalyst 12.x preview 8.96 looks like a leak, but it is available online and performs relatively well, as it fixes a few bugs. Following some discussions about SNORM textures and texture format conversions in OpenGL, I decided to write up this status for programmers interested in these gritty OpenGL details.
This is a follow-up to some discussions that resulted from last month's SNORM texture status.
Typically with OpenGL textures, we live in a convenient world where texture data in whatever format is converted automatically by the implementation to whatever internal format the user requested or, to be precise, to an internal format with at least enough precision to store the requested internal format. For example, a lot of GPUs no longer support the RGB5A1 format, but this format remains part of the OpenGL 4.2 specification. An implementation may implement it with RGBA8 instead, for example. This conversion support is so wide that we can even submit RGBA32F data and convert it to DXT1 using glTexImage2D if we wish. I am not saying I can imagine a good reason to do that, but it's possible.
When EXT_texture_integer was released, it seems that all these conversions had not put the authors of the extension in a good mood: the extension prevents any kind of conversion by forcing an application that wants to use an integer texture to use one of the dedicated integer formats, without any format conversion possible.
In parallel to this extension, the EXT_texture_snorm extension was released, providing snorm textures (normalized textures between -1.0 and 1.0 instead of 0.0 and 1.0). This extension, written against OpenGL 3.0, followed the same precedent as every other OpenGL core texture format, which means that we could convert any texture format data to a snorm internal format.
Unfortunately, and probably because OpenGL tends to build up more exceptions than French grammar does, the OpenGL core specification (from 3.1 onward) allows conversion of only signed format data (GL_BYTE, GL_SHORT, GL_INT) to snorm textures.
What do implementations do? The AMD implementation exposes EXT_texture_snorm, so we should expect to be able to create a snorm texture from unsigned integer data. Unfortunately, and this is the finding of this status, the result isn't correct. On the NVIDIA side, EXT_texture_snorm is not exposed, so we should get an OpenGL error when trying to convert unsigned format data to a snorm texture. However, this passes and creates a functional texture.
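Whether a driver exposes EXT_texture_snorm can be checked by looking for its name in the extension list the driver reports. Here is a minimal sketch, assuming the names have already been collected into a vector (in a real application they could come from glGetStringi with GL_EXTENSIONS). `has_extension` is a hypothetical helper for illustration, not part of OpenGL.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns true when `name` is present in the list of
// extension names. The list itself is assumed to be filled elsewhere, for
// example with glGetStringi(GL_EXTENSIONS, i) for each reported extension.
bool has_extension(std::vector<std::string> const& extensions, std::string const& name)
{
    return std::find(extensions.begin(), extensions.end(), name) != extensions.end();
}
```

With such a check, a sample can report "extension string missing" separately from "conversion not performed", which are the two different failures observed here.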
I received some feedback, and I am taking advantage of this quick update to comment on it and update the drivers status.
New evidence demonstrates that I was wrong about snorm textures: the OpenGL specification allows converting anything to SNORM textures, as its description of pixel data shows:
Data are taken from the currently bound pixel unpack buffer or client memory as a sequence of signed or unsigned bytes (GL data types byte and ubyte), signed or unsigned short integers (GL data types short and ushort), signed or unsigned integers (GL data types int and uint), or floating point values (GL data types half and float). These elements are grouped into sets of one, two, three, or four values, depending on the format, to form a group. Table 3.3 summarizes the format of groups obtained from memory; it also indicates those formats that yield indices and those that yield floating-point or integer components.
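To make the unsigned-to-snorm case concrete, here is a small CPU-side sketch of what such a conversion could look like for 8-bit components. It assumes the OpenGL 4.2 style normalized integer equations and assumes the conversion goes through a floating-point intermediate. The function names are made up for illustration, and this is not a description of what either driver actually does.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Unsigned normalized integer to float: f = c / (2^b - 1), here with b = 8.
float unorm8_to_float(std::uint8_t c)
{
    return static_cast<float>(c) / 255.0f;
}

// Signed normalized integer to float: f = max(c / (2^(b-1) - 1), -1.0), b = 8.
float snorm8_to_float(std::int8_t c)
{
    return std::max(static_cast<float>(c) / 127.0f, -1.0f);
}

// Float to signed normalized integer: clamp to [-1, 1], scale by
// 2^(b-1) - 1 and round to the nearest integer.
std::int8_t float_to_snorm8(float f)
{
    float const clamped = std::min(std::max(f, -1.0f), 1.0f);
    return static_cast<std::int8_t>(std::lround(clamped * 127.0f));
}

// Hypothetical helper: the texel an unsigned byte component would become in
// an 8-bit snorm texture under the assumptions stated above.
std::int8_t ubyte_to_snorm8(std::uint8_t c)
{
    return float_to_snorm8(unorm8_to_float(c));
}
```

Under these assumptions, unsigned input only ever reaches the non-negative half of the snorm range: 0 stays 0 and 255 becomes 127, which reads back as 1.0.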
I didn't really take the time to get to the bottom of this, but it looks like there is a problem: the sample displays a black screen and no OpenGL error is generated, yet the sample still works on AMD. I can't exclude that I did something wrong.
The GLSL 4.20 specification clearly states that vertex inputs can't be structures, but the NVIDIA implementation supports them. Even if the GLSL compiler should generate an error, it sounds like a good idea, and it doesn't look like it's an issue for the enumeration API.
"Vertex shader inputs can also form arrays of these types, but not structures."
I can't say I really understand GLSL implicit conversions. If it were up to me, they would all generate a GLSL error, and I think an implementation should at least generate a warning for each, like a C++ compiler would. GLSL defines a clear list in section "4.1.10 Implicit Conversions". In this list, the allowed conversions are always between types with the same number of components. It seems that in some cases the AMD implementation allows implicit conversions between vectors of different sizes.
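The rule that implicit conversions never change the number of components can be sketched as a small predicate. This is only an illustration of my reading of the GLSL 4.20 list, restricted to scalars and vectors (matrices are left out). The `Base`, `Type` and `implicitly_convertible` names are invented for this example.

```cpp
// Base types taking part in implicit conversions, ordered so that a
// conversion is only allowed toward a higher rank:
// int -> uint -> float -> double.
enum class Base { Int = 0, Uint = 1, Float = 2, Double = 3 };

// A scalar (components == 1) or a vector (components between 2 and 4).
struct Type
{
    Base base;
    int components;
};

// Returns true when `from` may be implicitly converted to `to`.
// Identical types return false because no conversion is involved.
bool implicitly_convertible(Type from, Type to)
{
    if (from.components != to.components)
        return false; // e.g. vec3 to vec4 is never implicit
    return static_cast<int>(from.base) < static_cast<int>(to.base);
}
```

With this predicate, an ivec3 passed where a vec3 is expected is accepted, while a vec3 passed where a vec4 is expected is rejected, which is the case the AMD compiler seems to let through.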
If you have feedback on this, please don't hesitate to drop me a mail.
Once again, don't forget to contribute to the OpenGL community by reporting your bugs!
These tests were done on Windows 7 64-bit with the OpenGL Samples Pack 4.2.4 branch, still in development, on a GeForce GTX 470 and a Radeon HD 5850.
I think that OpenGL 4.2 is so much better than OpenGL 4.1 because it completely clarifies the interface matching between shader stages. However, I discovered a new corner case: with linked programs, if a built-in block is declared in the vertex shader stage but the next shader stage doesn't declare it, the result is undefined and leads either to a silent error (NVIDIA implementation) or to a sort of luck (AMD implementation, where it works).
OpenGL Samples Pack 4.2.4 wip, OpenGL specification tests | AMD Catalyst 12.2 preview, 8.94 (25/01/2012) | AMD Catalyst 12.x preview, 8.96 (14/02/2012) | NVIDIA Forceware 290.53 (22/12/2011) | NVIDIA Forceware 295.73 (22/02/2012) |
---|---|---|---|---|
420-transform-feedback-instanced | ||||
420-texture-storage | Allows an implicit cast on texture coordinates parameter | |||
420-texture-pixel-store | ||||
420-texture-conversion | Immutable texture and BC7 conversions are not working | Immutable texture and BC7 conversions are not working | ||
420-texture-compressed | ||||
420-test-depth-conservative | ||||
420-sampler-fetch | ||||
420-memory-barrier | ||||
420-interface-matching | glGetAttribLocation fails to return the location here | Structure for vertex inputs supported | ||
420-image-unpack | Unpack isn't correct? | |||
420-image-store | glClear is skipped for the first frame | |||
420-image-load | ||||
420-fbo-layered | If a vertex shader declares a built-in block and the geometry shader doesn't, the result is undefined. | If a vertex shader declares a built-in block and the geometry shader doesn't, the result is undefined. | ||
420-draw-base-instance | ||||
420-direct-state-access-ext | Unsupported DSA storage functions | Unsupported DSA storage functions | ||
420-buffer-uniform | ||||
420-atomic-counter | ||||
410-program-separate-dsa-ext | ||||
410-program-binary | May crash if the binary is not AMD's | May crash if the binary is not AMD's | ||
410-program-64 | ||||
410-primitive-tessellation-5 | Bug on the shader interface matching: Block member not active with linked separated program | Bug on the shader interface matching: Block member not active with linked separated program | ||
410-primitive-tessellation-2 | ||||
410-primitive-instanced | ||||
400-transform-feedback-stream | ||||
400-transform-feedback-object | EXT_transform_feedback extension string missing | EXT_transform_feedback extension string missing | ||
400-texture-buffer-rgb | ||||
400-sampler-gather | ||||
400-sampler-fetch | ||||
400-sampler-array | ||||
400-program-varying-structs | ||||
400-program-varying-blocks | ||||
400-program-subroutine | ||||
400-program-64 | ||||
400-primitive-tessellation | ||||
400-primitive-smooth-shading | ||||
400-primitive-instanced | ||||
400-fbo-rtt-texture-array | ||||
400-fbo-rtt | ||||
400-fbo-multisample | ||||
400-fbo-layered | ||||
400-draw-indirect | ||||
420-debug-output | DebugControl doesn't work, null-terminated strings generate errors | DebugControl doesn't work, null-terminated strings generate errors | ||
400-blend-rtt | ||||
330-transform-feedback-separated | ||||
330-transform-feedback-interleaved | ||||
330-texture-pixel-store | ||||
330-texture-format | SNORM conversion not performed | SNORM conversion not performed | EXT_texture_snorm string missing | EXT_texture_snorm string missing |
330-primitive-point-sprite | Pop free clipping | Pop free clipping | ||
330-fbo-srgb | ||||
330-draw-without-vertex-attrib | ||||
330-buffer-type | i32 vertex input data not supported |
OpenGL Samples Pack 4.2.4-wip, proprietary features | AMD Catalyst 12.2 preview, 8.94 (25/01/2012) | AMD Catalyst 12.x preview, 8.96 (14/02/2012) | NVIDIA Forceware 290.53 (22/12/2011) | NVIDIA Forceware 295.73 (22/02/2012) |
---|---|---|---|---|
420-texture-copy-nv | ||||
420-test-depth-clamp-separate-amd | AMD_depth_clamp_separate not supported | AMD_depth_clamp_separate not supported | ||
420-primitive-bindless-nv | NV_shader_buffer_load not supported | NV_shader_buffer_load not supported | ||
420-fbo-srgb-decode-ext | EXT_texture_sRGB_decode not supported | EXT_texture_sRGB_decode not supported | ||
420-fbo-multisample-position-amd | AMD_sample_positions not supported | AMD_sample_positions not supported | ||
420-fbo-multisample-dsa-nv | NV_texture_multisample not supported | NV_texture_multisample not supported | ||
420-draw-indirect-amd | AMD_multi_draw_indirect not supported | AMD_multi_draw_indirect not supported | ||
420-buffer-pinned-amd | AMD_pinned_memory not supported | AMD_pinned_memory not supported | ||
420-buffer-barrier-gtc | Works as desired | Works as desired | Generates an invalid operation as specified | Generates an invalid operation as specified |
420-blend-op-amd | This is a Radeon 6900+ series feature | This is a Radeon 6900+ series feature | ||
330-fbo-multisample-explicit-nv |