Normal operation. Resolve operation. 1.0 2.2 Add and Clamp Add but no Clamp Subtract Dst from Src, and Clamp Subtract Dst from Src, and don't Clamp Minimum of Src, Dst (the src and dst blend functions are forced to D3D_ONE) Maximum of Src, Dst (the src and dst blend functions are forced to D3D_ONE) Subtract Src from Dst, and Clamp Subtract Src from Dst, and don't Clamp 3D destination is not microtiled 3D destination is microtiled 3D destination is square microtiled. Only available in 16-bit No swap Word swap (2 bytes in 16-bit) Dword swap (4 bytes in a 32-bit) Half-Dword swap (2 16-bit in a 32-bit) Truncate Round LUT dither Never Less Equal Less or Equal Greater Than Not Equal Greater or Equal Always Never Less Less or Equal Equal Greater or Equal Greater Than Not Equal Always 2/4 sub-pixel samples. 3/6 sub-pixel samples. Solid fill color Flat shading Gouraud shading Horizontal Vertical Square (horizontal or vertical depending upon slope) Computed (perpendicular to slope) Draw points. Draw lines. Draw triangles. Round to trunc Round to nearest Disable stencil auto inc/dec (def). Enable stencil auto inc/dec based on triangle cw/ccw, force into dzy low bit. Force 0 into dzy low bit. 16 words 32 words 64 words 128 words 32 words 64 words 128 words 256 words 64 words 128 words 256 words 512 words 0 words 4 words 8 words 12 words Select C0A Select C1A Select C2A Select C3A Select 1/(1/W) Select Z Select Z Select 1/(1/W) Select (1/W) Select 1.0 1x1 tile (one 1x1). 2 tiles (two 1x1 : ST-A,B). 4 tiles (one 2x2). 8 tiles (two 2x2 : ST-A,B). 16 tiles (one 4x4). 32 tiles (two 4x4 : ST-A,B). 64 tiles (one 8x8). 128 tiles (two 8x8 : ST-A,B). ST-A tile. ST-B tile. Select 1/12 subpixel precision. Select 1/16 subpixel precision. 
Sample texture coordinates at real pixel centers Sample texture coordinates at adjusted pixel centers Four components (R,G,B,A) Three components (R,G,B,0) Three components (R,G,B,1) One component (0,0,0,A) Zero components (0,0,0,0) Zero components (0,0,0,1) One component (1,1,1,A) Zero components (1,1,1,0) Zero components (1,1,1,1) C0 - 1st texture component C1 - 2nd texture component C2 - 3rd texture component C3 - 4th texture component K0 - The value 0.0 K1 - The value 1.0 L-in,R-in,HT-in,HB-in L-in,R-in,HT-in,HB-out L-in,R-in,HT-out,HB-in L-in,R-in,HT-out,HB-out L-in,R-out,HT-in,HB-in L-in,R-out,HT-in,HB-out L-in,R-out,HT-out,HB-in L-in,R-out,HT-out,HB-out L-out,R-in,HT-in,HB-in L-out,R-in,HT-in,HB-out L-out,R-in,HT-out,HB-in L-out,R-in,HT-out,HB-out L-out,R-out,HT-in,HB-in L-out,R-out,HT-in,HB-out L-out,R-out,HT-out,HB-in L-out,R-out,HT-out,HB-out T-in,B-in,VL-in,VR-in T-in,B-in,VL-in,VR-out T-in,B-in,VL,VR-in T-in,B-in,VL-out,VR-out T-out,B-in,VL-in,VR-in T-out,B-in,VL-in,VR-out T-out,B-in,VL-out,VR-in T-out,B-in,VL-out,VR-out T-in,B-out,VL-in,VR-in T-in,B-out,VL-in,VR-out T-in,B-out,VL-out,VR-in T-in,B-out,VL-out,VR-out T-out,B-out,VL-in,VR-in T-out,B-out,VL-in,VR-out T-out,B-out,VL-out,VR-in T-out,B-out,VL-out,VR-out L-in,R-in,HT-in,HB-in L-in,R-in,HT-in,HB-out L-in,R-in,HT-out,HB-in L-in,R-in,HT-out,HB-out L-in,R-out,HT-in,HB-in L-in,R-out,HT-in,HB-out L-in,R-out,HT-out,HB-in L-in,R-out,HT-out,HB-out L-out,R-in,HT-in,HB-in L-out,R-in,HT-in,HB-out L-out,R-in,HT-out,HB-in L-out,R-in,HT-out,HB-out L-out,R-out,HT-in,HB-in L-out,R-out,HT-in,HB-out L-out,R-out,HT-out,HB-in L-out,R-out,HT-out,HB-out T-in,B-in,VL-in,VR-in T-in,B-in,VL-in,VR-out T-in,B-in,VL,VR-in T-in,B-in,VL-out,VR-out T-in,B-out,VL-in,VR-in T-in,B-out,VL-in,VR-out T-in,B-out,VL-out,VR-in T-in,B-out,VL-out,VR-out T-out,B-in,VL-in,VR-in T-out,B-in,VL-in,VR-out T-out,B-in,VL-out,VR-in T-out,B-in,VL-out,VR-out T-out,B-out,VL-in,VR-in T-out,B-out,VL-in,VR-out T-out,B-out,VL-out,VR-in 
T-out,B-out,VL-out,VR-out Wrap (repeat) Mirror Clamp to last texel (0.0 to 1.0) MirrorOnce to last texel (-1.0 to 1.0) Clamp half way to border color (0.0 to 1.0) MirrorOnce half way to border color (-1.0 to 1.0) Clamp to border color (0.0 to 1.0) MirrorOnce to border color (-1.0 to 1.0) Point Linear None Point Linear Disable ChromaKey (kill pixel if any sample matches chroma key) ChromaKeyBlend (set sample to 0 if it matches chroma key) Normal rounding on all components (+0.5) MPEG4 rounding on all components (+0.25) Don't truncate coordinate fractions. Truncate coordinate fractions to 0.0 and 0.5 for MPEG Use TXWIDTH for image addressing Use TXPITCH for image addressing Disable YUV to RGB conversion Enable YUV to RGB conversion (with clamp) Enable YUV to RGB conversion (without clamp) WHOLE HALF_REGION_0 HALF_REGION_1 FOURTH_REGION_0 FOURTH_REGION_1 FOURTH_REGION_2 FOURTH_REGION_3 EIGHTH_REGION_0 EIGHTH_REGION_1 EIGHTH_REGION_2 EIGHTH_REGION_3 EIGHTH_REGION_4 EIGHTH_REGION_5 EIGHTH_REGION_6 EIGHTH_REGION_7 SIXTEENTH_REGION_0 SIXTEENTH_REGION_1 SIXTEENTH_REGION_2 SIXTEENTH_REGION_3 SIXTEENTH_REGION_4 SIXTEENTH_REGION_5 SIXTEENTH_REGION_6 SIXTEENTH_REGION_7 SIXTEENTH_REGION_8 SIXTEENTH_REGION_9 SIXTEENTH_REGION_A SIXTEENTH_REGION_B SIXTEENTH_REGION_C SIXTEENTH_REGION_D SIXTEENTH_REGION_E SIXTEENTH_REGION_F 2KB page is linear 2KB page is tiled src0.r src0.g src0.b src1.r src1.g src1.b src2.r src2.g src2.b src0.a src1.a src2.a srcp.r srcp.g srcp.b srcp.a 0.0 1.0 0.5 Do not modify input Negate input Take absolute value of input Take negative absolute value of input 1.0-2.0*A0 A1-A0 A1+A0 1.0-A0 Result Result * 2 Result * 4 Result * 8 Result / 2 Result / 4 Result / 8 Do not clamp output. Clamp output to the range [0,1]. NONE: Do not write any output. R: Write the red channel only. G: Write the green channel only. RG: Write the red and green channels. B: Write the blue channel only. RB: Write the red and blue channels. GB: Write the green and blue channels. 
RGB: Write the red, green, and blue channels. src0.rgb src0.rrr src0.ggg src0.bbb src1.rgb src1.rrr src1.ggg src1.bbb src2.rgb src2.rrr src2.ggg src2.bbb src0.aaa src1.aaa src2.aaa srcp.rgb srcp.rrr srcp.ggg srcp.bbb srcp.aaa 0.0 1.0 0.5 src0.gbr src1.gbr src2.gbr src0.brg src1.brg src2.brg src0.abg src1.abg src2.abg 1.0-2.0*RGB0 RGB1-RGB0 RGB1+RGB0 1.0-RGB0 C4_8 (S/U) C4_10 (U) C4_10_GAMMA - (U) C_16 - (S/U) C2_16 - (S/U) C4_16 - (S/U) C_16_MPEG - (S) C2_16_MPEG - (S) C2_4 - (U) C_3_3_2 - (U) C_6_5_6 - (S/U) C_11_11_10 - (S/U) C_10_11_11 - (S/U) C_2_10_10_10 - (S/U) UNUSED - Render target is not used C_16_FP - (S10E5) C2_16_FP - (S10E5) C4_16_FP - (S10E5) C_32_FP - (S23E8) C2_32_FP - (S23E8) C4_32_FP - (S23E8) Alpha Red Green Blue Red Green Blue Alpha WSRC_US - W comes from shader instruction WSRC_RAS - W comes from rasterizer -W < X < W, -W < Y < W, -W < Z < W (OpenGL Definition) -W < X < W, -W < Y < W, 0 < Z < W (DirectX Definition) None (will not trigger Setup Engine to run) Point List Line List Line Strip Triangle List Triangle Fan Triangle Strip Triangle with wFlags (aka, Rage128 'Type-2' triangles) * 8- Unused Line Loop Quad List Quad Strip Polygon *Encoding 7 indicates whether a 16-bit word of wFlags is present in the stream of indices arriving when the VTX_AMODE is programmed as a '0'. The Setup Engine just steps over the wFlags word; ignoring it. 0 = Stream contains just indices, as: [ Index1, Index0] [ Index3, Index2] [ Index5, Index4 ] etc... 1 = Stream contains indices and wFlags: [ Index1, Index0] [ wFlags,Index 2 ] [ Index4, Index3] [ wFlags, Index5 ] etc... State-Based Vertex Data. (Vertex data and tokens embedded in command stream.) Indexes (Indices embedded in command stream; vertex data to be fetched from memory.) Vertex List (Vertex data to be fetched from memory.) Vertex Data (Vertex data embedded in command stream.) Select this color Select User Color 0 Select User Color 1 User Color 0 State is NOT updated when User Color 0 is written. 
User Color 1 State IS updated when User Color 0 is written. Update Hierarchical Z with Max value Update Hierarchical Z with Min value Z unit cache controller does RMW Z unit cache controller does cache-line granular Write only 16-bit Integer Z 16-bit compressed 13E3 24-bit Integer Z, 8 bit Stencil (LSBs) Keep: New value = Old value Zero: New value = 0 Replace: New value = STENCILREF Increment: New value++ (clamp) Decrement: New value-- (clamp) Invert new value: New value = !Old value Increment: New value++ (wrap) Decrement: New value-- (wrap) Physical (Default) Virtual Full size 1/2 size 1/4 size 1/8 size No override Stuff texture 0 Stuff texture 1 Stuff texture 2 Stuff texture 3 Stuff texture 4 Stuff texture 5 Stuff texture 6 Stuff texture 7 Stuff texture 8/C2 Stuff texture 9/C3 Replicate VAP source texture coordinates (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Not active 1 component (VAP/GA), 2 component (GA/SU) 2 component (VAP/GA), 2 component (GA/SU) 3 component (VAP/GA), 3 component (GA/SU) 4 component (VAP/GA), 4 component (GA/SU) Filter4 Point Linear Component filter should interpret texel data as unsigned Component filter should interpret texel data as signed src0 src1 src2 srcp Red Green Blue Alpha Zero Half One Unused Result Result * 2 Result * 4 Result * 8 Result / 2 Result / 4 Result / 8 Disable output modifier and clamping (result is copied exactly; only valid for MIN/MAX/CMP/CND) Predicate == (ALU) Predicate < (ALU) Predicate >= (ALU) Predicate != (ALU) Resolve buffer destination address. The cache must be empty before changing this register if the cb is in resolve mode. Unpipelined 256-bit aligned 3D resolve destination offset. Resolve Buffer Pitch and Tiling Control. The cache must be empty before changing this register if the cb is in resolve mode. Unpipelined 3D destination pitch in multiples of 2-pixels. Alpha Blend Control for Alpha Channel. Pipelined through the blender. 
Combine Function. Allows modification of how the SRCBLEND and DESTBLEND are combined.

Source Blend Function. Alpha blending function (SRC).

Destination Blend Function. Alpha blending function (DST).

Color Compare Color. Stalls the 2d/3d datapath until it is idle. Like RB2D_CLRCMP_CLR, but a separate register is provided to keep 2D and 3D state separate. Color Compare Flip. Stalls the 2d/3d datapath until it is idle. Like RB2D_CLRCMP_FLIPE, but a separate register is provided to keep 2D and 3D state separate. Color Compare Mask. Stalls the 2d/3d datapath until it is idle. Like RB2D_CLRCMP_CLR, but separate registers provided to keep 2D and 3D state separate. Color Buffer Address Offset of multibuffer 0. Unpipelined. 256-bit aligned 3D destination offset address. The cache must be empty before this is changed. Dithering control register. Pipelined through the blender. Dither mode

Destination Color Buffer Cache Control/Status. If the cb is in e2 mode, then a flush or free will not occur upon a write to this register, but a sync will be immediately sent if one is requested. If both DC_FLUSH and DC_FREE are zero but DC_FINISH is one, then a sync will be sent immediately -- the cb will not wait for all the previous operations to complete before sending the sync. Unpipelined except when DC_FINISH and DC_FREE are both set to zero. Setting this bit flushes dirty data from the 3D Dst Cache. Unless the DC_FREE bits are also set, the tags in the cache remain valid. A purge is achieved by setting both DC_FLUSH and DC_FREE. No effect No effect Flushes dirty 3D data Flushes dirty 3D data Setting this bit invalidates the 3D Dst Cache tags. Unless the DC_FLUSH bit is also set, the cache lines are not written to memory. A purge is achieved by setting both DC_FLUSH and DC_FREE. No effect No effect Free 3D tags Free 3D tags do not send a finish signal to the CP send a finish signal to the CP after the end of operation 3D ROP Control. Stalls the 2d/3d datapath until it is idle.

ROP2 code for 3D fragments. This value is replicated into 2 nibbles to form the equivalent ROP3 code to control the ROP3 logic. These are the GDI ROP2 codes. Where does depth come from? Depth comes from scan converter as plane equation. Depth comes from shader as four discrete values. Fog Blending Enable Enable for fog blending
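The nibble replication described for the ROP2 field can be sketched as follows. This is a minimal illustration, assuming the field holds a 4-bit truth-table value; the function name `rop2_to_rop3` is hypothetical and not part of the register specification.

```python
def rop2_to_rop3(rop2: int) -> int:
    """Replicate a 4-bit ROP2 code into both nibbles of an 8-bit ROP3 code.

    Assumption: rop2 is a raw 4-bit truth-table value (0x0..0xF).
    """
    if not 0 <= rop2 <= 0xF:
        raise ValueError("ROP2 code must fit in 4 bits")
    # The same nibble is placed in the high and the low half of the byte.
    return (rop2 << 4) | rop2


if __name__ == "__main__":
    for code in (0x0, 0x6, 0xC, 0xF):
        print(f"ROP2 0x{code:X} -> ROP3 0x{rop2_to_rop3(code):02X}")
```

Because both nibbles are identical, the resulting ROP3 code does not depend on the third (pattern) operand, which is consistent with a two-operand raster operation.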

Fog generation function Fog function is linear Fog function is exponential Fog function is exponential squared Fog is derived from constant fog factor Specifies per RGB or Alpha shading method. Specifies solid, flat or Gouraud shading.

Specifies solid, flat or Gouraud shading.

Specifies, for flat shaded polygons, which vertex holds the polygon color. Provoking is first vertex Provoking is second vertex Provoking is third vertex Provoking is always last vertex Specifies the offset to apply to fog. 32b SPFP scale value. Specifies the scale to apply to fog. 32b SPFP scale value. S Texture Coordinate Value for Vertex 0 of Line (stuff textures -- i.e. AA). S texture coordinate value generated for vertex 0 of an antialiased line; 32-bit IEEE float format. Typical 0.0. S Texture Coordinate Value for Vertex 1 of Lines (V2 of parallelogram -- stuff textures -- i.e. AA). S texture coordinate value generated for vertex 1 of an antialiased line; 32-bit IEEE float format. Typical 1.0. Line Stipple configuration information. Specify type of reset to use for stipple accumulation. No resetting Reset per line Reset per packet Specifies, in truncated (30b) floating point, scale to apply to generated texture coordinates. Specifies maximum and minimum point & sprite sizes for per vertex size specification. Minimum point & sprite radius (in subsamples) size to allow. Maximum point & sprite radius (in subsamples) size to allow. S Texture Coordinate of Vertex 0 for Point texture stuffing (LLC). S texture coordinate of vertex 0 for point; 32-bit IEEE float format. S Texture Coordinate of Vertex 2 for Point texture stuffing (URC). S texture coordinate of vertex 2 for point; 32-bit IEEE float format. T Texture Coordinate of Vertex 0 for Point texture stuffing (LLC). T texture coordinate of vertex 0 for point; 32-bit IEEE float format. T Texture Coordinate of Vertex 2 for Point texture stuffing (URC). T texture coordinate of vertex 2 for point; 32-bit IEEE float format. Polygon Mode Polygon mode enable. Disable poly mode (render triangles). Dual mode (send 2 sets of 3 polys with specified poly type). Specifies how to render front-facing polygons.

Specifies how to render back-facing polygons.

Specifies amount to shift integer position of vertex (screen space) before converting to float for triangle stipple. Amount to shift x position before conversion to SPFP. Amount to shift y position before conversion to SPFP. Specifies the graphics pipeline configuration for antialiasing. Enables antialiasing.

Specifies the number of subsamples to use while antialiasing. 2 subsamples 3 subsamples 4 subsamples 6 subsamples OpenGL Clip rectangles Left hand edge of clip rectangle Upper edge of clip rectangle OpenGL Clip rectangles Right hand edge of clip rectangle Lower edge of clip rectangle OpenGL Clip boolean function OpenGL Clip boolean function. The 'inside' flags for each of the four clip rectangles form a 4-bit binary number. The corresponding bit in this 16-bit number specifies whether the pixel is visible. Hierarchical Z Enable Enable for hierarchical Z.
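The OpenGL clip boolean function described above amounts to a 16-entry lookup table indexed by the four 'inside' flags. The sketch below illustrates that lookup; the mapping of rectangle i to bit i of the index and the names `pixel_visible` and `clip_rule` are assumptions made for illustration, not taken from the register description.

```python
from typing import Sequence


def pixel_visible(inside_flags: Sequence[bool], clip_rule: int) -> bool:
    """Look up visibility in a 16-bit rule using four 'inside' flags.

    Assumption: flag i (for clip rectangle i) contributes bit i of the
    4-bit index into the 16-bit rule.
    """
    if len(inside_flags) != 4:
        raise ValueError("exactly four clip rectangles are expected")
    index = 0
    for bit, inside in enumerate(inside_flags):
        if inside:
            index |= 1 << bit
    # The bit at position 'index' says whether this combination is visible.
    return bool((clip_rule >> index) & 1)


if __name__ == "__main__":
    inside_any = 0xFFFE   # visible when inside at least one rectangle
    inside_all = 0x8000   # visible only when inside all four rectangles
    print(pixel_visible([True, False, False, False], inside_any))
    print(pixel_visible([True, True, True, False], inside_all))
```

The three rule values used here (0xFFFE, 0x8000, and 0x0001 for "outside every rectangle") behave the same under any bit ordering, since they depend only on whether none, some, or all flags are set.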

Specifies whether to compute min or max z value HZ block computes minimum z value HZ block computes maximum z value Specifies adjustment to get added or subtracted from computed z value Add or Subtract 1/256 << ze Add or Subtract 1/128 << ze Add or Subtract 1/64 << ze Add or Subtract 1/32 << ze Add or Subtract 1/16 << ze Add or Subtract 1/8 << ze Add or Subtract 1/4 << ze Add or Subtract 1/2 << ze Specifies whether vertex 0 z contains minimum z value Vertex 0 does not contain minimum z value Vertex 0 does contain minimum z value Specifies whether vertex 0 z contains maximum z value Vertex 0 does not contain maximum z value Vertex 0 does contain maximum z value Scissor rectangle specification Left hand edge of scissor rectangle Upper edge of scissor rectangle Scissor rectangle specification Right hand edge of scissor rectangle Lower edge of scissor rectangle Screen door sample mask Screen door sample mask - 1 means sample may be covered, 0 means sample is not covered Culling Enables Enable for front-face culling. Do not cull front-facing triangles. Cull front-facing triangles. Enable for back-face culling. Do not cull back-facing triangles. Cull back-facing triangles. X-Ored with cross product sign to determine positive facing Positive cross product is front (CCW). Negative cross product is front (CW). SU Depth Offset value. SPFP Floating point applied to depth before conversion to FXP. SU Depth Scale value. SPFP Floating point applied to depth before conversion to FXP. Back-Facing Polygon Offset Offset. Specifies polygon offset offset for back-facing polygons; 32b IEEE float format; applied after Z scale & offset (0 to 2^24-1 range) Back-Facing Polygon Offset Scale. Specifies polygon offset scale for back-facing polygons; 32-bit IEEE float format; applied after Z scale & offset (0 to 2^24-1 range); slope computed in subpixels (1/12 or 1/16) Enables for polygon offset Enables front facing polygon's offset.

Enables back facing polygon's offset.

Forces all parallelograms to have FRONT_FACING for poly offset -- Need to have FRONT_ENABLE also set to have Z offset for parallelograms.

Front-Facing Polygon Offset Offset. Specifies polygon offset offset for front-facing polygons; 32b IEEE float format; applied after Z scale & offset (0 to 2^24-1 range) Front-Facing Polygon Offset Scale. Specifies polygon offset scale for front-facing polygons; 32b IEEE float format; applied after Z scale & offset (0 to 2^24-1 range); slope computed in subpixels (1/12 or 1/16) Horizontal Guard Band Clip Adjust Register. 32-bit floating point value. Should be set to 1.0 for no guard band. Horizontal Guard Band Discard Adjust Register. 32-bit floating point value. Should be set to 1.0 for no guard band. Vertical Guard Band Clip Adjust Register. 32-bit floating point value. Should be set to 1.0 for no guard band. Vertical Guard Band Discard Adjust Register. 32-bit floating point value. Should be set to 1.0 for no guard band. VAP Out/GA Vertex Format Register 0 Output the Position Vector Output Color 0 Vector Output Color 1 Vector Output Color 2 Vector Output Color 3 Vector Output Point Size Vector VAP Out/GA Vertex Format Register 1 Number of words in texture 0 = Not Present 1 = 1 component 2 = 2 components 3 = 3 components 4 = 4 components Number of words in texture 0 = Not Present 1 = 1 component 2 = 2 components 3 = 3 components 4 = 4 components Number of words in texture 0 = Not Present 1 = 1 component 2 = 2 components 3 = 3 components 4 = 4 components Number of words in texture 0 = Not Present 1 = 1 component 2 = 2 components 3 = 3 components 4 = 4 components Number of words in texture 0 = Not Present 1 = 1 component 2 = 2 components 3 = 3 components 4 = 4 components Number of words in texture 0 = Not Present 1 = 1 component 2 = 2 components 3 = 3 components 4 = 4 components Number of words in texture 0 = Not Present 1 = 1 component 2 = 2 components 3 = 3 components 4 = 4 components Number of words in texture 0 = Not Present 1 = 1 component 2 = 2 components 3 = 3 components 4 = 4 components Setup Engine Data Port 0 through 15. 
1st of 16 consecutive dwords for writing vertex data 128-bit Data Port for Indexed Primitives. 128-bit Data Port for Indexed Primitives. Write-only. Setup Engine Index Port 0 through 15. 1st of 16 consecutive dwords for writing vertex index Programmable Stream Control Extension Word 0 X-Component Swizzle Select 0 = SELECT_X 1 = SELECT_Y 2 = SELECT_Z 3 = SELECT_W 4 = SELECT_FP_ZERO (Floating Point 0.0) 5 = SELECT_FP_ONE (Floating Point 1.0) 6,7 RESERVED Y-Component Swizzle Select (See Above) Z-Component Swizzle Select (See Above) W-Component Swizzle Select (See Above) 4-bit write enable. Bit 0 maps to X Bit 1 maps to Y Bit 2 maps to Z Bit 3 maps to W See SWIZZLE_SELECT_X_0 See SWIZZLE_SELECT_Y_0 See SWIZZLE_SELECT_Z_0 See SWIZZLE_SELECT_W_0 See WRITE_ENA_0 Programmable Stream Control Signed Normalize Control There are 3 methods of normalizing signed numbers: SGN_NORM_ZERO: value / (2^(n-1)-1), so -128/127 will be less than -1.0, -127/127 will yield -1.0, 0/127 will yield 0, and 127/127 will yield 1.0 for 8-bit numbers. SGN_NORM_ZERO_CLAMP_MINUS_ONE: Same as SGN_NORM_ZERO except -128/127 will yield -1.0 for 8-bit numbers. SGN_NORM_NO_ZERO: (2 * value + 1)/(2^n - 1), so -128 will yield -255/255 = -1.0, 127 will yield 255/255 = 1.0, but 0 will yield 1/255 != 0. See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 See SGN_NORM_METHOD_0 Programmable Vertex Shader Code Control Register 0 First Instruction to Execute in PVS. The PVS Instruction which updates the clip coordinate position for the last time. This value is used to lower the processing priority while trivial clip and back-face culling decisions are made. This field must be set to valid instruction. Last Instruction (Inclusive) for the PVS to execute. 
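The three signed normalization methods described earlier in this section can be compared numerically with a short sketch. The function names are hypothetical, and the divisor 2^n - 1 used for SGN_NORM_NO_ZERO is inferred from the worked 8-bit values (-255/255 and 255/255), so treat it as an assumption rather than a statement of the hardware behavior.

```python
def sgn_norm_zero(value: int, bits: int) -> float:
    """value / (2^(n-1) - 1); the most negative input falls below -1.0."""
    return value / (2 ** (bits - 1) - 1)


def sgn_norm_zero_clamp_minus_one(value: int, bits: int) -> float:
    """Same as sgn_norm_zero, but the result is clamped at -1.0."""
    return max(-1.0, sgn_norm_zero(value, bits))


def sgn_norm_no_zero(value: int, bits: int) -> float:
    """(2 * value + 1) / (2^n - 1); symmetric, but 0 does not map to 0.0.

    Assumption: the divisor is 2^n - 1, matching the 8-bit examples.
    """
    return (2 * value + 1) / (2 ** bits - 1)


if __name__ == "__main__":
    for v in (-128, -127, 0, 127):
        print(v,
              sgn_norm_zero(v, 8),
              sgn_norm_zero_clamp_minus_one(v, 8),
              sgn_norm_no_zero(v, 8))
```

The trade-off is visible in the printed rows: the first two methods represent zero exactly but treat the most negative input specially, while the third covers [-1.0, 1.0] symmetrically at the cost of having no exact zero.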
Programmable Vertex Shader Code Control Register 1 The PVS Instruction which uses the Input Vertex Memory for the last time. This value is used to free up the Input Vertex Slots ASAP. This field must be set to a valid instruction. Programmable Vertex Shader Constant Control Register Vector Offset into PVS constant memory to the start of the constants for the current shader The maximum constant address which should be generated by the shader (Inst Const Addr + Addr Register). If the address which is generated by the shader is outside the range of 0 to PVS_MAX_CONST_ADDR, then (0,0,0,0) is returned as the source operand data. Programmable Vertex Shader Flow Control Addresses Register 0 This field defines the last PVS instruction to execute prior to the control flow redirection. JUMP - The last instruction executed prior to the jump LOOP - The last instruction executed prior to the loop (init loop counter/inc) JSR - The last instruction executed prior to the jump to the subroutine. This field has multiple definitions as follows: JUMP - The instruction address to jump to. LOOP - The loop count. *Note loop count of 0 must be replaced by a jump. JSR - The instruction address to jump to (first inst of subroutine). This field has multiple definitions as follows: JUMP - Not Applicable LOOP - The last instruction of the loop. JSR - The last instruction of the subroutine. This field has multiple definitions as follows: JUMP - Not Applicable LOOP - First Instruction of Loop (Typically ACT_ADRS + 1) JSR - First Instruction After JSR (Typically ACT_ADRS + 1) Programmable Vertex Shader Flow Control Opcode Register This opcode field determines what type of control flow instruction to execute. 0 = NO_OP 1 = JUMP 2 = LOOP 3 = JSR (Jump to Subroutine) See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. See PVS_FC_OPC_0. 
See PVS_FC_OPC_0. See PVS_FC_OPC_0. This register is used to force a flush of the PVS block when single-buffered updates are performed. The multi- state control of PVS Code and Const memories by the driver is primarily for more flexible PVS state control and for performance testing. When this register address is written, the State Block will force a flush of PVS processing so that both versions of PVS state are available before updates are processed. This register is write only, and the data that is written is unused. 32-bit data to write to Vector Memory. Used for PVS code and Constant updates. 128-bit data path to write to Vector Memory. Used for PVS code and Constant updates. Octword offset to begin writing. This register is used to define the number of core clocks to wait for a vertex to be received by the VAP input controller (while the primitive path is backed up) before forcing any accumulated vertices to be submitted to the vertex processing path. Maximum Vertex Indx Clamp If index to be fetched is larger than this value, the fetch indx is set to MAX_INDX Minimum Vertex Indx Clamp If index to be fetched is smaller than this value, the fetch indx is set to MIN_INDX Viewport Transform X Offset. Viewport Offset for X coordinates. An IEEE float. Viewport Transform X Scale Factor. Viewport Scale Factor for X coordinates. An IEEE float. Viewport Transform Y Offset. Viewport Offset for Y coordinates. An IEEE float. Viewport Transform Y Scale Factor. Viewport Scale Factor for Y coordinates. An IEEE float. Viewport Transform Z Offset. Viewport Offset for Z coordinates. An IEEE float. Viewport Transform Z Scale Factor. Viewport Scale Factor for Z coordinates. An IEEE float. 
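The six viewport scale and offset values are conventionally applied per component as scale times input plus offset. The sketch below assumes that convention (the register text does not spell out the formula) and uses a hypothetical helper name, `viewport_transform`.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def viewport_transform(ndc: Vec3, scale: Vec3, offset: Vec3) -> Vec3:
    """Apply out = in * scale + offset independently to X, Y and Z.

    Assumption: this is the usual viewport mapping; the inputs are taken
    to be already divided by W.
    """
    x, y, z = (c * s + o for c, s, o in zip(ndc, scale, offset))
    return (x, y, z)


if __name__ == "__main__":
    # Example values for a 640x480 target with Y flipped and Z mapped
    # from [-1, 1] to [0, 1]; these numbers are illustrative only.
    scale = (320.0, -240.0, 0.5)
    offset = (320.0, 240.0, 0.5)
    print(viewport_transform((-1.0, 1.0, 0.5), scale, offset))
    print(viewport_transform((1.0, -1.0, -1.0), scale, offset))
```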
Viewport Transform Engine Control Viewport Transform Scale Enable for X component Viewport Transform Offset Enable for X component Viewport Transform Scale Enable for Y component Viewport Transform Offset Enable for Y component Viewport Transform Scale Enable for Z component Viewport Transform Offset Enable for Z component Indicates that the incoming X, Y have already been multiplied by 1/W0. If OFF, the Setup Engine will multiply the X, Y coordinates by 1/W0. Indicates that the incoming Z has already been multiplied by 1/W0. If OFF, the Setup Engine will multiply the Z coordinate by 1/W0. Indicates that the incoming W0 is not 1/W0. If ON, the Setup Engine will perform the reciprocal to get 1/W0. If set, x,y,z viewport transform are performed serially through a single pipeline instead of in parallel. Used to mimic RL300 design. Array-of-Structures Attributes 0 & 1 Number of dwords in this structure. Number of dwords from one array element to the next. Number of dwords in this structure. Number of dwords from one array element to the next. Array-of-Structures Address Base Address of the Array of Structures. Vertex Size Specification Register This field specifies the number of DWORDS per vertex to expect when VAP_VF_CNTL.PRIM_WALK is set to Vertex Data (vertex data embedded in command stream). This field is not used for any other PRIM_WALK settings. This field replaces the usage of the VAP_VTX_FMT_0/1 for this purpose in prior implementations. Z Buffer Clear Value. When a block has a Z Mask value of 0, all Z values in that block are cleared to this value. In 24bpp, the stencil value is also updated regardless of whether it is enabled or not. Z Buffer Address Offset 2K aligned Z buffer address offset for macro tiles. Z Buffer Pitch and Endian Control Z buffer pitch in multiples of 4 pixels. Specifies whether Z buffer is macro-tiled. Macro-tiles are 2K aligned.

Specifies whether Z buffer is micro-tiled. Micro-tiles are 32 bytes.

Specifies endian control for the Z buffer.

Depth buffer X and Y coordinate offset X coordinate offset. multiple of 32 . Bits 4:0 have to be zero Y coordinate offset. multiple of 32 . Bits 4:0 have to be zero Hierarchical Z Pitch Pitch used in HiZ address computation. Stencil Reference Value and Mask Specifies the reference stencil value. This value is ANDed with both the reference and the current stencil value prior to the stencil test. Specifies the write mask for the stencil planes. Z Buffer Cache Control/Status Setting this bit flushes the dirty data from the Z cache. Unless ZC_FREE bit is also set, the tags in the cache remain valid. A purge is achieved by setting both ZC_FLUSH and ZC_FREE. This is a sticky bit and it clears itself at the end of the operation. No effect Flush and Free Z cache lines Setting this bit invalidates the Z cache tags. Unless ZC_FLUSH bit is also set, the cachelines are not written to memory. A purge is achieved by setting both ZC_FLUSH and ZC_FREE. This is a sticky bit that clears itself at the end of the operation. No effect Free Z cache lines (invalidate) This bit is unused ... Idle Busy Z Buffer Z Pass Counter Address Writing this location with a DWORD address causes the value in ZB_ZPASS_DATA to be written to main memory at the location pointed to by this address. NOTE: R300 has 2 pixel pipes. Broadcasting this address causes both pipes to write their ZPASS value to the same address. There is no guarantee which pipe will write last. So when writing to this register, the GA needs to be programmed to send the write command to pipe 0. Then a different address needs to be written to pipe 1. Then both pipes should be enabled for further register writes. Z is at the bottom of the pipe, after the fog unit. Z is at the top of the pipe, after the scan unit. Resolve Buffer Control. Unpipelined Specifies if the color buffer is in resolve mode. The cache must be empty before changing this register.

Specifies the gamma and degamma to be applied to the samples before and after filtering, respectively.

Alpha Blend Control for Color Channels. Pipelined through the blender. Allow alpha blending with the destination.

Enables use of RB3D_ABLENDCNTL

When blending is enabled, this enables memory reads. Memory reads will still occur when this is disabled if they are for reasons not related to blending.

Discard pixels when blending is enabled based on the src color. Disable Discard pixels if src alpha == 0 Discard pixels if src color == 0 Discard pixels if (src alpha == 0) && (src color == 0) Discard pixels if src alpha == 1 Discard pixels if src color == 1 Discard pixels if (src alpha == 1) && (src color == 1) Combine Function. Allows modification of how the SRCBLEND and DESTBLEND are combined.

Source Blend Function. Alpha blending function (SRC).

Destination Blend Function. Alpha blending function (DST).
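The combine function selects how the factored source and destination terms are merged (add, subtract in either direction, minimum, maximum, each with or without clamping where applicable). The following sketch models that selection for a single channel. The mode names, the `combine` helper, and the [0, 1] clamp range are assumptions chosen for illustration; only the list of operations comes from this reference.

```python
def clamp01(x: float) -> float:
    """Clamp to [0, 1] (assumed clamp range for a normalized channel)."""
    return min(1.0, max(0.0, x))


def combine(src: float, dst: float, src_factor: float, dst_factor: float,
            mode: str) -> float:
    """Combine one channel of source and destination.

    Mode names are hypothetical labels for the documented operations.
    """
    if mode in ("MIN", "MAX"):
        # For min/max the blend functions are forced to ONE, so the
        # factors passed in are ignored.
        return min(src, dst) if mode == "MIN" else max(src, dst)
    s, d = src * src_factor, dst * dst_factor
    results = {
        "ADD_CLAMP": clamp01(s + d),
        "ADD_NOCLAMP": s + d,
        "SUB_CLAMP": clamp01(s - d),     # subtract Dst from Src
        "SUB_NOCLAMP": s - d,
        "RSUB_CLAMP": clamp01(d - s),    # subtract Src from Dst
        "RSUB_NOCLAMP": d - s,
    }
    if mode not in results:
        raise ValueError(f"unknown combine mode: {mode}")
    return results[mode]


if __name__ == "__main__":
    for m in ("ADD_CLAMP", "ADD_NOCLAMP", "SUB_CLAMP", "RSUB_CLAMP", "MIN"):
        print(m, combine(0.75, 0.5, 1.0, 1.0, m))
```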

Unpipelined. A quad is replicated and written to this many + 1 buffers. 0 (1 buffer) is the only mode where the cb processes the end of packet command. Enables equivalent of rage128 CMP_EQ_FLIP color compare mode. This is used to ensure 3D data does not get chromakeyed away by logic in the backend.

Enables AA color compression. The cache must be empty before this is changed.

Set to 0 Color buffer format and tiling control for all the multibuffers and the pitch of multibuffer 0. Unpipelined. The cache must be empty before any of the registers are changed. 3D destination pitch in multiples of 2-pixels. Denotes whether the 3D destination is in macrotiled format.

Denotes whether the 3D destination is in microtiled format.

Specifies endian control for the color buffer.

3D destination color format. ARGB1555 RGB565 ARGB8888 ARGB32323232 I8 ARGB16161616 YUV422 packed (VYUY) YUV422 packed (YVYU) UV88 ARGB4444 3D Color Channel Mask. If all the channels used in the current color format are disabled, then the cb will discard all the incoming quads. Pipelined through the blender. mask bit for blue channel

mask bit for green channel

mask bit for red channel

mask bit for alpha channel

Clear color that is used when the color mask is set to 00. Unpipelined. Constant color used by the blender. Pipelined through the blender. blue constant color green constant color red constant color alpha constant color Alpha Function Specifies the alpha compare value. Specifies the alpha compare function.

Enables/Disables alpha compare function.

Enables/Disables alpha-to-mask function.

Specifies the number of sub-pixel samples for the alpha-to-mask function.

Enables/Disables RGB Dithering.

Blue Component of Fog Color Blue component of fog color; (0.9) fixed format. Green Component of Fog Color Green component of fog color; (0.9) fixed format. Red Component of Fog Color Red component of fog color; (0.9) fixed format. Constant Factor for Fog Blending Constant fog factor; fixed (0.9) format. GA Enhancement Register TCL/GA Deadlock control. Prevents TCL interface from deadlocking on GA side.

Enables Fast register/primitive switching

Line control 1/2 width of line, in subpixels; (16.0) fixed format. Specifies how ends of lines should be drawn.

Current value of stipple accumulator. 24b Integer, measuring stipple accumulation in subpixels. (note: field is 32b, but only lower 24b used) Specifies x & y offsets for vertex data after conversion to FP. Specifies X offset in S15 format (subpixels). Specifies Y offset in S15 format (subpixels). Dimensions for Points 1/2 Height of point; fixed (16.0), subpixel format. 1/2 Width of point; fixed (16.0), subpixel format. Specifies the rounding mode for geometry & color SPFP to FP conversions. Trunc (0) or round to nearest (1) for geometry (XY).

Trunc (0) or round to nearest (1) for colors (RGBA).

Specifies SPFP color clamp range of [0,1] or [-8,8] for RGB. Clamp to [0,1.0] for RGB Clamp to [-7.9999, 7.9999] for RGB Specifies SPFP alpha clamp range of [0,1] or [-8,8]. Clamp to [0,1.0] for Alpha Clamp to [-7.9999, 7.9999] for Alpha Specifies number of cycles to assert reset, and also causes RB3D soft reset to assert. Count in cycles (def 256). Specifies blue & alpha components of fill color. Component alpha value. (S3.12) Component blue value. (S3.12) Specifies red & green components of fill color. Component green value (S3.12). Component red value (S3.12). Specifies top of Raster pipe specific enable controls. Specifies if points will have stuffed texture coordinates.

Specifies if lines will have stuffed texture coordinates.

Specifies if triangles will have stuffed texture coordinates.

Specifies if the auto dec/inc stencil mode should be enabled, and how.

Specifies the source of the texture coordinates for this texture. Replicate VAP source texture coordinates 0 (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Specifies the source of the texture coordinates for this texture. Replicate VAP source texture coordinates 1 (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Specifies the source of the texture coordinates for this texture. Replicate VAP source texture coordinates 2 (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Specifies the source of the texture coordinates for this texture. Replicate VAP source texture coordinates 3 (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Specifies the source of the texture coordinates for this texture. Replicate VAP source texture coordinates 4 (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Specifies the source of the texture coordinates for this texture. Replicate VAP source texture coordinates 5 (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Specifies the source of the texture coordinates for this texture. Replicate VAP source texture coordinates 6 (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Specifies the source of the texture coordinates for this texture. Replicate VAP source texture coordinates 7 (S,T,[R,Q]). Stuff with source texture coordinates (S,T). Stuff with source texture coordinates (S,T,R). Specifies the sizes of the various FIFO's in the sc/rs/us. This register must be the first one written Size of scan converter input FIFO (XYZ)

Size of scan converter top-of-pipe Z FIFO

Size of scan converter input FIFO (B)

Size of ras input FIFO (Texture)

Size of ras input FIFO (Color)

Size of us RAM

Size of us output FIFO (RGBA)

Size of us output FIFO (W)

High water mark for RS color FIFO (0-7, default 7) High water mark for RS texture FIFO (0-7, default 7) High water mark for US output FIFO (0-12, default 4)

High water mark for US texture output FIFO (0-15, default 11) Specifies the position of multisamples 0 through 2 Specifies the x and y position (in subpixels) of multisample 0 Specifies the x and y position (in subpixels) of multisample 0 Specifies the x and y position (in subpixels) of multisample 1 Specifies the x and y position (in subpixels) of multisample 1 Specifies the x and y position (in subpixels) of multisample 2 Specifies the x and y position (in subpixels) of multisample 2 Specifies the minimum y distance (in subpixels) between the pixel edge and the multisample bounding box. This value is used in the tile scan converter msbd0_x[2:0] specifies the minimum x distance (in subpixels) between the pixel edge and the multisample bounding box. This value is used in the tile scan converter. The special case value of 8 is represented by msbd0_x[2:0]=7. msbd0_x[3] is used to force a bounding box based tile scan conversion instead of an intercept based one. This value should always be set to 0. Specifies the position of multisamples 3 through 5 Specifies the x and y position (in subpixels) of multisample 3 Specifies the x and y position (in subpixels) of multisample 3 Specifies the x and y position (in subpixels) of multisample 4 Specifies the x and y position (in subpixels) of multisample 4 Specifies the x and y position (in subpixels) of multisample 5 Specifies the x and y position (in subpixels) of multisample 5 Specifies the minimum distance (in subpixels) between the pixel edge and the multisample bounding box. This value is used in the quad scan converter Specifies various polygon specific selects (fog, depth, perspective). Specifies source for outgoing (GA to SU) fog value.

Specifies source for outgoing (GA/SU & SU/RAS) depth value.

Specifies source for outgoing (1/W) value, used to disable perspective correct colors/textures.

Specifies the graphics pipeline configuration for rasterization Enables tiling, otherwise all tiles receive all polygons.

Specifies the number of active pipes and contexts. RV350 R300 Specifies width & height (square), in pixels. 8 pixels (not supported by zb/cb) 16 pixels 32 pixels (not supported by zb/cb) Specifies number of tiles and config in super chip configuration.

X Location of chip within super tile. Y Location of chip within super tile. Tile location of chip in a multi super tile config (Super size of 2,8,32 or 128).

Specifies the subpixel precision.

unused unused This register specifies the rasterizer input packet configuration Specifies the total number of texture address components contained in the rasterizer input packet (0:32). Specifies the total number of colors contained in the rasterizer input packet (0:4). Specifies the total number of w values contained in the rasterizer input packet (0 or 1). Specifies the relative rasterizer input packet location of w (if w_count==1) Enable high resolution texture coordinate output when q is equal to 1 This table specifies what happens during each rasterizer instruction Specifies the index (into the RS_IP table) of the texture address output during this rasterizer instruction Write enable for texture address

Specifies the destination address (within the current pixel stack frame) of the texture address output during this rasterizer instruction Specifies the index (into the RS_IP table) of the color output during this rasterizer instruction Write enable for color No write - color not valid write - color valid Specifies the destination address (within the current pixel stack frame) of the color output during this rasterizer instruction Specifies whether to sample texture coordinates at the real or adjusted pixel centers

unused This register specifies the number of rasterizer instructions Number of rasterizer instructions (1:16) Specifies that the rasterizer needs to generate w Defines texture coordinate offset (based on min/max coordinate range of triangle) used to minimize or eliminate periodic errors on texels sampled right on their edges 0.0 range/8K range/16K range/32K range/64K range/128K range/256K range/512K This table specifies the source location and format for up to 8 texture addresses (i[0]:i[7]) and four colors (c[0]:c[3]) Specifies the relative rasterizer input packet location of texture address (i[i]). Specifies the relative rasterizer input packet location of the color (c[i]). Specifies the format of the color (c[i]).

Source select for S, T, R, and Q

Edge rules - what happens when an edge falls exactly on a sample point Edge rules for triangles, points, left-right lines, right-left lines, upper-bottom lines, bottom-upper lines. For values 0 to 15, bit 0 specifies whether a sample on a horizontal-bottom edge is in, bit 1 specifies whether a sample on a horizontal-top edge is in, bit 2 specifies whether a sample on a right edge is in, bit 3 specifies whether a sample on a left edge is in. For values 16 to 31, bit 0 specifies whether a sample on a vertical-right edge is in, bit 1 specifies whether a sample on a vertical-left edge is in, bit 2 specifies whether a sample on a bottom edge is in, bit 3 specifies whether a sample on a top edge is in.

Edge rules for triangles, points, left-right lines, right-left lines, upper-bottom lines, bottom-upper lines. For values 0 to 15, bit 0 specifies whether a sample on a horizontal-bottom edge is in, bit 1 specifies whether a sample on a horizontal-top edge is in, bit 2 specifies whether a sample on a right edge is in, bit 3 specifies whether a sample on a left edge is in. For values 16 to 31, bit 0 specifies whether a sample on a vertical-right edge is in, bit 1 specifies whether a sample on a vertical-left edge is in, bit 2 specifies whether a sample on a bottom edge is in, bit 3 specifies whether a sample on a top edge is in.
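The bit assignments described for the edge rule values can be sketched as a small decoder. This is an illustration only: the function name is hypothetical, and the polarity (a cleared bit means the sample is "in", so value 0 is "everything in") is an assumption based on the ordering of the encoding list, not something the field description states outright.

```python
def decode_edge_rule(value):
    """Map a 5-bit edge rule value to {edge name: True if a sample on it is 'in'}."""
    if not 0 <= value <= 31:
        raise ValueError("edge rule value must be in 0..31")
    if value < 16:
        # Bits 0..3 for values 0 to 15, in the order given by the description.
        names = ("horizontal-bottom", "horizontal-top", "right", "left")
    else:
        # Bits 0..3 for values 16 to 31.
        names = ("vertical-right", "vertical-left", "bottom", "top")
    bits = value & 0xF
    # Assumption: a set bit marks the edge as 'out', a cleared bit as 'in'.
    return {name: not (bits >> i) & 1 for i, name in enumerate(names)}
```

For example, `decode_edge_rule(1)` reports every edge as "in" except the horizontal-bottom edge under the stated polarity assumption.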

SU Raster pipe destination select for registers Select which of the 2 pipes (enable per pipe) to send register read/write to. b0: P0 enable, b3: P1 enable P0 enable, b P1 enable Enables for Cylindrical Wrapping

Border Color for Map 0. Color used for borders. Format is the same as the texture being bordered. Texture Chroma Key for Map 0. Color used for chroma key compare. Format is the same as the texture being keyed. Texture Enables for Maps 0 to 15 Texture Map 0 Enable.

Texture Map 1 Enable.

Texture Map 2 Enable.

Texture Map 3 Enable.

Texture Map 4 Enable.

Texture Map 5 Enable.

Texture Map 6 Enable.

Texture Map 7 Enable.

Texture Map 8 Enable.

Texture Map 9 Enable.

Texture Map 10 Enable.

Texture Map 11 Enable.

Texture Map 12 Enable.

Texture Map 13 Enable.

Texture Map 14 Enable.

Texture Map 15 Enable.

Texture Filter State for Map 0 Clamp mode for first texture coordinate

Clamp mode for second texture coordinate

Clamp mode for third texture coordinate

Filter used when texture is magnified

Filter used when texture is minified

Filter used between mipmap levels

Filter used between layers of a volume. (If no filter is specified, select from MIN/MAG filters.)

LOD index of largest (finest) mipmap to use (0 is largest). Ranges from 0 to NUM_LEVELS. Logical id for this physical texture Texture Filter State for Map 0 Chroma Key Mode

Bilinear rounding mode

(s4.5). Ranges from -16.0 to 15.99. Mipmap LOD bias measured in mipmap levels. Added to the signed, computed LOD before the LOD is clamped. MPEG coordinate truncation mode
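As a rough illustration of the (s4.5) LOD bias format, the sketch below converts between a floating-point bias and a 10-bit field. The function names are hypothetical, and the code assumes the field is two's complement with 5 fractional bits, which is consistent with the stated range of -16.0 to 15.99 but is not confirmed by the description.

```python
def encode_lod_bias(bias):
    """Encode a float LOD bias as a 10-bit s4.5 value (assumed two's complement)."""
    raw = int(round(bias * 32))        # 5 fractional bits: scale by 2**5
    raw = max(-512, min(511, raw))     # clamp to -16.0 .. +15.96875
    return raw & 0x3FF


def decode_lod_bias(field):
    """Decode a 10-bit s4.5 field back to a float number of mipmap levels."""
    raw = field & 0x3FF
    if raw & 0x200:                    # sign bit set: negative value
        raw -= 0x400
    return raw / 32.0
```

A bias of 0.5 mipmap levels would be encoded as 16 under these assumptions, and out-of-range inputs are clamped rather than wrapped.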

Texture Format State for Map 0 Image width - 1. The largest image is 2048 texels. When wrapping or mirroring, must be a power of 2. When mipmapping, must be a power of 2 or padded to a power of 2 in memory. Can always be non-square, except for cube maps which must be square. Image height - 1. The largest image is 2048 texels. When wrapping or mirroring, must be a power of 2. When mipmapping, must be a power of 2 or padded to a power of 2 in memory. Can always be non-square, except for cube maps which must be square. LOG2(depth) of volume texture Number of mipmap levels minus 1. Ranges from 0 to 11. Equivalent to LOD index of smallest (coarsest) mipmap to use. Specifies whether texture coords are projected.
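The "minus one" encodings for image width, image height and mipmap level count can be sketched as follows. The function and key names are hypothetical, and the level count assumes a full mip chain down to 1x1 on dimensions padded to a power of 2; a driver may well choose fewer levels.

```python
def tx_size_fields(width, height, mipmapped=False):
    """Return the minus-one encoded size fields for a texture of the given size."""
    if not (1 <= width <= 2048 and 1 <= height <= 2048):
        raise ValueError("the largest image is 2048 texels")
    levels = 1
    if mipmapped:
        # Assumption: full chain on the padded power-of-2 size.
        # (n - 1).bit_length() is log2 of the next power of 2 >= n.
        levels = (max(width, height) - 1).bit_length() + 1
    return {
        "width_minus_1": width - 1,
        "height_minus_1": height - 1,
        "num_levels_minus_1": levels - 1,
    }
```

With a 2048x2048 mipmapped texture the level field comes out as 11, which matches the top of the documented 0 to 11 range.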

Indicates when TXPITCH should be used instead of TXWIDTH for image addressing

Texture Format State for Map 0 Texture Format. Components are numbered right to left. Parenthesis indicate typical uses of each format. TX_FMT_8 TX_FMT_16 TX_FMT_4_4 TX_FMT_8_8 TX_FMT_16_16 TX_FMT_3_3_2 TX_FMT_5_6_5 TX_FMT_6_5_5 TX_FMT_11_11_10 TX_FMT_10_11_11 TX_FMT_4_4_4_4 TX_FMT_1_5_5_5 TX_FMT_8_8_8_8 TX_FMT_2_10_10_10 TX_FMT_16_16_16_16 TX_FMT_Y8 TX_FMT_AVYU444 TX_FMT_VYUY422 TX_FMT_YVYU422 TX_FMT_16_MPEG TX_FMT_16_16_MPEG TX_FMT_16f TX_FMT_16f_16f TX_FMT_16f_16f_16f_16f TX_FMT_32f TX_FMT_32f_32f TX_FMT_32f_32f_32f_32f TX_FMT_W24_FP Component0 filter should interpret texel data as signed or unsigned. (Ignored for Y/YUV formats.) Component0 filter should interpret texel data as unsigned Component0 filter should interpret texel data as signed Component1 filter should interpret texel data as signed or unsigned. (Ignored for Y/YUV formats.) Component1 filter should interpret texel data as unsigned Component1 filter should interpret texel data as signed Component2 filter should interpret texel data as signed or unsigned. (Ignored for Y/YUV formats.) Component2 filter should interpret texel data as unsigned Component2 filter should interpret texel data as signed Component3 filter should interpret texel data as signed or unsigned. (Ignored for Y/YUV formats.) Component3 filter should interpret texel data as unsigned Component3 filter should interpret texel data as signed Specifies swizzling for alpha channel at the input of the pixel shader. (Ignored for Y/YUV formats.) Select Texture Component0 for the Alpha Channel. Select Texture Component1 for the Alpha Channel. Select Texture Component2 for the Alpha Channel. Select Texture Component3 for the Alpha Channel. Select the value 0 for the Alpha Channel. Select the value 1 for the Alpha Channel. Specifies swizzling for red channel at the input of the pixel shader. (Ignored for Y/YUV formats.) Select Texture Component0 for the Red Channel. Select Texture Component1 for the Red Channel. 
Select Texture Component2 for the Red Channel. Select Texture Component3 for the Red Channel. Select the value 0 for the Red Channel. Select the value 1 for the Red Channel. Specifies swizzling for green channel at the input of the pixel shader. (Ignored for Y/YUV formats.) Select Texture Component0 for the Green Channel. Select Texture Component1 for the Green Channel. Select Texture Component2 for the Green Channel. Select Texture Component3 for the Green Channel. Select the value 0 for the Green Channel. Select the value 1 for the Green Channel. Specifies swizzling for blue channel at the input of the pixel shader. (Ignored for Y/YUV formats.) Select Texture Component0 for the Blue Channel. Select Texture Component1 for the Blue Channel. Select Texture Component2 for the Blue Channel. Select Texture Component3 for the Blue Channel. Select the value 0 for the Blue Channel. Select the value 1 for the Blue Channel. Optionally remove gamma from texture before passing to shader. Only apply to 8bit or less components.

YUV to RGB conversion mode

Specifies coordinate type.

Multi-texture performance can be optimized and made deterministic by assigning textures to separate regions under sw control.

Texture Format State for Map 0 Used instead of TXWIDTH for image addressing when TXPITCH_EN is asserted. Pitch is given as number of texels minus one. Maximum pitch is 16K texels. Invalidate texture cache tags Texture Offset State for Map 0 Endian Control

Macro Tile Control

Micro Tile Control

32-byte aligned pointer to base map This table specifies the Alpha source addresses for up to 64 ALU instructions. The ALU expects 6 source operands - three for color (rgb0, rgb1, rgb2) and three for alpha (a0, a1, a2). Specifies the identity of source operands a0, a1, and a2. Values 0 through 31 specify a location within the current pixel stack frame. Values 32 through 63 specify a constant. Specifies the identity of source operands a0, a1, and a2. Values 0 through 31 specify a location within the current pixel stack frame. Values 32 through 63 specify a constant. Specifies the identity of source operands a0, a1, and a2. Values 0 through 31 specify a location within the current pixel stack frame. Values 32 through 63 specify a constant. Specifies the address of the pixel stack frame register to which the Alpha result of this instruction is to be written. Specifies whether or not to write the Alpha component of the result for this instruction to the pixel stack frame. NONE: Do not write register. A: Write the alpha channel only. Specifies whether or not to write the Alpha component of the result of this instruction to the output fifo. NONE: Do not write output. A: Write the alpha channel only. Specifies which frame buffer target to write to. Specifies whether or not to write the Alpha component of the result of this instruction to the depth output fifo. NONE: Do not write output to w. A: Write the alpha channel only. Specifies which components (R,G,B,A) contribute to the stat count (see performance counter field in US_CONFIG). ALU Alpha Instruction Specifies the operand and component select for inputs A, B, and C.

Specifies the modifier for inputs A, B, and C.

Specifies the operand and component select for inputs A, B, and C.

Specifies the modifier for inputs A, B, and C.

Specifies the operand and component select for inputs A, B, and C.

Specifies the modifier for inputs A, B, and C.

Specifies how the pre-subtract value (SRCP) is computed

Specifies the operand for this instruction.
OP_MAD: Result = A*B + C
OP_DP: Result = dot product from RGB ALU
OP_MIN: Result = min(A,B)
OP_MAX: Result = max(A,B)
OP_CND: Result = cnd(A,B,C) = (C>0.5)?A:B
OP_CMP: Result = cmp(A,B,C) = (C>=0.0)?A:B
OP_FRC: Result = fractional(A)
OP_EX2: Result = 2^^A
OP_LN2: Result = log2(A)
OP_RCP: Result = 1/A
OP_RSQ: Result = 1/sqrt(A)
Specifies the output modifier for this instruction.
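The alpha ALU operations listed above can be modelled with a small reference table. This is a software sketch for reasoning about shader results, not a description of the hardware: the names `ALPHA_OPS` and `alpha_alu` are hypothetical, OP_FRC is assumed to mean A - floor(A), OP_DP is left out because its result comes from the RGB ALU, and hardware behavior for special inputs (such as 1/0 or the log of a negative number) is not modelled.

```python
import math

# Reference semantics for the scalar (alpha) ALU ops, taken from the op list.
ALPHA_OPS = {
    "OP_MAD": lambda a, b, c: a * b + c,
    "OP_MIN": lambda a, b, c: min(a, b),
    "OP_MAX": lambda a, b, c: max(a, b),
    "OP_CND": lambda a, b, c: a if c > 0.5 else b,
    "OP_CMP": lambda a, b, c: a if c >= 0.0 else b,
    "OP_FRC": lambda a, b, c: a - math.floor(a),   # assumed definition of fractional()
    "OP_EX2": lambda a, b, c: 2.0 ** a,
    "OP_LN2": lambda a, b, c: math.log2(a),
    "OP_RCP": lambda a, b, c: 1.0 / a,
    "OP_RSQ": lambda a, b, c: 1.0 / math.sqrt(a),
}


def alpha_alu(op, a, b=0.0, c=0.0):
    """Evaluate one alpha ALU op on already-selected, already-modified inputs."""
    return ALPHA_OPS[op](a, b, c)
```

Note the asymmetry between the two selects: OP_CND compares C against 0.5 with a strict greater-than, while OP_CMP compares against 0.0 and includes equality.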

Specifies clamp mode for this instruction.

This table specifies the RGB source and destination addresses for up to 64 ALU instructions. The ALU expects 6 source operands - three for color (rgb0, rgb1, rgb2) and three for alpha (a0, a1, a2). Specifies the identity of source operands rgb0, rgb1, and rgb2. Values 0 through 31 specify a location within the current pixel stack frame. Values 32 through 63 specify a constant. Specifies the identity of source operands rgb0, rgb1, and rgb2. Values 0 through 31 specify a location within the current pixel stack frame. Values 32 through 63 specify a constant. Specifies the identity of source operands rgb0, rgb1, and rgb2. Values 0 through 31 specify a location within the current pixel stack frame. Values 32 through 63 specify a constant. Specifies the address of the pixel stack frame register to which the RGB result of this instruction is to be written. Specifies which of the R, G, and B components of the result of this instruction are written to the pixel stack frame.

Specifies which of the R, G, and B components of the result of this instruction are written to the output fifo.

Specifies which frame buffer target to write to. ALU RGB Instruction Specifies the operand and component select for inputs A, B, and C.

Specifies the modifier for inputs A, B, and C.

Specifies the operand and component select for inputs A, B, and C.

Specifies the modifier for inputs A, B, and C.

Specifies the operand and component select for inputs A, B, and C.

Specifies the modifier for inputs A, B, and C.

Specifies how the pre-subtract value (SRCP) is computed

Specifies the operand for this instruction.
OP_MAD: Result = A*B + C
OP_DP3: Result = A.r*B.r + A.g*B.g + A.b*B.b
OP_DP4: Result = A.r*B.r + A.g*B.g + A.b*B.b + A.a*B.a
OP_D2A: Result = A.r*B.r + A.g*B.g + C.b
OP_MIN: Result = min(A,B)
OP_MAX: Result = max(A,B)
OP_CND: Result = cnd(A,B,C) = (C>0.5)?A:B
OP_CMP: Result = cmp(A,B,C) = (C>=0.0)?A:B
OP_FRC: Result = frac(A)
OP_SOP: Result = ex2,ln2,rcp,rsq from Alpha ALU
Specifies the output modifier for this instruction.
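The three dot-product style RGB operations differ only in which components take part. A minimal sketch, assuming operands are passed as (r, g, b, a) tuples and using hypothetical function names:

```python
def op_dp3(a, b):
    """OP_DP3: A.r*B.r + A.g*B.g + A.b*B.b (alpha is ignored)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def op_dp4(a, b):
    """OP_DP4: the DP3 sum plus A.a*B.a."""
    return op_dp3(a, b) + a[3] * b[3]


def op_d2a(a, b, c):
    """OP_D2A: two-component dot product plus the blue component of C."""
    return a[0] * b[0] + a[1] * b[1] + c[2]
```

In OP_D2A the third term is taken from input C rather than from a product, so it behaves like a 2D dot product with an additive term.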

Specifies clamp mode for this instruction.

Specifies whether to insert a NOP instruction after this. This would get specified in order to meet dependency requirements for the pre-subtract inputs. Do not insert NOP instruction after this one Insert a NOP instruction after this one Code Address for Indirection Levels 0 to 3 Specifies the start address of the ALU microcode segment associated with the current indirection level (0:63) Specifies the size of the ALU microcode segment associated with the current indirection level (1:64) Specifies the start address of the texture microcode segment associated with the current indirection level (0:31) Specifies the size of the texture microcode segment associated with the current indirection level (1:32) Indicates at least one RGBA output instruction at this level Indicates at least one W output instruction at this level Specifies the offset and size for the ALU and Texture microcode. These values are used to support relocatable code, and to support register writes to the code store without requiring a pipeline flush. Specifies the offset for the ALU code. This value is added to the ALU_START field in the US_CODE_ADDR registers (0:63) Specifies the total size for the ALU code for all levels (0:64) Specifies the offset for the Texture code. This value is added to the TEX_START field in the US_CODE_ADDR registers (0:31) Specifies the total size for the Texture code for all levels (0:32) Shader Configuration Specifies the valid indirection levels. Level 3 only (normal DX7-style texturing) Levels 2 and 3 (DX8-style bump mapping) Levels 1, 2, and 3 Levels 0, 1, 2, and 3 Specifies whether or not the texture code for the first valid level is enabled

Specifies how the shader output is written to the fog unit for each of up to four render targets Specifies the number and size of components

Specifies the source for components C0, C1, C2, C3

Mask specifying whether components C3, C2, C1 and C0 are signed (C4_8, C_16, C2_16 and C4_16 formats only) Shader pixel size. This register specifies the size and partitioning of the current pixel stack frame Specifies the total size of the current pixel stack frame (1:32) Texture Instruction Specifies the location (within the shader pixel stack frame) of the texture address for this instruction Specifies the location (within the shader pixel stack frame) of the returned texture data for this instruction Specifies the id of the texture map used for this instruction Specifies the operation taking place for this instruction NOP: Do nothing LD: Do Texture Lookup (S,T,R) TEXKILL: Kill pixel if any component is < 0 PROJ: Do projected texture lookup (S/Q,T/Q,R/Q) LODBIAS: Do texture lookup with lod bias unused Specifies the source and format for the Depth (W) value output by the shader Format for W W0 - W is always zero W24 - 24-bit fixed point W24_FP - 24-bit floating point Source for W

Shader Constant Color 0 Alpha Component Specifies the alpha component; (S16E7) fixed format. Shader Constant Color 0 Blue Component Specifies the blue component; (S16E7) fixed format. Shader Constant Color 0 Green Component Specifies the green component; (S16E7) fixed format. Shader Constant Color 0 Red Component Specifies the red component; (S16E7) fixed format. Control Bits for User Clip Planes and Clipping Enable User Clip Plane 0 Enable User Clip Plane 1 Enable User Clip Plane 2 Enable User Clip Plane 3 Enable User Clip Plane 4 Enable User Clip Plane 5 0 = Cull using distance from center of point 1 = Cull using radius-based distance from center of point 2 = Cull using radius-based distance from center of point, Expand and Clip on intersection 3 = Always expand and clip as trifan Disables clip code generation and clipping process for TCL Cull Primitives against UCPS, but don't clip If set, boundary edges are highlighted, else they are not highlighted Vertex Assembler/Processor Control Register Specifies the number of vertex slots to be used in the VAP PVS process. A slot represents a single vertex storage location across multiple engines (one vertex per engine). By decreasing the number of slots, there is more memory for each vertex, but less parallel processing. Similarly, by increasing the number of slots, there is less memory per vertex but more vertices being processed in parallel. Specifies the maximum number of controllers to be processing in parallel. In general should be set to max value of TBD. Can be changed for performance analysis. Specifies the number of Floating Point Units (Vector/Math Engines) to use when processing vertices. This field controls the number of vertices that the vertex fetcher manages for the TCL and Setup Vertex Storage memories (and therefore the number of vertices that can be re-used). This value should be set to 12 for most operations. This number may be modified for performance evaluation. 
The value is the maximum vertex number used which is one less than the number of vertices (i.e. a 12 means 13 vertices will be used) Clip space is defined as:

Vertex Assembler/Processor Control Status
Endian-Swap Control.
0 = No swap
1 = 16-bit swap: 0xAABBCCDD becomes 0xBBAADDCC
2 = 32-bit swap: 0xAABBCCDD becomes 0xDDCCBBAA
3 = Half-dword swap: 0xAABBCCDD becomes 0xCCDDAABB
Default = 0
The TCL engine is logically or physically removed from the circuit. Transform/Clip/Light (TCL) Engine is Busy. Read-only. Vertex Store is Busy. Read-only. Reciprocal Engine is Busy. Read-only. ViewPort Transform Engine is Busy. Read-only. Memory Interface Unit is Busy. Read-only. Vertex Cache is Busy. Read-only. Vertex Fetcher is Busy. Read-only. Register Pipeline is Busy. Read-only. VAP Engine is Busy. Read-only.
Programmable Stream Control Word 0
The data type for element 0
0 = FLOAT_1 (Single IEEE Float)
1 = FLOAT_2 (2 IEEE floats)
2 = FLOAT_3 (3 IEEE Floats)
3 = FLOAT_4 (4 IEEE Floats)
4 = BYTE * (1 DWORD w 4 8-bit fixed point values) (X = [7:0], Y = [15:8], Z = [23:16], W = [31:24])
5 = D3DCOLOR * (Same as BYTE except has X->Z,Z->X swap for D3D color def) (Z = [7:0], Y = [15:8], X = [23:16], W = [31:24])
6 = SHORT_2 * (1 DWORD with 2 16-bit fixed point values) (X = [15:0], Y = [31:16], Z = 0.0, W = 1.0)
7 = SHORT_4 * (2 DWORDS with 4(2 per dword) 16-bit fixed point values) (X = DW0 [15:0], Y = DW0 [31:16], Z = DW1 [15:0], W = DW1 [31:16])
8 = VECTOR_3_TTT * (1 DWORD with 3 10-bit fixed point values) (X = [9:0], Y = [19:10], Z = [29:20], W = 1.0)
9 = VECTOR_3_EET * (1 DWORD with 2 11-bit and 1 10-bit fixed point values) (X = [10:0], Y = [21:11], Z = [31:22], W = 1.0)
* These data types use the SIGNED and NORMALIZE flags described below.
The number of DWORDS to skip (discard) after processing the current element. The vector address in the input memory to write this element If set, indicates the last vector of the current vertex stream
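The four endian-swap encodings given for the Endian-Swap Control field can be reproduced in software to check test data. The sketch below uses a hypothetical function name and simply applies each documented transformation to a 32-bit value.

```python
def endian_swap(dword, mode):
    """Apply the documented endian-swap mode (0..3) to a 32-bit value."""
    dword &= 0xFFFFFFFF
    if mode == 0:      # no swap
        return dword
    if mode == 1:      # 16-bit swap: bytes exchanged within each 16-bit half
        return ((dword & 0x00FF00FF) << 8) | ((dword & 0xFF00FF00) >> 8)
    if mode == 2:      # 32-bit swap: full byte reversal
        return int.from_bytes(dword.to_bytes(4, "little"), "big")
    if mode == 3:      # half-dword swap: the two 16-bit halves exchanged
        return ((dword & 0xFFFF) << 16) | (dword >> 16)
    raise ValueError("mode must be 0..3")
```

Applying mode 1 and then mode 3 gives the same result as mode 2, which is a quick sanity check on the three tables.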

Determines whether fixed point data types are unsigned (0) or 2's complement signed (1) data types. See NORMALIZE for a complete description of the effect.

Determines whether the fixed to floating point conversion will normalize the value (i.e. fixed point value is all fractional bits) or not (i.e. fixed point value is all integer bits). This table describes the fixed to float conversion results:

SIGNED  NORMALIZE  FLT RANGE
0       0          0.0 - (2^n - 1) (i.e. 8-bit -> 0.0 - 255.0)
0       1          0.0 - 1.0
1       0          -2^(n-1) - (2^(n-1) - 1) (i.e. 8-bit -> -128.0 - 127.0)
1       1          -1.0 - 1.0

where n is the number of bits in the associated fixed point value. For signed, normalize conversion, since the fixed point range is not evenly distributed around 0, there are 3 different methods supported by R300. See the VAP_PSC_SGN_NORM_CNTL description for details.
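The SIGNED/NORMALIZE conversion ranges can be illustrated with a short reference function. The function name is hypothetical. For the signed, normalized case the text says R300 supports three methods selected elsewhere; the sketch picks one common convention (divide by 2^(n-1) - 1 and clamp at -1.0) purely as an assumption, and it may not match any of the three hardware methods exactly.

```python
def fixed_to_float(raw, bits, signed, normalize):
    """Convert an n-bit fixed point value to float per the SIGNED/NORMALIZE flags."""
    mask = (1 << bits) - 1
    raw &= mask
    if signed and raw & (1 << (bits - 1)):
        raw -= 1 << bits                  # interpret as 2's complement
    if not normalize:
        return float(raw)                 # all integer bits
    if not signed:
        return raw / mask                 # 0.0 .. 1.0
    # Assumption: symmetric scaling with the most negative code clamped to -1.0.
    return max(-1.0, raw / ((1 << (bits - 1)) - 1))
```

For 8-bit inputs this reproduces the example ranges quoted in the table: 0.0 to 255.0 unsigned, and -128.0 to 127.0 signed, when NORMALIZE is clear.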

See SKIP_DWORDS_0 See DST_VEC_LOC_0 See LAST_VEC_0

See SIGNED_0

See NORMALIZE_0

Programmable Vertex Shader Flow Control Loop Index Register 0 This field stores the automatic loop index register init value. This is an 8-bit unsigned value 0-255. This field is only used if the corresponding control flow instruction is a loop. This field stores the automatic loop index register step value. This is an 8-bit 2's comp signed value -128-127. This field is only used if the corresponding control flow instruction is a loop. Vertex Fetcher Control Primitive Type

Method of Passing Vertex Data.

When set, vertex indices are 32-bits/indx, otherwise, 16-bits/indx. When set, vertex reuse is disabled. DO NOT SET unless PRIM_WALK is Indexes. When set, the incoming index is treated as two separate indices. Bits 23-16 are used as the index for AOS 0 (These are 0 for 16-bit indices) Bits 15-0 are used as the index for AOS 1-15. This mode was added specifically for HOS usage Number of vertices in the command packet. Vertex Array of Structures Control The number of arrays required to represent the current vertex type. Each Array is described by the following three fields: VTX_AOS_ADDR, VTX_AOS_COUNT, VTX_AOS_STRIDE. Force Vertex Data Pre-fetching. If this bit is set, then a 256-bit word will always be fetched, regardless of which dwords are needed. Typically useful when VAP_VF_CNTL.PRIM_WALK is set to Vertex List (Auto-incremented indices). Granule Size to Fetch for AOS 0. 0 = 128-bit granule size 1 = 256-bit granule size This allows the driver to program the fetch size based on DWORDS/VTX/AOS combined with AGP vs. LOC Memory. The general belief is that the granule size should always be 256-bits for LOC memory and AGP8X data, but should be 128-bit for AGP2X/4X data if the DWORDS/VTX/AOS is less than TBD (128?) bits. See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE See AOS_0_FETCH_SIZE VAP Vertex State Control Register
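The split-index mode described above (one incoming index treated as two) amounts to a pair of bit-field extractions. A minimal sketch with a hypothetical function name:

```python
def split_index(index):
    """Split an incoming index into (AOS 0 index, AOS 1-15 index)."""
    aos0_index = (index >> 16) & 0xFF    # bits 23-16; zero for 16-bit indices
    aos_rest_index = index & 0xFFFF      # bits 15-0
    return aos0_index, aos_rest_index
```

A plain 16-bit index therefore always selects element 0 of AOS 0, as the parenthetical in the field description notes.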

Use vertex state addresses directly to write to vertex state memory. Use Address Indirection table to write to vertex state memory for lower 64 DWORD addresses. Z Buffer Band-Width Control Bit Defa Enables hierarchical Z.

Enables reading of compressed Z data from memory to the cache.

Enables writing of compressed Z data from cache to memory.

This bit is set when the Z buffer is used to help the CB in clearing a region. Part of the region is cleared by the color buffer and part will be cleared by the Z buffer. Since the Z buffer does not have any write masks in the cache, full micro-tiles need to be written. If a partial micro-tile is touched, then the untouched part will be unknown. The cache will operate in write-allocate mode and quads will be accumulated in the cache and then evicted to main memory. The color value is supplied through the ZB_DEPTHCLEARVALUE register.

Enabling this bit will force all the compressed stencil values to be Z Buffer Control Enables stenciling.

Enables Z functions.

Enables writing of the Z buffer.

Enables signed Z buffer comparison, for W-buffering.

When STENCIL_ENABLE is set, setting STENCIL_FRONT_BACK bit to one specifies that stencilfunc/stencilfail/stencilzpass/stencilzfail registers are used if the quad is generated from front faced primitive and stencilfunc_bf/stencilfail_bf/stencilzpass_bf/stencilzfail_bf are used if the quad is generated from a back faced primitive. If the STENCIL_FRONT_BACK is not set, then stencilfunc/stencilfail/stencilzpass/stencilzfail registers determine the operation independent of the front/back face state of the quad.

Format of the Data in the Z buffer Specifies the format of the Z buffer.

in 13E3 format, count leading 0's in 13E3 format, count leading 1's. This bit is unused 7 bytes per plane equation, 1 byte for stencil 8 bytes per plane equation, no bytes for stencil Hierarchical Z Data. This DWORD contains an 8-bit value for 4 4x4 blocks. The 4 blocks are organized as a 2x2 tile. The frame buffer coordinate (X,Y) corresponds to a particular 8-bit value for the 4x4 block within the DWORD as follows:
BITPOS[4:0] = 16 * X[2] + 8 * Y[2]
HIZ[7:0] = HIZDWORD[BITPOS+7:BITPOS]
Hierarchical Z Memory Offset DWORD offset into HiZ RAM. A DWORD can hold an 8-bit HiZ value for 4 blocks, so this offset is aligned on 4 4x4 blocks. In each pipe, the HiZ RAM DWORD address is generated from a pixel x[11:0], y[11:0] as follows:
HIZ_DWORD_ADDRESS[13:0] = HIZ_OFFSET[16:3] + Y[11:3] * HIZ_PITCH[13:5] + X[11:5].
Hierarchical Z Read Index Read index into HiZ RAM. The index must start on a DWORD boundary. RDINDEX works much like WRINDEX. Every read from HIZ_DWORD will increment the register by 2. Hierarchical Z Write Index Self-incrementing write index into the HiZ RAM. Starting write index must start on a DWORD boundary. Each time ZB_HIZ_DWORD is written, this index will increment by two DWORDs, because there are 2 pipes and the data is broadcast to both pipes. HIZ_OFFSET and HIZ_PITCH are not used to compute the read/write address to HiZ RAM when it is accessed through WRINDEX and DWORD Z Buffer Z Pass Counter Data. Contains the number of passed Z components since the last write to this location. Writing this location resets the count to the value written. Z and Stencil Function Control Specifies the Z function.
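The two HiZ formulas (BITPOS within a DWORD, and the HiZ RAM DWORD address) translate directly into bit-slice arithmetic. The sketch below is a literal transcription under two assumptions: the bracket notation `V[hi:lo]` selects bits hi..lo of the full value, and the 14-bit result simply wraps. The helper and function names are hypothetical.

```python
def bits(value, hi, lo):
    """Return bits hi..lo (inclusive) of value, right-aligned."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def hiz_dword_address(x, y, hiz_offset, hiz_pitch):
    """HIZ_DWORD_ADDRESS[13:0] = HIZ_OFFSET[16:3] + Y[11:3]*HIZ_PITCH[13:5] + X[11:5]."""
    addr = bits(hiz_offset, 16, 3) + bits(y, 11, 3) * bits(hiz_pitch, 13, 5) + bits(x, 11, 5)
    return addr & 0x3FFF   # assumed wrap to 14 bits


def hiz_value(hiz_dword, x, y):
    """Extract HIZ[7:0] for pixel (x, y) using BITPOS = 16*X[2] + 8*Y[2]."""
    bitpos = 16 * bits(x, 2, 2) + 8 * bits(y, 2, 2)
    return (hiz_dword >> bitpos) & 0xFF
```

Under this reading, bit 2 of X and Y picks one of the four bytes in the DWORD, so pixels (0,0), (0,4), (4,0) and (4,4) map to bytes 0, 1, 2 and 3 respectively.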

Specifies the stencil function.

Specifies the stencil value to be written if the stencil test fails.

Same encoding as STENCILFAIL. Specifies the stencil value to be written if the stencil test passes and the Z test passes (or is not enabled).

Same encoding as STENCILFAIL. Specifies the stencil value to be written if the stencil test passes and the Z test fails.

Same encoding as STENCILFUNC. Specifies the stencil function for back faced quads, if STENCIL_FRONT_BACK = 1.

Same encoding as STENCILFAIL. Specifies the stencil value to be written if the stencil test fails for back faced quads, if STENCIL_FRONT_BACK = 1.

Same encoding as STENCILFAIL. Specifies the stencil value to be written if the stencil test passes and the Z test passes (or is not enabled) for back faced quads, if STENCIL_FRONT_BACK = 1.

Same encoding as STENCILFAIL. Specifies the stencil value to be written if the stencil test passes and the Z test fails for back faced quads, if STENCIL_FRONT_BACK = 1.

(RO) Command Stream Indirect Queue 2 Status
Current Write Pointer into the Indirect Queue. Default = 0.
Current Read Pointer into the Indirect Queue. Default = 0.
Current Write Pointer into the Indirect Queue. Default = 0.

(WO) Command Stream Queue Address
Address into the Command Stream Queue which is to be read from. Used for debug, to read the contents of the Command Stream Queue.

IB1 Aperture map in RBBM - PIO. IB1 Aperture.

IB2 Aperture map in RBBM - PIO. IB2 Aperture.

Primary Aperture map in RBBM - PIO. Primary Aperture.

Command Stream Queue Available Counts
Count of available dwords in the queue for the Primary Stream. Read Only.
Count of available dwords in the queue for the Indirect Stream. Read Only.
Count of available dwords in the queue for the Indirect Stream. Read Only.

Command Stream Queue Control
Command Stream Queue Mode. Controls whether each command stream is enabled, and whether it is in push mode (Programmed I/O), or pull mode (Bus-Master). Encodings are chosen to be compatible with Rage128.
0 = Primary Disabled, Indirect Disabled.
1 = Primary PIO, Indirect Disabled.
2 = Primary BM, Indirect Disabled.
3,5,7 = Primary PIO, Indirect BM.
4,6,8 = Primary BM, Indirect BM.
9-14 = Reserved.
15 = Primary PIO, Indirect PIO.
Default = 0

(RO) Command Stream Queue Data
Data from the Command Stream Queue, from location pointed to by the CP_CSQ_ADDR register. Used for debug, to read the contents of the Command Stream Queue.

Alternate Command Stream Queue Control
Start location of Indirect Queue #2 in the command cache. This value also sets the size in double octwords of the Indirect Queue #1 cache that will reside in locations INDIRECT1_START to (INDIRECT2_START - 1). The Indirect Queue #2 will reside in locations INDIRECT2_START to 0x5f. The minimum size of the Indirect Queues must be at least twice the MAX_FETCH size as programmed in the CP_RB_CNTL register.

Start location of Indirect Queue #1 in the command cache. This value is also the size in double octwords of the Primary Queue cache that will reside in locations 0 to (INDIRECT1_START - 1). The minimum size of the Primary Queue cache must be at least twice the MAX_FETCH size as programmed in the CP_RB_CNTL register.
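The command cache partitioning described above can be sanity-checked with a short sketch. The function name and return layout are hypothetical. The sketch assumes that MAX_FETCH uses the encoding given for CP_RB_CNTL (0, 1, 2, 3 meaning 1, 2, 4, 8 double octwords) and that "twice the MAX_FETCH size" refers to that decoded size.

```python
CACHE_LAST = 0x5F  # last command cache location, per the text


def csq_partition(indirect1_start: int, indirect2_start: int, max_fetch: int) -> dict:
    """Compute queue sizes in double octwords and check the minimum-size rule."""
    sizes = {
        "primary": indirect1_start,                      # 0 .. INDIRECT1_START - 1
        "indirect1": indirect2_start - indirect1_start,  # INDIRECT1_START .. INDIRECT2_START - 1
        "indirect2": CACHE_LAST + 1 - indirect2_start,   # INDIRECT2_START .. 0x5f
    }
    # ASSUMPTION: MAX_FETCH field value n encodes 2**n double octwords.
    min_size = 2 * (1 << max_fetch)
    ok = all(size >= min_size for size in sizes.values())
    return {"sizes": sizes, "min_size": min_size, "ok": ok}
```

Splitting the cache into three equal parts (start values 0x20 and 0x40) gives 32 double octwords per queue, which satisfies the rule for every MAX_FETCH setting.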

Enables Indirect Buffer #2. If this bit is set, the CP_CSQ_MODE register overrides the operation of the CSQ_MODE variable in the CP_CSQ_CNTL register.

Enables Indirect Buffer #1. If this bit is set, the CP_CSQ_MODE register overrides the operation of the CSQ_MODE variable in the CP_CSQ_CNTL register.

Enables Primary Buffer. If this bit is set, the CP_CSQ_MODE register overrides the operation of the CSQ_MODE variable in the CP_CSQ_CNTL register. (RO) Command Stream Queue Status Current Read Pointer into the Primary Queue. Default = 0. Current Write Pointer into the Primary Queue. Default = 0. Current Read Pointer into the Indirect Queue. Default = 0. Command for PIO GUI DMAs. Command for PIO DMAs to the GUI DMA. Only DWORD access is allowed to this register. Destination Address for PIO GUI DMAs. Destination address for PIO DMAs to the GUI DMA. Only DWORD access is allowed to this register. Source Address for PIO GUI DMAs. Source address for PIO DMAs to the GUI DMA. Only DWORD access is allowed to this register. Indirect Buffer 2 Base Indirect Buffer 2 Base. Address of the beginning of the indirect buffer. Only DWORD access is allowed to this register. Indirect Buffer 2 Size Indirect Buffer 2 Size. This size is expressed in dwords. This field is an initiator to begin fetching commands from the Indirect Buffer. Only DWORD access is allowed to this register. Default = 0 Indirect Buffer Base Indirect Buffer Base. Address of the beginning of the indirect buffer. Only DWORD access is allowed to this register. Indirect Buffer Size Indirect Buffer Size. This size is expressed in dwords. This field is an initiator to begin fetching commands from the Indirect Buffer. Only DWORD access is allowed to this register. Default = 0 Micro Engine Control Status of MicroEngine internal registers. This value depends on the current value of the ME_STATMUX field. Read Only. Selects which status is to be returned on the ME_STAT field. Busy indicator for the MicroEngine. 0 = MicroEngine not busy. 1 = MicroEngine is active. Read Only. Run-Mode of MicroEngine. 0 = Single-Step Mode. 1 = Free-running Mode. Default = 1 Step the MicroEngine by one instruction. Writing a '1' to this field causes the MicroEngine to step by one instruction, if and only if the ME_MODE bit is a '0'. Write Only. 
MicroEngine RAM Address
MicroEngine RAM Address (Write Mode) Writing this

MicroEngine RAM Data High
MicroEngine RAM Data High. Used to load the MicroEngine RAM.

MicroEngine RAM Data Low
MicroEngine RAM Data Low. Used to load the MicroEngine RAM.

MicroEngine RAM Read Address
MicroEngine RAM Address (Read Mode) Writing

Ring Buffer Base
Ring Buffer Base. Address of the beginning of the ring buffer.

Ring Buffer Control
Ring Buffer Size. This size is expressed in log2 of the actual size. Values 0 and 1 are clamped to an 8 DWORD ring buffer. A value of 2 to 22 will give a ring buffer of size 2^(RB_BUFSZ+1). Values greater than 22 will clamp to 22. Default = 0

Ring Buffer Block Size. This defines the number of quadwords that the Command Processor will read between updates to the host's copy of the Read Pointer. This size is expressed in log2 of the actual size (in 64-bit quadwords). For example, for a block of 1024 quadwords, you would program this field to 10 (decimal). Default = 0

Endian Swap Control for Ring Buffer and Indirect Buffer. Only affects the chip behavior if the buffer resides in system memory.
0 = No swap
1 = 16-bit swap: 0xAABBCCDD becomes 0xBBAADDCC
2 = 32-bit swap: 0xAABBCCDD becomes 0xDDCCBBAA
3 = Half-dword swap: 0xAABBCCDD becomes 0xCCDDAABB
Default = 0

Maximum Fetch Size for any read request that the CP makes to memory.
0 = 1 double octword (32 bytes)
1 = 2 double octwords (64 bytes)
2 = 4 double octwords (128 bytes)
3 = 8 double octwords (256 bytes)
Default = 0

Ring Buffer No Write to Read Pointer.
0 = Write to Host's copy of Read Pointer in system memory.
1 = Do not write to Host's copy of Read Pointer.
The purpose of this control bit is to have a fall-back position if the bus-mastered write to system memory doesn't work, in which case the driver will have to read the Graphics Controller's copy of the Read Pointer directly, with some performance penalty. Default = 0

Ring Buffer Read Pointer Write Transfer Enable. When set, the contents of the CP_RB_RPTR_WR register are transferred to the active read pointer (CP_RB_RPTR) whenever the CP_RB_WPTR register is written. Default = 0

Ring Buffer Read Pointer Address (RO) Ring Buffer Read Pointer. This is an index (in dwords) of the current element being read from the ring buffer.

Ring Buffer Read Pointer Address
Swap control of the reported read pointer address. See CP_RB_CNTL.BUF_SWAP for the encoding.
Ring Buffer Read Pointer Address. Address of the Host's copy of the Read Pointer.

CP_RB_RPTR (RO) Ring Buffer Read Pointer

Writable Ring Buffer Read Pointer Address
Writable Ring Buffer Read Pointer. Writable for updating the RB_RPTR after an ACPI.

(RO) Ring Buffer Write Pointer
Ring Buffer Write Pointer. This is an index (in dwords) of the last known element to be written to the ring buffer (by the host).

Ring Buffer Write Pointer Delay
Pre-Write Timer. The number of clocks that a write to the CP_RB_WPTR register will be delayed until actually taking effect. Default = 0
Pre-Write Limit. The number of times that the CP_RB_WPTR register can be written (while the PRE_WRITE_TIMER has not expired) before the CP_RB_WPTR register is forced to be updated with the most recently written value. Default = 0

Raster Engine Sync Address (WO) Scratch Register Offset Address.

Raster Engine Sync Data (WO). Data written to selected Scratch Register when a sync pulse pair is received from the CBA and CBB.

(RO) Busy Status Signals
Memory Read Unit Busy. Memory Write Unit Busy. Register Backbone Input Interface Busy. RBBM Output Interface Busy. Primary Command Stream Fetcher Busy.
Indirect #1 Command Stream Fetcher Busy. Data in Command Queue for Primary Stream. Data in Command Queue for Indirect #1 Stream. Command Stream Interpreter Busy. Indirect #2 Command Stream Fetcher Busy. Data in Command Queue for Indirect #2 Stream. GUI DMA Engine Busy. VID DMA Engine Busy. Command Stream Busy. CP Busy. Command for PIO VID DMAs. Command for PIO DMAs to the VID DMA. Only DWORD access is allowed to this register. Destination Address for PIO VID DMAs. Destination address for PIO DMAs to the VID DMA. Only DWORD access is allowed to this register. Source Address for PIO VID DMAs. Source address for PIO DMAs to the VID DMA. Only DWORD access is allowed to this register. Virtual vs Physical Address Control - Selects whether the address corresponds to a physical or virtual address in memory.
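The ring buffer control encodings described earlier for CP_RB_CNTL (buffer size, block size, and endian swap) lend themselves to a quick numeric check. The sketch below uses hypothetical helper names and assumes the ring buffer size is counted in DWORDs, which is consistent with the stated clamp to an 8 DWORD buffer.

```python
def ring_buffer_dwords(rb_bufsz: int) -> int:
    """Size implied by RB_BUFSZ: 2^(RB_BUFSZ+1), with the clamps from the text."""
    clamped = min(max(rb_bufsz, 2), 22)  # 0 and 1 give 8 DWORDs; above 22 clamps to 22
    return 1 << (clamped + 1)


def block_size_field(quadwords: int) -> int:
    """log2 of the block size in 64-bit quadwords (must be a power of two)."""
    if quadwords <= 0 or quadwords & (quadwords - 1):
        raise ValueError("block size must be a power of two")
    return quadwords.bit_length() - 1


def buf_swap(value: int, mode: int) -> int:
    """Apply the endian swap encodings 0..3 to a 32-bit value."""
    b = value.to_bytes(4, "big")
    if mode == 0:
        swapped = b                                   # no swap
    elif mode == 1:
        swapped = bytes([b[1], b[0], b[3], b[2]])     # 16-bit swap
    elif mode == 2:
        swapped = b[::-1]                             # 32-bit swap
    elif mode == 3:
        swapped = b[2:] + b[:2]                       # half-dword swap
    else:
        raise ValueError("swap mode is a 2-bit field")
    return int.from_bytes(swapped, "big")
```

The swap function reproduces the three 0xAABBCCDD examples from the field description, and `block_size_field(1024)` gives the documented value of 10.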

Resolve Buffer Control. Unpipelined Specifies if the color buffer is in resolve mode. The cache must be empty before changing this register.

Specifies the gamma and degamma to be applied to the samples before and after filtering, respectively.

Controls whether alpha is averaged in the resolve.
0 => the resolved alpha value is taken from sample 0.
1 => the resolved alpha value is the filtered (average) result of the samples. The average is not gamma corrected.

Alpha Blend Control for Color Channels. Pipelined through the blender.

Allow alpha blending with the destination.

Enables use of RB3D_ABLENDCNTL

When blending is enabled, this enables memory reads. Memory reads will still occur when this is disabled if they are for reasons not related to blending.

Discard pixels when blending is enabled based on the src color.
Disable
Discard pixels if src alpha <= RB3D_DISCARD_SRC_PIXEL_LTE_THRESHOLD
Discard pixels if src color <= RB3D_DISCARD_SRC_PIXEL_LTE_THRESHOLD
Discard pixels if src argb <= RB3D_DISCARD_SRC_PIXEL_LTE_THRESHOLD
Discard pixels if src alpha >= RB3D_DISCARD_SRC_PIXEL_GTE_THRESHOLD
Discard pixels if src color >= RB3D_DISCARD_SRC_PIXEL_GTE_THRESHOLD
Discard pixels if src argb >= RB3D_DISCARD_SRC_PIXEL_GTE_THRESHOLD

Combine Function. Allows modification of how the SRCBLEND and DESTBLEND are combined.

Source Blend Function , Alpha blending function (SRC).

Destination Blend Function , Alpha blending function (DST).

Enables source alpha zero performance optimization to skip reads.

Enables source alpha one performance optimization to skip reads.

Discard src pixels greater than or equal to threshold. Blue, Green, Red, Alpha.

Discard src pixels less than or equal to threshold. Blue, Green, Red, Alpha.

Unpipelined. A quad is replicated and written to this many + 1 buffers. 0 (1 buffer) is the only mode where the cb processes the end of packet command.

Enables equivalent of rage128 CMP_EQ_FLIP color compare mode. This is used to ensure 3D data does not get chromakeyed away by logic in the backend.

Enables AA color compression. Cmask must also be enabled when aa compression is enabled. The cache must be empty before this is changed.

Enables use of the cmask ram. The cache must be empty before this is changed.

Set to 0.

Enables independent color channel masks for the MRTs. Disabling this feature will cause all the MRTs to use color channel mask 0.

Disables write compression.
Enable write compression
Disable write compression

Enables independent color format for the MRTs. Disabling this feature will cause all the MRTs to use color format 0.

Color buffer format and tiling control for all the multibuffers and the pitch of multibuffer 0. Unpipelined. The cache must be empty before any of the registers are changed. 3D destination pitch in multiples of 2-pixels. Denotes whether the 3D destination is in macrotiled format.

Denotes whether the 3D destination is in microtiled format.

Specifies endian control for the color buffer.

3D destination color format.
ARGB10101010
UV1010
CI8 (2D ONLY)
ARGB1555
RGB565
ARGB2101010
ARGB8888
ARGB32323232
I8
ARGB16161616
YUV422 packed (VYUY)
YUV422 packed (YVYU)
UV88
I10
ARGB4444

3D Color Channel Mask. If all the channels used in the current color format are disabled, then the cb will discard all the incoming quads. Pipelined through the blender.

mask bit for the blue channel

mask bit for the green channel

mask bit for the red channel

mask bit for the alpha channel

mask bit for the blue channel of MRT 1

mask bit for the green channel of MRT 1

mask bit for the red channel of MRT 1

mask bit for the alpha channel of MRT 1

mask bit for the blue channel of MRT 2

mask bit for the green channel of MRT 2

mask bit for the red channel of MRT 2

mask bit for the alpha channel of MRT 2

mask bit for the blue channel of MRT 3

mask bit for the green channel of MRT 3

mask bit for the red channel of MRT 3

mask bit for the alpha channel of MRT 3

Clear color that is used when the color mask is set to 00. Unpipelined. Program this register with a 32-bit value in ARGB8888 or ARGB2101010 formats, ignoring the fields. blue clear color green clear color red clear color alpha clear color Alpha and red clear color values that are used when the color mask is set to 00 in FP16 per component mode. Unpipelined. red clear color alpha clear color Green and blue clear color values that are used when the color mask is set to 00 in FP16 per component mode. Unpipelined. blue clear color green clear color Constant color used by the blender. Pipelined through the blender. blue constant color (For R520, this field is ignored, use RB3D_CONSTANT_COLOR_GB__BLUE instead) green constant color (For R520, this field is ignored, use RB3D_CONSTANT_COLOR_GB__GREEN instead) red constant color (For R520, this field is ignored, use RB3D_CONSTANT_COLOR_AR__RED instead) alpha constant color (For R520, this field is ignored, use RB3D_CONSTANT_COLOR_AR__ALPHA instead) Constant color used by the blender. Pipelined through the blender. red constant color in 0.10 fixed or FP16 format alpha constant color in 0.10 fixed or FP16 format Constant color used by the blender. Pipelined through the blender. blue constant color in 0.10 fixed or FP16 format green constant color in 0.10 fixed or FP16 format Sets the fifo sizes Determines the size of the op fifo

Alpha Function Specifies the 8-bit alpha compare value when AF_EN_8BIT is enabled Specifies the alpha compare function.

Enables/Disables alpha compare function.

Enable 8-bit alpha compare function.
Default 10-bit alpha compare.
Enable 8-bit alpha compare.

Enables/Disables alpha-to-mask function.

Specifies number of sub-pixel samples for alpha-to-mask function.

Enables/Disables RGB Dithering (Not supported in R520)

Alpha offset enable/disable (Not supported in R520)

Enable/Disable discard zero mask coverage quad to ZB.
No discard of zero coverage mask quads
Discard zero coverage mask quads

Enables/Disables FP16 alpha function.
Default 10-bit alpha compare and alpha-to-mask function
Enable FP16 alpha compare and alpha-to-mask function

Alpha Compare Value
Specifies the alpha compare value, 0.10 fixed or FP16 format.

Blue Component of Fog Color
Blue component of fog color; (0.10) fixed format.

Green Component of Fog Color
Green component of fog color; (0.10) fixed format.

Red Component of Fog Color
Red component of fog color; (0.10) fixed format.

Constant Factor for Fog Blending
Constant fog factor; fixed (0.10) format.

Specifies color properties and mappings of textures.

Specifies undefined(0), flat(1) and Gouraud(2/def) shading for each texture.

Specifies undefined(0), flat(1) and Gouraud(2/def) shading for each texture.

Specifies undefined(0), flat(1) and Gouraud(2/def) shading for tex10 components.

Specifies if each color should come from a texture and which one.

GA Enhancement Register TCL/GA Deadlock control. Prevents TCL interface from deadlocking on GA side.

Enables Fast register/primitive switching

R520+: When set, GA supports simultaneous register reads & writes No effect. Enables GA support of simultaneous register reads and writes. No effect. Enables GA support of no-stall reads for register read back. GA Input fifo high water marks Number of words remaining in input vertex fifo before asserting nearly full Number of words remaining in input primitive fifo before asserting nearly full Number of words remaining in input register fifo before asserting nearly full Alpha fill color. FP20 format for alpha fill. Blue fill color. FP20 format for blue fill. Green fill color. FP20 format for green fill. Red fill color. FP20 format for red fill. Returns idle status of various G3D block, captured when GA_IDLE written or when hard or soft reset asserted. Idle status of physical pipe 3 Z unit Idle status of physical pipe 2 Z unit Idle status of physical pipe 3 CB unit Idle status of physical pipe 2 CB unit Idle status of physical pipe 3 FG unit Idle status of physical pipe 2 FG unit Idle status of physical pipe 3 US unit Idle status of physical pipe 2 US unit Idle status of physical pipe 3 SC unit Idle status of physical pipe 2 SC unit Idle status of physical pipe 3 RS unit Idle status of physical pipe 2 RS unit Idle status of physical pipe 1 Z unit Idle status of physical pipe 0 Z unit Idle status of physical pipe 1 CB unit Idle status of physical pipe 0 CB unit Idle status of physical pipe 1 FG unit Idle status of physical pipe 0 FG unit Idle status of physical pipe 1 US unit Idle status of physical pipe 0 US unit Idle status of physical pipe 1 SC unit Idle status of physical pipe 0 SC unit Idle status of physical pipe 1 RS unit Idle status of physical pipe 0 RS unit Idle status of SU unit Idle status of GA unit Idle status of GA unit2 Line control 1/2 width of line, in subpixels (1/12 or 1/16 only, even in 8b subprecision); (16.0) fixed format. Specifies how ends of lines should be drawn.

R520+: When enabled, all lines are sorted so that V0 is the vertex with the smallest X, or if X is equal, the smallest Y.
No sorting (default)
Sort on minX then minY

Current value of stipple accumulator. 24b integer, measuring stipple accumulation in subpixels (1/12 or 1/16, even in 8b precision). (Note: field is 32b, but only lower 24b used.)

Specifies x & y offsets for vertex data after conversion to FP.
Specifies X offset in S15 format (subpixels -- 1/12 or 1/16, even in 8b subprecision).
Specifies Y offset in S15 format (subpixels -- 1/12 or 1/16, even in 8b subprecision).

Dimensions for Points
1/2 Height of point; fixed (16.0), subpixel format (1/12 or 1/16, even if in 8b precision).
1/2 Width of point; fixed (16.0), subpixel format (1/12 or 1/16, even if in 8b precision).

Specifies the rounding mode for geometry & color SPFP to FP conversions.

Trunc (0) or round to nearest (1) for geometry (XY).

When set, FP32 to FP20 using round to nearest; otherwise trunc

Specifies SPFP color clamp range of [0,1] or FP20 for RGB. Clamp to [0,1.0] for RGB RGB is FP20 Specifies SPFP alpha clamp range of [0,1] or FP20. Clamp to [0,1.0] for Alpha Alpha is FP20 4b negative polarity mask for subpixel precision. Inverted version gets ANDed with subpixel X, Y masks. Specifies blue & alpha components of fill color -- S312 format -- Backwards comp. Component alpha value. (S3.12) Component blue value. (S3.12) Specifies red & green components of fill color -- S312 format -- Backwards comp. Component green value (S3.12). Component red value (S3.12). Data register for loading US instructions and constants. 32 bit dword Used to load US instructions and constants Instruction (TYPE == GA_US_VECTOR_INST) or constant (TYPE == GA_US_VECTOR_CONST) number at which to start loading. The GA will then expect n*6 (instructions) or n*4 (constants) writes to GA_US_VECTOR_DATA. The GA will self-increment until this register is written again. For instructions, the GA expects the dwords in the following order: US_CMN_INST, US_ALU_RGB_ADDR, US_ALU_ALPHA_ADDR, US_ALU_ALPHA, US_RGB_INST, US_ALPHA_INST, US_RGBA_INST. For constants, the GA expects the dwords in RGBA order. Specifies if the GA should load instructions or constants. Load instructions - INDEX is an instruction index Load constants - INDEX is a constant index No clamping of data - Default Clamp to [-1.0,1.0] constant data Specifies top of Raster pipe specific enable controls. Specifies if points will have stuffed texture coordinates.

Specifies if lines will have stuffed texture coordinates.

Specifies if triangles will have stuffed texture coordinates.

Specifies if the auto dec/inc stencil mode should be enabled, and how.

Specifies the sources of the texture coordinates for each texture.

Specifies the sizes of the various FIFO's in the sc/rs/us. This register must be the first one written.

Size of scan converter input FIFO (XYZ)

Size of scan converter top-of-pipe Z FIFO

Size of scan converter input FIFO (B)

Size of ras input FIFO (Texture)

Size of ras input FIFO (Color)

Size of us RAM

Size of us output FIFO (RGBA)

Size of us output FIFO (W)

High water mark for RS colors' fifo -- NOT USED High water mark for RS textures' fifo -- NOT USED High water mark for US output fifo

High water mark for US cube map fifo

Specifies the sizes of the various FIFO's in the sc/rs.
High water mark for SC input fifo
High water mark for SC input fifo (B)
High water mark for RS colors' fifo
High water mark for RS textures' fifo

Specifies the position of multisamples 0 through 2
Specifies the x and y position (in subpixels) of multisample 0
Specifies the x and y position (in subpixels) of multisample 0
Specifies the x and y position (in subpixels) of multisample 1
Specifies the x and y position (in subpixels) of multisample 1
Specifies the x and y position (in subpixels) of multisample 2
Specifies the x and y position (in subpixels) of multisample 2
Specifies the minimum x and y distance (in subpixels) between the pixel edge and the multisamples. These values are used in the first (coarse) scan converter
Specifies the minimum x and y distance (in subpixels) between the pixel edge and the multisamples. These values are used in the first (coarse) scan converter

Specifies the position of multisamples 3 through 5
Specifies the x and y position (in subpixels) of multisample 3
Specifies the x and y position (in subpixels) of multisample 3
Specifies the x and y position (in subpixels) of multisample 4
Specifies the x and y position (in subpixels) of multisample 4
Specifies the x and y position (in subpixels) of multisample 5
Specifies the x and y position (in subpixels) of multisample 5
Specifies the minimum distance (in subpixels) between the pixel edge and the multisamples. This value is used in the second (quad) scan converter

Selects which of 4 pipes are active.
Maps physical pipe 0 to logical pipe ID (def 0).
Maps physical pipe 1 to logical pipe ID (def 1).
Maps physical pipe 2 to logical pipe ID (def 2).
Maps physical pipe 3 to logical pipe ID (def 3).
4b mask, indicates which physical pipes are enabled (def none=0x0) -- B3=P3, B2=P2, B1=P1, B0=P0. -- 1: enabled, 0: disabled
2b, indicates, by the fuses, the max number of allowed pipes. 0 = 1 pipe ... 3 = 4 pipes -- Read Only
4b, indicates, by the fuses, the bad pipes: B3=P3, B2=P2, B1=P1, B0=P0 -- 1: bad, 0: good -- Read Only
If this bit is set when writing this register, the logical pipe ID values are assigned automatically based on the values that are read back in the MAX_PIPE and BAD_PIPES fields. This field is always read back as 0.
Do nothing
Force self-configuration

Specifies various polygon specific selects (fog, depth, perspective).

Specifies source for outgoing (GA to SU) fog value.

Specifies source for outgoing (GA/SU & SU/RAS) depth value.

Specifies source for outgoing (1/W) value, used to disable perspective correct colors/textures.

Controls enabling of fog stuffing into texture coordinate.

Controls which texture gets fog value

Controls which component of texture gets fog value

Specifies the graphics pipeline configuration for rasterization

Enables tiling, otherwise all tiles receive all polygons.

Specifies the number of active pipes and contexts (up to 4 pipes, 1 ctx). When this field is written, it is automatically reduced by hardware so as not to use more pipes than the number indicated in GB_PIPE_SELECT.MAX_PIPES or the number of pipes left unmasked by GB_PIPE_SELECT.BAD_PIPES. The potentially altered value is read back, rather than the original value written by software.
RV350 (1 pipe, 1 ctx)
R300 (2 pipes, 1 ctx)
06 – R420-3P (3 pipes, 1 ctx)
07 – R420 (4 pipes, 1 ctx)

Specifies width & height (square), in pixels (only 16, 32 available).
8 pixels.
16 pixels.
32 pixels.

Specifies number of tiles and config in super chip configuration.

X Location of chip within super tile. Y Location of chip within super tile. Tile location of chip in a multi super tile config (Super size of 2,8,32 or 128).

Specifies the precision of subpixels wrt pixels (12 or 16).

Specifies the number of quads to be sent to each rasterizer in turn when in RV300B or R300B mode
4 Quads
8 Quads
16 Quads
32 Quads

Specifies whether to use an intercept or bounding box based calculation for the first (coarse) scan converter
Use intercept based scan converter
Use bounding box based scan converter

Specifies whether to use an alternate scan pattern for the coarse scan converter
Use normal left-right scan
Use alternate left-right-left scan

Not used -- should be 0

Not used

Not used

Set to 0

Support for 3x2 tiling in 3P mode
Use default tiling in all tiling modes
Use alternative 3x2 tiling in 3P mode

Support for extended setup Z range from [0,1] to [-2,2] with per pixel clamping
Use (24.1) Z format, with vertex clamp to [1.0,0.0]
Use (S25.1) format, with vertex clamp to [2.0,-2.0] and per pixel [1.0,0.0]

Specifies the z plane equation configuration.
Specifies the z plane equation size.
4x4 z plane equations (point-sampled or aa)
8x8 z plane equations (point-sampled only)

PS3 mode enable register
When reset (default), follows R300/PS2 mode; when set, allows for new ps3 mode.
Default PS2 mode
New PS3 mode

Specifies source for texture components in PS3 mode

Specifies VAP source, or GA (ST) or GA (STR) stuffing for each texture.

Specifies VAP source, or GA (ST) or GA (STR) stuffing for each texture.

PS3 vertex format register How many active components (0,1,2,3,4) are in each texture.

How many active components (0,1,2,3,4) are in each texture.

How many active components (0,2,3,4) are in texture 10. Not active 2 component (GA/SU) 3 component (GA/SU) 4 component (GA/SU) This register specifies the rasterizer input packet configuration Specifies the total number of texture address components contained in the rasterizer input packet (0:32). Specifies the total number of colors contained in the rasterizer input packet (0:4). Specifies the relative rasterizer input packet location of w (if w_count==1) Enable high resolution texture coordinate output when q is equal to 1 This table specifies what happens during each rasterizer instruction Specifies the index (into the RS_IP table) of the texture address output during this rasterizer instruction Write enable for texture address

Specifies the destination address (within the current pixel stack frame) of the texture address output during this rasterizer instruction Specifies the index (into the RS_IP table) of the color output during this rasterizer instruction Write enable for color No write - color not valid write - color valid write fbuffer - XY00->RGBA write backface - B000->RGBA Specifies the destination address (within the current pixel stack frame) of the color output during this rasterizer instruction Specifies whether to sample texture coordinates at the real or adjusted pixel centers

Specifies that the rasterizer should output w
No write - w not valid
write - w valid

This register specifies the number of rasterizer instructions
Number of rasterizer instructions (1:16)
Indicates range of texture offset to minimize periodic errors on texels sampled right on their edges

This table specifies the source location and format for up to 16 texture addresses (i[0]:i[15]) and four colors (c[0]:c[3])
Specifies the relative rasterizer input packet location of each component (S, T, R, and Q) of texture address (i[i]). The values 62 and 63 select constant inputs for the component: 62 selects K0 (0.0), and 63 selects K1 (1.0).
Specifies the relative rasterizer input packet location of each component (S, T, R, and Q) of texture address (i[i]). The values 62 and 63 select constant inputs for the component: 62 selects K0 (0.0), and 63 selects K1 (1.0).
Specifies the relative rasterizer input packet location of each component (S, T, R, and Q) of texture address (i[i]). The values 62 and 63 select constant inputs for the component: 62 selects K0 (0.0), and 63 selects K1 (1.0).
Specifies the relative rasterizer input packet location of each component (S, T, R, and Q) of texture address (i[i]). The values 62 and 63 select constant inputs for the component: 62 selects K0 (0.0), and 63 selects K1 (1.0).
Specifies the relative rasterizer input packet location of the color (c[i]).
Specifies the format of the color (c[i]).

Enable application of the TX_OFFSET in RS_INST_COUNT
Do not apply the TX_OFFSET in RS_INST_COUNT
Apply the TX_OFFSET specified by RS_INST_COUNT

Edge rules - what happens when an edge falls exactly on a sample point
Edge rules for triangles, points, left-right lines, right-left lines, upper-bottom lines, bottom-upper lines. For values 0 to 15, bit 0 specifies whether a sample on a horizontal-bottom edge is in, bit 1 specifies whether a sample on a horizontal-top edge is in, bit 2 specifies whether a sample on a right edge is in, bit 3 specifies whether a sample on a left edge is in. For values 16 to 31, bit 0 specifies whether a sample on a vertical-right edge is in, bit 1 specifies whether a sample on a vertical-left edge is in, bit 2 specifies whether a sample on a bottom edge is in, bit 3 specifies whether a sample on a top edge is in.
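The bit assignments for the edge rule values can be illustrated with a small decoder. The function name is hypothetical. The polarity is an assumption: the text only says each bit "specifies whether" a sample is in, and this sketch treats a cleared bit as "in" and a set bit as "out", following the value listings elsewhere in this reference where value 0 corresponds to all edges in.

```python
def decode_edge_rule(value: int) -> dict:
    """Map a 5-bit edge rule value to an in/out state per edge type."""
    if not 0 <= value <= 31:
        raise ValueError("edge rule is a 5-bit value")
    if value < 16:
        # bit 0 .. bit 3 for values 0 to 15
        names = ("horizontal-bottom", "horizontal-top", "right", "left")
    else:
        # bit 0 .. bit 3 for values 16 to 31
        names = ("vertical-right", "vertical-left", "bottom", "top")
    # ASSUMPTION: cleared bit = sample is in, set bit = sample is out.
    return {name: "out" if (value >> bit) & 1 else "in" for bit, name in enumerate(names)}
```

Under that assumption, `decode_edge_rule(1)` reports only the horizontal-bottom edge as out, and `decode_edge_rule(24)` reports only the top edge as out.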

SU Raster pipe destination select for registers
Register read/write destination select: b0: logical pipe0, b1: logical pipe1, b2: logical pipe2 and b3: logical pipe3

Enables for Cylindrical Wrapping

tNcM -- Enable texture wrapping on component M (S,T,R,Q) of texture N.

tNcM -- Enable texture wrapping on component M (S,T,R,Q) of texture N.

Specifies texture wrapping for new PS3 textures. tNcM -- Enable texture wrapping on component M (S,T,R,Q) of texture N.

tNcM -- Enable texture wrapping on component M (S,T,R,Q) of texture N.

Border Color. Color used for borders. Format is the same as the texture being bordered. Texture Chroma Key. Color used for chroma key compare. Format is the same as the texture being keyed. Texture Enables for Maps 0 to 15 Texture Map Enables.

Texture Map Enables.

Texture Filter State Clamp mode for texture coordinates

Clamp mode for texture coordinates

Filter used when texture is magnified

Filter used when texture is minified

Filter used between mipmap levels

Filter used between layers of a volume. (If no filter is specified, select from MIN/MAG filters.)

LOD index of largest (finest) mipmap to use (0 is largest). Ranges from 0 to NUM_LEVELS. Logical id for this physical texture Texture Filter State Chroma Key Mode

Bilinear rounding mode

(s4.5). Ranges from -16.0 to 15.99. Mipmap LOD bias measured in mipmap levels. Added to the signed, computed LOD before the LOD is clamped. MPEG coordinate truncation mode
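The LOD bias format can be made concrete with a short conversion sketch. It assumes that "s4.5" means a 10-bit two's-complement value with 5 fractional bits (one sign bit, 4 integer bits), which matches the stated range of -16.0 to just under 16.0. The helper names are hypothetical.

```python
def encode_lod_bias(bias):
    """Convert a bias in mipmap levels to a 10-bit s4.5 field (assumed layout)."""
    raw = int(round(bias * 32))          # 5 fractional bits -> scale by 2**5
    raw = max(-512, min(511, raw))       # clamp to the representable range
    return raw & 0x3FF                   # two's complement in 10 bits


def decode_lod_bias(field):
    """Convert a 10-bit s4.5 field back to a bias in mipmap levels."""
    field &= 0x3FF
    if field & 0x200:                    # sign bit set -> negative value
        field -= 0x400
    return field / 32.0
```

Under this assumption the largest positive bias is 511/32 = 15.96875, which the text rounds to 15.99.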

Apply slope and bias to trilerp fraction to reduce the number of 2-level fetches for trilinear. Should only be used if MIP_FILTER is LINEAR. Breakpoint=0/8. lfrac_out = lfrac_in Breakpoint=1/8. lfrac_out = clamp(4/3*lfrac_in - 1/6) Breakpoint=1/4. lfrac_out = clamp(2*lfrac_in - 1/2) Breakpoint=3/8. lfrac_out = clamp(4*lfrac_in - 3/2) Set to 0 Set to 0 Set to 0 If enabled, addressing switches to macro-linear when image width is <= 8 micro-tiles. If disabled, functionality is same as RV350, switch to macro-linear when image width is < 8 micro-tiles. RV350 mode Switch from macro-tiled to macro-linear when (width <= 8 micro-tiles) To fix issues when using non-square mipmaps, with border_color, and extreme minification. R3xx R4xx mode Stop right shifting coord once mip size is pinned to one Filter4 Kernel (s1.9). Bottom or Right weight of pair. (s1.9). Top or Left weight of pair. Indicates which pair of weights within phase to load. Top or Left Bottom or Right Indicates which of 9 phases to load Indicates whether to load the horizontal or vertical weights Horizontal Vertical Texture Format State Image width - 1. The largest image is 4096 texels. When wrapping or mirroring, must be a power of 2. When mipmapping, must be a power of 2 or padded to a power of 2 in memory. Can always be non-square, except for cube maps which must be square. Image height - 1. The largest image is 4096 texels. When wrapping or mirroring, must be a power of 2. When mipmapping, must be a power of 2 or padded to a power of 2 in memory. Can always be non-square, except for cube maps which must be square. LOG2(depth) of volume texture Number of mipmap levels minus 1. Ranges from 0 to 12. Equivalent to LOD index of smallest (coarsest) mipmap to use. Specifies whether texture coords are projected.
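The four trilinear-fraction breakpoint formulas listed above (lfrac_out = lfrac_in, clamp(4/3*lfrac_in - 1/6), clamp(2*lfrac_in - 1/2), clamp(4*lfrac_in - 3/2)) can be checked with a small sketch. The mode numbering 0 to 3 simply follows the order in which the breakpoints are listed, and clamping to [0, 1] is an assumption; both are illustrative rather than taken from a register encoding.

```python
from fractions import Fraction

# (slope, bias) pairs in the order the breakpoints are listed in the text.
TRILERP_MODES = [
    (Fraction(1), Fraction(0)),          # breakpoint 0/8: pass through
    (Fraction(4, 3), Fraction(-1, 6)),   # breakpoint 1/8
    (Fraction(2), Fraction(-1, 2)),      # breakpoint 1/4
    (Fraction(4), Fraction(-3, 2)),      # breakpoint 3/8
]


def adjust_lfrac(mode, lfrac_in):
    """Apply slope and bias to a trilinear fraction, then clamp to [0, 1]."""
    slope, bias = TRILERP_MODES[mode]
    out = slope * Fraction(lfrac_in) + bias
    return min(Fraction(1), max(Fraction(0), out))
```

In each non-trivial mode the output reaches 0 exactly at the breakpoint and 1 at its mirror image (for example 1/8 and 7/8), so fractions outside that band need only a single mip level.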

Indicates when TXPITCH should be used instead of TXWIDTH for image addressing

Texture Format State Texture Format. Components are numbered right to left. Parentheses indicate typical uses of each format. TX_FMT_8 or TX_FMT_1 (if TX_FORMAT2.TXFORMAT_MSB is set) TX_FMT_16 or TX_FMT_1_REVERSE (if TX_FORMAT2.TXFORMAT_MSB is set) TX_FMT_4_4 or TX_FMT_10 (if TX_FORMAT2.TXFORMAT_MSB is set) TX_FMT_8_8 or TX_FMT_10_10 (if TX_FORMAT2.TXFORMAT_MSB is set) TX_FMT_16_16 or TX_FMT_10_10_10_10 (if TX_FORMAT2.TXFORMAT_MSB is set) TX_FMT_3_3_2 or TX_FMT_ATI1N (if TX_FORMAT2.TXFORMAT_MSB is set) TX_FMT_5_6_5 TX_FMT_6_5_5 TX_FMT_11_11_10 TX_FMT_10_11_11 TX_FMT_4_4_4_4 TX_FMT_1_5_5_5 TX_FMT_8_8_8_8 TX_FMT_2_10_10_10 TX_FMT_16_16_16_16 TX_FMT_Y8 TX_FMT_AVYU444 TX_FMT_VYUY422 TX_FMT_YVYU422 TX_FMT_16_MPEG TX_FMT_16_16_MPEG TX_FMT_16f TX_FMT_16f_16f TX_FMT_16f_16f_16f_16f TX_FMT_32f TX_FMT_32f_32f TX_FMT_32f_32f_32f_32f TX_FMT_W24_FP TX_FMT_ATI2N Component filter should interpret texel data as signed or unsigned. (Ignored for Y/YUV formats.)

Component filter should interpret texel data as signed or unsigned. (Ignored for Y/YUV formats.)

Specifies swizzling for each channel at the input of the pixel shader. (Ignored for Y/YUV formats.)

Optionally remove gamma from texture before passing to shader. Only apply to 8bit or less components.

YUV to RGB conversion mode

Specifies coordinate type.

This field is ignored on R520 and RV510.

Texture Format State Used instead of TXWIDTH for image addressing when TXPITCH_EN is asserted. Pitch is given as number of texels minus one. Maximum pitch is 16K texels. Specifies the MSB of the texture format to extend the number of formats to 64. Specifies bit 11 of TXWIDTH to extend the largest image to 4096 texels. Specifies bit 11 of TXHEIGHT to extend the largest image to 4096 texels. Optionally divide by 256 instead of 255 during fix2float. Can only be asserted for 8-bit components. Divide by pow2-1 for fix2float (default) Divide by pow2 for fix2float If filter4 is enabled, specifies which texture component to apply filter4 to. Select Texture Component0. Select Texture Component1. Select Texture Component2. Select Texture Component3. Invalidate texture cache tags. Unused Texture Offset State Endian Control

Macro Tile Control

Micro Tile Control

32-byte aligned pointer to base map ALU Alpha Instruction Specifies the opcode for this instruction. OP_MAD: Result = A*B + C OP_DP: Result = dot product from RGB ALU OP_MIN: Result = min(A,B) OP_MAX: Result = max(A,B) OP_CND: Result = cnd(A,B,C) = (C>0.5)?A:B OP_CMP: Result = cmp(A,B,C) = (C>=0.0)?A:B OP_FRC: Result = A-floor(A) OP_EX2: Result = 2^^A OP_LN2: Result = log2(A) OP_RCP: Result = 1/A OP_RSQ: Result = 1/sqrt(A) OP_SIN: Result = sin(A*2pi) OP_COS: Result = cos(A*2pi) OP_MDH: Result = A*B + C; A is always topleft.src0, C is always topright.src0 (source select and swizzles ignored). Input modifiers are respected for all inputs. OP_MDV: Result = A*B + C; A is always topleft.src0, C is always bottomleft.src0 (source select and swizzles ignored). Input modifiers are respected for all inputs. Specifies the address of the pixel stack frame register to which the Alpha result of this instruction is to be written. Specifies whether the loop register is added to the value of ALPHA_ADDRD before it is used. This implements relative addressing.

Specifies the operands for Alpha inputs A and B.

Specifies the channel sources for Alpha inputs A and B.

Specifies the input modifiers for Alpha inputs A and B.

Specifies the operands for Alpha inputs A and B.

Specifies the channel sources for Alpha inputs A and B.

Specifies the input modifiers for Alpha inputs A and B.

Specifies the output modifier for this instruction.

This specifies which (cached) frame buffer target to write to. For non-output ALU instructions, this specifies how to compare the results against zero when setting the predicate bits.

Specifies whether or not to write the Alpha component of the result of this instruction to the depth output fifo. NONE: Do not write output to w. A: Write the alpha channel only to w. This table specifies the Alpha source addresses and pre-subtract operation for up to 512 ALU instructions. The ALU expects 6 source operands - three for color (rgb0, rgb1, rgb2) and three for alpha (a0, a1, a2). The pre-subtract operation creates two more (rgbp and ap). Specifies the identity of source operands a0, a1, and a2. If the const field is set, this number ranges from 0 to 255 and specifies a location within the constant register bank. Otherwise: If the most significant bit is cleared, this field specifies a location within the current pixel stack frame (ranging from 0 to 127). If the most significant bit is set, then the lower 7 bits specify an inline unsigned floating-point constant with 4 bit exponent (bias 7) and 3 bit mantissa, including denormals but excluding infinite/NaN. Specifies whether the associated address is a constant register address or a temporary address / inline constant.

Specifies whether the loop register is added to the value of the associated address before it is used. This implements relative addressing.

Specifies the identity of source operands a0, a1, and a2. If the const field is set, this number ranges from 0 to 255 and specifies a location within the constant register bank. Otherwise: If the most significant bit is cleared, this field specifies a location within the current pixel stack frame (ranging from 0 to 127). If the most significant bit is set, then the lower 7 bits specify an inline unsigned floating-point constant with 4 bit exponent (bias 7) and 3 bit mantissa, including denormals but excluding infinite/NaN. Specifies whether the associated address is a constant register address or a temporary address / inline constant.
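The three interpretations of a source address (constant register, pixel stack frame location, inline constant) can be sketched as follows. Several points are assumptions made for illustration: the address field is taken to be 8 bits wide, with bit 7 as the most significant bit; the exponent occupies the upper 4 of the lower 7 bits; and denormals follow the usual IEEE-style convention (exponent field 0 means no implicit leading one). The function names are hypothetical.

```python
def inline_to_float(bits):
    """Decode a 7-bit unsigned inline constant: 4-bit exponent (bias 7), 3-bit mantissa."""
    exponent = (bits >> 3) & 0xF
    mantissa = bits & 0x7
    if exponent == 0:
        # Assumed denormal handling: no implicit leading one, exponent fixed at 1 - bias.
        return (mantissa / 8.0) * 2.0 ** (1 - 7)
    # All other exponent values are ordinary numbers (no infinity/NaN encodings).
    return (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


def decode_src_addr(addr, is_const):
    """Classify an 8-bit source address field (assumed width)."""
    addr &= 0xFF
    if is_const:
        return ("const", addr)            # constant register bank, 0..255
    if not (addr & 0x80):
        return ("temp", addr)             # pixel stack frame, 0..127
    return ("inline", inline_to_float(addr & 0x7F))
```

Under these assumptions the encodable range runs from 1/512 (smallest denormal) up to 480.0, with 0x38 encoding exactly 1.0.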

Specifies whether the loop register is added to the value of the associated address before it is used. This implements relative addressing.

Specifies how the pre-subtract value (SRCP) is computed.

ALU Shared RGBA Instruction Specifies the opcode for this instruction. OP_MAD: Result = A*B + C OP_DP3: Result = A.r*B.r + A.g*B.g + A.b*B.b OP_DP4: Result = A.r*B.r + A.g*B.g + A.b*B.b + A.a*B.a OP_D2A: Result = A.r*B.r + A.g*B.g + C.b OP_MIN: Result = min(A,B) OP_MAX: Result = max(A,B) OP_CND: Result = cnd(A,B,C) = (C>0.5)?A:B OP_CMP: Result = cmp(A,B,C) = (C>=0.0)?A:B OP_FRC: Result = A-floor(A) OP_SOP: Result = ex2,ln2,rcp,rsq,sin,cos from Alpha ALU OP_MDH: Result = A*B + C; A is always topleft.src0, C is always topright.src0 (source select and swizzles ignored). Input modifiers are respected for all inputs. OP_MDV: Result = A*B + C; A is always topleft.src0, C is always bottomleft.src0 (source select and swizzles ignored). Input modifiers are respected for all inputs. Specifies the address of the pixel stack frame register to which the RGB result of this instruction is to be written. Specifies whether the loop register is added to the value of RGB_ADDRD before it is used. This implements relative addressing.

Specifies the operands for RGB and Alpha input C.

Specifies, per channel, the sources for RGB and Alpha input C.

Specifies the input modifiers for RGB and Alpha input C.

Specifies the operands for RGB and Alpha input C.

Specifies, per channel, the sources for RGB and Alpha input C.

Specifies the input modifiers for RGB and Alpha input C.

ALU RGB Instruction Specifies the operands for RGB inputs A and B.

Specifies, per channel, the sources for RGB inputs A and B.

Specifies the input modifiers for RGB inputs A and B.

Specifies the operands for RGB inputs A and B.

Specifies, per channel, the sources for RGB inputs A and B.

Specifies the input modifiers for RGB inputs A and B.

Specifies the output modifier for this instruction.

This specifies which (cached) frame buffer target to write to. For non-output ALU instructions, this specifies how to compare the results against zero when setting the predicate bits.

Specifies whether to update the current ALU result. Do not modify the current ALU result. Modify the current ALU result based on the settings of ALU_RESULT_SEL and ALU_RESULT_OP. This table specifies the RGB source addresses and pre-subtract operation for up to 512 ALU instructions. The ALU expects 6 source operands - three for color (rgb0, rgb1, rgb2) and three for alpha (a0, a1, a2). The pre-subtract operation creates two more (rgbp and ap). Specifies the identity of source operands rgb0, rgb1, and rgb2. If the const field is set, this number ranges from 0 to 255 and specifies a location within the constant register bank. Otherwise: If the most significant bit is cleared, this field specifies a location within the current pixel stack frame (ranging from 0 to 127). If the most significant bit is set, then the lower 7 bits specify an inline unsigned floating-point constant with 4 bit exponent (bias 7) and 3 bit mantissa, including denormals but excluding infinite/NaN. Specifies whether the associated address is a constant register address or a temporary address / inline constant.

Specifies whether the loop register is added to the value of the associated address before it is used. This implements relative addressing.

Specifies the identity of source operands rgb0, rgb1, and rgb2. If the const field is set, this number ranges from 0 to 255 and specifies a location within the constant register bank. Otherwise: If the most significant bit is cleared, this field specifies a location within the current pixel stack frame (ranging from 0 to 127). If the most significant bit is set, then the lower 7 bits specify an inline unsigned floating-point constant with 4 bit exponent (bias 7) and 3 bit mantissa, including denormals but excluding infinite/NaN. Specifies whether the associated address is a constant register address or a temporary address / inline constant.

Specifies whether the loop register is added to the value of the associated address before it is used. This implements relative addressing.

Specifies how the pre-subtract value (SRCP) is computed.

Shared instruction fields for all instruction types Specifies the type of instruction. Note that output instructions write to render targets. US_INST_TYPE_ALU: This instruction is an ALU instruction. US_INST_TYPE_OUT: This instruction is an output instruction. US_INST_TYPE_FC: This instruction is a flow control instruction. US_INST_TYPE_TEX: This instruction is a texture instruction. Specifies whether to wait for the texture semaphore. This instruction may issue immediately. This instruction will not issue until the texture semaphore is available. Specifies whether the instruction uses predication. For ALU/TEX/Output this specifies predication for the RGB channels only. For FC this specifies the predicate for the entire instruction. US_PRED_SEL_NONE: No predication US_PRED_SEL_RGBA: Independent Channel Predication US_PRED_SEL_RRRR: R-Replicate Predication US_PRED_SEL_GGGG: G-Replicate Predication US_PRED_SEL_BBBB: B-Replicate Predication US_PRED_SEL_AAAA: A-Replicate Predication Specifies whether the predicate should be inverted. For ALU/TEX/Output this specifies predication for the RGB channels only. For FC this specifies the predicate for the entire instruction.

Specifies which pixels to write to. Only write to channels of active pixels Write to channels of all pixels, including inactive pixels Specifies whether this is the last instruction. Do not terminate the shader after executing this instruction (unless this instruction is at END_ADDR). All active pixels are willing to terminate after executing this instruction. There is no guarantee that the shader will actually terminate here. This feature is provided as a performance optimization for tests where pixels can conditionally terminate early. Specifies whether to insert a NOP instruction after this. This would get specified in order to meet dependency requirements for the pre-subtract inputs, and dependency requirements for src0 of an MDH/MDV instruction. Do not insert NOP instruction after this one. Insert a NOP instruction after this one. Specifies whether to wait for pending ALU instructions to complete before issuing this instruction. Do not wait for pending ALU instructions to complete before issuing the current instruction. Wait for pending ALU instructions to complete before issuing the current instruction. Specifies which components of the result of the RGB instruction are written to the pixel stack frame.

Specifies whether the result of the Alpha instruction is written to the pixel stack frame. NONE: Do not write register. A: Write the alpha channel only. Specifies which components of the result of the RGB instruction are written to the output fifo if this is an output instruction, and which predicate bits should be modified if this is an ALU instruction.

Specifies whether the result of the Alpha instruction is written to the output fifo if this is an output instruction, and whether the Alpha predicate bit should be modified if this is an ALU instruction. NONE: Do not write output. A: Write the alpha channel only. Specifies RGB and Alpha clamp mode for this instruction.

Specifies RGB and Alpha clamp mode for this instruction.

Specifies which component of the result of this instruction should be used as the 'ALU result' by a subsequent flow control instruction. RED: Use red as ALU result for FC. ALPHA: Use alpha as ALU result for FC. Specifies whether the predicate should be inverted. For ALU/TEX/Output this specifies predication for the alpha channel only. This field has no effect on FC instructions.

Specifies how to compare the ALU result against zero for the 'alu_result' bit in a subsequent flow control instruction. Equal to Less than Greater than or equal to Not equal Specifies whether the instruction uses predication. For ALU/TEX/Output this specifies predication for the alpha channel only. This field has no effect on FC instructions. US_PRED_SEL_NONE: No predication US_PRED_SEL_RGBA: A predication (identical to US_PRED_SEL_AAAA) US_PRED_SEL_RRRR: R Predication US_PRED_SEL_GGGG: G Predication US_PRED_SEL_BBBB: B Predication US_PRED_SEL_AAAA: A Predication Specifies which components (R,G,B,A) contribute to the stat count Code start and end instruction addresses. Specifies the address of the first instruction to execute in the shader program. This address is relative to the shader program offset given in US_CODE_OFFSET.OFFSET_ADDR. Specifies the address of the last instruction to execute in the shader program. This address is relative to the shader program offset given in US_CODE_OFFSET.OFFSET_ADDR. Shader program execution will always terminate after the instruction at this address is executed. Offsets used for relative instruction addresses in the shader program, including START_ADDR, END_ADDR, and any non-global flow control jump addresses. Specifies the offset to add to relative instruction addresses, including START_ADDR, END_ADDR, and some flow control jump addresses. Range of instructions that contains the current shader program. Specifies the start address of the current code window. This address is an absolute address. Specifies the size of the current code window, minus one. The last instruction in the code window is given by CODE_ADDR + CODE_SIZE. Shader Configuration Set to 0 Control how ALU multiplier behaves when one argument is zero. This affects the multiplier used in MAD and dot product calculations. 
Default behaviour (0*inf=nan,0*nan=nan) Legacy behaviour for shader model 1 (0*anything=0) Flow Control Instruction Address Fields The address of the static boolean register to use in the jump function. The address of the static integer register to use for loop/rep and endloop/endrep. The address to jump to if the jump function evaluates to true. Specifies whether to interpret JUMP_ADDR as a global address. Add the shader program offset in US_CODE_OFFSET.OFFSET_ADDR when calculating the destination address of a jump Don't use the shader program offset when calculating the destination address of a jump Static Boolean Constants for Flow Control Branching Instructions. Quad-buffered. Specifies the boolean value for constants 0-31. Flow Control Options. Quad-buffered. Specifies whether test mode is enabled. This flag currently has no effect in hardware. Normal mode Test mode (currently unused) Specifies whether full flow control functionality is enabled. Use partial flow-control (enables twice the contexts). Loops and subroutines are not available in partial flow-control mode, and the nesting depth of branch statements is limited. Use full pixel shader 3.0 flow control, including loops and subroutines. Flow Control Instruction Specifies the type of flow control instruction. US_FC_OP_JUMP: (if, endif, call, etc) US_FC_OP_LOOP: same as jump except always take the jump if the static counter is 0. If we don't take the jump, push initial loop counter and loop register (aL) values onto the loop stack. US_FC_OP_ENDLOOP: same as jump but decrement the loop counter and increment the loop register (aL), and don't take the jump if the loop counter becomes zero. US_FC_OP_REP: same as loop but don't push the loop register aL. US_FC_OP_ENDREP: same as endloop but don't update/pop the loop register aL. US_FC_OP_BREAKLOOP: same as jump but pops the loop stacks if a pixel stops being active. US_FC_OP_BREAKREP: same as breakloop but don't pop the loop register if it jumps. 
US_FC_OP_CONTINUE: used to disable pixels that are ready to jump to the ENDLOOP/ENDREP instruction. Specifies whether to perform an else operation on the active and branch-inactive pixels before executing the instruction. Don't alter the branch state before executing the instruction. Perform an else operation on the branch state before executing the instruction; pixels in the active state are moved to the branch inactive state with zero counter, and vice versa. If set, jump if any active pixels want to take the jump (otherwise the instruction jumps only if all active pixels want to). Jump if ALL active pixels want to take the jump (for if and else). If no pixels are active, jump. Jump if ANY active pixels want to take the jump (for call, loop/rep and endrep/endloop). If no pixels are active, do not jump. The address stack operation to perform if we take the jump. US_FC_A_OP_NONE: Don't change the address stack US_FC_A_OP_POP: If we jump, pop the address stack and use that value for the jump target US_FC_A_OP_PUSH: If we jump, push the current address onto the address stack A 2x2x2 table of boolean values indicating whether to take the jump. The table index is indexed by {ALU Compare Result, Predication Result, Boolean Value (from the static boolean address in US_FC_ADDR.BOOL)}. To determine whether to jump, look at bit ((alu_result<<2) | (predicate<<1) | bool). The amount to decrement the branch counter by if US_FC_B_OP_DECR operation is performed. The branch state operation to perform if we don't take the jump. US_FC_B_OP_NONE: If we don't jump, don't alter the branch counter for any pixel. US_FC_B_OP_DECR: If we don't jump, decrement branch counter by B_POP_CNT for inactive pixels. Activate pixels with negative counters. US_FC_B_OP_INCR: If we don't jump, increment branch counter by 1 for inactive pixels. Deactivate pixels that decided to jump and set their counter to zero. The branch state operation to perform if we do take the jump. 
US_FC_B_OP_NONE: If we do jump, don't alter the branch counter for any pixel. US_FC_B_OP_DECR: If we do jump, decrement branch counter by B_POP_CNT for inactive pixels. Activate pixels with negative counters. US_FC_B_OP_INCR: If we do jump, increment branch counter by 1 for inactive pixels. Deactivate pixels that decided not to jump and set their counter to zero. If set, uncovered pixels will not participate in flow control decisions. Include uncovered pixels in jump decisions Ignore uncovered pixels in making jump decisions Integer Constants used by Flow Control Loop Instructions. Single buffered. Specifies the number of iterations. Unsigned 8-bit integer in [0, 255]. Specifies the initial value of the loop register (aL). Unsigned 8-bit integer in [0, 255]. Specifies the increment used to change the loop register (aL) on each iteration. Signed 7-bit integer in [-128, 127]. width > 2048, height <= 2048 width <= 2048, height > 2048 width > 2048, height > 2048
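The 2x2x2 jump table described above is indexed with bit ((alu_result<<2) | (predicate<<1) | bool). A minimal sketch of that lookup, treating the table as an 8-bit integer, is shown here; the function name `should_jump` and the example table values are illustrative only.

```python
def should_jump(jump_func, alu_result, predicate, bool_value):
    """Look up one bit of the 8-entry jump table.

    jump_func  -- the 8-bit table (bit i holds the decision for index i)
    alu_result -- ALU compare result
    predicate  -- predication result
    bool_value -- static boolean selected via US_FC_ADDR.BOOL
    """
    index = (int(bool(alu_result)) << 2) | (int(bool(predicate)) << 1) | int(bool(bool_value))
    return bool((jump_func >> index) & 1)
```

With this indexing, a table of 0x55 jumps whenever the static boolean is false (indices 0, 2, 4, 6), 0xF0 jumps whenever the ALU compare result is true, and 0xFF always jumps.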

Normal rounding Modified rounding of fixed-point data Shader pixel size. This register specifies the size and partitioning of the current pixel stack frame Specifies the total size of the current pixel stack frame (1:128) Texture addresses and swizzles Specifies the location (within the shader pixel stack frame) of the texture address for this instruction Specifies whether the loop register is added to the value of the associated address before it is used. This implements relative addressing.

Specify which colour channel of src_addr to use for S coordinate

Specify which colour channel of src_addr to use for T coordinate

Specify which colour channel of src_addr to use for R coordinate

Specify which colour channel of src_addr to use for Q coordinate

Specifies the location (within the shader pixel stack frame) of the returned texture data for this instruction Specifies whether the loop register is added to the value of the associated address before it is used. This implements relative addressing. NONE: Do not modify destination address RELATIVE: Add aL before lookup. Specify which colour channel of the returned texture data to write to the red channel of dst_addr Write R channel to R channel Write G channel to R channel Write B channel to R channel Write A channel to R channel Specify which colour channel of the returned texture data to write to the green channel of dst_addr Write R channel to G channel Write G channel to G channel Write B channel to G channel Write A channel to G channel Specify which colour channel of the returned texture data to write to the blue channel of dst_addr Write R channel to B channel Write G channel to B channel Write B channel to B channel Write A channel to B channel Specify which colour channel of the returned texture data to write to the alpha channel of dst_addr Write R channel to A channel Write G channel to A channel Write B channel to A channel Write A channel to A channel Additional texture addresses and swizzles for DX/DY inputs Specifies the location (within the shader pixel stack frame) of the DX value for this instruction Specifies whether the loop register is added to the value of the associated address before it is used. This implements relative addressing.

Specify which colour channel of dx_addr to use for S coordinate

Specify which colour channel of dx_addr to use for T coordinate

Specify which colour channel of dx_addr to use for R coordinate

Specify which colour channel of dx_addr to use for Q coordinate

Specifies the location (within the shader pixel stack frame) of the DY value for this instruction Specifies whether the loop register is added to the value of the associated address before it is used. This implements relative addressing.

Specify which colour channel of dy_addr to use for S coordinate

Specify which colour channel of dy_addr to use for T coordinate

Specify which colour channel of dy_addr to use for R coordinate

Specify which colour channel of dy_addr to use for Q coordinate

Texture Instruction Specifies the id of the texture map used for this instruction Specifies the operation taking place for this instruction NOP: Do nothing LD: Do Texture Lookup (S,T,R) TEXKILL: Kill pixel if any component is < 0 PROJ: Do projected texture lookup (S/Q,T/Q,R/Q) LODBIAS: Do texture lookup with lod bias LOD: Do texture lookup with explicit lod DXDY: Do texture lookup with lod calculated from DX and DY Whether to hold the texture semaphore until the data is written to the temporary register. Don't hold the texture semaphore Hold the texture semaphore until the data is written to the temporary register. If set, US will not request data for pixels which are uncovered. Clear this bit for indirect texture lookups. Fetch texels for uncovered pixels Don't fetch texels for uncovered pixels Whether to scale texture coordinates when sending them to the texture unit. Scale the S, T, R texture coordinates from [0.0,1.0] to the dimensions of the target texture Use the unscaled S, T, R texture coordinates. Specifies the source and format for the Depth (W) value output by the shader Format for W W0 - W is always zero W24 - 24-bit fixed point W24_FP - 24-bit floating point. The floating point values are a special format that preserve sorting order when values are compared as integers, allowing higher precision in W without additional logic in other blocks. Source for W

Alternate Number of Vertices to allow > 16-bits of Vertex count 24-bit vertex count for command packet. Used instead of bits 31:16 of VAP_VF_CNTL if VAP_VF_CNTL.USE_ALT_NUM_VERTS is set. Control Bits for User Clip Planes and Clipping Enable User Clip Plane 0 Enable User Clip Plane 1 Enable User Clip Plane 2 Enable User Clip Plane 3 Enable User Clip Plane 4 Enable User Clip Plane 5 0 = Cull using distance from center of point 1 = Cull using radius-based distance from center of point 2 = Cull using radius-based distance from center of point, Expand and Clip on intersection 3 = Always expand and clip as trifan Disables clip code generation and clipping process for TCL Cull Primitives against UCPS, but don't clip If set, boundary edges are highlighted, else they are not highlighted If set, color2 is used as texture8 by GA (PS3.0 requirement) If set, color3 is used as texture9 by GA (PS3.0 requirement) Vertex Assembler/Processor Control Register Specifies the number of vertex slots to be used in the VAP PVS process. A slot represents a single vertex storage location across multiple engines (one vertex per engine). By decreasing the number of slots, there is more memory for each vertex, but less parallel processing. Similarly, by increasing the number of slots, there is less memory per vertex but more vertices being processed in parallel. Specifies the maximum number of controllers to be processing in parallel. In general should be set to max value of TBD. Can be changed for performance analysis. Specifies the number of Floating Point Units (Vector/Math Engines) to use when processing vertices. If set, VAP will not process any draw commands (i.e. writes to VAP_VF_CNTL, the INDX and DATAPORT, and Immediate mode writes are ignored). This field controls the number of vertices that the vertex fetcher manages for the TCL and Setup Vertex Storage memories (and therefore the number of vertices that can be re-used). 
This value should be set to 12 for most operations. This number may be modified for performance evaluation. The value is the maximum vertex number used, which is one less than the number of vertices (i.e. a 12 means 13 vertices will be used). Clip space is defined as:

If set, enables the TCL state optimization, and the new state is used only if there is a change in TCL state, between VF_CNTL (triggers) Vertex Assembler/Processor Control Status Endian-Swap Control.
0 = No swap
1 = 16-bit swap: 0xAABBCCDD becomes 0xBBAADDCC
2 = 32-bit swap: 0xAABBCCDD becomes 0xDDCCBBAA
3 = Half-dword swap: 0xAABBCCDD becomes 0xCCDDAABB
Default = 0
The TCL engine is logically or physically removed from the circuit. Transform/Clip/Light (TCL) Engine is Busy. Read-only. Maximum number of MPs fused for this chip. Read-only. For A11, fusemask is fixed to 1XXX. For A12,
CG.CC_COMBINEDSTRAPS.MAX_MPS[7:5] = 000 => max_mps[3:0] = 1XXX => 8 MPs
CG.CC_COMBINEDSTRAPS.MAX_MPS[7:5] = 001 => max_mps[3:0] = 0110 => 6 MPs
CG.CC_COMBINEDSTRAPS.MAX_MPS[7:5] = 010 => max_mps[3:0] = 0101 => 5 MPs
CG.CC_COMBINEDSTRAPS.MAX_MPS[7:5] = 011 => max_mps[3:0] = 0100 => 4 MPs
CG.CC_COMBINEDSTRAPS.MAX_MPS[7:5] = 100 => max_mps[3:0] = 0011 => 3 MPs
CG.CC_COMBINEDSTRAPS.MAX_MPS[7:5] = 101 => max_mps[3:0] = 0010 => 2 MPs
CG.CC_COMBINEDSTRAPS.MAX_MPS[7:5] = 110 => max_mps[3:0] = 0001 => 1 MP
CG.CC_COMBINEDSTRAPS.MAX_MPS[7:5] = 111 => max_mps[3:0] = 0000 => 0 MP
Note that max_mps[3:0] = 0111 = 7 MPs is not available.
Vertex Store is Busy. Read-only. Reciprocal Engine is Busy. Read-only. ViewPort Transform Engine is Busy. Read-only. Memory Interface Unit is Busy. Read-only. Vertex Cache is Busy. Read-only. Vertex Fetcher is Busy. Read-only. Register Pipeline is Busy. Read-only. VAP Engine is Busy. Read-only. Offset Value added to index value in both Indexed and Auto-indexed modes. 
Disabled by setting to 0 25-bit signed 2's comp offset value Programmable Stream Control Word 0 The data type for element 0
0 = FLOAT_1 (Single IEEE Float)
1 = FLOAT_2 (2 IEEE floats)
2 = FLOAT_3 (3 IEEE Floats)
3 = FLOAT_4 (4 IEEE Floats)
4 = BYTE * (1 DWORD w 4 8-bit fixed point values) (X = [7:0], Y = [15:8], Z = [23:16], W = [31:24])
5 = D3DCOLOR * (Same as BYTE except has X->Z,Z->X swap for D3D color def) (Z = [7:0], Y = [15:8], X = [23:16], W = [31:24])
6 = SHORT_2 * (1 DWORD with 2 16-bit fixed point values) (X = [15:0], Y = [31:16], Z = 0.0, W = 1.0)
7 = SHORT_4 * (2 DWORDS with 4(2 per dword) 16-bit fixed point values) (X = DW0 [15:0], Y = DW0 [31:16], Z = DW1 [15:0], W = DW1 [31:16])
8 = VECTOR_3_TTT * (1 DWORD with 3 10-bit fixed point values) (X = [9:0], Y = [19:10], Z = [29:20], W = 1.0)
9 = VECTOR_3_EET * (1 DWORD with 2 11-bit and 1 10-bit fixed point values) (X = [10:0], Y = [21:11], Z = [31:22], W = 1.0)
10 = FLOAT_8 (8 IEEE Floats) Same as 2 FLOAT_4 but must use consecutive DST_VEC_LOC. Used to allow > 16 PSC for OGL path.
11 = FLT16_2 (1 DWORD with 2 16-bit floating point values (SE5M10 exp bias of 15, supports denormalized numbers)) (X = [15:0], Y = [31:16], Z = 0.0, W = 1.0)
12 = FLT16_4 (2 DWORDS with 4(2 per dword) 16-bit floating point values (SE5M10 exp bias of 15, supports denormalized numbers)) (X = DW0 [15:0], Y = DW0 [31:16], Z = DW1 [15:0], W = DW1 [31:16])
* These data types use the SIGNED and NORMALIZE flags described below.
The number of DWORDS to skip (discard) after processing the current element. The vector address in the input memory to write this element If set, indicates the last vector of the current vertex stream
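Returning to the Endian-Swap Control field of the VAP control status register described earlier, the four modes can be reproduced in a few lines and checked against the 0xAABBCCDD examples given there. This is a software illustration of the byte orderings only; the function name `endian_swap` is hypothetical.

```python
def endian_swap(value, mode):
    """Apply one of the four swap modes to a 32-bit value."""
    value &= 0xFFFFFFFF
    if mode == 0:
        # No swap.
        return value
    if mode == 1:
        # 16-bit swap: exchange the two bytes inside each 16-bit half.
        return ((value & 0x00FF00FF) << 8) | ((value & 0xFF00FF00) >> 8)
    if mode == 2:
        # 32-bit swap: reverse all four bytes.
        return int.from_bytes(value.to_bytes(4, "big"), "little")
    if mode == 3:
        # Half-dword swap: exchange the two 16-bit halves.
        return ((value & 0xFFFF) << 16) | (value >> 16)
    raise ValueError("mode must be 0..3")
```

Each mode is its own inverse, so applying the same swap twice returns the original value.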

Determines whether fixed point data types are unsigned (0) or 2's complement signed (1) data types. See NORMALIZE for a complete description of the effect.
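
As an illustration of how the signed flag changes the interpretation of a packed fixed point element, the sketch below unpacks the VECTOR_3_TTT and VECTOR_3_EET layouts into raw integers. The function and struct names are hypothetical. The NORMALIZE flag is deliberately not modelled, since its description is not part of this section, so the results are un-normalized integer field values only.

```c
#include <stdint.h>

/* Extract bits [hi:lo] of a DWORD as an unsigned field (field width < 32). */
static uint32_t bits(uint32_t dw, unsigned hi, unsigned lo)
{
    unsigned width = hi - lo + 1;
    return (dw >> lo) & ((1u << width) - 1u);
}

/* Interpret a raw field of `width` bits (1..31) as unsigned when
 * is_signed == 0, or as 2's complement signed when is_signed == 1. */
static int32_t field_value(uint32_t raw, unsigned width, int is_signed)
{
    if (is_signed && (raw & (1u << (width - 1))))
        return (int32_t)raw - (int32_t)(1u << width);
    return (int32_t)raw;
}

struct vec3i { int32_t x, y, z; };

/* VECTOR_3_TTT: X = [9:0], Y = [19:10], Z = [29:20] (three 10-bit fields). */
static struct vec3i unpack_vector_3_ttt(uint32_t dw, int is_signed)
{
    struct vec3i v;
    v.x = field_value(bits(dw, 9, 0), 10, is_signed);
    v.y = field_value(bits(dw, 19, 10), 10, is_signed);
    v.z = field_value(bits(dw, 29, 20), 10, is_signed);
    return v;
}

/* VECTOR_3_EET: X = [10:0], Y = [21:11] (11-bit), Z = [31:22] (10-bit). */
static struct vec3i unpack_vector_3_eet(uint32_t dw, int is_signed)
{
    struct vec3i v;
    v.x = field_value(bits(dw, 10, 0), 11, is_signed);
    v.y = field_value(bits(dw, 21, 11), 11, is_signed);
    v.z = field_value(bits(dw, 31, 22), 10, is_signed);
    return v;
}
```

With the DWORD `0x200007FF`, the unsigned interpretation of VECTOR_3_TTT gives (1023, 1, 512), while the signed interpretation gives (-1, 1, -512).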

Similar to DATA_TYPE_0 See SKIP_DWORDS_0 See DST_VEC_LOC_0

See LAST_VEC_0

See SIGNED_0

See NORMALIZE_0

For VS3.0 - To support more PVS instructions, increase the address range - Programmable Vertex Shader Flow Control Lower Word Addresses Register 0

This field defines the last PVS instruction to execute prior to the control flow redirection.
JUMP - The last instruction executed prior to the jump.
LOOP - The last instruction executed prior to the loop (init loop counter/inc).
JSR - The last instruction executed prior to the jump to the subroutine.
(Address_Range: 1K=[9:0]; 512=[8:0]; 256=[7:0])

This field has multiple definitions as follows:
JUMP - The instruction address to jump to.
LOOP - The loop count. *Note: a loop count of 0 must be replaced by a jump.
JSR - The instruction address to jump to (first inst of subroutine).
(Address_Range: 1K=[24:15]; 512=[23:15]; 256=[22:15])

For VS3.0 - To support more PVS instructions, increase the address range - Programmable Vertex Shader Flow Control Upper Word Addresses Register 0

This field has multiple definitions as follows:
JUMP - Not Applicable
LOOP - The last instruction of the loop.
JSR - The last instruction of the subroutine.
(Address_Range: 1K=[9:0]; 512=[8:0]; 256=[7:0])

This field has multiple definitions as follows:
JUMP - Not Applicable
LOOP - First Instruction of Loop (Typically ACT_ADRS + 1)
JSR - First Instruction After JSR (Typically ACT_ADRS + 1).
(Address_Range: 1K=[24:15]; 512=[23:15]; 256=[22:15])

Programmable Vertex Shader Flow Control Loop Index Register 0

This field stores the automatic loop index register init value. This is an 8-bit unsigned value, 0 to 255. This field is only used if the corresponding control flow instruction is a loop.

This field stores the automatic loop index register step value. This is an 8-bit 2's comp signed value, -128 to 127. This field is only used if the corresponding control flow instruction is a loop.

When this field is set, the automatic loop index register init value is not used at loop activation. The initial loop index is inherited from the outer loop. The loop index register step value is used at the end of each loop iteration; after loop completion, the outer loop index register is restored.

For VS3.0 color2texture - flat shading on textures - limitation: only first 8 vectors can have clipping with wrap shortest or point sprite generated textures

Vertex Fetcher Control

Primitive Type

Method of Passing Vertex Data.

Reserved bits

When set, vertex indices are 32-bits/indx, otherwise, 16-bits/indx.

When set, vertex reuse is disabled. DO NOT SET unless PRIM_WALK is Indexes.

When set, the incoming index is treated as two separate indices. Bits 23-16 are used as the index for AOS 0 (These are 0 for 16-bit indices). Bits 15-0 are used as the index for AOS 1-15. This mode was added specifically for HOS usage.

When set, the number of vertices in the command packet is taken from VAP_ALT_NUM_VERTICES register instead of bits 31:16 of VAP_VF_CNTL

Number of vertices in the command packet.

Vertex Array of Structures Control

The number of arrays required to represent the current vertex type. Each Array is described by the following three fields: VTX_AOS_ADDR, VTX_AOS_COUNT, VTX_AOS_STRIDE.

Force Vertex Data Pre-fetching. If this bit is set, then a 256-bit word will always be fetched, regardless of which dwords are needed. Typically useful when VAP_VF_CNTL.PRIM_WALK is set to Vertex List (Auto-incremented indices).

If set, the vertex cache is not invalidated between draw packets. This allows vertex cache hits to occur from packet to packet. This must be set with caution with respect to multiple contexts in the driver.

Granule Size to Fetch for AOS 0.
0 = 128-bit granule size
1 = 256-bit granule size
This allows the driver to program the fetch size based on DWORDS/VTX/AOS combined with AGP vs. LOC Memory. The general belief is that the granule size should always be 256-bits for LOC memory and AGP8X data, but should be 128-bit for AGP2X/4X data if the DWORDS/VTX/AOS is less than TBD (128?) bits.

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

See AOS_0_FETCH_SIZE

VAP Vertex State Control Register

Set to 0

Data register EDGE_FLAGS

Z Buffer Band-Width Control Defa

Enables hierarchical Z.

Enables reading of compressed Z data from memory to the cache.

Enables writing of compressed Z data from cache to memory.

This bit is set when the Z buffer is used to help the CB in clearing a region. Part of the region is cleared by the color buffer and part will be cleared by the Z buffer. Since the Z buffer does not have any write masks in the cache, full micro-tiles need to be written. If a partial micro-tile is touched, then the untouched part will be unknown. The cache will operate in write-allocate mode and quads will be accumulated in the cache and then evicted to main memory. The color value is supplied through the ZB_DEPTHCLEARVALUE register.

Enabling this bit will force all the compressed stencil values

By default this is 0 (enabled). When NEWZ=OLDZ, then writes do not occur to save BW.
Enable not updating the Z buffer if NewZ=OldZ
Disable above feature (in case there is a bug)

By default this is 0 (enabled). When NEW_STENCIL=OLD_STENCIL, then writes do not occur to save BW.
Enable not updating the Stencil buffer if NewS=OldS
Disable above feature (in case there is a bug)

Controls whether bytemasking is used or not.
Enable bytemasking
Disable bytemasking

Enables hiz rejects when the z function is equals.

Determines whether leading zeros or ones are eliminated.
Count leading 1s
Count leading 0s

The zb tries to detect single plane equations that completely

This disables storing samples contiguously in 6xaa.

Enables packing of the plane equations to eliminate wasted peq slots.

Enables discarding of pointers from pixels that are going to be

Z Buffer Control

Enables stenciling.

Enables Z functions.

Enables writing of the Z buffer.

Enable signed Z buffer comparison, for W-buffering.

Specifies the signed number type to use for the Z buffer comparison. This only has an effect when ZSIGNED_COMPARE is enabled.
Twos complement
Signed magnitude
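
The two signed number types order negative values differently, which is why the choice matters for a signed comparison. The sketch below is only an illustration of that difference: the enum names, the function names, and the idea of passing the bit width as a parameter are assumptions, not part of the register definition, and no claim is made about how the hardware implements the comparison.

```c
#include <stdint.h>

/* Hypothetical labels for the two signed number types. */
enum z_signed_type { Z_TWOS_COMPLEMENT, Z_SIGNED_MAGNITUDE };

/* Convert a raw `width`-bit value (width 2..32) to a plain signed integer
 * under the chosen interpretation. The top bit is the sign bit. */
static int64_t z_to_signed(uint32_t raw, unsigned width, enum z_signed_type t)
{
    uint32_t sign_bit = 1u << (width - 1);
    uint32_t low_bits = raw & (sign_bit - 1u);

    if (!(raw & sign_bit))
        return (int64_t)low_bits;            /* non-negative in both forms */
    if (t == Z_SIGNED_MAGNITUDE)
        return -(int64_t)low_bits;           /* sign bit plus magnitude */
    return (int64_t)low_bits - (int64_t)sign_bit; /* 2's complement */
}

/* Returns 1 if a < b under the chosen signed interpretation, else 0. */
static int z_less(uint32_t a, uint32_t b, unsigned width, enum z_signed_type t)
{
    return z_to_signed(a, width, t) < z_to_signed(b, width, t);
}
```

For 16-bit values, `0x8001` is less than `0x8002` in two's complement (-32767 versus -32766) but greater in signed magnitude (-1 versus -2).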

Sets the fifo sizes

Determines the size of the op fifo

Format of the Data in the Z buffer

Specifies the format of the Z buffer.

in 13E3 format, count leading 1's
in 13E3 format, count leading 0's.

This bit is unused

Hierarchical Z Data.

This DWORD contains 8-bit values for 4 blocks. Reading this register causes a read from the address pointed to by RDINDEX. Writing to this register causes a write to the address pointed to by WRINDEX.

Hierarchical Z Memory Offset

DWORD offset into HiZ RAM.

Hierarchical Z Read Index

Read index into HiZ RAM.

Hierarchical Z Write Index

Self-incrementing write index into the HiZ RAM. Starting write index must start on a DWORD boundary. Each time ZB_HIZ_DWORD is written, this index will autoincrement. HIZ_OFFSET and HIZ_PITCH are not used to compute read/write address to HIZ ram, when it is accessed through WRINDEX and DWORD

Stencil Reference Value and Mask for backfacing quads

Specifies the reference stencil value.

This value is ANDed with both the reference and the current stencil value prior to the stencil test.

Specifies the write mask for the stencil planes.

Z Buffer Z Pass Counter Data.

Contains the number of Z passed pixels since the last write to this location. Writing this location resets the count to the value written.

Z and Stencil Function Control

Specifies the Z function.

Specifies the stencil function.

Specifies the stencil value to be written if the stencil test fails.
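
The stencil test referred to above combines the stencil function with the mask that is ANDed with both the reference and the current stencil value. A minimal sketch of that masked comparison follows. Several things here are assumptions rather than facts from this reference: the enum names and their numeric values are hypothetical and are not the register encodings, the stencil value is assumed to be 8 bits wide, and the operand order (reference on the left, stored value on the right) follows the usual Direct3D and OpenGL convention.

```c
#include <stdint.h>

/* Hypothetical comparison selectors; NOT the hardware field encodings. */
enum cmp_func {
    CMP_NEVER, CMP_LESS, CMP_EQUAL, CMP_LEQUAL,
    CMP_GREATER, CMP_NOTEQUAL, CMP_GEQUAL, CMP_ALWAYS
};

/* Returns 1 if the stencil test passes, 0 if it fails.
 * Both operands are masked before the comparison, as described above. */
static int stencil_test(uint8_t ref, uint8_t current, uint8_t mask,
                        enum cmp_func func)
{
    uint8_t r = ref & mask;
    uint8_t c = current & mask;

    switch (func) {
    case CMP_NEVER:    return 0;
    case CMP_LESS:     return r <  c;
    case CMP_EQUAL:    return r == c;
    case CMP_LEQUAL:   return r <= c;
    case CMP_GREATER:  return r >  c;
    case CMP_NOTEQUAL: return r != c;
    case CMP_GEQUAL:   return r >= c;
    case CMP_ALWAYS:   return 1;
    }
    return 0; /* unreachable for valid selectors */
}
```

With a mask of `0x0F`, a reference of `0x12` and a stored value of `0x32` compare equal, because only the low four bits take part in the test. With a mask of `0xFF` the same pair compares not equal.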