Shader model
From Tech Artists Wiki
A shader model is a shader compile target. Shader code can be compiled for different shader models, and the chosen model determines which instructions and features the code may use and how it compiles.
Shader Model comparison
(from Wikipedia [1])
Shader model 1.1 is considered deprecated and is not listed here; no 1.1 cards have been on the market for a number of years.
Pixel shader comparison
| Pixel shader version | 2.0 | 2.0a | 2.0b | 3.0 | 4.0 |
|---|---|---|---|---|---|
| Dependent texture limit | 8 | No Limit | 8 | No Limit | No Limit |
| Texture instruction limit | 32 | Unlimited | Unlimited | Unlimited | Unlimited |
| Position register | No | No | No | Yes | Yes |
| Instruction slots | 32 + 64 | 512 | 512 | ≥ 512 | ≥ 65536 |
| Executed instructions | 32 + 64 | 512 | 512 | 65536 | Unlimited |
| Texture indirections | 4 | No limit | 4 | No Limit | No Limit |
| Interpolated registers | 2 + 8 | 2 + 8 | 2 + 8 | 10 | 32 |
| Instruction predication | No | Yes | No | Yes | No |
| Index input registers | No | No | No | Yes | Yes |
| Temp registers | 12 | 22 | 32 | 32 | 4096 |
| Constant registers | 32 | 32 | 32 | 224 | 16x4096 |
| Arbitrary swizzling | No | Yes | No | Yes | Yes |
| Gradient instructions | No | Yes | No | Yes | Yes |
| Loop count register | No | No | No | Yes | Yes |
| Face register (2-sided lighting) | No | No | No | Yes | Yes |
| Dynamic flow control | No | No | No | Yes (depth 24) | Yes |
| Bitwise Operators | No | No | No | No | Yes |
| Native Integers | No | No | No | No | Yes |
- PS 2.0 = DirectX 9.0 original Shader Model 2 specification.
- PS 2.0a = NVIDIA GeForce FX-optimized model.
- PS 2.0b = ATI Radeon X700, X800, X850 shader model, DirectX 9.0b.
- PS 3.0 = Shader Model 3.
- PS 4.0 = Shader Model 4.
"32 + 64" for Instruction slots and Executed instructions means "32 texture instructions and 64 arithmetic instructions."
Vertex shader comparison
| Vertex shader version | VS 2.0 | VS 2.0a | VS 3.0 | VS 4.0 |
|---|---|---|---|---|
| # of instruction slots | 256 | 256 | ≥ 512 | 4096 |
| Max # of instructions executed | 65536 | 65536 | 65536 | 65536 |
| Instruction Predication | No | Yes | Yes | Yes |
| Temp Registers | 12 | 13 | 32 | 4096 |
| # constant registers | ≥ 256 | ≥ 256 | ≥ 256 | 16x4096 |
| Static Flow Control | Yes | Yes | Yes | Yes |
| Dynamic Flow Control | No | Yes | Yes | Yes |
| Dynamic Flow Control Depth | No | 24 | 24 | Yes |
| Vertex Texture Fetch | No | No | Yes | Yes |
| # of texture samplers | N/A | N/A | 4 | 128 |
| Geometry instancing support | No | No | Yes | Yes |
| Bitwise Operators | No | No | No | Yes |
| Native Integers | No | No | No | Yes |
- VS 2.0 = DirectX 9.0 original Shader Model 2 specification.
- VS 2.0a = NVIDIA GeForce FX-optimized model.
- VS 3.0 = Shader Model 3.
- VS 4.0 = Shader Model 4.
Tips
Choosing compile targets
You should generally choose the lowest target your shader can compile for. Compiling for 1.1 will almost always yield significantly better performance than compiling for 2.0, but the restrictions of 1.1 usually rule it out. Sometimes compiling for 3.0 is faster than compiling for 2.0; this is generally very easy to test. As a general rule, compile as low as possible, but for any significant shader, always measure the results, since different shaders compile differently.
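The "compile as low as possible" rule can be sketched as a small lookup over the pixel shader comparison table. The following Python is only an illustrative sketch: the `PixelProfile` class and `lowest_pixel_target` function are hypothetical helpers, the profile names follow the `ps_2_0`-style naming used by the HLSL compiler, the "32 + 64" limit of PS 2.0 is simplified to a single total of 96 slots, and placing 2.0a before 2.0b is an arbitrary choice, since the two are vendor-specific variants rather than a strict progression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelProfile:
    """One column of the pixel shader comparison table (hypothetical helper)."""
    name: str                    # compiler profile name, e.g. "ps_2_0"
    instruction_slots: int       # guaranteed instruction slots
    temp_registers: int          # temp registers
    gradient_instructions: bool  # gradient instructions available
    dynamic_flow_control: bool   # dynamic flow control available
    native_integers: bool        # native integer support


# Values copied from the table above, lowest target first.
# Assumption: PS 2.0's "32 + 64" is treated as one pool of 96 slots.
PROFILES = [
    PixelProfile("ps_2_0", 32 + 64, 12, False, False, False),
    PixelProfile("ps_2_a", 512, 22, True, False, False),
    PixelProfile("ps_2_b", 512, 32, False, False, False),
    PixelProfile("ps_3_0", 512, 32, True, True, False),
    PixelProfile("ps_4_0", 65536, 4096, True, True, True),
]


def lowest_pixel_target(instructions, temps, gradients=False,
                        dynamic_flow=False, integers=False):
    """Return the first (lowest) profile whose limits cover the requirements.

    Returns None when no listed profile is sufficient.
    """
    for profile in PROFILES:
        if instructions > profile.instruction_slots:
            continue
        if temps > profile.temp_registers:
            continue
        if gradients and not profile.gradient_instructions:
            continue
        if dynamic_flow and not profile.dynamic_flow_control:
            continue
        if integers and not profile.native_integers:
            continue
        return profile.name
    return None


if __name__ == "__main__":
    # A short shader with no special features fits the lowest target.
    print(lowest_pixel_target(instructions=60, temps=10))
    # Dynamic branching forces at least Shader Model 3.
    print(lowest_pixel_target(instructions=100, temps=10, dynamic_flow=True))
```

The result is only a starting candidate: the instruction and register counts of a compiled shader depend on the compiler, so the chosen target and its neighbours still need to be compiled and timed as described above.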
