05.08.2013 Views

Intel HD Graphics DirectX Developer's Guide (Sandy Bridge)

Intel HD Graphics DirectX Developer's Guide (Sandy Bridge)

Intel HD Graphics DirectX Developer's Guide (Sandy Bridge)

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Quick Tips: <strong>Graphics</strong> Performance Tuning<br />

3 Quick Tips: <strong>Graphics</strong><br />

Performance Tuning<br />

3.1 Primitive Processing<br />

3.1.1 Tips On Vertex/Primitive Processing<br />

1. Consider using the call ID3DX10Mesh::OptimizeMesh for faster drawing as the<br />

reordering can optimize cache usage.<br />

2. Use IDirect3DDevice9::DrawIndexedPrimitive (<strong>DirectX</strong>* 9) or<br />

ID3D10Device::DrawIndexed (<strong>DirectX</strong>* 10) to maximize reuse of the vertex<br />

cache.<br />

a. The vertex cache size will increase over time and can be discovered using<br />

D3DQUERYTYPE_VCACHE.<br />

3. Ensure adequate batching of primitives to amortize runtime and driver overhead.<br />

a. Maximize batch sizes: 200* to 1000 are recommended and within this range,<br />

bigger is better. *Minimum batch size applies to meshes that don‟t cover<br />

many pixels. For example, a two triangle full screen quad in a postprocessing<br />

pass will not bottleneck the driver.<br />

b. Minimize render state changes between batches to reduce the number of<br />

pipeline flushes.<br />

c. Use instancing to enable better vertex throughput especially for small batch<br />

sizes. This also minimizes state changes and Draw calls.<br />

4. Use static vertex buffers where possible.<br />

5. Use visibility tests to reject objects that fall outside the view frustum to reduce the<br />

impact of objects that are not visible.<br />

a. Set D3DRS_CLIPPING to FALSE for objects that do not need clipping.<br />

6. Take advantage of Early-Z rejection.<br />

a. Render with a Z-only pass (meaning no color buffer writes or pixel shader<br />

execution) followed by a normal render pass. This uses the higher<br />

performance of Early-Z to reject occluded fragments which reduces compute<br />

times and raster operations.<br />

This is particularly recommended where this can replace or eliminate texkill<br />

instructions - the latter are lot more expensive.<br />

How to maximize graphics performance on <strong>Intel</strong>® Integrated <strong>Graphics</strong> 15

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!