27.12.2012 Views

ARUP; ISBN: 978-0-9562121-5-3 - CMBBE 2012 - Cardiff University



For each pixel on screen we first compute the ray parameters using the method proposed by Krüger and Westermann [9]. This process spawns one thread on the GPU for each ray that intersects the volume, making optimal use of the GPU’s massive parallelism.

4.2.1 Basic Raycasting Loop

To traverse the volume we perform the following two operations in a loop until the ray exits the volume:

1. compute the current brick and the entry and exit positions inside the brick
2. perform ray casting, progressing until we leave that brick
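
The geometric part of the first operation can be illustrated with a short CPU-side sketch in C++. It is not the shader code used here: it assumes a single uniform brick grid with cubic bricks of edge length `brickSize` in volume coordinates, and the names `Vec3`, `BrickSpan` and `currentBrick` are hypothetical. Given the current ray position (the entry point), it returns the brick containing it and the ray parameter at which the ray leaves that brick, using the standard slab test.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <limits>

using Vec3 = std::array<float, 3>;

// Hypothetical result type: which brick we are in and where the ray leaves it.
struct BrickSpan {
    std::array<int, 3> brick;  // integer brick coordinates (assumed uniform grid)
    float tExit;               // ray parameter at which the ray exits the brick
};

// pos: current ray position (the entry point into the brick)
// dir: ray direction, not all zero
// brickSize: edge length of a brick in volume coordinates (assumption)
BrickSpan currentBrick(const Vec3& pos, const Vec3& dir, float brickSize) {
    assert(brickSize > 0.0f);
    BrickSpan s;
    s.tExit = std::numeric_limits<float>::infinity();
    for (int i = 0; i < 3; ++i) {
        s.brick[i] = static_cast<int>(std::floor(pos[i] / brickSize));
        if (dir[i] == 0.0f) continue;  // ray is parallel to this pair of faces
        // Face of the brick that the ray is heading towards along axis i.
        float face = (s.brick[i] + (dir[i] > 0.0f ? 1 : 0)) * brickSize;
        float t = (face - pos[i]) / dir[i];
        if (t < s.tExit) s.tExit = t;  // the nearest face is the exit face
    }
    return s;
}
```

In a full loop the ray would be advanced slightly past `tExit` before the next lookup, since a position lying exactly on a brick face yields an exit distance of zero when travelling in the negative direction.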

While the second operation is performed mostly as described by Stegmaier et al. [10], the first operation is novel. Using the current ray position and direction, it determines the appropriate level of detail. It then searches for the brick in a 3D texture atlas. If the brick is not found, we use a coarser resolution for this rendering pass and write the brick identifier into the page cache table (described in detail below) for fetching later. The operation then returns whatever brick was found, with an indicator of whether it was the desired brick or a lower-resolution version.
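
The lookup with fallback described above might be sketched as follows. This is a CPU-side illustration in C++ under stated assumptions, not the GPU implementation: the atlas presence information is modelled as a `std::map`, the list of missing bricks as a `std::set`, and each coarser level is assumed to halve the brick grid along every axis. All type and function names (`BrickKey`, `BrickCache`, `findBrick`) are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <set>

// Hypothetical 4D brick key: 3D brick position plus level of detail.
struct BrickKey {
    std::array<int, 3> pos;  // brick position within its own level
    int lod;                 // 0 = finest level
    bool operator<(const BrickKey& o) const {
        return lod != o.lod ? lod < o.lod : pos < o.pos;
    }
};

// CPU stand-ins for the atlas presence table and the table of missing bricks.
struct BrickCache {
    std::map<BrickKey, int> resident;  // key -> slot in the atlas
    std::set<BrickKey> missing;        // bricks requested but not resident
    int coarsestLod;
};

struct LookupResult {
    bool found;      // false if not even the coarsest level is resident
    bool isDesired;  // true if the returned brick is the requested one
    BrickKey key;    // brick that was actually returned
    int atlasSlot;   // its slot in the atlas, or -1 if nothing was found
};

LookupResult findBrick(BrickCache& cache, const BrickKey& desired) {
    assert(desired.lod >= 0 && desired.lod <= cache.coarsestLod);
    BrickKey k = desired;
    for (;;) {
        std::map<BrickKey, int>::const_iterator it = cache.resident.find(k);
        if (it != cache.resident.end())
            return LookupResult{true, k.lod == desired.lod, k, it->second};
        // The desired brick is missing: record it so it can be fetched later.
        if (k.lod == desired.lod) cache.missing.insert(desired);
        if (k.lod == cache.coarsestLod)
            return LookupResult{false, false, desired, -1};
        // Assumption: the parent brick has half the coordinates of its child.
        for (int& c : k.pos) c /= 2;
        ++k.lod;
    }
}
```

The `isDesired` flag corresponds to the indicator mentioned above: the caller can render with the coarser brick in this pass and still know that a refinement is pending.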

The second operation needs only a few small modifications for chronophotographic visualization. First, the compositing must take all active time steps (those with an importance value greater than 0) into account. Second, we need to be sure all the required data is in memory, which we ensure by always uploading the corresponding bricks from the other time steps whenever a new brick is uploaded.
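
To make the first modification concrete, the following C++ sketch composites the samples of all active time steps at one ray position using standard front-to-back blending. The text does not state how the importance value enters the blend, so scaling each sample's opacity by its importance (assumed to lie in (0, 1]) is purely an illustrative assumption, as are the names `Rgba`, `TimeSample` and `compositeTimeSteps`.

```cpp
#include <cassert>
#include <vector>

struct Rgba { float r, g, b, a; };  // non-premultiplied colour and opacity

// Classified sample of one time step at the current ray position (hypothetical).
struct TimeSample {
    Rgba color;
    float importance;  // a value of 0 marks an inactive time step
};

// Standard front-to-back "over" step: blend sample s behind the accumulator.
void compositeFrontToBack(Rgba& acc, const Rgba& s) {
    float w = (1.0f - acc.a) * s.a;
    acc.r += w * s.r;
    acc.g += w * s.g;
    acc.b += w * s.b;
    acc.a += w;
}

// Blend every active time step into the accumulated ray colour.
void compositeTimeSteps(Rgba& acc, const std::vector<TimeSample>& samples) {
    for (const TimeSample& t : samples) {
        if (t.importance <= 0.0f) continue;  // skip inactive time steps
        assert(t.importance <= 1.0f);
        Rgba s = t.color;
        s.a *= t.importance;  // assumption: importance scales the opacity
        compositeFrontToBack(acc, s);
    }
}
```

Skipping inactive time steps before touching their data also means that only bricks of active time steps need to be resident for a given sample.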

4.2.2 Hash Function

GPUs in general do not have native support for containers (e.g. hash tables), so a new implementation had to be designed. Our implementation is based on the EXT_shader_image_load_store extension in OpenGL (core functionality since version 4.2). This extension gives a shader random read and write access to a 2D texture in any pipeline stage. The mechanism allows not just simple load and store calls, but also atomic operations such as atomic compare-and-swap. With that call in particular, we implement a lock-free hash table with rehashing as the collision resolution method. To flag a missing brick, we compute a hash value from the 4D brick coordinates (i.e. 3D for the brick position and 1D for the LoD) and use atomic compare-and-swap to check that entry. If the entry is empty (i.e. zero), we enter our data into the table and the hash function is done. If the entry was not empty, we compare it with our current brick. If it is identical, we are also done, as this implies that the current brick has already been flagged as missing by another ray. In any other case a new hash value is computed.
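
The insertion logic can be mirrored on the CPU with `std::atomic`, as in the C++ sketch below. It is an analogue for illustration, not the shader code: `compare_exchange_strong` stands in for the image atomic compare-and-swap, the bit layout in `packBrick` and the hash in `hashBrick` are assumptions, and the rehash step is simplified to probing the next slot, which may differ from the rehash function actually used.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Table of missing bricks; a slot value of 0 means "empty".
struct MissingTable {
    std::vector<std::atomic<std::uint32_t>> slots;
    explicit MissingTable(std::size_t n) : slots(n) {
        for (auto& s : slots) s.store(0);
    }
};

// Pack 4D brick coordinates into a non-zero id.
// Assumed layout: 9 bits per axis, 4 bits for the LoD; +1 keeps 0 reserved.
std::uint32_t packBrick(std::uint32_t x, std::uint32_t y, std::uint32_t z,
                        std::uint32_t lod) {
    assert(x < 512 && y < 512 && z < 512 && lod < 16);
    return ((lod << 27) | (z << 18) | (y << 9) | x) + 1u;
}

// Hypothetical multiplicative hash of the packed id.
std::uint32_t hashBrick(std::uint32_t id, std::uint32_t tableSize) {
    std::uint32_t h = id * 2654435761u;
    h ^= h >> 16;
    return h % tableSize;
}

// Flag a brick as missing. Returns false only if the table is full.
bool flagMissing(MissingTable& table, std::uint32_t id) {
    const std::uint32_t n = static_cast<std::uint32_t>(table.slots.size());
    const std::uint32_t start = hashBrick(id, n);
    for (std::uint32_t attempt = 0; attempt < n; ++attempt) {
        std::atomic<std::uint32_t>& slot = table.slots[(start + attempt) % n];
        std::uint32_t expected = 0;
        // Succeeds only if the slot was empty; otherwise 'expected' receives
        // the value currently stored in the slot.
        if (slot.compare_exchange_strong(expected, id)) return true;
        if (expected == id) return true;  // already flagged by another ray
        // Slot holds a different brick: compute a new slot and try again.
    }
    return false;
}
```

Because every slot is written at most once per frame and only through compare-and-swap, concurrent callers never need a lock: a losing thread either finds its own brick already present or moves on to another slot.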

4.2.3 Atlas Update and Ray Resume

At the end of a frame we read the hash table back into CPU memory and check if it contains any non-zero entries. An empty table indicates that the image was fully rendered at an appropriate resolution, and therefore our rendering is complete. If the hash table does contain non-zero entries, the corresponding bricks are fetched from disk and inserted into the texture atlas, using least recently used as an eviction strategy. Once the entire hash table has been processed, the atlas presence table is updated and both the presence table and atlas are uploaded to the GPU. A new
