Recitation 5: Rendering Fractals
CS179: GPU Programming
Rendering Fractals
● Volume data vs. texture memory
● Creating and using CUDA arrays
● Using PBOs for screen output
● Quaternion Julia Sets
● Rendering volume data
Volume Data
● Stored in global memory
● Can be accessed only as linear memory
● No texturing pipeline features available
  – But it is the only form of globally writable data
● Allocate arbitrary linear memory using cudaMalloc
Texture Memory
● CUDA arrays are allocated in "texture memory"
  – use cudaMalloc3DArray for correct pitch
● Then declare the texture on the device:
  – texture<DataType, Dim, ReadMode> tex;
  – ReadMode = cudaReadModeElementType or cudaReadModeNormalizedFloat
● Access using tex3D:
  – tex3D(tex, s, t, p);
Textures
● Can set properties of an existing tex object:
  – tex.normalized = true;
  – tex.filterMode = cudaFilterModeLinear;
  – tex.addressMode[i] = cudaAddressModeClamp;
  – Basically, same settings as OpenGL
● Then bind tex to the array allocated by cudaMalloc3DArray:
  – cudaBindTextureToArray
● No need to bind/unbind for each use
● Usually have at least 8 texture lines available
  – Probably won't need more than one anyway...
Using PBOs
● How to actually render using CUDA?
  – PBO: pixel buffer object
● A PBO handles pixels like VBOs handle vertices
● OpenGL allocates it as a region of global memory
  – So, it can be mapped via cudaGLMapBufferObject
  – written to by CUDA
  – bound using glBindBufferARB
    ● GL_PIXEL_UNPACK_BUFFER_ARB
  – then drawn to screen with glDrawPixels
Lab 5
● Rendering quaternion Julia sets
● Not as complicated as it sounds:
  – Calculate volume fractal using equation
  – Copy over to texture memory
  – Volume render
  – Only recalculate when necessary
● But first... what is a quaternion Julia set?
Fractals
● Self-similar, recursive sets
● Became popular in the mid-to-late 1900s with the evolution of computer graphics
  – Difficult to study before graphics due to their infinite detail
  – Graphics made visualizing them possible
  – Mandelbrot used fractals to try to estimate coastlines
The Mandelbrot Set
$z_{n+1} = z_n^2 + c$
The Mandelbrot Set
● Defined by an iterative complex equation:
  – $z_{n+1} = z_n^2 + c$
● c is a pixel coordinate on the complex plane
  – x-axis = real axis, y-axis = imaginary axis
● $z_0 = 0$
● Three possible results depending on c:
  – Converges to 0 (black space)
  – Stays in a finite orbit (boundary)
  – Escapes to infinity
The Mandelbrot Set
● Typically computed by iterating z and checking if it escapes some magnitude (~2)
● Can color based on rate of escape
● Typically 20-50 iterations is enough to tell the behavior of z
● Used to take seconds to render the set on a CPU
● See SDK for real-time program
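The escape test described above can be sketched on the CPU as follows. This is only an illustrative reference, not the SDK program: the function name `mandelbrot_escape_iterations` is made up for this example, and the escape radius of exactly 2 is an assumption based on the "~2" on the slide.

```cpp
#include <complex>

// Hypothetical helper (not from the SDK or the lab code).
// Iterates z_{n+1} = z_n^2 + c from z_0 = 0 and returns the number of
// iterations completed before |z| exceeds 2, or max_iter if z never escapes.
int mandelbrot_escape_iterations(std::complex<float> c, int max_iter) {
    std::complex<float> z(0.0f, 0.0f);
    for (int n = 0; n < max_iter; ++n) {
        // std::norm returns |z|^2, so compare against 2^2 = 4.
        if (std::norm(z) > 4.0f) {
            return n;
        }
        z = z * z + c;
    }
    return max_iter;
}
```

The returned count is what a renderer could map to a color: points that return `max_iter` are treated as inside the set, and smaller counts mean faster escape.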
Julia Set
● Each point of the Mandelbrot set has a corresponding Julia set:
  – Iterate $z^2 + c$ with that point as the fixed c, but now $z_0$ is the pixel coordinate
Julia Sets
● Calculated in the same way as Mandelbrot sets
● We don't really have a practical application for these...
  – But they look really pretty!
  – And they're parallelizable, so we'll work with them
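For comparison with the Mandelbrot iteration, here is a minimal CPU sketch of the 2D Julia version, where c is held fixed and the starting value comes from the pixel. The name `julia_escape_iterations` and the escape radius of 2 are assumptions for illustration only.

```cpp
#include <complex>

// Hypothetical helper (not part of the lab code).
// Same recurrence as the Mandelbrot set, z_{n+1} = z_n^2 + c, but here
// c is a fixed constant and z0 is the pixel's point on the complex plane.
int julia_escape_iterations(std::complex<float> z0,
                            std::complex<float> c,
                            int max_iter) {
    std::complex<float> z = z0;
    for (int n = 0; n < max_iter; ++n) {
        // |z|^2 > 4 is equivalent to |z| > 2.
        if (std::norm(z) > 4.0f) {
            return n;
        }
        z = z * z + c;
    }
    return max_iter;
}
```

Every pixel runs the same loop independently of its neighbors, which is why the computation maps naturally to one GPU thread per pixel.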
Quaternion Julia Sets
● We could do 2D Julia sets...
  – But 4D ones are more exciting!
● The iterative process is the same, except now we use quaternions.
Quaternions
● Extension of the complex numbers:
  $i^2 = j^2 = k^2 = ijk = -1$
  $ij = k, \quad ji = -k$
  $jk = i, \quad kj = -i$
  $ki = j, \quad ik = -j$
● Very applicable in CG for 3D rotations, visualizations, etc.
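The multiplication rules above fully determine the quaternion product. A minimal sketch follows, assuming a hypothetical `Quat` struct with one field per basis element; the names `Quat`, `quat_mul` and `quat_sqr` are illustrative and are not the lab's provided functions.

```cpp
// Hypothetical quaternion type: q = r + i*I + j*J + k*K.
struct Quat {
    float r, i, j, k;
};

// Product a*b, expanded term by term using
// i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j (and the negated reverses).
Quat quat_mul(Quat a, Quat b) {
    Quat out;
    out.r = a.r * b.r - a.i * b.i - a.j * b.j - a.k * b.k;
    out.i = a.r * b.i + a.i * b.r + a.j * b.k - a.k * b.j;
    out.j = a.r * b.j - a.i * b.k + a.j * b.r + a.k * b.i;
    out.k = a.r * b.k + a.i * b.j - a.j * b.i + a.k * b.r;
    return out;
}

// Square of a quaternion, as needed by the z^2 + c iteration.
Quat quat_sqr(Quat a) {
    return quat_mul(a, a);
}
```

Note that the product is not commutative: swapping the arguments flips the sign of the cross terms, which is exactly the `ij = k`, `ji = -k` behavior listed above.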
Quaternion Julia Sets
● So, we can create a 4D set, but how do we render it in 3D?
  – Projection!
Projection
● We can take 2D slices of a 3D object
  – Think MRI scan
● Same idea: we take 3D volume slices of a 4D object
  – Imagine a 3D object that morphs over time
  – The object at one instant of time is our 3D slice
● These 3D slices are what we render
● So, we have three parameters now: $z_0$, c, and the slicing plane
Quaternions in Lab 5
● Quaternion multiplication provided:
  – mul_quat, sqr_quat
● pos_to_quat
  – Given a plane as a parameter, converts a 3D position to a quaternion
● We'll store quaternions as float4's
● cutil_math.h provides vector math (dot, cross, etc.) and operator definitions (float4 * float, etc.)
Rendering
● Transform each point in the volume texture to a quaternion
● Iterate the Julia fractal equation
● Store whether the point is in the set or how fast it escapes
● Then, use normal volume rendering techniques
  – raytracing... remember lab 2?
● Raytracing might not work perfectly...
  – Julia sets have infinite detail, so some parts are infinitely thin
Julia Distance Function
● There is a distance estimator function for Julia sets
● Gives a lower bound on the distance to the set from any point in space
● Iterate a second equation simultaneously with the Julia set equation:
  – $z'_{n+1} = 2 z_n z'_n$
● Then, the distance is estimated by:
  $d(z) = \frac{|z_n|}{2\,|z'_n|} \log |z_n|$
Julia Distance Function
● We can actually just render this function!
● Distance function is smooth, so we can render the isosurface of it
● More iterations improve the estimate
Julia Distance Function
● How to use it:
  – Iterate $z'_{n+1} = 2 z_n z'_n$ and $z_{n+1} = z_n^2 + c$ with the provided c and $z_0$
  – Can stop iterating once $z_n$ escapes ($|z_n|^2 >$ ~20) or you reach the maximum number of iterations
  – Return the distance given on the previous slide
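The recipe above can be sketched as a CPU reference function. This is not the lab's JuliaDist: the `Quat` struct, the helper names, and the starting derivative $z'_0 = 1$ are assumptions made for this example (the slides do not state the initial value of $z'$), and the zero guard at the end is an added safety check.

```cpp
#include <cmath>

// Hypothetical quaternion type: q = r + i*I + j*J + k*K.
struct Quat {
    float r, i, j, k;
};

static Quat quat_mul(Quat a, Quat b) {
    return Quat{
        a.r * b.r - a.i * b.i - a.j * b.j - a.k * b.k,
        a.r * b.i + a.i * b.r + a.j * b.k - a.k * b.j,
        a.r * b.j - a.i * b.k + a.j * b.r + a.k * b.i,
        a.r * b.k + a.i * b.j - a.j * b.i + a.k * b.r};
}

static float quat_norm2(Quat q) {
    return q.r * q.r + q.i * q.i + q.j * q.j + q.k * q.k;
}

// Estimates the distance from z0 to the Julia set of z^2 + c.
// Returns 0 when the estimate is undefined (z or z' collapsed to 0).
float julia_distance(Quat z, Quat c, int max_iter) {
    Quat dz = Quat{1.0f, 0.0f, 0.0f, 0.0f};  // assumed z'_0 = 1
    for (int n = 0; n < max_iter; ++n) {
        // z'_{n+1} = 2 z_n z'_n, computed before z is updated.
        Quat zdz = quat_mul(z, dz);
        dz = Quat{2.0f * zdz.r, 2.0f * zdz.i, 2.0f * zdz.j, 2.0f * zdz.k};
        // z_{n+1} = z_n^2 + c
        Quat zz = quat_mul(z, z);
        z = Quat{zz.r + c.r, zz.i + c.i, zz.j + c.j, zz.k + c.k};
        // Escape test on the squared magnitude, threshold ~20.
        if (quat_norm2(z) > 20.0f) {
            break;
        }
    }
    float mag_z = std::sqrt(quat_norm2(z));
    float mag_dz = std::sqrt(quat_norm2(dz));
    if (mag_z == 0.0f || mag_dz == 0.0f) {
        return 0.0f;
    }
    // d = |z| log|z| / (2 |z'|)
    return 0.5f * mag_z * std::log(mag_z) / mag_dz;
}
```

For c = 0 the set is the unit sphere, which makes a handy sanity check: a start point at distance 3 from the origin is truly 2 units from the set, and the estimate should come out positive and below 2. Points that never escape end with $|z_n| < 1$, so the logarithm is negative and the result is at most zero; a caller can treat that as "inside or on the set".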
Better Rendering
● Fill in volume data with the value of the distance function at each point
● Copy to volume texture
● Step along the projected ray and render when you hit the isosurface (value < epsilon)
● This is pretty fast when parallelized
  – like lab 2
Best Rendering
● But we can speed this up!
● We have a distance estimator
● If we estimate we are 0.5 units away, no need to step the ray by 0.001
● Step by a * d(z)
  – a is just some constant; 0.1-0.5 works well
● Will this cause thread divergence?
  – Not really, threads that finish early will just wait
  – Note: These distances are in 4D
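The adaptive stepping idea can be sketched independently of the fractal by passing in any distance function. Everything here is illustrative: `Vec3`, `march_ray` and its parameters are hypothetical names, and the lab's kernel samples a 3D texture rather than calling a `std::function`.

```cpp
#include <cmath>
#include <functional>

// Hypothetical 3D vector type for this sketch.
struct Vec3 {
    float x, y, z;
};

// Steps along origin + t*dir, advancing by a * d at each step, where d is
// the estimated distance to the surface at the current point.
// Returns the ray parameter t where d first drops below epsilon,
// or -1 if t_max or max_steps is reached without a hit.
float march_ray(Vec3 origin, Vec3 dir,
                const std::function<float(Vec3)>& dist,
                float a, float epsilon, float t_max, int max_steps) {
    float t = 0.0f;
    for (int step = 0; step < max_steps && t < t_max; ++step) {
        Vec3 p = Vec3{origin.x + t * dir.x,
                      origin.y + t * dir.y,
                      origin.z + t * dir.z};
        float d = dist(p);
        if (d < epsilon) {
            return t;  // close enough to the isosurface
        }
        t += a * d;    // large steps far away, small steps near the surface
    }
    return -1.0f;
}
```

Because the estimator is a lower bound on the true distance, a step of `a * d` with `a` below 1 should not jump through the surface, while still covering empty space far faster than a fixed 0.001 step.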
Drawing Isosurface
● Stop stepping along the ray when we hit the surface, and render something
● In order to do lighting, we'll want the normal
● For a smooth scalar field, the normal is the gradient of the field at that point
● Compute gradient from volume texture?
  – No, this will be blocky
  – Instead, compute gradient via more juliaDist calls
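One common way to get a gradient from extra distance evaluations is central differences, sketched below. This is an assumption about how the "more juliaDist calls" could be arranged, not the lab's prescribed method; `Vec3`, `estimate_normal` and the offset `h` are hypothetical.

```cpp
#include <cmath>
#include <functional>

// Hypothetical 3D vector type for this sketch.
struct Vec3 {
    float x, y, z;
};

// Approximates the unit normal at p as the normalized gradient of dist,
// using central differences with offset h along each axis
// (six extra distance evaluations per shaded point).
Vec3 estimate_normal(Vec3 p, const std::function<float(Vec3)>& dist, float h) {
    float gx = dist(Vec3{p.x + h, p.y, p.z}) - dist(Vec3{p.x - h, p.y, p.z});
    float gy = dist(Vec3{p.x, p.y + h, p.z}) - dist(Vec3{p.x, p.y - h, p.z});
    float gz = dist(Vec3{p.x, p.y, p.z + h}) - dist(Vec3{p.x, p.y, p.z - h});
    float len = std::sqrt(gx * gx + gy * gy + gz * gz);
    if (len == 0.0f) {
        return Vec3{0.0f, 0.0f, 0.0f};  // flat field: no usable normal
    }
    return Vec3{gx / len, gy / len, gz / len};
}
```

The common 1/(2h) factor is dropped because the result is normalized anyway. Choosing `h` is a trade-off: too large blurs fine detail, too small amplifies noise in the estimator.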
Computing Normals
● Theoretically, the gradient approach should work...
  – But in practice, it doesn't work too well
● Can also arbitrarily choose axes for the gradient computation, then calculate tangent and binormal, then normal = tangent x binormal
  – Still not great, but better
  – This is optional; you can just do the gradient, which is much easier
Lab 5
● What you need to do:
● On the host:
  – Execute kernels
  – Copy global memory to texture memory
    ● Look up necessary functions in CUDA manuals
  – Set symbols in graphics memory
● On the device:
  – Julia distance estimator function
  – Fractal computation kernel
  – Volume rendering kernel
● Let the TODOs guide you, as usual
Important Note on Registers
● Volume rendering calls JuliaDist, intersectBox, computes normals, etc.
● Easy to run out of registers
● So, be careful and put things into functions if you don't need them later (helps the compiler)
● Might also want to use fewer than 512 threads per block
Lab 5
● What's given to you:
  – Volume render ray is set up
  – Steps along ray at constant interval and accumulates from 3D texture
● Change this to have a dynamic interval and stop when we hit the isosurface
Lab 5
● Organize your volume cube with threads running in the lowest dimension and a 2D grid for the other 2 dimensions to make indexing easier
● See globally defined dim3s
● The space extends ±2.0 in each direction (for converting indices to positions)
● 1 thread per element is probably fastest, but feel free to experiment with loops
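Converting a voxel index to a position in the ±2.0 cube could look like the sketch below. The mapping is an assumption: it places index 0 at -2.0 and index n-1 at +2.0, whereas the lab code may instead use cell centers. The name `index_to_position` is hypothetical.

```cpp
// Hypothetical helper: maps a voxel index in [0, n-1] along one axis to a
// coordinate in [-2.0, +2.0]. Assumes n > 1.
float index_to_position(int index, int n) {
    const float half_extent = 2.0f;  // space extends +/-2.0 per the lab notes
    float t = static_cast<float>(index) / static_cast<float>(n - 1);  // 0..1
    return -half_extent + 2.0f * half_extent * t;
}
```

Applying it once per axis to a thread's (x, y, z) indices gives the 3D position that is then converted to a quaternion.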
Memory Coalescing
● If you compute the index within the block as:
  – i = x + width*y + width*height*z
  – then write to output[i]
  – threads run along x, therefore coalesced
● Non-coalesced case: x swapped with one of the other dimensions
● Test both coalesced and non-coalesced speeds, write results into README
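The index formula above, written out as a small helper for reference. The function name `linear_index` is made up for this example; in a kernel, x would typically come from the thread index and y, z from the block indices.

```cpp
#include <cstddef>

// Flattens (x, y, z) into a linear offset with x varying fastest:
// i = x + width*y + width*height*z.
std::size_t linear_index(std::size_t x, std::size_t y, std::size_t z,
                         std::size_t width, std::size_t height) {
    return x + width * (y + height * z);
}
```

Neighboring values of x map to neighboring offsets, so threads that differ only in x touch consecutive elements of `output`. Swapping x with y in the formula makes neighboring threads `width` elements apart, which is the non-coalesced case to time against.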
Cool Stuff
● Color it however you want
  – You should compute normals
  – Color it as a function of something:
    ● normals
    ● position in space
    ● etc.
● Could experiment with different functions, like $z^3 + c$ (mention it in README)
Cool Stuff
● Extra credit:
  – Raytrace for shadows?
  – Adaptive detailing: render using a lower epsilon if we're closer to the camera
Final Notes
● We could just raytrace the entire set
  – also pretty fast
● This way teaches us a little about memory
● Raytracing is also fairly simple
  – Just call JuliaDist in the volume rendering function instead of sampling the texture
  – But, sampling textures is still faster
  – We get more threads this way ($O(n^3)$ instead of $O(n^2)$)