Recitation 5: Rendering Fractals

CS179: GPU Programming 

Rendering Fractals 

● Volume data vs. texture memory
● Creating and using CUDA arrays
● Using PBOs for screen output
● Quaternion Julia sets
● Rendering volume data

Volume Data 

● Stored in global memory
● Can be accessed only as linear memory
● No texturing pipeline features available
  – But it is the only form of globally writable data
● Allocate arbitrary linear memory using cudaMalloc

Texture Memory 

● CUDA arrays are allocated in “texture memory”
  – use cudaMalloc3DArray for correct pitch
● Then declare the texture on the device:
  – texture<datatype, 3, type> tex; (element type, dimensionality, read mode)
  – type = cudaReadModeElementType or cudaReadModeNormalizedFloat
● Access using tex3D:
  – tex3D(tex, s, t, p);

Textures 

● Can set properties of an existing tex object:
  – tex.normalized = true;
  – tex.filterMode = cudaFilterModeLinear;
  – tex.addressMode[i] = cudaAddressModeClamp;
  – Basically, the same settings as OpenGL
● Then, bind tex to the array allocated by cudaMalloc3DArray:
  – cudaBindTextureToArray
● No need to bind/unbind for each use
● Usually have at least 8 texture lines available
  – Probably won't need more than one anyway...

Using PBOs 

● How to actually render using CUDA?
  – PBO: pixel buffer object
● A PBO handles pixels like VBOs handle vertices
● OpenGL allocates it as a region of global memory
  – So, it can be mapped via cudaGLMapBufferObject
  – written to by CUDA
  – bound using glBindBufferARB
    ● GL_PIXEL_UNPACK_BUFFER_ARB
  – then drawn to screen with glDrawPixels

Lab 5 

● Rendering quaternion Julia sets
● Not as complicated as it sounds:
  – Calculate volume fractal using equation
  – Copy over to texture memory
  – Volume render
  – Only recalculate when necessary
● But first... what is a quaternion Julia set?

Fractals 

● Self-similar, recursive sets
● Became popular in the mid-to-late 1900s with the evolution of computer graphics
  – Difficult before graphics due to infinite detail
  – Graphics made visualizing them possible
  – Mandelbrot used fractals to try to estimate the length of coastlines

The Mandelbrot Set

$z_{n+1} = z_n^2 + c$


● Defined by an iterative complex equation:
  – $z_{n+1} = z_n^2 + c$
● c is a pixel coordinate on the complex plane
  – x-axis = real axis, y-axis = imaginary axis
● $z_0 = 0$
● Three possible results depending on c:
  – Converges to a fixed point or cycle (black interior)
  – Stays in a finite orbit (boundary)
  – Escapes to infinity


● Typically computed by iterating z and checking if it escapes some magnitude (~2)
● Can color based on rate of escape
● Typically 20-50 iterations is enough to tell the behavior of z
● Used to take seconds to render the set on a CPU
● See the SDK for a real-time program
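
The escape test described above can be sketched on the CPU in a few lines. This is only an illustration: the function name `escapeIterations` and the default of 50 iterations are assumptions, not part of the lab code.

```cpp
#include <complex>

// Returns the iteration at which |z| first exceeds 2, or maxIter if z stayed
// bounded (the point is then treated as inside the set).
// escapeIterations is a hypothetical name; 50 is an illustrative default.
int escapeIterations(std::complex<float> c, int maxIter = 50) {
    std::complex<float> z(0.0f, 0.0f);        // z_0 = 0
    for (int n = 0; n < maxIter; ++n) {
        if (std::norm(z) > 4.0f) return n;    // |z|^2 > 2^2, so z has escaped
        z = z * z + c;                        // z_{n+1} = z_n^2 + c
    }
    return maxIter;
}
```

The returned count can drive the coloring: points that return `maxIter` are drawn black, and the rest are shaded by how quickly they escaped.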

Julia Set 

● Each point of the Mandelbrot set has a corresponding Julia set:
  – Iterate $z^2 + c$ with c held fixed, but $z_0$ is some pixel

Julia Sets 

● Calculated in the same way as Mandelbrot sets
● We don't really have a practical application for these...
  – But they look really pretty!
  – And they're parallelizable, so we'll work with them


● We could do 2D Julia sets...
  – But 4D ones are more exciting!
● The iterative process is the same, except now we use quaternions.

Quaternions 

● Extension of the complex numbers to four dimensions:
  $i^2 = j^2 = k^2 = ijk = -1$
  $ij = k, \quad ji = -k$
  $jk = i, \quad kj = -i$
  $ki = j, \quad ik = -j$
● Very applicable in CG for 3D rotations, visualizations, etc.
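
As a rough illustration, the multiplication rules above expand into the following product. The `Quat` struct and `quatMul` are hypothetical stand-ins for a `float4` and the lab's provided `mul_quat`. Putting the real part in `w` is an assumption; the lab code may order the components differently.

```cpp
// Hypothetical quaternion type: w is the real part; x, y, z multiply i, j, k.
struct Quat { float w, x, y, z; };

// Hamilton product a*b, obtained by expanding (a.w + a.x i + a.y j + a.z k)
// times (b.w + b.x i + b.y j + b.z k) with ij = k, jk = i, ki = j and
// i^2 = j^2 = k^2 = -1. Note that the product is not commutative.
Quat quatMul(Quat a, Quat b) {
    return {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w
    };
}
```

Squaring a quaternion is just `quatMul(q, q)`, which is the only product the Julia iteration itself needs.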


● So, we can create a 4D set, but how do we render in 3D?
  – Projection!

Projection 

● We can take 2D slices of a 3D object
  – Think MRI scan
● Same idea: we take 3D volume slices of a 4D object
  – Imagine a 3D object that morphs over time
  – The object at one instant in time is our 3D slice
● These 3D slices are what we render
● So, we have three parameters now: $z_0$, c, and the slicing plane

Quaternions in Lab 5 

● Quaternion multiplication provided:
  – mul_quat, sqr_quat
● pos_to_quat
  – Given a plane as a parameter, converts a 3D position to a quaternion
● We'll store quaternions as float4s
● cutil_math.h provides vector math (dot, cross, etc.) and operator definitions (float4 * float, etc.)

Rendering 

● Transform each point in the volume texture to a quaternion
● Iterate the Julia fractal equation
● Store whether the point is in the set or how fast it escapes
● Then, use normal volume rendering techniques
  – raytracing... remember lab 2?
● Raytracing might not work perfectly...
  – Julia sets have infinite detail, so some parts are infinitely thin

Julia Distance Function 

● There is a distance estimator function for Julia sets
● Gives a lower bound on the distance to the set from any point in space
● Iterate a second equation simultaneously with the Julia set equation:
  – $z'_{n+1} = 2 z_n z'_n$
● Then, the distance is estimated by:
  $d(z) = \dfrac{|z_n|}{2\,|z'_n|} \log |z_n|$


● We can actually just render this function!
● The distance function is smooth, so we can render an isosurface of it
● More iterations improve the estimate


● How to use it:
  – Iterate $z'_{n+1} = 2 z_n z'_n$ and $z_{n+1} = z_n^2 + c$ with the provided c and $z_0$
  – Can stop iterating once $z_n$ escapes ($|z_n|^2 > {\sim}20$) or we reach the maximum number of iterations
  – Return the distance from the previous slide
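
The procedure above could look roughly like this CPU reference sketch. Several things here are assumptions rather than the lab's actual code: the `Quat` layout and helper names, the starting value $z'_0 = 1$ (a common choice that the slides do not state), and returning 0 for points that never escape.

```cpp
#include <cmath>

// Hypothetical quaternion type: w is the real part; x, y, z multiply i, j, k.
struct Quat { float w, x, y, z; };

static Quat qmul(Quat a, Quat b) {            // Hamilton product
    return { a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
             a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
             a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
             a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w };
}
static Quat qadd(Quat a, Quat b) { return { a.w + b.w, a.x + b.x, a.y + b.y, a.z + b.z }; }
static Quat qscale(Quat a, float s) { return { a.w * s, a.x * s, a.y * s, a.z * s }; }
static float qnorm2(Quat a) { return a.w * a.w + a.x * a.x + a.y * a.y + a.z * a.z; }

// Sketch of the distance estimator: iterates z and z' together, then
// evaluates d = |z| log|z| / (2 |z'|).
float juliaDistSketch(Quat z, Quat c, int maxIter) {
    Quat dz = { 1.0f, 0.0f, 0.0f, 0.0f };     // assumed starting value z'_0 = 1
    for (int n = 0; n < maxIter; ++n) {
        if (qnorm2(z) > 20.0f) break;         // |z_n|^2 > ~20: escaped
        dz = qscale(qmul(z, dz), 2.0f);       // z'_{n+1} = 2 z_n z'_n (uses the old z)
        z = qadd(qmul(z, z), c);              // z_{n+1} = z_n^2 + c
    }
    float r = std::sqrt(qnorm2(z));
    float dr = std::sqrt(qnorm2(dz));
    // Assumption: points that stay within the unit ball are reported as
    // distance 0 (on or inside the set), which also avoids log(0) and 0/0.
    if (r <= 1.0f || dr == 0.0f) return 0.0f;
    return 0.5f * r * std::log(r) / dr;
}
```

With c = 0 the set is the unit sphere, so a point at distance 2 from the origin is truly 1 unit away. The sketch returns about 0.69 there, consistent with the estimate being a lower bound. In the lab, the provided `mul_quat` and `sqr_quat` would take the place of the helper functions.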

Better Rendering 

● Fill in the volume data with the value of the distance function at each point
● Copy to the volume texture
● Step along the projected ray and render when you hit the isosurface (value < epsilon)
● This is pretty fast when parallelized
  – like lab 2

Best Rendering 

● But we can speed this up!
● We have a distance estimator
● If we estimate we are 0.5 units away, no need to step the ray by 0.001
● Step by a * d(z)
  – a is just some constant... 0.1-0.5 works well
● Will this cause thread divergence?
  – Not really, threads that finish early will just wait
  – Note: These distances are in 4D
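
A minimal sketch of this adaptive stepping, written for the CPU. The names `marchRay`, `Vec3`, and `unitSphereDist` are illustrative, not lab code. A unit sphere stands in for the value sampled from the volume texture so that the loop can be run on its own.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };               // hypothetical stand-in for float3

// Steps along origin + t*dir, advancing by a * d at each sample, where d is
// the estimated distance to the surface. Returns t at the first sample with
// d < epsilon, or -1 if the ray leaves [0, tMax) or runs out of steps.
template <typename DistFn>
float marchRay(Vec3 origin, Vec3 dir, DistFn dist,
               float a, float epsilon, float tMax, int maxSteps) {
    float t = 0.0f;
    for (int i = 0; i < maxSteps && t < tMax; ++i) {
        Vec3 p = { origin.x + t * dir.x, origin.y + t * dir.y, origin.z + t * dir.z };
        float d = dist(p);
        if (d < epsilon) return t;            // hit the isosurface
        t += a * d;                           // large steps far away, small ones nearby
    }
    return -1.0f;                             // miss
}

// Stand-in distance function: exact distance to a unit sphere at the origin.
float unitSphereDist(Vec3 p) {
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - 1.0f;
}
```

For example, `marchRay(Vec3{0, 0, -3}, Vec3{0, 0, 1}, unitSphereDist, 0.5f, 1e-3f, 10.0f, 256)` stops just short of t = 2 after about a dozen samples. A fixed step of 0.001 would need roughly 2000.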

Drawing Isosurface 

● Stop stepping along the ray when we hit the surface, and render something
● In order to do lighting, we'll want the normal
● For a smooth scalar field, the normal is the gradient of the field at that point
● Compute the gradient from the volume texture?
  – No, this will be blocky
  – Instead, compute the gradient via more juliaDist calls
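
One common way to get a gradient from extra distance evaluations is central differences, sketched below. The slides do not say which scheme the lab expects, so treat this as one option. The names `estimateNormal`, `Vec3`, and `unitSphereDist` are illustrative, and the sphere is only a stand-in for the Julia distance function.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };               // hypothetical stand-in for float3

// Approximates the gradient of dist at p with central differences of width h
// (six extra evaluations), then normalizes it to unit length.
template <typename DistFn>
Vec3 estimateNormal(Vec3 p, DistFn dist, float h) {
    Vec3 g = {
        dist(Vec3{p.x + h, p.y, p.z}) - dist(Vec3{p.x - h, p.y, p.z}),
        dist(Vec3{p.x, p.y + h, p.z}) - dist(Vec3{p.x, p.y - h, p.z}),
        dist(Vec3{p.x, p.y, p.z + h}) - dist(Vec3{p.x, p.y, p.z - h})
    };
    float len = std::sqrt(g.x * g.x + g.y * g.y + g.z * g.z);
    if (len == 0.0f) return Vec3{0.0f, 0.0f, 0.0f};   // flat field: no usable normal
    return Vec3{g.x / len, g.y / len, g.z / len};
}

// Stand-in distance function: exact distance to a unit sphere at the origin.
float unitSphereDist(Vec3 p) {
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - 1.0f;
}
```

The division by 2h is skipped because the vector is normalized anyway. In the renderer, each sample position would first be converted to a quaternion (for instance with `pos_to_quat`) before calling the distance estimator.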

Computing Normals 

● Theoretically, this should work...
  – But in practice, it doesn't work too well
● Can also arbitrarily choose axes for the gradient computation, then calculate the tangent and binormal, then normal = tangent × binormal
  – Still not great, but better
  – This is optional; you can just do the gradient, which is much easier

Lab 5 

● What you need to do:
● On the host:
  – Execute kernels
  – Copy global memory to texture memory
    ● Look up necessary functions in CUDA manuals
  – Set symbols in graphics memory
● On the device:
  – Julia distance estimator function
  – Fractal computation kernel
  – Volume rendering kernel
● Let the TODOs guide you, as usual

Important Note on Registers 

● Volume rendering calls JuliaDist, intersectBox, computes normals, etc.
● Easy to run out of registers
● So, be careful and put things into functions if you don't need them later (helps the compiler)
● Might also want to use fewer than 512 threads per block

Lab 5 

● What's given to you:
  – Volume render ray is set up
  – Steps along the ray at a constant interval and accumulates from the 3D texture
● Change this to have a dynamic interval and stop when we hit the isosurface

Lab 5 

● Organize your volume cube with threads running in the lowest dimension and a 2D grid for the other 2 dimensions to make indexing easier
● See the globally defined dim3s
● The space extends ±2.0 in each direction (for converting indices to positions)
● 1 thread per element is probably fastest, but feel free to experiment with loops
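
A small sketch of the index-to-position conversion for one axis. The function name `indexToCoord` is hypothetical. Sampling at voxel centers is an assumption; the lab code may instead place samples on the cube's faces.

```cpp
// Maps a voxel index i in [0, n) to a coordinate in [-2, 2].
// Assumption: each voxel is sampled at its center, hence the + 0.5.
float indexToCoord(int i, int n) {
    return -2.0f + 4.0f * (static_cast<float>(i) + 0.5f) / static_cast<float>(n);
}
```

Applying it once per axis gives the 3D position of a voxel, which can then be converted to a quaternion.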

Memory Coalescing 

● If you compute the index within the block as:
  – i = x + width*y + width*height*z
  – then write to output[i]
  – threads run along x, therefore coalesced
● Non-coalesced case: x swapped with one of the other dimensions
● Test both coalesced and non-coalesced speeds, and write the results into the README
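
The two layouts can be compared with a pair of index helpers like the ones below. The names `linearIndex` and `swappedIndex` are illustrative. In a kernel, x, y and z would come from the thread and block indices, with x running along the thread dimension.

```cpp
#include <cstddef>

// Coalesced layout: x varies fastest, so threads with consecutive x write to
// consecutive addresses.
std::size_t linearIndex(std::size_t x, std::size_t y, std::size_t z,
                        std::size_t width, std::size_t height) {
    return x + width * y + width * height * z;
}

// Non-coalesced variant for comparison: x and y swap roles, so threads with
// consecutive x land `height` elements apart.
std::size_t swappedIndex(std::size_t x, std::size_t y, std::size_t z,
                         std::size_t width, std::size_t height) {
    return y + height * x + width * height * z;
}
```

Both functions cover the same range of indices; only the stride between neighboring threads changes, which is what the timing comparison is meant to expose.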

Cool Stuff

● Color it however you want
  – You should compute normals
  – Color it as a function of something:
    ● normals
    ● position in space
    ● etc.
● Could experiment with different functions, like $z^3 + c$ (mention it in the README)

Cool Stuff 

● Extra credit:
  – Raytrace for shadows?
  – Adaptive detailing: render using a lower epsilon if we're closer to the camera

Final Notes 

● We could just raytrace the entire set
  – also pretty fast
● This way teaches us a little about memory
● Raytracing is also fairly simple
  – Just call JuliaDist in the volume rendering function instead of sampling the texture
  – But sampling textures is still faster
  – We get more threads this way ($O(n^3)$ instead of $O(n^2)$)
