Multi-Threaded Fluid Simulation

More documents

Recommendations

Info

MULTI-THREADED FLUID SIMULATION FOR GAMES Simulation The code is based on an article and sample code written by Mick West that was recently posted on Gamasutra. 1,2 Since one of our goals was to produce a modular, reusable code base, the first thing we did was to convert the existing sample to C++ and separate the simulation from the rest of the application, specifically from the rendering. This separation allows the simulation code to be easily added to an existing code base and rendering engine. Next, we extended the code into three dimensions, which turned out to be a fairly straightforward exercise because none of the algorithms changed appreciably with the addition of another dimension. For example, the 2D diffusion step is solved by applying the diffusion operation to move values from a cell to its neighbors (Figures 1 and 2). This approach is extended in the 3D case so that instead of inspecting four neighboring cells we look at six neighbors (Figures 1 and 3). Other cases used the same principle cpe as well. Figure 2. A code sample that shows 2D diffusion. Figure 3. A code sample that shows 3D diffusion. Figure 1. Transitioning code from 2D to 3D. The following two code samples (Figures 2 and 3) show the code that performs the diffusion operation for non-border case cells in both the 2D and 3D cases. The border cells are those cells at the very edge of the grid that are missing one or more neighbors. These cells are handled separately as special cases. The transition to 3D is accomplished by adding in consideration for the z-axis. Apart from that, the algorithm remains essentially the same. After transitioning the simulation into 3D, we began investigating how to break the work up for parallel processing. As a base for our multi-core work we used the Intel® Threading Building Blocks (TBB) package. We found TBB to be very competitive in performance with other threading APIs. Additionally, it has a lot of built-in functionality and structures, such as a thread pool, task manager, and multi-threaded memory management. The idea behind threading the simulation is data decomposition: take the grid and break it into smaller portions. These smaller portions are submitted as tasks to the task queue and processed by the worker threads, independently of each other. The grid is broken up evenly until a small-enough granularity is reached, at which point further reductions become ineffective, or worse, inefficient due to thread management overhead. Figure 4 shows an example of how a 2D grid is broken into four tasks. 1 West, Mick. “Practical Fluid Dynamics: Part 1” Gamasutra, 26 June 2008. http://www.gamasutra.com/view/feature/1549/practical_fluid_dynamics_part_1.php 2 West, Mick. “Practical Fluid Dynamics: Part 2” Gamasutra, 23 July 2008. http://www.gamasutra.com/view/feature/1615/practical_fluid_dynamics__part_ii.php 2
MULTI-THREADED FLUID SIMULATION FOR GAMES The main computational load of the simulation occurs when each step loops over the grid. These are all nested loops, so a straightforward way to break them up is by using the “parallel for” construction of TBB. In the example case of density diffusion, the code went from the serial version shown in Figure 3 to the parallel version shown in Figures 5 and 6. Figure 5. Using Intel® Threading Building Blocks to call the diffusion algorithm. Figure 4. Breaking up a 2D grid into tasks and an individual cell adjacent to other tasks For example, in the simulation there is a diffusion step followed by a force update step and later, an advection step. At each step the grid is broken into pieces, and the tasks write their data to a shared output grid, which becomes the input into the next step. In most cases, this approach works well. However, a problem can arise if the calculation uses its own results in the current step. That is, the output of cell 1 is used to calculate cell 2. Looking at Figure 4, it is obvious that certain cells, like the shaded cell, will be processed in a task separate from the task in which their immediate neighbors are processed. Since each task uses the same input, the neighboring cells’ data will be different in the multi-threaded version compared to the serial version. Whether this is an issue depends on which portion of the simulation we are in. Inaccuracies in the results in the diffusion step can occur, but they are not noticeable and don’t introduce destabilizing effects. In most cases the inaccuracies are smoothed out by subsequent iterations over the data and have no lasting impact on the simulation. This kind of behavior is acceptable because the primary concern is visual appeal, not numerical accuracy. However, in other cases inaccuracies can accumulate, introducing a destabilizing effect on the simulation, which can quickly destroy any semblance of normal behavior. An example of this occurs while processing the edge cases in diffusion—where diffusion is calculated along the edges of the grid or cube. When run in parallel, the density does not diffuse out into the rest of the volume and instead accumulates at the edges, forming a solid border. For this situation and similar ones, we simply run that portion serially. Figure 5 shows how TBB is used to call the diffusion algorithm. The arguments uBegin and uEnd are the start and end values of the total range to process, and uGrainSize is how small to break the range into. For example, if uBegin is 0 and uEnd is 99, the total range is 100 units. If uGrainSize is 10, 10 tasks will be submitted, each of which will process a range of 10. Figure 6 shows the actual algorithm called by the TBB task manager. The range to process is passed in, and the input and output grids are variables common to each job. As a result of using multi-threading, the performance in frames per second was improved by 3.76× when going from one thread to eight threads on an Intel® Core i7 processor. Figure 6. A code sample that shows 3D, multi-threaded diffusion. 3
Page 1: Multi-Threaded Fluid Simulation for

Multi-Threaded Fluid Simulation

Create successful ePaper yourself

Delete template?

Save as template?