C programming notes - School of Physics


Reasons for worrying about how fast a program is include:

if it is interactive, and responsiveness is important,

if it is pushing the boundaries of practicality (e.g., it might take weeks to run).

Hints on improving speed:

use a clever algorithm.

identify the time-critical parts of the program, and concentrate on improving those.

examine memory use as well (e.g., try to reduce the amount of memory being used, and keep the usage localised as opposed to jumping around randomly).

iterate most rapidly over the rightmost subscript in arrays, since this will lead to better localisation of memory use (and hence greater likelihood of cache hits).

use optimisation (e.g., -O1, -O2, or -O3; see "man gcc" for details).

try optimising for small program size (-Os).

in general, try not to be "too clever" - the C compiler can quite often do a better job if your program is clear.

make the critical parts of the program as small (in terms of bytes of instructions) as possible - they will then be more likely to fit in the CPU's "instruction cache", resulting in better performance.

declare often-called functions as "inline", which can eliminate the expense of the function call and of copying the arguments to/from the stack. This technique should be used with caution, since it makes the program larger, in which case it may no longer fit in the instruction cache.

avoid printing out lots of unnecessary information. Formatting and writing text to the screen is quite CPU intensive.
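The hint about the rightmost subscript can be illustrated with a minimal sketch. C stores 2D arrays row by row, so letting the rightmost subscript vary fastest touches adjacent memory locations. The array size and the function names below are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 512   /* hypothetical array size, chosen for illustration */
#define COLS 512

/* Rightmost subscript (j) varies fastest: successive accesses are
   adjacent in memory, which is friendly to the cache. */
long sum_rightmost_fastest(int grid[ROWS][COLS])
{
    long total = 0;
    assert(grid != NULL);
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            total += grid[i][j];
    return total;
}

/* Leftmost subscript (i) varies fastest: successive accesses are
   COLS ints apart. The result is identical, but the memory access
   pattern is less localised. */
long sum_leftmost_fastest(int grid[ROWS][COLS])
{
    long total = 0;
    assert(grid != NULL);
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            total += grid[i][j];
    return total;
}
```

Both functions return the same sum; any difference in speed depends on the machine, the array size, and the compiler options, so it is worth timing both on your own system.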

2D cellular automata - Conway's Game of Life

In 1970, the mathematician John Horton Conway published the description of a simple 2D cellular automaton which he called the Game of Life. Cells in a 2D grid were either "alive" or "dead", and their state in the next generation was determined by their current state and the number of their nearest neighbours that were alive (from 0 to 8, inclusive). The rules were chosen to simulate some properties of biological systems (e.g., a cell would die of "overcrowding" if too many of its neighbours were alive, and die of "lack of support" if too few of its neighbours were alive).
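The notes above do not state the exact thresholds. As a minimal sketch, assuming the standard rule set usually quoted for Life (a live cell survives with 2 or 3 live neighbours, and a dead cell becomes alive with exactly 3), the update for a single cell could look like this. The function name next_state is made up for this illustration.

```c
#include <assert.h>

/* Next state of one cell, assuming the standard Life rule:
   a live cell survives with 2 or 3 live neighbours,
   a dead cell becomes alive with exactly 3.
   Returns 1 for alive, 0 for dead. */
int next_state(int alive, int neighbours)
{
    assert(neighbours >= 0 && neighbours <= 8);
    if (alive)
        return neighbours == 2 || neighbours == 3;
    return neighbours == 3;
}
```

With this rule, fewer than 2 live neighbours corresponds to "lack of support" and more than 3 to "overcrowding".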

Here is a "simple" example of how to program the Game of Life on a computer.

If we are interested in following the evolution of the game for many thousands of generations, it is advantageous to think of ways of speeding up our simple implementation. The first thing to try is to turn on full optimisation during the compilation, using the "-O4" switch (gcc treats any level above -O3 as -O3). The next step is to make the critical functions "inline" (see here for the code). We can also think about the algorithm, and realise that much of the work in counting nearest neighbours went into handling the special case of being on the boundary; we can rewrite our program to separate out the boundary case, thereby allowing a simplified (faster) function to do most of the neighbour calculations, at the expense of increased complexity in handling the evolution from one generation to the next. Finally, we can try to use the fact that much of the Life "universe" tends to be sparsely populated, allowing us to produce a second neighbourhood-calculating function for the special case that the preceding cell had no nearest neighbours. These are by no means the only speed-ups that are possible, but they give you some flavour of what can be done.
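The idea of separating out the boundary case might be sketched as follows. This is not the code linked from these notes. The grid size N, the function names, and the choice to treat cells outside the grid as dead are all assumptions made for this illustration.

```c
#include <assert.h>

#define N 64   /* hypothetical grid size */

/* General version: tests every neighbour against the grid edges.
   Assumption: cells outside the grid count as dead. */
int count_checked(unsigned char g[N][N], int r, int c)
{
    int n = 0;
    for (int dr = -1; dr <= 1; dr++)
        for (int dc = -1; dc <= 1; dc++) {
            int rr = r + dr, cc = c + dc;
            if ((dr != 0 || dc != 0) &&
                rr >= 0 && rr < N && cc >= 0 && cc < N)
                n += g[rr][cc];
        }
    return n;
}

/* Interior version: no edge tests, valid only for 1 <= r, c <= N-2.
   The assert documents that precondition and is compiled out when
   NDEBUG is defined. */
static inline int count_interior(unsigned char g[N][N], int r, int c)
{
    assert(r >= 1 && r <= N - 2 && c >= 1 && c <= N - 2);
    return g[r - 1][c - 1] + g[r - 1][c] + g[r - 1][c + 1]
         + g[r][c - 1]                   + g[r][c + 1]
         + g[r + 1][c - 1] + g[r + 1][c] + g[r + 1][c + 1];
}

/* Fill out[][] with neighbour counts: the fast function handles the
   interior, the checked function handles only the border cells. */
void count_all(unsigned char g[N][N], unsigned char out[N][N])
{
    for (int r = 1; r < N - 1; r++)
        for (int c = 1; c < N - 1; c++)
            out[r][c] = (unsigned char)count_interior(g, r, c);

    for (int c = 0; c < N; c++) {            /* top and bottom rows */
        out[0][c] = (unsigned char)count_checked(g, 0, c);
        out[N - 1][c] = (unsigned char)count_checked(g, N - 1, c);
    }
    for (int r = 1; r < N - 1; r++) {        /* left and right columns */
        out[r][0] = (unsigned char)count_checked(g, r, 0);
        out[r][N - 1] = (unsigned char)count_checked(g, r, N - 1);
    }
}
```

The extra loops over the border are the "increased complexity" mentioned above; in exchange, the great majority of cells are handled by a function with no bounds tests.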

The following table gives the time taken to follow one million generations using an 850 MHz Pentium III computer and the various tricks from the last paragraph.

Time [seconds]   Technique
--------------------------------------------------
330              Original program, no optimisation
205              Original program, -O4 optimisation
113              Inline function calls
45               Boundary case handled separately
37               Sparse population handled separately

So, we have managed roughly a factor of nine improvement in speed over our original attempt (330/37 is about 8.9).
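The notes do not say how the timings in the table were taken. One simple way to make comparable measurements from inside a program is the standard clock() function, as in the sketch below. The names cpu_seconds and sample_work are made up for this illustration, and sample_work merely stands in for the code being timed.

```c
#include <assert.h>
#include <time.h>

static volatile long sink;   /* volatile, so the loop below is not optimised away */

/* Stand-in for the code being timed (hypothetical workload). */
void sample_work(void)
{
    for (long i = 0; i < 1000; i++)
        sink += i;
}

/* CPU time, in seconds, used by `reps` calls of work(). */
double cpu_seconds(void (*work)(void), long reps)
{
    assert(work != NULL && reps > 0);
    clock_t start = clock();
    for (long i = 0; i < reps; i++)
        work();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A call such as cpu_seconds(sample_work, 100000) returns the CPU time used by that many repetitions. Repeating the work many times helps, because the resolution of clock() is limited.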

To show why this might be useful, let's now alter our program to automatically identify glider patterns (those that can self-propagate). We will choose a brute-force technique where we randomly seed the centre of our rectangular "universe" with alive/dead cells, and then evolve the system until one of the boundary cells is hit. When this happens we stop the evolution and print a 21x21 grid around the cell, hopefully capturing a glider in full flight! Here is the code, and you can see that it is a straightforward variation on the earlier program. It typically finds a glider within 2 seconds
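The brute-force search can be sketched as below. This is not the code linked from these notes. It assumes the standard Life rule (survival with 2 or 3 live neighbours, birth with exactly 3), treats cells outside the grid as dead, and uses a made-up grid size and made-up function names. It only reports the generation at which a border cell first becomes alive; printing the 21x21 window is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define N 32   /* hypothetical universe size */

/* Live neighbours of (r, c); cells outside the grid count as dead. */
static int live_neighbours(unsigned char g[N][N], int r, int c)
{
    int n = 0;
    for (int dr = -1; dr <= 1; dr++)
        for (int dc = -1; dc <= 1; dc++) {
            int rr = r + dr, cc = c + dc;
            if ((dr != 0 || dc != 0) &&
                rr >= 0 && rr < N && cc >= 0 && cc < N)
                n += g[rr][cc];
        }
    return n;
}

/* Advance the grid by one generation (standard Life rule assumed). */
void life_step(unsigned char g[N][N])
{
    unsigned char next[N][N];
    for (int r = 0; r < N; r++)
        for (int c = 0; c < N; c++) {
            int n = live_neighbours(g, r, c);
            next[r][c] = (unsigned char)(g[r][c] ? (n == 2 || n == 3)
                                                 : (n == 3));
        }
    memcpy(g, next, sizeof next);
}

/* Returns 1 if any cell on the edge of the grid is alive. */
int border_alive(unsigned char g[N][N])
{
    for (int i = 0; i < N; i++)
        if (g[0][i] || g[N - 1][i] || g[i][0] || g[i][N - 1])
            return 1;
    return 0;
}

/* Clear the grid, then randomly fill a (2*half) x (2*half) block
   at the centre with alive/dead cells. */
void seed_centre(unsigned char g[N][N], int half, unsigned seed)
{
    assert(half > 0 && half < N / 2);
    memset(g, 0, sizeof(unsigned char[N][N]));
    srand(seed);
    for (int r = N / 2 - half; r < N / 2 + half; r++)
        for (int c = N / 2 - half; c < N / 2 + half; c++)
            g[r][c] = (unsigned char)(rand() / (RAND_MAX / 2 + 1));
}

/* Evolve until a border cell is alive. Returns the generation at which
   that happened, or -1 if it did not happen within max_gen generations. */
long run_until_border(unsigned char g[N][N], long max_gen)
{
    for (long gen = 0; gen < max_gen; gen++) {
        if (border_alive(g))
            return gen;
        life_step(g);
    }
    return border_alive(g) ? max_gen : -1;
}
```

A typical use would call seed_centre with a fresh seed, then run_until_border, and repeat with a new seed whenever -1 is returned. Note that hitting the border does not by itself prove that a glider was found; the pattern near the border cell still has to be inspected.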
