11.07.2014 Views

C programming notes - School of Physics

C programming notes - School of Physics

C programming notes - School of Physics

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

1 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

PHYS2020 - Computational <strong>Physics</strong>, based on the C<br />

<strong>programming</strong> language<br />

Index<br />

Copyright (c) Michael Burton 2005, Michael Ashley 1994-2004.<br />

Example programs<br />

Why learn C?<br />

Some problems with C<br />

A brief history <strong>of</strong> C<br />

Simplest C program<br />

Hello World<br />

Computing fundamentals<br />

Bits, bytes and words<br />

32-bit, 64-bit<br />

Binary, octal, decimal, hexadecimal<br />

Floating point number<br />

GNU/Linux fundamentals<br />

Operating system<br />

Shell<br />

Files<br />

Filesystem<br />

Input and Output<br />

Command toolkit<br />

On-line manuals<br />

Fundamentals <strong>of</strong> C<br />

Constants and escape sequences and strings<br />

Comments<br />

Compiling, linking and executing<br />

Makefiles<br />

Compiling and Running<br />

Linking to libraries<br />

GNU debugger "gdb"<br />

C data types<br />

Constants: binary, octal, hexadecimal, decimal<br />

Conversion between integer and floating point<br />

Operators<br />

Unary operators<br />

Binary operators<br />

Assignment operators<br />

Comma operator<br />

Precedence and associativity<br />

Evaluation <strong>of</strong> logical AND and OR<br />

C identifiers<br />

Variable storage classes<br />

Initilialisation <strong>of</strong> variables<br />

Mathematical functions<br />

The "for" loop<br />

"If" statements<br />

Break and continue<br />

Goto<br />

Switch statements<br />

Assert for program reliability<br />

Formatted output: printf<br />

Formatted input: scanf<br />

GNU readline<br />

Arrays


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

2 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Character arrays<br />

Functions<br />

Prototypes<br />

Pointers<br />

Pointer arithmetic<br />

Pointers and arrays<br />

Structures<br />

String functions<br />

File operations<br />

The preprocesser, "cpp"<br />

Command line arguments<br />

Allocating memory at runtime, "malloc"<br />

Some applications<br />

Random numbers<br />

Quasi-random numbers<br />

Sorting<br />

Multiple precision arithmetic<br />

Gnuplot<br />

Numerical integration via gravity and orbits<br />

Program speed<br />

The game <strong>of</strong> "Life"<br />

Some Example C Programs<br />

The simplest possible C program.<br />

Hello world.<br />

C escape sequences.<br />

Mandelbrot image generator - illustrating "for" loops.<br />

ASCII circle printer.<br />

How to use GNU readline.<br />

Generating random numbers.<br />

Sorting using quicksort.<br />

Numerical integration - gravity.<br />

1D cellular automata - text output.<br />

1D cellular automata - image output.<br />

Game <strong>of</strong> Life - simple version<br />

Game <strong>of</strong> Life - inline function version<br />

Game <strong>of</strong> Life - boundary version<br />

Game <strong>of</strong> Life - special-case version<br />

Game <strong>of</strong> Life - version to find gliders<br />

Linear fitting program<br />

Bisection Method for finding roots <strong>of</strong> equations<br />

Interpolation Method for finding roots<br />

Newton-Raphson Method for finding roots<br />

Why learn C?<br />

1. C is the most widely used <strong>programming</strong> language.<br />

2. The basic ideas in C are common to many other <strong>programming</strong> languages.<br />

3. Even though FORTRAN is still in heavy use by physicists, it is on its way out.<br />

4. If you want to get a computer-related job outside the field <strong>of</strong> <strong>Physics</strong>, it is helpful to know C.<br />

5. Many scientific instruments are programmed in C (e.g., CCD cameras and special purpose interface cards<br />

invariably come with a C library interfaces).<br />

6. C is closer to assembly language, so you can have finer control over what the computer is doing, and thereby<br />

make faster programs.<br />

7. There is a free C compiler available (GNU C, gcc), that is <strong>of</strong> very high quality and that has been ported to<br />

numerous machines.<br />

8. UNIX is written in C, so it is easier to interface with UNIX-like operating systems (such as GNU/Linux) if you<br />

write in C.<br />

9. Once you have a working knowledge <strong>of</strong> C, you will find perl easy to learn, and perl is a very useful language to


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

3 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

know, but that is another story...<br />

Some problems with C<br />

The language was designed with writing operating systems in mind, and so it was not purpose-built for numerical<br />

work.<br />

There is no support for complex numbers (actually, as <strong>of</strong> the C-99 standard, there is!).<br />

Handling multi-dimensional arrays, understanding pointers, and string manipulation are difficult when you first<br />

encounter them.<br />

It is easy to write completely opaque code in C.<br />

You have to explicitly declare every variable that you use. This is actually not a problem at all! It is a feature.<br />

It is possible to bypass many <strong>of</strong> the protective features <strong>of</strong> the language, and get yourself into trouble<br />

(alternatively, this can be seen as an advantage, since the language imposes few limitations on what you can do).<br />

The ANSI C standard leaves a lot <strong>of</strong> details undefined. This makes C non-portable between different operating<br />

systems, compilers, and computer hardware. Java is better in this respect.<br />

It is not easy to write large bug-free programs in C. Java is <strong>of</strong>ten a better choice.<br />

A brief history <strong>of</strong> C<br />

C evolved from a language called B, written by Ken Thompson at Bell Labs in 1970. Ken used B to write one <strong>of</strong> the<br />

first implementations <strong>of</strong> UNIX. B in turn was a decendant <strong>of</strong> the language BCPL (developed at Cambridge (UK) in<br />

1967), with most <strong>of</strong> its instructions removed.<br />

In fact, so many instructions were removed in going from BCPL to B, that Dennis Ritchie <strong>of</strong> Bell Labs put some back<br />

in (in 1972), and called the language C.<br />

The famous book The C Programming Language was written by Kernighan and Ritchie in 1978, and has remained the<br />

definitive reference book on C.<br />

The original C was still too limiting, and not standardized, and so in 1983 an ANSI committee was established to<br />

formalise the language definition.<br />

By 1993 the ANSI standard became well accepted and almost universally supported by compilers. The GNU C<br />

compiler, gcc, has become, in many ways, the standard implementation <strong>of</strong> C.<br />

What about C++ and C#?<br />

C++ is a superset <strong>of</strong> C, incorporating many object-oriented <strong>programming</strong> ideas. However, it is not particularly widely<br />

used. Java is a better choice in most cases.<br />

C# is a Java-like language developed by Micros<strong>of</strong>t. Enough said.<br />

Interesting books<br />

Numerical Analysis by Burden and Faires.<br />

A New Kind <strong>of</strong> Science by Wolfram.<br />

Numerical Recipies in C<br />

Other on-line resources<br />

For coverage <strong>of</strong> typical problems encountered with C, have a look at the Comp.lang.c FAQ.<br />

For a rather good tutorial introduction to C, check out this document by Gordon Dodrill.<br />

The simplest possible C program<br />

Here it is.<br />

Hello world


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

4 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Before reading further, have a look at the generic Hello World, program written in C.<br />

But before we get too excited and start to write really useful programs, there are a few fundamental concepts that you<br />

will need to understand first.<br />

Computing Fundamentals<br />

Bits, bytes, and words<br />

Bit<br />

A bit is the smallest unit <strong>of</strong> information. It can be thought <strong>of</strong> as the answer to a yes/no question. A bit has the value 1 or<br />

0 (which can be equated with yes/no, true/false).<br />

Byte<br />

A byte consists <strong>of</strong> eight bits, ordered from the most significant bit (MSB) on the left, to the least significant bit (LSB) on<br />

the right. E.g., 011011000 is a byte. A byte can be interpretted as a base-2 number.<br />

There are 256 (i.e., 2 to the power 8) possible combinations <strong>of</strong> 8 bits, ranging from 00000000 to 11111111. These 256<br />

combinations could, in principle, be used to represent any 256 things (e.g., we could define 00000000 to be "apple" and<br />

00000001 to be 3.14159265359).<br />

In practice there are three commonly used interpretations <strong>of</strong> a byte:<br />

Words<br />

Unsigned byte<br />

An unsigned byte can represent integers from zero to 255.<br />

Signed byte<br />

A byte can also be used to represent integers from -128 to +127. Such a byte is called a signed byte, and the MSB<br />

is used to indicate the sign (0 for positive, 1 for negative), -128 is represented as 10000000, -1 is 11111111, 0 is<br />

0, and +127 is 01111111.<br />

Non-examinable: To find the bit pattern for a negative number, you first write down the representation <strong>of</strong> the<br />

number without the minus sign, then invert all the bits, then add one. e.g., -1 is 00000001 inverted (which is<br />

11111110), plus one (which gives 11111111). This representation is called "2's complement", and is used<br />

because (1) it makes binary arithmetic easier to implement in the computer hardware, and (2) it avoids the waste<br />

<strong>of</strong> a bit pattern in the case <strong>of</strong> 1000000, which you might think was -0.<br />

A byte as an ASCII character<br />

The American Standard Code for Information Interchange (ASCII) associates the various possible bit patterns in<br />

a byte with characters (i..e., things like 'A', 'B', 'C'...'Z', 'a', 'b', 'c', ...'z', '*', '&', '@', etc). For example, 'A' is stored<br />

as 65 (decimal). The ASCII table allows you to translate a simple text document into a list <strong>of</strong> bytes (which you<br />

might then store in a file, see below). For a complete list <strong>of</strong> the ASCII table, type "man ascii".<br />

In order to represent numbers outside the range <strong>of</strong> an unsigned or signed byte, multiple bytes can be grouped together<br />

to form words <strong>of</strong> various lengths. Words usually contain a power <strong>of</strong> two bytes, e.g., 2, 4, 8, 16, and so on. They may<br />

either be signed or unsigned. For example, an unsigned 2-byte word can represent numbers from 0 to 65536 (2 to the<br />

power 16), a signed 2-byte word ranges from -32768 to + 32767. Words are sometimes called "short words", "words",<br />

and "long words", for the 2, 4, and 8-byte variations, but this usage is not uniform.<br />

32-bit computer, 64-bit computers<br />

Most PCs are 32-bit computers, i.e., they process 32-bits (4 bytes) <strong>of</strong> data in one operation. In a few years, 64-bit


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

5 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

computer will become very common.<br />

Binary, octal, decimal, hexadecimal<br />

Integers can be written in various bases. The common bases used with computers are binary (base-2), octal (base-8),<br />

decimal (base-10), and hexadecimal (base-16). Octal and hexadecimal are <strong>of</strong>ten useful when doing low-level<br />

<strong>programming</strong>, since they encode a binary number in a smaller amount <strong>of</strong> typing than using binary notation.<br />

The following table compares the various numbering systems (note that you can't directly specify a binary constant in<br />

the C <strong>programming</strong> language); octal constants are distinguished by a leading zero; hexadecimal constants by a leading<br />

"0x"):<br />

Decimal Binary Octal Hexadecimal<br />

--------------------------------<br />

0 0000 000 0x0<br />

1 0001 001 0x1<br />

2 0010 002 0x2<br />

3 0011 003 0x3<br />

4 0100 004 0x4<br />

5 0101 005 0x5<br />

6 0110 006 0x6<br />

7 0111 007 0x7<br />

8 1000 010 0x8<br />

9 1001 011 0x9<br />

10 1010 012 0xa<br />

11 1011 013 0xb<br />

12 1100 014 0xc<br />

13 1101 015 0xd<br />

14 1110 016 0xe<br />

15 1111 017 0xf<br />

Conversion between binary, octal, and hexadecimal is fairly trivial using the above table. For example, to convert the<br />

binary number 11100101010 to octal, break the number into groups <strong>of</strong> three bits starting from the least significant bit<br />

(i.e., 11 100 101 010), then convert each three-bit group to octal (i.e., 3452) and add a leading 0 to tell C it is octal (i.e.,<br />

03452). To convert the number to hexadecimal, break it into groups <strong>of</strong> four bits (i.e., 111 0010 1010), convert each<br />

group (i.e., 72a) and add a leading "0x" (i.e., 0x72a).<br />

To convert between octal and hexadecimal, it is probably easiest to go via binary.<br />

To convert from binary to decimal, note that n-th bit from the least-significant is worth 2 to the power n. So,<br />

11100101010<br />

\\\\\\\\\\\_ 0 x 1 = 0<br />

\\\\\\\\\\_ 1 x 2 = 2<br />

\\\\\\\\\_ 0 x 4 = 0<br />

\\\\\\\\_ 1 x 8 = 8<br />

\\\\\\\_ 0 x 16 = 0<br />

\\\\\\_ 1 x 32 = 32<br />

\\\\\_ 0 x 64 = 0<br />

\\\\_ 0 x 128 = 0<br />

\\\_ 1 x 256 = 256<br />

\\_ 1 x 512 = 512<br />

\_ 1 x 1024 = 1024<br />

Floating-point numbers<br />

Total = 1832<br />

Scientific computation <strong>of</strong>ten requires the use <strong>of</strong> floating-point numbers, e.g., 1.23. There is an IEEE standard that<br />

describes how floating-point numbers can be represented as a group <strong>of</strong> bytes. The basic idea is to allocate a certain<br />

number <strong>of</strong> bits to the exponent (base-2), and the rest to the mantissa (which has been normalised to be between 0.5 and<br />

1). Both the exponent and mantissa have sign bits.<br />

Floating-point numbers typically come in 4, 8, or 16 byte sizes. In C these are usually called "float", "double", and<br />

"long double" respectively.<br />

For any given number <strong>of</strong> bytes allocated to store a floating-point number, there is clearly a trade<strong>of</strong>f between the range<br />

<strong>of</strong> the exponent, and the precision <strong>of</strong> the mantissa: the more bits you use for the exponent, the fewer are left over for the<br />

mantissa. We will see later how this compromise has been chosen by the IEEE for the various byte sizes.


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

6 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Note that there are some redundant bit patterns in the representation <strong>of</strong> floating-point numbers: e.g., 0.0 with any<br />

exponent is still zero. One <strong>of</strong> these bit patterns is used as a flag to indicate that a calculation produced an illegal result.<br />

This bit pattern is called "NaN" which stands for "not a number". E.g, if you take the square root <strong>of</strong> a negative number,<br />

you will receive the result NaN. Any operation on a NaN will continue to produce a NaN (e.g., if you subtract two<br />

NaNs you will still have a NaN), this is useful to propagate such error conditions in your programs so that you are<br />

aware when they affect the final result <strong>of</strong> a calculation.<br />

GNU/Linux Fundamentals<br />

The operating system<br />

The operating system <strong>of</strong> a computer is, somewhat loosely speaking, the program that is always operating in the<br />

background on the computer. It handles the interaction with the hardware, and running other programs for the user(s) <strong>of</strong><br />

the computer. The operating system that we will be using is called GNU/Linux. Other examples <strong>of</strong> operating systems<br />

are Windows XP, OpenBSD, and FreeBSD.<br />

The shell<br />

The shell is an interactive program that allows the user to interface with the operating system. This is the program that<br />

provides the command-line prompt (it also helps you run programs, allows you to use the arrow keys on your keyboard<br />

to edit the command-line, and a bunch <strong>of</strong> other useful things). The shell that we will be using is called bash, it is the<br />

standard shell that is used with GNU/Linux. Other examples <strong>of</strong> shells are ksh, tcsh, and zsh.<br />

Files<br />

A file consists <strong>of</strong> two components:<br />

File data<br />

The data in a file is simply a list <strong>of</strong> bytes. The list may be empty (i.e., the file may have no data). The list <strong>of</strong> bytes<br />

might be used for a variety <strong>of</strong> different purposes, for example, it may be a C program, an e-mail message, a<br />

photograph, a song, a movie. In principle there is no way to distinguish the purpose <strong>of</strong> a file's data by looking at<br />

it. In practice, you can usually make a pretty good guess. To list the data in a file, you can, e.g., use "cat myfile".<br />

File metadata<br />

The metadata associated with a file consists <strong>of</strong> miscellaneous additional information that the operating system<br />

keeps with the file. For example: the name <strong>of</strong> the file, the number <strong>of</strong> bytes in the file, the time at which the file<br />

was created, which user owns the file, and the file permissions (i.e., the list <strong>of</strong> users who read and/or write the<br />

file, and whether the file is executable (i.e., is a program that can be run)). The term "metadata" means "data<br />

about data". To list the metadata associated with a file, you can, e.g., use "stat myfile".<br />

File names and file extensions<br />

When you create a file, you give it a name. The name is a list <strong>of</strong> bytes. In principle the name could be bizarre, such as<br />

" _1 `", but in practice you should choose simple, lowercase, names without spaces, e.g., "thesis.txt", "photo.jpg",<br />

"assignment0001.c". Note the convention <strong>of</strong> suffixing a file name with an extension that gives a clue as what the file is<br />

being used for. So, C programs invariably end with the ".c" extension.<br />

The GNU/Linux filesystem<br />

Directories (folders)<br />

To assist with organising groups <strong>of</strong> files, GNU/Linux has the concept <strong>of</strong> a directory (or folder). A directory is actually<br />

just a file (this adheres to the UNIX philosophy <strong>of</strong> making things simple and logical). A directory contains a list <strong>of</strong> the<br />

file names that are "in" the directory. Directories can contain directories. This leads to a hierarchical filesystem, where,<br />

in order to fully specify the location <strong>of</strong> a file, you have to specify the list <strong>of</strong> directories to follow to reach the file. By<br />

convention a slash character '/' is used to separate the directory names when giving the fully qualified filename. "/" itself<br />

is known as the root directory - there are no directories above the root directory in the filesystem. "/thesis.txt" would


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

7 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

then refer to file "thesis.txt" in the root directory. "/home/fred/thesis/chapter4.txt" would be the file "chapter4.txt" in the<br />

directory "thesis", in the directory "fred", in the directory "home", in the root directory.<br />

The Windows operating system uses letters such as "C:", "D:", etc to specify different filesystems (which might be in<br />

on different hard disks). GNU/Linux places these filesystems within the "/" hierarchy, i.e., the floppy disk might be<br />

accessible as "/mnt/floppy", the CD-ROM might be "/mnt/cdrom".<br />

The current working directory<br />

GNU/Linux has the concept <strong>of</strong> the current working directory. You can specify a file in the current working directory<br />

by just giving the file name. For convenience, "./" refers to the current working directory, and "../" refers to the<br />

directory above the current working directory. The fully-qualified filename <strong>of</strong> a file in the current working directory<br />

can be written as, e.g., "./thesis.txt".<br />

Programs<br />

To get a computer to do something for you, you create and run a program. You will be doing this many times during<br />

this course. To create a program, you type in the program's source code, and then use, e.g, the C compiler (another<br />

program, which someone else wrote) to turn the source code into a program. The program itself is just a file. The file's<br />

metadata tells the operating system that the file is executable (i.e., that it is a program able to be run by the computer).<br />

To run the program, you just type its fully qualified filename.<br />

GNU/Linux commands<br />

GNU/Linux provides you with many hundreds <strong>of</strong> programs, <strong>of</strong>ten called commands, but they are really exactly like the<br />

programs that you can create yourself (actually, some <strong>of</strong> the commands are a subset <strong>of</strong> the bash shell, but this is a<br />

detail). The commands range from listing the contents <strong>of</strong> a directory ("ls"), listing the contents <strong>of</strong> a file ("cat"), the C<br />

compiler ("gcc"), to the GNU image manipulation program ("gimp").<br />

To run a program, you simply type its filename. In the case <strong>of</strong> your own programs, you should use the fully qualified<br />

filename (e.g., "./a.out"). In the case <strong>of</strong> programs provided by GNU/Linux, the shell has a list <strong>of</strong> default directories<br />

where it searches for programs, so you normally only need to type the filename (without any directories). E.g., the "ls"<br />

program to list directories can be run by simply typing "ls", or "/bin/ls", or, if your current working directory happened<br />

to be "/bin", you could type "./ls".<br />

Input and Output<br />

In general, for a program to do useful work, it needs to be provided with input <strong>of</strong> some sort, and to produce output.<br />

Input to a program can take many forms: e.g., typing at the keyboard, a file containing data, movements <strong>of</strong> a mouse, or<br />

something more exotic such as audio input from a sound card, or video from a camera.<br />

Output from a program can also take many forms, e.g., text appearing on your computer screen, a graph on your screen,<br />

a file <strong>of</strong> results, or something more exotic such as audio output, or movement <strong>of</strong> a robotic arm.<br />

GNU/Linux provides the following input and output methods for all programs (I am covering all <strong>of</strong> the possibilities<br />

here, but you are only expected to really understand the first two at this stage):<br />

Standard input<br />

Standard input (or STDIN) is a stream <strong>of</strong> bytes provided to the program. The program can read from the stream,<br />

and can determine when there is no more data. Standard input comes from one <strong>of</strong> three sources depending on<br />

how you run your program. For example, suppose that your program is called "a.out" (the default name assumed<br />

by the C compiler), then<br />

./a.out<br />

./a.out < file<br />

standard input comes from the keyboard<br />

standard input comes from a file called "file"


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

8 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

command | ./a.out<br />

standard input comes from the standard output <strong>of</strong> "command"<br />

Standard output<br />

Standard output (or STDOUT) is a stream <strong>of</strong> bytes produced by a program. Standard output goes to one <strong>of</strong> three<br />

locations depending on how you run your program. E.g.,<br />

./a.out<br />

./a.out > file<br />

./a.out | command<br />

standard output appears on the computer screen as text<br />

standard output is sent to the file called "file"<br />

standard output is sent to the standard input <strong>of</strong> "command"<br />

Standard error<br />

Standard error (or STDERR) is a stream <strong>of</strong> bytes produced by a program. Usually it contains error messages<br />

(e.g., 'divide by zero') that are not normally considered as part <strong>of</strong> the program output. Standard error can be sent<br />

to one <strong>of</strong> three locations. E.g.,<br />

./a.out<br />

./a.out 2> file<br />

./a.out 2>&1 | command<br />

standard error appears on the computer<br />

screen as text, intermingled with the standard output<br />

standard error is sent to the file called "file"<br />

standard error (and standard output) is sent to<br />

the standard input <strong>of</strong> "command"<br />

Command-line arguments<br />

Anything you type after the program name, and before any special shell characters such as '>' and '|', is made<br />

available to your program as a command-line argument. These can be a useful source <strong>of</strong> additional input to a<br />

program. E.g.,<br />

./a.out 23.45 albatross 7<br />

In the above example, "23.45", "albatross", and "7" are three command-line arguments.We will see later how you<br />

can read them from within a C program.<br />

Exit status<br />

Another form <strong>of</strong> output that is available to all programs is an integer exit status. By convention, this is used to<br />

indicate whether the program ran successfully (in which case the exit status is zero) or if an error occurred (in<br />

which case the exit status is nonzero and its value can be decoded to determine what the error was). The exit<br />

status <strong>of</strong> a program is stored in the shell variable called "$?" (don't worry if you don't understand this at this<br />

stage).<br />

Putting it all together<br />

Here are some more complex examples <strong>of</strong> program input/output:<br />

./a.out 2.1 < data.txt > results.txt<br />

"2.1" is an argument, standard input<br />

comes from "data.txt", standard output<br />

goes to "results.txt"


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

9 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

./a.out > results.txt 2> error.txt<br />

there are no arguments, standard input<br />

is from the keyboard, standard output<br />

to "results.txt" and standard error to<br />

"error.txt"<br />

Command Toolkit<br />

Here are some <strong>of</strong> the basic GNU/Linux commands that you will be using:<br />

cat myfile - concatenate a file (i.e., list the file's data on standard output)<br />

less myfile - like "cat", but allows you to scroll up and down in the file using the page-up, and page-down keys.<br />

Exit with 'q'.<br />

rm myfile - remove a file (i.e., delete the file; it is gone for good)<br />

mv myfile newname - renames a file<br />

cp myfile copyOfMyFile - makes a copy <strong>of</strong> a file, with a different name<br />

cd mydir - change directory<br />

mkdir mynewdir - make a new directory<br />

rmdir mydir - remove the directory (you have to remove the files in it first)<br />

ls - list the files and directories in the current working directory<br />

pwd - print the name <strong>of</strong> the current working directory<br />

gcc myprog.c - compiles your program<br />

./a.out - runs the program compiled as above<br />

man mytopic - displays the manual pages about "mytopic"<br />

man -k mytopic - searches the manual pages for any information about "mytopic"<br />

gedit myprog.c - starts up a simple GUI editor<br />

logout - logs out <strong>of</strong> the computer<br />

The on-line manual pages<br />

Use "man -k keyword" to search for information about "keyword". For example, suppose you want to know<br />

something about random number generators: "man -k random" on a GNU/Linux system will return output<br />

containing lines like these:<br />

ppmspread<br />

(1) - displace a portable pixmap's pixels by a random amount<br />

rand<br />

(3) - random number generator<br />

RAND_bytes<br />

(3ssl) - generate random data<br />

random<br />

(3) - random number generator<br />

random<br />

(4) - kernel random number source devices<br />

RAND_pseudo_bytes [RAND_bytes] (3ssl) - generate random data<br />

rand [sslrand] (1ssl) - generate pseudo-random bytes<br />

rand [sslrand] (3ssl) - pseudo-random number generator<br />

seed48 [drand48] (3) - generate uniformly distributed pseudo-random numbers<br />

The number in parentheses is called the section number (e.g., section 3 is reserved for C functions, section 4 for<br />

low-level kernel interfaces), and you need to specify this if there are entries in multiple volumes. So, e.g., to read<br />

about the kernel random number source devices, you would type "man 4 random".


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

10 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Fundamentals <strong>of</strong> C<br />

Character constants, escape sequences, and string constants<br />

A character constant in C is a single character (or an escape sequence such as \n) enclosed in single quotes, e.g., 'A'. It<br />

occupies one byte <strong>of</strong> storage.<br />

The value <strong>of</strong> a character constant is the numeric value <strong>of</strong> the character in the computer's character set (e.g., 'A' has the<br />

value 65). In 99.99% <strong>of</strong> cases this is the ASCII character set, but this is not defined by the C standard!<br />

But how do you represent a character such as a single quote itself, or one that doesn't print (such as the ASCII BEL<br />

character)? The answer is to use an escape sequence.<br />

For reference, here is a program to print out all the special escape sequences. Have a look at it now.<br />

In addition, you can specify any 8-bit ASCII character using either \ooo or \xhh where `ooo' is an octal number (with<br />

from 1 to 3 digits), and 'xhh' is a hexadecimal number (with 1 or 2 digits). For example, \x20 is the ASCII character for<br />

SPACE, so ' ' and '\x20' are identical.<br />

String constants are a sequence <strong>of</strong> zero or more characters, enclosed in double quotes. For example, "test", "", "this is<br />

an invalid string" are all valid strings (you can't always believe what a string tells you!). String constants are stored in<br />

the computer's memory as a sequence <strong>of</strong> numbers (usually from the ASCII character set), and are terminated by a null<br />

byte (\0). So, "test" would appear in memory as the numbers 116, 110, 115, 116, 0.<br />

A consequence <strong>of</strong> the null byte terminator is that a C character string can never contain a null byte. In practice, this isn't<br />

<strong>of</strong>ten a limitation.<br />

We have already used some examples <strong>of</strong> strings in our programs, e.g, "hello world\n" was a null-terminated character<br />

string that we sent to the printf function above.<br />

The above program also shows how to add comments to your C program.<br />

Comments<br />

To make your program more readable, it is good practice to include comments. These do not influence the operation <strong>of</strong><br />

the program, nor are they printed out when the program runs, they are simply to document things that you think are<br />

important.<br />

Comments are delimited by `/*' and `*/', or from '//' to the end <strong>of</strong> the line. Any text between the first `/*' and the next<br />

`*/' is ignored by the compiler, irrespective <strong>of</strong> the position <strong>of</strong> line breaks. Note that this means that comments do not<br />

nest. Here are some examples <strong>of</strong> the valid use <strong>of</strong> comments:<br />

/* This is a comment */<br />

// and so is this<br />

/* here is another one<br />

that spans two lines */<br />

i = /* a big number */ 123456;<br />

Here are some problems that can arise when using comments:<br />

i = 123456; /* a comment starts here<br />

i = i + 2; this statement is also part <strong>of</strong> the comment */<br />

/* this is a comment /* and so is this */ but this will generate an error */<br />

The fact that comments don't nest is a real nuisance if you want to comment-out a whole section <strong>of</strong> code. In this case, it<br />

is simplest to use pre-processor directives to do the job. For example:<br />

i = 123456;<br />

#if 0 // The next two lines are ignored by the compiler.<br />

i = i + 2;<br />

j = i * i;<br />

#endif<br />

Compiling, linking, executing, etc


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

11 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Source files<br />

C source files contain the C program that you write. The files invariably end with the suffix ".c". You compile them<br />

with the C compiler to produce...<br />

Object files<br />

Have a ".o" suffix, and are the result <strong>of</strong> compiling a source file. They contain the machine language equivalent <strong>of</strong> the C<br />

program, but they are not ready to execute yet, since they need to be linked against libraries <strong>of</strong> other C object files<br />

containing any functions that are used, but not coded for, in the source file (e.g., the printf function). Linking can be<br />

done for you by the C compiler, or can be a separate step if you wish. After linking, you have an...<br />

Executable file<br />

This file is ready to run, by typing its name at the command prompt.<br />

Makefiles - the easy way to automate compilation<br />

When developing s<strong>of</strong>tware, a great deal <strong>of</strong> time can be save by using the UNIX make utility.<br />

The idea behind make is that you should be able to compile a program simply by typing make (followed by ENTER).<br />

For this to work, you have to tell the computer how to build your program, by storing instructions in a Makefile.<br />

While this step takes some effort, you are amply repaid by the time savings later on. make is particularly useful for large<br />

programs that consist <strong>of</strong> numerous source files: make only recompiles the file(s) that need recompiling (it works out<br />

which ones to process by looking at the dates <strong>of</strong> last modification).<br />

Here is an example <strong>of</strong> a simple Makefile:<br />

test: test.o; gcc test.o -o test<br />

Notes:<br />

1. A Makefile consists <strong>of</strong> a series <strong>of</strong> lines containing targets, dependencies, and shell commands for updating the<br />

targets.<br />

2. In the above example,<br />

test is the target, i.e., the executable program,<br />

test.o is the dependency (i.e., the file that test depends on), and<br />

gcc test.o -o test is the shell command for making test from its dependencies.<br />

3. Important note: test is a very bad name for a program since it conflicts with the UNIX shell command <strong>of</strong> the<br />

same name, hence if you type "test" it will run the UNIX command rather than your program. To get around this<br />

problem, simply use "./test" (if "test" is in your current directory), or rename the program. It is always a good<br />

idea to use the leading "./" since it avoids a lot <strong>of</strong> subtle problems that can be hard to track down.<br />

Makefiles can get much more complicated than this simple example. Here is the next step in complexity, showing the<br />

use <strong>of</strong> a macro definition and a program that depends on multiple files.<br />

OBJS = main.o sub1.o sub2.o<br />

main: $(OBJS); gcc $(OBJS) -o main<br />

It is well worth learning a little bit about make since it can save a lot <strong>of</strong> time!<br />

How to compile and run a C program<br />

Short Instructions<br />

1.<br />

Type in your program into your favourite editor (eg emacs): "emacs filename.c &" and save it.<br />

e.g. for the hellow world program described in lectures you would type into the editor:<br />

#include <br />

int main(void) {<br />

printf("hello world\n");


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

12 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

return 0;<br />

}<br />

2.<br />

Compile: "gcc filename.c"<br />

Fix any errors, save, and recompile until compilation successful.<br />

3. Run: "./a.out"<br />

Examine the output to see if it is correct. If not, examine the program carefully to ascertain where you<br />

migth have gone wrong.<br />

Full Instructions (for experts)<br />

1.<br />

2.<br />

3.<br />

4.<br />

5.<br />

6.<br />

Create a new directory for the program (not essential, but it helps to be organised).<br />

Use whatever editor you want to generate the program (make sure the program file has a suffix <strong>of</strong> '.c').<br />

Construct a Makefile for the program (not essential, but worth doing).<br />

Type make to compile the program (if you don't have a Makefile, you will need to type the compiler command<br />

yourself).<br />

Run the program by typing its name.<br />

Locate and fix bugs, then go to step 4.<br />

Here is a complete example <strong>of</strong> the above process for the ``Hello world'' program:<br />

cd<br />

mkdir hello<br />

cd hello<br />

cat > hello.c


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

13 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

const double pi = 3.14159265358979323846;<br />

double e, d = pi/2;<br />

}<br />

e = sin(d);<br />

printf("The sine <strong>of</strong> %f is %f\n", d, e);<br />

return 0;<br />

This program produces the result:<br />

The sine <strong>of</strong> 1.570796 is 1.000000<br />

Aside: the file /usr/include/math.h defines M_PI to be the double precision value <strong>of</strong> pi. So, use this rather than typing<br />

the value <strong>of</strong> pi yourself.<br />

The GNU C debugger "gdb"<br />

When debugging a program, a common technique is to insert "print" statements to display the values <strong>of</strong> variables. A<br />

much more powerful technique is to use the GNU C debugger, "gdb". Other C compilers will have debuggers as well.<br />

Firstly, you must compile your program with the debug switch (-g). It is also a good idea to disable optimisation with<br />

-O0.<br />

gcc -g -O0 main.c<br />

Then run your program using gdb:<br />

gdb a.out<br />

The following are some <strong>of</strong> the more common commands that are available:<br />

help<br />

- obtain help<br />

help {arg}<br />

- obtain help on topic {arg}<br />

list<br />

- list 10 lines around current position<br />

list {line}<br />

- list 10 lines around line number {line}<br />

list {beg},{end}<br />

- list lines from line number {beg} to {end}<br />

run arg1 arg2 ...<br />

- begin execution <strong>of</strong> the program, with the arguments<br />

break line<br />

- suspend execution at the given line number<br />

step<br />

- steps (and executes) one line in your program<br />

print {exp} ...<br />

- print the value <strong>of</strong> the given expressions<br />

printf "string", exp, ... - print expressions using format string(C)<br />

quit<br />

- quit gdb<br />

<br />

- repeats the last command (very useful with "step")<br />

A program to give information on C data types<br />

Like other <strong>programming</strong> languages, C has a variety <strong>of</strong> different data types. The ANSI standard defines the data types<br />

that must be supported by a compiler, but it doesn't tell you details such as the range <strong>of</strong> numbers that each type can<br />

represent, or the number <strong>of</strong> bytes <strong>of</strong> storage occupied by each type. These details are implementation dependent, and<br />

defined in the two system include-files "limits.h" and "float.h". To find out what the limits are, try running the<br />

following program with your favourite C compiler.<br />

/* A program to print out various machine-dependent constants */<br />

/* Michael Ashley / UNSW / 03-Mar-2003 */<br />

#include /* for printf definition */<br />

#include /* for CHAR_MIN, CHAR_MAX, etc */<br />

#include /* for FLT_DIG, DBL_DIG, etc */<br />

int main(void) {<br />

printf("char %d bytes %d to %d \n", size<strong>of</strong>(char ), CHAR_MIN, CHAR_MAX );<br />

printf("unsigned char %d bytes %d to %d \n", size<strong>of</strong>(unsigned char ), 0 , UCHAR_MAX );<br />

printf("short %d bytes %hi to %hi \n", size<strong>of</strong>(short ), SHRT_MIN, SHRT_MAX );<br />

printf("unsigned short %d bytes %hu to %hu \n", size<strong>of</strong>(unsigned short), 0 , USHRT_MAX );<br />

printf("int %d bytes %i to %i \n", size<strong>of</strong>(int ), INT_MIN , INT_MAX );<br />

printf("unsigned int %d bytes %u to %u \n", size<strong>of</strong>(unsigned int ), 0 , UINT_MAX );<br />

printf("long %d bytes %li to %li \n", size<strong>of</strong>(long ), LONG_MIN, LONG_MAX );<br />

printf("unsigned long %d bytes %lu to %lu \n", size<strong>of</strong>(unsigned long ), 0 , ULONG_MAX );<br />

printf("float %d bytes %e to %e \n", size<strong>of</strong>(float ), FLT_MIN , FLT_MAX );<br />

printf("double %d bytes %e to %e \n", size<strong>of</strong>(double ), DBL_MIN , DBL_MAX );<br />

printf("precision <strong>of</strong> float %d digits\n", FLT_DIG);<br />

printf("precision <strong>of</strong> double %d digits\n", DBL_DIG);


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

14 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

}<br />

return 0;<br />

Notes:<br />

size<strong>of</strong> looks like a function, but it is actually a built-in C operator (i.e., just like +,-,*). The compiler replaces<br />

size<strong>of</strong>(data-type-name) (or, in fact, size<strong>of</strong>(variable)) with the number <strong>of</strong> bytes <strong>of</strong> storage allocated to the<br />

data-type or variable.<br />

The 'unsigned' data types are useful when you are refering to things which are naturally positive, such as the<br />

number <strong>of</strong> bytes in a file. They also give you a factor <strong>of</strong> two increase in the largest number that you can represent<br />

in a given amount <strong>of</strong> space.<br />

An "unsigned char" is a single-byte C data type that can represent positive integers from 0 to 255. A "signed<br />

char" uses the most-significant bit <strong>of</strong> the byte as a sign-bit (0 indicates a positive number, 1 indicates a negative<br />

number), leaving the remaining 7 bits to encode the absolute value <strong>of</strong> the number; the range <strong>of</strong> signed chars is<br />

therefore -128 to +127.<br />

Most <strong>of</strong> the time you don't need to worry about how many bytes are in each data type, since the limits are usually OK<br />

for normal programs. However, a common problem is that the "int" type can be either 2-bytes or 4-bytes in length,<br />

depending on the compiler.<br />

Here is the output <strong>of</strong> the preceeding program when run on a GNU/Linux PC, using gcc:<br />

char 1 bytes -128 to 127<br />

unsigned char 1 bytes 0 to 255<br />

short 2 bytes -32768 to 32767<br />

unsigned short 2 bytes 0 to 65535<br />

int 4 bytes -2147483648 to 2147483647<br />

unsigned int 4 bytes 0 to 4294967295<br />

long 4 bytes -2147483648 to 2147483647<br />

unsigned long 4 bytes 0 to 4294967295<br />

float<br />

4 bytes 1.175494e-38 to 3.402823e+38<br />

double<br />

8 bytes 2.225074e-308 to 1.797693e+308<br />

precision <strong>of</strong> float 6 digits<br />

precision <strong>of</strong> double 15 digits<br />

Constants; binary, octal, hexademical, and decimal<br />

We have already seen how to write character constants and strings. Let's now look at other types <strong>of</strong> constants:<br />

int<br />

123, -1, 2147483647, 040 (octal), 0xab (hexadecimal)<br />

unsigned int 123u, 2147483648, 040U (octal), 0X02 (hexadecimal)<br />

long<br />

123L, 0XFFFFl (hexadecimal)<br />

unsigned long 123ul, 0777UL (octal)<br />

float<br />

1.23F, 3.14e+0f<br />

double 1.23, 2.718281828<br />

long double 1.23L, 9.99E-9L<br />

Note:<br />

Integers are automatically assumed to be <strong>of</strong> the smallest type that can represent them (but at least an "int"). For<br />

example, 2147483648 was assumed by the compiler to be an "unsigned int" since this number is too big for an<br />

"int". Numbers too big to be an "unsigned int" are promoted to "long" or "unsigned long" as appropriate.<br />

An integer starting with a zero is assumed to be octal, unless the character after the zero is an "x" or "X", in<br />

which case the number in hexadecimal.<br />

Unsigned numbers have a suffix <strong>of</strong> "u" or "U".<br />

Long "int"s or "double"s have a suffix <strong>of</strong> "l" or "L" (it is probably better to use "L" so that it isn't mistaken for the<br />

number one).<br />

Floating-point numbers are assumed to be "double"s unless they have a suffix <strong>of</strong> "f" or "F" for "float", or "l" or<br />

"L" for "long double".<br />

It pays to be very careful when specifying numbers, to make sure that you do it correctly, particularly when dealing<br />

with issues <strong>of</strong> precision. For example, have a look at this program (precision1.c, precision2.c):<br />

#include <br />

int main(void) {<br />

double r;<br />

r = 1.0 + 0.2F;


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

15 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

}<br />

r = r - 1.2F;<br />

printf("%22.16e\n", r);<br />

return 0;<br />

The program gives at output <strong>of</strong> "-4.4703483581542969e-08", not zero as you might expect.<br />

EXERCISE: Run this program and verify for yourself that the answer is indeed not zero! Re-write the program so that<br />

the numerical values in the code (i.e. 0.2, 1.2 etc) are double rather than float, and verify that the result <strong>of</strong> execution is<br />

to give the answer 0.000000000000000e+00.<br />

The lesson to be learnt here is when writing constants, always think carefully about what type you want them to be, and<br />

use the suffixes "U", "L", and "F" to be explicit about it. It is not a good idea to rely on the compiler to do what you<br />

expect. Don't be surprised if different machines/compilers give different answers if you program sloppily.<br />

Conversion between integers and floating point numbers<br />

In C, as in all computer languages, there are rules that the compiler uses when a program mixes integers and floating<br />

point numbers in the same expression.<br />

Let's look at what happens if you assign a floating point number to an integer variable (convert1.c):<br />

#include <br />

int main(void) {<br />

int i, j;<br />

i = 1.99;<br />

j = -1.99;<br />

printf("i = %d; j = %d\n", i, j);<br />

return 0;<br />

}<br />

This program produces the result "i = 1; j = -1". Note that the floating point numbers have been truncated and not<br />

rounded.<br />

When converting integers to floating-point, be aware that a "float" has fewer digits <strong>of</strong> precision than an "int", even<br />

though they both use 4 bytes <strong>of</strong> storage (on normal PCs). This can result in some strange behaviour, e.g. (convert2.c),<br />

#include <br />

int main(void) {<br />

unsigned int i;<br />

float f;<br />

i = 4294967295; /* the largest unsigned int */<br />

f = i; /* convert it to a float */<br />

printf("%u %20.13e %20.13e\n", i, f, f - i);<br />

return 0;<br />

}<br />

This program produces the following output when compiled with "gcc":<br />

4294967295 4.2949672960000e+09 1.0000000000000e+00<br />

Rather than rely on automatic type-conversion, you can be explicit about it by using a type-cast operator, e.g.,<br />

f = (float)i;<br />

This converts the number or variable or parethesised expression immediately to its right, to the indicated type. It is a<br />

good idea to use type-casting to ensure that you leave nothing to chance.<br />

Operators<br />

C has a rich set <strong>of</strong> operators (i.e., things like + - * /), which allow you to write complicated expressions quite<br />

compactly.<br />

Unary operators


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

16 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Unary operators are operators that only take one argument. The following list will be confusing when you first see it, so<br />

just keep it mind for reference later on.<br />

Some <strong>of</strong> the operators we have already seen (e.g., "size<strong>of</strong>()"), others are very simple (e.g., +, -), others are really neat<br />

(e.g., ~, !), others are useful for adding/subtracting 1 automatically (e.g., ++i, --i, i++, i--), and the rest involve pointers<br />

and addressing, which will be covered in detail later.<br />

size<strong>of</strong>(i) the number <strong>of</strong> bytes <strong>of</strong> storage allocated to i<br />

+123 positive 123<br />

-123 negative 123<br />

~i one's complement (bitwise complement)<br />

!i logical negation (i.e., 1 if i is zero, 0 otherwise)<br />

*i returns the value stored at the address pointed to by i<br />

&i<br />

returns the address in memory <strong>of</strong> i<br />

++i<br />

adds one to i, and returns the new value <strong>of</strong> i<br />

--i<br />

subtracts one from i, and returns the new value <strong>of</strong> i<br />

i++<br />

adds one to i, and returns the old value <strong>of</strong> i<br />

i--<br />

subtracts one from i, and returns the old value <strong>of</strong> i<br />

i[j]<br />

array indexing<br />

i (j)<br />

calling the function i with argument j<br />

i.j<br />

returns member j <strong>of</strong> structure i<br />

i->j<br />

returns member j <strong>of</strong> structure pointed to by i<br />

Binary operators<br />

Binary operators work on two operands ("binary" here means 2 operands, not in the sense <strong>of</strong> base-2 arithmetic).<br />

Here is a list. All the usual operators that you would expect are there, with a whole bunch <strong>of</strong> interesting new ones.<br />

Note:<br />

+ addition<br />

- subtraction<br />

* multiplication<br />

/ division<br />

% remainder (e.g., 2%3 is 2), also called 'modulo'<br />


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

17 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Left argument Right argument & | ^ && ||<br />

0011 0101 0001 0111 0110 1 1<br />

1110 1010 1010 1110 0100 1 1<br />

0000 1111 0000 1111 1111 0 1<br />

Assignment operators<br />

Assignment operators are really just binary operators. The simplest example is "=", which takes the value on the right<br />

and places it into the variable on the left. But C provides you with a host <strong>of</strong> other assignment operators which make life<br />

easier for the programmer.<br />

= assignment<br />

+= addition assignment<br />

-= subtraction assignment<br />

*= multiplication assignment<br />

/= division assignment<br />

%= remainder/modulus assignment<br />

&= bitwise AND assignment<br />

|= bitwise OR assignment<br />

^= bitwise exclusive OR assignment<br />

= right shift assignment<br />

So, for example, "i += j" is equivalent to "i = i + j". The advantage <strong>of</strong> the assignment operators is that they can reduce<br />

the amount <strong>of</strong> typing that you have to do, and make the meaning clearer. This is particularly noticeable when, instead<br />

<strong>of</strong> a simple variable such as "i", you have something complicated such as "position[wavelength + correction_factor *<br />

2]";<br />

The thing to the left <strong>of</strong> the assignment operator has to be something where a result can be stored, and is known as an<br />

"lvalue" (i.e., something that can be on the left <strong>of</strong> an "="). Valid "lvalues" include variables such as "i", and array<br />

expressions. It doesn't make sense, however, to use constants or expressions to the left <strong>of</strong> an equals sign, so these are<br />

not "lvalues".<br />

The comma operator<br />

C allows you to put multiple expression in the same statement, separated by a comma. The expressions are evaluated in<br />

left-to-right order. The value <strong>of</strong> the overall expression is then equal to that <strong>of</strong> the rightmost expression.<br />

For example,<br />

Example<br />

Equivalent to<br />

------- -------------<br />

i = ((j = 2), 3); i = 3; j = 2;<br />

myfunct(i, (j = 2, j + 1), 1); j = 2; myfunct(i, 3, 1);<br />

The comma operator has the lowest precedence, so it is always executed last when evaluating an expression. Note that<br />

in the example given comma is used in two distinct ways inside an argument list for a function.<br />

Both the above examples are artificial, and not very useful. The comma operator can be useful when used in "for" and<br />

"while" loops as we will see later.<br />

Precedence and associativity <strong>of</strong> operators<br />

The precedence <strong>of</strong> an operator gives the order in which operators are applied in expressions: the highest precedence<br />

operator is applied first, followed by the next highest, and so on.<br />

The associativity <strong>of</strong> an operator gives the order in which expressions involving operators <strong>of</strong> the same precedence are<br />

evaluated.<br />

The following table lists all the operators, in order <strong>of</strong> precedence, with their associativity:<br />

Operator<br />

Associativity<br />

-------- -------------


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

18 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

() [] ->> . left-to-right<br />

- + ++ -- ! ~ * & size<strong>of</strong> (type) right-to-left (note: unary +, -, *, and &)<br />

* / % left-to-right<br />

+ - left-to-right<br />

> left-to-right<br />

< >= left-to-right<br />

== != left-to-right<br />

&<br />

left-to-right<br />

^<br />

left-to-right<br />

| left-to-right<br />

&&<br />

left-to-right<br />

|| left-to-right<br />

?: right-to-left<br />

= += -= *= /= %= &= ^= |= = right-to-left<br />

, left-to-right<br />

Note: the unary operators + (introducing a positive number), - (introducing a negative number), * (pointer<br />

dereferencing), and & (address <strong>of</strong>); appear twice in the above table. The unary forms (on the second line) have higher<br />

precedence that the binary forms.<br />

Operators on the same line have the same precedence, and are evaluated in the order given by their associativity.<br />

To specify a different order <strong>of</strong> evaluation you can use parentheses. In fact, it is <strong>of</strong>ten good practice to use parentheses to<br />

guard against making mistakes in difficult cases, or to make your meaning clear.<br />

Side effects in evaluating expressions<br />

It is possible to write C expressions that give different answers on different machines, since some aspects <strong>of</strong><br />

expression-evaluation are not defined by the ANSI standard. This is deliberate since it gives the compiler writers the<br />

ability to choose different evaluation orders depending on the underlying machine architecture. You, the programmer,<br />

should avoid writing expressions with side effects.<br />

Here are some examples:<br />

myfunc(j, ++j); /* the arguments may be the same, or differ by one */<br />

array[j] = j++; /* is j incremented before being used as an index? */<br />

i = f1() + f2(); /* the order <strong>of</strong> evaluation <strong>of</strong> the two functions<br />

is not defined. If one function affects the<br />

results <strong>of</strong> the other, then side effects will<br />

result */<br />

Evaluation <strong>of</strong> logical AND and OR<br />

A useful aspect <strong>of</strong> C is that it guarantees the order <strong>of</strong> evaluation <strong>of</strong> expressions containing the logical AND (&&) and<br />

OR (||) operators: it is always left-to-right, and stops when the outcome is known. For example, in the expression "1 ||<br />

f()", the function "f()" will not be called since the truth <strong>of</strong> the expression is known regardless <strong>of</strong> the value returned by<br />

"f()".<br />

It is worth keeping this in mind. You can <strong>of</strong>ten speed up programs by rearranging logical tests so that the outcome <strong>of</strong><br />

the test can be determined as soon as possible.<br />

Another good example is an expression such as "i >= 0 && i


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

19 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Rules for identifiers:<br />

Legal characters are a-z, A-Z, 0-9, and _.<br />

Case is significant.<br />

The first character must be a letter or _.<br />

Identifiers can be <strong>of</strong> any length (although only the first 31 characters are guaranteed to be significant).<br />

The following are reserved keywords, and may not be used as identifiers:<br />

auto double int struct<br />

break else long switch<br />

case enum register typedef<br />

char extern return union<br />

const float short unsigned<br />

continue for signed void<br />

default goto size<strong>of</strong> volatile<br />

do if static while<br />

Here are some examples <strong>of</strong> legal identifiers:<br />

i<br />

count<br />

numberOfAardvarks<br />

number_<strong>of</strong>_aardvarks<br />

MAX_LENGTH<br />

Variable storage classes<br />

Variables in C belong to one <strong>of</strong> two fundamental storage classes: "static" or "automatic".<br />

A static variable is stored at a fixed memory location in the computer, and is created and initialised once when the<br />

program is first started. Such a variable maintains its value between calls to the block (a function, or compound<br />

statement) in which it is defined.<br />

An automatic variable is created, and initialised, each time the block is entered (if you jump in half-way through a<br />

block, the creation still works, but the variable is not initialised). The variable is destroyed when the block is exited.<br />

Variables can be explicitly declared as "static" or "auto" by using these keywords before the data-type definition. If you<br />

don't use one <strong>of</strong> these keywords, the default is "static" for variables defined outside any block, and "auto" for those<br />

inside a block.<br />

Actually, there is another storage class: "register". This is like "auto" except that it asks the compiler to try and store the<br />

variable in one <strong>of</strong> the CPU's fast internal registers. In practice, it is usually best not to use the "register" type since<br />

compilers are now so smart that they can do a better job <strong>of</strong> deciding which variables to place in fast storage than you<br />

can.<br />

Const - volatile<br />

Variables can be qualified as "const" to indicate that they are really constants that can be initialised, but not altered.<br />

Variables can also be termed "volatile" to indicate that their value may change unexpectedly during the execution <strong>of</strong> the<br />

program (e.g., they may be hardware registers on a PC, able to be altered by external events). By using the "volatile"<br />

qualifier, you prevent the compiler from optimising the variable out <strong>of</strong> loops.<br />

Extern<br />

Variables (and functions) can also be classified as "extern", which means that they are defined external to the current<br />

block (or even to the current source file). An "extern" variable must be defined once (and only once) without the<br />

"extern" qualifier.<br />

As an example <strong>of</strong> an "extern" function, all the functions in "libm.a" (the math library) are external to the source file that<br />

calls them.


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

20 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

An example showing storage class and variable scope (storageclass.c)<br />

#include <br />

int i; /* i is static, and visible to the entire program */<br />

extern j; /* j is static, and visible to the entire program */<br />

static int k; /* k is static, and visible to the routines in this source file */<br />

int main(void){}<br />

void func(void) {<br />

int m = 1;<br />

auto int n = 2;<br />

/* i.e., a function that takes no arguments, and<br />

doesn't return a value */<br />

/* m is automatic, local to this function, and initialised<br />

each time the function is called */<br />

/* n is automatic, local to this function, and initialised<br />

each time the function is called */<br />

static int p = 3; /* p is static, local to this function, and initialised<br />

once when the program is first started up */<br />

extern int q; /* q is static, and defined in some external module */<br />

}<br />

for (i = 0; i < 10; i++) {<br />

int m = 10; /* m is automatic, local to this block, and initialised<br />

each time the block is entered */<br />

printf("m = %i\n", m);<br />

}<br />

Initialisation <strong>of</strong> variables (initialisation.c)<br />

A variable is initialised by equating it to a constant expression on the line in which it is defined. For example<br />

int i = 0;<br />

"static" variables are initialised once (to zero if not explicitly initialised), "automatic" variables are initialised when the<br />

block in which they are defined is entered (and to an undefined value if not explicitly initialised).<br />

The "constant expression" can contain combinations <strong>of</strong> any type <strong>of</strong> constant, and any operator (except assignment,<br />

incrementing, decrementing, function call, or the comma operator), including the ability to use the unary "&" operator<br />

to find the address <strong>of</strong> static variables.<br />

Here are some valid examples:<br />

#include <br />

#include <br />

int i = 0;<br />

int j = 2 + 2;<br />

int k = 2 * (3


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

21 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Mathematical functions<br />

Prototypes for the math functions are in the system include-file "math.h", so you should put the line<br />

#include <br />

in any C source file that calls one <strong>of</strong> them.<br />

Here is a list <strong>of</strong> the math functions defined by the ANSI standard:<br />

sin(x)<br />

sine <strong>of</strong> x<br />

cos(x)<br />

cosine <strong>of</strong> x<br />

tan(x)<br />

tan <strong>of</strong> x<br />

asin(x)<br />

arcsine <strong>of</strong> x, result between -pi/2 and +pi/2<br />

acos(x)<br />

arccosine <strong>of</strong> x, result between 0 and +pi<br />

atan(x)<br />

arctan <strong>of</strong> x, result between -pi/2 and +pi/2<br />

atan2(y,x)<br />

arctan <strong>of</strong> (y/x), result between -pi and +pi<br />

hsin(x)<br />

hyperbolic sine <strong>of</strong> x<br />

hcos(x)<br />

hyperbolic cosine <strong>of</strong> x<br />

htan(x)<br />

hyperbolic tan <strong>of</strong> x<br />

exp(x)<br />

exponential function<br />

log(x)<br />

natural logarithm<br />

log10(x) logarithm to base 10<br />

pow(x,y)<br />

x to the power <strong>of</strong> y (x**y in FORTRAN)<br />

sqrt(x)<br />

the square root <strong>of</strong> x (x must not be negative)<br />

ceil(x)<br />

ceiling; the smallest integer not less than x<br />

floor(x)<br />

floor; the largest integer not greater than x<br />

fabs(x)<br />

absolute value <strong>of</strong> x<br />

ldexp(x,n)<br />

x times 2**n<br />

frexp(x, int *exp) returns x normalized between 0.5 and 1; the exponent <strong>of</strong> 2 is in *exp<br />

modf(x, double *ip) returns the fractional part <strong>of</strong> x; the integral part goes to *ip<br />

fmod(x,y)<br />

returns the floating-point remainder <strong>of</strong> x/y, with the sign <strong>of</strong> x<br />

In the above table, "x", and "y" are <strong>of</strong> type "double", and "n" is an "int". All the above functions return "double" results.<br />

C libraries may also include "float" versions <strong>of</strong> the above. For example, "fsin(x)" takes a float argument and returns a<br />

float result.<br />

Note that the arguments for trignometric functions are in radians and not degrees!<br />

A complete list <strong>of</strong> mathematical functions supported by C can be found in the file Cmathfunctions.txt.<br />

The "for" loop<br />

The basic looping construct in C is the "for" loop.<br />

Here is the syntax <strong>of</strong> the "for" statement:<br />

for (initial_expression; loop_condition; loop_expression) statement;<br />

and this is precisely equivalent to the following code fragment:<br />

initial_expression;<br />

while (loop_condition) {<br />

statement;<br />

loop_expression;<br />

}<br />

An example will, hopefully, clear this up:<br />

for (i = 0; i< 100; i++) printf("%i\n", i);<br />

which simply prints the first 100 integers. If you want to include more that one statement in the loop, use curly brackets<br />

to delimit the body <strong>of</strong> the loop, e.g.,<br />

for (i = 0; i < 100; i++) {<br />

j = i * i;<br />

printf("i = %i; j = %i\n", i, j);


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

22 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

}<br />

A related control mechanism is the do-while loop<br />

initial_expression;<br />

do{statement;} while (loop_condition)<br />

For example (dowhile.c):<br />

i = 0;<br />

do {printf("The value <strong>of</strong> i is now %d\n", i);<br />

i = i + 1;<br />

} while (i < 5);<br />

Here is a slightly more interesting example which converts Centigrade to Farenheit (tempconv.c).<br />

And here is a more complex example, illustrating nested loops, with an if thrown in for good measure:<br />

ASCII circle printer.<br />

And here is another program illustrating the power <strong>of</strong> loops: Mandelbrot image generator<br />

Semicolons in compound statements<br />

The last statement in the body <strong>of</strong> statements in a "for" loop (or, in fact, in any other compound statement) must be<br />

terminated with a semicolon.<br />

For example,<br />

for (i = 0; i < 10; i++) {<br />

x = i * i;<br />

x += 2; /* the semicolon is required here */<br />

} /* do not use a semicolon here */<br />

Variables defined within compound statements<br />

You can create variables that are local to a compound statement by declaring the variables immediately after the<br />

leading curly bracket.<br />

If statements<br />

if (expression)<br />

statement<br />

else if (expression)<br />

statement<br />

else if (expression)<br />

statement<br />

else<br />

statement<br />

Where "statement" is a simple C statement ending in a semicolon, or a compound statement ending in a curly bracket.<br />

Some examples will help:<br />

if (i == 6) x = 3;<br />

if (i == 6) {<br />

x = 3;<br />

}<br />

if (i) x = 3;<br />

if (i == 2)<br />

x = 3;<br />

else if (i == 1)<br />

if (j)<br />

y = 2;


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

23 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

else /* NOTE: possible ambiguity here, the compiler uses */<br />

y = 3; /* the closest if statement that does not have */<br />

else { /* an else clause */<br />

x = 4;<br />

y = 5;<br />

}<br />

Example Program ifelse.c<br />

Break and continue<br />

These two statements are used in loop control.<br />

"break" exits the innermost current loop.<br />

"continue" starts the next iteration <strong>of</strong> the loop.<br />

Here is an examaple program with break and continue (breakcon.c)<br />

Infinite loops<br />

Here are three ways <strong>of</strong> writing an infinite loop, choose whichever one you like:<br />

for (;;;) {<br />

statement;<br />

}<br />

while (1) {<br />

statement;<br />

}<br />

do {<br />

statement;<br />

} while (1);<br />

Infinite loops can be useful. They are normally terminated using a conditional test with a "break" or "return" statement.<br />

For example:<br />

while (1) {<br />

statement;<br />

if (logical_condition) break;<br />

}<br />

goto<br />

C does have a "goto" statement, but you normally don't need it. Using "goto" is usually a result <strong>of</strong> bad <strong>programming</strong><br />

(although in some circumstances it can be a sensible choice).<br />

The switch statement<br />

The switch statement allows you to choose between multiple paths <strong>of</strong> execution depending on the value <strong>of</strong> an<br />

expression. It could be replaced by a series <strong>of</strong> if statements, but switch can make the flow <strong>of</strong> the program easier to<br />

understand.<br />

switch (expression) {<br />

case const-expression: statements<br />

case const-expression: statements<br />

default: statements<br />

}<br />

Beware that control will flow from one case to the next unless you explicitly include a break statement to cause control<br />

to skip to the end <strong>of</strong> the switch statement.<br />

Here is an example <strong>of</strong> using switch:<br />

switch (i) {<br />

case 0: printf("i had the value zero\n");<br />

break;<br />

case 1:<br />

case 2: printf("i was either one or two\n");


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

24 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

}<br />

break;<br />

default: printf("i was outside the range 0 to 2\n");<br />

Here is an example with multiple switch values (switch.c)<br />

Assert as a tool for improving program reliability<br />

By now you will be aware that it is very easy to write buggy programs in C, and it may not always be obvious at<br />

runtime that there is a problem. Checking for all possible error conditions in a program can be tedious. A useful tool to<br />

improve program reliability is the "assert" construct defined in the system include-file "assert.h". An example<br />

(assertexample.c) will clarify this:<br />

#include <br />

#include <br />

int main(void) {<br />

int i;<br />

}<br />

printf("enter a positive integer > ");<br />

assert(1 == scanf("%i", &i));<br />

assert(i > 0);<br />

printf("you successfully entered %i\n", i);<br />

return 0;<br />

If the expression inside parentheses after "assert" is true, the program continues as normal, if the expression is false (i.e,<br />

zero), then the program aborts with an error message.<br />

This may sound like a trivial concept, but by using "assert" extensively, you will be surprised at how your<br />

<strong>programming</strong> improves.<br />

It is particularly useful to use "assert" to check for conditions that should never arise in your program, e.g., suppose you<br />

write a function that should only ever be called with an argument between -1.0 and 1.0, then, use "assert" in the<br />

function to check this, e.g.,<br />

double myAsin(double x) {<br />

assert(fabs(x) < 1.0);<br />

...<br />

}<br />

Also, you can use assert to check for unexpected conditions such as failure to open a file, or failure to allocate memory,<br />

e.g.,<br />

assert(NULL != (fp = fopen("data.dat", "r")));<br />

assert(NULL != (p = malloc(1024));<br />

Formatted output: printf description<br />

The format string given to the "printf" function may contain both ordinary characters (which are simply printed out)<br />

and conversion characters (beginning with a percent symbol, %, these define how the value <strong>of</strong> an internal variable is to<br />

be converted into a character string for output).<br />

Here is the syntax <strong>of</strong> a conversion specification:<br />

%{flags: - + space 0 #}{minimum field width}{.}{precision}{length modifier}{conversion character}<br />

Flags: "-" means left-justify (default is right-justify), "+" means that a sign will always be used, " " prefix a<br />

space, "0" pad to the field width with zeroes, "#" specifies an alternate form (for details, see the manual!). Flags<br />

can be concatenated in any order, or left <strong>of</strong>f altogether.


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

25 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Minimum field width: the output field will be at least this wide, and wider if necessary.<br />

"." separates the field width from the precision.<br />

Precision: its meaning depends on the type <strong>of</strong> object being printed:<br />

Character: the maximum number <strong>of</strong> characters to be printed.<br />

Integer: the minimum number <strong>of</strong> digits to be printed.<br />

Floating point: the number <strong>of</strong> digits after the decimal point.<br />

Exponential format: the number <strong>of</strong> significant digits.<br />

Length modifer: "h" means short, or unsigned short; "l" means long or unsigned long; "L" means long double.<br />

Conversion character: a single character specifying the type <strong>of</strong> object being printed, and the manner in which it<br />

will be printed, according to the following table:<br />

Character Type<br />

Result<br />

d,i int signed decimal integer<br />

o int unsigned octal (no leading zero)<br />

x, X int unsigned hex (no leading 0x or 0X)<br />

u int unsigned decimal integer<br />

c int single character<br />

s char * characters from a string<br />

f double floating point [-]dddd.pppp<br />

e, E double exponential [-]dddd.pppp e[=/-]xx<br />

g, G double floating if the exponent less than -4 or >= precision<br />

else exponential<br />

p void * pointer<br />

n int * the number <strong>of</strong> characters written so far by printf<br />

is stored into the argument (i.e., not printed)<br />

% print %<br />

Here is an example program (printfexample.c) to show some <strong>of</strong> these effects (try running the program yourself):<br />

#include <br />

int main(void) {<br />

int i = 123;<br />

double f = 3.14159265358979323846;<br />

printf("i = %i\n", i);<br />

printf("i = %o\n", i);<br />

printf("i = %x\n", i);<br />

printf("i = %X\n", i);<br />

printf("i = %+i\n", i);<br />

printf("i = %8i\n", i);<br />

printf("i = %08i\n", i);<br />

printf("i = %+08i\n", i);<br />

printf("f = %f\n", f);<br />

printf("f = %10.3f\n", f);<br />

printf("f = %+10.3f\n", f);<br />

printf("f = %g\n", f);<br />

printf("f = %10.6g\n", f);<br />

printf("f = %10.6e\n", f);<br />

}<br />

return 0;<br />

Notes:<br />

There are differences between the various implementations <strong>of</strong> "printf". The above should be a subset <strong>of</strong> the<br />

available options. Consult the manual (e.g., "info gcc") for your particular C compiler to be sure.<br />

"printf" is actually an integer function, which returns the number <strong>of</strong> characters written (or a negative number if an<br />

error occurred).<br />

Formatted input: scanf overview<br />

Input in C is similar to output: the same conversion characters are used. The main difference is that you use a routine<br />

called "scanf", and you must pass the addresses <strong>of</strong> the variables you want to change, not their values.<br />

For example:<br />

scanf("%d", &i); /* reads an integer into `i' */<br />

scanf("%i", &i); /* reads an integer (or octal, or hex) into `i' */<br />

scanf("%f %i", &f, &i); /* reads a double followed by an integer */<br />

scanf is actually an integer function, which returns the number <strong>of</strong> input items assigned (or the integer constant EOF<br />

(defined in stdio.h) if the end-<strong>of</strong>-file is reached or an error occurred).


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

26 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

The ampersand character "&" is a unary operator that returns the address <strong>of</strong> the thing to its right. Recall that a C<br />

function can not alter the value <strong>of</strong> its arguments (but there is nothing stopping it altering the value that is pointed to by<br />

one <strong>of</strong> its arguments!).<br />

scanf details<br />

The scanf function is used for reading a line <strong>of</strong> input from the standard input stream (usually the users terminal) and<br />

extracting from it one or more integers, floating point numbers, character strings, etc.<br />

The first argument to scanf is a format description, which describes how the input line should be decoded. For example,<br />

if you want to read an integer, the format would be "%i" (note that this allows you to enter numbers in decimal,<br />

hexadecimal or octal (e.g., 123, 0x123, 0123)). If you want to read a floating point number, the format would be "%f".<br />

If you want to read an integer followed by a floating point number, the format woud be "%d %f". If the two numbers<br />

were separated by a comma on the input line, you would use "%d, %f".<br />

The arguments after the first are pointers to where the data should be stored.<br />

Examples (where i and j are integers, x and y are floats, and fruit and colour are char arrays):<br />

Input data<br />

scanf function call to read the data<br />

123 scanf("%i", &i);<br />

123 0x456 scanf("%i %i", &i, &j);<br />

123. 456. scanf("%i. %i.", &i, &j);<br />

123.1 456.1 scanf("%f %f", &x, &y);<br />

e=2.71 pi=3.14<br />

scanf("e=%f pi=%f", &x, &y);<br />

apples red<br />

scanf("%s %s", fruit, colour);<br />

Guido ate 2 apples<br />

scanf("Guido ate %i %s", &i, fruit);<br />

It is important to check the return value <strong>of</strong> the scanf function - this is the number <strong>of</strong> items that were successfully<br />

extracted from the input. It may be less than the number you expected, or even zero. If this happens, the input stream<br />

pointer is left at the point at which the error occurred, and you can try reading again. scanf can also return the special<br />

integer constant EOF to indicate an end-<strong>of</strong>-file condition.<br />

Variations on scanf<br />

fscanf is very similar to scanf except it takes a FILE argument ("FILE" is a structure that is defined in the <br />

include file - you don't have to worry about what precisely "FILE" is, just follow the examples) to specify which input<br />

stream to read from, e.g.,<br />

FILE *fp;<br />

float x, y;<br />

fp = fopen("input.dat", "r");<br />

fscanf(fp, "%f %f", &x, &y);<br />

"sscanf" takes input from a character array rather than an input stream, e.g.,<br />

char input="1.23 4.56";<br />

sscanf(input, "%f %f", &x, &y);<br />

Improving the ease <strong>of</strong> interaction with your programs - GNU readline<br />

While it is possible to use "scanf" for reading input typed by a person, you can greatly improve the user experience by<br />

using the GNU readline package.


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

27 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

The GNU readline package provides some very convenient features such as line editing and history. This means, e.g.,<br />

that you can use the arrow keys to move within a line, and between previously entered lines. You can also use, e.g.,<br />

CTRL-A to go to the beginning <strong>of</strong> the line, CTRL-E to go to the end, and CNTRL-R to search backwards for particular<br />

strings. See "man readline" (on a GNU/Linux computer) for a complete description.<br />

Here is an example program showing the use <strong>of</strong> GNU readline.<br />

Arrays<br />

In mathematics and physics you <strong>of</strong>ten need to group numbers together into arrays, i.e., 1, 2, or higher-dimensional<br />

structures.<br />

Arrays are declared in C as follows (for example):<br />

int counts[100];<br />

float temperature[1024];<br />

double mat[2][3];<br />

In this example, "count" is an array that can hold 100 integers, "temperature" is an array that can hold 1024 floats, and<br />

"mat" is a 2x3 array (or matrix) <strong>of</strong> doubles.<br />

Note carefully that the first element <strong>of</strong> an array in C is element number 0 (rather than 1 as in FORTRAN). While this<br />

may be confusing to die-hard FORTRAN programmers, it is really a more natural choice. A side-effect <strong>of</strong> this choice is<br />

that the last element in an array has an index that is one less that the declared size <strong>of</strong> the array. This is a source <strong>of</strong> some<br />

confusion, and something to watch out for.<br />

So, "counts[0]" is the first element <strong>of</strong> the array "counts", defined above. And "counts[99]" is the last element. There are<br />

100 elements in total.<br />

Initialising arrays<br />

To initialise an array to have particular numerical values, specify the initial values in a list within curly brackets. For<br />

example:<br />

int primes[100] =<br />

{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,<br />

47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107,<br />

109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181,<br />

191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263,<br />

269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349,<br />

353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433,<br />

439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521,<br />

523, 541};<br />

float temp[1024] = {5.0F, 2.3F};<br />

double trouble[] = {1.0, 2.0, 3.0};<br />

In this example, "primes" is initialised with the values <strong>of</strong> the first 100 primes (trust me!), and the first two elements<br />

(temp[0] and temp[1]) <strong>of</strong> "temp" are initialised to 5.0F and 2.3F respectively. The remaining elements <strong>of</strong> "temp" are set<br />

to 0.0F. Note that we use the trailing "F" on these numbers to indicate that they are floats, not doubles.<br />

The array "trouble" in the above example contains three double numbers. Note that its length is not explicitly declared,<br />

C is smart enough to calculate the length for you.<br />

Static arrays that are not explicitly initialised are set to zero. Automatic arrays that are not explicitly initialised, have<br />

undefined values.<br />

Multidimensional arrays<br />

Multidimensional arrays are declared and referenced in C by using multiple sets <strong>of</strong> square brackets. For example,<br />

int table[2][3][4];<br />

"table" is a 2x3x4 array, with the rightmost array subscript changing most rapidly as you move through memory (the


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

28 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

first element is "table[0][0][0]", the next element is "table[0][0][1]", and the last element is "table[1][2][3]").<br />

When writing programs that use large arrays (say, more than a few megabytes), you should be very careful to ensure<br />

that array references in your program are as close as possible to being consecutive (otherwise you may get severe<br />

swapping problems, i.e., the memory is swapped to hard disk, resulting in more than a factor <strong>of</strong> 1000 slow-down).<br />

Initialisation <strong>of</strong> multidimensional arrays<br />

This is best demonstrated with an example:<br />

int mat[3][4] = {<br />

}<br />

Operations on arrays<br />

{ 0, 1, 2, 23},<br />

{ 3, 4, 5, 24},<br />

{ 6, 7, 8, 25}<br />

If you wish to, e.g., add two arrays together, you have to do it element-by-element yourself using a "for" loop. For<br />

example:<br />

int a0[10], a1[10], a2[10];<br />

for (int i = 0; i < 10; i++) {<br />

a0[i] = a1[i] + a2[i];<br />

}<br />

Even something as simple as setting one array to have the same value as another has to be done with a "for" loop. E.g. :<br />

int a0[10], a1[10];<br />

for (int i = 0; i < 10; i++) {<br />

a0[i] = a1[i];<br />

}<br />

Character arrays<br />

Arrays can be <strong>of</strong> any type. In C, you manipulate character strings as arrays <strong>of</strong> characters, and operations on the strings<br />

(such as concatenation, searching) are done by calling special libary functions (e.g., strcat, strcmp).<br />

Note that when calling a string-manipulation function, the end <strong>of</strong> the string is taken as the position <strong>of</strong> the first NUL<br />

character (which has a numerical value <strong>of</strong> zero) in the string.<br />

Character arrays can be initialised in the following ways:<br />

char str[] = {'a', 'b', 'c'};<br />

char prompt[] = "please enter a number";<br />

In the example, "str" has length 3 bytes, and "prompt" has length 22 bytes, which is one more than the number <strong>of</strong><br />

characters in "please enter a number", the extra byte is used to store a NUL character (zero) as an indication <strong>of</strong> the end<br />

<strong>of</strong> the string. "str" does not having a trailing NUL.<br />

Functions<br />

A function call allows a commonly used piece <strong>of</strong> code to be made available through a single calling statement. For<br />

instance, to calculate the square <strong>of</strong> a number we could use the following (floating point) function:<br />

float sqr(inval) /* square a float, return a float */<br />

float inval;<br />

{<br />

float square;<br />

square = inval * inval;<br />

return(square);


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

29 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

This would be called from the main part <strong>of</strong> the program by a statement like:<br />

y = sqr(x);<br />

In this case the result <strong>of</strong> the function call ("square") is assigned to the value "y". The complete calling sequence is<br />

illustrated in the program floatsq.c.<br />

Prototypes<br />

The function call described above is the "classical" mode <strong>of</strong> function use, as described, for eaxmple, in Kernigan &<br />

Ritchie. A prototype is a model <strong>of</strong> a real thing, and in C you have the ability to define a model <strong>of</strong> each function for the<br />

compiler. The compiler can then use the model to check each call to the function and so determine if the correct number<br />

<strong>of</strong> arguments have been used, and whether they are <strong>of</strong> the correct type. By using prototypes you are therefore letting the<br />

compiler do some additional error checking. The program floatsq2.c does the same calculation as the program floatsq.c<br />

as above, but this time uses prototypes. Note the placement <strong>of</strong> the prototype for the square function "sqr":<br />

float sqr(float inval);<br />

at the start <strong>of</strong> the program, before the start <strong>of</strong> "main".<br />

Pointers<br />

If there is one thing that sets C apart from many other languages, it is the pervasive use <strong>of</strong> pointers.<br />

To understand pointers, it helps to have a mental image <strong>of</strong> the way in which the memory <strong>of</strong> a computer is organised.<br />

e.g., imagine that your computer has 128MB <strong>of</strong> memory; then you could label each one <strong>of</strong> these 128 million bytes <strong>of</strong><br />

memory with a number: byte 0 would be the first byte in memory, byte 1 the second, and so on up to 128 million.<br />

These numbers are referred to as the addresses <strong>of</strong> the memory cells (note: I'm actually simplifying the true situation; in<br />

a modern computer each separate process has it's own virtual address-space, and there are several layers <strong>of</strong> indirection<br />

involved in going from the addresses that C uses, to the actual physical piece <strong>of</strong> memory on the computer<br />

motherboard). A pointer is then simply a variable containing a memory address. This is a very natural concept when<br />

you think in terms <strong>of</strong> the hardware <strong>of</strong> the computer, which is a viewpoint that C encourages.<br />

Suppose that "i" is declared as "int i". Then the address in memory at which "i" is stored is given by "&i", where "&" is<br />

the unary address operator. One says that "&i" is a pointer to "i". On a 32-bit computer such as a typical PC, a pointer<br />

is 4 bytes in length (therefore allowing up to 4GB <strong>of</strong> memory to be addressed; this is an uncomfortably small number).<br />

Pointers are declared by specifying the type <strong>of</strong> thing they point at. For example,<br />

int *p;<br />

defines "p" as a pointer to an int (so, therefore, "*p" is an int, hence the form <strong>of</strong> the declaration).<br />

Note carefully, then by declaring "p" as in the above example, the compiler simply allocates space for the pointer (4<br />

bytes for 32-bit computers, 8 bytes for 64-bit computers), not for the variable that the pointer points to! This is a very<br />

important point, and is <strong>of</strong>ten overlooked. Before using "*p" in an expression, you have to ensure that "p" is set to point<br />

to a valid int. This "int" must have had space allocated for it, either statically, automatically, or by dynamically<br />

allocating memory at run-time (e.g., using the "malloc" function, see later).<br />

Here is a rather convoluted example showing some <strong>of</strong> the uses <strong>of</strong> pointers (pointerexample1.c):<br />

#include <br />

int main(void) {<br />

int m = 0, n = 1, k = 2;<br />

int *p;<br />

char msg[] = "hello";<br />

char *cp;<br />

/* Line 7 */<br />

p = &m; /* Line 8: p now points to m */<br />

*p = 1; /* Line 9: m now equals 1 */<br />

k = *p; /* Line 10: k now equals 1 */<br />

cp = msg; /* Line 11: cp points to the first character <strong>of</strong> msg */<br />

*cp = 'H'; /* Line 12: change the case <strong>of</strong> the 'h' in msg */<br />

cp = &msg[4]; /* Line 13: cp points to the 'o' */<br />

*cp = 'O'; /* Line 14: change its case */<br />

printf("m = %d, n = %d, k = %d\nmsg = \"%s\"\n", m, n, k, msg);<br />

return 0;


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

30 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

}<br />

By using the GNU debugger, gdb, we can identify the actual memory locations that the C compiler uses to store the<br />

variable in the above program. It is then possible to draw up a table (see below) showing the detailed operation <strong>of</strong> the<br />

program. You should follow this example very carefully so that you understand what is going on.<br />

Variable Starting Value after the specified line has executed<br />

location 7 8 9 10 11 12 13 14<br />

------------------------------------------------------------------<br />

m 0xbffff98c 0 0 1 1 1 1 1 1<br />

n 0xbffff988 1 1 1 1 1 1 1 1<br />

k 0xbffff984 2 2 2 1 1 1 1 1<br />

p 0xbffff980 ? 98c 98c 98c 98c 98c 98c 98c<br />

msg 0xbffff970 hello hello hello hello hello Hello Hello HellO<br />

cp 0xbffff96c ? ? ? ? 970 970 976 976<br />

The memory addresses are shown in hexadecimal, and only the rightmost three digits are shown (e.g., 98c) for<br />

convenience in displaying the table. Memory addresses are always given in units <strong>of</strong> bytes. The above table shows, for<br />

example, that the variable "m" occupies the memory addresses 0xbffff98c, 0xbffff98d, 0xbffff98e, and 0xbffff98f. You<br />

will note that 16 bytes are reserved for "msg", even though "msg" only needs 6 bytes (5 for the characters in "hello"<br />

and one for the trailing null byte). The C compiler won't guarantee any particular placement in memory, and may leave<br />

gaps, particularly to align numbers with 4-byte boundaries (i.e., so that their memory addresses are divisable by 4).<br />

Note the very important point that the name <strong>of</strong> an array ("msg" in the above example), if used without an index, is<br />

considered to be a pointer to the first element <strong>of</strong> the array. In fact, an array name followed by an index is exactly<br />

equivalent to a pointer followed by an <strong>of</strong>fset. For example (pointerexample2.c),<br />

#include <br />

int main(void) {<br />

char msg[] = "hello world";<br />

char *cp;<br />

cp = msg;<br />

cp[0] = 'H';<br />

*(msg+6) = 'W';<br />

}<br />

printf("%s\n", msg);<br />

printf("%s\n", &msg[0]);<br />

printf("%s\n", cp);<br />

printf("%s\n", &cp[0]);<br />

return 0;<br />

Note, however, that "cp" is a variable, and can be changed, whereas "msg" is a constant, and is not an lvalue (i.e., it can<br />

not be used on the left <strong>of</strong> an equals sign).<br />

Pointer arithmetic<br />

You can add and subtract numbers to/from pointers, but the results may surprise you. For example, if "p" is a pointer to<br />

an "int" (i.e., "int *p"), then "p++" actually adds 4 to "p"! The logic behind this is that adding one to "p" should make it<br />

point to the next integer, which has a memory address that is 4 bytes greater.<br />

Pointers used as arguments to functions<br />

We have already seen that C functions can not alter their arguments. The simple reason for this is that when you call a<br />

function, the compiler passes copies <strong>of</strong> the values <strong>of</strong> the arguments to the function. For example, if you have a function<br />

as follow:<br />

void myfunc(int n) {<br />

n = 6;<br />

}<br />

and you call it with<br />

int i = 0;<br />

myfunc(i);<br />

Then "myfunc" is passed the number 0. "myfunc" knows nothing whatsoever about where this number came from. The


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

31 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

zero is stored in the local variable n. "myfunc" changed n to 6, but this doesn't have any effect on i in the main<br />

program.<br />

In order to change the value <strong>of</strong> a variable in the main program, you need to give the function a pointer to the variable.<br />

In other words, you tell the function whereabouts in memory the variable is. With this crucial bit <strong>of</strong> information, the<br />

function can go ahead and alter the value <strong>of</strong> the variable. For example,<br />

void myfunc(int *n) {<br />

*n = 6;<br />

}<br />

and you call it with<br />

int i = 0;<br />

myfunc(&i);<br />

Now, the value <strong>of</strong> i in the main program will be altered by the call to "myfunc".<br />

Here is an example <strong>of</strong> a function that swaps the values <strong>of</strong> its two arguments, which must be pointers to integers:<br />

void swap_args(int *pi, int *pj) {<br />

int temp;<br />

}<br />

temp = *pi;<br />

*pi = *pj;<br />

*pj = temp;<br />

If all the asterisks were left out <strong>of</strong> this routine, then it would still compile OK, but its only effect would be to swap local<br />

copies <strong>of</strong> "pi" and "pj".<br />

Pointers and arrays<br />

Consider an array declared as follows:<br />

int a[10];<br />

Then, as we have seen, a[0] is the first element <strong>of</strong> the array, and a[9] is the last. The address <strong>of</strong> the first element is<br />

"&a[0]", and by convention this can also be written simply as "a".<br />

Structures<br />

Any <strong>programming</strong> language worth its salt has to allow you to manipulate more complex data types than simply ints,<br />

floats, and arrays. You need to have structures <strong>of</strong> some sort. This is best illustrated by example:<br />

To define a structure, use the following syntax:<br />

struct time {<br />

int hour;<br />

int minute;<br />

float second;<br />

};<br />

This defines a new data type, called "time", which contains 3 fields (stored in consecutive locations in the computer's<br />

memory).<br />

Logically, it makes sense to associate the various fields <strong>of</strong> "time" together, since they are part <strong>of</strong> one physical entity:<br />

the time.<br />

To declare a variable <strong>of</strong> type "time", you could do the following:<br />

struct time t1[10], t2 = {12, 0, 0.0F};<br />

This defines "t1" to be an array <strong>of</strong> 10 "time" structures. The variable "t2" is also a "time" structure, and is initialised to<br />

12 hours, 0 minutes, 0.0 seconds.<br />

An example program using structures is addtime.c. This defines the above "time" structure, and then adds two times


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

32 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

together, taking account <strong>of</strong> any overflow in the seconds and the minutes. The assert command is also used to ensure<br />

that the times that are input are also valid.<br />

To refer to an individual element <strong>of</strong> a structure, you use the following syntax:<br />

t1[0].hour = 12;<br />

t1[0].minutes = 0;<br />

t1[0].second = t2.second;<br />

t1[1] = t2;<br />

Alternatively, and somewhat confusingly, if "p", say, is a pointer to a structure, then you can access the elements <strong>of</strong> the<br />

structure using notation like "p->hour". For example:<br />

struct time *p;<br />

// p is a pointer to a time structure<br />

struct time t = {1,2,3.0}; // t is a time structure<br />

p = &t;<br />

// p now points to t<br />

p->hour = 6;<br />

// this statement is exactly equivalent to...<br />

t.hour = 6; // ...this one<br />

Structures can contain any type, including arrays and other structures. A common use is to create a linked list by having<br />

a structure contain a pointer to a structure <strong>of</strong> the same type, for example,<br />

struct person {<br />

char *name[80];<br />

char *address[256];<br />

struct person *next_person;<br />

};<br />

where "person.next_person" is actually a pointer to a "person" struct. Sounds complicated, but it is simple when you<br />

get the hang <strong>of</strong> it.<br />

Structures are the first step towards object-oriented <strong>programming</strong> (OOP), where most <strong>of</strong> the <strong>programming</strong> effort is<br />

devoted to defining the best representation <strong>of</strong> the data, and the operations which can be performed on the data.<br />

In physics, a structure can be used to represent the state <strong>of</strong> a physical system. For example, here is a structure that<br />

might be used to represent the state <strong>of</strong> a particle in three-dimensional space:<br />

struct particle {<br />

double mass;<br />

double position[3];<br />

double velocity[3];<br />

};<br />

We could then define an array <strong>of</strong> 1000 particles as simply as:<br />

struct particle p[1000];<br />

You can take this one step further and define your own type (like "int", "double") by doing:<br />

typedef struct {<br />

double mass;<br />

double position[3];<br />

double velocity[3];<br />

} particle;<br />

In which case you can define the array <strong>of</strong> 1000 particles as:<br />

particle p[1000];<br />

String functions<br />

The standard C libraries include a bunch <strong>of</strong> functions for manipulating strings (i.e., arrays <strong>of</strong> chars).<br />

Note that before using any <strong>of</strong> these functions, you should include the line "#include " in your program. This<br />

include-file defines prototypes for the following functions, as well as defining special types such as "size_t", which are<br />

operating-system dependent.<br />

char *strcat(s1, s2) Concatenates s2 onto s1. null-terminates s1.<br />

Returns s1.<br />

char *strchr(s, c)<br />

Searches string s for character c. Returns a


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

33 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

int strcmp(s1, s2)<br />

pointer to the first occurence, or null pointer<br />

if not.<br />

Compares strings s1 and s2. Returns an integer<br />

that is less than 0 is s1 < s2; zero if s1 = s2;<br />

and greater than zero is s1 > s2.<br />

char *strcpy (s1, s2) Copies s2 to s1, returning s1.<br />

size_t strlen(s)<br />

char *strncat(s1, s2, n)<br />

int strncmp(s1, s2, n)<br />

char *strncpy(s1, s2)<br />

char *strrchr(s, c)<br />

int strstr(s1, s2)<br />

size_r strspn(s1, s2)<br />

size_r strcspn(s1, s2)<br />

char *strpbrk(s1, s2)<br />

char *strerror(n)<br />

char *strtok(s1, s2)<br />

Returns the number <strong>of</strong> characters in s, excluding<br />

the terminating null.<br />

Concatenates s2 onto s1, stopping after `n'<br />

characters or the end <strong>of</strong> s2, whichever occurs<br />

first. Returns s1.<br />

Like strcmp, except at most `n' characters are<br />

compared.<br />

Like strcpy, except at most `n' characters are<br />

copied.<br />

Like strchr, except searches from the end <strong>of</strong><br />

the string.<br />

Searches for the string s2 in s1. Returns a<br />

pointer if found, otherwise the null-pointer.<br />

Returns the number <strong>of</strong> consecutive characters<br />

in s1, starting at the beginning <strong>of</strong> the string,<br />

that are contained within s2.<br />

Returns the number <strong>of</strong> consecutive characters<br />

in s1, starting at the beginning <strong>of</strong> the string,<br />

that are not contained within s2.<br />

Returns a pointer to the first character in s1<br />

that is present in s2 (or NULL if none).<br />

Returns a pointer to a string describing the<br />

error code `n'.<br />

Searches s1 for tokens delimited by characters<br />

from s1. The first time it is called with a<br />

non-null s1, it writes a NULL into s1 at the<br />

first position <strong>of</strong> a character from s2, and<br />

returns a pointer to the beginning <strong>of</strong> s1. When<br />

strtok is then called with a null s1, it finds<br />

the next token in s1, starting at one position<br />

beyond the previous null.<br />

File operations<br />

defines a number <strong>of</strong> functions that are used for accessing files. Before using a file, you have to declare a<br />

pointer to it, so that it can be referred to with the functions. You do this with, for example,<br />

FILE *in_file;<br />

FILE *out_file;<br />

where "FILE" is a type defined in (it is usually a complicated structure <strong>of</strong> some sort). To open a file, you do,<br />

for example:<br />

in_file = fopen("input_file.dat", "r");<br />

out_file = fopen("output_file.dat", "w");<br />

Note the use <strong>of</strong> "r" to indicate read access, and "w" to indicate write access. The following modes are available:<br />

"r" read<br />

"w" write (destroys any existing file with the same name)<br />

"rb" read a binary file<br />

"wb" write a binary file (overwriting any existing file)<br />

"r+" opens an existing file for random read access, and<br />

writing to its end<br />

"w+" opens a new file for random read access, and writing to<br />

its end (destroys any existing file with the same name)<br />

Once the file is opened, you can use the following functions to read/write it:


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

34 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

int fread(void *ptr, int size, int nelm, FILE *fp) Reads nelm elements<br />

<strong>of</strong> data, each size bytes long, storing them<br />

at the location pointed to by ptr. Returns<br />

the number <strong>of</strong> elements read.<br />

int fwrite(void *ptr, int size, int nelm, FILE *fp) Writes nelm elements<br />

<strong>of</strong> data, each size bytes long, from the<br />

location pointed to by ptr. Returns<br />

the number <strong>of</strong> elements written.<br />

int getc(FILE *fp)<br />

Returns the next character from `fp', or<br />

EOF on error or end-<strong>of</strong>-file.<br />

int putc(int c, FILE *fp)<br />

Write the character c to the file `fp',<br />

returning the character written, or EOF on<br />

error.<br />

int fscanf(FILE *fp, char *format, ...) Like scanf, except input is taken<br />

from the file fp.<br />

int fprintf(FILE *fp, char *format, ...) Like printf, except output is written<br />

to the file fp.<br />

char *fgets(char *line, int n, FILE *fp) Gets the next line <strong>of</strong> input from the<br />

file fp, up to `n-1' characters in length.<br />

The newline character is included at the end<br />

<strong>of</strong> the string, and a null is appended. Returns<br />

`line' if successful, else NULL if end-<strong>of</strong>-file<br />

or other error.<br />

int fputs(char *line, FILE *fp) Outputs the string `line' to the file fp.<br />

Returns zero is successful, EOF if not.<br />

When you have finished with a file, you should close it with 'fclose':<br />

int fclose(FILE *fp)<br />

Closes the file 'fp', after flushing any<br />

buffers. This function is automatically called<br />

for any open files by the operating<br />

system at the end <strong>of</strong> a program.<br />

It can also be useful to flush any buffers associated with a file, to guarantee that the characters that you have written<br />

have actually been sent to the file:<br />

int fflush(FILE *fp)<br />

Flushes any buffers associated with the<br />

file 'fp'.<br />

To check the status <strong>of</strong> a file, the following functions can be called:<br />

int fe<strong>of</strong>(FILE *fp)<br />

int ferror(FILE *fp)<br />

void clearerr(FILE *fp)<br />

int fileno(FILE *fp)<br />

Returns non-zero when an end-<strong>of</strong>-file is read.<br />

Returns non-zero when an error has occurred,<br />

unless cleared by clearerr.<br />

Resets the error and end-<strong>of</strong>-file statuses.<br />

Returns the integer file descriptor associated<br />

with the file (useful for low-level I/O).<br />

Note that the above "functions" are actually defined as pre-processor macros in C. This is quite a common thing to do.<br />

The preprocessor, cpp<br />

Any line in your C source code that starts with the "#" (hash) chararacter is a preprocessor directive. Here are some<br />

examples <strong>of</strong> what is possible:<br />

#define PI 3.1415926535<br />

#define SQR(a) (sqrarg=(a),sqrarg*sqrarg)<br />

#include "filename" /* from the current directory */<br />

#include /* from the system directories (as modified by the "-I" switch) */<br />

#define DEBUG


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

35 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

/* defines the symbol DEBUG */<br />

#ifdef DEBUG<br />

/* code here is compiled if DEBUG is defined */<br />

#elif defined UNIX<br />

/* code here is compiled if DEBUG is not defined, and UNIX is defined */<br />

#else<br />

/* code here is compiled if neither DEBUG or UNIX are defined */<br />

#endif<br />

#if 0<br />

/* code here is never compiled */<br />

#endif<br />

Command-line arguments, and returning a status to the shell<br />

Up until now, our "main" function has always been declared as "int main(void)". We have seen that the integer return<br />

value is available to the UNIX "shell" in the "$?" environment variable, and this can be used to alter the execution <strong>of</strong><br />

scripts. It turns out that "main" can actually take arguments, and these are traditionally given as "int argc, char *argv[]",<br />

where argc is one plus the number <strong>of</strong> command-line arguments provided to the program, and argv is an array <strong>of</strong><br />

character strings containing the arguments. argv[0] is usually the name <strong>of</strong> the program itself. An example<br />

(statusexample.c) will hopefully clarify this:<br />

#include <br />

int main(int argc, char *argv[]) {<br />

printf("this program is called '%s'\n", argv[0]);<br />

if (argc == 1) {<br />

printf("it was called without any arguments\n");<br />

} else {<br />

int i;<br />

printf("it was called with %d arguments\n", argc - 1);<br />

for (i = 1; i < argc; i++) {<br />

printf("argument number %d was \n", i, argv[i]);<br />

}<br />

}<br />

exit(argc);<br />

}<br />

echo $?<br />

Allocating memory at run-time - malloc<br />

The situation <strong>of</strong>ten arises that you don't know the size <strong>of</strong> a data structure in your program at compile-time. For<br />

example, suppose you are writing a program to read an 2D image from a file into the computer's memory. If the size <strong>of</strong><br />

the image can vary at run-time, it would be nice not to have to pre-define the size <strong>of</strong> the memory array in your program.<br />

C allows you to do this through the concept <strong>of</strong> allocating and deallocating memory.<br />

Allocating memory means requesting the computer to give you access to a contiguous block <strong>of</strong> memory. E.g., suppose<br />

you need a megabyte; you ask the computer "can I use a megabyte <strong>of</strong> memory, and if so, can you please tell me where<br />

the memory is?". You ask the question by calling the function "malloc", which takes a single argument: the number <strong>of</strong><br />

bytes <strong>of</strong> memory that you want. malloc will attempt to allocate the memory, and if it is successful it will return a<br />

pointer to the first byte <strong>of</strong> the memory block (if malloc was unsuccessful, it returns NULL; this is a rare enough<br />

occurence for reasonable memory requests that you should handle it with "assert").<br />

Deallocating memory means telling the computer "I have now finished using this block <strong>of</strong> memory that you previously<br />

allocated for me with malloc, so you can now have the memory back again for some other purpose". You deallocate<br />

memory by calling the "free" function.<br />

Some important points to remember when using malloc:<br />

malloc returns a pointer to void, i.e., "(void *)", which just means that malloc itself knows nothing about what<br />

the block <strong>of</strong> memory is to be used for, which is fair enough since you didn't tell it. However, you do have to let<br />

the C compiler know, and you do this by type-casting malloc's result (see the example below).<br />

You should always call "free" to deallocate any memory that you have finished with. Note carefully that the<br />

argument to "free" must be a pointer obtained from malloc, and that you can only call free once for a given block


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

36 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

<strong>of</strong> memory.<br />

After freeing memory, it is good practice to set any pointers that referred to it to NULL, this will prevent you<br />

from inadvertantly using deallocated memory.<br />

The memory that malloc allocates can have random data in it. If you need it to be zeroed, either do this yourself<br />

with a "for" loop <strong>of</strong> use the "calloc" function (see "man calloc").<br />

Here is an example to get you started (mallocexample.c):<br />

#include <br />

#include <br />

#include <br />

/* Generates a two dimensional matrix, and fills it randomly with<br />

zeroes and ones. */<br />

int main(void) {<br />

int xdim, ydim;<br />

int i, j;<br />

int *p, *q;<br />

// Obtain the dimensions <strong>of</strong> the matrix from the user.<br />

printf("x dimension <strong>of</strong> matrix? > ");<br />

scanf("%d", &xdim);<br />

printf("y dimension <strong>of</strong> matrix? > ");<br />

scanf("%d", &ydim);<br />

// malloc the memory needed for the matrix. Note the use <strong>of</strong> "size<strong>of</strong>(int)"<br />

// which allows the program to run on machines with a variety <strong>of</strong> word sizes.<br />

assert(NULL != (p = (int *)malloc(xdim * ydim * size<strong>of</strong>(int))));<br />

// Fill the matrix with random 0s and 1s.<br />

for (i = 0; i < xdim * ydim; i++) {<br />

if (random() > RAND_MAX/2) {<br />

*(p+i) = 1;<br />

} else {<br />

*(p+i) = 0;<br />

}<br />

}<br />

// Now print the matrix out, showing the use <strong>of</strong> pointer arithmetic.<br />

for (i = 0; i < ydim; i++) {<br />

q = p + i * xdim;<br />

for (j = 0; j < xdim; j++) {<br />

printf("%1d", *(q++));<br />

}<br />

printf("\n");<br />

}<br />

// Free the memory we had allocated.<br />

free((void *)p);<br />

// And remove all information about where the memory used to be.<br />

// This is overkill in this example, since the program immediately<br />

// terminates, but it is good practice, in order to avoid<br />

// accidentally using free'ed memory.<br />

}<br />

p = q = NULL;<br />

return 0;<br />

Generating random numbers<br />

Random numbers are used in all sorts <strong>of</strong> ways in mathematics and physics. They are a fundamental tool that is <strong>of</strong>ten<br />

useful when conducting numerical simulations <strong>of</strong> physical systems with a computer.<br />

Writing a computer program to generate random numbers requires understanding some fairly subtle concepts.<br />

There are several different algorithms for generating a new random number given an initial random number. One <strong>of</strong> the


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

37 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

simplest (and best) algorithms for generating random integers is a multiplicative congruential algorithm: you simply<br />

multiply the current random number generator by a constant, and then take the modulus. The choice <strong>of</strong> the constant and<br />

the modulus is critical. A good choice is 16807 and 2147483647 (see Numerical Recipes in C, Press et al, 2nd Ed, page<br />

278).<br />

But how do you generate the initial random number, the so-called "seed", for the algorithm?<br />

One technique is to use the time (in milli or micro-seconds since the last integral second). This is easy to program, but<br />

has a few disadvantages, e.g, there are only 1000 or 1000000 different seeds, and these numbers are not necessarily<br />

equally likely to occur (e.g., the operating system might tend to start programs at particular times - see here for more<br />

details - randomnumber.c).<br />

Another technique on GNU/Linux systems is to read from the file /dev/random. /dev/random contains random numbers<br />

generated from "random" events such as the arrival time <strong>of</strong> internet packets, the movement <strong>of</strong> the mouse, the timing <strong>of</strong><br />

user keystrokes, etc. If you "od -h /dev/random" ("od -h" prints out the file as hexadecimal characters) you will find<br />

that it will eventually stop, waiting for new events to generate randomness - move the mouse and it will continue. An<br />

alternative is /dev/urandom, which will continuously generate random numbers, using an algorithm if events are too<br />

infrequent.<br />

If you require very high quality random numbers, then you should read about the subject, perhaps using Numerical<br />

Recipies in C as a starting point.<br />

The C libraries provide the functions "srandom" and "random" to generate random integers between 0 and<br />

RAND_MAX (defined in "stdlib.h") inclusive. Use the on-line manual pages to learn about "srandom" and "random"<br />

(i.e., type "man srandom" at the prompt). Here is an example program (randomnumber.c) showing their usage, and<br />

seeding them with the time in microseconds.<br />

Quasi-random numbers<br />

For some applications, you don't want truly random numbers, but would prefer numbers that cover an n-dimensional<br />

space more unformly. For example, let's take a more detailed look at Monte-Carlo integration.<br />

When using random numbers, the error involved in Monte-Carlo integration goes down as 1/sqrt(N), where N is the<br />

number <strong>of</strong> points. This means, e.g., if you use 10 times as many points, the error in your estimation <strong>of</strong> the integral will<br />

be reduced by a factor <strong>of</strong> about 1/sqrt(10), or 0.32. Can we do better than this? It turns out that a simple uniform grid <strong>of</strong><br />

points has an error that goes as 1/N, which is about 3 times better than the random case. However, a uniform grid has<br />

the undesirable property that you have to decide on the grid spacing in advance, and you can't simply add more points<br />

at a later time.<br />

What we need is a quasi-random number generator that chooses numbers that tend to fill the gaps in the sequence.<br />

There are several possibilities, and we will explore one designed by a Russian mathematician by the name <strong>of</strong> Sobel<br />

(sobel.c).<br />

Sorting<br />

A frequent requirement is to sort quantities (numbers, character strings, or more complicated structures) into some<br />

order (ascending, descending, or something more complex). The C library provides the function "qsort", an<br />

implementation <strong>of</strong> the famous quicksort algorithm. It is usually best to use "qsort" than write your own sorting<br />

algorithm.<br />

Here are two example programs showing the use <strong>of</strong> qsort.<br />

Multiple precision arithmetic<br />

As an example <strong>of</strong> how to program in C, let's explore the topic <strong>of</strong> multiple precision arithmetic. All the hard work will<br />

be done by GNU's Multiple Precision Arithmetic Library (GNU MP), written by Torbjorn Granlund.<br />

Multiple precision arithmetic allows you to perform calculations to a greater precision than the host computer will<br />

normally allow. The penalty you pay is that the operations are slower, and that you have to call subroutines to do all the<br />

calculations.<br />

GNU MP can perform operations on integers and rational numbers. It uses preprocessor macros (defined in gmp.h) to


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

38 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

define special data types for storing these numbers. MP_INT is an integer, and MP_RAT is a rational number.<br />

However, since a multiple precision number may occupy an arbitrarily large amount <strong>of</strong> memory, it is not sufficient to<br />

allocate memory for each number at compile time. GNU MP copes with this problem by dynamically allocating more<br />

memory when necessary.<br />

To begin with you need to initialise each variable that you are going to use. When you are finished using a variable,<br />

you should free the space it uses by calling a special function.<br />

MP_INT x, y;<br />

mpz_init(&x);<br />

mpz_init(&y);<br />

/* operations on x and y */<br />

mpz_clear(&x);<br />

mpz_clear(&y);<br />

Let's now try a full program. For example, calculating the square root <strong>of</strong> two (roottwo.c) to about 200 decimal places:<br />

#include <br />

#include <br />

int main(void) {<br />

char two[450], guess[225];<br />

int i;<br />

MP_INT c, x, temp, diff;<br />

two[0] = '2';<br />

for (i = 1; i < size<strong>of</strong>(two)-1; i++) {<br />

two[i] = '0';<br />

}<br />

two[i] = 0;<br />

mpz_init_set_str(&c, two, 10);<br />

guess[0] = '1';<br />

for (i = 1; i < size<strong>of</strong>(guess)-1; i++) {<br />

guess[i] = '0';<br />

}<br />

guess[i] = 0;<br />

mpz_init_set_str(&x, guess, 10);<br />

mpz_init(&temp);<br />

mpz_init(&diff);<br />

do {<br />

mpz_div(&temp, &c, &x);<br />

mpz_sub(&diff, &x, &temp);<br />

mpz_abs(&diff, &diff);<br />

mpz_add(&x, &temp, &x);<br />

mpz_div_ui(&x, &x, 2U);<br />

} while (mpz_cmp_ui (&diff, 10U) > 0);<br />

}<br />

printf("the square root <strong>of</strong> two is approximately ");<br />

mpz_out_str(stdout, 10, &x);<br />

printf("\n");<br />

return 0;<br />

To compile, link, and run the program, use<br />

gcc -o two two.c -lgmp<br />

./two<br />

Gnuplot<br />

Gnuplot is a simple graphical plotting package. It is particuarly easy to plot graphs with it. These <strong>notes</strong> provide an<br />

introduction to its use.<br />

Numerical integration<br />

Here is an example <strong>of</strong> using numerical integration to follow the motion <strong>of</strong> a planet around the Sun:<br />

Writing fast programs


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

39 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

Reasons for worrying about how fast a program is include:<br />

if it is interactive, and reponsiveness is important,<br />

if it is pushing the boundaries <strong>of</strong> practicality (e.g., it might takes weeks to run).<br />

Hints on improving speed:<br />

use a clever algorithm.<br />

identify the time-critical parts <strong>of</strong> the program, and concentrate on improving those.<br />

examine memory use as well (e.g., try to reduce the amount <strong>of</strong> memory being used, and keep the usage localised<br />

as opposed to jumping around randomly).<br />

iterate most rapidly over the rightmost subscript in arrays, since this will lead to better localisation <strong>of</strong> memory<br />

use (and hence greater likelyhood <strong>of</strong> cache hits).<br />

use optimisation (e.g., -O1, -O2, or -O3; see "man gcc" for details).<br />

try optimising for small program size (-Os).<br />

in general, try not to be "too clever" - the C compiler can quite <strong>of</strong>ten do a better job if your program is clear.<br />

make the critical parts <strong>of</strong> the program as small (in terms <strong>of</strong> bytes <strong>of</strong> instructions) as possible - they will then be<br />

more likely to fit in the CPU's "instruction cache", resulting in better performance.<br />

declare <strong>of</strong>ten-called functions as "inline", which eliminates the expense <strong>of</strong> the function call and copying the<br />

arguments to/from the stack. This technique should be used with caution, since it makes the program larger, in<br />

which case it may no longer fit in the instruction cache.<br />

avoid printing out lots <strong>of</strong> unnecessary information. Formatting and writing text to the screen is quite CPU<br />

intensive.<br />

2D cellular automata - Conway's Game <strong>of</strong> Life<br />

In 1970, the mathematician John Horton Conway published the description <strong>of</strong> a simple 2D cellular automata which he<br />

called the Game <strong>of</strong> Life. Cells in a 2D grid were either "alive" or "dead", and their state in the next generation was<br />

determined by their current state and the number <strong>of</strong> nearest neighbours (from 0 to 8, inclusive). The rules were chosen<br />

to simulate some properties <strong>of</strong> biological systems (e.g., a cell would die <strong>of</strong> "overcrowding" if too many <strong>of</strong> its<br />

neighbours were alive, and die <strong>of</strong> "lack <strong>of</strong> support" if too few <strong>of</strong> its neighbours were alive).<br />

Here is a "simple" example <strong>of</strong> how to program the Game <strong>of</strong> Life on a computer.<br />

If we are interested in following the evolution <strong>of</strong> the game for many thousands <strong>of</strong> generations, it is advantageous to<br />

think <strong>of</strong> ways <strong>of</strong> speeding up our simple implementation. The first thing to try is to turn on full optimisation during the<br />

compilation, using the "-O4" switch. The next step is to make the critical functions "inline" (see here for the code). We<br />

can also think about the algorithm, and realise that a lot <strong>of</strong> our calculation <strong>of</strong> the number <strong>of</strong> nearest neighbours was<br />

involved in handling the special case <strong>of</strong> being on the boundary; we can re-write our program to separate out the<br />

boundary case, thereby allowing a simplified (faster) function to do most <strong>of</strong> the neighbour calculations, at the expense<br />

<strong>of</strong> increased complexity in handling the evolution from one generation to the next. Finally, we can try to use the fact<br />

that much <strong>of</strong> the Life "universe" tends to be sparsely populated, allowing us to produce a second<br />

neighbourhood-calculating function for the special case that the preceeding cell had no nearest neighbours. These are<br />

by no means the only speed-ups that are possible, but they give you some flavour <strong>of</strong> what is possible.<br />

The following table gives the time taken to follow one million generations using a 850MHz Pentium III computer and<br />

the various tricks from the last paragraph.<br />

Time Technique<br />

[seconds]<br />

--------------------------------------------------<br />

330 Original program, no optimisation<br />

205 Original program, -O4 optimisation<br />

113 Inline function calls<br />

45 Boundary case handled separately<br />

37 Sparse population handled separately<br />

So, we have managed almost a factor <strong>of</strong> ten improvement in speed over our original attempt.<br />

To show why this might be useful, let's now alter our program to automatically identify glider patterns (those that can<br />

self-propagate). We will choose a brute-force technique where we randomly seed the centre <strong>of</strong> our rectangular<br />

"universe" with alive/dead cells, and then evolve the system until one <strong>of</strong> the boundary cells is hit. When this happens<br />

we stop the evolution and print a 21x21 grid around the cell, hopefully capturing a glider in full flight! Here is the code,<br />

and you can see that it is a straightforward variation on the earlier program. It typically finds a glider within 2 seconds


C <strong>programming</strong> <strong>notes</strong><br />

file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />

40 <strong>of</strong> 40 19/03/2007 10:06 AM<br />

on a 850MHz Pentium III computer.<br />

C <strong>programming</strong> course, <strong>School</strong> <strong>of</strong> <strong>Physics</strong>, UNSW / Michael Burton / mgb@phys.unsw.edu.au

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!