C programming notes - School of Physics

C programming notes 

file:///F:/my_docs/web_phys2020/Cprogrammingnotes.html 

1 of 40 19/03/2007 10:06 AM 

PHYS2020 - Computational Physics, based on the C 

programming language 

Index 

Copyright (c) Michael Burton 2005, Michael Ashley 1994-2004. 

Example programs 

Why learn C? 

Some problems with C 

A brief history of C 

Simplest C program 

Hello World 

Computing fundamentals 

Bits, bytes and words 

32-bit, 64-bit 

Binary, octal, decimal, hexadecimal 

Floating point number 

GNU/Linux fundamentals 

Operating system 

Shell 

Files 

Filesystem 

Input and Output 

Command toolkit 

On-line manuals 

Fundamentals of C 

Constants and escape sequences and strings 

Comments 

Compiling, linking and executing 

Makefiles 

Compiling and Running 

Linking to libraries 

GNU debugger "gdb" 

C data types 

Constants: binary, octal, hexadecimal, decimal 

Conversion between integer and floating point 

Operators 

Unary operators 

Binary operators 

Assignment operators 

Comma operator 

Precedence and associativity 

Evaluation of logical AND and OR 

C identifiers 

Variable storage classes 

Initilialisation of variables 

Mathematical functions 

The "for" loop 

"If" statements 

Break and continue 

Goto 

Switch statements 

Assert for program reliability 

Formatted output: printf 

Formatted input: scanf 

GNU readline 

Arrays




Character arrays 

Functions 

Prototypes 

Pointers 

Pointer arithmetic 

Pointers and arrays 

Structures 

String functions 

File operations 

The preprocesser, "cpp" 

Command line arguments 

Allocating memory at runtime, "malloc" 

Some applications 

Random numbers 

Quasi-random numbers 

Sorting 

Multiple precision arithmetic 

Gnuplot 

Numerical integration via gravity and orbits 

Program speed 

The game of "Life" 

Some Example C Programs 

The simplest possible C program. 

Hello world. 

C escape sequences. 

Mandelbrot image generator - illustrating "for" loops. 

ASCII circle printer. 

How to use GNU readline. 

Generating random numbers. 

Sorting using quicksort. 

Numerical integration - gravity. 

1D cellular automata - text output. 

1D cellular automata - image output. 

Game of Life - simple version 

Game of Life - inline function version 

Game of Life - boundary version 

Game of Life - special-case version 

Game of Life - version to find gliders 

Linear fitting program 

Bisection Method for finding roots of equations 

Interpolation Method for finding roots 

Newton-Raphson Method for finding roots 

Why learn C? 

1. C is the most widely used programming language. 

2. The basic ideas in C are common to many other programming languages. 

3. Even though FORTRAN is still in heavy use by physicists, it is on its way out. 

4. If you want to get a computer-related job outside the field of Physics, it is helpful to know C. 

5. Many scientific instruments are programmed in C (e.g., CCD cameras and special purpose interface cards 

invariably come with a C library interfaces). 

6. C is closer to assembly language, so you can have finer control over what the computer is doing, and thereby 

make faster programs. 

7. There is a free C compiler available (GNU C, gcc), that is of very high quality and that has been ported to 

numerous machines. 

8. UNIX is written in C, so it is easier to interface with UNIX-like operating systems (such as GNU/Linux) if you 

write in C. 

9. Once you have a working knowledge of C, you will find perl easy to learn, and perl is a very useful language to




know, but that is another story... 

Some problems with C 

The language was designed with writing operating systems in mind, and so it was not purpose-built for numerical 

work. 

There is no support for complex numbers (actually, as of the C-99 standard, there is!). 

Handling multi-dimensional arrays, understanding pointers, and string manipulation are difficult when you first 

encounter them. 

It is easy to write completely opaque code in C. 

You have to explicitly declare every variable that you use. This is actually not a problem at all! It is a feature. 

It is possible to bypass many of the protective features of the language, and get yourself into trouble 

(alternatively, this can be seen as an advantage, since the language imposes few limitations on what you can do). 

The ANSI C standard leaves a lot of details undefined. This makes C non-portable between different operating 

systems, compilers, and computer hardware. Java is better in this respect. 

It is not easy to write large bug-free programs in C. Java is often a better choice. 

A brief history of C 

C evolved from a language called B, written by Ken Thompson at Bell Labs in 1970. Ken used B to write one of the 

first implementations of UNIX. B in turn was a decendant of the language BCPL (developed at Cambridge (UK) in 

1967), with most of its instructions removed. 

In fact, so many instructions were removed in going from BCPL to B, that Dennis Ritchie of Bell Labs put some back 

in (in 1972), and called the language C. 

The famous book The C Programming Language was written by Kernighan and Ritchie in 1978, and has remained the 

definitive reference book on C. 

The original C was still too limiting, and not standardized, and so in 1983 an ANSI committee was established to 

formalise the language definition. 

By 1993 the ANSI standard became well accepted and almost universally supported by compilers. The GNU C 

compiler, gcc, has become, in many ways, the standard implementation of C. 

What about C++ and C#? 

C++ is a superset of C, incorporating many object-oriented programming ideas. However, it is not particularly widely 

used. Java is a better choice in most cases. 

C# is a Java-like language developed by Microsoft. Enough said. 

Interesting books 

Numerical Analysis by Burden and Faires. 

A New Kind of Science by Wolfram. 

Numerical Recipies in C 

Other on-line resources 

For coverage of typical problems encountered with C, have a look at the Comp.lang.c FAQ. 

For a rather good tutorial introduction to C, check out this document by Gordon Dodrill. 

The simplest possible C program 

Here it is. 

Hello world




Before reading further, have a look at the generic Hello World, program written in C. 

But before we get too excited and start to write really useful programs, there are a few fundamental concepts that you 

will need to understand first. 

Computing Fundamentals 

Bits, bytes, and words 

Bit 

A bit is the smallest unit of information. It can be thought of as the answer to a yes/no question. A bit has the value 1 or 

0 (which can be equated with yes/no, true/false). 

Byte 

A byte consists of eight bits, ordered from the most significant bit (MSB) on the left, to the least significant bit (LSB) on 

the right. E.g., 011011000 is a byte. A byte can be interpretted as a base-2 number. 

There are 256 (i.e., 2 to the power 8) possible combinations of 8 bits, ranging from 00000000 to 11111111. These 256 

combinations could, in principle, be used to represent any 256 things (e.g., we could define 00000000 to be "apple" and 

00000001 to be 3.14159265359). 

In practice there are three commonly used interpretations of a byte: 

Words 

Unsigned byte 

An unsigned byte can represent integers from zero to 255. 

Signed byte 

A byte can also be used to represent integers from -128 to +127. Such a byte is called a signed byte, and the MSB 

is used to indicate the sign (0 for positive, 1 for negative), -128 is represented as 10000000, -1 is 11111111, 0 is 

0, and +127 is 01111111. 

Non-examinable: To find the bit pattern for a negative number, you first write down the representation of the 

number without the minus sign, then invert all the bits, then add one. e.g., -1 is 00000001 inverted (which is 

11111110), plus one (which gives 11111111). This representation is called "2's complement", and is used 

because (1) it makes binary arithmetic easier to implement in the computer hardware, and (2) it avoids the waste 

of a bit pattern in the case of 1000000, which you might think was -0. 

A byte as an ASCII character 

The American Standard Code for Information Interchange (ASCII) associates the various possible bit patterns in 

a byte with characters (i..e., things like 'A', 'B', 'C'...'Z', 'a', 'b', 'c', ...'z', '*', '&', '@', etc). For example, 'A' is stored 

as 65 (decimal). The ASCII table allows you to translate a simple text document into a list of bytes (which you 

might then store in a file, see below). For a complete list of the ASCII table, type "man ascii". 

In order to represent numbers outside the range of an unsigned or signed byte, multiple bytes can be grouped together 

to form words of various lengths. Words usually contain a power of two bytes, e.g., 2, 4, 8, 16, and so on. They may 

either be signed or unsigned. For example, an unsigned 2-byte word can represent numbers from 0 to 65536 (2 to the 

power 16), a signed 2-byte word ranges from -32768 to + 32767. Words are sometimes called "short words", "words", 

and "long words", for the 2, 4, and 8-byte variations, but this usage is not uniform. 

32-bit computer, 64-bit computers 

Most PCs are 32-bit computers, i.e., they process 32-bits (4 bytes) of data in one operation. In a few years, 64-bit




computer will become very common. 

Binary, octal, decimal, hexadecimal 

Integers can be written in various bases. The common bases used with computers are binary (base-2), octal (base-8), 

decimal (base-10), and hexadecimal (base-16). Octal and hexadecimal are often useful when doing low-level 

programming, since they encode a binary number in a smaller amount of typing than using binary notation. 

The following table compares the various numbering systems (note that you can't directly specify a binary constant in 

the C programming language); octal constants are distinguished by a leading zero; hexadecimal constants by a leading 

"0x"): 

Decimal Binary Octal Hexadecimal 

-------------------------------- 

0 0000 000 0x0 

1 0001 001 0x1 

2 0010 002 0x2 

3 0011 003 0x3 

4 0100 004 0x4 

5 0101 005 0x5 

6 0110 006 0x6 

7 0111 007 0x7 

8 1000 010 0x8 

9 1001 011 0x9 

10 1010 012 0xa 

11 1011 013 0xb 

12 1100 014 0xc 

13 1101 015 0xd 

14 1110 016 0xe 

15 1111 017 0xf 

Conversion between binary, octal, and hexadecimal is fairly trivial using the above table. For example, to convert the 

binary number 11100101010 to octal, break the number into groups of three bits starting from the least significant bit 

(i.e., 11 100 101 010), then convert each three-bit group to octal (i.e., 3452) and add a leading 0 to tell C it is octal (i.e., 

03452). To convert the number to hexadecimal, break it into groups of four bits (i.e., 111 0010 1010), convert each 

group (i.e., 72a) and add a leading "0x" (i.e., 0x72a). 

To convert between octal and hexadecimal, it is probably easiest to go via binary. 

To convert from binary to decimal, note that n-th bit from the least-significant is worth 2 to the power n. So, 

11100101010 

\\\\\\\\\\\_ 0 x 1 = 0 

\\\\\\\\\\_ 1 x 2 = 2 

\\\\\\\\\_ 0 x 4 = 0 

\\\\\\\\_ 1 x 8 = 8 

\\\\\\\_ 0 x 16 = 0 

\\\\\\_ 1 x 32 = 32 

\\\\\_ 0 x 64 = 0 

\\\\_ 0 x 128 = 0 

\\\_ 1 x 256 = 256 

\\_ 1 x 512 = 512 

\_ 1 x 1024 = 1024 

Floating-point numbers 

Total = 1832 

Scientific computation often requires the use of floating-point numbers, e.g., 1.23. There is an IEEE standard that 

describes how floating-point numbers can be represented as a group of bytes. The basic idea is to allocate a certain 

number of bits to the exponent (base-2), and the rest to the mantissa (which has been normalised to be between 0.5 and 

1). Both the exponent and mantissa have sign bits. 

Floating-point numbers typically come in 4, 8, or 16 byte sizes. In C these are usually called "float", "double", and 

"long double" respectively. 

For any given number of bytes allocated to store a floating-point number, there is clearly a tradeoff between the range 

of the exponent, and the precision of the mantissa: the more bits you use for the exponent, the fewer are left over for the 

mantissa. We will see later how this compromise has been chosen by the IEEE for the various byte sizes.




Note that there are some redundant bit patterns in the representation of floating-point numbers: e.g., 0.0 with any 

exponent is still zero. One of these bit patterns is used as a flag to indicate that a calculation produced an illegal result. 

This bit pattern is called "NaN" which stands for "not a number". E.g, if you take the square root of a negative number, 

you will receive the result NaN. Any operation on a NaN will continue to produce a NaN (e.g., if you subtract two 

NaNs you will still have a NaN), this is useful to propagate such error conditions in your programs so that you are 

aware when they affect the final result of a calculation. 

GNU/Linux Fundamentals 

The operating system 

The operating system of a computer is, somewhat loosely speaking, the program that is always operating in the 

background on the computer. It handles the interaction with the hardware, and running other programs for the user(s) of 

the computer. The operating system that we will be using is called GNU/Linux. Other examples of operating systems 

are Windows XP, OpenBSD, and FreeBSD. 

The shell 

The shell is an interactive program that allows the user to interface with the operating system. This is the program that 

provides the command-line prompt (it also helps you run programs, allows you to use the arrow keys on your keyboard 

to edit the command-line, and a bunch of other useful things). The shell that we will be using is called bash, it is the 

standard shell that is used with GNU/Linux. Other examples of shells are ksh, tcsh, and zsh. 

Files 

A file consists of two components: 

File data 

The data in a file is simply a list of bytes. The list may be empty (i.e., the file may have no data). The list of bytes 

might be used for a variety of different purposes, for example, it may be a C program, an e-mail message, a 

photograph, a song, a movie. In principle there is no way to distinguish the purpose of a file's data by looking at 

it. In practice, you can usually make a pretty good guess. To list the data in a file, you can, e.g., use "cat myfile". 

File metadata 

The metadata associated with a file consists of miscellaneous additional information that the operating system 

keeps with the file. For example: the name of the file, the number of bytes in the file, the time at which the file 

was created, which user owns the file, and the file permissions (i.e., the list of users who read and/or write the 

file, and whether the file is executable (i.e., is a program that can be run)). The term "metadata" means "data 

about data". To list the metadata associated with a file, you can, e.g., use "stat myfile". 

File names and file extensions 

When you create a file, you give it a name. The name is a list of bytes. In principle the name could be bizarre, such as 

" _1 `", but in practice you should choose simple, lowercase, names without spaces, e.g., "thesis.txt", "photo.jpg", 

"assignment0001.c". Note the convention of suffixing a file name with an extension that gives a clue as what the file is 

being used for. So, C programs invariably end with the ".c" extension. 

The GNU/Linux filesystem 

Directories (folders) 

To assist with organising groups of files, GNU/Linux has the concept of a directory (or folder). A directory is actually 

just a file (this adheres to the UNIX philosophy of making things simple and logical). A directory contains a list of the 

file names that are "in" the directory. Directories can contain directories. This leads to a hierarchical filesystem, where, 

in order to fully specify the location of a file, you have to specify the list of directories to follow to reach the file. By 

convention a slash character '/' is used to separate the directory names when giving the fully qualified filename. "/" itself 

is known as the root directory - there are no directories above the root directory in the filesystem. "/thesis.txt" would




then refer to file "thesis.txt" in the root directory. "/home/fred/thesis/chapter4.txt" would be the file "chapter4.txt" in the 

directory "thesis", in the directory "fred", in the directory "home", in the root directory. 

The Windows operating system uses letters such as "C:", "D:", etc to specify different filesystems (which might be in 

on different hard disks). GNU/Linux places these filesystems within the "/" hierarchy, i.e., the floppy disk might be 

accessible as "/mnt/floppy", the CD-ROM might be "/mnt/cdrom". 

The current working directory 

GNU/Linux has the concept of the current working directory. You can specify a file in the current working directory 

by just giving the file name. For convenience, "./" refers to the current working directory, and "../" refers to the 

directory above the current working directory. The fully-qualified filename of a file in the current working directory 

can be written as, e.g., "./thesis.txt". 

Programs 

To get a computer to do something for you, you create and run a program. You will be doing this many times during 

this course. To create a program, you type in the program's source code, and then use, e.g, the C compiler (another 

program, which someone else wrote) to turn the source code into a program. The program itself is just a file. The file's 

metadata tells the operating system that the file is executable (i.e., that it is a program able to be run by the computer). 

To run the program, you just type its fully qualified filename. 

GNU/Linux commands 

GNU/Linux provides you with many hundreds of programs, often called commands, but they are really exactly like the 

programs that you can create yourself (actually, some of the commands are a subset of the bash shell, but this is a 

detail). The commands range from listing the contents of a directory ("ls"), listing the contents of a file ("cat"), the C 

compiler ("gcc"), to the GNU image manipulation program ("gimp"). 

To run a program, you simply type its filename. In the case of your own programs, you should use the fully qualified 

filename (e.g., "./a.out"). In the case of programs provided by GNU/Linux, the shell has a list of default directories 

where it searches for programs, so you normally only need to type the filename (without any directories). E.g., the "ls" 

program to list directories can be run by simply typing "ls", or "/bin/ls", or, if your current working directory happened 

to be "/bin", you could type "./ls". 

Input and Output 

In general, for a program to do useful work, it needs to be provided with input of some sort, and to produce output. 

Input to a program can take many forms: e.g., typing at the keyboard, a file containing data, movements of a mouse, or 

something more exotic such as audio input from a sound card, or video from a camera. 

Output from a program can also take many forms, e.g., text appearing on your computer screen, a graph on your screen, 

a file of results, or something more exotic such as audio output, or movement of a robotic arm. 

GNU/Linux provides the following input and output methods for all programs (I am covering all of the possibilities 

here, but you are only expected to really understand the first two at this stage): 

Standard input 

Standard input (or STDIN) is a stream of bytes provided to the program. The program can read from the stream, 

and can determine when there is no more data. Standard input comes from one of three sources depending on 

how you run your program. For example, suppose that your program is called "a.out" (the default name assumed 

by the C compiler), then 

./a.out 

./a.out < file 

standard input comes from the keyboard 

standard input comes from a file called "file"




command | ./a.out 

standard input comes from the standard output of "command" 

Standard output 

Standard output (or STDOUT) is a stream of bytes produced by a program. Standard output goes to one of three 

locations depending on how you run your program. E.g., 

./a.out 

./a.out > file 

./a.out | command 

standard output appears on the computer screen as text 

standard output is sent to the file called "file" 

standard output is sent to the standard input of "command" 

Standard error 

Standard error (or STDERR) is a stream of bytes produced by a program. Usually it contains error messages 

(e.g., 'divide by zero') that are not normally considered as part of the program output. Standard error can be sent 

to one of three locations. E.g., 

./a.out 

./a.out 2> file 

./a.out 2>&1 | command 

standard error appears on the computer 

screen as text, intermingled with the standard output 

standard error is sent to the file called "file" 

standard error (and standard output) is sent to 

the standard input of "command" 

Command-line arguments 

Anything you type after the program name, and before any special shell characters such as '>' and '|', is made 

available to your program as a command-line argument. These can be a useful source of additional input to a 

program. E.g., 

./a.out 23.45 albatross 7 

In the above example, "23.45", "albatross", and "7" are three command-line arguments.We will see later how you 

can read them from within a C program. 

Exit status 

Another form of output that is available to all programs is an integer exit status. By convention, this is used to 

indicate whether the program ran successfully (in which case the exit status is zero) or if an error occurred (in 

which case the exit status is nonzero and its value can be decoded to determine what the error was). The exit 

status of a program is stored in the shell variable called "$?" (don't worry if you don't understand this at this 

stage). 

Putting it all together 

Here are some more complex examples of program input/output: 

./a.out 2.1 < data.txt > results.txt 

"2.1" is an argument, standard input 

comes from "data.txt", standard output 

goes to "results.txt"




./a.out > results.txt 2> error.txt 

there are no arguments, standard input 

is from the keyboard, standard output 

to "results.txt" and standard error to 

"error.txt" 

Command Toolkit 

Here are some of the basic GNU/Linux commands that you will be using: 

cat myfile - concatenate a file (i.e., list the file's data on standard output) 

less myfile - like "cat", but allows you to scroll up and down in the file using the page-up, and page-down keys. 

Exit with 'q'. 

rm myfile - remove a file (i.e., delete the file; it is gone for good) 

mv myfile newname - renames a file 

cp myfile copyOfMyFile - makes a copy of a file, with a different name 

cd mydir - change directory 

mkdir mynewdir - make a new directory 

rmdir mydir - remove the directory (you have to remove the files in it first) 

ls - list the files and directories in the current working directory 

pwd - print the name of the current working directory 

gcc myprog.c - compiles your program 

./a.out - runs the program compiled as above 

man mytopic - displays the manual pages about "mytopic" 

man -k mytopic - searches the manual pages for any information about "mytopic" 

gedit myprog.c - starts up a simple GUI editor 

logout - logs out of the computer 

The on-line manual pages 

Use "man -k keyword" to search for information about "keyword". For example, suppose you want to know 

something about random number generators: "man -k random" on a GNU/Linux system will return output 

containing lines like these: 

ppmspread 

(1) - displace a portable pixmap's pixels by a random amount 

rand 

(3) - random number generator 

RAND_bytes 

(3ssl) - generate random data 

random 

(3) - random number generator 

random 

(4) - kernel random number source devices 

RAND_pseudo_bytes [RAND_bytes] (3ssl) - generate random data 

rand [sslrand] (1ssl) - generate pseudo-random bytes 

rand [sslrand] (3ssl) - pseudo-random number generator 

seed48 [drand48] (3) - generate uniformly distributed pseudo-random numbers 

The number in parentheses is called the section number (e.g., section 3 is reserved for C functions, section 4 for 

low-level kernel interfaces), and you need to specify this if there are entries in multiple volumes. So, e.g., to read 

about the kernel random number source devices, you would type "man 4 random".




Fundamentals of C 

Character constants, escape sequences, and string constants 

A character constant in C is a single character (or an escape sequence such as \n) enclosed in single quotes, e.g., 'A'. It 

occupies one byte of storage. 

The value of a character constant is the numeric value of the character in the computer's character set (e.g., 'A' has the 

value 65). In 99.99% of cases this is the ASCII character set, but this is not defined by the C standard! 

But how do you represent a character such as a single quote itself, or one that doesn't print (such as the ASCII BEL 

character)? The answer is to use an escape sequence. 

For reference, here is a program to print out all the special escape sequences. Have a look at it now. 

In addition, you can specify any 8-bit ASCII character using either \ooo or \xhh where òoo' is an octal number (with 

from 1 to 3 digits), and 'xhh' is a hexadecimal number (with 1 or 2 digits). For example, \x20 is the ASCII character for 

SPACE, so ' ' and '\x20' are identical. 

String constants are a sequence of zero or more characters, enclosed in double quotes. For example, "test", "", "this is 

an invalid string" are all valid strings (you can't always believe what a string tells you!). String constants are stored in 

the computer's memory as a sequence of numbers (usually from the ASCII character set), and are terminated by a null 

byte (\0). So, "test" would appear in memory as the numbers 116, 110, 115, 116, 0. 

A consequence of the null byte terminator is that a C character string can never contain a null byte. In practice, this isn't 

often a limitation. 

We have already used some examples of strings in our programs, e.g, "hello world\n" was a null-terminated character 

string that we sent to the printf function above. 

The above program also shows how to add comments to your C program. 

Comments 

To make your program more readable, it is good practice to include comments. These do not influence the operation of 

the program, nor are they printed out when the program runs, they are simply to document things that you think are 

important. 

Comments are delimited by `/*' and `*/', or from '//' to the end of the line. Any text between the first `/*' and the next 

`*/' is ignored by the compiler, irrespective of the position of line breaks. Note that this means that comments do not 

nest. Here are some examples of the valid use of comments: 

/* This is a comment */ 

// and so is this 

/* here is another one 

that spans two lines */ 

i = /* a big number */ 123456; 

Here are some problems that can arise when using comments: 

i = 123456; /* a comment starts here 

i = i + 2; this statement is also part of the comment */ 

/* this is a comment /* and so is this */ but this will generate an error */ 

The fact that comments don't nest is a real nuisance if you want to comment-out a whole section of code. In this case, it 

is simplest to use pre-processor directives to do the job. For example: 

i = 123456; 

#if 0 // The next two lines are ignored by the compiler. 

i = i + 2; 

j = i * i; 

#endif 

Compiling, linking, executing, etc




Source files 

C source files contain the C program that you write. The files invariably end with the suffix ".c". You compile them 

with the C compiler to produce... 

Object files 

Have a ".o" suffix, and are the result of compiling a source file. They contain the machine language equivalent of the C 

program, but they are not ready to execute yet, since they need to be linked against libraries of other C object files 

containing any functions that are used, but not coded for, in the source file (e.g., the printf function). Linking can be 

done for you by the C compiler, or can be a separate step if you wish. After linking, you have an... 

Executable file 

This file is ready to run, by typing its name at the command prompt. 

Makefiles - the easy way to automate compilation 

When developing software, a great deal of time can be save by using the UNIX make utility. 

The idea behind make is that you should be able to compile a program simply by typing make (followed by ENTER). 

For this to work, you have to tell the computer how to build your program, by storing instructions in a Makefile. 

While this step takes some effort, you are amply repaid by the time savings later on. make is particularly useful for large 

programs that consist of numerous source files: make only recompiles the file(s) that need recompiling (it works out 

which ones to process by looking at the dates of last modification). 

Here is an example of a simple Makefile: 

test: test.o; gcc test.o -o test 

Notes: 

1. A Makefile consists of a series of lines containing targets, dependencies, and shell commands for updating the 

targets. 

2. In the above example, 

test is the target, i.e., the executable program, 

test.o is the dependency (i.e., the file that test depends on), and 

gcc test.o -o test is the shell command for making test from its dependencies. 

3. Important note: test is a very bad name for a program since it conflicts with the UNIX shell command of the 

same name, hence if you type "test" it will run the UNIX command rather than your program. To get around this 

problem, simply use "./test" (if "test" is in your current directory), or rename the program. It is always a good 

idea to use the leading "./" since it avoids a lot of subtle problems that can be hard to track down. 

Makefiles can get much more complicated than this simple example. Here is the next step in complexity, showing the 

use of a macro definition and a program that depends on multiple files. 

OBJS = main.o sub1.o sub2.o 

main: $(OBJS); gcc $(OBJS) -o main 

It is well worth learning a little bit about make since it can save a lot of time! 

How to compile and run a C program 

Short Instructions 

1. 

Type in your program into your favourite editor (eg emacs): "emacs filename.c &" and save it. 

e.g. for the hellow world program described in lectures you would type into the editor: 

#include 

int main(void) { 

printf("hello world\n");




return 0; 

} 

2. 

Compile: "gcc filename.c" 

Fix any errors, save, and recompile until compilation successful. 

3. Run: "./a.out" 

Examine the output to see if it is correct. If not, examine the program carefully to ascertain where you 

migth have gone wrong. 

Full Instructions (for experts) 

1. 

2. 

3. 

4. 

5. 

6. 

Create a new directory for the program (not essential, but it helps to be organised). 

Use whatever editor you want to generate the program (make sure the program file has a suffix of '.c'). 

Construct a Makefile for the program (not essential, but worth doing). 

Type make to compile the program (if you don't have a Makefile, you will need to type the compiler command 

yourself). 

Run the program by typing its name. 

Locate and fix bugs, then go to step 4. 

Here is a complete example of the above process for the ``Hello world'' program: 

cd 

mkdir hello 

cd hello 

cat > hello.c




const double pi = 3.14159265358979323846; 

double e, d = pi/2; 

} 

e = sin(d); 

printf("The sine of %f is %f\n", d, e); 

return 0; 

This program produces the result: 

The sine of 1.570796 is 1.000000 

Aside: the file /usr/include/math.h defines M_PI to be the double precision value of pi. So, use this rather than typing 

the value of pi yourself. 

The GNU C debugger "gdb" 

When debugging a program, a common technique is to insert "print" statements to display the values of variables. A 

much more powerful technique is to use the GNU C debugger, "gdb". Other C compilers will have debuggers as well. 

Firstly, you must compile your program with the debug switch (-g). It is also a good idea to disable optimisation with 

-O0. 

gcc -g -O0 main.c 

Then run your program using gdb: 

gdb a.out 

The following are some of the more common commands that are available: 

help 

- obtain help 

help {arg} 

- obtain help on topic {arg} 

list 

- list 10 lines around current position 

list {line} 

- list 10 lines around line number {line} 

list {beg},{end} 

- list lines from line number {beg} to {end} 

run arg1 arg2 ... 

- begin execution of the program, with the arguments 

break line 

- suspend execution at the given line number 

step 

- steps (and executes) one line in your program 

print {exp} ... 

- print the value of the given expressions 

printf "string", exp, ... - print expressions using format string(C) 

quit 

- quit gdb 

 

- repeats the last command (very useful with "step") 

A program to give information on C data types 

Like other programming languages, C has a variety of different data types. The ANSI standard defines the data types 

that must be supported by a compiler, but it doesn't tell you details such as the range of numbers that each type can 

represent, or the number of bytes of storage occupied by each type. These details are implementation dependent, and 

defined in the two system include-files "limits.h" and "float.h". To find out what the limits are, try running the 

following program with your favourite C compiler. 

/* A program to print out various machine-dependent constants */ 

/* Michael Ashley / UNSW / 03-Mar-2003 */ 

#include /* for printf definition */ 

#include /* for CHAR_MIN, CHAR_MAX, etc */ 

#include /* for FLT_DIG, DBL_DIG, etc */ 


printf("char %d bytes %d to %d \n", sizeof(char ), CHAR_MIN, CHAR_MAX ); 

printf("unsigned char %d bytes %d to %d \n", sizeof(unsigned char ), 0 , UCHAR_MAX ); 

printf("short %d bytes %hi to %hi \n", sizeof(short ), SHRT_MIN, SHRT_MAX ); 

printf("unsigned short %d bytes %hu to %hu \n", sizeof(unsigned short), 0 , USHRT_MAX ); 

printf("int %d bytes %i to %i \n", sizeof(int ), INT_MIN , INT_MAX ); 

printf("unsigned int %d bytes %u to %u \n", sizeof(unsigned int ), 0 , UINT_MAX ); 

printf("long %d bytes %li to %li \n", sizeof(long ), LONG_MIN, LONG_MAX ); 

printf("unsigned long %d bytes %lu to %lu \n", sizeof(unsigned long ), 0 , ULONG_MAX ); 

printf("float %d bytes %e to %e \n", sizeof(float ), FLT_MIN , FLT_MAX ); 

printf("double %d bytes %e to %e \n", sizeof(double ), DBL_MIN , DBL_MAX ); 

printf("precision of float %d digits\n", FLT_DIG); 

printf("precision of double %d digits\n", DBL_DIG);




} 

return 0; 

Notes: 

sizeof looks like a function, but it is actually a built-in C operator (i.e., just like +,-,*). The compiler replaces 

sizeof(data-type-name) (or, in fact, sizeof(variable)) with the number of bytes of storage allocated to the 

data-type or variable. 

The 'unsigned' data types are useful when you are refering to things which are naturally positive, such as the 

number of bytes in a file. They also give you a factor of two increase in the largest number that you can represent 

in a given amount of space. 

An "unsigned char" is a single-byte C data type that can represent positive integers from 0 to 255. A "signed 

char" uses the most-significant bit of the byte as a sign-bit (0 indicates a positive number, 1 indicates a negative 

number), leaving the remaining 7 bits to encode the absolute value of the number; the range of signed chars is 

therefore -128 to +127. 

Most of the time you don't need to worry about how many bytes are in each data type, since the limits are usually OK 

for normal programs. However, a common problem is that the "int" type can be either 2-bytes or 4-bytes in length, 

depending on the compiler. 

Here is the output of the preceeding program when run on a GNU/Linux PC, using gcc: 

char 1 bytes -128 to 127 

unsigned char 1 bytes 0 to 255 

short 2 bytes -32768 to 32767 

unsigned short 2 bytes 0 to 65535 

int 4 bytes -2147483648 to 2147483647 

unsigned int 4 bytes 0 to 4294967295 

long 4 bytes -2147483648 to 2147483647 

unsigned long 4 bytes 0 to 4294967295 

float 

4 bytes 1.175494e-38 to 3.402823e+38 

double 

8 bytes 2.225074e-308 to 1.797693e+308 

precision of float 6 digits 

precision of double 15 digits 

Constants; binary, octal, hexademical, and decimal 

We have already seen how to write character constants and strings. Let's now look at other types of constants: 

int 

123, -1, 2147483647, 040 (octal), 0xab (hexadecimal) 

unsigned int 123u, 2147483648, 040U (octal), 0X02 (hexadecimal) 

long 

123L, 0XFFFFl (hexadecimal) 

unsigned long 123ul, 0777UL (octal) 

float 

1.23F, 3.14e+0f 

double 1.23, 2.718281828 

long double 1.23L, 9.99E-9L 

Note: 

Integers are automatically assumed to be of the smallest type that can represent them (but at least an "int"). For 

example, 2147483648 was assumed by the compiler to be an "unsigned int" since this number is too big for an 

"int". Numbers too big to be an "unsigned int" are promoted to "long" or "unsigned long" as appropriate. 

An integer starting with a zero is assumed to be octal, unless the character after the zero is an "x" or "X", in 

which case the number in hexadecimal. 

Unsigned numbers have a suffix of "u" or "U". 

Long "int"s or "double"s have a suffix of "l" or "L" (it is probably better to use "L" so that it isn't mistaken for the 

number one). 

Floating-point numbers are assumed to be "double"s unless they have a suffix of "f" or "F" for "float", or "l" or 

"L" for "long double". 

It pays to be very careful when specifying numbers, to make sure that you do it correctly, particularly when dealing 

with issues of precision. For example, have a look at this program (precision1.c, precision2.c): 

#include 


double r; 

r = 1.0 + 0.2F;




} 

r = r - 1.2F; 

printf("%22.16e\n", r); 

return 0; 

The program gives at output of "-4.4703483581542969e-08", not zero as you might expect. 

EXERCISE: Run this program and verify for yourself that the answer is indeed not zero! Re-write the program so that 

the numerical values in the code (i.e. 0.2, 1.2 etc) are double rather than float, and verify that the result of execution is 

to give the answer 0.000000000000000e+00. 

The lesson to be learnt here is when writing constants, always think carefully about what type you want them to be, and 

use the suffixes "U", "L", and "F" to be explicit about it. It is not a good idea to rely on the compiler to do what you 

expect. Don't be surprised if different machines/compilers give different answers if you program sloppily. 

Conversion between integers and floating point numbers 

In C, as in all computer languages, there are rules that the compiler uses when a program mixes integers and floating 

point numbers in the same expression. 

Let's look at what happens if you assign a floating point number to an integer variable (convert1.c): 

#include 


int i, j; 

i = 1.99; 

j = -1.99; 

printf("i = %d; j = %d\n", i, j); 

return 0; 

} 

This program produces the result "i = 1; j = -1". Note that the floating point numbers have been truncated and not 

rounded. 

When converting integers to floating-point, be aware that a "float" has fewer digits of precision than an "int", even 

though they both use 4 bytes of storage (on normal PCs). This can result in some strange behaviour, e.g. (convert2.c), 

#include 


unsigned int i; 

float f; 

i = 4294967295; /* the largest unsigned int */ 

f = i; /* convert it to a float */ 

printf("%u %20.13e %20.13e\n", i, f, f - i); 

return 0; 

} 

This program produces the following output when compiled with "gcc": 

4294967295 4.2949672960000e+09 1.0000000000000e+00 

Rather than rely on automatic type-conversion, you can be explicit about it by using a type-cast operator, e.g., 

f = (float)i; 

This converts the number or variable or parethesised expression immediately to its right, to the indicated type. It is a 

good idea to use type-casting to ensure that you leave nothing to chance. 

Operators 

C has a rich set of operators (i.e., things like + - * /), which allow you to write complicated expressions quite 

compactly. 

Unary operators




Unary operators are operators that only take one argument. The following list will be confusing when you first see it, so 

just keep it mind for reference later on. 

Some of the operators we have already seen (e.g., "sizeof()"), others are very simple (e.g., +, -), others are really neat 

(e.g., ~, !), others are useful for adding/subtracting 1 automatically (e.g., ++i, --i, i++, i--), and the rest involve pointers 

and addressing, which will be covered in detail later. 

sizeof(i) the number of bytes of storage allocated to i 

+123 positive 123 

-123 negative 123 

~i one's complement (bitwise complement) 

!i logical negation (i.e., 1 if i is zero, 0 otherwise) 

*i returns the value stored at the address pointed to by i 

&i 

returns the address in memory of i 

++i 

adds one to i, and returns the new value of i 

--i 

subtracts one from i, and returns the new value of i 

i++ 

adds one to i, and returns the old value of i 

i-- 

subtracts one from i, and returns the old value of i 

i[j] 

array indexing 

i (j) 

calling the function i with argument j 

i.j 

returns member j of structure i 

i->j 

returns member j of structure pointed to by i 

Binary operators 

Binary operators work on two operands ("binary" here means 2 operands, not in the sense of base-2 arithmetic). 

Here is a list. All the usual operators that you would expect are there, with a whole bunch of interesting new ones. 

Note: 

+ addition 

- subtraction 

* multiplication 

/ division 

% remainder (e.g., 2%3 is 2), also called 'modulo' 




Left argument Right argument & | ^ && || 

0011 0101 0001 0111 0110 1 1 

1110 1010 1010 1110 0100 1 1 

0000 1111 0000 1111 1111 0 1 

Assignment operators 

Assignment operators are really just binary operators. The simplest example is "=", which takes the value on the right 

and places it into the variable on the left. But C provides you with a host of other assignment operators which make life 

easier for the programmer. 

= assignment 

+= addition assignment 

-= subtraction assignment 

*= multiplication assignment 

/= division assignment 

%= remainder/modulus assignment 

&= bitwise AND assignment 

|= bitwise OR assignment 

^= bitwise exclusive OR assignment 

= right shift assignment 

So, for example, "i += j" is equivalent to "i = i + j". The advantage of the assignment operators is that they can reduce 

the amount of typing that you have to do, and make the meaning clearer. This is particularly noticeable when, instead 

of a simple variable such as "i", you have something complicated such as "position[wavelength + correction_factor * 

2]"; 

The thing to the left of the assignment operator has to be something where a result can be stored, and is known as an 

"lvalue" (i.e., something that can be on the left of an "="). Valid "lvalues" include variables such as "i", and array 

expressions. It doesn't make sense, however, to use constants or expressions to the left of an equals sign, so these are 

not "lvalues". 

The comma operator 

C allows you to put multiple expression in the same statement, separated by a comma. The expressions are evaluated in 

left-to-right order. The value of the overall expression is then equal to that of the rightmost expression. 

For example, 

Example 

Equivalent to 

------- ------------- 

i = ((j = 2), 3); i = 3; j = 2; 

myfunct(i, (j = 2, j + 1), 1); j = 2; myfunct(i, 3, 1); 

The comma operator has the lowest precedence, so it is always executed last when evaluating an expression. Note that 

in the example given comma is used in two distinct ways inside an argument list for a function. 

Both the above examples are artificial, and not very useful. The comma operator can be useful when used in "for" and 

"while" loops as we will see later. 

Precedence and associativity of operators 

The precedence of an operator gives the order in which operators are applied in expressions: the highest precedence 

operator is applied first, followed by the next highest, and so on. 

The associativity of an operator gives the order in which expressions involving operators of the same precedence are 

evaluated. 

The following table lists all the operators, in order of precedence, with their associativity: 

Operator 

Associativity 

-------- -------------




() [] ->> . left-to-right 

- + ++ -- ! ~ * & sizeof (type) right-to-left (note: unary +, -, *, and &) 

* / % left-to-right 

+ - left-to-right 

> left-to-right 

< >= left-to-right 

== != left-to-right 

& 

left-to-right 

^ 

left-to-right 

| left-to-right 

&& 

left-to-right 

|| left-to-right 

?: right-to-left 

= += -= *= /= %= &= ^= |= = right-to-left 

, left-to-right 

Note: the unary operators + (introducing a positive number), - (introducing a negative number), * (pointer 

dereferencing), and & (address of); appear twice in the above table. The unary forms (on the second line) have higher 

precedence that the binary forms. 

Operators on the same line have the same precedence, and are evaluated in the order given by their associativity. 

To specify a different order of evaluation you can use parentheses. In fact, it is often good practice to use parentheses to 

guard against making mistakes in difficult cases, or to make your meaning clear. 

Side effects in evaluating expressions 

It is possible to write C expressions that give different answers on different machines, since some aspects of 

expression-evaluation are not defined by the ANSI standard. This is deliberate since it gives the compiler writers the 

ability to choose different evaluation orders depending on the underlying machine architecture. You, the programmer, 

should avoid writing expressions with side effects. 

Here are some examples: 

myfunc(j, ++j); /* the arguments may be the same, or differ by one */ 

array[j] = j++; /* is j incremented before being used as an index? */ 

i = f1() + f2(); /* the order of evaluation of the two functions 

is not defined. If one function affects the 

results of the other, then side effects will 

result */ 

Evaluation of logical AND and OR 

A useful aspect of C is that it guarantees the order of evaluation of expressions containing the logical AND (&&) and 

OR (||) operators: it is always left-to-right, and stops when the outcome is known. For example, in the expression "1 || 

f()", the function "f()" will not be called since the truth of the expression is known regardless of the value returned by 

"f()". 

It is worth keeping this in mind. You can often speed up programs by rearranging logical tests so that the outcome of 

the test can be determined as soon as possible. 

Another good example is an expression such as "i >= 0 && i




Rules for identifiers: 

Legal characters are a-z, A-Z, 0-9, and _. 

Case is significant. 

The first character must be a letter or _. 

Identifiers can be of any length (although only the first 31 characters are guaranteed to be significant). 

The following are reserved keywords, and may not be used as identifiers: 

auto double int struct 

break else long switch 

case enum register typedef 

char extern return union 

const float short unsigned 

continue for signed void 

default goto sizeof volatile 

do if static while 

Here are some examples of legal identifiers: 

i 

count 

numberOfAardvarks 

number_of_aardvarks 

MAX_LENGTH 

Variable storage classes 

Variables in C belong to one of two fundamental storage classes: "static" or "automatic". 

A static variable is stored at a fixed memory location in the computer, and is created and initialised once when the 

program is first started. Such a variable maintains its value between calls to the block (a function, or compound 

statement) in which it is defined. 

An automatic variable is created, and initialised, each time the block is entered (if you jump in half-way through a 

block, the creation still works, but the variable is not initialised). The variable is destroyed when the block is exited. 

Variables can be explicitly declared as "static" or "auto" by using these keywords before the data-type definition. If you 

don't use one of these keywords, the default is "static" for variables defined outside any block, and "auto" for those 

inside a block. 

Actually, there is another storage class: "register". This is like "auto" except that it asks the compiler to try and store the 

variable in one of the CPU's fast internal registers. In practice, it is usually best not to use the "register" type since 

compilers are now so smart that they can do a better job of deciding which variables to place in fast storage than you 

can. 

Const - volatile 

Variables can be qualified as "const" to indicate that they are really constants that can be initialised, but not altered. 

Variables can also be termed "volatile" to indicate that their value may change unexpectedly during the execution of the 

program (e.g., they may be hardware registers on a PC, able to be altered by external events). By using the "volatile" 

qualifier, you prevent the compiler from optimising the variable out of loops. 

Extern 

Variables (and functions) can also be classified as "extern", which means that they are defined external to the current 

block (or even to the current source file). An "extern" variable must be defined once (and only once) without the 

"extern" qualifier. 

As an example of an "extern" function, all the functions in "libm.a" (the math library) are external to the source file that 

calls them.




An example showing storage class and variable scope (storageclass.c) 

#include 

int i; /* i is static, and visible to the entire program */ 

extern j; /* j is static, and visible to the entire program */ 

static int k; /* k is static, and visible to the routines in this source file */ 

int main(void){} 

void func(void) { 

int m = 1; 

auto int n = 2; 

/* i.e., a function that takes no arguments, and 

doesn't return a value */ 

/* m is automatic, local to this function, and initialised 

each time the function is called */ 

/* n is automatic, local to this function, and initialised 

each time the function is called */ 

static int p = 3; /* p is static, local to this function, and initialised 

once when the program is first started up */ 

extern int q; /* q is static, and defined in some external module */ 

} 

for (i = 0; i < 10; i++) { 

int m = 10; /* m is automatic, local to this block, and initialised 

each time the block is entered */ 

printf("m = %i\n", m); 

} 

Initialisation of variables (initialisation.c) 

A variable is initialised by equating it to a constant expression on the line in which it is defined. For example 

int i = 0; 

"static" variables are initialised once (to zero if not explicitly initialised), "automatic" variables are initialised when the 

block in which they are defined is entered (and to an undefined value if not explicitly initialised). 

The "constant expression" can contain combinations of any type of constant, and any operator (except assignment, 

incrementing, decrementing, function call, or the comma operator), including the ability to use the unary "&" operator 

to find the address of static variables. 

Here are some valid examples: 

#include 

#include 

int i = 0; 

int j = 2 + 2; 

int k = 2 * (3




Mathematical functions 

Prototypes for the math functions are in the system include-file "math.h", so you should put the line 

#include 

in any C source file that calls one of them. 

Here is a list of the math functions defined by the ANSI standard: 

sin(x) 

sine of x 

cos(x) 

cosine of x 

tan(x) 

tan of x 

asin(x) 

arcsine of x, result between -pi/2 and +pi/2 

acos(x) 

arccosine of x, result between 0 and +pi 

atan(x) 

arctan of x, result between -pi/2 and +pi/2 

atan2(y,x) 

arctan of (y/x), result between -pi and +pi 

hsin(x) 

hyperbolic sine of x 

hcos(x) 

hyperbolic cosine of x 

htan(x) 

hyperbolic tan of x 

exp(x) 

exponential function 

log(x) 

natural logarithm 

log10(x) logarithm to base 10 

pow(x,y) 

x to the power of y (x**y in FORTRAN) 

sqrt(x) 

the square root of x (x must not be negative) 

ceil(x) 

ceiling; the smallest integer not less than x 

floor(x) 

floor; the largest integer not greater than x 

fabs(x) 

absolute value of x 

ldexp(x,n) 

x times 2**n 

frexp(x, int *exp) returns x normalized between 0.5 and 1; the exponent of 2 is in *exp 

modf(x, double *ip) returns the fractional part of x; the integral part goes to *ip 

fmod(x,y) 

returns the floating-point remainder of x/y, with the sign of x 

In the above table, "x", and "y" are of type "double", and "n" is an "int". All the above functions return "double" results. 

C libraries may also include "float" versions of the above. For example, "fsin(x)" takes a float argument and returns a 

float result. 

Note that the arguments for trignometric functions are in radians and not degrees! 

A complete list of mathematical functions supported by C can be found in the file Cmathfunctions.txt. 

The "for" loop 

The basic looping construct in C is the "for" loop. 

Here is the syntax of the "for" statement: 

for (initial_expression; loop_condition; loop_expression) statement; 

and this is precisely equivalent to the following code fragment: 

initial_expression; 

while (loop_condition) { 

statement; 

loop_expression; 

} 

An example will, hopefully, clear this up: 

for (i = 0; i< 100; i++) printf("%i\n", i); 

which simply prints the first 100 integers. If you want to include more that one statement in the loop, use curly brackets 

to delimit the body of the loop, e.g., 

for (i = 0; i < 100; i++) { 

j = i * i; 

printf("i = %i; j = %i\n", i, j);




} 

A related control mechanism is the do-while loop 

initial_expression; 

do{statement;} while (loop_condition) 

For example (dowhile.c): 

i = 0; 

do {printf("The value of i is now %d\n", i); 

i = i + 1; 

} while (i < 5); 

Here is a slightly more interesting example which converts Centigrade to Farenheit (tempconv.c). 

And here is a more complex example, illustrating nested loops, with an if thrown in for good measure: 

ASCII circle printer. 

And here is another program illustrating the power of loops: Mandelbrot image generator 

Semicolons in compound statements 

The last statement in the body of statements in a "for" loop (or, in fact, in any other compound statement) must be 

terminated with a semicolon. 

For example, 

for (i = 0; i < 10; i++) { 

x = i * i; 

x += 2; /* the semicolon is required here */ 

} /* do not use a semicolon here */ 

Variables defined within compound statements 

You can create variables that are local to a compound statement by declaring the variables immediately after the 

leading curly bracket. 

If statements 

if (expression) 

statement 

else if (expression) 

statement 

else if (expression) 

statement 

else 

statement 

Where "statement" is a simple C statement ending in a semicolon, or a compound statement ending in a curly bracket. 

Some examples will help: 

if (i == 6) x = 3; 

if (i == 6) { 

x = 3; 

} 

if (i) x = 3; 

if (i == 2) 

x = 3; 

else if (i == 1) 

if (j) 

y = 2;




else /* NOTE: possible ambiguity here, the compiler uses */ 

y = 3; /* the closest if statement that does not have */ 

else { /* an else clause */ 

x = 4; 

y = 5; 

} 

Example Program ifelse.c 

Break and continue 

These two statements are used in loop control. 

"break" exits the innermost current loop. 

"continue" starts the next iteration of the loop. 

Here is an examaple program with break and continue (breakcon.c) 

Infinite loops 

Here are three ways of writing an infinite loop, choose whichever one you like: 

for (;;;) { 

statement; 

} 

while (1) { 

statement; 

} 

do { 

statement; 

} while (1); 

Infinite loops can be useful. They are normally terminated using a conditional test with a "break" or "return" statement. 

For example: 

while (1) { 

statement; 

if (logical_condition) break; 

} 

goto 

C does have a "goto" statement, but you normally don't need it. Using "goto" is usually a result of bad programming 

(although in some circumstances it can be a sensible choice). 

The switch statement 

The switch statement allows you to choose between multiple paths of execution depending on the value of an 

expression. It could be replaced by a series of if statements, but switch can make the flow of the program easier to 

understand. 

switch (expression) { 

case const-expression: statements 

case const-expression: statements 

default: statements 

} 

Beware that control will flow from one case to the next unless you explicitly include a break statement to cause control 

to skip to the end of the switch statement. 

Here is an example of using switch: 

switch (i) { 

case 0: printf("i had the value zero\n"); 

break; 

case 1: 

case 2: printf("i was either one or two\n");




} 

break; 

default: printf("i was outside the range 0 to 2\n"); 

Here is an example with multiple switch values (switch.c) 

Assert as a tool for improving program reliability 

By now you will be aware that it is very easy to write buggy programs in C, and it may not always be obvious at 

runtime that there is a problem. Checking for all possible error conditions in a program can be tedious. A useful tool to 

improve program reliability is the "assert" construct defined in the system include-file "assert.h". An example 

(assertexample.c) will clarify this: 

#include 

#include 


int i; 

} 

printf("enter a positive integer > "); 

assert(1 == scanf("%i", &i)); 

assert(i > 0); 

printf("you successfully entered %i\n", i); 

return 0; 

If the expression inside parentheses after "assert" is true, the program continues as normal, if the expression is false (i.e, 

zero), then the program aborts with an error message. 

This may sound like a trivial concept, but by using "assert" extensively, you will be surprised at how your 

programming improves. 

It is particularly useful to use "assert" to check for conditions that should never arise in your program, e.g., suppose you 

write a function that should only ever be called with an argument between -1.0 and 1.0, then, use "assert" in the 

function to check this, e.g., 

double myAsin(double x) { 

assert(fabs(x) < 1.0); 

... 

} 

Also, you can use assert to check for unexpected conditions such as failure to open a file, or failure to allocate memory, 

e.g., 

assert(NULL != (fp = fopen("data.dat", "r"))); 

assert(NULL != (p = malloc(1024)); 

Formatted output: printf description 

The format string given to the "printf" function may contain both ordinary characters (which are simply printed out) 

and conversion characters (beginning with a percent symbol, %, these define how the value of an internal variable is to 

be converted into a character string for output). 

Here is the syntax of a conversion specification: 

%{flags: - + space 0 #}{minimum field width}{.}{precision}{length modifier}{conversion character} 

Flags: "-" means left-justify (default is right-justify), "+" means that a sign will always be used, " " prefix a 

space, "0" pad to the field width with zeroes, "#" specifies an alternate form (for details, see the manual!). Flags 

can be concatenated in any order, or left off altogether.




Minimum field width: the output field will be at least this wide, and wider if necessary. 

"." separates the field width from the precision. 

Precision: its meaning depends on the type of object being printed: 

Character: the maximum number of characters to be printed. 

Integer: the minimum number of digits to be printed. 

Floating point: the number of digits after the decimal point. 

Exponential format: the number of significant digits. 

Length modifer: "h" means short, or unsigned short; "l" means long or unsigned long; "L" means long double. 

Conversion character: a single character specifying the type of object being printed, and the manner in which it 

will be printed, according to the following table: 

Character Type 

Result 

d,i int signed decimal integer 

o int unsigned octal (no leading zero) 

x, X int unsigned hex (no leading 0x or 0X) 

u int unsigned decimal integer 

c int single character 

s char * characters from a string 

f double floating point [-]dddd.pppp 

e, E double exponential [-]dddd.pppp e[=/-]xx 

g, G double floating if the exponent less than -4 or >= precision 

else exponential 

p void * pointer 

n int * the number of characters written so far by printf 

is stored into the argument (i.e., not printed) 

% print % 

Here is an example program (printfexample.c) to show some of these effects (try running the program yourself): 

#include 


int i = 123; 

double f = 3.14159265358979323846; 

printf("i = %i\n", i); 

printf("i = %o\n", i); 

printf("i = %x\n", i); 

printf("i = %X\n", i); 

printf("i = %+i\n", i); 

printf("i = %8i\n", i); 

printf("i = %08i\n", i); 

printf("i = %+08i\n", i); 

printf("f = %f\n", f); 

printf("f = %10.3f\n", f); 

printf("f = %+10.3f\n", f); 

printf("f = %g\n", f); 

printf("f = %10.6g\n", f); 

printf("f = %10.6e\n", f); 

} 

return 0; 

Notes: 

There are differences between the various implementations of "printf". The above should be a subset of the 

available options. Consult the manual (e.g., "info gcc") for your particular C compiler to be sure. 

"printf" is actually an integer function, which returns the number of characters written (or a negative number if an 

error occurred). 

Formatted input: scanf overview 

Input in C is similar to output: the same conversion characters are used. The main difference is that you use a routine 

called "scanf", and you must pass the addresses of the variables you want to change, not their values. 

For example: 

scanf("%d", &i); /* reads an integer into ì' */ 

scanf("%i", &i); /* reads an integer (or octal, or hex) into ì' */ 

scanf("%f %i", &f, &i); /* reads a double followed by an integer */ 

scanf is actually an integer function, which returns the number of input items assigned (or the integer constant EOF 

(defined in stdio.h) if the end-of-file is reached or an error occurred).




The ampersand character "&" is a unary operator that returns the address of the thing to its right. Recall that a C 

function can not alter the value of its arguments (but there is nothing stopping it altering the value that is pointed to by 

one of its arguments!). 

scanf details 

The scanf function is used for reading a line of input from the standard input stream (usually the users terminal) and 

extracting from it one or more integers, floating point numbers, character strings, etc. 

The first argument to scanf is a format description, which describes how the input line should be decoded. For example, 

if you want to read an integer, the format would be "%i" (note that this allows you to enter numbers in decimal, 

hexadecimal or octal (e.g., 123, 0x123, 0123)). If you want to read a floating point number, the format would be "%f". 

If you want to read an integer followed by a floating point number, the format woud be "%d %f". If the two numbers 

were separated by a comma on the input line, you would use "%d, %f". 

The arguments after the first are pointers to where the data should be stored. 

Examples (where i and j are integers, x and y are floats, and fruit and colour are char arrays): 

Input data 

scanf function call to read the data 

123 scanf("%i", &i); 

123 0x456 scanf("%i %i", &i, &j); 

123. 456. scanf("%i. %i.", &i, &j); 

123.1 456.1 scanf("%f %f", &x, &y); 

e=2.71 pi=3.14 

scanf("e=%f pi=%f", &x, &y); 

apples red 

scanf("%s %s", fruit, colour); 

Guido ate 2 apples 

scanf("Guido ate %i %s", &i, fruit); 

It is important to check the return value of the scanf function - this is the number of items that were successfully 

extracted from the input. It may be less than the number you expected, or even zero. If this happens, the input stream 

pointer is left at the point at which the error occurred, and you can try reading again. scanf can also return the special 

integer constant EOF to indicate an end-of-file condition. 

Variations on scanf 

fscanf is very similar to scanf except it takes a FILE argument ("FILE" is a structure that is defined in the 

include file - you don't have to worry about what precisely "FILE" is, just follow the examples) to specify which input 

stream to read from, e.g., 

FILE *fp; 

float x, y; 

fp = fopen("input.dat", "r"); 

fscanf(fp, "%f %f", &x, &y); 

"sscanf" takes input from a character array rather than an input stream, e.g., 

char input="1.23 4.56"; 

sscanf(input, "%f %f", &x, &y); 

Improving the ease of interaction with your programs - GNU readline 

While it is possible to use "scanf" for reading input typed by a person, you can greatly improve the user experience by 

using the GNU readline package.




The GNU readline package provides some very convenient features such as line editing and history. This means, e.g., 

that you can use the arrow keys to move within a line, and between previously entered lines. You can also use, e.g., 

CTRL-A to go to the beginning of the line, CTRL-E to go to the end, and CNTRL-R to search backwards for particular 

strings. See "man readline" (on a GNU/Linux computer) for a complete description. 

Here is an example program showing the use of GNU readline. 

Arrays 

In mathematics and physics you often need to group numbers together into arrays, i.e., 1, 2, or higher-dimensional 

structures. 

Arrays are declared in C as follows (for example): 

int counts[100]; 

float temperature[1024]; 

double mat[2][3]; 

In this example, "count" is an array that can hold 100 integers, "temperature" is an array that can hold 1024 floats, and 

"mat" is a 2x3 array (or matrix) of doubles. 

Note carefully that the first element of an array in C is element number 0 (rather than 1 as in FORTRAN). While this 

may be confusing to die-hard FORTRAN programmers, it is really a more natural choice. A side-effect of this choice is 

that the last element in an array has an index that is one less that the declared size of the array. This is a source of some 

confusion, and something to watch out for. 

So, "counts[0]" is the first element of the array "counts", defined above. And "counts[99]" is the last element. There are 

100 elements in total. 

Initialising arrays 

To initialise an array to have particular numerical values, specify the initial values in a list within curly brackets. For 

example: 

int primes[100] = 

{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 

47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 

109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 

191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 

269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 

353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 

439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 

523, 541}; 

float temp[1024] = {5.0F, 2.3F}; 

double trouble[] = {1.0, 2.0, 3.0}; 

In this example, "primes" is initialised with the values of the first 100 primes (trust me!), and the first two elements 

(temp[0] and temp[1]) of "temp" are initialised to 5.0F and 2.3F respectively. The remaining elements of "temp" are set 

to 0.0F. Note that we use the trailing "F" on these numbers to indicate that they are floats, not doubles. 

The array "trouble" in the above example contains three double numbers. Note that its length is not explicitly declared, 

C is smart enough to calculate the length for you. 

Static arrays that are not explicitly initialised are set to zero. Automatic arrays that are not explicitly initialised, have 

undefined values. 

Multidimensional arrays 

Multidimensional arrays are declared and referenced in C by using multiple sets of square brackets. For example, 

int table[2][3][4]; 

"table" is a 2x3x4 array, with the rightmost array subscript changing most rapidly as you move through memory (the




first element is "table[0][0][0]", the next element is "table[0][0][1]", and the last element is "table[1][2][3]"). 

When writing programs that use large arrays (say, more than a few megabytes), you should be very careful to ensure 

that array references in your program are as close as possible to being consecutive (otherwise you may get severe 

swapping problems, i.e., the memory is swapped to hard disk, resulting in more than a factor of 1000 slow-down). 

Initialisation of multidimensional arrays 

This is best demonstrated with an example: 

int mat[3][4] = { 

} 

Operations on arrays 

{ 0, 1, 2, 23}, 

{ 3, 4, 5, 24}, 

{ 6, 7, 8, 25} 

If you wish to, e.g., add two arrays together, you have to do it element-by-element yourself using a "for" loop. For 

example: 

int a0[10], a1[10], a2[10]; 

for (int i = 0; i < 10; i++) { 

a0[i] = a1[i] + a2[i]; 

} 

Even something as simple as setting one array to have the same value as another has to be done with a "for" loop. E.g. : 

int a0[10], a1[10]; 

for (int i = 0; i < 10; i++) { 

a0[i] = a1[i]; 

} 

Character arrays 

Arrays can be of any type. In C, you manipulate character strings as arrays of characters, and operations on the strings 

(such as concatenation, searching) are done by calling special libary functions (e.g., strcat, strcmp). 

Note that when calling a string-manipulation function, the end of the string is taken as the position of the first NUL 

character (which has a numerical value of zero) in the string. 

Character arrays can be initialised in the following ways: 

char str[] = {'a', 'b', 'c'}; 

char prompt[] = "please enter a number"; 

In the example, "str" has length 3 bytes, and "prompt" has length 22 bytes, which is one more than the number of 

characters in "please enter a number", the extra byte is used to store a NUL character (zero) as an indication of the end 

of the string. "str" does not having a trailing NUL. 

Functions 

A function call allows a commonly used piece of code to be made available through a single calling statement. For 

instance, to calculate the square of a number we could use the following (floating point) function: 

float sqr(inval) /* square a float, return a float */ 

float inval; 

{ 

float square; 

square = inval * inval; 

return(square);




This would be called from the main part of the program by a statement like: 

y = sqr(x); 

In this case the result of the function call ("square") is assigned to the value "y". The complete calling sequence is 

illustrated in the program floatsq.c. 

Prototypes 

The function call described above is the "classical" mode of function use, as described, for eaxmple, in Kernigan & 

Ritchie. A prototype is a model of a real thing, and in C you have the ability to define a model of each function for the 

compiler. The compiler can then use the model to check each call to the function and so determine if the correct number 

of arguments have been used, and whether they are of the correct type. By using prototypes you are therefore letting the 

compiler do some additional error checking. The program floatsq2.c does the same calculation as the program floatsq.c 

as above, but this time uses prototypes. Note the placement of the prototype for the square function "sqr": 

float sqr(float inval); 

at the start of the program, before the start of "main". 

Pointers 

If there is one thing that sets C apart from many other languages, it is the pervasive use of pointers. 

To understand pointers, it helps to have a mental image of the way in which the memory of a computer is organised. 

e.g., imagine that your computer has 128MB of memory; then you could label each one of these 128 million bytes of 

memory with a number: byte 0 would be the first byte in memory, byte 1 the second, and so on up to 128 million. 

These numbers are referred to as the addresses of the memory cells (note: I'm actually simplifying the true situation; in 

a modern computer each separate process has it's own virtual address-space, and there are several layers of indirection 

involved in going from the addresses that C uses, to the actual physical piece of memory on the computer 

motherboard). A pointer is then simply a variable containing a memory address. This is a very natural concept when 

you think in terms of the hardware of the computer, which is a viewpoint that C encourages. 

Suppose that "i" is declared as "int i". Then the address in memory at which "i" is stored is given by "&i", where "&" is 

the unary address operator. One says that "&i" is a pointer to "i". On a 32-bit computer such as a typical PC, a pointer 

is 4 bytes in length (therefore allowing up to 4GB of memory to be addressed; this is an uncomfortably small number). 

Pointers are declared by specifying the type of thing they point at. For example, 

int *p; 

defines "p" as a pointer to an int (so, therefore, "*p" is an int, hence the form of the declaration). 

Note carefully, then by declaring "p" as in the above example, the compiler simply allocates space for the pointer (4 

bytes for 32-bit computers, 8 bytes for 64-bit computers), not for the variable that the pointer points to! This is a very 

important point, and is often overlooked. Before using "*p" in an expression, you have to ensure that "p" is set to point 

to a valid int. This "int" must have had space allocated for it, either statically, automatically, or by dynamically 

allocating memory at run-time (e.g., using the "malloc" function, see later). 

Here is a rather convoluted example showing some of the uses of pointers (pointerexample1.c): 

#include 


int m = 0, n = 1, k = 2; 

int *p; 

char msg[] = "hello"; 

char *cp; 

/* Line 7 */ 

p = &m; /* Line 8: p now points to m */ 

*p = 1; /* Line 9: m now equals 1 */ 

k = *p; /* Line 10: k now equals 1 */ 

cp = msg; /* Line 11: cp points to the first character of msg */ 

*cp = 'H'; /* Line 12: change the case of the 'h' in msg */ 

cp = &msg[4]; /* Line 13: cp points to the 'o' */ 

*cp = 'O'; /* Line 14: change its case */ 

printf("m = %d, n = %d, k = %d\nmsg = \"%s\"\n", m, n, k, msg); 

return 0;




} 

By using the GNU debugger, gdb, we can identify the actual memory locations that the C compiler uses to store the 

variable in the above program. It is then possible to draw up a table (see below) showing the detailed operation of the 

program. You should follow this example very carefully so that you understand what is going on. 

Variable Starting Value after the specified line has executed 

location 7 8 9 10 11 12 13 14 

------------------------------------------------------------------ 

m 0xbffff98c 0 0 1 1 1 1 1 1 

n 0xbffff988 1 1 1 1 1 1 1 1 

k 0xbffff984 2 2 2 1 1 1 1 1 

p 0xbffff980 ? 98c 98c 98c 98c 98c 98c 98c 

msg 0xbffff970 hello hello hello hello hello Hello Hello HellO 

cp 0xbffff96c ? ? ? ? 970 970 976 976 

The memory addresses are shown in hexadecimal, and only the rightmost three digits are shown (e.g., 98c) for 

convenience in displaying the table. Memory addresses are always given in units of bytes. The above table shows, for 

example, that the variable "m" occupies the memory addresses 0xbffff98c, 0xbffff98d, 0xbffff98e, and 0xbffff98f. You 

will note that 16 bytes are reserved for "msg", even though "msg" only needs 6 bytes (5 for the characters in "hello" 

and one for the trailing null byte). The C compiler won't guarantee any particular placement in memory, and may leave 

gaps, particularly to align numbers with 4-byte boundaries (i.e., so that their memory addresses are divisable by 4). 

Note the very important point that the name of an array ("msg" in the above example), if used without an index, is 

considered to be a pointer to the first element of the array. In fact, an array name followed by an index is exactly 

equivalent to a pointer followed by an offset. For example (pointerexample2.c), 

#include 


char msg[] = "hello world"; 

char *cp; 

cp = msg; 

cp[0] = 'H'; 

*(msg+6) = 'W'; 

} 

printf("%s\n", msg); 

printf("%s\n", &msg[0]); 

printf("%s\n", cp); 

printf("%s\n", &cp[0]); 

return 0; 

Note, however, that "cp" is a variable, and can be changed, whereas "msg" is a constant, and is not an lvalue (i.e., it can 

not be used on the left of an equals sign). 

Pointer arithmetic 

You can add and subtract numbers to/from pointers, but the results may surprise you. For example, if "p" is a pointer to 

an "int" (i.e., "int *p"), then "p++" actually adds 4 to "p"! The logic behind this is that adding one to "p" should make it 

point to the next integer, which has a memory address that is 4 bytes greater. 

Pointers used as arguments to functions 

We have already seen that C functions can not alter their arguments. The simple reason for this is that when you call a 

function, the compiler passes copies of the values of the arguments to the function. For example, if you have a function 

as follow: 

void myfunc(int n) { 

n = 6; 

} 

and you call it with 

int i = 0; 

myfunc(i); 

Then "myfunc" is passed the number 0. "myfunc" knows nothing whatsoever about where this number came from. The




zero is stored in the local variable n. "myfunc" changed n to 6, but this doesn't have any effect on i in the main 

program. 

In order to change the value of a variable in the main program, you need to give the function a pointer to the variable. 

In other words, you tell the function whereabouts in memory the variable is. With this crucial bit of information, the 

function can go ahead and alter the value of the variable. For example, 

void myfunc(int *n) { 

*n = 6; 

} 

and you call it with 

int i = 0; 

myfunc(&i); 

Now, the value of i in the main program will be altered by the call to "myfunc". 

Here is an example of a function that swaps the values of its two arguments, which must be pointers to integers: 

void swap_args(int *pi, int *pj) { 

int temp; 

} 

temp = *pi; 

*pi = *pj; 

*pj = temp; 

If all the asterisks were left out of this routine, then it would still compile OK, but its only effect would be to swap local 

copies of "pi" and "pj". 

Pointers and arrays 

Consider an array declared as follows: 

int a[10]; 

Then, as we have seen, a[0] is the first element of the array, and a[9] is the last. The address of the first element is 

"&a[0]", and by convention this can also be written simply as "a". 

Structures 

Any programming language worth its salt has to allow you to manipulate more complex data types than simply ints, 

floats, and arrays. You need to have structures of some sort. This is best illustrated by example: 

To define a structure, use the following syntax: 

struct time { 

int hour; 

int minute; 

float second; 

}; 

This defines a new data type, called "time", which contains 3 fields (stored in consecutive locations in the computer's 

memory). 

Logically, it makes sense to associate the various fields of "time" together, since they are part of one physical entity: 

the time. 

To declare a variable of type "time", you could do the following: 

struct time t1[10], t2 = {12, 0, 0.0F}; 

This defines "t1" to be an array of 10 "time" structures. The variable "t2" is also a "time" structure, and is initialised to 

12 hours, 0 minutes, 0.0 seconds. 

An example program using structures is addtime.c. This defines the above "time" structure, and then adds two times




together, taking account of any overflow in the seconds and the minutes. The assert command is also used to ensure 

that the times that are input are also valid. 

To refer to an individual element of a structure, you use the following syntax: 

t1[0].hour = 12; 

t1[0].minutes = 0; 

t1[0].second = t2.second; 

t1[1] = t2; 

Alternatively, and somewhat confusingly, if "p", say, is a pointer to a structure, then you can access the elements of the 

structure using notation like "p->hour". For example: 

struct time *p; 

// p is a pointer to a time structure 

struct time t = {1,2,3.0}; // t is a time structure 

p = &t; 

// p now points to t 

p->hour = 6; 

// this statement is exactly equivalent to... 

t.hour = 6; // ...this one 

Structures can contain any type, including arrays and other structures. A common use is to create a linked list by having 

a structure contain a pointer to a structure of the same type, for example, 

struct person { 

char *name[80]; 

char *address[256]; 

struct person *next_person; 

}; 

where "person.next_person" is actually a pointer to a "person" struct. Sounds complicated, but it is simple when you 

get the hang of it. 

Structures are the first step towards object-oriented programming (OOP), where most of the programming effort is 

devoted to defining the best representation of the data, and the operations which can be performed on the data. 

In physics, a structure can be used to represent the state of a physical system. For example, here is a structure that 

might be used to represent the state of a particle in three-dimensional space: 

struct particle { 

double mass; 

double position[3]; 

double velocity[3]; 

}; 

We could then define an array of 1000 particles as simply as: 

struct particle p[1000]; 

You can take this one step further and define your own type (like "int", "double") by doing: 

typedef struct { 

double mass; 

double position[3]; 

double velocity[3]; 

} particle; 

In which case you can define the array of 1000 particles as: 

particle p[1000]; 

String functions 

The standard C libraries include a bunch of functions for manipulating strings (i.e., arrays of chars). 

Note that before using any of these functions, you should include the line "#include " in your program. This 

include-file defines prototypes for the following functions, as well as defining special types such as "size_t", which are 

operating-system dependent. 

char *strcat(s1, s2) Concatenates s2 onto s1. null-terminates s1. 

Returns s1. 

char *strchr(s, c) 

Searches string s for character c. Returns a




int strcmp(s1, s2) 

pointer to the first occurence, or null pointer 

if not. 

Compares strings s1 and s2. Returns an integer 

that is less than 0 is s1 < s2; zero if s1 = s2; 

and greater than zero is s1 > s2. 

char *strcpy (s1, s2) Copies s2 to s1, returning s1. 

size_t strlen(s) 

char *strncat(s1, s2, n) 

int strncmp(s1, s2, n) 

char *strncpy(s1, s2) 

char *strrchr(s, c) 

int strstr(s1, s2) 

size_r strspn(s1, s2) 

size_r strcspn(s1, s2) 

char *strpbrk(s1, s2) 

char *strerror(n) 

char *strtok(s1, s2) 

Returns the number of characters in s, excluding 

the terminating null. 

Concatenates s2 onto s1, stopping after `n' 

characters or the end of s2, whichever occurs 

first. Returns s1. 

Like strcmp, except at most `n' characters are 

compared. 

Like strcpy, except at most `n' characters are 

copied. 

Like strchr, except searches from the end of 

the string. 

Searches for the string s2 in s1. Returns a 

pointer if found, otherwise the null-pointer. 

Returns the number of consecutive characters 

in s1, starting at the beginning of the string, 

that are contained within s2. 

Returns the number of consecutive characters 

in s1, starting at the beginning of the string, 

that are not contained within s2. 

Returns a pointer to the first character in s1 

that is present in s2 (or NULL if none). 

Returns a pointer to a string describing the 

error code `n'. 

Searches s1 for tokens delimited by characters 

from s1. The first time it is called with a 

non-null s1, it writes a NULL into s1 at the 

first position of a character from s2, and 

returns a pointer to the beginning of s1. When 

strtok is then called with a null s1, it finds 

the next token in s1, starting at one position 

beyond the previous null. 

File operations 

defines a number of functions that are used for accessing files. Before using a file, you have to declare a 

pointer to it, so that it can be referred to with the functions. You do this with, for example, 

FILE *in_file; 

FILE *out_file; 

where "FILE" is a type defined in (it is usually a complicated structure of some sort). To open a file, you do, 

for example: 

in_file = fopen("input_file.dat", "r"); 

out_file = fopen("output_file.dat", "w"); 

Note the use of "r" to indicate read access, and "w" to indicate write access. The following modes are available: 

"r" read 

"w" write (destroys any existing file with the same name) 

"rb" read a binary file 

"wb" write a binary file (overwriting any existing file) 

"r+" opens an existing file for random read access, and 

writing to its end 

"w+" opens a new file for random read access, and writing to 

its end (destroys any existing file with the same name) 

Once the file is opened, you can use the following functions to read/write it:




int fread(void *ptr, int size, int nelm, FILE *fp) Reads nelm elements 

of data, each size bytes long, storing them 

at the location pointed to by ptr. Returns 

the number of elements read. 

int fwrite(void *ptr, int size, int nelm, FILE *fp) Writes nelm elements 

of data, each size bytes long, from the 

location pointed to by ptr. Returns 

the number of elements written. 

int getc(FILE *fp) 

Returns the next character from `fp', or 

EOF on error or end-of-file. 

int putc(int c, FILE *fp) 

Write the character c to the file `fp', 

returning the character written, or EOF on 

error. 

int fscanf(FILE *fp, char *format, ...) Like scanf, except input is taken 

from the file fp. 

int fprintf(FILE *fp, char *format, ...) Like printf, except output is written 

to the file fp. 

char *fgets(char *line, int n, FILE *fp) Gets the next line of input from the 

file fp, up to `n-1' characters in length. 

The newline character is included at the end 

of the string, and a null is appended. Returns 

`line' if successful, else NULL if end-of-file 

or other error. 

int fputs(char *line, FILE *fp) Outputs the string `line' to the file fp. 

Returns zero is successful, EOF if not. 

When you have finished with a file, you should close it with 'fclose': 

int fclose(FILE *fp) 

Closes the file 'fp', after flushing any 

buffers. This function is automatically called 

for any open files by the operating 

system at the end of a program. 

It can also be useful to flush any buffers associated with a file, to guarantee that the characters that you have written 

have actually been sent to the file: 

int fflush(FILE *fp) 

Flushes any buffers associated with the 

file 'fp'. 

To check the status of a file, the following functions can be called: 

int feof(FILE *fp) 

int ferror(FILE *fp) 

void clearerr(FILE *fp) 

int fileno(FILE *fp) 

Returns non-zero when an end-of-file is read. 

Returns non-zero when an error has occurred, 

unless cleared by clearerr. 

Resets the error and end-of-file statuses. 

Returns the integer file descriptor associated 

with the file (useful for low-level I/O). 

Note that the above "functions" are actually defined as pre-processor macros in C. This is quite a common thing to do. 

The preprocessor, cpp 

Any line in your C source code that starts with the "#" (hash) chararacter is a preprocessor directive. Here are some 

examples of what is possible: 

#define PI 3.1415926535 

#define SQR(a) (sqrarg=(a),sqrarg*sqrarg) 

#include "filename" /* from the current directory */ 

#include /* from the system directories (as modified by the "-I" switch) */ 

#define DEBUG




/* defines the symbol DEBUG */ 

#ifdef DEBUG 

/* code here is compiled if DEBUG is defined */ 

#elif defined UNIX 

/* code here is compiled if DEBUG is not defined, and UNIX is defined */ 

#else 

/* code here is compiled if neither DEBUG or UNIX are defined */ 

#endif 

#if 0 

/* code here is never compiled */ 

#endif 

Command-line arguments, and returning a status to the shell 

Up until now, our "main" function has always been declared as "int main(void)". We have seen that the integer return 

value is available to the UNIX "shell" in the "$?" environment variable, and this can be used to alter the execution of 

scripts. It turns out that "main" can actually take arguments, and these are traditionally given as "int argc, char *argv[]", 

where argc is one plus the number of command-line arguments provided to the program, and argv is an array of 

character strings containing the arguments. argv[0] is usually the name of the program itself. An example 

(statusexample.c) will hopefully clarify this: 

#include 

int main(int argc, char *argv[]) { 

printf("this program is called '%s'\n", argv[0]); 

if (argc == 1) { 

printf("it was called without any arguments\n"); 

} else { 

int i; 

printf("it was called with %d arguments\n", argc - 1); 

for (i = 1; i < argc; i++) { 

printf("argument number %d was \n", i, argv[i]); 

} 

} 

exit(argc); 

} 

echo $? 

Allocating memory at run-time - malloc 

The situation often arises that you don't know the size of a data structure in your program at compile-time. For 

example, suppose you are writing a program to read an 2D image from a file into the computer's memory. If the size of 

the image can vary at run-time, it would be nice not to have to pre-define the size of the memory array in your program. 

C allows you to do this through the concept of allocating and deallocating memory. 

Allocating memory means requesting the computer to give you access to a contiguous block of memory. E.g., suppose 

you need a megabyte; you ask the computer "can I use a megabyte of memory, and if so, can you please tell me where 

the memory is?". You ask the question by calling the function "malloc", which takes a single argument: the number of 

bytes of memory that you want. malloc will attempt to allocate the memory, and if it is successful it will return a 

pointer to the first byte of the memory block (if malloc was unsuccessful, it returns NULL; this is a rare enough 

occurence for reasonable memory requests that you should handle it with "assert"). 

Deallocating memory means telling the computer "I have now finished using this block of memory that you previously 

allocated for me with malloc, so you can now have the memory back again for some other purpose". You deallocate 

memory by calling the "free" function. 

Some important points to remember when using malloc: 

malloc returns a pointer to void, i.e., "(void *)", which just means that malloc itself knows nothing about what 

the block of memory is to be used for, which is fair enough since you didn't tell it. However, you do have to let 

the C compiler know, and you do this by type-casting malloc's result (see the example below). 

You should always call "free" to deallocate any memory that you have finished with. Note carefully that the 

argument to "free" must be a pointer obtained from malloc, and that you can only call free once for a given block




of memory. 

After freeing memory, it is good practice to set any pointers that referred to it to NULL, this will prevent you 

from inadvertantly using deallocated memory. 

The memory that malloc allocates can have random data in it. If you need it to be zeroed, either do this yourself 

with a "for" loop of use the "calloc" function (see "man calloc"). 

Here is an example to get you started (mallocexample.c): 

#include 

#include 

#include 

/* Generates a two dimensional matrix, and fills it randomly with 

zeroes and ones. */ 


int xdim, ydim; 

int i, j; 

int *p, *q; 

// Obtain the dimensions of the matrix from the user. 

printf("x dimension of matrix? > "); 

scanf("%d", &xdim); 

printf("y dimension of matrix? > "); 

scanf("%d", &ydim); 

// malloc the memory needed for the matrix. Note the use of "sizeof(int)" 

// which allows the program to run on machines with a variety of word sizes. 

assert(NULL != (p = (int *)malloc(xdim * ydim * sizeof(int)))); 

// Fill the matrix with random 0s and 1s. 

for (i = 0; i < xdim * ydim; i++) { 

if (random() > RAND_MAX/2) { 

*(p+i) = 1; 

} else { 

*(p+i) = 0; 

} 

} 

// Now print the matrix out, showing the use of pointer arithmetic. 

for (i = 0; i < ydim; i++) { 

q = p + i * xdim; 

for (j = 0; j < xdim; j++) { 

printf("%1d", *(q++)); 

} 

printf("\n"); 

} 

// Free the memory we had allocated. 

free((void *)p); 

// And remove all information about where the memory used to be. 

// This is overkill in this example, since the program immediately 

// terminates, but it is good practice, in order to avoid 

// accidentally using free'ed memory. 

} 

p = q = NULL; 

return 0; 

Generating random numbers 

Random numbers are used in all sorts of ways in mathematics and physics. They are a fundamental tool that is often 

useful when conducting numerical simulations of physical systems with a computer. 

Writing a computer program to generate random numbers requires understanding some fairly subtle concepts. 

There are several different algorithms for generating a new random number given an initial random number. One of the




simplest (and best) algorithms for generating random integers is a multiplicative congruential algorithm: you simply 

multiply the current random number generator by a constant, and then take the modulus. The choice of the constant and 

the modulus is critical. A good choice is 16807 and 2147483647 (see Numerical Recipes in C, Press et al, 2nd Ed, page 

278). 

But how do you generate the initial random number, the so-called "seed", for the algorithm? 

One technique is to use the time (in milli or micro-seconds since the last integral second). This is easy to program, but 

has a few disadvantages, e.g, there are only 1000 or 1000000 different seeds, and these numbers are not necessarily 

equally likely to occur (e.g., the operating system might tend to start programs at particular times - see here for more 

details - randomnumber.c). 

Another technique on GNU/Linux systems is to read from the file /dev/random. /dev/random contains random numbers 

generated from "random" events such as the arrival time of internet packets, the movement of the mouse, the timing of 

user keystrokes, etc. If you "od -h /dev/random" ("od -h" prints out the file as hexadecimal characters) you will find 

that it will eventually stop, waiting for new events to generate randomness - move the mouse and it will continue. An 

alternative is /dev/urandom, which will continuously generate random numbers, using an algorithm if events are too 

infrequent. 

If you require very high quality random numbers, then you should read about the subject, perhaps using Numerical 

Recipies in C as a starting point. 

The C libraries provide the functions "srandom" and "random" to generate random integers between 0 and 

RAND_MAX (defined in "stdlib.h") inclusive. Use the on-line manual pages to learn about "srandom" and "random" 

(i.e., type "man srandom" at the prompt). Here is an example program (randomnumber.c) showing their usage, and 

seeding them with the time in microseconds. 

Quasi-random numbers 

For some applications, you don't want truly random numbers, but would prefer numbers that cover an n-dimensional 

space more unformly. For example, let's take a more detailed look at Monte-Carlo integration. 

When using random numbers, the error involved in Monte-Carlo integration goes down as 1/sqrt(N), where N is the 

number of points. This means, e.g., if you use 10 times as many points, the error in your estimation of the integral will 

be reduced by a factor of about 1/sqrt(10), or 0.32. Can we do better than this? It turns out that a simple uniform grid of 

points has an error that goes as 1/N, which is about 3 times better than the random case. However, a uniform grid has 

the undesirable property that you have to decide on the grid spacing in advance, and you can't simply add more points 

at a later time. 

What we need is a quasi-random number generator that chooses numbers that tend to fill the gaps in the sequence. 

There are several possibilities, and we will explore one designed by a Russian mathematician by the name of Sobel 

(sobel.c). 

Sorting 

A frequent requirement is to sort quantities (numbers, character strings, or more complicated structures) into some 

order (ascending, descending, or something more complex). The C library provides the function "qsort", an 

implementation of the famous quicksort algorithm. It is usually best to use "qsort" than write your own sorting 

algorithm. 

Here are two example programs showing the use of qsort. 

Multiple precision arithmetic 

As an example of how to program in C, let's explore the topic of multiple precision arithmetic. All the hard work will 

be done by GNU's Multiple Precision Arithmetic Library (GNU MP), written by Torbjorn Granlund. 

Multiple precision arithmetic allows you to perform calculations to a greater precision than the host computer will 

normally allow. The penalty you pay is that the operations are slower, and that you have to call subroutines to do all the 

calculations. 

GNU MP can perform operations on integers and rational numbers. It uses preprocessor macros (defined in gmp.h) to




define special data types for storing these numbers. MP_INT is an integer, and MP_RAT is a rational number. 

However, since a multiple precision number may occupy an arbitrarily large amount of memory, it is not sufficient to 

allocate memory for each number at compile time. GNU MP copes with this problem by dynamically allocating more 

memory when necessary. 

To begin with you need to initialise each variable that you are going to use. When you are finished using a variable, 

you should free the space it uses by calling a special function. 

MP_INT x, y; 

mpz_init(&x); 

mpz_init(&y); 

/* operations on x and y */ 

mpz_clear(&x); 

mpz_clear(&y); 

Let's now try a full program. For example, calculating the square root of two (roottwo.c) to about 200 decimal places: 

#include 

#include 


char two[450], guess[225]; 

int i; 

MP_INT c, x, temp, diff; 

two[0] = '2'; 

for (i = 1; i < sizeof(two)-1; i++) { 

two[i] = '0'; 

} 

two[i] = 0; 

mpz_init_set_str(&c, two, 10); 

guess[0] = '1'; 

for (i = 1; i < sizeof(guess)-1; i++) { 

guess[i] = '0'; 

} 

guess[i] = 0; 

mpz_init_set_str(&x, guess, 10); 

mpz_init(&temp); 

mpz_init(&diff); 

do { 

mpz_div(&temp, &c, &x); 

mpz_sub(&diff, &x, &temp); 

mpz_abs(&diff, &diff); 

mpz_add(&x, &temp, &x); 

mpz_div_ui(&x, &x, 2U); 

} while (mpz_cmp_ui (&diff, 10U) > 0); 

} 

printf("the square root of two is approximately "); 

mpz_out_str(stdout, 10, &x); 

printf("\n"); 

return 0; 

To compile, link, and run the program, use 

gcc -o two two.c -lgmp 

./two 

Gnuplot 

Gnuplot is a simple graphical plotting package. It is particuarly easy to plot graphs with it. These notes provide an 

introduction to its use. 

Numerical integration 

Here is an example of using numerical integration to follow the motion of a planet around the Sun: 

Writing fast programs




Reasons for worrying about how fast a program is include: 

if it is interactive, and reponsiveness is important, 

if it is pushing the boundaries of practicality (e.g., it might takes weeks to run). 

Hints on improving speed: 

use a clever algorithm. 

identify the time-critical parts of the program, and concentrate on improving those. 

examine memory use as well (e.g., try to reduce the amount of memory being used, and keep the usage localised 

as opposed to jumping around randomly). 

iterate most rapidly over the rightmost subscript in arrays, since this will lead to better localisation of memory 

use (and hence greater likelyhood of cache hits). 

use optimisation (e.g., -O1, -O2, or -O3; see "man gcc" for details). 

try optimising for small program size (-Os). 

in general, try not to be "too clever" - the C compiler can quite often do a better job if your program is clear. 

make the critical parts of the program as small (in terms of bytes of instructions) as possible - they will then be 

more likely to fit in the CPU's "instruction cache", resulting in better performance. 

declare often-called functions as "inline", which eliminates the expense of the function call and copying the 

arguments to/from the stack. This technique should be used with caution, since it makes the program larger, in 

which case it may no longer fit in the instruction cache. 

avoid printing out lots of unnecessary information. Formatting and writing text to the screen is quite CPU 

intensive. 

2D cellular automata - Conway's Game of Life 

In 1970, the mathematician John Horton Conway published the description of a simple 2D cellular automata which he 

called the Game of Life. Cells in a 2D grid were either "alive" or "dead", and their state in the next generation was 

determined by their current state and the number of nearest neighbours (from 0 to 8, inclusive). The rules were chosen 

to simulate some properties of biological systems (e.g., a cell would die of "overcrowding" if too many of its 

neighbours were alive, and die of "lack of support" if too few of its neighbours were alive). 

Here is a "simple" example of how to program the Game of Life on a computer. 

If we are interested in following the evolution of the game for many thousands of generations, it is advantageous to 

think of ways of speeding up our simple implementation. The first thing to try is to turn on full optimisation during the 

compilation, using the "-O4" switch. The next step is to make the critical functions "inline" (see here for the code). We 

can also think about the algorithm, and realise that a lot of our calculation of the number of nearest neighbours was 

involved in handling the special case of being on the boundary; we can re-write our program to separate out the 

boundary case, thereby allowing a simplified (faster) function to do most of the neighbour calculations, at the expense 

of increased complexity in handling the evolution from one generation to the next. Finally, we can try to use the fact 

that much of the Life "universe" tends to be sparsely populated, allowing us to produce a second 

neighbourhood-calculating function for the special case that the preceeding cell had no nearest neighbours. These are 

by no means the only speed-ups that are possible, but they give you some flavour of what is possible. 

The following table gives the time taken to follow one million generations using a 850MHz Pentium III computer and 

the various tricks from the last paragraph. 

Time Technique 

[seconds] 

-------------------------------------------------- 

330 Original program, no optimisation 

205 Original program, -O4 optimisation 

113 Inline function calls 

45 Boundary case handled separately 

37 Sparse population handled separately 

So, we have managed almost a factor of ten improvement in speed over our original attempt. 

To show why this might be useful, let's now alter our program to automatically identify glider patterns (those that can 

self-propagate). We will choose a brute-force technique where we randomly seed the centre of our rectangular 

"universe" with alive/dead cells, and then evolve the system until one of the boundary cells is hit. When this happens 

we stop the evolution and print a 21x21 grid around the cell, hopefully capturing a glider in full flight! Here is the code, 

and you can see that it is a straightforward variation on the earlier program. It typically finds a glider within 2 seconds




on a 850MHz Pentium III computer. 

C programming course, School of Physics, UNSW / Michael Burton / mgb@phys.unsw.edu.au

C programming notes - School of Physics

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?