C programming notes - School of Physics
C programming notes - School of Physics
C programming notes - School of Physics
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
1 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
PHYS2020 - Computational <strong>Physics</strong>, based on the C<br />
<strong>programming</strong> language<br />
Index<br />
Copyright (c) Michael Burton 2005, Michael Ashley 1994-2004.<br />
Example programs<br />
Why learn C?<br />
Some problems with C<br />
A brief history <strong>of</strong> C<br />
Simplest C program<br />
Hello World<br />
Computing fundamentals<br />
Bits, bytes and words<br />
32-bit, 64-bit<br />
Binary, octal, decimal, hexadecimal<br />
Floating point number<br />
GNU/Linux fundamentals<br />
Operating system<br />
Shell<br />
Files<br />
Filesystem<br />
Input and Output<br />
Command toolkit<br />
On-line manuals<br />
Fundamentals <strong>of</strong> C<br />
Constants and escape sequences and strings<br />
Comments<br />
Compiling, linking and executing<br />
Makefiles<br />
Compiling and Running<br />
Linking to libraries<br />
GNU debugger "gdb"<br />
C data types<br />
Constants: binary, octal, hexadecimal, decimal<br />
Conversion between integer and floating point<br />
Operators<br />
Unary operators<br />
Binary operators<br />
Assignment operators<br />
Comma operator<br />
Precedence and associativity<br />
Evaluation <strong>of</strong> logical AND and OR<br />
C identifiers<br />
Variable storage classes<br />
Initilialisation <strong>of</strong> variables<br />
Mathematical functions<br />
The "for" loop<br />
"If" statements<br />
Break and continue<br />
Goto<br />
Switch statements<br />
Assert for program reliability<br />
Formatted output: printf<br />
Formatted input: scanf<br />
GNU readline<br />
Arrays
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
2 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Character arrays<br />
Functions<br />
Prototypes<br />
Pointers<br />
Pointer arithmetic<br />
Pointers and arrays<br />
Structures<br />
String functions<br />
File operations<br />
The preprocesser, "cpp"<br />
Command line arguments<br />
Allocating memory at runtime, "malloc"<br />
Some applications<br />
Random numbers<br />
Quasi-random numbers<br />
Sorting<br />
Multiple precision arithmetic<br />
Gnuplot<br />
Numerical integration via gravity and orbits<br />
Program speed<br />
The game <strong>of</strong> "Life"<br />
Some Example C Programs<br />
The simplest possible C program.<br />
Hello world.<br />
C escape sequences.<br />
Mandelbrot image generator - illustrating "for" loops.<br />
ASCII circle printer.<br />
How to use GNU readline.<br />
Generating random numbers.<br />
Sorting using quicksort.<br />
Numerical integration - gravity.<br />
1D cellular automata - text output.<br />
1D cellular automata - image output.<br />
Game <strong>of</strong> Life - simple version<br />
Game <strong>of</strong> Life - inline function version<br />
Game <strong>of</strong> Life - boundary version<br />
Game <strong>of</strong> Life - special-case version<br />
Game <strong>of</strong> Life - version to find gliders<br />
Linear fitting program<br />
Bisection Method for finding roots <strong>of</strong> equations<br />
Interpolation Method for finding roots<br />
Newton-Raphson Method for finding roots<br />
Why learn C?<br />
1. C is the most widely used <strong>programming</strong> language.<br />
2. The basic ideas in C are common to many other <strong>programming</strong> languages.<br />
3. Even though FORTRAN is still in heavy use by physicists, it is on its way out.<br />
4. If you want to get a computer-related job outside the field <strong>of</strong> <strong>Physics</strong>, it is helpful to know C.<br />
5. Many scientific instruments are programmed in C (e.g., CCD cameras and special purpose interface cards<br />
invariably come with a C library interfaces).<br />
6. C is closer to assembly language, so you can have finer control over what the computer is doing, and thereby<br />
make faster programs.<br />
7. There is a free C compiler available (GNU C, gcc), that is <strong>of</strong> very high quality and that has been ported to<br />
numerous machines.<br />
8. UNIX is written in C, so it is easier to interface with UNIX-like operating systems (such as GNU/Linux) if you<br />
write in C.<br />
9. Once you have a working knowledge <strong>of</strong> C, you will find perl easy to learn, and perl is a very useful language to
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
3 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
know, but that is another story...<br />
Some problems with C<br />
The language was designed with writing operating systems in mind, and so it was not purpose-built for numerical<br />
work.<br />
There is no support for complex numbers (actually, as <strong>of</strong> the C-99 standard, there is!).<br />
Handling multi-dimensional arrays, understanding pointers, and string manipulation are difficult when you first<br />
encounter them.<br />
It is easy to write completely opaque code in C.<br />
You have to explicitly declare every variable that you use. This is actually not a problem at all! It is a feature.<br />
It is possible to bypass many <strong>of</strong> the protective features <strong>of</strong> the language, and get yourself into trouble<br />
(alternatively, this can be seen as an advantage, since the language imposes few limitations on what you can do).<br />
The ANSI C standard leaves a lot <strong>of</strong> details undefined. This makes C non-portable between different operating<br />
systems, compilers, and computer hardware. Java is better in this respect.<br />
It is not easy to write large bug-free programs in C. Java is <strong>of</strong>ten a better choice.<br />
A brief history <strong>of</strong> C<br />
C evolved from a language called B, written by Ken Thompson at Bell Labs in 1970. Ken used B to write one <strong>of</strong> the<br />
first implementations <strong>of</strong> UNIX. B in turn was a decendant <strong>of</strong> the language BCPL (developed at Cambridge (UK) in<br />
1967), with most <strong>of</strong> its instructions removed.<br />
In fact, so many instructions were removed in going from BCPL to B, that Dennis Ritchie <strong>of</strong> Bell Labs put some back<br />
in (in 1972), and called the language C.<br />
The famous book The C Programming Language was written by Kernighan and Ritchie in 1978, and has remained the<br />
definitive reference book on C.<br />
The original C was still too limiting, and not standardized, and so in 1983 an ANSI committee was established to<br />
formalise the language definition.<br />
By 1993 the ANSI standard became well accepted and almost universally supported by compilers. The GNU C<br />
compiler, gcc, has become, in many ways, the standard implementation <strong>of</strong> C.<br />
What about C++ and C#?<br />
C++ is a superset <strong>of</strong> C, incorporating many object-oriented <strong>programming</strong> ideas. However, it is not particularly widely<br />
used. Java is a better choice in most cases.<br />
C# is a Java-like language developed by Micros<strong>of</strong>t. Enough said.<br />
Interesting books<br />
Numerical Analysis by Burden and Faires.<br />
A New Kind <strong>of</strong> Science by Wolfram.<br />
Numerical Recipies in C<br />
Other on-line resources<br />
For coverage <strong>of</strong> typical problems encountered with C, have a look at the Comp.lang.c FAQ.<br />
For a rather good tutorial introduction to C, check out this document by Gordon Dodrill.<br />
The simplest possible C program<br />
Here it is.<br />
Hello world
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
4 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Before reading further, have a look at the generic Hello World, program written in C.<br />
But before we get too excited and start to write really useful programs, there are a few fundamental concepts that you<br />
will need to understand first.<br />
Computing Fundamentals<br />
Bits, bytes, and words<br />
Bit<br />
A bit is the smallest unit <strong>of</strong> information. It can be thought <strong>of</strong> as the answer to a yes/no question. A bit has the value 1 or<br />
0 (which can be equated with yes/no, true/false).<br />
Byte<br />
A byte consists <strong>of</strong> eight bits, ordered from the most significant bit (MSB) on the left, to the least significant bit (LSB) on<br />
the right. E.g., 011011000 is a byte. A byte can be interpretted as a base-2 number.<br />
There are 256 (i.e., 2 to the power 8) possible combinations <strong>of</strong> 8 bits, ranging from 00000000 to 11111111. These 256<br />
combinations could, in principle, be used to represent any 256 things (e.g., we could define 00000000 to be "apple" and<br />
00000001 to be 3.14159265359).<br />
In practice there are three commonly used interpretations <strong>of</strong> a byte:<br />
Words<br />
Unsigned byte<br />
An unsigned byte can represent integers from zero to 255.<br />
Signed byte<br />
A byte can also be used to represent integers from -128 to +127. Such a byte is called a signed byte, and the MSB<br />
is used to indicate the sign (0 for positive, 1 for negative), -128 is represented as 10000000, -1 is 11111111, 0 is<br />
0, and +127 is 01111111.<br />
Non-examinable: To find the bit pattern for a negative number, you first write down the representation <strong>of</strong> the<br />
number without the minus sign, then invert all the bits, then add one. e.g., -1 is 00000001 inverted (which is<br />
11111110), plus one (which gives 11111111). This representation is called "2's complement", and is used<br />
because (1) it makes binary arithmetic easier to implement in the computer hardware, and (2) it avoids the waste<br />
<strong>of</strong> a bit pattern in the case <strong>of</strong> 1000000, which you might think was -0.<br />
A byte as an ASCII character<br />
The American Standard Code for Information Interchange (ASCII) associates the various possible bit patterns in<br />
a byte with characters (i..e., things like 'A', 'B', 'C'...'Z', 'a', 'b', 'c', ...'z', '*', '&', '@', etc). For example, 'A' is stored<br />
as 65 (decimal). The ASCII table allows you to translate a simple text document into a list <strong>of</strong> bytes (which you<br />
might then store in a file, see below). For a complete list <strong>of</strong> the ASCII table, type "man ascii".<br />
In order to represent numbers outside the range <strong>of</strong> an unsigned or signed byte, multiple bytes can be grouped together<br />
to form words <strong>of</strong> various lengths. Words usually contain a power <strong>of</strong> two bytes, e.g., 2, 4, 8, 16, and so on. They may<br />
either be signed or unsigned. For example, an unsigned 2-byte word can represent numbers from 0 to 65536 (2 to the<br />
power 16), a signed 2-byte word ranges from -32768 to + 32767. Words are sometimes called "short words", "words",<br />
and "long words", for the 2, 4, and 8-byte variations, but this usage is not uniform.<br />
32-bit computer, 64-bit computers<br />
Most PCs are 32-bit computers, i.e., they process 32-bits (4 bytes) <strong>of</strong> data in one operation. In a few years, 64-bit
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
5 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
computer will become very common.<br />
Binary, octal, decimal, hexadecimal<br />
Integers can be written in various bases. The common bases used with computers are binary (base-2), octal (base-8),<br />
decimal (base-10), and hexadecimal (base-16). Octal and hexadecimal are <strong>of</strong>ten useful when doing low-level<br />
<strong>programming</strong>, since they encode a binary number in a smaller amount <strong>of</strong> typing than using binary notation.<br />
The following table compares the various numbering systems (note that you can't directly specify a binary constant in<br />
the C <strong>programming</strong> language); octal constants are distinguished by a leading zero; hexadecimal constants by a leading<br />
"0x"):<br />
Decimal Binary Octal Hexadecimal<br />
--------------------------------<br />
0 0000 000 0x0<br />
1 0001 001 0x1<br />
2 0010 002 0x2<br />
3 0011 003 0x3<br />
4 0100 004 0x4<br />
5 0101 005 0x5<br />
6 0110 006 0x6<br />
7 0111 007 0x7<br />
8 1000 010 0x8<br />
9 1001 011 0x9<br />
10 1010 012 0xa<br />
11 1011 013 0xb<br />
12 1100 014 0xc<br />
13 1101 015 0xd<br />
14 1110 016 0xe<br />
15 1111 017 0xf<br />
Conversion between binary, octal, and hexadecimal is fairly trivial using the above table. For example, to convert the<br />
binary number 11100101010 to octal, break the number into groups <strong>of</strong> three bits starting from the least significant bit<br />
(i.e., 11 100 101 010), then convert each three-bit group to octal (i.e., 3452) and add a leading 0 to tell C it is octal (i.e.,<br />
03452). To convert the number to hexadecimal, break it into groups <strong>of</strong> four bits (i.e., 111 0010 1010), convert each<br />
group (i.e., 72a) and add a leading "0x" (i.e., 0x72a).<br />
To convert between octal and hexadecimal, it is probably easiest to go via binary.<br />
To convert from binary to decimal, note that n-th bit from the least-significant is worth 2 to the power n. So,<br />
11100101010<br />
\\\\\\\\\\\_ 0 x 1 = 0<br />
\\\\\\\\\\_ 1 x 2 = 2<br />
\\\\\\\\\_ 0 x 4 = 0<br />
\\\\\\\\_ 1 x 8 = 8<br />
\\\\\\\_ 0 x 16 = 0<br />
\\\\\\_ 1 x 32 = 32<br />
\\\\\_ 0 x 64 = 0<br />
\\\\_ 0 x 128 = 0<br />
\\\_ 1 x 256 = 256<br />
\\_ 1 x 512 = 512<br />
\_ 1 x 1024 = 1024<br />
Floating-point numbers<br />
Total = 1832<br />
Scientific computation <strong>of</strong>ten requires the use <strong>of</strong> floating-point numbers, e.g., 1.23. There is an IEEE standard that<br />
describes how floating-point numbers can be represented as a group <strong>of</strong> bytes. The basic idea is to allocate a certain<br />
number <strong>of</strong> bits to the exponent (base-2), and the rest to the mantissa (which has been normalised to be between 0.5 and<br />
1). Both the exponent and mantissa have sign bits.<br />
Floating-point numbers typically come in 4, 8, or 16 byte sizes. In C these are usually called "float", "double", and<br />
"long double" respectively.<br />
For any given number <strong>of</strong> bytes allocated to store a floating-point number, there is clearly a trade<strong>of</strong>f between the range<br />
<strong>of</strong> the exponent, and the precision <strong>of</strong> the mantissa: the more bits you use for the exponent, the fewer are left over for the<br />
mantissa. We will see later how this compromise has been chosen by the IEEE for the various byte sizes.
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
6 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Note that there are some redundant bit patterns in the representation <strong>of</strong> floating-point numbers: e.g., 0.0 with any<br />
exponent is still zero. One <strong>of</strong> these bit patterns is used as a flag to indicate that a calculation produced an illegal result.<br />
This bit pattern is called "NaN" which stands for "not a number". E.g, if you take the square root <strong>of</strong> a negative number,<br />
you will receive the result NaN. Any operation on a NaN will continue to produce a NaN (e.g., if you subtract two<br />
NaNs you will still have a NaN), this is useful to propagate such error conditions in your programs so that you are<br />
aware when they affect the final result <strong>of</strong> a calculation.<br />
GNU/Linux Fundamentals<br />
The operating system<br />
The operating system <strong>of</strong> a computer is, somewhat loosely speaking, the program that is always operating in the<br />
background on the computer. It handles the interaction with the hardware, and running other programs for the user(s) <strong>of</strong><br />
the computer. The operating system that we will be using is called GNU/Linux. Other examples <strong>of</strong> operating systems<br />
are Windows XP, OpenBSD, and FreeBSD.<br />
The shell<br />
The shell is an interactive program that allows the user to interface with the operating system. This is the program that<br />
provides the command-line prompt (it also helps you run programs, allows you to use the arrow keys on your keyboard<br />
to edit the command-line, and a bunch <strong>of</strong> other useful things). The shell that we will be using is called bash, it is the<br />
standard shell that is used with GNU/Linux. Other examples <strong>of</strong> shells are ksh, tcsh, and zsh.<br />
Files<br />
A file consists <strong>of</strong> two components:<br />
File data<br />
The data in a file is simply a list <strong>of</strong> bytes. The list may be empty (i.e., the file may have no data). The list <strong>of</strong> bytes<br />
might be used for a variety <strong>of</strong> different purposes, for example, it may be a C program, an e-mail message, a<br />
photograph, a song, a movie. In principle there is no way to distinguish the purpose <strong>of</strong> a file's data by looking at<br />
it. In practice, you can usually make a pretty good guess. To list the data in a file, you can, e.g., use "cat myfile".<br />
File metadata<br />
The metadata associated with a file consists <strong>of</strong> miscellaneous additional information that the operating system<br />
keeps with the file. For example: the name <strong>of</strong> the file, the number <strong>of</strong> bytes in the file, the time at which the file<br />
was created, which user owns the file, and the file permissions (i.e., the list <strong>of</strong> users who read and/or write the<br />
file, and whether the file is executable (i.e., is a program that can be run)). The term "metadata" means "data<br />
about data". To list the metadata associated with a file, you can, e.g., use "stat myfile".<br />
File names and file extensions<br />
When you create a file, you give it a name. The name is a list <strong>of</strong> bytes. In principle the name could be bizarre, such as<br />
" _1 `", but in practice you should choose simple, lowercase, names without spaces, e.g., "thesis.txt", "photo.jpg",<br />
"assignment0001.c". Note the convention <strong>of</strong> suffixing a file name with an extension that gives a clue as what the file is<br />
being used for. So, C programs invariably end with the ".c" extension.<br />
The GNU/Linux filesystem<br />
Directories (folders)<br />
To assist with organising groups <strong>of</strong> files, GNU/Linux has the concept <strong>of</strong> a directory (or folder). A directory is actually<br />
just a file (this adheres to the UNIX philosophy <strong>of</strong> making things simple and logical). A directory contains a list <strong>of</strong> the<br />
file names that are "in" the directory. Directories can contain directories. This leads to a hierarchical filesystem, where,<br />
in order to fully specify the location <strong>of</strong> a file, you have to specify the list <strong>of</strong> directories to follow to reach the file. By<br />
convention a slash character '/' is used to separate the directory names when giving the fully qualified filename. "/" itself<br />
is known as the root directory - there are no directories above the root directory in the filesystem. "/thesis.txt" would
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
7 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
then refer to file "thesis.txt" in the root directory. "/home/fred/thesis/chapter4.txt" would be the file "chapter4.txt" in the<br />
directory "thesis", in the directory "fred", in the directory "home", in the root directory.<br />
The Windows operating system uses letters such as "C:", "D:", etc to specify different filesystems (which might be in<br />
on different hard disks). GNU/Linux places these filesystems within the "/" hierarchy, i.e., the floppy disk might be<br />
accessible as "/mnt/floppy", the CD-ROM might be "/mnt/cdrom".<br />
The current working directory<br />
GNU/Linux has the concept <strong>of</strong> the current working directory. You can specify a file in the current working directory<br />
by just giving the file name. For convenience, "./" refers to the current working directory, and "../" refers to the<br />
directory above the current working directory. The fully-qualified filename <strong>of</strong> a file in the current working directory<br />
can be written as, e.g., "./thesis.txt".<br />
Programs<br />
To get a computer to do something for you, you create and run a program. You will be doing this many times during<br />
this course. To create a program, you type in the program's source code, and then use, e.g, the C compiler (another<br />
program, which someone else wrote) to turn the source code into a program. The program itself is just a file. The file's<br />
metadata tells the operating system that the file is executable (i.e., that it is a program able to be run by the computer).<br />
To run the program, you just type its fully qualified filename.<br />
GNU/Linux commands<br />
GNU/Linux provides you with many hundreds <strong>of</strong> programs, <strong>of</strong>ten called commands, but they are really exactly like the<br />
programs that you can create yourself (actually, some <strong>of</strong> the commands are a subset <strong>of</strong> the bash shell, but this is a<br />
detail). The commands range from listing the contents <strong>of</strong> a directory ("ls"), listing the contents <strong>of</strong> a file ("cat"), the C<br />
compiler ("gcc"), to the GNU image manipulation program ("gimp").<br />
To run a program, you simply type its filename. In the case <strong>of</strong> your own programs, you should use the fully qualified<br />
filename (e.g., "./a.out"). In the case <strong>of</strong> programs provided by GNU/Linux, the shell has a list <strong>of</strong> default directories<br />
where it searches for programs, so you normally only need to type the filename (without any directories). E.g., the "ls"<br />
program to list directories can be run by simply typing "ls", or "/bin/ls", or, if your current working directory happened<br />
to be "/bin", you could type "./ls".<br />
Input and Output<br />
In general, for a program to do useful work, it needs to be provided with input <strong>of</strong> some sort, and to produce output.<br />
Input to a program can take many forms: e.g., typing at the keyboard, a file containing data, movements <strong>of</strong> a mouse, or<br />
something more exotic such as audio input from a sound card, or video from a camera.<br />
Output from a program can also take many forms, e.g., text appearing on your computer screen, a graph on your screen,<br />
a file <strong>of</strong> results, or something more exotic such as audio output, or movement <strong>of</strong> a robotic arm.<br />
GNU/Linux provides the following input and output methods for all programs (I am covering all <strong>of</strong> the possibilities<br />
here, but you are only expected to really understand the first two at this stage):<br />
Standard input<br />
Standard input (or STDIN) is a stream <strong>of</strong> bytes provided to the program. The program can read from the stream,<br />
and can determine when there is no more data. Standard input comes from one <strong>of</strong> three sources depending on<br />
how you run your program. For example, suppose that your program is called "a.out" (the default name assumed<br />
by the C compiler), then<br />
./a.out<br />
./a.out < file<br />
standard input comes from the keyboard<br />
standard input comes from a file called "file"
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
8 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
command | ./a.out<br />
standard input comes from the standard output <strong>of</strong> "command"<br />
Standard output<br />
Standard output (or STDOUT) is a stream <strong>of</strong> bytes produced by a program. Standard output goes to one <strong>of</strong> three<br />
locations depending on how you run your program. E.g.,<br />
./a.out<br />
./a.out > file<br />
./a.out | command<br />
standard output appears on the computer screen as text<br />
standard output is sent to the file called "file"<br />
standard output is sent to the standard input <strong>of</strong> "command"<br />
Standard error<br />
Standard error (or STDERR) is a stream <strong>of</strong> bytes produced by a program. Usually it contains error messages<br />
(e.g., 'divide by zero') that are not normally considered as part <strong>of</strong> the program output. Standard error can be sent<br />
to one <strong>of</strong> three locations. E.g.,<br />
./a.out<br />
./a.out 2> file<br />
./a.out 2>&1 | command<br />
standard error appears on the computer<br />
screen as text, intermingled with the standard output<br />
standard error is sent to the file called "file"<br />
standard error (and standard output) is sent to<br />
the standard input <strong>of</strong> "command"<br />
Command-line arguments<br />
Anything you type after the program name, and before any special shell characters such as '>' and '|', is made<br />
available to your program as a command-line argument. These can be a useful source <strong>of</strong> additional input to a<br />
program. E.g.,<br />
./a.out 23.45 albatross 7<br />
In the above example, "23.45", "albatross", and "7" are three command-line arguments.We will see later how you<br />
can read them from within a C program.<br />
Exit status<br />
Another form <strong>of</strong> output that is available to all programs is an integer exit status. By convention, this is used to<br />
indicate whether the program ran successfully (in which case the exit status is zero) or if an error occurred (in<br />
which case the exit status is nonzero and its value can be decoded to determine what the error was). The exit<br />
status <strong>of</strong> a program is stored in the shell variable called "$?" (don't worry if you don't understand this at this<br />
stage).<br />
Putting it all together<br />
Here are some more complex examples <strong>of</strong> program input/output:<br />
./a.out 2.1 < data.txt > results.txt<br />
"2.1" is an argument, standard input<br />
comes from "data.txt", standard output<br />
goes to "results.txt"
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
9 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
./a.out > results.txt 2> error.txt<br />
there are no arguments, standard input<br />
is from the keyboard, standard output<br />
to "results.txt" and standard error to<br />
"error.txt"<br />
Command Toolkit<br />
Here are some <strong>of</strong> the basic GNU/Linux commands that you will be using:<br />
cat myfile - concatenate a file (i.e., list the file's data on standard output)<br />
less myfile - like "cat", but allows you to scroll up and down in the file using the page-up, and page-down keys.<br />
Exit with 'q'.<br />
rm myfile - remove a file (i.e., delete the file; it is gone for good)<br />
mv myfile newname - renames a file<br />
cp myfile copyOfMyFile - makes a copy <strong>of</strong> a file, with a different name<br />
cd mydir - change directory<br />
mkdir mynewdir - make a new directory<br />
rmdir mydir - remove the directory (you have to remove the files in it first)<br />
ls - list the files and directories in the current working directory<br />
pwd - print the name <strong>of</strong> the current working directory<br />
gcc myprog.c - compiles your program<br />
./a.out - runs the program compiled as above<br />
man mytopic - displays the manual pages about "mytopic"<br />
man -k mytopic - searches the manual pages for any information about "mytopic"<br />
gedit myprog.c - starts up a simple GUI editor<br />
logout - logs out <strong>of</strong> the computer<br />
The on-line manual pages<br />
Use "man -k keyword" to search for information about "keyword". For example, suppose you want to know<br />
something about random number generators: "man -k random" on a GNU/Linux system will return output<br />
containing lines like these:<br />
ppmspread<br />
(1) - displace a portable pixmap's pixels by a random amount<br />
rand<br />
(3) - random number generator<br />
RAND_bytes<br />
(3ssl) - generate random data<br />
random<br />
(3) - random number generator<br />
random<br />
(4) - kernel random number source devices<br />
RAND_pseudo_bytes [RAND_bytes] (3ssl) - generate random data<br />
rand [sslrand] (1ssl) - generate pseudo-random bytes<br />
rand [sslrand] (3ssl) - pseudo-random number generator<br />
seed48 [drand48] (3) - generate uniformly distributed pseudo-random numbers<br />
The number in parentheses is called the section number (e.g., section 3 is reserved for C functions, section 4 for<br />
low-level kernel interfaces), and you need to specify this if there are entries in multiple volumes. So, e.g., to read<br />
about the kernel random number source devices, you would type "man 4 random".
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
10 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Fundamentals <strong>of</strong> C<br />
Character constants, escape sequences, and string constants<br />
A character constant in C is a single character (or an escape sequence such as \n) enclosed in single quotes, e.g., 'A'. It<br />
occupies one byte <strong>of</strong> storage.<br />
The value <strong>of</strong> a character constant is the numeric value <strong>of</strong> the character in the computer's character set (e.g., 'A' has the<br />
value 65). In 99.99% <strong>of</strong> cases this is the ASCII character set, but this is not defined by the C standard!<br />
But how do you represent a character such as a single quote itself, or one that doesn't print (such as the ASCII BEL<br />
character)? The answer is to use an escape sequence.<br />
For reference, here is a program to print out all the special escape sequences. Have a look at it now.<br />
In addition, you can specify any 8-bit ASCII character using either \ooo or \xhh where `ooo' is an octal number (with<br />
from 1 to 3 digits), and 'xhh' is a hexadecimal number (with 1 or 2 digits). For example, \x20 is the ASCII character for<br />
SPACE, so ' ' and '\x20' are identical.<br />
String constants are a sequence <strong>of</strong> zero or more characters, enclosed in double quotes. For example, "test", "", "this is<br />
an invalid string" are all valid strings (you can't always believe what a string tells you!). String constants are stored in<br />
the computer's memory as a sequence <strong>of</strong> numbers (usually from the ASCII character set), and are terminated by a null<br />
byte (\0). So, "test" would appear in memory as the numbers 116, 110, 115, 116, 0.<br />
A consequence <strong>of</strong> the null byte terminator is that a C character string can never contain a null byte. In practice, this isn't<br />
<strong>of</strong>ten a limitation.<br />
We have already used some examples <strong>of</strong> strings in our programs, e.g, "hello world\n" was a null-terminated character<br />
string that we sent to the printf function above.<br />
The above program also shows how to add comments to your C program.<br />
Comments<br />
To make your program more readable, it is good practice to include comments. These do not influence the operation <strong>of</strong><br />
the program, nor are they printed out when the program runs, they are simply to document things that you think are<br />
important.<br />
Comments are delimited by `/*' and `*/', or from '//' to the end <strong>of</strong> the line. Any text between the first `/*' and the next<br />
`*/' is ignored by the compiler, irrespective <strong>of</strong> the position <strong>of</strong> line breaks. Note that this means that comments do not<br />
nest. Here are some examples <strong>of</strong> the valid use <strong>of</strong> comments:<br />
/* This is a comment */<br />
// and so is this<br />
/* here is another one<br />
that spans two lines */<br />
i = /* a big number */ 123456;<br />
Here are some problems that can arise when using comments:<br />
i = 123456; /* a comment starts here<br />
i = i + 2; this statement is also part <strong>of</strong> the comment */<br />
/* this is a comment /* and so is this */ but this will generate an error */<br />
The fact that comments don't nest is a real nuisance if you want to comment-out a whole section <strong>of</strong> code. In this case, it<br />
is simplest to use pre-processor directives to do the job. For example:<br />
i = 123456;<br />
#if 0 // The next two lines are ignored by the compiler.<br />
i = i + 2;<br />
j = i * i;<br />
#endif<br />
Compiling, linking, executing, etc
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
11 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Source files<br />
C source files contain the C program that you write. The files invariably end with the suffix ".c". You compile them<br />
with the C compiler to produce...<br />
Object files<br />
Have a ".o" suffix, and are the result <strong>of</strong> compiling a source file. They contain the machine language equivalent <strong>of</strong> the C<br />
program, but they are not ready to execute yet, since they need to be linked against libraries <strong>of</strong> other C object files<br />
containing any functions that are used, but not coded for, in the source file (e.g., the printf function). Linking can be<br />
done for you by the C compiler, or can be a separate step if you wish. After linking, you have an...<br />
Executable file<br />
This file is ready to run, by typing its name at the command prompt.<br />
Makefiles - the easy way to automate compilation<br />
When developing s<strong>of</strong>tware, a great deal <strong>of</strong> time can be save by using the UNIX make utility.<br />
The idea behind make is that you should be able to compile a program simply by typing make (followed by ENTER).<br />
For this to work, you have to tell the computer how to build your program, by storing instructions in a Makefile.<br />
While this step takes some effort, you are amply repaid by the time savings later on. make is particularly useful for large<br />
programs that consist <strong>of</strong> numerous source files: make only recompiles the file(s) that need recompiling (it works out<br />
which ones to process by looking at the dates <strong>of</strong> last modification).<br />
Here is an example <strong>of</strong> a simple Makefile:<br />
test: test.o; gcc test.o -o test<br />
Notes:<br />
1. A Makefile consists <strong>of</strong> a series <strong>of</strong> lines containing targets, dependencies, and shell commands for updating the<br />
targets.<br />
2. In the above example,<br />
test is the target, i.e., the executable program,<br />
test.o is the dependency (i.e., the file that test depends on), and<br />
gcc test.o -o test is the shell command for making test from its dependencies.<br />
3. Important note: test is a very bad name for a program since it conflicts with the UNIX shell command <strong>of</strong> the<br />
same name, hence if you type "test" it will run the UNIX command rather than your program. To get around this<br />
problem, simply use "./test" (if "test" is in your current directory), or rename the program. It is always a good<br />
idea to use the leading "./" since it avoids a lot <strong>of</strong> subtle problems that can be hard to track down.<br />
Makefiles can get much more complicated than this simple example. Here is the next step in complexity, showing the<br />
use <strong>of</strong> a macro definition and a program that depends on multiple files.<br />
OBJS = main.o sub1.o sub2.o<br />
main: $(OBJS); gcc $(OBJS) -o main<br />
It is well worth learning a little bit about make since it can save a lot <strong>of</strong> time!<br />
How to compile and run a C program<br />
Short Instructions<br />
1.<br />
Type in your program into your favourite editor (eg emacs): "emacs filename.c &" and save it.<br />
e.g. for the hellow world program described in lectures you would type into the editor:<br />
#include <br />
int main(void) {<br />
printf("hello world\n");
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
12 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
return 0;<br />
}<br />
2.<br />
Compile: "gcc filename.c"<br />
Fix any errors, save, and recompile until compilation successful.<br />
3. Run: "./a.out"<br />
Examine the output to see if it is correct. If not, examine the program carefully to ascertain where you<br />
migth have gone wrong.<br />
Full Instructions (for experts)<br />
1.<br />
2.<br />
3.<br />
4.<br />
5.<br />
6.<br />
Create a new directory for the program (not essential, but it helps to be organised).<br />
Use whatever editor you want to generate the program (make sure the program file has a suffix <strong>of</strong> '.c').<br />
Construct a Makefile for the program (not essential, but worth doing).<br />
Type make to compile the program (if you don't have a Makefile, you will need to type the compiler command<br />
yourself).<br />
Run the program by typing its name.<br />
Locate and fix bugs, then go to step 4.<br />
Here is a complete example <strong>of</strong> the above process for the ``Hello world'' program:<br />
cd<br />
mkdir hello<br />
cd hello<br />
cat > hello.c
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
13 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
const double pi = 3.14159265358979323846;<br />
double e, d = pi/2;<br />
}<br />
e = sin(d);<br />
printf("The sine <strong>of</strong> %f is %f\n", d, e);<br />
return 0;<br />
This program produces the result:<br />
The sine <strong>of</strong> 1.570796 is 1.000000<br />
Aside: the file /usr/include/math.h defines M_PI to be the double precision value <strong>of</strong> pi. So, use this rather than typing<br />
the value <strong>of</strong> pi yourself.<br />
The GNU C debugger "gdb"<br />
When debugging a program, a common technique is to insert "print" statements to display the values <strong>of</strong> variables. A<br />
much more powerful technique is to use the GNU C debugger, "gdb". Other C compilers will have debuggers as well.<br />
Firstly, you must compile your program with the debug switch (-g). It is also a good idea to disable optimisation with<br />
-O0.<br />
gcc -g -O0 main.c<br />
Then run your program using gdb:<br />
gdb a.out<br />
The following are some <strong>of</strong> the more common commands that are available:<br />
help<br />
- obtain help<br />
help {arg}<br />
- obtain help on topic {arg}<br />
list<br />
- list 10 lines around current position<br />
list {line}<br />
- list 10 lines around line number {line}<br />
list {beg},{end}<br />
- list lines from line number {beg} to {end}<br />
run arg1 arg2 ...<br />
- begin execution <strong>of</strong> the program, with the arguments<br />
break line<br />
- suspend execution at the given line number<br />
step<br />
- steps (and executes) one line in your program<br />
print {exp} ...<br />
- print the value <strong>of</strong> the given expressions<br />
printf "string", exp, ... - print expressions using format string(C)<br />
quit<br />
- quit gdb<br />
<br />
- repeats the last command (very useful with "step")<br />
A program to give information on C data types<br />
Like other <strong>programming</strong> languages, C has a variety <strong>of</strong> different data types. The ANSI standard defines the data types<br />
that must be supported by a compiler, but it doesn't tell you details such as the range <strong>of</strong> numbers that each type can<br />
represent, or the number <strong>of</strong> bytes <strong>of</strong> storage occupied by each type. These details are implementation dependent, and<br />
defined in the two system include-files "limits.h" and "float.h". To find out what the limits are, try running the<br />
following program with your favourite C compiler.<br />
/* A program to print out various machine-dependent constants */<br />
/* Michael Ashley / UNSW / 03-Mar-2003 */<br />
#include /* for printf definition */<br />
#include /* for CHAR_MIN, CHAR_MAX, etc */<br />
#include /* for FLT_DIG, DBL_DIG, etc */<br />
int main(void) {<br />
printf("char %d bytes %d to %d \n", size<strong>of</strong>(char ), CHAR_MIN, CHAR_MAX );<br />
printf("unsigned char %d bytes %d to %d \n", size<strong>of</strong>(unsigned char ), 0 , UCHAR_MAX );<br />
printf("short %d bytes %hi to %hi \n", size<strong>of</strong>(short ), SHRT_MIN, SHRT_MAX );<br />
printf("unsigned short %d bytes %hu to %hu \n", size<strong>of</strong>(unsigned short), 0 , USHRT_MAX );<br />
printf("int %d bytes %i to %i \n", size<strong>of</strong>(int ), INT_MIN , INT_MAX );<br />
printf("unsigned int %d bytes %u to %u \n", size<strong>of</strong>(unsigned int ), 0 , UINT_MAX );<br />
printf("long %d bytes %li to %li \n", size<strong>of</strong>(long ), LONG_MIN, LONG_MAX );<br />
printf("unsigned long %d bytes %lu to %lu \n", size<strong>of</strong>(unsigned long ), 0 , ULONG_MAX );<br />
printf("float %d bytes %e to %e \n", size<strong>of</strong>(float ), FLT_MIN , FLT_MAX );<br />
printf("double %d bytes %e to %e \n", size<strong>of</strong>(double ), DBL_MIN , DBL_MAX );<br />
printf("precision <strong>of</strong> float %d digits\n", FLT_DIG);<br />
printf("precision <strong>of</strong> double %d digits\n", DBL_DIG);
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
14 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
}<br />
return 0;<br />
Notes:<br />
size<strong>of</strong> looks like a function, but it is actually a built-in C operator (i.e., just like +,-,*). The compiler replaces<br />
size<strong>of</strong>(data-type-name) (or, in fact, size<strong>of</strong>(variable)) with the number <strong>of</strong> bytes <strong>of</strong> storage allocated to the<br />
data-type or variable.<br />
The 'unsigned' data types are useful when you are refering to things which are naturally positive, such as the<br />
number <strong>of</strong> bytes in a file. They also give you a factor <strong>of</strong> two increase in the largest number that you can represent<br />
in a given amount <strong>of</strong> space.<br />
An "unsigned char" is a single-byte C data type that can represent positive integers from 0 to 255. A "signed<br />
char" uses the most-significant bit <strong>of</strong> the byte as a sign-bit (0 indicates a positive number, 1 indicates a negative<br />
number), leaving the remaining 7 bits to encode the absolute value <strong>of</strong> the number; the range <strong>of</strong> signed chars is<br />
therefore -128 to +127.<br />
Most <strong>of</strong> the time you don't need to worry about how many bytes are in each data type, since the limits are usually OK<br />
for normal programs. However, a common problem is that the "int" type can be either 2-bytes or 4-bytes in length,<br />
depending on the compiler.<br />
Here is the output <strong>of</strong> the preceeding program when run on a GNU/Linux PC, using gcc:<br />
char 1 bytes -128 to 127<br />
unsigned char 1 bytes 0 to 255<br />
short 2 bytes -32768 to 32767<br />
unsigned short 2 bytes 0 to 65535<br />
int 4 bytes -2147483648 to 2147483647<br />
unsigned int 4 bytes 0 to 4294967295<br />
long 4 bytes -2147483648 to 2147483647<br />
unsigned long 4 bytes 0 to 4294967295<br />
float<br />
4 bytes 1.175494e-38 to 3.402823e+38<br />
double<br />
8 bytes 2.225074e-308 to 1.797693e+308<br />
precision <strong>of</strong> float 6 digits<br />
precision <strong>of</strong> double 15 digits<br />
Constants; binary, octal, hexademical, and decimal<br />
We have already seen how to write character constants and strings. Let's now look at other types <strong>of</strong> constants:<br />
int<br />
123, -1, 2147483647, 040 (octal), 0xab (hexadecimal)<br />
unsigned int 123u, 2147483648, 040U (octal), 0X02 (hexadecimal)<br />
long<br />
123L, 0XFFFFl (hexadecimal)<br />
unsigned long 123ul, 0777UL (octal)<br />
float<br />
1.23F, 3.14e+0f<br />
double 1.23, 2.718281828<br />
long double 1.23L, 9.99E-9L<br />
Note:<br />
Integers are automatically assumed to be <strong>of</strong> the smallest type that can represent them (but at least an "int"). For<br />
example, 2147483648 was assumed by the compiler to be an "unsigned int" since this number is too big for an<br />
"int". Numbers too big to be an "unsigned int" are promoted to "long" or "unsigned long" as appropriate.<br />
An integer starting with a zero is assumed to be octal, unless the character after the zero is an "x" or "X", in<br />
which case the number in hexadecimal.<br />
Unsigned numbers have a suffix <strong>of</strong> "u" or "U".<br />
Long "int"s or "double"s have a suffix <strong>of</strong> "l" or "L" (it is probably better to use "L" so that it isn't mistaken for the<br />
number one).<br />
Floating-point numbers are assumed to be "double"s unless they have a suffix <strong>of</strong> "f" or "F" for "float", or "l" or<br />
"L" for "long double".<br />
It pays to be very careful when specifying numbers, to make sure that you do it correctly, particularly when dealing<br />
with issues <strong>of</strong> precision. For example, have a look at this program (precision1.c, precision2.c):<br />
#include <br />
int main(void) {<br />
double r;<br />
r = 1.0 + 0.2F;
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
15 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
}<br />
r = r - 1.2F;<br />
printf("%22.16e\n", r);<br />
return 0;<br />
The program gives at output <strong>of</strong> "-4.4703483581542969e-08", not zero as you might expect.<br />
EXERCISE: Run this program and verify for yourself that the answer is indeed not zero! Re-write the program so that<br />
the numerical values in the code (i.e. 0.2, 1.2 etc) are double rather than float, and verify that the result <strong>of</strong> execution is<br />
to give the answer 0.000000000000000e+00.<br />
The lesson to be learnt here is when writing constants, always think carefully about what type you want them to be, and<br />
use the suffixes "U", "L", and "F" to be explicit about it. It is not a good idea to rely on the compiler to do what you<br />
expect. Don't be surprised if different machines/compilers give different answers if you program sloppily.<br />
Conversion between integers and floating point numbers<br />
In C, as in all computer languages, there are rules that the compiler uses when a program mixes integers and floating<br />
point numbers in the same expression.<br />
Let's look at what happens if you assign a floating point number to an integer variable (convert1.c):<br />
#include <br />
int main(void) {<br />
int i, j;<br />
i = 1.99;<br />
j = -1.99;<br />
printf("i = %d; j = %d\n", i, j);<br />
return 0;<br />
}<br />
This program produces the result "i = 1; j = -1". Note that the floating point numbers have been truncated and not<br />
rounded.<br />
When converting integers to floating-point, be aware that a "float" has fewer digits <strong>of</strong> precision than an "int", even<br />
though they both use 4 bytes <strong>of</strong> storage (on normal PCs). This can result in some strange behaviour, e.g. (convert2.c),<br />
#include <br />
int main(void) {<br />
unsigned int i;<br />
float f;<br />
i = 4294967295; /* the largest unsigned int */<br />
f = i; /* convert it to a float */<br />
printf("%u %20.13e %20.13e\n", i, f, f - i);<br />
return 0;<br />
}<br />
This program produces the following output when compiled with "gcc":<br />
4294967295 4.2949672960000e+09 1.0000000000000e+00<br />
Rather than rely on automatic type-conversion, you can be explicit about it by using a type-cast operator, e.g.,<br />
f = (float)i;<br />
This converts the number or variable or parethesised expression immediately to its right, to the indicated type. It is a<br />
good idea to use type-casting to ensure that you leave nothing to chance.<br />
Operators<br />
C has a rich set <strong>of</strong> operators (i.e., things like + - * /), which allow you to write complicated expressions quite<br />
compactly.<br />
Unary operators
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
16 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Unary operators are operators that only take one argument. The following list will be confusing when you first see it, so<br />
just keep it mind for reference later on.<br />
Some <strong>of</strong> the operators we have already seen (e.g., "size<strong>of</strong>()"), others are very simple (e.g., +, -), others are really neat<br />
(e.g., ~, !), others are useful for adding/subtracting 1 automatically (e.g., ++i, --i, i++, i--), and the rest involve pointers<br />
and addressing, which will be covered in detail later.<br />
size<strong>of</strong>(i) the number <strong>of</strong> bytes <strong>of</strong> storage allocated to i<br />
+123 positive 123<br />
-123 negative 123<br />
~i one's complement (bitwise complement)<br />
!i logical negation (i.e., 1 if i is zero, 0 otherwise)<br />
*i returns the value stored at the address pointed to by i<br />
&i<br />
returns the address in memory <strong>of</strong> i<br />
++i<br />
adds one to i, and returns the new value <strong>of</strong> i<br />
--i<br />
subtracts one from i, and returns the new value <strong>of</strong> i<br />
i++<br />
adds one to i, and returns the old value <strong>of</strong> i<br />
i--<br />
subtracts one from i, and returns the old value <strong>of</strong> i<br />
i[j]<br />
array indexing<br />
i (j)<br />
calling the function i with argument j<br />
i.j<br />
returns member j <strong>of</strong> structure i<br />
i->j<br />
returns member j <strong>of</strong> structure pointed to by i<br />
Binary operators<br />
Binary operators work on two operands ("binary" here means 2 operands, not in the sense <strong>of</strong> base-2 arithmetic).<br />
Here is a list. All the usual operators that you would expect are there, with a whole bunch <strong>of</strong> interesting new ones.<br />
Note:<br />
+ addition<br />
- subtraction<br />
* multiplication<br />
/ division<br />
% remainder (e.g., 2%3 is 2), also called 'modulo'<br />
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
17 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Left argument Right argument & | ^ && ||<br />
0011 0101 0001 0111 0110 1 1<br />
1110 1010 1010 1110 0100 1 1<br />
0000 1111 0000 1111 1111 0 1<br />
Assignment operators<br />
Assignment operators are really just binary operators. The simplest example is "=", which takes the value on the right<br />
and places it into the variable on the left. But C provides you with a host <strong>of</strong> other assignment operators which make life<br />
easier for the programmer.<br />
= assignment<br />
+= addition assignment<br />
-= subtraction assignment<br />
*= multiplication assignment<br />
/= division assignment<br />
%= remainder/modulus assignment<br />
&= bitwise AND assignment<br />
|= bitwise OR assignment<br />
^= bitwise exclusive OR assignment<br />
= right shift assignment<br />
So, for example, "i += j" is equivalent to "i = i + j". The advantage <strong>of</strong> the assignment operators is that they can reduce<br />
the amount <strong>of</strong> typing that you have to do, and make the meaning clearer. This is particularly noticeable when, instead<br />
<strong>of</strong> a simple variable such as "i", you have something complicated such as "position[wavelength + correction_factor *<br />
2]";<br />
The thing to the left <strong>of</strong> the assignment operator has to be something where a result can be stored, and is known as an<br />
"lvalue" (i.e., something that can be on the left <strong>of</strong> an "="). Valid "lvalues" include variables such as "i", and array<br />
expressions. It doesn't make sense, however, to use constants or expressions to the left <strong>of</strong> an equals sign, so these are<br />
not "lvalues".<br />
The comma operator<br />
C allows you to put multiple expression in the same statement, separated by a comma. The expressions are evaluated in<br />
left-to-right order. The value <strong>of</strong> the overall expression is then equal to that <strong>of</strong> the rightmost expression.<br />
For example,<br />
Example<br />
Equivalent to<br />
------- -------------<br />
i = ((j = 2), 3); i = 3; j = 2;<br />
myfunct(i, (j = 2, j + 1), 1); j = 2; myfunct(i, 3, 1);<br />
The comma operator has the lowest precedence, so it is always executed last when evaluating an expression. Note that<br />
in the example given comma is used in two distinct ways inside an argument list for a function.<br />
Both the above examples are artificial, and not very useful. The comma operator can be useful when used in "for" and<br />
"while" loops as we will see later.<br />
Precedence and associativity <strong>of</strong> operators<br />
The precedence <strong>of</strong> an operator gives the order in which operators are applied in expressions: the highest precedence<br />
operator is applied first, followed by the next highest, and so on.<br />
The associativity <strong>of</strong> an operator gives the order in which expressions involving operators <strong>of</strong> the same precedence are<br />
evaluated.<br />
The following table lists all the operators, in order <strong>of</strong> precedence, with their associativity:<br />
Operator<br />
Associativity<br />
-------- -------------
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
18 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
() [] ->> . left-to-right<br />
- + ++ -- ! ~ * & size<strong>of</strong> (type) right-to-left (note: unary +, -, *, and &)<br />
* / % left-to-right<br />
+ - left-to-right<br />
> left-to-right<br />
< >= left-to-right<br />
== != left-to-right<br />
&<br />
left-to-right<br />
^<br />
left-to-right<br />
| left-to-right<br />
&&<br />
left-to-right<br />
|| left-to-right<br />
?: right-to-left<br />
= += -= *= /= %= &= ^= |= = right-to-left<br />
, left-to-right<br />
Note: the unary operators + (introducing a positive number), - (introducing a negative number), * (pointer<br />
dereferencing), and & (address <strong>of</strong>); appear twice in the above table. The unary forms (on the second line) have higher<br />
precedence that the binary forms.<br />
Operators on the same line have the same precedence, and are evaluated in the order given by their associativity.<br />
To specify a different order <strong>of</strong> evaluation you can use parentheses. In fact, it is <strong>of</strong>ten good practice to use parentheses to<br />
guard against making mistakes in difficult cases, or to make your meaning clear.<br />
Side effects in evaluating expressions<br />
It is possible to write C expressions that give different answers on different machines, since some aspects <strong>of</strong><br />
expression-evaluation are not defined by the ANSI standard. This is deliberate since it gives the compiler writers the<br />
ability to choose different evaluation orders depending on the underlying machine architecture. You, the programmer,<br />
should avoid writing expressions with side effects.<br />
Here are some examples:<br />
myfunc(j, ++j); /* the arguments may be the same, or differ by one */<br />
array[j] = j++; /* is j incremented before being used as an index? */<br />
i = f1() + f2(); /* the order <strong>of</strong> evaluation <strong>of</strong> the two functions<br />
is not defined. If one function affects the<br />
results <strong>of</strong> the other, then side effects will<br />
result */<br />
Evaluation <strong>of</strong> logical AND and OR<br />
A useful aspect <strong>of</strong> C is that it guarantees the order <strong>of</strong> evaluation <strong>of</strong> expressions containing the logical AND (&&) and<br />
OR (||) operators: it is always left-to-right, and stops when the outcome is known. For example, in the expression "1 ||<br />
f()", the function "f()" will not be called since the truth <strong>of</strong> the expression is known regardless <strong>of</strong> the value returned by<br />
"f()".<br />
It is worth keeping this in mind. You can <strong>of</strong>ten speed up programs by rearranging logical tests so that the outcome <strong>of</strong><br />
the test can be determined as soon as possible.<br />
Another good example is an expression such as "i >= 0 && i
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
19 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Rules for identifiers:<br />
Legal characters are a-z, A-Z, 0-9, and _.<br />
Case is significant.<br />
The first character must be a letter or _.<br />
Identifiers can be <strong>of</strong> any length (although only the first 31 characters are guaranteed to be significant).<br />
The following are reserved keywords, and may not be used as identifiers:<br />
auto double int struct<br />
break else long switch<br />
case enum register typedef<br />
char extern return union<br />
const float short unsigned<br />
continue for signed void<br />
default goto size<strong>of</strong> volatile<br />
do if static while<br />
Here are some examples <strong>of</strong> legal identifiers:<br />
i<br />
count<br />
numberOfAardvarks<br />
number_<strong>of</strong>_aardvarks<br />
MAX_LENGTH<br />
Variable storage classes<br />
Variables in C belong to one <strong>of</strong> two fundamental storage classes: "static" or "automatic".<br />
A static variable is stored at a fixed memory location in the computer, and is created and initialised once when the<br />
program is first started. Such a variable maintains its value between calls to the block (a function, or compound<br />
statement) in which it is defined.<br />
An automatic variable is created, and initialised, each time the block is entered (if you jump in half-way through a<br />
block, the creation still works, but the variable is not initialised). The variable is destroyed when the block is exited.<br />
Variables can be explicitly declared as "static" or "auto" by using these keywords before the data-type definition. If you<br />
don't use one <strong>of</strong> these keywords, the default is "static" for variables defined outside any block, and "auto" for those<br />
inside a block.<br />
Actually, there is another storage class: "register". This is like "auto" except that it asks the compiler to try and store the<br />
variable in one <strong>of</strong> the CPU's fast internal registers. In practice, it is usually best not to use the "register" type since<br />
compilers are now so smart that they can do a better job <strong>of</strong> deciding which variables to place in fast storage than you<br />
can.<br />
Const - volatile<br />
Variables can be qualified as "const" to indicate that they are really constants that can be initialised, but not altered.<br />
Variables can also be termed "volatile" to indicate that their value may change unexpectedly during the execution <strong>of</strong> the<br />
program (e.g., they may be hardware registers on a PC, able to be altered by external events). By using the "volatile"<br />
qualifier, you prevent the compiler from optimising the variable out <strong>of</strong> loops.<br />
Extern<br />
Variables (and functions) can also be classified as "extern", which means that they are defined external to the current<br />
block (or even to the current source file). An "extern" variable must be defined once (and only once) without the<br />
"extern" qualifier.<br />
As an example <strong>of</strong> an "extern" function, all the functions in "libm.a" (the math library) are external to the source file that<br />
calls them.
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
20 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
An example showing storage class and variable scope (storageclass.c)<br />
#include <br />
int i; /* i is static, and visible to the entire program */<br />
extern j; /* j is static, and visible to the entire program */<br />
static int k; /* k is static, and visible to the routines in this source file */<br />
int main(void){}<br />
void func(void) {<br />
int m = 1;<br />
auto int n = 2;<br />
/* i.e., a function that takes no arguments, and<br />
doesn't return a value */<br />
/* m is automatic, local to this function, and initialised<br />
each time the function is called */<br />
/* n is automatic, local to this function, and initialised<br />
each time the function is called */<br />
static int p = 3; /* p is static, local to this function, and initialised<br />
once when the program is first started up */<br />
extern int q; /* q is static, and defined in some external module */<br />
}<br />
for (i = 0; i < 10; i++) {<br />
int m = 10; /* m is automatic, local to this block, and initialised<br />
each time the block is entered */<br />
printf("m = %i\n", m);<br />
}<br />
Initialisation <strong>of</strong> variables (initialisation.c)<br />
A variable is initialised by equating it to a constant expression on the line in which it is defined. For example<br />
int i = 0;<br />
"static" variables are initialised once (to zero if not explicitly initialised), "automatic" variables are initialised when the<br />
block in which they are defined is entered (and to an undefined value if not explicitly initialised).<br />
The "constant expression" can contain combinations <strong>of</strong> any type <strong>of</strong> constant, and any operator (except assignment,<br />
incrementing, decrementing, function call, or the comma operator), including the ability to use the unary "&" operator<br />
to find the address <strong>of</strong> static variables.<br />
Here are some valid examples:<br />
#include <br />
#include <br />
int i = 0;<br />
int j = 2 + 2;<br />
int k = 2 * (3
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
21 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Mathematical functions<br />
Prototypes for the math functions are in the system include-file "math.h", so you should put the line<br />
#include <br />
in any C source file that calls one <strong>of</strong> them.<br />
Here is a list <strong>of</strong> the math functions defined by the ANSI standard:<br />
sin(x)<br />
sine <strong>of</strong> x<br />
cos(x)<br />
cosine <strong>of</strong> x<br />
tan(x)<br />
tan <strong>of</strong> x<br />
asin(x)<br />
arcsine <strong>of</strong> x, result between -pi/2 and +pi/2<br />
acos(x)<br />
arccosine <strong>of</strong> x, result between 0 and +pi<br />
atan(x)<br />
arctan <strong>of</strong> x, result between -pi/2 and +pi/2<br />
atan2(y,x)<br />
arctan <strong>of</strong> (y/x), result between -pi and +pi<br />
hsin(x)<br />
hyperbolic sine <strong>of</strong> x<br />
hcos(x)<br />
hyperbolic cosine <strong>of</strong> x<br />
htan(x)<br />
hyperbolic tan <strong>of</strong> x<br />
exp(x)<br />
exponential function<br />
log(x)<br />
natural logarithm<br />
log10(x) logarithm to base 10<br />
pow(x,y)<br />
x to the power <strong>of</strong> y (x**y in FORTRAN)<br />
sqrt(x)<br />
the square root <strong>of</strong> x (x must not be negative)<br />
ceil(x)<br />
ceiling; the smallest integer not less than x<br />
floor(x)<br />
floor; the largest integer not greater than x<br />
fabs(x)<br />
absolute value <strong>of</strong> x<br />
ldexp(x,n)<br />
x times 2**n<br />
frexp(x, int *exp) returns x normalized between 0.5 and 1; the exponent <strong>of</strong> 2 is in *exp<br />
modf(x, double *ip) returns the fractional part <strong>of</strong> x; the integral part goes to *ip<br />
fmod(x,y)<br />
returns the floating-point remainder <strong>of</strong> x/y, with the sign <strong>of</strong> x<br />
In the above table, "x", and "y" are <strong>of</strong> type "double", and "n" is an "int". All the above functions return "double" results.<br />
C libraries may also include "float" versions <strong>of</strong> the above. For example, "fsin(x)" takes a float argument and returns a<br />
float result.<br />
Note that the arguments for trignometric functions are in radians and not degrees!<br />
A complete list <strong>of</strong> mathematical functions supported by C can be found in the file Cmathfunctions.txt.<br />
The "for" loop<br />
The basic looping construct in C is the "for" loop.<br />
Here is the syntax <strong>of</strong> the "for" statement:<br />
for (initial_expression; loop_condition; loop_expression) statement;<br />
and this is precisely equivalent to the following code fragment:<br />
initial_expression;<br />
while (loop_condition) {<br />
statement;<br />
loop_expression;<br />
}<br />
An example will, hopefully, clear this up:<br />
for (i = 0; i< 100; i++) printf("%i\n", i);<br />
which simply prints the first 100 integers. If you want to include more that one statement in the loop, use curly brackets<br />
to delimit the body <strong>of</strong> the loop, e.g.,<br />
for (i = 0; i < 100; i++) {<br />
j = i * i;<br />
printf("i = %i; j = %i\n", i, j);
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
22 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
}<br />
A related control mechanism is the do-while loop<br />
initial_expression;<br />
do{statement;} while (loop_condition)<br />
For example (dowhile.c):<br />
i = 0;<br />
do {printf("The value <strong>of</strong> i is now %d\n", i);<br />
i = i + 1;<br />
} while (i < 5);<br />
Here is a slightly more interesting example which converts Centigrade to Farenheit (tempconv.c).<br />
And here is a more complex example, illustrating nested loops, with an if thrown in for good measure:<br />
ASCII circle printer.<br />
And here is another program illustrating the power <strong>of</strong> loops: Mandelbrot image generator<br />
Semicolons in compound statements<br />
The last statement in the body <strong>of</strong> statements in a "for" loop (or, in fact, in any other compound statement) must be<br />
terminated with a semicolon.<br />
For example,<br />
for (i = 0; i < 10; i++) {<br />
x = i * i;<br />
x += 2; /* the semicolon is required here */<br />
} /* do not use a semicolon here */<br />
Variables defined within compound statements<br />
You can create variables that are local to a compound statement by declaring the variables immediately after the<br />
leading curly bracket.<br />
If statements<br />
if (expression)<br />
statement<br />
else if (expression)<br />
statement<br />
else if (expression)<br />
statement<br />
else<br />
statement<br />
Where "statement" is a simple C statement ending in a semicolon, or a compound statement ending in a curly bracket.<br />
Some examples will help:<br />
if (i == 6) x = 3;<br />
if (i == 6) {<br />
x = 3;<br />
}<br />
if (i) x = 3;<br />
if (i == 2)<br />
x = 3;<br />
else if (i == 1)<br />
if (j)<br />
y = 2;
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
23 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
else /* NOTE: possible ambiguity here, the compiler uses */<br />
y = 3; /* the closest if statement that does not have */<br />
else { /* an else clause */<br />
x = 4;<br />
y = 5;<br />
}<br />
Example Program ifelse.c<br />
Break and continue<br />
These two statements are used in loop control.<br />
"break" exits the innermost current loop.<br />
"continue" starts the next iteration <strong>of</strong> the loop.<br />
Here is an examaple program with break and continue (breakcon.c)<br />
Infinite loops<br />
Here are three ways <strong>of</strong> writing an infinite loop, choose whichever one you like:<br />
for (;;;) {<br />
statement;<br />
}<br />
while (1) {<br />
statement;<br />
}<br />
do {<br />
statement;<br />
} while (1);<br />
Infinite loops can be useful. They are normally terminated using a conditional test with a "break" or "return" statement.<br />
For example:<br />
while (1) {<br />
statement;<br />
if (logical_condition) break;<br />
}<br />
goto<br />
C does have a "goto" statement, but you normally don't need it. Using "goto" is usually a result <strong>of</strong> bad <strong>programming</strong><br />
(although in some circumstances it can be a sensible choice).<br />
The switch statement<br />
The switch statement allows you to choose between multiple paths <strong>of</strong> execution depending on the value <strong>of</strong> an<br />
expression. It could be replaced by a series <strong>of</strong> if statements, but switch can make the flow <strong>of</strong> the program easier to<br />
understand.<br />
switch (expression) {<br />
case const-expression: statements<br />
case const-expression: statements<br />
default: statements<br />
}<br />
Beware that control will flow from one case to the next unless you explicitly include a break statement to cause control<br />
to skip to the end <strong>of</strong> the switch statement.<br />
Here is an example <strong>of</strong> using switch:<br />
switch (i) {<br />
case 0: printf("i had the value zero\n");<br />
break;<br />
case 1:<br />
case 2: printf("i was either one or two\n");
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
24 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
}<br />
break;<br />
default: printf("i was outside the range 0 to 2\n");<br />
Here is an example with multiple switch values (switch.c)<br />
Assert as a tool for improving program reliability<br />
By now you will be aware that it is very easy to write buggy programs in C, and it may not always be obvious at<br />
runtime that there is a problem. Checking for all possible error conditions in a program can be tedious. A useful tool to<br />
improve program reliability is the "assert" construct defined in the system include-file "assert.h". An example<br />
(assertexample.c) will clarify this:<br />
#include <br />
#include <br />
int main(void) {<br />
int i;<br />
}<br />
printf("enter a positive integer > ");<br />
assert(1 == scanf("%i", &i));<br />
assert(i > 0);<br />
printf("you successfully entered %i\n", i);<br />
return 0;<br />
If the expression inside parentheses after "assert" is true, the program continues as normal, if the expression is false (i.e,<br />
zero), then the program aborts with an error message.<br />
This may sound like a trivial concept, but by using "assert" extensively, you will be surprised at how your<br />
<strong>programming</strong> improves.<br />
It is particularly useful to use "assert" to check for conditions that should never arise in your program, e.g., suppose you<br />
write a function that should only ever be called with an argument between -1.0 and 1.0, then, use "assert" in the<br />
function to check this, e.g.,<br />
double myAsin(double x) {<br />
assert(fabs(x) < 1.0);<br />
...<br />
}<br />
Also, you can use assert to check for unexpected conditions such as failure to open a file, or failure to allocate memory,<br />
e.g.,<br />
assert(NULL != (fp = fopen("data.dat", "r")));<br />
assert(NULL != (p = malloc(1024));<br />
Formatted output: printf description<br />
The format string given to the "printf" function may contain both ordinary characters (which are simply printed out)<br />
and conversion characters (beginning with a percent symbol, %, these define how the value <strong>of</strong> an internal variable is to<br />
be converted into a character string for output).<br />
Here is the syntax <strong>of</strong> a conversion specification:<br />
%{flags: - + space 0 #}{minimum field width}{.}{precision}{length modifier}{conversion character}<br />
Flags: "-" means left-justify (default is right-justify), "+" means that a sign will always be used, " " prefix a<br />
space, "0" pad to the field width with zeroes, "#" specifies an alternate form (for details, see the manual!). Flags<br />
can be concatenated in any order, or left <strong>of</strong>f altogether.
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
25 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Minimum field width: the output field will be at least this wide, and wider if necessary.<br />
"." separates the field width from the precision.<br />
Precision: its meaning depends on the type <strong>of</strong> object being printed:<br />
Character: the maximum number <strong>of</strong> characters to be printed.<br />
Integer: the minimum number <strong>of</strong> digits to be printed.<br />
Floating point: the number <strong>of</strong> digits after the decimal point.<br />
Exponential format: the number <strong>of</strong> significant digits.<br />
Length modifer: "h" means short, or unsigned short; "l" means long or unsigned long; "L" means long double.<br />
Conversion character: a single character specifying the type <strong>of</strong> object being printed, and the manner in which it<br />
will be printed, according to the following table:<br />
Character Type<br />
Result<br />
d,i int signed decimal integer<br />
o int unsigned octal (no leading zero)<br />
x, X int unsigned hex (no leading 0x or 0X)<br />
u int unsigned decimal integer<br />
c int single character<br />
s char * characters from a string<br />
f double floating point [-]dddd.pppp<br />
e, E double exponential [-]dddd.pppp e[=/-]xx<br />
g, G double floating if the exponent less than -4 or >= precision<br />
else exponential<br />
p void * pointer<br />
n int * the number <strong>of</strong> characters written so far by printf<br />
is stored into the argument (i.e., not printed)<br />
% print %<br />
Here is an example program (printfexample.c) to show some <strong>of</strong> these effects (try running the program yourself):<br />
#include <br />
int main(void) {<br />
int i = 123;<br />
double f = 3.14159265358979323846;<br />
printf("i = %i\n", i);<br />
printf("i = %o\n", i);<br />
printf("i = %x\n", i);<br />
printf("i = %X\n", i);<br />
printf("i = %+i\n", i);<br />
printf("i = %8i\n", i);<br />
printf("i = %08i\n", i);<br />
printf("i = %+08i\n", i);<br />
printf("f = %f\n", f);<br />
printf("f = %10.3f\n", f);<br />
printf("f = %+10.3f\n", f);<br />
printf("f = %g\n", f);<br />
printf("f = %10.6g\n", f);<br />
printf("f = %10.6e\n", f);<br />
}<br />
return 0;<br />
Notes:<br />
There are differences between the various implementations <strong>of</strong> "printf". The above should be a subset <strong>of</strong> the<br />
available options. Consult the manual (e.g., "info gcc") for your particular C compiler to be sure.<br />
"printf" is actually an integer function, which returns the number <strong>of</strong> characters written (or a negative number if an<br />
error occurred).<br />
Formatted input: scanf overview<br />
Input in C is similar to output: the same conversion characters are used. The main difference is that you use a routine<br />
called "scanf", and you must pass the addresses <strong>of</strong> the variables you want to change, not their values.<br />
For example:<br />
scanf("%d", &i); /* reads an integer into `i' */<br />
scanf("%i", &i); /* reads an integer (or octal, or hex) into `i' */<br />
scanf("%f %i", &f, &i); /* reads a double followed by an integer */<br />
scanf is actually an integer function, which returns the number <strong>of</strong> input items assigned (or the integer constant EOF<br />
(defined in stdio.h) if the end-<strong>of</strong>-file is reached or an error occurred).
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
26 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
The ampersand character "&" is a unary operator that returns the address <strong>of</strong> the thing to its right. Recall that a C<br />
function can not alter the value <strong>of</strong> its arguments (but there is nothing stopping it altering the value that is pointed to by<br />
one <strong>of</strong> its arguments!).<br />
scanf details<br />
The scanf function is used for reading a line <strong>of</strong> input from the standard input stream (usually the users terminal) and<br />
extracting from it one or more integers, floating point numbers, character strings, etc.<br />
The first argument to scanf is a format description, which describes how the input line should be decoded. For example,<br />
if you want to read an integer, the format would be "%i" (note that this allows you to enter numbers in decimal,<br />
hexadecimal or octal (e.g., 123, 0x123, 0123)). If you want to read a floating point number, the format would be "%f".<br />
If you want to read an integer followed by a floating point number, the format woud be "%d %f". If the two numbers<br />
were separated by a comma on the input line, you would use "%d, %f".<br />
The arguments after the first are pointers to where the data should be stored.<br />
Examples (where i and j are integers, x and y are floats, and fruit and colour are char arrays):<br />
Input data<br />
scanf function call to read the data<br />
123 scanf("%i", &i);<br />
123 0x456 scanf("%i %i", &i, &j);<br />
123. 456. scanf("%i. %i.", &i, &j);<br />
123.1 456.1 scanf("%f %f", &x, &y);<br />
e=2.71 pi=3.14<br />
scanf("e=%f pi=%f", &x, &y);<br />
apples red<br />
scanf("%s %s", fruit, colour);<br />
Guido ate 2 apples<br />
scanf("Guido ate %i %s", &i, fruit);<br />
It is important to check the return value <strong>of</strong> the scanf function - this is the number <strong>of</strong> items that were successfully<br />
extracted from the input. It may be less than the number you expected, or even zero. If this happens, the input stream<br />
pointer is left at the point at which the error occurred, and you can try reading again. scanf can also return the special<br />
integer constant EOF to indicate an end-<strong>of</strong>-file condition.<br />
Variations on scanf<br />
fscanf is very similar to scanf except it takes a FILE argument ("FILE" is a structure that is defined in the <br />
include file - you don't have to worry about what precisely "FILE" is, just follow the examples) to specify which input<br />
stream to read from, e.g.,<br />
FILE *fp;<br />
float x, y;<br />
fp = fopen("input.dat", "r");<br />
fscanf(fp, "%f %f", &x, &y);<br />
"sscanf" takes input from a character array rather than an input stream, e.g.,<br />
char input="1.23 4.56";<br />
sscanf(input, "%f %f", &x, &y);<br />
Improving the ease <strong>of</strong> interaction with your programs - GNU readline<br />
While it is possible to use "scanf" for reading input typed by a person, you can greatly improve the user experience by<br />
using the GNU readline package.
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
27 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
The GNU readline package provides some very convenient features such as line editing and history. This means, e.g.,<br />
that you can use the arrow keys to move within a line, and between previously entered lines. You can also use, e.g.,<br />
CTRL-A to go to the beginning <strong>of</strong> the line, CTRL-E to go to the end, and CNTRL-R to search backwards for particular<br />
strings. See "man readline" (on a GNU/Linux computer) for a complete description.<br />
Here is an example program showing the use <strong>of</strong> GNU readline.<br />
Arrays<br />
In mathematics and physics you <strong>of</strong>ten need to group numbers together into arrays, i.e., 1, 2, or higher-dimensional<br />
structures.<br />
Arrays are declared in C as follows (for example):<br />
int counts[100];<br />
float temperature[1024];<br />
double mat[2][3];<br />
In this example, "count" is an array that can hold 100 integers, "temperature" is an array that can hold 1024 floats, and<br />
"mat" is a 2x3 array (or matrix) <strong>of</strong> doubles.<br />
Note carefully that the first element <strong>of</strong> an array in C is element number 0 (rather than 1 as in FORTRAN). While this<br />
may be confusing to die-hard FORTRAN programmers, it is really a more natural choice. A side-effect <strong>of</strong> this choice is<br />
that the last element in an array has an index that is one less that the declared size <strong>of</strong> the array. This is a source <strong>of</strong> some<br />
confusion, and something to watch out for.<br />
So, "counts[0]" is the first element <strong>of</strong> the array "counts", defined above. And "counts[99]" is the last element. There are<br />
100 elements in total.<br />
Initialising arrays<br />
To initialise an array to have particular numerical values, specify the initial values in a list within curly brackets. For<br />
example:<br />
int primes[100] =<br />
{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,<br />
47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107,<br />
109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181,<br />
191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263,<br />
269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349,<br />
353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433,<br />
439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521,<br />
523, 541};<br />
float temp[1024] = {5.0F, 2.3F};<br />
double trouble[] = {1.0, 2.0, 3.0};<br />
In this example, "primes" is initialised with the values <strong>of</strong> the first 100 primes (trust me!), and the first two elements<br />
(temp[0] and temp[1]) <strong>of</strong> "temp" are initialised to 5.0F and 2.3F respectively. The remaining elements <strong>of</strong> "temp" are set<br />
to 0.0F. Note that we use the trailing "F" on these numbers to indicate that they are floats, not doubles.<br />
The array "trouble" in the above example contains three double numbers. Note that its length is not explicitly declared,<br />
C is smart enough to calculate the length for you.<br />
Static arrays that are not explicitly initialised are set to zero. Automatic arrays that are not explicitly initialised, have<br />
undefined values.<br />
Multidimensional arrays<br />
Multidimensional arrays are declared and referenced in C by using multiple sets <strong>of</strong> square brackets. For example,<br />
int table[2][3][4];<br />
"table" is a 2x3x4 array, with the rightmost array subscript changing most rapidly as you move through memory (the
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
28 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
first element is "table[0][0][0]", the next element is "table[0][0][1]", and the last element is "table[1][2][3]").<br />
When writing programs that use large arrays (say, more than a few megabytes), you should be very careful to ensure<br />
that array references in your program are as close as possible to being consecutive (otherwise you may get severe<br />
swapping problems, i.e., the memory is swapped to hard disk, resulting in more than a factor <strong>of</strong> 1000 slow-down).<br />
Initialisation <strong>of</strong> multidimensional arrays<br />
This is best demonstrated with an example:<br />
int mat[3][4] = {<br />
}<br />
Operations on arrays<br />
{ 0, 1, 2, 23},<br />
{ 3, 4, 5, 24},<br />
{ 6, 7, 8, 25}<br />
If you wish to, e.g., add two arrays together, you have to do it element-by-element yourself using a "for" loop. For<br />
example:<br />
int a0[10], a1[10], a2[10];<br />
for (int i = 0; i < 10; i++) {<br />
a0[i] = a1[i] + a2[i];<br />
}<br />
Even something as simple as setting one array to have the same value as another has to be done with a "for" loop. E.g. :<br />
int a0[10], a1[10];<br />
for (int i = 0; i < 10; i++) {<br />
a0[i] = a1[i];<br />
}<br />
Character arrays<br />
Arrays can be <strong>of</strong> any type. In C, you manipulate character strings as arrays <strong>of</strong> characters, and operations on the strings<br />
(such as concatenation, searching) are done by calling special libary functions (e.g., strcat, strcmp).<br />
Note that when calling a string-manipulation function, the end <strong>of</strong> the string is taken as the position <strong>of</strong> the first NUL<br />
character (which has a numerical value <strong>of</strong> zero) in the string.<br />
Character arrays can be initialised in the following ways:<br />
char str[] = {'a', 'b', 'c'};<br />
char prompt[] = "please enter a number";<br />
In the example, "str" has length 3 bytes, and "prompt" has length 22 bytes, which is one more than the number <strong>of</strong><br />
characters in "please enter a number", the extra byte is used to store a NUL character (zero) as an indication <strong>of</strong> the end<br />
<strong>of</strong> the string. "str" does not having a trailing NUL.<br />
Functions<br />
A function call allows a commonly used piece <strong>of</strong> code to be made available through a single calling statement. For<br />
instance, to calculate the square <strong>of</strong> a number we could use the following (floating point) function:<br />
float sqr(inval) /* square a float, return a float */<br />
float inval;<br />
{<br />
float square;<br />
square = inval * inval;<br />
return(square);
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
29 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
This would be called from the main part <strong>of</strong> the program by a statement like:<br />
y = sqr(x);<br />
In this case the result <strong>of</strong> the function call ("square") is assigned to the value "y". The complete calling sequence is<br />
illustrated in the program floatsq.c.<br />
Prototypes<br />
The function call described above is the "classical" mode <strong>of</strong> function use, as described, for eaxmple, in Kernigan &<br />
Ritchie. A prototype is a model <strong>of</strong> a real thing, and in C you have the ability to define a model <strong>of</strong> each function for the<br />
compiler. The compiler can then use the model to check each call to the function and so determine if the correct number<br />
<strong>of</strong> arguments have been used, and whether they are <strong>of</strong> the correct type. By using prototypes you are therefore letting the<br />
compiler do some additional error checking. The program floatsq2.c does the same calculation as the program floatsq.c<br />
as above, but this time uses prototypes. Note the placement <strong>of</strong> the prototype for the square function "sqr":<br />
float sqr(float inval);<br />
at the start <strong>of</strong> the program, before the start <strong>of</strong> "main".<br />
Pointers<br />
If there is one thing that sets C apart from many other languages, it is the pervasive use <strong>of</strong> pointers.<br />
To understand pointers, it helps to have a mental image <strong>of</strong> the way in which the memory <strong>of</strong> a computer is organised.<br />
e.g., imagine that your computer has 128MB <strong>of</strong> memory; then you could label each one <strong>of</strong> these 128 million bytes <strong>of</strong><br />
memory with a number: byte 0 would be the first byte in memory, byte 1 the second, and so on up to 128 million.<br />
These numbers are referred to as the addresses <strong>of</strong> the memory cells (note: I'm actually simplifying the true situation; in<br />
a modern computer each separate process has it's own virtual address-space, and there are several layers <strong>of</strong> indirection<br />
involved in going from the addresses that C uses, to the actual physical piece <strong>of</strong> memory on the computer<br />
motherboard). A pointer is then simply a variable containing a memory address. This is a very natural concept when<br />
you think in terms <strong>of</strong> the hardware <strong>of</strong> the computer, which is a viewpoint that C encourages.<br />
Suppose that "i" is declared as "int i". Then the address in memory at which "i" is stored is given by "&i", where "&" is<br />
the unary address operator. One says that "&i" is a pointer to "i". On a 32-bit computer such as a typical PC, a pointer<br />
is 4 bytes in length (therefore allowing up to 4GB <strong>of</strong> memory to be addressed; this is an uncomfortably small number).<br />
Pointers are declared by specifying the type <strong>of</strong> thing they point at. For example,<br />
int *p;<br />
defines "p" as a pointer to an int (so, therefore, "*p" is an int, hence the form <strong>of</strong> the declaration).<br />
Note carefully, then by declaring "p" as in the above example, the compiler simply allocates space for the pointer (4<br />
bytes for 32-bit computers, 8 bytes for 64-bit computers), not for the variable that the pointer points to! This is a very<br />
important point, and is <strong>of</strong>ten overlooked. Before using "*p" in an expression, you have to ensure that "p" is set to point<br />
to a valid int. This "int" must have had space allocated for it, either statically, automatically, or by dynamically<br />
allocating memory at run-time (e.g., using the "malloc" function, see later).<br />
Here is a rather convoluted example showing some <strong>of</strong> the uses <strong>of</strong> pointers (pointerexample1.c):<br />
#include <br />
int main(void) {<br />
int m = 0, n = 1, k = 2;<br />
int *p;<br />
char msg[] = "hello";<br />
char *cp;<br />
/* Line 7 */<br />
p = &m; /* Line 8: p now points to m */<br />
*p = 1; /* Line 9: m now equals 1 */<br />
k = *p; /* Line 10: k now equals 1 */<br />
cp = msg; /* Line 11: cp points to the first character <strong>of</strong> msg */<br />
*cp = 'H'; /* Line 12: change the case <strong>of</strong> the 'h' in msg */<br />
cp = &msg[4]; /* Line 13: cp points to the 'o' */<br />
*cp = 'O'; /* Line 14: change its case */<br />
printf("m = %d, n = %d, k = %d\nmsg = \"%s\"\n", m, n, k, msg);<br />
return 0;
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
30 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
}<br />
By using the GNU debugger, gdb, we can identify the actual memory locations that the C compiler uses to store the<br />
variable in the above program. It is then possible to draw up a table (see below) showing the detailed operation <strong>of</strong> the<br />
program. You should follow this example very carefully so that you understand what is going on.<br />
Variable Starting Value after the specified line has executed<br />
location 7 8 9 10 11 12 13 14<br />
------------------------------------------------------------------<br />
m 0xbffff98c 0 0 1 1 1 1 1 1<br />
n 0xbffff988 1 1 1 1 1 1 1 1<br />
k 0xbffff984 2 2 2 1 1 1 1 1<br />
p 0xbffff980 ? 98c 98c 98c 98c 98c 98c 98c<br />
msg 0xbffff970 hello hello hello hello hello Hello Hello HellO<br />
cp 0xbffff96c ? ? ? ? 970 970 976 976<br />
The memory addresses are shown in hexadecimal, and only the rightmost three digits are shown (e.g., 98c) for<br />
convenience in displaying the table. Memory addresses are always given in units <strong>of</strong> bytes. The above table shows, for<br />
example, that the variable "m" occupies the memory addresses 0xbffff98c, 0xbffff98d, 0xbffff98e, and 0xbffff98f. You<br />
will note that 16 bytes are reserved for "msg", even though "msg" only needs 6 bytes (5 for the characters in "hello"<br />
and one for the trailing null byte). The C compiler won't guarantee any particular placement in memory, and may leave<br />
gaps, particularly to align numbers with 4-byte boundaries (i.e., so that their memory addresses are divisable by 4).<br />
Note the very important point that the name <strong>of</strong> an array ("msg" in the above example), if used without an index, is<br />
considered to be a pointer to the first element <strong>of</strong> the array. In fact, an array name followed by an index is exactly<br />
equivalent to a pointer followed by an <strong>of</strong>fset. For example (pointerexample2.c),<br />
#include <br />
int main(void) {<br />
char msg[] = "hello world";<br />
char *cp;<br />
cp = msg;<br />
cp[0] = 'H';<br />
*(msg+6) = 'W';<br />
}<br />
printf("%s\n", msg);<br />
printf("%s\n", &msg[0]);<br />
printf("%s\n", cp);<br />
printf("%s\n", &cp[0]);<br />
return 0;<br />
Note, however, that "cp" is a variable, and can be changed, whereas "msg" is a constant, and is not an lvalue (i.e., it can<br />
not be used on the left <strong>of</strong> an equals sign).<br />
Pointer arithmetic<br />
You can add and subtract numbers to/from pointers, but the results may surprise you. For example, if "p" is a pointer to<br />
an "int" (i.e., "int *p"), then "p++" actually adds 4 to "p"! The logic behind this is that adding one to "p" should make it<br />
point to the next integer, which has a memory address that is 4 bytes greater.<br />
Pointers used as arguments to functions<br />
We have already seen that C functions can not alter their arguments. The simple reason for this is that when you call a<br />
function, the compiler passes copies <strong>of</strong> the values <strong>of</strong> the arguments to the function. For example, if you have a function<br />
as follow:<br />
void myfunc(int n) {<br />
n = 6;<br />
}<br />
and you call it with<br />
int i = 0;<br />
myfunc(i);<br />
Then "myfunc" is passed the number 0. "myfunc" knows nothing whatsoever about where this number came from. The
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
31 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
zero is stored in the local variable n. "myfunc" changed n to 6, but this doesn't have any effect on i in the main<br />
program.<br />
In order to change the value <strong>of</strong> a variable in the main program, you need to give the function a pointer to the variable.<br />
In other words, you tell the function whereabouts in memory the variable is. With this crucial bit <strong>of</strong> information, the<br />
function can go ahead and alter the value <strong>of</strong> the variable. For example,<br />
void myfunc(int *n) {<br />
*n = 6;<br />
}<br />
and you call it with<br />
int i = 0;<br />
myfunc(&i);<br />
Now, the value <strong>of</strong> i in the main program will be altered by the call to "myfunc".<br />
Here is an example <strong>of</strong> a function that swaps the values <strong>of</strong> its two arguments, which must be pointers to integers:<br />
void swap_args(int *pi, int *pj) {<br />
int temp;<br />
}<br />
temp = *pi;<br />
*pi = *pj;<br />
*pj = temp;<br />
If all the asterisks were left out <strong>of</strong> this routine, then it would still compile OK, but its only effect would be to swap local<br />
copies <strong>of</strong> "pi" and "pj".<br />
Pointers and arrays<br />
Consider an array declared as follows:<br />
int a[10];<br />
Then, as we have seen, a[0] is the first element <strong>of</strong> the array, and a[9] is the last. The address <strong>of</strong> the first element is<br />
"&a[0]", and by convention this can also be written simply as "a".<br />
Structures<br />
Any <strong>programming</strong> language worth its salt has to allow you to manipulate more complex data types than simply ints,<br />
floats, and arrays. You need to have structures <strong>of</strong> some sort. This is best illustrated by example:<br />
To define a structure, use the following syntax:<br />
struct time {<br />
int hour;<br />
int minute;<br />
float second;<br />
};<br />
This defines a new data type, called "time", which contains 3 fields (stored in consecutive locations in the computer's<br />
memory).<br />
Logically, it makes sense to associate the various fields <strong>of</strong> "time" together, since they are part <strong>of</strong> one physical entity:<br />
the time.<br />
To declare a variable <strong>of</strong> type "time", you could do the following:<br />
struct time t1[10], t2 = {12, 0, 0.0F};<br />
This defines "t1" to be an array <strong>of</strong> 10 "time" structures. The variable "t2" is also a "time" structure, and is initialised to<br />
12 hours, 0 minutes, 0.0 seconds.<br />
An example program using structures is addtime.c. This defines the above "time" structure, and then adds two times
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
32 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
together, taking account <strong>of</strong> any overflow in the seconds and the minutes. The assert command is also used to ensure<br />
that the times that are input are also valid.<br />
To refer to an individual element <strong>of</strong> a structure, you use the following syntax:<br />
t1[0].hour = 12;<br />
t1[0].minutes = 0;<br />
t1[0].second = t2.second;<br />
t1[1] = t2;<br />
Alternatively, and somewhat confusingly, if "p", say, is a pointer to a structure, then you can access the elements <strong>of</strong> the<br />
structure using notation like "p->hour". For example:<br />
struct time *p;<br />
// p is a pointer to a time structure<br />
struct time t = {1,2,3.0}; // t is a time structure<br />
p = &t;<br />
// p now points to t<br />
p->hour = 6;<br />
// this statement is exactly equivalent to...<br />
t.hour = 6; // ...this one<br />
Structures can contain any type, including arrays and other structures. A common use is to create a linked list by having<br />
a structure contain a pointer to a structure <strong>of</strong> the same type, for example,<br />
struct person {<br />
char *name[80];<br />
char *address[256];<br />
struct person *next_person;<br />
};<br />
where "person.next_person" is actually a pointer to a "person" struct. Sounds complicated, but it is simple when you<br />
get the hang <strong>of</strong> it.<br />
Structures are the first step towards object-oriented <strong>programming</strong> (OOP), where most <strong>of</strong> the <strong>programming</strong> effort is<br />
devoted to defining the best representation <strong>of</strong> the data, and the operations which can be performed on the data.<br />
In physics, a structure can be used to represent the state <strong>of</strong> a physical system. For example, here is a structure that<br />
might be used to represent the state <strong>of</strong> a particle in three-dimensional space:<br />
struct particle {<br />
double mass;<br />
double position[3];<br />
double velocity[3];<br />
};<br />
We could then define an array <strong>of</strong> 1000 particles as simply as:<br />
struct particle p[1000];<br />
You can take this one step further and define your own type (like "int", "double") by doing:<br />
typedef struct {<br />
double mass;<br />
double position[3];<br />
double velocity[3];<br />
} particle;<br />
In which case you can define the array <strong>of</strong> 1000 particles as:<br />
particle p[1000];<br />
String functions<br />
The standard C libraries include a bunch <strong>of</strong> functions for manipulating strings (i.e., arrays <strong>of</strong> chars).<br />
Note that before using any <strong>of</strong> these functions, you should include the line "#include " in your program. This<br />
include-file defines prototypes for the following functions, as well as defining special types such as "size_t", which are<br />
operating-system dependent.<br />
char *strcat(s1, s2) Concatenates s2 onto s1. null-terminates s1.<br />
Returns s1.<br />
char *strchr(s, c)<br />
Searches string s for character c. Returns a
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
33 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
int strcmp(s1, s2)<br />
pointer to the first occurence, or null pointer<br />
if not.<br />
Compares strings s1 and s2. Returns an integer<br />
that is less than 0 is s1 < s2; zero if s1 = s2;<br />
and greater than zero is s1 > s2.<br />
char *strcpy (s1, s2) Copies s2 to s1, returning s1.<br />
size_t strlen(s)<br />
char *strncat(s1, s2, n)<br />
int strncmp(s1, s2, n)<br />
char *strncpy(s1, s2)<br />
char *strrchr(s, c)<br />
int strstr(s1, s2)<br />
size_r strspn(s1, s2)<br />
size_r strcspn(s1, s2)<br />
char *strpbrk(s1, s2)<br />
char *strerror(n)<br />
char *strtok(s1, s2)<br />
Returns the number <strong>of</strong> characters in s, excluding<br />
the terminating null.<br />
Concatenates s2 onto s1, stopping after `n'<br />
characters or the end <strong>of</strong> s2, whichever occurs<br />
first. Returns s1.<br />
Like strcmp, except at most `n' characters are<br />
compared.<br />
Like strcpy, except at most `n' characters are<br />
copied.<br />
Like strchr, except searches from the end <strong>of</strong><br />
the string.<br />
Searches for the string s2 in s1. Returns a<br />
pointer if found, otherwise the null-pointer.<br />
Returns the number <strong>of</strong> consecutive characters<br />
in s1, starting at the beginning <strong>of</strong> the string,<br />
that are contained within s2.<br />
Returns the number <strong>of</strong> consecutive characters<br />
in s1, starting at the beginning <strong>of</strong> the string,<br />
that are not contained within s2.<br />
Returns a pointer to the first character in s1<br />
that is present in s2 (or NULL if none).<br />
Returns a pointer to a string describing the<br />
error code `n'.<br />
Searches s1 for tokens delimited by characters<br />
from s1. The first time it is called with a<br />
non-null s1, it writes a NULL into s1 at the<br />
first position <strong>of</strong> a character from s2, and<br />
returns a pointer to the beginning <strong>of</strong> s1. When<br />
strtok is then called with a null s1, it finds<br />
the next token in s1, starting at one position<br />
beyond the previous null.<br />
File operations<br />
defines a number <strong>of</strong> functions that are used for accessing files. Before using a file, you have to declare a<br />
pointer to it, so that it can be referred to with the functions. You do this with, for example,<br />
FILE *in_file;<br />
FILE *out_file;<br />
where "FILE" is a type defined in (it is usually a complicated structure <strong>of</strong> some sort). To open a file, you do,<br />
for example:<br />
in_file = fopen("input_file.dat", "r");<br />
out_file = fopen("output_file.dat", "w");<br />
Note the use <strong>of</strong> "r" to indicate read access, and "w" to indicate write access. The following modes are available:<br />
"r" read<br />
"w" write (destroys any existing file with the same name)<br />
"rb" read a binary file<br />
"wb" write a binary file (overwriting any existing file)<br />
"r+" opens an existing file for random read access, and<br />
writing to its end<br />
"w+" opens a new file for random read access, and writing to<br />
its end (destroys any existing file with the same name)<br />
Once the file is opened, you can use the following functions to read/write it:
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
34 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
int fread(void *ptr, int size, int nelm, FILE *fp) Reads nelm elements<br />
<strong>of</strong> data, each size bytes long, storing them<br />
at the location pointed to by ptr. Returns<br />
the number <strong>of</strong> elements read.<br />
int fwrite(void *ptr, int size, int nelm, FILE *fp) Writes nelm elements<br />
<strong>of</strong> data, each size bytes long, from the<br />
location pointed to by ptr. Returns<br />
the number <strong>of</strong> elements written.<br />
int getc(FILE *fp)<br />
Returns the next character from `fp', or<br />
EOF on error or end-<strong>of</strong>-file.<br />
int putc(int c, FILE *fp)<br />
Write the character c to the file `fp',<br />
returning the character written, or EOF on<br />
error.<br />
int fscanf(FILE *fp, char *format, ...) Like scanf, except input is taken<br />
from the file fp.<br />
int fprintf(FILE *fp, char *format, ...) Like printf, except output is written<br />
to the file fp.<br />
char *fgets(char *line, int n, FILE *fp) Gets the next line <strong>of</strong> input from the<br />
file fp, up to `n-1' characters in length.<br />
The newline character is included at the end<br />
<strong>of</strong> the string, and a null is appended. Returns<br />
`line' if successful, else NULL if end-<strong>of</strong>-file<br />
or other error.<br />
int fputs(char *line, FILE *fp) Outputs the string `line' to the file fp.<br />
Returns zero is successful, EOF if not.<br />
When you have finished with a file, you should close it with 'fclose':<br />
int fclose(FILE *fp)<br />
Closes the file 'fp', after flushing any<br />
buffers. This function is automatically called<br />
for any open files by the operating<br />
system at the end <strong>of</strong> a program.<br />
It can also be useful to flush any buffers associated with a file, to guarantee that the characters that you have written<br />
have actually been sent to the file:<br />
int fflush(FILE *fp)<br />
Flushes any buffers associated with the<br />
file 'fp'.<br />
To check the status <strong>of</strong> a file, the following functions can be called:<br />
int fe<strong>of</strong>(FILE *fp)<br />
int ferror(FILE *fp)<br />
void clearerr(FILE *fp)<br />
int fileno(FILE *fp)<br />
Returns non-zero when an end-<strong>of</strong>-file is read.<br />
Returns non-zero when an error has occurred,<br />
unless cleared by clearerr.<br />
Resets the error and end-<strong>of</strong>-file statuses.<br />
Returns the integer file descriptor associated<br />
with the file (useful for low-level I/O).<br />
Note that the above "functions" are actually defined as pre-processor macros in C. This is quite a common thing to do.<br />
The preprocessor, cpp<br />
Any line in your C source code that starts with the "#" (hash) chararacter is a preprocessor directive. Here are some<br />
examples <strong>of</strong> what is possible:<br />
#define PI 3.1415926535<br />
#define SQR(a) (sqrarg=(a),sqrarg*sqrarg)<br />
#include "filename" /* from the current directory */<br />
#include /* from the system directories (as modified by the "-I" switch) */<br />
#define DEBUG
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
35 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
/* defines the symbol DEBUG */<br />
#ifdef DEBUG<br />
/* code here is compiled if DEBUG is defined */<br />
#elif defined UNIX<br />
/* code here is compiled if DEBUG is not defined, and UNIX is defined */<br />
#else<br />
/* code here is compiled if neither DEBUG or UNIX are defined */<br />
#endif<br />
#if 0<br />
/* code here is never compiled */<br />
#endif<br />
Command-line arguments, and returning a status to the shell<br />
Up until now, our "main" function has always been declared as "int main(void)". We have seen that the integer return<br />
value is available to the UNIX "shell" in the "$?" environment variable, and this can be used to alter the execution <strong>of</strong><br />
scripts. It turns out that "main" can actually take arguments, and these are traditionally given as "int argc, char *argv[]",<br />
where argc is one plus the number <strong>of</strong> command-line arguments provided to the program, and argv is an array <strong>of</strong><br />
character strings containing the arguments. argv[0] is usually the name <strong>of</strong> the program itself. An example<br />
(statusexample.c) will hopefully clarify this:<br />
#include <br />
int main(int argc, char *argv[]) {<br />
printf("this program is called '%s'\n", argv[0]);<br />
if (argc == 1) {<br />
printf("it was called without any arguments\n");<br />
} else {<br />
int i;<br />
printf("it was called with %d arguments\n", argc - 1);<br />
for (i = 1; i < argc; i++) {<br />
printf("argument number %d was \n", i, argv[i]);<br />
}<br />
}<br />
exit(argc);<br />
}<br />
echo $?<br />
Allocating memory at run-time - malloc<br />
The situation <strong>of</strong>ten arises that you don't know the size <strong>of</strong> a data structure in your program at compile-time. For<br />
example, suppose you are writing a program to read an 2D image from a file into the computer's memory. If the size <strong>of</strong><br />
the image can vary at run-time, it would be nice not to have to pre-define the size <strong>of</strong> the memory array in your program.<br />
C allows you to do this through the concept <strong>of</strong> allocating and deallocating memory.<br />
Allocating memory means requesting the computer to give you access to a contiguous block <strong>of</strong> memory. E.g., suppose<br />
you need a megabyte; you ask the computer "can I use a megabyte <strong>of</strong> memory, and if so, can you please tell me where<br />
the memory is?". You ask the question by calling the function "malloc", which takes a single argument: the number <strong>of</strong><br />
bytes <strong>of</strong> memory that you want. malloc will attempt to allocate the memory, and if it is successful it will return a<br />
pointer to the first byte <strong>of</strong> the memory block (if malloc was unsuccessful, it returns NULL; this is a rare enough<br />
occurence for reasonable memory requests that you should handle it with "assert").<br />
Deallocating memory means telling the computer "I have now finished using this block <strong>of</strong> memory that you previously<br />
allocated for me with malloc, so you can now have the memory back again for some other purpose". You deallocate<br />
memory by calling the "free" function.<br />
Some important points to remember when using malloc:<br />
malloc returns a pointer to void, i.e., "(void *)", which just means that malloc itself knows nothing about what<br />
the block <strong>of</strong> memory is to be used for, which is fair enough since you didn't tell it. However, you do have to let<br />
the C compiler know, and you do this by type-casting malloc's result (see the example below).<br />
You should always call "free" to deallocate any memory that you have finished with. Note carefully that the<br />
argument to "free" must be a pointer obtained from malloc, and that you can only call free once for a given block
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
36 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
<strong>of</strong> memory.<br />
After freeing memory, it is good practice to set any pointers that referred to it to NULL, this will prevent you<br />
from inadvertantly using deallocated memory.<br />
The memory that malloc allocates can have random data in it. If you need it to be zeroed, either do this yourself<br />
with a "for" loop <strong>of</strong> use the "calloc" function (see "man calloc").<br />
Here is an example to get you started (mallocexample.c):<br />
#include <br />
#include <br />
#include <br />
/* Generates a two dimensional matrix, and fills it randomly with<br />
zeroes and ones. */<br />
int main(void) {<br />
int xdim, ydim;<br />
int i, j;<br />
int *p, *q;<br />
// Obtain the dimensions <strong>of</strong> the matrix from the user.<br />
printf("x dimension <strong>of</strong> matrix? > ");<br />
scanf("%d", &xdim);<br />
printf("y dimension <strong>of</strong> matrix? > ");<br />
scanf("%d", &ydim);<br />
// malloc the memory needed for the matrix. Note the use <strong>of</strong> "size<strong>of</strong>(int)"<br />
// which allows the program to run on machines with a variety <strong>of</strong> word sizes.<br />
assert(NULL != (p = (int *)malloc(xdim * ydim * size<strong>of</strong>(int))));<br />
// Fill the matrix with random 0s and 1s.<br />
for (i = 0; i < xdim * ydim; i++) {<br />
if (random() > RAND_MAX/2) {<br />
*(p+i) = 1;<br />
} else {<br />
*(p+i) = 0;<br />
}<br />
}<br />
// Now print the matrix out, showing the use <strong>of</strong> pointer arithmetic.<br />
for (i = 0; i < ydim; i++) {<br />
q = p + i * xdim;<br />
for (j = 0; j < xdim; j++) {<br />
printf("%1d", *(q++));<br />
}<br />
printf("\n");<br />
}<br />
// Free the memory we had allocated.<br />
free((void *)p);<br />
// And remove all information about where the memory used to be.<br />
// This is overkill in this example, since the program immediately<br />
// terminates, but it is good practice, in order to avoid<br />
// accidentally using free'ed memory.<br />
}<br />
p = q = NULL;<br />
return 0;<br />
Generating random numbers<br />
Random numbers are used in all sorts <strong>of</strong> ways in mathematics and physics. They are a fundamental tool that is <strong>of</strong>ten<br />
useful when conducting numerical simulations <strong>of</strong> physical systems with a computer.<br />
Writing a computer program to generate random numbers requires understanding some fairly subtle concepts.<br />
There are several different algorithms for generating a new random number given an initial random number. One <strong>of</strong> the
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
37 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
simplest (and best) algorithms for generating random integers is a multiplicative congruential algorithm: you simply<br />
multiply the current random number generator by a constant, and then take the modulus. The choice <strong>of</strong> the constant and<br />
the modulus is critical. A good choice is 16807 and 2147483647 (see Numerical Recipes in C, Press et al, 2nd Ed, page<br />
278).<br />
But how do you generate the initial random number, the so-called "seed", for the algorithm?<br />
One technique is to use the time (in milli or micro-seconds since the last integral second). This is easy to program, but<br />
has a few disadvantages, e.g, there are only 1000 or 1000000 different seeds, and these numbers are not necessarily<br />
equally likely to occur (e.g., the operating system might tend to start programs at particular times - see here for more<br />
details - randomnumber.c).<br />
Another technique on GNU/Linux systems is to read from the file /dev/random. /dev/random contains random numbers<br />
generated from "random" events such as the arrival time <strong>of</strong> internet packets, the movement <strong>of</strong> the mouse, the timing <strong>of</strong><br />
user keystrokes, etc. If you "od -h /dev/random" ("od -h" prints out the file as hexadecimal characters) you will find<br />
that it will eventually stop, waiting for new events to generate randomness - move the mouse and it will continue. An<br />
alternative is /dev/urandom, which will continuously generate random numbers, using an algorithm if events are too<br />
infrequent.<br />
If you require very high quality random numbers, then you should read about the subject, perhaps using Numerical<br />
Recipies in C as a starting point.<br />
The C libraries provide the functions "srandom" and "random" to generate random integers between 0 and<br />
RAND_MAX (defined in "stdlib.h") inclusive. Use the on-line manual pages to learn about "srandom" and "random"<br />
(i.e., type "man srandom" at the prompt). Here is an example program (randomnumber.c) showing their usage, and<br />
seeding them with the time in microseconds.<br />
Quasi-random numbers<br />
For some applications, you don't want truly random numbers, but would prefer numbers that cover an n-dimensional<br />
space more unformly. For example, let's take a more detailed look at Monte-Carlo integration.<br />
When using random numbers, the error involved in Monte-Carlo integration goes down as 1/sqrt(N), where N is the<br />
number <strong>of</strong> points. This means, e.g., if you use 10 times as many points, the error in your estimation <strong>of</strong> the integral will<br />
be reduced by a factor <strong>of</strong> about 1/sqrt(10), or 0.32. Can we do better than this? It turns out that a simple uniform grid <strong>of</strong><br />
points has an error that goes as 1/N, which is about 3 times better than the random case. However, a uniform grid has<br />
the undesirable property that you have to decide on the grid spacing in advance, and you can't simply add more points<br />
at a later time.<br />
What we need is a quasi-random number generator that chooses numbers that tend to fill the gaps in the sequence.<br />
There are several possibilities, and we will explore one designed by a Russian mathematician by the name <strong>of</strong> Sobel<br />
(sobel.c).<br />
Sorting<br />
A frequent requirement is to sort quantities (numbers, character strings, or more complicated structures) into some<br />
order (ascending, descending, or something more complex). The C library provides the function "qsort", an<br />
implementation <strong>of</strong> the famous quicksort algorithm. It is usually best to use "qsort" than write your own sorting<br />
algorithm.<br />
Here are two example programs showing the use <strong>of</strong> qsort.<br />
Multiple precision arithmetic<br />
As an example <strong>of</strong> how to program in C, let's explore the topic <strong>of</strong> multiple precision arithmetic. All the hard work will<br />
be done by GNU's Multiple Precision Arithmetic Library (GNU MP), written by Torbjorn Granlund.<br />
Multiple precision arithmetic allows you to perform calculations to a greater precision than the host computer will<br />
normally allow. The penalty you pay is that the operations are slower, and that you have to call subroutines to do all the<br />
calculations.<br />
GNU MP can perform operations on integers and rational numbers. It uses preprocessor macros (defined in gmp.h) to
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
38 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
define special data types for storing these numbers. MP_INT is an integer, and MP_RAT is a rational number.<br />
However, since a multiple precision number may occupy an arbitrarily large amount <strong>of</strong> memory, it is not sufficient to<br />
allocate memory for each number at compile time. GNU MP copes with this problem by dynamically allocating more<br />
memory when necessary.<br />
To begin with you need to initialise each variable that you are going to use. When you are finished using a variable,<br />
you should free the space it uses by calling a special function.<br />
MP_INT x, y;<br />
mpz_init(&x);<br />
mpz_init(&y);<br />
/* operations on x and y */<br />
mpz_clear(&x);<br />
mpz_clear(&y);<br />
Let's now try a full program. For example, calculating the square root <strong>of</strong> two (roottwo.c) to about 200 decimal places:<br />
#include <br />
#include <br />
int main(void) {<br />
char two[450], guess[225];<br />
int i;<br />
MP_INT c, x, temp, diff;<br />
two[0] = '2';<br />
for (i = 1; i < size<strong>of</strong>(two)-1; i++) {<br />
two[i] = '0';<br />
}<br />
two[i] = 0;<br />
mpz_init_set_str(&c, two, 10);<br />
guess[0] = '1';<br />
for (i = 1; i < size<strong>of</strong>(guess)-1; i++) {<br />
guess[i] = '0';<br />
}<br />
guess[i] = 0;<br />
mpz_init_set_str(&x, guess, 10);<br />
mpz_init(&temp);<br />
mpz_init(&diff);<br />
do {<br />
mpz_div(&temp, &c, &x);<br />
mpz_sub(&diff, &x, &temp);<br />
mpz_abs(&diff, &diff);<br />
mpz_add(&x, &temp, &x);<br />
mpz_div_ui(&x, &x, 2U);<br />
} while (mpz_cmp_ui (&diff, 10U) > 0);<br />
}<br />
printf("the square root <strong>of</strong> two is approximately ");<br />
mpz_out_str(stdout, 10, &x);<br />
printf("\n");<br />
return 0;<br />
To compile, link, and run the program, use<br />
gcc -o two two.c -lgmp<br />
./two<br />
Gnuplot<br />
Gnuplot is a simple graphical plotting package. It is particuarly easy to plot graphs with it. These <strong>notes</strong> provide an<br />
introduction to its use.<br />
Numerical integration<br />
Here is an example <strong>of</strong> using numerical integration to follow the motion <strong>of</strong> a planet around the Sun:<br />
Writing fast programs
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
39 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
Reasons for worrying about how fast a program is include:<br />
if it is interactive, and reponsiveness is important,<br />
if it is pushing the boundaries <strong>of</strong> practicality (e.g., it might takes weeks to run).<br />
Hints on improving speed:<br />
use a clever algorithm.<br />
identify the time-critical parts <strong>of</strong> the program, and concentrate on improving those.<br />
examine memory use as well (e.g., try to reduce the amount <strong>of</strong> memory being used, and keep the usage localised<br />
as opposed to jumping around randomly).<br />
iterate most rapidly over the rightmost subscript in arrays, since this will lead to better localisation <strong>of</strong> memory<br />
use (and hence greater likelyhood <strong>of</strong> cache hits).<br />
use optimisation (e.g., -O1, -O2, or -O3; see "man gcc" for details).<br />
try optimising for small program size (-Os).<br />
in general, try not to be "too clever" - the C compiler can quite <strong>of</strong>ten do a better job if your program is clear.<br />
make the critical parts <strong>of</strong> the program as small (in terms <strong>of</strong> bytes <strong>of</strong> instructions) as possible - they will then be<br />
more likely to fit in the CPU's "instruction cache", resulting in better performance.<br />
declare <strong>of</strong>ten-called functions as "inline", which eliminates the expense <strong>of</strong> the function call and copying the<br />
arguments to/from the stack. This technique should be used with caution, since it makes the program larger, in<br />
which case it may no longer fit in the instruction cache.<br />
avoid printing out lots <strong>of</strong> unnecessary information. Formatting and writing text to the screen is quite CPU<br />
intensive.<br />
2D cellular automata - Conway's Game <strong>of</strong> Life<br />
In 1970, the mathematician John Horton Conway published the description <strong>of</strong> a simple 2D cellular automata which he<br />
called the Game <strong>of</strong> Life. Cells in a 2D grid were either "alive" or "dead", and their state in the next generation was<br />
determined by their current state and the number <strong>of</strong> nearest neighbours (from 0 to 8, inclusive). The rules were chosen<br />
to simulate some properties <strong>of</strong> biological systems (e.g., a cell would die <strong>of</strong> "overcrowding" if too many <strong>of</strong> its<br />
neighbours were alive, and die <strong>of</strong> "lack <strong>of</strong> support" if too few <strong>of</strong> its neighbours were alive).<br />
Here is a "simple" example <strong>of</strong> how to program the Game <strong>of</strong> Life on a computer.<br />
If we are interested in following the evolution <strong>of</strong> the game for many thousands <strong>of</strong> generations, it is advantageous to<br />
think <strong>of</strong> ways <strong>of</strong> speeding up our simple implementation. The first thing to try is to turn on full optimisation during the<br />
compilation, using the "-O4" switch. The next step is to make the critical functions "inline" (see here for the code). We<br />
can also think about the algorithm, and realise that a lot <strong>of</strong> our calculation <strong>of</strong> the number <strong>of</strong> nearest neighbours was<br />
involved in handling the special case <strong>of</strong> being on the boundary; we can re-write our program to separate out the<br />
boundary case, thereby allowing a simplified (faster) function to do most <strong>of</strong> the neighbour calculations, at the expense<br />
<strong>of</strong> increased complexity in handling the evolution from one generation to the next. Finally, we can try to use the fact<br />
that much <strong>of</strong> the Life "universe" tends to be sparsely populated, allowing us to produce a second<br />
neighbourhood-calculating function for the special case that the preceeding cell had no nearest neighbours. These are<br />
by no means the only speed-ups that are possible, but they give you some flavour <strong>of</strong> what is possible.<br />
The following table gives the time taken to follow one million generations using a 850MHz Pentium III computer and<br />
the various tricks from the last paragraph.<br />
Time Technique<br />
[seconds]<br />
--------------------------------------------------<br />
330 Original program, no optimisation<br />
205 Original program, -O4 optimisation<br />
113 Inline function calls<br />
45 Boundary case handled separately<br />
37 Sparse population handled separately<br />
So, we have managed almost a factor <strong>of</strong> ten improvement in speed over our original attempt.<br />
To show why this might be useful, let's now alter our program to automatically identify glider patterns (those that can<br />
self-propagate). We will choose a brute-force technique where we randomly seed the centre <strong>of</strong> our rectangular<br />
"universe" with alive/dead cells, and then evolve the system until one <strong>of</strong> the boundary cells is hit. When this happens<br />
we stop the evolution and print a 21x21 grid around the cell, hopefully capturing a glider in full flight! Here is the code,<br />
and you can see that it is a straightforward variation on the earlier program. It typically finds a glider within 2 seconds
C <strong>programming</strong> <strong>notes</strong><br />
file:///F:/my_docs/web_phys2020/C<strong>programming</strong><strong>notes</strong>.html<br />
40 <strong>of</strong> 40 19/03/2007 10:06 AM<br />
on a 850MHz Pentium III computer.<br />
C <strong>programming</strong> course, <strong>School</strong> <strong>of</strong> <strong>Physics</strong>, UNSW / Michael Burton / mgb@phys.unsw.edu.au