C++ for Scientists - Technische Universität Dresden

Technische Universität Dresden
Faculty of Mathematics and Natural Sciences
Institute of Scientific Computing
01062 Dresden
http://www.math.tu-dresden.de/~pgottsch/script/cpp for scientists.pdf

Peter Gottschling

C++ for Scientists

based on a joint lecture with Karl Meerbergen,
with the help of Andrey Chesnokov, Yvette Vanberghen, Kris Demarsin, and Yao Yue,
and with contributions by René Heinzl and Philipp Schwaha

As of 16 January 2012


Copyright © 2010 Peter Gottschling, René Heinzl, Karl Meerbergen, and Philipp Schwaha


Contents

Part I: Understanding C++  7

Introduction  9
0.1 Programming languages for scientific programming  9
0.2 Outline  10

1 Good and Bad Scientific Software  11

2 C++ Basics  19
2.1 Our First Program  19
2.2 Variables  20
2.3 Operators  25
2.4 Expressions and Statements  31
2.5 Control statements  32
2.6 Functions  39
2.7 Input and output  45
2.8 Structuring Software Projects  47
2.9 Arrays  49
2.10 Pointers and References  50
2.11 Real-world example: matrix inversion  53
2.12 Exercises  62
2.13 Operator Precedence  64

3 Classes  65
3.1 Program for universal meaning not for technical details  65
3.2 Class members  66
3.3 Constructors  68
3.4 Destructors  74
3.5 Assignment  75
3.6 Automatically Generated Operators  76
3.7 Accessing object members  78
3.8 Other Operators  88

4 Generic programming  89
4.1 Templates  89
4.2 Generic functions  89
4.3 Generic classes  95
4.4 Concepts and Modeling  97
4.5 Inheritance or Generics?  99
4.6 Template Specialization  101
4.7 Non-Type Parameters for Templates  109
4.8 Functors  111
4.9 STL — The Mother of All Generic Libraries  121
4.10 Cursors and Property Maps  123
4.11 Exercises  128

5 Meta-programming  133
5.1 Let the Compiler Compute  134
5.2 Providing Type Information  135
5.3 Expression Templates  150
5.4 Meta-Tuning: Write Your Own Compiler Optimization  156
5.5 Exercises  185

6 Inheritance  187
6.1 Basic Principles  187
6.2 Dynamic Selection by Sub-typing  187
6.3 Remove Redundancy With Base Classes  189
6.4 Casting Up and Down and Elsewhere  189
6.5 Barton-Nackman Trick  193

7 Effective Programming: The Polymorphic Way  199
7.1 Imperative Programming  201
7.2 Generic Programming  203
7.3 Programming with Objects  206
7.4 Functional Programming  209
7.5 From Monomorphic to Polymorphic Behavior  212
7.6 Best of Both Worlds  221

Part II: Using C++  223

8 Finite World of Computers  225
8.1 Mathematical Objects inside the Computer  225
8.2 More Numbers and Basic Structure  226
8.3 A Loop and More  230
8.4 The Other Way Around  231

9 How to Handle Physics on the Computer  233
9.1 Finite Elements  233
9.2 Again, Integrators  234

10 Programming tools  235
10.1 GCC  235
10.2 Debugging  236
10.3 Valgrind  239
10.4 Gnuplot  239
10.5 Unix and Linux  240

11 C++ Libraries for Scientific Computing  243
11.1 GLAS: Generic Linear Algebra Software  243
11.2 Boost  244
11.3 Boost.Bindings  245
11.4 Matrix Template Library  249
11.5 Blitz++  249
11.6 Graph Libraries  249
11.7 Geometric Libraries  249

12 Real-World Programming  253
12.1 Transcending Legacy Applications  253

13 Parallelism  259
13.1 Multi-Threading  259
13.2 Message Passing  259

14 Numerical exercises  263
14.1 Computing an eigenfunction of the Poisson equation  263
14.2 The 2D Poisson equation  272
14.3 The solution of a system of differential equations  272
14.4 Google's Page rank  274
14.5 The bisection method for finding the zero of a function in an interval  276
14.6 The Newton-Raphson method for finding the minimum of a convex function  278
14.7 Sequential noise reduction of real-time measurements by least squares  281

15 Programming Projects  285
15.1 Exponentiation of Matrices  285
15.2 Exponentiation of Matrices  286
15.3 LU Factorization for m × n Matrices  286
15.4 Bunch-Kaufman Factorization  286
15.5 Condition Number (Reciprocal)  286
15.6 Matrix Scaling  287
15.7 QR with Overwriting  287
15.8 Direct Solver for Sparse Matrices  287
15.9 Applying MTL4 to Interval-Arithmetic Types  288
15.10 Applying MTL4 to Higher-Precision Types  289
15.11 Applying MTL4 to AD Types  289

16 Acknowledgement  291


Part I

Understanding C++


Introduction

"It would be nice if every kind of numeric software could be written in C++ without loss of efficiency, but unless something can be found that achieves this without compromising the C++ type system it may be preferable to rely on Fortran, assembler or architecture-specific extensions."
— Bjarne Stroustrup

The purpose of this script is to do you this favor, Bjarne. Amongst others. Conversely, the reader of this book shall learn the best way to benefit from C++ features for writing scientific software. It is not our goal to explain all C++ features in a well-balanced manner. We rather aim for an application-driven illustration of features that are valuable for writing

• Well-structured;
• Readable;
• Maintainable;
• Extensible;
• Type-safe;
• Reliable;
• Portable; and last but not least
• Highly performing

software.

0.1 Programming languages for scientific programming

Scientific programming is an old discipline in computer science. The first applications on computers were indeed computations. In the early decades, ALGOL was a relatively popular programming language, competing with FORTRAN. FORTRAN 77 became a standard in scientific programming because of its efficiency and portability. Other computer languages were developed in computer science but not frequently used in scientific computing: C, Ada, Java, C++. They were merely used in universities and labs for research purposes.


C++ was not a reliable computer language in the nineties: code was not portable, and object code was inefficient and large. This made C++ unpopular in scientific computing. This changed at the end of the nineties: compilers produced more efficient code, and the standard was supported more and more completely. Especially the ability to inline small functions and the introduction of complex numbers in the C99 standard made C++ more attractive to scientific programmers.

Together with the development of compilers, numerical libraries are being developed in C++ that offer great flexibility together with efficiency. This work is still ongoing, and more and more software is being written in C++. Currently, other languages used for numerics are FORTRAN 77 (even for new codes!), Fortran 95, and Matlab. Python is becoming more and more popular. The nice thing about Python is that it is relatively easy to link C++ functions and classes into Python scripts. Writing such interfaces is not a subject of this course.

The goal of this course is to introduce students to the exciting world of C++ programming for scientific applications. The course does not offer a deep study of the programming language itself but rather focuses on those aspects that make C++ suitable for scientific programming. Language concepts are introduced and applied to numerical programming, together with the STL and Boost.

Starting C++ programmers often adopt a Java programming style: both languages are object-oriented, but there are subtle differences that allow C++ to produce more compact expressions. For example, C++ classes typically do not have getters and setters, as is often the case in Java classes. This will be discussed in more detail in the course. We use the following convention, which is also used by Boost, one of the good examples of C++ software: classes and variables are denoted by lower-case characters, and underscores are used as separators in symbols. An exception are matrices, which are written as single capitals for similarity with the mathematical notation. Mixed upper- and lower-case characters (CamelCase) are typically used for concepts. Constants are often (as in C) written in capitals.

0.2 Outline

The topics that will be discussed are several aspects of the syntax of C++, illustrated by small numerical programs, an introduction to meta-programming, expression templates, the STL, Boost, MTL4, and GLAS. We will also discuss interoperability with other languages. The first three chapters discuss basic language aspects, such as functions, types, and classes, inheritance and generic programming, including examples from the STL. The remaining chapters discuss topics that are of great importance for numerical applications: functors, expression templates, and interoperability with FORTRAN and C.


Chapter 1

Good and Bad Scientific Software

This chapter will give you an idea of what we consider good scientific software and what not. If you have never programmed before in your life, you might wish to skip the entire chapter. This is okay, because if you have had no contact with the program sources of bad software, you can learn programming with a pure mind.

If you have some software knowledge, there might still be some details you will not understand right now, but this is no reason to worry. If you do not understand them after reading this script, then you can start worrying, or we as authors could. This chapter is only about getting a feeling for what distinguishes good from bad software in science.

As foundation of our discussion — and to not start the book with hello world — we consider an iterative method to solve a system of linear equations Ax = b, where A is a symmetric positive-definite (SPD) matrix, x and b are vectors, and x is sought. The method is called ‘Conjugate Gradients’ (CG) and was introduced by Magnus R. Hestenes and Eduard Stiefel [?].

The mathematical details do not matter here, but the different styles of implementation do. The algorithm can be written in the following form:¹

Algorithm 1: Conjugate Gradient Method.

Input: SPD matrix A, vector b, left preconditioner L, termination criterion ε.
Output: Vector x such that Ax ≈ b.

 1  r = b − Ax
 2  while |r| ≥ ε do
 3      z = L⁻¹ r
 4      ρ = ⟨r, z⟩
 5      if first iteration then
 6          p = z
 7      else
 8          p = z + (ρ/ρ′) p
 9      q = A p
10      α = ρ/⟨p, q⟩
11      x = x + α p
12      r = r − α q
13      ρ′ = ρ

¹ This is not precisely the original notation but a slightly adapted version that introduces some extra variables to avoid redundant calculations.



Programmers transform this mathematical notation into a form that a compiler understands, using operations from the language. The result could look like Listing 1.1. Do not read it in detail; just skim it.

#include <stdlib.h>
#include <math.h>

double one_norm(int size, double *vp)
{
    int i;
    double sum= 0;
    for (i= 0; i < size; i++)
        sum+= fabs(vp[i]);
    return sum;
}

double dot(int size, double *vp, double *wp)
{
    int i;
    double sum= 0;
    for (i= 0; i < size; i++)
        sum+= vp[i] * wp[i];
    return sum;
}

int cg(int size, int nnz, int* aip, int* ajp, double* avp,
       double *x, double *b, void (*lpre)(int, double*, double*), double eps)
{
    int i, iter= 0;
    double rho, rho_1, alpha;
    double *p= (double*) malloc(size * sizeof(double));
    double *q= (double*) malloc(size * sizeof(double));
    double *r= (double*) malloc(size * sizeof(double));
    double *z= (double*) malloc(size * sizeof(double));

    // r= b;
    for (i= 0; i < size; i++)
        r[i]= b[i];
    // r-= A*x;
    for (i= 0; i < nnz; i++)
        r[aip[i]]-= avp[i] * x[ajp[i]];

    while (one_norm(size, r) >= eps) {
        // z= solve(L, r);
        (*lpre)(size, z, r);               // function pointer call
        rho= dot(size, r, z);
        if (!iter) {
            for (i= 0; i < size; i++)
                p[i]= z[i];
        } else {
            for (i= 0; i < size; i++)
                p[i]= z[i] + rho / rho_1 * p[i];
        }
        // q= A * p;
        for (i= 0; i < size; i++)
            q[i]= 0;
        for (i= 0; i < nnz; i++)
            q[aip[i]]+= avp[i] * p[ajp[i]];
        alpha= rho / dot(size, p, q);
        // x+= alpha * p; r-= alpha * q;
        for (i= 0; i < size; i++) {
            x[i]+= alpha * p[i];
            r[i]-= alpha * q[i];
        }
        rho_1= rho;
        iter++;
    }
    free(q); free(p); free(r); free(z);
    return iter;
}

void ic_0(int size, double* out, double* in) { /* .. */ }

int main (int argc, char* argv[])
{
    int nnz, size;
    // set nnz and size
    int *aip= (int*) malloc(nnz * sizeof(int));
    int *ajp= (int*) malloc(nnz * sizeof(int));
    double *avp= (double*) malloc(nnz * sizeof(double));
    double *x= (double*) malloc(size * sizeof(double));
    double *b= (double*) malloc(size * sizeof(double));
    // set A and b
    cg(size, nnz, aip, ajp, avp, x, b, ic_0, 1e-9);
    return 0;
}

Listing 1.1: Low Abstraction Implementation of CG

As said before, the details do not matter here, only the principal approach. The good thing about this code is that it is self-contained. But this is about its only advantage. The problem with this implementation is its low abstraction level. This creates three major disadvantages:

• Bad readability;
• No flexibility; and
• High error-proneness.

The bad readability manifests in the fact that almost every operation is implemented in one or multiple loops. For instance, would we have found the matrix-vector multiplication q = Ap without the comments? We would easily catch where the variables representing q, A, and p are used, but to see that this is a matrix-vector product takes a closer look and a good understanding of how the matrix is stored.

This leads us to the second problem: the implementation commits to many technical details and only works in precisely this context. Algorithm 1 only requires that matrix A is symmetric positive-definite; it does not demand a certain storage scheme. There are many other sparse matrix formats that we could all use in the CG method, but not with this implementation. The matrix format is not the only detail the code commits to. What if we want to compute in lower (float) or higher precision (long double)? Or solve a complex linear system? For every such new CG application, we need a new implementation. Needless to say, running on parallel computers or exploring GPGPU (General-Purpose Graphics Processing Unit) acceleration needs reimplementations as well. Much worse, every combination of the above needs a new implementation.

Some readers might think: “It is only one function of 20–30 lines. How much work can rewriting this little function be? And we do not introduce new matrix formats or computer architectures every month.” Certainly true, but in some sense it is putting the cart before the horse. Because of such an inflexible and detail-obsessed programming style, many scientific applications grew into the 100,000s and millions of lines of code. Once an application or library has reached such a monstrous size, modifying features of the software is very arduous and only rarely done. The road to success is starting scientific software from a higher level of abstraction from the beginning, even if it is more work initially.

The last major disadvantage is how error-prone the code is. All arguments are given as pointers, and the size of the underlying arrays is given as an extra argument. We as programmers of the function cg can only hope that the caller did everything right, because we have no way to verify it. If the user does not allocate enough memory (or does not allocate at all), the execution will crash at some more or less random position or, even worse, will generate nonsensical results because data and software can be randomly overwritten. Good programmers must avoid such fragile interfaces because the slightest mistake can have catastrophic consequences and the program errors are extremely difficult to find. Unfortunately, even recently released and widely used software is written in this manner, either for backward compatibility with C and Fortran or because it is written in one of these two languages. In fact, the implementation above is C and not C++. If this is the way you love software, you probably will not like this script.

So much for software we do not like. In Listing 1.2 we show how scientific software could look.

// This source is part of MTL4

#include <boost/numeric/mtl/mtl.hpp>
#include <boost/numeric/itl/itl.hpp>

template < typename LinearOperator, typename HilbertSpaceX, typename HilbertSpaceB,
           typename Preconditioner, typename Iteration >
int conjugate_gradient(const LinearOperator& A, HilbertSpaceX& x, const HilbertSpaceB& b,
                       const Preconditioner& L, Iteration& iter)
{
    typedef HilbertSpaceX                                  Vector;
    typedef typename mtl::Collection<Vector>::value_type   Scalar;
    Scalar rho(0), rho_1(0), alpha(0);
    Vector p(resource(x)), q(resource(x)), r(resource(x)), z(resource(x));

    r = b - A*x;
    while (! iter.finished(r)) {
        z = solve(L, r);
        rho = dot(r, z);
        if (iter.first())
            p = z;
        else
            p = z + (rho / rho_1) * p;
        q = A * p;
        alpha = rho / dot(p, q);
        x += alpha * p;
        r -= alpha * q;
        rho_1 = rho;
        ++iter;
    }
    return iter;
}

int main (int argc, char* argv[])
{
    int size;
    // set size
    mtl::compressed2D<double>   A(size, size);
    mtl::dense_vector<double>   x(size), b(size);
    // set A and b

    // Create preconditioner
    itl::pc::ic_0<mtl::compressed2D<double> > L(A);
    // Object that controls the iteration: terminate if the residual is below 10^-9 or
    // decreased by 6 orders of magnitude; abort after 30 iterations if not converged
    itl::basic_iteration<double> iter(b, 30, 1.e-6, 1.e-9);

    conjugate_gradient(A, x, b, L, iter);
    return 0;
}

Listing 1.2: High Abstraction Implementation of CG

The first thing you might realize is that the CG implementation is readable without comments. As a rule of thumb: if other people's comments look like your program sources, then you are a really good programmer. If you compare the mathematical notation in Algorithm 1 with Listing 1.2, you will realize that — except for the type and variable declarations at the beginning — they are identical. Some readers might think that it looks more like Matlab or Mathematica than C++. Yes, C++ can look like this if one puts enough effort into good software. Evidently, it is also much easier to write algorithms at this abstraction level than to express them with low-level operations.

The Purpose of Scientific Software

Scientists shall do science.

Excellent scientific software is expressed only in mathematical and domain-specific operations, without any technical detail exposed.

At this abstraction level, scientists can focus on models and algorithms, being much more productive and advancing scientific discovery.

Nobody knows how many scientists waste how much time every year dwelling on the small technical details of bad software like Listing 1.1. Of course, the technical details have to be realized somewhere, but a scientific application is the worst possible location. Use a two-level approach: write your applications in terms of expressive mathematical operations, and if they do not exist, implement them separately. These mathematical operations must be carefully implemented for maximal performance, or be composed of other operations with maximal performance. Investing time in the performance of these fundamental operations pays off handsomely because the functions will be reused very often.

Advice

Use the right abstractions!
If they do not exist, implement them.

Speaking of abstractions, the CG implementation in Listing 1.2 does not commit to any technical detail. Nowhere is the function restricted to a numerical type like double. It works just as well for float, GNU's multi-precision numbers, complex, interval arithmetic, quaternions, . . .

The matrix A can have any internal format; as long as it can be multiplied with a vector, it can be used in the function. In fact, it does not even need to be a matrix but can be any linear operator. For instance, an object that performs a Fast Fourier Transformation (FFT) on a vector can be used as A when the FFT is expressed by a product of A with the vector. Similarly, the vectors do not need to be represented by finite-dimensional arrays but can be elements of any vector space that is somehow computer-representable, as long as all operations in the algorithm can be performed.

We are also open to other computer architectures. If the matrix and the vectors are distributed over the nodes of a parallel supercomputer and corresponding parallel operations are available, the function runs in parallel without changing a single line. (GP)GPU acceleration can also be realized within the data structures and their operations, without changing the algorithm. In general, any existing or new platform that is supported by the operations of the matrix and vector types is also supported by our ‘generic’ conjugate gradient function. As mentioned before, we do not even need to change it. If we have a sophisticated scientific application of several thousand lines (not 100,000s) written with appropriate abstractions, we do not need to modify it either.

Starting with the next chapter, we will explain how to write good scientific software.





Chapter 2

C++ Basics

In this first chapter we will briefly introduce some basic knowledge about C++. A useful site with a reference manual for C++ is http://www.cplusplus.com/.

2.1 Our First Program

As an introduction to the C++ language, let us look at the following example:

#include <iostream>

int main ()
{
    std::cout << "Answer to the Ultimate Question of Life, the Universe, and Everything is "
              << 6 * 7 << std::endl;
    return 0;
}

according to Douglas Adams’ “Hitchhiker’s Guide to the Galaxy.” This short example shows<br />

already many things about C ++:<br />

• The first line includes a file name “iostream.” Whatever is defined in this file will be<br />

defined in our program as well. The file “iostream” contains the standard I/O of C ++.<br />

Input and output is not part of the core language in C ++ but part of the standard libraries.<br />

This means that we cannot program I/O commands without including “iostream” (or<br />

something similar). But it also means that this file must exist in every compiler because<br />

it is part of the standard. Include commands should be at the beginning of the file if<br />

possible.<br />

• the main program is called main and has an integer return value, which is set to 0 by the<br />

return command. The caller of a program (usually the operating system) knows that it<br />

finished successfully when a 0 is returned. A return code other than 0 symbolizes that<br />

something went wrong and often the return code also says something about what went<br />

wrong.<br />

• Braces “{ }” denote a block/group of code (also called a compound statement). Variables<br />

declared within “{ }” groups are only accessible within this block.<br />

19


20 CHAPTER 2. <strong>C++</strong> BASICS<br />

• std::cout and std::endl are defined in “iostream.” The <strong>for</strong>mer is an output stream that prints<br />

text on the screen (unless it is redirected). With std::endl a line is terminated.<br />

• The special operator ≪ is used to pass objects to the output stream std::cout, which<br />

then prints them.<br />

• The double quotes surround string constants, more precisely string literals. This is the<br />

same as in C. For string manipulation, however, one should use C ++’s string class instead<br />

of C’s cumbersome and error-prone functions.<br />

• The expression 6 ∗ 7 is evaluated and a temporary integer is passed to std::cout. In C ++<br />

everything has a type. Sometimes we as programmers have to declare the type and<br />

sometimes the compiler deduces it <strong>for</strong> us. 6 and 7 are literal constants that have type int<br />

and so does their product.<br />

This was a lot of in<strong>for</strong>mation <strong>for</strong> such a short program. So let us start step by step.<br />

TODO: A little explanation how to compile and run it. For g++ and Visual Studio.<br />

2.2 Variables<br />

In contrast to most scripting languages C ++ is strongly typed, that is, every variable has a type<br />

and this type never changes. A variable is declared by a statement TYPE varname. 1 Basic types<br />

are int, unsigned int, long, float, double, char, and bool.<br />

int integer1 = 2;<br />

int integer2, integer3;<br />

float pi = 3.14159;<br />

char mycharacter = ’a’;<br />

bool cmp = integer1 < pi;<br />

Each statement has to be terminated by a “;”. In the following section, we show operations<br />

that are often applied to integer and float types. In contrast to other languages like Python,<br />

where ’ and ” are used <strong>for</strong> both characters and strings, C ++ distinguishes between the two of<br />

them. The C ++ compiler considers ’a’ as the character ‘a’ (it has type char) and ”a” as the string<br />

containing ‘a’ (it has type const char[2], including the terminating null character). If you are used to Python please pay attention to this.<br />

Advice<br />

Define variables right be<strong>for</strong>e using them the first time. This makes your<br />

programs more readable when they grow long. It also allows the compiler to<br />

use your memory more efficiently when you have nested scopes (more details<br />

later). Old C versions required all variables to be defined at the beginning of a<br />

function, and some people stick to this even today. However, in C ++ it generally leads<br />

to higher efficiency and, more importantly, to higher readability to<br />

define variables as late as possible.<br />

1 TODO: too simple, variable lists and in-place initialization is missing


2.2. VARIABLES 21<br />

2.2.1 Constants<br />

Syntactically, constants are like special variables in C ++ with the additional attribute of immutability.<br />

const int integer1 = 2;<br />

const int integer3; // Error<br />

const float pi = 3.14159;<br />

const char mycharacter = ’a’;<br />

const bool cmp = integer1 < pi;<br />

As they cannot be changed, it is mandatory to set the value in the definition. The second<br />

constant definition violates this rule and the compiler will complain about it.<br />

Constants can be used wherever variables are allowed — as long as they are not modified, of<br />

course. On the other hand, constants like those above are already known during compilation.<br />

This enables many kinds of optimizations and the constants can even be used as arguments of<br />

types (we will come back to this later).<br />

2.2.2 Literals<br />

Literals like “2” or “3.14” have types as well. Simply put, integral numbers are treated as<br />

int, long or unsigned long depending on the number of digits. Every number with a dot or an<br />

exponent (e.g. 3e12 ≡ 3 · 10 12 ) is considered a double.<br />

Usually this does not matter much in practice since C ++ has implicit conversion between<br />

built-in numeric types and most programs work well without explicitly specifying the type of<br />

the literals. There are, however, three major reasons <strong>for</strong> paying attention to the types of literals:<br />

• Availability;<br />

• Ambiguity and<br />

• Accuracy.<br />

Without going into detail here, the implicit conversion is not used with template functions<br />

(<strong>for</strong> good reasons). The standard library provides a type <strong>for</strong> complex numbers where the type<br />

<strong>for</strong> the real and imaginary part can be parametrized by the user:<br />

std::complex&lt;float&gt; z(1.3, 2.4), z2;<br />

These complex numbers provide of course the common operations. However, when we write:<br />

z2= 2 ∗ z; // error<br />

z2= 2.0 ∗ z; // error<br />

we will get an error message that the multiplication is not available. More specifically, the<br />

compiler will tell us that there is no operator∗() <strong>for</strong> int and std::complex&lt;float&gt; respectively <strong>for</strong><br />

double and std::complex&lt;float&gt;. 2 The library provides a multiplication <strong>for</strong> the type that we use<br />

<strong>for</strong> the real and imaginary part, here float. There are two ways to ascertain that “2” is float:<br />

z2= float(2) ∗ z;<br />

z2= 2.0f ∗ z;<br />

2 It is however possible to implement std::complex in a fashion such that these expressions work [Got11].



In the first case, we have an int literal that is converted into float and in the second case, the<br />

literal is float from the beginning. For the sake of clarity, the float literal is preferable.<br />

Later in this book we will introduce function overloading, that is, functions with different<br />

implementations <strong>for</strong> different argument types (or argument tuples). The compiler selects the<br />

function overload that fits best. Sometimes the best fit is not clear, <strong>for</strong> instance if function f<br />

accepts an unsigned or a pointer and we call:<br />

f(0);<br />

“0” is considered as int and can be implicitly converted into unsigned or any pointer type. None<br />

of the conversions is prioritized. As be<strong>for</strong>e we can address the issue by explicit conversion and<br />

by a literal of the desired type:<br />

f(unsigned(0));<br />

f(0u);<br />

Again, we prefer the second version because it is more direct (and shorter).<br />

The accuracy issue comes up when working with long double. On the author’s computer, the <strong>for</strong>mat<br />

can handle at least 19 digits. Let us define one third with 20 digits and print out 19 of them:<br />

long double third= 0.3333333333333333333;<br />

cout.precision(19);<br />

cout ≪ ”One third is ” ≪ third ≪ ”.\n”;<br />

The result is:<br />

One third is 0.3333333333333333148.<br />

The program behavior is more satisfying if we append an “l” to the number:<br />

long double third= 0.3333333333333333333l;<br />

yielding the print-out that we hoped <strong>for</strong>:<br />

One third is 0.3333333333333333333.<br />

The following table gives examples of literals and their type:<br />

Literal Type<br />

2 int<br />

2u unsigned<br />

2l long<br />

2ul unsigned long<br />

2.0 double<br />

2.0f float<br />

2.0l long double<br />

For more details, see <strong>for</strong> instance [Str97, § 4.4f, § C.4]. There you also find a description of how to<br />

define literals in octal or hexadecimal notation.



2.2.3 Scope of variables<br />

Global definition: Every variable that we intend to use in a program must have been declared<br />

with its type specifier at an earlier point in the code. A variable can be either of global or local<br />

scope. A global variable is a variable that has been declared in the main body of the source<br />

code, outside all functions. After declaration, global variables can be referred to from anywhere in<br />

the code, even inside functions. This sounds very handy because it is easily available but when<br />

your software grows it becomes more difficult and painful to keep track of the global variables’<br />

modifications. At some point, every code change bears the potential of triggering an avalanche<br />

of errors. Just do not use global variables; sooner or later you will regret it. Believe us.<br />

Global constants like<br />

const double pi= 3.14159265358979323846264338327950288419716939;<br />

are fine because they cannot cause side effects.<br />

Local definition: In contrast, a local variable is declared within the body of a function<br />

or a block. Its visibility/availability is limited to the block enclosed in curly braces { } where<br />

it is declared. More precisely, the scope of a variable is from its definition to the end of the<br />

enclosing braces. Recalling the example of output streams<br />

int main ()<br />

{<br />

std::ofstream myfile(”example.txt”);<br />

myfile ≪ ”Writing this to a file. ” ≪ std::endl;<br />

return 0;<br />

}<br />

the scope of myfile is from its definition to the end of the function main. If we wrote:<br />

int main ()<br />

{<br />

int a= 5;<br />

{<br />

std::ofstream myfile(”example.txt”);<br />

myfile ≪ ”Writing this to a file. ” ≪ std::endl;<br />

}<br />

myfile ≪ ”a is ” ≪ a ≪ std::endl; // error<br />

return 0;<br />

}<br />

then the second output is not valid because myfile is out of scope. The program would not<br />

compile and the compiler would tell you something like “myfile is not defined in this<br />

scope”.<br />

Hiding: If variables with the same name exist in different scopes then only one variable is visible;<br />

the others are hidden. A variable in an inner scope hides all variables of the same name in outer scopes. For<br />

instance: 3<br />

3 TODO: Picture would be nice.



int main ()<br />

{<br />

int a= 5; // define #1<br />

{<br />

a= 3; // assign #1, #2 is not defined yet<br />

int a; // define #2<br />

a= 8; // assign #2, #1 is hidden<br />

{<br />

a= 7; // #2<br />

}<br />

} // end of #2’s scope<br />

a= 11; // #1, #2 is now out of scope<br />

return 0;<br />

}<br />

Defining the same variable name twice in the same scope is an error.<br />

The advantage of scopes is that you do not need to worry whether a variable (or something<br />

else) is already defined outside the scope. It is just hidden but does not create a conflict. 4<br />

Un<strong>for</strong>tunately, the hiding makes the variables of the same name in the outer scopes inaccessible.<br />

The best thing you can do is to rename the variable in the inner scope (and possibly in the next-outer<br />

scope(s) to access more of those variables). Renaming the outermost variable also solves<br />

the problem of accessibility but tends to be more work because it is probably used more often<br />

due to its longer lifetime. A better solution to manage nesting and accessibility is namespaces,<br />

see the next section.<br />

Scopes also have the advantage of allowing memory reuse, e.g.:<br />

int main ()<br />

{<br />

int x, y;<br />

float z;<br />

cin ≫x;<br />

if (x < 4) {<br />

y= x ∗ x;<br />

// something with y<br />

} else {<br />

z= 2.5 ∗ float(x);<br />

// something with z<br />

}<br />

}<br />

The example uses three variables. However, they are never used at the same time. y is only<br />

used in the first branch and z only in the second one.<br />

Thus, we rewrite the program as follows<br />

int main ()<br />

{<br />

int x;<br />

cin ≫x;<br />

4 As opposed to macros, an obsolete and reckless legacy feature from C that should be avoided at any price<br />

because it undermines all structure and reliability of the language.


2.3. OPERATORS 25<br />

if (x < 4) {<br />

int y= x ∗ x;<br />

// something with y<br />

} else {<br />

float z= 2.5 ∗ float(x);<br />

// something with z<br />

}<br />

}<br />

then y exists only in the first branch and z only in the second one. In general, it helps<br />

us save memory to let variables live only as long as necessary, especially when we have very<br />

large objects. That is, define variables as late as possible — ideally directly be<strong>for</strong>e using them — then<br />

they are implicitly in the innermost possible scope, e.g. in the branches in the previous example<br />

instead of the main function. The reduced code complexity of having fewer active variables at<br />

any point in your program also simplifies your life if the program does not do what it should (in very<br />

rare cases, of course) and you have to debug it.<br />

For all those reasons, it is also preferable to define loop indices directly in the loop:<br />

<strong>for</strong> (int i= 0; i < n; i++) { ... }<br />

If you need the loop index afterwards you must define it outside; otherwise you run into an error:<br />

cin ≫x;<br />

<strong>for</strong> (int i= 0; abs(x) > 0.001 && i < 100; i++)<br />

x= f(x);<br />

cout ≪ ”Did ” ≪ i ≪ ” iterations.\n”; // error, which i????<br />

The example is some kind of (probably useless) fixed-point calculation. It stops when |x| ≤<br />

0.001 or 100 iterations were per<strong>for</strong>med (remember the second term is not a termination but a<br />

continuation criterion). When we have finished the loop we want to know how many iterations we<br />

per<strong>for</strong>med. But our loop index has already died. Let’s try again:<br />

cin ≫x;<br />

int i;<br />

<strong>for</strong> (i= 0; abs(x) > 0.001 && i < 100; i++)<br />

x= f(x);<br />

cout ≪ ”Did ” ≪ i ≪ ” iterations.\n”;<br />

Now it works.<br />

2.3 Operators<br />

C ++ is rich in built-in operators. An operator is a symbol that tells the compiler to per<strong>for</strong>m<br />

specific mathematical or logical manipulations. C ++ has three general classes of operators,<br />

arithmetic, boolean, and bitwise. This section gives a short overview of the different operators<br />

and their meaning.<br />

2.3.1 Arithmetic operators<br />

The following table lists the arithmetic operators allowed in C ++:



Operator Action<br />

− subtraction, also unary minus<br />

+ addition<br />

∗ multiplication<br />

/ division<br />

% modulus<br />

−− decrement<br />

++ increment<br />

The modulus operator yields the remainder of the integer division. The ++ operator adds one<br />

to its operand and −− subtracts one. Both can precede or follow the operand. When they<br />

precede the operand, the corresponding operation will be per<strong>for</strong>med be<strong>for</strong>e using the operand’s<br />

value to evaluate the rest of the expression. If the operator follows its operand, C ++ will use<br />

the operand’s value be<strong>for</strong>e incrementing or decrementing it. Consider the following example:<br />

x = 1;<br />

y = ++x;<br />

x = 1;<br />

z = x++;<br />

As a result of executing these four lines of code, y will be set to 2, x will be set to 2 and z will<br />

be set to 1.<br />

The priority and associativity of binary arithmetic operators is the same as we know it from<br />

math: multiplication and division precedes addition and subtraction. Thus, x + y ∗ z is evaluated<br />

as x + (y ∗ z). Operations of the same priority are left-associative, i.e. x / y ∗ z is<br />

equivalent to (x / y) ∗ z. Unary operators have precedence over binary: x ∗ y++ / −z means<br />

(x ∗ (y++)) / (−z). Nevertheless, as long as you are still learning C ++ and not entirely sure<br />

about the precedences, you might want to add redundant parenthesis instead of wasting hours<br />

debugging your program.<br />

With these operators we can write our first numeric program:<br />

#include &lt;iostream&gt;<br />

int main ()<br />

{<br />

float r1 = 3.5, r2 = 7.3, pi = 3.14159;<br />

float area1 = pi ∗ r1∗r1;<br />

std::cout ≪ ”A circle of radius ” ≪ r1 ≪ ” has area ”<br />

≪ area1 ≪ ”.” ≪ std::endl;<br />

std::cout ≪ ”The average of ” ≪ r1 ≪ ” and ” ≪ r2 ≪ ” is ”<br />

≪ (r1+r2)/2 ≪ ”.” ≪ std::endl;<br />

return 0 ;<br />

}<br />

2.3.2 Boolean operators<br />

Boolean operators are logical and relational operators. Both return boolean values, there<strong>for</strong>e<br />

the name. Operators and their significations are:



Operator Meaning<br />

&gt; greater than<br />

&gt;= greater than or equal to<br />

&lt; less than<br />

&lt;= less than or equal to<br />

== equal to<br />

!= not equal to<br />

&amp;&amp; logical AND<br />

|| logical OR<br />

! logical NOT<br />

Arithmetic operators bind more strongly than relational ones, so 4 &gt;= 1 + 7 is evaluated as if it were written 4 &gt;= (1 + 7).<br />

Advise<br />

Integer values can be treated in C ++ as boolean. For the sake of clarity it is<br />

always better to use bool <strong>for</strong> all logical expressions.<br />

This is a legacy of C where bool does not exist. Almost all techniques from C work also in<br />

C ++— as the language name suggests — but using the new features of C ++ allows you to write<br />

programs with better structure. For instance, if you want to store the result of a comparison<br />

do not use an integer variable but a bool.<br />

bool out_of_bound = x &lt; min || x &gt; max;<br />

2.3.3 Bitwise operators<br />

Bitwise operators allow you to test or change the bits of integers. 5 There are the following<br />

operations:<br />

Operator Action<br />

& AND<br />

| OR<br />

ˆ exclusive OR<br />

∼ one’s complement (NOT)<br />

≫ shift right<br />

≪ shift left<br />

The shift operators bitwise shift the value on their left by the number of bits on their right:<br />

• ≪ shifts left and adds zeros at the right end.<br />

• ≫ shifts right and adds either 0s, if the value is an unsigned type, or extends the top bit (to<br />

preserve the sign) if it is a signed type.<br />

5 The bitwise operators work also on bool but it is favorable to use the logical operators from the previous<br />

section. Especially the shift operators are rather silly <strong>for</strong> bool.



The bitwise operations can be used to characterize properties in a very compact <strong>for</strong>m as in the<br />

following example:<br />

#include &lt;iostream&gt;<br />

int main ()<br />

{<br />

int concave = 1, monotone = 2, continuous = 4;<br />

int f_is = concave | continuous;<br />

std::cout ≪ ”f is ” ≪ f_is ≪ std::endl;<br />

std::cout ≪ ”Is f concave? (0 means no, 1 means yes) ”<br />

≪ (f_is &amp; concave) ≪ std::endl;<br />

f_is = f_is | monotone;<br />

f_is = f_is ˆ concave;<br />

std::cout ≪ ”f is now ” ≪ f_is ≪ std::endl;<br />

return 0 ;<br />

}<br />

The first statement in main introduces three properties that can be combined arbitrarily. The numbers are powers<br />

of two so that their binary representations contain a single 1-bit each. In the definition of f_is we used<br />

bitwise OR to combine two properties. Bitwise AND allows <strong>for</strong> masking single or multiple bits,<br />

as shown in the second output statement. Afterwards an additional property is set with bitwise OR, and bitwise exclusive<br />

OR (XOR) allows <strong>for</strong> toggling a property. Operating systems and hardware drivers<br />

use this style of operations extensively. But it needs some practice to get used to it.<br />

Shift operations provide an efficient way to multiply with or divide by powers of 2 as shown in<br />

the following code:<br />

int i = 78;<br />

std::cout ≪ ”i ∗ 8 is ” ≪ (i ≪ 3)<br />

≪ ”, i / 4 is ” ≪ (i ≫2) ≪ std::endl;<br />

Obviously, that needs some familiarization as well.<br />

On the per<strong>for</strong>mance side, today’s processors are quite fast at multiplying integers so that you<br />

will not see a big per<strong>for</strong>mance boost when replacing your products by left shifts. Division is<br />

still a bit slow and a right shift can make a difference. Even then the price of this source code<br />

obfuscation is only justified if the operation is critical <strong>for</strong> the overall per<strong>for</strong>mance of your entire<br />

application.<br />

2.3.4 Compound assignment<br />

The compound assignment operators apply an arithmetic operation to the left and right-hand<br />

side and store the result in the left-hand side.<br />

These operators are +=, −=, ∗=, /=, %=, ≫=, ≪=, &amp;=, ˆ=, and |=.<br />

The statement a += b is equivalent to the statement a = a + b.<br />



2.3.5 Bracket operators<br />

The operator [] is used to access elements of arrays (see § 2.9), and () <strong>for</strong> function calls.<br />

2.3.6 All operators<br />

We haven’t introduced all operators yet. They will be shown in an appropriate context. For<br />

now, we only list the entire operator set with their precedences and associativity. The table is<br />

taken from [?] (by courtesy of Bjarne Stroustrup). For more details about specific operators<br />

see there. The operators on top have the highest priorities. 6<br />

Operator Summary<br />

scope resolution class name :: member<br />

scope resolution namespace name :: member<br />

global :: name<br />

global :: qualified-name<br />

member selection object . member<br />

member selection pointer → member<br />

subscripting expr[ expr ]<br />

subscripting (user-defined) object [ expr ] 7<br />

function call expr ( expr list )<br />

value construction type ( expr list )<br />

post increment lvalue ++<br />

post decrement lvalue −−<br />

type identification typeid ( type )<br />

run-time type identification typeid ( expr )<br />

run-time checked conversion dynamic_cast &lt; type &gt; ( expr )<br />

compile-time checked conversion static_cast &lt; type &gt; ( expr )<br />

unchecked conversion reinterpret_cast &lt; type &gt; ( expr )<br />

cast conversion const_cast &lt; type &gt; ( expr )<br />

size of object sizeof expr<br />

size of type sizeof ( type )<br />

pre increment ++ lvalue<br />

pre decrement −− lvalue<br />

complement ∼ expr<br />

not ! expr<br />

unary minus − expr<br />

unary plus + expr<br />

address of & lvalue<br />

dereference ∗ lvalue<br />

create (allocate) new type<br />

create (allocate and initialize) new type( expr list )<br />

create (place) new ( expr list ) type<br />

create (place and initialize) new ( expr list ) type( expr list )<br />

destroy (deallocate) delete pointer<br />

destroy array delete [ ] pointer<br />

6 TODO: If possible references<br />

7 Not in [?].



cast (type conversion) ( type ) expr<br />

member selection object.∗ pointer to member<br />

member selection pointer → ∗ pointer to member<br />

multiply expr ∗ expr<br />

divide expr / expr<br />

modulo (remainder) expr % expr<br />

add (plus) expr + expr<br />

subtract (minus) expr − expr<br />

shift left expr ≪ expr<br />

shift right expr ≫ expr<br />

less than expr < expr<br />

less than or equal expr &lt;= expr<br />

greater than expr &gt; expr<br />

greater than or equal expr >= expr<br />

equal expr == expr<br />

not equal expr != expr<br />

bitwise AND expr & expr<br />

bitwise exclusive OR (XOR) expr ˆ expr<br />

bitwise inclusive OR expr | expr<br />

logical AND expr && expr<br />

logical OR expr || expr<br />

conditional expression expr ? expr: expr<br />

simple assignment lvalue = expr<br />

multiply and assignment lvalue ∗= expr<br />

divide and assignment lvalue /= expr<br />

modulo and assignment lvalue %= expr<br />

add and assignment lvalue += expr<br />

subtract and assignment lvalue −= expr<br />

shift left and assignment lvalue ≪= expr<br />

shift right and assignment lvalue ≫= expr<br />

AND and assignment lvalue &amp;= expr<br />

inclusive OR and assignment lvalue |= expr<br />

exclusive OR and assignment lvalue ˆ= expr<br />

throw exception throw expr<br />

comma (sequencing) expr , expr<br />

To see the operator precedences at one glance, use Table 2.13 on page 64. 8<br />

2.3.7 Overloading<br />

A very powerful aspect of C ++ is that the programmer can define operators <strong>for</strong> new types. This<br />

will be explained in section ??. Operators of built-in types cannot be changed. New operators<br />

cannot be added as in some other languages. If you redefine operators make sure that the<br />

expected priority of the operation corresponds to the operator precedence. For instance, you<br />

might have the idea of using the LaTeX notation <strong>for</strong> exponentiation of matrices:<br />

8 TODO: Associativity?


2.4. EXPRESSIONS AND STATEMENTS 31<br />

A= Bˆ2;<br />

A is B squared. So far so good. That the original meaning of ˆ is a bitwise XOR does not<br />

worry us because we do not plan to implement bitwise operations on matrices.<br />

Now we add C:<br />

A= Bˆ2 + C;<br />

Looks nice, but it does not work (or does something weird). — Why?<br />

Because + has a higher priority than ˆ. Thus, the compiler understands our expression as:<br />

A= B ˆ (2 + C);<br />

Oops. That looks wrong. 9 The operator gives a concise and intuitive interface but its priority<br />

would cause a lot of confusion. Thus, it is advisable to refrain from this overloading.<br />

2.4 Expressions and Statements<br />

C and C ++ distinguish between expressions and statements. Casually speaking, one could<br />

just say that every expression becomes a statement if a semicolon is appended. However, we<br />

would like to discuss this topic a bit more.<br />

Let us build this recursively from bottom up. Any variable name (x, y, z, . . . ), constant or<br />

literal is an expression. One or more expressions with an operator is an expression, e.g. x + y or<br />

x ∗ y + z. In several languages, e.g. Pascal, the assignment is a statement. In C and C ++ it is an<br />

expression, e.g. x= y + z. As a consequence it can be used in another assignment: x2= x= y + z.<br />

Assignments are evaluated from right to left. Input and output operations as<br />

std::cout ≪ ”x is ” ≪ x ≪ ”\n”;<br />

are also expressions.<br />

A function call with expressions as arguments is an expression, e.g. abs(x), abs(x ∗ y + z). There<strong>for</strong>e,<br />

function calls can be nested: pow(abs(x), y). In languages where a function call is a statement<br />

this would not be possible. As the assignment is an expression, it can be used as argument of a<br />

function: abs(x= y), or I/O operations as those above. Needless to say, this is quite bad programming<br />

style. An expression surrounded by parentheses is an expression as well, e.g. (x + y).<br />

This allows us to change the order of evaluation, e.g. x ∗ (y + z) computes the addition first<br />

although the multiplication has the higher priority.<br />

A very special operator in C ++ is the ‘comma operator’ that provides a sequential evaluation.<br />

The meaning is simply evaluating first the sub-expression left of the comma and then that right<br />

of it. The value of the whole expression is that of the right sub-expression. The sub-expressions<br />

can contain the comma operator as well so that arbitrarily long sequences can be defined. With<br />

the help of the comma operator, one can evaluate multiple expressions in program locations<br />

where only one expression is allowed. If used as a function argument, the comma expression<br />

needs surrounding parentheses; otherwise the comma is interpreted as separation of function<br />

arguments. The comma operator can be overloaded with a user-defined semantics. This can<br />

9 The precise interpretation is A.operator=(operatorˆ(B, operator+(2, C)));



complicate the understanding of the program behavior dramatically and has to be used with<br />

utter care. In general it is advisable to use it sparingly.<br />

Any of the above expressions followed by a semicolon 10 is a statement, e.g.:<br />

x= y + z;<br />

y= f(x + z) ∗ 3.5;<br />

A statement like y + z; is allowed although it is most likely useless. During program execution<br />

the sum of y and z would be computed and then thrown away. Decent compilers would optimize<br />

out this useless computation. However, it is not guaranteed that this statement can always be<br />

omitted. If y or z is an object of a user type then the addition is also user-defined and might<br />

change y or z or something else. This is obviously bad programming style but legitimate in<br />

C ++.<br />

A single semicolon is an empty statement. There<strong>for</strong>e, one can put as many semicolons after an<br />

expression as wanted. Some statements do not end with a semicolon, e.g. function definitions.<br />

If a semicolon is appended to such a statement it is not an error but just an extra empty<br />

statement. 11 Any sequence of statements surrounded by curly braces is a statement — called a<br />

compound statement.<br />

The variable and constant declarations we have seen be<strong>for</strong>e are also statements. As initial<br />

value of a variable or constant, one can use any of the expressions mentioned be<strong>for</strong>e (however<br />

involving the assignment or comma operator is probably rather confusing). Other statements — to<br />

be discussed later — are function and class definitions, as well as control statements that we<br />

will introduce in the next section.<br />

2.5 Control statements<br />

Control statements allow us to steer the program execution by means of branching and repetition.<br />

2.5.1 If-statement<br />

This is the simplest <strong>for</strong>m of control and its meaning is intuitively clear, <strong>for</strong> instance in:<br />

if (weight > 100.0)<br />

cout ≪ ”This is quite heavy.\n”;<br />

else<br />

cout ≪ ”I can carry this.\n”;<br />

Often, the else branch is not needed and can be omitted. Say we have some value in variable x<br />

and compute something on its magnitude:<br />

if (x < 0.0)<br />

x= −x;<br />

// Now we know that x &gt;= 0.0<br />

10 The usage of the semicolon in Pascal looks similar at the first glance. However, in Pascal the semicolon has<br />

a slightly different purpose which is separating statements. Thus, the semicolon can be omitted when only one<br />

statement exist in a line. Coming from Pascal, it takes some time to get used to this difference.<br />

11 Nonetheless some compilers print a warning in pedantic mode.


2.5. CONTROL STATEMENTS 33<br />

The expression in the parentheses must be a logical expression or something convertible to bool.<br />

For instance, one can write:<br />

int i;<br />

// ...<br />

if (i) // bad style<br />

do_something();<br />

In the example, do_something is called if i is different from 0. Experienced C and C ++ programmers<br />

know that by heart, but the intentions of the developer are better communicated if<br />

this is stated explicitly:<br />

int i;<br />

// ...<br />

if (i != 0) // much better<br />

do_something();<br />

Each branch of an if consists of a single statement. To per<strong>for</strong>m multiple operations one can<br />

use braces: 12<br />

int nr_then= 0, nr_else= 0;<br />

// ...<br />

if (...) {<br />

nr_then++;<br />

cout ≪ ”In then−branch\n”;<br />

} else {<br />

nr_else++;<br />

cout ≪ ”In else−branch\n”;<br />

}<br />

In the beginning, it is helpful to always write the braces. With more experience, most developers<br />

only write the braces where necessary. At any rate it is highly advisable to indent the branches<br />

<strong>for</strong> better readability, whatever your degree of experience.<br />

An if statement can contain other if-statements:<br />

if (weight > 100.0) {<br />

if (weight > 200.0)<br />

cout ≪ ”This is extremely heavy.\n”;<br />

else<br />

cout ≪ ”This is quite heavy.\n”;<br />

} else {<br />

if (weight < 50.0)<br />

cout ≪ ”A child can carry this.\n”;<br />

else<br />

cout ≪ ”I can carry this.\n”;<br />

}<br />

In the above example, the braces could be omitted without changing the behavior, but it<br />

is clearer to have them. The example is more readable if we reorganize the nesting:<br />

if (weight < 50.0) {<br />

cout ≪ ”A child can carry this.\n”;<br />

} else if (weight <= 100.0) {<br />

cout ≪ ”I can carry this.\n”;<br />

} else if (weight <= 200.0) {<br />

cout ≪ ”This is quite heavy.\n”;<br />

} else {<br />

cout ≪ ”This is extremely heavy.\n”;<br />

}<br />

Without braces, nested if-statements raise the question which if an else belongs to:<br />

if (weight > 100.0)<br />

if (weight > 200.0)<br />

cout ≪ ”This is extremely heavy.\n”;<br />

else<br />

cout ≪ ”This is quite heavy.\n”;<br />

It looks like the last line is executed when weight is between 100 and 200 assuming the first if<br />

has no else-branch. But we could also assume the second if comes without else-branch and the<br />

last line is executed when weight is less than or equal to 100. Fortunately, the C ++ standard specifies<br />

that an else-branch always belongs to the innermost possible if. So, we can count on our first<br />

interpretation. If the else-branch should belong to the first if, we need braces:<br />

if (weight > 100.0) {<br />

if (weight > 200.0)<br />

cout ≪ ”This is extremely heavy.\n”;<br />

} else<br />

cout ≪ ”This is not so heavy.\n”;<br />

Maybe these examples convinced you that it is more productive to write more braces and save<br />

the time of guessing which if the branches belong to.<br />

Advice<br />

If you use an editor that understands C ++ (like the IDE from Visual Studio<br />

or emacs in C ++ mode) then automatic indentation is a great help with<br />

structured programming. Whenever a line is not indented as you expected,<br />

something is most likely not nested as you intended.<br />

2.5.2 Conditional Expression<br />

Although this section describes statements, we would like to discuss the conditional expression<br />

here because of its proximity to the if-statement. The semantics of<br />

condition ? result_for_true : result_for_false<br />

is that if the condition (the first sub-expression) evaluates to true, then the entire expression is the<br />

second sub-expression, otherwise the third one. For instance, we can compute the minimum of two<br />

values with either if-then-else or the conditional expression:


if (x <= y)<br />

min= x;<br />

else<br />

min= y;<br />

min= x <= y ? x : y;<br />

The conditional expression is more concise and can be used directly within larger expressions.<br />

2.5.3 While and Do-While Loops<br />

A do-while-loop executes its body at least once and tests the continuation condition afterward, e.g.:<br />

double eps= 1.0;<br />

do {<br />

eps/= 2.0;<br />

} while (eps > 0.0001);<br />

The loop is performed at least once, even with an extremely small value for eps in our<br />

example. The difference between a while-loop and a do-while-loop is irrelevant for most scientific<br />

software. It only matters for loops with very few iterations and an extremely strong impact on the overall<br />

performance, because a do-while-loop performs one comparison and one jump less.<br />

2.5.4 For Loop<br />

The most common loop in C ++ is the for-loop. As a simple example, we would like to add two vectors 15<br />

and print the result afterward:<br />

double v[3], w[]= {2., 4., 6.}, x[]= {6., 5., 4};<br />

for (int i= 0; i < 3; i++)<br />

v[i] = w[i] + x[i];<br />

for (int i= 0; i < 3; i++)<br />

cout ≪ ”v[” ≪ i ≪ ”] = ” ≪ v[i] ≪ ’\n’;<br />

The loop head consists of three components:<br />

• The initialization;<br />

• A continuation criterion; and<br />

• A step operation.<br />

The example above is typical for a for-loop. In the initialization, one typically declares a new<br />

variable and initializes it to 0 because this is the start index of most indexed data structures.<br />

The condition usually tests if the loop index is smaller than a certain size and the last operation<br />

typically increments the loop index.<br />

It is a very popular beginners’ mistake to write conditions like “i <= size(v)” although the valid indices end at size(v) − 1; such an off-by-one error accesses one element past the end of the container.<br />



Here it was simpler to take out term 0 and start with term 1. We also used less-equal to assure<br />

that the term x^10/10! is considered.<br />

The for-loop in C ++ is very flexible. The initialization part can be any expression, a variable<br />

declaration or empty. It is possible to introduce multiple new variables of the same type. This<br />

can be used to avoid repeating the same operation in the condition, e.g.:<br />

for (int i= 0, end= xyz.size(); i < end; i++) ...<br />

Variables declared in the initialization are only visible within the loop and hide variables of the<br />

same names from outside the loop.<br />

The condition can be any expression that can be converted to a bool. An empty condition is<br />

always true and the loop is repeated infinitely unless it is terminated from inside the body, as we will discuss<br />

in the next section. We said that loop indices are typically incremented in the head’s third<br />

part. In principle, one can modify it within the loop body but programs are much clearer if it<br />

is done in the loop head. On the other hand, there is no limitation that only one variable is<br />

increased by 1. One can modify as many variables as desired, using the comma operator and<br />

any modification we like, such as:<br />

for (int i= 0, j= 0, p= 1; ...; i++, j+= 4, p∗= 2) ...<br />

This is of course more complex than having just one loop index but still more readable than<br />

declaring/modifying indices before the loop or inside the loop body.<br />

In fact, the for-loop in C and C ++ is just another notation for a while-loop. Any for-loop:<br />

for (init; cond; incr) {<br />

st1; st2; ... stn;<br />

}<br />

can be written with a while-loop:<br />

{<br />

init;<br />

while (cond) {<br />

st1; st2; ... stn;<br />

incr;<br />

}<br />

}<br />

Conversely, any while-loop can evidently be written as a for-loop. We do not know of<br />

a design guideline from a software engineering guru on when to use while or for, but for is more<br />

concise if there is a local initialization or some incremental operation.<br />

2.5.5 Loop Control<br />

There are two statements to deviate from the regular loop evaluation:<br />

• break and<br />

• continue.<br />

A break terminates the loop entirely and continue ends only the current iteration and continues<br />

the loop with the next iteration, <strong>for</strong> instance:



for (...; ...; ...) {<br />

...<br />

if (dx == 0.0) continue;<br />

x+= dx;<br />

...<br />

if (r < eps) break;<br />

...<br />

}<br />

In the example above we assumed that the remainder of the iteration is not needed when<br />

dx == 0.0. In some iterative computations it might be clear in the middle of an iteration (here<br />

when r < eps) that the work is already done.<br />

Understanding the program behavior becomes more difficult the more breaks and continues<br />

are used. One should always aim for moving as much loop control as possible into the loop<br />

head. However, avoiding breaks and continues by excessive if-then-else branches is even less<br />

comprehensible.<br />

Sometimes, one might prefer performing some surplus operations inside a loop (if this has no<br />

perceivable impact on the overall performance) to keep the program simpler. Simpler programs,<br />

on the other hand, have a better chance of being optimized by the compiler. There is certainly<br />

no golden rule, but as a practical approach one should implement software first for maximal<br />

clarity and simplicity (while using efficient algorithms as early as possible). Once the software is<br />

working correctly, one can try variations to investigate the impact of implementation details on<br />

performance.<br />

2.5.6 Switch Statement<br />

A switch is like a special kind of if. It provides a concise notation when different computations<br />

are performed for different cases of a given integral value:<br />

switch(op_code) {<br />

case 0: z= x + y; break;<br />

case 1: z= x − y; cout ≪ ”compute diff\n”; break;<br />

case 2:<br />

case 3: z= x ∗ y; break;<br />

default: z= x / y;<br />

}<br />

When people see the switch statement for the first time, they are usually surprised that one<br />

needs to state at the end of each case that the case is terminated. Otherwise the statements of<br />

the next case are executed as well. This can be used to perform the same operation for different<br />

cases, e.g. for 2 and 3 in the example above.<br />

This fall-through also allows us to implement short loops without the termination test after<br />

each iteration. Say we have vectors with dimension ≤ 5. Then we could implement a vector<br />

addition without a loop:<br />

assert(size(v) <= 5);<br />

int i= 0;<br />

switch (size(v)) {<br />

case 5: v[i] = w[i] + x[i]; i++;<br />

case 4: v[i] = w[i] + x[i]; i++;<br />

case 3: v[i] = w[i] + x[i]; i++;<br />

case 2: v[i] = w[i] + x[i]; i++;<br />

case 1: v[i] = w[i] + x[i];<br />

case 0: ;<br />

}<br />

This technique is called Duff’s device. Although this is an interesting technique to realize an<br />

iterative computation without a loop, the performance impact is probably limited in practice.<br />

Such techniques should only be considered in program parts that account for a significant fraction of the<br />

overall run time; otherwise the readability of the sources is more important.<br />

2.5.7 Goto<br />

DO NOT USE IT. NEVER! EVER!<br />

2.6 Functions<br />

Functions are important building blocks of C ++ programs. The first example we have seen is<br />

the main function in the hello-world program. main must be present in every executable and is<br />

called when the program starts. Other than that, there is nothing special about main.<br />

The general <strong>for</strong>m of a C ++ function is:<br />

[inline] return_type function_name (argument_list)<br />

{<br />

body of the function<br />

}<br />

For instance, one can implement a very simple function to square a value:<br />

double square(double x)<br />

{<br />

return x ∗ x;<br />

}<br />

In C and C ++ each function has a return type. A function that does not return a value has the<br />

pseudo-return-type “void”:<br />

void print(double x)<br />

{<br />

std::cout ≪ ”x is ” ≪ x ≪ ’\n’;<br />

}<br />

void is not a real type but rather a placeholder that enables us to omit returning a value.<br />

We cannot define objects of it:<br />

void nothing; // error



2.6.1 Inline Functions<br />

Calling a function requires a fair number of activities:<br />

• The arguments (or at least their addresses) must be copied on the stack;<br />

• The current program counter must be copied on the stack to continue the execution at<br />

this point when the function is finished;<br />

• Registers must be saved so that the function can use them;<br />

• Jump to the code of the function;<br />

• Execute the function;<br />

• Clean the arguments from the stack;<br />

• Copy the result on the stack;<br />

• Jump back to the calling code;<br />

• Store back registers.<br />

What happens exactly depends on the hardware. The good news is that the function call<br />

overhead is dramatically lower than in the past. Furthermore, the compiler can optimize out<br />

those activities not needed in a specific call.<br />

Nonetheless, for small functions like square above, the effort for calling the function is still<br />

significantly higher than the work the function actually does. C programmers avoid the function-call<br />

overhead with macros. Macros create so many problems in software development that they<br />

must only be used when there is absolutely no alternative whatsoever. Bjarne Stroustrup<br />

says “Almost every macro demonstrates a flaw in the programming language, in the program,<br />

or in the programmer.” We would like to add a flaw “in the compiler optimization”. 16<br />

Fortunately, we have an excellent alternative to macros: inline functions. The programmer just<br />

adds the keyword inline to the function definition:<br />

inline double square(double x)<br />

{<br />

return x ∗ x;<br />

}<br />

and all the overhead of the function call vanishes into thin air.<br />

An excessive use of inline can have a negative effect on performance. When many large functions<br />

are inlined, the binary executable becomes very large. The consequence is that a lot of<br />

time is spent loading the binary from memory, and a lot of cache memory is wasted on it as<br />

well. This decreases the memory bandwidth and the cache available for data, causing more slowdown<br />

than is saved on function calls.<br />

16 Advanced: Compilers today are really smart at eliminating unused code. However, we experienced that<br />

arguments of inline functions might be constructed although they are not used. These are usually only a few<br />

machine instructions. But when this happens extremely frequently, as in an index range check that should<br />

disappear in release mode, it can ruin the overall performance. We hope that further compiler improvements can<br />

rescue us from this kind of macro usage.



It should be mentioned here that the inline keyword is not mandatory. The compiler can decide<br />

against inlining <strong>for</strong> the reasons given in the previous paragraph. On the other hand, the compiler<br />

is free to inline functions without the inline keyword.<br />

For obvious reasons, the definition of an inline function must be visible in every compile unit<br />

where it is called. In contrast to other functions, it cannot be compiled separately. Conversely,<br />

a non-inline function must not be defined in multiple compile units because the definitions collide when the<br />

compiled parts are ‘linked’ together. Thus, there are two ways to avoid such collisions: assuring<br />

that the function definition is only present in one compile unit or declaring the function as<br />

inline.<br />

2.6.2 Function Arguments<br />

If we pass an argument to a function, a copy is created by default. For instance, the following<br />

would not work (as expected):<br />

void increment(int x)<br />

{<br />

x++;<br />

}<br />

int main()<br />

{<br />

int i= 4;<br />

increment(i);<br />

cout ≪ ”i is ” ≪ i ≪ ’\n’;<br />

}<br />

The output would be 4. The operation x++ in the function body only increments a local copy and<br />

not the original value. This kind of argument transfer is called ‘call-by-value’ or ‘pass-by-value’.<br />

To modify the value itself we have to ‘pass-by-reference’ the variable:<br />

void increment(int& x)<br />

{<br />

x++;<br />

}<br />

Now the variable itself is incremented and the output will be 5 as expected. We will discuss<br />

references in more detail in § 2.10.2.<br />

Temporary variables — like the result of an operation — cannot be passed by reference:<br />

increment(i + 9); // error<br />

We could not compute (i + 9)++ anyway. In order to call such a function with some temporary<br />

value one needs to store it first in a variable and pass this variable to the function.<br />

Larger data structures like vectors and matrices are almost always passed by reference to<br />

avoid expensive copy operations:<br />

double two_norm(vector& v) { ... }<br />

An operation like a norm should not change its argument. But passing the vector by reference<br />

bears the risk of accidentally overwriting it.



To make sure that our vector is not changed (and not copied either), we pass it as constant<br />

reference:<br />

double two_norm(const vector& v) { ... }<br />

If we changed v in this function, the compiler would emit an error. Both call-by-value and<br />

constant references ascertain that the argument is not altered, but by different means:<br />

• Arguments that are passed by value can be changed in the function since the function<br />

works with a copy. 17<br />

• With const references one works on the passed argument directly but all operations that<br />

might change the argument are forbidden. In particular, const-referred arguments cannot<br />

appear on the left-hand side of an assignment or be passed as non-const references to other functions<br />

(in fact, the LHS of an assignment is also a non-const reference).<br />

In contrast to mutable references, constant ones allow for passing temporaries:<br />

alpha= two_norm(v + w);<br />

This is admittedly not entirely consistent in terms of language design, but it makes the life of<br />

programmers much easier.<br />

Argument values that occur very frequently can be declared as defaults. Say we implement a<br />

function that computes the n-th root, and mostly the square root; then we can write:<br />

double root(double x, int degree= 2) { ... }<br />

This function can be called with one or two arguments:<br />

x= root(3.5, 3);<br />

y= root(7.0);<br />

One can declare multiple default arguments but only at the end. In other words, after an<br />

argument with a default value one cannot have one without.<br />

2.6.3 Returning Results<br />

In the examples before, we only returned double or int. These are the nice ones. Functions that<br />

compute new values of large data structures are more difficult.<br />

Default arguments<br />

Sometimes functions have arguments that are used very infrequently. To address this, you can<br />

give a parameter a default value that is automatically used when no argument corresponding<br />

to that parameter is specified. In this way the caller only needs to specify those arguments that<br />

are meaningful at a particular instance. Consider the following example:<br />

void foo( int a = 5, char ch =’A’ )<br />

{ std::cout ≪ a ≪ ” ” ≪ ch ≪ std::endl ;}<br />

17 This assumes that the argument is properly copied. For user-defined types, one can implement one’s own copy<br />

operation with aliasing effects (on purpose or by accident). Then modifications of the copy also affect the original<br />

object.



foo takes one integer argument with default value 5 and one character argument with a default<br />

value of ‘A’. Now this function can be called by one of the three methods shown here:<br />

foo( 1, ’J’ );<br />

foo(24);<br />

foo();<br />

Which results in the following output:<br />

1 J<br />

24 A<br />

5 A<br />

Void functions<br />

When the result type of a function is void, we do not return a result. For example<br />

void foo( int i ) {<br />

std::cout ≪ ”My value is ” ≪ i ≪ std::endl ;<br />

}<br />

Constant arguments<br />

We can use const objects as arguments in functions to protect them from being changed. For<br />

example :<br />

bool bar( int const& x, int y ) {<br />

y = y+2;<br />

return y == x;<br />

}<br />

Since we do not want to modify x, we can add the keyword const. Note that const can be put<br />

before or after the type, but the authors of this course recommend putting it after.<br />

2.6.4 Overloading<br />

In C ++, functions can share the same name as long as their parameter declarations are different.<br />

More precisely, the functions should differ in the number or the type of their parameters.<br />

The compiler can then use the number/type of the arguments to determine which version of<br />

the overloaded function should be used. Note that although overloaded functions may have<br />

different return types, a difference in return type alone is not sufficient to distinguish between<br />

two versions of a function.<br />

Consider the following example:<br />

#include <iostream><br />

#include <cmath><br />

int divide (int a, int b){<br />

return a / b ;<br />

}



float divide (float a, float b){<br />

return std::floor( a / b ) ;<br />

}<br />

int main (){<br />

int x=5,y=2;<br />

float n=5.0,m=2.0;<br />

std::cout ≪ divide (x,y) ≪ std::endl;<br />

std::cout ≪ divide (n,m) ≪ std::endl;<br />

return 0;<br />

}<br />

In this case we have defined two functions with the same name, divide, but one of them accepts<br />

two parameters of type int and the other one accepts them of type float. In the first call to<br />

divide, the two arguments passed are of type int; therefore, the function with the first prototype<br />

is called. This function returns the result of dividing one parameter by the other. The second<br />

call passes two arguments of type float, so the function with the second prototype is called.<br />

This one executes a similar division and rounds the result down.<br />

2.6.5 Assertions<br />

assert is a special kind of function (actually a preprocessor macro from cassert) with the following interface:<br />

void assert (int expression);<br />

If the argument expression evaluates to 0, this causes an assertion failure that terminates the<br />

program. A message is written to the standard error device and abort is called, terminating<br />

the program execution.<br />

The specifics of the message shown depend on the specific implementation in the compiler, but<br />

it shall include: the expression whose assertion failed, the name of the source file, and the line<br />

number where it happened. A usual message format is:<br />

Assertion failed: expression, file filename, line linenumber<br />

This allows a programmer to include many assert calls in a source code while debugging<br />

the program. The many assert calls may reduce the performance of the code, so it can<br />

be desirable to disable asserts for high-performance libraries. Asserts are disabled by including<br />

the following line<br />

#define NDEBUG<br />

at the beginning of the code, before the inclusion of cassert, or by defining the macro in the<br />

compiler, e.g.<br />

g++ -DNDEBUG foo.cpp<br />

Example:<br />

#include <cassert><br />

#include <fstream><br />

int main ()<br />

{



std::ifstream datafile( ”file.dat” ) ;<br />

assert( datafile.is_open() );<br />

datafile.close();<br />

return 0;<br />

}<br />

In this example, assert is used to abort the program execution if datafile.is_open() returns false,<br />

which happens when the opening of the file was unsuccessful.<br />

2.7 Input and output<br />

C ++ uses a convenient abstraction called streams to perform input and output operations on<br />

sequential media such as the screen or the keyboard. A stream is an object into which a program<br />

can insert characters or from which it can extract them. The standard C ++ library includes the header<br />

file iostream, where the standard input and output stream objects are declared.<br />

2.7.1 Standard Output (cout)<br />

By default, the standard output of a program is the screen, and the C ++ stream object defined<br />

to access it is cout.<br />

cout is used in conjunction with the insertion operator, which is written as ≪ . It may be used<br />

more than once in a single statement. This is especially useful if we want to print a combination<br />

of variables and constants or more than one variable. Consider this example:<br />

std::cout ≪ ”Hello World, my name is ” ≪ name ≪ std::endl ;<br />

std::cout ≪ ”I am ” ≪ age ≪ ” years old.” ≪ std::endl ;<br />

If we assume the name variable contains the value Jane and the age variable contains 25,<br />

the output of the previous statements would be:<br />

Hello World, my name is Jane<br />

I am 25 years old.<br />

The endl manipulator produces a newline character (and flushes the stream). An alternative<br />

representation of the newline is the character ’\n’.<br />

2.7.2 Standard Input (cin)<br />

The standard input device is usually the keyboard. Handling the standard input in C ++ is done<br />

by applying the overloaded operator of extraction ≫ on the cin stream. The operator must be<br />

followed by the variable that will store the data that is going to be extracted from the stream.<br />

For example:<br />

int age;<br />

std::cin ≫age;



The first statement declares a variable of type int called age, and the second one waits for an<br />

input from cin (the keyboard) in order to store it in this integer variable. The input from the<br />

keyboard is processed once the RETURN key has been pressed.<br />

You can also use cin to request more than one datum input from the user:<br />

std::cin ≫a ≫b;<br />

is equivalent to:<br />

std::cin ≫a;<br />

std::cin ≫b;<br />

In both cases the user must provide two values, one for variable a and another for variable b, which<br />

may be separated by any valid blank separator: a space, a tab character or a newline.<br />

2.7.3 Input/Output with files<br />

C ++ provides the following classes to perform output and input of characters to/from files:<br />

• std::ofstream: used to write to files<br />

• std::ifstream: used to read from files<br />

• std::fstream: used to both read and write from/to files.<br />

We can use file streams the same way we already use cin and cout, with the only difference<br />

that we have to associate these streams with physical files. Here is an example:<br />

#include <iostream><br />

#include <fstream><br />

int main () {<br />

std::ofstream myfile;<br />

myfile.open (”example.txt”);<br />

myfile ≪ ”Writing this to a file. ” ≪ std::endl;<br />

myfile.close();<br />

return 0;<br />

}<br />

This code creates a file called example.txt (or overwrites it if it already exists) and inserts a<br />

sentence into it in a way that is similar to the use of cout. C ++ has the concept of an output<br />

stream that is satisfied by an output file as well as by std::cout. That means that everything<br />

that can be written to std::cout can also be written to a file, and vice versa. If you define the<br />

operator ≪ yourself for a new type, you do not need to program it for each output type but<br />

only once for a general output stream. 18<br />

Alternatively, one can give the file stream object the file name as an argument. This opens the file<br />

implicitly. The file is also implicitly closed when myfile goes out of scope, in this case at the end<br />

of the main function. The mechanisms that control such implicit actions will become clear in<br />

§ 2.2.3. The bottom line is that you must close your files explicitly only in a few cases. The short<br />

version of the previous listing is<br />

18 TODO: Where? New section needed.



#include <iostream><br />

#include <fstream><br />

int main () {<br />

std::ofstream myfile(”example.txt”);<br />

myfile ≪ ”Writing this to a file. ” ≪ std::endl;<br />

return 0;<br />

}<br />

2.8 Structuring Software Projects<br />

2.8.1 Namespaces<br />

In the last section we mentioned that equal names in different scopes hide the variables (or<br />

functions, types, . . . ) of the outer scopes, while defining the same name twice in one scope is an error.<br />

Common function names like min, max or abs already exist, and if you write a function with<br />

the same name (and same argument types) the compiler will tell you that the name already<br />

exists. But this does not only concern common names; you must be sure that every name you<br />

use is not already used in some other library. This really can be a hassle because you might<br />

add more libraries later, creating new potential for conflicts. Then you have to rename some<br />

of your functions and inform everybody who uses your software. Or one of your software users<br />

includes a library that you do not know and gets a name conflict. This can grow into a serious<br />

problem, and it happens in C all the time.<br />

One possibility to deal with this is using different names like max_, my_abs, or library_name_abs.<br />

This is in fact what is done in C. Main libraries have short function names, user libraries longer<br />

names, and OS-related internals typically start with an underscore. This decreases the probability of<br />

conflicts but does not eliminate it entirely.<br />

Remark: Particularly annoying are macros. This is an old technique of code reuse by expanding<br />

macro names to their text definition, potentially with arguments. This gives a lot of possibilities<br />

to empower your program, but many more to ruin it. Macros are resistant to namespaces<br />

because they are reckless text substitutions without any notion of types, scopes or any other<br />

language feature. Unfortunately, some libraries define macros with common names like major.<br />

We uncompromisingly undefine such macros, e.g. #undef major, without mercy for people who<br />

might want to use those macros. Visual Studio defines — till today!!! — min and max as macros<br />

and we advise you to disable this by compiling with /DNOMINMAX. Almost all macros can be<br />

replaced by other techniques (constants, templates, inline functions). But if you really do not<br />

find another way of implementing something, use LONG_AND_UGLY_NAMES_IN_CAPITALS like the library<br />

collection Boost does.<br />

2.8.2 Header and implementation<br />

It is usual to split class (Chapter 3) and function definition and implementation into different<br />

files. Classes and functions are typically defined in a header file (.hpp), and implemented in a<br />

cpp file, which is then compiled and added to a library. For example, the header file foo.hpp<br />

could be:<br />

foo.hpp:



#ifndef athens_foo_hpp<br />

#define athens_foo_hpp<br />

double foo (double a, double b);<br />

#endif<br />

Note the ifndef and define C-preprocessor commands. These commands are called include guards<br />

and prevent the file from being included several times. The use of such guards in header files is<br />

quite common.<br />

The source file in the library would be contained in the file foo.cpp.<br />

#include ”foo.hpp”<br />

double foo (double a, double b)<br />

{ return a+b; }<br />

The main program file is contained in the file bar.cpp:<br />

#include <iostream><br />

#include ”foo.hpp”<br />

int main() {<br />

double a = 2.1;<br />

double b = 3.9;<br />

std::cout ≪ foo(a,b) ≪ std::endl ;<br />

}<br />

Include files usually contain the interface of software packages and are stored somewhere on<br />

disk. The compiler is told where to look for the include files. The programmer can partially<br />

control this as follows:<br />

• #include ”foo.hpp”: the compiler looks in the directory of the including file and the list of<br />

directories it is given.<br />

• #include <foo.hpp>: the compiler only looks in the list of directories it is given.<br />

Frequently used include files<br />

The types and functions defined in the following include files are in the namespace std.<br />

• <iostream>: input and output streams, e.g. std::cin and std::cout<br />

• <fstream>: file input and output<br />

• <cassert>: for assertions, see § 2.6.5.<br />

• <cmath>: headers for the C functions from math.h, among others: abs, fabs, pow, acos,<br />

asin, atan, atan2, ceil, floor, cos, cosh, sin, sinh, exp, fmod (floating point mod), modf (split in<br />

integer and fractional part (< 1)), log, log10, sqrt, tan, tanh<br />

And other useful functions such as isnan.<br />

• : String operations


2.9. ARRAYS 49<br />

• : Complex numbers<br />

• , , , , ...: STL, see Section 4.9<br />

Inline keyword

Instead of creating a library as described at the beginning of this section, we can also store the implementation in the header file. We then have to add the keyword inline, for two reasons. The code will not be stored in a library but inlined into the calling functions: this may lead to more efficient code when the functions are small. And if we do not use the inline keyword, we may end up with multiply defined functions, since the compiler creates the function in every source file in which it is used.

Consider for example the following header file sqr.hpp:

#ifndef athens_sqr_hpp
#define athens_sqr_hpp

inline double sqr(double a)
{ return a * a; }

#endif

2.9 Arrays

C-based programming languages are not very good at working with arrays. In this section, we discuss the language concepts for arrays. In Section 4.9, we will present more practical software for arrays and other complicated mass data structures.

An array is created as follows:

int x[10];

The variable x is a constant-size array. It allows for fast creation (it is typically stored on the stack).

Arrays are accessed by square brackets: x[i] is a reference to the ith element. The first element is x[0], the last one is x[9]. Arrays can be initialized at the definition:

float v[]= {1.0, 2.0, 3.0}, w[]= {7.0, 8.0, 9.0};

In this case, the array size is deduced.

Operations on arrays are typically performed in loops; e.g., the vector operation x = v − 3w is realized by

float x[3];
for (int i= 0; i < 3; i++)
    x[i]= v[i] - 3.0 * w[i];

One can also define arrays of higher dimension:

float A[7][9];     // a 7 by 9 matrix
int q[3][2][3];    // a 3 by 2 by 3 array

The language does not provide linear algebra operations on these arrays. Therefore we will build our own linear algebra and look forward to future C++ standards coming with intrinsic higher math.

Arrays have the following two disadvantages:

• Indices are not checked before accessing an array. One can thus access elements outside the array, and the program crashes with a segmentation fault/violation. This is not even the worst case: if the program crashes, at least you see that things went wrong. The false access can also silently corrupt your own data; the program keeps running and produces entirely wrong results, with whatever consequences you can imagine.

• The size of the array must be known at compile time. 19 For instance, suppose we have an array stored in a file and need to read it back into memory:

ifstream ifs("some_array.dat");
ifs >> size;
float v[size]; // error, size not known at compile time

This does not work because we need the size already when the program is compiled.

The first problem can only be solved with new array types and the second one with dynamic allocation. This leads us to pointers.

2.10 Pointers and References

2.10.1 Pointers

A pointer is a variable that contains a memory address. This address can be that of another variable or of dynamically allocated memory. Let us start with the latter, as we were looking for arrays of dynamic size.

int* y = new int[10];

This allocates an array of 10 int. The size can now be chosen at run time. We can also implement the vector-reading example from the previous section:

ifstream ifs("some_array.dat");
int size;
ifs >> size;
float* v= new float[size];
for (int i= 0; i < size; i++)
    ifs >> v[i];

Pointers bear the same danger as arrays: the risk of accessing out-of-range data, with program crashes or data corruption. It is also the programmer's responsibility to keep track of the array size.

19 Some compilers support run-time values as array sizes. Since this is not guaranteed to work with other compilers, one should avoid this in portable software.


Furthermore, the programmer is responsible for releasing the memory when it is not needed anymore. This is done by

delete[] v;

As we came from arrays, we made the second step before the first one regarding pointer usage. The simplest use of pointers is allocating one single data item:

int* ip = new int;

Releasing such memory is performed by

delete ip;

Note the duality of allocation and release: the single-object allocation requires a single-object release and the array allocation demands an array release. 20

Pointers can also refer to other variables:

int i= 3;
int* ip2= &i;

The operator & takes an object and returns its address. The reverse operator is *, which takes an address and returns the object:

int j= *ip2;

This is called dereferencing. It is clear from the context whether the symbol * represents a dereference or a multiplication.

A danger of pointers are memory leaks. For instance, suppose our array y became too small and we want to assign a new array:

int* y = new int[15];

We can now use more space in y. Nice. But what happened to the memory that we allocated before? It is still there, but we have no access to it anymore and cannot release it. This memory is lost for the rest of our program execution. Only when the program is finished will the operating system be able to free it. In the example it is only 40 bytes out of however many gigabytes you might have. But if this happens with larger data in an iterative process, the dead memory grows, and at some point the program crashes when all memory is used up.

The warnings above are not intended as fun killers, and we do not discourage the use of pointers. Many things can only be achieved with pointers: lists, queues, trees, graphs, . . . But pointers must be used with utter care to avoid all the really serious problems mentioned above. There are two strategies to minimize pointer-related errors:

Use standard implementations from the standard library or other validated libraries. std::vector from the standard library provides all the functionality of dynamic arrays, including resizing and range checks, and the memory is released automatically, see § 4.9. Smart pointers from Boost provide automatic resource management: dynamically allocated memory that is no longer referred to by any smart pointer is released automatically, see § 11.2.

20 Mixing the two forms, e.g. releasing memory allocated with new[] by a plain delete, causes undefined behavior.


Encapsulate your dynamic memory management in classes. Then you have to deal with it only once per class. 21 If all memory allocated by an object is released when the object is destroyed, then it does not matter how much memory you allocate. If you have 738 objects with dynamic memory, then it will be released 738 times. If you have called new 738 times, partly in loops and branches, can you be sure that you have called delete 738 times? We know that there are tools for this, but these are errors you better prevent than fix. Even with encapsulation there is probably something to fix inside the classes, but this is orders of magnitude less work than having pointers spread all over your program.

We have shown two main purposes of pointers:

• Dynamic memory management; and

• Referring to other objects.

For the former there is no alternative to pointers; dynamic memory handling needs pointers, either directly or via classes that contain pointers. To refer to other objects, there exists another kind of type called a reference (surprise, surprise) that we will introduce in the next section.

2.10.2 References

The following code introduces a reference:

int i= 5;
int& j= i;
j= 4;
std::cout << "j = " << j << '\n';

The variable j is referring to i. Changing j will also alter i and vice versa, as in the example; i and j will always have the same value. One can think of a reference as an alias. Whenever one defines a reference, one must directly say what it refers to (unlike pointers). It is not possible to refer to another variable later.

So far, that does not sound extremely useful. References are, however, extremely useful for function arguments (§ 2.6), for referring to parts of other objects (e.g. the seventh entry of a vector), and for building views ( 22 ).

2.10.3 Comparison between pointers and references

The advantage of pointers over references is the ability to manage memory dynamically and to do address calculation. On the other hand, references refer to defined locations, 23 they always must refer to something, they do not leave memory leaks (unless you play really evil tricks), and they have the same notation in usage as the referred object.

21 It is safe to assume that there are many more objects than classes; otherwise there is something wrong with the program.

22 TODO: reref to a section when it is written

23 References can refer to arbitrary addresses, but one must work hard to achieve this. For your own safety we will not show you how to make references behave as badly as pointers.


Feature                          Pointers   References
Referring to a defined location     -           +
Mandatory initialization            -           +
Avoidance of memory leaks           -           +
Object-like notation                -           +
Memory management                   +           -
Address calculation                 +           -

Table 2.2: Comparison between pointers and references

In short, references are not idiot-proof, but they are much less error-prone than pointers. Pointers should only be used when dealing with dynamic memory, and even then one should do this via well-tested types or encapsulate the pointer within a class.

2.10.4 Do Not Refer to Outdated Data

Variables in functions are only valid within this function, for instance:

double& square_ref(double d) // DO NOT!
{
    double s= d * d;
    return s;
}

The variable s is not valid anymore after the function has finished. If you are lucky, the memory where s was stored has not been overwritten yet. But this is nothing one can count on. Good compilers will warn you that you are referring to a local variable. Sadly enough, we have seen examples in web tutorials that do this!

The same applies correspondingly to pointers:

double* square_ptr(double d) // DO NOT!
{
    double s= d * d;
    return &s;
}

This is as wrong as it is for references.

There are cases where functions, especially member functions, return references and addresses, and the destruction order of objects prevents the invalidation of references, 24 cf. § ??.

2.11 Real-world example: matrix inversion

24 Unfortunately there are ways to circumvent this, and there is an exception to this rule.


As a practical exercise, we now go step by step through the development process of a function for matrix inversion. This is easier than it seems. 25 For it, we use the Matrix Template Library 4 — see http://www.mtl4.org. It already provides most of the functionality we need. 26

In the program development, we follow some principles of Extreme Programming, especially writing tests first and implementing the functionality afterwards. This has two significant advantages:

• It prevents you as a programmer (to some extent) from featurism — the obsession to add more features instead of finishing one thing after another. If you write down what you want to achieve, you work more directly towards this goal and usually accomplish it much earlier. When writing the function call, you specify the interface of the function you plan to implement; when testing your results against expected values, you say something about the semantics of your function. Thus, tests are compilable documentation. The tests might not tell everything about the functions and classes you are going to implement, but what they say, they say very precisely. Documentation in text can be much more detailed and comprehensible, but also much vaguer, than tests.

• If you start writing tests only after you have finally finished the implementation — say on a late Friday afternoon — You Do Not Want To See It Failing. You will write the test with your nice data (whatever this means for the program in question) and minimize the risk that it fails. You might decide to go home and swear to God that you will test it on Monday.

For those reasons, you will be more honest if you write your tests first. Of course, you can modify your tests later if you realize that something does not work, or you changed the design of some item, or you want to test more details. It goes without saying that verifying partial implementations requires commenting out parts of your test temporarily.

Before we start implementing our inverse function, and even the tests, we have to choose an algorithm. We can use determinants of sub-matrices, block algorithms, Gauß-Jordan, or LU decomposition with or without pivoting. Let us say we prefer LU factorization with column pivoting, so that we have

LU = PA,

with a unit lower triangular matrix L, an upper triangular matrix U, and a permutation matrix P. Thus,

A = P^{-1} LU

and

A^{-1} = U^{-1} L^{-1} P.     (2.1)

We use the LU factorization from MTL4, implement the inversion of the lower and the upper triangular matrix, and compose them appropriately.

Now we start with our test by defining an invertible matrix and printing it out.

int main(int argc, char* argv[])
{
    const unsigned size= 3;
    typedef dense2D<double> Matrix;
    Matrix A(size, size);
    A= 4, 1, 2,
       1, 5, 3,
       2, 6, 9;
    cout << "A is:\n" << A;

25 At least with the implementations we already have.
26 It actually provides the inversion function inv already, but we want to learn how to get there.

For later abstraction, we define the type Matrix and the constant size. The LU factorization in MTL4 is performed in place. To not alter our original matrix, we copy it into a new one:

Matrix LU(A);

We also define a vector for the permutation computed in the factorization:

mtl::dense_vector<unsigned> Pv(size);

These are the two arguments for the LU factorization:

lu(LU, Pv);

For our purpose it is more convenient to represent the permutation as a matrix:

Matrix P(permutation(Pv));
cout << "Permutation vector is " << Pv << "\nPermutation matrix is\n" << P;

For instance, we can show A in its permuted form: 27

cout << "Permuted A is \n" << Matrix(P * A);

We now define an identity matrix of appropriate size and extract L and U from our in-place factorization:

Matrix I(matrix::identity(size, size)), L(I + strict_lower(LU)), U(upper(LU));

Note that the unit diagonal of L is not stored and needs to be added. It could also be treated implicitly, but we refrain from that for the sake of simplicity. We have now finished the preliminaries and come to our first test. If we have computed the inverse of U, say UI, the product UI * U must be the identity matrix, approximately:

Matrix UI(inverse_upper(U));
cout << "inverse(U) [permuted] is:\n" << UI << "UI * U is:\n" << Matrix(UI * U);
assert(one_norm(Matrix(UI * U - I)) < 0.1);

Testing results of non-trivial numeric calculations for equality is quite certain to fail. Therefore, we use the norm of the matrix difference as criterion. Likewise, the inversion of L (with a different function) is tested:

Matrix LI(inverse_lower(L));
cout << "inverse(L) [permuted] is:\n" << LI << "LI * L is:\n" << Matrix(LI * L);
assert(one_norm(Matrix(LI * L - I)) < 0.1);

This enables us to calculate the inverse of A itself and test its correctness:

Matrix AI(UI * LI * P);
cout << "inverse(A) [UI * LI * P] is \n" << AI << "A * AI is\n" << Matrix(AI * A);
assert(one_norm(Matrix(AI * A - I)) < 0.1);

27 If you wonder why we explicitly built a matrix for P * A, you have to wait until Section 5.3 to understand that some functions return special types that need special treatment. Future versions of MTL4 will minimize the need for such special treatments.


A function computing the inverse must return the same value and also pass the test against the identity:

Matrix A_inverse(inverse(A));
cout << "inverse(A) is \n" << A_inverse << "A * AI is\n" << Matrix(A_inverse * A);
assert(one_norm(Matrix(A_inverse * A - I)) < 0.1);

After establishing tests for all components of our calculation, we start with their implementations. The first function we program is the inversion of an upper triangular matrix. This function takes a dense matrix as argument and returns another matrix:

dense2D<double> inline inverse_upper(dense2D<double> const& A) {
}

Since we do not need another copy of the input matrix, we pass it as a reference. The argument shall not be changed, so we pass it as const. The constancy has several advantages:

• We improve the reliability of our program. Arguments passed as const are guaranteed not to change; if we accidentally modify them, the compiler will tell us and abort the compilation. There is a way to remove the constancy, but this should only be used as a last resort, e.g. for interfacing obsolete libraries written by others. Everything you write yourself can be realized without eliminating the constancy of arguments.

• Compilers can optimize better when the objects are guaranteed not to be altered.

• In the case of references, the function can be called with expressions. Non-const references require storing the expression in a variable and passing the variable to the function.

Another comment: people might tell you that it is too expensive to return containers as results and that it is more efficient to use references. This is true — in principle. For the moment we accept this extra cost and pay more attention to clarity and convenience. Later in this book we will introduce techniques to minimize the cost of returning containers from functions.

So much for the function signature; let us now turn our attention to the function body. The first thing we do is verify that our argument is valid. Obviously the matrix must be square:

const unsigned n= num_rows(A);
assert(num_cols(A) == n); // Matrix must be square

The number of rows is needed several times in this function and is therefore stored in a variable — well, a constant. Another prerequisite is that the matrix has no zero entries on the diagonal. We leave this test to the triangular solver.

Speaking of which, we can compute our inverse triangular matrix with a triangular solver of a linear system, which we find in MTL4. More precisely, the k-th column vector of U^{-1} is the solution of

U x = e_k,

where e_k is the k-th unit vector. First we define a temporary variable for the result:

dense2D<double> Inv(n, n);

Then we iterate over the columns of Inv:


for (unsigned k= 0; k < n; ++k) {
}

In each iteration we need the k-th unit vector:

dense_vector<double> e_k(n);
for (unsigned i= 0; i < n; ++i)
    if (i == k)
        e_k[i]= 1.0;
    else
        e_k[i]= 0.0;

The triangular solver returns a column vector. We could assign the entries of this vector directly to entries of the target matrix:

for (unsigned i= 0; i < n; ++i)
    Inv[i][k]= upper_trisolve(A, e_k)[i];

This is nicely short, but we would compute upper_trisolve n times! Although we said that performance is not our primary goal at this point, raising the overall complexity from order n^3 to n^4 is too much waste of resources. Therefore, we better store the vector and copy the entries from there:

dense_vector<double> res_k(n);
res_k= upper_trisolve(A, e_k);
for (unsigned i= 0; i < n; ++i)
    Inv[i][k]= res_k[i];

Returning our temporary matrix finishes the function, which we now give in its complete form:

dense2D<double> inverse_upper(dense2D<double> const& A)
{
    const unsigned n= num_rows(A);
    assert(num_cols(A) == n); // Matrix must be square

    dense2D<double> Inv(n, n);
    for (unsigned k= 0; k < n; ++k) {
        dense_vector<double> e_k(n);
        for (unsigned i= 0; i < n; ++i)
            if (i == k)
                e_k[i]= 1.0;
            else
                e_k[i]= 0.0;
        dense_vector<double> res_k(n);
        res_k= upper_trisolve(A, e_k);
        for (unsigned i= 0; i < n; ++i)
            Inv[i][k]= res_k[i];
    }
    return Inv;
}


Now that the function is complete, we first run our test. Evidently, we have to comment out parts of the test, because we have only implemented one function so far. But it is worth knowing whether this first function already behaves as expected. It does, and we could now be happy with it and turn our attention to the next task — there are still many. But we will not.

Well, at least we can be happy to have a correctly running function. Nevertheless, it is still worth spending some time to improve it. Such improvements are called refactoring. Experience from practice has shown that refactoring immediately after the implementation takes much less time than later modifications when bugs are discovered, the software is ported to other platforms, or it is extended for more usability. Obviously, it is much easier to simplify and structure our software now, while we still know what is going on, than in some weeks/months/years, or when somebody else has to refactor it.

The first thing we might dislike is that something as simple as the initialization of a unit vector takes five lines. This is rather verbose. Putting the if statement on one line

for (unsigned i= 0; i < n; ++i)
    if (i == k) e_k[i]= 1.0; else e_k[i]= 0.0;

is badly structured. C++ — and even good ole C — has a special operator for conditions:

for (unsigned i= 0; i < n; ++i)
    e_k[i]= i == k ? 1.0 : 0.0;

The conditional operator '?:' usually needs some time to get used to, but it results in a more concise representation. There are also situations where one cannot use an if but can use the ?: operator. Although we have not changed anything semantically in the program, and it seems obvious that the result will still be the same, it cannot harm to run our test again. You will see how often you are sure that your program changes could never possibly change the behavior — and they still do. The sooner you realize it the better. And with the test we have already written, it only takes a few seconds and makes you feel more confident.

If we would like to be really cool, we could exploit some insider know-how. The expression 'i == k' returns a boolean, and we know that bool can be converted implicitly into int. In this conversion, false yields 0 and true yields 1, according to the standard. These are precisely the values we want, as double:

e_k[i]= double(i == k);

In fact, the conversion from int to double is performed implicitly and can be omitted:

e_k[i]= i == k;

As cute as this looks, it is some stretch to assign a logical value to a floating-point number. It is well defined by the implicit conversion chain bool → int → double, but it will confuse potential readers, and you might end up explaining on a mailing list what is happening, or you add a comment to the program. In both cases you end up writing more for the explanation than you saved in the program.

Another thought that might occur to us is that this is probably not the last time we need a unit vector. So why not write a function for it?


dense_vector<double> inline unit_vector(unsigned k, unsigned n)
{
    dense_vector<double> v(n, 0.0);
    v[k]= 1;
    return v;
}

As the function returns the unit vector, we can just pass it as argument to the triangular solver:

res_k= upper_trisolve(A, unit_vector(k, n));

For a dense matrix, MTL4 allows us to access a matrix column as a column vector (instead of a sub-matrix). Then we can assign the result vector directly, without a loop:

Inv[irange(0, n)][k]= res_k;

As a short explanation: the bracket operator is implemented in such a manner that integer indices for both rows and columns return the matrix entry, while ranges for rows and columns return a sub-matrix. Likewise, a range of rows and a single column gives you a column of the according matrix — or part of this column. Vice versa, a row vector can be extracted from a matrix with an integer as row index and a range for the columns.

This is an interesting example of how to deal with the limitations as well as the possibilities of C++. Other languages have ranges as part of their intrinsic notation; e.g., Python has a symbol ':' for expressing ranges of indices. C++ does not have this symbol, but we can introduce a new type — like MTL4's irange — and define the behavior of operator[] for this type. This leads to an extremely powerful mechanism!

Extending Operator Functionality

Since we cannot introduce new operators into C++ — not now (in 2010), not in the next standard (C++0x), maybe in the one after that — we define new types and give operators the desired behavior when applied to those types. This technique allows us to provide a very broad functionality with a limited number of operators.

The operator semantics on user types shall be intuitive and must be consistent with the operator priority (see example in § 2.3.7).

Back to our algorithm. We store the result of the solver in a vector and then assign it to a matrix column. In fact, we can assign the triangular solver's result directly:

Inv[irange(0, n)][k]= upper_trisolve(A, unit_vector(k, n));

The range of all indices is predefined as iall:

Inv[iall][k]= upper_trisolve(A, unit_vector(k, n));

Next, we exploit some mathematical background. The inverse of an upper triangular matrix is also upper triangular. Thus, we only need to compute the upper part of the result and set the remainder to 0 — or set the whole matrix to zero before computing the upper part. Of course, we now need smaller unit vectors and only sub-matrices of A. This can nicely be expressed with ranges:

Inv= 0;
for (unsigned k= 0; k < n; ++k)
    Inv[irange(0, k+1)][k]= upper_trisolve(A[irange(0, k+1)][irange(0, k+1)], unit_vector(k, k+1));

Admittedly, the irange makes the expression hard to read. Although it looks like a function, irange is a type, and we just created objects on the fly and passed them to operator[]. As we use the same range three times, it is shorter to create a variable (or rather a constant):

for (unsigned k= 0; k < n; ++k) {
    const irange r(0, k+1);
    Inv[r][k]= upper_trisolve(A[r][r], unit_vector(k, k+1));
}

This not only makes the second line shorter, it also makes it easier to see that it is all the same range.

Another observation: after shortening the unit vectors, they all have their one in the last entry. Thus, we only need the size of the vector, and the position of the one is implied:

dense_vector<double> inline last_unit_vector(unsigned n)
{
    dense_vector<double> v(n, 0.0);
    v[n-1]= 1;
    return v;
}

We choose a different name to reflect the different meaning. Nonetheless, we wonder whether we really want such a function. What is the probability that we will ever need it again? Charles H. Moore, the creator of the programming language Forth, once said that "The purpose of functions is not to hash a program into tiny pieces but to create highly reusable entities." All this said, we prefer the more general function, which is much more likely to be useful later.

After all these modifications, we are now satisfied with the implementation and go on to the next function. We still might change something at a later point in time, but having made it clearer and better structured will make later modifications much easier for us or somebody else. The more experience you gain, the fewer steps you will need to reach the implementation that makes you happy. And it goes without saying that we tested inverse_upper repeatedly while modifying it.

Now that we know how to invert upper triangular matrices, we could do the same for the lower triangular ones accordingly. Alternatively, we can just transpose the input and the output:

dense2D<double> inline inverse_lower(dense2D<double> const& A)
{
    dense2D<double> T(trans(A));
    return dense2D<double>(trans(inverse_upper(T)));
}

Ideally, this implementation would look like this:

dense2D<double> inline inverse_lower(dense2D<double> const& A)
{
    return trans(inverse_upper(trans(A)));
}


2.11. REAL-WORLD EXAMPLE: MATRIX INVERSION 61

This does not work yet for technical reasons but will in the future.

You may argue that the transpositions and passing the matrix and the vector once more take extra time. More importantly, we know that the lower matrix has a unit diagonal and we did not exploit this property, e.g. for avoiding the divisions in the triangular solver. We could even ignore or omit the diagonal and treat it implicitly in the algorithms. This is all true. However, we prioritized the simplicity and clarity of the implementation and the reusability aspect higher than performance here. 28

We now have all we need to put the matrix inversion together. As above, we start by checking the squareness:

dense2D inline inverse(dense2D const& A)
{
    const unsigned n= num_rows(A);
    assert(num_cols(A) == n); // matrix must be square

Then we perform the LU factorization. For performance reasons this function does not return the result but takes its arguments as mutable references and factorizes in place. Thus, we need a copy of the matrix to pass and a permutation vector of appropriate size:

dense2D PLU(A);
dense_vector Pv(n);
lu(PLU, Pv);

The upper triangular factor PU of the permuted A is stored in the upper triangle of PLU. The lower triangular factor PL is partly stored in the strict lower triangle of PLU, while the unit diagonal is omitted. We therefore need to add it before the inversion (or, alternatively, handle the unit diagonal implicitly in the inversion):

dense2D PU(upper(PLU)), PL(strict_lower(PLU) + matrix::identity(n, n));

The inversion of a square matrix according to Equation (2.1) can then be performed in one single line: 29

return dense2D(inverse_upper(PU) * inverse_lower(PL) * permutation(Pv));

During this section you have seen that there are always alternative ways to implement the same behavior; most likely you have made this experience before. Although we suggested for every choice we made that it is the most appropriate one, there is not always THE single best solution, and even after weighing the pros and cons of the alternatives, one might not come to a final conclusion and just pick one. We also illustrated that the choices depend on the goals; for instance, the implementation would look different if performance were the primary goal.

The section shall also show that non-trivial programs are not written in a single sweep by an ingenious mind — exceptions might prove the rule — but are the result of gradual, improving development. Experience will make this journey shorter and more direct, but we will not write the perfect program at first glance.

28 People who care about performance do not use matrix inversion in the first place.
29 The explicit conversion can probably be omitted in later versions of MTL4.



2.12 Exercises

2.12.1 Age

Write a program that asks for input from the keyboard and prints the result on the screen and into a file. The question is: What is your age?

2.12.2 Exercise on include

We provide you with the following files: foo.hpp, included by bar1.hpp and bar2.hpp. The main program is in main.cpp.

Compile and try to link the program. It should not link. Correct the errors so that it links.

2.12.3 Arrays and pointers

1. Write the following declarations: pointer to a character, array of 10 integers, pointer to an array of 10 integers, pointer to an array of character strings, pointer to a pointer to a character, integer constant, pointer to an integer constant, constant pointer to an integer. Initialize all of the objects.

2. Read a sequence of doubles from an input stream. Let the value 0 define the end of the sequence. Print the values in input order. Remove duplicate values. Sort the values before printing.

3. Write a small program that creates arrays on the stack (fixed-size arrays) and arrays on the heap (using allocation, i.e. new). Use valgrind to check what happens when you do not use delete correctly.

2.12.4 Read the header of a Matrix-Market file

The Matrix Market data format is used to store dense and sparse matrices in ASCII format. The header contains some information about the type and the size of the matrix. For a sparse matrix, the data are stored in three columns: the first column is the row number, the second column the column number, and the third column the numerical value. If the matrix is complex, a fourth column is added for the imaginary part.

An example of a Matrix Market file is:

%%MatrixMarket matrix coordinate real general
%
% ATHENS course matrix
%
2025 2025 100015
1 1 .9273558001498543E-01
1 2 .3545880644900583E-01
...................



The first line that does not start with % contains the number of rows, the number of columns, and the number of non-zero elements of the sparse matrix.

Use fstream to read the header of a MatrixMarket file and print the number of rows, the number of columns, and the number of non-zeros on the screen.

2.12.5 String manipulation programs

There is a type string in the standard library. This type provides a large number of string operations, such as string concatenation, string comparison, etc. Note the include of the header file string.

#include <iostream>
#include <string>

int main()
{
    std::string s1 = "Hello";
    std::string s2 = "World";
    std::string s3 = s1 + ", " + s2;
    std::cout << s3 << std::endl;
    return 0;
}

In this example we have concatenated the strings s1 and s2 together with a string constant. Perform the following exercises:

1. Write a function itoa(int i, std::string& b) that constructs a string representation of i in b and returns b.

2. Write a simple encryption program. It should read input from cin and write the encrypted symbols to cout. Use the following simple encryption scheme: the code for a symbol c is c ^ key[i], where key is a string given as a parameter to a function. The symbols from key are used in a cyclic way. (After repeated encryption with the same key you should get the source string back.)
should get the source string.)



2.13 Operator Precedence

The following table lists all operators on one page so that their priorities can be seen quickly; for their meaning see Table 2.3.6. Semicolons are only separators.

Operator Precedence

class_name :: member; namespace_name :: member; :: name; :: qualified-name
object . member; pointer -> member; expr [ expr ]
object [ expr ]; expr ( expr_list ); type ( expr_list ); lvalue ++; lvalue --
typeid ( type ); typeid ( expr ); dynamic_cast< type >( expr )
static_cast< type >( expr ); reinterpret_cast< type >( expr )
const_cast< type >( expr )
sizeof expr; sizeof ( type ); ++ lvalue; -- lvalue; ~ expr; ! expr; - expr
+ expr; & lvalue; * lvalue; new type; new type( expr_list )
new ( expr_list ) type; new ( expr_list ) type( expr_list )
delete pointer; delete [] pointer; ( type ) expr
object .* pointer_to_member; pointer ->* pointer_to_member
expr * expr; expr / expr; expr % expr
expr + expr; expr - expr
expr << expr; expr >> expr
expr < expr; expr <= expr; expr > expr; expr >= expr
expr == expr; expr != expr
expr & expr
expr ^ expr
expr | expr
expr && expr
expr || expr
expr ? expr : expr
lvalue = expr; lvalue *= expr; lvalue /= expr; lvalue %= expr; lvalue += expr
lvalue -= expr; lvalue <<= expr; lvalue >>= expr; lvalue &= expr
lvalue |= expr; lvalue ^= expr
throw expr
expr , expr


Chapter 3

Classes

"Computer science is no more about computers than astronomy is about telescopes."
— Edsger W. Dijkstra

"Accordingly, computer science is more than programming language details."

Good programming is more than drilling on small language details and more than cleverly manipulating specific bits on the latest and greatest computer hardware. Focusing primarily on technical details can lead to clever codes that perform a certain task in a certain context extremely efficiently. If one is good at this, one might even create the fastest solution for this task and gain the admiration of the geeks.

3.1 Program for universal meaning, not for technical details

Writing leading-edge scientific software with such an attitude is very painful and likely to fail. The most important tasks in scientific programming are:

• Identifying the mathematical abstractions that are important in the domain; and
• Representing these abstractions comprehensively and efficiently in software.

Common abstractions that appear in almost every scientific application are vector spaces and linear operators. A linear operator projects from one vector space to another one.

First we should decide how to represent this abstraction in a program. Let v be an element of a vector space and L a linear operator. Then C++ allows us to represent the application of L on v as

L(v)

or

L * v

Which one is better suited is not so easy to say. What is easy to say is that both are better than



apply_symm_blk2x2_rowmajor_dnsvec_multhr_athlon(L.data_addr, L.nrows, L.ncols,
                                                L.ldim, L.blksch, v.data_addr, v.size);

Developing software in this fashion is far from fun. It wastes so much of the programmer's energy. Getting such calls right is of course much more work than with the former notations. If one of the arguments is stored in a different format, the function call must be meticulously adapted. Remember, the person who implements the linear projection actually wanted to do science.

The cardinal error of scientific software providing such interfaces — there are even worse than our example — is to commit to too many technical details in the user interface. The reason lies partly in the usage of simplistic programming languages such as C and Fortran 77, or in the effort to interoperate with software in these languages.

Advice

If you are ever forced to write software that interoperates with C or Fortran, write your software first with a concise and intuitive interface in C++ for yourself and other C++ programmers, and add the C and Fortran interface on top of it.

The elegant way of writing scientific software is to use and to provide the best abstraction. A good implementation reduces the user interface to the essential behavior and omits all surplus commitments to technical details. Applications with a concise and intuitive interface can be as efficient as their ugly and detail-obsessed counterparts.

In our example, this is achieved by providing a class for every specific linear operator and implementing the projection type-dependently. 1 This way, we can apply the projection without giving all the details, and the user application is short and nice. This chapter will show the foundations of how to provide new abstractions in scientific software, and the following chapters will elaborate on this.

3.2 Class members

Object types are called classes in C++, defined by the class keyword. A class defines a new data type, which can be used to create objects. A class is a collection of:

• data;
• functions, which are also referred to as member functions or methods; and
• types.

Furthermore, class members can be public or private, and classes can inherit from each other.

Let us now give an example to illustrate the class concept. To have something tangible for scientists, we refrain from foo and bar examples but gradually implement a class complex (although this already exists). This class must contain variables to store the real and the imaginary part:

1 Specializations for specific platforms can also be handled with the type system.

class complex
{
    double r, i;
};

Variables within a class are called 'member variables'.

3.2.1 Access attributes

All items — variables, constants, functions, and types — of a class have access attributes. C++ provides the following three attributes:

• public: accessible from everywhere;
• private: accessible only within the class; and
• protected: accessible only within the class and in derived classes.

The access attributes give the class designer good control over how the class users can utilize the class. Defining more public members gives more freedom in usage but less control; vice versa, more private members establish a stricter user interface. Protected members are less restrictive than private ones and more restrictive than public ones. Since inheritance is not a major topic in this book, they are not very important in this context. All class members are by default private.

3.2.2 Member functions

It is common practice in object-oriented software to declare member variables as private and to access them with functions. We do this here in Java style:

class complex
{
  public:
    double get_r() { return r; }
    void set_r(double newr) { r = newr; }
    double get_i() { return i; }
    void set_i(double newi) { i = newi; }
  private:
    double r, i;
};

Functions in a class are called 'member functions'. Member functions are also private by default, i.e. they can only be called by functions within the class. This is evidently not particularly useful for our getters and setters.

Therefore we declared them public. Public member functions and variables can be accessed outside the class. So, we can write c.get_r() but not c.r. The class above can be used in the following way:



int main()
{
    complex c1, c2;
    // set c1
    c1.set_r(3.0);
    c1.set_i(2.0);
    // copy c1 to c2
    c2.set_r(c1.get_r());
    c2.set_i(c1.get_i());
    return 0;
}

First we created two objects of type complex. Then we set one of the objects and copied it to the other one. This works, but it is a bit clumsy, isn't it?

C++ provides another keyword for defining classes: struct. The only difference 2 is that members are public by default; therefore, the example above is equivalent to:

struct complex
{
    double get_r() { return r; }
    void set_r(double newr) { r = newr; }
    double get_i() { return i; }
    void set_i(double newi) { i = newi; }
  private:
    double r, i;
};

Our member variables can only be accessed via functions. This gives the class designer maximal control over the behavior. The setter could accept only values in a certain range. We could count how often the setter and getter are called for each complex number, or for all complex numbers in the execution. The functions could have additional print-outs for debugging. 3 We could even allow reading only at certain times of the day, or writing only if the program runs on a computer with a certain IP. We will most likely not do the latter, at least not for complex numbers, but we could. If the variables were public and accessed directly, such modifications would not be possible. Nevertheless, handling the real and imaginary part of a complex number this way is cumbersome, and we will discuss alternatives.

Most C++ programmers would not implement it this way. What would a C++ programmer do first, then? Write constructors.

3.3 Constructors

What are constructors? Constructors initialize objects of classes and create a working environment for the member functions. Sometimes such an environment includes resources like files, memory, or locks that have to be freed after use. We come back to this later.

To start with, let us define a constructor for complex:

2 There really is no other difference. One can define operators and virtual functions or derived classes in the same manner as with class. The performance of class and struct is also absolutely identical.
3 A debugger is usually a better alternative to putting print-outs into programs.



class complex
{
  public:
    complex(double rnew, double inew)
    {
        r= rnew; i= inew;
    }
    // ...
};

Thus, a constructor is a member function with the same name as the class itself. It can have an arbitrary number of arguments. In our case, two arguments are most suitable because we want to set two member variables. This constructor allows us to set c1's values directly in the definition:

complex c1(2.0, 3.0);

There is a special syntax for setting member variables in constructors:

class complex
{
  public:
    complex(double rnew, double inew) : r(rnew), i(inew) {}
    // ...
};

This is not only shorter but also has another advantage: it calls the constructors of the member variables in the class's constructor. For plain old data types (POD) this does not make a significant difference. The situation is different if the members are themselves classes.

Imagine you have a class that solves linear systems with the same matrix, and you store the matrix in your class:

class solver
{
  public:
    solver(int nrows, int ncols) // : A() #1 -> error
    {
        A(nrows, ncols); // this is not a constructor here #2 -> error
    }
    // ...
  private:
    matrix_type A;
};

Suppose our matrix class has a constructor setting the dimensions. This constructor cannot be called in the function body of the constructor (#2). The call in #2 is interpreted as A.operator()(nrows, ncols), see § 4.8.

All member variables of the class are constructed before the class constructor reaches the opening {. Those members — like A — that do not appear in the list after the colon are built by a constructor without arguments, called the default constructor. Correspondingly, classes that have such a constructor are called default-constructible. Our matrix class is not default-constructible, and the compiler will tell us something like "Operator matrix_type::matrix_type() not found". Thus, we need:



class solver
{
  public:
    solver(int nrows, int ncols) : A(nrows, ncols) {}
    // ...
  private:
    matrix_type A;
};

Often the matrix (or whatever other object) is already constructed, and we do not want to waste the memory for a copy. In this case we use a reference to the object. A reference must be set in the constructor because this is the only place to declare what it refers to. The solver shall not modify the matrix, so we write:

class solver
{
  public:
    solver(const matrix_type& A) : A(A) {}
    // ...
  private:
    const matrix_type& A;
};

The code also shows that we can give the constructor arguments the same names as the member variables. After the colon, which A is which? The rule is that names outside the parentheses refer to members, while inside the parentheses the constructor arguments hide the member variables. Some people are confused by this rule and use different names. What does A refer to inside the {}? To the constructor argument. Only names that do not exist as argument names are interpreted as member variables. In fact, this is pure scope resolution: the scope of the function — in this case the constructor — is inside the scope of the class, and thus the argument names hide the class member names.

Let us return to our complex example. So far, we have a constructor allowing us to set the real and the imaginary part. Often only the real part is set and the imaginary part is defaulted to 0:

class complex
{
  public:
    complex(double r, double i) : r(r), i(i) {}
    complex(double r) : r(r), i(0) {}
    // ...
};

We can also say that the number is 0 + 0i if no value is given, i.e. if the complex number is default-constructed:

complex() : r(0), i(0) {}



Advice

Define a default constructor wherever possible, even if it might not seem necessary when you implement the class.

For the complex class, we might think that we do not need a default constructor because we can delay a variable's declaration until we know its value. The absence of a default constructor creates (at least) two problems:

• We might need the variable outside the scope in which its value is computed. For instance, if the value depends on some condition and we declared the (complex) variable in the two branches of an if, the variable would not exist after the if.

• We build containers of the type, e.g. a matrix of complex values. Then the constructor of the matrix must call the constructor of complex for each entry, and the default constructor is the most convenient way to handle this.

For some classes it might be very difficult to define a default constructor, e.g. when some of the members are references. In those cases it can be easier to accept the aforementioned drawbacks instead of building a badly designed default constructor.

We can combine all three of them with default arguments:

class complex
{
  public:
    complex(double r= 0, double i= 0) : r(r), i(i) {}
    // ...
};

In the previous main function we defined two objects, one a copy of the other. We can write a constructor for this — called the copy constructor:

class complex
{
  public:
    complex(const complex& c) : r(c.r), i(c.i) {}
    // ...
};

But we do not have to: C++ does this itself. If we do not define a copy constructor, i.e. a constructor that has one argument which is a const reference to its own type, then the compiler creates this constructor implicitly. This automatically built constructor copies each member variable by calling the variables' copy constructors, which is exactly what we did by hand. In cases like this, where copying all members is precisely what you want from your copy constructor, you should use the default for the following reasons:

• It is less verbose;
• It is less error-prone;
• Other people know directly what your copy constructor does without reading your code; and
• Compilers might find more optimizations.

There are cases where the default copy constructor does not work, especially when the class contains pointers. Say we have a simple vector class with a copy constructor:

class vector
{
  public:
    vector(const vector& v)
      : size(v.size), data(new double[size])
    {
        for (unsigned i= 0; i < size; i++)
            data[i]= v.data[i];
    }
    // ...
  private:
    unsigned size;
    double *data;
};

If we omitted this copy constructor, the compiler would not complain but voluntarily build one for us. We would be glad that our program is shorter and sexier, but sooner or later we would find that it behaves bizarrely: changing one vector modifies another one as well, and when we observe this strange behavior we have to find the error in our program. This is particularly difficult because there is no error in what we have written, only in what we have omitted.

Another problem we can observe is that the run-time library will complain that we freed the same memory twice. 4 The reason for this is the way pointers are copied: only the address is copied, with the result that both pointers point to the same memory. This might be useful in some cases, but most of the time it is not, at least in our domain. Some pointer-addicted geeks might see this differently.

3.3.1 Explicit and implicit constructors

In C++ we distinguish between implicit and explicit constructors. Implicit constructors enable, in addition to object initialization, implicit conversions and an assignment-like notation for construction. Instead of:

complex c1(3.0);

we can also write:

complex c1= 3.0;

or

complex c1= pi*pi/6.0;

For many scientifically educated people this notation is more readable. Older compilers might generate more code for initializations using '=' (the object is first created with the default constructor and the value is copied afterwards), while current compilers generate the same code for both notations.

4 This is an error message every programmer experiences at least once in his/her life (or he/she is not doing serious business).



The implicit conversion kicks in when one type is needed and another one is given, e.g. a double instead of a complex. Assume we have a function: 5

double inline complex_abs(complex c)
{
    return std::sqrt(real(c) * real(c) + imag(c) * imag(c));
}

and call this with a double, e.g.:

cout << "|7| = " << complex_abs(7.0) << '\n';

The constant 7.0 is considered a double, but there is no function complex_abs for double. There is one for complex, and complex has a constructor that accepts a double. So the complex value is implicitly built from the double.

This can be forbidden by declaring the constructor as explicit:

class complex {
  public:
    explicit complex(double nr= 0.0, double i= 0.0) : r(nr), i(i) {}
};

Then complex_abs cannot be called with a double or any other type than complex. To call this function with a double, we can write an overload for double or construct a complex explicitly in the call:

cout << "|7| = " << complex_abs(complex(7.0)) << '\n';

The explicit attribute is really important for the vector class. There will be a constructor taking the size of the vector as its argument:

class vector
{
  public:
    vector(int n) : my_size(n), data(new double[my_size]) {}
};

A function computing a scalar product will expect two vectors as arguments:

double dot(const vector& v, const vector& w) { ... }

Calling this function with integer arguments

double d= dot(8, 8);

will compile. What happened? Two temporary vectors of size 8 are created with the implicit constructor and passed to the function dot. This nonsense can easily be avoided by declaring the constructor explicit.

Discussion 3.1 Which constructors shall be explicit is, in the end, the class designer's decision. It is pretty obvious in the vector example: no right-minded programmer wants the compiler to convert integers automatically into vectors.

Whether the constructor of the complex class should be explicit depends on the expected utilization. Since a complex number with a zero imaginary part is mathematically identical to a real number, the implicit conversion does not create semantic inconsistencies. An implicit constructor is more convenient because doubles and double literals can be given wherever a complex is expected. Functions that are not performance-critical can be implemented only once, for complex, and used for double. Vice versa, in performance-critical applications it might be preferable to use an explicit constructor because the compiler will refuse to call complex functions with double arguments. Then the programmer can implement overloads of those functions with double arguments that do not waste run time on null imaginaries.

5 The definitions of real and imag will be given soon.

That does not mean that high-performance implementations necessarily have to be realized with explicit constructors. The implicit conversion might happen in rarely called functions, and the impact on the overall performance might be negligible. The compiler cannot tell us this, but a profiling tool can. A function that consumes less than 1% of the execution time is not worth spending much tuning time on. All this considered, there are more reasons for an implicit constructor than for an explicit one, and so it is implemented in std::complex.

3.4 Destructors

A destructor is a function that is called every time an object of its class is destroyed, for example:

~complex()
{
    std::cout << "So long and thanks for the fish.\n";
}

Since the destructor is the complementary operation of the default constructor, it uses the complementary notation in the signature. As opposed to the constructor, there is only one single overload, and arguments are not allowed — what could they be good for anyway, as grave goods? There is no life after death in C++.

In our example, there is nothing to do when a complex number is destroyed, and we can omit the destructor. A destructor is needed when the object acquires resources, e.g. memory. In these cases the memory must be freed and any other resources released in the destructor:

class vector
{
  public:
    // ...
    ~vector()
    {
        if (data) // check whether the pointer was allocated
            delete[] data;
    }
    // ...
  private:
    unsigned my_size;
    double *data;
};

Files that are opened with std::ifstream or std::ofstream do not need to be closed explicitly; their<br />

destructors will do this if necessary. Files that are opened with old C handles require explicit<br />

closing, and this is only one reason for not using them.<br />



One must pay attention that the freed resources are not used or released again somewhere else in<br />

the program afterwards. C ++ generates a default destructor in the same way as the default<br />

constructor: calling the destructor of each member, but in reverse order. 6<br />

3.5 Assignment<br />

Assignment operators are used to enable expressions like the following for user-defined types:<br />

x= y;<br />

u= v= w= x;<br />

As usual we consider first the class complex. Assigning a complex to a complex requires an<br />

operator like:<br />

complex& operator=(const complex& src)<br />

{<br />

r= src.r; i= src.i;<br />

return ∗this;<br />

}<br />

Evidently, we copy the members ‘r’ and ‘i’. The operator returns a reference to the object<br />

to enable multiple assignments. ‘this’ is a pointer to the object itself, and since we need a<br />

reference for syntactic reasons, it is dereferenced. What happens if we assign a double?<br />

c= 7.5;<br />

It compiles without the definition of an assignment operator for double. Once again, we have an<br />

implicit conversion: the implicit constructor creates a complex on the fly and assigns this one.<br />

If this becomes a per<strong>for</strong>mance issue we can add an assignment <strong>for</strong> double:<br />

complex& operator=(double nr)<br />

{<br />

r= nr; i= 0;<br />

return ∗this;<br />

}<br />

An assignment operator like the first one that assigns an object of the same type is called<br />

Copy Assignment, and this operator is synthesized by the compiler. In the case of complex<br />

numbers, the generated copy assignment operator performs exactly what we need, copying all<br />

members.<br />

As for the vector, the synthesized operator is not satisfactory because it only copies the address<br />

of the data and not the data itself. The implementation is very similar to the copy constructor:<br />

vector& operator=(const vector& src)<br />

{<br />

if (this == &src)<br />

return ∗this;<br />

assert(my size == src.my size);<br />

for (int i= 0; i < my size; i++)<br />

data[i]= src.data[i];<br />

return ∗this;<br />

}<br />

6 TODO: Good and short explanation why. If possible with example.<br />

In fact, every class where the copy assignment and the copy constructor have<br />

essential differences in their implementation is very confusing in its behavior and should not<br />

be used, cf. [SA05, p. 94]. The two operations differ in the respect that a constructor creates<br />

content in a new object while an assignment replaces content in an existing object. However,<br />

both the creation and the replacement are performed with copy semantics, and the two<br />

operations should therefore behave consistently.<br />

An assignment of an object to itself (source and target have the same address) can be skipped,<br />

lines 3 and 4. In line 5 it is tested whether the assignment is a legal operation by checking<br />

the equality of the sizes. Alternatively, the assignment could resize the target if the sizes are<br />

different, but that does not correspond to the authors’ understanding of vector behavior — or<br />

can you think of a context in mathematics or physics where a vector space all of a sudden<br />

changes its dimension?<br />

3.6 Automatically Generated Operators<br />

If you define a class without operators C ++ will generate the following four:<br />

• Default constructor;<br />

• Copy constructor;<br />

• Destructor; and<br />

• Copy assignment.<br />

Assume you have a class without any function but with some member variables like this:<br />

class my class<br />

{<br />

type1 var1;<br />

type2 var2;<br />

// ...<br />

typen varn;<br />

};<br />

Then the compiler adds the four operators and your class behaves as you would have written:<br />

class my class<br />

{<br />

public:<br />

my class()<br />

: var1(),<br />

var2(),<br />

// ...<br />

varn()<br />

{}<br />

my class(const my class& that)<br />

: var1(that.var1),<br />

var2(that.var2),<br />

// ...<br />

varn(that.varn)<br />

{}<br />

∼my class()<br />

{<br />

varn.∼typen();<br />

// ...<br />

var2.∼type2();<br />

var1.∼type1();<br />

}<br />

my class& operator=(const my class& that)<br />

{<br />

var1= that.var1;<br />

var2= that.var2;<br />

// ...<br />

varn= that.varn;<br />

return ∗this;<br />

}<br />

private:<br />

type1 var1;<br />

type2 var2;<br />

// ...<br />

typen varn;<br />

};<br />

The generation is straightforward: each of the four operators is called on each member<br />

variable. The careful reader has noticed that the constructors and the assignment are performed<br />

in exactly the order in which the variables are defined. The destructors are called in reverse order.<br />

The generation of these operators will be disabled if you define your own. The rules <strong>for</strong> this<br />

are quite simple. The simplest is <strong>for</strong> the destructor: either you define it or the compiler does.<br />

There is only one destructor (because it has no arguments). The default constructor generation<br />

is disabled when any constructor is defined by the user — even a private constructor.<br />

The copy constructor and copy assignment operator are generated automatically unless there<br />

is a user-defined version <strong>for</strong> the class type or a reference of it. In detail, if the user defines one<br />

or two of the following:<br />

• return type operator=(my class that);<br />

• return type operator=(const my class& that); or<br />

• return type operator=(my class& that);<br />

then the compiler does not generate it. Typically, one defines only the second operator<br />

because the first one causes an extra copy 7 and the last one requires mutability, which is usually<br />

not necessary for the assignment. The copy constructor can only be defined for references<br />

because passing the argument by value would itself require the copy constructor. Defining a constructor or assignment for<br />

any other type does not disable the generation of the copy operators.<br />

7 An exception is user-defined move semantics.<br />



This mechanism applies recursively. For instance, if type1 is itself a class with an automatically<br />

generated default constructor, the default constructors of its members are called in the order<br />

of their definition. If those variables, or some of them, are also classes, then their default<br />

constructors are called, and so forth. If the type of a member variable is an intrinsic type like int<br />

or float, then there are evidently no such operators because these types are not classes. However,<br />

the behavior can be easily emulated: the “default constructor” just creates it with a random<br />

value (whatever bits were set at the corresponding memory position before determine its value),<br />

the “copy constructor” and the “copy assignment” copy the values, and the “destructor” does<br />

nothing.<br />

3.7 Accessing object members<br />

3.7.1 Access functions<br />

In § 3.2.2 we introduced getters and setters to access the variables of the class complex. This<br />

becomes cumbersome when we want, for instance, to increment the real part:<br />

c.set r(c.get r() + 5.);<br />

This does not really look like numeric operations and is not very readable either. A better way<br />

to deal with this is to write a member function that returns a reference:<br />

class complex { public:<br />

double& real() { return r; }<br />

};<br />

With this function we can write:<br />

c.real()+= 5.;<br />

This already looks much better but is still a little bit weird. Why not increment like this:<br />

real(c)+= 5.;<br />

To do this, we write a free function:<br />

inline double& real(complex& c) { return c.r; }<br />

But this function accesses the private member ‘r’. We can modify the free function to call the<br />

member function:<br />

inline double& real(complex& c) { return c.real(); }<br />

Or, alternatively, we can declare the free function as a friend of complex:<br />

class complex { public:<br />

friend double& real(complex& c);<br />

};<br />

Functions or classes that are friends can access private and protected data. A strange issue<br />

with this free function is that the inline attribute must be written be<strong>for</strong>e the reference type.<br />

Usually it does not matter whether the inline is written be<strong>for</strong>e or after the return type. 9<br />

9 TODO: Does anybody have a decent explanation for this?



This function works only if the complex number is not constant. So we also need a function that<br />

takes a constant reference as argument. In return, it can only provide a constant reference to<br />

the number’s real part.<br />

inline const double& real(const complex& c) { return c.r; }<br />

This function requires a friend declaration, too.<br />

The functions — in free as well as in member form — can evidently only be called after the object<br />

has been created. The references to the number’s real part that we use in the statement<br />

real(c)+= 5.;<br />

exist only until the end of the statement. The variable c lives longer. We can create a reference<br />

variable:<br />

double &rr= real(c);<br />

C ++ destroys objects in reverse order. That means that even if rr and c are in the same function<br />

or block, c lives longer than rr.<br />

The same is true for constant references if the referenced objects stem from variable declarations.<br />

Temporary objects can also be passed as constant references, enabling the definition of dangling<br />

references:<br />

const double &rr= real(complex()); // Bad thing!!!<br />

cout ≪ ”The real part is ” ≪ rr ≪ ’\n’;<br />

The complex variable is created temporarily and exists only until the end of the first statement.<br />

The reference to its real part lives till the end of the surrounding block.<br />

Advice<br />

Do Not Make Constant References Of Temporary Expressions!<br />

They are invalid be<strong>for</strong>e you use them the first time.<br />

3.7.2 Subscript operator<br />

A really stupid way to access vector entries would be to write a function for each one:<br />

class vector<br />

{<br />

public:<br />

double& zeroth() { return data[0]; }<br />

double& first() { return data[1]; }<br />

double& second() { return data[2]; }<br />

// ...<br />

int size() const { return my size; }<br />

};



One could not even write a loop over all elements.<br />

To enable such iteration, we need a function like:<br />

class vector<br />

{<br />

public:<br />

double at(int i)<br />

{<br />

assert(i >= 0 && i < my size);<br />

return data[i];<br />

}<br />

};<br />

Summing the entries of vector v reads:<br />

double sum= 0.0;<br />

<strong>for</strong> (int i= 0; i < v.size(); i++)<br />

sum+= v.at(i);<br />

C ++ and C access entries of (fixed-size) arrays with the subscript operator. It is, thus, only<br />

natural to do the same for (dynamically sized) vectors. Then we could rewrite the previous<br />

example as:<br />

double sum= 0.0;<br />

<strong>for</strong> (int i= 0; i < v.size(); i++)<br />

sum+= v[i];<br />

This is more concise and shows more clearly what we are doing.<br />

Overloading this operator has the same syntax as overloading the assignment operator, and the implementation<br />

is taken from the function at:<br />

class vector<br />

{<br />

public:<br />

double& operator[](int i)<br />

{<br />

assert(i >= 0 && i < my size);<br />

return data[i];<br />

}<br />

};<br />

With this operator we can access vector elements with brackets, but only if the vector is<br />

mutable.<br />

3.7.3 Constant member functions<br />

This raises the more general question: How can we write operators and member functions that<br />

accept constant objects? In fact, operators are a special <strong>for</strong>m of member functions and can be<br />

called like a member function:<br />

v[i]; // is syntactic sugar <strong>for</strong>:<br />

v.operator[](i);



Of course, the long <strong>for</strong>m is almost never called but it illustrates that operators are regular<br />

functions that only provide an extra syntax to call them.<br />

Free functions allow qualifying the const-ness of each argument. Member functions do not even<br />

mention the processed object in the signature. How can const-ness be specified then? There is<br />

a special notation that states the applicability of a member function to constant objects after<br />

the function header, e.g. our subscript operator:<br />

class vector<br />

{<br />

public:<br />

const double& operator[](int i) const<br />

{<br />

assert(i >= 0 && i < my size);<br />

return data[i];<br />

}<br />

};<br />

The const attribute is not just a casual gesture of the programmer that he/she does not mind<br />

calling this member function with a constant object. C ++ takes this constancy very seriously<br />

and will verify that the function does not modify the object, i.e. any of its members, that the<br />

object is only passed as const when free functions are called and that called member functions<br />

have the const attribute as well.<br />

This constancy guarantee also prevents returning non-constant pointers or references. One can<br />

return constant pointers or references as well as objects. A returned object does not need to<br />

be constant (but it could) because it is a copy of the object, of one of its member variables<br />

(or constants), or of a temporary variable; and because it is a copy the object is guaranteed to<br />

remain unchanged.<br />

Constant member functions can be called <strong>for</strong> non-constant objects (because C ++ implicitly<br />

converts non-constant references into constant references when necessary). There<strong>for</strong>e, it is<br />

often sufficient to provide only the constant member function. For instance a function that<br />

returns the size of the vector:<br />

class vector<br />

{<br />

public:<br />

int size() const { return my size; }<br />

// int size() { return my size; } // futile<br />

};<br />

The non-constant size function does the same as the constant one and is there<strong>for</strong>e useless.<br />

For our subscript operator we need both the constant and the mutable version. If we only<br />

had the constant member function, we could use it to read the elements of both constant and<br />

mutable vectors, but we could not modify the elements. By the way, our abandoned getters<br />

should have been const since they are only used to read values regardless of whether the object<br />

is constant or mutable.<br />

3.7.4 Accessing multi-dimensional arrays<br />

Let us assume that we have a simple matrix class like the following:



class matrix<br />

{<br />

public:<br />

matrix() : nrows(0), ncols(0), data(0) {}<br />

matrix(int nrows, int ncols)<br />

: nrows(nrows), ncols(ncols), data( new double[nrows ∗ ncols] ) {}<br />

matrix(const matrix& that)<br />

: nrows(that.nrows), ncols(that.ncols), data(new double[nrows ∗ ncols])<br />

{<br />

<strong>for</strong> (int i= 0, size= nrows∗ncols; i < size; ++i)<br />

data[i]= that.data[i];<br />

}<br />

∼matrix() { if (data) delete [] data; }<br />

void operator=(const matrix& that)<br />

{<br />

assert(nrows == that.nrows && ncols == that.ncols);<br />

<strong>for</strong> (int i= 0, size= nrows∗ncols; i < size; ++i)<br />

data[i]= that.data[i];<br />

}<br />

int num rows() const { return nrows; }<br />

int num cols() const { return ncols; }<br />

private:<br />

int nrows, ncols;<br />

double∗ data;<br />

};<br />

So far, the implementation is done in the same manner as be<strong>for</strong>e: variables are private, the<br />

constructors establish defined values <strong>for</strong> all members, the copy constructor and the assignment<br />

are consistent, and size information is provided by constant functions.<br />

What is still missing is the access to the matrix entries.<br />

Be aware!<br />

The bracket operator accepts only one argument.<br />

That means we cannot define<br />

double& operator[](int r, int c) { ... }<br />

Approach 1: Parenthesis<br />

The simplest way of handling multiple indices is to replace the square brackets with parentheses:<br />

double& operator()(int r, int c)<br />

{<br />

return data[r∗ncols + c];<br />

}<br />

Adding range checking — in a separate function for better reuse — can save us a lot of debugging<br />

time in the future. We also implement the constant access:<br />

private:<br />

void check(int r, int c) const { assert(0 <= r && r < nrows && 0 <= c && c < ncols); }<br />



Approach 3: Returning proxies<br />

Instead of returning a pointer we can build a specific type that keeps a reference to the matrix<br />

and the row index and that provides an operator[] for accessing matrix entries. This proxy must<br />

therefore be a friend of the matrix class to reach its private data. Alternatively, we can keep<br />

the operator with the parentheses and call this one from the proxy. In both cases, we encounter<br />

cyclic dependencies. 10<br />

If we have several matrix types, each of them would need its own proxy. We would also need<br />

different proxies <strong>for</strong> constant and mutable access respectively. In Section 6.5 we will show how<br />

to write a proxy that works <strong>for</strong> all matrix types. The same templated proxy will handle constant<br />

and mutable access. Fortunately, it even solves the problem of mutual dependencies. The only<br />

minor flaw is that potential errors can cause lengthy compiler messages.<br />

Approach 4: Multi-index type (advanced)<br />

Preliminary note: this approach contains several new language features and discusses some<br />

subtle details. If you do not understand it the first time, don’t worry. If you would like to skip it, do<br />

so. That will not be a problem for understanding the rest of the book. But please read the<br />

comparative discussion at the end.<br />

The fact that operator[] accepts only one argument does not necessarily mean that we cannot<br />

give two. But we need a tricky technique to build one object out of two, without explicitly<br />

constructing the object. The implementation is based on the matrix example from an online<br />

tutorial [Sch].<br />

First, we define a type:<br />

struct double index<br />

{<br />

double index (int i1, int i2): i1 (i1), i2 (i2) {}<br />

int i1, i2;<br />

};<br />

For this type we define the access operator:<br />

double& operator[](double index i) { return data[i.i1∗ncols + i.i2]; }<br />

const double& operator[](double index i) const { return data[i.i1∗ncols + i.i2]; }<br />

Now we can write:<br />

A[double index(1, 0)];<br />

This works but it was not the concise notation we were looking <strong>for</strong>.<br />

We introduce a second type:<br />

struct single index<br />

{<br />

single index (int i1): i1 (i1) {}<br />

double index operator, (single index j) const {<br />

return double index (i1, j.i1);<br />

}<br />

operator int() const { return i1; }<br />

single index& operator++ () {<br />

++i1; return ∗this;<br />

}<br />

int i1;<br />

};<br />

10 The dependencies cannot be resolved with forward declaration because we not only define references or pointers but call member functions in the matrix and in the proxy. We will explain this in § ??.<br />

This new type overloads the comma operator so that a second index creates a double index.<br />

The constructor is implicit, and the class contains a conversion operator to int. This enables the compiler<br />

to convert between single index and int in both directions.<br />

This allows us to write code like:<br />

single index i= 0, j= 1;<br />

std::cout ≪ ”A[0, 1] is ” ≪ A[i, j] ≪ ’\n’;<br />

or<br />

<strong>for</strong> (single index i= 0; i < A.num rows(); ++i)<br />

<strong>for</strong> (single index j= 0; j < A.num cols(); ++j)<br />

std::cout ≪ ”A[” ≪ i ≪ ”, ” ≪ j ≪ ”] is ” ≪ A[i, j] ≪ ’\n’;<br />

In the loop, a single index (i) is compared with an int (A.num rows()). This comparison operator<br />

is not defined. The compiler converts i implicitly to an int and compares the values as int.<br />

Thus, the conversion operator allows us to use all operations that are defined <strong>for</strong> int without<br />

implementing them.<br />

At this opportunity we can introduce another operator. C and C ++ provide a prefix and postfix<br />

increment/decrement. The difference only manifests if we read the incremented/decremented<br />

value, e.g., j= i++; differs from j= ++i; by having the old value of i in j (in the first statement)<br />

or the already incremented i (in the second statement). If the increment is the only expression<br />

in the statement, e.g., i++; or ++i;, there is no semantic difference. There<strong>for</strong>e, it does not<br />

matter <strong>for</strong> loops whether we use the postfix or prefix notation.<br />

<strong>for</strong> (single index i= 0; i < A.num rows(); ++i)<br />

is (semantically) equivalent to:<br />

<strong>for</strong> (single index i= 0; i < A.num rows(); i++)<br />

For C ++ integer types it really does not matter. For user-defined types, the compiler will tell<br />

us that this operation is not defined. The GNU Compiler emits the following error message:<br />

no ≫operator++(int)≪ <strong>for</strong> suffix ≫++≪ declared, instead prefix operator tried<br />

Fortunately, it already reveals the solution.<br />

The operator++ without arguments is understood as prefix operator. To define a postfix operator<br />

we must define it with a dummy int argument. This argument has no effect but we need<br />

a way to define the symbol ++ as prefix and postfix operator. Unary operators are defined<br />

as member functions without argument. This works for all other unary operators, but in the case<br />

of decrement/increment we have the same symbol for two operators that are<br />

distinguished only by their position.<br />
distinguished by the position.<br />

To make a long story short, if we write i++ we must define the postfix increment:<br />

single index operator++ (int)<br />

{<br />

single index tmp(∗this);<br />

++i1;<br />

return tmp;<br />

}<br />

We see that the operation requires an extra copy. The object itself must be incremented but<br />

the returned value must still be the old one. If we returned the object itself, i.e. ∗this, then we<br />

would have no possibility to increment it after the return. Therefore we need a copy before we modify<br />

the object. Alternatively we could omit the copy and return a new object with the old value:<br />

single index operator++ (int)<br />

{<br />

++i1;<br />

return single index(i1 − 1);<br />

}<br />

This avoids the copy at the beginning but we still create a new object. These implementations<br />

show that the postfix operators are somewhat more expensive than prefix operators, and this is<br />

true for all user-defined types. For built-in types, the compiler can generate efficient executables<br />

for both forms.<br />

The really sad part of the story is that we put so much effort into returning the old value of our<br />

index and do not even use it. Therefore, we give the following<br />

Advice<br />

If you increment or decrement user-defined types prefer the prefix notation,<br />

especially if the value of the changed variable is not used in the statement.<br />

In the examples, we declared both indices as single index. It is sufficient to do this for the first<br />

one and to let the implicit constructor convert the second one:<br />

A[single index(0), 1]<br />

Un<strong>for</strong>tunately, we cannot write<br />

A[0, 1]<br />

The compiler will give an error message 11 like:<br />

no match <strong>for</strong> ≫operator[]≪ in ≫A[(0, 0)]≪<br />

To call operator[], the compiler would need to per<strong>for</strong>m multiple steps that depend on each other:<br />

first the zeros that are considered int would need to be converted to single index and then the<br />

comma operator has to be applied on them. A language that would allow such dependent<br />

conversions would end up with extremely long compile times to consider all possibilities 12 and<br />

the probability of ambiguities would increase tremendously.<br />

11 This is the message from the GNU compiler.<br />

Instead the compiler considers ‘0, 0’ as a sequence of two expressions where each expression is<br />

an integer constant. The result of a sequence is the result of the last expression, i.e. the integer<br />

constant zero in our case. This cannot be converted into a double index.<br />

To throw in a really bad idea, we give the second constructor argument of double index a default<br />

value:<br />

struct double index<br />

{<br />

double index (int i1, int i2= 0) // Very bad<br />

: i1 (i1), i2 (i2) {}<br />

int i1, i2;<br />

};<br />

Then the expression A[0, 1] compiles, as well as A[0, 1, 2, 3, 4]. It evaluates the integer sequence<br />

and the result is the last expression. A single integer can be implicitly converted into double index.<br />

As a result, the last integer is considered the row and the column is zero.<br />

Comparing the approaches<br />

The previous implementations show that C ++ allows us to provide different notations for user-defined<br />

types, and we can implement them in the manner that seems most appropriate to us. The<br />

first approach was replacing square brackets by round parentheses to enable multiple arguments.<br />

This was the simplest solution, and if one is willing to accept this syntax, one can save oneself<br />

the lengths we went through to come up with a fancier notation. The technique of returning a<br />

pointer was not complicated either, but it relies too strongly on the internal representation. If<br />

we use some internal blocking or some other specialized internal storage scheme, we will need<br />

an entirely different technique. Another drawback was that we cannot test the range of the<br />

column index.<br />

The last approach introduced special types, and the fact that we must always specify the type<br />

of the index explicitly makes the notation for constant indices clumsier instead of clearer. It<br />

also introduced a lot of implicit conversions, and in a large code base we might have enormous<br />

trouble avoiding ambiguities. Another unfortunate aspect is the overloading of the comma<br />

operator. It makes the understanding of programs more difficult — because one has to pay a<br />

lot of attention to the types of expressions to distinguish it from non-overloaded sequences —<br />

and can cause weird effects. Thus, our first recommendation is to keep reading, since the proxy<br />

solution in § ?? is in our opinion preferable to the previous approaches (although not perfect<br />

either).<br />

To sum up, C ++ gives us the opportunity to handle programming tasks in different ways. Often,<br />

none of the solutions will be perfect. Even if one is satisfied with a solution,<br />

there will most certainly be some (allegedly) experienced C ++ programmer who finds a<br />

disadvantage.<br />

12 It might even become undecidable.



There are two lessons we can learn from this, firstly:<br />

Advice<br />

Don’t push C ++ too far! Avoid fragile features and minimize implicit conversions.<br />

C ++ enables many techniques, but that doesn’t mean one has to use them all. Especially the<br />

comma operator bears so much danger that its utilization must be limited to very rare cases<br />

or better avoided entirely. It is important to have an appropriate notation and time spent on<br />

syntactic sugar is really worthwhile <strong>for</strong> the sake of better usability of new classes. But some<br />

tricks provide a little improvement in the syntax and create large problems in the interplay with<br />

other techniques.<br />

Secondly:<br />

Advice<br />

If you can’t find a perfect solution, pick what serves you best and accept it.<br />

We dare the hypothesis that there is no single C ++ program that everybody is happy with. The<br />

attempt to come up with the world’s first perfect C ++ program will end in failure and bitterness.<br />

Of course, that does not mean always willingly accepting the first working implementation one<br />

comes up with. Software can always be improved and should be. As mentioned in § 2.11,<br />

experience has shown that it is more efficient to refactor software as early as possible than<br />

to retroactively fix issues when important applications crash, users are angry, and the program<br />

author(s) have forgotten the details or are already gone. On the other hand, by the time one reaches<br />

a really good implementation, one has certainly already spent much more time than initially<br />

planned.<br />

3.8 Other Operators


Chapter 4<br />

Generic Programming<br />

In this chapter we will explain the use of templates in C ++ to create generic functions and<br />

classes. We will also discuss metaprogramming and the Standard Template Library.<br />

4.1 Templates<br />

Templates are a feature of the C ++ programming language that allows creating functions and classes<br />

that operate with generic types — also called parametric types. As a result, a function or class<br />

can work with many different data types without being manually rewritten <strong>for</strong> each one.<br />

A template parameter is a special kind of parameter that can be used to pass a type as an<br />

argument: just like regular function parameters can be used to pass values to a function,<br />

template parameters also allow passing types to a function or a class. These generic functions<br />

can use these parameters as if they were any other regular type.<br />

4.2 Generic functions<br />

Generic functions — also called function templates — are in a sense generalizations of overloaded<br />

functions.<br />

Suppose we want to write the function max(x,y) where x and y are variables or expressions of<br />

some type. Using overloading, we can easily do this as follows:<br />

int inline max (int a, int b)<br />

{<br />

if (a > b)<br />

return a;<br />

else<br />

return b;<br />

}<br />

double inline max (double a, double b)<br />

{<br />

if (a > b)<br />

return a;<br />


}<br />

else<br />

return b;<br />

Note that the function body is exactly the same <strong>for</strong> both int and double.<br />

With the template mechanism we can write just one generic implementation:<br />

template <typename T><br />

T inline max (T a, T b)<br />

{<br />

if (a > b)<br />

return a;<br />

else<br />

return b;<br />

}<br />

The function can be used in the same way as the overloaded functions:<br />

std::cout << "The maximum of 3 and 5 is " << max(3, 5) << '\n';<br />

std::cout << "The maximum of 3l and 5l is " << max(3l, 5l) << '\n';<br />

std::cout << "The maximum of 3.0 and 5.0 is " << max(3.0, 5.0) << '\n';<br />

In the first case, ‘3’ and ‘5’ are literals of type int and the max function is instantiated to<br />

int inline max (int, int);<br />

Likewise the second and third call of max instantiate<br />

long inline max (long, long);<br />

double inline max (double, double);<br />

as the literals are interpreted as long and double.<br />

In the same way the template function can be called with variables and expressions:<br />

unsigned u1= 2, u2= 8;<br />

std::cout << "The maximum of u1 and u2 is " << max(u1, u2) << '\n';<br />

std::cout << "The maximum of u1*u2 and u1+u2 is " << max(u1*u2, u1+u2) << '\n';<br />

Here the function is instantiated for unsigned.<br />

Instead of typename one can also write class in this context, but we do not recommend this because typename expresses the intention of a generic function better.<br />

What does instantiation mean? When you write a non-generic function, the compiler reads<br />

its definition, checks <strong>for</strong> errors, and generates executable code. When the compiler processes<br />

a generic function’s definition it only checks certain errors (parsing errors) and generates no<br />

executable code. For instance:<br />

template <typename T><br />

T inline max (T a, T b)<br />

{<br />

if a > b // Error !<br />

return a;<br />

else<br />

return b;<br />

}



would not compile because an if statement without parentheses does not conform to the C ++ grammar. Meanwhile the following stupid implementation:<br />

template <typename T><br />

T inline max (T a, T b)<br />

{<br />

if (a > b)<br />

return max(a, b); // Infinite loop !<br />

else<br />

return max(b, a); // Infinite loop !<br />

}<br />

compiles because it does not violate any grammar rule. It obviously results in an infinite loop<br />

but this is beyond the compiler’s responsibility.<br />

So far, the compiler has only checked the grammatical correctness of the definition but has not generated code. If we never call the template function, the binary will contain no trace of our max function. What happens when we call the generic function and thereby cause its instantiation? The compiler first checks whether the function can be compiled with the given argument type. It can do so for int or double, as we have seen before. But what about types that have no operator>, for instance std::complex<double>? Let us try to compile:<br />

std::complex<double> z(3, 2), c(4, 8);<br />

std::cout << "The maximum of c and z is " << ::max(c, z) << '\n';<br />

The double colons in front of max avoid ambiguities with the standard library's max, which some compilers include implicitly (as g++ apparently does). Our compilation attempt will end in an error like:<br />

Error: no match <strong>for</strong> ≫operator>≪ in ≫a > b≪<br />

Obviously, we cannot call the max function with types that have no “greater than” operator.<br />

In fact, there is no maximum function <strong>for</strong> complex numbers.<br />

What happens when our template function calls another template function, which in turn calls yet another, and so on? Likewise, these functions are only completely checked at instantiation time. Let us look at the following program:<br />

#include <iostream><br />
#include <complex><br />
#include <vector><br />
#include <algorithm><br />

int main ()<br />

{<br />

using namespace std;<br />

vector<complex<double> > v;<br />

sort(v.begin(), v.end());<br />

return 0;<br />

}<br />

Without going into detail, the problem is the same as before: we cannot compare complex numbers and thus cannot sort arrays of them. This time the missing comparison is discovered in an indirectly called function, and the compiler provides the entire call stack so that you<br />



can trace back the error. Please try to compile this example on the different compilers at your disposal and see if you can make any sense of the error messages.<br />

If you run into such a lengthy error message1, DON'T PANIC! First, look at the error itself and extract what is useful for you, e.g. a missing "operator>", or something not assignable, i.e. a missing "operator=", or something const that should not be. Then find in the call stack the innermost code that is part of your own program, i.e. the place where you call somebody else's template function. Stare for a while at this line and its predecessors, because this is the most likely place of the error. Is a type among the template function's arguments missing an operator or function mentioned in the error? Do not get scared away; often the problem is much simpler than it seems from the never-ending error message. In our experience, most errors in template functions can be found faster than run-time errors.<br />

Another question we have not answered so far is what happens if we use two different types:<br />

unsigned u1= 2;<br />

int i= 3;<br />

std::cout << "The maximum of u1 and i is " << max(u1, i) << '\n';<br />

The compiler tells us, this time briefly, something like<br />

Error: no match <strong>for</strong> function call ≫max(unsigned int&, int)≪<br />

Indeed, we assumed that both types are the same. Now, can we write a template function with two template parameters? Of course we can. But that does not help us much here because we would not know what return type the function should have.<br />

There are different options. First we could add a non-templated function like:<br />

int inline max (int a, int b) { return a > b ? a : b; }<br />

This can be called with mixed types, and the unsigned argument would be implicitly converted into an int. But what would happen if we also added a function for unsigned?<br />

int max(unsigned a, unsigned b) { return a > b ? a : b; }<br />

Should the int be converted into an unsigned or vice versa? The compiler does not know and will complain about this ambiguity.<br />

At any rate, adding non-templated overloads to the templated implementation is neither elegant nor productive. So we remove all non-templated overloads and see what we can do in the function call. We can explicitly convert one argument to the type of the other:<br />

unsigned u1= 2;<br />

int i= 3;<br />

std::cout << "The maximum of u1 and i is " << max(int(u1), i) << '\n';<br />

Now max is called with two ints. Another option is specifying the template type explicitly in<br />

the function call:<br />

unsigned u1= 2;<br />

int i= 3;<br />

std::cout << "The maximum of u1 and i is " << max<int>(u1, i) << '\n';<br />

1 The longest we have heard of was 18 MB, which corresponds to about 9000 pages of text.



Then the arguments are converted to int. 2<br />

After these less pleasant details about templates, some really good news: template functions perform as efficiently as their non-templated counterparts! The reason is that C ++ generates new code for every type or type combination that the function is called with. Java, in contrast, compiles its generics only once and executes them for different types by casting them to the corresponding types. This results in faster compilation and shorter executables, but it is less efficient than non-templated implementations (which are already less efficient than C ++ programs).<br />

Another price we pay for the fast templates is longer executables, because of the multiple instantiations for each type (combination). In practice, however, the number of instances of a function will not be that large, and it only really matters for non-inline functions with long implementations (including called template functions). Inline functions' binary code is in any case inserted directly into the executable at the location of the function call, so the impact on the executable length is the same for template and non-template functions.<br />

4.2.1 The function accumulate<br />

TODO: An example on containers is much better than with ugly pointer arithmetic.<br />

Consider an array double a[n], which is described by its begin and end pointers a and a + n, respectively.3 We create a function for the sum of an array of doubles. The loop over the array uses pointers, as explained in Section 2.9. Figure 4.1 shows the positions of the begin pointer a and the end pointer a + n, which points directly past the end of the array.<br />

Figure 4.1: An array of length n with begin pointer a and end pointer a + n<br />

Thus, we specify the range of entries by a right-open interval of addresses.<br />

2 For complicated reasons of compiler internals, the explicit type parameter turns off argument-dependent name lookup (ADL).<br />

3 An array and a pointer are treated in much the same way in C/C ++: one can pass an array where a pointer is expected, and it decays to the address of the first entry &a[0]. For a pointer or array a and an integer n, a + n is equivalent to &a[n].<br />



Advice<br />

Unless you have strong reasons against it, use right-open intervals because:<br />

• It is easy to represent empty sets by two equal locations (pointers, iterators, . . . ).<br />

• It works on types without an ordering: to test whether the end is reached, inequality suffices, whereas specifying the end by the location of the last element requires an ordering operator to test whether you are already past it.<br />



The accumulate function applies the += operator to variables of type T. This operator is defined for built-in types such as float and int. This implies that the following main program will compile without the need for another definition of the accumulate function:<br />

int main()<br />

{<br />

const int n = 10;<br />

float a[n];<br />

int b[n];<br />

for (int i= 0; i < n; ++i) {<br />

a[i]= float(i) + 1.0f;<br />

b[i]= i + 1;<br />

}<br />

float s= accumulate(a, a + n);<br />

int r= accumulate(b, b + n);<br />

return 0;<br />

}<br />

As in the previous example, we do not need to indicate explicitly that T is float or int: the compiler deduces this for us from the function arguments. We can, however, state the type explicitly as follows:<br />

int r= accumulate<int>(b, b + n);<br />

If you fill in a wrong type, the compiler will give you a type error saying that no matching function exists.<br />

4.3 Generic classes<br />

In the previous section, we described the use of templates to create generic functions. Templates can also be used to create generic classes that define a certain behaviour independently of the types they operate on. Good candidates are, for example, container classes like vectors, matrices, and lists. We could also extend the complex class with a parametric value type, but we have already spent so much time with it that we will now look at something else.<br />

Let us write a generic vector class. 4 First we just implement a class with the most fundamental<br />

operators:<br />

template <typename T><br />
class vector<br />
{<br />
    void check_size(int that_size) const { assert(my_size == that_size); }<br />
    void check_index(int i) const { assert(i >= 0 && i < my_size); }<br />
  public:<br />
    explicit vector(int size)<br />
      : my_size(size), data( new T[my_size] )<br />
    {}<br />
    vector()<br />
      : my_size(0), data(0)<br />
    {}<br />

4 In the sense of linear algebra, not like the STL vector.



    vector( const vector& that )<br />
      : my_size(that.my_size), data( new T[my_size] )<br />
    {<br />
        for (int i= 0; i < my_size; ++i)<br />
            data[i]= that.data[i];<br />
    }<br />
    ~vector() { if (data) delete [] data; }<br />
    vector& operator=( const vector& that )<br />
    {<br />
        check_size(that.my_size);<br />
        for (int i= 0; i < my_size; ++i)<br />
            data[i]= that.data[i];<br />
        return *this;<br />
    }<br />
    int size() const { return my_size; }<br />
    const T& operator[]( int i ) const<br />
    {<br />
        check_index(i);<br />
        return data[i];<br />
    }<br />
    T& operator[]( int i )<br />
    {<br />
        check_index(i);<br />
        return data[i];<br />
    }<br />
    vector operator+( const vector& that ) const<br />
    {<br />
        check_size(that.my_size);<br />
        vector sum(my_size);<br />
        for (int i= 0; i < my_size; ++i)<br />
            sum[i]= data[i] + that[i];<br />
        return sum;<br />
    }<br />
  private:<br />
    int my_size;<br />
    T* data;<br />
};<br />

Listing 4.1: Template vector class<br />

The template class is not essentially different from a non-template class. There is only the extra parameter T as a placeholder for the type the class is used with. We have member variables like my_size and member functions like size() that are not affected by the template parameter. Other functions like the access operator or the first constructor are parametrized. However, the difference is minimal: wherever we had double (or another type) before, we now put the type parameter T, e.g. for return types or within new. Likewise, our member variables and constants can be parametrized by T, as for data. Even program parts that use generic functions or data can often be implemented without explicitly stating the



type parameters. For instance, the destructor uses the pointer data with a template type, but delete can deduce its type automatically, and for the null pointer test it does not matter either.<br />

Template arguments can have default values. Assume our vector class has, in addition to the value type, two parameters for the orientation and location (the parameter names here are reconstructed, since they did not survive extraction):<br />

template <typename T= double, typename Orientation= col_major, typename Where= heap><br />
class vector;<br />

The arguments of a vector can be fully declared:<br />

vector<float, row_major, heap> v;<br />

The last argument is equal to the default value and can be omitted:<br />

vector<float, row_major> v;<br />

As for default function arguments, only the trailing arguments can be omitted. For instance, if the second argument is the default and the last one is not, we must write them all:<br />

vector<float, col_major, stack> w;<br />

If all template arguments have their default values, we can of course omit them all. However, the type is still a template class, and the compiler gets confused if we skip the brackets entirely:<br />

vector x;   // wrong, it is considered a non-template class<br />

vector<> y; // looks a bit strange but is correct<br />

Unlike default function arguments, template defaults can refer to previous template arguments:<br />

template <typename T, typename U= T><br />
class pair;<br />

This is a class <strong>for</strong> two values that might have different types. If they do not we do not want to<br />

declare it twice:<br />

pair<int, float> p1; // object with an int and a float value<br />

pair<int> p2; // object with two int values<br />

The dependency on previous arguments can be more complex than just equality when using<br />

meta-functions that we will introduce in Chapter ??.<br />

TODO: transition to next section<br />

4.4 Concepts and Modeling<br />

In the previous sections one could get the impression that template parameters can be replaced by any type. This is in fact not entirely true. The programmer of templated classes and functions makes assumptions about the operations that can be performed on the templated variables. So it is very important to know which types may correctly be substituted for the formal template parameters, in C ++ lingo: which types the template function or class can be instantiated with.<br />

Clearly, accumulate can be instantiated with int or double. Types without addition like a solver



class (on page 70) cannot be used with accumulate. What should be accumulated from a set of solvers? All requirements for the template parameter T of the function accumulate can be summarized as follows:<br />

• T is CopyConstructible:<br />

– The copy constructor T::T(const T&) exists, so that 'T a(b);' compiles if b is of type T.<br />

• T is PlusAssignable:<br />

– The plus-assign operator T::operator+=(const T&) exists, so that 'a+= b;' compiles if b is of type T.<br />

• T is Constructible from int:<br />

– The constructor T::T(int) exists, so that 'T a(0);' compiles.<br />

Such a set of type requirements is called a 'Concept'. A concept CR that contains all requirements of a concept C plus additional ones is called a 'Refinement' of C. A type t that fulfills all requirements of a concept C is called a 'Model' of C.<br />

A complete definition of a template function or type should contain the list of required concepts, as is done for functions from the Standard Template Library; see http://www.sgi.com/tech/stl/.<br />

Today such requirements are mere documentation. However, there exists a prototype of a C ++ concept compiler [?] that checks:<br />

• whether a function can be called with a certain type (combination5);<br />

• whether a class can be instantiated with certain types (combination); and<br />

• whether a function's requirement list covers all used expressions, including those in sub-functions.<br />

This compiler generates short and comprehensible messages when template functions or classes are used erroneously. People interested in generic programming should try it; it helps to understand concepts better. However, the compiler really is a prototype and must not be used for production code. This functionality was even planned for the next language standard but, to make a (very) long story short, the committee could not reach a consensus on its details.<br />

Discussion 4.1 The most vulnerable aspect of generic programming is semantic conformance, that is, which Semantic Concepts are modelled. For instance, an algorithm might require that a binary operation is associative in order to calculate correctly. One can express this requirement in the function's documentation, but if someone calls the function with an operation that is not associative, the compiler has no idea about it. If one violates a syntactic requirement, the compiler will complain about the missing function or operator (often in a hardly readable form), but the error will be caught no matter what. If one violates a semantic requirement, the compiler generates erroneous executables, and the compilation does not give any warning because the compiler is entirely unaware of the semantics of user types. The only way to find such semantic errors in templates with today's compilers is careful documentation (and reading it, of course). Recent research gives hope that future C ++ standards and compilers will provide more reliable and elegant possibilities to ensure the semantic correctness of template programs.<br />

5 If you have multiple template arguments.



For illustration purposes, we would like to show the conceptualized declaration of a generic sorting function as used in the library of the concept compiler:<br />

template <typename Iter><br />
    requires LessThanComparable<typename Iter::value_type><br />
          && CopyAssignable<typename Iter::value_type><br />
          && Swappable<typename Iter::value_type><br />
          && CopyConstructible<typename Iter::value_type><br />
inline void sort(Iter first, Iter last);<br />

If the function is called erroneously, the compiler will detect this directly at the function call, not deep inside its implementation.<br />

4.5 Inheritance or Generics?<br />

In this section we will discuss the commonalities and difference of/between object-oriented<br />

programming (OOP) and generic programming. People that do not know OOP so far will not<br />

learn it in this section. The purpose of this section is to motivate why we pay more attention<br />

to generic than to object-oriented programming in this book. The short answer is per<strong>for</strong>mance<br />

and applicability. If this answer is good enough <strong>for</strong> you, you can skip this section and continue<br />

with the next one. Programmers that are used to OOP and think they can implement the<br />

functionality with inheritance instead of templates should take the time and read this section.<br />

Inheritance and generic programming are similar in the sense that most programming problems<br />

that can be solved by inheritance have a generic alternative solution and vice versa. The<br />

following table summarizes the basic components of inheritance and the corresponding building<br />

blocks of generic programming:<br />

Inheritance      Generic Programming<br />

base class       concept<br />

derived class    model<br />

In the remainder of this section we will discuss the differences between generic programming<br />

and inheritance.<br />

We will focus on functions, but similar arguments hold for templated classes. The advantage of using a base-class reference or pointer as the argument type of a function is that all derived classes can be used as arguments too, see § ??.6 Inheritance in C ++ and other OOP languages is designed such that a function in a derived class can substitute (hide) the one in the base class with the identical signature. Thus, calling the function for a base-class argument will either use the base class's implementation or that of the derived class (if the function is virtual). In both cases we can rely on its existence. We will explain OOP in more detail in Section ??. Here, we only name advantages and disadvantages of the two approaches regarding different aspects of programming.<br />

Compile time: With the OOP approach, the function is compiled only once; the distinction between the different calculations is made at run time. The generic implementation requires<br />

6 TODO: The OOP section is not written yet.



a new compilation for each combination of types. As a consequence, the sources must reside in header files and cannot be stored in libraries.7<br />

Executable size: As mentioned before, generic functions need multiple compilations, and as a result the generated executable contains code for each instantiation. A function programmed against an abstract interface exists only once. On the other hand, virtual functions introduce some additional memory need for storing the virtual function tables. Except for some pathological examples, one can expect this additional space to be less than the extra space needed for separate machine code for every instantiation of a generic function. In extreme cases, a very large executable can even impact performance negatively due to wasted cache memory.<br />

Performance: The higher compilation effort of generic programming has a twofold performance benefit. Functions do not need to be called indirectly via expensive function pointers but can be called directly. Whenever appropriate, they can even be inlined, saving the function call overhead entirely. We once measured the impact of the two approaches on the performance of an accumulate function (a more general one than in § 4.2.1) [?]. The generic version was in our case about 40 times faster than the inheritance-based implementation. This value varies from platform to platform, but for small functions one can expect an inlined template function to be 10-100 times faster than a virtual function. Conversely, for long calculations like solving a large linear system, the performance difference is imperceptible.<br />

Concept refinement: Adding (syntactic) requirements is feasible with the inheritance approach, but it is very tedious and obfuscates the program source; see [?] for details.<br />

Intrusiveness: The emulation of genericity by inheritance can induce a deep class hierarchy [?]. More critical for universal applicability is that the technique is intrusive: a type cannot be used as an argument of an OOP implementation if it is not derived from the according base class, even if it provides the correct interface! Thus, we would have to add additional base class(es) to the type. This is particularly problematic for types from third-party libraries or intrinsic types, because we cannot add base classes there. Generic functions have no such rigid constraints: we can even adapt a third-party or intrinsic type to meet a generic function's syntactic requirements without modifying third-party code.<br />

Time of selection: At least one advantage of OOP-style polymorphism should be mentioned at the end. The argument types of a generic function call must be known at compile time so that the compiler can instantiate the template function. The type of an OOP function argument can be chosen during the execution of the program and can therefore depend on preceding calculations or input data. For instance, one can define in a file which linear solver is used in an application.<br />

Résumé: It is not our goal to compare object-oriented and generic programming in general. The two approaches complement each other in many respects, and a full comparison is beyond the scope of this discussion. However, when only considering the aspect of maximal applicability with optimal performance, the generic approach is undoubtedly superior. Especially if functions of a library are used with types defined outside this library, any necessary interface adaption is quite easy without modifying the type definition, whereas the addition of extra base classes forces a change of the type definition, which is not always possible (or desirable). In contexts where functions are used with a limited number of types that are defined in the same library, derivation can be<br />

7 Libraries in the classical sense, i.e. linked with separately compiled sources, as opposed to template libraries.



an appropriate technique to achieve polymorphism.<br />

4.6 Template Specialization<br />

Although one of the advantages of a generic implementation is that the same code can be used<br />

<strong>for</strong> all objects that satisfy the corresponding concept, this is not always the best approach.<br />

Sometimes the same behavior can be implemented more efficiently <strong>for</strong> a specific type. In<br />

principle, one can even implement a different behaviour <strong>for</strong> a specific type but this is not<br />

advisable in general because the program becomes much more complicated to understand and<br />

using the specialized classes can require a whole chain of further specialization (bearing the<br />

danger of errors when imcompletely realized). C ++ provides an enormous flexibility and the<br />

programmer is in charge to use this flexibility responsibly and <strong>for</strong> being consistent to himself.<br />

4.6.1 Specializing a Class <strong>for</strong> One Type<br />

In the following, we want to specialize our vector example from page 96 for bool. Our goal is to save memory by packing 8 bools into one byte. Let us start with the class definition:<br />

template <><br />
class vector<bool><br />
{<br />
    // ..<br />
};<br />

Although our specialized class is not type-parametric, we still need the template keyword and the empty angle brackets. After the class name, the complete type list must be given. This syntax looks a bit cumbersome in this context but makes more sense for multiple template arguments where only some are specialized. For instance, if we had some container with 3 arguments and specialized the second one:<br />

template <typename T1, typename T3><br />
class some_container<T1, bool, T3><br />
{<br />
    // ..<br />
};<br />

Back to our boolean vector class. In the class, we define a default constructor for empty vectors, a constructor for vectors of size n, and a destructor. For the size of the array, we have to pay some attention if the vector size is not divisible by 8, because integer division simply cuts off the remainder.<br />

template <><br />
class vector<bool><br />
{<br />
  public:<br />
    explicit vector(int size)<br />
      : my_size(size), data( new unsigned char[(my_size + 7) / 8] )<br />
    {}<br />
    vector() : my_size(0), data(0) {}<br />



    ~vector() { if (data) delete [] data; }<br />
  private:<br />
    int my_size;<br />
    unsigned char* data;<br />
};<br />

One thing we notice is that the default constructor and the destructor are identical to those of the non-specialized version (in the following also referred to as the general version). Unfortunately, they are not 'inherited' by the specialization: if we write a specialization, we have to define everything from scratch. We are free to omit member functions or variables of the general version, but for the sake of consistency we should do this only for very good reasons. For instance, we might omit operator+ because we have no addition for bool. The constant access operator is implemented with shifting and bit masking:<br />

template <> class vector<bool><br />
{<br />
  public:<br />
    bool operator[](int i) const { return (data[i/8] >> i%8) & 1; }<br />
};<br />

The mutable access is trickier because we cannot refer to single bits. The trick is to return some helper type, called a 'Proxy', that can perform the assignment; it is constructed from a reference to the containing byte and the position within that byte.<br />

template <> class vector<bool><br />
{<br />
  public:<br />
    vector_bool_proxy operator[](int i)<br />
    {<br />
        return vector_bool_proxy(data[i/8], i%8);<br />
    }<br />
};<br />

Let us now implement our proxy:<br />

class vector_bool_proxy<br />
{<br />
  public:<br />
    vector_bool_proxy(unsigned char& byte, int p) : byte(byte), mask(1 << p) {}<br />
  private:<br />
    unsigned char& byte;<br />
    unsigned char mask;<br />
};<br />

To simplify further operations we create a mask that has 1 on the position in question and 0<br />

on all other positions.<br />

The reading access is implemented by simply masking in the conversion operator:<br />

class vector bool proxy<br />

{<br />

operator bool() const { return byte & mask; }<br />

};<br />

Setting a bit is realized by an assignment operator for bool:

class vector_bool_proxy
{
    vector_bool_proxy& operator=(bool b)
    {
        if (b)
            byte|= mask;
        else
            byte&= ~mask;
        return *this;
    }
};

If our argument is true, we ‘or’ it with the mask: at the considered position, the one bit in the mask turns on the bit in the byte reference, while at all other positions the zero bits in the mask leave the corresponding positions unchanged. Conversely, with a false argument, we first invert the mask and ‘and’ it with the byte reference, so that the mask’s zero bit at the active position turns the bit off, and at all other positions the ‘and’ with one bits preserves the old bit values. 8

4.6.2 Specializing a Function to a Specific Type

Functions can be specialized in the same manner as classes. Assume we have a generic function that computes the power x^y and we want to specialize it:

template <typename Base, typename Exponent>
Base inline power(const Base& x, const Exponent& y);

template <>
double inline power(const double& x, const double& y); // Do not use this

Unfortunately, many of such specializations are ignored. Therefore, we give the following

Advice

Do not use function template specialization!

To specialize a function for one specific type or type tuple as above, we can simply use overloading. This works better and is even simpler. Back to our example: assume we have an entirely generic power function. 9 In the case that both arguments are double, we nevertheless want to use the standard implementation, hoping that some caffeine-drugged geeks figured out an incredibly fast assembler hack for our platform and put it in our Linux distribution. Excited by the incredible performance — even if it is only the hope for it — we overload our power function as follows:

#include <cmath>

template <typename Base, typename Exponent>
Base inline power(const Base& x, const Exponent& y)
{
    ...
}

double inline power(double x, double y)
{
    return std::pow(x, y);
}

8 TODO: picture
9 TODO: Anybody an idea for an implementation? Or a better example?

Speaking of platform-specific assembler hacks, maybe we are eager to contribute code that exploits SSE units by performing two computations in parallel:

template <typename Base, typename Exponent>
Base inline power(const Base& x, const Exponent& y) { ... }

#ifdef SSE_FOR_TRYPTICHON_WQ_OMICRON_LXXXVI_SUPPORTED
std::pair<double, double> inline power(const std::pair<double, double>& x, double y)
{
    asm {
        # Yo, I'm the greatestest geek under the sun!
    }
    return whatever;
}
#endif

#ifdef ... more hacks ...

What is there to say about this snippet? If you do not like to write such specializations, we will not blame you. If you do, always put such hacks in conditional compilation. You also have to make sure that your build system only enables the macro when it is definitely a platform that supports the hack. For the case that it does not, we must guarantee that the generic implementation or another overload can deal with pairs of double. Last but not least, you have to rewrite your applications to use this function. Convincing others to use such a special implementation could be even more work than getting the assembler hack to produce plausible numbers. More importantly, such special signatures undermine the ideal of clear and intuitive programming. However, if power functions are computed on entire vectors and matrices, one could perform the calculation pairwise internally without affecting the interface or the user application.

You might also think that SSEs were yesterday and today we have GPUs and GPGPUs, but programming those generically still takes a lot of tricks (at least in the beginning of 2010). But this is another story and we digress. Resuming: programming for highest performance can be tricky, but at least there are often ways to exploit unportable features (where available) without sacrificing portability at the application level. 10

In the previous examples, we specialized all arguments of the function. It is also possible to specialize some argument(s) and leave the remaining argument(s) as template(s):

template <typename Base, typename Exponent>
Base inline power(const Base& x, const Exponent& y);

template <typename Base>
Base inline power(const Base& x, int y);

template <typename Exponent>
double inline power(double x, const Exponent& y);

10 TODO: Is this comprehensible?

The compiler will find all overloads that match the argument combination and select the most specific one. For instance, power(3.0, 2u) will match the first and the third overload, where the latter is more specific. 11 To put it in terms of higher math: 12 type specificity is a partial order that forms a lattice, and the compiler picks the maximum of the available overloads. However, you do not need to dive deeply into algebra to see which type or type combination is more specific.

If we call power(3.0, 2) with the previous overloads, all three match. However, this time we cannot determine the most specific overload. The compiler will tell us that the call is ambiguous and show us overloads 2 and 3 as candidates. As we implemented the overloads consistently and with optimal performance, we might be happy with either choice, but the compiler will not choose. To disambiguate the overloads, we must add:

double inline power(double x, int y);

The lattice people from the previous paragraph will think: “Of course, we were missing the join in the specificity order.” Again, one can understand C++ without studying lattices.

4.6.3 Partial Specialization

If you implement template classes, you will sooner or later run into the situation where you would like to specialize a template class for another template class. Suppose we have a templated complex class:

template <typename Real>
class complex;

Assume further that we had some really boosting algorithmic specialization for complex vectors 13 that saves tremendous compute time. Then we start specializing our vector class:

template <>
class vector<complex<float> >;

template <>
class vector<complex<double> >; // again ??? :-/

template <>
class vector<complex<long double> >; // how many more ??? :-P

Apparently, it lacks elegance to reimplement the specialization for all possible and impossible instantiations of complex. Much worse, it destroys our ideal of universal applicability, because the complex class is intended to support user-defined types for Real, but the specialization of the vector class will be ignored for those types.

The solution to the implementation redundancy and the ignorance of new types is ‘Partial Specialization’. We specialize our vector class for all complex instantiations:

11 TODO: Exercises for which type is more specific than which.
12 For those who like higher mathematics. And only for those.
13 TODO: Anyone a good example?


template <typename Real>
class vector<complex<Real> >
{
    ...
};

That will do the trick. Pay attention to put a space between the closing ‘>’s; otherwise the compiler will take two subsequent ‘>’ as the shift operator ‘>>’ and become pretty confused. 14

This also works for classes with multiple parameters, for instance:

template <typename First, typename Second>
class vector<std::pair<First, Second> >
{
    ...
};

We can also specialize for all pointers:

template <typename T>
class vector<T*>
{
    ...
};

Whenever the set of types is expressible by a Type Pattern, we can apply partial specialization to it.

Partial template specialization can be combined with regular template specialization from § 4.6.1 — let us call the latter ‘Complete Specialization’ for distinction. In this case, the complete specialization is prioritized over the partial one. Between different partial specializations, the most specific is selected. In the following example:

template <typename T>
class vector<T*>
{
    ...
};

template <typename T>
class vector<const T*>
{
    ...
};

the second specialization is more specific than the first one and is picked when it matches. In this sense, a complete specialization is always more specific than a partial one.

4.6.4 Partially Specializing Functions

The C++ standard committee distinguishes between explicit specialization, as in the first paragraph of § 4.6.2, and implicit specialization. An example of implicit specialization is the following computation of a value’s magnitude:

14 In the next standard (new depending on publication date), closing ‘>’s are allowed without intermediate spaces. Some compilers, e.g. VS 2008, already support the conglutinated notation today.


template <typename T>
T inline abs(const T& x)
{
    return x < T(0) ? -x : x;
}

template <typename T> // Do not specialize functions like this either
T inline abs(const std::complex<T>& x)
{
    return sqrt(real(x)*real(x) + imag(x)*imag(x));
}

This works significantly better than the explicit specialization, but even this form of specialization sometimes fails, in the sense that a template function is selected which is not the most specific. 15 A mean aspect of this implicit specialization is that it seems to work properly with few specializations, and when a software project grows, it eventually goes wrong. Since the developers have seen the specialization working before, they might not expect this, and the unintended function selection might remain unobserved while corrupting results or at least wasting resources. It is also possible that the specialization behavior varies from compiler to compiler. 16

The only conclusion from this is not to specialize function templates! It introduces an unnecessary fragility into our software. Instead, we introduce an additional class (called a functor, § 4.8) with an operator(). Template classes are properly specialized on all compilers, 17 both partially and completely.

In our abs example, we start with the function itself and a forward declaration of the template class:

template <typename T> struct abs_functor;

template <typename T>
typename abs_functor<T>::result_type
inline abs(const T& x)
{
    abs_functor<T> functor_object;
    return functor_object(x);
}

As an alternative to the forward declaration, we could have defined the class directly. The return type of our function refers to a typedef or (to use the correct term in generic programming) to an ‘Associated Type’ of abs_functor. Already for complex numbers we do not return the argument type itself but its associated type value_type. Using an associated type here gives us all possible flexibility for further specialization. For instance, the magnitude of a vector could be the sum or the maximum of the elements’ magnitudes, or a vector with the magnitude of each element. Evidently, the functor classes must define a result_type so that they can be used this way.

Inside the function, we instantiate the functor class with the argument type, abs_functor<T>, and create an object of this type. Then we call the object’s application operator.

15 TODO: Good example.
16 TODO: Ask a compiler expert about this.
17 Several years ago many compilers failed at partial specialization, e.g. VS 2003, but today all major compilers handle it properly. If you nevertheless experience problems with this feature in some compiler, take your hands off it; most likely you will encounter further problems. Even the CUDA compiler, which is far from being standard-compliant, supports partial specialization.

As we do not


really need the object itself but only use it for the calculation, we can as well create an anonymous object and perform the construction and the calculation in one expression:

template <typename T>
typename abs_functor<T>::result_type
inline abs(const T& x)
{
    return abs_functor<T>()(x);
}

In this expression we have two pairs of parentheses: the first pair contains the arguments of the constructor, which are empty, and the second pair the arguments of the application operator, which is/are the argument(s) of the function. If we wrote:

template <typename T>
typename abs_functor<T>::result_type
inline abs(const T& x)
{
    return abs_functor<T>(x); // error
}

then x would be interpreted as an argument of the constructor, and an object of the functor class would be returned. 18

Now we have to implement our functor classes:

template <typename T>
struct abs_functor
{
    typedef T result_type;

    T operator()(const T& x)
    {
        return x < T(0) ? -x : x;
    }
};

template <typename T>
struct abs_functor<std::complex<T> >
{
    typedef T result_type;

    T operator()(const std::complex<T>& x)
    {
        return sqrt(real(x)*real(x) + imag(x)*imag(x));
    }
};

We wrote a general implementation that works for all fixed-point and floating-point types.

18 Many years and versions ago, g++ tolerated this expression (sometimes) although it is not standard-compliant.


4.7 Non-Type Parameters for Templates

So far, we used template arguments only for types. Values can be template arguments as well — not values of arbitrary types, though, but only of integral types, i.e. fixed-point numbers and bool.

Very popular is the definition of short vectors and small matrices with size arguments as template parameters, for instance:

template <typename T, int Size>
class fsize_vector
{
    typedef fsize_vector self;

    void check_index(int i) const { assert(i >= 0 && i < my_size); }
  public:
    typedef T value_type;
    const static int my_size= Size;

    fsize_vector() {}

    fsize_vector(const self& that)
    {
        for (int i= 0; i < my_size; ++i)
            data[i]= that.data[i];
    }

    self& operator=(const self& that)
    {
        for (int i= 0; i < my_size; ++i)
            data[i]= that.data[i];
        return *this;
    }

    int size() const { return my_size; }

    const T& operator[](int i) const
    {
        check_index(i);
        return data[i];
    }

    T& operator[](int i)
    {
        check_index(i);
        return data[i];
    }

    self operator+(const self& that) const
    {
        self sum;
        for (int i= 0; i < my_size; ++i)
            sum[i]= data[i] + that[i];
        return sum;
    }

  private:
    T data[Size];
};

If you compare this implementation with the implementation in Section 4.3 on page 95, you realize that there are not so many differences.

The essential difference is that the size is now part of the type and that the compiler knows it. Let us start with the latter. The compiler can use its knowledge for optimization. For instance, if we create a variable

fsize_vector<float, 3> v(w);

the compiler can decide that the generated code for the copy constructor is not performed in a loop but as a sequence of independent operations like:

fsize_vector(const self& that)
{
    data[0]= that.data[0];
    data[1]= that.data[1];
    data[2]= that.data[2];
}

This saves the incrementation of the counter and the test for the loop end. In some sense, this test is already performed at compile time. As a rule of thumb: the more is known during compilation, the more potential for optimization exists. We will come back to this in more detail in Section 8.2 and Chapter ??.

Which optimization is induced by additional compile-time information is of course compiler-dependent. One can only find out which transformation is actually done by reading the generated assembler code — which is not that easy, especially with high optimization, while with low optimization the effect will probably not be there — or indirectly by observing performance and comparing it with other implementations. In the example above, the compiler will probably unroll the loop as shown for small sizes like 3 and keep the loop for larger sizes, say 100. You see why these compile-time sizes are particularly interesting for small matrices and vectors, e.g. three-dimensional coordinates or rotations.

Another benefit of knowing the size at compile time is that we can store the values in an array and even inside the class. Then the values of temporary objects are stored on the stack and not on the heap. 19 The creation and destruction are much less expensive, because only the change of the program counter at function begin and end needs to be adapted to the object’s size, compared to dynamic memory allocation on the heap, which involves the management of lists to keep track of allocated and free memory blocks. 20 To make a long story short, keeping the data in small arrays is much less expensive than dynamic allocation.

We said that the size becomes part of the type. The careful reader might have realized that we omitted the checks whether the vectors have the same size. We do not need them anymore: if an argument has the class type, it implicitly has the same size. Consider the following program snippet:

fsize_vector<float, 3> v;
fsize_vector<float, 4> w;

19 TODO: Picture.
20 TODO: Need easier or longer explication. Or citation.


vector<float> x(3), y(4);

v= w;
x= y;

The last two lines are incompatible vector assignments. The difference is that the incompatibility in the second assignment, x= y;, is discovered at run time by our assertion. The assignment v= w; does not even compile, because fixed-size vectors of dimension 3 only accept vectors of the same dimension as argument.

Like type arguments, non-type template arguments can have defaults. Say the most frequent dimension of our vectors is three, because we live in a three-dimensional world, relativity and string theory aside. Then we save some typing with a default:

template <typename T, int Size= 3>
class fsize_vector
{ /* ... */ };

fsize_vector<float> v, w, x, y;
fsize_vector<float, 4> space_time;
fsize_vector<float, 10> string;

4.8 Functors

Let us develop a mathematical algorithm for computing the finite difference of a differentiable function f. The finite difference is an approximation of the first derivative by

    f'(x) ≈ (f(x + h) - f(x)) / h

where h is a small value also called spacing.

A general function for computing the finite difference is presented here:

#include <iostream>
#include <cmath>

// Function taking a function argument
double finite_difference(double f(double), double x, double h) {
    return (f(x+h) - f(x)) / h;
}

double sin_plus_cos(double x) {
    return sin(x) + cos(x);
}

int main() {
    std::cout << finite_difference(sin_plus_cos, 1., 0.001) << std::endl;
    std::cout << finite_difference(sin_plus_cos, 0., 0.001) << std::endl;
}


Note that the function finite_difference takes an arbitrary function (from double to double) as argument.

Now suppose we want to compute the second-order derivative. It would make sense to call finite_difference with finite_difference as argument. Unfortunately this is not possible, since that function has three arguments and the first argument of finite_difference only accepts a function with a single argument.

For this reason, we can use ‘functors’. Functors — not to be confused with functors from category theory — are either functions or objects of classes providing operator(). This means that ‘functors’ are things which can be called like functions but are not necessarily functions. Using objects of a class providing operator() has the additional advantage that they can carry internal state in terms of member variables. 21

For our example, the functor could be implemented as follows:

struct sin_plus_cos
{
    double operator()(double x) const
    {
        return sin(x) + cos(x);
    }
};

but we could also consider a functor with a parameter like this:

class para_sin_plus_cos
{
  public:
    para_sin_plus_cos(double parameter) : parameter(parameter) {}

    double operator()(double x) const
    {
        return sin(parameter * x) + cos(x);
    }
  private:
    double parameter;
};

How can we use the functor in a function? We want to be able to pass objects of both sin_plus_cos and para_sin_plus_cos to our finite_difference function. There are two possible solutions: inheritance and generic programming, which we now discuss.

4.8.1 Functors via inheritance

TODO: Better as counter-example in OO chapter. We haven’t introduced virtual functions yet.

Let us first rewrite our function finite_difference using an abstract base class.

21 TODO: Do we want the following sentences?: Functors can encapsulate C and C++ function pointers employing the concepts templates and polymorphism. All the functions must have the same return type and calling parameters.


struct functor_base
{
    virtual double operator()(double x) const= 0;
};

double finite_difference(functor_base const& f, double x, double h)
{
    return (f(x+h) - f(x)) / h;
}

The functor class has a pure 22 virtual function operator() and thus cannot be instantiated itself. We can, however, alter the functor para_sin_plus_cos such that it inherits from the abstract base class and overrides operator().

class para_sin_plus_cos
  : public functor_base
{
  public:
    para_sin_plus_cos(double p) : parameter(p) {}

    double operator()(double x) const // Is virtual function in base
    {
        return sin(parameter * x) + cos(x);
    }
  private:
    double parameter;
};

Now we can use an object of this class as the first argument of finite_difference. The whole program looks as follows:

#include <iostream>
#include <cmath>

struct functor_base {
    virtual double operator()(double x) const = 0;
};

double finite_difference(functor_base const& f, double x, double h) {
    return (f(x+h) - f(x)) / h;
}

class para_sin_plus_cos
  : public functor_base
{
  public:
    para_sin_plus_cos(double const& p)
      : parameter(p)
    {}

    double operator()(double x) const { // Virtual function
        return sin(parameter * x) + cos(x);
    }
  private:
    double parameter;
};

int main() {
    para_sin_plus_cos sin_1(1.0);

    std::cout << finite_difference(sin_1, 1., 0.001) << std::endl;
    std::cout << finite_difference(para_sin_plus_cos(2.0), 1., 0.001) << std::endl;
    std::cout << finite_difference(para_sin_plus_cos(2.0), 0., 0.001) << std::endl;
}

22 TODO: undefined

4.8.2 Functors via generic programming

If we make the functor argument in finite_difference generic, we do not need a functor_base any longer. There is also no need to alter our previously defined functors sin_plus_cos and para_sin_plus_cos. This is a perfect example of the fact that generic programming makes extending software easier. The program now looks like:

#include <iostream>
#include <cmath>

template <typename F, typename T>
T inline finite_difference(F const& f, const T& x, const T& h)
{
    return (f(x+h) - f(x)) / h;
}

class para_sin_plus_cos
{
  public:
    para_sin_plus_cos(double p) : parameter(p) {}

    double operator()(double x) const
    {
        return sin(parameter * x) + cos(x);
    }
  private:
    double parameter;
};

int main()
{
    para_sin_plus_cos sin_1(1.0);

    std::cout << finite_difference(sin_1, 1., 0.001) << std::endl;
    std::cout << finite_difference(para_sin_plus_cos(2.0), 1., 0.001) << std::endl;
    std::cout << finite_difference(para_sin_plus_cos(2.0), 0., 0.001) << std::endl;

    return 0;
}


Since we are using a template argument F, we need to define the constraints that it has to satisfy. For this function, we need F to be a functor with one argument; this is called a UnaryFunctor. Formally, we can write this as follows:

• Let f be of type F.
• Let x be of type X, where X is the argument type of F.
• f(x) calls f with one argument and returns an object of the result type.

In this example we also require that the argument type and the result type of F are identical. We can remove this restriction if we establish a unique way to deduce the return type. This can be achieved by meta-programming or with the type deduction in the next C++ standard.

So far so good. We complained before that we cannot apply the finite differences to themselves to compute higher-order derivatives. Actually, we still cannot. The problem is that finite_difference expects (amongst others) a unary functor and is itself a ternary function, so it cannot be passed as its own argument. The solution is to realize its functionality in a unary functor that we call derivative:

template <typename F, typename T>
class derivative
{
  public:
    derivative(const F& f, const T& h) : f(f), h(h) {}

    T operator()(const T& x) const
    {
        return (f(x+h) - f(x)) / h;
    }
  private:
    F f;   // stored by value; a reference member would dangle when a temporary functor is passed
    T h;
};

Now we can create an object that approximates the derivative of f(x) = sin(1 · x) + cos x:

typedef derivative<para_sin_plus_cos, double> spc_der_1;
spc_der_1 spc(sin_1, 0.001);

The object spc can be used like a function, and it approximates f'(x). In addition, it is a unary functor. That means we can compute its derivative:

typedef derivative<spc_der_1, double> spc_der_2;
spc_der_2 spc_scd(spc, 0.001);
std::cout << "Second derivative of sin(0) + cos(0) is " << spc_scd(0.0) << '\n';

The object spc_scd is again a unary functor and approximates f''(x). We could again construct a functor for its derivative and continue this game eternally.

Assume that we need second derivatives of different functions. Then it becomes annoying to first define the type of the first derivative, then construct a functor from it, and finally create a functor for the second one. According to Greg Wilson’s [?] 23 maxim “Whatever you use twice, automate!” we write a class that provides us the second derivative directly:

23 This online course contains a gigantic collection of tips on how to develop software successfully and avoid frustrating unproductivity. We highly recommend reading this material.


template <typename F, typename T>
class second_derivative
{
  public:
    second_derivative(const F& f, const T& h) : h(h), fp(f, h) {}

    T operator()(const T& x) const
    {
        return (fp(x+h) - fp(x)) / h;
    }
  private:
    T                h;
    derivative<F, T> fp;
};

Now we can build the f'' functor from f:

second_derivative<para_sin_plus_cos, double> spc_scd2(para_sin_plus_cos(1.0), 0.001);

When we think about how we would implement the third, fourth, or in general the n-th derivative, we realize that they would look much like the second one: calling the (n-1)-th derivative on x+h and x. We can exploit this with a recursive implementation:

template <typename F, typename T, int N>
class nth_derivative
{
    typedef nth_derivative<F, T, N-1> prec_derivative;
  public:
    nth_derivative(const F& f, const T& h) : h(h), fp(f, h) {}

    T operator()(const T& x) const
    {
        return (fp(x+h) - fp(x)) / h;
    }
  private:
    T               h;
    prec_derivative fp;
};

To save the compiler from infinite recursion, we must stop this mutual referring when we reach the first derivative. Note that we cannot use ‘if’ or ‘?:’ to stop the recursion, because both of their respective branches are instantiated and one of them still contains the infinite recursion. Recursive template definitions are terminated with a specialization like this:

template <typename F, typename T>
class nth_derivative<F, T, 1>
{
  public:
    nth_derivative(const F& f, const T& h) : f(f), h(h) {}

    T operator()(const T& x) const
    {
        return (f(x+h) - f(x)) / h;
    }
  private:
    F f;   // stored by value; a reference member would dangle when a temporary functor is passed
    T h;
};

This specialization is identical to the class derivative, which we could now throw away. If we keep it, we can at least reuse its functionality and variables to reduce redundancy. This is achieved by derivation (more in Chapter 6).

template <typename F, typename T>
class nth_derivative<F, T, 1>
  : public derivative<F, T>
{
  public:
    nth_derivative(const F& f, const T& h) : derivative<F, T>(f, h) {}
};

With our recursive definition, we can easily define the twenty-second derivative:

nth_derivative<para_sin_plus_cos, double, 22> spc_22(para_sin_plus_cos(1.0), 0.00001);

The new object spc_22 is again a unary functor. Unfortunately, it approximates so badly that we are too ashamed to present the results here. From Taylor series we know that the error of the f'' approximation is reduced from O(h) to O(h²) when a backward difference is applied to the forward difference. This said, maybe we can improve our approximation if we alternate between forward and backward differences:

template <typename F, typename T, int N>
class nth_derivative
{
    typedef nth_derivative<F, T, N-1> prec_derivative;
  public:
    nth_derivative(const F& f, const T& h) : h(h), fp(f, h) {}

    T operator()(const T& x) const
    {
        return N & 1 ? ( fp(x+h) - fp(x) ) / h
                     : ( fp(x) - fp(x-h) ) / h;
    }
  private:
    T h;
    prec_derivative fp;
};

Sadly, our 22nd derivative is still as wrong as before — in fact, slightly worse. This is particularly frustrating when we realize that we evaluate f over four million times.²⁴ Decreasing h does not help either: the tangent approximates the derivative better, but on the other hand the values of f(x) and f(x ± h) become so close that their difference retains only a few meaningful bits. At least the second derivative is improved by our alternating difference scheme, as the Taylor series teaches us. Another consoling fact is that we probably did not pay for the alternation: the template argument N is known at compile time, and the condition N & 1 — whether the last bit is set — can also be evaluated during compilation. When N is odd, the operator effectively reduces to:

24 TODO: Is there an efficient and well-approximating recursive scheme to compute higher order derivatives?



T operator()(const T& x) const
{
    return ( fp(x+h) - fp(x) ) / h;
}

Likewise, for even N only the backward difference is computed, without any runtime test.

If nothing else we learned something about C++, and we are confirmed in the

Truism

Not even the coolest programming can substitute for solid mathematics.

In the end, this script is primarily about programming. To improve the expressiveness of our software, functors are an extremely powerful approach. We have seen how to take an arbitrary unary function and construct a unary function that approximates its derivative or a higher-order derivative.

If we do not know the type of a function, or we do not want to bother with it, we can write a convenience function that deduces the type automatically:

template <int N, typename F, typename T>
inline nth_derivative<F, T, N>
make_nth_derivative(const F& f, const T& h)
{
    return nth_derivative<F, T, N>(f, h);
}

Here F and T are the types of the function arguments and can be deduced by the compiler. The only template argument that the compiler cannot deduce is N. Note that such explicitly given arguments must be at the beginning of the template argument list and the compiler-deduced ones at the end. Therefore, the following template function is wrong:

template <typename F, typename T, int N> // error
inline nth_derivative<F, T, N>
make_nth_derivative(const F& f, const T& h)
{
    return nth_derivative<F, T, N>(f, h);
}

If you call this one, the compiler will complain that it cannot deduce N. This leads us to the question of how to call this function. Of course, we can declare all template arguments explicitly:

make_nth_derivative<7, para_sin_plus_cos, double>(sin_1, 0.00001);

But this is exactly what we wanted to avoid by implementing this function. As said, F and T can be deduced by the compiler, and we only need to provide N:

make_nth_derivative<7>(sin_1, 0.00001);

What is this expression good for? Written like this, not much: it creates a functor object that is immediately destroyed. But since it is a functor, we should be able to call it with an argument:

std::cout << "Seventh derivative of sin_1 at x=3 is "
          << make_nth_derivative<7>(sin_1, 0.00001)(3.0) << '\n';

In the cases above, the type of the functor was obvious because we wrote the class ourselves. The type is less obvious if it is constructed from an expression, for instance by a λ-function. Support for λ-functions will be introduced with C++0x.²⁵ An emulation has been available for some years with Boost.Lambda [?]. For instance, we can generate a functor object that computes

p(x) = 3.5x³ + 4x² = (3.5x + 4)x²

with the following short expression:

(3.5 * _1 + 4.0) * _1 * _1;

This expression can be used with our derivative function:

make_nth_derivative<2>((3.5 * _1 + 4.0) * _1 * _1, 0.0001)

to generate a functor computing (approximating) 21x + 8.

With lambda expressions, we do not even know the type of our functor, but we can still compute its derivative. The type is in fact so long²⁶ that it would be much easier to implement our own functor if we were obliged to spell the type out.

The following listing illustrates how to approximate p′′(2):

#include <boost/lambda/lambda.hpp>

// .. our definitions of derivatives

int main()
{
    using boost::lambda::_1;

    std::cout << "Second derivative of 3.5*x^3+4*x^2 at x=2 is "
              << make_nth_derivative<2>((3.5 * _1 + 4.0) * _1 * _1, 0.0001)(2) << '\n';
    return 0;
}

Unfortunately, with the current standard C++ we cannot keep the results of our computations if we do not know their types. In C++0x, we will be able to let the compiler deduce the type:

auto p= (3.5 * _1 + 4.0) * _1 * _1; // With C++0x
auto p2= make_nth_derivative<2>(p, 0.0001);

Once defined, we can reuse p and p2 as often as we want. Of course, calculating the derivatives of polynomials can be done better than with difference quotients. We will discuss this in Section 8.2.

25 TODO: Try in g++ 4.3 and 4.4?<br />

26 boost::lambda::lambda_functor



4.8.3 The function accumulate with a functor argument

TODO: Again, I don't like the use of pointers here — Peter

Recall the function accumulate from Section 4.2.1 that we used to introduce generic programming. In this section, we will generalize this function. We introduce a binary functor (concept BinaryFunctor) that implements an operation on two arguments as a function or callable class object.²⁷ Then we can accumulate values with respect to this binary operation:

template <typename T, typename BinaryFunctor>
T accumulate( T* a, T* a_end, T init, BinaryFunctor op )
{
    T sum( init );
    for ( ; a != a_end; ++a ) {
        sum = op( sum, *a );
    }
    return sum;
}

The concept BinaryFunctor is defined as follows:²⁸

• Let op be of type BinaryFunctor.
  – op provides the call op( first_argument_type, second_argument_type ) with a result type convertible to T. T should be convertible to the first and second argument types.

From this generic example, it is quite clear that the conceptual conditions become complicated when we mix types. Usually, we make sure that first_argument_type, second_argument_type, and the result type are the same, but strictly speaking this is not required, since the compiler is allowed to perform conversions.

The main program could be as follows:

struct sum_functor
{
    double operator()( double a, double b ) const {
        return a + b;
    }
};

struct product_functor
{
    double operator()( double a, double b ) const {
        return a * b;
    }
};

int main()
{
    const int n= 10;
    double a[n];
    // ... fill a with values ...
    double s = accumulate( a, a+n, 0.0, sum_functor() );
    s = accumulate( a, a+n, 1.0, product_functor() );
}

27 TODO: Introduce term.<br />

28 TODO: revisit



4.9 STL — The Mother of All Generic Libraries<br />

The Standard Template Library — STL — is an example of a generic C++ library. It defines generic container classes, generic algorithms, and iterators. Online documentation is provided under www.sgi.com/tech/stl. There are also entire books written about the usage of the STL, so we keep it short here and refer to those books [?].

4.9.1 Introducing Example<br />

Containers are classes whose purpose is to contain other objects. The classes vector and list are examples of STL container classes. Each of these classes is templated and can be instantiated to contain any type of object (that is a model of the appropriate concept). For example, the following lines create a vector containing doubles and another one containing integers:

std::vector<double> vec_d;
std::vector<int>    vec_i;

The STL also includes a large collection of algorithms that manipulate the data stored in containers. The accumulate algorithm, for example, can be used to compute any reduction — such as sum, product, or minimum — on a list or vector in the following way:

std::vector<double> vec; // fill the vector...
std::list<double>   lst; // fill the list...
double vec_sum = std::accumulate( vec.begin(), vec.end(), 0.0 );
double lst_sum = std::accumulate( lst.begin(), lst.end(), 0.0 );

Notice the use of the functions begin() and end(), which denote the beginning and the end of the vector and the list, represented by 'iterators'. Iterators are the central concept of the STL, and we will now have a closer look at them.

4.9.2 Iterators<br />

Disrespectfully spoken, an iterator is a generalized pointer: one can dereference it and change the referred location. This over-simplified view does not do justice to its importance, however. Iterators are a fundamental methodology to decouple the implementation of data structures and algorithms. Figure 4.2²⁹ depicts this central role of iterators. Every data structure provides an iterator for traversing it, and all algorithms are implemented in terms of iterators.

To program m algorithms on n data structures, one needs m · n implementations in classical C and Fortran programming. Expressing algorithms in terms of iterators decreases this to only m + n implementations!

29 TODO: Flatter boxes and more containers and algos, maybe.


[Figure 4.2: Central role of iterators in STL — data structures (vector, set, map, queue, ...) on one side, algorithms (copy, search, replace, sort, ...) on the other, connected through iterators.]

Evidently, not all algorithms can be implemented on every data structure. Which algorithms work on a given data structure depends on the kind of iterator provided by the container. Iterators can be distinguished by the form of access:

InputIterator: an iterator concept for reading the referred entries.

OutputIterator: an iterator concept for writing to the referred entries.

Note that the ability to write does not imply readability; e.g., an ostream_iterator is an STL interface used to write to output streams like files opened in write mode. Another differentiation of iterators is the form of traversal:

ForwardIterator: a concept for iterators that can pass from one element to the next, i.e. types that provide an operator++. It is a refinement of InputIterator and OutputIterator. In contrast to those, a ForwardIterator allows for traversing multiple times.

BidirectionalIterator: a concept for iterators with step-wise forward and backward traversal, i.e. types with operator++ and operator--. It refines ForwardIterator.

RandomAccessIterator: a concept for iterators that can advance their position by an arbitrary integer, i.e. types that also provide operator[]. It refines BidirectionalIterator.

Data structures that provide more refined iterators (e.g. modeling RandomAccessIterator) can be<br />

used in more algorithms. Dually, algorithm implementations that require less refined iterators<br />

(like InputIterator) can be applied to more data structures. The interfaces are designed with<br />

backward compatibility in mind and old-style pointers can be used as iterators.<br />

All standard container templates provide a rich and consistent set of iterator types. The following very simple example shows a typical use of iterators:

std::list<int> l;
for (std::list<int>::const_iterator it = l.begin(); it != l.end(); ++it) {
    std::cout << *it << std::endl;
}




As illustrated above, iterators are usually used in pairs: one is used for the actual iteration while the second marks the end of the collection. The iterators are created by the corresponding container class using standard methods such as begin() and end(). The iterator returned by begin() points to the first element, whereas the iterator returned by end() points past the end of the elements. All algorithms operate on right-open intervals [b, e), processing the value referred to by b until b = e. Therefore, intervals of the form [x, x) are regarded as empty.

A more general (and more useful) algorithm is the linear search on an arbitrary sequence. This is provided by the STL function find, in the following fashion:

template <typename InputIterator, typename T>
InputIterator find(InputIterator first, InputIterator last, const T& value)
{
    while (first != last && *first != value)
        ++first;
    return first;
}

find takes three arguments: two iterators that define the right-open interval of the search space, and a value to search for in that range. Each entry referred to by first is compared with value. When a match is found, the iterator pointing to it is returned. If the value is not contained in the sequence, an iterator equal to last is returned. Thus, the caller can test whether the search was successful by comparing the result with last. In fact, one must perform this test, because after a failed search the returned iterator cannot be dereferenced correctly (it points outside the given range and might cause segmentation violations or corrupt data).

This section only scratched the surface of the STL and was primarily intended to introduce the iterator concept, which we will generalize in the following section.

4.10 Cursors and Property Maps<br />

The essential idea of iterators is to represent a position and a referred value. A further generalization of this idea is to decouple the notions of position and value. Dietmar Kühl proposed this mechanism in his master's thesis (Diplomarbeit) [?] for the generic treatment of graphs. The Boost Graph Library [?] provides the notion of property maps in the form that properties are available for vertices and edges, and all properties can be accessed independently from each other and from the traversal of the graph.

As a case study, we implement a simple sparse matrix class with cursors and property maps. The minimalistic implementation of the sparse matrix is:

#include <iostream>
#include <vector>
#include <algorithm>
#include <cassert>

template <typename Value>
class coo_matrix
{
    typedef Value value_type; // better in trait
  public:
    coo_matrix(int nr, int nc) : nr(nr), nc(nc) {}

    void insert(int r, int c, Value v)
    {
        assert(r < nr && c < nc);
        row_index.push_back(r);
        col_index.push_back(c);
        data.push_back(v);
    }

    void sort() {}

    int nnz() const { return row_index.size(); }
    int num_rows() const { return nr; }
    int num_cols() const { return nc; }

    int begin_row(int r) const
    {
        unsigned i= 0;
        while (i < row_index.size() && row_index[i] < r) ++i;
        return i;
    }

    template <typename Matrix> friend struct coo_col;
    template <typename Matrix> friend struct coo_row;
    template <typename Matrix> friend struct coo_const_value;
    template <typename Matrix> friend struct coo_value;

  private:
    int nr, nc;
    std::vector<int> row_index, col_index;
    std::vector<Value> data;
};

The matrix is supposed to be sorted lexicographically (although we omit the implementation of the sort function for the sake of brevity). For any offset i, the i-th entries of the vectors row_index, col_index, and data represent the row, column, and value of one non-zero entry of the matrix. The traversal over all non-zeros of the matrix can be realized with a cursor that contains just this offset:

struct nz_cursor
{
    typedef int key_type;

    nz_cursor(int offset) : offset(offset) {}

    nz_cursor& operator++() { offset++; return *this; }
    nz_cursor operator++(int) { nz_cursor tmp(*this); offset++; return tmp; }

    key_type operator*() const { return offset; }
    bool operator!=(const nz_cursor& other) { return offset != other.offset; }

  protected:
    int offset;
};



The cursor is initialized with an offset. Many cursor classes keep a reference to the traversed matrix object, but we do not need this here. The cursor can be incremented, compared, and dereferenced. The result of the dereferencing is a 'key'. For simplicity, we use an int as key type.

Like the begin and end functions in the STL, we define the function nz_begin, which returns a cursor on the first non-zero entry, and nz_end, which gives a past-the-end cursor to terminate the traversal:

template <typename Matrix>
nz_cursor nz_begin(const Matrix& A)
{
    return nz_cursor(0);
}

template <typename Matrix>
nz_cursor nz_end(const Matrix& A)
{
    return nz_cursor(A.nnz());
}

A key can be used as an argument for a property map, which we will define now:

template <typename Matrix>
struct coo_col
{
    typedef int key_type;

    coo_col(const Matrix& ref) : ref(ref) {}

    int operator()(key_type k) const { return ref.col_index[k]; }

  private:
    const Matrix& ref;
};

Property maps typically keep a reference to the matrix in order to read internal data from it. They are often declared as friends because they are an important tool to access the object's internal data — it might even be the only way to access the data, as in the Boost Graph Library. The property maps reading the row index or the value for the offset key are analogous and therefore omitted here.

A property map for mutable entries is implemented as follows:

template <typename Matrix>
struct coo_value
{
    typedef int key_type;
    typedef typename Matrix::value_type value_type;

    coo_value(Matrix& ref) : ref(ref) {}

    value_type operator()(key_type k) const { return ref.data[k]; }
    void operator()(key_type k, const value_type& v) { ref.data[k]= v; }

  private:
    Matrix& ref;
};

In contrast to the previous maps, it contains a mutable reference and an additional operator for setting a value.

To test our implementation, we create a matrix A:

coo_matrix<double> A(3, 5);
A.insert(0, 0, 2.3);
A.insert(0, 3, 3.4);
A.insert(1, 2, 4.5);

and define the three property maps:

coo_col< coo_matrix<double> >   col(A);
coo_row< coo_matrix<double> >   row(A);
coo_value< coo_matrix<double> > value(A);

A read-only traversal over all non-zero entries reads:

for (nz_cursor c= nz_begin(A), end= nz_end(A); c != end; ++c)
    std::cout << "A[" << row(*c) << "][" << col(*c) << "] = " << value(*c) << "\n";

Scaling all non-zero elements can be achieved similarly:

for (nz_cursor c= nz_begin(A), end= nz_end(A); c != end; ++c)
    value(*c, 2.0 * value(*c));

Note that we did not use all property maps in the last algorithm. In fact, this is one of the motivations for property maps: only the data really needed in the algorithm must be provided. In today's computer landscape, this can make a significant difference in performance, since reading and writing data is much more time-consuming than most numeric computations — or data might only be available implicitly and need recomputation.

Another advantage of this approach is the easier realization of nested traversals. Say we have an algorithm that iterates over rows and, within each row, over the non-zero entries. In this case, we need other cursor type(s) but can reuse the property maps — provided our new cursors dereference to the same key type. First we need a cursor to iterate over all rows of a matrix:

template <typename Matrix>
struct row_cursor
{
    row_cursor(int r, const Matrix& ref) : r(r), ref(ref) {}

    row_cursor& operator++() { r++; return *this; }
    row_cursor operator++(int) { row_cursor tmp(*this); r++; return tmp; }

    bool operator!=(const row_cursor& other) { return r != other.r; }

    nz_cursor begin() const { return nz_cursor(ref.begin_row(r)); }
    nz_cursor end() const { return nz_cursor(ref.begin_row(r+1)); }

  protected:
    int r;
    const Matrix& ref;
};
};



Its implementation is almost the same as that of nz_cursor, and with some refactoring one could certainly combine them into one implementation that serves both cursors as a base class. For the sake of simplicity, we refrain from this here. The two main differences to nz_cursor are:

• the lack of operator*, because the cursor is not intended to be dereferenced; and

• the functions begin and end, which provide the inner-loop traversal.

The corresponding functions providing a right-open interval of row cursors are straightforward:

template <typename Matrix>
row_cursor<Matrix> row_begin(const Matrix& A)
{
    return row_cursor<Matrix>(0, A);
}

template <typename Matrix>
row_cursor<Matrix> row_end(const Matrix& A)
{
    return row_cursor<Matrix>(A.num_rows(), A);
}

We can now write begin and end functions that take a row cursor (instead of a matrix) as argument and give the right-open interval of the row's non-zeros:

template <typename Matrix>
nz_cursor nz_begin(const row_cursor<Matrix>& c)
{
    return c.begin();
}

template <typename Matrix>
nz_cursor nz_end(const row_cursor<Matrix>& c)
{
    return c.end();
}

For the inner loop we can reuse nz_cursor and only need to determine the right intervals within each row. This is performed with the begin and end functions from row_cursor, which in turn use begin_row from the matrix. That is why row_cursor needs a matrix reference.

A two-dimensional traversal is realized as follows:<br />

<strong>for</strong> (row cursor< coo matrix > c= row begin(A), end= row end(A); c != end; ++c) {<br />

std::cout ≪ ”−−−−−\n”;<br />

<strong>for</strong> (nz cursor ic= nz begin(c), iend= nz end(c); ic != iend; ++ic)<br />

std::cout ≪ ”A[” ≪ row(∗ic) ≪ ”][” ≪ col(∗ic) ≪ ”] = ” ≪ value(∗ic) ≪ ”\n”;<br />

}<br />

std::cout ≪ ”−−−−−\n”;<br />

The outer loop iterates over all rows of the matrix and the inner loop over all non-zeros in this<br />

row.<br />

Résumé: The technique is more complicated and less readable than accessing entries with operator[] and needs some familiarization. However, it allows for

• high code reuse with very diverse data structures;

• while still enabling high performance.

4.11 Exercises<br />

TODO: Move exercises to next chapter<br />

4.11.1 Unroll a loop<br />

Look at the loop from Subsection ??:

int sum = 0;
for (int i = 1; i <= n; ++i)
    sum += i;



Write a C++ function gcd() that computes the greatest common divisor (GCD) of two integers with the Euclidean algorithm:

function gcd(a, b):
    if b = 0 return a
    else return gcd(b, a mod b)

Then write an integral metafunction that executes the same algorithm but at compile time. Your metafunction should be of the following form:

template <int A, int B>
struct gcd_meta {
    static int const value = ... ;
};

i.e. gcd_meta<a, b>::value is the GCD of a and b. Verify whether the results correspond with your C++ function gcd().

4.11.6 Overloading of functions

Overloading of functions is possible for different types, e.g.

void foo( int i ) { ... }
void foo( double d ) { ... }

This is an exercise on another form of overloading: based on a boolean meta-expression. We will use the Boost functions enable_if and disable_if for this exercise.

#include <cmath>
#include <boost/utility/enable_if.hpp>
#include <boost/type_traits/is_integral.hpp>

template <typename T>
typename boost::enable_if< boost::is_integral<T>, T >::type foo( T const& v ) {
    return v;
}

template <typename T>
typename boost::disable_if< boost::is_integral<T>, T >::type foo( T const& v ) {
    return std::floor( v );
}

If we call e.g. foo(5);, the compiler uses the special version for integral types:

template <typename T>
T foo( T const& v ) {
    return v;
}

If we call e.g. foo(5.0);, the compiler uses the special version for types that are not integral:

template <typename T>
T foo( T const& v ) {
    return std::floor( v );
}



Create a meta-function to check whether a type is a pointer. Write a function evaluate that returns the same value as its argument, except when the argument is a pointer, in which case it returns the value pointed to. Hint: look at http://www.boost.org/libs/utility/enable_if.html for enable_if_c.

4.11.7 Meta-list<br />

Revisit exercise ??.<br />

Make a list of types. Make meta-functions insert, append, delete, and size.

4.11.8 Iterator of a vector<br />

Revisit exercise ??. Add methods begin() and end() for returning a begin and end iterator. Add the types iterator and const_iterator to the class. Note that pointers are iterators.

Use the STL functions sort and lower_bound.

4.11.9 Iterator of a list<br />

Revisit exercise ??.<br />

Make a generic list type.<br />

Add methods begin() and end() for returning begin and end const iterators. Add the type const_iterator to the class. Note that here pointers cannot be used as iterators.

4.11.10 Trapezoid rule<br />

A simple method for computing the integral of a function is the trapezoid rule. Suppose we want to integrate the function f over the interval [a, b]. We split the interval into n small intervals [x_i, x_{i+1}] of the same length h = (b − a)/n and approximate f by a piecewise linear function. The integral is then approximated by the sum of the integrals of the piecewise linear function. This gives us the formula:

    I = h/2 · f(a) + h/2 · f(b) + h · Σ_{j=1}^{n−1} f(a + jh)        (4.1)

In this exercise, we develop a function for the trapezoid rule with a functor argument. We develop the software using inheritance and using generic programming. Then we use the function for integrating the following functions:

• f = exp(−3x) for x ∈ [0, 4]. Try the following arguments of trapezoid:

double exp3( double x ) {
    return std::exp( -3.0 * x );
}

struct exp3 {
    double operator()( double x ) const {
        return std::exp( -3.0 * x );
    }
};

• f = sin(x) if x < 1 and f = cos(x) if x ≥ 1, for x ∈ [0, 4].

• Can we use trapezoid( std::sin, 0.0, 2.0 ); ?

As a second exercise, develop a functor for computing the finite difference. Then integrate the finite difference to verify that you get the function value back.

4.11.11 STL and functor

Write a generic function that copies the values of a container to another container after transforming them with a functor:

struct double_functor {
    int operator()( int v ) const {
        return 2 * v;
    }
};

std::vector<int> my_input_vec; // ...
std::vector<int> my_output_vec;

transform( my_input_vec.begin(), my_input_vec.end(), my_output_vec.begin(), double_functor() );

Write code for the function transform and test it.




Chapter 5

Meta-programming

'Meta-programming' was actually discovered by accident. In the early 90s, Erwin Unruh wrote a program that printed prime numbers as error messages. This showed that C++ compilers can compute. Because the language has changed since Unruh wrote the example, here is a version adapted to today's standard C++:

// Prime number computation by Erwin Unruh
template <int i> struct D { D(void*); operator int(); };

template <int p, int i> struct is_prime {
    enum { prim = (p==2) || (p%i) && is_prime<(i>2?p:0), i-1>::prim };
};

template <int i> struct Prime_print {
    Prime_print<i-1> a;
    enum { prim = is_prime<i, i-1>::prim };
    void f() { D<i> d = prim ? 1 : 0; a.f(); }
};

template <> struct is_prime<0,0> { enum {prim=1}; };
template <> struct is_prime<0,1> { enum {prim=1}; };

template <> struct Prime_print<1> {
    enum {prim=0};
    void f() { D<1> d = prim ? 1 : 0; };
};

int main() {
    Prime_print<18> a;
    a.f();
}

When one tries to compile this with g++ 4.1.2, one will observe the following error message: TODO: Need English error message.

TODO: Ask Erwin Unruh if we can use his example.

After people realized the computational power of the C++ compiler, it was used to realize very powerful performance optimization techniques. In fact, one can perform entire applications during compile time. Jeremiah Wilcock once wrote a Lisp interpreter that evaluated Lisp



expressions during a C++ compilation [?]. Todd Veldhuizen showed that the template type system of C++ is Turing complete [?].

On the other hand, excessive usage of meta-programming techniques can end in quite long compile times. Entire research projects were cancelled, after many millions of dollars of funding, because even short applications of less than 20 lines took weeks of compile time on parallel computers. We know people who managed to produce an 18 MB error message (it came mainly from one single error). Nevertheless, the authors used a fair amount of meta-programming in their scientific projects and could still avoid excessive compile times.¹ Compilers have also improved significantly in the last decade: whereas the compile time grew quadratically with the template instantiation depth in old compilers, today it grows only linearly [?].

5.1 Let the Compiler Compute<br />

Typical introductory examples for meta-programming are factorials and Fibonacci numbers. Fibonacci numbers are computed recursively:

template <long N>
struct fibonacci
{
    static const long value= fibonacci<N-1>::value + fibonacci<N-2>::value;
};

template <>
struct fibonacci<1>
{
    static const long value= 1;
};

template <>
struct fibonacci<2>
{
    static const long value= 1;
};

Note that we need the specializations for 1 and 2 to terminate the recursion. The following definition:

template <long N>
struct fibonacci
{
    static const long value= N < 3 ? 1 : fibonacci<N-1>::value + fibonacci<N-2>::value; // error
};

ends in an infinite compile loop. For N = 2, the compiler would evaluate the expression:<br />

template <>
struct fibonacci<2>
{
    static const long value= 2 < 3 ? 1 : fibonacci<1>::value + fibonacci<0>::value; // error
};

1 TODO: Or René?



This requires the evaluation of fibonacci<0>::value as

template <>
struct fibonacci<0>
{
    static const long value= 0 < 3 ? 1 : fibonacci<-1>::value + fibonacci<-2>::value; // error
};

which in turn needs fibonacci<-1>::value . . . . Although the values for N < 3 are not used in the end, the compiler will nevertheless generate these terms infinitely and die at some point.

We said before that we implement the computation recursively. In fact, all repetitive computations must be realized recursively, as there is no iteration for meta-functions. 2

If we write for instance

std::cout << fibonacci<45>::value << "\n";

the value is already calculated during the compilation and the program just prints it. If you do not believe us, you can read the assembler code (e.g., compile with ‘g++ -S fibonacci.cpp -o fibonacci.asm’).

We mentioned long compilations with meta-programming at the beginning of the chapter. The compilation for Fibonacci number 45 took less than a second. In comparison, a naïve run-time implementation:

long fibonacci2(long x)
{
    return x < 3 ? 1 : fibonacci2(x-1) + fibonacci2(x-2);
}

took 14s on the same computer. The reason is that the compiler remembers intermediate results while the run-time version recomputes everything. We are, however, convinced that every reader of this book can rewrite fibonacci2 without the exponential overhead of recomputations.

5.2 Providing Type In<strong>for</strong>mation<br />

5.2.1 Type Traits<br />

When we write template functions, we can easily define temporary values because they usually have the same type as one of the template arguments. But not always. Imagine you have a function that returns, of two values, the one with the minimal magnitude:

template <typename T>
T inline min_magnitude(const T& x, const T& y)
{
    using std::abs;
    T ax= abs(x), ay= abs(y);
    return ax < ay ? x : y;
}

We can call this function for int, unsigned, or double values:

2 The Meta Programming Library provides compile-time iterators but even those are recursive internally.



double d1= 3., d2= 4.;
std::cout << "min_magnitude(d1, d2) = " << min_magnitude(d1, d2) << '\n';

If we call this function with two complex values:

std::complex<double> c1(3.), c2(4.);
std::cout << "min_magnitude(c1, c2) = " << min_magnitude(c1, c2) << '\n';

we will see an error message like:

no match for »operator<« in »ax < ay«

The problem is that abs returns double values in this case, which provide the comparison operator, but we store them in temporaries of type complex<double>.

The careful reader might wonder why we store them at all: if we compared the magnitudes directly, we would save memory and could compare them as they are. This is absolutely true, and it is how we would normally implement the function. However, there are situations where one needs a temporary, e.g., when computing the value with the minimal magnitude in a vector. For the sake of simplicity we just look at two values. With the new standard we can also handle the issue easily with auto types:

template <typename T>
T inline min_magnitude(const T& x, const T& y)
{
    using std::abs;
    auto ax= abs(x), ay= abs(y);
    return ax < ay ? x : y;
}

To make a long story short, sometimes we need to know explicitly the result type of an expression, or type information in general. Just think of a member variable of a template class: we must know the type of the member in the definition of the class.

This leads us to ‘type traits’. Type traits are meta-functions that provide information about a type.

In the example here, we search for an appropriate type for the magnitude of a given type. We can provide such type information by template specialization:

template <typename T>
struct Magnitude {};

template <>
struct Magnitude<int>
{
    typedef int type;
};

template <>
struct Magnitude<float>
{
    typedef float type;
};

template <>
struct Magnitude<double>
{
    typedef double type;
};

template <>
struct Magnitude<std::complex<float> >
{
    typedef float type;
};

template <>
struct Magnitude<std::complex<double> >
{
    typedef double type;
};

Admittedly, this is rather cumbersome.<br />

We can abbreviate the first definitions by postulating: “if we do not know better, we assume that T’s Magnitude type is T itself.”

template <typename T>
struct Magnitude
{
    typedef T type;
};

This is true for all intrinsic types, and we handle them all correctly with one definition. A slight disadvantage of this definition is that it incorrectly applies to all types whose type trait is not specialized. A set of classes where we know that the above definition is not correct are all instantiations of the template class complex. So we define specializations like:

template <>
struct Magnitude<std::complex<double> >
{
    typedef double type;
};

Instead of defining them individually for complex<float>, complex<double>, . . . we use a templated form to treat them all:

template <typename T>
struct Magnitude<std::complex<T> >
{
    typedef T type;
};

Now that the type trait is defined, we can refactor our function to use it:

template <typename T>
T inline min_magnitude(const T& x, const T& y)
{
    using std::abs;
    typename Magnitude<T>::type ax= abs(x), ay= abs(y);
    return ax < ay ? x : y;
}



We can now consider extending this definition to vectors and matrices, e.g., to determine the return type of a norm. The specialization reads:

template <typename T>
struct Magnitude<vector<T> >
{
    typedef T type; // not really perfect
};

However, if the value type of the vector is complex, its norm will not be. Instead, we need the magnitude type of the values:

template <typename T>
struct Magnitude<vector<T> >
{
    typedef typename Magnitude<T>::type type;
};

5.2.2 A const-clean View Example<br />

In this section, we look at an efficient and expressive implementation of a transposed matrix. If you compute the transpose of a matrix, many software packages return a new matrix object with the interchanged values. This is a quite expensive operation: it requires memory allocation and deallocation, and often copying a lot of data.

Writing a Simple View Class<br />

A much more efficient approach is implementing a ‘view’ of the existing object. We refer internally to the viewed object and just adapt its interface. This can be done very nicely for the transpose of a matrix:

1  template <typename Matrix>
2  class transposed_view
3  {
4    public:
5      typedef typename mtl::Collection<Matrix>::value_type value_type;
6      typedef typename mtl::Collection<Matrix>::size_type  size_type;
7
8      transposed_view(Matrix& A) : ref(A) {}
9
10     value_type& operator()(size_type r, size_type c) { return ref(c, r); }
11     const value_type& operator()(size_type r, size_type c) const { return ref(c, r); }
12
13   private:
14     Matrix& ref;
15 };

Listing 5.1: Simple view implementation

We assume that the matrix class has an operator() taking two arguments for the row and the column index, respectively. We further suppose that type traits are defined for value_type and size_type. This is all we need to know about the referred matrix, 3 at least in this mini example.

3 TODO: We should define a concept <strong>for</strong> it.



The reader will imagine that implementations in libraries like MTL or GLAS provide a larger interface in such classes. Still, this short example is expressive enough to demonstrate the approach, and large enough to demonstrate the need for meta-programming in certain views.

An object of this class can be handled like a matrix, so that a template function can use it as an argument wherever a matrix is expected. The transposition is achieved by calling operator() of the referred object with switched indices. For every matrix object we can define a transposed view that behaves like a matrix:

mtl::dense2D<float> A(3, 3);
A= 2, 3, 4,
   5, 6, 7,
   8, 9, 10;
tst::transposed_view<mtl::dense2D<float> > At(A);

When we access At(i, j) we get A(j, i). We even define non-const access so that we can change entries:

At(2, 0)= 4.5;

This operation sets A(0, 2) to 4.5.

The definition of a transposed view object does not lead to particularly concise programs. For convenience we define a function that returns the transposed view:

template <typename Matrix>
transposed_view<Matrix> inline trans(Matrix& A)
{
    return transposed_view<Matrix>(A);
}

Now we can use the transpose elegantly in our scientific software, for instance in a matrix vector product:

v= trans(A) * q;

In this case, a temporary view is created and used in the product. Since operator() of the view is inlined, the transposed product will be as fast as with A itself.

Dealing with Const-ness<br />

So far, so good. Problems arise if we build the transposed view of a constant matrix:

const mtl::dense2D<float> B(A);

We can still create the transposed view of B, but we cannot access its elements:

std::cout << "tst::trans(B)(2, 0) = " << tst::trans(B)(2, 0) << '\n'; // error

The compiler tells us that it cannot initialize a ‘float&’ from a ‘const float’. If we look at the location of the error, we realize that it is line 10 in Listing 5.1. But why did the compiler use the non-constant version of the operator? In line 11 we defined an operator for constant objects which returns a constant reference and fits perfectly for this situation.



First of all, is the ref member really constant? We never used const in the class definition or the function trans. Help is provided by ‘Run-Time Type Identification (RTTI)’. We add the header ‘typeinfo’ and print the type information:

#include <typeinfo>
...
std::cout << "typeid of trans(A) = " << typeid(tst::trans(A)).name() << '\n';
std::cout << "typeid of trans(B) = " << typeid(tst::trans(B)).name() << '\n';

This produces the following output: 4

typeid of trans(A) = N3tst15transposed_viewIN3mtl6matrix7dense2DIfNS2_10
parametersINS1_3tag9row_majorENS1_5index7c_indexENS1_9non_fixed10
dimensionsELb0EEEEEEE
typeid of trans(B) = N3tst15transposed_viewIKN3mtl6matrix7dense2DIfNS2_10
parametersINS1_3tag9row_majorENS1_5index7c_indexENS1_9non_fixed10
dimensionsELb0EEEEEEE

The output is apparently not very clear. However, if we look very carefully, we see the extra ‘K’ in the second line that tells us that the view is instantiated with a constant matrix type. Another disadvantage of RTTI is that we only see the const attribute of template parameters. That is, printing the type information of trans(B).ref would not tell us whether or not this type is constant.

An alternative that solves both problems is inspecting the type by provoking an error message. We can for instance write:

int ta= trans(A);
int tb= trans(B);

Then the compiler gives us messages like:

trans_const.cpp:120: Error: »tst::transposed_view<mtl::matrix::dense2D<float> >« cannot be converted to »int«
in initialization
trans_const.cpp:121: Error: »const tst::transposed_view<const mtl::matrix::dense2D<float> >« cannot be
converted to »int« in initialization

Here the types are much more readable. 5 We can see clearly that trans(B) returns a view with a constant template parameter. The same trick can be applied to the reference in the view:

int tar= trans(A).ref;
int tbr= trans(B).ref;

The error message would read accordingly:

trans_const.cpp:121: Error: »const mtl::matrix::dense2D<float>« cannot be converted to »int«
in initialization

4 With g++; on other compilers it might be different, but the essential information will be the same. The lines are broken manually.
5 TODO: Why the hell is this const outside in line 121???

Obviously, with this trick we will not get an executable binary. But we learn more about the types in our program and can solve our problems better. In the rare case that the type you examine is convertible to int, you can take any other type, like std::set, to which the examined class is not convertible. To exclude convertibility entirely, you can introduce a new type.

After this short excursion into type introspection, we know for certain that the member ref is a constant reference. The following happens:

• When we call trans(B), the function’s template argument is instantiated with const dense2D<float>.
• Thus, the return type is transposed_view<const dense2D<float> >.
• The constructor argument has type const dense2D<float>&.
• Likewise, the member ref has type const dense2D<float>&.

The question remains why the non-const version of the operator (line 10) is called although we refer to a constant matrix. The answer is that the constancy of ref does not matter for the choice; what matters is whether or not the view object is constant. Thus, we can write:

const tst::transposed_view<const mtl::dense2D<float> > Bt(B);
std::cout << "Bt(2, 0) = " << Bt(2, 0) << '\n';

This works but it is not very elegant.<br />

A brute-force possibility to get the view compiled for constant matrices is to cast away the constancy. The undesired result would be that mutable views on constant matrices enable the modification of the allegedly constant matrix. This violates our principles so heavily that we do not even show how the code would read.

Rule

Never cast away const.

In the following we will empower you with very strong methodologies for handling constancy correctly. Every const_cast is an indicator of a severe design error. As Sutter and Alexandrescu phrased it: “If you go const you never go back.” The only situation where a const_cast is needed is when using const-incorrect third-party software, i.e., when read-only arguments are passed as mutable pointers or references. That is not our fault and we have no choice. Unfortunately, there are still a lot of const-incorrect packages around, and some of them would take too many resources to reimplement, so we have to live with them. The best we can do is to add an appropriate API on top and avoid working with the original API. This saves us from spoiling our applications with const_casts and restricts the unspeakable const_cast to the interface. A good example of such a layer is ‘Boost::Bindings’ [?], which provides a const-correct high-quality interface to BLAS, LAPACK, and other libraries with similarly old-fashioned 6 interfaces. Conversely, as long as we only use our own functions and classes, we can avoid every const_cast. 7

We could implement a second view class for constant matrices and overload the trans function to return this view:

template <typename Matrix>
class const_transposed_view
{
  public:
    typedef typename mtl::Collection<Matrix>::value_type value_type;
    typedef typename mtl::Collection<Matrix>::size_type  size_type;

    const_transposed_view(const Matrix& A) : ref(A) {}

    const value_type& operator()(size_type r, size_type c) const { return ref(c, r); }

  private:
    const Matrix& ref;
};

template <typename Matrix>
const_transposed_view<Matrix> inline trans(const Matrix& A)
{
    return const_transposed_view<Matrix>(A);
}

This works fine, and the user can use the trans function for both constant and mutable matrices. However, a completely new class definition is a fair amount of work given that just one piece of the class definition needs to be altered. For this purpose we introduce two meta-functions.

Check for Constancy

Our problem with the view in Listing 5.1 is that it cannot handle constant types as template argument. To modify the behavior for constant arguments, we first need to find out whether an argument is constant. The meta-function that provides this information is very simple to implement by partial template specialization:

template <typename T>
struct is_const
{
    static const bool value= false;
};

template <typename T>
struct is_const<const T>
{
    static const bool value= true;
};

6 To phrase it diplomatically.
7 We disagree with Sutter and Alexandrescu on the other exception for using const_cast [SA05, page 179]; this can be handled easily with an extra function.



Constant types match both definitions, but the second one is more specific and is therefore picked by the compiler. Non-constant types match only the first one. Note that the constancy of template parameters is not considered, e.g., transposed_view<const matrix> is not regarded as constant.

Compile-time Branching

The other tool we need for our view is a type selection depending on a logical condition. This technique was introduced by Krzysztof Czarnecki 9 and Ulrich W. Eisenecker [CE00]. It can be achieved by a rather simple implementation:

1  template <bool Condition, typename ThenType, typename ElseType>
2  struct if_c
3  {
4      typedef ThenType type;
5  };
6
7  template <typename ThenType, typename ElseType>
8  struct if_c<false, ThenType, ElseType>
9  {
10     typedef ElseType type;
11 };

Listing 5.2: Compile-time if

When this template is instantiated with a logical expression and two types, only the general definition in line 1 matches when the first argument evaluates to true, and the ‘ThenType’ is used in the type definition. If the first argument evaluates to false, then the specialization in line 7 is more specific, so the ‘ElseType’ is used. Like many ingenious inventions, it is very simple once it is found.

This allows us to define funny things, like using double for temporaries when our maximal iteration number is larger than 100 and otherwise float:

typedef tst::if_c<(max_iter > 100), double, float>::type tmp_type;
std::cout << "typeid = " << typeid(tmp_type).name() << '\n';

Needless to say, ‘max_iter’ must be known at compile time. Admittedly, the example does not look extremely useful, and the meta-if is not so important in small isolated code snippets. On the other hand, for the development of large generic software packages it becomes extremely important.

A convenience meta-function, as defined in the Meta-Programming Library [GA04], is ‘if_’:

template <typename Condition, typename ThenType, typename ElseType>
struct if_
    : if_c<Condition::value, ThenType, ElseType>
{};

It expects as first argument a type with a static constant member named value that is convertible to bool. In other words, it selects the type based on the value of the condition (and saves typing 8 characters).

9 At that time he was a doctoral student at TU Ilmenau.



The Solution

Now we have all we need to revise the view from Listing 5.1. The problem was that we returned an entry of a constant matrix as a mutable reference. To avoid this, we could try to make the mutable access operator disappear in the view when the referred matrix is constant. This is possible but too complicated for the moment. We will come back to this in Section 5.2.4.

An easier solution is to keep both the mutable and the constant access operator but choose the return type of the former depending on the type of the template argument:
return type of the <strong>for</strong>mer depending on the type of the template argument:<br />

1  template <typename Matrix>
2  class transposed_view
3  {
4    public:
5      typedef typename mtl::Collection<Matrix>::value_type value_type;
6      typedef typename mtl::Collection<Matrix>::size_type  size_type;
7    private:
8      typedef typename if_c<is_const<Matrix>::value,
9                            const value_type&,
10                           value_type&
11                          >::type vref_type;
12   public:
13     transposed_view(Matrix& A) : ref(A) {}
14
15     vref_type operator()(size_type r, size_type c) { return ref(c, r); }
16     const value_type& operator()(size_type r, size_type c) const { return ref(c, r); }
17
18   private:
19     Matrix& ref;
20 };

Listing 5.3: Const-safe view implementation

This implementation returns a constant reference from line 15 when the referred matrix is constant and a mutable reference when it is mutable. Let us check that this is what we need. For mutable matrix types, the return type of operator() depends on the constancy of the view object:

• If the view object is mutable, then operator() from line 15 is used and returns a mutable reference (line 10); and
• If the view object is constant, then operator() from line 16 is used and returns a constant reference.

This is the same behavior as in Listing 5.1.

If the matrix type is constant, then a constant reference is always returned:

• If the view object is mutable, then operator() from line 15 is used and returns a constant reference (line 9); and
• If the view object is constant, then operator() from line 16 is used and returns a constant reference.

Altogether, we implemented a view that provides read and write access wherever appropriate and disables write access where it is inappropriate.
and disables it where inappropriate.



5.2.3 More Useful Meta-functions

The Boost Type Traits library [?] provides a large spectrum of meta-functions to test or manipulate attributes of types. Some of them, like the previously introduced is_const, are rather easy to implement; others, like has_trivial_constructor or is_base, require deep insight into C++ subtleties and often into compiler internals as well. Unless one uses only very simple type traits and wants to avoid absolutely any dependency on an external library, it is advisable to favor the extensively tested implementations from the Type Traits library over rewriting them.

With the boost::is_xyz meta-functions we can implement special behavior for certain sets of types. One can easily add tests for domain-specific type sets:

template <typename T>
struct is_matrix
    : boost::mpl::false_
{};

template <typename Value, typename Parameters>
struct is_matrix<mtl::dense2D<Value, Parameters> >
    : boost::mpl::true_
{};

// more matrix classes ...

template <typename Matrix>
struct is_matrix<transposed_view<Matrix> >
    : is_matrix<Matrix>
{};

// more views ...

Our program snippet is in line with the implementations in Boost. Instead of defining a static constant as in Section 5.2.2, we derive the meta-function from boost::mpl::false_ and boost::mpl::true_, where the static constants are defined together with some additional typedefs. This is not only shorter but also requires a bit less compile time, see [?]. 10

The code is quite self-explanatory. Types we do not know are not considered matrices. Then we specialize for known matrix classes. For views we can further refer to the matrix-ness of the template argument.

Alternatively, we can state in the type trait that every transposed_view is a matrix and instead require for template arguments of transposed_view that they are matrices:

#include <boost/static_assert.hpp>

template <typename Matrix>
class transposed_view
{
    BOOST_STATIC_ASSERT((is_matrix<Matrix>::value)); // Make sure that the argument is a matrix type
    // ...
};

This additional assertion guarantees that the view class can only be instantiated with known matrix types. For other argument types, the compilation will terminate in this line. Unfortunately, the error message is not very informative, not to say confusing:

10 TODO: page



trans_const.cpp:96: Error: Invalid application of »sizeof« on incomplete type
»boost::STATIC_ASSERTION_FAILURE<false>«

If you see an error message with “STATIC ASSERTION” in it, do not think about the message itself (it is meaningless) but look at the source code line that caused this error and hope that the author of the assertion provided more information in a comment.

When we try to compile our test with the assertion, we see that trans(A) compiles but trans(B) does not. The reason is that ‘const dense2D<float>’ is considered different from ‘dense2D<float>’ in template specialization, so it is still considered a non-matrix. The good news is that we do not need to double our specializations for mutable and constant types; we can write a partial specialization for all constant arguments:

template <typename T>
struct is_matrix<const T>
    : is_matrix<T> {};

Note that BOOST_STATIC_ASSERT is a macro and does not understand C++. This manifests in particular if the argument contains one or more commas. Then the preprocessor will interpret this as multiple arguments for the macro and get confused. This confusion can be avoided by enclosing the argument of BOOST_STATIC_ASSERT in an extra pair of parentheses, as we did in the example (although it was not necessary here). Despite the double parentheses and the rather arbitrary error message, static assertions are very useful for increasing reliability. The next C++ standard will provide static assertions in the language, like:

template <typename Matrix>
class transposed_view
{
    static_assert(is_matrix<Matrix>::value, "transposed_view requires a matrix as argument");
    // ...
};

As the reader can see, the integration into the language overcomes the before-mentioned deficiencies of the macro implementation.

Also useful are meta-functions that remove something from a type if it exists; e.g., remove_const transforms const T into T while non-constant types remain unchanged. Note that this only removes the constancy of the entire type, not that of template arguments; e.g., in vector<const T> the constancy of the arguments is not removed.

Dually, meta-functions can add something to a type:

typedef typename boost::add_reference<T>::type ref_type;

It would be shorter to just add an &, but this is easily overlooked in longer type definitions. More importantly, if some trait already returns a reference, then it is an error to add another one. The meta-function adds the reference only to types that are not yet references. For adding const to a type we find it more concise without a meta-function:

typedef typename some_trait<T>::type const const_type;

If the type trait already returns a constant type, the second const is simply ignored.

The widest functionality in the area of meta-programming is provided by the Boost Meta-Programming Library (MPL) [GA04]. The library implements most of the STL algorithms (§ 4.9) and also provides similar data types, e.g., vector or map. Another interesting library is Boost Fusion [?], which helps mixing execution at compile time and run time. Both libraries are well documented and therefore not further discussed here.

5.2.4 Enable-If

A very powerful mechanism for meta-programming is “enable-if”, discovered by Jaakko Järvi and Jeremiah Wilcock. It is based on the paradigm SFINAE: Substitution Failure Is Not An Error. Imagine a function call with a given argument type, say dense_vector<float>. One of the overloads has a return type that is determined by a meta-function of the function argument type. The compiler will then substitute the meta-function argument with dense_vector<float> to find out the return type. If this meta-function is not defined for dense_vector<float>, then this template function (overload) has no return type. Instead of generating an error message, the C++ compiler diligently ignores this overload. Of course, an error might still occur later if all overloads are ignored for the given type, or if the compiler cannot determine the most specific overload among those that are not ignored.

This compiler behavior can be exploited to select an implementation based on meta-functions. As an example, think of the L1 norm. It is defined for vector spaces and linear operators. Although these definitions are related, the practical real-world implementation for finite-dimensional vectors and matrices is different. Of course, we could implement the L1 norm for every matrix and vector type so that the call one_norm(x) would select the appropriate implementation for this type.

More productively, we like have one single implementation <strong>for</strong> all matrix types (including views)<br />

and one single implementation <strong>for</strong> all vector types. We use meta-function is matrix and implement<br />

accordingly is vector:<br />

template <typename T><br />
struct is_vector<br />
  : boost::mpl::false_<br />
{};<br />
<br />
template <typename Value><br />
struct is_vector<dense_vector<Value> ><br />
  : boost::mpl::true_<br />
{};<br />
<br />
// ... more vector types<br />

We also need the meta-function Magnitude to handle the magnitude of complex matrices and<br />

vectors.<br />
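The Magnitude meta-function itself is not shown here; a minimal sketch consistent with its use below could look as follows (the scalar and complex specializations are assumptions; the matrix and vector specializations the text needs are omitted):

```cpp
#include <complex>

// Minimal sketch of the Magnitude type trait mentioned in the text.
// For ordinary scalar types the magnitude type is the type itself ...
template <typename T>
struct Magnitude
{
    typedef T type;
};

// ... whereas for std::complex<T> it is the underlying real type,
// because abs(std::complex<T>) returns T.
template <typename T>
struct Magnitude<std::complex<T> >
{
    typedef typename Magnitude<T>::type type;
};
```

With this, `Magnitude<double>::type` is `double` and `Magnitude<std::complex<float> >::type` is `float`, which is exactly what the return type of one_norm needs.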

The implementation of enable if is very simple. It defines a type if the condition holds and none<br />

if the condition does not. The version in Boost adds a second level to access the static value<br />

member in types:<br />

template <bool Cond, typename T= void><br />
struct enable_if_c {<br />
    typedef T type;<br />
};<br />
<br />
template <typename T><br />
struct enable_if_c<false, T> {};<br />


148 CHAPTER 5. META-PROGRAMMING<br />

template <typename Cond, typename T= void><br />
struct enable_if<br />
  : public enable_if_c<Cond::value, T><br />
{};<br />

The real enabling behavior is realized in enable_if_c, whereas enable_if is merely a convenience<br />
wrapper to avoid typing ‘::value’.<br />
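As a stand-alone illustration of the mechanism, with simplified stand-ins for the Boost components and hypothetical tag types my_matrix and my_vector (none of these names come from the text):

```cpp
// Hand-rolled enable_if_c as in the text (stand-in for Boost's).
template <bool Cond, typename T= void>
struct enable_if_c { typedef T type; };

template <typename T>
struct enable_if_c<false, T> {};

struct true_type_  { static const bool value= true; };
struct false_type_ { static const bool value= false; };

// Toy types and tag meta-functions: only my_matrix counts as a matrix,
// only my_vector as a vector.
struct my_matrix {};
struct my_vector {};

template <typename T> struct is_matrix : false_type_ {};
template <> struct is_matrix<my_matrix> : true_type_ {};

template <typename T> struct is_vector : false_type_ {};
template <> struct is_vector<my_vector> : true_type_ {};

// Two overloads of the same name; SFINAE silently discards the one
// whose condition fails, so exactly one overload remains per type.
template <typename T>
typename enable_if_c<is_matrix<T>::value, int>::type
kind(const T&) { return 1; }   // selected for matrices

template <typename T>
typename enable_if_c<is_vector<T>::value, int>::type
kind(const T&) { return 2; }   // selected for vectors
```

Calling kind with, say, an int fails to compile because both overloads are discarded, exactly as described for one_norm below.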

Now we have all we need to implement the L1 norm in the generic fashion we aimed <strong>for</strong>:<br />

1 template <typename T><br />
2 typename boost::enable_if<is_matrix<T>, typename Magnitude<T>::type>::type<br />
3 inline one_norm(const T& A)<br />
4 {<br />
5     using std::abs;<br />
6     typedef typename Magnitude<T>::type mag_type;<br />
7     mag_type max(0);<br />
8     for (unsigned c= 0; c < num_cols(A); c++) {<br />
9         mag_type sum(0);<br />
10         for (unsigned r= 0; r < num_rows(A); r++)<br />
11             sum+= abs(A[r][c]);<br />
12         max= max < sum ? sum : max;<br />
13     }<br />
14     return max;<br />
15 }<br />
16<br />
17 template <typename T><br />
18 typename boost::enable_if<is_vector<T>, typename Magnitude<T>::type>::type<br />
19 inline one_norm(const T& v)<br />
20 {<br />
21     using std::abs;<br />
22     typedef typename Magnitude<T>::type mag_type;<br />
23     mag_type sum(0);<br />
24     for (unsigned r= 0; r < size(v); r++)<br />
25         sum+= abs(v[r]);<br />
26     return sum;<br />
27 }<br />

The selection is now driven by enable_if in lines 2 and 18. Let us look at line 2 in detail for a<br />
matrix argument:<br />
<br />
1. is_matrix<T> evaluates to (i.e. inherits from) true_;<br />
<br />
2. enable_if passes true_::value, i.e. true, to enable_if_c;<br />
<br />
3. enable_if_c<true, typename Magnitude<T>::type>::type is set to typename Magnitude<T>::type;<br />
<br />
4. This is the return type of the function overload.<br />
<br />
What happens in this line when the argument is not a matrix type?<br />
<br />
1. is_matrix<T> evaluates to (i.e. inherits from) false_;<br />
<br />
2. enable_if passes false_::value, i.e. false, to enable_if_c;<br />
<br />
3. enable_if_c<false, typename Magnitude<T>::type>::type is not defined in this case;<br />



4. The function overload has no return type;<br />
<br />
5. and is therefore ignored.<br />

In short, the overload is only enabled if the argument is a matrix, as the names of the<br />
meta-functions say. Likewise, the second overload is only available for vectors. A short test<br />
demonstrates this:<br />

mtl::dense2D<double> A(3, 3);<br />
A= 2, 3, 4,<br />
   5, 6, 7,<br />
   8, 9, 10;<br />
<br />
mtl::dense_vector<double> v(3);<br />
v= 3, 4, 5;<br />
<br />
std::cout << "one_norm(A) is " << tst::one_norm(A) << "\n";<br />
std::cout << "one_norm(v) is " << tst::one_norm(v) << "\n";<br />

For types that are neither matrix nor vector, it will look as if there were no function one_norm at all.<br />
Types that are considered both matrix and vector would cause an ambiguity.<br />

Drawbacks: The mechanism of enable_if is very powerful but not particularly pleasant to<br />
debug. Error messages caused by enable_if are usually rather long but not very meaningful. If<br />
a function match is missing for a given argument type, it is hard to determine why, because no<br />
helpful information is provided: the programmer is only told that no match was found, period.<br />
In addition, the enabling mechanism cannot select the most specific condition. For instance,<br />
we cannot specialize the implementation for, say, is_sparse_matrix. This can only be achieved by avoiding<br />
ambiguities in the conditions:<br />

template <typename T><br />
typename boost::enable_if_c<is_matrix<T>::value && !is_sparse_matrix<T>::value,<br />
                            typename Magnitude<T>::type>::type<br />
inline one_norm(const T& A);<br />
<br />
template <typename T><br />
typename boost::enable_if<is_sparse_matrix<T>, typename Magnitude<T>::type>::type<br />
inline one_norm(const T& A);<br />

Evidently, this becomes quite confusing when too many hierarchical conditions are involved.<br />
<br />
The SFINAE paradigm only applies to template arguments of the function itself. Therefore,<br />
member functions cannot be enabled depending on the class’s template argument. For instance,<br />
the mutable access operator in line 9 of Listing 5.1 cannot be hidden with enable_if for views on<br />
constant matrices, because the operator itself is not a template function. There are possibilities<br />
to introduce a template argument artificially for a member function in order to use enable_if, but this<br />
really does not contribute to the clarity of the program.<br />
<br />
Concepts can handle hierarchies of conditions and non-template member functions, and they also provide<br />
more helpful error messages. Unfortunately, they will not be available in C++0x, and it is not<br />
yet clear when they will be usable for mainstream programming.



5.3 Expression Templates<br />

Scientific software usually has strong performance requirements, especially the problems<br />
we tackle with C++. Many large-scale simulations of physical, chemical, or biological processes<br />
run for weeks or months, and everybody is glad if at least part of this very long execution<br />
time can be saved. Such savings often come at the price of less readable and maintainable program<br />
sources. In Section 5.3.1 we show a simple implementation of an operator and discuss why<br />
it is not efficient, and in the remainder of Section 5.3 we demonstrate how to improve<br />
the performance without sacrificing the natural notation.<br />

5.3.1 Simple Operator Implementation<br />

Assume we have an application with vector addition. For instance, we want to write an expression<br />
of the following form for vectors w, x, y, and z:<br />

w = x + y + z;<br />

Say, we have a vector class as in Section 4.3:<br />

template <typename T><br />
class vector<br />
{<br />
  public:<br />
    explicit vector(int size) : my_size(size), data(new T[my_size]) {}<br />
    vector() : my_size(0), data(0) {}<br />
<br />
    friend int size(const vector& x) { return x.my_size; }<br />
<br />
    const T& operator[](int i) const { check_index(i); return data[i]; }<br />
    T& operator[](int i) { check_index(i); return data[i]; }<br />
    // ...<br />
};<br />

We can of course provide an operator <strong>for</strong> adding such vectors:<br />

template <typename T><br />
vector<T> inline operator+(const vector<T>& x, const vector<T>& y)<br />
{<br />
    x.check_size(size(y));<br />
    vector<T> sum(size(x));<br />
    for (int i= 0; i < size(x); ++i)<br />
        sum[i] = x[i] + y[i];<br />
    return sum;<br />
}<br />

A short test program checks that everything works:<br />

int main()<br />
{<br />
    vector<double> x(4), y(4), z(4), w(4);<br />
    x[0]= x[1]= 1.0; x[2]= 2.0; x[3]= -3.0;<br />
    y[0]= y[1]= 1.7; y[2]= 4.0; y[3]= -6.0;<br />
    z[0]= z[1]= 4.1; z[2]= 2.6; z[3]= 11.0;<br />
<br />
    std::cout << "x = " << x << std::endl;<br />
    std::cout << "y = " << y << std::endl;<br />
    std::cout << "z = " << z << std::endl;<br />
<br />
    w= x + y + z;<br />
    std::cout << "w= x + y + z = " << w << std::endl;<br />
    return 0;<br />
}<br />
If this works properly, what is wrong with it? From the software engineering perspective:<br />
nothing. From the performance perspective: a lot.<br />
<br />
How is the statement executed?<br />

1. Create a temporary variable sum for the addition of x and y;<br />
<br />
2. Perform a loop reading x and y, adding them element-wise, and writing the result to sum;<br />
<br />
3. Copy sum to a temporary variable, say t_xy, in the return statement;<br />
<br />
4. Delete sum;<br />
<br />
5. Create a temporary variable sum for the addition of t_xy and z;<br />
<br />
6. Perform a loop reading t_xy and z, adding them element-wise, and writing the result to sum;<br />
<br />
7. Copy sum to a temporary variable, say t_xyz, in the return statement;<br />
<br />
8. Delete sum;<br />
<br />
9. Delete t_xy;<br />
<br />
10. Perform a loop reading t_xyz and writing to w;<br />
<br />
11. Delete t_xyz.<br />

This is admittedly the worst-case scenario, but it was the code that old compilers generated.<br />
Modern compilers perform more optimizations by static code analysis and can avoid copying<br />
the return value into the temporaries t_xy and t_xyz. Instead of being created,<br />
t_xy and t_xyz become aliases for the respective sum temporaries.<br />

The optimized version per<strong>for</strong>ms:<br />

1. Create a temporary variable sum (for distinction: sum_xy) for the addition of x and y;<br />
<br />
2. Perform a loop reading x and y, adding them element-wise, and writing the result to sum_xy;<br />
<br />
3. Create a temporary variable sum (for distinction: sum_xyz) for the addition of sum_xy and z;<br />
<br />
4. Perform a loop reading sum_xy and z, adding them, and writing the result to sum_xyz;<br />
<br />
5. Delete sum_xy;<br />
<br />
6. Perform a loop reading sum_xyz and writing to w;<br />
<br />
7. Delete sum_xyz.<br />

How many operations did we perform? Say our vectors have length n; then we have in total:<br />

• 2n additions;



• 3n assignments;<br />

• 5n reads;<br />

• 3n writes;<br />

• 2 memory allocations; and<br />

• 2 memory deallocations.<br />

As a comparison, suppose we could write a single loop in an inline function:<br />
<br />
template <typename T><br />
void inline add3(const vector<T>& x, const vector<T>& y, const vector<T>& z, vector<T>& sum)<br />
{<br />
    x.check_size(size(y));<br />
    x.check_size(size(z));<br />
    x.check_size(size(sum));<br />
    for (int i= 0; i < size(x); ++i)<br />
        sum[i] = x[i] + y[i] + z[i];<br />
}<br />

This function per<strong>for</strong>ms:<br />

• 2n additions;<br />

• n assignments;<br />

• 3n reads;<br />

• n writes;<br />

The call of this function:<br />

add3(x, y, z, w);<br />

is of course less elegant than the operator notation. Often, one needs another look at the<br />
documentation to check whether the first or the last argument contains the result. With operators this is<br />
evident.<br />

In high-performance software, programmers tend to implement a hard-coded version of every<br />
important operation instead of freely composing them from smaller expressions. The reason is<br />
obvious; our operator implementation additionally performed:<br />

• 2n assignments;<br />

• 2n reads;<br />

• 2n writes;<br />

• 2 memory allocations; and<br />

• 2 memory deallocations.<br />

The good news is that we have not performed additional arithmetic. The bad news is that the<br />
operations above are nonetheless expensive. On modern computers, it takes much more time to<br />
read data from or write data to memory than to execute fixed-point or floating-point operations.<br />
Unfortunately, vectors in scientific applications tend to be rather long, often larger than the<br />
caches of the platform, so the vectors really must be transferred to and from main memory. In<br />
the case of shorter vectors, the data might reside in the L1 or L2 cache and the data transfer is less<br />
critical. But in this case, allocation and deallocation become a serious slowdown factor.<br />

The purpose of expression templates is to keep the original operator notation without introducing<br />

the overhead induced by temporaries.<br />

5.3.2 An Expression Template Class<br />

The solution is to introduce a special class that keeps references to the vectors and allows us<br />
to perform all computations later in one sweep. The addition now does not return a vector but<br />
an object holding the references:<br />
<br />
template <typename T><br />
class vector_sum<br />
{<br />
  public:<br />
    vector_sum(const vector<T>& v1, const vector<T>& v2) : v1(v1), v2(v2) {}<br />
  private:<br />
    const vector<T> &v1, &v2;<br />
};<br />
<br />
template <typename T><br />
vector_sum<T> inline operator+(const vector<T>& x, const vector<T>& y)<br />
{<br />
    return vector_sum<T>(x, y);<br />
}<br />

Now we can already write x + y but not yet w= x + y. It is not only that the assignment is not<br />
defined; we also have not yet provided vector_sum with enough functionality to perform something<br />
useful in the assignment. Thus, we first extend vector_sum so that it looks like a vector itself:<br />

template <typename T><br />
class vector_sum<br />
{<br />
    void check_index(int i) const { assert(i >= 0 && i < size(v1)); }<br />
  public:<br />
    vector_sum(const vector<T>& v1, const vector<T>& v2) : v1(v1), v2(v2)<br />
    {<br />
        assert(size(v1) == size(v2));<br />
    }<br />
<br />
    friend int size(const vector_sum& x) { return size(x.v1); }<br />
<br />
    T operator[](int i) const { check_index(i); return v1[i] + v2[i]; }<br />
  private:<br />
    const vector<T> &v1, &v2;<br />
};<br />

For the sake of defensive programming, we added a test that the two vectors have the same<br />
size and can be consistently added. We then consider the size of the first vector as the size of<br />
our vector_sum. The most important function is the bracket operator: when the i-th entry is<br />
accessed, we compute the sum of the operands’ i-th entries.



Discussion 5.1 The drawback is that if the entries are accessed multiple times, the sum is<br />
recomputed. On the other hand, most expressions are only used once, and then this is not a problem.<br />
An example where vector entries are accessed several times is A * (x + y). Here, it is preferable<br />
to first compute a true vector instead of evaluating the matrix vector product on the expression<br />
template.<br />

To evaluate w= x + y we also need an assignment operator for vector_sum:<br />
<br />
template <typename T> class vector_sum; // forward declaration<br />
<br />
template <typename T><br />
class vector<br />
{<br />
    // ...<br />
    vector& operator=(const vector_sum<T>& that)<br />
    {<br />
        check_size(size(that));<br />
        for (int i= 0; i < my_size; ++i)<br />
            data[i]= that[i];<br />
        return *this;<br />
    }<br />
};<br />

The assignment runs a loop over w and that. As that is an object of type vector_sum, the expression<br />
that[i] computes x[i] + y[i]. In contrast to the implementation in Section 5.3.1 we now have:<br />
<br />
• Only one loop;<br />
<br />
• No temporary vector;<br />
<br />
• No additional memory allocation and deallocation; and<br />
<br />
• No additional data reads and writes.<br />

In fact, the same operations are performed as in the loop<br />
<br />
for (int i= 0; i < size(w); ++i)<br />
    w[i] = x[i] + y[i];<br />
<br />
The cost of creating a vector_sum object is negligible. The object is kept on the stack and does<br />
not require memory allocation. Even the little effort for creating the object will be optimized<br />
away by most compilers with static code analysis.<br />
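The classes of this subsection can be assembled into one compilable sketch. Here a non-template vector with double entries stands in for the class from Section 4.3; this is an illustration under simplified assumptions, not the book's exact code:

```cpp
#include <cassert>

// Simplified fixed-type vector standing in for the class from Section 4.3.
class vector
{
  public:
    explicit vector(int size) : my_size(size), data(new double[size]) {}
    ~vector() { delete[] data; }

    friend int size(const vector& x) { return x.my_size; }
    double  operator[](int i) const { return data[i]; }
    double& operator[](int i)       { return data[i]; }

    // Assignment from anything with size() and operator[]: evaluates the
    // expression element-wise in a single loop, with no temporary vector.
    template <typename Src>
    vector& operator=(const Src& that)
    {
        assert(my_size == size(that));
        for (int i= 0; i < my_size; ++i)
            data[i]= that[i];
        return *this;
    }
  private:
    vector(const vector&);            // not copyable in this sketch
    int     my_size;
    double* data;
};

// Expression template: holds references, computes entries on demand.
class vector_sum
{
  public:
    vector_sum(const vector& v1, const vector& v2) : v1(v1), v2(v2)
    {
        assert(size(v1) == size(v2));
    }
    friend int size(const vector_sum& x) { return size(x.v1); }
    double operator[](int i) const { return v1[i] + v2[i]; }
  private:
    const vector &v1, &v2;
};

inline vector_sum operator+(const vector& x, const vector& y)
{
    return vector_sum(x, y);
}
```

With this, w= x + y performs exactly one loop, and each w[i] is computed as x[i] + y[i] inside the assignment.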

What happens when we want to add three vectors? The naïve implementation from § 5.3.1<br />
returns a vector, and this vector can be added to another vector. Our approach returns a<br />
vector_sum, and we have no addition for vector_sum and vector. Thus, we would need another ET<br />
class and an according operation:<br />

template <typename T><br />
class vector_sum3<br />
{<br />
    void check_index(int i) const { assert(i >= 0 && i < size(v1)); }<br />
  public:<br />
    vector_sum3(const vector<T>& v1, const vector<T>& v2, const vector<T>& v3)<br />
      : v1(v1), v2(v2), v3(v3)<br />
    {<br />
        assert(size(v1) == size(v2)); assert(size(v1) == size(v3));<br />
    }<br />
<br />
    friend int size(const vector_sum3& x) { return size(x.v1); }<br />
<br />
    T operator[](int i) const { check_index(i); return v1[i] + v2[i] + v3[i]; }<br />
  private:<br />
    const vector<T> &v1, &v2, &v3;<br />
};<br />
<br />
template <typename T><br />
vector_sum3<T> inline operator+(const vector_sum<T>& x, const vector<T>& y)<br />
{<br />
    return vector_sum3<T>(x.v1, x.v2, y);<br />
}<br />

Furthermore, vector_sum must declare our new plus operator as friend to access its private<br />
members, and vector needs an assignment for vector_sum3. This becomes increasingly annoying.<br />
Also, what happens if we perform the second addition first, w= x + (y + z)? Then we<br />
need another plus operator. What if some of the vectors are multiplied by a scalar, e.g.,<br />
w= x + dot(x, y) * y + 4.3 * z, and this scalar product is also implemented by an ET? Our implementation<br />
effort runs into combinatorial explosion, and we need a more flexible solution, which<br />
we introduce in the next section.<br />

5.3.3 Generic Expression Templates<br />

So far, we started from a specific class (vector) and generalized the implementation gradually.<br />
Although this can help us understand the mechanism, we now move on to the general version<br />
that takes arbitrary vector types:<br />
<br />
template <typename V1, typename V2><br />
vector_sum<V1, V2> inline operator+(const V1& x, const V2& y)<br />
{<br />
    return vector_sum<V1, V2>(x, y);<br />
}<br />

We now need an expression class with arbitrary arguments:<br />

template <typename V1, typename V2><br />
class vector_sum<br />
{<br />
    typedef vector_sum self;<br />
    void check_index(int i) const { assert(i >= 0 && i < size(v1)); }<br />
  public:<br />
    vector_sum(const V1& v1, const V2& v2) : v1(v1), v2(v2)<br />
    {<br />
        assert(size(v1) == size(v2));<br />
    }<br />
<br />
    ???? operator[](int i) const { check_index(i); return v1[i] + v2[i]; }<br />
<br />
    friend int size(const self& x) { return size(x.v1); }<br />
  private:<br />
    const V1& v1;<br />
    const V2& v2;<br />
};<br />

This is rather straightforward. The only issue is: what type should operator[] return? For this,<br />
we must define value_type in each class (more flexible would be an external type trait). In<br />
vector_sum we take the value_type of the first argument, which can itself be taken from another<br />
class:<br />

template <typename V1, typename V2><br />
class vector_sum<br />
{<br />
    // ...<br />
    typedef typename V1::value_type value_type;<br />
<br />
    value_type operator[](int i) const { check_index(i); return v1[i] + v2[i]; }<br />
};<br />

To assign such an expression to a vector, we can also generalize the assignment operator:<br />
<br />
template <typename T><br />
class vector<br />
{<br />
  public:<br />
    typedef T value_type;<br />
<br />
    template <typename Src><br />
    vector& operator=(const Src& that)<br />
    {<br />
        check_size(size(that));<br />
        for (int i= 0; i < my_size; ++i)<br />
            data[i]= that[i];<br />
        return *this;<br />
    }<br />
};<br />

This assignment can also handle vector as an argument, so we can omit the standard assignment<br />
operator.<br />
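The generic machinery of this section composes arbitrarily: in w= x + y + z, the second plus nests one vector_sum inside another without any extra operator. A condensed, compilable sketch (the names vec and the double instantiation are choices for this illustration, not from the text; note that such an unconstrained operator+ is greedy and would need enable_if-style constraints in real code):

```cpp
#include <cassert>
#include <vector>

// Generic expression template: V1 and V2 can be vectors or other sums,
// so x + y + z nests as vector_sum<vector_sum<V,V>, V> automatically.
template <typename V1, typename V2>
class vector_sum
{
  public:
    typedef typename V1::value_type value_type;

    vector_sum(const V1& v1, const V2& v2) : v1(v1), v2(v2) {}

    friend int size(const vector_sum& x) { return size(x.v1); }
    value_type operator[](int i) const { return v1[i] + v2[i]; }
  private:
    const V1& v1;
    const V2& v2;
};

// A minimal vector built on std::vector, exposing the same interface.
template <typename T>
class vec
{
  public:
    typedef T value_type;
    explicit vec(int n) : data(n) {}

    friend int size(const vec& x) { return int(x.data.size()); }
    const T& operator[](int i) const { return data[i]; }
    T&       operator[](int i)       { return data[i]; }

    template <typename Src>
    vec& operator=(const Src& that)
    {
        for (int i= 0; i < size(*this); ++i)
            data[i]= that[i];
        return *this;
    }
  private:
    std::vector<T> data;
};

// One operator+ covers all combinations of vectors and expressions.
template <typename V1, typename V2>
vector_sum<V1, V2> operator+(const V1& x, const V2& y)
{
    return vector_sum<V1, V2>(x, y);
}
```

The intermediate vector_sum objects live only until the end of the assignment expression, during which the single loop in operator= reads through the whole nested expression.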

Advantages of expression templates: Although the availability of operator overloading<br />
in C++ resulted in notationally nicer code, the scientific community refused to give up programming<br />
in Fortran or implementing the loops directly in C/C++. The reason was that the<br />
traditional operator implementations were too expensive. Due to the overhead of creating<br />
temporary variables and copying vector and matrix objects, C++ could not compete<br />
with the performance of programs written in Fortran. This problem has now been resolved<br />
by the introduction of generics and expression templates. Now it is possible to write efficient<br />
scientific programs in a notationally convenient manner.<br />

5.4 Meta-Tuning: Write Your Own Compiler Optimization<br />

Compiler technology is progressing and provides us with an increasing number of optimization techniques.<br />
Ideally, everyone writes software in the way that is easiest for them, and the compiler<br />
transforms the operations into the form that is best for execution time. We would only need a new<br />
compiler and our programs would become faster. 13 But life, especially as an advanced C++ programmer,<br />
is no walk in the park. Of course, the compiler helps us a lot to speed up our programs.<br />
But there are limitations: many optimizations need knowledge of the semantic behavior and can<br />
therefore only be applied to types and operations whose semantics are known at the time the<br />
compiler is written, see also the discussion in [?]. Research is going on to overcome this limitation<br />
by providing concept-based optimization [?]. Unfortunately, it will take time until this becomes<br />
mainstream, especially now that concepts are taken out of the C++0x standard. An alternative<br />
is source-to-source code transformation with external tools like ROSE [?].<br />
<br />
Even for types and operations that the compiler can handle, it has its limitations. Most compilers<br />
(gcc, ...) only deal with the inner loop of nested loops (see the solution in Section 5.4.2)<br />
and do not dare to introduce extra temporaries (see the solution in Section ??). Some compilers<br />
are particularly tuned for benchmarks. For instance, they use pattern matching to recognize<br />
a 3-nested loop that computes a dense matrix product and transform it into BLAS-like code<br />
with 7 or 9 platform-dependent loops. 16 All this said, writing high-performance software is no<br />
walk in the park. That does not mean that such software must be unreadable and unmaintainable<br />
hackery. The route to success is again to provide appropriate abstractions. These can be<br />
empowered with compile-time optimizations so that applications are still written in natural<br />
mathematical notation whereas the generated binaries exploit all known techniques<br />
for fast execution.<br />

5.4.1 Classical Fixed-Size Unrolling<br />

The easiest form of compile-time optimization can be realized for fixed-size data types, in<br />
particular vectors as in Section 4.7. Similar to the default assignment, we can write a generic<br />
vector assignment:<br />
<br />
template <typename T, int Size><br />
class fsize_vector<br />
{<br />
  public:<br />
    const static int my_size= Size;<br />
<br />
    self& operator=(const self& that)<br />
    {<br />
        for (int i= 0; i < my_size; ++i)<br />
            data[i]= that[i];<br />
    }<br />
};<br />

13 In some sense, this is the programming equivalent of communism: everybody contributes as much as he<br />
pleases and how he pleases, and in the end the right thing happens anyway thanks to a self-improving society.<br />
Likewise, some people write software in a very naïve fashion and blame the compiler for not transforming their<br />
programs into high-performance code.<br />
16 One could sometimes get the impression that the HPC community believes that multiplying dense matrices<br />
at near-peak performance solves all performance issues of the world, or at least demonstrates that everything can<br />
be computed at near-peak performance if only one tries hard enough. Fortunately, more and more people in the<br />
supercomputer centers realize that their machines are not only running BLAS3 and LAPACK operations and<br />
that real-world applications are more often than not limited by memory bandwidth and latency.



A state-of-the-art compiler will recognize that all iterations are independent of each<br />
other; e.g., data[2]= that[2]; is independent of data[1]= that[1];. The compiler will also determine<br />
the size of the loop during compilation. As a consequence, the generated binary for a type of size<br />
3 will be equivalent to:<br />
<br />
template <typename T, int Size><br />
class fsize_vector<br />
{<br />
    self& operator=(const self& that)<br />
    {<br />
        data[0]= that[0];<br />
        data[1]= that[1];<br />
        data[2]= that[2];<br />
    }<br />
};<br />

The right-hand-side vector that might be an expression template (§ 5.3) for, say, alpha * x + y,<br />
and its evaluation will also be inlined:<br />
<br />
template <typename T, int Size><br />
class fsize_vector<br />
{<br />
    template <typename Src><br />
    self& operator=(const Src& that)<br />
    {<br />
        data[0]= alpha * x[0] + y[0];<br />
        data[1]= alpha * x[1] + y[1];<br />
        data[2]= alpha * x[2] + y[2];<br />
    }<br />
};<br />

To make the unrolling more explicit, and for the sake of introducing meta-tuning step by step, we<br />
develop a functor that computes the assignment:<br />
<br />
template <typename Target, typename Source, int N><br />
struct fsize_assign<br />
{<br />
    void operator()(Target& tar, const Source& src)<br />
    {<br />
        fsize_assign<Target, Source, N-1>()(tar, src);<br />
        std::cout << "assign entry " << N << '\n';<br />
        tar[N]= src[N];<br />
    }<br />
};<br />
<br />
template <typename Target, typename Source><br />
struct fsize_assign<Target, Source, 0><br />
{<br />
    void operator()(Target& tar, const Source& src)<br />
    {<br />
        std::cout << "assign entry " << 0 << '\n';<br />
        tar[0]= src[0];<br />
    }<br />
};



The print-outs shall show us the execution. For convenience, one can templatize the operator<br />
on the argument types:<br />
<br />
template <int N><br />
struct fsize_assign<br />
{<br />
    template <typename Target, typename Source><br />
    void operator()(Target& tar, const Source& src)<br />
    {<br />
        fsize_assign<N-1>()(tar, src);<br />
        std::cout << "assign entry " << N << '\n';<br />
        tar[N]= src[N];<br />
    }<br />
};<br />
<br />
template <><br />
struct fsize_assign<0><br />
{<br />
    template <typename Target, typename Source><br />
    void operator()(Target& tar, const Source& src)<br />
    {<br />
        std::cout << "assign entry " << 0 << '\n';<br />
        tar[0]= src[0];<br />
    }<br />
};<br />

Then the vector types can be deduced by the compiler when the operator is called. Instead of<br />
the previous loop, we call the assignment functor in the operator:<br />
<br />
template <typename T, int Size><br />
class fsize_vector<br />
{<br />
    BOOST_STATIC_ASSERT((my_size > 0));<br />
<br />
    self& operator=(const self& that)<br />
    {<br />
        fsize_assign<my_size-1>()(*this, that);<br />
        return *this;<br />
    }<br />
<br />
    template <typename Vector><br />
    self& operator=(const Vector& that)<br />
    {<br />
        fsize_assign<my_size-1>()(*this, that);<br />
        return *this;<br />
    }<br />
};<br />

The execution of the following code fragment<br />
<br />
fsize_vector<float, 4> v, w;<br />
v[0]= v[1]= 1.0; v[2]= 2.0; v[3]= -3.0;<br />
w= v;<br />
<br />
yields<br />
<br />
assign entry 0<br />
assign entry 1<br />
assign entry 2<br />
assign entry 3<br />

In this implementation, we replaced the loop by a recursion, counting on the compiler to<br />
inline the operations (otherwise it would be even slower than the loop), and made sure that no<br />
loop index is incremented and tested for termination. This is only beneficial for small loops that<br />
run in the L1 cache. Larger loops are dominated by loading the data from memory, and the loop<br />
overhead is irrelevant. On the contrary, unrolling operations on very large vectors entirely will<br />
probably decrease the performance, because many instructions need to be loaded and therefore decrease<br />
the available bandwidth for the data. As mentioned before, compilers can unroll such<br />
operations by themselves (and hopefully know when it is better not to), and sometimes this<br />
automatic unrolling is even slightly faster than the explicit implementation.<br />
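A minimal, self-contained version of this unrolled assignment (print statements omitted; the reduced fsize_vector below is a sketch, not the full class from the text):

```cpp
// Recursive compile-time unrolled assignment, as developed above:
// fsize_assign<N> handles entries 0..N by recursing first, then
// assigning entry N, so no run-time loop counter exists at all.
template <int N>
struct fsize_assign
{
    template <typename Target, typename Source>
    void operator()(Target& tar, const Source& src)
    {
        fsize_assign<N-1>()(tar, src);  // entries 0..N-1 first
        tar[N]= src[N];
    }
};

template <>
struct fsize_assign<0>                  // terminates the recursion
{
    template <typename Target, typename Source>
    void operator()(Target& tar, const Source& src)
    {
        tar[0]= src[0];
    }
};

// Bare-bones fixed-size vector using the functor in its assignment.
template <typename T, int Size>
class fsize_vector
{
  public:
    const static int my_size= Size;

    T&       operator[](int i)       { return data[i]; }
    const T& operator[](int i) const { return data[i]; }

    template <typename Vector>
    fsize_vector& operator=(const Vector& that)
    {
        fsize_assign<my_size-1>()(*this, that);
        return *this;
    }
  private:
    T data[Size];
};
```

After inlining, an assignment between two fsize_vector<float, 4> objects consists of exactly four element copies and nothing else.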

5.4.2 Nested Unrolling<br />

From our experience, compilers usually do not unroll nested loops. Even a good compiler that can<br />
handle certain nested loops will not be able to optimize every program kernel, in particular<br />
heavily templatized programs instantiated with user-defined types. We will demonstrate<br />
here how to unroll nested loops at compile time, using matrix vector multiplication as an example.<br />

For this purpose, we introduce a simplistic fixed-size matrix type:<br />

template <typename T, int Rows, int Cols><br />
class fsize_matrix<br />
{<br />
    typedef fsize_matrix self;<br />
  public:<br />
    typedef T value_type;<br />
    BOOST_STATIC_ASSERT((Rows * Cols > 0));<br />
    const static int my_rows= Rows, my_cols= Cols;<br />
<br />
    fsize_matrix()<br />
    {<br />
        for (int i= 0; i < my_rows; ++i)<br />
            for (int j= 0; j < my_cols; ++j)<br />
                data[i][j]= T(0);<br />
    }<br />
    fsize_matrix(const self& that) { /* ... */ }<br />
<br />
    // cannot check column index<br />
    const T* operator[](int r) const { return data[r]; }<br />
    T* operator[](int r) { return data[r]; }<br />
<br />
    mat_vec_et<self, fsize_vector<T, Cols> > operator*(const fsize_vector<T, Cols>& v) const<br />
    {<br />
        return mat_vec_et<self, fsize_vector<T, Cols> >(*this, v);<br />
    }<br />
  private:<br />
    T data[Rows][Cols];<br />
};<br />

The bracket operator returns a pointer for the sake of simplicity, but a good implementation<br />
should return a proxy that allows for checking the column index. The multiplication with a<br />
vector is realized by means of an expression template in order not to copy the result vector.<br />
The vector assignment then needs a specialization for the expression template: 17<br />

template <typename T, int Size><br />
class fsize_vector<br />
{<br />
    template <typename Matrix, typename Vector><br />
    self& operator=(const mat_vec_et<Matrix, Vector>& that)<br />
    {<br />
        typedef mat_vec_et<Matrix, Vector> et;<br />
        fsize_mat_vec_mult<Matrix::my_rows-1, Matrix::my_cols-1>()(that.A, that.v, *this);<br />
        return *this;<br />
    }<br />
};<br />

The functor fsize mat vec mult must now compute the matrix vector product on the three arguments.<br />

The general implementation of the functor reads:<br />

template <int Rows, int Cols><br />
struct fsize_mat_vec_mult<br />
{<br />
    template <typename Matrix, typename VecIn, typename VecOut><br />
    void operator()(const Matrix& A, const VecIn& v_in, VecOut& v_out)<br />
    {<br />
        fsize_mat_vec_mult<Rows, Cols-1>()(A, v_in, v_out);<br />
        v_out[Rows]+= A[Rows][Cols] * v_in[Cols];<br />
    }<br />
};<br />

Again, the functor is only templatized on the sizes, and the container types are deduced. The<br />
operator assumes that all smaller column indices are already handled and that we can increment<br />
v_out[Rows] by A[Rows][Cols] * v_in[Cols]. In particular, we assume that the first operation on<br />
v_out[Rows] initializes it. Thus, we need a (partial) specialization for Cols = 0:<br />

template <int Rows><br />
struct fsize_mat_vec_mult<Rows, 0><br />
{<br />
    template <typename Matrix, typename VecIn, typename VecOut><br />
    void operator()(const Matrix& A, const VecIn& v_in, VecOut& v_out)<br />
    {<br />
        fsize_mat_vec_mult<Rows-1, Matrix::my_cols-1>()(A, v_in, v_out);<br />
        v_out[Rows]= A[Rows][0] * v_in[0];<br />
    }<br />
};<br />

The careful reader noticed the substitution of += by =. We also notice that we have to call the computation for the preceding row with all columns, and inductively for all smaller rows. The
17 A better solution would be to implement all assignments with a functor and to specialize the functor, because partial template specialization of functions does not always work as expected.


162 CHAPTER 5. META-PROGRAMMING<br />

number of columns in the matrix is taken from an internal definition in the matrix type for the sake of simplicity. Passing this as an extra template argument or using a type trait would have been more general, because we are now limited to types where my_cols is defined in the class. We still need a (full) specialization to terminate the recursion:

template <>
struct fsize_mat_vec_mult<0, 0>
{
    template <typename Matrix, typename VecIn, typename VecOut>
    void operator()(const Matrix& A, const VecIn& v_in, VecOut& v_out)
    {
        v_out[0]= A[0][0] * v_in[0];
    }
};

With the inlining, our program will execute the operation w= A * v for vectors of size 4 as:

w[0]= A[0][0] * v[0];
w[0]+= A[0][1] * v[1];
w[0]+= A[0][2] * v[2];
w[0]+= A[0][3] * v[3];
w[1]= A[1][0] * v[0];
w[1]+= A[1][1] * v[1];
w[1]+= A[1][2] * v[2];
w[1]+= A[1][3] * v[3];
w[2]= A[2][0] * v[0];
w[2]+= A[2][1] * v[1];
w[2]+= A[2][2] * v[2];
w[2]+= A[2][3] * v[3];
w[3]= A[3][0] * v[0];
w[3]+= A[3][1] * v[1];
w[3]+= A[3][2] * v[2];
w[3]+= A[3][3] * v[3];

Our tests have shown that such an implementation is indeed faster than the compiler optimization on loops.18

Increasing Concurrency<br />

A disadvantage of the preceding implementation is that all operations on an entry of the target vector are performed in one sweep. Therefore, the second operation must wait for the first, the third for the second, and so on. The fifth operation can be done in parallel with the fourth, the ninth with the eighth, but this is not satisfying. We would like to have more concurrency in our program, to enable the parallel pipelines of superscalar processors. Again, we can twiddle our thumbs and hope that the compiler reorders the statements, or we can take it into our own hands. More concurrency is provided by the following operation sequence:

w[0]= A[0][0] * v[0];
w[1]= A[1][0] * v[0];
w[2]= A[2][0] * v[0];
w[3]= A[3][0] * v[0];
w[0]+= A[0][1] * v[1];
18 TODO: Give numbers


5.4. META-TUNING: WRITE YOUR OWN COMPILER OPTIMIZATION 163<br />

w[1]+= A[1][1] * v[1];
w[2]+= A[2][1] * v[1];
w[3]+= A[3][1] * v[1];
w[0]+= A[0][2] * v[2];
w[1]+= A[1][2] * v[2];
w[2]+= A[2][2] * v[2];
w[3]+= A[3][2] * v[2];
w[0]+= A[0][3] * v[3];
w[1]+= A[1][3] * v[3];
w[2]+= A[2][3] * v[3];
w[3]+= A[3][3] * v[3];

We only need to reorganize our functor. The general template now reads:

template <unsigned Rows, unsigned Cols>
struct fsize_mat_vec_mult_cm
{
    template <typename Matrix, typename VecIn, typename VecOut>
    void operator()(const Matrix& A, const VecIn& v_in, VecOut& v_out)
    {
        fsize_mat_vec_mult_cm<Rows-1, Cols>()(A, v_in, v_out);
        v_out[Rows]+= A[Rows][Cols] * v_in[Cols];
    }
};

Now we need a partial specialization for row 0 to go to the next column:

template <unsigned Cols>
struct fsize_mat_vec_mult_cm<0, Cols>
{
    template <typename Matrix, typename VecIn, typename VecOut>
    void operator()(const Matrix& A, const VecIn& v_in, VecOut& v_out)
    {
        fsize_mat_vec_mult_cm<Matrix::my_rows-1, Cols-1>()(A, v_in, v_out);
        v_out[0]+= A[0][Cols] * v_in[Cols];
    }
};

The partial specialization for column 0 is also needed to initialize the entry of the output vector:

template <unsigned Rows>
struct fsize_mat_vec_mult_cm<Rows, 0>
{
    template <typename Matrix, typename VecIn, typename VecOut>
    void operator()(const Matrix& A, const VecIn& v_in, VecOut& v_out)
    {
        fsize_mat_vec_mult_cm<Rows-1, 0>()(A, v_in, v_out);
        v_out[Rows]= A[Rows][0] * v_in[0];
    }
};

Finally, we still need a specialization for row and column 0 to terminate the recursion. This can be reused from the previous functor:

template <>
struct fsize_mat_vec_mult_cm<0, 0>
    : fsize_mat_vec_mult<0, 0> {};



Using Registers<br />

There is another feature of modern processors one should keep in mind: cache coherency. Processors are nowadays designed to share memory while maintaining consistency in their caches. As a result, every time we write into a data structure in memory, like our vector w, a cache invalidation signal is sent on the bus, even if no other processor is present. Unfortunately, this slows down computation perceivably (in our experience).
Fortunately, this can often be avoided in a rather simple way: by introducing a temporary in a function, which resides in registers if the type allows. We can rely on the compiler to decide reasonably where temporaries are located.
This implementation requires two classes: one for the outer and one for the inner loop. Let us start with the outer loop:

1 template <unsigned Rows>
2 struct fsize_mat_vec_mult_reg
3 {
4     template <typename Matrix, typename VecIn, typename VecOut>
5     void operator()(const Matrix& A, const VecIn& v_in, VecOut& v_out)
6     {
7         fsize_mat_vec_mult_reg<Rows-1>()(A, v_in, v_out);
8
9         typename VecOut::value_type tmp;
10        fsize_mat_vec_mult_aux<Rows, Matrix::my_cols-1>()(A, v_in, tmp);
11        v_out[Rows]= tmp;
12    }
13 };

We assume that fsize_mat_vec_mult_aux is defined or declared before this class. The first statement in line 7 calls the computations on the preceding rows. A temporary is defined in line 9 with the hope that it will be located in a register. Then we call the computation within this row in line 10. The temporary is passed as a reference to an inline function so that the summation will be performed in a register. In line 11 we write the result back to v_out. This still causes the invalidation signal on the bus, but only once for each entry.

The functor must be specialized for row 0 to avoid infinite loops:

template <>
struct fsize_mat_vec_mult_reg<0>
{
    template <typename Matrix, typename VecIn, typename VecOut>
    void operator()(const Matrix& A, const VecIn& v_in, VecOut& v_out)
    {
        typename VecOut::value_type tmp;
        fsize_mat_vec_mult_aux<0, Matrix::my_cols-1>()(A, v_in, tmp);
        v_out[0]= tmp;
    }
};

Within each row we iterate over the columns and increment the temporary (hopefully in a register):

template <unsigned Rows, unsigned Cols>
struct fsize_mat_vec_mult_aux
{
    template <typename Matrix, typename VecIn, typename ScalOut>
    void operator()(const Matrix& A, const VecIn& v_in, ScalOut& tmp)
    {
        fsize_mat_vec_mult_aux<Rows, Cols-1>()(A, v_in, tmp);
        tmp+= A[Rows][Cols] * v_in[Cols];
    }
};

To terminate the recursion over the columns, we write a specialization:

template <unsigned Rows>
struct fsize_mat_vec_mult_aux<Rows, 0>
{
    template <typename Matrix, typename VecIn, typename ScalOut>
    void operator()(const Matrix& A, const VecIn& v_in, ScalOut& tmp)
    {
        tmp= A[Rows][0] * v_in[0];
    }
};

In this section we showed different ways to optimize a two-dimensional loop (with fixed sizes). There are certainly more possibilities: for instance, we could try an implementation that uses registers but provides the same concurrency as the second-to-last implementation. Another form of optimization could be to agglomerate the write-backs so that multiple invalidation signals are sent at a time, which might be less disruptive.

5.4.3 Dynamic Unrolling – Warm up<br />

⇒ vector_unroll_example.cpp

As important as the fixed-size optimization is, acceleration for dynamically sized containers is needed even more. We start here with a simple example and some observations. We will reuse the vector class from Listing 4.1. To show the implementation more clearly, we write the code without operators and expression templates. Our test case will compute

u = 3v + w

for three short vectors of size 1000. The wall clock time will be measured with boost::timer.19 The vectors v and w will be initialized, and to have the data ready for use (i.e. the vectors are definitely in cache20) we run a few additional operations without timing:

#include <iostream>
#include <cstdlib>
#include <boost/timer.hpp>
// ...

int main(int argc, char* argv[])
{
    unsigned s= 1000;
    if (argc > 1) s= atoi(argv[1]); // read (potentially) from command line

19 See http://www.boost.org/doc/libs/1_43_0/libs/timer/timer.htm<br />

20 TODO: shouldn’t the initialization make this sure? Do we have a better explanation? Reference to benchmark<br />

literature? Do we really need a bullet proof justification here?


    vector u(s), v(s), w(s);
    for (unsigned i= 0; i < s; i++) {
        v[i]= float(i);
        w[i]= float(2*i + 15);
    }
    for (unsigned j= 0; j < 3; j++)
        for (unsigned i= 0; i < s; i++)
            u[i]= 3.0f * v[i] + w[i];

    const unsigned rep= 200000;
    boost::timer native;
    for (unsigned j= 0; j < rep; j++)
        for (unsigned i= 0; i < s; i++)
            u[i]= 3.0f * v[i] + w[i];
    std::cout << "Compute time native loop is " << 1000000.0 * native.elapsed() / double(rep) << " µs.\n";

    return 0;
}

Alternatively we compute this with an unrolling of 4 cycles:<br />

for (unsigned j= 0; j < rep; j++)
    for (unsigned i= 0; i < s; i+= 4) {
        u[i]= 3.0f * v[i] + w[i];
        u[i+1]= 3.0f * v[i+1] + w[i+1];
        u[i+2]= 3.0f * v[i+2] + w[i+2];
        u[i+3]= 3.0f * v[i+3] + w[i+3];
    }

This code will obviously only work if the vector size is divisible by 4. To avoid errors we could add an assertion on the vector size, but this is not really satisfying. Instead, we generalize this implementation to arbitrary vector sizes:

boost::timer unrolled;
for (unsigned j= 0; j < rep; j++) {
    unsigned sb= s / 4 * 4;
    for (unsigned i= 0; i < sb; i+= 4) {
        u[i]= 3.0f * v[i] + w[i];
        u[i+1]= 3.0f * v[i+1] + w[i+1];
        u[i+2]= 3.0f * v[i+2] + w[i+2];
        u[i+3]= 3.0f * v[i+3] + w[i+3];
    }
    for (unsigned i= sb; i < s; i++)
        u[i]= 3.0f * v[i] + w[i];
}
std::cout << "Compute time unrolled loop is " << 1000000.0 * unrolled.elapsed() / double(rep) << " µs.\n";
std::cout << "u is " << u << '\n';

Listing 5.4: Unrolled computation of u = 3v + w<br />

The little program was compiled with g++ 4.1.2 with the flags -O3 -ffast-math -DNDEBUG and resulted on the test computer21 in:

Compute time native loop is 2.64 µs.<br />

Compute time unrolled loop is 1.15 µs.<br />

As an alternative to our hand-coded unrolling, we can use the compiler flag -funroll-loops. This results in the following execution times on the test machine:

Compute time native loop is 2.51 µs.<br />

Compute time unrolled loop is 1.22 µs.<br />

The original loop became slightly faster, while our optimized version slowed down a bit. We see an entirely different behavior if we replace the size s by a constant:

const unsigned s= 1000;<br />

In this case the compiler knows the size of the loops, and it might be easier to transform the loop or to determine that a transformation is beneficial.

Compute time native loop is 1.6 µs.<br />

Compute time unrolled loop is 1.55 µs.<br />

Now the native loop is clearly accelerated by the compiler optimization. Why our hand-written unrolling is slower than before is not clear. Apparently, the manual and the automatic optimization got into conflict, or the latter overrode the former.

Discussion 5.2 Software tuning and benchmarking is an art of its own, given the complexity of compiler optimization. The tiniest modification in the source can change the run-time behavior of an examined computation. In the example it should not have mattered whether the size is known at compile time or not. But it did. Especially when the code is compiled without -DNDEBUG, the compiler might omit the index check in some situations and perform it in others. It is also important to print out computed values (and filter them out with grep or the like) because the compiler might omit an entire computation when it is obvious that the result is not needed. Such optimizations happen in particular if the results are intrinsic types, while computations on user-defined types are usually not subject to such omissions (but one should not count on it).

The goal of this section is not to determine precisely why which code is how much faster than another. Besides, each compiler has a different sensitivity to sizes and flags, so we would need a different line of argumentation for each of them. The only conclusion we would like to draw from these observations is that despite all the progress in compiler technology, we cannot rely on it blindly and still need hand-tuned implementations and careful benchmarking when maximal performance is needed. On the other hand, program snippets as in the last listing should not appear in scientific applications, for the sake of readability, maintainability, portability, ...

Another question we have not raised so far is: what is the optimal block size for the unrolling?

• Does it depend on the expression?<br />

• Does it depend on the types of the arguments?<br />

• Does it depend on the computer architecture?<br />

21 Phenom II X2 545 3.0 GHz, 3600 MHz PSB, 7MB total cache, Sockel AM2,2x 2GB DDR2-800



The answer is yes, to all of them. The main reason (but not the only one) is that different processors have different numbers of registers. How many registers are needed in one iteration depends on the expression and on the types (a complex value needs more registers than a float). In the following section we will address both issues: how to encapsulate the transformation so that it does not show up in the application, and how to change the block size without rewriting the loop.

5.4.4 Unrolling Vector Expressions<br />

For easier understanding, we discuss the abstraction in meta-tuning step by step. We start with the previous loop and implement a function for it. Say the function's name is my_axpy and it has a template argument for the block size, so that we can write for instance:

for (unsigned j= 0; j < rep; j++)
    my_axpy<2>(u, v, w);

This function shall contain an unrolled main loop with customizable block size and a clean-up loop at the end:

template <unsigned BSize, typename U, typename V, typename W>
void my_axpy(U& u, const V& v, const W& w)
{
    assert(u.size() == v.size() && v.size() == w.size());
    unsigned s= u.size(), sb= s / BSize * BSize;

    for (unsigned i= 0; i < sb; i+= BSize)
        my_axpy_ftor<0, BSize>()(u, v, w, i);
    for (unsigned i= sb; i < s; i++)
        u[i]= 3.0f * v[i] + w[i];
}

As mentioned before, deduced template types, as the vector types in our case, must be placed at the end, and the explicitly given arguments, in our case the block size, must be at the beginning of the template parameter list. The block statement in the first loop can be implemented similarly to the functor in Section 5.4.1. We deviate a bit from that implementation by using two template arguments, where the former is increased until it is equal to the second. It appeared that this approach yielded faster binaries on gcc than using only one argument and counting it down to zero.22 In addition, the two-argument version is more consistent with the multi-dimensional implementation in Section ??. As for fixed-size unrolling we need a recursive template definition. Within the operator, a single statement is performed and the following statements are called:

template <unsigned Offset, unsigned Max>
struct my_axpy_ftor
{
    template <typename U, typename V, typename W>
    void operator()(U& u, const V& v, const W& w, unsigned i)
    {
        u[i+Offset]= 3.0f * v[i+Offset] + w[i+Offset];
        my_axpy_ftor<Offset+1, Max>()(u, v, w, i);
    }
};
22 TODO: exercise for it

The only difference to fixed-size unrolling is that the indices are relative to an argument, here i. The operator() is first called with Offset equal to 0, then with 1, 2, and so on. Since each call is inlined, the functor call results in one monolithic block of operations without loop control and function calls. Thus, the call my_axpy_ftor<0, BSize>()(u, v, w, i) performs the same operations as one iteration of the first loop in Listing 5.4.
Of course, this compilation would end in an infinite loop if we forget to specialize it for Max:

template <unsigned Max>
struct my_axpy_ftor<Max, Max>
{
    template <typename U, typename V, typename W>
    void operator()(U& u, const V& v, const W& w, unsigned i) {}
};

Per<strong>for</strong>ming the considered vector operation with different unrollings yields<br />

Compute time unrolled loop is 1.44 µs.<br />

Compute time unrolled loop is 1.15 µs.<br />

Compute time unrolled loop is 1.15 µs.<br />

Compute time unrolled loop is 1.14 µs.<br />

Now we can call this operation for any block size we like. On the other hand, it is rather cumbersome to implement the corresponding functions and functors for each vector expression. Therefore, we now combine this technique with expression templates.

5.4.5 Tuning an Expression Template<br />

⇒ vector_unroll_example2.cpp

Let us recall Section 5.3.3. So far, we have developed a vector class with expression templates for vector sums. In the same manner we can implement the product of a scalar and a vector, but we leave this as an exercise and consider expressions with addition only, for example:

u = v + v + w

Now we frame this vector operation with a repetition loop and the time measurement:

boost::timer t;
for (unsigned j= 0; j < rep; j++)
    u= v + v + w;
std::cout << "Compute time is " << 1000000.0 * t.elapsed() / double(rep) << " µs.\n";

This results in:<br />

Compute time is 1.72 µs.<br />

To incorporate meta-tuning into expression templates, we only need to modify the actual assignment, because only there a loop is performed. All the other operations (well, so far we have only a sum, but in theory there could be tons of them) only return objects with references. The loop in operator= is split into the unrolled part at the beginning and the one-by-one completion at the end:

template <typename Value>
class vector
{
    template <typename Src>
    vector& operator=(const Src& that)
    {
        check_size(size(that));
        unsigned s= my_size, sb= s / 4 * 4;

        for (unsigned i= 0; i < sb; i+= 4)
            assign<0, 4>()(*this, that, i);
        for (unsigned i= sb; i < s; i++)
            data[i]= that[i];
        return *this;
    }
};

The assign functor is realized analogously to my_axpy_ftor:

template <unsigned Offset, unsigned Max>
struct assign
{
    template <typename U, typename V>
    void operator()(U& u, const V& v, unsigned i)
    {
        u[i+Offset]= v[i+Offset];
        assign<Offset+1, Max>()(u, v, i);
    }
};

template <unsigned Max>
struct assign<Max, Max>
{
    template <typename U, typename V>
    void operator()(U& u, const V& v, unsigned i) {}
};

Computing the expression above, we obtain:

Compute time is 1.37 µs.<br />

With this rather simple modification we have now accelerated ALL vector expression templates. In comparison with the previous implementation, however, we lost the flexibility to customize the loop unrolling. The functor assign has two arguments, thus allowing for customization. The problem is the assignment operator. In principle, we can define an explicit template argument there:

template <unsigned BSize, typename Src>
vector& operator=(const Src& that)
{
    check_size(size(that));
    unsigned s= my_size, sb= s / BSize * BSize;
    for (unsigned i= 0; i < sb; i+= BSize)
        assign<0, BSize>()(*this, that, i);
    for (unsigned i= sb; i < s; i++)
        data[i]= that[i];
    return *this;
}

The drawback is that we cannot use the symbol '=' naturally as an infix operator but must write:

u.operator=<4>(v + v + w);

This has in fact a certain geeky charm, and one could also argue that people did (and still do) more painful things for performance. Nonetheless, it does not meet our ideals of intuitiveness and readability.

Alternative notations are:

unroll<4>(u= v + v + w);

or

unroll<4>(u)= v + v + w;

Both versions are implementable and comparably intuitive. The former expresses more correctly what we are doing, while the latter is easier to implement and the structure of the computed expression remains better visible. Therefore we show the realization of the second form.

The function unroll is simple to implement: it just returns an object with a reference to the vector and the unroll size as type information:

template <unsigned BSize, typename Vector>
unroll_vector<BSize, Vector> inline unroll(Vector& v)
{
    return unroll_vector<BSize, Vector>(v);
}

The class unroll_vector is not complicated either. It only needs to take a reference to the target vector and to provide an assignment operator:

template <unsigned BSize, typename V>
class unroll_vector
{
  public:
    unroll_vector(V& ref) : ref(ref) {}

    template <typename Src>
    V& operator=(const Src& that)
    {
        assert(size(ref) == size(that));
        unsigned s= size(ref), sb= s / BSize * BSize;
        for (unsigned i= 0; i < sb; i+= BSize)
            assign<0, BSize>()(ref, that, i);
        for (unsigned i= sb; i < s; i++)
            ref[i]= that[i];
        return ref;
    }
  private:
    V& ref;
};

Evaluating the considered vector expression for some block sizes yields:

Compute time unroll(u)= v + v + w is 1.72 µs.<br />

Compute time unroll(u)= v + v + w is 1.52 µs.<br />

Compute time unroll(u)= v + v + w is 1.36 µs.<br />

Compute time unroll(u)= v + v + w is 1.37 µs.<br />

Compute time unroll(u)= v + v + w is 1.4 µs.<br />

These few benchmarks are consistent with the previous results, i.e. unroll<1> is equal to the canonical implementation and unroll<4> is as fast as the hard-wired unrolling.

5.4.6 Tuning Reduction Operations<br />

Reducing on a Single Variable<br />

⇒ reduction_unroll_example.cpp

In the preceding vector operations, the i-th entry of each vector was handled independently of any other entry. In reduction operations, they are related by one or more temporary variables, and these temporary variables can become a serious bottleneck.
First, we test whether a reduction operation, say the discrete L1 norm (also known as the Manhattan norm), can be accelerated by the techniques from Section 5.4.4. We implement the one_norm function in terms of a functor for the iteration block:

template <unsigned BSize, typename Vector>
typename Vector::value_type
inline one_norm(const Vector& v)
{
    using std::abs;
    typename Vector::value_type sum(0);
    unsigned s= size(v), sb= s / BSize * BSize;

    for (unsigned i= 0; i < sb; i+= BSize)
        one_norm_ftor<0, BSize>()(sum, v, i);
    for (unsigned i= sb; i < s; i++)
        sum+= abs(v[i]);
    return sum;
}



The functor is also implemented in the same manner as before:

template <unsigned Offset, unsigned Max>
struct one_norm_ftor
{
    template <typename S, typename V>
    void operator()(S& sum, const V& v, unsigned i)
    {
        using std::abs;
        sum+= abs(v[i+Offset]);
        one_norm_ftor<Offset+1, Max>()(sum, v, i);
    }
};

template <unsigned Max>
struct one_norm_ftor<Max, Max>
{
    template <typename S, typename V>
    void operator()(S& sum, const V& v, unsigned i) {}
};

The measured run-time behavior is:

Compute time one_norm(v) is 7.42 µs.<br />

Compute time one_norm(v) is 3.64 µs.<br />

Compute time one_norm(v) is 1.9 µs.<br />

Compute time one_norm(v) is 1.25 µs.<br />

Compute time one_norm(v) is 1.03 µs.<br />

This is already a good improvement, but maybe we can do better.23

Reducing on an Array<br />

⇒ reduction_unroll_array_example.cpp

When we look at the previous computation, we see that a different entry of v is used in each iteration. But every computation accesses the same temporary variable sum, and this limits concurrency. To provide more concurrency, we can use multiple temporaries,24 for instance in an array. The modified function then reads:

template <unsigned BSize, typename Vector>
typename Vector::value_type
inline one_norm(const Vector& v)
{
    using std::abs;
    typename Vector::value_type sum[BSize];
    for (unsigned i= 0; i < BSize; i++)
        sum[i]= 0;

23 TODO: Test it with gcc 3.4 and MSVC. Speed up in table<br />

24 Strictly speaking, this is not true for every possible scalar type we can think of. The addition of the sum type must be a commutative monoid because we change the evaluation order. This holds of course for all intrinsic numeric types and certainly for almost all user-defined arithmetic types. But one is free to define an addition that is not commutative or not monoidal; in this case our transformation would be wrong. To deal with such exceptions we need semantic concepts, which will hopefully become part of C++ in the coming years.


    unsigned s= size(v), sb= s / BSize * BSize;
    for (unsigned i= 0; i < sb; i+= BSize)
        one_norm_ftor<0, BSize>()(sum, v, i);
    for (unsigned i= 1; i < BSize; i++)
        sum[0]+= sum[i];
    for (unsigned i= sb; i < s; i++)
        sum[0]+= abs(v[i]);
    return sum[0];
}

The corresponding functor must refer to the right element of the sum array:

template <unsigned Offset, unsigned Max>
struct one_norm_ftor
{
    template <typename S, typename V>
    void operator()(S* sum, const V& v, unsigned i)
    {
        using std::abs;
        sum[Offset]+= abs(v[i+Offset]);
        one_norm_ftor<Offset+1, Max>()(sum, v, i);
    }
};

template <unsigned Max>
struct one_norm_ftor<Max, Max>
{
    template <typename S, typename V>
    void operator()(S* sum, const V& v, unsigned i) {}
};

On the test machine this took:<br />

Compute time one_norm(v) is 7.33 µs.<br />

Compute time one_norm(v) is 5.15 µs.<br />

Compute time one_norm(v) is 2 µs.<br />

Compute time one_norm(v) is 1.4 µs.<br />

Compute time one_norm(v) is 1.16 µs.<br />

This is even a bit slower than the version with one variable. Maybe an array is more expensive to pass as an argument, even to an inline function. Let us try something else.

Reducing on a Nested Class Object<br />

⇒ reduction_unroll_nesting_example.cpp

To avoid arrays, we can define a class for n temporary variables, where n is a template argument. Such a class is designed more consistently with the recursive scheme of the functors:

template <unsigned Size, typename Value>
struct multi_tmp
{
    typedef multi_tmp<Size-1, Value> sub_type;

    multi_tmp(const Value& v) : value(v), sub(v) {}

    Value value;
    sub_type sub;
};

template <typename Value>
struct multi_tmp<0, Value>
{
    multi_tmp(const Value& v) {}
};

An object of this type can be initialized recursively, so we do not need a loop as for the array. A functor can operate on the value member and pass a reference to the sub member to its successor. This leads us to the implementation of our functor:

template <unsigned Offset, unsigned Max>
struct one_norm_ftor
{
    template <typename S, typename V>
    void operator()(S& sum, const V& v, unsigned i)
    {
        using std::abs;
        sum.value+= abs(v[i+Offset]);
        one_norm_ftor<Offset+1, Max>()(sum.sub, v, i);
    }
};

template <unsigned Max>
struct one_norm_ftor<Max, Max>
{
    template <typename S, typename V>
    void operator()(S& sum, const V& v, unsigned i) {}
};

The unrolled function that uses this functor reads:<br />

template <unsigned BSize, typename Vector>
typename Vector::value_type
inline one_norm(const Vector& v)
{
    using std::abs;
    typedef typename Vector::value_type value_type;
    multi_tmp<BSize, value_type> multi_sum(0);
    unsigned s= size(v), sb= s / BSize * BSize;
    for (unsigned i= 0; i < sb; i+= BSize)
        one_norm_ftor<0, BSize>()(multi_sum, v, i);
    value_type sum= multi_sum.sum();
    for (unsigned i= sb; i < s; i++)
        sum+= abs(v[i]);
    return sum;
}

There is one piece still missing: we need to reduce the partial sums in multi_sum. Unfortunately, we cannot write a loop over the members of multi_sum. So we need a recursive function that dives down into multi_sum. This would be a bit cumbersome as a free function, especially as we try to avoid partial specialization of function templates. As a member function it is much easier, and the specialization happens more safely at the class level:

template <unsigned Size, typename Value>
struct multi_tmp
{
    Value sum() const { return value + sub.sum(); }
};

template <typename Value>
struct multi_tmp<0, Value>
{
    Value sum() const { return 0; }
};

Note that we started the summation with 0, not with the innermost value member. We could do the latter, but then we would need another specialization for multi_tmp<1, Value>. Likewise, we can implement a general reduction, but as in std::accumulate we need an initial element:

template <unsigned Size, typename Value>
struct multi_tmp
{
    template <typename Op>
    Value reduce(Op op, const Value& init) const { return op(value, sub.reduce(op, init)); }
};

template <typename Value>
struct multi_tmp<0, Value>
{
    template <typename Op>
    Value reduce(Op, const Value& init) const { return init; }
};

The compute times of this version are:

Compute time one_norm(v) is 7.47 µs.
Compute time one_norm(v) is 1.14 µs.
Compute time one_norm(v) is 0.71 µs.
Compute time one_norm(v) is 0.75 µs.
Compute time one_norm(v) is 1.01 µs.
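For readers who want to experiment, the whole unrolling machinery can be condensed into one self-contained sketch. The names multi_tmp, one_norm_ftor, and one_norm follow the text; the data members, the constructors, and the use of std::vector with v.size() instead of the free function size(v) are assumptions made only to keep the example compilable on its own:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Compile-time "registers" of partial sums, as in the text
template <unsigned Size, typename Value>
struct multi_tmp
{
    typedef multi_tmp<Size-1, Value> sub_type;
    multi_tmp(const Value& v) : value(v), sub(v) {}
    Value sum() const { return value + sub.sum(); }
    Value    value;
    sub_type sub;
};

template <typename Value>
struct multi_tmp<0, Value>
{
    multi_tmp(const Value&) {}
    Value sum() const { return 0; }
};

// Accumulates v[i+Offset] into the partial sum and recurses into the sub-object
template <unsigned Offset, unsigned Max>
struct one_norm_ftor
{
    template <typename S, typename V>
    void operator()(S& sum, const V& v, unsigned i)
    {
        using std::abs;
        sum.value+= abs(v[i + Offset]);
        one_norm_ftor<Offset+1, Max>()(sum.sub, v, i);
    }
};

template <unsigned Max>
struct one_norm_ftor<Max, Max>
{
    template <typename S, typename V>
    void operator()(S&, const V&, unsigned) {}
};

template <unsigned BSize, typename Vector>
typename Vector::value_type inline one_norm(const Vector& v)
{
    using std::abs;
    typedef typename Vector::value_type value_type;
    multi_tmp<BSize, value_type> multi_sum(0);
    unsigned s= unsigned(v.size()), sb= s / BSize * BSize;
    for (unsigned i= 0; i < sb; i+= BSize)
        one_norm_ftor<0, BSize>()(multi_sum, v, i);
    value_type sum= multi_sum.sum();
    for (unsigned i= sb; i < s; i++)   // cleanup of the remainder
        sum+= abs(v[i]);
    return sum;
}
```

The result must be independent of BSize; checking one_norm<4> against one_norm<8> on a short vector exercises both the unrolled loop and the cleanup loop.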

Pushing Temporaries into Registers

⇒ reduction_unroll_registers_example.cpp


5.4. META-TUNING: WRITE YOUR OWN COMPILER OPTIMIZATION 177

Earlier experiments with older compilers (gcc 3.4) 25 exposed a serious overhead for using arrays or nested classes; in the end it was even slower than using one single variable. The reason was probably that the compiler could not keep values of these types in registers. 26

The most reliable way to get temporaries stored in registers is to declare them as separate variables:

inline one_norm(const Vector& v)
{
    typename Vector::value_type s0(0), s1(0), s2(0), ...
}

As one can see, the problem is how many variables to declare. Their number cannot depend on the template argument but must be fixed for all sizes (unless one writes a different implementation for each number and thereby undermines the expressiveness of templates). Thus, we have to settle on a certain number of variables — say 8. Then we cannot unroll more than eight times.

The next issue we run into is the number of function arguments. When we call the iteration block we pass all variables (registers):

for (unsigned i= 0; i < sb; i+= BSize)
    one_norm_ftor<0, BSize>()(s0, s1, s2, s3, s4, s5, s6, s7, v, i);

The first calculation in such a block is performed on s0 while s1–s7 are only passed on to the functors for the following computations. After this, the second computation must accumulate into the second function argument, the third calculation into the third argument, and so on. This is unfortunately not implementable with templates (only with very ugly and highly error-prone source code manipulations by macros).

Alternatively, each computation could be performed on its first function argument and subsequent functors called with the first argument omitted:

one_norm_ftor<1, BSize>()(s1, s2, s3, s4, s5, s6, s7, v, i);
one_norm_ftor<2, BSize>()(s2, s3, s4, s5, s6, s7, v, i);
one_norm_ftor<3, BSize>()(s3, s4, s5, s6, s7, v, i);

This is not realizable with templates either. The solution is to rotate the references to the registers:

one_norm_ftor<1, BSize>()(s1, s2, s3, s4, s5, s6, s7, s0, v, i);
one_norm_ftor<2, BSize>()(s2, s3, s4, s5, s6, s7, s0, s1, v, i);
one_norm_ftor<3, BSize>()(s3, s4, s5, s6, s7, s0, s1, s2, v, i);

This rotation is achieved by the following functor implementation:

template <unsigned Offset, unsigned Max>
struct one_norm_ftor
{
    template <typename S, typename V>
    void operator()(S& s0, S& s1, S& s2, S& s3, S& s4, S& s5, S& s6, S& s7, const V& v, unsigned i)
    {
        using std::abs;
        s0+= abs(v[i + Offset]);
        one_norm_ftor<Offset+1, Max>()(s1, s2, s3, s4, s5, s6, s7, s0, v, i);
    }
};

template <unsigned Max>
struct one_norm_ftor<Max, Max>
{
    template <typename S, typename V>
    void operator()(S& s0, S& s1, S& s2, S& s3, S& s4, S& s5, S& s6, S& s7, const V& v, unsigned i) {}
};

25 TODO: Show!!!
26 TODO: which raises the question why they can do it today

The corresponding one_norm function based on this functor is straightforward:

template <unsigned BSize, typename Vector>
typename Vector::value_type
inline one_norm(const Vector& v)
{
    using std::abs;
    typename Vector::value_type s0(0), s1(0), s2(0), s3(0), s4(0), s5(0), s6(0), s7(0);
    unsigned s= size(v), sb= s / BSize * BSize;

    for (unsigned i= 0; i < sb; i+= BSize)
        one_norm_ftor<0, BSize>()(s0, s1, s2, s3, s4, s5, s6, s7, v, i);
    s0+= s1 + s2 + s3 + s4 + s5 + s6 + s7;

    for (unsigned i= sb; i < s; i++)
        s0+= abs(v[i]);
    return s0;
}

A slight disadvantage is that all eight registers must be accumulated after the blocked loop, no matter how small BSize is and how short the vector. A great advantage of the rotation is that BSize is not limited to the number of temporary variables: if BSize is larger, some or all variables are simply used multiple times without corrupting the result. The number of temporaries is nonetheless a limiting factor for the concurrency.

On the test machine, this implementation runs in:

Compute time one_norm(v) is 6.77 µs.
Compute time one_norm(v) is 1.13 µs.
Compute time one_norm(v) is 0.71 µs.
Compute time one_norm(v) is 0.75 µs.
Compute time one_norm(v) is 1.07 µs.

This is comparable with the nested class (in this environment).
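The register-rotating variant can likewise be condensed into a self-contained sketch. The functor and function names follow the text; the std::vector test interface (v.size() instead of size(v)) is an assumption:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Rotating functor: each instance accumulates into its first argument and
// passes the remaining registers, rotated by one, to the next instance
template <unsigned Offset, unsigned Max>
struct one_norm_ftor
{
    template <typename S, typename V>
    void operator()(S& s0, S& s1, S& s2, S& s3, S& s4, S& s5, S& s6, S& s7,
                    const V& v, unsigned i)
    {
        using std::abs;
        s0+= abs(v[i + Offset]);
        one_norm_ftor<Offset+1, Max>()(s1, s2, s3, s4, s5, s6, s7, s0, v, i);
    }
};

template <unsigned Max>
struct one_norm_ftor<Max, Max>
{
    template <typename S, typename V>
    void operator()(S&, S&, S&, S&, S&, S&, S&, S&, const V&, unsigned) {}
};

template <unsigned BSize, typename Vector>
typename Vector::value_type inline one_norm(const Vector& v)
{
    using std::abs;
    typename Vector::value_type s0(0), s1(0), s2(0), s3(0), s4(0), s5(0), s6(0), s7(0);
    unsigned s= unsigned(v.size()), sb= s / BSize * BSize;
    for (unsigned i= 0; i < sb; i+= BSize)
        one_norm_ftor<0, BSize>()(s0, s1, s2, s3, s4, s5, s6, s7, v, i);
    s0+= s1 + s2 + s3 + s4 + s5 + s6 + s7;  // combine all registers once
    for (unsigned i= sb; i < s; i++)        // cleanup of the remainder
        s0+= abs(v[i]);
    return s0;
}
```

Because of the rotation, a BSize of 16 works with only eight temporaries: the registers are reused without corrupting the result.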

Résumé on Reduction Tuning

The goal of this section was not to determine the ultimately tuned reduction implementation for superscalar processors. 27 The main ambition of this section, in fact of the whole book, is to demonstrate the diversity of implementation opportunities. With the enormous expressiveness of C++ one can use (or abuse) the compiler to generate the most efficient version without rewriting the program sources, as one would need to in C or Fortran. The power of internal code generation with the C++ compiler even makes external code generation as in ATLAS 28 unnecessary. In ATLAS, functions are written in a domain-specific language, and C programs 29 in slight variations are generated with a tool and compared regarding performance. The techniques presented here empower us to generate binaries equivalent to those variations by just using a C++ compiler. Thus, we can tune our programs by changing template arguments or constants (that might be set platform-dependently).

27 In the presence of the new GPU cards with hundreds of cores and millions of threads, the fight for this little concurrency is not so impressive. Nonetheless, we will still need performance tuning on single-core and “few-core” machines at least for some years, since not everybody has a GPU card for numerics and not every algorithm is already successfully ported (e.g. incomplete LU on arbitrary sparse matrices). At the time of this writing there is not even support for std::complex.
28 http://math-atlas.sourceforge.net/
29 In some cases the C programs contain assembler snippets for a given platform in order to achieve performance close to peak.

5.4.7 Tuning Nested Loops

⇒ matrix_unroll_example.cpp

The most used (and abused) example in performance discussions is dense matrix multiplication. We do not claim to compete with hand-tuned assembly codes, but we show the power of meta-programming to generate code variations from a single implementation. As starting point we use a templatized implementation of the matrix class from Section 3.7.4.

We begin our implementation with a simple test case:

int main()
{
    const unsigned s= 4; // s= 4 for testing and 128 for timing
    matrix<float> A(s, s), B(s, s), C(s, s);

    for (unsigned i= 0; i < s; i++)
        for (unsigned j= 0; j < s; j++) {
            A(i, j)= 100.0 * i + j;
            B(i, j)= 200.0 * i + j;
        }
    mult(A, B, C);
    std::cout << "C is " << C << '\n';
}

A matrix multiplication is easily implemented with three nested loops. One of the 6 possible loop nestings is a dot-product-like calculation of each entry of C:

    $c_{ik} = A_i \cdot B_k$

where $A_i$ is the i-th row of A and $B_k$ the k-th column of B. We use a temporary in the innermost loop to decrease the cache-invalidation overhead of writing to C's elements in each operation:

template <typename Matrix>
void inline mult(const Matrix& A, const Matrix& B, Matrix& C)
{
    assert(A.num_rows() == B.num_rows()); // ...
    typedef typename Matrix::value_type value_type;
    unsigned s= A.num_rows();

    for (unsigned i= 0; i < s; i++)
        for (unsigned k= 0; k < s; k++) {
            value_type tmp(0);
            for (unsigned j= 0; j < s; j++)
                tmp+= A(i, j) * B(j, k);
            C(i, k)= tmp;
        }
}

For this implementation, we write a benchmark function:

template <typename Matrix>
void bench(const Matrix& A, const Matrix& B, Matrix& C, const unsigned rep)
{
    boost::timer t1;
    for (unsigned j= 0; j < rep; j++)
        mult(A, B, C);
    double t= t1.elapsed() / double(rep);
    unsigned s= A.num_rows();
    std::cout << "Compute time mult(A, B, C) is "
              << 1000000.0 * t << " µs. This are "
              << s * s * (2*s - 1) / t / 1000000.0 << " MFlops.\n";
}

The run time and performance of our canonical implementation (with 128 × 128 matrices) is:

Compute time mult(A, B, C) is 5290 µs. This are 789.777 MFlops.

This implementation is our reference regarding performance and results.
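The canonical triple loop is easy to test in isolation. The following sketch supplies a minimal stand-in for the matrix class of Section 3.7.4 — only the interface used here (num_rows(), num_cols(), operator()(i, j)) is assumed — together with the canonical mult:

```cpp
#include <cassert>
#include <vector>

// Minimal dense matrix, row-major storage; an assumed stand-in for the
// matrix class of Section 3.7.4
template <typename Value>
class matrix
{
  public:
    typedef Value value_type;
    matrix(unsigned r, unsigned c) : nr(r), nc(c), data(r * c, Value(0)) {}
    unsigned num_rows() const { return nr; }
    unsigned num_cols() const { return nc; }
    Value& operator()(unsigned i, unsigned j)       { return data[i * nc + j]; }
    Value  operator()(unsigned i, unsigned j) const { return data[i * nc + j]; }
  private:
    unsigned nr, nc;
    std::vector<Value> data;
};

// Canonical triple loop as in the text
template <typename Matrix>
void inline mult(const Matrix& A, const Matrix& B, Matrix& C)
{
    assert(A.num_rows() == B.num_rows()); // quadratic matrices only, as in the text
    typedef typename Matrix::value_type value_type;
    unsigned s= A.num_rows();
    for (unsigned i= 0; i < s; i++)
        for (unsigned k= 0; k < s; k++) {
            value_type tmp(0);            // temporary avoids rewriting C(i, k) in every step
            for (unsigned j= 0; j < s; j++)
                tmp+= A(i, j) * B(j, k);
            C(i, k)= tmp;
        }
}
```

The bench function above can be pointed at this mult unchanged, provided boost::timer is available.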

For the development of the unrolled implementation we go back to 4 × 4 matrices. In contrast to Section 5.4.6, we do not unroll a single reduction but perform multiple reductions in parallel. For the three loops this means unrolling the two outer loops and replacing the body of the inner loop with multiple operations. The latter we achieve, as usual, with a functor.

As in the canonical implementation, the reduction shall not be performed in elements of C but in temporaries. For this purpose we use the class multi_tmp from § 5.4.6. For the sake of simplicity we limit ourselves to matrix sizes that are multiples of the unroll parameters. 30 An unrolled matrix multiplication is shown in the following code:

template <unsigned Size0, unsigned Size1, typename Matrix>
void inline mult(const Matrix& A, const Matrix& B, Matrix& C)
{
    assert(A.num_rows() == B.num_rows()); // ...
    assert(A.num_rows() % Size0 == 0);    // we omitted cleanup here
    assert(A.num_cols() % Size1 == 0);    // we omitted cleanup here
    typedef typename Matrix::value_type value_type;
    unsigned s= A.num_rows();

    mult_block<0, Size0-1, 0, Size1-1> block;
    for (unsigned i= 0; i < s; i+= Size0)
        for (unsigned k= 0; k < s; k+= Size1) {
            multi_tmp<Size0 * Size1, value_type> tmp(value_type(0));
            for (unsigned j= 0; j < s; j++)
                block(tmp, A, B, i, j, k);
            block.update(tmp, C, i, k);
        }
}

30 A full implementation for arbitrary matrix sizes is realized in MTL4.

We still owe the reader the implementation of the functor mult_block. The techniques are the same as in the vector operations, but we have to deal with more indices and their respective limits:

template <unsigned Index0, unsigned Max0, unsigned Index1, unsigned Max1>
struct mult_block
{
    typedef mult_block<Index0, Max0, Index1+1, Max1> next;

    template <typename Tmp, typename Matrix>
    void operator()(Tmp& tmp, const Matrix& A, const Matrix& B, unsigned i, unsigned j, unsigned k)
    {
        std::cout << "tmp." << tmp.bs << "+= A[" << i + Index0 << "][" << j << "] * B["
                  << j << "][" << k + Index1 << "]\n";
        tmp.value+= A(i + Index0, j) * B(j, k + Index1);
        next()(tmp.sub, A, B, i, j, k);
    }

    template <typename Tmp, typename Matrix>
    void update(const Tmp& tmp, Matrix& C, unsigned i, unsigned k)
    {
        std::cout << "C[" << i + Index0 << "][" << k + Index1 << "]= tmp." << tmp.bs << "\n";
        C(i + Index0, k + Index1)= tmp.value;
        next().update(tmp.sub, C, i, k);
    }
};

template <unsigned Index0, unsigned Max0, unsigned Max1>
struct mult_block<Index0, Max0, Max1, Max1>
{
    typedef mult_block<Index0+1, Max0, 0, Max1> next;

    template <typename Tmp, typename Matrix>
    void operator()(Tmp& tmp, const Matrix& A, const Matrix& B, unsigned i, unsigned j, unsigned k)
    {
        std::cout << "tmp." << tmp.bs << "+= A[" << i + Index0 << "][" << j << "] * B["
                  << j << "][" << k + Max1 << "]\n";
        tmp.value+= A(i + Index0, j) * B(j, k + Max1);
        next()(tmp.sub, A, B, i, j, k);
    }

    template <typename Tmp, typename Matrix>
    void update(const Tmp& tmp, Matrix& C, unsigned i, unsigned k)
    {
        std::cout << "C[" << i + Index0 << "][" << k + Max1 << "]= tmp." << tmp.bs << "\n";
        C(i + Index0, k + Max1)= tmp.value;
        next().update(tmp.sub, C, i, k);
    }
};

template <unsigned Max0, unsigned Max1>
struct mult_block<Max0, Max0, Max1, Max1>
{
    template <typename Tmp, typename Matrix>
    void operator()(Tmp& tmp, const Matrix& A, const Matrix& B, unsigned i, unsigned j, unsigned k)
    {
        std::cout << "tmp." << tmp.bs << "+= A[" << i + Max0 << "][" << j << "] * B["
                  << j << "][" << k + Max1 << "]\n";
        tmp.value+= A(i + Max0, j) * B(j, k + Max1);
    }

    template <typename Tmp, typename Matrix>
    void update(const Tmp& tmp, Matrix& C, unsigned i, unsigned k)
    {
        std::cout << "C[" << i + Max0 << "][" << k + Max1 << "]= tmp." << tmp.bs << "\n";
        C(i + Max0, k + Max1)= tmp.value;
    }
};

In order to verify that all operations are performed, we log them completely; here we look only at the output for tmp.4 and tmp.3:

tmp.4+= A[1][0] * B[0][0]
tmp.3+= A[1][0] * B[0][1]
tmp.4+= A[1][1] * B[1][0]
tmp.3+= A[1][1] * B[1][1]
tmp.4+= A[1][2] * B[2][0]
tmp.3+= A[1][2] * B[2][1]
tmp.4+= A[1][3] * B[3][0]
tmp.3+= A[1][3] * B[3][1]
C[1][0]= tmp.4
C[1][1]= tmp.3
tmp.4+= A[3][0] * B[0][0]
tmp.3+= A[3][0] * B[0][1]
tmp.4+= A[3][1] * B[1][0]
tmp.3+= A[3][1] * B[1][1]
tmp.4+= A[3][2] * B[2][0]
tmp.3+= A[3][2] * B[2][1]
tmp.4+= A[3][3] * B[3][0]
tmp.3+= A[3][3] * B[3][1]
C[3][0]= tmp.4
C[3][1]= tmp.3


This log shows that C[1][0] and C[1][1] are computed alternately, so that the computation can be performed in parallel on a super-scalar processor. One can also verify that

    $c_{ik} = \sum_{j=0}^{3} a_{ij} b_{jk}.$

Printing C will also show the same result as for the canonical matrix multiplication.

The implementation above can be simplified: the first functor specialization differs from the general functor only in how the indices are incremented. We can factor this out with an additional loop class:

template <unsigned Index0, unsigned Max0, unsigned Index1, unsigned Max1>
struct loop2
{
    static const unsigned next_index0= Index0, next_index1= Index1 + 1;
};

template <unsigned Index0, unsigned Max0, unsigned Max1>
struct loop2<Index0, Max0, Max1, Max1>
{
    static const unsigned next_index0= Index0 + 1, next_index1= 0;
};
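loop2 is a pure compile-time device, so its index progression can be checked directly. The sketch below repeats the class as given in the text; the concrete test values are assumptions:

```cpp
#include <cassert>

// 2D index advance: step the inner index, and wrap to the next row as soon as
// the inner index has reached its limit Max1
template <unsigned Index0, unsigned Max0, unsigned Index1, unsigned Max1>
struct loop2
{
    static const unsigned next_index0= Index0, next_index1= Index1 + 1;
};

template <unsigned Index0, unsigned Max0, unsigned Max1>
struct loop2<Index0, Max0, Max1, Max1>
{
    static const unsigned next_index0= Index0 + 1, next_index1= 0;
};
```

Within mult_block, (Index0, Index1) thus sweeps the unrolled block in row-major order, exactly as the two hand-written specializations did before.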

Such a general class has a high potential for reuse. With it we can fuse the functor template and the first specialization:

template <unsigned Index0, unsigned Max0, unsigned Index1, unsigned Max1>
struct mult_block
{
    typedef loop2<Index0, Max0, Index1, Max1> l;
    typedef mult_block<l::next_index0, Max0, l::next_index1, Max1> next;

    template <typename Tmp, typename Matrix>
    void operator()(Tmp& tmp, const Matrix& A, const Matrix& B, unsigned i, unsigned j, unsigned k)
    {
        std::cout << "tmp." << tmp.bs << "+= A[" << i + Index0 << "][" << j << "] * B["
                  << j << "][" << k + Index1 << "]\n";
        tmp.value+= A(i + Index0, j) * B(j, k + Index1);
        next()(tmp.sub, A, B, i, j, k);
    }

    template <typename Tmp, typename Matrix>
    void update(const Tmp& tmp, Matrix& C, unsigned i, unsigned k)
    {
        std::cout << "C[" << i + Index0 << "][" << k + Index1 << "]= tmp." << tmp.bs << "\n";
        C(i + Index0, k + Index1)= tmp.value;
        next().update(tmp.sub, C, i, k);
    }
};

The other specialization remains unaltered.

Last but not least, we would like to see the impact of our not-so-simple matrix product. The benchmark yielded on our test machine:

Compute time mult(A, B, C) is 5250 µs. This are 795.794 MFlops.
Compute time mult(A, B, C) is 2770 µs. This are 1508.27 MFlops.
Compute time mult(A, B, C) is 1990 µs. This are 2099.46 MFlops.
Compute time mult(A, B, C) is 2230 µs. This are 1873.51 MFlops.
Compute time mult(A, B, C) is 2130 µs. This are 1961.46 MFlops.
Compute time mult(A, B, C) is 2930 µs. This are 1425.91 MFlops.
Compute time mult(A, B, C) is 2350 µs. This are 1777.84 MFlops.
Compute time mult(A, B, C) is 3420 µs. This are 1221.61 MFlops.
Compute time mult(A, B, C) is 4010 µs. This are 1041.88 MFlops.
Compute time mult(A, B, C) is 2870 µs. This are 1455.72 MFlops.
Compute time mult(A, B, C) is 3230 µs. This are 1293.47 MFlops.
Compute time mult(A, B, C) is 3060 µs. This are 1365.33 MFlops.
Compute time mult(A, B, C) is 2780 µs. This are 1502.85 MFlops.

One can see that the 1 × 1 unrolling has the same performance as the original implementation, which in fact performs the operations in exactly the same order (as far as the compiler optimization does not change the order internally). We also see that the unrolled versions are all faster, with a speed-up of up to 2.6.

With double matrices the performance is lower overall:

Compute time mult(A, B, C) is 10080 µs. This are 414.476 MFlops.
Compute time mult(A, B, C) is 8700 µs. This are 480.221 MFlops.
Compute time mult(A, B, C) is 7470 µs. This are 559.293 MFlops.
Compute time mult(A, B, C) is 5910 µs. This are 706.924 MFlops.
Compute time mult(A, B, C) is 3750 µs. This are 1114.11 MFlops.
Compute time mult(A, B, C) is 5140 µs. This are 812.825 MFlops.
Compute time mult(A, B, C) is 3420 µs. This are 1221.61 MFlops.
Compute time mult(A, B, C) is 4590 µs. This are 910.222 MFlops.
Compute time mult(A, B, C) is 4310 µs. This are 969.355 MFlops.
Compute time mult(A, B, C) is 6280 µs. This are 665.274 MFlops.
Compute time mult(A, B, C) is 5310 µs. This are 786.802 MFlops.
Compute time mult(A, B, C) is 4290 µs. This are 973.874 MFlops.
Compute time mult(A, B, C) is 3490 µs. This are 1197.11 MFlops.

It shows that here other parametrizations yield more acceleration and that the performance can almost be tripled.

Which configuration is best and why is — as mentioned before — not the topic of this script; we only show programming techniques. The reader is invited to try this program on his/her own computer. The technique in this section is intended for L1 cache usage. If matrices are larger, one should use more levels of blocking. A general-purpose methodology for locality on L2, L3, main memory, local disk, ... is recursion. This avoids reimplementation for each cache size and performs even reasonably well in virtual memory; see for instance [?].


5.5 Exercises

5.5.1 Vector class

Revisit the vector example from §??.

Make an expression for a scalar times a vector:

class scalar_times_vector_expression {
};

that inherits from base_vector. Use the inheritance mechanism to assign scalar_times_vector_expression objects to vector.

5.5.2 Vector expression template

Make a vector concept, which you call Vector. Make a vector class (you can use std::vector) that satisfies this concept. This vector class should have at least the following members:

class my_vector {
  public:
    typedef double value_type;

    my_vector( int n );

    // Copy constructor from the type itself
    my_vector( my_vector& );

    // Constructor from generic vector
    template <typename Vector>
    my_vector( Vector& );

    // Assignment operator
    my_vector& operator=( my_vector const& v );

    // Assignment for generic Vector
    template <typename Vector>
    my_vector& operator=( Vector const& v );

    value_type& operator()( int i );

  public: // Vector concept
    int size() const;
    value_type operator()( int i ) const;
};

Make an expression for a scalar times a vector:

template <typename Scalar, typename Vector>
class scalar_times_vector_expression {
};

template <typename Scalar, typename Vector>
scalar_times_vector_expression<Scalar, Vector> operator*( Scalar const& s, Vector const& v ) {
    return scalar_times_vector_expression<Scalar, Vector>( s, v );
}

Put all classes and functions in the namespace athens. You can also make an expression template for the addition of two vectors.

Write a small program, e.g.:

int main() {
    athens::my_vector v( 5 );
    // ... fill in some values of v ...
    athens::my_vector w( 5 );
    w = 5.0 * v;
    w = 5.0 * (7.0 * v);
    w = v + 7.0 * v; // (if you have added the operator+)
}

Use the debugger to see what happens.


Chapter 6

Inheritance

C++ is a multi-paradigm language, and the paradigm most strongly associated with C++ is ‘Object-Oriented Programming’ (OOP). The authors feel nevertheless that it is not the most important paradigm for scientific programming because it is inferior to generic programming for two major reasons:

• Flexibility and
• Performance.

However, the impact of these two disadvantages is negligible in some situations. The performance only deteriorates when we use virtual functions (§ 6.1). OOP in combination with generic programming is a very powerful mechanism to provide a form of reusability that neither of the paradigms can provide on its own (§ 6.3–§ 6.5).

6.1 Basic Principles

See section ?? from page ?? to page ??.

6.2 Dynamic Selection by Sub-typing

As a case study, consider the solver base class and the way solvers are selected in AMDiS. The MTL4 solvers are generic functions, whereas AMDiS is only slightly generic and makes many decisions at run time (by means of pointers and virtual functions). So we needed a way to call the generic functions while deciding at run time which one. Such a dynamic solver selection can be done with classical C features like:

#include <iostream>
#include <cstdlib>

class matrix {};
class vector {};

void cg(const matrix& A, const vector& b, vector& x)
{
    std::cout << "CG\n";
}

void bicg(const matrix& A, const vector& b, vector& x)
{
    std::cout << "BiCG\n";
}

int main (int argc, char* argv[])
{
    matrix A;
    vector b, x;

    switch (std::atoi(argv[1])) {
        case 0: cg(A, b, x); break;
        case 1: bicg(A, b, x); break;
    }
    return 0;
}

This works, but it is not scalable with respect to source code complexity. If we call the solver with other vectors and matrices somewhere else, we must copy the whole switch-case block for each argument combination. This can be avoided by encapsulating the block into a function and calling that function with different arguments. More complicated is dealing with different preconditioners (diagonal, ILU, IC, ...) that are also selected dynamically. Shall we copy a switch block for the preconditioners into each case block of the solvers?

An elegant solution is an abstract solver class and derived classes for the solvers:

struct solver
{
    virtual void operator()(const matrix& A, const vector& b, vector& x)= 0;
    virtual ~solver() {}
};

// potentially templatize
struct cg_solver : solver
{
    void operator()(const matrix& A, const vector& b, vector& x) { cg(A, b, x); }
};

struct bicg_solver : solver
{
    void operator()(const matrix& A, const vector& b, vector& x) { bicg(A, b, x); }
};

In the application we can define one or multiple pointers of type solver* and assign them the desired solver:

// Factory
solver* my_solver= 0;
switch (std::atoi(argv[1])) {
    case 0: my_solver= new cg_solver; break;
    case 1: my_solver= new bicg_solver; break;
}

This idea is discussed thoroughly in the design patterns book [?] as the factory pattern. Once we have defined a pointer of such an abstract class (also called an interface), we can call it directly:

(*my_solver)(A, b, x);

Without going into detail, we can have multiple factories and use the pointers together without a combinatorial explosion in the program sources:

// Preconditioner factory
precon* my_precon= 0;
switch (std::atoi(argv[2])) { ... }

(*my_solver)(*my_precon, A, b, x);

C++ does not allow virtual template functions because this would make compiler implementations very complicated (to avoid infinite function pointer tables). However, template classes can have virtual functions. This enables generic programming with virtual functions by templatizing the entire class instead of single member functions.
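The whole selection mechanism fits into a few lines. In the sketch below, the solver hierarchy follows the text; returning a string instead of printing, and the helper function make_solver, are assumptions added only to make the behavior testable:

```cpp
#include <cassert>
#include <string>

class matrix {};
class vector {};

// Stand-in solvers, as in the text, but reporting their name instead of printing it
std::string cg(const matrix&, const vector&, vector&)   { return "CG"; }
std::string bicg(const matrix&, const vector&, vector&) { return "BiCG"; }

struct solver
{
    virtual std::string operator()(const matrix& A, const vector& b, vector& x)= 0;
    virtual ~solver() {}
};

struct cg_solver : solver
{
    std::string operator()(const matrix& A, const vector& b, vector& x) { return cg(A, b, x); }
};

struct bicg_solver : solver
{
    std::string operator()(const matrix& A, const vector& b, vector& x) { return bicg(A, b, x); }
};

// Factory: map a run-time choice to a solver object
solver* make_solver(int choice)
{
    switch (choice) {
        case 0:  return new cg_solver;
        default: return new bicg_solver;
    }
}
```

In a real application, the factory argument would come from argv, exactly as in the switch block above.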

6.3 Remove Redundancy With Base Classes

Base classes can factor out common implementation, especially when no run-time type information is involved.

6.4 Casting Up and Down and Elsewhere

In C++, there are four different cast operators:

• static_cast;
• dynamic_cast;
• const_cast; and
• reinterpret_cast.

Its linguistic root C knew only one casting operator: ‘(type) expr’. The trouble with this single operator is that it is not standardized or clearly defined which conversion is performed under which conditions. As a consequence, the behavior of the cast can change from compiler to compiler. C++ still allows this old-style casting, but all C++ experts agree on discouraging its use. Another quite important issue is that this notation is not easy to find in large code bases (there is no regular expression that filters out all C casts), which significantly increases the maintenance costs; see also the discussion in [SA05, chapter 95]. In this section, we will show you the different cast operators and discuss the pros and cons of different casts in different contexts.


6.4.1 Casting Between Base and Derived Classes

Casting Up

⇒ up_down_cast_example.cpp

Casting up, i.e. from a derived to a base class, is always possible if there are no ambiguities, and it can even be performed implicitly. Assume we have the following class structure: 1

struct A
{
    virtual void f(){}
    virtual ~A(){}
    int ma;
};

struct B : A { float mb; };
struct C : A {};
struct D : B, C {};

and the following unary functions:

void f(A a) { /* ... */ }
void g(A& a) { /* ... */ }
void h(A* a) { /* ... */ }

An object of type B can be passed to all three functions:

int main (int argc, char* argv[])
{
    B b;
    f(b);
    g(b);
    h(&b);
    return 0;
}

In all three cases the object b is implicitly converted to an object of type A. The call of function f is, however, a bit different: only b's members within class A are copied into the function argument, and the remainder — in our example the member mb — is not accessible in f by any means. The functions g and h refer to an object of type A by reference or pointer. If an object of a derived class is passed to one of those functions, the other members are in principle still there but hidden. One could still access them by down-casting the argument inside the function. Before we down-cast, we should ask ourselves the following questions:

• How do we assure that the argument passed to the function really is an object of the derived class? For instance with extra arguments or with run-time tests.
• What can we do if the object cannot be down-casted?
• Can we write a function directly for the derived class?
• Why do we not overload the function for the base and the derived type? This is definitely a much cleaner design and always feasible.

1 TODO: picture


Up-casting only fails if the base class is ambiguous. In the current example we cannot up-cast from D to A:

D d;
A ad(d); // error: ambiguous

because the compiler does not know whether we mean the base class A from B or from C. We can clarify this with an explicit intermediate up-cast:

A ad(B(d));

Or we can share A between B and C: 2

struct B : virtual A { float mb; };
struct C : virtual A {};

Now the members of A exist only once in D. This is probably the best solution for multiple inheritance in most cases, because we save memory and need not pay attention to which replica of A is accessed.
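That the virtual base really is shared can be demonstrated with a few asserts. The classes below are those of the text; the member value and the pointer comparison are assumptions made for the test:

```cpp
#include <cassert>

// Diamond with a shared (virtual) base: D now contains exactly one A sub-object
struct A { virtual ~A() {} int ma; };
struct B : virtual A { float mb; };
struct C : virtual A {};
struct D : B, C {};
```

With non-virtual bases, the reference initialization A& a= d below would not even compile because of the ambiguity discussed above.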

Casting Down

There are situations where references or pointers are casted down, e.g. in the next section (§ 6.5). This can be performed with static_cast or dynamic_cast. As the names suggest, static_cast is statically type-checked at compile time, whereas dynamic_cast performs run-time tests (with only minimal compile-time checks). We still use our diamond-shaped class hierarchy A–D as case study. Now we introduce two pointers of type B* holding objects of types B and D:

B *bbp= new B, *bdp= new D;

When we cast these pointers down to D*, dynamic_cast verifies whether the referred object actually allows this cast. Since this information is in general only known at run time, e.g.:

B *bxp= argc > 1 ? new B : new D;

dynamic_cast must verify the referred object's type with run-time type information (RTTI). Performing an incorrect cast yields a null pointer:

D* dbp= dynamic_cast<D*>(bbp); // error: cannot downcast from B to D
D* ddp= dynamic_cast<D*>(bdp); // ok: bdp points to an object of type D
std::cout << "Dynamic downcast of bbp should fail and pointer should be 0, it is: " << dbp << '\n';
std::cout << "Dynamic downcast of bdp should succeed and pointer should not be 0, it is: " << ddp << '\n';

The programmer can check whether the pointer is zero and react to the failed downcast. Likewise, an incorrect down-cast of a reference throws an exception of type std::bad_cast, which can be handled in a try-catch block.

In contrast, static_cast only verifies that the target type is a derived class of the source type — respectively references or pointers thereof — or vice versa:

2 TODO: picture


192 CHAPTER 6. INHERITANCE<br />

dbp= static cast(bbp); // erroneous downcast per<strong>for</strong>med<br />

ddp= static cast(bdp); // correct downcast but not checked by the system<br />

std::cout << "Erroneous downcast of bbp will not return 0, it is: " << dbp << '\n';<br />

std::cout << "Correct downcast of bdp but not checked at run-time, it is: " << ddp << '\n';<br />

Whether the referred object really allows for the downcast cannot be decided at compile time<br />

and is the responsibility of the programmer.<br />

Cross-casting<br />

An interesting feature of dynamic_cast is casting across from B to C when the referred object's<br />

type is a derived class of both types:<br />

C* cdp= dynamic_cast<C*>(bdp); // cross-cast from B to C ok: bdp points to an object of type D<br />

std::cout << "Dynamic cross-cast of bdp should succeed and pointer should not be 0, it is: " << cdp << '\n';<br />

Static cross-casting from B to C:<br />

cdp= static_cast<C*>(bdp); // error: cross-cast from B to C does not compile<br />

is not possible because C is neither a base nor a derived class of B. It can, however, be cast indirectly via<br />

D:<br />

cdp= static_cast<C*>(static_cast<D*>(bdp)); // ok: cross-cast from B to C via D<br />

Again, it is the programmer's responsibility to ensure that the addressed object can really be<br />

cast this way.<br />

Comparing Static and Dynamic Cast<br />

Dynamic casting is safer but slower than static casting due to the run-time check of the referred<br />

object's type. Static casting allows casting up and down, with the programmer responsible for ensuring<br />

that the referred objects are handled correctly. Dynamic casting is in some sense always<br />

up, namely from the referred object's type to a super-type (including itself).<br />

Furthermore, dynamic casting can only be applied to polymorphic types, i.e. classes that<br />

define or inherit a virtual function. The following table summarizes the differences between the two<br />

forms of casting:<br />

                static_cast            dynamic_cast<br />

Applicability   all                    only polymorphic classes<br />

Cross-casting   no                     yes<br />

Run-time check  no                     yes<br />

Speed           no run-time overhead   overhead for checking<br />

Table 6.1: Static vs. dynamic cast



6.4.2 Const Cast<br />

const_cast adds or removes the attributes const and/or volatile. The keyword volatile informs the<br />

compiler that a variable can be modified from outside the program's control. It is therefore not held or cached<br />

in registers but read from memory on each access. This feature is not used in this script. Adding<br />

an attribute is an implicit conversion in C++. That is, one can always assign an expression<br />

to a variable of the same type with extra attributes without the need for a cast. Removing<br />

an attribute requires a const_cast and should only be done when unavoidable, e.g. to interface with<br />

old-style software that is lacking appropriate const attributes.<br />

6.4.3 Reinterpretation Cast<br />

This is the most aggressive form of casting and is not used in this script. It takes an address or an<br />

object's memory location and interprets the bits there as if they were of the target type. One can for<br />

instance change a single bit in a floating-point number by casting it to a bit chain. It is more<br />

important for programming hardware drivers than complex flux solvers. Needless to say,<br />

reinterpret_cast is one of the most efficient ways to undermine the portability of an application.<br />

6.5 Barton-Nackman Trick<br />

This section describes the ‘Curiously Recurring Template Pattern’ (CRTP). It was introduced<br />

by John Barton and Lee Nackman [?] and is therefore also referred to as the ‘Barton-<br />

Nackman Trick’.<br />

6.5.1 A Simple Example<br />

⇒ crtp_simple_example.cpp<br />

We will explain this with a simple example. Assume we have a class point with an equality<br />

operator:<br />

class point<br />

{<br />

public:<br />

point(int x, int y) : x(x), y(y) {}<br />

bool operator==(const point& that) const { return x == that.x && y == that.y; }<br />

private:<br />

int x, y;<br />

};<br />

We can program the inequality operator by using common sense or by applying de Morgan's law:<br />

bool operator!=(const point& that) const { return x != that.x || y != that.y; }<br />

Or we can simplify our lives and just negate the result of the equality:<br />

bool operator!=(const point& that) const { return !(*this == that); }



Our compilers are so sophisticated that they certainly handle de Morgan's law perfectly. Negating<br />

the equality operator is something we can do on every type that has an equality operator. We<br />

could copy-and-paste this code snippet and just replace the type of the argument.<br />

Alternatively, we can write a class like this:<br />

template <typename T><br />

struct unequality<br />

{<br />

bool operator!=(const T& that) const { return !(static_cast<const T&>(*this) == that); }<br />

};<br />

and derive from it:<br />

class point : public unequality<point> { ... };<br />

This mutual dependency:<br />

• One class is derived from the other and<br />

• The latter takes the derived class’ type as template argument<br />

is somewhat confusing at first sight.<br />

Essential for this to work is that the code of a template class member is only generated when<br />

the class is instantiated and the function is actually called. At the time the template class<br />

unequality is parsed, the compiler only checks the correctness of the syntax.<br />

When we write<br />

int main (int argc, char* argv[])<br />

{<br />

point p1(3, 4), p2(3, 5);<br />

std::cout << "p1 != p2 is " << (p1 != p2 ? "true" : "false") << '\n';<br />

return 0;<br />

}<br />

after the definition of unequality and point, both types are completely known to the compiler.<br />

What happens when we call p1 != p2?<br />

1. The compiler searches for operator!= in class point → without success.<br />

2. The compiler looks for operator!= in the base class unequality<point> → with success.<br />

3. The this pointer of unequality<point> refers to the base-class sub-object within the point object.<br />

4. Both types are completely known and we can statically down-cast the this pointer to point.<br />

5. Since we know that the this pointer of unequality<point> is an up-casted this pointer to<br />

point 3 we are safe to down-cast it to its original type.<br />

6. The equality operator for point is called. Its implementation is already known at this point<br />

because the code of unequality's operator!= is not generated before the instantiation<br />

of point.<br />

3 Unless the first argument is really of type unequality. There are also ways to impede this, e.g.<br />

http://en.wikipedia.org/wiki/Barton-Nackman_trick, but we used this unary operator notation for the sake of<br />

simplicity.<br />



Likewise, every class U with an equality operator can be derived from unequality<U>. A collection<br />

of such CRTP templates for operator defaults is provided by Boost.Operators by Jeremy<br />

Siek and David Abrahams.<br />

As an alternative to the above implementation, where the this pointer is dereferenced and cast as a<br />

reference, one can cast the pointer first and dereference it afterwards:<br />

template <typename T><br />

struct unequality<br />

{<br />

bool operator!=(const T& that) const { return !(*static_cast<const T*>(this) == that); }<br />

};<br />

There is no difference, this is just a question of taste.<br />

6.5.2 A Reusable Access Operator<br />

⇒ matrix_crtp_example.cpp<br />

We still owe the reader the reusable implementation of the matrix bracket operator promised<br />

in Section 3.7.4. Back then we did not know enough language features.<br />

First of all, we had no templates, which are indispensable for a proxy. We will show you why.<br />

Say we have a matrix class as in § 3.7.4 and we just want to call the binary operator() from the<br />

unary operator[] via a proxy:<br />

class matrix; // Forward declaration<br />

class simple_bracket_proxy<br />

{<br />

public:<br />

simple_bracket_proxy(matrix& A, int r) : A(A), r(r) {}<br />

double& operator[](int c){ return A(r, c); }<br />

private:<br />

matrix& A;<br />

int r;<br />

};<br />

class matrix<br />

{<br />

// ...<br />

double& operator()(int r, int c) { ... }<br />

simple_bracket_proxy operator[](int r)<br />

{<br />

return simple_bracket_proxy(*this, r);<br />

}<br />

};<br />

This does not compile because operator[] from simple_bracket_proxy calls operator() from matrix, which<br />

is not defined yet. The forward declaration of matrix is not sufficient because we need the<br />

complete definition of matrix, not only the assertion that the type exists. Vice versa, if we define<br />

matrix first, we would miss the constructor of simple_bracket_proxy in the operator[] implementation.<br />



Another disadvantage of the implementation above is that we would need another proxy for<br />

constant access.<br />

This is an interesting aspect of templates. They do not only enable writing type-parametric software<br />

but can also help to break mutual dependencies thanks to their postponed code generation.<br />

By templatizing the proxy, the dependency is gone:<br />

template <typename Matrix, typename Result><br />

class bracket_proxy<br />

{<br />

public:<br />

bracket_proxy(Matrix& A, int r) : A(A), r(r) {}<br />

Result& operator[](int c){ return A(r, c); }<br />

private:<br />

Matrix& A;<br />

int r;<br />

};<br />

class matrix<br />

{<br />

// ...<br />

bracket_proxy<matrix, double> operator[](int r)<br />

{<br />

return bracket_proxy<matrix, double>(*this, r);<br />

}<br />

};<br />

With this implementation, we can now write A[i][j] and it is realized by the binary operator(),<br />

however that is implemented. Such a bracket operator is useful in every matrix class and the<br />

implementation will always be the same.<br />

For this reason, we would like to have this implementation only once in our code base and reuse it<br />

wherever appropriate. The only way to achieve this is with the CRTP paradigm:<br />

template <typename Matrix, typename Result><br />

class bracket_proxy<br />

{<br />

public:<br />

bracket_proxy(Matrix& A, int r) : A(A), r(r) {}<br />

Result& operator[](int c){ return A(r, c); }<br />

private:<br />

Matrix& A;<br />

int r;<br />

};<br />

template <typename Matrix, typename Result><br />

class crtp_matrix<br />

{<br />

public:<br />

bracket_proxy<Matrix, Result> operator[](int r)<br />

{<br />

return bracket_proxy<Matrix, Result>(static_cast<Matrix&>(*this), r);<br />

}<br />

bracket_proxy<const Matrix, const Result> operator[](int r) const<br />

{<br />

return bracket_proxy<const Matrix, const Result>(static_cast<const Matrix&>(*this), r);<br />

}<br />

};<br />

class matrix : public crtp_matrix<matrix, double><br />

{<br />

// ...<br />

};<br />

Once we have such a CRTP class, we can provide a bracket operator for every matrix class with a<br />

binary application operator. In a full-fledged linear algebra package, one needs to pay attention to<br />

which matrices return references and which are mutable, but the approach is as described above.<br />

Several timings have shown that the indirection with the proxy did not create run-time overhead<br />

compared to the direct usage of the binary access operator. Apparently, the compilers optimized<br />

the creation of proxies away in the executables.




Chapter 7<br />

Effective Programming: The Polymorphic Way<br />

Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove<br />

it.<br />

—Alan Perlis<br />

To remove complexity in scientific application development (but not only there), several programming<br />

techniques, methods, and paradigms have to be applied accordingly. This<br />

depends not only on the ability to combine application-specific functionality with library code<br />

from a variety of sources but also on restricting the amount of application-specific glue code.<br />

So libraries must remain open for extension but closed for modification, which can be attributed<br />

to a technique called polymorphic programming.<br />

The sections presented in this book introduced important mechanisms to successfully develop<br />

scientific applications, such as C++ basics, encapsulation, generic and meta-programming as<br />

well as inheritance. An important part of scientific computing, matrix containers and matrix<br />

algorithms, has been presented to illustrate the topics so far. Effective programming is then possible if<br />

these mechanisms are not viewed as separate entities, but as different means to achieve<br />

important goals, such as<br />

• uncompromising efficiency of simple basic operations (e.g., array subscripting should not<br />

incur the cost of a function call),<br />

• type-safety (e.g., an object from a container should be usable without explicit or implicit<br />

type conversion),<br />

• code reuse and extensibility,<br />

all with their respective advantages and disadvantages. This chapter reviews important techniques<br />

to achieve polymorphism from a more general point of view and highlights a basic but<br />

very important recurring principle for scientific computing: code reusability. This is not mainly<br />

because programmers are lazy people, but also because applications have to be tested. For<br />

the field of scientific applications this is particularly important due to large parameter sets,<br />

changing boundary and initial conditions, as well as long run times of simulation codes. Hence<br />

it should not be underestimated how much time and effort can be saved if already tested code<br />

can be used as a starting point or reference. So code reusability is not only about programming<br />

less, but also about improving code quality. Most of the techniques presented and discussed so<br />




far already deal with some kind of code reusability, but mostly in an implicit way. The following<br />

sections give an overview of polymorphic mechanisms in a more explicit way.<br />

As soon as code reusability is covered, almost equal importance is placed on code extensibility,<br />

which should not be constrained by reused code. Scientific code development is always<br />

driven by transforming newly developed scientific methods into executable code. Various programming<br />

techniques with different scopes are therefore mandatory. If programming techniques<br />

are analyzed this way, it becomes understandable why some of the presented programming<br />

paradigms are not ideally suited to accomplish code reusability and extensibility together (e.g.,<br />

the object-oriented inheritance model).<br />

No technique, or more generally paradigm, will result in the ultimate and final solution, but<br />

each of the techniques provides tools to manage the complexity of a problem; it does not guarantee<br />

the ability to do so. A bad problem specification will lead to a bad solution independently<br />

of the technique or paradigm used for the implementation.<br />

The usage of the Boost Graph Library (BGL) is an excellent example. There is great diversity<br />

of requirements in the field of graph algorithms and data structures. Even so, the performance<br />

demands on a library like this are very high. Nevertheless, it was possible to implement all necessary<br />

functionality at a high performance level. More than this, the library can be extended greatly<br />

in many different ways. On the other hand, this library is not easy to use or extend without<br />

an understanding of the underlying techniques.<br />

And as a reminder: the main goal of this book is how to write good scientific software.<br />



7.1 Imperative Programming<br />

Imperative programming may be viewed as the very bones on which all other abstractions<br />

depend. This programming paradigm uses a sequence of instructions which act on a state to<br />

realize algorithms. Thus it is always specified in detail what and how to execute next. The<br />

modification of the program state, while convenient, is also an issue: as the size of<br />

the program increases, unintended modifications of the state become an increasing problem. In order<br />

to address this issue, the imperative programming method has been refined into procedural and<br />

structured programming paradigms, which attempt to provide more control over the modifications<br />

of the program state. Hence it is based upon organized procedure calls. Procedures, also<br />

known as routines, subroutines, methods, or functions, simply contain a series of computational<br />

steps to be carried out. Any given procedure might be called at any point during a program's<br />

execution, including from other procedures or itself. A function consists of:<br />

• The return type of the function: A function returns a value of this type to its caller.<br />

C and C++, which do not provide procedures explicitly, use the keyword void to<br />

indicate that a function does not return a value.<br />

• The name of the function: By this name the function can be called. The name should be as<br />

expressive as possible. Never underestimate the value of meaningful names.<br />

• The parameter list of the function: The parameters of a function serve as placeholders<br />

for values that are supplied later by the user during each invocation of the function. A<br />

function can have an empty parameter list. The values of the parameter list can be passed<br />

by value or by reference.<br />

• The body of the function: The body of a function implements the logic of the operation.<br />

Typically, it manipulates the named parameters of the function.<br />

The advantages of this paradigm are:<br />

• Few techniques<br />

• Rapid prototyping <strong>for</strong> easy problems<br />

• Functions can be put into a library<br />

• Fast compilation<br />

The disadvantages of this paradigm are:<br />

• Test effort is high<br />

• Sources of error are manifold<br />

• Non-trivial problems require a large programming effort<br />

• No user defined data types<br />

• No locality of data<br />

• Only very few and simple functions can be put into a library<br />

Even in the refined form of procedural programming, the incurred overhead can be limited to a<br />

bare minimum, as the level of abstraction is relatively low. This was well suited to the situation<br />


202 CHAPTER 7. EFFECTIVE PROGRAMMING: THE POLYMORPHIC WAY<br />

of scarce computing resources and a lack of mature and powerful tools. Under these circumstances<br />

the overall performance, in terms of execution speed or memory consumption, is solely<br />

dependent on the skill and ingenuity of the programmer and has resulted in the almost mythical<br />

"hand-optimized" code. However, to achieve the desired specifications in such a fashion, the<br />

clarity and readability, and thereby the maintainability, of the code were sacrificed. Furthermore,<br />

the low level of abstraction also hinders portability, as different architectures favour different<br />

assumptions to produce efficient execution. To address this effect, implementations were duplicated<br />

in order to optimize for different architectures and platforms, which of course makes a<br />

mockery of goals such as code reusability or even extensibility.<br />

This paradigm and the derived techniques are then used differently in Section 2.11, where<br />

generic programming is used to offer an efficient approach for matrix operations.<br />


7.2. GENERIC PROGRAMMING 203<br />

7.2 Generic Programming<br />

Generic programming may be viewed as having been developed in order to further facilitate<br />

the goals of code reusability and extensibility. From a general view the generic programming<br />

paradigm is about generalizing software components so that they can be directly reused easily<br />

in a wide variety of situations. While these are among the goals which led to the development<br />

of object-oriented programming, the realization may vary quite profoundly. A major<br />

distinction from object-oriented programming, which is focused on data structures and their<br />

states, is that generic programming especially allows for a very abstract and orthogonal description of algorithms.<br />

To achieve this kind of generalization, a separation of the basic tools of programming is important:<br />

algorithms, containers (data structures), and the glue between them (so-called iterators<br />

or, more generally, traversors). In keeping with an important part of effective programming,<br />

the minimization of glue code, iterators and traversal objects operate as a minimal but fully<br />

abstract interface between data structures and algorithms.<br />

While the desired functionality is often implemented using static polymorphism mechanisms,<br />

such as templates in C++, generic programming should not be equated with simply programming<br />

with templates. However, when generic programming is realized using purely compile-<br />

time facilities such as static polymorphism, not only is the implementation effort reduced but the<br />

resulting run-time performance is also optimized.<br />

In the following, the process of generic programming is illustrated by elevating a procedural code to<br />

a generic one, simultaneously fulfilling the important goals of effective programming (efficiency,<br />

type-safety, code reuse):<br />

• Algorithm: Generic algorithms are generic in two ways. First, the data type which they<br />

are operating on is arbitrary and second, the type of container within which the elements<br />

are held is arbitrary.<br />

To get in touch with the generic approach, a generalization of the memcpy() function of the<br />

C standard library is discussed. An implementation of memcpy() might look somewhat<br />

like the following:<br />

void* memcpy(void* region1, const void* region2, size_t n)<br />

{<br />

const char* first = (const char*) region2;<br />

const char* last = ((const char*) region2) + n;<br />

char* result = (char*) region1;<br />

while (first != last)<br />

*result++ = *first++;<br />

return result;<br />

}<br />

The memcpy() function is already generalized to some extent by the use of void* so that<br />

the function can be used to copy arrays of different kinds of data.<br />

Looking at the body of memcpy(), the function’s minimal requirements are that it needs to<br />

traverse the sequence using some sort of pointer, access the elements pointed to, copy the<br />

elements to the destination, and compare pointers to know when to stop. The memcpy()<br />

function can then be written in a generic manner:<br />

template <typename InputIterator, typename OutputIterator><br />

OutputIterator<br />

copy(InputIterator first, InputIterator last, OutputIterator result)<br />

{<br />

while (first != last)<br />

*result++ = *first++;<br />

return result;<br />

}<br />

With this code, the same functionality as the memcpy() from the C library is achieved.<br />

All kinds of data structures which offer a begin() and end() iterator can be used.<br />

• Container: An abstraction over all kinds of data structures which can store other data<br />

types.<br />

• Iterator: This is the glue between the containers and the algorithms. First, it separates<br />

the usage of data structures and algorithms. Second, it provides a concept hierarchy for<br />

all kinds of traversal within data structures.<br />

This type of genericity is called parametric polymorphism (see Section 7.5.2). Section 4.9<br />

introduced the Standard Template Library (STL). The STL solves many standard data structure<br />

and algorithmic problems. The STL is (or should be) the first choice in all code development<br />

steps.<br />

• Algorithm/Data-Structure Interoperability: First, each algorithm is written in a data-structure-<br />

neutral way, allowing a single template function to operate on many different<br />

classes of containers. The concept of an iterator is the key ingredient in this decoupling of<br />

algorithms and data structures. The impact of this technique is a reduction of the STL's<br />

code size from O(M*N) to O(M+N), where M is the number of algorithms and N is the<br />

number of containers. Considering a situation of 20 algorithms and 5 data structures,<br />

this makes the difference between writing 100 functions versus only 25 functions! And the<br />

difference grows faster as the number of algorithms and data structures increases.<br />

• Extension through Function Objects: The second way that the STL is generic is that its<br />

algorithms and containers are extensible. The user can adapt and customize the STL<br />

through the use of function objects. This flexibility is what makes the STL such a great<br />

tool <strong>for</strong> solving real-world problems. Each programming problem brings its own set of<br />

entities and interactions that must be modeled. Function objects provide a mechanism<br />

for extending the STL to handle the specifics of each problem domain.<br />

• Element Type Parametrization: The third way that STL is generic is that its containers<br />

are parametrized on the element type.<br />

Most people think that element type parametrization is the feature that makes the STL successful.<br />

This is perhaps the least interesting way in which the STL is generic. The interoperability with<br />

iterators and the extensibility by function objects are more important parts of the STL. But<br />

the essence is the programming with concepts. The programmer can write the data structures<br />

and algorithms, or in other words the concepts behind these, as they should be. In addition to these facts, the<br />

STL has proven that with the generic programming paradigm, high-performance computing<br />

can be accomplished on several different computer architectures as well.<br />

The advantages of this paradigm are:<br />

• Programming with concepts<br />

• Great number of available libraries



• Great extensibility<br />

• Great code reusability<br />

• Development of high-performance code<br />

• All other paradigms can be used<br />

• Concepts can be checked by the compiler<br />

The disadvantages of this paradigm are:<br />

• Long compilation times: C++'s static type checking requires complete template<br />

instantiation and type checking.<br />

• Steep learning curve due to many complex techniques<br />

• Code bloat: Due to an incorrect usage of templates, the compiler can produce an excessive<br />

amount of code.



7.3 Programming with Objects<br />

Programming with objects may be viewed as an evolution from the structured imperative<br />

paradigm. On the one hand, it tries to address the issue of code reusability by providing a<br />

specific type of polymorphism, sub-typing. On the other hand, it addresses the issue of unchecked<br />

modification of state by enforcing data encapsulation, thus enforcing changes through defined<br />

interfaces. Both of these notions are attached to an entity called an object. Therefore an object<br />

serves as a self-contained unit which interacts with the environment via messages. It thus<br />

accomplishes a decoupling of the internal implementation within the object from the interaction<br />

with the surrounding environment, thus enforcing (clean) interfaces, which is essential for<br />

effective programming. The algorithms are expressed much more by the notion of what is to be<br />

done as an interaction and modification of objects, where the details of how are encapsulated<br />

to a great extent within the objects themselves.<br />

Another benefit of programming with objects is that these entities can be placed in libraries.<br />

This saves the effort of continually rewriting the same code for every new program. Furthermore,<br />

because objects can be made polymorphic, object libraries offer the programmer more flexibility<br />

and functionality than subroutine libraries (their counterparts in the procedural paradigm).<br />

Technically, object libraries are quite feasible, and the advantages of extensibility can be significant.<br />

However, the real challenge to making code reusable is not technical. Rather, it is<br />

identifying functionality that other people both understand and want. People who use procedural<br />

languages have been writing and using subroutine libraries <strong>for</strong> decades. These libraries are<br />

most successful when they per<strong>for</strong>m simple, clearly defined functions, such as calculating square<br />

roots or computing trigonometric functions. An object library can provide complex functions<br />

more easily than a subroutine library. However, unless those functions are clearly defined, well<br />

understood and generally useful, the library is unlikely to be used widely.<br />

To give an intuitive specification of the programming approach with objects, the following list<br />

describes the key concepts of the object world:<br />

• Identity is the quantization of data in discrete, distinguishable entities called objects<br />

• Classification is the grouping of objects with the same structure and behavior into classes<br />

• Polymorphism is the differentiation of behavior of the same operation on different classes<br />

• Inheritance is the sharing of structure and behavior among classes in a hierarchical relationship<br />

But one of the biggest problems of this programming approach is the interaction of objects with<br />

algorithms. The problem can easily be seen using the example of a simple sorting algorithm.<br />

Should the algorithm be placed into the object? Should an algorithm work on a class hierarchy<br />

with a common interface?<br />

The problem cannot be solved easily within this paradigm. A possible solution is some kind of<br />

polymorphism, which is explained in Section 7.5.2.<br />

7.3.1 Object-Based Programming<br />

In languages which support identity and classification the object-based paradigm can be used<br />

efficiently.



The advantages of this paradigm are:<br />

• User-defined data structures with data locality: programming can be more intuitive<br />

compared to the procedural paradigm, and algorithms can be put into a library<br />

• Library code can be tested independently<br />

• Fast compilation, though it may be slower than with the procedural paradigm<br />

The disadvantages of this paradigm are:<br />

• Runtime performance<br />

• Library/code reusability<br />

7.3.2 Object-Oriented Programming

To overcome the mentioned problem of code reusability, inheritance and polymorphism were introduced¹. Inheritance is deployed with the aim of reducing implementation effort by allowing refinement of already existing objects. Inheritance and the subtyping connected with it also make polymorphic programming available at run time:

• Inheritance allows us to group classes into families of related types that share common operations and data. Already existing code can thus be reused.

• Polymorphism allows us to implement these families as a unit rather than as individual classes, giving us greater flexibility in adding or removing any particular class. This point is explained in more detail in Section 7.5.2, where this type of polymorphism is called subtyping polymorphism.

• Dynamic binding is a third aspect of object-oriented programming. The actual member function resolution is delayed until run time. With the combination of inheritance and (subtyping) polymorphism, a generic way of dealing with geometrical objects can be achieved.

While the concepts of object orientation have proved invaluable for the development of modular software, their limits also became apparent: the goal of general reusability suffers from the stringent limitations of the required subtyping. This may be viewed as a consequence of the fact that objects are not necessarily fit to accommodate the required abstractions, such as the algorithms themselves. Furthermore, the extension of existing code is often only possible by intrusive means, such as changing the already existing implementations, and thus does not deliver the high degree of effort reduction that was hoped for.

Compared to the run-time environment or compiler required to realize the simple imperative programming paradigm, the object-oriented paradigm requires more sophistication, as it needs to be able to handle run-time dispatches using virtual functions, for instance. Additionally, seemingly simple statements may hide the true complexity encapsulated within the objects. Thus, not only is the demand on the tools higher, but the programmer also needs to be aware of the implications of seemingly simple statements in order to achieve desirable levels of performance.

¹ If a language supports all these features (identity, classification, polymorphism, and inheritance), then the object-oriented paradigm is supported in this language.


208 CHAPTER 7. EFFECTIVE PROGRAMMING: THE POLYMORPHIC WAY<br />

Behind the Dynamic Polymorphism in C++

A programmer must be aware of the fact that inheritance is one of the strongest bonds between objects. In real-world examples, few problems can be modeled successfully by class inheritance alone. Coupling by inheritance should therefore be used very carefully.

The advantages of this paradigm are:

• Library: data types can be enhanced greatly.

• Abstract algorithms with polymorphism enable greater code reusability compared to the procedural paradigm.

• Strong binding of data structures and methods: logical connections can be modeled easily, and logical errors can be detected easily.

The disadvantages of this paradigm are:

• The binary-method problem (see Section 7.5.3)

• Poor optimization opportunities for the compiler due to subtyping polymorphism (see Section ??)

• Strong binding of data structures and methods: only usable on object-oriented problems.



7.4 Functional Programming

In contrast to the procedural and object-oriented paradigms, which explicitly formulate algorithms and programs as a sequence of instructions acting on a program state, the functional paradigm uses mathematical functions for this task and forgoes the use of a state altogether. Therefore, there are no mutable variables and no side effects in purely functional programming. As such it is declarative in nature and relies on the language's environment to produce an imperative representation which can be run on a physical machine. Among the greatest strengths of the functional paradigm is the availability of a strong theoretical framework, the lambda calculus (cite()), which is explained in more detail in Section ref(), for the different implementations.

Higher-order functions are an important concept of functional programming, also because of their usability in procedural languages. They were studied in lambda calculus theory well before the notion of functional programming existed and pervade the design of a number of functional programming languages, such as Scheme and Haskell.

As modern procedural languages and their implementations have started to put greater emphasis on correctness rather than raw speed, and the implementations of functional languages have begun to emphasize speed as well as correctness, the performance of functional and procedural languages has begun to converge. For programs which spend most of their time doing numerical computations, some functional languages (such as OCaml and Clean) can approach the speed of programs written in C, while for programs that handle large matrices and multidimensional databases, array functional languages (such as J and K) are usually faster than most non-optimized C programs. Functional languages have long been criticized as resource-hungry, both in terms of CPU resources and memory. This was mainly due to two things:

• some early functional languages were implemented with no concern for efficiency;

• non-functional languages achieved speed at least in part by neglecting features such as bounds checking or garbage collection, which are viewed as essential parts of modern computing frameworks and represent an overhead that is built into functional languages by default.

Since a purely functional description is free of side effects, it is a favourable choice for parallelization, as the description does not contain a state which would require synchronization. Data-related dependencies, however, must still be considered in order to ensure correct operation. Since the declarative style connected to the functional paradigm distances itself from the traditional imperative paradigm and its connection to states, input and output operations pose a hurdle, which is often addressed in a manner that is not purely functional. As such, functional interdependencies may be specified trivially, while the details of how these are to be met remain opaque and are left as a choice to the specific implementation.

Last, we give an example of pure functional programming. We point out that the next code snippet is presented in Haskell, not in C++ syntax. The "hello world" program of the functional programming paradigm is the factorial calculation:

fac :: Integer -> Integer
fac 0 = 1
fac n | n > 0 = n * fac (n-1)
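For comparison, the same factorial can be written in C++ in a functional style. This is a minimal sketch, not code from the text: it is defined purely by recursion and a conditional expression, with no mutable state and no loop.

```cpp
#include <cassert>

// Direct transcription of the Haskell definition above: the base case
// fac 0 = 1, otherwise n * fac (n-1). No variable is ever modified.
inline long fac(long n)
{
    return n == 0 ? 1 : n * fac(n - 1);
}
```

The ternary operator plays the role of Haskell's pattern match with a guard; like the Haskell version, the C++ function is undefined for negative arguments.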



7.4.1 Lambda Calculus

As was presented in Section ref(), it is not very easy to reuse the STL standard function objects because their use is not very intuitive: either a function object for each loop or a binder has to be written. A binder (or binder object) is passed, at construction time, to another function object which performs an action. The binder takes the function object as well as a binding value and makes a binary function unary by fixing one parameter. However, this is not obvious at first. An easy way to implement such a functionality would be to write it as it reads:

std::for_each(vec.begin(), vec.end(), std::cout << *vec_iter);

Of course this cannot compile, for several reasons. First, the third argument is not a function object. Second, the variable vec_iter does not exist, nor does it know anything about the iterated container vec. Anyway, an expression like this is easy to write and less error-prone compared to a binder object. To enable a program like this, two things have to be accomplished: the output-stream operator << has to produce a function object, and a placeholder object that simply returns its argument is needed:

struct argument_1_function_object
{
    template <typename ArgumentT>
    ArgumentT
    operator()(ArgumentT arg)
    {
        return arg;
    }

    template <typename Argument1T, typename Argument2T>
    Argument1T
    operator()(Argument1T arg1, Argument2T arg2)
    {
        return arg1;
    }
};

So what does this object really do? It provides unary and binary bracket operators for one and two arguments which return the (first) argument passed. A function object is implemented next which stores an arbitrary stream type together with another function object.

template <typename StreamType, typename FunctionObjectT>
class output_function_object
{
  public:
    output_function_object(StreamType& stream, FunctionObjectT func)
      : stream(stream), func(func) {}

    template <typename ArgumentT>
    void operator()(ArgumentT arg)
    {
        stream << func(arg);
    }

  private:
    StreamType&     stream;   // streams are not copyable, so hold a reference
    FunctionObjectT func;
};

The only thing left to do now is to write an appropriate object generator in order to persuade the C++ syntax to accept something like the first line of code of this chapter.

template <typename StreamType, typename FunctionObjectT>
output_function_object<StreamType, FunctionObjectT>
operator<<(StreamType& stream, FunctionObjectT func)
{
    return output_function_object<StreamType, FunctionObjectT>(stream, func);
}

By using these objects it is almost possible to offer a convenient way to write the for_each code snippet presented above. The remaining adaptation is to use a so-called unnamed object instead of the dereferenced iterator:

argument_1_function_object arg1;
std::for_each(vec.begin(), vec.end(), std::cout << arg1);

By creating a collection of such functor objects² a functional programming style can be mimicked. As can then be observed, polymorphism, which has to be explicitly provided in the imperative world, comes naturally to the functional paradigm, as no specific assumptions about data types are required; only conceptual requirements need to be met.

² Instead of creating all of these functors again, the Boost Phoenix library or the C++ TR1 lambda library can be used.



7.5 From Monomorphic to Polymorphic Behavior

As presented in the last sections, each programming technique (or paradigm) offers different key benefits regarding effective programming. Imperative programming, the related procedural paradigm, and object-based programming are simple and require that all calls to an object or function have exactly the same typing as the signature. Thus, type checks and type constraints can be derived directly from the program text, but the effectiveness (genericity and applicability) is greatly reduced for real-world problems. This is in contrast to polymorphic code, which freely operates on abstract concept types. Polymorphic behavior enables the use of algorithms and data structures with several different types. Object-oriented, generic, and functional programming offer an additional mechanism which delays the actual type instantiation to a later evaluation point. Compared to the simple monomorphic way, the polymorphic mechanism is composed of a complex set of inference rules, because type information propagates between the object and function signature and the call signature in both directions.

In object-oriented programming, libraries typically specify that the types supplied to the library must be derived from a common abstract base class, providing implementations for a collection of pure virtual functions. The library knows only about the abstract base class interface, but can be extended to work with new user types derived from the abstract interface. That is, variability is achieved through differing implementations of the virtual functions in the derived classes. This is how object-oriented programming supports modules that are closed for modification, yet remain open for extension. One strength of this paradigm is its support for varying the types supplied to a module at runtime. Composability of modules is limited, however, since independently produced modules generally do not agree on common abstract interfaces from which supplied types must inherit. The paradigm of generic programming, pioneered by Stepanov, Musser, and their collaborators, is based on the principle of decomposing software into efficient components which make only minimal assumptions about other components, allowing maximum flexibility in composition. C++ libraries developed following the generic programming paradigm typically rely on templates for the parametric and ad-hoc polymorphism they offer. Composability is enhanced, as use of a library does not require inheriting from a particular abstract interface. Interfaces of library components are specified using concepts, collections of requirements analogous to, say, Haskell type classes. The key difference to abstract base classes and inheritance is that a type can be made to satisfy the constraints of a concept retroactively, independently of the definition of the type. Also, generic programming strives to make algorithms fully generic, while remaining as efficient as non-generic hand-written algorithms. Such an approach is not possible when the cost of any customization is a virtual function call.

The strength of polymorphism is that the same piece of code can operate on different types, even types that were not known at the time the code was written. Such applicability is the cornerstone of polymorphism because it amplifies the usefulness and reusability of code. If the types of polymorphism are analysed in more detail, two different main types can be observed:

• Ad-hoc polymorphism

• Universal polymorphism

Only the second type, universal polymorphism, is actually important for effective programming, whereas the first type, ad-hoc polymorphism, is rather a convenience.



7.5.1 Ad-hoc Polymorphism

This kind of polymorphic behavior is called ad-hoc to point out that the behavior is local. Common to its two variants (overloading and coercion) is the fact that the programmer has to specify exactly which types are to be usable with the polymorphic function.

Overloading

Overloading is a simple and convenient way of programming that eases the programmer's life:

class my_stack
{
    bool push(int ..) {}
    bool push(double ..) {}
    bool push(complex ..) {}
    int pop() {..}      // note: overloading on the return type alone,
    double pop() {..}   //       as sketched here, is not legal C++
    // ....
};

Coercion

Coercion is automatic type conversion. The following stack example can be used with all numerical data types which can be converted to double:

class my_stack
{
    bool push(double ..) {}
    double pop() {..}
    // ....
};
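A compilable sketch of both variants follows. Note one correction relative to the pseudocode above: overloading pop() on the return type alone is not legal C++, so only one pop() is provided here; the class and member names are illustrative.

```cpp
#include <cassert>
#include <vector>

// Overloading: the usable argument types are listed explicitly.
// Coercion additionally admits any type convertible to one of them.
class my_stack_demo
{
  public:
    void push(int v)    { data.push_back(v); }  // exact match for int
    void push(double v) { data.push_back(v); }  // exact match for double
    double pop()
    {
        double v = data.back();
        data.pop_back();
        return v;
    }
    bool empty() const { return data.empty(); }
  private:
    std::vector<double> data;   // everything is widened to double
};
```

Calling push(3) selects push(int) and push(2.5) selects push(double); pushing a short also selects push(int) via integral promotion, which is exactly the coercion described above.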



7.5.2 Universal Polymorphism

The universal in the title means that the kinds of expressing polymorphic behavior in this section are the most useful techniques to accomplish the desired behavior and should be used preferably:

• Dynamic polymorphism (subtyping)

• Static polymorphism (parametric)

Subtyping Polymorphism

In C++ the object-oriented paradigm implements subtyping polymorphism³ using sub-classing. The term dynamic polymorphism is often used for this type of polymorphism.

To introduce the applicability of this kind of polymorphism, an example from the topological area is given. Classes for different kinds of points are used, which should be comparable within their own set. Traversing through containers or data structures is a quite common task in generic programming. The next code snippet presents the base class for all kinds of vertices.

#include <iostream>

class topology { };

class vertex
{
  public:
    virtual ~vertex() {}   // virtual destructor, see the note box below
    virtual bool equal(const vertex* ve) const = 0;
};

If these vertex types have to be extended, only the new class with the corresponding equal method has to be implemented. The next code snippet presents two possible implementations for a vertex, which can be used in different topologies.

class structured_vertex : public vertex
{
  public:
    structured_vertex(int id, topology* topo) : id(id), topo(topo) {}

    virtual bool equal(const vertex* ve) const
    {
        const structured_vertex* sv = dynamic_cast<const structured_vertex*>(ve);
        return sv != 0 && id == sv->id && topo == sv->topo;   // null if ve has another type
    }

  protected:
    int       id;
    topology* topo;
};

³ Also called inclusion polymorphism.



class unstructured_vertex : public vertex
{
  public:
    unstructured_vertex(int handle, topology* topo, int segment)
      : handle(handle), segment(segment), topo(topo) {}

    virtual bool equal(const vertex* ve) const
    {
        const unstructured_vertex* sv = dynamic_cast<const unstructured_vertex*>(ve);
        return sv != 0 && handle == sv->handle && topo == sv->topo
                       && segment == sv->segment;
    }

  protected:
    int       handle;
    int       segment;
    topology* topo;
};

With this virtual class hierarchy, an algorithm which operates on all the different classes derived from vertex can be written. This is called an explicit interface.

void print_equal(const vertex* ve1, const vertex* ve2)
{
    std::cout << std::boolalpha << ve1->equal(ve2) << std::endl;
}

The next code lines present the generic behavior of the algorithm, which operates on both types derived from vertex.

int main()
{
    topology the_topo;
    vertex*  the_vertex1;
    vertex*  the_vertex2;

    // *** structured
    the_vertex1 = new structured_vertex(12, &the_topo);
    the_vertex2 = new structured_vertex(12, &the_topo);
    print_equal(the_vertex1, the_vertex2);
    delete the_vertex1; delete the_vertex2;   // safe thanks to the virtual destructor

    // *** unstructured
    the_vertex1 = new unstructured_vertex(12, &the_topo, 1);
    the_vertex2 = new unstructured_vertex(12, &the_topo, 2);
    print_equal(the_vertex1, the_vertex2);
    delete the_vertex1; delete the_vertex2;

    return 0;
}
As can be seen, polymorphic behavior can be achieved, but with major drawbacks. First, pointers or references to the objects have to be used, which eliminates the possibility for a compiler to optimize some parts of the code, e.g. by inlining. Second, a dynamic cast has to be used, which can fail at run time (yielding a null pointer for pointer casts, or an exception for reference casts). This kind of problem is called the binary-method problem, which is explained in Section 7.5.3.

Nevertheless, dynamic polymorphism in C++ is best at:

• Uniform manipulation based on base/derived class relationships: different classes that hold a base/derived relationship can be treated uniformly.

• Static type checking: all types are checked statically in C++.

• Dynamic binding and separate compilation: code that uses classes in a hierarchy can be compiled apart from the code of the entire hierarchy. This is possible because of the indirection that pointers provide (both to objects and to functions).

• Binary interfacing: modules can be linked either statically or dynamically, as long as the linked modules lay out the virtual tables in the same way.

Behind the Dynamic Polymorphism in C++

How virtual functions work:

• Normally, when the compiler sees a member function call, it simply inserts instructions calling the appropriate subroutine (as determined by the type of the pointer or reference).

• However, if the function is virtual, a member function call such as vc->foo() is replaced with the following: (*((vc->vtab)[0]))()

• The expression vc->vtab locates a special "secret" data member of the object pointed to by vc. This data member is automatically present in all objects with at least one virtual function. It points to a class-specific table of function pointers (known as the class's vtable).

• The expression (vc->vtab)[0] locates the first element of the class's vtable (the one corresponding to the first virtual function, foo()). That element is a function pointer to the appropriate foo() member function.

• Finally, the expression (*((vc->vtab)[0]))() dereferences the function pointer and calls the function.

• Special care must be taken with destructors in virtual class hierarchies. The base class does not know anything about the derived classes, so the base class destructor has to be declared virtual; the derived class destructors are then virtual, too.
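The destructor rule from the box can be demonstrated with a minimal, hypothetical hierarchy (the class names are illustrative): a counter records whether the derived destructor actually runs when deleting through a base pointer, which it does only because the base destructor is virtual.

```cpp
#include <cassert>

struct counted_base
{
    virtual ~counted_base() {}              // virtual: delete via base* is safe
    virtual int id() const { return 0; }
};

struct counted_derived : counted_base
{
    explicit counted_derived(int& counter) : counter(counter) {}
    ~counted_derived() { ++counter; }       // implicitly virtual via the base
    virtual int id() const { return 1; }    // dispatched through the vtable
    int& counter;
};

// Creates a derived object, deletes it through a base pointer, and
// reports how often the derived destructor ran (1 expected).
inline int deleted_count()
{
    int destroyed = 0;
    counted_base* b = new counted_derived(destroyed);
    delete b;           // runs ~counted_derived() because ~counted_base() is virtual
    return destroyed;
}
```

Removing `virtual` from `~counted_base()` would make the `delete b` above undefined behavior; in practice the derived destructor would typically be skipped.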



Parametric Polymorphism

Parametric polymorphism was the first type of polymorphism developed, first identified by Christopher Strachey in 1967. It was also the first type of polymorphism to appear in an actual programming language, ML, in 1976. It exists in C++, Standard ML, Haskell, and others. The term static polymorphism is also often used.

In C++, this type of polymorphism is available via templates and also lets a value have more than one type. Inside

template <typename T> double function(T param) {..}

param can have any type that can be substituted inside function to render compilable code. This is called an implicit interface, in contrast to a base class's explicit interface. It achieves the same goal of polymorphism, writing code that operates on multiple types, but in a very different way.

To tie in with the dynamic polymorphism example, the same vertex classes are now expressed for use with function templates:

#include <iostream>

class topology
{
    // ...
};

class structured_vertex
{
  public:
    structured_vertex(int id, topology* topo) : id(id), topo(topo) {}

    bool equal(const structured_vertex& ve) const
    {
        return id == ve.id && topo == ve.topo;
    }

  protected:
    int       id;
    topology* topo;
};

class unstructured_vertex
{
  public:
    unstructured_vertex(int handle, topology* topo, int segment)
      : handle(handle), segment(segment), topo(topo) {}

    bool equal(const unstructured_vertex& ve) const
    {
        return handle == ve.handle && topo == ve.topo && segment == ve.segment;
    }

  protected:
    int       handle;
    int       segment;
    topology* topo;
};

Here, no class hierarchy is required. It only has to be guaranteed that each data type provides an implementation of the required method. Below, print_equal() is written as a function template:

template <typename VertexType>
void print_equal(const VertexType& ve1, const VertexType& ve2)
{
    std::cout << std::boolalpha << ve1.equal(ve2) << std::endl;
}

In the code snippet below, the same polymorphic behavior can be seen as in the dynamic polymorphism example, but without the necessity of inheriting from a common base class.

int main()
{
    topology the_topo;

    // *** structured
    structured_vertex sv1(12, &the_topo);
    structured_vertex sv2(12, &the_topo);
    print_equal(sv1, sv2);

    // *** unstructured
    unstructured_vertex usv1(12, &the_topo, 1);
    unstructured_vertex usv2(12, &the_topo, 2);
    print_equal(usv1, usv2);

    return 0;
}

Without the pointer mechanism the compiler can easily optimize these lines, e.g. inline the code. Additionally, exceptions cannot occur at run time.

Due to its characteristics, static polymorphism in C++ is best at:

• Uniform manipulation based on syntactic and semantic interfaces: types that obey a syntactic and semantic interface can be treated uniformly.

• Static type checking: all types are checked statically.

• Static binding (which prevents separate compilation): all types are bound statically.

• Efficiency: compile-time evaluation and static binding allow optimizations and efficiencies not available with dynamic binding.

7.5.3 Comparison of Static and Dynamic Polymorphism

Here the main features of static and dynamic polymorphism are summarized:

• Virtual function calls are slower at run time than function templates: a virtual function call includes an extra pointer dereference to find the appropriate method in the virtual table. By itself, this overhead may not be significant. Significant slowdowns can nevertheless result in compiled code because the indirection may prevent an optimizing compiler from inlining the function and from applying subsequent optimizations to the surrounding code after inlining.

• Run-time dispatch versus compile-time dispatch: the run-time dispatch of virtual functions and inheritance is certainly one of the best features of object-oriented programming. For certain kinds of components, run-time dispatching is an absolute requirement: decisions need to be made based on information that is only available at run time. When this is the case, virtual functions and inheritance are needed. Templates do not offer run-time dispatching, but they do offer significant flexibility at compile time. In fact, if the dispatching can be performed at compile time, templates offer more flexibility than inheritance because they do not require the template argument types to inherit from some base class.

• Code size: virtual functions are small, templates are big: a common concern in template-based programs is code bloat, which typically results from naive use of templates. Carefully designed template components need not result in significantly larger executable size than their inheritance-based counterparts.

• The binary-method problem: a serious problem shows up when using inheritance and virtual functions to express operations that work on two or more objects.



Note

The binary-method problem is encountered when methods in which the receiver type and argument type should vary together, such as equality comparisons, must instead use a fixed formal parameter type to maintain type safety. The problem arises in mainstream object-oriented languages because only the receiver of a method call is used for run-time method selection, and so the argument must be assumed to have the most general possible type. Existing techniques to solve this problem require intricate coding patterns that are tedious and error-prone. The binary-method problem is a prototypical example of a larger class of problems where overriding methods require type information for their formal parameters. Another common example of this problem class is the implementation of event handling (e.g., for graphical user interfaces), where "callback methods" must respond to a variety of event types.
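The problem can be made concrete with a hypothetical shape hierarchy (the classes are illustrative, not from the text): because equal() must accept the most general type shape, every override needs a run-time dynamic_cast, and comparing objects of different concrete types is only detected at run time, never at compile time.

```cpp
#include <cassert>

class shape
{
  public:
    virtual ~shape() {}
    // The argument must have the most general type 'shape':
    virtual bool equal(const shape* other) const = 0;
};

class circle : public shape
{
  public:
    explicit circle(int r) : r(r) {}
    virtual bool equal(const shape* other) const
    {
        // The concrete type is only recoverable at run time:
        const circle* c = dynamic_cast<const circle*>(other);
        return c != 0 && c->r == r;
    }
  private:
    int r;
};

class square : public shape
{
  public:
    explicit square(int s) : s(s) {}
    virtual bool equal(const shape* other) const
    {
        const square* q = dynamic_cast<const square*>(other);
        return q != 0 && q->s == s;
    }
  private:
    int s;
};
```

With the static polymorphism of Section 7.5.2, `circle::equal(const circle&)` would simply take the precise type, and a mixed comparison would not even compile.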



7.6 Best of Both Worlds

The object-oriented programming paradigm offers mechanisms to write libraries that are open for extension, but it tends to impose intrusive interface requirements on the types that will be supplied to the library. The generic programming paradigm has seen much success in C++, partly due to the fact that libraries remain open to extension without imposing the need to intrusively inherit from particular abstract base classes. However, the static polymorphism that is a staple of programming with templates and overloads in C++ limits the applicability of generic programming in application domains where more dynamic polymorphism is required.

In combining elements of object-oriented programming with those of generic programming, we take generic programming as the starting point, retaining its central ideas. In particular, generic programming is built upon the notion of value types that are assignable and copy constructible. The behavior expected from value types reflects that of C++ built-in types, like int, double, and so forth. This generally assumes that types encapsulate their memory and resource management in their constructors, copy constructors, assignment operators, and destructors, so that objects can be copied, passed as parameters by copy, etc., without worrying about references to their resources becoming aliased or dangling. Value types simplify local reasoning about programs. Explicitly managing objects on the heap and using pass-by-reference as the parameter passing mode makes for complex object ownership management (and object lifetime management in languages that are not garbage collected). Instead, explicitly visible mechanisms, i.e. thin wrapper types like reference_wrapper in the (draft) C++ standard library, are used when sharing is desired.
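The reference_wrapper mentioned above became std::reference_wrapper in <functional> with C++11 (previously available via TR1 and Boost). A minimal sketch, contrasting plain value semantics with explicitly visible sharing; the helper name is illustrative:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Modifies first a value copy and then a reference_wrapper alias of a
// counter; returns the counter so the difference is observable.
inline int sharing_demo()
{
    int counter = 0;

    // Value semantics: the vector stores independent copies of 'counter'.
    std::vector<int> copies(3, counter);
    copies[0] = 42;                 // 'counter' is unaffected

    // Explicit sharing: the aliasing is visible in the type.
    std::reference_wrapper<int> alias = std::ref(counter);
    alias.get() = 7;                // writes through to 'counter'

    return counter;
}
```

The point is precisely the one made in the text: sharing is the exception and is spelled out in the type, while plain objects keep copy semantics and simplify local reasoning.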

.. more to come ..

7.6.1 Compile Time Container

7.6.2 Meta-Functions

7.6.3 Run-Time Concepts




Part II

Using C++


Chapter 8

Finite World of Computers

8.1 Mathematical Objects inside the Computer

First, the natural numbers N and the data types available in a programming language to represent them are introduced. The distinction between a number and its single digits, and their connection to the base used, is an important concept in computer science.

A number is represented by several single digits, with each digit being a factor <strong>for</strong> a corresponding<br />

power of the base. The number is only complete when both the base and all of the digits are<br />

known. To use an example, the digit sequence 123 is calculated with the corresponding base,<br />

e.g. base = 10:<br />

123_{10} = 1 · 10^2 + 2 · 10^1 + 3 · 10^0<br />

If the base is switched, e.g. to base = 4, then the following value is obtained:<br />

123_4 = 1 · 4^2 + 2 · 4^1 + 3 · 4^0 = 27_{10}<br />
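The positional rule translates directly into a few lines of C++ (a small helper of our own, not from the book, evaluating the digits with Horner's scheme):<br />

```cpp
#include <vector>

// Evaluate a digit sequence in positional notation.
// Digits are given most significant first, e.g. {1, 2, 3} for "123".
long positional_value(const std::vector<int>& digits, long base)
{
    long value = 0;
    for (int d : digits)
        value = value * base + d;   // Horner's scheme over the base
    return value;
}
```

For the examples above, positional_value({1, 2, 3}, 10) yields 123 and positional_value({1, 2, 3}, 4) yields 27.<br />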

One of the drawbacks of the representation of numbers within the computer is the fact that built-in types such as int and long use only a finite number of bits and are hence limited in their range (in declarations such as short int, the int may be omitted), e.g.:<br />
short int: -32768 ... +32767<br />
long int: -2147483648 ... +2147483647<br />
unsigned long int: 0 ... +4294967295<br />
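The exact bounds need not be memorized: std::numeric_limits reports them for every built-in type, and a small sketch of our own also shows that unsigned arithmetic wraps around at the upper bound:<br />

```cpp
#include <limits>

// One step beyond the largest representable unsigned int:
// unsigned overflow is well defined and wraps around to 0.
unsigned int wrap_around()
{
    unsigned int u = std::numeric_limits<unsigned int>::max();
    return u + 1;
}
```

The same header reports, e.g., std::numeric_limits&lt;short&gt;::max() == 32767 on the common platforms assumed in the table above.<br />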

As can be seen, the maximum number of countable items is restricted. If, as an example, a program has to count the living humans on earth, we have to switch to another number concept, either floating point or a decimal data type. A plain and simple arbitrary-digit number container can be implemented by:<br />

class big_number {<br />
    long base;<br />
    std::vector<int> digits;<br />
public:<br />
    // .........<br />
};<br />
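A runnable miniature of this idea, as one possible sketch (the book's class is only outlined above; the digit type, member names, and operations are our choice), stores the digits least significant first and propagates carries after addition:<br />

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class big_number {
    long base;
    std::vector<int> digits;          // least significant digit first
public:
    big_number(long b, std::vector<int> d) : base(b), digits(std::move(d)) {}

    // Digit-wise addition with carry propagation; both numbers share the base.
    big_number operator+(const big_number& other) const
    {
        std::vector<int> result;
        int carry = 0;
        std::size_t n = std::max(digits.size(), other.digits.size());
        for (std::size_t i = 0; i < n || carry; ++i) {
            int s = carry;
            if (i < digits.size()) s += digits[i];
            if (i < other.digits.size()) s += other.digits[i];
            result.push_back(s % base);
            carry = s / base;
        }
        return big_number(base, result);
    }

    // Most significant digit first; only meaningful for base <= 10 here.
    std::string to_string() const
    {
        std::string s;
        for (auto it = digits.rbegin(); it != digits.rend(); ++it)
            s += std::to_string(*it);
        return s.empty() ? "0" : s;
    }
};
```

Adding 999 and 2 in base 10 then yields 1001, without any fixed upper bound on the number of digits.<br />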

8.2 More Numbers and Basic Structure<br />

Polynomials are an important and efficient tool in numerous fields of science. They can be defined as a weighted sum of exponential terms in at least one variable or expression, with the exponents restricted to non-negative whole numbers. Their simple definition, the simple rules for differentiation and integration, and the fact that their algebraic structure is closed not only under addition, subtraction, and multiplication but also under differentiation and integration result in their widespread application. Demanding additional properties, e.g. orthogonality with respect to an inner product, leads to special classes of polynomials, the orthogonal polynomials, which further increases their appeal in fields such as finite elements. A polynomial consists of coefficients (a_i) and a variable expression (x^i):<br />

a_0 x^0 + a_1 x^1 + a_2 x^2 + ... + a_n x^n<br />

Thus a container representation was chosen to store the coefficients, so that a generic C++ variable contains the expression:<br />

gsse::polynomial<br />

When storing the coefficients in a container great care has been taken to implement the library<br />

to be generic with respect to the type of the underlying data structure. In this way it is possible<br />

to use compile time containers if the size or even the concrete coefficients are already known at<br />

compile time. This allows the compiler to inline and execute operations at compile time.<br />

The most suitable container to use <strong>for</strong> the coefficients usually depends on the input and not<br />

the algorithms. It is there<strong>for</strong>e important to provide a basic set of programming utilities which<br />

are generic with regard to the used container type. Compile time and run time containers have<br />

a few incompatible requirements which make it hard to define a common set of utilities.<br />

8.2.1 Accessing Coefficients<br />

Accessing a polynomial's coefficients is an important operation. There are two basic ways of accessing a coefficient. Compile-time accessors are used when the index of the coefficient to be accessed is known at compile time, while run-time accessors have to be used otherwise. The compile-time version takes the index as a template parameter, while the run-time version takes it as a function argument.<br />

namespace compiletime {<br />
    template <long N, typename Polynomial><br />
    typename result_of::coeff<N, Polynomial>::type<br />
    coeff(Polynomial const& p);<br />
}<br />
namespace runtime {<br />
    template <typename Polynomial><br />
    typename result_of::coeff<Polynomial>::type<br />
    coeff(index_type n, Polynomial const& p);<br />
}<br />

Access to the coefficient is then available by:<br />

polynomial p;<br />
compiletime::coeff<0>(p);<br />
runtime::coeff(n, p);<br />

Thus it is possible for the compiler to simplify the code and to determine more information about the coefficient. The compile-time version is therefore also more flexible than the run-time version.<br />
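The distinction can be mimicked with standard C++ alone (a simplified stand-in for the library's accessors, not the GSSE interface itself): std::get takes the index as a template parameter and therefore works on heterogeneous tuples, while vector indexing takes a run-time argument and requires one common element type:<br />

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Compile-time access: the index is a template parameter,
// so each element may have a different type.
template <std::size_t N, typename... Ts>
auto ct_coeff(const std::tuple<Ts...>& coeffs)
{
    return std::get<N>(coeffs);
}

// Run-time access: the index is a function argument,
// so all elements must share one type.
double rt_coeff(std::size_t n, const std::vector<double>& coeffs)
{
    return coeffs[n];
}
```

For std::tuple there is no run-time std::get either: the return type would depend on a value known only at run time, which is exactly the limitation discussed next.<br />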

Using inhomogeneous compile time containers in conjunction with the run time accessor is not<br />

possible since it is not possible to determine the return type in advance. This reduces the<br />

flexibility of the code using the run time accessors. A workaround to this problem can be<br />

achieved by using the visitor pattern.<br />

template <typename Polynomial, typename Visitor><br />
void coeff_visitor(index_type n,<br />
                   Polynomial const& p,<br />
                   Visitor v);<br />

However, this approach has the disadvantage of being more complicated to use than the coeff<br />

function.<br />

The coefficient accessors are not simple wrappers around the accessors of the underlying container.<br />

They check the access and return a zero value if the container does not contain the<br />

coefficient. The zero value is determined by the coeff trait template class:<br />

template <typename CoeffType><br />
struct coeff_trait<br />
{<br />
    typedef CoeffType zero_type;<br />
    static zero_type const zero_value = zero_type();<br />
};<br />

By using partial template specialization it is possible to define the corresponding zero value for each type. For inhomogeneous polynomials, default_coeff is passed as CoeffType and the default behavior is to return an int.<br />

8.2.2 Setting Coefficients<br />

Coefficients may be set using the set_coeff function. It does not change the given polynomial but creates a new view instead, which gives the polynomial library a functional programming style. Setting the coefficients by actually changing the polynomial can only be achieved by directly manipulating the coefficient container.<br />


228 CHAPTER 8. FINITE WORLD OF COMPUTERS<br />

namespace compiletime<br />
{<br />
    template <long N, typename Polynomial, typename Coeff><br />
    typename result_of::set_coeff<N, Polynomial, Coeff>::type<br />
    set_coeff(Polynomial const& p, Coeff const& c);<br />
}<br />
namespace runtime<br />
{<br />
    template <typename Polynomial, typename Coeff><br />
    typename result_of::set_coeff<Polynomial, Coeff>::type<br />
    set_coeff(index_type n, Polynomial const& p, Coeff const& c);<br />
}<br />

Write access is then available by:<br />

polynomial p;<br />
compiletime::set_coeff<0>(p, 1);<br />
runtime::set_coeff(n, p, 1);<br />

The degree of a polynomial is defined as the maximum degree of all of its terms, where the degree of a term is given as the sum of the degrees of all variables in this term. The polynomial library defines the degree as the index of the highest non-zero coefficient. Obtaining the correct total degree therefore requires using one polynomial per variable and finally combining them:<br />

degree(3 x^4 y^2) = 4 + 2 = 6<br />

struct X;<br />
struct Y;<br />
typedef polynomial<<br />
    X,<br />
    fusion::map< pair< mpl::int_<4>, double > ><br />
> inner_poly;<br />
typedef polynomial<<br />
    Y,<br />
    fusion::map< pair< mpl::int_<2>, inner_poly > ><br />
> the_polynomial;<br />

By instantiating the polynomial the calculation of its degree is possible:<br />

the_polynomial p;<br />

assert( degree(p) == 6 );


8.2. MORE NUMBERS AND BASIC STRUCTURE 229<br />

8.2.3 Compile Time Programming<br />

This section presents an application of meta-programming that utilizes the compiler to execute code at compile time and to reduce the result of the expressions. As an example, the derivative of a second-degree polynomial is calculated and a second polynomial is added:<br />

d/dx (3 + 4.5 x + 10 x^2) + (1 + 2 x) = 5.5 + 22 x<br />

The type list gives the type of each coefficient, from the zeroth-degree up to the second-degree coefficient.<br />

struct X { } x;<br />
typedef fusion::vector<double, double, int> coeffs;<br />
typedef polynomial<X, coeffs> poly;<br />
poly p(x, coeffs(3.0, 4.5, 10));<br />
typedef result_of::diff<poly, X>::type diffed;<br />
diffed d = diff(p, x);<br />
poly q(x, coeffs(1.0, 2.0, 0));<br />
std::cout << coeff(q + d);<br />

Compiling the example and inspecting the generated assembler code reveals that the calculations were performed at compile time and that the binary only contains the final result of 22.<br />
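Modern C++ offers a more direct route to the same demonstration: with constexpr (a plain language feature, independent of the fusion machinery above; names are ours) the derivative-plus-addition example can be folded at compile time and verified with static_assert:<br />

```cpp
// d/dx(3 + 4.5 x + 10 x^2) = 4.5 + 20 x; adding (1 + 2 x) gives 5.5 + 22 x.
constexpr double diff_coeff1(double a2) { return 2.0 * a2; }  // derivative of a2 x^2
constexpr double result_coeff1 = diff_coeff1(10.0) + 2.0;     // 20 + 2

static_assert(result_coeff1 == 22.0, "computed and checked at compile time");
```

If the static_assert compiles, the value 22 was necessarily computed by the compiler, so nothing remains to be done at run time.<br />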

8.2.4 Arbitrary-Precision Arithmetic<br />

The application of the polynomial library to per<strong>for</strong>m arbitrary-precision arithmetic (or “bignum<br />

arithmetic”) is also presented here. It uses the fact that a number is in essence a polynomial<br />

with a fixed base.<br />

1372 = 1 · 10^3 + 3 · 10^2 + 7 · 10 + 2<br />

This can easily be translated into C++ code using the polynomial library. Note that the first element in the array is the zeroth coefficient:<br />
typedef unsigned char byte_t;<br />
typedef array<byte_t, 4> coeffs_t;<br />
coeffs_t coeffs = {{2, 7, 3, 1}};<br />
gsse::polynomial p(coeffs);<br />

Since computer systems usually operate on binary numbers, base 2 is the optimal choice. The difference between polynomial arithmetic and arbitrary-precision arithmetic is that the coefficients need to be realigned to the base (carry propagation) after each operation.<br />


230 CHAPTER 8. FINITE WORLD OF COMPUTERS<br />

8.2.5 Finite Element Integration<br />

In the theory of finite elements [?, ?], a continuous function space is projected onto a finite function space P_k, where P_k is the space of polynomials up to total order k.<br />

For many special cases, finite element integrals can be computed manually and added into<br />

the source code of an application. This results in excellent run time per<strong>for</strong>mance but lacks<br />

flexibility. For more general cases, e.g., general coefficients, they must be computed by numerical<br />

integration at run-time. To prevent an ill-conditioned system matrix, orthogonal polynomials<br />

have to be chosen as numerical integration weights. One possible choice is the normalized Legendre polynomials [?]. Such a polynomial P_k of order k can be evaluated efficiently using the recursion:<br />

P_0(x) = 1<br />
P_1(x) = x   (8.1)<br />
P_k(x) = ((2k − 1)/k) x P_{k−1}(x) − ((k − 1)/k) P_{k−2}(x),   k ≥ 2<br />
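The three-term recursion translates directly into a short loop (our own sketch, not the book's library code):<br />

```cpp
// Evaluate the Legendre polynomial P_k at x via the three-term recursion
//   P_j(x) = ((2j - 1)/j) x P_{j-1}(x) - ((j - 1)/j) P_{j-2}(x).
double legendre(int k, double x)
{
    if (k == 0) return 1.0;
    if (k == 1) return x;
    double pkm2 = 1.0, pkm1 = x, pk = x;
    for (int j = 2; j <= k; ++j) {
        pk = ((2.0 * j - 1.0) / j) * x * pkm1 - ((j - 1.0) / j) * pkm2;
        pkm2 = pkm1;
        pkm1 = pk;
    }
    return pk;
}
```

The loop needs only the last two values, so the evaluation runs in O(k) time and O(1) extra space; e.g. legendre(2, x) reproduces P_2(x) = (3x^2 − 1)/2.<br />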

To use arbitrary p-finite elements (arbitrary polynomial order [?, ?]), the numerical coefficients have to be either calculated manually and inserted into the source code or determined numerically at run time.<br />

The polynomial library presented here is then used to store manually pre-calculated integration tables at compile time (orders 1-5). If the user requires higher-order finite elements, numerical coefficients are calculated at run time to any order.<br />

8.3 A Loop and More<br />

One of the important concepts in computer science is repetition. A computer is made for exactly this: programmable operations and repetitions. To give a simple example, a for loop is expressed by:<br />
for (long i = 0; i < max_counter; ++i)<br />
{}<br />

To give a real application of this concept, integration is used.<br />

∫_a^b f(x) dx<br />

Several approximation schemes are also available:<br />

∫_a^b f(x) dx ≈ ((f(a) + f(b))/2) · (b − a)<br />

∫_a^b f(x) dx ≈ ((b − a)/6) · ( f(a) + 4 f((a + b)/2) + f(b) )<br />


8.4. THE OTHER WAY AROUND 231<br />

As can be seen, this is a very coarse approximation, but the main idea persists: the familiar continuous integration is not possible inside the computer, but numerical integration is. The infinitesimal dx is replaced by a finite ∆x, and the integral sign ∫ is replaced by a finite sum over i.<br />
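A minimal numerical integrator along these lines, here using the composite trapezoid rule (function and names are ours, not from the book):<br />

```cpp
#include <functional>

// Approximate the integral of f over [a, b] with n trapezoids of width dx.
double integrate(const std::function<double(double)>& f,
                 double a, double b, long n)
{
    double dx  = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));   // end points enter with weight 1/2
    for (long i = 1; i < n; ++i)
        sum += f(a + i * dx);
    return sum * dx;
}
```

For f(x) = x^2 on [0, 1] with n = 1000 the result is within about 2e-7 of the exact value 1/3; increasing n shrinks ∆x and thus the error.<br />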




Chapter 9<br />

How to Handle Physics on the Computer<br />

9.1 Finite Elements<br />

Discretization schemes lead in general to a linear system of equations:<br />

A x = f   (9.1)<br />

These matrices are typically:<br />
• sparse (there are only few non-zero elements per row)<br />
• of large dimension N (10^4 to 10^9 unknowns)<br />

A non-zero element A_{i,j} of the matrix represents a finite element connecting the degrees of freedom i and j.<br />
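Such matrices are typically stored in a compressed sparse format rather than as dense arrays; a minimal matrix-vector product in CRS (compressed row storage), as an illustrative sketch of our own:<br />

```cpp
#include <cstddef>
#include <vector>

// Minimal CRS (compressed row storage) sparse matrix-vector product y = A x.
// values/col_index hold the non-zeros row by row; row_start[i] marks where row i begins.
std::vector<double> crs_matvec(const std::vector<double>&      values,
                               const std::vector<std::size_t>& col_index,
                               const std::vector<std::size_t>& row_start,
                               const std::vector<double>&      x)
{
    std::vector<double> y(row_start.size() - 1, 0.0);
    for (std::size_t i = 0; i + 1 < row_start.size(); ++i)
        for (std::size_t k = row_start[i]; k < row_start[i + 1]; ++k)
            y[i] += values[k] * x[col_index[k]];
    return y;
}
```

Only the non-zeros are stored and touched, so one matrix-vector product costs O(nnz) instead of O(N^2) operations.<br />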

To demonstrate the transfer of a continuously formulated equation such as the Laplace or Poisson equation to the finite regime of a computer, a simple Dirichlet problem is used. If an implicit (uniform) 1D grid with n elements is used, the contribution of each element to the system matrix A is constant, the so-called stencil sub-matrix:<br />

A = ( 2 −1; −1 2 −1; ...; −1 2 −1; −1 2 ), a tridiagonal matrix of size (n − 1) × (n − 1).<br />

The corresponding system matrix of a 2D implicit grid of dimension N = (n − 1)^2 is:<br />



A = ( D −I; −I D −I; ...; −I D −I; −I D )<br />

with the (n − 1) × (n − 1) tridiagonal block<br />

D = ( 4 −1; −1 4 −1; ...; −1 4 −1; −1 4 )<br />

and the (n − 1) × (n − 1) identity matrix I.<br />

9.2 Again, Integrators<br />


Chapter 10<br />

Programming tools<br />

In this chapter we introduce programming tools that can be used to solve the exercises.<br />

10.1 GCC<br />

GCC stands for the GNU Compiler Collection. It is a collection of compilers (C, C++, Fortran, Fortran 90, Java) available free of charge [?]. The C++ compilers are very good and produce reasonably efficient code. In this section, we explain how to compile a C++ program.<br />

The following command:<br />

g++ -o hello hello.cpp<br />

compiles the <strong>C++</strong> source file hello.cpp into the executable hello.<br />

The compiler command is gcc or g++ with the following options.<br />

• -Idirectory: Include files directory<br />

• -O: Optimization<br />

• -g: Debugging<br />

• -p: Profiling<br />

• -o filename: output file name<br />

• -c: Compile, no link<br />

• -Ldirectory: Library directory<br />

• -lfile: Link with library libfile.a<br />

Here is another example:<br />

g++ -o foo foo.cpp -I/opt/include -L/opt/lib -lblas<br />

compiles and links the file foo.cpp using include files from /opt/include (option -I) and linking with the BLAS library situated in the directory /opt/lib (options -L and -l). For optimized code, we have to use the compilation options:<br />
-O3 -DNDEBUG<br />




The -DNDEBUG option sets the C preprocessor macro NDEBUG, which tells the assert command that debug tests should not be performed. This saves time at execution.<br />
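The effect is easy to see with a small example of our own: the assert below is checked in a normal build and disappears entirely when -DNDEBUG is given, so the precondition costs nothing in the optimized binary:<br />

```cpp
#include <cassert>
#include <cmath>

// Precondition check that vanishes under -DNDEBUG.
double checked_sqrt(double x)
{
    assert(x >= 0.0 && "checked_sqrt requires a non-negative argument");
    return std::sqrt(x);
}
```

Compiled without NDEBUG, checked_sqrt(-1.0) aborts with a diagnostic; with -O3 -DNDEBUG the check is removed and only the sqrt remains.<br />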

10.2 Debugging<br />

10.2.1 Debugging with text tools<br />

“And there you tell yourself it is over, for worse than this would be death. Just when you finally believe you are through, when there is no more, well, there is still more!”<br />
— Stromae<br />

There are several debugging tools. In general, graphical ones are more user friendly, but they<br />

are not always available. In this section, we describe the gdb debugger, which is very useful to<br />

trace the cause of a run time error if the code was compiled with the option -g.<br />

The following contains a printout of a gdb session of the program hello.cpp:<br />

#include <br />

#include <br />

int main() {<br />

glas::dense vector< int > x( 2 ) ;<br />

x(0) = 1 ; x(1) = 2 ;<br />

for (int i=0; i<3; ++i) std::cout << x(i) << std::endl ;<br />



T& glas::continuous_dense_vector<T>::operator()(ptrdiff_t) [with T = int]:<br />
Assertion `i < size_' failed.<br />



10.2.2 Debugging with graphical interface: DDD<br />

More convenient than debugging on a text level is using a graphical interface like DDD (Data<br />

Display Debugger). It has more or less the same functionality as gdb and in fact it runs gdb<br />

internally. One can also use it with another text debugger.<br />

As a case study, we use a modified example from Section 5.4.5. In fact, the buggy program arose while teaching § 5.4.5, i.e. one of the authors tried to reconstruct vector_unroll_example2.cpp on the fly.<br />

TODO: Find a better example. The above finally was okay, the tuning just did not change the<br />

run-time behaviour.<br />

In addition to the window above you will see a smaller one like in Figure 10.1, typically to the right of the large window if there is enough space on your screen.<br />
This control panel lets you steer through the debug session in a way that is easier for beginners and even more convenient for some advanced users. You have the following commands:<br />

Run Start or restart your program.<br />

Interrupt If your program does not terminate or does not reach the next<br />

break point you can stop it manually.<br />

Step Go one step <strong>for</strong>ward. If your position is a function call, jump into<br />

the function.<br />

Next Go to the next line in your source code. If you are located on a<br />

function call do not jump into it unless there is a break point set<br />

inside.<br />

Figure 10.1: DDD control panel<br />



Stepi and Nexti These are the equivalents at the instruction level. They are only needed for debugging assembler code and are not a subject of this book.<br />

Until Position your cursor in your source and run the program until it reaches this line. If the program flow does not pass this line, execution continues until the end, the next break point, or a bug.<br />

Finish Execute the remainder of the current function and stop in the first<br />

line outside this function, i.e. the line after the function call.<br />

Cont Continue your execution till the next event (break point, bug, or<br />

end).<br />

Kill Abort the program.<br />

Up Show the line of the current function’s call, i.e. go up one level in the<br />

call stack.<br />

Down Go back to the called function, i.e. go down one level in the call<br />

stack.<br />

Undo Revert last action (works rarely or never).<br />

Redo Repeat the last command.<br />

Edit Call an editor with the source file currently shown.<br />

Make Call ‘make’ (which must know what to compile).<br />

10.3 Valgrind<br />

The valgrind distribution offers several tools that you can use to analyze your software. We will only use one of these tools, called memcheck. For more information on the others we refer you to http://valgrind.org. Memcheck detects memory-management problems like memory leaks. Memcheck also reports if your program accesses memory it should not, or if it uses uninitialized values. All these errors are reported as soon as they occur, along with the source line number at which they occurred and a stack trace of the functions called to reach that line. You should also take into account that Memcheck runs programs about 10 to 30 times slower than normal. Use the following command to check the memory management of a program:<br />
valgrind --tool=memcheck program_name<br />

10.4 Gnuplot<br />

A useful tool <strong>for</strong> making plots is Gnuplot. It is a public domain program.<br />

Invoke gnuplot to start the program. Suppose we have the file results with the following<br />

content:



0 1<br />

0.25 0.968713<br />

0.75 0.740851<br />

1.25 0.401059<br />

1.75 0.0953422<br />

2.25 -0.110732<br />

2.75 -0.215106<br />

3.25 -0.237847<br />

3.75 -0.205626<br />

4.25 -0.145718<br />

4.75 -0.0807886<br />

5.25 -0.0256738<br />

5.75 0.0127226<br />

6.25 0.0335624<br />

6.75 0.0397399<br />

7.25 0.0358296<br />

7.75 0.0265507<br />

8.25 0.0158041<br />

8.75 0.00623965<br />

9.25 -0.000763948<br />

9.75 -0.00486465<br />


The first column represents the x coordinates and the second column contains the corresponding y coordinate values. We can plot this using the command:<br />

plot "results" w l<br />

The command<br />

plot "results"<br />

only plots stars, no line. The command help is also useful. For 3D plots, i.e. a table with three<br />

columns, we use the command splot.<br />

10.5 Unix and Linux<br />

Unix (and Linux) is not used as often as Windows, although for scientific programming it is a popular development platform. The Unix operating system is a command-line system with several graphical interfaces. Especially on Linux, the graphical interfaces are well developed, so that you get a Windows-like look and feel. Although you can easily browse through the directories, create new directories, and move data around with a few mouse clicks, it may be interesting to know at least a few Unix commands:<br />

• ps: list of my processes,<br />

• kill -9 id : kill the process with id id,



• top: list all processes and resource use,<br />

• mkdir: make a new directory,<br />

• rmdir: remove an (empty) directory,<br />

• pwd: name of the current directory,<br />

• cd dir: change directory to dir,<br />

• ls: list the files in the current directory<br />

• cp from to: copy the file from to the file or directory to. If the file to exists, it is overwritten, unless you use cp -i from to,<br />

• mv from to: move the file from to the file or directory to. If the file to exists, it is<br />

overwritten, unless you use mv -i from to,<br />

• rm files: remove all the files in the list files. rm * removes everything (be careful),<br />
• chmod mode files: change the access mode for files.<br />

See http://www.physics.wm.edu/unix_intro/outline.html <strong>for</strong> on-line help.




Chapter 11<br />

C++ Libraries for Scientific Computing<br />

TODO: Introducing words.<br />

11.1 GLAS: Generic Linear Algebra Software<br />

11.1.1 Introduction<br />

Software kernels for dense and sparse linear algebra have been developed over many decades. The development started with the BLAS [?] [?] [?] [?] [?] in FORTRAN and was later followed by similar work in C++; see MTL [?] and Blitz++, to name a few.<br />

Currently, more and more scientific software is written in <strong>C++</strong>, but the language does not<br />

provide us with dense and sparse vector and matrix concepts and algorithms, as this is the<br />

case <strong>for</strong> Matlab. This makes exchanging <strong>C++</strong> software harder than, <strong>for</strong> example, Fortran 90<br />

software, which has dense vector and matrix concepts defined in the language. Note that<br />

Fortran 90 does not have sparse and structured matrix types such as symmetric or upper<br />

triangular, or banded matrices.<br />

11.1.2 Goal<br />

The goal of the GLAS project is to open the discussion on standardization for C++ programming. The goal is not to present a standard as such, but the project may be a first step towards achieving this goal.<br />

We realize that this is very ambitious. We think the GLAS proposal meets the goals, but the internals are still rather complicated, which makes extensions less straightforward. GLAS is a generic software package using advanced meta-programming tools such as the Boost MPL, but this is invisible to a user who does not want to add extensions to GLAS. Some basic knowledge of template programming and expression templates is required for making proper use of the software.<br />

This version does not use Concept <strong>C++</strong>, since we have encountered instability problems with<br />

the Concept-GCC compiler and found it hard to work with expression-templates.<br />




We now briefly explain how the goals are met before entering a more detailed discussion of the software design.<br />

GLAS should be considered as an interface to other software for linear algebra, e.g. the BLAS, MTL, or other linear algebra packages. Such an interface is provided by the back-ends, whereas the syntax for using such back-ends does not change. For example, if we want to add a scaled vector to another vector (an axpy), then we write<br />
y += a * x ;<br />

but the implementation can use the BLAS (e.g. daxpy), MTL, or another package. We have provided a reference C++ implementation that illustrates how the expressions are dispatched to the actual implementation.<br />
The concepts mainly contain free functions and meta-functions, so that external objects can be used in GLAS provided these functions are specialized. As an exercise, we show how this can be done for an std::vector.<br />
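The free-function approach can be sketched in a few lines (names are illustrative and not GLAS's actual interface): a generic axpy is written only in terms of free access functions, which are then overloaded for std::vector so that the external type plugs in unchanged:<br />

```cpp
#include <cstddef>
#include <vector>

// Free functions that adapt std::vector to the generic algorithm below.
// An external type is supported by supplying the same two functions for it.
template <typename T>
std::size_t vec_size(const std::vector<T>& v) { return v.size(); }

template <typename T>
T& vec_at(std::vector<T>& v, std::size_t i) { return v[i]; }

template <typename T>
const T& vec_at(const std::vector<T>& v, std::size_t i) { return v[i]; }

// Generic axpy: y <- y + a*x, expressed only via the free functions.
template <typename Scalar, typename VecX, typename VecY>
void axpy(Scalar a, const VecX& x, VecY& y)
{
    for (std::size_t i = 0; i < vec_size(x); ++i)
        vec_at(y, i) += a * vec_at(x, i);
}
```

The algorithm never touches std::vector directly, which is the design point: the adaptation lives entirely in the overloaded free functions.<br />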

For more in<strong>for</strong>mation, see [?].<br />

11.1.3 Status<br />

GLAS is still under development. Currently, there are features for working with dense vectors and matrices, and with sparse matrices. There is support for the Boost.Sandbox.Bindings, and there are toolboxes for working with LAPACK, structured matrices (mase toolbox), and iterative methods (iterative toolbox).<br />

11.2 Boost<br />

Boost is a bit out of line in this chapter. Firstly, it is not a library itself but a whole collection<br />

of freely available C ++ libraries. Secondly, not all of the contained libraries deal directly with<br />

scientific computing. However, many of the “non-scientific” libraries provide useful functionality<br />

<strong>for</strong> scientific libraries and applications.<br />

Boost provides free portable <strong>C++</strong> libraries.<br />

Currently, the following Boost libraries are available that are useful <strong>for</strong> numerical software:<br />

• Data structures<br />

– tuple: pairs, triples, etc, e.g. tuple<br />

– smart ptr: smart pointers<br />

• Correctness and testing<br />

– static assert: compile time assertions<br />

• Template programming<br />

– enable if, mpl, type traits<br />

– static assert: compile time assertions



• Math and numerics<br />

– numeric::conversions: conversions of types<br />

– thread: multi-threading<br />

– bindings: generic bindings to external software<br />

– graph: graph programs<br />

– integer: integer types<br />

– interval: interval arithmetic<br />

– random: random number generator<br />

– rational: rational numbers<br />

– math: various mathematical things, e.g. greatest common divisor<br />

– typeof: type deduction<br />

– numeric::ublas: vector and matrix library<br />

– math::quaternion, math::octonian<br />

– math::special functions<br />

• Miscellaneous<br />

– filesystem: advanced operations on files, directories<br />

– program options: working with command line options in your<br />

– timer: timing class<br />

For more in<strong>for</strong>mation on these and other boost libraries see http://www.boost.org.<br />
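Several of these libraries later migrated into the C++ standard; boost::tuple, for instance, is the ancestor of std::tuple, so its flavor can be shown without Boost installed (a tiny example of our own):<br />

```cpp
#include <string>
#include <tuple>

// A (name, row, column) record as a tuple: heterogeneous types,
// fixed length, element access by compile-time index.
std::tuple<std::string, int, int> make_entry()
{
    return std::make_tuple("a", 2, 3);
}
```

std::get&lt;0&gt;(t) yields the string, std::get&lt;1&gt;(t) the row, and so on; the index must be known at compile time because each position has its own type.<br />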

11.3 Boost.Bindings<br />

Scientific programmers using <strong>C++</strong> also want to use the features offered by mature FORTRAN<br />

and C codes such as LAPACK [?], MUMPS [?] [?], SuperLU [?] and UMFPACK [?]. The<br />

programming effort for rewriting these codes in C++ is very high. It therefore makes more sense to link the codes into C++ code. Another argument for linking with external software is performance: the vendor-tuned BLAS functions are perhaps the most obvious example.<br />

In the traditional approach, an interface is developed <strong>for</strong> each basic <strong>C++</strong> linear algebra package<br />

and <strong>for</strong> each external linear algebra package. This is illustrated by Figure 11.1. The Boost<br />

bindings adopt the approach of orthogonality between algorithms and data. This orthogonality<br />

is created by traits classes that provide the necessary data to the external software. The vector<br />

traits, <strong>for</strong> example, provide a pointer (or address), size and stride, which can then be used<br />

by e.g. the BLAS function ddot. Each traits class is specialized <strong>for</strong> user defined vector and<br />

matrix packages. This implies that, <strong>for</strong> a new vector or matrix type, the development ef<strong>for</strong>t is<br />

limited to the specialization of the traits classes. Once the traits classes are specialized, BLAS<br />

and LAPACK can be used straightaway. For a new external software package, it is sufficient



[Figure 11.1 shows uBLAS, MTL, GLAS, ... each with its own pairwise interface to BLAS, LAPACK, ATLAS, MUMPS, ...]<br />

Figure 11.1: Traditional interfaces between software<br />

[Figure 11.2 shows uBLAS, MTL, GLAS, ... connected through a single Bindings layer to BLAS, LAPACK, ATLAS, MUMPS, ...]<br />

Figure 11.2: Concept of bindings as a generic layer between linear algebra algorithms and vector<br />

and matrix software<br />

to provide a layer that uses the bindings. Figure 11.2 illustrates this philosophy. Note the<br />

difference with Figure 11.1.<br />

11.3.1 Software bindings<br />

We now illustrate how the bindings can be used to interface external software by means of<br />

examples.<br />

BLAS bindings<br />

The BLAS are the Basic Linear Algebra Subprograms [?] [?] [?] [?] [?], whose reference implementation is available through Netlib¹. The BLAS are subdivided into three levels: level one contains vector operations, level two matrix-vector operations, and level three matrix-matrix operations.<br />

The BLAS bindings in Boost Sandbox contain interfaces to some BLAS functions. Functions<br />

are added on request. The interfaces check the input arguments using the assert command,<br />

which is only compiled when the NDEBUG compile flag is not set. The interfaces are contained<br />

in three files : blas1.hpp, blas2.hpp, and blas3.hpp in the directory boost/numeric/bindings/blas. The<br />

BLAS bindings reside in the namespace boost::numeric::bindings::blas.<br />

The BLAS provide functions for vectors and matrices with value type float, double, std::complex<float>, and std::complex<double>. All matrix containers have ordering type column_major_t, since the (FORTRAN) BLAS assume column-major matrices.<br />

The bindings are illustrated in Figure 11.3 for the BLAS subprograms DCOPY, DSCAL, and DAXPY for objects of type std::vector<double>. Note the include files for the bindings of the BLAS-1 subprograms and the include file that contains the specialization of the vector traits for std::vector.<br />

1 http://www.netlib.org<br />


#include <boost/numeric/bindings/blas/blas1.hpp><br />
#include <boost/numeric/bindings/traits/std_vector.hpp><br />
#include <vector><br />
<br />
namespace bindings = boost::numeric::bindings ;<br />
<br />
int main() {<br />
  std::vector< double > x( 10 ), y( 10 ) ;<br />
  // Fill the vector x<br />
  ...<br />
  bindings::blas::copy( x, y ) ;<br />
  bindings::blas::scal( 2.0, y ) ;<br />
  bindings::blas::axpy( -3.0, x, y ) ;<br />
  return 0 ;<br />
}<br />

Figure 11.3: Example for BLAS-1 bindings and std::vector bindings traits<br />
<br />
LAPACK bindings<br />

Software for dense and banded matrices is collected in LAPACK [?]. It is a collection of FORTRAN routines mainly for solving linear systems and eigenvalue problems, including the singular value decomposition. As for the BLAS, the Boost Sandbox does not contain a full set of interfaces to LAPACK routines, but only the most commonly used subprograms. On request, more functions are added to the library. The LAPACK bindings reside in the namespace boost::numeric::bindings::lapack.<br />

Many LAPACK subroutines require auxiliary arrays, which a non-expert user does not want to allocate manually. The interface therefore allows the user to allocate auxiliary vectors using the templated Boost.Bindings class array.<br />

The LAPACK bindings verify the matrix structure to see whether the routine is the right choice. It is also checked whether the matrix arguments are column-major. Every function's return type is int; the return value corresponds to the INFO argument of the underlying LAPACK subprogram.<br />

Figure 11.4 shows an example using GLAS.<br />

MUMPS bindings<br />

MUMPS stands for MUltifrontal Massively Parallel Solver. The first version was a result of the EU project PARASOL [?, ?, ?]. The software is developed in Fortran 90 and comes with a C interface. The input matrices should be given in coordinate format, i.e. the storage format is coordinate_t, and the index numbering should start from one, i.e. sparse_matrix_traits::index_base == 1. We refer to the MUMPS Users' Guide, distributed with the software [?].<br />

The C++ interface is a generic interface to the respective C structs for the different value types that are available from the MUMPS distribution: float, double, std::complex<float>, and std::complex<double>. The C++ bindings also contain functions that set the pointers and sizes of the parameters in the C struct using the bindings traits classes. An example is given in Figure 11.5. The sparse matrix is the uBLAS coordinate_matrix, which is a sparse matrix in



#include <...>                      // LAPACK gees binding<br />
#include <...>                      // GLAS bindings traits<br />
#include <...>                      // glas::dense_matrix<br />
#include <...>                      // glas::dense_vector<br />
#include <complex><br />
#include <iostream><br />
...<br />
int main () {<br />
  int n=100;<br />
  // Define a real n x n matrix<br />
  glas::dense_matrix< double > matrix( n, n ) ;<br />
  // Define a complex n vector<br />
  glas::dense_vector< std::complex<double> > eigval( n ) ;<br />
  // Fill the matrix<br />
  ...<br />
  // Call LAPACK routine DGEES for computing the eigenvalue Schur form.<br />
  // We create workspace for best performance.<br />
  bindings::lapack::gees( matrix, eigval, bindings::lapack::optimal_workspace() ) ;<br />
  ...<br />
}<br />

Figure 11.4: Example for LAPACK bindings and matrix bindings traits



coordinate format. The matrix is stored column-wise. The template argument 1 indicates that row and column numbers start from one, which is required for the Fortran 90 code MUMPS. Finally, the last argument indicates that the row and column indices are stored in type int, which is also a requirement for the Fortran 90 interface. The solve consists of three phases: (1) the analysis phase, which only needs the matrix's integer data, (2) the factorization phase, where the numerical values are also required, and (3) the solution phase (or backtransformation), where the right-hand side vector is passed in. The included files contain the specializations of the dense matrix and sparse matrix traits for uBLAS and the MUMPS bindings.<br />

11.4 Matrix Template Library<br />

11.5 Blitz++<br />

TODO: We can ask Todd to write something himself — Peter<br />

11.6 Graph Libraries<br />

TODO: Few introducing words from Peter<br />

11.6.1 Boost Graph Library<br />

TODO: I can write something about it — Peter<br />

11.6.2 LEDA<br />

LEDA implements advanced container types and combinatorial algorithms, especially graph algorithms. Containers are parameterized by element type and implementation strategies. In general, the algorithms work only with the data structures of the library itself.<br />

11.7 Geometric Libraries<br />

TODO: Few introducing words from René and Philipp<br />

11.7.1 CGAL<br />

TODO: Ask Sylvain to write something? Or can René and Philipp write it?<br />

CGAL implements generic classes and procedures for geometric computing. Its data structures operate at a very high level of complexity.



#include <...>                      // uBLAS coordinate_matrix<br />
#include <...>                      // uBLAS traits for the bindings<br />
#include <...>                      // MUMPS bindings<br />
<br />
int main() {<br />
  namespace ublas = boost::numeric::ublas ;<br />
  namespace mumps = boost::numeric::bindings::mumps ;<br />
  ...<br />
  typedef ublas::coordinate_matrix< double, ublas::column_major<br />
                                  , 1, ublas::unbounded_array<int><br />
                                  > sparse_matrix_type ;<br />
  sparse_matrix_type matrix( n, n, nnz ) ;<br />
  // Fill the sparse matrix<br />
  ...<br />
  mumps::mumps< sparse_matrix_type > mumps_solver ;<br />
  // Analysis (Set the pointer and sizes of the integer data of the matrix)<br />
  matrix_integer_data( mumps_solver, matrix ) ;<br />
  mumps_solver.job = 1 ;<br />
  mumps::driver( mumps_solver ) ;<br />
  // Factorization (Set the pointer for the values of the matrix)<br />
  matrix_value_data( mumps_solver, matrix ) ;<br />
  mumps_solver.job = 2 ;<br />
  mumps::driver( mumps_solver ) ;<br />
  // Set the right-hand side<br />
  ublas::vector< double > v( n ) ;<br />
  ...<br />
  // Solve (set pointer and size for the right-hand side vector)<br />
  rhs_sol_value_data( mumps_solver, v ) ;<br />
  mumps_solver.job = 3 ;<br />
  mumps::driver( mumps_solver ) ;<br />
  return 0 ;<br />
}<br />

Figure 11.5: Example of the use of the MUMPS bindings



11.7.2 GrAL<br />

TODO: René and Philipp write more?<br />

GrAL implements some of the same concepts as GSSE, but without the generalization of function objects, the three-layer concept (segment, domain, structure), generalized quantity storage, and n-dimensional structured grids.




Chapter 12<br />
<br />
Real-World Programming<br />

12.1 Transcending Legacy Applications<br />

Legacy applications have been written in plain ANSI C or are available as Fortran libraries. It is therefore highly desirable to rejuvenate such an implementation so that it utilizes advanced technologies and techniques, while at the same time keeping as much as possible of the experience and trust already associated with the original code base. One approach is an evolutionary transition: initially include as much of the old implementation as possible and gradually replace it to bring it up to date.<br />

The following examples are based on a particle simulator, in which two important concepts can be separated: scattering mechanisms (the physical behaviour of particles at boundaries (TODO: PS)) and physical model descriptions (how particles interact (TODO: PS)). All available scattering mechanisms are implemented as individual functions, which are called one after another. The scattering models require a variable set of parameters, which leads to non-homogeneous interfaces in the functions representing them. To alleviate this to some extent, global variables have been employed, completely eliminating any aspirations of data encapsulation and posing a serious problem for attempts at parallelization to take advantage of modern multi-core CPUs. The code has the very simple and repetitive structure:<br />

double sum = 0;<br />
double current_rate = generate_random_number();<br />
if (A_key == on)<br />
{<br />
    sum = A_rate(state, parameters);<br />
    if (current_rate < sum)<br />
    {<br />
        counter->A[state->valley]++;<br />
        state_after_A(state, parameters);<br />
        return;<br />
    }<br />
}<br />
sum += B_rate(state, state_2, parameters);<br />
if (current_rate < sum)<br />
{<br />
    counter->B[state->valley]++;<br />
    state_after_B(state, state_2);<br />
    return;<br />
}<br />
...<br />

Extensions to this code are usually accomplished by copy and paste, which is prone to simple mistakes by oversight, such as failing to change the counter that has to be incremented or calling the incorrect function to update the electron's state.<br />
<br />
Furthermore, at times the need arises to calculate the sum of all scattering rates (λ_total), which is accomplished in a different part of the implementation, thus further opening the possibility of inconsistencies between the two code paths.<br />
<br />
The decision which models to evaluate is made strictly at run time, and it would require significant, if simple, modification of the code to change this at compile time, making highly optimized specializations very cumbersome.<br />

The functions calculating the rates and state transitions, however, have been well tested and<br />

verified, so that abandoning them would be wasteful.<br />

12.1.1 Best of Both Worlds<br />

Scientific computing requires not only high-performance components evaluated and optimized at compile time, but also runtime-exchangeable (physical) models and the ability to cope with various boundary conditions. The two most commonly used programming paradigms, object-oriented and generic programming, differ in how the required functionality is implemented. Object-oriented programming directly offers runtime polymorphism by means of virtual inheritance. Unfortunately, current implementations of inheritance use an intrusive approach for new software components and tightly couple a type and the corresponding operations to the super type. In contrast to object-oriented programming, generic programming is limited to algorithms using statically and homogeneously typed containers, but offers highly flexible, reusable, and optimizable software components.<br />
<br />
As can be seen, both programming styles offer different points of evaluation. Runtime polymorphism based on concepts [?] (runtime concepts) tries to combine the runtime modification mechanism of virtual inheritance with compile-time flexibility and optimization.<br />

Inheritance in the context of runtime polymorphism is used to provide an interface template<br />

to model the required concept where the derived class must provide the implementation of the<br />

given interface. The following code snippet<br />

template <typename StateT><br />
struct scatter_facade<br />
{<br />
    typedef StateT  state_type;<br />
    typedef double  numeric_type;   // value type of the rates (assumed here)<br />
<br />
    struct scattering_concept<br />
    {<br />
        virtual ~scattering_concept() {}<br />
        virtual numeric_type rate(const state_type& input) const = 0;<br />
        virtual void transition(state_type& input) = 0;<br />
    };<br />
<br />
    boost::shared_ptr<scattering_concept> scattering_object;<br />
<br />
    template <typename T><br />
    struct scattering_model : scattering_concept<br />
    {<br />
        T scattering_instance;<br />
        scattering_model(const T& x) : scattering_instance(x) {}<br />
        numeric_type rate(const state_type& input) const<br />
        { return scattering_instance.rate(input); }<br />
        void transition(state_type& input)<br />
        { scattering_instance.transition(input); }<br />
    };<br />
<br />
    // forward to the wrapped model<br />
    numeric_type rate(const state_type& input) const<br />
    { return scattering_object->rate(input); }<br />
    void transition(state_type& input)<br />
    { scattering_object->transition(input); }<br />
<br />
    template <typename T><br />
    scatter_facade(const T& x) : scattering_object(new scattering_model<T>(x)) {}<br />
    ~scatter_facade() {}<br />
};<br />

therefore introduces a scatter_facade which wraps a scattering_concept part. Virtual inheritance is used to configure the necessary interface parts, in this case rate() and transition(), which have to be implemented by any scattering model. In the given example the state type is still available for explicit parametrization.<br />

In contrast to other applications of runtime concepts, e.g. in computer graphics, it is not necessary to provide mechanisms for deep copies, since the actual physical models remain unaltered once they have been created; copies would only serve to increase the memory footprint unnecessarily. Therefore a boost::shared_ptr is used for memory management.<br />

The legacy application has been written in plain ANSI C, which makes it easily compatible with the new C++ implementation. Several design decisions, such as the use of global and static variables, make it difficult to extend and to update appropriately for modern multi-core CPUs. To interface with this novel approach, a core structure is implemented which wraps the implementations of the scattering models using runtime concepts.<br />

template <typename ParameterType><br />
struct scattering_rate_A<br />
{<br />
    ...<br />
    const ParameterType& parameters;<br />
<br />
    scattering_rate_A(const ParameterType& parameters) : parameters(parameters) {}<br />
<br />
    template <typename StateType><br />
    numeric_type operator() (const StateType& state) const<br />
    {<br />
        return A_rate(state, parameters);<br />
    }<br />
};<br />



By supplying the required parameters at construction time, it is possible to homogenize the interface of operator(). This methodology also allows the continued use of the old data structures in the initial phases of the transition, while not being so constrictive as to hamper future developments.<br />
<br />
The functions for the state transitions are treated similarly to those for the rate calculation. Both are then fused in a scattering_pack, which forms the complete scattering model, ensures consistency of the rate and state transition calculations, and also models the runtime concept, as can be seen in the following piece of code:<br />

template <typename RateType, typename TransitionType, typename ParameterType><br />
struct scattering_pack<br />
{<br />
    // ...<br />
    typedef RateType       scattering_rate_type;<br />
    typedef TransitionType transition_type;<br />
    typedef ParameterType  parameter_type;<br />
<br />
    scattering_rate_type rate_calculation;<br />
    transition_type      state_transition;<br />
<br />
    scattering_pack(const parameter_type& parameters) :<br />
        rate_calculation(parameters),<br />
        state_transition(parameters)<br />
    {}<br />
<br />
    template <typename StateType><br />
    numeric_type rate(const StateType& state) const<br />
    {<br />
        return rate_calculation(state);<br />
    }<br />
<br />
    template <typename StateType><br />
    void transition(StateType& state)<br />
    {<br />
        state_transition(state);<br />
    }<br />
};<br />

The blend of runtime and compile-time mechanisms allows the storage of all scattering models within a single container, e.g. a std::vector, which can be iterated over in order to evaluate them.<br />
<br />
typedef std::vector< scatter_facade<state_type> > scatter_container_type ;<br />
scatter_container_type scatter_container ;<br />
scatter_container.push_back(scattering_model) ;<br />

For the development of new collision models, easy extensibility, even without recompilation, is also a highly important issue. This approach allows the addition of scattering models at runtime and can expose an interface to an interpreted language such as, e.g., Python [?].<br />
<br />
In case a highly optimized version is desired, the runtime container (here the std::vector) may be exchanged for a compile-time container, which is also readily available from the GSSE and provides the compiler with further opportunities for optimization at the expense of runtime adaptability.
adaptability.



12.1.2 Reuse Something Appropriate<br />

While the described approach initially increases the burden of implementation slightly, because wrappers need to be provided, it gives a transition path to integrate legacy codes into an up-to-date framework while at the same time not abandoning the experience associated with them. The invested effort raises the level of abstraction, which in turn increases the benefits obtained from advances in compiler technology. This inherently allows an optimization for several platforms without the massive human effort that was needed in previous approaches.<br />
<br />
In this particular case, encapsulating the legacy functions' reliance on global variables inside the wrapping structures greatly facilitates parallelization efforts, which are increasingly important with the continued increase of computing cores per CPU.<br />
<br />
Furthermore, the results can easily be verified as code parts are gradually moved to newer implementations, the only stringent requirement being link compatibility with C++. This test and verification can be taken a step further when the original implementation is written in ANSI C, due to its high compatibility with C++: it is possible to weave parts of the new implementation into the older code, providing the opportunity for a very fine-grained comparison not only of final results, but of all intermediates as well.<br />
<br />
Such swift verification of implementations also speeds up the steps necessary to validate calculated results against subsequent or contemporary experiments, which should not be neglected in order to keep physical models and their numerical representations strongly rooted in reality.




Chapter 13<br />
<br />
Parallelism<br />

13.1 Multi-Threading<br />

To do!<br />

13.2 Message Passing<br />

13.2.1 Traditional Message Passing<br />

Parallel hello world<br />

#include <mpi.h><br />
#include <iostream><br />
<br />
int main (int argc, char* argv[])<br />
{<br />
    MPI_Init(&argc, &argv);<br />
    std::cout << "Hello, World!\n";<br />
    MPI_Finalize();<br />
    return 0 ;<br />
}<br />

#include <mpi.h><br />
#include <iostream><br />
<br />
int main (int argc, char* argv[])<br />
{<br />
    MPI_Init(&argc, &argv);<br />
<br />
    int myrank, nprocs;<br />
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);<br />
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);<br />
    std::cout << "Hello world, I am process number " << myrank<br />
              << " out of " << nprocs << ".\n";<br />
<br />
    MPI_Finalize();<br />
    return 0 ;<br />
}<br />

13.2.2 Generic Message Passing<br />

Each process passes a partial sum to its successor; the last process knows the result.<br />

#include <mpi.h><br />
#include <iostream><br />
#include <cmath><br />
<br />
int main (int argc, char* argv[])<br />
{<br />
    MPI_Init(&argc, &argv);<br />
<br />
    int myrank, nprocs;<br />
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);<br />
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);<br />
<br />
    float vec[2];<br />
    vec[0]= 2*myrank; vec[1]= vec[0]+1;<br />
<br />
    // Local accumulation<br />
    float local= std::abs(vec[0]) + std::abs(vec[1]);<br />
<br />
    // Global accumulation<br />
    float global= 0.0f;<br />
    MPI_Status st;<br />
    // Receive from predecessor<br />
    if (myrank > 0)<br />
        MPI_Recv(&global, 1, MPI_FLOAT, myrank-1, 387, MPI_COMM_WORLD, &st);<br />
    // Increment<br />
    global+= local;<br />
    // Send to successor<br />
    if (myrank+1 < nprocs)<br />
        MPI_Send(&global, 1, MPI_FLOAT, myrank+1, 387, MPI_COMM_WORLD);<br />
    else<br />
        std::cout << "Hello, I am the last process and I know that |v|_1 is " << global << ".\n";<br />
<br />
    MPI_Finalize();<br />
    return 0 ;<br />
}<br />
<br />
This solution has a low abstraction level. In the following version, the library performs the reduction.<br />

#include <mpi.h><br />
#include <iostream><br />
#include <cmath><br />
<br />
int main (int argc, char* argv[])<br />
{<br />
    MPI_Init(&argc, &argv);<br />
<br />
    int myrank, nprocs;<br />
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);<br />
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);<br />
<br />
    float vec[2];<br />
    vec[0]= 2*myrank; vec[1]= vec[0]+1;<br />
<br />
    // Local accumulation<br />
    float local= std::abs(vec[0]) + std::abs(vec[1]);<br />
<br />
    // Global accumulation<br />
    float global;<br />
    MPI_Allreduce(&local, &global, 1, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);<br />
<br />
    std::cout << "Hello, I am process " << myrank << " and I know too that |v|_1 is " << global << ".\n";<br />
<br />
    MPI_Finalize();<br />
    return 0 ;<br />
}<br />
<br />
This version is preferable because:<br />
<br />
• It has a higher abstraction level.<br />
<br />
• MPI implementations are usually adapted to the underlying hardware: a reduction typically takes logarithmic effort and can be tuned, e.g. in assembler for the network card.<br />




Chapter 14<br />
<br />
Numerical exercises<br />

In this chapter, we list a number of exercises in which the different aspects discussed in the course will be used. The goal is to implement a small application program in C++, run it, and interpret the results.<br />
<br />
You can use any software that may help you with your task. A list of packages is provided at the end of this chapter. We have only installed Boost, Boost.Sandbox, GLAS, BLAS, and LAPACK. Other smaller packages can be downloaded if necessary.<br />
<br />
In each exercise, a generic function or class will be developed, together with its documentation. These functions and classes should be part of the namespace athens. The function arguments will have to be described. Each template argument will have to satisfy concepts. You may have to define new concepts. If you are using STL or GLAS concepts, you can just refer to them without definition.<br />
<br />
Write a small paper on the decisions you made for the development of the software. Use the software for some examples and report the results. You may write the report on paper or send it in electronic form (PDF by preference).<br />

14.1 Computing an eigenfunction of the Poisson equation<br />

This is an example of a more complicated problem. It illustrates what is expected from the<br />

exercises. The actual exercises are less demanding.<br />

In this section, we derive software for the solution of the Poisson equation. We start with the 1D problem and then move to the 2D problem.<br />

14.1.1 The 1D Poisson equation<br />

The 1D Poisson equation is<br />
<br />
−d²u/dx² = f   (14.1)<br />
<br />
where u(x) is the solution, f the excitation, and x ∈ [0, 1]. We impose the boundary conditions<br />
<br />
u(0) = u(1) = 0 .<br />


This is called a boundary value problem.<br />
<br />
The goal is to compute the solution u for all x ∈ [0, 1]. Since this is not possible numerically, we only compute u at a discrete number of x's, which we call discretization points. We discretize x as xj = jh for j = 0, . . . , n+1 and h = 1/(n+1). This is called an equidistant distribution. The smaller h, the closer we are to the continuous problem, i.e. we have more points in [0, 1], but, as we shall see, the problem becomes more expensive to solve. One method for solving boundary value problems is to replace the derivatives by finite differences. We use finite differences for the second-order derivative:<br />

d²u/dx² (xj) ≈ (1/h²) (−2u(xj) + u(xj−1) + u(xj+1)) .<br />
<br />
Filling this into (14.1), we obtain<br />
<br />
(1/h²) (2u(xj) − u(xj−1) − u(xj+1)) = f(xj)   for j = 1, . . . , n .   (14.2)<br />

Note that u(x0) = u(xn+1) = 0. Now define the vectors<br />
<br />
u = [u(x1), . . . , u(xn)]^T and f = [f(x1), . . . , f(xn)]^T .<br />
<br />
Putting together (14.2) for j = 1, . . . , n leads to the algebraic system of equations Au = f with n rows and columns, where A = tridiag(−1, 2, −1):<br />
<br />
A = ⎡  2  −1              ⎤<br />
    ⎢ −1   2  −1          ⎥<br />
    ⎢      ·    ·    ·    ⎥<br />
    ⎣            −1    2  ⎦ .<br />
<br />
Note that A is a symmetric tridiagonal matrix. We can show that it is positive definite.<br />

In the algorithms, we need operations on this matrix. We will use two different types of operations. The first one is the matrix-vector product y = Ax. We write a function for this with a template argument for the vectors, since we do not know beforehand what the type of the vectors will be.<br />

#ifndef athens_poisson_1d_hpp<br />
#define athens_poisson_1d_hpp<br />
<br />
#include <...>        // GLAS dense vectors and glas::size<br />
#include <cassert><br />
<br />
namespace athens {<br />
<br />
template <typename X, typename Y><br />
void poisson_1d( X const& x, Y& y ) {<br />
    assert( glas::size(x)==glas::size(y) ) ;<br />
    assert( glas::size(x) > 1 ) ;<br />
    int const n = glas::size(x) ;<br />
    y(0) = 2.0*x(0) - x(1) ;<br />
    for ( int i=1; i<n-1; ++i )<br />
        y(i) = 2.0*x(i) - x(i-1) - x(i+1) ;<br />
    y(n-1) = 2.0*x(n-1) - x(n-2) ;<br />
}<br />



} // namespace athens<br />

#endif<br />

where we assume that the types X and Y are models of the concept glas::DenseVectorCollection.<br />

14.1.2 Richardson iteration<br />

Richardson iteration is an iterative method for the solution of the linear system<br />
<br />
Bu = g<br />
<br />
that starts from an initial guess u0 and computes ui = ui−1 + ri−1 at iteration i, where ri−1 is the residual g − Bui−1. It works as follows:<br />
<br />
1. For i = 1, . . . , max_it:<br />
   1.1. Compute the residual ri−1 = g − Bui−1<br />
   1.2. If ‖ri−1‖2 ≤ τ: return<br />
   1.3. Compute the new solution ui = ui−1 + ri−1<br />
<br />
The method converges when the eigenvalues of B lie between 0 and 2.<br />

The eigenvalues of the Poisson matrix A are λj = 2(1 − cos(πj/(n + 1))) for j = 1, . . . , n. The eigenvalues are thus bounded by 0 < λj < 4. We therefore first multiply Au = f by 0.5 into<br />
<br />
(0.5A)u = (0.5f) .<br />
<br />
Note that the solution u does not change. Define B = 0.5A and g = 0.5f; then Bu = g and the eigenvalues of B lie in (0, 2). For such a matrix, we can use the Richardson iteration method.<br />

We develop the following function<br />
<br />
template <typename Op, typename G, typename U><br />
double richardson( Op const& op, G const& g, U& u, double const& tol, int max_it ) ;<br />

where op is a BinaryFunction such that op(x,y) computes y = Bx for a given input argument x, and where u is an initial estimate of the solution on input and the computed solution on output. The vector g is the right-hand side of the system. The return value of richardson is the residual norm; this allows us to check how accurate the solution is without having to compute the residual explicitly. The parameter tol corresponds to the tolerance τ.<br />
<br />
First, we set conceptual conditions on all arguments.<br />
<br />
• U is a model of the concept glas::DenseVectorCollection, i.e. we assume that a dense vector from GLAS is used.<br />
<br />
• Op is a model of BinaryFunction, i.e. the following are valid expressions for op of type Op:<br />
<br />
  – op(x,y) where x and y are instances of a type X that is a model of the concept glas::DenseVectorCollection.<br />



• G is a model of concept glas::VectorExpression.<br />

Next, we write the code for the Richardson iteration. We store the variables ui in u and ri in r.<br />

#ifndef athens_richardson_hpp<br />
#define athens_richardson_hpp<br />
<br />
#include <...>        // GLAS dense vectors and norms<br />
<br />
namespace athens {<br />
<br />
template <typename Op, typename F, typename U><br />
double richardson( Op const& op, F const& f, U& u, double const& tol, int max_it ) {<br />
    double resid_norm ;<br />
    // Create residual vector<br />
    glas::dense_vector< typename glas::value_type<U>::type > r( glas::size(u) ) ;<br />
    for ( int iter=0; iter<max_it; ++iter ) {<br />
        op( u, r ) ;                       // r = B*u<br />
        r = f - r ;                        // r = f - B*u<br />
        resid_norm = norm_2( r ) ;<br />
        if (resid_norm <= tol) break ;<br />
        u += r ;                           // u = u + r<br />
    }<br />
    return resid_norm ;<br />
}<br />
<br />
} // namespace athens<br />
<br />
#endif<br />



...<br />
glas::random( f, seed ) ;<br />
v_type x( 10 ) ;<br />
x = 0.0 ;<br />
// Richardson iteration<br />
double res_nrm = athens::richardson( poisson_scaled(), 0.5*f, x, 1.e-4, 1000 ) ;<br />
{<br />
    glas::dense_vector< double > r( size(x) ) ;<br />
    athens::poisson_1d( x, r ) ;<br />
    std::cout << "res_nrm = " << norm_2( f - r ) << std::endl ;<br />
    std::cout << "f = " << f << std::endl ;<br />
    std::cout << "x = " << x << std::endl ;<br />
}<br />
return 0 ;<br />
}<br />

We multiply right-hand side and matrix vector product by 0.5 to make sure the Richardson<br />

method converges.<br />

The output looks like<br />

res_nrm = 0.000195164<br />

f = (10)[0.0484811,0.822283,0.102721,0.436631,0.46112,0.0475317,0.864644,0.0772845,0.920099,0.105434]<br />

x = (10)[1.85463,3.66081,4.64473,5.52601,5.97071,5.9544,5.8906,4.96226,3.95668,2.03105]<br />

Note that the Richardson method converges very slowly. For the Poisson equation, there exist<br />

much faster methods.<br />

14.1.3 LAPACK tridiagonal solver<br />

The LAPACK [?] software package contains routines for solving linear systems with a symmetric positive definite tridiagonal matrix. This package is written in FORTRAN 77. The corresponding functions are<br />

• Cholesky-type factorization: A = LDL^T by<br />
<br />
SUBROUTINE DPTTRF( N, D, E, INFO )<br />
<br />
• Linear solve: Ax = b using LDL^T x = b by<br />
<br />
SUBROUTINE DPTTRS( N, NRHS, D, E, B, LDB, INFO )<br />

In order to solve Au = f, first A is factorized into A = LDL^T, where L is a matrix consisting of a main diagonal of ones and one diagonal below the main diagonal, and D is a diagonal matrix. Once the factorization is performed, the solution is computed as u = L^(−T) D^(−1) (L^(−1) f). Note that the inverses of L and L^T are not computed explicitly; for example, L^(−1) f is computed as a linear solve with L. Linear solves with triangular matrices are easy to program. This is what DPTTRS does for us.<br />

A <strong>C++</strong> interface to DPTTRF and DPTTRS is available from the BoostSandbox.Bindings. For<br />

our application, we can solve a linear system as follows.



1. Given an approximate eigenvector x0.<br />
<br />
2. Normalize: x0 = x0/‖x0‖2.<br />
<br />
3. For i = 1, . . . , m:<br />
<br />
   3.1. Solve Ayi = xi.<br />
<br />
   3.2. Compute the eigenvalue estimate: λi = Σxi / Σyi.<br />
<br />
   3.3. xi = yi/‖yi‖2.<br />

#include <...>        // LAPACK bindings: pttrf/pttrs<br />
#include <...>        // GLAS bindings traits<br />
#include <...>        // glas::dense_vector<br />
#include <algorithm>  // for std::fill<br />
#include <cassert>    // for assert<br />
#include <iostream>   // for cout and endl<br />
<br />
int main() {<br />
    int const n = 10 ;<br />
    glas::dense_vector< double > d(n) ;   // Main diagonal<br />
    glas::dense_vector< double > e(n-1) ; // Lower/upper diagonal<br />
    std::fill( begin(d), end(d), 2.0 ) ;<br />
    std::fill( begin(e), end(e), -1.0 ) ;<br />
<br />
    glas::dense_vector< double > rhs( n ) ;<br />
    std::fill( begin(rhs), end(rhs), 3.0 ) ;<br />
<br />
    int info = boost::numeric::bindings::lapack::pttrf( d, e ) ;<br />
    assert( !info ) ;<br />
<br />
    std::cout << rhs << std::endl ;<br />
    info = boost::numeric::bindings::lapack::pttrs( 'L', d, e, rhs ) ;<br />
    std::cout << rhs << std::endl ;<br />
    // Solution is in rhs<br />
}<br />

14.1.4 The inverse iteration method

The inverse iteration method computes an eigenvalue of a matrix A. The method converges to the eigenvector associated with the eigenvalue nearest zero. The method works as in the algorithm shown above; in this algorithm, Σ xi means the sum of the elements of xi. For the solution of the linear system, we can use the Richardson iteration.

Write a function with the following header:

template <typename Op, typename DenseVectorCollection, typename Float>
void inverse_iteration( Op const& op, DenseVectorCollection& x, int m, Float& lambda ) ;

where Op is a model of BinaryFunction that solves y from x, x is the eigenvector estimate on input and output, and m is the number of iterations. The estimated eigenvalue is returned in lambda.


14.1. COMPUTING AN EIGENFUNCTION OF THE POISSON EQUATION 269<br />

First, we set conceptual conditions on all arguments.<br />

• Op is a model of BinaryFunction, i.e. the following are valid expressions <strong>for</strong> op of type Op:<br />

– op(x,y) where x and y are instances of type X where X is a model of the concept<br />

glas::DenseVectorCollection.<br />

• DenseVectorCollection is a model of glas::DenseVectorCollection.<br />

• Float is a concept of real numbers, i.e. it is float, double, or long double.<br />

The implementation for inverse_iteration could be as follows:

#ifndef athens_inverse_iteration_hpp
#define athens_inverse_iteration_hpp

#include <...>   // glas vectors (header name lost in the original)
#include <...>   // norm_2 and vector operations (header name lost in the original)

namespace athens {

template <typename Op, typename DenseVectorCollection, typename Float>
void inverse_iteration( Op const& op, DenseVectorCollection& x, int m, Float& lambda ) {
    glas::dense_vector< typename glas::value_type<DenseVectorCollection>::type > y( glas::size(x) ) ;
    x = x / norm_2( x ) ;              // 2.
    for ( int i=0; i<m; ++i ) {        // 3. (body reconstructed from the algorithm)
        op( x, y ) ;                   // 3.1. solve A y = x
        lambda = sum( x ) / sum( y ) ; // 3.2. eigenvalue estimate
        x = y / norm_2( y ) ;          // 3.3. normalize
    }
}

} // namespace athens

#endif

The driver uses a functor that solves the linear system, for example with the Richardson iteration:

struct solve {
    template <typename X, typename Y>
    void operator()( X const& x, Y& y ) const {
        athens::richardson( poisson_scaled(), 0.5*x, y, 1.e-8, 1000 ) ;
    }
} ;

int main() {
    typedef glas::dense_vector<double> v_type ;
    v_type x( 10 ) ;
    glas::random_seed seed ;
    glas::random( x, seed ) ;
    double lambda ;
    athens::inverse_iteration( solve(), x, 100, lambda ) ;
    std::cout << "lambda = " << lambda << std::endl ;
    std::ofstream xf( "x.out" ) ;
    for ( int i=0; i<10; ++i ) xf << x(i) << std::endl ;  // (body reconstructed)
}


Figure 14.1: First eigenvector of the 1D Poisson operator (plot of "x.out")

int main() {
    typedef glas::dense_vector<double> v_type ;
    int n = 10 ;
    v_type x( n ) ;
    glas::random_seed seed ;
    glas::random( x, seed ) ;
    v_type d( n ) ;   std::fill( begin(d), end(d), 2.0 ) ;
    v_type e( n-1 ) ; std::fill( begin(e), end(e), -1.0 ) ;
    solve< v_type, v_type > solver( d, e ) ;
    double lambda ;
    athens::inverse_iteration( solver, x, 100, lambda ) ;
    std::cout << "lambda = " << lambda << std::endl ;
    std::ofstream xf( "x.out" ) ;
    for ( int i=0; i<n; ++i ) xf << x(i) << std::endl ;  // (body reconstructed)
}

The result can be plotted with Gnuplot:

gnuplot> plot "x.out" w l



14.2 The 2D Poisson equation<br />

The 2D Poisson equation is

−∂²u/∂x² − ∂²u/∂y² = f

where u(x, y) is the solution, f the excitation, and (x, y) ∈ [0, 1] × [0, 1]. We impose the boundary conditions

u(0, y) = u(1, y) = u(x, 0) = u(x, 1) = 0 .

We discretize x as xi = ih for i = 1, . . . , n with h = 1/n; similarly, yj = jh. We use finite differences for the second order derivatives. This produces the equation

(1/h²) (−u(xi−1, yj) − u(xi, yj−1) − u(xi+1, yj) − u(xi, yj+1) + 4 u(xi, yj)) = f(xi, yj)   for i, j = 1, . . . , n .

This leads to the algebraic system of equations Au = f with n² rows and columns.

Recall the example exercise of §14.1. We do exactly the same exercise. Since the matrix is not tridiagonal, we cannot use the LAPACK routine pttrf any longer. We use the LAPACK routine sytrf for a full matrix instead. See the documentation on boost-sandbox/libs/numeric/bindings/lapack/doc/index.html.

For a 2D problem, the solution vector u can be represented as a matrix: the row index corresponds to the variable x and the column index to the variable y.

In particular, you develop the functions inverse_iteration, poisson_2d for the matrix-vector product, scaled_poisson for the scaled matrix-vector product, and richardson. Give the conceptual conditions for each templated argument. Make a plot of the eigenvector using Gnuplot's splot (for plotting surfaces).

14.3 The solution of a system of differential equations<br />

In this exercise, we write a function <strong>for</strong> the computation of a time step of a system of differential<br />

equations using Runge-Kutta methods.<br />

14.3.1 Explicit time integration<br />

Methods for the solution of the differential equation

u̇ = f(u) ,   u(0) = u0

operate time step by time step, i.e. time is discretized and, given the solution at time step tj, we compute the solution at time step tj+1 = tj + h where h is small.



The method that we use here is the Runge-Kutta 4 method: the solution at time step tj+1 is computed as

uj+1 = uj + (h/6) (k1 + 2 k2 + 2 k3 + k4)

where

k1 = f(uj)
k2 = f(uj + (h/2) k1)
k3 = f(uj + (h/2) k2)
k4 = f(uj + h k3) .

14.3.2 Software

Write a generic function

template <typename U, typename F, typename T>
void rk4( U& u, F& f, T const& h ) ;

that computes one time step with the Runge-Kutta 4 method. The argument u is the solution at time t on input and at time t + h on output. The argument f is the functor that evaluates the function f(u). The argument u is a vector.

When the implementation is finished, write the concepts <strong>for</strong> U and F in comment lines in the<br />

code.<br />

14.3.3 The Van der Pol oscillator

Differential equations appear in the study of physical phenomena. The Van der Pol oscillator is described by the following equation:

d²x/dt² − µ(1 − x²) dx/dt + x = 0   (14.3)

with initial conditions x(0) and x′(0). This is a non-linear second order differential equation with a parameter µ. When µ = 0, we have a purely harmonic solution (cos and sin). When µ > 0, the solution evolves to a limit cycle.

Second order differential equations are usually solved by writing them as a system of first order differential equations:

d/dt [ dx/dt ]   [ −µ(1 − x²)  1 ] [ dx/dt ]
     [   x   ] + [     −1      0 ] [   x   ] = 0 .

In matrix form, with u = (dx/dt, x)ᵀ, the equation can be written as

du/dt + A(u) u = 0

where

A(u) = [ −µ(1 − u2²)  1 ]
       [     −1       0 ] ,



Figure 14.2: An example of a web with only four pages. An arrow from page A to page B<br />

indicates a link from page A to page B.<br />

or

du/dt = f(u)

with

f(u) = −A(u) u .

14.3.4 Exercise

Use the Runge-Kutta 4 method for solving the Van der Pol equation for µ = 0, µ = 0.1 and µ = 1 in the time interval [0, 10] with time step h = 0.001. Also try smaller and larger time steps.

Plot the results using gnuplot.<br />

14.4 Google’s Page rank<br />

We all use Google for web searching. In this exercise, we try to understand a particular tool used by Google to rank pages, called PageRank.

The basic idea behind the Google page ranking algorithm is that the importance of a webpage is determined by the number of references made to it. We would like to compute a score xk reflecting the importance of page k. A simple-minded approach would be just to count the number of links to each page. This approach does not reflect the fact that some pages might be more significant than others, therefore rendering their votes more important. It also leaves open the possibility of artificially inflating the rank of a particular page by generating other trivial or advertising pages whose only function is to promote the importance of a particular page. Significant refinements are:

• Weight each in-link by the importance of the page which links to it.
• Give each page a total vote of 1. If page j contains nj links, one of which links to page k, then page k's score is boosted by xj/nj.



Taking the new refinements into account, we can compute the importance score xk of a page k as follows:

xk = Σ_{j ∈ Lk} xj/nj   (14.4)

where Lk denotes the set of pages with a link to page k. Consider the simple example of Figure 14.2. Using formula (14.4), we get the following equations for the importance scores of the pages in this example:

x1 = x3 + x4/2
x2 = x1/3
x3 = x1/3 + x2/2 + x4/2
x4 = x1/3 + x2/2

These linear equations can be written as Ax = x where x = [x1 x2 x3 x4]ᵀ and

A = [  0    0   1  1/2 ]
    [ 1/3   0   0   0  ]
    [ 1/3  1/2  0  1/2 ]
    [ 1/3  1/2  0   0  ] .

This transforms the web ranking problem into the standard problem of finding an eigenvector x with eigenvalue 1 for the square matrix A. This eigenvector can be found iteratively using the power method with a threshold τ:

1. v(0) = some vector with ‖v(0)‖ = 1
2. Repeat for k = 1, 2, . . . :
   2.1. Apply A: w = A v(k−1).
   2.2. Normalize: v(k) = w/‖w‖.
3. Until ‖v(k−1) − v(k)‖ < τ

The power method converges to the eigenvector corresponding to the dominant eigenvalue λ1. The matrix A is called a column stochastic matrix, since it is a square matrix with nonnegative entries and the entries in each column sum to one. In the case of a column stochastic matrix, this dominant eigenvalue is 1.

14.4.1 Software<br />

Write a generic function:<br />

template <typename V, typename Function>
void power_iteration( V& v, Function& f, double tau ) ;

that computes the power iteration algorithm 5 for a matrix A with starting vector v. The resulting eigenvector should be stored in v. The argument f is a functor that returns the result of the matrix-vector product. Also write documentation and specify the conceptual constraints for the arguments.



14.4.2 Dictionary application<br />

The page ranking algorithm which was described above can also be used to rank different words<br />

in a dictionary. Consider the following small dictionary:<br />

backwoods = bush, jungle<br />

bush = backwoods, jungle, shrub, plant, hedge<br />

flower = plant<br />

hedge = bush<br />

jungle = bush, backwoods<br />

plant = bush, shrub, flower, weed<br />

shrub = bush, plant, tree<br />

tree = shrub<br />

weed = plant<br />

Construct a graph linking every word with the words in its explanation. The first line of the<br />

dictionary, <strong>for</strong> example, would link bush and jungle to backwoods. The graph can be constructed<br />

on paper. Use equation (14.4) to construct the sparse column stochastic matrix A and use your<br />

power method to rank the words.<br />

14.5 The bisection method for finding the zero of a function in an interval

In this exercise, we do a programming exercise on a root finding method called the bisection method.

14.5.1 Functions in one variable<br />

Suppose we are given a function f in one variable and we want to compute the unique zero in the interval [a, b]. A method that could be used is the bisection method. It only requires function evaluations and is thus widely applicable.

The method computes a small interval that contains the zero. This small interval is obtained by splitting the interval [a, b] in two parts [a, m] and [m, b], where

m = (a + b)/2 .   (14.5)

The method works as follows:

1. Given the interval [a, b] for which f(a) f(b) < 0.
2. Repeat until b − a < τ:
   2.1. Compute m from (14.5).
   2.2. If f(m) f(a) < 0: b = m.
   2.3. Else: a = m.



14.5.2 Software<br />

The task is to first develop the function<br />

template <typename T, typename Function>
void bisection( T& a, T& b, Function& f, double tau ) ;

that computes the bisection Algorithm 6. The object f is a functor that returns the function value for a single argument x. The type T is a floating-point type, i.e. float or double.

Write documentation <strong>for</strong> the function and describe the conceptual conditions on Function.<br />

14.5.3 The growth and downfall of a caterpillar population<br />

Everyone knows caterpillars grow up to be beautiful butterflies. But before they reach that stage of their life, they need lots of food to grow. A large population will not grow at the same rate as a smaller one, because of a shortage of food. Furthermore, most birds enjoy a juicy caterpillar as a snack, so they are responsible for the premature death of several members of the caterpillar population. These relationships can be modelled mathematically by the following equation:

dN/dt = r N (1 − N/K) − α N²/(β + N²)

In this equation, r N (1 − N/K) models the growth of the population, where N equals the number of caterpillars, r is the growth rate of the population, and K is the maximum number of caterpillars that can inhabit the area. The second term of the equation models the death of the caterpillars. Here α is the maximum rate at which a bird can eat caterpillars when N is large, and β is a parameter that indicates the intensity of the bird attacks. We want to know when there exists an equilibrium between the growth and death rate in the caterpillar population, i.e. when dN/dt equals zero.

Use the function bisection to compute the number of caterpillars N for which the following populations are at an equilibrium in the intervals [0.1, 10], [10, 20] and [20, 100]:

• Population 1: r = 1.3, K = 100, α = 20 and β = 50<br />

• Population 2: r = 2.0, K = 80, α = 25 and β = 10<br />

Show the resulting roots in a table.<br />

14.5.4 Computing eigenvalues using the bisection method<br />

In this exercise, we use the function bisection to compute the eigenvalues of a real symmetric<br />

dense matrix A with real eigenvalues. The problem is to compute λ so that<br />

det(A − λI) = 0 .<br />

The determinant is computed using the QR factorization (which is available in LAPACK). The<br />

QR factorization computes an orthogonal matrix Q (Q T Q = I) and an upper triangular matrix<br />

R so that<br />

A − λI = QR .



We use the property that det(A − λI) = det(Q) det(R) = ± det(R), since det(Q) = ±1. As R is upper triangular, det(R) is the product of the diagonal elements of R.

The matrices A are constructed as follows. Start with a simple case: the diagonal matrix with elements 1, 2, . . . , n on the main diagonal. Then do the tests for the same matrix multiplied on the left and right by a random orthogonal matrix X, as in A = X D Xᵀ where D is a diagonal matrix.

14.6 The Newton-Raphson method for finding the minimum of a convex function

This exercise is a programming exercise on Newton’s method. First, we explain the method <strong>for</strong><br />

a function with a single variable, then we discuss the case of multivariate functions, and finally,<br />

we show a small application.<br />

14.6.1 Functions in one variable<br />

For a differentiable function f, the minimum ˜x is attained <strong>for</strong> f ′ (˜x) = 0. So, we must find the<br />

zero of the first order derivative. When f is a second order polynomial, we have<br />

then an extreme value of f is attained <strong>for</strong><br />

which is a minimum when f ′′ ≡ 2γ > 0.<br />

f = p := α + βx + γx 2<br />

f ′ = p ′ := β + 2γx<br />

˜x = − β<br />

2γ<br />

(14.6)<br />

For an arbitrary function, we do not have such simple explicit formulae. We can use an iterative method, which is called Newton's method: we start from an initial guess x̃ and improve this value until it has converged to the minimum of the function. On each iteration we approximate the function by a degree two polynomial, for which the simple formula (14.6) can be used. One way to compute such a degree two polynomial is to start from the Taylor expansion of f around x̃:

f(x) = f(x̃) + f′(x̃)(x − x̃) + (1/2) f″(x̃)(x − x̃)² + · · · .

If we approximate f by the first 3 terms (i.e. a degree two polynomial), then we have

f(x) ≈ p(x) := f(x̃) + f′(x̃)(x − x̃) + (1/2) f″(x̃)(x − x̃)² .

If x is close to x̃, |f(x) − p(x)| is small. The first order derivative is

f′(x) ≈ p′(x) = f′(x̃) + f″(x̃)(x − x̃) .



Then p′(x) = 0 for

x = x̃ − f′(x̃)/f″(x̃) .

The Newton method goes as follows; in this algorithm, τ is a tolerance for the stopping criterion.

1. Given initial x̃ = x(0).
2. Repeat for j = 1, 2, . . .
   2.1. Compute x(j) = x(j−1) − f′(x(j−1))/f″(x(j−1))
3. Until |f′(x(j−1))/f″(x(j−1))| < τ

The iteration stops when the derivative is much smaller than the second order derivative. What happens when f″(x(j−1)) = 0?

14.6.2 Multivariate functions<br />

For multivariate functions, the principle is the same, but it is more complicated. A multivariate function f has an argument x ∈ Rⁿ, i.e. a vector of size n. For example, f = sin(x1) + x2 cos(x1) is a multivariate function in the variables x1 and x2.

We use the same idea as for one variable. That is, we use the Taylor expansion to approximate the function:

f(x) ≈ f(x̃) + ∇f(x̃)ᵀ(x − x̃) + (1/2)(x − x̃)ᵀ H(f(x̃))(x − x̃)

where

∇f(x̃) = [ ∂f/∂x1, . . . , ∂f/∂xn ]ᵀ

and

H(f) = [ ∂²f/∂x1∂x1  · · ·  ∂²f/∂x1∂xn ]
       [      .                   .    ]
       [ ∂²f/∂xn∂x1  · · ·  ∂²f/∂xn∂xn ] .

Here ∇f(x̃) is called the gradient vector and H(f) the Hessian matrix.

The derivative becomes

f′(x) ≈ ∇f(x̃) + H(f(x̃))(x − x̃)

so the derivative is zero when

x = x̃ − {H(f(x̃))}⁻¹ ∇f(x̃) .

This requires the solution of an n × n linear system on each iteration. The Newton algorithm is very similar to the univariate case:

14.6.3 Software <strong>for</strong> uni-variate functions<br />

The task is to first develop the function<br />




1. Given initial x̃ = x(0) ∈ Rⁿ.
2. Repeat for j = 1, 2, . . .
   2.1. Compute d = {H(f(x(j−1)))}⁻¹ ∇f(x(j−1)).
   2.2. Compute x(j) = x(j−1) − d.
3. Until ‖d‖2 < τ

template <typename X, typename Function, typename Derivative, typename SecondDerivative>
void newton_raphson( X& x, Function& f, Derivative& d, SecondDerivative& s, double tau ) ;

that computes the Newton-Raphson Algorithm 7. f, d, and s are functors that return the function value and the derivatives for the single argument x.

Write documentation for the function and describe the conceptual conditions on Function, Derivative, and SecondDerivative.

Next, use this function to compute the minima for the following functions:

• f = x² − 2x + 4
• f = x¹⁰
• f = x + 5
• f = −x² − 2x + 4

14.6.4 Software <strong>for</strong> multi-variate functions<br />

The task is to first develop the function<br />

template <typename X, typename Function, typename Gradient, typename Hessian>
void newton_raphson( X& x, Function& f, Gradient& d, Hessian& h, double tau ) ;

that computes the Newton-Raphson Algorithm 8. Note that in this case, d and h should return the resulting gradient vector and Hessian matrix respectively. Also write documentation and specify the conceptual constraints for the arguments.

14.6.5 Application<br />

The following is an application for the multivariate case. Given a symmetric matrix L ∈ Rⁿˣⁿ, we want to solve the following optimization problem:

min (1/2) xᵀ L x   s.t.  xᵀ x = 1 .

We first introduce a Lagrange multiplier λ and rewrite this problem in the following form. Find x and λ so that

min f(x, λ) = (1/2) xᵀ L x − (1/2) λ (xᵀ x − 1) .

The gradient and Hessian are:

∇f = [ L x − λ x        ]
     [ −(1/2)(xᵀ x − 1) ]

H(f) = [ L − λI  −x ]
       [ −xᵀ      0 ] .

One can prove that the solution of this optimization problem is the smallest eigenvalue λ and an associated normalized eigenvector x. This is a method for computing eigenvalues of large matrices.

For solving a linear system with the Hessian, you can use the direct solver MUMPS or the<br />

iterative solver toolbox from GLAS.<br />

14.7 Sequential noise reduction of real-time measurements by least squares

Suppose we want to measure a function f(t) for given time snapshots t1, . . . , tm. We know that the function is a polynomial of a given degree n − 1, but due to measurement errors, the data are noisy. If f is a polynomial,

f(t) = Σ_{j=1}^{n} ξj t^{j−1} .

We could have a more general series, e.g.

f(t) = Σ_{j=1}^{n} ξj φj(t)

where φj is the jth basis function. With

b = [ f(t1), . . . , f(tm) ]ᵀ ,   x = [ ξ1, . . . , ξn ]ᵀ ,

A = [ φ1(t1)  φ2(t1)  · · ·  φn(t1) ]
    [   .                       .   ]
    [ φ1(tm)  φ2(tm)  · · ·  φn(tm) ]

we have

Ax = b .   (14.7)

Note that (14.7) is an m × n linear system, where usually m ≫ n. This system is overdetermined and so, due to errors in the data, it cannot be solved exactly. However, we can solve the system in a least squares sense, i.e. find x so that

min_x ‖Ax − b‖2 .   (14.8)



When measurements come in sequentially, i.e. at time steps t1, t2, . . ., we receive at time step tj the jth row of A and the jth element of b. The algorithms we now discuss exploit this sequential structure.

14.7.1 The least squares QR algorithm<br />

A numerically stable method for solving (14.8) is based on the QR factorization. The QR factorization of the m × n matrix A is

A = QR

with Q ∈ Rᵐˣⁿ having orthonormal columns (QᵀQ = I) and R ∈ Rⁿˣⁿ upper triangular. If A has full rank, the diagonal elements of R are non-zero. Suppose we have computed the solution for

min ‖Ak x − bk‖2

where

‖Ak x − bk‖2 = ‖Qk Rk x − bk‖2 = ‖Rk x − Qkᵀ bk‖2 .

We have to solve an upper triangular linear system. We can develop a 'sequential' method for this QR decomposition without storing Q, but we will not discuss this any further.

14.7.2 The least squares method via the normal equations<br />

One method to achieve this is the normal equations. That is, multiply (14.7) on the left by Aᵀ; then we obtain

Aᵀ A x = Aᵀ b .   (14.9)

If A has full column rank, the solution x is unique and satisfies (14.8).

14.7.3 Least squares Kalman filtering<br />

The Kalman filter is a method to solve the normal equations (14.9) in a step by step way, i.e. the measurements come in time step by time step. The Kalman filter adapts the least squares solution to the newly arrived data.

Suppose we have computed the least squares solution of

Ak xk = bk

where Ak are the first k rows of A and bk the first k elements of b with k ≥ n. Then we want to compute the least squares solution of

Ak+1 xk+1 = bk+1 .

Since

Ak+1 = [ Ak ; ak+1ᵀ ]   and   bk+1 = [ bk ; f(tk+1) ]

we have, with gk = Akᵀ bk, that

Ak+1ᵀ Ak+1 xk+1 = gk+1
(Akᵀ Ak + ak+1 ak+1ᵀ) xk+1 = gk + ak+1 f(tk+1) .

With Mk = (Akᵀ Ak)⁻¹ ∈ Rⁿˣⁿ, we derive from the Sherman-Morrison formula that

Mk+1 := (Akᵀ Ak + ak+1 ak+1ᵀ)⁻¹ = Mk − (Mk ak+1 ak+1ᵀ Mk) / (1 + ak+1ᵀ Mk ak+1)

and we also have that

xk+1 = Mk+1 (gk + ak+1 f(tk+1))
     = Mk gk + Mk ak+1 f(tk+1) − (Mk ak+1 ak+1ᵀ) / (1 + ak+1ᵀ Mk ak+1) (Mk gk + Mk ak+1 f(tk+1))
     = xk + (Mk ak+1) / (1 + ak+1ᵀ Mk ak+1) (f(tk+1) − ak+1ᵀ xk) .

The Kalman method works as follows:

1. Solve An xn = bn by taking the first n rows of A and b.
2. Let Mn = An⁻¹ An⁻ᵀ.
3. For k = n, . . . , m − 1 do:
   3.1. Compute the Kalman gain vector kk+1 = Mk ak+1/(1 + ak+1ᵀ Mk ak+1).
   3.2. Update step: xk+1 = xk + kk+1 (f(tk+1) − ak+1ᵀ xk).
   3.3. Mk+1 = Mk − kk+1 ak+1ᵀ Mk.

We can use the LAPACK subroutine DGESV for computing An⁻¹.

14.7.4 Software<br />

The goal is to write a function that computes the Kalman filter least squares. Because of the<br />

sequential character, we suggest to make a class with the following specifications:<br />

template <typename T>   // (template parameter list lost in the original; T is a guess for the value type)
class kalman {
  public:
    // Creation of the Kalman filter
    kalman( int n ) ;

    // Compute the first n observations and initialize the Kalman
    // filter (Steps 1 and 2 in the algorithm).
    // BaseFun is a binary functor.
    template <typename VIt, typename BaseFun, typename F>
    void initialize( VIt t_begin, VIt const& t_end, BaseFun& base_fun, F& f ) {
        ...
    }

    template <typename T2, typename Base, typename F>
    void step( T2 const& t, Base& base, F const& f ) {
        ...
    }

  public:
    // Return the solution
    typedef ... x_type ;
    x_type const& x() const { ... }

  private:
    ...
} ;

14.7.5 Test problems<br />

We now solve the following test problems. First, consider the following expansion:

f(t) = ξ1 + ξ2 cos t + ξ3 sin t + ξ4 cos 2t + ξ5 sin 2t .

We compute the coefficients following the least squares criterion for the function

f = 2 − 5 cos t .

Print the solution x for each step of the Kalman filter and see how it changes. It should be very close to the function.

Then apply random noise with relative size 0.0001:

f = (2 − 5 cos t)(1 + 0.0001 ε)

where ε is a random number in [−1, 1]. Print the solution x for each step of the Kalman filter and see how it changes. It should be close to the solution of the function with ε ≡ 0.

Plot the results using gnuplot.


Chapter 15

Programming Projects

The following notes apply to all projects.

• The projects are preferably carried out in teams of 2 students.
• Each team is given a repository in an MTL4 branch.
• This also means that every course participant must learn the version control software "subversion", see http://subversion.tigris.org/. Greg Wilson's lecture gives a sufficient introduction to subversion, see http://software-carpentry.org/. I will give a short introduction myself in the 2nd exercise session (19.4.).
• The projects shall be built (compiled, linked) with a single command. If possible, use "cmake".¹ cmake ships with every reasonable Linux distribution and should also be available in the pool; it even exists for Windows, where it can generate the project files for Visual Studio.
• First write tests for new features before you implement them.
• Try to limit your questions to the exercise sessions.
• Write doxygen documentation for your classes and functions (in English). Write as many examples as possible. (These may well be derived from your tests.)
  – Create formulas preferably with the commands for LaTeX insertions (\f[ and the like). On this occasion, one often gets to know one's Linux installation better, since doxygen does not always find LaTeX. It is no shame to ask hacker friends for help here.

15.1 Matrix powers A^x

Implement algorithms for A^x for different matrix types and for x ∈ Q as well as x ∈ R.

¹ If necessary, plain "make" (see e.g. http://software-carpentry.org/build.html).



15.2 Matrix exponential e^A

Implement algorithms for e^A for different matrix types, in particular sparse matrices. Use the algorithms available in MTL4 for solving systems of equations. See also the article by Cleve Moler, "Nineteen Dubious Ways to Compute the Exponential of a Matrix".

15.3 LU factorization for m × n matrices

m, n    | L              | U
m = n   | lower triangle | upper triangle
m > n   | trapezoid      | upper triangle
m < n   | lower triangle | trapezoid

A = P · L · U   (15.1)

For L, the diagonal is 1 and is therefore not stored. Compute the solution of a system of equations and then compute the error.

See http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/lle/functn_getrf.htm.

15.4 Bunch-Kaufman factorization

For A with A = Aᵀ, implement the factorization

A = P · U · D · Uᵀ · Pᵀ   (15.2)

• overwriting A, and develop functions for extracting P, U, and D from the resulting A;
• alternatively, copy A, compute the factorization, and return P, U, and D as a tuple.

See http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/lle/functn_sytrf.htm.

15.5 Condition number (reciprocal)

• Use LU in the general case.

– Cholesky when symmetric.

∗ If need be, Bunch-Kaufman . . .



See http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/lle/functn_gecon.htm.

15.6 Matrix scaling

For dense and sparse matrices, compute row and column scaling factors such that the largest matrix entry in each row and column is 1.

See http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/lle/functn_geequ.htm.
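A dense-matrix sketch of such factors, computed row-first in the spirit of geequ (zero rows and columns are not handled): after scaling with diag(r) · A · diag(c), every column has largest magnitude 1 and no entry exceeds 1 in magnitude.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Naive dense matrix for illustration only.
using matrix = std::vector<std::vector<double>>;

// Row factors r_i = 1 / max_j |a_ij|, then column factors
// c_j = 1 / max_i (r_i * |a_ij|) on the row-scaled matrix.
std::pair<std::vector<double>, std::vector<double>>
scale_factors(const matrix& a)
{
    std::size_t m = a.size(), n = a[0].size();
    std::vector<double> r(m, 0.0), c(n, 0.0);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j)
            r[i] = std::max(r[i], std::abs(a[i][j]));
        r[i] = 1.0 / r[i];
    }
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < m; ++i)
            c[j] = std::max(c[j], r[i] * std::abs(a[i][j]));
        c[j] = 1.0 / c[j];
    }
    return {r, c};
}
```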

15.7 QR with overwriting

Implement a decomposition

A = QR (15.3)

with Q orthogonal/unitary for real/complex A. Realize:

• an overwriting factorization as in LAPACK,

• functions for extracting Q and R,

• a version that copies A and returns Q and R as a pair.

• Write tests or applications.

See http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/lse/functn_orgqr.htm,
http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/lse/functn_ungqr.htm,
http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/lse/functn_ormqr.htm,
http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/lse/functn_unmqr.htm.
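A copying variant via classical Gram-Schmidt can serve as a reference implementation for the tests; the overwriting Householder factorization as in LAPACK is the real task. The sketch assumes a square, full-rank real matrix:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Naive square dense matrix for illustration only.
using matrix = std::vector<std::vector<double>>;

// Classical Gram-Schmidt: r_kj = q_k . a_j, then orthogonalize column j.
std::pair<matrix, matrix> qr(matrix a)
{
    std::size_t n = a.size();
    matrix q(n, std::vector<double>(n, 0.0)), r = q;
    for (std::size_t j = 0; j < n; ++j) {
        std::vector<double> v(n);
        for (std::size_t i = 0; i < n; ++i)
            v[i] = a[i][j];
        for (std::size_t k = 0; k < j; ++k) {
            for (std::size_t i = 0; i < n; ++i)
                r[k][j] += q[i][k] * a[i][j];
            for (std::size_t i = 0; i < n; ++i)
                v[i] -= r[k][j] * q[i][k];
        }
        double norm = 0.0;
        for (double x : v)
            norm += x * x;
        r[j][j] = std::sqrt(norm);            // assumes full column rank
        for (std::size_t i = 0; i < n; ++i)
            q[i][j] = v[i] / r[j][j];
    }
    return {q, r};
}
```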

15.8 Direct solver for sparse matrices

Implement a direct solver recursively.

• The matrix should be represented hierarchically as a quad-tree.

• The operations shall also be performed recursively on the blocks:

– matrix addition and subtraction,

– matrix multiplication,

– inversion of sub-trees,

– pivoting on the

∗ column,

∗ row, or

∗ diagonal,

whichever seems most suitable.

– The pivoting must of course be represented by a permutation.

• Apply the solution to a vector recursively as well, if possible.

– This means implementing the triangular solver recursively, too.

Figure 15.1: Hierarchical approach.

This project is the biggest challenge of all; significant partial results will also count as a success.
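The quad-tree representation and one recursive block operation can be sketched as follows (for simplicity, leaves hold single entries here; real blocks would be small dense matrices, and both operands are assumed to have the same tree shape):

```cpp
#include <array>
#include <cassert>
#include <memory>

// Quad-tree matrix node: either a leaf holding one entry or an inner node
// with four children for the NW, NE, SW, SE quadrants.
struct quad
{
    double value = 0.0;                        // payload when this is a leaf
    std::array<std::unique_ptr<quad>, 4> sub;  // children when inner node
    bool is_leaf() const { return !sub[0]; }
};

// Recursive block addition; multiplication, sub-tree inversion, and the
// triangular solver recurse over the same structure.
std::unique_ptr<quad> add(const quad& a, const quad& b)
{
    auto result = std::make_unique<quad>();
    if (a.is_leaf()) {
        result->value = a.value + b.value;
        return result;
    }
    for (int i = 0; i < 4; ++i)
        result->sub[i] = add(*a.sub[i], *b.sub[i]);
    return result;
}
```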

15.9 Applying MTL4 to interval-arithmetic types

Write applications of matrices and vectors for suitable interval-arithmetic types, e.g. boost::interval.
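The point of such applications is that generic code written for double works unchanged on richer number types. A toy interval type with overloaded operators (unlike boost::interval it deliberately ignores rounding-mode control) together with a generic dot product:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct interval
{
    double lo, hi;   // [lo, hi], invariant lo <= hi
};

interval operator+(interval a, interval b)
{
    return {a.lo + b.lo, a.hi + b.hi};
}

interval operator*(interval a, interval b)
{
    // The product interval is spanned by the four endpoint products.
    double p[] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
    return {*std::min_element(p, p + 4), *std::max_element(p, p + 4)};
}

// Generic dot product: works for double, interval, and similar types.
template <typename T>
T dot(const std::vector<T>& x, const std::vector<T>& y)
{
    T sum = x[0] * y[0];
    for (std::size_t i = 1; i < x.size(); ++i)
        sum = sum + x[i] * y[i];
    return sum;
}
```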



15.10 Applying MTL4 to higher-precision types

Write applications of matrices and vectors for suitable types with higher precision, e.g. GNU Multiprecision (GMP).

15.11 Applying MTL4 to AD types

Write applications of matrices and vectors for suitable automatic-differentiation types whose derivatives are computed via operator overloading.
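The simplest such type is a forward-mode “dual number” whose overloaded operators carry the derivative along via the sum and product rules. This is only a sketch of the idea; a project would plug a complete AD type into MTL4's matrices and vectors:

```cpp
#include <cassert>

struct dual
{
    double v;   // function value
    double d;   // derivative value
};

dual operator+(dual a, dual b) { return {a.v + b.v, a.d + b.d}; }         // sum rule
dual operator*(dual a, dual b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; } // product rule

// Example: f(x) = x*x + x is differentiated automatically.
dual f(dual x) { return x * x + x; }
```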




Chapter 16

Acknowledgement

Special thanks to Josef Weinbub, Carlos Giani, and Franz Stimpfl, who were instrumental in the design and development of GSSE and this book. Thanks also go to Michael Spevak for the development of some basic concepts and text parts for an early version of GSSE.

Thanks to Andrey Chesnokov, Yvette Vanberghen, Kris Demarsin, and Yao Yue, and to the students of the class “C++ für Wissenschaftler” at Technische Universität Dresden for many fruitful discussions.





Bibliography

[AG04] David Abrahams and Aleksey Gurtovoy. C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond. Addison-Wesley, 2004.

[CE00] Krzysztof Czarnecki and Ulrich W. Eisenecker. Generative Programming: Methods, Tools, and Applications. ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 2000.

[DHP03] Ionut Danaila, Frédéric Hecht, and Olivier Pironneau. Simulation Numérique en C++. Dunod, Paris, 2003.

[ES90] Margaret A. Ellis and Bjarne Stroustrup. The Annotated C++ Reference Manual. Addison-Wesley, 1990.

[GA04] Aleksey Gurtovoy and David Abrahams. Boost Meta-Programming Library (MPL). Boost, 2004. www.boost.org/doc/libs/1_42_0/libs/mpl/doc/index.html.

[Got11] Peter Gottschling. Mixed Complex Arithmetic. SimuNova, 2011. https://simunova.zih.tu-dresden.de/mtl4/docs/mixed complex.html. Part of Matrix Template Library 4.

[Kar05] Björn Karlsson. Beyond the C++ Standard Library. Addison-Wesley, 2005.

[SA05] Herb Sutter and Andrei Alexandrescu. C++ Coding Standards. The C++ In-Depth Series. Addison-Wesley, 2005.

[Sch] Douglas C. Schmidt. C++ programming language tutorials. http://www.cs.wustl.edu/~schmidt/C++.

[Str97] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley, 3rd edition, 1997.

