
Concurrent Systems II
(CS3D4/CS3BA29)

Mike Brady
ORI.G41, brady@cs.tcd.ie
https://www.cs.tcd.ie/~brady/courses/ba/CS3BA29/3D4-3BA292009.pdf

Trinity College Dublin
© Mike Brady 2007–2009


POSIX

• POSIX – Portable Operating System Interface
◾ for variants of Unix, including Linux
◾ IEEE 1003, ISO/IEC 9945
• Really, considered a standard set of facilities and APIs for Unix.
◾ 1988 onwards
◾ Doesn't have to be Unix – e.g. Cygwin gives Windows partial POSIX compliance
◾ ref: http://en.wikipedia.org/wiki/POSIX


POSIX Threads

• POSIX Threads aka 'pthreads'
◾ correspond to 'Light Weight Processes' (LWPs) in older literature.
◾ pthreads live within processes.
◾ processes have separate memory spaces from one another
• thus, inter-process communication & scheduling may be expensive
◾ pthreads (within a process) share the same memory space
• inter-thread communication & scheduling can be cheap


POSIX Threads (2)

• Portable threading library across Unix OSes
◾ All POSIX-compliant Unixes implement pthreads
• Linux, Mac OS X, Solaris, FreeBSD, OpenBSD, etc.
• Also Windows:
◾ E.g. Open Source: pthreads-win32


Six Separate Threads

[Diagram: a master thread creates six threads, T1–T6, which run concurrently and are later joined back into the master thread.]


Creating a pthread

#include <pthread.h>

int pthread_create(
    pthread_t *thread,              // the returned thread id
    const pthread_attr_t *attr,     // starting attributes
    void *(*start_routine)(void*),  // the function to run in the thread
    void *arg);                     // parameter


Where…

• 'thread' is the ID of the thread
• 'attr' is the input-only attributes (NULL for standard attributes)
• 'start_routine' (can be any name) is the function that runs when the thread is started, and which must have the signature:
    void* start_routine (void* arg);
• 'arg' is the parameter that is sent to the start routine.
• returns a status code: 0 on success, a non-zero error code on failure.


Wait for a thread to exit

int pthread_join(pthread_t thread, void **value_ptr);

where
'thread' is the id of the thread you wish to wait on
'value_ptr' is where the thread's exit status will be placed on exit (NULL if you're not interested).

• NB: a thread can be joined by only one other thread!


Hello World -- Creating Threads

int main (int argc, const char * argv[]) {
    pthread_t threads[NUM_THREADS];
    int rc, t;
    for (t = 0; t < NUM_THREADS; t++) {
        rc = pthread_create(&threads[t], NULL, PrintHello, (void *)(long)t);
        if (rc) exit(-1);   // thread creation failed
    }


Hello World -- Waiting for Exit

    // wait for threads to exit
    for (t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);
}


Hello World -- Thread Function

void *PrintHello(void *threadid) {
    printf("\n%ld: Hello World!\n", (long)threadid);
    pthread_exit(NULL);
}


HelloWorld -- Complete

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 6

void *PrintHello(void *threadid) {
    printf("\n%ld: Hello World!\n", (long)threadid);
    pthread_exit(NULL);
}

int main (int argc, const char * argv[]) {
    pthread_t threads[NUM_THREADS];
    int rc, t;
    for (t = 0; t < NUM_THREADS; t++) {
        rc = pthread_create(&threads[t], NULL, PrintHello, (void *)(long)t);
        if (rc) exit(-1);   // thread creation failed
    }
    for (t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);
    return 0;
}


Wilde/Stoker/Ubuntu

sh-3.1$ cc -lpthread main.c -o main
sh-3.1$ ./main

• Link against the pthread library to allow the program to compile and link
• Automatically included in Mac OS X


What's wrong (deceptive) about this?

[Same diagram as before: a master thread creates T1–T6, which are later joined.]


Interaction

• No thread interaction shown.
• Most of the issues to do with threads (really, with any kind of parallel processing) arise over unforeseen interaction sequences.
◾ For example, if two threads attempt to increment the same variable…


Unsafe Interaction Example

Thread 1          Thread 2          G
L = G      (4)                      4
L = L + 10 (14)   L = G      (4)    4
G = L             L++        (5)    14
                  G = L             5

G = Global Variable; L = Separate Local Variables.
Thread 1's update (+10) is lost: G ends up as 5, not 15.
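A minimal C sketch of this kind of race (illustrative, not from the slides): two threads increment the same global without synchronisation, so updates can be lost.

#include <pthread.h>
#include <stdio.h>

int G = 0;                          // shared global

void *add_many(void *unused) {
    for (int i = 0; i < 1000000; i++)
        G = G + 1;                  // read-modify-write: not atomic
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, add_many, NULL);
    pthread_create(&t2, NULL, add_many, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("G = %d\n", G);          // expected 2000000; usually less
    return 0;
}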


Interactions & Dependencies

• Interactions occur because of dependencies.
• Interactions can be made safe using a variety of techniques.
◾ E.g. semaphores, condition variables, etc.
• We will look at them…
• They tend to be expensive; sometimes very expensive.
• But, sometimes we can redesign the code to avoid dependencies.
• One of the biggest themes in this kind of programming is dependency minimisation.


We need to use our knowledge of…

• Sequential imperative programming, in C/C++ on a single processor.
• How programs could communicate.


…to do Parallel Programming

• High performance parallel programming
• We will confine ourselves to shared-memory machines.
• We will use dual-core (maybe quad-core) machines for practice.


Why is it this difficult?

• It's not clear whether it really is more difficult to think about using multiple agents, e.g. multiple processors, to solve a problem:
◾ Maybe we are brainwashed into thinking about just one agent;
◾ Maybe it's natural;
◾ Maybe it's our notations and toolsets;
◾ Of course, maybe it really is trickier.


Hardware

[Diagram – Single Processor: CPU, with a Level 1 Cache, a Level 2 Cache, and Main Memory above it.]


Hardware

[Diagram – Multiple Processor: several CPUs, each with its own cache, connected by a bus to a shared memory.]


Shared Memory Machine

• Each processor has equal access to each main memory location – Uniform Memory Access (UMA).
◾ Opposite: Non-Uniform Memory Access (NUMA)
• Each processor may have its own cache, or may share caches with others.
◾ May have problems with cache issues, e.g. consistency, line-sharing, etc.


Sequential Program

• A Sequential Program has one site of program execution. At any time, there is only one site in the program where execution is under way.
• The Context of a sequential program is the set of all variables, processor registers (including hidden ones that can affect the program), stack values and, of course, the code, assumed to be fixed, that are current for the single site of execution.
• The combination of the context and the flow of program execution is often called [informally] a thread.


Concurrent Program

• A Concurrent Program has more than one site of execution. That is, if you took a snapshot of a concurrent program, you could understand it by examining the state of execution in a number of different sites in the program.
• Each site of execution has its own context—registers, stack values, etc.
• Thus, a concurrent program has multiple threads of execution.


Parallel Program

• A Parallel Program, like a concurrent program, has multiple threads of program execution.
• The key difference between concurrent and parallel is:
◾ In concurrent programs, only one execution agent is assumed to be active. Thus, while there are many execution sites, only one will be active at any time.
◾ In parallel programs, multiple execution agents are assumed to be active simultaneously, so many execution sites will be active at any time.


Communication

• A parallel program consists of two or more separate threads of execution, that run independently except when they have to interact.
• To interact, they must communicate.
• Communication is typically implemented by
◾ sharing memory
• One thread writes data to memory; the other reads it
◾ passing messages
• One thread sends a message; the other gets it


Dependency

• Independent sections of code can run independently.
• We can analyse code for dependencies:
◾ To preserve the meaning of the program;
◾ To transform the program to reduce dependencies and improve parallelism.


1 – Flow Dependence

S1: write to location L
S2: read from location L

• Flow Dependence: S2 is flow dependent on S1 because S2 reads a location S1 writes to.
◾ It must be written to (S1) before it's read from (S2)


2 – Antidependence

S1: read from location L
S2: write to location L

• Antidependence: S2 is antidependent on S1 because S2 writes to a location S1 reads from.
◾ It must be read from (S1) before it can be written to (S2)


3 – Output Dependence

S1: write to location L
S2: write to location L

• Output dependence: S2 is output dependent on S1 because S2 writes to a location S1 writes to.
◾ The value of L is affected by the order of S1 and S2.


Example

S1: a = b + d;
S2: c = a * 3;
S3: a = b + c;
S4: e = a / 2;
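Working through the example (a step the slides leave implicit): S2 is flow dependent on S1, since S2 reads the a that S1 writes; S3 is antidependent on S2, since S3 overwrites the a that S2 reads; S3 is output dependent on S1, since both write a; and S4 is flow dependent on S3. Only the flow dependences are intrinsic to the computation: the antidependence and output dependence could be removed by renaming the a written in S3.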


Dependence in Loops

[Figure: matrix multiplication c = a × b, with a indexed by (i, j), b by (j, k) and c by (i, k). The worked example appears to be:

a = | 1 2 |    b = | 5 6 7  |    c = a × b = | 21 24 27 |
    | 3 4 |        | 8 9 10 |                | 47 54 61 |  ]


Sample Serial Solution

for i = 1 to 2 {
    for k = 1 to 3 {
        sum = 0.0;
        for j = 1 to 2
            sum = sum + a[i,j] * b[j,k];
        c[i,k] = sum;
    }
}


How can we transform it?

• We can work out how to transform it by locating the dependencies in it.
• Some dependencies are intrinsic to the solution, but
• Some are artefacts of the way we are solving the problem;
◾ If we can identify them, perhaps we can modify or remove them.


Try three execution agents:

Each agent runs a copy of the same code with its own fixed value of k (k = 1, k = 2 and k = 3 respectively); the loop over k disappears:

with k fixed;
for i = 1 to 2 {
    sum = 0.0;
    for j = 1 to 2
        sum = sum + a[i,j] * b[j,k];
    c[i,k] = sum;
}


Issues:

• The variable sum, as written, is common to all three programs.
• Solution:
◾ Make sum private to each program to avoid this dependency.


Try Six Execution Agents

Each agent computes a single element of c, with both i and k fixed ((i, k) ranging over {1, 2} × {1, 2, 3}); both outer loops disappear:

with i, k fixed;
sum = 0.0;
for j = 1 to 2
    sum = sum + a[i,j] * b[j,k];
c[i,k] = sum;


Summarising:

• We could parallelise the original algorithm with some care:
◾ Private variables to avoid unnecessary dependencies
• Actually, we could break this 'Embarrassingly Parallel' problem into tiny separate pieces; maybe too small. (How will we know?)
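A sketch of the six-agent decomposition with pthreads (a minimal illustration, not the course's official solution; the matrix sizes and names are assumptions taken from the earlier example):

#include <pthread.h>
#include <stdio.h>

#define N 2   /* rows of a and c */
#define M 2   /* cols of a, rows of b */
#define P 3   /* cols of b and c */

double a[N][M] = {{1, 2}, {3, 4}};
double b[M][P] = {{5, 6, 7}, {8, 9, 10}};
double c[N][P];

struct task { int i, k; };

/* Each thread computes one element c[i][k]; sum is private, so the
   threads' write sets do not intersect. */
void *cell(void *arg) {
    struct task *t = arg;
    double sum = 0.0;
    for (int j = 0; j < M; j++)
        sum += a[t->i][j] * b[j][t->k];
    c[t->i][t->k] = sum;
    return NULL;
}

int main(void) {
    pthread_t th[N * P];
    struct task tasks[N * P];
    for (int i = 0; i < N; i++)
        for (int k = 0; k < P; k++) {
            struct task *t = &tasks[i * P + k];
            t->i = i; t->k = k;
            pthread_create(&th[i * P + k], NULL, cell, t);
        }
    for (int n = 0; n < N * P; n++)
        pthread_join(th[n], NULL);
    for (int i = 0; i < N; i++) {
        for (int k = 0; k < P; k++)
            printf("%6.1f", c[i][k]);
        printf("\n");
    }
    return 0;
}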


Example: Numerical Integration

[Figure: a curve f(x) plotted against x, with the area under the curve between x = a and x = b shaded.]

Numerically integrate from a to b


Example

• Sample function: y = 0.1x³ − 0.4x² + 2
• A single trapezoid approximates the area from a to b as:

    (b − a)(f(a) + f(b)) / 2
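A sketch of the composite trapezoid rule applied to the sample function (a minimal serial illustration; the interval and strip count are arbitrary choices):

#include <stdio.h>

/* The sample function from the slide. */
double f(double x) { return 0.1*x*x*x - 0.4*x*x + 2.0; }

/* Composite trapezoid rule: split [a, b] into n strips and sum
   (width)(f(left) + f(right))/2 over the strips. */
double trapezoid(double a, double b, int n) {
    double h = (b - a) / n, area = 0.0;
    for (int i = 0; i < n; i++) {
        double x0 = a + i * h, x1 = x0 + h;
        area += h * (f(x0) + f(x1)) / 2.0;
    }
    return area;
}

int main(void) {
    printf("%f\n", trapezoid(0.0, 2.5, 1000000));
    return 0;
}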


Concurrent Systems II
Practical 1
February 24, 2009

This practical is worth 2% of your year-end result. Have your program ready one week from the date of the practical, i.e. have it ready for inspection on March 3 at 11 am. Have a printout also.

• Write a complete threaded program in C (or non-OO C++) on a Linux machine (e.g. stoker.cs.tcd.ie) to compute the value of π. You'll need to find some way to do this with, for instance, a series or an integral, so that you can use an embarrassingly parallel approach.
• Find out how to measure elapsed time in the Linux environment and do some measurements. From the measurements, can you deduce how many processors/cores are in the machine?

Here is what connecting to stoker looks like over an ssh link:

[Screenshot of the ssh session omitted.]


Synchronisation

• We've looked at situations where the threads can operate independently -- the 'write sets' of the threads don't intersect.
• Where the write sets intersect, we must ensure that independent thread writes do not damage the data.


Shared Variable S: Thread 1 before Thread 2

Time   A    Thread 1   S     Thread 2   B
  |              25
  |    25   A=S        25
  |    26   A++        25    B=S        25
  |    26   S=A        26    B*=10      250
  |    26              250   S=B        250
  v    26              250              250

Thread 2 accesses S after Thread 1 but before Thread 1 has finished updating S.


Shared Variable S: Thread 1 after Thread 2

Time   A    Thread 1   S     Thread 2   B
  |              25
  |    25              25    B=S        25
  |    25   A=S        25    B*=10      250
  |    26   A++        250   S=B        250
  |    26   S=A        26               250
  v    26              26               250

Thread 1 accesses S after Thread 2 but before Thread 2 has finished modifying S.


Synchronisation

• Problem: Access to shared resources can be dangerous.
◾ These are so-called 'critical' accesses.
• Solution: Critical accesses should be made exclusive. Thus, all critical accesses to a resource are mutually exclusive.
• In the example, both threads should have asked for exclusive access before making their updates.
◾ Depending on timing, one or the other would get exclusive access first. The other would have to wait to get any kind of access.


Mutual Exclusion in pthreads

• Mutual exclusion is accomplished using 'mutex' variables.
• Mutex variables are used to protect access to a resource.


Accessing a protected resource

• To access a mutex-protected resource, an agent must acquire a lock on the mutex variable.
◾ If the mutex variable is unlocked, it is immediately locked and the agent has acquired it. When finished, the agent must unlock it.
◾ If the mutex variable is already locked, the agent has failed to acquire the lock -- the protected resource is in exclusive use by someone else.
• The agent is usually blocked until the lock is acquired.
• A non-blocking version of lock acquisition is available.


Shared Variable S Protected by Mutex M

Time   A    Thread 1    S     Thread 2     B     M
  |         Lock(M)     25                       locked
  |    25   A=S         25    Lock(M)            locked
  |    26   A++         25    [thread            locked
  |    26   S=A         26     blocked]          locked
  |    26   Unlock(M)   26                       locked
  |    26               26    B=S          26    locked
  |    26               26    B*=10        260   locked
  |    26               260   S=B          260   locked
  |    26               260   Unlock(M)    260   locked
  v    26               260                260   unlocked


Create pthread mutex variable

• Static:
◾ pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
• Initially unlocked
• Dynamic:
◾ pthread_mutex_init(&m, attributes)


Lock and Unlock Mutex

• pthread_mutex_lock(&m);
◾ acquire the lock, or block while waiting
• pthread_mutex_trylock(&m);
◾ non-blocking; check the returned code
• pthread_mutex_unlock(&m);
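A minimal sketch of the earlier racy increment, now protected by a mutex (the loop counts are illustrative):

#include <pthread.h>
#include <stdio.h>

int G = 0;
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

void *add_many(void *unused) {
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&m);     // enter critical section
        G = G + 1;                  // protected read-modify-write
        pthread_mutex_unlock(&m);   // leave critical section
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, add_many, NULL);
    pthread_create(&t2, NULL, add_many, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("G = %d\n", G);          // now reliably 2000000
    return 0;
}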


Example (1 of 2) – Thread

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define checkResults(string, val) { \
    if (val) { \
        printf("Failed with %d at %s", val, string); \
        exit(1); \
    } \
}

// From http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=/rzahw/rzahwe18rx.htm
// Fixed to avoid calls to non-standard pthread_getthreadid_np()

#define NUMTHREADS 3

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int sharedData = 0;
int sharedData2 = 0;

void *theThread(void *threadid)
{
    int rc;
    printf("Thread %.8x: Entered\n", (int)threadid);
    rc = pthread_mutex_lock(&mutex);
    checkResults("pthread_mutex_lock()\n", rc);
    /********** Critical Section *******************/
    printf("Thread %.8x: Start critical section, holding lock\n", (int)threadid);
    /* Access to shared data goes here */
    ++sharedData; --sharedData2;
    printf("Thread %.8x: End critical section, release lock\n", (int)threadid);
    /********** Critical Section *******************/
    rc = pthread_mutex_unlock(&mutex);
    checkResults("pthread_mutex_unlock()\n", rc);
    return NULL;
}


Example (2 of 2) – Main

int main(int argc, char **argv)
{
    pthread_t thread[NUMTHREADS];
    int rc = 0;
    int i;

    printf("Enter Testcase - %s\n", argv[0]);
    printf("Hold Mutex to prevent access to shared data\n");
    rc = pthread_mutex_lock(&mutex);
    checkResults("pthread_mutex_lock()\n", rc);

    printf("Create/start threads\n");
    for (i = 0; i < NUMTHREADS; i++) {
        rc = pthread_create(&thread[i], NULL, theThread, (void *)(long)i);
        checkResults("pthread_create()\n", rc);
    }

    /* Remainder as in the cited IBM example: let the threads block on
       the mutex, then release them and wait for them to finish. */
    sleep(3);
    rc = pthread_mutex_unlock(&mutex);
    checkResults("pthread_mutex_unlock()\n", rc);

    for (i = 0; i < NUMTHREADS; i++) {
        rc = pthread_join(thread[i], NULL);
        checkResults("pthread_join()\n", rc);
    }
    printf("Main completed\n");
    return 0;
}


Problems

• Voluntary
◾ Mutexes 'protect' code.
◾ Other programmers don't have to use them to get access to the variables the code accesses.
◾ This is part of the tradeoff. Use processes rather than threads if you want better protection.
• Unfair
◾ If multiple threads are blocked on a mutex, the order in which they wake up is not guaranteed to be any particular order.


Problem Tutorial

• The transpose of a matrix M is a matrix T, where:
    T[i,j] = M[j,i] for all i, j.
• Write a program to transpose a matrix.
• How would you parallelise it?
• What problems/issues do you foresee?




Producers and Consumers

• Essentially a pipeline of separate sequential programs.
◾ E.g. concatenation of unix commands.
• Programs communicate via buffers.
◾ Implemented in different ways, but, e.g.
• Shared memory, flags, semaphores, etc.
• Message passing over a network
• Flow of data is essentially one-way.
• Example…


Condition Variables

• A generalised synchronisation construct.
• Allows you to acquire a mutex lock when a condition relying on a shared variable is true, and to sleep otherwise.
• The 'condition variable' is used to manage the mutex lock and the signalling.
• Slightly tricky.


Condition Variables (2)

• Dramatis Personae:
◾ A mutex variable M,
◾ A shared resource R to be guarded by M,
◾ A condition variable V,
◾ A condition C,
◾ A thread U that wishes to use R, protected by M, on condition that C is true,
◾ A thread S that will change C and signal V.


From Thread U's POV (1)

• Acquire Mutex M
◾ This controls access to R,
◾ but also must control access to any shared variables that C depends on.
• If C is true, continue – the mutex is available unfettered,
◾ i.e. if C is true, the mutex can be used as normal.
• If C is false…


From Thread U's POV (2)

• If C is false, wait for condition variable V to be signalled.
• This is tricky, as access to V is controlled by M.
◾ So, Mutex M is unlocked,
◾ Thread sleeps, to be woken when V is signalled,
◾ Then, M is re-acquired.
• No guarantee you'll get it right away – maybe another thread will.
• So now, if V has been signalled…


From Thread U's POV (3)

• Since V has been signalled, there is a chance that the condition C is now true.
• Re-evaluate C:
◾ If true, then the mutex is available for you to use,
◾ If false, back to the previous page.
• Our 'if' needs to be replaced by a 'while'…


From Thread U's POV (4)

• Acquire Mutex M
• While !C
◾ Unlock M
◾ Wait for V to be signalled
◾ Lock M
• Continue
• Unlock M (finished)

pthread_mutex_lock(&m);
while (!C)
    pthread_cond_wait(&v, &m);
    // or: pthread_cond_timedwait(&v, &m, <timeout>);
/* continue: use the resource */
pthread_mutex_unlock(&m);

(pthread_cond_wait itself performs the unlock/sleep/relock steps for you.)
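A minimal compilable sketch of the U/S pattern above (m and v are the slides' names; the ready flag standing in for condition C is an assumption):

#include <pthread.h>
#include <stdio.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  v = PTHREAD_COND_INITIALIZER;
int ready = 0;                       // the shared state that condition C tests

void *user_thread(void *arg) {       // thread U
    pthread_mutex_lock(&m);
    while (!ready)                   // re-test C after every wakeup
        pthread_cond_wait(&v, &m);   // unlocks m, sleeps, re-locks m
    printf("U: condition is true, using the resource\n");
    pthread_mutex_unlock(&m);
    return NULL;
}

void *signaller_thread(void *arg) {  // thread S
    pthread_mutex_lock(&m);
    ready = 1;                       // make C true
    pthread_mutex_unlock(&m);
    pthread_cond_signal(&v);         // wake a waiting thread
    return NULL;
}

int main(void) {
    pthread_t u, s;
    pthread_create(&u, NULL, user_thread, NULL);
    pthread_create(&s, NULL, signaller_thread, NULL);
    pthread_join(u, NULL);
    pthread_join(s, NULL);
    return 0;
}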


Thread S (1)

• Thread U is the 'user' of the condition variable V.
• If the condition is not true, U unlocks V's mutex M and sleeps, waiting for prince charming (some other thread S) to signal V and thereby wake U to check the condition again.


Thread S (2)

• Acquire Mutex M
• Do something
• Unlock Mutex M
• Signal the condition variable
◾ This will awaken a thread that is waiting on the condition variable.

pthread_mutex_lock(&m);
/* do something */
pthread_mutex_unlock(&m);
pthread_cond_signal(&v);
// or: pthread_cond_broadcast(&v);


Semaphores

• Semaphore = sema [signal] + phoros [bearer] (from Greek).
• Historically, semaphores were used for long-distance optical signalling, e.g. lanterns, flags, coloured arms, etc.
◾ E.g. semaphores ("the Chappe system") were used in France from the late 1700s to the mid 1800s for military and governmental use.


Semaphores

• Computer semaphores [specifically "counting semaphores"] were invented by Edsger W. Dijkstra and named after signalling semaphores.
• Can be thought of as a special case of a condition variable: a non-negative counter value.
• Used between producer and consumer threads.
• Can be implemented as a special case of Condition Variables.


Semaphore Producer

• Acquire Mutex
• <make the item/resource available>
• Increment Semaphore Count
• Signal Condition Variable
• Release Mutex


Semaphore Consumer

◾ Acquire Mutex
◾ While semaphore count = 0,
• release mutex
• sleep on Condition Variable
• reacquire mutex
◾ Decrement semaphore count.
◾ <take the item/resource>
◾ Release mutex
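A sketch of a counting semaphore built from a mutex and a condition variable, following the recipe above (type and function names are illustrative, chosen not to clash with the POSIX sem_* API):

#include <pthread.h>

typedef struct {
    pthread_mutex_t m;
    pthread_cond_t  cv;
    int count;                      // the non-negative semaphore value
} csem;

void csem_init(csem *s, int initial) {
    pthread_mutex_init(&s->m, NULL);
    pthread_cond_init(&s->cv, NULL);
    s->count = initial;
}

void csem_post(csem *s) {           // the producer side
    pthread_mutex_lock(&s->m);
    s->count++;                     // increment semaphore count
    pthread_cond_signal(&s->cv);    // wake one waiting consumer
    pthread_mutex_unlock(&s->m);
}

void csem_wait(csem *s) {           // the consumer side
    pthread_mutex_lock(&s->m);
    while (s->count == 0)           // sleep while nothing is available
        pthread_cond_wait(&s->cv, &s->m);
    s->count--;                     // decrement semaphore count
    pthread_mutex_unlock(&s->m);
}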


Semaphores (2)

• A semaphore package is also available as part of the POSIX Realtime Extensions.
◾ Small detail: these are 'async safe' in Unix – they can be used inside Unix signal handlers.


Semaphores (3)

• Useful for countable resources, e.g. buffers.
◾ Semaphore variable = resource count.
• Can be used to prevent underflow and to limit the count (i.e. prevent overflow).


Clients and Servers

• Somewhat similar to producer-consumer, but:
◾ Two-way flow of information between client and server, e.g. data supplied and result returned.
◾ Server may have many clients.
• Widely used paradigm on distributed systems.


Sample Solution for π

• Calculate π using the formula below:

    π = ∫₀¹ 4 / (1 + x²) dx

[Figure: the curve 4/(1 + x²) plotted between x = 0 and x = 1; the shaded area under it equals π.]
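One possible shape for the practical (a sketch, not the official solution): a midpoint-rectangle sum over 4/(1 + x²), split across threads so their write sets don't intersect.

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define NSTEPS   10000000

double partial[NTHREADS];           // one slot per thread: no shared writes

void *strip_sum(void *arg) {
    long id = (long)arg;
    double h = 1.0 / NSTEPS, sum = 0.0;
    // Each thread sums every NTHREADS-th strip: embarrassingly parallel.
    for (long i = id; i < NSTEPS; i += NTHREADS) {
        double x = (i + 0.5) * h;   // midpoint of strip i
        sum += 4.0 / (1.0 + x * x);
    }
    partial[id] = sum * h;
    return NULL;
}

int main(void) {
    pthread_t th[NTHREADS];
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&th[t], NULL, strip_sum, (void *)t);
    double pi = 0.0;
    for (long t = 0; t < NTHREADS; t++) {
        pthread_join(th[t], NULL);
        pi += partial[t];           // combine after joining: no race
    }
    printf("pi ~ %.10f\n", pi);
    return 0;
}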


Amdahl's Law

• If a portion P of a job can be speeded up by a factor of S, then the speedup is given as:

    Speedup = 1 / ((1 − P) + P/S)

• From this, you can calculate the efficiency of the parallelisation on N processors:

    E = Speedup / N
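A worked case (not from the slides): if P = 0.95 of the work parallelises perfectly across S = 8 cores, then Speedup = 1 / (0.05 + 0.95/8) ≈ 5.9, and on N = 8 processors the efficiency is E = 5.9 / 8 ≈ 0.74.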


Problem Tutorial

#if defined(__i386__)
unsigned long long clock_cycles(void)
{
    unsigned long long int x;
    __asm__ volatile (".byte 0x0f, 0x31" : "=A" (x));   // the rdtsc opcode
    return x;
}
#elif defined(__x86_64__)
unsigned long long clock_cycles(void)
{
    unsigned hi, lo;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)lo) | (((unsigned long long)hi) << 32);
}
#endif


Good Practice…

http://www.xkcd.com/292/


The Challenge of Concurrency

• Conventional testing and debugging is not generally useful.
◾ We assume that the sequential parts of concurrent programs are correctly developed.
• Beyond normal sequential-type bugs, there is a whole range of problems caused by errors in communication and synchronisation. They cannot easily be reproduced and corrected.
• So, we are going to use notions, algorithms and formalisms to help us design concurrent programs that are correct by design.


Sequential Process

• A sequential process is the execution of a sequence of atomic statements.
◾ E.g. Process P could consist of the execution of P1, P2, P3, …
◾ Process Q could be Q1, Q2, Q3, …
• We think of each sequential process as having its own separate PC and being a distinct entity.


Concurrent Execution

• A concurrent system is modelled as a collection of sequential processes, where the atomic statements of the sequential processes can be arbitrarily interleaved, but respecting the sequentiality of the atomic statements within their sequential processes.
• E.g. say P is p1, p2 and Q is q1, q2.


Scenarios for P and Q.

p1→q1→p2→q2
p1→q1→q2→p2
p1→p2→q1→q2
q1→p1→q2→p2
q1→p1→p2→q2
q1→q2→p1→p2


Notation

Trivial Concurrent Program                (Title)
integer n ← 0                             (Globals)

p (Process Name)                          q
integer k1 ← 1 (Locals)                   integer k2 ← 2
p1: n ← k1 (Atomic Statements)            q1: n ← k2


Sample Sequential Program

Trivial Sequential Program
integer n ← 0
integer k1 ← 1
integer k2 ← 2
p1: n ← k1
p2: n ← k2

Execution trace:

p1: n ← k1      k1 = 1, k2 = 2, n = 0
p2: n ← k2      k1 = 1, k2 = 2, n = 1
(end)           k1 = 1, k2 = 2, n = 2


Trivial Concurrent Program

Scenario p1, q1:                          Scenario q1, p1:

p1: n ← k1, q1: n ← k2                    p1: n ← k1, q1: n ← k2
k1 = 1, k2 = 2, n = 0                     k1 = 1, k2 = 2, n = 0

(end), q1: n ← k2                         p1: n ← k1, (end)
k1 = 1, k2 = 2, n = 1                     k1 = 1, k2 = 2, n = 2

(end), (end)                              (end), (end)
k1 = 1, k2 = 2, n = 2                     k1 = 1, k2 = 2, n = 1


State Diagrams and Scenarios

• We could describe all possible ways a program can execute with a state diagram.
◾ There is a transition between s1 and s2 ("s1:s2") if executing a statement in s1 changes the state to s2.
◾ A state diagram is generated inductively from the starting state.
• If ∃ s1 and a transition s1:s2, then ∃ s2 and a directed arc from s1 to s2.
• Two states are identical if they have the same variable values and the same directed arcs leaving them.
• A scenario is one path through the state diagram.


Example — Jumping Frogs

• A frog can move to an adjacent stone if it's vacant.
• A frog can hop over an adjacent stone to the next one if that one is vacant.
• No other moves are possible.


Can we interchange the positions of the frogs?

[Figure: the initial arrangement of frogs on a line of stones, and the target arrangement with the two groups exchanged.]


If the frogs can only move "forwards", can we:

[Figure: as before, exchange the two groups of frogs, now with movement allowed in one direction only.]


Proof

• We can prove interesting properties by trying to construct a state diagram describing
◾ each possible state of the system and
◾ each possible move from one state to another
• We might use a state diagram to show that
◾ A property always holds
◾ A property never holds in any reachable state
◾ A property sometimes holds
◾ A property always holds eventually


State Diagrams for Processes

• A state is defined by a tuple of
◾ for each process, the label of the statement available for execution.
◾ for each variable, its value.
• Q: What is the maximum number of possible states in such a state diagram?
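As a worked hint to the question: if each of n processes has Li possible statement labels and each of m variables can take Vj possible values, the state tuple can take at most L1 × … × Ln × V1 × … × Vm distinct values; the 4 × 4 × 2 = 32 count on a later slide is exactly this product.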


Course Link

https://www.cs.tcd.ie/~brady/courses/ba/CS3BA29/3D4-3BA292009.pdf


Book Reference

Principles of Concurrent and Distributed Programming
By M. Ben-Ari
Edition: 2
Published by Addison-Wesley, 2006
ISBN 032131283X, 9780321312839
361 pages
$125 at Amazon!


Correctness, fairness

• We can use state diagram descriptions of concurrent computations to explore whether a concurrent program is correct according to some criterion, or whether it is fair.


Correctness

• (Correctness in Sequential Programs)
◾ Partial Correctness: the answer is correct if the program halts
◾ Total Correctness: the program does halt and the answer is correct.
• Correctness in Concurrent Programs
◾ Safety Property: a property that must always be true
◾ Liveness Property: a property that must eventually be true


Fairness

• Recall a concurrent program is modelled as the interleaving of statements in any possible way in a number of sequential programs.
◾ If a statement is ready for execution, and it is guaranteed that it will eventually be executed (i.e. it will appear in a scenario), the system is said to be weakly fair.
• Another way of thinking about this is that in a weakly fair system, the arbitrary interleaving of statements does not include scenarios in which available and ready statements will never be executed.
• Sometimes we assume weak fairness for something to work.


Using State machines

• We can describe concurrent programs using states and state transitions.
• We can construct a state transition diagram (a "model") which captures every possible scenario a concurrent program is capable of executing.
• By reasoning with the model ("model checking"), we can prove whether or not a particular property holds for a concurrent program.
• Much more powerful idea than merely testing a concurrent program.


The Critical Section Problem

• This is to model a situation in which two or more endless processes have critical sections in which it is certain that only one process is executing, i.e.:
◾ Statements from the critical regions of two or more processes must not be interleaved.
• We also need to be certain that processes are safe from one another in other ways:
◾ No deadlock: if some processes are trying to execute their critical sections, one will succeed,
◾ No starvation: if any process tries to execute its critical region, it will succeed eventually.


The Critical Section Problem (2)

• We must develop some kind of a synchronization mechanism.
◾ Before the critical region, a sequence of setup statements called a preprotocol.
◾ After the critical region, a sequence of teardown statements called a postprotocol.
• Our task is to develop pre- and postprotocols to ensure mutual exclusion from critical region execution, i.e.
◾ We must ensure that statements from the critical regions of processes are not interleaved.


Critical Section Processes Model

Critical Section Problem
global variables

p                                q
local variables                  local variables
loop forever                     loop forever
    non-critical section             non-critical section
    preprotocol                      preprotocol
    critical section                 critical section
    postprotocol                     postprotocol


Observations & Implications

• Remember, we are using these ideas to model real programs.
• We assume the protocols and the sections are independent of one another, not sharing variables, etc.
• Critical sections must progress. It must be that the critical sections always complete their run-throughs, i.e. if a critical section is entered, we must be certain that it will exit. (Why?)
• Non-critical sections should not be required to progress. (Why?)


Critical Section – First Try

Critical Section Problem
integer turn ← 1

p                                q
loop forever                     loop forever
p1: non-critical section         q1: non-critical section
p2: await turn = 1               q2: await turn = 2
p3: critical section             q3: critical section
p4: turn ← 2                     q4: turn ← 1

Total number of states possible: 4 × 4 × 2 = 32


Simplify

Critical Section Problem
integer turn ← 1

p                                q
loop forever                     loop forever
p1: // non-critical section      q1: // non-critical section
p2: await turn = 1               q2: await turn = 2
p3: // critical section          q3: // critical section
p4: turn ← 2                     q4: turn ← 1

Sections and protocols are independent of one another, by definition.
If we comment out p3 (or q3), it's as if we went straight to p4 (resp. q4).
So, we can eliminate p3 (q3). Similarly with p1 (q1).


Simplify

Critical Section Problem
integer turn ← 1

r                                s
loop forever                     loop forever
r1: NCS; await turn = 1          s1: NCS; await turn = 2
r2: turn ← 2                     s2: turn ← 1

Total number of states possible: 2 × 2 × 2 = 8


State Transitions

The reachable states, written ⟨r, s, turn⟩:

⟨r1, s1, 1⟩    r1: await turn = 1,   s1: await turn = 2,   turn = 1
⟨r1, s1, 2⟩    r1: await turn = 1,   s1: await turn = 2,   turn = 2
⟨r2, s1, 1⟩    r2: turn ← 2,         s1: await turn = 2,   turn = 1
⟨r1, s2, 2⟩    r1: await turn = 1,   s2: turn ← 1,         turn = 2

The remaining combinations, ⟨r2, s1, 2⟩, ⟨r2, s2, 2⟩ and ⟨r2, s2, 1⟩, do not appear: they are unreachable.


Mutual Exclusion in Critical Sections?

• Are the critical sections mutually exclusive?
◾ Yep: the states ⟨r2, s2, 1⟩ and ⟨r2, s2, 2⟩ cannot be reached. ✔


Deadlock Free?

• Deadlock Free: If some processes are trying to enter their critical sections, it is certain that at least one will eventually succeed.
• Proofs (state tuples ⟨r, s, turn⟩ as in the state diagram):
◾ In ⟨r1, s1, 1⟩, process r will progress to r2.
◾ In ⟨r2, s1, 1⟩, a transition is guaranteed* to ⟨r1, s1, 2⟩.
◾ In ⟨r1, s1, 2⟩, process s will progress to s2.
◾ In ⟨r1, s2, 2⟩, a transition is guaranteed* to ⟨r1, s1, 1⟩.
• Thus, guaranteed deadlock-free.

*How does this guarantee work?


Starvation Free?

• Starvation Free: If any process is trying to enter its critical section, it is certain that it will eventually succeed.
• Proofs:
◾ In ⟨r1, s1, 1⟩, both processes are in their combined non-critical and preprotocol sections.
◾ Say s is ready to enter its critical region: it can't, because turn = 1, and only r can set it to 2.
• But r is in its non-critical region, where progress is not guaranteed.
◾ We can't be certain turn will ever change to 2.
◾ Thus, we can't guarantee starvation-free operation. ✖


Scorecard

Mutual Exclusion    ✔
Deadlock Free       ✔
Starvation Free     ✖

Basically, if one process dies in a non-critical section, the other processes will be starved of access to their critical regions. Not good.

Must Do Better: How about giving each process its own equivalent of a turn variable? -- a "want" variable?


Second Attempt

• Two processes, p and q, each with their own 'want' variables wantp and wantq.
• If a process wants to enter its critical section, it ensures the other process's want variable is false, and then asserts its own while it's entering, is in, and is leaving its critical region.


Attempt #2

Critical Section Problem Attempt #2
boolean wantp ← false, wantq ← false

p                                q
loop forever                     loop forever
p1: non-critical section         q1: non-critical section
p2: await wantq = false          q2: await wantp = false
p3: wantp ← true                 q3: wantq ← true
p4: critical section             q4: critical section
p5: wantp ← false                q5: wantq ← false

Do you have an intuition about what might happen?


Attempt #2 Shortened

Critical Section Problem Attempt #2
boolean wantp ← false, wantq ← false

p                                q
loop forever                     loop forever
p1: await wantq = false          q1: await wantp = false
p2: wantp ← true                 q2: wantq ← true
p3: wantp ← false                q3: wantq ← false

Scenario:

Process p                    Process q                    wantp   wantq
p1: await wantq = false      q1: await wantp = false      FALSE   FALSE
p2: wantp ← true             q1: await wantp = false      FALSE   FALSE
p2: wantp ← true             q2: wantq ← true             FALSE   FALSE
p3: wantp ← false            q2: wantq ← true             TRUE    FALSE
p3: wantp ← false            q3: wantq ← false            TRUE    TRUE


Attempt #2 Fails…

• Attempt #2 fails because we can construct a scenario (a path through a state transition diagram) which shows that a state with p3 and q3 both free to execute can be reached. These correspond to the critical regions of their processes.
◾ Thus: mutual exclusion is not guaranteed.


Attempt #3

• Attempt #2 fails because of the 'gap' between finding out the other process is not attempting to get into its critical region and entering the critical region.
◾ (The other process could enter its critical region in the gap.)
• So, attempt #3 works by claiming to enter the critical region before checking it's okay by everyone else.
◾ i.e. Attempt #3 asserts want before awaiting want
◾ cf. Attempt #2, which awaits want before asserting want


Attempt #3

Critical Section Problem Attempt #3
boolean wantp ← false, wantq ← false

p                                q
loop forever                     loop forever
p1: non-critical section         q1: non-critical section
p2: wantp ← true                 q2: wantq ← true
p3: await wantq = false          q3: await wantp = false
p4: critical section             q4: critical section
p5: wantp ← false                q5: wantq ← false

Do you have an intuition about what might happen?


Attempt #3 Shortened

Critical Section Problem Attempt #3
boolean wantp ← false, wantq ← false

p                                q
loop forever                     loop forever
p1: wantp ← true                 q1: wantq ← true
p2: await wantq = false          q2: await wantp = false
p3: wantp ← false                q3: wantq ← false

Scenario:

Process p                    Process q                    wantp   wantq
p1: non critical section     q1: non critical section     FALSE   FALSE
p2: wantp ← true             q1: non critical section     FALSE   FALSE
p2: wantp ← true             q2: wantq ← true             FALSE   FALSE
p3: await wantq = false      q2: wantq ← true             TRUE    FALSE
p3: await wantq = false      q3: await wantp = false      TRUE    TRUE


Attempt #3 Fails…

• Attempt #3 fails because we can construct a scenario which shows that a state with deadlock (actually, this is livelock) can be reached.
◾ Thus: freedom from deadlock is not guaranteed.


Attempt #4

• Attempt #3 fails because each process insisted on entering its critical region (by asserting its want variable).
◾ Having insisted on entering, they could not proceed because they were waiting for all others to not be in their critical regions.
• So, attempt #4 works by our process claiming to enter the critical region before checking it's okay by everyone else.
◾ Then, if another process is in its critical region, our process relinquishes and reasserts its insistence on entering its critical region
• i.e. Attempt #4 negates and then reasserts its want variable.


Attempt #4

Critical Section Problem Attempt #4
boolean wantp ← false, wantq ← false

p                                q
loop forever                     loop forever
p1: non-critical section         q1: non-critical section
p2: wantp ← true                 q2: wantq ← true
p3: while wantq                  q3: while wantp
p4:     wantp ← false            q4:     wantq ← false
p5:     wantp ← true             q5:     wantq ← true
p6: critical section             q6: critical section
p7: wantp ← false                q7: wantq ← false

Do you have an intuition about what might happen?


Fragment of a scenario…

Process p                Process q                wantp   wantq
p3: while wantq          q3: while wantp          TRUE    TRUE
p3: while wantq          q4: wantq ← false        TRUE    TRUE
p4: wantp ← false        q4: wantq ← false        TRUE    TRUE
p4: wantp ← false        q5: wantq ← true         TRUE    FALSE
p5: wantp ← true         q5: wantq ← true         FALSE   FALSE
p5: wantp ← true         q3: while wantp          TRUE    FALSE
p3: while wantq          q3: while wantp          TRUE    TRUE
…

Here is the possibility of an endless loop.


Attempt #4 Fails…

• Attempt #4 fails because we can construct a scenario which shows that a state with starvation can occur.
◾ Thus: freedom from starvation is not guaranteed.


Attempt #5 – Dekker's Algorithm

• Uses the turn variable (Attempt #1)
• Uses want variables and relinquishing insistence on the critical region (Attempt #4)


Attempt #5, derived from #4

Critical Section Problem Attempt #5
boolean wantp ← false, wantq ← false, turn ← 1

p                                q
loop forever                     loop forever
p1: non-critical section         q1: non-critical section
p2: wantp ← true                 q2: wantq ← true
p3: while wantq                  q3: while wantp
        if turn = 2                      if turn = 1
p4:         wantp ← false        q4:         wantq ← false
            await turn = 1                   await turn = 2
p5:         wantp ← true         q5:         wantq ← true
p6: critical section             q6: critical section
    turn ← 2                         turn ← 1
p7: wantp ← false                q7: wantq ← false


Dekker’s Algorithm

Critical Section Problem Attempt #5
boolean wantp ← false, wantq ← false, turn ← 1

         p                                  q
loop forever                       loop forever
p1: non-critical section           q1: non-critical section
p2: wantp ← true                   q2: wantq ← true
p3: while wantq                    q3: while wantp
p4:   if turn = 2                  q4:   if turn = 1
p5:     wantp ← false              q5:     wantq ← false
p6:     await turn = 1             q6:     await turn = 2
p7:     wantp ← true               q7:     wantq ← true
p8: critical section               q8: critical section
p9: turn ← 2                       q9: turn ← 1
p10: wantp ← false                 q10: wantq ← false
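To make the control flow concrete, here is a minimal sketch of Dekker’s algorithm in C for two threads (0 and 1). It assumes sequentially consistent shared memory; on real hardware, plain (even volatile) variables are not enough, and C11 atomics or memory fences would be needed. The names enter and leave are illustrative, not part of the algorithm.

/* Dekker's algorithm: a sketch only, assuming sequential consistency. */
volatile int want[2] = {0, 0};
volatile int turn = 0;                /* whose turn it is on contention */

void enter(int self) {                /* self is 0 or 1 */
    int other = 1 - self;
    want[self] = 1;                   /* p2 */
    while (want[other]) {             /* p3 */
        if (turn == other) {          /* p4 */
            want[self] = 0;           /* p5: relinquish */
            while (turn == other)     /* p6: await our turn */
                ;                     /* busy-wait */
            want[self] = 1;           /* p7: reassert */
        }
    }
}

void leave(int self) {
    turn = 1 - self;                  /* p9: hand the turn over */
    want[self] = 0;                   /* p10 */
}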



Verification of concurrent programs

• Manual construction and inspection of state diagrams
– Only suitable for the most trivial programs with few states
• Automated construction and inspection of state diagrams
– The task of constructing the state diagram and using it to check the correctness of the concurrent program can be automated using a model checker
– The power of model checkers lies in their ability to handle very large numbers of states
– We still need a way to specify models and logical correctness properties
• Formal specification and verification using temporal logic


A quick overview of some mathematical logic

• M. Ben-Ari, “Principles of Concurrent and Distributed Programming”, 2nd edition, Chapter 4 and Appendix B
• Propositional calculus
– atomic propositions {p, q, r, …}
– unary operator ¬ (not)
– binary operator ∨ (or)
– binary operator ∧ (and)
– binary operator → (implication)
– binary operator ↔ (equivalence)
• Semantics
– An assignment function v maps propositions to {T, F} (true, false)
– Once we have mapped propositions to truth values, we can interpret formulas according to the rules in the table on the next slide
– e.g. if p is TRUE and q is FALSE, then p ∨ q is TRUE but p ∧ q is FALSE
– The resulting truth value of a formula is its interpretation; the interpretation function is also written v


A quick overview of some mathematical logic

A           v(A1)            v(A2)   v(A)
¬A1         T                        F
¬A1         F                        T
A1 ∨ A2     F                F       F
A1 ∨ A2     otherwise                T
A1 ∧ A2     T                T       T
A1 ∧ A2     otherwise                F
A1 → A2     T                F       F
A1 → A2     otherwise                T
A1 ↔ A2     v(A1) = v(A2)            T
A1 ↔ A2     v(A1) ≠ v(A2)            F


A quick overview of some mathematical logic

• Material implication paradox

v(A1)   v(A2)   v(A1 → A2)
F       F       T
F       T       T
T       F       F
T       T       T

• Consider: “superman is real” implies “fish can swim”
– “superman is real” is FALSE
– “fish can swim” is TRUE
– so, by the rules of material implication, our statement is TRUE
– this may seem counterintuitive


A quick overview of some mathematical logic

• However, consider: “If it is sunny then I will be at the beach”
– Think of the statement as a guarantee
– When is the guarantee broken?

v(A1)           v(A2)                      v(A1 → A2)
(it is sunny)   (I will be at the beach)
F               F                          T
F               T                          T
T               F                          F
T               T                          T


A quick overview of some mathematical logic

• Proof by induction
• For natural numbers
– First prove for n = 1 (the base case)
– Then assume true for n (the inductive hypothesis)
– Then prove for n + 1
– e.g. prove F(n): the sum of the first n positive integers is n(n+1)/2 (worked below)
• For states
– First prove for the initial state (base case)
– Then assume A is true in all states up to the current state
– Then prove that it is true in the next state (there may be more than one)
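For concreteness, here is how the induction for F(n) goes (a standard argument, written in LaTeX):

\textbf{Base case } (n = 1): \quad 1 = \frac{1(1+1)}{2}.

\textbf{Inductive step:} assume $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$. Then
\[
  \sum_{i=1}^{n+1} i \;=\; \frac{n(n+1)}{2} + (n+1) \;=\; \frac{(n+1)(n+2)}{2},
\]
which is exactly F(n+1). Hence F(n) holds for all n ≥ 1.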



Proving correctness by induction

• We can prove correctness using induction
– E.g. prove that the solution in “Critical Section Problem Attempt #3” guarantees mutual exclusion
– formally, prove that ¬(p4 ∧ q4) is invariant
– an invariant must hold (remain true) in every state of any execution
– In our proof, we only need to consider statements that can change the truth value of one of our atomic propositions
– By the properties of material implication, if a formula is assumed true by the inductive hypothesis, it can only be falsified in two ways (see next slide)

Critical Section Problem Attempt #3
boolean wantp ← false, wantq ← false

         p                              q
loop forever                   loop forever
p1: non-critical section       q1: non-critical section
p2: wantp ← true               q2: wantq ← true
p3: await wantq = false        q3: await wantp = false
p4: critical section           q4: critical section
p5: wantp ← false              q5: wantq ← false
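A sketch of how the induction goes for this example, written in LaTeX. It leans on an auxiliary invariant, stated here without proof: wantp is true exactly when p is at p3, p4 or p5 (and symmetrically for q).

\textbf{Base case:} initially $p$ is at $p1$ and $q$ is at $q1$, so $\lnot(p4 \land q4)$ holds.

\textbf{Inductive step:} the only statements that could falsify $\lnot(p4 \land q4)$
are $p3$ (moving $p$ to $p4$ while $q$ is at $q4$) and, symmetrically, $q3$.
But if $q$ is at $q4$ then, by the auxiliary invariant, $wantq = \text{true}$,
so $p3$'s \emph{await wantq = false} blocks and $p$ cannot reach $p4$.
Hence no step falsifies the invariant.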



Proving correctness by induction

• The only two ways to falsify a true implication p → q:
– Take the case v(p) = v(q) = True: “suddenly” v(q) becomes False while v(p) stays True, so p → q is now False
– Take the case v(p) = v(q) = False: “suddenly” v(p) becomes True while v(q) remains False, so p → q is now False

v(p)   v(q)   v(p → q)
F      F      T
F      T      T
T      F      F
T      T      T


A quick overview of some temporal logic

• We have been assigning the value TRUE or FALSE to propositions to allow us to interpret formulas, for example …
– A1 → A2, when A1 is TRUE and A2 is FALSE
– “If it is sunny then I will be at the beach”, when “it is sunny” is TRUE and “I will be at the beach” is TRUE
– p3..5 → wantp (where p3..5 means control is at p3, p4 or p5), when p3..5 is FALSE and wantp is FALSE
• This is insufficient to allow us to reason about concepts such as “eventually” or “always”, for example …
– p3 → eventually wantp will be FALSE
• Extend propositional calculus with temporal operators
– (propositional) linear temporal logic (LTL)
– M. Ben-Ari, “Principles of Concurrent and Distributed Programming”, 2nd edition, section 4.3


A quick overview of some temporal logic

• Computation
– an infinite sequence of states {s0, s1, …}
• □, always
– a formula □A is true in a state si if and only if the formula A is true in all states sj, j ≥ i, of the computation
– recall safety and liveness
– □A is a safety property (e.g. □ x ≥ 0)
• ◊, eventually
– a formula ◊A is true in state si if and only if the formula A is true in some state sj, j ≥ i, of the computation
– ◊A is a liveness property (e.g. ◊ y, where y is a boolean variable)


A quick overview of some temporal logic

• Temporal operator duality
– Remember De Morgan’s laws?
– ¬(A ∧ B) ↔ (¬A ∨ ¬B)
– ¬(A ∨ B) ↔ (¬A ∧ ¬B)
– Similarly
– ¬□A ↔ ◊¬A
– ¬◊A ↔ □¬A
• Temporal operator sequences
– ◊□A: eventually A holds forever (from some point on, A is always true)
– □◊A: A holds infinitely often (at every point, A becomes true again some time later)


What is temporal logic useful for?

• When proving a safety property (e.g. mutual exclusion) we used invariants like ¬(p4 ∧ q4)
• These are no use for liveness properties (e.g. freedom from starvation)
– We need to reason about progress: if a process is in a state satisfying formula A, it must progress to a state satisfying B
– In temporal logic: A → ◊B
• We need a basic understanding of temporal logic to be able to use model checking tools


Temporal logic at work

• Consider some safety and liveness properties for our solutions to the critical section problem:

Critical section problem – Dekker’s algorithm
boolean wantp ← false, wantq ← false
integer turn ← 1

         p                                  q
loop forever                       loop forever
p1: non-critical section           q1: non-critical section
p2: wantp ← true                   q2: wantq ← true
p3: while wantq                    q3: while wantp
p4:   if turn = 2                  q4:   if turn = 1
p5:     wantp ← false              q5:     wantq ← false
p6:     await turn = 1             q6:     await turn = 2
p7:     wantp ← true               q7:     wantq ← true
p8: critical section               q8: critical section
p9: turn ← 2                       q9: turn ← 1
p10: wantp ← false                 q10: wantq ← false

• Mutual exclusion (safety)?
• Freedom from starvation for p (liveness)?
• Freedom from deadlock (liveness)?
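One plausible way to formalise these three properties in LTL, treating each label pi/qi as an atomic proposition that is true when control is at that statement (a sketch, not the only possible formulation):

\begin{align*}
\text{Mutual exclusion:}\quad & \Box\,\lnot(p8 \land q8)\\
\text{Freedom from starvation for } p\text{:}\quad & \Box\,(p2 \rightarrow \Diamond\, p8)\\
\text{Freedom from deadlock:}\quad & \Box\,\bigl((p2 \lor q2) \rightarrow \Diamond\,(p8 \lor q8)\bigr)
\end{align*}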



Model Checking

• “Model checking is an automated technique that, given a finite-state model of a system and a logical property, systematically checks whether this property holds for (a given initial state in) that model.” [Clarke & Emerson 1981]*
◾ A safety property asserts that nothing bad happens (e.g. no deadlocked states).
◾ A liveness property asserts that something good will eventually happen (e.g. always normal termination).

* E. M. Clarke and E. A. Emerson. Design and synthesis of synchronization skeletons using branching time temporal logic. In Logic of Programs: Workshop, Yorktown Heights, NY, May 1981, volume 131 of LNCS. Springer, 1981.

◾ Clarke & Emerson (CMU) and Joseph Sifakis (Grenoble) won the ACM Turing Award 2007 for their work on Model Checking
◾ Based on:
http://ls1-www.cs.uni-dortmund.de/~tick/Lehre/SS06/MC/Spin-1.pdf &
http://ls1-www.cs.uni-dortmund.de/~tick/Lehre/SS06/MC/Spin-2.pdf


Promela

• Promela (Protocol/Process Meta Language) is a modelling language, not a programming language.
• A Promela model consists of:
◾ type declarations (mtype, constants, typedefs (records))
◾ channel declarations // we won’t be using these
◾ global variable declarations (simple and structured vars; vars can be accessed by all processes)
◾ process declarations (the behaviour of the processes: local variables + statements)
◾ [init process] (initialises variables and starts processes)
• A Promela model corresponds to a finite transition system, so:
◾ no unbounded data / channels / processes / process creation


Process (1)

• A process is defined by a proctype definition
• It executes concurrently with all other processes, independently of their speed or behaviour
• A process communicates with other processes:
◾ using global variables
◾ using channels
• There may be several processes of the same type.
• Each process has its own local state:
◾ the process counter (location within the proctype)
◾ the contents of the local variables


Process (2)

• A process type (proctype) consists of
◾ a name
◾ a list of formal parameters
◾ local variable declarations
◾ the body

proctype Sender(byte in; byte out) {
    bit sndB;
    do
    :: out == 10 ->
        in = 4;
        if
        :: sndB == 1 -> sndB = 1 - sndB
        :: else -> skip
        fi
    od
}


Process (3)

• Processes are created using the run statement (which returns the process id).
• Processes can be created at any point in the execution.
• Processes start executing after the run statement.
• Processes can also be created and start running by adding active in front of the proctype declaration.

active [2] proctype P() {
    byte i = 1;
    byte temp;
    do
    :: (i > TIMES) -> break
    :: else ->
        temp = n;
        n = temp + 1;
        i++
    od;
    finished++;   /* process terminates */
}

(TIMES, n and finished are globals declared elsewhere in the model.)


SPIN Hello World Example

SPIN = Simple Promela INterpreter

/* A "Hello World" Promela model for SPIN. */
active proctype Hello() {
    printf("Hello process, my pid is: %d\n", _pid);
}

init {
    int lastpid;
    printf("init process, my pid is: %d\n", _pid);
    lastpid = run Hello();
    printf("last pid was: %d\n", lastpid);
    lastpid = run Hello();
    printf("last pid was: %d\n", lastpid);
}

$ spin -n2 hello.pr
init process, my pid is: 1
last pid was: 2
Hello process, my pid is: 0
Hello process, my pid is: 2
last pid was: 3
Hello process, my pid is: 3
4 processes created


Promela Variables

• Basic types
◾ bit     e.g. turn = 1;      range: 0..1
◾ bool    e.g. flag;          0..1
◾ byte    e.g. counter;       0..255
◾ short   e.g. s;             -2^15 .. 2^15 - 1
◾ int     e.g. msg;           -2^31 .. 2^31 - 1
• The default initial value of basic variables (local and global) is 0.
• Most arithmetic, relational and logical operators of C/Java are supported, including bit-shift operators.


Promela Variables (2)

• Arrays
◾ Zero-based indexing
• Records (“structs” in C/Java)
◾ typedefs

typedef Record {
    short f1;
    byte f2;
}


Statements

• A statement is either
◾ executable: the statement can be executed immediately.
◾ blocked: the statement cannot be executed.
• An assignment is always executable.
• An expression is also a statement; it is executable if it evaluates to non-zero. E.g.
◾ 2 < 3      always executable
◾ x < 27     only executable if the value of x is smaller than 27
◾ 3 + x      executable if x is not equal to -3


Statements (2)

• The skip statement is always executable.
◾ it “does nothing”, only changes the process counter
• A run statement is only executable if a new process can be created (remember: the number of processes is bounded).
• A printf statement is always executable (but has no effect during verification).

int x;

proctype Aap() {
    int y = 1;
    skip;
    run Noot();        /* Noot is a proctype declared elsewhere */
    x = 2;
    x > 2 && y == 1;   /* expression statement: blocks until it is non-zero */
    skip;
}


Statements (3) — assert

• Format: assert(expr);
• The assert statement is always executable.
• If expr evaluates to zero, SPIN will exit with an error, as the assertion has been violated.
• Often used within Promela models, to check whether certain properties are valid in a state.


If Statement

if
:: (n % 2 != 0) -> n = 1
:: (n >= 0) -> n = n - 2
:: (n % 3 == 0) -> n = 3
:: else -> skip
fi

• Each :: introduces an alternative, which by convention begins with a guard statement
◾ if the guard is executable, then the alternative is executable, and one executable alternative is chosen non-deterministically.
◾ The optional else becomes executable if none of the other guards is executable.
• If no guard is executable, the if statement blocks.


Do Statement

• Same as the if statement, but it repeats the choice at the end.
• Use the break statement to move on to the next sequential statement.

do
:: choice1 -> stat1.1; stat1.2; stat1.3; …
:: choice2 -> stat2.1; stat2.2; stat2.3; …
:: …
:: choicej -> break;
:: choicen -> statn.1; statn.2; statn.3; …
od;


Atomic Statement

atomic { stat1; stat2; ... statn }

• An atomic{} statement can be used to group statements into an atomic sequence;
• all statements are executed as a single sequence (no interleaving with statements of other processes), though each statement is still a separate step.
• The statement is executable if stat1 is executable.
• If a stati (with i > 1) is blocked, the “atomicity token” is temporarily lost and other processes may do a step.


d_step (d = deterministic)

d_step { stat1; stat2; ... statn }

• A d_step is a more efficient version of atomic:
◾ No intermediate states are generated and stored.
◾ It may only contain deterministic steps.
• It is a run-time error if stati (i > 1) blocks.


Process Execution / Evaluation Semantics

• Promela processes execute concurrently.
◾ Non-deterministic scheduling of the processes.
• Processes are interleaved (statements of different processes do not occur at the same time).
◾ exception: rendezvous communication. (We’re not using this.)
• All statements are atomic; each statement is executed without interleaving with other processes.
• Each process may have several different possible actions enabled at each point of execution.
◾ one choice is made, non-deterministically.


Model Checking

• Model checking tools automatically verify whether a property holds in a given (finite-state) model of a system.
• Safety Properties
◾ “Something bad never happens.”
• SPIN tries to find a path to the bad thing; if none is found, the property is satisfied.
• Liveness Properties
◾ “Something good always happens eventually.”
• SPIN tries to find an infinite loop in which the good thing doesn’t happen; if none is found, the property is satisfied.


The Closed World Assumption

• The model checker seeks to establish the truth of something by trying exhaustively, and failing, to prove the negation of it.
◾ e.g. to prove that mutex holds, it tries to prove that it doesn’t hold. If it fails to do so, this is taken as proof that the mutex holds.
• This only corresponds to what we normally understand to be “proof” if it is assumed that the model checker knows everything (i.e. can prove everything provable) about the system: this is called the “Closed World Assumption.”


Related: Negation-as-Failure

• If you fail to prove that A is true, then with negation as failure, this is taken as proof that (¬A) is true.
◾ Again, this doesn’t correspond with what we “normally” understand as proof.
◾ To make it correspond, we must invoke the CWA.


Process memory management

• In a multi-programming environment, processes share the available physical memory
◾ The aggregate memory requirements of the processes executing on a system often exceed the available physical memory capacity
◾ How does the programmer know where his/her program will be located in memory?
• Solution: use another memory device (e.g. a hard disk) to increase memory capacity
◾ A “swap device” or “page file”
◾ Treat the memory used by a process as a set of fixed-size memory chunks, called pages
◾ Simplifies moving memory contents to/from the swap device
◾ Leads to fragmentation: a process’s memory is no longer contiguous in physical memory
◾ Not only does the programmer not know where his/her program will be located, but now it can also move over time


Relocation and non-contiguity

(Figure: a physical memory of 15 page frames, numbered 0 to 14, shown at five successive stages: Load A, Load B, Load C, Swap out B, Load D. It illustrates that a process’s pages can be relocated over time and need not occupy contiguous frames.)


Virtual memory

• Give each process its own virtual address space
◾ Divide the space into fixed-size pages (e.g. 4 KB, or 2^12 B)
• Programs refer to addresses in their own virtual memory space – virtual addresses
• “On-the-fly” conversion of virtual addresses into physical addresses, which are applied to physical memory

A 32-bit virtual address is split into a 20-bit virtual page number and a 12-bit page offset. The MMU translates the virtual page number into a physical page frame number; the page offset passes through unchanged, giving the physical address.
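To make the split concrete, here is a small C sketch of the translation step. The helper frame_of() is hypothetical; it stands in for whatever page-table lookup the MMU performs.

#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_MASK ((1u << PAGE_BITS) - 1)    /* 0xFFF */

extern uint32_t frame_of(uint32_t vpn);      /* placeholder for the page-table walk */

uint32_t translate(uint32_t vaddr) {
    uint32_t vpn    = vaddr >> PAGE_BITS;    /* upper 20 bits: virtual page number */
    uint32_t offset = vaddr & PAGE_MASK;     /* lower 12 bits: unchanged */
    return (frame_of(vpn) << PAGE_BITS) | offset;
}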



Virtual memory

• The Memory Management Unit (MMU) is integrated into modern CPUs
• If we use 32-bit addresses (physical and virtual) and the address space is divided into 4 KB pages
◾ The total addressable memory is 2^32 = 4 GB
◾ Each page contains 2^12 bytes (4 KB), requiring a 12-bit offset
◾ There are 2^20 (1,048,576) 4 KB pages, each with a 20-bit page number
• The size of the virtual address space is typically much larger than the physical address space
◾ Not every virtual page will be mapped to a physical page
◾ Some virtual pages may not be used
◾ Some pages may be mapped to a swap device or page file
• Per-process virtual address space
◾ Each process has its own virtual address space


Virtual memory

(Figure: processes 1 to n each have their own 2^32-byte virtual address space; their virtual pages map to page frames in physical memory, or to a paging disk.)


Memory management requirements and VM

• Requirements
◾ Location
• In a multiprogramming environment, processes (and the operating system) need to share physical memory
• Programs must be paged in and out of memory
• Programmers have no way of knowing where in physical memory their processes will be located, and a process’s memory won’t be contiguous
• Virtual memory gives each process its own contiguous virtual memory space, within which the programmer works
◾ Protection
• In a multiprogramming environment, a user process must be prevented from interfering with memory belonging to other processes and the operating system itself
• This protection must be implemented at run-time
• Each process only has access to its own virtual memory space, which should never map to the physical memory of another process (except under specific circumstances!)


Memory management requirements and VM

◾ Sharing
• Protection must be flexible enough to allow portions of memory to be shared between processes
• We can map the virtual pages of different processes to the same physical memory page frame (or the same frame on disk)
◾ Logical Organization
• Programs are organised as modules with different properties
• Executable code, data, shared data
• Assign properties to virtual memory pages
◾ Physical Organization
• The programmer should be isolated from the implementation of the memory hierarchy (e.g. swapping)
• The programmer shouldn’t be concerned with the amount of physical memory available
• The virtual memory space hides these details from the programmer


Virtual memory implementation

• Each process has its own page table
◾ Used to map virtual pages to physical page frames
◾ Page tables are also stored in memory
• To translate a virtual page number to a physical page frame number
◾ Use the virtual page number as an index into the page table
◾ Read the physical page frame number from the table

Virtual address:  virtual page number | page offset
                         ↓ (page table lookup)
Physical address: physical page frame number | page offset
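A sketch of that lookup in C, with a valid bit marking unmapped pages. The PTE layout here is illustrative, not any particular MMU’s.

#include <stdint.h>

#define PTE_VALID 0x1u

/* Illustrative page table entry: frame number plus status bits. */
typedef struct {
    uint32_t frame;     /* physical page frame number (20 bits used) */
    uint32_t status;    /* valid, protection level, modified, referenced */
} pte_t;

pte_t page_table[1u << 20];          /* one entry per virtual page */

int lookup(uint32_t vpn, uint32_t *frame_out) {
    if (!(page_table[vpn].status & PTE_VALID))
        return -1;                   /* page fault: the OS must handle it */
    *frame_out = page_table[vpn].frame;
    return 0;
}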



E.g. Virtual memory implementation

• Page table format (32-bit addresses and 4 KB pages)
◾ The page table is indexed by virtual page number, from 0 to 2^20 - 1
◾ Each page table entry (PTE) is 32 bits: a 20-bit physical page number and 12 bits of status
◾ Status: e.g. valid, protection level, modified, referenced


Multi-level page tables

• Assuming 32-bit addresses and 4 KB page size
◾ 1,048,576 page table entries × 4 bytes each = 4 MB
◾ i.e. every process requires a 4 MB page table!!
◾ wasteful, when you consider that most processes will use only a small portion of the virtual address space
◾ most of the page table will be unused
• Solution
◾ A multi-level page table scheme

The 32-bit virtual address is now split three ways: a 10-bit 1st index, a 10-bit 2nd index, and a 12-bit page offset.
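A sketch of the resulting two-level walk in C (layout illustrative; secondary tables exist only where the address space is actually used):

#include <stdint.h>
#include <stddef.h>

/* 10/10/12 split: 1024 primary entries, each pointing at a
   secondary table of 1024 frame numbers. */
typedef struct {
    uint32_t frame[1024];            /* frame number per virtual page */
} l2_table_t;

l2_table_t *l1_table[1024];          /* NULL = no secondary table yet */

int walk(uint32_t vaddr, uint32_t *paddr) {
    uint32_t i1  = vaddr >> 22;            /* top 10 bits: 1st index */
    uint32_t i2  = (vaddr >> 12) & 0x3FFu; /* next 10 bits: 2nd index */
    uint32_t off = vaddr & 0xFFFu;         /* 12-bit page offset */

    l2_table_t *l2 = l1_table[i1];
    if (l2 == NULL)
        return -1;                   /* page fault: no secondary table */
    *paddr = (l2->frame[i2] << 12) | off;
    return 0;
}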



Multi-level page tables

(Figure: a two-level page table scheme. The 1st index of the virtual address selects an entry in the primary page table, which points to a secondary page table; the 2nd index selects the PTE holding the physical page frame number; the page offset is appended unchanged to form the physical address.)


Multi-level page tables

• In a two-level page table scheme
◾ One primary page table
◾ As many secondary page tables as needed
◾ Secondary page tables are created (and removed) as needed
◾ Depends on how much of the virtual memory space is required by the process
◾ In the preceding example, each page table was 4 KB in size, so each page table fitted in exactly one page frame
◾ Different schemes will use different-size bit-fields for the 1st and 2nd indices
◾ Different schemes may use three (or more) levels of page tables
◾ More efficient than a single-level scheme
◾ The page offset still remains un-translated.
• What is the total size of all the page tables in the preceding example, if two primary PTEs are used?
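One way to work that question out (assuming 4-byte PTEs, so each 1024-entry table occupies 4 KB):

\[
  \underbrace{4\,\text{KB}}_{\text{primary table}}
  + \underbrace{2 \times 4\,\text{KB}}_{\text{two secondary tables}}
  = 12\,\text{KB}
  \qquad\text{(versus } 4\,\text{MB for a single-level table).}
\]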



Typical process memory layout

(Figure: a process’s 2^32-byte virtual address space, with code pages at the bottom, data above them, and the stack at the top. The primary page table points to a few secondary page tables; because the address space is sparsely used, most entries in each secondary table, e.g. 1020 or 1022 of the 1024, are invalid.)


Page Faults

• When the MMU accesses a PTE, it checks the Valid bit of the status field to determine if the entry is valid:
◾ if V == 0 and a primary PTE is being accessed, then a secondary page table has not been allocated physical memory
◾ if V == 0 and a secondary PTE is being accessed, then no physical memory has been allocated to the referenced page
◾ in either case, a page fault occurs
• The OS is responsible for handling page faults by …
◾ … allocating a page of physical memory for a secondary page table
◾ … allocating a page of physical memory for the referenced page
◾ … updating page table entries
◾ … writing a replaced page to disk if it is dirty
◾ … reading code or data from disk (another thread will be scheduled pending the I/O operation)
◾ … signalling an access violation
◾ … retrying the faulting instruction


Translation look-aside buffer

• Each virtual-to-physical memory address translation requires 1 memory access for each level of page tables
◾ 2 memory accesses in a 2-level scheme
• To reduce the number of memory accesses required for virtual-to-physical address translation, the MMU uses a Translation Look-Aside Buffer (TLB)
◾ An on-chip cache
◾ Stores the results of the m most recent address translations

(TLB contents: m entries, numbered 0 to m-1, each pairing a virtual page number with its cached secondary page table entry.)
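A software model of the lookup, sketched in C with LRU replacement stamps; a real TLB performs this comparison in parallel in hardware.

#include <stdint.h>

#define TLB_ENTRIES 64

typedef struct {
    uint32_t vpn;        /* virtual page number */
    uint32_t pte;        /* cached secondary page table entry */
    uint64_t last_use;   /* timestamp for LRU replacement */
    int      valid;
} tlb_entry_t;

tlb_entry_t tlb[TLB_ENTRIES];
uint64_t now;

int tlb_lookup(uint32_t vpn, uint32_t *pte_out) {
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            tlb[i].last_use = ++now;    /* hit: refresh the LRU stamp */
            *pte_out = tlb[i].pte;
            return 1;
        }
    }
    return 0;   /* miss: walk the page tables, then replace the LRU entry */
}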



Translation Look-Aside Buffer

• Before walking the page tables to translate a virtual address to a physical address …
◾ … the MMU checks each entry in the TLB
• If the virtual page number is found, the cached secondary PTE is used by the MMU to translate the address …
◾ … and there is no need to walk the page tables in memory
• If a TLB match is not found …
◾ … the MMU walks the page tables and locates the required PTE
• The least recently used (LRU) TLB entry is then replaced with the current virtual page number and the PTE from the secondary page table
• What are the TLB implications of a context switch?



Page fault handling

(Flowchart: the page table is accessed; if the page is in main memory, the TLB is updated and the CPU generates the physical address. Otherwise the page fault handler runs: if memory is full, a victim page is replaced first; the O/S reads the page from disk, the page tables are updated, and the access is retried.)


Memory-management: the OS perspective

• What are we interested in from an OS perspective?
◾ Page fault handling
• The goal of the operating system is to attempt to minimise the occurrence of page faults, which are expensive
◾ Need to decide which physical memory page to use
◾ Need to perform I/O
◾ Another process needs to be scheduled to run during the I/O, resulting in a context switch
• We will look at several policies used by the OS to perform memory management
◾ The key questions are:
• When should a page be loaded?
• Where (in physical memory) should it be loaded?


Fetch policy

• When should a page be loaded?
◾ Two alternatives
• Demand Paging
◾ Pages are loaded into memory only when they are first referenced
◾ When a process is started, there will be a number of page faults in quick succession
◾ Locality of reference: over a short period of time, most memory references are to a small number of pages
• The set of referenced pages changes relatively slowly over time
• Prepaging (performed in addition to demand paging)
◾ Takes advantage of the characteristics of disk devices
• Seek times and rotational latency
◾ If a number of pages are contiguous in secondary storage, it is more efficient to load the contiguous pages together, even though some of them haven’t been referenced yet


Page replacement policy

• Of the page frames available for replacement, which one should be replaced when a page fault occurs?
• Basic scheme
◾ A page fault occurs
◾ Find a free frame
◾ If there is a free frame, then use it
◾ Otherwise, if all the frames are used
• select a victim frame using a page replacement policy
• write the victim frame to disk and update the page table entries accordingly
◾ Read the new page from disk into the selected page frame
◾ Allow the faulting process to resume
• Note
◾ Replacing a victim page frame potentially requires both a read and a write


Page replacement policy

• A page replacement policy is required when there are no free pages and we need to select a page to replace
• Optimal algorithm
◾ Select the page for which the time until the next reference is furthest away …
◾ Obviously, we cannot implement this policy without prior knowledge of future events
◾ However, it is a useful policy against which more practical policies can be measured
• Least Recently Used (LRU)
◾ Replace the page in memory that was referenced furthest back in time
◾ Exploits locality of reference
◾ Near-optimal performance
◾ Disadvantage: very expensive to implement


Page replacement policy

• Least Recently Used (LRU) (cont’d)
◾ Counter:
• Associate a timestamp variable with each page table entry
• Implement a logical clock that is incremented on every memory access
• Whenever we access a memory page, we update the timestamp in the page table entry with the current logical clock value
• To replace a page, we need to search the page tables to find the least recently used page
• Updates are cheap but searching for a replacement is very expensive
◾ Stack
• Maintain a stack as a doubly linked list of page table entries
• When a page is referenced, move it from its current position in the stack to the top of the stack
• Always replace the page at the bottom of the stack
• Updates are expensive but searching for a victim page is cheap
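The stack scheme in a C sketch: on each reference the page is unlinked and relinked at the head, so the tail is always the LRU victim (the structure here is illustrative).

#include <stddef.h>

typedef struct page {
    struct page *prev, *next;
    /* ... page table entry fields ... */
} page_t;

page_t *head, *tail;                 /* most / least recently used */

void on_reference(page_t *p) {       /* expensive part: runs on every reference */
    if (p == head) return;
    if (p->prev) p->prev->next = p->next;   /* unlink */
    if (p->next) p->next->prev = p->prev;
    if (p == tail) tail = p->prev;
    p->prev = NULL;                  /* relink at the head */
    p->next = head;
    if (head) head->prev = p;
    head = p;
    if (!tail) tail = p;
}

page_t *choose_victim(void) {        /* cheap part: the victim is the tail */
    return tail;
}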



Page replacement policy

• First-In-First-Out (FIFO)
◾ Replace pages in a round-robin order
◾ Very efficient to implement – maintain a pointer to the next page to be replaced
◾ Assumes that the page loaded into memory furthest back in time is also the page accessed furthest back in time
◾ Bad assumption: we might have a page that is accessed frequently over a long period of time
• This page will be repeatedly loaded and unloaded
• Clock algorithm (or second-chance algorithm)
◾ Performs better than FIFO but is not as expensive to implement as LRU
◾ Conceptually, candidate pages for replacement are arranged in a circular list
◾ Each page has an associated use bit (also called a reference bit)
• Set to 1 when the page is loaded or subsequently accessed


Page replacement policy

• Clock algorithm (continued …)
◾ To find a page for replacement (see the sketch below)
• Proceed around the list until a page with use = 0 is found
• Every time we pass over a page with use = 1, set use = 0
• If all pages have use = 1, we will eventually arrive back where we started, but we will have set use = 0
◾ Efficient to implement
◾ Similar to FIFO, except that any page with use = 1 will be ignored, unless there is no page with use = 0
• Clock algorithm with modified bit
◾ As well as use bits, we use the pages’ modified bits
◾ Algorithm:
• First pass: look for use = 0, modified = 0; don’t set use = 0
• Second pass: look for use = 0, modified = 1, setting use = 0 as we pass
• Third pass: as before, we are back at our starting position, but use will have been set to 0 this time
◾ Why? It’s cheaper to replace pages that don’t need to be written back first
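A minimal C sketch of the basic clock algorithm (use bits only); the hand position persists between page faults.

#define NFRAMES 64

struct frame { int use; /* reference bit */ } frames[NFRAMES];
static int hand;                      /* the clock hand */

int choose_victim(void) {
    for (;;) {
        if (frames[hand].use == 0) {  /* found a page to replace */
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        frames[hand].use = 0;         /* give it a second chance */
        hand = (hand + 1) % NFRAMES;
    }
}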



Clock Algorithm

(Figure: pages arranged in a circular list, each with a use bit; the clock hand sweeps around, clearing use bits until it finds a page with use = 0.)


Resident set management

• How many pages can each process have resident in memory?
• Factors affecting the decision
◾ If we allocate less memory to each process, we can fit more processes into memory, thus increasing the probability that there will be at least one process ready to be executed
◾ If each process has only a small number of pages resident in memory, we will have more page faults
◾ If we increase the number of resident pages for a process beyond some threshold, the performance increase becomes negligible (locality of reference)
• Two types of policy are used
◾ Fixed allocation – every process has a fixed number of pages in memory
◾ Variable allocation – the allocation varies over the lifetime of the process


Resident set management

• Variable allocation
◾ Processes suffering many page faults can be allocated more page frames
◾ Processes with few page faults can have their allocation reduced
◾ This appears to be more powerful than fixed allocation
• With fixed allocation, we are stuck with the decision made at load time – the ideal resident set size may vary depending on the input to the program
• And how do we calculate the working set size in the first place?
• However, it is also more expensive to implement
• Variable allocation raises another question …
◾ Should the set of candidate pages for replacement be restricted to the pages already allocated to the process that caused the page fault?
◾ Two choices:
• Variable allocation – global scope
• Variable allocation – local scope


Resident set management

• Variable allocation – global scope
◾ The operating system maintains a list of free page frames
◾ When a fault occurs, a free frame is added to the resident set of the process
◾ When there are no free pages available, the page selected for replacement can belong to any process
◾ This will result in a reduction in the working set size of the process whose page was replaced – this might be unfair, as there is no way to discriminate between processes
• Variable allocation – local scope
◾ When a process is loaded, it is allocated some number of page frames based, for example, on application type
◾ When a page fault occurs, a page is selected for replacement from the resident set of the process that caused the fault
◾ Occasionally, the operating system will re-evaluate the number of page frames allocated to each process


Resident set management

                      Local Replacement                                Global Replacement

Fixed Allocation      • Number of frames allocated to the process     • Not possible
                        is fixed
                      • Page to be replaced is chosen from among
                        the pages allocated to the faulting process

Variable Allocation   • Number of frames allocated to the process     • Page to be replaced is chosen from among
                        varies according to working set requirements    all available page frames in main memory
                      • Page is chosen from among the pages           • This causes the size of the resident set
                        allocated to the faulting process               to vary


The Working Set

◾ The Working Set of a process is the set of pages it has referenced in the last Δ references, as proposed by Denning.
◾ Thus, the Working Set is a “windowed” snapshot of a process’s page referencing.
◾ The idea of the Working Set Model is that processes should have their working set resident in physical memory to be able to function efficiently.
• If too few pages are allocated (i.e. fewer than the working set), a process will spend lots of time page-faulting.
• If too many pages are allocated (i.e. more than the working set), other processes might be deprived of the physical memory they need.
◾ If there isn’t enough memory for the working sets of all the processes, some processes are temporarily removed from the system.
◾ The only difficulty is keeping track of the working set.


Resident set management

(Figure: working set size over time, for window size Δ: the size grows during transient phases and levels off during stable phases, alternating Transient / Stable.)


Resident set management

• How do we decide on a resident set size for a process?
◾ We could monitor the working set of each process and periodically remove from the resident set those pages not in the working set
• Problems
◾ The past doesn’t predict the future – the size and membership of the working set will change over time
◾ Measurement of the working set is impractical
◾ The optimal value for Δ is unknown
• Solution
◾ We can approximately detect a change in working set size by observing the page fault rate
◾ If the page fault rate is below some minimum threshold, we can release allocated page frames for use by other processes
◾ If the page fault rate is above some maximum threshold, we can allocate additional page frames to the process




Cleaning policy

• Determines when a modified page should be written to secondary memory
• Two approaches
◾ Demand cleaning
• Pages are written to secondary memory only when they are selected for replacement
◾ Precleaning
• Write pages periodically, before they need to be replaced
• Pages can be written in batches to increase efficiency
• Neither approach is ideal
◾ If pages are precleaned long before they are replaced, there is a high probability that they will have been modified again
◾ Demand cleaning results in a longer delay when replacing pages


Cleaning policy

• Page buffering
◾ A better solution for both page replacement and cleaning
◾ When a page is replaced, instead of evicting it immediately, place it on either an unmodified list or a modified list
◾ Pages on the modified list can periodically be written out to secondary memory and moved to the unmodified list
◾ Pages on the unmodified list can be reclaimed if they are referenced again before they are allocated to another process
• Notes
◾ “Moving” a page to a modified or unmodified list does not result in copying of data.
◾ Instead, the PTE for the page is removed from the page table and placed on one of the lists
◾ This approach can be combined with FIFO replacement to improve the performance of FIFO replacement while remaining efficient to implement


Load control

(Figure: processor utilisation plotted against multiprogramming level: utilisation rises as more processes are admitted, peaks, and then falls off as the system begins to thrash.)


Load control

• We can use a few different policies
◾ Only allow processes whose resident set is sufficiently large to execute
◾ Research has shown that when the mean time between faults is equal to the mean time required to process a fault, processor utilisation is maximised
• Suspending processes
◾ To reduce the degree of multiprogramming, we need to suspend (swap out) one or more resident processes – but which ones do we swap?
• The lowest-priority process
• The faulting process
• The last process activated
• The process with the smallest resident set size
• The largest process


Operating system scheduling (single CPU)

• Only one process can execute at a time on a single-processor system
◾ A process is typically executed until it is required to wait for some event
• e.g. I/O, timer, resource
◾ When a process’s execution is stalled, we want to maximise CPU utilisation by executing another process on the CPU
• CPU–I/O burst cycle
◾ Processes alternate between bursts of CPU and I/O activity
◾ CPU burst times have an exponential frequency curve, with many short bursts and very few long bursts
• CPU scheduler
◾ When the CPU becomes idle, the operating system must select another process to execute
◾ This is the role of the short-term scheduler or CPU scheduler


Process state transitions

• Process state transitions
◾ Only one process can be in the running state on a single processor at any time
◾ Multiple processes may be in the waiting or ready states

(State diagram:)
new → ready (admitted)
ready → running (dispatched to the CPU)
running → ready (interrupt): a pre-emptive scheduling point
running → waiting (wait, e.g. for I/O completion): a scheduling point under both pre-emptive and non-pre-emptive schemes
waiting → ready (event, e.g. I/O completion): a pre-emptive scheduling point
running → terminated (exit)


CPU Scheduler

• CPU scheduler (continued)
◾ Candidate processes for scheduling are those processes that are in memory and ready to execute
◾ Candidate processes are stored on the ready queue
◾ The ready queue is not necessarily a FIFO queue; it depends on the implementation of the scheduler
• FIFO queue, priority queue, unordered list, …
• Scheduling may be performed in any of these situations:
1. When a process leaves the running state and enters the waiting state, waiting for an event (e.g. completion of an I/O operation)
2. [PREEMPTIVE] When a process leaves the running state and enters the ready state (e.g. when an interrupt occurs)
3. [PREEMPTIVE] When a process leaves the waiting state and enters the ready state
4. When a process terminates


Preemptive and nonpreemptive scheduling

• Scheduling must be performed in cases 1 and 4
◾ If scheduling only takes place in cases 1 and 4, then the scheduling is non-preemptive
◾ Otherwise the scheduling is preemptive
• With a non-preemptive scheme
◾ Once a process has entered the running state on a CPU, it will remain running until it relinquishes the CPU to another process
• Preemptive scheduling incurs a cost
◾ Access to shared data?
◾ Synchronisation of access to kernel data structures?


Scheduling Algorithms

                  First-Come-First-
                  Served (FCFS)   Shortest Job First   Priority   Round Robin
Non-Preemptive    ✔               Fragile              Fragile    This is FCFS
Pre-Emptive       ?               ✔                    ✔          ✔

• FCFS is also called First-In-First-Out (FIFO)
• Non-Preemptive Scheduling is also called Cooperative Scheduling, and relies on the processes being “well-behaved” and voluntarily relinquishing the processor. Thus, it’s always fragile!
• The advantage of Cooperative Scheduling is that, because the processes are designed to relinquish the processor, they can be designed to minimise the overhead of doing so. This is often a very big advantage in specialised situations, e.g. inside the kernel of an OS, where efficiency is paramount and processes can be trusted.


Scheduling algorithms

• First-Come First-Served (FCFS) [non-preemptive]
◾ The process that requests the CPU first is the first to be scheduled
◾ Implemented with a FIFO ready queue
◾ When a process enters the ready state, it is placed at the tail of the FIFO ready queue
◾ When the CPU is free and another process must be dispatched to run, the process at the head of the queue is dispatched and removed from the queue
◾ The average waiting time under FCFS scheduling can be quite long and depends on the CPU burst times of the processes
◾ Consider three processes P1, P2 and P3 that appear on the FCFS ready queue in the order P1, P2, P3, with burst times of 24, 3 and 3 milliseconds respectively

(Gantt chart: P1 runs from 0 to 24, P2 from 24 to 27, P3 from 27 to 30.)

◾ Average waiting time = (0 + 24 + 27) / 3 = 17 ms
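The same calculation, as a small C function (illustrative, matching the worked examples on this slide and the next):

/* Average waiting time under FCFS, given CPU burst times in queue order. */
double fcfs_avg_wait(const int burst[], int n) {
    int finished = 0;   /* completion time of the previous process */
    int total = 0;      /* running sum of waiting times */
    for (int i = 0; i < n; i++) {
        total += finished;        /* process i waits for all earlier ones */
        finished += burst[i];
    }
    return (double)total / n;
}

/* fcfs_avg_wait((int[]){24, 3, 3}, 3) returns 17.0;
   fcfs_avg_wait((int[]){3, 3, 24}, 3) returns 3.0. */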



Scheduling algorithms

• First-Come First-Served (FCFS) (continued)
◾ Now consider the same processes, but in the order P2, P3, P1 on the ready queue

(Gantt chart: P2 runs from 0 to 3, P3 from 3 to 6, P1 from 6 to 30.)

◾ Average waiting time = (0 + 3 + 6) / 3 = 3 ms
◾ Another example: one CPU-bound process and many I/O-bound processes:
• The CPU-bound process will obtain and hold the CPU for a long time, during which the I/O-bound processes complete their I/O and enter the ready queue
• The CPU-bound process eventually blocks on an event; the I/O-bound processes are dispatched in order and block again on I/O
• The CPU now remains idle until the CPU-bound process is again scheduled, and the I/O-bound processes again spend a long time waiting


Scheduling algorithms<br />

• Shortest-Job-First [preemptive or non-preemptive]<br />

◾ Given a set of processes, the process with the shortest next CPU burst is selected to execute on the CPU<br />

◾ Optimizes minimum average waiting time<br />

◾ There is an obvious problem with this – how do we know how long the next CPU burst will be for each<br />

process?<br />

◾ We can use past performance to predict future behaviour<br />

• Estimate the next CPU burst time as an exponential average of the lengths of previous CPU bursts<br />

• Let tn be the length of the nth CPU burst for a process and let τn+1 be our estimated next CPU burst time<br />

τn+1 = αtn + (1 - α)τn, where 0 ≤ α ≤ 1<br />

• If α = 0, then τn+1 = τn: recent history has no effect and the prediction never changes<br />

• If α = 1, then τn+1 = tn: the next CPU burst is estimated to be the length of the last CPU burst<br />

• We need to set τ0 to a constant value<br />
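
As a concrete illustration (a sketch only; the function name and the starting values are assumptions), the estimate needs just the previous estimate and the last measured burst:<br />

/* Exponential averaging of CPU burst lengths:
   tau_{n+1} = alpha * t_n + (1 - alpha) * tau_n, with 0 <= alpha <= 1. */
double predict_next_burst(double tau_n, double t_n, double alpha) {
    return alpha * t_n + (1.0 - alpha) * tau_n;
}

/* Example use, assuming alpha = 0.5 and tau0 = 10 ms:
     double tau = 10.0;                        // tau0, the constant initial value
     tau = predict_next_burst(tau, 6.0, 0.5);  // after a 6 ms burst, tau = 8.0 ms
*/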



Scheduling algorithms<br />

• Shortest-Job-First [preemptive or non-preemptive]<br />

◾ To see how the value for τn+1 is calculated based on past history, we can expand the formula, substituting for τn<br />

τn+1 = αtn + (1 - α)αtn-1 + … + (1 - α)^j αtn-j + … + (1 - α)^(n+1) τ0<br />

α is typically given the value 0.5<br />

◾ SJF can be either preemptive or non-preemptive<br />

• Preemptive SJF will preempt the currently executing process if a process joins the ready queue with a smaller next CPU<br />

burst time<br />

• Non-preemptive SJF scheduling will allow the currently executing process to complete its CPU burst before scheduling<br />

the next process<br />



Scheduling algorithms<br />

• Priority scheduling [preemptive or non-preemptive]<br />

◾ Shortest-Job-First is a special case of a more general priority-based scheduling scheme<br />

◾ Each process has an associated priority<br />

◾ The process with the highest priority is executed on the CPU<br />

◾ Processes with the same priority are executed in the order in which they appear on the ready queue (FCFS<br />

order)<br />

◾ What do we mean by high and low priority?<br />

• We’ll assume high-priority processes have lower priority values than lower-priority processes<br />

• The convention itself doesn’t matter, as long as it is applied consistently<br />

◾ The priority of a process can be defined either internally or externally<br />

• Internally defined priorities are based on measurements associated with the process, such as the ratio of I/O burst times<br />

to CPU burst times, the estimated next CPU burst time, or the time since it was last scheduled<br />

• Externally defined priorities are based on properties external to the operating system: user status, wealth, political<br />

affiliation, …<br />



Scheduling algorithms<br />

• Priority scheduling [preemptive or non-preemptive]<br />

◾ Like SJF scheduling, priority scheduling can be either preemptive or non-preemptive<br />

• Preemptive: the currently executing process is preempted if another higher priority process arrives on the ready queue<br />

• Non-preemptive: the currently executing process is allowed to complete its current CPU burst<br />

◾ Priority scheduling has the potential to starve low priority processes of access to the CPU<br />

◾ A possible solution is to implement an aging scheme<br />

• Periodically decrement the priority value of waiting processes (increasing their priority)<br />
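
A minimal sketch of one such aging pass (illustrative only: the structure, field names and threshold are assumptions; lower priority values mean higher priority, as above):<br />

#define AGING_THRESHOLD_TICKS 100  /* assumed tuning value */
#define MIN_PRIORITY_VALUE 0       /* best possible priority value */

struct proc {
    int priority;      /* lower value = higher priority */
    int ticks_waiting; /* time spent waiting on the ready queue */
};

/* Run periodically by the scheduler over all ready processes. */
void age_ready_processes(struct proc procs[], int n) {
    for (int i = 0; i < n; i++) {
        if (procs[i].ticks_waiting >= AGING_THRESHOLD_TICKS &&
            procs[i].priority > MIN_PRIORITY_VALUE) {
            procs[i].priority--;          /* decrement the value: raise priority */
            procs[i].ticks_waiting = 0;   /* restart the aging clock */
        }
    }
}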



Scheduling algorithms<br />

• Round-Robin Scheduling<br />

◾ First-Come, First-Served with preemption<br />

◾ Define a time quantum or time slice, T<br />

◾ Every T seconds, preempt the currently executing process, place it at the tail of the ready queue and schedule<br />

the process at the head of the ready queue<br />

◾ New processes are still added to the tail of the ready queue when they are created<br />

◾ The currently executing process may relinquish the CPU before it is preempted (e.g. by performing I/O or<br />

waiting for some other event)<br />

• The next process at the head of the ready queue is scheduled and a new time quantum is started<br />

◾ For a system with n processes and time quantum T, there is an upper limit on the time a process waits before<br />

receiving its next time quantum on the CPU:<br />

◾ (n - 1) × T<br />
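
To make the dispatch step concrete, here is a minimal sketch (the queue structure and names are assumptions, not from the notes) of what happens each time the quantum expires:<br />

#define RR_CAPACITY 64  /* fixed capacity, enough for this sketch */

struct rr_queue {
    int pids[RR_CAPACITY];
    int head, tail, count;
};

void enqueue(struct rr_queue *q, int pid) {      /* add at the tail */
    q->pids[q->tail] = pid;
    q->tail = (q->tail + 1) % RR_CAPACITY;
    q->count++;
}

int dequeue(struct rr_queue *q) {                /* remove from the head */
    int pid = q->pids[q->head];
    q->head = (q->head + 1) % RR_CAPACITY;
    q->count--;
    return pid;
}

/* Timer interrupt after quantum T: preempt the runner and rotate the queue. */
int on_quantum_expiry(struct rr_queue *q, int running_pid) {
    enqueue(q, running_pid);   /* preempted process goes to the tail */
    return dequeue(q);         /* process at the head is dispatched next */
}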



Scheduling algorithms<br />

• Round-Robin Scheduling<br />

◾ The performance of Round-Robin scheduling depends on the size of the time quantum, T<br />

◾ For very large values of T, the performance (and behaviour) of Round-Robin scheduling approaches that of<br />

FCFS<br />

◾ As T approaches zero, processes execute just one instruction per quantum<br />

◾ Obviously for very small values of T, the time taken to perform a context switch needs to be considered<br />

• T should be large with respect to the time required to perform a context switch<br />



Example<br />

Process   Arrival Time   Next Burst Time   Priority (Lower is Better)<br />
P1        0              8                 2<br />
P2        1              4                 4<br />
P3        2              9                 1<br />
P4        3              5                 3<br />

• Scheduling goals<br />

◾ Fast response time<br />

• Look at Waiting Time (i.e. latency) and Turnaround Time (i.e. time from arrival to end of burst).<br />

◾ High throughput<br />

• Avoid thrashing, avoid high level of context switches<br />

◾ Fairness<br />

• If we implement priorities, be careful to ensure starvation is avoided.<br />



Example — FCFS<br />

FCFS   Arrival Time   Next Burst Time   Waiting Time   Turnaround Time<br />
P1     0              8                 0              8<br />
P2     1              4                 8-1 = 7        12-1 = 11<br />
P3     2              9                 12-2 = 10      21-2 = 19<br />
P4     3              5                 21-3 = 18      26-3 = 23<br />



Scheduling algorithms<br />

• Multi-level queue scheduling<br />

◾ Often we can group processes together based on their characteristics or purpose<br />

• system processes<br />

• interactive processes<br />

• low priority interactive processes<br />

• background processes<br />

(Diagram: four queues stacked in priority order: system processes at the top, then interactive processes, then low-priority interactive processes, then background processes at the bottom.)<br />



Scheduling algorithms<br />

• Multi-level queue scheduling<br />

◾ Processes are permanently assigned to these groups and each group has an associated queue<br />

◾ Each process group queue has an associated queuing discipline<br />

• FCFS, SJF, priority queuing, Round-Robin, …<br />

◾ Additional scheduling is performed between the queues<br />

• Often implemented as fixed-priority pre-emptive scheduling<br />

◾ Each queue may have absolute priority over lower priority queues<br />

• In other words, no process in the background queue can execute while there are processes in the interactive queue<br />

• If a background process is executing when an interactive process arrives in the interactive queue, the background<br />

process will be preempted and the interactive process will be scheduled<br />

◾ Alternatively, each queue could be assigned a time slice<br />

• For example, we might allow the interactive queue 80% of the CPU time, leaving the remaining 20% for the other<br />

queues<br />



Scheduling algorithms<br />

• Multi-level feedback queue scheduling<br />

◾ Unlike multi-level queue scheduling, this scheme allows processes to move between queues<br />

◾ Allows I/O bound processes to be separated from CPU bound processes for the purposes of scheduling<br />

◾ Processes using a lot of CPU time are moved to lower priority queues<br />

◾ Processes that are I/O bound remain in the higher priority queues<br />

◾ Processes with long waiting times are moved to higher priority queues<br />

• Aging processes in this way prevents starvation<br />

◾ Higher priority queues have absolute priority over lower priority queues<br />

• A process arriving in a high-priority queue will preempt a running process in a lower priority queue<br />

(Diagram: three feedback queues: the top queue with quantum T = 8, the middle with T = 16, and an FCFS queue at the bottom.)<br />



Scheduling algorithms<br />

• Multi-level feedback queue scheduling<br />

◾ When a process arrives on the ready queue (a new process or a process that has completed waiting for an<br />

event), it is placed on the highest priority queue<br />

◾ Processes in the highest priority queue have a quantum of 8ms<br />

◾ If a high priority process does not end its CPU burst within 8ms, it is preempted and placed on the next lowest<br />

priority queue<br />

◾ When the highest priority queue is empty, processes in the next queue are given 16ms time slices, and the<br />

same behaviour applies<br />

◾ Finally, processes in the bottom queue are serviced in FCFS order<br />

◾ We can change<br />

• number of queues<br />

• queuing discipline<br />

• rules for moving process to higher or lower priority queues<br />
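
A sketch of the movement rules just described (queue indices, field names and the promotion threshold are assumptions):<br />

#define NUM_QUEUES 3  /* quanta of 8 ms and 16 ms, then FCFS at the bottom */

struct task {
    int queue;    /* 0 = highest priority queue */
    int wait_ms;  /* time spent waiting in its current queue */
};

/* The burst outlived its quantum: demote one level (CPU-bound behaviour). */
void on_quantum_used_up(struct task *t) {
    if (t->queue < NUM_QUEUES - 1)
        t->queue++;
}

/* Waited too long: promote one level (aging, which prevents starvation). */
void on_long_wait(struct task *t, int threshold_ms) {
    if (t->wait_ms >= threshold_ms && t->queue > 0) {
        t->queue--;
        t->wait_ms = 0;
    }
}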



Multi-processor scheduling<br />

• Flynn’s taxonomy of parallel processor systems<br />

◾ Single instruction single data (SISD)<br />

◾ Single instruction multiple data (SIMD)<br />

• vector / array processor<br />

◾ Multiple instruction single data (MISD)<br />

• its precise meaning is disputed (arguably, a pipelined processor)<br />

◾ Multiple instruction multiple data (MIMD)<br />

• shared memory (tightly coupled)<br />

• asymmetric (master/slave)<br />

• symmetric (SMP)<br />

• distributed memory (loosely coupled)<br />

• cluster<br />



Multi-processor scheduling<br />

• Processor affinity<br />

◾ If a process can execute on any processor, what are the implications for cache performance?<br />

◾ Most SMP systems implement a processor affinity scheme<br />

• soft affinity: an attempt is made to schedule a process to execute on the CPU it last executed on<br />

• hard affinity: processes are always executed on the same CPU (a pinning sketch follows after this list)<br />

• Load balancing<br />

◾ Scheduler can use a common ready queue for all processors or use private per-processor ready queues<br />

◾ If per-processor ready queues are used (and no affinity or soft-affinity is used), then load-balancing between<br />

CPUs becomes an issue<br />

◾ Two approaches:<br />

• pull-migration: idle processors pull ready jobs from other processors<br />

• push-migration: load on each processor is monitored and jobs are pushed from overloaded processors to under-loaded<br />

processors<br />

◾ Load-balancing competes with affinity<br />
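
As flagged above, a minimal sketch of hard affinity on Linux using the non-portable pthread_setaffinity_np() extension (Linux-specific; needs _GNU_SOURCE and linking with -pthread):<br />

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Pin the calling thread to CPU 0 (hard affinity). Returns 0 on success. */
int pin_self_to_cpu0(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);  /* allow this thread to run on CPU 0 only */
    return pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &set);
}

int main(void) {
    if (pin_self_to_cpu0() != 0)
        fprintf(stderr, "pthread_setaffinity_np failed\n");
    else
        printf("pinned to CPU 0\n");
    return 0;
}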



Layered file I/O (1)<br />

(Layer stack, top to bottom: Application → Directory management → File system → Physical organisation → Device I/O → Hardware)<br />

• The operating system modules or drivers responsible for<br />

performing I/O form a hierarchy, with a different view or<br />

abstraction of the underlying storage service or device<br />

presented at each level<br />

◾ Directory management: symbolic names and hierarchical directory<br />

structure<br />

◾ File system: logical layout of disk blocks and servicing of requests<br />

◾ Physical organization: logical block addresses are converted into<br />

physical addresses and device identifiers<br />

◾ Device I/O: Scheduling and control of I/O on physical devices<br />



Layered file I/O (2)<br />

(Layer stack, top to bottom: Application → Directory management → File system → Software RAID → Physical organisation → Device I/O → Hardware)<br />

• Layers can be added or substituted to add functionality<br />

◾ As long as the layers adhere to interfaces<br />

• Example: Software RAID<br />

◾ We can implement a single logical volume on multiple disks to improve<br />

performance and/or reliability<br />

◾ A software RAID layer presents the same logical volume abstraction to<br />

the file system layer above<br />

◾ Logical block addresses at the software RAID level are translated into<br />

logical addresses of blocks within a partition on a disk device<br />

◾ The request is serviced as before at the physical organisation and<br />

device I/O levels<br />



Disk access<br />

• Disks are slow<br />

◾ We need ways to improve the performance of disk I/O<br />

◾ Caching is one possibility, but this still doesn’t help us on the occasions when device I/O needs to be<br />

performed<br />

◾ How can the operating system improve disk I/O performance?<br />

• Aside: Disk I/O delays …<br />

◾ seek time: the head must be positioned at the correct track on the platter<br />

◾ rotational delay: the transfer cannot begin until the required sector is below the head (sometimes expressed<br />

as latency)<br />

◾ data transfer: the read or write operation is performed<br />

• Average access time = seek time + rotational latency + transfer time<br />



Access time example<br />

• Suppose we have a disk with these characteristics …<br />

◾ Rotational speed: 7,200 RPM (latency = 4.17ms)<br />

◾ Average seek time: 8.5 ms<br />

◾ 512-byte sectors<br />

◾ 320 sectors per track<br />

• Using these average delays, how long does it take to read 2560 sectors from a file?<br />

◾ Time to read one sector = 8.5 + 4.17 + 0.0261 = 12.6961 ms<br />

◾ Time to read 2560 sectors = 32.5 seconds (that’s slow)<br />

• However, if we assume the file is contiguous on disk<br />

◾ and occupies exactly 8 tracks<br />

◾ Time to seek to first track = 8.5 ms<br />

◾ Time to read track = 4.17 ms + 8.34 ms<br />

◾ Suppose seek time between tracks is 1.2 ms<br />

◾ Total read time = 8.5 + 4.17 + 8.34 + 7 x (1.2 + 4.17 + 8.34) = 117 ms<br />
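
The arithmetic above is easy to reproduce; a small sketch using the example’s figures:<br />

#include <stdio.h>

int main(void) {
    const double seek = 8.5;        /* average seek time, ms */
    const double latency = 4.17;    /* half a rotation at 7,200 RPM, ms */
    const double track = 8.34;      /* one full rotation, ms */
    const double sector = track / 320.0;  /* one of 320 sectors, ms */

    /* Worst case: every sector pays a full seek and rotational delay. */
    double random_ms = 2560.0 * (seek + latency + sector);

    /* Contiguous file on exactly 8 tracks, 1.2 ms track-to-track seek. */
    double contiguous_ms = seek + latency + track
                         + 7.0 * (1.2 + latency + track);

    printf("scattered:  %.1f s\n", random_ms / 1000.0);  /* ~32.5 s */
    printf("contiguous: %.0f ms\n", contiguous_ms);      /* ~117 ms */
    return 0;
}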



Disk scheduling<br />

• So, the order in which sectors are read has a significant influence on I/O performance<br />

◾ Most file systems will try to arrange files on disk so large portions of the file are contiguous<br />

◾ Below the file system level, even if files are not contiguous, the operating system or disk device driver can try<br />

to order operations so seek times are minimised<br />

• Disk scheduling algorithms<br />

◾ When a process needs to perform disk I/O, a request is issued to the operating system’s I/O manager<br />

◾ The operating system creates a descriptor for the operation<br />

◾ If there are no other pending requests (from the same or other processes) then the request can be scheduled<br />

immediately and performance is not an issue<br />

◾ If multiple disk accesses are queued up at the same time, the OS can improve I/O bandwidth and/or latency<br />

through sensible ordering of the queued I/O operations<br />



Disk scheduling algorithms (1)<br />

• FCFS scheduling<br />

◾ The I/O request queue follows the FIFO queuing discipline<br />

◾ <strong>Request</strong>s are serviced in the order in which they were submitted to the operating system’s I/O manager<br />

◾ Although the algorithm is fair, the distribution of requests across the disk means a significant proportion of the<br />

disk access time is spent seeking<br />

• SSTF scheduling<br />

◾ Shortest seek time first<br />

◾ Of the queued requests, select the one that is closest to the current disk head position (the position of the last<br />

disk access)<br />

◾ Reduces time spent seeking<br />

◾ Problem: Some requests may be starved<br />

◾ Problem: The algorithm is still not optimal<br />
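
A minimal sketch of the SSTF selection step (an array-based pending queue is assumed):<br />

#include <stdlib.h>

/* Pick the pending request closest to the current head position.
   Returns its index in tracks[], or -1 if there are no pending requests. */
int sstf_next(const int tracks[], int n, int head_pos) {
    int best = -1, best_dist = 0;
    for (int i = 0; i < n; i++) {
        int dist = abs(tracks[i] - head_pos);
        if (best < 0 || dist < best_dist) {
            best = i;
            best_dist = dist;
        }
    }
    return best;
}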



Disk scheduling algorithms (2)<br />

• SCAN scheduling<br />

◾ Starting from either the hub or the outer edge of the disk and moving to the other edge, service all the<br />

queued requests along the way<br />

◾ When the other edge is reached, reverse the direction of head movement and service the queued requests in<br />

the opposite direction<br />

◾ If a request arrives in the queue just before the head reaches the requested track, the request will be serviced<br />

quickly<br />

◾ However, if a request arrives in the queue just after the head has passed the requested position, the request may<br />

take a long time to be serviced (depending on the current head position)<br />

• Consider a request for a track at the hub that arrives just after the hub has been serviced<br />

• That request will not be serviced until the head has serviced the requests in both directions<br />



Disk scheduling algorithms (3)<br />

• C-SCAN scheduling<br />

◾ Circular-SCAN scheduling<br />

◾ Similar to SCAN, but the worst case service time is reduced<br />

◾ Like SCAN, the head moves in one direction servicing requests along the way<br />

◾ However, unlike SCAN, instead of servicing requests in the opposite direction, the head is returned<br />

immediately to the opposite edge and starts servicing requests again in the same direction<br />

• SCAN and C-SCAN continuously move the head from one extreme of the disk to the other<br />

◾ In practice, if there are no further requests in the current direction, the head reverses its direction immediately<br />

◾ These modifications are referred to as the LOOK and C-LOOK scheduling algorithms<br />
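
For illustration, a sketch of ordering pending requests for one C-LOOK pass (the function name and fixed bound are assumptions): requests are serviced upward from the head position, then service wraps around to the lowest pending track:<br />

#include <stdlib.h>

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

/* Reorder pending track requests (n <= 256) into C-LOOK service order. */
void clook_order(int tracks[], int n, int head_pos) {
    int tmp[256];
    qsort(tracks, n, sizeof(int), cmp_int);   /* ascending track order */
    /* Find the first request at or beyond the current head position... */
    int split = 0;
    while (split < n && tracks[split] < head_pos)
        split++;
    /* ...then service upward from there, wrapping to the lowest track. */
    for (int i = 0; i < n; i++)
        tmp[i] = tracks[(split + i) % n];
    for (int i = 0; i < n; i++)
        tracks[i] = tmp[i];
}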

