
Threading
History and Implementation

Matthew Atwood
Intel Atom Group - OSU


Moore's Law(s)

Moore's Law
"The number of transistors and resistors on a chip doubles every 18 months."

Moore's 2nd Law
"The capital cost of a semiconductor fab also increases exponentially over time."

Gordon E. Moore
Intel Co-Founder


Computing Power

1. There have always been two schools of thought on how to increase computing power.
2. Increase the number of transistors
   ○ The primary way to increase computing power over the last 50 years.
3. Increase the number of cores
   ○ A strong idea, but more difficult to implement.


What is Threading?

Thread of execution:
1. The smallest unit of processing that can be scheduled by an operating system.
2. In most modern operating systems, a thread is contained within a process (Unix and Unix-like operating systems).
   ○ Linux does not differentiate between processes and threads. A thread is merely a special kind of process.
3. On a single processor, a process's time is divided among its threads.
4. On multi-core machines, threads can run concurrently, thus achieving true parallelism.
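
A minimal sketch of a single extra thread of execution inside a process, using POSIX threads (the worker function name is illustrative; compile with -pthread):

#include <pthread.h>
#include <stdio.h>

/* Entry point for the new thread of execution; it is scheduled by the
 * operating system independently of main(), but shares the process. */
static void *worker(void *arg)
{
    printf("worker thread running inside the same process\n");
    return NULL;
}

int main(void)
{
    pthread_t tid;

    /* Create a second thread of execution within this process. */
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return 1;

    /* Wait for the worker to finish before the process exits. */
    pthread_join(tid, NULL);
    return 0;
}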


Threads vs Processes

● Processes are typically (though not always) independent; threads exist as parts of a process.
● Threads carry much less state information than processes.
● Threads share their address space (see the sketch below).
● Processes can only communicate through inter-process communication: sockets, file descriptors, etc.
● Context switching between threads (of the same process) is typically much faster than context switching among processes.
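
One way to see the shared address space, as a small sketch (the variable and function names are illustrative): a worker thread writes to a global variable and the main thread reads the change directly, with no sockets or pipes involved.

#include <pthread.h>
#include <stdio.h>

/* Lives in the single address space shared by every thread of the process. */
static int shared_value = 0;

static void *writer(void *arg)
{
    shared_value = 42;   /* immediately visible to all other threads */
    return NULL;
}

int main(void)
{
    pthread_t tid;

    pthread_create(&tid, NULL, writer, NULL);
    pthread_join(tid, NULL);

    /* A forked child process would modify its own copy; a thread does not. */
    printf("main sees shared_value = %d\n", shared_value);
    return 0;
}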


Why Threading? Why Now?

Part I
1. We are approaching the limit for how small a transistor can get.
   ○ April 18, 2011: a 7-atom transistor was made.
2. The cost in terms of power consumption, and in turn heat, of faster chips is becoming more and more unacceptable.
3. To continue increasing computing power, harnessing multiple cores instead of relying on high-power single-core machines is critical.


Why Threading? Why Now?

Part II
1. Threads are much cheaper to create than processes.
   ○ pthread_create() often takes less than 1/5th the creation time of fork() (see the rough benchmark sketch after this list).
2. Inter-thread communication is much faster than inter-process communication.
   ○ Especially if I/O is used to communicate between processes.
3. Context switching between threads is much less expensive than between processes.
   ○ This is due mainly to the shared memory model.
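
A rough micro-benchmark sketch of the creation-cost claim in item 1. The iteration count and use of clock_gettime() are assumptions, and absolute numbers and ratios vary widely by system; this only illustrates how one might measure it.

#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define ITERATIONS 1000

static void *noop(void *arg) { return NULL; }

static double elapsed_sec(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
    struct timespec t0, t1;
    pthread_t tid;

    /* Time creating (and joining) ITERATIONS threads. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_create(&tid, NULL, noop, NULL);
        pthread_join(tid, NULL);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("pthread_create: %.3f s\n", elapsed_sec(t0, t1));

    /* Time forking (and reaping) ITERATIONS child processes. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERATIONS; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);          /* child does nothing */
        waitpid(pid, NULL, 0); /* parent reaps it */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("fork:           %.3f s\n", elapsed_sec(t0, t1));
    return 0;
}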


Candidates for Parallelism

If there are two independent tasks that can be run concurrently, interleaved, or overlapped, then we can potentially achieve an optimization through threading.
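
As a sketch of two such independent tasks, each thread below sums its own half of an array; there are no shared writes, so the work can overlap freely (the array size and names are illustrative).

#include <pthread.h>
#include <stdio.h>

#define N 1000000

static long data[N];

struct range { long start, end, sum; };

/* Each thread sums a disjoint slice; the two tasks are fully independent. */
static void *partial_sum(void *arg)
{
    struct range *r = arg;
    r->sum = 0;
    for (long i = r->start; i < r->end; i++)
        r->sum += data[i];
    return NULL;
}

int main(void)
{
    for (long i = 0; i < N; i++)
        data[i] = 1;

    struct range lo = { 0, N / 2, 0 }, hi = { N / 2, N, 0 };
    pthread_t t1, t2;

    pthread_create(&t1, NULL, partial_sum, &lo);
    pthread_create(&t2, NULL, partial_sum, &hi);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Combine the independent results after both tasks complete. */
    printf("total = %ld\n", lo.sum + hi.sum);
    return 0;
}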


Thread Safety

If parallelism, and more specifically threading, is so great, why not use it for everything?
● Parallelism is hard. Humans think serially, and many existing algorithms cannot be done in parallel (by design).
● A section of code in which shared data is accessed and manipulated is called a critical section.
   ○ "Locking" is used to protect these sections (see the sketch after this list).
● If it is possible for two threads to be in the same critical section at the same time, we call this a race condition.
   ○ A race condition means that the order of execution determines the output.
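
A sketch of the critical-section problem described above: two threads increment a shared counter. Without the mutex, updates can be lost and the result depends on interleaving; with it, only one thread is ever inside the critical section (the counts and names are illustrative).

#include <pthread.h>
#include <stdio.h>

#define INCREMENTS 1000000

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* counter++ is the critical section: read, add, write back.
 * The mutex ensures only one thread executes it at a time. */
static void *increment(void *arg)
{
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Without the lock this depends on thread interleaving (a race
     * condition); with it, the result is always 2 * INCREMENTS. */
    printf("counter = %ld\n", counter);
    return 0;
}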


Thread Synchronization

● Use parallel design patterns; these are often similar to regular design patterns.
   ○ Master/Slave
   ○ Assembly Line
   ○ Peer
● Locking critical sections requires atomic primitives. An atomic variable guarantees that only one thread acts on it at a time.
   ○ Semaphores: typically used in a gate design pattern (see the sketch after this list)
   ○ Mutex: the typical locking mechanism for shared data
   ○ Spinlocks: the Linux kernel's locking mechanism
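
A small sketch of the "gate" use of a semaphore mentioned above: worker threads block on sem_wait() until the main thread opens the gate with sem_post(). The worker count is illustrative, and unnamed POSIX semaphores are not available on every platform (e.g. they are deprecated on macOS).

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define WORKERS 3

static sem_t gate;   /* starts closed (count 0) */

static void *worker(void *arg)
{
    long id = (long)arg;

    sem_wait(&gate);                 /* block until the gate opens */
    printf("worker %ld passed the gate\n", id);
    return NULL;
}

int main(void)
{
    pthread_t tids[WORKERS];

    sem_init(&gate, 0, 0);           /* unnamed semaphore, initial count 0 */
    for (long i = 0; i < WORKERS; i++)
        pthread_create(&tids[i], NULL, worker, (void *)i);

    /* Open the gate once per waiting worker. */
    for (int i = 0; i < WORKERS; i++)
        sem_post(&gate);

    for (int i = 0; i < WORKERS; i++)
        pthread_join(tids[i], NULL);
    sem_destroy(&gate);
    return 0;
}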


Threading Libraries

1. POSIX Threads (pthreads)
   ○ Available on almost every platform (originally intended for Unix and Unix-like operating systems).
   ○ Extremely customizable, allowing for a greater degree of control.
2. Intel Threading Building Blocks
   ○ A higher-level library for threading.
3. OpenMP
   ○ An open, compiler-directive-based standard for higher-level threading (see the sketch below).

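For contrast with hand-written pthreads code, a minimal OpenMP sketch: one directive splits the loop across a team of threads and handles the shared sum. The array size is illustrative; compile with an OpenMP-capable compiler (e.g. gcc -fopenmp).

#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double data[N];
    double sum = 0.0;

    /* The directive splits the loop iterations across a team of threads;
     * the reduction clause safely combines each thread's partial sum. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        data[i] = i * 0.5;
        sum += data[i];
    }

    printf("sum = %f, threads available = %d\n", sum, omp_get_max_threads());
    return 0;
}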


Matt's Tips for Threading

1. Don't design algorithms serially.
   ○ Start out expecting your code to be run in parallel.
2. Pick the "right" pattern.
   ○ Identify what your goal is (memory usage, speed, etc.) and what you're willing to sacrifice to get there.
3. Avoid lots of critical sections.
   ○ This refers not to sheer quantity, but to how often you will need to enter those critical sections.
4. Think carefully about your division of labor.


Recommended Reading

1. The Little Book of Semaphores, Allen B. Downey
   ○ Available for free: http://greenteapress.com/semaphores/downey08semaphores.pdf
2. Pthread Tutorial
   ○ https://computing.llnl.gov/tutorials/pthreads/
3. And if you're interested in the kernel...
   ○ Linux Kernel Development, Robert Love
