Threading History and Implementation - Classes
<strong>Threading</strong><br />
<strong>History</strong> <strong>and</strong> <strong>Implementation</strong><br />
Matthew Atwood<br />
Intel Atom Group - OSU
Moore's Law(s)<br />
Moore's Law<br />
The number of transistors <strong>and</strong> resistors on a chip doubles roughly every two<br />
years (popularly quoted as every 18 months).<br />
Moore's 2nd Law<br />
"The capital cost of a semiconductor fab also increases exponentially<br />
over time."<br />
Gordon E. Moore<br />
Intel Co-Founder
Computing Power<br />
1. There have always been two schools of thought on how to increase<br />
computing power.<br />
2. Increase the number of transistors<br />
1. Primary way to increase computing power in the last 50 years.<br />
3. Increase the number of cores<br />
1. A strong idea but more difficult to implement.
What is <strong>Threading</strong>?<br />
Thread of execution:<br />
1. Smallest unit of processing that can be scheduled by an operating<br />
system.<br />
2. In most modern operating systems, a thread is contained within a<br />
process (Unix <strong>and</strong> Unix-like operating systems).<br />
○ Linux does not differentiate between processes <strong>and</strong> threads. A<br />
thread is merely a special kind of process.<br />
3. On a single processor, a process's time is divided among its threads.<br />
4. On multi-core machines, threads can run simultaneously, thus<br />
achieving true parallelism.
Threads vs Processes<br />
● Processes are typically (not always) independent; threads<br />
exist as part of a process<br />
● Threads carry much less state information than processes<br />
● Threads of the same process share its address space; each process has its own<br />
● Processes can communicate only through inter-process<br />
communication (IPC): sockets, file descriptors, etc.<br />
● Context switching between threads (of the same process) is<br />
typically much faster than context switching among<br />
processes.
Why <strong>Threading</strong>? Why Now?<br />
Part I<br />
1. Transistors are approaching the limit of how small they can get<br />
○ April 18, 2011: a 7-atom transistor was made<br />
2. The cost of faster chips in power consumption, <strong>and</strong> in turn heat,<br />
is becoming less <strong>and</strong> less acceptable<br />
3. To keep increasing computing power, harnessing multiple cores<br />
instead of high-power single-core machines is critical.
Why <strong>Threading</strong>? Why Now?<br />
Part II<br />
1. Threads are much cheaper to create than processes<br />
○ pthread_create() is often less than 1/5th the creation<br />
time of fork().<br />
2. Thread inter-communication is much faster than process<br />
inter-communication.<br />
○ Especially if I/O is used to communicate between<br />
processes.<br />
3. Context switching between threads is much less expensive<br />
than between processes.<br />
○ This is mainly because threads of a process share one address space, so the switch does not change memory mappings.
C<strong>and</strong>idates for Parallelism<br />
If two independent tasks can be run concurrently (interleaved<br />
or overlapped), then we can potentially speed up the program<br />
through threading.
Thread Safety<br />
If parallelism, more specifically threading, is so great, why not<br />
use it for everything?<br />
● Parallelism is hard. Humans think serially, <strong>and</strong> many existing<br />
algorithms are sequential by design.<br />
● A section of code in which threads access <strong>and</strong><br />
manipulate shared data is called a critical section<br />
○ "Locking" is used to keep threads from interfering with each other there<br />
● If two threads can be in the same critical<br />
section at the same time, we have a race condition.<br />
○ A race condition means that the order of execution<br />
determines the output.
Thread Synchronization<br />
● Use parallel design patterns; these are often similar to<br />
regular design patterns.<br />
○ Master/Slave<br />
○ Assembly Line<br />
○ Peer<br />
● Locking critical sections requires atomic operations.<br />
An atomic operation completes as a single indivisible step, so only one thread<br />
can act on the variable at a time.<br />
○ Semaphores: Typically used in a gate design pattern<br />
○ Mutex: Typical locking mechanism for shared data<br />
○ Spinlocks: Busy-waiting locks, common in the Linux kernel
<strong>Threading</strong> Libraries<br />
1. POSIX Threads (pthreads)<br />
○ Available on almost every platform (originally<br />
intended for Unix <strong>and</strong> Unix-like operating systems)<br />
○ Extremely customizable, allowing a greater degree of<br />
control<br />
2. Intel Threading Building Blocks (TBB)<br />
○ A higher-level C++ library for threading<br />
3. OpenMP<br />
○ An open, higher-level threading API based on compiler directives<br />
Matt's Tips for <strong>Threading</strong><br />
1. Don't design algorithms serially<br />
○ Start out expecting your code to only be run in parallel<br />
2. Pick the "right" pattern<br />
○ Identify what your goal is (memory usage, speed, etc.)<br />
<strong>and</strong> what you're willing to sacrifice to get there.<br />
3. Avoid lots of critical sections<br />
○ What matters is not their sheer quantity but how often you<br />
will need to enter those critical sections.<br />
4. Think carefully about your division of labor
Recommended Reading<br />
1. The Little Book of Semaphores: Allen B. Downey<br />
○ available for free: http://greenteapress.com/semaphores/downey08semaphores.pdf<br />
2. Pthread Tutorial:<br />
○ https://computing.llnl.gov/tutorials/pthreads/<br />
3. And if you're interested in the Kernel...<br />
○ Linux Kernel Development: Robert Love