A Technical History of the SEI
Real-Time Multicore Scheduling

The Challenge: Taking Advantage of Multicore Chips
The trend to increase processing power has shifted from increasing the frequency of execution to multiplying the number of processors embedded in a single chip, an approach known as the multicore chip, or chip-level multiprocessor (CMP). The increase in processing power is achieved by increasing the number of instructions that can be executed concurrently rather than by reducing the time to execute a single instruction. Consequently, an application experiences a speedup only if it has enough instructions that can be executed concurrently, that is, parallelizable instructions. The additional processing capacity made available by the additional cores in a multicore processor can be exploited only if enough parallel instructions can be found in the application. Unfortunately, this limitation is misaligned with the sequential programming model that prevails in current software development practice, in which application code is generally written as a single sequence of instructions under the assumption that they will not execute in parallel.
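The limit on speedup described here is commonly quantified by Amdahl's law, which the report does not state but which makes the point concrete: if only a fraction p of a program's work is parallelizable, n cores can speed it up by at most 1/((1 - p) + p/n). A minimal sketch (the fraction values below are illustrative, not from the report):

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Upper bound on speedup when a fraction p of the work is
    parallelizable and the parallel part runs on n cores (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 8 cores, a program that is only 50% parallelizable
# speeds up by less than 2x; the sequential half dominates.
print(round(amdahl_speedup(0.5, 8), 2))   # -> 1.78
print(round(amdahl_speedup(0.95, 8), 2))  # -> 5.93
print(amdahl_speedup(1.0, 8))             # -> 8.0 (fully parallel code scales linearly)
```

This is why adding cores helps only to the extent that parallelizable instructions can actually be found in the application.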
In many real-time systems, a fair amount of parallelism has already been exploited, but still more is needed to cope with the growing computational demands placed on defense systems, such as those arising from the increased demand for autonomy in unmanned aerial vehicles (UAVs).
Today, real-time systems and applications are already developed using threads that are scheduled with well-established schedulers to achieve predictable timing behavior. However, the bulk of the scheduling research assumed a single core; and while multiprocessors (not multicore) are already being used and analyzed, such systems are not yet fully understood. For instance, there are anomalies in which a system with N processors can miss deadlines under a workload that is only just enough to fill one processor [Dhall 1978]. Similar problems occur when threads distributed across multiple processors need to synchronize with each other, leading to idle processors and poor utilization. Essentially, two aspects must be considered: (1) allocating and mapping each thread to a processor and (2) determining the execution order on that processor, that is, scheduling. The solution to these problems will very likely also involve a change in the structure of these systems and in the abstractions used to develop them.
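The anomaly cited from [Dhall 1978] can be reproduced with a small discrete-time simulation of global earliest-deadline-first (EDF) scheduling. The task set below is invented for illustration (it is not from the cited paper): on four cores, four light tasks plus one heavy task, totaling only about 1.35 processors' worth of utilization, still cause a deadline miss, because global EDF runs the light tasks first and starves the heavy one.

```python
def global_edf_misses(tasks, m, horizon):
    """Simulate global EDF on m identical cores in unit time steps.
    tasks: list of (wcet, period) pairs with implicit deadline = period.
    Returns a list of (task_id, deadline) pairs for every missed deadline."""
    jobs = []      # active jobs: [task_id, remaining_work, absolute_deadline]
    misses = []
    for t in range(horizon):
        # Release a new job for each task whose period begins at time t.
        for i, (wcet, period) in enumerate(tasks):
            if t % period == 0:
                jobs.append([i, wcet, t + period])
        # Any job still unfinished at its deadline is a miss; drop it.
        for j in [j for j in jobs if j[2] <= t]:
            misses.append((j[0], j[2]))
            jobs.remove(j)
        # Global EDF: the (up to) m jobs with the earliest deadlines
        # each execute for one time unit on the m cores.
        jobs.sort(key=lambda j: j[2])
        for j in jobs[:m]:
            j[1] -= 1
        jobs = [j for j in jobs if j[1] > 0]
    return misses

# Four light tasks (WCET 2, period 20) and one heavy task (WCET 20, period 21)
# on m = 4 cores: total utilization is 4*(2/20) + 20/21, roughly 1.35.
# At t = 0 the light tasks have earlier deadlines, so they occupy all four
# cores first, and the heavy task can no longer finish by its deadline at 21.
tasks = [(2, 20)] * 4 + [(20, 21)]
print(global_edf_misses(tasks, m=4, horizon=42))  # -> [(4, 21)]
```

A partitioned scheme that dedicates one core to the heavy task would meet every deadline here, which is why allocation and scheduling have to be considered together.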
A Solution: Real-Time Scheduling for Multicore Processors
In 2009, the SEI began to investigate real-time scheduling for multicore processors, focusing on the problems of task-to-core allocation, synchronization, and the relationship between synchronization and task allocation. The focus was soon extended to variations of multicore processors that include graphics processing units (GPUs). While GPUs are typically used to render graphics, they are often used for general parallel computation as well.
Previous work on scheduling increased the global scheduling utilization to 33 percent for periodic tasks and 50 percent for aperiodic tasks [Andersson 2001, 2003]. Other approaches used quantized assignments of processor cycles to tasks, with a scheduler that calculates a scheduling window at fixed intervals [Srinivasan 2001, Anderson 2006]. However, none of these efforts took into account task interactions and the variety of task and application structures. The SEI partnered with Carnegie Mellon research faculty to explore the combination of task scheduling and task synchronization, creating a coordinated allocation and synchronization algorithm that can achieve up to twice the utilization of non-coordinated ones [Lakshmanan 2009].
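The allocation half of the problem is often treated as bin packing: each task's utilization (worst-case execution time divided by period) must be packed onto cores without exceeding a per-core bound. A minimal first-fit-decreasing sketch, assuming a per-core utilization bound of 1.0 (the uniprocessor EDF bound; a fixed-priority scheduler would use a lower bound), with a task set invented for illustration:

```python
def first_fit_allocate(utils, m, bound=1.0):
    """Pack task utilizations onto m cores, first-fit in decreasing order.
    Returns a list of task-index lists per core, or None if some task
    fits on no core (allocation fails)."""
    cores = [[] for _ in range(m)]
    loads = [0.0] * m
    # First-fit decreasing: placing the biggest tasks first improves packing.
    for task, u in sorted(enumerate(utils), key=lambda p: -p[1]):
        for c in range(m):
            if loads[c] + u <= bound + 1e-9:  # tolerance for float rounding
                cores[c].append(task)
                loads[c] += u
                break
        else:
            return None  # no core can host this task
    return cores

# Five tasks with total utilization 2.0 pack exactly onto two cores...
print(first_fit_allocate([0.6, 0.5, 0.4, 0.3, 0.2], m=2))  # -> [[0, 2], [1, 3, 4]]
# ...but three 0.7-utilization tasks cannot fit on two cores.
print(first_fit_allocate([0.7, 0.7, 0.7], m=2))  # -> None
```

This sketch ignores synchronization entirely; the coordinated algorithm described above gains its advantage precisely by considering which tasks share resources when deciding where to place them.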
CMU/SEI-2016-SR-027 | SOFTWARE ENGINEERING INSTITUTE | CARNEGIE MELLON UNIVERSITY 46
Distribution Statement A: Approved for Public Release; Distribution is Unlimited