29.01.2015 Views

Embedded Software for SoC - Grupo de Mecatrônica EESC/USP

Lightweight Implementation of the POSIX Threads API

Table 3-2. Performance of the main kernel functions (in number of cycles).

                                  Number of processors
Operation                         1       2       3       4
Context switch                    172     187     263     351
Mutex lock (acquired)             36      56      61      74
Mutex unlock                      30      30      31      34
Mutex lock (suspended)            117     123     258     366
Mutex unlock (awakes a thread)    N/A     191     198     218
Thread creation                   667     738     823     1085
Thread exit                       98      117     142     230
Semaphore acquisition             36      48      74      76
Semaphore acquisition             36      50      78      130
Interrupt handler                 200     430     1100    1900
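To make the scaling in Table 3-2 easier to read, the following sketch computes, for each operation, the ratio between its cost on a given number of processors and its single-processor cost. The cycle counts are copied from the table; the names `slowdown`, `print_slowdowns`, `cycles` and `op_name` are hypothetical helpers introduced here for illustration and are not part of the kernel described in this chapter.

```c
#include <stdio.h>

#define NUM_OPS 10
#define MAX_CPUS 4
#define NOT_AVAILABLE (-1)

/* Row labels as they appear in Table 3-2 (the duplicate label is in the source). */
static const char *const op_name[NUM_OPS] = {
    "Context switch",
    "Mutex lock (acquired)",
    "Mutex unlock",
    "Mutex lock (suspended)",
    "Mutex unlock (awakes a thread)",
    "Thread creation",
    "Thread exit",
    "Semaphore acquisition",
    "Semaphore acquisition",
    "Interrupt handler"
};

/* Cycle counts from Table 3-2; columns are 1, 2, 3 and 4 processors. */
static const int cycles[NUM_OPS][MAX_CPUS] = {
    { 172, 187,  263,  351 },
    {  36,  56,   61,   74 },
    {  30,  30,   31,   34 },
    { 117, 123,  258,  366 },
    { NOT_AVAILABLE, 191, 198, 218 },
    { 667, 738,  823, 1085 },
    {  98, 117,  142,  230 },
    {  36,  48,   74,   76 },
    {  36,  50,   78,  130 },
    { 200, 430, 1100, 1900 }
};

/* Cost on `cpus` processors divided by the single-processor cost.
 * Returns a negative value if the arguments are out of range or if
 * either measurement is marked N/A in the table. */
double slowdown(int op, int cpus)
{
    if (op < 0 || op >= NUM_OPS || cpus < 1 || cpus > MAX_CPUS)
        return -1.0;
    int base = cycles[op][0];
    int cost = cycles[op][cpus - 1];
    if (base == NOT_AVAILABLE || cost == NOT_AVAILABLE)
        return -1.0;
    return (double)cost / (double)base;
}

/* Print the 4-processor slowdown of every operation. */
void print_slowdowns(void)
{
    for (int op = 0; op < NUM_OPS; op++) {
        double r = slowdown(op, MAX_CPUS);
        if (r < 0.0)
            printf("%-32s N/A\n", op_name[op]);
        else
            printf("%-32s x%.2f\n", op_name[op], r);
    }
}
```

Calling `print_slowdowns()` from a `main` function lists the ratios. Read this way, the table shows the interrupt handler growing from 200 to 1900 cycles (a factor of 9.5) between one and four processors, while the context switch roughly doubles and the mutex unlock is almost flat.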

memory is much lower. The implementation is a bit tricky, but quite compact and efficient. Our experiments have shown that a POSIX-compliant SMP kernel allowing task migration is an acceptable solution in terms of generality, performance and memory footprint for SoC.

The main problem due to the introduction of networks on chip is the increasing memory access latency. One of our short-term goals is to investigate the use of latency-hiding techniques for these networks. Our next experiment concerns the use of dedicated hardware for semaphores and pollable variables that would queue the acquisition requests and put the requesting processors to sleep until the variable changes. This can be effectively supported by the VCI interconnect, by means of its request/acknowledge handshake. In that case, the implementation of pthread_spin_lock could suspend the calling task. This could be handled efficiently if the processors that run the kernel have multiple hardware contexts, as introduced in [10].
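As a reminder of what pthread_spin_lock looks like from the application side, here is a minimal sketch using only the standard POSIX spin lock calls. The names `shared_counter`, `worker` and `run_spin_counter` are hypothetical and chosen for this example. The sketch does not model the VCI hardware; it only illustrates that the calling code would stay the same whether the kernel implements the lock by busy-waiting or, as suggested above, by suspending the caller.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <pthread.h>
#include <stdlib.h>

/* Shared state: a counter guarded by a POSIX spin lock. */
struct shared_counter {
    pthread_spinlock_t lock;
    long value;
    long increments_per_thread;
};

static void *worker(void *arg)
{
    struct shared_counter *c = arg;
    for (long i = 0; i < c->increments_per_thread; i++) {
        /* A conventional implementation busy-waits here. With the
         * hardware support discussed in the text, the same call could
         * put the caller to sleep instead (assumption, not shown). */
        pthread_spin_lock(&c->lock);
        c->value++;
        pthread_spin_unlock(&c->lock);
    }
    return NULL;
}

/* Start `nthreads` workers that each add `increments` to the counter.
 * Returns the final counter value, or -1 on any setup failure. */
long run_spin_counter(int nthreads, long increments)
{
    if (nthreads <= 0 || increments < 0)
        return -1;

    struct shared_counter c = { .value = 0, .increments_per_thread = increments };
    if (pthread_spin_init(&c.lock, PTHREAD_PROCESS_PRIVATE) != 0)
        return -1;

    pthread_t *threads = malloc((size_t)nthreads * sizeof *threads);
    if (threads == NULL) {
        pthread_spin_destroy(&c.lock);
        return -1;
    }

    int started = 0;
    while (started < nthreads &&
           pthread_create(&threads[started], NULL, worker, &c) == 0)
        started++;

    /* Join every thread that was actually created. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_spin_destroy(&c.lock);
    free(threads);
    return (started == nthreads) ? c.value : -1;
}
```

For instance, `run_spin_counter(4, 10000)` should return 40000 when all four threads start, since every increment happens inside the critical section.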

The SMP version of this kernel and a minimal C library are part of the Disydent tool suite, available under the GPL at www-asim.lip6.fr/disydent.

REFERENCES

1. VSI Alliance. Virtual Component Interface Standard (OCB 2 2.0), August 2000.

2. T. R. Halfhill. “Embedded Market Breaks New Ground.” Microprocessor Report, January 2000.

3. J. Archibald and J.-L. Baer. “Cache Coherence Protocols: Evaluation Using a Multiprocessor Simulation Model.” ACM Transactions on Computer Systems, Vol. 4, No. 4, pp. 273–298, 1986.

4. T. P. Baker, F. Mueller, and V. Rustagi. “Experience with a Prototype of the POSIX Minimal
