11.01.2015 Views

How to Benchmark Code Execution Times on Intel IA-32 and IA-64 ...


How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures

Figure 4. Variance Behavior Graph 4 (x-axis: ensembles, ticks 1 to 951; y-axis: clock cycles, 0 to 24; series: Variance)

In Figure 3 we can see that the minimum value is perfectly constant between ensembles; in Figure 4 the variance is equal to either 2 or 3 clock cycles.
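The two per-ensemble statistics discussed here, the minimum and the variance, can be sketched in C as follows. This is only an illustrative sketch: the function names `ensemble_min` and `ensemble_variance` are hypothetical and not taken from the Appendix, and the integer arithmetic assumes the samples are small enough that the sums of squares do not overflow 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smallest sample in one ensemble of n measurements. */
uint64_t ensemble_min(const uint64_t *samples, size_t n)
{
    assert(n > 0);
    uint64_t min = samples[0];
    for (size_t i = 1; i < n; i++) {
        if (samples[i] < min)
            min = samples[i];
    }
    return min;
}

/* Hypothetical helper: population variance of one ensemble, computed in
 * integer arithmetic as (n * sum(x^2) - (sum x)^2) / n^2.
 * Assumption: values are small enough that no intermediate overflows. */
uint64_t ensemble_variance(const uint64_t *samples, size_t n)
{
    assert(n > 0);
    uint64_t sum = 0, sum_sq = 0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
        sum_sq += samples[i] * samples[i];
    }
    return (n * sum_sq - sum * sum) / (n * n);
}
```

Applying both functions to every ensemble and plotting the results against the ensemble index would give curves of the kind shown in Figures 3 and 4. The integer division truncates, which is one possible reason a plotted variance takes only whole-cycle values.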

3.2.3 An Alternative Method for Architecture Not Supporting RDTSCP

This section presents an alternative method to benchmark code execution cycles on architectures that do not support the RDTSCP instruction. This method is not as good as the one presented in Section 3.2.1, but it is still much better than the one that uses CPUID to serialize code execution. In this method, we serialize code execution between the two timestamp register reads by writing the control register CR0.

In the code in the Appendix, the developer should replace ln19 to ln54 with the following:

/* Start reading: CPUID serializes execution, then RDTSC reads the counter. */
asm volatile( "CPUID\n\t"
              "RDTSC\n\t"
              "mov %%edx, %0\n\t"
              "mov %%eax, %1\n\t": "=r" (cycles_high), "=r" (cycles_low)::
              "%rax", "%rbx", "%rcx", "%rdx");

/* End reading: writing CR0 back with its own value serializes execution
   before RDTSC. MOV to CR0 is privileged, so this runs only in ring 0. */
asm volatile( "mov %%cr0, %%rax\n\t"
              "mov %%rax, %%cr0\n\t"
              "RDTSC\n\t"
              "mov %%edx, %0\n\t"
              "mov %%eax, %1\n\t": "=r" (cycles_high1), "=r" (cycles_low1)::
              "%rax", "%rdx");
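Each RDTSC read above leaves the counter split across two 32-bit halves (EDX high, EAX low). As a minimal sketch of how the four stored values might be turned into an elapsed cycle count, assuming they are held as 32-bit unsigned integers, the helpers below join the halves and subtract. The names `tsc_join` and `tsc_elapsed` are hypothetical and do not come from the Appendix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: join the EDX (high) and EAX (low) halves of one
 * RDTSC read into a single 64-bit timestamp. */
uint64_t tsc_join(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* Hypothetical helper: cycles elapsed between the start reading and the
 * end reading. Assumes the end timestamp is not earlier than the start. */
uint64_t tsc_elapsed(uint32_t start_high, uint32_t start_low,
                     uint32_t end_high, uint32_t end_low)
{
    uint64_t start = tsc_join(start_high, start_low);
    uint64_t end = tsc_join(end_high, end_low);
    assert(end >= start);
    return end - start;
}
```

With the variables used above, a call would look like `tsc_elapsed(cycles_high, cycles_low, cycles_high1, cycles_low1)`. Casting the high half to 64 bits before shifting matters: shifting a 32-bit value left by 32 is undefined in C.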
