Computer Architecture 2 / Advanced Computer Architecture
University<br />
Duisburg-Essen<br />
Dr.-Ing. Basermann<br />
Prof. Dr.-Ing. Hunger<br />
SS 2004/2005<br />
Computer Architecture 2 / Advanced Computer Architecture<br />
Annotation to the assignments and the solution sheet<br />
This is a multiple-choice examination, which means:<br />
• Solution approaches are not assessed.<br />
• For each subpart of an assignment, one or more answers can be correct.<br />
However, if you mark the box "None of them" for a subpart, any other answers marked for<br />
that subpart will be disregarded.<br />
• It is not possible to get a negative score in a subpart of any assignment.<br />
Note the following points<br />
• In addition to the assignment sheet there is a solution sheet.<br />
• Mark your answers on the solution sheet as described.<br />
MARKED ANSWERS ON THE ASSIGNMENT SHEET WILL NOT BE CONSIDERED.<br />
• You get the assignment sheet only once.<br />
• In case of erroneous entries, ask the supervisors for a new solution sheet.<br />
• Only use the sheets enclosed in the envelope. Do not use any other paper. If you need<br />
more paper, ask the supervisors.<br />
• Return everything, i.e. the assignment sheet, the solution sheet, and all other sheets, used and<br />
unused. Only exams that are returned completely will be assessed.<br />
• FILL IN YOUR NAME AND MATRICULATION NUMBER ON THE ASSIGNMENT SHEET<br />
AND THE SOLUTION SHEET!<br />
Name    Matriculation number    Type A
Question 1 (14 Points)<br />
Parallelism within a Processor<br />
1.1 Which of the following statements about the von Neumann architecture is/are true?<br />
A: Programs and data are resident in different memories.<br />
B: The computer structure is independent of the problem to be processed.<br />
C: Programs consist of a sequence of instructions which are executed in parallel.<br />
D: The machine uses binary codes.<br />
E: None of the answers above is correct.<br />
1.2 Instruction pipelining: How long (in ns) is the gap (bubble) within the fourth task entering<br />
the pipe below?<br />
Stage: IF ID EX MEM WB<br />
Time: 4 ns 3 ns 4 ns 8 ns 3 ns<br />
F: 12 ns.<br />
G: 16 ns.<br />
H: 20 ns.<br />
I: None of the answers above is correct.<br />
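A minimal sketch of how the bubble in 1.2 could be checked. It assumes the stage times IF 4 ns, ID 3 ns, EX 4 ns, MEM 8 ns, WB 3 ns as read from the figure. It also assumes that tasks enter the pipe back to back, that each stage handles one task at a time, and that a waiting task does not block the stage behind it. The function name `waiting_times` is hypothetical and not part of the exam.

```python
def waiting_times(stage_ns, n_tasks):
    """Idle time (ns) each task spends waiting for a busy stage.

    Waiting in front of the first stage is not counted, because the
    task has not entered the pipe yet.
    """
    free_at = [0] * len(stage_ns)  # time at which each stage becomes free
    waits = []
    for _ in range(n_tasks):
        ready = 0   # time at which the task is ready for its next stage
        waited = 0
        for s, duration in enumerate(stage_ns):
            start = max(ready, free_at[s])
            if s > 0:
                waited += start - ready
            ready = start + duration
            free_at[s] = ready
        waits.append(waited)
    return waits


# Assumed stage times: IF, ID, EX, MEM, WB
print(waiting_times([4, 3, 4, 8, 3], 4))  # [0, 4, 8, 12]
```

Under these assumptions the slow MEM stage delays each new task by a further 4 ns, the difference between the MEM time and the issue interval.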
1.3 Pipelining: what is the execution time per stage of a pipeline that has 5 equal stages<br />
and a mean overhead of 8 cycles?<br />
J: 2 cycles.<br />
K: 3 cycles.<br />
L: 4 cycles.<br />
M: None of the answers above is correct.<br />
1.4 Itanium processor, ILP (EPIC): A vector operation c = a + b with 154 elements per<br />
vector shall be performed. How many cycles are required within the loop below for the<br />
vector operation above (neglect the branch operation br.ctop ) if the load (ldl)<br />
instructions take two cycles and the remaining operations take 1 cycle?<br />
Intel's Itanium<br />
ld r2=addr(a)<br />
ld r3=addr(b) ;;<br />
ld r4=addr(c)<br />
ld.lc=4<br />
ld.ec=5 ;;<br />
loop:<br />
(p16) ldl f32=[r2],8<br />
(p17) ldl f36=[r3],8<br />
(p19) fadd f38=f35+f38<br />
(p20) stl [r4]=f39,8<br />
br.ctop.loop ;;<br />
N: 158.<br />
O: 159.<br />
P: 162.<br />
Q: None of the answers above is correct.<br />
1.5 Which feature of Itanium processors aims to increase parallelism by changing<br />
instructions order?<br />
R: Rotating Registers.<br />
S: Predication.<br />
T: Speculation.<br />
U: None of the answers above is correct.<br />
Question 2 (12 Points)<br />
Classification & Performance of Parallel <strong>Architecture</strong>s<br />
2.1 Which kind of architecture is represented by the following figure?<br />
[Figure: control units CU1, CU2, ..., CUn (each with I/O), instruction streams (IS), processing units PU1, PU2, ..., PUn, data streams (DS), and a shared memory.]<br />
A: SISD architecture.<br />
B: SIMD architecture.<br />
C: MIMD architecture.<br />
D: MISD architecture.<br />
E: None of the answers above is correct.<br />
2.2 Which statement(s) related to the system in figure in 2.1 is/are true?<br />
F: The system scales very well with respect to the number of processors.<br />
G: The system represents a vector processor.<br />
H: The processors can communicate with each other through shared variables.<br />
I: None of the answers above is correct.<br />
2.3 Parallel programs: What is the parallel execution time of a program with a mean parallel<br />
overhead of 4 s and a sequential execution time of 600 s on 150 processors?<br />
J: 4 s. K: 8 s. L: 12 s. M:<br />
N: None of the answers above is correct.<br />
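A minimal sketch for 2.3, assuming the common model in which the parallel time is the ideally divided sequential time plus the mean parallel overhead. The function name `parallel_time` is hypothetical.

```python
def parallel_time(t_seq, processors, overhead):
    """Parallel execution time under the assumed model T_p = T_1 / p + T_o."""
    return t_seq / processors + overhead


print(parallel_time(600, 150, 4))  # 8.0
```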
2.4 Parallel programs: What is the execution time of a program on 100 processors if 93%<br />
of the program is ideally parallel, the remaining part is sequential and the sequential<br />
execution time is 10000 s?<br />
O: 100 s. P: 593 s. Q: 793 s.<br />
R: None of the answers above is correct<br />
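The setting in 2.4 matches Amdahl's law: the sequential part runs unchanged and the ideally parallel part is divided by the number of processors. A minimal sketch, with the hypothetical function name `amdahl_time`:

```python
def amdahl_time(t_seq, parallel_fraction, processors):
    """Execution time when only parallel_fraction of the work is sped up."""
    serial_part = (1 - parallel_fraction) * t_seq
    parallel_part = parallel_fraction * t_seq / processors
    return serial_part + parallel_part


t = amdahl_time(10000, 0.93, 100)
print(f"{t:.0f} s")  # 700 s sequential part + 93 s parallel part
```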
2.5 Workload driven evaluation of parallel systems, memory constrained scaling: A matrix<br />
factorization with complexity n³ takes 20 hours for a square matrix which requires<br />
128*10⁸ bytes on one processor (8 bytes per element). How long would it take on 100<br />
processors (assuming 50% parallel efficiency)?<br />
S: 200 hours. T: 400 hours. U: 600 hours.<br />
V: None of the answers above is correct<br />
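A minimal sketch of memory-constrained scaling for 2.5. It assumes that the total memory grows linearly with the number of processors and that the matrix fills it, so the number of elements n² grows by the factor p and the work n³ grows by p^(3/2). The function name and parameters are hypothetical.

```python
def memory_constrained_time(t1_hours, processors, efficiency,
                            work_exponent=3, memory_exponent=2):
    """Run time when the problem is scaled to fill p times the memory."""
    # Memory ~ n**memory_exponent, so n grows by p**(1 / memory_exponent).
    n_scale = processors ** (1 / memory_exponent)
    # Work ~ n**work_exponent.
    work_scale = n_scale ** work_exponent
    # The scaled work is shared by p processors at the given efficiency.
    return t1_hours * work_scale / (processors * efficiency)


print(memory_constrained_time(20, 100, 0.5))  # 400.0
```

Here n grows by a factor of 10, the work by a factor of 1000, and the effective speedup is 100 * 0.5 = 50.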
2.6 Workload driven evaluation of parallel systems, time-constrained scaling: What should<br />
the number of rows be for a matrix-matrix multiplication on 1 processor if it is 3000 on<br />
30 processors (assuming 90% parallel efficiency)?<br />
W: 1000. X: 1500. Y: 2000.<br />
Z: None of the answers above is correct<br />
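A minimal sketch of time-constrained scaling for 2.6. It assumes that matrix-matrix multiplication costs n³ operations and that the run time is kept constant, so n_p³ / (p * E) = n_1³. The function name `rows_on_one_processor` is hypothetical.

```python
def rows_on_one_processor(n_p, processors, efficiency, work_exponent=3):
    """Problem size that one processor solves in the same time as p do for n_p."""
    effective_speedup = processors * efficiency
    return n_p / effective_speedup ** (1 / work_exponent)


n1 = rows_on_one_processor(3000, 30, 0.9)
print(round(n1))  # effective speedup 27, so n shrinks by a factor of 3
```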
Question 3 (12 Points)<br />
Interconnection Networks<br />
3.1 Topology: What is the difference between a 2-D torus and a hypercube with 16 nodes<br />
regarding the topology parameters node degree, diameter, bisection width, and average<br />
distance?<br />
A: The hypercube has the higher bisection width.<br />
B: The node degree is different.<br />
C: The 2-D torus has the higher average distance.<br />
D: No difference.<br />
E: None of the answers above is correct.<br />
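The four topology parameters in 3.1 can be computed directly for small networks. The sketch below builds a 4-dimensional hypercube and a 4 x 4 torus (both 16 nodes), measures distances by breadth-first search, and finds the bisection width by brute force over all equal splits. All function names are hypothetical, and the average distance is taken over pairs of distinct nodes, which is one of several conventions.

```python
from collections import deque
from itertools import combinations, product


def hypercube(dim):
    """Adjacency lists of a dim-dimensional hypercube (nodes are integers)."""
    return {v: [v ^ (1 << b) for b in range(dim)] for v in range(2 ** dim)}


def torus_2d(k):
    """Adjacency lists of a k x k torus (nodes are (x, y) tuples)."""
    return {
        (x, y): [((x + 1) % k, y), ((x - 1) % k, y),
                 (x, (y + 1) % k), (x, (y - 1) % k)]
        for x, y in product(range(k), repeat=2)
    }


def distances(graph, src):
    """Breadth-first search: hop count from src to every node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def bisection_width(graph):
    """Brute force: fewest links cut by any split into two equal halves."""
    nodes = list(graph)
    edges = {frozenset((u, v)) for u in graph for v in graph[u]}
    best = len(edges)
    # Fix nodes[0] in one half so that each split is visited only once.
    for rest in combinations(nodes[1:], len(nodes) // 2 - 1):
        half = {nodes[0], *rest}
        cut = sum(1 for e in edges if len(e & half) == 1)
        best = min(best, cut)
    return best


def summary(graph):
    """Node degree, diameter, average distance, and bisection width."""
    n = len(graph)
    all_d = [d for s in graph for d in distances(graph, s).values()]
    return {
        "degree": max(len(set(nbrs)) for nbrs in graph.values()),
        "diameter": max(all_d),
        "avg_distance": sum(all_d) / (n * (n - 1)),  # distinct ordered pairs
        "bisection_width": bisection_width(graph),
    }


print(summary(hypercube(4)))
print(summary(torus_2d(4)))
```

The brute-force bisection search is only practical for networks of this size; larger networks need the closed-form expressions.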
3.2 E-cube routing: Which path is taken from 010 to 101?<br />
[Figure: 3-dimensional hypercube with nodes 000, 001, 010, 011, 100, 101, 110, 111.]<br />
F: 010 -> 011 -> 001 -> 101.<br />
G: 010 -> 110 -> 100 -> 101.<br />
H: 010 -> 000 -> 001 -> 101.<br />
I: 010 -> 110 -> 111 -> 101.<br />
J: None of the answers above is correct.<br />
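E-cube routing corrects the differing address bits one dimension at a time in a fixed order. Which end of the address is corrected first is a convention that differs between texts, so the sketch below takes it as a parameter; the default (lowest bit first) is an assumption, not something the exam states. The function name `ecube_path` is hypothetical.

```python
def ecube_path(src, dst, dim, lowest_bit_first=True):
    """Nodes visited by dimension-ordered routing in a dim-cube, as bit strings."""
    order = range(dim) if lowest_bit_first else range(dim - 1, -1, -1)
    path = [src]
    current = src
    for bit in order:
        # Cross this dimension only if source and destination differ in it.
        if (current ^ dst) & (1 << bit):
            current ^= 1 << bit
            path.append(current)
    return [format(node, f"0{dim}b") for node in path]


print(ecube_path(0b010, 0b101, 3))
# ['010', '011', '001', '101']
print(ecube_path(0b010, 0b101, 3, lowest_bit_first=False))
# ['010', '110', '100', '101']
```

Because every message crosses the dimensions in the same order, no cyclic channel dependency can form, which is the usual argument for deadlock freedom on hypercubes.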
3.3 Topology: What is the height of a binary tree with 128 nodes?<br />
K: 8. L: 7. M: 6. N: None of the answers K-M is correct.<br />
3.4 Which routing strategies are deadlock-free?<br />
O: E-cube routing on hypercubes.<br />
P: XY routing on tori.<br />
Q: XY routing on 2D meshes.<br />
R: None of the answers above is correct.<br />
3.5 Topology: What is the average distance in a butterfly network with 256 nodes?<br />
S: 16. T: 4. U: 8.<br />
V: None of the answers S-U is correct.<br />
3.6 Routing in a butterfly network: Which statement is true?<br />
W: Each stage corresponds to a bit in the destination address.<br />
X: The corresponding bit of the destination address selects the<br />
output of each stage (0 or 1).<br />
Y: The corresponding bit of the destination address selects the<br />
input of each stage (0 or 1).<br />
Z: None of the answers above is correct.<br />
Question 4 (9 Points)<br />
Caches<br />
4.1 Simple cache model, 1 level only: What is the cache access time if the access time<br />
from the processor's view is 5 ns, the hit rate is 99%, and the cache access time is 1/400<br />
of the memory access time?<br />
A: 2 ns.<br />
B: 1 ns.<br />
C: 3 ns.<br />
D: None of the answers above is correct.<br />
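A minimal sketch for 4.1, assuming the single-level model t_eff = h * t_cache + (1 - h) * t_mem with t_mem = 400 * t_cache. Some texts instead charge a miss with the cache lookup plus the memory access; the hypothetical flag `miss_includes_lookup` switches to that variant. The function name is also hypothetical.

```python
def cache_access_time(t_eff, hit_rate, mem_to_cache_ratio,
                      miss_includes_lookup=False):
    """Solve the effective-access-time equation for the cache access time."""
    miss_rate = 1 - hit_rate
    # Weight of t_cache: h in the basic model, 1 if every access pays the lookup.
    cache_weight = 1.0 if miss_includes_lookup else hit_rate
    return t_eff / (cache_weight + miss_rate * mem_to_cache_ratio)


print(f"{cache_access_time(5, 0.99, 400):.3f} ns")
print(f"{cache_access_time(5, 0.99, 400, miss_includes_lookup=True):.3f} ns")
```

Both variants give a value close to 1 ns for the numbers in the question.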
4.2 Cache coherence: For which shared (virtual) memory systems is the snooping protocol<br />
not suited?<br />
E: Systems with butterfly network.<br />
F: Bus based systems.<br />
G: Systems with 3-D torus network.<br />
H: None of the answers above is correct.<br />
4.3 Snooping cache protocol: In which cases is the main memory up-to-date?<br />
I: Write-back caches: Cache data marked as exclusive.<br />
J: Write-back caches: Cache data marked as modified.<br />
K: Write-through caches: After writing to shared data.<br />
L: None of the answers above is correct.<br />
4.4 Snooping cache protocol, write-back caches: What is not an immediate effect of writing<br />
to shared data in the cache of one processor?<br />
M: Updating copies in the caches of other processors.<br />
N: Invalidating copies in the caches of other processors.<br />
O: Updating main memory.<br />
P: None of the answers above is correct.<br />
4.5 Directory-based cache coherence protocols for distributed memory systems: Which<br />
information is not necessary in the directory of each processor?<br />
Q: Status information on data in memory of other processors.<br />
R: Locations of copies of the processor's cache data.<br />
S: Status information on the processor's cache data.<br />
T: Status information on the processor's cache data + locations of copies.<br />
U: None of the answers above is correct.<br />