13.03.2013 Views

Hacking the Xbox



Hacking the Xbox: An Introduction to Reverse Engineering

trouble adjusting to an HDL than a novice, because many software tricks that are taken for granted translate very poorly to direct hardware implementation. Arrays, structures, multiplication, and division primitives are all taken for granted in the software world, but each of these constructs translates to potentially large and inefficient blocks of hardware. Furthermore, in a hardware implementation, all possible cases in a case statement exist whether or not you intend them to; neglecting to fully specify a case statement with a default case often means that extra hardware will be synthesized to handle the implicit cases. Numerous tutorials and syntax reference manuals for Verilog are indexed in Google;

Overclocking FPGA Designs

The timing models used for an FPGA are quite conservative, so an FPGA will quite likely operate properly at frequencies much higher than the timing analyzer will admit. In fact, careful hand-layout of an FPGA’s logic can stretch the performance of the FPGA much further than its stated specifications.

For example, the FPGA (Xilinx Virtex-E) used to implement the Xbox HyperTransport bus tap is only specified to handle data rates of around 200 Mbits/s/pin, but the application demanded 400 Mbits/s/pin. I could pull this off because the actual logic and storage elements can run very fast; most of the performance is burned off in the wires and repeaters that carry the signals between logic elements. Specifically, some wires have so much delay at 400 Mbits/s that they effectively store data for a single clock cycle.
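The arithmetic behind that last point can be sketched in a few lines of Python. This is only an illustration: the 3.0 ns wire delay is a hypothetical figure chosen for the example, not a measured Virtex-E value, and one bit time per pin is treated as the unit of storage.

```python
def bit_period_ns(rate_mbit_per_s: float) -> float:
    """Time that one bit occupies a pin, in nanoseconds."""
    return 1000.0 / rate_mbit_per_s


def bits_in_flight(wire_delay_ns: float, rate_mbit_per_s: float) -> int:
    """Whole bit periods that a signal spends travelling down a wire."""
    return int(wire_delay_ns // bit_period_ns(rate_mbit_per_s))


rated_period = bit_period_ns(200)     # bit time at the rated data rate
required_period = bit_period_ns(400)  # bit time the bus tap needed

# Hypothetical wire delay, assumed purely for illustration.
WIRE_DELAY_NS = 3.0

print(rated_period, required_period)
print(bits_in_flight(WIRE_DELAY_NS, 200), bits_in_flight(WIRE_DELAY_NS, 400))
```

Under these assumptions the same wire holds no full bit at 200 Mbits/s but a whole bit at 400 Mbits/s, which is the sense in which a slow wire behaves like a one-cycle storage element.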

I determined which wires were slower than the rest by capturing a sequence of data and comparing it against a pattern that I had previously discovered using an oscilloscope. Once the slow paths were identified, I inverted the clock and/or inserted flip-flops on channels that had too little delay. The end result was a set of signals that were time-skew corrected. These signals could then be trivially demultiplexed to a lower clock rate where conventionally compiled HDL design techniques could be used.
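The skew-correction procedure described above can be modelled in software. The Python sketch below is not the original FPGA design: the function names, the 16-bit reference pattern, and the two-channel setup are all assumptions made for illustration, and it handles only whole bit periods (the half-period adjustment from inverting the clock is left out).

```python
def measure_lag(captured, reference, max_lag=2):
    """Return the lag, in bit periods, at which `captured` best matches
    `reference`. A lag of k means captured[i + k] == reference[i]."""
    n = len(reference) - max_lag  # compare the same number of bits at every lag
    best_lag, best_score = 0, -1
    for lag in range(max_lag + 1):
        score = sum(captured[i + lag] == reference[i] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def extra_delays(lags):
    """Bit periods of delay to add to each channel to match the slowest one."""
    slowest = max(lags)
    return [slowest - lag for lag in lags]


def delay(bits, n, fill=0):
    """Model n flip-flops in series: shift the stream later by n bit periods."""
    return [fill] * n + list(bits[:len(bits) - n])


# Hypothetical known pattern (the role played by the oscilloscope capture).
reference = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]

fast_channel = list(reference)      # arrives on time
slow_channel = delay(reference, 1)  # wire adds one bit period

lags = [measure_lag(ch, reference) for ch in (fast_channel, slow_channel)]
fixes = extra_delays(lags)
aligned = [delay(ch, d) for ch, d in zip((fast_channel, slow_channel), fixes)]
print(lags, fixes, aligned[0] == aligned[1])
```

Note that the fix is applied to the fast channel, matching the text: the channels with too little delay receive the extra flip-flop so that every signal ends up as late as the slowest wire.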

While this technique is very powerful, it is not generally applicable because the amount of delay caused by a wire varies from chip to chip and can depend on parameters such as the ambient temperature and the quality of the power supply voltage. However, for one specific chip under controlled circumstances, I was able to get 2x the rated performance. Another important difference between this application and a more general application is that a bit error rate on the order of 1 error in a few thousand was tolerable, since I could just take three traces and XOR them to recover any information lost to random noise sources. However, 1 in 10,000 bit error rates are not acceptable for normal applications; unrecoverable error rates better than 1 in 10,000,000,000,000 are more typical. This all goes back to a saying that I have: “It is easy to do something once, but doing something a million times perfectly is hard.”
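As a rough illustration of recovering data from three noisy traces, the Python sketch below uses a bitwise majority vote. That is a swapped-in technique: the text says only that the traces were XORed, and the exact procedure is not given, so the function names, the sample trace values, and the assumption of independent errors are all hypothetical. XOR appears here in its usual supporting role of flagging the bit positions where two traces disagree.

```python
def majority3(a: int, b: int, c: int) -> int:
    """Bitwise majority of three traces stored as integer bit vectors."""
    return (a & b) | (a & c) | (b & c)


def residual_error_rate(p: float) -> float:
    """Probability that at least two of three traces are wrong on a given
    bit, assuming each trace errs independently with probability p."""
    return 3 * p**2 * (1 - p) + p**3


# Hypothetical 8-bit capture with single-bit errors in two of three traces.
truth = 0b10110010
trace1 = truth ^ 0b00000100  # bit 2 flipped by noise
trace2 = truth ^ 0b01000000  # bit 6 flipped by noise
trace3 = truth               # clean capture

disagreements = trace1 ^ trace2  # XOR marks where two traces differ
recovered = majority3(trace1, trace2, trace3)

print(bin(disagreements), recovered == truth)
print(residual_error_rate(1e-3))
```

Under the independence assumption, a per-trace error rate of 1 in 1,000 leaves roughly 3 errors per million bits after voting. That is a large improvement, but still far from the 1 in 10,000,000,000,000 figure quoted for normal applications.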
