4th International Conference on Principles and Practices ... - MADOC


tomated assembler and disassembler testing (section 5.5). We briefly discuss related work in section 6 and the paper concludes with notable observations and future work (section 7).

3. HOW TO USE THE ASSEMBLERS

Each assembler consists of the top-level package com.sun.max.asm and the subpackage matching its ISA, as listed in Figure 1. In addition, the package com.sun.max.asm.x86 is shared between the AMD64 and the IA32 assembler. Hence, to use the AMD64 assembler, the following packages are needed (footnote 4): com.sun.max.asm, com.sun.max.asm.amd64 and com.sun.max.asm.x86. None of the assemblers requires any of the packages under .gen and .dis.

To use an assembler, one starts by instantiating one of the leaf classes shown in Figure 2. The top class Assembler provides methods common to all assemblers, e.g. for label binding and for output to streams or byte arrays. The generated classes in the middle contain the ISA-specific assembly routines. For ease of use, these methods deliberately follow existing assembly reference manuals closely, with method names that mimic mnemonics and parameters that directly correspond to individual symbolic and integral operands.

Here is an example for AMD64 that creates a small sequence of machine code instructions (shown in Figure 3) in a Java byte array:

    import static ...asm.amd64.AMD64GeneralRegister64.*;
    ...
    public byte[] createInstructions() {
        long startAddress = 0x12345678L;
        AMD64Assembler asm = new AMD64Assembler(startAddress);

        Label loop = new Label();
        Label subroutine = new Label();
        // Bind 'subroutine' to an absolute address.
        asm.fixLabel(subroutine, 0x234L);
        asm.mov(RDX, 12, RSP.indirect());
        // Bind 'loop' to the instruction that follows.
        asm.bindLabel(loop);
        asm.call(subroutine);
        asm.sub(RDX, RAX);
        asm.cmpq(RDX, 0);
        asm.jnz(loop);
        asm.mov(20, RCX.base(), RDI.index(), SCALE_8, RDX);
        return asm.toByteArray();
    }

Instead of using a byte array, assembler output can also be directed to a stream (e.g. to write to a file or into memory):

    OutputStream stream = new ...Stream(...);
    asm.output(stream);

Footnote 4: In addition, general purpose packages from MaxwellBase and the JRE are needed.

The above example illustrates two different kinds of label usage. Label loop is bound to the instruction following the bindLabel() call. In contrast, label subroutine is bound to an absolute address. In both cases, though, the assembler creates PC-relative code by computing the respective offset argument (footnote 5). An explicit non-label argument can be expressed by using int (or sometimes long) values instead of labels, as in:

    asm.call(200);

The variant of call() used here is defined in the raw assembler (AMD64RawAssembler) superclass of our assembler and it takes a “raw” int argument:

    public void call(int rel32) { ... }

In contrast, the call() method used in the first example is defined in the label assembler (AMD64LabelAssembler), which sits between our assembler class and the raw assembler class:

    public void call(Label label) {
        ... call(labelOffsetAsInt(label)); ...
    }

This method builds on the raw call() method, as sketched in its body.
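The layering of a label-based call() on top of a raw call() can be illustrated with a small self-contained sketch. Everything in it is an assumption made for illustration: the classes SketchLabel, RawAsmSketch and LabelAsmSketch are hypothetical stand-ins, not the Maxwell classes, and the sketch only handles labels that are already bound (no forward references). The only external fact it relies on is the x86 encoding of CALL rel32: opcode 0xE8 followed by a little-endian 32-bit offset that is relative to the end of the call instruction.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical label type for this sketch; not the assembler's real Label class.
final class SketchLabel {
    private boolean bound;
    private long address;

    void fixTo(long absoluteAddress) {
        this.address = absoluteAddress;
        this.bound = true;
    }

    boolean isBound() { return bound; }
    long address() { return address; }
}

// "Raw" layer: takes already computed integral operands.
class RawAsmSketch {
    private final long startAddress;
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    RawAsmSketch(long startAddress) { this.startAddress = startAddress; }

    // Address of the next byte that will be emitted.
    long currentAddress() { return startAddress + out.size(); }

    // x86 CALL rel32: opcode 0xE8, then the offset in little-endian byte order.
    void call(int rel32) {
        out.write(0xE8);
        for (int shift = 0; shift < 32; shift += 8) {
            out.write((rel32 >>> shift) & 0xFF);
        }
    }

    byte[] toByteArray() { return out.toByteArray(); }
}

// "Label" layer: turns a label into a PC-relative offset and delegates to the raw layer.
class LabelAsmSketch extends RawAsmSketch {
    private static final int CALL_REL32_LENGTH = 5; // 1 opcode byte + 4 offset bytes

    LabelAsmSketch(long startAddress) { super(startAddress); }

    // Binds the label to the position of the next emitted instruction.
    void bindLabel(SketchLabel label) { label.fixTo(currentAddress()); }

    void call(SketchLabel label) {
        if (!label.isBound()) {
            throw new IllegalStateException("forward references are not handled in this sketch");
        }
        long endOfCall = currentAddress() + CALL_REL32_LENGTH;
        call(Math.toIntExact(label.address() - endOfCall)); // overload taking an int
    }
}

class LabelCallDemo {
    public static void main(String[] args) {
        LabelAsmSketch asm = new LabelAsmSketch(0x1000L);
        SketchLabel subroutine = new SketchLabel();
        subroutine.fixTo(0x2000L);
        asm.call(subroutine);
        StringBuilder hex = new StringBuilder();
        for (byte b : asm.toByteArray()) {
            hex.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // prints "e8 fb 0f 00 00"
    }
}
```

In the demo, the call is emitted at 0x1000 and ends at 0x1005, so a target fixed at 0x2000 yields the offset 0xFFB. The same label-based call() works for a label bound with bindLabel(), which is how both kinds of label usage end up as PC-relative code in this sketch.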

These methods, like many others, are syntactically differentiated by means of parameter overloading. This Java language feature is also leveraged to distinguish whether a register is used directly, indirectly, or in the role of a base or an index. For example, the expression RSP.indirect() above results in a different Java type than plain RSP, thus clarifying which addressing mode the given mov instruction must use. Similarly, RCX.base() specifies a register in the role of a base, etc.
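How distinct Java types let the compiler pick an addressing mode can be shown in a few lines. This is a minimal sketch under stated assumptions: SketchReg, IndirectSketchReg and OverloadSketch are hypothetical names, the operand order (destination first) is assumed for illustration, and the methods return a textual description instead of emitting machine code.

```java
// Hypothetical register enum; indirect() wraps the register in a different Java type.
enum SketchReg {
    RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI;

    IndirectSketchReg indirect() { return new IndirectSketchReg(this); }
}

// Marker type meaning "the memory location this register points to".
final class IndirectSketchReg {
    final SketchReg reg;

    IndirectSketchReg(SketchReg reg) { this.reg = reg; }
}

class OverloadSketch {
    // Chosen by the compiler when both operands are plain registers.
    static String mov(SketchReg destination, SketchReg source) {
        return "mov " + destination + ", " + source;
    }

    // Chosen by the compiler when the source is an indirect register.
    static String mov(SketchReg destination, IndirectSketchReg source) {
        return "mov " + destination + ", [" + source.reg + "]";
    }

    public static void main(String[] args) {
        System.out.println(mov(SketchReg.RDX, SketchReg.RAX));            // register form
        System.out.println(mov(SketchReg.RDX, SketchReg.RSP.indirect())); // indirect form
    }
}
```

Because the choice is made by overload resolution at compile time, passing an operand combination for which no overload exists is rejected by the Java compiler rather than failing at run time.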

If an argument has a relatively limited range of valid values, a matching enum class rather than a primitive Java type is defined as the parameter type. This is the case, for instance, for SCALE_8 in the SIB addressing expression above. Its type is declared as follows:

    public enum Scale ... {
        SCALE_1, SCALE_2, SCALE_4, SCALE_8;
        ...
    }
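To make the benefit of an enum parameter concrete, the following sketch shows how a scale enum could feed the SIB byte. It is an illustration under assumptions, not the generated code: ScaleSketch and SibSketch are hypothetical names, and only the low three bits of the index and base register numbers are handled (registers that need a REX prefix are out of scope). The SIB layout itself (scale in bits 7-6, index in bits 5-3, base in bits 2-0) is the standard x86 encoding.

```java
// Hypothetical scale enum; the ordinal doubles as the two-bit SIB scale field.
enum ScaleSketch {
    SCALE_1, SCALE_2, SCALE_4, SCALE_8;

    // Multiplication factor applied to the index register: 1, 2, 4 or 8.
    int factor() { return 1 << ordinal(); }

    // Encoded value of the scale field: 0, 1, 2 or 3.
    int bits() { return ordinal(); }
}

class SibSketch {
    // Builds a SIB byte from a scale and two 3-bit register numbers.
    static int sib(ScaleSketch scale, int indexNumber, int baseNumber) {
        if (indexNumber < 0 || indexNumber > 7 || baseNumber < 0 || baseNumber > 7) {
            throw new IllegalArgumentException("register numbers must be in 0..7");
        }
        return (scale.bits() << 6) | (indexNumber << 3) | baseNumber;
    }

    public static void main(String[] args) {
        // RCX has register number 1 and RDI has register number 7 in the x86 encoding.
        int sibByte = sib(ScaleSketch.SCALE_8, 7, 1);
        System.out.println(Integer.toHexString(sibByte)); // prints "f9"
    }
}
```

With an int parameter, a caller could pass an invalid scale such as 3 and the error would only surface at run time; with the enum, only the four legal values can be written at all.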

Each RISC assembler features synthetic instructions according to the corresponding reference manual. For instance, one can write these statements to create some synthetic SPARC instructions [20]:

    import static ...asm.sparc.GPR.*;

    SPARC32Assembler asm = new SPARC32Assembler(...);
    asm.nop();
    asm.set(55, G3);
    asm.inc(4, G7);
    asm.retl();
    ...

Let’s take a look at the generated source code of one of these methods:

Footnote 5: In our current implementation, labels always generate PC-relative code, i.e. absolute addressing is only supported by the raw assemblers.
