
4th International Conference on Principles and Practices ... - MADOC


The Project Maxwell Assembler System

Bernd Mathiske, Doug Simon, Dave Ungar

Sun Microsystems Laboratories

16 Network Circle, Menlo Park, CA 94025, USA

{Bernd.Mathiske,Doug.Simon,David.Ungar}@sun.com

ABSTRACT

The Java™ programming language is primarily used for platform-independent programming. Yet it also offers many productivity, maintainability and performance benefits for platform-specific functions, such as the generation of machine code.

We have created reliable assemblers for SPARC™, AMD64, IA32 and PowerPC that support all user-mode and privileged instructions, with 64-bit mode support for all but PowerPC. These assemblers are generated as Java source code by our extensible assembler framework, which itself is written in the Java language. The assembler generator also produces javadoc comments that precisely specify the legal values for each operand.

Our design is based on the Klein Assembler System written in Self. Assemblers are generated from a specification, as are table-driven disassemblers and unit tests. The specifications that drive the generators are expressed as Java language objects. Thus no extra parsers are needed and developers do not need to learn any new syntax to extend the framework for additional ISAs.
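To illustrate what expressing a specification as Java language objects can look like, here is a minimal sketch. It is not the framework's actual API: the class names `ToySpec`, `Field` and `Template`, the `ADDI` template and its bit layout are all invented for illustration and do not correspond to any real ISA.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: an instruction format described by plain Java objects. */
public class ToySpec {

    /** A named bit field occupying bits [shift, shift + width) of a 32-bit word. */
    public static final class Field {
        final String name;
        final int shift;
        final int width;

        public Field(String name, int shift, int width) {
            this.name = name;
            this.shift = shift;
            this.width = width;
        }

        /** Places an unsigned value into this field, rejecting values that do not fit. */
        public int encode(int value) {
            int mask = (1 << width) - 1; // assumes width < 32
            if ((value & ~mask) != 0) {
                throw new IllegalArgumentException(name + " out of range: " + value);
            }
            return value << shift;
        }
    }

    /** An instruction template: fixed opcode bits plus an ordered list of operand fields. */
    public static final class Template {
        final String mnemonic;
        final int opcodeBits;
        final List<Field> operands;

        public Template(String mnemonic, int opcodeBits, Field... operands) {
            this.mnemonic = mnemonic;
            this.opcodeBits = opcodeBits;
            this.operands = Arrays.asList(operands);
        }

        /** Combines the opcode bits with one encoded value per operand field. */
        public int assemble(int... values) {
            if (values.length != operands.size()) {
                throw new IllegalArgumentException(
                    mnemonic + " expects " + operands.size() + " operands");
            }
            int word = opcodeBits;
            for (int i = 0; i < values.length; i++) {
                word |= operands.get(i).encode(values[i]);
            }
            return word;
        }
    }

    // Invented toy layout: opcode in bits 31..26, rd in 25..21, rs in 20..16, imm in 15..0.
    public static final Template ADDI = new Template("addi", 0x08 << 26,
        new Field("rd", 21, 5), new Field("rs", 16, 5), new Field("imm", 0, 16));

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(ADDI.assemble(1, 2, 3))); // prints 20220003
    }
}
```

Because the description is ordinary Java data, the same objects could in principle drive an assembler, a disassembler and a test generator without a separate specification parser.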

Every generated assembler is tested against a preexisting assembler by comparing the output of both. Each instruction's test cases are derived from the cross product of its potential operand values. The majority of tests are positive (i.e., result in a legal instruction encoding). The framework also generates negative tests, which an assembler is expected to reject with an error. As with the Klein Assembler System, we have found bugs in the external assemblers as well as in ISA reference manuals.
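The cross product of operand values can be sketched as follows. This is an illustrative reconstruction rather than the framework's own generator: the class `OperandCrossProduct`, its method name and the sample operand values are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: enumerates test cases as the cross product of operand values. */
public class OperandCrossProduct {

    /** Returns every tuple that picks one value from each operand's candidate list. */
    public static List<List<Object>> crossProduct(List<List<Object>> candidatesPerOperand) {
        List<List<Object>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start with a single empty tuple
        for (List<Object> candidates : candidatesPerOperand) {
            List<List<Object>> extended = new ArrayList<>();
            for (List<Object> prefix : result) {
                for (Object value : candidates) {
                    List<Object> tuple = new ArrayList<>(prefix);
                    tuple.add(value);
                    extended.add(tuple);
                }
            }
            result = extended; // each pass appends one more operand position
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Object>> operands = Arrays.asList(
            Arrays.<Object>asList("r0", "r1"),  // invented register names
            Arrays.<Object>asList(-1, 0, 1));   // invented immediate values
        for (List<Object> testCase : crossProduct(operands)) {
            System.out.println(testCase);       // 2 * 3 = 6 tuples in total
        }
    }
}
```

The number of tuples is the product of the candidate list sizes, which is why the number of generated tests grows quickly with the number of operands.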

Our framework generates tens of millions of tests. For symbolic operands, our tests include all applicable predefined constants. For integral operands, the important boundary values, such as the respective minimum, maximum, 0, 1 and -1, are tested. Full testing can take hours to run but gives us a high degree of confidence regarding correctness.
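As a small sketch of the boundary-value selection for integral operands, the following assumes a signed two's-complement operand of a given bit width. The class `BoundaryValues`, the method name and the bit-width parameter are assumptions made for illustration; the text above does not say how the framework represents operand ranges.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch: boundary test values for a signed integral operand. */
public class BoundaryValues {

    /** Returns minimum, -1, 0, 1 and maximum for a signed operand of 1 to 64 bits. */
    public static Set<Long> signedBoundaryValues(int bitWidth) {
        if (bitWidth < 1 || bitWidth > 64) {
            throw new IllegalArgumentException("bit width must be in 1..64: " + bitWidth);
        }
        long min = -1L << (bitWidth - 1); // e.g. -128 for 8 bits
        long max = ~min;                  // e.g. 127 for 8 bits
        Set<Long> values = new LinkedHashSet<>(); // keeps insertion order, drops duplicates
        values.add(min);
        values.add(-1L);
        values.add(0L);
        values.add(1L);
        values.add(max);
        // Very narrow operands cannot hold every candidate (a 1-bit field holds only -1 and 0).
        values.removeIf(v -> v < min || v > max);
        return values;
    }

    public static void main(String[] args) {
        System.out.println(signedBoundaryValues(8)); // prints [-128, -1, 0, 1, 127]
    }
}
```

Feeding such value sets into a cross product keeps the test count bounded while still covering the encodings most likely to expose range and sign errors.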

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

PPPJ 2006, August 30 – September 1, 2006, Mannheim, Germany.

Copyright 2006 ACM ...$5.00.

Keywords

cross assembler, assembler generator, disassembler, automated testing, the Java language, domain-specific framework, systems programming

1. INTRODUCTION AND MOTIVATION

Even though the Java programming language is designed for platform-independent programming, many of its attractions¹ are clearly more generally applicable and thus also carry over to platform-specific tasks. For instance, popular integrated development environments (IDEs) that are written in the Java language have been extended (see e.g. [5]) to support development in languages such as C/C++, which get statically compiled to platform-specific machine code. Except for legacy program reuse, we see no reason why compilers in such an environment should not enjoy all the usual advantages attributed to developing software in the Java language (in contrast to C/C++). Furthermore, several Java virtual machines have been written in the Java language (e.g., [3], [21], [14]), including compilers from byte code to machine code.

With the contributions presented in this paper we intend to encourage and support further compiler construction research and development in Java. Our software relieves programmers of arguably the most platform-specific task of all, the correct generation of machine instructions adhering to existing general purpose instruction set architecture (ISA) specifications.

We focus on this low-level issue in clean separation from any higher level tasks such as instruction selection, instruction scheduling, addressing mode selection, register allocation, or any kind of optimization. This separation of concerns allows us to match our specifications directly and uniformly to existing documentation (reference manuals) and to exploit pre-existing textual assemblers for systematic, comprehensive testing. Thus our system virtually eliminates an entire class of particularly hard-to-find bugs, and users gain a foundation of trust on which to build further compiler layers.

Considering different approaches for building assemblers, we encounter these categories:

¹ To name just a few: automatic memory management, generic static typing, object orientation, exception handling, excellent IDE support, large collection of standard libraries.
