MaJIC: Compiling MATLAB for Speed and Responsiveness*
ABSTRACT

This paper presents and evaluates techniques to improve the execution performance of MATLAB. Previous efforts concentrated on source-to-source translation and batch compilation; MaJIC provides an interactive front end that looks like MATLAB and compiles/optimizes code behind the scenes in real time, employing a combination of just-in-time and speculative ahead-of-time compilation. Performance results show that the proper mixture of these two techniques can yield near-zero response time as well as performance gains previously achieved only by batch compilers.
Categories and Subject Descriptors
D.3.4 [Programming Languages]: Interpreters, Compilers, Code Generation, Run-time environments

General Terms
Design, Languages, Algorithms, Performance
1. INTRODUCTION

MATLAB [15], a product of MathWorks Inc., is a popular programming language and development environment for numeric applications. The MATLAB programming language resembles FORTRAN 90 in that it deals with vectors and matrices, but unlike FORTRAN it is weakly typed and polymorphic.

The main strengths of MATLAB lie in its interactive nature, which makes it a handy exploration tool, and in the richness of its precompiled libraries and toolboxes.

The main weakness of MATLAB is its slow execution, especially when compared to similarly written FORTRAN code. Because MATLAB is weakly typed, the interpreter in the development environment has to check types at runtime, resulting in substantial performance loss.
∗ This work was supported in part by NSF contract ACI98-70687.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
PLDI'02, June 17-19, 2002, Berlin, Germany.
Copyright 2002 ACM 1-58113-463-0/02/0006 ...$5.00.
George Almási and David Padua
galmasi,padua@cs.uiuc.edu
Department of Computer Science
University of Illinois at Urbana-Champaign
Previous work with MATLAB-to-FORTRAN translators, notably the FALCON compiler [9, 8], has shown performance increases of up to three orders of magnitude by employing compile-time type analysis to reduce the number of runtime checks.

MaJIC (Matlab Just-In-time Compiler) aims to achieve the same performance goals without sacrificing the interactive nature of MATLAB. Like FALCON, it attempts to remove the overhead of runtime type checks by compiling code instead of interpreting it. Unlike FALCON, which is a batch compiler, MaJIC preserves interactive behavior by minimizing, or hiding, compilation time. MaJIC attempts to compile code ahead of time by speculation; whenever speculation fails, MaJIC falls back to just-in-time compilation.
MaJIC's dynamic (JIT) compiler reduces compile time as much as possible. It consists of an extremely fast type inference engine and a relatively naive, but fast, code generation engine. Compilation is performed as late as possible in order to gather more runtime information, on the premise that better runtime information allows the compiler to skip time-consuming optimization steps.

In addition to JIT compilation, MaJIC also performs speculative ahead-of-time compilation. Looking at source code only, the compiler guesses the run-time context most likely to occur in practice. If the guess is correct, the end result is highly optimized code that will have been compiled by the time it is needed, effectively hiding compilation latency. A wrong guess by the compiler results, at worst, in degraded performance, but never affects program correctness: MaJIC contains a mechanism to ensure that code is only executed if its semantics are guaranteed.

The rest of this paper is structured as follows. Section 2 describes the software architecture of MaJIC and optimization techniques related to JIT type inference and speculative type inference. Section 3 presents and analyzes the performance results we obtained. Section 4 offers a brief survey of related work. In Section 5 we present our conclusions.
2. SOFTWARE ARCHITECTURE

MaJIC's users interact with a MATLAB-like front end: a compatible interpreter that can execute MATLAB code at approximately MATLAB's original speed. However, MaJIC's front end doesn't attempt to execute all code: it defers computationally complex tasks (in the current implementation, function calls) to the code repository. To pass work to the repository, the MaJIC front end builds an invocation containing the name of a MATLAB function and the values of the parameters (if any).
The code repository is a database of compiled code. It compiles code on its own, ahead of time, by snooping the source code directories, maintaining dependency information between source code and object code, and triggering recompilations when the source code changes. The repository can also compile code as a result of user actions (such as invoking MATLAB functions).

The code repository collects the type information necessary for compiling MATLAB code. This type information comes from different sources: directly from the user (i.e. when the user calls a function directly), from earlier runs of the same code, or from the type speculator.

The code repository responds to requests for compiled code by the interpreter. It has a type matching system (described in Section 2.2.1) that allows the retrieval of semantically correct compiled code for a given invocation by the interpreter. A failure to find appropriate code usually triggers a compilation; since this typically happens during program execution, where time is at a premium, the JIT compiler is used in this situation. The generated code can later be recompiled (and replaced in the repository) using a better compiler.

The compiler itself has the task of turning source code into executable code. The compiler's passes are shown in Figure 1.
• The first pass is a scanner/parser which transforms MATLAB source into an abstract syntax tree (AST). MaJIC's parser is based on FALCON's parser with a few minor improvements.

• Next, preliminary data flow analysis (disambiguation) is performed to build a static symbol table. At this point the compiler can optionally perform function inlining (which then necessitates the re-building of the symbol table).

• When the symbol table is complete, the compiler performs type inference. This pass conservatively assigns types to all expressions in the program text. In JIT compilation mode, the type inference engine uses runtime information fed to it by the repository; in speculative mode, the inference engine uses only the AST and the symbol table and produces speculative results.
• The last step of the compilation is code generation. There is one code generator each for JIT and speculative mode. The JIT code generator builds code fast and in memory; in speculative mode, the code generator builds C or Fortran source code, which is then compiled and linked with platform-native tools.

Figure 1: MaJIC compiler passes.

In the next few sections we present some of the compiler passes in more detail.
2.1 Disambiguating MATLAB symbols

Other than keywords, symbols in MATLAB can represent variables, calls to built-in primitives, or calls to user functions. The interpreter recognizes a symbol as a variable when it appears on the left side of an assignment, or else if it has an entry in the dynamic symbol table of the interpreter. A symbol not recognized as a variable is potentially a built-in primitive; if it cannot be resolved as either a variable or a built-in, the MATLAB interpreter also consults the dynamic table of existing user functions. If the symbol cannot be found there either, its occurrence is treated as an error.

Unlike the MATLAB interpreter, MaJIC needs to identify symbol meanings at compile time; but some symbols' meanings are hard to determine without running the code.
Figure 2 shows code with ambiguous symbols. The left box shows a loop where the first occurrence of the symbol i is ambiguous, interpreted by MATLAB as √−1 in the first iteration and as a variable in all following iterations.

The right code box contains a loop where compiler analysis would recognize the right-hand-side occurrence of y as a possibly undefined variable, or even a user function, if control flow is not taken into account. Looking at control flow, however, makes it obvious that y can only be accessed after having been defined.

    clear
    while(...),
        z = i;
        i = z+1;
    end

    clear
    x = 0;
    for p=1:N,
        if (p >= 2), x = y; end
        y = p;
    end

Figure 2: Ambiguous symbols in MATLAB
Ambiguous symbols are rare in practice and almost always a sign of buggy code. MaJIC does deal with them: it defers their processing until runtime. Non-ambiguous variables can, however, be identified at compile time by a variation of reaching definitions analysis: a symbol that has a reaching definition as a variable on all paths leading to it must be a variable. This analysis is the first pass of the MaJIC compiler.
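The analysis above can be illustrated with a small sketch (a hypothetical simplification for a single symbol; the CFG encoding and function names are ours, not MaJIC's). It is a forward must-analysis: a symbol is definitely a variable at a node only if it was assigned on every path reaching that node.

```c
#include <assert.h>

/* Hypothetical sketch of "variable on all paths" analysis for one symbol:
   a forward must-analysis over a CFG whose meet operator is logical AND. */
#define MAXPRED 16

typedef struct {
    int npred;
    int pred[MAXPRED];
    int assigns_symbol;      /* node contains "sym = ..." */
} node;

/* in[i]/out[i]: symbol is definitely a variable before/after node i.
   Node 0 is the entry, where the symbol is not yet a variable. */
void solve(const node *cfg, int n, int in[], int out[]) {
    for (int i = 0; i < n; i++) in[i] = out[i] = 0;
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            int v = (i == 0) ? 0 : 1;          /* meet over predecessors */
            for (int p = 0; p < cfg[i].npred; p++)
                v = v && out[cfg[i].pred[p]];
            int o = v || cfg[i].assigns_symbol;
            if (v != in[i] || o != out[i]) { in[i] = v; out[i] = o; changed = 1; }
        }
    }
}
```

On the left-hand example of Figure 2, the use of i in the loop body has a predecessor path (entry → header) with no assignment, so the analysis correctly reports it as not definitely a variable, i.e. ambiguous.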
2.2 The type system

MaJIC's type system is used by the type inference engine and by the code repository. The type system is inspired by that of FALCON, which in turn was influenced by the APL [6] and SETL compilers. MaJIC's notion of a type is represented by the Cartesian product of several lattices, as follows:
The intrinsic type of an expression is an element of the finite lattice Li formed by the elements real, integer, boolean, complex and string, with the requisite comparison operator:

Li = {J, ⊥i, ⊤i, ⊑i, ⊔i}, where
J = {⊥i, bool, int, real, cplx, strg, ⊤i},
⊥i ⊑i bool ⊑i int ⊑i real ⊑i cplx ⊑i ⊤i, and
⊥i ⊑i strg ⊑i ⊤i
A MaJIC expression's shape Ls consists of a pair of values, one each for the number of rows and columns of the expression. In the current version of MaJIC we only consider Fortran-like two-dimensional shapes:

Ls = {N × N, ⊥s, ⊤s, ⊑s, ⊔s}, where
⊥s = ⟨0, 0⟩, ⊤s = ⟨∞, ∞⟩;
⟨a, b⟩ ⊑s ⟨c, d⟩ iff a ≤ c and b ≤ d
An expression's range Ll is the interval of values the expression can take [4]. We define ranges only for real numbers; strings and complex expressions do not have associated ranges. The two numbers in the range define the (inclusive) lower and upper limits of an interval. The lower limit is always less than or equal to the upper limit, or else the range is malformed:

Ll = {R × R, ⊥l, ⊤l, ⊑l, ⊔l}, where
⊥l = ⟨nan, nan⟩; ⊤l = ⟨−∞, ∞⟩;
⟨a, b⟩ ⊑l ⟨c, d⟩ iff ⟨a, b⟩ = ⊥l or (c ≤ a and b ≤ d)
The type system is the Cartesian product T = Li × Ls × Ls × Ll. Ls appears twice because MaJIC tracks lower as well as upper bounds of shape descriptors. We will use the collective name "shape" to mean both descriptors together. Thus the type system consists of intrinsic type, shape and range information.
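As a concrete illustration, the join ⊔i on the intrinsic-type lattice Li could be implemented along these lines (a hypothetical sketch; the enum encoding and helper names are ours, not MaJIC's):

```c
#include <assert.h>

/* Sketch of MaJIC's intrinsic-type lattice Li.
   Numeric chain: bot ⊑ bool ⊑ int ⊑ real ⊑ cplx ⊑ top;
   strg sits on a separate chain bot ⊑ strg ⊑ top. */
typedef enum { T_BOT, T_BOOL, T_INT, T_REAL, T_CPLX, T_STRG, T_TOP } itype;

static int chain_rank(itype t) {   /* position on the numeric chain, -1 if off it */
    switch (t) {
    case T_BOT:  return 0;
    case T_BOOL: return 1;
    case T_INT:  return 2;
    case T_REAL: return 3;
    case T_CPLX: return 4;
    case T_TOP:  return 5;
    default:     return -1;        /* strg */
    }
}

/* Least upper bound (join, ⊔i) of two intrinsic types. */
itype itype_join(itype a, itype b) {
    if (a == b) return a;
    if (a == T_BOT) return b;
    if (b == T_BOT) return a;
    /* strg joined with any numeric type can only go to top */
    if (a == T_STRG || b == T_STRG) return T_TOP;
    return chain_rank(a) > chain_rank(b) ? a : b;
}
```

The join is what the inference engine applies at CFG merge points: a value that is int on one path and real on another is typed real after the merge, while int merged with strg must conservatively go to ⊤.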
2.2.1 Type signatures

Suppose that a function we are compiling has n formal parameters {f1, f2, ... fn}. We assign the following types to the parameters: T = {T1, T2, ... Tn}, where Ti ∈ T, 1 ≤ i ≤ n. We call T the type signature of the compiled code.

We use type signatures to determine whether compiled code is safe to execute, given a particular invocation. MaJIC generates code in such a way that an invocation of the compiled code with the actual parameters {a1, a2, ... an} having types {Q1, Q2, ... Qn} is safe if Qi ⊑ Ti, 1 ≤ i ≤ n. An actual invocation is safe as long as the types of the inputs are subtypes of the type signature of the compiled code.
The code repository may contain, at any time, several compiled versions of the same code, differing only in the assumptions about the types of input parameters (Figure 3 shows a simple function with a single parameter as an example). The function locator has to match a given invocation to a version of compiled code in the repository that is safe to execute (i.e. preserves the semantics of the program) and at the same time is optimal performance-wise. In order to do so, the function locator checks the type signature of the invocation against the signatures of the existing compiled objects in the repository, until a matching object is found or all repository objects are exhausted. When several matching objects exist, the code repository uses simple heuristics to find the best matching candidate for a particular call, based on a Manhattan-like "distance" between the type signature of the invocation and the matching compiled code.
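The locator's safety check and distance heuristic can be sketched roughly as follows (a hypothetical simplification using only the intrinsic-type component of signatures, with the string type omitted; the struct layout, names and distance weights are illustrative, not MaJIC's actual ones):

```c
#include <assert.h>
#include <stddef.h>

/* Intrinsic types on the numeric chain bot < bool < int < real < cplx < top;
   the enum rank doubles as a crude lattice height. */
typedef enum { BOT, BOOL, INT, REAL, CPLX, TOP } itype;

typedef struct {
    size_t nparams;
    const itype *sig;     /* type signature the code was compiled for */
    void *code;           /* compiled entry point */
} compiled_version;

/* Safe iff every actual type is a subtype of the formal type. */
static int is_safe(const itype *actual, const compiled_version *v, size_t n) {
    if (v->nparams != n) return 0;
    for (size_t i = 0; i < n; i++)
        if (actual[i] > v->sig[i]) return 0;
    return 1;
}

/* Manhattan-like distance: sum of per-parameter lattice gaps.
   Smaller distance means a more specialized (likely faster) version. */
static int distance(const itype *actual, const compiled_version *v, size_t n) {
    int d = 0;
    for (size_t i = 0; i < n; i++)
        d += (int)v->sig[i] - (int)actual[i];
    return d;
}

/* Pick the safe version closest to the invocation, or NULL (=> JIT compile). */
const compiled_version *locate(const itype *actual, size_t n,
                               const compiled_version *versions, size_t nv) {
    const compiled_version *best = NULL;
    int best_d = 0;
    for (size_t i = 0; i < nv; i++) {
        if (!is_safe(actual, &versions[i], n)) continue;
        int d = distance(actual, &versions[i], n);
        if (!best || d < best_d) { best = &versions[i]; best_d = d; }
    }
    return best;
}
```

For an int-typed actual argument and a repository holding a real version and a cplx version, both are safe, but the real version is closer and gets picked; a NULL result is what triggers the JIT compilation described above.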
2.3 Type inference

The type inference engine is an iterative, join-of-all-paths monotonic data-flow analysis framework [17]. It starts out with the control flow graph (CFG) of a MATLAB program and, in the case of JIT type inference, a type signature T (where |T| is equal to the number of formal parameters of the function that is being compiled). The result of type inference is a set of type annotations S, one type for each expression node in the abstract syntax tree. S is a conservative estimate of the types that expression nodes can assume during execution. The annotations are later used by the code generator.

Because MaJIC has a relatively simple type system, and because the type inference engine avoids symbolic computation and caps the number of iterations, the type inference engine is fast enough for use by the JIT compiler.
2.3.1 Transfer functions

The transfer functions of the type inference engine are implemented as a set of rules in a type calculator. The calculator has two modes of operation: in forward mode it infers expression types from argument types; in backward mode it infers argument types from expression types (this mode is used by the type speculator).

Multiple type calculation rules may exist for each AST node type. Each rule is guarded by a boolean precondition. When the type calculator is invoked with a particular AST node as argument, the corresponding rules' preconditions are tested in order until one evaluates to true; that rule is then applied to calculate the result(s).

A rational way of ordering type inference rules is to progress from the most restrictive to the least restrictive. Evaluating more restrictive rules first makes sense because these generally lead to better performance, whereas more general rules tend to yield generic, low-performance code. If no rule's precondition evaluates to true, the type calculator applies the implicit default rule: all output types are set to ⊤. This allows the type inference engine to behave conservatively for language constructs that have no corresponding rules in the database.
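The guarded-rule dispatch can be sketched as a table of precondition/transfer pairs tried in order (an illustrative simplification with two rules for "*" and made-up names; MaJIC's actual calculator has many more rules and richer types):

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BOT, BOOL, INT, REAL, CPLX, TOP } itype;

/* A guarded type-calculation rule for a two-operand AST node. */
typedef struct {
    int   (*guard)(itype a, itype b);   /* precondition */
    itype (*apply)(itype a, itype b);   /* transfer function */
} rule;

static int   both_int(itype a, itype b)   { return a == INT && b == INT; }
static itype yield_int(itype a, itype b)  { (void)a; (void)b; return INT; }
static int   both_real(itype a, itype b)  { return a <= REAL && b <= REAL; }
static itype yield_real(itype a, itype b) { (void)a; (void)b; return REAL; }

/* Rules for "*", ordered most restrictive first. */
static const rule mul_rules[] = {
    { both_int,  yield_int  },    /* integer scalar multiply */
    { both_real, yield_real },    /* real scalar multiply */
};

/* Try each guard in order; fall back to the implicit default rule. */
itype calc(const rule *rules, size_t n, itype a, itype b) {
    for (size_t i = 0; i < n; i++)
        if (rules[i].guard(a, b))
            return rules[i].apply(a, b);
    return TOP;                   /* default: behave conservatively */
}
```

With this ordering, int*int dispatches to the specialized integer rule, int*real falls through to the real rule, and cplx*real hits the conservative default.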
Thus, for example, the "*" operator in MaJIC can be evaluated successively as an instance of: integer scalar multiply; real scalar multiply; complex scalar multiply; real scalar × vector or vector × scalar; part of a dgemv operation; or a generic complex matrix multiply. This does not exhaust all possibilities, but these are the categories for which MaJIC can generate successively less optimized code.

MATLAB code:

    function p=poly1(x)
    p = x.^5+3*x+2;
    return

Generated code (C + MATLAB C library functions), one version per type signature:

itype(x)=int, shape(x)=scalar, limits(x)=⟨3,3⟩:

    int poly1_sig0() {
        return 254;
    }

itype(x)=int, shape(x)=scalar, limits(x)=⊤l:

    int poly1_sig1(int x) {
        return x*x*x*x*x + 3*x + 2;
    }

itype(x)=real, shape(x)=scalar, limits(x)=⊤l:

    double poly1_sig2(double x) {
        return x*x*x*x*x + 3.0*x + 2.0;
    }

itype(x)=real, minshape(x)=maxshape(x) (an exact 3-element shape), limits(x)=⊤l:

    double *poly1_sig3(double x[3]) {
        static double tmp2[3];
        tmp2[0] = x[0]*x[0]*x[0]*x[0]*x[0] + 3.0*x[0] + 2.0;
        tmp2[1] = x[1]*x[1]*x[1]*x[1]*x[1] + 3.0*x[1] + 2.0;
        tmp2[2] = x[2]*x[2]*x[2]*x[2]*x[2] + 3.0*x[2] + 2.0;
        return tmp2;
    }

itype(x)=complex, shape(x)=⊤s, limits(x)=⊤l:

    mxArray *poly1_sig4(mxArray *x) {
        mxArray *tmp1 = mlfScalar(5.0);
        mxArray *tmp2 = mlfPower(x, tmp1);   mxFree(tmp1);
        mxArray *tmp3 = mlfScalar(3.0);
        mxArray *tmp4 = mlfTimes(tmp3, x);   mxFree(tmp3);
        mxArray *tmp5 = mlfPlus(tmp2, tmp4); mxFree(tmp2); mxFree(tmp4);
        mxArray *tmp6 = mlfScalar(2.0);
        mxArray *tmp7 = mlfPlus(tmp5, tmp6); mxFree(tmp5); mxFree(tmp6);
        return tmp7;
    }

Figure 3: Type signatures and generated code. The operators itype(x), shape(x) and limits(x) refer to type components from the type lattice defined earlier.

Currently, MaJIC's type calculator contains about 250 rules. Each MATLAB expression/operator type has at least one entry in the database; many of MATLAB's built-in functions have several entries each. Our current implementation covers just enough of MATLAB to execute the benchmarks efficiently. The type inference engine can handle all other language features by resorting to the default rule.
2.4 JIT type inference

In JIT mode, the type calculator performs only forward analysis. Type inference propagates the type signature of the function to calculate type annotations for the function body. Since the type inference system is biased towards speed at the expense of precision, one would expect the quality of type annotations to suffer when performing just-in-time type inference. However, JIT type inference operates with very precise initial data: the type signature of the code, derived directly from the input values of the runtime invocation. Under these circumstances type inference is not only precise but lends itself to a number of extensions, which extract additional information from the type inference process at little or no additional cost:
• Constant propagation: Range propagation (the part of type inference which deals with the Ll lattice) can be thought of as a generalization of constant propagation for real scalars. A real value is a constant if its lower and upper limits are equal. Given a type signature that contains many constants, most of the transfer functions are able to calculate exact lower and upper limits for scalar objects, effectively performing constant propagation as part of type inference.

Range propagation does not work for complex numbers and non-numeric values, so constant propagation does not work for these either.
• Exact shape inference: MaJIC propagates lower and upper bounds for array shape information. An array's shape is exactly determined if the lower and upper shape bounds are equal. Just as constants can be determined given good input data, exact array shapes can be determined also. Sometimes value range propagation and shape propagation collaborate on determining exact shapes. For example, in the statement A = zeros(m,n), the value ranges of m and n may uniquely determine the shape of A.

In array assignments of the form A(i)=..., the range of the index can determine the shape of the array A (because MATLAB arrays reshape themselves to accommodate indices).

There are a number of ways in which exact shapes can be used to achieve better performance. For example, by completely unrolling simple operations on small arrays we can eliminate all control flow from the operation.
• Subscript check removal: MATLAB mandates subscript checks on all array accesses. The removal of unnecessary subscript checks is a major source of performance enhancement in MaJIC.

Older versions of MATLAB's own compiler, mcc, had command line switches to disable subscript checks (including resizing checks). This could cause otherwise correct code to run incorrectly when compiled with mcc. Newer versions of mcc have consequently discontinued the option.

MaJIC removes subscript checks automatically and conservatively, by using the range and shape information readily available during type inference. Because JIT type inference propagates these exactly, the extra effort needed for subscript check analysis is extremely low, comparing favorably with more conventional techniques [13].
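The decision itself reduces to comparing the index's inferred value range against the array's lower shape bound (a hypothetical simplification flattening the two-dimensional shape to an element count; the names are ours, not MaJIC's):

```c
#include <assert.h>

/* Sketch of conservative subscript-check elision.
   idx is the inferred value range of the (1-based) index expression;
   min_elems is a lower bound on the array's element count, taken from
   the lower shape descriptor. */
typedef struct { double lo, hi; } range;

/* The runtime check on A(idx) can be removed only when the whole range
   is provably a valid index: at least 1 and at most min_elems. */
int subscript_check_removable(range idx, double min_elems) {
    return idx.lo >= 1.0 && idx.hi <= min_elems;
}
```

For a loop index i with range ⟨1, N⟩ over an array known to hold at least N elements, the check disappears; if the array might be smaller, or the index might reach 0, the check conservatively stays.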
2.5 Type speculation

Just-in-time type inference assumes that the full calling context (i.e. the type signature) is available to the analyzer. By contrast, type speculation assumes nothing about the calling context: it guesses the likely types of the arguments. This allows the compiler to process the code ahead of time, applying advanced (and time-consuming) loop optimizations in order to generate good code.

The type speculator's trick is to back-propagate certain type hints from the body of the code to the input parameters. Type hints are collected from syntactic constructs that suggest, but do not command, particular semantic meanings. These constructs originate in part from programmers' tendency to avoid arcane MATLAB-specific constructs, sticking instead to features already prevalent in Fortran-like languages. Other hints can be derived from some MATLAB built-in functions' affinity towards certain inputs. The following list summarizes the type hints used by MaJIC's speculator:
• When processing the colon operator (:), used to specify index ranges, MATLAB silently ignores the imaginary part of the index arguments. Even if the index is a complex array, only the real part of its first element is used, and all indices are of course rounded before use. This suggests that operands of the interval operator are almost always integer scalars.

• Relational operators disregard the imaginary components of their operands. Also, relational operations between vectors are possible but rare in practice, since their semantics are non-intuitive. This holds even more strongly for expressions that form the condition of an if-statement or a while-statement.
• The MATLAB bracket operator (vector constructor) collates several matrices into a new, larger matrix. The components must all have either the same number of rows or the same number of columns. In practice the bracket operator is often used to build vectors out of scalars. When we can prove that one of the arguments xi of the bracket operator [x1 x2 ... xn] is a scalar, all other arguments are probably scalars too.

• In matrix index expressions of the form A(idx) and A(idx1,idx2), if the subscript is an expression or a variable then it is likely scalar. This is a reasonable assumption because many MATLAB applications use either Fortran77- or Fortran90-compatible array indexing operations. Fortran90 syntax is indicated by the presence of the colon (:) operator; the lack of colons indicates Fortran77 syntax.
• Arguments to a number of built-in functions, such as zeros, ones, rand, the second argument of size, and many others, are likely integer scalars. MATLAB issues warnings when the arguments in question are non-scalars or non-integers, but does not stop processing. However, most well-written MATLAB programs don't intentionally produce these warnings.

These hints are implemented as type calculator rules. Note that the hints involve backwards propagation of types, since they make statements about input arguments rather than the result types of MATLAB expressions. Thus, in order to propagate hints, the type inference engine must be used in backwards mode.
Speculative type inference consists of a number of alternating backward and forward type inference passes. A speculative (backward) pass infers a credible type signature from the code body; it is immediately followed by a normal type inference pass to re-calculate the types in the body. The alternating backwards-forwards process can be iterated several times until convergence.
2.6 Code generation<br />
<strong>MaJIC</strong> has two code generation systems: a fast lightweight<br />
code generator used <strong>for</strong> JIT compilation, <strong>and</strong> a C (or Fortran)<br />
based code generator that uses the host system to compile,<br />
optimize <strong>and</strong> link the code. Both code generators use<br />
the parsed AST <strong>and</strong> type annotations to drive code selection.<br />
The code generators follow the same general selection<br />
rules, but build radically different code.<br />
The JIT code generator is able to build executable code<br />
directly in memory by using the vcode [11] dynamic assembler.<br />
The code generator makes a single code selection pass<br />
through the parsed AST. No loop optimizations or instruction<br />
scheduling are per<strong>for</strong>med. Register allocation is done<br />
using the linear-scan register allocator [19]. This, <strong>and</strong> the<br />
small total number of code generation passes, results in a<br />
fast code generator.<br />
The source code generator is somewhat more complicated. It uses the same code selection pass as the JIT code generator, but builds C or Fortran source code in a temporary file. This file is then compiled with the native compiler using the most aggressive optimization mode available. The compiler generates a relocatable object, which is then dynamically linked into the MaJIC executable. Unlike the JIT code generator, the source code generator is quite slow, hampered by the large overhead of loading and executing the compiler and the linker. Compilation, optimization and linking can take several seconds.
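The workflow is essentially: emit source, invoke the host compiler at its highest optimization level, and dynamically link the result. The sketch below only constructs the compile command; the `cc` driver name and the `-O3 -shared -fPIC` flags are assumptions of the example, not MaJIC's actual invocation:

```python
# Sketch of the source-path workflow: write generated C to a temporary
# file and build the host-compiler command line that would turn it into
# a dynamically loadable shared object.

import tempfile, os

def emit_and_build(c_source, cc="cc", opt="-O3"):
    """Write generated C to a temp file; return (compile command, object path)."""
    fd, src = tempfile.mkstemp(suffix=".c")
    with os.fdopen(fd, "w") as f:
        f.write(c_source)
    obj = src[:-2] + ".so"
    cmd = [cc, opt, "-shared", "-fPIC", src, "-o", obj]
    return cmd, obj

# The resulting object is then linked into the running process,
# e.g. via dlopen()/dlsym() in C, or ctypes.CDLL(obj) in Python,
# after running `cmd` with subprocess.
```

The several-second latency quoted in the text comes mostly from launching the compiler and linker processes, not from the code selection itself.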
2.6.1 Code selection rules
As mentioned before, the two code generators use the same selection rules even though they use them to produce different code. A few of the selection rules are listed below.
• The implicit default rule for any operator is that the numeric operands are complex matrices. This is the unoptimized fall-back option for operations that have not been type-inferred. The MATLAB C library provides functions that implement these generic operators.
• MaJIC inlines scalar arithmetic and logical operations, elementary math functions, and assignments of scalar integers, reals and complex numbers. This is probably the most important performance optimization in MaJIC: it relies on type annotations to replace MATLAB's polymorphic operations with single machine instructions.
• MaJIC inlines scalar and F90-like array index operations. The MATLAB interpreter discriminates between array expression types at runtime, spending hundreds of cycles. By contrast, an inlined scalar index operation takes only a few cycles.
• Small temporary arrays of known sizes are pre-allocated. MATLAB's expression evaluation semantics sometimes forces the existence of temporary buffers to hold intermediate array results. Replacing dynamically allocated buffers with statically allocated ones saves a great deal of overhead at the expense of a small amount of heap memory.
• Elementary vector operations, such as arithmetic operations and vector concatenation, are completely unrolled when exact array shapes are known. This technique is very effective on small (up to 3 × 3) matrices and vectors because it completely eliminates loop overhead.
• MaJIC performs code selection to combine several AST nodes into a single library call. For example, expressions like a*X+b*C*Y are transformed into a single call to the BLAS routine dgemv [7].
• Unlike Fortran, MATLAB resizes arrays on demand. In general, this occurs when an array index overflow occurs on the left-hand side. Repetitive array resizing (e.g. in a loop) can be tremendously expensive. MaJIC applies the simple but effective technique of "oversizing" arrays, i.e. allocating about 10% more space for a resized array than strictly necessary, so that subsequent growth of the array does not necessitate another resize operation.
MaJIC performs oversizing carefully in order to preserve the original semantics of the code. The oversized array, when queried, returns accurate size information. Oversizing is also limited by the amount of available memory and the size of the array. Large arrays are never oversized.
• MaJIC inlines calls to small (less than 200 lines of code) functions. Inlining preserves the call-by-value semantics of MATLAB by making copies of the actual parameters. However, read-only formal parameters are not copied. This can result in a huge performance gain when large matrices are passed as read-only arguments in the call.
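The oversizing rule in particular is easy to sketch. The growth policy below is illustrative: the 10% factor matches the text, but the size cap and the bookkeeping are assumptions of the example, not MaJIC's implementation:

```python
# Toy sketch of the "oversizing" growth policy: when an assignment
# indexes past the end of an array, allocate ~10% headroom so that a
# loop growing the array one element at a time does not reallocate on
# every iteration. Factor and cap are illustrative.

OVERSIZE_FACTOR = 1.10
OVERSIZE_CAP = 1 << 20          # large arrays are never oversized

class GrowableArray:
    def __init__(self):
        self.data = []          # physical buffer (may hold hidden headroom)
        self.length = 0         # logical size reported to the program

    def store(self, index, value):
        if index >= len(self.data):               # index overflow on the LHS
            new_cap = index + 1
            if new_cap < OVERSIZE_CAP:
                new_cap = int(new_cap * OVERSIZE_FACTOR) + 1
            self.data.extend([0.0] * (new_cap - len(self.data)))
        self.data[index] = value
        self.length = max(self.length, index + 1)

    def size(self):
        return self.length      # queries never see the hidden headroom
```

Reporting `self.length` rather than the buffer size is what preserves MATLAB semantics: the program cannot observe the headroom.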
3. PERFORMANCE EVALUATION
In this section, we evaluate the overall performance of the MaJIC compiler. Although the repository is part of the interactive MaJIC system, an evaluation of its performance is not a goal of this paper. We were interested in analyzing the quality and speed of the JIT and speculative compilers.
To test JIT compilation, we started our experiments with an empty repository. This resulted in the JIT compiler being invoked for every function call.
To test speculative compilation, we also started up MaJIC with an initially empty repository, but we invoked the benchmarks only after MaJIC's repository had ample time to find them and compile them speculatively.
3.1 Benchmarks
MaJIC was tested with 15 MATLAB benchmarks, between 50 and 250 lines long each. Table 1 lists the names, origin and functional description of the benchmarks, as well as the associated problem size for which measurements were run (matrix sizes in some of the benchmarks). In addition, we list the number of lines in each benchmark and the runtime on a reference system (the SPARC platform described in Section 3.3) using a stock MATLAB interpreter.
Many of the benchmarks were originally used to evaluate FALCON; we reused them in order to facilitate a direct comparison of MaJIC and FALCON.
To make the subsequent discussion easier, we group the benchmarks into four partially overlapping categories. Benchmarks in the same category tend to be optimized in similar ways by the compiler, and show similar performance gains:
• Scalar, or Fortran-like, benchmarks: dirich, finedif, icn, mandel and, to some extent, crnich are written in a style that closely resembles Fortran 77. All array indices in these benchmarks are scalars.
• Benchmarks with built-in functions: cgopt, qmr, sor and mei spend a large portion of their runtime in built-in MATLAB library functions. Typically, these codes are hard to optimize, since the library functions themselves are already optimized.
• Array benchmarks: orbec, orbrk, fractal and adapt have many operations on small fixed-size MATLAB vectors. adapt features a large (and dynamically growing) array as well as small vectors.
• Recursive benchmarks: fibo and ack contain recursion, which makes inlining and type inference harder.
3.2 Measurement methodology
Our performance figures are derived from the running times of the benchmarks. The most important gauge of performance we use is the speedup of compiled code relative to interpreted code, i.e. the expression s = ti/tc, where ti is the runtime of the code in MATLAB's interpreter and tc is the runtime of the compiled code.
We measured MaJIC's speedups in both JIT and speculative compilation mode. In JIT mode the runtime includes the time spent by the JIT compiler producing object code. In speculative mode the repository is assumed to have generated the code ahead of time; hence compile time is not included in the runtime in this case, unless the speculatively generated code turns out not to match the benchmark's invocation, in which case the JIT compiler kicks in and helps out with the code generation. This mode of measuring runtimes is consistent with the expected real-world usage pattern of MaJIC.
For purposes of comparison, we also measured the speedups of mcc, the compiler supplied by Mathworks Inc. We set a number of compile-time options for this compiler in order to guarantee the best performance: we manually eliminated
benchmark   source      short description                                problem size    lines of code  runtime (s)
adapt       [14]        adaptive quadrature                              approx. 2500    81             5.24
cgopt       [3]         conjugate gradient w. diagonal preconditioner    420 x 420       38             0.43
crnich      [14]        Crank-Nicholson heat equation solver             321 x 321       40             16.33
dirich      [14]        Dirichlet solution to Laplace's equation         134 x 134       34             277.89
finedif     [14]        finite difference solution to the wave equation  1000 x 1000     21             57.81
galrkn      [12]        Galerkin's method (finite element method)        40 x 40         43             8.02
icn         R. Bramley  Cholesky factorization                           400 x 400       29             7.72
mei         unknown     fractal landscape generator                      31 x 14         24             10.77
orbec       [12]        Euler-Cromer method for 1-body problem           62400 points    24             19.10
orbrk       [12]        Runge-Kutta method for 1-body problem            5000 points     52             9.30
qmr         [12]        linear equation system solver, QMR method        420 x 420       119            5.29
sor         [3]         lin. eq. sys. solver, successive overrelaxation  420 x 420       29             4.77
ackermann   authors     Ackermann's function                             ackermann(3,5)  15             3.84
fractal     authors     Barnsley fern generator                          25000 points    35             26.55
mandel      authors     Mandelbrot set generator                         200 x 200       16             8.64
fibonacci   authors     recursive Fibonacci function                     fibonacci(20)   10             1.29
subscript checks, and replaced operations on complex numbers with real-number operations where it was safe to do so.
We measured the speedups of FALCON by repeating the experiments described in [9] on our test machines. We instructed FALCON to eliminate subscript checks wherever this did not break the code.
Execution times were measured on a "best of 10 runs" basis on a quiet system.
Our performance graphs show four bars for each benchmark. The four bars are the speedups achieved by mcc, FALCON, MaJIC in JIT mode and MaJIC in speculative mode, respectively. Because the speedups are distributed over four orders of magnitude, ranging from 0.1 to about 1000, the graphs use a logarithmic scale.
3.3 Testing platforms and speedups
We measured the interpreted execution time ti of all benchmarks using the MATLAB 6 (release 12) integrated environment on two architectures:
• The development platform for MaJIC is a 400MHz UltraSparc 10 workstation with 256MB of RAM, running Solaris 7 and equipped with the Sparcworks 5.0 C compiler. The performance results for this machine are summarized in Figure 4.
As described above, the figure has bars for each benchmark, labeled "mcc", "falcon", "jit" and "spec", respectively. A few of the speedup bars are missing: there are no FALCON speedup bars for the benchmarks ack, fractal, fibo and mandel, because these were not part of the original FALCON benchmark series and are unsuitable for compilation with FALCON.
The speedup bars of cgopt appear to be missing because they are very close to 1.0.
• We also ran some of the experiments on an SGI Origin 200 machine equipped with 4 180MHz R10000 processors, IRIX 6.5 and the MIPSPro C compiler. The JIT compiler on this platform is not yet completely implemented. Some benchmarks (like adapt) were left out of the graphs for this reason. Others are included, but
Table 1: MaJIC benchmarks
[Figure 4: Performance on the SPARC platform. Log-scale speedup bars (mcc, falcon, jit, spec) for each benchmark; y-axis from 0.1 to 1000.]
run at reduced performance due to the poor quality of the generated code. Figure 5 shows the results.
3.4 Comparative performance analysis
The two groups of benchmarks that most clearly benefit from compilation are the Fortran-like benchmarks and the small-vector benchmarks. These types of codes incur the most overhead during interpreted execution; they profit the most from the removal of overhead.
By contrast, the benchmarks that are heavy in built-in function calls benefit very little, and sometimes not at all, from compilation. Obviously, the execution speed of built-in functions is not influenced by compiling the calling code.
The orbrk benchmark demonstrates that inlining at compile time is beneficial. Recursive functions like fibo and ack also generally benefit from inlining. MaJIC does not attempt to inline more than 3 levels of recursive calls, in order to avoid code explosion.
While mcc is not particularly successful at removing the interpretive overhead, both FALCON and MaJIC do succeed in eliminating it, although using different strategies. FALCON relies heavily on the native Fortran compiler to
[Figure 5: Performance on the MIPS platform. Log-scale speedup bars (mcc, falcon, jit, spec) for each benchmark; y-axis from 0.1 to 10000.]
generate good code. MaJIC has a few specific optimizations (described in Section 2.6.1) that make it less reliant on the native compiler and allow it to generate reasonable code even with the JIT code generator.
On the SPARC platform the native Fortran-90 compiler generates relatively poor code, causing MaJIC to outperform FALCON on a few of the benchmarks. On the MIPS platform the native compiler is excellent, causing MaJIC's JIT compiler to fall behind FALCON.
3.5 Analysis of JIT compilation
For the analysis of JIT compilation we rely mostly on results gathered on the SPARC platform, since the JIT code generator was optimized for this platform. The performance figures are remarkable considering that the code in question is generated in a fraction of a second and without the benefit of backend optimizations. On the other hand, there is room for future optimizations; however, before adding them, it will be necessary to test whether the increased compile time would destroy the performance gained by optimization.
Figure 6 shows the composition of the runtime of each JIT-compiled benchmark. With the exception of orbrk, most benchmarks spend a relatively modest amount of time compiling the code. The compile time/runtime ratio shown is artificially high, in part because the benchmarks run on modestly sized problems. There is definitely room for at least basic back-end optimizations in the JIT compiler, such as common subexpression elimination, loop unrolling, loop-invariant removal and some form of instruction scheduling. Preliminary experiments with the finedif and dirich benchmarks suggest that loop unrolling alone can reduce execution time by about 50% at a reasonable cost in overhead.
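To make the unrolling opportunity concrete, a code generator can replicate the body of a counted loop a fixed number of times and emit a cleanup loop for the remainder. The sketch below is an illustration at the level of statement templates, not MaJIC's actual IR:

```python
# Illustrative loop unrolling: a counted loop whose body is produced by
# a statement template is replicated by a factor k, with a cleanup loop
# handling trip counts not divisible by k.

def unroll(start, stop, body_template, k=4):
    """body_template: function i -> list of statements (strings)."""
    stmts = []
    i = start
    while i + k <= stop:
        for j in range(k):          # k replicated copies, no branch between them
            stmts += body_template(i + j)
        i += k
    while i < stop:                 # cleanup for the remaining iterations
        stmts += body_template(i)
        i += 1
    return stmts
```

Unrolling pays off here because it removes the per-iteration branch and exposes more scheduling freedom, at the cost of a modest amount of extra generated code.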
3.5.1 The effect of existing JIT optimizations
The effect of optimizations in any compiler is cumulative and hard to study in isolation. In this section we evaluate the effectiveness of JIT-specific optimizations by individually disabling them and studying the resulting drop in performance. Figure 7 shows the measurement results.
The first set of bars ("no range") was obtained by disabling range propagation during JIT type inference. The primary effect of this measure is to disable subscript check
[Figure 6: The composition of JIT execution. Normalized execution time per benchmark, broken down into disamb, typeinf, codegen and exec phases.]
[Figure 7: Disabling JIT optimizations. Performance relative to fully optimized JIT, with bars for "no ranges", "no min. shapes" and "no regalloc".]
removal. The relative increase in execution time is highest in the benchmarks that have many array accesses: dirich, finedif and mandel are good examples.
The second set of bars ("no min. shapes") was obtained by disabling the propagation of minimum shape information. This disables subscript check removal in some cases, and does not allow the compiler to unroll small vector operations. orbec, orbrk and fractal are the most affected, because these consist mostly of operations on small vectors and matrices.
The last set of bars ("no regalloc") was obtained by forcing the linear-scan register allocator to spill every variable. This is roughly equivalent to compiling with the -g flag set on a regular compiler like gcc.
The results clearly show that range propagation, minimum shape propagation and register allocation are essential to JIT performance.
3.6 Analysis of the speculator
The speedup results produced by speculation generally match those of FALCON. We conclude that speculation is generally successful. However, we cannot expect a speculative technique to be universally successful; we need to analyze the consequences of failure.
MaJIC's type speculator fails in two ways: either by being too aggressive and generating useless code, or by not being aggressive enough and generating suboptimal code. The first type of failure, unreasonable specialization of input types, is easily countered: the type signature check, done by the repository at runtime, will eliminate such code from consideration.
A more insidious failure is when the speculator generates code that is perfectly safe to execute, but suboptimal. Such cases are not caught at runtime. The performance of the invoked code will be lower, but it is not immediately clear by how much.
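The runtime signature check that guards against the first failure mode can be sketched as a dispatch table keyed on (function, argument-type signature), with the JIT compiler as the fallback. All names below are illustrative, not MaJIC's actual interfaces:

```python
# Sketch of the repository's runtime dispatch: speculatively compiled
# code is used only if its type signature matches the actual call;
# otherwise the JIT compiler produces a matching version on the spot.

def signature(args):
    """Derive a type signature from the actual arguments."""
    return tuple(type(a).__name__ for a in args)

class Repository:
    def __init__(self, jit_compile):
        self.compiled = {}              # (function, signature) -> compiled code
        self.jit_compile = jit_compile  # fallback compiler

    def add_speculative(self, fn, sig, code):
        """Register code produced ahead of time for a guessed signature."""
        self.compiled[(fn, sig)] = code

    def invoke(self, fn, *args):
        sig = signature(args)
        code = self.compiled.get((fn, sig))
        if code is None:                # speculation missed: JIT kicks in
            code = self.jit_compile(fn, sig)
            self.compiled[(fn, sig)] = code
        return code(*args)
```

A mispredicted signature is thus never executed; it simply wastes the ahead-of-time compilation effort and pays one JIT compilation at call time.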
benchmark  crnich  dirich  finedif  icn     mandel
spec.      181     817     412      48      36
JIT        181     817     413      51      54.0

benchmark  cgopt   mei     qmr      sor     adapt
spec.      1       4.24    4.52     1.68    4.09
JIT        1.16    5.67    5.68     1.79    4.16

benchmark  orbec   orbrk   fractal  galrkn  ack
spec.      146     465     663      61.7    4.04
JIT        174     465     664      72.9    6.00

benchmark  fibo
spec.      3.49
JIT        5.16

Table 2: JIT vs. speculative type inference
Table 2 attempts to quantify the speculator's performance. It compares the speedups produced by the same code generator using type annotations generated with either speculation or JIT type inference (the speedups were calculated without considering compile time). Looking at this table, it is obvious that speculative type inference closely matches the performance of JIT type inference in many cases. We conclude that:
• Speculation works best on scalar (Fortran 77-like) and vector codes. Speculative rules look for exactly the kinds of features that are prevalent in these codes.
• Benchmarks with built-in functions typically fare badly because the speculative rules currently present in MaJIC do not account for the language features used by these codes. MaJIC mispredicts a "*" operator in qmr to represent scalar multiplication, whereas in fact it is a matrix-vector multiplication. In mei the speculator is unable to predict that the arguments to an eig function call are reals; instead it considers them complex values, which leads to a performance loss. A similar situation occurs in mandel due to the use of the built-in function i.
• Recursive benchmarks are not handled correctly by speculative compilation. They always need to be recompiled at runtime.
4. RELATED WORK
MaJIC is patterned after FALCON [9, 8], a MATLAB to Fortran-90 translator developed by L. DeRose in 1996. FALCON performs type inference to generate declarations for variables. It then generates Fortran code using these declarations. However, FALCON's type inference engine faces a limiting factor: because FALCON is a batch compiler, it has no information about the calling context of the functions it tries to compile. This makes type inference potentially ineffective. FALCON circumvents this problem by "peeking" into the input files of the code it compiles and extracting type information from there.
MENHIR [5], developed by Francois Bodin at INRIA, is another batch compiler similar to FALCON: it generates code for MATLAB and exploits parallelism by using optimized runtime libraries. MENHIR's code generation is retargetable (it generates C or FORTRAN code). It also contains a type inference engine similar to FALCON's.
MATCH [2] is a MATLAB compiler targeted at heterogeneous architectures, such as DSP chips and FPGAs. It also uses type analysis and generates code for multiple large functional units.
Vijay Menon's vectorizer [16] is an alternative to compilation. Menon observed that scalar operations in MATLAB were slower than vector operations because they involved more overhead per floating-point operation. He proposed to eliminate this overhead not by compilation, but by translating Fortran 77-like scalar operations into Fortran 90-like vector expressions in the MATLAB source code. Menon's vectorizer is built on top of the MaJIC infrastructure.
Just-in-time compilation has been around since 1984, when Deutsch described a dynamic compilation system for the Smalltalk language [10]. The technique became truly popular with the Java language, and countless Java JIT compilers have been proposed and implemented in recent times.
MaJIC's JIT compiler reuses code and ideas from the vcode [11] and tcc [18] packages. vcode was originally built as a general-purpose, platform-independent, RISC-like dynamic assembly language to facilitate dynamic code generation, and is used in almost unchanged form by MaJIC. tcc was built on top of vcode and provides an implementation of 'C, a C-like programming language with a LISP-like backquote operator that facilitates the building of dynamic code by composition. We did not reuse the 'C parser, but we did use the tcc intermediate language specification, ICODE, and re-implemented the register allocator used by tcc.
5. CONCLUSIONS
In an effort to bring high performance to the MATLAB integrated environment, we have designed, built and evaluated two paradigms for compiling MATLAB code: JIT compilation and speculative compilation.
JIT compilation is remarkably successful in bringing compile time down to almost nil, while obtaining reasonable performance gains (up to two orders of magnitude faster than the MATLAB interpreter). It falls behind in terms of performance when compared to the best that a static compiler (like FALCON) can do. Of our benchmarks, the most affected were the Fortran-like and small-vector codes, where the lack of backend optimization is felt the most.
In order to estimate the effect of adding more optimizations to the JIT compiler, we hand-optimized the finedif benchmark by hand-unrolling its innermost loop and performing common subexpression elimination. We obtained a version of finedif that was almost 100% faster than the normal JIT-compiled finedif, and within 20% of the performance of the best (native-compiler-generated) version of the code. Preliminary data suggest that similar, although less impressive, performance improvements can be obtained with some of the other Fortran-like benchmarks, which leaves the door open for future enhancements of the JIT compiler.
Speculative compilation is successful in bringing performance up to, and beyond, FALCON levels. However, generation of optimized code takes time; speculation is designed to allow the hiding of compilation latency. Speculation is not universally successful; it can result in a loss of performance when it fails.
It is interesting to note that the speculative type hints used most successfully by MaJIC's speculator are tied to the very same language features of MATLAB that slow down the interpreter. Hence, speculation tends to succeed when it is most needed.
6. REFERENCES
[1] George Almasi. MaJIC: a MATLAB Just-In-Time Compiler. PhD thesis, University of Illinois at Urbana-Champaign, June 2001.
[2] P. Banerjee, N. Shenoy, A. Choudhary, S. Hauck, C. Bachmann, M. Chang, M. Haldar, P. Joisha, A. Jones, A. Kanhare, A. Nayak, S. Periyacheri, and M. Walkden. MATCH: A MATLAB compiler for configurable computing systems. Technical Report CPDC-TR-9908-013, Center for Parallel and Distributed Computing, Northwestern University, Aug. 1999.
[3] R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. Van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd Edition. SIAM, Philadelphia, PA, 1994.
[4] William Blume and Rudolf Eigenmann. Symbolic range propagation. In Proceedings of the 9th International Parallel Processing Symposium, April 1995.
[5] Francois Bodin. MENHIR: High performance code generation for MATLAB. http://www.irisa.fr/caps/PEOPLE/Francois/.
[6] Timothy Budd. An APL Compiler. Springer Verlag, 1988.
[7] J. Choi, J. Dongarra, and D. W. Walker. BLAS reference manual (version 1.0beta). Technical Report ORNL/TM-12469, Oak Ridge National Laboratory, March 1994.
[8] Luiz DeRose and David Padua. Techniques for the translation of MATLAB programs into Fortran 90. ACM Transactions on Programming Languages and Systems (TOPLAS), 21(2):285–322, March 1999.
[9] Luiz Antonio DeRose. Compiler Techniques for MATLAB Programs. Technical Report UIUCDCS-R-96-1996, Department of Computer Science, University of Illinois, 1996.
[10] L. Peter Deutsch and Alan Schiffman. Efficient Implementation of the Smalltalk-80 System. In Proceedings of the 11th Symposium on the Principles of Programming Languages, Salt Lake City, UT, 1984.
[11] Dawson R. Engler. VCODE: a portable, very fast dynamic code generation system. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '96), Philadelphia, PA, May 1996.
[12] Alejandro L. Garcia. Numerical Methods for Physics. Prentice Hall, 1994.
[13] Rajiv Gupta. Optimizing array bounds checks using flow analysis. ACM Letters on Programming Languages and Systems, 2(1-4):135–150, 1993.
[14] John H. Mathews. Numerical Methods for Mathematics, Science and Engineering. Prentice Hall, 1992.
[15] Mathworks Inc. homepage. www.mathworks.com.
[16] Vijay Menon and Keshav Pingali. High-level semantic optimization of numerical codes. In 1999 ACM Conference on Supercomputing. ACM SIGARCH, June 1999.
[17] Steven S. Muchnick and Neil D. Jones. Program Flow Analysis: Theory and Applications. Prentice Hall, 1981.
[18] Massimiliano Poletto, Dawson R. Engler, and M. Frans Kaashoek. tcc: A system for fast, flexible, and high-level dynamic code generation. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '97), pages 109–121, Las Vegas, Nevada, May 1997.
[19] Massimiliano Poletto and Vivek Sarkar. Linear scan register allocation. ACM Transactions on Programming Languages and Systems, 21(5):895–913, 1999.