A Graph-Based Generic Type System for Object-Oriented Programs

UNU-IISTInternational Institute forSoftware TechnologyA Graph-Based Generic Type System forObject-Oriented ProgramsWei Ke, Zhiming Liu, Shuling Wang and LiangZhaoJune 2011UNU-IIST Report No. 448R

UNU-IIST and UNU-IIST ReportsUNU-IIST (United Nations University International Institute for Software Technology) is a Research andTraining Centre of the United Nations University (UNU). It is based in Macao, and was founded in1991. It started operations in July 1992. UNU-IIST is jointly funded by the government of Macao andthe governments of the People’s Republic of China and Portugal through a contribution to the UNUEndowment Fund. As well as providing two-thirds of the endowment fund, the Macao authorities alsosupply UNU-IIST with its office premises and furniture and subsidise fellow accommodation.The mission of UNU-IIST is to assist developing countries in the application and development of softwaretechnology.UNU-IIST contributes through its programmatic activities:1. Advanced development projects, in which software techniques supported by tools are applied,2. Research projects, in which new techniques for software development are investigated,3. Curriculum development projects, in which courses of software technology for universities in developingcountries are developed,4. University development projects, which complement the curriculum development projects by aimingto strengthen all aspects of computer science teaching in universities in developing countries,5. Schools and Courses, which typically teach advanced software development techniques,6. Events, in which conferences and workshops are organised or supported by UNU-IIST, and7. Dissemination, in which UNU-IIST regularly distributes to developing countries information on internationalprogress of software technology.Fellows, who are young scientists and engineers from developing countries, are invited to actively participatein all these projects. By doing the projects they are trained.At present, the technical focus of UNU-IIST is on formal methods for software development. UNU-IISTis an internationally recognised center in the area of formal methods. However, no software technique isuniversally applicable. We are prepared to choose complementary techniques for our projects, if necessary.UNU-IIST produces a report series. Reports are either Research R , Technical T , Compendia C orAdministrative A . They are records of UNU-IIST activities and research and development achievements.Many of the reports are also published in conference proceedings and journals.Please write to UNU-IIST at P.O. Box 3058, Macao or visit UNU-IIST’s home page: http://www.iist.unu.edu,if you would like to know more about UNU-IIST and its report series.Peter Haddawy, Director

UNU-IISTInternational Institute forSoftware TechnologyP.O. Box 3058MacaoA Graph-Based Generic Type System forObject-Oriented ProgramsWei Ke, Zhiming Liu, Shuling Wang and LiangZhaoAbstractWe present a graph-based model of a generic type system for an oo language. The type systemsupports the features of recursive types, generics and interfaces, which are commonly found inmodern oo languages such as Java. In the classical graph theory, we define type graphs, instantiationgraphs and conjunction graphs that naturally illustrate the relations among types,generics and interfaces within complex oo programs. The model employs a combination ofonymous and anonymous nodes to represent respectively types that are identified as names andstructures, and defines graph-based relations and operations on types including subtyping, equivalence,conjunction and instantiation. Algorithms based on the graph structures are designedfor the implementation of the type system. We believe that this type system is important forthe development of a graph-based logical foundation of a formal method for verification of andreasoning about oo programs.Keywords: OO programs, type systems, generics, type graphs, recursive types

Wei Ke is a PhD student registered in Beihang University, Beijing, and a researcher in MacaoPolytechnic Institute, Macao. He is working on projects in object-oriented programming languagesemantics. Email: wke@ipm.edu.moZhiming Liu is a Senior Research Fellow at UNU-IIST. His research interests include theoryof computing systems, emphasizing sound methods and tools for specification, verificationand refinement of fault-tolerant, realtime and concurrent systems, and formal techniques forobject-oriented development. Email: Z.Liu@iist.unu.eduShuling Wang is currently a postdoctoral fellow of ISCAS, China. She is working with Prof.Zhao Chaochen and Prof. Zhan Naijun. Her research interest is in formal methods of hybrid systems,including modeling and verification; object-oriented confinement and verification. Email:wangsl@ios.ac.cnLiang Zhao is a PhD student in Computer Science of the joint PhD program of UNU-IISTand University of Pisa. His research interests include semantics and type systems of programminglanguages, graph transformation, and formal methods for object-oriented development.Email: liang@iist.unu.eduThe authors acknowledge support from the project GAVES funded by Macao S&TD Fund,NSFC-90718014, NSFC-60970031, NSFC-91018012 and NSFC-60736017.Copyright c○ 2011 by UNU-IIST

Contents 5Contents1 Introduction 72 An Object Oriented Language 93 Type Graphs 113.1 Types and Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123.2 Rooted Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133.3 Type Equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143.4 Checking Type Equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 Type Operations 174.1 Shallow Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184.2 Conjunction Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194.3 Class Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204.4 Instantiation Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214.5 Recursive Generic Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255 Type System 275.1 Well-Formed Type Substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . 285.2 Well-Formed Type Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285.3 Environment Dependent Commands . . . . . . . . . . . . . . . . . . . . . . . . . 295.4 Type Graph Completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305.5 Type Contexts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305.6 Type-Checking of Type Expressions . . . . . . . . . . . . . . . . . . . . . . . . . 335.7 Subtype Relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345.8 Type-Checking of Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355.9 Instantiation of Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365.10 Type-Checking of Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396 Execution Environment 406.1 Environment Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406.2 Construction of Environment Graphs . . . . . . . . . . . . . . . . . . . . . . . . . 417 Operational Semantics 437.1 State Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437.2 Execution Environment Lookup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447.3 Well-Typed State Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457.4 Semantic Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457.5 Type-Safety of Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478 Conclusion 51Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Introduction 71 IntroductionGraph notations are widely used to express class structures and execution states of oo programs.For oo languages, such as Java [1], where object variables are pointers, it is natural to representobject identities by nodes and variables by directed edges, e.g., uml [2] and the trace model ofHe and Hoare [3]. However, traditional semantic definitions and reasoning about properties ofoo programs are based on the plain theory of sets and relations [4, 5], without much utilizationof the properties of graph structures. To close the gap between structural representation andreasoning of oo programs, we define a graph-based type system and operational semantics foroo programs. A significant advantage of this type system is the intuitiveness to study propertiesand relations of complex types and executions of oo programs using graphs and their properties.Concepts and properties of graphs, those of paths, connectivity, homomorphism and isomorphismin particular, articulate the formulation of properties of a program to verify, the idea of reasoningabout the properties, and the design of verification and implementation [6]. Notice that here weemphasize the simplicity of mathematical structure and way of thinking, instead of the actualvisual representation of a program.Overview of contribution. Graph-based representation of type systems with no type operationsis rather straightforward. Types are represented as nodes of a type graph, and relationsof types, including the subtype relation, are represented as edges between them. However,modern oo languages often incorporate generics, i.e., parameterized types, to strengthen thetype-checking with dynamic typing. Their type systems use identified and constrained type variablesin place of a much widened universal superclass to avoid unchecked downcasts [7]. Fora graph-based representation of such type systems, type operations that construct new typesshould be realized by graph transformations that introduce new components into a type graph.How these new components are defined, and how they coexist and get related with the rest ofthe type graph are the main problem of the representation.Recursive generic classes lead to further complication of the problem. That is, a generic class canbe instantiated to a recursive type only for certain type arguments, and such a recursive type isactually an unfolded version of some other type. Consider an example generic class A〈β〉, whereβ is a type variable, and one of its instantiations A〈β ↦→ B〉. Let b of type β be an attribute ofA〈β〉 and a of type A〈β ↦→ B〉 an attribute of class B, the instantiation of the recursive generictype is shown below.baA〈β ↦→ B〉 : B :bbaIn the above figure, the graph on the left is an instantiation graph of A, by substituting thegraph of B (on the right) for β. The white nodes all represent the type A〈β ↦→ B〉, but thegraphs connected to these white nodes are different. We need to relate these graphs using theequi-recursive approach [8] to ensure their consistency.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

An Object Oriented Language 9systems, see also [17, 18, 19]. The common feature of these methods is that they focus on programexecution semantics instead of type systems. The reason is mainly that representing advancedfeatures of type systems by graphs, such as generic types and polymorphism, is challenging sinceit involves hierarchal structures.Outline of structure. We define the extension to rcos, called the rcos/g language, in §2, andtype graphs with their relations in §3. In §4, we define the operations of class conjunction andclass instantiation, and discuss how the graphs of recursive generic classes can be constructed.The type checking system is presented in §5, together with the proofs of subtyping partial orderand the type-safety of command instantiation. §6 is about environment graphs that encode theruntime type information. We show how environment graphs are constructed from type graphs.In §7, we study the connection between the type system and the operational semantics based onenvironment graphs, proving that the program execution is type-safe.2 An Object Oriented LanguageThe formal language of rcos is a general oo language which has a Java-like syntax and supportsmost of the essential oo features such as inheritance, type casting, dynamic binding andrecursive objects [4]. For a generic type system, we extend rcos by introducing generic classes,class instantiations and class conjunctions. We also modify the syntax of method invocation,adopting the mechanism of “parameter binding by name”. With such binding mechanism, theorder of parameters of a method is not significant, which reduces the complexity of the graphrepresentation. We call the new language rcos/g, and show its full syntax in Table 1. The syntaxdefinition uses the overline notation exp(s, t) to represent a finite list exp(s 1 , t 1 ), . . . , exp(s k , t k )where k > 0. For example, Te x is a list Te 1 x 1 , . . . , Te k x k of variable declarations.In this syntax, C ranges over class names, α denotes an arbitrary type variable, x a variable or amethod parameter, a an attribute, m a method name, l a literal of a primitive type. We assumeprimitive types Z for integers, B for booleans and S for text strings. We also define the auxiliaryclass Null as the type of the null object.As in Java, a program consists of a list cdecl of class declarations and a Main method as theentry point. The Main method has a list extv of external variable declarations and a methodbody command c that accesses the external variables. Each type in extv is a primitive type or aclass declared in cdecl. A class declaration is of the formclass C [〈α cnstr〉] [extends Cc] {adef ; mdef },where C is the name of the class, [〈α cnstr〉] is an optional type parameter list for generic classes,[extends Cc] is an optional superclass specification, and {adef ; mdef } are definitions of members,i.e., attributes and methods, of the class. A type parameter is a type variable α followed by itsconstraints cnstr, involving an optional superclass constraint and an optional member constraint.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

An Object Oriented Language 10Table 1: rCos/g Syntaxprogram ::= cdecl • Mainprogramcdecl ::= class Cclass declaration[〈α cnstr〉]type parameters[extends Cc]direct superclass{adef ; mdef }memberscnstr ::=constraints[extends Cu]superclass constraint[implements Cn] member constraintadef ::= visib Te a attribute definitionvisib ::= private | protect | public visibilitymdef ::= m(Te x) {c} method definitionCu ::= Cc | αsuperclass upper boundTe ::=type expressionZ | B | Sprimitive types| CeCe ::= Cc | Cn | αclass expressionsCn ::= ⋓ Ccclass conjunctionCc ::= C | Cjconcrete classesCj ::= C〈α ↦→ Te〉class instantiationc ::=commandskip| var Te x variable declaration| end x variable undeclaration| Cc.new(le) object creation| le := e assignment| e.m(ve • x • re) method invocation| c; c | c ⊳ e ⊲ c | e ∗ c structural commandsve ::= evalue argumentre ::= leresult argumente ::=expressionle | self | null | l| (Ce)e type cast| f(e) built-in function appl.le ::=l-expressionx| e.a object attributeextv ::= Te xexternal variable decl.Main ::= {extv; c}main method definitionAn attribute definition adef consists of a visibility specification visib and an attribute typing.A method definition mdef is a method signature consisting of a method name and a formalparameter typing list, followed by a command as the method body.A type is specified by a type expression Te, which is either a primitive type or a class expression.Class expressions Ce are formed from classes, type variables, class conjunctions and class instantiations.Among class expressions, classes and class instantiations are called concrete classes,Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Graphs 11denoted by Cc. Only concrete classes can have class instances, thus we can only invoke the newcommand on concrete classes. A class conjunction Cn is a collection of member signatures of aset of concrete classes. Class conjunctions are used as class interfaces to describe memberwiserequirements and assumptions. A class instantiation Cj represents a new instance of a genericclass, in the form of the generic class name followed by a substitution of type variables with typeexpressions.A command c in rcos/g is similar to those in common imperative languages. We provide structuralconstructors of commands including sequential compositions c; c, conditionals c ⊳ e ⊲ cand while-loops e ∗ c. We require that commands, including assignments, object creations andmethod invocations, should not occur in expressions, so that the evaluation of expressions has noside-effects. We also explicitly formulate variable declarations and undeclarations as commandsfor local scope introductions and cleanups, respectively.A method invocation specifies the method name and the object upon which the method isinvoked, as well as how the value arguments ve are passed to and how the result arguments reare retrieved from the formal parameters x. Upon entering the method, the value of a valueargument is copied to the corresponding formal parameter. And upon leaving the method, thevalue of a formal parameter is copied to the corresponding result argument. For example, themethod invocationo.m(true, 4 • x, y • p.a, q.b)invokes method m upon object o, passing true to x, 4 to y when entering, and retrieving p.afrom x, q.b from y when leaving.Expressions are formed from variables, attributes and literals through applications of membernavigations e.a, typecasts (Ce)e and primitive-type functions f(e). We denote the null objectby null, and use the special variable self to point to the currently active object. The set ofl-expressions, ranged over by le, includes variables and navigation paths, which are allowed tooccur on the left-hand side of assignments.To simplify our discussion, all variables, parameters and attributes are initialized with the defaultvalue of their types, specifically, 0 for integers, false for booleans, ε (empty) for text strings andnull for all class types.3 Type GraphsThe class declarations cdecl of an oo program define the structure of the objects of the programin terms of their types and relations. This structure can be formally represented by a directedand labeled graph [9], called a type graph. In this paper, we develop a graph-based type systemfor rcos/g. For this, we extend the model in [9] to capture the extended features of the language.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Graphs 123.1 Types and StructuresIn an rcos/g program, the primitive types, classes and type variables are called nominal types.Each nominal type has a name, and types with different names are different, even if they havethe same set of members. Class instantiations and conjunctions are types obtained from typeoperations, and they are called structural types. Two structural types are the same if they havethe same set of members. Both nominal and structural types are represented in a type graph.The static table of a class lists the methods of the class, and the frame of a method consists of theparameters of the method. These information is also characterized in a type graph. In fact, eachtype, static table or method frame of a program is represented as a subgraph of the type graphof the program. There is a designated node in such a subgraph, called the root of the graph ofthe type, static table or method frame. For a nominal type, the root of its graph is the name ofthe type, but for a structural type, a static table or a method frame, the root is anonymous. Wecall the root of a nominal type a nominal node and an anonymous root a structural node.We use D to denote the primitive type names and U the class names, and C = D ∪ U theconstant types. Let X be the type variables. Thus, we define C ∪ X to be the set of nominalnodes. Using S to denote the structural nodes and assuming all these sets are mutually disjoint,we have the set N = C ∪ X ∪ S of nodes of type graphs.For the labels of edges, let A be the set of the names of variables, attributes, methods andparameters, which is disjoint with N . We introduce a special label ‘⊲’ for the class inheritancerelation, and another special symbol ‘σ’ to label an incoming edge to the root of a static table.If a class contains a type variable α, the class is generic. A generic class can be instantiated bymapping α to some actual type u. We use an edge labeled by the type variable α and targetingu to record the mapping α ↦→ u. We put such an edge to the static table of each generic classand class instantiation, see Fig. 1. Thus, A + = A ∪ X ∪ {⊲, σ} is the set of all labels of edges.Definition 1 (Type Graph) A type graph is a directed and labeled graph Γ = 〈N, E〉, whereN ⊆ N is the set of nodes, and E : N × A + → N is the set of edges.Notice that E is a function implies that labels of the outgoing edges of a node are distinct. Thusthe names of attributes and methods of a class are different, and parameters of a method aredifferent. In particular, there is no multiple inheritances allowed in this model.Fig. 1 shows a type graph. In the type graph, C is a generic class, with a type variable β and asuperclass D. Notice the β-labeled edge targeting at the node β. If β is instantiated to an actualtype, the β node is replaced by the node of the actual type. C has two attributes a and b oftypes S and β, respectively, and a method m 1 with a parameter x of type S. The type variableβ has a method m 2 with a parameter x of type S, meaning that any replacement of β in aninstantiation of C must have such a method. D is an ordinary class with an attribute a and twomethods m 3 and m 4 , with a parameter x and two parameters x and y, respectively.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Graphs 13aa generic classa static tablea primitive typeSxxm 1m 2βσσCβb⊲Da classa type variableσ m 4m 3xaZyxBa method frameFigure 1: A Type GraphWe use s a −→ t to represent an a-labeled edge from s to t, and call s the source node and t thetarget node of the edge. We use s a −→ · to represent an a-labeled outgoing edge from s and · a−→ tan a-labeled incoming edge to t. For a given graph Γ = 〈N, E〉, we use Γ.ns to denote the set Nof nodes, and Γ.es the set E of edges.Let Γ be a type graph and n a list of nodes of Γ. We define the set out-es(Γ, n) of outgoingedges from the nodes in n, and the set adj -ns(Γ, n) of nodes adjacent to any of the nodes in n.Formally,out-es(Γ, n) ̂= {s a −→ t ∈ Γ | s ∈ n},adj -ns(Γ, n) ̂= {t | · −→ t ∈ out-es(Γ, n)}.aA path p is a sequence of consecutive edges, denoted as n10 −→ n1 · · · n k−1 −→ nk or simplya 1 .a 2 ...ank0 −−−−−−→ nk if we are not concerned with its intermediate nodes. We also overload thenotation of set membership “∈”, using n ∈ Γ, e ∈ Γ and p ∈ Γ to denote that a node n, an edgee and a path p are in Γ, respectively.For a path p, let source(p) and target(p) be the starting and final node, respectively. We saythat a node n 2 is reachable from a node n 1 , denoted by n 1 −→ ∗ n 2 , if there is a path from n 1 ton 2 . A path n 1a −→ · · · a −→ n2 is simply denoted as n 1a −→ ∗ n 2 .a k3.2 Rooted GraphsGiven the type graph Γ of a program, an individual type T , say a class, is represented by a noder of Γ, the nodes reachable from r and the edges between these nodes. These nodes and edgesform a subgraph of Γ with r being designated as the root, and this rooted graph represents thetype T .Definition 2 (Rooted Graph) Given a node r in a graph Γ = 〈N, E〉, the rooting operationΓ r returns the subgraph 〈N ′ , E ′ , r〉 of Γ such thatN ′ = {n | r −→ ∗ n ∈ Γ},E ′ = {n 1a −→ n2 ∈ Γ | n 1 , n 2 ∈ N ′ }.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Graphs 14This subgraph is called a rooted graph, and r is called the root, denoted by G..Notice that for each node r ∈ Γ, Γ r is unique. For a rooted graph G = 〈N, E, r〉, thegraph 〈N, E〉 is called the base graph of G, denoted by G.base. A path in G from the rootr a −→ n is denoted by a −→ n. We extend the rooting operation to rooted graphs, and defineG n ̂= G.base n.The rooting operation performs garbage collection. For example, a removal of a set es of edgesfrom a rooted graph G = 〈N, E, r〉 is defined as〈N, E, r〉 es ̂= 〈N, E es〉 r.The removal of edges may disconnect a subgraph from the root, causing additional nodes andedges to be removed.3.3 Type EquivalenceRelations between types can now be studied by comparing the structures of their rooted graphs.Graph homomorphisms and isomorphisms are the instruments for this purpose.Definition 3 (Homomorphism) Given two rooted graphs G = 〈N, E, r〉 and G ′ = 〈N ′ , E ′ , r ′ 〉,a function f : N → N ′ is a homomorphism from G to G ′ , if1. f(r) = r ′ ,2. s a −→ t ∈ E =⇒ f(s) a −→ f(t) ∈ E ′ .The function f is an isomorphism if, in addition, f is bijective and f −1 is also a homomorphism.With the notions of graph homomorphism and isomorphism, we are able to define the structuralequivalence relation between types in terms of their rooted graphs.Definition 4 (Structural Equivalence) Two rooted graphs G and G ′ are structural equivalent,denoted as G ≃ G ′ , if there exists an isomorphism f : G → G ′ such that1. n ∈ C ∪ X =⇒ f(n) = n,2. n ∈ S =⇒ f(n) ∈ S .Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Graphs 15u ∼ v ∼ w ∼ ru, v, w, r ∈ SSqavawaqa 2a 1ubbr a b ua 1a 2vbbBbpZpu ∼ vFigure 2: Equivalent Recursive TypesThe structural equivalence relation does not handle the equivalence between a structural recursivetype, in which an attribute refers back to its owner type, and the unfolded variants of the recursivetype, as shown in Fig. 2. For this, we define the type reduction relation.Definition 5 (Reduction) A rooted graph G is reducible to a rooted graph G ′ , denoted asGG ′ , if there exists a surjective graph homomorphism f : G → G ′ such that,1. n ∈ C ∪ X =⇒ f(n) = n,2. n ∈ S =⇒ f(n) ∈ S ,3. f(n) a −→ · ∈ G ′ =⇒ n a −→ · ∈ G.We call such f a reduction function.Definition 6 (Type Equivalence) Two rooted graphs G and G ′ are type equivalent, denotedas G ∼ G ′ , if there exists a rooted graph G r such that both G and G ′ are reducible to G r .It is straightforward to verify that ∼ is an equivalence relation. In addition, G ∼ G ′ holds if andonly if there exist G 1 and G ′ 1 such that G is reducible to G 1, G ′ is reducible to G ′ 1 , and G 1 andG ′ 1 are structural equivalent, i.e., G 1 ≃ G ′ 1 .In general, we use Γ ⊢ n ♦ n ′ to denote that two rooted graphs Γ n and Γ n ′ are relatedby a relation ♦, i.e., Γ n ♦ Γ n ′ . We also say the two nodes n and n ′ have relation ♦. Inparticular, two nodes n and n ′ in a type graph Γ are called type equivalent if Γ ⊢ n ∼ n ′ .It is worth pointing out that if two types T and T ′ are equivalent, attributes of T and T ′ withthe same names also refer to equivalent types. Restricting a reduction function to a subgraph,we haveG f G ′ ∧ n ∈ G =⇒ G n f0 G ′ f(n),where f denotes the reduction defined by f, and f 0 = f | (Gn).ns . Together with Definition 3,we have the following lemma.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Graphs 16Lemma 1 If G ∼ G ′ and a −→ n ∈ G, there exists a −→ n ′ ∈ G ′ and G n ∼ G ′ n ′ .Proof. By induction on the length of a.□We extend the transitive closure of the ⊲ relation to type equivalent rooted graphs.Definition 7 (Subclass) A rooted graph G ′ is a subclass of a rooted graph G, denoted as G ′ G, if∃ ⊲ −→ ∗ n ∈ G ′ • G ′ n ∼ G.3.4 Checking Type EquivalenceGiven two rooted graphs X and Y , we check if X and Y are equivalent by using the functiont-eq(X, Y ). The function traverses the two graphs and generates a set Q = (s, t) of pairs. Each(s, t) of the pairs consists of two sets s and t of nodes of X and Y , respectively. Starting with({X.}, {Y.}), each next step checks the targets of the outgoing edges, that have identicallabels, of nodes in each pair of sets already generated. If the node of X (or Y ) being visitedwas visited before in step i, the corresponding nodes of Y (or X, respectively) being visited areadded to the set of nodes of Y (or X, respectively) in the pair generated in step i; otherwise, anew pair of sets containing these nodes respectively is generated. The traverse terminates whenthere exists a pair (s, t) for that the set of labels of the outgoing edges of s is different from thatof t. In this case, the checking function returns false. Otherwise, the checking terminates whenall nodes are visited and the checking function returns true along with the generated set Q ofpairs, if and only if, for any pair (s, t) in Q, s and t contain only structural nodes, or are bothsingletons and equal.Table 2 shows the definition of the auxiliary function t-eq 0 . The definition is given by a list ofinference rules, evaluated from the first one. If, for a rule, all the conditions above the line hold,the evaluation reduces to the part of the rule below the line, otherwise the next rule is furtherevaluated. We define the t-eq function based on t-eq 0 ast-eq(X, Y ) ̂= t-eq 0 (X, Y, {({X.}, {Y.})}) .If t-eq(X, Y ) returns (Q, true), X and Y are equivalent and we can construct the graph G r fromQ that both X and Y are reducible to. For a pair ({C}, {C}) in Q, where C is nominal, G r hasnode C. And for a pair of sets (s, t) of structural nodes in Q, G r has a corresponding structuralnode n that represents all the equivalent nodes in the pair. The outgoing edges of n correspondto the outgoing edges of the equivalent nodes in s and t.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 17Table 2: Type Equivalencet-eq 0 (X, Y, Q) ̂=(s, t) ∈ Q x ∈ s y ∈ t x −→ a x ′ ∈ X x ′ /∈ Q.sy −→ a y ′ ∈ Y y ′ ∈ t ′ (s ′ , t ′ ) ∈ Q y −→ a y ′ ∈ Y y ′ /∈ Q.tt-eq 0 (X, Y, Q {(s ′ , t ′ )} ∪ {(s ′ ∪ {x ′ }, t ′ )} t-eq 0 (X, Y, Q ∪ {({x ′ }, {y ′ })})false(s, t) ∈ Q x ∈ s y ∈ t y a −→ y ′ ∈ Y y ′ /∈ Q.tt-eq 0 (Y, X, {(v, u) | (u, v) ∈ Q})(s, t), (s ′ 1, t ′ 1), (s ′ 2, t ′ 2) ∈ Q(s ′ 1, t ′ 1) ≠ (s ′ 2, t ′ 2) x 1 , x 2 ∈ s x 1a−→ x′1 , x 2a−→ x′2 ∈ X x ′ 1 ∈ s ′ 1 x ′ 2 ∈ s ′ 2t-eq 0 (X, Y, Q {(s ′ 1, t ′ 1)} {(s ′ 2, t ′ 2)} ∪ {(s ′ 1 ∪ s ′ 2, t ′ 1 ∪ t ′ 2)})(s, t), (s ′ 1, t ′ 1), (s ′ 2, t ′ 2) ∈ Q(s ′ 1, t ′ 1) ≠ (s ′ 2, t ′ 2) y 1 , y 2 ∈ t y 1a−→ y′1 , y 2a−→ y′2 ∈ Y y ′ 1 ∈ t ′ 1 y ′ 2 ∈ t ′ 2t-eq 0 (X, Y, Q {(s ′ 1, t ′ 1)} {(s ′ 2, t ′ 2)} ∪ {(s ′ 1 ∪ s ′ 2, t ′ 1 ∪ t ′ 2)})(s, t) ∈ Qn 1 ≠ n 2n 1 , n 2 ∈ s ∪ t{n 1 , n 2 } Sfalse(s, t) ∈ Q x ∈ s y ∈ t{a | x a −→ · ∈ X} ≠ {b | y b −→ · ∈ Y }false (Q, true) ,where Q is a set of pairs and each pair consists of two sets of equivalent nodes, Q.s = ⋃ (s,t)∈Q s andQ.t = ⋃ (s,t)∈Q t.4 Type OperationsIn an rcos/g program, types are specified by type expressions. Besides atomic type expressions,such as primitive types, classes and type variables, a type expression also consists of typeoperations.We have two type operations, class conjunction and class instantiation. Since types are representedby rooted graphs, type operations are defined in terms of graph operations. In thissection, we define conjunction graphs and instantiation graphs for class conjunctions and classinstantiations, respectively.The graph operations share a common idea — the result graph is constructed by introducing newstructural nodes and new edges pointing into the operand graphs without changing any existingrooted graphs.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 18C 1Da⊲σσbm 1 m 2m 2 m 3T 1 Txy 2yC 2bσm 3Tz 3zam 2bm 1 m 3 iaσi I AbraΓ ⊢ r ∼ (C 1 ⋓ C 2 )ivΓ ⊢ v ⧁ AXxByYbFigure 3: A Conjunction Graph and a Shallow Structure4.1 Shallow StructuresUnlike in C++ where a member of an object is a nested sub-object, a member of an object inrcos/g is a reference to another object. Therefore we say the member structures are shallow.Different types represented by rooted graphs may have the same structure, although their rootsmay have different names. To compare the structures of two types, we define their correspondingshallow structures, whose roots are anonymous, and check if they are equivalent.Definition 8 (Shallow Structure) A rooted graph G ′ is a shallow structure of a rooted graphG, denoted as G ′ ⧁ G, if1. the root of G ′ is a structural node, G ′ . ∈ S ,2. for a node adjacent to the root of G, there is an equivalent node adjacent to the root of G ′through the same label, a −→ n ∈ G =⇒ ∃ a −→ n ′ ∈ G ′ • G n ∼ G ′ n ′ ,3. there are no more outgoing labels from the root of G ′ than from the root of G, a −→ · ∈ G ′ =⇒∃ a −→ · ∈ G.Fig. 3 (right) shows a class A and its shallow structure rooted at v. Note that the graph ofa generic class C has a nominal root. We will see later in this section, an instantiation of Cwill first create a shallow structure of C, called the template of instantiation for C, and thensubstitute actual types for the type variables.Given a type graph Γ and a node n, function shlo introduces a new structural node n ′ , and newedges from n ′ to the adjacent nodes of n. The new graph rooted at n ′ is a shallow structure ofΓ n. Formally, shlo(Γ, n) ̂= (Γ ′ , n ′ ), whereΓ ′ .ns = Γ.ns ∪ {n ′ }, Γ ′ .es = Γ.es ∪ {n ′ a −→ t | na −→ t ∈ Γ.es}, n ′ ∈ S Γ.ns.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 194.2 Conjunction GraphsA class contains all the attributes and methods of its direct superclass as well as the new membersintroduced in the class itself. Inductively along the inheritance hierarchy, the effective memberstructure of a class is a conjunction of all the members of its superclasses. The effective memberstructure is also called the interface of the class.The notion of conjunctions can be extended to a set of arbitrary classes. A conjunction of aset of classes consists of all the effective members of these classes. For the conjunction of a setof classes to be defined, these classes must be consistent that members with the same name indifferent classes must have equivalent types. We often use a conjunction to combine a numberof interfaces to define a structural requirement that is to be implemented by a concrete class,hence the term “conjunction”. The requirement is satisfied only when all of the interfaces areimplemented.Definition 9 (Conjunction Graph) Given a list of rooted graphs G, a rooted graph G ′ is aconjunction graph of G, denoted as G ′ ∼ ⋓ G, if1. G ′ . ∈ S ,2. all attributes in G are included in G ′ ,G ∈ G ∧ a ∈ A ∧ ⊲ −→ ∗ a −→ n ∈ G =⇒ ∃ a −→ n ′ ∈ G ′ • G n ∼ G ′ n ′ ,3. G ′ only contains attributes in G,a ≠ σ ∧ a −→ · ∈ G ′ =⇒ a ∈ A ∧ ∃G ∈ G • ⊲ −→ ∗ a −→ · ∈ G,4. all method signatures in G are included in G ′ ,G ∈ G ∧ m ∈ A ∧ ⊲ −→ ∗ σ.m −−→ n ∈ G =⇒ ∃ σ.m −−→ n ′ ∈ G ′ • G n ∼ G ′ n ′ ,5. G ′ only contains method signatures in G, −−→ σ.m · ∈ G ′ =⇒ m ∈ A ∧ ∃G ∈ G • −→ ⊲ ∗ −−→ σ.m · ∈ G.Fig. 3 (left) shows a conjunction graph rooted at r, of two classes C 1 and C 2 . A conjunctiongraph can be considered as a shallow structure of multiple classes.In a type graph Γ, it is often the case that two nodes n and n ′ lead to equivalent nodes n 1 andn ′ 1 through the same attribute navigation path a. We can thus merge n 1 and n ′ 1 to obtain asimpler graph. Formally, let n be a list of nodes of Γ and a a navigation path, provided that∀t 1 , t 2 ∈ {t | n ⊲ −→ ∗ a −→ t ∈ Γ, n ∈ n} • Γ ⊢ t1 ∼ t 2 ,we define co-target(Γ, n, a) ̂= ts, where{⊲{tts = 0 } if ∃n 0 ∈ n • n 0 −→∗−→ a t 0 ∈ Γ,∅ otherwise.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 20Based on function co-target, we define a function conj to construct a conjunction graph. Formally,conj (Γ, n) ̂= (Γ ′ , r), whereΓ ′ .ns = Γ.ns ∪ {r, s} ∧ r, s ∈ S Γ.ns ∧ r ≠ s,Γ ′ .es = Γ.es ∪ {r σ −→ s}It is easy to prove the following lemma.∪ {r a −→ t | t ∈ co-target(Γ, n, a), a ∈ A }∪ {s m −→ t | t ∈ co-target(Γ, n, σ.m), m ∈ A }.Lemma 2 If Γ ′ is defined, then Γ ′ ⊢ r ∼ ⋓ n.In particular, the conjunction graph of a single rooted graph G is the effective member structureof G, called the memberwise closure graph, denoted as clo(G). Formally, clo(G) ̂= Γ ′ r, where(Γ ′ , r) = conj (G.base, G.).With the memberwise closure graph, we can retrieve the attributes and methods of a classdirectly. For a method m in a rooted graph G, we define a function mfr(G, m) to get the edgesof the method frame. Formally, mfr(G, m) ̂= out-es(clo(G), n), where −−→ σ.m n ∈ clo(G).4.3 Class ConstraintsA type variable can be declared with constraints that its replacement classes should satisfy.We have two kinds of constraints for classes, member constraints and superclass constraints. Amember constraint lists the minimal set of members for a class, while a superclass constraintrequires a class to have a specific superclass. A type variable is possible to have both a memberconstraint and a superclass constraint.We encode the constraints on a type variable as a graph rooted at the type variable. This allowsus to assume the members and superclass of the type variable by its graph. When the typevariable is replaced by a class satisfying the constraints, the type-safety is retained.Let Γ be a type graph, n the nodes representing types S and t the node representing type T .A member constraint α implements⋓ S on a type variable α is encoded as a conjunction graphof Γ n, with the root of the graph changed to α. A superclass constraint α extends T on α isencoded as an edge α ⊲ −→ t.The satisfaction of a member constraint can now be formalized as a memberwise cover betweentwo rooted graphs.Definition 10 (Memberwise Cover) A rooted graph G ′ is a memberwise cover of a rootedgraph G, denoted as G ′ ̂⊃ G, ifReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 211. all effective attributes in G are included in G ′ ,a ≠ σ ∧ a −→ n ∈ clo(G) =⇒ ∃ a −→ n ′ ∈ clo(G ′ ) • G n ∼ G ′ n ′ ,2. all effective method signatures in G are included in G ′ , −−→ σ.m n ∈ clo(G) =⇒ ∃ −−→ σ.m n ′ ∈ clo(G ′ ) • G n ∼ G ′ n ′ .If a rooted graph G ′ contains an ⊲-chain from the root to a node n, G ′ is a subclass of thesubgraph rooted at n of G ′ . We call G ′ a superclass cover of a rooted graph that encodes asuperclass constraint ⊲ −→ n.Definition 11 (Superclass Cover) A rooted graph G ′ is a superclass cover of a rooted graphG, denoted as G ′ ̂⊲ G, if ⊲ −→ n ∈ G =⇒ G ′ G n.4.4 Instantiation GraphsGiven a generic class C〈α〉 with a list α of type variables, C〈α ↦→ Te〉 is a class instantiationof C〈α〉, where Te is a list of type expressions. We call 〈α ↦→ Te〉 the type substitution of theinstantiation. The root of the graph of the generic class C〈α〉 is named with C. However, theroot of the graph of an instantiation C〈α ↦→ Te〉 is a structural node. And in this instantiationgraph, all the subgraphs rooted at the type variables α in the graph of C〈α〉 are replaced by thecorresponding graphs of the actual types Te. Let u be the roots of the graphs of Te. We alsocall 〈α ↦→ u〉 the type substitution of the instantiation of the graph of C〈α〉 with the graph ofC〈α ↦→ Te〉.Generally, any rooted graph containing type variables is generic, and can be used as a templatefor instantiation. Type variables only appear in the graph of a generic class, or an instantiationgraph resulted by using type variables as actual types. We unify the two cases by using theshallow structure of a generic class as the template for the instantiation of the generic class. Infact, the shallow structure is type equivalent to an instantiation graph of the generic class overan identity type substitution 〈α ↦→ α〉. Moreover, since a generic class itself cannot be used as atype, there is no edge pointing to the root of its graph. Thus, in a template graph, nodes on apath to a type variable can only be structural nodes or type variables.To use a homomorphism to define the instantiation of a template G over a type substitution〈α ↦→ u〉, we define the subgraph of G terminating at the α-nodes. Given a list n of nodes ina rooted graph G, a subgraph of G terminating at n is obtained from G by removing all theoutgoing edges from n,G n ̂= G out-es(G, n).Notice that G n has the same root as G.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 22C〈α, β〉{α a; β b}aσCZaαβσbααββbaασβbSA〈α〉{B〈β ↦→ α〉 b; α e}AσbaeσβααbσbeσeβσαZαaA〈α ↦→ Z〉σββbαeβC〈α ↦→ Z, β ↦→ C〈α ↦→ S, β ↦→ S〉〉abeσαB〈β〉{A〈α ↦→ β〉 a}σaσBFigure 4: Generic Classes and Class InstantiationsDefinition 12 (Instantiation Graph) Given a type graph Γ, two nodes t, j ∈ Γ and a typesubstitution s = 〈α ↦→ u〉, let G = Γ t α and J = Γ j. J is an instantiation graph of tover s if there exists a graph homomorphism f : G → J such that1. f(α) = u,2. n ∈ C ∪ X {α} =⇒ f(n) = n,3. n ∈ S =⇒ f(n) ∈ S ,4. n, n ′ /∈ α ∧ f(n) = f(n ′ ) =⇒ n = n ′ ,5. n /∈ α ∧ f(n) −→ a · ∈ J =⇒ n −→ a · ∈ G.We call f the instantiation function, and j an instantiation node of t. By projecting the instantiationfunction f to a subgraph of G, we have a property that f(n) is an instantiation node ofa node n ∈ G.Fig. 4 (left) shows a generic class C and two instantiation graphs of C, one over 〈α ↦→ S, β ↦→ S〉as a subgraph, the other over 〈α ↦→ Z, β ↦→ C〈α ↦→ S, β ↦→ S〉〉. Fig. 4 (right) shows two mutuallyrecursive generic classes A and B. Each graph rooted at a double circle is type equivalent to aninstantiation graph of A. The left one in A is over 〈α ↦→ α〉, the middle two are over 〈α ↦→ Z〉,and the right one in B is over 〈α ↦→ β〉.In a type graph, instantiations of a template over the same type substitution are structuralequivalent.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 23Lemma 3 For a type graph Γ, a node t ∈ Γ, a type substitution s = 〈α ↦→ u〉, and two instantiationgraphs J 1 and J 2 of t over s, J 1 ≃ J 2 .Proof. Let G = Γ t α, f 1 and f 2 be the instantiation functions from G to J 1 and J 2 ,respectively, and U = {u | α ∈ G, α ↦→ u ∈ s}. Notice that U excludes those mappings in s oftype variables not in G. We can construct the isomorphism h : J 1 → J 2 as follows:– h(n) = n, if n ∈ U, since n must be in both J 1 and J 2 ,– h(n) = f 2 (f −11 (n)), if n /∈ U, since both f 1 and f 2 are injective on nodes not in α.We can show that if t ∈ α, h(J 1 .) = J 1 . = f 1 (t) = f 2 (t) = J 2 ., otherwise, h(J 1 .) =f 2 (f −11 (J 1.)) = f 2 (t) = J 2 .. Thus, h is an isomorphism preserving nominal nodes. □For a node t ∈ Γ, we denote an instantiation node j of t over a type substitution s as Γ ⊢ j ≃ t〈s〉.For a generic class C ∈ Γ, let (Γ ′ , n) = shlo(Γ, C), if a node j ∈ Γ and a type substitution ssatisfy Γ ′ ⊢ j ≃ n〈s〉, we also say that j is an instantiation node of C over s in Γ, denoted asΓ ⊢ j ≃ C〈s〉.Given that Γ ⊢ j ≃ t〈s〉, a node n that is type equivalent to j in Γ is called a class instantiationof t over s, denoted as Γ ⊢ n ∼ t〈s〉.Table 3: Construction of Instantiationsins 0 (Γ, ts, f) ̂=t ∈ ts S dom(f)ins 0 (Γ, ts, f ∪ {t ↦→ t})t ∈ ts dom(f) j ∈ S Γ.ns ns = adj -ns(Γ, t)(〈N ′ , E ′ 〉, f ′ ) = ins 0 (〈Γ.ns ∪ {j}, Γ.es〉, ns, f ∪ {t ↦→ j})ins 0 (〈N ′ , E ′ ∪ {j a −→ f ′ (n) | n ∈ ns}〉, ts, f ′ ) (Γ, f) .We now show that for a type graph Γ, a node t ∈ Γ and a type substitution s, there is alwaysan extension Γ ′ of Γ with a partial function f from Γ to Γ ′ such that Γ ′ ⊢ f(t) ≃ t〈s〉.We define the extending function as ins 0 in Table 3. The function takes a type graph Γ, a set tsof the root nodes of templates and a map f as the arguments, and returns an updated type graphwith an updated map. Initially, ts is a singleton set of the node t, and f is the type substitutions. For each structural node n ∈ ts, the function introduces a new structural node n ′ , adds amapping n ↦→ n ′ , then recursively constructs the instantiation nodes for the adjacent nodes ofn, finally adds new edges from n ′ to the instantiation nodes, corresponding to the edges from nto its adjacent nodes. For each nominal node t in ts but not in α, the function adds an identitymapping t ↦→ t. The map being updated in the function is also used as the “visited” flags, wherea node in the domain of the map is considered visited.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 24We define the inst function as a wrapper of ins 0 to pass in the required arguments. Formally,inst(Γ, t, s) ̂= (Γ ′ , f(t)),where (Γ ′ , f) = ins 0 (Γ, {t}, s). We also define the insg function to instantiate from a genericclass by first obtaining its shallow structure as the template. Formally,where (Γ ′ , n) = shlo(Γ, t).insg(Γ, t, s) ̂= inst(Γ ′ , n, s),Let α be the list of type variables in the type substitution of an instantiation. The followinglemma shows that the extending function ins 0 is correctly defined. That is, ins 0 always returnsa type graph containing the instantiation graph, provided that each node on a path from theroot of the template to a node in α is either in S or in α.Lemma 4 Given a type graph Γ, a node t ∈ Γ and a type substitution s = 〈α ↦→ u〉, let (Γ ′ , f) =ins 0 (Γ, {t}, s) and G = Γ t α, ifthen Γ ′ ⊢ f(t) ≃ t〈s〉, andis the instantiation function.∀α ∈ α, t −→ ∗ n −→ ∗ α ∈ G • n ∈ S ∪ {α}g = f ∪ {n ↦→ n | n ∈ G.ns dom(f)}Proof. We need only to show that g is the instantiation function. First, g is a homomorphism,since each new node in Γ ′ is a structural node in S , and is an image under f; for every x ∈dom(f), each of its outgoing edges x −→ a y is mapped to f(x) −→ a f(y); for all other nodes, theiroutgoing edges are not changed. Second, on a path from t with only structural nodes to a nominalnode, including those in α, all the structural nodes are mapped to the new nodes; all the othernodes except α are mapped to themselves, thus the combined map is injective. Finally, α aremapped according to the type substitution. Therefore, g is the instantiation function. □Type variables of a generic class can be used as actual types within the type expressions occurredin the generic class definition. These type expressions need to be instantiated when the genericclass is instantiated. Recall that we have two kinds of type expressions, class instantiations andclass conjunctions. An instantiation t〈s 1 〉〈s 2 〉 of a class instantiation t〈s 1 〉 is equivalent to theinstantiation of the template t over the composition s 1 ◦ s 2 of the type substitutions s 1 and s 2 .Lemma 5 Given a type graph Γ, a node t ∈ Γ and two type substitutions s 1 , s 2 , if Γ ⊢ t ′ ≃ t〈s 1 〉and Γ ⊢ t ′′ ≃ t ′ 〈s 2 〉, then Γ ⊢ t ′′ ≃ t〈s 1 ◦ s 2 〉.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 25Proof. By Definition 12 and the definition of substitution composition, we can directly establishthe instantiation function.□In addition, an instantiation of a conjunction of a set of classes can be distributed over theconjunction.Lemma 6 Given a type graph Γ, nodes t and a type substitution s, if for nodes j, j ′ , r, r ′ ,then Γ ⊢ r ′ ∼ j ′ .Γ ⊢ j ≃ t〈s〉, r ′ ∼ ⋓ j, r ∼ ⋓ t, j ′ ≃ r〈s〉,Proof. For a navigation path a in the form of a or σ.m.x, and a path j ′ a −→ u ∈ Γ, by instantiation,there is either a path r −→ a u ∈ Γ, or a path r −→ a α ∈ Γ if α ↦→ u ∈ s. Thus, by conjunction, there⊲exists a node t 0 ∈ t such that t 0 −→∗−→ a ⊲u ∈ Γ, or t 0 −→∗−→ a α ∈ Γ. Let j 0 ∈ j be the instantiation⊲node of t 0 . There exists a path j 0 −→∗−→ a u 0 ∈ Γ such that Γ ⊢ u 0 ≃ u, either directly mapped,or by the substitution α ↦→ u. Thus there exists a path r ′ a −→ u′0 ∈ Γ such that Γ ⊢ u ′ 0 ∼ u. Wehave proven that for such a navigation path a, j ′ a −→ u ∈ Γ =⇒ ∃r′ a −→ u′0 • Γ ⊢ u ′ 0 ∼ u. Theopposite is similar.□4.5 Recursive Generic ClassesFor a non-recursive generic class D〈β〉, we draw the graph of D〈β〉 directly from its textualdefinition. Based on this graph, we can construct any instantiation of D〈β〉 in the way stated inthe previous subsection. However, for a recursive generic class C〈α〉, a class instantiation C〈s〉,where s is a type substitution, may be used as the type of an attribute of C〈α〉. In this case,there is an edge from the node C to the node representing C〈s〉, thus the graph of C〈α〉 dependson the graph of C〈s〉. For the same reason the graph of C〈s〉 depends on the graph of C〈s 2 〉,and so on, where s k ̂= s k−1 ◦ s. Notice that Lemma 5 ensures that the notation of instantiationC〈s k 〉 is well defined for a positive number k. Because of the recursive instantiation, a genericclass C〈α〉 with an attribute of type C〈s〉 is defined only if the type substitution s is convergent.A type substitution s is called convergent if the set {s i | i > 0} is finite, otherwise it is calleddivergent. For example, 〈α ↦→ α〉 and 〈α ↦→ β, β ↦→ α〉 are convergent substitutions, while〈α ↦→ C〈α ↦→ α〉〉 and 〈α ↦→ γ, γ ↦→ D〈β ↦→ α〉〉 are divergent ones. A type substitution s isdivergent if and only if there is a list of mappings α 1 ↦→ Te 1 , . . . , α k ↦→ Te k (k ≥ 1) in s suchthat α i+1 occurs as an actual type in Te i for 1 ≤ i < k and α 1 occurs as an actual type in aclass instantiation in Te k .When s is convergent, the set {C〈s i 〉 | i > 0} of instantiations that the generic class C〈α〉depends on is finite, i.e., the graph of C〈α〉 is finite. Since the types C〈s i 〉 mutually refer to eachReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type Operations 26other, we construct the graph of C〈α〉 and these instantiations C〈s i 〉 of C〈α〉 all together. Wefirst identify the nodes of the graph, each of which represents a type expression that C〈α〉 refersto, directly or indirectly. This can be done by using a function to associate each type expressionwith a unique node. Then, we add the edges between these nodes based on the relations betweentheir corresponding type expressions.Let F be an injective function from type expressions to N such that F (t) = t for t ∈ C ∪ X .For a list C of classes, containing a recursive generic class C 0 〈α〉 and all the classes that C 0 refersto, generic or non-generic, the graph of C is constructed by the following procedure through F ,– for a superclass extends Cc defined in a class C, we add an edge C ⊲ −→ F (Cc),– for an attribute Te a defined in a class C, we add an edge C a −→ F (Te),– for a parameter Te x of method m defined in a class C, we add a path C σ.m.x −−−→ Te,– for each edge C a −→ F (Te) and each instantiation C〈s〉 of C, we add an edge F (C〈s〉) a −→F (Te〈s〉),– for each path C −−−→ σ.m.x F (Te) and each instantiation C〈s〉 of C, we add an edge F (C〈s〉) −−−→σ.m.xF (Te〈s〉),– for each class conjunction Te = ⋓ Cc or type variable α with constraint implements ⋓ Cc,we construct the conjunction graph G by conj and change the root of G to F (Te) or α,– for each type variable α with constraint extends Cc, we add an edge α ⊲ −→ F (Cc),where C is a generic or non-generic class in C. An instantiation of a type expression Te〈s〉 isdefined as,⎧s(Te) if Te ∈ dom(s),⎪⎨C〈s ′ ◦ s〉 if Te = C〈s ′ 〉,Te〈s〉 ̂=⋓ Cc〈s〉 if Te = ⋓ Cc,⎪⎩Te otherwise.In fact, the above procedure can be generally used to construct the graph of any class, not onlyrecursive generic classes.Fig. 5 shows the graphs of three generic classes, Pair〈α, β〉, List〈γ〉 and Tree〈δ〉. Among thethree, List and Tree are recursive. A List of γ extends a Pair of an element type γ and a Listof γ. A Tree of δ extends a Pair of an element type δ and a List of Tree of δ. In the figure, thetype expression Te in the callout box of a structural node n indicates there is a mapping Te ↦→ nin F .The soundness of the above construction procedure can be proved by the type equivalence betweenan instantiation graph of a recursive generic class C〈α〉, and the subgraph of C〈α〉 correspondingto the class instantiation.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 27αaγPairaList⊲Pair〈〉α ↦→ γ,β ↦→ List〈γ ↦→ γ〉δaTree⊲bList〈γ ↦→ Tree〈δ ↦→ δ〉〉bβb⊲Pair〈〉α ↦→ δ,β ↦→ List〈γ ↦→ Tree〈δ ↦→ δ〉〉⊲⊲ab〈α ↦→ Tree〈δ ↦→ δ〉,〉List〈γ ↦→ γ〉Tree〈δ ↦→ δ〉Pairβ ↦→ List〈γ ↦→ Tree〈δ ↦→ δ〉〉Figure 5: Recursive Generic ClassesLemma 7 Let F be an injective function from type expressions to N such that F (t) = t fort ∈ C ∪ X , and Γ be the type graph containing the graph of a generic class C 0 〈α〉 constructedin the above procedure through F . For any node j in Γ and type substitution 〈α ↦→ Te〉 such thatΓ ⊢ j ≃ C 0 〈α ↦→ F (Te)〉, Γ ⊢ j ∼ F (C 0 〈α ↦→ Te〉).Proof. Let f be the instantiation function from Γ C 0 α to Γ j. We can construct thereduction function g from Γ j to Γ F (C 0 〈α ↦→ Te〉) as follows,– g(n) = F (C 0 〈α ↦→ Te〉), if n = j,– g(n) = n, if ∃α ∈ α • f(α) = n,– otherwise, g(n) = F (F −1 (f −1 (n))〈α ↦→ Te〉).□5 Type SystemThe type system checks whether a program is well-typed statically based on a well-formed typegraph of the program. The type-checking is done in a bottom-up way, from the checking oftype expressions, expressions and commands to that of methods and programs. Commands aretranslated to an environment dependent version while being checked, where type expressions inthe commands are replaced by their corresponding type nodes in the type graph.For example, the well-typedness of the command C〈α ↦→ B〉.new(o); o.a := 1 requires the typegraph to have a node t representing the class instantiation C〈α ↦→ B〉 which has an outgoing edgea to Z. The type-checking procedure translates the command C〈α ↦→ B〉.new(o); o.a := 1 to itsenvironment dependent version ⌈t⌋.new(o); o.a := 1, where ⌈⌋ encloses a type node.After the type-checking, an environment graph is constructed from the type graph to store theReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 295. Generic classes: nodes C ∈ U , with 2†, a σ-labeled edge to a generic static table, and5† an optional ⊲-labeled edge to a class or a class instantiation,such that there is a path C σ.α −−→ α for any type variable α reachable from C.6. Non-generic classes, or classes for short: nodes in U that reach no type variable, with 2†, 5†,and6† a σ-labeled edge to a non-generic static table.7. Type variables: nodes in X , with 2†, 6†, and an optional ⊲-labeled edge to a class or a classinstantiation or a type variable.8. Class conjunctions: nodes in S , with 2† and 6†.9. Class instantiations: nodes j ∈ S , such that Γ ⊢ j ∼ C〈s〉 for some generic class C andwell-formed type substitution s.Here and after, when we talk about nodes in a type graph, a class type means a node that belongsto either the subsets 6, 7, 8 or 9, while a type means either a class type or a primitive type. Inaddition, a static table means either a non-generic static table or a generic static table.Notice that we add an empty static table to each of the primitive types, so that we can treatprimitive types and classes uniformly in the type system. It is also worth pointing out thatthe above partition is unique, since each node belongs to at most one specific subset under anypartition, decided by its own outgoing edges.Definition 14 (Well-Formed Type Graph) A well-formed type graph is a type graph withwell-formed nodes that satisfies1. there is no cycle consisting of only ⊲-labeled edges,2. any two generic classes have two disjoint sets of type variable labels outgoing from their statictables.5.3 Environment Dependent CommandsThe type information needed for program execution is encoded in an environment graph. Forthis purpose, we need to translate the type expressions in the commands to the type nodes ofthe environment graph. A translated command is called an environment dependent command,whose syntax is the same as that of rcos/g except for the following primitives involving types.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 30c ::= . . . other commands| var ⌈t⌋ x variable declaration| ⌈t⌋.new(le) object creatione ::= . . . other expressions| (⌈t⌋)e typecastextv ::= ⌈t⌋ x external variable decl.⌈t⌋type nodeThrough the type nodes stored in the environment dependent commands, we can lookup thetype information in the corresponding environment graph.5.4 Type Graph CompletenessGiven a program, the type graph Γ of the program must include the graphs of all the possible typeexpressions used in the program. The graphs of the type expressions used in class attribute andmethod parameter definitions are constructed by the procedure described in §4.5. For the typeexpressions used in commands and expressions, we construct the conjunction and instantiationgraphs by applying the shlo, conj , inst and insg functions on the graphs of the operands ofthe type operations, and extend the type graph to include the resulted graphs. We do this byextending the type graph Γ to a new type graph Γ ′ . We only introduce new nodes with newedges pointing into the old graph Γ. The following lemma ensures that any existing rooted graphin Γ remains unchanged when the type graph is extended.Lemma 8 If a type graph Γ ′ is the result of an application of shlo, conj , inst or insg to a typegraph Γ, then Γ n = Γ ′ n for all nodes n ∈ Γ.Proof. By the definitions of shlo, conj , inst and insg.□Let f be one of the shlo, conj , inst and insg functions, Γ a type graph, and X the rest of thearguments of f. We can always extend Γ to Γ ′ without changing any rooted graphs in Γ, providedthat (Γ ′ , n ′ ) = f(Γ, X) is defined. To simplify the type-checking rules, we require all the rootedgraphs of type expressions to be already in Γ, and abbreviateas Γ ⊢ n ∼ f(X).let (Γ ′ , n ′ ) = f(Γ, X) in ∃n ∈ Γ • Γ ′ n ′ ∼ Γ n5.5 Type ContextsA type graph Γ itself is not enough to decide whether an expression or a command is well-typed.For example, a variable x is well-typed inside a local scope var T x; · · · ; end x, but it is likely toReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 31∆ 1 ∆ 2 = pop(∆ 1 ) push (∆ 2 , (y, z), (A, B))x zz$$lnk Blnk BlnkB ybxb ybx$b ybx$b y $CaA ZC A ZiaC A Zia iFigure 6: Stack Push and Popbe ill-typed (undefined) with respect to Γ. To address this issue, we introduce a graph notationof type context that describes the the scopes of local variables. A type context graph consistsof nodes representing scopes, and edges labeled by the variables from their scope nodes to thecorresponding type nodes in the underlying type graph.Let O be a new infinite set of nodes disjoint with N .Definition 15 (Type Context) A type context is a rooted, directed and labeled graph ∆ =〈N, E, r〉, where– N ⊆ N ∪ O is the set of nodes, containing type nodes N ∩ N and scope nodes N ∩ O,– E : N × (A ∪ {$, self }) → N is the set of edges,– r ∈ N ∩ O is the root of the graph.A type context ∆ = 〈N, E, r〉 is well-formed if the following conditions hold,1. each outgoing edge from a scope node is labeled by a variable (or self ), and targets at a typenode,2. all the scope nodes are on a path starting from the root r, with edges labeled by $, and3. except the root r, which has no incoming edge, each scope node has exactly one incomingedge.In Fig. 6, the dashed subgraphs are examples of type contexts.A type context represents a snapshot of the type environment at a time of type-checking, recordingthe types of variables, including self , declared in different scopes at that time. A type contexthas a stack structure. When the type-checking enters a new scope, a node with outgoing edgesrecording the variables in the scope is pushed onto the top of the stack. When the type-checkingexits a scope, the top node of the stack, together with its outgoing edges, is popped out, so thatthe type context recovers to the one exactly before entering the scope.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 32For a list x of variables and a list n of nodes, push(∆, x, n) adds a new scope to a type context∆ with outgoing edges labeled by x and pointing to n, respectively. Formally,push(∆, x, n) ̂= 〈N ′ , E ′ , r ′ 〉,wherer ′ ∈ O ∆.ns, N ′ = ∆.ns ∪ {r ′ , n}, E ′ = ∆.es ∪ {r ′ $ −→ r, r ′ x −→ n}.As shown in Fig. 6, ending a scope pops the root out of the stack by simply removing it, alongwith all its outgoing edges, from the type context, and the next node on the stack becomes theroot. Formally,pop(∆) ̂= ∆ r next , provided $ −→ r next ∈ ∆.A type context only adds and removes scope nodes, pointing into the underlying type graph,and does not introduce any other change.Let ∆ be a type context, n one of its scope nodes and w a variable (or self ). We define a partialfunction search(∆, n, w) that searches for the edge labeled by w in ∆ from n node-by-node downthe stack. Formally, search(∆, n, w) ̂= d, where{w w n −→ t if n −→ t ∈ ∆,d =search(∆, n ′ , w) otherwise, if n −→ $ n ′ ∈ ∆.And search(∆, w) ̂= search(∆, ∆., w) searches from the root. Notice that the recursion alwaysterminates as there is no loop formed by $-edges.Based on search, the function∆(w) ̂= target(search(∆, w))returns the type of w in ∆. ∆(w) exists if and only if there is an edge in ∆ labeled by w.Moreover, we use var(∆, n) to denote the set of variables (together with self ) recorded in thescope node n of ∆, size(∆, n) the length of the $-path starting from n in ∆, and depth(∆, n)the length of the $-path from the root to n in ∆. Formally,var(∆, n) ̂= {w | t ∈ N , n −→ w t ∈ ∆},{1 + size(∆, nsize(∆, n) ̂=′ ) if n −→ $ n ′ ∈ ∆,0 otherwise,depth(∆, n) ̂= size(∆, ∆.) − size(∆, n).We abbreviate var(∆, ∆.) as var(∆), i.e., the set of variables in the current scope, andsize(∆, ∆.) as size(∆), i.e., the length of the entire $-path.The type-checking possibly updates the type context, and translates type expressions to theircorresponding nodes in the type graph. Generally, the type-checking is a quintupleΓ, ∆ → ∆ ′ ⊢ X → X ′Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 33Table 4: Type-Checking of Type Expressions and Subtyping of Type Nodes(T-Name) T ∈ C ∪ X t-var(Γ, T ) = ∅Γ ⊢ T → ⌈T ⌋Γ ⊢ Te → ⌈t⌋ Γ ⊢ public(t) Γ ⊢ n ∼ conj (t)(T-Conj)Γ ⊢ ⋓ Te → ⌈n⌋C ∈ C {α} = t-var(Γ, C) Γ ⊢ Te → ⌈t⌋ Γ ⊢ 〈α ↦→ t〉 Γ ⊢ j ∼ insg(C, 〈α ↦→ t〉)(T-Ins) ,Γ ⊢ C〈α ↦→ Te〉 → ⌈j⌋(-Nul) t /∈ D t −→ σ · ∈ ΓΓ ⊢ Null t(-Isa) Γ ⊢ t′ tΓ ⊢ t ′ t(-Cvr) t ∈ S t-var(Γ, t) = ∅ Γ ⊢ t′ ̂⊃ tΓ ⊢ t ′ . tmeaning that in the type-checking, X is translated to X ′ , X ′ is well-typed under Γ and ∆, and∆ is updated to ∆ ′ after the checking.5.6 Type-Checking of Type ExpressionsFor a type graph Γ, a type expression Te and a node t, we use Γ ⊢ Te → ⌈t⌋ to denote thatTe is translated to ⌈t⌋, and t is well-typed under Γ. The well-typedness of a type expression isindependent of type contexts. For a class instantiation, we require that the type substitutiononly contains the type variables specified in the generic class. We define a function t-var(Γ, t)to return the set of type variables of a node t in Γ. Formally,t-var(Γ, t) ̂= {α | t σ.α −−→ · ∈ Γ}.We also define a function gcls(Γ, j) to return the generic class of a class instantiation j in Γ.Formally, gcls(Γ, j) ̂= C, provided thatC ∈ C ∧ t-var(Γ, C) = t-var(Γ, j) ≠ ∅.The type-checking rules for type expressions are given in Table 4.Given a program and its type graph Γ, we define a function visib(Γ, t, a), in Table 5, to return thevisibility of an attribute a of type t. We say that a type t is fully public, denoted as Γ ⊢ public(t),if all its attributes are public, i.e.,∀ a −→ · ∈ clo(Γ t) • visib(Γ, t, a) = publicReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 34Table 5: Visibility of Attributesvisib(Γ, t, a) ̂=“class C {visib Te a}” t = Ct −→ ⊲ t ′ ⊲−→∗ a−→ · ∈ Γvisib ≥ visib(Γ, t ′ , a) > private visibvisibt ∈ C t −→ ⊲ t ′ ∈ Γvisib(Γ, t ′ , a) > privatevisib(Γ, t ′ , a)t ∈ S t −→ a · ∈ ΓC = gcls(Γ, t)visib(Γ, C, a) publict ∈ Xt −→ a · ∈ Γ,publicvisib(Γ, u, a) = private(V-Priv) Γ ⊢ ∆(self ) ∼ uΓ, ∆ ⊢ u sees avisib(Γ, u, a) = protect(V-Prot) Γ ⊢ ∆(self ) uΓ, ∆ ⊢ u sees avisib(Γ, u, a) = public(V-Pub) ,Γ, ∆ ⊢ u sees awhere we define the visibility relation as public > protect > privateAdditionally, Γ ⊢ public(t) denotes that all the types in t are fully public. To conform to thevisibility rule, when we take the conjunction of some types, we require that they are fully public,since all of the attributes in a class conjunction are assumed visible.Based on the visibility of an attribute a of a type u, whether u can navigate through a in amethod m is also determined by the class C in which m is defined. The class C is recorded asthe type of self in the type context ∆. Given a type graph Γ and a type context ∆, we denotethe accessibility of a from u as Γ, ∆ ⊢ u sees a, defined by the (V-Priv), (V-Prot) and (V-Pub)rules in Table 5.5.7 Subtype RelationThe compatibility of types should be checked in various occasions, including typecasts, assignmentsand method invocations. The compatibility is governed by the subtype relation betweentype expressions. Since we have translated type expressions to type nodes, we define the subtyperelation between type nodes.Given a type graph Γ, a node t ′ is a subtype of a node t, denoted as Γ ⊢ t ′ t, if either t ′ is asubclass of t, or t is a class conjunction and t ′ memberwise covers t. The rules for the subtyperelation are given in Table 4.Theorem 1 (Subtyping Partial Order) The subtype relation given in Table 4 is a partialReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 35order.Proof. Since ⊲-cycles are not allowed, and by Definition 10 and 9, we can verify that two wellformedclass conjunctions memberwise covering each other are type equivalent, so the subtyperelation is antisymmetric.We can also verify that the ̂⊃ relation is transitive, and furthermore, by Definition 9, 7 andLemma 1, G ′ G implies G ′ ̂⊃ G, hence the subtyping relation is transitive.□5.8 Type-Checking of CommandsFor a type graph Γ, a type context ∆, an expression e and a type node t, we use Γ, ∆ ⊢ e → e ′ : tto denote that e is translated to e ′ , e ′ is well-typed under Γ and ∆, and e ′ is of type t in Γ. Thetype-checking of an expression does not update the type context, although it looks up variablesin the type context.For constant literals l, including those of the primitive types and the null object, we use T(l) todenote their types. For example, T(5) = Z, T(false) = B and T(null) = Null. The type-checkingrules for expressions are given in Table 6.A well-typed command must satisfy certain conditions by its own. For example, the typesof the two expressions on the left-hand side and right-hand side of an assignment should becompatible, and the types of actual arguments in a method invocation should be consistent withthe method signature. Such kind of requirements is called local well-typedness. Furthermore,a locally well-typed command c under Γ and ∆ is well-typed, denoted as Γ, ∆ ⊢ c → c ′ , if itsvariable declarations and undeclarations always match. This means the original type context isnever under-broken, and restored afterwards, during the checking.Besides checking whether the stack sizes of the starting type context size(∆) and ending typecontext size(∆ ′ ) are the same, we also need to distinguish commands such as c 1 = (var x; end x)and c 2 = (end x; var x). Both of them are locally well-typed, but the latter is not well-typed as awhole. For a type graph Γ, a type context ∆ and a command c, we define the local well-typednessas Γ, ∆ → ∆ ′ : k ⊢ c → c ′ , to denote that c is translated to its environment dependent version c ′ ,c ′ is locally well-typed under Γ and ∆, ∆ is updated to ∆ ′ after the checking, and the smallestsize of the type context during the checking is k, where k is a natural number.Now, it is clear that Γ, ∆ ⊢ c → c ′ if and only if Γ, ∆ → ∆ : size(∆) ⊢ c → c ′ , i.e., the commandc maintains the size of the type context to be no less than the original and eventually restoresthe original type context. We give the type-checking rules for commands in Table 8.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 36Table 6: Type-Checking of ExpressionsT(l) = B(T-Lit)Γ, ∆ ⊢ l → l : B∆(self ) = u(T-Self)Γ, ∆ ⊢ self → self : u(T-Op) f : B → B′ Γ, ∆ ⊢ e → e ′ : BΓ, ∆ ⊢ f(e) → f(e ′ ) : B ′∆(x) = t(T-Var)Γ, ∆ ⊢ x → x : t(T-Attr) Γ, ∆ ⊢ e → e′ : u a −→ t ∈ clo(Γ u) Γ, ∆ ⊢ u sees aΓ, ∆ ⊢ e.a → e ′ .a : t(T-UCast)Γ ⊢ Ce → ⌈u⌋Γ, ∆ ⊢ e → e ′ : u ′ Γ ⊢ u ′ uΓ, ∆ ⊢ (Ce)e → e ′ : uΓ ⊢ Ce → ⌈u⌋Γ, ∆ ⊢ e → e ′ : u ′ Γ ⊢ u u ′(T-DCast)Γ, ∆ ⊢ (Ce)e → (⌈u⌋)e ′ .: u5.9 Instantiation of CommandsAfter the type-checking of commands, type expressions in the commands have been replacedby their corresponding nodes in the type graph. An instantiation of a command over a typesubstitution is to replace the type nodes in the command with their instantiation nodes over thetype substitution.For expressions, type nodes occur only in typecast expressions. For commands, type nodes canoccur in nested expressions, variable declarations and object creations. Given a type graph Γand a type substitution s, we follow the structure of expressions, commands and type contextsto define the instantiation functions inse, insc and insd, in Table 7, to respectively replace thetype nodes in an expression e, a command c and a type context ∆ with the instantiation nodesover the type substitution s.We show in Lemma 9 that over a well-formed type substitution, such an instantiation preservesthe subtyping relations between any two type nodes in the command, and in Lemma 10 thata replacement node provides all the attributes and methods of a replaced node. Hence, thecommand instantiation is type-safe.Lemma 9 Given a type graph Γ, two type nodes t 1 , t 2 ∈ Γ and a well-formed type substitutions, if Γ ⊢ j 1 ≃ t 1 〈s〉 and Γ ⊢ j 2 ≃ t 2 〈s〉, thenΓ ⊢ t 1 t 2 =⇒ Γ ⊢ j 1 j 2 .Proof. Let G 1 = Γ t 1 , G 2 = Γ t 2 , J 1 = Γ j 1 and J 2 = Γ j 2 . By the subtyping rules inTable 4, we need to proveReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 37Table 7: Instantiation of Expressions, Commands and Type Contextsinse(Γ, e, s) ̂=e = (⌈t⌋)e ′ Γ ⊢ t ′ ∼ inst(t, s)(⌈t ′ ⌋)e ′ e = e ′ .ainse(Γ, e ′ ).ae = f(e ′ )f(inse(Γ, e ′ )) e ,insc(Γ, c, s) ̂=c = ⌈u⌋.new(le) Γ ⊢ u ′ ∼ inst(u, s)⌈u ′ ⌋.new(inse(Γ, le, s))c = var ⌈t⌋ x Γ ⊢ t ′ ∼ inst(t, s)var ⌈t ′ ⌋ xc = e.m(ve • x • re)(e ′ , ve ′ , re ′ ) = inse(Γ, (e, ve, re), s)e ′ .m(ve ′ • x • re ′ )c = c ; c c = le := ele ′ := e ′ c ′ 1; c ′ 2(le ′ , e ′ ) = inse(Γ, (le, e), s)1 2(c ′ 1, c ′ 2) = insc(Γ, (c 1 , c 2 ), s)c = c 1 ⊳ e ⊲ c 2e ′ = inse(Γ, e, s) (c ′ 1, c ′ 2) = insc(Γ, (c 1 , c 2 ), s)c ′ 1 ⊳ e ′ ⊲ c ′ 2c = e ∗ c 1e ′ = inse(Γ, e, s) c ′ 1 = insc(Γ, c 1 , s)e ′ ∗ c ′ 1 c ,insd(Γ, ∆, s) ̂=t ∈ N d = n w −→ t ∈ ∆ ∆ ′ = insd(Γ, ∆ {d}, s) Γ ⊢ t ′ ∼ inst(t, s)〈∆ ′ .ns ∪ {t ′ }, ∆ ′ .es ∪ {n w −→ t ′ }, ∆ ′ .〉 ∆ .– G 1 G 2 =⇒ J 1 J 2 ,– t 2 ∈ S ∧ G 1 ̂⊃ G 2 =⇒ J 1 ̂⊃ J 2 .By Definition 12, a non-generic class remains the same in the instantiation, while in a structuraltype, only the type variables are replaced in the instantiation. Over the same type substitution, iftwo paths lead to the same type variable, they must lead to the same node after the instantiations.Furthermore, a type variable is nominal and only equivalent to itself. Thus, the equivalence ispreserved for all nodes, the subclass relation and memberwise cover are preserved for nodes notin dom(s).For a replaced type variable covering a structural node, by Definition 13, the replacement coversthe instantiated type variable, thus covers the instantiated structural node.In the case that t 1 is a type variable subclassing t 2 , by Definition 7, G 1 G 2 implies there exists⊲a path t 1 −→ ∗ n ∈ G 1 , such that G 1 n ∼ G 2 . By induction on the length k of the ⊲-path, ifk = 0, we have shown the equivalence case that G 1 ∼ G 2 implies J 1 ∼ J 2 . If k ≥ 1, there exists⊲ ⊲a path t 1 −→ n0 −→ ∗ n ∈ G 1 and an edge −→ ⊲ j 0 in the instantiation graph J 1 ′ of the shallowstructure of t 1 over s. Let G 0 = G 1 n 0 and J 0 = J 1 ′ j 0. By Definition 13 and 11, J 1 J 0 .By the induction hypothesis, we have that G 0 G 2 implies J 0 J 2 , hence J 1 J 2 . □Lemma 10 Given a type graph Γ, a type node t ∈ Γ and a well-formed type substitution s, ifΓ ⊢ j ≃ t〈s〉, thenReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 38Table 8: Type-Checking of Commands, Methods and Programs(T-Skip) Γ, ∆ ⊢ skip → skip (T-Assign) Γ, ∆ ⊢ le, e → le′ : t, e ′ : t ′ Γ ⊢ t ′ tΓ, ∆ ⊢ le := e → le ′ := e ′(T-New) Γ ⊢ Cc → ⌈u⌋ Γ, ∆ ⊢ le → le′ : u ′ u ∈ C ∨ t-var(Γ, u) ≠ ∅ Γ ⊢ u u ′Γ, ∆ ⊢ Cc.new(le) → ⌈u⌋.new(le ′ )(T-Invk) Γ, ∆ ⊢ e, ve, re → e′ : u, ve ′ : t, re ′ : t ′ {· x−→ t ′′ } = mfr(Γ u, m) Γ ⊢ t t ′′ t ′Γ, ∆ ⊢ e.m(ve • x • re) → e ′ .m(ve ′ • x • re ′ )Γ ⊢ Te → ⌈t⌋ k = size(∆)(T-Decl)Γ, ∆ → push(∆, x, t) : k ⊢ var Te x → var ⌈t⌋ xvar(∆) = {x} k = size(∆) ≥ 1(T-End)Γ, ∆ → pop(∆) : k − 1 ⊢ end x → end x(T-If) Γ, ∆ ⊢ e, c 1, c 2 → e ′ : B, c ′ 1, c ′ 2Γ, ∆ ⊢ c 1 ⊳ e ⊲ c 2 → c ′ 1 ⊳ e ′ ⊲ c ′ 2Γ, ∆ → ∆ ′′ : k 1 ⊢ c 1 → c ′ 1Γ, ∆ ′′ → ∆ ′ : k 2 ⊢ c 2 → c ′ 2(T-Seq)Γ, ∆ → ∆ ′ : min(k 1 , k 2 ) ⊢ c 1 ; c 2 → c ′ 1; c ′ 2(T-While) Γ, ∆ ⊢ e, c → e′ : B, c ′Γ, ∆ ⊢ e ∗ c → e ′ ∗ c ′(T-Meth) Γ ⊢ j ∼ self (u) {· x−→ t} = mfr(Γ j, m) Γ, push(∆ ∅ , (self , x), (j, t)) ⊢ c → c ′Γ ⊢ u :: m{c} → u :: m{c ′ }(T-Main) Γ ⊢ Te → ⌈t⌋ Γ, push(∆ ∅, x, t) ⊢ c → c ′Γ ⊢ {Te x; c} → {⌈t⌋ x; c ′ }(T-Prog)Γ ⊢ md, Md → md ′ , Md ′Γ ⊢ md • Md → md ′ • Md ′ .1. a ≠ σ ∧ a −→ n ∈ clo(Γ t) =⇒ ∃ a −→ n ′ ∈ clo(Γ j) • Γ ⊢ n ′ ∼ n〈s〉,2. σ.m −−→ n ∈ clo(Γ t) =⇒ ∃ σ.m −−→ n ′ ∈ clo(Γ j) • Γ ⊢ n ′ ∼ n〈s〉.Proof. By the definition of clo, the properties of the instantiation function in Definition 12 andDefinition 13.□From the above two lemmas, an instantiation of a well-typed command is also well-typed. Weinstantiate commands and type contexts by the functions insc and insd, respectively.Theorem 2 (Type-Safety of Command Instantiations) Given a type graph Γ and a wellformedtype substitution s, if Γ, ∆ 1 → ∆ 2 : k ⊢ c, letthen Γ, ∆ ′ 1 → ∆′ 2 : k ⊢ c′ .c ′ = insc(Γ, c, s), ∆ ′ 1 = insd(Γ, ∆ 1 , s), ∆ ′ 2 = insd(Γ, ∆ 2 , s),Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Type System 39Proof. The theorem is proven by replacing all the type nodes with their corresponding instantiationnodes in the type-checking rules for expressions and commands, and then applying Lemma 4,5, 6, 8, 9 and 10. □5.10 Type-Checking of ProgramsBased on the type-checking of commands, we are able to check the well-typedness of a program.A program is well-typed if each method, including the Main method, defined in the program iswell-typed, and a method definition is well-typed if its body command is well-typed accordingto its formal parameters.In the type context of the checking of a method body command, the self reference points tothe class defining the method. If the class is generic, self must point to an instantiation of thegeneric class over an identity type substitution, i.e., a shallow structure of the generic class. Wedefine a function self (Γ, t) to return the shallow structure when needed. Formally,{shlo(Γ, t) if t-var(Γ, t) ≠ ∅,self (Γ, t) ̂=(Γ, t) otherwise.We start by creating the empty type context with only a root node,∆ ∅ = 〈{r}, ∅, r〉, where r ∈ O.For a method definition, we push the method frame onto ∆ ∅ , so that the result type context ∆ ′ ∅contains all the parameters and their types. We check whether the body command is well-typedunder ∆ ′ ∅. The following lemma ensures that if a parameter can be found in the method framepushed onto the empty type context, it can also be found in the method frame pushed on anytype context. Thus a well-typed method can always be invoked.Lemma 11 Given a type graph Γ, a type context ∆ and a command c, let∆ ′ ∅ = push(∆ ∅, x, n) and ∆ ′ = push(∆, x, n),if Γ, ∆ ′ ∅ ⊢ c, then Γ, ∆′ ⊢ c and for all variables y occurred in c and all ∆ 1 → ∆ 2 occurred in thededuction of Γ, ∆ ′ ⊢ c such that search(∆ 1 , y) /∈ ∆.Proof. We first show that if a variable is well-typed in ∆ ′ ∅ , it is also well-typed in ∆′ and thevariable cannot be in ∆. Then, we prove that the lemma holds for a general command byinduction on the structure of commands and expressions.If a variable y in c is well-typed, by (T-Var), y is found by search, hence, y is either introducedby c above the stack of ∆ ′ ∅ , or in ∆′ ∅ . If y is in ∆′ ∅ , it must be in x, because ∆ ∅ contains no edge.So the edge for y is either in ∆ ′ or introduced in command c, but not in ∆.□Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Execution Environment 40We extract all the class methods C :: m as a list C :: m{c}, and the Main method as {Te x; c}.In this way, a program is denoted asP = C :: m{c} • {Te x; c}.Definition 16 (Well-Typed Program) Let Γ be a type graph and P a program. If Γ ⊢ P →P ′ , P ′ is a well-typed program under type graph Γ.After the type-checking, if a program P ′ is well-typed under Γ, P ′ contains the environmentdependent commands for all the methods. The type-checking rules for methods and programsare given in Table 8.6 Execution EnvironmentThe environment of a program stores the structure of the initial object plus the method bodycommands for each concrete class. An initial object in the environment is represented by a typenode in the environment graph. The outgoing edges of this node are labeled by the attributes ofthe class and the targets of the these edges are the default (or initial) values of the correspondingattributes. It also has an outgoing edge labeled by σ to the static table of the class, representedas a node with outgoing edges labeled by method names of the class to the corresponding bodycommands. The outgoing edges, the targeted literals and the static table of the class constitutethe structure of the initial object of the class, called a class frame. There is also a set of nodesrecorded on the static table of each class, representing its supertypes.Based on an environment graph, the execution of a program looks up methods, creates classinstances and checks typecasts at runtime without the need of a type graph.6.1 Environment GraphsLet L be the set of literals, including the null object and the values of primitive types, and Kthe set of commands. We assume that L , K , O and N are disjoint.Definition 17 (Environment Graph) An environment graph is a directed and labeled graphR = 〈N, E, S〉, where– N ⊆ N ∪ L ∪ K , is the set of type nodes, static tables, literals and commands,– E : (N ∩ N ) × (A ∪ {σ}) → N is the set of edges,Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Execution Environment 41{self .a := y}m 1S : {u}m 1S : {D, u}iσu0σiDjjijCfalseabσnullS : {C, D, u}m 1{x.m 2 (y); self .a := x}Figure 7: An Environment Graph– S : S → 2 N is a partial function mapping each static table in N to a set of superclass nodes,denoted by R.sm.An environment graph R = 〈N, E, S〉 is well-formed if1. for a method body command k ∈ N ∩ K , there is at least one incoming edge to k in R andevery incoming edge to k is from a node in dom(S),2. for a static tables s ∈ dom(S), there is exactly one incoming edge labeled σ to s in R and allthe nodes adjacent to s is in K ,3. for a type node t /∈ K ∪ dom(S) ∪ L , there is no incoming edge to t in R and there is anoutgoing edge from t labeled σ in R, and4. each literal l ∈ L is a leaf in R.Let R be an environment graph. An edge C a −→ l in R with l in L means that type C has anattribute a with initial value l. A node u in set R.sm(s) means that an object having static tables can be cast to type u. An edge s m −→ k with k in K means that an object having static tables has a method m with k as the body command. Fig. 7 shows an environment graph.6.2 Construction of Environment GraphsBased on the type graph of a well-typed program, the construction of the environment graphof the program is straightforward. We copy all the nodes of concrete classes in the type graphalong with their attributes and static tables, point the attributes to their initial values. For eachmethod, we lookup its body command in the environment dependent program, and instantiatethe command when needed, which may extend the type graph further. We also associate thestatic table of each type with the set of its supertypes.For a type graph Γ and a program P , we construct the environment graph bywhere the function f e is defined in Table 9.f e (P, Γ, 〈∅, ∅, ∅〉),Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Execution Environment 42Table 9: Construction of Environment Graphsf e (P, Γ, R) ̂=(attributes) t σ −→ · ∈ Γ isCc(Γ, t) t /∈ R es = {t a −→ ini(n) | a −→ n ∈ clo(Γ t), a ≠ σ}f e (P, Γ, 〈R.ns ∪ {t} ∪ {l | t a −→ l ∈ es}, R.es ∪ es, R.sm〉)(static tables) t ∈ R t σ −→ s ∈ Γ t σ −→ s /∈ R ts = {t ′ | Γ ⊢ t t ′ }f e (P, Γ, 〈R.ns ∪ {s}, R.es ∪ {t σ −→ s}, R.sm ∪ {s ↦→ ts}〉)t −→ σ s ∈ R −−→ σ.m· ∈ clo(Γ t) t −−→ σ.m· /∈ R (Γ ′ , k) = insc(Γ, mk(P, Γ, t, m), ssj (Γ, t))(methods)f e (P, Γ ′ , 〈R.ns ∪ {k}, R.es ∪ {s −→ m k}, R.sm〉) R ,where isCc(Γ, t) ̂= t-var(Γ, t) = ∅ ∧ t ∈ U ∨ t-var(Γ, t) ≠ ∅ ∧ t ∈ S is a condition for t to be a concreteclass node.In (attributes), the function f e builds the class frame that represents the structure of the initialobject for each concrete class node t in Γ. It adds to the environment graph the type nodesrepresenting t, the default values of all attributes, and the edges labeled by attributes outgoingfrom t to the default values. We use ini(t ′ ) to denote the default value of a type node t ′ , i.e.,ini(Z) = 0, ini(B) = false, ini(S) = ε and ini(C) = null for any class type C. In (static tables),f e adds the static tables from Γ to the environment graph, and associates each static table withthe set of supertype nodes. Finally, f e constructs outgoing edges from static tables labeled bymethod names, targeting at the instantiated body commands of the methods respectively, asshown in (methods).In order to get the instantiated method body commands, we introduce two functions. First, fora class instantiation j ∈ Γ, when we instantiate a command for j, the type substitution can berecovered from the edges labeled by type variables in the generic static table of j. The functionssj returns the type substitution. Formally,ssj (Γ, j) ̂= 〈α ↦→ t | α ∈ X , j σ.α −−→ t ∈ Γ〉.Second, for method body commands, we define a function mk to lookup in program P for bodycommand c of method m of type t. If t is a class instantiation, we track for its template genericclass in Γ. Formally, mk(P, Γ, t, m) ̂= c, where⎧⎪⎨ k ′ if ∃t :: m{k ′ } ∈ P,c = mk(P, Γ, t⎪⎩′ , m) if ∃t ′ = gcls(Γ, t),mk(P, Γ, t ′ , m) otherwise, if ∃t −→ ⊲ t ′ ∈ Γ.When t is directly defined in P , mk returns the body command by the method definition, otherwiseit looks up the command in the template generic class or the superclass. Thus, for a typeReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Operational Semantics 43node t, we can instantiate the body command c for a method m of t over the type instantiationextracted from ssj (Γ, t), and finally add the instantiated commands to the environment graph.7 Operational SemanticsWe now define a small-step operational semantics for rcos/g.7.1 State GraphsA state of a program at runtime consists of objects, values (states) of their attributes, as well asstacked (local) scopes and edges representing variables from scopes to objects and literal values.Each small-step of the execution of the program is to change the state by creating a new object,assigning to an attribute or a variable, creating a new scope with new variables, or destroyingthe current scope.Such a program state can be represented by a rooted graph, with scopes, objects and literalvalues being the nodes, variables and attributes being the edges.Definition 18 (State Graph) Given an environment graph R, a state graph is a rooted, directedand labeled graph G = 〈N, E, r〉, where– N ⊆ O ∪ L ∪ S is the set of nodes,– E : N × (A ∪ {σ, self , $}) → N is the set of edges,– r ∈ N ∩ O is the root of the graph.A state graph G = 〈N, E, r〉 is well-formed, representing a proper program state, if it satisfiesthat1. starting from r, the $-edges, if there are any, form a $-path (stack) such that except r, whichhas no incoming edge, each node on the path has only one incoming edge,2. a node is either an object node in O but not on the $-path, a literal value in L , a static tablein S , or a scope node in O and on the $-path,3. a literal value is a leaf, having no outgoing edges,4. the source of an edge labeled by self is a scope node,5. an object node has an outgoing edge labeled σ pointing to a static table, andReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Operational Semantics 44RanulllnkCalnknullxfalsetruebblnkb a iσ σσ...σσAbbfalsez$y0x $i0Figure 8: A State Graph and a Corresponding Environment GraphG6. a static table in G must also be a static table in dom(R.sm).A state graph G is shown as solid lines in Fig. 8.Besides extending the stack operations on type contexts to state graphs, we define the swingoperation to change the targets of a list of edges in a state graph. Formally, given G = 〈N, E, r〉,d = n ′ a −→ · ∈ E, and n ∈ N ∪ L , we define swing(G, d, n) ̂= 〈N ′ , E ′ 〉 r, whereN ′ = N ∪ {n}, E ′ = E n ′ a −→ n.The M e map operation sequentially adds mappings e to M, and overrides the existing mappings.Formally,M (a ↦→ b, s ↦→ t) ̂= ({x ↦→ y ∈ M | x ≠ a} ∪ {a ↦→ b}) s ↦→ t.7.2 Execution Environment LookupThe static table nodes connect objects to their runtime class information in the environmentgraph. Via static tables, we can lookup superclasses, method names and method body commands.A static table is introduced to a state graph by object creation, which can be simplified as shallowcopying all the outgoing edges of a type node in the environment graph to the state graph withthe substitution of a new object node for the type node.Besides the static table of an object, we can also access the environment graph through theenvironment dependent ⌈t⌋.new command. Given an environment graph R, a type node t ∈ R,a state graph G and an edge d ∈ G, we define the function new to add a new instance of t tothe state graph and swing d to the instance. Formally, given G = 〈N, E, r〉 and d = n a −→ · ∈ E,we define new(R, G, t, d) ̂= 〈N ′ , E ′ 〉 r, whereN ′ = N ∪ {o} ∪ {l | t b −→ l ∈ R},E ′ = E ∪ {o b −→ l | t b −→ l ∈ R} n a −→ o,o ∈ O N.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Operational Semantics 45An environment graph R, which corresponds to a state graph G, is shown as dashed lines inFig. 8.7.3 Well-Typed State GraphsA state graph G is well-typed if it is consistent with a type graph Γ and a type context ∆. Foran object node n ∈ G, n must correspond to a type node in Γ, denoted as t = Γ(G, n), such thatt σ −→ s ∈ Γ and n σ −→ s ∈ G for some static table s. For a literal value l, l corresponds to its typeT(l). So, we define Γ(G, l) as T(l). For a scope node n s ∈ G satisfying depth(n s ) < size(∆), theremust exist a scope node in ∆, denoted as t s = ∆(G, n s ), such that depth(G, n s ) = depth(∆, t s ).Definition 19 (Well-Typed State Graph) Let Γ be a type graph and ∆ a type context. Astate graph G is well-typed w.r.t. (Γ, ∆), if1. for an object node n ∈ G and a ∈ A ,n a −→ n ′ ∈ G ⇐⇒ Γ(G, n) ⊲ −→ ∗ a −→ t ′ ∈ Γ, and Γ ⊢ Γ(G, n ′ ) t ′ ,2. for a scope node n s ∈ G with depth(G, n s ) < size(∆) and w ∈ A ∪ {self },n sw −→ n ′ ∈ G ⇐⇒ ∆(G, n s ) w −→ t ′ ∈ ∆, and Γ ⊢ Γ(G, n ′ ) t ′ ,3. size(G) ≥ size(∆) − 1.The −1 offset on size(∆) is a result of the approach of building type contexts upon ∆ ∅ . Thestate graph G shown in Fig. 8 is well-typed w.r.t. the type graph and the type context ∆ 1 inFig. 6.7.4 Semantic RulesGiven the environment graph R constructed from a type graph Γ, the small-step operationalsemantics is given in Table 10, including the evaluation of expressions, l-expressions and theexecution of commands.Given a state graph G, an expression e is evaluated by eval(G, e) to an object node in G or aliteral. By contrast, an l-expression le is evaluated by l-eval(G, le) to an edge in G that can beswung, representing an l-value, whose target node is not significant.The execution of commands is defined by giving the transition relation → between configurations.There are two kinds of configurations:1. a non-terminated configuration is a pair 〈c, G〉, representing a state graph G with a commandc to be executed, andReport No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Operational Semantics 46Table 10: Operational Semantics of Commands and Evaluation of Expressionsl ∈ L(Lit)eval(G, l) ̂= ln = eval(G, e) ∈ L(Op)eval(G, f(e)) ̂= f(n)(Var) w ∈ A ∪ {self } ∪ {x∗ | x ∈ A }eval(G, w) ̂= G(w)(Attr) eval(G, e) a −→ n ∈ Geval(G, e.a) ̂= nx ∈ A(L-Var)l-eval(G, x) ̂= search(G, x)x ∈ A(P-Var)po(G, x) ̂= null(Cast) n = eval(G, e) n −→ σ s ∈ G t ∈ R.sm(s),eval(G, (⌈t⌋)e) ̂= n(L-Attr) q = eval(G, e) a ∈ A q −→ a n ∈ Gl-eval(G, e.a) ̂= q −→ a ,nq = eval(G, e)(P-Attr) a ∈ A ,po(G, e.a) ̂= qq = null x ∈ A(R-Var)spo(G, x, q) ̂= l-eval(G, x)(R-Attr) q ≠ null q −→ a n ∈ Gspo(G, e.a, q) ̂= q −→ a n ,(Skip) 〈skip, G〉 → Gd = l-eval(G, le) n = eval(G, e)(Assign)〈le := e, G〉 → swing(G, d, n)(Decl) 〈var ⌈t⌋ x, G〉 → push(G, x, ini(t))d = l-eval(G, le)(New)〈⌈t⌋.new(le), G〉 → new(R, G, t, d)(End) 〈end x, G〉 → pop(G)n = eval(G, ve) q = po(G, re)(Enter)〈enter(o, ve, x, re), G〉 → push(G, (self , x, x ∗ ), (o, n, q))(Leave) d = spo(pop(G), re, eval(G, x∗ )) n = eval(G, x)〈leave(x, re), G〉 → pop(swing(G, d, n))o = eval(G, e) o −→ σ s ∈ G s −→ m k ∈ R(Invk)〈e.m(ve • x • re), G〉 → 〈enter(o, ve, x, re); k; leave(x, re), G〉〈c 1 , G〉 → 〈c ′ 1, G ′ 〉(Seq)〈c 1 ; c 2 , G〉 → 〈c ′ 1; c 2 , G ′ 〉〈c 1 , G〉 → G ′(Seq-Pri)〈c 1 ; c 2 , G〉 → 〈c 2 , G ′ 〉eval(G, b) = true(If-T)〈c 1 ⊳ b ⊲ c 2 , G〉 → 〈c 1 , G〉eval(G, b) = false(If-F)〈c 1 ⊳ b ⊲ c 2 , G〉 → 〈c 2 , G〉eval(G, b) = false(While-F)〈b ∗ c, G〉 → G(While-T)eval(G, b) = true〈b ∗ c, G〉 → 〈c; b ∗ c, G〉 .2. a terminated configuration is a state graph G, representing the completion of the execution ofa command.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Operational Semantics 47The transition rules for assignment, object creation, variable declaration and undeclaration aredefined by the simple graph operations of edge swing, new instance adding, stack push and pop.By contrast, the rule for method invocation is more dedicate and deserves some explanation.In a method invocation, a result argument re is an l-expression. We have to remember the l-valueof re initially before it is possibly changed during the invocation. If re = e.a is a navigationpath, we call the object referred to by e the parent object of the attribute a. Our approach isto assign this parent object po(G, re) to an auxiliary variable x ∗ when entering a method. Andwhen exiting the method, we recover the initial l-value of re by spo(G, re, q) that returns theoutgoing edge labeled by a of the parent object q retrieved from x ∗ . For a variable w, the notionof parent object is not significant since its l-value cannot be changed. For unification, however,we define the parent object of w as the null object.7.5 Type-Safety of ProgramsThe type-safety of a program ensures that a well-typed expression can be evaluated and a welltypedcommand can be executed. We reason about the type-safety by showing that given awell-typed state graph, there exists a semantic rule which applies, and that the resulted stategraph of the semantic rule is also well-typed.There are exceptional cases when an expression cannot be evaluated, or a command cannot beexecuted.1. Null reference: the evaluation of an expression e.a or the execution of a command e.m(. . .)fails, if e is evaluated to null.2. Illegal downcast: the evaluation of an expression (⌈t⌋)e fails, if e is evaluated to a node nand the type of n is not a subtype of t.These two cases can not be checked statically.Lemma 12 Let Γ be a type graph, ∆ a type context, e an expression, and G a well-typed stategraph w.r.t. (Γ, ∆). If Γ, ∆ ⊢ e : t, we have1. n = eval(G, e) exists and Γ ⊢ Γ(G, n) t, and2. if e is an l-expression, l-eval(G, e) exists,unless one of the exceptional cases happens.Proof. The proof is given by induction on the structure of the expression e. Suppose none of theexceptional cases happens.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Operational Semantics 48– Case e is w (either a variable or self ).A well-typed variable Γ, ∆ ⊢ w : t can only be deduced from (T-Var) and (T-Self). Thus wehave ∆(w) = t. Because G is well-typed w.r.t. (Γ, ∆), n = G(w) exists and Γ ⊢ Γ(G, n) t.Because ∆(w) = target(search(∆, w)) and G(w) = target(search(G, w)), search(G, w) alsoexists. Hence, by (L-Var), if w ≠ self , l-eval(G, w) exists.– Case e is e ′ .a.A well-typed attribute Γ, ∆ ⊢ e ′ .a : t can only be deduced from (T-Attr). Thus we haveΓ, ∆ ⊢ e ′ : u and u ⊲ −→ ∗ a −→ t ∈ Γ. By the induction hypothesis, eval(G, e ′ ) = q and Γ ⊢Γ(G, q) u. By the subtyping rules, Γ(G, q) ⊲ −→ ∗ a −→ t ∈ Γ. Because G is well-typed w.r.t.(Γ, ∆), there exists q a −→ n ∈ G such that Γ ⊢ Γ(G, n) t. Hence, by (L-Attr), l-eval(G, e ′ .a)exists.– Case e is l (either a literal or null).Since Γ, ∆ ⊢ l : B can only be resulted from (T-Lit), we have T(l) = B. Also by (Lit),eval(G, l) = l.– Case e is (⌈t⌋)e ′ .Since Γ, ∆ ⊢ (⌈t⌋)e ′ : t can only be resulted from (T-DCast), we have Γ, ∆ ⊢ e ′ : t ′ satisfyingΓ ⊢ t t ′ and also Γ ⊢ Ce → ⌈t⌋ for some class expression Ce. By induction hypothesis,n = eval(G, e ′ ) exists. If the exceptional case of illegal downcast does not happen, wemust have Γ ⊢ Γ(G, n) t and n must be an object node, since t is translated from aclass expression and cannot be a primitive type. By Condition 5 of the well-formednessof state graphs, there exists a static table s such that n −→ σ s ∈ G. By (static tables) inthe construction of environment graphs, we have t ∈ R.sm(s). Hence, by (Cast), we haveeval(G, (⌈t⌋)e ′ ) = n.– Case e is f(e ′ ).Since Γ, ∆ ⊢ f(e ′ ) : B ′ can only be resulted from (T-Op), we know that f : B → B ′ andΓ, ∆ ⊢ e ′ : B for some primitive types B. By induction hypothesis, n = eval(G, e ′ ) existsatisfying T(n) = B. As a result, eval(G, f(e ′ )) = f(n) and T(f(n)) = B ′ .□Theorem 3 (Type-Safety of Commands) Let Γ be a type graph, R an environment graphderived from Γ, ∆ and ∆ ⋄ two type contexts, c a command, and G a well-typed state graph w.r.t.(Γ, ∆). Unless one of the exceptional cases happens, if Γ ⊢ u :: m{k} for all u −−→ σ.m k ∈ R, andΓ, ∆ → ∆ ⋄ ⊢ c, either1. there exists a well-typed state graph G ⋄ w.r.t. (Γ, ∆ ⋄ ) such that 〈c, G〉 → G ⋄ , or2. there exists a configuration 〈c ′ , G ′ 〉 and a type context ∆ ′ with G ′ being a well-typed state graphw.r.t. (Γ, ∆ ′ ) such that 〈c, G〉 → 〈c ′ , G ′ 〉 and Γ, ∆ ′ → ∆ ⋄ ⊢ c ′ .Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Operational Semantics 49Proof. The proof is in general by induction on the structure of the command c. Suppose noneof the exceptional cases happens.– Case c is skip.Since Γ, ∆ ⊢ skip can only be resulted from (T-Skip), we have ∆ ⋄ = ∆. By (Skip), we have〈skip, G〉 → G. Thus G is well-typed w.r.t. ∆ ⋄ .– Case c is ⌈u⌋.new(le ′ ).Since Γ, ∆ ⊢ ⌈u⌋.new(le ′ ) can only be resulted from (T-New), we have ∆ ⋄ = ∆, Γ, ∆ ⊢ le ′ : u ′ ,Γ ⊢ u u ′ and Γ ⊢ Cc → u for some concrete class Cc. Thus, u is a concrete class nodeand u ′ is a class node. By Lemma 12, we have d = l-eval(G, le) exists. Therefore, G ′ =new(R, G, u, d) is defined. By the definition of new and (attributes) in the construction ofenvironment graphs, G ′ is also well-typed w.r.t. Γ, since (1) the newly introduced object nodehas all the attributes of u, (2) the attributes point to their initial values, and (3) Γ, ∆ ⊢ le ′ : u ′implies either Γ(G, source(d)) ⊲ −→ ∗ a −→ u ′ ∈ Γ for some attribute a or ∆(G, source(d)) x −→ u ′ ∈ ∆for some variable x. Note that new does not affect scope nodes, thus G ′ is also well-typedw.r.t. ∆. By (New), we have 〈Cc.new(le), G〉 → G ′ .– Case c is le ′ := e ′ .Since Γ, ∆ ⊢ le ′ := e ′ can only be resulted from (T-Assign), we have ∆ ⋄ = ∆, Γ, ∆ ⊢ le ′ : t,Γ, ∆ ⊢ e ′ : t ′ and Γ ⊢ t ′ t. By Lemma 12, we have d = l-eval(le) and n = eval(e)exist and Γ ⊢ Γ(G, n) t ′ . Therefore, G ′ = swing(G, d, n) is defined and well-typed w.r.tΓ, since Γ, ∆ ⊢ le ′ : t implies either Γ(G, source(d)) ⊲ −→ ∗ a −→ t ∈ Γ for some attribute a or∆(G, source(d)) x −→ t ∈ ∆ for some variable x. Note that swing does not affect scope nodes,thus G ′ is also well-typed w.r.t. ∆. By (Assign), we have 〈le := e, G〉 → G ′ .– Case c is var ⌈t⌋ x.Since Γ, ∆ → ∆ ⋄ ⊢ var ⌈t⌋ x can only be resulted from (T-Decl), we have ∆ ⋄ = push(∆, x, t).Let G ′ = push(G, x, ini(t)). By (Decl), we have 〈var Te x, G〉 → G ′ . Note that G ′ is welltypedw.r.t. (Γ, ∆ ⋄ ), since var(G ′ ) = {x} = var(∆ ⋄ ) and x point to their initial values.– Case c is end x.Since Γ, ∆ → ∆ ⋄ ⊢ end x can only be resulted from (T-End), we have ∆ ⋄ = pop(∆), thuspop(G) is consistent with ∆ ⋄ . By (End), we have 〈end x, G〉 → pop(G).– Case c is e.m(ve • x • re).A well-typed method invocation Γ, ∆ ⊢ e.m(ve • x • re) is deduced from (T-Invk). Thus, wehave Γ ⊢ e : u and mfr(Γ u, m). Hence, o = eval(G, e) exists and −−→ σ.m · ∈ clo(Γ u).Since Γ ⊢ u ′ = Γ(G, o) u, we have −−→ σ.m · ∈ clo(Γ u ′ ). By the definition in Table 9, wehave u ′ σ m σ −→ s −→ k ∈ R and o −→ s ∈ G. Therefore, by (Invk), the method invocation can bereduced to 〈enter(o, ve, x, re); k; leave(x, re), G〉.Also, we have Γ ⊢ ve : t, x : t ′′ , re : t ′ , t t ′′ . It implies that (a) n = eval(G, ve) exist andΓ ⊢ Γ(G, n) t, and (b) re must be l-expressions, either variables or attributes. Hence,q = po(G, re) exist. Therefore, by the enter auxiliary command, G is transformed to G 1 =Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Operational Semantics 50push(G, (self , x, x ∗ ), (o, n, q)), which is well-typed w.r.t. ∆ 1 = push(∆, (self , x), (u, t ′′ )). Noticethat Definition 19 of well-typed states permits extra auxiliary variables with names notin A .Since Γ ⊢ u :: m{k}, by (T-Meth), we have Γ, push(∆ ∅ , (self , x), (u, t ′′ )) ⊢ k. By Lemma 11,we also have Γ, ∆ 1 ⊢ k, and k cannot see any variables in ∆, including those in re. Also, theauxiliary variables x ∗ are used only by enter and leave and cannot be seen by k.By the induction hypothesis, if 〈k, G 1 〉 → G ′ 1 , we have that a state graph G′ 1 is well-typedw.r.t. ∆ 1 . It implies that pop(G ′ 1 ) is well-typed w.r.t. ∆, and neither re nor x∗ is changed inpop(G ′ 1 ). Hence, we can perform the spo on re, variables or attributes, to retain the originaledges d for re. According to (T-Var), we maintain Γ, ∆ 1 ⊢ x : t ′′ . Thus n = eval(G ′ 1 , x) existand Γ ⊢ Γ(G ′ 1 , n) t′′ . Therefore, the leave command swings d to n which are of subtypes oft ′ , for Γ ⊢ t ′′ t ′ being a premise of (T-Invk). Then it pops the graph. The resulted stategraph G ⋄ differs from pop(G ′ 1 ) only in the update of d, thus is still well-typed w.r.t. ∆.If 〈k, G 1 〉 → 〈k ′ , G ′ 1 〉, we have Γ, ∆′ 1 → ∆ 1 ⊢ k ′ and G ′ 1 well-typed w.r.t. ∆′ 1 . In orderto deduce Γ, ∆ ′ 1 → ∆ ⊢ (k′ ; leave(x, re)), we only need to setup an auxiliary typing ruleΓ, ∆ → pop(∆) ⊢ leave(x, re). This is possible, since we have shown that a leave commandin a method invocation expansion is always type-safe, and leave pops a state graph, meaningthat the type context also needs to be popped.– Case c is c 1 ; c 2 .Since Γ, ∆ → ∆ ⋄ ⊢ c 1 ; c 2 can only be resulted from (T-Seq), we have Γ, ∆ → ∆ mid ⊢ c 1and Γ, ∆ mid → ∆ ⋄ ⊢ c 2 for some ∆ mid . From the induction hypothesis, there are two cases:either (1) 〈c 1 , G〉 → G ′ for some G ′ well-typed w.r.t. ∆ mid ; or (2) 〈c 1 , G〉 → 〈c ′ 1 , G′ 〉 andΓ, ∆ ′ → ∆ mid ⊢ c ′ 1 for some ∆′ , c ′ 1 and G′ well-typed w.r.t. ∆ ′ . In the case (1), we have〈c 1 ; c 2 , G〉 → 〈c 2 , G ′ 〉 according to (Seq-Pri). In the case (2), we have 〈c 1 ; c 2 , G〉 → 〈c ′ 1 ; c 2, G ′ 〉according to (Seq), and Γ, ∆ ′ → ∆ ⋄ ⊢ c ′ 1 ; c 2 according to (T-Seq).– Case c is c 1 ⊳ e ⊲ c 2 .Since Γ, ∆ → ∆ ⋄ ⊢ c 1 ⊳ e ⊲ c 2 can only be resulted from (T-If), we have ∆ ⋄ = ∆, Γ, ∆ ⊢e : B, Γ, ∆ ⊢ c 1 and Γ, ∆ ⊢ c 2 . So, eval(G, e) equals true or false, Γ, ∆ → ∆ ⋄ ⊢ c 1 andΓ, ∆ → ∆ ⋄ ⊢ c 2 . If eval(G, e) = true, we have 〈c 1 ⊳ e ⊲ c 2 , G〉 → 〈c 1 , G〉 according to (If-T).Otherwise, we get 〈c 1 ⊳ e ⊲ c 2 , G〉 → 〈c 2 , G〉 according to (If-F).– Case c is e ∗ c 1 .Since Γ, ∆ → ∆ ⋄ ⊢ e ∗ c 1 can only be resulted from (T-While), we have ∆ ⋄ = ∆, Γ, ∆ ⊢e : B and Γ, ∆ ⊢ c 1 . As a result, eval(G, e) equals true or false and Γ, ∆ → ∆ ⊢ c 1 . Ifeval(G, e) = false, we have 〈e ∗ c 1 , G〉 → G according to (While-F). Otherwise, we get〈e ∗ c 1 , G〉 → 〈c 1 ; e ∗ c 1 , G〉 according to (While-T), and Γ, ∆ → ∆ ⋄ ⊢ c 1 ; e ∗ c 1 accordingto (T-Seq).Based on Lemma 12, Theorem 2 and 3, we can prove the type-safety of programs.□Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

Conclusion 51Theorem 4 (Type-Safety of Programs) Let Md = {⌈t⌋ x; c} be a main method, P = md •Md a program, Γ a type graph, R an environment graph derived from Γ, ∆ 1 = push(∆ ∅ , x, t),and G 1 a well-typed state graph w.r.t. (Γ, ∆ 1 ). Unless one of the exceptional cases happens, ifΓ ⊢ P , either1. there exists a well-typed state graph G ⋄ w.r.t. (Γ, ∆ 1 ) such that 〈c, G 1 〉 → G ⋄ , or2. there exists a configuration 〈c ′ , G ′ 〉 and a type context ∆ ′ with G ′ being a well-typed state graphw.r.t. (Γ, ∆ ′ ) such that 〈c, G 1 〉 → 〈c ′ , G ′ 〉 and Γ, ∆ ′ → ∆ 1 ⊢ c ′ .Theorem 3 and 4 state that a well-typed program can always either terminate or progress correctlygiven a well-typed initial state graph, unless one of the two exceptional cases happens. Thus,the graph-based generic type system is sound.8 ConclusionWe give a formal definition of oo type graph with generics, and provide algorithms for relatedgraph operations. We establish a type system purely based on the graph model, and prove itssoundness with respect to a graph-based operational semantics. A distinct nature of our typesystem is the representation of advanced type features of oo systems, including parameterizedtypes and conjunction types, by simple edge-labeled graphs. Based on such representation,the whole procedure of type checking at compile-time and program execution at runtime areimplemented by elementary graph transformations. This provides a systematic graph-basedmodel to help users to understand and analyze the types and semantics of oo programs.Future work. The graph algorithms can be used to derive a concrete implementation to translate,type-check and interpret a program from the source form of the rcos/g abstract syntax.This extends the implementation of the graph-based operational semantics in [6, 10], and willlead to a full-featured tool set to animate the type-checking and interpreting processes of oo programswritten in rcos/g. Notice that the type graphs and state graphs are likely to be complexfor programs with advanced type features. For the tool set to be practically used, it is importantto be able to render a graph partially, based on the types and semantics.Arrays and abstract containers cannot be represented in the current graph model yet, but theyare essential in oo programming. It is helpful if we can setup regions in the graphs and reasonabout the external behaviors of a region without being concerned with its internals. Besides,simplicity is always one of the key issues of the graph-based model to improve.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

References 52References[1] J. Gosling, B. Joy, G. Steele, and G. Bracha, Java Language Specification: The Java Series. Addison-Wesley Longman Publishing Co., Inc. Boston, MA, USA, 2000.[2] O. M. Group, “Omg unified modeling language (omg uml), superstructure, v2.1.2,” tech. rep., 2007.www.omg.org/spec/UML/2.1.2/Superstructure/PDF.[3] C. A. R. Hoare and J. He, “A trace model for pointers and objects,” in 13th European Conferenceon Object-Oriented Programming, LNCS 1628, pp. 1–17, Springer, 1999.[4] J. He, X. Li, and Z. Liu, “rCOS: A refinement calculus for object systems,” Theor. Comput. Sci.,vol. 365, no. 1-2, pp. 109–142, 2006.[5] G. Klein and T. Nipkow, “A machine-checked model for a Java-like language, virtual machine, andcompiler,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 28, no. 4,pp. 619–695, 2006.[6] W. Ke, Z. Liu, S. Wang, and L. Zhao, “A Graph-based Operational Semantics of OO Programs,”Formal Methods and Software Engineering, pp. 347–366, 2009.[7] G. Bracha, “Generics in the Java programming language,” Sun Microsystems, 2004. java.sun.com/j2se/1.5/pdf/generics-tutorial.pdf.[8] B. Pierce, Types and programming languages. The MIT Press, 2002.[9] L. Zhao, X. Liu, Z. Liu, and Z. Qiu, “Graph transformations for object-oriented refinement,” Form.Asp. Comput., vol. 21, no. 1-2, pp. 103–131, 2009.[10] W. Ke, Z. Liu, S. Wang, and L. Zhao, “Graph-based type system, operational semantics and implementationof an object-oriented programming language,” Tech. Rep. 410, UNU-IIST, P.O. Box3058, Macau, 2009. www.iist.unu.edu/www/docs/techreports/reports/report410.pdf.[11] M. Abadi and L. Cardelli, A Theory of Objects. Springer, 1996.[12] A. Igarashi, B. Pierce, and P. Wadler, “Featherweight Java: A minimal core calculus for Javaand GJ,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 23, no. 3,pp. 396–450, 2001.[13] S. Wang, Q. Long, and Z. Qiu, “Type Safety for FJ and FGJ,” Theoretical Aspects of Computing-ICTAC 2006, pp. 257–271, 2006.[14] A. P. L. Ferreira, L. Foss, and L. Ribeiro, “Formal verification of object-oriented graph grammarsspecifications,” Electronic Notes in Theoretical Computer Science, vol. 175, no. 4, pp. 101 – 114,2007.[15] A. Corradini, F. L. Dotti, L. Foss, and L. Ribeiro, “Translating Java code to graph transformationsystems,” in Graph Transformations, LNCS 3256, pp. 383–398, Springer, 2004.[16] H. Kastenberg, A. Kleppe, and A. Rensink, “Defining object-oriented execution semantics usinggraph transformations,” in Formal Methods for Open Object-Based Distributed Systems, LNCS 4037,Springer, 2006.[17] R. Heckel, J. M. Küster, and G. Taentzer, “Confluence of typed attributed graph transformationsystems,” in ICGT ’02: Proceedings of the First International Conference on Graph Transformation,pp. 161–176, Springer-Verlag, 2002.[18] M. Wermelinger and J. L. Fiadero, “A graph transformation approach to software architecture reconfiguration,”Sci. Comput. Program., vol. 44, no. 2, pp. 133–155, 2002.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

References 53[19] H. Ehrig, K. Ehrig, U. Prange, and G. Taentzer, “Fundamental theory for typed attributed graphsand graph transformation based on adhesive HLR categories,” Fundamenta Informaticae, vol. 74,no. 1, pp. 31–61, 2006.Report No. 448, June 2011UNU-IIST, P.O. Box 3058, Macao

A Graph-Based Generic Type System for Object-Oriented Programs

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?