
all FREECO11-workshop-papers.pdf - trese


FREECO-11

Table of Contents

Policy Languages for Distributed Business Applications Require the Same Composition Mechanisms as Programming Languages
Tom Dinkelaker and Sascha Hauke

An association-based model of dynamic behaviour
Ian Piumarta

The Keyword Revolution
Steven te Brinke, Lodewijk Bergmans and Christoph Bockisch

Composing heterogeneous software with style
Stephen Kell

Towards Modular Code Generators Using Symmetric Language-Aware Aspects
Steffen Zschaler and Awais Rashid

Towards Using Constructive Type Theory for Verifiable Modular Transformations
Steffen Zschaler, Iman Poernomo and Jeffrey Terrell

Open, extensible composition models
Ian Piumarta


Policy Languages Require the Same Composition Mechanisms as Programming Languages

Tom Dinkelaker
Technische Universität Darmstadt
dinkelaker@cs.tu-darmstadt.de

Sascha Hauke
Technische Universität Darmstadt
sascha.hauke@cased.de

ABSTRACT
Current policy languages come with a monolithic syntax and support only a limited set of security formalisms. Thus, contemporary policies can only inadequately prescribe the correct behavior of a distributed business application w.r.t. different views, such as usage control, safety properties, or governance. To support composing policies that involve multiple views, we propose to include well-established composability mechanisms into policy languages. In this paper, we propose an extensible security DSL that composes multiple mechanisms (namely inheritance, scoping, aspects, and different paradigms) into one composite policy language.

Categories and Subject Descriptors
D.3.2 [PROGRAMMING LANGUAGES]: Language Classifications: Extensible languages, Multiparadigm languages

General Terms
Languages, Policy Languages, Inheritance, Scoping, Aspects

1. INTRODUCTION
Today, distributed systems assume a homogeneous security infrastructure that uses a monolithic policy language, typically a non-Turing-complete, domain-specific language (DSL). Often a policy language uses a concrete DSL syntax and comes with a fixed set of security primitives that are part of a specific security formalism (e.g. RBAC or security automata) and that follow a certain paradigm (e.g. rule-based or state machines). In the policy language, developers specify policies that define part of the correct expected behavior of the system. Such a language supports only one enforcement mechanism, specialized for one particular purpose, and it is isolated from other policy languages and their enforcement mechanisms.
Nonetheless, in a globalized economy, distributed systems begin to span multiple applications, stacks, companies, or markets, in which different policy languages and enforcement mechanisms are used. To freely compose policies over multiple views of the correct behavior, such as usage control, safety properties, or governance, we propose in this paper an extensible policy language that leverages well-established composition mechanisms from existing general-purpose programming languages, namely inheritance, scoping, aspects, and paradigms.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
FREECO'11, July 26, 2011, Lancaster, UK.
Copyright 2011 ACM 978-1-4503-0892-2/11/07 ...$10.00.

The contribution of the paper can be seen in two ways. On the one hand, it provides a requirement analysis for a policy language for distributed business applications. On the other hand, it proposes a preliminary language design that integrates well-established composition mechanisms into a policy language. Although there is no concrete implementation yet, the paper sketches which techniques from programming languages we will use to implement these mechanisms.

In the remainder of the paper, Section 2 defines the special requirements for policy languages in distributed business applications and discusses problems of current policy languages. Section 3 gives an overview of related work. Section 4 proposes a set of composition mechanisms that we think are essential for enabling freely composable security policy languages. Section 5 concludes the paper.

2. INSUFFICIENCIES IN POLICIES FOR DISTRIBUTED BUSINESS APPLICATIONS
Distributed business applications comprise various loosely coupled software components in a SOA that run in different systems, stacks, companies, and markets. In the following, we discuss a set of example policies for such applications which require special mechanisms that are currently not completely supported in contemporary policy languages.

2.1 Static Policies Lack Semantic Flexibility
In a globalized economy, processes involve international business partners. Corresponding applications combine services and providers distributed over different countries. For the execution of such services, the providers must take into account differing national legislation which individual services and combined processes are obliged to adhere to. Consequently, a policy defined for an international business application needs to be adapted to its execution context.

Consider the WSPL policy in Listing 1, which enumerates the payment options available at the location a service request originates from. If a request originates from a foreign country, the policy accepts only pre-paid orders. For domestic requests, in contrast, all options are available: credit card payment, bank invoice, and pre-paid.

    Policy (Id = "Service Levels") {
      Rule {
        Location = "Germany", Fee = 5, Currency = "EUR",
        Options = { "Pre-Paid", "Credit-Card", "Invoice" }
      }
      ...
      Rule {
        Location = "USA", Fee = 7, Currency = "USD",
        Options = { "Pre-Paid" }
      }
    }
    Listing 1: A WSPL policy with payment options

With current policy languages, such as WSPL, it is possible to specify such a policy but only in a hard-wired way. For example, such a policy could use a conditional statement that enumerates all applicable options for all possible locations a priori. However, it is not possible to dynamically define new options and locations without changing the original policy, because all applicable options and locations have to be encoded a priori in the policy file, which users cannot update and reload at run-time.

What is needed for such business security policies are language mechanisms that enable semantic flexibility.

2.2 Nested Policies Lack a Precise Scoping
In WS-Policy, a policy expression that is an element of another policy expression is called a nested policy. The nested policy enriches the enclosing policy by defining further details. For example, in Listing 2, the nested policy (lines 5–5) defines that the enclosing policy (lines 1–11) for the symmetric transport binding (line 2) must use a Kerberos token.

WS-Policy enables defining security capabilities and requirements for a single Web service. Now consider a service composition with a global policy affecting multiple system layers or levels in an enterprise environment. In this context, policies need to provide flexible means to define which parts of the policy affect which service components or enterprise levels. However, policy languages that only support static policies for single end points, such as WS-Policy, do not support precisely scoping their effect to parts of the system.

    [Listing 2: A WS-Policy policy with a nested policy]

In Ponder [2], a policy group defines a lexical scope in which one can declare a set of policies and constraints.
Inside such a policy group, such as the one in Listing 3, nested policies are grouped together through a semantic relationship that the group's body defines (e.g., lines 1–7). When policies are instantiated (e.g., lines 8 and 9), the parameters substitute the variables in the group. Still, in Ponder, there is only one scoping strategy: identifiers propagate within a group but not to the outside. In general, there is no cascading propagation of elements defined in a policy to its nested policies.

    1 type group serviceFailT (set s1, set t1, event e) {
    2   inst auth+ sReset {
    3     subject s1; action resetSchedule; target t1;
    4   }
    5   oblig failReset {
    6     subject s1; on e; do resetSchedule(); target t1;
    7 }}
    8 group brS A = serviceFailT(brManager/, brServices/, failure);
    9 group brS B = serviceFailT(opManager/, deliveries/, lateDelivery);
    Listing 3: A Ponder policy defining a policy group

What is needed is a policy language that encompasses a rich set of scopes that precisely define how elements propagate throughout the system, encompassing different scoping strategies (e.g., lexical vs. dynamic scoping), different kinds of topologies (e.g., technical or organizational structures, platform stacks), and dynamic contexts (e.g., applications, call stack).

2.3 Policies Suffer from Tangling
Just as in programming languages, crosscutting concerns exist in policy languages. Such concerns may comprise, for instance, demands on auditing and logging of policy enforcement and monitoring, transport layer or storage encryption, and data and information flow control, which are potentially shared among differing policy rules governing service execution or data storage. Spreading requirements that affect various rules across a policy violates the principle of separation of concerns, similar to violating the encapsulation of code in OO programming languages.
This leads to tangled and scattered policy specifications, which complicates the maintenance of policies and hinders developers and users from quickly comprehending a policy's content and intention.

     1 Policy (Id = "Compute Service") {
     2   Rule {
     3     Service-Offer = "Compute Bulk Low",
     4     Payment = "Flat",
     5     Accounting = "User Choice",
     6     Privacy = "Non-Critical", #privacy
     7     Storage Encryption = "256-bit BlowFish", #confidentiality
     8     Transfer Encryption = "None"
     9   }
    10   ...
    11   Rule {
    12     Service-Offer = "Compute By Usage High",
    13     Payment = "Per Minute",
    14     Accounting = "Mandatory Invoice",
    15     Privacy = "Critical", #privacy
    16     Storage Encryption = "2048-bit BlowFish", #confidentiality
    17     Transfer Encryption = "TLS"
    18 }}
    Listing 4: A tangled and scattered policy

Listing 4 describes invoicing and privacy issues of a service. Various details of the business-process-level logic are combined with details of accounting and encryption. The Service-Offer, Payment, and Privacy entries represent business processes, whereas the other entries refer to technical details that depend upon the higher-level definitions. This leads to an entanglement of business and implementation logic within the policies. While controlling and security features are relevant to the successful execution of the services, peculiarities of the implementation are spread over the corresponding rules. Policy assertions relating to different security concerns are tangled in one rule, such as the severity level for privacy (line 6) and the encryption algorithm for confidentiality (line 7). Policy assertions relating to one security concern are scattered over multiple rules, such as the encryption algorithm BlowFish (lines 7 and 16). Tangling and scattering lead to poor maintainability of the policy specification due to crosscutting concerns.

What is needed for modularizing tangled policies are language mechanisms that help to encapsulate and disentangle non-functional concerns in a clear and concise way.

2.4 Policies Lack Integrating Formalisms
Policies express different goals. Just as these goals vary, so do their representations in the policy languages that specify them. Generally, different policy languages do not share a common syntax, even if they are based on the same fundamental representational technology, such as XML (e.g. WS-Policy) or EBNF (e.g. Ponder [2]). Furthermore, policy statements are adapted to the concepts underlying the specification paradigms, with each policy supporting a single paradigm, and thus a single formalism. This limited view is insufficient when dealing with distributed business applications. A holistic specification of the system's behavior requires a combination of views of the complete system. Because different representations and formalisms have various degrees of power regarding the description of these views, it is difficult to determine a single, ideal language suited to their expression.

What is needed for policies are methods for integrating different paradigms addressing different views of the system.

3. OVERVIEW OF RELATED WORK
Current policy languages have only one concrete syntax, which supports only a limited set of security formalisms and paradigms. Existing languages are not open for new syntax and semantics, which is what new formalisms and paradigms would require.
They do not support policies like those discussed in Sections 2.1–2.4 because they all share the same insufficiencies: their policies are not semantically flexible, are not precisely scoped, contain scattered and tangled code fragments, and adhere to one paradigm only.

Ponder [2] is an object-oriented policy language with very basic support for controlling the effects of policies in policy groups; in particular, it supports neither full-fledged lexical nor dynamic scoping.

WS-Policy¹ is an XML-based policy language framework that allows integrating new domain-specific policy languages with an XML-based syntax but, in particular, it does not support a concrete DSL syntax.

XACML² is an XML-based and rule-based policy language for defining attributes and specifying authorization policies and obligations. Specifically, policies for a behavioral specification of a component (e.g. security automata) are out of scope.

What is needed is an extensible policy language that users can tailor to the specific requirements of their business application, so that they can select the concrete DSL syntax, composition mechanisms, security formalisms, and paradigms to be included in their policy language.

¹ WS-Policy: www.w3.org/TR/ws-policy
² XACML: www.oasis-open.org/committees/xacml

4. INTRODUCING COMPOSITION MECHANISMS INTO POLICY LANGUAGES
In order to overcome the limitations of existing policy languages discussed in Section 2, we propose to make composition mechanisms from programming languages available in policy languages. Specifically, we consider adapting polymorphism and scoping to meet the need of policy languages for semantic flexibility. Furthermore, we propose the use of aspects for addressing scattered and tangled policy definitions, and an open set of paradigms to define policies with the right syntax and semantics.
In the following, we present example solutions in WSPL; however, WS-Policy or another policy language could be extended with the same mechanisms in a similar manner.

4.1 Polymorphy
To enable flexible policies, we propose to extend the policy language with an inheritance mechanism similar to that of polymorphic programming languages. The inheritance mechanism enables end users to refine the rules of a base policy in an extended policy.

    Policy (Id = "Service-Levels") {
      Rule { Trust = "High",
        Options = { "Pre-Paid", "Credit-Card", "Invoice" }}
      ...
      Rule { Trust = "Low",
        Options = { "Pre-Paid" }}
    }
    Listing 5: Base of a polymorphic policy

For example, Listing 5 and Listing 6 show two modular policies, where the latter policy extends the former, just as a subclass extends its superclass. Listing 5 shows the base policy Service-Levels, which defines the payment options for different trust levels. Listing 6 shows the policy extension Specific-Service-Levels, which defines what Low trust and High trust mean. Note that, even when different stakeholders define those policies at different times or in different sub-systems, the policy extension Specific-Service-Levels can refine what Low or High trust means for the Service-Levels base policy.

    Policy (Id = "Specific-Service-Levels") {
      Extends ( Super = "Service-Levels" )
      Rule { Location = "Germany", Experience-Level = "Good",
        Experience-Length = "Long", Trust = "High" }
      Rule { Location = "USA", TPM-Available = True,
        Platform-Monitor = "Deployed", Trust = "High" }
      ...
      Rule { Trust = "Low" }
    }
    Listing 6: Extension of a polymorphic policy

We expect that an inheritance mechanism in policy languages can provide policy developers with advantages similar to those of OO inheritance w.r.t. extensibility, reusability, and modular reasoning. However, it is a challenge to provide such an inheritance mechanism for an open set of policy dialects.
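The interplay of Listings 5 and 6 can be illustrated with a small Python sketch. It is only one possible reading, not the authors' design: the names `Policy`, `derive`, and `payment_options`, the first-matching-rule evaluation, and the request facts are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Policy:
    """A policy is an ordered list of rules; each rule is a dict of assertions."""
    policy_id: str
    rules: list
    parent: Optional["Policy"] = None  # hypothetical stand-in for 'Extends'


def derive(policy, facts, attribute):
    """Return the value that the first matching rule assigns to `attribute`.

    Assumption: a rule matches when all of its other assertions agree
    with the request facts; a rule without conditions always matches.
    """
    for rule in policy.rules:
        if attribute not in rule:
            continue
        conditions = {k: v for k, v in rule.items() if k != attribute}
        if all(facts.get(k) == v for k, v in conditions.items()):
            return rule[attribute]
    return None


def payment_options(extension, facts):
    # The extension decides what "High" or "Low" trust means ...
    trust = derive(extension, facts, "Trust")
    # ... and the inherited base policy maps the trust level to options.
    return derive(extension.parent, {**facts, "Trust": trust}, "Options")


# Data transcribed from Listing 5 (base) and Listing 6 (extension).
BASE = Policy("Service-Levels", [
    {"Trust": "High", "Options": ["Pre-Paid", "Credit-Card", "Invoice"]},
    {"Trust": "Low", "Options": ["Pre-Paid"]},
])
SPECIFIC = Policy("Specific-Service-Levels", [
    {"Location": "Germany", "Experience-Level": "Good",
     "Experience-Length": "Long", "Trust": "High"},
    {"Location": "USA", "TPM-Available": True,
     "Platform-Monitor": "Deployed", "Trust": "High"},
    {"Trust": "Low"},  # fallback rule without conditions
], parent=BASE)
```

Under these assumptions, a long-standing German customer is mapped to High trust by the extension and therefore receives all three payment options from the base policy, while a request that satisfies none of the High-trust rules falls through to the Low-trust rule and is offered pre-paid only.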
The challenge is twofold. On the one hand, the inheritance mechanism needs to define a default polymorphic semantics that allows refining policies by overriding parts of them at the level of assertions. On the other hand, the inheritance mechanism needs to be extensible for special cases in which it must take into account the specific semantics of a policy dialect. While we expect that end users can use the default polymorphic semantics in most cases, only an extensible inheritance mechanism enables domain-specific composition semantics.

4.2 Scoping
To precisely scope policies, we propose to support different scoping schemes in the policy language. Every scope describes a partial view of the system, which is structured in different topologies, such as an organizational or a technical topology; several topological views can overlap. To define a new scope, there is a special operator Scope with: (1) an Id that defines a unique identifier for the scope within a topology, (2) a scoping Strategy that defines how the elements defined in its body propagate, and (3) a Priority for resolving conflicts between overlapping scopes. Inside the scope operator, a nested policy defines a binding for that policy in this scope.

Depending on the topological view and the scoping strategy, the contained bindings can propagate to other scopes or parts of the system. With lexical scoping, elements of an enclosing scope propagate to its nested scopes. With dynamic scoping, the definition of an element always establishes a new binding that propagates globally through all scopes.

For example, Listing 7 shows several policies that are nested inside different scopes. The corresponding Scope operators select the right scope and strategy for the nested policies that all have the same Id. Since different scopes are defined for the policies, there is no name clash between them. The first two scopes allow defining the Key-Length policy differently for the two companies Organizational.MarketX.Company1 with 1024 bits (lines 3–9) and Organizational.MarketX.Company2 with 512 bits (lines 10–14).
Because Company1 uses a lexical scoping strategy, it is possible to redefine the policy within a certain department (e.g., Dep1 with 2048 bits). There is another scope that defines a special Key-Length policy for devices with limited resources. There are two dynamic policies that impose restrictions on the maximum key length, namely 256 bits when using the algorithms to communicate with sensor nodes, and 128 bits when the system detects that the battery of a mobile sensor device is low. Each scope implicitly binds the policies that are nested in its body.

Alternatively, one can explicitly define the scope of a policy by using the policy operator's optional Scope attribute. For example, the policy in line 7 redefines the Key-Length policy. Because the policy is explicitly scoped to Organizational.MarketX.Company1.Dep1 through its Scope attribute, it overrides the binding of this policy only within the department Dep1 of Company1.

When scopes of different topologies overlap, there can be multiple bindings for one policy Id. Consider a sub-component that is part of an organizational topology element Company1 and part of a technical topology Technical.NetworkA.SensorNodes. In Listing 7, there are different policies defined for this sub-component by the two scopes Organizational.MarketX.Company1 and Technical.NetworkA.SensorNodes. Therefore, it is necessary to resolve such conflicting bindings.
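The lookup described in this section can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the proposed implementation: `Binding`, `encloses`, `resolve`, and the numeric ordering of the priority labels are hypothetical, dynamic scoping is not modelled, the sensor-node scope is assumed to bind the same policy Id as the organizational scopes, and overlapping topologies are assumed to be settled by the higher scope priority.

```python
from dataclasses import dataclass

# Hypothetical ordering of the priority labels used in the scope operator.
PRIORITY = {"Normal": 0, "High": 1}


@dataclass(frozen=True)
class Binding:
    scope: str            # dotted scope path within one topology
    policy_id: str
    value: str
    priority: str = "Normal"


def encloses(scope, position):
    """True if `position` is the scope itself or is nested below it."""
    return position == scope or position.startswith(scope + ".")


def resolve(bindings, policy_id, positions):
    """Pick the binding of `policy_id` that applies to a component.

    `positions` lists the component's place in each topology. Within one
    topology the innermost enclosing scope wins (lexical scoping); between
    topologies the binding from the higher-priority scope wins.
    """
    candidates = []
    for position in positions:
        visible = [b for b in bindings
                   if b.policy_id == policy_id and encloses(b.scope, position)]
        if visible:
            # The innermost scope is the one with the longest dotted path.
            candidates.append(max(visible, key=lambda b: b.scope.count(".")))
    if not candidates:
        return None
    return max(candidates, key=lambda b: PRIORITY[b.priority])


# Simplified bindings loosely based on the key-length example.
BINDINGS = [
    Binding("Organizational.MarketX.Company1", "Key-Length", "1024-bit"),
    Binding("Organizational.MarketX.Company1.Dep1", "Key-Length", "2048-bit"),
    Binding("Organizational.MarketX.Company2", "Key-Length", "512-bit"),
    Binding("Technical.NetworkA.SensorNodes", "Key-Length", "256-bit", "High"),
]
```

With these bindings, a component in Dep1 of Company1 sees the department's redefinition, any other component of Company1 sees the company-wide binding, and a component that is also a sensor node receives the binding from the higher-priority technical scope.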
To resolve such conflicts, we always select the binding from the scope with the higher priority.

     1 Scope (Id = "Organizational.MarketX",
     2   Strategy = "Lexical", Priority = "Normal") {
     3   Scope (Id = "Company1", Strategy = "Lexical") {
     4     Policy (Id = "Key-Length") {
     5       Rule {
     6         Key-Length = "1024-bit"
     7         Policy (Id = "Key-Length", Scope = "Dep1") {
     8           Rule { Key-Length = "2048-bit" }
     9   }}}}
    10   Scope (Id = "Company2") {
    11     Policy (Id = "Key-Length") {
    12       Rule {
    13         Key-Length = "512-bit"
    14 }}}}
    15
    16 Scope (Id = "Technical.NetworkA.SensorNodes",
    17   Strategy = "Dynamic", Priority = "High") {
    18   Policy (Id = "Restricted-Key-Length") {
    19     Rule { Key-Length = "256-bit", Transport = "GPRS" }}
    20   Policy (Id = "Quality-of-Protection") {
    21     Rule { Key-Length = "128-bit", Battery = "Low" }}
    22 }
    Listing 7: A policy using different scoping schemes

4.3 Aspects
Aspect-oriented approaches [7, 6] have gained attention as a way to disentangle crosscutting concerns in modern OO programming languages. Aspects provide designated means for encapsulating (non-functional) crosscutting concerns. Transferring aspect-orientation, including its concepts of pointcuts and advice, to policy languages offers users and developers a convenient and familiar solution for dealing with tangled and scattered assertions in policies.

Listing 8 presents a way of disentangling the scattered policy criticized in Section 2.3. Each pointcut defines a pattern over rules and assertions, and the advice defines which assertions it adds to the matching rules. Introducing aspects makes statements and bindings explicit and distinguishes functional from non-functional segments of a policy. Furthermore, the presented solution can avoid ambiguities by using an explicit execution order via priorities (e.g. line 24), which enables better composability of different policy fragments.

Aside from integrating easily accessible constructs familiar from AOP, policies can also be empowered through user-defined functions, such as the NOT(ME) statement presented in Listing 8.
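The way pointcuts and advice add assertions to matching rules can be sketched in Python. This is an illustrative approximation, not the paper's mechanism: `weave`, `matches`, the `ME` constant, and the way the service provider is passed in as context are hypothetical, and advice priorities are deliberately left out because their exact semantics are not specified here.

```python
ME = "ProviderA"  # hypothetical identity of the provider that owns the policy


def NOT(value):
    """User-defined predicate that matches anything except `value`."""
    return lambda actual: actual != value


def matches(pattern, actual):
    """A pattern is a predicate, the wildcard "*", or a literal value."""
    if callable(pattern):
        return pattern(actual)
    return pattern == "*" or pattern == actual


def weave(rules, aspects, context):
    """Return copies of `rules` extended by the advice of matching pointcuts.

    `context` carries facts that are not written in a rule itself,
    such as the service provider executing the service.
    """
    woven = []
    for rule in rules:
        subject = {**context, **rule}
        result = dict(rule)
        for pointcut, advice in aspects:
            if all(matches(p, subject.get(key)) for key, p in pointcut.items()):
                result.update(advice)
        woven.append(result)
    return woven


# Functional rules and aspects loosely transcribed from the aspect example.
RULES = [
    {"Service-Offer": "Compute Bulk Low", "Payment": "Flat",
     "Privacy": "Non-Critical"},
    {"Service-Offer": "Compute By Usage High", "Payment": "Per Minute",
     "Privacy": "Critical"},
]
ASPECTS = [
    ({"Service-Provider": NOT(ME), "Service-Offer": "*",
      "Payment": "Per Minute"}, {"Accounting": "Mandatory Invoice"}),
    ({"Service-Provider": "*", "Service-Offer": "*", "Payment": "Flat"},
     {"Accounting": "User Choice"}),
    ({"Service-Provider": "*", "Service-Offer": "*",
      "Privacy": "Non-Critical"},
     {"Storage Encryption": "256-bit BlowFish", "Transfer Encryption": "None"}),
]
```

Calling `weave(RULES, ASPECTS, {"Service-Provider": "ProviderB"})` adds the accounting and encryption assertions to the rules they match, so the functional rules themselves stay free of those crosscutting details.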
In Listing 8, NOT negates the predicate ME; ME refers to the provider that defined the policy, so the expression NOT(ME) matches all services not operated by that provider.

     1 Policy (Id = "Compute Service") {
     2   Rule {
     3     Service-Offer = "Compute Bulk Low",
     4     Payment = "Flat", Privacy = "Non-Critical"
     5   }
     6   ...
     7   Rule {
     8     Service-Offer = "Compute By Usage High",
     9     Payment = "Per Minute", Privacy = "Critical"
    10 }}
    11
    12 Aspect (Id = "Service Invoicing Billing") {
    13   Pointcut (Binding = "1") {
    14     Policy-ID = "Compute Service", Service-Provider = NOT(ME),
    15     Service-Offer = "*", Operation = "*", Payment = "Per Minute"
    16   }
    17   Advice (Binding = "1") {
    18     Accounting = "Mandatory Invoice" }
    19
    20   Pointcut (Binding = "2") {
    21     Policy-ID = "Compute Service", Service-Provider = "*",
    22     Service-Offer = "*", Operation = "*", Payment = "Flat"
    23   }
    24   Advice (Binding = "2", Priority = "1") {
    25     Accounting = "User Choice" }
    26 }
    27
    28 Aspect (Id = "Service Security") {
    29   Pointcut (Binding = "1") {
    30     Policy-ID = "Compute Service", Service-Provider = "*",
    31     Service-Offer = "*", Operation = "*", Privacy = "Non-Critical"
    32
    33   }
    34   Advice (Binding = "1") {
    35     Storage Encryption = "256-bit BlowFish",
    36     Transfer Encryption = "None"
    37   }
    38   ...
    39 }
    Listing 8: Using aspects to disentangle crosscutting

4.4 Multi-Paradigm Interpretation
Empowering policies with the interpretation of multiple paradigms allows them to incorporate different formalisms covering possibly orthogonal system views. It also gives users and developers more freedom to specify facets of a policy in the constructs they deem most appropriate (regarding, for instance, usability, understandability, or brevity) for covering all pertinent facts of the view in question.

The policy fragment in Listing 9 illustrates the integration of different policy constructs, i.e. a finite state machine and a Ponder [2] role-based access control statement, into a generic WSPL [1] policy. This permits access to primitives from policy languages that were designed to specify a specific system functionality within a joint generic context.

    Policy (Id = "Service-Operation") {
      Rule (Id = "Permissible-Operations") {
        Paradigm (FSM) {
          State (Id = "not read") { "FileXRead" -> "read" }
          State (Id = "read") { ... }
        }
        ...
        Paradigm (RBAC-Ponder) {
          inst auth+ fileaccess {
            subject User; target FileX; action read();
          }}}}
    Listing 9: Policy with different paradigms

What is special in such a composed policy language is that users can select an appropriate paradigm from an extensible set of existing paradigms, each of which enables the user of a policy to freely define requirements using the concise terms of a DSL. The challenge of refining disparate composed paradigms into a cohesive, less highly abstracted policy language can be met by translating the composed policy into a lower-level unified language using a pre-processor. This translation can occur automatically and transparently to the user.

5. DISCUSSION AND CONCLUSION
In this paper, we have proposed a prototype policy language that makes use of well-established composition mechanisms to meet the special requirements of distributed business applications. To conclude the paper, we sketch our plan to implement a policy language toolkit for building such extensible policy languages. In the toolkit, we will implement each of the proposed mechanisms as one modular language plug-in, so that users can select a set of plug-ins to tailor a policy language instantiation to their specific requirements. The toolkit then composes the selected mechanisms. Future work will evaluate how the different mechanisms interact and how this affects policy language complexity.

For enabling an extensible policy language, we plan to combine techniques from programming languages. For enabling extensible and composable languages, we will use the concept of embedded DSLs [5], where each policy dialect and mechanism is embedded as a library into an existing language. For polymorphic policies, we will embed a module system for policies with support for user-defined inheritance schemes. For precise scoping, we will make use of the concept of scoping strategies [8]. For aspects, we will use domain-specific join points to enable composing aspects in policies written in DSLs [3]. For multiple paradigms, users can load a new paradigm with a special syntax [4].

6. ACKNOWLEDGMENTS
The work presented in this paper was performed in the context of the Software-Cluster project EMERGENT (www.software-cluster.org). It was funded by the German Federal Ministry of Education and Research (BMBF) under grant no. "01IC10S01" and by the Center for Advanced Security Research Darmstadt (CASED, www.cased.de).

7. REFERENCES
[1] A. Anderson. An introduction to the Web Services Policy Language (WSPL). Workshop on Policies for Distributed Systems and Networks, pages 189–192, 2004.
[2] N. Damianou, N. Dulay, E. Lupu, and M. Sloman. The Ponder policy specification language. In M. Sloman, E. Lupu, and J. Lobo, editors, Policies for Distributed Systems and Networks, volume 1995 of Lecture Notes in Computer Science, pages 18–38. Springer Berlin / Heidelberg, 2001.
[3] T. Dinkelaker, M. Eichberg, and M. Mezini. An architecture for composing embedded domain-specific languages. In Proceedings of the 9th International Conference on Aspect-Oriented Software Development (AOSD'10), pages 49–60, NY, USA, 2010. ACM.
[4] T. Dinkelaker, M. Eichberg, and M. Mezini. Incremental Concrete Syntax for Embedded Languages. In ACM Symposium on Applied Computing, Technical Track on Programming Languages (PL at SAC). ACM, NY, USA, 2011.
[5] P. Hudak. Building Domain-Specific Embedded Languages. ACM Computing Surveys, 28(4es):196–196, 1996.
[6] G. Kiczales, E. Hilsdale, J. Hugunin, M. Kersten, J. Palm, and W. G. Griswold. An Overview of AspectJ. In ECOOP, volume 2072 of LNCS, pages 327–353, 2001.
[7] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J. Loingtier, and J. Irwin. Aspect-Oriented Programming. In ECOOP, pages 220–242, 1997.
[8] E. Tanter. Beyond static and dynamic scope. SIGPLAN Not., 44:3–14, October 2009.


An association-based model of dynamic behaviour*

Ian Piumarta
Viewpoints Research Institute, Glendale, CA, USA
ian@vpri.org

* This material is based upon work supported in part by the National Science Foundation under Grant No. 0639876. Opinions, findings, and conclusions or recommendations expressed in this material are those of the author and almost certainly do not reflect those of the NSF, or of anyone else, for that matter.

FREECO'11, July 26, 2011, Lancaster, UK. Copyright 2011 ACM 978-1-4503-0892-2/11/07 ... $10.00.

ABSTRACT
Dynamic programming languages seem to spend much of their time looking up behaviour associatively. Data structures in these languages are also easily expressible as associations. We propose that many, and maybe even all, interesting organisations of information and behaviour might be built from a single primitive operation: n-way associative lookup. A fast implementation of this primitive, possibly in hardware, could be the basis of efficient and compact implementations of a diverse range of programming language semantics and data structures.

1. INTRODUCTION
Languages with dynamic dispatch [9], first-class environments, and similar late-binding mechanisms use associative lookup as a central component of the mechanisms and semantics they provide. For example, message sending (or calling a virtual function) uses an association from types and message (or function) names to function implementations. Multiple dispatch [2] typically associates a sequence of several (potentially many) type names with a function or method implementation. Even simple objects with named fields, or indexable arrays, are associations between an object identifier and a field name or numeric index that need not specify further how the storage is implemented.

This leads to the question of whether many (maybe even all) useful organisations of information and behaviour in a dynamic language might be constructed from a single primitive operation: n-way associative lookup.

We could explore this question top-down, by choosing a range of interesting behaviours and organisations and showing how they can be composed from a single primitive, or bottom-up, by showing how a single primitive operation can be used alone or in composition with itself to arrive incrementally at a number of familiar and widely-used behaviours and organisations. Since the latter seems more open-ended, that is the approach we take here.

2. AN ABSTRACT MODEL OF MEMORY
We will distinguish between application and primitive mechanisms.

Application mechanisms are the fundamental semantic operations needed to implement some programming system. In Smalltalk [5], for example, we would identify dynamic binding as the critical semantic operation. These mechanisms form the essential part of the programming model exposed to users of the system, even if they are not always made directly available to users.

Primitive mechanisms are the raw material used by the language implementor as a platform on which to build the application semantics.
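As a rough illustration of this distinction, the Python sketch below layers dynamic binding (an application mechanism) on top of an n-way associative table (standing in for a primitive mechanism). It is an assumption-laden toy rather than the model developed in this paper: `AssociativeMemory`, `send`, and `multi_send` are hypothetical names, Python type names serve as type keys, and `None` stands in for a missing association.

```python
class AssociativeMemory:
    """Hypothetical n-way associative memory: a tuple of keys maps to a value."""

    def __init__(self):
        self._cells = {}

    def read(self, *keys):
        # None stands in for "no value has been associated with these keys".
        return self._cells.get(keys)

    def write(self, *keys_and_value):
        *keys, value = keys_and_value
        self._cells[tuple(keys)] = value


def send(memory, selector, receiver, *args):
    """Message send as a 2-way lookup: (receiver type, selector)."""
    method = memory.read(type(receiver).__name__, selector)
    if method is None:
        raise LookupError(f"{type(receiver).__name__} does not understand {selector}")
    return method(receiver, *args)


def multi_send(memory, selector, *args):
    """Multiple dispatch as an n-way lookup: (type of each argument..., selector)."""
    types = [type(arg).__name__ for arg in args]
    method = memory.read(*types, selector)
    if method is None:
        raise LookupError(f"no method {selector} for {types}")
    return method(*args)


memory = AssociativeMemory()
memory.write("int", "double", lambda n: n * 2)            # single dispatch
memory.write("int", "str", "repeat", lambda n, s: s * n)  # dispatch on two types
```

Both kinds of dispatch reduce to the same `read` operation with a different number of keys, which is the sense in which a single lookup primitive can carry several application mechanisms.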
In most Sm<strong>all</strong>talk implementationswe would have to admit that the primitive mechanisms arememory <strong>all</strong>ocation (hidden within primitives new and new:)and base+offset addressing of that memory (hidden withinthe various primitives at: and at:put:).2.1 Primitive mechanismOur primitive mechanism provides a memory that is a mapm associating one or more keys k i with a value v.m : K ∗ → Vm[k 1,...,k n]=vThis memory supports two primitive operators, associativeread and associative write, which we will be written as‘[ ]’ and ‘[ ]✁’ respectively:m[k 1,...,k n]m[k 1,...,k n] ✁ vvalue in m associated with keys k iupdate m; subsequently m[k i]=v(The notation m[k] is used instead of m(k) to remind us thatm is not a function but an associative lookup.) The state ofm is therefore relative to a particular time, the passage ofwhich will be implied but not stated in this discussion. 1The domain K of keys and range V of values in m arethe same. A distinguished value (the “undefined” value) isiniti<strong>all</strong>y associated with every possible combination of keys1 Object models in which time, versioning, causality, etc.,are significant are probably far better modelled by consideringthe time component as another key (a first-class useraccessiblevalue) rather than an intrinsic property of theunderlying model.


in m. If is used as a key, the associated value is also (regardless of the other keys).m[k 1,...,k n]=for any k i = A simple application model might choose to let an valuepropagate through subsequent operations, or to raise an exceptionimmediately when an is read, etc.2.2 Application mechanismApplication mechanisms are presented as read and write operationson memory via the functions r and w, respectively.r(k 1,...,k n)w(k 1,...,k n,v)r : K ∗ → Vw : K ∗ × V → read value associated with keys k iwrite value associated with keys k iFor a given object model we would like to define its ‘characteristic’functions r and w of k in terms of the primitiveoperations [ ] and [ ]✁. To illustrate this we will consider thesimplest possible object model: a flat address space.3. PHYSICAL MEMORYReading a memory address yields a value; writing a memoryaddress updates its value. The functions r and w are trivi<strong>all</strong>ydefined as the two fundamental operations on m.r(k) = m[k]w(k, v) =m[k] ✁ vIn a w-bit computer (with no paging or segmentation) wemight have K = {n ∈ N 0 | 0 ≤ n
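As a rough illustration of the primitive mechanism of section 2.1 and the flat address space of section 3, the following Python sketch models the memory as a dictionary keyed by tuples. The names `Memory`, `UNDEFINED`, `read` and `write` are hypothetical and do not come from the paper. Two behaviours are assumptions of this sketch rather than statements from the text: writing the undefined value removes the association, and a write through an undefined key is ignored.

```python
class _Undefined:
    """Stand-in for the paper's distinguished 'undefined' value."""

    def __repr__(self):
        return "UNDEFINED"


UNDEFINED = _Undefined()


class Memory:
    """n-way associative memory: a map from key tuples to values."""

    def __init__(self):
        self._cells = {}

    def read(self, *keys):
        """Associative read m[k1, ..., kn]."""
        if any(k is UNDEFINED for k in keys):
            return UNDEFINED  # an undefined key always yields undefined
        return self._cells.get(keys, UNDEFINED)

    def write(self, *keys_and_value):
        """Associative write; the last argument is the value v."""
        *keys, value = keys_and_value
        keys = tuple(keys)
        if any(k is UNDEFINED for k in keys):
            return  # assumption: writes through an undefined key are ignored
        if value is UNDEFINED:
            self._cells.pop(keys, None)  # assumption: writing undefined deletes
        else:
            self._cells[keys] = value


m = Memory()


def r(k):
    """Flat address space: r(k) = m[k]."""
    return m.read(k)


def w(k, v):
    """Flat address space: w(k, v) writes v at address k."""
    m.write(k, v)


w(0x10, 42)
print(r(0x10), r(0x14))
```

A dictionary keyed by tuples keeps the sketch independent of n: the same `read` and `write` serve the one-key flat address space here and any multi-key model built on top of it.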


that, when substituted back into $r$ (keeping the β-transformations of the delegation example), yield

$$r(k_1,k_2) = r'(m[k_1,\tau],k_2)$$

$$r'(k_1,k_2) = \begin{cases} m[k_1,k_2] & \text{for } m[k_1,k_2] \neq \mathit{undefined} \\ r'(m[k_1,\sigma],k_2) & \text{for } m[k_1,k_2] = \mathit{undefined} \end{cases}$$

which uses some "property" $\tau$ of an object $k_1$ as the starting point for the previous example's lookup (following a "chain" of $\sigma$ slots). Put another way, if $k_2$ is interpreted as a message name then $\tau$ is the "type" of an object (grouping related objects into a family) and $\sigma$ the "supertype" of a type. In other words

$$n = 2 \qquad \alpha_1(k) = m[k,\tau] \qquad \beta_1(k) = m[k,\sigma]$$

is the dynamic binding mechanism for a class-based object system with inheritance.

Of course, not all the complexity of a practical system is contained within the three lines that characterise the mechanism. For example, in Smalltalk these three lines say nothing about creating the initial class hierarchy, installing new methods in classes, or implementing a ClassBuilder object.

6. KEYS ARE META-TAXONOMIC DIMENSIONS

Each particular well-known key, along with its recursive β- and α-transformations, can generate a taxonomy within which objects can be organised. In the above examples, applied to a Smalltalk-like system, $\tau$ is an object's class pointer and $\sigma$ is a superclass pointer in a (meta)class (both of which are hierarchical taxonomies of object types).
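The class-based binding rule characterised by n = 2, α₁(k) = m[k, τ] and β₁(k) = m[k, σ] can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: `None` stands in for the undefined value, the strings `"tau"` and `"sigma"` stand in for the well-known keys, the recursion of r′ is written as a loop, and the lookup is assumed to stop as soon as the chain of σ slots reaches the undefined value. The class and selector names in the sample data are hypothetical.

```python
UNDEFINED = None             # stand-in for the undefined value (assumption)
TAU, SIGMA = "tau", "sigma"  # well-known keys: type slot and supertype slot

m = {}                       # the associative memory, keyed by (k1, k2)


def read(k1, k2):
    """Primitive read m[k1, k2]; an undefined key yields undefined."""
    if k1 is UNDEFINED or k2 is UNDEFINED:
        return UNDEFINED
    return m.get((k1, k2), UNDEFINED)


def r_prime(k1, k2):
    """r'(k1, k2): look in k1, then follow the chain of sigma slots."""
    while k1 is not UNDEFINED:   # assumption: an undefined k1 ends the search
        value = read(k1, k2)
        if value is not UNDEFINED:
            return value
        k1 = read(k1, SIGMA)     # beta_1: move to the supertype
    return UNDEFINED


def r(k1, k2):
    """r(k1, k2) = r'(m[k1, tau], k2): start at the object's type."""
    return r_prime(read(k1, TAU), k2)


# Hypothetical two-level hierarchy: Point inherits from Object.
m[("Object", "printString")] = "Object>>printString"
m[("Point", SIGMA)] = "Object"
m[("Point", "x")] = "Point>>x"
m[("p1", TAU)] = "Point"

print(r("p1", "x"))
print(r("p1", "printString"))
```

The sketch does not guard against cycles in the σ chain; a practical system would need to rule those out when the hierarchy is built.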
Each is associated with a different concrete key, but both exist in the same dimension (are used in the same position $k_i$, where $i = 2$ in this case).

Each additional key position (gained by increasing n by 1, for example) creates a new "dimension" or "taxonomic space" in which any number of new taxonomies can be created. These new taxonomies will all be orthogonal to (and completely independent from) those in other key positions (even if they share the same concrete keys).

Continuing with the delegation example, increasing n to 3 (adding the key $k_3$)

$$r(k_1,k_2,k_3) = \begin{cases} m[k_1,k_2,k_3] & \text{for } m[k_1,k_2,k_3] \neq \mathit{undefined} \\ r(m[k_1,\sigma],k_2,k_3) & \text{for } m[k_1,k_2,k_3] = \mathit{undefined} \end{cases}$$

gives us multiple (disjoint) perspectives on objects, each associated with a particular concrete $k_3$, with delegation occurring between objects only within a single perspective. In effect, $k_3$ is a 'namespace' constraining both the content of, and the extent of the taxonomies defined by concrete keys and their β functions between, objects 'residing' within it.

If we have a namespace $\omega$ in which global relationships are expressed and let

$$\beta_3(k) = m[k,\sigma,\omega]$$

then perspectives (the $k_3$ keys) on a given object will delegate to each other (via their $\sigma$ slot).

The occurrence of the undefined value in $m$ can be used to terminate delegation (or other recursive relationships) in multiple dimensions. Introducing distinct versions of $r$ (one $r_i$ for each dimension $i$ in which delegation occurs) lets us choose the precedence of axes in the n-dimensional delegation space. For example,

$$r(k_1,k_2,k_3) = \begin{cases} r_1(k_1,k_2,k_3) & \text{for } r_1(k_1,k_2,k_3) \neq \mathit{undefined} \\ r_1(k_1,k_2,m[k_3,\sigma]) & \text{otherwise} \end{cases}$$

$$r_1(k_1,k_2,k_3) = \begin{cases} m[k_1,k_2,k_3] & \text{for } m[k_1,k_2,k_3] \neq \mathit{undefined} \\ r_1(m[k_1,\sigma],k_2,k_3) & \text{otherwise} \end{cases}$$

delegates first between objects $k_1$ within a single perspective $k_3$ and then between perspectives $k_3$ on the original object, whereas

$$r(k_1,k_2,k_3) = \begin{cases} r_3(k_1,k_2,k_3) & \text{for } r_3(k_1,k_2,k_3) \neq \mathit{undefined} \\ r_3(m[k_1,\sigma],k_2,k_3) & \text{otherwise} \end{cases}$$

$$r_3(k_1,k_2,k_3) = \begin{cases} m[k_1,k_2,k_3] & \text{for } m[k_1,k_2,k_3] \neq \mathit{undefined} \\ r_3(k_1,k_2,m[k_3,\sigma]) & \text{otherwise} \end{cases}$$

delegates first between perspectives $k_3$ on a single object $k_1$ and then between distinct objects $k_1$ in the original perspective $k_3$.

One final example (among many): if we let $v$ range over methods of arity n within a memory indexed by $k_1,\ldots,k_n$, then the above model (with appropriate α- and β-transformations) can easily describe binding mechanisms for multimethod (generic function) dispatch.

7. FUNCTION W AND ITS TRANSFORMATIONS

These are constructed in exactly the same manner as for the function $r$, with the same possibilities for pre- and post-transformations and for recursive recombination, in the obvious manner.

The simplest useful definition of $w$, the application write function,

$$w(k_1,\ldots,k_n,v) = m[k_1,\ldots,k_n] \triangleleft v$$

introduces new keys into $m$ directly with no attempt to reason about "where" the new value $v$ should be "placed" within any taxonomies defined by $r$. In the same manner as was done for $r$, pre-transformations $\gamma_i$ and post-transformations $\delta_i$ can be introduced.

$$w(k_1,\ldots,k_n,v) = w'(\gamma_1(k_1),\ldots,\gamma_n(k_n),v)$$

$$w'(k_1,\ldots,k_n,v) = \begin{cases} w'(\delta_1(k_1),\ldots,\delta_n(k_n),v) & \text{for some condition on } \mathit{undefined}, r, \alpha_i, \beta_i, \gamma_i, \delta_i \\ m[k_1,\ldots,k_n] \triangleleft v & \text{otherwise} \end{cases}$$

It is worthwhile to note that this "simplest useful" definition of $w$ is often the most appropriate. (For the inheritance and delegation mechanisms described above it is precisely what is wanted.) More exotic constructions for $w$ would be identical in nature to those already examined for the function $r$.

8. UNIFICATION

The primitive read and write operations on $m$ can be unified into a single operation. To write a value $v$, a statement

$$m[k_1,\ldots,k_n,v]$$

is made about its presence within the memory. (If $v$ is the undefined value, the value is "deleted".) Unifying a single variable $v$ within a similar statement

$$v = m[k_1,\ldots,k_n,?]$$

retrieves a value. It is trivial to rephrase this entire paper using the above formulation.

This simplification suggests a very powerful extension that would allow the 'unified' variable(s) to appear in any key position, not just the last. The primitive mechanism is now directly applicable to the semantics of local operations of relational languages.³ (Support for publish-subscribe would then require 'just' the addition of a global notification mechanism. One possibility might be 'future unification' where a process blocks until a value other than the undefined value becomes available for each unified variable in a statement.)

Such extensions are not without practical and philosophical costs (far beyond the already considerable implementation challenges presented by the basic primitive mechanism).

9. PRACTICAL CONSIDERATIONS

Some of the application-level models of organisation and dynamic behaviour described in this paper are trivial to implement on (or are intrinsic to) current computer hardware. All of them are trivial to implement given the primitive $[\;]$ and $[\;]\triangleleft$ operators. Furthermore, if these implementations are efficient then the resulting programming system will be efficient, with complexity increasing commensurately (in the absolute worst case exponentially) with n.

Software implementations for all of the models/behaviours presented are common for n = 2, and can be made very efficient (through various caching techniques) for $\alpha_i$ that map many objects onto a much smaller set of object families. Hash tables work well for 'singleton' associations where n = 2 and $\alpha(k) = k$, but already present problems of garbage collection: values should be deleted from $m$ when either $k_1$ or $k_2$ becomes unreachable, but it is usual to consider only $k_1$.
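The bookkeeping behind this garbage-collection concern can be sketched in Python with a secondary index from each key to the key tuples that mention it, so that one call removes every association in which that key occurs at any position. The class `IndexedMemory` and its `forget` method are hypothetical names. The sketch also assumes that something else (a collector or the application) decides when a key has become unreachable; detecting unreachability automatically is the hard part and is not attempted here.

```python
from collections import defaultdict


class IndexedMemory:
    """Tuple-keyed table with a reverse index from key to key tuples."""

    def __init__(self):
        self._cells = {}                    # key tuple -> value
        self._mentions = defaultdict(set)   # key -> tuples that contain it

    def write(self, keys, value):
        keys = tuple(keys)
        self._cells[keys] = value
        for k in keys:
            self._mentions[k].add(keys)

    def read(self, keys):
        # None plays the role of the undefined value in this sketch.
        return self._cells.get(tuple(keys))

    def forget(self, key):
        """Delete every association whose key tuple contains `key`."""
        for keys in self._mentions.pop(key, set()):
            self._cells.pop(keys, None)
            for other in keys:
                if other == key:
                    continue
                tuples = self._mentions.get(other)
                if tuples is not None:
                    tuples.discard(keys)
                    if not tuples:
                        del self._mentions[other]

    def __len__(self):
        return len(self._cells)


mem = IndexedMemory()
mem.write(("a", "x"), 1)
mem.write(("b", "a"), 2)   # "a" appears in the second position here
mem.write(("b", "y"), 3)
mem.forget("a")
print(len(mem))
```

The reverse index costs one extra set entry per key position per association, which is one concrete way the management overhead grows with n.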
The problem becomes increasingly difficult as generality is preserved while n grows beyond 2, where unreachability of any given key k must imply deletion of all values for which some k_i = k (for any i : 0 ≤ i …


This suggests using the association primitive as part of a language definition, to be translated automatically into an efficient implementation. The dynamic mechanisms with which we are familiar (as described above, as well as those that have not been invented yet) could, and probably should, be nothing more remarkable than the consequences of particular arrangements of implications made from properties of objects described by the programmer as part of the specification of the environment in which their application program will be written and executed.⁵

⁵ This "meta programming" should be no more intimidating than the current "meta" practices of defining a template library [11], or redefining operators new and delete, to extend the environment in which C++ [12] programs are written, for example.

11. RELATED WORK

Context-oriented programming [6] addresses similar issues, but provides solutions at a much higher level of abstraction by extending high-level languages (Lisp and Java) within themselves to add another axis to the binding process.

Predicate dispatch [4] unifies many mechanisms for choosing dynamically between methods within a generic function, but is qualitatively different to the present approach in its heavy reliance on compile-time static type analysis.

Maybe the most closely-related type-based work is λ{} [1] which also seeks to unify the overloading of methods to form generic functions, but does so dynamically and tries to use the smallest number of operators to accomplish the task.

In contrast to the above, the association primitive is simpler since it considers type as an optional runtime property derived from a value, not as a formal property of an abstract value at compile time.

12. CONCLUSION

This paper is an attempt to stimulate thinking about how a very simple pair of primitive operations (that should be efficiently realisable in sufficiently parallel hardware) can scale to (and adequately implement with trivial additional work) the complex structures and behaviours we struggle to implement in object-oriented, functional and relational systems. Hopefully it also manages to demonstrate that many apparently very different and interesting organisations and behaviours are in fact closely related as slight variations within a general, parameterisable, n-way associative memory.

We may never see hardware support for the primitive operators described here, but an efficient software implementation (capable of scaling to billions of entries) would make a great doctoral thesis. The big challenges are not necessarily to be found in the primitive operators, but rather in the associated management: garbage collection, in particular.

Acknowledgements

The author is greatly indebted to the three anonymous reviewers who provided much useful feedback and who worked valiantly in hope of turning this paper into something of academic value. Responsibility for failure to achieve that goal rests entirely with the author.

13. REFERENCES

[1] G. Castagna (1997) Unifying overloading and λ-abstractions: λ{}, Theoretical Computer Science, Vol. 176, No. 1–2, pp. 337–345
[2] C. Chambers (1992) Object-Oriented Multi-Methods in Cecil, Proc. European Conference on Object-Oriented Programming (ECOOP'92), pp. 33–56
[3] E. Codd (1970) A relational model of data for large shared data banks, Communications of the ACM, Vol. 13, No. 6, pp. 377–387
[4] M. Ernst, C. Kaplan and C. Chambers (1998) Predicate dispatching: A unified theory of dispatch, Proc. 12th European Conference on Object-Oriented Programming (ECOOP'98), pp. 186–211
[5] A. Goldberg and D. Robson (1983) Smalltalk-80: The Language and its Implementation, Addison-Wesley, ISBN 0–201–11371–6
[6] R. Hirschfeld, P. Costanza and O. Nierstrasz (2008) Context-oriented Programming, Journal of Object Technology (JOT), Vol. 7, No. 3, pp. 125–151
[7] H. Lieberman (1986) Using Prototypical Objects to Implement Shared Behavior in Object Oriented Systems, Proc. First ACM Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA), Portland, OR
[8] B. Liskov and J. Wing (1994) A behavioral notion of subtyping, ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 16, No. 6, pp. 1811–1841
[9] S. Milton and H. Schmidt (1994) Dynamic Dispatch in Object-Oriented Languages, Technical Report TR–CS–94–02, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Division of Information Technology
[10] http://www.sqlite.org
[11] A. Stepanov and M. Lee (1994) The Standard Template Library, Technical Report X3J16/94–0095, WG21/N0482, ISO Programming Language C++ Project
[12] B. Stroustrup (1997) The C++ Programming Language, Addison Wesley, ISBN 0–201–88954–4
[13] A. Warth, Y. Ohshima, T. Kaehler and A. Kay (2010) Worlds: Controlling the Scope of Side Effects, Technical Report TR–2010–001, Viewpoints Research Institute


The Keyword Revolution
Promoting language constructs for data access to first class citizens

Steven te Brinke, Lodewijk Bergmans, and Christoph Bockisch
University of Twente – Software Engineering group – Enschede, The Netherlands
{brinkes, bergmans, c.m.bockisch}@cs.utwente.nl

ABSTRACT

An ongoing trend is to develop new mechanisms for composing software modules that resemble the relations between corresponding problem-domain entities and thus enable a natural decomposition of software for an increasing number of problem domains. However, we have observed that today's programming languages hard-wire a fixed set of composition mechanisms, usually in terms of keywords. To overcome this limitation, we have proposed the Co-op approach enabling developers to implement an open-ended number of composition mechanisms as first-class citizens. Extending our previous prototype which focused on the composition of behavior, this paper reports on our prototype Co-op/II which facilitates implementing composition mechanisms for data access. We show that our approach is sufficient to realize several styles, e.g., of sharing data between sub classes, of controlling visibility, and of behavioral modifiers like synchronization of data access, converting or persisting data.

1. PROBLEM STATEMENT

To be able to properly decompose a software system into modules which correspond to the entities of the problem domain, sufficient mechanisms must be provided to compose the modules into a running system again. There is a continuous trend [10] in software engineering research to develop programming languages with composition mechanisms that map to relations between problem-domain entities. Thus a natural decomposition of software is enabled for an increasing number of problem domains.
The key to these composition mechanisms is abstraction: Program elements can use others without explicitly referring to their implementation; typically, multiple implementations of an abstraction exist and the execution environment selects one according to well-defined rules.

A popular example of a composition mechanism is inheritance, where the behavior of a child class is composed with the behavior of the parent. Already for this well-known example, many different variants exist [11]. Examples of other composition mechanisms are delegation, predicate dispatching, aggregation, pointcut-advice, etc.

The number of existing composition mechanisms is immense and shows that there is a need for a variety of such technology. On the other hand, the fact that research in this area is ongoing and keeps producing new composition mechanisms shows that every programming language which provides just a fixed set of such technologies will always be limiting. However, this is what current programming languages do: They offer a limited number of composition mechanisms from which developers can choose. It is still possible to emulate other mechanisms by using specific coding styles or design patterns.
But applying these has many drawbacks: Implementing patterns requires training, especially when multiple composition mechanisms are combined; coding discipline is required to keep the code understandable; re-usability of modules is impaired as they are polluted with pattern implementations.

In previous work [6], we have presented the Co-op concept of a programming language that allows freely implementing composition mechanisms. In Co-op, composition mechanisms are implemented as first-class objects that operate on compositions embodied as message sends; such objects are called composition operators in Co-op. We have implemented this concept as a prototype in the language and execution environment Co-op/I [6] and have proved our approach feasible for composing behavior. In particular, a case study showed our approach to be powerful enough to model different design patterns [7] and different semantics for inheritance.

But composition techniques like inheritance do not only control the composition of behavior, but also the composition of data. For instance, access modifiers control from where data fields can be accessed: for example, only by method definitions contained in the same class as the field declaration, or also by methods defined in classes inheriting from the one declaring the field. Thus, in this paper we report on the second prototype of a Co-op language and execution environment, Co-op/II. In this prototype, in addition to function calls, data accesses are also reified as messages being sent, and composition operators can reason about and influence such messages. Throughout this paper, we will illustrate our approach and discuss the feasibility of implementing different composition mechanisms for data access.

2. INTRODUCING CO-OP BY EXAMPLE

We have implemented the concepts of Co-op in the language Co-op/II. The syntax of this language is inspired by class-based languages like Java and C++. Nevertheless, it has no constructs for inheritance, is dynamically typed, and aims to avoid many keywords and language constructs for expressing specific semantics. For instance, keywords like public, private, protected and static are avoided. We introduced keywords to syntactically distinguish the four kinds of members: var, method, binding and constraint.

This section will present some examples, which work with the current Co-op/II prototype. To start with, a simple class can be defined as follows:

    class Person {
      var name;
      var title;

      method talk() {
        return "What should " + this.title + " " + this.name + " say?";
      }

      method new(name, title) {...} // returns a new, initialized, instance
    }

The Co-op object model is based on the manipulation of messages that are being sent between objects. In Co-op/II, the dot specifies a message send; thus, the following example contains three message sends: the creation of a new person (Bob), requesting what he has to say, and printing this.

    var bob = Person.new("Bob", "Doctor");
    System.println( bob.talk() );

A binding selects which messages it applies to and how they should be rewritten. Bindings can be defined by any class; such a class is then referred to as a composition operator since, through the bindings, it affects how interacting objects are composed. The defaultBinding is the only built-in binding and all messages are eventually dispatched through this binding. It handles a message by dispatching it to the method specified by the message properties name and targetType. In the diagram below, we see the generated message for bob.talk() and a selection of its properties.
In this example, the message is processed by the default binding, which succeeds.

[Figure: the message generated for bob.talk(), with properties messageKind = "Call", name = "talk", target = bob, targetType = Person, this = bob, is processed by defaultBinding, which succeeds.]

Each message has a set of properties, which can be added, removed, changed and used by any binding. Some properties, e.g. the ones in the previous figure, are special in the sense that an initial value will be assigned to them when a message is sent. Besides that, there is no difference between these properties and any other property.

3. INHERITANCE AS A COMPOSITION OPERATOR

Inheritance is a common technique for composing classes. Here we use the Smalltalk inheritance style to illustrate the implementation of a composition operator. Smalltalk inheritance allows for overriding methods along a single inheritance hierarchy. As an example, we introduce the following class as a subclass of class Person:

    class Student {
      method study() {
        return "Is that necessary?";
      }
    }

The inheritance relation is not specified through a built-in language construct, but through composition operators, as shown in the following code, which could e.g. be part of the main() method of the application:

    SmalltalkStyleInheritance.subclassOf(Student, Person);
    var alice = Student.new("Alice", "Bachelor");
    System.println( alice.study() );
    System.println( alice.talk() );

In the case of inheritance, it may be more intuitive (and in fact, appropriate) to specify inheritance within the subclass itself: this can be achieved in Co-op either by writing it in the class initializer of Student, or by expressing it through an annotation on the class declaration, such as class @Inherits(Person) Student, but the latter is not supported by the current prototype.
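The message-based object model introduced in section 2 can be approximated in Python to make the role of the default binding concrete. This is only a sketch under stated assumptions and not the Co-op/II implementation: a message is modelled as a plain dictionary of properties, and the functions `send` and `default_binding` are hypothetical names. The default binding looks the method up only in the class named by `targetType`, never in any Python base class, which mirrors the statement that dispatch is determined by the `name` and `targetType` properties.

```python
class Person:
    def __init__(self, name, title):
        self.name = name
        self.title = title

    def talk(self):
        return "What should " + self.title + " " + self.name + " say?"


def default_binding(message, args):
    """Dispatch on the message properties 'name' and 'targetType' only."""
    method = vars(message["targetType"]).get(message["name"])
    if method is None:
        raise LookupError("no method " + message["name"])
    return method(message["this"], *args)


def send(target, name, *args):
    """Build the message for target.name(args) and dispatch it."""
    message = {
        "messageKind": "Call",
        "name": name,
        "target": target,
        "targetType": type(target),
        "this": target,
    }
    return default_binding(message, args)


bob = Person("Bob", "Doctor")
print(send(bob, "talk"))
```

Because every property lives in an ordinary dictionary, a composition operator in this model could add, remove or change properties before the default binding sees the message.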
The key point of our contribution is that the inheritance specification is no longer part of the language syntax, and the location where to specify it can now be determined to provide the best design trade-off.

Further, in this example, the inheritance behavior is in fact composed of two composition operators: one for method inheritance, and one for field inheritance. These are created by the first and second statements, respectively, in the body of subclassOf in the following listing. The definition of FieldInheritance takes a third parameter, here class LocalFieldAccess, which defines the policy for field access. To express Smalltalk-like inheritance, we want fields to be accessible only locally, and not from subclasses:

    class SmalltalkStyleInheritance {
      method @ImplicitParameters([]) subclassOf(childType, parentType) {
        MethodInheritance.subclassOf(childType, parentType);
        FieldInheritance.subclassOf(childType, parentType, LocalFieldAccess);
      }
    }

4. DEFINING INHERITANCE OF METHODS

In this section we first illustrate how method inheritance can be expressed using Co-op/II, which is similar to that in Co-op/I. The execution of alice.talk() (i.e., invoking an inherited method) involves more steps than the execution of bob.talk() we saw in section 2:

[Figure: the message generated for alice.talk(), with properties messageKind = "Call", name = "talk", target = alice, targetType = Student, this = alice, is rewritten by MethodInheritance.virtualBinding into a message with targetType = Person (all other properties unchanged), which is then processed by defaultBinding, which succeeds.]

Alice is an instance of Student, and the talk() method is implemented in Person, so Alice cannot execute this behavior directly. To express inheritance, the virtualBinding rewrites messages sent to a Student to address a Person. The composition operator that specifies this binding is defined as follows:

    class MethodInheritance {
      var childType;
      var parentType;
      // Binding for virtual method lookup
      binding virtualBinding = (messageKind == "Call"
                                & targetType == this.childType) {
        targetType = this.parentType;
      }
      // virtualBinding is applicable only if the default binding fails
      constraint bottomUpResolution = skip(defaultBinding, virtualBinding);
      // initialize the instance variables and activate binding.
      method subclassOf(childType, parentType) {...}
    }

Each instance of MethodInheritance defines an inheritance relation between the childType and parentType. The virtualBinding in this example selects all calls to the childType and reroutes these to the parentType, effectively specifying virtual method lookup. This virtual lookup only takes place if regular lookup fails, as specified by the constraint bottomUpResolution: If the default binding succeeds, the virtualBinding is skipped.

5. DEFINING INHERITANCE OF FIELDS

Defining the availability and accessibility of fields from superclasses is needed for expressing various inheritance (and other composition) semantics. Our previous version of the Co-op language, Co-op/I, did not support this; this paper explains the application of controlling field composition semantics for the first time.

For example, the Smalltalk style inheritance we have defined earlier also creates a FieldInheritance. Even though the fields of a Person are not directly accessible from its subclass Student, they are still addressable through methods defined in the class Person. Thus, upon creation of an instance of a child class, all fields in its superclasses must also be created. That is what the composition operator FieldInheritance does.

FieldInheritance also creates a relation between an instance of the child class and an instance of its superclass.¹ For Smalltalk style inheritance, this relation is created using the LocalFieldAccess operator, which allows private field access only.

¹ This does not involve specific and optimized memory layouts for objects.

Now, suppose you do not want field access to be limited to the declaring class only, but want fields to be addressable through the child classes as well. This would, for example, allow us to write the following implementation of Student:

    class Student {
      method study() {
        return "Is that necessary for " + this.title + " " + this.name + "?";
      }
    }

The field accesses in the method study() address fields defined in the super class, in the same way as we can address methods defined in the super class. Therefore, defining a binding which realizes accessing fields through child classes can be done in a similar way to defining it for methods:

    class InheritedFieldAccess {
      var child;
      var parent;
      // Binding for field lookup in parent type
      binding inheritFields = (messageKind == "Lookup"
                               & target == this.child) {
        target = this.parent;
        targetType = System.classOf(this.parent);
        this = this.parent;
      }
      // inheritFields is applicable only if the default binding fails
      constraint bottomUpResolution = skip(defaultBinding, inheritFields);
      // initialize instance and activate binding
      method initDispatch(child, parent) { ... }
    }

The most notable difference with MethodInheritance is that in this case we have a parent and child instance instead of type, because field values are instance properties, whereas method behavior is a property of the type.

Now, we can use field inheritance as shown in the example below. First, we create a relation between the parent and child type, and then we can access all fields of the parent type through any instance of a child type as well, which is what the method study() does.

    MethodInheritance.subclassOf(Student, Person);
    FieldInheritance.subclassOf(Student, Person, InheritedFieldAccess);
    var alice = Student.new("Alice", "Bachelor");
    System.println( alice.study() );

Languages like Java and C++, which enable inheritance of fields, also provide mechanisms to let the developer select fields which are not accessible through subtypes.
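The interplay of the default binding with the two rewriting bindings described above (virtualBinding for calls, inheritFields for field lookups) can be sketched in Python. This is an approximation under stated assumptions, not the Co-op/II runtime: messages are dictionaries, the skip constraint is modelled by consulting the rewriting bindings only after the default lookup has failed, and the names `dispatch`, `call`, `get`, `applies` and `rewrite` are hypothetical. The Python classes deliberately do not inherit from each other, so all inheritance comes from the bindings.

```python
_MISSING = object()
bindings = []   # active rewriting bindings, consulted only on failure


def default_lookup(msg):
    """Default binding: resolve by name in targetType (calls) or target (fields)."""
    if msg["messageKind"] == "Call":
        return vars(msg["targetType"]).get(msg["name"], _MISSING)
    return vars(msg["target"]).get(msg["name"], _MISSING)   # "Lookup"


def dispatch(msg, *args):
    while True:
        found = default_lookup(msg)
        if found is not _MISSING:
            if msg["messageKind"] == "Call":
                return found(msg["this"], *args)
            return found
        for binding in bindings:        # skip(defaultBinding, ...) analogue
            if binding.applies(msg):
                msg = binding.rewrite(msg)
                break
        else:
            raise LookupError(msg["name"])


def call(target, name, *args):
    return dispatch({"messageKind": "Call", "name": name, "target": target,
                     "targetType": type(target), "this": target}, *args)


def get(target, name):
    return dispatch({"messageKind": "Lookup", "name": name, "target": target,
                     "targetType": type(target), "this": target})


class MethodInheritance:
    def __init__(self, child_type, parent_type):
        self.child_type, self.parent_type = child_type, parent_type

    def applies(self, msg):
        return msg["messageKind"] == "Call" and msg["targetType"] is self.child_type

    def rewrite(self, msg):
        return {**msg, "targetType": self.parent_type}


class InheritedFieldAccess:
    def __init__(self, child, parent):
        self.child, self.parent = child, parent   # instances, not types

    def applies(self, msg):
        return msg["messageKind"] == "Lookup" and msg["target"] is self.child

    def rewrite(self, msg):
        return {**msg, "target": self.parent,
                "targetType": type(self.parent), "this": self.parent}


class Person:
    def talk(this):
        return "What should " + get(this, "title") + " " + get(this, "name") + " say?"


class Student:
    def study(this):
        return "Is that necessary for " + get(this, "title") + " " + get(this, "name") + "?"


alice, alice_as_person = Student(), Person()
alice_as_person.name, alice_as_person.title = "Alice", "Bachelor"
bindings.append(MethodInheritance(Student, Person))
bindings.append(InheritedFieldAccess(alice, alice_as_person))
print(call(alice, "study"))
print(call(alice, "talk"))
```

Keeping a separate parent instance per child instance reflects the observation that field values are instance properties, while the method binding needs only the two types.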
We have modeled this using the SelectiveFieldInheritance composition operator, which can replace InheritedFieldAccess. This operator only enables inheritance for fields that are selected explicitly. Its use is as follows:

    MethodInheritance.subclassOf(Student, Person);
    var selectiveInh = SelectiveFieldInheritance.new();
    selectiveInh.initDispatch(Student, Person);
    selectiveInh.addField("name");
    var alice = Student.new("Alice", "Bachelor");
    System.println( alice.study() );

Since this example allows subclasses of Person access only to the field name, the implementation of Student shown in the beginning of this section will yield a runtime error. However, the following implementation of Student is correct when this selective inheritance is applied:

    class Student {
      method study() {
        return "Is that necessary for " + this.name + "?";
      }
    }

Traditional programming languages allow programmers to add access rules to fields or methods by adding a modifier to their declaration. To enable a similar programming style, Co-op allows the addition of annotations to field and method declarations. Using reflective capabilities to access these annotations (which is not yet possible in the Co-op/II prototype) allows implementing the FieldInheritance composition operator in such a way that it, for example, only applies to fields with the @Inherited annotation:

    class Person {
      var @Inherited name;
      var title;
    }

6. FREEING FIELD ACCESS BEHAVIOR FROM KEYWORDS

In most (OO) programming languages, there are many keywords and fixed language constructs to manipulate the way that data is, or can be, accessed. We mention a few examples:

• Access modifiers in Java, C++ and C# are public, protected, and private. A language like C++ adds a friend keyword to express yet another form of access rights on data (as well as behavior).
  Note that there is a wide range of possible access modifiers, when including the notion of package-level protection, or the distinction between class-level vs instance-level protection.
• The Java, C++ or C# keyword static controls whether all instances of a class share a field, or each has its own copy.


• The keywords final in Java, const in C++ or readonly in C# declare special semantics to the usage of the variable (i.e., the variable may be assigned only once).

For every new keyword or feature that is introduced in a language, it must be considered carefully how this interacts with all possible combinations of other language constructs. This can be very challenging, and also tends to make the evolution of the language over time very difficult. As a result, creating new language constructs to address all the desired features in one language is not feasible.

One of the possible work-arounds is the adoption of design patterns [5], which document and standardize solutions to common design problems. However, implementing design patterns requires code changes and additions in multiple locations, with the concept and identity of the adopted design pattern being lost [14, 8].

Our proposal, as illustrated in the previous sections, is to aim for a simple object model, and a single mechanism for expressing a wide range of behavioral modifiers for fields. Examples of modified composition semantics, which can be expressed with the proper composition operators in Co-op/II, are: access modifiers, static, synchronized, final, and so forth, but also more conceptual constructs such as automatic conversions, checking of validity constraints, persistence, transactions, or expressing roles. Composition operators supporting these semantics can be provided by reusable libraries. In [1], larger complementary examples of behavioral composition are presented.

7. RELATED WORK

Ostermann and Mezini argue in [8], fully in line with our reasoning, that “[..] often non-standard composition semantics is needed, with a mixture of properties, which is not as such provided by any of the standard techniques.”
To address this, they propose a small design space of properties of composition languages, which includes Overriding of members, Transparent redirection of access to pseudovariables, and Acquisition, or transparent forwarding of access. This design space specifically covers the range of compositions from object aggregation to inheritance, but is unable to express other types of compositions, such as predicate dispatch, aspects, and so forth. A key technique proposed in their work is Compound References, which are exploited to express various alternative visibility and sharing styles for data fields. Our approach differs, among others, in the ability to express crosscutting abstractions, and the ability to influence field access (dynamically) based on predicates.

Open classes [3], later called inter-type declarations in AspectJ [15], allow for flexibly extending classes with additional fields, expressed separately from the original class definitions. This allows for application-specific extension of classes with additional fields. Using advice on field read or write join points, it is also possible in aspect-oriented languages or frameworks to implement conceptual modifiers for field accesses. Examples are adding persistence (see the SourceForge project “Java Persistence Aspect” at http://sourceforge.net/projects/jpa/) or defining access permissions in the style of multilevel security for fields [9]. There is no notion of generalizing such extensions to an application-independent composition operator.

On another note, aspect-oriented languages like AspectJ do not only allow for influencing field access and adding fields to classes; they also introduce the new mechanism of implicit object instantiation, which is controlled by so-called aspect instantiation policies.
Such policies declare when aspect instances are shared between different (implicit) invocations and when a new instance has to be created. Compose* [4] extends this concept and supports instantiation policies per field in an aspect instance. In Co-op, we believe, all these mechanisms can be realized uniformly as composition operators.

In [2], Bracha and Lindstrom discuss that the class construct (in languages with inheritance) has many different roles (nine, when ignoring type-related issues). They propose to adopt a very simple model of classes, which can then be enhanced by applying operators over modules, to express a wide range of inheritance semantics. As such it has similar aims as Co-op, but focuses only on inheritance-like semantics (e.g. it is not able to express aspects), and it differs in its mechanism, which is purely static; JIGSAW is a module manipulation language, whereas our composition operators express various composition semantics through first-class modules (expressed in the same language).

Reflex [12] is a reflection-based kernel for AOP languages. It also supports structural aspects, which may involve addition (and possibly modification) of members. In [13], structural aspects are discussed, focussing on the detection of interactions between multiple, additive, and structural aspect expressions.

Reflective languages and systems in general offer low-level constructs for influencing, among others, the message dispatch process. However, for application programmers, they lack the abstractions provided by Co-op such as bindings and constraints that enable the structured and composable expression of reusable composition operators. Nevertheless, fully reflective languages can be a suitable means for implementing languages like Co-op on top of them, or for implementing a meta-object protocol which offers comparable abstractions.

8. CONCLUSIONS AND FUTURE WORK

The Co-op approach enables developers to freely define and use operators realizing composition mechanisms in their programs. This is facilitated through the simple object model and the concept of declarative bindings defined in normal classes which can rewrite message sends. Because of this first-class nature of composition operators, they can be reused and composed again to enable a decomposition of software most natural to its problem domain.

The contribution of this paper is the representation of field accesses as message sends in Co-op/II and thus exposing field accesses to composition operators. To demonstrate the appropriateness of this approach we have expressed a wide range of composition mechanisms for fields, like different access policies for shared data in class hierarchies or behavioral modifiers for fields (see the SourceForge project “Co-op” at co-op.sf.net). An example which is not described in this paper is static field access.

There are several challenges for such a composition operator technique. Firstly, expressiveness: the technique should be able to express the desired semantics. This puts requirements on the language for expressing the composition behavior, as well as on the access to the internals of the program (say, its representation and execution by an underlying virtual machine). For example, some level of reflection is a necessity. Secondly, composability: a key question is whether multiple composition operators, in particular when applied to the same location (e.g. same field), will still yield the desired behavior. From our experience with Co-op we distinguish several issues for composing composition operators:
• Multiple operators should be applicable to the same program location.
• The semantics of co-located operators should be compatible: this may require a proper design of the composition operator library.
• The order of applying composition operators may make a difference.
• Some operators are by design incompatible.
• Hence one needs to express ordering, exclusion and coexistence constraints between operators. These can be general constraints, or application-specific constraints.

We plan to perform additional case studies of implementing composition mechanisms in Co-op/II and to improve our prototype in several ways. As mentioned already in section 3, often developers depend on specific compositional semantics for program elements like classes, methods and fields they define. For instance, it may be relevant that one class inherits from another or that a certain field can only be accessed locally. For reasons of good readability the composition mechanism should then be declared together with the definition of the program element. One possibility to achieve this while staying flexible and independent of keywords is to use annotations that can be accessed by composition operators. Co-op/II does not yet have full support for annotations on method and field declarations, but can only use a few hard-wired ones. To enable the full benefit of annotations, we intend to further extend Co-op/II’s reflective capabilities to access annotations in the definition of bindings.

Currently, the only supported condition that can be used in constraints is whether a named binding is applicable or not.
But sometimes, it is necessary to disable a binding when a certain predicate is satisfied. Consider for example the SelectiveFieldAccess composition operator, which is supposed to disable the binding that shares fields between the super and the sub class. In our implementation we had to use a workaround, which is to define a binding in SelectiveFieldAccess that is applicable when the sharedFieldBinding should be ignored and then define a constraint between these two. We plan to improve the expressiveness of conditional constraints in Co-op/II.

9. REFERENCES

[1] L. Bergmans, W. Havinga, and M. Akşit. First-class compositions – defining and composing object and aspect compositions with first-class operators. Transactions on Aspect-Oriented Software Development, Special issue on Modularity Constructs in Programming Languages, TBD.
[2] G. Bracha and G. Lindstrom. Modularity meets inheritance. In Proc. International Conference on Computer Languages, pages 282–290. IEEE Computer Society, 1992.
[3] C. Clifton, G. T. Leavens, C. Chambers, and T. Millstein. MultiJava: modular open classes and symmetric multiple dispatch for Java. In Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, OOPSLA ’00, pages 130–145, New York, NY, USA, 2000. ACM.
[4] A. J. de Roo, M. F. H. Hendriks, W. K. Havinga, P. E. A. Durr, and L. M. J. Bergmans. Compose*: a language- and platform-independent aspect compiler for composition filters. In K. Mens, M. van den Brand, A. Kuhn, H. Kienle, and R. Wuyts, editors, First International Workshop on Academic Software Development Tools and Techniques, Cyprus, July 2008. No publisher.
[5] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley, Reading, Massachusetts, 1994.
[6] W. K. Havinga, L. M. J. Bergmans, and M. Akşit. A model for composable composition operators: Expressing object and aspect compositions with first-class operators. In Proceedings of AOSD, pages 145–156. ACM, March 2010.
[7] W. K. Havinga, C. M. Bockisch, and L. M. J. Bergmans. A case for custom, composable composition operators. In Proceedings of the 1st International Workshop on Composition: Objects, Aspects, Components, Services and Product Lines, Rennes, France, volume 564 of Workshop Proceedings, pages 45–50. CEUR-WS, March 2010.
[8] K. Ostermann and M. Mezini. Object-oriented composition untangled. In Proc. OOPSLA ’01 Conf. Object Oriented Programming Systems Languages and Applications, pages 283–299. ACM Press, 2001.
[9] R. Ramachandran, D. J. Pearce, and I. Welch. AspectJ for multilevel security. In Y. Coady, D. H. Lorenz, O. Spinczyk, and E. Wohlstadter, editors, Proceedings of the Fifth AOSD Workshop on Aspects, Components, and Patterns for Infrastructure Software, pages 13–17, Bonn, Germany, Mar. 20 2006. Published as University of Virginia Computer Science Technical Report CS–2006–01.
[10] B. G. Ryder, M. L. Soffa, and M. Burnett. The impact of software engineering research on modern programming languages. ACM Transactions on Software Engineering and Methodology, 14:431–477, October 2005.
[11] A. Taivalsaari. On the notion of inheritance. ACM Comput. Surv., 28(3):438–479, 1996.
[12] É. Tanter. Aspects of composition in the Reflex AOP kernel. In Proceedings of the 5th International Symposium on Software Composition (SC 2006), volume 4089 of LNCS, pages 98–113, Vienna, Austria, Mar. 2006. Springer-Verlag.
[13] É. Tanter and J. Fabry. Supporting composition of structural aspects in an AOP kernel. Journal of Universal Computer Science, 15(3):620–647, 2009.
[14] D. Wagelaar and L. Bergmans. Using a concept-based approach to aspect-oriented software design. In M. Glandrup, P. Tarr, S. Clarke, and F. Akkawi, editors, Workshop on Aspect Oriented Design - Identifying, Separating and Verifying Concerns in the Design (AOSD-2002), Mar. 2002.
[15] Xerox Corporation. The AspectJ programming guide. http://www.eclipse.org/aspectj/doc/released/progguide, 2003.


Composing heterogeneous software with style

Stephen Kell ∗
Computer Laboratory, University of Cambridge
Stephen.Kell@cl.cam.ac.uk

ABSTRACT

Tools for composing software impose homogeneity requirements on what is composed: modules must share a language, target the same libraries, or share other conventions. This inhibits cross-language and cross-infrastructure composition. We observe that a unifying representation of software turns heterogeneity of components into a matter of styles: recurring interface patterns that cross-cut large numbers of codebases. We sketch a rule-based language for capturing styles independently of composition context, and describe how it applies in two example scenarios.

1. INTRODUCTION

Our ability to build software compositionally from unmodified components is limited by two problems. Firstly, tools (such as compilers and linkers) require that composed modules be plug-compatible, meaning that their interfaces match “in the small”. Where this does not hold, compositions are achieved only by laborious glue coding or invasive editing. Secondly, they must be homogeneous: functionally compatible modules cannot be composed if they are written in different languages, or use different interface conventions, coding styles, or support libraries. This severely limits the space of possible compositions. Considerable prior work has targeted the first problem [7, 15–17, 21, 23]. However, the second has received only narrow special-case treatments (such as pairwise interoperation between languages [3, 4, 6] or realisations of procedure- or message-based interaction [2, 9]) or clean-slate approaches [10].

This paper outlines ongoing work on an approach to heterogeneous composition, based on interface styles.
Its key insight is that given an appropriate unifying medium (an intermediate representation capturing diverse components), heterogeneity is reduced to differing patterns of usage within that medium, which we call stylistic variation. A style is any recurring convention used to realise some programmatic concern; example concerns include error-handling, function dispatch, representation of common data types (lists, sets, strings, etc.), memory management, and so on. Styles are by definition cross-cutting: they recur across large populations of components (i.e. the components that are homogeneous with respect to the style) which may be dissimilar otherwise. By allowing programmers to describe styles abstractly, independent of composition context, we can re-use this descriptive effort to simplify large numbers of composition tasks.

∗ The author is now primarily affiliated with the Department of Computer Science, University of Oxford. The details shown remain valid.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. FREECO’11, July 26, 2011, Lancaster, UK. Copyright 2011 ACM 978-1-4503-0892-2/11/07 ...$10.00.

In our approach, styles are described by the programmer using high-level rules. From such descriptions, guided by a set of input components annotated with references to their styles, a glue code generator can compose heterogeneous software essentially by inserting code to “undo” one stylistic concretion and “replay” a different one at the boundary between modules. Our approach is black-box, meaning it is sensitive only to an interface abstraction of components.
It is also incrementally adoptable, in that it applies to a large population of existing components.

Specifically, we present the following contributions:
• we characterise the phenomenon of stylistic variation, by identifying a selection of stylistic concerns and some familiar concretions of each;
• we sketch a notation for describing styles, as an extension to the Cake linking language [16], and present two examples of composition tasks handled using styles;
• we discuss some semantic and practical questions arising, and outline possible future directions.

We begin with a simple example of stylistic variation.

2. CHARACTERISING STYLES

Suppose two programmers independently develop a simple component for counting the lines, words and characters in a file. Fig. 1 shows what they might write. The components are abstractly equivalent, but concretely different. Our goal is to capture these concrete differences programmatically, hence allowing a tool to abstract them away, so that code targeting one of them could instead compose with the other.

    struct wc; // implemented in C
    // struct is treated opaquely by client
    struct wc *word_counter_new(const char *filename);
    // returns NULL and sets errno on error
    int word_counter_get_words(struct wc *obj);
    int word_counter_get_characters(struct wc *obj);
    int word_counter_get_lines(struct wc *obj);
    int word_counter_get_all(struct wc *obj,
        int *words_out, int *characters_out, int *lines_out);
    void word_counter_free(struct wc *obj);

    class WordCounter // implemented in Java
    {
        /* fields not shown... */
        public WordCounter(String filename)
            throws IOException { /* ... */ }
        public int getWords() { /* ... */ }
        public int getCharacters() { /* ... */ }
        public int getLines() { /* ... */ }
        public Triple getAll() { /* ... */ }
    };
    // implicitly, deallocation is done by unreferencing + GC

Figure 1: Two stylistic variants of the same interface

    word_counter_new("README") = 0x9cd6180[struct wc]
    word_counter_get_words(0x9cd6180[struct wc]) = 311
    word_counter_get_characters(0x9cd6180[struct wc]) = 2275
    word_counter_get_lines(0x9cd6180[struct wc]) = 59
    word_counter_get_all(0x9cd6180, 0xbffeed00[stack], 0xbffeecfc[stack], 0xbffeecf8[stack]) = 0
    word_counter_free(0x9cd6180[struct wc]) = ()

    _Jv_InitClass(..., 0x6015e0[java::lang::Class], ...) = ...
    _Jv_AllocObjectNoFinalizer(..., 0x6015e0, ...) = 0x9158d20
    WordCounter::WordCounter(java::lang::String*)(0x9158d20[WordCounter], 0x9ae3dc8[java::lang::String]) = ()
    WordCounter::getWords()(0x9158d20[WordCounter]) = 311
    WordCounter::getCharacters()(0x9158d20[WordCounter]) = 2275
    WordCounter::getLines()(0x9158d20[WordCounter]) = 59
    WordCounter::getAll()(0x9158d20[WordCounter]) = 0x9f6093e8[Triple]

Figure 2: Traces generated by a simple client of each interface

We can observe some dimensions of stylistic variation at a glance. Output parameters have been encoded differently, as have character strings. One component provides an explicit resource management API, implicitly also handling initialization and finalization, whereas the other provides only explicit initialization. Naming conventions for multi-word identifiers differ.
Moreover, the components are written in different languages, so compilation will introduce further differences. Calls to the Java component will use virtual dispatch and exception handling, while C code will not.

These conventions are not invented anew by each programmer. Rather, they are imported from a cultural repertoire, defined by a language, a toolchain, or simply a coding style. We want to capture each convention in a one-time effort, so that programmers need consider only an abstracted, style-independent view during composition tasks. We can consider this abstraction as a rewriting exercise on traces of the kind shown in Fig. 2, which are an annotated extension of the traces generated by the well-known ltrace tool (http://www.ltrace.org/). Although stylistic variation is a broad phenomenon, this trace view captures a large subset of it. (The use of pointers in the traces is an abbreviation; the trace properly includes the full exchanged data structures.)

In more realistic examples, there will be not only stylistic differences, but also differences in how each programmer has modelled the domain. These are precisely what is handled by style-unaware adaptation tools [7, 15–17, 21, 23]. Style support complements such tools; Fig. 3 illustrates this. Styles may be captured as “views” or “lenses” which abstract “vertically”, recovering a more abstract interface from a more concrete one. Horizontal adaptation can then be performed as usual, but at the more abstract level.

Figure 3: Styles as abstractions over interfaces

Table 1: Stylistic concerns relevant to Fig. 1

Any interface convention which recurs across a large population of components may be considered a style. What interface conventions recur in this way? This question can only be answered empirically. There are no prior studies on stylistic variation. The Appendix presents a preliminary catalogue of stylistic concerns gathered from simple programming experience.
For each concrete convention we observe, we can identify an abstract concern that it models. Note that our catalogue need not be exhaustive. Our approach captures user-defined styles, using the list as a guide but not limited to it. To give a flavour, Table 1 shows a slice of this catalogue containing the conventions evidenced in Fig. 1.
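As a rough illustration of treating one stylistic concern as a rewriting step, the sketch below normalises the two naming conventions of Fig. 1 to a common tuple of words. It is written in Python rather than Cake, and the function names split_words and abstract_name, as well as the idea of passing the C prefix explicitly, are assumptions made for this example only.

```python
import re


def split_words(identifier):
    """Split a snake_case or camelCase identifier into lower-case words."""
    words = []
    for chunk in identifier.split("_"):
        # Within a chunk, break at capital letters (camelCase boundaries);
        # runs of capitals such as "XML" are kept together.
        words.extend(re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", chunk))
    return [w.lower() for w in words]


def abstract_name(identifier, prefix_words=()):
    """Return the style-independent words of an identifier.

    prefix_words is a hypothetical parameter: the words of a style-specific
    prefix (such as the C module prefix) to strip if present.
    """
    words = split_words(identifier)
    n = len(prefix_words)
    if n and tuple(words[:n]) == tuple(prefix_words):
        words = words[n:]
    return tuple(words)
```

Under these assumptions, abstract_name("word_counter_get_words", ("word", "counter")) and abstract_name("getWords") both yield ("get", "words"), so a client written against one convention could be matched against the other at the abstract level.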


3. APPROACH

Our approach consists of four parts.

• A unifying medium, which could be any intermediate or bytecode-like representation of code. Relocatable object code, augmented with debugging information, is the one we adopt. This is output by many implementations of a wide range of languages. (Note that our black-box approach works purely by link-time insertion of generated code, and is architecture-agnostic.)

• A language for describing styles, which we develop as an extension to the Cake composition language [16]. Cake code consists of rules which relate one component interface to another, by identifying corresponding data types and function calls. Cake rules conceptually specify a transducer which rewrites traces like those in Fig. 2. Adding support for styles means extending Cake to multi-hop relations, formed by multiple transducers. Rather than relating one fixed interface to another, style rules relate elements of a more concrete interface to a more abstract one, and are parameterised so they can apply to any component modeling a style.

• A language for describing compositions in terms of the styles they instantiate: our composition language is again based on Cake. The programmer introduces a component with an exists declaration, as in normal Cake code, but now including an ordered list of named styles which the component models. The order is used to determine coarse-grained precedence. Our semantics handles the fine-grained composition of styles.

• Semantics for the combination of these: given some style definitions and a composition annotated with the styles of each component, the composition formed by our tool is defined by an elaboration process. Informally, this is a backtracking search for the “most abstracting” path by which a given function call or data value could be transmitted between the composed modules, given the styles that the programmer has applied (and any horizontal rules that have been defined).
We will briefly illustrate this process by example in the next section.

4. EXAMPLES

First, we consider a simple data representation concern, and second, more complex styles concerning function calls.

4.1 Booleans

A simple example of styles concerns encoding of booleans. For example, C code often encodes booleans as integers, with zero indicating false and nonzero indicating true. An opposite convention exists in Unix shell programming: zero indicates truth, and nonzero indicates falsehood. Fig. 4 shows two style definitions capturing these two alternative conventions.

    style c89_booleans(integer_typename)
    { table integer_typename ←→ boolean
      { 0 ←→ false;
        _ −→ true;  /* ordered pattern-matching */
        1 ←− true;
      }; };

    style shell_booleans(integer_typename)
    { table integer_typename ←→ boolean
      { 0 ←→ true;
        _ −→ false;
        1 ←− false;
      }; };

    exists                    // ↙ apply c89 style, parameter "BOOL", to...
      c89_booleans(BOOL)(     // ↙ ... the underlying component
        elf_reloc("componentA.o"))
      componentA;             // ←− identifier for the ensemble

    exists                    // ↙ apply shell style, parameter "BOOL", to...
      shell_booleans(BOOL)(   // ↙ the underlying component
        elf_reloc("componentB.o"))
      componentB;             // ←− identifier for the ensemble

    derive my_composition = link[componentA, componentB]
    { /* "horizontal" composition-specific rules go here */ };

Figure 4: Two styles, and their use in a composition

Figure 5: Elaboration of the most abstracting flow

The styles use Cake’s table construct to relate enumerated sets of values. This is the relational analogue of an enumerated type: rather than enumerating a set of possible values, it enumerates correspondences between elements of one data type and those of another. Style rules relate two views of the same component: a more concrete view (always on the left) and a more abstract (on the right). Styles may be parameterised (in a macro-like fashion) to widen their applicability, and these parameters are supplied at exists-time.
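The two tables of Fig. 4 can be read as a pair of conversions per style: concrete to abstract with ordered matching, and abstract to concrete using a canonical value. The following Python sketch models that reading under the assumption that the catch-all row matches every nonzero integer; make_boolean_style, a_to_b and b_to_a are hypothetical names, and the code is not output of the Cake compiler.

```python
def make_boolean_style(zero_means):
    """Build (to_abstract, to_concrete) for an integer encoding of booleans.

    zero_means is the abstract boolean that the integer 0 stands for:
    False for the C convention, True for the shell convention.
    """
    def to_abstract(n):
        # Ordered matching: the row for 0 is tried first, then the
        # catch-all row covering every other integer.
        return zero_means if n == 0 else (not zero_means)

    def to_concrete(b):
        # Right-to-left rows: each abstract value has one canonical integer.
        return 0 if b == zero_means else 1

    return to_abstract, to_concrete


# The two styles of Fig. 4, modelled as function pairs.
c89_up, c89_down = make_boolean_style(zero_means=False)
shell_up, shell_down = make_boolean_style(zero_means=True)


def a_to_b(n):
    """Pass a C-style BOOL from componentA to shell-style componentB."""
    return shell_down(c89_up(n))


def b_to_a(n):
    """Pass a shell-style BOOL from componentB to C-style componentA."""
    return c89_down(shell_up(n))
```

Routing each value through the abstract boolean is the design point: neither style needs to know about the other, and a non-canonical C value such as 5 still arrives as shell truth (0).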
Our styles are parameterised on an identifier (integer_typename) used to identify the data type that is encoding booleans as integers. The exists and derive declarations introduce two components, componentA and componentB, each representing booleans as integers, but where componentA uses the C conventions, whereas componentB uses the shell conventions. Mismatch is avoided by applying the appropriate styles, allowing the Cake compiler to generate conversion logic.

(The type parameter is not simply int, for two reasons. Firstly, languages other than C name integers differently. Secondly, not all integers are really booleans. For now we are assuming that some quasi-annotation has been done for us, e.g. by a C programmer using typedef to create a synonym for integers, namely BOOL, used exactly when they represent booleans. For other cases, the Cake language has features for annotating distinguished use contexts of a given data type, which we do not discuss here.)

How does the compiler work out what rules should apply to these integers? This is determined by the elaboration of styles. In our example we have two possible “flows” for a BOOL: one treating it in the style-specified way, and one treating it as a plain integer. Fig. 5 illustrates. When compiling this, the compiler must select a particular sequence of value conversion rules. For each component, it chooses from the rules defined by each style applied to that component. Loosely, elaboration searches for a successful composition (i.e. each function call yields a correspondent in the opposing component, and similarly for all data types used) while always preferring a more abstract flow. This means preferring a “taller stack” of styles. The order in which the styles were applied is respected. (This logic is near-trivial in our example, since only one style is applied on each side.)

4.2 Java Native Interface style

As a more advanced example of styles, interpreting function calls, consider a caller written in C but consuming a Java library using the Java Native Interface [19]. Fig. 7 shows C code a JNI programmer might write, and Fig. 6 shows a style definition for abstracting the resulting trace into a single call obeying a simple naming convention.

     1 style jni_static_long_call(classname, funname, argsig)
     2 { // ↙ guard predicates    names bound to return values ↘    ↙ patterns on contextual calls
     3   [status != JNI_ERR] (status, jvm, env) ⇐ JNI_CreateJavaVM(_, _, _), ...,
     4   [c != 0, @FindClass == (*env)->FindClass] c ⇐ @FindClass(env, #classname), ...,
     5   [mid != 0, @GSMID == (*env)->GetStaticMethodID] mid ⇐ @GSMID(env, c, #funname, #argsig), ...,
     6   [@CSLM == (*env)->CallStaticLongMethod] @CSLM(env, c, mid, args...) // the "triggering" call
     7     −→ classname ## _ ## funname ## _ ## argsig(args...);
     8   // ↖ the abstracted view: a single call, named by cpp-style metaprog'ing
     9   // extra rule needed to allow reversibility
    10   JNI_CreateJavaVM(out _, out _, my_vmargs) −→ {};
    11 };

Figure 6: Abstracting a sequence of calls

    JavaVM *jvm; JNIEnv *env;
    JavaVMInitArgs vmargs;
    long st = JNI_CreateJavaVM(&jvm, (void **)&env, &vmargs);
    if (st != JNI_ERR)
    { jclass c = (*env)->FindClass(env, "java/lang/System");
      if (c)
      { /* "()J": no arguments, returns long */
        jmethodID mid = (*env)->GetStaticMethodID(env, c, "currentTimeMillis", "()J");
        if (mid)
        { jlong result = (*env)->CallStaticLongMethod(env, c, mid);
        }
      }
    }
    // else handle errors

Figure 7: JNI code for a simple function call

The rule consists of a comma-separated list of patterns, each of which matches a function call and binds names to its elements, including (to the left of the ⇐) its return values. Each pattern is preceded by a square-bracketed guard predicate defining additional matching conditions in terms of the names bound in the pattern. Data-dependent matching, i.e. matching only particular argument values, is threaded through the list of patterns by re-using identifiers bound earlier. The final element of the pattern list defines the call which “triggers” the rule, here @CSLM; the rule “fires” when this call occurs in a context where calls matching the previous patterns have preceded it. (Here identifiers beginning with “@” are treated as metavariables, rather than resolving to component-level names; line 5 binds @GSMID to env’s GetStaticMethodID member.) The pattern list is followed by a right-arrow; on the right of the arrow is the “abstracted” view (line 7) of the left side. Here this is a single call whose name is built from the style’s arguments, using metaprogramming operators like those in the C preprocessor.

By applying these rules, we recover an abstract sequence of calls, discarding JNI details.
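One way to picture the left-to-right direction of such a rule is as a pass over a call trace that remembers context calls and rewrites the triggering call. The Python sketch below assumes a simplified trace of (name, args, result) records, treats handles as opaque values, joins the name parts with "_", and uses the hypothetical function abstract_jni_trace; it illustrates the matching idea only and is not how the Cake compiler is implemented.

```python
JNI_ERR = -1  # value of JNI_ERR in jni.h


def abstract_jni_trace(trace, sep="_"):
    """Collapse matched JNI call sequences in `trace` into abstract calls.

    Each record is (name, args, result). Context calls that satisfy their
    guard are remembered and dropped; the triggering call is rewritten.
    Anything unmatched is passed through unchanged.
    """
    envs = set()    # env handles from successful JNI_CreateJavaVM calls
    classes = {}    # (env, class handle) -> class name
    methods = {}    # (env, class handle, method id) -> (name, signature)
    out = []
    for record in trace:
        name, args, result = record
        matched = False
        if name == "JNI_CreateJavaVM":
            status, _jvm, env = result
            if status != JNI_ERR:                      # guard: status != JNI_ERR
                envs.add(env)
                matched = True
        elif name == "FindClass":
            env, classname = args
            if env in envs and result != 0:            # guard: c != 0
                classes[(env, result)] = classname
                matched = True
        elif name == "GetStaticMethodID":
            env, c, funname, argsig = args
            if (env, c) in classes and result != 0:    # guard: mid != 0
                methods[(env, c, result)] = (funname, argsig)
                matched = True
        elif name == "CallStaticLongMethod":
            env, c, mid, *call_args = args
            if (env, c, mid) in methods:               # required context was seen
                funname, argsig = methods[(env, c, mid)]
                abstract = sep.join([classes[(env, c)], funname, argsig])
                out.append((abstract, tuple(call_args), result))
                matched = True
        if not matched:
            out.append(record)
    return out
```

The dictionaries play the role of the identifiers re-used across patterns: a triggering call is only rewritten when its env, class handle and method id were all bound by earlier calls that passed their guards.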
Now we consider the reverse direction: given some abstract sequence of calls, generated by some heterogeneous client in another style (such as a different foreign function interface than JNI), how can we dispatch this against the JNI interface? To avoid introducing another example style, let us simply turn the tables: how do we dispatch abstract calls to JNI? This means running our JNI style rules “in reverse”.

In short, we direct the abstract calls into generated stub code whose role is to reproduce a context satisfying the predicates on the left of the JNI rule (lines 3–6). To do so, it keeps a “sliding window”-style log of call history across the interface. For example, on receiving the first abstract call, JNI_CreateJavaVM has yet to be called, so our stub does this and checks the return value against JNI_ERR. Continuing, we can use the data dependencies between patterns to synthesise the arguments to subsequent calls, using the contents of the call history (which includes earlier argument values). In a few cases, the relevant arguments cannot be recovered without extra programmer guidance; for example, we cannot recover the vmargs argument to JNI_CreateJavaVM. An extra rule (line 10) handles this: the empty right-hand side signifies it may be inserted whenever necessary, and crucially, it provides the required argument value for vmargs, namely my_vmargs. (Here this is assumed to name a statically defined structure in the instantiating component; more realistically, this identifier would be a parameter of the style.)

5. DISCUSSION AND FUTURE WORK

Currently we have only syntax and some paper semantics for styles. However, work on implementing these within the Cake compiler is ongoing.
(In fact, styles were an envisaged feature from the initial design of Cake.)

Deeper experience with styles, by further case study, is required in order to discover how our preliminary results generalise. An empirical study of styles found in a large set of codebases (e.g. open-source code in a variety of languages) would be both valuable and feasible.

Performance of generated code is limited by how well the multi-layered glue code generated by our design can be collapsed to a small and efficient adapter, using whole-program optimisation techniques; this requires further research.

What we have loosely claimed to be a “style” is really describing a “style transformer”: a mapping from one style to another, where the latter is hopefully more abstract. For example, the naming convention we selected in Fig. 6 is itself another style, even though it has discarded JNI details. It is therefore essential that styles compose with each other, that mismatch between styles does not become a problem, and that a quadratic explosion of styles can be avoided. The emergence of “well-known” named styles, into which a wide stylistic variety of input components can be transformed, might solve this analogously to how popular intermediate file formats can avoid quadratic explosion in Make [12].

It would be useful to automatically infer what known styles apply to a component, by searching for the relevant patterns in interfaces. This search becomes more complex when considering compositions of styles and parameterisation. A likely solution might combine backtracking search (much like Make finds compositions of rules satisfying prerequisites) with constraint solving (to find satisfying instantiations of styles’ parameters). Similarly, it would be useful to automatically infer likely styles, given a corpus of interfaces, perhaps using existing learning approaches [11].

Assurances about style-based compositions could be gained by considering their round-tripping properties, as with lenses for tree-structured data [13]. One idea is to cross-check round-trips using symbolic execution techniques [8].

6. RELATED WORK

Component systems such as CORBA [22] use stub compilers to abstract interfaces, but fundamentally do not address heterogeneity, since they assume all components are programmed against interfaces generated by such a compiler. By contrast, styles both generate abstractions and recognise concretions, enabling heterogeneous composition.

Kent identified a similar phenomenon to stylistic variation in database schemas [18]; we have effectively extended consideration of this phenomenon to component interfaces.

Flexible Packaging [10] has similar goals to ours, but relies on a clean-slate approach to development, whereas our approach is designed to apply to existing components.

The abstracting, normalising nature of styles is similar to the “objectification” transformation of COMPOST [2], but with considerably greater flexibility, notably a language-agnostic, black-box approach.

Interface styles lie on the same spectrum as design patterns [14] and architectural styles [20], but are generally smaller-scale than both. Their small size makes it tractable to describe them in a one-time fashion, but also means that any real interface will feature a complex composition of styles, making style composition a more significant problem.

Composition languages such as Piccola [1] consider how to capture different styles of composition, hence overlapping with interface styles. However, Piccola does not facilitate heterogeneous composition; rather, it formalises compositions within a single “compositional style” at a time.

LayOM [5] shares some conceptual similarities, but has different objectives: since it does not address heterogeneity, it does not adopt a unifying medium, does not prioritise the definition of new layers (doing which entails C++ source code transformation), and has no analogue of elaboration for automatic composition across layers.

7. CONCLUSIONS

Styles are a novel way to abstract away recurring differences in diverse component interfaces. Our next step is to implement and practically evaluate styles. A survey of observed styles in existing code will add focus to this work. We believe styles can open up a much larger space of feasible compositions than allowed by current tools.

Acknowledgments

I thank David Greaves and Derek Murray for helpful discussion, Aistis Simaitis for feedback on presentation, and the Oxford Martin School Institute for the Future of Computing for support in preparing this manuscript.

References

[1] F. Achermann and O. Nierstrasz. Applications = components + scripts. In Software Architectures and Component Technology, pages 261–292. Kluwer, 2001.
[2] U. Assmann, T. Genssler, and H. Bar. Meta-programming grey-box connectors. In Proc. 33rd Int. Conf. on Technology of Object-Oriented Languages (TOOLS 33), pages 300–311, 2000.
[3] D. Beazley. Swig: An easy to use tool for integrating scripting languages with C and C++. In Proc. 4th USENIX Tcl/Tk Workshop, pages 129–139, 1996.
[4] M. Blume. No-Longer-Foreign: Teaching an ML compiler to speak C. ENTCS, 59(1):36–52, 2001.
[5] J. Bosch. Superimposition: a component adaptation technique. Inf. and Softw. Tech., 41:257–273, 1999.
[6] P. Bothner. Compiling Java with GCJ. Linux Journal, 2003.
[7] A. Bracciali, A. Brogi, and C. Canal. A formal approach to component adaptation. J. Syst. Softw., 74:45–54, 2005.
[8] C. Cadar, D. Dunbar, and D. Engler. Klee: unassisted and automatic generation of high-coverage tests for complex systems programs. In Proc. 8th OSDI, pages 209–224. USENIX Association, 2008.
[9] J. Callahan and J. Purtilo. A packaging system for heterogeneous execution environments. IEEE Trans. Softw. Eng., 17:626–635, 1991.
[10] R. DeLine. Avoiding packaging mismatch with flexible packaging. IEEE Trans. Softw. Eng., 27:124–143, 2001.
[11] M. D. Ernst, J. Cockrell, W. G. Griswold, and D. Notkin.
Dynamic<strong>all</strong>ydiscovering likely program invariants to support programevolution. IEEE Trans. Softw. Eng., 27,2001.[12] S. I. Feldman. Make: a program for maintaining computer programs.Softw: Pract. Exper., 9,1979.[13] J. N. Foster, M. B. Greenwald, J. T. Moore, B. C. Pierce, andA. Schmitt. Combinators for bi-directional tree transformations:alinguisticapproachtotheviewupdateproblem.InProc. 32ndACM SIGPLAN-SIGACT Symposium on Principles of ProgrammingLanguages, pages233–246.ACM,2005.[14] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design patterns:elements of reusable object-oriented software. Addison-Wesley, 1995.[15] J. Järvi, M. Marcus, and J. Smith. Library composition andadaptation using C++ concepts. In Proc. 6th Int. Conf. onGenerative Programming and Component Engineering, 2007.[16] S. Kell. Component adaptation and assembly using interfacerelations. In Proc. 25th ACM Conf. on Systems, ProgrammingLanguages, Applications: Software for Humanity, 2010.[17] R. Keller and U. Holzle. Binary component adaptation. In Proc.ECOOP ’98, pages307–329.Springer,1998.[18] W. Kent. The many forms of a single fact. In 34th IEEE ComputerSociety Intl. Conf. Digest of Papers., February1989.[19] S. Liang. The Java Native Interface: Programmer’s Guide andSpecification. Addison-Wesley Professional, 1999.[20] D. Perry and A. Wolf. Foundations for the study of softwarearchitecture. ACM SIGSOFT Softw. Eng. Not., 17,1992.[21] J. Purtilo and J. Atlee. Module reuse by interface adaptation.Softw. Pract. Exper., 21:539–556,1991.[22] N. Wang, D. C. Schmidt, and C. O’Ryan. Overview of theCORBA Component Model. In G. T. Heineman and W. T. Councill,editors, Component-based software engineering: puttingthe pieces together, pages557–571.AddisonWesley,2001.[23] D. Yellin and R. Strom. Protocol specifications and componentadaptors. ACM Trans. Prog. Lang. and Syst., 19:292–333,1997.APPENDIXSee the author’s web page: http://www.cl.cam.ac.uk/˜srk31/.


Towards Modular Code Generators Using Symmetric Language-Aware Aspects

Steffen Zschaler
King’s College London, Department of Informatics, London, UK
szschaler@acm.org

Awais Rashid
School of Computing and Communications, Lancaster University, Lancaster, UK
awais@comp.lancs.ac.uk

ABSTRACT

Model-driven engineering, especially using domain-specific languages, allows constructing software from abstractions that are more closely fitted to the problem domain and that better hide technical details of the solution space. Code generation is used to produce executable code from these abstractions, which may result in individual concerns being scattered and tangled throughout the generated code. The challenge, then, becomes how to modularise the code-generator templates to avoid scattering and tangling of concerns within the templates themselves. This paper shows how symmetric, language-aware approaches to aspect orientation can be applied to code generation to improve modularisation support.

Categories and Subject Descriptors
D2.3 [Software Engineering]: Coding Tools and Techniques

Keywords
model-driven engineering, code generation, symmetric aspects

1. INTRODUCTION

In model-driven engineering (MDE), domain-specific languages (DSLs) are typically accompanied by code generators, which expand the abstract DSL concepts into source code in a target programming language, typically taking into consideration a particular target platform (e.g., middleware, hardware, distribution and concurrency model, etc.). Such expansions can be limited to the generation of standard, ‘boiler-plate’ code, but, more interestingly, may include so-called local-to-global or global-to-local transformations [14]. A local-to-global transformation means that information from one element in a DSL program may be scattered across multiple elements in the generated source code; for example, a data description may affect user-interface code as well as data-definition scripts for setting up a database back end. A global-to-local transformation, on the other hand, means that information from a number of DSL elements may need to be tangled within one element in the generated source code; for example, both information from a layout policy and the data description must be tangled in the user-interface code generated.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
FREECO’11, July 26, 2011, Lancaster, UK.
Copyright 2011 ACM 978-1-4503-0892-2/11/07 ...$10.00.

The challenge in the presence of local-to-global and global-to-local transformations is how to cleanly modularise the code-generation templates so that, while tangling and scattering occur in the code generated, the templates themselves do not suffer from it. Moreover, the code generators themselves may need to address a range of technical concerns, such as consistency management [6], performance, security, different target platforms, etc., which are not captured in the source models. Of course, we would also like to avoid scattering and tangling of technical concerns in generation templates. This becomes particularly important where DSLs should be reused in different projects, which may require addressing some technical concerns in a different manner or may require adaptations to the DSL concepts provided. In this case, generator modularisation becomes essential, as it enables a ‘mix and match’ approach to DSL adaptation and reuse [16]. Ideally, we would like to modularise code generators such that each technical concern as well as each concern modelled in a DSL could be realised in a separate code-generation module as independently as possible of the other concerns.

Existing code-generation frameworks (e.g., [4, 7, 8, 11]) assume that one target file will be produced by one code-generation template (even though this may import and invoke additional generation rules). Thus, any scattering and tangling that occurs in the generated code automatically also occurs in the generation templates. To address this issue, aspect-oriented approaches to code generation have been proposed [10, 15]. However, they are not fully adequate for all needs of code-generator modularisation: Because they are asymmetric approaches [5], they require an explicit base template, often imply the use of scaffolding (e.g., empty rules in the base template whose only purpose is to serve as hooks for the aspect templates), and lack sufficient support for weaving contexts. Furthermore, because they are language-agnostic, any language-specific weaving semantics must be implemented in the generation templates, which is not always possible.

This paper proposes a novel approach for modularising code generators. This approach maintains the benefits of reduced tangling introduced by [10, 15], but also addresses the shortcomings of these approaches:

• It is a symmetric aspect-oriented approach, addressing the issues connected to asymmetry in current approaches.
• It is also a language-aware approach to address the issue of language agnosticism.

Thus, the contributions of this paper are the following:

1. We identify four shortcomings of existing asymmetric AO approaches to code generation that stand in the way of effective separation of concerns for code generators.
2. We present a novel approach and architecture for symmetric language-aware AO for code generation, which addresses the aforementioned shortcomings.
3. We present a prototype implementing this architecture.

The remainder of this paper is structured as follows: In the next section, we give a more detailed motivation for the need for symmetric aspect-oriented code generation based on the identification of four issues with current asymmetric approaches. In Sect. 3, we present a generic architecture for symmetric aspect-oriented code generation and present a prototype implementing these concepts based on the Epsilon Generation Language (EGL) [12]. Section 4 surveys some related work and Sect. 5 concludes the paper. Due to space limitations, we have omitted an example as well as a more detailed discussion of the benefits and drawbacks of our approach. These can be found in an accompanying technical report [17].

2. MOTIVATION: MODULARISING CODE GENERATION TEMPLATES

Many code-generation frameworks, such as [4, 7, 8, 11], already provide notions such as generator rules and operations. These are very useful for modularising a generator based on the structure of the files to be generated. However, information that is localised in one place in a model often needs to affect multiple places in the generated code. Thus, the knowledge of how to implement a model element is scattered throughout the generated code. As a consequence, this knowledge will also be scattered throughout the code generators. Furthermore, information from multiple model elements may need to be combined to generate a particular file. Thus, the generated code tangles a number of concerns from the DSL models. Unless specific measures are taken in modularising the code generators, these concerns will also be tangled within the code generators.

Scattering and tangling in code-generator templates is particularly bad where we want to flexibly configure a generation workflow so that it can be adapted to different circumstances.
This can happen, for example, when a DSL is to be reused in a different context [16], or when the system context changes (e.g., a new version of an underlying platform is released and must be integrated with the system under development).

For example, when generating Eclipse [3] plugins from an application model, a number of different concerns must be addressed by these generators: i) generating the Java implementation of the application’s business logic, ii) generating code implementing the data model, and iii) generating user-interface code, among others. It is immediately clear that there are a number of local-to-global transformations in this scenario; for example, information about a data structure affects code in the user interface, the data model, as well as, possibly, the business-logic code. Furthermore, there are a number of tangled concerns; for example, both user-interface code and data-model code may have to provide configuration information, which means these concerns will be tangled in Eclipse’s plugin.xml configuration file. Separating these concerns into distinct generation modules would enable us to choose from different variations for implementing each concern. Figure 1 shows an example of how we may want to decompose the code generators. In particular, we have defined one generator for the user-interface concern, one for the concern of business logic, and two different variations for the concern of data-model implementation: one using EMF [13] and one using plain Java objects. The figure shows two possible configurations making specific choices about these code generators. It can be seen how each of the generators uses information from the application model (which could be provided using a number of DSLs) and generates code in one or more files; that is, concerns are scattered throughout the generated code. Some code generators also need to provide code to the same file; that is, different concerns are tangled within these files.

In summary, we require a modularisation of our code generators, such that each concern can be realised in a separate code-generation module as independently as possible of the other concerns, even if these concerns need to be scattered and tangled in the generated code. Ideally, this should enable us to modify, remove, or add a code-generation module without impact on any of the other modules in our code generator.

To support the modularisation of tangling in code generators, some authors have proposed using aspect-oriented techniques in the definition of code-generator templates [10, 15]. These are asymmetric aspect-oriented approaches [5] allowing the definition of around, before, and after advice for individual code-generation rules or operations in a base template. While this can successfully solve some of the modularisation issues, it also has important drawbacks:

1. Base template needed. Every aspect template requires a base template against which it is defined and which it modifies. This makes it difficult to define code generators for files that do not need to exist for all variants of a system. In the example of Fig. 1, depending on which concerns we include in the generation, not all of the plugins generated will actually require a plugin.xml file. At the same time, a number of concerns will have to make contributions to plugin.xml if they are selected. Using asymmetric aspect-oriented code-generation techniques, we need an empty code-generation base template for plugin.xml, even though this file would often not be needed. Note that, because we want to allow free choice of code generators and because this selection may be made based on information in the application model rather than based on a static configuration choice of the generation workflow, we cannot guarantee a priori that any specific code generators will be used in a concrete workflow, and, thus, cannot make any generator the base generator.

2. Scaffolding needed in base template. Because every advice needs to be attached to some rule or operation in the base template, we can only influence those parts of a file for which explicit code-generation rules have been defined. This typically leads to the definition of a number of empty generation rules, which only serve as hooks to be referenced in aspect templates adding additional code to the file generated.¹ For example, in Fig. 1, the empty base template for plugin.xml needs to provide empty rules for generating extension and extension-point descriptions, even though we do not declare any of them in the base template. The only purpose of these rules would be to enable aspect templates to add extensions and extension points of their own. The need for such additional constructs in base code has been previously identified; for example, [15, p. 9] shows an example of such an empty generator rule (the fact that [15] in principle allows aspects on aspects does not strongly affect the need for scaffolding). In [2] such rules have been referred to as scaffolding. They have been shown to contribute strongly to the instability of pointcuts and to break encapsulation of base and aspects.

¹ Oldevik and Haugen [10] describe this: “[. . . ] there must be constraints on the base transformation [. . . ], e.g. that domain-specific rules should be clearly separated from general rules”.


Figure 1: An example decomposition of code generators for generating Eclipse plugin code. The figure shows two possible configurations: a) using EMF for data management as well as including a graphical user interface, and b) using plain Java objects for data management and providing a programming interface only.
[Figure 1 elements: Application Model; GUI Generator, EMF Generator, and Business Logic Generator in configuration a; Java Data Model Generator and Business Logic Generator in configuration b; generated outputs plugin.xml and Java Code.]

3. Insufficient support for weaving contexts. In the example, different generators affect the final Java code. Some of these generators may need to add implemented interfaces to certain classes (e.g., to be consistent with requirements of frameworks used). This demonstrates another problem of asymmetric aspect orientation for code generation. Only before, after, and around are supported, making correct generation of implemented interfaces impossible as soon as more than one aspect needs to add to this list, unless the base template already provides a non-empty list. The first aspect to be applied would need to introduce the implements keyword (we will call this the weaving context). All following aspects must not produce this weaving context. Of course, one could extend the aspect languages to support advice ordering and use ordered post-advice only. Still, if one decided not to generate code for the first feature (where the aspect introduces the implements keyword), the setup would break and one would have to change the code of another aspect to make it work again.

In addition to these drawbacks mainly caused by the asymmetric nature of current approaches, there is also an issue because these approaches are language agnostic. Because the code generators and aspect weavers have no knowledge of the language to be generated, the weaving result can easily be wrong. In the interface-implementation example from above, it could, for example, easily happen that the same interface is named more than once in the list of implemented interfaces that is generated.

To address these issues, we propose a novel aspect-oriented approach to code generation. This approach is symmetric [5], addressing the shortcomings related to the asymmetric nature of current approaches. Our approach, further, is language aware, so that it can provide better weaving based on language syntax and semantics. The following section discusses our approach from an architectural perspective and presents a working prototype.

3. AN ARCHITECTURE FOR SYMMETRIC LANGUAGE-AWARE ASPECTS FOR CODE GENERATION

We begin this section by proposing a general architecture for symmetric aspect-oriented code generation, addressing the issues identified above. Then, we present a prototype we have implemented based on the Epsilon Generation Language [12].

3.1 Registration + Weaving = Symmetric Aspects for Code Generation

We propose a code-generation infrastructure that separates the step of generating code from the step of writing this generated code to a target file. This separation enables us to inject additional behaviour. In particular, we can decide to have a number of code generators generate code for the same target file and inject a code-weaving step that merges the generated code before it is written to the target file.

Figure 2 gives an overview of such an architecture. For code generators to become as independent as possible, each one needs to generate an operationally complete slice. These slices are then registered against their intended target file name in a code-slice registry from where they can be extracted for weaving by the code-slice weaver (removing the slices from the registry in the process). The result of the weaving can then be written to the target file.
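As a rough illustration of the registration and weaving steps just described, the following Python sketch models a code-slice registry and a toy weaver. It is not the EGL-based prototype: the names CodeSliceRegistry, register, extract, and weave are hypothetical, and the weaver's language awareness is reduced to recognising one Java class header per slice, so that implements lists from several slices can be merged without duplicates and without any slice having to own the implements keyword.

```python
import re
from collections import defaultdict


class CodeSliceRegistry:
    """Hypothetical registry: maps a target file name to its pending code slices."""

    def __init__(self):
        self._slices = defaultdict(list)

    def register(self, target, text):
        """Register one operationally complete slice against a target file name."""
        self._slices[target].append(text)

    def extract(self, target):
        """Return all slices for `target`, removing them from the registry."""
        return self._slices.pop(target, [])


# Assumption: every slice is a single Java class with an optional implements list.
HEADER = re.compile(r"class\s+(\w+)(?:\s+implements\s+([\w\s,.]+?))?\s*\{")


def weave(slices):
    """Merge slices of the same class into one text (a toy syntactic merge)."""
    if not slices:
        raise ValueError("nothing to weave")
    name, interfaces, bodies = None, [], []
    for text in slices:
        match = HEADER.search(text)
        if match is None:
            raise ValueError("slice does not contain a class header")
        if name not in (None, match.group(1)):
            raise ValueError("slices describe different classes")
        name = match.group(1)
        # Every slice may carry the weaving context; duplicates are dropped here.
        for iface in (match.group(2) or "").split(","):
            iface = iface.strip()
            if iface and iface not in interfaces:
                interfaces.append(iface)
        body = text[match.end():text.rindex("}")].strip()
        if body:
            bodies.append(body)
    clause = f" implements {', '.join(interfaces)}" if interfaces else ""
    members = "".join(f"    {body}\n" for body in bodies)
    return f"class {name}{clause} {{\n{members}}}\n"


registry = CodeSliceRegistry()
# Two independent generators contribute to the same (hypothetical) target file.
registry.register("Customer.java",
                  "class Customer implements Serializable { long id; }")
registry.register("Customer.java",
                  "class Customer implements Serializable, Comparable { String name; }")

woven = weave(registry.extract("Customer.java"))
print(woven)
```

In this sketch neither slice plays the role of a base template, and removing either register call still yields a well-formed class. A realistic weaver would of course need a proper parser for the target language rather than a regular expression.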
As the code-slice weaver is a code generator itself, we may also decide not to write the weaving result to a target file, but rather to register it again in the registry. This allows us to build hierarchical compositions of code generators, encapsulating the number and responsibility of the individual code generators.

Figure 2: Abstract architecture for a symmetric aspect-oriented code-generation infrastructure. The upper part shows the components involved. The lower part shows the possible states of the code-slice registry for a particular target file.
[Figure 2 elements: CG Templates 1 to n, Code Slice Registry, Code Slice Weaver, Output Files. Registry states: Registry empty, Text registered for file, Text woven. Transitions: Generate, Register, Weave, Save to file.]

This architecture provides symmetric aspect orientation for code generation. In particular, because all generation results are registered against their target file names in the same way, there is no distinction between base templates and aspect templates. Also, because the weaving happens based on the generation result, no details of the template structure need to be exposed. This means there is no need for scaffolding through empty generator rules. Weaving context is also not a problem, because it can be generated by every template and its resolution can be left to the weaving algorithm. Because generators are loosely coupled through the common registry only, they can be developed independently and it is easy to add or remove a generator for the same target file. As already discussed above, this requires, however, that each generator creates operationally complete slices of code, which may lead to some redundancy between templates for the same target file.

The code-slice weaver is given a number of (code) text streams and faces the task of merging them into one stream that can then be either written to a target file or re-registered against the original target file. Combining two or more text streams into one has been discussed in the literature quite extensively, in particular in connection with software configuration management. Tom Mens [9] provides a good overview of the state of the art in software merging. He also proposes a number of dimensions providing a framework for classifying and discussing merging algorithms:

1. Two-Way vs Three-Way Merging. “Two-way merging attempts to merge two versions of a software artifact without relying on the common ancestor from which both versions originated. With three-way merging, the information in the common ancestor is also used during the merge process.” [9]

2. State-Based vs Change-Based Merging. “With state-based merging, only the information in the original version and/or its revisions is considered during the merge. In contrast, change-based merging additionally uses information about the previous changes that were performed during evolution of the software.” [9]

3. Textual vs Syntactic vs Semantic Merging. Different merge algorithms use different amounts of knowledge about the language in which the text streams to be merged are expressed. Textual merge merges two texts independently of their language. Syntactic merge takes into account the syntax of the language of the texts to be merged, while semantic merge even considers the language’s semantics.

In the context of our architecture, only two-way merging can be used, as there is no common ancestor available. This also implies using a state-based merge algorithm. In order to make composition language aware (which also helps address issues of weaving context as discussed above), we need to use syntactic or even semantic merging. Superimposition, for example as implemented in FEATUREHOUSE [1], is an interesting candidate. It supports syntactic merging based on feature-structure trees (syntax trees whose terminal nodes correspond to blocks of code, for example, entire methods) and semantic merging for the terminal nodes of these trees.

3.2 Prototype

We have implemented the architecture from Fig. 2 as an extension of the Epsilon Generation Language (EGL) [12].
(The full prototype as well as the example is available from http://www.steffen-zschaler.de/publications/symmetric_ao_cg.) Note that we have chosen EGL because of our previous expertise, but in principle could have used any other code-generation language.

Our prototype provides additions to the workflow component of EGL. EGL (and Epsilon in general) uses Ant-based workflow descriptions [8] to co-ordinate different tasks required to solve a particular model-management problem. EGL provides a single workflow component epsilon.egl, which takes a model and an EGL template and writes the result of evaluating the template to a specified file. We provide a specialisation of this component (epsilon.eglRegister) which has the same interface and functionality, but instead of storing the generation result to a file, uses a workflow-wide registry to register the generated code against the specified file name. EGL allows templates to instantiate and execute other templates, allowing the result of these executions to be stored in separate files. If an EGL template is evaluated in the context of epsilon.eglRegister, any code generated in this way will also be registered rather than written to the file system.

Registered code slices can then be woven and finally written to a file (or registered again) using the epsilon.eglMerge component. To select the files for which to merge code slices, epsilon.eglMerge uses standard Ant file sets that allow the user to use wildcards to express the set of files for which to weave. This is important, because when templates are invoked from other templates, the names of the target files may depend on the contents of the models from which code is generated. Therefore, in the generation workflow, these file names cannot be statically encoded. Using wildcards in file names addresses this issue.

Finally, epsilon.eglMerge supports the definition of an order in which different code slices registered for the same file should be woven. To this end, code slices can be associated with a feature id in epsilon.eglRegister. epsilon.eglMerge then provides a means to define partial orderings between slices using their associated feature ids.

An example applying our prototype to the generation of EJB code from a class model can also be found in [17].

4. RELATED WORK

As mentioned previously, aspect-oriented approaches to code generation already exist [10, 15]. These approaches are asymmetric and language agnostic, and their issues and how our approach addresses them have been the focus of this paper.

Hemel et al. [6] discuss modularisation of code generation and model transformation in the context of Stratego/XT and their WebDSL case study. One of their modularisation scenarios is closely related to the work presented in this paper: Because the modularisation of code generators is driven by the structure of the target metamodel (a single element or artefact in the target code must be the result of a single code-generation rule), Hemel et al. were forced to use very large ‘God rules’ incorporating generation logic for all features that affect a particular artefact. This is exactly the same problem that motivated our work. They introduce an intermediary language providing more advanced constructs for modularisation (e.g., partial classes and operations). They then use a staged generation strategy: A first code generator produces code in the intermediary language from WebDSL code. A second code generator then merges (superimposes) partial classes and operations and produces the final code in the target language. Again, this is essentially the same as our approach. However, because Stratego/XT does not provide any direct support for aspect-oriented transformations or code generation, Hemel et al. needed to introduce an explicit new intermediary language, containing a large set of constructs only relevant for the subsequent superimposition (e.g., @Class for representing a partial class). Their approach is eased by the use of Stratego/XT, where generated code is always handled as an abstract syntax tree rather than plain text. Instead, in our approach, code is generated as text, which is then merged in a separate step. This requires this text to be parsed again in preparation for merging, making our approach perhaps a little less time efficient. At the same time, however, it also allows the use of simpler textual merges based on the same infrastructure.

5. CONCLUSIONS

In this paper, we have proposed an approach applying notions of symmetric aspects to the domain of code generation. This approach separates the specification of code generators from the specification of their composition, providing more flexibility in choosing code generators that should be used for a particular generation task. Because it is a symmetric approach, it does not require any template to be identified as the base template, enabling us to flexibly leave out or re-arrange code generators as required. Because the weaving happens at the level of generated code, there is no need for scaffolding in the form of empty generator rules. Because we use syntactic merging, the approach can take into account syntactic constraints of the language of the generated code.
We have successfully appliedour approach to the generation of EJB code from a class model [17],and are looking to further evaluate it with more case studies as wellas in a comparative study with other generation approaches.AcknowledgementsThis work has been funded by the European Commission underFP6 STREP AMPLE and FP7 Marie-Curie IEF RIVAR. The authorswish to thank Dimitrios Kolovos, Louis Rose, and RichardPaige for help with and discussion about the Epsilon framework,and Jon Whittle for useful feedback on an earlier draft.6. REFERENCES[1] Sven Apel, Christian Kästner, and Christian Lengauer.FEATUREHOUSE: Language-independent, automatedsoftware composition. In Stephen Fickas, Joanne Atlee, andPaola Inverardi, editors, Proc. 31st Int’l Conf. on SoftwareEngineering (ICSE’09), pages 221–231. IEEE ComputerSociety, 2009.[2] Ruzanna Chitchyan, Phil Greenwood, Americo Sampaio,Awais Rashid, Alessandro Garcia, and Lyrene Fernandesda Silva. Semantic vs. syntactic compositions inaspect-oriented requirements engineering: An empiricalstudy. In Proc. 8th ACM Int’l Conf. on Aspect-OrientedSoftware Development (AOSD’09), pages 149–160. ACM,2009.[3] Eclipse Foundation. Eclipse web site. Published on-line:http://www.eclipse.org/.[4] Sven Efftinge, Peter Friese, Arno Haase, Dennis Hübner,Clemens Kadura, Bernd Kolb, Jan Köhnlein, Dieter Moroff,Karsten Thoms, Markus Völter, Patrick Schönbach, MoritzEysholdt, and Steven Reinisch. openarchitectureware xpanddocumentation. Published on-line:http://www.openarchitectureware.org/pub/documentation/4.3.1/html/contents/core_reference.html#xpand_reference_introduction, 2009.[5] William H. Harrison, Harold L. Ossher, and Peri L. Tarr.Asymmetric<strong>all</strong>y vs. symmetric<strong>all</strong>y organized paradigms forsoftware composition. Technical Report RC22685, IBMResearch, 2002.[6] Zef Hemel, Lennart C. L. Kats, Danny M. Groenewegen, andEelco Visser. Code generation by model transformation: Acase study in transformation modularity. 
Software and Systems Modelling, 9(3):375–402, June 2010. Published on-line first at www.springerlink.com.
[7] Frédéric Jouault, Jean Bézivin, and Ivan Kurtev. TCS: A DSL for the specification of textual concrete syntaxes in model engineering. In Stan Jarzabek, Douglas C. Schmidt, and Todd L. Veldhuizen, editors, Proc. 5th Int’l Conf. on Generative Programming and Component Engineering (GPCE’06), pages 249–254. ACM, 2006.
[8] Dimitrios Kolovos, Richard Paige, Louis Rose, and Fiona Polack. The Epsilon Book. Published on-line: http://www.eclipse.org/gmt/epsilon/doc/book/, 2009.
[9] Tom Mens. A state-of-the-art survey on software merging. IEEE Transactions on Software Engineering, 28(5):449–462, 2002.
[10] Jon Oldevik and Øystein Haugen. Higher-order transformations for product lines. In Tomoji Kishi and Dirk Muthig, editors, Proc. 11th Int’l Software Product Line Conf. (SPLC’07), pages 243–254. IEEE Computer Society, 2007.
[11] Jon Oldevik, Tor Neple, Roy Grønmo, Jan Aagedal, and Arne-J. Berre. Toward standardised model to text transformations. In A. Hartman and D. Kreische, editors, European Conf. on Model Driven Architecture – Foundations and Applications (ECMDA-FA’05), pages 239–253, 2005.
[12] Louis M. Rose, Richard F. Paige, Dimitrios S. Kolovos, and Fiona A. Polack. The Epsilon generation language. In Ina Schieferdecker and Alan Hartman, editors, Proc. 4th European Conf. on Model Driven Architecture (ECMDA-FA’08), pages 1–16. Springer, 2008.
[13] Dave Steinberg, Frank Budinsky, Marcelo Paternostro, and Ed Merks. EMF: Eclipse Modeling Framework. Addison-Wesley Professional, 2nd edition, 2009.
[14] Jonne van Wijngaarden and Eelco Visser. Program transformation mechanics: A classification of mechanisms for program transformation with a survey of existing transformation systems. Technical Report UU-CS-2003-048, Institute of Information and Computing Sciences, Utrecht University, May 2003.
[15] Markus Völter and Iris Groher. Handling variability in model transformations and generators. In Proc.
of the 7th OOPSLA Workshop on Domain-Specific Modeling, 2007.
[16] Jules White, Jeff Gray, and Douglas C. Schmidt. Constraint-based model weaving. Transactions on Aspect-Oriented Software Development, Special Issue on Aspects and Model-Driven Engineering, 5560(6):153–190, 2009.
[17] Steffen Zschaler and Awais Rashid. Symmetric language-aware aspects for modular code generators. Technical Report TR-11-01, King’s College London, Department of Informatics, 2011.


Open, extensible composition models
(extended abstract) ∗

Ian Piumarta
Academic Center for Computing and Media Studies, Kyoto University, Japan
Viewpoints Research Institute, Glendale, CA, USA
ian@vpri.org

ABSTRACT
Simple functional languages like LISP are useful for exploring novel semantics and composition mechanisms. That usefulness can be limited by the assumptions built into the evaluator about the structure of data and the meaning of expressions. These assumptions create difficulties when a program introduces a composition mechanism that differs substantially from the built-in mechanism of function application. We explore how an evaluator can be constructed to eliminate most built-in assumptions about meaning, and show how new composition mechanisms can be introduced easily and seamlessly into the language it evaluates.

1. INTRODUCTION
Adding a new composition mechanism to a programming language can entail defining a corresponding data type T, creating and maintaining values of that type, adding a syntactic operator to combine those values with one or more other values, and providing an algorithm that determines the compositional meaning of those combinations. Values of T retain persistent information needed by the composition. These values can also act as syntactic operators, if the intrinsic evaluation mechanism associates their presence in a combination with behaviour specific to T. The algorithm provides semantics for the composition, expressed as further compositions or as primitive operations of the language, according to the values being combined.

The above can be achieved within the basic abstractions of some programming languages, although unnecessary complexity and obfuscation arise whenever those abstractions limit direct access to the data and algorithms implementing the composition.
If supporting mechanisms can be introduced at the meta level of the host language then these limitations do not arise, the interface presented to the programmer can deal directly and efficiently with relevant information, and the new composition appears as a natural extension of the language, qualitatively indistinguishable from an intrinsic mechanism.

∗ The full paper [4] is available online.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
FREECO’11, July 26, 2011, Lancaster, UK
Copyright 2011 ACM 978-1-4503-0892-2/11/07 ... $10.00

The introduction of supporting mechanisms at the meta level will be illustrated using a functional language derived from McCarthy’s LISP [1, 2]. Section 2 introduces this language and its evaluator. Section 3 describes modifications in its meta level to accommodate user-defined compositions. Section 4 presents several examples of composition mechanisms added to the language. Section 5 discusses the work and places it in context. Section 6 offers concluding remarks.

1.1 Typographic conventions
URLs and code are set monospaced. In code the name of a <type> identifier is enclosed by angle brackets and access to the fields of its values is designated <type>-field. (<, >, % and - are letters having no special significance.) Primitive behaviour is written as {pseudo code}.

2. A MINIMAL FUNCTIONAL LANGUAGE
Figure 1 defines an evaluator for a functional language of symbolic expressions represented as lists in Polish notation. The evaluator corrects several semantic inadequacies of LISP (described by Stoyan [5]) and is metacircular (written in the language it evaluates).
The language provides:
• lists and atomic values, including symbols
• primitive functions called <subr>s
• symbolic functions (closures) called <expr>s
• a <fixed> object that encapsulates another applicable value and prevents argument evaluation
• predicates to discriminate between the above types
• built-in <subr>s to access the contents of these values
• a way to call the primitive behaviour of a <subr>
• a quotation mechanism to prevent evaluation of literals

The interpretation of structures is defined by the usual pair of functions eval and apply. eval takes an expression (simple or complex) and yields its value in the context of an environment of bound names. apply takes a complex expression, split into a function part and its arguments, and yields the result of applying the former to the latter in the context of an environment of bound names. (As in LISP [2]: evlis evaluates each element in a list and returns a list of the results, pairlis extends an environment by binding a list of names to a list of values, and assoc finds a previously-bound name in an environment.)

A global initial environment contains bindings for primitives (<subr> values) and control structures (<subr>s or <expr>s wrapped in a <fixed>). Assignment and mutable state are supported via primitives (as they were in LISP [2, pp. 70ff]).
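The eval/apply pair described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not the paper's evaluator: symbols are Python strings, combinations are Python lists, environments are association lists of (name, value) tuples, and the classes Subr, Expr and Fixed are hypothetical stand-ins for the <subr>, <expr> and <fixed> types.

```python
# Illustrative model of the Section 2 evaluator (assumptions: str = symbol,
# list = combination, environment = list of (name, value) tuples).

class Subr:                      # primitive function
    def __init__(self, implementation):
        self.implementation = implementation   # called as implementation(args, env)

class Expr:                      # closure: formals, body, defining environment
    def __init__(self, formals, body, environment):
        self.formals, self.body, self.environment = formals, body, environment

class Fixed:                     # wrapper that suppresses argument evaluation
    def __init__(self, function):
        self.function = function

def assoc(name, env):
    """Return the first (name, value) binding for name in env."""
    for binding in env:
        if binding[0] == name:
            return binding
    raise NameError(name)

def pairlis(names, values, env):
    """Extend env by binding each name to the corresponding value."""
    return list(zip(names, values)) + env

def evlis(exps, env):
    """Evaluate each element of a list and return the list of results."""
    return [eval_(e, env) for e in exps]

def eval_(exp, env):
    if isinstance(exp, str):                 # symbol: look up its binding
        return assoc(exp, env)[1]
    if not isinstance(exp, list):            # any other atom is its own value
        return exp
    fn = eval_(exp[0], env)                  # combination: evaluate the operator
    if isinstance(fn, Fixed):                # operands are passed unevaluated
        return apply_(fn.function, exp[1:], env)
    return apply_(fn, evlis(exp[1:], env), env)

def apply_(fun, args, env):
    if isinstance(fun, Subr):
        return fun.implementation(args, env)
    if isinstance(fun, Expr):                # lexical scope: extend the closure's env
        return eval_(fun.body, pairlis(fun.formals, args, fun.environment))
    raise TypeError("not applicable: %r" % (fun,))

# A small initial environment: control structures are primitives wrapped in Fixed.
global_env = [
    ("quote",  Fixed(Subr(lambda args, env: args[0]))),
    ("lambda", Fixed(Subr(lambda args, env: Expr(args[0], args[1], env)))),
    ("+",      Subr(lambda args, env: args[0] + args[1])),
]

# ((lambda (x) (+ x 1)) 41)
print(eval_([["lambda", ["x"], ["+", "x", 1]], 41], global_env))   # prints 42
```

Passing `env` instead of `fun.environment` to `pairlis` in `apply_` would give dynamic rather than lexical scoping in this sketch.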


(define eval (lambda (exp env)
  (cond
    ((symbol? exp) (cdr (assoc exp env)))
    ((atom? exp)   exp)
    ('t            (let ((fn (eval (car exp) env)))
                     (if (fixed? fn)
                         (apply (<fixed>-function fn) (cdr exp) env)
                         (apply fn (evlis (cdr exp) env) env)))))))

(define apply (lambda (fun args env)
  (cond
    ((subr? fun) {call (<subr>-implementation fun) args env})
    ((expr? fun) (eval (<expr>-body fun) (pairlis (<expr>-formals fun) args (<expr>-environment fun)))))))

Figure 1: Evaluation of symbolic expressions in a minimal functional language

(define %type-names  (tuple))
(define %type-sizes  (tuple))
(define %type-fields (tuple))

(define %allocate-type
  (let ((last-type number-of-builtin-types))
    (lambda (name fields)
      (let ((type (set last-type (+ 1 last-type))))
        (set-tuple-at %type-names  type name)
        (set-tuple-at %type-sizes  type (list-length fields))
        (set-tuple-at %type-fields type fields)
        type))))

(define (%allocate-type ' '(x y)))
(define -x (lambda (value)
  (and (= (type-of value))
       (tuple-at value 0))))

Figure 2: Adding an aggregate type to the language

<expr>s are closures carrying the environment in which they were defined. Variable lookup is lexically scoped. (A trivial change to the last line in Figure 1 gives LISP’s dynamic scoping.) Additional atomic types such as numbers, and primitives that act on them, will be present in a practical language of this kind. They are omitted here for brevity.

3. SUPPORTING OPEN COMPOSITION
The language just described supports one combining form (the list).
When a list is evaluated it causes a composition in which each of the element(s) being combined is recursively evaluated yielding one or more values, the first of which is then applied (as a function) to the rest (the arguments). “Open composition” means the ability to add new compositions (or replace existing ones) corresponding to the evaluation of new (or existing) combining forms.

Modifications to the language will be made to support:
• defining new types, to represent the syntactic operators, state and semantics of composition mechanisms, and
• associating meaning with combinations involving these new types.

3.1 Extensible aggregate types
Predicates in eval and apply use some property of a value to discriminate between types. The simplest generalisation is to identify each type with a unique integer. Incrementing a counter suffices to allocate a new type. A primitive function type-of yields the type identifier for a given value.

Three types (<expr>, <fixed> and <pair>s for constructing lists) appear in the evaluator that are aggregates of values. Aggregation can be generalised to a single mechanism: infinitely-sized, indexable N-tuples containing undefined at all uninitialised indices.¹

¹ Assignment at an uninitialised index extends the tuple as necessary.

With these mechanisms addition of a new aggregate type can be effected in user code, as shown in Figure 2.² No restrictions have been placed on the structure or semantics of objects. Intrinsic types (those used by the evaluator) are built from the same parts: there is no disparity between built-in and user-defined types and values.

3.2 Extensible composition rules
Two kinds of composition occur in the evaluator. Simple atomic expressions (symbols in particular) are composed with the environment by eval to yield a value. Combinations of one or more values in complex expressions are composed by apply by application of a primitive or closure value to the remaining values.
Both kinds of composition are made extensible by applying an appropriate combination mechanism to every expression according to its type. Two tuples, evaluators and applicators, are indexed by type in eval and apply, respectively.

    eval(x, e)     = apply(evaluators[type(x)], cons(x, nil), e)
    apply(f, a, e) = apply(applicators[type(f)], cons(f, a), e)

Four consequences of this decomposition are:
• any applicable value can supply the composition semantics for any expression, simple or complex,
• any value can be made applicable, with semantics determined by its type and its value,
• the meaning of a complex expression (explicit combination of values, e.g., as a list) is not fixed, or even supplied, by the evaluation mechanism, and
• apply is an infinitely-recursive function.

Infinite recursion is avoided by short-circuiting the semantics of applying a primitive, as shown in Figure 3. Four entries in the evaluators and applicators tuples, as shown in Figure 4, restore the original semantics to the language.

(define eval (lambda (exp env)
  (apply (tuple-at evaluators (type-of exp)) (list exp env) env)))

(define apply (lambda (fn args env)
  (if (subr? fn)
      {call (<subr>-implementation fn) args env}
      (apply (tuple-at applicators (type-of fn)) (list fn args env) env))))

Figure 3: Generalised assignment of meaning to expressions

(set-tuple-at evaluators <symbol> (lambda (exp env) (cdr (assoc exp env))))
(set-tuple-at evaluators (lambda (exp env) exp))
(set-tuple-at evaluators <pair> (lambda (exp env)
  (let ((fn (eval (car exp) env)))
    (if (eq (type-of fn) <fixed>)
        (apply (<fixed>-function fn) (cdr exp) env)
        (apply fn (evlis (cdr exp) env) env)))))
(set-tuple-at applicators <expr> (lambda (fn args env)
  (eval (<expr>-body fn) (pairlis (<expr>-formals fn) args (<expr>-environment fn)))))

Figure 4: The original language semantics expressed as composition rules

Associating evaluators and applicators with env (effectively binding them in the environment) permits great flexibility in assigning meaning to program constructs, including incremental extension or modification of existing composition mechanisms, and multiple context-sensitive semantics for any given type. (The latter is pivotal when modelling language implementation as a series of partial evaluations, which is the motivation for an “open composition model”.)

² The steps shown allocate the type, record information for printing and instantiating, and define an accessor for an instance field. In practice these steps are generated automatically from a single define-type expression, given a type name and a list of field names, along with the implied set of <type>-field accessors.

(define-type <form> (function))
(define form (lambda (function)
  (let ((self (new <form>)))
    (set (<form>-function self) function)
    self)))
(set-tuple-at *applicators* <form> (lambda (fn args env)
  (eval (apply (<form>-function fn) args env) env)))

Figure 5: Form type for defining macros

4. DEFINING NEW COMPOSITIONS
Three new compositions will be added to the language: a <form> value (for defining macros), object-oriented message passing and generic functions.
(We assume a quasiquotation mechanism, which is straightforward given <form>.)

4.1 Forms and macros
Figure 5 introduces an applicable <form> type, encapsulating some other applicable value. The encapsulated value is applied to the arguments and then the result is re-evaluated. Placing a <form> inside a <fixed> creates a macro.

4.2 Message passing
Message passing sends a message with zero or more arguments to an object that executes a corresponding method. The method is typically chosen by combining the type of the object with the name (selector) of the message:

    type × selector → method

Objects can store (usually in their type) a table of methods indexed by selector, or selectors can store a table of methods indexed by type.³ The implementation in Figure 6 chooses the latter.⁴

4.3 Generic functions
A generic function contains several function implementations. When applied, the generic function uses some property of each of its arguments to choose which implementation will be executed. The type of each argument is often used.⁵ Figure 7 shows a simple model of generic functions.⁶ Tuples are organised as a sparse multi-dimensional array. Successive dimensions correspond to successive argument positions. For each dimension the tuple maps a type id to the tuple for the next argument position, until the final tuple which maps type to the implementation function.

5. DISCUSSION
The preceding compositions can be modelled in the language of Section 2 by a closure implementing dispatch through a closed-over list-based structure containing the required map. Accessing (for inspection or manipulation) the structure is one source of the “complexity and obfuscation” mentioned in the introduction, demanding knowledge of the internal structure and implementation of closures and environments.

³ Each has advantages compared with the other. Both would be present in a comprehensive model of message passing.
⁴ This breaks encapsulation, a precept of object-orientation. Order can be restored by storing an association list with each type to map selector names to methods. Then <selector> looks up the method and memoises it, in the structure shown here, before invoking it. Lookup need not be efficient because of the memoisation. Chaining <selector>s together permits enumeration to flush memoised results whenever method dictionaries are manipulated. Such an implementation of <selector> is one line longer than that shown.
⁵ The simplest property to use is equality with a constant but the resulting behaviour is of limited use.
⁶ This model uses equality to compare the list of actual argument types and the type signature associated with each implementation. If an ordering relation can be defined for types then more interesting comparisons can be used, to find the ‘closest admissible’ implementation when actual argument types do not precisely match an implementation function signature. As with selectors, the comparison need not be efficient because the result can be memoised in the <generic>’s array.
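The type-indexed dispatch of Section 3.2 and the two dispatching compositions of Sections 4.2 and 4.3 can be approximated in Python as follows. This is a hypothetical model, not the paper's implementation: Python classes play the role of type identifiers, dicts replace the evaluators and applicators tuples (and the sparse multi-dimensional method array), environments are dicts, closures and <fixed> are omitted, and every table entry is a primitive (Subr) so that apply_ always terminates. Unlike %add-multimethod in Figure 7, add_multimethod here assumes at least one argument type.

```python
# Illustrative model: eval_ and apply_ know nothing about lists, selectors or
# generic functions; all meaning comes from the two type-indexed tables.

class Subr:
    def __init__(self, implementation):
        self.implementation = implementation   # called as implementation(args, env)

class Selector:                  # message send: methods indexed by receiver type
    def __init__(self, name):
        self.name, self.methods = name, {}

class Generic:                   # generic function: nested tables, one level per argument
    def __init__(self, name):
        self.name, self.methods = name, {}

evaluators, applicators = {}, {}

def eval_(exp, env):
    return apply_(evaluators[type(exp)], [exp, env], env)

def apply_(fn, args, env):
    if isinstance(fn, Subr):                 # short-circuit: primitives ground the recursion
        return fn.implementation(args, env)
    return apply_(applicators[type(fn)], [fn, args, env], env)

def eval_list(args, _env):
    exp, env = args
    fn = eval_(exp[0], env)
    return apply_(fn, [eval_(e, env) for e in exp[1:]], env)

evaluators[str] = Subr(lambda args, _env: args[1][args[0]])   # symbol lookup
evaluators[int] = Subr(lambda args, _env: args[0])            # literals
evaluators[float] = evaluators[int]
evaluators[list] = Subr(eval_list)

def apply_selector(args, _env):
    selector, arguments, env = args
    method = selector.methods.get(type(arguments[0]))
    if method is None:
        raise TypeError("no method in %s for %s"
                        % (selector.name, type(arguments[0]).__name__))
    return apply_(method, arguments, env)

def add_multimethod(generic, types, method):
    table = generic.methods
    for t in types[:-1]:                     # one nested table per leading argument
        table = table.setdefault(t, {})
    table[types[-1]] = method                # last level maps type to implementation

def apply_generic(args, _env):
    generic, arguments, env = args
    method = generic.methods
    for argument in arguments:               # walk one dimension per argument
        method = method[type(argument)]      # KeyError if nothing matches exactly
    return apply_(method, arguments, env)

applicators[Selector] = Subr(apply_selector)
applicators[Generic] = Subr(apply_generic)

describe = Selector("describe")
describe.methods[int] = Subr(lambda args, env: "integer")
describe.methods[float] = Subr(lambda args, env: "real")

add = Generic("add")
add_multimethod(add, [int, int], Subr(lambda args, env: args[0] + args[1]))
add_multimethod(add, [int, float], Subr(lambda args, env: float(args[0]) + args[1]))

env = {"describe": describe, "add": add}
print(eval_(["describe", 1.5], env))   # prints real
print(eval_(["add", 40, 2], env))      # prints 42
```

Neither `eval_` nor `apply_` was modified to introduce `Selector` or `Generic`; registering one entry in `applicators` was enough, which is the property the generalised evaluator is designed to provide.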


(define-type <selector> (name methods))

(define make-selector (lambda (name)
  (let ((self (new <selector>)))
    (set (<selector>-name self) name)
    (set (<selector>-methods self) (tuple))
    self)))

(define define-selector (fixed (form (lambda (name)
  `(define ,name (make-selector ',name))))))

(define %add-method (lambda (self type method)
  (set-tuple-at (<selector>-methods self) type method)))

(define define-method (fixed (form (lambda (selector type args . body)
  `(%add-method ,selector ,type (lambda (self ,@args) ,@body))))))

(set-tuple-at *applicators* <selector> (lambda (self . arguments)
  (apply (or (tuple-at (<selector>-methods self) (type-of (car arguments)))
             (error "no method in "(<selector>-name self)" for "(type-of (car arguments))))
         arguments)))

Figure 6: Selector type and its applicative meaning function

(define-type <generic> (name methods))

(define generic (lambda (name)
  (let ((self (new <generic>)))
    (set (<generic>-name self) name)
    (set (<generic>-methods self) (tuple))
    self)))

(define define-generic (fixed (form (lambda (name)
  `(define ,name (generic ',name))))))

(define %add-multimethod (lambda (mm types method)
  (if types
      (let ((methods (or (<generic>-methods mm) (set (<generic>-methods mm) (tuple)))))
        (while (cdr types)
          (let ((type (eval (car types))))
            (set methods (or (tuple-at methods type) (set-tuple-at methods type (tuple)))))
          (set types (cdr types)))
        (set-tuple-at methods (eval (car types)) method))
      (set (<generic>-methods mm) method))))

(define define-multimethod (fixed (form (lambda (method typed-args . body)
  (let ((args  (map cadr typed-args))
        (types (map car  typed-args)))
    `(%add-multimethod ,method (list ,@types) (lambda ,args ,@body)))))))

(set-tuple-at *applicators* <generic> (lambda (self . arguments)
  (let ((method (<generic>-methods self))
        (arg    arguments))
    (while arg
      (set method (tuple-at method (type-of (car arg))))
      (set arg (cdr arg)))
    (apply method arguments))))

Figure 7: Generic function type and its applicative meaning function

5.1 Metacircularity and extensibility
A metacircular evaluator is self-extensible through direct manipulation of its own implementation.
Eventually it must be grounded in an executable representation, terminating the recursion in its implementation. Typically this means translating all or part of the evaluator into another language. The translated parts of the evaluator become inaccessible to direct manipulation.⁷ Minimising the number of, and semantic assumptions made by, these translated parts is therefore desirable.

⁷ In the absence of a dynamic translator from expressions to the executable representation.

Grounding the original evaluator of Figure 1 fixes:
• the types permissible in simple expressions (implicit combination of atom and environment),
• the meaning of simple expressions,
• the types in which complex expressions (explicit combinations of values) can be represented,
• the composition rule associated with complex expressions of each permissible type,
• the types that can appear as operators in complex expressions, and
• the semantics associated with an operator in a complex expression.

Grounding the generalised evaluator of Figure 3 fixes only the mechanism associating meaning with the type of an expression and the semantics of applying a primitive <subr>.

Additional grounded mechanisms (Figure 4) are needed to supply:
• meaning for atomic values (identifiers and literals),
• a composition rule for explicit combination of values via a list of <pair>s, and
• the semantics of applying a closure <expr>.

Modifications to these last three mechanisms can be made by programs, with care. With almost no restrictions: new types can be given meaning as simple expressions, new types


defined to combine values into complex expressions, specific meanings assigned to those combinations, new applicable types defined to serve as composition operators, and new semantics associated with those compositions.

5.2 Compositionality
In compositional languages, the meaning of a complex expression is determined from the meanings of its lexical components and the syntactic operator used to combine them [6]. In computer languages, which are usually compositional, the lexical components are expressions and the syntactic operator is the punctuation (or reserved word) that combines several expressions into a complex expression.⁸ The meaning of an expression is its value or effect. The syntactic operator determines a rule of composition for the lexical components. This can be written as a homomorphism between syntax and semantics [3]:

    $m(F(e_1, \ldots, e_k)) = G(m(e_1), \ldots, m(e_k))$

Our minimal language has no syntax, so the type of the first expression in a combination acts as the syntactic operator $F$.⁹ The semantic function $G$ associated with $F$, and the values (meanings) of the sub-expressions, $m(e_i)$, determine the value (meaning) of the overall composition. So:
• lists (of <pair>s) create a combination of sub-expressions (they are not syntax),
• compositional syntax is associated with the type of the value of the first sub-expression in a combination,
• each compositional syntax has exactly one compositional semantics associated with it, and
• the compositional semantics is parameterised by the combined sub-expressions, including the value that determined the compositional syntax.

The morphology of the above rule is imposed by the function stored at evaluators[<pair>] (Figure 4) which defines the meaning of complex expressions combined as a list of <pair>s.
Rules leading to forms of composition very different to that above can be expressed easily as new aggregate types (to combine sub-expressions) with associated meaning functions (imposing compositional forms) in evaluators.

5.3 Implementation
The language used in the examples, implementing the extensible types and composition rules described here, can be downloaded from: http://piumarta.com/software/maru

The language hosts a small compiler that translates a metacircular definition of its evaluator from S-expressions to IA32 machine code. Several composition mechanisms are defined and used in the compiler for brevity, clarity and simplicity in its implementation.

⁸ Punctuation and reserved words may also be associated with declarations. These are not syntactic operators, but may nonetheless create or modify the environment in which compositions are subsequently performed. Some languages may also have a meta language, with its own set of syntactic operators, that determine the meanings of meta expressions involving declarations or types. Such meta languages are usually compositional and therefore subject to the same considerations presented here for the evaluation of “normal” expressions.
⁹ So set and + are different syntactic operators, but + and - are the same.

5.4 Performance
The translation of S-expressions to IA32 machine code in the above metacircular implementation provides a convenient benchmark for measuring the cost of extensibility. The compiler was run twice, once using the original evaluator of Section 2 and again using the extensible evaluator described in Section 3. Translation took 30% longer when the compiler was run using the extensible evaluator.

6. CONCLUSIONS
A simple, metacircular, symbolic, functional language was restructured to remove assumptions about types and composition mechanisms. The original behaviour was restored by indirectly associating evaluation rules with three types and applicable behaviour with a fourth type.
New composition rules can be defined in the resulting language, with no privileged status accorded to the built-in types. Additional indirections in the evaluator caused a 30% loss in performance. (Techniques beyond the scope of this paper, involving staged evaluation of expressions and their associated compositions, can more than recover this loss. Supporting such techniques flexibly was the reason for developing and refining the open composition mechanism described here.)

The restructuring follows general principles that could be adapted for any small language in which code can be manipulated under program control. The language presented here is very simple, approaching the simplest in which the restructuring for extensible composition is possible, but provides a compelling demonstration of the expressive power gained. Its metacircular evaluator, runtime library, and compiler generating IA32 machine code are expressed in less than 1800 distinct lines of code.

7. REFERENCES
[1] J. McCarthy (1960) Recursive Functions of Symbolic Expressions and Their Computation by Machine, CACM, Vol. 3, No. 3, pp. 184–195
[2] J. McCarthy et al (1961) LISP 1.5 Programmer’s Manual, MIT AI Project, Cambridge, MA
[3] R. Montague (1970) Universal grammar, Theoria, Vol. 36, Issue 3, pp. 373–398
[4] I. Piumarta (2011) Open, extensible composition models, http://piumarta.com/papers/freeco11
[5] H. Stoyan (1991) The Influence of the Designer on the Design – J. McCarthy and Lisp, in V. Lifschitz (Ed.), Artificial Intelligence and Mathematical Theory of Computation: Papers in Honor of John McCarthy, Academic Press Professional, Inc.
[6] Z. G. Szabó (2008) Compositionality, in E. Zalta (Ed.), The Stanford Encyclopedia of Philosophy

Acknowledgements
The author is greatly indebted to Kita Laboratory, Kyoto University, for supporting this work. Mark Rafter and Yoshiki Ohshima provided invaluable comments on an early draft of this paper.
The anonymous reviewers made many excellent suggestions of which, unfortunately, only half could be given proper consideration in the space available for this version of the paper.


Towards Using Constructive Type Theory for Verifiable Modular Transformations

Steffen Zschaler
King’s College London, Department of Informatics, London, UK
szschaler@acm.org

Iman Poernomo
King’s College London, Department of Informatics, London, UK
iman.poernomo@kcl.ac.uk

Jeffrey Terrell
King’s College London, Department of Informatics, London, UK
jeffrey.terrell@kcl.ac.uk

ABSTRACT
Model transformations have been studied for some time, typically using a semantics based on graph transformations. This has been very successful in defining, optimising and executing model transformations, but has been less useful for providing a firm semantic basis for modular, reusable transformations. We propose a novel rendering of transformation semantics in terms of constructive type theory and show how this can be employed for expressing dependencies and guarantees of transformation modules in a formal framework.

Categories and Subject Descriptors
D2.3 [Software Engineering]: Coding Tools and Techniques

Keywords
Proofs-as-model-transformations, type theory, formal model driven engineering

1. INTRODUCTION
Model-Driven Engineering (MDE) focuses on using models as the central artefact of software development, and model transformations to turn them into executable code. Model transformations can encode design rules, platform choices, or even coding conventions. MDE can result in better software quality because it encourages developers to focus on high-level, domain-centered concepts, which ensures consistency of implementation and reliability of analysis.

As MDE is being used increasingly within science and industry, and transformations of interest are becoming more complex, the trustworthiness of transformations is quite rightly receiving more attention. The informality of MDE as it currently stands makes it untrustworthy and therefore potentially dangerous.
If model transformations are incorrect, the MDE process can result in software of a lower quality than that produced by traditional software development. A small number of errors in a complex transformation can easily lead to an exponential number of errors in the resulting code, which may be difficult to trace and debug.

Previous work by Terrell and Poernomo [14] has attempted to solve this problem within a formal method known as Constructive Type Theory (CTT). CTT possesses a property known as the Curry-Howard Isomorphism, where data, functions and their correctness proofs can be treated as ontologically equivalent, and where a similar equivalence holds for the related trinity of typing information, program specifications and programs. A practical implication of the isomorphism is that, by proving the logical validity of a model transformation specification, we can automatically synthesize an implementation of the transformation that satisfies the specification. Following [13], we call this the proofs-as-model-transformations paradigm.

As transformations become more complex, there is an increasing need to be able to modularise them. Some work on this has already been done [1, 3, 4, 6–8, 11, 12, 17, 19, 20]. However, it is still difficult to safely reuse transformation modules as there are currently no techniques for expressing or verifying a transformation module’s dependencies. In [14], Terrell and Poernomo showed how proofs-as-model-transformations allows us to develop a structured approach to provably correct model transformations, defining maps between class hierarchies. In this paper, we sketch how this approach can be extended to formally express contracts for transformation modules. In particular, we show how the higher-order nature of CTT can enable a natural characterisation of a transformation module’s dependencies.

2. SPECIFICATION AND DEVELOPMENT OF MODEL TRANSFORMATIONS
Constructive Type Theory (CTT) as a formal method is like a conventional functional programming type system that has been extended to include logical specifications, so that a valid type inference is also a proof of program certification. For example, just as we can find terms 2 and + that satisfy

    2 : int    or    + : int ∗ int → int

as valid type inferences in a typical functional programming language, we can also develop a term t in CTT such that

    t : ∀x : int.∃y : int.GreaterPrimeNumber(x, y)    (1)

Any such term t is simultaneously
• a program that, given an input x, will output a prime number y greater than x satisfying GreaterPrimeNumber(x, y);
• a proof that the program meets its specification.

That is, t is proof-carrying code, a program and a certification of the program’s correctness with respect to its specification (1).

We have applied the same principle to model transformations, extending the type system to accommodate EMOF-like metamodels as types so that we can develop certified model transformations t by developing a type inference of the form

    t : ∀x : Source.Pre(x) → ∃y : Target.Post(x, y)

Any such term t is simultaneously
• a model transformation that, given an input model x of metamodel Source, will output a model y of metamodel Target;
• a proof that the model transformation meets its specification.

In the next section, we present a brief summary of how CTT can be used to specify and develop model transformations.

2.1 Constructive Type Theory
We sketch our version of constructive type theory (a sugared version of Coquand and Huet’s Extended Calculus of Constructions, the type theory at the heart of the Coq theorem prover).

The CTT is a lambda calculus whose core set of terms, P, are given over a set of variables, V:

    P ::= V | λV.P | (P P) | ⟨P, P⟩ | fst(P) | snd(P) | inl(P) | inr(P)
        | match P with inl(V) ⇒ P | inr(V) ⇒ P

The lambda calculus is a functional programming language. We can compile terms and run them as programs. As such, the calculus is equipped with an evaluation semantics. Lambda abstraction and application are standard and widely used in functional programming languages such as SML. The term λx.P defines a function that takes x as input and will output P[a/x] when applied to a via an application (λx.P) a. The calculus also includes pairs ⟨a, b⟩, where fst(⟨a, b⟩) will evaluate to the first projection a (similarly for the second projection). Case matching provides a form of conditional, so that match z with inl(x) ⇒ P | inr(y) ⇒ Q will evaluate to P[a/x] if z is inl(a) and to Q[a/y] if z is inr(a). Evaluation is assumed to be lazy; that is, the operational semantics is applied to the outermost terms first, working inwards until a neutral term is reached. We write a ▷ b if a evaluates to b according to this semantics.

Like most modern programming languages, the CTT calculus is typed, allowing us to specify, for example, the input and output types of lambda terms. We write

    f : T

to signify that a term f has type T.

The terms of our lambda calculus are associated with the following kinds of types: basic types, e.g. integers, functional types (A → B), product types (A ∗ B), disjoint unions (A | B), dependent product types (Πx : t.a) and dependent sum types (Σx : t.b), where in both cases x is taken from V.
The non-dependent types have the standard meaning found in typical functional programming languages. For example,

    t : (A → B)

means that t is a function that can accept as input any value of type A to produce a value of type B.

The two dependent types require some explanation. A dependent product type expresses the dependence of a function’s output type on its input argument. For example, if

    (λx.t) : Πx : T.F(x)

then the function (λx.t) can take any value a of type T as input, producing an output value t[a/x] of type F(a). Thus, the final output type of the function is parametrised by the input value.

Similarly, the dependent sum type expresses a dependence between the type of a pair’s second element and the value of its first element. For example, if we have a pair

    (a, b) : Σx : T.F(x)

then the type of b is F(a).

Typing rules provide a formal system for determining what the types of lambda terms should be.

We have extended the standard way of encoding objects and classes, using record types of the same form as found in functional programming languages such as SML. Bidirectional and cyclic dependencies pose a technical problem for CTT. We solve this by using co-inductive record types. Co-induction over record types essentially allows us to expand as many references to other records as we require, simulating navigation through a metamodel’s cyclic reference structure. The formal treatment of these concepts is given in [14].

2.2 Proofs as Model Transformations

The Curry-Howard isomorphism shows that constructive logic is naturally embedded within our type theory, where proofs correspond to terms, formulae to types, logical rules to typing rules, and proof normalization to term simplification. Consider the set of well-formed formulae WFF, built from exactly the same predicates that occur in our type theory. We can define an injection asType from WFF to types of the lambda calculus as in Fig. 1.

    A                               asType(A)
    Q(x), where Q is a predicate    Q(x)
    ∀x : T.P                        Πx : T.asType(P)
    ∃x : T.P                        Σx : T.asType(P)
    P ∧ Q                           asType(P) ∗ asType(Q)
    P ∨ Q                           asType(P) | asType(Q)
    P ⇒ Q                           asType(P) → asType(Q)
    ⊥                               ⊥

Figure 1: Definition of asType, an injection from WFF to types of the lambda calculus.

The isomorphism tells us that logical statements and proofs correspond to types and terms. We assume we have a proof inference system for constructive logic, Int (similar to the inference systems taught in undergraduate logic classes), where Γ ⊢_Int P means that a proposition P can be logically deduced from a set of assumptions Γ.

Theorem 1 Let Γ = {G₁, ..., Gₙ} be a set of premises. Let Γ′ = {x₁ : G₁, ..., xₙ : Gₙ} be a corresponding set of typed variables. Let A be a well-formed formula. Then the following is true. Given a proof in constructive logic of Γ ⊢_Int A, we can use the typing rules to construct a well-typed proof-term p : asType(A) whose free proof-term variables are Γ′. Symmetrically, given a well-typed proof-term p : asType(A) whose free term variables are Γ′, we can construct a proof in constructive logic of Γ ⊢_Int A. □

Because the isomorphism holds, we will often omit the use of asType and use logical connectives and quantifiers instead of their computational counterparts (and vice versa) where this causes no ambiguity (for example, we will write ∀ instead of Π if the context makes it clear that a dependent product is being employed).

The key implication of this theorem is that
• types can be considered to be specifications of functional programs and
• an inhabitant of a specification type can be considered to be both a program that satisfies the specification and a proof of this satisfaction.

These results are entailed by the following.

Theorem 2 Let Γ = {G₁, ..., Gₙ} be a set of premises. Let Γ′ = {x₁ : G₁, ..., xₙ : Gₙ} be a corresponding set of typed variables. Let ∀x : T.∃y : U.P(x, y) be a well-formed ∀∃ formula. If

    p : asType(∀x : T.∃y : U.P(x, y))

is a well-typed term, then

    ∀x : T.P(x, fst(p x))

is provable. □

The theorem means that, given a proof of a formula ∀x : T.∃y : U.P(x, y), we can automatically extract a function f that, given input x : T, will produce an output f x that satisfies the constraint P(x, f x).

Our notion of proofs-as-model-transformations essentially follows from this theorem. Given that we have the machinery to type the structure of arbitrary metamodels, a model transformation between two metamodels Source and Target can be thought of as a functional program

    t : Source → Target

Such a program can be specified as a set of constraints over instances of the input x : Source and output y : Target metamodels. In the simplest case, we can consider these constraints to consist of a precondition Pre(x) that is assumed to hold over input metamodel instances x : Source, and a postcondition relationship Post(x, y) that holds between x and required output metamodel instances y : Target.

Given types Source and Target to represent the source and target metamodels, and constraints as logical formulae over terms of the metamodels, we can then specify the transformation by a formula

    ∀x : Source. Pre(x) → ∃y : Target. Post(x, y)

After that, we can attempt to find a certified transformation by identifying a term t that inhabits this type:

    t : ∀x : Source. Pre(x) → ∃y : Target. Post(x, y)

If we look at the meaning of the types (recall that ∀ corresponds to a dependent product Π and ∃ to a dependent sum Σ), we see that t must be a function that takes in any input x of type Source and returns a pair

    t x = ⟨w, p⟩

In order to synthesize a provably correct model transformation, we apply the extraction mapping over t according to Theorem 2: this will give us the required model transformation, fst(t x) = w, and a certification of the transformation’s correctness, a proof snd(t x) = p.

3. MODULARISING MODEL TRANSFORMATIONS

There has been considerable research interest in modularising model transformations for some time. The approaches proposed and studied so far may be characterised by the granularity of the modules that they provide. At a first level, we can distinguish internal composition of transformation rules from external composition of entire model transformations [10]. We can further differentiate internal composition into inter-rule composition, where entire rules are taken to be modules, and intra-rule composition, where rules themselves can be composed of finer-grained modules. In the following, we will briefly discuss each of these compositions in turn.

3.1 External Composition

External composition takes entire model transformations to be modules that can be independently reused and composed. Early research on external composition focused mainly on languages and tools for describing and executing such compositions of reusable model transformations. This led to work on MDA components [4], megamodelling [5], transformation chaining [3, 6, 17], and transformation configuration [19].

As all of this work considers transformations as black-box components to be composed into larger components, the ‘signature’ or ‘interface’ of a transformation becomes important. These terms refer to the information that can be obtained about a transformation without inspecting its implementation. Initial work on external composition [12, 18] defined transformation signatures by two sets of metamodels: one typing the models that the transformation consumed and another typing the models produced by the transformation. Later research [6] found that this is not always sufficient information for safely composing transformations. In particular, endogenous transformations transform between models of the same metamodel, but may well only address particular elements within this metamodel.
Information about the metamodel thus becomes useless when composing a set of endogenous transformations. In addition, some endogenous transformations may be intended to be used with a fixpoint semantics (invoking them until no more changes occur), which makes composing them even more complex. It was concluded in [6] that, in addition to the metamodel, there needs to be information about the particular subset of model elements that are used or affected by a transformation. In parallel to this work, [17] also identified a need to include information about the technical space of models (e.g., MOF or XML) in the transformation signature. Alternatively, some of this information has been encapsulated by wrapping models as components themselves, providing interfaces for accessing and manipulating the model in a fashion independent of the technical representation [12].

3.2 Internal Composition

Internal composition considers modules of a finer granularity than entire model transformations. Inter-rule composition considers individual rules to be modules, while intra-rule composition considers even finer-grained modules, i.e. parts of rules.

3.2.1 Inter-rule Composition

A number of transformation languages consider transformation rules to be the unit of modularity. Several mechanisms are provided for composing rules into transformations, including implicit and explicit rule invocation, and rule inheritance [3, 11]. Approaches inspired by graph transformation, for example VMT [16], even allow for chaining of individual transformation rules. Module superimposition [20] applies the notion of superimposition from feature-oriented software development [2] to the development of transformation modules, allowing individual rules to be overridden by rules from superimposed modules.

All of these techniques create some flexibility in allowing developers to exchange or independently evolve rules.
However, they do not distinguish a rule’s interface from its implementation, which means that rules and rule compositions cannot be verified or understood modularly without inspecting the complete implementation of each rule. Furthermore, some evaluations have shown that there are scenarios where the modularisation capabilities available at the level of complete rules are not sufficient [7, 8, 11].

3.2.2 Intra-rule Composition

To improve modularisation capabilities, a number of mechanisms have been proposed that allow parts of rules to become units of modularity. Balogh and Varró [1] describe how matching and creation patterns can be defined as standalone units of modularity, and composed into more complex patterns for use in transformation rules. Johannes et al. [9] allow rules to be composed and generated from a number of pattern instantiations annotated to the source metamodel.

While these approaches clearly improve on the modularity capabilities of inter-rule composition approaches, they still do not enable modular verification or understanding.

In summary, while most of these techniques provide some assurances with respect to the syntactic correctness of models produced from a composed transformation (if only by virtue of the fact that they abide by a metamodel), there is very little support for modular reasoning about semantic properties. In the next section, we propose a formal encoding of transformation semantics, which allows us to provide modular reasoning and verification about semantic transformation properties.

4. MODULAR TRANSFORMATION FUNCTIONS

Higher-order quantification is the ability to make statements that are generic or parametrised over other functions, statements or proofs of statements. The proofs-as-model-transformations idea, when combined with higher-order quantification, allows us to formally treat modularity in transformations and transformation specifications.

The higher-order nature of CTT allows us to parametrise statements over variables that stand as placeholders for other statements. This is achieved by introducing a higher-order universe type Prop of all propositions: the type allows us to treat logical statements as forms of data to be quantified over, just like integers or strings. We can therefore define specifications that are parametrised with respect to arbitrary subrequirements.
For example, we can parametrise the specification of a transformation from UML to relational databases as

    ∀SubReq : (UML ∗ RDBS) → Prop.
        (∀x : UML. ∃y : RDBS. Post(x, y) ∧ SubReq(x, y))    (2)

The predicate variable SubReq stands for any subrequirement we might have over the input and output model instances of the transformation. It could, for example, stand for a proposition CTS(x, y), which asserts that the number of tables in a relational database y : RDBS is greater than or equal to the number of classes in an input UML diagram x : UML. Given a proof of (2), the variable SubReq could then be replaced with CTS, yielding an instantiated version of the generic specification:

    ∀x : UML. ∃y : RDBS. Post(x, y) ∧ CTS(x, y)

This instantiated formula can be considered to be a version of the generic specification, rendered specific to a particular requirement about classes and tables.

We can combine this treatment of parametrised specifications with the Curry-Howard isomorphism to define a notion of transformation modularity that includes a formal treatment of certified parameters. This is done by quantifying not only over subrequirements, but also over arbitrary programs and proofs.

For example, we are able to parametrise a specification over an assumed input proof of a subrequirement. Consider the modular transformation dependency given in Fig. 2. The right-hand module represents a generic, structural transformation between UML object diagrams such that an input object A : UO is mapped to a new root object B : UO that stands in A.b relations to the objects of a list CL : List(UO). Assume this mapping has been defined by the predicate Map(A, B, CL).
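The step from the generic specification (2) to its CTS instance can be mimicked with ordinary higher-order functions. The sketch below is a loose analogy only, written under strong assumptions: models are reduced to lists of names, post is an invented stand-in for Post, and all identifiers (generic_spec, cts, transform) are hypothetical. Unlike a CTT proof term, the code merely checks the specification on one concrete input/output pair and proves nothing for all inputs.

```python
from typing import Callable, List

# Hypothetical, drastically simplified model types (not taken from the paper):
# a UML model is a list of class names, a relational schema a list of table names.
UML = List[str]
RDBS = List[str]
Spec = Callable[[UML, RDBS], bool]


def post(x: UML, y: RDBS) -> bool:
    """Invented stand-in for Post(x, y): every class has a table of the same name."""
    return all(class_name in y for class_name in x)


def cts(x: UML, y: RDBS) -> bool:
    """CTS(x, y): there are at least as many tables as classes."""
    return len(y) >= len(x)


def generic_spec(sub_req: Spec) -> Spec:
    """Body of specification (2) with SubReq left open: Post(x, y) and SubReq(x, y)."""
    return lambda x, y: post(x, y) and sub_req(x, y)


# Instantiating the predicate variable SubReq with CTS.
instantiated = generic_spec(cts)


def transform(x: UML) -> RDBS:
    """Toy transformation: one table per class plus one bookkeeping table."""
    return list(x) + ["schema_info"]


model = ["Customer", "Order"]
holds = instantiated(model, transform(model))  # runtime check on a single instance
```

Passing a different predicate to generic_spec yields a different instance of the same generic specification, which is the sense in which SubReq acts as a parameter.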
The structural specification of Fig. 2 is generic over the properties that might hold over each C ∈ CL (in particular, its attribute c): this subrequirement is defined as a predicate variable SubReq(A, C).

We can define a type for the modular transformation, with parameters representing both the subrequirement and an assumed proof pr of the subrequirement, as follows:

    ∀SubReq : UO ∗ UO → Prop.
    ∀pr : (∀A : UO. ∃C : UO. SubReq(A, C)).
        (∀A : UO. ∃B : UO. ∃CL : List(UO).
            Map(A, B, CL) ∧ ∀C ∈ CL. SubReq(A, C))    (3)

The parameter pr stands for a proof that, given any A, we can find a C such that SubReq(A, C). It would be employed in the proof of the composed, instantiated transformation to certify that a particular data transformation between the source and target can be plugged in to the structural transformation.

Note that the parameter SubReq in the example above predicates over individual model elements, as opposed to Map, which deals with the entire model structure. This formally expresses the separation of concerns between the structural transformation Map and its parameter, which can only specify data transformations.

Thus, higher-order quantification over proofs and subrequirements allows us to
• formally represent modular specifications, parametrised over desirable subrequirements;
• obtain, by the Curry-Howard isomorphism, certified modular transformations as instantiations of such parametrised specifications.

5. CONCLUSIONS

As model transformations become more important to software development, the systematic development of these transformations itself becomes more important. We have shown our current ideas on how constructive type theory can be used to formally express the interfaces of, dependencies between, and contracts supported by transformation components.
A key enabling factor in this has been the use of higher-order type theory, which allowed us to quantify (i.e., parametrise) over predicates and proofs of these predicates. This has enabled a transformation module to express precisely what it expects of other transformation modules with which it can be composed.

In the present paper, we have presented the essential idea of our approach using only a highly academic example. We are currently working to apply this idea to examples of modular transformations from [9, 11] and hope to report on this more extensively in a further publication.

6. REFERENCES

[1] András Balogh and Dániel Varró. Pattern composition in graph transformation rules. In 1st European Workshop on Composition of Model Transformations (CMT’06) [10], pages 33–37.
[2] Don Batory, Jacob Neal Sarvela, and Axel Rauschmayer. Scaling step-wise refinement. IEEE Transactions on Software Engineering, 30(6):355–371, 2004.


[Figure 2: High-level representation of two transformation components (panels: Data Transformation, Structure Transformation).]

[3] Mariano Belaunde. Transformation composition in QVT. In 1st European Workshop on Composition of Model Transformations (CMT’06) [10], pages 39–45.
[4] Jean Bézivin, Sébastien Gérard, Pierre-Alain Muller, and Laurent Rioux. MDA components: Challenges and opportunities. In Andy Evans, Paul Sammut, and James S. Willans, editors, Proc. 1st Int’l Workshop Metamodelling for MDA, pages 23–41, York, UK, 2003.
[5] Jean Bézivin, Frédéric Jouault, Peter Rosenthal, and Patrick Valduriez. Modeling in the large and modeling in the small. In Uwe Aßmann, Mehmet Aksit, and Arend Rensink, editors, Proc. MDAFA 2003/04, volume 3599 of Lecture Notes in Computer Science, pages 33–46. Springer Berlin / Heidelberg, 2005.
[6] Raphaël Chenouard and Frédéric Jouault. Automatically discovering hidden transformation chaining constraints. In Schürr and Selic [15], pages 92–106.
[7] Thomas Cleenewerck and Ivan Kurtev. Separation of concerns in translational semantics for DSLs in model engineering. In ACM Symposium on Applied Computing, pages 985–992, 2007.
[8] Arda Goknil and N. Yasemin Topaloglu. Composing transformation operations based on complex source pattern definitions. In 1st European Workshop on Composition of Model Transformations (CMT’06) [10], pages 27–32.
[9] Jendrik Johannes, Steffen Zschaler, Miguel A. Fernández, Antonio Castillo, Dimitrios S. Kolovos, and Richard F. Paige. Abstracting complex languages through transformation and composition. In Schürr and Selic [15], pages 546–550.
[10] A. G. Kleppe. 1st European workshop on composition of model transformations (CMT’06). Technical Report TR-CTIT-06-34, Centre for Telematics and Information Technology, University of Twente, June 2006.
[11] Ivan Kurtev, Klaas van den Berg, and Frédéric Jouault. Evaluation of rule-based modularization in model transformation languages illustrated with ATL. In Proc. 21st Annual ACM Symposium on Applied Computing (SAC’06), pages 1202–1209, April 2006.
[12] Raphaël Marvie. A transformation composition framework for model driven engineering. Technical report, University of Lille 1, 2004. LIFL technical report 2004-n10.
[13] Iman Poernomo. Proofs-as-model-transformations. In Antonio Vallecillo, Jeff Gray, and Alfonso Pierantonio, editors, Proc. 1st Int’l Conf. on Theory and Practice of Model Transformations (ICMT’08), volume 5063 of Lecture Notes in Computer Science, pages 214–228. Springer, 2008.
[14] Iman Poernomo and Jeffrey Terrell. Correct-by-construction model transformations from partially ordered specifications in Coq. In Jin Song Dong and Huibiao Zhu, editors, Proc. 12th Int’l Conf. Formal Methods and Software Engineering (ICFEM 2010), volume 6447 of Lecture Notes in Computer Science, pages 56–73. Springer, 2010.
[15] Andy Schürr and Bran Selic, editors. Proc. Int’l Conf. on Model Driven Engineering Languages and Systems (MoDELS’09), volume 5795 of Lecture Notes in Computer Science. Springer, 2009.
[16] Shane Sendall, Gilles Perrouin, Nicolas Guelfi, and Olivier Biberstein. Supporting model-to-model transformations: The VMT approach. In Arend Rensink, editor, Proc. MDAFA 2003, 2003. Published as CTIT Technical Report TR–CTIT–03–27, University of Twente.
[17] Bert Vanhooff, Dhouha Ayed, Stefan Van Baelen, Wouter Joosen, and Yolande Berbers. UniTI: A unified transformation infrastructure. In Gregor Engels, Bill Opdyke, Douglas C. Schmidt, and Frank Weil, editors, Proc. 10th Int’l Conf. Model Driven Engineering Languages and Systems (MoDELS’07), volume 4735 of Lecture Notes in Computer Science, pages 31–45. Springer, 2007.
[18] Andrés Vignaga, Frédéric Jouault, María Cecilia Bastarrica, and Hugo Brunelière. Typing in model management. In Richard Paige, editor, Proc. 2nd Int’l Conf. on Theory and Practice of Model Transformations (ICMT’09), volume 5563 of Lecture Notes in Computer Science, pages 197–212. Springer-Verlag, 2009.
[19] Dennis Wagelaar and Ragnhild Van Der Straeten. A comparison of configuration techniques for model transformations. In Arend Rensink and Jos Warmer, editors, Proc. ECMDA-FA 2006, volume 4066 of Lecture Notes in Computer Science, pages 331–345. Springer, 2006.
[20] Dennis Wagelaar, Ragnhild van der Straeten, and Dirk Deridder. Module superimposition: A composition technique for rule-based model transformation languages. Software and Systems Modeling, 9:285–309, 2010.


Author Index

Bergmans, Lodewijk
Bockisch, Christoph
Dinkelaker, Tom
Hauke, Sascha
Kell, Stephen
Piumarta, Ian
Poernomo, Iman
Rashid, Awais
te Brinke, Steven
Terrell, Jeffrey
Zschaler, Steffen


Program Committee

Alessandro Garcia      Lancaster University
Sven Apel              University of Passau
Lodewijk Bergmans      University of Twente
Christoph Bockisch     Software Engineering group, University of Twente
Christian Kästner      Philipps University Marburg
Ina Schäfer            Technische Universität Braunschweig
Robert Findler         Northwestern University
Todd Millstein         University of California, Los Angeles
Tom Dinkelaker         Technische Universitaet Darmstadt, Germany
