Towards Modular Code Generators Using SymmetricLanguage-Aware AspectsSteffen ZschalerKing’s College London,Department of Informatics, London, UKszschaler@acm.orgAwais RashidSchool of Computing and Communications,Lancaster University, Lancaster, UKawais@comp.lancs.ac.ukABSTRACTModel-driven engineering, especi<strong>all</strong>y using domain-specific languages,<strong>all</strong>ows constructing software from abstractions that are moreclosely fitted to the problem domain and that better hide technicaldetails of the solution space. Code generation is used to produce executablecode from these abstractions, which may result in individualconcerns being scattered and tangled throughout the generatedcode. The ch<strong>all</strong>enge, then, becomes how to modularise the codegeneratortemplates to avoid scattering and tangling of concernswithin the templates themselves. This paper shows how symmetric,language-aware approaches to aspect orientation can be appliedto code generation to improve modularisation support.Categories and Subject DescriptorsD2.3 [Software Engineering]: Coding Tools and TechniquesKeywordsmodel-driven engineering, code generation, symmetric aspects1. INTRODUCTIONIn model-driven engineering (MDE), domain-specific languages(DSLs) are typic<strong>all</strong>y accompanied by code generators, which expandthe abstract DSL concepts into source code in a target programminglanguage, typic<strong>all</strong>y taking into consideration a particulartarget platform (e.g., middleware, hardware, distribution andconcurrency model, etc.). Such expansions can be limited to thegeneration of standard, ‘boiler-plate’ code, but, more interestingly,may include so-c<strong>all</strong>ed local-to-global or global-to-local transformations[14]. A local-to-global transformation means that informationfrom one element in a DSL program may be scattered acrossmultiple elements in the generated source code—for example, adata description may affect user-interface code as well as datadefinitionscripts for setting up a database back end. A globalto-localtransformation, on the other hand, means that informationfrom a number of DSL elements may need to be tangled within oneelement in the generated source code—for example, both informationfrom a layout policy and the data description must be tangledPermission to make digital or hard copies of <strong>all</strong> or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.FREECO’11, July 26, 2011, Lancaster, UK.Copyright 2011 ACM 978-1-4503-0892-2/11/07 ...$10.00.in user-interface code generated.The ch<strong>all</strong>enge in the presence of local-to-global and global-tolocaltransformations is how to cleanly modularise the code-generationtemplates so that, while tangling and scattering occur inthe code generated, the templates themselves do not suffer from it.Moreover, the code-generators themselves may need to address arange of technical concerns, such as consistency management [6],performance, security, different target platforms, etc., which arenot captured in the source models. Of course, we would also liketo avoid scattering and tangling of technical concerns in generationtemplates. This becomes particularly important where DSLsshould be reused in different projects, which may require to addresssome technical concerns in a different manner or may requireadaptations to the DSL concepts provided. In this case, generatormodularisation becomes essential, as it enables a ‘mix and match’approach to DSL adaptation and reuse [16]. Ide<strong>all</strong>y, we would liketo modularise code generators such that each technical concern aswell as each concern modelled in a DSL could be realised in a separatecode-generation module as independently as possible of theother concerns.Existing code-generation frameworks (e.g., [4, 7, 8, 11]) assumethat one target file will be produced by one code-generation template(even though this may import and invoke additional generationrules). Thus, any scattering and tangling that occurs in thegenerated code automatic<strong>all</strong>y also occurs in the generation templates.To address this issue, aspect-oriented approaches to codegeneration have been proposed [10,15]. However, they are not fullyadequate for <strong>all</strong> needs of code-generator modularisation: Becausethey are asymmetric approaches [5], they require an explicit basetemplate, often imply the use of scaffolding (e.g., empty rules inthe base template whose only purpose is to serve as hooks for theaspect templates), and lack sufficient support for weaving contexts.Furthermore, because they are language-agnostic, any languagespecificweaving semantics must be implemented in the generationtemplates, which is not always possible.This paper proposes a novel approach for modularising codegenerators. This approach maintains the benefits of reduced tanglingintroduced by [10, 15], but also addresses the short-comingsof these approaches:• It is a symmetric aspect-oriented approach, addressing theissues connected to asymmetry in current approaches.• It is also a language-aware approach to address the issue oflanguage agnosticism.Thus, the contributions of this paper are the following:1. We identify four shortcomings of existing asymmetric AOapproaches to code generation that stand in the way of effectiveseparation of concerns for code generators.
2. We present a novel approach and architecture for symmetriclanguage-aware AO for code generation, which addresses theaforementioned shortcomings.3. We present a prototype implementing this architecture.The remainder of this paper is structured as follows: In the nextsection, we give a more detailed motivation for the need of symmetricaspect-oriented code generation based on the identificationof four issues with current symmetric approaches. In Sect. 3, wepresent a generic architecture for symmetric aspect-oriented codegeneration and present a prototype implementing these conceptsbased on the Epsilon Generation Language (EGL) [12]. Section 4surveys some related work and Sect. 5 concludes the paper. Due tospace limitations, we have omitted an example as well as a moredetailed discussion of the benefits and drawbacks of our approach.These can be found in an accompanying technical report [17].2. MOTIVATION: MODULARISING CODEGENERATION TEMPLATESMany code-generation frameworks, such as [4, 7, 8, 11], alreadyprovide notions such as generator rules and operations. These arevery useful for modularising a generator based on the structure ofthe files to be generated. However, information that is localisedin one place in a model often needs to affect multiple places inthe generated code. Thus, the knowledge of how to implement amodel element is scattered throughout the generated code. As aconsequence this knowledge will also be scattered throughout thecode generators. Furthermore, information from multiple modelelements may need to be combined to generate a particular file.Thus, the generated code tangles a number of concerns from theDSL models. Unless specific measures are taken in modularisingthe code generators, these concerns will also be tangled within thecode generators.Scattering and tangling in code generator templates is particularlybad where we want to flexibly configure a generation workflowso that it can be adapted to different circumstances. This canhappen, for example, when a DSL is to be reused in a different context[16], or when the system context changes (e.g., a new versionof an underlying platform is released and must be integrated withthe system under development).For example, when generating Eclipse [3] plugins from an applicationmodel, a number of different concerns must be addressedby these generators: i) generating the Java implementation of theapplication’s business logic, ii) generating code implementing thedata model, and iii) generating user interface code, among others.It is immediately clear that there is a number of local-to-globaltransformations in this scenario—for example, information abouta data structure affects code in the user interface, the data model,as well as, possibly, the business logic code. Furthermore, there area number of tangled concerns—for example, both user-interfacecode and data-model code may have to provide configuration information,which means these concerns will be tangled in Eclipse’splugin.xml configuration file. Separating these concerns intodistinct generation modules, would enable us to choose from differentvariations for implementing each concern. Figure 1 shows anexample of how we may want to decompose the code generators.In particular, we have defined one generator for the user-interfaceconcern, one for the concern of business logic, and two differentvariations for the concern of data-model implementation—one usingEMF [13] and one using plain Java objects. The figure showstwo possible configurations making specific choices about thesecode generators. It can be seen how each of the generators usesinformation from the application model (which could be providedusing a number of DSLs) and generates code in one or more files;that is, concerns are scattered throughout the generated code. Somecode generators also need to provide code to the same file; that is,different concerns are tangled within these files.In summary, we require a modularisation of our code generators,such that each concern can be realised in a separate code-generationmodule as independently as possible of the other concerns, even ifthese concerns need to be scattered and tangled in the generatedcode. Ide<strong>all</strong>y, this should enable us to modify, remove, or add acode-generation module without impact on any of the other modulesin our code generator.To support the modularisation of tangling in code generators,some authors have proposed using aspect-oriented techniques in thedefinition of code-generator templates [10, 15]. These are asymmetricaspect-oriented approaches [5] <strong>all</strong>owing the definition ofaround, before, and after advice for individual code-generationrules or operations in a base template. While this can successfullysolve some of the modularisation issues, it also has importantdrawbacks:1. Base template needed. Every aspect template requires a basetemplate against which it is defined and which it modifies.This makes it difficult to define code generators for files thatdo not need to exist for <strong>all</strong> variants of a system. In the exampleof Fig. 1, depending on which concerns we include inthe generation, not <strong>all</strong> of the plugins generated will actu<strong>all</strong>yrequire a plugin.xml file. At the same time, a number ofconcerns will have to make contributions to plugin.xmlif they are selected. Using asymmetric aspect-oriented codegenerationtechniques, we need an empty code-generationbase template for plugin.xml, even though this file wouldoften not be needed. Note that, because we want to <strong>all</strong>ow freechoice of code generators and because this selection may bemade based on information in the application model ratherthan based on a static configuration choice of the generationworkflow, we cannot guarantee a priori that any specific codegenerators will be used in a concrete workflow, and, thus,cannot make any generator the base generator.2. Scaffolding needed in base template. Because every adviceneeds to be attached to some rule or operation in the basetemplate, we can only influence those parts of a file for whichexplicit code-generation rules have been defined. This typic<strong>all</strong>yleads to the definition of a number of empty generationrules, which only serve as hooks to be referenced inaspect templates adding additional code to the file generated.1 For example, in Fig. 1, the empty base template forplugin.xml needs to provide empty rules for generatingextension and extension-point descriptions, even though wedo not declare any of them in the base template. The onlypurpose of these rules would be to enable aspect templates toadd extensions and extension points of their own. The needfor such additional constructs in base code has been previouslyidentified—for example, [15, p. 9] shows an exampleof such an empty generator rule (the fact that [15] in principle<strong>all</strong>ows aspects on aspects does not strongly affect theneed for scaffolding). In [2] such rules have been referred toas scaffolding. They have been shown to contribute stronglyto the instability of pointcuts and to break encapsulation ofbase and aspects.1 Oldevik and Haugen [10] describe this: “[. . . ] there must be constraintson the base transformation [. . . ], e.g. that domain-specificrules should be clearly separated from general rules”.