Dissertaç ˜ao de Mestrado Mestrado em Engenharia Informática Jo ...

Universidade Nova de Lisboa 

Faculdade de Ciências e Tecnologia 

Departamento de Informática 

Dissertação de Mestrado 

Mestrado em Engenharia Informática 

João André Martins (26464) 

Lisboa 

(2009)

Universidade Nova de Lisboa 

Faculdade de Ciências e Tecnologia 

Departamento de Informática 

Dissertação de Mestrado 

SmART: An Application 

Reconfiguration 

Framework 

Joao Andre Martins (26464) 

Orientador: Prof. Doutor Hervé Paulino 

Co-orientador: Prof. Doutor João Lourenço 

Lisboa 

(2009) 

Dissertação apresentada na Faculdade de Ciências 

e Tecnologia da Universidade Nova de Lisboa para 

a obtensão do Grau de Mestre em Engenharia In- 

formática.

To my family and friends, who supported me through the best and worst.

Summary 

Among the information technology sectors is the virtualization sector, which stands out by 

being a very active area, with many collaborations and advances. Virtualization’s benefits are 

plenty. Some of them are the massive server consolidation, reduced cooling, structural and 

electrical costs and the isolation between virtual machines. 

Virtual appliance is a concept resulting from the advancements on virtualization, and de- 

fines an alternative software distribution model from those existing by then, such as the tradi- 

tional CD-ROM distribution, software as a service and hardware appliances. This model allows 

customers to have their IT solution fully specified and optimized for the task it must perform. 

With the virtual appliance outbreak, many servers using different applications tend to 

gather on the same physical machine. The applications, in turn, may implement plenty config- 

uration formats, from widely adopted standards to proprietary formats. The configuration of 

these applications is typically carried by busy system administrators, who must dedicate some 

of their time coping such a vast configuration format range. 

This thesis contributes for the nourishing of the virtual appliance concept and proposes 

the creation of a tool which automatically configures applications inside virtual appliances, 

regardless of the application being configured. The contributions on this area are, therefore, 

very rare, which elevates this dissertation to a pioneer of the area. 

The approach for the problem of automatic application configuration explores the frequently 

found patterns in configuration files. Typically, those files use similar concepts, such as parame- 

ter definitions, parameter blocks and comments. Besides this, only some files were found to im- 

plement other concepts, although on a punctual basis. This dissertation proposes a framework 

for the automatic configuration of applications based on this characteristic. A configuration 

file is transformed to a structured and application-independent language, like the eXtended 

Markup Language, which is then modified and reverted to its original syntax. 

Keywords: Virtual appliance, automatic application configuration 

vii

Sumário 

Entre os sectores de tecnologias de informação, o sector da virtualização distingue-se por 

ser uma área muito activa, que conta com muitas colaborações e desenvolvimentos. As util- 

idades da virtualização são inúmeras. Entre elas estão a consolidação maciça de servidores, 

redução nos custos de arrefecimento, estruturais e de electricidade e o isolamento entre 

máquinas virtuais. 

Fruto dos avanços da virtualização é o conceito de virtual appliance, que define um modelo 

de distribuição de software alternativo aos até então existentes, tais como distribuição tradi- 

cional (CD-ROM), software as a service e hardware appliances. Este modelo permite aos clientes 

terem a sua solução informática específica e optimizada para a tarefa que se propõe desempen- 

har. 

Com o surgimento das virtual appliances, muitos servidores que recorrem a diversas aplicações 

tendem a juntar-se em máquinas físicas. As aplicações, por sua vez, podem implementar vários 

formatos de configuração, desde padrões amplamente adoptados a formatos de proprietário. A 

configuração destas aplicações é tipicamente feita por administradores de sistemas ocupados, 

que têm assim necessidade de canalisar o seu tempo para lidar com a variedade de formatos 

de configuração. 

Esta dissertação vai no encontro dos avanços no ramo das virtual appliances, propondo a 

criação de uma ferramenta que faça a configuração automática de aplicações, independen- 

temente da aplicação em causa. São muito poucos os contributos nesta área, pelo que esta 

dissertação pode ser considerada pioneira na área. 

A abordagem para o problema da configuração automática de aplicações explora os padrões 

recorrentes em ficheiros de configuração de aplicações. Tipicamente, estes ficheiros imple- 

mentam conceitos semelhantes, como definições de parâmetros, blocos de parâmetros e co- 

mentários, existindo ainda outros ficheiros que implementam padrões próprios, tratando-se, 

no entanto, de casos pontuais. Esta dissertação propõe uma framework baseada nesta carac- 

terística, onde um ficheiro de configuração é transformado para uma linguagem estruturada e 

independente da aplicação, como a eXtended Markup Language (XML), e que depois de alter- 

ado é revertido na sua síntaxe original. 

Palavras-chave: Virtual appliance, configuração automática de aplicações 

ix

Contents 

1 Introduction 1 

1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 

1.2 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 

1.2.1 Pre-installation Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . 4 

1.2.2 Post-installation Configuration . . . . . . . . . . . . . . . . . . . . . . . . . 4 

1.2.3 Application Configuration in General . . . . . . . . . . . . . . . . . . . . . 4 

1.2.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 

1.3 Problem Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 

1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 

1.5 Document Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 

2 State of the Art and Related Work 11 

2.1 Virtual Appliances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 

2.1.1 Virtualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 

2.1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 

2.1.3 Open Virtual Machine Format . . . . . . . . . . . . . . . . . . . . . . . . . 14 

2.1.4 Virtual Appliances in Real Life . . . . . . . . . . . . . . . . . . . . . . . . . 15 

2.2 Automatic Configuration of Applications . . . . . . . . . . . . . . . . . . . . . . . 15 

2.2.1 Thin Crust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 

2.2.2 SmartFrog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 

2.3 Automatic Recognition of Configuration Files . . . . . . . . . . . . . . . . . . . . . 16 

2.3.1 SableCC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 

2.3.2 JavaCC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 

2.4 Representation of Configuration Files in a Generic Syntax . . . . . . . . . . . . . . 18 

3 An application reconfiguration framework 21 

3.1 Architectural Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 

3.2 Automatic Application Configuration . . . . . . . . . . . . . . . . . . . . . . . . . 24 

3.3 Original to Generic syntax Converter . . . . . . . . . . . . . . . . . . . . . . . . . . 24 

3.3.1 Component Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 

3.3.2 Original to Generic Syntax Converter Operation Scenarios . . . . . . . . . 31 

3.4 Generic to Original syntax Converter . . . . . . . . . . . . . . . . . . . . . . . . . . 33 

xi

CONTENTS 

3.5 Modifying the Configuration File . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 

3.6 External Parser Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 

4 An Implementation 37 

4.1 Programming Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 

4.2 Parser Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 

4.3 Generic Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 

4.4 Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 

4.4.1 Parser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 

4.4.2 Grammar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 

4.4.3 ParsingData . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 

4.4.4 CompilationData . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 

4.5 Original to Generic syntax Converter . . . . . . . . . . . . . . . . . . . . . . . . . . 43 

4.5.1 User Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 

4.5.2 Configuration File Parser . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 

4.5.3 Parser Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 

4.5.4 Grammar Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 

4.5.5 Code Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 

4.5.6 Tentative Grammar Repository . . . . . . . . . . . . . . . . . . . . . . . . . 45 

4.6 Generation of Original Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 

4.7 Generic to Original Syntax Converter . . . . . . . . . . . . . . . . . . . . . . . . . 48 

5 Framework Evaluation 51 

5.1 Functional Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 

5.2 Operational Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 

5.2.1 INI-like Configuration Files . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 

5.2.2 Apache-like Configuration Files . . . . . . . . . . . . . . . . . . . . . . . . . 56 

5.2.3 XML-like Configuration Files . . . . . . . . . . . . . . . . . . . . . . . . . . 56 

5.2.4 Parsing the Generated XML . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 

5.3 Performance Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 

6 Integration with VIRTU 67 

6.1 Project Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 

6.2 VIRTU Top-Level Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 

6.3 VIRTU Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 

6.3.1 Data and Resources Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 

6.3.2 Processing Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 

6.3.3 Presentation Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 

6.3.4 Other Notable Sectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 

6.4 SmART integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 

6.4.1 SmART Original to Generic syntax Converter . . . . . . . . . . . . . . . . 76 

6.4.2 SmART Generic to Original syntax Converter . . . . . . . . . . . . . . . . 77 

xii

CONTENTS 

6.4.3 Parser Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 

6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 

7 Conclusion and Future work 79 

7.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 

A Appendix - Architecture 83 

A.1 OGC sequence diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 

B Appendix - Evaluation 85 

B.1 MySQL Configuration file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 

B.2 Apache Configuration File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 

B.3 Eclipse Configuration File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 

B.4 Performance Validation File 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 



xiii

List of Figures 

2.1 Virtual appliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 

2.2 Physical unit running four VMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 

3.1 SmART Work Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 

3.2 Components overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 

3.3 Partial and totally parsed configuration file stages . . . . . . . . . . . . . . . . . . 29 

3.4 Interactions on the successful file recognition scenario . . . . . . . . . . . . . . . . 32 

3.5 Interactions on the failed file recognition scenario . . . . . . . . . . . . . . . . . . 33 

4.1 Recognition of a configuration file by two parsers . . . . . . . . . . . . . . . . . . 40 

4.2 Graphical representation of ParsingData . . . . . . . . . . . . . . . . . . . . . . . . 42 

4.3 List of TGRNodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 

5.1 Parsing the generated XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 

6.1 VIRTU Assembly Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 

6.2 VIRTU Logical View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 

6.3 VIRTU Architectural Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 

6.4 SmART integration points with VIRTU . . . . . . . . . . . . . . . . . . . . . . . . . 75 

A.1 Original to generic syntax converter sequence diagram . . . . . . . . . . . . . . . 84 

xv

List of Tables 

3.1 Presentation Layer Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 

3.2 Logic Layer Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 

3.3 Data Layer Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 

5.1 Elapsed times for parser compilation in JavaCC . . . . . . . . . . . . . . . . . . . 65 

xvii

List of Listings 

1.1 Apache configuration file excerpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 

1.2 Eclipse configuration file excerpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 

1.3 MySQL configuration file excerpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 

1.4 PostgreSQL configuration file excerpt . . . . . . . . . . . . . . . . . . . . . . . . . 7 

1.5 Mantis configuration file excerpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 

3.1 User Interface proposed interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 

3.2 Configuration File Parser proposed interface . . . . . . . . . . . . . . . . . . . . . 28 

3.3 Code Generator proposed interface . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 

3.4 Grammar Compiler proposed interface . . . . . . . . . . . . . . . . . . . . . . . . 30 

3.5 Parser Repository proposed interface . . . . . . . . . . . . . . . . . . . . . . . . . . 30 

3.6 Tentative Grammar Repository proposed interface . . . . . . . . . . . . . . . . . . 31 

3.7 Printer proposed interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 

3.8 External Parser Proposed Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 

4.1 Node interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 

4.2 Parser data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 

4.3 InternalParser data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 

4.4 ExternalParser data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 

4.5 Grammar data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 

4.6 ParsingData data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 

4.7 CompilationData data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 

4.8 Example Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 

4.9 Example FStr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 

4.10 Apache parameter with a single value . . . . . . . . . . . . . . . . . . . . . . . . . 48 

4.11 Apache parameter with multiple values . . . . . . . . . . . . . . . . . . . . . . . . 48 

4.12 XML element with a single subelement . . . . . . . . . . . . . . . . . . . . . . . . 49 

4.13 XML element with multiple elements . . . . . . . . . . . . . . . . . . . . . . . . . 49 

5.1 MySQL configuration file snippet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 

5.2 MySQL block in XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 

5.3 MySQL comment in XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 

5.4 MySQL special instruction in XML . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 

5.5 MySQL Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 

5.6 Apache configuration file snippet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 

xix

LIST OF LISTINGS 

5.7 Apache parameter with a single value in XML . . . . . . . . . . . . . . . . . . . . 57 

5.8 Apache parameter with multiple values in XML . . . . . . . . . . . . . . . . . . . 57 

5.9 Apache block in XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 

5.10 Apache nested block in XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 

5.11 Apache comment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 

5.12 Apache Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 

5.13 Eclipse configuration file snippet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 

5.14 Eclipse empty block in XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 

5.15 Eclipse block with parameters in XML . . . . . . . . . . . . . . . . . . . . . . . . . 61 

5.16 Eclipse nested blocks in XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 

5.17 Eclipse comment in XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 

5.18 Eclipse Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 

G:/etc/mysql/my.cnf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 

G:/home/citanul/workspace/conf2xml blox/src/httpd.conf . . . . . . . . . . . . . . 89 

G:/home/citanul/workspace/conf2xml xml/src/workbench.xml . . . . . . . . . . . 120 

G:/home/citanul/SACT/performance/pg ident.conf . . . . . . . . . . . . . . . . . . . 139 

G:/home/citanul/SACT/performance/pg hba.conf . . . . . . . . . . . . . . . . . . . . 140 

G:/home/citanul/SACT/performance/postgresql.conf . . . . . . . . . . . . . . . . . . 143 

xx

1 

Introduction 

This dissertation presents a framework for the automatic configuration of applications in vir- 

tual appliances. This framework, entitled Smart Application Reconfiguration Tool (SmART), 

configures any kind of application by transforming the application’s configuration files into a 

generic structure, dettached from the configuration file original syntax. This abstraction allows 

for a configurator to traverse configuration files in original syntax from any application and 

modify them to reflect a desired behaviour on the target application. As soon as the configura- 

tor manages to apply the changes, the information in the configuration files on generic syntax 

is used to reproduce them in their original syntax. This framework is very innovative on the 

grounds that it approaches the area of automatic application configuration, an area which ben- 

efits from few contributions and currently does not offer many solutions for that endeavour. 

This chapter is structured on the following way: Section 1.1 explains the motivation for the 

resolution of the problem, Section 1.2 defines the problem to be solved, Section 1.3 reveals how 

that problem can be solved, Section 1.4 enumerates the contributions of this dissertation for the 

problem solution and Section 1.5 presents the organization of the remainder of this dissertation. 

1.1 Motivation 

The virtualization area is a very hot and fast-moving area of Information Technology (IT). Al- 

though virtualization is not an entirely new concept, only recently it has been noticed, devel- 

oped and studied carefully and in depth, so it is prone to more advancements and should be 

considered as a mainstream software deployment model for the future [VMw]. 

Despite the term and ideas of virtualization being quite old (1960’s), only recently it has 

been given increased attention [Sin04]. Companies which recur to IT services are increasingly 

aware of virtualization due to the crucial changes it brings along. Virtualization makes it pos- 

1

1. INTRODUCTION 1.1. Motivation 

sible to gather several underutilized servers into fewer ones (possibly just a single server) by 

virtualizing the hardware where each server runs. This reduces not only the companies’ electri- 

cal bills, but also cooling, space and hardware costs. Virtualization brings along other benefits 

such as the fast replication and/or migration of virtual machines (VMs), which allows for high 

availability of the services provided by the VMs. 

But not only the final users of virtualization woke up to the trend. Some IT giants such as 

Intel, AMD and Microsoft already entered the virtualization market, with different solutions 

such as hardware-assisted virtualization or hypervisor based virtualization. This huge trend 

also got into the consumers market, with popular solutions such as VMware Workstation or 

Microsoft Virtual PC, and even in the gaming market. Anyone who runs games on MAME 

or WINE is using virtualization technology. Currently, there are a lot of virtualization solu- 

tions in the market. These include VMware [VMw09], XenSource [Cit09], Microsoft [Mic09a], 

Sun [Mic09c], Parallels [Par09] products, and so on. 

With the advancement of virtualization technology, the new Virtual Appliance (VA) concept 

emerged. VAs came to change the way of looking to software deployment. Unlike the existing 

software deployment models, presented ahead in Section 2.1, such as the traditional CD-ROM 

way, Software as a Service or Hardware Appliances, VAs allow the costumers to have their 

own IT platform, fully optimized for the task being and completely under their domain. The 

blending of virtualization benefits with VAs’ performance is bringing software deployment to 

a new level. 

This shift in the IT sector motivated project VIRTU [Sol09a] which is a collective effort by 

the consortium composed of Evolve Space Solutions, Universidade Nova de Lisboa, Universi- 

dade de Coimbra, HP Labs and the European Space Agency. The project’s goal is to develop an 

open-source vendor-independent virtualization platform, including support for the manage- 

ment and configuration of applications. 

There are currently two emerging opportunities on the IT market. On one side, virtualiza- 

tion has changed the way in which companies deploy software, on the other hand, application 

configuration is still mostly a manual task. VIRTU exploits these opportunities by defining a 

virtualization tool which manages application stacks. In other words, VIRTU allows for the 

creation of optimized virtual machines assembled with existing applications which can be au- 

tomatically configured. 

VIRTU project’s use cases range from software testing in complex distributed systems, in- 

cluding the European Space Operations Center’s Mission Control System (MCS) and Ground 

System Tests and Validation infrastructure (GSTVi), to IT infrastructural management, like Por- 

tugal Telecom’s Application Service Provider or Novabase’s Enterprise Applications Environ- 

ment. VIRTU also makes it possible for a customer to have its solution self-provisioned within 

his domain, but also to maintain different application versions to achieve legacy compatibility. 

The advent of virtual appliances makes it possible to gather hundreds, or thousands of 

applications on the same machine. This results in many different ways to represent configu- 

rations, which, in turn, makes the management of those configurations complicated and time- 

consuming. 

2

1. INTRODUCTION 1.2. Problem Description 

This dissertation proposes the creation of a tool, entitled Smart Application Reconfiguration 

Tool (SmART), that is able to configure any application, regardless of how it represents its con- 

figuration. This tool was developed on the context of VIRTU project, which is a virtualization 

platform for the creation and management of VAs. 

1.2 Problem Description 

The problem of automatic application configuration is caused by software applications which 

employ different means of representing their configurations. In some cases, it was noticed that 

the same application implemented different configuration formats from one version to another. 

This lack of standardization reveals that the configuration aspects of the applications are put 

in the backburner by application developers. Currently, the solutions on the field of automatic 

configuration of applications are seldom. This work goes in a way as to explore the mentioned 

opportunity. 

VAs may have different formats and may contain all sorts of applications, so it would be 

very helpful if a tool could configure any kind of application inside a VA. For this matter, it is 

essential that the tool interprets each configuration file independently of the application being 

configured, rather than having a limited set of known configurations and trying to match them 

with the configuration file. 

Besides the differences in the representation of configurations between two disparate ap- 

plications, once can also identify divergent configurations in the same applications. Typically, 

whenever an application is upgraded, so are its configuration files, even if the differences in 

both situations come down to one or two extra lines containing new parametrizable elements 

than before. This motivates the abstraction from the configuration representation, which will 

allow any application to be configured based on what settings it defines, not how they are 

defined. 

Two ways to configure applications in VAs can be identified. The first sets up the config- 

uration files before the application is installed on the VA, whereas the second searches for the 

configuration files in the VA and changes them. Given the fact that the we are dealing with text 

configuration files which need to be read and parsed in both approaches, applications whose 

configuration files are in the binary format may require special parsers. This is often the case 

in closed-source applications. 

An example of the typical usage of this tool is the case where many VAs containing the 

same applications (e.g., webserver) need to be deployed on an intranet. Although these VAs 

have the same software, their configurations must be different (e.g., machine name, IP address, 

etc.). Following is a brief description of both approaches and the way in which each approach 

deals with the example case. 

3


1.2.1 Pre-installation Configuration 

Before the application is installed on a VA, there are two elements: the application installation 

package and the VA. In order to configure an application prior to its installation, the configura- 

tion files must be read, parsed and changed, to reflect the desired settings, outside the VA. After 

the configuration file is properly modified, the application package must be reconstructed to 

be ready for installation. 

This approach has two significant limitations. First, whenever a different configuration for 

another VA is required, a new package containing that configuration must be built. This results 

in the proliferation of packages for the same applications, differing only on their configuration 

files. 

Also, since this approach deals with uninstalled packages, it does not allow the same VA 

to be configured more than once. If an application is to be configured again, the process goes 

back to the beginning and the application has to be re-installed. 

1.2.2 Post-installation Configuration 

On the post-installation configuration, the configuration file is retrieved from an already built 

VA. The file is then parsed, changed outside the VA and will later replace the older version 

inside the VA. Getting the configuration file from inside the VA may be the toughest part of the 

process because it requires knowledge of the VA format, as well as the software inside it. For 

instance, in Fedora Core Linux distributions the Apache HTTP Server configuration files are 

usually stored in the /etc/httpd/ directory whereas in Ubuntu Linux distributions they are 

located at /etc/apache2/. 

In the example case, only one template VA, containing the applications, must be built. It is 

then replicated enough times to match the desired number of VAs. After all VAs are created, 

their applications can be configured with the tool any number of times without the need to 

reinstall the application. This clears the need of having to proceed with multiple installations. 

1.2.3 Application Configuration in General 

One way to treat different applications equally is to look for similarities in their structure. The 

fact is that the majority of configuration files are represented similarly. Although the formats 

or the languages of the configuration files tend to differ, different configuration files end up by 

implementing the same concepts, or patterns. The majority, if not all configuration files, use 

the notion of parameter to represent the setting of a value to an application variable. 

An analysis is now carried to find out what patterns are expected to be found in config- 

uration files by SmART. This analysis focuses the applications of some VIRTU project usage 

scenarios. These applications include: 

• Apache 

• Eclipse 

4


Listing 1.1: Apache configuration file excerpt 

# ServerRoot: The top of the directory tree under which the server’s 

# configuration, error, and log files are kept. 

ServerRoot "@@ServerRoot@@" 

Allow from all 

 

 

# 

# If you wish httpd to run as a different user or group, you must run 

# httpd as root initially and it will switch. 

User daemon 

Group daemon 

 

 

!includedir /etc/mysql/conf.d/ 

• MySQL 

• PostgreSQL 

• Mantis 

Following are presented some configuration file excerpts of the presented applications. 

Each excerpt is analysed to identify the commonly found patterns in each of them. 

Apache 

The patterns in the Apache HTTP Server [Fou09a] configuration files (Listing 1.1) are parameters 

formed as id - separator - value, where the separator is a space character and the value may have 

multiple instances; comments, which are text lines starting with the ‘#’ character; and blocks, 

which are sets of patterns delimited by a header and a footer. Besides these patterns, there 

are also special instructions, which start with a ‘!’ character, followed by a command key and 

further arguments. 

Eclipse 

The Eclipse [Fou09b] configuration files (Listing 1.2) are in a format similar to the eXtensible 

Markup Language (XML) format [Con03]. It mainly contains elements and blocks. The blocks 

may be empty or may contain other blocks. In case the blocks are empty, they are represented 

solely by a tag, whereas if a block contains other blocks, it is delimited by a starting and an 

ending tag. Additionally, a block may contain attributes, corresponding to parameters. 

Being XML, it also supports comments, of the form . 

5


Listing 1.2: Eclipse configuration file excerpt 

 

 

 

 

[mysqld] 

# 

# * Basic Settings 

# 

Listing 1.3: MySQL configuration file excerpt 

user = mysql 

pid-file = /var/run/mysqld/mysqld.pid 

socket = /var/run/mysqld/mysqld.sock 

port = 3306 

MySQL 

The MySQL [Mic09b] configuration files (Listing 1.3) are in a format similar to the INI file 

format (ahead in Section 1.3). A MySQL configuration file implements parameters formed as id 

- separator - value, where the separator is the ‘=’ character; comments which are text lines starting 

with the ‘#’ character; and blocks identified by an initial header and which may contain multiple 

parameters and/or comments. 

PostgreSQL 

The PostgreSQL [Gro09b] configuration files (Listing 1.4) are an amalgam of parameters and 

comments. A parameter is formed as id - separator - value, where the separator is the ‘=’ charac- 

ter and the value may be a keyword or a string delimited by the apostrophe character as the 

value. A comment may appear anywhere in the file and is identified by the ‘#’ character. Unlike 

the previous examples, the PostgreSQL configuration files do not implement the block pattern, 

although there seems to be a followed convention in which a set of parameters is preceded by a 

comment, which is similar to a block. 

Mantis 

The Mantis PHP Bug Tracker [Gro09a] configuration file (Listing 1.5) implements parameters of 

the form $ - key - separator - value, where the separator is the ‘=’ character and the value is a 

keyword or a string delimited by apostrophes or quotes. It also implements comments, which 

are text lines that start with the ‘#’ character. 

6


Listing 1.4: PostgreSQL configuration file excerpt 

#----------------------------------------------- 

# CLIENT CONNECTION DEFAULTS 

#----------------------------------------------- 

datestyle = ’iso, mdy’ 

timezone = unknown # actually, defaults to TZ 

#timezone_abbreviations = ’Default’ 

# --- database variables --------- 

Listing 1.5: Mantis configuration file excerpt 

# set these values to match your setup 

$g_hostname = "localhost"; 

$g_db_username = "mantisdbuser"; 

$g_webmaster_email = ’webmaster@example.com’; 

$g_allow_file_upload = ON; 

1.2.4 Conclusion 

The VIRTU project usage scenarios also allowed us to conclude that overall, configuration files 

either follow three formats: 

• Apache-like format; 

• INI-like format; 

• XML-like format. 

The Apache-like format is characterized by defining parameters as key/value pairs separated 

by spaces or a similar character, blocks delimited by a header and a footer, containing other 

patterns, and comments. 

The INI-like format defines parameters as key/value pairs separated by an equal, space or a 

similar character, blocks delimited by a header and comments. 

The XML-like format defines blocks as XML elements delimited by a header and a footer, if 

they contain other blocks, or a single tag if they do not contain other blocks, parameters as XML 

attributes contained in an XML tag, and comments. 

SmART should provide built-in support for these three configuration file categories, seen 

as most of the configuration files fit in one of these categories. However, the tool must also be 

extensible to support new configuration file formats. 

7

1. INTRODUCTION 1.3. Problem Approach 

1.3 Problem Approach 

Section 1.2 identified two ways to configure applications in virtual appliances: pre and post- 

installation. The pre-installation approach has no outstanding requirement and operates by 

generating as many application packages as different required configurations for that applica- 

tion, and does not allow for the re-configuration of an application in a practical fashion (i.e., 

without resorting to that application re-installation). On the other hand, the post-installation 

approach requires previous knowledge of the software running inside the VA but, contrary 

to the previous approach, each different configuration only requires a new configuration file 

version, and allows for the practical re-configuration of an application. Given the benefits and 

drawbacks of each approach, the post-installation approach requirement was found tolerable 

and moreover, allows for better functionality than the pre-installation approach and, therefore, 

SmART will follow this discipline. 

The adopted approach operates by conducting a lexical analysis on an application’s con- 

figuration files with the aim of extracting the relevant information for that application con- 

figuration from them. This raises a significant requirement which says that the configuration 

files must be in text format. A consequent implication of this requirement is that special for- 

mats, like the Windows OS registry or binary files, may not be configurable using the approach. 

However, there exist dedicated parsers for the treatment of those formats and these can be used 

by the framework to provide support for the mentioned kinds of formats. 

Still in the scope of the applications supported by SmART, Section 1.2.3 concluded that, at a 

conceptual level, the structure of the configuration files expected to be found more regularly by 

the tool can be reduced to three types: INI-like, XML-like and Apache-like. These three types were 

already presented and implement three patterns: parameters, comments and blocks, besides 

the patterns specific to a certain application. One way of tackling this problem is to reduce 

the configuration files of any application to a generic structure, dettached from its application 

specifics, which contains the relevant elements for its application configuration. This structure 

can then be modified without any knowledge of the application to which it belongs. Once 

altered, that structure is again converted to a configuration file in its original syntax, containing 

the applied modifications. 

1.4 Contributions 

This dissertation contributions for the automatic application configuration are: 

• Identification of commonalities amidst configuration files structures; 

• Proposal of a framework for the rebuilding of configuration files of an application; 

• Implementation of the proposed framework; 

• Provision of the VIRTU project with a state-of-the-art and innovative tool for application 

reconfiguration, independent of its vendor. 

8

1. INTRODUCTION 1.5. Document Organization 

1.5 Document Organization 

The rest of the dissertation is structured as follows: the next chapter covers the state of the art of 

the related subjects to this thesis, Chapter 3 describes the proposed architecture of the SmART 

tool, Chapter 4 explains the tool implementation decisions, Chapter 5 provides the validation 

process carried to the tool, Chapter 6 approaches the integration of SmART with the VIRTU 

virtualization platform, and Chapter 7 reveals the final considerations about this thesis. 

9

2 

State of the Art and Related Work 

This section aims at presenting the state of the art in automatic application configuration, as 

well as studying the tools which may be used to tackle the problem identified in Chapter 1. 

To better understand the outlines of the problem, this section starts by explaining the con- 

text of this dissertation. Namely, the virtual appliances theme is explored with the purpose of 

making it more natural. 

Afterwards, the existing techniques to deal with the problem are analysed. There is, never- 

theless, a particularity regarding this part. As previsouly mentioned in Chapter 1, the configu- 

ration of applications is still mostly a manual task. As such, the existing references considerably 

relevant to this theme are very few. Therefore, this dissertation consists on one of the first con- 

tributions for the problem of automatic configuration of applications. 

Finally, the adopted concepts and tools to mitigate the problem are analysed. The automatic 

file recognition theme is studied by identifying two parser generators and describing them. 

Finally, it is seen how the file can be represented in an abstract, independent from its original 

language, way. 

2.1 Virtual Appliances 

A Virtual Appliance (VA) is a minimal virtual machine (VM, Section 2.1.1) composed of pre- 

configured and pre-installed applications plus an optimized operating system called Just 

Enough Operating System (JeOS) (Figure 2.1). VAs are normally created to perform a specific 

task in the most effective way, therefore they only contain the essential and necessary resources 

for the execution of that task, contrary to the regular VM where all of the kernel OS features are 

present, even those whose use is seldom. 

11

2. STATE OF THE ART AND RELATED WORK 2.1. Virtual Appliances 

2.1.1 Virtualization 

Virtual Appliance 

JeOS 

Virtualization Layer 

Physical Hardware 

Figure 2.1: Virtual appliance 

Virtualization is a very broad term and refers to something which does not exist physically 

but appears to exist. Think of it in the context of virtual reality. With virtual memory, for 

example, computer software gains access to more memory than is physically installed, via 

the background swapping of data to disk storage. Similarly, virtualization techniques can be 

applied to other IT infrastructure layers such as networks, storage, laptop or server hardware, 

operating systems and applications. 

The concept of platform virtualization refers to the technique for hiding the physical char- 

acteristics of computing resources from the way in which other systems, applications or end 

users interact with those resources. In other words, virtualization allows multiple logical com- 

puting units to run on a single physical computing unit, like in the Figure 2.2. This is done with 

the aid of a software layer (alias virtualization layer, virtualization manager, virtual machine 

monitor or hypervisor) which provides the illusion of a “real” machine to multiple instances of 

“virtual machines”. 

Virtualization Layer 

Computational Unit 

Figure 2.2: Physical unit running four VMs 

12


The virtualization concept goes back to the 1960s when corporate mainframes were in a 

severe underutilization state [oV08]. By then, some IBM research personnel were working on 

the CP-40 OS designed to run on the System/360 time-sharing system. It was the first OS to 

completely virtualize a system (namely System/360) so that other S/360 OSes could be tested 

and executed. This system paved the way for virtualization as known nowadays and concepts 

pioneered by CP-40 are still actual, like the case where an abstraction layer catches traps and 

simulates them. Curiously, CP-40 is still supported by IBM mainframes (System z10). Presently, 

mainframes came to suffer from the same underutilization problem and so virtualization is en 

vogue once again. 

2.1.2 Motivation 

Nowadays, deployment via download or DVD defines the typical software distribution model. 

Despite its simplicity, this model features limitations such as deployment and configuration 

complexity. Other flaws that might be pointed are the huge number of application versions to 

maintain, which leads to IT resources draining; having to cope with a myriad of hardware plat- 

forms, which leads to intensive time-consuming testings; Independent Service Vendors (ISVs) 

might see their revenues decreased since if the customer is content with the product, he will 

only upgrade when a groundbreaking feature comes out; and so forth. 

As a response to these challenges, some alternative deployment models arose. Such is the 

case of Software as a Service (SaaS), hardware appliances and virtual appliances. 

Software as a Service (SaaS) 

This approach consists on providing a service by hosting a software application that customers 

over the Internet can use. In this way, the need for install and configure on the customer’s 

computer is eliminated, so problems as software maintenance and support are no longer cus- 

tomer’s responsibility but software vendor. The revenue becomes different: customers pay 

when they want to use the service, rather than having a single expense when purchasing the 

application. 

However, this will turn the application vendor into a service provider that has to maintain 

its infrastructure. Also, given the variety of existing open-source solutions and the inexpensive 

hardware, it appeals economically to the customer to keep the application running under its 

control. 

Hardware Appliances 

Another approach to software distribution is the creation of hardware appliances bundling 

the application with hardware capable of running it. The customer just needs to plug in the 

appliance and turn it on. This approach also clears the need for the customer to install and 

configure the application, but this forces the application vendor to enter a different market, the 

hardware market. 

13


Virtual Appliances 

The existing hardware virtualization techniques allow the leveraging of the hardware appli- 

ances advantages without the hardware requirement. The virtual appliance vendor packs in 

a set of required applications with a JeOS on a virtual machine that can be executed on the 

vast majority of existing hypervisors given the appearance of virtualization format standards 

(OVF). The deployment of the appliance is simple and the appliance comes pre-configured. 

The appliance vendors are able to customize the applications and guest OS as much as they 

want, thus leading to unseen efficiency. Updates are vendor’s responsibility and just require 

him to address the flaws identified in the former version, produce an upgraded VA and deploy 

it in the customer’s hardware. Management is made dramatically simpler by the use of man- 

agement tools created by the appliance vendors [Sta07]. Moreover, VAs supply customers with 

all the benefits of virtualization: increased hardware use, reduction of physical resources, etc. 

2.1.3 Open Virtual Machine Format 

The Open Virtual Machine Format (OVF) is a joint effort of various virtualization solution ven- 

dors such as XenSource, VMware and Microsoft as a part of the Distributed Management Task 

Force Inc. It consists on an open, secure, portable, efficient and extensible format for packaging 

and distribution of VMs [VX]. With OVF, VMs which are created on one platform can also be 

managed and used on other platforms, providing platform independence, although it enables 

platform-specific enhancements to be captured. 

OVF differs from other formats such as VMware’s VMDK and Microsoft’s VHD since these 

are run-time VM images. Although they are currently used for VM transportation, they do not 

address problems such as multi-tiered VMs consisting of multiple independent VMs, or VMs 

with multiple disks. 

Another important aspect in the scope of OVF is its integrity. OVF was designed so that 

the VM configures itself, without the need for the virtualization platform where it is installed 

to recognize the VM’s file system. This is particularly useful since it allows VMs to run on any 

OS and virtualization platform which supports the OVF format. 

OVF is extremely useful in the Virtual Appliance case since it provides essential foundations 

for VA’s such as efficiency, portability, ease of use and distribution, etc. 

An OVF package can be stored as a single file using the TAR format under the .ova extension 

(open virtual appliance or application). It consists of the following [DMT08]: 

• one OVF descriptor file (descriptor file or .ovf file) containing the package metadata and 

its contents; 

• zero or one OVF manifest file (manifest file or .mf file) containing the SHA-1 digests of 

individual files in the package; 

• zero or one certification file (certification file or .cert file) containing the OVF package 

signature digest along with the base64-encoded X.509 certificate. 

14

2. STATE OF THE ART AND RELATED WORK 2.2. Automatic Configuration of Applications 

• zero or more disk images files 

• zero or more additional resource files, such as ISO images 

OVF specification v1.0.0 has been released on September, 2008 and is prone to intensive 

development. 

2.1.4 Virtual Appliances in Real Life 

A good example of an application of virtual appliances is the Amazon Elastic Compute Cloud [Ama09], 

also known as Amazon EC2. It is a service which allows customers to rent hosted computing 

capacity in order to run their applications. The customers can create a VA comprised by appli- 

cations, libraries, data and associated configuration settings or use a pre-constructed image and 

then upload it in the Amazon Machine Image format to the Amazon storage service, Amazon 

Simple Storage Service (S3). VAs provide the means for elastic computing by enabling a fast 

and efficient way for service expansion. 

Whereas previously SaaS solutions failed to succeed due to the lack of control of the ser- 

vice from the customer and the maintenance costs involved, Amazon EC2 uses multiple Xen 

hypervisor nodes so that customers can build VAs containing their applications and host them 

outside their domain. In this way, the vendor is able to provide the service desired by the 

customer without the need to maintain the solution, focusing on keeping a secure, reliable, ef- 

ficient and inexpensive environment for the VAs to run on. On the other hand, the customers 

now have the control over their services and may update them any time necessary without hav- 

ing to wait for the vendor. Furthermore, customers are only billed for what they consume (i.e., 

instance uptime or data transferred), opposed to paying an established fee even if the service 

usage is low or non-existant. 

2.2 Automatic Configuration of Applications 

To the best of our knowledge, our approach is the first to exploit the similarities among con- 

figuration files to allow for automatic, vendor-independent and on-the-fly application recon- 

figuration. Similar existing projects, such as AutoBash [SAF07] or Chronus [WCG04], take on 

automatic application configuration as a way to assist the removal of configuration bugs. Auto- 

Bash employs the causality support within the Linux kernel to track and understand the actions 

performed by a user on an application and then recurs to a speculative execution to rollback 

a process, if it moved from a correct to an incorrect state. Chronus, on the other hand, uses a 

virtual machine monitor to implement rollback, at the expense of an entire computer system. 

It also focuses a more limited problem: finding the exact moment when an application ceased 

to work properly. 

Two other projects, better related to this work, are Thin Crust [Cru08] and 

SmartFrog [GGL + 09]. Both projects aim at automatic application configuration, but take a 

different approach than ours. Following is a brief summary of both. 

15

2. STATE OF THE ART AND RELATED WORK 2.3. Automatic Recognition of Configuration Files 

2.2.1 Thin Crust 

Thin Crust is an open-source set of tools and meta-data for the creation of VAs. It features 

three key components: Appliance Operating System (AOS), Appliance Creation Tool (ACT) 

and Appliance Configuration Engine (ACE). The AOS is a minimal OS built from a Fedora 

Kickstart file 1 , which can be cut down to just the required packages to run an appliance. The 

user may download a ready-to-run AOS image or may create his own through the ACT. The 

ACE is ran at VA boot time and loads the appliance recipe. The appliance recipe contains VA 

metadata and the modules used by the VA. If it does not match the appliance configuration, 

the latter is changed to reflect the changes. Finally, Thin Crust supports VMware, KVM and 

EC2 (see Section 2.1.4) by providing conversion tools from and to the mentioned formats. 

2.2.2 SmartFrog 

SmartFrog (Smart Framework for Object Groups) is a framework for the creation of 

configuration-based systems. Its objective is to make the design, deployment and management 

of distributed component-based systems simpler and more robust. It defines a language to 

describe component configurations and a runtime environment to activate and manage those 

components. 

One of the major causes for the problems in large distributed systems design is identified 

by SmartFrog as the ad-hoc way in which such systems are designed. This results in application 

configuration data scattered by the system, often causing its repetition. 

In SmartFrog, there is the notion of a SmartFrog component and a SmartFrog system, which is 

composed by various such components. A SmartFrog component is constituted by three parts: 

the lifecycle manager, which applies a configuration on a component; configuration data, which 

defines the behaviour of a component; and the managed component, which may be a piece of 

software implementing any functionality-specific interface. Each component is monitorized by 

its lifecycle manager and must report any failure to the other components of the system. The 

components can also be distributed on a network. 

2.3 Automatic Recognition of Configuration Files 

The lexical analysis of the file with the objective of producing that same file in a different struc- 

ture provides a way for the tool to abstract itself from each configuration file language specifics, 

refining the configuration file down to just the essential for configuration (i.e., the elements 

needed to modify the application settings, such as parameter keys and attributes). The lexi- 

cal analysis of a file is accomplished by a parser, which recognizes the file original syntax and 

produces a version of the same file in a different structure. 

Configuration file parsers are produced by grammar compilers (or parser generators). A 

grammar compiler reads a grammar that recognizes a certain configuration file language and 

1 

16

2. STATE OF THE ART AND RELATED WORK 2.3. Automatic Recognition of Configuration Files 

compiles it, resulting in the generation of a parser for a specific configuration file language. 

Following are presented some parser generators: 

2.3.1 SableCC 

SableCC [GH98] is an object-oriented framework that generates compilers, or parsers, in the 

Java programming language. This framework is based on two fundamental design decisions. 

Firstly, the framework uses object-oriented techniques to automatically build a strictly-typed 

abstract syntax tree that matches the grammar of the compiled language and simplifies debug- 

ging. Secondly, the framework generates tree-walker classes using an extended version of the 

visitor design pattern which enables the implementation of actions on the nodes of the abstract 

syntax tree using inheritance. 

SableCC specification files do not contain any action code. Instead, SableCC generates an 

object-oriented framework in which actions can be added by defining new classes containing 

the action code. This leads to a shorter development cycle, allowing for the rapid prototyping 

of SmART. 

Given a grammar, SableCC’s produces a set of packages, namely: 

• analysis: contains different tree walkers (e.g., depth first, reversed); 

• lexer: contains the elements needed for the lexical analysis of the grammar; 

• node: classes containing node representations; 

• parser: contains the parser class. 

None of the automatically produced classes ought to be modified. Instead, all of the user 

generated code should be put in classes extending those created by SableCC. By doing this, 

every time a problem arises (e.g., error in a grammar), it can be easily located and corrected by 

modifying only a small portion of the code. Another aspect of discussible importance is that 

SableCC is distributed under the GNU Lesser General Public License. 

Notwithstanding, SableCC also reveals some significant shortcomings. SableCC does not 

support error productions. In other words, when a generated parser comes upon an unrecog- 

nizable token in a configuration file, it does not allow the parse to advance further beyond that 

token, consequently halting the parse on that position. Since error productions are a common 

feature on parser generators, this lackage was not noticed upon the choice. This, in turn, sev- 

ered the ability to produce multiple recognized file blocks to just one. The parsing statistics 

also become unreliable, since a parser might not recognize the first token a configuration file 

and recognize the rest of the file, but without error productions, parsing will inevitably stop in 

the beginning of the file and that parser will be reported to have parsed 0% of the file, when in 

fact it would manage to parse nearly 100% of the file. 

17

2. STATE OF THE ART AND RELATED WORK 2.4. Representation of Configuration Files in a Generic Syntax 

2.3.2 JavaCC 

The Java Compiler Compiler (JavaCC) [Pro09] is a tool that reads a grammar specification and 

converts it to a Java program that can recognize matches to the grammar. In addition to the 

parser generator itself, JavaCC provides other standard capabilities related to parser generation 

such as tree building, actions, debugging, etc. 

JavaCC generates top-down parsers (i.e., start at the root of the derivation tree, picks a pro- 

duction and tries to match the input tokens [oM]), declares the lexical and grammar specifica- 

tions in one file, making grammars easier to read and maintain, supports semantic lookahead, 

uses the Extended Backus-Naur Form (i.e., operators *, + and ?) and supports error recovery in 

two forms: shallow recovery and deep recovery. In shallow recovery, if an input token does not 

match any production, it is possible to move to the next desired token (e.g., semicolon). Deep 

recovery is similar to the shallow recovery, but also allows to recover from an error inside a 

production, which is impossible resorting to shallow recovery alone. 

2.4 Representation of Configuration Files in a Generic Syntax 

The concept of generic syntax refers to a language-independent syntax used in the process of 

automatic application configuration. The generic syntax is used to abstract the configuration 

files from their original languages. When a configuration file is represented in a generic syntax, 

another independent entity which recognizes that syntax is able to manipulate the file without 

having knowledge of what the original syntax was, or what application uses that file. The cho- 

sen generic syntax must provide a practical way to traverse and manipulate files. Furthermore, 

it must be widely supported and well documented. Its software license must also allow it to be 

used freely. 

Considering the prievous requirements, the chosen syntax was the eXtensible Markup Lan- 

guage (XML) since it is a broadly adopted and supported standard. XML allows for the struc- 

turing of configuration files in a way that makes them easily manipulable and traversable. 

Some available implementations of XML parsers were identified in many programming 

languages. The criteria to choose from an XML parser implementation includes factors such as 

the programming languages it is available on and what it allows the user to do resulting data 

structure from the parse. 

Since the programming language used to develop the tool is going to be Java, two major 

XML parser implementations in Java were considered: the Simple API for XML (SAX) and the 

Domain Object Model (DOM). Following is a brief description of both. 

Simple API for XML 

The Simple API for XML (SAX) is an Application Programming Interface (API) for XML in Java, 

although it is available in many other programming languages. It is event-driven: it reads from 

a stream and reacts when processing instructions, elements, comments and text. 

18

2. STATE OF THE ART AND RELATED WORK 2.4. Representation of Configuration Files in a Generic Syntax 

Being event-oriented, SAX does not implement a representation of the document in mem- 

ory and, therefore, memory accesses are seldom in quantity. This leads to some memory inde- 

pendence and fairly good memory consumption ratings, also excelling in large files parsing. 

SAX parses in an uni-directional way since it does not allow previously read data to be read 

again, without restarting the parser. This consists on a problem when a whole XML document 

must be in memory in order to be properly parsed, as SAX requires the user to start the parsing 

again to get values previously obtained. 

Domain Object Model 

DOM is a cross-platform and language-independent interface for creating and manipulating 

XML documents in memory. It is a World Wide Web Consortium (W3C) standard. 

It loads an XML document to memory and allows for dynamic access to it, allowing for 

bi-directional parsing. DOM represents an XML document in memory in a tree-like structure. 

Furthermore, it is very well supported in many programming languages, including Java. 

Since DOM keeps a document in memory after it is parsed, a drawback is the time spent 

to parse a big file on the grounds that many memory allocation requests might sum up to a 

considerable time. 

19

3 

An application reconfiguration 

framework 

Chapter 1 identified the problem of automatic application configuration, along with the con- 

cept of pattern in a configuration file, whereas Chapter 2 studied the existing tools and tech- 

nologies that can help to mitigate the problem. Using that knowledge, this chapter presents the 

framework for automatic application configuration, the major contribution of this dissertation. 

In truth, the proposed framework reconfigures applications (possibly fresh installations or 

deployments of an application) instead of configuring them at deployment time, but for the 

sake of clarity, the problem will be addressed in the remainder of the dissertation as the auto- 

matic configuration of applications. The proposed framework requires three steps to config- 

ure an application automatically: conversion from original to generic syntax, file modification 

and conversion back to original syntax. Only the first and third belong to the scope of this 

work. The configuration resorts to the lexical and syntatic analysis of the configuration file and 

subsequent production of a data structure which is equivalent to the original file. After the 

data-structure is properly modified, it is reconverted back to the original file form. 

We start by identifying the functional and non-functional requirements of the tool in Sec- 

tion 3.1, Section 3.2 provides a broad view over the solution proposal, Sections 3.3 and 3.4 detail 

the required components in each step and their interfaces, Section 3.5 approaches the file mod- 

ification part by proposing some ways to modify the intermediate file in generic syntax and 

providing some information about the generic syntax and Section 3.6 deals with the support 

for file formats not recognizable by the framework. 

21

3. AN APPLICATION RECONFIGURATION FRAMEWORK 3.1. Architectural Requirements 

3.1 Architectural Requirements 

Before approaching the proposal of a framework for the automatic application configuration, 

the tool requirements must be identified. This section presents the functionalities which the 

tool is expected to provide, in the form of use cases. A use case defines a goal-oriented set 

of interactions between external actors and the tool. Following is a list of functional and non- 

functional requirements: 

1. The user must be able to convert a configuration file syntax from its original syntax to 

a generic one, independently of the application. 

Description: The user must be able to try parsing the configuration file with the available 

parsers. 

Non-functional requirements: 

Performance: The generated file with the generic syntax must be as simple as possible. 

2. The user must be able to define grammars for configuration file languages. 

Description: If there are no suitable parsers for a given configuration file, the user must 

be able to define a grammar that recognizes the new configuration file language. 


Usability: The grammar definition syntax should be a broadly adopted one. 

Usability: To ease the parser generation process, the user must be able to iterate through 

the previously built grammars so as to rollback any change made on a grammar. 

3. The user must be able to produce a parser from a grammar. 

Description: When the user builds a grammar for a new configuration file language, there 

must be a means to compile that grammar in order to generate a parser. 


Usability: The user must be able to add parsers built outside the tool. 

4. The user must have access to the parser compilation trace. 

Description: When a grammar is compiled, the user should be able to check on the parser 

compilation trace to see if it was successfully compiled or whether any error persists in 

the grammar declaration. 

22

3. AN APPLICATION RECONFIGURATION FRAMEWORK 3.1. Architectural Requirements 


Usability: The error messages must clearly identify the source of the error. 

5. The user must receive information relative to the grammar fitness with a given config- 

uration file. 

Description: When the user attempts at parsing a configuration file with a given parser, 

he/she must be informed if the parsing was successful or not, as well as the quantity of 

file recognized by that parser. 


Usability: The sections of the configuration file that were parsed and those that were not 

must be clearly identified. 

6. The user must be able to store a functional parser generated by the tool, in order to be 

used on later tool runs. 

Description: When the user checks a new parser to parse a configuration file entirely, 

he/she must be able to store it in a repository so it can be reused. 


Security: The user must be able to see the parsing outcome in order to check if the parser 

is indeed operating as intended to. 

7. The user must be able to reconvert a configuration file syntax from generic syntax to 

the original one. 

Description: Given a configuration file in generic syntax, the user must be able to convert 

it into its original syntax, keeping its functionality unharmed. 

8. File conversion must not eliminate comments. 

Description: On the original to generic syntax conversion, the comments should be stored 

in order to show up in the final file. 

9. The user must be able to halt the tool execution at some point and continue the config- 

uration process later. 

Description: In the middle of the configuration process (i.e., right after the original to 

generic syntax conversion), the administrator should be able to halt the execution flow, 

have the necessary data stored, such as the file in the generic syntax or the tentative 

23

3. AN APPLICATION RECONFIGURATION FRAMEWORK 3.2. Automatic Application Configuration 

parsers, and continue from the same state later. 

10. The user must be able to manually delete parsers from the parser repository. 

Description: The user must be able to manually delete parsers from the parser repository 

so that the user may intervene if the parser repository becomes to big, or if a certain parser 

becomes obsolete. 

3.2 Automatic Application Configuration 

Having identified the requirements for the tool, it is now possible to tackle a possible solution 

for the problem. This section presents a framework which is the solution proposal for the 

automatic configuration of applications. 

The application configuration files are read by the tool, which parses them in order to con- 

vert their syntax from the original to a generic, application-independent, one. This allows for 

the configuration files to be manipulated and altered in the same way, for any application, 

through an agent which recognizes the generic syntax. 

Being in generic syntax, the file can be efficiently modified by an agent which operates on 

that kind of syntax, such as an automated script. Ensuing the file modification, the tool is 

responsible to do the inverse conversion, back to the file original form. Once the file has been 

converted to the generic syntax, the characteristics of the original syntax have been aprehended 

and can be used to reconvert the file back to its original syntax. 

The configuration process (Figure 3.1) can be divided into three main stages: 

1. Configuration file extraction; 

2. Original to generic syntax conversion; 

3. File modification; 

4. Generic to original syntax conversion. 

The second stage is accomplished through an Original to Generic syntax Converter (OGC) 

and the fourth by a Generic to Original syntax Converter (GOC). This dissertation points ways 

in which OGC and GOC might evolve. The first and third stages are not on the original work 

scope and therefore are not tackled in detail. Nevertheless, Section 3.5 approaches the file 

modification area superficially, hinting at some possible ways of dealing with the subject and 

identifying possible operational constraints. 

3.3 Original to Generic syntax Converter 

The Original to Generic syntax Converter (OGC) is structured like a three-tier 

architecture [Ram00]. The three-tier architecture consists of a software design pattern which 

24

3. AN APPLICATION RECONFIGURATION FRAMEWORK 3.3. Original to Generic syntax Converter 

!!""#$%&'$()* 

+$)&,$-. 

!/()0$12,&'$()* 

0$#-. 

!""#$%&'$()*0$#-. 

? 

5,$1$)&#*'(*1-)-,$%* 

.3)'&4*%()6-,'-, !/()0$12,&'$()*0$#-* 

!/()0$12,&'$()*0$#-* 

$)*(,$1$)&#*.3)'&4 

< 

$)*1-)-,$%*.3)'&4 

7$#-*8(9$0$%&'$() 

= 

Figure 3.1: SmART Work Flow 

;-)-,$%*'(*(,$1$)&#* 

.3)'&4*%()6-,'-, !!""#$%&'$()* 

!:(9$0$-9* 

%()0$12,&'$()*0$#-* 

$)*1-)-,$%*.3)'&4 

> 

+$)&,$-. 

!:(9$0$-9*0$#-*$)* 

(,$1$)&#*.3)'&4 

!""#$%&'$()*0$#-. 

Component Description 

User Interface The User Interface makes the link between the user and the 

tool. It can be thought of as a command interpreter, which 

reacts to user orders by calling other tool components. Additionally, 

it is used to simplify some otherwise complex 

tasks, such as building grammars for configuration files. 

Table 3.1: Presentation Layer Components 

defines three inter-related tiers, each with its own responsibilities. In this work context, the 

tiers are: 

Presentation Layer. Contains the presentation logic, including simple control and user input 

validation. 

Logic Layer. Contains the processing logic and the data access. 

Storage Layer. Provides the data storage. 

In this topology, each layer is modifiable without interfering with the others, increasing 

modularity. 

The three-tier architecture is depicted in Figure 3.2. However, the OGC does not follow a 

strict three-tier architecture, as later on, it is seen that some components on the presentation 

layer interact with those on the storage layer. 

The components employed by the OGC are now briefly described. Table 3.1 presents the 

components used by the Presentation Layer, Table 3.2 contains the description of the Logic 

Layer components and Table 3.3 introduces the components used by the Data Layer. 

3.3.1 Component Description 

This section provides a description for each component required by OGC. First, the task of each 

component is briefly analysed and then, the operations of each component are more thoroughly 

25


Presentation Layer 

User 

Interface 

Logical Layer 

Configuration 

File Parser 

Storage Layer 

Parser 

Repository 

Grammar 

Compiler 

Tentative 

Grammar 

Repository 


Code 

Generator 

Figure 3.2: Components overview 

Logical Layer 

Storage Layer 


Configuration File Parser The Configuration File Parser recognizes the diversity of 

configuration files and parses them in order to change their 

structure to an application-independent one; 

Code Generator The Code Generator reads the parsed output from the Configuration 

File Parser and generates the generic syntax of 

the file; 

Grammar Compiler The Grammar Compiler is the component which generates 

new parsers from grammars. 

Table 3.2: Logic Layer Components 


Parser Repository The Parser Repository is a parser database which holds all 

the functional parsers generated to the date; 

Tentative Grammar 

Repository 

The Tentative Grammar Repository stores the grammars 

built by the user on the process of creating a new parser 

for a configuration file language. 

Table 3.3: Data Layer Components 

26


Listing 3.1: User Interface proposed interface 

interface UI { 

void processFiles(String[] fileNames); 

void submitGrammar(); 

byte[] getTentativeGrammar(); 

void approveParser(); 

boolean importParser(); 

void haltSession(); 

void recoverSession(); 

} 

described. The interfaces that every component should implement are also presented, in a 

Java-like syntax. Enforcing every component to implement an interface not only allows any 

component to be completely remade without having any effect on other components, it also 

aids the integration process of SmART with other tools. 

User Interface 

The User Interface (UI, Listing 3.1) is the tool’s front-end, through which the user interacts with 

the tool. It is invoked by the user and receives an array of configuration file locations at boot 

time, although it only processes one at a time. 

For each file, UI summons the Configuration File Parser to make an attempt at 

parsing a configuration file with the parsers in the Parser Repository. Two outcomes are 

possible: either a parser manages to recognize the totality of a file or none does. 

The first case is the ideal case, where at least one of the parsers in the Parser Repository 

manages to fully recognize the configuration file. In this situation, UI displays the representa- 

tion of the generic syntax, as returned by the Configuration File Parser, to let the user 

validate the generated structure. At this point, the configuration file in generic syntax is stored 

in memory, so UI must save it to a file to let file modification take place. 

The second case, where the configuration file was not completely parsed by any available 

parser, requires user intervention. If the file could not be totally recognized by any parser, 

another parser which recognizes this new language must be generated. For this endeavour, 

UI provides the user with a friendly interface which allows him to define a new grammar 

that recognizes the configuration file language. The user must also be allowed to build a new 

grammar from an existing one. This is useful in the case where a parser managed to parse a 

configuration file almost entirely. In this case, the new parser might be ready just by tuning an 

existing one, instead of having to build it from scratch. 

As soon as a grammar is defined, the user triggers its compilation with the Grammar 

Compiler’s submitGrammar call. Any grammar compilation errors should be displayed to 

the user. To ease parser generation, the user must have access to the previously tested gram- 

mars. This allows for the rollback of changes in grammars in an easy way. To accomplish 

this, every time the user submits a new grammar, UI sends it to the Tentative Grammar 

27


Listing 3.2: Configuration File Parser proposed interface 

interface CFP { 

Map doParse(String fileName); 

ParsingData doParse(String fileName, Parser parser); 

} 

Repository. Then, the user can fetch the previsouly built grammars through the 

getTentativeGrammar call. 

The parsers generated by the Grammar Compiler must be validated by the user to check 

if the new parser is able to parse the configuration file, or if the generic syntax produced by the 

parser is indeed the desired one. UI must handle this information and show it to the user in 

a user-friendly optic (e.g., graphically). Then, if the user approves the parser, he can summon 

the approveParser method to store the parser with the remaining parsers in the Parser 

Repository. Finally, UI stores the generic syntax in memory on a file. 

Besides the presented basic functionality, UI offers some more utilities to the user. Namely, 

the importParser method allows the importation of external parsers if the user is already 

in possession of a functional parser for a new configuration file. Additionally, the user may 

suspend the tool’s execution and resume it later through the haltSession method. This 

method stores the tentative grammars to disk, as well as other volatile data in memory, such as 

the produced file in generic syntax, if it was already produced. The recoverSession method 

obtains all the stored data to resume a previously saved session. 

Configuration File Parser 

The Configuration File Parser (CFP, Listing 3.2) can either parse a configuration file with every 

parser in the database or just with a single one. The parsers analyse the configuration files in 

their original syntax, declared in a grammar which defines the tokens and productions of that 

syntax, and produce the abstract syntax tree (AST) of the file. 

The ASTs generated by the parsers must follow the typing indicated in Section 1.3. There- 

fore, nodes composing an AST are either parameters, blocks, comments or nodes corresponding 

to new patterns. By assuring that the ASTs are typed in such a way, it is possible to conceive a 

uniform Code Generator, which is able to traverse the AST of any application. Furthermore, 

strictly typed ASTs guarantee that the generic syntax is constant for every application. In other 

words, the generic syntax will be composed of elements like parameters, blocks, comments and 

specials, which is a standard recognized by the file modification agent. 

CFP may iterate the available parsers in the Parser Repository and attempt to parse 

the input file with each one of them. Alternatively, CFP may also be called to parse a file with a 

newly generated parser if the file was not entirely recognized with any of the existing parsers. 

When a parser is tested, it returns the configuration file intervals which were recognized, 

along with the AST of each interval. The ASTs are sent to the Code Generator to be con- 

verted into generic syntax, in order to help the user analyse and validate the output produced 

28


,()&/.#*+0+456 

,()&/.#*+0+ 

123 

,()&/.#*+7+ 

123 

,()&/.#*+7+456 

!"#$%&'()*%"#+,%-. !"#$%&'()*%"#+,%-. 

,%-.+456 

Figure 3.3: Partial and totally parsed configuration file stages 

Listing 3.3: Code Generator proposed interface 

interface CG { 

Document doTranslate(List ast); 

} 

,%-.+123 

by a parser. If a file has been completely recognized, the interval returned by the parser is the 

same as the overall original file length and the only returned AST will correspond to the whole 

file. The stages of the file in both scenarios are depicted in Figure 3.3. Since the CFP tests a file 

with every parser in the Parser Repository, it returns a map of every parser and the data 

regarding the parsing attempt with that parser. On the other hand, if the CFP tests a file with a 

single parser, it returns the parsing data with that parser. 

Code Generator 

The Code Generator (CG, Listing 3.3) traverses the ASTs created by the parsers and produces 

the generic syntax for a configuration file or a configuration file fragment. Besides that, CG 

must also capture the properties of the configuration file original syntax and save them in the 

generic syntax in order to ease the opposite conversion process, from generic to original syntax. 

CG is able to correctly traverse any AST, provided that this follows the typing described in 

Section 1.3, with parameters, comments, blocks and other types meanwhile defined. Moreover, 

the generic syntax produced by CG must be the same for any kind of configuration file so as to 

detach the Original to Generic syntax Converter from the Generic to Original syntax Converter. 

More considerations on the generic syntax produced by CG can be found in Section 3.6. 

Grammar Compiler 

To allow the generation of new parsers for the recognition of new configuration file languages, 

the Grammar Compiler (GC, Listing 3.4) was created. GC produces parsers from user-built 

29


Listing 3.4: Grammar Compiler proposed interface 

interface GC { 

CompilationData doCompile(byte[] grammar); 

} 

Listing 3.5: Parser Repository proposed interface 

interface PR { 

Parser storeParser(ParserID parserID, ParserInt parser, byte[] 

grammar); 

Parser importParser(ParserID parserID, ParserInt parser); 

Iterator getIterator(); 

void deleteParser(String parserID); 

void saveState(); 

} 

grammars. For this, CG may make use of an outside parser generator, such as Bison [FSF08], 

SableCC [GH98], JavaCC [Pro09], etc., by invoking it whenever a grammar is received. Also, 

when a grammar is not valid, CG must return an error message containing information about 

the problem. 

Parser Repository 

The Parser Repository (PR, Listing 3.5) is a parser database which manages the parsers used 

by the tool to that date. It maintains a map of ParserID, which is the parser name and must 

uniquely identify it, and Parser, used to call a given parser. PR provides a number of func- 

tionalities to the user. These functionalities are now enumerated. 

The parsers generated by the Grammar Compiler may be stored by the PR if the user has 

approved them. The storeParser method saves a parser in the database, given its name, 

which uniquely identifies it, the parser and its grammar definition. 

If a user possesses an already built parser for a new configuration file, it is possible to im- 

port that parser into the tool. The importParser method saves an external parser in the 

database. The importation of external parsers is, nevertheless, conditioned by a pre-defined in- 

terface which forces external parsers to implement an invocable method. This matter is further 

discussed in Section 3.6. 

Besides inserting parsers into the database, it is also possible to obtain an iterator for the 

existing parsers in the parser repository through the getIterator method, to delete a parser 

from the database with the deleteParser method and to save the state of the database by 

calling the saveState method. 

30


Listing 3.6: Tentative Grammar Repository proposed interface 

interface TentativeGrammarRepository { 

Grammar store(GrammarID grammarID, byte[] grammar); 

Grammar next(); 

Grammar prev(); 

Grammar get(int grammarNumber); 

void discard(int grammarNumber); 

void discardAll(); 

void saveState(); 

} 

Tentative Grammar Repository 

To respond to the architectural non-functional requirement which says that the user must be 

able to iterate through the previsouly tested grammars, the Tentative Grammar Repository 

(TGR, Listing 3.6) was conceived. TGR is a database which maintains a list of the grammars, 

functional or not, created by the user so far, in the process of building a new parser. Each 

grammar in the database knows what grammar was built before and after it. 

To store a grammar, the store method receives a GrammarID with the grammar name and 

a byte array containing the grammar declaration. 

The grammars are retrievable in two ways. The user can obtain grammars on an undo/redo 

fashion using the next and prev methods, which return the grammar built immediately be- 

fore or after. However, if the user desires to get a specific grammar in time, it can be obtained 

using the grammar serial number which is unique to each grammar. 

The discard and discardAll methods delete grammars from the database. The discard 

method discards a grammar and those based on it, while the discardAll method discards 

every grammar in the database. 

Finally, the state of the database can be saved, with the saveState, whenever the tool ex- 

ecution is halted. 

3.3.2 Original to Generic Syntax Converter Operation Scenarios 

Depending on whether a parser for a configuration file exists in the Parser Repository, 

OGC may have two different execution scenarios. For a clearer perception of the architecture 

and to give a better insight on the interactions between components, both scenarios are now 

briefly summarized. 

Considering that there is a parser for a given configuration file in the database, the original 

to generic conversion unfolds as follows (Figure 3.4): 

1. The User Interface calls the Configuration File Parser to parse the configu- 

ration file; 

31



User 

Interface 

Logical Layer 

Configuration 

File Parser 

Storage Layer 

Parser 

Repository 


Generator 


Logical Layer 

Storage Layer 

Figure 3.4: Interactions on the successful file recognition scenario 

2. The Configuration File Parser gets the parsers from the Parser Repository 

and tests the configuration file with all of them. By assumption, at least one parser is 

able to parse the configuration file entirely, therefore, at least one valid AST for the entire 

configuration file is produced; 

3. The Configuration File Parser sends the generated ASTs to the Code Generator 

and returns User Interface the generic syntax of the file for each parser; 

4. Finally, the User Interface chooses a valid file in generic syntax and stores it to disk. 

On the other hand, if a configuration file is new to the tool, the process requires more com- 

ponents to operate. The previous process is performed normally until step 2, where no parser 

was found to be able to parse the configuration file by the User Interface. Afterwards, the 

action flow is as follows: 

1. The user must build a grammar through the User Interface so that the Grammar 

Compiler generates a parser from it. If a grammar declaration is not valid or if the 

generated parser still cannot parse the configuration file, another grammar must be built 

until the configuration file is parsed; 

2. Meanwhile, every grammar produced by the user is stored in the Tentative Grammar 

Repository for the user to easily rollback any change made to a grammar; 

3. When a parser successfully recognizes a configuration file, the user has the last word as 

he may approve or disapprove the structure produced by that parser; 

4. If the user does not approve of a parser, another one must naturally be generated; 

32

3. AN APPLICATION RECONFIGURATION FRAMEWORK 3.4. Generic to Original syntax Converter 


User Interface 

Logical Layer 

Configuration 

File Parser 


Generator 

Storage Layer 

Tentative 

Parser 

Repository 

Grammar 

Repository 


Grammar 

Compiler 

Logical Layer 

Storage Layer 

Figure 3.5: Interactions on the failed file recognition scenario 

interface Printer { 

void print(String file); 

} 

Listing 3.7: Printer proposed interface 

5. If a parser is approved, it is stored in the Parser Repository and the Tentative 

Grammar Repository may be emptied. 

Afterwards, the course of action returns to normal: the AST produced by the parser is sent 

to the Code Generator and the resulting file in generic syntax is stored to disk by the User 

Interface. The interactions between components on this process are depicted in Figure 3.5. 

Annex A.1 is a sequence diagram showing how components interact with each other, and 

in what order. 

3.4 Generic to Original syntax Converter 

The conversion from generic syntax back to the original one is accomplished solely by one 

component, the Printer (Listing 3.7). 

The Printer is able to convert files in generic syntax to any configuration file format. It 

does so by using the information on the file’s original syntax, captured by the Code Generator 

in Section 3.3. 

The outcome of the Printer is the modified file in its original syntax, which remains 100% 

functional and is ready to be re-merged with the other application files, causing the application 

to become configured. 

33

3. AN APPLICATION RECONFIGURATION FRAMEWORK 3.5. Modifying the Configuration File 

3.5 Modifying the Configuration File 

Despite not being part of this work, some possible ways to deal with the configuration file 

modification theme were identified. 

In the presence of a configuration file, structured in an XML document, one can either man- 

ually modify it, using a regular text editor or a similar XML editor, or automatically modify it, 

employing an XML script which applies the changes to a configuration file without the need 

for any user intervention. 

The first way does not ask for any special requirement, other than having a text editor at 

hand, and therefore, consists on the easiest way to modify a configuration file in XML. On the 

other hand, the second way requires a script to be previously built. This script may, neverthe- 

less, be reusable and also allows for a completely automatic application configuration process. 

However, one must first know how the generic syntax is structured in order to build a script 

that operates on it. 

Therefore, it is very important that all the files in generic syntax follow the same structure 

and nomenclature, regardless of their original format. For example, assuming there is a script 

which goes through an entire configuration file searching for the block “Network” to change 

the “address” parameter value to “192.168.0.100”, it is vital that every file in generic syntax rep- 

resents block parameter patterns in the same way (i.e., a generic syntax element called “Block” 

with a “value” element assigned to it) so that the script recognizes those patterns when it sees 

them. The same goes for anyone who wants to manually make that change. The person who 

applies the modification should not be obliged to know multiple generic syntaxes for various 

configuration file languages, but rather a homogeneous syntax which might slightly diverge 

from language to language, depending on that language idiosyncrasies. 

3.6 External Parser Addition 

In response to the non-functional requirement in Section 3.1 which says that the user must be 

able to add parsers built outside the tool, Section 3.3.1 indicates that the Parser Repository 

implements the importParser method. This utility is especially useful when the user wants 

to configure applications whose configuration files cannot be parsed by parsers created by 

the chosen parser generator (e.g., binary files) or if the user already possesses a parser for a 

configuration file. 

Nevertheless, in order for an external parser to be added to the tool, it must implement 

the interface defined in Listing 3.8. This interface solely defines one method, which is the one 

called by the Configuration File Parser when parsing a file with an external parser. 

Furthermore, the method must return the parsing statistics and the parsed blocks in generic 

syntax. 

34

3. AN APPLICATION RECONFIGURATION FRAMEWORK 3.6. External Parser Addition 

Listing 3.8: External Parser Proposed Interface 

interface ExternalParserInterface { 

public ParsingData parse(String fileName); 

} 

35

4 

An Implementation 

Chapter 3 approached the application configuration framework by identifying the require- 

ments to be met and defining a tool workflow and component architecture. This chapter de- 

scribes the implementation of that framework, developed in the context of this thesis. 

First, implementation decisions of several orders are presented. Section 4.1 presents the 

adopted programming language, Section 4.2 presents the adopted parser generator and Sec- 

tion 4.3 presents the adopted generic syntax format. Then, Section 4.4 shows the implementa- 

tion of the created data structures in terms of the motivation for their use and their functionality. 

Afterwards, Section 4.5 presents the details of the Original to Generic syntax Converter com- 

ponent implementation, Section 4.6 shows how the files in generic syntax are printed back to 

original syntax and last, Section 4.7 displays the Generic to Original syntax Converter compo- 

nent implementation. 

4.1 Programming Language 

As it is well known, programming languages differ from each other not only in terms of sup- 

ported programming paradigms, but also expressive power, documentation, popularity, etc. 

Among the requirements that the programming language was expected to meet, two had a 

special weight in the time of decision: the chosen programming language should be object- 

oriented due to the modular nature of the tool and high-level to easily and quickly provide 

programming tools such as data types, data structures, etc. 

Given the previous considerations, the decision fell on the Java 1 programming language, 

considering it supports the object-oriented programming paradigm, among many other pro- 

gramming paradigms, is high-level and its API (Application Programming Interface) is well 

1 http://java.sun.com/ 

37

4. AN IMPLEMENTATION 4.2. Parser Generator 

interface Node { 

void translate(); 

} 

Listing 4.1: Node interface 

documented. Moreover, a great number of parser generators provide support for this pro- 

gramming language. 

4.2 Parser Generator 

The choice of a parser generator took into account the two parsers studied earlier in Section 2.3, 

SableCC and JavaCC. 

SableCC is defined by automatically generating all nodes of an AST from a grammar, im- 

plementing a fairly simple and intuitive grammar definition interface and strictly separating 

all the user-generated code from the parser generator-generated code. 

On the other hand, JavaCC makes it possible for the insertion of Java code in the gram- 

mar definition, which may cause the grammar definition interface a bit more complex. Unlike 

SableCC, JavaCC does not automatically generate the nodes of the AST, leaving that task to the 

user. The separation of parser generator and user-generated code is relative and depends on 

the degree of detachment defined in the grammar. 

Initially, SableCC’s characteristics led us to consider it as the best option, since it allowed 

for the rapid prototyping of the tool. However, after used and tested, its mechanics were found 

to be more complex over time considering that, by having each parser creating its own AST, it 

requires specific walkers for each AST. The direct impact that this characteristic had on SmART 

is that for each different configuration file type is a different grammar, which leads to a different 

AST, which leads to one AST walker for each configuration file type. Adversely, since JavaCC 

requires the nodes of the AST to be manually defined, it is possible to conceive a structure 

where every node in it implements a method that can be called by the structure walker (see 

Listing 4.1). In this way, only one walker is required for any kind of configuration file, which 

is able to visit the AST of any configuration file type and produce the file in generic syntax by 

calling the method defined in the interface of Listing 4.1. 

Another severe limitation of SableCC is that it currently does not offer the error recovery 

functionality, whereas JavaCC provides two types of error recovery: shallow and deep error 

recovery. Error recovery consists on continuing the parsing of a configuration file even after 

an unrecognized token has been found. This allows for better accuracy when calculating the 

fitness of a parser with a configuration file. 

Given the previous considerations, the initial beliefs in SableCC were turned in favour of 

JavaCC. 

38

4. AN IMPLEMENTATION 4.3. Generic Syntax 

interface Parser { 

String getParserID(); 

} 

interface InternalParser { 

Method getParseMethod(); 

byte[] getGrammarDefinition(); 

} 

4.3 Generic Syntax 

Listing 4.2: Parser data structure 

Listing 4.3: InternalParser data structure 

Section 2.4 indicated that the chosen format for the representation of the configuration files 

in generic syntax would be XML (eXtended Markup Language). Furthermore, that Section 

saw that there are two XML parser implementations for the Java programming language, 

DOM [W3S09] and SAX [Pro04]. It was seen that the first keeps a tree in memory, representing 

the XML document, while the second is event-driven. 

By balancing the benefits and drawbacks of each parser, DOM was preferred to SAX on 

account of representing an XML document in memory as a tree structure. This allows for a 

quick transition of an AST to a DOM tree. 

4.4 Data Structures 

This section presents the data structures created to cope with various aspects regarding the 

passing of data between components. For each data structure, an interface is revealed and an 

explanation on the data structure functionality and the methods declared in the interface is 

presented. 

4.4.1 Parser 

Parser objects are used as references for the parsers used by the tool. This implementation 

considers that the parsers are identified by their name and, therefore, the ParserID field 

consists on a String instead. However, the Parser class cannot be instanciated since it is 

an abstract class. To distinguish the internal parsers, built by the tool, and external parsers, 

built outside the tool and imported by a user, two classes extending Parser were created: 

InternalParser and ExternalParser. The two classes are now explained. 

39

4. AN IMPLEMENTATION 4.4. Data Structures 

InternalParser 

'()*+,-#".+()&/+0% 

!"#$%#&! 

!"#$%#&" 

Figure 4.1: Recognition of a configuration file by two parsers 

InternalParser (Listing 4.3) objects represent parsers generated by the tool, through the 

Grammar Compiler. An InternalParser object contains a pointer to the parse method 

and the grammar used to construct the parser. 

Regarding the parse method, the latter is invoked resorting to the Java reflex API and re- 

turns a list of Node objects (Listing 4.1) due to be transformed to XML by the 

Code Generator. 

Concerning the grammar used for the generation of the InternalParser, it is kept with 

the parser in order to aid users on the generation of new parsers. Consider the case where a 

user tries to configure a file which is new to the tool and two parsers are stored. The result of 

the parsing attempt is depicted in Figure 4.1. 

In the example, two parsers are tested on a configuration file with 100 lines. The first parser, 

α, manages to recognize the file from line 0 to 59 and then from line 70 to 99, resulting in a 90% 

of parsed file. On the other hand, parser β recognized 20% of the file, from line 50 to 69. 

In this situation, the user who would build the new parser for the file might want to access 

the grammar of the parser α, since it parsed almost the entire file, and base the new parser on 

the old instead of starting from scratch. Also, seen as the second parser recognized the part of 

the file that the first did not, the user might want to check the grammar of the second parser. 

40


Listing 4.4: ExternalParser data structure 

interface ExternalParser { 

ExternalParserInterface getParseMethod(); 

} 

interface Grammar { 

int getSerialNumber(); 

String getGrammarID(); 

byte[] getGrammarDefinition(); 

} 

ExternalParser 

Listing 4.5: Grammar data structure 

ExternalParser objects are used to represent parsers built outside the tool. Since these 

parsers might not be built with the parser generator used by the tool, there is no knowledge of 

the content they produce when parsing a file. Therefore, an ExternalParser has an attribute 

of the ExternalParserInterface type, which contains the invocable method declared in 

Listing 3.8 to parse a file and return the file in generic XML. 

4.4.2 Grammar 

Grammar objects (Listing 4.5) stand for tentative grammars. They are created in the User 

Interface and stored the Tentative Grammar Repository when a new grammar is built. 

A Grammar contains a serial number, which is used by the Tentative Grammar 

Repository to have a means of identifying each grammar, since the name of all the tenta- 

tive grammars for a parser tends to be equal because it represents the name of the parser being 

built. In this implementation, the getGrammarID returns the grammar name in a String. 

The grammar name is used to name the Grammar, but also to name the parser which it was 

used to generate, as already mentioned. Finally, a Grammar also contains an attribute for the 

grammar declaration, accessible through the getGrammarDefinition method. 

4.4.3 ParsingData 

ParsingData objects (Listing 4.6) contain pieces of a configuration file, in generic syntax, and 

information regarding the parsing of that file with a parser and are returned by the 

Configuration File Parser. 

Although the ParsingData objects do not contain any reference to the corresponding 

parser, they always refer to the parsing result of a parser. This relation is explicit when the 

Configuration File Parser returns objects of this kind. When a single parser is tested, 

the return is a ParsingData object, obviously referring to the parse with the tested parser. 

41


interface ParsingData { 

boolean wasParsed(); 

int getParsingDone(); 

List getParsedBlocks(); 

void dump(String fileName); 

} 

Listing 4.6: ParsingData data structure 

ParsingData 

boolean 

parsed 

int 

parsePercentage 

List 

blocks 

Block 

int 

startPosition 

int 

endPosition 

XML block 

Figure 4.2: Graphical representation of ParsingData 

Whenever multiple parsers are tested, the return is a map of the tested parsers and their 

ParsingData. 

By having a ParsingData contain the parts of a file recognized by a parser, the case where 

a file is completely parsed is a special case of when a file is not totally parsed, considering 

that in the first case, a ParsingData will only contain one part, which is the whole parsed file. 

However, a parser may return only one part of a file but not recognize it entirely. Therefore, 

additional control is required to check if a file was well parsed or not. The wasParsed method 

tells whether a file was entirely recognized, or not. 

In case a file was not completely parsed, its parsing statistics are logged. This is an indicator 

of how fit the parser was found to be with the given configuration file and can be measured in 

terms of percentage of parsed file. 

Additionally, the ParsingData objects also contain a list of Block objects. A Block object 

has no relation with the block pattern, but rather contains the generic syntax of a parsed file part 

as well as the positions of the first and last character that Block represents. The representation 

of a ParsingData object, and the contained Block object, is shown in Figure 4.2. 

Whenever the file is due to get modified in order to be configured, or the user halts the tool 

execution, the dump method is called in order to store the blocks in a given file in persistent 

storage. 

4.4.4 CompilationData 

CompilationData objects (Listing 4.7) are produced by the Grammar Compiler to send in- 

formation regarding the compilation of a grammar to the User Interface. A 

CompilationData object tells if a grammar was successfully compiled in the wasCompiled 

42

4. AN IMPLEMENTATION 4.5. Original to Generic syntax Converter 

interface CompilationData { 

boolean wasCompiled(); 

String getErrorMessage(); 

String getParserLocation(); 

} 

Listing 4.7: CompilationData data structure 

method. If the compilation ended without any occurred error, the parser location is given by 

the getParserLocation method. Alternatively, if an error ocurred during the compilation, 

it is accessible through the getErrorMessage method. 

4.5 Original to Generic syntax Converter 

After the components required in the Original to Generic syntax Converter were identified and 

the behaviour of each one was described, their implementation took place. Following is shown 

how the components identified in Section 3.3 were implemented. 

4.5.1 User Interface 

The User Interface is passed multiple configuration file locations upon its invocation by 

the user. The parsing of multiple configuration files is sequential, or, in other words, UI config- 

ures one file at a time and only when that file is configured should the next file configuration 

start. 

To configure a file, UI sends its location to the Configuration File Parser in order 

to parse it with every available parser. The Configuration File Parser returns a map of 

ParsingData objects which tells if the parsing was successful with any parser, as well as the 

percentage of parsed file, the parsed file blocks in generic syntax and the position of the first 

and last characters of the corresponding block. 

When a file is successfully parsed, the corresponding AST is sent to Code Generator and 

the returned file in XML is placed in a temporary file folder, with the same name as the original 

configuration file, plus the suffix “.xml”. 

If a file was not entirely parsed, UI displays the parts of the file which were recognized 

by the parser with the best configuration file parse percentage and the grammar used on the 

respective parser. Then, UI prompts the user for a file containing a grammar to be com- 

piled and sends the grammar to Grammar Compiler. The Grammar Compiler returns a 

CompilationData object which tells whether there was any error compiling the grammar. If 

an error ocurred, the user must alter the grammar in the file to correct the error(s) and then 

trigger another grammar compilation until the grammar is successfully compiled. 

When a grammar is successfully compiled, UI attempts to parse the configuration file 

with the newly generated parser by invoking the Configuration File Parser’s doParse 

method which tests a file with a single parser. If the file was still not entirely recognized, the 

43


user must alter the grammar to recognize the text parts which failed to be parsed, until a parser 

which recognizes the whole file is generated. 

The initial plan was for a graphical interface to be launched whenever a file could not be 

parsed by any parser in the database. This interface would display useful information to the 

user such as the graphical representation of the configuration file in XML, grammar navigator 

and highlighted configuration file based on its parsed and not parsed parts. UI would also 

allow the user to define a grammar without having to manually program it, resorting to the 

selection of configuration file pieces and assigning meanings to them, in an interactive way. 

However, the implementation of the graphical interface will only take place during the inte- 

gration of the tool with VIRTU. 

4.5.2 Configuration File Parser 

When the Configuration File Parser is summoned for the first time, it initiates the 

Parser Repository. 

The doParse(String fileName) method calls the doParse(String fileName, 

Parser parser) method for every available parser in the Parser Repository and logs 

the resulting ParsingData objects in a map, where the entry key is the used Parser. This 

method returns the map generated in this way. 

The doParse(String fileName, Parser parser) method, in turn, treats a config- 

uration file with a parser at a time. If parser is an InternalParser, its parse method is 

invoked using the Java reflex API and the return, a list of Node objects (Listing 4.1), is sent 

to the Code Generator to be passed to generic XML. Thrown exceptions during parse time 

mean that a file was not entirely recognized. The exceptions caught at parsing time contain 

the position where the recognized block starts and ends. These exceptions are analysed to re- 

trieve that information. The percentage of recognized file is calculated based on the limits of 

every parsed block. In the end, a ParsingData object containing the parsing percentage, the 

recognized blocks, characterized by their start and end points, together with the XML of the 

respective blocks, is returned. 

4.5.3 Parser Repository 

The Parser Repository holds a map of Parser objects corresponding to the available 

parsers. When PR is initiated, it gets the last saved state by loading the parser map to memory. 

The storeParser method creates a Parser object from a string containing the name 

of the parser, and the ExternalParserInterface which contains the parser. An optional 

grammarFile can be passed, in case the user wants to merge the grammar with the Parser 

so that it is available in the future. If the grammarFile argument is not null, the grammar 

must be stored in the grammar directory, with the same name as the parser. The Parser is 

then inserted in the Parser map. The importParser does a similar job in inserting a parser 

in the database, but for external parsers. 

PR also implements methods for getting the parser map iterator, getIterator and to 

44


delete a parser from the repository, deleteParser, which consists on deleting the map entry 

with the parser name in argument. 

4.5.4 Grammar Compiler 

The Grammar Compiler is implemented as a class which receives a grammar definition and 

produces a parser from it. After JavaCC generates the parser classes, these must be compiled 

before the Parser Repository is able to load them. 

In the end, the Grammar Compiler returns a CompilationData object, which tells that 

a grammar failed to compile if an exception was caught, or that it compiled successfully other- 

wise. 

4.5.5 Code Generator 

The Code Generator traverses structures composed of nodes implementing the interface de- 

fined in Listing 4.1 and invokes the method defined the interface upon visiting each node in 

order to generate the generic XML of the pattern associated to that node. The reason why every 

node in structure implements that interface was already explained in Section 4.2 and consists 

on forcing each node of the AST to implement a method which can be called by CG. In this way, 

the information on how to print the nodes to generic XML is stored in the nodes and CG gains 

the ability to successfully traverse ASTs of any configuration files. 

On an early stage of the tool, the code that generates the generic XML of a node in the 

AST must be generated manually, for new patterns. However, with the creation of a graphi- 

cal interface for the definition of grammars for new types of configuration files in the User 

Interface, it should be possible to use the information received from the user to automati- 

cally generate that code. 

CG is responsible for logging the details of each configuration file language, such as the 

used delimiters on patterns, or how a pattern is formatted. For this, CG resorts to a Metadata 

special field, which stores the information regarding the delimiters used by the patterns, and 

another FStr special field to log how each pattern is formatted. Details regarding the generation 

of original syntax details are further described in Section 4.6. A set of functions to manipulate 

the Metadata and the FStr fields are provided. 

The generic XML must be the same for every configuration file types, following a structure 

similar to the presented in Section 1.3, with blocks, comments, parameters and other special ele- 

ments. This is important since the generic XML document might become unrecognizable to the 

configurator which modifies it if the user decides, for example, to denote blocks by “blockx”. 

The configurator will afterwards find a “blockx”, when in fact its operation was defined on a 

“block”. 

4.5.6 Tentative Grammar Repository 

The Tentative Grammar Repository implements a linked list of TGRNode objects. A 

TGRNode represents a tentative grammar in TGR and contains a TGRNode prev attribute with 

45

4. AN IMPLEMENTATION 4.6. Generation of Original Syntax 

!"#$%&' 

)*++ 

( 

!"#$%&' 

, 

!"#$%&' 

- 

)*++ 

Figure 4.3: List of TGRNodes 

!"#$%&' 

!"#$%&' 

the grammar where the current was based on and a TGRNode list next containing the gram- 

mars based on the current. An example is represented in Figure 4.3. 

To allow the user to traverse all the built grammars, the prev and next methods resort to 

each grammar serial number to iterate the grammars backwards/forwards. Each method may 

also return null if no grammar was built before/after the current one, respectively. The get 

method receives an integer representing the desired grammar serial number and returns the 

respective Grammar. 

The discard method deletes a grammar from the database, and every grammar based on 

that one. The discardAll empties the list of tentative grammars. To save the state of the 

database, the gramar list is stored to disk. 

4.6 Generation of Original Syntax 

The generation of original syntax consists on printing a file on generic syntax back to its original 

syntax by means of a Printer component. When a file is transposed from its original syntax, 

say INI, to the generic syntax (XML), the language-specific details, like the ‘[’ and ‘]’ header 

delimiters, are discarded to allow for application-independent configuration. Furthermore, the 

format of each pattern is also lost. Only the relevant information for the configuration process, 

such as parameter keys and values, is kept. 

When the printer tries to print the file in generic syntax back to its original form, it will 

46 

/ 

. 

)*++ 

)*++

4. AN IMPLEMENTATION 4.6. Generation of Original Syntax 

 

 

[ 

] 

 

 

Listing 4.8: Example Metadata 

not know what application is being configured at that moment. Therefore, it cannot determine 

what delimiters to print on a given pattern (e.g., INI “[]” or XML block delimiters), or how to 

print them (e.g., INI single header or XML header and footer). So, to print a file to its original 

syntax, there should be enough information regarding the original syntax, in the generic syntax, 

for the Printer. 

One way to know what delimiters should be print is to store them in a field in the be- 

ginning of the file in generic syntax. As the pattern delimiters are immutable throughout the 

configuration file (i.e., comments always start with # in an Apache configuration file), infor- 

mation regarding them can be unified in a special Metadata field in the beginning of the file. 

This makes it possible for the Printer to query Metadata to know what to print on a certain 

pattern. 

The format of a pattern is of great importance as well. The same pattern might have differ- 

ent formats in two different configuration file languages. For instance, an INI block has only 

one block header, whereas an XML block (i.e., element) has a header and a footer. Again, since 

the Printer does not know what application it is configuring at the moment, it does not know 

whether to print a pattern one way or the other. 

To solve this problem, a format string which describes how a pattern looks like is associated 

with each pattern, forming an FStr field. In this way, the Printer gets the FStr from a pattern 

and prints it accordingly. A format string may be composed of the following: 

%a.name The value of the attribute named name in the current element is printed; 

%e The value of the current element is printed; 

%m.name The value of the element named name in Metadata is printed; 

%c The format string is in the next child of the current element; 

%s A space is printed; 

%n A new line is printed; 

Let’s consider the following header in an INI configuration file: 

[Engine] 

The corresponding Metadata is depicted in Listing 4.8 and the FStr will look like Listing 4.9. 

47

4. AN IMPLEMENTATION 4.7. Generic to Original Syntax Converter 

 

 

%m.LBra%a.name%m.RBra%n 

 

..(other patterns).. 

 

ServerType standalone 

Listing 4.9: Example FStr 

Listing 4.10: Apache parameter with a single value 

Note that in the example, the FStr is assigned to the Engine block and not to the block 

pattern. This happens because there are cases where the same pattern has different formats in 

a configuration file language. For example, a parameter in an Apache configuration file might 

have one (Listing 4.10) or multiple values (Listing 4.11), or an XML element might have one 

(Listing 4.12) or multiple sub-elements (Listing 4.13), which leads to multiple format strings. 

Therefore, FStr cannot be factorized in a single field, like Metadata. 

4.7 Generic to Original Syntax Converter 

The Generic to Original syntax Converter contains two classes, PrinterMain and Printer. 

The first is used to interact with the user, while the second contains methods for the interpreta- 

tion and printing of the file in generic syntax (i.e., XML). Following is a deeper explanation of 

both classes. 

The PrinterMain class implements a main method which receives the location of the XML 

file in argument. It imports the XML document from the file to memory using the Java XML 

parsing API. Then, PrinterMain sends the document pointer and a stream to the Printer and 

the latter prints the final configuration file in original syntax to the stream in argument. 

The Printer class manages the printing of XML elements inside an XML file. It implements 

a single visible method, print. 

The print method is called by PrinterMain to print the entire XML document into a con- 

figuration file in its original syntax. It receives a stream and a NodeList object as arguments. 

NodeList refers to a DOM interface for the representation of lists of Node objects, whereas 

each Node object stands for XML components, such as elements, attributes or text nodes. Since 

the XML standard defines XML documents as having a single root node, PrinterMain calls 

Listing 4.11: Apache parameter with multiple values 

DirectoryIndex index.html index.htm index.shtml 

default.htm default.html index.php 

48

4. AN IMPLEMENTATION 4.7. Generic to Original Syntax Converter 

 

 

 

Listing 4.12: XML element with a single subelement 

Listing 4.13: XML element with multiple elements 

 

 

 

 

the print method with the root node’s children, which is, in turn, a NodeList. 

The first element in the XML files generated by the tool is always Metadata, containing the 

information about the original syntax. This element aids the printing process and is not part of 

the final configuration file, therefore it must not be printed. The remainder of the elements are 

then printed. XML elements generated by the tool always contain an FStr field, corresponding 

to the format string of that element, explained in Section 4.6. All elements in the document 

are visited and printed according to their format string. If the current element has other child 

nodes, they must be printed as well. 

49

5 

Framework Evaluation 

Chapter 3 presented the proposed application configuration framework and Chapter 4 covered 

the process of implementing the framework. This chapter provides us with the validation of 

the tool, carried after the implementation phase had ceased. 

First, Section 5.1 shows the functional validation of the tool by explaining how the require- 

ments identified in Section 3.1 are met. Then, Section 5.2 presents the operational validation 

by checking the integrity of the tool and examining the configuration process step by step. Fi- 

nally, Section 5.3 provides the performance validation by testing the tool behaviour in different 

scenarios (i.e., configuration of a very big and very small configuration file). 

5.1 Functional Validation 

In Section 3.1, some functional and non-functional requirements of the tool were identified. 

This section presents the ways in which these requirements are met. 

1. The user must be able to convert a configuration file syntax from its original syntax to 

a generic one, independently of the application. 

The user is able to summon the tool from a terminal and pass it multiple configuration file 

locations. The tool, in turn, produces abstract trees for the configuration files, and then the 

configuration files in a structured, generic format. 


Performance: The generated file with the generic syntax must be as simple as possible. 

The information regarding the original syntax is factorized by the code generator in a Meta- 

data field. 

2. The user must be able to define grammars for configuration file languages. 

51

5. FRAMEWORK EVALUATION 5.1. Functional Validation 

After the user submits a number of files to the tool, they are processed and, if not fully 

recognized, the user is able to manually define a grammar, or pass the tool a file containing the 

grammar for the generation of new parser. 


Usability: The grammar definition syntax should be a broadly adopted one. 

JavaCC grammar interpreter supports the Extended Backus-Naur Form and allows Java 

code, for more commodity. 

Usability: To ease the parser generation process, the user must be able to iterate through the previ- 

ously built grammars so as to roll back any change made on a grammar. 

Every time the user tries a newly created parser on a configuration file, the grammar used 

for it is stored in a repository and can be recovered at any time. 

3. The user must be able to produce a parser from a grammar. 

As soon as the user finishes declaring a grammar, the latter is passed to the tool, which 

compiles it on an external parser generator. 


Extensibility: The user must be able to add parsers built outside the tool to support other configura- 

tion file paradigms (e.g., Windows registry files, binary files, etc.). 

The parser repository implements the importParser method which allows the user to 

store an external parser. However, the parser must implement the interface presented in List- 

ing 3.8. 

4. The user must have access to the parser compilation trace. 

When a grammar is compiled, its compilation trace is shown in case there any error ocurred, 

or a success message is displayed, otherwise. 


Usability: The error messages must clearly identify the source of the error. 

The error messages contain the source of the compilation error. 

5. The user must receive information relative to the grammar fitness with a given config- 

uration file. 

Any time a parser is tested with a configuration file, the percentage of parsed file with that 

parser is displayed. 


Usability: The sections of the configuration file that were parsed and those that were not must be 

clearly identified. 

When a parser is tested with a configuration file, the regions of the file recognized by it are 

highlighted in a graphical environment. 

6. The user must be able to store a functional parser generated by the tool, in order to be 

used on later tool runs. 

52

5. FRAMEWORK EVALUATION 5.2. Operational Validation 

The user can trigger a storeParser command, which stores a parser in the parser reposi- 

tory definitively. 


Security: The user must be able to see the parsing outcome in order to check if the parser is indeed 

operating as intended to. 

The graphical user interface displays a user-friendly representation of the generic syntax so 

the user can confirm that the parser is indeed operating in the intended way. 

7. The user must be able to reconvert a configuration file syntax from the generic syntax 

to the original one. 

The user is able to summon the tool and pass it a tool-generated XML file to be converted 

into the original syntax, keeping its functionality. 

8. File conversion must not eliminate comments 

The built-in parsers preserve the comments of the original file. For new parsers, their gram- 

mars must contemplate the comments and not ignore them. 

9. The user must be able to halt the tool execution at any point and continue the config- 

uration process later. 

When the user halts the tool execution, the parser repository and tentative grammar repos- 

itory states are stored in order to be recovered later. If a valid version of the configuration file 

has been produced before the user halted the execution, the file is stored and can be recovered 

later. 

10. The user must be able to manually delete parsers from the parser repository. 

The parser repository implements the delete method which allows the user to delete an 

existing parser from the database, given its name. 

5.2 Operational Validation 

This section presents some tests carried on the tool, using various configuration file types, 

with the objective of specifying the configuration file intermediate states and determining the 

validity of the final configuration file, produced by the tool. 

The framework provides built-in support for three major configuration file categories, iden- 

tified in Section 1.3: 

• Files composed of parameter blocks delimited only by a header, and comments (i.e., INI- 

like); 

• Files composed of parameters, blocks delimited by a header and a footer and comments 

(i.e., Apache-like); 

53


1 [mysqldump] 

2 quick 

3 quote-names 

4 max_allowed_packet = 16M 

5 

Listing 5.1: MySQL configuration file snippet 

6 # The MySQL database server configuration file. 

7 

8 !includedir /etc/mysql/conf.d/ 

• Files composed of blocks delimited by dynamic headers and footers, and comments (i.e., 

XML-like). 

Three tests were conducted, one with a file of each category. All the tests were ran on a 

system with the Intel Pentium Dual T3200 processor, 2 GB DDR2 main memory and Ubuntu 

Linux 9.04 operating system. 

Each test consisted on configuring an application by invoking SmART with the application 

configuration file. The configuration files were converted in XML and then converted back to 

their original format. Between the tests, the file was not modified, since Section 3.2 showed 

that file modification is not part of this dissertation scope. 

Due to the considerable length of each file, the tested configuration files are not integrally 

displayed in this section. Instead, they can be found in the appendix section of this dissertation 

(Chapter B). The examples refer to parts of the configuration file which were handpicked to 

display the tool behaviour in the presence of each pattern implemented by the file. 

To check the differences between the original and final file, the diff 1 UNIX utility was used. 

diff is a file comparison utility that shows the differences between two text files, in terms of 

different lines. Whenever lines from both files differ, the result shows the divergent lines of the 

original configuration file, and then those from the generated configuration file. 

5.2.1 INI-like Configuration Files 

This test consisted on configuring the MySQL database management system. The MySQL con- 

figuration file format is a slightly altered version of the INI format, but is still recognized 

by the tool. The configuration file (Listing 5.1) implements parameters with zero or one as- 

signed values, contained in blocks delimited by a static header (lines 1-4), comments (line 

6) and special instructions (line 8). The complete configuration file is in Annex B.1. The 

file containing the code generated by the tool for this configuration file is found at http: 

//www-asc.di.fct.unl.pt/˜jml/SmART/ini.xml. 

The tool produced the following XML for the block (Listing 5.2), comment (Listing 5.3) and 

special instruction (Listing 5.4): 

The Metadata generated by the tool is represented on Figure 5.5. 

1 http://www.gnu.org/software/diffutils/diffutils.html 

54


Listing 5.2: MySQL block in XML 

 

%m.start%a.name%m.end%n%c%c%c%c 

 

%e%n 

quick 

 

 

%e%n 

quote-names 

 

 

%e%m.equal%e%n 

max_allowed_packet 

16M 

 

 

%n 

 

 

Listing 5.3: MySQL comment in XML 

 

%m.start%e%n 

The MySQL database server configuration file. 

 

Listing 5.4: MySQL special instruction in XML 

 

%m.start%e%n 

includedir /etc/mysql/conf.d/ 

 

55


 

 

# 

 

 

= 

 

 

[ 

] 

 

 

! 

 

 

Listing 5.5: MySQL Metadata 

The file was re-converted to its original syntax. diff showed that the differences between 

the original and generated files resume to a few absent spaces from the first to the second. This 

happens since the parser ignores the space characters outside strings. As MySQL ignores the 

spaces in configuration files, the file functionality is maintained. 

5.2.2 Apache-like Configuration Files 

This test consisted on configuring the Apache HTTP server. The Apache configuration file 

(Listing 5.6) implements parameters with a single value (line 1), parameters with multiple val- 

ues (lines 3-4), blocks (lines 6-8), nested blocks (10-14) and comments (line 16). The complete 

configuration file is in Annex B.2. The file containing the code generated by the tool for this con- 

figuration file is found at http://www-asc.di.fct.unl.pt/˜jml/SmART/blox.xml. 

The tool produced the following XML for the parameter with a single value (Listing 5.7), 

parameter with multiple values (Listing 5.8), block (Listing 5.9), nested (Listing 5.10) and com- 

ment (Listing 5.11): 

The Metadata generated by the tool is represented on Figure 5.12. 

The file was re-converted to its original syntax. diff showed that the differences between 

the original and generated files are some absent new lines and tabulations. Once again, this is 

due to the parser which ignores some new lines and tabulations. However, this does not affect 

in any way the functionality of the new file, so the result is valid. 

5.2.3 XML-like Configuration Files 

This test consisted on configuring the Eclipse workbench. The Eclipse workbench configura- 

tion (Listing 5.13) file is originally on XML format and implements empty elements/blocks 

(line 1), elements with attributes/blocks containing parameters (line 3-4), nested blocks (lines 

56


1 ServerType standalone 

2 

Listing 5.6: Apache configuration file snippet 

3 DirectoryIndex index.html index.htm index.shtml default.htm 

4 default.html index.php 

5 

6 

7 MIMEMagicFile /usr/local/apache/conf/magic 

8 

9 

10 

11 

12 Options Indexes MultiViews 

13 

14 

15 

16 # httpd.conf -- Apache HTTP server configuration file 

Listing 5.7: Apache parameter with a single value in XML 

 

%e%m.equal%e 

ServerType 

standalone 

 

Listing 5.8: Apache parameter with multiple values in XML 

 

%e%m.equal%e%m.equal%e%m.equal%e%m.equal%e%m.equal%e%m. 

equal%e 

DirectoryIndex 

index.html 

index.htm 

index.shtml 

default.htm 

default.html 

index.php 

 

57


Listing 5.9: Apache block in XML 

 

%c%c%c%c%c 

 

%m.start%a.name%m.end 

 

 

%n 

 

 

%e%m.equal%e 

MIMEMagicFile 

/usr/local/apache/conf/magic 

 

 

%n 

 

 


 

 

6-8) and comments (line 10). The complete configuration file is in Annex B.3. The file contain- 

ing the code generated by the tool for this configuration file is found at http://www-asc. 

di.fct.unl.pt/˜jml/SmART/xml.xml. 

The tool produced the following XML for the empty block (Listing 5.14), block containing 

parameters (Listing 5.15), nested blocks (Listing 5.16) and comments (Listing 5.17): 

The Metadata generated by the tool is represented on Listing 5.18. 

Then, the file was re-converted to its original syntax. diff showed that the differences be- 

tween the original and generated files are some extra new lines in the generated file. This is due 

to the original file having a dual criteria for new lines. In some places, an XML tag is followed 

by a new line, where in other places they are not. 

Still, the tool is able to recognize the whole file, and it produces a file which is completely 

functional and well defined. 

5.2.4 Parsing the Generated XML 

In order to make sure that the XML file generated by the tool was indeed valid, an experiment 

was carried: parsing the file generated by the tool. Beforehand, the tool should be able to parse 

the generated XML file since it is able to parse files resemblant to XML. 

First, OGC (Original to Generic syntax Converter) was called to parse a configuration file. 

From this parse resulted the file xml1. Then, OGC was called again to parse xml1 into xml2. 

Then, GOC (Generic to Original syntax Converter) was called to print xml2 into xml3, and 

finally GOC was called again to print xml3 into the final file. This is depicted in Figure 5.1. 

58


Listing 5.10: Apache nested block in XML 

 

%c%c%c%c 

 


 

 

%n 

 

 

%c%c%c%c%c 

 


 

 

%n 

 

 

%e%m.equal%e%m.equal%e 

Options 

Indexes 

MultiViews 

 

 

%n 

 

 


 

 

 


 

 

Listing 5.11: Apache comment 

 

%m.start%e%m.end 

# httpd.conf -- Apache HTTP server configuration file 

 

59


 

 

# 

\n 

 

 

 

 

 

 

 

 

 

 

 

1 

2 

Listing 5.12: Apache Metadata 

Listing 5.13: Eclipse configuration file snippet 

3 

5 

6 

7 

8 

9 

10 

Listing 5.14: Eclipse empty block in XML 

 


 

60


Listing 5.15: Eclipse block with parameters in XML 

 

%m.start%a.name%s%c%s%c%s%c%s%c%m.end 

 

%e%m.equal%e 

id 

"org.eclipse.ui.workbench.file" 

 

 

%e%m.equal%e 

itemType 

"typeToolBarContribution" 

 

 

%e%m.equal%e 

x 

"104" 

 

 

%e%m.equal%e 

y 

"30" 

 

 

Listing 5.16: Eclipse nested blocks in XML 

 

%c%c%c 

 


 

 

%m.start%a.name%s%c%m.end 

 

%e%m.equal%e 

x 

"196" 

 

 

 


 

 

61


 

%m.start%e%m.end 

this is a comment 

 

Listing 5.17: Eclipse comment in XML 

Listing 5.18: Eclipse Metadata 

 

 

 

 

 

= 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

62

5. FRAMEWORK EVALUATION 5.3. Performance Validation 

!"#$#%&'( 

)#'* 

+,'- 

+,'. 

0 

0 

+,'/ 

Figure 5.1: Parsing the generated XML 

)#%&'()#'* 

Before the test, xml1 was expected to be equal to xml3 and the original file to be equal to 

the final file. diff showed that xml1 was exactly equal to xml3 and that the original file only 

differed from final in the sense that only a few new lines were omitted from the original file, 

analogously to the test in Section 5.2.3. Still, the final file remains recognizable by XML parsers, 

since they also ignore new lines. Moreover, the tool was shown to produce valid XML files 

since it is capable of parsing them over again. 

5.3 Performance Validation 

This section studies how the tool behaves in the presence of configuration files with different 

sizes. Despite not having been designed to excel performance-wise, obtained performance 

results by the tool might turn out to be interesting. 

The tested configuration files are from PostgreSQL and are all recognized by an INI-like 

parser. These are: 

• file α with 1,4 KB (Annex B.4) 

• file β with 3,5 KB (Annex B.5) 

• file γ with 16,6 KB (Annex B.6) 

These files were selected to verify if there is any relation between the file size and the time 

it takes to configure each one. The final results are represented as the sum of two times: the 

63


time to convert the file into generic XML, step 1, and the time to convert the file from XML into 

its original syntax, step 2. 

File α: 

Step 1: 0,143 s 

Step 2: 0,073 s 

Total time: 0,216 s 

File β: 

Step 1: 0,191 s 

Step 2: 0,086 s 


File γ: 

Step 1: 0,463 s 

Step 2: 0,262 s 


This test shows that for short configuration files, the elapsed times to convert a file to XML 

and back to its original format are similar. An interpretation of these results might be that the 

configuration work done by the tool is very small, whereas the majority of the elapsed time 

was spent doing computational tasks such as memory allocations, etc. However, as the files 

become bigger, so does the time to convert them, although in a disproportionate way. γ is 

almost 5 times bigger than β, nevertheless, γ only takes 2.5 times more than β to be converted 

in both ways. 

Taking into account the average file size of VIRTU’s use case application configuration files, 

which is nearly 20KB, we can also conclude that the time spent to configure a configuration file 

is sustainable. Nevertheless, the tool may require considerable amounts of time to carry certain 

tasks, like parser creation. During implementation, we realised that some parsers took a while 

to compile while others were compiled almost instantly. Following is an analysis to the JavaCC 

parser generator which, in spite of approaching elements that fall out of this dissertation scope, 

aims at comprehending the impact of the parser generation process in the tool’s performance. 

In this analysis, we compile the three parsers that the tool supports by default and present the 

elapsed times for each (Table 5.1). 

An interpretation of the obtained times might tell us that the needed time to compile a 

parser is also sustainable. However, the compiled parsers are not too complex and basing 

any conclusion exclusively on these timings may be misleading. To assess more accurately the 

64


INI Apache XML 

0,270s 0,280s 0,300s 

Table 5.1: Elapsed times for parser compilation in JavaCC 

required time for parser generation tasks, we compiled a parser which recognizes a more com- 

plex language, the Java language, and the elapsed time was 0,550s. This allows us to conclude 

that, even for fairly complex languages, the parser generation time is very bearable. Nonethe- 

less, these timings refer to the JavaCC parser generator, as we also generated some parsers 

with SableCC and, although most parsers took as long as JavaCC’s to compile, parsers for very 

complex languages can take up to minutes to compile. 

65

6 

Integration with VIRTU 

The last chapter validated the framework, defined in Chapter 3, in terms of how it met the 

requirements, its functionality and also its performance. We now find SmART suitable to be 

integrated with the VIRTU virtualization platform. 

This chapter describes how SmART can be integrated within the VIRTU tool. First, Sec- 

tion 6.1 gives the background of the VIRTU project, together with the motivation for its creation 

and a brief description. Then, Section 6.2 presents the VIRTU workflow with the objective of 

giving a better perspective of the operations allowed by the tool. The VIRTU architecture is 

detailed in Section 6.3, with the description of each component and its interactions with the 

other components. Finally, the localization of the integration, both in the workflow and in the 

architecture, is indicated and explained in Section 6.4. 

6.1 Project Description 

A consortium of enterprises and universities, composed of Evolve Space Solutions, Universi- 

dade Nova de Lisboa, Universidade de Coimbra, HP Labs and the European Space Agency, 

funded by QREN/ADI, came together to develop a virtualization platform known as project 

VIRTU. 

In Section 2.1.1 we saw that virtualization is a technology for the abstraction of computing 

systems which has recently received a considerable number of contributions, despite having 

emerged for the first time several years ago. The changes that virtualization allows for in the 

IT sector, such as the consolidation of many servers into a single physical machine and the 

resulting lowering of hardware, cooling, power consumption and facility costs, short system 

maintenance time, great isolation level between virtual machines, etc., have revolutionized the 

way IT-recurring companies look at software and hardware deployment. 

67

6. INTEGRATION WITH VIRTU 6.2. VIRTU Top-Level Analysis 

Aware of this market shift, Evolve Space Solutions proposed VIRTU, a platform for the 

creation and management of virtual machines. Before that, the GoVI virtualization tool was 

developed for the European Space Agency in order to deal with some outstanding problems 

in their IT infrastructure, such as the large number of underutilized machines, great demand 

for isolation and a myriad of different hardware platforms. The development on VIRTU ini- 

tially took on the GoVI infrastructure as a starting point, but eventually evolved to become an 

independent project, dettached from the former. 

VIRTU explores interoperability between virtualization solutions, such as Xen [Cit09] or 

VMware [VMw09]. One way to achieve portability of VMs among different virtualization plat- 

forms is to resort to virtualization standards. Open Virtual Machine Format (see Section 2.1.3), 

which is being developed by the biggest virtualization players, is one of such standards and 

VIRTU might evolve to support it. 

Another big opportunity in the virtualization area is the fact that application configuration 

is still largely a manual task. The complexity of configuring a system with manifold applica- 

tions, employing different configuration formats, can be frightening for a system administrator. 

This complexity is only magnified when dealing with a huge system, composed of hundreds, 

or even thousands of such systems. To ease the system maintenance burden, VIRTU proposes 

the generic automatic application configuration or, in other words, the ability to configure any 

application, regardless of its vendor or configuration representation, on the most autonomous 

and automatic manner. 

The framework presented by this dissertation, SmART, is the Universidade Nova de Lisboa 

contribution for project VIRTU. Once SmART was completed, it was integrated with the whole 

VIRTU tool. 

6.2 VIRTU Top-Level Analysis 

The VIRTU tool [Sol09b] implements the concept of Assembly Instance, which refers to 

one, or a set of virtual machines that the users will use and manage directly. An Assembly 

Instance results from the instantiation of an Assembly Configuration, which contains 

the information used to generate one VM, or a set of VMs. To each VM in an Assembly 

Configuration corresponds a Virtual Machine Configuration and when an 

Assembly Configuration is instanced, its VM Configurations are also instanced into 

VM Instances. Each Virtual Machine Configuration is a combination of Building 

Blocks and Publication Files. A Building Block is a template element, such as 

an operating system, application or virtual machine, whereas a Publication File con- 

tains the desired configuration of a Building Block. In order to re-use previously existing 

Assembly Configurations, it is also possible for Assembly Configurations to con- 

tain other Assembly Configurations. The VIRTU Assembly Configuration model is 

depicted in Figure 6.1. 

The configuration of an Assembly Instance is carried at the level of its VM Instances. 

Inside a VM Instance resides a process which is ran at every VM boot and, for each 

68

6. INTEGRATION WITH VIRTU 6.2. VIRTU Top-Level Analysis 

34( 

)*+,-./012-*+ 

6/-&7-+.( 

6&*89 

:/%&-812-*+( 

;-&# 

!""#$%&'( 

)*+,-./012-*+ 

34( 

)*+,-./012-*+ 

6/-&7-+.( 

6&*89 

:/%&-812-*+( 

;-&# 

5(5(5 

5(5(5 

6/-&7-+.( 

6&*89 

:/%&-812-*+( 

;-&# 

Figure 6.1: VIRTU Assembly Configuration 

34( 

)*+,-./012-*+ 

Publication File, checks a flag that tells if that Publication File was altered since the 

last boot. If not, no action is taken for that Building Block. Otherwise, if a Publication 

File is indeed different, it means the Building Block associated to that Publication 

File has altered variables or different operating modes and, consequently, the configuration 

files for that Building Block must be rebuilt. This rebuilding is done by the process itself 

and once it is finished, the flag is reset. 

The actors in the VIRTU tool can be administrators or users. Each role is allowed to perform 

different actions that may be divided in two categories: virtual machine management and user 

management. 

An administrator has the permission to manage (i.e., create, edit and delete) the tool’s 

Building Blocks and Publication Files, to construct Assembly Configurations, 

to accept Assembly Instance requests from the user and deploy those Assembly 

Instances and to manage the running Instances, with the option of halting their execution. 

The administrator may also create and manage the users of the tool. 

On the other hand, the actions available to users are to configure Publication Files, 

to request Assembly Instances, to use the virtual machines and control their lifecycle, to 

share Assembly Instances with other users and to organize them. Beyond that, the user 

may also change his password. 

On a broad view, VIRTU may be divided into a handful of logical modules. This division 

aims at increasing agility and flexibility in the maintenance and optimization of the system 

resources. A scheme containing the modules is shown in Figure 6.2. 

The VIRTU Configuration Database stores all information regarding the tool, such as 

user accounts, the locations of the available Building Blocks, Assembly 

69

6. INTEGRATION WITH VIRTU 6.3. VIRTU Architecture 

34567 

8#$9/1+":%/#$) 

;:%: ?'"@'" 34567 

?'"@'" .,#2A) 

;:%:-%#"' 

B2,/=-') 

3/'C- 

34567) 

D==,/2:%/#$ 

?'"@'" 

E'< 

!"#$%&'$()*#(+,'-) 

Figure 6.2: VIRTU Logical View 

3*)4$-%:$2'- 

?'"@/2'- 

3/"%+:,/F:%/#$ 

?'"@/2'- 

Configurations and Instances and users reports and suggestions. The VIRTU Block 

Datastore is the physical storage for the available Building Blocks. The VIRTU 

Application Server provides access to the tool for users and administrators, together with 

the processing logic. The Virtualization Services is where the virtualization is handled. 

It contains the virtualization servers and the existing virtual machines. Finally, the Eclipse 

Views and Web Front-End are interfaces used to allow interaction with the user through an 

application or a typical web browser. 

6.3 VIRTU Architecture 

The VIRTU tool arcitecture is structured in a three-tier architecture, as depicted in Figure 6.3. 

Following is description of each layer and the respective components. 

6.3.1 Data and Resources Layer 

The Data and Resources Layer provides the storage utility for the system and hosts the 

system’s virtualization layer. It is composed by the Block Datastore, the Configuration 

Database and Virtualization Services. 

Block Datastore 

The Block Datastore is an abstraction for the physical storage of the Building Blocks 

. It may have multiple instances and can be distributed over different physical locations. It 

allows Building Blocks from another system to be uploaded, to create, modify and delete 

70 

B?.


0,()(%#"#1*%$."/(, 

0,*-())1%2$."/(, 

F)(, 

G"%"2(=(%# 

!"#"$"%&$'()*+,-()$."/(, 

D4*-E$ 

!"#")#*,( 

3-415)($61(7) 

8,*%#9(%& 

*%?12+,"#*, 

>*%?12+,"#1*%$ 

!"#";")( 

:(; 

8,*%#9(%& 

*%#,*44(, 

61,#+"41@"#1*%$ 

A(,B1-() 

Figure 6.3: VIRTU Architectural Design 

0,*-())1%2$H%#(,?"-()$."/(, 

3C#(,%"4$ 

1%#(,?"-() 

3C#(,%"4$ 


It provides a number of features for the management of the tool information by interact- 

ing with other components. The User Management performs user (creation, edition, dele- 

tion), password (change) and role (set user, administrator) management on it, the Assembly 

Configurator uses it to manage the available Publication Files, Building Blocks, 

VM Configurations and Assembly Configurations and, finally, the Assembly 

Controller accesses the Assembly Instances in run-time from it. 

Virtualization Services 

The Virtualization Services contain the system’s virtualization layer which provides 

the virtualization foundations for the running virtual machines, together with each virtual ma- 

chine instance hosted by the tool. It provides the basic VM operations such as start, stop, pause, 

create, delete, etc., and the support to configure and install applications at first-run time. 

6.3.2 Processing Layer 

The Processing Layer is where all the tool operations are ran and the tool workflow is 

implemented and managed. 

User Management 

The User Management is the component which controls the user-oriented operations. It han- 

dles operations like login checking, password changes, role attributions and creation/dele- 

tion of users. The access to the User Management is provided by the Eclipse Views 

Front-End and the External Interfaces, in case of a stand-alone application or a web- 

service access, respectively. The effects of the operations are reflected in the Configuration 

Database, under the user related tables. 

Assembly Configurator 

The Assembly Configurator allows the administrators to create, edit or delete Assembly 

Configurations and the Building Blocks and Publication Files that compose 

those Assemblies. As for the user, he may search through the existing Assembly 

Configurations, request instances of some of them and edit Publication Files asso- 

ciated to those instances. Finally, the Assembly Configurator also reports bugs to the ad- 

ministrator. 

The access to the Assembly Configurator is provided by the Eclipse Views Front 

-End and the External Interfaces. The Assembly Configurator interacts with the 

Configuration Database to manage the Assembly definitions and the Block Datastore 

to manage the Building Blocks. 

72


Assembly Controller 

The Assembly Controller allows the administrator to control Assembly Instances in 

run-time and accept or decline user requests for new Assembly Instances, as well as delete 

the existing ones. The user may, in turn, request new Assembly Instances, share 

Instances with other users and organize them, access the Instances (i.e., through Remote 

Desktop) and control them. 

Interaction with the user resorts to the Eclipse Views Front-End and the External 

Interfaces. The Assembly Controller accesses the Configuration Database to 

get any Assembly Instance state on any time, the Block Datastore to request physical 

Building Block deployment and the Virtualization Services to perform basic VM 

management operations. 

External Interfaces 

The External Interfaces allow the interaction of the tool with other existing infrastruc- 

tures. It defines a set of components for the communication with the Processing Layer 

. The External Interfaces interactions cover every component in the Processing 

Layer since this is the component which allows the integration with third-party elements. 

The User Management is used for password, login and role management, the Assembly 

Configurator is used for the management of Assembly Configurations, the Assembly 

Controller is used for VM lifecycle management and remote connection operations and 

the Web Front-End and the External Applications are used to allow the access to the 

tool from a web-browser or a third-party infrastructure. 

6.3.3 Presentation Layer 

The Presentation Layer contains the elements which allow the interaction with users and 

administrators. 

Eclipse Views Front-End 

The Eclipse Views Front-End provides access to the tool through a stand-alone appli- 

cation, based on the Eclipse RCP, which supplies views and perspectives corresponding the 

functionalities available to users and administrators. This component interacts with all other 

components in the Processing Layer to reflect the users’ and administrators’ requests. 

Web Front-End 

The Web Front-End objective is to allow interaction with the tool by means of a regular web 

browser, requiring no special applications to be installed in order to communicate with the 

tool and allowing access from devices other than typical computers, such as smart-phones 

or thin-clients. Unlike Eclipse Views Front-End which communicates directly with the 

73

6. INTEGRATION WITH VIRTU 6.4. SmART integration 

Processing Layer, the Web Front-End uses the External Interfaces component as 

an interface to the tool. 

6.3.4 Other Notable Sectors 

Apart from the presented layers and components, VIRTU makes use of other sectors that are not 

included in the tool architecture, but their relevance is considerable enough for the integration 

of SmART with VIRTU. 

The VIRTU tool employs some third-party software which is also integrated with the tool, 

as is the case of MySQL for the provision of a relational database management system or Eclipse 

for the linkage of Java objects to the database. These components are COTS (Commercial, off- 

the-shelf), software built by other entities which can be integrated with the objective of quickly 

providing functionalities to the tool, possibly saving time by removing the need to build those 

functionalities from scratch, but at the cost of some integration work. VIRTU keeps these third- 

party components in a CVS (Concurrent Versions System) repository, from where they can be 

accessed. 

Another important data sector is located inside the virtual machines created by the tool. 

This sector hosts the first-run and configuration scripts which are ran automatically by each 

virtual machine so that they get the required data from the tool autonomously. 

6.4 SmART integration 

The VIRTU workflow and architecture were described in the previous sections. Using that 

knowledge, this section describes how SmART fits in the VIRTU workflow and the where- 

abouts of the SmART component placement in the VIRTU architecture. The integration takes 

on both SmART major components, the Original to Generic syntax Converter (OGC) and the 

Generic to Original syntax Converter (GOC, see Section 3.2), and employs them on separate 

VIRTU components. OGC is used in the creation of configuration file templates when adding 

new Building Blocks to VIRTU while GOC is used by the virtual machines to generate config- 

uration files from templates. 

In the VIRTU workflow, SmART will be called for the first time when an administrator adds 

new Building Blocks. As it was seen in Section 6.3.1, these can either be applications, oper- 

ating systems or virtual machines. In this case, a Publication File needs to be associated 

to that Building Block. For this endeavour, SmART is used to transpose the Building 

Block configuration files, whose parameterization is relevant, into XML documents, generat- 

ing a Publication File. This Publication File is then stored in the Configuration 

Database, from where it is obtained when its usage is required (i.e., when a user requires 

the instantiation of an Assembly Configuration composed by that Building Block). 

There might be the case where one or more configuration files of an application are unknown 

to the tool. If such is the occasion, the administrator must define a grammar which recognizes 

the new configuration file in order to build a parser for it. In the end, a new Building 

74


!"#$%&'()*)+,-,./+&0-1)( 

F/+:.36(-,./+& 

A.@)&'-(*)( 

?2@.9*)&!.);* 

A(/+,8)+5 

%*)(&"+,)(:-2) 

!"#$%&'(/2)**.+3&0-1)( 

D**)EC@1&F/+:.36(-,/( 

F/5)& 

G)+)(-,/( 

F/+:.36(-,./+& 

4-,-C-*) 

'-(*)(& 

#)9/*.,/(1 

!"#$%&4-,-&-+5&#)*/6(2)*&0-1)( 

G(-EE-(& 

F/E9.@)( 

B)C 

A(/+,8)+5 

%*)(&"+,)(:-2) 

$)+,-,.>)& 

G(-EE-(& 

#)9/*.,/(1 

'-(*)(& 

G)+)(-,/( 

Figure 6.4: SmART integration points with VIRTU 

'(.+,)(


• Parser Repository; 

• Code Generator; 

• Grammar Compiler; 

• Tentative Grammar Repository. 

and the GOC is composed by Printer and PrinterMain. 

As previously mentioned, the two components perform disparate functions. The sub- 

components in OGC will be merged with the Assembly Configuration creation 

part whereas GOC will integrate the configuration file generation part. Following is deeper 

explanation of the integration of the SmART components in the VIRTU tool. 

6.4.1 SmART Original to Generic syntax Converter 

SmART Parser Repository 

The lowest level integration will be that of the Parser Repository. After the parsers are 

created, they are inserted in the Configuration Database. This calls for the alteration of 

the relational database by adding parser-related relations to it and the alteration of VIRTU DB 

in order for the latter to support the necessary database operations for the creation, edition 

and deletion of parsers. The concept of Parser Repository will, therefore, refer to a set of 

database relations which can be accessed by other components through the VIRTU DB software 

layer. 

SmART Processing Layer and Tentative Grammar Repository 

Most of SmART’s core will be integrated with the VIRTU Assembly Configurator. Namely, 

SmART OGC’s Configuration File Parser, Code Generator, Grammar Compiler 

and also Tentative Grammar Repository are due to be located under the Assembly 

Configurator. The reason for the localization of this merge is, as it was seen earlier in this 

section, the fact that SmART OGC is only accessed by the VIRTU tool on the process of creating 

new Assembly Configurations. 

In this way, the normal SmART workflow remains unchanged, apart from some small ad- 

justments. For instance, the Configuration File Parser will now receive parse requests 

from the VIRTU Presentation Layer and get the parsers by issuing parser queries to the 

Data Layer, by means of VIRTU DB. The Code Generator will remain intact, since it only 

interacts with the Configuration File Parser. The Grammar Compiler will now ac- 

cess the parser generator from the VIRTU third-party software sector. As for the Tentative 

Grammar Repository, its integration with the VIRTU tool was found to be more suitable 

to the VIRTU Processing Layer, opposed to the SmART Storage Layer, since it will not 

store permanently any grammar, but rather act as a grammar browser, which saves the final 

76

6. INTEGRATION WITH VIRTU 6.5. Summary 

grammars with the corresponding parsers and discards the rest. Besides that, the Tentative 

Grammar Repository will now interact with the Eclipse Views and the Web Front- 

Ends. 

SmART User Interface 

The User Interface will be integrated in the Presentation Layer with the Eclipse 

Views Front-End and the Web Front-End, so as to provide a way for the administrator 

to perform the tasks of creating the Publication Files, when new Building Blocks 

are added to VIRTU, and building new grammars when parsers capable of recognizing the 

configuration files of the Building Blocks do not exist in the database. 

6.4.2 SmART Generic to Original syntax Converter 

As for Printer and PrinterMain, they will form an autonomous process in the boot sector 

of each virtual machine, which transforms a Publication File on a configuration file, if 

that Publication File modification flag is set. 

6.4.3 Parser Generator 

Much like in SmART, the Parser Generator component will be considered as third-party 

software, therefore, this component will reside as an external package in the third-party soft- 

ware repository. It is accessed solely by the Grammar Compiler. 

6.5 Summary 

At the time of elaboration of this dissertation, the integration of SmART with VIRTU was only 

commencing. This chapter presents the result of project meetings where we analysed ways in 

which both tools can be merged, from the point of view of the VIRTU tool. The taken approxi- 

mation consists on joining the SmART components with the existing VIRTU ones. 

By merging the prototype of a framework on a bigger tool whose development is well un- 

der way, the adaptation of some prototype components should be expected. This is the case of 

the SmART Parser Repository, which will be tranformed from a simple list of parsers into 

a set of relations in the VIRTU Configuration Database. This adaptation is not straight- 

forward though, as with most of the available parser generators, the produced parsers will 

consist on a set of Java classes. The classes that compose the parser can be stored in the data- 

base without any hassle, nevertheless, they cannot be executed from there. Three alternatives 

for the execution of the parser classes were identified: to copy the classes locally to a temporary 

folder and execute them; to copy the classes to memory and run them from there; to execute 

a database method which allows the parser classes to be ran from the database. After the 

project meeting, another solution for this problem was identified, which consists on gathering 

the parser classes in a .JAR container [Mic09d] and storing the packages in the database. In this 

77

6. INTEGRATION WITH VIRTU 6.5. Summary 

way, each class in the package knows the exact location of the others and is always able to call 

them. 

There are still two components left to implement: the graphical interface for the User 

Interface and the configurator. 

The first consists on a graphical interface which allows the administrator to define new 

grammars, in a practical and efficient way. It should provide the administrator with func- 

tionalities such as configuration file syntax highlight, new pattern definition, etc. The SmART 

prototype offers the necessary functionalities for the implementation of this component, there- 

fore, the remaining work consists on designing the graphical interface and linking it with the 

other SmART components. 

The implementation of the configurator involves the design of an interface from where the 

user can alter the configuration file parameters. This interface should also generate a script 

which recognizes the XML generic syntax. That script is used to traverse a Publication File, 

applying the required changes as it comes upon the right parameters, consequently generating 

a new Publication File containing the desired parameterization. 

Apart from the previous considerations, the integration process should unfold in a seam- 

less way, providing the VIRTU tool with the ability to automatically configure applications, 

regardless of their vendors. 

78

7 

Conclusion and Future work 

The concepts of virtualization and, more particularly, virtual appliance, rose to significant im- 

portance in the last few years. Virtualization allows for the consolidation of many underuti- 

lized servers into fewer machines, increased infrastructural utilization, easy replication and 

replication of virtual machines, among many other benefits. Virtual appliances are a conse- 

quence of the virtualization advancements and refer to VMs composed solely of a minimal OS 

and a set of applications, optimized to perform a specific computational task. 

The VIRTU project is a collective effort by a consortium of enterprises and universities, 

whose objective is to develop a platform for the creation and management of virtual appliances. 

Nowadays, the offering of virtualization solutions is increasing each day, but the configuration 

of virtual appliances is still a mainly manual task. Since a single machine tends to garner 

many VAs, configuring them in an effective way has become a problem. VIRTU tries to exploit 

this opportunity by offering on-demand configuration of applications, independently of their 

vendor. 

This thesis tackles the problem of automatic configuration of applications, regardless of 

their vendors, in virtual appliances. The automatic configuration of applications is important 

since it allows for the reduction of required time to configure applications. It assumes a special 

importance if we consider a business scenario, where much time is wasted in configuration du- 

ties. A hurdle for the application configuration, regardless of the vendor, is that nowadays, ap- 

plications already represent their configurations on widely adopted standards, although many 

of those standards exist and the majority of the applications resort to vendor-specific formats. 

The aim of this work consists on creating a framework that allows the abstraction of the used 

formats, allowing for the configuration process to be the same for every application, as well as 

automating that process from it results the least human interaction. 

At first, two possible ways to deal with this subject were approached: configure the appli- 

79

7. CONCLUSION AND FUTURE WORK 7.1. Future Work 

cation pre or post installation on the virtual appliance. Configuring the application before its 

installation on the virtual appliance consists on altering the configuration file in the applica- 

tion package and then install it on the virtual appliance. This approach was seen to lead to 

package redundancy, since the number of required different configuration will equate to the 

number of application packages to be installed. Moreover, in case of further configurations of 

an application, this approach requires a package to be re-installed on another VA. 

A better approach is the configuration post installation. In this approach, an application 

configuration file is extracted from inside a VA and configured by the tool, only to be merged 

with the VA in the end of the configuration process. Although this approach requires knowl- 

edge of the application location, it allows for easy configuration of any number of virtual ap- 

pliances and, additionally, it does not force an application to be reinstalled in case of a recon- 

figuration. 

The Smart Application Reconfiguration Tool (SmART) is the contribution of this thesis for 

the automatic configuration of applications problem. It processes the configuration files of the 

applications, applying a series a transformations on each file with the aim of abstracting them 

from their original formats and eventually converting them back to their formats. 

One way to deal with the problem is to take advantage of the patterns oftenly found in 

configuration files. A study conducted on the applications most likely to be encountered by 

the tool showed that the configuration files resort only to the concepts of parameter setting, 

parameter block and comment, whereas other patterns occur in a punctual way. 

After the development phase had ceased, the tool was submitted to some tests so as to val- 

idate it. SmART was found to meet all the requirements identified in the requirement analysis 

phase. Then, the configuration of some applications was detailed. It shows the intermediate 

states of the configuration file and, in the end, the configuration file reproduced by the tool is 

shown to be functional. 

With the concept proved and having developed a functional prototype, the integration of 

the SmART tool with the VIRTU tool proceeded. This process is intended to provide VIRTU 

with the ability to automatically configure applications, regardless of their vendors. To com- 

plement this dissertation with reliable information about VIRTU, we spent two weeks with 

the Evolve team. The integrational process only started on the final term of the elaboration of 

the dissertation. Therefore, we can only offer a suggestion of how to execute the integration 

process, meanwhile identifying the possible obstacles that might occur. 

7.1 Future Work 

Initially, the tool was thought to implement a graphical interface for the grammar definition 

process to become more natural to the user. However, in the end there was not enough time to 

study a graphical framework and, therefore, this feature was not implemented, although the 

current implementation allows the user to define a grammar. Nevertheless, this dissertation 

explains in detail what utilities the interface must offer, and how they can be accomplished. 

Another suggestion for a future improvement on the tool is for it to infer the meanings of 

80

7. CONCLUSION AND FUTURE WORK 7.1. Future Work 

different patterns in new configuration file languages on its own, using grammar inference. 

Grammar inference [Hon], also known as automata induction, grammar induction and auto- 

matic language acquisition, is a field of research with the purpose of learning grammars and 

languages from data. It can be applied on the fields of syntatic pattern recognition, adaptative 

intelligent agents, diagnosis, computational biology, systems modelling, prediction, natural 

language acquisition, data mining and knowledge discovery. According to [PH01], it is possi- 

ble to learn grammars from simple examples, like sparse configuration files. 

There is active work on this subject, namely the GenParse project, which is a collaboration 

among the University of Alabama-Birmingham, U.S.A. and the University of Maribor, Slove- 

nia. It aims at easing the development of Domain-Specific Languages and also at developing 

renovation tools for legacy systems. It makes use of the Genetic Programming paradigm to 

infer Context-Free Grammars. 

81

A.1 OGC sequence diagram 

A 

Appendix - Architecture 

83

A. APPENDIX - ARCHITECTURE A.1. OGC sequence diagram 

interaction OGC Sequence[ OGC Sequence ] 

alt 

: User : UI 

: CFP : PR 

: CG : GC 

: TGR 

[at least 1 parsed] 

[else] 

loop 

[reDoParse=OK] 

loop 

1: start(fileNames=file) 

8: successfulParse 

opt 

[ ] 

[reDoCompile=OK] 

11: submitGrammar() 

19: storeParser() 

22: successfulParse 

2: doParse(file=file) 

loop 

[all parsers tested] 

7: reDoParse 

15: doParse(file=file, parser=parser) 

18: reDoParse 

3: getIterator() 

4: parse 

5: doTranslate(parsingdata="pd") 

13: reDoCompile 

16: parse 

6: reDoTranslate 

12: doCompile(grammar=grammar) 

20: store(name=name, file=file, grammar=grammar) 

9: get(serialNumber=number) 

10: prev() 

14: store(grammarName=name, grammar=grammar) 

17: doTranslate(parsingdata=pd) 

21: discardAll() 

Figure A.1: Original to generic syntax converter sequence diagram 

84

B.1 MySQL Configuration file 

# 

# The MySQL database server configuration file. 

# 

# You can copy this to one of: 

# - "/etc/mysql/my.cnf" to set global options, 

# - "˜/.my.cnf" to set user-specific options. 

# 

B 

Appendix - Evaluation 

# One can use all long options that the program supports. 

# Run program with --help to get a list of available options and with 

# --print-defaults to see which it would actually understand and use. 

# 

# For explanations see 

# http://dev.mysql.com/doc/mysql/en/server-system-variables.html 

# This will be passed to all mysql clients 

# It has been reported that passwords should be enclosed with ticks/ 

quotes 

# escpecially if they contain "#" chars... 

# Remember to edit /etc/mysql/debian.cnf when changing the socket 

location. 

[client] 

port = 3306 

85

B. APPENDIX - EVALUATION B.1. MySQL Configuration file 


# Here is entries for some specific programs 

# The following values assume you have at least 32M ram 

# This was formally known as [safe_mysqld]. Both versions are 

currently parsed. 

[mysqld_safe] 


nice = 0 

[mysqld] 

# 

# * Basic Settings 

# 

# 

# * IMPORTANT 

# If you make changes to these settings and your system uses 

apparmor, you may 

# also need to also adjust /etc/apparmor.d/usr.sbin.mysqld. 

# 

user = mysql 

pid-file = /var/run/mysqld/mysqld.pid 


port = 3306 

basedir = /usr 

datadir = /var/lib/mysql 

tmpdir = /tmp 

skip-external-locking 

# 

# Instead of skip-networking the default is now to listen only on 

# localhost which is more compatible and is not less secure. 

bind-address = 127.0.0.1 

# 

# * Fine Tuning 

# 

key_buffer = 16M 

max_allowed_packet = 16M 

thread_stack = 128K 

86


thread_cache_size = 8 

# This replaces the startup script and checks MyISAM tables if needed 

# the first time they are touched 

myisam-recover = BACKUP 

#max_connections = 100 

#table_cache = 64 

#thread_concurrency = 10 

# 

# * Query Cache Configuration 

# 

query_cache_limit = 1M 

query_cache_size = 16M 

# 

# * Logging and Replication 

# 

# Both location gets rotated by the cronjob. 

# Be aware that this log type is a performance killer. 

#log = /var/log/mysql/mysql.log 

# 

# Error logging goes to syslog. This is a Debian improvement :) 

# 

# Here you can see queries with especially long duration 

#log_slow_queries = /var/log/mysql/mysql-slow.log 

#long_query_time = 2 

#log-queries-not-using-indexes 

# 

# The following can be used as easy to replay backup logs or for 

replication. 

# note: if you are setting up a replication slave, see README.Debian 

about 

# other settings you may need to change. 

#server-id = 1 

#log_bin = /var/log/mysql/mysql-bin.log 

expire_logs_days = 10 

max_binlog_size = 100M 

#binlog_do_db = include_database_name 

#binlog_ignore_db = include_database_name 

# 

# * InnoDB 

# 

87


# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql 

/. 

# Read the manual for more InnoDB related options. There are many! 

# You might want to disable InnoDB to shrink the mysqld process by 

circa 100MB. 

#skip-innodb 

# 

# * Federated 

# 

# The FEDERATED storage engine is disabled since 5.0.67 by default in 

the .cnf files 

# shipped with MySQL distributions (my-huge.cnf, my-medium.cnf, and 

# 

so forth). 

skip-federated 

# 

# * Security Features 

# 

# Read the manual, too, if you want chroot! 

# chroot = /var/lib/mysql/ 

# 

# For generating SSL certificates I recommend the OpenSSL GUI "tinyca 

# 

". 

# ssl-ca=/etc/mysql/cacert.pem 

# ssl-cert=/etc/mysql/server-cert.pem 

# ssl-key=/etc/mysql/server-key.pem 

[mysqldump] 

quick 

quote-names 

max_allowed_packet = 16M 

[mysql] 

#no-auto-rehash # faster start of mysql but no tab completition 

[isamchk] 

key_buffer = 16M 

88

B. APPENDIX - EVALUATION B.2. Apache Configuration File 

# 

# * NDB Cluster 

# 

# See /usr/share/doc/mysql-server-*/README.Debian for more 

# 

information. 

# The following configuration is read by the NDB Data Nodes (ndbd 

processes) 

# not from the NDB Management Nodes (ndb_mgmd processes). 

# 

# [MYSQL_CLUSTER] 

# ndb-connectstring=127.0.0.1 

# 

# * IMPORTANT: Additional settings that can override those from this 

file! 

# The files must end with ’.cnf’, otherwise they’ll be ignored. 

# 

!includedir /etc/mysql/conf.d/ 

B.2 Apache Configuration File 

## 

## httpd.conf -- Apache HTTP server configuration file 

## 

# 

# Based upon the NCSA server configuration files originally by Rob 

# 

McCool. 

# This is the main Apache server configuration file. It contains the 

# configuration directives that give the server its instructions. 

# See for detailed information 

about 

# the directives. 

# 

# Do NOT simply read the instructions in here without understanding 

# what they do. They’re here only as hints or reminders. If you are 

unsure 

89


# consult the online docs. You have been warned. 

# 

# After this file is processed, the server will look for and process 

# /usr/local/apache/conf/srm.conf and then /usr/local/apache/conf/ 

access.conf 

# unless you have overridden these with ResourceConfig and/or 

# AccessConfig directives here. 

# 

# The configuration directives are grouped into three basic sections: 

# 1. Directives that control the operation of the Apache server 

process as a 

# whole (the ’global environment’). 

# 2. Directives that define the parameters of the ’main’ or ’default 

’ server, 

# which responds to requests that aren’t handled by a virtual 

host. 

# These directives also provide default values for the settings 

# of all virtual hosts. 

# 3. Settings for virtual hosts, which allow Web requests to be sent 

to 

# different IP addresses or hostnames and have them handled by 

the 

# same Apache server process. 

# 

# Configuration and logfile names: If the filenames you specify for 

many 

# of the server’s control files begin with "/" (or "drive:/" for 

Win32), the 

# server will use that explicit path. If the filenames do *not* 

begin 

# with "/", the value of ServerRoot is prepended -- so "logs/foo.log" 

# with ServerRoot set to "/usr/local/apache" will be interpreted by 

the 

# server as "/usr/local/apache/logs/foo.log". 

# 

### Section 1: Global Environment 

# 

# The directives in this section affect the overall operation of 

Apache, 

# such as the number of concurrent requests it can handle or where it 

90


# can find its configuration files. 

# 

# 

# ServerType is either inetd, or standalone. Inetd mode is only 

supported on 

# Unix platforms. 

# 

ServerType standalone 

# 

# ServerRoot: The top of the directory tree under which the server’s 

# configuration, error, and log files are kept. 

# 

# NOTE! If you intend to place this on an NFS (or otherwise network) 

# mounted filesystem then please read the LockFile documentation 

# (available at ); 

# you will save yourself a lot of trouble. 

# 

ServerRoot "/usr/local/apache" 

# 

# The LockFile directive sets the path to the lockfile used when 

Apache 

# is compiled with either USE_FCNTL_SERIALIZED_ACCEPT or 

# USE_FLOCK_SERIALIZED_ACCEPT. This directive should normally be left 

at 

# its default value. The main reason for changing it is if the logs 

# directory is NFS mounted, since the lockfile MUST BE STORED ON A 

LOCAL 

# DISK. The PID of the main server process is automatically appended 

to 

# the filename. 

# 

#LockFile /usr/local/apache/logs/httpd.lock 

# 

# PidFile: The file in which the server should record its process 

# identification number when it starts. 

# 

91


PidFile /usr/local/apache/logs/httpd.pid 

# 

# ScoreBoardFile: File used to store internal server process 

information. 

# Not all architectures require this. But if yours does (you’ll know 

because 

# this file will be created when you run Apache) then you *must* 

ensure that 

# no two invocations of Apache share the same scoreboard file. 

# 

ScoreBoardFile /usr/local/apache/logs/httpd.scoreboard 

# 

# In the standard configuration, the server will process httpd.conf ( 

this 

# file, specified by the -f command line option), srm.conf, and 

access.conf 

# in that order. The latter two files are now distributed empty, as 

it is 

# recommended that all directives be kept in a single file for 

simplicity. 

# The commented-out values below are the built-in defaults. You can 

have the 

# server ignore these files altogether by using "/dev/null" (for Unix 

) or 

# "nul" (for Win32) for the arguments to the directives. 

# 

#ResourceConfig conf/srm.conf 

#AccessConfig conf/access.conf 

# 

# Timeout: The number of seconds before receives and sends time out. 

# 

Timeout 300 

# 

# KeepAlive: Whether or not to allow persistent connections (more 

than 

# one request per connection). Set to "Off" to deactivate. 

# 

92


KeepAlive On 

# 

# MaxKeepAliveRequests: The maximum number of requests to allow 

# during a persistent connection. Set to 0 to allow an unlimited 

amount. 

# We recommend you leave this number high, for maximum performance. 

# 

MaxKeepAliveRequests 100 

# 

# KeepAliveTimeout: Number of seconds to wait for the next request 

from the 

# same client on the same connection. 

# 

KeepAliveTimeout 15 

# 

# Server-pool size regulation. Rather than making you guess how many 

# server processes you need, Apache dynamically adapts to the load it 

# sees --- that is, it tries to maintain enough server processes to 

# handle the current load, plus a few spare servers to handle 

transient 

# load spikes (e.g., multiple simultaneous requests from a single 

# Netscape browser). 

# 

# It does this by periodically checking how many servers are waiting 

# for a request. If there are fewer than MinSpareServers, it creates 

# a new spare. If there are more than MaxSpareServers, some of the 

# spares die off. The default values are probably OK for most sites. 

# 

MinSpareServers 10 

MaxSpareServers 25 

# 

# Number of servers to start initially --- should be a reasonable 

ballpark 

# figure. 

# 

StartServers 10 

93


# 

# Limit on total number of servers running, i.e., limit on the number 

# of clients who can simultaneously connect --- if this limit is ever 

# reached, clients will be LOCKED OUT, so it should NOT BE SET TOO 

LOW. 

# It is intended mainly as a brake to keep a runaway server from 

taking 

# the system with it as it spirals down... 

# 

MaxClients 150 

# 

# MaxRequestsPerChild: the number of requests each child process is 

# allowed to process before the child dies. The child will exit so 

# as to avoid problems after prolonged use when Apache (and maybe the 

# libraries it uses) leak memory or other resources. On most systems 

, this 

# isn’t really needed, but a few (such as Solaris) do have notable 

leaks 

# in the libraries. For these platforms, set to something like 10000 

# or so; a setting of 0 means unlimited. 

# 

# NOTE: This value does not include keepalive requests after the 

initial 

# request per connection. For example, if a child process 

handles 

# an initial request and 10 subsequent "keptalive" requests, it 

# would only count as 1 request towards this limit. 

# 

MaxRequestsPerChild 0 

# 

# Listen: Allows you to bind Apache to specific IP addresses and/or 

# ports, in addition to the default. See also the 

# directive. 

# 

Listen 80 

#Listen 12.34.56.78:80 

# 

94


# BindAddress: You can support virtual hosts with this option. This 

directive 

# is used to tell the server which IP address to listen to. It can 

either 

# contain "*", an IP address, or a fully qualified Internet domain 

name. 

# See also the and Listen directives. 

# 

#BindAddress * 

# 

# Dynamic Shared Object (DSO) Support 

# 

# To be able to use the functionality of a module which was built as 

a DSO you 

# have to place corresponding ‘LoadModule’ lines at this location so 

the 

# directives contained in it are actually available _before_ they are 

used. 

# Please read the file README.DSO in the Apache 1.3 distribution for 

more 

# details about the DSO mechanism and run ‘httpd -l’ for the list of 

already 

# built-in (statically linked and thus always available) modules in 

your httpd 

# binary. 

# 

# Note: The order in which modules are loaded is important. Don’t 

change 

# the order below without expert advice. 

# 

# Example: 

# LoadModule foo_module libexec/mod_foo.so 

# 

#LoadModule php4_module libexec/libphp4.so 

# 

# ExtendedStatus controls whether Apache will generate "full" status 

# information (ExtendedStatus On) or just basic information ( 

ExtendedStatus 

95


# Off) when the "server-status" handler is called. The default is Off 

# 

. 

#ExtendedStatus On 

### Section 2: ’Main’ server configuration 

# 

# The directives in this section set up the values used by the ’main’ 

# server, which responds to any requests that aren’t handled by a 

# definition. These values also provide defaults for 

# any containers you may define later in the file. 

# 

# All of these directives may appear inside containers, 

# in which case these default settings will be overridden for the 

# virtual host being defined. 

# 

# 

# If your ServerType directive (set earlier in the ’Global 

Environment’ 

# section) is set to "inetd", the next few directives don’t have any 

# effect since their settings are defined by the inetd configuration. 

# Skip ahead to the ServerAdmin directive. 

# 

# 

# Port: The port to which the standalone server listens. For 

# ports < 1023, you will need httpd to be run as root initially. 

# 

Port 80 

## SSL Support 

## 

## When we also provide SSL we have to listen to the 

## standard HTTP port (see above) and to the HTTPS port 

## 

 

Listen 443 

 

96


# 

# If you wish httpd to run as a different user or group, you must run 

# httpd as root initially and it will switch. 

# 

# User/Group: The name (or #number) of the user/group to run httpd as 

. 

# . On SCO (ODT 3) use "User nouser" and "Group nogroup". 

# . On HPUX you may not be able to use shared memory as nobody, and 

the 

# suggested workaround is to create a user www and use that user. 

# NOTE that some kernels refuse to setgid(Group) or semctl(IPC_SET) 

# when the value of (unsigned)Group is above 60000; 

# don’t use Group nobody on these systems! 

# 

User apache 

Group apache 

# 

# ServerAdmin: Your address, where problems with the server should be 

# e-mailed. This address appears on some server-generated pages, 

such 

# as error documents. 

# 

ServerAdmin webmaster@here.com 

# 

# ServerName allows you to set a host name which is sent back to 

clients for 

# your server if it’s different than the one the program would get (i 

.e., use 

# "www" instead of the host’s real name). 

# 

# Note: You cannot just invent host names and hope they work. The 

name you 

# define here must be a valid DNS name for your host. If you don’t 

understand 

# this, ask your network administrator. 

# If your host doesn’t have a registered DNS name, enter its IP 

address here. 

# You will have to access it by its address (e.g., http 

://123.45.67.89/) 

97


# anyway, and this will make redirections work in a sensible way. 

# 

# 127.0.0.1 is the TCP/IP local loop-back address, often named 

localhost. Your 

# machine always knows itself by this address. If you use Apache 

strictly for 

# local testing and development, you may use 127.0.0.1 as the server 

# 

name. 

ServerName www.here.com 

# 

# DocumentRoot: The directory out of which you will serve your 

# documents. By default, all requests are taken from this directory, 

but 

# symbolic links and aliases may be used to point to other locations. 

# 

DocumentRoot "/var/www/html" 

# 

# Each directory to which Apache has access, can be configured with 

respect 

# to which services and features are allowed and/or disabled in that 

# directory (and its subdirectories). 

# 

# First, we configure the "default" to be a very restrictive set of 

# permissions. 

# 

 

Options FollowSymLinks 

AllowOverride None 

 

# 

# Note that from this point forward you must specifically allow 

# particular features to be enabled - so if something’s not working 

as 

# you might expect, make sure that you have specifically enabled it 

# below. 

# 

98


# 

# This should be changed to whatever you set DocumentRoot to. 

# 

 

# 

# This may also be "None", "All", or any combination of "Indexes", 

# "Includes", "FollowSymLinks", "ExecCGI", or "MultiViews". 

# 

# Note that "MultiViews" must be named *explicitly* --- "Options All" 

# doesn’t give it to you. 

# 

# 

Options Includes FollowSymLinks MultiViews 

# This controls which options the .htaccess files in directories can 

# override. Can also be "All", or any combination of "Options", " 

FileInfo", 

# "AuthConfig", and "Limit" 

# 

# 

AllowOverride All 

# Controls who can get stuff from this server. 

# 

Order allow,deny 


 

# 

# UserDir: The name of the directory which is appended onto a user’s 

home 

# directory if a ˜user request is received. 

# 

 

UserDir public_html 

 

# 

# Control access to UserDir directories. The following is an example 

# for a site where these directories are restricted to read-only. 

99


# 

 

AllowOverride All 

Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec 

# 

# Order allow,deny 

# Allow from all 

# 

# 

# Order deny,allow 

# Deny from all 

# 

 

# 

# DirectoryIndex: Name of the file or files to use as a pre-written 

HTML 

# directory index. Separate multiple entries with spaces. 

# 

 

DirectoryIndex index.html index.htm index.shtml default.htm 

 

# 

default.html index.php 

# AccessFileName: The name of the file to look for in each directory 

# for access control information. 

# 

AccessFileName .htaccess 

# 

# The following lines prevent .htaccess files from being viewed by 

# Web clients. Since .htaccess files often contain authorization 

# information, access is disallowed for security reasons. Comment 

# these lines out if you want Web visitors to see the contents of 

# .htaccess files. If you change the AccessFileName directive above, 

# be sure to make the corresponding changes here. 

# 

# Also, folks tend to use names such as .htpasswd for password 

100


# files, so this will protect those as well. 

# 

 

 

# 


Deny from all 

# CacheNegotiatedDocs: By default, Apache sends "Pragma: no-cache" 

with each 

# document that was negotiated on the basis of content. This asks 

proxy 

# servers not to cache the document. Uncommenting the following line 

disables 

# this behavior, and proxies will be allowed to cache the documents. 

# 

#CacheNegotiatedDocs 

# 

# UseCanonicalName: (new for 1.3) With this setting turned on, 

whenever 

# Apache needs to construct a self-referencing URL (a URL that refers 

back 

# to the server the response is coming from) it will use ServerName 

and 

# Port to form a "canonical" name. With this setting off, Apache 

will 

# use the hostname:port that the client supplied, when possible. 

This 

# also affects SERVER_NAME and SERVER_PORT in CGI scripts. 

# 

UseCanonicalName On 

# 

# TypesConfig describes where the mime.types file (or equivalent) is 

# to be found. 

# 

 

TypesConfig /usr/local/apache/conf/mime.types 

 

101


# 

# DefaultType is the default MIME type the server will use for a 

document 

# if it cannot otherwise determine one, such as from filename 

extensions. 

# If your server contains mostly text or HTML documents, "text/plain" 

is 

# a good value. If most of your content is binary, such as 

applications 

# or images, you may want to use "application/octet-stream" instead 

to 

# keep browsers from trying to display binary files as though they 

are 

# text. 

# 

DefaultType text/plain 

# 

# The mod_mime_magic module allows the server to use various hints 

from the 

# contents of the file itself to determine its type. The 

MIMEMagicFile 

# directive tells the module where the hint definitions are located. 

# mod_mime_magic is not part of the default server (you have to add 

# it yourself with a LoadModule [see the DSO paragraph in the ’Global 

# Environment’ section], or recompile the server and include 

mod_mime_magic 

# as part of the configuration), so it’s enclosed in an 

container. 

# This means that the MIMEMagicFile directive will only be processed 

if the 

# module is part of the server. 

# 

 

MIMEMagicFile /usr/local/apache/conf/magic 

 

# 

# HostnameLookups: Log the names of clients or just their IP 

addresses 

# e.g., www.apache.org (on) or 204.62.129.132 (off). 

102


# The default is off because it’d be overall better for the net if 

people 

# had to knowingly turn this feature on, since enabling it means that 

# each client request will result in AT LEAST one lookup request to 

the 

# nameserver. 

# 

HostnameLookups Off 

# 

# ErrorLog: The location of the error log file. 

# If you do not specify an ErrorLog directive within a 

# container, error messages relating to that virtual host will be 

# logged here. If you *do* define an error logfile for a < 

VirtualHost> 

# container, that host’s errors will be logged there and not here. 

# 

ErrorLog /usr/local/apache/logs/error_log 

# 

# LogLevel: Control the number of messages logged to the error_log. 

# Possible values include: debug, info, notice, warn, error, crit, 

# alert, emerg. 

# 

LogLevel warn 

# 

# The following directives define some format nicknames for use with 

# a CustomLog directive (see below). 

# 

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i 

\"" combined 

LogFormat "%h %l %u %t \"%r\" %>s %b" common 

LogFormat "%{Referer}i -> %U" referer 

LogFormat "%{User-agent}i" agent 

# 

# The location and format of the access logfile (Common Logfile 

Format). 

# If you do not define any access logfiles within a 

103


# container, they will be logged here. Contrariwise, if you *do* 

# define per- access logfiles, transactions will be 

# logged therein and *not* in this file. 

# 

CustomLog /usr/local/apache/logs/access_log common 

# 

# If you would like to have agent and referer logfiles, uncomment the 

# following directives. 

# 

#CustomLog /usr/local/apache/logs/referer_log referer 

#CustomLog /usr/local/apache/logs/agent_log agent 

# 

# If you prefer a single logfile with access, agent, and referer 

information 

# (Combined Logfile Format) you can use the following directive. 

# 

#CustomLog /usr/local/apache/logs/access_log combined 

# 

# Optionally add a line containing the server version and virtual 

host 

# name to server-generated pages (error documents, FTP directory 

listings, 

# mod_status and mod_info output etc., but not CGI generated 

documents). 

# Set to "EMail" to also include a mailto: link to the ServerAdmin. 

# Set to one of: On | Off | EMail 

# 

ServerSignature On 

#This directive controls whether Server response header field which 

is 

#sent back to clients includes a description of the generic OS-type 

of the 

#server as well as information about compiled-in modules. 

#ServerTokens Prod[uctOnly] only available in versions later than 

1.3.12 

104


# Server sends (e.g.): Server: Apache 

#ServerTokens Min[imal] 

# Server sends (e.g.): Server: Apache/1.3.0 

#ServerTokens OS 

# Server sends (e.g.): Server: Apache/1.3.0 (Unix) 

#ServerTokens Full (or not specified) 

# Server sends (e.g.): Server: Apache/1.3.0 (Unix) PHP/3.0 MyMod 

/1.2 

ServerTokens Prod 

# EBCDIC configuration: 

# (only for mainframes using the EBCDIC codeset, currently one of: 

# Fujitsu-Siemens’ BS2000/OSD, IBM’s OS/390 and IBM’s TPF)!! 

# The following default configuration assumes that "text files" 

# are stored in EBCDIC (so that you can operate on them using the 

# normal POSIX tools like grep and sort) while "binary files" are 

# stored with identical octets as on an ASCII machine. 

# 

# The directives are evaluated in configuration file order, with 

# the EBCDICConvert directives applied before EBCDICConvertByType. 

# 

# If you want to have ASCII HTML documents and EBCDIC HTML documents 

# at the same time, you can use the file extension to force 

# conversion off for the ASCII documents: 

# > AddType text/html .ahtml 

# > EBCDICConvert Off=InOut .ahtml 

# 

# EBCDICConvertByType On=InOut text/* message/* multipart/* 

# EBCDICConvertByType On=In application/x-www-form-urlencoded 

# EBCDICConvertByType On=InOut application/postscript model/vrml 

# EBCDICConvertByType Off=InOut */* 

# 

# Aliases: Add here as many aliases as you need (with no limit). The 

format is 

# Alias fakename realname 

# 

 

# 

105


# Note that if you include a trailing / on fakename then the 

server will 

# require it to be present in the URL. So "/icons" isn’t aliased 

in this 

# example, only "/icons/". If the fakename is slash-terminated, 

then the 

# realname must also be slash terminated, and if the fakename 

omits the 

# trailing slash, the realname must also omit it. 

# 

Alias /icons/ "/usr/local/apache/icons/" 

Alias /icons/ "/var/www/icons/" 

 

Options Indexes MultiViews 




 

# 

# ScriptAlias: This controls which directories contain server 

scripts. 

# ScriptAliases are essentially the same as Aliases, except that 

# documents in the realname directory are treated as applications 

and 

# run by the server when requested rather than as documents sent 

to the client. 

# The same rules about trailing "/" apply to ScriptAlias 

directives as to 

# Alias. 

# 

# ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/" 

ScriptAlias /cgi-bin/ "/var/www/cgi-bin/" 

# 

# "/var/www/cgi-bin" should be changed to whatever your ScriptAliased 

# CGI directory exists, if you have that configured. 

# 

106


 


Options ExecCGI 



 

# 

# "/usr/local/apache/cgi-bin" should be changed to whatever your 

ScriptAliased 

# CGI directory exists, if you have that configured. 

# 

# 

# AllowOverride None 

# Options None 

# Order allow,deny 

# Allow from all 

# 

 

# End of aliases. 

# 

# Redirect allows you to tell clients about documents which used to 

exist in 

# your server’s namespace, but do not anymore. This allows you to 

tell the 

# clients where to look for the relocated document. 

# Format: Redirect old-URI new-URL 

# 

# 

# Directives controlling the display of server-generated directory 

# 

listings. 

 

# 

# FancyIndexing is whether you want fancy directory indexing or 

# 

standard 

107


IndexOptions FancyIndexing 

# 

# AddIcon* directives tell the server which icon to show for 

different 

# files or filename extensions. These are only displayed for 

# FancyIndexed directories. 

# 

AddIconByEncoding (CMP,/icons/compressed.gif) x-compress x-gzip 

AddIconByType (TXT,/icons/text.gif) text/* 

AddIconByType (IMG,/icons/image2.gif) image/* 

AddIconByType (SND,/icons/sound2.gif) audio/* 

AddIconByType (VID,/icons/movie.gif) video/* 

AddIcon /icons/binary.gif .bin .exe 

AddIcon /icons/binhex.gif .hqx 

AddIcon /icons/tar.gif .tar 

AddIcon /icons/world2.gif .wrl .wrl.gz .vrml .vrm .iv 

AddIcon /icons/compressed.gif .Z .z .tgz .gz .zip 

AddIcon /icons/a.gif .ps .ai .eps 

AddIcon /icons/layout.gif .html .shtml .htm .pdf 

AddIcon /icons/text.gif .txt 

AddIcon /icons/c.gif .c 

AddIcon /icons/p.gif .pl .py 

AddIcon /icons/f.gif .for 

AddIcon /icons/dvi.gif .dvi 

AddIcon /icons/uuencoded.gif .uu 

AddIcon /icons/script.gif .conf .sh .shar .csh .ksh .tcl 

AddIcon /icons/tex.gif .tex 

AddIcon /icons/bomb.gif core 

AddIcon /icons/back.gif .. 

AddIcon /icons/hand.right.gif README 

AddIcon /icons/folder.gif ˆˆDIRECTORYˆˆ 

AddIcon /icons/blank.gif ˆˆBLANKICONˆˆ 

# 

# DefaultIcon is which icon to show for files which do not have 

an icon 

# explicitly set. 

108


# 

DefaultIcon /icons/unknown.gif 

# 

# AddDescription allows you to place a short description after a 

file in 

# server-generated indexes. These are only displayed for 

FancyIndexed 

# directories. 

# Format: AddDescription "description" filename 

# 

#AddDescription "GZIP compressed document" .gz 

#AddDescription "tar archive" .tar 

#AddDescription "GZIP compressed tar archive" .tgz 

# 

# ReadmeName is the name of the README file the server will look 

for by 

# default, and append to directory listings. 

# 

# HeaderName is the name of a file which should be prepended to 

# directory indexes. 

# 

# If MultiViews are amongst the Options in effect, the server 

will 

# first look for name.html and include it if found. If name.html 

# doesn’t exist, the server will then look for name.txt and 

include 

# it as plaintext if found. 

# 

ReadmeName README 

HeaderName HEADER 

# 

# IndexIgnore is a set of filenames which directory indexing 

should ignore 

# and not include in the listing. Shell-style wildcarding is 

# 

permitted. 

IndexIgnore .??* *˜ *# HEADER* README* RCS CVS *,v *,t 

109


 

# End of indexing directives. 

# 

# Document types. 

# 

 

# 

# AddEncoding allows you to have certain browsers (Mosaic/X 2.1+) 

uncompress 

# information on the fly. Note: Not all browsers support this. 

# Despite the name similarity, the following Add* directives have 

nothing 

# to do with the FancyIndexing customization directives above. 

# 

AddEncoding x-compress Z 

AddEncoding x-gzip gz tgz 

# 

# AddLanguage allows you to specify the language of a document. 

You can 

# then use content negotiation to give a browser a file in a 

language 

# it can understand. 

# 

# Note 1: The suffix does not have to be the same as the language 

# keyword --- those with documents in Polish (whose net-standard 

# language code is pl) may wish to use "AddLanguage pl .po" to 

# avoid the ambiguity with the common suffix for perl scripts. 

# 

# Note 2: The example entries below illustrate that in quite 

# some cases the two character ’Language’ abbreviation is not 

# identical to the two character ’Country’ code for its country, 

# E.g. ’Danmark/dk’ versus ’Danish/da’. 

# 

# Note 3: In the case of ’ltz’ we violate the RFC by using a 

three char 

# specifier. But there is ’work in progress’ to fix this and get 

# the reference data for rfc1766 cleaned up. 

110


# 

# Danish (da) - Dutch (nl) - English (en) - Estonian (ee) 

# French (fr) - German (de) - Greek-Modern (el) 

# Italian (it) - Korean (kr) - Norwegian (no) 

# Portugese (pt) - Luxembourgeois* (ltz) 

# Spanish (es) - Swedish (sv) - Catalan (ca) - Czech(cz) 

# Polish (pl) - Brazilian Portuguese (pt-br) - Japanese (ja) 

# Russian (ru) 

# 

AddLanguage da .dk 

AddLanguage nl .nl 

AddLanguage en .en 

AddLanguage et .ee 

AddLanguage fr .fr 

AddLanguage de .de 

AddLanguage el .el 

AddLanguage he .he 

AddCharset ISO-8859-8 .iso8859-8 

AddLanguage it .it 

AddLanguage ja .ja 

AddCharset ISO-2022-JP .jis 

AddLanguage kr .kr 

AddCharset ISO-2022-KR .iso-kr 

AddLanguage no .no 

AddLanguage pl .po 

AddCharset ISO-8859-2 .iso-pl 

AddLanguage pt .pt 

AddLanguage pt-br .pt-br 

AddLanguage ltz .lu 

AddLanguage ca .ca 

AddLanguage es .es 

AddLanguage sv .se 

AddLanguage cz .cz 

AddLanguage ru .ru 

AddLanguage zh-tw .tw 

AddLanguage tw .tw 

AddCharset Big5 .Big5 .big5 

AddCharset WINDOWS-1251 .cp-1251 

AddCharset CP866 .cp866 

AddCharset ISO-8859-5 .iso-ru 

AddCharset KOI8-R .koi8-r 

111


AddCharset UCS-2 .ucs2 

AddCharset UCS-4 .ucs4 

AddCharset UTF-8 .utf8 

# LanguagePriority allows you to give precedence to some 

languages 

# in case of a tie during content negotiation. 

# 

# Just list the languages in decreasing order of preference. We 

have 

# more or less alphabetized them here. You probably want to 

# 

change this. 

 

LanguagePriority en da nl et fr de el it ja kr no pl pt pt-br 

 

# 

ru ltz ca es sv tw 

# AddType allows you to tweak mime.types without actually editing 

it, or to 

# make certain files to be certain types. 

# 

# For example, the PHP 3.x module (not part of the Apache 

# 

distribution - see # http://www.php.net) will typically use 

: 

#AddType application/x-httpd-php3 .php3 

#AddType application/x-httpd-php3-source .phps 

# 

# And for PHP 4.x, use: 

# 

AddType application/x-httpd-php .php 

#AddType application/x-httpd-php-source .phps 

AddType application/x-tar .tgz 

AddType application/x-ica .ica 

# 

# AddHandler allows you to map certain file extensions to " 

handlers", 

112


# actions unrelated to filetype. These can be either built into 

the server 

# or added with the Action command (see below) 

# 

# If you want to use server side includes, or CGI outside 

# ScriptAliased directories, uncomment the following lines. 

# 

# To use CGI scripts: 

# 

#AddHandler cgi-script .cgi 

# 

# To use server-parsed HTML files 

# 

AddType text/html .shtml 

AddHandler server-parsed .shtml 

# 

# Uncomment the following line to enable Apache’s send-asis HTTP 

file 

# feature 

# 

#AddHandler send-as-is asis 

# 

# If you wish to use server-parsed imagemap files, use 

# 

AddHandler imap-file map 

# 

# To enable type maps, you might want to use 

# 

#AddHandler type-map var 

 

# End of document types. 

# 

# Action lets you define media types that will execute a script 

whenever 

113


# a matching file is called. This eliminates the need for repeated 

URL 

# pathnames for oft-used CGI file processors. 

# Format: Action media/type /cgi-script/location 

# Format: Action handler-name /cgi-script/location 

# 

# 

# MetaDir: specifies the name of the directory in which Apache can 

find 

# meta information files. These files contain additional HTTP headers 

# to include when sending the document 

# 

#MetaDir .web 

# 

# MetaSuffix: specifies the file name suffix for the file containing 

the 

# meta information. 

# 

#MetaSuffix .meta 

# 

# Customizable error response (Apache style) 

# these come in three flavors 

# 

# 1) plain text 

#ErrorDocument 500 "The server made a boo boo. 

# n.b. the single leading (") marks it as text, it does not get 

# 

output 

# 2) local redirects 

#ErrorDocument 404 /missing.html 

# to redirect to local URL /missing.html 

#ErrorDocument 404 /cgi-bin/missing_handler.pl 

# N.B.: You can redirect to a script or a document using server-side 

# 

-includes. 

# 3) external redirects 

#ErrorDocument 402 http://some.other_server.com/subscription_info. 

html 

114


# N.B.: Many of the environment variables associated with the 

original 

# request will *not* be available to such a script. 

# 

# Customize behaviour based on the browser 

# 

 

# 

# The following directives modify normal HTTP response behavior. 

# The first directive disables keepalive for Netscape 2.x and 

browsers that 

# spoof it. There are known problems with these browser 

implementations. 

# The second directive is for Microsoft Internet Explorer 4.0b2 

# which has a broken HTTP/1.1 implementation and does not 

properly 

# support keepalive when it is used on 301 or 302 (redirect) 

# 

responses. 

BrowserMatch "Mozilla/2" nokeepalive 

BrowserMatch "MSIE 4\.0b2;" nokeepalive downgrade-1.0 force- 

# 

response-1.0 

# The following directive disables HTTP/1.1 responses to browsers 

which 

# are in violation of the HTTP/1.0 spec by not being able to grok 

a 

# basic 1.1 response. 

# 

BrowserMatch "RealPlayer 4\.0" force-response-1.0 

BrowserMatch "Java/1\.0" force-response-1.0 

BrowserMatch "JDK/1\.0" force-response-1.0 

 

# End of browser customization directives 

# 

115


# Allow server status reports, with the URL of http://servername/ 

server-status 

# Change the ".your_domain.com" to match your domain to enable. 

# 

# 

# SetHandler server-status 



# Allow from .your_domain.com 

# 

# 

# Allow remote server configuration reports, with the URL of 

# http://servername/server-info (requires that mod_info.c be loaded). 

# Change the ".your_domain.com" to match your domain to enable. 

# 

# 

# SetHandler server-info 




# 

# 

# There have been reports of people trying to abuse an old bug from 

pre-1.1 

# days. This bug involved a CGI script distributed as a part of 

Apache. 

# By uncommenting these lines you can redirect these attacks to a 

logging 

# script on phf.apache.org. Or, you can record them yourself, using 

the script 

# support/phf_abuse_log.cgi. 

# 

# 


# ErrorDocument 403 http://phf.apache.org/phf_abuse_log.cgi 

# 

# 

# Proxy Server directives. Uncomment the following lines to 

116


# enable the proxy server: 

# 

# 

# ProxyRequests On 

# 




# 

# 

# Enable/disable the handling of HTTP/1.1 "Via:" headers. 

# ("Full" adds the server version; "Block" removes all outgoing 

Via: headers) 

# Set to one of: Off | On | Full | Block 

# 

# ProxyVia On 

# 

# To enable the cache as well, edit and uncomment the following 

lines: 

# (no cacheing without CacheRoot) 

# 

# CacheRoot "/usr/local/apache/proxy" 

# CacheSize 5 

# CacheGcInterval 4 

# CacheMaxExpire 24 

# CacheLastModifiedFactor 0.1 

# CacheDefaultExpire 1 

# NoCache a_domain.com another_domain.edu joes.garage_sale.com 

# 

# End of proxy directives. 

### Section 3: Virtual Hosts 

# 

# VirtualHost: If you want to maintain multiple domains/hostnames on 

your 

117


# machine you can setup VirtualHost containers for them. Most 

configurations 

# use only name-based virtual hosts so the server doesn’t need to 

worry about 

# IP addresses. This is indicated by the asterisks in the directives 

# 

below. 

# Please see the documentation at 

# for further details before you try to setup virtual hosts. 

# 

# You may use the command line option ’-S’ to verify your virtual 

host 

# configuration. 

# 

# Use name-based virtual hosting. 

# 

#NameVirtualHost * 

NameVirtualHost 192.168.0.100:80 

# 

# VirtualHost example: 

# Almost any Apache directive may go into a VirtualHost container. 

# The first VirtualHost section is used for requests without a known 

# server name. 

# 

# 

# ServerAdmin webmaster@dummy-host.example.com 

# DocumentRoot /www/docs/dummy-host.example.com 

# ServerName dummy-host.example.com 

# ErrorLog logs/dummy-host.example.com-error_log 

# CustomLog logs/dummy-host.example.com-access_log common 

# 

 

ServerAdmin webmaster@domain1.com 

DocumentRoot /var/www/html/domain1.com 

ServerName www.domain1.com 

118


ErrorLog /var/log/httpd/domain1.com-error_log 

CustomLog /var/log/httpd/domain1.com-access_log combined 

 

 

## 

## SSL Virtual Host Context 

## 

 


DocumentRoot /var/www/html/domain1.com 

ServerName www.domain1.com 

ErrorLog /var/log/httpd/domain1.com-error_log 


SSLEngine on 

#next line is a fix for page not found error with php 

SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown 

downgrade-1.0 force-response-1.0 

SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+ 

SSLv2:+EXP:+eNULL 

SSLCertificateFile /usr/local/apache/conf/ssl.crt/www.domain1.com 

.crt 

SSLCertificateKeyFile /usr/local/apache/conf/ssl.key/www.domain1. 

com.key 

 

 

 


DocumentRoot /var/www/html/domain2.com/ 

ServerName domain2.com 

ServerAlias www.domain2.com domain2-variant.com www.domain2- 

variant.com 

ErrorLog /var/log/httpd/domain2.com-access_log combined 


 

119

B. APPENDIX - EVALUATION B.3. Eclipse Configuration File 

B.3 Eclipse Configuration File 

 

 

 

blabla 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

120


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

121


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

122


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


linktypehierarchytoeditor="0"/> 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

124


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

125


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

126


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

127


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

128


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


activePageID="org.eclipse.ui.views.ContentOutline" expanded="2" 

appearance="2">
 

 

 

 

 

 

&# 

x0A;
 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

130


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

< 

folder activePageID="org.tigris.subversion.subclipse.ui.repository 

.RepositoriesView" expanded="2" appearance="2">
 

 

 

 

 

 

 

131


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


node/Switchable.java"/> 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

133


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

134


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

135


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

136


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

137


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

138

B. APPENDIX - EVALUATION B.4. Performance Validation File 1 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

B.4 Performance Validation File 1 

# PostgreSQL Ident Authentication Maps 

# ==================================== 

# 

# Refer to the PostgreSQL Administrator’s Guide, chapter "Client 

# Authentication" for a complete description. A short synopsis 

# follows. 

# 

# This file controls PostgreSQL ident-based authentication. It maps 

# ident user names (typically Unix user names) to their corresponding 

# PostgreSQL user names. Records are of the form: 

# 

# MAPNAME IDENT-USERNAME PG-USERNAME 

# 

# (The uppercase quantities must be replaced by actual values.) 

# 

139


# MAPNAME is the (otherwise freely chosen) map name that was used in 

# pg_hba.conf. IDENT-USERNAME is the detected user name of the 

# client. PG-USERNAME is the requested PostgreSQL user name. The 

# existence of a record specifies that IDENT-USERNAME may connect as 

# PG-USERNAME. Multiple maps may be specified in this file and used 

# by pg_hba.conf. 

# 

# This file is read on server startup and when the postmaster 

receives 

# a SIGHUP signal. If you edit the file on a running system, you 

have 

# to SIGHUP the postmaster for the changes to take effect. You can 

use 

# "pg_ctl reload" to do that. 

# Put your actual configuration here 

# ---------------------------------- 

# 

# No map names are defined in the default configuration. If all 

ident 

# user names and PostgreSQL user names are the same, you don’t need 

# this file. Instead, use the special map name "sameuser" in 

# pg_hba.conf. 

# MAPNAME IDENT-USERNAME PG-USERNAME 


# PostgreSQL Client Authentication Configuration File 

# =================================================== 

# 

# Refer to the "Client Authentication" section in the 

# PostgreSQL documentation for a complete description 

# of this file. A short synopsis follows. 

# 

# This file controls: which hosts are allowed to connect, how clients 

# are authenticated, which PostgreSQL user names they can use, which 

# databases they can access. Records take one of these forms: 

# 

# local DATABASE USER METHOD [OPTION] 

140


# host DATABASE USER CIDR-ADDRESS METHOD [OPTION] 

# hostssl DATABASE USER CIDR-ADDRESS METHOD [OPTION] 

# hostnossl DATABASE USER CIDR-ADDRESS METHOD [OPTION] 

# 

# (The uppercase items must be replaced by actual values.) 

# 

# The first field is the connection type: "local" is a Unix-domain 

socket, 

# "host" is either a plain or SSL-encrypted TCP/IP socket, "hostssl" 

is an 

# SSL-encrypted TCP/IP socket, and "hostnossl" is a plain TCP/IP 

# 

socket. 

# DATABASE can be "all", "sameuser", "samerole", a database name, or 

# a comma-separated list thereof. 

# 

# USER can be "all", a user name, a group name prefixed with "+", or 

# a comma-separated list thereof. In both the DATABASE and USER 

fields 

# you can also write a file name prefixed with "@" to include names 

from 

# a separate file. 

# 

# CIDR-ADDRESS specifies the set of hosts the record matches. 

# It is made up of an IP address and a CIDR mask that is an integer 

# (between 0 and 32 (IPv4) or 128 (IPv6) inclusive) that specifies 

# the number of significant bits in the mask. Alternatively, you can 

write 

# an IP address and netmask in separate columns to specify the set of 

# 

hosts. 

# METHOD can be "trust", "reject", "md5", "crypt", "password", "gss", 

"sspi", 

# "krb5", "ident", "pam" or "ldap". Note that "password" sends 

passwords 

# in clear text; "md5" is preferred since it sends encrypted 

# 

passwords. 

# OPTION is the ident map or the name of the PAM service, depending 

# 

on METHOD. 

141


# Database and user names containing spaces, commas, quotes and other 

special 

# characters must be quoted. Quoting one of the keywords "all", " 

sameuser" or 

# "samerole" makes the name lose its special character, and just 

match a 

# database or username with that name. 

# 

# This file is read on server startup and when the postmaster 

receives 

# a SIGHUP signal. If you edit the file on a running system, you 

have 

# to SIGHUP the postmaster for the changes to take effect. You can 

use 

# "pg_ctl reload" to do that. 

# Put your actual configuration here 

# ---------------------------------- 

# 

# If you want to allow non-local connections, you need to add more 

# "host" records. In that case you will also need to make PostgreSQL 

listen 

# on a non-local interface via the listen_addresses configuration 

parameter, 

# or via the -i or -h command line switches. 

# 

# DO NOT DISABLE! 

# If you change this first entry you will need to make sure that the 

# database 

# super user can access the database using some other method. 

# Noninteractive 

# access to all databases is required during automatic maintenance 

# (custom daily cronjobs, replication, and similar tasks). 

# 

# Database administrative login by UNIX sockets 

local all postgres ident sameuser 

142


# TYPE DATABASE USER CIDR-ADDRESS METHOD 

# "local" is for Unix domain socket connections only 

local all all ident sameuser 

# IPv4 local connections: 

host all all 127.0.0.1/32 md5 

# IPv6 local connections: 

host all all ::1/128 md5 


# ----------------------------- 

# PostgreSQL configuration file 

# ----------------------------- 

# 

# This file consists of lines of the form: 

# 

# name = value 

# 

# (The "=" is optional.) Whitespace may be used. Comments are 

introduced with 

# "#" anywhere on a line. The complete list of parameter names and 

allowed 

# values can be found in the PostgreSQL documentation. 

# 

# The commented-out settings shown in this file represent the default 

values. 

# Re-commenting a setting is NOT sufficient to revert it to the 

default value; 

# you need to reload the server. 

# 

# This file is read on server startup and when the server receives a 

SIGHUP 

# signal. If you edit the file on a running system, you have to 

SIGHUP the 

# server for the changes to take effect, or use "pg_ctl reload". 

Some 

# parameters, which are marked below, require a server shutdown and 

restart to 

# take effect. 

143


# 

# Any parameter can also be given as a command-line option to the 

server, e.g., 

# "postgres -c log_connections=on". Some paramters can be changed at 

run time 

# with the "SET" SQL command. 

# 

# Memory units: kB = kilobytes MB = megabytes GB = gigabytes 

# Time units: ms = milliseconds s = seconds min = minutes h = 

hours d = days 

#------------------------------------------------------------------------------ 

# FILE LOCATIONS 

#------------------------------------------------------------------------------ 

# The default values of these variables are driven from the -D 

command-line 

# option or PGDATA environment variable, represented here as 

ConfigDir. 

data_directory = ’/var/lib/postgresql/8.3/main’ 

hba_file = ’/etc/postgresql/8.3/main/pg_hba.conf’ 

ident_file = ’/etc/postgresql/8.3/main/pg_ident.conf’ 

# If external_pid_file is not explicitly set, no extra PID file is 

written. 

external_pid_file = ’/var/run/postgresql/8.3-main.pid’ 

#------------------------------------------------------------------------------ 

# CONNECTIONS AND AUTHENTICATION 

#------------------------------------------------------------------------------ 

# - Connection Settings - 

144


#listen_addresses = ’localhost’ # what IP address(es) to 

listen on; 

port = 5432 

max_connections = 100 

# comma-separated list of 

addresses; 

# defaults to ’localhost’, 

’*’ = all 

# (change requires restart) 

# Note: Increasing max_connections costs ˜400 bytes of shared memory 

per 

# connection slot, plus lock space (see max_locks_per_transaction). 

You might 

# also need to raise shared_buffers to support more connections. 

#superuser_reserved_connections = 3 # (change requires restart) 

unix_socket_directory = ’/var/run/postgresql’ 

#unix_socket_group = ’’ # (change requires restart) 

#unix_socket_permissions = 0777 # begin with 0 to use octal 

notation 


#bonjour_name = ’’ # defaults to the computer 

name 

# - Security and Authentication - 

#authentication_timeout = 1min # 1s-600s 

ssl = true 


#ssl_ciphers = ’ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH’ # allowed SSL 

ciphers 

#password_encryption = on 

#db_user_namespace = off 

# Kerberos and GSSAPI 


#krb_server_keyfile = ’’ # (change requires restart) 

#krb_srvname = ’postgres’ # (change requires restart, 

Kerberos only) 

#krb_server_hostname = ’’ # empty string matches any 

keytab entry 

145


# (change requires restart, 

Kerberos only) 

#krb_caseins_users = off # (change requires restart) 

#krb_realm = ’’ # (change requires restart) 

# - TCP Keepalives - 

# see "man 7 tcp" for details 

#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds; 

# 0 selects the system 

default 

#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds; 



#tcp_keepalives_count = 0 # TCP_KEEPCNT; 



#------------------------------------------------------------------------------ 

# RESOURCE USAGE (except WAL) 

#------------------------------------------------------------------------------ 

# - Memory - 

shared_buffers = 24MB 

#temp_buffers = 8MB # min 800kB 

#max_prepared_transactions = 5 # can be 0 or more 


# Note: Increasing max_prepared_transactions costs ˜600 bytes of 

shared memory 

# per transaction slot, plus lock space (see 

max_locks_per_transaction). 

#work_mem = 1MB # min 64kB 

#maintenance_work_mem = 16MB # min 1MB 

#max_stack_depth = 2MB # min 100kB 

# - Free Space Map - 

146


max_fsm_pages = 153600 

#max_fsm_relations = 1000 # min 100, ˜70 bytes each 

# - Kernel Resource Usage - 

#max_files_per_process = 1000 # min 25 



#shared_preload_libraries = ’’ # (change requires restart) 

# - Cost-Based Vacuum Delay - 

#vacuum_cost_delay = 0 # 0-1000 milliseconds 

#vacuum_cost_page_hit = 1 # 0-10000 credits 

#vacuum_cost_page_miss = 10 # 0-10000 credits 

#vacuum_cost_page_dirty = 20 # 0-10000 credits 

#vacuum_cost_limit = 200 # 1-10000 credits 

# - Background Writer - 

#bgwriter_delay = 200ms # 10-10000ms between rounds 

#bgwriter_lru_maxpages = 100 # 0-1000 max buffers written/ 

round 

#bgwriter_lru_multiplier = 2.0 # 0-10.0 multipler on buffers 

scanned/round 

#----------------------------------------------------------------------------- 

# WRITE AHEAD LOG 

#----------------------------------------------------------------------------- 

fsync = on 

synchronous_commit = on 

commit_delay = 0 

commit_siblings = 1 

# - Settings - 

147


#fsync = on # turns forced 

synchronization on or off 

#synchronous_commit = on # immediate fsync at commit 

#wal_sync_method = fsync # the default is the first 

option 

# supported by the operating 

system: 

# open_datasync 

# fdatasync 

# fsync 

# fsync_writethrough 

# open_sync 

#full_page_writes = on # recover from partial page 

writes 

#wal_buffers = 64kB # min 32kB 


#wal_writer_delay = 200ms # 1-10000 milliseconds 

#commit_delay = 0 # range 0-100000, in 

microseconds 

#commit_siblings = 5 # range 1-1000 

# - Checkpoints - 

#checkpoint_segments = 3 # in logfile segments, min 1, 

16MB each 

#checkpoint_timeout = 5min # range 30s-1h 

#checkpoint_completion_target = 0.5 # checkpoint target duration, 

0.0 - 1.0 

#checkpoint_warning = 30s # 0 is off 

# - Archiving - 

#archive_mode = off # allows archiving to be done 


#archive_command = ’’ # command to use to archive a logfile 

segment 

#archive_timeout = 0 # force a logfile segment switch 

after this 

# time; 0 is off 

148


#----------------------------------------------------------------------------- 

# QUERY TUNING 

#----------------------------------------------------------------------------- 

# - Planner Method Configuration - 

#enable_bitmapscan = on 

#enable_hashagg = on 

#enable_hashjoin = on 

#enable_indexscan = on 

#enable_mergejoin = on 

#enable_nestloop = on 

#enable_seqscan = on 

#enable_sort = on 

#enable_tidscan = on 

# - Planner Cost Constants - 

#seq_page_cost = 1.0 # measured on an arbitrary 

scale 

#random_page_cost = 4.0 # same scale as above 

#cpu_tuple_cost = 0.01 # same scale as above 

#cpu_index_tuple_cost = 0.005 # same scale as above 

#cpu_operator_cost = 0.0025 # same scale as above 

#effective_cache_size = 128MB 

# - Genetic Query Optimizer - 

#geqo = on 

#geqo_threshold = 12 

#geqo_effort = 5 # range 1-10 

#geqo_pool_size = 0 # selects default based on 

effort 

#geqo_generations = 0 # selects default based on 

effort 

#geqo_selection_bias = 2.0 # range 1.5-2.0 

# - Other Planner Options - 

149


#default_statistics_target = 10 # range 1-1000 

#constraint_exclusion = off 

#from_collapse_limit = 8 

#join_collapse_limit = 8 # 1 disables collapsing of 

explicit 

# JOIN clauses 

#------------------------------------------------------------------------------ 

# ERROR REPORTING AND LOGGING 

#------------------------------------------------------------------------------ 

### EMANUEL 

#log_destination = ’syslog’ 

#silent_mode = on 

#log_min_duration_statement = 0 

#log_duration = off 

#log_statement = ’none’ 

#syslog_ident = ’emanuel’ 

#log_line_prefix = ’%u %d %t ’ 

# - Where to Log - 

#log_destination = ’stderr’ # Valid values are 

combinations of 

# This is used when logging to stderr: 

# stderr, csvlog, syslog and 

eventlog, 

# depending on platform. 

csvlog 

# requires logging_collector 

to be on. 

#logging_collector = off # Enable capturing of stderr 

and csvlog 

150


# into log files. Required to 

be on for 

# csvlogs. 

# These are only used if logging_collector is on: 


#log_directory = ’pg_log’ # directory where log files 

are written, 

# can be absolute or relative 

to PGDATA 

#log_filename = ’postgresql-%Y-%m-%d_%H%M%S.log’ # log file 

name pattern, 

# can include strftime() 

escapes 

#log_truncate_on_rotation = off # If on, an existing log file 

of the 

# same name as the new log 

file will be 

# truncated rather than 

appended to. 

# But such truncation only 

occurs on 

# time-driven rotation, not 

on restarts 

# or size-driven rotation. 

Default is 

# off, meaning append to 

existing files 

# in all cases. 

#log_rotation_age = 1d # Automatic rotation of 

logfiles will 

# happen after that time. 0 

to disable. 

#log_rotation_size = 10MB # Automatic rotation of 

logfiles will 

# These are relevant when logging to syslog: 

#syslog_facility = ’LOCAL0’ 

151 

# happen after that much log 

output. 

# 0 to disable.


#syslog_ident = ’postgres’ 

# - When to Log - 

#client_min_messages = notice # values in order of 

decreasing detail: 

# debug5 





# log 

# notice 

# warning 

# error 

#log_min_messages = notice # values in order of 







# info 

# notice 

# warning 

# error 

# log 

# fatal 

# panic 

#log_error_verbosity = default # terse, default, or verbose 

messages 

#log_min_error_statement = error # values in order of 


152 




# debug2



# info 

# notice 

# warning 

# error 

# log 

# fatal 

# panic (effectively off) 

#log_min_duration_statement = 0 # -1 is disabled, 0 logs all 

statements 

# and their durations, > 0 

logs only 

# statements running at least 

this time. 

#silent_mode = off # DO NOT USE without syslog 

or 

# - What to Log - 

#debug_print_parse = off 

#debug_print_rewritten = off 

#debug_print_plan = off 

#debug_pretty_print = off 

#log_checkpoints = off 

#log_connections = off 

#log_disconnections = off 

#log_duration = off 

#log_hostname = off 

log_line_prefix = ’%t’ 

153 

# logging_collector 


# special values: 

# %u = user name 

# %d = database name 

# %r = remote host and port 

# %h = remote host 

# %p = process ID 

# %t = timestamp without 

milliseconds


# %m = timestamp with 

milliseconds 

# %i = command tag 

# %c = session ID 

# %l = session line number 

# %s = session start 

timestamp 

# %v = virtual transaction 

ID 

# %x = transaction ID (0 if 

none) 

# %q = stop here in non- 

session 

# processes 

# %% = ’%’ 

# e.g. ’ ’ 

#log_lock_waits = off # log lock waits >= 

deadlock_timeout 

#log_statement = ’none’ # none, ddl, mod, all 

#log_temp_files = -1 # log temporary files equal 

or larger 

# than specified size; 

# -1 disables, 0 logs all 

temp files 

#log_timezone = unknown # actually, defaults to TZ 

environment 

# setting 

#------------------------------------------------------------------------------ 

# RUNTIME STATISTICS 

#------------------------------------------------------------------------------ 

# - Query/Index Statistics Collector - 

#track_activities = on 

#track_counts = on 

#update_process_title = on 

154


# - Statistics Monitoring - 

#log_parser_stats = off 

#log_planner_stats = off 

#log_executor_stats = off 

#log_statement_stats = off 

#----------------------------------------------------------------------------- 

# AUTOVACUUM PARAMETERS 

#----------------------------------------------------------------------------- 

#autovacuum = on # Enable autovacuum 

subprocess? ’on’ 

# requires track_counts to 

also be on. 

#log_autovacuum_min_duration = -1 # -1 disables, 0 logs all 

actions and 

# their durations, > 0 logs 

only 

# actions running at least 

that time. 

#autovacuum_max_workers = 3 # max number of autovacuum 

subprocesses 

#autovacuum_naptime = 1min # time between autovacuum 

runs 

#autovacuum_vacuum_threshold = 50 # min number of row updates 

before 

# vacuum 

#autovacuum_analyze_threshold = 50 # min number of row updates 

before 

# analyze 

#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size 

before vacuum 

#autovacuum_analyze_scale_factor = 0.1 # fraction of table size 

before analyze 

#autovacuum_freeze_max_age = 200000000 # maximum XID age before 

forced vacuum 

155



#autovacuum_vacuum_cost_delay = 20 # default vacuum cost delay 

for 

# autovacuum, -1 means use 

# vacuum_cost_delay 

#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit 

for 

# autovacuum, -1 means use 

# vacuum_cost_limit 

#------------------------------------------------------------------------------ 

# CLIENT CONNECTION DEFAULTS 

#------------------------------------------------------------------------------ 

# - Statement Behavior - 

#search_path = ’"$user",public’ # schema names 

#default_tablespace = ’’ # a tablespace name, ’’ uses 

the default 

#temp_tablespaces = ’’ # a list of tablespace names, 

’’ uses 

#check_function_bodies = on 

#default_transaction_isolation = ’read committed’ 

#default_transaction_read_only = off 

#session_replication_role = ’origin’ 

# only default tablespace 

#statement_timeout = 0 # 0 is disabled 

#vacuum_freeze_min_age = 100000000 

#xmlbinary = ’base64’ 

#xmloption = ’content’ 

# - Locale and Formatting - 

datestyle = ’iso,mdy’ 

#timezone = unknown # actually, defaults to TZ 

environment 

156 

# setting


#timezone_abbreviations = ’Default’ # Select the set of available 

time zone 

# abbreviations. Currently, 

there are 

# Default 

# Australia 

# India 

# You can create your own 

file in 

# share/timezonesets/. 

#extra_float_digits = 0 # min -15, max 2 

#client_encoding = sql_ascii # actually, defaults to 

database 

# encoding 

# These settings are initialized by initdb, but they can be changed. 

lc_messages = ’en_US.UTF-8’ 

lc_monetary = ’en_US.UTF-8’ 

lc_numeric = ’en_US.UTF-8’ 

lc_time = ’en_US.UTF-8 

# default configuration for text search 

default_text_search_config = ’pg_catalog.english’ 

# - Other Defaults - 

#explain_pretty_print = on 

#dynamic_library_path = ’$libdir’ 

#local_preload_libraries = ’’ 

#----------------------------------------------------------------------------- 

# LOCK MANAGEMENT 

#----------------------------------------------------------------------------- 

#deadlock_timeout = 1s 

#max_locks_per_transaction = 64 # min 10 

157 

# (change requires restart)


# Note: Each lock table slot uses ˜270 bytes of shared memory, and 

there are 

# max_locks_per_transaction * (max_connections + 

max_prepared_transactions) 

# lock table slots. 

#------------------------------------------------------------------------------ 

# VERSION/PLATFORM COMPATIBILITY 

#------------------------------------------------------------------------------ 

# - Previous PostgreSQL Versions - 

#add_missing_from = off 

#array_nulls = on 

#backslash_quote = safe_encoding # on, off, or safe_encoding 

#default_with_oids = off 

#escape_string_warning = on 

#regex_flavor = advanced # advanced, extended, or 

basic 

#sql_inheritance = on 

#standard_conforming_strings = off 

#synchronize_seqscans = on 

# - Other Platforms and Clients - 

#transform_null_equals = off 

#------------------------------------------------------------------------------ 

# CUSTOMIZED OPTIONS 

#------------------------------------------------------------------------------ 

#custom_variable_classes = ’’ # list of custom variable 

class names 

158

Bibliography 

[Ama09] Amazon. Amazon elastic compute cloud (amazon ec2). World Wide Web, http: 

//aws.amazon.com/ec2/, 2009. 

[Cit09] Citrix. Xen source: Open source virtualization. World Wide Web, http://www. 

xensource.com, 2009. 

[Con03] World Wide Web Consortium. Extensible markup language. World Wide Web, 

http://www.w3.org/XML/, 2003. 

[Cru08] Thin Crust. Thin crust main page. World Wide Web, http://www.thincrust. 

net/, 2008. 

[DMT08] DMTF. Open virtualization format specificiation. World Wide Web, http:// 

www.dmtf.org/standards/published_documents/DSP0243_1.0.0.pdf, 

September 2008. 

[Fou09a] The Apache Software Foundation. Apache homepage. World Wide Web, http: 

//www.apache.org/, 2009. 

[Fou09b] The Eclipse Foundation. Eclipse homepage. World Wide Web, http://www. 

eclipse.org/, 2009. 

[Fou09c] The Eclipse Foundation. Rich client platform. World Wide Web, http://wiki. 

eclipse.org/index.php/Rich_Client_Platform, 2009. 

[FSF08] Inc. Free Software Foundation. Bison - gnu parser generator. World Wide Web, 

http://www.gnu.org/software/bison/, 2008. 

[GGL + 09] Patrick Goldsack, Julio Guijarro, Steve Loughran, Alistair Coles, Andrew Farrell, 

Antonio Lain, Paul Murray, and Peter Toft. The smartfrog configuration manage- 

ment framework. SIGOPS Oper. Syst. Rev., 43(1):16–25, 2009. 

[GH98] Etienne M. Gagnon and Laurie J. Hendren. Sablecc, an object-oriented compiler 

framework. Technology of Object-Oriented Languages, International Conference on, 

0:140, 1998. 

159

BIBLIOGRAPHY 

[Gro09a] Mantis PHP Bug Tracker Development Group. Mantis bug tracker homepage. 

World Wide Web, http://www.mantisbt.org/, 2009. 

[Gro09b] PostgreSQL Global Development Group. Postgresql homepage. World Wide Web, 

http://www.postgresql.org/, 2009. 

[Hon] Vasant Honavar. Automata induction, grammar inference, and language acqui- 

sition. World Wide Web, http://www.cs.iastate.edu/˜honavar/ailab/ 

projects/grammar.html. 

[Mic09a] Microsoft. Microsoft virtualization. World Wide Web, http://www.microsoft. 

com/virtualization/default.mspx, 2009. 

[Mic09b] Sun Microsystems. Mysql homepage. World Wide Web, http://www.mysql. 

com/, 2009. 

[Mic09c] Sun Microsystems. Sun virtualization. World Wide Web, http://www.sun.com/ 

solutions/virtualization/index.jsp, 2009. 

[Mic09d] Sun Microsystems. Using jar files. World Wide Web, http://java.sun.com/ 

developer/Books/javaprogramming/JAR/basics/, 2009. 

[oM] University of Maryland. Lecture: Top-down parsers. http://www.cs.umd.edu/ 

class/fall2002/cmsc430/lec4.pdf. 

[oV08] History of Virtualization. Vmware. World Wide Web, http://www.vmware.com/ 

technology/history.html, 2008. 

[Par09] Parallels. Parallels homepage. World Wide Web, http://www.parallels.com/, 

2009. 

[PH01] Rajesh Parekh and Vasant G. Honavar. Learning dfa from simple examples. Mach. 

Learn., 44(1-2):9–35, 2001. 

[Pro04] Sax Project. Sax. World Wide Web, http://www.saxproject.org/, 2004. 

[Pro09] JavaCC Project. Javacc homepage. World Wide Web, https://javacc.dev. 

java.net/, 2009. 

[Ram00] Ariel Ortiz Ramirez. Three-tier architecture. 2000. 

[SAF07] Ya-Yunn Su, Mona Attariyan, and Jason Flinn. Autobash: improving configuration 

management with operating system causality analysis. SIGOPS Oper. Syst. Rev., 

41(6):237–250, 2007. 

[Sin04] Amit Singh. An introduction to virtualization. http://www.kernelthread. 

com/publications/virtualization/, 2004. 

160

BIBLIOGRAPHY 

[Sol09a] Evolve Space Solutions. Virtu product line. World Wide Web, http: 

//www.evolve.pt/index.php?option=com_content&view=article&id= 

82&Itemid=85, 2009. 

[Sol09b] Evolve Space Solutions. Virtu software design document, 2009. 

[Sta07] James Staten. The case for virtual appliances, November 2007. 

[VMw] VMware. Vmware virtualization. http://www.vmware.com/technology/ 

virtualization.html. 

[VMw09] VMware. Vmware home page. World Wide Web, http://www.vmware.com, 

2009. 

[VX] Vmware and Xensource. The open virtual machine format whitepaper for ovf spec- 

ification. White Paper, Version 0.9. 

[W3S09] W3Schools. Xml dom tutorial. World Wide Web, http://www.w3schools.com/ 

dom/default.asp, 2009. 

[WCG04] Andrew Whitaker, Richard S. Cox, and Steven D. Gribble. Configuration debugging 

as search: finding the needle in the haystack. In OSDI’04: Proceedings of the 6th 

conference on Symposium on Opearting Systems Design & Implementation, pages 6–6, 

Berkeley, CA, USA, 2004. USENIX Association. 

161

Dissertaç ˜ao de Mestrado Mestrado em Engenharia Informática Jo ...

Create successful ePaper yourself

Delete template?

Save as template?