INFOCOMP - Departamento de Ciência da Computação - Ufla

ISSN 1807-4545 

INFOCOMP 

Journal of Computer Science 

Endereço/Address 

INFOCOMP – Journal of Computer Science 

Departamento de Ciência da Computação 

Universidade Federal de Lavras 

Caixa Postal 3037 

37200-000 – Lavras, MG, Brasil 

Tel./Fax: +55 35 3829-1545 

E-mail: infocomp@dcc.ufla.br 

http://www.dcc.ufla.br/infocomp 

Revista financiada com recursos da / Journal financed by 

Fundação de Amparo à Pesquisa do Estado de Minas Gerais 

INFOCOMP – Journal of Computer Science, Lavras, MG, Brazil, v.10, n.3, p. 01-55, September of 2011.

Ministério da Educação 

Minister: Fernando Haddad 

Rector: Antônio Nazareno Guimarães Mendes 

Vice-Rector: José Roberto Soares Scolforo 

Pro-Rector for Research: Édila Vilela de Resende Von Pinho 

Ed. UFLA - President of the editorial board: Renato Paiva 

Volume 10, no. 3, September of 2011. 

Editorial Board 

Editor-in-Chief 

Sanderson L. Gonzaga de Oliveira, UFLA, Brazil 

Advisory Editors 

Heitor Augustus Xavier Costa, UFLA, Brazil 

João Manuel R. S. Tavares, FEUP, Universidade do Porto, Portugal 

Muthu Ramachandran, Leeds Metropolitan University, UK 

Executive Editor 

Talles Heinfarth, UFLA, Brazil 

Horácio Hideki Yanasse, INPE, Brazil 

Luiz Henrique Andrade Correia, UFLA, Brazil 

Plínio de Sá Leitão Júnior, UFG, Brazil 

Scientific Editors 

Abdelmalek Amine, Univ. Djillali Liabes - Sidi, Algeria 

Alceu Britto Jr., PUC-PR, Brazil 

Alessandra Alaniz Macedo, USP, Brazil 

Alessandro Marchetto, IRST, Italy 

Alice Kozakevicius, UFSM, Brazil 

Anderson de Rezende Rocha, UNICAMP, Brazil 

André Luiz Zambalde, UFLA, Brazil 

André Vital Saúde, UFLA, Brazil 

Anita Fernandes, UNIVALI, Brazil 

Antônio Maria Pereira de Resende, UFLA, Brazil 

Antonio Pedro Timoszczuk, USP, Brazil 

António Ribeiro, European Commission, Italy 

Arnaldo de Albuquerque Araújo, UFMG, Brazil 

Aruna Ranganath, Bhoj Reddy Eng. Col. for Women, India 

Aswani Kumar Cherukuri, VIT University, India 

Ayyaswamy Kathirvel, KVCET, India 

Bruno de Oliveira Schneider, UFLA, Brazil 

Carlos de Castro Goulart, UFV, Brazil 

Claudio Cesar de Sá, UESC, Brazil 

Claudio R. Jung, UNISINOS, Brazil 

Daniel Mesquita, UFU, Brazil 

Deepak Dahiya, ITM Gurgaon, India 

Denilson Alve Pereira, UFLA, Brazil 

Eder Mateus Nunes Gonçalves, FURG, Brazil 

Elisa Huzita, UEM, Brazil 

Fábio Levy Siqueira, USP, Brazil 

Fatima L. S. Nunes, USP, Brazil 

Frank José Affonso, UNESP, Brazil 

Giovani Rubert Librelotto, UFSM, Brazil 

Heitor Augustus Xavier Costa, UFLA, Brazil 

Hernan Astudillo, Univ.Tec.Federico Santa Maria, Chile 

Hyggo Almeida, UFCG, Brazil 

Ilda Reis, FEUP, Universidade do Porto, Portugal 

Ildeberto Aparecido Rodello, USP, Brazil 

João Carlos Giacomin, UFLA, Brazil 

João Manuel R. S. Tavares, FEUP, Univ. do Porto, Portugal 

Joaquim Quinteiro Uchôa, UFLA, Brazil 

Johan M. Sharif, Swansea University, UK 

Jorge Martinez-Gil, University of Malaga, Spain 

Jorge Rady Almeida Junior, USP, Brazil 

José Luís Braga, UFV, Brazil 

Leonardo Ribeiro, UFLA, Brazil 

Luciana A. F. Martimiano, UEM, Brazil 

Luciano José Senger, UEPG, Brazil 

Luiz Camolesi Jr., UNICAMP, Brazil 

Luiz Carlos Begosso, FEMA, Brazil 

Luiz Eduardo G. Martins, UNIMEP, Brazil 

Luiz Henrique Andrade Correia, UFLA, Brazil 

Marco Aurelio Gerosa, USP, Brazil 

Marcos A. Cavenaghi, UNESP, Brazil 

Maria Istela Cagnin, UFMS, Brazil 

Marinalva Dias Soares, INPE, Brazil 

Michel S. Soares, UFU, Brazil 

Muthu Ramachandran, Leeds Metropolitan Univ., UK 

Nandamudi Vijaykumar, LAC-INPE, Brazil 

Omar Andres Carmona Cortes, CEFET/MA, Brazil 

O. P. Gupta, Punjab Agricultural University, India 

Plínio Sá Leitão Júnior, UFG, Brazil 

Priti Sajja, Sardar Patel University, India 

Rajkumar Samanta, Megnad Saha Inst. of Tech., India 

Reghunadhan Rajesh, Bharathiar University, India 

Renato de Freitas Bulcão Neto, UFG, Brazil 

Ricardo Terra, UFMG, Brazil 

Ricardo da Silva Torres, UNICAMP, Brazil 

Rodrigo Fernandes de Mello, USP, Brazil 

Roger Pizzato Nunes, UFPel, Brazil 

Rogéria Cristiane Gratão Souza, UNESP, Brazil 

Rosângela A. Delosso Penteado, UFSCar, Brazil 

Sanderson Lincohn Gonzaga de Oliveira, UFLA, Brazil 

Udo Fritzke Jr., PUC Minas em Poços de Caldas, Brazil 

Valter F. Avelino, USP, Brazil 

Valter Vieira de Camargo, UFSCar, Brazil 

Vitus S. W. Lam, Hong Kong University 

Wilian Soares Lacerda, UFLA, Brazil 

Technical staff: Ariana da Silva Laureado. 

Indexed in: INSPEC; Qualis-CAPES. 

INFOCOMP – Journal of Computer Science – v.10, n.3 (2011) – Lavras: Universidade Federal 

de Lavras, 2011. 

Annual (1999 - 2003), Semiannual (2004), Quarterly (2005 - ) 

Summaries in English 

ISSN 1807–4545 

1. Computer Science I. Universidade Federal de Lavras. II. Departamento de Ciência 

da Computação. 

Solicita-se permuta /Exchange desired; tiragem /circulation: 250.

Transformation by Modeling MOF 2.0 QVT: From UML to MVC2 

Web model 

REDOUANE ESBAI 1 

MOHAMMED ERRAMDANI 2 

SAMIR MBARKI 3 

IBTISSAM ARRASSEN 4 

ABDELOUAFI MEZIANE 5 

MIMOUN MOUSSAOUI 6 

MATSI Laboratory, EST 

Mohammed First University, Oujda, Morocco 

1 es.redouane@gmail.com 

6 moussaoui@est.univ-oujda.ac.ma 

Department of Management, EST 


2 mramdani69@yahoo.co.uk 

Department of Computer Sciences, Faculty of Science 

Ibn Tofail University, Kenitra, BP 133, Morocco 

3 mbarkisamir@hotmail.com 

Department of Mathematics and Computer Sciences, Faculty of Science 


4 arrassen@yahoo.com 

5 abdelouafi_meziane@yahoo.fr 

Abstract. The continuing evolution of business needs and technology makes Web applications more demanding in terms of development, maintenance, and management. To cope with this complexity, several frameworks have emerged. Given this diversity of solutions, generating code from UML models has become important. This paper presents the application of MDA (Model Driven Architecture) to generate, from a UML model, code that follows the MVC2 (Model-View-Controller) pattern, using the MOF 2.0 QVT (Meta-Object Facility 2.0 Query-View-Transformation) standard as the transformation language. This standard defines the meta-model for the development of model transformations. The transformation rules defined in this paper can generate, from the class diagram, an XML file containing the Actions, the Forms, and the JSP pages. This file can be used to generate the necessary code of a web application. 

Keywords: MDA, model transformation, MVC 2, transformation rules, MOF 2.0 QVT, meta-model, 

OCL. 

(Received January 11th, 2011 / Accepted July 16th, 2011) 

1 Introduction 

In recent years, many organizations have begun to consider MDA as an approach to designing and implementing enterprise applications. The key principle of MDA is the use of models at the different phases of application development, linked by model transformations. These transformations are central to MDA: they turn a CIM (Computation Independent Model) into a PIM (Platform Independent Model), or derive a PSM (Platform Specific Model) from a PIM. 

INFOCOMP, v. 10, no. 3, p. 01-11, September of 2011.

MVC2 is a programming scheme that takes into account the entire architecture of a program. It classifies the objects that make up the application into three categories: the model, which manages the behavior and data of the application domain; the view, which corresponds to the interface with which users interact; and the controller, which handles events and synchronization to update the model. This pattern saves time in maintenance and upgrades, and gives greater flexibility in dividing the work among developers (data, display and actions are independent). Many frameworks that implement the MVC2 pattern have emerged, for instance Struts [1], PureMVC [4], Gwittir [3], SpringMVC [5], Zend [6] and ASP.NET MVC2 [2]. Struts remains the most mature and highly trusted solution among developers. 

In the work of Mbarki and Erramdani [26, 27], both source and target meta-models were developed. The first is a PIM meta-model specific to class diagrams, and the second is a PSM meta-model for MVC2 web applications. The development was done with RSM (Rational Software Modeler) following a programming approach, meaning that the model transformations were programmed in the same way as ordinary computer applications. This paper revisits the work presented in [26, 27]. Here, however, we develop the transformation rules using the MOF 2.0 QVT standard to generate an XML file that contains the actions, forms and JSP pages used to produce the code of the target application. The advantage of this standard is the bidirectional execution of transformation rules. 

This paper is organized as follows: related works 

are presented in the second section, the third section defines 

the MDA approach, and the fourth section presents 

the MVC2 model and its implementation as a framework, 

Struts in this case. The transformation language 

MOF 2.0 QVT and the OCL constraint language are 

the subject of the fifth section. In the sixth section, we 

present the UML and MVC2 meta-models. In the seventh 

section, we present the transformation rules using 

MOF 2.0 QVT from UML source model to the MVC2 

target model. The last section concludes this paper and 

presents some perspectives. 

2 Related Work 

A highly relevant work on meta-modeling was completed in 2007 [22], in which the authors developed a meta-model for web requirements. This meta-model takes into account concepts such as use cases. The authors also developed transformation rules, but the main purpose of the work was to use this meta-model as a CIM, to be turned into a PIM and then into a PSM. 

Two other works [20, 24] followed the same logic. A meta-model for Ajax was defined using the AndroMDA tool. The generation of Ajax code was illustrated by a CRUD (Create, Read, Update, and Delete) application that manages people. 

Kraus, Knapp and Koch [25] show how to build JSP pages and JavaBeans using UWE (UML-based Web Engineering) [31] and the ATL transformation language [15]. 

Nasir, Hamid and Hassan [30] presented an approach to generate code for the .Net application Student Nomination Management System. The method used is WebML and the code was generated by applying the MDA approach, but the generation did not follow the .Net MVC2 logic. 

The work presented by Amen, Abdelaziz and Samir 

[13] aims at providing a generic approach to automate 

the translation of conceptual models’ integrity 

constraints to the relational context of the MDA approach. 

To do this, the authors proposed a transformational 

model based on the UML meta-model. The 

rules of that transformation are described by the graphical 

notation of QVT-Relations language. 

Oberortner, Vasko and Dustdar [32] have examined 

the security aspects. A meta-model was developed to integrate 

the roles of users to access various pages of the 

Web application. Each page contains navigation rules 

and each rule contains a decision (if, else if, else). 

Recently, Mbarki, Rahmouni and Erramdani [28] addressed the generation of the MVC2 Web model using the ATL transformation language. 

This paper revisits the work presented by Mbarki and Erramdani [26, 27], applying the MOF 2.0 QVT standard to develop the transformation rules that generate the MVC2 target model. To our knowledge, it is currently the only work pursuing this goal. 

3 Model Driven Architecture (MDA) 

In November 2000, OMG, a consortium of over 1 000 

companies, initiated the MDA approach. The key principle 

of MDA is the use of models at different phases 

of application development. Specifically, MDA advocates 

the development of requirements models (CIM), 

analysis and design (PIM) and code (PSM). 

The MDA architecture [29] is divided into four layers. In the first layer, we find the standards UML (Unified Modeling Language), MOF (Meta-Object Facility) and CWM (Common Warehouse Meta-model). In the second layer, we find the XMI (XML Metadata Interchange) standard, which enables the dialogue between middlewares (Java, CORBA, .NET and web services). The third layer contains the services that manage events, security, directories and transactions. The last layer provides frameworks that are adaptable to different types of applications (Finance, Telecommunications, Transport, Medicine, E-commerce, Manufacturing, etc.). 

The major objective of MDA [16] is to develop sustainable models that are independent of the technical details of the implementation platforms (Java EE, .Net, PHP or other), in order to enable the automatic generation of all the application code and thus a significant gain in productivity. MDA includes the definition of several standards, including UML [7], MOF [8] and XMI [9]. 

4 The MVC2 Pattern 

The Model-View-Controller (MVC) architectural pattern is widely used in software development; it was created in 1980 by Xerox PARC for Smalltalk-80. More recently it has been recommended by Sun as a model for Java EE. The pattern has also become very popular among PHP developers. MVC is a useful addition to a developer's toolbox, whatever the language used. 

The MVC pattern is a design pattern in the Architectural Patterns category. It is simple and very useful, and essentially builds an application from three parts: model, view and controller. 

Figure 1 shows the architecture of the MVC2 pattern. The main feature of this pattern is that it has a single controlling Servlet. The pattern separates the business logic, the server-side processing and the display. Each component is reusable and replaceable. 

Figure 1: MVC2 Architecture 

Based on this model, many frameworks have been designed to help developers build the presentation layer of their web applications. In the Java community, the Jakarta Struts project is one of the best examples. 

4.1 The Struts framework 

The Struts project [1] is managed within the Apache Software Foundation community, among the Jakarta projects. Its motivation is to provide the Java community with a framework based on the MVC2 architectural pattern that uses standard Java EE technologies [19]: JSP/Servlet, JavaBeans and XML. 

Struts is not the only framework for managing the presentation layer; other frameworks have been designed for the same goal, but Struts is the most mature. Its main advantage is its reduced complexity compared to other frameworks of comparable power [27], for instance PureMVC, Gwittir and WebWork. 

4.2 Architecture and functioning of Struts framework 

The structure of the Struts framework derives from the MVC2 model (see Figure 2). In this model, there is a controller, views and access to the model. 

Controller: The controller of the Struts framework is responsible for making the link between the view and the model. It receives all client requests and forwards them to specific actions. These correspondences (mappings) are described in a configuration file called struts-config.xml. 

View: The view is a set of JSP pages. To facilitate their construction, the Struts framework provides several tag libraries. 

Model: According to the MVC2 pattern, the model is independent from the controller. The Struts framework does not impose any model technology; the choice (JDBC, EJB, JDO, XML, etc.) is left to the developer, according to his needs. 

Figure 2: Principle of operation of the Struts framework 
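The division of roles between controller, view and model described in this section can be illustrated with a minimal Python sketch of a front controller. It is not Struts code: the names ACTION_MAPPINGS, VIEWS, display_department and handle_request, as well as the request path and the sample data, are hypothetical stand-ins for the mappings declared in struts-config.xml, the JSP pages, an Action and the main controller.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for the model: plain data, independent of the controller.
DEPARTMENTS = {1: "Research", 2: "Sales"}


def display_department(params: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Stand-in for an Action: query the model and report an outcome."""
    dept_id = int(params["id"])
    name = DEPARTMENTS.get(dept_id)
    if name is None:
        return "error", {"message": f"unknown department {dept_id}"}
    return "success", {"name": name}


# Stand-in for struts-config.xml: request path -> (action, forwards by outcome).
ACTION_MAPPINGS: Dict[str, Tuple[Callable, Dict[str, str]]] = {
    "/displayDepartment": (
        display_department,
        {"success": "DisplayDepartmentPage", "error": "ErrorPage"},
    ),
}

# Stand-ins for JSP pages: each view builds a response from the data it receives.
VIEWS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "DisplayDepartmentPage": lambda data: f"<h1>Department: {data['name']}</h1>",
    "ErrorPage": lambda data: f"<p>Error: {data['message']}</p>",
}


def handle_request(path: str, params: Dict[str, str]) -> str:
    """Single entry point, playing the role of the main controller."""
    action, forwards = ACTION_MAPPINGS[path]  # look up the mapping
    outcome, data = action(params)            # delegate to the action
    view = VIEWS[forwards[outcome]]           # select the page to forward to
    return view(data)                         # the view builds the response


print(handle_request("/displayDepartment", {"id": "1"}))
```

Keeping the mapping table separate from the actions and views mirrors the idea that the correspondences live in a configuration file rather than in the code of any one component.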

The interaction between the three components is managed by the main controller. To better understand how the framework works, we detail the life cycle of an HTTP request, shown schematically in Figure 2: 

1- The client sends its HTTP request to the application. This request is handled by the main controller, the ActionServlet in this case; 

2- The request is redirected to the appropriate controller; 

3- The chosen controller handles the request. A dialogue with the business logic is started when necessary; 

4- The model provides the requested data; 

5- The main controller is notified of the result of the processing. In case of success, the data are encapsulated in JavaBeans (ActionForm) and then transmitted to the JSP selected by the controller; 

6- The JSP constructs the response from the transmitted data; 

7- The response is sent to the browser. 

5 The transformations of MDA models 

MDA establishes traceability links between the CIM, PIM and PSM models through the execution of model transformations. 

The model transformations recommended by MDA are essentially CIM-to-PIM and PIM-to-PSM transformations. 

5.1 Approach by modeling 

Currently, model transformations can be written according to three approaches: the approach by Programming, the approach by Template and the approach by Modeling. 

The approach by Modeling is the one used in the present paper. It consists of applying concepts from model engineering to the model transformations themselves. The objective is to model a transformation, in order to obtain perennial and productive transformation models and to express their independence from execution platforms. To this end, OMG elaborated a standard transformation language called MOF 2.0 QVT [11]. The advantage of the approach by modeling is the bidirectional execution of transformation rules. This aspect is useful for synchronization, consistency and reverse engineering of models [18]. 

Figure 3 illustrates the approach by modeling. A model transformation is defined as a model structured according to the MOF 2.0 QVT meta-model. The MOF 2.0 QVT meta-model expresses structural correspondence rules between the source and target meta-models of a transformation. This transformation model is perennial and productive, and it must itself be transformed in order to execute the transformation on an execution platform. 

Figure 3: Approach by Modeling 

5.2 MOF 2.0 QVT 

Model transformations are at the heart of MDA, and the MOF 2.0 QVT standard was established to model them. This standard defines the meta-model for the development of transformation models. 

The QVT standard has a hybrid character (declarative/imperative), in the sense that it is composed of three different transformation languages (see Figure 4). 

The declarative part of QVT is defined by the Relations and Core languages, which have different levels of abstraction. Relations is a user-oriented language for defining transformations at a high level of abstraction. It has both a textual and a graphical syntax. The Core language forms the basic infrastructure of the declarative part; it is a lower-level technical language defined by a textual syntax. It is used to specify the semantics of the Relations language in the form of a Relations2Core transformation. The declarative style expresses a transformation as a combination of patterns on the source and target sides. 

The imperative part of QVT is supported by the Operational Mappings language. The imperative style requires explicit navigation as well as explicit creation of the target model elements. The Operational Mappings language extends the two declarative languages of QVT by adding imperative constructs (sequence, selection, repetition, etc.) and OCL constructs with side effects. 

Imperative-style languages are better suited to complex transformations that include a significant algorithmic component. Compared to the declarative style, they have the advantage of handling optional cases in a transformation. For this reason, we chose to use an imperative-style language in this paper. 

Finally, QVT offers a second extension mechanism, called Black Box, for specifying transformations that invoke transformation functionality implemented in an external language. 

Figure 4: The QVT Structure 

This work uses the QVT Operational Mappings language as implemented by SmartQVT [23]. SmartQVT is the first open source implementation of the QVT-Operational language. The tool comes as an Eclipse plug-in under the EPL license and runs on top of the EMF framework. It is developed by France Telecom R&D and was partially funded by the European IST project ModelWare. 

SmartQVT is composed of 3 components: 

• QVT Editor: helps end users to write QVT specifications. 

• QVT Parser: converts the QVT concrete textual syntax into its corresponding representation in terms of the QVT metamodel. 

• QVT Compiler: produces, from a QVT model, a Java program on top of EMF generated APIs for executing the transformation. The input format is a QVT specification provided in XMI 2.0 in conformance with the QVT meta-model. 

Figure 5 presents a minimal processing scenario: 

• The parser is called and gets as input a text file containing QVT code (qvtCode). 

• The parser returns the model conforming to the QVT metamodel. 

• Then the returned model is passed to the compiler. 

• Finally, we get a Java file implementing the transformation (javaFile). 

Figure 5: Transformation Scenario with SmartQVT tool 

5.3 OCL (Object Constraint Language) 

Object Constraint Language (OCL) is a formal language 

used to describe expressions on UML models. 

These expressions typically specify invariant conditions 

that must hold for the system being modeled or queries 

over objects described in a model. Note that when the 

OCL expressions are evaluated, they do not have side 

effects. OCL expressions can be used to specify operations 

/ actions that, when executed, do alter the state 

of the system. UML modelers can use OCL to specify 

application-specific constraints in their models. 
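As a rough analogy only (Python is not OCL, and the Employee class with its attributes is invented for illustration), the difference between a side-effect-free invariant and an operation that alters the system state can be sketched as follows. For comparison, the invariant would read roughly `context Employee inv: self.age >= 18` in OCL.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    age: int
    salary: float


def invariant_adult(e: Employee) -> bool:
    """Query in the spirit of an OCL invariant: reads state, never changes it."""
    return e.age >= 18


def raise_salary(e: Employee, amount: float) -> None:
    """An operation: executing it does alter the state of the object.

    The final check plays the role of a postcondition relating the new
    state to the old one.
    """
    old_salary = e.salary
    e.salary += amount
    assert e.salary == old_salary + amount  # postcondition check


emp = Employee("Alice", 30, 1000.0)
before = (emp.name, emp.age, emp.salary)
assert invariant_adult(emp)                        # evaluating the invariant...
assert (emp.name, emp.age, emp.salary) == before   # ...left the object unchanged
raise_salary(emp, 250.0)
print(emp.salary)
```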

Currently, several OCL tools exist, including ATL [14], Dresden OCL Toolkit [21], Eclipse MDT OCL [10], KMF [12], OCLE [17], etc. 

In MOF 2.0 QVT, OCL is extended to Imperative OCL as part of QVT Operational Mappings. Imperative OCL adds services to manipulate the system state (for example, to create and edit objects, links and variables) and some constructs of imperative programming languages (for example, loops and conditional execution). It is used in QVT Operational Mappings to specify the transformations. 

QVT defines two ways of expressing model transformations: 

declarative and operational approaches. 

The declarative approach is the Relations language 

where transformations between models are specified as 

a set of relationships that must hold for successful transformation. 

The operational approach allows either defining 

transformations using a complete imperative approach 

or complementing the relational transformations with 

imperative operations, by implementing relationships. 

Imperative OCL adds to OCL imperative elements that are commonly found in programming languages like Java. Its semantics are defined in [11] by an abstract syntax model. The complete abstract syntax of ImperativeOCL is shown in Figure 6. 

The most important aspect of the abstract syntax is that all expression classes inherit from OclExpression, the base class of all conventional OCL expressions. Therefore, imperative expressions can be used wherever an OclExpression is expected. 



Figure 6: Imperative Expressions of ImperativeOCL 

6 The UML and MVC2 meta-models 

To develop the transformation algorithm between the source and target models, we present in this section the different meta-classes forming the UML source meta-model and the MVC2 target meta-model. The source meta-model is a simplified UML model based on a package containing data types and classes. These classes contain typed properties characterized by multiplicities (upper and lower). The classes also contain operations with typed parameters. Figure 7 shows the source meta-model: 

Figure 7: Simplified UML Meta-model 

Figure 8 illustrates the first part of the target meta-model. This meta-model is a simplified diagram of relational databases. It consists of several tables, themselves composed of typed columns. 

Figure 8: Simplified meta-model of a relational database 

Figure 9 illustrates the second part of the target meta-model. This is the business model of the application to be processed. In our case, we opted for components such as Beans. We recall that Struts does not provide specific classes for the model. 

Figure 9: Simplified meta-model of a modelPackage 

Figure 10 illustrates the third part of the target meta-model. This meta-model represents the display of the application. In this model, the servlet calls the execute() method on the instance of the Action class. The action performs its processing and then calls the mapping.findForward() method to return to the specified JSP page. 

Figure 10: Simplified meta-model of a viewPackage 

Figure 11 shows the fourth part of the target meta-model, the controller package. This meta-model represents the controller of the application. The controller is responsible for receiving the requests sent by the client and for invoking the Action class. It thus interacts with the business model and coordinates the display by sending it to the client. 

Figure 11: Simplified meta-model of a controllerPackage 

The works of Mbarki and Erramdani [26, 27] contain more details on the topic of this section. 

7 The process of transforming UML source model to MVC2 target model (Struts) 

CRUD operations (Create, Read, Update, and Delete) are the operations most commonly implemented in all systems. That is why our transformation rules take these types of operations into account. In [26], only the read operation was implemented; our work aims to implement all CRUD operations. 

We first developed ECORE models corresponding to our source and target meta-models, and then we implemented the algorithm (see sub-section 7.1) using the QVT Operational Mappings transformation language. To validate our transformation rules, we conducted several tests. For example, we considered the class diagram of Figure 12. After applying the transformation to the UML model, composed of the classes Department, Employee and City (Ville), we generated the target model (see Figure 16). 

Figure 12: UML instance model 

7.1 The transformation rules 

By source model, we mean the model containing the various classes of our business model. The elements of this model are primarily classes. 

Main algorithm: 

input umlModel:UmlPackage 
output strutsModel:StrutsProjectPackage 
begin 
    create StrutsProjectPackage struts 
    create ViewPackage vp 
    vp = transformationRuleOne(e) 
    create ControllerPackage cp 
    cp = transformationRuleTwo(e) 
    link vp to struts 
    link cp to struts 
    return struts 
end 

function transformationRuleOne(e:Class):ViewPackage 
begin 
    create ViewPackage vp 
    for each e ∈ source model 
        if e.methods.name ≠ 'remove' 
            create JspPage page 
            link page to vp 
        end if 
    end for 
    return vp 
end 

function transformationRuleTwo(e:Class):ControllerPackage 
begin 
    create ControllerPackage cp 
    create ActionMapping am 
    for each page ∈ viewPackage 
        create actionForm 
        create Action action 
        create ActionForward actionForward 
        actionForm.input=page 
        actionForm.attribute=action 
        link page to actionForward 
        link actionForward to action 
        put action in am 
    end for 
    link am to cp 
    return cp 
end 
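The algorithm above can be sketched in Python for readers who want to experiment with it outside a QVT tool. This is an illustrative approximation, not the QVT Operational Mappings code of the paper: the dataclasses are simplified stand-ins for the meta-classes, the page and action naming scheme is an assumption, and each operation other than remove is taken to yield one JSP page, in line with the statement that each operation in a class corresponds to a JSP page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# --- Simplified stand-ins for the source (UML) meta-classes ---
@dataclass
class UmlClass:
    name: str
    methods: List[str] = field(default_factory=list)


@dataclass
class UmlPackage:
    classes: List[UmlClass] = field(default_factory=list)


# --- Simplified stand-ins for the target (MVC2) meta-classes ---
@dataclass
class JspPage:
    name: str


@dataclass
class ViewPackage:
    pages: List[JspPage] = field(default_factory=list)


@dataclass
class Action:
    name: str
    form: str          # name of the ActionForm attached to the action
    input: JspPage     # page the form is filled in from
    forward: JspPage   # page the action forwards to


@dataclass
class ControllerPackage:
    action_mapping: List[Action] = field(default_factory=list)


@dataclass
class StrutsProjectPackage:
    view: Optional[ViewPackage] = None
    controller: Optional[ControllerPackage] = None


def transformation_rule_one(uml: UmlPackage) -> ViewPackage:
    """Create one JSP page per operation, except 'remove'."""
    vp = ViewPackage()
    for cls in uml.classes:
        for method in cls.methods:
            if method != "remove":
                # The naming scheme is an assumption made for this sketch.
                vp.pages.append(JspPage(f"{method.capitalize()}{cls.name}Page.jsp"))
    return vp


def transformation_rule_two(vp: ViewPackage) -> ControllerPackage:
    """Create one Action, with its form and forward, per page of the view."""
    cp = ControllerPackage()
    for page in vp.pages:
        base = page.name[: -len("Page.jsp")]
        cp.action_mapping.append(
            Action(name=f"{base}Action", form=f"{base}Form", input=page, forward=page)
        )
    return cp


def main(uml: UmlPackage) -> StrutsProjectPackage:
    vp = transformation_rule_one(uml)
    cp = transformation_rule_two(vp)
    return StrutsProjectPackage(view=vp, controller=cp)


model = UmlPackage([UmlClass("Ville", ["display", "create", "update", "remove"])])
struts = main(model)
print([p.name for p in struts.view.pages])
print([a.name for a in struts.controller.action_mapping])
```

Like the pseudocode, this sketch derives actions only from the pages of the view package; any actions that have no page of their own are not modeled here.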

Figure 13 illustrates the first part of the transformation code from the UML source model to the MVC2 target model. 

Figure 13: The transformation code UML2Strut 

The transformation takes as input a model of type UML, named umlModel, and produces as output a model of type STRUTS, named strutsModel. The entry point of the transformation is the main method. This method maps all elements of type UmlPackage in the input model to elements of type StrutsProjectPackage in the output model. 

The objective of the second part of this code is to transform a UML package into a Struts package, creating elements such as the View package and the Controller package. Each class in the UML package is turned into JSP pages in the View package and into Actions in the Controller package, and the different packages are given names. 

Figure 14: The mapping class2view and Operation2JspPage 

The methods presented in Figure 14 mean that each operation in a class corresponds to a JSP page. 

Figure 15: The mapping class2action 

The method presented in Figure 15 means that each class corresponds to one or more actions, according to the names and types of the operations it contains. 

The code and models are publicly available online at http://sites.google.com/site/uml2mvc/. 

7.2 Result 

Figure 16 shows the result after applying the transformation 

rules. 



Figure 16: Generated PSM MVC2 Web model 

The first element in the generated PSM model is viewPackage, which contains the nine JSPs, namely DisplayVillePage.jsp, DisplayDepartementPage.jsp, DisplayEmployePage.jsp, CreateVillePage.jsp, CreateDepartementPage.jsp, CreateEmployePage.jsp, UpdateVillePage.jsp, UpdateDepartementPage.jsp and UpdateEmployePage.jsp. There is no page for removal, since the remove operation does not require any form. Next comes the controllerPackage element, which contains a single ActionMapping element. The latter contains eighteen actions, named DisplayXAction, CreateXAction, UpdateXAction, RemoveXAction, CreateXEndAction and UpdateXEndAction, where X should be replaced by Ville (City), Departement (Department) and Employe (Employee). The creation and update operations need forms for entering new values. For this reason, we add CreateXEndAction and UpdateXEndAction. 

Each Action element contains further elements. For example, DisplayDepartementAction contains two elements: the attribute element, which indicates that the form associated with this action is the ActionForm DisplayDepartementForm, and a Forwards element with the forward attribute DisplayDepartementPage.jsp. The Action element DisplayVilleAction contains only one Forwards element, with the forward attribute DisplayVillePage.jsp. The remaining actions follow the same principle. 
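The counts quoted in this section can be checked with a few lines of Python. The sketch only enumerates the names listed in the text for the three classes of the example; the lists OPERATIONS_WITH_PAGE and ACTION_PATTERNS are an encoding of those names chosen for this sketch, not part of the generated model.

```python
# Class names as they appear in the generated model (French spelling).
CLASSES = ["Ville", "Departement", "Employe"]

# Operations that get a JSP page: removal needs no form, hence no page.
OPERATIONS_WITH_PAGE = ["Display", "Create", "Update"]

# Action name patterns listed in the text; X stands for a class name.
ACTION_PATTERNS = [
    "DisplayXAction", "CreateXAction", "UpdateXAction",
    "RemoveXAction", "CreateXEndAction", "UpdateXEndAction",
]

# One page per (operation, class) pair and one action per (class, pattern) pair.
jsp_pages = [f"{op}{cls}Page.jsp" for op in OPERATIONS_WITH_PAGE for cls in CLASSES]
actions = [p.replace("X", cls, 1) for cls in CLASSES for p in ACTION_PATTERNS]

print(len(jsp_pages), len(actions))
```

Three operations with a page times three classes give the nine JSPs, and six action patterns times three classes give the eighteen actions.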

8 Conclusion and perspectives 

In this paper, we applied MDA to generate the code of an MVC2 web application from a UML class diagram. The purpose of our contribution is to revisit the works presented by Mbarki and Erramdani [26, 27]. Here, however, the transformation rules were developed by applying the approach by modeling, with MOF 2.0 QVT as the transformation language, to browse the class diagram and generate, through these rules, an XML file containing all the actions, forms and JSP pages. This file can be used to produce the necessary code of the target application. The transformation algorithm handles all CRUD operations. The advantage of this approach is the bidirectional execution of transformation rules. 

Moreover, this work can be complemented by advanced features of Web applications. For example, we can provide a user interface as well as the ability to incorporate other features, such as the persistence of objects in a relational database (Hibernate) and dependency injection (Spring), to produce a complete web application according to the n-tier architecture. This is the subject of a work currently being finalized. 

References 

[1] Apache software foundation: The apache 

struts web application software framework. 

http://struts.apache.org. 

[2] Asp.net mvc site http://www.asp.net/mvc/. 

[3] Gwittir site http://code.google.com/p/gwittir/. 

[4] Puremvc framework http://puremvc.org/. 

[5] Spring framework http://www.springsource.org/. 



[6] Zend framework http://framework.zend.com/. 

[7] UML Infrastructure Final Adopted Specification, 

version 2.0, September 2003. 

http://www.omg.org/cgi-bin/doc?ptc/03-09-15.pdf.

[8] Meta Object Facility (MOF), version 2.0, January 

2006. http://www.omg.org/spec/MOF/2.0/PDF/. 

[9] XML Metadata Interchange (XMI), version 2.1.1, 

December 2007. http://www.omg.org/spec/XMI/. 

[10] MDT-OCL-Team, MDT OCL, 2008. 

http://www.eclipse.org/modeling/mdt/?project=ocl. 

[11] Meta Object Facility 2.0 Query 

View Transformation (MOF 2.0 

QVT), Version 1.1, December 2009. 

http://www.omg.org/spec/QVT/1.1/Beta2/PDF/. 

[12] Akehurst, D. and Patrascoiu, O. The kent modeling 

framework (kmf). University of Kent, 2005. 

http://www.cs.kent.ac.uk/projects/ocl. 

[13] Ali, A. B. H., Abdellatif, A., and Ahmed, S. B. 

Transformation des contraintes d’intégrité - des 

modèles conceptuels vers le relationnel. In INFORSID, pages 398–415, 2007.

[14] Allilaire, F., Bézivin, J., Jouault, F., and Kurtev, 

I. Atl - eclipse support for model transformation. 

In Proceedings of the Eclipse Technology eXchange

workshop (eTX) at the ECOOP 2006 Conference, 

2005. 

[15] Allilaire, F., Bézivin, J., Jouault, F., and Kurtev,

I. Atl: A model transformation tool. Science 

of Computer Programming-Elsevier, 72:31–39, 

2008. 

[16] Blanc, X. MDA en action : Ingénierie logicielle 

guidée par les modèles. Eyrolles, 2005. 

[17] Chiorean, D. and OCLE-Team. Object constraint 

language environment 2.0., 2008. 

http://lci.cs.ubbcluj.ro/ocle/. 

[18] Czarnecki, K. and Helsen, S. Classification of 

model transformation approaches. In online

proceedings of the 2nd OOPSLA’03 Workshop on 

Generative Techniques in the Context of MDA, 

October 2003. 

[19] Davis, M. Struts, an open-source MVC implementation 

: Manage complexity in large Web sites with 

this servlets and JSP framework. IBM, Feb 2001. 

http://www.ibm.com/developerworks/library/jstruts/. 

[20] Distante, D., Rossi, G., and Canfora, G. Modeling 

business processes in web applications: An analysis 

framework. In Proceedings of the 22nd

Annual ACM Symposium on Applied Computing, 

page 1677. 

[21] Dresden-OCL-Team. Dresden OCL Toolkit, 2008. 

http://dresden-ocl.sourceforge.net. 

[22] Escalona, M. J. and Koch, N. Metamodeling the 

requirements of web systems. Lecture Notes in 

Business Information Processing, 1:267–282, August 

2007. 

[23] France Telecom. SmartQVT documentation, 

2007. http://smartqvt.elibel.tm.fr/doc/index.html. 

[24] Gharavi, V., Mesbah, A., and van Deursen, A. 

Modelling and generating ajax applications: A 

model-driven approach. In Proceeding of the 7th 

International Workshop on Web-Oriented Software 

Technologies, page 38, New York, USA, 

2008. 

[25] Kraus, A., Knapp, A., and Koch, N. Model-driven generation of web applications in uwe.

In Proceeding of the 3rd International Workshop 

on Model-Driven Web Engineering, CEUR-WS, 

2007. 

[26] Mbarki, S. and Erramdani, M. Toward automatic 

generation of mvc2 web applications. InfoComp 

- Journal of Computer Science, 7(4):84–91, December 

2008. 

[27] Mbarki, S. and Erramdani, M. Model-driven 

transformations: From analysis to mvc 2 web 

model. International Review on Computers and 

Software (I.RE.CO.S.), 4(5):612–620, September 

2009. 

[28] Mbarki, S., Rahmouni, M., and Erramdani, M. 

Transformation atl pour la génération de modèles 

web mvc 2. In 10e Colloque Africain sur la 

Recherche en Informatique et en Mathématiques 

Appliquées (CARI), 2010. 

[29] Miller, J. and Mukerji, J. MDA Guide Version 

1.0.1, 2003. http://www.omg.org/docs/omg/03-06-01.pdf.

[30] Nasir, M., Hamid, S., and Hassan, H. Webml and 

.net architecture for developing students appointment 

management system. Journal of applied science, 

9(8):1432–1440, 2009. 



[31] Nora, K. Transformations techniques in the 

model-driven development process of uwe. In 

Proceeding of the 2nd International Workshop 

Model-Driven Web Engineering, page 3, Palo 

Alto, 2006. 

[32] Oberortner, E., Vasko, M., and Dustdar, S. Towards 

modeling role-based pageflow definitions 

within web applications. In Proceeding of the 4th 

Model Driven Web Engineering Workshop, 2008. 


A MultiCriteria Group Decision Support System for Industrial 

Diagnosis 

HAMDADOU DJAMILA 1 

THÉRÈSE LIBOUREL 2 

1 University of Oran, Laboratory LIO 

Department of Computer Science 

P.O. Box 1524 - El Mnaouar- Algeria 

(dzhamdadoud)@yahoo.fr 

2 University of Montpellier II, Laboratory LIRMM 

libourel@lirmm.fr 

Abstract. Diagnosis is a key research element for improving business performance. However, diagnosis methods are neither unique nor universal in a context where the diversity and complexity of diagnosis are increasing. Thus, there is currently no diagnosis method able to ensure the relevance, efficiency and effectiveness of maintenance in all circumstances. The work presented in this article aims to eliminate, or at least lessen, the impact that unsuccessful attempts at developing diagnosis tools have on the proper functioning of a company. The multicriteria group decision support system for diagnosis assistance (DIAG-GDSS) was developed as an answer to this problem; it is a collective decision-making tool for choosing the most relevant diagnosis method.

On the basis of a carefully selected and implemented set of criteria and diagnosis methods, the developed tool makes it possible:

• to assist maintenance decision makers, according to their often conflicting preferences, in adopting a diagnosis method;

• to make a quick and efficient diagnosis using the developed methods.

To support this group decision, in which different viewpoints are considered, we propose a multilateral negotiation protocol coupled with a multicriteria method, namely ELECTRE III. This protocol features a coordinator agent and a set of participating agents, which try to find a compromise that best satisfies all the decision makers.

Keywords: group decision support system, multicriteria analysis, diagnosis, multi-agent system, negotiation protocol, ELECTRE III.

(Received May 10th, 2011 / Accepted July 16th, 2011)

1 Introduction


In the maintenance field, the priority is to prevent the loss of production capacity rather than to seek to produce more. The aim is therefore to maintain the entire stock of equipment used in production. Maintenance includes the functions of detection, interpretation and decision, which are performed by a diagnosis system that constitutes an important part of a maintenance system. Indeed, the problem of diagnosis is closely linked to that of maintenance. The latter involves various factors that are difficult to assess; they may be economic (cost of maintenance relative to the expected gain), human (skills, personnel training) or industrial (field skills, industrial competition).

Nowadays, all manufacturers are interested in new technologies that enable them to improve diagnosis and enhance their competitiveness. Indeed, reducing commissioning time, optimizing uptime and meeting availability requirements are concerns common to almost all sectors, whether transportation, aviation, space, energy, environment, food or health.

Given the important role of diagnosis in maintenance, the relevance of the diagnosis system directly affects the relevance of the maintenance system. Diagnosis methods are numerous: each of these

methods aims to solve diagnosis problems differently 



and tries to meet users’ expectations in a better way. 

Faced with this plurality of methods, choosing the diagnosis method that is most relevant and best suited to the problem at hand is very complex. In this context, research on the relevance of diagnosis methods for maintenance is largely motivated by the lack of tools for comparing these methods, a lack that penalizes the optimization of maintenance. Offering companies a way to select the diagnosis tool, for implementation or adoption, represents a significant financial gain in a competitive environment. Assuring the company that the proposed tool is the most suitable therefore requires means to identify the needs of each decision maker according to a variety of criteria.

The current study addresses a problem related to investment in industrial maintenance. The proposed methodology leads to the development of a multicriteria collective decision support system that tries to bring a conscious, clear and rational solution to the problem of choosing a diagnosis method in a multicriteria and multi-participant context.

In this research context, the proposed group decision support system DIAG-GDSS uses the benefits of Multi-Agent Systems (MAS) to represent the diversity of actors involved in the diagnosis decision, their behaviors and their interactions. MAS are well suited to modeling complex entities that can cooperate, collaborate or negotiate to reach an agreement.

We endow the MAS module with a negotiation protocol based on mediation. This protocol features a coordinator (initiator) agent, which is responsible for the smooth conduct of the negotiation, and a set of participating agents. The agents represent the different entities affected by the diagnosis decision.

Multicriteria analysis makes it possible to rank the different diagnosis methods according to their relevance while respecting the often conflicting points of view of the decision makers affected by the group decision.

The proposed interactive tool also allows the diagnosis method chosen after negotiation to be implemented within the company under study.

After presenting some elements of reflection to introduce the context of our study and highlighting the problems associated with diagnosis systems, we propose in Section 2 a classification of the diagnosis methods. Section 3 gives a brief overview of related studies on diagnosis. Our contribution is described as a whole in Section 4, and Section 5 is devoted to the state of the art of negotiation in Multi-Agent Systems (MAS). Section 6 describes the proposed group decision support system DIAG-GDSS, including a detailed description of its MAS component. Section 7 deals with the procedure for using the proposed tool, and Section 8 is devoted to experimenting with this tool through an application to an industrial production process. This case study constitutes a first validation step. Finally, we conclude the paper in Section 9 and give some perspectives.

2 Classification of the Diagnosis Methods 

The term "diagnosis" has many meanings depending on the field concerned [31]. AFNOR (Standard NF X 60-010) defines diagnosis as the identification of the probable cause(s) of the failure(s) using logical reasoning based on a set of information obtained from an inspection, control or test.

There is a great variety of diagnosis methods; some of them are specific and appropriate to the industrial sector. Selecting the most appropriate diagnosis method for a given industrial system is possible only after an inventory of the needs and of the available knowledge. In this section, we briefly present the main methods that fulfil at least one of the functions of the diagnosis process (detection, localization and identification), classified mainly according to the type of knowledge used [31].

2.1 Methods of Diagnosis by Modeling 

These methods are founded either on equations governing the internal phenomena of the system or on cases reflecting the system's operating modes. The associated models require a thorough knowledge of the system operation and fall into three main families: physical models [7], meta-models (FMEA (Failure Mode and Effects Analysis), HAZOP (HAZard and OPerability study), Ishikawa diagrams, fault trees, event trees, ...) and macro-state graphs [5] (Petri nets and Hidden Markov Models).

2.2 Diagnosis Methods by Data Analysis 

When knowledge about the system to be diagnosed is not sufficient and a knowledge model of the process cannot be developed, methods based on data analysis can be considered. These include probabilistic methods for predicting lifetime, matching matrices, support vector machines (SVM), neural networks and pattern recognition. The latter have been used successfully in the field of diagnosis [28] [9].



2.3 Diagnosis Methods by Artificial Intelligence (AI) 

Diagnosis systems using artificial intelligence tools are designed to formalize and model knowledge and to provide mechanisms to exploit it. Whereas other methods require a priori quantitative or qualitative knowledge about the process, AI methods require a large amount of stored data on the functioning of the system (in normal operation and during failures). There are two main approaches: probabilistic model-based approaches (Bayesian Networks) [16] and inference model-based approaches (Case Based Reasoning and Expert Systems).

3 Related Research Works in Diagnosis 

Many significant and innovative contributions to diagnosis have been published. Without being exhaustive, we can cite the work of Hernandez [13], who developed a neural network diagnosis system applied to detecting the hypovigilance of automobile drivers; Dubuisson [9], who worked on the theory of model-based diagnosis, diagnosis by pattern recognition, and neural networks; Bellot [1], who applied Bayesian networks to diagnosis in telemedicine; Bourouni [3], who developed a maintenance diagnosis assistance tool based on the dominant failure modes using the expert system approach; and Djebbar [8], who addressed the diagnosis of hepatic pathologies using two approaches: Case Based Reasoning and Bayesian Networks.

More recently, we find the works of Khemliche [17] and [20], who used bond graphs to build diagnosis assistance tools; Theilliol [29], who exploited model-based diagnosis methods for monitoring industrial systems; Greziac [14], who implemented a new "diagnosis machine" approach based on the execution of a structural model, called "Automatic Diagnosis"; Kiener [18], who proposed an implementation of neural networks for diagnosis on a coprocessor; and Sabeh [25], who detailed the general principles of diagnosis systems for monitoring the gas loop of a supercharged direct-injection diesel engine. Lastly, we can cite Chanthery [4], who chose an embedded architecture for modeling and integrating active diagnosis.

Various projects in the literature concern decision support in diagnosis. We cite mainly the HEROS project, whose goal is to provide doctors taking part in Multidisciplinary Consultation Meetings with a Group Decision Support System (GDSS) for making diagnostic and therapeutic decisions [21].

4 Our Contribution 

The purpose of this study is to address the performance of maintenance systems. In this paper, we therefore propose a methodology to justify the profitability of a maintenance project and to guide the user in choosing the most relevant diagnosis method. Using a multicriteria group (collective) decision approach, the proposed tool DIAG-GDSS (DIAGnosis Group Decision Support System) determines the most appropriate diagnosis method to be implemented at an industrial company whose main activity is processing semi-finished products into finished products. DIAG-GDSS takes into account the specific and divergent interests of the various decision makers in order to reach an acceptable agreement.

The main objective of this work is to offer the company a way to determine the diagnosis method to be applied. Each decision maker must establish a ranking and a prioritization of the different diagnosis methods, according to their relevance with respect to well-defined criteria and to his or her own preferences, using the multicriteria method ELECTRE III [2]. In this decisional situation, the final choice of the diagnosis method is made after a negotiation process that follows the protocol we propose.

In this version of our tool, we developed 10 diagnosis methods, which enabled us to capitalize on expert knowledge. Each method, through the approach it adopts or the steps it uses, adds to the diagnosis methodology.

5 Reaching Agreements in MAS: State of Art 

One of the major problems faced by multi-agent systems is that of reaching agreement on a particular problem, where each agent is assumed to have a preference over the possible contracts or agreements. The agents then exchange messages in order to reach an agreement that suits everyone. But the agents face a dilemma: on the one hand, they want to maximize their own utility; on the other hand, they risk making the negotiation fail and missing the agreement that could satisfy everyone. In MAS, the most widely used techniques for reaching such agreements are mainly auctions [30], voting systems [26], negotiation [12] and argumentation [15].

In MAS, negotiation is a key form of interaction that allows a group of agents to reach a mutual agreement regarding their beliefs, goals or plans. It is the predominant tool for solving conflicts of interest. Generally speaking, there are various negotiation protocols in MAS applications; the most used ones are [30]: the Monotonic Concession Protocol, the One Step Protocol and contractual protocols (the Contract Net Protocol, the Extended Contract Net Protocol).

6 DIAG-GDSS Description

The originality of our approach lies in the simultaneous use of a MAS and a DIAGNOSIS system. The literature offers few examples of this type of coupling. For simplicity, we choose a loose coupling (or weak coupling 1 ) between the two components, which remain independent and communicate only by exchanging data [10]. Thus, the features of the two systems are different. Let us detail the two modules:

6.1 The Diagnosis Module

The advocated approach is generic; the diagnosis vision is supported by the DIAGNOSIS module. After identifying the various diagnosis methods and the different criteria, this module provides the performance matrix using several evaluation methods and simulation functions. This matrix is injected into the MAS module and analyzed by the multicriteria analysis engine ELECTRE III in order to generate a preference vector for each decision maker; the MAS then runs a negotiation process to reach an agreement.

6.2 The MAS Module

This component aims to represent the different actors, who have their own objectives, decision strategies and preferences.

Multi-agent technology has already proved its worth in many areas. It is especially well suited to implementing collective decision-making (multi-decision-maker) applications because of the facilities it provides.

For modeling preferences, we use techniques from multicriteria decision support methodology. The latter allows the construction of appropriate tools and is able to replace a decision maker on complex problems.

We delegate to the MAS the selection of the elected resource according to a negotiation process; the chosen diagnosis method will be implemented. To cope with this group decision, it is necessary to go through a negotiation procedure to reach a beneficial consensus. To this end, we endow the MAS with a negotiation protocol based on mediation involving two types of agents:

1. the coordinator agent (initiator or manager): the agent responsible for managing the negotiation, modifying the contract and choosing the final elected resource.

2. the participant agents (contractors): the agents involved in the negotiation; the goal of each agent is that its favorite resource is chosen.

It is essential that the participant agents go through a negotiation phase, according to a well-structured protocol, in order to reach an agreement that benefits the group. The negotiation takes place between the coordinator agent and all the participant agents (this is a negotiation from 1 to n agents). Figure (1) shows an overview of DIAG-GDSS.

Figure 1: DIAG-GDSS: An overview

1 The best known types of coupling between two systems are: tight coupling (or strong coupling), loose coupling (or weak coupling), cooperative direct coupling and cooperative indirect coupling.
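
The loose coupling between the DIAGNOSIS module and the MAS module, which communicate only by exchanging data, could be sketched as follows. This is a minimal illustration under stated assumptions: the JSON payload, the function names export_matrix and import_matrix, and the method names, criteria and values in the matrix are all hypothetical and are not taken from DIAG-GDSS.

```python
import json

# Hypothetical performance matrix: diagnosis method -> criterion -> evaluation.
# Method names, criteria and values are illustrative only.
performance_matrix = {
    "FMEA": {"cost": 3.0, "reliability": 7.5},
    "BayesianNetwork": {"cost": 6.0, "reliability": 8.0},
}


def export_matrix(matrix: dict) -> str:
    """DIAGNOSIS side (assumed): serialize the matrix into a plain data payload."""
    return json.dumps(matrix, sort_keys=True)


def import_matrix(payload: str) -> dict:
    """MAS side (assumed): rebuild the matrix from the payload alone."""
    return json.loads(payload)


# The two sides share nothing but the exchanged payload.
payload = export_matrix(performance_matrix)
received = import_matrix(payload)
```

Because the two functions only share a data format, either module could in principle be replaced without touching the other, which is the property the loose coupling is chosen for.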

6.2.1 Modeling Agents 

Agentification is an important aspect of MAS design. It strongly influences the performance and efficiency of the system in solving a problem. In the literature, there is a multitude of methodologies of significant interest for MAS from an organizational perspective [10], such as Gaia, Voyelles, INGENIAS and Aalaadin.

Our proposal for modeling agents is based on the Aalaadin methodology [11], which exploits the concepts of agent, group and role to define a real organization.
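
To make the agent, group and role concepts more concrete, here is a small sketch of how a negotiation group might be represented. It is only an illustration: the Group class, its methods and the agent names are hypothetical and do not reproduce the Aalaadin metamodel or the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Group:
    """A group in which agents play roles (hypothetical structure)."""
    name: str
    # role name -> names of the agents playing that role in this group
    roles: Dict[str, Set[str]] = field(default_factory=dict)

    def assign(self, agent: str, role: str) -> None:
        """Record that the given agent plays the given role in this group."""
        self.roles.setdefault(role, set()).add(agent)

    def players(self, role: str) -> Set[str]:
        """Return a copy of the set of agents playing the given role."""
        return set(self.roles.get(role, set()))


# Illustrative organization: one coordinator and two participants.
negotiation = Group("negotiation")
negotiation.assign("mediator", "coordinator")
negotiation.assign("project_engineer", "participant")
negotiation.assign("finance_manager", "participant")
```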

6.2.2 The Phases of Negotiation in the Proposed 

Protocol 

The current negotiation protocol is largely based on the Contract Net Protocol [6]. It is characterized by a series of messages exchanged between the coordinator agent and the participant agents. It proceeds in five phases: an initialization phase, a proposal phase, an evaluation phase, a modification phase and a final decision phase.

1. The initialization phase: this phase marks the beginning of the negotiation process. Participants are asked to express their preferences concerning the different resources. Each agent ranks the resources (diagnosis methods) from the best one (the most beneficial) to the worst one, according to a set of criteria, using the multicriteria method ELECTRE III [19].

2. The proposal phase: during this phase, the coordinator agent proposes a deal to all the participants on a given resource. They either accept or reject the contract with reference to their vector of preferences, previously constructed in the initialization phase.

3. The evaluation phase: when the coordinator has received the answers of all the participants concerning the proposed contract, it counts the number of participant agents that have accepted its proposal. If this number is greater than or equal to a given threshold, the negotiation is successful. If not, the coordinator must modify the deal.

4. The modification phase: during this phase, the coordinator modifies the contract, taking the proposals of the agents as a starting point. It must synthesize what it has received during the evaluation phase and then return to the proposal phase.

5. The decision phase: this is the last phase of the suggested protocol. It marks the end of the negotiation process. A decision is taken by the coordinator according to the participants' answers concerning the proposals it has made.
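
The cycle formed by the proposal, evaluation, modification and decision phases can be sketched as a simple loop. This is an illustrative sketch only: the function negotiate, the max_rounds limit and the way participants are modelled as callables are assumptions, and the modification step below is a placeholder that simply moves to the next candidate resource rather than the score-based strategy of the protocol.

```python
from typing import Callable, Optional, Sequence

# A participant is modelled (hypothetically) as a function
# (proposal, round number) -> True to accept, False to refuse.
Participant = Callable[[str, int], bool]


def negotiate(
    resources: Sequence[str],
    participants: Sequence[Participant],
    threshold: int,
    max_rounds: int = 10,
) -> Optional[str]:
    """Return the elected resource, or None if no agreement is reached."""
    # Initialization phase is assumed done: participants hold their preferences.
    candidates = list(resources)
    proposal = candidates[0]  # assumed choice of the first proposal
    for round_no in range(1, max_rounds + 1):
        # Proposal phase: the coordinator submits the deal to every participant.
        answers = [accepts(proposal, round_no) for accepts in participants]
        # Evaluation phase: count acceptances and compare with the threshold.
        if sum(answers) >= threshold:
            return proposal  # decision phase: the negotiation succeeds
        # Modification phase (placeholder): try the next candidate resource.
        proposal = candidates[round_no % len(candidates)]
    return None  # decision phase: no agreement within max_rounds
```

For instance, with three participants of whom two accept only a resource named "B", and a threshold of 2, the loop refuses the first proposal and succeeds on the second round.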

6.3 Characteristics of the Proposed Protocol 

During the negotiation process, many fundamental aspects 

must be taken into account, such as: 

• the language used by the agents to exchange information 

during the negotiation (primitives and 

strategies); 

• the objects of negotiation; 

• the strategies adopted by the different agents; 

• the cardinality of the negotiation. 

In what follows, we present the main characteristics of 

the proposed negotiation protocol. 

6.3.1 The objects of negotiation 

Resources are the objects of negotiation, they can be 

either personal or shared. In our case, they are common 

resources (the diagnosis methods). 

6.3.2 The cardinality of negotiation 

It is an important concept for the MAS. The question 

is how agents negotiate among themselves. Our protocol 

allows the coordinator to propose a deal to a set of 

participants; it is a negotiation from 1 to n agents.

6.3.3 The primitives of negotiation 

In order to bring a negotiation process to completion, it is necessary to define primitives specific to the coordinator and others specific to the participants.

1. Coordinator primitives: the messages sent by the coordinator are addressed to all the participant agents. Three negotiation primitives are associated with the coordinator:

Request (): the coordinator sends a message to the participants to indicate the beginning of the negotiation process;

Propose (): the coordinator proposes a contract to the participant agents concerning a given resource;

Confirm (): the coordinator sends a message to all the agents informing them that the negotiation is a success and that the resource was found.

2. Participant primitives: the messages sent by the participants are addressed solely to the initiator. The other participants are not informed of these messages. Three negotiation primitives are associated with the participant:

Inform (): after ranking the resources from the best to the least good, each participant indicates to the coordinator that the latter can make a first proposal;

Accept (): through this message, the participant answers the deal proposed by the coordinator and indicates to the initiator that it accepts the contract;

Refuse (): the participant indicates to the coordinator that its proposal is refused. The deal cannot be concluded in its current form and should be modified.
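
The six primitives and the rule about who may send them can be written down as a small data model. This is a sketch under assumptions: the Primitive enum, the Message class and the is_allowed helper are hypothetical names introduced for illustration, and the message fields are not taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Primitive(Enum):
    REQUEST = "Request"
    PROPOSE = "Propose"
    CONFIRM = "Confirm"
    INFORM = "Inform"
    ACCEPT = "Accept"
    REFUSE = "Refuse"


# Primitives each role is allowed to send, as listed in the protocol.
COORDINATOR_PRIMITIVES = {Primitive.REQUEST, Primitive.PROPOSE, Primitive.CONFIRM}
PARTICIPANT_PRIMITIVES = {Primitive.INFORM, Primitive.ACCEPT, Primitive.REFUSE}


@dataclass(frozen=True)
class Message:
    """A negotiation message (field names are assumptions)."""
    sender: str
    primitive: Primitive
    resource: Optional[str] = None  # resource concerned by the message, if any


def is_allowed(role: str, primitive: Primitive) -> bool:
    """Check that a role only uses the primitives associated with it."""
    if role == "coordinator":
        return primitive in COORDINATOR_PRIMITIVES
    if role == "participant":
        return primitive in PARTICIPANT_PRIMITIVES
    raise ValueError(f"unknown role: {role}")
```

Coordinator messages go to all participants, whereas participant messages go only to the coordinator; routing is left out of this sketch.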

In order to represent the interactions between the coordinator agent and the participant agents, we use a UML sequence diagram, which is often used to describe the interactions between agents.

Figure (2) represents the various primitives associated with the different agents in a UML diagram.

Figure 2: UML sequence diagram of the proposed negotiation protocol

6.3.4 The Agents Strategies 

The suggested protocol distinguishes two roles: coordinator 

and participant. The negotiation strategy is not 

the same; it differs according to the role of the agent. 

Thus, there are two types of strategies: 

1. the coordinator strategy allows it to modify a contract if too few participants have accepted it;

2. the participants strategies allow them to establish 

their preferences, accept a contract or refuse it. 

• Participant strategies: we associate three strategies with each participant agent:

1. Strategy of establishing preferences: each participant must rank the resources from the best (the most beneficial) to the least good with reference to a certain number of criteria. For that, it exploits the advantages offered by the multicriteria decision making method ELECTRE III [23], [22]. Once a participant has established its preference vector, it associates a row (rank) with each resource. The resource ranked first receives the highest row and represents the preference of the participant at the first round. The row is then decremented by 1 for each following resource.

2. Strategy of acceptance: the negotiation can proceed in several rounds, until a compromise is found. In each new round, the participant receives a new proposal. If it corresponds to its preference at round t, it accepts this proposal. Otherwise, it checks whether the proposal corresponds to one of its earlier preferences. If so, it accepts the contract while indicating its current preference.

3. Strategy of refusal: when the participant receives a proposal that corresponds neither to its preference at round t nor to any of its earlier preferences, it refuses it and makes a counter-proposal corresponding to its preference at round t.

• Coordinator strategy: we associate with the coordinator only one strategy, used during the modification phase.

Strategy of modification: when too few participants accept the coordinator's proposal, the coordinator is obliged to modify its contract for the next round, taking as a starting point all the modifications sent by the participants at round t, in order to find a new possibility for the contract. For that, the coordinator associates with each resource R_i (i = 1, ..., n) a score SCORE(R_i), which takes into account the weight of each participant agent as well as the rank of the resource in the preference vector of this agent. As in the scoring method [10], the resource that obtains the highest score at round t is the winning resource, and the coordinator proposes it in the new contract. This score is updated each time too few participants accept the contract. SCORE(R_i) is given by the following equation:

\[
\mathrm{SCORE}(R_i) = \sum_{j=1}^{m} \mathrm{WEIGHT}(\mathrm{participant}[j]) \ast \mathrm{ROW}(R_i, \mathrm{participant}[j])
\]

– n, m: the number of resources and decision makers respectively;

– WEIGHT(participant[j]): we associate a different weight with each participant j since, in reality, the Project Engineer, for example, does not carry the same weight as the Finance Manager in a group decision about diagnosis;

– ROW(R_i, participant[j]): the row associated with the resource (action) i by participant j in its vector of preferences (ranking provided by ELECTRE III).
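
The participant strategies and the coordinator's score can be illustrated with a short sketch. Several points are assumptions made for the example: the row of the best resource is taken to be n and decreases by 1 down to 1, the preference at round t is taken to be the t-th resource of the preference vector, and all function names, participant names, weights and method names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def row(preferences: List[str], resource: str) -> int:
    """Row of a resource: n for the best one, decremented by 1 afterwards (assumed scale)."""
    return len(preferences) - preferences.index(resource)


def respond(preferences: List[str], proposal: str, round_no: int) -> Tuple[bool, Optional[str]]:
    """Accept if the proposal is the preference at round t or an earlier one;
    otherwise refuse and counter-propose the preference at round t."""
    t = min(round_no, len(preferences))
    if proposal in preferences[:t]:
        return True, None
    return False, preferences[t - 1]


def score(resource: str, weights: Dict[str, float], prefs: Dict[str, List[str]]) -> float:
    """SCORE(R_i): sum over participants of WEIGHT times ROW."""
    return sum(weights[name] * row(p, resource) for name, p in prefs.items())


def best_resource(resources: List[str], weights: Dict[str, float], prefs: Dict[str, List[str]]) -> str:
    """Resource the coordinator would propose in the modified contract."""
    return max(resources, key=lambda r: score(r, weights, prefs))


# Illustrative data only: three methods, three decision makers.
weights = {"engineer": 3, "finance": 2, "operator": 1}
prefs = {
    "engineer": ["M1", "M2", "M3"],
    "finance": ["M2", "M3", "M1"],
    "operator": ["M2", "M1", "M3"],
}
```

With these illustrative values, M1, M2 and M3 score 13, 15 and 8 respectively, so the coordinator would propose M2 in the next round, even though the participant with the largest weight ranks M1 first.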

7 DIAG-GDSS: Functional Architecture 

The procedure for using the tool DIAG-GDSS operates 

in two main phases: a phase of group decision support 

and a phase of diagnosis (concretization of results). 

7.1 The Phase of Group Decision Support

This phase of the current approach corresponds to the structuring and exploitation of the decisional model.

7.1.1 Structuring the Decisional Model

This phase aims to identify the problem and the fundamental choices on how to approach it. It also aims to formalize three basic elements of the decisional situation:

1. Identify actions (resources): identifying all the potential actions is a very significant step in any decision support approach, especially when the multicriteria analysis method proceeds by partial aggregation. It is very important that the set of actions is complete, because modifying it during the analysis can require the multicriteria analysis to be repeated.

2. Identify criteria: the list of criteria obtained by aggregating the corresponding factors (sub-criteria) should be as complete as possible. These criteria must be related to the constraints and objectives used in the generation activities. The family of the most relevant criteria must satisfy the conditions of exhaustivity, consistency and independence [23].

3. Identify actors (decision makers): the concept of actor refers to a concrete entity, localized in a context. It is an individual or collective decision unit, which can allocate resources, purposes and strategies 2 . The multiplicity of actors makes negotiation difficult, since we have on one side strong actors with significant power and on the other side weak actors who have more difficulty defending their interests.

7.1.2 Exploiting the Decisional Model

In this phase, every actor is modeled as an agent with an associated weight expressing its importance and the scope of its authority in the group decision. All agents have access to the performance matrix managed by the diagnosis component in order to determine their vector of preferences using the multicriteria method ELECTRE III. After several rounds of negotiation under the proposed protocol, the participant agents arrive at a consensus that satisfies all (or part) of the concerned parties in the final agreement.

7.2 The Phase of Diagnosis

This phase essentially follows from the acceptance; it includes the implementation of the group decision and the control of the solution. In this last phase, after entering the data of the industrial process, we can carry out the diagnosis using the elected diagnosis method. The main phases of DIAG-GDSS are shown in Figure (3), and Figure (4) summarizes how DIAG-GDSS operates.

Figure 3: DIAG-GDSS: Functional architecture

2 We can identify two main types of actors: individual and collective. Collective actors are groups or organizations.



Figure 4: Functioning of DIAG-GDSS

8 Case Study: Experimental Results

The development of a multi-agent module is a complex problem. Therefore, it is preferable to use an existing multi-agent platform and adapt it to our needs. The MAS and Diagnosis modules communicate with each other through shared data.

8.1 The Addressed Problem

The implementation of the proposed group decision support process is accompanied by an application to a test company. The objective of this simulation exercise is to support the proposed methodology by confronting it with the available tools and data. In addition, testing DIAG-GDSS on an existing case (with real data) allows the validation of our proposal.

The study was carried out on a production process of an Algerian steel company, "ANABIB". For the sake of efficiency, we chose the most critical equipment of this company: a machine which, starting from a steel coil, produces tubes. It performs the forming and welding operations. This enabled us to show the effectiveness of this type of tool in terms of availability and reduction of machine downtime. The stages to be followed are described in the next sections.

8.2 Definition of Actions

Potential actions 3 are all the diagnosis methods likely to meet the goals of a diagnosis system for maintenance while respecting the imposed constraints; a technical study of each method is undertaken. The actions correspond to the different diagnosis methods that we have implemented and that must be negotiated in order to help the decision makers choose the method that will be retained for diagnosis. This is done by considering a set of criteria and taking into account the subjectivity of the several actors involved in this project.

In the present study, and given the area addressed (industrial diagnosis), we opted for the following methods: Act 1: Model-based Diagnosis, Act 2: Pattern Recognition Method, Act 3: Expert Systems, Act 4: Neural Networks, Act 5: Petri Networks, Act 6: Markov Chains, Act 7: Failure Mode and Effects Analysis (FMEA), Act 8: Failure Trees, Act 9: Bayesian Networks, Act 10: Case-Based Reasoning.

8.3 Identification of Criteria 

The identification of criteria for choosing a diagnosis method must be based on the exhaustive list of constraints and objectives of the diagnosis system in maintenance. Setting goals is an essential step that allows us to define the criteria most relevant to our study, the acceptance levels of the diagnosis results, and therefore the relevance of methods and tools.

Indeed, a criterion is a translation of an objective or a portion of an objective (respectively a constraint) into an element that can be assessed quantitatively or qualitatively.

In the current study, choosing the most relevant diagnosis method takes into account a family of criteria. To keep our study generic, following the methodology of diagnosis aid, the criteria were developed with the help of experts from the company [27]. These criteria are:

• Crt 1: Response time

• Crt 2: Robustness to measurement variations

• Crt 3: Robustness to modeling errors

• Crt 4: Prediction capacity

• Crt 5: Development cost

• Crt 6: Accessibility level

• Crt 7: Ease of knowledge exploitation

• Crt 8: Adaptation to complex systems

3 In the multicriteria decision aid methodology, the action is a possible solution for the decisional problem; it is synonymous with the term resource in the MAS vocabulary.

8.4 Evaluation of Performances 

We focus, in this section, on evaluating the performance 

of each possible and potential diagnosis method according 

to each criterion: 

• If the method performance is high, this method is 

preferred; 

• Performance evaluations can be performed with various methods: analytical formulas, measuring instruments, human experts, etc.;

• If the evaluation criterion is not measurable by an 

analytical formula or is not objectively measurable, 

an evaluation or score can be performed on a scale 

or finite set of values. 

The definition and assessment of the identified criteria 

according to different actions generate the matrix of 

performance, illustrated in Figure (5). This matrix is 

managed by the diagnosis component. 

Figure 5: The matrix of performances

8.5 Identification of Decision Makers

In this study, the different decision makers involved in the group decision are:

• Decision maker 1: project engineer

• Decision maker 2: company manager

• Decision maker 3: finance responsible

• Decision maker 4: diagnostician exhibitor (the person offering the diagnosis service)

• Decision maker 5: machinery manager (the person who manages the machines and gives an opinion on the consistency of the diagnosis with the machinery operation)

Each decision maker is represented by an agent; the agents are generated using the JADE multi-agent platform (Java). We assign to each participant agent a weight expressing its importance in the negotiation. The weights of the various actors, chosen to reflect reality as closely as possible, are given in Figure (6).

Figure 6: The weights of several agents

8.6 Definition of Subjective Parameters 

Each agent builds its vector of preferences, in which it ranks the resources from best to worst according to the identified criteria. To do so, it uses the multicriteria analysis method ELECTRE III, which requires some subjective parameters. In the following, we show how DIAG-GDSS assigns values to these parameters.

1. Weight: a number ω_j (j = 1, 2, ..., m, where m designates the number of criteria) assigned to each criterion according to its importance relative to the other criteria. It is not always easy for the decision maker to articulate the relative importance of each criterion. To assign a value to this parameter, we used the Saaty scale, which allows the various criteria to be evaluated and ordered according to their relative importance. The Saaty scale is based on a mathematical model developed by Thomas Saaty [24]. After comparing sequential pairs of criteria (each criterion is assessed against all the others in a series of comparisons), we ask the user to rate, on a scale from -9 to 9, the relative importance of one criterion.

2. Thresholds: In our study, the preference and indifference thresholds $p_j$ and $q_j$ on criterion $j$ are chosen on the basis of the values assigned to data uncertainties. For example, for an uncertainty of 20%:

$$p_j = 2 \cdot \frac{20}{100} \cdot \max_{i,k}\,[g_j(a_k) - g_j(a_i)]$$

$$q_j = \frac{20}{100} \cdot \max_{i,k}\,[g_j(a_k) - g_j(a_i)]$$

where $g_j(a_i)$ and $g_j(a_k)$ are the performances of actions $a_i$ and $a_k$ according to criterion $j$.

The veto threshold $\nu_j$ is also determined by the values assigned to data uncertainties. For example, for an uncertainty of 20%, $\nu_j$ is given by:

$$\nu_j = 3 \cdot \frac{20}{100} \cdot \max_{i,k}\,[g_j(a_k) - g_j(a_i)]$$ ; q j
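The threshold rules above can be illustrated with a small sketch. The function name `electre_thresholds`, the parameter `uncertainty_pct` and the sample scores are hypothetical and do not come from DIAG-GDSS; the only assumption taken from the text is that the thresholds are 1, 2 and 3 times the uncertainty fraction of the largest performance gap on the criterion.

```python
def electre_thresholds(performances, uncertainty_pct=20.0):
    """Return (q_j, p_j, v_j) for one criterion from the values g_j(a_i)."""
    # The largest pairwise difference max_{i,k}[g_j(a_k) - g_j(a_i)]
    # is simply the range of the performances on this criterion.
    spread = max(performances) - min(performances)
    base = uncertainty_pct / 100.0 * spread
    indifference = base        # q_j
    preference = 2 * base      # p_j
    veto = 3 * base            # v_j
    return indifference, preference, veto


# Hypothetical performances of four actions on one criterion.
scores = [12.0, 7.0, 9.5, 15.0]
q, p, v = electre_thresholds(scores)
print(q, p, v)
```

With these rules the three thresholds are always ordered as indifference, preference, veto whenever the performances are not all equal.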



Figure 10: Viewing exchanged messages during the negotiation process 

via the Sniffer of JADE 

vance of the chosen method according to the objectives 

to be achieved by each of them. 

Although the choice of the method is predictable in some cases, in others it is more complex, involving multiple decision makers with conflicting objectives and based on very heterogeneous criteria.

The objective we have pursued throughout this paper is to propose a group decision-support process that facilitates the choice of a diagnosis method by integrating a multicriteria approach into a negotiation protocol implemented in a multi-agent system.

Thus, within the framework of decision support in diagnosis, we have initiated a new approach combining multicriteria analysis with agent-based models to handle the multiplicity and diversity of actors in a diagnosis project.

In this article, our effort has focused on proposing a multicriteria group decision support system that meets a primary objective and a specific objective, respectively:

• the representation of the multiplicity of decision makers in maintenance, their diversity, their behavior and their interaction, in order to select the diagnosis method most appropriate to the context of the company;

• the development of the different approaches we have used to model a diagnosis system.

An application in the industrial field has been proposed and has served as a basis to demonstrate the feasibility of such an approach.

To consolidate the contribution of the different diagnosis methods integrated into the DIAG-GDSS tool, we have developed a negotiation protocol incorporating a multicriteria analysis method, more specifically ELECTRE III. The latter allows each maintenance actor, with its own preferences, to rank the methods by level of relevance using different criteria.

We end by outlining the research perspectives that we plan to address in the future:

• extending the agent model so that agents can change their goals according to new information they receive, by developing a negotiation protocol based on argumentation;

• integrating other negotiation strategies between the different agents.



Acknowledgements 

This research is supported by the Algerian Ministry of Higher Education and Scientific Research (MESRS) under the research program "Diagnosis, Decision Support System and Human Interaction". I am very grateful to all the members of the LIRMM Laboratory (Montpellier II University) for the fruitful discussions about this type of research, which significantly contributed to improving the presentation of this work.

References 

[1] Bellot, D. Fusion de données avec des réseaux bayésiens pour la modélisation des systèmes dynamiques et son application en télémédecine. PhD thesis, Université Henri Poincaré, Nancy, 2002.

[2] BenMena, S. Une solution informatisée à l’analyse de sensibilité d’Electre III. Biotechnology, Agronomy, Society and Environment, 5:31–35, 2001.

[3] Bourouni, K. Développement d’un outil d’aide au diagnostic de maintenance basé sur les modes de défaillances dominants. PhD thesis, Université de Provence, France, 2005.

[4] Chanthery, E. and Pencolé, Y. Modélisation et intégration du diagnostic actif dans une architecture embarquée. Journal européen des systèmes automatisés: Modélisation des systèmes réactifs, 43:789–809, 2009.

[5] Combacau, M. Commande et surveillance des systèmes à événements discrets complexes: Application aux ateliers flexibles. PhD thesis, Université Paul Sabatier, Toulouse, 1991.

[6] Davis, R. and Smith, R. Negotiation as a metaphor for distributed problem solving. Distributed Artificial Intelligence, 20:63–109, 1983.

[7] Didier, G. Modélisation et diagnostic de la machine asynchrone en présence de défaillances. PhD thesis, Université Henri Poincaré, Nancy, 2004.

[8] Djebbar, A. and Merouani, H. Mocabban: A modelling case base by a bayesian network applied to the diagnosis of hepatic pathologies. International Conference on Computational Intelligence for Modelling, Control and Automation, 1:678–685, 2005.

[9] Dubuisson, B. Diagnostic, Intelligence Artificielle et reconnaissance de formes. Edition Hermès Lavoisier, 2001.

[10] Ferber, J. Les systèmes multi-agents, vers une intelligence collective. Inter Editions, 1995.

[11] Ferber, J. and Gutknecht, O. A meta-model for the analysis and design of organizations in multi-agent systems. International Conference on Multi-Agent Systems, ICMAS98, IEEE Press, 1:128–135, 1998.

[12] Grant, J., Kraus, S., and Perlis, D. A logic for characterizing multiple bounded agents. Autonomous Agents and Multi-Agent Systems, 3:351–387, 2000.

[13] Gress, N. H. Un système de diagnostic par réseaux de neurones et statistiques: Application à la détection d’hypovigilance du conducteur automobile. PhD thesis, Université de Toulouse, 1998.

[14] Greziac, F., editor. Diagnostic Automatique :une 

nouvelle approche de diagnostic machine basée 

sur l’exécution d’un modèle structurel. Journée 

SEE-AAI, GDRMACS-S3, Les nouveaux outils 

de diagnostic dans les processus industriels : les 

clés de la compétitivité, 2008. 

[15] Hamdadou, D. Un Modèle d’Aide à la Décision en 

Aménagement du Territoire, une Approche Multicritère 

et une Approche de Négociation. PhD thesis, 

Université d’Oran, Algérie, 2008. 

[16] Jonquières, S. Application des réseaux bayésiens 

à la reconnaissance active d’objets 3D: contribution 

à la saisie d’objets. PhD thesis, Université de 

Toulouse, 2000. 

[17] Khemliche, M., Bouamama, B. O., and Haffaf, 

H. Sensor placement for component diagnosability 

using bond graph. Sensor and Actuators, 

132:547–556, 2006. 

[18] Kiener, P. Implémentation de réseaux de neurones 

pour le diagnostic sur co-processeur. In Actes de la 

journée SEE-AAI, GDRMACS-S3, Les nouveaux 

outils de diagnostic dans les processus industriels 

: les clés de la compétitivité, 2008. 

[19] Maystre, L. and Simos, J. P. J. Méthodes multicritères 

Electre. Edition Presses Polytechniques 

et Universitaires Romandes, 1994. 



[20] Merzouki, R., Djeziri, K. M. A., and Ould- 

Bouamama, B. Fault detection in mechatronics 

system. MECHATRONICS IFAC, 17:299–310, 

2007. 

[21] Morel-Wascat, C. Projet rnts. Technical report, 

Réseau National des Technologies pour la Santé: 

HEROS, Décembre 2002 - Juin 2004. 

[22] Roy, B. Decision-aid and decision-making. European 

Journal of Operational Research, 45:324– 

331, 1990. 

[23] Roy, B. The outranking approach and the foundations 

of electre methods. Theory and Decision, 

31:49–73, 1991. 

[24] Saaty, L. Decision-making with the ahp: Why 

is the principal eigenvector necessary. European 

Journal of Operational Research, 145:85– 

91, 2003. 

[25] Sabeh, Z., Ragot, J., and Maquin, D. Modélisation 

et surveillance de la boucle des gaz dans 

un moteur diesel suralimenté à injection directe. 

principes généraux du diagnostic de systèmes. In 

Actes de la journée SEE-AAI, GDRMACS-S3, Les 

nouveaux outils de diagnostic dans les processus 

industriels : les clés de la compétitivité, 2008. 

[26] Sandholm, T. Negotiation among self-interested computationally limited agents. PhD thesis, University of Massachusetts, 1996.

[27] Sénéchal, O. and Tahon, C. A modelling approach 

for production costing and continuous improvement 

of manufacturing processes. Production 

Planning and Control Journal, 8:731–742, 

1997. 

[28] Thairrault, Y. Détection et isolation de défauts par 

analyse en composantes principales robuste. PhD 

thesis, Université de Nancy, 2006. 

[29] Theilliol, D. and Aubrun, C. Surveillance de systèmes 

industriels fondée sur les méthodes de diagnostic 

à base de modèles,. In Actes de la journée 

SEE-AAI, GDRMACS-S3, Les nouveaux outils de 

diagnostic dans les processus industriels : les clés 

de la compétitivité, 2008. 

[30] Verrons, M. Un modèle général de négociation de contrats entre agents. PhD thesis, Université de Lille, 2004.

[31] Zwingelstein, G. Diagnostic des défaillances: théorie et pratique pour les systèmes industriels. Edition Hermès, 1995.


Digital Image Watermarking: A Review of SVD, DCT and DWT 

Based Approaches 

DINESH KUMAR 1 

VIJAY CHAHAR 2 

1 CSE Department, GJUS&T, Hisar, Haryana, INDIA 

2 CSE Department, JCDMCOE, Sirsa, Haryana, INDIA 

1 dinesh_chutani@yahoo.com 

2 vijaykumarchahar@gmail.com 

Abstract. This paper presents digital image watermarking and reviews five recently proposed approaches. These approaches are based on Singular Value Decomposition (SVD), Discrete Wavelet Transform (DWT) and Discrete Cosine Transform (DCT). They include watermarking schemes based on SVD and DWT or DCT that modify the singular values of four bands (data sets), derived from the cover image using a transform, with the singular values of the watermark. We describe all the approaches, pointing out their strengths and weaknesses. We further suggest an extension to the existing schemes. In the proposed work, we first decompose the cover image into bands (data sets) using DWT. These data sets are then divided into non-overlapping blocks. The singular values of each block are finally modified in a controlled manner using the singular values of the watermark matrix. The partitioning of the data sets into non-overlapping blocks and the subsequent modification of the singular values of each block make the watermark more robust against various attacks, as the results reveal. An experimental comparison with the existing schemes has also been carried out.

Keywords: watermarking, discrete wavelet transform, singular value decomposition, discrete cosine 

transform. 

(Received January 19th, 2011 / Accepted July 16th, 2011) 


We live in an information-oriented society. The evolution of the fourth generation of mobile communication, together with the Internet and wireless LANs, has acted as a highway for a large number of services such as e-commerce, multimedia communication, online air reservation and e-books. A great deal of information is disseminated in various forms such as text, image, audio or video, and most of it is distributed as digital signals. Digitization has undoubtedly enabled reliable, faster and more efficient storage, transfer and processing of data [12]. However, it has also led to illegal production and distribution of data, and hence to a threat to the authenticity and copyright of the information. Copyright protection of digital data is of great concern in such a scenario [22, 33].

Cryptography and steganography are the two basic ones among the various data hiding techniques. These techniques can help us protect multimedia data against illegal distribution [6]. The former deals with encryption and decryption of data, with the help of a key, before transmission and after reception respectively. The latter consists of embedding the data at the sender’s end and extracting it at the receiver’s end. Watermarking is a concept derived from steganography. The process of watermarking consists of three steps: definition of the watermark signal, embedding of the watermark signal into the data to yield watermarked data, and detection to verify its presence. The watermark signal may be either a number sequence or an image. A large number of watermarking techniques exist in the literature [29].

There are many ways to classify watermarking techniques. The watermark can be embedded in the spatial domain or in the frequency domain [2, 20]. Frequency domain techniques are reported to be more robust than spatial domain ones [27]. Popular frequency-domain transforms include the Discrete Cosine Transform (DCT), the Discrete Fourier Transform (DFT) and the Discrete Wavelet Transform (DWT). The main properties



of a watermark are robustness, fidelity, computational cost and false positive rate [11, 28]. The literature reports several watermarking techniques developed using SVD alone or along with DWT or DCT [3, 4, 7, 8, 18, 23, 24, 34].

In this study, we review five recently proposed approaches for watermarking and discuss their relative strengths and weaknesses. These approaches are: an SVD based watermarking scheme by Liu and Tan [25], a block-by-block SVD based image watermarking scheme by Ghazy et al. [15, 16], DWT-SVD domain image watermarking by Ganic and Eskicioglu [17], SVD-DCT based image watermarking by Rafizul Haque [21], and a hybrid SVD and wavelet based image watermarking by Majumder et al. [26]. This paper is divided into five sections. Section 2 briefly explains these techniques along with their strengths and weaknesses. Section 3 proposes an improvement to the DWT-SVD based scheme. Section 4 covers experimental results and discussion. The conclusions are drawn in Section 5.

1.1 SVD, DCT and DWT Basics 

SVD is a numerical analysis tool for matrices that gives the minimum least-squares error. The stability of the singular values, and the fact that they represent intrinsic algebraic properties of the image, make SVD one of the most widely used methods in image processing. The DCT is a technique for converting a signal into elementary frequency components. It represents an image as a sum of sinusoids of varying magnitudes and frequencies. Wavelets are mathematical functions that divide the data into different frequency components and then study each component with a resolution matched to its scale. The DWT has been developed on this idea. It is used more frequently in digital image watermarking because of its time/frequency decomposition characteristics, which resemble the theoretical models of the human visual system [36]. A brief description of the three is as follows:

1.1.1 Singular Value Decomposition

Singular Value Decomposition (SVD) is an important topic in linear algebra. SVD has many practical and theoretical uses; a special feature of SVD is that it can be performed on any real (k, m) matrix. Suppose we have a matrix X with k rows and m columns, with rank r and r ≤ k ≤ m. X can be factorized into three matrices [5]:

$$X = U S V^{T} \tag{1}$$

where matrix U is a k × k orthogonal matrix,

$$U = [u_1, u_2, \ldots, u_r, u_{r+1}, u_{r+2}, \ldots, u_k] \tag{2}$$

whose column vectors $u_i$, for i = 1, 2, ..., k, form an orthonormal set:

$$u_i^{T} u_j = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases} \tag{3}$$

and matrix V is an m × m orthogonal matrix,

$$V = [v_1, v_2, \ldots, v_r, v_{r+1}, v_{r+2}, \ldots, v_m] \tag{4}$$

whose column vectors $v_i$, for i = 1, 2, ..., m, form an orthonormal set:

$$v_i^{T} v_j = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{5}$$

Here, S is a k × m diagonal matrix with the singular values (SV) on the diagonal. The $v_i$’s and $u_i$’s are called the right and left singular vectors of X, respectively.

1.1.2 Discrete Cosine Transform

The Discrete Cosine Transform is a technique for converting a signal into elementary frequency components [1, 31]. The DCT is applied to an image I having K×M pixels, transforming it according to equation 6 [1]:

$$y(u, v) = \sqrt{\frac{2}{M}} \sqrt{\frac{2}{K}}\, \alpha_u \alpha_v \sum_{m=0}^{M-1} \sum_{k=0}^{K-1} I(m, k) \cos\frac{(2m + 1)u\pi}{2M} \cos\frac{(2k + 1)v\pi}{2K} \tag{6}$$

where y(u, v) is the DCT coefficient in row u and column v of the coefficient matrix and I(m, k) is the intensity of the pixel in row m and column k of the image. The values of $\alpha_u$ and $\alpha_v$ are set to $1/\sqrt{2}$ when u = 0 and v = 0 respectively, and to 1 otherwise. The image can be reconstructed by applying the IDCT according to equation 7 [1]:

$$I(m, k) = \sqrt{\frac{2}{M}} \sqrt{\frac{2}{K}} \sum_{u=0}^{M-1} \sum_{v=0}^{K-1} \alpha_u \alpha_v\, y(u, v) \cos\frac{(2m + 1)u\pi}{2M} \cos\frac{(2k + 1)v\pi}{2K} \tag{7}$$

DCT-based watermarking rests on two main concepts. The first is that the low-frequency sub-band contains most of the signal energy and hence the most important visual parts of the image. The second is that the high-frequency components of the image are usually removed by compression and noise attacks. Therefore, the watermark is embedded by modifying the coefficients of the middle-frequency sub-band, so that the visibility of the image is not affected and the watermark is not removed by compression [1].
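As a minimal sketch of the forward and inverse DCT defined above, the transform can be written as a pair of matrix products with an orthonormal cosine basis. The helper names (`dct_matrix`, `dct2`, `idct2`), the 8×8 random image, and the choice of coefficient (3, 4) with strength 5.0 as a "middle-frequency" modification are all illustrative assumptions, not values taken from the reviewed schemes.

```python
import numpy as np


def dct_matrix(n):
    """Row u holds sqrt(2/n) * alpha_u * cos((2m + 1) u pi / (2n)) for m = 0..n-1."""
    m = np.arange(n)
    u = m.reshape(-1, 1)
    basis = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * u * np.pi / (2 * n))
    basis[0, :] /= np.sqrt(2.0)  # alpha_0 = 1/sqrt(2), alpha_u = 1 otherwise
    return basis


def dct2(image):
    """Forward 2-D DCT: y = C_rows @ I @ C_cols^T."""
    return dct_matrix(image.shape[0]) @ image @ dct_matrix(image.shape[1]).T


def idct2(coeffs):
    """Inverse 2-D DCT; the basis is orthogonal, so its transpose inverts it."""
    return dct_matrix(coeffs.shape[0]).T @ coeffs @ dct_matrix(coeffs.shape[1])


rng = np.random.default_rng(0)
image = rng.uniform(0, 255, size=(8, 8))

coeffs = dct2(image)
restored = idct2(coeffs)

# Hypothetical embedding: nudge one mid-frequency coefficient.
marked_coeffs = coeffs.copy()
marked_coeffs[3, 4] += 5.0
watermarked = idct2(marked_coeffs)
```

Because the basis is orthonormal, the change of 5.0 in a single coefficient produces a pixel-domain change whose Frobenius norm is also 5.0, which is one way to reason about embedding strength.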

1.1.3 Discrete Wavelet Transform 

Wavelets are special functions used as basis functions for representing signals. For two-dimensional images, the DWT processes the image with 2-D filters in each dimension [1]. The filters divide the image into four sub-bands CA, CH, CV and CD. The CA sub-band represents the approximation coefficients of the DWT. The sub-bands CH, CV and CD represent the horizontal, vertical and diagonal detail coefficients. To obtain the next scale of wavelet coefficients, the sub-band CA is further processed until some final scale N is reached [9, 14, 27].
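The four-sub-band decomposition can be sketched with the Haar wavelet, the simplest choice; the text does not say which wavelet the reviewed schemes use, so Haar is an assumption here, as are the function names and the assignment of the labels CH and CV to the two mixed sub-bands (conventions differ between libraries).

```python
import numpy as np


def haar_dwt2(x):
    """One level of an orthonormal 2-D Haar DWT on an array with even dimensions."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 cell
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    ca = (a + b + c + d) / 2.0           # approximation
    ch = (a + b - c - d) / 2.0           # difference between rows
    cv = (a - b + c - d) / 2.0           # difference between columns
    cd = (a - b - c + d) / 2.0           # diagonal difference
    return ca, ch, cv, cd


def haar_idwt2(ca, ch, cv, cd):
    """Invert haar_dwt2 exactly."""
    rows, cols = ca.shape
    x = np.empty((2 * rows, 2 * cols))
    x[0::2, 0::2] = (ca + ch + cv + cd) / 2.0
    x[0::2, 1::2] = (ca + ch - cv - cd) / 2.0
    x[1::2, 0::2] = (ca - ch + cv - cd) / 2.0
    x[1::2, 1::2] = (ca - ch - cv + cd) / 2.0
    return x


image = np.arange(64, dtype=float).reshape(8, 8)
ca, ch, cv, cd = haar_dwt2(image)

# The next scale is obtained by decomposing CA again.
ca2, ch2, cv2, cd2 = haar_dwt2(ca)
```

Each level halves both dimensions, so an 8×8 image yields 4×4 sub-bands at the first scale and 2×2 sub-bands at the second.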

2 Five Approaches to Digital Image Watermarking 

This section describes the above mentioned five approaches 

along with their strengths and weaknesses. 

2.1 Liu et al. : A SVD Based Watermarking Scheme 

Liu and Tan [25] observed that most watermarking schemes proposed by researchers focused on making the watermark imperceptible, while the issue of rightful ownership resolution was not well addressed. Although they found some papers [10, 13, 30, 37] that sought a solution to this problem, these were few in number and had certain drawbacks. Liu and Tan therefore proposed a new digital watermarking scheme based on Singular Value Decomposition that performed well both in resolving rightful ownership and in resisting common attacks. The strength of this method was that the scheme was non-invertible and gave guidance on selecting the watermark and determining its location. It also helped in computing the amount of watermarking energy that needs to be inserted. Such information was missing in other existing watermarking algorithms. Further, it did not require encryption to resolve rightful ownership. However, it also had some weaknesses. The watermark was added to the singular values of the whole image, and only a single watermark was used, which might be lost due to attacks. The method was found to be resistant to attacks such as compression, filtering and cropping, but was not robust against rotation and translation.
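To make "adding the watermark to the singular values of the whole image" concrete, here is a hedged sketch. This section does not give the embedding equations, so the steps below follow the form in which this kind of SVD scheme is commonly described (and which resembles the steps listed for the scheme in Section 2.5); the function names, the scaling factor `alpha` and the random test data are assumptions.

```python
import numpy as np


def embed(cover, watermark, alpha):
    """Add alpha * watermark to the singular value matrix of a square cover image."""
    u, s, vt = np.linalg.svd(cover)
    d = np.diag(s) + alpha * watermark       # no longer diagonal in general
    uw, sw, vwt = np.linalg.svd(d)           # second SVD yields new singular values
    marked = u @ np.diag(sw) @ vt            # cover vectors, modified singular values
    key = (uw, vwt, s)                       # side information kept by the owner
    return marked, key


def extract(received, key, alpha):
    """Recover the watermark from a (possibly attacked) image using the key."""
    uw, vwt, s = key
    s_received = np.linalg.svd(received, compute_uv=False)
    d = uw @ np.diag(s_received) @ vwt
    return (d - np.diag(s)) / alpha


rng = np.random.default_rng(1)
cover = rng.uniform(0, 255, size=(8, 8))
watermark = rng.integers(0, 2, size=(8, 8)).astype(float)

marked, key = embed(cover, watermark, alpha=0.1)
recovered = extract(marked, key, alpha=0.1)
```

Without an attack the watermark is recovered up to rounding error. Because there is only one embedded copy, any attack that disturbs the singular values of the whole image degrades that single copy, which is the weakness noted above.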

2.2 Ghazy et al. : A Block-By-Block SVD Based 

Watermarking Scheme 

The authors [16] presented a block-based digital image watermarking scheme using SVD. In this approach, the original image was divided into blocks and the watermark was embedded in the singular values of each block separately. In the non-blocked watermarking scheme, there was a risk of losing the watermark due to attacks, because the watermark was added to the singular values (SVs) of the whole image. To overcome this drawback, the original image was first segmented into blocks and the watermark was added to the SVs of each block. Since several blocks are watermarked, several watermarks can be recovered, and some of them survive even if many are lost due to attacks. This method was robust against many attacks such as JPEG compression, cropping, blurring, Gaussian noise, resizing and rotation [15]. The watermark can be either a pseudo-random number or an image. The method also has one disadvantage: it is not resistant to the translation operation.
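The block-by-block idea can be sketched as follows. This is not the exact embedding rule of the reviewed scheme, which is not specified here: as a simplifying assumption, each block carries a single watermark value added to its largest singular value, and extraction is non-blind (it uses the stored original values). Block size, `alpha` and all names are hypothetical.

```python
import numpy as np


def embed_blockwise(cover, bits, block=4, alpha=10.0):
    """Embed one watermark value per non-overlapping block via its largest SV."""
    marked = cover.astype(float).copy()
    originals = []
    index = 0
    for r in range(0, cover.shape[0], block):
        for c in range(0, cover.shape[1], block):
            u, s, vt = np.linalg.svd(marked[r:r + block, c:c + block])
            originals.append(s[0])
            s[0] += alpha * bits[index % len(bits)]
            marked[r:r + block, c:c + block] = u @ np.diag(s) @ vt
            index += 1
    return marked, np.array(originals)


def extract_blockwise(received, originals, block=4, alpha=10.0):
    """Estimate the embedded value of every block from its largest SV."""
    estimates = []
    index = 0
    for r in range(0, received.shape[0], block):
        for c in range(0, received.shape[1], block):
            s = np.linalg.svd(received[r:r + block, c:c + block], compute_uv=False)
            estimates.append((s[0] - originals[index]) / alpha)
            index += 1
    return np.array(estimates)


rng = np.random.default_rng(2)
cover = rng.uniform(0, 255, size=(8, 8))
bits = [1.0, 0.0, 1.0, 1.0]

marked, originals = embed_blockwise(cover, bits)
clean = extract_blockwise(marked, originals)

# Simulated cropping attack: wipe out the first block only.
attacked = marked.copy()
attacked[0:4, 0:4] = 0.0
after_attack = extract_blockwise(attacked, originals)
```

In this sketch the attack destroys the value carried by the first block, while the values in the untouched blocks are still recovered, which illustrates why distributing the watermark over blocks helps against localized attacks.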

2.3 Ganic and Eskicioglu : DWT-SVD Based Watermarking 

Scheme 

In their paper, Ganic and Eskicioglu [17] presented a hybrid scheme based on DWT and SVD in which the cover image was first decomposed into four bands; SVD was then applied to each band and the singular values were modified to embed the same watermark. The scheme proved to be robust to a wide range of attacks. In two-dimensional DWT, the four bands LL, HL, LH and HH are produced at each level of decomposition. Prior to this paper, Mehul and Priti [32] used a two-level decomposition, and the two second-level subbands LL and HH were used to embed two different watermarks. Embedding the watermark in the LL band gave robustness against JPEG compression, Wiener filtering, Gaussian noise, scaling and cropping, while embedding in the HH band gave robustness against histogram equalization, intensity adjustment and gamma correction. However, the embedded watermark was visible all over the cover image, thus reducing the value of the image. Ganic and Eskicioglu generalized the scheme to the four subbands using DWT-SVD watermarking. Instead of using a different scaling factor for each of the four subbands, only two values were used: one for the LL subband, which contributes the largest singular values, and a smaller one for the other three subbands. These subbands usually have decreasing singular values as we move from HL to HH, although there are exceptions.

With this hybrid scheme, no difficulty was experienced in modifying the LL band, which is otherwise problematic in most DWT based watermarking techniques. It was also demonstrated that watermarks inserted in the LL subband are resistant to one group of attacks, such as Gaussian blur, noise, pixelation, JPEG compression and rescaling, while those inserted in the HH subband are resistant to another group of attacks, such as sharpening, cropping, contrast adjustment, histogram equalization and gamma correction [35]. An advantage of SVD was that only the smaller (more significant) set of singular values of the visual watermark had to be embedded. The scheme also had certain limitations. After the cropping attack, singular value extraction in the HL subband did not allow proper reconstruction of the watermark, although the correlation coefficient was high. Moreover, there is no general formula for determining the scaling factor for each subband.
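A compact sketch of the DWT-SVD idea described above: decompose the cover image into four bands, add the scaled singular values of the watermark to the singular values of every band, and use one scaling factor for LL and a smaller one for the other three bands. The Haar wavelet, the two scaling values (0.05 and 0.005), the band labels and the function names are assumptions made for illustration; the text does not specify them.

```python
import numpy as np


def haar_dwt2(x):
    """One-level orthonormal Haar DWT; returns four bands labeled (LL, HL, LH, HH)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return ((a + b + c + d) / 2.0, (a - b + c - d) / 2.0,
            (a + b - c - d) / 2.0, (a - b - c + d) / 2.0)


def haar_idwt2(ll, hl, lh, hh):
    """Invert haar_dwt2 exactly."""
    x = np.empty((2 * ll.shape[0], 2 * ll.shape[1]))
    x[0::2, 0::2] = (ll + hl + lh + hh) / 2.0
    x[0::2, 1::2] = (ll - hl + lh - hh) / 2.0
    x[1::2, 0::2] = (ll + hl - lh - hh) / 2.0
    x[1::2, 1::2] = (ll - hl - lh + hh) / 2.0
    return x


ALPHAS = (0.05, 0.005, 0.005, 0.005)  # hypothetical: larger for LL, smaller elsewhere


def embed(cover, watermark, alphas=ALPHAS):
    """Add scaled watermark singular values to the singular values of each band."""
    s_w = np.linalg.svd(watermark, compute_uv=False)
    marked_bands, band_svs = [], []
    for band, alpha in zip(haar_dwt2(cover), alphas):
        u, s, vt = np.linalg.svd(band)
        band_svs.append(s)
        marked_bands.append(u @ np.diag(s + alpha * s_w) @ vt)
    return haar_idwt2(*marked_bands), band_svs


def extract(received, band_svs, alphas=ALPHAS):
    """Return one estimate of the watermark singular values per band."""
    estimates = []
    for band, s, alpha in zip(haar_dwt2(received), band_svs, alphas):
        s_received = np.linalg.svd(band, compute_uv=False)
        estimates.append((s_received - s) / alpha)
    return estimates


rng = np.random.default_rng(3)
cover = rng.uniform(0, 255, size=(16, 16))
watermark = rng.uniform(0, 255, size=(8, 8))   # same size as each band

marked, band_svs = embed(cover, watermark)
estimates = extract(marked, band_svs)
```

Each of the four bands yields its own estimate of the watermark's singular values, so different attacks can leave different bands usable, which matches the behavior reported for the LL and HH subbands.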

2.4 S. M. Rafizul Haque : SVD-DCT Based Watermarking 

Scheme 

Rafizul [21], in his thesis, presented a modified DCT-SVD based watermarking scheme to provide robustness mainly against rotation (R), scaling (S) and translation (T) attacks. He added an extra preprocessing step before watermark extraction, which made the watermarked image capable of resisting large-angle rotation, scaling and translation attacks, thus making the scheme more robust. Although this scheme is RST invariant, the value of the correlation coefficient is not very good. Moreover, for most of the attacks, the watermarks extracted from the four quadrants are not of the same quality.

2.5 Majumder et al.: A Hybrid SVD and Wavelet Based Watermarking

Majumder et al. [26] proposed a hybrid SVD and wavelet based watermarking scheme that differs from the one discussed above (Section 2.3). In this scheme, the cover image was first divided into four subbands (LL, HL, LH and HH) and SVD was applied to each subband to obtain $U_{LL}$, $S_{LL}$ and $V_{LL}$ from the LL band, and so on for HL, LH and HH. The watermark was then iterated, multiplied by a number named KEY and added to the singular values $S_{LL}$ to yield a new matrix $S_{LL}^{\star}$ that was no longer a simple diagonal matrix, because the addition produces non-zero values at positions other than the diagonal. KEY may take any value greater than 0 but less than 1; this was done to reduce the impact of the watermark. SVD was applied again to the new matrix $S_{LL}^{\star}$ to yield $U1_{LL}$, $S1_{LL}$ and $V1_{LL}$. The watermarked LL subband was computed using the equation $LL_w = U_{LL}\, S1_{LL}\, V_{LL}^{T}$; $HL_w$, $LH_w$ and $HH_w$ were computed similarly. Another equation, $LL_k = U1_{LL}\, S_{LL}\, V1_{LL}^{T}$, was used to obtain the key subband of LL, called $LL_k$, and similarly $HL_k$, $LH_k$ and $HH_k$.

The watermarked image was then transmitted. It might be corrupted by noise or other attacks before it reaches the destination. In order to extract the watermark, the key image and the value of KEY are required along with the corrupted watermarked image. DWT was applied to the corrupted watermarked image and to the key image to obtain $LL_w$, $HL_w$, $LH_w$ and $HH_w$, and $LL_k$, $HL_k$, $LH_k$ and $HH_k$, respectively. SVD was applied to $LL_k$ to obtain $U_k$, $S_k$ and $V_k$, and to $LL_w$ to obtain $U_w$, $S_w$ and $V_w$. The matrix $D = U_k S_w V_k$ was computed, and the extracted average coefficients of the watermark, $LL_{mark}$, were obtained by computing $LL_{mark} = \frac{1}{KEY} \times S_k$. Similarly, $HL_{mark}$, $LH_{mark}$ and $HH_{mark}$ were calculated, and the IDWT was applied to extract the watermark. The results showed the robustness of the scheme against different attacks, and the authors were able to extract a visibly good watermark.
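The remark that adding the scaled watermark to the diagonal matrix of singular values produces a non-diagonal matrix, which is then decomposed a second time, can be checked numerically. The following is a minimal NumPy sketch of that embedding step only, not the implementation of [26]; the random 4 × 4 matrices, the value KEY = 0.1 and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
LL = rng.random((4, 4))   # stand-in for the LL subband (assumed toy data)
W = rng.random((4, 4))    # stand-in for the watermark (assumed toy data)
KEY = 0.1                 # any value in (0, 1); chosen here for illustration

# First SVD: LL = U_LL @ diag(s_LL) @ Vt_LL
U_LL, s_LL, Vt_LL = np.linalg.svd(LL)
S_LL = np.diag(s_LL)

# Adding the scaled watermark fills the off-diagonal positions
S_star = S_LL + KEY * W
off_diagonal = S_star - np.diag(np.diag(S_star))

# Second SVD of the non-diagonal matrix gives U1_LL, S1_LL, V1_LL
U1_LL, s1_LL, Vt1_LL = np.linalg.svd(S_star)

# Watermarked subband and key subband, following the two equations in the text
LL_w = U_LL @ np.diag(s1_LL) @ Vt_LL
LL_k = U1_LL @ S_LL @ Vt1_LL
```

Because $U_{LL}$ and $V_{LL}$ are orthogonal, the singular values of `LL_w` in this sketch are exactly the entries of `s1_LL`.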

3 Improved Watermarking Scheme 

Generally, SVD transform can be applied to an image 

with two different techniques: on the whole image and 

on blocks of the image [4]. The former tends to spread 

the watermark all over the image, whereas the latter 

only affects local regions of the image. 
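The difference between the two techniques can be illustrated on a toy array. The sketch below is only an illustration under assumed sizes (an 8 × 8 array and 4 × 4 blocks); the helper name `split_into_blocks` is hypothetical and does not come from [4].

```python
import numpy as np

def split_into_blocks(band, block_size):
    """Split a 2-D array into non-overlapping block_size x block_size blocks."""
    rows, cols = band.shape
    if rows % block_size or cols % block_size:
        raise ValueError("band dimensions must be multiples of block_size")
    return [band[r:r + block_size, c:c + block_size]
            for r in range(0, rows, block_size)
            for c in range(0, cols, block_size)]

image = np.arange(64, dtype=float).reshape(8, 8)   # toy 8 x 8 "image"

# Whole-image SVD: one set of 8 singular values describes the entire image,
# so changing them spreads the change over all pixels.
whole_sv = np.linalg.svd(image, compute_uv=False)

# Block-based SVD: each 4 x 4 block has its own 4 singular values,
# so changing them only affects that local region.
blocks = split_into_blocks(image, 4)
block_svs = [np.linalg.svd(b, compute_uv=False) for b in blocks]
```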

The improvement to the watermarking scheme that we propose is block-based. The basic idea of the improved algorithm is to decompose the cover image, by applying the DWT, into approximation coefficients (CA), horizontal detail coefficients (CH), vertical detail coefficients (CV) and diagonal detail coefficients (CD). These bands are first divided into non-overlapping blocks. SVD is then applied to the non-overlapping blocks $CA_i$, $CH_i$, $CV_i$ and $CD_i$ to compute $Ua_i$, $Sa_i$ and $Va_i$ from $CA_i$, and so on for $CH_i$, $CV_i$ and $CD_i$. We embed the singular values of the watermark in the singular values of these non-overlapping blocks.

3.1 Improved Watermark Embedding Scheme

The steps for improved watermark embedding are as follows:

1. Use DWT to decompose the cover image (F matrix) into four separate data sets: CA, CH, CV, and CD.

2. Divide the CA, CH, CV, and CD data sets obtained from step 1 into non-overlapping blocks.

3. Perform SVD on each block to obtain the SVs of each block.

   $CH_i = Uh_i\, Sh_i\, Vh_i^{T}$   (8)

   $CD_i = Ud_i\, Sd_i\, Vd_i^{T}$   (9)

   $CA_i = Ua_i\, Sa_i\, Va_i^{T}$   (10)

   $CV_i = Uv_i\, Sv_i\, Vv_i^{T}$   (11)

   where $i = 1, 2, \ldots, N$ and $N$ is the number of blocks.

4. Apply SVD to the watermark $W$.

   $W = U_w\, S_w\, V_w^{T}$   (12)

   $S_w$ contains the singular values $\lambda_{wi}$, where $i = 1, 2, \ldots, n$.

5. Modify the singular values of the subbands of the cover image with the singular values of the visual watermark:

   $\lambda_i^{\star} = \lambda_i + \alpha \lambda_{wi}$   (13)

   where $\lambda_i$, $i = 1, 2, \ldots, n$, are the singular values of $Sd_i$, $Sh_i$, $Sv_i$ and $Sa_i$, and $\alpha$ is the scaling factor.

6. Find the four sets of modified DWT coefficients:

   $CH_i = Uh_i\, Sh_i^{\star}\, Vh_i^{T}$   (14)

   $CD_i = Ud_i\, Sd_i^{\star}\, Vd_i^{T}$   (15)

   $CA_i = Ua_i\, Sa_i^{\star}\, Va_i^{T}$   (16)

   $CV_i = Uv_i\, Sv_i^{\star}\, Vv_i^{T}$   (17)

7. Rearrange the DWT coefficients back into one matrix to build the CA, CH, CV and CD data sets.

8. Apply the inverse DWT to produce the watermarked cover image.

3.2 Improved Watermark Extraction Scheme

The steps of watermark extraction are as follows:

1. Apply the DWT to decompose the whole watermarked (and possibly attacked) cover image $F^{*}$ into four data sets: CA, CH, CV, and CD.

2. Divide the CA, CH, CV and CD data sets into non-overlapping blocks of the same size as used in the embedding process.

3. Perform SVD on each block to obtain the SVs of each block.

   $CH_i = Uh_i\, Sh_i\, Vh_i^{T}$   (18)

   $CD_i = Ud_i\, Sd_i\, Vd_i^{T}$   (19)

   $CA_i = Ua_i\, Sa_i\, Va_i^{T}$   (20)

   $CV_i = Uv_i\, Sv_i\, Vv_i^{T}$   (21)

4. Extract the singular values from each block:

   $\lambda_{wi} = (\lambda_i^{\star} - \lambda_i)/\alpha$   (22)

5. Use the S matrix of each block to build the watermark blocks in the spatial domain.

   $W_i = U_w\, S_{wi}\, V_w^{T}$   (23)

4 Experimentation and Results

4.1 Experiment 1 and Results

In order to verify the validity of the improved watermarking scheme, we chose two different watermark images, each of size 256 × 256. Four different cover images, each of size 256 × 256, have been used. The scaling factor has been chosen to be equal to 0.5. The 'Haar' wavelet has been used [19]. Figure 1 shows the cover images (a, b, c, d) and the watermark images (e, f).

Figure 1: Cover images: (a) Lena (b) Barbara (c) Baboon (d) Peppers; Watermark images (e) Logo (f) Bird
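The embedding steps of Section 3.1 and the extraction steps of Section 3.2 can be traced end to end on toy data. The following NumPy sketch is a hedged illustration rather than the implementation used in the experiments: the hand-written one-level orthonormal Haar transform, the 16 × 16 random cover, the 4 × 4 watermark (assumed to have the same size as one block, so that the same singular values are embedded in every block), the averaging of the per-block estimates and all function names are assumptions made for the example. The assignment of the CH and CV labels follows one common convention.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(x):
    """One-level orthonormal 2-D Haar DWT; returns (CA, CH, CV, CD)."""
    lo = (x[0::2, :] + x[1::2, :]) / SQRT2   # low-pass across row pairs
    hi = (x[0::2, :] - x[1::2, :]) / SQRT2   # high-pass across row pairs
    ca = (lo[:, 0::2] + lo[:, 1::2]) / SQRT2
    cv = (lo[:, 0::2] - lo[:, 1::2]) / SQRT2
    ch = (hi[:, 0::2] + hi[:, 1::2]) / SQRT2
    cd = (hi[:, 0::2] - hi[:, 1::2]) / SQRT2
    return ca, ch, cv, cd

def haar_idwt2(ca, ch, cv, cd):
    """Inverse of haar_dwt2."""
    rows, cols = ca.shape
    lo = np.empty((rows, 2 * cols))
    hi = np.empty((rows, 2 * cols))
    lo[:, 0::2], lo[:, 1::2] = (ca + cv) / SQRT2, (ca - cv) / SQRT2
    hi[:, 0::2], hi[:, 1::2] = (ch + cd) / SQRT2, (ch - cd) / SQRT2
    x = np.empty((2 * rows, 2 * cols))
    x[0::2, :], x[1::2, :] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return x

def block_slices(shape, bs):
    """Index slices of all non-overlapping bs x bs blocks."""
    return [(slice(r, r + bs), slice(c, c + bs))
            for r in range(0, shape[0], bs)
            for c in range(0, shape[1], bs)]

def embed_band(band, wm_sv, alpha, bs):
    """Embedding steps 2-7 for one band: lambda* = lambda + alpha * lambda_w."""
    marked = np.empty_like(band)
    for sl in block_slices(band.shape, bs):
        u, s, vt = np.linalg.svd(band[sl])
        marked[sl] = u @ np.diag(s + alpha * wm_sv) @ vt
    return marked

def extract_band(marked, original, alpha, bs):
    """Extraction steps 2-4 for one band, averaged over blocks (assumption)."""
    estimates = []
    for sl in block_slices(original.shape, bs):
        s_star = np.linalg.svd(marked[sl], compute_uv=False)
        s = np.linalg.svd(original[sl], compute_uv=False)
        estimates.append((s_star - s) / alpha)   # Eq. (22)
    return np.mean(estimates, axis=0)

rng = np.random.default_rng(1)
cover = rng.random((16, 16)) * 255.0   # toy cover image
watermark = rng.random((4, 4))         # toy watermark, one block in size
alpha, bs = 0.5, 4                     # scaling factor 0.5 as in Experiment 1

u_w, s_w, vt_w = np.linalg.svd(watermark)                 # Eq. (12)
bands = haar_dwt2(cover)                                  # embedding step 1
watermarked = haar_idwt2(*[embed_band(b, s_w, alpha, bs) for b in bands])

received_bands = haar_dwt2(watermarked)                   # extraction step 1
s_est = np.mean([extract_band(m, b, alpha, bs)
                 for m, b in zip(received_bands, bands)], axis=0)
recovered = u_w @ np.diag(s_est) @ vt_w                   # Eq. (23)
```

In this sketch the extraction uses the singular values of the original cover blocks as well as $U_w$ and $V_w$ of the watermark, as Eqs. (22) and (23) require. Without any attack the recovered watermark matches the embedded one up to floating-point error.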

4.2 Experiment 2 and Results

In this section, we implement the schemes proposed by Liu [25], Ghazy [15, 16], Rafizul Haque [21], Emir Ganic [17] and Majumder [26], together with the improved approach. These six methods have been tested against ten different attacks: Gaussian noise, rotation, cropping, Gaussian blur, median filtering, histogram equalization, sharpening, transform, rescaling and gamma correction.

The first attack is Gaussian noise with zero mean and a variance of 0.5. The second attack is rotation by 30 degrees and by 75 degrees. The next image manipulation is cropping: we use a part of the image by selecting 50% of the original size at the center and cropping every other pixel. The fourth attack is blurring using a low-pass filter with a 3 × 3 window. The fifth attack is median filtering, the sixth is histogram equalization and the seventh is image sharpening. The eighth attack applied to the watermarked image is transform. The ninth attack is rescaling: the watermarked image is scaled from 256 × 256 to 128 × 128 and then rescaled to its original size. Lastly, we perform a gamma correction attack, which changes the brightness of the pixels of the watermarked image by a specified factor of 1.5.
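Three of the attacks described above can be sketched in a few lines of NumPy. This is a rough illustration under stated assumptions, not the code used for the experiments: pixel values are assumed to lie in [0, 1], the gamma attack is assumed to raise each pixel to the power 1.5, and the rescaling attack is assumed to use 2 × 2 block averaging followed by nearest-neighbour enlargement, since the text does not specify the interpolation method. All function names are hypothetical.

```python
import numpy as np

def gaussian_noise(img, variance, rng):
    """Add zero-mean Gaussian noise and clip back to the [0, 1] range."""
    noisy = img + rng.normal(0.0, np.sqrt(variance), img.shape)
    return np.clip(noisy, 0.0, 1.0)

def gamma_correction(img, gamma):
    """Brightness change by a power law (assumed form: out = in ** gamma)."""
    return np.power(img, gamma)

def rescale_half_and_back(img):
    """Halve each dimension by 2 x 2 averaging, then enlarge by repetition."""
    h, w = img.shape
    small = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)

rng = np.random.default_rng(0)
watermarked = rng.random((256, 256))            # stand-in watermarked image

attacked = {
    "G.Noise": gaussian_noise(watermarked, 0.5, rng),
    "G.Corr": gamma_correction(watermarked, 1.5),
    "Resize": rescale_half_and_back(watermarked),   # 256 -> 128 -> 256
}
```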

A correlation metric has been used to compare the original watermark with the watermark extracted after applying attacks to the watermarked image. Figure 2 depicts the attacked images. The watermarks extracted from the Peppers and Barbara images using the proposed scheme are shown in Figures 3 and 4 respectively.
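A correlation metric of this kind can be computed as in the sketch below. It assumes the usual normalized (Pearson) correlation coefficient between the two images; the exact formula used for the tables is not restated in this section, so this definition and the function name are assumptions.

```python
import numpy as np

def correlation_coefficient(original, extracted):
    """Pearson correlation between two images of equal shape."""
    a = np.asarray(original, dtype=float).ravel()
    b = np.asarray(extracted, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # A constant image has no variation; report zero correlation in that case
    return float((a * b).sum() / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
w = rng.random((8, 8))                                        # toy watermark
same = correlation_coefficient(w, w)                          # identical images
inverted = correlation_coefficient(w, 1.0 - w)                # negative image
flat = correlation_coefficient(w, np.zeros_like(w))           # constant image
```

A value close to 1 indicates that the extracted watermark closely resembles the original, which is how the tables in this section are read.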

Figure 2: Attacked images; (a) Gaussian Noise (b) Rotate 30 (c) Cropping (Left Half) (d) Median Filtering (e) Histogram (f) Sharpening (g) Resize (h) Gamma Correction (i) Transform

Figure 3: Extracted watermarks from Peppers as cover image and Logo as a watermark image using Proposed Method; (a) Gaussian Noise (b) Rotate 30 (c) Cropping (Left Half) (d) Median Filtering (e) Histogram (f) Sharpening (g) Resize (h) Gamma Correction (i) Transform

Figure 4: Extracted watermarks from Barbara as cover image and Bird as a watermark image using Proposed Method; (a) Gaussian Noise (b) Rotate 30 (c) Cropping (Left Half) (d) Median Filtering (e) Histogram (f) Sharpening (g) Resize (h) Gamma Correction (i) Transform

The correlation coefficients for Barbara and Lena as cover images and Logo as the watermark image are given in Tables 1 and 2 respectively. Tables 3 and 4 show the correlation coefficients for Peppers and Baboon as cover images and Bird as the watermark image respectively. Prop stands for the proposed method in all the tables.



Table 1: Correlation coefficients for Barbara image as cover image and Logo image as watermark

Attacks     Liu    Gha    Emir   Haq    Mis    Prop
G.Noise     0.61   0.68   0.57   0.58   0.72   0.76
G.Blur      0.86   0.92   0.96   0.96   0.94   0.96
Crop(R.H)   0.60   0.97   0.95   0.91   0.94   0.99
Crop(L.H)   0.49   0.97   0.88   0.95   0.88   0.98
Rotate30    0.64   0.79   0.54   0.62   0.87   0.93
Rotate75    0.75   0.80   0.52   0.70   0.87   0.91
Med.Fil.    0.94   0.94   0.97   0.98   0.95   0.97
Hist.Eq.    0.88   0.90   0.99   0.99   0.99   0.99
Sharp       0.75   0.82   0.89   0.89   0.88   0.96
Trans       0.87   0.90   0.60   0.71   0.86   0.97
Resize      0.99   0.99   0.85   0.92   0.92   0.89
G.Corr      0.96   0.98   0.99   0.99   0.99   0.99

Table 2: Correlation coefficients for Lena image as cover image and Logo image as watermark

Attacks     Liu    Gha    Emir   Haq    Mis    Prop
G.Noise     0.40   0.51   0.55   0.44   0.52   0.59
G.Blur      0.97   0.97   0.98   0.96   0.97   0.99
Crop(R.H)   0.43   0.97   0.97   0.99   0.98   0.99
Crop(L.H)   0.67   0.98   0.97   0.96   0.96   0.99
Rotate30    0.62   0.75   0.61   0.63   0.82   0.83
Rotate75    0.74   0.83   0.59   0.66   0.81   0.87
Med.Fil.    0.98   0.99   0.99   0.98   0.98   0.99
Hist.Eq.    0.96   0.98   0.99   0.98   0.99   0.99
Sharp       0.86   0.91   0.95   0.92   0.96   0.97
Trans       0.86   0.89   0.71   0.75   0.83   0.92
Resize      0.99   0.99   0.89   0.96   0.94   0.92
G.Corr      0.96   0.96   0.99   0.97   0.99   0.99

Table 3: Correlation coefficients for Peppers image as cover image and Bird image as watermark

Attacks     Liu    Gha    Emir   Haq    Mis    Prop
G.Noise     0.53   0.60   0.53   0.53   0.58   0.63
G.Blur      0.91   0.93   0.95   0.86   0.92   0.96
Crop(R.H)   0.33   0.98   0.87   0.95   0.92   0.99
Crop(L.H)   0.36   0.95   0.89   0.92   0.93   0.97
Rotate30    0.41   0.48   0.51   0.53   0.58   0.66
Rotate75    0.56   0.61   0.51   0.57   0.56   0.75
Med.Fil.    0.95   0.95   0.97   0.97   0.95   0.98
Hist.Eq.    0.89   0.95   0.99   0.96   0.96   0.98
Sharp       0.69   0.82   0.90   0.84   0.88   0.91
Trans       0.68   0.71   0.55   0.66   0.62   0.90
Resize      0.96   0.98   0.84   0.86   0.85   0.86
G.Corr      0.78   0.78   0.99   0.98   0.99   0.99

Table 4: Correlation coefficients for Baboon image as cover image and Bird image as watermark

Attacks     Liu    Gha    Emir   Haq    Mis    Prop
G.Noise     0.57   0.65   0.58   0.57   0.65   0.69
G.Blur      0.73   0.80   0.83   0.80   0.85   0.86
Crop(R.H)   0.38   0.98   0.72   0.81   0.80   0.99
Crop(L.H)   0.33   0.98   0.66   0.74   0.77   0.99
Rotate30    0.40   0.49   0.54   0.52   0.62   0.69
Rotate75    0.50   0.70   0.58   0.58   0.61   0.71
Med.Fil.    0.74   0.78   0.89   0.88   0.86   0.89
Hist.Eq.    0.99   0.98   0.98   0.97   0.98   0.98
Sharp       0.45   0.64   0.81   0.80   0.75   0.85
Trans       0.66   0.78   0.60   0.61   0.66   0.87
Resize      0.89   0.93   0.75   0.79   0.65   0.79
G.Corr      0.75   0.76   0.98   0.98   0.96   0.99

These tables show the correlation coefficients between the original watermark and the watermark extracted using the Liu, Ghazy, Emir Ganic, Rafizul Haque, Majumder and proposed methods. A higher correlation coefficient means greater similarity between the extracted watermark and the original watermark. The correlation coefficients obtained with the proposed method for Gaussian noise, Gaussian blurring, cropping, sharpening, transform and rotation are far better, and for other attacks such as histogram equalization and gamma correction there is a slight improvement.

The results reveal that the proposed method outperforms the other ones for cropping, rotation, sharpening, transform, blurring and noise attacks. For gamma correction and histogram equalization attacks, the results obtained using the other methods are more or less equal to those obtained using the proposed method. In the case of the median filtering and gamma correction attacks, the Rafizul Haque and Majumder methods, respectively, perform better for one image (Barbara), whereas for the other images the proposed method gives good results, though the difference is very small. In the case of the rescaling (or resizing) attack, the Ghazy method gives better results than the other methods. For Peppers and Lena as cover images and Bird as watermark image, the Emir Ganic method performs better under the histogram equalization attack.

Next, we examined the effect of adding salt-and-pepper noise to the watermarked images with densities within the interval [0.001, 0.09]. Tables 5 and 6 give the correlation coefficients after applying the salt-and-pepper noise attack for Barbara as cover image and Logo and Bird as watermark images. The results show that the advantage of the proposed method grows as the noise density increases.
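A salt-and-pepper attack with a given density can be sketched as follows. The sketch assumes one common convention, in which the density is the expected fraction of affected pixels, split equally between black (0) and white (1) in an image with values in [0, 1]; the function name and the constant test image are illustrative assumptions.

```python
import numpy as np

def salt_and_pepper(img, density, rng):
    """Set about `density` of the pixels to 0 or 1, half each on average."""
    noisy = img.copy()
    u = rng.random(img.shape)
    noisy[u < density / 2] = 0.0                        # pepper
    noisy[(u >= density / 2) & (u < density)] = 1.0     # salt
    return noisy

rng = np.random.default_rng(0)
image = np.full((256, 256), 0.5)          # constant toy image
noisy = salt_and_pepper(image, 0.09, rng)
affected = float((noisy != image).mean()) # close to the requested density
```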

Another experiment was performed to see the effect of adding Gaussian noise with different variance values to the watermarked image, using Barbara as cover image and Bird and Logo as watermark images; the results are tabulated in Tables 7 and 8. The results reveal that the proposed method gives better results as the value of the variance increases, but at low variance other methods have an edge over the proposed one.

Table 5: Correlation coefficients for Salt and pepper noise attacks with different noise densities using Barbara as cover and logo as watermark

Den.    Liu     Gha     Emir    Haq     Mis     Prop
0.001   0.999   0.999   0.999   0.999   0.999   0.999
0.003   0.998   0.999   0.999   0.998   0.998   0.999
0.005   0.997   0.999   0.998   0.997   0.998   0.999
0.007   0.996   0.997   0.997   0.996   0.997   0.998
0.009   0.992   0.996   0.995   0.995   0.996   0.997
0.01    0.990   0.994   0.995   0.993   0.995   0.996
0.03    0.953   0.961   0.966   0.966   0.978   0.979
0.05    0.912   0.921   0.926   0.919   0.944   0.955
0.07    0.862   0.895   0.883   0.858   0.905   0.924
0.09    0.824   0.8554  0.8274  0.815   0.8764  0.898

Table 6: Correlation coefficients for Salt and pepper noise attacks with different noise densities using Barbara as cover and bird as watermark

Den.    Liu     Gha     Emir    Haq     Mis     Prop
0.001   0.999   0.999   0.998   0.998   0.998   0.999
0.003   0.998   0.997   0.998   0.998   0.997   0.998
0.005   0.995   0.997   0.998   0.998   0.995   0.998
0.007   0.991   0.995   0.995   0.995   0.993   0.997
0.009   0.988   0.991   0.994   0.994   0.987   0.995
0.01    0.985   0.989   0.992   0.993   0.986   0.994
0.03    0.939   0.943   0.959   0.967   0.934   0.966
0.05    0.883   0.892   0.910   0.917   0.873   0.955
0.07    0.834   0.848   0.847   0.866   0.829   0.924
0.09    0.796   0.818   0.768   0.819   0.781   0.898

Table 7: Correlation coefficients for Gaussian noise attacks with different noise variances using Barbara as cover and logo as watermark

Var.    Liu     Gha     Emir    Haq     Mis     Prop
0.001   0.998   0.998   0.999   0.998   0.999   0.998
0.005   0.981   0.9841  0.989   0.987   0.992   0.990
0.01    0.951   0.960   0.968   0.963   0.976   0.977
0.05    0.747   0.795   0.759   0.736   0.844   0.813
0.1     0.616   0.684   0.573   0.579   0.727   0.758
0.2     0.510   0.593   0.476   0.431   0.633   0.644
0.3     0.464   0.557   0.438   0.415   0.569   0.597
0.4     0.443   0.532   0.425   0.410   0.549   0.593
0.5     0.422   0.512   0.422   0.408   0.534   0.589
0.6     0.419   0.507   0.418   0.402   0.519   0.585

Table 8: Correlation coefficients for Gaussian noise attacks with different noise variances using Barbara as cover and bird as watermark

Var.    Liu     Gha     Emir    Haq     Mis     Prop
0.001   0.962   0.984   0.992   0.932   0.991   0.989
0.005   0.931   0.961   0.974   0.827   0.977   0.944
0.01    0.882   0.923   0.935   0.742   0.951   0.887
0.05    0.637   0.713   0.631   0.535   0.748   0.685
0.1     0.527   0.607   0.511   0.498   0.615   0.641
0.2     0.450   0.528   0.479   0.472   0.509   0.587
0.3     0.416   0.500   0.466   0.463   0.448   0.579
0.4     0.395   0.478   0.463   0.452   0.433   0.568
0.5     0.382   0.468   0.455   0.450   0.425   0.568
0.6     0.373   0.464   0.455   0.434   0.428   0.560

Lastly, we performed experiments to show the effect of the scaling factor and of the wavelet on the proposed method. Table 9 gives the correlation coefficients obtained with different wavelets for the proposed, Emir Ganic and Majumder methods under the image sharpening attack, using Barbara as cover image and Bird as watermark image. The results show that the proposed method gives better correlation under the different wavelets.

Table 9: Correlation coefficients for different wavelets under sharpening attack

Wavelet                  Emir    Mis     Prop
Haar                     0.824   0.772   0.874
Daubechies (db2)         0.787   0.208   0.836
Symlet (sym2)            0.787   0.208   0.868
Biorthogonal (bior2.2)   0.823   0.265   0.871
Coifman (coif2)          0.775   0.395   0.869

Figure 5 shows the effect of the scaling factor on the watermarked image for the proposed method. The Peak Signal to Noise Ratio (PSNR) of the watermarked image for different scaling factors is tabulated in Tables 10 and 11 for one cover image and two watermark images.

Figure 5: Effect of scaling factor on watermarked image using Lena as cover image and bird as watermark image

The results reveal that an increase in the value of the scaling factor results in a decrease in PSNR, which means that the distortion in the watermarked image increases when the scaling factor increases. The results also demonstrate that the Liu and Ghazy methods give the best image quality.
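For reference, PSNR between a cover image and its watermarked version can be computed as in the sketch below. It uses the standard definition with a peak value of 255 for 8-bit images; the peak value, the function name and the toy inputs are assumptions, since the text does not restate the formula.

```python
import numpy as np

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    original = np.asarray(original, dtype=float)
    distorted = np.asarray(distorted, dtype=float)
    mse = np.mean((original - distorted) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))

cover = np.zeros((4, 4))
small_change = psnr(cover, cover + 1.0)    # every pixel off by 1 grey level
large_change = psnr(cover, cover + 10.0)   # every pixel off by 10 grey levels
```

A larger distortion gives a lower PSNR, which is the trend the tables show as the scaling factor grows.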



Table 10: PSNR under different scaling factor using Lena as cover and logo as watermark

Scal.   Liu     Gha     Emir    Haq     Mis     Prop
0.1     44.92   42.67   31.36   31.39   38.73   31.48
0.2     36.25   34.83   25.53   25.61   29.93   25.79
0.3     31.64   29.80   22.32   22.42   25.72   22.68
0.4     28.12   26.51   20.16   20.27   23.12   20.57
0.5     25.40   23.84   18.57   18.67   21.28   19.01
0.6     22.76   21.83   17.32   17.42   19.87   17.78
0.7     20.94   20.33   16.31   16.43   18.73   16.79
0.8     19.59   19.04   15.47   15.59   17.82   15.95
0.9     18.54   17.95   14.76   14.89   17.05   15.24
1.0     17.64   17.04   14.14   14.28   16.36   14.63

Table 11: PSNR under different scaling factor using Lena as cover and Bird as watermark

Scal.   Liu     Gha     Emir    Haq     Mis     Prop
0.1     37.36   35.53   25.85   25.98   28.38   26.24
0.2     27.94   26.16   20.42   20.58   22.53   20.80
0.3     21.92   21.49   17.50   17.67   19.61   17.97
0.4     18.87   18.63   15.54   15.72   17.61   16.01
0.5     16.85   16.73   14.12   14.30   16.09   14.58
0.6     15.47   15.40   13.04   13.22   14.95   13.46
0.7     14.51   14.45   12.17   12.33   14.03   12.56
0.8     13.81   13.75   11.45   11.61   13.28   11.83
0.9     13.27   13.20   10.84   10.99   12.66   11.22
1.0     12.85   12.73   10.33   10.46   12.13   10.68

Although the proposed method is not the best as far as PSNR, and hence image quality, is concerned, it is still better than or at least comparable with the Rafizul Haque and Emir Ganic methods.

5 Conclusions

In this paper, an improvement to the existing scheme was suggested whereby we divided the data sets (bands) into non-overlapping blocks and then modified the singular values of each block with the singular values of the watermark image. A correlation metric has been used to compare the performance of the proposed method with that of the other methods. The proposed method gives much better results in terms of correlation coefficients against Gaussian noise, rotation, cropping, sharpening, and transform attacks. The proposed technique and the Ganic method give almost comparable results for median filtering, gamma correction, Gaussian blurring, rescaling and histogram equalization. The proposed method also performs better against Gaussian noise and salt-and-pepper noise, particularly when the noise variance and density values are high. The proposed method also gives the best correlation for all the wavelets tested. The quality of the watermarked image for the proposed method, though not the best, is still better than or at least comparable with some of the techniques. The experimental results confirm that the proposed method outperforms the other methods against some of the image processing attacks, whereas it is on a par with the other methods for the rest of the attacks.

References

[1] Al-Hai, A. Combined dwt-dct digital image watermarking. Computer Science, 3(9):740–746, 2007.

[2] Arya, D. A survey of frequency and wavelet domain 

digital watermarking techniques. Scientific 

and Engineering Research, 1(2):1–4, 2010. 

[3] Bao, P. and Ha, X. Image adaptive watermarking using wavelet domain singular value decomposition. IEEE Trans. on Circuits and Systems for Video Technology, 15(1):96–102, 2005.

[4] Basso, A., Bergadano, F., Cavagnino, D., Pomponiu, 

V., and Vernone, A. A novel block-based 

watermarking scheme using the svd transform. Algorithm, 

2(1):46–75, 2009. 

[5] Cao, L. Singular value decomposition applied to 

digital image processing. Master’s thesis, Arizona 

State University Polytechnic Campus, Arizona, 

2007. 

[6] Chandra, D. V. S. Digital image watermarking using 

singular value decomposition. In Proceeding 

of IEEE Midwest Symposium on Circuits and Systems, 

pages 264–267, Tulsa, USA, 2002. 

[7] Chang, C. C., Lin, C. C., and Tsai, P. Svd 

based digital image watermarking scheme. Pattern 

Recognition Letters, pages 1577–1586, 2005. 

[8] Chung, K., Shen, C., and Chang, L. A novel 

svd and vq-based image hiding scheme. Pattern 

Recognition Letters, 22(9):1051–1058, 2001. 

[9] Chung, Y. and Kim, C. Robust image watermarking 

against filtering attacks. In SICE Annual Conference, 

pages 3017–3020, Fukui, Japan, 2003. 

[10] Cox, I., Kilian, J., Leighton, F., and Shamoon, 

T. Secure spread spectrum watermarking for 

multimedia. IEEE Trans. on Image Processing, 

6(12):1673–1687, 1997. 

[11] Cox, I., Miller, M., and Bloom, J. Watermarking applications and their properties. In International Conference on Information Technology, pages 6–10, 2000.

[12] Cox, I., Miller, M., and Bloom, J. Digital Watermarking. Morgan-Kaufmann, San Francisco, CA, 2002.

[13] Craver, S., Memon, N., Yeo, B., and Yeung, M. Can invisible watermarks resolve rightful ownership. Technical Report RC-20509, IBM Research Division, July 1996.

[14] Furht, B. and Kirovski, D. Multimedia Watermarking Techniques and Applications. Auerbach Publication, 2006.

[15] Ghazy, R., El-Fishawy, N., Hadhoud, M., and El-Samie, F. A. An efficient block-by-block svd based image watermarking scheme. Ubiquitous Computing and Communication, 2(5):1–9, 2007.

[16] Ghazy, R., El-Fishawy, N., Hadhoud, M., and El-Samie, F. A. Performance evaluation of block based svd image watermarking. Electromagnetics Research, 8:147–159, 2008.

[17] Ganic, E. and Eskicioglu, A. M. Robust dwt-svd domain image watermarking: Embedding data in all frequencies. In Proceedings of the 2004 workshop on Multimedia and security, pages 20–21, Magdeburg, Germany, 2004.

[18] Ganic, E., Zubair, N., and Eskicioglu, A. M. An optimal watermarking scheme based on singular value decomposition. In International Conference on Communication, Network and Information Security, pages 85–90, New York, USA, 2003.

[19] Gonzalez, R. and Woods, R. Digital Image Processing. Prentice Hall Inc, 2002.

[20] Gunjal, B. L. and Manthalkar, R. An overview of transform domain robust digital image watermarking algorithms. Emerging Trends in Computing and Information Sciences, 2(1):37–42, 2010.

[21] Haque, R. Singular value decomposition and discrete cosine transform based watermarking. Master’s thesis, Blekinge Institute of Technology, Sweden, 2008.

[22] Hartung, F. and Kutter, M. Multimedia watermarking techniques. IEEE, 87(7):1079–1107, 1999.

[23] Lee, S., Jang, D., and Yoo, C. D. An svd-based watermarking method for image content authentication with improved security. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 525–528, 2005.

[24] Liu, J., Niu, X., and Kong, W. Image watermarking 

scheme based on singular value decomposition. 

In International Conference on Intelligent 

Information Hiding and Multimedia, pages 457– 

460, Pasadena, California, 2006. 

[25] Liu, R. and Tan, T. An svd-based watermarking 

scheme for protecting rightful ownership. IEEE 

Trans. on Multimedia, 4(1):121–128, 2002. 

[26] Majumder, S., Mishra, M., and Singh, A. D. A 

hybrid svd and wavelet based watermarking. In 

National conference on Mathematical techniques: 

Emerging Paradigms for Electronics and IT Industries, 

Delhi , India, 2008. 

[27] Meerwald, P. Digital image watermarking in the 

wavelet transform domain. Master’s thesis, University 

of Salzburg, 2001. 

[28] Miller, M., Cox, I., Linnartz, J., and Kalker, T. A 

review of watermarking principles and practices. 

In IEEE International Conference on Image Processing, 

1997. 

[29] Podilchuk, C. I. and Delp, E. J. Digital watermarking: 

Algorithms and applications. IEEE Signal 

Processing Magazine, 18(4):33–46, 2001. 

[30] Qiao, L. and Nahrstedt, K. Watermarking schemes 

and protocols for protecting rightful ownership 

and customer’s rights. Visual Communication and 

Image Representation, 9(3):194–210, 1998. 

[31] Rao, K. and Yip, P. Discrete Cosine Transform: 

Algorithms, Advantages and applications. Academic 

Press, Inc., San, Diego, 2002. 

[32] Raval, M. and Rege, P. Discrete wavelet transform 

based multiple watermarking scheme. In IEEE 

Conference on Convergent Technologies, Bangalore, 

India, 2003. 

[33] Rezazadeh, S. and Yazdi, M. A non-oblivious image 

watermarking system based on singular value 

decomposition and texture segmentation. Applied 

Science Engineering and Technology, 1(3):1079– 

1107, 2006. 



[34] Sverdlov, A., Dexter, S., and Eskicioglu, A. M. 

Robust dct-svd domain image watermarking for 

copyright protection: Embedding data in all frequencies. 

In European Signal Processing Conference, 

pages 4–8, Antalya, Turkey, 2005. 

[35] Thapa, M., Sood, S., and Sharma, A. Digital image 

watermarking technique based on different attacks. 

Advanced Computer Science and Applications, 

2(4):14–19, 2011. 

[36] Vetterli, M. and Kovacevic, J. Wavelets and Subband 

Coding. Prentice Hall, USA, 1995. 

[37] Wolfgang, R. and Delp, E. J. A watermark technique 

for digital imagery: further studies. In Proceedings 

of International Conference on Imaging 

science, Systems and technology, Las Vegas, 

Nevada, 1997. 


Translation Rules and ANN based model for English to Urdu 

Machine Translation 

SHAHNAWAZ 1 

R. B. MISHRA 2 

IT-BHU, Institute of Technology, Banaras Hindu University 

Department of Computer Engineering 

Varanasi, U.P., India-221005 

1 shahnawaz.rs.cse@itbhu.ac.in 

2 ravibm@bhu.ac.in 

Abstract. In this paper we discuss the working of our English to Urdu Machine Translation (MT) system. We use a feed-forward back-propagation artificial neural network for the selection of Urdu words/tokens (such as verbs, nouns/pronouns etc.) equivalent to English words/tokens, and translation rules for the selection of the Urdu grammar structure equivalent to the English grammar structure. As English is an SVO-class language while Urdu is an SOV-class language, grammar structure transfer is the main task in the English-Urdu machine translation problem. Our system is able to translate sentences having gerunds, infinitives (maximum two), prepositions and prepositional objects (maximum three), direct objects, indirect objects etc. The neural network works as the knowledge base for the linguistic rules and the bilingual dictionary. The bilingual dictionary stores not only the meaning of an English word in Urdu but also the linguistic features attached to the word. The output of our system is presented in Romanized Urdu. The n-gram BLEU score achieved by the system is 0.6954, the METEOR score is 0.8583 and the F-score is 0.8650.

Keywords: Neural network, back-propagation, rule based translation, English, Urdu, machine translation 

system, Artificial Intelligence 

(Received June 4th, 2011 / Accepted September 15th, 2011) 


1 Introduction

Machine translation, also referred to as MT, is the process of translating text in one natural language (such as English) into text in another natural language (such as Urdu) by means of a computing machine. Machine translation is an automated process in which the translation job is done by computer software. It is an application of computational linguistics, an interdisciplinary field of computer science that requires both language and computer experts. Translation, as the art of rendering a work of one language into another, is as old as written literature [1]. As the need for multilingual information is increasing in business, industry and economics, machine translation cannot be ignored. If MT researchers are able to develop a perfect multilingual machine translation system, people speaking different languages will be able to share ideas and information worldwide on every topic, such as research, politics, business, economics and socio-cultural matters. The purpose of a translation process, whether machine translation or human translation, is that the meaning of the text being translated should not change. Many different machine translation systems are available, online as well as on the desktop. G.R. Tahir, S. Asghar and N. Masood [21] analyzed the results from the most popular MT systems, such as Babylon 8, World-lingo, PakTranslations, ApniUrdu and the MT system by FAST-NU, and found that the translation results are ambiguous and convey the wrong sense of meaning. Many languages spoken in developing countries have been ignored by researchers although they are spoken by a large population [7]; the same holds for Urdu.

Urdu is one of the Indo-Aryan languages, a family of some 210 languages and dialects [17]. Urdu is closely related to Hindi, with a similar grammatical structure but differences in script and vocabulary [2]. Hindi and Urdu are sister languages having many common linguistic features [6]. They are structurally very close to each other and use similar postpositions and verb morphology as well as complex predicate verb structures. There are between 60 and 70 million self-identified native speakers of Urdu on different continents. English has its own importance as an international language and is a knowledge-containing language [3]. A good translator will remove this gap, and people will be able to communicate without any language barrier.

1.1 Literature survey 

The machine translation work in English to Urdu machine 

translation is substantially lagging in spite of 

large population of Urdu speakers. The English to 

Urdu machine translation system developed by Tafseer 

Ahmed, Sadaf Alvi in [19] uses transfer based approach 

and bottom up chart parsing for English-Urdu translation 

task. An expert system based English-Urdu machine 

translation work in [11] relies on QTAG for part 

of speech tagging and uses knowledge base for grammatical 

patterns and gender aware dictionary. Maryam 

Zafar et al in [25] developed an interactive machine 

translation system using Example based approach. This 

system uses Levenshtein algorithm and semantic distance 

algorithm for searching bilingual corpus. System 

uses n-ary product for listing all the possible translations 

for the input sentence. System has rules for ordering 

the translated text and also supports homograph, 

idioms and some other linguistic features. The work 

discussed in [10] is a bidirectional English-Urdu machine 

translation system with natural language processing. 

This system uses rule based methodology with 

bottom up parsing and dynamic dictionary for translation. 

AGHAZ [13] is an Expert System based automatic 

translator for English to Urdu machine translation. 

It has an expert system, patterns or rules and a 

rich knowledge base which stores English words with 

their Urdu meaning, part of speech, gender, number 

and multi-word information. For English to
Urdu translation, R. M. K. Sinha [18] uses his English-Hindi
MT system as a base. This English-Hindi MT system is built using a
rule based approach and a pseudo-interlingua. A Hindi-Urdu
mapping table, which stores the Urdu meaning of Hindi words
together with information that affects the composition of Urdu text,
is used to generate Urdu text from the Hindi output of this system.
Sampark [22] is a machine translation system for automated
translation among Indian languages, including Urdu, developed
by a consortium of institutions that includes IIIT
Hyderabad, University of Hyderabad, KBC, Chennai,
IIT Kharagpur, CDAC (Noida, Pune), Anna University,
IIT Kanpur, IISc Bangalore, IIIT Allahabad, Jadavpur
University and Tamil University. This system uses

hybrid methodology which consists of rule based approach 

and dictionaries and statistical machine learning 

techniques. The knowledge based machine
translation system proposed in [21] is an enhancement of the Sampark
model which considers most types of ambiguity
and uses text mining and data mining techniques
for machine translation. Table 1 gives a brief overview of the
English-Urdu machine translation works discussed here.

Our English to Urdu machine translation system
works at paragraph level as well as on a single sentence.
When source language text is entered as input, the system
splits the text into sentences. Each sentence is then
translated and rearranged to generate the Urdu translation.
First, the contraction removal module removes contractions
from all the sentences. Each sentence is then parsed and tagged.
The outputs of the parser and tagger are processed to extract
all the information related to each word in the sentence.
Each word is transformed into an object which contains
information about the word (such as part of speech,
dependency and position in the sentence), so that the
sentence becomes a group of knowledgeable objects.
These objects are given to the grammar analysis and sentence
structure recognition module, which processes the information,
recognizes the grammar tokens of the sentence (such as subject,
object, verb, infinitive and gerund) and generates
the grammatical structure using the rule base. The Artificial
Neural Network (ANN) and Rule based sentence structure
mapping module maps this grammatical structure
to the corresponding Urdu grammar structure, and the ANN
based Urdu word mapping module maps each word
of each sentence part to its Urdu word. Each
part is then arranged according to the Urdu grammatical
structure obtained from the ANN and Rule based sentence
structure mapping module. The syntax addition module
adds verb markers and case markers based on the information
attached to the Urdu words and held in the knowledgeable
objects. The translation of each sentence is generated
and presented in Romanized Urdu.



Table 1: Overview of English-Urdu Machine Translation Works / Systems

S. No. | System/Work (Year) | Research Team | Work/System Methodology
1. | English to Urdu Translation System (2002) | Tafseer Ahmed, Sadaf Alvi | Transfer based approach, bottom up chart parsing
2. | Expert system driven approach to generate natural language (2003) | Expert system driven approach to generate natural language (2003) | Expert system based approach, QTAG, knowledge base for grammatical patterns and gender aware dictionary
3. | Urdu Translation Engine (2004) | Mohammad Kashif Shaikh, Hussain Hyder Ali Khowaja, Muzammil Ahmed Khan | Natural language processing, rule based, bottom up parsing and dynamic dictionary; bidirectional English-Urdu machine translation
4. | AGHAZ (2005) | Uzair Muhammad, Kashif Bilal, Atif Khan and M. Nasir Khan | Expert system based approach, knowledge base for grammatical patterns and gender aware dictionary; handles multiple words and proper nouns
5. | Interactive English-Urdu machine translation (2009) | Maryam Zafar et al. | Example based approach, Levenshtein and semantic distance algorithms, n-ary product, ordering rules; supports homographs, idioms and some other features
6. | English-Urdu Machine Translation via Hindi (2009) | R. M. K. Sinha | Mapping of Hindi output from the English-Hindi MT system, which is based upon PLIL (pseudo Lingua for Indian Languages) and a rule based approach
7. | Sampark (2009) | Sampark machine translation team, consortium of institutions | Hybrid methodology: rule based approach, dictionaries and statistical machine learning techniques
8. | Knowledge based Machine Translation System (2010) | Ghulam Rasool Tahir, Sohail Asghar, Nayyer Masood | Text mining and data mining techniques; focuses on adding semantics

The rest of this paper is organized as follows. The second
section gives a brief overview of the linguistic characteristics
of the Urdu language. The third section discusses
our proposed work, the system architecture
and its description, as well as the encoding-decoding
and neural network modules. The software implementation
and working of our system are
explained in the fourth section. We then discuss the results
and evaluation of our system. The last section presents the conclusion
and future work.

2 Linguistic characteristics of Urdu 

2.1 Origin and Vocabulary 

Among all the languages in the world, Urdu is most
similar to Hindi. Both Hindi and
Urdu originated from the dialect of the Delhi region and,
apart from minute details, the two languages share
their morphology. Hindi has adopted many
words from Sanskrit, while Urdu has borrowed
a large part of its vocabulary from Persian and
Arabic. Urdu has also borrowed words from Turkish,
Portuguese and English [5]. Many of the words
that Urdu has taken from Persian
have acquired differently nuanced connotations
and usages [14].

2.2 Grammar Structure 

One of the most significant aspects of Urdu language 

grammar structure is its word order which is SOV 

(subject, object, and verb). This order does exhibit 

some flexibility as the subject pronouns are frequently 

dropped. 

2.3 Nouns 

Nouns in Urdu have two genders
(masculine/feminine), two numbers
(singular/plural) and three cases (vocative, direct and
oblique). All nouns in Urdu, when used within a sentence,
are inflected for number and case. Suffixes
specify gender on verbs and adjectives [5].
Common suffixes can be used to derive
nouns from other words, for example
pagal → pagalpan (madness) and ghabrana → ghabrahat
(anxiety). These derived forms are masculine
and feminine nouns. One notable point is that borrowed
Arabic and Persian plural forms of nouns are
never inflected in Urdu [14].

2.4 Verbs 

Verbs in Urdu have two nonfinite forms, the root
and the infinitive. The infinitive comprises a verbal stem
and a suffix; the stem may itself comprise a verbal root
and a suffix. For example, in ana (to come) and jana (to go),
a- and ja- are the roots and -na is the suffix. The
infinitive forms of all verbs are grammatically masculine.
They occur neither in the
plural nor in the vocative. Many verb forms are built
by adding different endings to the stem.
In Urdu, the subjunctive
is the finite verbal form which conveys weak
conjectures on the part of the speaker [14]. Two other
verbal forms, which may be finite or nonfinite, are the
perfective and imperfective participles. The imperfective
participle ends in -ta, -te, -ti or -tin. The perfective
participle ends in -a, -e, -i or -in. However,
when the verbal
stem ends in a vowel, a/y has to be added before the
masculine singular ending [8, 4]. The future verb forms
in Urdu are derived not from the verb stems but
from the subjunctive forms. The endings of the
future forms are -ga, -ge and -gi [14]. The common use
of semi-auxiliary elements also gives various semantic
connotations [5].

2.5 Post-positions 

In Urdu, postpositions follow the head of the noun phrase,
in contrast to English,
where a variety of elements can occur between a preposition
and the noun it governs. This has led to
the conclusion that Urdu has a wide variety of
cases [5], e.g. genitive, accusative etc.

3 Our proposed work

In our work on English to Urdu machine translation,
we used a neural network and rule based approach.
The rule based approach is the classical approach to
machine translation. In rule based machine translation,
the system is fed with linguistic rules and
bilingual dictionaries. The system parses and analyses the
grammatical structure of the source language text; this
structure is then transformed to the target language
structure with the help of the linguistic rules. Once
the structure is transformed, the target language text is
generated using the bilingual dictionaries and
linguistic rules. Many systems have been developed
using rule based machine translation, the main ones
being Systran, Eurotra and the Japanese MT
System. Neural networks are a possible solution to the
machine translation problem. Neural networks have
the ability to learn from examples and
have proven very useful in various natural language
processing tasks [8]. PARSEC [9], JANUS [24] and the
English-Sanskrit MT system [12] use a neural network
approach for natural language processing tasks and
automated machine translation. Our English to Urdu
machine translation system uses a Feed-Forward
Back-Propagation Neural Network with the rule based machine
translation approach. Neural networks are very efficient
in pattern matching, and machine translation using a rule
based approach is consistent and of predictable quality.

Our system uses the neural network as a knowledge base for
storing linguistic rules and also as a knowledgeable bilingual
dictionary. The neural network maps English words/tokens
(such as verb, noun/pronoun etc.) and grammar
structure rules to the equivalent Urdu words/tokens and
grammar structure rules. These words/tokens are then
processed, the rules are interpreted and all the parts of the
sentence are arranged according to the interpreted rules.

3.1 System Architecture and Description

The block diagram of our English to Urdu Machine
Translation System is shown in Figure 1.
There are eight main modules in our system: Contractions
Removal, Parser and Tagger, Knowledge Extraction,
Grammar Analysis and Sentence Structure Recognition,
ANN (Artificial Neural Network) and Rule Based
Sentence Structure Mapping, ANN Based Urdu Word
Mapping, Rule Based Syntax Addition and Urdu Sentence
Generation. The working of each module
is explained below.

Figure 1: System Architecture 

Sentence Separator and Contractions Removal: The
translation process starts with English text input to this
module. This module first separates the paragraph into
sentences. Each sentence is then processed: if any contraction
is present in the sentence it is removed, and the sentence
is passed to the parser and tagger module. Contractions
are common in spoken English and are now also appearing
in informal written English. In this step, we
replace contractions with their respective full forms. For
example, I’m, you’ve, she’ll and they’d will be replaced
by I am, you have, she will and they
had/would respectively. Similarly, the negative contractions aren’t,
needn’t and won’t will be replaced by are
not, need not and will not respectively.
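
The sentence separation and contraction replacement described above can be sketched as follows. This is a minimal illustration rather than the system's actual Java module: the names `CONTRACTIONS`, `split_sentences` and `expand_contractions` are hypothetical, the lookup table lists only the unambiguous contractions named in the text, and an ambiguous form such as they’d (had/would) is deliberately left unchanged because resolving it needs context.

```python
import re

# Hypothetical lookup table; only contractions named in the text are listed.
CONTRACTIONS = {
    "i'm": "I am",
    "you've": "you have",
    "she'll": "she will",
    "aren't": "are not",
    "needn't": "need not",
    "won't": "will not",
}


def split_sentences(paragraph):
    """Naive sentence separator: split after '.', '?' or '!' followed by whitespace."""
    parts = re.split(r"(?<=[.?!])\s+", paragraph.strip())
    return [p for p in parts if p]


def expand_contractions(sentence):
    """Replace every contraction found in the table with its full form."""
    # Normalise typographic apostrophes so they match the table keys.
    sentence = sentence.replace("\u2019", "'").replace("\u2018", "'")

    def replace(match):
        word = match.group(0)
        full = CONTRACTIONS.get(word.lower())
        if full is None:
            return word  # unknown or ambiguous contraction: leave as is
        if word[0].isupper():
            # Keep the capital when the contraction starts a sentence.
            return full[0].upper() + full[1:]
        return full

    return re.sub(r"[A-Za-z]+'[A-Za-z]+", replace, sentence)
```

For instance, `expand_contractions("I'm sure they aren't late.")` yields `"I am sure they are not late."`, which would then be handed to the parser and tagger.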

Parser and Tagger: Processed text from Contraction 

Removal module is given as input to the Parser and 

Tagger module. Stanford typed dependency parser is 

used for parsing the English Text. Stanford parser is 



the implementation of probabilistic natural language 

parsers, both highly optimized PCFG and lexicalized 

dependency parsers, and a lexicalized PCFG parser. 

Probabilistic parsers use knowledge of language gained 

from hand-parsed sentences to try to produce the most 

likely analysis of new sentences. These statistical 

parsers still make some mistakes, but commonly work 

rather well [16]. The parser provides Stanford typed 

dependencies as output. The output of parser for an 

English sentence is shown below: 

Students must tell the result to their parents. 

nsubj(tell-3, Students-1), aux(tell-3, must-2),
det(result-5, the-4), dobj(tell-3, result-5),
poss(parents-8, their-7), prep_to(tell-3, parents-8).

We use the Stanford POS tagger for tagging the
English text. A Part-Of-Speech tagger (POS tagger)
assigns a part of speech, such as noun, verb or adjective,
to each word (and other token) in the text. The Stanford POS
tagger uses the Penn Treebank tag set and is implemented
using a maximum entropy tagging algorithm
[20]. The output of the
tagger for an English sentence is shown below:

Students must tell the result to their parents. 

Students/NNS must/MD tell/VB the/DT result/NN 

to/TO their/PRP$ parents/NNS. 

Knowledge Extraction: The function of this module 

is to process the typed dependency obtained from parser 

and to process tagged text from tagger. This module extracts 

information from parser and tagger for each part 

of the sentence. Each part of the sentence is converted 

to knowledgeable object by adding all the information 

associated with it and sentence is represented as a collection 

of knowledgeable objects. 
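
To make the idea of a knowledgeable object concrete, the sketch below merges the parser and tagger outputs shown earlier into one record per word. It is an assumption-laden illustration, not the authors' Java code: the class `WordObject`, the function `extract_knowledge` and the field names are hypothetical, and the text formats are taken to be exactly those of the two example outputs above.

```python
import re
from dataclasses import dataclass, field

# Example outputs copied from the parser and tagger listings above.
EXAMPLE_DEPS = ("nsubj(tell-3, Students-1), aux(tell-3, must-2), "
                "det(result-5, the-4), dobj(tell-3, result-5), "
                "poss(parents-8, their-7), prep_to(tell-3, parents-8)")
EXAMPLE_TAGS = ("Students/NNS must/MD tell/VB the/DT result/NN "
                "to/TO their/PRP$ parents/NNS.")


@dataclass
class WordObject:
    """Hypothetical 'knowledgeable object': one word plus what is known about it."""
    position: int
    word: str
    pos_tag: str = ""
    relations: list = field(default_factory=list)  # (relation, head position)


# Matches one typed dependency such as nsubj(tell-3, Students-1).
DEP_PATTERN = re.compile(r"(\w+)\((\S+)-(\d+),\s*(\S+)-(\d+)\)")


def extract_knowledge(dependencies, tagged):
    """Combine tagger and parser output into a list of WordObject records."""
    objects = {}
    # Tagger output: word/TAG tokens, numbered from 1 like the parser output.
    for position, token in enumerate(tagged.rstrip(".").split(), start=1):
        word, _, tag = token.rpartition("/")
        objects[position] = WordObject(position, word, tag)
    # Parser output: attach each relation to the dependent word.
    for rel, _head, head_pos, _dep, dep_pos in DEP_PATTERN.findall(dependencies):
        objects[int(dep_pos)].relations.append((rel, int(head_pos)))
    return [objects[p] for p in sorted(objects)]
```

Calling `extract_knowledge(EXAMPLE_DEPS, EXAMPLE_TAGS)` gives eight objects; the first one holds the word Students, the tag NNS and the relation nsubj pointing at position 3 (tell), which is the kind of information the grammar analysis step relies on.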

Grammar Analysis and Sentence Structure Recognition:
This module processes the collection of knowledgeable
objects and recognizes the parts of the sentence,
e.g. subject, main verb, auxiliary verb, object, indirect
object etc. The tense of the sentence is recognized with the
help of the main verb and auxiliary verb. The voice of the
sentence, passive or active, is also recognized in this
phase. The sentence type is detected from the knowledge
present in the collection of knowledgeable objects. On
the basis of the knowledge obtained, the sentence parts and attributes
(tense, voice, type etc.) are analyzed and the grammatical
structure of the sentence is generated with the help
of the rule base.

ANN and Rule based sentence structure mapping:
The generated grammatical structure and attributes are
passed to the ANN and Rule based sentence structure mapping
module. This module gathers the information and
makes a query to obtain the corresponding grammar
structure for the target language, i.e. Urdu. The query
is coded into numeric form (decimal numbers). The Artificial
Neural Network (ANN) is trained on a data set of the
decimal encoded rule base for English grammar and the
corresponding Urdu grammar, in which the various parts
of the structure are separated by spaces. On a query, the ANN
model returns the Urdu grammar structure corresponding to
the attributes and English grammatical structure
encoded in the query. The returned structure is
also in numeric form and is decoded to textual
form for further processing.

ANN Based Urdu word mapping: Sentence parts
(words or tokens) are arranged according to the
Urdu grammar structure obtained from the previous module.
Each sentence part now has to be translated. The Urdu word
mapping module encodes each word into numeric form,
looks it up in the bilingual ANN model
trained for word mapping, and gets the corresponding
Urdu word and associated information in
numeric form. This result is decoded to textual form,
which contains the Urdu meaning of the word and the coupled
information. For a noun/pronoun, the coupled
information is the number, person and gender;
for a verb, it is
its weak verb. Verbs are trained in the neural network model
with their base form meaning and weak verb, if
there is any.

3.2 Translation Rules 

Translation rules have been created for various classes
of sentences. Our system is able to handle all
forms (affirmative, negative and interrogative) of
English simple sentences. Syntax is added to the verb
and case markers are added to the subject and object
on the basis of the tense and of the gender, number
and person of the subject and object. For example,
for the following sentence:
English Sentence: I lent my pen to a friend.
the following translation rule is used:
IF (sentence structure is SVOpPO and tense
is Past-Indefinite and sentence is affirmative in active
voice) THEN (Urdu grammar = subject (S) + object
(O) + prepositional object (PO) + preposition (p) +
verb (V)).
Syntax addition: As a direct object is present
in the sentence, the case marker ’ne’ has to be added and
the marker ’a’ is added to the verb. This is decided
on the basis of the tense, the sentence structure and the coupled
information (number, person, gender) stored with the Urdu
meaning of the word. Syntax addition rules have been
written for each tense, considering all cases of number,
gender, person and sentence structure. The general
structure of the grammar rules used for training the neural
network is as follows:
I/p → gclass + tense + type + category + voice
O/p → Urdu grammar
E.g. I/p → svo + pastInd + s + aff + act, O/p → sov
where gclass is the grammar class of the sentence (e.g.
SVO), tense is the tense (e.g. Past Indefinite), type is the
sentence type (simple, complex, imperative etc.), category is
affirmative, interrogative etc. and voice is active or passive.
Translation rules have been written for the following
sentence structures: SVSc, SV, SVO, SVIoO,
SVIn, SVInIn, SVInO, SVG, SVGO, SVpPO, SVpPOpPO,
SVpPOpPOpPO, SVOpPO, SVOpPOpPO,
SVOpPOpPOpPO; where S = Subject, V = Verb, O = Object,
Sc = Subject Complement, Io = Indirect Object, In
= Infinitive, G = Gerund, p = preposition and PO =
Prepositional Object. Some examples of translation
rules are as follows:

English Sentence (E.S.): Mr S Khan is a research 

scholar 

IF (sentence structure is SVSc and tense is present and 

affirmative sentence in active voice) 

THEN ( Urdu grammar = S + Sc + V) 

E.S.: Has the bell rung? 

IF (sentence structure is SV and tense is present perfect 

and verb interrogative sentence in active voice) 

THEN ( Urdu grammar = kya + S + V) 

E.S.: The boy hadn’t lost his pen. 

IF (sentence structure is SVO and tense is past perfect 

and negative sentence in active voice) 

THEN ( Urdu grammar = S + O + negative word + V ) 

E.S.: Why does he not want to go to watch the movie? 

IF (sentence structure is SVInInO and tense is present 

Indefinite and interrogative-negative sentence in active 

voice) 

THEN (Urdu grammar = S + O + In2 + question word 

+ negation word + In1 + V). 

E.S.: I lent my pen to my friend.

IF (sentence structure is SVOpPO and tense is past
Indefinite and affirmative sentence in active
voice)

THEN (Urdu grammar = S + O + PO + p + V).
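
The IF/THEN rules above amount to a lookup from sentence attributes to an ordering of sentence parts. The sketch below shows that idea with a plain dictionary; in the system itself this mapping is learned by the neural network, so the table `RULES`, the function `reorder` and the slot name `NEG` (standing for the negative word) are hypothetical stand-ins used only for illustration.

```python
# Rule base keyed by (structure, tense, category, voice).
# Each value lists the sentence parts in Urdu order, as in the rules above.
RULES = {
    ("SVSc", "present", "affirmative", "active"): ["S", "Sc", "V"],
    ("SVO", "past perfect", "negative", "active"): ["S", "O", "NEG", "V"],
    ("SVOpPO", "past indefinite", "affirmative", "active"):
        ["S", "O", "PO", "p", "V"],
}


def reorder(parts, structure, tense, category, voice):
    """Arrange the sentence parts in the order given by the matching rule.

    `parts` maps slot names (S, V, O, ...) to the words filling them.
    A KeyError is raised when no rule matches or a slot is missing.
    """
    order = RULES[(structure, tense, category, voice)]
    return [parts[slot] for slot in order]


# English parts of "I lent my pen to a friend." before word translation.
example_parts = {"S": "I", "V": "lent", "O": "my pen", "p": "to", "PO": "a friend"}
example_order = reorder(example_parts, "SVOpPO", "past indefinite",
                        "affirmative", "active")
```

Here `example_order` puts the verb last and the preposition after its object, mirroring the subject + object + prepositional object + preposition + verb pattern; word translation and syntax addition would follow as separate steps.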

3.3 Encoder-Decoder 

We created a data set of input-output pairs of English-Urdu
words with associated knowledge and another
data set of input-output pairs of grammar rules.
The Encoder-Decoder converts this training data into numeric
coded form which is suitable to be used as input
for the ANN trainer. Each letter of the English alphabet is
represented as a five bit binary number (a = 00001, b =
00010 and so on; see Table 2). The value of each letter
is converted to a decimal by dividing it by 26 (a = 1/26, b =
2/26 and so on) to train the neural network. Some special
characters are also used for the correct representation
of a word in Roman Urdu. All the special characters
are assigned values higher than one. For training the neural
network, we encode each character of the words/tokens
and grammar structure to numeric form as explained
above.

Table 2: English Alphabet Encoding

S.No. Alphabet 5-bit binary Decimal code for the alphabet (binary/26)

1. a 00001 0.038462 

2. b 00010 0.076923 

3. c 00011 0.115385 

4. d 00100 0.153846 

5. e 00101 0.192308 

6. f 00110 0.230769 

7. g 00111 0.269231 

8. h 01000 0.307692 

9. i 01001 0.346154 

10. j 01010 0.384615 

11. k 01011 0.423077 

12. l 01100 0.461538 

13. m 01101 0.500000 

14. n 01110 0.538462 

15. o 01111 0.576923 

16. p 10000 0.615385 

17. q 10001 0.653846 

18. r 10010 0.692308 

19. s 10011 0.730769 

20. t 10100 0.769231 

21. u 10101 0.807692 

22. v 10110 0.846154 

23. w 10111 0.884615 

24. x 11000 0.923077 

25. y 11001 0.961538 

26. z 11010 1.000000 

27. a (bar) 11011 1.038461 

28. e (bar) 11100 1.076923 

29. i (bar) 11101 1.115384 

30. n (acute) 11110 1.153846 

31. u (bar) 11111 1.192307 

32. Space 00000 0 
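
The encoding in Table 2 can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions, not the authors' Java encoder-decoder: the function names are hypothetical, the Unicode glyphs chosen for the five special characters (ā, ē, ī, ń, ū) are a guess at what "a (bar)" and "n (acute)" denote, and the optional padding with the space code up to a fixed width is an assumption about how words are fitted to a fixed number of input nodes.

```python
# Codes 27-31 from Table 2; the exact glyphs are an assumption.
SPECIALS = {"ā": 27, "ē": 28, "ī": 29, "ń": 30, "ū": 31}
CODE_TO_CHAR = {0: " "}
CODE_TO_CHAR.update({i: chr(ord("a") + i - 1) for i in range(1, 27)})
CODE_TO_CHAR.update({code: ch for ch, code in SPECIALS.items()})


def char_code(ch):
    """Integer code of one character: space = 0, a..z = 1..26, specials = 27..31."""
    if ch == " ":
        return 0
    if ch in SPECIALS:
        return SPECIALS[ch]
    if "a" <= ch <= "z":
        return ord(ch) - ord("a") + 1
    raise ValueError(f"character {ch!r} is not in the encoding table")


def encode(text, width=None):
    """Encode text as a list of decimals (code / 26), optionally padded with 0.0."""
    values = [char_code(ch) / 26 for ch in text.lower()]
    if width is not None:
        if len(values) > width:
            raise ValueError("text is longer than the requested width")
        values += [0.0] * (width - len(values))
    return values


def decode(values):
    """Map each decimal back to its character; rounding absorbs small errors."""
    return "".join(CODE_TO_CHAR[round(v * 26)] for v in values).rstrip(" ")
```

For example, `encode("ab")` gives `[1/26, 2/26]`, matching the first two rows of Table 2, and `decode(encode("kitaab", width=10))` returns `"kitaab"`. Rounding in `decode` matters because a trained network returns values that are only approximately equal to the table entries.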

3.4 Neural Network and Training 

We have created a two-layer feed-forward neural
network. The first layer in this network is sigmoid and the
second layer is linear. We trained this network with the
Levenberg-Marquardt algorithm. There are many
numerical optimization techniques for speeding up the
convergence of the back-propagation algorithm, and different
algorithms perform differently on a given problem.
The results presented in [4] show that the
Levenberg-Marquardt algorithm is very efficient for training
networks having up to a few hundred weights. We have
trained the neural network for grammar structure rules with
a data set of around 465 input-output pairs of grammar
rules. The input layer of the grammatical structure network
contains 42 nodes, the hidden layer contains 100 nodes
and the output layer contains 30 nodes. The mean squared
error goal was set to a training error of 10^-8, which
was achieved after 29 epochs. The neural network for the
knowledgeable bilingual dictionary has been trained
with a data set of around 9000 input-output pairs of
English-Urdu words with associated knowledge. The
input layer of the bilingual dictionary network contains
10 nodes, the hidden layer contains 100 nodes and the output
layer contains 32 nodes (for the meaning and other information).
The mean squared error goal was set to a training
error of 10^-8, which was achieved after 333 epochs.
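
The shape of the grammatical structure network (42 inputs, 100 sigmoid hidden nodes, 30 linear outputs) can be illustrated with a forward pass in plain Python. This is only a sketch: the weights are random placeholders, the logistic function is assumed for the sigmoid layer, and Levenberg-Marquardt training, which the paper performs in Matlab, is not reproduced here. All function names are hypothetical.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and biases for a fully connected layer (placeholders)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def sigmoid(x):
    """Logistic sigmoid, assumed here for the first layer."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, hidden, output):
    """Two-layer pass: sigmoid hidden layer followed by a linear output layer."""
    w1, b1 = hidden
    w2, b2 = output
    h = [sigmoid(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(w2, b2)]


def mean_squared_error(predicted, target):
    """Quantity that training drives toward the error goal (10^-8 in the paper)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


rng = random.Random(0)
hidden_layer = make_layer(42, 100, rng)   # 42 inputs -> 100 hidden nodes
output_layer = make_layer(100, 30, rng)   # 100 hidden nodes -> 30 outputs
sample_output = forward([0.0] * 42, hidden_layer, output_layer)
```

The 30 output values play the role of the numerically encoded Urdu grammar structure; the bilingual dictionary network would use the same structure with 10 inputs and 32 outputs.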

3.4.1 ANN Based Mapping Process 

We used feed-forward back-propagation artificial 

neural network for the selection of Urdu words/tokens 

(such as verb, noun/pronoun etc) and grammar structure 

rules equivalent to English words/tokens and 

grammar structure rules. There are three main steps in 

mapping process as follows: 

1) Encoding of English words/tokens or grammar 

structure to numeric code. 

2) Mapping of English numeric code: Data sets are fed 

to Neural Network from which ANN selects the Urdu 

equivalent of the English words/tokens or grammar 

structure provided for Translation. 

3) Decoding the code of the obtained Urdu 

words/tokens or grammar structure. 

Once we get the equivalent words/tokens or grammar 

structure, Urdu meaning and information is extracted 

and processed. 

4 Implementation

We have implemented our English-Urdu machine translation
system on the Java platform. We used Java JDK 1.5
for its compatibility with Matlab 7.1. The system
is implemented in Java except for the neural network
module. The neural network model is trained, tested and
successfully implemented using the Matlab 7.1 neural network
library. The neural network works as the knowledge
base for linguistic rules and the bilingual dictionary. The bilingual
dictionary stores not only the Urdu meaning of an English
word but also the linguistic knowledge
(e.g. verb, noun, pronoun, number, person and gender)
attached to the Urdu word. We trained the
two-layer feed-forward neural network with the
Levenberg-Marquardt back-propagation algorithm. We created
two separate neural networks, one for grammar structure
rules and one for the English-Urdu bilingual knowledgeable
dictionary, as the Urdu equivalents of English words also
have associated knowledge about the word, like verb,
noun, pronoun, number, person and gender. The neural
network model for grammar structure rules gives the
Urdu grammatical structure equivalent to the English sentence
being translated, and the neural model for the bilingual
knowledgeable dictionary gives the equivalent Urdu
word and associated knowledge about the word. A
Java class encodes and decodes the tokens and
linguistic rules and passes them to the neural networks as input
for mapping to their equivalent Urdu tokens
and linguistic rules. To automate the process, we created
a Java class that generates the training data in numeric form,
with the help of the encoding and decoding class, from a text
file in which the data is present in human readable form, since a
word in numeric form is difficult for a human to read but
easy for a program. The neural network then maps these numeric
values and produces the equivalent result in numeric
form, which is passed back to the Java class that
decodes the numeric data and presents it in string form.
This knowledge is further processed and the Urdu meaning
and attached information are extracted. The suffix of the
verb and the marker of the subject are attached on the
basis of the knowledge obtained from the neural network
and the information obtained in the Grammar Analysis and
Sentence Structure Recognition module. These parts
are then arranged according to the grammar structure
obtained from the grammatical structure network, and the
output is presented in Romanized Urdu form.

5 Results and Evaluation

Our system is also able to handle contractions if present
in the English input text, as shown in Figure 2.

Figure 2: Contraction Removal 

The output of the Contraction Removal module is
passed to the Parser and Tagger module, which uses the
Stanford parser and tagger for parsing and tagging the
input English sentence. The output of the Parser and
Tagger module is shown in Figure 3.

Figure 3: Parser and Tagger Output 



The Knowledge Extraction module processes the result
obtained from the Parser and Tagger module and converts
each part of the sentence to a knowledgeable object by
adding all the information associated with it. The sentence
is now represented as a collection of knowledgeable
objects, which are given as input to the Grammar
Analysis and Sentence Structure Recognition module.
This module analyzes these objects and identifies the
grammatical attributes of the English sentence, such as
tense, voice, sentence type, subject, main verb, auxiliary
verb, object, indirect object etc. Each token
and the grammatical structure are then mapped by the neural
network as explained earlier in the implementation section.
Verb meanings are stored in the base form, so a suffix
has to be added according to the tense and the gender and
number of the subject (or sometimes the object); it is
appended on the basis of the knowledge obtained from the
neural network and the tense of the sentence. Case marker
words like ka, ke, ko, ne, ki etc. are also attached to the
subject on the basis of the knowledge obtained from the
neural network and the information obtained in the Grammar
Analysis and Sentence Structure Recognition module.
All the parts of the sentence are then arranged according
to the grammar structure rule obtained from the
grammatical structure network, and the output is presented
in Romanized Urdu form as below:

Input English Text: Ram Kumar Singh is a student. 

He lives in Shimla. Shimla offers you refreshing environment. 

He enjoys playing cricket. He likes singing. 

He went to the market with his father. He saw an old 

man in the market. The old man was buying a book for 

his wife from the market. He bought a pen for his sister. 

He met his friends. They wanted to go to watch the 

movie. He decided to watch the movie. 

Output Urdu Translation: RAM KUMAR SINGH ek 

talib-e-ilm hai | wah SHIMLA me rahta hai | SHIMLA 

tumko tazagi bhara mahaul deta hai | wah cricket 

khelna lutf uthata hai | wah gana pasand karta hai | 

wah apne walid ke sath bazar ko gaya tha | wah bazar 

me ek boodha adami dekha tha | boodha adami bazar 

se apni biwi ke liye ek kitaab kharid raha tha | wah apni 

bahan ke liye ek kalam kharida tha | wah apne doston 

mila tha | ve film dekhna jana chahate the | wah film 

dekhna faisla kiya tha | 

Words that are not present in the bilingual
dictionary are printed unchanged, in capitals, in the
translation.

5.1 Evaluation 

The problem of evaluation is the same as the problem of
translation. Various methods have been employed for
evaluating the quality of machine translation output.
Some features can be evaluated automatically, for example
fluency can be checked by n-gram analysis if
reference translations are available, and some cannot, such as
the meaning of the translation. It is hard to compare
two different machine translation algorithms objectively.
The evaluation scores for twenty-eight randomly selected
sentences of various classes (shown in Tables 4 and 5)
are given in Table 3 below.
In Tables 4 and 5, sentence means the English sentence,
candidate is the translation output of our machine translation
system and reference is the Urdu translation of
the English sentence by a human expert.

Table 3: Calculated score for the sentences shown in table 4, 5 

S.No. BLEU P R M F 

1 1.0000 1.0000 1.0000 0.9998 1.0000 

2 1.0000 1.0000 1.0000 0.9993 1.0000 

3 1.0000 1.0000 1.0000 0.9995 1.0000 

4 0.8105 1.0000 1.0000 0.9995 1.0000 

5 0.4544 0.5556 0.5556 0.5533 0.5556 

6 0.4137 0.7500 0.7500 0.7361 0.7500 

7 0.8092 0.9091 0.9091 0.9055 0.9091 

8 0.6776 0.8889 0.8889 0.8880 0.8889 

9 0.3119 0.7143 0.7143 0.6371 0.7143 

10 0.3708 0.8333 0.8333 0.8067 0.8333 

11 1.0000 1.0000 1.0000 0.9990 1.0000 

12 0.5435 0.8571 0.8571 0.8552 0.8571 

13 0.4208 0.8333 0.8333 0.8300 0.8333 

14 1.0000 1.0000 1.0000 0.9985 1.0000 

15 0.5630 0.8889 0.7273 0.7212 0.8000 

16 0.5667 0.8333 0.8333 0.8300 0.8333 

17 0.5667 0.8333 0.8333 0.8300 0.8333 

18 1.0000 1.0000 1.0000 0.9960 1.0000 

19 0.7140 0.8750 0.7778 0.7773 0.8235 

20 0.8101 0.8571 0.8750 0.8630 0.8660 

21 0.8914 1.0000 0.9000 0.9041 0.9474 

22 0.2762 0.5714 0.5714 0.5670 0.5714 

23 0.8914 1.0000 1.0000 0.9993 1.0000 

24 1.0000 1.0000 1.0000 0.9977 1.0000 

25 1.0000 1.0000 1.0000 0.9977 1.0000 

26 1.0000 1.0000 1.0000 0.9922 1.0000 

27 1.0000 1.0000 1.0000 0.9977 1.0000 

28 1.0000 1.0000 1.0000 0.9985 1.0000 

BLEU (Bilingual Evaluation Understudy) [15] is
an IBM-developed metric that uses a modified form of precision
(modified n-gram precision) to compare the candidate
translation against reference translations. It takes
the geometric mean of the modified precision scores of the
test corpus and then multiplies the result by an exponential
brevity penalty factor to give the BLEU score. The modified
precision score can be calculated as follows:

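\[
p_n = \frac{\sum_{C \in \{\mathrm{Candidates}\}} \sum_{n\text{-gram} \in C} \mathrm{Count}_{clip}(n\text{-gram})}
           {\sum_{C' \in \{\mathrm{Candidates}\}} \sum_{n\text{-gram}' \in C'} \mathrm{Count}(n\text{-gram}')}
\]

Where C and C' both range over the candidate translation sentences:
the numerator sums the clipped n-gram counts and the denominator sums
the unclipped n-gram counts. Count_clip in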



Table 4: Twenty-eight randomly selected sentences 

Table 5: Twenty-eight randomly selected sentences 

S.No. Sentence, Candidate and Reference

1. Sentence: Why has he bought a watch for his sister 

from the market? 

Candidate: wah bazar se apni bahan ke liye ek ghadi 

kyon kharid chuka hai 

Reference : wah bazar se apni bahan ke liye ek ghadi 


2. Reference : wah bazar se apni bahan ke liye ek ghadi 


Candidate: ladaka SALESMAN se ek kitaab kyon 

kharid raha tha 

Reference: ladaka salesman se ek kitaab kyon kharid 

raha tha 

3. Sentence: The girl was singing a song with her 

friends. 

Candidate: ladaki apne doston ke sath ek gana ga rahi 

thi 

Reference: ladaki apne doston ke sath ek gana ga rahi 

thi 

4. Sentence: Why did the teacher give homework to us? 

Candidate: ustad hamko ko ghar ke liye kam kyon 

diya tha 

Reference: ustad ne hamko ghar ke liye kam kyon 

diya tha 

5. Sentence: I bought 10kg mango for my sister. 


6. Sentence: I lent my pen to a friend. 

Candidate: mai ek dost ko mera kalam diya tha 

Reference: maine ek dost ko apna kalam diya tha 

7. Sentence: Why did he not go to the market with his 

friends? 

Candidate: wah apni doston ke sath bazar ko kyon 

nahi jata hai 

Reference: wah apne doston ke sath bazar ko kyon 

nahi jata hai 

8. Sentence: He went to the market with his friends. 

Candidate: wah apni doston ke sath bazar ko gaya tha 

Reference: wah apni doston ke sath bazar ko gaya tha 

9. Sentence: These books belong to me. 

Candidate: yen kitaben mujh ko talluk rakhta hai 

Reference: yen kitaben mujh se talluk rakhti hai 

10. Sentence: I like reading books. 

Candidate: mai padh kitaben pasand karta hun 

Reference: mai kitaben padhna pasand karta hun 

11. Sentence: Has he finished working? 

Candidate: Kya wah kam karna khatm kar chuka hai 

Reference: Kya wah kam karna khatm kar chuka hai 

12. Sentence: Where does he want to go to see the movie? 

Candidate: wah film dekhna kahaan jana chahta hai 

Reference: wah film dekhne kahaan jana chahta hai 

13. Sentence: He wants to go to see the movie. 

Candidate: wah film dekhna jana chahta hai 

Reference: wah film dekhne jana chahta hai 

this equation is calculated as Count_clip = min(Count, Max_Ref_Count). The brevity penalty is calculated as

\[
BP = \begin{cases} 1 & \text{if } c > r, \\ e^{(1 - r/c)} & \text{if } c \le r, \end{cases}
\]

where r is the length of the reference and c is the length of the

S.No. Sentence, Candidate and Reference

14. Sentence: That man wishes to buy a car. 

Candidate: vah adami ek car kharidna chahta hai 

Reference: vah adami ek car kharidna chahta hai 

15. Sentence: Has he decided to visit the museum? 

Candidate: Kya wah mueseum daura karna faisla kar 

chuka hai 

Reference: Kya wah mueseum ka daura karne ka faisla kar 

chuka hai 

16. Sentence: Where does he want to go to play? 

Candidate: wah kahaan khelna jana chahta hai 

Candidate: wah kahaan khelna jana chahta hai 

17. Sentence: Why does he not want to play?

Candidate: wah kyon khelna nahi chahta hai 

Reference: wah kyon khelna nahi chahta hai 

18. Sentence: My friend wants to go. 

Candidate: mera dost jana chahta hai 

Reference: mera dost jana chahta hai 

19. Sentence: Why the old man did not tell us the truth? 

Candidate: boodha adami hamko sach kyon nahi bataya 

tha 

Reference: boodhe adami ne hamko sach kyon nahi bataya 

tha 

20. Sentence: Why did the old man buy a watch? 

Candidate: boodha adami ek ghadi kyon kharida tha 

Reference: boodhe adami ne ek ghadi kyon kharidi thi 

21. Sentence: The teacher did not give us homework. 

Candidate: ustad hamko ghar ke liye kam nahi diya tha 

Reference: ustad ne hamko ghar ke liye kam nahi diya tha 

22. Sentence: Shimla offers you refreshing environment. 

Candidate: SHIMLA tumko tazagi bhara mahaul deta hai 

Reference: Shimla tumko tazagi bhara mahaul deta hai 

23. Sentence: Have I given you my pen? 

Candidate: Kya mai tumko mera kalam de chuka hun 

Reference: Kya mai tumko apna kalam de chuka hun 

24. Sentence: The boy has lost his pen. 

Candidate: ladaka apni kalam kho chuka hai 

Reference: ladaka apni kalam kho chuka hai 

25. Sentence: What these boys were doing? 

Candidate: yen ladaken kya kar rahe the 

Reference: yen ladaken kya kar rahe the 

26. Sentence: Do the birds fly? 

Candidate: Kya chidiyan udati hain 

Reference: Kya chidiyan udati hain 

27. Sentence: The old man was working. 

Candidate: boodha adami kam kar raha tha 

Reference: boodha adami kam kar raha tha 

28. Sentence: Mr S Khan is a research scholar. 

Candidate: MR S KHAN ek taftish alam hai 

Reference: Mr S Khan ek taftish alam hai 

candidate. The BLEU score is then calculated as:

\[
BLEU = BP \cdot \exp\left(\sum_{n} w_n \log p_n\right)
\]
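As an illustration of the formulas above, the following Python sketch computes a sentence-level BLEU score from tokenized sentences. It is a minimal sketch under stated assumptions, not the evaluation script used for the reported scores: the function names are hypothetical, the weights are taken as uniform (w_n = 1/N with N = 4), the reference length r is taken from the reference closest in length to the candidate, and no smoothing is applied, so a sentence with no matching 4-gram scores 0.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision p_n of one candidate against its references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # Max_Ref_Count: largest count of each n-gram in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, cnt in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], cnt)
    clipped = sum(min(cnt, max_ref[gram]) for gram, cnt in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(c, r):
    """BP = 1 if c > r, else exp(1 - r/c)."""
    if c == 0:
        return 0.0
    return 1.0 if c > r else math.exp(1.0 - r / c)


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights (assumption: w_n = 1/max_n)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # log(0) is undefined; no smoothing in this sketch
    c = len(candidate)
    # Assumption: r is the length of the reference closest in length to c.
    r = min((len(ref) for ref in references), key=lambda L: (abs(L - c), L))
    log_mean = sum(math.log(p) / max_n for p in precisions)
    return brevity_penalty(c, r) * math.exp(log_mean)
```

For sentence 4 of Table 4, the candidate differs from the reference in one of its ten words, so `modified_precision(candidate, [reference], 1)` evaluates to 9/10.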

Precision [23] is the fraction of correct instances among those that the algorithm believes to belong to the relevant subset. It is calculated as P = |X ∩ Y| / |Y|, where Y is the set of candidate items and X is the set of reference items.

Recall [23] is the fraction of correct instances among all instances that actually belong to the relevant subset. It is calculated as R = |X ∩ Y| / |X|, where Y is the set of candidate items and X is the set of reference items.

METEOR (Metric for Evaluation of Translation with Explicit ORdering) is a machine translation evaluation metric developed at Carnegie Mellon University. It is based on the weighted harmonic mean of unigram precision (P = m/w_t) and unigram recall (R = m/w_r), where m is the number of unigrams in the candidate translation that are also found in the reference translation, w_t is the number of unigrams in the candidate translation and w_r is the number of unigrams in the reference translation. Precision and recall are combined using the harmonic mean, with recall weighted 9 times more than precision:

\[
F_{mean} = \frac{10PR}{R + 9P}
\]

This measure accounts for congruity with respect to single words only. To take longer n-gram matches into account, a penalty p is calculated for the alignment as

\[
p = 0.5 \left(\frac{c}{u_m}\right)^3
\]

where c is the number of chunks and u_m is the number of unigrams that have been mapped. The more mappings there are that are not adjacent in the reference and the candidate sentence, the higher the penalty will be. The final METEOR score (M-score) is calculated as:

\[
M = F_{mean}(1 - p)
\]
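The METEOR formulas can be checked numerically with a few lines of Python. The sketch below is only an illustration: the function name is hypothetical, and the number of matched unigrams and the number of chunks are passed in directly, since computing the alignment itself is outside the scope of the description above.

```python
def meteor_score(m, w_t, w_r, chunks):
    """METEOR score from matched unigrams m, candidate length w_t,
    reference length w_r and the number of chunks in the alignment."""
    if m == 0:
        return 0.0
    precision = m / w_t
    recall = m / w_r
    # Harmonic mean with recall weighted 9 times more than precision.
    f_mean = 10 * precision * recall / (recall + 9 * precision)
    # Fragmentation penalty: grows with the number of non-adjacent chunks.
    penalty = 0.5 * (chunks / m) ** 3
    return f_mean * (1 - penalty)
```

For a five-word candidate identical to its reference (one chunk, as in sentence 18), `meteor_score(5, 5, 5, 1)` gives 0.996, which coincides with the value 0.9960 listed in the row for sentence 18 of the score table. For sentence 21, with 9 matched unigrams in 2 chunks, 9 candidate words and 10 reference words, the result rounds to 0.9041.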

F-Measure [23] is a metric developed at New York University. It is defined as the harmonic mean of precision and recall:

F-measure = (2 * Precision * Recall) / (Precision + Recall).
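A small Python sketch of unigram precision, recall and F-measure follows. It assumes whitespace-tokenized sentences and counts repeated words as a multiset intersection, which is one common reading of |X ∩ Y|; the function name is hypothetical.

```python
from collections import Counter


def precision_recall_f(candidate, reference):
    """Unigram precision, recall and F-measure of two token lists."""
    cand, ref = Counter(candidate), Counter(reference)
    # Multiset intersection: each word counts at most as often as it
    # appears in both sentences (an assumption of this sketch).
    overlap = sum((cand & ref).values())
    p = overlap / len(candidate) if candidate else 0.0
    r = overlap / len(reference) if reference else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

For sentence 21 of Table 5, all nine candidate words occur in the ten-word reference, so the sketch returns P = 1.0, R = 0.9 and F ≈ 0.9474, the same three values that appear in the row for sentence 21 of the score table.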

The comparative scores of the different machine translation evaluation methods, namely BLEU (BiLingual Evaluation Understudy), METEOR (M), F-measure (F), unigram Precision (P) and unigram Recall (R), for twenty-eight randomly selected sentences of various classes are shown in Figure 4.

The results show that the system performs efficiently on those classes of sentences whose grammar rules have been trained in the neural network. The system uses the Stanford Parser for typed dependencies and the Stanford Tagger for POS tagging; if the parser or tagger makes an error on a sentence, the same error is propagated throughout the translation and results in a wrong translation. We obtained an average BLEU score of 0.6954, M-score of 0.8583 and F-score of 0.8650.

6 Conclusion and Future Work 

The working and architecture of our English to Urdu machine translation system are discussed in this paper. All the modules have been implemented successfully. This paper describes the use of a neural network with a rule-based machine translation approach. Our system uses the neural network for dictionary lookups and grammar structure mapping. Suffix addition to verbs and case marker addition with the subject do not require any dictionary lookups; they are done on the basis of information attached to the words, which makes the system efficient and fast. Our system works efficiently on sentences for which grammar rules are present in the rule base and whose words are available in the bilingual dictionary. If a word is not present in the dictionary, the English word is printed as it is, in capitals, in the translation. The translation results obtained from the system were evaluated using machine evaluation methods and manually, and it was seen that the system works efficiently on the trained linguistic rules and bilingual dictionary. The n-gram BLEU score obtained for the system over 100 sentences is 0.6954, the METEOR score is 0.8583 and the F-score is 0.8650. An enhancement of the grammar rules and of the size of the bilingual dictionary will therefore lead to a more efficient and accurate machine translation system.

Figure 4: Evaluation Scores for randomly selected sentences

References 

[1] Abdullah, P. and Homiedan, H. Machine translation, 

1997. 

[10] Kashif Shaikh, M., Ali Khowaja, H., and 

Ahmed Khan, M. Urdu text translation with natural 

language processing. In Engineering, Sciences 

and Technology, Student Conference On, pages 81 

– 85, 2004. 

[11] Khan, S., Pervez, Z., Mahmood, M., Mustafa, F., 

and Hasan, U. An expert system driven approach 

to generating natural language in romanized urdu 

from english documents. In Multi Topic Conference, 

2003. INMIC 2003. 7th International, pages 

361 – 366, 2003. 

[12] Mishra, V. and Mishra., R. Ann and rule based 

model for english to sanskrit machine translation. 

INFOCOMP Journal of Computer Science, 

9(1):80–89, 2010. 

[13] Muhammad, U., Bilal, K., Khan, A., and Khan, 

M. N. Aghaz: An expert system based approach 

for the translation of english to urdu. International 

Journal of Social Sciences, 3(1):70–74, 2008. 

[2] Ahmed, T. The interaction of light verbs and verb 

classes of urdu. In Interdisciplinary workshop 

on Verbs: The identification and representation of 

verb features, 2010. 

[3] Atish Durrani, M. Q. Z. Urdu informatics. Islamabad 

: Center of Excellence for Urdu Informatics, 

National Language Authority, 2008. 

[4] Hagan, M. and Menhaj, M. Training feedforward 

networks with the marquardt algorithm. Neural 

Networks, IEEE Transactions on, 5(6):989 –993, 

1994. 

[5] Hardie, A. Developing a tagset for automated 

part-of-speech tagging in urdu. In Corpus Linguistics 

2003, 2003. 

[6] Hock, H. Principles of historical linguistics. 

Mouton de Gruyter, 1986. 

[7] Hutchins, W. Machine Translation: past, present, 

future. Chichester : Ellis Horwood, 1986. 

[8] Imperial, N. K., Koncar, N., and Guthrie, D. G. A natural language translation neural network. In Proceedings of the International Conference on New Methods in Language Processing (NeMLaP), pages 71–77, 1994.

[9] Jain, A. N. Parsing complex sentences with structured 

connectionist networks. Neural Computation, 

3:110–120, 1991. 

[14] Naim, C. and Qaumi Kaunsil bara’e Taraqqi-yi 

Urdu (New Delhi, I. Introductory Urdu. Number 

v. 1. National Council for Promotion of Urdu 

Language, 2000. 

[15] Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. Bleu: A method for automatic evaluation of machine translation. pages 311–318, 2002.

[16] Parser, S. Stanford parser, 

http://nlp.stanford.edu/software/lex-parser.shtml, 

2011. 

[17] SIL. Sil international, 2011. 

[18] Sinha, R. M. K. Developing english-urdu machine 

translation via hindi, 2009. 

[19] Tafseer Ahmed, S. A. English to urdu translation 

system, 2002. 

[20] Tagger, S. http://nlp.stanford.edu/software/tagger.shtml, 

2011. 

[21] Tahir, G., Asghar, S., and Masood, N. Knowledge 

based machine translation. In Information 

and Emerging Technologies (ICIET), 2010 International 

Conference on, pages 1 –5, 2010. 

[22] TDIL. Technology development for indian languages 

programme, http://tdil-dc.in, 2011. 



[23] Turian, J., Shen, L., and Melamed, I. D. Evaluation of machine translation and its evaluation. In Proceedings of MT Summit IX, pages 386–393, 2003.

[24] Waibel, A., Jain, A., McNair, A., Saito, H., Hauptmann, A., and Tebelskis, J. Janus: a speech-to-speech translation system using connectionist and symbolic processing strategies. In Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference on, pages 793–796 vol.2, 1991.

[25] Zafar, M. and Masood, A. Interactive english to 

urdu machine translation using example-based approach, 

2009. 


Computing a Longest Common Subsequence of two strings when 

one of them is Run Length Encoded 

SHEGUFTA BAKHT AHSAN 1 

TANAEEM M. MOOSA 2 

M. SOHEL RAHMAN 2 

SHAMPA SHAHRIYAR 1 

AlEDA Group, Department of CSE 

Bangladesh University of Engineering and Technology 

1 (plaban777,shampa077)@gmail.com 

2 (tanaeem,msrahman)@cse.buet.ac.bd 

Abstract. Given two strings, the longest common subsequence (LCS) problem computes a common 

subsequence that has the maximum length. In this paper, we present new and efficient algorithms for 

solving the LCS problem for two strings one of which is run length encoded (RLE). We first present 

an algorithm that runs in O(gN) time, where g is the length of the RLE string and N is the length of 

uncompressed string. Then based on the ideas of the above algorithm we present another algorithm that 

runs in O(R log(log g) + N) time, where R is the total number of ordered pairs of positions at which the 

two strings match. Our first algorithm matches the best algorithm in the literature for the same problem. 

On the other hand, for R < gN/ log(log g), our second algorithm outperforms the best algorithms in the

literature. 

Keywords: algorithms, longest common subsequence, run length encoded strings. 

(Received May 15th, 2011 / Accepted September 1st, 2011) 

1 Introduction


The longest common subsequence (LCS) problem is 

a classic and well-studied problem in computer science 

with extensive applications in diverse areas ranging 

from spelling error corrections to molecular biology. 

For example, the task of spelling error correction 

is to find the dictionary entry which resembles most a 

given word. In order to save storage a file archive of 

several versions of a source program is maintained compactly 

by storing only the original version and the differences 

of subsequent versions with the previous ones. 

In molecular biology [19, 1], we want to compare DNA 

or protein sequences to learn how homologous they are. 

All these cases can be seen as an investigation for the 

‘closeness’ among strings. And an obvious measure for 

the closeness of strings is to find the maximum number 

of common symbols in them preserving the order 

of the symbols. This is known as the longest common 

subsequence of two strings. 

Suppose we are given two strings 

X[1..N] = X[1]X[2] . . . X[N] and Y [1..G] = 

Y [1]Y [2] . . . Y [G]. Without the loss of generality, 

we can assume that N ≤ G. A subsequence 

S[1..R] = S[1]S[2] . . . S[R], 0 < R ≤ N of X is 

obtained by deleting N − R symbols from X. A 

common subsequence of two strings X and Y is a 

subsequence common to both X and Y . The longest 

common subsequence problem for two strings, is to 

find a common subsequence in both the strings, having 

the maximum possible length. We use lcs(X, Y ) and 

r(X, Y ) to denote a longest common subsequence of 

X and Y and its length, respectively. 

The classic dynamic programming solution to LCS 

problem, invented by Wagner and Fischer [22], has 



O(NG) worst case running time. Masek and Paterson 

[15] improved this algorithm using the “Four- 

Russians" technique [4] to reduce the worst case running 

time to O(NG/ log N). Since then, not much 

improvement in terms of N, G can be found in the literature. 

However, several algorithms exist with complexities 

depending on other parameters. For example, 

Myers in [17] and Nakatsu et al. in [18] presented 

an O(ND) algorithm, where the parameter D is 

the simple Levenshtein distance between the two given 

strings [13]. 
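For reference, the classic O(NG) dynamic programming solution mentioned above can be sketched in a few lines of Python. This is the textbook recurrence written with two rolling rows; the function name is hypothetical and only the length r(X, Y) is computed, not the subsequence itself.

```python
def lcs_length(x, y):
    """Length of a longest common subsequence of x and y in O(|x||y|) time."""
    n, g = len(x), len(y)
    prev = [0] * (g + 1)  # row i-1 of the DP table
    for i in range(1, n + 1):
        curr = [0] * (g + 1)  # row i of the DP table
        for j in range(1, g + 1):
            if x[i - 1] == y[j - 1]:
                # Extend the best solution for the two shorter prefixes.
                curr[j] = prev[j - 1] + 1
            else:
                # Drop the last symbol of one of the two prefixes.
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[g]
```

Such a baseline is useful mainly as a correctness check for the faster algorithms discussed in the rest of the paper.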

Another interesting and perhaps more relevant parameter 

for this problem is R, which is the total number 

of ordered pairs of positions at which the two strings 

match. Hunt and Szymanski [8] presented an algorithm 

to solve the LCS problem in O((R + N) log N) time. 

They also cited applications, where R ∼ N and thereby 

claimed that for these applications the algorithm would 

run in O(N log N) time. Very recently, Iliopoulos and 

Rahman [10, 11] presented an efficient algorithm to 

solve the LCS problem in O(R log(log(N))+N) time. 

1.1 LCS for RLE Strings 

In this paper, we are interested in computing an LCS when

one of the strings is run length encoded. The motivation 

for using compressed strings as input comes from 

the huge size of biological sequences. Here we will 

be focusing on the run-length encoded [12] strings. In 

a string, the maximal repeated string of characters is 

called a run and the number of repetitions is called the 

run-length. Thus, a string can be encoded more compactly 

by replacing a run by a single instance of the repeated 

character along with its run-length. Compressing 

a string in this way is called run-length encoding 

and a run-length encoded string is abbreviated as an 

RLE string. 

In what follows, we use the following convention: if X is an (uncompressed) string, then the run length encoding of X will be denoted by X̃. For example, the RLE string of X = bdcccaaaaaa is X̃ = b^1 d^1 c^3 a^6. Note that for X̃, we define X̃[1] = b^1, X̃[4] = a^6 and so on. The notation |X| is used to denote its usual meaning, i.e., the length of X; the length of the corresponding RLE string X̃ is denoted by |X̃|. We will use small letters to denote the length of an RLE string, whereas capital letters will be used to denote the length of an uncompressed string. For example, if |X| = N, then we shall use n to denote the length of X̃.
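The encoding convention can be made concrete with a short Python sketch. The representation of an RLE string as a list of (symbol, run-length) pairs and the function names are assumptions made for illustration only.

```python
from itertools import groupby


def rle_encode(s):
    """Run-length encode s as a list of (symbol, run_length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(s)]


def rle_decode(runs):
    """Expand a list of (symbol, run_length) pairs back into a string."""
    return "".join(ch * k for ch, k in runs)
```

For X = bdcccaaaaaa, `rle_encode` returns `[('b', 1), ('d', 1), ('c', 3), ('a', 6)]`, so N = 11 while n = 4.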

Figure 1: The set of matches M in two different settings. A dot in a cell indicates a match.

Note that the notion of a match, and hence the definition of the set of matches, M, can be extended in a natural way when one or both of the strings involved is/are run length encoded. For example, the notion of a match (i, j) ∈ M is extended when one input is an RLE string as follows: if Ỹ[i] = a^q and X[j] = a, then we say (i, j) ∈ M and run((i, j)) = q. The set M as well as its size in two different contexts are illustrated in Figure 1. In particular, Figure 1(a) considers two normal strings and Figure 1(b) illustrates the scenario when one of those is run length encoded. The problem we handle in this paper is formally defined as follows:

Problem 1. Problem LCS_RLE. Given one uncompressed 

string X[1..N] = X[1]X[2] . . . X[N] and one 

RLE string Ỹ [1..g] = Ỹ [1]Ỹ [2] . . . Ỹ [g], we want to 

compute a Longest Common Subsequence (LCS) of X 

and Ỹ . 
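To make the extended notion of a match concrete, the following Python sketch lists, for every run of the RLE string, the positions of X that match it. The function name and the list-of-pairs representation of the RLE string are hypothetical; positions are 1-based as in the text.

```python
def rle_matches(x, y_rle):
    """For each run Y~[i] = a^q, return the sorted columns j with X[j] = a.

    rows[i - 1] holds the columns of the matches (i, j) in Row i;
    the run length q is the value run((i, j)) for every match of that row.
    """
    rows = []
    for symbol, _run_length in y_rle:
        rows.append([j for j, c in enumerate(x, start=1) if c == symbol])
    return rows
```

For X = ABBCCCAAAA and an RLE string consisting of the runs C^3 and A^3, the two rows are [4, 5, 6] and [1, 7, 8, 9, 10], so R = 8, whereas the same pair of strings without compression would have 24 matches.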

We will use LCS_RLE(X, Ỹ) to denote an LCS of X and Ỹ. There has been significant research on solving the LCS problem involving RLE strings in the literature. Mitchell proposed an algorithm [16] capable of computing an LCS when both inputs are RLE strings. Given two RLE strings X̃[1..n] and Ỹ[1..g], Mitchell's algorithm runs in O((R + g + n) log(R + g + n)) time. Apostolico et al. [3] gave another algorithm for solving the same problem in O(gn log(gn)) time, whereas the algorithm of Freschi and Bogliolo [6] runs in O(gN + Gn − gn) time. Ann et al. also proposed an algorithm to compute an LCS of two run length encoded strings [2] in O(gn + min{g_1, g_2}) time, where g_1 and g_2 denote the number of elements in the bottom and right boundaries of the matched blocks, respectively. The version of the problem where only one string is run length encoded was handled recently by Liu et al. in [14]. Here, the authors proposed an O(gN) time algorithm to solve the problem.

1.2 Our Contribution

In this paper, we make an effort to solve the LCS problem efficiently when one of the input strings is run length encoded. Our main contributions are two novel efficient algorithms, namely, LCS_RLE-I and LCS_RLE-II, to solve Problem 1. In particular, we first present a novel and interesting idea to solve the problem and present an algorithm that runs in O(gN) time (LCS_RLE-I). This matches the best algorithm in the literature [14] for the same problem. Subsequently, based on the ideas of our above algorithm, we present another algorithm that runs in O(R log(log(g)) + N) time (LCS_RLE-II). Clearly, for R < gN/ log(log(g)), our second algorithm outperforms the best algorithms in the literature. In this context, Algorithm LCS_RLE-II is an input sensitive algorithm. In many cases, the input could be such that R = o(gN). In such cases, our algorithm will definitely show better behaviour than the other algorithms. Also, note that, in our setting, Mitchell's algorithm would run in O((R + G + n) log(R + G + n)) time, which clearly is worse than ours. (Notably, Mitchell's algorithm could also be used in our setting with an extra preprocessing step to compress the uncompressed string. In this case, the cost of compression must be taken into account.)

With the existence of LCS algorithms in the literature that can deal with both RLE strings, our algorithms may seem to be only theoretically interesting. However, we note the following points in favour of the practical importance of our algorithms. Firstly, in many practical instances a much smaller reference pattern is compared with a large genome. In such cases our version of the problem may turn out to be more relevant. Secondly, when we talk about comparing two RLE genomes, we often ignore the cost of compressing the two genomes. Now, if we count the cost of compression, then our algorithms (and the algorithm of Liu et al. in [14]) may turn out to be more favourable in different practical settings. Finally, our work can be seen as a building block for an efficient algorithm for the version where both strings are RLE. Indeed, we believe that by combining some of the tricks of Mitchell [16] with our work, we would be able to get an algorithm for two RLE strings that runs faster than Mitchell's algorithm.

1.3 Roadmap 

The rest of the paper is organized as follows. In 

Section 2, we present an O(gN) algorithm, namely 

LCS_RLE-I, to solve Problem LCS_RLE. LCS_RLE-I 

provides the base of our second algorithm, LCS_RLE- 

II, described in Section 3. We achieve O(R log log g + 

N) running time for LCS_RLE-II. Finally, we briefly 

conclude in Section 4. 

2 A New Algorithm 

In this section, we present Algorithm LCS_RLE-I 

which works in O(gN) time. Since our algorithm depends 

on some ideas of Algorithm LCS-I of [10, 11], 

we give a very brief overview of LCS-I in the following 

subsection. 

2.1 Review of LCS-I 

Note that, LCS-I solves the classic LCS problem for 

two given strings X and Y . For the ease of exposition, 

and to remain in line with the description of [10, 11], 

while reviewing LCS-I (in this section) we will assume 

that |X| = |Y | = N. Recall that, we say a pair 

(i, j), 1 ≤ i ≤ N, 1 ≤ j ≤ G, defines a match, if 

X[i] = Y [j]. The set of all matches, M, is defined as 

follows: 

M = {(i, j) | X[i] = Y [j], 1 ≤ i ≤ N, 1 ≤ j ≤ G}. 

Observe that |M| = R. From the definition of LCS 

it is clear that if (i, j) ∈ M, then we can calculate 

T [i, j], 1 ≤ i, j ≤ N by employing the Equation 1 

from [20, 9]. 

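\[
T[i, j] =
\begin{cases}
\text{Undefined} & \text{if } (i, j) \notin M, \\
1 & \text{if } (i = 1 \text{ or } j = 1) \text{ and } (i, j) \in M, \\
\max_{1 \le l_i < i,\ 1 \le l_j < j,\ (l_i, l_j) \in M} \left( T[l_i, l_j] \right) + 1 & \text{if } (i, j) \in M.
\end{cases}
\qquad (1)
\]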


Here we have used the tabular notation T[i, j] to denote r(Y[1..i], X[1..j]). We use the notation M_i to denote the set of matches in Row i. Also, for the sake of better exposition we impose a numbering on the matches of a particular row from left to right as follows. If we have M_i = {(i, j_1), (i, j_2), ..., (i, j_l)}, such that 1 ≤ j_1 < j_2 < ... < j_l, then we say that number((i, j_q)) = q and may refer to the match (i, j_q) as the qth match in Row i. Note that number((i, j_q)) may or may not be equal to j_q.
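The recurrence over matches can be evaluated directly, if inefficiently, as in the following Python sketch. It is a naive illustration only (quadratic in the number of matches) and is not LCS-I. As an assumption of the sketch, rows index Y and columns index X, in line with T[i, j] = r(Y[1..i], X[1..j]), and a match with no earlier match above and to its left receives the value 1.

```python
def match_table(x, y):
    """T-values for all matches (i, j), with i indexing y and j indexing x.

    Matches are visited row by row, left to right, so every match that can
    precede (i, j) has already been given its value.
    """
    T = {}
    for i in range(1, len(y) + 1):
        for j in range(1, len(x) + 1):
            if y[i - 1] != x[j - 1]:
                continue  # T[i, j] is undefined for non-matches
            best = 0
            for (li, lj), value in T.items():
                if li < i and lj < j:
                    best = max(best, value)
            T[(i, j)] = best + 1
    return T
```

The number of keys of the returned dictionary is R, and the largest value is the LCS length; for X = ABBCCCAAAA and Y = CCCAAA these are 24 and 6, respectively.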

In what follows, we assume that we are given the set 

M in the prescribed order assuming a row by row operation. 

LCS-I depends on the following facts, problem 

and results. 

Fact 1. ([9, 20]) Suppose (i, j) ∈ M. Then for all (i', j) ∈ M, i' > i (resp. (i, j') ∈ M, j' > j), we must have T[i', j] ≥ T[i, j] (resp. T[i, j'] ≥ T[i, j]). □

Fact 2. ([9, 20]) The calculation of the entry T[i, j], (i, j) ∈ M, 1 ≤ i, j ≤ n, is independent of any T[l, q], (l, q) ∈ M, l = i, 1 ≤ q ≤ N. □

Problem 2. Range Maxima Query Problem. We are given an array A = a_1 a_2 ... a_n of numbers. We need to preprocess A to answer the following form of queries:

Query: Given an interval I = [i_s..i_e], 1 ≤ i_s ≤ i_e ≤ n, the goal is to find the index k (or the value A[k] itself) with maximum value A[k] for k ∈ I (ties can be broken arbitrarily, e.g. by taking the one with the larger (smaller) index). The query is denoted by RMQ_A(i_s, i_e).

Theorem 1. ([7, 5]) Range Maxima Query Problem can be solved in O(n) preprocessing time and O(1) time per query. □
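To illustrate the query interface, the following Python sketch answers range maxima queries with a sparse table. Note that this is a simpler, swapped-in technique: it needs O(n log n) preprocessing time and space rather than the O(n) of Theorem 1, while still answering each query in O(1) time. The class name is hypothetical and indices are 0-based.

```python
class SparseTableRMQ:
    """Range maxima queries via a sparse table (O(n log n) preprocessing)."""

    def __init__(self, values):
        self.a = list(values)
        n = len(self.a)
        # table[k][i] = index of a maximum of a[i .. i + 2^k - 1]
        self.table = [list(range(n))]
        k = 1
        while (1 << k) <= n:
            prev = self.table[k - 1]
            half = 1 << (k - 1)
            row = []
            for i in range(n - (1 << k) + 1):
                left, right = prev[i], prev[i + half]
                row.append(left if self.a[left] >= self.a[right] else right)
            self.table.append(row)
            k += 1

    def query(self, lo, hi):
        """Index of a maximum of a[lo..hi] (inclusive); ties go to the left."""
        k = (hi - lo + 1).bit_length() - 1
        left = self.table[k][lo]
        right = self.table[k][hi - (1 << k) + 1]
        return left if self.a[left] >= self.a[right] else right
```

A query covers its range with two overlapping blocks whose length is a power of two, which is why a single comparison suffices.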

Now, assume that we are computing the match 

(i, j). LCS-I maintains an array H of length N, where, 

for the current value of i ∈ [1..N] we have, H[l] = 

max 1≤k 1, then, we require two steps. Firstly, we 

perform the baseOperation. Then, in the second step 

(referred to as the weightOperation), we consider q 

previous matches (if fewer matches are available we 

need to consider all of them) in Row i, including the 

current one. Now, note carefully that T -values for these 

matches have already been computed and reflected in 

the S array. We copy S to K and S array is never 

changed by any weightOperation. For Row i, we call 

K[k] to be a match position if (i, k) ∈ M i and K[k] 

and k are referred to as the corresponding K-value and 



K-index (similar notations are also defined for the arrays 

H and S). Now, we add a weight to each of the 

corresponding K-values: the weight is 0 for the current 

match, 1 for the previous match, 2 for the match before 

it and so on. Now observe that T[i, j] will be the maximum of these values. This is because the kth element of this "window" from the right corresponds to matching k a's from the run a^q with the rightmost k a's from X' and then matching the remaining substring with Ỹ'.

We will use an array K to do this computation efficiently. Now recall that we are handling the match (i, j) ∈ M_i. We can implement the weightOperation by adding the appropriate weights at the corresponding match positions of K and then performing the query RMQ_K(u, j), such that K[u] is a match position (due to (i, u)) and q = number((i, j)) − number((i, u)) + 1. In what follows we will refer to the above range (i.e., the range [u..j]) as the weighted query window. In this strategy, however, we may need to adjust the weights every time we compute a new match, since the weighted query window may change for each match. This would be costly. In what follows, we discuss how to do this more efficiently.

Rather than adding the appropriate weight, for a particular row, we will add a relative weight to all the match positions of K. This ensures that the position of the maximum value remains the same, although the value may not. To get the correct value, we will finally deduct the appropriate difference from the value. We do it as follows. After the baseOperation, we copy the array S to array K. Then, to each match (i, l) ∈ M_i we add |M_i| − number((i, l)) + 1 as the relative weight. In other words, we give weight 1 to the rightmost match, 2 to the next one and so on and, finally, |M_i| to the first match.

Now recall that we are considering the match (i, j) ∈ M_i, i.e., we are computing T[i, j]. Assume that number((i, j)) = |M_i| − k + 1, i.e., this is the kth match position from the right. As before, we execute the query RMQ_K(u, j), such that K[u] is a match position (due to (i, u)) and q = number((i, j)) − number((i, u)) + 1. However, this time we need to do some adjustment as follows. It is easy to realize that each of the values of the matched positions in K[u..j] is k higher than the actual value. So, to correct the computation we perform T[i, j] = RMQ_K(u, j) − k.
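The relative-weight trick can be traced with a short Python sketch. It works on one row only and, as a simplifying assumption, takes as input the values at the matched positions after the baseOperation (left to right) and uses a plain maximum over the window instead of an RMQ structure. The function name is hypothetical.

```python
def weight_operation(base_values, q):
    """T-values of one row from its base values and the run length q.

    base_values[t] is the value at the (t + 1)-th match of the row after
    the baseOperation; the result lists T[i, j] for the same matches.
    """
    m = len(base_values)
    # Relative weights: m for the first match down to 1 for the rightmost.
    k_values = [base_values[t] + (m - t) for t in range(m)]
    result = []
    for t in range(m):
        k = m - t                 # this match is the k-th from the right
        lo = max(0, t - q + 1)    # window holds at most q matches
        result.append(max(k_values[lo:t + 1]) - k)
    return result
```

With the base values 1, 4, 4, 4, 4 of Row 2 in the example of Section 2.3 and q = 3, the weighted values are 6, 8, 7, 6, 5 and the resulting T-values are 1, 4, 5, 6, 6.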

The analysis of the algorithm is similar to that of 

LCS-I algorithm of [10, 11]. As we need to do at 

most two RMQ preprocessing per row, overall it will 

cost O(Ng) time (using O(N) time preprocessing algorithm). 

We need two RMQ queries per match which 

amounts to O(R) (using constant time RMQ query) 

time. Note that, in the worst case R = O(Ng). Finally, 

it is easy to see that, the set M in the prescribed order 

can be computed easily in O(Ng) time. Therefore, we 

get the following theorem. 

Theorem 2. LCS_RLE-I solves Problem LCS_RLE in 

O(Ng) time. □ 
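Putting the pieces together, the following Python sketch computes the LCS length of an uncompressed string and an RLE string row by row. It follows the baseOperation and weightOperation scheme described above, but it is not the authors' implementation: as a swapped-in technique, the maximum over the weighted query window is maintained with a monotonic deque instead of RMQ preprocessing, and the base values are obtained from prefix maxima of S. The function name and the (symbol, run-length) input format are assumptions. The running time is still O(gN).

```python
from collections import deque


def lcs_rle(x, y_rle):
    """LCS length of string x and an RLE string given as (symbol, run) pairs."""
    n = len(x)
    s = [0] * (n + 1)  # s[j]: best T-value of a match in column j; s[0] unused
    for symbol, q in y_rle:
        cols = [j for j in range(1, n + 1) if x[j - 1] == symbol]
        m = len(cols)
        # pm[j] = max(s[0..j-1]), taken before this row changes s.
        pm = [0] * (n + 2)
        for j in range(1, n + 2):
            pm[j] = max(pm[j - 1], s[j - 1])
        keys = []          # base value plus relative weight, per match
        window = deque()   # match indices with decreasing keys
        new_values = []
        for t, j in enumerate(cols):
            keys.append(pm[j] + 1 + (m - t))       # baseOperation + weight
            while window and keys[window[-1]] <= keys[t]:
                window.pop()
            window.append(t)
            while window[0] <= t - q:              # keep at most q matches
                window.popleft()
            new_values.append(keys[window[0]] - (m - t))
        for j, value in zip(cols, new_values):
            s[j] = max(s[j], value)
    return max(s)
```

For X = ABBCCCAAAA and the runs C^3 and A^3 the sketch returns 6, and the array `s` passes through the states shown in Table 1 and Table 6.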

2.3 An Illustrative Example 

In this section, we give a partial example of how the LCS_RLE-I algorithm works for X = ABBCCCAAAA and Ỹ = C^3 A^3. We assume that we have already completed processing the matches belonging to Row 1 (i.e., M_1). After calculating Row 1, the values of the S array are shown in Table 1 and the T-values of Row 1 are shown in Table 2.

value  0  0  0  0  1  2  3  0  0  0  0
index  0  1  2  3  4  5  6  7  8  9  10

Table 1: S array after calculating Row 1

C^3    0  0  0  0  1  2  3  0  0  0  0
          A  B  B  C  C  C  A  A  A  A

Table 2: T-values of Row 1 after processing all the matches of Row 1

Now we will calculate Row 2. We have M_2 = {(2, 1), (2, 7), (2, 8), (2, 9), (2, 10)}. Here, we have to perform both baseOperation and weightOperation. The values of the S array after the baseOperations for all the matches of M_2 are shown in Table 3.

value  0  1  0  0  1  2  3  4  4  4  4

Table 3: S array after baseOperations of all the matches of Row 2

To calculate weightOperation, we will copy the S 

array into the K array and add relative weight as shown 

in Table 4 and Table 5. 

value 0 1 0 0 1 2 3 4 4 4 4 

weight 0 5 0 0 0 0 0 4 3 2 1 


Table 4: K-Array before addition of relative weight 



value  0  6  0  0  1  2  3  8  7  6  5

Table 5: K-Array after addition of relative weight

After calculating the matches of Row 2, the values of the S array and the T array are shown in Table 6 and Table 7, respectively.

value  0  1  0  0  1  2  3  4  5  6  6

Table 6: S array after calculating Row 2

A^3    0  1  0  0  0  0  0  4  5  6  6
C^3    0  0  0  0  1  2  3  0  0  0  0
          A  B  B  C  C  C  A  A  A  A

Table 7: T-values of Row 2 after processing all the matches of Row 2

3 LCS_RLE-II

In this section, we use the ideas of LCS_RLE-I to present our second algorithm, LCS_RLE-II, which runs in O(R log(log(g)) + N) time. To achieve this running time, we will use an elegant data structure (referred to as the vEB tree henceforth) invented by van Emde Boas [21] that allows us to maintain a sorted list of integers in the range [1..n] in O(log(log(n))) time per insertion and deletion. In addition to that it can return next(i) (successor element of i in the list) and prev(i) (predecessor element of i in the list) in constant time.

We follow the same terminology and assume the same settings of Section 2 to describe LCS_RLE-II. So, assume that we are considering the match (i, j) ∈ M_i and recall that when the computation of the match (i, j) is complete, i.e., T[i, j] is completely computed, we would have the result of LCS_RLE(Ỹ' a^p, X'). Note carefully that the baseOperation is basically the operation required to compute a normal LCS. We can use the LCS algorithm of [10] or [11] just to do the baseOperation for each match. Then, for each match we would only need O(log(log(g))) time [11, 10], requiring a total of O(R log(log(g))) time to perform all the baseOperations.

Now, we focus on the weightOperations. Our goal is to completely avoid any RMQ preprocessing. We need to modify the weightOperation as follows. We will use the vEB tree for this purpose. Recall that we want to find the maximum value of K in the weighted query window. Furthermore, note that only the matched positions of K in the weighted query window are important in the calculation. So instead of maintaining the array K, we maintain a vEB tree where always the appropriate number (q in this case) of matches (with values after the addition of the relative weights) are kept. And as the computation moves from one match to the next, to maintain the appropriate weighted query window, only one element (corresponding to a match) is added to the vEB tree and at most one element is deleted. (We do not need to delete any element if the current weighted query window has fewer matches than q.)

When we need the maximum value of the weighted query window, we just find the maximum in the vEB tree, which can also be found in O(log(log(g))) time (by inserting a fictitious element having infinite value and then deleting it after computing its predecessor). As we need to insert and delete a constant number of elements from the vEB tree for each match, this can be done in O(R log(log(g))) time overall. As before, we need to deduct the appropriate value (|M_i| + 1 − number((i, j)) in this case) from the returned maximum to make the proper adjustment.
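The weightOperation described above (the second loop of Algorithm 1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: a sorted Python list handled with `bisect` stands in for the vEB tree, so each update costs O(q) rather than O(log(log(g))). The function name `weight_operations`, its parameters, and the sample base values are assumptions made for the example. Here `base` holds the S[j] values of the matches of one row after the baseOperations, in left-to-right order, and `q` is the run length of that row.

```python
from bisect import bisect_left, insort
from collections import deque


def weight_operations(base, q):
    """Apply the weightOperation to the matches of one row.

    base: S[j] values after the baseOperations, one per match, left to right.
    q:    run length of the current row (maximum window size).
    Returns the adjusted values, one per match.
    """
    m = len(base)        # |M_i|, the number of matches in the row
    window = []          # sorted (value, number) pairs; stand-in for the vEB tree
    order = deque()      # insertion order, used to find the earliest element
    result = []
    for number, s in enumerate(base, start=1):
        relative_weight = m - number + 1
        item = (s + relative_weight, number)
        insort(window, item)
        order.append(item)
        if len(window) > q:
            # Drop the earliest inserted element to keep at most q matches.
            oldest = order.popleft()
            del window[bisect_left(window, oldest)]
        maximum = window[-1][0]
        # Deduct |M_i| + 1 - number((i, j)), which equals relative_weight.
        result.append(maximum - relative_weight)
    return result


if __name__ == "__main__":
    # Assumed base values for a row with five matches and run length 3.
    print(weight_operations([1, 4, 4, 4, 4], 3))
```

With these assumed inputs the call returns [1, 4, 5, 6, 6], which agrees with the values at the matched positions of the A row in Table 7. Replacing the sorted list by a true vEB tree would change only the cost of each update, not the values computed.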

Algorithm 1 presents the idea (only the computation 

of the values) more formally. Notably, we must 

maintain appropriate pointers to recover the actual LCS 

after the computation is done. 

Finally, the computation of the set M in the prescribed 

order can be done following the preprocessing 

algorithm of [10, 11] which runs in O(R log(log(g)) + 

N) time. So, we have the following theorem. 

Theorem 3. LCS_RLE-II solves Problem LCS_RLE in 

O(R log(log(g)) + N) time. 
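To make the match sets M_i and the quantity R concrete, the following sketch lists the matches row by row in left-to-right order. It uses plain per-character position lists rather than the preprocessing algorithm of [10, 11], so it only illustrates what is being computed. The names `run_length_encode` and `match_sets`, the 1-based column indices, and the sample strings are assumptions made for the example.

```python
from collections import defaultdict
from itertools import groupby


def run_length_encode(y):
    """Return the runs of y as (letter, length) pairs."""
    return [(letter, len(list(group))) for letter, group in groupby(y)]


def match_sets(runs, x):
    """For each run (letter, length), list the 1-based positions j of x
    whose character equals the run's letter, in left-to-right order."""
    positions = defaultdict(list)
    for j, letter in enumerate(x, start=1):
        positions[letter].append(j)
    return [list(positions.get(letter, [])) for letter, _ in runs]


if __name__ == "__main__":
    runs = run_length_encode("CCCAAA")        # assumed sample input
    rows = match_sets(runs, "ABBCCCAAAA")
    total = sum(len(row) for row in rows)     # R, the total number of matches
    print(runs, rows, total)
```

Assuming, as the O(gN) bound suggests, that g counts the runs of the encoded string and N is the length of X, each row contributes at most N matches, so R is at most gN. When most runs use letters that are rare in X, R is much smaller than that.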

Algorithm 1 LCS_RLE-II: Computation of Row i (only the computation of the values is shown)

for each (i, j) ∈ M_i in left-to-right order do
    Perform the baseOperation and update the S array accordingly
end for
{Now we perform the weightOperation for every match of M_i. Assume that each vEB element is a tuple (value, pos).}
vEBTree = empty
for each (i, j) ∈ M_i in left-to-right order do
    relativeWeight = |M_i| − number((i, j)) + 1
    vEBTree.Insert((S[j] + relativeWeight, j))
    if |vEBTree| > q then
        Delete the earliest inserted element from vEBTree {The earliest inserted element can be found efficiently by maintaining a normal linked list of the inserted elements}
    end if
    S[j] = vEBTree.Maximum() − (|M_i| + 1 − number((i, j)))
    T[i, j] = S[j]
end for

4 Conclusion

In this paper, we have studied the longest common subsequence problem for two strings, where one of the input strings is run-length encoded. We have presented two novel algorithms, namely LCS_RLE-I and LCS_RLE-II, to solve the problem. We first presented LCS_RLE-I, combining some new ideas with the techniques used in [11, 10]. LCS_RLE-I runs in O(gN) time, which matches the best algorithm in the literature. We then presented an input-sensitive algorithm, namely LCS_RLE-II, that runs in O(R log(log(g)) + N) time. Observe that in the worst case R = O(gN), and hence the worst-case running time of LCS_RLE-II is slightly worse than that of the best algorithm in the literature. However, in many cases R = o(gN), and our algorithm would show superior behaviour in these cases. In particular, if R < gN / log(log(g)), LCS_RLE-II will outperform the best algorithms in the literature. Additionally, if we run Mitchell's algorithm (the best algorithm in the literature for two RLE strings) in our setting, the running time would be O((R + G + n) log(R + G + n)), which is clearly worse than ours. Also, by employing some of the insights of Mitchell [16], we believe our work can be extended to the version where both inputs are RLE strings.

Finally, all the works in the literature so far on LCS computation for RLE strings have focused only on the theoretical complexity of the devised algorithms. Theoretical improvements in these algorithms were achieved in most cases by using complex data structures (e.g., in our case, the vEB tree and RMQ data structures). In practice, such algorithms, despite having better theoretical bounds, may turn out to perform worse. Hence, an interesting research direction would be to implement the algorithms in the literature along with the new ones proposed here and to compare them against each other from a practical point of view. Notably, we have already started working in this direction and hope to present the findings in a forthcoming paper.

Acknowledgments

The authors would like to thank the anonymous reviewers and the editor for their constructive comments and suggestions, which greatly improved the presentation of the paper. This research work constitutes part of the B.Sc. Engineering thesis of Ahsan and Shahriyar under the supervision of Rahman. Moosa is currently working at Google Inc., USA.

References

[1] Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. Basic local alignment search tool. Journal of Molecular Biology, 215(3):403–410, 1990.

[2] Ann, H.-Y., Yang, C.-B., Tseng, C.-T., and Hor, C.-Y. A fast and simple algorithm for computing the longest common subsequence of run-length encoded strings. Inf. Process. Lett., 108(6):360–364, 2008.

[3] Apostolico, A., Landau, G. M., and Skiena, S. Matching for run-length encoded strings. J. Complexity, 15(1):4–16, 1999.

[4] Arlazarov, V., Dinic, E., Kronrod, M., and Faradzev, I. On economic construction of the transitive closure of a directed graph (English translation). Soviet Math. Dokl., 11:1209–1210, 1975.

[5] Bender, M. A. and Farach-Colton, M. The LCA problem revisited. In LATIN, pages 88–94, 2000.

[6] Freschi, V. and Bogliolo, A. Longest common subsequence between run-length-encoded strings: a new algorithm with improved parallelism. Inf. Process. Lett., 90(4):167–173, 2004.

[7] Gabow, H., Bentley, J., and Tarjan, R. Scaling and related techniques for geometry problems. In STOC, pages 135–143, 1984.

[8] Hunt, J. W. and Szymanski, T. G. A fast algorithm for computing longest common subsequences. Commun. ACM, 20(5):350–353, 1977.

[9] Iliopoulos, C. S. and Rahman, M. S. Algorithms for computing variants of the longest common subsequence problem. Theor. Comput. Sci., 395(2-3):255–267, 2008.

[10] Iliopoulos, C. S. and Rahman, M. S. New efficient algorithms for the LCS and constrained LCS problems. Inf. Process. Lett., 106(1):13–18, 2008.

[11] Iliopoulos, C. S. and Rahman, M. S. A new efficient algorithm for computing the longest common subsequence. Theory Comput. Syst., 45(2):355–371, 2009.

[12] Sayood, K. Introduction to Data Compression. Morgan Kaufmann Publishers Inc., 2000.

[13] Levenshtein, V. Binary codes capable of correcting deletions, insertions, and reversals. Problems in Information Transmission, 1:8–17, 1965.

[14] Liu, J. J., Wang, Y.-L., and Lee, R. C. T. Finding a longest common subsequence between a run-length-encoded string and an uncompressed string. J. Complexity, 24(2):173–184, 2008.

[15] Masek, W. J. and Paterson, M. A faster algorithm computing string edit distances. J. Comput. Syst. Sci., 20(1):18–31, 1980.

[16] Mitchell, J. A geometric shortest path problem, with application to computing a longest common subsequence in run-length encoded strings. Technical report, Department of Applied Mathematics, SUNY Stony Brook, 1997.

[17] Myers, E. W. An O(ND) difference algorithm and its variations. Algorithmica, 1(2):251–266, 1986.

[18] Nakatsu, N., Kambayashi, Y., and Yajima, S. A longest common subsequence algorithm suitable for similar text strings. Acta Inf., 18:171–179, 1982.

[19] Pearson, W. and Lipman, D. Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences, USA, 85:2444–2448, 1988.

[20] Rahman, M. S. and Iliopoulos, C. S. Algorithms for computing variants of the longest common subsequence problem. In ISAAC, pages 399–408, 2006.

[21] van Emde Boas, P. Preserving order in a forest in less than logarithmic time and linear space. Information Processing Letters, 6:80–82, 1977.

[22] Wagner, R. A. and Fischer, M. J. The string-to-string correction problem. J. ACM, 21(1):168–173, 1974.


Universidade Federal de Lavras 

INFOCOMP Journal of Computer Science 

Publication guidelines (July of 2011) 

1. INFOCOMP publishes original scientific and technological papers in English. The papers should be related to Computer Science.

2. Papers should be submitted in PDF format, without author information, using the JEMS system (https://submissoes.sbc.org.br/infocomp) of the Brazilian Computer Society (SBC). A login and password are required and can be obtained online in the JEMS system.

3. Papers should follow the INFOCOMP format: Letter paper (21.5x28.0cm), 1.1 line spacing, Times New Roman 10, justified text, with top, bottom, left and right margins of 2.5 cm. The number of pages should not exceed 12. Papers with more than 12 pages may be accepted after analysis by the editorial board.

4. The first page should contain the title, authors (only in the final version), abstract and keywords. After these, it should contain the following centered text: “(Received January 1st, 2005 / Accepted December 31st, 2005)”, to be edited later with the correct dates. This information should not exceed one page and should be in one column. The title should be in font size 14.

5. The authors’ names (only in the final, accepted version) should be placed side by side, each identified by a superscript number. The affiliation and e-mail address of each author should be written right below the names.

6. The text of the paper should be formatted in two columns, separated by 0.5 cm, with numbered sections and subsections. Examples are: 1. Introduction, 1.1. Terminology. Figures and tables may occupy the full page width, if necessary. Figure and table titles should be numbered and centered below them.

7. The references should be numbered and listed in alphabetical order. Ex: [5] Hougardy, S. Even pairs and the strong perfect graph conjecture, Discrete Math. v.154, p.277-288, 1996. Citations should be made by reference number. “In [5], it was proved that...” is an example of a citation.

8. Footnotes may be accepted, when strictly necessary, for explanations that cannot be included in the text, such as: (a) name of the research institution; (b) support organizations and other financial support; (c) reference to the publication as part of an MSc or PhD thesis; (d) personal communication.

9. Upon submission, the authors grant the right to publish and format their articles.

10. Non-compliance with these rules will result in non-acceptance of the paper.

11. Publication in INFOCOMP is free of charge.
