
Journal of Computers
ISSN 1796-203X
Volume 5, Number 1, January 2010

Contents

Special Issue: Selected Papers of the IEEE International Conference on Computer and Information Technology (ICCIT 2008)
Guest Editors: Syed Mahfuzul Aziz, Vijayan K. Asari, M. Alamgir Hossain, Mohammad A. Karim, and Mariofanna Milanova

Guest Editorial
Syed Mahfuzul Aziz, Vijayan K. Asari, M. Alamgir Hossain, Mohammad A. Karim, and Mariofanna Milanova .......... 1

SPECIAL ISSUE PAPERS

Cutting a Cornered Convex Polygon Out of a Circle
Syed Ishtiaque Ahmed, Md. Ariful Islam, and Masud Hasan .......... 4

Decision Tree Based Routine Generation (DRG) Algorithm: A Data Mining Advancement to Generate Academic Routine and Exam-time Tabling for Open Credit System
Ashiqur Md. Rahman, Sheik Shafaat Giasuddin, and Rashedur M Rahman .......... 12

Anomaly Network Intrusion Detection Based on Improved Self Adaptive Bayesian Algorithm
Dewan Md. Farid and Mohammad Zahidur Rahman .......... 23

Performance Evaluation for Question Classification by Tree Kernels using Support Vector Machines
Muhammad Arifur Rahman .......... 32

Recurrent Neural Network Classifier for Three Layer Conceptual Network and Performance Evaluation
Md. Khalilur Rhaman and Tsutomu Endo .......... 40

An Enhanced Short Text Compression Scheme for Smart Devices
Md. Rafiqul Islam and S. A. Ahsan Rajon .......... 49

Design and Analysis of an Effective Corpus for Evaluation of Bengali Text Compression Schemes
Md. Rafiqul Islam and S. A. Ahsan Rajon .......... 59

A Corpus-based Evaluation of a Domain-specific Text to Knowledge Mapping Prototype
Rushdi Shams, Adel Elsayed, and Quazi Mah-Zereen Akter .......... 69

Implementation of Low Density Parity Check Decoders using a New High Level Design Methodology
Syed Mahfuzul Aziz and Minh Duc Pham .......... 81

REGULAR PAPERS

A Formal Model for Abstracting the Interaction of Web Services
Li Bao, Weishi Zhang, and Xiong Xie .......... 91

Performance Evaluation of Elliptic Curve Projective Coordinates with Parallel GF(p) Field Operations and Side-Channel Atomicity
Turki F. Al-Somani .......... 99

VPRS-Based Knowledge Discovery Approach in Incomplete Information System
Shibao Sun, Ruijuan Zheng, Qingtao Wu, Tianrui Li .......... 110

NaXi Pictographs Input Method and WEFT
Hai Guo and Jing-ying Zhao .......... 117

A Systematic Decision Criterion for Urban Agglomeration: Methodology for Causality Measurement at Provincial Scale
Yaobin Liu and Li Wan .......... 125

Application of Improved Fuzzy Controller for Smart Structure
Jingjun Zhang, Liya Cao, Weize Yuan, Ruizhen Gao, and Jingtao Li .......... 131

Numerical Simulation of Snow Drifting Disaster on Embankment Project
Shouyun Liang, Xiangxian Ma, and Haifeng Zhang .......... 139

The Simulation of Extraterrestrial Solar Radiation Based on SOTER in Zhangpu Sample Plot and Fujian Province
Zhi-qiang Chen and Jian-fei Chen .......... 144

Multiplicate Particle Swarm Optimization Algorithm
Shang Gao, Zaiyue Zhang, and Cungen Cao .......... 150

Research and Design of Intelligent Electric Power Quality Detection System Based on VI
Yu Chen .......... 158



Special Issue on Selected Papers of the IEEE International Conference on Computer and Information Technology (ICCIT 2008)

Guest Editorial

The unprecedented advances in hardware and software technologies, computer communications, networking technologies and protocols, the Internet, and parallel, distributed and mobile computing are allowing us to enhance the way we go about our everyday business. Consequently, the demand imposed by these new and innovative applications of computers and information technologies continues to challenge researchers to seek innovative solutions.

This Special Issue presents selected papers from the IEEE International Conference on Computer and Information Technology (ICCIT 2008), held on December 25-27, 2008 at Khulna University of Engineering and Technology in Bangladesh. Before introducing the synopsis of the papers, a brief introduction to the history of ICCIT is in order. ICCIT 2008 was the eleventh annual conference in the series; the first was held in Dhaka, Bangladesh, in 1998. Since then the conference has grown into one of the largest conferences in the South Asian region focusing on computer technologies, IT and related areas, with participation of academics and researchers from many countries. A double-blind review process is followed, whereby each paper submitted to the conference is reviewed by at least two independent reviewers of high international standing. The acceptance rate of papers in recent years has been less than 30%, indicative of the quality of work the papers need to demonstrate to be accepted for presentation at the conference. The proceedings of ICCIT 2008 are included in IEEE Xplore.

In 2008, a total of 538 full papers were submitted to the conference, of which 158 were accepted after reviews conducted by an international program committee comprising 77 members from 12 countries, with assistance from 83 reviewers. From these 158, only 21 highly ranked papers were invited for this Special Issue. The authors were invited to enhance their papers significantly and submit them for review. Of those, only nine papers survived the review process and have been selected for inclusion in this Special Issue. The authors of these papers represent academic and/or research institutions from Australia, Bangladesh, Japan, and the United Kingdom. These nine papers cover four domains of computing, namely application-driven algorithms, classification, text compression, and electrical/digital systems.

The first paper, by S.I. Ahmed, M.A. Islam, and M. Hasan, presents algorithms for efficiently cutting a cornered convex polygon out of a circle. The problem of cutting small polygonal objects efficiently out of larger planar objects arises in many applications, such as metal sheet cutting, furniture manufacturing, ceramic industries, ornaments, and leather industries. One of these algorithms is shown to have a better running time and the other a better approximation ratio compared to the known algorithms. The next paper, by A.M. Rahman, S.S. Giasuddin, and R.M. Rahman, presents heuristic-based strategies that generate efficient academic routines and exam timetables for an educational institution that follows an open credit system. The algorithm developed, based on decision tree and sequential search techniques, shows promising simulation results in satisfying both student and teacher preferences. To provide improved detection of network intrusion, a self-adaptive Bayesian algorithm is presented by D.M. Farid and M.Z. Rahman in the last application-driven paper. The technique proposed in this paper for alert classification is aimed at reducing false positives in intrusion detection.

The next two papers deal with classification challenges. In the first of these, M.A. Rahman presents an approach to automatic question classification through machine learning. It provides an empirical evaluation of Support Vector Machine based question classification across three variations of tree kernels as well as three major parameters. The second classification paper, authored by M.K. Rahman and T. Endo, proposes an Elman RNN-based classifier for disease classification in a doctor-patient dialog system. A three-layer memory structure is adopted to address the challenge of contextual analysis of dialog; it simulates the human brain's use of discourse information.

The next two papers address text compression issues. The first, by M.R. Islam and S.A.A. Rajon, presents a low-complexity lossless compression scheme suitable for smart devices such as cellular phones and personal digital assistants. These devices typically have small memory and relatively low processing speed, so such applications are expected to benefit from the proposed compression scheme, which offers lower computational complexity and reduced memory requirements. The next paper, by the same authors, proposes a platform for the evaluation of Bengali text compression schemes. It includes a scheme for the construction of a Bengali text compression corpus. The paper also presents a mathematical analysis of data compression performance with respect to structural aspects of corpora. The proposed corpus is expected to be useful for evaluating the compression efficiency of small and medium-sized Bengali text files.

The last two papers of this Special Issue address, or draw inspiration from, electrical/digital systems. The paper by R. Shams, A. Elsayed, and Q.M. Akter evaluates a domain-specific Text to Knowledge Mapping prototype using a corpus developed for the DC electrical circuit knowledge domain. The evaluation of the prototype considers two of its major components, namely the lexical components and the knowledge model. The domain-specific corpus is expected to be useful for developing parsing and lexical component analysis tools and to contribute to domain-specific text summarization. The final Special Issue paper, by S.M. Aziz and M.D. Pham, proposes a new high level methodology for the design and implementation of error correction decoders for digital communication. It uses a Simulink based design flow and automatic generation of HDL code using a set of emerging tools. The proposed methodology significantly reduces design effort and time while providing decoder performance comparable to tedious hand-coded HDL-based designs.

Finally, the Guest Editors would like to express their sincere gratitude to the twenty-three reviewers of the Special Issue from six countries (M.M. Ali, A.A.S. Awwal, K.P. Dahal, M. Erdmann, N. Funabiki, S. Haran, H-Y. Hsu, S.K. Garg, F. Islam, P. Jiang, J. Kamruzzaman, D. Lai, A.S. Madhukumar, D. Neagu, H. Ngo, S. Pandey, Y.H. Peng, R. Sarker, M.H. Shaheed, T. Taha, A.P. Vinod, D. Zhang, and M. Zhang), who have given immensely to this process. They have responded to the Guest Editors in the shortest possible time and dedicated their valuable time to ensure that the Special Issue contains high-quality papers with significant novelty and contributions.

Guest Editors:

Syed Mahfuzul Aziz
School of Electrical & Information Engineering, University of South Australia, Mawson Lakes, SA 5095, Australia

Vijayan K. Asari
Department of Electrical & Computer Engineering, Old Dominion University, 231 Kaufman Hall, Norfolk, VA 23529, USA

M. Alamgir Hossain
Department of Computing, University of Bradford, Bradford BD7 1DP, UK

Mohammad A. Karim
Office of Research, Old Dominion University, 4111 Monarch Way #203, Norfolk, VA 23508, USA

Mariofanna Milanova
Department of Computer Science, University of Arkansas at Little Rock, Dickinson Hall 515, 2801 S. University Avenue, Little Rock, AR 72204, USA

Syed Mahfuzul Aziz received Bachelor's and Master's degrees, both in electrical & electronic engineering, from Bangladesh University of Engineering & Technology (BUET) in 1984 and 1986 respectively. He received a Ph.D. degree in electronic engineering from the University of Kent (UK) in 1993 and a Graduate Certificate in higher education from Queensland University of Technology, Australia in 2002. He was a Professor at BUET until 1999, and led the development of the teaching and research programs in integrated circuit (IC) design in Bangladesh. He joined the University of South Australia in 1999, where he is currently serving as the inaugural academic director of the first year engineering program. He was a visiting scholar at the University of Texas at Austin in 1996 and a visiting professor at the National Institute of Applied Science Toulouse, France in 2006. During 2001-2003 Dr. Aziz led avionics hardware modelling projects funded by the Australian Defence Science and Technology Organisation (DSTO). Since 2005 he has been leading an endoscopic capsule project and a near infrared spectroscopic instrumentation project in collaboration with the Women's and Children's Hospital and the South Australian Spinal Cord Injury Research Centre respectively. Recently he has received funding from the Pork CRC, Australia for a project on Precision Livestock Farming. He has authored over eighty-five research papers. His research interests include CMOS IC design, modelling/synthesis of high performance digital systems, biomedical engineering and engineering education. Dr. Aziz is a senior member of the IEEE. He has received numerous professional awards, including international and Australian national teaching awards. He has served as a member of the program committees of many international conferences and was the organising secretary of the inaugural International Conference on Computer and Information Technology (ICCIT) in 1998. He reviews papers for the IEEE Transactions on Computers and Electronics Letters, UK. Recently he has been appointed a reviewer of the National Priorities Research Program, a flagship funding scheme of the Qatar National Research Fund.


Vijayan K. Asari is a Professor in Electrical and Computer Engineering at Old Dominion University, Virginia, USA, and Director of the Computational Intelligence and Machine Vision Laboratory (Vision Lab) at ODU. He received the Bachelor's degree in electronics and communication engineering from the University of Kerala (College of Engineering, Trivandrum), India, in 1978, and the M.Tech and Ph.D. degrees in electrical engineering from the Indian Institute of Technology, Madras, in 1984 and 1994 respectively. He previously worked as an Assistant Professor in Electronics and Communications at the University of Kerala (TKM College of Engineering), India. In 1996, he joined the National University of Singapore as a Research Fellow and led the research team for the development of a vision-guided microrobotic endoscopy system. He joined the School of Computer Engineering, Nanyang Technological University, Singapore in 1998 and led the computer vision and image processing related research activities in the Center for High Performance Embedded Systems at NTU. Dr. Asari joined Old Dominion University in fall 2000. He has so far published more than 250 research articles, including 54 peer reviewed journal papers. His current research interests include signal processing, image processing, computer vision, pattern recognition, neural networks, and high performance and low power digital architectures for application specific integrated circuits. Dr. Asari is a Senior Member of the IEEE, and a member of the IEEE Computational Intelligence Society (CIS), the IEEE Computer Society, the IEEE Circuits and Systems Society, the Association for Computing Machinery (ACM), the Society of Photo-Optical Instrumentation Engineers (SPIE), and the American Society for Engineering Education (ASEE). He was awarded two United States patents in 2008 with his former graduate students.

M. Alamgir Hossain received the DPhil degree from the University of Sheffield, UK. Currently, he is serving as a senior lecturer in the Department of Computing at the University of Bradford, where he is an active member of the artificial intelligence (AI) research group. Prior to this he held academic positions at Sheffield University (as visiting research fellow), Sheffield Hallam University (as senior lecturer) and the University of Dhaka (as Chairman & Associate Professor of the Computer Science & Engineering Department). He has extensive research experience in high performance real-time computing, intelligent systems, optimisation, systems biology and adaptive control. He is currently leading an EU funded project, eLINK (about 5.5 million EURO), which has ten partners from Asia and Europe. He is also acting as the UK co-ordinator of a British Council funded research network project for a higher education link programme. Dr. Hossain is currently supervising 11 PhD students, mostly in the areas of intelligent systems, optimisation and systems biology. In the past, he has been involved in many funded research projects and joint research with companies, including Balfour Beatty Rail, Goodrich Engine Design, Aramco (Saudi Arabia), NEC (Japan), etc. Dr. Hossain has acted as programme chair, organising chair and IPC member of many international conferences. He is currently serving as an editor and member of the editorial board of three journals. He has reviewed many journal papers, including for the IEEE Transactions on SMC, Networking, and Aerospace and Electronic Systems, IET journals, Elsevier Science, etc. Dr. Hossain has published over 120 refereed research articles and 12 books. He received the "IEE F C Williams" award for a research article in 1996. He is a member of the IEEE and Secretary of the CLAWAR Association.

Mohammad Ataul Karim is Vice President for Research of Old Dominion University in Norfolk, Virginia. Previously, he served as dean of engineering at the City College of New York of the City University of New York. His research areas include information processing, pattern recognition, computing, displays, and electro-optical systems. Dr. Karim is the author of 15 books, 6 book chapters, and over 350 articles. He is North American Editor of Optics & Laser Technology and an Associate Editor of the IEEE Transactions on Education. He has served as guest editor for fifteen journal special issues. Professor Karim is an elected fellow of the Optical Society of America, the Society of Photo-Optical Instrumentation Engineers, the Institute of Physics, the Institution of Engineering & Technology, and the Bangladesh Academy of Sciences. He received his BS in physics in 1976 from the University of Dacca, Bangladesh, and MS degrees in both physics and electrical engineering, and a Ph.D. in electrical engineering, from the University of Alabama in 1978, 1979, and 1981 respectively.

Mariofanna Milanova is an Associate Professor of Computer Science in the Department of Computer Science at the University of Arkansas at Little Rock, USA. She received her M.Sc. degree in Expert Systems and AI in 1991 and her Ph.D. degree in Computer Science in 1995 from the Technical University, Sofia, Bulgaria. Dr. Milanova did her post-doctoral research in visual perception at the University of Paderborn, Germany. She has extensive academic experience at various academic and research organizations, including the Navy SPAWAR Systems Center in San Diego, USA, the University of Louisville, USA, the Air Force, Dayton, USA, the National Polytechnic Institute Research Center in Mexico City, Mexico, the Technical University of Sofia in Bulgaria, the University of Sao Paulo in Brazil, the University of Porto in Portugal, the Polytechnic University of Catalunya in Spain, and the University of Paderborn in Germany. She has had grants from the German Research Foundation, the Brazilian FAPESP State of Sao Paulo Research Foundation, the US National Science Foundation, the European Community, NATO, and the US Department of Homeland Security. Dr. Milanova is a Senior Member of the IEEE and the IEEE Computer Society, a member of IAPR, a member of IEEE Women in Engineering, a member of the Society for Neuroscience and a member of the Cognitive Neuroscience Society. Milanova serves as a book editor of two books and associate editor of several international journals. Her main research interests are in the areas of artificial intelligence, biomedical signal processing and computational neuroscience, computer vision and communications, machine learning, and privacy and security based on biometric research. She has published and co-authored more than 60 publications, over 33 journal papers, 11 book chapters, numerous conference papers and 2 patents.




Cutting a Cornered Convex Polygon Out of a Circle

Syed Ishtiaque Ahmed, Md. Ariful Islam and Masud Hasan
Department of Computer Science and Engineering
Bangladesh University of Engineering and Technology
Dhaka 1000, Bangladesh
Email: ishtiaque@csebuet.org, arifulislam@csebuet.org, masudhasan@cse.buet.ac.bd

Abstract— The problem of cutting a convex polygon P out of a piece of planar material Q with minimum total cutting length is a well studied problem in computational geometry. Researchers have studied several variations of the problem, such as P and Q being convex or non-convex polygons and the cuts being line cuts or ray cuts. In this paper we consider yet another variation of the problem, where Q is a circle and P is a convex polygon such that P is bounded by a half circle of Q and all the cuts are line cuts. We give two algorithms for solving this problem. Our first algorithm is an O(log n)-approximation algorithm with O(n) running time, where n is the number of edges of P. The second algorithm is a constant factor approximation algorithm with approximation ratio 6.48 and running time O(n^3).

Index Terms— algorithms, approximation algorithms, computational geometry, line cut, ray cut, polygon cutting, rotating calipers

I. INTRODUCTION

The problem of cutting small polygonal objects "efficiently" out of larger planar objects arises in many industries, such as metal sheet cutting, paper cutting, furniture manufacturing, ceramic industries, fabrication, ornaments, and leather industries. This type of problem is generally known as the stock cutting problem [1]. Different types of cuts and different criteria for efficiency of cutting are considered in cutting processes, mostly depending on the type of material. The two most common types of cuts are line cuts (i.e., guillotine cuts) and ray cuts. A line cut is a line that cuts the given object into several pieces and does not run through the target object. A ray cut is a ray that runs from infinity to a certain point of the given object, possibly to a boundary point of the target object. A line cut is feasible for cutting out convex objects, since it cuts an object into pieces below and above the cut. On the other hand, ray cuts can be performed by many types of saws, such as the scroll saw, band saw, laser saw and wire saw [2]. Ray cuts can cut out non-convex objects too, but at the same time they need to make turns in the cutting process and so need some "clearance" for a turn, which can make it impossible to cut an arbitrary non-convex polygon. In particular, for applying ray cuts to a non-convex polygon it is necessary for the polygon to have no "twisted pockets", i.e., parts of the polygon boundary that do not see infinity. As a whole, a cutting process that uses only line cuts is much simpler than one that uses only ray cuts.

In a cutting process the main criterion for "efficiency" of cutting is to minimize the total cutting length, which is also known as the cutting cost. While cutting a convex polygon it may be true that the cutting cost for ray cuts is less than that for line cuts. But due to the above simplicity, line cuts are more popular for cutting convex objects and are well studied, at least theoretically [1], [3]–[9]. Moreover, it can be shown [9] that it is not always possible to replace line cuts by ray cuts to get a better cutting cost.

In this paper we consider the problem of cutting a convex polygon P out of a circle Q by using line cuts, where P is "much smaller" than Q, namely P is cornered convex with respect to Q. A cornered convex polygon P inside a circle Q is a convex polygon that is positioned completely on one side of a diameter of Q. See Fig. 1(a). The (cutting) cost of a line cut is the length of the intersection of the line with Q. After a cut is made, Q is updated to the piece containing P. A cutting sequence is a sequence of cuts such that after the last cut in the sequence we have P = Q. We give algorithms for cutting P out of Q by line cuts with the total cutting cost of the cutting sequence as small as possible. See Fig. 1(b). In many applications, such as metal sheet cutting, it is natural to have the given object as a large circular sheet and the target object as a sufficiently smaller convex polygon.
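Since Q is initially a circle, the cost of the first cut is simply a chord length: a line whose distance from the center of a circle of radius r is d (0 ≤ d < r) meets the circle in a chord of length

```latex
\ell(d) = 2\sqrt{r^{2} - d^{2}},
```

so cuts that pass closer to the center are more expensive. This elementary fact is what later arguments (e.g., the minimality of the D-separation in Lemma 3) rely on.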

A. Known results

If Q is another convex polygon with m edges, this problem with line cuts has been approached in various ways by many researchers in the computational geometry community [1], [3]–[9]. Overmars and Welzl first introduced this problem in 1985 [3]. If the cuts are allowed only along the edges of P, they proposed an O(n^3 + m)-time algorithm for this problem with optimal cutting length, where n is the number of edges of P. The problem is more difficult if the cuts are more general, i.e., not restricted to touch only the edges of P. In that case Bhadury and Chandrasekaran showed that the problem has optimal solutions that lie in the algebraic extension of the input data field [1], and due to this algebraic nature of the problem, an approximation scheme¹ is the best that one can achieve [1]. They also gave an approximation scheme with pseudo-polynomial running time [1].

¹ A ρ-approximation algorithm (similarly, an approximation scheme) has a cutting length that is ρ times (similarly, (1 + ε) times, for any value ε > 0) the optimal cutting length. Please refer to [10] for preliminaries on approximation algorithms.

Figure 1. (a) A cornered convex polygon P inside a circle Q. (b) Two different cutting sequences (bold lines) to cut P out of Q; the cutting cost of the sequence in the left figure is more than that in the right figure.

After the indication by Bhadury and Chandrasekaran [1] of the hardness of the problem, several researchers have given polynomial time approximation algorithms. Dumitrescu proposed an O(log n)-approximation algorithm with O(mn + n log n) running time [5], [6]. Then Daescu and Luo [7] gave the first constant factor approximation algorithm with ratio 2.5 + ||Q||/||P||, where ||Q|| and ||P|| are the perimeters of Q and P respectively. Their algorithm has a running time of O(n^3 + (n + m) log(n + m)). The best known constant factor approximation algorithm is due to Tan [4], with an approximation ratio of 7.9 and running time of O(n^3 + m). In the same paper [4], the author also proposed an O(log n)-approximation algorithm with improved running time of O(n + m). As the best known result so far, very recently Bereg, Daescu and Jiang [8] gave a polynomial time approximation scheme (PTAS) for this problem with running time O(m + n^6/ε^12).

For ray cuts, Demaine, Demaine and Kaplan [2] gave a linear time algorithm to decide whether a given polygon P is ray-cuttable or not. For optimally cutting P out of Q by ray cuts, if Q is convex and P is non-convex but ray-cuttable, Daescu and Luo [7] gave an almost linear time O(log^2 n)-approximation algorithm. If P is convex, they gave a linear time 18-approximation algorithm. Tan [4] improved the approximation ratios for both cases to O(log n) and 6, respectively, but with a much higher running time of O(n^3 + m).

B. Our results

All the previous results consider Q as a polygon (either convex or non-convex). However, to our knowledge, no algorithm is known when Q is a circle. In this paper, we consider the problem where Q is a circle and P is a cornered convex polygon inside Q. We give two approximation algorithms for this problem. Our first algorithm has an approximation ratio of O(log n) and runs in O(n) time. Our second algorithm has an approximation ratio of 6.48 and runs in O(n^3) time.

C. Comparison of the results

While for an approximation algorithm a constant factor approximation ratio is preferable to an input-dependent approximation ratio, the linear running time of our first algorithm is much better than the cubic running time of our second algorithm. The ratio 6.48 of our second algorithm is better than the ratio 7.9 of Tan's algorithm [4] (although the latter deals with Q as a convex polygon).

When both P and Q are convex polygons, almost all the existing algorithms on line cuts use two major steps: cutting a minimum area rectangle or a minimum area triangle from Q that bounds P, and then cutting P out of that bounding box. Our algorithms also follow a similar approach. However, we observe that the existing algorithms cannot be applied directly to solve our problem. Moreover, the running times of those algorithms are too high compared to our algorithms. In particular, Tan's [4] constant factor approximation algorithm takes O(n^3 + m) time, the O(log n)-approximation algorithm of [5], [6] takes O(mn + n log n) time, and the PTAS of Bereg et al. [8] takes O(m + n^6/ε^12) time, where m can be arbitrarily large. In contrast, the running times of our algorithms are free of m and one of them is linear. Also observe that in the existing algorithms, increasing the value of m to "approximate" Q by a circle makes them inefficient. See TABLE I for a summary of comparisons between the existing algorithms and ours.

D. Outline

The rest of the paper is organized as follows. We give some definitions and preliminaries in Section II. Then we present our two algorithms in Section III and Section IV respectively. Finally, Section V concludes the paper with some future works.

TABLE I. COMPARISON OF THE RESULTS.

Cut type  | Q      | P               | Approx. ratio     | Running time                | Reference
Ray cuts  | -      | Non-convex      | Ray-cuttable?     | O(n)                        | [2]
Ray cuts  | Convex | Convex          | 18                | O(n)                        | [7]
Ray cuts  | Convex | Non-convex      | O(log^2 n)        | O(n)                        | [7]
Ray cuts  | Convex | Convex          | 6                 | O(n^3 + m)                  | [4]
Ray cuts  | Convex | Non-convex      | O(log n)          | O(n^3 + m)                  | [4]
Line cuts | Convex | Convex          | O(log n)          | O(mn + n log n)             | [5], [6]
Line cuts | Convex | Convex          | 2.5 + ||Q||/||P|| | O(n^3 + (n + m) log(n + m)) | [7]
Line cuts | Convex | Convex          | 7.9               | O(n^3 + m)                  | [4]
Line cuts | Convex | Convex          | (1 + ε)           | O(m + n^6/ε^12)             | [8]
Line cuts | Circle | Cornered convex | O(log n)          | O(n)                        | This paper
Line cuts | Circle | Cornered convex | 6.48              | O(n^3)                      | This paper

II. PRELIMINARIES

A line cut is a vertex cut through a vertex v of P if it is tangent to P at v. Similarly, a line cut is an edge cut through an edge e of P if it contains e. At any time, the edges of P through which an edge cut has passed are called cut edges of P, and the other edges of P are called uncut edges of P. To cut P out of Q all n edges of P must become cut edges, and for that we require exactly n edge cuts. However, applying only edge cuts may not give an optimal solution, and we need vertex cuts as well.

In the rest of this section we give some elementary geometry that plays an important role in our paper. Let c be the center of Q. An edge e of P is visible from c if for every point p of e the line segment cp does not intersect P in any other point. So, if e is collinear with c, then we consider e invisible. Similarly, a vertex v of P is visible from c if the line segment cv does not intersect P in any other point.

In this paper we do not consider the diameter of Q as a chord, i.e., a chord is always smaller than a diameter. Let ll′ be a chord of Q. ll′ divides Q into two circular segments, one bigger than a half circle and the other smaller than a half circle. Let tt′ be another chord intersecting ll′ at x such that tx is in the smaller circular segment of ll′. See Fig. 2(a). The following two lemmas are obvious, and their illustration can be found in Fig. 2.

Lemma 1: xt is no bigger than ll′.

Lemma 2: Let △abc be an obtuse triangle with the obtuse angle ∠bac. Consider any line segment connecting two points b′ and c′ on ab and ac, respectively, possibly one of b′ and c′ coinciding with b or c respectively. Then the angles ∠bb′c′ and ∠cc′b′ are obtuse.

Figure 2. Illustration of Lemma 1 and Lemma 2.

III. ALGORITHM 1

Our first algorithm has four phases: (1) D-separation, (2) triangle separation, (3) the obtuse phase and (4) the curving phase. In the D-separation phase, we cut out a small portion of Q (less than half of Q) which contains P (and looks like a "D"). Then in the triangle separation phase we reduce the size of Q even more by two additional cuts and bound P by almost a triangle. In the obtuse phase we ensure that all the portions of Q that are not inside P are inside some obtuse triangles. Finally, in the curving phase we cut P out of Q by cutting those obtuse triangles in rounds. Let C∗ be the optimal cutting length to cut P from Q. Clearly C∗ is at least the length of the perimeter of P.

A. D-separation

A D of the circle Q is a circular segment of Q that is smaller than a half circle of Q. By a D-separation of P from Q we mean a line cut of Q that creates a D containing P. In general, for a circle Q and a convex polygon P there may not exist any D-separation of P. But in our case, since P is cornered, there always exists a D-separation of P. We first find a D-separation that has minimum cutting cost.

Lemma 3: A minimum-cost D-separation, C1, can be found in O(n) time.

Proof: Clearly C1 must touch P. So, C1 must be a vertex cut or an edge cut of P. Observe that any vertex cut or edge cut that is a D-separation must be through a visible vertex or a visible edge. Let e be a visible edge of P. Let ll′ be a line cut through e. Let cp be the line segment perpendicular to ll′ at p. If p is a point of e, then we call e a critical edge of P and ll′ a critical edge cut of P. Since P is convex, it can have at most one critical edge. Similarly, let v be a vertex of P and let tt′ be a vertex cut through v. Let cp be the line segment perpendicular to tt′ at p. If tt′ is such that p = v, then we call v a critical vertex of P and tt′ a critical vertex cut of P. Again, P can have at most one critical vertex. Moreover, P has a critical edge if and only if it does not have a critical vertex. See Fig. 3. Now, if P has the critical edge e (and no critical vertex), then C1 is the corresponding critical edge cut ll′. C1 is minimum, because any other vertex cut or edge cut of P is either closer to c (and thus bigger) or does not separate c from P. On the other hand, if P has the critical vertex v, then C1 is the corresponding critical vertex cut tt′ of P. Again, C1 is minimum for the same reason.

Figure 3. Critical edge and critical vertex.

For running time, all visible vertices and visible edges of P can be found in linear time. Then finding whether an edge of P is critical takes constant time, so over all edges, finding the critical edge, if it exists, takes O(n) time. Similarly, for each visible vertex v we can check in constant time whether v is critical or not by comparing the angles of cv with the two adjacent edges of v. Over all visible vertices, this takes linear time.
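To make the scan concrete, here is a minimal Python sketch of the linear search for the critical edge or critical vertex described above. It assumes P is given as a list of (x, y) vertices in counter-clockwise order and that c lies outside P (which holds for a cornered polygon); the function name and representation are illustrative, not taken from the paper.

```python
def find_critical_cut(poly, c):
    """Linear scan for the critical edge or critical vertex of a convex
    polygon poly (CCW list of (x, y) vertices) with respect to the circle
    center c, as in the proof of Lemma 3.  Returns ('edge', i) if the foot
    of the perpendicular from c to the supporting line of the visible edge
    (poly[i], poly[i+1]) lies on that edge, or ('vertex', i) if the line
    through poly[i] perpendicular to c - poly[i] is tangent to poly."""
    def sub(u, v):
        return (u[0] - v[0], u[1] - v[1])
    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1]
    def cross(u, v):
        return u[0] * v[1] - u[1] * v[0]

    n = len(poly)
    for i in range(n):                        # look for a critical edge
        a, b = poly[i], poly[(i + 1) % n]
        ab = sub(b, a)
        if cross(ab, sub(c, a)) >= 0:         # edge does not face c: not visible
            continue
        t = dot(sub(c, a), ab)
        if 0 <= t <= dot(ab, ab):             # perpendicular foot lies on the edge
            return ('edge', i)
    for i in range(n):                        # otherwise: look for a critical vertex
        v = poly[i]
        cv = sub(c, v)
        u, w = poly[i - 1], poly[(i + 1) % n]
        # the cut through v perpendicular to cv is tangent to P when both
        # neighbours (and, by convexity, all of P) lie away from c
        if dot(sub(u, v), cv) <= 0 and dot(sub(w, v), cv) <= 0:
            return ('vertex', i)
    return None
```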

Lemma 4: The cost of C1 is at most C∗.

Proof: Consider any optimal cutting sequence C with cutting cost C∗. C must separate P from c. However, it may do that using a single cut (Case 1) or using more than one cut (Case 2). In Case 1, if it uses a single cut then it is in fact doing a D-separation. By Lemma 3, it cannot do better than C1, and therefore the cost of C1 is at most C∗.

In Case 2, there are several sub-cases. We will prove that in any case the cutting cost of separating P from c is even higher than that for a single cut. Let C be the first cut in C that separates c from P. C cannot be the very first cut of C; otherwise it is doing a D-separation and we are in Case 1. It follows that C is not a (complete) chord of Q. For the rest of the proof please refer to Fig. 4. Let a and a′ be the two end points of C. Let bb′ be the chord of Q that contains aa′, where b is closer to a than to a′. At least one of a and a′ is not incident on the boundary of Q. We first assume that a is not incident on the boundary of Q but a′ is (Case 1a). Let Cx = xx′ be the cut that was applied immediately before C and intersects C at a. Since Cx does not separate c from P, the bigger circular segment created by Cx contains P, c and a. Now if x′ is an end point of Cx, then by Lemma 1 ab is smaller than xx′. It implies that having Cx in addition to C1 increases the cost of separating P from c (see Fig. 4(a)). Similarly, if x′ is not an end point of Cx, and thus another cut is involved, then by Lemma 1 the cost will be even more (see Fig. 4(b)).

Now consider the case when both a and a′ are not incident on the boundary of Q (Case 1b). So at least two cuts are required to separate ab and a′b′ from bb′. Let those two cuts be Cx and Cy respectively. Let xx′ and yy′ be the two chords of Q along which Cx and Cy are applied. Now if Cx and Cy do not intersect, then for each of them, by applying the argument of Case 1a, we can say that the cutting cost is no better than that for a single cut. But handling the case when Cx and Cy intersect is not obvious (see Fig. 4(c, d)). Let z′ be their intersection point. Again, there may be several sub-cases: x and y may or may not be end points of Cx and Cy. Assume that both x and y are end points of Cx and Cy. Remember that none of Cx and Cy separates P from c. So, the region bounded by xz′y must contain P and c inside it. In that case the total length of xz′ and yz′ is at least the diameter of Q, which is bigger than bb′ (see Fig. 4(c)). For the other cases, where at least one of x and y is not an end point of Cx or Cy, respectively, by Case 1a the cost is even more (for example, see Fig. 4(d)).

B. Triangle separation

In this phase we apply two more cuts, C2 and C3, and "bound" P inside a "triangle". From there we obtain three triangles inside which we bound the remaining uncut edges of P. (In the D-separation phase, at most one edge of P becomes cut.)

Let C1 = aa′ be the cut applied during the D-separation. We apply two cuts C2 = at and C3 = a′t′ such that both of them are also tangent to P. If C2 and C3 intersect (inside Q or on the boundary of Q), then let z be the intersection point (see Fig. 5(a)); otherwise let z be the point outside Q where the extensions of C2 and C3 intersect (see Fig. 5(b)). We get three resulting triangles Ta, Ta′, Tz having a, a′ and z, respectively, as a peak. We only describe how to get Ta; the descriptions for Ta′ and Tz are analogous. If C1 is an edge cut, then let rr′ be the corresponding edge such that a is closer to r than to r′ (see Fig. 5(a)). If C1 is a vertex cut, then let r be the corresponding vertex of P. Let s be the similar vertex due to C2 (see Fig. 5(b)). Then Ta = △ars. The polygonal chain of P bounded by Ta is the edges from r to s that reside inside Ta.

Lemma 5: The total cost of C2 and C3 to achieve Ta, Ta′ and Tz is at most 2C∗. Moreover, C2 and C3 can be found in linear time.

Proof: Whether z is within Q or outside Q, the lengths of at and a′t′ cannot be more than twice the length of aa′. By Lemma 4, aa′ is no more than C∗. Therefore, the total cost of C2 and C3 is at most 2C∗. To find at in linear time, we can simply scan the boundary of P starting from the vertex or edge where aa′ touches P and check in constant time whether a tangent of P is possible through that vertex or edge. Similarly we can find a′t′ within the same time.
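As an illustration of the scan in the proof of Lemma 5, the sketch below finds, by a single pass over the boundary, the vertices of a convex polygon at which a line through a given external point (such as a or a′) is tangent to the polygon. The helper name is hypothetical, and degenerate collinear cases are ignored.

```python
def tangent_vertices(poly, a):
    """Indices of the vertices of convex polygon poly (list of (x, y)
    vertices in order) where a line through the external point a is
    tangent to poly, i.e. both neighbouring vertices lie on the same
    side of the line a-v.  Linear scan, as used to locate C2 and C3."""
    def cross(o, p, q):
        return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])

    n = len(poly)
    result = []
    for i in range(n):
        v = poly[i]
        side_prev = cross(a, v, poly[i - 1])
        side_next = cross(a, v, poly[(i + 1) % n])
        if side_prev * side_next >= 0:       # both neighbours on one side of line a-v
            result.append(i)
    return result
```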

Figure 4. Separating P from c by using more than one cut. Bold lines represent the cutting cost.

Figure 5. Triangle separation.

C. Obtuse phase

Consider the triangle Ta = △ars obtained in the previous phase. We call the vertex a the peak of Ta and the edge rs the base of Ta. Observe that the angle of Ta at a is acute. Similarly, the angle of Ta′ at its peak a′ is also acute. However, the angle of Tz at its peak z may be acute or obtuse.

For each of Ta, Ta′ and Tz (if Tz is acute) we apply a cut and obtain one or two triangles such that the angles at their peaks are obtuse and they jointly bound the polygonal chain of the corresponding triangle. We describe the construction of the triangle(s) obtained from Ta only; the descriptions for the triangles obtained from Ta′ and Tz are analogous.

Please refer to Fig. 6. Let Pa be the polygonal chain bounded by Ta. The lengths of ar and as may not be equal. W.l.o.g. assume that as is not larger than ar. Let s′ be the point on ar such that the length of as′ equals the length of as. Connect ss′ if s′ is different from r. We will find a line segment inside Ta such that it is tangent to Pa and is parallel to ss′. We have two cases: (i) ss′ itself is a tangent to Pa, and (ii) ss′ is not a tangent to Pa.

Case (i): If ss′ itself is tangent to Pa (at s), we apply our cut along ss′. This cut may be a vertex cut through s or an edge cut through the edge of Pa that is incident to s. If it is a vertex cut then we get the resulting obtuse triangle △rss′, and the polygonal chain bounded by this triangle is the same as Pa (see Fig. 6(a)). On the other hand, if the cut is an edge cut, let u be the other vertex of the cut edge. Then our resulting obtuse triangle is △rus′, and the polygonal chain bounded by this triangle is the edges from r to u (see Fig. 6(b)). Since as and as′ are of the same length, in either case the triangle △rss′ or △rus′ has an obtuse angle at s′.

Case (ii): For this case, let u, u′ be two points on as and ar, respectively, such that uu′ is tangent to Pa and is parallel to ss′. We apply the cut along uu′. Again, this cut may be a vertex cut or an edge cut. If it is a vertex cut, let g be the vertex of the cut. Then we get two obtuse triangles △ru′g and △sug, and the polygonal chains bounded by them are the sets of edges from r to g and from g to s respectively (see Fig. 6(c)). If it is an edge cut, then let gg′ be the edge of the edge cut, with u being closer to g than to g′. Then we get two obtuse triangles △ru′g′ and △sug, and the polygonal chains bounded by them are the sets of edges from r to g′ and from g to s respectively (see Fig. 6(d)). Again, since au and au′ are of the same length, in either case the pair of triangles have obtuse angles at u and u′ respectively.

Lemma 6: The total cost of obtaining obtuse triangles from Ta, Ta′, and Tz is at most C∗. Moreover, they can be found in O(log n) time.

Proof: Consider the construction of the obtuse triangle(s) from Ta. The length of the cut ss′ or uu′ is at most the length of rs, which is bounded by the length of Pa. Over all three triangles Ta, Ta′ and Tz, the total cutting length is bounded by the length of the perimeter of P. Since C∗ is at least the length of the perimeter of P, the first part of the lemma holds.

For the running time, in Ta we can find the tangent of Pa in O(log |Pa|) time by using a binary search, where |Pa| is the number of edges in Pa. Therefore, over all three triangles Ta, Ta′ and Tz, we need a total of O(log n) time.
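The O(log |Pa|) tangent search in the proof of Lemma 6 can be realized by a binary search for the vertex of the convex chain Pa that is extreme in the direction of the inward normal of ss′; the tangent parallel to ss′ passes through that vertex. A minimal sketch, assuming the dot products along the chain are strictly unimodal (no ties):

```python
def extreme_vertex(chain, direction):
    """Index of the vertex of a convex polygonal chain that is farthest
    in the given direction; the tangent to the chain parallel to ss'
    passes through this vertex.  Binary search, O(log n)."""
    def f(i):
        return chain[i][0] * direction[0] + chain[i][1] * direction[1]

    lo, hi = 0, len(chain) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid + 1) > f(mid):   # still ascending: extreme vertex lies further right
            lo = mid + 1
        else:                     # descending: extreme vertex is mid or to its left
            hi = mid
    return lo
```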

D. Curving phase

After the obtuse phase, the edges of P that are not yet cut are partitioned and bounded into polygonal chains by at most six obtuse triangles. In this phase we apply the cuts in rounds until all edges of P are cut. Our cutting procedure is the same for all obtuse triangles and we describe it for only one.

Let Tu = △gus, with peak u and base gs, be an obtuse triangle (see Fig. 7). Let the polygonal chain bounded by Tu be Pu. Let the edges of Pu be e1, . . . , ek with k ≥ 2.


We shall apply cuts in rounds, and all cuts will be edge cuts. In the first round we apply an edge cut C′ along the edge e_{k/2}. Then we connect g and s with the two end points of e_{k/2} to get two disjoint triangles. Since Tu is obtuse, by Lemma 2 these two new triangles are also obtuse. In the next round we work on each of these two triangles recursively, and we continue until all edges of Pu are cut.
are cut.<br />

g<br />

u<br />

ek/2<br />

Tu<br />

Figure 7. Curving phase<br />

Lemma 7: In curving phase, to cut all edges <strong>of</strong> Pu<br />

there can be at most O(log k) rounds <strong>of</strong> cuts. Moreover,<br />

the total cost <strong>of</strong> these cuts is |Pu| log k, where |Pu| is the<br />

number <strong>of</strong> edges <strong>of</strong> Pu, and the running time is O(|Pu|).<br />

Pro<strong>of</strong>: At each round the number <strong>of</strong> triangles get<br />

doubled and the number <strong>of</strong> edges that become cut also<br />

get doubled. So after log k rounds all k edges are cut.<br />

Let the length <strong>of</strong> Pu be Lu. At each round the total<br />

length <strong>of</strong> the bases plus the length <strong>of</strong> the edges that are<br />

being cut is no more than Lu. So, the total cutting cost<br />

is Lu log k over all rounds.<br />

For running time, finding the edge e k/2 takes constant<br />

time. Moreover, once an edge becomes cut, it will not be<br />

considered for an edge cut again. So, there can be at most<br />

|Pu| edge cuts, which gives a running time <strong>of</strong> O(|Pu|).<br />

Corollary 1: The total cutting cost of the curving phase is at most C∗ log n, and the running time is O(n).

Proof: Over all six obtuse triangles, Σ Lu is no more than the perimeter of P, which is bounded by C∗, and Σ |Pu| = n. Therefore, over all six triangles, the total cost is at most Σ Lu log k ≤ C∗ log n and the running time is O(Σ |Pu|) = O(n).

Figure 6. Obtaining obtuse triangle(s) from Ta.

Combining the results of all four phases, we get the following theorem.

Theorem 1: Given a circle Q and a cornered convex polygon P within Q, P can be cut out of Q by using line cuts in O(n) time with a cutting cost of O(log n) times the optimal cutting cost.


IV. ALGORITHM 2

In this section we present our second algorithm, which cuts P out of Q with a constant approximation ratio of 6.48 and a running time of O(n³). This algorithm has three phases: (1) D-separation, (2) cutting out a minimum area rectangle that bounds P, and (3) cutting P out of that rectangle by only edge cuts.

The D-separation phase is the same as that of our first algorithm, and we assume that this phase has already been applied.

A. Cutting a minimum area bounding rectangle

We will use the technique of rotating calipers, a well-known concept in computational geometry first introduced by Toussaint [11]. We use the method described by Toussaint [11].

A pair of rotating calipers remains parallel. The calipers rotate along the boundary of an object while staying tangent to it. If the object is a convex polygon P, then one fixed caliper, which we call the base caliper, is tangent along an edge e of P, and the other caliper is tangent to a vertex or an edge of P. In the next step of the rotation, the base caliper moves to the edge adjacent to e, and so on. The rotation is complete when the base caliper has visited all edges of P.

For our case we use two pairs of rotating calipers, where one pair is orthogonal to the other. We fix only one caliper, among the four, as the base caliper. As we rotate along the boundary of P, we always place the base caliper along an edge of P and adjust the other three calipers as tangents of P. The four calipers give us a bounding rectangle of P. After the rotation is complete, we identify the minimum area rectangle among the n bounding rectangles. For that rectangle we apply one cut


along each of its edges that are not collinear with the chord of D.

The above technique can be carried out in O(n) time [11]. Once the base caliper is placed along an edge, the other three calipers are also rotated and adjusted to make them tangent to P. Notice that each caliper "traverses" an edge or a vertex exactly once.
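To illustrate the n candidate rectangles (one per edge of P), the sketch below computes the minimum-area bounding rectangle by brute force, projecting all vertices onto each edge direction and its normal. It is an O(n²) stand-in for the O(n) calipers procedure described above, not the authors' implementation.

import math

# Illustrative sketch (assumption: the minimum-area bounding rectangle has a
# side collinear with some polygon edge, so one candidate rectangle per edge).
def min_area_bounding_rectangle(hull):
    # hull: vertices (x, y) of a convex polygon in order.
    best = None
    n = len(hull)
    for i in range(n):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % n]
        ex, ey = x2 - x1, y2 - y1
        length = math.hypot(ex, ey)
        if length == 0:
            continue
        ux, uy = ex / length, ey / length      # unit vector along the base edge
        vx, vy = -uy, ux                       # unit normal (the orthogonal pair)
        us = [px * ux + py * uy for px, py in hull]
        vs = [px * vx + py * vy for px, py in hull]
        area = (max(us) - min(us)) * (max(vs) - min(vs))
        if best is None or area < best[0]:
            best = (area, i)                   # remember the best base edge
    return best

A rotating-calipers implementation obtains the same minimum by updating the four extreme vertices incrementally as the base edge advances, which is what yields the O(n) bound.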

Figure 8. Rotating two pairs of orthogonal calipers. The broken lines are the total cutting cost of this phase.

Lemma 8: The cost of cutting a minimum area rectangle out of the D achieved from the D-separation phase is no more than 2.57C∗.

Proof: There can be at most four pieces, other than the one inside the bounding rectangle, resulting from the four cuts applied to D. The length of each cut is no more than the portion of the perimeter of D that is separated by that cut. So, as a whole, the total cutting cost is no more than the perimeter of D (see also Fig. 8).

Now the perimeter of D is CD + Rθ, where CD is the length of the chord of D and θ is the angle made by the arc of D at the center c. Since CD is the cost of D-separation, which by Lemma 4 is bounded by C∗, θ is at most π, and C∗ cannot be less than 2R, the maximum perimeter of D is C∗ + (C∗/2)π ≈ 2.57C∗.
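Spelling out the arithmetic of the bound (with 1 + π/2 rounded to 2.57):

\operatorname{per}(D) = C_D + R\theta \le C^{*} + \tfrac{C^{*}}{2}\pi = \bigl(1 + \tfrac{\pi}{2}\bigr)C^{*} \approx 2.57\,C^{*}.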

B. Cutting P out of a minimum area rectangle by only edge cuts

For this phase we simply apply the constant-factor approximation algorithm of Tan [4]. If P is bounded by a minimum area rectangle, then Tan's algorithm cuts P out of the rectangle by using only edge cuts in O(n³) time and with approximation ratio (1.5 + √2) [4].

We summarize the result of our second algorithm in the following theorem.

Theorem 2: Given a circle Q and a cornered convex polygon P of n edges within Q, P can be cut out of Q by using line cuts in O(n³) time with a cutting cost of at most 6.48 times the optimal cutting cost.

Proof: We have a cutting cost of C∗ for D-separation, 2.57C∗ for cutting the minimum area rectangle, and (1.5 + √2)C∗ for cutting P out of the rectangle, which gives a total cost of at most 6.48C∗.
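The three phase costs add up as follows:

C^{*} + 2.57\,C^{*} + (1.5 + \sqrt{2})\,C^{*} \approx (1 + 2.57 + 2.91)\,C^{*} \approx 6.48\,C^{*}.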


V. CONCLUSION

In this paper we have given two algorithms for cutting a convex polygon P out of a circle Q by using line cuts, where P resides on one side of a diameter of Q. Our first algorithm is an O(log n)-approximation algorithm with O(n) running time, where n is the number of edges of P. Our second algorithm is a 6.48-approximation algorithm with running time O(n³).

While there exist several algorithms for the case when Q is another polygon, we are the first to address the problem where Q is a circle. Our first algorithm has a better running time, and the second one has a better approximation ratio, than the best known previous algorithms that deal with Q as a convex polygon.

There remain several open problems and directions for future research:

1) The general case of this problem, where P is not necessarily on one side of a diameter of Q, is still to be solved.
2) We also think it would be interesting to see approximation schemes for this problem.
3) Finally, in many industrial applications it is common to have both P and Q as 3D objects, for which we do not know of any algorithm. In the future it would be important to design similar algorithms for the 3D case.

REFERENCES

[1] J. Bhadury and R. Chandrasekaran, "Stock cutting to minimize cutting length," European Journal of Operations Research, vol. 88, pp. 69–87, 1996.
[2] E. D. Demaine, M. L. Demaine, and C. S. Kaplan, "Polygons cuttable by a circular saw," Computational Geometry: Theory and Applications, vol. 20, pp. 69–84, 2001.
[3] M. H. Overmars and E. Welzl, "The complexity of cutting paper," in Proc. 1st Annual ACM Symposium on Computational Geometry (SoCG'85), 1985, pp. 316–321.
[4] X. Tan, "Approximation algorithms for cutting out polygons with lines and rays," in Proc. 11th International Conference on Computing and Combinatorics (COCOON'05), ser. LNCS, vol. 3595. Springer, 2005, pp. 534–543.
[5] A. Dumitrescu, "The cost of cutting out convex n-gons," Discrete Applied Mathematics, vol. 143, pp. 353–358, 2004.
[6] ——, "An approximation algorithm for cutting out convex polygons," Computational Geometry: Theory and Applications, vol. 29, pp. 223–231, 2004.
[7] O. Daescu and J. Luo, "Cutting out polygons with lines and rays," International Journal of Computational Geometry and Applications, vol. 16, pp. 227–248, 2006.
[8] S. Bereg, O. Daescu, and M. Jiang, "A PTAS for cutting out polygons with lines," in Proc. 12th Annual International Conference on Computing and Combinatorics (COCOON'06), ser. LNCS, vol. 4112. Springer, 2006, pp. 176–185.
[9] R. Chandrasekaran, O. Daescu, and J. Luo, "Cutting out polygons," in Proc. 17th Canadian Conference on Computational Geometry (CCCG'05), 2005, pp. 183–186.
[10] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms. Cambridge, Massachusetts: MIT Press, 2001.
[11] G. Toussaint, "Solving geometric problems with the rotating calipers," in Proc. 2nd IEEE Mediterranean Electrotechnical Conference (MELECON'83), Athens, Greece, 1983.



Syed Ishtiaque Ahmed was born in 1986 in Pabna, Bangladesh. He completed his B.Sc. Engineering in Computer Science and Engineering at Bangladesh University of Engineering and Technology (BUET) in March 2009. Currently he is an M.Sc. student in the same department. He is also working as a Lecturer at United International University, Dhaka, Bangladesh. During his undergraduate studies he received the University Merit Scholarship, the Dean's List Award, and the Imdad Siatara Khan Foundation Scholarship, among others, for outstanding academic performance. His research interests focus on problems in Computational Geometry.

Md. Ariful Islam was born in Bogra, Bangladesh in 1987. He received his B.Sc. Engineering in Computer Science and Engineering from Bangladesh University of Engineering and Technology (BUET) in March 2009. While studying as an undergraduate, he received the University Merit Scholarship, the Dean's List Award, and the Educational Board Scholarship. He was the moderator of BuetNet (the association of BUET's internal network). He is currently working as a Quantitative Software Developer at Stochastic Logic Limited, Bangladesh. At the same time, he is pursuing his M.Sc. in Computer Science and Engineering at BUET. His research interests include Computational Geometry, Data Mining, and Artificial Intelligence.

Masud Hasan was born in Kurigram, Bangladesh in 1973. He received his B.Sc. and M.Sc. Engineering in Computer Science and Engineering from Bangladesh University of Engineering and Technology (BUET) in 1998 and 2001, respectively. He received his PhD in Computer Science from the University of Waterloo, Canada in 2005. In 1998 he joined the Department of Computer Science and Engineering of BUET as a Lecturer, and since 2001 he has been an Assistant Professor in the same department. His undergraduate and graduate teaching includes Algorithms, Graph Theory, Mathematical Analysis of Computer Science, Computational Geometry, and Algorithms in Bioinformatics. His main research interests focus on Algorithms, Computational Geometry, and Algorithms in Bioinformatics. He is the author of more than 25 peer-reviewed international journal and conference papers. He has served as a reviewer for many international journals and conferences.



Decision Tree Based Routine Generation (DRG) Algorithm: A Data Mining Advancement to Generate Academic Routine and Exam-time Tabling for Open Credit System

Ashiqur Md. Rahman
North South University, Computer Science and Engineering Department, Dhaka, Bangladesh
Email: ashiq_rahman@yahoo.com

Sheik Shafaat Giasuddin and Rashedur M Rahman
North South University, Computer Science and Engineering Department, Dhaka, Bangladesh
Email: rshafaat@gmail.com, rashedurrahman@yahoo.com

Abstract—In this paper we propose and analyze techniques for academic routine and exam time-table generation for an open credit system. The contributions of this paper are multi-fold. First, a technique named the Decision Tree based Routine Generation (DRG) algorithm is proposed to generate an academic routine. Second, based on the DRG concept, an Exam-time Tabling Algorithm (ETA) is developed to produce a conflict-free exam-time schedule. In an open credit course registration system any student may choose any course in any semester after completion of the prerequisite course(s). This makes the problem more challenging and complex. Academic routine and exam time-table generation are in general NP-hard problems, i.e., no algorithm is known that solves them in a reasonable (polynomial) amount of time. Different heuristic-based methods have been developed to generate good time-tables. In this research we develop heuristic-based strategies that generate an efficient academic routine and exam time-table for a university that follows an open credit system. An OLAP representation helps to classify the courses, and the proposed algorithm eliminates some constraints. Day-based patterns, the minimum Manhattan distance between courses of the same teacher, and minimum conflicted course distribution are used to classify the courses. Our ETA algorithm is based on decision tree and sequential search techniques.

Index Terms—OLAP, Crosstable, Conflict List, Favorite Slot, Faculty Choice, Course Color, Day-time slot pattern.

I. INTRODUCTION

This paper describes the Decision Tree based Routine Generation (DRG) algorithm, which generates a university class routine within a tolerable range of some constraints, and the conflict-free Exam-Time Tabling Algorithm (ETA). A decision-tree based classification algorithm has been introduced to solve this NP-hard problem [22]. CLP (constraint logic programming) is a respected technology for solving hard problems that include many (non-linear) constraints [1]. A constraint propagation technique has been applied to handle the preferential requirements: teachers' slots, the courses chosen by students at pre-advising, and class room allocation. Versatile course choices may lead to a deadlock situation. Golz used priority-based heuristic ordering [2], whereas Abdennadher introduced optimized cost-based rule mining [3,4] to solve these types of problems. On the other hand, knowledge-based hyper-heuristic course scheduling using case-based reasoning has been used to maximize the rule covering area [5]. Further expansion is possible to accomplish exam-time tabling using the OLAP technique [16]. Exam-time tabling is another highly constrained combinatorial optimization problem. The major objective is to guarantee a 100% conflict-free exam schedule within a fixed interval of days. Limited room capacity and room availability problems must be overcome to place exams in each time slot. The computational time is reduced by using heuristic-based search rather than permutation of the courses for exam-time tabling. Identification of a novel heuristic is the most challenging task. Using OLAP, the proposed conflict-free Exam-Time Tabling Algorithm (ETA) produces substantial results, accommodating all students with zero conflict tolerance.

DRG presents key features for generating a class routine with minimum computational time. The heterogeneous distribution of courses is classified with maximum satisfaction of all constraints. Section II describes previous related work and preliminaries in detail. Section III illustrates the problem dimensions and the data filtering technique used in the paper; the proposed algorithm, DRG, and the classification procedure to find a feasible solution are presented in Section IV, followed by the Exam-Time Tabling Algorithm (ETA) in Section V. Extensive computational



results are presented to study the performance of both algorithms in Section VI. Finally, Section VII concludes the paper.

II. PRELIMINARIES & RELATED WORK

A university class routine generation problem – as considered in this work – consists of assigning each course to a set of slots (classes) in a limited number of class rooms, within the teachers' favorite time slots. This highly constrained problem has been optimized by simulated annealing and genetic algorithms [6,7]. Seven different major and minor objectives have been identified, and deadlock situations are overcome by randomly exploring the composite neighborhood [8]. The most closely related attempts to this work appear to be the constraint programming approach used by Boizumault [9] and the simulated annealing approaches explored by Dowsland and Thompson [10,11]. The principal innovation in DRG is the sequential use of these two methods. DRG may select some poor slots, with respect to teachers or students, within a tolerable conflict range. A similar sequential approach has been taken on other problems: White and Zhang [12] used constraint programming to seed a starting point for tabu search in solving course timetabling problems. For high school timetabling, Yoshikawa tested several combinations of two-stage algorithms, including a greedy algorithm followed by simulated annealing and a constraint programming phase followed by a randomized greedy hill climbing algorithm (which was deemed to be the best of the combinations used). In a similar vein, Burke, Newall and Weare [13] used their work on sequential construction heuristics [14] to generate initial solutions for their memetic algorithm [15].

The wide variety of courses increases the need to provide adequate exam time-tables for educational institutions. The development of an examination timetable requires the institution to schedule a number of examinations in a given set of time slots, so as to satisfy a given set of constraints. A common constraint for any educational institution is that no student can have more than one exam scheduled at the same time. Many other constraints were presented by Marlot in [17]. Sequential construction heuristics have been applied to the publicly available data in a variety of forms, by selecting exams from a randomly chosen subset of all exams by Burke [14], whereas Carter [19, 20] allows limited backtracking de-allocation of exams. On the other hand, Caramia [18] includes an optimization step after each exam allocation. Sequential construction heuristics order the exams in some way and attempt to allocate each exam to an ordered session while satisfying all the constraints. Using a memetic algorithm for exam timetabling, Burke, Newall and Weare [15] proposed a hybrid algorithm consisting of a simulated annealing phase to improve the quality of the solution and a hill climbing phase for further improvement. To avoid the local maxima problem, such solutions require random jitter [21], whereas the proposed algorithm does not rely on randomization.


III. PROBLEM DESCRIPTION & DATA FILTERING

The routine maps a set of courses chosen by students and teachers to specific rooms and time-slots. A major objective in developing an automated system is to minimize the hassle of separating conflicted courses in the students' choices. In this paper the major identified problems are: (a) the number of lectures per week for each course is fixed, (b) room overlapping is prohibited, (c) fitting the routine to the teachers' favorite timeslots, and (d) trying to assign different timeslots to courses of the same level. The minor objectives are: (e) the day-timeslot pattern for each course, (f) room capacity, (g) avoiding gaps between classes of the same teacher, if possible, (h) a single class per student per day, and (i) ensuring compactness of the inter-class time difference.

The required scattered data contains the total courses (course choices from pre-advising by students) C = {c1, c2, c3, …, cn}, where the dependencies between courses are also maintained. Here a course dependency can be defined as ∀Ci, Ci → Cj, where Ci, Cj ∈ C. For this paper the students' course choices, S = {S1, S2, S3, …, Sz}, can be derived as Sj = {ci}, where ∀i, ci ∈ C, 1 ≤ i ≤ n, and |Sj| = max_course_choice for the student, as shown in Fig. 1. Teachers' favorite timeslots are grouped according to day-timeslot patterns. Here groups A and B are formed for the teacher tk, where T = {t1, t2, t3, …, tm}, A(tk) = {favorite time_slots of tk | sequential time slots for Saturday, Monday and Wednesday} and B(tk) = {favorite time slots of tk | sequential time slots for Sunday, Tuesday and Thursday}, whereas time_slots = {1, 2, …, 30} contains 5 sequential slots per day starting from Saturday. Here Friday is considered an off-day. The priority of a teacher is given by P(tk) ∈ {1, 2, …, 10}, where a higher value represents a higher priority. An exceptional priority of 11 is also introduced, reflecting part-time faculty, whose projected time cannot be changed. Fig. 2 and Fig. 3 show the teachers' wish-lists and the course distribution among the teachers. The target of this work is to find the values of the "class slot routine" field of Fig. 3. The resultant routine vector, V = {{ci}q}, ∀i, ci ∈ C, 1 ≤ i ≤ n, 1 ≤ q ≤ 30, consists of the course classification per day required for each course and the class room availability.

In this paper, this huge dimensionality of the dataset is reduced by initiating an OLAP (On-Line Analytical Processing) representation [16]. Here a Crosstable (Cr) of (n × n) courses is initialized: Cr(n × n) = {conflicti,j}, where 1 ≤ i, j ≤ n and n is the total number of courses requested by the students (or the number of courses offered by the department). The conflict Cri,j is a positive integer that reflects the number of common students between ci and cj. The diagonal value Cri,i is the total number of students requesting the course ci. Σj Cri,j, 1 ≤ j ≤ n, i ≠ j, is the total conflict for the course ci. The maximum "chaos" (conflict) courses can easily be sorted out from this two-dimensional conflict distribution (Crosstable).
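As a concrete illustration, the Crosstable can be built directly from pre-advising records such as those of Fig. 1. The Python sketch below is illustrative only; the data layout is an assumption, not the authors' implementation.

from collections import defaultdict
from itertools import combinations

# Sketch: build the n x n Crosstable from (student, course) pre-advising rows.
# Diagonal entries count enrolment per course; off-diagonal entries count
# common students, i.e. the conflict between two courses.
def build_crosstable(preadvising):
    by_student = defaultdict(set)
    for student, course in preadvising:
        by_student[student].add(course)
    cross = defaultdict(int)
    for courses in by_student.values():
        for c in courses:
            cross[(c, c)] += 1                 # diagonal: total students of c
        for a, b in combinations(sorted(courses), 2):
            cross[(a, b)] += 1                 # conflict between a and b
            cross[(b, a)] += 1
    return cross

rows = [("S1", "c1"), ("S1", "c3"), ("S2", "c1"), ("S2", "c3"),
        ("S3", "c2"), ("S3", "c4"), ("S3", "c5"), ("S4", "c4")]
cr = build_crosstable(rows)
# For these rows: cr[("c1", "c3")] == 2 and cr[("c1", "c1")] == 2.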

To minimize the potential for time conflicts, an admissible heuristic (h) is imposed to regroup the courses according to their dissimilarity. A graph coloring algorithm is used for this clustering, where the conflicts in the Crosstable define the weights of the edges of the graph. Here the admissible heuristic is applied as the maximum number of colors needed to find the minimum number of groups of courses. Faculty redundancy is also considered as a weight, that is, different courses of the same faculty cannot be placed in the same group.
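A minimal sketch of this grouping step is shown below: greedy coloring on the conflict graph, under the assumption that any positive conflict, or a shared teacher, forbids sharing a color (since the tolerable conflict range used in this work is 0). It is an illustration, not the authors' implementation.

# Sketch: greedy graph coloring of courses. Two courses may not share a
# color (group) if they have common students in the Crosstable or the
# same teacher. Returns a dict {course: color index}.
def color_courses(courses, cross, teacher_of):
    def conflicts(a, b):
        return cross.get((a, b), 0) > 0 or teacher_of[a] == teacher_of[b]
    order = sorted(courses,
                   key=lambda c: -sum(cross.get((c, d), 0) for d in courses if d != c))
    coloring = {}
    for c in order:                            # most conflicted ("chaos") courses first
        used = {coloring[d] for d in coloring if conflicts(c, d)}
        color = 0
        while color in used:
            color += 1
        coloring[c] = color
    return coloring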

Student ID   Course ID   Semester ID
S1           C1          1
S1           C3          1
S2           C1          1
S2           C3          1
S3           C2          1
S3           C4          1
S3           C5          1
S4           C4          1

Figure 1. Courses Pre-Advised by the Students.

Teacher ID   Favorite Slots                   Priority
t1           7,8,9,13,17,18,19,22,23,24       7
t2           2,12,22,29                       11
t3           5,10,15,20,25,30                 9
t4           2,5,8,9,12,15,18,22,23,24,25     10

Figure 2. Teachers' Slots Preferences along with Priority.

Teacher ID   Course ID   Class Slot Routine   Sem. ID   Classes per Week
t1           c1          7,17                 1         2
t1           c4          8,18,24              1         3
t2           c2          2,12                 1         2
t3           c3          5,15                 1         2
t4           c5          5,15,25              1         3

Figure 3. Courses and Classes Distribution for the Teachers.

The output of the Graph Coloring Algorithm assigns a color to each course individually, as in Fig. 5(a); same-colored courses are treated as a group. Fig. 4 shows the resultant Crosstable. Each group may contain more than the threshold number of members within the tolerable conflict range. Here the number of rooms is taken as the threshold value. In this work the tolerable conflict range is set to 0.

Course ID   c1   c2   c3   c4   c5   Total Conflict
c1          40   5    7    0    0    12
c2          5    50   0    2    1    8
c3          7    0    35   5    0    12
c4          0    2    5    40   1    8
c5          0    1    0    1    45   2
            t1   t2   t3   t1   t4

Figure 4. Crosstable for n × n Courses.

This simple coloring scheme yields a measure of the undesirability of having classes overlap in the routine. It is effective to first try to fit the most "chaos" courses of high priority teachers into the routine. Random selection may be used among teachers having the same priority. The data used in this work to test the algorithm is real.

The constraint programming model and data filtering techniques used for routine generation also motivate the development of exam-time tabling. In the routine generation algorithm the colored courses form non-conflicting sets of courses. For exam-time tabling, faculty redundancy is no longer considered a constraint, but room capacity is. For ETA the graph carries an edge weight between two courses, Di,j = Ci + Cj − Cri,j, where Ci and Cj represent the numbers of students of courses ci and cj, respectively.

Figure 5. (a) Courses graph coloring for DRG. (b) Courses graph coloring for ETA.

Here ∀i, j, Σ Di,j ≤ total_room_capacity and |Di,j| ≤ room_capacity, as shown in Fig. 5(b). An important factor in grouping courses is that the number of members in each group must not exceed the total room availability. The minimum number of days required for a 100% concurrency-conflict-free exam is greater than or equal to the number of groups, i.e., the number of colors required to color the graph. If each day consists of more than one examination slot and the number of provided days is less than the number of colors, then the Crosstable makes it possible to determine the number of consecutive examinations for an individual student or group of students on a day. Each exam slot holds only one group of courses, with zero conflict, but consecutive slots may embrace some conflicts among the groups because their colors differ. By using a dynamic data structure, the number of consecutive examinations between different groups of courses can easily be worked out, as described in Section V.

IV. THE DECISION TREE BASED ROUTINE GENERATION (DRG) ALGORITHM

The aim of the proposed DRG algorithm is to classify all the courses with a degree of satisfaction. Enumerating all permutations of courses would be a time-consuming process, so a standardized branch & bound condition may be applied to reduce the problem surface area. The DRG sequentially follows 4 sets of cascading decision trees. Depending upon the emergence and success rate, the result of one tree is propagated to the next tree, as shown in Fig. 6. These transitions may lead to a solution but may also degrade the satisfaction threshold. A control portion helps to decide whether the problem solution needs to be explored further or not.

Each transition from one decision tree to another shrinks the overall problem surface area by eliminating the classified courses. A classical decision tree takes a decision depending upon some gain factor. In addition, the proposed trees concentrate on the reduced problem



dimension, which helps to classify the unclassified courses within a tolerable time complexity.

The key factors for the four decision trees are: (a) PDRG: teacher priority, (b) CDRG: highest conflicted course, (c) TDRG: tolerable conflict, and (d) NTDRG: neighbor slots, ignoring the teacher's wish list. A university routine is created once and remains unchanged for a particular semester. If the placed courses do not match the favorite slots of the teacher, the resulting dissatisfaction is much higher for a highly prioritized teacher; so we use PDRG as our first decision tree. From observation it is clear that placing a conflicted course into a dense routine vector is difficult, as it can introduce student conflicts into the routine. So it is wise to consider the more conflicted courses first, as they are the principal component reflecting the major problem surface area; by doing this the problem surface area is reduced quickly. For this reason CDRG is the second selection. The decision trees TDRG and PDRG are quite similar, but TDRG introduces a considerable amount of student conflict into the routine vector.

Figure 6. Program flow of DRG

Therefore, TDRG plays the third role in the whole algorithm. After all three decision trees, the remaining unclassified and partially classified courses show that there is no place (slot) left within the teachers' favorite slots. Exploring beyond the contour of the teachers' favorite slots is then necessary in order to satisfy each course's classes-per-week constraint. Only student conflict is considered in TDRG, where the day-time pattern is maintained strictly. On the other hand, NTDRG may dissatisfy the teachers and may not follow the day-time pattern. Hence, NTDRG is the final decision tree.

A. Priority regulated DRG (PDRG)

The first decision tree accommodates the highest prioritized teacher tk and the most conflicted course ci of tk in the routine vector with the maximum fitness value. The max_fit function tries to discover the day-time pattern for the course. If the corresponding day-time slots are already occupied by other courses, it checks the course color of the occupants. If the color of the courses is the same, it classifies the course ci with a degree; here the fitness value of a course is referred to as its degree. The day-time slot with the highest returned degree, i.e., the maximum fitness value, is selected as the class slot for ci. This operation provides a pattern-based course distribution, A(tk) or B(tk), in the routine, although some courses may be placed only partially as per their class_per_week and max_class_per_slot constraints. In this manner, the low priority teachers may suffer by not getting their classes in a sequential manner. In practice, however, 26% of the total courses can be placed with zero conflict and with a high level of satisfaction. The level of satisfaction is a quantitative measure of the placement of a course with respect to the teacher's favorite slots. The definitive PDRG tree is shown in Fig. 7. The partially placed and not yet placed courses are then selected as cascaded input for the second level of exploration. The pseudo code presented in Algorithm 1 describes the actions taken by the Priority regulated DRG algorithm (PDRG). The computational time of this operation is O(n × m), where the maximum number of courses per teacher is n and the number of faculties is m.

PDRG( )
{
  // select the highest prioritized teacher;
  // select the most conflicted course of the teacher;
  // select the corresponding faculty's least frequent favorite time slot
  1. Find the max-patterned day-time slots;
     IF the time slot is empty, PLACE the course;
  2. ELSE find the color of the course already placed in the slot;
     2.1. IF the color is the same AND within the room range AND the course
          has not already been placed on that day, PLACE the course;
     2.2. ELSE select the next max-pattern;
  3. Each time a course is PLACEd, remove the slot from the faculty's
     favorite slot list;
  4. Repeat steps 1 and 2 until the course slots remain unchanged;
}

Algorithm 1. Priority regulated DRG (PDRG)
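A condensed Python sketch of this placement loop follows. It is illustrative only: the fitness ordering, room check and data layout are simplifying assumptions rather than the authors' implementation.

# Sketch of a PDRG-style pass: for each teacher in priority order, place the
# most conflicted course into its best-fitting day-time pattern, falling back
# to the next pattern when a slot is occupied by a different color.
def pdrg_pass(teachers, routine, coloring, patterns_for, room_limit):
    # teachers: list of (priority, teacher, courses) with courses ordered by conflict;
    # routine: dict slot -> list of courses already placed in that slot.
    unplaced = []
    for _, teacher, courses in sorted(teachers, reverse=True):
        for course in courses:
            placed = False
            for pattern in patterns_for(teacher, course):   # ordered by fitness
                free = all(len(routine.get(s, [])) < room_limit and
                           all(coloring[c] == coloring[course]
                               for c in routine.get(s, []))
                           for s in pattern)
                if free:
                    for s in pattern:
                        routine.setdefault(s, []).append(course)
                    placed = True
                    break
            if not placed:
                unplaced.append(course)        # cascaded as input to CDRG
    return unplaced

The same loop structure, with a different slot-candidate generator and conflict test, also covers the CDRG, TDRG and NTDRG passes described next.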

B. Chaos eradication DRG (CDRG)

In this decision tree, the less demanded time slots (with respect to the number of rooms) are labeled as "cold", whereas highly demanded ones are labeled as "hot". Among the remaining most conflicted courses (not yet placed or only partially placed), those matching the least frequent time slots of the corresponding teacher are chosen for the second decision tree. If the candidate slot is empty, the course is placed in that slot; otherwise the colors have to be matched. If the color of the courses already placed in the slot matches that of the course under consideration, the latter course is classified; if not, other options are considered for the teacher. The courses considered in CDRG are the ones overlooked by PDRG, for which PDRG has confirmed that the day-time pattern is not possible because of color and the scattered choice of slots by the teachers. So day-time slot patterns are ignored in this operation. After this second decision tree a few courses may still remain unplaced; nevertheless, the number of remaining courses is now much smaller than before. Around 32% of the remaining courses are placed successfully by CDRG. The combined effort of the decision trees still provides high confidence. Fig. 8 demonstrates the Chaos eradication DRG.

Figure 7. Decision Tree of PDRG

The pseudo code presented in Algorithm 2 describes the actions taken by Chaos eradication DRG (CDRG). This tree has a time complexity of O((n − d) × (m − l)), where the maximum number of courses per teacher is n, the number of faculties is m, d is the number of courses of each teacher already classified by PDRG, and l is the number of teachers all of whose courses were placed in the routine by PDRG.

CDRG( )
{
  // select the most conflicted courses not yet placed or partially placed;
  // select the remaining least frequent favorite slot of the faculty;
  1. IF the slot is empty, PLACE the course;
  2. ELSE find the color of the course already placed in the slot;
     2.1. IF the color is the same AND within the room range AND the course
          has not already been placed on that day, PLACE the course;
     2.2. ELSE select the next least frequent favorite slot of the faculty;
  3. Repeat steps 1 and 2 until the course remains in an unchanged state;
}

Algorithm 2. Chaos eradication DRG (CDRG)

C. Tolerable DRG (TDRG)

The first two decision trees aim at the automated generation of a better assignment. The second approach seeks an assignment of the vector that may be more difficult to locate in the search space, given the already assigned vector. The third decision tree allows the remaining courses, taken in order of teacher priority, to find a place in the routine within a tolerable conflict range of the corresponding teachers' favorite slots. Here the course color is overlooked. This classification now introduces errors into the system by admitting tolerable student conflicts only. An important issue is that this manipulation may iterate several times to include as many courses as possible in the routine. 22% of the unclassified and partially classified courses are labeled with a tolerable error.

Figure 8. Decision Tree of CDRG

The flowchart of this decision tree is shown in Fig. 9. The pseudo code presented in Algorithm 3 describes the actions taken by Tolerable DRG (TDRG). The average time complexity of TDRG is approximately O(t × (n − (d + o)) × (m − (l + p))), where the maximum number of courses per teacher is n, the number of faculties is m, d is the number of courses of each teacher already classified by PDRG, l is the number of teachers all of whose courses were placed by PDRG, o is the number of courses placed by CDRG, p is the number of teachers whose courses were completely placed by CDRG, and t is the number of iterations of TDRG.

TDRG( )
{
  // select the most conflicted courses of a highly prioritized teacher
  // that are not yet placed or only partially placed;
  // select the remaining least frequent favorite slot of the faculty;
  1. IF the slot is empty, PLACE the course;
  2. ELSE IF within the room range AND the conflict among the courses is in
     the tolerable range AND the course has not already been placed on that
     day, PLACE the course;
  3. ELSE select the next least frequent favorite slot of the faculty;
  4. Repeat steps 2 and 3 until the course remains in an unchanged state;
}

Algorithm 3. Tolerable DRG (TDRG)

D. Neighboring Tour DRG (NTDRG)

The fourth search allows the courses to look beyond the contour of the teachers' favorite slots to find the least conflicted slots. Although the students' scattered choices are overlooked in TDRG, the placed courses do not displease the teachers' preferences. The final decision tree is modeled so that the rest of the unplaced courses are graded into the routine vector within a minimum distance of the teachers' choices. The Manhattan Distance (MD) is calculated for this placement of courses: MD is the total slot gap of tk on each day, i.e., the total unused slots of a teacher on a particular day. The Manhattan Distance is a vital performance measure of the slot gaps per day for an individual teacher. The optimization is done by keeping the Cumulative Manhattan Distance (cMD) for the teachers as low as possible, where cMD(t) = Σ MD(t) over all days used by t (a small sketch of this computation is given after the next paragraph). It is ensured that the courses were not already placed on that day. From the remaining courses, a course is selected according to the priority of the teacher.

Figure 9. Decision Tree of TDRG

The complement of the intersection between the classification value of that course and the corresponding teacher's used slots is considered as the set of new host slots. The neighboring slots of the new hosts are the most likely candidates, and among these candidates the most "cold" (less desired) slots are chosen.
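A small sketch of the MD and cMD computation follows, assuming 30 slots numbered 1..30 with 5 consecutive slots per day as defined in Section III, and reading "gap" as the unused slots between a teacher's first and last class of a day (illustrative helper names, not the authors' code).

# Sketch: MD of one day = unused slots between the first and last used slot
# of that day; cMD sums MD over all days the teacher actually uses.
def manhattan_distance_per_day(used_slots, slots_per_day=5):
    by_day = {}
    for s in used_slots:                     # slots are numbered 1..30
        by_day.setdefault((s - 1) // slots_per_day, []).append(s)
    return {day: (max(slots) - min(slots) + 1) - len(slots)
            for day, slots in by_day.items()}

def cumulative_md(used_slots, slots_per_day=5):
    return sum(manhattan_distance_per_day(used_slots, slots_per_day).values())

For example, a teacher with classes in slots 7 and 9 (the same day) has MD = 1 for that day, so cumulative_md([7, 9]) returns 1.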

The considerations in this placement are the tolerable conflict range, the allocable number of rooms, and day-time misjudgment. Fig. 10 demonstrates the overall scenario of Neighboring Tour DRG (NTDRG). The approximate time complexity is O(2 × (n − (d + o + u)) × (m − (l + p + v))) ≈ O((n − (d + o + u)) × (m − (l + p + v))), where the maximum number of courses per teacher is n, the number of faculties is m, d is the number of courses of each teacher already classified by PDRG, l is the number of teachers all of whose courses were placed in the routine by PDRG, o is the number of courses placed by CDRG, p is the number of teachers whose courses were completely placed by CDRG, u is the number of courses placed by TDRG, and v is the number of teachers whose courses were completely placed by TDRG. The pseudo code presented in Algorithm 4 describes the actions taken by Neighboring Tour DRG (NTDRG).

NTDRG( )
{
  // select the courses not yet placed or partially placed;
  // select the corresponding faculty routine;
  1. Find the candidate slots (the neighboring slots of the used slots of the faculty);
  2. IF within the room range AND the conflict among the courses is in the
     tolerable range AND the course has not already been placed on that day,
     PLACE the course;
  3. ELSE select the next candidate slot of the faculty;
  4. Repeat steps 2 and 3 until the course remains in an unchanged state;
}

Algorithm 4. Neighboring Tour DRG (NTDRG).

Figure 10. Decision Tree of NTDRG

These four sequential decision trees feed forward to an acceptable solution. It is ensured that the course has not already been placed on that very day whenever the "PLACE" decision is taken. The overall complexity is shown in equation (1).

Overall complexity
= PDRG + CDRG + TDRG + NTDRG
= O(mn) + O((n − d)(m − l)) + O(t(n − (d + o))(m − (l + p))) + O((n − (d + o + u))(m − (l + p + v)))
= mn + [mn − nl − md + dl] + t[mn − nl − np − md − mo + dl + dp + ol + op] + [mn − nl − np − nv − md − mo − mu + dl + dp + dv + ol + op + ov + ul + up + uv]
= (t + 3)mn − ((t + 2)d + (t + 1)o + u)m − ((t + 2)l + (t + 1)p + v)n + (dl + t(dl + dp + ol + op) + dl + dp + dv + ol + op + ov + ul + up + uv)
= A·mn − B·m − C·n + K                                   (1)

where
A = t + 3;
B = (t + 2)d + (t + 1)o + u;
C = (t + 2)l + (t + 1)p + v;
K = dl + tdl + tdp + tol + top + dl + dp + dv + ol + op + ov + ul + up + uv.

The cumulative time complexity of DRG is therefore O(A·mn − B·m − C·n). It depends mainly upon the number of iterations of the TDRG algorithm and upon the number of courses placed by each decision tree. The algorithm degrades because the students have the freedom to choose any course (assuming the prerequisite course is completed).

Firstly, PDRG runs into problems if teachers of the same priority focus on the same favorite time slots. By using CDRG this situation can be overcome by allowing a level of discontinuity in the day-time pattern of the courses. In the second run, if a low "chaos" course belongs to a highly prioritized teacher, the classification may dissatisfy that teacher. In the third rotation, conflicts may arise for the students because the color is not considered. For NTDRG, if the host slot is elected as the first slot of a day (V{{ci}q}, ∀i, ci ∈ C, 1 ≤ i ≤ n, 1 ≤ q ≤ 30, where q = 6, 11, 16, 21, 26 are the first slots of the days) or the last slot of a day (q = 5, 10, 15, 20, 25 are the last slots of the days), the previous and the next consecutive slots belong to different days. This day jump introduces a huge distance for the teacher, which may lead to an unfeasible classification. After NTDRG a few courses may remain partially classified or unclassified due to three major factors: (1) the number of rooms is not adequate, (2) the teacher's preferred time slot is not applicable, and (3) the student conflict may cross the tolerable conflict range.

The courses that remain unclassified or partially classified after all decision trees have been explored represent a problem that cannot be solved without extensive relaxation of the constraints mentioned earlier. Unclassification or partial classification may be attached to a course not only because of conflict but also because of room availability. An event may thus occur where there is no conflict among the courses, yet a low-priority course (due to the teacher's priority in PDRG, a low conflict score in CDRG, or a slightly higher but still considerable conflict score in TDRG) cannot be placed into the routine vector because of the room limitation. The same situation may arise for the other two constraints. So, to eliminate the partially classified or unclassified courses, the above mentioned factors have to be compromised, that is, by increasing the room capacity, expanding the teachers' favorite times, or ignoring students' conflicts.

V. EXAM-TIME TABLING ALGORITHM (ETA)

A typical constraint programming method is applied to the exam-time tabling problem described above. Considering only the members of the power set of the groups of courses whose size does not exceed the number of slots, the ETA calculates the total number of conflicts and the total number of students of each group. The proposed algorithm then tries to place the course groups in specific exam slots. The total number of valid power set members is the number of combinations of the groups that fit within the given slots for the exam: the valid range for the number of members is 1 up to the given slot value s, i.e., C(Gn, s) + C(Gn, s−1) + … + C(Gn, 1), where Gn is the total number of course groups. By sorting these sets in descending order of their total-student-to-conflict ratio, the ETA judges the most appropriate groups to place into the exam-time table first. If the conflict is zero, the total-student-to-conflict ratio is taken to be the total number of students. From the sorted power set list, the most supported single member is selected, by a greedy algorithm, to be placed on each day of the exam-time table, i.e., the most profitable groups are chosen first. The support is calculated as the ratio of the total students to the conflicts. If the number of days provided for exam scheduling is small, and there are more groups of courses and more than one slot on each day, then each placed group tries to determine the most appropriate unplaced group and selects it to form a pair. This situation may produce concurrent conflicts, but the conflicts among the courses within each slot remain zero. These newly paired groups are eliminated from the list so that other placed groups can select their own feasible member groups. Considering the scenario described in Fig. 4, the resultant colorized course groups are G1 = {c1, c4} with 80 students and G2 = {c2, c3} with 85 students, and the power set of the groups is {G1}, {G2}, {G1, G2}. The sets considered are those whose number of members is less than or equal to the provided slots for exams per day. The total-student-to-conflict ratios of the selected power sets are 80, 85 and 165/19, on which the ETA mainly concentrates. If the number of days provided for the exam equals the number of groups, then the most appropriate exam schedule is Day 1 = G1 and Day 2 = G2. There is an 11.5% possibility of having concurrent exams if the number of days provided for the exam is less than the number of colorized course groups. The pseudo code presented in Algorithm 5 describes the actions taken by the Exam-Time Tabling Algorithm (ETA).

ETA( )
{
  1. Color the courses and form the groups;
  2. Find the power set of the groups according to the provided exam slots per day;
  3. Sort the power set in descending order according to the total student to
     conflict ratio, treated as "Gain";
  4. PLACE the highest "Gain"ed course group on each day of the exam;
  5. Find the most appropriate pair for the group;
  6. Optimize the exam routine by imposing the MST (Minimum Spanning Tree,
     using Prim's algorithm; here the graph of the groups contains edges
     weighted by conflicts) to spread out the concurrent exams for the students.
}

Algorithm 5. Exam-Time Tabling Algorithm (ETA).
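A simplified Python sketch of steps 2–4 follows (greedy selection by the student-to-conflict "gain"). It is an illustration under simplifying assumptions, not the authors' implementation; the conflict dictionary is assumed to hold both orderings of each pair.

from itertools import combinations

# Sketch of the greedy core of ETA: enumerate candidate sets of course groups
# of size at most `slots_per_day`, rank them by total students / total
# conflicts ("gain"), and assign the best remaining candidate to each day.
def eta_greedy(groups, students, conflict, days, slots_per_day):
    def gain(cand):
        total = sum(students[g] for g in cand)
        conf = sum(conflict.get((g, h), 0) for g, h in combinations(cand, 2))
        return total if conf == 0 else total / conf

    candidates = [set(c) for r in range(1, slots_per_day + 1)
                  for c in combinations(groups, r)]
    candidates.sort(key=gain, reverse=True)

    schedule, used = [], set()
    for _ in range(days):
        for cand in candidates:
            if not (cand & used):            # every group sits exactly one exam
                schedule.append(cand)
                used |= cand
                break
    return schedule                          # one set of groups per exam day

On the example of Fig. 4 (students {G1: 80, G2: 85}, conflict {(G1, G2): 19, (G2, G1): 19}, two days, two slots per day) the gains are 80, 85 and 165/19, and the sketch assigns one group per day (G2 and G1), which corresponds to the two-day schedule discussed above.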

For a better exam-time table, ETA also calculates the conflicts among the groups placed on each day and considers these groups as vertices. Further optimization can be done by using Prim's algorithm. In this case Prim's algorithm increases the Manhattan distance (MD) between the closest conflicting groups, where non-conflicting groups also hold an edge with zero weight. This optimization stage is used to maximize the day gap between exams for the students. So, by using Prim's algorithm, ETA finds the minimum spanning tree of the groups of courses, in which the exams of neighboring groups share fewer common students. Here the Manhattan Distance (MD) represents the value of the total student to conflict ratio between the groups of the days. The cumulative time complexity of ETA is approximately O(s·Gn + Gn²), and it depends strictly upon the provided slots per day (s).

VI. EXPERIMENTAL RESULTS

A. Algorithm Analysis

The Decision tree based Routine Generation (DRG) algorithm is a set of 4 sequential feed-forward trees. An important observation concerns the fitness value, which represents the best matching score among the selected day-time pattern slots for a selected teacher. An example is given by Figs. 1, 2, 3, 4 and 5(a): the first decision tree (PDRG) selects t2 because of its high priority, and t2 (priority 11) holds the course {c2}. The day-time pattern gives {{2,12,22},{29}} = {2,12}, {12,22} or {2,22} as the best matches for the course c2 from t2's favorite time slots. After imposing the day-time pattern with respect to the number of classes per week for the course, the calculated ordered sets by fitness value are {2,12}, {12,22}, {2,22} and {29}, where the first three sets hold the highest and equal fitness values corresponding to the classes per week. As no courses have been placed yet, the vector V is empty, i.e., V({ci}2) = { }, V({ci}12) = { } and V({ci}22) = { }, and the room limit is respected; hence fitness value({2,12}) = fitness value({12,22}) = fitness value({2,22}), where the fitness value is an integer representing the maximum possibility of taking the classes in the provided time slots. So PDRG places c2 in the slots {2,12}, i.e., V({c2}2) and V({c2}12), where {2,12} is a random selection.

© 2010 ACADEMY PUBLISHER<br />

Again, PDRG selects t4, with priority 10, as its next candidate, which holds the courses {c5}. Here the day-time pattern is {{2,12,22},{5,15,25}, ...} = {2,12,22}, {5,15,25}, extracted from t4's favorite slots. As V({ci}2) = {c2}, V({ci}12) = {c2}, V({ci}22) = { }, V({ci}5) = { }, V({ci}15) = { } and V({ci}25) = { }, the fitness value({2,12,22}) < fitness value({5,15,25}). Therefore c5 is placed in slot {5,15,25}, i.e., V({c5}5), V({c5}15) and V({c5}25), since V({ci}2) and V({ci}12) are not empty. In a similar way, for the teacher t3 the course c3 is placed in V({c3}10) and V({c3}20), whereas V({c3}5), V({c3}15) and V({c3}25) are ignored due to a different course color. For t1 the courses {c1, c4} hold 12 and 8 total conflicts respectively. The selected course c1 can be placed in {7,17}, {8,18}, {9,19}, {13,23}, {22} or {24} according to t1's favorite slots; course c1 is placed in V({c1}7) and V({c1}17). From the remaining time slots, none of the generated patterns provides sufficient classes for the course c4, whose required number of classes is 3. So the course c4 is placed in V({c4}8) and V({c4}18) and is tagged as a partially placed course that needs further exploration. Random selection among courses for exploration is acceptable if more than one course holds the same conflict score.

This class-based reasoning leaves 1 partially placed course that needs further processing. The next decision tree, CDRG, now operates with fewer courses than at the beginning, and slot 24 is allocated to the course c4. The important issue is that although the classes are placed in a zigzag fashion, all the selected slots for the course c4 are taken from the favorite slots of the respective faculty. So the teacher satisfaction for this case is 100%, and the number of overlapping classes for students is zero (i.e., full student satisfaction). The results of the above example are shown in Fig. 11. In practice the situation may be more complex, with many courses.

Using the same data filtering technique for exam-time tabling, the proposed ETA algorithm generates the power set of courses according to their course color extracted from the cross_table (Cr), where teacher redundancy is ignored (Fig. 5(b)). The resultant set is {c1}, {c2}, {c3}, {c4}, {c5}, {c1, c4} and {c2, c3}. If the provided exam days are 3 and the exam slots per day are 2, then the 100% conflict-free exam schedule per slot is day 1: {c1, c4}, day 2: {c2, c3} and day 3: {c5}.
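The grouping step can be illustrated with a minimal sketch (hypothetical helper names; the conflict pairs below are chosen only to reproduce the example above, not taken from the paper's dataset):

from itertools import combinations

def conflict_free_groups(courses, conflicts):
    # Singleton groups are always valid; add every pair of courses that do
    # not share a student (larger sets could be added the same way).
    groups = [{c} for c in courses]
    for a, b in combinations(courses, 2):
        if frozenset((a, b)) not in conflicts:
            groups.append({a, b})
    return groups

conflicts = {frozenset(p) for p in [("c1", "c2"), ("c1", "c3"), ("c1", "c5"),
                                    ("c2", "c4"), ("c2", "c5"), ("c3", "c4"),
                                    ("c3", "c5"), ("c4", "c5")]}
print(conflict_free_groups(["c1", "c2", "c3", "c4", "c5"], conflicts))
# singletons plus the only conflict-free pairs, {c1, c4} and {c2, c3}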

                                 PDRG   CDRG   TDRG   NTDRG   Final Finding
Unclassified                       0      0      -      -         0
Partially classified               1      0      -      -         0
Day-time patterned classified      4      1      -      -         5
Student conflict                   0      0      -      -         0
Unsatisfied time                   0      1      -      -         1

Figure 11. Simulation result of DRG

Furthermore, if the provided days for the exam are 2 and the slots are 2, then the exam-time tabling is: day 1: slot 1 {c1, c4}, slot 2 {c5}; and day 2: slot 1 {c2}, slot 2 {c3}, where day 1 has zero exam conflicts in each slot but 1 consecutive exam for a student. Fig. 12 shows the overall outcome of the ETA algorithm for this particular scenario.

B. Experimental Results

Test results for the DRG algorithm were obtained on a PC with a Pentium IV/1.6 GHz processor and 256 MB of memory. Table I shows the computational results for semester 1 and semester 2, with 66 and 61 courses respectively. For semester 1, around 24 teachers, each with a minimum of 10 classes per week and at least 3 courses, and 120 students with 430 combinations of course choices are considered as input. For semester 2, around 23 teachers, with a minimum of 10 classes per week and at least 3 courses, and 106 students with 414 combinations of course choices are considered as input.

Scenario            Day   Slot 1 Cr. (total std)   Slot 2 Cr. (total std)   Slot 3 Cr. (total std)   Concurrent Exam Conflict   Overall Gain
Day = 3, Slot = 2    1    c1 (40)                  c4 (40)                  -                        0                          80
                     2    c2 (50)                  c3 (35)                  -                        0                          85
                     3    c5 (45)                  -                        -                        0                          45
Day = 2, Slot = 2    1    c2 (50), c3 (35)         c5 (45)                  -                        1                          130
                     2    c1 (40)                  c4 (40)                  -                        0                          80
Day = 1, Slot = 2    1    -                        -                        -                        Not Possible
Day = 1, Slot = 3    1    c1 (40), c4 (40)         c5 (45)                  c2 (50), c3 (35)         21                         10

Figure 12. Simulation result of ETA

In Table I and Fig. 11, unclassified courses refers to the number of courses that were not classified by any decision tree; partially classified courses gives the number of courses that were only partially classified (the number of classes already placed into the routine is less than the required number of classes). Day-time patterned classified shows the number of courses that followed the day-time pattern. Student conflict estimates the percentage of unsatisfied course requirements of students. Unsatisfied time refers to the number of time slots automatically generated beyond the teachers' favorite choices. Final Finding refers to the final output of every subsection after using all the cascaded trees. The main objective of this classification is to reach the state where the number of unclassified courses is zero. The final values of unsatisfied time and student conflict indicate teacher and student satisfaction respectively. Randomization or prediction in classification is not used in DRG, so the output of this O(A·mn - B·m - C·n) deterministic algorithm depends strictly on the input.

Test results for ETA are generated from 61 courses with an average of 25 students per course, within 9 exam days and 2 slots per day, for semester 2. Table II as well as Fig. 12 describes the outcome of the Exam-Time Tabling Algorithm. Here "slots" represents the total number of consecutive exams on a single day. The slot duration is 3 hours for exam-time tabling, whereas the same slot holds a 1-hour class duration in DRG. The total student to conflict ratio on a particular exam day is referred to as the overall gain. Around 3.8% of the total students hold a concurrent exam schedule, whereas the scheduled courses in each slot are 100% conflict free. The overall gain also confirms the benefit of taking the course groups together as candidates on a single day. The high satisfaction of the students attests to a high-quality exam-time table.

VII. CONCLUSION

The timetabling problem usually varies significantly from institution to institution in terms of specific requirements and constraints [22]. Many currently successful university timetabling systems are applied only in the institutions where they were designed. Metaheuristic, heuristic and hybrid methods, as well as case-based reasoning, are used to solve timetabling problems. The main idea here is to design an algorithm that will choose the right decision tree to carry out a certain task in a certain situation.

This paper outlines the Decision Tree based Routine Generation (DRG) algorithm, which uses an OLAP representation to construct a university class routine, and the Exam-Time Tabling Algorithm (ETA), which produces a conflict-free exam schedule within a fixed interval of days. It should be noted that the DRG algorithm keeps the complexity at a reasonable level; this solution classifies 96%-97% of the courses and achieves 93%-95% teacher satisfaction. For this data set, student satisfaction is 100%, but in general 90%-93% satisfaction may be achievable using DRG. Preferential requirements (teacher satisfaction) on time variables are met around 93% of the time. The ETA algorithm likewise provides a satisfactory exam-time table with 100% satisfaction. The results also illustrate that the proposed algorithms achieve significant performance gains over different data sets.

The proposed algorithms are designed to be easy to code and are of significant importance for constructing generalized automated timetabling software. The authors recognize that the levels of constraints differ among institutes; generalized automated timetabling software will be examined in future work. This paper does not consider any classical benchmark problems; it is important to analyze the performance against other established algorithms. Incorporating such heterogeneous constraints into the proposed data structure will also be examined in the future.

ACKNOWLEDGEMENT

The proposed algorithms have been employed to construct class and exam routines for a reputed university in Bangladesh. The authors express their gratitude to the university authority for providing the indispensable information.

REFERENCES

[1] Christelle Guéret, Narendra Jussien, Patrice Boizumault and Christian Prins, "Building university timetables using constraint logic programming," Springer-Verlag LNCS 1153, pp. 130-145, 1996.
[2] Hans-Joachim Goltz, Georg Küchler and Dirk Matzke, "Constraint-based timetabling for universities," INAP'98, pp. 75-80, 1998.
[3] Slim Abdennadher and Michael Marte, "University course timetabling using constraint handling rules," Journal of Applied Artificial Intelligence, vol. 14, no. 4, pp. 311-326, 2000.
[4] Thom Frühwirth, "Constraint handling rules," Constraint Programming: Basics and Trends, LNCS 910, Springer, 1995.
[5] E.K. Burke, B.L. MacCarthy, S. Petrovic and R. Qu, "Knowledge Discovery in a Hyper-Heuristic for Course Timetabling Using Case-Based Reasoning," PATAT 2002, 4th International Conference, pp. 90-103, August 2002.
[6] D. Abramson, "Constructing school timetables using simulated annealing: Sequential and parallel algorithms," Management Science, vol. 37, pp. 98-113, 1991.
[7] A. Schaerf, "Tabu search techniques for large high-school timetabling problems," 13th National Conference on Artificial Intelligence AAAI'96, pp. 363-368, 1996.

TABLE I.
COMPUTATIONAL RESULTS OF DRG

Semester-1                       PDRG   CDRG   TDRG   NTDRG   Final Finding
Unclassified courses               5      3      1      0         0
Partially classified courses      45     32      2      2         2
Day-time patterned classified     16     31     63     64        64
Student conflict                   0      0      0      0         0
Unsatisfied time                   0      0      3      7         9

Semester-2                       PDRG   CDRG   TDRG   NTDRG   Final Finding
Unclassified courses               2      2      1      0         0
Partially classified courses      41     26      2      2         2
Day-time patterned classified     18     33     58     59        59
Student conflict                   0      0      0      0         0
Unsatisfied time                   0      0      5      8        11

TABLE II.
COMPUTATIONAL RESULTS OF ETA

        # of Courses   # of Courses   Concurrent      Total Student   Total Student   Overall
        on Slot 1      on Slot 2      Exam Conflict   on Slot 1       on Slot 2       Gain
Day 1        3              4              3               72              63          45
Day 2        3              3              2               77              63          70
Day 3        2              2              5               32              53          17
Day 4        4              4              5               76              87          32.6
Day 5        5              5             11               91              64          14
Day 6        3              2              3               46              18          21.3
Day 7        4              4              7               42              74          16.5
Day 8        2              5              2               28              71          49.5
Day 9        2              3              2               16              68          42

[8] Luca Di Gaspero and Andrea Schaerf, "Multi-Neighbour Local Search for Course Timetabling," PATAT 2002, 4th International Conference, pp. 128-132, August 2002.
[9] P. Boizumault, Y. Delon, and L. Peridy, "Constraint logic programming for examination timetabling," Journal of Logic Programming, vol. 26, pp. 217-233, 1996.
[10] K. Dowsland, "Using simulated annealing for efficient allocation of students to practical classes," Applied Simulated Annealing - Lecture Notes in Economics and Mathematical Systems, Springer-Verlag, vol. 396, pp. 125-150, 1993.
[11] K. Dowsland, "Off-the-peg or made-to-measure? Timetabling and scheduling with SA and TS," PATAT'97, Springer-Verlag, pp. 37-52, 1998.
[12] G.M. White and J. Zhang, "Generating complete university timetables by combining tabu search with constraint logic," PATAT'97, Springer-Verlag, pp. 187-198, 1998.
[13] E.K. Burke, J. Newall, and R.F. Weare, "Initialization strategies and diversity in evolutionary timetabling," Evolutionary Computation Journal, vol. 6.1, pp. 81-103, 1998.
[14] E.K. Burke, J. Newall, and R.F. Weare, "A simple heuristically guided search for the timetable problem," Proceedings of the International ICSC Symposium on Engineering of Intelligent Systems, pp. 574-579, 1998.
[15] E.K. Burke, J. Newall, and R.F. Weare, "A memetic algorithm for university exam timetabling," Lecture Notes in Computer Science, Springer-Verlag, vol. 1153, pp. 241-250, 1996.
[16] Tan, Steinbach and Kumar, Introduction to Data Mining, pp. 130-139, 2004.
[17] Liam T.G. Merlot, Natashia Boland, Barry D. Hughes, and Peter J. Stuckey, "A Hybrid Algorithm for the Examination Timetabling Problem," Proceedings of the 4th International Conference on the Practice and Theory of Automated Timetabling - PATAT 2002, Springer-Verlag, pp. 348-371.
[18] M. Caramia, P. Dell'Olmo, and G.F. Italiano, "New algorithms for examination timetabling," Proceedings of the Algorithm Engineering 4th International Workshop, WAE 2000, Germany, September 2000, Lecture Notes in Computer Science 1982, Springer-Verlag, Berlin Heidelberg New York, pp. 230-241, 2001.
[19] M. Carter, G. Laporte, and J. Chinneck, "A general examination scheduling system," Interfaces, pp. 109-120, 1994.
[20] M. Carter, G. Laporte, and S.T. Lee, "Examination timetabling: algorithmic strategies and applications," Journal of the Operational Research Society, pp. 373-383, 1996.
[21] Bart Selman, Henry A. Kautz, and Bram Cohen, "Noise strategies for improving local search," 12th National Conference on Artificial Intelligence (AAAI-94), pp. 337-343, 1994.
[22] T. B. Cooper and J. H. Kingston, "The Complexity of Timetable Construction Problems," Practice and Theory of Automated Timetabling, Springer-Verlag, pp. 283-295, 1996.

Ashiqur Md. Rahman received his B.Sc. degree in Computer Science and Engineering from American International University-Bangladesh, Dhaka, in January 2004. He has been pursuing his M.Sc. degree at North South University, Dhaka, since January 2006. He has authored 4 national and international journal and conference papers in the areas of data mining, VHDL, cryptography and PVc module design. His current research interest is in grid computing, especially in large grid environments.

Shafaat S. Giasuddin received his B.Sc. degree in Computer Science from Ahsanullah University of Science and Technology, Dhaka, in November 2005. He has been pursuing his M.Sc. degree at North South University, Dhaka, since January 2006. He has authored 3 national and international journal and conference papers in the areas of data mining, VHDL and cryptography. His current research interest is in data mining, especially in the corporate sector such as banking and telecommunication.

Rashedur M. Rahman received his Ph.D. degree in Computer Science from the University of Calgary, Canada, in November 2007. He received his M.Sc. degree from the University of Manitoba, Canada, in 2002 and his Bachelor degree from Bangladesh University of Engineering and Technology (BUET) in 2000. He is currently working as an Assistant Professor at North South University, Dhaka, Bangladesh. He has authored more than 25 international journal and conference papers in the areas of parallel, distributed and grid computing and knowledge and data engineering. His current research interests are in data mining, especially on financial, educational and medical surveillance data, data replication on the Grid, and the application of fuzzy logic for grid resource and replica selection.


Anomaly Network Intrusion Detection Based on Improved Self Adaptive Bayesian Algorithm

Dewan Md. Farid
Dept. of CSE, Jahangirnagar University, Dhaka-1342, Bangladesh
Email: dmfarid@uiu.ac.bd

Mohammad Zahidur Rahman
Dept. of CSE, Jahangirnagar University, Dhaka-1342, Bangladesh
Email: rmzahid@juniv.edu

Abstract—Recently, research on intrusion detection in computer systems has received much attention from the computational intelligence community. Many intelligent learning algorithms have been applied to huge volumes of complex and dynamic data for the construction of efficient intrusion detection systems (IDSs). Despite the many advances achieved in existing IDSs, some difficulties remain, such as the correct classification of large intrusion detection datasets, unbalanced detection accuracy in high speed network traffic, and the reduction of false positives. This paper presents a new approach to alert classification to reduce false positives in intrusion detection using an improved self adaptive Bayesian algorithm (ISABA). The proposed approach is applied to the security domain of anomaly based network intrusion detection; it correctly classifies different types of attacks of the KDD99 benchmark dataset with high classification rates in short response time and reduces false positives using limited computational resources.

Index Terms—anomaly detection, network intrusion detection, alert classification, Bayesian algorithm, detection rate, false positives

I. INTRODUCTION

Network intrusion detection is the problem of detecting unauthorized use of computer systems over a network, such as the Internet. IDSs were introduced by James P. Anderson in 1980; an example of an audit trail would be a log of user accesses [1]. IDSs have become an integral part of today's information security infrastructures. In order to detect intrusion activities, many machine learning (ML) algorithms, such as Neural Networks [2], Support Vector Machines [3], Genetic Algorithms [4], Fuzzy Logic [5], and Data Mining [6], have been widely applied to huge volumes of complex and dynamic data to detect known and unknown intrusions. It is very important for IDSs to generate rules that distinguish normal behaviors from abnormal behaviors by observing the dataset, which is the record of activities generated by the operating system and logged to a file in chronologically sorted order. IDSs using ML algorithms aim to solve the problems of analyzing huge volumes of data and optimizing the performance of detection rules. IDSs detect attacks on computer systems and signal an alert to the Computer Emergency Response Team (CERT). A key difficulty in the network intrusion detection area is the arrival of new, previously unseen attacks, because hackers are very inventive and use ever newer ways to disrupt the normal operation of servers and users. Anomaly detection detects new attacks that are not present in the dataset by observing system activities and classifying them as either normal or anomalous. An important research challenge today is to develop adaptive IDSs that improve classification rates and reduce false positives.

The Bayesian algorithm (BA) provides a probabilistic approach to classification [7], [8], which offers an optimal way to predict the class of an unknown example. It is widely used in many fields, including data mining, image processing, bio-informatics, and information retrieval. It calculates conditional probabilities from a given dataset and uses these conditional probabilities to find the probabilities of belonging to different classes. The unseen example is then assigned to the class with the maximum value. In this paper, based on a comprehensive analysis of the current research challenges, we propose a new algorithm to address the problem of classification rates and false positives using an improved self adaptive Bayesian algorithm. This algorithm correctly classifies different types of attacks of the KDD99 dataset with high detection accuracy in short response time, and also maximizes the detection rate (DR) and minimizes the false positives (FP). In our experiments the proposed algorithm reduced the number of false positives by up to 90% with acceptable misclassification rates.

The remainder of this paper is organized as follows. Section II provides a review of IDSs and section III describes IDSs using machine learning algorithms. Section IV presents our proposed new algorithm. Experimental results are presented in section V. Finally, section VI makes some concluding remarks along with suggestions for further improvement.

II. REVIEW OF INTRUSION DETECTION SYSTEMS

A. History of IDSs

In 1980, the concept of IDSs began with Anderson's seminal paper, which introduced the notion that audit trails contain vital information that could be valuable in tracking misuse and understanding user behavior. His work was the start of host-based IDSs (HBIDSs). In 1986, Dr. Dorothy Denning published a model which provided the necessary information for commercial IDS development [9]. In 1988, the Multics intrusion detection and alerting system (MIDAS), an expert system using P-BEST and LISP, was developed [10]. Haystack was also developed in that year, using statistics to reduce audit trails [11]. In 1989, Wisdom & Sense (W&S), a statistics-based anomaly detector, was developed; it created rules based on statistical analysis, and then used those rules for anomaly detection [12]. In 1990, Heberlein first introduced the idea of network IDSs with the development of the Network Security Monitor (NSM) and hybrid IDSs [13], and Lunt at SRI proposed the intrusion detection expert system (IDES), a dual approach combining a rule-based expert system and statistical anomaly detection, which ran on Sun Workstations and could consider both user and network level data [14]. Also in the early 1990s the commercial development of IDSs started, and the Time-based Inductive Machine (TIM) performed anomaly detection using inductive learning of sequential user patterns in Common LISP on a VAX 3500 computer [15]. In 1991, the distributed IDS (DIDS), an expert system, was created by researchers of the University of California [16], and the Network Anomaly Detection and Intrusion Reporter (NADIR), a statistics-based anomaly detector and also an expert system, was developed for Los Alamos National Laboratory's Integrated Computing Network (ICN) [17]. In 1993, Lunt proposed the Next-Generation Intrusion Detection Expert System, developed at SRI as a successor to IDES, using an artificial neural network [18]. The Lawrence Berkeley National Laboratory introduced the rule language called Bro for packet analysis from libpcap data in 1998 [19]. The audit data analysis and mining IDS used tcpdump to build profiles of rules for classification in 2001 [20].

B. Types of IDSs

An intrusion is a set of actions that attempt to compromise the confidentiality, integrity or availability of computer resources. IDSs collect information from a variety of systems and analyze that information for signs of intrusion. In general, IDSs can be categorized into five types, which are described below:

1) Network IDSs (NIDSs) are responsible for detecting attacks related to the network [21], [22]. NIDSs investigate incoming and outgoing network traffic by connecting to network devices to find suspicious patterns. If a NIDS has no additional information about the protected host, a malicious attacker can easily avoid detection by exploiting the different handling of overlapping IP/TCP fragments by the IDS and the target host [23].

2) Host-based IDSs (HBIDSs) are usually located on servers to examine the internal interfaces [24]. HBIDSs can use standard auditing tools [25], specially instrumented operating systems [26], or application platforms [27]. They detect intrusions by analyzing system calls, application logs, file-system modifications, and other host activities related to the machine.

3) Protocol-based IDSs (PIDSs) monitor the dynamic behavior and state of the protocol used by the web server. PIDSs sit at the front end of a web server, monitoring and analyzing the HTTP protocol stream. They understand the HTTP protocol and protect the web server by filtering IP addresses or port numbers.

4) Application protocol-based IDSs (APIDSs) monitor and analyze a specific application protocol, or protocols, between a process and the group of servers used by the computer system. APIDSs can sit between a web server and the database management system, monitoring the SQL protocol specific to the business logic. Generally, APIDSs look for the correct use of the protocol.

5) Hybrid IDSs (HIDSs) combine two or more intrusion detection approaches. HIDSs provide alert notifications from both network- and host-based intrusion detection devices.

C. Detection Models of IDSs

The detection rate (DR) is defined as the number of intrusion instances detected by the system divided by the total number of intrusion instances present in the dataset, and a false positive (FP) is an alarm raised for something that is not really an attack. There are two types of detection models for IDSs, which are described below:

1) Misuse or signature-based IDSs are also known as pattern-based IDSs. They perform simple pattern matching against patterns corresponding to known attack types in a database. The DR of these IDSs is relatively low, because an attacker will try to modify the basic attack signature in such a way that it does not match the known signatures of that attack, and they cannot detect a new attack for which a signature has not yet been installed in the database.

2) Anomaly-based IDSs try to identify new attacks by distinguishing strange behavior from normal behavior. They have a relatively high detection rate for new types of intrusion. The disadvantage is that in many cases there is no single "normal profile", and anomaly-based systems tend to produce many false positives.

D. Functions of IDSs

IDSs are automated systems that detect and raise an alarm in any situation where an intrusion has taken place or is about to take place. According to the Common Intrusion Detection Framework (CIDF), IDSs generally consist of four components: sensors, analyzers, a database, and response units. Most modern IDSs use multiple intrusion sensors, which obtain alerts from the large computational environment, to maximize their trustworthiness. Analyzers take the input of the sensors, analyze the information gathered by these sensors, and return a synthesis or summary of the input. Today, machine learning algorithms have become an indispensable tool in the analyzers of IDSs. The database stores all alerts and supports the analysis process. The response units carry out prescriptions controlled by the analyzers. The functions of IDSs are as follows [28]:

- Monitoring users' activity.
- Monitoring system activity.
- Auditing system configuration.
- Assessing the data files.
- Recognizing known attacks.
- Identifying abnormal activity.
- Managing audit data.
- Highlighting normal activity.
- Correcting system configuration errors.
- Storing information about intruders.

III. IDS USING MACHINE LEARNING

A. Bayes Rule

The Bayes rule provides a way to calculate the probability of a hypothesis based on its prior probability [7], [8]. The best hypothesis is the most probable hypothesis, given the observed data D plus any initial knowledge about the prior probabilities of the various hypotheses h (h belongs to a hypothesis space H containing possible target functions). Bayes rule is defined in equation (1):

P(h|D) = P(D|h) P(h) / P(D)    (1)

Here P(h|D) is called the posterior probability, while P(h) is the prior probability associated with hypothesis h. P(D) is the probability of the occurrence of data D and P(D|h) is the conditional probability. In many learning scenarios, the learner considers some set of candidate hypotheses H and is interested in finding the most probable hypothesis h ∈ H given the data D. Any such maximally probable hypothesis is called a maximum a posteriori (MAP) hypothesis. The MAP hypothesis uses Bayes rule to calculate the posterior probability of each candidate hypothesis. More exactly, hMAP is a MAP hypothesis provided:

hMAP = argmax_{h∈H} P(h|D) = argmax_{h∈H} P(D|h) P(h) / P(D) = argmax_{h∈H} P(D|h) P(h)    (2)

Finally, the term P(D) is dropped because it is a constant independent of h. P(D|h) is also called the likelihood of the data D given h; any hypothesis that maximizes P(D|h) is called a maximum likelihood (ML) hypothesis, hML:

hML = argmax_{h∈H} P(D|h)    (3)
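As a small numerical illustration of equations (1)-(3) (the priors and likelihoods below are invented toy values, not taken from the paper), the normalizing term P(D) can simply be dropped when maximizing:

def map_and_ml(priors, likelihoods):
    # priors[h] -> P(h); likelihoods[h] -> P(D|h).
    # P(D) is constant over h, so it is omitted, as in equation (2).
    h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])
    h_ml = max(priors, key=lambda h: likelihoods[h])
    return h_map, h_ml

# Toy example: 'attack' is rarer a priori but explains the data better.
priors = {"normal": 0.9, "attack": 0.1}
likelihoods = {"normal": 0.02, "attack": 0.30}
print(map_and_ml(priors, likelihoods))  # ('attack', 'attack')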

B. Naïve Bayesian Classifier

The Naïve Bayesian (NB) classifier is a simple probabilistic classifier based on probability models that incorporate strong independence assumptions which often have no bearing in reality. The probability model can be derived using Bayes rule. Depending on the precise nature of the probability model, the NB classifier can be trained very efficiently in a supervised setting. In many practical applications, parameter estimation for naïve Bayesian models uses the method of maximum likelihood. In spite of its naïve design and apparently oversimplified assumptions, the naïve Bayesian classifier often works well in many complex real-world situations.

The NB classifier is given as input a set of training examples, each of which is described by attributes A1 through Ak and an associated class C. The objective is to classify an unseen example whose class value is unknown but whose values for attributes A1 through Ak are known: a1, a2, ..., ak respectively. The optimal prediction for the unseen example is the class value ci such that P(C=ci|A1=a1,...,Ak=ak) is maximum. By Bayes rule this probability equals

argmax_{ci∈C} P(A1=a1,...,Ak=ak | C=ci) P(C=ci) / P(A1=a1,...,Ak=ak)    (4)

where P(C=ci) is the prior probability of class ci, P(A1=a1,...,Ak=ak) is the probability of occurrence of the description of a particular example, and P(A1=a1,...,Ak=ak|C=ci) is the class conditional probability of the description of a particular example of class ci. The prior probability of a class can be estimated from the training data. The probability of occurrence of the description of a particular example is irrelevant for decision making, since it is the same for each class value ci. Learning is therefore reduced to the problem of estimating the class conditional probabilities of all possible descriptions of examples from the training data. The class conditional probability can be written in expanded form as follows:

P(A1=a1,...,Ak=ak|C=ci)
= P(A1=a1 | A2=a2 ^ ... ^ Ak=ak ^ C=ci)
* P(A2=a2 | A3=a3 ^ ... ^ Ak=ak ^ C=ci)
* P(A3=a3 | A4=a4 ^ ... ^ Ak=ak ^ C=ci)
* P(A4=a4 ^ ... ^ Ak=ak ^ C=ci)    (5)

In NB, it is assumed that the outcome of attribute Ai is independent of the outcomes of all other attributes Aj, given ci. Thus the class conditional probabilities become:

P(A1=a1,...,Ak=ak|C=ci) = Π_{i=1..k} P(Ai=ai | C=ci)

If this value is inserted into equation (4), it becomes:

argmax_{ci∈C} P(C=ci) Π_{i=1..k} P(Ai=ai | C=ci)    (6)

In naïve Bayesian learning, the probability values of equation (6) are estimated from the given training data. These estimated values are then used to classify unknown examples.
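A minimal sketch of this frequency-based estimation and of the classification rule in equation (6) is given below (attribute values are illustrative only; no smoothing of zero counts is applied, unlike many practical NB implementations):

from collections import Counter, defaultdict

def train_nb(examples):
    # examples: list of (attribute_tuple, class_label).
    # Returns class priors P(C=c) and a conditional P(A_i = a | C = c).
    class_counts = Counter(c for _, c in examples)
    cond_counts = defaultdict(Counter)   # (class, attr index) -> value counts
    for attrs, c in examples:
        for i, a in enumerate(attrs):
            cond_counts[(c, i)][a] += 1
    priors = {c: n / len(examples) for c, n in class_counts.items()}
    def conditional(i, a, c):
        return cond_counts[(c, i)][a] / class_counts[c]
    return priors, conditional

def classify_nb(attrs, priors, conditional):
    # Pick the class maximizing P(C=c) * prod_i P(A_i = a_i | C = c), eq. (6).
    def score(c):
        p = priors[c]
        for i, a in enumerate(attrs):
            p *= conditional(i, a, c)
        return p
    return max(priors, key=score)

data = [(("tcp", "http"), "normal"), (("tcp", "http"), "normal"),
        (("icmp", "ecr_i"), "dos"), (("icmp", "ecr_i"), "dos")]
priors, cond = train_nb(data)
print(classify_nb(("icmp", "ecr_i"), priors, cond))  # 'dos'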

C. Decision Tree Algorithms

The ID3 technique builds a decision tree using information theory [29]. The basic strategy used by ID3 is to choose the splitting attributes of a data set with the highest information gain. The amount of information associated with an attribute value is related to its probability of occurrence. The concept used to quantify information is called entropy, which measures the amount of randomness in a data set. When all data in a set belong to a single class there is no uncertainty, and the entropy is zero. The objective of decision tree classification is to iteratively partition the given data set into subsets where all elements in each final subset belong to the same class. The entropy calculation is shown in equation (7). The entropy value is between 0 and 1 and reaches a maximum when the probabilities are all the same. Given probabilities p1, p2, ..., ps, where Σ_{i=1..s} pi = 1,

Entropy: H(p1, p2, ..., ps) = Σ_{i=1..s} pi log(1/pi)    (7)

Given a data set D, H(D) measures the amount of disorder in the data set. When D is split into s new subsets S = {D1, D2, ..., Ds}, we can again look at the entropy of those subsets. A subset of the data set is completely ordered if all examples in it belong to the same class. ID3 chooses the splitting attribute with the highest gain, calculated by equation (8):

Gain(D, S) = H(D) - Σ_{i=1..s} p(Di) H(Di)    (8)

The C4.5 algorithm improves ID3 through the GainRatio [30]. For splitting purposes, C4.5 uses the largest GainRatio, which ensures a larger than average information gain:

GainRatio(D, S) = Gain(D, S) / H(|D1|/|D|, ..., |Ds|/|D|)    (9)

The C5.0 algorithm improves the performance of building trees by using boosting, which is an approach to combining different classifiers. However, boosting does not always help when the training data contains a lot of noise. When C5.0 performs a classification, each classifier is assigned a vote, voting is performed, and the example of the data set is assigned to the class with the most votes. CART (classification and regression trees) is a process of generating a binary tree for decision making [31]. CART handles missing data and contains a pruning strategy. The SPRINT (Scalable Parallelizable Induction of Decision Trees) algorithm uses an impurity function called the gini index to find the best split [32]. Equation (10) defines the gini index for a data set D:

gini(D) = 1 - Σ_j pj^2    (10)

where pj is the frequency of class Cj in D. The goodness of a split of D into subsets D1 and D2 is defined by

ginisplit(D) = (n1/n) gini(D1) + (n2/n) gini(D2)    (11)

The split with the best gini value is chosen. A number of research projects on optimal feature selection and classification have been carried out which adopt a hybrid strategy involving an evolutionary algorithm and inductive decision tree learning [33], [34], [35], [36].
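The splitting criteria of equations (7), (8), (10) and (11) can be computed directly from class labels; the short sketch below (toy labels, base-2 logarithm assumed) evaluates a candidate binary split under both the information gain and the gini criteria:

import math
from collections import Counter

def entropy(labels):
    # H(p1,...,ps) = sum_i p_i * log(1/p_i) over the class frequencies, eq. (7).
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def gain(parent, subsets):
    # Information gain of splitting `parent` into `subsets`, eq. (8).
    n = len(parent)
    return entropy(parent) - sum(len(s) / n * entropy(s) for s in subsets)

def gini(labels):
    # gini(D) = 1 - sum_j p_j^2, eq. (10).
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(d1, d2):
    # Weighted gini of a binary split, eq. (11).
    n = len(d1) + len(d2)
    return len(d1) / n * gini(d1) + len(d2) / n * gini(d2)

parent = ["normal"] * 6 + ["dos"] * 4
left, right = ["normal"] * 5 + ["dos"], ["normal"] + ["dos"] * 3
print(gain(parent, [left, right]), gini_split(left, right))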

D. K-Nearest Neighbors

K-nearest neighbors (KNN) is a classification algorithm based on the use of distance measures [37]. It finds the k examples in the training data that are closest to the test example and assigns the most frequent label among these examples to the new example. When a classification is to be made for a new example, its distance to each example in the training data must be determined. Only the K closest examples in the training data are considered further. The new example is then placed in the class that contains the most examples from this set of K closest examples. KNN can be considered a decision making technique equivalent to a Bayesian classifier in which the number of neighbors belonging to each class is used as an estimate of the relative posterior probabilities of class membership in the neighborhood of the sample to be classified.
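A compact sketch of this neighbor-voting rule, using Euclidean distance on numeric feature vectors (the feature values are illustrative and not tied to the paper's data):

import math
from collections import Counter

def knn_classify(train, query, k=3):
    # train: list of (feature_vector, label); query: feature_vector.
    # Vote among the k nearest training examples by Euclidean distance.
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0.0, 0.1), "normal"), ((0.1, 0.0), "normal"),
         ((0.9, 1.0), "dos"), ((1.0, 0.8), "dos"), ((0.8, 0.9), "dos")]
print(knn_classify(train, (0.85, 0.95), k=3))  # 'dos'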

IV. IMPROVED SELF ADAPTIVE BAYESIAN ALGORITHM

A. Adaptive Bayesian Algorithm

The adaptive Bayesian algorithm creates a classification function from the KDD99 benchmark intrusion detection training data [38]. It first estimates the class conditional probabilities for each attribute value based on their weighted frequencies within the matching class in the training data. The training data D = {A1, A2, ..., An} consists of attributes, where each attribute Ai = {Ai1, Ai2, ..., Aik} contains attribute values, and of a set of classes C = {C1, C2, ..., Cn}, where each class Cj = {Cj1, Cj2, ..., Cjk} has some values. Each example in the training data carries a weight, W = {W1, W2, ..., Wn}. Initially, all the weights of the training examples are set to the equal unit value Wi = 1.0. The algorithm then calculates the sum of weights for each class by counting how often each class occurs in the training data, and the sum of weights for each attribute value with respect to the same class. Next it calculates the class conditional probabilities for each attribute value from the training data using equation (12):

P(Aij|Cj) = Σ W_Aij / Σ W_Cj    (12)

Here P(Aij|Cj) is the class conditional probability, Σ W_Aij is the sum of weights for the attribute value within the class, and Σ W_Cj is the sum of weights for the class. After calculating the class conditional probabilities for each attribute value from the training data, the algorithm classifies each test example using equation (13):

TargetClass = argmax_{Cj} Π_i P(Aij|Cj)    (13)

If any test example is misclassified, the algorithm updates the weights of the training data. The algorithm compares each test example with every training example and computes the similarity between them, and then the weights of the training data are increased by a fixed small value multiplied by the corresponding similarity measure:

Wi = Wi + (S * 0.01)    (14)

Here S is the similarity between the test and training examples. If the test example is correctly classified, the weights of the training data remain unchanged. After the weight adjustment, the class conditional probabilities for the attribute values are recalculated from the modified weights of the training data. If the new set of probabilities correctly classifies all the test examples, the algorithm terminates. Otherwise, the iteration continues until all the test examples are correctly classified or the target accuracy is achieved. When the algorithm terminates, the class conditional probabilities are preserved for future classification of seen or unseen examples.
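A rough sketch of the weighted estimation in equation (12), the classification in equation (13) and the weight update in equation (14) is given below (the similarity S is taken here to be the fraction of matching attribute values, as described for ISABA in the next subsection; for ABA itself this is an assumption):

from collections import defaultdict

def conditional_probs(examples, weights):
    # P(Aij|Cj): weighted frequency of an attribute value within its class, eq. (12).
    class_w = defaultdict(float)   # class -> sum of weights
    attr_w = defaultdict(float)    # (class, attr index, value) -> sum of weights
    for (attrs, c), w in zip(examples, weights):
        class_w[c] += w
        for i, a in enumerate(attrs):
            attr_w[(c, i, a)] += w
    return lambda i, a, c: attr_w[(c, i, a)] / class_w[c] if class_w[c] else 0.0

def classify(attrs, classes, cond):
    # TargetClass = argmax_c prod_i P(Aij|Cj), eq. (13).
    def score(c):
        p = 1.0
        for i, a in enumerate(attrs):
            p *= cond(i, a, c)
        return p
    return max(classes, key=score)

def update_weights(weights, examples, test_attrs):
    # Wi = Wi + S * 0.01 for every training example after a misclassified
    # test example, eq. (14).
    for idx, (attrs, _) in enumerate(examples):
        s = sum(a == b for a, b in zip(attrs, test_attrs)) / len(attrs)
        weights[idx] += s * 0.01
    return weights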

B. Improved Self Adaptive Bayesian Algorithm

The improved self adaptive Bayesian algorithm (ISABA) is a modification of the adaptive Bayesian algorithm. Given training data, ISABA initializes the weight of each example to Wi = 1.0 and estimates the prior probability P(Cj) for each class by summing the weights according to how often each class occurs in the training data. For each attribute Ai in the training data, the number of occurrences of each attribute value Aij can be counted by summing the weights, to determine the probability P(Aij). Similarly, the probability P(Aij|Cj) can be estimated by summing the weights according to how often each attribute value occurs within the class in the training data. An example in the training data may have many different attributes A = {A1, A2, ..., An}, and each attribute has many values Ai = {Ai1, Ai2, ..., Aik}. The conditional probability P(Aij|Cj) is estimated for all attribute values. The algorithm then uses these conditional probabilities to classify all the training examples. When classifying a training example, the conditional and prior probabilities generated from the training data are used to make the prediction, by multiplying the probabilities of the different attribute values of the example. Suppose the training example ei has p independent attribute values {Ai1, Ai2, ..., Aip}; the algorithm has P(Aik|Cj) for each class Cj and attribute value Aik, and then estimates the probability P(ei|Cj) using equation (15):

P(ei|Cj) = P(Cj) Π_{k=1..p} P(Aik|Cj)    (15)

To calculate the probability P(ei), the algorithm estimates the likelihood that ei is in each class. The probability that ei is in a class is the product of the conditional probabilities for each attribute value. The posterior probability P(Cj|ei) is then found for each class, and the class with the highest probability is chosen for the example in the training data. The algorithm then sets the weight of each example in the training data to the highest value of the posterior probability P(Cj|ei) for that example, and also changes the class value to the one associated with the highest posterior probability. If any example in the training data is misclassified, the algorithm recalculates the prior probability P(Cj) and the conditional probability P(Aij|Cj) from the training examples using the updated weights, classifies the training examples again and updates the weights once more. This iteration continues until all the training examples are correctly classified or the target accuracy is achieved.

After classifying the training examples, the algorithm classifies the test examples using the conditional probabilities P(Aij|Cj). If any test example is misclassified, the algorithm again updates the weights of the training examples. The algorithm compares each test example with every training example and computes the similarity between them (for example, if two out of four attribute values are the same the similarity is 0.5, if one is the same it is 0.25, and so on), and then the weights of the training examples are increased by a fixed small value multiplied by the corresponding similarity measure. If the test examples are correctly classified, the weights of the training examples remain unchanged. After the weight adjustment, the conditional probabilities P(Aij|Cj) for the attribute values are recalculated from the modified weights of the training examples. If the new set of probabilities correctly classifies all the test examples, the algorithm stores the conditional probabilities and builds a decision tree by information gain using the last updated weights. Otherwise, the iteration continues until all the test examples are correctly classified or the target accuracy is achieved. At this stage the algorithm correctly classifies all the test examples; the conditional probabilities P(Aij|Cj) are preserved for future classification of seen or unseen intrusions, and a decision tree is built by information gain using the last updated weights of the training examples, which is also used for the classification of seen or unseen intrusions. The main procedure of the ISABA algorithm is described as follows.

Algorithm ISABA
Input: training data D, testing data
Output: intrusion detection model
Procedure:
1. Initialize all the weights in D: Wi = 1.0.
2. Calculate the prior probability P(Cj) for each class Cj from D:
   P(Cj) = Σ_{i ∈ Cj} Wi / Σ_{i=1..n} Wi
3. Calculate the probability P(Aij) for each attribute value Aij from D:
   P(Aij) = Σ_{Ai = Aij} Wi / Σ_{i=1..n} Wi
4. Calculate the conditional probability P(Aij|Cj) for each attribute value from D:
   P(Aij|Cj) = Σ_{Ai = Aij, class Cj} Wi / Σ_{i ∈ Cj} Wi
5. Classify each example in D:
   P(ei|Cj) = P(Cj) Π_i P(Aij|Cj)
6. Set each weight in D to the maximum posterior probability P(Cj|ei) for that example, and change its class value to the class associated with the highest posterior probability:
   Wi = max_{Cj} P(Cj|ei)
   Ci = argmax_{Cj} P(Cj|ei)
7. If any example in D is misclassified, go to step 2; else go to step 8.
8. Classify all test examples with the conditional probabilities P(Aij|Cj).
9. If any test example is misclassified, update the weights of D using the similarity S:
   Wi = Wi + (S * 0.01)
10. If the weights are updated, recalculate the probabilities P(Aij) and P(Aij|Cj) using the updated weights of D and go to step 8.
11. If all test examples are correctly classified, the algorithm stores the conditional probabilities for future classification of seen or unseen intrusions.
12. The algorithm builds a decision tree by information gain using the final updated weights of the training examples.
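Putting the listed steps together, a condensed sketch of the main loop is given below (it reuses conditional_probs, classify and update_weights from the sketch above; the posterior-based reweighting of step 6 is simplified to relabeling, the target accuracy test is reduced to exact agreement, and the decision tree of step 12 is omitted, so this is one possible reading rather than the authors' code):

def isaba(train, test, classes, max_iter=50):
    # train/test: lists of (attribute_tuple, class_label).
    examples = list(train)                 # working copies; labels may be revised
    weights = [1.0] * len(examples)        # step 1
    cond = conditional_probs(examples, weights)        # steps 2-4
    for _ in range(max_iter):
        preds = [classify(attrs, classes, cond) for attrs, _ in examples]  # step 5
        if all(p == c for p, (_, c) in zip(preds, examples)):              # step 7
            break
        # Step 6 (simplified): adopt the predicted class labels.
        examples = [(attrs, p) for (attrs, _), p in zip(examples, preds)]
        cond = conditional_probs(examples, weights)     # steps 2-4 again
    for attrs, label in test:                            # steps 8-10
        if classify(attrs, classes, cond) != label:
            weights = update_weights(weights, examples, attrs)
            cond = conditional_probs(examples, weights)
    return cond                                          # step 11: preserved probabilities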

V. EXPERIMENTAL RESULTS

A. Experimental Data

The KDD99 dataset is a common benchmark for the evaluation of intrusion detection techniques [39]. In the 1998 DARPA intrusion detection evaluation program, a simulated environment was set up by the MIT Lincoln Lab to acquire raw TCP/IP dump data for a local-area network (LAN) in order to compare the performance of various intrusion detection methods. DARPA98 was operated like a real environment, but was blasted with multiple intrusion attacks, and received much attention in the research community on adaptive intrusion detection. In 1999, based on the DARPA98 data, the Third International Knowledge Discovery and Data Mining Tools Competition established the KDD99 benchmark dataset for intrusion detection based on data mining. In the KDD99 dataset, each data example represents the attribute values of a class in the network data flow, and each class is labeled either as normal or as an attack with exactly one specific attack type. The KDD99 data records are all labeled with one of the following five types:

1) Normal connections are generated by simulated daily user behavior, such as downloading files, visiting web pages, etc.

2) Denial of Service (DoS) attacks make the computing power or memory of a victim machine too busy or too full to handle legitimate requests. DoS attacks are classified based on the services that an attacker renders unavailable to legitimate users. Examples of DoS attacks are Apache2, Land, Mail bomb, Back, etc.

3) User to Root (U2R) is a class of attacks in which an intruder/hacker begins with access to a normal user account and then becomes a super-user by exploiting various vulnerabilities of the system. The most common exploits in U2R attacks are regular buffer overflows, Loadmodule, Fdformat, and Ffbconfig.

4) Remote to User (R2L) is a class of attacks in which a remote user gains access to a local account by sending packets to a machine over a network connection; examples include Sendmail and Xlock.

5) Probing (Probe) is an attack that scans a network to gather information or to find known vulnerabilities. An intruder with a map of the machines and services that are available on a network can use the information to look for exploits.

B. Eeperimental Analysis<br />

Correct classification of known and unknown intrusions is one of the central problems of network-based intrusion detection. Many supervised and unsupervised learning algorithms have already been applied to classifying intrusions, but their performance was not very satisfactory due to the challenging problem of detecting novel attacks with low false alarms. In order to evaluate the performance of the improved self adaptive Bayesian algorithm (ISABA) for intrusion detection, we performed 5-class classification using the KDD99 benchmark dataset. The training and testing data are taken randomly from the KDD99 dataset with different ratios of positive versus negative instances. The training data are used to train the algorithms, and the test data are used to evaluate their performance. Table I shows the number of examples in the 10% training data and 10% testing data of the KDD99 dataset. There are some new attack examples in the testing data that are not present in the training dataset.
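The random train/test draw with a chosen ratio of attack (positive) to normal (negative) examples can be sketched as follows. This is illustrative only; the paper does not specify the sampling procedure beyond the description above, and the label value "normal" is an assumption.

```python
import numpy as np

def sample_with_ratio(X, y, n_total, pos_fraction, rng=None):
    """Draw n_total examples containing a given fraction of attack instances."""
    rng = rng or np.random.default_rng(0)
    pos_idx = np.flatnonzero(y != "normal")          # attacks of any type
    neg_idx = np.flatnonzero(y == "normal")
    n_pos = int(round(n_total * pos_fraction))
    take = np.concatenate([
        rng.choice(pos_idx, size=n_pos, replace=False),
        rng.choice(neg_idx, size=n_total - n_pos, replace=False),
    ])
    rng.shuffle(take)
    return X[take], y[take]
```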

TABLE I.
NUMBER OF EXAMPLES IN TRAINING AND TESTING DATA

Attack Types         Training Examples   Testing Examples
Normal               97277               60592
Probing              4107                4166
Denial of Service    391458              237594
User to Root         52                  70
Remote to User       1126                8606
Total Examples       494020              311028

In the experiments, we first performed classification on the KDD99 testing data using the naïve Bayesian classifier (NB), the adaptive Bayesian algorithm (ABA), and the improved self adaptive Bayesian algorithm (ISABA); their results are compared in Table II. It is clear that the proposed new algorithm is approximately two times faster in training and testing than the conventional algorithms.



TABLE II.
TRAINING AND TESTING TIME COMPARISON

                                            Training Time (s)   Testing Time (s)
Naïve Bayesian Classifier                   42.9                18.4
Adaptive Bayesian Algorithm                 24.6                9.6
Improved Self Adaptive Bayesian Algorithm   28.4                7.5

The performance comparison between NB and ISABA on the KDD99 testing data (using 41 attributes) is listed in Table III, which shows that ISABA achieves balanced and high classification rates on the 5 attack classes of the KDD99 data and minimizes the false positives.

TABLE III.
PERFORMANCE COMPARISON BETWEEN NB AND ISABA

                                                  Normal   Probe   DoS     U2R     R2L
Naïve Bayesian Classifier (DR %)                  99.25    99.13   99.69   64      99.11
Naïve Bayesian Classifier (FP %)                  0.08     0.45    0.04    0.14    8.02
Improved Self Adaptive Bayesian Algorithm (DR %)  99.62    99.22   99.49   99.17   99.15
Improved Self Adaptive Bayesian Algorithm (FP %)  0.05     0.36    0.03    0.10    6.91

We also tested the proposed algorithm using reduced datasets of 12 attributes and 17 attributes from the KDD99 dataset, which increases the classification rates for the intrusion classes; the results are summarized in Table IV.

TABLE IV.
PERFORMANCE OF ISABA USING REDUCED DATASETS

          12 Attributes   17 Attributes
Normal    99.97           99.96
Probe     99.91           99.95
DoS       99.99           99.98
U2R       99.36           99.46
R2L       99.53           99.69

We also compare the detection performance of Support Vector Machines (SVM), Neural Networks (NN), Genetic Algorithms (GA), the Naïve Bayesian classifier (NB), and the improved self adaptive Bayesian algorithm (ISABA) on the KDD99 dataset [40],[41],[42]. In total, 40 attributes of the KDD99 dataset have been used. Each connection can be categorized into five main classes (one normal class and four main intrusion classes: Probe, DoS, U2R, and R2L). The experimental setting uses 494020 data samples for training and 311028 data samples for testing. The comparative results are summarized in Table V.

TABLE V.
COMPARISON OF SEVERAL ALGORITHMS

          SVM    NN     GA      NB      ISABA
Normal    99.4   99.6   99.3    99.55   99.82
Probe     89.2   92.7   98.46   99.43   99.72
DoS       94.7   97.5   99.57   99.69   99.49
U2R       71.4   48     99.22   64      99.47
R2L       87.2   98     98.54   99.11   99.35


Figure 1. ROC curves for alert classification systems (detection rate vs. false positives for ISABA, NB, GA, NN and SVM).

Figure 1 shows the ROC (receiver operating characteristic) curves, i.e. the relationship between detection rate and false positives, on the 10% KDD99 testing data.
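For reference, the detection rate and false positive rate plotted in Figure 1 can be computed from simple per-class counts; the sketch below is illustrative only and the counts in the usage lines are hypothetical, not taken from the paper.

```python
def detection_rate(true_positives, actual_attacks):
    # fraction of attack instances of a class that are correctly flagged
    return true_positives / actual_attacks

def false_positive_rate(false_positives, actual_normals):
    # fraction of normal instances wrongly flagged as this attack class
    return false_positives / actual_normals

# hypothetical counts for one attack class
print(detection_rate(4150, 4166))       # ~0.996
print(false_positive_rate(150, 60000))  # ~0.0025
```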

VI. CONCLUSION AND FUTURE WORK

The main advantage of the proposed algorithm is that it generates a minimal rule set for network intrusion detection, which can detect network intrusions based on previous activities. The proposed algorithm analyzes a large volume of network data and considers the complex properties of attack behaviors to improve both detection speed and detection accuracy. This paper presents a new intrusion detection algorithm based on intelligent machine learning. We have concentrated on improving the performance of the naïve Bayesian classifier by adjusting the weights of the training examples until either all the test examples are correctly classified or the target accuracy on the test examples is achieved. The experimental results showed that this algorithm minimizes false positives and achieves balanced detection (classification) rates on the 5 classes of the KDD99 data. Future research will apply domain knowledge of security to improve the detection accuracy and will visualize the procedure of intrusion detection in real-world problem domains.

ACKNOWLEDGMENT

Support for this research was received from the Ministry of Science and Information & Communication Technology, Government of Bangladesh. We would like to thank Prof. Dr. Chowdhury Mofizur Rahman, United International University, Bangladesh, for fruitful discussions and valuable help in the implementation of this algorithm.

REFERENCES

[1] James P. Anderson, "Computer security threat monitoring and surveillance," Technical report, James P. Anderson Co., Fort Washington, Pennsylvania, April 1980.
[2] Cannady J., "Artificial neural networks for misuse detection," Proceedings of the '98 National Information System Security Conference (NISSC'98), Arlington: Virginia Press, 1998, pp. 443-456.
[3] Shon T., Seo J., and Moon J., "SVM approach with a genetic algorithm for network intrusion detection," Proceedings of the 20th International Symposium on Computer and Information Sciences (ISCIS 05), Berlin: Springer Verlag, 2005, pp. 224-233.
[4] Yu Y., and Huang Hao, "An ensemble approach to intrusion detection based on improved multi-objective genetic algorithm," Journal of Software, Vol. 18, No. 6, June 2007, pp. 1369-1378.
[5] J. Luo, and S. M. Bridges, "Mining fuzzy association rules and fuzzy frequency episodes for intrusion detection," International Journal of Intelligent Systems, 2000, pp. 687-703.
[6] W. K. Lee, and S. J. Stolfo, "A data mining framework for building intrusion detection model," Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA: IEEE Computer Society Press, 1999, pp. 120-132.

[7] Kononenko I., "Comparison of inductive and naïve Bayesian learning approaches to automatic knowledge acquisition," in Wielinga, B. (Ed.), Current Trends in Knowledge Acquisition, Amsterdam, IOS Press, 1990.
[8] Langley, P., Iba, W., and Thompson, K., "An analysis of Bayesian classifiers," in Proceedings of the 10th National Conference on Artificial Intelligence (San Mateo, CA: AAAI Press), 1992, pp. 223-228.
[9] Denning, Dorothy E., "An Intrusion Detection Model," Proceedings of the Seventh IEEE Symposium on Security and Privacy, May 1986, pp. 119-131.
[10] Sebring, Michael M., and Whitehurst, R. Alan, "Expert Systems in Intrusion Detection: A Case Study," The 11th National Computer Security Conference, October 1988.
[11] Smaha, Stephen E., "Haystack: An Intrusion Detection System," The 4th Aerospace Computer Security Applications Conference, Orlando, FL, December 1988.
[12] Vaccaro, H. S., and Liepins, G. E., "Detection of Anomalous Computer Session Activity," The 1989 IEEE Symposium on Security and Privacy, May 1989.
[13] Heberlein, L. T., et al., "A Network Security Monitor," Proceedings of the IEEE Computer Society Symposium, Research in Security and Privacy, May 1990, pp. 296-303.
[14] Lunt, Teresa F., "IDES: An Intelligent System for Detecting Intruders," Proceedings of the Symposium on Computer Security: Threats and Countermeasures, Rome, Italy, November 22-23, 1990, pp. 110-121.
[15] Teng, Henry S., Chen, Kaihu, and Lu, Stephen C-Y, "Adaptive Real-time Anomaly Detection using Inductively Generated Sequential Patterns," 1990 IEEE Symposium on Research in Security and Privacy, Oakland, CA, pp. 296-304.
[16] Snapp, Steven R., and Ho, "DIDS (Distributed Intrusion Detection System) – Motivation, Architecture, and An Early Prototype," The 14th National Computer Security Conference, October 1991, pp. 167-176.
[17] Jackson, Kathleen, DuBois, David H., and Stallings, Cathy A., "A Phased Approach to Network Intrusion Detection," 14th National Computing Security Conference, 1991.
[18] Lunt, Teresa F., "Detecting Intruders in Computer Systems," 1993 Conference on Auditing and Computer Technology, SRI International.
[19] Paxson, Vern, "Bro: A System for Detecting Network Intruders in Real-Time," Proceedings of The 7th USENIX Security Symposium, San Antonio, TX, 1998.
[20] Barbara, Daniel, Couto, Julia, Jajodia, Sushil, Popyack, Leonard, and Wu, Ningning, "ADAM: Detecting Intrusions by Data Mining," Proceedings of the IEEE Workshop on Information Assurance and Security, West Point, New York, June 5-6, 2001.


[21] Jackson, T., Levine, J., Grizzard, J., and Owen, H., "An investigation of a compromised host on a honeynet being used to increase the security of a large enterprise network," Proceedings of the 2004 IEEE Workshop on Information Assurance and Security, IEEE, 2004.
[22] Krasser, S., Grizzard, J., Owen, H., and Levine, J., "The use of honeynets to increase computer network security and user awareness," Journal of Security Education, Volume 1, pp. 23-37, 2005.
[23] Thomas H. Ptacek, and Timothy N. Newsham, "Insertion, evasion and denial of service: Eluding network intrusion detection," Technical report, Secure Networks Inc., 1998.
[24] D. Y. Yeung, and Y. X. Ding, "Host-based intrusion detection using dynamic and static behavioral models," Pattern Recognition, 36, pp. 229-243, 2003.
[25] SunSoft, SunSHIELD Basic Security Module, SunSoft, 1995.
[26] Diego Zamboni, "Using Internal Sensors for Computer Intrusion Detection," PhD thesis, Purdue University, 2001.
[27] Tadeusz Pietraszek, and Chris Vanden Berghe, "Defending against injection attacks through context-sensitive string evaluation," in Recent Advances in Intrusion Detection (RAID 2005), volume 3858 of Lecture Notes in Computer Science, pp. 124-145, Seattle, WA, 2005, Springer-Verlag.
[28] Sebastiaan Tesink, "Improving Intrusion Detection Systems through Machine Learning," Technical Report Series no. 07-02, ILK Research Group, Tilburg University, March 2007.
[29] J. R. Quinlan, "Induction of Decision Trees," Machine Learning, Vol. 1, pp. 81-106, 1986.
[30] J. R. Quinlan, "C4.5: Programs for Machine Learning," Morgan Kaufmann Publishers, San Mateo, CA, 1993.
[31] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, "Classification and Regression Trees," Statistics/Probability Series, Wadsworth, Belmont, 1984.
[32] John Shafer, Rakesh Agrawal, and Manish Mehta, "SPRINT: A Scalable Parallel Classifier for Data Mining," in Proceedings of the VLDB Conference, Bombay, India, September 1996.
[33] P. Turney, "Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm," Journal of Artificial Intelligence Research, pp. 369-409, 1995.
[34] J. Bala, K. De Jong, J. Huang, H. Vafaie, and H. Wechsler, "Hybrid Learning using Genetic Algorithms and Decision Trees for Pattern Classification," in Proceedings of the 14th International Conference on Artificial Intelligence, 1995.
[35] C. Guerra-Salcedo, S. Chen, D. Whitley, and Stephen Smith, "Fast and Accurate Feature Selection using Hybrid Genetic Strategies," in Proceedings of the Genetic and Evolutionary Computation Conference, 1999.
[36] S. R. Safavian and D. Landgrebe, "A Survey of Decision Tree Classifier Methodology," IEEE Transactions on Systems, Man and Cybernetics, 21(3), pp. 660-674, 1991.
[37] Duda, R., P. E. Hart, and D. G. Stork, "Pattern Classification," Second edn., John Wiley & Sons, 2001.
[38] Dewan Md. Farid, and Mohammad Zahidur Rahman, "Learning Intrusion Detection Based on Adaptive Bayesian Algorithm," Proceedings of the 11th International Conference on Computer and Information Technology (ICCIT 2008), Khulna, Bangladesh, 25-27 December 2008, pp. 652-656.
[39] The KDD Archive, KDD99 cup dataset, 1999. http://archive.ics.uci.edu/ml/datasets/KDD+Cup+1999+Data

[40] Mukkamala S., Sung A. H., and Abraham A., "Intrusion detection using an ensemble of intelligent paradigms," Journal of Network and Computer Applications, 2005, 2(8): pp. 167-182.

[41] Chebrolu S., Abraham A., and Thomas J. P., "Feature deduction and ensemble design of intrusion detection systems," Computers & Security, 2004, 24(4), pp. 295-307.
[42] Lee W. K., Stolfo S. J., and Mok K. W., "A data mining framework for building intrusion detection models," Proceedings of the '99 IEEE Symposium on Security and Privacy, Oakland: IEEE Computer Society, 1999, pp. 120-132.

Dewan Md. Farid was born in 1979. He received the B.Sc. Engineering in Computer Science and Engineering from Asian University of Bangladesh in 2003 and the Master of Science in Computer Science and Engineering from United International University, Bangladesh in 2004. He is pursuing a Ph.D. at the Department of Computer Science and Engineering, Jahangirnagar University, Bangladesh. His major fields of study are artificial intelligence, machine learning, and data mining.

He is a faculty member at the Department of Computer Science and Engineering, United International University, Bangladesh. He has published two conference papers, including "Learning Intrusion Detection Based on Adaptive Bayesian Algorithm."

Mr. Farid is a member of the Bangladesh Computer Society and the Research Scholar Association, Jahangirnagar University. In 2008, he received the Fellowship of National Science and Information & Communication Technology (NSICT) from the Ministry of Science and Information & Communication Technology, Government of Bangladesh.


Mohammad Zahidur Rahman is currently a Professor at the Department of Computer Science and Engineering, Jahangirnagar University, Bangladesh. He obtained his B.Sc. Engineering in Electrical and Electronics from Bangladesh University of Engineering and Technology in 1986 and his M.Sc. Engineering in Computer Science and Engineering from the same institute in 1989. He obtained his Ph.D. degree in Computer Science and Information Technology from the University of Malaya in 2001. He is a co-author of a book on e-commerce published in Malaysia. His current research includes the development of a secure distributed computing environment and e-commerce.



Performance Evaluation for Question Classification by Tree Kernels using Support Vector Machines

Muhammad Arifur Rahman
Department of Physics, Jahangirnagar University, Dhaka-1342, Bangladesh
arifmail@gmail.com

Abstract — Question answering systems use information retrieval (IR) and information extraction (IE) methods to retrieve documents containing a valid answer. Question classification plays an important role in the question answering framework by reducing the gap between question and answer. This paper presents our research work on automatic question classification through machine learning approaches. We have experimented with the machine learning algorithm Support Vector Machines (SVM) using kernel methods. An effective way to integrate syntactic structures for question classification in machine learning algorithms is the use of tree kernel (TK) functions. Here we use the SubTree kernel, the SubSet Tree kernel with bag of words, and the Partial Tree kernel. The trade-off between training error and margin, the cost-factor, and the decay factor have a significant impact when we use SVM with the above-mentioned kernel types. The experiments determined the individual impact of the trade-off between training error and margin, the cost-factor and the decay factor, and later the combined effect of the trade-off between training error and margin and the cost-factor. For each kernel type, depending on these results, we also identify some hyper planes which can maximize the performance. Based on a standard dataset, the outcomes of our experiment for question classification are promising.

Index Terms — Precision, Recall, kernel, SubSet Tree, SubTree, Partial Tree, SVM, Question Classification, Question Answering.

I. INTRODUCTION

The World Wide Web continues to grow at an amazing speed, and with it the number of text and hypertext documents grows quickly. Due to the huge size, high dynamics, and large diversity of the web, it has become a very challenging task to find the truly relevant content for some user or purpose. Open-domain question answering (QA) systems have attracted great attention for their capacity to provide compact and precise results to users.

The study of question classification (QC), as a new field, corresponds with the research on QA. At present, studies on QC are mainly based on text classification. Though QC is similar to text classification in some aspects, they are clearly distinct in that a question is usually shorter and contains less lexicon-based information than a text, which brings great trouble to QC. Therefore, to obtain higher classifying accuracy, QC has


to make a further analysis of sentences; namely, QC has to extend the interrogative sentence with syntactic and semantic knowledge, replacing or extending the vocabulary of the question with the semantic meaning of every word.

In QC, many systems apply machine-learning approaches [1-3]. The classification is made according to lexical and syntactic features and parts of speech. The machine learning approach has great adaptability, and 90.0% classifying accuracy has been obtained with the SVM method and tree kernels as features. However, there is still the problem that the classifying result is affected by the accuracy of the syntactic analyzer, which requires the weights of different classifying features to be determined manually. Some other systems adopt manually constructed rules [4].

This paper presents our approach to question classification to improve the f1-measure [6] of question categorization. Our experiments tried to determine the optimum value of the trade-off between training error and margin and of the cost-factor, by which training errors on positive examples outweigh errors on negative examples; the cost-factor is used to adjust the rate between precision and recall on the development set [5-6]. We also study the decay factor, whose role is to penalize larger tree structures by giving them less weight in the overall summation [7]. To represent questions, the paper uses the standard Li and Roth [8] question classification dataset.

After this overview, the question classifying method, a brief description of the algorithm used and the kernel methods are introduced, and then the impact of different parameters and parameter combinations is investigated. The comparisons are verified in experiments based on precision, recall and f1-measure. The last part of the paper presents the conclusions of the present research and introduces further work that could be done on this issue.

II. QUESTION CLASSIFICATION

Question classification means putting questions into several semantic categories. Approaches to question classification can be divided into two broad classes, namely, rule-based and machine learning methods. Most recent studies have been based on machine learning approaches. Li and Roth proposed 6 coarse classes and 50 fine classes for TREC factoid question answering. The UIUC QC dataset, which they developed, contains 5,500 training questions and 500 test questions, and it is now the standard dataset for question classification [8-9]. We used this dataset for our training and testing purposes.

In Table I each coarse grained category contains a non-overlapping set of fine grained categories. Most question answering systems use a coarse grained category definition; usually the number of question categories is less than 20. However, it is obvious that a fine grained category definition is more beneficial in locating and verifying the plausible answers.

TABLE I: THE COARSE AND FINE GRAINED QUESTION CATEGORIES.

Coarse   Fine
ABBR     abbreviation, expansion
DESC     description, definition, manner, reason
ENTY     body, color, event, food, creation, currency, animal, disease/medical, instrument, language, letter, other, plant, product, religion, sport, substance, symbol, technique, term, vehicle, word
HUM      description, group, individual, title
LOC      city, country, mountain, other, state
NUM      code, count, date, distance, money, order, other, percent, period, speed, temperature, size, weight
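As an aside, the labels in this dataset are commonly distributed as coarse:fine pairs attached to each question (this is an assumption about the file format, not something stated in the paper); recovering the coarse class used in Table I is then a one-line split.

```python
def coarse_label(label):
    # e.g. a hypothetical label "NUM:date" -> "NUM";
    # labels without a fine part are returned unchanged
    return label.split(":", 1)[0]

print(coarse_label("NUM:date"))   # NUM
print(coarse_label("ABBR:exp"))   # ABBR
```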

III. SUPPORT VECTOR MACHINE (SVM)

Support vector machines are based on the structural risk minimization principle from statistical learning theory [10]. The idea of structural risk minimization is to find a hypothesis for which we can guarantee the lowest true error; bounds on the true error are connected with the margin of the separating hyper planes. In their basic form, support vector machines find the hyper plane that separates the training data with maximum margin. Since we will be dealing with very unbalanced numbers of positive and negative examples in the following, we use cost factors C+ and C- to be able to adjust the cost of false positives vs. false negatives, as in [11]. Finding this hyper plane can be translated into the following optimization problem:

Minimize:
$$\frac{1}{2}\,\|\mathbf{w}\|^{2} \;+\; C_{+}\!\sum_{i:\,y_i=+1}\xi_i \;+\; C_{-}\!\sum_{j:\,y_j=-1}\xi_j$$

Subject to:
$$\forall k:\quad y_k\left(\mathbf{w}\cdot\mathbf{x}_k + b\right) \;\ge\; 1-\xi_k,\qquad \xi_k \ge 0$$

Here xi is the feature vector of example i, and yi equals +1 or -1 if example i is in class '+' or '-' respectively.

Using support vector machines we tried to determine the optimal values of the trade-off between training error and margin (c) and of the cost-factor (j), by which training errors on positive examples outweigh errors on negative examples; the cost-factor is used to adjust the rate between precision and recall on the development set [11]. The decay factor lambda (λ) has the role of penalizing larger tree structures by giving them less weight in the overall summation.
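The parameters c and j play the same role as the regularization constant and the asymmetric class weighting found in most soft-margin SVM implementations. The sketch below shows a rough scikit-learn analogue; it is an assumption introduced only for illustration and is not the tool or the tree kernel actually used in these experiments.

```python
from sklearn.svm import SVC

# c : trade-off between training error and margin (regularization constant)
# j : cost factor by which errors on positive examples outweigh errors on negatives
c, j = 1.5, 1.5
clf = SVC(kernel="linear", C=c, class_weight={1: j, -1: 1.0})
# clf.fit(X_train, y_train) with y in {+1, -1}; one binary classifier per coarse class
```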


IV. KERNEL METHODS

One of the most difficult tasks in applying machine learning to question classification is feature design. Features should represent data in a way that allows the learning algorithm to separate positive from negative examples. In SVMs, features are used to build the vector representation of data examples, and the scalar product between example pairs quantifies how similar they are. Instead of encoding data in feature vectors, kernel functions can be designed that provide such a similarity between example pairs without using an explicit feature representation [6].

The kernels we consider in this paper represent trees in terms of their substructures. Such fragments define the feature space which, in turn, is mapped into a vector space. The kernel function measures the similarity between trees by counting the number of common fragments. These functions have to recognize whether a common tree subpart belongs to the feature space that we intend to generate. Here we consider three important characterizations: SubTrees (STs), SubSet Trees (SSTs) and a new tree class, Partial Trees (PTs), described as in [6].

In the case of syntactic parse trees, each node with its children is associated with a grammar production rule, where the symbol on the left-hand side corresponds to the parent and the symbols on the right-hand side are associated with the children. The terminal symbols of the grammar are always associated with the leaves of the tree. For example, Fig. 1 illustrates the syntactic parse of the sentence "Adib brought a cat to the school".

Figure 1. A syntactic parse tree.

Figure 2. A syntactic parse tree with its SubTrees (STs).

A SubTree (ST) can be defined as any node of a tree along with all of its descendants. For example, the circle in Fig. 1 marks the SubTree rooted at the NP node. A SubSet Tree (SST) is a more general structure which does not necessarily include all the descendants. The only restriction is that an SST must be generated by applying the same grammatical rule set which generated the original tree, as pointed out in [12]. Thus, the difference with the SubTree is that the SST's leaves can be associated with non-terminal symbols. For example, [S [N VP]] is an SST of the tree in Fig. 1, and it has the two non-terminal symbols N and VP as leaves.

Figure 3. A tree with some of its SubSet Trees (SSTs).

Figure 4. A tree with some of its Partial Trees (PTs).

If we relax the constraints over the SSTs, we obtain a more general form of substructure that we define as Partial Trees (PTs). These can be generated by the application of partial production rules of the original grammar. For example, [S [N VP]], [S [N]] and [S [VP]] are valid PTs of the tree in Fig. 1. Fig. 2 shows the parse tree of the sentence "Adib brought a cat" together with its 6 STs. The number of SSTs is always higher; for example, Fig. 3 shows 10 SSTs (out of all 17) of the SubTree of Fig. 2 rooted at VP. Fig. 4 shows that the number of PTs derived from the same tree is still higher (i.e. 30 PTs). These different substructure numbers provide an intuitive quantification of the different information levels among the tree based representations [6].

The main idea of the tree kernel is to compute the number of common substructures between two trees T1 and T2 without explicitly considering the whole fragment space. Following [6], given a tree fragment space {f1, f2, ...} = F, we denote by Ii(n) the indicator function which is equal to 1 if the target fi is rooted at node n and 0 otherwise. It then follows that

$$K(T_1,T_2) \;=\; \sum_{n_1\in N_{T_1}}\;\sum_{n_2\in N_{T_2}} \Delta(n_1,n_2) \qquad (1)$$

where $N_{T_1}$ and $N_{T_2}$ are the sets of the $T_1$'s and $T_2$'s nodes, respectively, and $\Delta(n_1,n_2)=\sum_{i=1}^{|F|} I_i(n_1)\,I_i(n_2)$.

The latter is equal to the number of common fragments rooted at the n1 and n2 nodes. We can compute Δ as follows:


1. if the productions at $n_1$ and $n_2$ are different, then $\Delta(n_1,n_2)=0$;

2. if the productions at $n_1$ and $n_2$ are the same, and $n_1$ and $n_2$ have only leaf children (pre-terminal symbols), then $\Delta(n_1,n_2)=1$;

3. if the productions at $n_1$ and $n_2$ are the same, and $n_1$ and $n_2$ are not pre-terminals, then

$$\Delta(n_1,n_2) \;=\; \prod_{j=1}^{nc(n_1)}\left(\sigma + \Delta\!\left(c_{n_1}^{\,j},\,c_{n_2}^{\,j}\right)\right) \qquad (2)$$

where $\sigma\in\{0,1\}$, $nc(n_1)$ is the number of children of $n_1$, and $c_{n}^{\,j}$ is the $j$-th child of node $n$. Since the productions are the same, $nc(n_1)=nc(n_2)$.

When $\sigma=0$, $\Delta(n_1,n_2)$ is equal to 1 only if $\Delta(c_{n_1}^{\,j},c_{n_2}^{\,j})=1$ for every $j$, i.e. all the productions associated with the children are identical. By recursively applying this property, it follows that the SubTrees rooted at $n_1$ and $n_2$ are identical; thus Eq. 1 evaluates the SubTree (ST) kernel. When $\sigma=1$, $\Delta(n_1,n_2)$ evaluates the number of SSTs common to $n_1$ and $n_2$, as proved in [12].

To include the leaves as fragments it is enough to add to the recursive rule set for the $\Delta$ evaluation the condition: if $n_1$ and $n_2$ are leaves and their associated symbols are equal, then $\Delta(n_1,n_2)=1$. Following [6], such an extended kernel is referred to as SST+bow (bag-of-words). The decay factor $\lambda$ is then introduced by setting $\Delta(n_x,n_z)=\lambda$ in the pre-terminal case and

$$\Delta(n_x,n_z) \;=\; \lambda\prod_{j=1}^{nc(n_x)}\left(\sigma + \Delta\!\left(c_{n_1}^{\,j},\,c_{n_2}^{\,j}\right)\right)$$

otherwise.
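A direct transcription of the Δ recursion above (with the switch σ selecting the ST- or SST-style behaviour, the decay factor λ, and the leaf condition of the SST+bow extension) might look like the following sketch. The Node type and its production() helper are assumptions introduced only for illustration; this is not the implementation used in the experiments.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    def production(self):
        return self.label + " -> " + " ".join(c.label for c in self.children)

def delta(n1, n2, lam=0.01, sigma=1):
    """Delta recursion: sigma=0 -> ST-style, sigma=1 -> SST-style, lam = decay factor."""
    # leaf pairs with the same symbol contribute lam (the +bow extension)
    if not n1.children and not n2.children:
        return lam if n1.label == n2.label else 0.0
    # rule 1: different productions -> no common fragments rooted here
    if n1.production() != n2.production():
        return 0.0
    # rule 2: identical pre-terminal productions
    if all(not c.children for c in n1.children):
        return lam
    # rule 3: same production, recurse over the aligned children
    prod = lam
    for c1, c2 in zip(n1.children, n2.children):
        prod *= sigma + delta(c1, c2, lam, sigma)
    return prod

def tree_kernel(t1_nodes, t2_nodes, lam=0.01, sigma=1):
    # Eq. (1): sum Delta over all node pairs of the two trees
    return sum(delta(n1, n2, lam, sigma) for n1 in t1_nodes for n2 in t2_nodes)
```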

To evaluate all possible substructures common to two trees, we can (1) select a child subset from both trees, (2) extract the portion of the syntactic rule that contains such a subset, and (3) apply Eq. 2 to the extracted partial productions and sum the contributions of all the children subsets. Such subsets correspond to all possible common (non-continuous) node subsequences and can be computed efficiently by means of a sequence kernel [13].

by means <strong>of</strong> sequence kernel [13]. Let<br />

�<br />

�<br />

J1 � �J11, .., J1r<br />

� and J2 � �J21, .., J2r<br />

� be the<br />

index sequence associated with the order child sequence<br />

<strong>of</strong> n1 and n2 respectively. Then the number <strong>of</strong> PTs is<br />

evaluated by the following � function:<br />

�<br />

l�J1�<br />

�<br />

��<br />

���<br />

� � ��<br />

� J1<br />

n1<br />

, n2<br />

1<br />

c ,<br />

� � � � � n1<br />

J , J , l J �l J i�1<br />

where �J� 1<br />

2<br />

� � � �<br />

1<br />

2<br />

�<br />

i J 2i<br />

cn2<br />

l 1<br />

� indicates the length <strong>of</strong> the target child<br />

� �<br />

and J2 i are the i th children in the<br />

sequence, whereas J1i two sequences.<br />

�<br />

�<br />



V. RESULTS AND ANALYSIS

We evaluate the performance of question classification using SVM with the SubTree kernel, the SubSet Tree kernel with bag of words (SST+BOW), and the Partial Tree kernel. For the SubSet Tree kernel with bag of words we first investigate the optimum value of the trade-off between training error and margin (c), keeping the cost-factor (j) and the decay factor (λ) constant. Then we investigate the impact of the cost factor (j) for a constant trade-off between training error and margin (c) and decay factor (λ).

Figure 5(a). Training error and margin trade-off (c) vs. F1-measure for j = 0.5 and SST+BOW.

Figure 5(b). Training error and margin trade-off (c) vs. F1-measure for j = 1.5 and SST+BOW.

Figure 5(c). Training error and margin trade-off (c) vs. F1-measure for j = 3.5 and SST+BOW.


Then, for a constant decay factor (λ), we determine the hyper plane over the trade-off between training error and margin (c) and the cost-factor (j) which gives the maximum f1-measure. Finally we determine the impact of the decay factor (λ) for a constant trade-off between training error and margin (c) and cost-factor (j). The same process was repeated for the SubTree kernel and the Partial Tree kernel.
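The procedure just described amounts to a grid search over c and j at a fixed λ, scoring each setting by the f1-measure. A minimal sketch is given below; train_and_eval is a hypothetical stand-in for an SVM run with the chosen kernel and here returns dummy precision/recall values only so that the sketch runs end to end.

```python
import numpy as np

def f1(precision, recall):
    # harmonic mean of precision and recall
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)

def train_and_eval(c, j, lam=0.01):
    # hypothetical hook: train an SVM with (c, j, lam) and evaluate on the dev set;
    # dummy values are returned here purely to make the sketch executable
    return 0.90, 0.85

best = None
for c in np.arange(0.5, 4.1, 0.5):
    for j in np.arange(0.5, 6.1, 0.5):
        p, r = train_and_eval(c, j)
        score = f1(p, r)
        if best is None or score > best[0]:
            best = (score, c, j)
print(best)   # (best f1, c, j) for the swept grid
```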

Fig. 5(a-c) shows the trade-off between training error and margin (c) versus performance (measured by f1-measure) for the different types of QC data, using different cost factors (j) and a fixed decay factor (λ = 0.01).

Figure 6(a). Cost factor (j) vs. F1-measure for c = 0.5 and SST+BOW.

Figure 6(b). Cost factor (j) vs. F1-measure for c = 1 and SST+BOW.

Figure 6(c). Cost factor (j) vs. F1-measure for c = 1.5 and SST+BOW.



It is notable that for the same set of parameters, different data types reach different performance levels. For the different cost factors, the highest f1-measure can be achieved by setting the trade-off between training error and margin (c) to around 2; afterwards the performance becomes almost constant.

Fig. 6(a-c) shows the cost factor (j) versus performance (measured by f1-measure) for the different types of question classification data, using different values of the trade-off between training error and margin (c) and a fixed decay factor (λ = 0.01). For a constant trade-off between training error and margin, the different question types reach their highest f1-measure.

For the different values of c, the highest f1-measure can be achieved by setting the cost factor (j) to less than 2. In some cases, if the value of the cost factor is increased further, the f1-measure decreases by a certain amount and after a stage becomes almost constant.

Fig. 7 shows the decay factor (λ) versus f1-measure for constant values of the trade-off between training error and margin (c = 1.5) and the cost factor (j = 1.5). To get a very high f1-measure the decay factor should be very small. Starting just above 0 and up to 0.5 the performance is almost steady, but if we increase the decay factor further the performance drops drastically.

Figure 7. Impact of the decay factor for SST+BOW (fixed cost factor j = 1.5 and trade-off c = 1.5).

Figure 8(a). Hyper plane for the ABBR data type and SST+BOW.

Fig. 8(a-f) shows the hyper planes by which the maximum f1-measure can be achieved. For a constant decay factor (λ), a set of values of the trade-off between training error and margin (c) and the cost factor (j) can provide the highest f1-measure. Among the six data types, the 'LOC' type of question has to be classified with some fixed combinations of the trade-off between training error and margin (c) and the cost factor (j), because there is no hyper plane but only some peaks.

Figure 8(b). Hyper plane for the HUM data type and SST+BOW.

Figure 8(c). Hyper plane for the DESC data type and SST+BOW.

Figure 8(d). Hyper plane for the LOC data type and SST+BOW.



Figure 8(e). Hyper plane for the ENTY data type and SST+BOW.

Figure 8(f). Hyper plane for the NUM data type and SST+BOW.

Figure 9. Training error and margin trade-off (c) vs. F1-measure for j = 0.5 and PT.

Fig. 9 shows the training error and margin trade-off versus F1-measure for a constant cost factor (j = 0.5) using the Partial Tree kernel. As with SST+BOW, the performance is very low when the trade-off is very small; after a certain level (here around 2) the performance tends to become almost constant.


Figure 10. Cost factor (j) vs. F1-measure for c = 0.5 and PT.

Figure 11. Impact of the decay factor for PT.

Figure 12. Hyper plane for the NUM data type and PT.

Fig. 10 shows the cost factor versus F1-measure for a constant trade-off between training error and margin (c = 0.5), while Fig. 11 shows the impact of the decay factor. It is clear that at the lower range the decay factor does not have much impact on performance, unlike SST+BOW. Fig. 12 and Fig. 13 respectively show the hyper planes for the NUM and ENTY data types.
data type.



Figure 13. Hyper plane for the ENTY data type and PT.

TABLE II: MAXIMUM F1-MEASURE OBTAINED USING THE SUBSET TREE KERNEL WITH BAG OF WORDS (SST+BOW) FOR λ=0.01.

Coarse   F1      P(%)   R(%)    A(%)    j     c
ABBR     88.89   99.6   88.89   88.89   6     1
DESC     96.77   98.2   95.74   97.83   1     2.5
ENTY     82.87   93.8   86.21   79.79   6     2
HUM      95.24   98.8   98.36   92.31   4     3.5
LOC      87.8    96     86.75   88.89   5.5   0.5
NUM      94.01   97.4   98.08   90.27   5     1

TABLE III: MAXIMUM F1-MEASURE OBTAINED USING THE SUBTREE KERNEL FOR λ=0.01.

Coarse   F1      P(%)   R(%)    A(%)    j     c
ABBR     88.89   99.6   88.89   88.89   3.5   2.5
DESC     97.12   98.4   96.43   97.83   1     3.5
ENTY     79.78   92.8   84.52   75.53   6.5   3
HUM      93.75   98.4   95.24   92.31   2     2.5
LOC      88.48   96.2   86.9    90.12   5.5   1
NUM      94.01   97.4   98.08   90.27   5     1.5

TABLE IV: MAXIMUM F1-MEASURE OBTAINED USING THE PARTIAL TREE KERNEL FOR λ=0.01.

Coarse   F1      P(%)   R(%)    A(%)    j     c
ABBR     87.5    99.6   100     77.78   3     0.5
DESC     96.06   97.8   95.04   97.1    1     3.5
ENTY     85.56   94.6   86.02   85.11   4     3.5
HUM      81.33   94.4   71.76   93.85   5.5   0.5
LOC      88.48   96.2   86.9    90.12   5.5   1.5
NUM      94.44   97.6   99.03   90.27   4.5   3

Table II shows the maximum f1-measure (F1) obtained using the SubSet Tree kernel with bag of words (SST+BOW) for λ = 0.01, together with the corresponding precision (P), recall (R) and accuracy (A) for the specific cost factor (j) and trade-off between training error and margin (c). Table III and Table IV show the performance for the SubTree kernel and the Partial Tree kernel respectively.

VI. CONCLUSION

Question classification plays an important role in the question answering framework by reducing the gap between question and answer; it can guide answer choosing and selection. Our question classification method is based on the use of linguistic knowledge and machine learning approaches, and it also exploits different classification features and combination methods. Although in some experiments one or two data sets did not provide a distinguishable hyper plane, the outcomes of the experiments done with the SVM_light tool on the Li and Roth question classification dataset demonstrate optimal sets of values which can maximize the performance. In the future we aim to investigate the impact of different parameters on constituent trees using different types of kernels as well as other classifiers.

REFERENCES

[1] Hovy, E., Hermjakob, U., Lin, C.-Y., and Ravichandran, D., "Using Knowledge to Facilitate Factoid Answer Pinpointing," Proceedings of the 19th International Conference on Computational Linguistics (COLING), Taipei, Taiwan, 2002.
[2] Ittycheriah, A., Franz, M., Zhu, W.-J., and Ratnaparkhi, A., "IBM's Statistical Question Answering System," Proceedings of the TREC-9 Conference, Gaithersburg, MD: NIST, 2000, p. 229.
[3] Zhang, D., and Lee, W. S., "Question Classification using Support Vector Machines," Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Toronto, Canada, 2003.
[4] Li Xin, Huang Xuan-Jing, and Wu Li-de, "Question Classification by Ensemble Learning," IJCSNS International Journal of Computer Science and Network Security, Vol. 6, No. 3, March 2006.
[5] Alessandro Moschitti, Silvia Quarteroni, Roberto Basili, and Suresh Manandhar, "Exploiting Syntactic and Shallow Semantic Kernels for Question/Answer Classification," Proceedings of the 45th Conference of the Association for Computational Linguistics (ACL), Prague, June 2007.
[6] Roberto Basili and Alessandro Moschitti, "Automatic Text Categorization: From Information Retrieval to Support Vector Learning," ARACNE editrice, November 2005.
[7] Stephan Bloehdorn and Alessandro Moschitti, "Exploiting Structure and Semantics for Expressive Text Kernels," Proceedings of the Conference on Information and Knowledge Management, Lisbon, Portugal, 2007.
[8] Li, X., and Roth, D., "Learning Question Classifiers," Proceedings of the 19th International Conference on Computational Linguistics, Taipei, Taiwan, 2002.
[9] M.-Y. Day, C.-H. Lu, C.-S. Ong, S.-H. Wu, and W.-L. Hsu, "Integrating Genetic Algorithms with Conditional Random Fields to Enhance Question Informer Prediction," Proceedings of the IEEE International Conference on Information Reuse and Integration, Waikoloa, Hawaii, USA, 2006, pp. 414-419.
[10] V. Vapnik, "Statistical Learning Theory," Wiley, New York, USA, 1998.
[11] K. Morik, P. Brockhausen, and T. Joachims, "Combining statistical learning with a knowledge-based approach – A case study in intensive care monitoring," International Conference on Machine Learning (ICML), Bled, Slovenia, 1999.
[12] M. Collins and N. Duffy, "New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron," Association for Computational Linguistics (ACL), Philadelphia, USA, 2002.
[13] Huma Lodhi, Craig Saunders, John Shawe-Taylor, Nello Cristianini, and Christopher Watkins, "Text Classification using String Kernels," in NIPS, pp. 563-569, 2000.


Muhammad Arifur Rahman completed his Masters degree in Human Language Technology and Interfaces (HLTI) in the Department of Information Engineering and Computer Science at the University of Trento, Italy. He received his B.Sc. (Honors) degree from the Department of Computer Science and Engineering, Jahangirnagar University, Bangladesh in 2001. His major areas of interest include Natural Language Processing, Data Communication, and Digital Signal Processing.

He has authored a book titled "Fundamentals of Communication" and more than 10 international journal and conference papers. At present he is serving as a Lecturer in the Department of Physics, Jahangirnagar University, Dhaka, Bangladesh.
Bangladesh.



Recurrent Neural Network Classifier for Three Layer Conceptual Network and Performance Evaluation

Md. Khalilur Rhaman
Department of Computer Science and Engineering, BRAC University, 66 Mohakhali, Dhaka-1212, Bangladesh.
Email: khalilur@bracu.ac.bd

Tsutomu Endo
Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan.
Email: endo@pluto.ai.kyutech.ac.jp

Abstract— Natural language has traditionally been handled using symbolic computation and recursive processes. Classification of natural language using neural networks is a hard problem. In the past few years several recurrent neural network (RNN) architectures have emerged which have been used for several smaller natural language problems. In this paper, we adopt an Elman RNN classifier for disease classification in a doctor-patient dialog system. We find that the Elman RNN is able to find a representation for natural language. Contextual analysis in dialog is also a major problem. A three-layer memory structure was adopted to address this challenge, which we refer to as the "Three Layer Conceptual Network" (TLCN). This highly efficient network simulates the human brain through discourse information. An extended case structure framework is used to represent the knowledge. We used the same case frame structure to train and examine the RNN classifier. The system prototype is based on doctor-patient dialogs. The overall system performance achieved 84% accuracy. Disease identification accuracy depends on the number of diseases and the number of utterances. The performance evaluation is also discussed in this paper.

Index Terms— Three Layer Conceptual Network, Knowledge Representation, Recurrent Neural Network.

I. INTRODUCTION

In this paper we present a neural network classifier to classify the diseases for our system prototype. ELMAN [1] [2], POLLACK [3] and LAWRENCE [4] were considered for this application. In this initiative we introduced a neural network into the Three Layer Conceptual Network (TLCN). To implement this novelty, a number of studies [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] which integrate RNNs with NLP were examined. The TLCN architecture was developed by RHAMAN [16] from the memory model idea of NOMURA [17]. NOMURA's model was a framework for memory management, but RHAMAN developed an architecture to address a real-time dialog system. A discourse algorithm, related to [18], [19], [20], was introduced to represent discourse information. It was a model which could simulate human brain memory. The human brain consists of three types of memories: long-term, short-term and mid-term, which were studied in [21], [22] and [23]. In our system the episodic memory module simulates short-term memory, the discourse memory module simulates mid-term memory and the ground memory module simulates long-term memory. In the case of disease identification, however, the system performance was not satisfactory, so we use an RNN classifier and obtain better results. We used the traditional modules for input and response generation but developed an extended case frame model of BRUCE [24] and SHIMAZU & NOMURA [25] for handling the knowledge database. The case frame network is very useful for unstructured and ambiguous languages like Japanese. However, most systems deal with rather simple sentences, in which the analysis is a relatively easy task. Our extension of the case frame architecture reduced the complexity using formalism. We use the same structure for the RNN classifier.

II. MOTIVATION

Patients often come back to the doctor repeatedly with the same symptoms and expect the doctor to fix them; in other cases many patients come to the doctor with the same symptoms. In the 1980s and 1990s, a number of medical expert systems were designed that use medical knowledge to diagnose diseases. Medical expert systems have evolved to provide physicians with both structured questions and structured responses within medical domains of specialized knowledge or experience. Most of these systems [26], [27], [15] and [28] were doctor-oriented: only medical experts could enter the predefined data set to find a solution. Our proposed system is very different in this respect. It is designed to communicate with the patient directly, and patients can communicate with the system in their natural language. Natural language has traditionally been handled using symbolic computation and recursive processes. The most successful stochastic models have been based on finite-state descriptions. However, finite-state models cannot represent the hierarchical structures found in natural language. In the past few decades most NLP systems [29], [30], [31], [32], [33], [34], [35] and [36] were developed based on syntactic phrase structure. These systems work well for structured languages but are not good at encoding linguistic information of unstructured languages such as Japanese. Our motivation was therefore to implement a system that can communicate directly with patients and can also handle sequences of sentences and complex sentences.

III. OVERVIEW OF TLCN

The Three Layer Conceptual Network model is a framework for using knowledge in a way that simulates the human brain, with discourse information playing the vital role in this simulation. It consists of three layered memory modules: 1) the Ground Memory Module (GMM), 2) the Discourse Memory Module (DMM) and 3) the Episodic Memory Module (EMM). In our proposed model, knowledge and its relationships are accommodated in the GMM. The DMM represents the most basic, useful and recently used discourse knowledge. The EMM is the user interface; it contains the linguistic and non-linguistic knowledge needed to understand input dialogs and to generate response dialogs. A memory module, however, is not simply an accumulation of the contents of other memories; it is a structured memory coherently organized by merging old and new knowledge. This process relates to what is called knowledge learning. Each memory module consists of a knowledge database and its manager. Fig. 1 depicts the process flow of TLCN.

A. Architecture

Figure 1. Processing of TLCN

The TLCN Dialog Processor (DP) architecture is illustrated in Fig. 2. This architecture provides an efficient integration of the component modules. Text input utterances are sent to the dialog understanding unit of the EMM. To understand the input dialog, the entire linguistic knowledge is assimilated into episodic memory from ground memory in advance. After normalizing the input text, the EMM generates case frames that are sent to the classifier unit, where input case frames are classified into predefined classes. The DMM manager receives these classified case frames from the EMM manager and searches the discourse memory for matching diseases. The dialog manager (DM) is notified whether or not any disease is identified. If the information does not match sufficiently in discourse memory, the DM sends the input case frames to the GMM manager. The GMM manager performs the same task as the DMM manager and assimilates the matching diseases into discourse memory. If the input case frames do not match any of the diseases, the DM generates a "not identified" response. If they match some diseases but not well enough to satisfy the identification condition, it generates case structures of a question response for the dialog generator, which then asks the patient a natural language question. If the matching result satisfies the identification condition, it generates an identification response including advice case frames, carefulness case frames and treatment case frames. The dialog generator then produces natural language text dialog from these case frames.

Figure 2. TLCN Dialog Processing System

B. Ground Memory Module (GMM)

The ground memory of the GMM contains all kinds of knowledge and is the main data source of the system. It stores linguistic knowledge such as the dictionary and grammar, and non-linguistic knowledge including diseases, symptoms, causes, treatments, effects, etc. Procedural knowledge, which concerns inferring a fact from a collection of facts, is also described here. We considered [37], [38], [39], [40] and [41] when implementing ground memory. It also includes discourse information such as the most common diseases, seasonal diseases, previous records of a patient and recently processed disease information. In ground memory, knowledge is represented in a case frame structure. Fig. 3 shows the hierarchy of the case frame representation, where nodes represent case structures and edges represent their relationships. In this figure the X and Y axes represent the case frames and the Z axis denotes the level of their relationships. More than one case frame of a previous level can be connected with one or more case frames of the next level, and vice versa. In the following subsections we discuss the details of knowledge representation in ground memory.



Figure 3. Case structures and their relationships

The knowledge is classified into predefined classes; for this system prototype the nodes are classified by disease. A class can be a property of another class. The GMM manager is an algorithm that searches for disease information in ground memory when requested by the dialog manager, and also finds the discourse knowledge requested by the discourse memory module. Ground memory always updates itself after every conversation and never deletes anything.

1) Case frame Model: The extended case frame model we use is an understanding model consisting of predicates, semantic case relations (roles), modalities and conjunctive relations, as previously proposed by SHIMAZU [25]. The model of BRUCE [24] was also studied in designing our extended model.

sentence = simple sentence | sentence + conjunctive relation + sentence
simple sentence = predicate + case relations + modalities
case relations = object | method | direction | time and space | supplement | modification
modalities = tense | aspect | manner | intention | guess | attitude | negation | ability | necessity | ...
conjunctive relation = time | cause | reason | result | contrast | goal | assumption | ...
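As an illustration only, the grammar above can be mirrored by a simple nested data structure. The following is a minimal sketch in Python; the class and field names are our own choices and are not part of the original model.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical containers mirroring the case frame grammar above.
@dataclass
class SimpleSentence:
    predicate: str                                        # e.g. "have"
    case_relations: dict = field(default_factory=dict)    # object, method, direction, ...
    modalities: dict = field(default_factory=dict)        # tense, aspect, negation, ...

@dataclass
class Sentence:
    # Either a single simple sentence, or two sentences joined by a conjunctive relation.
    simple: Optional[SimpleSentence] = None
    left: Optional["Sentence"] = None
    conjunctive_relation: Optional[str] = None             # time, cause, reason, ...
    right: Optional["Sentence"] = None

# Example: "I have a fever because I got wet in the rain."
fever = SimpleSentence("have", {"object": "fever"}, {"tense": "present"})
wet = SimpleSentence("get wet", {"supplement": "in the rain"}, {"tense": "past"})
utterance = Sentence(left=Sentence(simple=fever),
                     conjunctive_relation="cause",
                     right=Sentence(simple=wet))
```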

1) Case Relations: One characteristic of this model is its relatively large collection of case relations. The collection includes: 1) basic cases as in Fillmore's case system, 2) case relations such as extent, manner and degree, which appear as adverbial phrases, and 3) additional cases represented by the inflection of certain verbs.
2) Modalities: Semantic structures must include information on modality such as tense, aspect, intention, manner, attitude and assumption. Real sentences convey much of this type of information; unfortunately, most understanding systems and linguistic theories have not dealt with modalities sufficiently.
3) Conjunctive Relations: A sentence generally consists of several sub-sentences, each of which represents a unit event. In the model, event relations are expressed by conjunctive relations such as time, cause, reason, result, goal, assumption, contrast and circumstance. These are essential for understanding a sentence and must be organized into the semantic structures.

2) Extended Case Frame Representation: An extension of the case structure model is proposed as a linguistic model for representing the meaning structure of text; the case frames thus act as the representation scheme for ground memory. The traditional case structure describes a unit sentence and consists mainly of relations between nouns and a verb. This is not sufficient to represent the structures of real sentences, which sometimes contain complex noun phrases and compound sentences. Our proposed case structure therefore also needs facilities for representing structures involving relations between two nouns through a verb or a preposition. It has been designed to integrate those structures into one linguistic model. Its nature is hierarchical with respect to the way constituents are connected, iterative with respect to conjunction, and recursive with respect to embedding. Using this formalism, the syntactic and semantic structures of sentences can be represented uniformly.

In our proposed structure, every noun has six properties: adjective, delimiter, preposition, auxiliary verb, adverb and verb. Nouns can be related by either a verb or a preposition. This case frame representation makes the matching algorithm simple and efficient because the same structure can represent both case frames and their relationships. The basic structure of a case frame is depicted in Fig. 4.

Figure 4. Case frame structure

Fig. 5 shows the representation of a case frame. One case structure node can have more than one connection node, and one connection node can have more than one case structure node. There is no restriction on the sequence of case structure and connection nodes used to represent a complex or compound piece of knowledge.

Figure 5. Relation of case structures

Fig. 6 shows an example of knowledge representation in ground memory, where the disease is considered the head class and the others are considered daughter classes. It also illustrates how knowledge is represented in a real database. Here the example disease is cholera; symptom, cause and the other classes are sub-nodes of cholera. The sub-nodes of symptom and cause represent the case frames for the natural language texts written in the box at the bottom right of the figure.

Figure 6. Knowledge representation

3) Training the Ground Memory: The TLCN-DP system is based on doctor-patient dialog conversations. The system was first trained under the supervision of a professional doctor. During training, the trainer carefully considered the discourse information and the level of the nouns. The statistics of the training corpus are given in Table I. The information about these diseases was collected from medical web links [42], [43] and [44]. The corpus of RHAMAN [45] was also used to train the system.

TABLE I. TRAINING CORPUS STATISTICS

Diseases: 30
Sentences for each disease: 95
Normalized sentences for each disease: 220
Vocabulary size: 2500

C. Discourse Memory Module (DMM)

Discourse memory contains discourse information that provides situational and contextual information for the utterance environment. It contains nodes with different frequencies, where the frequency is calculated from how often the knowledge is used. In our proposed system we use three frequency levels: medium, high and very high. We settled on three levels after experimenting with other numbers of levels and found better responses with three. The very high frequency level is assigned to the most used case frames, recently used classes and seasonal disease case frames. High frequency nodes are determined by the basic information of the patient and his or her past medical record. Medium frequency nodes are determined by moderately used case frames. The knowledge in discourse memory and its frequencies differ from patient to patient.

1) Discourse Information: Some discourse information is predefined and some depends on real-time information. After obtaining the basic information about the patient and the environment, the system decides the discourse knowledge by considering the following criteria. The discourse frequency is always predefined by the trainer of the system.
1) Most common diseases: in every area or age group there are some common diseases. This information is predefined by the trainer.
2) Seasonal disease information: there are some common diseases for each season. This discourse information is identified from the current date and the predefined seasonal information.
3) Recently treated disease information: the diseases and case frames handled most by the current system in recent times. Their discourse frequency is determined by how often the disease has been handled.
4) Previous record of the current patient: if the patient's basic information recurs, the case frames of all previous records are considered as discourse. All previous dialog conversations are uploaded with very high discourse frequency.
If a disease matches one of the above criteria, its frequency is medium. If it matches two criteria, its frequency is high. If it matches three criteria, its discourse frequency is very high.
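A minimal sketch of this frequency rule is given below; the function name and the "none" level are illustrative additions, not taken from the original system.

```python
# Map the number of matched discourse criteria to a frequency level,
# following the rule described above (1 -> medium, 2 -> high, 3+ -> very high).
def discourse_frequency(matched_criteria: int) -> str:
    if matched_criteria >= 3:
        return "very high"
    if matched_criteria == 2:
        return "high"
    if matched_criteria == 1:
        return "medium"
    return "none"   # assumed label for a disease that matches no criterion

# Example: a disease that is both seasonal and in the patient's previous record.
print(discourse_frequency(2))   # -> "high"
```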

2) Dialog Matching: First the DMM manager compares the patient's symptoms and causes with all the symptoms and causes in discourse memory. If a symptom does not match, it searches ground memory with the help of the GMM manager. All matching diseases are selected as candidates and their case frames are sent to the dialog manager. The DM calculates the highest probability among these diseases using a probability function, which is discussed in the dialog manager section. Depending on the probabilities, the DM requests responses from the DMM.

3) Response Generator: In the TLCN-DP architecture we have incorporated a response module to help patients reach the goal disease. The dialog manager analyzes the result and selects the type of system response, and the DMM manager provides case frames for the responses to the language generator. The system responses are of three types: 1) response before disease identification, 2) response after disease identification and 3) failure to identify.
1) Response before disease identification: this type of response first considers the symptoms and causes of the two diseases with the highest probability among the selected candidates, and finds the dissimilarities in symptoms and causes between them. The response module then generates case frames for these. After 10 conversations it selects case frames for pharmaceutical tests (if any test exists for the selected diseases).
2) Response after disease identification: once the disease has been identified, the dialog manager provides the language generator with case frames for treatment, including medicines, effects and prevention for the specific disease. It can also provide further case frames with details of the disease on the patient's request.
3) Failure to identify: if the dialog manager does not obtain any information about any disease within 3 conversations and does not match more than 75% of the important symptoms and causes, it generates a failure notification.

D. Episodic Memory Module (EMM)

The EMM receives input from the user and provides output to the user. The input processor receives the input text dialog and extracts its meaning. Episodic memory stores the meaning of the ongoing segment of utterances. As understanding proceeds, the essence of episodic memory is assimilated into discourse memory and the essence of discourse memory is assimilated into ground memory. In the same way, the output processor receives case frames from the DMM manager and generates the appropriate natural language text for the user. To understand the dialog, the input processor uses modules such as a text normalization unit, a case structure converter and a semantic classifier. Episodic memory contains all the linguistic information and the recent conversation with the current user. The EMM manager combines the entire previous dialog and responses with the case structure of the new dialog, classifies it into the predefined classes and sends these classified case structures to the discourse memory module. Episodic memory is temporary; it is updated only for the requirements of the ongoing process, and it is refreshed after every conversation with a user.

1) Components: The EMM consists of a language understanding module and a language generation module. The understanding module normalizes the input text and converts it to a language-neutral meaning representation, a case frame. Finally it classifies the case frames and sends them to the DMM. After getting the response from the DMM, the language generation module produces the response in text form. In this section we discuss four components: 1) text normalization, 2) case frame representation, 3) classifier and 4) language generator.
1) Text normalization: this is an essential step for minimizing "noise" variations among words and utterances. The text normalization component is essentially based on synonyms and other forms of syntactic normalization. The main normalization factors include stemming using a synonym dictionary and removal of confusions and of non-alphanumeric, non-whitespace characters (a small illustrative sketch is given after this list).
2) Case frame representation: this component generates the case frames using the same case frame model discussed in the GMM section.
3) Classifier: in TLCN-DP the nodes are classified into 9 properties: 1) Disease, 2) Symptom, 3) Environment Cause, 4) Physical Cause, 5) Other Cause, 6) Tests, 7) Treatment & Medicine, 8) Future Effect and 9) Prevention.
Disease is the head node, and all the other nodes are related to a disease. Fig. 7 shows how symptom and cause are used to find the goal disease; pharmaceutical tests then confirm the disease (if any tests exist). After that, the TLCN-DP can provide detailed information to the patient, including treatment, medicine, effect, carefulness and prevention.

Figure 7. Relationship of knowledge

If any utterance in the patient's dialog contains data related to the body, it is classified into the symptom class. If an utterance contains data about treatment or medicine, it is classified into the treatment class. If an utterance contains data about a test, it is classified into the test class. Otherwise it is classified as cause oriented. The classifier sends the classified utterances to the DMM manager.
4) Language Generator: it organizes sentences from the response case frames by using the parts of speech and following grammatical rules, and generates the natural language output text to the user.
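As referenced in item 1 above, the following is a minimal, illustrative sketch of such a normalization step; the synonym dictionary and the exact rules are placeholders, not the ones used in TLCN-DP.

```python
import re

# Hypothetical synonym dictionary; the real system uses a much larger one.
SYNONYMS = {"tummy": "stomach", "throwing up": "vomiting", "temp": "fever"}

def normalize(utterance: str) -> str:
    text = utterance.lower()
    # Replace known synonyms with a canonical form.
    for variant, canonical in SYNONYMS.items():
        text = text.replace(variant, canonical)
    # Drop characters that are neither alphanumeric nor whitespace (punctuation, symbols).
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse repeated whitespace introduced by the removals.
    return re.sub(r"\s+", " ", text).strip()

print(normalize("I have a high temp & my tummy hurts!!"))
# -> "i have a high fever my stomach hurts"
```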

IV. NEURAL NETWORK

Natural language has traditionally been handled using symbolic computation and recursive processes. The most successful stochastic language models have been based on finite-state descriptions such as n-grams or hidden Markov models. However, finite-state models cannot represent the hierarchical structures found in natural language. In the past few years several RNN architectures have emerged and have been used for grammatical inference, as well as for several smaller natural language problems. Neural network models have been shown to account for a variety of phenomena in phonology, morphology and role assignment. It has been shown that RNNs have the representational power required for hierarchical solutions, and that they are Turing equivalent. In this paper Elman RNNs are investigated for classifying diseases.

A. Recurrent Neural Network

A recurrent neural network is a class of neural network in which connections between units form a directed cycle. This creates an internal state that allows the network to exhibit dynamic temporal behavior. RNNs have feedback connections and address the temporal relationships among inputs by maintaining internal states that act as memory. An RNN is a network with one or more feedback connections, where a feedback connection passes the output of a neuron in a given layer back to the previous layer(s). The difference from an ordinary feedforward network is that in an RNN these feedback connections can link all neurons (full connectivity). The connections therefore allow the network to show dynamic behavior.

B. Elman Recurrent Neural Network

An Elman recurrent neural network is a recurrent network with feedback from each hidden node to all hidden nodes. When training the Elman network, backpropagation through time is used rather than the truncated version used by Elman; that is, in this paper "Elman network" refers to the architecture used by Elman but not to his training algorithm. Backpropagation through time is used to train the recurrent networks.
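For illustration, a minimal Elman-style forward pass is sketched below in Python/NumPy. The layer sizes and the logistic activation follow the description in this paper, but the code is our own simplification and omits training (backpropagation through time).

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

class ElmanRNN:
    """Sketch of an Elman network: the hidden state feeds back into every hidden node."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_hidden, n_in))       # input -> hidden
        self.W_rec = rng.uniform(-0.5, 0.5, (n_hidden, n_hidden))  # hidden -> hidden (recurrent)
        self.W_out = rng.uniform(-0.5, 0.5, (n_out, n_hidden))     # hidden -> output
        self.b_h = np.zeros(n_hidden)
        self.b_o = np.zeros(n_out)

    def forward(self, sequence):
        h = np.zeros(self.W_rec.shape[0])            # initial context
        for x in sequence:                            # one input vector per time step
            h = logistic(self.W_in @ x + self.W_rec @ h + self.b_h)
        return logistic(self.W_out @ h + self.b_o)    # output after the last step

# Example: 4 inputs and 7 hidden nodes, as in this paper; one output unit per network.
net = ElmanRNN(n_in=4, n_hidden=7, n_out=1)
print(net.forward([np.random.rand(4) for _ in range(3)]))
```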

V. CLASSIFIER

The recurrent neural network classifier is used to estimate the probability of each disease. The inputs of this classifier are the indexes of the classified case structures. As mentioned before, each disease is a combination of some specific classes (symptom, cause, etc.), each class is a combination of case frames and each case frame is a combination of case structures. All of these elements are indexed in advance.

A. Architecture

In our proposal we use an extended Elman RNN for classification. According to the experiments of Pollack [3] and Lawrence, Giles and Fong [4], an Elman RNN classifier provides good performance for natural language processing. Our classifier uses four word inputs and 7 hidden nodes; we experimented with varying the number of hidden nodes and found the best trade-off between time and accuracy at 7. The quadratic cost function and the learning rate schedule are described below, with an initial learning rate of 0.2 and a random weight initialization strategy. The recurrent network used for this application is depicted in Fig. 8, where S indicates Symptom, EC Environment Cause, PC Physical Cause, OC Other Cause, T Tests, H Hidden node, D Disease and NI Not Identified.

Figure 8. Disease Classifier

1) Target Output: Target outputs were 0.1 and 0.9 with the logistic activation function. This helps avoid saturating the sigmoid function.
2) Weight Initialization: Random weights are initialized so that the sigmoids do not start out in saturation but are also not very small (which would correspond to a flat part of the error surface). In addition, several (20) sets of random weights are tested and the set that gives the best performance on the training data is chosen. In our experiments on the current problem these techniques did not make a significant difference.
3) Learning Rate Schedule: We used the learning rate schedule of LAWRENCE [4], first proposed by Darken and Moody [46].
4) Activation Function: Symmetric sigmoid functions often improve convergence over the standard logistic function. For our particular problem the difference was minor, and the logistic function gave better performance.

5) Cost Function: The relative entropy cost function has received particular attention and has a natural interpretation in terms of learning probabilities. We investigated both the quadratic and the relative entropy cost functions:

E = \frac{1}{2}\sum_{k}(y_k - d_k)^2   (1)

E = \sum_{k}\left[\frac{1}{2}(1+y_k)\log\frac{1+y_k}{1+d_k} + \frac{1}{2}(1-y_k)\log\frac{1-y_k}{1-d_k}\right]   (2)

where y and d correspond to the actual and desired output values and k ranges over the outputs. We found the quadratic cost function to provide better performance. A possible reason for this is that the entropy cost function leads to an increased variance of weight updates and therefore decreased robustness in parameter updating. (A small code sketch of both cost functions follows this list.)

6) Sectioning of the Training Data: We investigated dividing the training data into subsets. Initially only one of these subsets was used for training. After 100% correct classification was obtained or a prespecified time limit expired, an additional subset was added to the working set, and this continued until the working set contained the entire training set. The data was ordered by sentence length with the shortest sentences first, which enabled the networks to focus on the simpler data first. Elman suggests that the initial training constrains later training in a useful way.
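As referenced in item 5 above, the following is a minimal NumPy sketch of the two cost functions in equations (1) and (2); the function names are ours, and the outputs are assumed to lie strictly between 0 and 1 so the logarithms are defined.

```python
import numpy as np

def quadratic_cost(y, d):
    # Equation (1): E = 1/2 * sum_k (y_k - d_k)^2
    y, d = np.asarray(y, float), np.asarray(d, float)
    return 0.5 * np.sum((y - d) ** 2)

def relative_entropy_cost(y, d):
    # Equation (2): E = sum_k [ 1/2 (1+y_k) log((1+y_k)/(1+d_k))
    #                         + 1/2 (1-y_k) log((1-y_k)/(1-d_k)) ]
    y, d = np.asarray(y, float), np.asarray(d, float)
    return np.sum(0.5 * (1 + y) * np.log((1 + y) / (1 + d))
                  + 0.5 * (1 - y) * np.log((1 - y) / (1 - d)))

# Example with the 0.1 / 0.9 targets used in this paper.
y_actual = np.array([0.80, 0.15])
d_target = np.array([0.90, 0.10])
print(quadratic_cost(y_actual, d_target), relative_entropy_cost(y_actual, d_target))
```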

B. Training and Evaluation of the RNN

For each disease we used a separate RNN. Each network is trained with a positive index set and an equal number of randomly chosen negative index sets. The positive index set denotes the case frames of the network's own disease, while the negative index sets denote case frames that do not belong to that disease. These case frames were discussed in the ground memory module section. We were able to train an RNN to 100% correct classification on the training data. Generalization on disease identification resulted in 89% correct classification on average, which is better than the performance reported in other studies. A possible reason is the similarity between the input dialogs and the trained dialogs, that is, between the index set of the input dialogs and the index set of the disease information. The output of this classifier ranges from 0.1 to 0.9.
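A minimal sketch of this one-network-per-disease training setup is given below; the sampling strategy, the 0.1/0.9 targets and the function name are illustrative assumptions, not the exact procedure used here.

```python
import random

def make_training_set(disease, indexed_case_frames, negatives_per_positive=1, seed=0):
    """Pair each positive index (case frames of this disease) with an equal number
    of randomly drawn negative indexes (case frames of all other diseases)."""
    rng = random.Random(seed)
    positives = [(idx, 0.9) for idx in indexed_case_frames[disease]]           # target 0.9
    other = [idx for d, idxs in indexed_case_frames.items() if d != disease for idx in idxs]
    negatives = [(idx, 0.1) for idx in rng.sample(other, negatives_per_positive * len(positives))]
    return positives + negatives                                                # target 0.1

frames = {"cholera": [3, 7, 11], "typhoid": [2, 5, 9], "influenza": [4, 8, 12]}
print(make_training_set("cholera", frames))
```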

C. Simulation Details

The network contained three layers including the input layer. The hidden layer contained 7 nodes, and each hidden layer node had a recurrent connection to all other hidden layer nodes. All inputs and outputs were within the range zero to one, and bias inputs were used. The best of 50 random weight sets was chosen based on training set performance. Target outputs were 0.1 and 0.9 with the logistic output activation function, and the quadratic cost function was used.

The search-then-converge learning rate schedule used was Eq. (3):

\eta = \frac{\eta_0}{\dfrac{n}{N/2} + \dfrac{c_1}{\max\left(1,\; c_1 - \max\left(0,\; \dfrac{c_1 (n - c_2 N)}{(1 - c_2) N}\right)\right)}}   (3)

where
\eta = learning rate,
\eta_0 = initial learning rate = 0.2,
N = total training epochs,
n = current training epoch,
c_1 = 50, c_2 = 0.65.

The training set consisted of 220 normalized sentences for each disease, and the number of diseases was 30.
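A minimal sketch of this schedule, based on our reconstruction of Eq. (3) above, is shown below; treat the exact form as an assumption where the original typesetting was ambiguous.

```python
def learning_rate(n, N, eta0=0.2, c1=50.0, c2=0.65):
    """Search-then-converge schedule, Eq. (3): eta0 at epoch 0, roughly eta0/(1 + 2n/N)
    during the 'search' phase, then a rapid decay once n exceeds c2*N ('converge')."""
    inner = max(0.0, c1 * (n - c2 * N) / ((1.0 - c2) * N))
    return eta0 / (n / (N / 2.0) + c1 / max(1.0, c1 - inner))

# Example over 100 training epochs.
for epoch in (0, 25, 50, 65, 90, 99):
    print(epoch, round(learning_rate(epoch, 100), 4))
```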

D. Dialog Manager

The dialog manager is the controller of the system. It maintains the data flow between the memory modules and also evaluates the RNN classifier. After classification it selects the disease and the system response.

Pseudo code for disease selection:

1. Select diseases for: patient's symptoms or causes = Disease(n)'s symptoms or causes.
   if not found then go to GMM manager.
   Select diseases for: patient's symptoms or causes = Disease(n)'s symptoms or causes.
   if found then upload the disease and go to DMM.
   else generate "Not found" response.
2. if number of request dialogs < 15 then
   if (MAX(f(input dialog)/f(n))



A. PERFORMANCE EVALUATION

Evaluating dialog system performance is a complex task and depends on the purpose of the desired dialog metric. We trained our system with 30 mostly common diseases from [42], [43] and [44], of which 11 are mainly related to age, 10 are related to season and the remaining 9 are related to food, accidents, habits, health condition and location. Real doctor-patient dialog conversations were collected for this evaluation from various books on medical interviews and from frequently-asked-question medical web links. All of the information and output was verified by a professional doctor. 85 patient dialogs were simulated with this system prototype, where 10 patients were tested 3 times, 15 patients 2 times and 25 patients only once. Table II presents disease identification accuracy against the number of dialog conversations. According to the result analysis, the system achieved a very satisfactory result for 10 to 20 diseases; for 15 diseases we found 79% disease identification accuracy. For fewer than 10 diseases the system can identify the disease with very few dialog conversations, achieving 82% accuracy with only 7 utterances. Before using the RNN we implemented another system that used a comparatively simple probability function, and the RNN-based system gained more than 10% in overall accuracy over it. The results of the previous system are shown in Table III. We did not evaluate more than 20 diseases in the previous system because its performance was already not satisfactory at 20 diseases.

TABLE II. DISEASE IDENTIFICATION EVALUATION

Diseases \ Utterances     5      7      9      11     13     15
10                        57%    82%    83%    83%    84%    84%
15                        35%    60%    78%    78%    79%    79%
20                        19%    41%    59%    66%    67%    68%
25                        14%    37%    51%    53%    55%    56%
30                        10%    23%    33%    42%    48%    48%

TABLE III. PREVIOUS SYSTEM EVALUATION

Diseases \ Utterances     7      9      11     13     15
10                        56%    59%    62%    64%    66%
15                        49%    55%    58%    60%    63%
20                        46%    50%    53%    57%    61%

VII. CONCLUSION

The main underlying strategy is to adopt an Elman RNN within the Three Layer Conceptual Network. The use of the extended case frame, the dialog matching algorithm, the disease identification algorithm and the response generation criteria are also presented. In addition to these fundamental strategies, we discussed the characteristics of the Elman RNN in our system. Finally, we presented a comparative study against our previous system.

There is still much scope to improve the neural network application. We are looking for ways to implement TLCN using more efficient neural networks. By adopting additional modules, it might be possible to train this system on real-time conversations between a doctor and a patient, and in the future we plan to use voice dialog rather than text dialog.

This is only the beginning of combining neural networks with the Three Layer Conceptual Network. Our goal is successful human-computer interaction, and we believe this research is a new kind of handshake between TLCN and neural networks.

ACKNOWLEDGMENT

We thank Hirosato Nomura for numerous discussions concerning this work, Teigo Nakamura and Manuel Medina González for their assistance, and the reviewers for their detailed comments.

REFERENCES

[1] J. L. Elman, "Distributed representations, simple recurrent networks, and grammatical structure," Machine Learning, no. 7, pp. 195-226, 1991.
[2] J. L. Elman, "Finding structure in time," Cognitive Science, vol. 14, pp. 179-211, 1990.
[3] J. B. Pollack, "Recursive distributed representations," Artificial Intelligence, vol. 46, pp. 77-105, 1990.
[4] S. Lawrence, C. L. Giles, and S. Fong, "Natural language grammatical inference with recurrent neural networks," IEEE Transactions on Knowledge and Data Engineering, vol. 12, pp. 126-140, 2000.
[5] M. Towsey, J. Diederich, I. Schellhammer, S. Chalup, and C. Brugman, "Natural language learning by recurrent neural networks: a comparison with probabilistic approaches," pp. 3-10, 1998.
[6] A. Graves, N. Beringer, and J. Schmidhuber, "A comparison between spiking and differentiable recurrent neural networks on spoken digit recognition."
[7] D. Chen, G. Z. Sun, H. H. Chen, and Y. C. Lee, "Extracting and learning an unknown grammar with recurrent neural networks," Advances in Neural Information Processing Systems, 1992.
[8] R. J. Williams and J. Peng, "An efficient gradient-based algorithm for on-line training of recurrent network trajectories," Neural Computation, vol. 2, no. 4, pp. 490-501, 1990.
[9] G. Scheler and T. Munchen, "With raised eyebrows or the eyebrows raised? A neural network approach to grammar checking for definiteness," pp. 160-170, 1996.
[10] C. Fabrizio, F. Paolo, L. Vincenzo, and S. Giovanni, "Towards incremental parsing of natural language using recursive neural networks," 2002.
[11] Y. Li and J. G. Harris, "A spiking recurrent neural network," IEEE Computer Society Annual Symposium on VLSI, p. 321, 2004.
[12] S. C. Kwasny, S. Johnson, and B. L. Kalman, "Recurrent natural language parsing."
[13] R. Miikkulainen, "Natural language processing with subsymbolic neural networks," pp. 120-139, 1997.
[14] M. H. Christiansen and N. Chater, "Natural language recursion and recurrent neural networks," 1994.
[15] S. Wermter and V. Weber, "SCREEN: Learning a flat syntactic and semantic spoken language analysis using artificial neural networks," Journal of Artificial Intelligence Research, vol. 6, pp. 35-85, 1997.
[16] R. M. Khalilur and E. Tsutomu, "Three Layer Conceptual Network dialog processor," in The Twelfth IASTED International Conference on AI and Soft Computing, 2008, pp. 92-97.
[17] H. Nomura, "Modeling and representative framework for linguistic and non-linguistic knowledge in natural language understanding," in Germany-Japan Science Seminar, 1986, pp. 1-10.
[18] M. A. Walker, "Evaluating discourse processing algorithms," in Proceedings of the ACL, 1989.
[19] H. Nomura, "Experimental machine translation systems LUTE," in Second Joint European-Japanese Workshop on Machine Translation, 1985, pp. 1-2.
[20] D. Marcu and A. Echihabi, "An unsupervised approach to recognizing discourse relations," in The 40th Annual Meeting of the Association for Computational Linguistics, July 2002, pp. 368-375.
[21] E. Tulving and F. I. M. Craik, "The Oxford handbook of memory."
[22] T. K. Landauer, "How much do people remember? Some estimates of the quantity of learned information in long-term memory," Cognitive Science, vol. 10, pp. 477-493, 1986.
[23] R. L. Buckner, "Beyond HERA: Contributions of specific prefrontal brain areas to long-term memory retrieval," Psychonomic Bulletin and Review, vol. 3, pp. 149-158, 1996.
[24] B. Bruce, "Case systems for natural language," Artificial Intelligence, vol. 6, no. 4, pp. 327-360, 1975.
[25] A. Shimazu, S. Naito, and H. Nomura, "Japanese language semantic analyzer based on an extended case frame model," in International Joint Conference on Artificial Intelligence, 1983, pp. 717-720.
[26] S. Tsumoto, "Automated extraction of medical expert system rules from clinical databases based on rough set theory," Information Sciences, p. 112, 1998.
[27] J. H. Frenster, "Expert systems and open systems in medical artificial intelligence," Am Assoc Medical Systems and Informatics, vol. 7, pp. 118-120, 1989.
[28] F. Shahbaz, F. Maqbool, S. Razzaq, K. Irfan, and T. Zia, "The role of medical expert systems in Pakistan," World Academy of Science, Engineering and Technology, vol. 27, 2008.
[29] Y.-S. Lee, D. J. Sinder, and C. J. Weinstein, "Interlingua-based English-Korean two-way speech translation of doctor-patient dialogues with CCLINC," Machine Translation, vol. 17, pp. 213-243, 2002.
[30] J. F. Allen, D. K. Byron, M. Dzikovska, G. Ferguson, L. Galescu, and A. Stent, "Conversational human-computer interaction," in IASTED International Conference on AI and Soft Computing, vol. 12, Sept. 2001, pp. 628-802.
[31] M. Kipp, J. Alexandersson, R. Engel, and N. Reithinger, "Dialog processing," in W. Wahlster (ed.), Verbmobil: Foundations of Speech-to-Speech Translation, pp. 452-465, 2000.
[32] E. Tsutomu and K. Tsuneo, "Cooperative understanding of utterances and gestures in a dialogue-based problem solving system," Computational Intelligence, pp. 152-169, 1999.
[33] K. Shimada, Y. Uchida, S. Sato, S. Minewaki, and T. Endo, "Speech understanding using confidence measures and dependency relations," Proc. of PACLING 2005, pp. 278-283, 2005.
[34] N. Okada and T. Endo, "Story generation based on dynamics of the mind," Computational Intelligence, pp. 123-160, 1992.
[35] K. Shimada, K. Iwashita, and T. Endo, "A case study of comparison of several methods for corpus-based speech intention identification," Proceedings of the 10th Conference of the Pacific Association for Computational Linguistics (PACLING 2007), pp. 255-262, 2007.
[36] S. Minewaki, K. Shimada, and T. Endo, "Interpretation of utterances based on relevance theory: Toward the formalization of implicature with the maximum relevance," Proc. of PACLING 2005, pp. 214-222, 2005.
[37] A. Graves, N. Beringer, and J. Schmidhuber, "Grammar learning for spoken language understanding," Automatic Speech Recognition and Understanding, pp. 292-295, 2001.
[38] R. E. Schapire and Y. Singer, "BoosTexter: A boosting-based system for text categorization," Machine Learning, 2000.
[39] N. Chomsky, "Three models for the description of language," IRE Transactions on Information Theory, pp. 113-124, 1956.
[40] N. Chomsky, "On certain formal properties of grammars," Information and Control, pp. 137-167, 1959.
[41] N. Chomsky and M. P. Schützenberger, "The algebraic theory of context-free languages," in Computer Programming and Formal Languages, Amsterdam: North Holland, pp. 118-161, 1963.
[42] http://www.netdoctor.co.uk/, weblink, online medical solution.
[43] http://www.doctoronline.nhs.uk/ver02/index.asp, weblink, online doctor.
[44] http://www.nzdoctor.co.nz/default.aspx, weblink, online medical solution.
[45] R. M. Khalilur and E. Tsutomu, "Recurrent neural network classifier for three layer conceptual network and performance evaluation," in 11th International Conference on Computer and Information Technology, 2008, pp. 747-752.
[46] C. Darken and J. Moody, "Note on learning rate schedules for stochastic optimization," in Neural Information Processing Systems, vol. 3, pp. 832-838, 1991.

Md. Khalilur Rhaman received his Dr.Eng. degree from Kyushu Institute of Technology, Japan, in 2009, and B.Sc. and M.Sc. degrees from the Institute of Science and Technology, National University, Bangladesh, in 1997 and 1998, respectively. Currently he is an Assistant Professor in the Department of Computer Science and Engineering, BRAC University, Bangladesh. In 2002 he joined Uttara University, Bangladesh, where he held the position of Lecturer in the Department of Computer Science and Engineering. His research interests include natural language processing, neural networks, image processing and data mining.

Tsutomu Endo received the B.Eng., M.Eng., and Dr.Eng. degrees from Kyushu University in 1972, 1974, and 1979, respectively. Currently he is a Professor in the Department of Artificial Intelligence, Kyushu Institute of Technology. His research interests include natural language understanding, computer vision and multimodal interfaces. He is a member of the IPSJ, JSAI, JSSST, RSJ and ANLP.



An Enhanced Short Text Compression Scheme for Smart Devices

Md. Rafiqul Islam
Computer Science and Engineering Discipline, Khulna University, Khulna, Bangladesh.
Email: dmri1978@yahoo.com

S. A. Ahsan Rajon
Computer Science and Engineering Discipline, Khulna University, Khulna, Bangladesh.
Email: ahsan.rajon@gmail.com

Contact Author: S. A. Ahsan Rajon (E-Mail: ahsan.rajon@gmail.com)

Abstract — Short text compression is a great concern in data engineering and management. The rapid uptake of small devices, especially mobile phones and wireless sensors, has turned short text compression into a demand of the time. In this paper we propose an approach to compressing short English text for smart devices. The prime objective of the proposed technique is to establish a low-complexity lossless compression scheme suitable for smart devices such as cellular phones and PDAs (Personal Digital Assistants), which have small memory and relatively low processing speed. The main target is to compress short messages to an optimal level that requires little space, consumes little time and imposes low overhead. A new static-statistical context model is proposed to obtain the compression. We use character masking with space integration, syllable-based dictionary matching and static coding in hierarchical steps to achieve low-complexity lossless compression of short English text on low-powered electronic devices. We also propose an efficient probabilistic-distribution-based content-ranking scheme for training the statistical model. We analyze the performance of the proposed scheme and of similar existing schemes with respect to compression ratio, computational complexity and compression-decompression time. The analysis shows that the required number of operations for the proposed scheme is less than that of existing systems. The experimental results of the implemented model give better compression for small text files using modest resources. The obtained compression ratio indicates satisfactory performance in terms of the compression parameters, including better compression ratio and lower compression and decompression time with reduced memory requirements and lower complexity. The compression time is also lower because of the computational simplicity. Overall, this computational simplicity makes the compression effective and efficient.

Index Terms — Short Text Compression, Syllable, Statistical Model, Text-ranking, Static Coding, Smart Devices.

I. INTRODUCTION

The twenty-first century is the age of information and communication technology. Science, through its marvelous achievements, has converted this world into an information- and communication-based global village. The prime aspect of present technology is to ensure better communication throughout the world in a more convenient, easy and cost-effective way. Driven by cost, facility and reliability, a new trend of introducing small-sized devices with some computing and communication power has established its place in the arena of research. With the introduction of smart devices, the challenge of putting them to greater and more effective use has arisen. It is now a great concern to embed as many applications as possible within these smart devices, yet it is extremely difficult to provide low-complexity, low-memory versions of some essential applications such as data compression, which generally requires large memory and high processing speed. Mobile communication, a great gift of modern technology that introduced the era of digital communication, suffers from the same limitation. Although short message communication has moved beyond voice communication and established a robust place in digital communication, Short Message Service (SMS) providers (usually telecommunication companies) impose the constraint that each message may contain no more than 160 characters. This constraint is a real limitation for frequent communication using SMS, and compressing the short message is a good policy to overcome it. That is why our aim is to make "short" messages "shorter", expressing "larger" feelings at "smaller" expense.

Here we introduce a scheme for compressing short English text for smart devices such as cellular phones and wireless sensors, which have small memory and relatively low processing speed and communicate over lower bandwidth, i.e. channel capacity. We employ a new statistical model with a novel approach of integrating a text ranking, or component categorization, scheme for building the model. Modified syllable-based dictionary matching and static coding are used to obtain the compression. Moreover, we employ a new theoretical concept for choosing the multi-grams, which allows us to obtain a mentionable compression ratio with fewer knowledge-base entries than other methods, consuming fewer resources. Besides experimental results, we provide a comprehensive theoretical analysis of the compression ratio of the proposed scheme against similar existing schemes.

II. RELATED LITERATURE

Although a good deal of research has been performed on large-scale data compression, in the specific field of short text compression the amount of available research is small. The business interests of mobile phone service providers may be one reason behind the scarcity of research material. The following sections give a glimpse of the most recent research developments on short text compression for small devices.

A. Compression of Short Text on Embedded Systems

The recent work on short text compression titled "Compression of Short Text on Embedded Systems" by Rein et al. [1] proposes a low-complexity version of PPM (Prediction by Partial Matching). A hash table technique with a one-at-a-time hash function is employed to design the data structure for the data and context model. They use statistical models with hash table lengths of 16384, 32768, 65536 and 131072 elements, requiring two bytes per element, which results in allocations of 32, 64, 128 and 256 Kbytes of RAM respectively. If this memory requirement could be substantially decreased, more efficient compression could be achieved and the scheme could be made usable even on very low-end cellular phones. Other related approaches by Rein et al., "Low Complexity Compression of Short Messages" [2] and the low-complexity, power-efficient text compressor for cellular and sensor networks of [3], are variations of [1].

B. Compression of Small Text Files Using Syllables

"Compression of Small Text Files Using Syllables", proposed by Lansky et al. [4, 5], concerns compressing small text files using syllables. To implement their concept they created a database of frequent syllables, where the condition for adding a syllable to the database is that its frequency is greater than 1:65000. In this scheme the primary knowledge base initially contains more than 4000 entries. For low-memory devices it is obviously difficult to afford this amount of storage, as well as to provide a well-suited searching mechanism; this leads our proposed scheme to keep the knowledge base as small as possible and hence to reduce the scope for loosely chosen syllables or n-grams. Moreover, in forming the syllables, spaces are given no special consideration. Since it is a common assumption that at least 20% of the characters in any text document may be spaces, it may be a good idea to give specific consideration to syllables involving spaces. In [4, 5] all the training syllable entries are stored without any categorization, which often results in coding redundancy; this can be handled by integrating a text ranking or component categorization scheme with the syllable selection.

C. Modified Greedy Sequential Grammar Transform Based Lossless Data Compression

The model proposed by M. R. Islam et al. [6] exploits the advantages of a greedy sequential grammar transform combined with block sorting to compress data. However, this scheme is highly expensive in terms of memory consumption and thus not suitable for low-memory devices.

D. Two-Level Dictionary Based Compression

Dictionary-based text compression techniques are among the most important and widely adopted data compression schemes. A dictionary-based text preprocessing scheme titled TWRT (Two-level Word Replacing Transformation) has been proposed by P. Skibinski [7]. It uses several dictionaries and divides files into various kinds, which improves compression performance.

TWRT can use up to two dictionaries, which are dynamically selected before the actual preprocessing starts. For some types of data, such as programming languages and references, first-level (small) dictionaries are specified, whereas second-level (large) dictionaries are specific to natural languages (e.g., English, Russian, French). For a given source text, if no appropriate first-level dictionary is found, none is used; selection of the second-level dictionary is analogous. When TWRT has selected only one dictionary (either the first or the second level), it works like WRT (Word Replacing Transformation) [7]. If TWRT has selected both the first- and the second-level dictionaries, the second-level dictionary is appended after the first-level dictionary; that is, the dictionaries are automatically merged. If the same word exists in both dictionaries, the second appearance of the word is ignored to minimize the length of the code-words. Only the names of the dictionaries are written in the output file, so the decoder can use the same combination of dictionaries.

TWRT preprocesses the input file step by step with all dictionaries and finally chooses the smallest output file. This idea, however, is very slow. The authors therefore propose to read only the first f (e.g., 250) words from each of the n dictionaries (small and large) and create one joined dictionary, which is impossible to afford on low-memory devices. If the same word appears in different dictionaries, all occurrences of that word are skipped, which is also an infeasible technique for smart-device, platform-aware compression. The main problem of TWRT is the selection of the dictionaries before preprocessing, which increases the processing time on the devices concerned [3]. Moreover, the dictionary is far too large to be used on small-memory devices. For small text compression it is also neither feasible nor particularly efficient to maintain field-specific dictionaries. Above all, since the code words are determined on the fly, it is very doubtful that the scheme can cope with the memory, energy and time constraints of low-memory devices.

E. Other Schemes

H. Kruse and A. Mukherjee [8, 9] proposed a dictionary-based compression scheme named star encoding. In this scheme, words are replaced with a sequence of '*' symbols accompanied by a reference to an external dictionary. The dictionary is arranged according to word length and is known to both sender and receiver; the proper sub-dictionary is selected by the length of the sequence of '*' symbols. Length Index Preserving Transformation (LIPT) is a variation of star encoding by the same authors; this algorithm improves PPM-, BWCA- and LZ-based compression schemes [9]. Another related approach, StarNT, works with a ternary search tree and is faster than the previous ones. The first 312 words of its dictionary are the most frequently used words of the English language; the remaining part of the dictionary is filled with words sorted first by length and then by frequency. This scheme also does not make use of a substring weighting approach. Moreover, it requires that the dictionary be transmitted first in order to set up the knowledge-base, which makes it completely infeasible for compressing small texts on low-powered, small-memory devices.

Prediction by partial matching (PPM) is a major lossless data compression scheme in which each symbol is coded by taking account of the previous symbols [9]. A context model is employed that gives statistical information about a symbol together with its context, and specific symbols are used by the encoder to signal the decoder about the context. The model order in PPM is a vital parameter for compression performance. However, PPM is computationally more complex and its overhead is greater [1, 9, 12].
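To make the context-model idea concrete, the following minimal sketch (our own illustration, not code from [1] or [9]) simply counts, for each context of the preceding k characters, how often every next character occurs; a real PPM coder would add escape handling and an arithmetic coder on top of such counts.

import java.util.HashMap;
import java.util.Map;

// Minimal order-k context statistics, illustrating the idea behind PPM.
// This is only a sketch: a full PPM coder adds escape symbols and an
// arithmetic coder on top of these counts.
public class ContextModel {
    private final int order;
    private final Map<String, Map<Character, Integer>> counts = new HashMap<>();

    public ContextModel(int order) { this.order = order; }

    public void train(String text) {
        for (int i = order; i < text.length(); i++) {
            String ctx = text.substring(i - order, i);   // previous 'order' symbols
            char next = text.charAt(i);
            counts.computeIfAbsent(ctx, c -> new HashMap<>())
                  .merge(next, 1, Integer::sum);
        }
    }

    // Frequency of 'symbol' in context 'ctx' (0 if never seen).
    public int frequency(String ctx, char symbol) {
        return counts.getOrDefault(ctx, Map.of()).getOrDefault(symbol, 0);
    }
}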

In [1] compression starts for text sequences larger than 75 bytes, and in [10] the starting point is 50 bytes. If this lower threshold could be pushed below ten characters, the compressor would truly support "very small text files", setting a new milestone in very small text file compression and ensuring that "short text gets shortest". Our prime aim is to design such an effective and efficient very short text compression scheme.

III. SHORT TEXT COMPRESSION FOR SMART DEVICES

The prime concern of this paper is to implement lossless compression of short English text for smart devices in a low-complexity scheme. The idea is to enable more efficient communication by using the minimum required bandwidth of wireless and sensor networks for small and embedded devices. More precisely, the main concern is to develop a low-complexity compression technique suitable for low-power, small-memory smart devices, especially wireless sensors and mobile phones. The proposed scheme consists of two parts: the first trains the statistical model and the second provides the compression-decompression technique. In the analysis step we count the frequency of the representable ASCII characters. The proposed scheme then proceeds by identifying syllables of length two, three, four and five. We consider the space character as a distinct vowel and include it while counting the frequency of syllables, that is, we search for a space at either the beginning or the end of a syllable. In the step of boosting the statistical model, the substrings of length two to five that were not captured in the syllable phase are considered and the frequency of each is counted. In the second step, we apply the proposed text-ranking scheme to each entry and calculate its entry index; entries with the same index are simply sorted. In the phase of choosing entries, emphasis is given to probability-distribution-based text ranking, which is computed with the help of each entry's neighbor characteristics.

Since frequent matches of substrings longer than four characters are unusual, a maximum of five levels (the extra one for the space character) has been considered to train the statistical model. In each level, multiple entries with the same weightage are simply sorted. The resulting entries are assigned non-conflicting binary streams. When any text is to be compressed, the input is successively compared against the statistical model; on a match the corresponding binary stream is emitted as output, and on a mismatch at a level the search is forwarded to the level below it.
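A rough sketch of this training step (our own simplification, not the authors' implementation; the uniform treatment of every substring is an assumption) builds one frequency table per level, i.e. per substring length from one to five, with the space character counted like any other symbol.

import java.util.HashMap;
import java.util.Map;

// Sketch of the training phase: per-level frequency tables for
// substrings of length 1..maxLen, with the space character counted
// like any other symbol (a simplification of the syllable/n-gram
// selection described in the text).
public class LevelTrainer {
    public static Map<Integer, Map<String, Integer>> train(String corpus, int maxLen) {
        Map<Integer, Map<String, Integer>> levels = new HashMap<>();
        for (int len = 1; len <= maxLen; len++) {
            Map<String, Integer> table = new HashMap<>();
            for (int i = 0; i + len <= corpus.length(); i++) {
                table.merge(corpus.substring(i, i + len), 1, Integer::sum);
            }
            levels.put(len, table);
        }
        return levels;
    }
}

The tables produced this way are what the text-ranking step would subsequently order and encode.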

The compression and decompression processes are expected to be performed in the following manner.

A. Compression Process

In the first step, we plan to employ modified multi-gram or syllable matching as proposed by Lansky et al. [4, 5]. A static dictionary-based compression scheme uses approximately the same concept as character masking: it reads the input data and searches for symbols or groups of symbols that have previously been coded or indexed in the dictionary. If a match is found, a pointer or index into the dictionary can be output instead of the code for the symbols. Compression occurs if the pointer or index requires less space (in terms of memory usage) than the corresponding fragment [1, 12, 15]. Although this is the basic idea behind multi-grams, we use it in a slightly modified fashion: the knowledgebase for the multi-grams is constructed with the help of the statistics obtained by analyzing the corpora.
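The kind of lookup this implies can be sketched as follows (a hypothetical illustration; the code table, its bit strings and the longest-match-first strategy are assumptions made for the example, not the authors' exact procedure).

import java.util.Map;

// Hypothetical longest-match encoder over a static knowledgebase.
// 'codes' maps multi-grams (length 1..maxLen) to pre-assigned bit strings;
// single characters are assumed to be present in the table.
public class MultigramEncoder {
    public static String encode(String text, Map<String, String> codes, int maxLen) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int len = Math.min(maxLen, text.length() - i);
            // try the longest multi-gram first, then fall back level by level
            while (len > 1 && !codes.containsKey(text.substring(i, i + len))) {
                len--;
            }
            String gram = text.substring(i, i + len);
            out.append(codes.get(gram));
            i += len;
        }
        return out.toString();
    }
}

For instance, with a hypothetical table mapping "and" to a short bit string, the whole trigram is emitted as one code instead of three per-character codes.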



In the second step of the compression process, static coding is used. The modification here is that, instead of calculating codes on the fly, predefined codes are employed in order to reduce the space and time complexity; the codes for dictionary entries are likewise predefined. The whole message is thus encoded with a comparatively small number of bits, yielding a compressed output. The key point in favor of static coding is that dynamic coding would require computing the symbol frequencies at run time. Moreover, since dynamic codes and their lengths are derived from the symbol frequencies, the codes must be transmitted along with the compressed data. This is a serious limitation of dynamic coding when the compression span is short and the total number of symbols is not large, and compression of short messages suffers in the same way with other semi-dynamic coding schemes. Consequently, we are motivated to adopt a static coding scheme for compressing short messages. For this study we have analyzed the files Bib, Book1, Book2, News, Paper1, Paper2, Paper4 and Paper5 from the Calgary Corpus and the Canterbury Corpus [16]. We have also analyzed 124 collections of small texts for the same purpose. A detailed overview of these texts is presented in [10, 15].

B. Decompression Process

The decompression process is performed through the following steps:

Step 1: Grab the bit representation of the message.
Step 2: Identify the character representation.
Step 3: Display the decoded message.

All symbols are coded in such a fashion that, by looking ahead several symbols (typically the maximum length of a code), each character can be distinguished (a property of static coding). In Step 1, the bit representation of the received message is obtained, which is simply a matter of reading the bitmaps. The second step involves recognizing each separate bit pattern and identifying the character or symbol it denotes; this recognition is based on the fixed encoding table used at encoding time. The final step simply represents, i.e. displays, the characters recovered by decoding the received encoded message.
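A minimal sketch of this look-ahead decoding follows (our own illustration; the fixed table and the maximum code length are assumed inputs). Starting from the longest possible code, the decoder shortens the bit window until it hits an entry of the fixed encoding table.

import java.util.Map;

// Sketch of decoding with a fixed (static) code table: try the longest
// possible code first and shrink the window until a known code matches.
public class StaticDecoder {
    public static String decode(String bits, Map<String, String> table, int maxCodeLen) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < bits.length()) {
            int len = Math.min(maxCodeLen, bits.length() - pos);
            while (len > 0 && !table.containsKey(bits.substring(pos, pos + len))) {
                len--;                       // unsuccessful match: drop one bit and retry
            }
            if (len == 0) {
                throw new IllegalArgumentException("undecodable bit stream at position " + pos);
            }
            out.append(table.get(bits.substring(pos, pos + len)));
            pos += len;
        }
        return out.toString();
    }
}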

IV. THEORETICAL ASPECT OF TRAINING THE PROPOSED STATISTICAL MODEL

The proposed scheme achieves a better compression ratio with relatively low complexity by means of computational simplicity and an effective expert model, which is used to train the statistical context. Firstly, the prime modification is made in the syllable selection step of [4, 5] proposed by Lansky et al. In their papers, syllables were defined only as maximal subsequences of vowels, including the pseudo-vowel 'y'. Our proposal is to treat the space character as a prime syllable. Using the space as a prime syllable may dramatically reduce the total number of characters needed to represent the message through sophisticated encoding. Secondly, [4, 5] do not clearly state the criteria for choosing the training syllables for the model; the only criterion mentioned is "frequentness". We instead employ a new theoretical perspective, a "text ranking or component ranking" method, to choose the syllables for training the proposed model. In the first step we count the frequency of each vowel in the standard Calgary corpus texts. In the second step, we count the number of vowels that have a space at position (i - 1) or (i + 1), the vowel itself being at position i. The reason for counting these with-space vowels is simply to increase the weightage of the corresponding vowels; the entries containing such vowels are also inserted into the knowledgebase as separate entities. In this step we also collect the statistics of consonants in order to build the dictionary or multi-gram entries. For extending the primary knowledgebase, the criterion proposed in [4, 5] was a frequency of 1:65000, which results in a knowledgebase of more than 4000 entries initially. For low-memory devices it is clearly difficult to afford this amount of storage, or to provide a well-suited searching mechanism over it, which leads our proposed scheme to keep the knowledgebase span as short as possible and hence to reduce the scope of loosely chosen multi-grams. In the phase of choosing multi-grams, we put emphasis on the probability distribution, which is computed with the help of each entry's neighbor characteristics.

A new text weighting or component ranking scheme has been employed to select the multi-grams, which allows us to construct the knowledgebase efficiently [11, 15]. The proposed text weighting or component ranking is obtained through the following equation.

Suppose we choose a multi-gram D consisting of the characters C1 C2 C3 … Cn, of length n. Then

∂(D_{C1 C2 C3 … Cn}) = Σ_{i=1}^{n} ( ∂(D_{Ci}) − 1 ) − ∂(D_{C1 C2 C3 … C(n−1)}) + λ(C_{C1 C2 C3 … Cn})   for n > 1,

and ∂(D_{Ci}) = λ(Ci) for n = 1, where λ(Ci) is the frequency of the character Ci, under the assumption that every character of the alphabet occurs in the training data. The value ∂(Dφ) denotes the multi-gram index of the component Dφ. We refer to the multi-gram index obtained from ∂(D_{C1 C2 C3 … Cn}) as the resultant text-weightage for the text C1 C2 C3 … Cn.



For example, suppose we want the text-weightage of the multi-gram "and". The steps are:

Let ∂(a) = x1, ∂(n) = x2, ∂(d) = x3, ∂(an) = x4, and λ(C_and) = frequency of 'and' in the training document = x5.

Therefore, ∂(and) = (∂(a) − 1) + (∂(n) − 1) + (∂(d) − 1) − ∂(an) + λ(C_and) = x1 + x2 + x3 − 3 − x4 + x5.
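The ranking rule can be transcribed almost directly; the sketch below is our own illustration and assumes that all required frequencies λ(.) are available in a map keyed by substring.

import java.util.Map;

// Sketch of the multi-gram index (text-weightage) defined above.
// 'freq' maps a substring to its frequency lambda(.) in the training corpus.
public class MultigramIndex {
    public static long weight(String d, Map<String, Long> freq) {
        if (d.length() == 1) {
            return freq.getOrDefault(d, 0L);                    // base case: single character
        }
        long sum = 0;
        for (int i = 0; i < d.length(); i++) {
            sum += weight(d.substring(i, i + 1), freq) - 1;     // sum of (per-character weights - 1)
        }
        return sum
                - weight(d.substring(0, d.length() - 1), freq)  // minus weight of the (n-1)-prefix
                + freq.getOrDefault(d, 0L);                     // plus frequency of the whole multi-gram
    }
}

Calling weight("and", freq) reproduces the expansion shown above.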

Static coding is defined over a set S = {B1, B2, B3, …, Bn−1, Bn} of binary streams, where each element B1, B2, B3, …, Bn−1, Bn is uniquely identifiable and the elements are not necessarily of equal bit length. Since all elements of the set support our three basic points of concern for any encoding scheme, namely identifiability, uniqueness and variability, we are interested in using static coding. This theoretical aspect lets us avoid updating the model during compression: firstly, updating a large knowledgebase using a very short piece of data is neither computationally affordable nor effective; secondly, a considerable portion of the source code can be saved if the update is avoided, resulting in faster execution [1, 3, 11]; and finally, if the training statistics are not permitted to be updated or expanded, a constant and consistent memory footprint for the overall compression process can be ensured. That is, the knowledgebase cannot grow arbitrarily, so there is no risk of running into an overloaded-memory or out-of-memory problem. As the knowledgebase is never updated, the use of static coding is also a perfect fit.
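One simple way to guarantee that such variable-length binary streams are uniquely identifiable is to make the code set prefix-free; the following check (our own illustration, not part of the proposed scheme) verifies that property for a candidate code set.

import java.util.List;

// Checks that no code word is a proper prefix of another (prefix-free
// property), which is one sufficient condition for unique decodability.
public class PrefixCheck {
    public static boolean isPrefixFree(List<String> codeWords) {
        for (int i = 0; i < codeWords.size(); i++) {
            for (int j = 0; j < codeWords.size(); j++) {
                if (i != j && codeWords.get(j).startsWith(codeWords.get(i))) {
                    return false;   // code i is a prefix of code j
                }
            }
        }
        return true;
    }
}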

V. PERFORMANCE ANALYSIS

Although it is generally expected that compression and decompression times are closely related, the proposed scheme shows a slight exception. The reasons are summarized in the following discussion.

A. Performance Analysis of the Compression Process

Let the total number of training entries of the statistical model be Ng, where Ng is a non-negative integer, and let the maximum level of the statistical model be L. The first level of the statistical model must contain the single characters, of which there are l1 distinct ones. For levels 1, 2, 3, …, n−1, n the total numbers of distinct multi-gram entries are l1, l2, l3, …, ln−1, ln respectively.

When any text is to be compressed, it is hierarchically compared with each level of the statistical model, starting from the highest order. If there is a match, the corresponding static code of the multi-gram entry is assigned to the text; if the multi-gram entry is not found anywhere in the level, the search is forwarded to the next level. This assignment uses efficient searching procedures. Suppose code m is found at the i-th level with offset k, resulting in a search cost of Sm(lj) + km, where km < li and j = L, L−1, …, i−1 with respect to the search space. Here j runs from L to (i−1) instead of i because the code is found somewhere inside the i-th level: the whole element space of the i-th level need not be searched, only an offset of k elements into it, so the levels searched completely are L down to (i−1). For these reasons the total search amounts to the search overhead of (i−1) levels plus an additional search overhead of k elements. Here the term "search overhead" covers the search-space complexity as well as the related computational requirements such as time and power consumption. When the code matches, it is placed in the output stream in its character representation; this step requires padding the bit stream and then converting it into a character stream. Assume the overall conversion for each successful entry incurs an overhead B. Then, for any multi-gram match, the required overhead is:

C_1 = Σ_{j=L}^{i−1} S_1(l_j) + k_1 + B_1

Similarly,

C_2 = Σ_{j=L}^{i−1} S_2(l_j) + k_2 + B_2 ,  and

C_n = Σ_{j=L}^{i−1} S_n(l_j) + k_n + B_n .

In this way, if n multi-grams are identified and encoded, the resulting total number of operations in the compression process is:

T = Σ_{y=1}^{n} C_y = Σ_{y=1}^{n} Σ_{j=L}^{i−1} S_y(l_j) + Σ_{y=1}^{n} k_y + Σ_{y=1}^{n} B_y    (1)

B. Performance Analysis of the Decompression Process

For the decompression process, the text to be decompressed is first converted into a binary stream. If the largest code has length cmax and the smallest code has length cmin, the decompression process starts searching with cmax bits and, for every unsuccessful match, reduces the window by one bit down to cmin bits. It should be mentioned that codes of the same bit length do not necessarily form a single level. So, to reveal the character representation of an entry d, suppose a switch over h levels is required, where cmin ≤ h ≤ cmax and the maximum level is p, with matching offset kd in the corresponding level, and suppose assigning the character representation for each successful match requires an overhead Bd. The overall requirement for comparing through each level setting is then (overhead of searching through the levels + overhead of searching through the offset + overhead of representation). For detecting the first character the overhead is:

E_1 = Σ_{q=p}^{h} S′_1(l′_q) + k′_1 + B′_1

Similarly, for detecting the second character, the level-wise overhead will be:

E_2 = Σ_{q=p}^{h} S′_2(l′_q) + k′_2 + B′_2

Hence,

E_n = Σ_{q=p}^{h} S′_n(l′_q) + k′_n + B′_n    (2)

Here p is the maximum level, S′ is a function that denotes the search overhead of searching the element space given as its parameter, and h is the minimum level. The computation progresses through p, p−1, p−2, …, h+2, h+1, h. The subscript n denotes the level-wise overhead of detecting one character representation with respect to its level.

In order to detect a single multi-gram f, the total search overhead with respect to the search space of the level-wise calculation is

Σ_{q=p}^{h} S′_f(l′_q) + k′_f , because we start at the maximum level p and then proceed downwards towards level h (as explained above). If S′ is the search-overhead function, then searching from level p to h contributes Σ_{q=p}^{h} S′_f(l′_q), where f is the multi-gram being revealed. For the matching level, since only a partial number of elements has to be searched, the offset k′_f denotes that offset.

After checking through the levels, for any unsuccessful match in the level-wise statistics the procedure continues by searching through the bit-wise statistics. If there are a total of u bit-phases, the search runs over the search space from the maximum bit-phase down to the minimum bit-phase in descending order. On an unsuccessful match in any bit-phase, a bit switch is performed and the level-wise calculation for that phase is carried out. That is, an overhead of Σ_{b=d}^{g} (E_b) is incurred for each level-wise analysis. Consequently, the overhead of a unit step will be

C′_1 = Σ_{b=d}^{g} (E_b) ,  where d and g are the maximum and minimum bit-phases respectively and d ≥ g.

Substituting the value of E_b, we get

C′_1 = Σ_{b=d}^{g} Σ_{q=p}^{h} S′_{1,b}(l′_q) + Σ_{b=d}^{g} k′_{1,b} + Σ_{b=d}^{g} B′_{1,b} .

Similarly, we get

C′_2 = Σ_{b=d}^{g} Σ_{q=p}^{h} S′_{2,b}(l′_q) + Σ_{b=d}^{g} k′_{2,b} + Σ_{b=d}^{g} B′_{2,b} ,

and

C′_n = Σ_{b=d}^{g} Σ_{q=p}^{h} S′_{n,b}(l′_q) + Σ_{b=d}^{g} k′_{n,b} + Σ_{b=d}^{g} B′_{n,b} .

Here the subscript 1 on k and B denotes that the calculation is for detecting a unit code only, where the computation runs from d to g in decreasing order, that is, in the order d, (d−1), (d−2), …, (g+2), (g+1), g. If n codes are to be revealed, the total overhead becomes

T′ = Σ_{y=1}^{n} C′_y .

Since every bit-wise overhead calculation must include the level-wise calculations, and omitting the subscript notation of the search-overhead function for simplicity,

T′ = Σ_{y=1}^{n} Σ_{b=d}^{g} Σ_{q=p}^{h} S′_{y,b}(l′_q) + Σ_{y=1}^{n} Σ_{b=d}^{g} k′_{y,b} + Σ_{y=1}^{n} Σ_{b=d}^{g} B′_{y,b}    (3)

TABLE 1.
COMPLEXITIES OF COMPRESSION AND DECOMPRESSION PROCESSES

Compression complexity of the proposed scheme:
T = Σ_{y=1}^{n} ( Σ_{j=L}^{i−1} S_y(l_j) + k_y + B_y )

Decompression complexity of the proposed scheme:
T′ = Σ_{y=1}^{n} ( Σ_{b=d}^{g} Σ_{q=p}^{h} S′_{y,b}(l′_q) + Σ_{b=d}^{g} k′_{y,b} + Σ_{b=d}^{g} B′_{y,b} )

C. Performance Analysis with Respect to Compression Ratio

Compression ratio may be defined as the ratio of the total number of bits needed to represent the compressed information to the total number of bits in the original text.

Let the training knowledgebase contain n items. The items may contain any of the s symbols of the source language, and the knowledgebase items vary from one to c characters in length. For traditional encoding, the required overhead (i.e. the total number of bits) to represent each character of a knowledgebase entry is log(s) on average. Therefore, if we categorize the knowledgebase items into kn categories (where the categorization criterion is the total number of characters in each knowledgebase entry) and there are e1, e2, e3, …, en elements in the categories k1, k2, k3, …, kn respectively, then


we may easily calculate the overhead of coding the total knowledgebase.

For category k1, a total of log(s) * k1 bits is needed to code each knowledgebase item. As there are e1 elements in this category, the total bit requirement for coding all the elements of category k1 is log(s) * k1 * e1, where log(s), k1 and e1 are non-negative integer values. That is,

Ok1 = log(s) * k1 * e1 = k1 e1 log(s),

where Ok1 denotes the total bit requirement for coding all the elements of category k1.

For category k2, a total of log(s) * k2 bits is needed to code each knowledgebase entry. As there are e2 elements, the total bit requirement for coding all the elements of category k2 is log(s) * k2 * e2; here too log(s), k2 and e2 are non-negative integers. That is,

Ok2 = log(s) * k2 * e2 = k2 e2 log(s),

where Ok2 denotes the total bit requirement for coding all the elements of category k2.

Similarly, for category kn we can write

Okn = log(s) * kn * en = kn en log(s),

where Okn denotes the total bit requirement for coding all the elements of category kn.

Symbolically, the total bit requirement for representing the knowledgebase entries is:

Ot = Ok1 + Ok2 + Ok3 + … + Okn
   = log(s)*k1*e1 + log(s)*k2*e2 + … + log(s)*kn*en
   = k1 e1 log(s) + k2 e2 log(s) + … + kn en log(s)
   = log(s) ( k1 e1 + k2 e2 + … + kn en )

That is,

Ot = log(s) Σ_{i=1}^{n} ( ki . ei )    (4)

Since 1 ≤ ei ≤ log(n), and s is the total number of symbol units in the source language, in the worst case

Ot = log(s) . n . log(n) Σ_{i=1}^{n} ( ki )    (5)

Now let us consider our proposed scheme, where the components of the knowledgebase entries are grouped into several levels and each entry of a level is chosen by means of an effective statistical criterion. Because hierarchical statistics are used, any element of a given level has a greater probability of occurrence than an element placed at any position below it. The levels themselves are ordered so that higher-gram texts or knowledgebase components are coded before any lower-gram texts, irrespective of the probability distribution [11]. It is noteworthy that, although the probability distribution or statistical context was not used to organize the levels, the overall structure yields an automatic statistical distribution, because the text-ranking scheme used to build the knowledgebase follows hierarchical steps that inherently infer statistics from the lower groups. Consequently, whenever the statistical entries are formulated, any subgroup will have lower values than its upper group, and specific coding schemes may be employed with this criterion in mind.

Since static coding is used to encode the whole knowledgebase, and the knowledgebase is hierarchically grouped, the resulting scheme consumes few bits. In our proposed scheme, any knowledgebase entry varies from 1 to r characters; that is, the levels are r, (r − 1), (r − 2), …, 2, 1. Formation of the levels starts from single characters and proceeds incrementally. Any knowledgebase entry containing a substring from its successor level will have a greater multi-gram index, because it is inferred from the previously encountered entry. This yields a sustainable knowledgebase architecture if the knowledgebase is sorted using the multi-gram index as the primary criterion. Since the multi-gram index (or multi-gram weight) is calculated through a comprehensive text-ranking scheme, the underlying text elements may be treated as a single unit. Even so, as clarified earlier, the architecture of the knowledgebase still provides elementary grouping facilities. This coherence model gives us the opportunity to use static coding. If there are a total of m entries in the knowledgebase, coding those values as binary streams may require from 1 to log(m) bits; let the average be y, where 1 ≤ y ≤ log(m).

Again, since multi-grams are used as single components, even if a multi-gram consists of k characters with 1 ≤ k ≤ r, it is turned into a single unit. Consequently, the average requirement may be specified as

Op = y1 + y2 + y3 + … + yn = Σ_{i=1}^{n} yi ,  where 1 ≤ yi ≤ log(m) for 1 ≤ i ≤ n.    (6)

In the worst case, Op = n log(n), where yi = log(n) for 1 ≤ i ≤ n.    (7)

Even though the total number of components n is the same in both cases, since 1 ≤ yi ≤ log(n) (because here m = n, both denoting the total number of elements) and 1 ≤ ei ≤ log(n), multiplying any category index ki by ei certainly gives a value much larger than yi. Consequently, we may deduce that

Op ≤ Ot .

Here n is the total number of elements in the knowledgebase and yi (1 ≤ yi ≤ log(m)) is the total number of bits needed to code component i. It is thus clear that Op ≤ Ot; that is, the total number of bits needed to encode the same source symbols with our proposed scheme is less than with the traditional schemes. Since the compression ratio (R) is the ratio of compressed bits to source bits, we get

Rt = Ot / nt  and  Rp = Op / nt .

As Op ≤ Ot, we get

Rp ≤ Rt .
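As a small numeric illustration (our own, with invented figures rather than values from the paper): suppose the source alphabet has s = 64 symbols, so log(s) = 6 bits per character, and the knowledgebase holds m = n = 1024 entries, so any static code needs at most log(m) = 10 bits. A four-character entry then costs 4 · 6 = 24 bits under traditional per-character coding but at most 10 bits as a single multi-gram code, so for a text built from such entries Op < Ot and therefore Rp < Rt.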

Even in the worst case, we get

Ot = log(s) . n . log(n) Σ_{i=1}^{n} ( ki )
   = log(s) . Op Σ_{i=1}^{n} ( ki )    (8)

Hence we may deduce that, even in the worst case, Op ≤ Ot.

The lower the compression ratio, the better the compression. Consequently, we may conclude that the compression efficiency of the proposed scheme is better than that of traditional dictionary-based compression schemes.

Our proposed scheme requires less space because we have constrained the growth of the knowledgebase. The ability to use a small number of levels ensures greater flexibility in memory management. As the search space is minimized, the computational complexity is also reduced, and lower computational complexity undoubtedly ensures faster performance.

VI. EXPERIMENTAL RESULTS AND DISCUSSIONS

The performance evaluation is based on the files "book1", "book2", "paper4", "paper5" and "paper6" from the Calgary Corpus. Since the prime aim of our proposed compression scheme is not to compress huge amounts of text but rather texts of the limited size affordable by mobile devices, i.e. embedded systems, we took blocks of less than five hundred characters chosen randomly from those files (ignoring binary and other non-text files) and performed the efficiency evaluation.
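The block-sampling step of this evaluation can be pictured as in the sketch below (our own illustration; the file handling, block size and block count are assumptions, not values from the paper).

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of drawing random text blocks (at most 'blockSize' characters)
// from a corpus file for compression-ratio measurements.
public class BlockSampler {
    public static List<String> sample(Path corpusFile, int blockSize, int blocks) throws Exception {
        String text = Files.readString(corpusFile);
        List<String> out = new ArrayList<>();
        Random rnd = new Random();
        for (int b = 0; b < blocks; b++) {
            int start = rnd.nextInt(Math.max(1, text.length() - blockSize));
            out.add(text.substring(start, Math.min(text.length(), start + blockSize)));
        }
        return out;
    }
}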

The most recent studies involving compression of text data are:

1. "Low Complex and Power Efficient Text Compressor for Cellular and Sensor Networks" (Mode 1) by Rein et al. [1, 2, 3], and
2. "A Modification of Greedy Sequential Grammar Transform Based Data Compression" by Islam et al. [6].

We denote these two methods as DCM-1 and DCM-2 respectively, where DCM stands for Data Compression Method.

The simulation was performed on a 2.0 GHz personal computer with 112 MB of RAM using the object-oriented programming language Java. The average compression ratio over three random executions for different sizes of text blocks is as follows.

TABLE 2.
COMPRESSION RATIO

Corpus    Number of characters considered    Compression ratio (%) for DCM-1    Compression ratio (%) for DCM-2    Compression ratio (%) for proposed scheme
paper 4   108                                44.01                              44.03                              42.94
paper 5   061                                44.98                              45.51                              44.02
paper 6   032                                45.22                              45.96                              44.30
book 1    191                                46.84                              48.11                              45.97
book 2    104                                43.95                              46.69                              45.27

The compression ratio is a metric describing how many compressed units are required to represent one unit of source data; a lower value indicates better compression. A general observation is that higher modes lead to better compression ratios, even though the improvement becomes smaller at higher orders.

The performance of any dictionary-based or knowledge-inferred compression scheme varies greatly with the architecture of the dictionary construction and knowledge inference. When the test bed is the same corpus from which the knowledgebase or dictionary was built, the performance gain is higher because of the greater match with the dictionary entries or knowledgebase components. Our proposed scheme shows comparatively lower gains on such test beds because our training procedure applies the text-ranking scheme across several corpora, so the choice of knowledgebase entries is unbiased towards any specific corpus. It is this aspect of our approach that ensures a broader distribution of the ranking and makes the developed scheme uniformly usable for text data compression. Other schemes that are trained with specific files of a corpus obtain greater matching with their knowledgebase, and hence the performance evaluation shows better results for that corpus or that specific file than for our proposed scheme.

The prime achievement of our proposed scheme is that it is capable of compressing source texts of, on average, only five characters. Although the compression ratio deteriorates in that case, it is an important step for small text compression. This capability can be greatly helpful for compressing sensor network data and even for asynchronous data transmission in web applications with real-time characteristics.

Besides the evaluation presented earlier in this section, we analyze the performance of the proposed scheme in terms of compression ratio with respect to the texts presented in [3].

TITLE    TEXT

TEST_A   The international system of units consists of a set of the units together with a set of the prefixes. The units of the SI can be divided into two subsets. There are seven base units. Each of these base units are dimensionally independent. From the seven base units all other units are derived.

TEST_B   However, few believed that SMS would be used as the means of sending text messages from a mobile to the another. One factor in the takeup of the SMS was that operators were slow to eliminate billing fraud which was possibly by changing SMSC setting on individual handsets to the SMSC's of other operators.

TEST_C   Alice is a fictional character in the books of the Alice's adventures in the Wonderland and its sequel through the Looking-Glass, which were written by Charles Dodgson under the pen name Lewis Caroll. The character is based on Alice Liddel, a child friend of Dodgson's. The pictures, however do not depict Alice Liddel, as the illustrator never met her. She is seen as a logical girl, sometimes being pedantic, especially with Humpty Dumpty in the second book.

The performance of the proposed compression scheme (in terms of compression ratio) for the above texts, in comparison with [3] (indicated as DCM-1), is given below.

[Figure: bar chart of compression ratio in percentage (lower values indicate better performance) for TEST_A, TEST_B and TEST_C, comparing DCM-1 and the proposed scheme.]

Figure 1. Performance comparison for the example texts.

From the figure we find that the compression ratio of the proposed scheme [11] is lower in all cases.

It should be mentioned that no additional hardware is necessary to implement our proposed scheme; rather, it can be used even on low-powered and low-memory devices. It is chiefly to ensure an affordable text compression scheme for a greater variety of smart devices that we emphasize static coding in lieu of on-the-fly coding. Besides the results presented in this section, a detailed theoretical analysis of the space and time requirements and of the computational complexity, i.e. the overall computational overhead, has already been presented in Section V.

VII. CONCLUSION AND RECOMMENDATION

We have presented an effective and efficient approach for compressing short English text messages on low-powered embedded devices. Modified syllable-based dictionary matching and static coding have been employed to obtain the compression. Moreover, a new theoretical concept for choosing the multi-grams is used, which has enabled us to obtain a noteworthy compression ratio with a smaller number of knowledgebase entries than other methods, while consuming fewer resources. The overall strategy of computational simplicity has also ensured reduced time complexity for the proposed compression and decompression processes. The main aspect of our proposed scheme lies in the text-ranking-based knowledgebase construction with space integration, which opens a new arena of text compression methodology. A consistent and relevant mathematical analysis of the overall performance also establishes a strong technical basis for the proposed scheme. Moreover, the prime achievement is the starting threshold of text compression, which we have reduced to less than five characters. With a limited knowledgebase size, the achieved compression is without doubt efficient and effective. As the knowledgebase is not allowed to grow through continued use, the low-memory system is kept free of the risk of the knowledgebase expanding beyond the optimal memory size, and thus the applicability of the proposed system even on very low-memory devices is ensured.

REFERENCES

[1] S. Rein, C. Guhmann, and F. H. P. Fitzek, "Compression of Short Text on Embedded Systems", Journal of Computers, vol. 1, no. 6, September 2006.
[2] S. Rein, C. Guhmann, and F. Fitzek, "Low Complexity Compression of Short Messages", Proceedings of the IEEE Data Compression Conference (DCC'06), March 2006, pp. 123-132.
[3] S. Rein, F. Fitzek, M. P. G. Perucci, T. Schneider, and C. Guhmann, "Low Complex and Power Efficient Text Compressor for Cellular and Sensor Networks", 15th IST Mobile and Wireless Communication Summit, June 2006.
[4] J. Lansky and M. Zemlicka, "Compression of Small Text Files Using Syllables", Proceedings of the Data Compression Conference (DCC'06), Los Alamitos, CA, USA, 2006.
[5] J. Lansky and M. Zemlicka, "Text Compression: Syllables", Annual International Workshop on DAtabases, TExts, Specifications and Objects (DATESO), CEUR Workshop Proceedings, vol. 129, pp. 32-45, 2005.
[6] Md. Rafiqul Islam, Sajib Kumar Saha, and Mrinal Kanti Baowaly, "A Modification of Greedy Sequential Grammar Transform Based Universal Lossless Data Compression", Proceedings of the 9th International Conference on Computer and Information Technology (ICCIT 2006), 28-30 December 2006, Dhaka, Bangladesh.
[7] P. Skibinski, "Two-Level Dictionary Based Compression", Proceedings of the IEEE Data Compression Conference (DCC'05), p. 481.
[8] F. Awan and A. Mukherjee, "LIPT: A Lossless Text Transform to Improve Compression", Proceedings of the International Conference on Information Technology: Coding and Computing, IEEE Computer Society, Las Vegas, Nevada, 2001.
[9] H. Kruse and A. Mukherjee, "Preprocessing Text to Improve Compression Ratios", Proceedings of the Data Compression Conference, IEEE Computer Society, Snowbird, Utah, 1998, p. 556.
[10] S. A. Ahsan Rajon, "A Study on Text Corpora for Evaluating Data Compression Schemes: Summary of Findings and Recommendations", Research Report, Computer Science and Engineering Discipline, Khulna University, Khulna, Bangladesh, December 2008.
[11] Md. Rafiqul Islam, S. A. Ahsan Rajon, and Anonda Podder, "Short Text Compression for Smart Devices", Proceedings of the 11th International Conference on Computer and Information Technology (ICCIT 2008), 25-27 December 2008, Khulna, Bangladesh, pp. 453-558.
[12] S. Rein and C. Guhmann, "Arithmetic Coding - A Short Tutorial", Wavelet Application Group, Technical Report, April 2005.
[13] David Hertz, "Secure Text Communication for the Tiger XS", Master of Science Thesis, Department of Electrical Engineering, Linköpings University, Linköping, Sweden.
[14] S. A. Ahsan Rajon and Anonda Podder, "Lossless Compression of Short English Text for Low-Powered Devices", Undergraduate Thesis, CSE Discipline, Khulna University, Khulna, Bangladesh, March 2008.
[15] Md. Rafiqul Islam, S. A. Ahsan Rajon, and Anonda Podder, "Lossless Compression of Short English Text for Low-Powered Devices", Proceedings of the International Conference on Data Engineering and Management (ICDEM 2008), Tiruchirappalli, Tamil Nadu, India, February 9, 2008.
[16] Ross Arnold and Tim Bell, "A Corpus for the Evaluation of Lossless Compression Algorithms", Data Compression Conference, IEEE Computer Society Press, 1997, pp. 201-210.

Md. Rafiqul Islam obtained his Master of Science (M.S.) in Engineering (Computers) from Azerbaijan Polytechnic Institute (now Azerbaijan Technical University) in 1987 and his Ph.D. in Computer Science from Universiti Teknologi Malaysia (UTM) in 1999. His research areas include design and analysis of algorithms and information security. Dr. Islam has published a number of papers in these areas in national and international journals as well as in refereed conference proceedings.

He is currently working as the Head of the Computer Science and Engineering Discipline, Khulna University, Bangladesh.

S. A. Ahsan Rajon received his B.Sc. Engineering degree from the Computer Science and Engineering Discipline, Khulna University, Khulna, in April 2008. He is now a postgraduate student of the Business Administration Discipline under the Management and Business Administration School of the same university. Rajon is also working as an Adjunct Faculty member of the Computer Science and Engineering Discipline, Khulna University, Bangladesh. Rajon has made several publications in international conferences. His research interests include data engineering and management, electronic commerce and ubiquitous computing. Currently he is working on robotics.

He is a member of the Institute of Engineers, Bangladesh (IEB).


Design and Analysis of an Effective Corpus for Evaluation of Bengali Text Compression Schemes

Md. Rafiqul Islam
Computer Science and Engineering Discipline, Khulna University, Khulna, Bangladesh.
Email: dmri1978@yahoo.com

S. A. Ahsan Rajon
Computer Science and Engineering Discipline, Khulna University, Khulna, Bangladesh.
Email: ahsan.rajon@gmail.com

Abstract — In this paper, we propose an effective platform for the evaluation of Bengali text compression schemes. A novel scheme for the construction of a Bengali text compression corpus is also incorporated. The paper presents a methodical study of the formulation approaches for text compression corpora and introduces an effective corpus, named Ekushe-Khul, for evaluating Bengali text compression schemes. To design the Bengali text compression corpus, the Type to Token Ratio has been used as the selection criterion, together with a number of secondary considerations. The paper also presents a mathematical analysis of data compression performance with respect to the structural aspects of corpora. A comprehensive analysis of the evolving criteria of text compression corpora, along with related issues in designing dictionary-based compression, is also included. The proposed corpus is effective for evaluating the compression efficiency of small and middle-sized Bengali text files.

Index Terms — Corpus, Bengali Text, Bengali Text Compression, Dictionary Coding, Data Management, Evaluation Platform, Compression Efficiency, Type to Token Ratio (TTR).

I. INTRODUCTION

Among the many languages of the world, Bengali is the only language that has been established at the cost of lives. However, in establishing Bengali as a strong candidate in the field of research, the number of steps or achievements is not yet noteworthy. Enhancing data management techniques for Bengali text has not yet found a robust base. Text compression, which is one of the important aspects of elementary data management and text manipulation, is still mostly based on generalized universal compression techniques. The existing sophisticated Bengali text compression schemes also suffer from the unavailability of a standard evaluation platform, i.e. the unavailability of a text compression corpus. This paper proposes a text compression corpus for evaluating Bengali text compression schemes.

(This paper is an enhancement of the paper "On the Design of an Effective Corpus for Evaluation of Bengali Text Compression Schemes" by Md. Rafiqul Islam and S. A. Ahsan Rajon, published in the proceedings of the International Conference on Computer and Information Technology (ICCIT 2008), 25-27 December 2008, Khulna, Bangladesh. Contact author: S. A. Ahsan Rajon. doi: 10.4304/jcp.5.1.59-68.)

The construction <strong>of</strong> a data compression corpus is now<br />

a demand-<strong>of</strong>-time. The state-<strong>of</strong>-the-art technologies <strong>of</strong><br />

data management have already equipped the benchmarks<br />

and standard collection <strong>of</strong> experimental data sets for<br />

performance evaluation [1]. In Bengali, though a highly<br />

impressive step has been initiated in [2] for constructing a<br />

linguistic corpus having CIIL (Central Institute <strong>of</strong> Indian<br />

Languages) corpus [4] as a pioneer, still now any<br />

mentionable sophisticated data compression corpus is not<br />

available with text compression benchmarks. This paper<br />

focuses on designing a Bengali text compression corpus<br />

based on Type to Token Ratio (TTR) and provides the<br />

mathematical framework for various aspects <strong>of</strong> choosing<br />

the corpus with compression benchmarks.<br />

The paper is targeted to design a corpus that will<br />

facilitate the future researchers especially in the research<br />

<strong>of</strong> Bengali text compression to have a test-bed for<br />

evaluating their performance. For designing such a<br />

corpus we are to derive a novel scheme <strong>of</strong> text<br />

compression corpus formation because <strong>of</strong> having no<br />

existing benchmark.<br />

The paper is organized into nine sections. Section II provides elementary concepts on corpora and text compression schemes. The key points for constructing a corpus are presented in Section III. Section IV is devoted to the literature review, presenting various aspects of the existing corpora. Section V describes the proposed approach for formulating the corpus. A brief description of the files comprising the proposed text compression corpus is presented in Section VI. Analysis of the proposed corpus is given in Section VII. Issues concerning the usability of the proposed corpus for other languages are presented in Section VIII. The paper ends with Section IX, providing the conclusion and recommendations.

II. OVERVIEW OF CORPUS AND DATA COMPRESSION

Text compression is an elementary aspect of data compression. Compression is the process of reducing the size of a file or data set by identifying and removing redundancy in its structure [7]. Data compression offers an effective way of reducing communication costs by using the available bandwidth effectively. Data compression techniques are generally divided into two categories, namely lossless data compression and lossy data compression. For lossless schemes, the recovery of the data must be exact. Lossless compression algorithms are essential for virtually all kinds of text processing, scientific and statistical database applications, medical and biological image processing, DNA and other biological data management, and so on. A lossy data compression technique, in contrast, does not ensure the exact recovery of the data; lossy compression is widely used for image and multimedia data. Since the early 1990s, specific platforms for evaluating English text compression schemes have been adopted for standard research; these are better known as data compression corpora [1].

In the general sense, a corpus is simply a collection of texts. From the point of view of data compression, a compression corpus is a standard collection of texts or data for analyzing and evaluating the effectiveness and efficiency, i.e. the performance, of compression schemes. The most widely used data compression corpora are the Calgary corpus, the Canterbury corpus and Project Gutenberg. Although the corpora for English were introduced in the early 1990s, no complete and standard corpus is yet available for analyzing and evaluating Bengali text compression performance. In this paper, we present a novel approach for designing a data compression corpus for the evaluation of Bengali text compression schemes. The implementation of the proposed scheme is embodied in the developed Ekushe-Khul corpus.

For analyzing a domain-specific or language-specific text compression scheme, it is essential to have a domain-oriented or language-oriented benchmark against which the scheme can be measured. There are a number of English text compression corpora and corresponding benchmarks which are extensively used for the evaluation of English text compression schemes. There are also a number of multilingual text corpora involving English for evaluating English text compression with respect to other languages. For Bengali, however, there is neither a corpus for evaluating text compression performance nor a benchmark for it. A detailed explanation of the necessity of language-specific corpora and the suitability (or unsuitability) of existing corpora for evaluating non-English material is provided by Sarkar and De Roeck in [10], with comprehensive explanations by N. S. Dash in [12] and by Akshar et al. in [11]. The changes required for designing a new corpus for Bengali, in order to meet the new demands of computational linguistics and other approaches, are elaborated in [9].

III. CHARACTERISTICS OF A DATA COMPRESSION CORPUS

The characteristics of a corpus are defined by the purpose for which it is created and by the evaluation arena for which it is designed. These criteria give rise to the need for field-specific corpora such as data compression corpora, information retrieval corpora, corpora for data mining research, and so on. According to Arnold et al., there are six criteria for choosing a corpus [1]. First, the files should be chosen as representative of the files used, and expected to be used, with compression methods. Second, the availability of the files should be ensured. The availability of public-domain material is the third consideration. They also suggest that the files should not be larger than necessary, that the corpus should be perceived to be valid and useful and, finally, that it should actually be valid and useful.

These criteria successfully established the Canterbury Corpus as one of the most widely adopted corpora for data compression. However, with the rapid growth of technology, additional criteria have evolved for constructing corpora. First, the original criteria were specified for evaluating the compression of middle-sized files. With the advancement of embedded devices with lower processing speed and small memory (such as mobile phones), it has become extremely important to provide a new test-bed for the evaluation of small text compression schemes. The frequent use of mobile phones, short text messages and emails has already transformed the way text is used, and hence the need for field-specific and size-specific corpora has emerged. Taking these limitations into account, we provide files that address these issues while also respecting the traditional characteristics. Moreover, the file structure, i.e. sentence structure, was not previously considered an important criterion for choosing corpora; yet the compression ratio varies gradually with variations in sentence construction and file structure. Besides these, the probability of occurrence of text components and their frequencies are important criteria that affect text compression performance.

IV. LITERATURE REVIEW

The history of the most widely adopted data compression corpora started long ago. The Calgary Corpus, which was collected in 1987 and published in 1990, is considered the pioneer of data compression corpora. The files in the Calgary corpus [1], with their content categories, are listed in Table I.

TABLE I
FILES IN THE CALGARY CORPUS

Title    Description
bib      Bibliographic file
book1    Hardy: Far From the Madding Crowd
book2    Witten: Principles of Computer Speech
geo      Geophysical data
news     News batch file
obj1     Compiled code for VAX
obj2     Compiled code for Apple Macintosh
paper1   Witten: Arithmetic Coding for Compression
paper2   Witten: Computer Security
paper3   Witten: In Search of Autonomy
paper4   Cleary: Programming by Example Revisited
paper5   Cleary: Logical Implementation of Arithmetic
paper6   Cleary: Compact Hash Tables
pic      Picture from CCITT facsimile
progc    C source code
progl    Lisp source code
progp    Pascal source code
trans    Transcript of a terminal session

The next initiative, and the most widely adopted one, is the Canterbury Corpus, developed by Arnold et al. [1]. The Canterbury corpus used the benchmark derived from the Calgary corpus as the basis of its development. Table II lists the files constituting the Canterbury corpus.

TABLE II
FILES IN THE CANTERBURY CORPUS

Title          Description
alice29.txt    English text
ptt5           Fax image
fields.c       C source code
kennedy.xls    Spreadsheet file
sum            SPARC executable
lcet10.txt     Technical document
plrabn12.txt   English poetry
cp.html        HTML
grammar.lsp    Lisp source code
xargs.1        GNU manual page
asyoulik.txt   Play

The above files are intended solely for the evaluation of English text compression and hence are not effective, or applicable at all, for evaluating Bengali text compression.

The most recent and highly organized analysis of a Bengali news corpus is provided by Majumder et al. [2]. That work proposes a new corpus based on the widely circulated newspaper Prothom Alo [5] and provides an exhaustive analysis covering TTR (Type to Token Ratio), function words and other morphological and linguistic measures. For the analysis of data compression efficiency, this corpus requires some specific modification, in the sense that data compression involves the additional criteria of heterogeneity and text size. Moreover, the news corpus proposed in [2] has been extensively tested for linguistic analysis only; in order to use it for data compression, further compression-related analysis is essential. According to their methodology, HTML news files were collected from the Prothom Alo website, converted to Unicode, and categorized into twenty-seven distinct groups, which are also available as a single file of three hundred and eighteen megabytes. For most practical applications and research on middle-sized or small text compression, such large files are rarely necessary; rather, a standard small and medium-sized corpus is essential.

The first corpus in Bengali, developed by the Central Institute of Indian Languages (CIIL) in the years 1991-1995, possesses a collection of three million Bengali words, which provide valuable linguistic data for research on Bengali language analysis [4]. For evaluating compression efficiency, this impressive corpus also suffers from limited heterogeneity and a lack of support for embedded and small devices.

Besides the corpora presented above, a number of studies describe the features of Bengali linguistic corpora. A number of works are also devoted to corpora for Bengali OCR (Optical Character Recognition); the contribution of B. B. Chaudhuri is pioneering in this specific field [13].

The necessity of devising specific and relevant criteria for building Bengali corpora, and the basic limitations and pitfalls of applying non-Bengali corpus-building concepts, are extensively discussed by N. S. Dash in [9]. The prime techniques of Bengali text corpus processing, which are used for various linguistic activities including far more complicated orthographical, morphological and lexicological concerns, are also presented in [9]. Various criteria, including concordance of words, lexical collocation, key-word-in-context, local word grouping, lemmatization of words and sentence parsing, are described at length by N. S. Dash in [9]. Since the overall aim of [9] is to furnish linguistic and information retrieval corpora, its techniques have little direct usability for forming text compression corpora.

The need for domain-specific corpora has motivated us to design a Bengali text compression evaluation corpus. The unavailability of any reference or benchmark in the field has also led us to develop a new concept for designing such a corpus.



V. PROPOSED CORPUS FOR EVALUATION OF BENGALI TEXT COMPRESSION SCHEMES

We propose a new corpus for the evaluation of Bengali text compression schemes, namely the Ekushe-Khul corpus [8]. This corpus is intended for evaluating short and middle-sized text compression schemes rather than large text compression. One noteworthy point is that this corpus is selected by applying a new approach that takes both TTR and Compression Ratio (CR) into account. These two concepts have been adopted because no standard and sophisticated Bengali text compression corpus is available against which we could compare performance and thus perform the selection from candidate files, as was done for other existing data compression corpora.

To form the corpus we have considered ten groups: articles, poems, advertisements, speeches, news, SMS, emails, particulars, stories and reports. The files range in size from 4 KB to 1800 KB. The steps for constructing the proposed corpus are depicted in Fig. 1.

Fig. 1: Steps for constructing the proposed Bengali text compression corpus: collect categorized files; remove English and other non-Bengali text or symbols; convert to Unicode; calculate the TTR for each file; select the file with the maximum TTR and the file with the minimum TTR (if more than one file qualifies, select the one with the maximum source file size and ignore the others); the two selected files become CATEGORY_TITLE_1.txt and CATEGORY_TITLE_2.txt.
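As a rough illustration of the selection step summarized in Fig. 1, the following Python sketch (not part of the original work; the directory layout, file naming and the whitespace tokenizer are simplifying assumptions) picks the maximum-TTR and minimum-TTR file for one category, breaking ties by source file size:

    import glob
    import os

    def ttr(path):
        # TTR as defined in this paper: total word tokens / distinct words.
        # Whitespace tokenization is an assumption; real Bengali text may need
        # more careful handling of punctuation and conjunct characters.
        words = open(path, encoding="utf-8").read().split()
        return len(words) / len(set(words)) if words else 0.0

    def select_peer_files(category_dir):
        candidates = glob.glob(os.path.join(category_dir, "*.txt"))
        scored = [(ttr(f), os.path.getsize(f), f) for f in candidates]
        # Highest TTR (ties broken by larger source file) -> CATEGORY_TITLE_1.txt
        high = max(scored)[2]
        # Lowest TTR (ties broken by larger source file) -> CATEGORY_TITLE_2.txt
        low = min(scored, key=lambda s: (s[0], -s[1]))[2]
        return high, low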


The files constituting the proposed Ekushe-Khul corpus are shown in Table III. A detailed description of the proposed corpus selection criteria, together with the potential issues and the data compression parameters that we propose for developing Bengali text corpora for the evaluation of Bengali text compression schemes, is presented in Section VI.

TABLE III
FILES CONSTITUTING THE PROPOSED EKUSHE-KHUL CORPUS

File Name                            Type
Article1.txt, Article2.txt           Articles in Bengali
Poem1.txt, Poem2.txt                 Bengali classic poems
Advertise1.txt, Advertise2.txt       Advertisements
Speech1.txt, Speech2.txt             Various speeches
News1.txt, News2.txt                 News from newspapers
SMS1.txt, SMS2.txt                   Short Bengali messages (SMS)
Email1.txt, Email2.txt               Emails in Bengali
Particulars1.txt, Particulars2.txt   Particulars
Story1.txt, Story2.txt               Bengali stories
Report1.txt, Report2.txt             Academic reports

To prepare the files, the first step was to remove redundancies and other components (e.g. HTML tags, letters and words from other languages, formatting, etc.) and to convert them into Unicode, since several documents used non-standard encodings; they were then saved as Unicode text files and forwarded for further analysis. For each category, all the files were passed to a TTR calculator and then to a compression ratio evaluator for the selection of the final documents. The Canterbury Corpus was formed by comparing each candidate file with the file of the same group or category in the Calgary Corpus and selecting the file demonstrating the closest match in terms of compression performance. As no such standard exists for Bengali, we had to take both of our criteria into account and hence selected two files per category, corresponding to the maximum and the minimum TTR.

TTR is a measurement <strong>of</strong> how many times a previously<br />

encountered words repeat themselves before a new word<br />

makes its appearance in the text [2]. It is note worthy that<br />

though this measure is essential for language engineering,


JOURNAL OF COMPUTERS, VOL. 5, NO. 1, JANUARY 2010 63<br />

its application in data compression is not mandatory. For<br />

calculative approaches, TTR is calculated by dividing the<br />

total number <strong>of</strong> word tokens by the total number <strong>of</strong><br />

distinct words. Texts with a large number <strong>of</strong> distinct<br />

words result into lower TTR. For this case, the task <strong>of</strong><br />

forming a data compression dictionary appears to be<br />

more accurate. Conversely, for texts with greater TTR i.e.<br />

with greater number <strong>of</strong> frequent words (resulting lower<br />

number <strong>of</strong> distinct words) provide a suitable arena for<br />

constructing and evaluating dictionary-based and<br />

repetition-analysis oriented text compression schemes.<br />

This is the reason, which inspires us to provide peer-files<br />

for each category. The files with suffix ‘2’ indicate the<br />

file with lower TTR (suitable for text compression<br />

fluctuation evaluation) and files with suffix ‘1’ indicate<br />

greater TTR (suitable for dictionary-based text<br />

compression scheme evaluation) for the respective group.<br />
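As a small, purely illustrative example of this definition (transliterated words are used only for display; the corpus files themselves are Unicode Bengali text):

    # 12 word tokens, 5 distinct words -> TTR = 12 / 5 = 2.4
    tokens = "ami bhalo achi ami bhalo achi tumi bhalo achho ami bhalo achi".split()
    ttr = len(tokens) / len(set(tokens))
    # A higher TTR (more repetition) favours dictionary-based schemes,
    # which is why such a file would become the '1'-suffixed peer of its group.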

The next step involves analyzing the files in terms of compression ratio [6] to establish the logical and implementational validity of choosing TTR as a benchmark for constructing the corpus. Although a number of approaches exist for the compression of English text (such as the Burrows-Wheeler Transform [4], arithmetic coding and Huffman coding), regrettably little research has so far been dedicated to Bengali text compression schemes, and the use of unsophisticated text compression schemes does not provide optimal performance for Bengali text. To provide a benchmark for the proposed corpus, we adopt the dictionary-based compression scheme in [7] and demonstrate the inter-relation of TTR with compression ratio.

Owing to the unavailability of sophisticated and multidimensional Bengali text compression approaches, we had few alternatives for comparing the performance fluctuation over our corpus. Although a number of English text compression schemes are available, testing with those would not provide a comprehensive view of our developed scheme and its implemented prototype.

VI. ASPECTS OF CORPUS SELECTION CRITERIA AND TEXT COMPRESSION PERFORMANCE

Let n be the total number of categories of collections of texts. The categories are representative of texts bearing distinct themes, with differences in sentence construction, wording and structure. Let the categories be $c_1, c_2, c_3, \ldots, c_n$ respectively.

For category $c_1$ the files are $f_{1,1}, f_{1,2}, f_{1,3}, \ldots, f_{1,n}$ with lengths $l_{1,1}, l_{1,2}, l_{1,3}, \ldots, l_{1,n}$, where the length denotes the total number of characters (including blank spaces, tabs, new lines and other punctuation marks). Suppose there are $w_{1,1}, w_{1,2}, w_{1,3}, \ldots, w_{1,n}$ words in $f_{1,1}, f_{1,2}, f_{1,3}, \ldots, f_{1,n}$ respectively, with the numbers of distinct words being $d_{1,1}, d_{1,2}, d_{1,3}, \ldots, d_{1,n}$ for the same files. For any file $f_{i,j}$, where i denotes the category index and j the file index, $w_{i,j} \ge d_{i,j}$. Let $T_{i,j}$ be the Type to Token Ratio (TTR) of file $f_{i,j}$, that is,

$$T_{i,j} = \frac{w_{i,j}}{d_{i,j}}.$$

For any two files p, q in category i, with TTRs $T_{i,p}$ and $T_{i,q}$, the condition $T_{i,p} \ge T_{i,q}$ indicates that file $f_{i,p}$ contains proportionally more repeated words (i.e. relatively fewer distinct words) than $f_{i,q}$.

Applying this rule, TTRs are computed for all the files, and the files with the maximum and the minimum TTRs are selected. For the file with maximum TTR, if $S_d$ and $S_t$ denote the set of distinct words and the collection of all word tokens respectively, then in the ideal case $S_d$ covers only a vanishingly small fraction of $S_t$.

Although TTR has mainly been used in forming linguistic and Natural Language Processing corpora, including information retrieval corpora, it can also be used to design data compression corpora. Here we present an analysis of the effectiveness of using TTR as a criterion for constructing such corpora.

The basic relation between TTR and word distribution is:

- If TTR is larger (greater), then the number (frequency) of matching words is also greater; consequently, the number of distinct words is smaller.
- If TTR is smaller (lower), then the number (frequency) of matching words is smaller; consequently, the number of distinct words is larger.

For designing dictionary-based data compression approaches, TTR plays a great role. In constructing the dictionary, the prime concern is to analyze the probability of occurrence of each character, syllable, substring or word (or any combination of these units). The frequencies with which they occur are an important part of estimating these probabilities. As the total number of distinct words is greater in files with a lower TTR value, the detection of syllable probabilities may be easier there, since the word distribution in those files is approximately flat. For building word-based dictionaries, files with higher TTR values are more effective, as the recurring words can be very easily identified. The fluctuation in performance can then be observed by comparing the performance of a compression scheme on the peer files.

Let us consider a tertiary dictionary-based text compression approach DCA, which uses a dictionary of n items. The items consist of c characters, s syllables and w substrings (words), which save on average $sc$, $ss$ and $sw$ bits each. Traditionally $sw \ge ss \ge sc$, since words or substrings are generally collections of syllables and/or symbols and syllables are collections of characters, although the indices or pointers in most designs require consecutive values spanning up to a threshold value. For a file with greater TTR (i.e. a greater number of frequent words and a lower number of distinct words), the probability of encoding using word-based coding is high. Consequently, the bit savings will also be high, since the saving weight is greater for words than for the other units. Let the number of distinct words replaced be $r_w$; here $r_w$ theoretically tends towards the total number of word occurrences on average.

Therefore, the total savings in this phase is

$$S_1 = \sum_{y=1}^{r_w} (sw_y \times fw_y)$$

where $fw_y$ is the frequency of the corresponding distinct word, i.e. the frequency of occurrence of the distinct word indexed by y; ideally, $fw \to tw$.

The number of distinct syllables $r_s$ left in such a collection should be lower. If the weight of a syllable is considered to be double that of a character (consistent with the 4:2:1 ratio discussed below), the total savings would be

$$S_2 = \sum_{y=1}^{r_s} (ss_y \times fs_y)$$

where $fs_y$ is the frequency of the syllable.

The remaining characters are encoded separately, resulting in savings of

$$S_3 = \sum_{y=1}^{r_c} (sc_y \times fc_y)$$

If we encode a total of p characters separately, saving $sp$ bits per character, and we encode q words separately with an average of d characters per word, then even when d = p the savings are much greater for word-based encoding, since the indexing of the constructed code base is approximately sequential and hence each single-character and each single-word encoding occupies about the same length.

Thus, the total saving is

$$S = S_1 + S_2 + S_3 = \sum_{y=1}^{r_w} (sw_y \times fw_y) + \sum_{y=1}^{r_s} (ss_y \times fs_y) + \sum_{y=1}^{r_c} (sc_y \times fc_y) \qquad (1)$$
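A small numeric sketch of equation (1) follows, using assumed values only: a single per-class saving stands in for the per-item savings $sw_y$, $ss_y$, $sc_y$, and the 4:2:1 weighting anticipates the ratio discussed later in this section.

    # Hypothetical bit savings per matched word, syllable and character (ratio 4:2:1).
    sw, ss, sc = 4, 2, 1

    # Hypothetical frequencies of the matched distinct units in a high-TTR file.
    fw = [30, 22, 15]      # r_w = 3 distinct words, each occurring many times
    fs = [12, 9]           # r_s = 2 distinct syllables
    fc = [5, 4, 3, 2]      # r_c = 4 remaining characters

    S1 = sum(sw * f for f in fw)   # word-based savings
    S2 = sum(ss * f for f in fs)   # syllable-based savings
    S3 = sum(sc * f for f in fc)   # character-based savings
    S = S1 + S2 + S3               # total savings as in equation (1)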

For a file with lower TTR (i.e. a greater number of distinct words and a lower number of frequent words), matching words and syllables occur only infrequently. This results in extensive application of character-based encoding, since the character is the primary unit of encoding. That is, $r_w' < r_w$ and $fw' \to 1$; similarly, $r_s' < r_s$ and $fs' \to 1$. But, because a large number of character units remain, $r_c' \to t_c'$.

If there are a total of $t_w'$ words in the file with the lower TTR value, then

$$S_1' = \sum_{y=1}^{r_w'} (sw_y \times fw_y')$$

with $r_w'$ tending towards $t_w'$.

As few frequent words are encountered, comparably few frequent syllables will be found, although a minor portion of the syllables may still be coded in the form of words. That is, a larger number of syllable units and character units ($t_s'$ and $t_c'$) are left to be coded. For syllable-based coding,

$$S_2' = \sum_{y=1}^{r_s'} (ss_y \times fs_y')$$

After these two steps, a greater number of characters are left to be coded as distinct characters. Typically $t_c' < t_s'$ and, for usual cases, $t_c' < t_w'$. Let this step save

$$S_3' = \sum_{y=1}^{r_c'} (sc_y \times fc_y')$$

Thus,

$$S' = S_1' + S_2' + S_3' = \sum_{y=1}^{r_w'} (sw_y \times fw_y') + \sum_{y=1}^{r_s'} (ss_y \times fs_y') + \sum_{y=1}^{r_c'} (sc_y \times fc_y') \qquad (2)$$

It has been stated that the ratio of bit savings for word-based, syllable-based and character-based encoding is 4:2:1 (i.e. for the ratio a : b : c we may write a > b > c). This ratio is taken as standard because, for character-based dictionary-oriented encoding, a single character or symbol is the primary unit; substrings and syllables range from two characters to any higher value; and the number of characters comprising a word may ideally be taken as greater than four, with short words regarded as a subgroup of substrings. Hence, for the above cases, it can easily be remarked that:

(i) $S_1 \ge S_1'$, since the number of word matches is greater in $S_1$ and the difference carries a multiplier of four.

(ii) $S_2 \ge S_2'$, since the number of syllable matches is greater in $S_2$ and the difference carries a multiplier of two.

(iii) $S_3 \le S_3'$, since the number of character matches is greater in $S_3'$ and the difference carries a unit multiplier.

From these three consequences we may point out that, although $S_3' \ge S_3$, on average $S_3'$ would have to cross a saturation point or threshold of (4 + 2 = 6) times higher amplitude to outweigh the other terms. Consequently, we may conclude that $S > S'$.

The same may also be proved as follows. From clause (iii) we may write

$$S_3 + \alpha = S_3'$$

where $\alpha$ is a constant bit factor. From clauses (i) and (ii) we can write

$$S_1 + S_2 \ge S_1' + S_2'.$$

Adding $S_3'$ to both sides gives

$$S_1 + S_2 + S_3' \ge S_1' + S_2' + S_3' \qquad (3)$$

Since $\alpha = 1, 2, 3, \ldots$ (a number of bits) is a constant factor, we may write (3) as

$$S_1 + S_2 + S_3 + \alpha \ge S_1' + S_2' + S_3' \qquad (4)$$

It has been presented that $S = S_1 + S_2 + S_3$ and $S' = S_1' + S_2' + S_3'$. Hence, from (4),

$$S + \alpha \ge S'.$$

Recall from the earlier explanation that S and S' are the aggregations of the bit savings for matching words, syllables and characters. Again, as the ratio of bit saving for word-based, syllable-based and character-based encoding is considered to be 4:2:1, consideration of a linear transformation factor l indicates that S will rush towards 6l while $\alpha$ tends towards l for the average case.

In better cases, all the characters are likely to be pre-coded by either the syllable-based or the word-based coding stage; after coding with words and syllables, only a small portion of the source text is left for character-based coding. That is, the total saving is likely to be derived from word-based saving. Consequently $S_1 + S_2 \to S$, whereas the character-based saving tends towards zero, $S_3 \to 0$. Again, since $S_3 + \alpha = S_3'$, $\alpha$ too tends towards zero, $\alpha \to 0$.

Again, from clause (i) we get

$$S_1 = S_1' + \alpha_1 \qquad (5)$$

Similarly, from clause (ii),

$$S_2 = S_2' + \alpha_2 \qquad (6)$$

and from clause (iii),

$$S_3 = S_3' - \alpha_3 \qquad (7)$$

where $\alpha_1, \alpha_2, \alpha_3$ are non-negative integer values, because they are all equalizing factors in terms of bits.

It has already been stated that $S_3 \to 0$; consequently $S_3' - \alpha_3$ tends towards zero if and only if $\alpha_3 \to 0$. It can therefore be deduced that $\alpha_1 + \alpha_2 - \alpha_3 = c \ge 0$, i.e.

$$\alpha_1 + \alpha_2 \ge \alpha_3 \qquad (8)$$

Now, adding (5), (6) and (7) gives

$$S_1 + S_2 + S_3 = S_1' + \alpha_1 + S_2' + \alpha_2 + S_3' - \alpha_3$$
$$\Rightarrow S_1 + S_2 + S_3 = S_1' + S_2' + S_3' + c \quad \text{[using (8)]}$$
$$\Rightarrow S_1 + S_2 + S_3 \ge S_1' + S_2' + S_3'$$
$$\Rightarrow S \ge S' \quad \text{[since } c \text{ is non-negative]}$$

That is, the performance <strong>of</strong> dictionary based data<br />

compression would be better for files with larger TTR<br />

Values and the result will deteriorate for lower TTR<br />

valued files. This is the aspect, which motivates us to<br />

provide peer files <strong>of</strong> each category to recognize the<br />

fluctuation <strong>of</strong> performance for dictionary based data<br />

compression.<br />

VII. ANALYSIS AND DISCUSSIONS ON THE PROPOSED CORPUS

In the previous sections, an overview of the corpus has been presented. In this section, we present statistics on the TTR and compression ratio of the files in each group. The experimental results demonstrate that TTR can be employed not only for expressing the effectiveness of a corpus but also as a criterion for constructing a data compression corpus, with ample scope for presenting the performance fluctuation of dictionary-based (i.e. repetition-analysis-oriented) text compression schemes in best-case and worst-case analysis. The maximum TTR for each group is given in Table IV.


TABLE IV
MAXIMUM TTR FOR EACH GROUP OF FILES OF THE PROPOSED CORPUS

File Name      TTR
Article1       2.944
Poem1          1.451
Advertise1     1.170
Speech1        5.846
News1          3.087
SMS1           1.138
Email1         2.681
Particulars1   3.510
Story1         4.862
Report1        3.364




The minimum Type to Token Ratio (TTR) for each group is provided in Table V.

TABLE V
MINIMUM TTR FOR EACH GROUP OF FILES OF THE PROPOSED CORPUS

File Name      TTR
Article2       1.042
Poem2          1.132
Advertise2     1.012
Speech2        2.164
News2          1.611
SMS2           1.103
Email2         1.812
Particulars2   1.489
Story2         2.002
Report2        2.157

We also analyzed the compression ratio of the files using the dictionary-based approach for Bengali text compression provided in [7] and obtained the statistics shown in Fig. 2.

Fig. 2: Relation between Type to Token Ratio and Compression Ratio. For each group (Article, Poem, Advertise, Speech, News, SMS, Email, Particulars, Story, Report) the chart plots the maximum TTR, the minimum TTR, the CR of the maximum-TTR file and the CR of the minimum-TTR file.

Fig. 2 presents a comparative analysis of compression ratio against TTR in terms of the applicability of dictionary-based data compression techniques. Compression Ratio (CR) is a metric of data compression efficiency: it indicates the number of bits required to describe one byte in compressed form, so the lower the compression ratio, the better the compression. In order to evaluate the performance fluctuation of dictionary-based Bengali text compression schemes, we propose to use the peer files, with the assumption that the best case is obtained from the files with the greater TTR values and the worst case occurs for the lower-TTR files. The construction of the dictionary is indeed facilitated by the files with larger TTR.
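For reference, the compression ratio in this bits-per-byte sense can be computed as in the following minimal sketch (the file paths are placeholders, not files from the corpus):

    import os

    def compression_ratio(original_path, compressed_path):
        # Bits needed to represent one byte of the original file; lower is better.
        original_bytes = os.path.getsize(original_path)
        compressed_bytes = os.path.getsize(compressed_path)
        return 8.0 * compressed_bytes / original_bytes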

As there is no existing corpus for the evaluation of Bengali text compression schemes, it is not possible to include a comparative analysis against any reference, and for the same reason there is no compression benchmark against which the effectiveness of the proposed corpus could be measured. This is basically what led us to develop a new scheme for designing a corpus without any reference benchmark. The proposed scheme is also, to some extent, a novel approach for the automatic creation and evaluation of the suitability (or unsuitability) of corpora.

VIII. APPLICABILITY OF THE PROPOSED CORPUS FORMATION SCHEME FOR OTHER LANGUAGES

The proposed method of building a text corpus for the evaluation of Bengali text compression schemes is also applicable to the formation of non-Bengali data compression corpora. In the presented scheme, we use statistical analysis of the source text to decide whether a text will be included, irrespective of its text structure and other linguistic features. This text-structure independence provides great flexibility in applying the proposed concept to non-Bengali text.

We have already presented the theoretical background for choosing the peer files from the viewpoint of dictionary-based compression schemes. Here the same is described in terms of a traditional character-based encoding scheme.

For a character-based encoding scheme, i.e. a single-gram coding approach, the main strategy is to define code-words for each character and then replace each character with the corresponding code-word. The code-words are binary streams, and in practical cases the encoding scheme uses static coding. Static coding is an encoding mechanism in which source components are encoded with non-conflicting variable-length binary streams. In the simplest case, the length of the binary stream used to represent each character is determined by the characters' frequency distribution; often this length determination is obtained through probabilistic, distribution-based component ranking schemes.
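The paper does not prescribe a particular static code; as one standard way of deriving code lengths from a character frequency distribution, a Huffman-style construction could be sketched as follows (illustrative only, not the scheme of [7]):

    import heapq
    from collections import Counter

    def static_code_lengths(text):
        # Frequent characters end up with shorter codes, rare characters with longer ones.
        freq = Counter(text)
        if not freq:
            return {}
        if len(freq) == 1:
            return {ch: 1 for ch in freq}
        heap = [(f, i, {ch: 0}) for i, (ch, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        next_id = len(heap)
        while len(heap) > 1:
            f1, _, d1 = heapq.heappop(heap)
            f2, _, d2 = heapq.heappop(heap)
            merged = {ch: depth + 1 for ch, depth in {**d1, **d2}.items()}
            heapq.heappush(heap, (f1 + f2, next_id, merged))
            next_id += 1
        return heap[0][2]   # mapping: character -> code length in bits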

Let there be n unit source symbols in the source language; that is, let the total number of letters in the source language be n. If the TTR is larger, then the number (frequency) of matching words is also greater and, consequently, the number of distinct words is smaller. Whenever there is a greater number of matching words, it may be assumed that, on average, the number of matching characters will also be greater. This may not hold if the non-repeating or low-repeating words individually contain an unusually large set of repeating characters, so that, even though the frequency of matching characters is high, the sum of the characters present in the non-matching words is greater. That is, for a language with an n-character alphabet, if the text under consideration consists of m characters and there are t_mw matching words, each containing on average a_cw characters, the total pool of matching characters is t_mw * a_cw. Since the language has an n-character alphabet, this pool of matching characters can contain at most n distinct characters; thus the TCR (Type to Character Ratio, by analogy with the TTR, Type to Token Ratio) of the file will be a multiple of u = 1/n, and any file with a greater TTR will have a larger multiple of u. Since in practical cases we may assume that the average deviation of each word length from the average number of characters per word is approximately zero, we may consider the word-length versus character distribution of the source text to be a linear distribution. As a result, it may be assumed that a greater number of frequently occurring words implies frequently occurring characters.

Whenever static coding is used, the code-words are defined according to the frequency of the occurring elements. That is, frequently appearing elements are assigned codes consuming fewer bits, in order to minimize the total overhead (in terms of bit consumption), whereas infrequently occurring elements may be coded up to a greater threshold value. The same may be expressed in terms of the Type to Character Ratio (TCR): elements with larger TCR will be assigned lower overhead, and elements with lower TCR will be assigned greater overhead in comparison. If we want to deploy such a scheme for Bengali text compression, it is effective to build a corpus containing paired components that clearly expose the fluctuation of TCR. That is, to provide a sound basis for evaluation, a suitable corpus should inherently satisfy

$$t_{aw} \propto t_{ac}, \qquad \frac{t_{aw}}{t_{dw}} \propto \frac{t_{ac}}{t_{dc}}, \qquad \frac{t_{aw}}{t_{dw}} \propto \frac{t_{ac}}{n}$$

where
t_aw = total number of words appearing in the source text,
t_dw = total number of distinct words in the source text,
t_ac = total number of characters appearing in the source text,
t_dc = total number of distinct characters in the source text,
n = total number of symbols in the source language.
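The quantities listed above can be gathered directly from a candidate file; a minimal sketch follows (same whitespace-tokenization assumption as before):

    def corpus_statistics(text):
        words = text.split()
        chars = [ch for ch in text if not ch.isspace()]
        t_aw, t_dw = len(words), len(set(words))      # word tokens / distinct words
        t_ac, t_dc = len(chars), len(set(chars))      # characters / distinct characters
        ttr = t_aw / t_dw if t_dw else 0.0
        tcr = t_ac / t_dc if t_dc else 0.0            # Type to Character Ratio
        return {"taw": t_aw, "tdw": t_dw, "tac": t_ac, "tdc": t_dc,
                "TTR": ttr, "TCR": tcr}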

These criteria demonstrate an important, and to some extent essential, aspect of forming a data compression corpus: they facilitate the effective and efficient design and implementation of text compression approaches irrespective of language. Although the traditional use of a data compression corpus is only the evaluation of compression schemes, the evolving nature of data management has motivated (and necessarily forced) the development of corpora that serve as a knowledge base as well as a test-bed for text compression schemes. The analysis presented in this paper establishes the criteria to be taken into consideration when building any dictionary-based text compression scheme, whether it adopts single-gram or multi-gram dictionary-based compression.

IX. CONCLUSION

We have proposed a new corpus, named Ekushe-Khul, for the evaluation of Bengali text compression schemes. The methodology is robust to the extent that it takes both linguistic criteria and compression ratio into account. A further contribution is that we specify new criteria for choosing a text compression corpus, including consideration of the expected devices and of the changing character of text across the communication frameworks for which compression is intended. Although this is an inaugural step towards a Bengali text compression evaluation corpus, further improvement may be achieved by integrating other types and modes of documents, such as technical documents, Bengali dictionaries and Bengali spreadsheets, together with a larger number of training files.

REFERENCES

[1] Ross Arnold and Tim Bell, "A corpus for the evaluation of lossless compression algorithms", Data Compression Conference, pp. 201-210, IEEE Computer Society Press, 1997.
[2] Khair Md. Yeasir Arafat Majumder, Md. Zahurul Islam, and Mumit Khan, "Analysis of and Observations from a Bangla News Corpus", Proceedings of the 9th International Conference on Computer and Information Technology (ICCIT 2006), pp. 520-525, 2006.
[3] Matt Powell, "Evaluating Lossless Compression Algorithms", February 2001.
[4] N. S. Dash, "Corpus Linguistics and Language Technology", 2005.
[5] Official web site of Prothom Alo, www.prothom-alo.com
[6] M. Burrows and D. J. Wheeler, "A block sorting lossless data compression algorithm", Technical report, Digital Equipment Corporation, Palo Alto, CA, 1994.
[7] S. A. Ahsan Rajon, "A Study on Bengali Text Compression Schemes", Research Report, Khulna University, Khulna, June 2008.
[8] Md. Rafiqul Islam and S. A. Ahsan Rajon, "On the Design of an Effective Corpus for Evaluation of Bengali Text Compression Schemes", Proceedings of the 11th International Conference on Computer and Information Technology (ICCIT 2008), 25-27 December 2008, Khulna, Bangladesh, pp. 236-241.
[9] Niladri Sekhar Dash, "Some Techniques Used for Processing Bengali Corpus to Meet New Demands of Linguistics and Language Technology", SKASE Journal of Theoretical Linguistics, vol. 4, no. 2, 2007, ISSN 1336-782X.
[10] Avik Sarkar and Anne De Roeck, "A Framework for Evaluating the Suitability of Non-English Corpora for Language Engineering" [online].
[11] Akshar Bharati, Rajeev Sangal and Sushma M. Bendre, "Some Observations Regarding Corpora of Indian Languages", Proceedings of the International Conference on Knowledge-Based Computer Systems (KBCS-98), 17-19 December 1998, NCST, Mumbai.
[12] Niladri Sekhar Dash and Bidyut Baran Chaudhuri, "Why do we need to develop corpora in Indian languages?" [online].
[13] Upal Garain and B. B. Chaudhuri, "A Complete Printed Bangla OCR System", Pattern Recognition, vol. 31, no. 5, pp. 531-549, Elsevier Science Limited, 1998.


Md. Rafiqul Islam obtained his Master of Science (M.S.) in Engineering (Computers) from the Azerbaijan Polytechnic Institute (at present Azerbaijan Technical University) in 1987 and his Ph.D. in Computer Science from Universiti Teknologi Malaysia (UTM) in 1999. His research areas include the design and analysis of algorithms and information security. Dr. Islam has published a number of papers related to these areas in national and international journals as well as in refereed conference proceedings.

He is currently working as the Head of Computer Science and Engineering Discipline, Khulna University, Bangladesh.

S. A. Ahsan Rajon is an Adjunct Faculty of Computer Science and Engineering Discipline, Khulna University, Khulna. He completed his graduation from the same discipline in April 2008. He is also pursuing an M.B.A. in Business Administration Discipline under the Management and Business Administration School of the same university. Rajon has published several papers in international conferences. His research interests include data engineering and management, electronic commerce and ubiquitous computing. Currently he is working on robotics.

He is a member of the Institute of Engineers, Bangladesh (IEB).



A Corpus-based Evaluation of a Domain-specific Text to Knowledge Mapping Prototype

Rushdi Shams
Department of Computer Science and Engineering, Khulna University of Engineering & Technology (KUET), Bangladesh
Email: rushdecoder@yahoo.com

Adel Elsayed
M3C Research Lab, the University of Bolton, United Kingdom
Email: a.elsayed@bolton.ac.uk

Quazi Mah-Zereen Akter
Department of Computer Science & Engineering, University of Dhaka, Bangladesh
Email: mahzereen@yahoo.com

Abstract — The aim of this paper is to evaluate a Text to Knowledge Mapping (TKM) prototype. The prototype is domain-specific; its purpose is to map instructional text onto a knowledge domain. The context of the knowledge domain is the DC electrical circuit. During development, the prototype was tested with a limited data set from the domain. The prototype has reached a stage where it needs to be evaluated with a representative linguistic data set, called a corpus. A corpus is a collection of text drawn from typical sources which can be used as a test data set to evaluate NLP systems. As there is no available corpus for the domain, we developed and annotated a representative corpus. The evaluation of the prototype considers two of its major components: the lexical components and the knowledge model. Evaluation of the lexical components enriches the lexical resources of the prototype, such as vocabulary and grammar structures, which enables the prototype to parse a reasonable proportion of the sentences in the corpus. While dealing with the lexicon was straightforward, the identification and extraction of appropriate semantic relations was much more involved. It was therefore necessary to manually develop a conceptual structure for the domain in order to formulate a domain-specific framework of semantic relations. The framework of semantic relations that resulted from this study consists of 55 relations, of which 42 have inverse relations. We also conducted rhetorical analysis on the corpus to demonstrate its representativeness in conveying semantics. Finally, we conducted topical and discourse analysis on the corpus to analyze the coverage of discourse by the prototype.

Index Terms — Corpus, Knowledge Representation, Ontology, Lexical Components, Knowledge Model, Conceptual Structure, Semantic Relations, Discourse Analysis, Topical Analysis

I. INTRODUCTION

The Text to Knowledge Mapping (TKM) prototype [1] is a domain-specific NLP system whose purpose is to parse instructional text and to model it with its pre-defined ontology. During development, the prototype was tested with a limited data set from the domain, namely instructional text on DC electrical circuits. The prototype has reached a stage where its lexical components and knowledge model need to be evaluated with a representative linguistic data set, a corpus: a collection of text drawn from typical sources. Information retrieval during parsing, the activation of concepts, and relating them through predicate and semantic relations all contribute to mapping and modelling domain-specific text onto its knowledge domain. Therefore, the usability of the TKM prototype as a specialized knowledge representation tool for the domain depends on the evaluation of its lexical components (vocabulary and grammar structures), its knowledge model (ontology), and its coverage of discourse.

An important precondition for evaluating NLP systems is the availability of a suitable set of language data, a corpus, as test and reference material [2]. An extensive web-based search did not reveal any corpus for the domain of DC electrical circuits. We therefore needed to develop a representative corpus to evaluate the prototype, because a representative corpus reflects the way language is used in the domain [3]. A usable corpus requires various annotations according to the scope and type of evaluation. As we intend to evaluate both the lexical components and the knowledge model of the TKM prototype, the corpus should be annotated with information such as Parts of Speech (POS) tags, phrasal structure annotations, and stem word tags. These annotations allow us to adjust the lexical components of the prototype according to the qualitative and quantitative layers [1][4] of its knowledge model. Evaluating the knowledge representation of the prototype further demands both the development of a domain-specific ontology and a generic framework of semantic relations in the domain. The evaluation helps to develop a representative knowledge representation tool for the domain of DC electrical circuits.

In this paper, we propose a stochastic development procedure for a domain-specific representative corpus that is used to evaluate two major components of the TKM prototype. We present the detailed procedure of the corpus-based evaluation of an NLP system, which includes enriching the lexicon and morphological database, testing the parsing ability of the prototype, and adjusting the lexical components according to the linguistic information in the corpus. We also developed an ontology according to human conceptualization. As successful knowledge representation depends on the predicate and semantic relations in the text, we developed a framework of semantic relations with which any NLP system can read and interpret text in the domain. Finally, we evaluated the coverage of discourse by the TKM prototype with a topical and discourse analysis of the corpus.

The remainder of this paper is organized as follows. Section II discusses corpus-based evaluations of various NLP systems. Section III describes the proposed procedure for representative corpus development and annotation. Section IV describes the evaluation of the lexical components of the TKM prototype, such as the vocabulary and grammar structure. Section V outlines the development of an ontology and a framework for semantic relations; the section also includes the rhetorical and topical analysis. Section VI concludes the paper.

II. RELATED WORK

A text-based, domain-specific NLP system can be evaluated according to the type, context, or discourse of text from the domain, although no established agreement has been reached on test sets and training sets [5]. Corpora are no longer restricted to research in linguistics [6]; they are becoming the principal resource for evaluating such domain-specific NLP systems. Many NLP systems, such as the Saarbrucker Message Extraction System (SMES) [8], have been tested with a corpus, since proper evaluation depends on a representative test data set like a corpus [7]. A corpus contains structured, variable, but representative text; a corpus is said to be representative if the findings from it can be generalized to a language, or to a particular aspect of a language, as a whole [3]. Corpus-based evaluations such as those of MORPHIX [9] and MORPHIX++ [7] showed that evaluation with a representative corpus results in proper adjustments. MORPHIX++ was tested with a corpus, and systematic inspection revealed necessary adjustments such as missing lexical entries, discrepant morphology, and incomplete or erroneous single words.

NLP systems use either pre-defined or customized grammar rules. For instance, the lexical components of the TKM prototype use Combinatory Categorial Grammar (CCG) [10]. The prototype follows specific clausal and phrasal structures according to CCG. As it follows a particular grammar, we need to adjust the grammar and phrasal structures according to the structures of text from the domain. For example, on its early tests the TKM prototype was able to parse simple sentences only [35]. This becomes a drawback if the majority of text in the domain is written in compound and complex sentences; necessary adjustments to the CCG can let the prototype parse compound and complex sentences as well. In addition, NLP systems may recognize specific clausal and phrasal structures which may be absent in domain-specific text. For example, if an NLP system uses grammars that handle one subject and one object, both parsing and knowledge extraction from domain-specific text become difficult if the majority of the text contains more than one subject and one object. These linguistic properties of domain-specific text raise the issue of adjustment. The lexicographical resources of such systems can be increased by analyzing linguistic patterns in a domain-specific corpus. Statistical data such as the frequency of words, the number of simple, complex, or compound sentences, and the number of subjects and objects present in the sentences assist in adjusting the lexical components of the systems. The grammar structure MORPHIX++ supported was not efficient in its early days; it was adjusted and extended according to the corpus used as its test suite.

The text in a corpus can convey ambiguity to a knowledge mapping prototype if its knowledge model differs from human cognition. For the sentence 'a resistor is both a circuit component and a diagrammatic representation', the role of a resistor is either a component in a physical connection or a component in a diagram. To differentiate between them, the machine has to conceptualize the domain like a human, and we need semantic relations in the text to conceptualize the domain. If a knowledge model is developed with domain-specific semantic relations, the machine can identify the proper role played by a concept in the domain. Semantic relations for a large domain can be obtained by developing a conceptual structure of the domain with concept maps, as they represent both textual and semantic relations graphically [11].

A team at the Information Sciences Institute of the University of Southern California was working on computer-based authoring and suffered from the unavailability of a theory of discourse structure. In response, Rhetorical Structure Theory (RST) was developed out of studies of edited or carefully prepared text from a wide variety of sources; it now has a status in linguistics that is independent of its computational uses [29]. RST is an approach to the study of text organization which conceptualizes, in relational terms, a domain within the semantic stratum [30]. After its formulation in the 1980s, RST became an emerging area of research for computational linguistics and eventually drew the attention of researchers in natural language processing.

Discourse analysis helps in understanding the behaviour of a domain-specific NLP system in its discourse. A corpus is a strong source for discourse analysis, as the linguistic and semantic relations confined in it play an important role in manifesting, adjusting, and extending systems to attune them to their discourse. Studies such as [31], [32], and [33] incorporated corpora in discourse analysis, with emphasis on finding linguistic relations, manual annotation, and the correlation of discourse structures.



III. CORPUS DEVELOPMENT

In this section, we discuss the development approach of a domain-specific corpus, the proof of its representativeness, and its annotation procedure.

A. Development Approach

As extensive web searches did not reveal any corpus for the domain of DC electrical circuits, we initially used WebBootCaT [12] to develop a representative corpus. We developed five corpora with WebBootCaT and analyzed them by comparing the number of distinct domain-specific terms with the number of distinct words present. The significant difference between these two numbers, and the inconsistency in corpus size shown in Figure 1, indicate that web-based tools are not suitable for developing domain-specific corpora.

Figure 1. Inconsistency of WebBootCaT in developing a domain-specific corpus.

Therefore, we decided to develop the corpus manually and collected text from 141 web resources containing 1,029 sentences and 18,834 words. During the development, we left out non-textual information (e.g., equations and diagrams), as the TKM prototype operates only on text.
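The comparison itself is straightforward. The sketch below is illustrative only, with a hypothetical corpus file name and a hand-picked term list; it counts distinct domain-specific terms against distinct word forms, the two quantities contrasted in Figure 1.

```python
# Illustrative sketch (hypothetical file name and term list): count distinct
# domain-specific terms versus distinct word forms in a plain-text corpus.
import re

def distinct_counts(corpus_path, domain_terms):
    with open(corpus_path, encoding="utf-8") as f:
        text = f.read().lower()
    words = set(re.findall(r"[a-z]+", text))          # distinct word forms
    terms = {t for t in domain_terms if t in words}   # domain terms that occur
    return len(terms), len(words)

domain_terms = {"resistor", "voltage", "current", "circuit", "battery"}
n_terms, n_words = distinct_counts("dc_circuit_corpus.txt", domain_terms)
print(f"{n_terms} distinct domain terms vs {n_words} distinct words")
```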

B. Representativeness of the Corpus

The representativeness of the corpus can be justified with the notion of saturation, or closure, described by [13]. At the lexical level, saturation can be tested by dividing the corpus into equal sections in terms of the number of words or any other parameter. If another section of identical size is added, the number of new items in the new section should be approximately the same as in the other sections [14].

Figure 2. Representativeness of the corpus with technical terms, verbs, prepositions, and coordinators.



To determine the representativeness of the corpus, it was segmented into 15 samples, each comprising 1,267 words on average. We plotted the cumulative frequency of the most frequent technical terms in the samples. Figure 2 shows that the presence of the domain-specific technical terms becomes stationary after a few samples. This is one of the criteria showing the representativeness of the corpus: after a certain point, no matter how much text we add to the corpus, the frequencies of the terms remain stationary. Similarly, we counted the frequency of non-technical words in the corpus and grouped them according to their parts of speech. Statistics on verbs, prepositions, and coordinators in Figure 2 show that the corpus is saturated after sample 11.
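The saturation check described above can be sketched as follows; the corpus file name and the tracked term set are hypothetical. The point is simply that the count of first-time items per equal-sized sample should flatten as the corpus becomes representative.

```python
# Sketch of the saturation (closure) check: split the token stream into equal
# samples and report, per sample, the cumulative frequency of tracked items
# and how many of them appear for the first time.
def saturation_profile(tokens, tracked, n_samples=15):
    size = max(1, len(tokens) // n_samples)
    seen, cumulative, profile = set(), 0, []
    for i in range(n_samples):
        sample = tokens[i * size:(i + 1) * size]
        hits = [t for t in sample if t in tracked]
        new_items = set(hits) - seen
        seen |= new_items
        cumulative += len(hits)
        profile.append((i + 1, cumulative, len(new_items)))
    return profile  # saturation: the new-item counts flatten toward zero

# Hypothetical corpus file and tracked technical terms.
tokens = open("dc_circuit_corpus.txt", encoding="utf-8").read().lower().split()
for sample_no, cum_freq, new_items in saturation_profile(tokens, {"resistor", "voltage", "current"}):
    print(sample_no, cum_freq, new_items)
```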

We also counted the frequency of sentence types in the corpus. As the domain contains instructional text, most of which consists of simple sentences, this needs to be reflected in the corpus as well. Figure 3 shows that the majority of the text (in percentage terms) consists of simple sentences.

Figure 3. Sentence structure in the corpus.

C. Corpus Annotation

To annotate the corpus with POS tags, the Cognitive Computation Group POS tagger [15] was used, as it works on the basis of learning techniques like the Sparse Network of Winnows (SNoW). The corpus is annotated with nine parts of speech, namely noun, pronoun, verb, adverb, adjective, preposition, coordinator, determiner, and modal. The phrasal structure of the corpus has been annotated with the slash-notation grammar rules defined by CCG. We developed an XML version of the corpus with seven tags.
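For illustration only, the snippet below builds one possible XML layout for such an annotated sentence; the element and attribute names are hypothetical, since the seven tags actually used are not listed here.

```python
# Illustrative only: one possible XML annotation layout for the corpus.
# Element and attribute names here are hypothetical.
import xml.etree.ElementTree as ET

sentence = ET.Element("sentence", id="1", type="simple")
for word, pos, stem in [("the", "determiner", "the"),
                        ("current", "noun", "current"),
                        ("flows", "verb", "flow")]:
    token = ET.SubElement(sentence, "token", pos=pos, stem=stem)
    token.text = word

print(ET.tostring(sentence, encoding="unicode"))
```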

IV. EVALUATION OF LEXICAL COMPONENTS

The evaluation of the vocabulary and the grammar structure of the prototype is illustrated in this section. The section also reports the parsing efficiency and the richness of the lexical entries of the prototype.

A. Evaluation of Vocabulary

The lexicon of the prototype is mapped onto the unique words of the corpus. The words present both in the morphology and in the corpus constitute the vocabulary of the prototype. Initially, only five percent of the vocabulary was covered by the prototype (Table I).


TABLE I.
PRELIMINARY VOCABULARY COVERAGE OF THE TKM PROTOTYPE

Words in Morphology and in Corpus | Unique Words in the Corpus | Vocabulary Coverage
101 | 1,902 | 5%

MORPHIX++, a second-generation NLP system, covered 91 percent of the words in the corpus developed to evaluate it. The reason behind this difference is that the augmentation of the vocabulary of MORPHIX++ ran in parallel with the development of the system, whereas the main focus in the case of the TKM prototype was to develop an operational system first rather than to increase its vocabulary.

We used the POS tags of the corpus to populate the lexicon. We retrieved every distinct word for each distinct POS from the corpus and simply added it if that word was absent from the lexicon. The number of entries added to the lexicon is shown in Table II. On completion of the process, the vocabulary of the prototype covers 90 percent of the corpus (Table III).

TABLE II.
AUGMENTATION OF LEXICAL ENTRIES IN THE TKM PROTOTYPE

POS | Augmented Entries
Determiner | 19
Coordinator | 5
Noun and Pronoun | 2,094
Adjective | 364
Preposition | 71
Adverb | 177
Verb | 264

TABLE III.
VOCABULARY OF THE TKM PROTOTYPE

Words in Morphology and in Corpus | Unique Words in the Corpus | Vocabulary Coverage
1,783 | 1,902 | 90%
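A minimal sketch of this coverage computation and lexicon augmentation, assuming the lexicon is a simple word-to-POS mapping and the annotated corpus a list of (word, POS) pairs (both hypothetical structures), is given below.

```python
# Sketch (hypothetical data structures): vocabulary coverage is the share of
# unique corpus words already in the lexicon; augmentation adds every missing
# (word, POS) pair retrieved from the annotated corpus.
def vocabulary_coverage(lexicon, corpus_tokens):
    unique_words = {w for w, _ in corpus_tokens}
    return len(unique_words & set(lexicon)) / len(unique_words)

def augment_lexicon(lexicon, corpus_tokens):
    added = 0
    for word, pos in corpus_tokens:
        if word not in lexicon:
            lexicon[word] = pos
            added += 1
    return added

lexicon = {"circuit": "noun", "the": "determiner"}
corpus_tokens = [("the", "determiner"), ("current", "noun"), ("flows", "verb")]
print(f"coverage before: {vocabulary_coverage(lexicon, corpus_tokens):.0%}")
print(f"entries added:   {augment_lexicon(lexicon, corpus_tokens)}")
print(f"coverage after:  {vocabulary_coverage(lexicon, corpus_tokens):.0%}")
```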

B. Evaluation of Grammar

The TKM prototype struggles to parse modals and auxiliary verbs because CCG does not provide any specification for categorizing modals into finite and non-finite [16]. We defined grammar formalisms for modals and adjusted the lexicon, which increased the ability of the prototype to parse modals.

CCG does not have any mechanism for phrasal structures like adjective-adjective-noun, although research has shown that numerous adjectives can be placed before a noun [17]. Besides regular adjectives, we defined grammar formalisms for noun equivalents (e.g., 'two common types of circuits'), participle equivalents (e.g., 'the connected wire'), gerund equivalents (e.g., 'the conducting material'), and adverb equivalents (e.g., 'the above circuit is series circuit') of adjectives, which increased the rate of parsing adjectives.

CCG is unable to parse sentences that start and end with a prepositional phrase [18]. For example, 'in series circuit, the current is a single current' is not parsed by CCG, whereas 'the current is a single current in series circuit' is sometimes parsed. The lexicon the prototype uses has nine different types of prepositions, and sometimes it is difficult even to identify regular prepositions. For instance, in 'the sum of potential differences in a circuit adds up to zero voltage', the word 'up' is not treated as an adverb in regular grammars; such words are called particles, prepositions that have no object and require specific verbs with them (e.g., 'throw out', 'add up'). The parsing ability of the prototype increased as we defined grammar rules for such prepositions. The complementizer, although it is a form of preposition, is not recognized by CCG. Adverbs, on the other hand, have strong coverage in CCG; however, adverbs often sit at the end of a sentence, and CCG does not provide any category for these, although it has fully featured categories for the other two positions of an adverb in a sentence, namely adverbs that start a sentence or sit in its middle. These issues have been resolved by adding new grammar rules.

The lexicon has two categories for coordinators: those sitting at the beginning of a sentence (e.g., since, as) and those relating two clauses (e.g., and, or). CCG defines that coordinators can appear only between two noun phrases, with the category np\np/np, but in the sentence 'series and parallel circuits are the types of circuits' the coordinator has the category n\n/n rather than np\np/np. CCG handles adverbs and conjunctions well, but it seriously lags in handling sentences containing repeated verbs, as in 'the sum of current flowing into the junction is eventually equal to the sum of current flowing out of the junction', where the identical verb 'flowing' (a gerund) appears twice along with another verb ('be'). Moreover, a verb has to be present in a sentence to form a predicate-argument structure, but we discovered sentences which do not contain any verb, such as 'the bigger the resistance, the smaller the current'. A gerund is a verb form used as a noun, formed by adding -ing to the verb; for example, in 'current flowing into a junction is equal to the current flowing out of the junction', 'flowing' is a gerund. Gerunds are not treated as nouns in CCG; consequently, sentences containing gerunds struggle to be parsed.

After creating grammar rules and phrasal structures and adding them to the lexicon and morphology of the prototype, the parsing ability of the prototype increased to 31 percent (Table IV). Although the prototype had previously been tested with a limited data set, it was unable to parse any sentence from the corpus before the evaluation.

TABLE IV.
PARSING ABILITY OF THE TKM PROTOTYPE

State of the Prototype | Total Sentences | Parsed Sentences | Efficiency
Preliminary | 1,029 | 0 | 0%
Evaluated | 981 | 300 | 31%

We analyzed the 300 sentences parsed by the prototype and determined the number of subjects, objects, and verbs they contain. In Figure 4, we see that the prototype works well when the number of subjects and objects in a sentence does not exceed two and the number of verbs does not exceed one.

The inefficiency of the prototype in parsing sentences is due to the absence of phrasal structures (and hence categories): 69 percent of the sentences in the corpus have phrasal structures that are not supported by the CCG structure. It should be noted that the prototype fails to parse a sentence even when just one category is absent. For example, 'One simple DC circuit consists of a voltage source (battery or voltaic cell) connected to a resistor' is not parsed by the prototype because of the absence of the category of the conjunction 'or' (np\n/np) and the category of the verb 'connected' (s\np/pp). These absent categories are identified in the corpus so that modification of the lexicon becomes easier.

Figure 4. Number of subjects, objects, and verbs in the sentences parsed by the TKM prototype.
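To make the category-coverage point concrete, the toy sketch below (not the prototype's actual lexicon format, and with illustrative CCG categories and a simplified version of the example sentence) records categories per word and reports which words lack one, the situation that blocks a parse.

```python
# Toy CCG-style lexicon: words mapped to category strings (categories are
# illustrative, not the prototype's actual assignments).
ccg_lexicon = {
    "one": ["np/n"], "simple": ["n/n"], "dc": ["n/n"], "circuit": ["n"],
    "consists": ["s\\np/pp"], "of": ["pp/np"], "a": ["np/n"],
    "voltage": ["n/n"], "source": ["n"], "battery": ["n"],
    "to": ["pp/np"], "resistor": ["n"],
    # "or" and "connected" are deliberately absent, mirroring the
    # missing categories discussed above.
}

def words_without_category(sentence, lexicon):
    # A parse cannot succeed if any word lacks a category; list those words.
    return [w for w in sentence.lower().split() if w not in lexicon]

sentence = "One simple DC circuit consists of a voltage source or battery connected to a resistor"
print("words without a CCG category:", words_without_category(sentence, ccg_lexicon))
# -> ['or', 'connected']
```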


V. EVALUATION OF KNOWLEDGE MODEL

In this section, we discuss the procedure for developing a domain-specific ontology and a framework for semantic relations. The results of the rhetorical, topical, and discourse analyses are also outlined in this section.

A. Ontology and Framework for Semantic Relations

Of the 300 parsed sentences, the prototype is able to map only 10 percent effectively onto its pre-built ontology. We investigated the ontology and found that it had not been developed according to a representative data set like our corpus. We therefore decided to develop an ontology for the domain-specific corpus that helps to adjust the knowledge model of the TKM prototype.

We developed the ontology in a way similar to how a human conceptualizes a domain. In conjunction with the development of the ontology for the domain, we developed a framework for semantic relations. The framework is built upon the framework proposed for the FACTOTUM thesaurus [19][20]. These semantic relations help to represent hierarchical knowledge apart from predicate information.

We conceptualized every sentence in the corpus manually. The outcome of the conceptualization led us to develop concepts and relations among them and to represent them graphically as concept maps with CmapTools [21].

To illustrate this procedure, for the sentence 'One simple DC circuit consists of a voltage source (battery or voltaic cell) connected to a resistor', we first conceptualized the sentence in the following manner:
1. DC circuit has voltage source as its component.
2. Battery and voltaic cell are voltage sources.
3. Battery and voltaic cell have similarity.
4. Voltage source can be connected to resistor.
5. DC circuit has resistor as its component.
6. As they all satisfy the properties of a circuit, DC circuit is a type of circuit.

We used this information to develop base-level concept maps that represent the predicate relations in the text. To develop higher-level concept maps, we need to group concepts and to find relations among the groups. For this particular sentence, we defined groups named Circuit and Circuit Component. We assigned DC Circuit and Circuit to the group Circuit, and the rest of the concepts to the group Circuit Component. We can also find a relation between these two groups: a circuit is made of circuit components. For the sentence 'Resistors in the diagram are in parallel', the concept resistor would instead be assigned to the group of concepts called Diagrammatic Notation rather than Circuit Components. This process of grouping the concepts from the base-level concept maps and finding relations among the groups produced four levels of concept maps for the corpus. The conceptual structure of the domain comprises all the concept maps that resulted from human conceptualization at these four levels.

The predicate relations in the sentence are as follows:
1. DC Circuit [Have Component] Voltage Source
2. Battery [Type Of] Voltage Source
3. Voltaic Cell [Type Of] Voltage Source
4. Battery [Is] Voltaic Cell
5. Voltage Source [Connected To] Resistor
6. Battery [Connected To] Resistor
7. Voltaic Cell [Connected To] Resistor
8. DC Circuit [Have Component] Resistor
9. DC Circuit [Type Of] Circuit

These relations are then analyzed to initiate the development of the framework for the semantic relations in the text. The analysis provides the following categories of semantic relations:
1. Relations which describe parts that are physically related (e.g., Have Component);
2. Relations which describe similarity, namely hyponymy (e.g., Type Of) and synonymy (e.g., Is);
3. Relations which describe hierarchy or class (e.g., Type Of);
4. Relations which describe spatial relations, specifically the location of objects (e.g., Connected To).

As we represent knowledge by conceptualization followed by mapping of linguistic information onto the knowledge model, the prototype can map knowledge from the text onto the ontology efficiently. For example, the prototype can now provide the user with knowledge such as 'voltage source is a physical part of the DC circuit', which is not stated in the sentence literally but semantically.
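As an informal illustration of this step (not the prototype's internal representation), the predicate relations above can be stored as subject-relation-object triples and lifted to relations between the concept groups:

```python
# Illustration only: predicate relations as (subject, relation, object) triples,
# plus the grouping of base-level concepts used to build higher-level concept maps.
triples = [
    ("DC Circuit", "Have Component", "Voltage Source"),
    ("Battery", "Type Of", "Voltage Source"),
    ("Voltaic Cell", "Type Of", "Voltage Source"),
    ("Voltage Source", "Connected To", "Resistor"),
    ("DC Circuit", "Have Component", "Resistor"),
    ("DC Circuit", "Type Of", "Circuit"),
]

concept_groups = {
    "Circuit": {"DC Circuit", "Circuit"},
    "Circuit Component": {"Voltage Source", "Battery", "Voltaic Cell", "Resistor"},
}

def group_of(concept):
    return next((g for g, members in concept_groups.items() if concept in members), None)

# Lift base-level triples to relations between groups,
# e.g. ('Circuit', 'Have Component', 'Circuit Component').
group_relations = {(group_of(s), r, group_of(o)) for s, r, o in triples}
for relation in sorted(group_relations):
    print(relation)
```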

As we developed the conceptual structure for the corpus with CmapTools, the total number of concepts and relations increases while the number of new concepts and relations decreases. On completion, the numbers of concepts and relations were plotted against the corpus size. Figure 5 shows the cumulative increase in the number of concepts and relations; the plateau indicates that these numbers are becoming stationary.

Figure 5. The number of concepts and relations in the corpus is becoming stationary.

We also plotted the number of new concepts and relations against the corpus size (Figure 6). The plateau in Figure 6 shows that the number of new concepts and relations is becoming stationary. These two observations led us to the conclusion that if we put the semantic relations in the corpus into a framework, the framework will be representative.

Figure 6. The number of new concepts and relations in the corpus is becoming stationary.

We found 97 predicate relations and 166 concepts in the corpus, and we developed Tier 2 of our framework (Table V) to support these relations. Afterwards, we grouped the level 0 concepts and relations to produce level 1 concept maps. As we came across new predicate relations, we created Tier 1 of our framework to support the semantic relations in Tier 2. These two tiers of semantic relations comprise the domain-specific framework for semantic relations and can support all the predicate relations of the domain. In essence, the level 0 concept maps contain the predicate relations, and the semantics they convey are supported by the relations in Tier 2; predicate relations in the level 1 and level 2 concept maps are supported by the Tier 1 semantic relations. The ontology, along with the concept maps, is depicted in [34].

TABLE V.
FRAMEWORK FOR SEMANTIC RELATIONS IN THE CORPUS

Relation Category | Tier 1 Semantic Relations | Tier 2 Predicate Relations | Tier 2 Inverse Predicate Relations
Hierarchy | | Have type | Type of
Physically Related | Parts | Have component | Component of
Physically Related | Constituent Material | Make, Produce | Made of, Produced by
Spatial Relations | Location of Objects | Connected to, Flows through, Have direction | Take place between, Direction of
Spatial Relations | Location of Activities | Transfer, Find, Divide, Commence from | Transferred by, Found by, Divided by, End to
Causally/Functionally Related | Effect/Partial Cause | Affect, Cause, Vary in, Resist, Force, Limit, Opposite to, Related to | Affected by, Caused by, Resisted by, Forced by, Limited by
Causally/Functionally Related | Production/Generation | Produce | Produced by
Causally/Functionally Related | Destruction | Collide, Melt | Collided by, Melted by
Causally/Functionally Related | Manifestation | Represent | Represented by
Causally/Functionally Related | Conversion | Convert, Convertible to | Converted by, Convertible from
Instrumental Functions | Function/Usage | Carry, Measure, Supply, Share, Depend on, Protect, Absorb | Carried by, Measured by, Supplied by, Shared by, Depended by, Protected by, Absorbed by
Instrumental Functions | Use | Use, Do not use | Used by, Not used by
Human Role | | Deal with | Dealt by
Topic | | Govern | Governed by
Conceptually Related | Representation | Represent, Characterize | Represented by, Characterized by
Conceptually Related | Property | Have state, Have unit, Have source, Have Magnitude, Have Terminal | State of, Unit of, Source of, Magnitude of, Terminal of
Similarity | Synonymy | Is, Referred to | Is
Similarity | Hyponymy | Have type | Type of
Quantitative Relations | Numerical Relations | Proportional, Inverse proportional to, Gain, Lose, Do not gain, Do not lose |
Instantiation | | Have instance | Instance of
Extension | | Have Extension | Extension of
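A minimal sketch of how such a two-tier framework with inverse relations might be consulted is shown below; the dictionary covers only a few rows of Table V and is illustrative rather than the prototype's actual data structure.

```python
# Illustrative subset of Table V: each Tier 2 predicate relation stored with
# its inverse and the Tier 1 relation it falls under.
FRAMEWORK = {
    # predicate relation: (inverse predicate relation, Tier 1 relation)
    "Have type":      ("Type of", "Hierarchy"),
    "Have component": ("Component of", "Parts"),
    "Produce":        ("Produced by", "Production/Generation"),
    "Use":            ("Used by", "Use"),
    "Have instance":  ("Instance of", "Instantiation"),
}

def invert(subject, relation, obj):
    # Return the inverse triple, e.g. ('Resistor', 'Component of', 'DC Circuit').
    inverse, _tier1 = FRAMEWORK[relation]
    return (obj, inverse, subject)

print(invert("DC Circuit", "Have component", "Resistor"))
```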

B. Rhetorical Analysis

To find the stereotypical relations in the domain, RST, proposed by Mann and Thompson [22], is used as a descriptive tool. Work such as Rosner and Stede [23] and Vander Linden [24] also used this framework for the rhetorical analysis of their corpora. We used a framework based on the work of Hunter [25], who outlines the structural model of the content of information for second-language learning materials proposed within the frame of machine-mediated communication [26]. This framework defines text structures, textual expressions, and information structures within domain-specific text.

One common characteristic of expository text is the use of text structures. Text structures refer to the semantic and syntactic organizational arrangements used to present written information. The text structures used in the analysis include introduction, background, methodologies, results, observations, and conclusions.

Textual expressions are relations that describe the nature of a sentence at the phrase level and eventually outline the type of the sentence. These are all mononuclear relations; they do not depend on the semantics of the adjacent sentences. We used the following textual expressions in the analysis: common knowledge, cite, report, explanation, claim, evaluation, inference, and decision.

Information structures are used at both the phrase level and the sentence level in the analysis. We analyzed the meaning of each sentence and its impact on juxtaposed sentences with relations such as description, classification, comparison, sequence, cause-effect, and contrast.

We used RSTTool [27] to annotate the corpus with the rhetorical relations. The procedure shows that the corpus has 2,701 relations grouped into 19 rhetorical relations (Table VI).

The high means of the relations background (13 percent) and observations (7 percent) are significant, as the corpus contains instructional text, and instructional text mostly describes the background and observation of events [28]. The analysis also shows that most of the text is descriptive (30 percent) and presented as report (20 percent), which supports the representativeness of the corpus in terms of the semantic relations it contains. The qualitative layer of the prototype deals with causal relationships between concepts, so the cause-effect relation (2.14 percent) is of particular interest; we found that the prototype is able to map about 70 percent of the causal relations in the text.

TABLE VI.
RHETORICAL RELATIONS IN THE CORPUS

Rhetorical Structures | Rhetorical Relations | Appearance | Mean
Text Structures | Introduction | 117 | 4.33%
Text Structures | Background | 356 | 13.19%
Text Structures | Methodologies | 60 | 2.22%
Text Structures | Results | 51 | 1.89%
Text Structures | Observations | 180 | 6.66%
Text Structures | Conclusions | 42 | 1.55%
Textual Expressions | Common Knowledge | 74 | 2.74%
Textual Expressions | Report | 545 | 20.18%
Textual Expressions | Explanation | 192 | 7.11%
Textual Expressions | Claim | 85 | 3.15%
Textual Expressions | Evaluation | 3 | 0.11%
Textual Expressions | Inference | 13 | 0.48%
Textual Expressions | Decision | 59 | 2.18%
Information Structures | Description | 817 | 30.25%
Information Structures | Classification | 13 | 0.48%
Information Structures | Comparison | 25 | 0.93%
Information Structures | Sequence | 34 | 1.26%
Information Structures | Cause-effect | 58 | 2.14%
Information Structures | Contrast | 17 | 0.63%
Total | | 2,701 | 100%
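The Mean column is each relation's share of the 2,701 annotated relations; the short check below (counts copied from a few rows of Table VI) recomputes those shares, and the results agree with the table to within rounding.

```python
# Recompute the Mean column of Table VI for a few rows: each rhetorical
# relation's share of the 2,701 annotated relations.
counts = {"Background": 356, "Observations": 180, "Report": 545,
          "Description": 817, "Cause-effect": 58}
total = 2701
for relation, n in counts.items():
    print(f"{relation}: {100 * n / total:.2f}%")
```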

C. Topical Analysis

We analyzed the topical progression of the corpus because the prototype both handled and failed to handle text on various topics. The analysis helps us to determine the context and discourse awareness of the prototype. The prototype is not developed to parse and map text of any particular topic or context; it should represent the whole domain. Since we have a representative corpus, the topical analysis can help us understand the topical coverage of the prototype's context and discourse.

First, we annotated the corpus with three types of topical progression: parallel progression, sequential progression, and extended parallel progression. As the topic in the text progresses, we indented the text of the corpus according to the type of progression it belongs to; for example, one indentation marks the starting topic, the next a sequential topic originated from it, the next a sequential topic originated from that one, and a further indentation an extended parallel topic of an earlier one (Figure 7). On completion, we found six indentations of topical progression in the corpus.

Putting more resistors in the parallel circuit decreases the total resistance because the electricity has additional branches to flow along and so the total current flowing increases.
This is very useful because it means that we can switch the lamp on and off without affecting the other lamps.
The brightness of the lamp does not change as other lamps in parallel are switched on or off.
For this reason, lamps are always connected in parallel.

Figure 7. Annotation of the corpus with topical progression.



Second, we counted the total number of sentences in each indentation of the corpus. As expected, indentation 1 covers most of the corpus and indentation 6 has the fewest sentences. We also counted the number of sentences the TKM prototype handled in each indentation to find its topical coverage. The further a topic progresses away from the context, the greater the possibility of not understanding the context; however, the prototype showed that even when a topic is six indentations away from the original context, it can represent the knowledge (Table VII). The prototype efficiently handled language and knowledge on topics that are four, five, and six indentations away from the original context, with 38, 36, and 33 percent coverage, respectively. Topics nearer to the starting context, in contrast, are covered relatively poorly, with 25 and 26 percent coverage.

TABLE VII.
TOPICAL ANALYSIS OF THE CORPUS AND TKM PROTOTYPE

Indentation | Number of Sentences | Corpus Coverage | Sentences Handled by the Prototype | Topical Coverage by the Prototype
1 | 641 | 66% | 197 | 31%
2 | 259 | 22% | 65 | 25%
3 | 86 | 7% | 22 | 26%
4 | 26 | 3% | 10 | 38%
5 | 11 | 1% | 4 | 36%
6 | 4 | 1% | 2 | 33%
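How such per-indentation figures can be computed from sentence-level annotations is sketched below with a hypothetical data structure; topical coverage is simply the handled share of each indentation's sentences.

```python
# Sketch (hypothetical annotation): each corpus sentence is tagged with its
# indentation level and whether the prototype handled it; per-indentation
# topical coverage is handled / total for that indentation.
from collections import defaultdict

sentences = [(1, True), (1, False), (2, True), (2, False), (3, False)]

totals, handled = defaultdict(int), defaultdict(int)
for level, ok in sentences:
    totals[level] += 1
    handled[level] += ok

for level in sorted(totals):
    share = 100 * totals[level] / len(sentences)
    coverage = 100 * handled[level] / totals[level]
    print(f"indentation {level}: {share:.0f}% of corpus, {coverage:.0f}% topical coverage")
```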

D. Discourse Analysis

The discourse of the prototype contains the high-level concepts developed during the construction of the ontology. High-level concepts are those that are related by the Tier 1 and Tier 2 semantic relations of our framework and convey knowledge rather than predicate information. According to our research, the domain has 12 high-level concepts, shown in Table VIII; the table is organized in descending order of the number of concepts in the discourse. We semi-automatically analyzed the corpus and found that the high-level concepts of the ontology are present 4,120 times in the corpus; this is the discourse of the prototype. We also found that the high-level concepts of the domain are present 969 times in the sentences that the prototype can handle, which is the discourse covered by the prototype. Dividing the discourse coverage of the prototype by the total number of concepts in the discourse gives the share of the discourse the prototype covers; in this case, Table VIII shows that the discourse coverage of the TKM prototype is 24 percent.

Considering individual high-level concepts, Units and Measuring Instrument are the areas of discourse the prototype covers best, each with 48 percent coverage. Rules is next with 28 percent coverage. The prototype covers only 15 percent of the discourse of Electrical Process, although that discourse is significant in the domain. Environmental Factors, a narrower high-level concept, is next in terms of low discourse coverage.

TABLE VIII.
DISCOURSE ANALYSIS OF THE TKM PROTOTYPE

High Level Concepts | Coverage of Prototype | Concepts in Discourse | Difference with Discourse | Discourse Coverage | Deviation
(1) Electrical Quantity | 335 | 1433 | 1098 | 24% | 77%
(2) Circuit Components | 154 | 685 | 531 | 23% | 78%
(3) Diagrammatic Notation | 94 | 482 | 388 | 20% | 81%
(4) Electrical Process | 63 | 442 | 379 | 15% | 86%
(5) Electrical Device | 83 | 313 | 230 | 27% | 74%
(6) Units | 100 | 211 | 111 | 48% | 53%
(7) Atomic Level | 31 | 161 | 130 | 20% | 81%
(8) Circuits | 30 | 140 | 110 | 22% | 79%
(9) Environmental Factors | 20 | 110 | 90 | 19% | 82%
(10) Measuring Instrument | 46 | 96 | 50 | 48% | 53%
(11) Rules | 13 | 47 | 34 | 28% | 73%
(12) Materials | 4 | 21 | 17 | 20% | 81%
Total | 969 | 4,120 | 3,151 | 24% | 77%

We also analyzed the deviation of the prototype from the discourse. First, we measured the difference between the coverage of the prototype and the concepts in the discourse. Then we measured the deviation: the difference with the discourse divided by the number of concepts in the discourse. This deviation is a measure of unawareness of the discourse, i.e., how much of the discourse the prototype failed to pursue. The data show that the prototype is strongest at representing knowledge from the discourse of Units and Measuring Instrument (both with 53 percent deviation). Overall, the prototype deviates from the discourse by 77 percent, which means its discourse awareness is 23 percent.
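A small sketch of these two measures as defined above, using a few rows of Table VIII (the printed values may differ from the table by a percentage point due to rounding), follows.

```python
# Discourse coverage and deviation as defined above, for a few rows of Table VIII:
# coverage = covered / in_discourse, deviation = (in_discourse - covered) / in_discourse.
concepts = {  # concept: (occurrences covered by the prototype, occurrences in the discourse)
    "Electrical Quantity": (335, 1433),
    "Units": (100, 211),
    "Measuring Instrument": (46, 96),
}

for name, (covered, in_discourse) in concepts.items():
    coverage = covered / in_discourse
    deviation = (in_discourse - covered) / in_discourse
    print(f"{name}: coverage {coverage:.0%}, deviation {deviation:.0%}")
```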

We plotted the presence of the high-level concepts in the discourse and the coverage of the discourse by the prototype in Figure 8; the difference between the two is depicted with vertical lines. From Figure 8 and Table VIII, we see that the difference grows roughly in proportion to the discourse from Electrical Quantity to Electrical Process; the sudden rise in the case of Electrical Device and Units indicates that most of the simple sentences in the corpus are situated in this area. Another smooth progression of the difference between the discourse and its coverage is manifested from Atomic Level to Environmental Factors. The prototype shows efficiency in knowledge representation for Measuring Instrument, which indicates that knowledge understandable to the prototype lies in this discourse, given suitable linguistic and semantic arrangement.

Figure 8. Difference between the discourse and the coverage of the discourse by the prototype.

VI. CONCLUSIONS

In this paper, we presented a corpus-based evaluation of the lexical components and knowledge model of a domain-specific Text to Knowledge Mapping prototype. We developed a domain-specific corpus and demonstrated its representativeness in linguistic elements with a stochastic approach and its soundness in semantic features with a rhetorical analysis. The representative corpus, with enriched multimodality, can be used as a reference in text summarization, for context and discourse analysis, and for developing ontologies. The linguistic resources of the corpus have been used to evaluate and adjust the lexical components of the prototype, such as its vocabulary and grammar. This evaluation enabled the prototype to parse a reasonable amount of domain-specific text. During the evaluation of the knowledge model, we developed a domain-specific ontology and an associated framework for semantic relations. We conducted topical and discourse analyses of the prototype to assess its context awareness, and the performance of the prototype is satisfactory. However, the limited conceptual acquisition of the prototype implies limited knowledge representation and demands a framework for domain-specific linguistic relations.

Using the domain-specific corpus, a generic corpus parsing and lexical component analysis tool has been developed [36] that extracts lexical information from any XML corpus and stores the information in a database. The corpus also contributed to domain-specific text summarization, and the result of the summarization was satisfactory [37].

REFERENCES<br />

[1] W. Ou and A. Elsayed, “A knowledge-based approach for<br />

semantic extraction”, International Conference on<br />

Business Knowledge Management, Macao, 2006.<br />

[2] T. Declerck, J. Klein, and G. Neumann, “Evaluation <strong>of</strong> the<br />

NLP components <strong>of</strong> an information extraction system for<br />

German”, Proceedings <strong>of</strong> the first international<br />

Conference on Language Resources and Evaluation<br />

(LREC), Granada, 1998, pp. 293-297.<br />

[3] D. Evans, “Corpus building and investigation for the<br />

humanities”, Available:<br />

http://www.humcorp.bham.ac.uk/humcorp/information/cor<br />

pusintro/Unit1.pdf [28 January 2009].<br />

[4] K. D. Forbus, “Qualitative process theory: Twelve years<br />

after”, Artificial Intelligence, vol. 59, no. 1-2, 1993, pp.<br />

115-123.<br />

[5] M. Palmer and T. Finin, “Workshop on the evaluation <strong>of</strong><br />

natural language processing systems”, Computational<br />

Linguistics, vol. 16, no. 3, 1993, pp. 175-181.<br />

[6] T. McEnery and A. Wilson, Corpus Linguistics,<br />

Edinburgh: Edinburgh University Press, United Kingdom,<br />

1996.<br />

[7] J. Klein, T. Declerck, and G. Neumann, “Evaluation <strong>of</strong> the<br />

syntactic analysis component <strong>of</strong> an information extraction<br />

system for German”, Proceedings <strong>of</strong> the 1st International<br />

Conference on Language Resources and Evaluation,<br />

Granada, Spain, 1998, pp. 293-297.<br />

[8] G. Neumann, R. Back<strong>of</strong>en, J. Baur, M. Becker, and C.<br />

Braun, “An information extraction core system for real<br />

world German text processing”, Proceedings <strong>of</strong> the 5th<br />

Conference on Applied Natural Language Processing<br />

(ANLP), USA, 1997, pp. 209-216.<br />

[9] W. Finkler and G. Neumann, “Morphix: A fast Realization<br />

<strong>of</strong> a classification–based approach to morphology”,<br />

Proceedings der 4. Österreichischen Artificial Intelligence<br />

Tagung, Wiener Workshop Wissensbasierte<br />

Sprachverarbeitung, Berlin, Germany, 1988, pp. 11-19.<br />

[10] S. Clark, M. Steedman, and J. R. Curran, “Objectextraction<br />

and question-parsing using CCG”, Available:<br />

http://www.iccs.inf.ed.ac.uk/~stevec/papers/emnlp04.pdf<br />

[28 January 2009].<br />

[11] N. Nathan and E. Kozminsky, “Text concept mapping: The<br />

contribution <strong>of</strong> mapping characteristics to learning from<br />

texts”, Proceeding <strong>of</strong> the 1 st International Conference on<br />

Concept Mapping (CMC2004), Pamplona, Spain, 2004.<br />

[12] M. Baroni, A. Kilgarriff, J. Pomikálek, and P. Rychlý,<br />

“WebBootCaT: A web tool for instant corpora”,<br />

Proceeding <strong>of</strong> the EuraLex Conference, Italy, 2006, pp.<br />

123-132.<br />

[13] T. McEnery, R. Xia, and Y. Tono, Corpus-Based<br />

Language Studies: An Advanced Resource Book, London:<br />

Routledge, 2006.<br />

[14] D. Biber, “Representativeness in corpus design”, Literary<br />

and Linguistic Computing, vol. 8, no. 4, 1993, pp. 243-257.


JOURNAL OF COMPUTERS, VOL. 5, NO. 1, JANUARY 2010 79<br />

[15] University <strong>of</strong> Illinois at Urbana-Champaign, “The SNoW<br />

learning architecture”, Available:<br />

http://l2r.cs.uiuc.edu/~danr/snow.html [28 January 2009].<br />

[16] A. S. Hornby, Guide to Patterns and Usage in English, 2 nd<br />

Edition, Oxford University Press, Delhi, 1995, pp. 1-2.<br />

[17] M. A. Covington, Natural Language Processing for<br />

Prolog Programmers, 1 st Edition, Prentice Hall, 1993, pp.<br />

88-90.<br />

[18] M. Steedman and J. Baldridge, “Combinatory categorical<br />

grammar”, Unpublished Tutorial Paper, 1993.<br />

[19] Z. Chen, Y. Perl, M. Halper , J. Geller, and H. Gu,<br />

“Partitioning the UMLS semantic network”, IEEE<br />

Transactions on Information Technology in Biomedicine,<br />

vol. 6, no. 2, 2002.<br />

[20] Micra, Inc., “FACTOTUM”, Available:<br />

http://www.micra.com/factotum/ [25 January 2009].<br />

[21] A.J.Cañas, G. Hill, R. Carff, N. Suri, J. Lott, G. Gómez,<br />

T.C. Eskridge, M. Arroyo, and R. Carvajal, “CmapTools:<br />

A knowledge modelling and sharing environment”,<br />

Proceeding <strong>of</strong> the 1 st International Conference on Concept<br />

Mapping, Pamplona, Spain, 2004.<br />

[22] W. C. Mann and S. A.Thompson, Rhetorical Structure<br />

Theory: Toward a functional theory <strong>of</strong> text organization,<br />

Text 8 (3), 1988, pp. 243-281.<br />

[23] D. Rosner and M. Stede, “Customizing RST for the<br />

automatic production <strong>of</strong> technical manuals”, Aspects <strong>of</strong><br />

Automated Natural Language Generation, Lecture Notes<br />

in Artificial Intelligence, Trento, Italy, 1992, pp. 199-214.<br />

[24] K. V. Linden, “Speaking <strong>of</strong> actions: Choosing rhetorical<br />

status and grammatical form in instructional text<br />

generation”, Ph. D. thesis, University <strong>of</strong> Colorado, 1993.<br />

[25] L. Hunter, “Dimensions <strong>of</strong> media object<br />

comprehensibility”, 7th IEEE International Conference on<br />

Advanced Learning Technologies (ICALT 2007), Niigata,<br />

Japan, 2007, pp. 925-926.<br />

[26] A. Elsayed, “Machine-mediated communication: The<br />

technology”, 6 th IEEE International Conference on<br />

Advanced Learning Technologies (ICALT 2006), Kerkrade,<br />

The Netherlands, 2006, pp. 1194-1195.<br />

[27] D. Marcu, “RST annotation tool”, Available:<br />

http://www.isi.edu/licensed-sw/RSTTool/ [27 January<br />

2009]<br />

[28] L. Kosseim and G. Lapalme, “Choosing rhetorical<br />

structures to plan instructional texts”, Computational<br />

Intelligence, vol. 16, no. 3, 2000, pp. 408-445.<br />

[29] W. C. Mann and M. Taboada, “Rhetorical Structure<br />

Theory: Looking back and moving ahead”, Discourse<br />

Studies, vol. 8, no. 3, 2006, pp. 423-459.<br />

[30] V. Stuart-smith, “The hierarchical organization <strong>of</strong> text as<br />

conceptualized by rhetorical structure theory: A systemic<br />

functional perspective”, Australian <strong>Journal</strong> <strong>of</strong> Linguistics,<br />

vol. 27, no. 1, 2007, pp. 41-61.<br />

[31] R. Williams and E. Reiter, “A corpus analysis <strong>of</strong> discourse<br />

relations for natural language generation”, Proceedings <strong>of</strong><br />

Corpus Linguistics, Lacaster, United Kingdom, 2003, pp.<br />

899-908.<br />

[32] D. Marcu, M. Romera, and E. Amorrortu, “Experiments in<br />

constructing a corpus <strong>of</strong> discourse trees”, University <strong>of</strong><br />

Maryland, 1999, pp. 48-57.<br />

[33] B. Webber, “D-LTAG: Extending lexicalized TAG to<br />

discourse”, <strong>Journal</strong> <strong>of</strong> Cognitive Science, vol. 28, 2004, pp.<br />

751-779.<br />

[34] R. Shams and A. Elsayed, “Development <strong>of</strong> a conceptual<br />

structure for a domain-specific corpus”, 3 rd International<br />

Conference on Concept Maps (CMC 2008), Estonia and<br />

Finland, 2008.<br />


[35] R. Shams and A. Elsayed, “A Corpus-based evaluation <strong>of</strong><br />

lexical components <strong>of</strong> a domain-specific text to knowledge<br />

mapping prototype”, 2008 International Conference on<br />

Computer and Information Technology (ICCIT), Khulna,<br />

Bangladesh, 2008, pp. 242-247.<br />

[36] S. Chowdhury, A. S. Shawon, and R. Shams, “CorParse: A<br />

corpus parsing and analysis tool”, Unpublished Tutorial<br />

Paper, 2008.<br />

[37] A. Hossain, S. R. Akter, M. Gope, M. M. A. Hashem, and<br />

R. Shams, “A corpus-dependent text summarization and<br />

presentation using statistical methods”, Undergraduate<br />

Thesis, Khulna University <strong>of</strong> Engineering & Technology,<br />

2009.<br />

Rushdi Shams was born in Khulna, Bangladesh on January<br />

3, 1985. He pursued his M.Sc. in Information Technology from<br />

the University <strong>of</strong> Bolton, United Kingdom in 2007 and his B.Sc.<br />

(Engineering) in Computer Science and Engineering from<br />

Khulna University <strong>of</strong> Engineering & Technology (KUET),<br />

Bangladesh in 2006. His major fields of study include intelligent systems, Internet security, data warehousing, IT management and artificial intelligence.

He is currently a Lecturer in the Department <strong>of</strong> Computer<br />

Science and Engineering, KUET. Formerly, he was a Research<br />

Intern at M3C Laboratory, University <strong>of</strong> Bolton. So far, he has<br />

publications in the area <strong>of</strong> Ad hoc Networks and Knowledge<br />

Processing. He has supervised undergraduate theses in diverse fields such as knowledge processing, wireless sensor networks,

corpus linguistics, and web engineering. Currently, he is<br />

developing frameworks for web 3.0 and acquisition and<br />

machine representation <strong>of</strong> commonsense knowledge. His<br />

current research interests are Knowledge and Language<br />

Processing, Computational Linguistics, and Wireless Networks.<br />

Mr. Shams is an Associate Member <strong>of</strong> Institute <strong>of</strong> Engineers,<br />

Bangladesh (IEB).<br />

Adel Elsayed pursued his Ph.D. in Applied Optimal Control at Loughborough University, UK in 1985. His M.Sc. is in Electronic Control Engineering from Salford University, UK in 1975, preceded by a B.Sc. in Communications Engineering from Libya University, Libya in 1972. He has a diversified field of study, from signal processing to multimodal communication and from communications to intelligent systems and human-computer interaction.

He joined the University of Bolton towards the end of 2001. His main objective was to establish a research base by building on his work on multimodal communication, a new line of research that he had started a few years earlier. His research at Bolton started when he set up the “Active Presentation Technology” Lab (APT Lab). Since then, he has diversified into investigating the underpinning knowledge structures that support human-machine communication. This led to research on information structures and their applications. Consequently, the scope of his work grew out of the narrow area of active presentation technology into the wider scope of Machine-Mediated Communication, hence the new name of the research lab. He has a considerable number of conference and journal publications. His research interests include cognitive tools, knowledge and language processing, and speech technology.



Dr. Elsayed established M3C as an international workshop attached to the well-known IEEE International Conference on Advanced Learning Technologies (ICALT). He acts as a guest editor for special issues of several well-known journals, as well as a reviewer for a number of international conferences.

Quazi Mah-Zereen Akter was born on November 25, 1985

in Dhaka, Bangladesh. She is now completing her M.Sc. in<br />

Computer Science and Engineering from the University <strong>of</strong><br />

Dhaka, Bangladesh. She completed her B.Sc. (Engineering) in<br />

Computer Science and Engineering from the University <strong>of</strong><br />

Dhaka in 2008. Bioinformatics, artificial intelligence, computer<br />

vision, distributed database systems, digital system design, and<br />

VLSI are her major fields <strong>of</strong> study.<br />

She is currently working on her Master's thesis on reversible logic and computational intelligence. Besides

Intelligent Systems, her research interests are Bioinformatics,<br />

Management Information Systems, Reversible Logic, and Logic<br />

Simplification and Minimization.<br />


Implementation <strong>of</strong> Low Density Parity Check<br />

Decoders using a New High Level Design<br />

Methodology<br />

Syed Mahfuzul Aziz and Minh Duc Pham<br />

School <strong>of</strong> Electrical & Information Engineering, University <strong>of</strong> South Australia, Mawson Lakes, Australia<br />

Email: {Mahfuz.Aziz, Minh.Pham}@unisa.edu.au<br />

Abstract—Low density parity check (LDPC) codes are<br />

error-correcting codes that <strong>of</strong>fer huge advantages in terms<br />

<strong>of</strong> coding gain, throughput and power dissipation. Error<br />

correction algorithms are <strong>of</strong>ten implemented in hardware<br />

for fast processing to meet the real-time needs <strong>of</strong><br />

communication systems. However, hardware implementation of LDPC decoders using a traditional hardware description language (HDL) based approach is a complex and time-consuming task. This paper presents an

efficient high level approach to designing LDPC decoders<br />

using a collection <strong>of</strong> high level modelling tools. The<br />

proposed new methodology supports programmable logic<br />

design starting from high level modelling all the way up to<br />

FPGA implementation. The methodology has been used to<br />

design and implement representative LDPC decoders. A<br />

comprehensive testing strategy has been developed to test<br />

the designed decoders at various levels. The simulation and<br />

implementation results presented in this paper prove the<br />

validity and productivity <strong>of</strong> the new high level design<br />

approach.<br />

Index Terms—Error correction coding, digital systems,<br />

digital communication, logic design, FPGA.<br />

I. INTRODUCTION<br />

Information passing through a practical<br />

communication channel may be corrupted in transit by<br />

noise present in the channel [1]. Therefore it is <strong>of</strong><br />

paramount importance for communication systems to<br />

have adequate means for the detection and correction <strong>of</strong><br />

errors in the information received over communication<br />

channels. Turbo codes and LDPC (low density parity<br />

check) codes are most commonly used for error detection<br />

and correction nowadays [2]. Both <strong>of</strong> these codes provide<br />

coding gains [3] close to Shannon’s limit [4]. LDPC<br />

codes however outperform turbo codes in terms <strong>of</strong> coding<br />

gain for large SNR [5, 6]. An LDPC code of length 1 million bits with a coding rate (ratio of information bits to the sum of information and parity bits) of 0.5 and a BER of 10⁻⁶ performs only 0.13 dB from Shannon's limit [5]. Further advantages of using LDPC codes are


given in [7]. These include lower computational complexity compared to turbo codes, the ability to pipeline the decoder to increase throughput at the cost of registers and some latency, and fewer iterations than turbo codes. An increased number of iterations reduces the throughput and increases power dissipation [5, 8]. The studies conducted in [7] and [8] indicate that the smaller number of iterations in LDPC codes helps achieve higher throughput and decreases power dissipation. Another factor which contributes to increasing the throughput of an LDPC decoder is the degree of parallelism, which is adjustable [9, 10]. Another reason for using LDPC codes is that they are relatively easier to implement than turbo codes [11].

Despite all these advantages of LDPC codes, the

random parity check matrix makes the wiring between<br />

variable and check nodes complex, especially for large<br />

matrices. This leads to increased routing congestion in the<br />

decoder and eventually the size <strong>of</strong> the LDPC decoder<br />

increases and speed decreases [12, 13]. Complexity <strong>of</strong><br />

practical implementation makes high throughput very<br />

difficult to achieve [14, 15]. Moreover, designing LDPC decoders in VHDL (Very High Speed Integrated Circuit Hardware Description Language) becomes a cumbersome task as the size of the design increases. A huge amount of time and effort is required to model such large designs in VHDL [16], which results in a decrease in productivity. It becomes a nightmare for the designer to write VHDL code for thousands of connections and to make the required changes. Therefore, hardware implementation of LDPC decoders remains a challenge.

This paper examines high level modelling and<br />

synthesis techniques <strong>of</strong> LDPC decoders using emerging<br />

industry tools. It compares the high level approaches with<br />

traditional hardware description language based<br />

approaches in terms <strong>of</strong> modelling complexity, efforts and<br />

time. It also compares the results obtained for a<br />

representative LDPC decoder design using the high level<br />

approaches and using a traditional HDL based approach.<br />

II. DESIGN APPROACH<br />

This paper investigates a new high-level modelling and synthesis methodology for LDPC decoders using state-of-the-art tools. As opposed to using only hand-written

VHDL codes for the entire design, this research examines<br />

high level design methods using Simulink, involving a<br />

combination <strong>of</strong> blocks designed with predefined library<br />

components and with embedded Matlab code. VHDL (VHSIC Hardware Description Language) code is generated automatically from the high-level Simulink model using MathWorks' Simulink HDL Coder (version 1.2). The entire design process is captured in Fig. 1.

Simulink HDL coder gives the flexibility <strong>of</strong> integrating<br />

different design approaches, namely Matlab programs,<br />

Simulink models and VHDL codes. Simulink and Matlab<br />

programs provide a much higher level <strong>of</strong> abstraction than<br />

VHDL. Therefore our design approach utilizes as many<br />

Simulink library blocks and Matlab blocks as possible to<br />

design various modules <strong>of</strong> the decoder provided these<br />

modules can be successfully processed by the HDL coder<br />

for automatic generation <strong>of</strong> VHDL codes. All the<br />

modules are then integrated in a top level Simulink model<br />

to produce the overall decoder model. VHDL code is<br />

generated automatically using Simulink HDL coder along<br />

with an optional test bench. This test bench can be used<br />

in ModelSim (a HDL simulator) to check correctness <strong>of</strong><br />

the design. The auto-generated VHDL code can also be<br />

used in Altera’s Quartus II or Xilinx ISE (Integrated<br />

S<strong>of</strong>tware Environment) for synthesis and implementation<br />

on desired FPGA (Field Programmable Gate Array).<br />

Automatic HDL code generation will lead to reduction in<br />

design efforts thereby increasing design productivity.<br />
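As a minimal sketch of how this flow can be scripted (assuming a Simulink model named ldpc_decoder with a subsystem decoder_top; both names are hypothetical, and makehdl/makehdltb are standard Simulink HDL Coder commands), the automatic code generation step could look roughly like this:

```matlab
% Minimal sketch of the automatic code generation step (assumed model and subsystem names).
% Requires Simulink HDL Coder; hdlsetup/makehdl/makehdltb are HDL Coder commands.
load_system('ldpc_decoder');                   % open the high-level Simulink model
hdlsetup('ldpc_decoder');                      % apply HDL-friendly simulation settings
makehdl('ldpc_decoder/decoder_top', ...        % generate VHDL for the decoder subsystem
        'TargetLanguage', 'VHDL', ...
        'TargetDirectory', 'hdlsrc');
makehdltb('ldpc_decoder/decoder_top', ...      % generate an optional VHDL test bench
          'TargetLanguage', 'VHDL', ...
          'TargetDirectory', 'hdlsrc');
% The files generated in 'hdlsrc' can then be simulated in ModelSim and synthesised
% in Quartus II or Xilinx ISE for the target FPGA, as described in the text.
```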

Figure 1. Proposed design flow (Simulink library blocks, MATLAB programs and VHDL programs, each verified by simulation in Simulink, MATLAB or Quartus/Xilinx, feed the Simulink HDL Coder for automatic code generation, followed by simulation, analysis and synthesis, and implementation on FPGA).

III. OVERVIEW OF LDPC CODES AND DECODING<br />

ALGORITHM<br />

Low density parity check (LDPC) codes are a class <strong>of</strong><br />

linear block codes which are used for error detection and<br />

correction [17]. LDPC codes were first discovered by Gallager in 1962. In 1981, Tanner revisited LDPC codes and introduced the important Tanner (bipartite) graph representation [18]. LDPC codes could not be implemented at the time of their invention because the technology was not sufficiently advanced; they were studied again after almost three decades because they are more

advantageous than any other error-correcting codes.<br />

LDPC codes are extensively used in standards such as 10 Gigabit Ethernet (10GBASE-T) and digital video broadcasting (DVB-S2) [19].

There are different algorithms which could be used for decoding. We used the min-sum decoding algorithm [14, 20], which is a simplified variant of the sum-product algorithm [18, 21, 22]; compared to sum-product, it reduces the computational complexity and makes the decoder numerically stable [18]. Assume that the messages from

the host communication system are represented by I and<br />

are passed on to the decoder for error correction. The<br />

LDPC decoder consists <strong>of</strong> a number <strong>of</strong> variable nodes (v)<br />

and check nodes (c). The operation <strong>of</strong> the min-sum<br />

algorithm can be summarised as follows [14]:<br />

A. Variable Node Operation<br />

A variable node performs the operation given in (1)<br />

and passes the outputs to check nodes.<br />

$$L_{cv} = \sum_{m \in M(v) \setminus c} R_{mv} + I_v \qquad (1)$$

where $I_v$ is the input to variable node v, also known as the log-likelihood ratio (LLR), $L_{cv}$ is the output of variable node v going to check node c, $M(v) \setminus c$ denotes the set of check nodes connected to variable node v excluding check node c, and $R_{mv}$ is the output of the check nodes going to variable node v.
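For reference, the channel LLR supplied to a variable node can be written in the usual way (a standard definition added here for completeness, not quoted from the paper):

$$I_v = \ln \frac{P(x_v = 0 \mid y_v)}{P(x_v = 1 \mid y_v)}$$

where $x_v$ is the transmitted bit and $y_v$ the corresponding received channel sample.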

B. Check Node Operation<br />

A check node receives messages from variable nodes<br />

and performs the operation given by (2):<br />

$$R_{cv} = \prod_{n \in N(c) \setminus v} \operatorname{sign}(L_{cn}) \times \min_{n \in N(c) \setminus v} |L_{cn}| \qquad (2)$$

where $R_{cv}$ is the output of check node c going to variable node v, and $N(c) \setminus v$ denotes the set of variable nodes connected to check node c excluding variable node v.

C. Parity Check<br />

Every check node also checks whether the parity condition is satisfied by examining the signs of the messages coming from the variable nodes. Until all the parity checks at all the check nodes are satisfied, the messages are sent back to the variable nodes, which repeat the operation specified in part (A); once all checks are satisfied, the decoder stops.

The min-sum decoding algorithm uses soft decisions; however, a hard decision is taken on the new LLR ($I_v$): if the new LLR is negative, the output bit is a 1, otherwise a 0. Interested readers can find the details of

the sum-product and min-sum decoder algorithms in [18,<br />

20-22].<br />
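For concreteness, one iteration of the min-sum schedule described above can be sketched in MATLAB as follows (an illustrative floating-point model with our own variable names, not the fixed-point Simulink implementation presented in this paper):

```matlab
function [bits, parity_ok, R] = min_sum_iteration(H, I, R)
% One min-sum iteration (illustrative floating-point sketch).
% H: MxN parity check matrix, I: 1xN channel LLRs, R: MxN check-to-variable messages.
[M, N] = size(H);
L = zeros(M, N);                                    % variable-to-check messages
for v = 1:N
    checks = find(H(:, v))';                        % check nodes connected to variable v
    for c = checks
        others = setdiff(checks, c);
        L(c, v) = I(v) + sum(R(others, v));         % equation (1)
    end
end
for c = 1:M
    vars = find(H(c, :));                           % variable nodes connected to check c
    for v = vars
        others = setdiff(vars, v);
        R(c, v) = prod(sign(L(c, others))) * min(abs(L(c, others)));  % equation (2)
    end
end
Inew      = I + sum(R, 1);                          % updated a posteriori LLRs
bits      = double(Inew < 0);                       % hard decision: negative LLR -> bit 1
parity_ok = all(mod(H * bits', 2) == 0);            % stop once every parity check holds
end
```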

IV. REPRESENTATIVE DECODER DESIGNS<br />

LDPC codes can be represented by an M x N sparse<br />

matrix, usually called H matrix. The H matrix contains<br />

mostly zeros and a small number <strong>of</strong> 1s [9, 10]. It can also<br />


be represented by a graph called a bipartite or Tanner graph. Tanner graphs contain variable and check nodes. The decoder prototype designed in this research is based on the matrix shown in Fig. 2. It is a very small matrix compared to real-world matrices. This 10x5 matrix can be transformed into a Tanner graph having 10 variable nodes and 5 check nodes, as shown in Fig. 3. The signals go

from the variable nodes to the check nodes and from the<br />

check nodes back to variable nodes. Typically different<br />

sets <strong>of</strong> wires are used for the signals going from variable<br />

nodes to check nodes and vice versa. The process is<br />

iterative and goes until all the parity checks are satisfied.<br />

The messages in our prototype designs are 4-bit long.<br />

Figure 2. A 10x5 sparse matrix:
1 1 1 1 0 1 1 0 0 0
0 0 1 1 1 1 1 1 0 0
0 1 0 1 0 1 0 1 1 1
1 0 1 0 1 0 0 1 1 1
1 1 0 0 1 0 1 0 1 1

Figure 3. Tanner graph (variable nodes V1-V10 connected to check nodes C1-C5 according to the matrix of Fig. 2).

Figure 4. Check node design in Simulink.
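As a simple illustration of how the matrix of Fig. 2 defines the wiring of the Tanner graph (a MATLAB sketch, not part of the authors' tool flow):

```matlab
% Parity check matrix of Fig. 2: rows are check nodes C1..C5, columns are variable nodes V1..V10.
H = [1 1 1 1 0 1 1 0 0 0;
     0 0 1 1 1 1 1 1 0 0;
     0 1 0 1 0 1 0 1 1 1;
     1 0 1 0 1 0 0 1 1 1;
     1 1 0 0 1 0 1 0 1 1];
% Every 1 at (c, v) corresponds to a wire between check node Cc and variable node Vv.
for v = 1:size(H, 2)
    fprintf('V%d connects to check nodes: %s\n', v, mat2str(find(H(:, v))'));
end
```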

A. Design 1<br />

The approach taken in the first decoder design was to use a combination of embedded Matlab blocks and Simulink library blocks. No hand-written VHDL blocks were used. Generic Simulink models were first developed for a variable node and a check node.

Check node design in Simulink: The check node has<br />

been designed in the same way as the variable node. The<br />

Simulink model <strong>of</strong> the check node is shown in Fig. 4. It<br />

uses blocks from Simulink library (absolute and bitwise<br />

XOR). The check node finds the minimum <strong>of</strong> all the<br />

inputs and performs the parity checks. It contains a<br />

control block, which controls its operation.<br />

Variable node design in Simulink: The variable node,<br />

shown in Fig. 5, has been designed in Simulink using the<br />

basic add block from the Simulink library. The control<br />

block controls the operation <strong>of</strong> the variable node. The<br />

variable node has four inputs, the first three come from<br />

various check nodes and the fourth one is the raw LLR,<br />

which is an external input supplied by the host<br />

communication system. It performs the operation given in<br />

(1) and passes the outputs to the check nodes to which it<br />

is connected.<br />
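As an illustration only (the actual blocks in Design 1 are built from Simulink library adders and a control block), a degree-3 variable node implementing equation (1) could be expressed as an embedded Matlab function of roughly the following form:

```matlab
function [L1, L2, L3] = variable_node(R1, R2, R3, Iv)
% Degree-3 variable node: each output excludes the message arriving on that edge (eq. (1)).
L1 = Iv + R2 + R3;   % message to check node 1
L2 = Iv + R1 + R3;   % message to check node 2
L3 = Iv + R1 + R2;   % message to check node 3
end
```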

Once the Simulink models are developed for the check<br />

node and the variable node, the VHDL codes for each <strong>of</strong><br />

these can be generated automatically using the HDL<br />

Coder tool according to the design flow presented in Fig.<br />

1. The functionality <strong>of</strong> each <strong>of</strong> these models can be<br />

verified at both Simulink and VHDL levels. Each VHDL<br />

model can also be synthesized for a target FPGA.<br />

10x5 LDPC decoder: A LDPC decoder with 10<br />

variable and 5 check nodes has been designed using<br />

instances <strong>of</strong> the variable and check node components<br />

described above. The various nodes are interconnected in<br />

a way that corresponds to the Tanner graph <strong>of</strong> Fig. 3.



B. Design 2<br />

Design 2 uses a combination <strong>of</strong> embedded Matlab<br />

blocks, Simulink library blocks and VHDL/Link for<br />

ModelSim blocks. ‘Link for ModelSim’ is a utility that<br />

enables modules coded in VHDL to be embedded in<br />

Simulink models. Design 2 is different from the first<br />

design in that it uses serial communication <strong>of</strong> messages<br />

between variable and check nodes. This is achieved using SIPO (serial-in-parallel-out) and PISO (parallel-in-serial-out) registers at each input and output port in all the nodes. This greatly simplifies the interconnections by reducing the number of wires by a factor of four, at the cost of some extra registers. The variable and check node components are the same and are connected in the same way as in Design 1. The PISO and SIPO components are

coded in VHDL and are used in the Simulink model with<br />

the help <strong>of</strong> the ‘Link for ModelSim’ utility (to link the<br />

VHDL blocks with ModelSim for simulation and code<br />

generation purposes). The way PISO and SIPO are used<br />

in a variable node is shown in Fig. 6. Check nodes have<br />

been modified in exactly the same manner. The inputs<br />

and outputs <strong>of</strong> all the variable and check nodes are now<br />

1-bit long. The effects <strong>of</strong> using a VHDL block in a<br />

Simulink design are discussed in the next section. The<br />

HDL coder does not generate the VHDL code <strong>of</strong> the Link<br />

for ModelSim blocks. The VHDL code for these blocks<br />

needs to be added before synthesizing the code.<br />
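For readers unfamiliar with these registers, the following is a behavioural MATLAB sketch of a 4-bit SIPO using a persistent variable; the authors' actual PISO and SIPO components were written in VHDL, so this is only an illustration of the behaviour:

```matlab
function word = sipo4(bit_in)
% 4-bit serial-in-parallel-out register (behavioural sketch, not the VHDL used in Design 2).
persistent shift_reg
if isempty(shift_reg)
    shift_reg = zeros(1, 4);
end
shift_reg = [shift_reg(2:end), bit_in];   % shift in the new serial bit
word = shift_reg;                         % parallel 4-bit output
end
```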

Figure 5. Variable node design in Simulink.

Figure 6. A variable node with PISO and SIPO (three SIPO registers at the inputs and three PISO registers at the outputs of the variable node).

V. RESULTS AND ANALYSIS<br />

A. Convergence Test<br />

The convergence characteristics of the LDPC decoders presented in this paper have been tested using a VHDL testbench and compared with those of a Matlab implementation of a functionally equivalent decoder. For this purpose the

VHDL code automatically generated from the Simulink<br />

models was used. Fig. 7 shows the spread <strong>of</strong> the number<br />

<strong>of</strong> iterations for Design 2 by using plots <strong>of</strong> (mean number<br />

<strong>of</strong> iterations + standard deviation) and (mean number <strong>of</strong><br />

iterations − standard deviation) versus SNR (Eb/No). It is<br />

clear that convergence is achieved in 6 iterations. This is<br />

consistent with the convergence result <strong>of</strong> a functionally<br />

equivalent decoder shown in Fig. 8, designed using<br />

Matlab code [23].<br />
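A hedged sketch of how such a spread plot can be produced in MATLAB is shown below; the variable names and the SNR grid are illustrative and the iteration counts are placeholder data, not the results of Fig. 7 or Fig. 8:

```matlab
% Illustrative spread plot of the iteration count versus SNR (not the authors' script).
snr_db = 0:0.5:5;                              % Eb/No points (assumed grid)
iters  = randi([1 6], 1000, numel(snr_db));    % placeholder per-frame iteration counts
mu = mean(iters, 1);                           % mean number of iterations per SNR point
sd = std(iters, 0, 1);                         % standard deviation per SNR point
plot(snr_db, mu + sd, 'b-o', snr_db, mu - sd, 'r-s');
xlabel('Eb/No (dB)'); ylabel('Number of iterations');
legend('mean + std', 'mean - std'); grid on;
```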

Figure 7. Spread <strong>of</strong> the number <strong>of</strong> iterations for Design 2 obtained<br />

from VHDL testbench.<br />

Figure 8. Spread <strong>of</strong> the number <strong>of</strong> iterations for a functionally<br />

equivalent Matlab code.


JOURNAL OF COMPUTERS, VOL. 5, NO. 1, JANUARY 2010 85<br />

B. Algorithm Performance<br />

The performance <strong>of</strong> our LDPC algorithm has been<br />

evaluated by simulating the decoder over AWGN channel<br />

and plotting the Bit-Error-Rate (BER) against Signal-to-<br />

Noise-Ratio (SNR) [24, 25]. Fig. 9 shows the BER<br />

performance <strong>of</strong> Design 2 obtained from simulation <strong>of</strong> the<br />

high-level Simulink model. Fig. 9 also shows the BER<br />

plot <strong>of</strong> the unencoded BPSK channel. Clearly our decoder<br />

demonstrates increasing gain in BER with increasing<br />

SNR compared to the unencoded BER. This proves that<br />

the proposed high-level design method can deliver<br />

competitive designs with the desired gain in BER<br />

performance [24, 25].<br />

Figure 9. Performance simulation <strong>of</strong> the LDPC decoder.<br />
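In outline, a BER curve of this kind can be obtained with a simulation loop of the following form (an illustrative all-zero-codeword BPSK/AWGN sketch; min_sum_decode is a hypothetical stand-in for a functionally equivalent decoder and is not one of the functions used in this work):

```matlab
% Illustrative BER-vs-SNR loop over an AWGN channel with BPSK modulation.
% Assumes H (parity check matrix, e.g. the 10x5 matrix of Fig. 2) is in the workspace;
% the code rate is ignored in the noise scaling for simplicity.
snr_db = 0:1:6;  N = 10;  frames = 1e4;  ber = zeros(size(snr_db));
for k = 1:numel(snr_db)
    sigma  = sqrt(1 / (2 * 10^(snr_db(k) / 10)));  % noise std for unit-energy BPSK
    errors = 0;
    for f = 1:frames
        rx   = ones(1, N) + sigma * randn(1, N);   % all-zero codeword maps to BPSK +1
        llr  = 2 * rx / sigma^2;                   % channel LLRs fed to the decoder
        bits = min_sum_decode(H, llr);             % decoded hard bits (decoder assumed)
        errors = errors + sum(bits ~= 0);          % any 1 is an error for the all-zero word
    end
    ber(k) = errors / (frames * N);
end
semilogy(snr_db, ber, '-o'); xlabel('Eb/No (dB)'); ylabel('BER'); grid on;
```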

C. Synthesis<br />

Synthesis results for the high-level designs are given<br />

below and are compared with their ‘VHDL only’<br />

counterparts. The results were obtained using Altera’s<br />

Quartus II s<strong>of</strong>tware with Cyclone II EP2C70F672C6 as<br />

the target FPGA device. Table 1 compares the resources<br />

used by our Design 1 and its maximum operating<br />

frequency (Fmax) with the same decoder designed solely<br />

using hand coded VHDL. Our design uses 2.1% <strong>of</strong> the<br />

FPGA’s logical elements, only 120 registers and has a<br />

maximum frequency <strong>of</strong> 42.96 MHz. Clearly our Simulink<br />

design uses far fewer resources and is faster than the

VHDL-only design. The higher register count in the<br />

VHDL design is due to the presence <strong>of</strong> dedicated<br />

registers for latching variable and check node outputs,<br />

which were removed from the Simulink design. Table 2<br />

compares our Design 2 with its VHDL-only counterpart.<br />

It uses 3.5% <strong>of</strong> the logic elements, which is higher than<br />

the amount used by the VHDL-only design (2.8%).<br />

TABLE I. DESIGN 1 COMPARED WITH VHDL-ONLY COUNTERPART

                            Design 1       VHDL-only design
Logical elements            1432 (2.1%)    1492 (2.2%)
Combinational functions     1432 (2.1%)    1492 (2.2%)
Total registers             120 (0.2%)     686 (1%)
Maximum frequency (Fmax)    42.96 MHz      37.3 MHz

TABLE II. DESIGN 2 COMPARED WITH VHDL-ONLY COUNTERPART

                            Design 2       VHDL-only design
Logical elements            2416 (3.5%)    1916 (2.8%)
Combinational functions     2416 (3.5%)    1916 (2.8%)
Total registers             480 (0.7%)     881 (1.3%)
Maximum frequency (Fmax)    67.20 MHz      67.72 MHz

The reason for the higher number of logic elements in our design is that the control unit contains handshaking circuitry to facilitate communication of large amounts of data between the PC and the FPGA board for testing.

However, the VHDL-only design has a simple control unit<br />

and does not include any such handshaking circuitry. Our<br />

Design 2 uses nearly half the number <strong>of</strong> registers and<br />

achieves nearly the same maximum frequency (Fmax).<br />

The time required by our high-level approach for<br />

successful modelling, simulation and synthesis <strong>of</strong> the<br />

LDPC designs was almost a quarter <strong>of</strong> that required by<br />

the hand coding method.<br />

D. Behavioral Simulation <strong>of</strong> VHDL Model<br />

Fig. 10 shows the functional simulation result for the<br />

VHDL model <strong>of</strong> Design 2 generated from its top level<br />

Simulink model. A set <strong>of</strong> raw LLRs (6, 6, 2, 4, 7, 4, -2, 6,<br />

4, 7) is applied to the variable nodes. The signals

end_o_vn & end_o_cn are the control signals. The parity<br />

becomes 1 in the third iteration when all the parity checks<br />

are satisfied and the decoder stops the iterations. The<br />

corrected LLRs output by the variable nodes are 7, 7, 7,<br />

7, 7, 7, -8, 4, 5, 7.<br />

E. Hardware Implementation and Testing<br />

After fully simulating the Simulink as well as the<br />

VHDL models <strong>of</strong> Design 1 and Design 2, both designs<br />

were implemented on a Xilinx Spartan 2E FPGA. The<br />

FPGA platform we used is shown in Fig. 11. It contains<br />

three separate modules: the USB communication module<br />

for communicating with the PC, the main FPGA module<br />

housing the Xilinx Spartan IIE, and the I/O module. The<br />

LLRs are generated by a MATLAB program and are<br />

stored in a text file on the PC. A LabVIEW program<br />

running on the PC sends these LLRs to the decoder via<br />

the USB module and receives the decoded LLRs back<br />

along with the parity information. The decoded LLRs are<br />

written into a separate file by the LabVIEW program and<br />

analysed for correctness by a MATLAB program by<br />

comparing with the LLRs generated by simulation <strong>of</strong> the<br />

Simulink and/or VHDL models. The PC controls the<br />

operation of the decoder on the FPGA through a number of handshaking signals, as shown in Fig. 12, for example start/stop and max_iteration. The I/O module has

been used to display useful runtime information, for<br />

example the number <strong>of</strong> iterations completed by the<br />

decoder for each LLR. This enables us to obtain a visual<br />

indication that the LDPC decoder is operating. The<br />

performance results obtained from the implemented<br />

decoders are presented in the next sub-section along with<br />

the performance <strong>of</strong> Simulink and VHDL models.
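The PC-side part of this exchange can be sketched in MATLAB as follows; the file names and the comparison are illustrative and do not reproduce the authors' LabVIEW setup:

```matlab
% Illustrative PC-side file exchange for testing the decoder on the FPGA.
llr_in = [6 6 2 4 7 4 -2 6 4 7];                   % example raw LLRs (cf. Section V.D)
dlmwrite('llr_in.txt', llr_in, 'delimiter', ' ');  % file picked up by the LabVIEW program
% ... LabVIEW sends the LLRs to the FPGA over USB and writes the decoded LLRs back ...
llr_fpga = dlmread('llr_fpga.txt');                % decoded LLRs returned from the FPGA
llr_sim  = dlmread('llr_sim.txt');                 % reference LLRs from Simulink/VHDL simulation
if isequal(sign(llr_fpga), sign(llr_sim))
    disp('FPGA output matches the simulation reference');
else
    disp('Mismatch between FPGA and simulation outputs');
end
```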



Figure 11. The FPGA platform used to implement the decoders.<br />

Figure 12. Block diagram of the LDPC decoder implemented on the Xilinx Spartan 2E FPGA device (inputs: LLR_In, Start/Stop, Clock, Max_iteration, Reset; outputs: Parity, LLR_Out, Number_of_Iterations).

F. Comprehensive Testing Strategy<br />

The Matlab and Simulink environment used in the high-level design methodology makes it very easy and fast to build a full test system for testing the whole design. In

this section we present a comprehensive testing strategy<br />

for LDPC decoders whereby we can test the decoder<br />

outputs from three different environments<br />

simultaneously. Fig. 13 shows the proposed testing<br />

strategy. The test system allows the design to be<br />

simultaneously tested in Simulink, VHDL (ModelSim)<br />

and on the FPGA.<br />


Figure 10. Behavioral simulation results <strong>of</strong> decoder Design 2.<br />


The LDPC Encode Test Data module generates a<br />

sequence <strong>of</strong> LDPC encoded test data and sends these to<br />

the Simulink simulation model, VHDL testbench and<br />

FPGA at the same time. After decoding is done in the<br />

three environments, the data are sent back to the Test<br />

Data Analysis module. This module analyses the decoded<br />

data from the three environments and compares the parity<br />

information for correctness. We have used this scheme to<br />

validate the decoder designs presented in this paper.<br />

Figure 13. Structure of the testing system (the LDPC Encode Test Data module feeds the LDPC decoders in Simulink, in VHDL and on the FPGA, whose outputs are collected by the LDPC Test Data Analysis module).

Fig. 14 shows the performance plot (BER) obtained<br />

from the FPGA and compares it with the BER obtained<br />

from the Simulink model. The performance plots from<br />

the VHDL testbench and FPGA are also compared in Fig.<br />

15. These figures show a close match among the<br />

performance plots obtained from three different<br />

environments and therefore prove that the design has<br />

been correctly implemented and run on the FPGA device.



Figure 14. Bit-Error-Rates obtained from FPGA implementation<br />

compared with Simulink simulation.<br />

Figure 15. Bit-Error-Rates obtained from FPGA implementation<br />

compared with VHDL testbench.<br />

VI. ANALYSIS OF THE HIGH LEVEL DESIGN METHOD<br />

The high-level design methodology presented in this<br />

paper reduces design complexity, effort and time. For<br />

example, to design the variable and check nodes <strong>of</strong> Fig. 4<br />

and Fig. 5 completely in VHDL a designer has to write<br />

behaviours <strong>of</strong> quite a few modules in VHDL and<br />

manually do the port mapping for the top level design. As<br />

the design gets larger and larger it becomes very difficult<br />

to code everything in VHDL. Not only does it take huge effort and time, but managing and reusing complex designs is also often quite difficult. A great deal of

expertise and experience in VHDL is also required. For a<br />

large LDPC decoder with hundreds <strong>of</strong> nodes, the<br />

interconnections among the variable and check nodes<br />

could easily become a designer’s nightmare if the entire<br />

design has to be coded in VHDL. An alternative, intuitive<br />

and highly efficient design method has been a long<br />

standing desire <strong>of</strong> the engineers engaged in the design <strong>of</strong><br />

complex digital systems such as LDPC decoders.<br />


The high level methodology we have presented in this<br />

paper <strong>of</strong>fers a very attractive alternative. The designs can<br />

be done intuitively in Simulink using predefined<br />

Simulink library blocks and blocks made from high-level<br />

Matlab code. The design complexity, effort and time are<br />

reduced drastically because the need for manually writing<br />

complex VHDL code is almost eliminated. Our estimates<br />

have shown that the time required to successfully design,<br />

simulate and synthesise a LDPC decoder using hand<br />

coded VHDL is almost four times that required by the<br />

proposed high-level methodology. Our decoder Design 2<br />

had 1069 lines <strong>of</strong> VHDL code generated automatically by<br />

HDL Coder from the top level Simulink model as<br />

opposed to only 225 lines <strong>of</strong> hand written VHDL code.<br />

The main reason for the large code produced by HDL<br />

Coder is that it used a very large number <strong>of</strong> internal<br />

signals to generate the VHDL description <strong>of</strong> the decoder.<br />

This is something we did not have any control over. Of<br />

course our decoder Design 2 had additional handshaking

circuitry to facilitate communication between the PC and<br />

the FPGA board for testing purpose. This contributed to<br />

the larger code to some extent. However, we did not<br />

optimise the auto-generated VHDL code. Yet our decoder<br />

designs compare favourably with the hand coded designs<br />

(see Tables 1 and 2).<br />

Another important aspect <strong>of</strong> the proposed high-level<br />

design methodology is that design reuse requires much<br />

less effort because changes are made either at block level<br />

in Simulink or in high-level Matlab code. In addition<br />

Matlab is a s<strong>of</strong>tware programming language that is used<br />

much more widely than hardware description languages<br />

like VHDL and Verilog. Therefore designers without<br />

specific skills in hardware languages are able to design<br />

complex digital systems without much problem. Even<br />

s<strong>of</strong>tware engineers and algorithm developers are able to<br />

quickly implement and test their high-level designs due to<br />

the ability to automatically generate HDL descriptions<br />

from the Simulink models. This will surely <strong>of</strong>fer great<br />

flexibility and efficiency in the design and reuse <strong>of</strong><br />

complex LDPC decoders. There are some other benefits

<strong>of</strong> the high level design methodology:<br />

• Different parts <strong>of</strong> the model can be enveloped using the<br />

‘create subsystem’ property <strong>of</strong> Simulink. It reduces the<br />

design complexity for large designs.<br />

• User created Simulink library blocks provide great<br />

flexibility because the revisions made to the user<br />

defined library seamlessly propagate through the entire<br />

model.<br />

• HDL Coder can generate either VHDL or Verilog<br />

descriptions from the high-level Simulink models,<br />

allowing greater flexibility in the choice <strong>of</strong> the<br />

hardware description language.<br />

In the high-level design methodology, it is also easy<br />

and fast to build a complete test system. The Matlab and Simulink libraries provide many powerful tools for generating and analysing test data, such as graphical plots and scripts.

The design methodology we have presented requires<br />

the use <strong>of</strong> some emerging design tools and library<br />

functions, such as Simulink hardware library and



embedded Matlab blocks, Link for ModelSim and HDL<br />

Coder tools. Because these are very recent developments<br />

the library <strong>of</strong> Simulink blocks to support the functions a<br />

designer needs is rather limited at the present time.<br />

Similarly the capability <strong>of</strong> HDL Coder is limited to<br />

conversion <strong>of</strong> a few commonly used Simulink blocks and<br />

a few embedded Matlab blocks. Some <strong>of</strong> the specific<br />

limitations are discussed below.<br />

A. Current Limitations<br />

Some common blocks in the Simulink library are not<br />

currently supported by the HDL Coder for automatic<br />

generation of VHDL code, e.g. flip-flops. Although we could build the PISO and SIPO registers easily in Simulink using the flip-flops from its library, the registers were not converted to VHDL by the HDL Coder. A careful selection of supported Matlab functions and Simulink library blocks may help address this type of problem, but not necessarily always. Some other

limitations we experienced are listed below:<br />

• Multiple instances <strong>of</strong> some modules (components) are<br />

used in the LDPC decoder, and in fact in most modular<br />

designs utilising a hierarchical design methodology.<br />

For example, in our LDPC decoder multiple instances<br />

<strong>of</strong> the check and variable nodes, and PISO and SIPO<br />

registers are used. The HDL Coder dumps the full<br />

behaviour <strong>of</strong> each component as many times as it is<br />

instantiated in the design. This produces redundant<br />

instances <strong>of</strong> component behaviour in the generated<br />

VHDL code. It is necessary for the designer to edit the<br />

auto-generated code.<br />

• There are some functions which are supported by the<br />

HDL Coder, but it produces the output in a particular<br />

data type. For example, the sign function <strong>of</strong> Matlab<br />

gives output only in 8-bit integer (int8) format. It is not<br />

possible to change these default data types in the<br />

current version. The only option is to manually edit the<br />

auto-generated HDL code.<br />

• Link for ModelSim: This utility enables blocks<br />

designed in VHDL to be included in Simulink models.<br />

However the ports <strong>of</strong> the blocks that use Link for<br />

ModelSim get interchanged while simulating in<br />

Simulink. HDL code cannot be generated for the<br />

Simulink model in this situation. The code can be<br />

generated only after correcting the design. This makes<br />

the design process difficult and time consuming. The<br />

other drawback of blocks utilising Link for ModelSim is that these blocks may make changes to the auto-generated code. In the case of the LDPC decoder, these blocks changed signals of type signed to unsigned. Once again this requires manually editing the auto-generated VHDL code.

VII. CONCLUSIONS<br />

In this paper a new high-level intuitive design<br />

methodology based on Simulink for modelling, synthesis<br />

and implementation <strong>of</strong> LDPC decoders has been<br />

presented. It utilises the higher level <strong>of</strong> abstraction<br />

<strong>of</strong>fered by the Simulink modelling environment. The<br />


modelling, simulation and synthesis process utilises a<br />

combination <strong>of</strong> emerging design tools and associated<br />

library functions. These include the Simulink HDL Coder<br />

and ‘Link for ModelSim’ tools, and embedded Matlab<br />

and Simulink hardware library blocks. Two versions <strong>of</strong><br />

10x5 LDPC decoders have been designed, simulated,<br />

synthesised and successfully implemented on a Xilinx<br />

Spartan 2E FPGA device. A comprehensive testing<br />

strategy has been adopted to test the decoders at all<br />

levels, from the high level Simulink model through<br />

VHDL all the way up to hardware implementation. Our<br />

testing strategy supports simultaneous testing <strong>of</strong> the<br />

decoders at the three levels which is useful for real-time<br />

debugging. The proposed high level design methodology<br />

facilitates the creation <strong>of</strong> such testing strategy very<br />

quickly because the test data generated at the high level is<br />

used for testing at all three levels. This helps to reduce<br />

the time <strong>of</strong> evaluating and testing the design as well as<br />

ensures that the final design is efficient and error-free.<br />

The performance figures on Bit-Error-Rates obtained at<br />

the three levels compare favourably with those <strong>of</strong> the<br />

LDPC decoders reported in the literature.<br />

The proposed high level design methodology <strong>of</strong>fers<br />

great advantages in terms <strong>of</strong> design complexity, effort<br />

and time compared to a HDL-only design method. Our<br />

Design 1, completed using the new methodology, uses<br />

fewer FPGA resources and achieves higher Fmax compared to its HDL-only counterpart. Our Design 2 achieves nearly

the same Fmax as its VHDL-only counterpart, but uses<br />

slightly higher number <strong>of</strong> logic elements due to the<br />

inclusion <strong>of</strong> additional handshaking circuitry required for<br />

testing the design on FPGA. Given the significant<br />

reduction in design effort and time, the above results<br />

make the proposed design methodology a very attractive<br />

one. We envisage that with further enrichment <strong>of</strong> the<br />

Simulink Library blocks, and enhancements <strong>of</strong> HDL<br />

Coder and Link for ModelSim tools, design<br />

methodologies similar to the one presented in this paper<br />

will eventually replace the tedious HDL-based design approach. This will pave the way for cost-effective design

and reuse <strong>of</strong> a new generation <strong>of</strong> complex high<br />

performance LDPC decoders.<br />

ACKNOWLEDGEMENT<br />

This work has been supported in part by a research<br />

grant from the Division <strong>of</strong> IT, Engineering and the<br />

Environment <strong>of</strong> the University <strong>of</strong> South Australia<br />

(UniSA). The authors wish to thank Mr Sunil Sharma for<br />

his initial investigations into developing variable and<br />

check node models using Simulink and related tools. The<br />

authors also acknowledge Pr<strong>of</strong> Bill Cowley <strong>of</strong> UniSA’s<br />

Institute <strong>of</strong> Telecommunications Research for his useful<br />

suggestions and critical feedback. The authors understand<br />

that Pr<strong>of</strong> Cowley’s time has been partially supported<br />

through a research grant from Sir Ross and Sir Keith<br />

Smith fund. Finally the authors would like to thank Dr<br />

Mark Ho <strong>of</strong> the School <strong>of</strong> Electrical and Information<br />

Engineering <strong>of</strong> UniSA for his suggestions on the<br />

performance simulations <strong>of</strong> the decoder.



REFERENCES<br />

[1] R. G. Gallager, Low-Density Parity-Check Codes.<br />

Cambridge, Mass: Monogram, 1963.<br />

[2] S. Johnson, Introducing Low-Density Parity-Check<br />

Codes. Australia: University <strong>of</strong> Newcastle, 2006.<br />

[3] S. Lin and D. J. Costello, Error Control Coding:<br />

Fundamentals and Applications. New Jersey: Prentice<br />

Hall, 2004.<br />

[4] B. Reiffen, “Sequential Decoding for Discrete Input<br />

Memoryless Channels,” IRE Trans. Inf. Theory, vol. 8,<br />

no. 3, pp. 208-220, April 1962.<br />

[5] A. J. Blanksby and C. J. Howland, “A 690- mW 1-Gb/s<br />

1024-b, rate-1/2 low-density parity check code decoder,”<br />

IEEE J. Solid State Circuits, vol. 37, no. 3, pp. 404–412,<br />

2002.<br />

[6] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke,<br />

“Design <strong>of</strong> capacity-approaching irregular low-density<br />

parity-check codes,” IEEE Trans. Inf. Theory, vol. 47, pp.<br />

619-637, February 2001.<br />

[7] J. Nguyen, B. Nikolic, and E. Yeo, Design <strong>of</strong> a low<br />

density parity check iterative decoder. University <strong>of</strong><br />

California, Berkley: EECS, College <strong>of</strong> Engineering, 2002.<br />

[8] S. Hong and W. Stark, “Design and implementation <strong>of</strong> a<br />

low complexity VLSI turbo-code decoder architecture for<br />

low energy mobile wireless communications,” J. VLSI<br />

Signal Processing, vol. 24, pp. 43-57, 2000.<br />

[9] M. Karkooti, P. Radosavljevic, and J. R. Cavallaro,<br />

“Configurable, High Throughput, Irregular LDPC<br />

Decoder Architecture: Trade<strong>of</strong>f Analysis and<br />

Implementation,” Proc. Int. Conf. Application Specific<br />

Systems, Architectures and Processors, pp. 360-367,<br />

September 2006.<br />

[10] M. Karkooti, P. Radosavljevic, and J. R. Cavallaro,<br />

“Configurable LDPC Decoder Architectures for Regular<br />

and Irregular Codes”, J. Signal Processing Systems, vol.<br />

53, pp. 73-88, October 2008.<br />

[11] Y. Lei, L. Hui, and R. C. J. Shi, “Code Construction and<br />

FPGA Implementation <strong>of</strong> a low-error-floor multi-rate<br />

low-density Parity-check code decoder,” IEEE Trans.<br />

Circuits & Systems I, vol. 53, pp. 892-904, April 2006.<br />

[12] A. Darabiha, C. A. Carusone, R. F. Kschischang, and E.<br />

S. Rogers, “Multi-Gbit/sec Low Density Parity Check<br />

Decoders with Reduced Interconnect Complexity,” Proc.<br />

IEEE Int. Symp. Circuits & Systems, vol. 5, pp. 5194-<br />

5197, 2005.<br />

[13] A. Darabiha, C. A. Carusone, and R. F. Kschischang,<br />

“Block-Interlaced LDPC Decoders With Reduced<br />

Interconnect Complexity,” IEEE Trans. Circuits &<br />

Systems II, vol. 55, pp. 74-78, January 2008.

[14] J. Sha, M. Gao, Z. Zhang, Li Li, and Z. Wang, “An FPGA<br />

implementation <strong>of</strong> array LDPC decoder,” Proc. IEEE<br />

Asia Pacific Conf. Circ. & Systems, pp. 1675-1678,<br />

December 2006.<br />

[15] Mauro Cocco, “A scalable architecture <strong>of</strong> LDPC<br />

Decoding,” Proc. Design, Automation & Test in Europe<br />

Conf., vol. 3, pp. 88-93, February 2004.<br />

[16] J. A. Wicks and J. R. Armstrong, “Efficiency ratings for<br />

VHDL behavioral models,” IEEE Proc. Southeastcon’98,

pp. 401-404, April 1998.<br />

[17] M. Eroz, F.W. Sun, and L.N. Lee, “DVB-S2 low density<br />

parity check codes with near Shannon limit performance,”<br />

Int. J. Satellite Communications and Networking, pp.<br />

269-279, 2004.<br />

[18] W. E. Ryan, An Introduction to LDPC Codes. USA: The<br />

University <strong>of</strong> Arizona, 2003.<br />


[19] T. Mohsenin and B. M. Baas, “Split-Row: A reduced<br />

complexity, high throughput LDPC decoder architecture,”<br />

Proc. Int. Conf. on Computer Design (ICCD 2006), San<br />

Jose, CA, pp. 220-225, 1-4 Oct 2007.<br />

[20] Z. Jianguang, F. Zarkeshvari, and A. H. Banihashemi, “On<br />

implementation <strong>of</strong> min-sum algorithm and its<br />

modifications for decoding low-density Parity-check<br />

(LDPC) codes”, IEEE Trans. on Communications, vol. 53,<br />

no. 4, pp. 549-554, April 2005.<br />

[21] F. R. Kschischang, B. J. Frey, and H. Loeliger, “Factor<br />

graphs and the sum-product algorithm,” IEEE Trans. on<br />

Inf. Theory, vol. 47, no. 2, pp. 498-519, 2001.<br />

[22] Sae-Young Chung, T. J. Richardson, and R. L. Urbanke,<br />

“Analysis <strong>of</strong> Sum-Product Decoding <strong>of</strong> Low-Density<br />

Parity-Check Codes Using a Gaussian Approximation,”<br />

IEEE Trans. Info. Theory, vol. 47, no. 2, pp 657-670, 2001.<br />

[23] S. M. Aziz and S. Sharma, “New Methodologies for High<br />

Level Modeling and Synthesis <strong>of</strong> Low Density Parity<br />

Check Decoders”, Proc. 11th Int. Conf. on Computers and

IT (ICCIT 2008), Khulna, pp. 276-281, 24-27 December<br />

2008.<br />

[24] D. Sridhara and T. E. Fuja, “LDPC Codes Over Rings for<br />

PSK Modulation,” IEEE Trans. Inf. Theory, vol. 51, no. 9,<br />

pp. 3209-3220, September 2005.<br />

[25] J. K. S. Lee and J. Thorpe, “Memory-Efficient Decoding <strong>of</strong><br />

LDPC Codes,” Proc. IEEE Int. Symp. on Information<br />

Theory (ISIT 2005), Adelaide, Australia, pp. 459-463, 4-9<br />

November 2005.<br />

Syed Mahfuzul Aziz received<br />

Bachelor and Masters Degrees, both in<br />

electrical & electronic engineering,<br />

from Bangladesh University <strong>of</strong><br />

Engineering & Technology (BUET) in<br />

1984 and 1986 respectively. He<br />

received a Ph.D. degree in electronic<br />

engineering from the University <strong>of</strong><br />

Kent (UK) in 1993 and a Graduate<br />

Certificate in higher education from<br />

Queensland University <strong>of</strong> Technology, Australia in 2002.<br />

He was a Pr<strong>of</strong>essor in BUET until 1999, and led the<br />

development <strong>of</strong> the teaching and research programs in<br />

integrated circuit (IC) design in Bangladesh. He joined the<br />

University <strong>of</strong> South Australia in 1999, where he is currently an<br />

associate professor and the inaugural academic director of the first year engineering program. In 1996, he was a visiting scholar at

the University <strong>of</strong> Texas at Austin when he spent time at Crystal<br />

Semiconductor Corporation designing advanced CMOS<br />

integrated circuits. He was a visiting pr<strong>of</strong>essor at the National<br />

Institute <strong>of</strong> Applied Science Toulouse, France in 2006, where he<br />

has collaborations in the area <strong>of</strong> nanoscale CMOS technology<br />

modelling and integration with educational IC design tools. He<br />

has been involved in numerous industry projects in Australia<br />

and overseas, and has attracted funding from reputed research<br />

organisations such as the Australian Defence Science and<br />

Technology Organisation (DSTO), and the Pork CRC<br />

(Cooperative Research Centre), Australia. He has authored<br />

eighty-five refereed research papers. His research interests

include digital CMOS IC design and testability, modelling and<br />

FPGA implementation <strong>of</strong> high performance processing systems,<br />

biomedical engineering and engineering education.



Dr Aziz is a senior member <strong>of</strong> IEEE and a member <strong>of</strong><br />

Engineers Australia. He has received numerous pr<strong>of</strong>essional<br />

awards. These include: an Excellent Achievement Award in<br />

Networking and Internet System Development (1998) from the<br />

Centre <strong>of</strong> the International Co-operation for Computerisation,<br />

Japan; the International Network for Engineering Education and<br />

Research Achievement Award (2007); a Citation for outstanding<br />

contributions to student learning from both the Australian<br />

Learning & Teaching Council (2007) and the Australasian<br />

Association for Engineering Education (AaeE–2007); an AaeE<br />

Award for Teaching Excellence - Highly Commended (2008).<br />

He has served as member <strong>of</strong> the program committees <strong>of</strong> many<br />

international conferences. He reviews papers for the IEEE<br />

Transactions on Computers and Electronics Letters, UK.

Recently he has been appointed a reviewer <strong>of</strong> the National<br />

Priorities Research Program, a flagship funding scheme <strong>of</strong> the<br />

Qatar National Research Fund.<br />


Minh Duc Pham received B.S.<br />

degree in electronic engineering from<br />

HCM National University <strong>of</strong><br />

Technology, Vietnam in 2003 and M.S.<br />

degree in microsystems technology from<br />

the University <strong>of</strong> South Australia in<br />

2008. He has been working as an<br />

ASIC/FPGA engineer since 2003 for<br />

Arrive Technologies Inc, a fab-less<br />

silicon supplier <strong>of</strong> Disruptive Next<br />

Generation Solutions for PDH, SONET, SDH and Ethernet<br />

Internetworking. Mr Pham is currently working as a Research<br />

Assistant in the School <strong>of</strong> Electrical and Information<br />

Engineering <strong>of</strong> the University <strong>of</strong> South Australia. His research<br />

interests are in the fields <strong>of</strong> VLSI implementation <strong>of</strong><br />

communication systems such as SoC for next generation<br />

networking, automation in VLSI design, forward error<br />

correction and coding theory.



A Formal Model for Abstracting the Interaction <strong>of</strong><br />

Web Services<br />

Li Bao<br />

Institute <strong>of</strong> S<strong>of</strong>tware Engineering, Dalian Maritime University, Dalian, China<br />

Email: ebond@163.com<br />

Weishi Zhang and Xiong Xie<br />

Institute <strong>of</strong> S<strong>of</strong>tware Engineering, Dalian Maritime University, Dalian, China<br />

Email: {teesiv, xxyj}@dlmu.edu.cn<br />

Abstract—This paper addresses the problem of modeling the interaction of Web services when they are composed together. Many subtle errors, such as messages not being received and deadlock, may occur due to uncontrolled concurrency of Web services. A model called IMWSC (Interaction Module for Web Service Composition) is proposed. The proposed model is used to abstract and analyze the interaction of web services. IMWSC is given a formal semantics by means of CCS (Calculus of Communicating Systems), a process algebra that can be used to model concurrent systems. The application of this model is further investigated in a case study. Some important points related to verifying the correctness of Web service interaction are discussed.

Index Terms—Web Service, Interaction, Formal Method,<br />

IMWSC<br />

I. INTRODUCTION<br />

In order to survive the massive competition created by<br />

the new online economy, many organizations are rushing<br />

to put their core business competencies on the Internet as<br />

a collection <strong>of</strong> web services for more automation and<br />

global visibility [1]. The concept of web service has recently become very popular. Web services are software applications which can be used through a network (intranet or Internet) via the exchange of messages based on XML standards [2]. The Web has thus become a vehicle of web services rather than just a repository of information.

The ability to efficiently and effectively share services<br />

on the Web is a critical step towards the development <strong>of</strong><br />

the new online economy driven by the Business-to-<br />

Business (B2B) e-commerce [1] . Existing enterprises<br />

would form alliances and integrate their services to share<br />

costs, skills, and resources in <strong>of</strong>fering a value-added<br />

service to form what is known as composite service.<br />

A composite web service is a system that consists <strong>of</strong><br />

several conceptually autonomous but cooperating units.<br />

In order to establish a long-running service composition,<br />


doi:10.4304/jcp.5.1.91-98<br />

many languages and tools emerged, which provide<br />

different schemas to glue service operations properly.<br />

Service composition approaches can be generally divided<br />

into two categories [3, 4] : business flow based approach<br />

and semantic based approach. Some famous projects on<br />

web service are based on business flow [24] , such as<br />

eFlow [5], METEOR-S [6], and SELF-SERV [7]; the semantics based approach composes services based on ontologies and relies on the use of AI planning techniques to automatically search, orchestrate, compose and execute services. Representative semantics based web service research projects are WebDG [8], SWORD [9], and SHOP [10].

From a s<strong>of</strong>tware engineering viewpoint, the<br />

construction <strong>of</strong> new services by the static or dynamic<br />

composition <strong>of</strong> existing services raises exciting new<br />

perspectives which can significantly impact the way<br />

industrial applications will be developed in the future —<br />

but they also raise a number <strong>of</strong> challenges. Among them<br />

is the essential problem <strong>of</strong> guaranteeing the correct<br />

interaction <strong>of</strong> independent, communicating s<strong>of</strong>tware<br />

pieces [2] .<br />

One legitimate question is therefore whether or not the<br />

correct and reliable interaction <strong>of</strong> web services can be<br />

guaranteed to a great extent by introducing the formal<br />

description techniques. Our investigations suggest a<br />

positive answer. This paper addresses the problem <strong>of</strong><br />

formally modeling the interaction <strong>of</strong> web services when<br />

they are composed together, be it in a dynamic or static<br />

way. A model for abstracting and analyzing one scenario<br />

<strong>of</strong> the interaction process <strong>of</strong> web services called IMWSC<br />

is proposed. After the interaction <strong>of</strong> web service is<br />

described in an abstract way, available supporting tool<br />

can be used to determine whether or not this interaction<br />

process satisfies the desired properties which are<br />

expressed in a kind <strong>of</strong> modal logic.<br />

This paper is structured as follows. Section 2 discusses<br />

the related work. In Section 3, we present IMWSC.<br />

Section 4 defines the semantics <strong>of</strong> IMWSC. The<br />

application <strong>of</strong> IMWSC is investigated in a case study in<br />

Section 5. And the conclusion and future work are drawn<br />

up in Section 6.



II. Related Work<br />

Petri nets are a formal model for concurrency. Since<br />

the semantics <strong>of</strong> Petri nets is formally defined, by<br />

mapping each BPEL process to a Petri net a formal model<br />

<strong>of</strong> BPEL can be obtained which allows the verification<br />

techniques and tools developed for Petri nets to be<br />

exploited in the context <strong>of</strong> BPEL processes. Many works<br />

such as [11, 12, 21, 22] introduce the Petri net based<br />

method for describing and verifying web service.<br />

In [21], Schmidt and Stahl discuss a mapping from<br />

BPEL to Petri nets by giving several examples. Each<br />

BPEL construct is mapped into a Petri net pattern.<br />

In [22], Schlingloff, Martens, and Schmidt also consider the usability problem. They show that usability

can be expressed in alternating-time temporal logic. As a<br />

consequence, model checking algorithms for this logic<br />

can be exploited to check for usability.<br />

As research aiming at facilitating web services<br />

integration and verification, WS-Net introduced in [11] is<br />

an executable architectural description language<br />

incorporating the semantics <strong>of</strong> Colored Petri-net with the<br />

style and understandability <strong>of</strong> object-oriented concepts.<br />

In [12], Tao provides a web service composition model

which is based on a kind <strong>of</strong> advanced Petri-net, OOPN<br />

(Object Oriented Petri Net). A web service can be<br />

mapped to an OOPN system based on this model and<br />

different OOPN systems can be integrated together into a

composite service via message passing.<br />

A process algebra is a rather small concurrent language<br />

that abstracts from many details and focuses on particular<br />

features. There are several relevant publications [1, 13, 14]<br />

for process algebra based methods.<br />

Gwen Salaün and Lucas Bordeaux present an overview of the applicability of process algebras in the context of web services in [1].

The authors of [13] present a framework for the design and verification of WSs using process algebras and their tools.

Li Bao and Weishi Zhang present a CCS based method for describing and verifying the behaviour of web services in [14].

III. Defining IMWSC<br />

A. Initiative <strong>of</strong> IMWSC<br />

For the Petri net based methods, one major defect is<br />

that the number <strong>of</strong> the places and the transitions described<br />

in a Petri net is too large. Researchers <strong>of</strong>ten map each<br />

element in a web service composition language to an<br />

element in Petri net and do not restrict the number <strong>of</strong> the<br />

places and the transitions in a Petri net. If the number of places and transitions described in a Petri net is not restricted, designers face state explosion, which is very difficult to deal with. Another major defect of the Petri net based methods is the lack of a description of the interaction process of web services.

The Petri net based methods often put their emphasis on describing the workflow inside a web service, and do not capture the complicated interaction process among web services.


For the process algebra based methods, one major defect is that some kinds of complex structure of web service composition cannot be defined using these methods; another major defect is the lack of a rigorous translation mechanism between the elements of a web service composition language and the elements of a process algebra. These methods often give only simple corresponding relations and translation rules, which cannot guarantee the correct preservation of the behavioral information and are apt to lead to a loss of information. We adopt a kind of hierarchically refined

description method to define the interaction process <strong>of</strong><br />

web services, i.e., we divide the interaction process <strong>of</strong><br />

web services into smaller parts, which is defined as<br />

Interaction Module for Web Service Composition<br />

(IMWSC in short). For each <strong>of</strong> these parts, a scenario <strong>of</strong><br />

the interaction of web services is defined. These smaller parts, i.e., modules, have a common property: the outcome of each module is determinate; in other words, each module has only one terminative state. This important property suggests that these modules can be composed. Therefore, by mapping each module to a transition in a Petri net, modules which describe scenarios of the interaction of web services can be strictly composed. However, due to space limitations, we only introduce the definition and properties of a module, i.e., the IMWSC model; the method for composing these modules will be introduced in future work.

Instead of composing activities of web services, we compose modules. The merit of our approach is that it effectively reduces the number of objects to be analyzed, so that the interaction process of web services is described more concisely and state explosion can be avoided. At the same time, web service compositions with complex structure can be described by composing these modules, which the process algebra based methods cannot achieve.

Another benefit of our approach is the introduction of a formal semantics for IMWSC. The semantics of IMWSC comprises three parts: semantic domain, semantic range, and valuation functions. The process calculus CCS (Calculus of Communicating Systems) [15, 16] is introduced as the semantic range, and valuation functions are then defined that translate an IMWSC (the semantic domain) into a process term. Since the valuation functions are rigorously defined, the correct preservation of the behavioral information can be guaranteed, so that loss of information is avoided.

B. Formal Definition <strong>of</strong> IMWSC<br />

A web service is a s<strong>of</strong>tware application which can be<br />

used through a network (intranet or Internet). For a web<br />

service, the basic functional unit is operation. The process<br />

<strong>of</strong> web service invocation is actually the process <strong>of</strong><br />

operation invocation. For IMWSC, the invocation <strong>of</strong><br />

operation is modeled by Activity. For a better control <strong>of</strong><br />

the structure <strong>of</strong> activities, we introduce a set <strong>of</strong> processes,<br />

i.e. Proc, as the basic control unit. A process, i.e. an<br />

element in Proc, is a linear concatenation <strong>of</strong> activities. If



the output data of one operation opr1 is the input data of another operation opr2, we consider that there is a corresponding relation between opr1 and opr2. For IMWSC, we introduce the binary relation Ra to represent this kind of relation. The symbol L is introduced into IMWSC to record the interaction history of web services. We describe the interaction process of web services in a scenario in the way defined by IMWSC; in other words, the definition of an instance of IMWSC is the definition of the interaction process of web services in a scenario.

Definition 1 (IMWSC). Formally, an IMWSC is a septuple <Service, Proc, Activity, L, Message, Ra, F>, where:
● Service denotes a set of web services;
● Proc is a set of processes;
● Activity is a set of activities;
● L is a set of sequences of activities;
● Message is a set of messages that are exchanged by services;
● Ra ⊆ Activity × Activity is a binary relation;
● F is a sextuple <f_pT, f_pS, f_pU, f_aP, f_aT, f_mA>, where:
─ f_pT : Proc → {c, b} is a mapping that describes the type of each process (composite or basic);
─ f_pS : Proc → Service is a mapping that associates each process with a service;
─ f_pU : Proc → Proc is a mapping that associates a process with a composite process;
─ f_aP : Activity → Proc is a mapping that associates each activity with a process;
─ f_aT : Activity → {ii, io, ei, eo, ex} is a mapping that describes the type of each activity (internal input, internal output, environmental input, environmental output, execute);
─ f_mA : Message → Activity is a mapping that associates each message with an activity.
We let f_con(proc) = { a | a ∈ Activity ∧ f_aP(a) = proc } for proc ∈ Proc ∧ f_pT(proc) = b. Let <_c ⊆ Activity × Activity be a partial order relation over Activity, defined as: <_c = { (a1, a2) | a1, a2 ∈ Activity ∧ f_aP(a1) = f_aP(a2) ∧ (a1 happens earlier than a2) }. An element proc in Proc is constructed by the following grammar:
proc ::= α | proc1 || proc2 | proc1 ≺ proc2,
where α ∈ Activity and proc1, proc2 ∈ Proc:
● proc1 || proc2 is a new process that performs proc1 and proc2 independently;
● proc1 ≺ proc2 is a new process that performs proc1 and proc2 sequentially.

Fig. 1 presents an illustration of the structure of IMWSC. In Fig. 1, a service is visualized by a circle; the interaction of services is visualized by a pair of parallel arrows (with opposite directions); the interaction process definition, i.e., the definition of an instance of IMWSC, is visualized by a rectangle.
Figure 1. Structure of IMWSC

C. The Necessary Condition for the Correctness of IMWSC
The fundamental requirement for a correct interaction process of services is that each input of a service shall be met by another service. Thus the basic requirement that guarantees the correctness of IMWSC is: for any activity a1 ∈ Activity, if it is an input activity, then there shall be another activity a2 ∈ Activity such that a1 Ra a2. The necessary condition for the correctness of IMWSC can also be expressed as the following predicate formula:
∀a ∈ Activity ((f_aT(a) = ii ∨ f_aT(a) = io) → ∃a' ∈ Activity (a Ra a' ∨ a' Ra a))
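As an illustration of this condition (an addition to this text, not part of the original paper), the following minimal Python sketch checks it for an IMWSC-like instance given as plain dictionaries; the data layout (an activity-type map and the relation Ra as a set of pairs) is an assumption made only for the example, and the sample values are taken from the case study in Section V.

def satisfies_necessary_condition(activity_types, Ra):
    # True if every input activity (type 'ii' or 'io') is related to some
    # other activity via Ra, in either direction.
    related = set()
    for a1, a2 in Ra:
        related.add(a1)
        related.add(a2)
    return all(a in related
               for a, t in activity_types.items()
               if t in ("ii", "io"))

activity_types = {
    "cReq": "ii", "cAsk": "io", "cInquiry": "ii", "cInfo": "io",
    "rReq": "io", "rAsk": "ii", "rInquiry": "io", "rAnswer": "ii",
    "iInfo": "ii", "iAnswer": "io",
}
Ra = {("rReq", "cReq"), ("cAsk", "rAsk"), ("rInquiry", "cInquiry"),
      ("cInfo", "iInfo"), ("iAnswer", "rAnswer")}
print(satisfies_necessary_condition(activity_types, Ra))   # True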

IV. Formal Semantics <strong>of</strong> IMWSC<br />

Formal semantic descriptions <strong>of</strong> a model are the basis<br />

for proving properties <strong>of</strong> this model. Moreover, they<br />

provide precise documentation <strong>of</strong> model design and<br />

standards for implementations, and (sometimes) they can<br />

be used for generation <strong>of</strong> prototype implementations.<br />

The formal semantics <strong>of</strong> IMWSC comprises three parts:<br />

semantic domain, semantic range, and valuation function.<br />

A process calculus CCS (Calculus <strong>of</strong> Communicating<br />

Systems, CCS for short) is introduced as a semantic range,



and then valuation functions are defined that translate an<br />

IMWSC (semantic domain) into a process term.<br />

A. Basic Syntax of CCS
Let A be a countably infinite collection of names, and let Ā = { ā | a ∈ A } be the set of complementary names (or co-names for short). Let L = A ∪ Ā be the set of labels, and Act = L ∪ {τ} be the set of actions, where τ denotes activities which are not externally visible. Let K be a countably infinite collection of process names. The collection of CCS expressions is given by the following grammar:
P, Q ::= K | α.P | Σ_{i∈I} Pi | P | Q | P[f] | P \ L
where:
● K is a process name in K;
● α is an action in Act;
● I is an index set;
● f : Act → Act is a relabelling function satisfying the following constraints:
─ f(τ) = τ and
─ f(ā) = the complement of f(a), for each label a;
● L is a set of labels.

B. Operational Semantics of CCS
CCS is formalized using axiomatic and operational semantics. To formally capture the understanding of the semantics of the language CCS, a collection of inference rules is introduced: a transition P --α--> Q holds for CCS expressions P, Q if, and only if, it can be proven using these rules. For a detailed introduction to the syntax and operational semantics of CCS, readers are referred to [17, 18].
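The inference-rule table itself does not survive extraction here; for the reader's convenience the standard SOS rules of CCS (as given, e.g., by Milner [15] and in [18]) can be written as follows. This listing is supplied as reference material and is not copied from the original table.

\[
\frac{}{\alpha.P \xrightarrow{\alpha} P}\ (\text{ACT})
\qquad
\frac{P_j \xrightarrow{\alpha} P'}{\sum_{i\in I} P_i \xrightarrow{\alpha} P'}\ (\text{SUM}_j)
\qquad
\frac{P \xrightarrow{\alpha} P' \quad K \stackrel{\mathrm{def}}{=} P}{K \xrightarrow{\alpha} P'}\ (\text{CON})
\]
\[
\frac{P \xrightarrow{\alpha} P'}{P \mid Q \xrightarrow{\alpha} P' \mid Q}\ (\text{COM1})
\qquad
\frac{Q \xrightarrow{\alpha} Q'}{P \mid Q \xrightarrow{\alpha} P \mid Q'}\ (\text{COM2})
\qquad
\frac{P \xrightarrow{a} P' \quad Q \xrightarrow{\bar a} Q'}{P \mid Q \xrightarrow{\tau} P' \mid Q'}\ (\text{COM3})
\]
\[
\frac{P \xrightarrow{\alpha} P'}{P \setminus L \xrightarrow{\alpha} P' \setminus L}\ (\alpha,\bar\alpha \notin L)\ (\text{RES})
\qquad
\frac{P \xrightarrow{\alpha} P'}{P[f] \xrightarrow{f(\alpha)} P'[f]}\ (\text{REL})
\]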

C. Defining Valuation Functions
The valuation functions of IMWSC, and their corresponding semantic domains and semantic ranges, are given in Tab. 1 (the symbol IMWSC denotes an IMWSC instance; P denotes a process term in CCS; A denotes the set of atomic processes; Activity denotes a set of activities; Act denotes the set of actions in CCS).
TABLE I. VALUATION FUNCTIONS AND THEIR DOMAINS AND RANGES
where:
● f_m(proc_r) = f_c(proc_1) | f_c(proc_2) | ··· | f_c(proc_n), where proc_i ∈ Proc (1 ≤ i ≤ n), f_pU(proc_r) = Ø and f_pU(proc_i) = proc_r;
● f_c(proc_i) = f_a(proc_i) iff f_pT(proc_i) = b;
● f_c(proc_i) = f_r(proc_i) iff f_pT(proc_i) = c;
● f_a(proc_i) = f_e(a_1) · f_e(a_2) ··· f_e(a_n), where a_i ∈ Activity ∧ f_aP(a_i) = proc_i, and a_1 <_c a_2 <_c ··· <_c a_n;
● f_e(a_i) = !a_i iff f_aT(a_i) = ii ∨ f_aT(a_i) = ei;
● f_e(a_i) = ?a_i iff f_aT(a_i) = io ∨ f_aT(a_i) = eo;
● f_r(proc_i) = f_c(proc_1) | f_c(proc_2) iff f_pU(proc_1) = proc_i ∧ f_pU(proc_2) = proc_i ∧ proc_i = proc_1 || proc_2;
● f_r(proc_i) = f_c(proc_1) . f_c(proc_2) iff f_pU(proc_1) = proc_i ∧ f_pU(proc_2) = proc_i ∧ proc_i = proc_1 ≺ proc_2.

By means of the valuation functions defined in Tab. 1, an algorithm aiming at translating an IMWSC instance to CCS terms can be developed:

Algorithm IMWSC_Instance_to_CCS
INPUT: an IMWSC instance
OUTPUT: the corresponding CCS terms

Process Trans_fm (IMWSC instance)
{
  1. Str Exp = Empty;
  2. For each p in Proc
  3.   Exp = Exp | Trans_fc(p);
  4. Return Exp;
}
Process Trans_fc (process p ∈ Proc)
{
  1. If (processType = basic) { Return Trans_fa(p) };
  2. Else { Return Trans_fr(p) };
}
Process Trans_fa (process p_i ∈ Proc)
{
  1. Str name = getName(p_i);
  2. SET name = NIL;
  3. For each activity a_i of process p_i in Activity
  4.   If (activityType = output) name = !getName(a_i).name;
  5.   Else If (activityType = input) name = ?getName(a_i).name;
  6. Return name;
}
Process Trans_fr (process p_i ∈ Proc)
{
  1. Str Exp = Empty;
  2. For each subService u_j of process p_i
  3.   If (compositionType of p_i is parallel) Exp = Exp | Trans_fc(u_j);
  4.   Else Exp = Exp . Trans_fc(u_j);
  5. Return Exp;
}

If the IMWSC instance to be translated comprises m basic processes, and n = max_i {num(p_i)}, where num(p_i) returns the number of activities contained in process p_i, the complexity of the above algorithm is O(m × n).
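To make the translation concrete, here is a minimal executable Python sketch of the same idea; the process/activity record layout (dictionaries with "type", "parent", "composition" and "activities" fields) is an assumption introduced only for this illustration and is not the paper's notation.

def trans_fa(proc):
    # Basic process: concatenate its activities as CCS prefixes, ending in nil.
    term = "nil"
    for act in reversed(proc["activities"]):      # earlier activities become outer prefixes
        prefix = "!" if act["kind"] == "output" else "?"
        term = prefix + act["name"] + "." + term
    return term

def trans_fr(proc, procs):
    # Composite process: combine its sub-processes in parallel or in sequence.
    sub_terms = [trans_fc(p, procs) for p in procs if p["parent"] == proc["name"]]
    sep = " | " if proc["composition"] == "parallel" else " . "
    return "(" + sep.join(sub_terms) + ")"

def trans_fc(proc, procs):
    return trans_fa(proc) if proc["type"] == "basic" else trans_fr(proc, procs)

def trans_fm(procs):
    # Top-level translation: parallel composition of all root processes.
    roots = [p for p in procs if p["parent"] is None]
    return " | ".join(trans_fc(p, procs) for p in roots)

# Example: the Client process of the case study in Section V.
procs = [{"name": "Client", "type": "basic", "parent": None, "composition": None,
          "activities": [{"name": "Req", "kind": "output"},
                         {"name": "Ask", "kind": "input"},
                         {"name": "Inquiry", "kind": "output"},
                         {"name": "Info", "kind": "input"}]}]
print(trans_fm(procs))   # !Req.?Ask.!Inquiry.?Info.nil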

V. Case Study: Application <strong>of</strong> IMWSC to a Concrete<br />

Scenario<br />

A. Abstracting the Interaction <strong>of</strong> Web Services<br />

We will investigate the application <strong>of</strong> IMWSC in a<br />

simple scenario. There are three services involved in this<br />

scenario:<br />

― The Client Service, which needs to find out some useful information (for convenience, the client is here considered as a service);
― The Response Service, which is responsible for dealing with information inquiry requests;
― The Information Service, which acts as a database and provides the useful information.


The business process of this scenario is introduced briefly as follows:
1. The Response Service receives a request from the Client Service, which needs to find out some useful information;
2. The Response Service contacts the Information Service and relays the information inquiry request;
3. The Response Service returns the answer to the Client Service.
Fig. 2 presents an illustration of the structure of this scenario, where
● A service is visualized by a rectangle (with rounded corners);
● A state of a service is visualized by a circle (the initial and the terminative states of a service are visualized by dedicated icons);
● A transition between states is visualized by an arrow (with a curved line), from the source state to the target state;
● The supply channels of services in this scenario are visualized by a pair of parallel arrows (with opposite directions).
Figure 2. A Scenario of Interaction of Services

By applying IMWSC, the interaction process of services in this scenario is described as follows:
_____________________________________________
f_con(Client) = { cReq, cAsk, cInquiry, cInfo };
f_con(Response) = { rReq, rAsk, rInquiry, rAnswer };
f_con(InfoS) = { iAnswer, iInfo }.
f_aT(cReq) = ii; f_aT(cAsk) = io; f_aT(cInquiry) = ii; f_aT(cInfo) = io;
f_aT(rReq) = io; f_aT(rAsk) = ii; f_aT(rInquiry) = io; f_aT(rAnswer) = ii;
f_aT(iReq) = io; f_aT(iInfo) = ii; f_aT(iAnswer) = io.
cReq <_c cAsk <_c cInquiry <_c cInfo;
rReq <_c rAsk <_c rInquiry <_c rAnswer;
iAnswer <_c iInfo.
<rReq, cReq> ∈ Ra; <cAsk, rAsk> ∈ Ra; <rInquiry, cInquiry> ∈ Ra;
<cInfo, iInfo> ∈ Ra; <iAnswer, rAnswer> ∈ Ra.
______________________________________________

By means of the semantics of IMWSC defined in Section 4, the corresponding CCS terms are as follows:
Client = !Req. ?Ask. !Inquiry. ?Info. nil;
Response = ?Req. !Ask. ?Inquiry. !Answer. nil;
InfoS = ?Answer. !Info. nil;
Scenario = (Client | Response | InfoS) / {Req, Ask, Info, Inquiry, Answer}
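As a quick sanity check outside the paper's toolchain (this sketch is an addition for exposition, not the authors' method), the following small Python fragment greedily fires complementary action pairs among the three terms above and confirms that the composed scenario can run to completion; the string encoding of the terms is an assumption made only for this illustration, and the greedy head-matching is adequate here because each process is a simple sequence.

# Each service is a list of actions ('!' = output, '?' = input).
client   = ["!Req", "?Ask", "!Inquiry", "?Info"]
response = ["?Req", "!Ask", "?Inquiry", "!Answer"]
infos    = ["?Answer", "!Info"]

def can_terminate(procs):
    # Greedily fire matching output/input pairs; True if all processes finish.
    procs = [list(p) for p in procs]
    progress = True
    while progress:
        progress = False
        for i, p in enumerate(procs):
            if not p:
                continue
            for j, q in enumerate(procs):
                if i != j and q and p[0][1:] == q[0][1:] and p[0][0] != q[0][0]:
                    p.pop(0); q.pop(0)       # synchronize the matching pair
                    progress = True
                    break                    # p's head changed; rescan
    return all(not p for p in procs)

print(can_terminate([client, response, infos]))   # True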

B. Verifying the Interaction <strong>of</strong> Web Services<br />

CCS is an effective modeling language which has<br />

available supporting tool CWB-NC (Concurrency<br />

Workbench <strong>of</strong> the New Century, CWB-NC for short) [20] .<br />

We use this tool to reason on and verify the behavior <strong>of</strong><br />

an instance <strong>of</strong> IMWSC.<br />

Using the supporting tool of CCS, i.e., CWB-NC, aims at assisting the design and verification of a system. Applying CCS in the design phase of a system helps to show explicitly the interaction of the components that compose the system; after the model of a system has been constructed, the modal μ-calculus [23] can be used to reason about the system behavior. For a detailed introduction to

One type <strong>of</strong> verification supported by the tool is<br />

reachability analysis. Here, as in each type <strong>of</strong> verification,<br />

our first step in using the tool is to write a description <strong>of</strong><br />

the system supported by CWB-NC. The description is<br />

then parsed by the tool and checked for syntactic<br />

correctness. We then give a logical formula describing a<br />

“bad state” that the system should never reach. Given<br />

such a formula and system description, CWB-NC<br />

explores every possible state the system may reach during its execution and checks to see if a bad state is

reachable. If a bad state is detected, a description <strong>of</strong> the<br />

execution sequence leading to the state is reported to the<br />

user. Many bugs such as deadlock and critical section<br />

violation may be found using this approach [20] .<br />

Correct termination is one of the main properties a proper web service should satisfy. We use can_terminate to denote the termination property of a system: can_terminate is true of a system if it can reach a terminative state, i.e., a state from which no further transition is possible. We express this property in the modal μ-calculus as:
prop can_terminate =
  min X = [-]ff \/ <->X


Reachability analysis is actually a special case <strong>of</strong> a<br />

more general type <strong>of</strong> verification called model checking.<br />

In the model checking approach a system is again<br />

described using a design language and a property the<br />

system should have is formulated as a logical formula [20] .<br />

Another type <strong>of</strong> verification supported by CWB-NC<br />

involves using a design language for defining both<br />

systems and specifications. Here the specification<br />

describes a system behavior more abstractly than the<br />

system description [20] . A relation, i.e., Observational<br />

equivalence needs to be introduced before we conduct<br />

this type <strong>of</strong> verification.<br />

Observational equivalence is useful in verification as it lays the conceptual basis for deciding that the behaviors of two web services can be considered the same. It can also be used as a tool for reducing verification effort by replacing a process with a smaller (in size) but equivalent one. The bisimulation equivalence between two processes is a relation between their evolutions such that for each evolution of one of the services there is a corresponding evolution of the other service, such that the evolutions are observationally equivalent and lead to processes which are again bisimilar. This characterization of the behavior of web services using the notion of bisimulation helps service designers optimize composite services by, e.g., replacing their component web services with equivalent ones. Another motivation is customization of services: to enhance competitiveness, a service provider may modify its service for customers' convenience, and this customized service must conform to the original one. Formally, the relation of observational equivalence is defined as:

defined as:<br />

Definition 1 [Weak Transitions] [23] :<br />

● q q'<br />

iff , ;<br />

ε<br />

⇒ q q q qn<br />

q'<br />

= → → → = L n ≥ 0<br />

0<br />

● q q'<br />

iff ;<br />

τ<br />

⇒ q q'<br />

ε<br />

⇒<br />

● q q'<br />

iff , (<br />

α<br />

ε α<br />

⇒ q⇒<br />

q →q<br />

ε<br />

⇒ q'<br />

α ≠ τ ).<br />

Definition 2[Observational Equivalence] [23] :<br />

1<br />

τ<br />

Let S ⊆ Q× Q . The relation S is a weak<br />

bisimulation relation if whenever 1 2 then: q S q<br />

● q1 q'1<br />

implies for some such<br />

that ;<br />

α<br />

→ q2 q'2<br />

α<br />

⇒ 2 ' q<br />

q'1<br />

S q'2<br />

● q2 q'2<br />

implies for some such that<br />

.<br />

α<br />

→ q1 q'1<br />

α<br />

⇒ 1 ' q<br />

q'1<br />

S q'2<br />

q 1 and q2 are observationally equivalent, if 1 2<br />

for some weak bisimulation relation , written<br />

q S q<br />

q .<br />

1<br />

2<br />

τ<br />

τ<br />

S 1 q ≈ 2



In this scenario, Client is considered as a service which<br />

interacts with the composition <strong>of</strong> the services Response<br />

and InfoS.<br />

The behaviour <strong>of</strong> the composition <strong>of</strong> the services<br />

Response and InfoS can be described in two ways:<br />

1. The system description <strong>of</strong> the composition <strong>of</strong><br />

services Response and InfoS is:<br />

Response = ? Req. ! Ask. ? Inquiry. ! Answer. nil;<br />

InfoS = ? Answer. ! Info. nil;<br />

Info_Response = ( Response | InfoS ) / {Answer} ;

2. The specification <strong>of</strong> this composition is:<br />

Spe = ? Req. ! Ask. ? Inquiry. ! Info. nil<br />

The command 'eq -S obseq' of the CWB-NC tool can be used to examine whether or not two processes are observationally equivalent. By executing this command, we find that the processes Info_Response and Spe are observationally equivalent.
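For readers without CWB-NC at hand, the same claim can be illustrated with a small self-contained Python sketch (an addition for this text, not the paper's tooling): it enumerates the complete visible traces of the composition Response | InfoS with the Answer synchronization hidden, and compares them with the trace of the specification Spe. For these small deterministic processes, having the same complete visible traces coincides with observational equivalence; in general a full weak-bisimulation check would be required.

# Each sequential process is a list of visible actions; '!' output, '?' input.
RESPONSE = ["?Req", "!Ask", "?Inquiry", "!Answer"]
INFOS    = ["?Answer", "!Info"]
SPEC     = ["?Req", "!Ask", "?Inquiry", "!Info"]
HIDDEN   = {"Answer"}                 # actions restricted in Info_Response

def traces_parallel(p, q, hidden):
    # Visible traces of (p | q) with `hidden` actions forced to synchronize.
    results = set()
    def step(p, q, trace):
        if not p and not q:
            results.add(tuple(trace)); return
        moved = False
        # a hidden complementary pair synchronizes and becomes invisible (tau)
        if p and q and p[0][1:] == q[0][1:] and p[0][0] != q[0][0] and p[0][1:] in hidden:
            step(p[1:], q[1:], trace); moved = True
        # otherwise each process may perform its visible head action alone
        if p and p[0][1:] not in hidden:
            step(p[1:], q, trace + [p[0]]); moved = True
        if q and q[0][1:] not in hidden:
            step(p, q[1:], trace + [q[0]]); moved = True
        if not moved:                 # stuck, e.g. waiting on a hidden action
            results.add(tuple(trace))
    step(list(p), list(q), [])
    return results

impl = traces_parallel(RESPONSE, INFOS, HIDDEN)
spec = {tuple(SPEC)}
print(impl == spec)                   # True: same complete visible traces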

VI. Conclusions and Future Work<br />

Formal description and verification <strong>of</strong> the interaction<br />

<strong>of</strong> web services is an important research field. After the<br />

description and verification of a practical web service application, we conclude that IMWSC is well suited to abstracting, simulating, and analyzing a scenario of the interaction process of web services, which facilitates correct implementation. Currently many service composition methods do not take into account abstracting and analyzing the interactive features of services in a composition; mistakes are therefore easily made when using these methods. Our work is an attempt to abstract and verify the interaction process of web services, which will make the composition process more reliable.

Further work will involve defining the way IMWSC instances are composed. An instance of the IMWSC model defines only one scenario of the interaction process of web services. To model the complete interaction process of web services, the instances of the IMWSC model need to be composed. Since Petri nets are a well-known formal model capable of defining the composition process, we plan to compose the instances using Petri nets. In our further work, we will also present the fixed point property of the IMWSC model. The fixed point property indicates that the outcome of each instance of the IMWSC model is determinate; in other words, each module has only one terminative state. This property lays the mathematical foundation for mapping a module to a transition in a Petri net.

ACKNOWLEDGMENT<br />

This research is supported by the National Natural<br />

Science Foundation <strong>of</strong> China under Grant No.60573087.<br />


REFERENCES<br />

[1] Rachid H., Boualem B.. A Petri net based model for Web<br />

service composition. Proc. <strong>of</strong> the 14th Australian Database<br />

Conference on Database Technologies, Adelaide, South<br />

Australia, 2003: 191-200.<br />

[2] Salaün G., Bordeaux L.. Describing and reasoning on Web<br />

services using process algebra. Proc. <strong>of</strong> the 2nd IEEE<br />

International Conference on Web Services, San Diego,<br />

California, USA, 2004: 43-51.<br />

[3] Schahram Dustdar, Wolfgang Schreiner. A survey on web<br />

services composition. Int. J. Web and Grid Services, 1(1):<br />

1-30, 2005.<br />

[4] Muhammad Adeel Talib. Modeling the Flow in Dynamic<br />

Web Services Composition. Information Technology<br />

<strong>Journal</strong>, 3 (2): 184-187, 2004.<br />

[5] Casati F, Sayal M, and Shan M C. Developing e-services<br />

for composing e-services. Proc. <strong>of</strong> the 13th International<br />

Conference on Advanced Information Systems<br />

Engineering (CAiSE2001), Interlaken, Switzerland, 2001:<br />

171-186.<br />

[6] Patil A, Oundhakar S, Sheth A, Verma K. METEOR-S<br />

Web service annotation framework [A]. Proc. <strong>of</strong> the 13th<br />

International World Wide Web Conference<br />

(WWW2004)[C], New York, USA, May 2004: 553 – 562.<br />

[7] Benatallah B, Sheng Q Z, Dumas M. The Self-Serv<br />

Environment for Web Services Composition[J], IEEE<br />

Internet Computing, 2003, 7(1): 40 – 48.<br />

[8] Medjahed B, Bouguettaya A, Elmagarmid A K.<br />

Composing Web services on the Semantic Web[J], VLDB<br />

<strong>Journal</strong>, 2003, 12(4): 333 – 351.<br />

[9] Shankar R P, Armando F. SWORD: A developer toolkit<br />

for Web service composition, Proc. <strong>of</strong> the 11th<br />

International World Wide Web Conference, Honolulu,<br />

Hawaii, USA, May 2002, 786 – 810<br />

[10] E. Sirin, B. Parsia, D. Wu, J. Hendler, and D. Nau. HTN<br />

Planning for Web Service Composition using SHOP.<br />

<strong>Journal</strong> <strong>of</strong> Web Semantics, 1(4):377-396, 2004.<br />

[11] Zhang Jia , Chung Jen Yao , Chang C. K. . WS-Net: A<br />

petri-net based specification model for web services. Proc.<br />

<strong>of</strong> the 2nd IEEE International Conference on Web<br />

Services,San Diego,California,USA, 2004:420-427.<br />

[12] Tao X. F. Formalizing Web service and modeling Web<br />

service based on object oriented Petri net. Lecture Notes in<br />

Computer Science. Berlin, Heidelberg. Springer-Verlag,<br />

2004.<br />

[13] A. Ferrara. Web Services: a Process Algebra Approach. Proc. of the 2nd Int. Conf. on Service-Oriented Computing (ICSOC'04), ACM Press, 2004: 242-251.

[14] Li Bao, Weishi Zhang, Xiuguo Zhang. Describing and<br />

Verifying Web Service Using CCS. In Proc. <strong>of</strong> the 7th<br />

International Conference on Parallel and Distributed<br />

Computing, Applications and Technologies, 2006: 421-426.<br />

[15] R. Milner. Communication and Concurrency. Prentice Hall,<br />

1989.<br />

[16] R. Milner. A Calculus of Communicating Systems. Springer-Verlag, 1980.

[17] Colin Fidge. A comparative introduction to CSP, CCS and<br />

LOTOS. Technical Report, Department <strong>of</strong> Computer<br />

Science, University <strong>of</strong> Queensland, Brisbane, Australia,<br />

April 1994. Available at:<br />

http://sky.fit.qut.edu.au/~fidgec/Publications/fidge94g.pdf.<br />

[18] Luca Aceto and Kim G. Larsen. An Introduction to Milner's CCS. Technical Report, Department of Computer Science, Aalborg University, Aalborg, Denmark, March 2005. Available at: http://www.dimi.uniud.it/miculan/Didattica/MFGC04/intro2ccs.pdf.

[19] C. Stirling. Modal logics for communicating systems.<br />

Theoretical Computer Science, 49(2-3): 311-347, 1987.<br />

[20] R. Cleaveland and S. Sims. The NCSU concurrency<br />

workbench. In R. Alur and T. Henzinger, editors,<br />

Proceedings <strong>of</strong> the 8th International Conference on<br />

Computer Aided Verification, volume 1102 <strong>of</strong> Lecture<br />

Notes in Computer Science, New Brunswick, NJ, USA,<br />

1996: 394-397.<br />

[21] K. Schmidt and C. Stahl. A Petri net semantic for<br />

BPEL4WS - validation and application. Proc. <strong>of</strong> the 11th<br />

Workshop on Algorithms and Tools for Petri Nets,<br />

Paderborn, Germany, 2004: 1-6.<br />

[22] B. Schlingloff, A. Martens, and K. Schmidt. Modeling and model checking web services. Proc. of the 2nd International Workshop on Logic and Communication in Multi-Agent Systems, volume 126 of Electronic Notes in Theoretical Computer Science, Nancy, France, August 2004: 3-26.
[23] M. Hennessy and R. Milner. Algebraic laws for nondeterminism and concurrency. J. Assoc. Comput. Mach., 32: 137-161, 1985.
[24] Hull R., Su J. Tools for Composite Web Services: A Short Overview. SIGMOD Record, 34 (2): 86-95, 2005.

Li Bao received the BS degree in computer science from Dalian Nationality University, China, in 2003, and the MS degree in computer science from Dalian Maritime University, China, in 1996. From 2006 to date, he has worked as a PhD candidate in the Institute of Software Engineering, Dalian Maritime University, China. His research interests include distributed computing, software engineering, and formal description techniques.

Weishi Zhang received the BS degree in computer science from Xi'an Jiaotong University, China, in 1984, and the MS degree in computer science from the Chinese Academy of Science, China, in 1986. He received the PhD degree in computer science from the University of Munich, Germany, in 1996. From 1986 to 1990, he was an assistant researcher at the Shenyang Institute of Computing, Chinese Academy of Science, China. From 1990 to 1992, he was a visiting scholar at Passau University, Germany. From 1992 to 1997, he was an assistant professor at the University of Munich, Germany. In 1997, he joined the Department of Computer Science, Dalian Maritime University, China, where he is currently a professor of computer science. His research interests include distributed computing, software engineering, software architecture, formal specification techniques, and program semantics models.

Xiong Xie works as a PhD candidate in the Institute of Software Engineering, Dalian Maritime University, China. Her research interests include distributed computing, software engineering, and formal description techniques.



Performance Evaluation <strong>of</strong> Elliptic Curve<br />

Projective Coordinates with Parallel GF(p) Field<br />

Operations and Side-Channel Atomicity<br />

Turki F. Al-Somani<br />

Computer Engineering Department, Umm Al-Qura University, P.O. Box: 715 Makkah 21955, Saudi Arabia<br />

Email: tfsomani@uqu.edu.sa<br />

Abstract—This paper presents performance analysis and<br />

evaluation <strong>of</strong> elliptic curve projective coordinates with<br />

parallel field operations over GF(p). Side-channel atomicity<br />

has been used in these comparisons. The field computations<br />

<strong>of</strong> point operations are segmented into atomic blocks that<br />

are indistinguishable from each other to resist against<br />

simple power analysis attacks. These atomic blocks are<br />

executed in parallel using 2, 3 and 4 multipliers.<br />

Comparisons between the Homogeneous, Jacobian and<br />

Edwards coordinate systems using parallel field operations<br />

over GF(p) are presented. Results show that the Edwards coordinate system outperforms both the Homogeneous and Jacobian coordinate systems and gives better area-time (AT) and area-time² (AT²) complexities.

Index Terms— elliptic curve cryptosystems, projective<br />

coordinate systems, Edwards coordinates, side-channel<br />

atomicity.<br />

I. INTRODUCTION<br />

Elliptic Curve Cryptosystems (ECCs) have been recently<br />

attracting increased attention [1]. The ability to use<br />

smaller key sizes and the computationally more efficient<br />

ECC algorithms compared to those used in earlier public<br />

key cryptosystems such as RSA [2] and ElGamal [3] are<br />

two main reasons why ECCs are becoming more popular.<br />

They are considered particularly suitable for<br />

implementation on smart cards or mobile devices.<br />

Because <strong>of</strong> the physical characteristics <strong>of</strong> such devices<br />

and their use in potentially hostile environments, Side<br />

Channel Attacks (SCA) [4 - 8] on such devices are<br />

considered serious threats. Two main types <strong>of</strong> SCAs have<br />

gained considerable attention: simple power analysis<br />

(SPA) attacks and differential power analysis (DPA)<br />

attacks. An SPA attack uses only a single observation <strong>of</strong><br />

the power consumption, whereas a DPA attack uses many<br />

observations <strong>of</strong> the power consumption together with<br />

statistical tools.<br />

Manuscript received December 13, 2008; revised April 11, 2009; accepted April 27, 2009.
doi:10.4304/jcp.5.1.99-109

SCAs seek to break the security of these devices by observing their power consumption traces or computation timing. Careless or naive implementation of cryptosystems allows side channel attacks to infer the secret key or obtain partial information about it. Thus,

designers <strong>of</strong> cryptosystems seek to introduce algorithms<br />

and designs that are not only efficient, but also side<br />

channel attack resistant [9].<br />

The primary operation <strong>of</strong> ECCs is scalar<br />

multiplication. Scalar multiplication in the group <strong>of</strong><br />

points <strong>of</strong> an elliptic curve is analogous to exponentiation<br />

in the multiplicative group <strong>of</strong> integers modulo a fixed<br />

integer m. The scalar multiplication operation, denoted as<br />

kP, where k is an integer and P is a point on the elliptic<br />

curve, represents the addition <strong>of</strong> k copies <strong>of</strong> point P.<br />

Scalar multiplication is computed by a series <strong>of</strong> point<br />

doubling and point addition operations <strong>of</strong> the point P<br />

depending on the bit sequence representing the scalar<br />

multiplier k. Several scalar multiplication algorithms have<br />

been proposed in the literature. A good survey is<br />

conducted by Hankerson et. al. in [10].<br />

Several countermeasures against SCA have been<br />

proposed in the literature. Chevallier-Mames et al. [11]<br />

proposed side-channel atomicity as an efficient<br />

countermeasure against only SPA attacks. Side-channel<br />

atomicity involves almost no computational overhead to<br />

resist against SPA attacks. It splits the elliptic curve point<br />

operations into atomic blocks that are indistinguishable<br />

from each other. Hence, side-channel atomicity is<br />

considered to be an inexpensive countermeasure that does<br />

not leak any data regarding the operation being<br />

performed [11 - 13].<br />

The group operations in an affine coordinate system<br />

involve finite field inversion, which is a very costly<br />

operation, particularly over prime fields. Projective<br />

coordinate systems are used to eliminate the need for<br />

performing inversion. Several projective coordinate<br />

systems have been proposed in the literature including the<br />

Homogeneous, Jacobian and Edwards coordinate systems<br />

[9][14][15].<br />

The selection <strong>of</strong> a projective coordinate is based on<br />

the number <strong>of</strong> arithmetic operations, mainly<br />

multiplications. This is to be expected due to the<br />

sequential nature <strong>of</strong> these architectures where a single<br />

multiplier is used. For high performance<br />

implementations, such sequential architectures are too<br />

slow to meet the demand <strong>of</strong> increasing number <strong>of</strong><br />

operations. One solution for meeting this requirement is



to exploit the inherent parallelism within the elliptic<br />

curve point operations in projective coordinate [16 - 19].<br />

The performance <strong>of</strong> these projective coordinates varies<br />

when parallel field multipliers are used. This is because<br />

<strong>of</strong> the nature <strong>of</strong> their critical paths. This paper<br />

investigates and compares the performance <strong>of</strong> the<br />

Homogeneous, Jacobian and Edwards coordinate systems<br />

with side-channel atomicity when parallel field<br />

multipliers are employed. The rest <strong>of</strong> this paper is<br />

organized as follows. Section II gives a brief introduction<br />

to ECCs. Section III introduces projective coordinate<br />

systems. Section IV shows how the point operations <strong>of</strong><br />

the projective coordinate systems are segmented into<br />

atomic blocks and how they are executed in parallel.<br />

Section V shows the performance evaluation <strong>of</strong> the<br />

selected projective coordinate systems using parallel field<br />

multipliers. Finally, Section VI concludes the paper.<br />

II. ELLIPTIC CURVE PRELIMINARIES<br />

The elliptic curve cryptosystem (ECC), which was<br />

originally proposed by Neal Koblitz and Victor Miller in

1985, is seen as a serious alternative to RSA because the<br />

key size <strong>of</strong> ECC is much shorter than that <strong>of</strong> RSA and<br />

ElGamal. To date, no significant breakthroughs have<br />

been made in determining weaknesses in the EC<br />

algorithm, which is based on the discrete logarithm<br />

problem over points on an elliptic curve. The fact that the<br />

problem appears so difficult to crack means that key sizes<br />

can be reduced considerably, even exponentially. This<br />

makes ECC a serious challenger to RSA and ElGamal.<br />

Extensive research has been done on the underlying<br />

math, security strength, and efficient implementation <strong>of</strong><br />

ECCs [20]. Among the different fields that can underlie<br />

elliptic curves, prime fields GF(p) and binary fields GF(2^m) have been shown to be best suited for cryptographic applications. An elliptic curve E over the finite field GF(p) defined by the parameters a, b ∈ GF(p) with p > 3 consists of the set of points P = (x, y), where x, y ∈ GF(p), that satisfy the equation
y² = x³ + ax + b,
where a, b ∈ GF(p) and 4a³ + 27b² ≠ 0 mod p, together with the additive identity of the group, the point O known as the “point at infinity”.

Scalar multiplication (kP) is the primary operation of ECCs. Several scalar multiplication algorithms have been proposed in the literature [10]. Computing kP can be done with the straightforward double-and-add algorithm, the so-called binary algorithm, based on the binary expansion of k = (km−1, …, k0), where km−1 is the most significant bit of the multiplier k. The double-and-add scalar multiplication algorithm is the most straightforward scalar multiplication algorithm. It inspects the bits of the scalar multiplier k; if the inspected bit ki = 0, only point doubling is performed, whereas if the inspected bit ki = 1, both point doubling and point addition are performed. The double-and-add algorithm requires (m−1) point doublings and an average of (m/2) point additions [10].
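For concreteness, here is a minimal Python sketch of the left-to-right double-and-add algorithm added for this text; the group operations are passed in as functions standing in for the elliptic curve point operations of Section III, and the toy demonstration uses ordinary integer addition so the sketch runs stand-alone.

# Left-to-right double-and-add for kP in any additive group.
def scalar_multiply(k, P, double, add, identity):
    Q = identity
    for bit in bin(k)[2:]:            # most significant bit first
        Q = double(Q)                 # always double
        if bit == "1":
            Q = add(Q, P)             # add only when the inspected bit is 1
    return Q

# Toy demo: integer addition stands in for point addition, so 13*7 = 91.
print(scalar_multiply(13, 7,
                      double=lambda q: q + q,
                      add=lambda q, p: q + p,
                      identity=0))    # 91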

Non-adjacent form (NAF) representation reduces the average number of point additions to (m/3) [21]. In NAF, signed-digit representations are used such that the scalar multiplier's coefficients satisfy ki ∈ {0, ±1}. NAF has the property that no two consecutive coefficients are nonzero, and every positive integer k has a unique NAF encoding, denoted NAF(k).
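The following small Python sketch (added here for illustration, not taken from the paper) computes NAF(k) with the standard right-to-left rule and checks the two properties mentioned above.

def naf(k):
    # Non-adjacent form of k, least significant digit first.
    digits = []
    while k > 0:
        if k % 2 == 1:
            d = 2 - (k % 4)          # +1 or -1, chosen so (k - d) is divisible by 4
            k -= d
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits

encoding = naf(29)                   # NAF of 29, least significant digit first
print(encoding)                      # [1, 0, -1, 0, 0, 1]: 1 - 4 + 32 = 29
print(sum(d * 2**i for i, d in enumerate(encoding)))          # 29
print(all(encoding[i] == 0 or encoding[i + 1] == 0            # non-adjacency holds
          for i in range(len(encoding) - 1)))                 # True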

III. PROJECTIVE COORDINATE SYSTEMS<br />

Projective coordinate systems are used to eliminate the<br />

need for performing inversion. Several projective<br />

coordinate systems have been proposed in the literature<br />

[9][14][15], including the Homogeneous, Jacobian and<br />

Edwards coordinate systems. For the Homogeneous, so called projective, coordinate system, an elliptic curve point P takes the form (x, y) = (X/Z, Y/Z), while for the Jacobian coordinate system, P takes the form (x, y) = (X/Z², Y/Z³) [9].
Let P1, P2 and P3 be three different points on the elliptic curve over GF(p), where P1 = (X1, Y1, Z1), P2 = (X2, Y2, Z2 = 1) and P3 = (X3, Y3, Z3). Point addition with the Homogeneous coordinate system can be computed as: A = Y2Z1, B = X2Z1 − X1, C = A²Z1 − B³ − 2B²X1, X3 = BC, Y3 = A(B²X1 − C) − B³Y1, Z3 = B³Z1. Point doubling, on the other hand, can be computed as: A = aZ1² + 3X1², B = Y1Z1, C = X1Y1B, D = A² − 8C, X3 = 2BD, Y3 = A(4C − D) − 8Y1²B², Z3 = 8B³.
With the Jacobian coordinate system, point addition can be computed as: A = X1, B = X2Z1², C = Y1, D = Y2Z1³, E = B − A, F = D − C, X3 = F² − (E³ + 2AE²), Y3 = F(AE² − X3) − CE³, Z3 = Z1E. Point doubling, on the other hand, can be computed as: A = 4X1Y1², B = 3X1² + aZ1⁴, X3 = B² − 2A, Y3 = B(A − X3) − 8Y1⁴, Z3 = 2Y1Z1.
Recently, Edwards showed in [14] that all elliptic curves over prime fields can be transformed to the shape x² + y² = c²(1 + x²y²), with (0, c) as neutral element and with the surprisingly simple and symmetric addition law of two points P1 = (x1, y1) and P2 = (x2, y2):
P1 + P2 = ((x1y2 + x2y1)/(c(1 + x1x2y1y2)), (y1y2 − x1x2)/(c(1 − x1x2y1y2))).
To capture a larger class of elliptic curves over the original field, the notion of Edwards form has been modified in [15] to include all curves x² + y² = c²(1 + dx²y²) where cd(1 − dc⁴) ≠ 0.
Point addition with the Edwards coordinate system can be computed as: B = Z1², C = X1X2, D = Y1Y2, E = G − (C + D), F = dCD, G = (X1 + Y1)(X2 + Y2), X3 = Z1E(B − F), Z3 = (B − F)(B + F), Y3 = Z1(D − C)(B + F). Point doubling, on the other hand, can be computed as: A = X1 + Y1, B = A², C = X1², D = Y1², E = C + D, F = B − E, H = Z1², I = 2H, J = E − I, X3 = FJ, Z3 = EJ, Y3 = E(C − D).
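As a small executable illustration of the Jacobian doubling formulas above (an addition for this text; the tiny curve y² = x³ + 2x + 2 over GF(17) is assumed purely as a toy example and has nothing to do with the paper's evaluation), the following Python sketch computes 2P and converts the result back to affine coordinates.

# Toy parameters (assumed for illustration): y^2 = x^3 + 2x + 2 over GF(17).
p, a = 17, 2

def jacobian_double(X1, Y1, Z1):
    # Point doubling in Jacobian coordinates, using the formulas given above.
    A = (4 * X1 * Y1 * Y1) % p
    B = (3 * X1 * X1 + a * pow(Z1, 4, p)) % p
    X3 = (B * B - 2 * A) % p
    Y3 = (B * (A - X3) - 8 * pow(Y1, 4, p)) % p
    Z3 = (2 * Y1 * Z1) % p
    return X3, Y3, Z3

def to_affine(X, Y, Z):
    Zinv = pow(Z, p - 2, p)                  # inverse via Fermat's little theorem
    return (X * pow(Zinv, 2, p)) % p, (Y * pow(Zinv, 3, p)) % p

P = (5, 1)                                   # on the toy curve: 1 = 125 + 10 + 2 mod 17
print(to_affine(*jacobian_double(P[0], P[1], 1)))   # 2P = (6, 3)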

IV. THE PROPOSED METHODOLOGY<br />

Since field multiplications and squarings are the<br />

dominant operation in elliptic curve point operations in<br />

projective coordinates that require much higher<br />

computation time than field additions and subtractions,<br />

the emphasis in this paper is to perform comparisons<br />

between projective coordinate systems when parallel<br />

multiplications or squarings are performed at the same



time. Furthermore, the field computations <strong>of</strong> point<br />

operations are segmented into atomic blocks that are<br />

indistinguishable from each other to resist against SPA<br />

attacks, which is called side-channel atomicity [11]. The<br />

approach adopted in this paper is:<br />

1. Analyzing the dataflow <strong>of</strong> point operations for<br />

each projective coordinate system in the<br />

following manner:<br />

a. Find the critical path which has the lowest<br />

number <strong>of</strong> field multiplications.<br />

b. Find the maximum number <strong>of</strong> multipliers<br />

that are needed to meet this critical path.<br />

2. Segmenting the field computations <strong>of</strong> point<br />

operations for each as follows:<br />

a. An atomic block contains at most one field<br />

multiplication, two field additions, and one<br />

field subtraction.<br />

b. A Field squaring is performed by a<br />

multiplier instead <strong>of</strong> using a special<br />

hardware unit for squaring.<br />

3. Varying the number <strong>of</strong> parallel multipliers from<br />

two to the number <strong>of</strong> multipliers specified by the<br />

critical path to find the following:<br />

a. The best schedule <strong>of</strong> each dataflow using<br />

the specified number <strong>of</strong> multipliers.<br />

b. The area-time (AT) and area-time² (AT²) complexities (a scheduling sketch follows this list).
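To illustrate step 3 concretely, the following Python fragment (an addition for exposition, not the author's tool) performs a simple greedy list scheduling of atomic blocks onto a fixed number of parallel multipliers, respecting a dependency relation between blocks; the block names and dependencies below are hypothetical and chosen only for the example.

# Greedy list scheduling of atomic blocks onto `n_mult` parallel multipliers.
# Each block takes one time unit; `deps` maps a block to the blocks it waits for.
def schedule(blocks, deps, n_mult):
    done, time_units, remaining = set(), 0, list(blocks)
    while remaining:
        ready = [b for b in remaining if deps.get(b, set()) <= done]
        issued = ready[:n_mult]              # at most one block per multiplier
        done.update(issued)
        remaining = [b for b in remaining if b not in issued]
        time_units += 1
    return time_units

# Hypothetical dependency graph of six atomic blocks (illustration only).
deps = {"D2": {"D1"}, "D3": {"D1"}, "D4": {"D2", "D3"}, "D5": {"D3"}, "D6": {"D4", "D5"}}
blocks = ["D1", "D2", "D3", "D4", "D5", "D6"]
print(schedule(blocks, deps, 2))   # 4 time units with two multipliers
print(schedule(blocks, deps, 1))   # 6 time units with a single multiplier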

Table I shows the field arithmetic operations of the selected projective coordinate systems according to the formulas presented in Section III. In Table I, the αi's and βj's represent multiplications/squarings and additions/subtractions, respectively. For example, the first possible multiplication for point addition in the Homogenous coordinate system (Y2 × Z1) is represented by α1. The second possible field addition for point doubling in the Edwards coordinate system (C + D), as another example, is represented by β2. The data dependencies between the αi's and βj's in point operations for the Homogenous, Jacobian and Edwards coordinate systems are depicted in Figs. 1, 2 and 3 respectively.

In Table II, the αi's and βj's are grouped into atomic blocks. Table II shows the atomic blocks for point doubling and point addition, denoted by ∆ and Γ respectively. Empty field operation slots within an atomic block are marked by "*". In Table II, for example, the atomic block ∆1 of point doubling in the Jacobian coordinate system contains one field multiplication α1, one field addition β1 and two empty slots. The atomic block Γ7 of point addition in the Homogenous coordinate system, as another example, contains one field multiplication α7 and three field additions β2, β3 and β4.

Let the unit of time be the time required to execute one atomic block. In Table II, point addition requires 11 time units for the three selected projective coordinate systems. Point doubling, on the other hand, requires 13, 10 and 7 time units for the Homogenous, Jacobian and Edwards coordinate systems respectively.


Tables III, IV and V show the scheduling of the atomic blocks of the Homogenous, Jacobian and Edwards coordinate systems respectively on parallel multipliers, according to the methodology proposed earlier in this section. In Tables III, IV and V, the first column shows the number of multipliers. The second column shows the required time units to perform point operations using parallel multipliers. The utilization of the parallel multipliers depends on the number of multipliers and the critical path of the projective coordinate system. Adding more multipliers, however, does not imply better performance. For example, the number of time units required to perform point addition using the Jacobian projective coordinates is the same whether three or four multipliers are used.
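The schedules of Tables III, IV and V can be reproduced, in spirit, by a simple greedy list scheduler: in every time unit, as many ready atomic blocks are issued as there are multipliers, and a block becomes ready once all blocks it depends on have finished. The sketch below only illustrates this scheduling idea on a made-up dependency set; it does not encode the actual dataflow graphs of Figs. 1, 2 and 3.

```python
# Greedy list scheduling of unit-time atomic blocks onto parallel multipliers.
# deps maps each block to the set of blocks it depends on (a toy example here,
# not the dependency graphs of Figs. 1-3).
def schedule(deps, num_multipliers):
    done, remaining, steps = set(), set(deps), []
    while remaining:
        ready = sorted(b for b in remaining if deps[b] <= done)
        if not ready:
            raise ValueError("cyclic dependencies")
        step = ready[:num_multipliers]     # at most one block per multiplier
        steps.append(step)
        done.update(step)
        remaining.difference_update(step)
    return steps

toy_deps = {                               # hypothetical dataflow for illustration
    "G1": set(), "G2": set(), "G3": {"G1"}, "G4": {"G1", "G2"},
    "G5": {"G3"}, "G6": {"G3", "G4"}, "G7": {"G5", "G6"},
}
for m in (2, 3, 4):
    print(m, "multipliers:", len(schedule(toy_deps, m)), "time units")
```

As with the Jacobian case noted above, adding multipliers beyond what the critical path can exploit does not shorten the schedule.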

V. RESULTS & PERFORMANCE ANALYSIS<br />

The lower bound on the area-time cost of a given design is usually employed as a performance metric, (area) × (time)^(2α), 0 ≤ α ≤ 1, where the choice of α determines the relative importance of area and time [22]. Such lower bounds have been obtained for several problems, e.g., discrete Fourier transform, matrix multiplication, binary addition, and others [22]. Once the lower bound on the chosen performance metric is known, designers attempt to devise algorithms and designs which are optimal for a range of area and time values. Even though a design might be optimal for a certain range of area and time values, it is nevertheless of interest to obtain designs for minimum values of time, i.e., maximum speed performance, as well as designs for minimum area. In order to make a more meaningful comparison between the selected projective coordinate systems with parallel multipliers, both the AT and AT² measures are evaluated.

Table VI shows the AT and AT² measures for the selected projective coordinate systems with m = 160 bits. In Table VI, the Area (A) is the number of multipliers. The Time (T), on the other hand, is calculated using the NAF binary algorithm as:

T = m(DBL) + (m/3)(ADD),

where DBL and ADD are the time units required for performing point doubling and addition respectively in Tables III, IV and V (the NAF of an m-bit scalar has on average m/3 non-zero digits, so a scalar multiplication performs m doublings and about m/3 additions). For example, T = 160 × (4) + (1/3) × 160 × (6) = 960 time units for the Edwards coordinate system with two parallel multipliers. Another example, the Jacobian coordinate system with three multipliers, gives T = 160 × (5) + (1/3) × 160 × (5) = 1066.67 time units.
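The AT and AT² figures reported in Table VI follow directly from these time-unit counts; the short sketch below recomputes them, with the DBL/ADD entries copied from Tables III, IV and V.

```python
# Recompute T = m*DBL + (m/3)*ADD, AT and AT^2 from Tables III-V (m = 160).
m = 160
time_units = {   # multipliers: (doubling, addition) time units
    "Homogeneous": {2: (7, 6), 3: (5, 5), 4: (4, 4)},
    "Jacobian":    {2: (6, 6), 3: (5, 5), 4: (5, 5)},
    "Edwards":     {2: (4, 6), 3: (3, 4), 4: (2, 4)},
}
for system, rows in time_units.items():
    for area, (dbl, add) in sorted(rows.items()):
        T = m * dbl + (m / 3) * add
        print(f"{system:12s} A={area}  T={T:9.2f}  AT={area * T:9.2f}  AT^2={area * T * T:13.1f}")
```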

Fig. 4 and Fig. 5 depict the comparison results of Table VI for AT and AT² respectively. The results show that the Edwards coordinate system provides the best AT and AT² results. A key observation is that the Edwards coordinate system provides better AT and AT² using only two multipliers when compared to the other two coordinate systems with four multipliers, which makes the Edwards coordinate system more attractive.

Although the Jacobian coordinate system provides better performance than the Homogenous coordinate system with sequential designs [23], the results show that the Homogenous and the Jacobian coordinate systems provide the same AT and AT² when three multipliers are used. The results also show that the Homogenous coordinate system provides better AT and AT² than the Jacobian coordinate system when four multipliers are used. This is because of the nature of the critical path of the Homogenous coordinate system, which allows for more parallelism when four multipliers are employed.

VI. CONCLUSION<br />

In this paper, the performance of the Homogeneous, Jacobian and Edwards coordinate systems with side-channel atomicity has been analyzed when parallel GF(p) field multipliers are used. The point operations of the selected projective coordinate systems have been segmented into atomic blocks. These atomic blocks are executed in parallel using 2, 3 and 4 multipliers. An atomic block can contain at most one field multiplication, two field additions, and one field subtraction. A field squaring is performed by a multiplier instead of using a special hardware unit for squaring.

The AT and AT² performance metrics have been evaluated for each of the selected projective coordinate systems. The results show that the Edwards coordinate system provides the best AT and AT² as compared to the other two coordinate systems. The results also show that the Homogenous coordinate system provides better performance than the Jacobian coordinate system when four multipliers are used.

ACKNOWLEDGMENT<br />

The author would also like to acknowledge the support of Umm Al-Qura University (UQU).

REFERENCES<br />

[1] N. Koblitz, “Elliptic curve cryptosystems,” Mathematics <strong>of</strong><br />

Computation, Vol. 48, pp. 203-209, 1987.<br />

[2] R. Rivest, A. Shamir, and L. Adleman, “A method for obtaining<br />

digital signatures and public key cryptosystems,”<br />

Communications <strong>of</strong> the ACM, Vol. 21, No.2, pp. 120-126, 1978.<br />

[3] T. ElGamal, “A Public-Key Cryptosystem and a Signature<br />

Scheme Based on Discrete Logarithms,” Advances in<br />

Cryptology: Proceedings <strong>of</strong> CRYPTO 84, Springer Verlag, pp.<br />

10-18, 1985.<br />

[4] P. Kocher, J. Jaffe, and B. Jun, “Differential power analysis,”<br />

CRYPTO '99, LNCS 1666, pp. 388-397, 1999.<br />

[5] P. Kocher, “Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems,” CRYPTO '96, LNCS

1109, pp. 104-113, 1996.<br />

[6] P. Fouque and F. Valette, “The doubling attack – why upwards is<br />

better than downwards,” Cryptographic Hardware and Embedded<br />

Systems – CHES’03, LNCS 2779, Springer-Verlag, pp.269–280,<br />

2003<br />

[7] L. Goubin, “A refined power-analysis attack on elliptic curve<br />

cryptosystems,” Public Key Cryptography – PKC’03, LNCS<br />

2567, Springer-Verlag, pp.199–210, 2003.<br />

[8] T. Akishita, and T. Takagi, “Zero-value point attacks on elliptic<br />

curve cryptosystem,” Information Security Conference – ISC’03,<br />

LNCS 2851, Springer-Verlag, pp.218–233, 2003.<br />

[9] H. Cohen, G. Frey, R. M. Avanzi, C. Doche, T. Lange, K.<br />

Nguyen and F. Vercauteren, “Handbook <strong>of</strong> Elliptic and<br />

Hyperelliptic Curve Cryptography,” Discrete Mathematics and<br />

Its Applications, vol. 34, Chapman & Hall/CRC, 2005.<br />


[10] D. Hankerson, A. J. Menezes, and S. Vanstone, “Guide to Elliptic<br />

Curve Cryptography,” Springer-Verlag, 2004.<br />

[11] B. Chevallier-Mames, M. Ciet, and M. Joye, “Low-cost solutions<br />

for preventing simple side-channel analysis: side-channel<br />

atomicity,” IEEE Trans. <strong>Computers</strong>, Vol. 53, No. 6, pp. 760-768,<br />

2004.<br />

[12] P. Mishra, “Pipelined computation of scalar multiplication in elliptic curve cryptosystems (extended version),” IEEE Trans. on Computers, Vol. 55, No. 8, pp. 1000-1010, 2006.

[13] T.F. Al-Somani, “Overlapped parallel computations <strong>of</strong> scalar<br />

multiplication with resistance against Side Channel Attacks,” Int.<br />

J. Information and Computer Security, Vol. 2, No. 3, pp.261–<br />

278, 2008.<br />

[14] H. M. Edwards, “A normal form for elliptic curves,” Bulletin <strong>of</strong><br />

the American Mathematical Society 44, pp. 393–422, 2007.<br />

[15] D. J. Bernstein and T. Lange, “Faster addition and doubling on<br />

elliptic curves,” Advances in Cryptology – ASIACRYPT 2007,

LNCS 4833, Springer-Verlag, pp.29-50, 2007.<br />

[16] A. Gutub and M. K. Ibrahim, “High Radix Parallel Architecture<br />

For GF(p) Elliptic Curve Processor,” IEEE Conference on<br />

Acoustics, Speech, and Signal Processing, ICASSP 2003, Pages:<br />

625- 628, Hong Kong, April 6-10, 2003.<br />

[17] A. Gutub, “Fast 160-Bits GF(p) Elliptic Curve Crypto Hardware<br />

<strong>of</strong> High-Radix Scalable Multipliers,” International Arab <strong>Journal</strong><br />

<strong>of</strong> Information Technology (IAJIT), Vol. 3, No. 4, Pages: 342-<br />

349, October 2006.<br />

[18] A. Gutub, M. K. Ibrahim and T. Al-Somani, “Parallelizing GF(P)<br />

Elliptic Curve Cryptography Computations for Security and<br />

Speed,” IEEE International Symposium on Signal Processing and<br />

its Applications in conjunction with the International Conference<br />

on Information Sciences, Signal Processing and their<br />

Applications (ISSPA), Sharjah, United Arab Emirates, February<br />

12-15,2007.<br />

[19] A. Gutub, “Efficient Utilization <strong>of</strong> Scalable Multipliers in<br />

Parallel to Compute GF(p) Elliptic Curve Cryptographic<br />

Operations,” Kuwait <strong>Journal</strong> <strong>of</strong> Science & Engineering (KJSE),<br />

Vol . 34, No. 2, Pages: 165-182, December 2007.<br />

[20] A. Menezes, “Elliptic Curve Public Key Cryptosystems,” Kluwer<br />

Academic <strong>Publisher</strong>s, 1993.<br />

[21] M. Joye, and C. Tymen, “Compact Encoding <strong>of</strong> Non-Adjacent<br />

Forms with Applications to Elliptic Curve Cryptography,” Public<br />

Key Cryptography, LNCS 1992, Springer-Verlag, pp. 353-364,<br />

2001.<br />

[22] D. Thompson, “A complexity theory for VLSI,” Ph.D.<br />

dissertation, Carnegie Mellon University, Dep. Computer<br />

Science, 1980.<br />

[23] I. Blake, G. Seroussi and N. Smart, “Elliptic Curves in Cryptography,” Cambridge University Press, New York, 1999.

Turki F. Al-Somani received his B.Sc. and M.Sc. degrees in<br />

Electrical and Computer Engineering from King Abdul-Aziz<br />

University, Saudi Arabia in 1997 and 2000, respectively. He<br />

obtained his PhD degree from King Fahd University <strong>of</strong><br />

Petroleum and Minerals (KFUPM), Saudi Arabia in 2006.<br />

Currently, he is an assistant pr<strong>of</strong>essor at the Computer<br />

Engineering Department in Umm Al-Qura University (UQU),<br />

Saudi Arabia. His research interests include computer<br />

arithmetic, System-on-Chip designs, security and<br />

cryptosystems, theories <strong>of</strong> information, application specific<br />

processors and FPGAs. He published several journal and<br />

conference papers in the areas <strong>of</strong> his research.



TABLE I
FIELD ARITHMETIC OPERATIONS OF THE SELECTED PROJECTIVE COORDINATE SYSTEMS
[Table I lists, for mixed addition and for doubling in the Homogeneous, Jacobian and Edwards (c = 1) coordinate systems, the individual field multiplications/squarings (α1, α2, …) and field additions/subtractions (β1, β2, …) obtained from the formulas of Section III.]


TABLE II
POINT OPERATIONS IN ATOMIC BLOCKS
[Table II groups the αi and βj operations of Table I into atomic blocks Γ1, Γ2, … (point addition) and ∆1, ∆2, … (point doubling) for each coordinate system; every atomic block holds at most one field multiplication, two field additions and one field subtraction, with unused slots marked "*". Point addition takes 11 atomic blocks in all three systems; point doubling takes 13, 10 and 7 atomic blocks in the Homogeneous, Jacobian and Edwards systems respectively.]



TABLE III<br />

POINT OPERATIONS FOR THE HOMOGENEOUS COORDINATE SYSTEM WITH PARALLEL MULTIPLIERS<br />

No. <strong>of</strong><br />

Multipliers<br />

Homogeneous Coordinate System<br />

Time Mixed Addition Doubling<br />

Mul1 Mul2 Mul1 Mul2<br />

2 1 Γ1 Γ2 ∆1 ∆2<br />

2 Γ3 Γ4 ∆3 ∆4<br />

3 Γ5 Γ6 ∆5 ∆6<br />

4 Γ7 Γ8 ∆7 ∆8<br />

5 Γ9 Γ10 ∆9 ∆10<br />

6 Γ11 ∆11 ∆12<br />

7 ∆13<br />

Mul1 Mul2 Mul3 Mul1 Mul2 Mul3<br />

3 1 Γ1 Γ2 ∆1 ∆2 ∆3<br />

2 Γ3 Γ4 ∆4 ∆5 ∆6<br />

3 Γ5 Γ6 Γ7 ∆7 ∆8 ∆9<br />

4 Γ8 Γ9 Γ10 ∆10 ∆11 ∆12<br />

5 Γ11 ∆13<br />

Mul1 Mul2 Mul3 Mul4 Mul1 Mul2 Mul3 Mul4<br />

4 1 Γ1 Γ2 ∆1 ∆2 ∆3 ∆4<br />

2 Γ3 Γ4 ∆5 ∆6 ∆7 ∆8<br />

3 Γ5 Γ6 Γ7 ∆9 ∆10 ∆11<br />

4 Γ8 Γ9 Γ10 Γ11 ∆12 ∆13<br />

TABLE IV<br />

POINT OPERATIONS FOR THE JACOBIAN COORDINATE SYSTEM WITH PARALLEL MULTIPLIERS<br />

No. <strong>of</strong><br />

Multipliers<br />


Jacobian Coordinate System<br />

Time Mixed Addition Doubling<br />

Mul1 Mul2 Mul1 Mul2<br />

2 1 Γ1 Γ2 ∆1 ∆2<br />

2 Γ3 Γ4 ∆3 ∆4<br />

3 Γ5 Γ6 ∆5 ∆6<br />

4 Γ7 Γ8 ∆7 ∆8<br />

5 Γ9 Γ10 ∆9<br />

6 Γ11 ∆10<br />

Mul1 Mul2 Mul3 Mul1 Mul2 Mul3<br />

3 1 Γ1 Γ2 ∆1 ∆2 ∆3<br />

2 Γ3 Γ4 ∆4 ∆5 ∆6<br />

3 Γ5 Γ6 Γ7 ∆7 ∆8<br />

4 Γ8 Γ9 ∆9<br />

5 Γ10 Γ11 ∆10<br />

Mul1 Mul2 Mul3 Mul4 Mul1 Mul2 Mul3 Mul4<br />

4 1 Γ1 Γ2 ∆1 ∆2 ∆3 ∆4<br />

2 Γ3 Γ4 ∆5 ∆6 ∆7<br />

3 Γ5 Γ6 Γ7 ∆8<br />

4 Γ8 Γ9 ∆9<br />

5 Γ10 Γ11 ∆10



TABLE V<br />

POINT OPERATIONS FOR THE EDWARDS COORDINATE SYSTEM WITH PARALLEL MULTIPLIERS<br />

No. <strong>of</strong><br />

Multipliers<br />

Edwards Coordinate System (with c = 1)<br />

Time Mixed Addition Doubling<br />

Mul1 Mul2 Mul1 Mul2<br />

2 1 Γ1 Γ2 ∆1 ∆2<br />

2 Γ3 Γ4 ∆3 ∆4<br />

3 Γ5 Γ6 ∆5 ∆6<br />

4 Γ7 Γ8 ∆7<br />

5 Γ9<br />

6 Γ10 Γ11<br />

Mul1 Mul2 Mul3 Mul1 Mul2 Mul3<br />

3 1 Γ1 Γ2 Γ3 ∆1 ∆2<br />

2 Γ4 Γ5 Γ6 ∆3 ∆4<br />

3 Γ7 Γ8 ∆5 ∆6 ∆7<br />

4 Γ9 Γ10 Γ11<br />

Mul1 Mul2 Mul3 Mul4 Mul1 Mul2 Mul3 Mul4<br />

4 1 Γ1 Γ2 Γ3 Γ4 ∆1 ∆2 ∆3 ∆4<br />

2 Γ5 Γ6 Γ7 ∆5 ∆6 ∆7<br />

3 Γ8<br />

4 Γ9 Γ10 Γ11<br />

TABLE VI
AT & AT² COMPARISONS (with m = 160 bits)

Area (A) = No. of Multipliers | Homogeneous: Time (T), AT, AT² | Jacobian: Time (T), AT, AT² | Edwards: Time (T), AT, AT²
2 | 1440, 2880, 4147200 | 1280, 2560, 3276800 | 960, 1920, 1843200
3 | 1066.6667, 3200, 3413333.3 | 1066.6667, 3200, 3413333.3 | 693.33333, 2080, 1442133
4 | 853.33333, 3413.3333, 2912711.1 | 1066.6667, 4266.6667, 4551111.1 | 533.33333, 2133.333, 1137778



Figure 1. The data dependency graph of the Homogenous coordinate system: (a) Point Addition; (b) Point Doubling.

Figure 2. The data dependency graph of the Jacobian coordinate system: (a) Point Addition; (b) Point Doubling.

Figure 3. The data dependency graph of the Edwards coordinate system: (a) Point Addition; (b) Point Doubling.

Figure 4. Area × Time (AT) comparisons.

Figure 5. Area × Time² (AT²) comparisons.



VPRS-Based Knowledge Discovery Approach in<br />

Incomplete Information System<br />

Shibao SUN 1,2<br />

1. Electronic Information Engineering College, Henan University <strong>of</strong> Science and Technology, Luoyang<br />

Henan 471003, China<br />

2. National Lab. Of S<strong>of</strong>tware Development Environment, Beijing University <strong>of</strong> Aeronautics & Astronautics,<br />

Beijing, 100191, China<br />

Email: sunshibao@126.com<br />

Ruijuan ZHENG 1 , Qingtao WU 1 , and Tianrui LI 3<br />

1. Electronic Information Engineering College, Henan University <strong>of</strong> Science and Technology, Luoyang<br />

471003, China<br />

3. School <strong>of</strong> Information Science and Technology, Southwest Jiaotong University, Chengdu, 610031,China.<br />

Email: rjwo@163.com, wqt8921@126.com, trli30@gmail.com<br />

Abstract—By changing the equivalence relation in the incomplete information system, a new variable precision rough set model and an approach for knowledge reduction are proposed. To overcome the non-monotonicity of the lower approximation, a cumulative variable precision rough set model is explored, and the basic properties of the cumulative lower and upper approximation operators are investigated. An example shows that the cumulative variable precision rough set model has a wider range of applications and gives better results than the variable precision rough set model.

Index Terms—Variable precision rough set, Incomplete<br />

information system, Decision table, Cumulative<br />

approximation<br />

I. INTRODUCTION<br />

Rough set theory (RST) was proposed by Pawlak [1] as a tool to conceptualize, organize and analyze various types of data in knowledge discovery.

This method is especially useful for dealing with<br />

uncertain and vague knowledge in information systems.<br />

Many examples about applications <strong>of</strong> the rough set<br />

method to process control, economics, medical diagnosis,<br />

biochemistry, environmental science, biology, chemistry,<br />

psychology, conflict analysis and other fields can be<br />

found in [2,3]. However, classical rough set theory is based on an equivalence relation and cannot be applied in many real situations. Therefore, many extended RST

models, e.g. binary relation based rough sets [4], covering<br />

This work is supported by National Natural Science Foundation <strong>of</strong> China<br />

(Grant No. 60873108), the Special Research Funding to Doctoral Subject<br />

<strong>of</strong> Higher Education Institutes in P.R. China (Grant No. 20060613007),<br />

Supported by Natural Science Foundation <strong>of</strong> Henan Province <strong>of</strong><br />

China(Grant No. 072300410210), Natural Science Research Foundation <strong>of</strong><br />

Henan University <strong>of</strong> Science and Technology (Grant No.09001172).<br />

Corresponding author:<br />

Email: sunshibao@126.com (Shibao SUN).<br />


based rough sets [5,6], and fuzzy rough sets [7,8] have<br />

been proposed. In order to solve classification problems<br />

with uncertain data and no functional relationship<br />

between attributes and relax the rigid boundary definition<br />

<strong>of</strong> the classical rough set model to improve the model<br />

suitability, the variable precision rough set (VPRS) model<br />

was proposed by Ziarko [9] in 1993. It is an effective<br />

mathematical tool with an error-tolerance capability to<br />

handle uncertainty problem. Basically, the VPRS is an<br />

extension <strong>of</strong> classical rough set theory [1-3], allowing for<br />

a partial classification. By setting a confidence threshold,<br />

β (0 ≤ β < 0.5) , the VPRS can allow noise data or<br />

remove error data [10]. Recently the VPRS model has<br />

been widely applied in many fields [11].<br />

The key issues of the VPRS model mainly concentrate on the generalization of models and the development of

reduction approaches under the equivalence relation. For<br />

example, β -reduct [12], β lower (upper) distribution<br />

reduction [13] and reduction based on structure [14], etc,<br />

are reduction approaches under the equivalence relation.<br />

However, in many practical problems, the equivalence<br />

relation <strong>of</strong> objects is difficult to construct, or the<br />

equivalence relation <strong>of</strong> objects essentially does not exist.<br />

In this case, we need to generalize the VPRS model. The<br />

ideas <strong>of</strong> generalization are from two aspects. One is to<br />

generalize approximated objects from a crisp set to a<br />

fuzzy set [15]; The other is to generalize the relation on<br />

the universe from the equivalence relation to the fuzzy<br />

relation [15], binary relation [16], or covering relation<br />

[17,18]. The idea <strong>of</strong> the VPRS was introduced to fuzzy<br />

rough set and the theory and application <strong>of</strong> fuzzy rough<br />

set were discussed in [15]. The equivalence relation was<br />

generalized to a binary relation R on the universe U in<br />

the VPRS model, so that a generalized VPRS model was<br />

obtained [16]. Covering rough set model [19] has been<br />

obtained when the equivalence relation on the universe<br />

was generalized to cover on the universe in rough set<br />

model. The equivalence relation was generalized to cover



on universe U in the VPRS model and two kinds <strong>of</strong><br />

variable precision covering rough set models were<br />

obtained in [17,18]. The definition <strong>of</strong> the variable<br />

precision rough fuzzy set model under the equivalence<br />

relation was given in [20].<br />

The classical rough set approach requires the data<br />

table to be complete, i.e., without missing values. In<br />

practice, however, the data table is <strong>of</strong>ten incomplete. To<br />

deal with these cases, Greco, et al [21] proposed an<br />

extension <strong>of</strong> the rough set methodology to the analysis <strong>of</strong><br />

incomplete data tables. The extended indiscernible<br />

relation between two objects is considered as a<br />

directional statement where a subject is compared to a<br />

referent object. It requires that the referent object has no<br />

missing values. The extended rough set approach<br />

maintains all good characteristics <strong>of</strong> its original version.<br />

It also boils down to the original approach when there is<br />

no missing value. The rules induced from the rough<br />

approximations defined according to the extended<br />

relation verify a suitable property: they are robust in a<br />

sense that each rule is supported by at least one object<br />

with no missing value on the condition attributes<br />

represented in the rule. Obviously, these ideas can be<br />

used to the VPRS model.<br />

The classical and the generalized VPRS approach<br />

based on indiscernible relations also require the data table<br />

to be complete. In this paper, two kinds <strong>of</strong> VPRS<br />

approaches for dealing with incomplete data tables are

proposed. The paper is organized as follows. In Section 2,<br />

a general view <strong>of</strong> VPRS approach and incomplete<br />

information system are given. In Section 3, based on the<br />

extended indiscernible relation, we propose a new VPRS<br />

model and an approach for knowledge reduction in the<br />

incomplete information system. In Section 4, a<br />

cumulative VPRS model in the incomplete information<br />

system is discussed. They are based on the cumulative β<br />

lower (upper) approximation <strong>of</strong> X . In Section 5, we<br />

present an illustrative example which is intended to<br />

explain the concepts introduced in Section 3 and Section<br />

4. The paper ends with conclusions and further research<br />

topics in Section 6.<br />

II. PRELIMINARIES AND NOTATIONS<br />

Definition 1 [1]. An information system is the 4-tuple S = (U, Q, V, f), where U is a non-empty finite set of objects (the universe), Q = {q1, q2, …, qm} is a finite set of attributes, Vq is the domain of the attribute q, V = ∪q∈Q Vq, and f : U × Q → V is a total function such that f(x, q) ∈ Vq for each q ∈ Q, x ∈ U, called an information function. If Q = C ∪ {d} and C ∩ {d} = ∅, then S = (U, C ∪ {d}, V, f) is called a decision table, where d is a decision attribute.


To every (non-empty) subset of attributes P ⊆ C is associated an indiscernible relation on U, denoted by R_P:

R_P = {(x, y) ∈ U × U : f(x, q) = f(y, q) ∀q ∈ P}.  (1)

If (x, y) ∈ R_P, the objects x and y are said to be P-indiscernible. Clearly, the indiscernible relation thus defined is an equivalence relation (reflexive, symmetric and transitive). The family of all the equivalence classes of the relation R_P is denoted by U/R_P, and the equivalence class containing an element x ∈ U by [x] = {y ∈ U : (x, y) ∈ R_P}. The equivalence classes of the relation R_P are called P-elementary sets. If P = C, the C-elementary sets are called atoms.

Definition 2 [9]. Let X and Y be subsets of a non-empty finite universe U. If every e ∈ X also satisfies e ∈ Y, we say that Y contains X, written Y ⊇ X. Let

c(X, Y) = 1 − |X ∩ Y| / |X| if |X| > 0, and c(X, Y) = 0 if |X| = 0,  (2)

where |X| is the cardinality of the set X. c(X, Y) is called the relative error ratio of X with regard to Y.

Definition 3 [9]. Let S be a decision table, X a nonempty subset of U, 0 ≤ β < 0.5 and ∅ ≠ P ⊆ C. The β lower approximation and the β upper approximation of X in S are defined, respectively, by:

P_β(X) = {x ∈ U : c([x], X) ≤ β}.  (3)

P̄_β(X) = {x ∈ U : c([x], X) < 1 − β}.  (4)

The elements of P_β(X) are those objects x ∈ U which belong to the equivalence classes generated by the indiscernible relation R_P contained in X with the error ratio β; the elements of P̄_β(X) are all and only those objects x ∈ U which belong to the equivalence classes generated by the indiscernible relation R_P contained in X with the error ratio 1 − β.
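The following small sketch translates Definitions 2 and 3 into code; the toy partition U/R_P and the set X in it are made-up illustrative values, not data from the paper.

```python
# Relative error ratio c(X, Y) and the beta lower/upper approximations of
# Definition 3, given the partition U/R_P as a list of equivalence classes.
def c(X, Y):
    """Relative error ratio of X with regard to Y (Definition 2)."""
    return 1 - len(X & Y) / len(X) if X else 0.0

def lower_approx(classes, X, beta):
    """Union of the classes [x] with c([x], X) <= beta."""
    return set().union(*[cl for cl in classes if c(cl, X) <= beta])

def upper_approx(classes, X, beta):
    """Union of the classes [x] with c([x], X) < 1 - beta."""
    return set().union(*[cl for cl in classes if c(cl, X) < 1 - beta])

classes = [{1, 2}, {3, 4, 5}, {6}]    # toy partition U/R_P
X = {1, 2, 3, 6}
print(lower_approx(classes, X, 0.2))  # {1, 2, 6}
print(upper_approx(classes, X, 0.2))  # {1, 2, 3, 4, 5, 6}
```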

Definition 4 [21]. An information system is called an incomplete information system if there exist x ∈ U and a ∈ C such that the value f(x, a) is unknown, denoted as "*". It is assumed here that at least one of the states of x in terms of P is certain, where P ⊆ C, i.e., ∃a ∈ P such that f(x, a) is known. Thus, V = V_C ∪ V_d ∪ {*}.

Definition 5 [21]. ∀x, y ∈ U, object y is called the subject and object x the referent. Subject y is indiscernible with referent x with respect to condition attributes from P ⊆ C (denoted as y I_P x) if for every q ∈ P the following conditions are satisfied: (1) f(x, q) ≠ *, (2) f(x, q) = f(y, q) or f(y, q) = *, where "*" denotes a missing value.

The above definition means that the referent object considered for indiscernibility with respect to P should have no missing values on the attributes from the set P. The binary relation I_P is not necessarily reflexive and also not necessarily symmetric; however, I_P is transitive. For each P ⊆ C, let us define the set of objects having no missing values on the attributes from P:

U_P = {x ∈ U : f(x, q) ≠ * for each q ∈ P}.  (5)
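Definition 5 and formula (5) translate almost literally into code. In the sketch below a table maps each object to a dictionary of attribute values, with "*" standing for a missing value; the helper names are illustrative.

```python
# Indiscernibility with missing values (Definition 5) and the set U_P (formula (5)).
def indiscernible(table, y, x, P):
    """True iff subject y is indiscernible with referent x w.r.t. attributes P."""
    return all(table[x][q] != "*" and
               (table[x][q] == table[y][q] or table[y][q] == "*")
               for q in P)

def U_P(table, P):
    """Objects with no missing values on the attributes in P."""
    return {x for x in table if all(table[x][q] != "*" for q in P)}

def I_P(table, x, P):
    """I_P(x): all subjects y that are indiscernible with referent x."""
    return {y for y in table if indiscernible(table, y, x, P)}
```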

III. VARIABLE PRECISION ROUGH SET MODEL IN THE<br />

INCOMPLETE INFORMATION SYSTEM<br />

Definition 6. Let S be an incomplete information system, X a nonempty subset of U, 0 ≤ β < 0.5 and ∅ ≠ P ⊆ C. The β lower approximation and the β upper approximation of X in S are defined, respectively, by:

P^I_β(X) = {x ∈ U_P : c(I_P(x), X) ≤ β}.  (6)

P̄^I_β(X) = {x ∈ U_P : c(I_P(x), X) < 1 − β}.  (7)

The elements of P^I_β(X) are those objects x ∈ U which belong to the classes generated by the indiscernible relation I_P contained in X with the error ratio β; the elements of P̄^I_β(X) are all and only those objects x ∈ U which belong to the classes generated by the indiscernible relation I_P contained in X with the error ratio 1 − β.

The β boundary of X in S, denoted by BN^I_β(X), is:

BN^I_β(X) = {x ∈ U_P : β < c(I_P(x), X) < 1 − β}.  (8)

The β negative domain of X in S, denoted by NEG^I_β(X), is:

NEG^I_β(X) = {x ∈ U_P : c(I_P(x), X) ≥ 1 − β}.  (9)

Corollary 1. When β = 0, the VPRS model defined above is equivalent to the rough set model in the incomplete information system [21].

Proof. In formula (6), c(I_P(x), X) ≤ β is equivalent to c(I_P(x), X) ≤ 0, so 1 ≤ |I_P(x) ∩ X| / |I_P(x)|, such that I_P(x) ⊆ X; that is to say, P^I_β(X) is equivalent to P(X). Analogously, P̄^I_β(X) is equivalent to P̄(X).

Corollary 2. If an information system is complete, the VPRS model defined above is equivalent to the classical VPRS model.

Proof. ∀x, y ∈ U, P ⊆ C, if y I_P x, then for every q ∈ P we have (1) f(x, q) ≠ *, (2) f(x, q) = f(y, q) or f(y, q) = *. In a complete information system we have f(x, q) = f(y, q), so I_P(x) is equal to [x], such that formula (6) is equivalent to formula (3) and formula (7) is equivalent to formula (4).

Theorem 1. ∀X ⊆ U, P^I_β(~X) = NEG^I_β(X), where ~X = U − X.

Proof. From Definition 6,

P^I_β(~X) = {x ∈ U_P : 1 − |I_P(x) ∩ (~X)| / |I_P(x)| ≤ β}
 = {x ∈ U_P : |I_P(x) ∩ (~X)| / |I_P(x)| ≥ 1 − β}
 = {x ∈ U_P : |I_P(x) ∩ X| / |I_P(x)| ≤ β}
 = {x ∈ U_P : 1 − |I_P(x) ∩ X| / |I_P(x)| ≥ 1 − β}
 = NEG^I_β(X).

Theorem 2. Let S be an incomplete information system, X and Y two nonempty subsets of U, 0 ≤ β < 0.5 and ∅ ≠ P ⊆ C. The rough approximations defined above satisfy the following properties:

① P^I_β(X) ⊆ P̄^I_β(X);
② P^I_β(∅) = P̄^I_β(∅) = ∅; P^I_β(U) = P̄^I_β(U) = U;
③ X ⊆ Y ⇒ P^I_β(X) ⊆ P^I_β(Y);
④ X ⊆ Y ⇒ P̄^I_β(X) ⊆ P̄^I_β(Y);
⑤ P^I_β(X ∪ Y) ⊇ P^I_β(X) ∪ P^I_β(Y);
⑥ P^I_β(X ∩ Y) ⊆ P^I_β(X) ∩ P^I_β(Y);
⑦ P̄^I_β(X ∪ Y) ⊇ P̄^I_β(X) ∪ P̄^I_β(Y);
⑧ P̄^I_β(X ∩ Y) ⊆ P̄^I_β(X) ∩ P̄^I_β(Y);
⑨ P^I_β(~X) = ~P̄^I_β(X);
⑩ P̄^I_β(~X) = ~P^I_β(X).

Proof. ① Because 0 ≤ β < 0.5, any x satisfying c(I_P(x), X) ≤ β also satisfies c(I_P(x), X) < 1 − β, so x ∈ P^I_β(X) implies x ∈ P̄^I_β(X).
② Because 0 ≤ β < 0.5 and X = ∅, we have P^I_β(∅) = ∅ and P̄^I_β(∅) = ∅. Similarly, P^I_β(U) = P̄^I_β(U) = U.
③ ∀x ∈ P^I_β(X), c(I_P(x), X) ≤ β; when X ⊆ Y, we have c(I_P(x), Y) ≤ β, that is to say, x ∈ P^I_β(Y).
④ Similar to ③, we have P̄^I_β(X) ⊆ P̄^I_β(Y).
⑤ From ③, we have P^I_β(X ∪ Y) ⊇ P^I_β(X) ∪ P^I_β(Y).
⑥ From ③, we have P^I_β(X ∩ Y) ⊆ P^I_β(X) ∩ P^I_β(Y).
⑦ From ④, we have P̄^I_β(X ∪ Y) ⊇ P̄^I_β(X) ∪ P̄^I_β(Y).
⑧ From ④, we have P̄^I_β(X ∩ Y) ⊆ P̄^I_β(X) ∩ P̄^I_β(Y).
⑨ From Theorem 1, P^I_β(~X) = {x ∈ U_P : 1 − |I_P(x) ∩ X| / |I_P(x)| ≥ 1 − β} = ~{x ∈ U_P : 1 − |I_P(x) ∩ X| / |I_P(x)| < 1 − β} = ~P̄^I_β(X).
⑩ Similar to ⑨, we have P̄^I_β(~X) = ~P^I_β(X).

The following ratio defines the β accuracy of the approximation of X ⊆ U, X ≠ ∅, by means of the attributes from P ⊆ C:

α^I_P(X) = |P^I_β(X)| / |P̄^I_β(X)|.  (10)

Obviously, 0 ≤ α^I_P(X) ≤ 1.

Another ratio defines the β quality of the approximation of X by means of the attributes from P ⊆ C:

λ^I_P(X) = |P^I_β(X)| / |X|.  (11)

The quality λ^I_P(X) represents the relative frequency of the objects with error ratio β correctly classified by means of the attributes from P.

A primary use of rough set theory is to reduce the number of attributes in databases, thereby improving the performance of applications in a number of aspects including speed, storage, and accuracy. For a data set with discrete attribute values, this can be done by removing redundant attributes and finding a subset of the original attributes that is the most informative.

Definition 7 (β dependability). Suppose that S = (U, C ∪ {d}, V, f) is an incomplete information system. The β dependability is defined as follows:

γ(C, d, β) = |pos(C, d, β)| / |U|,  (12)

where

pos(C, d, β) = ∪_{Y ∈ U/d} C^I_β(Y).  (13)

Definition 8 (β approximation reduction). Suppose that S = (U, C ∪ {d}, V, f) is an incomplete information system and X ⊆ U. A conditional attribute subset A ⊆ C is called a β approximation reduction if and only if it satisfies: ① γ(A, d, β) = γ(C, d, β) and ② there does not exist a conditional attribute subset B ⊂ A such that γ(B, d, β) = γ(C, d, β).

Based on Definition 8, by removing superfluous attributes we can obtain a reduced database.
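Definition 8 suggests a direct, if exponential, way of finding β approximation reductions: keep the attribute subsets whose β dependability equals that of the full attribute set and that are minimal. The brute-force sketch below only shows the search logic; gamma_of is assumed to be a user-supplied function implementing formula (12) for a given attribute subset.

```python
# Brute-force search for beta approximation reductions (Definition 8).
# gamma_of(A) is assumed to return gamma(A, d, beta) for an attribute subset A.
from itertools import combinations

def beta_reductions(attributes, gamma_of):
    target = gamma_of(frozenset(attributes))
    candidates = [frozenset(s)
                  for r in range(1, len(attributes) + 1)
                  for s in combinations(attributes, r)
                  if gamma_of(frozenset(s)) == target]
    # keep only minimal candidates: no proper subset achieves the same dependability
    return [A for A in candidates if not any(B < A for B in candidates)]
```

This enumeration is exponential in the number of attributes and is only practical for small attribute sets.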

IV. CUMULATIVE VARIABLE PRECISION ROUGH SET<br />

MODEL IN THE INCOMPLETE INFORMATION SYSTEM<br />

Let us observe that a very useful property of the lower approximation within the classical rough set theory is that if an object x ∈ U belongs to the lower approximation of X with respect to P ⊆ C, then x also belongs to the lower approximation of X with respect to R ⊆ C when P ⊆ R (this is a kind of monotonic property). However, formula (6) does not satisfy this property of the lower approximation, because it is possible that f(x, q) ≠ * for all q ∈ P but f(x, q) = * for some q ∈ R − P. This is quite problematic for some key concepts of the variable precision rough set theory, like the β accuracy and β quality of approximation, and the β dependability. Therefore, another definition of the lower approximation should be considered. Then the concepts of β accuracy, β quality, and β dependability of approximation remain valid in the case of missing values.

Definition 9. Given X ⊆ U and P ⊆ C,

P*_β(X) = ∪_{R ⊆ P} R^I_β(X).  (14)

Then P*_β(X) is called the cumulative β lower approximation of X, because it includes all the objects belonging to all β lower approximations of X, where R ⊆ P.

It can be shown that another type of the indiscernible relation, denoted by I*_P, permits a direct definition of the cumulative β lower approximation in the usual way. For each x, y ∈ U and for each P ⊆ C, y I*_P x means that f(x, q) = f(y, q) or f(x, q) = * and/or f(y, q) = * for every q ∈ P. Let I*_P(x) = {y ∈ U : y I*_P x} for each x ∈ U and for each P ⊆ C. I*_P is reflexive and symmetric but not transitive. We can prove that Definition 9 is equivalent to the following definition:

P*_β(X) = {x ∈ U*_P : c(I*_P(x), X) ≤ β},  (15)

where U*_P = {x ∈ U : f(x, q) ≠ * for at least one q ∈ P}.

Using the indiscernible relation I*_P, we can define the cumulative β upper approximation of X, complementary to P*_β(X):

P̄*_β(X) = {x ∈ U*_P : c(I*_P(x), X) < 1 − β}.  (16)

For each X ⊆ U, let X_P = X ∩ U*_P. Let us remark that x ∈ U*_P if and only if there exists R ≠ ∅ such that R ⊆ P and x ∈ U_R.

The rough approximations P*_β(X) and P̄*_β(X) satisfy the following properties:

① For each X ⊆ U and for each P ⊆ C: P*_β(X) ⊆ P̄*_β(X);
② For each X ⊆ U and for each P ⊆ C: P*_β(X) = U*_P − P̄*_β(U − X);
③ For each X ⊆ U and for each P, R ⊆ C, if P ⊆ R, then P*_β(X) ⊆ R*_β(X); furthermore, if U*_P = U*_R, then P̄*_β(X) ⊇ R̄*_β(X).

Due to this monotonic property, when augmenting the attribute set P we get a lower approximation of X that is at least of the same cardinality. Thus, we can define analogously, for the case of missing values, the following key concepts of the variable precision rough set theory: the cumulative β accuracy of approximation of X (denoted α*_P(X)), the cumulative β quality of approximation of X (denoted λ*_P(X)), and the cumulative β dependability (denoted γ*(C, d, β)). These concepts have the same definitions as those given in Section 3 but they use the rough approximations P*_β(X) and


V. AN EXAMPLE<br />

P̄*_β(X).

The illustrative example presented in this section is intended to explain the concepts introduced in Section 3 and Section 4. The director of a school wants to make a global evaluation of some students. This evaluation should be based on their level in Mathematics, Physics and Literature. However, not all the students have passed all three exams and, therefore, there are some missing values. The director made the example evaluations shown in Table 1.

TABLE I
STUDENT EVALUATIONS WITH MISSING VALUES

Student | Mathematics | Physics | Literature | Global evaluation
1 | medium | bad | bad | bad
2 | good | medium | * | good
3 | medium | * | medium | bad
4 | * | medium | medium | good
5 | * | good | bad | bad
6 | good | medium | bad | good

For β = 0.35, the lower and upper approximations can be calculated from Table 1.

Let C = {Mathematics, Physics, Literature} be the condition attributes and {Global evaluation} the decision attribute. Let bad = {1, 3, 5} and good = {2, 4, 6}. Then

U_C = {1, 6}, I_C(1) = {1}, I_C(6) = {2, 6},
C^I_β(bad) = {1}, C̄^I_β(bad) = {1}, C^I_β(good) = {6}, C̄^I_β(good) = {6}, γ(C, d, β) = 1/3.

Let L = {Literature}, so that γ(L, d, β) = 1/3. It is easy to validate that L is a reduction of the condition attribute set C.

For β = 0.35, the cumulative lower and upper approximations can also be calculated from Table 1:

U*_C = {1, 2, 3, 4, 5, 6}, I*_C(1) = {1}, I*_C(2) = {2, 4, 6}, I*_C(3) = {3, 4}, I*_C(4) = {2, 3, 4}, I*_C(5) = {5}, I*_C(6) = {2, 6},
C*_β(bad) = {1, 5}, C*_β(good) = {2, 4, 6}, C̄*_β(bad) = {1, 3, 4, 5}, C̄*_β(good) = {2, 3, 4, 6}, γ*(C, d, β) = 5/6.

Let L = {Literature} and P = {Physics}, so that γ*(L, d, β) = 5/6 and γ*(P, d, β) = 5/6. It is easy to validate that L and P are cumulative β approximation reductions of the condition attribute set C.

From this example, we can see that the cumulative<br />

variable precision rough set model better reflects the<br />

rough set’s essence.
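The example's values of γ(C, d, β) and γ*(C, d, β) can be rechecked with a few lines of code. The self-contained sketch below encodes Table 1 directly and follows Definition 5 for I_C and the Section IV relation for I*_C; variable names are illustrative.

```python
# Recompute gamma(C, d, beta) and gamma*(C, d, beta) for the example (beta = 0.35).
beta = 0.35
table = {   # Mathematics, Physics, Literature; "*" marks a missing value
    1: ("medium", "bad",    "bad"),    2: ("good",   "medium", "*"),
    3: ("medium", "*",      "medium"), 4: ("*",      "medium", "medium"),
    5: ("*",      "good",   "bad"),    6: ("good",   "medium", "bad"),
}
decision = {1: "bad", 2: "good", 3: "bad", 4: "good", 5: "bad", 6: "good"}
C = range(3)

def c(X, Y):
    return 1 - len(X & Y) / len(X) if X else 0.0

def I(x):        # Definition 5: the referent x must have no missing values
    return {y for y in table
            if all(table[x][q] != "*" and table[y][q] in (table[x][q], "*") for q in C)}

def I_star(x):   # Section IV relation: "*" matches on either side
    return {y for y in table
            if all("*" in (table[x][q], table[y][q]) or table[x][q] == table[y][q]
                   for q in C)}

def gamma(neigh, domain):
    pos = set()
    for value in set(decision.values()):
        X = {o for o in table if decision[o] == value}
        pos |= {x for x in domain if c(neigh(x), X) <= beta}
    return len(pos) / len(table)

U_C      = {x for x in table if all(v != "*" for v in table[x])}
U_star_C = {x for x in table if any(v != "*" for v in table[x])}
print(gamma(I, U_C))             # 1/3, as in the example
print(gamma(I_star, U_star_C))   # 5/6, as in the example
```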



VI. CONCLUSIONS<br />

In the incomplete information system, a new VPRS model was obtained by redefining the relation between objects. To overcome the non-monotonicity of the proposed VPRS model, a cumulative data relation was defined and a cumulative VPRS model was established. However, these models have so far only been applied to small databases with incomplete information. Moreover, this paper only presented the basic reduction approach for a decision table. In our future work, we will focus on the development of reduction algorithms and on extracting minimal exact rules from large decision tables. How to deal with fuzzy data in incomplete information systems will also be one of our future research topics.

ACKNOWLEDGMENT<br />

The authors thank the editor and anonymous referees<br />

for their constructive comments and suggestions, which<br />

have improved the quality <strong>of</strong> this paper.<br />

REFERENCES<br />

[1] Z. Pawlak, A. Skowron, Rudiments <strong>of</strong> rough sets.<br />

Information Sciences, vol. 177, no. 1, pp. 3–27, 2007.<br />

[2] Z. Pawlak, A. Skowron, Rough sets: some extensions.<br />

Information Sciences, vol. 177, no. 1, pp. 28–40, 2007.<br />

[3] Z. Pawlak, A. Skowron, Rough sets and boolean reasoning.<br />

Information Sciences, vol. 177, no. 1, pp. 41–73, 2007.<br />

[4] W. Zhu, F.-Y. Wang, Binary relation based rough sets, in:<br />

IEEE FSKD 2006, LNAI, vol. 4223, pp. 276–285, 2006.<br />

[5] W. Zhu, Topological approaches to covering rough sets,<br />

Information Sciences, vol. 177, no. 6, pp. 1499–1508,<br />

2007.<br />

[6] W. Zhu, F.-Y. Wang, On three types <strong>of</strong> covering rough sets,<br />

IEEE Transactions on Knowledge and Data Engineering,<br />

vol. 19, no. 8, 2007.<br />

[7] K.Y. Qin, Z. Pei, On the topological properties <strong>of</strong> fuzzy<br />

rough sets, Fuzzy Sets and Systems, vol. 151, no. 3, pp.<br />

601–613, 2005.<br />

[8] W.Z. Wu, W.X. Zhang, Constructive and axiomatic<br />

approaches <strong>of</strong> fuzzy approximation operators, Information<br />

Sciences, vol. 159, no. 3–4, pp. 233–254, 2004.<br />

[9] W. Ziarko, Variable precision rough set model, Journal of Computer and System Sciences, vol. 46, no. 1, pp. 39–59, 1993.

[10] D. Slezak, W. Ziarko, The investigation <strong>of</strong> the Bayesian<br />

rough set model, International <strong>Journal</strong> <strong>of</strong> Approximation<br />

Reason, vol. 40, pp. 81–91, 2005.<br />

[11] Z. Tao, B. D. Xu, D. W. Wang, R. Li, Rough Rules Mining<br />

Approach Based on Variable Precision Rough Set Theory,<br />

Information and Control,vol. 33, no. 1, pp. 18-22, 2004<br />

(in Chinese).<br />

[12] M. Beynon, Reducts within the variable precision rough<br />

sets model: A further investigation, European journal <strong>of</strong><br />

operational research, vol. 134, pp. 592-605, 2001.<br />

[13] W. X. Zhang, Y. Liang, W. Z. Wu, Information System<br />

and Knowledge Discovery, Science Press, Beijing, China,<br />

2003 (in Chinese).<br />

[14] M. Inuiguchi, Structure-Based Approaches to Attribute<br />

Reduction in Variable Precision Rough Set Models,<br />

Proceeding <strong>of</strong> 2005 IEEE ICGC, pp. 34-39, 2005.<br />

[15] A. Mieszkowicz-Rolka, L. Rolka, Remarks on approximation quality in variable precision fuzzy rough sets model, Rough Sets and Current Trends in Computing: 4th International Conference, Springer-Verlag, pp. 402–411, 2004.

[16] Z. T. Gong, B. Z. Sun, Y. B Shao, D. G. Chen, Variable<br />

precision rough set model based on general relations,<br />

<strong>Journal</strong> <strong>of</strong> Lanzhou University (Natural Sciences), vol. 41,<br />

no. 6, pp. 110-114, 2005 (in Chinese).<br />

[17] Y. J. Zhang, Y. P. Wang, Covering rough set model based<br />

on variable precision, <strong>Journal</strong> <strong>of</strong> Liaoning Institute <strong>of</strong><br />

Technology, vol. 26, no. 4, pp. 274-276, 2006 (in Chinese).<br />

[18] S. B. Sun, R. X. Liu, K. Y. Qin, Comparison <strong>of</strong> Variable<br />

Precision Covering Rough Set Models, Computer<br />

engineering, vol. 34, no. 7, pp. 10-13, 2008 (in Chinese).<br />

[19] W. Zhu,F. Y. Wang, Reduction and axiomization <strong>of</strong><br />

covering generalized rough sets, Information Sciences, vol.<br />

152, pp. 217-230, 2003.<br />

[20] A. Mieszkowicz-Rolka, L. Rolka, Fuzziness in Information<br />

Systems, Electronic Notes in Theoretical Computer<br />

Science, vol. 82, no. 4, pp. 1-10, 2003.<br />

[21] X. B. Yang, J. Y. Yang, C. Wu, D. J. Yu, Dominance-based

rough set approach and knowledge reductions in<br />

incomplete ordered information system, Information<br />

Sciences, vol. 178, pp. 1219–1234, 2008.<br />

Shibao SUN was born in Henan, China on 07/10/1970. He received the B.S. degree in computer application from Henan University, Kaifeng, China in 1994, the M.S. degree in computer application from Zhengzhou University, Zhengzhou, China in 2004, and the D.E. degree in traffic information engineering and control from Southwest Jiaotong University, Chengdu, China in 2008.

In 2008 he was a Post-Doc in Computer Science and Technology at the National Laboratory of Software Development Environment, Beijing University of Aeronautics and Astronautics, Beijing, China. He is currently an Associate Professor in the Electronic Information Engineering College, Henan University of Science and Technology, Luoyang, China. His research interests include intelligent information processing and machine learning; in particular, he is currently interested in rough set theory and approaches and their applications in information processing.

Ruijuan ZHENG was born in Henan Province, China, on 01/20/1980. She received the B.S. degree in Computer Science and Technology from Henan University, Kaifeng, China in 2003, and the M.S. and D.E. degrees in bio-inspired computer network security theory and technology from Harbin Engineering University, Harbin, China in 2006 and 2008, respectively.

She now works in the Electronic Information Engineering College, Henan University of Science and Technology. Her research interests include computer immunology and bio-inspired dependability. She is about to join the China Computer Federation.

Qingtao WU was born in Jiangxi, China on 03/13/1975. He received the M.S. degree in computer science from Henan University of Science and Technology, Luoyang, China in 2003, and the D.E. degree in computer applications from East China University of Science and Technology, Shanghai, China in 2006.

He is currently an Associate Professor and Head of the Department of Computer Science, Henan University of Science and Technology, Luoyang, China. His areas of interest include network and information security and software formal methods.

Tianrui LI was born in Fujian, China on 06/14/1969. He received the B.S. and M.S. degrees in mathematics from Southwest Jiaotong University, Chengdu, China in 1992 and 1995, respectively. He



received the D.E. degree in traffic information engineering and control from Southwest Jiaotong University, Chengdu, China in 2002.

In 2005-2006 he was a Post-Doc in Computer Science and Engineering at the Belgian Nuclear Research Centre, Belgium. In 2008 he was a visiting professor for three months at Hasselt University, Belgium. He is currently a Professor in the School of Information Science and Technology of Southwest Jiaotong University, Chengdu, China. His research interests focus on developing effective and efficient data analysis techniques for novel data-intensive applications; in particular, he is currently interested in techniques for data mining, granular computing and mathematical modeling, as well as their applications in traffic.

Prof. Li has served as program chair of ISKE2007, ISKE2008 and ISKE2009, program vice chair of IEEE GrC 2009, and organizing chair of RSKT2008 and FLINS2010, among others, and has been a reviewer for several leading academic journals.




NaXi Pictographs Input Method and WEFT

Hai Guo
Dalian Nationalities University, Dalian, China
Email: guohai@dlnu.edu.cn

Jing-ying Zhao*
Dalian Nationalities University, Dalian, China
Email: dalianzjy@gmail.com

doi:10.4304/jcp.5.1.117-124

Abstract—The NaXi pictograph, the only hieroglyphic script still in use, is important for research into the evolution of writing. In the past NaXi pictographs were processed by hand, which is very inefficient. This paper develops the modules of a NaXi pictograph information processing system, including a pictograph outline font library, input method modules and an embedded (web) font module. A method for extracting NaXi pictograph outlines is proposed and has been verified under Linux. To suit the specific characteristics of NaXi pictographs, two input methods are presented, a NaXi pictograph Pinyin input method and a graphic-primitive input method; evaluation shows that the second achieves higher conversion accuracy than the first. For display on the web, a web font embedding technology for NaXi pictographs is put forward, which lets Internet clients browse NaXi pictograph web pages without downloading and installing a NaXi pictograph font. The development of the NaXi pictograph information processing system is significant both for research into NaXi pictographs and for their practical application.

Index Terms—NaXi pictographs, information processing, outline font, IME (Input Method Editor), embedded font, WEFT

I. INTRODUCTION

The NaXi pictograph script records the NaXi language, a member of the Yi branch of the Tibeto-Burman languages, and was created by the ancestors of the NaXi people. It is credited as "the only living ancient hieroglyph" and is still used for writing scripture and composition and in daily communication, so NaXi pictographs hold a special place in the history of human writing. Since hieroglyphs appeared at an early stage in the development of writing, a deep study of them can reveal much about the evolutionary history of human characters and human culture and can contribute to research on the origin of modern scripts. Many experts and scholars at home and abroad have worked on NaXi pictographs for a long time, with Harvard University and the Yunnan Academy of Social Sciences playing leading roles. However, a large body of literature and ancient books cannot be processed efficiently, which makes it urgent to realize an information processing system for NaXi pictographs.

The traditional hand-drawing method of processing NaXi pictographs is inefficient and its quality cannot be guaranteed, and the problem cannot be solved with the standard fonts that ship with computers. Our project therefore designs and develops a NaXi pictograph information processing system for the first time, ending the era in which NaXi pictographs were processed without computerization.

*Corresponding author: Jing-ying Zhao, Email: dalianzjy@gmail.com

II. NAXI PICTOGRAPH PROCESSING OVERVIEW

The NaXi pictograph input platform is developed for specific user groups, including printing and publishing units, NaXi pictograph research institutes and art design companies, so it is not a universal software platform. This makes it essential that our development meet the requirements of these different customers. The application-level analysis of the project is summarized in Table I.

According to this analysis, our project has designed several NaXi pictograph input processing configurations that meet the specific requirements of different customers. For professional customers in publication and printing, we developed a NaXi pictograph standard edition with TrueType and PostScript outline fonts and four input methods (internal-code, Latin, graphic-primitive and English input), which fully satisfies the needs of printing and typesetting. For NaXi pictograph research institutes, we developed the NaXi pictograph TrueType outline font and the NaXi Pinyin input method. Since art design units only need the font outline styles, their system contains only the NaXi pictograph TrueType outline fonts and the NaXi pictograph English input method. Finally, we developed web font embedding and PDF document creation technology for ordinary customers.

TABLE I. ANALYSIS OF CUSTOMER REQUIREMENTS FOR THE NAXI PICTOGRAPH INPUT PLATFORM

Customer group           | Customer requirement
Printing and publication | high accuracy, fast input and output, various convenient input methods
Research institutes      | general output accuracy, advanced input method
Art design units         | various output font styles, input methods that are easy to learn
Ordinary users           | web page and electronic document browsing

III. THE PRINCIPLE OF NAXI PICTOGRAPH INFORMATION PROCESSING

A. The Principle of NaXi Pictograph Outline Fonts

Because an outline font can be scaled up and down at will without distortion, it is widely applied in printing, typesetting, character processing and artistic creation; the Windows system we use every day is also displayed with outline fonts. This type of font is composed of Bézier curves: the TrueType font uses quadratic Bézier curves and the PostScript font uses cubic Bézier curves, so the fidelity of the PostScript font is better than that of the TrueType font. A Bézier segment is controlled by three points, as shown in Fig. 1; when the middle control point changes, the whole curve changes too. If an outline glyph consists of n+1 points (P0, P1, ..., Pn), it needs m Bézier curves to form the font, as shown in (1).

m = \sum_{i=0}^{n} \binom{n}{i} t^{i} (1-t)^{n-i}     (1)

A PostScript outline is formed of a group of cubic Bézier curves, which can be described by (2) and (3):

x = a_x t^3 + b_x t^2 + c_x t + d_x     (2)

y = a_y t^3 + b_y t^2 + c_y t + d_y     (3)

A TrueType outline is formed of a group of quadratic Bézier curves, and every curve is defined by three control points. For a quadratic Bézier curve with control points (A_x, A_y), (B_x, B_y) and (C_x, C_y), the curve is described by (4) and (5):

P_x = (1-t)^2 A_x + 2t(1-t) B_x + t^2 C_x     (4)

P_y = (1-t)^2 A_y + 2t(1-t) B_y + t^2 C_y     (5)

Figure 1. Bézier curve.

By varying the parameter t from 0 to 1, all values of P defined by A, B and C are generated, from which the quadratic Bézier curve is obtained.
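To make equations (2)-(5) concrete, the following is a minimal sketch of how points on a quadratic (TrueType-style) and a cubic (PostScript-style) Bézier segment can be sampled; the function names and sample control points are illustrative only and are not part of the system described in the paper.

```python
# Illustrative sketch: sampling points on Bezier segments of an outline glyph.
# quadratic_bezier implements equations (4)-(5); cubic_bezier is the Bernstein
# form whose expanded polynomial corresponds to equations (2)-(3).

def quadratic_bezier(a, b, c, t):
    """Point on a quadratic Bezier segment with control points a, b, c at t in [0, 1]."""
    x = (1 - t) ** 2 * a[0] + 2 * t * (1 - t) * b[0] + t ** 2 * c[0]
    y = (1 - t) ** 2 * a[1] + 2 * t * (1 - t) * b[1] + t ** 2 * c[1]
    return x, y

def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier segment with control points p0..p3 at t in [0, 1]."""
    s = 1 - t
    x = s**3 * p0[0] + 3 * s**2 * t * p1[0] + 3 * s * t**2 * p2[0] + t**3 * p3[0]
    y = s**3 * p0[1] + 3 * s**2 * t * p1[1] + 3 * s * t**2 * p2[1] + t**3 * p3[1]
    return x, y

if __name__ == "__main__":
    # Sample a hypothetical quadratic segment at 11 evenly spaced parameter values.
    A, B, C = (0.0, 0.0), (50.0, 100.0), (100.0, 0.0)
    samples = [quadratic_bezier(A, B, C, i / 10) for i in range(11)]
    print(samples[5])  # the point at t = 0.5
```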

A font is defined as a combination of a set of outline curves, and every curve includes three or more points. The Bézier outline curve of the NaXi pictograph "deer" is illustrated in Fig. 2; the outline is constituted by a set of instructions that depict the character's exterior.

All the parameter information of the outline font is stored in the NaXi pictograph "glyf" table. Because NaXi pictographs are highly complex and mutually similar, the precision of outline-curve delineation is much higher than that of dot-matrix (bitmap) delineation. Based on these two outline font techniques, we developed a NaXi pictograph TrueType font and a PostScript font, which basically meet the requirements of processing the basic forms of NaXi pictographs by computer.

Figure 2. Outline curve delineation of the NaXi pictograph "deer".

B. The Feature Extraction Method for the NaXi Pictograph Outline Font

An outline font is filled by a series of outline curves, and delineating the outline accurately is the key to the success of a font. As can be seen from Fig. 3, NaXi pictographs are complicated and varied and have many strokes, which makes it difficult to reproduce the outline of a NaXi pictograph faithfully. In view of these characteristics, our project proposes a dual-mode transformation algorithm for extracting NaXi pictograph outline points and uses the following rules to check whether a pixel belongs to the outline.

Rule 1: if the centre of a pixel is inside the outline, the pixel is lightened and becomes a part of the outline curve.

Rule 2: if an outline line passes through the centre of a pixel, the pixel is lightened.

Rule 3: if a scan line located midway between two adjacent pixels (horizontal or vertical) intersects both an on-transition and an off-transition of the contour, and the pixels on this line have not been lightened by Rule 1 or Rule 2, then the left endpoint is lightened when the line is horizontal and the right endpoint is lightened when it is vertical.

Rule 4: Rule 3 applies only when the two contour surfaces still intersect the scan line in both directions, but this does not mean that the affected pixels are "stubs". By checking whether the scanned line segment forms a square with the crossing scan line segment, it can be verified whether they intersect each other through two contour surfaces. There is a small but real possibility that more than one contour intersects at a discontinuity point, so it is necessary to control some character outlines using grid-fitting.

The method discussed above was applied in a secondary development of Pgaedit under Linux; it effectively prevents the generation of discontinuity points, and the accuracy of outline point extraction for NaXi pictographs reaches 99.99%.
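For illustration, the sketch below implements only the centre-inside test of Rule 1 against a flattened outline polygon; it is a standard even-odd ray test under that simplifying assumption and is not the authors' full dual-mode algorithm (Rules 2-4 handle edge crossings and dropout control).

```python
# Simplified sketch of Rule 1: lighten a pixel when its centre lies inside the outline.
# The outline is approximated here by a closed polygon (e.g. a flattened Bezier outline).

def center_inside(polygon, px, py):
    """Even-odd test: is the pixel centre (px+0.5, py+0.5) inside the closed polygon?"""
    cx, cy = px + 0.5, py + 0.5
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count crossings of the horizontal ray running to the right of the centre.
        if (y1 > cy) != (y2 > cy):
            x_cross = x1 + (cy - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > cx:
                inside = not inside
    return inside

def rasterize(polygon, width, height):
    """Return the set of pixel coordinates whose centres fall inside the outline."""
    return {(x, y) for y in range(height) for x in range(width)
            if center_inside(polygon, x, y)}
```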

IV. NAXI PICTOGRAPH FONT DEVELOPMENT

A computer font is an electronic data file containing a set of glyphs, characters, or symbols such as dingbats. Although the term font first referred to a set of metal type sorts in one style and size, since the 1990s most fonts have been digital and used on computers. There are three basic kinds of computer font file data formats:

- Bitmap fonts consist of a series of dots or pixels representing the image of each glyph in each face and size.
- Outline fonts (also called vector fonts) use Bézier curves, drawing instructions and mathematical formulas to describe each glyph, which makes the character outlines scalable to any size.
- Stroke fonts use a series of specified lines and additional information to define the profile, or size and shape of the line, in a specific face, which together describe the appearance of the glyph.

Bitmap fonts are faster and easier to use in computer<br />

code, but inflexible, requiring a separate font for each<br />

size. Outline and stroke fonts can be resized using a<br />

single font and substituting different measurements for<br />

components <strong>of</strong> each glyph, but are somewhat more<br />

complicated to use than bitmap fonts as they require<br />

additional computer code to render the outline to a bitmap<br />

for display on screen or in print.<br />


The difference between bitmap fonts and outline fonts<br />

is similar to the difference between bitmap and vector<br />

image file formats. Bitmap fonts are like image formats<br />

such as Windows Bitmap (.bmp), Portable Network<br />

Graphics (.png) and Tagged Image Format (.tif or .tiff),<br />

which store the image data as a grid <strong>of</strong> pixels, in some<br />

cases with compression. Outline or stroke image formats<br />

such as Windows Metafile format (.wmf) and Scalable<br />

Vector Graphics format (.svg), store instructions in the<br />

form <strong>of</strong> lines and curves <strong>of</strong> how to draw the image rather<br />

than storing the image itself.<br />

A bitmap image can be displayed in a different size<br />

only with some distortion, but renders quickly; outline or<br />

stroke image formats are resizable but take more time to<br />

render as pixels must be drawn from scratch each time<br />

they are displayed.<br />

Fonts are designed and created using font editors.<br />

Fonts specifically designed for the computer screen and<br />

not printing are known as screenfonts.<br />

B. NaXi Pictograph Font Library

The font library is the basis and core of the system; accurate glyph shapes and correct codes lay a solid foundation for developing the complete system. The NaXi pictograph font library includes the NaXi pictograph TrueType outline font and the NaXi pictograph PostScript font. The TrueType outline font is composed of quadratic Bézier curves, and the PostScript font is built with cubic Bézier curves. Since a quadratic Bézier curve can be converted to a cubic Bézier curve, we develop the TrueType outline font first and then realize the conversion with specialist tools.
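The quadratic-to-cubic conversion mentioned above is the standard degree-elevation identity, sketched below for one segment; the text only says the conversion is done with specialist tools, so this code is purely illustrative of the underlying mathematics.

```python
# Degree elevation: a quadratic Bezier segment with control points Q0, Q1, Q2 traces
# exactly the same curve as the cubic whose inner control points lie two thirds of the
# way from the endpoints toward Q1.

def quadratic_to_cubic(q0, q1, q2):
    """Return cubic control points (C0, C1, C2, C3) equivalent to a quadratic segment."""
    c0 = q0
    c1 = (q0[0] + 2.0 * (q1[0] - q0[0]) / 3.0, q0[1] + 2.0 * (q1[1] - q0[1]) / 3.0)
    c2 = (q2[0] + 2.0 * (q1[0] - q2[0]) / 3.0, q2[1] + 2.0 * (q1[1] - q2[1]) / 3.0)
    c3 = q2
    return c0, c1, c2, c3

print(quadratic_to_cubic((0, 0), (50, 100), (100, 0)))
```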

The development of the NaXi pictograph outline font mainly proceeds by designing the character drafts, scanning them, digitizing the glyphs, modifying and coordinating the glyphs, checking character quality, and integrating the character library. The drafts are scanned into a high-precision bitmap character library, and the character codes are assigned at the same time. Digital fitting and integration follow the dual-mode switching method, which automatically converts the bitmap image into numerical curve-outline information that stays as faithful to the manuscript as possible. Outline points, lines, angles and positions can be rectified through parameter control, which is extremely important when producing NaXi pictographs, with their complex shapes and wide variety of styles.

The outline font development can thus be divided into four steps: font and script design, digital fitting, modification, and font generation. Our project uses the original script of A Dictionary of NaXi Pictograph Sound-Indication, which is converted to a standard bitmap by scanning; the describing points are then extracted for digital fitting, after which the NaXi pictograph outline font is integrated and created. This technique guarantees fidelity to the original script, and the generated outline font has excellent reproducibility and vector properties.

V. NAXI PICTOGRAPH INPUT METHOD BASED ON IMM-IME

A. IMM-IME Introduction

Since Windows is the most widely used operating system at present, this paper focuses on an input system for NaXi pictographs based on Windows. A Windows input method transforms a standard ASCII string into a NaXi character or string using a particular coding rule. Because the user cannot design a separate transformation program for every application, the task of inputting NaXi pictographs has to be taken over by the Windows system.

As shown in Fig. 4, the keyboard event of the NaXi pictograph input system is first received by the Windows file use.exe; use.exe transfers the event to the Input Method Manager (IMM); the IMM then conveys the event to the input method editor, which translates the keyboard event into the corresponding NaXi character (or string) with reference to the user's encoding dictionary; the translated result is propagated back to use.exe and then to the running application, at which point the whole input process for a NaXi pictograph is finished [4-5].

B. The Basic Structure of the NaXi Pictograph IMM-IME

The IMM-IME structure provides various input methods for applications, and each thread of an application can keep an active input window. The processing order of other messages is not disturbed by inserting NaXi pictograph messages into the message loop. The header file immdev.h must be included to use these features. The detailed working principle of the NaXi pictograph Pinyin input method is shown in Fig. 3.

TABLE II. NAXI PINYIN CODE

IPA  | Latin    IPA  | Latin    IPA   | Latin    IPA   | Latin
p‘   | p        p    | b        b     | bb       f(ɯ)  | f(w)
t    | d        t’   | t        d     | dd       n     | n
l    | l        k    | g        k‘    | k        g     | gg
Ŋ    | ng       h    | h        ʈɕ    | j        ʈɕ‘   | q
dʐ   | jj       ȵ    | ni       ɕ(ʑ)  | x(y)     ʐ     | r
ʂ    | sh       dʐ   | rh       tʂ    | zh       tʂ‘   | ch
z    | ss       s    | s        dz    | zz       ts‘   | c
ts   | z        i    | i        u     | u        y     | iu
a    | a        o    | o        ә     | e        v     | v
ɯ    | ee       әr   | er       e     | ei       ӕ     | ai
ie   | iei      iӕ   | iai      ia    | ia       iә    | ie
uei  | ui       uӕ   | uai      ua    | ua       uә    | ue

Figure 3. The input method principle of NaXi pictographs.


C. NaXi Pictograph Pinyin Input Method

Current input methods fall into two types: phonetic (Pinyin) and shape-based. A shape-based input method divides characters by strokes and requires the user to be able to write them. Writing NaXi pictographs is so hard for ordinary users that shape codes are unsuitable for entering NaXi pictographs into the computer. A phonetic input method only requires the user to know how to read the character, so Pinyin input is more suitable for NaXi pictographs.

Early dictionaries of NaXi pictograph sound-indication use the International Phonetic Alphabet to annotate NaXi characters, while the computer uses Latin letters as input codes, so a conversion between NaXi phonetic notation and Latin Pinyin becomes necessary. Table II lists the mapping between the International Phonetic Alphabet and Latin Pinyin in detail. When designing a NaXi pictograph input method one encounters the problem of overly long codes, a consequence of the characteristics of NaXi pronunciation. For instance, the NaXi character � is coded as "ssoxiqssoddassa": fifteen English letters are needed to map this single character, which results in very low efficiency when typing an article. Research shows that repeated initials are common in the phonetic coding of NaXi pictographs; by designing codes with simplified initials, the NaXi pictograph Pinyin input method can greatly reduce the code length and improve coding efficiency. Taking the same character as an example, its phonetic code "ssoxiqssoddassa" becomes "soxiqsodasa" after the simplification, reducing the length from fifteen to eleven. The NaXi pictograph set consists of 2120 characters; the average code length of twelve letters is reduced to eight letters after the simplification of pronunciation, so the input speed is also increased.
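The following is a minimal sketch of the initial simplification, assuming the rule simply collapses the doubled initial letters of Table II (bb, dd, gg, jj, ss, zz) to a single letter; this reproduces the worked example above but is not the authors' published algorithm.

```python
# Collapse doubled initial consonants in a concatenated NaXi Pinyin code.
import re

def simplify(code):
    """Replace each doubled initial (bb, dd, gg, jj, ss, zz) by its single letter."""
    return re.sub(r"bb|dd|gg|jj|ss|zz", lambda m: m.group(0)[0], code)

print(simplify("ssoxiqssoddassa"))  # -> "soxiqsodasa", 15 letters reduced to 11
```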



D. NaXi Pictograph Graphic-Primitive Input Method

1) The structure coding method of NaXi pictographs: NaXi pictographs have four common structures: the undivided whole structure, coded by b; the upper-and-lower structure, coded by s; the left-and-right structure, coded by z; and the surrounding structure, coded by b. Examples of characters with the undivided whole structure are �, �, �, �, �; with the upper-and-lower structure, �, �, �, �, �, �; with the left-and-right structure, �, �, �, �, �; and with the surrounding structure, �, �, �, �, �. After this coarse segmentation and coding, fine-granularity coding is introduced.

2) The graphic-primitive coding method of NaXi pictographs: NaXi pictograph characters do not have radical components like Chinese characters, so this paper proposes a representation method using graphic primitives, inspired by syntactic pattern recognition. The basic elements of a graph, the graphic primitives, are the point, line, circle, circular curve, left oblique line, right oblique line, vertical line, vertical curve, oval curve and rectangle.

TABLE III. GRAPHIC PRIMITIVE CODES OF NAXI PICTOGRAPHS

Graphic primitive   | Code
point               | a
line                | b
circle              | c
circular curve      | f
left oblique line   | g
right oblique line  | h
vertical line       | i
vertical curve      | g
oval curve          | k
rectangle           | l

The basic graphic primitives are coded as shown in Table III. Unlike Chinese and other characters, many NaXi pictograph characters contain a count: for example, � (dice) the number of dots, � (fire) the number of vertical curves, and � (treasure) the number of circles. The quantity is therefore also coded: one is 'y', two is 'e', three is 's', four is 'f', five is 'w', six is 'l', seven is 'q', eight is 'b', nine is 'j', and any number greater than nine is 'd'. The user interface of the NaXi graphic-primitive input method is shown in Fig. 4.
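As an illustration of the coding scheme just described, the sketch below composes a code from a structure code plus (primitive, quantity) pairs. The decomposition used in the example is hypothetical, since the paper does not list full character decompositions; the code tables themselves follow Table III and the quantity codes above.

```python
# Compose a graphic-primitive code for one character, assuming the code is the
# structure code followed by a (primitive code, quantity code) pair per primitive type.

PRIMITIVE_CODE = {"point": "a", "line": "b", "circle": "c", "circular curve": "f",
                  "left oblique line": "g", "right oblique line": "h",
                  "vertical line": "i", "vertical curve": "g", "oval curve": "k",
                  "rectangle": "l"}                                   # per Table III
QUANTITY_CODE = {1: "y", 2: "e", 3: "s", 4: "f", 5: "w", 6: "l", 7: "q", 8: "b", 9: "j"}

def encode(structure_code, primitives):
    """primitives: list of (primitive name, count) pairs in writing order."""
    parts = [structure_code]
    for name, count in primitives:
        parts.append(PRIMITIVE_CODE[name] + QUANTITY_CODE.get(count, "d"))  # 'd' for >9
    return "".join(parts)

# Hypothetical decomposition of a "fire"-like character: whole structure, three vertical curves.
print(encode("b", [("vertical curve", 3)]))  # -> "bgs"
```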

E. Evaluation of the NaXi Pictograph Input Methods

In order to evaluate the efficiency of the NaXi input methods proposed in this paper, we developed an evaluation system for NaXi pictograph input; its flow is shown in Fig. 6. The evaluation system uses the Windows message mechanism to automatically convert NaXi Pinyin text into NaXi pictograph text with the NaXi input method activated, both with and without the optimization. From the evaluation results we can calculate the conversion accuracy.
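A minimal sketch of the accuracy figure reported in Table IV follows, assuming conversion accuracy is the fraction of characters converted to the intended pictograph; the reference and output texts shown are placeholders, not the five evaluation texts used by the authors.

```python
# Character-level conversion accuracy, averaged over several evaluation texts.

def conversion_accuracy(reference, converted):
    """Fraction of positions where the converted text matches the reference text."""
    if not reference:
        return 0.0
    correct = sum(1 for r, c in zip(reference, converted) if r == c)
    return correct / len(reference)

pairs = [("reference_text", "converted_text")]   # placeholder (reference, output) pairs
average = sum(conversion_accuracy(r, c) for r, c in pairs) / len(pairs)
print(average)
```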

In the experiment we submitted five NaXi texts to the evaluation system; the results demonstrate that the NaXi graphic-primitive input method achieves a higher average conversion accuracy than the Pinyin input method. Table IV shows the details.

TABLE IV. COMPARISON OF THE NAXI INPUT METHODS (CONVERSION ACCURACY)

Method                    | text1 | text2 | text3 | text4 | text5 | Average
Pinyin method             | 99.1% | 98.1% | 98.4% | 96.1% | 97.8% | 98.1%
Graphic-primitive method  | 99.6% | 99.1% | 98.9% | 97.1% | 98.1% | 98.6%

Figure 4. NaXi pictograph graphic-primitive input method.

VI. WEB EMBEDDING APPLICATION

Aiming at the requirements of ordinary customers, our project also studied the application of NaXi pictographs and developed web embedding and PDF embedding technology.

A. The Classification of Web Embedding

Lattice (bitmap) fonts have been phased out gradually since the beginning of the 1990s as operating systems were updated, and the new outline fonts have replaced them. The mainstream outline fonts include:

① PostScript Type 1 fonts, put forward by Adobe years ago and applied in publication typesetting systems, belong to the first generation of outline fonts.

② TrueType fonts, put forward by Apple and Microsoft, have many merits and are used in a variety of operating systems on Mac and PC; they belong to the second generation of outline fonts.

③ OpenType fonts, put forward by Adobe and Microsoft as the new-generation outline font standard, fuse the merits of Type 1 and TrueType and belong to the third generation of outline fonts.

The TrueType font embedding technology for NaXi pictographs can be divided into Embedded OpenType and TrueDoc. Embedded OpenType, put forward by Microsoft, compresses a TrueType font into an EOT file and then embeds it into HTML web pages, while TrueDoc, put forward by Netscape and Bitstream, compresses a TrueType font into a TrueDoc file and then embeds it into HTML web pages.

B. The Principle of the Web Embedding Technology for NaXi Pictographs

Figure 5. NaXi pictograph input method.

The web embedding technology for NaXi pictographs uses CSS2 to call compressed OpenType font files and embed NaXi pictographs into web pages. The compressed NaXi pictograph fonts are downloaded to the client's temporary directory, which lets clients browse web pages containing NaXi pictographs correctly without installing a NaXi pictograph font.

The principle is as follows: a client sends a browsing request, and the HTTP server returns the HTML files to the client browser. Through CSS2 the client sends the information about web pages containing NaXi pictographs to the server, and the server passes this information to the EOT database. EOT calls the TrueType font files stored on the HTTP server, and the HTTP server sends this information back to the browser. The NaXi pictograph fonts are deleted automatically after the client closes the browser, so font copyright can be better protected from piracy. By calling CSS2 several times in a web page, not only NaXi pictographs but also Chinese, Mongolian, Tibetan and other minority languages can be displayed on the same page.

C. Realization of the Web Embedding Fonts Technology for NaXi Pictographs

1) The environment for developing web embedded fonts for NaXi pictographs: The development environment is one for making web pages, EOT files and CSS style sheets, and is mainly composed of Dreamweaver, WEFT (Web Embedding Fonts Tool), Font Creator and other tools. FreeBSD with Apache installed is adopted as the server. First, this structure is fully compatible with a Microsoft IIS system; second, it provides more useful functions, faster operation and better stability than Microsoft IIS.

2) The application of CSS2 to web embedded fonts for NaXi pictographs: CSS2 makes it possible to embed the NaXi pictograph TrueType font into web pages. According to the way CSS is inserted, there are three kinds of style sheet: inline, embedded and external; the inline and embedded style sheets are the ones broadly used in the web embedding fonts technology for NaXi pictographs. The key CSS statement is @font-face, which defines the name, type, weight and other information of an embedded font.

3) The generation of the embedded font database of NaXi pictographs: The substance of creating the font database is to compress NaXi pictograph TrueType fonts into (Embedded) OpenType fonts. There are many ways to generate the NaXi pictograph embedded font databases; this paper adopts Microsoft WEFT (Web Embedding Fonts Tool). Before embedding, the validity of the NaXi pictograph TrueType fonts in the development environment is checked by WEFT, which displays the font validity graphically.



After the font check succeeds, the needed NaXi pictographs are added to the EOT files, and calling the NaXi pictograph font while browsing is then no longer a problem.

4) The embedding of the NaXi pictograph font: NaXi pictographs can be entered with the NaXi pictograph input method developed by our information system laboratory, and a web page containing NaXi pictographs is then created with Dreamweaver. An @font-face declaration is then inserted between the head tags of the page. The @font-face rule provides four parameters: font-family names the font in the current web page (in this project it is defined as NaXi); font-style can be normal, italic or oblique, and is commonly defined as normal; font-weight can be normal, bold, bolder, lighter or another legal weight value; and the font URL points to the OpenType (EOT) file, normally as an absolute path. After saving the NaXi pictograph web page, it can be tested by transferring it to the HTTP server; the test page is shown in Fig. 6. At this point the preliminary process of embedding NaXi pictographs is complete.

VII. CONCLUSION

A complete NaXi pictograph information processing platform is developed in this paper. It ends the era of processing NaXi pictographs without computers, provides a valuable reference for creating information processing systems for other minority languages, and plays a significant role in promoting the computerization of minority scripts in China.

Figure 6. NaXi pictograph WEFT web page.

ACKNOWLEDGMENT<br />

This work is supported by National Natural Science<br />

Foundation <strong>of</strong> China (No. 60803096).<br />

REFERENCES

[1] G. Eason, B. Noble, and I. N. Sneddon, "On certain integrals of Lipschitz-Hankel type involving products of Bessel functions," Phil. Trans. Roy. Soc. London, vol. A247, pp. 529-551, April 1955.
[2] J. F. C. Rock, A Nakhi-English Encyclopedic Dictionary, I.M.E.O., Rome, 1963.
[3] Liu Yongkui, Guo Hai, Lu Guiyan, and Li Hongyan, "Input technology and information processing of NaXi pictograph," Journal of Computational Information Systems, vol. 3, no. 1, pp. 361-368, February 2007.
[4] Guo Hai, Che Wengang, Nie Juan, Li Bin, et al., "Web embedding fonts technology of NaXi pictographs," Jisuanji Gongcheng/Computer Engineering, vol. 31, no. 17, pp. 203-204, 207, September 2005.
[5] Guo Hai and Zhao Jing-ying, "Development of the NaXi pictographs information processing system," Control & Automation, vol. 22, no. 22, pp. 122-124, 2006.
[6] Xie Qian, Jiang Li, Wu Jian, et al., "Research on Chinese Linux input method engine standard," Jisuanji Yanjiu yu Fazhan/Computer Research and Development, vol. 43, no. 11, pp. 1965-1971, November 2006.
[7] Tseng Chun-Han and Chen Chia-Ping, "Chinese input method based on reduced Mandarin phonetic alphabet," in Proc. INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, 2006, pp. 733-736.
[8] Tanimoto Yoshio, Nanba Kuniharu, Rokumyo Yasuhiko, et al., "Evaluation system of suitable computer input device for patients," in Proceedings of the Third Workshop - 2005 IEEE Intelligent Data Acquisition and Advanced Computing Systems, 2007, pp. 369-373.

Hai Guo received the M.Sc. degree in pattern recognition and intelligent systems from the Kunming University of Science and Technology, China, in 2004.

He has been teaching at Dalian Nationalities University since graduation. For many years he has done research in image processing and pattern recognition, and he has published more than 10 papers.

Jing-ying Zhao received the M.Sc. degree in computer application from the Changchun University of Science and Technology, China, in 2003.

She has been teaching at Dalian Nationalities University since 2003. For many years she has done research in image processing and pattern recognition, and she has published more than 10 papers.



A Systematic Decision Criterion for Urban Agglomeration: Methodology for Causality Measurement at Provincial Scale

Yaobin Liu
Research Center of Central China Economic Development, Nanchang University, Nanchang, P.R. China
Email: liuyaobin2003@163.com

Li Wan
Research Center of Central China Economic Development, Nanchang University, Nanchang, P.R. China
Email: Sandy_gf@163.com

Abstract—This paper employs synergetic theory and the grey relative technique to analyze the inner nonlinear relationships among the factors that contribute to the process of urban agglomeration, and builds a synergic model to reveal the causality driving the urban agglomeration system. The grey relative analysis shows that the urban agglomeration system is an open, dissipative system affected by many factors whose relationships are not simply linear but relevant, non-uniform and irreversible. Meanwhile, the synergic model reveals that the key parameters dominating the process of urban agglomeration in Jiangxi province are the rate of industrialization, the percentage of urban fixed asset investment in total fixed asset investment, and the highway mileage per unit area. The simulated results agree with practice, so nonlinear systematic methods can be applied to analyze the causal mechanism in the process of urban agglomeration.

Index Terms—urban agglomeration; causality measurement; grey relative technique; synergic model

I. INTRODUCTION<br />

Urban agglomeration is made up <strong>of</strong> cities with<br />

different sizes to be linked by traffic network in a given<br />

area, and it is an inevitable result when urbanization<br />

reaches a certain level [1]. Today, while the global<br />

economic development is integrated, the urban<br />

agglomeration has become a carrier, by which a country<br />

or region can participate in international competition. For<br />

example, in the United States, three urban agglomerations<br />

including the New York area, Los Angeles area and the<br />

Great Lakes concentrated about 46% <strong>of</strong> the national<br />

population and 67% <strong>of</strong> the GDP. So, it is vital for a<br />

nation and area to nurture and build a stronger and<br />

competitive urban agglomeration for participating in<br />

international competition. How can urban agglomerations be nurtured and built rapidly? This is not only a practical problem for city planners and urban managers, but also an important academic issue for theoretical scholars.

Manuscript received January 1, 2009; revised June 1, 2009; accepted July 1, 2009.
Corresponding author: Y. B. Liu. Tel.: +86 791 3163385; fax: +86 791 8304401; E-mail: liuyaobin2003@163.com.
doi:10.4304/jcp.5.1.125-130

Although many scholars have paid most attention to city clusters in the developed areas of Western countries, modern research has been extended to developing areas, including countries in Africa and Asia. The mechanism and pattern of urban agglomeration remain a central topic in the urban planning field. For example, Naude and Krugell (2003) analyzed the spatial development of urban agglomeration in South Africa and argued that the size of the primate city in the Johannesburg-East Rand area might be relatively too large [2]. Kanemoto et al. (2005) took the Tokyo metropolitan area as an example and analyzed the rational size of Tokyo as the primate city [3]. Qin et al. (2006) analyzed the formation mechanism and spatial pattern of urban agglomeration in central Jilin, China [4].

Most modern scholars <strong>of</strong> urban development<br />

acknowledge that transnational processes are having an<br />

increasingly important influence on the evolution <strong>of</strong><br />

urban agglomeration. An early observation was the<br />

recognition <strong>of</strong> an emerging system <strong>of</strong> world cities[5], a<br />

kind <strong>of</strong> urban elite which is shaped in part by the new<br />

international division <strong>of</strong> labor. These urban<br />

agglomerations are also thought to be controlling and<br />

coordinating global finance, producer and business<br />

services[6]. The view <strong>of</strong> world cities as the “key nodes”<br />

<strong>of</strong> the international urban system is a widely held one,<br />

underpinned in particular by rapid advances in the<br />

development <strong>of</strong> information technology and<br />

telecommunications. However, because the surrounding<br />

metropolitan area has experienced pr<strong>of</strong>ound changes <strong>of</strong><br />

spatial organization, with suburbanization bringing the<br />

most radical reorganization <strong>of</strong> metropolitan<br />

space[7][8][9], the growing role <strong>of</strong> suburbanization in<br />

metropolitan development is not unique as other major<br />

cities in post-communist countries follow a similar path<br />

[10][11][12][13][14][15][16][17].Suburbanization should<br />

thus be considered as one <strong>of</strong> the crucial topics in the<br />

study <strong>of</strong> urban agglomeration in post communist cities.<br />

Consequently, a variety <strong>of</strong> explanation attribute to urban<br />

agglomerations at present.



When we carefully review the relevant operating mechanisms of the urban system, we find that urban agglomeration, being a complicated nonlinear system, exhibits many "disturbances" and "fluctuations" within its orderly appearance. These disturbances and fluctuations can change and promote the self-organizing process, and the system can then move from disorder to order (or from a lower to a higher order). The process of urban agglomeration can also exhibit a "Lorenz butterfly effect" when the system inclines toward recession or collapse [18]. We therefore have to rely on nonlinear mathematical methods to analyze the interaction mechanism of the formation and evolution of urban agglomeration, with which we can find the quantitative relationships among the key factors and reveal the causality in the system. However, there is still a lack of studies on the integrated system from a holistic regional point of view, so we take Jiangxi province in China as a case, use the grey relative technique to analyze the main factors of urban agglomeration evolution, and develop a synergic method to reveal the causality. The objective of this study is to investigate the key factors of urban agglomeration in Jiangxi province. Improved understanding of its interactive mechanism will certainly help planning efforts for future urbanization management.

II. METHODOLOGY AND DATA

A. Grey relative technique

Grey system theory holds that the world is both a material world and an information world. In the information world, a white system can be fully understood and a black system cannot be known, while a grey system can only be partly explained. Grey system theory is an analytical method generally used to study relationships under incomplete information; its main characteristic is to compare data arrays and to calculate the grey relational degrees between the elements of a system with incomplete information. The grey relative technique is an effective way to reveal such grey relations: it finds the links among the elements through quantitative analysis, and thus helps us understand an incompletely known world. The basic idea of the grey relative technique is to

judge the closeness of the links between arrays by their grey relational degrees [19]. To calculate the grey relational degree r_ij, the first step is to calculate the grey relational coefficient between the main array and a sub-array. The grey relational coefficient is defined as

r_ij(k) = [ min_j min_k |x_i(k) - x_j(k)| + ξ max_j max_k |x_i(k) - x_j(k)| ] / [ |x_i(k) - x_j(k)| + ξ max_j max_k |x_i(k) - x_j(k)| ]     (1)

where r_ij(k) is the grey relational coefficient of the main array and the sub-array at time k; x_i(k) is the main array; x_j(k) is the sub-array; and ξ is the differentiation coefficient, used to sharpen the most prominent differences between two arrays, with a value generally taken as 0.5. The second step is to calculate the grey relational degree between the main array and the sub-array, defined as

r_ij = (1/n) Σ_{k=1}^{n} r(x_i(k), x_j(k))     (2)

where r_ij is the grey relational degree of sub-array j with respect to main array i, and n is the length of the arrays.
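A minimal sketch of equations (1)-(2) follows: it computes the grey relational coefficients between a main array and a sub-array and averages them into the grey relational degree. The series are assumed to be already normalized, and the example arrays are illustrative rather than the Jiangxi data of Table 1.

```python
# Grey relational coefficients (eq. 1) and grey relational degree (eq. 2).

def grey_relational_degree(main, sub, all_subs, xi=0.5):
    """main, sub: equal-length sequences; all_subs: every sub-array, used for the
    global min/max of |x_i(k) - x_j(k)| in equation (1); xi: differentiation coefficient."""
    diffs_all = [abs(m - s) for seq in all_subs for m, s in zip(main, seq)]
    d_min, d_max = min(diffs_all), max(diffs_all)
    coeffs = [(d_min + xi * d_max) / (abs(m - s) + xi * d_max)
              for m, s in zip(main, sub)]          # equation (1)
    return sum(coeffs) / len(coeffs)               # equation (2)

x1 = [0.0, 0.1, 0.3, 0.7, 1.0]   # normalized main array (illustrative)
x2 = [0.0, 0.2, 0.4, 0.6, 1.0]   # normalized sub-array  (illustrative)
print(grey_relational_degree(x1, x2, [x2]))
```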

B. Synergic model

The formation of urban agglomeration is the result of nonlinear links among the elements of the system, and its evolution results from their synergy, cooperation and competition. There is therefore a relatively stable form in the process of urban agglomeration, in which the system changes from a disordered state to an ordered one. Moreover, the process of urban agglomeration is an ongoing development with a temporal-spatial structure as a whole. In order to reveal the inner mechanism and the causality of evolution, we develop a synergic model to analyze the nonlinear interactions and find their quantitative relationships [20].

Suppose the dynamic system of urban agglomeration has m state variables, x_1, x_2, ..., x_m, each observed as a data sequence over the study period. Because the variables have different units, they cannot be applied to the synergic model directly and must first be normalized. Let the normalized variables be x_i^(0)(t) (i = 1, 2, ..., m), and let x_i^(1) denote the corresponding sequence after first-order accumulation; the variation rate is then dx_i^(1)(t)/dt. This variation rate is determined by two aspects. The first results from the inner synergic effects among the elements of the subsystem, that is, self-development: if the self-development term is a_ii x_i^(1)(t), the corresponding restraint term is -b_ii (x_i^(1)(t))^2. The second lies in the exterior synergic effects among the subsystems, the so-called synergic function: if the synergic term is a_ij x_j^(1)(t), the corresponding restraint term is -b_ij (x_i^(1)(t))^2. The total variation rate of x_i^(1)(t) can thus be expressed as

dx_i^(1)/dt + a_ii x_i^(1) = -(b_ii + b_ij) (x_i^(1))^2 + Σ_j a_ij x_j^(1)     (3)

Letting b_ii + b_ij = b_i, we obtain the following nonlinear equation:

dx_i^(1)/dt + a_ii x_i^(1) = -b_i (x_i^(1))^2 + Σ_j a_ij x_j^(1),   i ≠ j     (4)

Because the evolution of the system proceeds from one critical point to another, a fluctuation is only a disturbance; correspondingly, the stabilizing mechanism of the system always makes fluctuations decay or even disappear. The synergic model of urban agglomeration that we build is intended to find the dominant driving parameters that lead to the formation and evolution of urban agglomeration; in practice these parameters are given an important role in strengthening the synergic function between subsystems. Moreover, the purpose of our work is to help the system advance to a more highly ordered state, not to forecast its trend when the system changes or undergoes a transition. We therefore neglect the role of fluctuations when building the model.

The parameters a_ii, a_ij and b_i are indicators by which we measure the intensity of synergy and competition among the elements, and they can be solved from the nonlinear differential equations. By analyzing the state equations of the model, we can identify which state variables have positive relaxation coefficients and which have negative ones. When the state variables with negative relaxation coefficients are eliminated with the adiabatic approximation approach, the remaining state variables are the slow ones of the system, the so-called order parameters. Alternatively, we can use a numerical method, the classical fourth-order Runge-Kutta approach, to obtain the numerical solution of the equations, and then eliminate the rapid state variables of the system by the adiabatic approximation; the remaining slow state variables are again the order parameters of the system. Since x_i^(1)(t) in the equations is obtained by first-order accumulation and normalization, the numerical solution has to be reduced back; the simulated values of the state variables consistent with the synergic development of the urban agglomeration system can then be obtained by the Runge-Kutta approach.
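The sketch below integrates the coupled nonlinear equations (4) with a classical fourth-order Runge-Kutta step, as described above. The coefficient matrices and the initial state are placeholders for illustration, not the fitted parameters of the Jiangxi case study.

```python
# RK4 integration of dx_i/dt = -a_ii x_i - b_i x_i^2 + sum_{j != i} a_ij x_j  (equation (4)).
import numpy as np

def rhs(x, a, b):
    """Right-hand side of equation (4) for the whole state vector x."""
    self_part = -np.diag(a) * x - b * x ** 2        # -a_ii x_i - b_i x_i^2
    coupling = a @ x - np.diag(a) * x               # off-diagonal synergic terms
    return self_part + coupling

def rk4_step(x, a, b, h):
    k1 = rhs(x, a, b)
    k2 = rhs(x + 0.5 * h * k1, a, b)
    k3 = rhs(x + 0.5 * h * k2, a, b)
    k4 = rhs(x + h * k3, a, b)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

# Placeholder coefficients for a 3-variable system and one simulated trajectory.
a = np.array([[0.5, 0.1, 0.2], [0.1, 0.4, 0.1], [0.2, 0.1, 0.6]])
b = np.array([0.3, 0.2, 0.4])
x = np.array([0.1, 0.1, 0.1])
for _ in range(100):
    x = rk4_step(x, a, b, h=0.05)
print(x)
```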

C. Data collection

Jiangxi province is located in southeastern China, bordering Fujian, Guangdong, Zhejiang, Hunan, Hubei and Anhui provinces. It represents about 1.8% of the surface of the P.R. of China and 3.5% of its population. Agricultural production dominates the economy of Jiangxi province: the share of agriculture in GDP was 17.93% in 2005, 5.32 percentage points higher than the national average. Its GDP per capita in 2006 was 9440.62 Yuan (US$8.19), i.e. 67.24% of the national average [21].

The data on the non-agricultural population and GDP of the provincial cities and the other major cities of the central region come from the "China City Statistical Yearbook (2007)", compiled by the urban social-economic survey department of the National Bureau of Statistics [22]. The social-economic data of the major cities of Jiangxi province come from the "Statistical Yearbook of Jiangxi Province (2007)", edited by the statistics bureau of Jiangxi province [23]. The distance between every city of Jiangxi province and the other cities is measured by the railway mileage of the national railway ministry; these data come from http://qq.ip138.com/train/. Where there is a direct train between two cities, we adopt their shortest distance; taking into account that a considerable number of cities have no direct train, we use the shortest transfer distance based on accessibility.

III. RESULTS AND ANALYSIS

A. Factors analysis

Since the evolution of urban agglomeration is an interaction process among variables, we have to inspect the relationships between the elements of the urban agglomeration system in order to characterize their behavior. According to qualitative analysis, the dominant driving forces of the formation and evolution of urban agglomeration in Jiangxi province come from six aspects: industrialization, migration and the transfer of industries, foreign investment, the pressure of competition in the central area, the construction of the traffic network, and the macro-control of government [24]. Because the relationships among the variables in the internal dynamic system are relatively complex and are affected by man-made policies, it is very difficult to quantify the action characteristics of the urban agglomeration system directly, so we use indirect variables to characterize the dynamic system. First of all, we selected six variables as the original series for 2000-2006: the rate of industrialization, the percentage of the tertiary industry in GDP, the percentage of actually utilized foreign investment in GDP, the percentage of urban fixed asset investment in total fixed asset investment, the highway mileage per unit area, and the percentage of financial income in GDP (Table 1). Using the grey relative technique we calculated the grey relational degrees of the six variables and obtained the cross table of the dynamic elements during 2000-2006 in Jiangxi province (Table 2). The grey relative analysis yields four results from Table 2.

The first result is that the grey relational degrees are all above 0.45, which indicates that the interactive intensity and mutual constraints of the various elements in the urban agglomeration are significant. The second is that the grey relational degrees of the variables differ: the relationships are not simply linear but relevant, non-uniform and irreversible, which indicates that the system is complex. The third is that the percentage of actually utilized foreign investment in GDP has a higher grey relational degree with the other variables, which indicates that the ability to attract foreign investment is more important than the other elements in the process of urban agglomeration in Jiangxi province; at the same time, the rate of industrialization, the percentage of the tertiary industry in GDP, the percentage of urban fixed asset investment in total fixed asset investment, the highway mileage per unit area and the percentage of financial income in GDP all play an important role in the ability to attract foreign investment. The last result is that the percentage of urban fixed asset investment in total fixed asset investment also has a higher relationship with the other variables, which indicates that city construction investment has an important impact on the evolution of urban agglomeration; the rate of industrialization, the percentage of the tertiary industry in GDP, financial income and the highway mileage per unit area also play a vital role in urban fixed asset investment.

Table 1. The data of action characteristics of urban agglomeration during 2000-2006 in Jiangxi province, China

Variable                                                                2000    2001    2002    2003    2004    2005    2006
x1  Rate of industrialization (%)                                       27.2    27.7    28.7    30.8    33.0    35.9    38.7
x2  Percentage of the tertiary industry in GDP (%)                      40.8    40.6    39.6    37.2    35.5    34.8    33.5
x3  Percentage of actual utilized foreign investment in GDP (%)         1.021   1.637   3.993   5.169   5.344   5.375   5.408
x4  Percentage of urban fixed asset investment in total fixed
    asset investment (%)                                                84.4    84.6    85.1    85.8    86.4    87.7    88.5
x5  Highway mileage per unit area (km/km2)                              0.361   0.364   0.368   0.369   0.371   0.373   0.768
x6  Percentage of financial income in GDP (%)                           0.01    0.01    0.01    0.01    0.01    0.01    0.01

Table 2. The cross table of factors of urban agglomeration in Jiangxi province, China

       x1      x2      x3      x4      x5      x6
x1     1       0.66    0.65    0.76    0.59    0.62
x2     0.62    1       0.74    0.72    0.55    0.67
x3     0.65    0.77    1       0.64    0.67    0.90
x4     0.71    0.75    0.60    1       0.58    0.65
x5     0.56    0.57    0.64    0.57    1       0.72
x6     0.46    0.45    0.47    0.46    0.43    1
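The paper does not list the exact grey relational formula it applied; as an illustration only, the sketch below uses Deng's classical grey relational degree with the usual resolution coefficient 0.5 and an initial-value normalization, applied to two of the series from Table 1. The normalization and coefficient are assumptions, not the authors' reported settings.

% Illustrative sketch: Deng's grey relational degree between a reference
% series x0 and a compared series x1 (values taken from Table 1).
x0 = [27.2 27.7 28.7 30.8 33.0 35.9 38.7];   % x1 of Table 1: industrialization rate
x1 = [84.4 84.6 85.1 85.8 86.4 87.7 88.5];   % x4 of Table 1: urban fixed asset share
x0n = x0 ./ x0(1);                            % initial-value normalization (assumed)
x1n = x1 ./ x1(1);
rho   = 0.5;                                  % resolution coefficient
delta = abs(x0n - x1n);                       % absolute differences at each year
gamma = (min(delta) + rho*max(delta)) ./ (delta + rho*max(delta));
r = mean(gamma);                              % grey relational degree of x1 to x0
fprintf('grey relational degree = %.2f\n', r);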

B. Causality analysis

Suppose that the rate of industrialization, the percentage of the tertiary industry in GDP, the percentage of the actual utilized foreign investment in GDP, the percentage of urban fixed asset investment in total fixed asset investment, the highway mileage per unit area and the percentage of financial income in GDP are the state variables of the urban agglomeration, respectively; they can be denoted as $x_1^{(0)}(t)$, $x_2^{(0)}(t)$, $x_3^{(0)}(t)$, $x_4^{(0)}(t)$, $x_5^{(0)}(t)$ and $x_6^{(0)}(t)$. With the MATLAB software applied, we can get the parameters of the system by using the synergic model. Putting these parameters into Equa. (4), we can get the state equations of the system:

$$
\begin{cases}
\dfrac{dx_1^{(1)}}{dt}=-0.5870x_1^{(1)}+0.2503\big(x_1^{(1)}(t)\big)^2+0.3474x_2^{(1)}+0.5219x_3^{(1)}+0.4146x_4^{(1)}+0.3292x_5^{(1)}+0.4158x_6^{(1)}\\[4pt]
\dfrac{dx_2^{(1)}}{dt}=0.3037x_1^{(1)}-0.7274x_2^{(1)}-0.0529\big(x_2^{(1)}\big)^2+0.4679x_3^{(1)}+0.5047x_4^{(1)}+0.4053x_5^{(1)}+0.3922x_6^{(1)}\\[4pt]
\dfrac{dx_3^{(1)}}{dt}=-0.7287x_1^{(1)}+0.3509x_2^{(1)}+0.3726x_3^{(1)}+0.0724\big(x_3^{(1)}\big)^2+0.4635x_4^{(1)}+0.5064x_5^{(1)}+0.4826x_6^{(1)}\\[4pt]
\dfrac{dx_4^{(1)}}{dt}=0.3275x_1^{(1)}+0.3062x_2^{(1)}+0.2272x_3^{(1)}-0.7271x_4^{(1)}+0.0807\big(x_4^{(1)}(t)\big)^2+0.4627x_5^{(1)}+0.5243x_6^{(1)}\\[4pt]
\dfrac{dx_5^{(1)}}{dt}=0.4276x_1^{(1)}+0.3907x_2^{(1)}+0.1278x_3^{(1)}+0.4347x_4^{(1)}-0.5264x_5^{(1)}-0.0902\big(x_5^{(1)}(t)\big)^2-0.3526x_6^{(1)}\\[4pt]
\dfrac{dx_6^{(1)}}{dt}=-0.5264x_1^{(1)}+0.3281x_2^{(1)}+0.5071x_3^{(1)}+0.2229x_4^{(1)}+0.3281x_5^{(1)}+0.3209x_6^{(1)}-0.4563\big(x_6^{(1)}(t)\big)^2
\end{cases}
\qquad(5)
$$

Using the fourth-order Runge-Kutta approach, we can get the numerical solution, denoted $\tilde{x}_i^{(1)}(t)$ $(i=1,2,\ldots,6)$. After the numerical solution of the equation is reduced, the simulated values of the state variables are obtained, i.e. $\tilde{x}_i^{(0)}(t)$. Through analyzing the state equations of this model, we can get some results. Firstly, the coefficients of the first state variable (the rate of industrialization), the third state variable (percentage of urban fixed asset investment in total fixed asset investment) and the fourth state variable (the highway mileage per unit area) are positive, while the coefficients of the other three variables, namely the percentage of the tertiary industry in GDP, the percentage of the actual utilized foreign investment in GDP and the percentage of financial income in GDP, are negative. Secondly, with the adiabatic approximation approach employed, we can eliminate these three state variables; the remaining state variables, which are the rate of industrialization, the percentage of urban fixed asset investment in total fixed asset investment and the highway mileage per unit area, are the slow state variables, which can be called order parameters. By contrasting the cumulative curve of the initial values with that of the forecasted values (Fig. 1 and Fig. 2), it can be seen that the simulated rule of dynamic evolution of urban agglomeration is consistent with the actual situation, so the model is successful.
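For illustration only (this is not the authors' script; the step size and the initial state are placeholders, since the paper's accumulated starting values are not reproduced here), the identified system (5) can be integrated with the classical fourth-order Runge-Kutta scheme as follows:

% Sketch: RK4 integration of the state equations (5).
% Each equation is linear in all six variables plus a quadratic term in its
% own variable: dx_i/dt = A(i,:)*x + q(i)*x(i)^2.
A = [-0.5870  0.3474  0.5219  0.4146  0.3292  0.4158;
      0.3037 -0.7274  0.4679  0.5047  0.4053  0.3922;
     -0.7287  0.3509  0.3726  0.4635  0.5064  0.4826;
      0.3275  0.3062  0.2272 -0.7271  0.4627  0.5243;
      0.4276  0.3907  0.1278  0.4347 -0.5264 -0.3526;
     -0.5264  0.3281  0.5071  0.2229  0.3281  0.3209];
q = [0.2503; -0.0529; 0.0724; 0.0807; -0.0902; -0.4563];
f = @(x) A*x + q.*x.^2;          % right-hand side of (5)
h = 0.1;  T = 0:h:6;             % assumed step size and horizon (years 2000-2006)
X = zeros(6, numel(T));
X(:,1) = ones(6,1);              % placeholder initial accumulated state
for n = 1:numel(T)-1             % classical RK4 step
    k1 = f(X(:,n));
    k2 = f(X(:,n) + h/2*k1);
    k3 = f(X(:,n) + h/2*k2);
    k4 = f(X(:,n) + h*k3);
    X(:,n+1) = X(:,n) + h/6*(k1 + 2*k2 + 2*k3 + k4);
end
% Differencing ("reducing") X along time would recover the simulated
% original series compared against the observed data in Fig. 2.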

The synergic model reveals that the dominant parameters of urban agglomeration in Jiangxi province are the rate of industrialization, the percentage of urban fixed asset investment in total fixed asset investment and the highway mileage per unit area, which are the key driving factors in the formation and evolution of urban agglomeration. Therefore, the degree of industrialization, the competitive ability of the central urban agglomeration and the construction of the transport network play an important role among all the driving factors, and we should pay more attention to new industrialization for developing the urban economy in the process of urban agglomeration. At the same time, the expenditure and maintenance funds for urban construction and the construction of the transport network have a vital impact on the development of urban agglomeration in Jiangxi province. This shows that urban construction and transport links are an important foundation for economic development: if a good urban environment cannot be guaranteed during economic growth, healthy economic development cannot be achieved. Therefore, we should pay more attention to planning urban construction as a whole in practice, raise funds for construction, and accelerate the transport infrastructure in Jiangxi province.

Figure 1. Cumulative curve of initial values (x(0), 0-3, for x1-x6 plotted against t = 1-7)

Figure 2. Simulation curve of forecasted values (simulated x(0), 0-3, for x1-x6 plotted against t = 1-7)

IV. CONCLUSIONS

There is still a lack of studies on integrated urban agglomeration systems from a holistic regional point of view. We therefore take Jiangxi province in China as a case, use the grey relational technique to analyze the main factors of urban agglomeration evolution, and use the synergic method to build a nonlinear differential model that can reveal the causality in the process of urban agglomeration, obtaining the parameters of the system by simulation. The grey relational analysis shows that the urban agglomeration system is an open and dissipative system affected by many factors, whose relations are not simply linear but relevant, non-uniform and irreversible. The synergic model further reveals that the dominant parameters of urban agglomeration in Jiangxi province are the rate of industrialization, the percentage of urban fixed asset investment in total fixed asset investment and the highway mileage per unit area, which are the key driving factors in the formation and evolution of urban agglomeration. Therefore, promoting industrialization, enhancing the competitive ability of the central urban agglomeration and boosting the construction of the transport network are vital in the process of urban agglomeration.

The results also show that the integrated model constituted of the grey relational technique, the synergic model and numerical analysis can, to a degree, reveal the evolution direction of urban agglomeration and capture the internal mechanism of its formation and evolution. Therefore, nonlinear systematic methods can be applied to analyze the mechanism of urban agglomeration development. As applied research, when other variables are used to describe the action characteristics of the dynamic system, the method is simple but it reflects uncertainty and subjectivity. At the same time, the synergic model built with the nonlinear differential grey system relies on a series of assumptions; because fluctuation is not considered, universal practicability cannot be ensured when it is applied. Therefore, the model should be amended in accordance with the specific issues in practice.

ACKNOWLEDGEMENT<br />

This research was supported by funding from



Provincial Education Department <strong>of</strong> Jiangxi (No.<br />

GJJ08014) and National Social Science Foundation <strong>of</strong><br />

China (NSSC) (No.07CJL031).<br />

REFERENCES<br />

[1] S. M. Yao, Y. M. Zhu, and Z. G. Chen, The Urban Agglomerations of China, 3rd ed. Hefei: Science and Technology University of China Publishing House, 2006, pp. 23-28. (In Chinese)
[2] W. A. Naude and W. F. Krugell, "Are South Africa's cities too small?", Cities, 2003, 20(3), pp. 175-180.
[3] Y. Kanemoto, T. Kitagawa, H. Saito, et al., Estimating Urban Agglomeration Economies for Japanese Metropolitan Areas: Is Tokyo Too Large? Tokyo: Graduate School of Economics, University of Tokyo, 2005, pp. 45-48.
[4] G. Qin, P. Y. Zhang, and B. Jiao, "Formation mechanism and spatial pattern of urban agglomeration in central Jilin of China", Chinese Geographical Science, 2006, 16(2), pp. 154-159.
[5] P. Hall, The World Cities, 3rd ed. London: Weidenfeld and Nicolson, 1966, pp. 124-128.
[6] N. Thrift, "Globalization, regulation, urbanisation: the case of the Netherlands", Urban Studies, 1986, 31, pp. 365-380.
[7] L. Sykora, "Changes in the internal spatial structure of post-communist Prague", Geographical Journal, 1999, 49(1), pp. 79-89.
[8] M. Ourednicek, "Differential suburban development in the Prague urban region", Geografiska Annaler: Series B, Human Geography, 2007, 89(2), pp. 111-126.
[9] L. Sykora and M. Ourednicek, "Sprawling post-communist metropolis: commercial and residential suburbanization in Prague and Brno, the Czech Republic", in Dijst, M., Razin, E. and Vazquez, C. (eds): Employment Deconcentration in European Metropolitan Areas: Market Forces versus Planning Regulations, GeoJournal Library 91, New York: Springer, 2007, pp. 209-233.
[10] K. Leetmaa and T. Tammaru, "Suburbanization in countries in transition: destinations of suburbanizers in the Tallinn metropolitan area", Geografiska Annaler: Series B, Human Geography, 2007, 89(2), pp. 127-146.
[11] H. Kok and Z. Kovacs, "The process of suburbanization in the agglomeration of Budapest", Netherlands Journal of Housing and Built Environment, 1999, 14(2), pp. 119-141.
[12] H. Nuissl and D. Rink, "The production of urban sprawl in eastern Germany as a phenomenon of post-socialist transformation", Cities, 2005, 22(2), pp. 123-134.
[13] N. Pichler-Milanovic, "Ljubljana: from beloved city of the nation to Central European capital", in Hamilton, F. E. I., Dimitrowska, K. and Pichler-Milanovic, N. (eds): Transformation of Cities in Central and Eastern Europe: Towards Globalization, Tokyo: United Nations University Press, 2005, pp. 318-363.
[14] U. Sailer-Fliege, "Characteristics of post-socialist urban transformation in East Central Europe", GeoJournal, 1999, 49(1), pp. 7-16.
[15] T. Tammaru, "Suburbanization, employment change, and commuting in the Tallinn metropolitan area", Environment and Planning A, 2005, 37(9), pp. 1669-1687.
[16] J. Timar and M. Varadi, "The uneven development of suburbanization during transition in Hungary", European Urban and Regional Studies, 2001, 8(4), pp. 349-360.
[17] I. Tosics, "Post-socialist Budapest: the invasion of market forces and the response of public leadership", in Hamilton, F. E. I., Dimitrowska, K. and Pichler-Milanovic, N. (eds): Transformation of Cities in Central and Eastern Europe: Towards Globalization, Tokyo: United Nations University Press, 2005, pp. 248-280.
[18] X. F. Pan, "Urban construction and economic development of collaborative system through using the model of grey correlative analysis", Science and Technology Management Research, 2005, 10, pp. 43-45. (In Chinese)
[19] L. Fu, Grey System Theory and Application, Beijing: Science and Technology Literature Publishing House, 1992, pp. 212-216. (In Chinese)
[20] K. X. Bi and H. Y. Wang, "The gray cognate analyzing and model structuring of synergy development system of universities science research and subject construction", Systems Engineering - Theory & Practice, 2003, 4, pp. 140-143. (In Chinese)
[21] State Statistical Bureau of China, China Statistical Yearbook (2007), Beijing: China Statistical Press, 2007, pp. 243-246.
[22] Urban Social-Economic Survey Department of the National Bureau of Statistics, China City Statistical Yearbook (2007), Beijing: China Statistical Press, 2007, pp. 142-126.
[23] Statistics Bureau of Jiangxi Province, Statistical Yearbook of Jiangxi Province (2007), Beijing: China Statistical Press, 2007, pp. 67-69.
[24] Y. B. Liu, W. J. Song, and L. Wan, "Formation of urban agglomeration and dynamic mechanism analysis in Jiangxi province", Group of Economic Research, 2007, 30, pp. 28-29. (In Chinese)

Yaobin Liu received an academic degree in Management Science and Engineering. His PhD was on the relation between urban function and development in urban management; he received his PhD (2005) from China University of Mining and Technology. Following two post-doctoral years at Huazhong University of Science and Technology, he served at the Research Center of Central China Economic Development of Nanchang University until 2005. He has initiated and led a number of projects on the urban agglomerations of Central China since 1999, starting with geographic mapping and moving through several conservation development plans to a conservation master plan for the whole urban agglomeration. His main interest is in urban management and planning.

Li Wan is currently a PhD candidate in the Research Center of Central China Economic Development of Nanchang University. He received a BS in Management Science and an MS in Human Geography, both from Nanchang University. His research focuses on urban agglomeration assessment and strategic planning assessment (SEA), with special emphasis on applying management science principles to SEA of spatial plans. He was also involved in several programs on urban functional planning and integrated urban agglomeration management when he worked as an assistant researcher in the Urban Planning Center of Jiangxi University of Science and Technology in 2002 and 2003.



Application <strong>of</strong> Improved Fuzzy Controller for<br />

Smart Structure<br />

Jingjun Zhang<br />

Department <strong>of</strong> Science Research, Hebei University <strong>of</strong> Engineering, Handan, China<br />

Email: santt88@163.com<br />

Liya Cao, Weize Yuan, 2 Ruizhen Gao, and Jingtao Li<br />

College <strong>of</strong> Civil Engineering, Hebei University <strong>of</strong> Engineering, Handan, China<br />

2 College <strong>of</strong> Mechanical and Electrical Engineering, Hebei University <strong>of</strong> Engineering, Handan, China<br />

Email: {yaer305@163.com, yuanweize520@126.com, 217taozi@163.com}<br />

Abstract—In order to reduce the vibration of smart structures, this paper obtains the optimal location of the piezoelectric patch by the D-optimal design principle, and then uses fuzzy logic to control the vibration of the smart structure. The fuzzy IF-THEN rules are established from an analysis of the motion traits of the cantilever beam. The fuzzy logic controller (FLC) is designed using the displacement and velocity of the cantilever beam tip as the inputs, and the control force on the cantilever beam as the output. This new method improves calculation efficiency and reduces calculation complexity. Besides that, the paper establishes a parameter self-adjustment factor in the fuzzy controller by means of an s-function to make the fuzzy logic control more effective. The simulation results with Matlab illustrate that the proposed method has a better control performance than existing methods.

Index Terms—smart structures, optimal location, fuzzy IF-THEN rules, fuzzy logic controller, parameter self-adjustment, s-function

I. INTRODUCTION<br />

In 1985 Bailey et al. [1] performed experimental research on active vibration control using surface-bonded piezoelectric polymer actuators on a cantilevered beam. Their experiment greatly inspired the field of active vibration control, and recently much research has been carried out on smart materials and structures. Piezoelectric material is a kind of smart material with two useful characteristics: the direct and inverse piezoelectric effects, and the ability to be used as the sensor or the actuator in active vibration control systems. Vibration control of smart structures is very important because the materials used are lightly damped. The placement of the piezoelectric patches plays an important role in the design procedure of active structures, and researchers have focused on the development of optimal piezoelectric patch locations. Guo et al. [2] presented a global optimization of sensor locations based on a damage detection method for structural health monitoring systems. Martin Kögl et al. [3] presented a novel approach to the design of piezoelectric plate and shell actuators using topology optimization; in this approach, the optimization problem consists of distributing the piezoelectric actuators in such a way as to achieve a maximum output displacement in a given direction at a given point of the structure. Cao et al. [4] used the element sensitivities of singular values to identify optimal locations for actuators.

Many performance criteria have been presented, such as controllability and observability measures of the control system, dissipation energy measures and system stability measures. However, in order to make use of the above-mentioned measures, a flexible-structure state space equation has to be modeled for each given location of the piezoelectric patches. The D-optimal design principle is an optimization method presented by Bayard et al. [5], in which the maximum determinant of the Fisher Information Matrix is chosen as the optimization function and then simplified to determine an optimal principle for the best location of piezoelectric elements.

In the fields of civil engineering and spaceflight engineering, structures are always accompanied by complicated kinetic characteristics and uncertain factors. On the other hand, a robot system is a highly nonlinear and heavily coupled mechanical system, and the mathematical model of such a system usually consists of a set of linear or nonlinear difference equations derived using some form of approximation and simulation. In 1983, Brown and Yao [6] applied fuzzy theory to engineering structures for the first time. In 1986, Juang and Elton [7] adopted fuzzy logic to estimate earthquake intensity from the extent of damage to constructions. Battaini [8] designed a fuzzy logic controller for an active mass system and tested it experimentally. Symans and Kelly [9] applied a fuzzy logic control strategy to intelligent semi-active seismic isolation systems. Based on the virtues of fuzzy logic control, H. Park [10] established approximate models of the driver, the sensor and the fuzzy logic controller to solve the vibration problems of flexible structures; the results indicated that fuzzy logic control has strong robustness and self-adaptivity for linear and non-linear systems. R. Y. Y. Lee [11] carried out similar experiments, designing an ordinary fuzzy logic controller and a self-buildup controller for a non-linear piezoelectric driver; the simulated results showed that fuzzy logic control has excellent suppression effectiveness on the vibration of non-linear flexible structures. This paper presents an optimization method using the D-optimal design principle, simplified to determine an optimal principle for the best location of piezoelectric elements, and then designs a fuzzy logic controller to control the beam's vibration. The simulation results show that the proposed method is superior to the others.

II. MODELING AND THEORY<br />

In this section, we consider beam or plate type structures bonded with rectangular piezoelectric sensors and actuators. The sensors and actuators are symmetrically collocated on both sides of the same position of the structure. The research in reference [12] verified that this symmetrical collocation can avoid the observation spillover and control spillover induced by modal truncation, and ensures that the controlled system is a minimum phase system. The finite element method can analyze arbitrary geometries and the anisotropic properties of the piezoelectric materials. To account for the piezoelectric effect, special finite elements with a voltage degree of freedom have been developed, and such elements are available in commercial finite element software such as ANSYS. In this paper, SOLID45 3-D solid elements are used to model the host structure and SOLID5 3-D solid elements are used to model the piezoelectric elements; the low-order modes of the flexible structure are then extracted using the ANSYS software.

III. D-OPTIMAL DESIGN PRINCIPLE<br />

Sensors are used to estimate the state parameters. Based on the theory of mathematical statistics, the determinant of the Fisher Information Matrix, det(F), is inversely related to the lower bound of the variance of the unbiased parameter estimate. Because of the symmetrical collocation of the piezoelectric patches, once the locations of the sensors have been confirmed, the actuators will be at the same locations as the sensors. For a lightly damped structure, the D-optimal design principle can be simplified so as to decouple the problems of actuator/sensor placement and input control. The principle can be written as:

$$\max\big(\det(F)\big)\qquad(1)$$

subject to $\beta\in B_m$, $B_m=\{\beta:\sum_{k=1}^{M}\beta_k=m\}$,

Figure 1. Piezoelectric patch finite element (a 2dx x 2dy patch centred on node k, with neighbouring nodes i, j along x and m, n along y)

where β is the location selection matrix and $B_m$ is the set of possible locations. The objective function can also be written as:

$$S(m)=\max_{\beta}\sum_{i=1}^{N}\log\!\left(\sum_{k=1}^{M}\beta_k\,\frac{(\gamma_k^{T}\phi_i)^{2}}{\varphi_k^{2}}\right)\qquad(2)$$

where N and M are the number of modes and the number of possible sensor locations, and m is the number of sensors; β is composed of 0s and 1s: if β_k = 1, a sensor is located at the kth location, whereas if β_k = 0, the location has no sensor; γ_k is a coefficient vector related to the kth sensor location; φ_i is the ith normalized mode shape vector; and ϕ_k is the covariance of the sensor signal noise, which can be set to 1.

The physical sense of (2) can be regarded as finding the locations of maximum charge or voltage output. Therefore, γ_k^T φ_i is equivalent to the output charge of the sensor. If we suppose that the sensor area is small enough compared with the beam (plate), the sensor charge can be written as

$$q = D\cdot A = A\,(D_x + D_y)\qquad(3)$$

$$D_x = d_{31}E_{pe}\,\varepsilon_x = d_{31}E_{pe}\,\frac{t_p}{2}\,\frac{\partial^{2}\omega}{\partial x^{2}}\qquad(4)$$

$$D_y = d_{31}E_{pe}\,\varepsilon_y = d_{31}E_{pe}\,\frac{t_p}{2}\,\frac{\partial^{2}\omega}{\partial y^{2}}\qquad(5)$$

where D_x is the electric displacement generated by the x-axis strain, D_y is the electric displacement generated by the y-axis strain, and A is the area of the sensor. For a beam structure, the electric displacement is generated by unidirectional strain. Here d_31 is the piezoelectric strain constant, E_pe and t_p are the Young's modulus and the thickness of the piezoelectric patch, and ω is the deflection of the structure. The dimensions of the piezoelectric patch and the finite element are shown in Fig. 1.

Based on a second-order difference scheme, $\left.\frac{\partial^{2}\omega}{\partial x^{2}}\right|_k$ and $\left.\frac{\partial^{2}\omega}{\partial y^{2}}\right|_k$ can be expressed as:

$$\left.\frac{\partial^{2}\omega}{\partial x^{2}}\right|_k \approx d_x^{2}\omega = \frac{\omega(i)+\omega(j)-2\,\omega(k)}{(dx)^{2}}\qquad(6)$$

$$\left.\frac{\partial^{2}\omega}{\partial y^{2}}\right|_k \approx d_y^{2}\omega = \frac{\omega(m)+\omega(n)-2\,\omega(k)}{(dy)^{2}}\qquad(7)$$

Substituting (4), (5), (6) and (7) into (3) yields

$$q = d_{31}E_{pe}A\,\lambda_k\qquad(8)$$

where

$$\lambda_k = \frac{t_p}{2}\left(\frac{\omega(i)+\omega(j)-2\,\omega(k)}{(dx)^{2}} + \frac{\omega(m)+\omega(n)-2\,\omega(k)}{(dy)^{2}}\right).$$

The deflection of the structure, ω, can be expressed in terms of the mode shapes φ_i:

$$\omega=\sum_{i=1}^{N}\eta_i\,\phi_i\qquad(9)$$

where η_i is the ith modal generalized coordinate. Substituting (9) into (8) and assuming $q=\sum_{i=1}^{N}\gamma_k^{T}\phi_i$, $\gamma_k^{T}$ can be written as

$$\gamma_k^{T}=\frac{1}{2}\,\eta_i\,A\,d_{31}E_{pe}\,\lambda_k^{i}\qquad(10)$$

where

$$\lambda_k^{i}=\frac{t_p}{2}\left(\frac{1}{(dx)^{2}}\,[\,0\;\cdots\;1_i\;\;-2_k\;\;1_j\;\cdots\;0\,]+\frac{1}{(dy)^{2}}\,[\,0\;\cdots\;1_m\;\;-2_k\;\;1_n\;\cdots\;0\,]\right)\phi_i$$

and φ_i is the ith normalized mode shape vector. Substituting (10) into (2), the objective function can be simplified as:

$$S(m)=\max_{\beta}\sum_{i=1}^{N}\sum_{k=1}^{M}\big(\eta_i\,\beta_k\,\lambda_k^{i}\big)\qquad(11)$$

From (11), we conclude that the objective function is composed of all orders of mode shapes weighted by the coefficients η_i.

With the vibration-generating force, the equation of motion in modal coordinates can be written as:

$$\ddot{\eta}_i(t) + 2\,\xi_i\,\omega_i\,\dot{\eta}_i(t) + \omega_i^{2}\,\eta_i(t) = \frac{1}{M_i}\,f(t)\qquad(12)$$

where $\ddot{\eta}_i$, $\dot{\eta}_i$ and $\eta_i$ represent the modal acceleration, velocity and displacement, respectively, and ω_i and ξ_i are the natural frequency and damping ratio of the ith mode; for a flexible structure with no interior damping, ξ_i = 0.
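As a quick numerical illustration of the modal equation (12) (all parameter values below are made up, not taken from the paper), the undamped modal oscillator can be integrated directly; the response grows strongly only when the forcing frequency approaches the natural frequency:

% Sketch: response of one undamped mode governed by (12) to a sine force.
wi = 2*pi*5;                 % assumed natural frequency of the ith mode (rad/s)
Mi = 1;                      % assumed modal mass
theta = 2*pi*4.8;            % forcing frequency of the sine force
f  = @(t) sin(theta*t);      % external modal force f(t)
rhs = @(t,z) [z(2); f(t)/Mi - wi^2*z(1)];   % z = [eta; eta_dot], xi_i = 0 as in (12)
[t,z] = ode45(rhs, [0 10], [0; 0]);
plot(t, z(:,1)); xlabel('t (s)'); ylabel('\eta_i(t)');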

For a different force, every mode shape has a different proportion in the vibration of the structure. If the vibration-generating force is a unit impulse, the structure vibrates mainly according to the lower mode shapes and the other mode shapes can be ignored. If the vibration-generating force is a sine force with frequency θ, the vibration of the structure is related to the frequency of the force rather than uniquely governed by one mode shape. For these reasons, in order to improve the vibration control performance, the adhered patches must control every mode shape, so the piezoelectric patches should be placed at the maxima of all the modal strains. The normalized modal strain can be written as:

$$\bar{\lambda}_k^{i} = \lambda_k^{i} / \max(\phi_i)\qquad(13)$$

Combining (13) with (11), a new objective function can be written as

$$S^{*}(m)=\max_{\beta}\sum_{i=1}^{N}\sum_{k=1}^{M}\big(\beta_k\,\bar{\lambda}_k^{i}\big)\qquad(14)$$

Accordingly, the performance criterion for the sensor locations can be obtained using the mode shapes of the structure.
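The following sketch is for illustration only: it evaluates the placement criterion (13)-(14) for a beam using placeholder mode-shape data (in the paper these data come from the ANSYS modal analysis), and it scores candidate locations by the absolute sum of their normalized modal strains before picking the best m locations.

% Sketch: ranking candidate patch locations by the criterion of (13)-(14).
N  = 3;                          % number of modes considered
M  = 40;                         % candidate patch locations (nodes along the beam)
m  = 2;                          % number of patches to place
dx = 0.01; tp = 1e-3;            % assumed element half-length and patch thickness
phi = rand(M, N);                % placeholder normalized mode shapes (column i = phi_i)
lam = zeros(M, N);               % modal strain measure lambda_k^i (beam: x term of (8) only)
for i = 1:N
    d2 = [0; diff(phi(:,i), 2); 0] / dx^2;     % second difference of mode i at each node
    lam(:,i) = (tp/2) * d2;
    lam(:,i) = lam(:,i) / max(abs(phi(:,i)));  % normalization as in (13)
end
score = sum(abs(lam), 2);        % absolute value used so strains of either sign count (assumption)
[~, order] = sort(score, 'descend');
best = order(1:m)                % the m locations maximizing the criterion (14)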

IV. FUZZY LOGIC CONTROLLER<br />

A. Modeling the piezoelectric structure with the finite element

The piezoelectric material is PVDF (β phase). It is laid on the upper and lower surfaces, acting as the piezoelectric sensor and actuator respectively; the sensor and actuator are symmetrically collocated at the same position of the structure. Considering the coupling effect, the finite element equations can be written as:

$$M\ddot{u}(t) + C\dot{u}(t) + Ku(t) = F + U\qquad(15)$$

where M, K and C are the global mass, stiffness and damping matrices respectively, u(t), u̇(t) and ü(t) are the displacement, velocity and acceleration, F is the external force, and U is the force produced by the piezoelectric patches. In the modeling process, the influence of the force produced by the piezoelectric material is neglected, so the model is approximate. The scheme of active vibration control of the intelligent structure is shown in Fig. 2.

Figure 2. Active vibration control of the piezoelectric structure



B. Modeling the relation of fuzzy logic control

The field of fuzzy systems and control has made great progress, motivated by practical success in industrial process control. Fuzzy systems can be used as closed-loop controllers: the fuzzy system measures the outputs of the process and takes control actions on the process continuously. The fuzzy logic controller uses a quantification of imprecise information (input fuzzy sets) to generate, through an inference scheme based on a knowledge base, the control force to be applied to the system.

The advantage of this quantification is that the fuzzy sets can be represented by unique linguistic expressions such as small, medium and large. The linguistic representation of a fuzzy set is known as a term, and a collection of such terms defines a term-set, or library of fuzzy sets. Fuzzy control converts a linguistic control strategy, usually based on expert knowledge, into an automatic control strategy. Three functions have to be performed by the fuzzy logic controller before it can generate the desired output signals. The first step is to fuzzify each input; this is realized by associating each input with a set of fuzzy variables, and, in order to give the semantics of a fuzzy variable a numerical sense, a membership function is assigned to each variable. The logical controller is made of four main components: (1) fuzzifier; (2) knowledge base containing fuzzy IF-THEN rules and membership functions; (3) fuzzy reasoning; and (4) defuzzifier interface [13, 14].

In this paper, the fuzzy logic controller is designed as a double-input, single-output (DISO) system: the inputs are the displacement and the velocity of the tip of the cantilever beam, and the output is the control force on the cantilever beam. In the fuzzifier, the displacement is defined from -5 to 5 (-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5), the velocity is defined from -2 to 2 (-2, -1, 0, 1, 2), and the control force is defined from -7 to 7 (-7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7). The two types of membership functions commonly adopted in fuzzy logic control are the triangular and trapezoidal shapes, and both types are used here. Compared with other defuzzification methods, the mean-of-maximum (Mom) method was more effective in this application. Accordingly, the fuzzy system is established as follows (a minimal sketch of such a controller is given after this list):

(1) First, the scopes of the displacement and the velocity are taken as the maximal responses obtained under a step input.

(2) The scopes of the displacement and of the control force are divided into NB, NM, NS, ZO, PS, PM and PB; the scope of the velocity is divided into N and P.

(3) According to the fuzzy rules in the sense of L. A. Zadeh [15], the fuzzy inference process is obtained.

Finally, the Mom method is used for the calculation in order to obtain the result.
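For illustration only, the sketch below is a simplified version of such a DISO controller: triangular membership functions on the displacement and velocity universes, the rule table of Table I, and a simplified mean-of-maximum style step that returns the crisp centre of the strongest output label. The membership breakpoints and the use of a product for the rule strength are assumptions, not the authors' exact settings.

% Sketch of the DISO fuzzy controller (simplified).
tri = @(x,a,b,c) max(min((x-a)/(b-a+eps), (c-x)/(c-b+eps)), 0);   % triangular MF
dispCtr = [-5 -3.3 -1.7 0 1.7 3.3 5];      % assumed centres of NB NM NS ZO PS PM PB
velCtr  = [-2 -1 0 1 2];                   % assumed centres of NB NS ZO PS PB
outCtr  = [-7 -4.7 -2.3 0 2.3 4.7 7];      % assumed centres of output NB ... PB
% Rule table from Table I: rows = velocity label, columns = displacement label,
% entries index the output label (1 = NB ... 7 = PB).
R = [4 6 7 4 1 2 4;     % velocity NB
     5 6 4 4 4 2 3;     % velocity NS
     4 4 4 4 4 4 4;     % velocity ZO
     5 6 4 4 4 2 3;     % velocity PS
     4 6 7 4 1 2 4];    % velocity PB
e  = 1.2;  de = -0.6;   % example tip displacement and velocity
muD = arrayfun(@(k) tri(e,  dispCtr(max(k-1,1)), dispCtr(k), dispCtr(min(k+1,7))), 1:7);
muV = arrayfun(@(k) tri(de, velCtr(max(k-1,1)),  velCtr(k),  velCtr(min(k+1,5))), 1:5);
w = muV(:) * muD;                 % firing strength of each rule (product instead of min)
[~, idx] = max(w(:));             % strongest rule
[vi, di] = ind2sub(size(w), idx);
u = outCtr(R(vi, di))             % crisp control force for this input pair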

C. The rule of fuzzy control and the fuzzy controller

The fuzzy rules express the fuzzy relation between the inputs and the output; the inputs and the output are connected through this relationship. In this paper, the displacement of the tip of the cantilever beam is chosen as one input and the velocity as the other. In the traditional method, the inputs are usually the velocity and the rate of change of the velocity; with the present choice, the calculation time is improved.

Figure 3. The basic configuration <strong>of</strong> the fuzzy system<br />

Based on the control rules, the signal is transmitted to the driver. The function of the fuzzy logic controller is to fuzzify the inputs; in other words, the fuzzy controller executes the fuzzification process. The basis of the fuzzy control is the rule database, which is composed of several rules, and the final purpose of the fuzzy logic controller is to realize the fuzzy rules (Fig. 3). The vibration is controlled by minimizing the displacement response of the cantilever beam.

The function of the fuzzy logic controller is to provide a force that suppresses the vibration of the beam. From an analysis of the motion traits of the cantilever beam, the rules are obtained as follows:

(1) If the displacement is PS and the velocity is PB, the tip of the beam is moving up, far from the equilibrium position. So a downward force of NB is needed to bring the tip of the beam close to the reference value.

(2) If the displacement is PB and the velocity is PS, the tip of the beam is moving upward towards the maximum displacement. So a downward force of NS is needed to bring the tip of the beam close to the reference value.

(3) If the displacement is PB and the velocity is NS, the tip of the beam is moving downward, close to the equilibrium position. A downward force of NS is required to bring the tip of the beam close to the reference value.

(4) If the displacement is PS and the velocity is NB, the tip of the beam is moving downward, close to the equilibrium position. A downward force of NB is required to bring the tip of the beam close to the reference value.

(5) If the displacement is NS and the velocity is NB, the tip of the beam is moving down, far from the equilibrium position. An upward force of PB should be added to bring the end of the beam close to the reference value.

(6) If the displacement is NB and the velocity is NS, the tip of the beam is moving downward towards the maximum displacement. An upward force of PS should be added to bring the end of the beam close to the reference value.

(7) If the displacement is NB and the velocity is PS, the tip of the beam is moving upward, close to the equilibrium position. The tip of the beam is brought close to the reference value by adding an upward force of PS.

(8) If the displacement is NS and the velocity is PB, the tip of the beam is moving upward towards the maximum displacement. The tip of the beam is brought close to the reference value by adding an upward force of PB.

The fuzzy IF-THEN rule base is obtained by the author of this study from this analysis, with many trial-and-errors (Table I). The fuzzy IF-THEN rules are the center of the control system. The fuzzy rule base is not invariable; it can be modified in practice. The fuzzy logic control system is shown in Fig. 4.

TABLE I.
FUZZY IF-THEN RULE BASE

velocity \ displacement    NB    NM    NS    ZO    PS    PM    PB
NB                         ZO    PM    PB    ZO    NB    NM    ZO
NS                         PS    PM    ZO    ZO    ZO    NM    NS
ZO                         ZO    ZO    ZO    ZO    ZO    ZO    ZO
PS                         PS    PM    ZO    ZO    ZO    NM    NS
PB                         ZO    PM    PB    ZO    NB    NM    ZO

Figure 4. The system <strong>of</strong> fuzzy logic control<br />

V. PARAMETER SELF-ADJUSTMENT

In the fuzzy logic control system, a large weighting coefficient is used for large errors, while a small weighting coefficient on the rate of change is used for slight errors. The principle of the parameter self-adjustment is to use different self-adjustment factors to implement the fuzzy control rules.

In this paper, an effective means of designing the fuzzy logic controller with the fuzzy logic toolbox of Matlab is introduced, and the parameter self-adjustment is realized by compiling an s-function. The organic combination of Matlab and Simulink makes the design and simulation of the parameter self-adjustment fuzzy logic control system easy and effective; the approach is simple and flexible and can improve the working efficiency of designers. The s-function is used to adjust the parameter because the standard blocks in Matlab are not sufficient. The source program of the s-function is:

function [sys,x0,str,ts] = fpids(t,x,u,flag,m,AH,AL,Escope,ECscope,Uscope)
% Level-1 M-file S-function realizing the parameter self-adjustment factor.
% Inputs: u(1) = error (tip displacement), u(2) = error rate (tip velocity).
ke = m/Escope;                       % quantization gain of the error
kc = m/ECscope;                      % quantization gain of the error rate
ku = Uscope/m;                       % output scaling gain
switch flag
    case 0
        [sys,x0,str,ts] = mdlInitializeSizes;
    case 3
        sys = mdlOutputs(t,x,u,ke,kc,ku,m,AH,AL,Escope,ECscope,Uscope);
    case {1,2,4,9}
        sys = [];
    otherwise
        error(['Unhandled flag = ',num2str(flag)]);
end

function [sys,x0,str,ts] = mdlInitializeSizes
% 0 states, 1 output, 2 inputs, direct feedthrough, one sample time.
% (The printed listing only set sys = [0,0,1,2,0,0,0]; x0, str and ts were
% missing and the feedthrough/sample-time entries are corrected here.)
sys = [0,0,1,2,0,1,1];
x0  = [];
str = [];
ts  = [-1 0];                        % inherited sample time

function sys = mdlOutputs(t,x,u,ke,kc,ku,m,AH,AL,Escope,ECscope,Uscope)
% Saturate the inputs to their universes of discourse.
if u(1) >= Escope,   u(1) = Escope;   end
if u(1) <= -Escope,  u(1) = -Escope;  end    % the '<' branches were lost in extraction
if u(2) >= ECscope,  u(2) = ECscope;  end
if u(2) <= -ECscope, u(2) = -ECscope; end
% Self-adjustment factor: weight the error more heavily when it is large and
% the error rate more heavily when the error is small.  NOTE: the original
% expression did not survive extraction; the two lines below are a plausible
% reconstruction, not the authors' exact formula.
alpha  = AL + (AH - AL)*abs(ke*u(1))/m;
result = ku*(alpha*ke*u(1) + (1 - alpha)*kc*u(2));
% Saturate the control force to its universe.
if result > Uscope,   result = Uscope;   end
if result < -Uscope,  result = -Uscope;  end
sys = result;


The tip position response of the smart structure system with fuzzy logic control is shown in Fig. 11. The fuzzy logic controller is designed to control the nonlinear vibration of a smart structure, and the effects of the controller on the system are examined. A fuzzy logic controller used in such a system with nonlinear vibrations gives a good result for tip position control of a smart structure. The settling time of the system is approximately 9 s, and the displacement of the beam tip is decreased from 3.5 mm to 2 mm, so this method has practical value in suppressing the displacement vibration. According to these results, suitable performance of the fuzzy logic controller is determined for tip position control of a smart structure system. The fuzzy logic controller is established properly and this new controller can be used for this kind of system.

Figure 5. Configurations of the cantilevered plate

Figure 6. The strain quantity of the nodes (modal strain quantity, 0-5, plotted against node number, 1-39)

TABLE II.
THE PARAMETERS OF THE BEAM

Length (mm)   Width (mm)   Thickness (mm)   Density (kg/m3)   Flexibility modulus (Pa)
350           25           4                2900              6.8e+10

TABLE III.
THE PARAMETERS OF THE PIEZOELECTRIC PATCHES

Length (mm)   Width (mm)   Thickness (mm)   Density (kg/m3)   Flexibility modulus (Pa)
35            25           1                8300              3.8e+9

Figure 7. The fuzzy logic control system

Figure 8. Rule viewer

Figure 9. Control surface of Mom



Figure 10. Tip position response <strong>of</strong> the beam<br />

Figure 11. Tip position response <strong>of</strong> the beam with fuzzy logic<br />

controller<br />

According to the result in Fig. 10, the system shows an obvious oscillation phenomenon at the time of 4 s. In order to avoid this phenomenon, we use the parameter self-adjustment compiled with the s-function to improve the control effect, where m is 6, AH is 0.8, AL is 0.2, Escope is 0.005, ECscope is 0.1 and Uscope is 0.008. The result is shown in Fig. 12. The settling time of the system is approximately 8 s, and the displacement of the beam tip is decreased from 0.2 mm to 0.043 mm. In addition, the parameter self-adjustment method basically eliminates the oscillation phenomenon.


Figure 12. Tip position response <strong>of</strong> the beam with fuzzy logic<br />

controller<br />

VII. CONCLUSION

The new fuzzy logic controller designed in this paper improves calculation efficiency and reduces calculation complexity. More importantly, it has an obvious effect in reducing the displacement of the beam tip. Besides that, this paper uses a new method to design a fuzzy logic controller and compiles the parameter self-adjustment factor with an s-function. The simulation results show that this new fuzzy logic control is superior to the ordinary one, and its performance is satisfactory.

ACKNOWLEDGMENT<br />

This work was financially supported by Natural<br />

Science Foundation <strong>of</strong> Hebei Province under Grant<br />

No.E2008000731, and Scientific Research Project <strong>of</strong><br />

Education Department <strong>of</strong> Hebei Province under Grant<br />

No.2006107.<br />

REFERENCES<br />

[1] Bailey, T. and Hubbard, J. E., 1985, Distributed piezoelectric polymer active vibration control of a cantilever beam, AIAA Journal of Guidance, Control and Dynamics, 8: 605-611.
[2] Guo, H. Y., Zhang, L. L. and Zhou, J. X., 2004, Optimal placement of sensors for structural health monitoring using improved genetic algorithms, Smart Materials and Structures, 13: 528-534.
[3] Kögl, M. and Silva, E. C. N., 2004, Topology optimization of smart structures: design of piezoelectric plate and shell actuators, Smart Materials and Structures, 14: 387-399.
[4] Cao, Z., Wen, B. and Meng, G., 2000, Topological optimization of placements of piezoelectric actuator of intelligent structure, Journal of Northeastern University (Natural Science), 21(4): 383-385.
[5] Bayard, D. S., Hadaegh, F. Y. and Meldrum, D. R., 1988, Optimal experiment design for identification of large space structures, Automatica, 24(3): 357-364.
[6] Brown, C. B. and Yao, J. T. P., 1983, "Fuzzy sets and structural engineering", Journal of Structural Engineering, ASCE, No. 109, pp. 1211-1225.
[7] Juang, C. and Elton, D. J., 1986, "Fuzzy logic for estimation of earthquake intensity based on building damage records", Civil Engineering Systems, No. 3, pp. 187-197.
[8] Battaini, M., Casciati, F. and Faravelli, L., 1998, "Fuzzy control of structural vibration. An active mass system driven by a fuzzy controller", Earthquake Engineering and Structural Dynamics, No. 27, pp. 1267-1276.
[9] Symans, M. D. and Kelly, A. S., 1999, "Fuzzy logic control of bridge structures using intelligent semi-active seismic isolation systems", Earthquake Engineering and Structural Dynamics, vol. 28, pp. 37-60.
[10] Park, H., Agarwal, R. and Nho, K., 2000, "Fuzzy logic control of structure vibration of beams", AIAA Aerospace Sciences Meeting and Exhibit, AIAA-0172.
[11] Lee, R. Y. Y., Abdel-Motagaly, K. and Mei, C., 1998, "Fuzzy logic control for nonlinear free vibration of composite plates with piezoactuators", AIAA, pp. 20-23.
[12] Ding, W., 1994, Current main topics on active vibration control, Advances in Mechanics, 24(2): 173-180.
[13] Faravelli, L. and Yao, T., 1996, Use of adaptive networks in fuzzy control of civil structures, Microcomputers in Civil Engineering, 11: 67-76.
[14] Yen, J. and Langari, R., 1998, Fuzzy Logic: Intelligence, Control, and Information, Prentice Hall, Englewood Cliffs, NJ.
[15] Zadeh, L. A., 1965, "Fuzzy sets", Information and Control, vol. 8, pp. 338-353.

Jingjun Zhang, Male, born in Yucheng City, Henan<br />

Province, in 1963. Received his B.Sc. degree in Engineering<br />

Mechanics from Lanzhou University, Lanzhou, China, in 1985,<br />

and the M.Sc. and PhD degrees in Mechanics from Jilin<br />

University, Changchun, China, in 1993 and 1996 respectively.<br />

He is currently a Pr<strong>of</strong>essor in the College <strong>of</strong> Civil<br />

Engineering, Hebei University <strong>of</strong> Engineering, Handan, China.<br />

He is also the Chief <strong>of</strong> Department <strong>of</strong> Science Research in<br />

Hebei University <strong>of</strong> Engineering, China. His research interests<br />

include the area <strong>of</strong> flexible spacecraft structural control using<br />

smart structure technologies and piezoelectric materials. He has<br />

also developed research programs in attitude control <strong>of</strong> flexible<br />

spacecraft, smart structures, and space robotics. Recently, he is<br />

conducting research in spacecraft attitude dynamics, structural<br />

dynamics, spacecraft testing and optimization design. More than 70 of his papers have been published in academic journals at home and abroad.

Pr<strong>of</strong>. Zhang is an Associate Fellow <strong>of</strong> the <strong>Journal</strong> <strong>of</strong> China<br />

Coal Society.<br />

Liya Cao, Female, born in Handan City, Hebei Province,<br />

in 1982. She is currently a graduate student in the College <strong>of</strong><br />

Civil Engineering, Hebei University <strong>of</strong> Engineering, Handan,<br />

China. Her main research interests focus on intelligent structure<br />

and the active vibration control with fuzzy logic.<br />

Weize Yuan, male, was born in Baoding City, Hebei Province, in 1981. He received his B.Sc. degree from the College of Civil Engineering, Hebei University of Engineering, Handan, China, in 2006. He is currently a graduate student in the College of Civil Engineering, Hebei University of Engineering, Handan, China. His main research interests focus on the problem of obtaining the optimal location and size of piezoelectric actuators in smart structures.

Ruizhen Gao, male, was born in Baoding City, Hebei Province, in 1979. He received his B.Sc. degree from the College of Civil Engineering, Hebei University of Engineering, Handan, China, in 2005. He is currently a teacher in the College of Mechanical and Electrical Engineering, Hebei University of Engineering, Handan, China. His main research interests focus on genetic algorithms (GA).

Jingtao Li, male, was born in Shijiazhuang City, Hebei Province, in 1981. He received his B.Sc. degree from the College of Natural Science, Hebei University of Engineering, Handan, China, in 2005. He is currently a graduate student in the College of Civil Engineering, Hebei University of Engineering, Handan, China. His main research interests focus on topological optimization.



Numerical Simulation <strong>of</strong> Snow Drifting Disaster<br />

on Embankment Project<br />

Shouyun Liang, Xiangxian Ma and Haifeng Zhang<br />

Key Laboratory <strong>of</strong> Mechanics on Disaster and Environment in Western China (Lanzhou University) /Ministry <strong>of</strong><br />

Education, Lanzhou, P. R. China<br />

Email: liangsy@lzu.edu.cn, maxxan04@lzu.cn, hai_feng2008@yahoo.com.cn<br />

Abstract—Snow drifting is a typical natural disaster, and increasing importance has been attached to its theoretical research and field observation for the sake of cold-region engineering. Numerical simulation, as an efficient method of studying snow drifting, promises to be widely applied in this field. In this paper, the finite element method is used to simulate the wind velocity field. Taking the upwind shoulder, the downwind shoulder and the middle part of a road as key comparison points, a quantitative analysis is made of the influence of subgrade width, height and slope ratio on the wind velocity over the embankment. The results show that, under certain circumstances, the wind velocity of snow drifting increases according to different function types as the subgrade height and slope ratio increase, with the influence of height greater than that of slope ratio, while the wind velocity decreases as the subgrade width increases. The numerical values agree approximately with the field observations, but the numerical simulation is more sensitive to the various forms of embankment. The numerical results can offer references for engineering construction when regions in which the wind velocity is lower than the threshold velocity are regarded as snowpack areas.

Index Terms—embankment project, snow drifting disaster, numerical simulation, ANSYS

I. INTRODUCTION<br />

Snow drifting is a typical natural disaster. The<br />

frequent occurrence <strong>of</strong> snow drifting has directly brought<br />

serious loss to industrial and agricultural production and<br />

people’s life and property, baffling sight, blocking<br />

traffic, breaking electricity, leading industry and<br />

agriculture to go out <strong>of</strong> production and other hazards [1].<br />

Therefore, an increasing importance has been attached to<br />

its theoretical research and field observation [2]. Sato et al. have put forward turbulence models and other theories based on the Eulerian view, multiphase flow and the Prandtl mixing length; Liston et al. have applied a simple k-ε turbulence model to simulate the snow accumulation process, in which each control unit is filled according to the transport capacity of the snow particles. In the study of road snow drifting disasters, foreign scholars mainly employ CFD to simulate the influence of snow fences on snow drifting


distribution, and by taking account <strong>of</strong> suspension and<br />

saltation analyze the law <strong>of</strong> flow field transformation and<br />

snow distribution [3]. By using FLOW-3D, Thordarson<br />

[4] has simulated distribution and change <strong>of</strong> volume <strong>of</strong><br />

the snow near road flow field and within the unit, and<br />

thereby simulated the movement <strong>of</strong> snow drifting. Xi and<br />

Ying et al. [5-6] have simulated the flow field of snow drifting disasters with FLUENT software by employing bilateral flow theory, and derived the qualitative change of the flow field for different subgrades. Hu et al. [7] have applied ANSYS software to simulate the change of the wind velocity field in order to discuss the impact of snow-sand blockings on subgrade snow, and thereby put forward a sound arrangement and form of snow-sand blockings. Among the aforementioned simulation studies of snow drifting, a qualitative study of the impact of blockings on the flow field is lacking, and the relation between engineering structures and snow drifting disasters has received little consideration. Taking the embankment project as an example, the author of this paper has simulated the variation of its wind velocity field with ANSYS, analyzed in depth each factor of the embankment sections with different models, the quantitative relationship of the wind velocity field, and the combined effects of the factors on the wind velocity field, so as to indirectly reflect the severity of snow drifting disasters.

II. MODEL DESIGN<br />

According to the related theory of aerodynamics, there is a certain correlation between accumulated snow and wind speed; furthermore, snow particles follow the airflow well. The author has therefore drawn on the variation of the wind velocity field to reflect indirectly the change of the snow. A simulation experiment was constructed based on the CFD module of ANSYS, adopting the SIMPLE algorithm, the N-S equations and the k-ε model [8-10]. This conception has been proved feasible by contrasting the wind velocity field observations (Feb. 29th, 2008) of the Jinghe-Yining (JY) railway embankment section (DK176+848), then under construction, with its simulated values (Fig. 1). In addition, the simulation conditions of the embankment sections have been specified by referring to standards concerning railway and civil engineering techniques (Table I). Seeing that wind perpendicular to the road surface influences accumulated snow the most, the author has only considered the condition in which the wind is perpendicular to the



road surface, leaving out the conditions in which the wind crosses the road obliquely or blows parallel to it.

Figure 1. Comparison of simulated and field-measured wind speeds at the JY railway embankment section (DK176+848.555)

TABLE I.
PARAMETERS BASED ON SIMULATION

Condition | Embankment width (m) | Embankment height (m) | Slope ratio | Velocity (m/s)
1 | 9, 12, 15, 18, 24.5 | 4 | 1:1.5 | 9.4
2 | 9 | 2, 4, 8, 12, 13, 14, 15, 16, 17 | 1:1.5 | 9.4
3 | 9 | 4 | 1:1, 1:1.25, 1:1.5, 1:1.75 | 9.4

III. ANALYSES OF SIMULATION RESULTS

A. Impact <strong>of</strong> embankment width on wind velocity field<br />

For subgrades with different widths, the wind velocity fields are similar: in all of them the wind in the area within a certain distance of the foot of the upwind slope is blocked, and its velocity thus declines drastically. The wind velocity reaches its minimum at the foot of the upwind slope; as the airflow subsequently climbs the slope it increases to its maximum at the shoulder, decreases slightly in between, and then increases again at the shoulder of the downwind slope (Fig. 2). The wind velocity in the area one meter from the foot of the downwind slope decreases almost to zero; thereafter a whirlpool forms, the wind velocity fluctuates somewhat, and it then recovers its former speed. Fig. 3 illustrates varied

wind velocity (0.5 m above the surface) <strong>of</strong> road surface<br />

<strong>of</strong> different subgrade width, based on the key points <strong>of</strong><br />

shoulder <strong>of</strong> upwind slope, in between, and shoulder <strong>of</strong><br />

downwind slope. Regression analysis showed that the<br />

wind velocity <strong>of</strong> subgrade represents different function<br />

changes with the increase <strong>of</strong> subgrade width at the three<br />

points:<br />

At the shoulder of the upwind slope:
V = -0.1114w + 16.228 (R² = 0.9903) (1)
In between:
V = 0.0021w² - 0.0906w + 14.134 (R² = 0.9907) (2)
At the shoulder of the downwind slope:
V = 0.0052w² - 0.2199w + 12.261 (R² = 0.9926) (3)
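For readers who wish to reuse these fitted curves, the short Python sketch below simply evaluates Eqs. (1)-(3) for a given subgrade width. The coefficients are taken directly from the equations above; the example widths are illustrative values only.

```python
# Minimal sketch: evaluate the fitted width-velocity regressions (1)-(3).
def v_upwind_shoulder(w):
    return -0.1114 * w + 16.228                       # Eq. (1), linear fit

def v_between(w):
    return 0.0021 * w**2 - 0.0906 * w + 14.134        # Eq. (2), quadratic fit

def v_downwind_shoulder(w):
    return 0.0052 * w**2 - 0.2199 * w + 12.261        # Eq. (3), quadratic fit

# Illustrative widths (m); the simulated cases used 9, 12, 15, 18 and 24.5 m.
for w in (9.0, 12.0, 15.0, 18.0, 24.5):
    print(f"w = {w:5.1f} m: upwind {v_upwind_shoulder(w):.2f}, "
          f"middle {v_between(w):.2f}, downwind {v_downwind_shoulder(w):.2f} m/s")
```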


Figure 2. The contour of velocity with different embankment width (figs. A-D: 9 m, 12 m, 15 m, 24.5 m)



Figure 3. The relationship between changes <strong>of</strong> subgrade width<br />

and wind speed <strong>of</strong> road ( 0.5 m above pavement)<br />

B. Impact <strong>of</strong> embankment height on wind velocity field<br />

For subgrades with different heights, the wind velocity fields are similar, and even resemble those of subgrades with different widths (Fig. 4). Fig. 5 illustrates the varied wind velocity (0.5 m above the surface) of the road surface for different subgrade heights, based on the key points at the shoulder of the upwind slope, in between, and at the shoulder of the downwind slope. Regression analysis showed that the wind velocity first increases and then decreases as the subgrade height increases at the three points. With 15 m as the dividing line, when the height is less than 15 m the wind velocity correlates well with quadratic curves, as follows:

At the shoulder of the upwind slope:
V = -0.0522h² + 1.58127h + 10.52638 (R² = 0.9581) (4)
In between:
V = -0.0439h² + 1.1205h + 7.1518 (R² = 0.9843) (5)
At the shoulder of the downwind slope:
V = -0.042h² + 1.4331h + 9.1942 (R² = 0.9994) (6)

When the height is more than 15 m, the wind velocity begins to decline, first considerably and then gradually, and the correlation of the fitted curves decreases:

V = -0.0886h² + 2.2006h + 7.2217 (R² = 0.7877) (7)
V = -0.0731h² + 1.7991h + 5.2242 (R² = 0.7495) (8)
V = -0.1013h² + 2.5052h + 8.0499 (R² = 0.7913) (9)

C. Impact <strong>of</strong> slope ratio on wind velocity field<br />

For subgrades with different slope ratios, the wind velocity fields are distributed in a manner generally similar to the two situations mentioned above (Fig. 6). Fig. 7 illustrates the wind velocity (0.5 m above the surface) of the road surface for different slope ratios. It can be seen that the wind velocity decreases linearly with the increase of slope ratio at the three points, and the variation of the wind velocity at the shoulder of the upwind slope can be formulated as:

Figure 4. The contour of velocity with different height of road (figs. A-D: 2 m, 8 m, 15 m, 16 m)



Figure 5. The relationship between changes <strong>of</strong> subgrade height<br />

and wind speed <strong>of</strong> road (0.5 m above pavement)<br />

V = -105.89tan³α + 235.44tan²α - 167.33tanα + 52.271 (R² = 1) (10)

D. Comprehensive Analysis<br />

In order to study simultaneously the combined effects of subgrade width, height and slope ratio on the wind velocity field of the road surface, the author has adopted linear regression to calculate the correlation coefficient of each factor with SPSS at a confidence level of 0.95.
At the shoulder of the upwind slope:
V = -0.089w + 1.034h + 3.795tanα + 8.557 (11)
In between:
V = -0.052w + 2.101h + 2.875tanα + 10.007 (12)
At the shoulder of the downwind slope:
V = -0.117w + 3.24h + 4.018tanα + 5.34 (13)

From these formulations we can see that the wind velocity is negatively related to the subgrade width and positively related to the other two factors. Taking all factors into consideration, the impact of subgrade height on the wind velocity field is much more significant than that of the slope ratio.
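As a quick way to apply these combined regression models, the sketch below evaluates Eqs. (11)-(13) for a given embankment geometry. The function name and the example geometry are illustrative, and the conversion of a 1:n slope ratio to tanα = 1/n (reading the ratio as rise:run) is our assumption rather than something stated in the paper.

```python
def wind_velocity(w, h, slope_ratio):
    """Road-surface wind velocity (m/s) from the combined regressions (11)-(13).

    w           -- subgrade width (m)
    h           -- subgrade height (m)
    slope_ratio -- slope expressed as 1:n (e.g. 1.5 for a 1:1.5 slope)
    """
    tan_a = 1.0 / slope_ratio   # assumed interpretation of the 1:n slope ratio
    upwind   = -0.089 * w + 1.034 * h + 3.795 * tan_a + 8.557    # Eq. (11)
    between  = -0.052 * w + 2.101 * h + 2.875 * tan_a + 10.007   # Eq. (12)
    downwind = -0.117 * w + 3.240 * h + 4.018 * tan_a + 5.340    # Eq. (13)
    return upwind, between, downwind

# Illustrative case only: a 9 m wide, 4 m high embankment with a 1:1.5 slope.
print(wind_velocity(w=9.0, h=4.0, slope_ratio=1.5))
```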

IV. CONCLUSION AND DISCUSSION<br />

Though the embankment project serves only as a relatively idealized subgrade model for snow drifting disaster, constructive knowledge can still be gained from the above research with regard to the relation between the subgrade model and snow drifting disaster:

(1) With increasing subgrade width, the wind velocity on the road declines according to different functions: it changes most drastically at the shoulder of the upwind slope, next at the shoulder of the downwind slope, and least in the part between. For this reason, increasing or decreasing the subgrade width is not a significant factor leading to snow disaster on the road. With increasing subgrade height and slope ratio, the wind velocity on the road increases according to different functions; however, the impact of subgrade height on the wind velocity field is far greater than that of the slope ratio, yet when the subgrade height is more than 15 m the wind velocity on the road decreases instead. This dividing line of 15 m coincides with that of h = 15.1 m in Wang's (2001) wind tunnel experiment [1]. Thus, there exists an appropriate subgrade height from the perspective of snow on the road.

Figure 6. The contour of velocity with different slope ratio (figs. A-D: 1:1, 1:1.25, 1:1.5, 1:1.75)



Figure 7. The relationship between changes of slope ratio and wind speed of road (0.5 m above pavement)

(2) The subgrade widths considered in this simulation research were taken from relevant railway and civil engineering standards, so a study of the corresponding wind velocity fields can provide some guidance for engineering practice. Comparison shows that the simulated values agree well with the observations. The simulated values appear more sensitive to changes in the subgrade design parameters, whereas the observations are frequently affected by harsh weather, terrain and other factors, which makes the real system more complicated. If we regard the area in which the wind speed is lower than the threshold (starting) speed in the simulated wind velocity field as the snow accumulation area, this research can offer references for the prevention and control of snow drifting disasters in engineering.

(3) This research is based on only three representative points of a road and the wind speed 0.5 m above the road; therefore, the investigation of wind velocity variation on a road should be carried further by taking data at more points and heights, improving the precision and depth of the research.


ACKNOWLEDGMENT<br />

We are grateful to X. Wang, Y. Yin, X. Chang and Z.<br />

Shao for their technical support in the field experiment.<br />

We thank T. Wang for his valuable discussions and<br />

suggestions. This work was supported by a grant from<br />

China Railway First Survey and Design Institute Group<br />

Ltd.<br />

REFERENCES<br />

[1] Z. Wang, "Research on snow drift and its hazard control engineering in China," Lanzhou University Press, May 2001. (in Chinese)

[2] L. Zhang, “Research on Snowdrifts and Snow Disaster<br />

Simulation Technology <strong>of</strong> road,” Jilin University<br />

Postgraduate Dissertations, Jun. 2005. (in Chinese)<br />

[3] S. Alhajraf, "Computational fluid dynamic modelling of drifting particles at porous fences," Environmental Modelling & Software, vol.19, No.2, Feb. 2004, pp. 163-170.

[4] S. Thordarson, H. Norem, “Simulation <strong>of</strong> two-dimensional<br />

wind flow and snow drifting application for roads: part I,”<br />

Snow Engineering, vol.7, No.4, Jun. 2004, pp. 246-265.<br />

[5] L. Ying, J. Li, X. Zhang, J. Xi, “Study on numerical<br />

simulation for snow fence height,” <strong>Journal</strong> <strong>of</strong> Changchun<br />

University <strong>of</strong> Science and Technology, vol.29, No.3, Sep.<br />

2006, pp. 54-57. (in Chinese)<br />

[6] J. Xi, J. Li, G. Zhu, G. Zhang, "Hydromechanical mechanism of road snow drift deposit and its depth model," Journal of Jilin University (Engineering and Technology Edition), vol.36, No. supp.2, Sep. 2006, pp. 152-156. (in Chinese)

[7] P. Hu, C. Zheng, "The analysis of the wind speed field of the snow protection facilities and sand control facilities," Journal of Chongqing Jiaotong University (Natural Sciences), vol.24, No.3, Jun. 2005, pp. 63-68. (in Chinese)

[8] R. Essery, L. Long, J. Pomeroy, “A distributed model <strong>of</strong><br />

blowing snow over complex terrain,” Hydrological<br />

Processes, vol.13, No.2, Feb. 2004, pp. 2423-2438.<br />

[9] P. Bartelt, M. Lehning, "A physical SNOWPACK model

for the Swiss avalanche warning Part I: Numerical<br />

model,” Cold Regions Science and Technology, vol.35,<br />

No.3, Nov. 2002, pp. 123-145.<br />

[10] J. Judith, M. Lehning, A. Vrouwe, “Field measurements <strong>of</strong><br />

SNOW-DRIFT threshold and mass fluxes, and related<br />

model simulations,” Boundary-Layer Meteorology,<br />

vol.113, No.8, Apr. 2004, pp. 347-368.



The Simulation <strong>of</strong> Extraterrestrial Solar Radiation<br />

Based on SOTER in Zhangpu Sample Plot and<br />

Fujian Province<br />

Zhi-qiang Chen<br />

College <strong>of</strong> Geographical Sciences, Fujian Normal University, Fuzhou, China<br />

soiltuqiang061@163.com<br />

Jian-fei Chen<br />

School <strong>of</strong> Geographical Sciences, Guangzhou University, Guangzhou, China<br />

cjf@gzhu.edu.cn<br />

Abstract—The study establishes DEMs and computer models of the daily extraterrestrial radiation in the Zhangpu sample plot and Fujian province, and of the annual extraterrestrial radiation in Fujian province, using GIS based on SOTER. The results indicate that the daily extraterrestrial radiation is mainly 17-18 MJ/m² in Zhangpu and mainly 14-18 MJ/m² in Fujian province, and that the annual extraterrestrial radiation is mainly 5000-7000 MJ/m² in Fujian province. The influence of topographic features is significant: in general the solar radiation on the ridge is larger than that in the valley, and that on the sunny slope is larger than on the shady slope; the high radiation values appear primarily on the sunny slope, with the ridge as the demarcation line. The method can extract daily and annual extraterrestrial radiation quickly and with high precision at small and moderate scales, without taking cloud into consideration. The orientation of future research is to improve precision with RS images and to combine the results with SOTER.

Index Terms—DEM, GIS, solar radiation model, Zhangpu<br />

sample plot, Fujian province<br />

I. INTRODUCTION<br />

Solar radiation is the main energy source of the physical, biological and chemical processes on the earth's surface and is a necessary parameter in many models [1-3]. In the study of resources and the environment over a regional spatial scope, the impact of local terrain is enormous and must be considered [2]. Because of the complexity of computer models of extraterrestrial solar radiation that consider terrain conditions, the difficulty of acquiring terrain parameters and the lack of appropriate calculation platforms, it has long been neglected or simplified [1, 4]. GIS now provides strong technical support for this work [2, 5]. Solar energy studies and solar heat utilization are usually conducted for clear days, and extraterrestrial solar radiation models have broad scope [6], so the study took


extraterrestrial solar radiation as its object and ArcGIS as its platform, used the Zhangpu sample plot as the small scale and Fujian province as the moderate scale, and calculated the spatial distribution law of solar radiation caused by hypsography under the same incoming solar radiation in the complex topographic area of the subtropical zone, as well as the model's applicability at small and moderate scales. The results can provide important information for the spatial-temporal distribution of solar radiation in small and moderate areas, agricultural regionalization, effective utilization of solar energy and precision agriculture [1].

II. STUDY AREA<br />

The study area is located in the northeast <strong>of</strong> Zhangpu<br />

county in Fujian province at 24°10′ ~ 24°15′N and<br />

117°48′~118°00′E including Futan, Maping, Qianting<br />

and part of the Baizhuhu farm, covering about 8236 hm². The annual mean temperature is about 21℃ and the annual precipitation is about 900 mm, with distinct dry and humid seasons;

coastal plains, platform and hill appear in turn from the<br />

sea to the land; there is no big river with weak runoff

regulation and poor hydrological condition; because <strong>of</strong><br />

the long term artificial activity, the original vegetation<br />

has been destroyed and the secondary vegetation is<br />

dominated by grass with thin forest; the stratigraphies<br />

include Neogene Fotan Group, fluvial or marine or<br />

Aeolian or proluvial deposits <strong>of</strong> Pleistocene and<br />

Holocene in Quaternary, intrusive rock is early<br />

Yanshanian biotite granite, Neogene Fotan Group is<br />

basalt; the major soil types may be divided into the<br />

following six orders: Ferrisols, Vertisols, Camblsols,<br />

Ferralosols, Entisols and Anthrosols [7].<br />

Fujian province lies between 23°32′ and 28°22′ north latitude with an area of 121,400 km². It is located on the southeastern coast of China and is characterized by a subtropical climate, complicated landforms and extensive mountainous and hilly terrain, which accounts for 87.5% of the total area of the province, leaving the remaining 10% for plains. It is always called the "Mountainous



Region <strong>of</strong> Southeast” with complex natural conditions. It<br />

appears middle subtropical climate in the north mountain<br />

and south subtropical climate in the south coast areas.<br />

The annual mean temperature is around 15-21℃ and the<br />

accumulated temperature (≥10℃) is 5500-7500℃, the<br />

annual precipitation is around 1000-2000 mm and<br />

increases gradually from the southeast coast towards the<br />

northwest inland. The predominant natural vegetations<br />

are the south subtropical monsoon rain forest and the<br />

evergreen broad-leaved forest. Fujian has a history <strong>of</strong><br />

over 3000 years and its long term agro-economic<br />

activities have brought about many types <strong>of</strong> land<br />

utilization, among which that <strong>of</strong> forest land accounts for<br />

66.8%, that <strong>of</strong> cultivated land 12.7%, that <strong>of</strong> garden plots,<br />

3.8%, herbage land 0.1%, water areas 4.4%, construction<br />

sites 3.1%, idle land 9.1% [8].<br />

III. METHODS<br />

A. Basic Astronomical Parameters<br />

The method used in this paper comes from reference [9]. Some basic parameters should be established when

calculating solar radiation somewhere on the earth<br />

including solar altitudinal angle, solar aspect angle, solar<br />

declination, sunrise & sunset time and so on [9, 10].<br />

Eo = 1.000109 + 0.03349cosθ + 0.001472sinθ + 0.000768cos2θ + 0.000079sin2θ (1)

Where Eo is the earth orbit correction factor, θ is day<br />

angle, θ = 2πt / 365.2422, t = N-No, N is number <strong>of</strong> day,<br />

that is date's sequence number in a year, No = 79.6764 +<br />

0.2422 * (year-1985) - INT((year - 1985)/4).<br />

Declination is the angle distance from the celestial<br />

equator to the sun along right ascension circle in<br />

equatorial coordinate system, it is positive when the sun is<br />

in the north <strong>of</strong> the equator and on the contrary it is<br />

negative, ranging from 0 to ±23.44°. The hour angle describes

the sun’s movement in 24h, it is 0 at noon by local real<br />

solar time, positive in the morning and negative in the<br />

afternoon, 15 degrees per hour. Their calculation formulae<br />

respectively are:<br />

δ = 0.006894 - 0.399512cosθ + 0.072075sinθ -<br />

0.006799cos2θ + 0.00089sin2θ (2)<br />

ω = (So + Fo/60 - 12) * 15° (3)<br />

ω1 = arccos(-tanδ * tanψ) (4)

ω2 = - ω1 (5)<br />

Where δ is declination, ω is hour angle, So and Fo are<br />

hour and minute by real solar time, ω1 is sunset hour<br />

angle, ω2 is sunrise hour angle.<br />

Solar altitudinal angle is the angle between solar ray<br />

and horizontal surface, solar aspect angle is the angle<br />

between projection <strong>of</strong> solar ray on horizontal surface and<br />

local meridian, due north is 0 degree. They are influenced<br />

by latitude, declination and hour angle. The calculation<br />

formulae are:<br />

H = arcsin(sinδsinψ + cosδcosψcosω) (6)<br />

A = arccos[(sinH·sinψ - sinδ)/(cosH·cosψ)] (7)


Where H is solar altitudinal angle, A is solar aspect<br />

angle, ψ is latitude.<br />
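For readers who want to reproduce these quantities, the Python sketch below implements Eqs. (1)-(7) exactly as printed, including the paper's empirical coefficients; the clamping of arccos arguments, the function names and the example inputs are our own additions rather than part of the paper.

```python
import math

def day_angle(n, year):
    """Day angle θ (radians) from day number n and year, as defined in Sec. III-A."""
    n0 = 79.6764 + 0.2422 * (year - 1985) - int((year - 1985) / 4)
    return 2.0 * math.pi * (n - n0) / 365.2422

def orbit_correction(theta):
    """Earth orbit correction factor Eo, Eq. (1)."""
    return (1.000109 + 0.03349 * math.cos(theta) + 0.001472 * math.sin(theta)
            + 0.000768 * math.cos(2 * theta) + 0.000079 * math.sin(2 * theta))

def declination(theta):
    """Solar declination δ (radians), Eq. (2), as printed in the paper."""
    return (0.006894 - 0.399512 * math.cos(theta) + 0.072075 * math.sin(theta)
            - 0.006799 * math.cos(2 * theta) + 0.00089 * math.sin(2 * theta))

def sunset_sunrise_hour_angles(delta, lat):
    """Sunset (ω1) and sunrise (ω2) hour angles in radians, Eqs. (4)-(5)."""
    c = max(-1.0, min(1.0, -math.tan(delta) * math.tan(lat)))  # clamp for safety
    w1 = math.acos(c)
    return w1, -w1

def sun_position(delta, lat, omega):
    """Solar altitudinal angle H and solar aspect angle A (radians), Eqs. (6)-(7)."""
    h = math.asin(math.sin(delta) * math.sin(lat)
                  + math.cos(delta) * math.cos(lat) * math.cos(omega))
    c = (math.sin(h) * math.sin(lat) - math.sin(delta)) / (math.cos(h) * math.cos(lat))
    a = math.acos(max(-1.0, min(1.0, c)))                      # clamp for safety
    return h, a

# Example: day number 81 of 2008 at latitude 24.2° N, one hour after solar noon.
theta = day_angle(81, 2008)
delta = declination(theta)
omega = math.radians(-15.0)   # 15° per hour, afternoon negative per the text
print(orbit_correction(theta),
      sunset_sunrise_hour_angles(delta, math.radians(24.2)),
      sun_position(delta, math.radians(24.2), omega))
```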

B. Terrain Feature Calculation<br />

When solar altitudinal angle and solar aspect angle are<br />

decided, slope, aspect and terrain shield are the important<br />

factors calculating extraterrestrial solar radiation<br />

especially in mountain area. Terrain shield means that:<br />

comparing the biggest horizon angle and corresponding<br />

solar altitudinal angle within a certain solar aspect angle<br />

range, if the former is bigger than solar altitudinal angle,<br />

solar radiation is 0 in shadow range or else solar radiation<br />

should be calculated. When GIS was not mature, slope, aspect and terrain shield were calculated by manual methods or complex algorithms. With the development of GIS technology, most GIS software now provides terrain-feature extraction functions, such as the slope, aspect and hillshade derivation tools (e.g., Slope, Aspect and Hillshade in ArcGIS), which contribute to the calculation of the influence of terrain on solar radiation [1].
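The terrain-shield test described above can be expressed compactly. The sketch below is illustrative only: it assumes the maximum horizon angle for each azimuth sector has already been extracted from the DEM, and the function name, sector width and example horizon profile are our own assumptions.

```python
def is_shaded(sun_altitude, sun_aspect, horizon_angles, sector_width=10.0):
    """Return True if the sun is blocked by terrain.

    sun_altitude, sun_aspect -- solar altitudinal and aspect angles (degrees)
    horizon_angles -- dict mapping the start of each azimuth sector (degrees,
                      0 = due north) to the biggest horizon angle in that sector
    sector_width   -- angular width of each azimuth sector (degrees)
    """
    sector = int(sun_aspect // sector_width) * sector_width % 360
    # Shaded if the biggest horizon angle in this sector exceeds the sun altitude.
    return horizon_angles.get(sector, 0.0) > sun_altitude

# Illustrative horizon profile: a 25-degree ridge toward the east (90-110 degrees).
horizon = {90.0: 25.0, 100.0: 22.0}
print(is_shaded(sun_altitude=18.0, sun_aspect=95.0, horizon_angles=horizon))  # True
print(is_shaded(sun_altitude=30.0, sun_aspect=95.0, horizon_angles=horizon))  # False
```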

C. Extraterrestrial Solar Radiation Model<br />

The solar radiation received somewhere on the earth is<br />

influenced by many factors such as terrain (elevation,<br />

slope, aspect, shadowing), soil, vegetation and weather<br />

conditions (atmospheric turbidity, cloud amount) and so<br />

on. Because <strong>of</strong> objective conditions, slope, aspect and<br />

elevation only were considered in the paper. The actual<br />

solar radiation can be calculated using revised formulae if<br />

there are more detailed ground data and weather<br />

conditions data [2]. The extraterrestrial solar radiation<br />

formula is:<br />

W = IoTEo / 2π * Σ (j = 1 to n) [µsinδ(ωr,j - ωs,j) + νcosδ(sinωr,j - sinωs,j) - sinαsinβcosδ(cosωr,j - cosωs,j)] (8)

Where Io is solar constant, T is day length, n is azimuth<br />

division number, empirical value is 36, ωr,j and ωs,j are<br />

sunrise and sunset hour angles at every differential period<br />

respectively, α is slope, β is aspect.<br />

µ = sinψcosα - cosψsinαcosβ (9)<br />

ν = sinψsinαcosβ + cosψcosα (10)<br />

Annually extraterrestrial radiation can be obtained by<br />

adding daily extraterrestrial radiation in a year.<br />
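A compact sketch of the radiation model of Eqs. (8)-(10) follows. It assumes the per-sector sunrise and sunset hour angles have already been obtained from the horizon analysis of Sec. III-B, and it adopts SI units (solar constant in W/m², day length in seconds) so that the result is in J/m²; these unit choices and the function name are assumptions, as the paper does not state them.

```python
import math

def daily_extraterrestrial_radiation(lat, slope, aspect, delta, e0,
                                     omega_r, omega_s,
                                     i0=1367.0, t=86400.0):
    """Daily extraterrestrial radiation W on a tilted surface, Eqs. (8)-(10).

    lat, slope, aspect, delta -- ψ, α, β and δ in radians
    e0               -- earth orbit correction factor Eo
    omega_r, omega_s -- per-azimuth-sector hour angles ω_r,j and ω_s,j (radians);
                        Eq. (8) is evaluated literally, so pass each pair so that
                        (ω_r,j - ω_s,j) spans the sunlit interval with positive sign
    i0, t            -- solar constant (W/m²) and day length (s); assumed units
    """
    mu = math.sin(lat) * math.cos(slope) - math.cos(lat) * math.sin(slope) * math.cos(aspect)  # Eq. (9)
    nu = math.sin(lat) * math.sin(slope) * math.cos(aspect) + math.cos(lat) * math.cos(slope)  # Eq. (10)
    s = 0.0
    for wr, ws in zip(omega_r, omega_s):
        s += (mu * math.sin(delta) * (wr - ws)
              + nu * math.cos(delta) * (math.sin(wr) - math.sin(ws))
              - math.sin(slope) * math.sin(aspect) * math.cos(delta)
                * (math.cos(wr) - math.cos(ws)))
    return i0 * t * e0 / (2.0 * math.pi) * s   # J/m² under the assumed units

# Illustrative call: one unshaded sector covering the whole sunlit day on flat ground.
w1 = math.acos(-math.tan(0.0) * math.tan(math.radians(24.2)))
print(daily_extraterrestrial_radiation(math.radians(24.2), 0.0, 0.0, 0.0, 1.0,
                                        omega_r=[w1], omega_s=[-w1]) / 1e6, "MJ/m²")
```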

IV. EXTRATERRESTRIAL SOLAR RADIATION SIMULATIONS

DEM is the basis of the basic terrain feature parameters such as elevation, slope and aspect. The scanned 1:10000 contour maps of Zhangpu and the 1:250000 contour maps of Fujian province used to generate the DEMs come from the Soil and Terrain digital database (SOTER). SOTER is a cross-disciplinary database that can better manage attribute databases, spatial databases and models [11]. SOTER is one of the

global databases [12]. The International Society <strong>of</strong> Soil<br />

Science (ISSS) has developed a methodology for SOTER<br />

since 1986 whose main objective is to establish a<br />

computerized database storing attributes on topography,



soils, climate, vegetation and land use/cover, which is<br />

linked to a Geographic Information System (GIS),<br />

whereby each type <strong>of</strong> information or combination <strong>of</strong><br />

attributes can be displayed as a separate map or overlay,<br />

or in tabular form. The database can be used for improved<br />

mapping and monitoring <strong>of</strong> changes in world soils and<br />

terrain resources, and for the development <strong>of</strong> an<br />

information system capable <strong>of</strong> delivering accurate, useful<br />

and timely information to decision-makers and<br />

policy-makers [13]. The SOTER has been carried out in<br />

many countries and areas including China, Argentina,<br />

Brazil, Cuba, Mexico, Uruguay, Venezuela, Hungary in<br />

many fields such as soil erosion, land evaluation, soil<br />

fertility, food security and so on with powerful function<br />

and advantages [14-18].<br />

The scanned maps of Zhangpu and Fujian province, whose contour distances are 5 m and 50 m respectively, were digitized. The DEM resolution should be set when generating a DEM from contour maps. Based on Z.L. Li's study, the

relationship between DEM resolution and contour<br />

distance on contour maps is:<br />

D = K * Ci * cosα (11)<br />

Where Ci is contour distance, α is average slope, K is<br />

constant (it is between 1.5 and 2.0 taking terrain feature<br />

into account otherwise it is between 1.0 and 1.5), D is<br />

DEM resolution. Because terrain feature is <strong>of</strong>ten ignored<br />

in DEM application, according to the formula above,<br />

DEM accuracy is equal to the one <strong>of</strong> 1:10000 contour map<br />

if DEM resolution is between 7 m and 42 m. The DEM<br />

resolutions in Zhangpu and in Fujian province were set to<br />

15m and 100m respectively [10]. The digital slope model<br />

and digital aspect model can be acquired after DEM<br />

establishment, at the same time the following indicators<br />

can be calculated: sunrise and sunset angles, discrete<br />

number <strong>of</strong> available sunshine angle, sunrise and sunset<br />

hour angles and the ones at every differential period,<br />

corresponding solar altitudinal angle and solar aspect<br />

angle. The extraterrestrial solar radiation can be derived<br />

by substituting all these indicators into the formulae in<br />

Zhangpu sample plot and Fujian province.<br />
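As a quick illustration of the resolution rule in Eq. (11), the snippet below reproduces the calculation described above; the average-slope value is an illustrative assumption, not a figure from the paper.

```python
import math

def dem_resolution(contour_distance, avg_slope_deg, k=1.5):
    """DEM resolution D from Eq. (11): D = K * Ci * cos(alpha)."""
    return k * contour_distance * math.cos(math.radians(avg_slope_deg))

# 5 m contour distance (Zhangpu, 1:10000) and 50 m (Fujian, 1:250000),
# with an assumed average slope of 10 degrees and K between 1.0 and 2.0.
print(dem_resolution(5.0, 10.0, k=1.5), dem_resolution(50.0, 10.0, k=2.0))
```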


The simulation date was March 21, 2008, which was the spring equinox. The sun's meridian altitude was h = 90° - ψ + δ at

noon, sunshine time was about 12 hours, the daily<br />

extraterrestrial solar radiation was calculated using DEM<br />

according to the formula, that was the digital solar<br />

radiation grid model corresponding to DEM. The daily<br />

extraterrestrial solar radiation is mainly 17-18 MJ/m² in Zhangpu (Fig 1 and Fig 2) and 16-18 MJ/m² in Fujian province (Fig 3 and Fig 5). The simulation year was 2008, and the annual extraterrestrial solar radiation was calculated, which is mainly 5000-6000 MJ/m² and 6000-7000 MJ/m², with a minimum of 1101.4 MJ/m² and a maximum of 6366.6 MJ/m², in Fujian province (Fig 4 and Fig 6).

There are significant differences in the spatial distribution of the daily extraterrestrial solar radiation in the Zhangpu sample plot and Fujian province, and of the annual extraterrestrial solar radiation in Fujian province, with strong spatial autocorrelation arising from the different geographical locations. In general the solar radiation value on the ridge is larger than that in the valley, and that on the sunny slope is larger than on the shady slope. The high radiation values appear primarily on the sunny slope, with the ridge as the demarcation line.

Figure 1. Spatial distribution of daily extraterrestrial solar radiation in Zhangpu.

Figure 2. Statistic of daily extraterrestrial solar radiation in Zhangpu (x-axis: daily extraterrestrial solar radiation, MJ/m²; y-axis: percentage of the total area, %).



Figure 3. Spatial distribution of daily extraterrestrial radiation in Fujian province.

Figure 4. Spatial distribution of annual extraterrestrial radiation in Fujian province.

Figure 5. Statistic of daily extraterrestrial solar radiation in Fujian province (x-axis: daily extraterrestrial solar radiation, MJ/m²; y-axis: percentage of the total area, %).

Figure 6. Statistic of annual extraterrestrial radiation in Fujian province (x-axis: annual extraterrestrial solar radiation, MJ/m²; y-axis: percentage of the total area, %).



V. COMBINATION OF EXTRATERRESTRIAL SOLAR<br />

RADIATION AND SOTER<br />

Nowadays, with the development of science and technology, the fast-growing economy, and accelerating industrialization and urbanization, the difficulties encountered by researchers are how to manage, store and standardize various data efficiently, how to mine valuable data to the greatest extent, and how to share data resources. Solar radiation data alone cannot meet the needs of users as science and technology develop, and one of the ways forward is combination with SOTER.

SOTER formulates a systematic and standardized suite of specifications for the database, which includes not only databases on topography and soil, but also climate, land use/cover, vegetation and auxiliary reference-file databases. Underlying the SOTER

methodology is the identification <strong>of</strong> areas <strong>of</strong> land with<br />

distinctive, <strong>of</strong>ten repetitive, pattern <strong>of</strong> landform,<br />

lithology, surface form, slope, parent material, and soils.<br />

The major differentiating criteria are applied in a<br />

step-by-step manner, each step leading to a closer<br />

identification <strong>of</strong> the land area under consideration. In this<br />

way a SOTER unit can be defined progressively into<br />

terrain, terrain component and soil component (Fig 7).<br />

Tracts <strong>of</strong> land distinguished in this manner are named<br />

SOTER units. Each SOTER unit represents one unique<br />

combination <strong>of</strong> terrain and soil characteristics [19].<br />

Figure 7. SOTER unit structure, showing the terrain, terrain component, soil component, terrain component data, profile and horizon entities (1:M = one-to-many, M:1 = many-to-one relations).

There are two types <strong>of</strong> data in a SOTER database:<br />

geometric data and attribute data. The geometric<br />

component indicates the location and topology <strong>of</strong> SOTER<br />

units, while the attribute part describes the non-spatial<br />

characteristics. The geometric data is stored and handled<br />

by GIS s<strong>of</strong>tware, while the attribute data is stored<br />

separately in a set <strong>of</strong> files, managed by a relational<br />

database system. There are 6 attribute files and 118<br />

attributes. A unique code is set up in both the geometric<br />

and attribute databases to link these two types <strong>of</strong><br />

information for each SOTER unit [19].<br />
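To make the geometric/attribute split concrete, the following Python sketch models a SOTER unit and its attribute hierarchy with simple dataclasses; the class and field names are illustrative and are not the actual SOTER attribute codes or file layout.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SoilComponent:
    soil_type: str                                        # e.g. one of the soil orders
    profiles: List[str] = field(default_factory=list)     # profile/horizon descriptions

@dataclass
class TerrainComponent:
    slope: float                                          # terrain component attributes
    aspect: float
    soil_components: List[SoilComponent] = field(default_factory=list)   # 1:M

@dataclass
class SoterUnit:
    unit_code: str                                        # unique code linking geometry
    terrain_components: List[TerrainComponent] = field(default_factory=list)  # 1:M

# The unit_code would also appear in the GIS layer, tying each polygon (geometric
# data) to this attribute record (attribute data).
unit = SoterUnit("FJ-001", [TerrainComponent(12.0, 180.0, [SoilComponent("Ferrisols")])])
print(unit.unit_code, len(unit.terrain_components))
```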

The climate data were based on point observations<br />

only and the link with the soils and terrain information<br />


exists by means <strong>of</strong> the geographical locations <strong>of</strong> these<br />

points. Relating to 72 climate stations in Fujian province,<br />

some climate data were interpolated using ArcGIS such<br />

as average temperature, accumulated temperature,<br />

monthly precipitation, minimum and maximum<br />

temperatures, relative humidity and evaporation besides<br />

radiation.<br />

VI. DISCUSSIONS<br />

The study integrated solar radiation model and DEM<br />

based on GIS, took Zhangpu sample plot in the southeast<br />

coastal hilly and platform area <strong>of</strong> Fujian province as small<br />

scale and Fujian province as moderate scale, gave full<br />

consideration to the integration <strong>of</strong> all kinds <strong>of</strong> terrain<br />

factors, explored the effects <strong>of</strong> slope, aspect and so on to<br />

the spatial distribution diversity <strong>of</strong> extraterrestrial solar<br />

radiation at two scales. The method can extract daily and annual extraterrestrial solar radiation quickly and with high precision at small and moderate scales, but there is a certain error in the model, which mainly comes from two aspects. On the one hand, cloud was not taken into consideration, while cloud is one of the most important parameters of solar radiation energy transmission, so there is a certain deviation between the calculated results and the actual solar radiation values [20, 21]. On the other hand, the DEM is the key data in the whole model, and its quality (such as DEM accuracy, resolution, etc.) determines the final simulation results [1].

The methods that take the cloud effect into consideration vary among solar radiation models, and they seldom contain spatial characteristics because there are no conventional observation data of the spatial distribution of cloud; even if such observation data exist, it is hard to simulate the geometric relation among the sun, cloud and ground. Only by using RS can the effects of cloud be taken into consideration in a spatial model of solar radiation [22].

The study is going to use ASTER images to extract cloud<br />

amount and ground state for improving simulation<br />

precision and meeting the needs <strong>of</strong> general solar energy<br />

engineering and science research.<br />

Solar radiation data alone cannot meet the needs of users as science and technology develop.

The SOTER database contains more sufficient data and<br />

can be used for a wide range <strong>of</strong> applications at different<br />

scales. At the same time, only combining with SOTER,<br />

can the climate data develop deeply and widely and the<br />

superiority <strong>of</strong> simulation <strong>of</strong> extraterrestrial solar radiation<br />

based on DEM can be shown. The approach may be used<br />

to support strategic decision-making seeking to optimize<br />

land use/cover, prioritize research, and guide<br />

conservation planning.<br />

ACKNOWLEDGMENT<br />

This work was supported in part by a grant from<br />

foundations: National Natural Science Foundation <strong>of</strong><br />

China, No. 40371054; The Key Project <strong>of</strong> the Ministry <strong>of</strong><br />

Education <strong>of</strong> China, No. 206107; Project <strong>of</strong> Science and<br />

Technology Department <strong>of</strong> Fujian Province for the



Youth, No. 2006F3037; Excellent Young Core Teachers<br />

Foundation <strong>of</strong> Fujian Normal University, No.<br />

2008100233<br />

REFERENCES<br />

[1] H.L He, G.R. Yi, and D. N. “Method <strong>of</strong> global solar<br />

radiation calculation on complex territories,” Resources<br />

Science. China science press. Beijing, vol.25, pp. 78-85,<br />

2003.<br />

[2] Z.W. De, S.X. Ni, H.L. Zhou, H.L. Zhang, and Q.P. Tu.<br />

“Calculation <strong>of</strong> possible direct solar radiation around<br />

Qinghai lake based on DEM,” Plateau Meteorology. Cold<br />

and arid regions environmental and engineering research<br />

institute <strong>of</strong> Chinese academy <strong>of</strong> sciences. Lanzhou, vol.22,<br />

pp. 92-96, 2003.<br />

[3] C. Wloczyk, R. Richter. “Estimation <strong>of</strong> incident solar<br />

radiation on the ground from multispectral satellite sensor<br />

imagery,” International <strong>Journal</strong> <strong>of</strong> Remote Sensing. Taylor<br />

& Francis. London UK, vol.27, pp. 1253, 2006.<br />

[4] K S. Reddy, R. Manish. “Solar resource estimation using<br />

artificial neural networks and comparison with other<br />

correlation models,” Energy Conversion and Management.<br />

Pergamon-Elsevier Science Ltd. Oxford UK, vol. 44, pp.

2519-2530, 2003.<br />

[5] J.G. Corripio. “Vectorial algebra algorithms for calculating<br />

terrain parameters from DEMs and solar radiation<br />

modeling in mountainous terrain,” Geographical<br />

Information Science. Taylor & Francis. London UK, vol.<br />

17, pp. 1-23, 2003.<br />

[6] G.Q. Qiu, Y.J. Xia, and H.Y. Yang. “An optimized<br />

clear-day solar radiation model,” Acta Energiae Solaris<br />

Sinica. Chinese institute <strong>of</strong> renewable energy. Beijing,<br />

vol.25, pp. 456-460, 2001.<br />

[7] Z.Q. Chen. “The study and application <strong>of</strong> the regional soil<br />

and terrain digital database(SOTER),” Ms D Thesis. Fujian<br />

Normal University. Fuzhou, 2003.<br />

[8] H.J. Zhu. Sustainable development and land utilization, 1st ed., China Hong Kong Yearbook Press: Hong Kong, 1997.

pp. 41-46.<br />

[9] G.A. Tang, X. Yang, Q. Zhang, and F.D. Deng. “DEM<br />

based simulation on regional solar radiation,” Science<br />

<strong>Journal</strong> <strong>of</strong> Northwest University Online. Northwest<br />

university press. Xian, vol.1, pp. 208-214, 2003.<br />

[10] J.Y. Gong. On theories and technology of contemporary GIS, 1st ed., Wuhan University of Science and Technology

Press: Wuhan, 1999, pp. 197-199.<br />

[11] Z.Q. Chen, J.F. Chen. “Assessment <strong>of</strong> the relative<br />

sensitivity <strong>of</strong> ecosystem to acid deposition in Fujian<br />

Province,” <strong>Journal</strong> <strong>of</strong> Fujian Normal University(Natural<br />

Science Edition). Fujian normal university. Fuzhou, vol.23,<br />

pp. 83-86, 2007.<br />

[12] Z.Q. Chen, J.F. Chen, and X.Q. Xie. “Suitable sites<br />

selection for Euphoria Longana based on Fujian SOTER<br />

database and GIS,” Acta Pedologica Sinica. soil science<br />

society <strong>of</strong> China. Nanjing, vol.42, pp. 700-703, 2005.<br />


[13] Y.G. ZHAO, G.L. Zhang, and Z.T. Gong. “SOTER-based<br />

soil water erosion simulation in Hainan island,”<br />

Pedosphere. Science press. Beijing, vol.13, pp. 139-146,<br />

2003.<br />

[14] Z.H. Yi, D.H. Xiong, Z. Yang, Y.R. He, and Y.Y. Zeng.<br />

“Natural productivity evaluation <strong>of</strong> cultivated land based<br />

on SOTER database in the typical region <strong>of</strong> upper reaches<br />

<strong>of</strong> the Yangtse river,” Chinese <strong>Journal</strong> <strong>of</strong> Soil Science. Soil<br />

Science Society <strong>of</strong> China. Shenyang, vol.36, pp. 145-148,<br />

2005.<br />

[15] C.W. Lv, S.Q. Cui, and L. Zhao. “Soil organic carbon<br />

storage and its spatial distribution characteristics in Hainan<br />

Island: a study based on HNSOTER,” Chinese <strong>Journal</strong> <strong>of</strong><br />

Applied Ecology. Science press. Shenyang, vol.17, pp.<br />

1014-1018, 2006.<br />

[16] Y.G. Zhao, G.L. Zhang, and H. Zhang. “Systematic<br />

assessment and regional features <strong>of</strong> soil quality in the<br />

Hainan Island,” Chinese <strong>Journal</strong> <strong>of</strong> Eco-Agriculture. China<br />

science press. Shijiazhuang, vol.12, pp. 13-15, 2004.<br />

[17] M.X. Men, J. Chen, Z.R. Yu and H. Xu. “Assessment <strong>of</strong><br />

soil erosion based on SOTER in Hebei Province,” Chinese<br />

Agricultural Science Bulletin. Chinese agricultural science<br />

bulletin press. Beijing, vol.23, pp. 587-592. 2007.<br />

[18] X.L. Zhang, G.L. Zhang, and Z.T. Gong. "Evaluation for some tropical crops in Hainan Province by using ALES based upon HaiSOTER," Scientia Geographica Sinica.

Changchun publishing house. Changchun, vol.21, pp.<br />

344-349, 2001.<br />

[19] H.Z. Zhou. “A testing <strong>of</strong> SOTER project in China,”<br />

Pedosphere. Science press. Beijing, vol.36, pp.<br />

153-160,1993.<br />

[20] J.G. Sun, J. Zhao, J.G. Zhen, and D.P. Li. “The study <strong>of</strong><br />

computer model and application <strong>of</strong> solar radiation value in<br />

yellow earth will ravine region based on DTM,”<br />

Engineering <strong>of</strong> Surveying and Mapping. Engineering<br />

college <strong>of</strong> heilongjiang. Heilongjiang, vol.12, pp. 28-30,<br />

2003.<br />

[21] K.A. Joudi, N.S. Dhaidan. “Application <strong>of</strong> solar assisted<br />

Heating and Desiccant Cooling Systems for a Domestic

building,” Energy Conversion & Management.<br />

Pergamon-Elsevier Science Ltd. Oxford UK, vol.12, pp.

995-1022, 2001.<br />

[22] X. Li, G.D. Cheng, X.Z. Chen, and L. Lu. “Improvement<br />

<strong>of</strong> solar radiation model in arbitrary topography,” Chinese<br />

Science Bulletin. China science press. Beijing, vol.44, pp.<br />

993-998, 1999.<br />

Zhi-qiang Chen was born in Fujian, China, in 1978. He holds a Ph.D. and specializes in GIS, land resources and land use planning. E-mail: soiltuqiang061@163.com

Jian-fei Chen. E-mail: gdcjf@21cn.net



Multiplicate Particle Swarm Optimization<br />

Algorithm<br />

Shang Gao and Zaiyue Zhang<br />

School <strong>of</strong> Computer Science and Engineering, Jiangsu University <strong>of</strong> Science and Technology, Zhenjiang 212003, China<br />

Email: gao_shang@hotmail.com yzzjzzy@sina.com<br />

Cungen Cao<br />

Institute <strong>of</strong> Computing Technology, Chinese <strong>Academy</strong> <strong>of</strong> Sciences, Beijing 100080, China<br />

Email: cgcao@ict.ac.cn<br />

Abstract—When Particle Swarm Optimization is used to handle complex high-dimensional functions, it suffers from low convergence speed and sensitivity to local convergence. The convergence of the particle swarm algorithm is studied, and a condition for its convergence is given; results of numerical tests show the effectiveness of this condition. Based on the idea of specialization and cooperation in the particle swarm optimization algorithm, a multiplicate particle swarm optimization algorithm is proposed. In the new algorithm, particles use five different hybrid flight rules in accordance with a selection probability, so that the different rules can draw on each other's merits and improve together. The method uses not only local information but also global information, and combines local search with global search to improve convergence. The efficiency of the new algorithm is verified by simulation results on five classical test functions and by comparison with other algorithms. The optimal selection probability is obtained through extensive experiments with different selection probabilities in the algorithms.

Index Terms—particle swarm optimization algorithm,<br />

convergence, parameter<br />

I. INTRODUCTION<br />

Particle swarm optimization (PSO) is one <strong>of</strong> the<br />

evolutionary computational techniques. Since its<br />

introduction (Eberhart R C , Kennedy J.1995; Kennedy J,<br />

Eberhart R.1995; Shi Y H , Eberhart R C.1998)[1,2,3,4],<br />

PSO has attracted much attention from researchers<br />

around the world. It is a population-based search<br />

algorithm and is initialized with a population <strong>of</strong> random<br />

solutions, called particles. Each particle in PSO moves<br />

over the search space at velocity dynamically adjusted<br />

according to the historical behaviors <strong>of</strong> the particle and its<br />

companions. When Particle Swarm Optimization is used to handle complex high-dimensional functions, it suffers from low convergence speed and sensitivity to local convergence. Although PSO is a fairly recent algorithm, it is quickly gaining momentum in the


engineering research community. Several researchers<br />

have analyzed the performance <strong>of</strong> the PSO with different<br />

settings, e.g., neighborhood settings, cluster analysis, etc.<br />

It has been used in approaches across a wide range of applications. For example, Fourie and

Groenwold applied the algorithm to structural shape and<br />

sizing [5] and topology optimization [6] problems. The<br />

authors applied the algorithm to a cantilevered beam [7]<br />

and multi-disciplinary design optimization <strong>of</strong> a transport<br />

aircraft wing [8]. The University <strong>of</strong> Florida has applied<br />

the algorithm to biomechanical system identification<br />

problems e.g. [9]. Some authors have worked on parallel<br />

PSO algorithms in an attempt to alleviate the associated<br />

high computational cost. Most parallel implementations<br />

<strong>of</strong> the PSO algorithm presented to date are based on a<br />

synchronous implementation e.g. [9], where all design<br />

points within a design iteration are evaluated, before the<br />

next design iteration is started. Particle swarm<br />

optimization can be and has been used across a wide<br />

range <strong>of</strong> applications. Areas where PSOs have shown<br />

particular promise include multimodal problems and<br />

problems for which there is no specialized method<br />

available or all specialized methods give unsatisfactory<br />

results. PSO applications are so numerous and diverse<br />

that a whole paper would be necessary just to review a<br />

subset <strong>of</strong> the most paradigmatic ones. Here we only have<br />

a limited space to devote to this topic. So, we will limit<br />

ourselves to listing the main application areas where<br />

PSOs have been successfully deployed. Base on the idea<br />

<strong>of</strong> specialization and cooperation <strong>of</strong> particle swarm<br />

optimization algorithm, a multiplicate particle swarm<br />

optimization algorithm is proposed.<br />

II. THE BASIC PSO ALGORITHM

In the PSO algorithm, the birds in a flock are symbolically represented as particles. These particles can be considered as simple agents "flying" through a problem space. A particle's location in the multi-dimensional problem space represents one solution to the problem. When a particle moves to a new location, a different candidate solution is generated. This solution is evaluated by a fitness function that provides a quantitative value of the solution's utility.



The velocity and direction of each particle moving along each dimension of the problem space are altered at each generation of movement. In combination, the particle's personal experience, p_id, and its neighbors' experience, p_gd, influence the movement of each particle through the problem space. The random values rand_1 and rand_2 ensure that particles explore a wide search space before converging around the optimal solution. The values of c_1 and c_2 control the weight balance of p_id and p_gd in deciding the particle's next movement velocity. At every generation, the particle's new location is computed by adding the particle's current velocity, v_id, to its location, x_id. Mathematically, given a multi-dimensional problem space, the i-th particle changes its velocity and location according to the following equations [1][2]:

v_{id} = c_0 \times v_{id} + c_1 \times rand_1 \times (p_{id} - x_{id}) + c_2 \times rand_2 \times (p_{gd} - x_{id})    (1)

x_{id} = x_{id} + v_{id}    (2)

where c_0 denotes the inertia weight factor; p_id is the location at which the particle experienced its best fitness value; p_gd is the location at which the particles experienced the global best fitness value; c_1 and c_2 are constants known as acceleration coefficients; d denotes the dimension of the problem space; and rand_1, rand_2 are random values in the range (0, 1). In equation (1), the first part represents the inertia of the previous velocity; the second part is the "cognition" part, which represents the particle's private thinking; the third part is the "social" part, which represents cooperation among the particles. If the sum of the accelerations would cause the velocity v_id on a dimension to exceed v_{max,d}, then v_id is limited to v_{max,d}. v_{max,d} determines the resolution with which regions between the present position and the target position are searched. The weighted combination of the three possible moves is shown in Figure 1.

The PSO algorithm can be described as follows:

I) For each particle:
   Initialize the particle
II) Do:
   a) For each particle:
      1) Calculate the fitness value
      2) If the fitness value is better than the best fitness value pbest in history
      3) Set the current value as the new pbest
   End
   b) For each particle:
      1) Find, in the particle neighborhood, the particle with the best fitness gbest
      2) Calculate the particle velocity according to the velocity equation (1)
      3) Apply the velocity constriction
      4) Update the particle position according to the position equation (2)
      5) Apply the position constriction
   End
While the maximum number of iterations or the minimum error criterion is not attained

Figure 1. Weighted combination of three possible moves (the current position x_k, the velocity v_k, the attractions towards pbest_k and gbest_k, and the new position x_{k+1})
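As an illustration of update rules (1)–(2) and the loop above, the following minimal Python sketch implements the basic global-best PSO. The parameter values c0 = 0.7298 and c1 = c2 = 1.49618 are the constricted settings discussed later in Section III, and the sphere function is used only as a stand-in objective; both are assumptions of this sketch, not prescriptions of the paper.

```python
import random

def basic_pso(fitness, dim, bounds, swarm_size=30, iters=200,
              c0=0.7298, c1=1.49618, c2=1.49618):
    """Minimal global-best PSO following equations (1) and (2)."""
    lo, hi = bounds
    vmax = hi - lo                                   # velocity bound tied to the variable range
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    v = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [xi[:] for xi in x]                      # personal best positions
    pbest_val = [fitness(xi) for xi in x]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][d] = (c0 * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])   # cognition part
                           + c2 * r2 * (gbest[d] - x[i][d]))     # social part
                v[i][d] = max(-vmax, min(vmax, v[i][d]))         # velocity constriction
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))    # position update + constriction
            f = fitness(x[i])
            if f < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], f
                if f < gbest_val:
                    gbest, gbest_val = x[i][:], f
    return gbest, gbest_val

# Example: minimize the sphere function on [-1, 1]^2
best, val = basic_pso(lambda z: sum(t * t for t in z), dim=2, bounds=(-1.0, 1.0))
```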

III. PARAMETERS ANALYSIS OF PSO

The basic PSO described above has a small number of parameters that need to be fixed. One parameter is the size of the population. This is often set empirically on the basis of the dimensionality and perceived difficulty of a problem; values in the range 20–50 are quite common.

The parameters c_1 and c_2 in (1) determine the magnitude of the random forces in the direction of the personal best p_id and the neighborhood best p_gd. They are often called acceleration coefficients, and the behavior of a PSO changes radically with their values. Interestingly, the components c_1 × rand_1 × (p_id − x_id) and c_2 × rand_2 × (p_gd − x_id) in (1) can be interpreted as attractive forces produced by springs of random stiffness, and the motion of a particle can approximately be interpreted as the integration of Newton's second law. In this interpretation, c_1/2 and c_2/2 represent the mean stiffness of the springs pulling a particle. It is therefore no surprise that by changing c_1 and c_2 one can make the PSO more or less "responsive" and possibly even unstable, with particle speeds increasing without control. The value c_1 = c_2 = 2, almost ubiquitously adopted in early PSO research, did just that. This is often harmful to the search and needs to be controlled. The technique originally proposed to do this was to bound velocities, so that each component of v_id is kept within the range [−v_{max,d}, +v_{max,d}]. The choice of the parameter v_{max,d} required some care, since it appeared to influence the balance between exploration and exploitation. The use of hard bounds on velocity, however, presents some problems. The optimal value of v_{max,d} is problem-specific, and no reasonable rule of thumb is known. Further, when v_{max,d} was enforced, the particle's trajectory could fail to converge: where one would hope to shift from the large-scale steps that typify exploratory search to the finer, focused search of exploitation, v_{max,d} simply chopped off the particle's oscillations, so that some hopefully satisfactory compromise was seen throughout the run.

Motivated by the desire to better control the scope of the search, reduce the importance of v_{max,d}, and perhaps eliminate it altogether, the following modification of the PSO's update equations was proposed (Shi and Eberhart), introducing the inertia weight c_0 that appears in (1):



The inertia weight c_0 can be interpreted as the fluidity of the medium in which a particle moves. This perhaps explains why researchers have found that the best performance is obtained by initially setting c_0 to a relatively high value (e.g., 0.9), which corresponds to a system in which particles move in a low-viscosity medium and perform extensive exploration, and gradually reducing c_0 to a much lower value (e.g., 0.4), where the system is more dissipative and exploitative and is better at homing in on local optima. It is even possible to start from values c_0 > 1, which would make the swarm unstable, provided that the value is reduced sufficiently to bring the swarm back into a stable region (the precise value of c_0 that guarantees stability depends on the values of the acceleration coefficients).

Naturally, other strategies can be adopted to adjust the inertia weight. For example, in (Eberhart and Shi 2000) the adaptation of c_0 using a fuzzy system was reported to significantly improve PSO performance. Another effective strategy is to use an inertia weight with a random component rather than a time-decreasing one; for example, (Eberhart and Shi 2001) successfully used c_0 = U(0.5, 1). There are also studies, e.g., (Zheng et al. 2003) [10], in which an increasing inertia weight was used with good results.
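The time-decreasing schedule described above is commonly implemented as a linear ramp. The sketch below is one such implementation; the linear shape is an assumption (the text does not fix the schedule), while the endpoint values 0.9 and 0.4 are the ones quoted in the paragraph.

```python
def inertia_weight(iteration, max_iterations, c0_start=0.9, c0_end=0.4):
    """Linearly decreasing inertia weight: exploration early, exploitation late."""
    frac = iteration / float(max_iterations)
    return c0_start - (c0_start - c0_end) * frac

# c0 goes from 0.9 at iteration 0 down to 0.4 at the last iteration
schedule = [inertia_weight(t, 100) for t in range(101)]
```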

Clerc and Kennedy (2002) [11] noted that there can be many ways to implement the constriction coefficient. One of the simplest methods of incorporating it is the following:

v_{id} = \chi \left( v_{id} + c_1 \times rand_1 \times (p_{id} - x_{id}) + c_2 \times rand_2 \times (p_{gd} - x_{id}) \right)

where c = c_1 + c_2 > 4 and

\chi = \frac{2}{c - 2 + \sqrt{c^2 - 4c}}

When Clerc's constriction method is used, c is commonly set to 4.1 with c_1 = c_2, and the constant multiplier χ is approximately 0.7298. This results in the previous velocity being multiplied by 0.7298 and each of the two attraction terms being multiplied by a random number bounded by 0.7298 × 2.05 ≈ 1.49618.
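A small sketch of Clerc's constriction computation, using the formula above; the numbers printed for c = 4.1 reproduce the χ ≈ 0.7298 and 0.7298 × 2.05 ≈ 1.49618 values quoted in the text.

```python
import math

def constriction_coefficient(c1, c2):
    """Clerc's constriction coefficient for c = c1 + c2 > 4."""
    c = c1 + c2
    if c <= 4:
        raise ValueError("constriction requires c1 + c2 > 4")
    return 2.0 / (c - 2.0 + math.sqrt(c * c - 4.0 * c))

chi = constriction_coefficient(2.05, 2.05)     # c = 4.1
print(round(chi, 4))                           # ~0.7298
print(round(chi * 2.05, 5))                    # ~1.49618, the effective acceleration bound
```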

The constricted particles will converge without using any v_{max,d} at all. However, subsequent experiments and applications (Eberhart and Shi 2000) [12] concluded that a prudent rule of thumb is still to limit v_{max,d} to x_{max,d}, the dynamic range of each variable on each dimension. The result is a particle swarm optimization algorithm with no problem-specific parameters, and this is the canonical particle swarm algorithm of today.

Shi and Eberhart (1998) [13] analyzed the impact of the inertia weight and of the maximum allowed velocity on the performance of PSO. A number of experiments were carried out with different inertia weights and different values of the maximum allowed velocity. They concluded that when v_{max,d} is small, an inertia weight of approximately 1 is a good choice, while when v_{max,d} is not small, an inertia weight c_0 = 0.8 is a good choice. When knowledge for selecting v_{max,d} is lacking, it is also reasonable to set v_{max,d} equal to x_{max,d}, in which case an inertia weight c_0 = 0.8 is a good starting point. Furthermore, if a time-varying inertia weight is employed, even better performance can be expected.

In (Kennedy, J. 1997), based on c_1 and c_2, Kennedy introduced four models of PSO, defined by omitting or restricting components of the velocity formula:

(1) The complete formula above defines the Full Model, i.e., c_1 > 0 and c_2 > 0.
(2) Dropping the social component results in the Cognition-Only Model, i.e., c_1 > 0 and c_2 = 0.
(3) Dropping the cognition component defines the Social-Only Model, i.e., c_1 = 0 and c_2 > 0.
(4) The Selfless Model is the Social-Only Model in which the neighborhood best is chosen only from the neighbors, without the individual particle itself being considered as a candidate, i.e., c_1 = 0, c_2 > 0 and g ≠ i.

Regarding the inertia weight, it determines how the previous velocity of the particle influences the velocity in the next iteration. If c_0 = 0, the velocity of the particle is determined only by the pbest and gbest positions, which means that the particle may change its velocity instantly if it is moving far from the best positions in its knowledge; thus, low inertia weights favor exploitation (local search). If c_0 is high, the rate at which the particle may change its velocity is lower (it has an "inertia" that makes it follow its original path) even when better fitness values are known; thus, high inertia weights favor exploration (global search).

In this paper, an analysis of the impact of the inertia weight c_0 together with the acceleration constants c_1 and c_2 on the performance of PSO is given, followed by experiments that illustrate the analysis and provide some insight into the selection of these parameters. The parameters of PSO include the number of particles m, the inertia weight c_0, the acceleration constants c_1 and c_2, and the maximum velocity v_{max,d}. This paper discusses the inertia weight c_0 and the acceleration constants c_1 and c_2.

IV. CONVERGENCE OF PSO

A. Convergence Analysis

In (Gao S, Yang J Y, 2006; Gao S et al., 2006) [16, 17], the convergence of the particle swarm algorithm is studied, and a condition for the convergence of the particle swarm algorithm is given.



From equations (1) and (2), we can see that although x_id and v_id are multi-dimensional variables, their dimensions are independent of each other, so the multi-dimensional space can be simplified to a one-dimensional space. We suppose that the best position attained by the particle itself and the swarm's global best position are unchanged, and denote them p_b and g_b, respectively. Equations (1) and (2) can then be simplified as

v(k+1) = c_0 v(k) + c_1 (p_b - x(k)) + c_2 (g_b - x(k))    (3)

x(k+1) = x(k) + v(k+1)    (4)

From equations (3) and (4), we obtain

v(k+2) = c_0 v(k+1) + c_1 (p_b - x(k+1)) + c_2 (g_b - x(k+1))    (5)

x(k+2) = x(k+1) + v(k+2)    (6)

Substituting (4) and (5) into (6), we get

x(k+2) + (-c_0 + c_1 + c_2 - 1) x(k+1) + c_0 x(k) = c_1 p_b + c_2 g_b    (7)

Equation (7) is a second-order inhomogeneous difference equation, which can be solved by the characteristic-equation method. The characteristic equation of Equation (7) is

\lambda^2 + (-c_0 + c_1 + c_2 - 1) \lambda + c_0 = 0    (8)

(1) If Δ = (-c_0 + c_1 + c_2 - 1)^2 - 4 c_0 = 0, then λ = λ_1 = λ_2 = -(-c_0 + c_1 + c_2 - 1)/2. So

x(k) = (A_0 + A_1 k) \lambda^k

where A_0 and A_1 are undetermined coefficients determined by v(0) and x(0). Writing c = c_1 + c_2, they are

A_0 = x(0),
A_1 = [ (1 - c) x(0) + c_0 v(0) + c_1 p_b + c_2 g_b ] / \lambda - x(0)

(2) If Δ = (-c_0 + c_1 + c_2 - 1)^2 - 4 c_0 > 0, then

\lambda_{1,2} = ( c_0 - c_1 - c_2 + 1 \pm \sqrt{\Delta} ) / 2

So x(k) = A_0 + A_1 \lambda_1^k + A_2 \lambda_2^k, where A_0, A_1, A_2 are undetermined coefficients. With b_1 = x(0) - A_0 and b_2 = (1 - c) x(0) + c_0 v(0) + c_1 p_b + c_2 g_b - A_0, we get

A_0 = (c_1 p_b + c_2 g_b) / c,
A_1 = (\lambda_2 b_1 - b_2) / (\lambda_2 - \lambda_1),
A_2 = (b_2 - \lambda_1 b_1) / (\lambda_2 - \lambda_1)

(3) If Δ = (-c_0 + c_1 + c_2 - 1)^2 - 4 c_0 < 0, then

\lambda_{1,2} = ( c_0 - c_1 - c_2 + 1 \pm i\sqrt{-\Delta} ) / 2

So x(k) = A_0 + A_1 \lambda_1^k + A_2 \lambda_2^k, where A_0, A_1, A_2 are undetermined coefficients given by the same expressions as in case (2).

The convergence condition is |λ_1| < 1 and |λ_2| < 1.

If Δ = 0, the convergence zone is c^2 + c_0^2 - 2 c c_0 - 2 c - 2 c_0 + 1 = 0 and 0 ≤ c_0 < 1, where c = c_1 + c_2.

If Δ > 0, the convergence zone is c^2 + c_0^2 - 2 c c_0 - 2 c - 2 c_0 + 1 > 0, c > 0 and 2 c_0 - c + 2 > 0.

If Δ < 0, the convergence zone is c^2 + c_0^2 - 2 c c_0 - 2 c - 2 c_0 + 1 < 0 and c_0 < 1.

So the convergence zone is c_0 < 1, c > 0 and 2 c_0 - c + 2 > 0 (shown in Figure 2).

Figure 2. The convergence zone of the particle swarm algorithm
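A small numerical check of the analysis above: the sketch below solves the characteristic equation (8) for given c_0, c_1, c_2 and tests whether both roots lie inside the unit circle, which is the convergence condition |λ_1| < 1 and |λ_2| < 1.

```python
import cmath

def characteristic_roots(c0, c1, c2):
    """Roots of lambda^2 + (-c0 + c1 + c2 - 1)*lambda + c0 = 0 (equation (8))."""
    b = -c0 + c1 + c2 - 1.0
    disc = cmath.sqrt(b * b - 4.0 * c0)
    return (-b + disc) / 2.0, (-b - disc) / 2.0

def in_convergence_zone(c0, c1, c2):
    """True when both characteristic roots have modulus < 1."""
    l1, l2 = characteristic_roots(c0, c1, c2)
    return abs(l1) < 1.0 and abs(l2) < 1.0

print(in_convergence_zone(0.7298, 1.49618, 1.49618))  # constricted setting -> True
print(in_convergence_zone(2.0, 2.0, 2.0))             # fifth flight rule setting -> False
```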



B. Numerical Simulation

Numerical simulations are carried out for six different cases. It is assumed that x(0) = 2, v(0) = 1, and c_1 p_b + c_2 g_b = 0.

(1) If c_0 = 0.81 and c = 3.61, then Δ = 0 and x(k) = (2 + 2.9k)(-0.9)^k. x(k) is convergent, as illustrated in Figure 3.

(2) If c_0 = 4 and c = 1, then Δ = 0 and x(k) = 2^{k+1}. x(k) is divergent, as illustrated in Figure 4.

(3) If c_0 = 0 and c = 0.5, then Δ > 0 and x(k) = 2^{1-k}. x(k) is convergent, as illustrated in Figure 5.

(4) If c_0 = 0 and c = 3, then Δ > 0 and x(k) = 2 \cdot (-2)^k. x(k) is divergent, as illustrated in Figure 6.

(5) If c_0 = 0.5 and c = 0.5, then Δ < 0 and x(k) = 2^{-k/2} ( 2\cos(k\pi/4) + \sin(k\pi/4) ). x(k) is convergent, as illustrated in Figure 7.

(6) If c_0 = 2 and c = 5, then Δ < 0 and x(k) = 2^{(k+2)/2} ( \cos(3k\pi/4) - 2\sin(3k\pi/4) ). x(k) is divergent, as illustrated in Figure 8.

Figure 3. Particle location versus iteration number with c_0 = 0.81 and c = 3.61
Figure 4. Particle location versus iteration number with c_0 = 4 and c = 1
Figure 5. Particle location versus iteration number with c_0 = 0 and c = 0.5
Figure 6. Particle location versus iteration number with c_0 = 0 and c = 3
Figure 7. Particle location versus iteration number with c_0 = 0.5 and c = 0.5
Figure 8. Particle location versus iteration number with c_0 = 2 and c = 5
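These closed-form solutions can be checked directly against the recurrence (3)–(4). The sketch below iterates the simplified one-dimensional update for any of the six parameter pairs; here case (1) is used, with c_0 = 0.81 and c_1 + c_2 = 3.61 split evenly (an illustrative split), and with c_1 p_b + c_2 g_b = 0 as assumed in the simulations.

```python
def iterate_particle(c0, c1, c2, pb=0.0, gb=0.0, x0=2.0, v0=1.0, steps=30):
    """Iterate the simplified one-dimensional PSO recurrence (3)-(4)."""
    x, v, trajectory = x0, v0, [x0]
    for _ in range(steps):
        v = c0 * v + c1 * (pb - x) + c2 * (gb - x)   # equation (3)
        x = x + v                                    # equation (4)
        trajectory.append(x)
    return trajectory

# Case (1): c0 = 0.81, c = 3.61 -> x(k) should follow (2 + 2.9k)(-0.9)^k and decay to 0
traj = iterate_particle(c0=0.81, c1=1.805, c2=1.805)
print([round(t, 3) for t in traj[:6]])
```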

V. MULTIPLICATE PARTICLE SWARM OPTIMIZATION (MPSO) ALGORITHM

The first particle swarms (Kennedy and Eberhart 1995) [2] evolved out of bird-flocking simulations of the type described by (Reynolds 1987) [14] and (Heppner and Grenander 1990) [15]. In these models, the trajectory of each bird's flight is modified by the application of several rules, including some that take into account the birds that are nearby in physical space. Early PSO topologies were therefore based on proximity in the search space. However, besides being computationally intensive, this kind of communication structure had undesirable convergence properties, so the Euclidean neighborhood was soon abandoned.

In nature, individuals not only specialize but also cooperate. In this paper, different particles are assigned specific tasks: the particles use five different hybrid flight rules in accordance with a selection probability. In this way the rules complement one another, and the method uses not only local information but also global information, combining local search with global search to improve convergence.

The first flight rule: c_0, c_1 and c_2 are all non-zero and are restricted to the convergence zone (Figure 2). In this paper we set c_0 = 1, c_1 = 2, c_2 = 2.

The second flight rule: c_0 = 0, and c_1 and c_2 are restricted to the convergence zone (Figure 2). In this paper we set c_0 = 0, c_1 = 0.5, c_2 = 0.5.

The third flight rule: c_1 = 0, and c_0 and c_2 are restricted to the convergence zone (Figure 2). In this paper we set c_0 = 1, c_1 = 0, c_2 = 2.

The fourth flight rule: c_2 = 0, and c_0 and c_1 are restricted to the convergence zone (Figure 2). In this paper we set c_0 = 1, c_1 = 2, c_2 = 0.

The fifth flight rule: c_0, c_1 and c_2 are not restricted to the convergence zone (Figure 2). The iteration may diverge, but this allows the algorithm to escape local minima. In this paper we set c_0 = 2, c_1 = 2, c_2 = 2.

Each of the five flight rules is incomplete on its own, but each is effective in some situations. The particles use the five hybrid flight rules in accordance with a selection probability. The simplest selection scheme is roulette-wheel selection, also called stochastic sampling with replacement. As an example, consider selection probabilities (0.6, 0.1, 0.1, 0.1, 0.1) for the five flight rules. The spin of the wheel produces a random number r:

If 0 < r ≤ 0.6, the first flight rule is chosen;
else if 0.6 < r ≤ 0.7, the second flight rule is chosen;
else if 0.7 < r ≤ 0.8, the third flight rule is chosen;
else if 0.8 < r ≤ 0.9, the fourth flight rule is chosen;
else if 0.9 < r < 1, the fifth flight rule is chosen.
For comparison, five benchmark functions that are<br />

commonly used in the evolutionary computation<br />

literature are used. All functions have same minimum<br />

value, which are equal to zero.<br />

F 1<br />

2<br />

= x1<br />

+ x , −1 ≤ x i ≤ 1<br />

F = 100( x<br />

2<br />

2<br />

− x ) + ( 1−<br />

x ) , 2 ≤ x ≤ 2<br />

2<br />

2<br />

2<br />

2<br />

1<br />

2<br />

1<br />

− i<br />

2 2 2<br />

sin x1<br />

+ x2<br />

− 0.<br />

5<br />

F 3 =<br />

+ 0.<br />

5<br />

2 2 2<br />

[ 1+<br />

0.<br />

001(<br />

x1<br />

+ x2<br />

)]<br />

,<br />

1 ≤ x ≤ 1<br />

F4 2 2<br />

= x1<br />

+ 2x2 − 0.<br />

3cos3πx<br />

1 − 0.<br />

4cosπx2<br />

+ 0.<br />

7<br />

, −1 ≤ x i ≤ 1<br />

F<br />

2 2<br />

= x + x , 1 ≤ x ≤ 1<br />

− i<br />

5<br />

1<br />

2<br />

− i



We record the number of iterations needed for the error between the obtained result and the minimum value to reach 0.00001. Four combinations of selection probabilities for the five flight rules are tested: 0.6:0.1:0.1:0.1:0.1, 0.4:0.2:0.2:0.1:0.1, 0.3:0.2:0.2:0.2:0.1, and 0.2:0.2:0.2:0.2:0.2. Fifty rounds of computer simulation are conducted for each algorithm, and the results are shown in Table I.

All the MPSO variants prove effective. In particular, the algorithm with selection probability 0.3:0.2:0.2:0.2:0.1 is simple and performs better than the others.

TABLE I. COMPARISON OF ALGORITHMS

Function  Algorithm                      Best   Average   Worst
F1        PSO                            5      34.9      107
F1        MPSO (0.6:0.1:0.1:0.1:0.1)     4      16.5      30
F1        MPSO (0.4:0.2:0.2:0.1:0.1)     5      12.06     26
F1        MPSO (0.3:0.2:0.2:0.2:0.1)     4      10.88     18
F1        MPSO (0.2:0.2:0.2:0.2:0.2)     4      12.4      28
F2        PSO                            9      129.6     417
F2        MPSO (0.6:0.1:0.1:0.1:0.1)     14     36.68     97
F2        MPSO (0.4:0.2:0.2:0.1:0.1)     12     21.64     41
F2        MPSO (0.3:0.2:0.2:0.2:0.1)     13     21.74     37
F2        MPSO (0.2:0.2:0.2:0.2:0.2)     13     26.08     38
F3        PSO                            5      30.9      84
F3        MPSO (0.6:0.1:0.1:0.1:0.1)     2      14.5      34
F3        MPSO (0.4:0.2:0.2:0.1:0.1)     4      12.54     21
F3        MPSO (0.3:0.2:0.2:0.2:0.1)     3      11.36     19
F3        MPSO (0.2:0.2:0.2:0.2:0.2)     5      12.76     24
F4        PSO                            7      85.76     272
F4        MPSO (0.6:0.1:0.1:0.1:0.1)     11     28.36     72
F4        MPSO (0.4:0.2:0.2:0.1:0.1)     8      17.16     31
F4        MPSO (0.3:0.2:0.2:0.2:0.1)     6      17.88     26
F4        MPSO (0.2:0.2:0.2:0.2:0.2)     10     18.52     37
F5        PSO                            4      33.6      96
F5        MPSO (0.6:0.1:0.1:0.1:0.1)     3      15.5      27
F5        MPSO (0.4:0.2:0.2:0.1:0.1)     4      11.52     24
F5        MPSO (0.3:0.2:0.2:0.2:0.1)     3      10.23     16
F5        MPSO (0.2:0.2:0.2:0.2:0.2)     3      11.4      26

VI. CONCLUSIONS

Based on the idea of specialization and cooperation in particle swarm optimization, a multiplicate particle swarm optimization algorithm is proposed. The efficiency of the new algorithm is verified by simulation results on five classical test functions and by comparison with other algorithms. Future work will address additional issues related to the use of PSO in high-dimensional spaces, including the selection of the swarm size and of the other PSO parameters. Even though good experimental results have been obtained in this paper, only a small set of benchmark problems has been tested, and the selection probability may be problem-dependent. To fully justify the benefits of the selection probability described in this paper, more problems need to be tested; by doing so, a clearer understanding of PSO performance will be obtained.


ACKNOWLEDGMENT

This work was partially supported by the National Basic Research Program of Jiangsu Province University (08KJB520003), the Qing Lan Project, and the Open Project Program of the State Key Lab of CAD&CG.

REFERENCES

[1] R. C. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory", Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 1995, pp. 39-43.
[2] J. Kennedy and R. C. Eberhart, "Particle swarm optimization", Proceedings of the IEEE International Conference on Neural Networks, Perth, 1995, pp. 1942-1948.
[3] Y. H. Shi and R. C. Eberhart, "A modified particle swarm optimizer", IEEE International Conference on Evolutionary Computation, Anchorage, Alaska, May 4-9, 1998, pp. 69-73.
[4] J. Kennedy, "The particle swarm: social adaptation of knowledge", Proceedings of the 1997 International Conference on Evolutionary Computation, IEEE Press, 1997, pp. 303-308.
[5] P. C. Fourie and A. A. Groenwold, "The particle swarm optimization algorithm in size and shape optimization", Structural and Multidisciplinary Optimization, vol. 23, no. 4, pp. 259-267, 2002.
[6] P. C. Fourie and A. A. Groenwold, "Particle swarms in topology optimization", Proceedings of the Fourth World Congress of Structural and Multidisciplinary Optimization, Dalian, China, 2001.
[7] G. Venter and J. Sobieszczanski-Sobieski, "Particle swarm optimization", AIAA Journal, vol. 41, no. 8, pp. 1583-1589, 2003.
[8] G. Venter and J. Sobieszczanski-Sobieski, "Multidisciplinary optimization of a transport aircraft wing using particle swarm optimization", Structural and Multidisciplinary Optimization, vol. 26, no. 1-2, pp. 121-131, 2004.
[9] J. F. Schutte, B. J. Fregly, R. T. Haftka, and A. George, "A parallel particle swarm algorithm", Proceedings of the Fifth World Congress of Structural and Multidisciplinary Optimization, Venice, Italy, 2003.
[10] Y. L. Zheng, L. H. Ma, L. Y. Zhang, and J. X. Qian, "On the convergence analysis and parameter selection in particle swarm optimization", Proceedings of the IEEE International Conference on Machine Learning and Cybernetics, Piscataway: IEEE, 2003, pp. 1802-1807.
[11] M. Clerc and J. Kennedy, "The particle swarm—explosion, stability, and convergence in a multidimensional complex space", IEEE Transactions on Evolutionary Computation, vol. 6, no. 1, pp. 58-73, 2002.
[12] R. C. Eberhart and Y. Shi, "Comparing inertia weights and constriction factors in particle swarm optimization", Proceedings of the IEEE Congress on Evolutionary Computation (CEC), San Diego, CA, Piscataway: IEEE, 2000, pp. 84-88.
[13] Y. H. Shi and R. C. Eberhart, "Parameter selection in particle swarm optimization", Proceedings of the 7th International Conference on Evolutionary Programming VII, 1998, pp. 591-600.
[14] C. W. Reynolds, "Flocks, herds, and schools: a distributed behavioral model", Computer Graphics, vol. 21, no. 4, pp. 25-34, 1987.
[15] H. Heppner and U. Grenander, "A stochastic non-linear model for coordinated bird flocks", in S. Krasner (Ed.), The Ubiquity of Chaos, Washington: AAAS, 1990, pp. 233-238.
[16] S. Gao and J. Y. Yang, Swarm Intelligence Algorithms and Applications, Beijing: China Water Power Press, 2006, pp. 112-117 (in Chinese).
[17] S. Gao, K. Z. Tang, X. Z. Jiang, and J. Y. Yang, "Convergence analysis of particle swarm optimization algorithm", Science Technology and Engineering, vol. 6, no. 12, pp. 1625-1627, 1631, 2006 (in Chinese).

Shang Gao was born in 1972, and received his M.S. degree in 1996 and Ph.D. degree in 2006. He now works in the School of Computer Science and Technology, Jiangsu University of Science and Technology. He is an associate professor, engaged mainly in systems engineering.

Zaiyue Zhang was born in 1961, and received his M.S. degree in mathematics in 1991 from the Department of Mathematics, Yangzhou Teaching College, and his Ph.D. degree in mathematics in 1995 from the Institute of Software, the Chinese Academy of Sciences. He is now a professor at the School of Computer Science and Technology, Jiangsu University of Science and Technology. His current research areas are recursion theory, knowledge representation and knowledge reasoning.

Cungen Cao was born in 1964, and received his M.S. degree in 1989 and Ph.D. degree in 1993, both in mathematics, from the Institute of Mathematics, the Chinese Academy of Sciences. He is now a professor at the Institute of Computing Technology, the Chinese Academy of Sciences. His research area is large-scale knowledge processing.



Research and Design of Intelligent Electric Power Quality Detection System Based on VI

Yu Chen

Zhengzhou Institute of Aeronautical Industry Management, Zhengzhou, China
Email: chenyu3440@gmail.com

Abstract—Electric power quality has become an important problem in modern society, affecting industrial production and product quality. There is therefore an urgent need to monitor and analyze electric power parameters. In this paper, we design an electric power parameter detection system based on virtual instrument (VI) technology. The system uses a voltage/current transformer module and a signal processing module to convert the analog signals of the power network. It adopts multiplexing technology, which avoids the complexity of using multiple data acquisition cards (DAQ) and also saves cost. In addition, the system software adopts wavelet multi-resolution analysis combined with the FFT, which overcomes the weakness of the FFT in analyzing non-stationary distorted signals. The system is applied to harmonic monitoring and to voltage fluctuation and flicker monitoring. This paper introduces the system structure, the algorithm simulation, and parts of the hardware and software design in detail. In experiments, the monitoring error of the harmonic THD is 0.01%, and the errors in the frequency and amplitude of the voltage-flicker modulation wave are less than 0.01 Hz and 0.005 V respectively, so the expected high monitoring precision is achieved.

Index Terms—virtual instrument, DAQ, FFT, harmonic monitor, voltage fluctuation and flicker, wavelet analysis, Matlab+LabVIEW

I. INTRODUCTION

With the development of the electric power system, and especially the introduction of power electronic equipment, the harm caused by power system harmonics is increasing seriously. For industrial and mining enterprises, power quality is essential to production. To ensure reliable operation of equipment, the voltage, current and other states of the power network need to be detected in real time. Traditional detection relies on manual work, which has several problems: first, the field environment is poor, especially under medium- or high-voltage conditions, and is not suitable for long-term work; second, the manual approach has larger errors, and data collection and processing both require a large amount of time and work. In addition, present detection equipment cannot be fully separated from the field. Therefore, designing a convenient and simple detection approach is of great significance. With the increasing precision, speed and reliability of electronic components, a basis is provided for realizing high-quality measurement methods and real-time control. [1]

This system adopts data acquisition multiplexing, VI and PC technologies together with a new wavelet multi-resolution algorithm, on the basis of which an intelligent power network detection system is implemented.

II. SYSTEM STRUCTURE AND WORK PRINCIPLE

The intelligent electric power quality detection system is mainly composed of two parts: a PC system running the VI software LabVIEW 8.5 + Matlab, and a data acquisition system. The input to the PC system is the small voltage signal obtained after the three-phase voltage/current has been processed by the transformer and signal processing modules. Depending on the detection method required, monitoring quantities such as voltage phase and power factor requires synchronized acquisition, that is, two voltage phases, or a voltage and a current, must be sampled synchronously. If all six signal groups were amplified, filtered and transmitted to the data acquisition module, the module would need at least six synchronized sampling channels. However, a normal synchronized DAQ card has only four synchronized sampling channels, so two cards would be needed, which introduces the problem of synchronizing the two cards and increases cost and technical complexity. Therefore, the system adopts multiplexing technology to solve this problem. The overall structure of the system design is shown in Fig. 1.

Figure 1. Overall structure of the system design (sensor monitor signals → signal processing module → data acquisition card → PC with VI software, LabVIEW 8.5)

III. SYSTEM HARDWARE DESIGN

The six electric power parameters of the power system are output in two channel groups through the dual 8-channel (16-channel) analog switch AD7507, and are then amplified, low-pass filtered and transmitted to the DAQ card. Thus only one card is needed to fulfill the system's signal collection requirement, which not only reduces cost but also simplifies the design. To increase the generality of the monitoring system, a multi-range automatic control circuit is designed, which makes the system shift range automatically to follow the monitored value, according to the DAQ A/D converter's voltage limits, monitoring precision and resolution.



The system performs field detection of the electric power system in real time and processes the collected field data through the PC; a standard industrial control computer is used. By analyzing and processing the received data, the PC realizes display and alarm functions. Because space is limited, only some important parts are introduced.

A. Transformer Circuit

The system uses the voltage transformer SPT204A together with peripheral circuitry to form the voltage signal collection module. The SPT204A is in fact a precision current transformer at the mA level; because of size and construction constraints, a current-type voltage transformer is usually adopted. Its rated transformation ratio is 1 and its rated input current is 2 mA. The input voltage cannot be applied directly to the primary coil, so a current-limiting resistor is connected in series, which converts the voltage signal into a current signal at the mA level. The mA-level current signal output by the secondary side is converted back into a voltage signal by an I/V amplifier stage. The amplifier uses the well-characterized LM118 chip, which easily reaches high precision and stability. The transformer works at an approximately zero-load state, which gives it advantages such as a broad dynamic range, high precision and good linearity. The voltage signal collection circuit is shown in Fig. 2.

Figure 2. Voltage transformer circuit

The system uses the current transformer STC254AK together with peripheral circuitry to form the current transformer module. Its circuit principle is basically the same as that of the voltage transformer and is not repeated here. In addition, when designing the system one should note that the transformer has same-polarity (dotted) and opposite-polarity terminals; the opposite-polarity terminal of the input signal should be connected to the inverting input of the LM118, which ensures that the amplified signal has the same polarity as the detected signal; otherwise the output is inverted. [2]

B. Multi-channel Selection Switch

The system adopts the AD7507 from Analog Devices, a 16-channel / dual 8-channel, low-leakage CMOS analog multiplexer, which outputs the sensor module signals on two channels for subsequent amplification and filtering. The AD7507 logic truth table is shown in Table I.

TABLE I. LOGIC TRUTH TABLE

A2  A1  A0  EN  On-off switch
×   ×   ×   0   NONE
0   0   0   1   1 & 9
0   0   1   1   2 & 10
0   1   0   1   3 & 11
0   1   1   1   4 & 12
1   0   0   1   5 & 13
1   0   1   1   6 & 14
1   1   0   1   7 & 15
1   1   1   1   8 & 16


From Table I we can see that each output group of the chip corresponds to one input group. The Ua, Ub, Uc, Ia, Ib and Ic signals can therefore be acquired on the two channels under the control of the three channel-selection ports and one enable port, so four control signals, D0, D1, D2 and D3, are needed from the DAQ card. The circuit principle is shown in Fig. 3. [3]

Figure 3. Multi-channel selection switch circuit
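For completeness, the channel-selection logic of Table I can be expressed in software as below. The mapping of the DAQ digital outputs D0–D2 to the AD7507 address lines A0–A2 and of D3 to EN is an assumption made only for illustration, consistent with the four control signals mentioned above.

```python
def ad7507_control_word(channel_pair, enable=True):
    """Return (A2, A1, A0, EN) selecting input pair 'channel_pair' (1..8) per Table I."""
    if not 1 <= channel_pair <= 8:
        raise ValueError("channel pair must be 1..8 (switches 1&9 .. 8&16)")
    code = channel_pair - 1          # rows of Table I: 000 -> 1&9, ..., 111 -> 8&16
    a2, a1, a0 = (code >> 2) & 1, (code >> 1) & 1, code & 1
    return a2, a1, a0, int(enable)

# e.g. selecting switch pair 3&11 gives address 0 1 0 with EN = 1
print(ad7507_control_word(3))    # (0, 1, 0, 1)
```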

C. DAQ Selection

The monitoring task of this project is relatively simple, with no multi-module triggering; the monitored signals are of low frequency, and the main additional requirements concern EMI and similar factors. Considering hardware cost and implementation complexity, a PCI-bus data acquisition module is therefore adopted and plugged directly into the PC's PCI bus, so that the field data are collected and processed with the PC as the controller. The system uses the ADLINK DAQ-2010, a 4-channel synchronized DAQ card with 14-bit A/D resolution and a maximum sampling frequency of 2 MHz. According to the national standard for electric power network quality, the frequency monitoring range of a class-A instrument is 0–2500 Hz, i.e., 50 times the fundamental, which sets the highest harmonic frequency to be sampled. By the Nyquist sampling theorem, a DAQ card whose maximum sampling frequency exceeds 5 kHz meets the system requirement, so the DAQ-2010 fully satisfies the class-A monitoring requirement for three-phase voltage and current.

In addition, ADLINK provides not only the DAQ-2010 driver, which ensures that the card works correctly on the LabVIEW platform, but also a set of VI functions, so that users can program by calling these functions in the LabVIEW environment.

IV. SYSTEM SOFTWARE DESIGN

The system software adopts a modular program structure. The PC control software is written in LabVIEW + Matlab and uses LabSQL [4] to connect to the database. The software is divided into relatively independent functional modules with suitable entry and exit parameters. We mainly introduce the power quality monitoring methods for harmonics and for voltage fluctuation and flicker. For harmonic monitoring, the FFT is combined with wavelet multi-resolution analysis, which makes harmonic monitoring possible when the signal contains mutations. For voltage fluctuation and flicker monitoring, a synchronized squaring algorithm is combined with the multi-resolution method, which allows the analysis to be carried out without a low-pass filter. Below we introduce the main methods and theory used.



A. Connecting to the Database with LabSQL

The system uses LabSQL to connect to the database; the principle of the LabSQL connection is shown in Fig. 4. With LabSQL, data can be transferred between the application program and the database.

Figure 4. Flow diagram of LabSQL connecting to the database (LabVIEW → LabSQL VIs → DSN (data source name) → ODBC driver → database)

To prevent mis-operation by users without the proper authority, login and checking modules are designed. The login program is written in LabVIEW. The user information includes user name, password, authority, number of logins and last login time. The database "用户.mdb" ("用户" means "user") is built with Microsoft Access, and a "用户" table is designed in it to store the user information; its structure is shown in Table II.

TABLE II. DATABASE USER TABLE

B. FFT

At present, the main harmonic monitoring methods are the DFT and the FFT. The computational complexity of the DFT is proportional to the square of the transform length N; for harmonic analysis, N must be large to ensure calculation precision, so the computational load is heavy. The FFT is a fast algorithm for the DFT and improves the efficiency of the DFT by one to two orders of magnitude, so it is widely applied in harmonic analysis. [5] The procedure is as follows: first calculate the real and imaginary parts of the fundamental and of each harmonic, then calculate their amplitudes and phases, the content of each harmonic, and the total harmonic content. Taking the voltage as an example, if the FFT gives the real and imaginary parts of the k-th harmonic (k = 2, 3, …, 40) as U_r(k) and U_i(k), then the k-th harmonic voltage amplitude U(k) is given by formula (1):

U(k) = \sqrt{ U_r^2(k) + U_i^2(k) }    (1)

The phase of the k-th harmonic voltage is given by (2):

\theta(k) = \arctan\left( \frac{U_r(k)}{U_i(k)} \right)    (2)

Correspondingly, the content of a given harmonic is obtained as the percentage of its amplitude relative to the fundamental amplitude. The content of the k-th harmonic is given by (3): [6]

HRU(k) = \frac{U(k)}{U(1)} \times 100\%    (3)

The total harmonic distortion (THD) reflects the total harmonic content and can be calculated by formula (4):

THD = \frac{ \sqrt{ \sum_{k=2}^{40} U^2(k) } }{ U_1 } \times 100\%    (4)
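As a sketch of formulas (1)–(4), the following Python fragment uses NumPy's FFT to estimate the harmonic amplitudes, phases, per-harmonic content and THD of a sampled voltage waveform. It assumes the record contains an integer number of fundamental periods (synchronous sampling); otherwise the leakage error discussed below appears. The helper name and the example waveform are illustrative, not taken from the paper.

```python
import numpy as np

def harmonic_analysis(u, fs, f1=50.0, max_order=40):
    """Estimate harmonic amplitudes/phases and THD from one synchronously sampled record."""
    n = len(u)
    spectrum = np.fft.rfft(u) / n * 2.0              # single-sided amplitude spectrum
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amplitude, phase = {}, {}
    for k in range(1, max_order + 1):
        idx = int(np.argmin(np.abs(freqs - k * f1)))  # bin closest to the k-th harmonic
        c = spectrum[idx]
        amplitude[k] = np.abs(c)                      # formula (1)
        phase[k] = np.arctan2(c.real, c.imag)         # formula (2), arctan(Ur/Ui)
    hru = {k: 100.0 * amplitude[k] / amplitude[1] for k in range(2, max_order + 1)}   # (3)
    thd = 100.0 * np.sqrt(sum(amplitude[k] ** 2
                              for k in range(2, max_order + 1))) / amplitude[1]       # (4)
    return amplitude, phase, hru, thd

# Example: 50 Hz fundamental plus a 30% 2nd and 10% 6th harmonic, sampled at 10 kHz for 0.2 s
t = np.arange(0, 0.2, 1.0 / 10000.0)
u = (np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 100 * t)
     + 0.1 * np.sin(2 * np.pi * 300 * t))
_, _, _, thd = harmonic_analysis(u, fs=10000.0)
print(round(thd, 1))   # close to 31.6 (= 100 * sqrt(0.3**2 + 0.1**2))
```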

Although the FFT is fast and precise, the method has two sources of error: leakage error, when the sampling period is not synchronized with the signal period, and frequency aliasing error, when the sampling frequency is less than twice the highest signal frequency. Several techniques exist to reduce leakage error, such as windowing and interpolation. For aliasing error, a low-pass filter is normally added before sampling to remove components above the Nyquist frequency. Moreover, FFT-based harmonic monitoring assumes that the waveform is stationary and periodic. In practice, the harmonic signal may contain abrupt changes; the integral action of the Fourier transform smooths such non-stationary features, so a pure Fourier algorithm performs poorly for non-stationary harmonic monitoring involving signal mutation, transient distortion, short-time harmonics and the like.

C. Multi-resolution Wavelet Analysis Algorithm

When the sampling frequency of the signal to be analyzed satisfies the Nyquist sampling theorem, the normalized frequency band is confined to (−π, +π), and we can work with the low-frequency part 0–π/2 and the high-frequency part π/2–π. These are obtained with an ideal low-pass filter H0 and an ideal high-pass filter H1, corresponding respectively to the general picture (approximation) and the detail of the signal. The two outputs are orthogonal because their frequency bands do not overlap, and since the bandwidth is halved, no information is lost if the sampling rate is also halved (the sampling rate of a band-pass signal is determined by its bandwidth, not by its upper frequency limit). The same process can then be applied to each low-frequency part after decomposition. That is, at each level the signal is decomposed into a low-frequency approximation and a high-frequency detail, and the output sampling rate of each level is halved; in this way the original signal s(n) is decomposed in multi-resolution.

Figure 5. Multi-resolution wavelet decomposition (x(n) is split level by level into approximations a1(k), a2(k), a3(k) and details d1(k), d2(k), d3(k))

Fig. 5 corresponds to the relation x(n) = a3(k) + d3(k) + d2(k) + d1(k), where x(n) is the original signal and d1(k), a1(k), d2(k), a2(k), d3(k), a3(k) are the high- and low-frequency components of each level of the decomposition. Further levels of decomposition proceed in the same way. The ultimate aim is to construct, in each frequency band, an orthogonal wavelet basis that closely approximates L²(R). These orthogonal wavelet bases at different frequency resolutions act as band-pass filters of different bandwidths. The low-frequency signal is decomposed repeatedly in the multi-resolution analysis, so the frequency resolution becomes higher and higher.

Based on the theory of multi-resolution analysis, Mallat presented a fast algorithm for wavelet decomposition and reconstruction, the Mallat algorithm. [7] Starting from the discrete samples f_n of the analyzed signal f(x), the coefficients are computed recursively as

c_n^{j} = \sum_i a_{i-2n} c_i^{j-1},    d_n^{j} = \sum_i b_{i-2n} c_i^{j-1}

and the output is c_n^{N-M} together with d_n^{j} for N − M ≤ j ≤ N, where N is the decomposition level of the analyzed signal f(x), c_n^j and d_n^j are the low-frequency and high-frequency coefficients, and a_n and b_n are the low-frequency and high-frequency decomposition filter sequences, determined by the scaling function Φ(x) and the wavelet function Ψ(x). The detail of each frequency band of the signal can be observed by focusing on the corresponding signal scale with the multi-resolution Mallat algorithm. [8]
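For illustration, the multi-resolution decomposition and reconstruction described above can be reproduced with the PyWavelets library — an assumption of this sketch, since the paper itself implements the algorithm in LabVIEW/Matlab. `pywt.wavedec` returns the approximation and detail coefficients, and summing the per-band reconstructions recovers the original signal, in the spirit of x(n) = a3(k) + d3(k) + d2(k) + d1(k).

```python
import numpy as np
import pywt

def multiresolution_split(x, wavelet="db4", level=3):
    """Split x into one approximation and `level` detail signals (Mallat decomposition)."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    parts = []
    for i in range(len(coeffs)):
        only_i = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
        parts.append(pywt.waverec(only_i, wavelet)[:len(x)])   # reconstruct one band at a time
    return parts   # [a_level, d_level, ..., d_1]

t = np.linspace(0.0, 0.2, 2000, endpoint=False)
x = np.sin(2 * np.pi * 50 * t) + 0.1 * np.sin(2 * np.pi * 600 * t)
bands = multiresolution_split(x, level=3)
print(np.allclose(sum(bands), x, atol=1e-8))   # the bands sum back to the original signal
```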

D. FFT Combined with the Wavelet Transform

An ideal harmonic analysis algorithm should not only obtain the exact frequency, amplitude and phase of each stationary harmonic, but also detect small changes such as transient harmonics and transient mutations. Neither the Fourier algorithm nor the wavelet transform alone provides a perfect harmonic analysis, so this paper combines the two to realize harmonic monitoring and analysis.

Before the FFT is applied, we first check whether a non-stationary interfering component is present; if it is, it is extracted first. Because the Fourier transform has no time-domain locality, it cannot analyze non-stationary distorted waves; the wavelet transform, on the other hand, has good time-domain locality, can process transient signals, and thus compensates for the shortcoming of the Fourier transform and provides a better basis for the FFT. Mutation points are detected using the modulus-maximum singularity detection method. The harmonic monitoring principle is shown in Fig. 6. [9]

Figure 6. Flow diagram of the combination of the wavelet transform with the FFT (the sampled signal, after wavelet de-noising and filtering, passes through the multi-resolution wavelet transform, which separates the transient component from the stable component; the transient part is subtracted, the stable part goes to FFT analysis, and the outputs are the fundamental, each harmonic, the THD and the spectrum)

As the diagram shows, this harmonic monitoring and analysis algorithm adds a multi-resolution wavelet transform stage before the FFT stage. This stage is the key of the method, and its functions are: (1) analyzing the transient component of the signal; and (2) correcting and compensating the initial signal, which reduces the FFT error caused by the transient component and improves the precision of the spectrum analysis.

E. Voltage Flicker Synchronous Squaring Algorithm

Voltage flicker of the power network can be regarded as a carrier wave, composed of the sinusoidal power-frequency voltage, that is amplitude-modulated by a signal with a frequency of 0.5 Hz–35 Hz. Power network voltage variations usually cause voltage flicker. The voltage fluctuation and flicker signal u(t) can therefore be expressed as formula (5):

u(t) = V_m [ 1 + m \sin(\Omega t) ] \sin(\omega_0 t)    (5)

In formula (5), V_m is the amplitude of the power-frequency voltage, ω_0 is the angular frequency of the power-frequency carrier wave, m is the ratio of the amplitude of the modulating wave to the carrier voltage (the modulation index), with m ≤ 1; m·sin(Ωt) is the flicker voltage, and Ω is the angular frequency of the amplitude-modulation voltage.

The envelope signal of the voltage flicker, m\sin(\Omega t), contains its amplitude and frequency information. Using the wavelet multi-resolution decomposition method, which is implemented by groups of sub-band filters, the signal is decomposed into a sum of sinusoid-like components, each corresponding to a particular frequency band. Wavelet analysis approximates the original signal at different resolutions, and each projection component occupies a certain bandwidth with a corresponding time-domain resolution. By replacing the traditional low-pass filter of the synchronous detector with a sub-band filter, we can not only measure the envelope of the voltage fluctuation without distortion, but also precisely monitor the start time and amplitude of the voltage flicker.

This paper introduces a method that adopts the IEC-recommended square (demodulation) monitoring approach [10] and combines it with wavelet analysis, applicable to both stable and unstable voltage flicker signals. Experiments show that this method performs well. Its measurement principle is shown in Fig. 7.

Figure 7. Wavelet-transform-based flicker measurement (u(t) → sampling → squaring to u²(t) → wavelet decomposition into high-frequency detail coefficients and low-frequency approximation coefficients of u²(t) → wavelet reconstruction → envelope signal).

According to the square measurement method, the envelope of u(t) can be obtained from the low-frequency approximation of u²(t): the modulated voltage is squared and then reconstructed by the wavelet transform. The derivation is given in formula (6):

u^2(t) = A^2 (1 + 2m\sin\Omega t + m^2\sin^2\Omega t)\,\sin^2\omega_0 t
       = \frac{A^2}{2}\Bigl(1+\frac{m^2}{2}\Bigr) + A^2 m\sin\Omega t
         - \frac{A^2}{2}\Bigl(1+\frac{m^2}{2}\Bigr)\cos 2\omega_0 t
         - \frac{A^2 m^2}{4}\cos 2\Omega t
         - \frac{A^2}{2} m\sin(2\omega_0+\Omega)t + \frac{A^2}{2} m\sin(2\omega_0-\Omega)t
         + \frac{A^2}{8} m^2\cos 2(\omega_0+\Omega)t + \frac{A^2}{8} m^2\cos 2(\omega_0-\Omega)t    (6)

From formula (6) we can see that, besides a DC component, the squared modulated voltage contains frequency components at \Omega, 2\Omega, 2\omega_0, 2(\omega_0 \pm \Omega) and 2\omega_0 \pm \Omega (here A denotes the carrier amplitude V_m). We analyze the time-frequency content of u²(t) using the Mallat wavelet decomposition algorithm [11], whose process corresponds to decomposition by a series of sub-band filters. Since the actual modulation index m is much smaller than 1, the m² terms of the amplitude-modulated voltage are much smaller than the modulation term itself and can be neglected. Demodulation can therefore be realized after the wavelet transform, yielding the approximate amplitude-modulation voltage u'(t) shown in formula (7):

u'(t) \approx m V_m^{2} \sin(\Omega t)    (7)
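To make the square-demodulation idea concrete, the following Matlab sketch squares a flicker signal of the form (5), wavelet-decomposes it, and reconstructs the low-frequency approximation as the envelope. It is a hedged illustration: the parameter values, the number of levels and the db24 wavelet are assumptions, not the authors' implementation.

% Minimal sketch of square demodulation of a voltage flicker signal
fs = 3200;  t = 0:1/fs:2.5-1/fs;
Vm = 1; m = 0.05; Omega = 6*pi; w0 = 100*pi;   % assumed parameter values
u  = Vm*(1 + m*sin(Omega*t)).*sin(w0*t);       % flicker signal, formula (5)

s  = u.^2;                                     % square demodulation
[C,L] = wavedec(s, 6, 'db24');                 % Wavelet Toolbox
a6 = wrcoef('a', C, L, 'db24', 6);             % low-frequency approximation

envelope = a6 - mean(a6);                      % remove the DC term of formula (6);
                                               % what remains tracks m*Vm^2*sin(Omega*t)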

V. SIMULATION ANALYSIS<br />

A. Mutation Point Analysis and Compensation<br />

To verify the accuracy of this algorithm, we construct the test signal

x(t) = \sin(100\pi t) + 0.3\sin(200\pi t + 15^{\circ}) + 0.15\sin(400\pi t + 30^{\circ}) + 0.1\sin(600\pi t + 30^{\circ}) + 0.01\sin(1200\pi t + 60^{\circ}) + p(t)\cos(3200\pi t).

The fundamental frequency of the signal is 50 Hz with amplitude 1, and it contains a 2nd harmonic (100 Hz) of amplitude 0.3, a 4th harmonic (200 Hz) of amplitude 0.15, a 6th harmonic (300 Hz) of amplitude 0.1, and a 12th harmonic (600 Hz) of amplitude 0.01. In addition, a mutation signal p(t) is constructed: it produces an unstable, transient cosine pulse starting at 0.015 s with a constant width of 0.000625 s, and is built as a piecewise (segment) function. If the signal is sampled at 6.4 kHz, the pulse lies between sample points 96 and 100 in theory. The original signal, simulated in Matlab, is shown in Fig. 8.

Figure 8. Simulated waveform of the original signal.
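The following Matlab sketch constructs this test signal. The exact shape and amplitude of p(t) are not given in the text, so the rectangular gate of amplitude 0.1 used here is an illustrative assumption; only its position and width follow the paper.

% Minimal sketch of the test signal x(t) with the transient pulse p(t)
fs = 6400;  t = 0:1/fs:0.1-1/fs;

p = zeros(size(t));
idx = (t >= 0.015) & (t < 0.015 + 0.000625);   % pulse window stated in the paper
p(idx) = 0.1;                                  % assumed pulse amplitude

x = sin(100*pi*t) ...
  + 0.30*sin(200*pi*t  + 15*pi/180) ...
  + 0.15*sin(400*pi*t  + 30*pi/180) ...
  + 0.10*sin(600*pi*t  + 30*pi/180) ...
  + 0.01*sin(1200*pi*t + 60*pi/180) ...
  + p.*cos(3200*pi*t);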

For the signal x(t), the fundamental frequency f is 50 Hz and the sampling frequency f_s is 6.4 kHz; from the decomposition-level selection formula given earlier in the paper we obtain N = 5. After a 5-layer wavelet multi-resolution decomposition, the approximation a5 contains the fundamental-wave band, and the harmonics of each order are distributed over the high-frequency bands d1 and d2. The frequency-band division is shown in Fig. 9.

Figure 9. Division of the signal frequency bands (dyadic split of 0–3.2 kHz: d1 = 1.6–3.2 kHz, d2 = 0.8–1.6 kHz, then 400–800 Hz, 200–400 Hz and 100–200 Hz details, with the 0–100 Hz approximation a5 containing the fundamental).

According to the number of multi-resolution analysis layers, the signal x(t) is decomposed with the db24 wavelet, and the two highest-frequency detail layers d1 and d2, together with their neighbour d3, are extracted. The duration of the transient disturbance is confirmed from d1 (or d2): without a disturbance the d1 coefficients are minimal, of the order of 0.00000 to 0.00001. Modulus-maximum detection [13] is then applied to judge whether a mutation exists and to confirm the disturbance time, with the threshold set to 10 times the average modulus. The disturbance is found to appear between sample points 96 and 100, and its duration ΔT is obtained as

\Delta T = \frac{1}{6400}\times(100-96) = 0.000625\ \mathrm{s}.

From this calculation the duration of the transient signal is extremely short (ΔT
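Continuing the previous sketch, the following Matlab fragment illustrates this modulus-maximum detection step. The 10×-mean threshold follows the text; the rest is an illustrative assumption, not the authors' code.

% Minimal sketch of modulus-maximum disturbance detection on d1
[C,L] = wavedec(x, 5, 'db24');
d1 = wrcoef('d', C, L, 'db24', 1);             % highest-frequency detail band

thr = 10 * mean(abs(d1));                      % threshold: 10 x average modulus
hit = find(abs(d1) > thr);                     % samples flagged as disturbed

if ~isempty(hit)
    dT = (hit(end) - hit(1)) / fs;             % estimated disturbance duration
    fprintf('disturbance: samples %d..%d, duration %.6f s\n', hit(1), hit(end), dT);
end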



B. Harmonic Analysis<br />

Because wavelet analysis is based on multi-resolution signal-processing theory, different scales correspond to different time and frequency resolutions. In other words, the different frequency components of the original signal can be separated by wavelet decomposition, which realizes detection and tracking of the harmonics. In an electric power system the fundamental frequency of the original signal is 50 Hz, and a sufficiently deep decomposition yields a sub-band (0 ~ f_s/2^j) whose frequency range contains 50 Hz; the fundamental wave can therefore be separated from the various harmonics. The harmonic orders actually produced by the network voltage or current are mostly of the form 2k+1. In the simulation, the original current signal is constructed from harmonics of orders 1, 3, 5 and 7:

x(t) = \sin(\omega t) + \tfrac{1}{3}\sin(3\omega t + 3) + \tfrac{1}{5}\sin(5\omega t + 5) + \tfrac{1}{7}\sin(7\omega t + 7), \qquad \omega = 100\pi.

According to the sampling theorem, the sampling frequency of a discrete signal must be at least twice the highest signal frequency. The sampling frequency is chosen as f_s = 50 × 512 = 25600 Hz = 25.6 kHz, and a 7-level decomposition of the original signal is carried out with db24 as the wavelet function.
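A minimal Matlab sketch of this decomposition and of extracting the fundamental via a7 is given below. The signal length and the amplitude/RMS estimators are illustrative assumptions; only the signal definition, the sampling frequency and the 7-level db24 decomposition come from the text.

% Minimal sketch: 7-level db24 decomposition of the 1/3/5/7-harmonic signal
fs = 25600;  t = 0:1/fs:0.2-1/fs;  w = 100*pi;
x  = sin(w*t) + sin(3*w*t + 3)/3 + sin(5*w*t + 5)/5 + sin(7*w*t + 7)/7;

[C,L] = wavedec(x, 7, 'db24');
a7 = wrcoef('a', C, L, 'db24', 7);             % 0..100 Hz band -> fundamental wave

amp    = max(abs(a7));                         % peak amplitude estimate
rmsval = sqrt(mean(a7.^2));                    % effective (RMS) value
fprintf('fundamental amplitude ~ %.4f, RMS ~ %.4f\n', amp, rmsval);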

In the original signal, only the fundamental frequency lies between 0 Hz and 100 Hz. The fundamental component can therefore be obtained by reconstructing a7(k), which at the same time completes the fundamental-wave detection. The results of transforming scales 1 to 7 are the sub-bands d1~d7, each indicating the harmonic content of a continuous frequency band. The signals reconstructed from the different wavelet coefficients are shown in Fig. 12.

Figure 12. Signals reconstructed from the wavelet coefficients at each resolution level.

From the simulation result for sub-band a7, the amplitude of the fundamental component is found to be 0.9931 V and its effective (RMS) value 0.7023 V; the relative error of the effective value is 0.67%. This method can therefore realize the separation of the fundamental wave. In addition, because the db24 wavelet is not symmetric, a phase excursion appears between the detected fundamental component and the original fundamental component, as shown in Fig. 13.


Figure 13. Ideal fundamental wave (dashed line) and detected fundamental wave (solid line).

Through the wavelet transform, the original signal is decomposed into the fundamental wave a7 and the sub-bands d1, d2, d3, d4, d5, d6 and d7. The frequencies of the sub-band signals are respectively 445.64 Hz, 276.72 Hz, 150.00 Hz and 50 Hz, consistent with Table 1. It can be seen that the wavelet transform has good localization in both the time domain and the frequency domain and is therefore suitable for harmonic analysis of electric power systems. Real-time harmonic tracking mainly aims at obtaining the trend of the harmonics, so the relative errors of amplitude and phase are not too demanding. By choosing a suitable wavelet, the wavelet transform can effectively track the changing trend of the harmonics, as shown in Fig. 14.

Figure 14. Ideal harmonics (dashed line) and detected harmonics (solid line).
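Continuing the notation of the earlier sketch (C, L and x from the 7-level decomposition), the following fragment illustrates how each detail sub-band can be reconstructed to track the harmonic content over time; it is an illustrative sketch only.

% Sketch: reconstruct every detail sub-band d1..d7 for harmonic tracking
D = zeros(7, numel(x));
for j = 1:7
    D(j,:) = wrcoef('d', C, L, 'db24', j);     % harmonic content at scale j
end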

C. Voltage Fluctuation and Flickers<br />

A steelmaking arc furnace is the main equipment causing voltage fluctuation and flicker, and the amplitude of each frequency component is inversely proportional to its frequency. We now analyze an arc-furnace voltage flicker signal with the wavelet analysis tools of the Matlab simulation software. The voltage flicker signal caused by a steelmaking arc furnace is assumed to have the expression of formula (8):

u(t) = V_m [1 + m\sin(\Omega t)] \sin(\omega t)    (8)

In formula (8), V_m = 1 V, \Omega = 6\pi rad/s, \omega = 314 rad/s (i.e. 100\pi rad/s, 50 Hz), and m = 0.05 for 0.1875 s ≤ t ≤ 1.875 s. In the Matlab simulation experiment, the initial signal u(t) is first squared to obtain s(t); 8000 sample points of s(t) are then taken at a sampling frequency of 3.2 kHz, the Daubechies wavelet db24 is selected, and the signal is decomposed and reconstructed by multi-layer wavelet decomposition.
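A minimal Matlab sketch of this simulation follows. Gating the modulation index m to zero outside the 0.1875–1.875 s window is an assumption consistent with the stated mutation points; the rest follows the parameters in the text but is not the authors' program.

% Minimal sketch of the arc-furnace flicker simulation of formula (8)
fs = 3200;  t = (0:7999)/fs;                   % 8000 samples, 2.5 s
Vm = 1; Omega = 6*pi; w = 100*pi;
m  = 0.05 * ((t >= 0.1875) & (t <= 1.875));    % flicker active only in this window

u  = Vm*(1 + m.*sin(Omega*t)).*sin(w*t);
s  = u.^2;                                     % squared signal s(t)

[C,L] = wavedec(s, 6, 'db24');
a6 = wrcoef('a', C, L, 'db24', 6);             % envelope of the flicker
d1 = wrcoef('d', C, L, 'db24', 1);             % mutation points show up here
d2 = wrcoef('d', C, L, 'db24', 2);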

The high-frequency details d1 and d2 obtained after wavelet transform and reconstruction are shown in Fig. 15; the positions of their modulus maxima are the mutation-point positions. In the figure the abscissa is time in seconds and the ordinate is voltage amplitude in volts; the same axes apply to the following waveform diagrams.

Figure 15. Reconstructed high-frequency signals d1 and d2.

From Fig. 15, the two mutation points are located at t1 ≈ 0.20 s and t2 ≈ 1.85 s; the modulus maximum at t2 is clearly larger than that at t1, indicating that the singularity degree at t2 is higher than at t1. The simulation shows that the mutation positions and singularity degrees are consistent with the initial signal. The low-frequency approximation a6 after wavelet transform and reconstruction is the envelope of the voltage flicker signal; it is shown in Fig. 16.

Figure 16. Reconstructed low-frequency signal a6.

From Fig. 16, the start time of the voltage flicker is t1 ≈ 0.2 s and the end time is t2 ≈ 1.85 s, consistent with the theoretical values of 0.1875 s and 1.875 s. In Fig. 15 the transient signal appears as a ramp at t1 and as a spike at t2, consistent with the degree of mutation ambiguity noted in conclusion 1. By selecting a suitable wavelet, the orthogonal wavelet decomposition algorithm can therefore effectively identify the positions of catastrophe points and their singularity degrees.

To verify the wavelet reconstruction accuracy for the sub-band signals, we suppose a(t) = 0.1 sin(Ωt) + 0.03 sin(3Ωt) + 0.02 sin(5Ωt); with Ω = 6π rad/s, the flicker signal contains three amplitude-modulation waves at 3 Hz, 9 Hz and 15 Hz, with theoretical amplitudes of 0.1, 0.03 and 0.02 respectively. From the multi-resolution Mallat algorithm, the frequency range of each sub-band is shown in Table III.

TABLE III.
SUB-BAND FREQUENCY RANGES

Scale (j)   Reconstructed signal   Frequency band (Hz)
1           d1                     800~1600
2           d2                     400~800
3           d3                     200~400
4           d4                     100~200
5           d5                     50~100
6           d6                     25~50
6           a6                     0~50
7           d7                     12.5~25
7           a7                     0~25
8           d8                     6.25~12.5
10          d10                    1.5625~3.125


VI. SYSTEM IMPLEMENTATION AND MEASUREMENT RESULTS

To make the human-computer interaction easy to realize, virtual instrument technology is adopted to design the upper-PC voltage flicker observation panel. Following the wavelet decomposition and reconstruction method, LabVIEW and Matlab are combined to realize flicker monitoring by calling the Matlab script node in LabVIEW [14]. By setting a limit frequency for the wavelet decomposition, the signal can be decomposed to a sufficiently large scale; the program then judges the number of wavelet layers automatically and extracts all amplitude-modulation voltage components of the carrier wave.
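One way to judge the number of decomposition layers from a limit frequency is sketched below. The formula used here is an assumption consistent with the dyadic sub-band structure of Table III, not a rule stated in the paper.

% Sketch: choosing the number of decomposition levels from a limit frequency
fs    = 3200;                        % sampling frequency (Hz)
f_lim = 25;                          % desired upper edge of the approximation band (Hz)
N     = ceil(log2(fs / (2*f_lim)));  % smallest N with fs/2^(N+1) <= f_lim
% here N = 6, so the approximation a6 covers 0..25 Hz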

A. Harmonic Monitor<br />

The upper-PC detection system receives the data transmitted by the lower PC and stores them in EXCEL-format files. Taking the harmonic-detection software as an example, the power-network sample data files are read and the system realizes harmonic analysis with the wavelet transform and the FFT algorithm. Because the wavelet analysis toolbox of Matlab [15] is comprehensive, the wavelet analysis is implemented in the Matlab language, and the mixed Matlab + LabVIEW programming technique is adopted so that the Matlab harmonic-analysis program harmonic.m can be called from the LabVIEW program [16][17]. The steady-state voltage component obtained after wavelet decomposition is fed into port X of the function FFT.vi, and the FFT result is obtained at port FFT{X} after the program runs.

To make the X-axis calibration of the waveform display control correspond to the correct scale, a Bundle data-packing node is added: the first item is the initial position x0 of the waveform, set to 0 so that display starts from the first point; the second item is the step length Δx, obtained as the system sampling frequency divided by the number of sample points; the third item is the array formed from the frequency-analysis results. The three items are packed and sent to the display control.

Taking the A-phase voltage as an example, the front-panel measurement result of its harmonic analysis is shown in Fig. 17.

Figure 17. Harmonic analysis front panel

The ideal THD% is calculated as

THD\% = \frac{\sqrt{77.77^2 + 40.44^2 + 31.11^2}}{312}\times 100\% = 29.83\%

From Fig. 17, the measured THD% is 29.82%, i.e. an accurate measurement with only a small monitoring error.
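For reference, the following self-contained Matlab sketch shows how THD% can be computed from a single-sided FFT amplitude spectrum. The synthetic signal, the assignment of the 77.77/40.44/31.11 amplitudes to the 3rd/5th/7th harmonics, and the record length are illustrative assumptions; only the amplitude values and the fundamental of 312 come from the example above.

% Minimal sketch of THD% computation from an FFT amplitude spectrum
fs = 6400;  N = 1280;  t = (0:N-1)/fs;  f0 = 50;   % 10 whole cycles of 50 Hz
v  = 312*sin(2*pi*f0*t) + 77.77*sin(2*pi*3*f0*t) ...
   + 40.44*sin(2*pi*5*f0*t) + 31.11*sin(2*pi*7*f0*t);

A  = 2*abs(fft(v))/N;                % single-sided amplitude spectrum
df = fs/N;                           % frequency resolution (5 Hz here)
k  = @(f) round(f/df) + 1;           % helper: frequency -> FFT bin index

fund   = A(k(f0));
harm   = [A(k(3*f0)), A(k(5*f0)), A(k(7*f0))];
THDpct = sqrt(sum(harm.^2)) / fund * 100;   % approximately 29.8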

B. Voltage Fluctuation and Flicker<br />

We also use LabVIEW combined with Matlab to complete the voltage fluctuation and flicker monitoring. In theory, the three amplitude-modulation components lie in sub-bands d7, d8 and d10 after wavelet transform and reconstruction, so a 10-level reconstruction is used. The simulation results for sub-bands d7, d8 and d10 are shown in Fig. 18.

Figure 18. LabVIEW front panel running result<br />

From Fig. 18, the frequency and amplitude of each reconstructed wavelet sub-band are obtained with the single-frequency measurement function of the LabVIEW function tools; they are listed in Table IV.

TABLE IV.
RECONSTRUCTION RESULTS

Sub-band   Frequency (Hz)   Error (Hz)   Amplitude (V)   Error (V)
d10        3.01             0.01         0.097           -0.003
d8         8.99             -0.01        0.029           -0.001
d7         14.99            -0.01        0.019           -0.001

The three components of the assumed voltage a(t) have theoretical amplitudes of 0.1 V, 0.03 V and 0.02 V respectively. The measured amplitudes of the fundamental, third and fifth modulation components are 0.097 V, 0.029 V and 0.019 V, and their frequencies are 3.01 Hz, 8.99 Hz and 14.99 Hz respectively. The corresponding frequency and amplitude monitoring errors are listed in Table IV, which verifies the accuracy of the algorithm within a small monitoring error.

VII. CONCLUSION<br />

Through VI (virtual instrument) technology, the system changes the traditional, hardware-centered style of electric power parameter measurement and makes full use of the PC's strong capabilities for calculation, display and storage. The system uses voltage/current transformer modules, multiplexing switches and other modules to realize detection of power-network signals. The software for the detection module, data display module, alarm module, report drawing module and so on is implemented with Matlab + LabVIEW 8.5 as the control software [18]. The system hardware has the advantages of a simple structure, high measurement precision, convenient extension and easy realization of power-parameter monitoring. The system software uses the wavelet multi-resolution analysis method, among others; the monitoring error of the harmonic THD% is 0.01%, and the frequency and amplitude errors of the voltage-flicker modulation waves are less than 0.01 Hz and 0.005 V respectively, which realizes high-precision measurement of electric power parameters.

REFERENCES<br />

[1] Pan Tianhong and Sheng Zhanshi, "Design of Power Network Voltage Monitoring Instrument Based on SMS Technique of GSM," Electric Power Automation Equipment, p. 48, Sep. 2005.
[2] "Voltage transformer SPT204A," Beijing: Department of Chunjing SCM Technology Development, http://www.chunjs.com/.
[3] "8- and 16-Channel Analog Multiplexers AD7506/AD7507," Analog Devices Inc., www.analog.com, 2006.
[4] Mi Xiaoyuan, Zhang Yanbin, and Xue Deqing, "Using LabSQL to Access Databases in LabVIEW," Microcomputer Information, pp. 53-54, Oct. 2004.
[5] Sun Xiaoming, "Research and Design of an Electric Power Quality Monitoring System," Shandong University, 2005.
[6] George J. Wakelih, "Electric Power System Harmonics: Basic Principles, Analysis Methods and Filter Design," Beijing: Mechanical Industry Publishing House, 2003.
[7] Changhua Hu, "System Analysis and Design Based on Matlab: Wavelet Analysis," Xi'an: Electronic Science and Technology University, 1999.
[8] Lizhi Cheng and Hanwei Guo, "Wavelet and Discrete Transform: Principles and Practice," Beijing: Tsinghua University, 2005.
[9] Zhang Junxia, Ren Zihui, Yue Mingdao, and Zhang Guoyuan, "Design of Power Quality Monitoring System Based on Virtual Instrument," Instrument Technology, p. 8, Jun. 2008.
[10] Xiaoli G., Jianlan L., Xiao W., Jun D., and Zhenguo S., "Simulation on Voltage Fluctuation Signal Measurement," Proceedings of the Electric Power System and Automation, pp. 41-42, Feb. 2006.
[11] Lin Zhou, Xue Xia, Yunjie Wan, Hai Zhang, and Peng Lei, "Harmonic Detection Based on Wavelet Transform," Transactions of China Electrotechnical Society, p. 68, Sep. 2006.
[12] Lizhi Cheng and Hanwei Guo, "Wavelet and Discrete Transform: Principles and Practice," Beijing: Tsinghua University, May 2005.
[13] Zheng Chanzheng, Mao Zhe, and Xie Zhaohong, "The Grain Depot Temperature Test System Based on nRF905," Microcomputer Information, pp. 284-285, Feb. 2007.
[14] Guo Xianding, "Applying the Wavelet Transform to Inspect the Discontinuous Points of a Signal," Electronic Application, pp. 90-92, Nov. 2006.
[15] George J. Wakelih, "Electric Power System Harmonics: Basic Principles, Analysis Methods and Filter Design," Beijing: Mechanical Industry Publisher, 2003.
[16] Dai Pengfei, Wang Shengkai, Wang Gefang, and Ma Xin, "Monitor Engineering and LabVIEW Application," Electric Industry Publishing House, May 2006.
[17] Chen Xihui and Zhang Yinhong, "LabVIEW Program Design," Beijing: Tsinghua University, July 2007.
[18] Fei Feng and Yang Wansheng, "LabVIEW and Matlab Mixed Programming," Electric Technology Application, pp. 4-6, March 2004.


Aims and Scope.<br />

Call for Papers and Special Issues<br />

Journal of Computers (JCP, ISSN 1796-203X) is a scholarly peer-reviewed international scientific journal published monthly for researchers, developers, technical managers, and educators in the computer field. It provides a high-profile, leading-edge forum for academic researchers, industrial professionals, engineers, consultants, managers, educators and policy makers working in the field to contribute and disseminate innovative new work on all areas of computers.

JCP invites original, previously unpublished, research, survey and tutorial papers, plus case studies and short research notes, on both applied and<br />

theoretical aspects <strong>of</strong> computers. These areas include, but are not limited to, the following:<br />

• Computer Organizations and Architectures<br />

• Operating Systems, S<strong>of</strong>tware Systems, and Communication Protocols<br />

• Real-time Systems, Embedded Systems, and Distributed Systems<br />

• Digital Devices, Computer Components, and Interconnection Networks<br />

• Specification, Design, Prototyping, and Testing Methods and Tools<br />

• Artificial Intelligence, Algorithms, Computational Science<br />

• Performance, Fault Tolerance, Reliability, Security, and Testability<br />

• Case Studies and Experimental and Theoretical Evaluations<br />

• New and Important Applications and Trends<br />

Special Issue Guidelines<br />

Special issues feature specifically aimed and targeted topics <strong>of</strong> interest contributed by authors responding to a particular Call for Papers or by<br />

invitation, edited by guest editor(s). We encourage you to submit proposals for creating special issues in areas that are <strong>of</strong> interest to the <strong>Journal</strong>.<br />

Preference will be given to proposals that cover some unique aspect of the technology and ones that include subjects that are timely and useful to the readers of the Journal. A Special Issue is typically made of 10 to 15 papers, with each paper 8 to 12 pages in length.

The following information should be included as part <strong>of</strong> the proposal:<br />

• Proposed title for the Special Issue<br />

• Description <strong>of</strong> the topic area to be focused upon and justification<br />

• Review process for the selection and rejection <strong>of</strong> papers.<br />

• Name, contact, position, affiliation, and biography <strong>of</strong> the Guest Editor(s)<br />

• List <strong>of</strong> potential reviewers<br />

• Potential authors to the issue<br />

• Tentative time-table for the call for papers and reviews<br />

If a proposal is accepted, the guest editor will be responsible for:<br />

• Preparing the “Call for Papers” to be included on the <strong>Journal</strong>’s Web site.<br />

• Distribution <strong>of</strong> the Call for Papers broadly to various mailing lists and sites.<br />

• Getting submissions, arranging the review process, making decisions, and carrying out all correspondence with the authors. Authors should be informed of the Instructions for Authors.

• Providing us the completed and approved final versions <strong>of</strong> the papers formatted in the <strong>Journal</strong>’s style, together with all authors’ contact<br />

information.<br />

• Writing a one- or two-page introductory editorial to be published in the Special Issue.<br />

Special Issue for a Conference/Workshop<br />

A special issue for a Conference/Workshop is usually released in association with the committee members of the Conference/Workshop, such as the general chairs and/or program chairs, who are appointed as the Guest Editors of the Special Issue. A Special Issue for a Conference/Workshop is typically made of 10 to 15 papers, with each paper 8 to 12 pages in length.

Guest Editors are involved in the following steps in guest-editing a Special Issue based on a Conference/Workshop:<br />

• Selecting a Title for the Special Issue, e.g. “Special Issue: Selected Best Papers <strong>of</strong> XYZ Conference”.<br />

• Sending us a formal “Letter <strong>of</strong> Intent” for the Special Issue.<br />

• Creating a “Call for Papers” for the Special Issue, posting it on the conference web site, and publicizing it to the conference attendees.<br />

Information about the <strong>Journal</strong> and <strong>Academy</strong> <strong>Publisher</strong> can be included in the Call for Papers.<br />

• Establishing criteria for paper selection/rejections. The papers can be nominated based on multiple criteria, e.g. rank in review process plus<br />

the evaluation from the Session Chairs and the feedback from the Conference attendees.<br />

• Selecting and inviting submissions, arranging review process, making decisions, and carrying out all correspondence with the authors.<br />

Authors should be informed of the Author Instructions. Usually, the Proceedings manuscripts should be expanded and enhanced.

• Providing us the completed and approved final versions <strong>of</strong> the papers formatted in the <strong>Journal</strong>’s style, together with all authors’ contact<br />

information.<br />

• Writing a one- or two-page introductory editorial to be published in the Special Issue.<br />

More information is available on the web site at http://www.academypublisher.com/jcp/.


