Journal of Software
ISSN 1796-217X
Volume 7, Number 12, December 2012

Contents

REGULAR PAPERS
Fault Diagnosis Based on Improved Kernel Fisher Discriminant Analysis (p. 2657)
Zhengwei Li, Guojun Tan, and Yuan Li

Intra-Transition Data Dependence (p. 2663)
Shenghui Shi, Qunxiong Zhu, Zhiqiang Geng, and Wenxing Xu

Research of ArtemiS in a Ring-Plate Cycloid Reducer (p. 2671)
Bing Yang, Yan Liu, and Guixin Ye

A New Access Control Model for Manufacturing Grid (p. 2678)
Zhihui Ge and Taoshen Li

Software Design and Development of Chang’E-1 Fault Diagnosis System (p. 2687)
Ai-wu Zheng, Yu-hui Gao, Yong-ping Ma, and Jian-ping Zhou

An Improved Implementation of Preconditioned Conjugate Gradient Method on GPU (p. 2695)
Yechen Gui and Guijuan Zhang

Joint Polarization and Angle Estimation for Robust Multi-Target Localization in Bistatic MIMO Radar (p. 2703)
Hong Jiang, Yu Zhang, Hong-jun Liu, Xiao-hui Zhao, and Chang Liu

The Central DOA Estimation Algorithm Based on Support Vector Regression for Coherently Distributed Source (p. 2710)
Yinghua Han and Jinkuan Wang

Web Services Evaluation Predication Model based on Time Utility (p. 2717)
Guisheng Yin, Xiaohui Cui, Yuxin Dong, and Jianguo Zhang

Speech Emotion Recognition based on Optimized Support Vector Machine (p. 2726)
Bo Yu, Haifeng Li, and Chunying Fang

System Dynamics Modeling and Simulation for Competition and Cooperation of Supply Chain on Dual-Replenishment Policy (p. 2734)
Shidi Miao, Chunxian Teng, and Lu Zhang

Audio Error Concealment Based on Wavelet Decomposition and Reconstruction (p. 2742)
Fei Xiao, Hexin Chen, and Yan Zhao

Reputation Based Academic Evaluation in a Research Platform (p. 2749)
Kun Yu and Jianhong Chen

Analyzing ChIP-seq Data based on Multiple Knowledge Sources for Histone Modification (p. 2755)
Dafeng Chen, Deyu Zhou, and Yuliang Zhuang
A New Semi-supervised Method for Lip Contour Detection (p. 2763)
Kunlun Li, Miao Wang, Ming Liu, Ruining Xin, and Pan Wang

Trusted Software Constitution Model Based on Trust Engine (p. 2771)
Junfeng Tian, Ye Zhu, and Jianlei Feng

Fuzzy Evaluation on Supply Chains’ Overall Performance Based on AHM and M(1,2,3) (p. 2779)
Jing Yang and Hua Jiang

A Novel Combine Forecasting Method for Predicting News Update Time (p. 2787)
Mengmeng Wang, Xianglin Zuo, Wanli Zuo, and Ying Wang

Information-based Study of E-Commerce Website Design Course (p. 2794)
Xinwei Zheng

An Empirical Study on the Correlation and Coordination Degree of Linkage Development between Manufacturing and Logistics (p. 2800)
Rui Zhang and Chunhua Ju

Tourism Crisis Management System Based on Ecological Mechanism (p. 2808)
Xiaohua Hu, Xuan Zhou, Weihui Dai, Zhaozong Zhan, and Xiaoyi Liu

Image Fusion Method based on Non-Subsampled Contourlet Transform (p. 2816)
Hui Liu

Research on Intrusion Detection Model of Heterogeneous Attributes Clustering (p. 2823)
Linquan Xie, Ying Wang, Fei Yu, Chen Xu, and Guangxue Yue

Vector-Distance and Neighborhood Development for High Dimensional Data (p. 2832)
Ping Ling, Xiangsheng Rong, Xiangyang You, and Ming Xu

Evaluation and Comparison on the Techniques of Vertex Chain Codes (p. 2840)
Linghua Li, Yining Liu, Yongkui Liu, and Borut Zalik

Research on Web Query Translation based on Ontology (p. 2849)
Xin Wang and Ying Wang

Data Modeling of Knowledge Rules: An Oracle Prototype (p. 2857)
Rajeev Kaula

OPC (OLE for Process Control) based Calibration System for Embedded Controller (p. 2866)
Ming Cen, Qian Liu, and Yi Yan

WS-mcv: An Efficient Model Driven Methodology for Web Services Composition (p. 2874)
Faycal Bachtarzi, Allaoua Chaoui, and Elhillali Kerkouche

Object Search for the Internet of Things Using Tag-based Location Signatures (p. 2886)
Jung-sing Jwo, Ting-chia Chen, and Mengru Tu

Fractional Order Ship Tracking Correlation Algorithm (p. 2894)
Mingliang Hou and Yuran Liu

A Label Correcting Algorithm for Dynamic Tourist Trip Planning (p. 2899)
Jin Li and Peihua Fu
Fault Diagnosis Based on Improved Kernel Fisher Discriminant Analysis

Zhengwei Li
School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China
Email: zwli@cumt.edu.cn

Zhengwei Li, Guojun Tan, and Yuan Li
School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, China
Email: {zwli, gjtan, Liyuan}@cumt.edu.cn
(Corresponding author: Zhengwei Li, Email: zwli@cumt.edu.cn)
doi: 10.4304/jsw.7.12.2657-2662
Abstract—There are two fundamental problems with Kernel Fisher Discriminant Analysis (KFDA) for nonlinear fault diagnosis. The first is that the classification performance of KFDA between the normal data and the fault data degenerates whenever overlapping samples exist. The second is that the computational cost of the kernel matrix becomes large when the number of training samples increases. Aiming at these two major problems, this paper proposes an improved fault diagnosis method based on KFDA (IKFDA). Two aspects of the method are improved. Firstly, a variable weighting vector is incorporated into KFDA, which improves the discriminant performance. Secondly, when the number of training samples becomes large, a feature vector selection scheme based on a geometrical consideration is used to reduce the computational complexity of KFDA for fault diagnosis. Finally, a Gaussian mixture model (GMM) is applied for fault isolation and diagnosis on the KFDA subspace. Experimental results show that the proposed method outperforms traditional kernel principal component analysis (KPCA) and general KFDA algorithms.

Index Terms—kernel Fisher discriminant analysis, fault diagnosis, variable weighting, feature vector selection, Gaussian mixture model
I. INTRODUCTION

Since the fault diagnosis problem can be considered a multi-class classification problem, pattern recognition methods with good generalization and accuracy have been proposed in recent years. Choi et al. [1] proposed a fault detection and isolation methodology based on a principal component analysis-Gaussian mixture model and a discriminant analysis-Gaussian mixture model. Fisher discriminant analysis (FDA) has been proved to outperform PCA in discriminating different classes, in that PCA aims at reconstruction instead of classification, while FDA seeks directions that are optimal for discrimination [2]. However, FDA is a linear method. In order to handle the nonlinear nature of process data, kernel FDA (KFDA) was proposed by Mika et al. [3]. KFDA performs a nonlinear discriminant analysis through a kernel feature space mapping before the FDA method is applied. Yang et al. [4] made an in-depth analysis of the KFDA algorithm and reformulated it as a two-step procedure: kernel principal component analysis (KPCA) plus FDA. Recently, KFDA has been proved superior to PCA and FDA in fault diagnosis, which makes it a promising approach for process monitoring [5,6]. The basic idea of the kernel trick is that input data are mapped into a kernel feature space by a nonlinear mapping function and then these mapped data are analyzed.
However, the general KFDA method has some shortcomings for fault diagnosis. Firstly, conventional KFDA assumes the same contribution from each variable to the classification, and all variables are used at the same level, so the data sets are masked with irrelevant information. As a result, the classification performance of KFDA for fault diagnosis degenerates when the samples of the normal data and the fault data overlap [7]. Focusing on multi-class classification where the data overlap, this paper introduces a variable-weighting scheme into the general KFDA. Secondly, in the training stage of KFDA, the kernel matrix, whose size is the square of the sample number, must be stored and manipulated. When the sample number becomes large, the eigen-decomposition and the matrix inversion become time-consuming, so reducing the calculation time is very important. In this paper, a feature vector selection scheme based on a geometrical consideration [8] is given to reduce the computational complexity of KFDA when the number of training samples becomes large.

This paper is organized as follows. In Section 2, KFDA is explained. In Section 3, the improved KFDA is proposed. Fault diagnosis results of the above schemes are given by simulations in Section 4. Finally, in Section 5, the main points are summarized.
II. KFDA

The basic idea of KFDA is to solve the problem of linear FDA in an implicit feature space F. However, it is difficult to do so directly because the dimension h of the feature space F can be arbitrarily large or even infinite. In implementation, the implicit feature vectors in F do not need to be computed explicitly; only the inner products of pairs of vectors in F are computed, with a kernel function.
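To make the kernel trick concrete, the short sketch below (an illustrative example, not taken from the paper; the Gaussian kernel width sigma is an assumed parameter) shows how inner products in F are obtained directly from input-space vectors through a kernel function, without ever forming φ(x).

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)) equals the inner
    product <phi(x), phi(y)> in the implicit feature space F."""
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2.0 * sigma ** 2))

# The kernel (Gram) matrix K with K[i, j] = k(x_i, x_j) is all that
# KFDA needs; the mapping phi itself is never evaluated explicitly.
X = np.random.randn(5, 3)                       # 5 samples, m = 3 variables
K = np.array([[gaussian_kernel(a, b) for b in X] for a in X])
print(K.shape)                                  # (5, 5)
```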
Let the dimensionality of the original sample feature space be m and the number of sample classes be C. The total original sample set is X = {X_1, X_2, ..., X_C}, where the j-th (j = 1, 2, ..., C) class X_j contains N_j samples, namely X_j = {X_1^j, X_2^j, ..., X_{N_j}^j}. Here X_1^j, X_2^j, ..., X_{N_j}^j ∈ R^m denote the N_j training samples (column vectors) of class j for KFDA learning. N is the total number of original training samples, so N = \sum_{j=1}^{C} N_j.
By the nonlinear mapping φ, the measured inputs are extended into the hyper-dimensional feature space as follows:

φ : x ∈ R^m → φ(x) ∈ F   (1)

The mapping of a sample x_i is simply written as φ(x_i) = φ_i. The total mapped sample set and the j-th mapped class are given by

φ(X) = {φ(X_1), φ(X_2), ..., φ(X_C)},  φ(X_j) = {φ(X_1^j), φ(X_2^j), ..., φ(X_{N_j}^j)}

The mean of the mapped sample class φ(X_j) is given by m_j = (1/N_j) \sum_{i=1}^{N_j} φ(x_i^j), and the global mean of the total mapped samples is given by m = (1/N) \sum_{j=1}^{C} \sum_{i=1}^{N_j} φ(x_i^j). The within-class scatter matrix S_W in F and the between-class scatter matrix S_B in F are defined as

S_W = (1/N) \sum_{j=1}^{C} \sum_{i=1}^{N_j} (φ(x_i^j) − m_j)(φ(x_i^j) − m_j)^T   (2)

S_B = (1/N) \sum_{j=1}^{C} N_j (m_j − m)(m_j − m)^T   (3)
Performing FDA in F means maximizing the between-class scatter matrix S_B and minimizing the within-class scatter matrix S_W. This is equivalent to maximizing the following function:

J(w) = arg max_w |w^T S_B w| / |w^T S_W w|   (4)

The problem of KFDA is converted into finding the leading eigenvectors of S_W^{-1} S_B. Here, the dimension of S_W^{-1} S_B can be infinite, and it cannot be calculated directly. Since any solution w ∈ F must lie in the span of all the samples in F, there exist coefficients α = {α_i, i = 1, 2, ..., n} such that

w = \sum_{i=1}^{n} α_i φ_i   (5)

Combining with (5), we can write

w^T S_B w = α^T K_B α   (6)

w^T S_W w = α^T K_W α   (7)
Here K_B and K_W are built from the kernel matrix {k(x, y)} of the samples, where k(x, y) is the kernel function:

K_B = \frac{1}{C(C−1)} \sum_{i=1}^{C} \sum_{j=1}^{C} (q_i − q_j)(q_i − q_j)^T,
q_i = \left( \frac{1}{N_i} \sum_{j=1}^{N_i} k(x_1, x_j), \frac{1}{N_i} \sum_{j=1}^{N_i} k(x_2, x_j), ..., \frac{1}{N_i} \sum_{j=1}^{N_i} k(x_N, x_j) \right)^T   (8)

K_W = \frac{1}{C} \sum_{i=1}^{C} \frac{1}{N_i} \sum_{j=1}^{N_i} (p_j − p_i)(p_j − p_i)^T,
p_j = \left( k(x_1, x_j), k(x_2, x_j), ..., k(x_N, x_j) \right)^T   (9)

So the solution of Eq. (4) can be obtained by maximizing

J(α) = arg max_α |α^T K_B α| / |α^T K_W α|   (10)
Then the problem of KFDA is converted into finding the leading eigenvectors of K_W^{-1} K_B. Let the column vectors β_i (i = 1, 2, ..., N) be the eigenvectors of K_W^{-1} K_B. For a new column-vector sample x_new, the mapping into the feature space is φ(x_new). The projection of x_new onto the eigenvectors β_i = (β_{i1}, β_{i2}, ..., β_{iN}) (i = 1, 2, ..., N) is t = (t_1, t_2, ..., t_N)^T, which is also called the KFDA-transformed feature vector:

t_i = (w · φ(x_new)) = \sum_{j=1}^{N} β_{ij} k(x_new, x_j),  i = 1, 2, ..., N   (11)
When KFDA is used for feature extraction, a problem arises in that the matrix K_W cannot be guaranteed to be nonsingular. Several techniques have been proposed to handle this problem in numerical computation. In this paper, when the matrix K_W is singular, it is replaced with K_W + μI, where μ is a very small constant and I is an identity matrix [3,9,10].
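The procedure of Eqs. (5)-(11) can be summarized in a short numerical sketch. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian kernel and uses the common class-mean construction of the between- and within-class kernel scatter matrices rather than the pairwise form of Eqs. (8)-(9); the regularization K_W + μI from above is included.

```python
import numpy as np
from scipy.spatial.distance import cdist

def kfda_fit(X, labels, sigma=1.0, mu=1e-6):
    """Simplified KFDA: returns training data, kernel width and the
    leading eigenvectors beta of (K_W + mu*I)^{-1} K_B (cf. Eqs. 5-10)."""
    labels = np.asarray(labels)
    K = np.exp(-cdist(X, X, "sqeuclidean") / (2 * sigma ** 2))  # Gram matrix
    n = len(labels)
    classes = np.unique(labels)
    q_all = K.mean(axis=1, keepdims=True)                # global mean column
    K_B = np.zeros((n, n))
    K_W = np.zeros((n, n))
    for c in classes:
        idx = np.where(labels == c)[0]
        q_c = K[:, idx].mean(axis=1, keepdims=True)      # class mean column
        K_B += len(idx) * (q_c - q_all) @ (q_c - q_all).T
        d = K[:, idx] - q_c                              # within-class deviations
        K_W += d @ d.T
    K_W += mu * np.eye(n)                                # regularize singular K_W
    evals, evecs = np.linalg.eig(np.linalg.solve(K_W, K_B))
    order = np.argsort(-evals.real)
    beta = evecs[:, order[:len(classes) - 1]].real       # leading eigenvectors
    return X, sigma, beta

def kfda_transform(model, x_new):
    """Eq. (11): t_i = sum_j beta_ij * k(x_new, x_j)."""
    X, sigma, beta = model
    k_new = np.exp(-np.sum((X - x_new) ** 2, axis=1) / (2 * sigma ** 2))
    return beta.T @ k_new
```

A new sample would then be classified in the projected subspace, e.g. with a Gaussian mixture model as the paper proposes.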
III. IMPROVED KFDA (IKFDA)

A. Variable Weighting Vector

For fault identification, variable weighting determines the key variables responsible for a fault from data sets masked with much irrelevant information, by maximizing the separation between the normal data and each fault data set. By making full use of the normal data information, the weight vector of each fault can be obtained. After the fault data are weighted by the corresponding weight vectors, KFDA is performed on these weighted fault data, which offers important supplemental classification information to KFDA.

Fault diagnosis is often characterized by large scale and nonlinear behavior. When all variables are used at the same level, the data sets are masked with irrelevant information, which causes the classification performance to degenerate. By correctly representing each variable's contribution to a specific fault, the weight vector helps to extract discriminative features from overlapping fault data, which effectively improves the multi-class classification performance of KFDA.
FDA is a well-known linear technique for dimensionality reduction and pattern classification. It determines a set of Fisher optimal discriminant vectors that maximize the scatter between the classes while minimizing the scatter within each class. Different from traditional selection methods, in which the deleted and the selected variables are essentially weighted with the discrete values 0 and 1, respectively, variable weighting weights the variables with continuous non-negative values [10]. Pair-wise FDA is performed on the normal data and each class of fault data to obtain the Fisher optimal discriminant vector, here named the fault direction. Each fault direction associated with a specific fault optimally separates that fault data from the normal data. Taking into account the nonlinear characteristics of most industrial processes, we investigate nonlinear pair-wise variable weighting. The weight vector of each fault maximizes the distance between the normal data and that class of fault data.
The concept of kernel target alignment was proposed by Cristianini et al. to evaluate the similarity between two kernel matrices. Since kernel-based learning methods are built around the kernel matrix, its properties reflect the relative positions of the points in the feature space. For the two-class classification problem (the normal class and the fault class) x ∈ x^{0,f} → {−1, 1}, consider the kernel target matrix yy^T, where y is the label vector assigning −1 to the n_0 normal samples and +1 to the n_f fault samples. Then the kernel target alignment is given by

A = \frac{\langle K, yy^T \rangle_F}{\sqrt{\langle K, K \rangle_F \langle yy^T, yy^T \rangle_F}} = \frac{y^T K y}{n \, \|K\|_F}   (12)

Thus, Eq. (12) can be rewritten as

max_{w_f} A(w_f) = \frac{y^T K_{w_f} y}{n \, \|K_{w_f}\|_F},  s.t.  w_f(i) ≥ 0,  i = 1, ..., m   (13)
To obtain the weight vector w_f, the width σ of the Gaussian kernel first needs to be determined by cross-validation with pair-wise KFDA on x^{0,f}, minimizing the total classification error rate, and then the optimization problem of Eq. (13) is solved. The process is repeated until all c fault classes have been analyzed and all weight vectors (w_1, ..., w_f, ..., w_c) are obtained [11-14].

For the Gaussian kernel function, any element of the kernel matrix K_{w_f} of the weighted data x̃^{0,f} is given by

K_{w_f}(i, j) = \exp\left( -\frac{\|x̃_i^{0,f} − x̃_j^{0,f}\|^2}{2σ^2} \right) = \exp\left( -\frac{\|\mathrm{diag}(w_f) x_i^{0,f} − \mathrm{diag}(w_f) x_j^{0,f}\|^2}{2σ^2} \right),  i, j = 1, ..., (n_0 + n_f)   (14)

where x_i^{0,f} (or x̃_i^{0,f}) is the i-th row of x^{0,f} (or x̃^{0,f}).
A natural choice of variable weighting criterion is the Rayleigh quotient of KFDA itself, in which case the variable weighting becomes the following optimization problem:

max_{w_f} J(w_f) = \frac{α^T K_B^{w_f} α}{α^T K_W^{w_f} α},  s.t.  w_f(i) ≥ 0,  i = 1, ..., m   (15)

where K_B^{w_f} and K_W^{w_f} are the between-class and within-class kernel matrices computed from the weighted data. In the above equation, the Rayleigh quotient depends not only on the kernel matrix K_{w_f} but also on the optimal discriminant vector α, so KFDA would have to be re-performed to obtain α during the optimization procedure, which would be computationally expensive. Instead of the Rayleigh quotient of KFDA, the kernel target alignment, which depends only on the kernel matrix K_{w_f}, is therefore selected as the variable weighting criterion, and the weight vector is obtained by solving the optimization problem of Eq. (13).
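As an illustration of the alignment criterion of Eqs. (12)-(14), the sketch below evaluates A(w_f) for a given weight vector; an outer optimizer (e.g. a constrained gradient or evolutionary search, not specified here) would then maximize it subject to w_f(i) ≥ 0. The kernel width sigma and the data arrays are assumptions for illustration only.

```python
import numpy as np
from scipy.spatial.distance import cdist

def alignment(w_f, X0, Xf, sigma):
    """Kernel target alignment A = y^T K y / (n * ||K||_F) for the
    variable-weighted Gaussian kernel of Eq. (14)."""
    X = np.vstack([X0, Xf]) * w_f                 # diag(w_f) applied to each sample
    K = np.exp(-cdist(X, X, "sqeuclidean") / (2 * sigma ** 2))
    y = np.concatenate([-np.ones(len(X0)), np.ones(len(Xf))])
    n = len(y)
    return (y @ K @ y) / (n * np.linalg.norm(K, "fro"))
```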
B. Feature Vector Selection

In this paper, a preprocessing scheme called feature vector selection (FVS) is adopted to reduce the computational complexity of KFDA while preserving the geometrical structure of the whole data set in F.

In Eq. (5), all the training samples in F, φ_i (i = 1, 2, ..., n), are used to represent the eigenvector w. In practice, the dimensionality of the subspace spanned by the φ_i is equal to the rank of the kernel matrix K, and the rank of K is often less than n, that is, rank(K) < n. The FVS scheme looks for vectors that are sufficient to express all of the data in F as a linear combination of those selected vectors in F. Suppose a basis of feature vectors φ_{b_i} (i = 1, 2, ..., rank(K); b_i ∈ [1, n]) is known; then Eq. (5) can be rewritten as follows:

w = \sum_{i=1}^{rank(K)} α_{b_i} φ_{b_i}   (16)

Since rank(K) < n, the expansion in Eq. (16) involves fewer terms than Eq. (5), which reduces the computational cost of KFDA.
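A common geometric criterion for feature vector selection (in the spirit of the scheme referenced here, though the paper's exact selection rule is not reproduced) checks how well a candidate sample can be reconstructed in F from the vectors already selected; samples with a large residual are added to the basis. The greedy loop and the tolerance below are assumptions for illustration.

```python
import numpy as np

def select_feature_vectors(K, tol=1e-3):
    """Greedy FVS on a precomputed kernel matrix K (n x n).
    A candidate i is added if its projection onto the span of the
    selected vectors in F leaves a residual larger than tol."""
    n = K.shape[0]
    selected = [0]
    for i in range(1, n):
        K_ss = K[np.ix_(selected, selected)]
        K_si = K[selected, i]
        # squared norm of the part of phi(x_i) not explained by the basis,
        # normalized by k(x_i, x_i)
        coef = np.linalg.solve(K_ss + 1e-10 * np.eye(len(selected)), K_si)
        residual = 1.0 - (K_si @ coef) / K[i, i]
        if residual > tol:
            selected.append(i)
    return selected
```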
Simulation results and discussion of slight and serious imbalance problems are then presented.

A. Tennessee Eastman (TE) Process

The Tennessee Eastman (TE) Industrial Challenge Problem was designed by the Eastman Chemical Company to provide a realistic industrial process. The process consists of five major unit operations: a product condenser, a recycle compressor, a reactor, a product stripper, and a vapor-liquid separator. As a well-known benchmark simulation, it has been widely used to compare and evaluate the performance of various monitoring approaches.

There are 41 measured variables and 12 manipulated variables in the TE process. In this paper, the 33 selected monitoring variables include 22 measured variables and 11 manipulated variables, as shown in Table I. There are 22 process patterns (1 normal operating condition and 21 faulty conditions) in the TE process. In order to design the simulation study reasonably, the TE process simulator is used to generate the normal data and five classes of fault data, i.e. faults 4, 8, 13, 14 and 19. These five faults, covering all the fault types in the TE process, are divided into two simulation studies.
Each pattern combination includes the normal pattern and three different faults. The selected process patterns for the simulation study are listed in Table II. For each pattern there are 960 observations, and all faults are introduced into the process at sample 160. The simulation data are separated into two parts: the training and the testing dataset. The training and testing data amounts of these two cases are also listed in Table II.

TABLE II. SELECTED PROCESS PATTERNS FOR SIMULATION

Pattern | Fault description | Type | Number of training (testing) data
Normal | - | - | 400 (300)
Fault 3 | Reactor cooling water inlet temperature | Step | 200 (300)
Fault 7 | A, B, C feed composition | Random variation | 250 (300)
Fault 13 | Reactor cooling water valve | Sticking | 300 (300)

TABLE I. MONITORED VARIABLES IN THE TE PROCESS

No | Measured variable | No | Measured variable | No | Manipulated variable
1 | A feed | 11 | Product separator temperature | 21 | D feed flow valve
2 | D feed | 12 | Product separator level | 22 | E feed flow valve
3 | E feed | 13 | Product separator pressure | 23 | A feed flow valve
4 | Total feed | 14 | Product separator underflow | 24 | Total feed flow valve
5 | Recycle flow | 15 | Stripper level | 25 | Compressor recycle valve
6 | Reactor feed rate | 16 | Stripper pressure | 26 | Purge valve
7 | Reactor pressure | 17 | Stripper underflow | 27 | Separator pot liquid flow valve
8 | Reactor level | 18 | Stripper temperature | 28 | Stripper liquid product flow valve
9 | Reactor temperature | 19 | Stripper steam flow | 29 | Stripper steam valve
10 | Purge rate | 20 | Compressor work | 30 | Reactor cooling water flow
B. Simulation Results and Discussion

In both cases, the radial basis kernel function is selected for KPCA, and the width of the Gaussian kernel is chosen as 500m, where m is the dimension of the inputs. KPCA-based feature extraction is applied with 85% of the eigenvalue variation, and the confidence limit of the Gaussian distribution is 98%.
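The KPCA feature extraction used for comparison can be sketched as follows. This is a minimal illustration, not the authors' exact implementation; the kernel-width convention k(x, y) = exp(-||x - y||^2 / (500·m)) is an assumption based on the description above, and the 85% eigenvalue threshold selects the number of retained components.

```python
import numpy as np
from scipy.spatial.distance import cdist

def kpca_features(X, X_new, variance=0.85):
    """RBF-kernel PCA keeping enough components to cover `variance` of the
    eigenvalue sum; kernel width taken as 500*m (m = input dimension)."""
    width = 500.0 * X.shape[1]
    K = np.exp(-cdist(X, X, "sqeuclidean") / width)
    n = len(K)
    J = np.eye(n) - np.ones((n, n)) / n                  # centering matrix
    Kc = J @ K @ J
    evals, evecs = np.linalg.eigh(Kc)
    evals, evecs = np.clip(evals[::-1], 0, None), evecs[:, ::-1]
    k = int(np.searchsorted(np.cumsum(evals) / evals.sum(), variance)) + 1
    alphas = evecs[:, :k] / np.sqrt(np.maximum(evals[:k], 1e-12))
    K_new = np.exp(-cdist(X_new, X, "sqeuclidean") / width)
    K_new_c = (K_new - K_new.mean(axis=1, keepdims=True)
               - K.mean(axis=0) + K.mean())              # center the test kernel
    return K_new_c @ alphas                              # KPCA scores of X_new
```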
First, the IKFDA-based projections onto the first three dimensions of the four patterns (normal, fault 3, fault 7 and fault 14) are examined. In the first-to-third dimension projection, fault 3 overlaps with fault 7. Since rare classes have less impact on accuracy than common classes, the pattern determination is biased toward the majority class.
Fault diagnosis results are listed in Table III. We can find that 28% of the fault 7 data are misclassified as the normal pattern by general KFDA. Such a high error rate will endanger the process and may even result in serious accidents. Considering the data amount of each pattern, the normal pattern is the majority class. The average diagnosis rates of general KFDA are listed in Table III.
Comparing the performance of IKFDA-GMM with general KFDA and KPCA, the diagnosis rate of the normal pattern reaches 99%, but the diagnosis performances of faults 7 and 14 are even a little poorer. The diagnosis performances of all three fault patterns are improved compared with those of general KFDA and KPCA. The diagnosis rate of the normal pattern descends from 99% to 66%. For example, the rate of misclassifying fault 7 as the normal pattern falls from 47% to 2%. It is obvious that the IKFDA is unstable. Compared with general KFDA, the performances of the proposed IKFDA approaches are better in some cases.

TABLE III. FAULT DIAGNOSIS RESULTS (%)

Methods | Normal | Fault 3 | Fault 7 | Fault 14
KPCA | 66 | 71 | 53 | 62
KFDA | 79 | 76 | 68 | 70
IKFDA-GMM | 99 | 98 | 98 | 97
VI. CONCLUSIONS

In recent years, KFDA has been utilized directly for nonlinear process fault diagnosis, and it has been proven to outperform the conventional FDA method. This paper focuses on improving KFDA for fault diagnosis in two respects, which provides effective tools for fault diagnosis of nonlinear multivariate processes.

Firstly, the classification performance of KFDA may degenerate as long as overlapping samples exist. The nonlinear variable weighting finds the weight vector of each fault by maximizing the variable weighting criterion. Each weight vector maximizes the separation between the normal data and one class of fault data. By weighting the fault data with the corresponding weight vectors, the proposed method extracts discriminative features from overlapping fault data more effectively than traditional KFDA.

Secondly, a feature vector selection scheme based on a geometrical consideration is adopted for sample vector selection before the KFDA calculation. Simulations conducted on the TE process have shown that IKFDA based on feature vector selection has nearly the same fault recognition rates as the KFDA method. Moreover, IKFDA based on feature vector selection can reduce the computational complexity significantly, especially when the training sample set is very large.

Finally, a Gaussian mixture model (GMM) is applied for fault isolation and diagnosis on the KFDA subspace. Experimental results show that the proposed method outperforms the traditional kernel principal component analysis (KPCA) and general KFDA algorithms.
ACKNOWLEDGMENT

This work was supported by the fund project for major transformation of scientific and technological achievements from Jiangsu province (BA2008029).
REFERENCES

[1] S. W. Choi, J. H. Park, and I. B. Lee, "Process monitoring using a Gaussian mixture model via principal component analysis and discriminant analysis", Computers and Chemical Engineering, vol. 28, no. 8, pp. 1377-1387, 2004.
[2] L. H. Chiang, E. L. Russell, and R. D. Braatz, "Fault diagnosis in chemical processes using Fisher discriminant analysis, discriminant partial least squares, and principal component analysis", Chemometrics and Intelligent Laboratory Systems, vol. 50, pp. 243-252, April 2000.
[3] S. Mika, "Fisher discriminant analysis with kernels", in Proceedings of the IEEE Neural Networks for Signal Processing Workshop, Madison, WI, USA, 1999, pp. 41-51.
[4] J. Yang, Z. Jin, J. Y. Yang, D. Zhang, and A. F. Frangi, "Essence of kernel Fisher discriminant: KPCA plus LDA", Pattern Recognition, vol. 37, no. 10, pp. 2097-2100, 2004.
[5] H. W. Cho, "Nonlinear feature extraction and classification of multivariate process data in kernel feature space", Expert Systems with Applications, vol. 32, no. 2, pp. 534-542, 2007.
[6] H. W. Cho, "Identification of contributing variables using kernel-based discriminant modeling and reconstruction", Expert Systems with Applications, vol. 33, no. 2, pp. 274-285, 2007.
[7] G. Baudat and F. Anouar, "Kernel-based methods and function approximation", in Proceedings of the International Conference on Neural Networks, Washington, DC, 2001, pp. 1244-1249.
[8] P. L. Cui, J. H. Li, and G. Z. Wang, "Improved kernel principal component analysis for fault detection", Expert Systems with Applications, vol. 34, no. 2, pp. 1210-1219, 2008.
[9] N. Louw and S. J. Steel, "A review of kernel Fisher discriminant analysis for statistical classification", South African Statistical Journal, vol. 39, pp. 1-21, 2005.
[10] D.-S. Cao, Y.-Z. Liang, and Q.-S. Xu, "Exploring nonlinear relationships in chemical data using kernel-based methods", Chemometrics and Intelligent Laboratory Systems, vol. 101, pp. 106-115, 2011.
[11] Z.-B. Zhu and Z.-H. Song, "A novel fault diagnosis system using pattern classification on kernel FDA subspace", Expert Systems with Applications, vol. 38, pp. 6895-6905, 2011.
[12] A. Widodo and B.-S. Yang, "Application of nonlinear feature extraction and support vector machines for fault diagnosis of induction motors", Expert Systems with Applications, vol. 33, pp. 241-250, 2007.
[13] G. Dai, D.-Y. Yeung, and Y.-T. Qian, "Face recognition using a kernel fractional-step discriminant analysis algorithm", Pattern Recognition, vol. 40, pp. 229-243, 2007.
[14] Z.-B. Zhu and Z.-H. Song, "Fault diagnosis based on imbalance modified kernel Fisher discriminant analysis", Chemical Engineering Research and Design, vol. 88, pp. 936-951, 2010.
[15] J.-D. Shao, G. Rong, and J. M. Lee, "Learning a data-dependent kernel function for KPCA-based nonlinear process monitoring", Chemical Engineering Research and Design, vol. 87, pp. 1471-1480, 2009.
[16] H.-W. Cho, "Nonlinear feature extraction and classification of multivariate process data in kernel feature space", Expert Systems with Applications, vol. 32, pp. 534-542, 2007.
[17] C. H. Park and H. Park, "Nonlinear feature extraction based on centroids and kernel functions", Pattern Recognition, vol. 37, pp. 801-810, 2004.
[18] N. Louw and S. J. Steel, "Variable selection in kernel Fisher discriminant analysis by means of recursive feature elimination", Computational Statistics & Data Analysis, vol. 51, pp. 2043-2055, 2006.
Zhengwei Li is an associate professor at the School of Computer Science and Engineering, China University of Mining and Technology. He received his B.Sc. and M.Sc. degrees in computer science from the Department of Computer Science & Technology, China University of Mining and Technology, in 1999 and 2005, respectively. His research interests mainly include machine learning and fault diagnosis.
Intra-Transition Data Dependence

Shenghui Shi
College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
Email: shish@mail.buct.edu.cn

Qunxiong Zhu*, Zhiqiang Geng, and Wenxing Xu
College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
Email: zhuqx@mail.buct.edu.cn, gengzhiqiang@mail.buct.edu.cn, estellaxu@163.com
doi: 10.4304/jsw.7.12.2663-2670
Abstract—Currently, the two main approaches to data dependence of the EFSM (Extended Finite State Machine) do not refine intra-transition data dependence; instead, they consider that every definition variable in a transition depends on all the use variables (including condition variables). For the data dependence of a specific definition variable, not only the relevant use variables but also the irrelevant use variables (including condition variables) are included, which obviously causes redundancy. Further analysis built on this undoubtedly brings hidden danger to the dependence analysis of the entire system and to practical application. With the idea of introducing the program dependence graph into the EFSM, this paper studies intra-transition data dependence and describes the data dependence between every intra-transition definition variable and the use and condition variables that influence it or are influenced by it. Thus, irrelevant dependence variables are removed to reduce redundancies and errors. Theoretical and experimental analyses are also conducted.

Index Terms—Extended Finite State Machine (EFSM); Dependence Analysis; Intra-Transition Data Dependence (IaTDD)
I. INTRODUCTION

With the gradual expansion of computer software applications, the size and complexity of computer software are growing rapidly, which leads to increasing cost and difficulty of software analysis, understanding, testing, maintenance, evolution, and other aspects of software engineering. Software slicing, as an "energy saving" tool for software systems, therefore plays an important role. In 1979, M. Weiser first proposed the basic idea of program slicing [1] to achieve program reduction. After thirty years of development, program slicing has been widely recognized and applied. From the point of view of the software engineering development cycle, program slicing has penetrated from the coding and testing layers into the requirement and design layers. In the 1990s, Heimdahl et al. [2,3] proposed model-based slicing, which started model slicing research on the FSM (Finite State Machine). In the same period, Savage, P. and Dssouli R. proposed model slicing based on the EFSM [4,5]. In 2003, Korel [6] normalized the EFSM model structure by specifying the composition of the EFSM and its transitions,
developed an EFSM slicing tool, and proposed irrelevant control dependence. However, the EFSM model must be built on the premise that there is a termination node. Korel applied the methods of program slicing but did not conduct an in-depth study of the differences between program slicing and the EFSM structure. Nevertheless, this method made the EFSM model clearer and more specific, which lays a solid theoretical basis for the present extensive study of EFSM slicing technology. With further research and the development of programming languages, the limitations of applying program slicing methods to the EFSM were gradually exposed. Scholars have devoted more attention to control dependence, and several solutions have been put forward, but there are also many limitations in the corresponding data dependence. Due to the limitations of the requirements of the study objects and of their own structure, the data dependence of the EFSM has not yet been well solved.

Currently, there are two main methods for the implementation of EFSM data dependence. The first one is the traditional EFSM data dependence proposed by Korel [6] (hereinafter referred to as the K method). This commonly used method applies the data dependence methods of program slicing, realizing EFSM data dependence with a traversal algorithm that marks visited nodes. The other is the transitive dependence function method proposed by the Chinese scholars Miao Li et al. [7] (hereinafter referred to as the M method), which analyzes the problem that the data dependence of the EFSM may be intransitive.

The two main methods ignore the specific data dependence of intra-transition variables and consider that any definition variable depends on all the use variables (including condition variables) in that transition. Actually, a certain definition variable is associated only with the relevant use variables. If all use variables are blindly treated as related to a certain definition variable, the irrelevant use variables are certainly included in the data dependence, which results in redundancy. This paper presents an intra-transition data dependence method, which is verified by experiments that analyze the degree of redundancy reduction.

The paper is organized as follows: Section II provides an overview of intra-transition data dependence. Section III analyzes the differences between the traditional intra-transition data dependence method and the method put forward in this paper. Section IV compares the method in this paper with the traditional method by means of
experiment and analyzes the results. Finally, future research is discussed.
II. INTRA-TRANSITION DATA DEPENDENCE

Intra-transition data dependence yields, for each definition variable, its relation to the use variable set and the condition variable set.

Definition 1: Data Dependence between the Variables (DDV) [8]

DDV is an intra-transition relation between a definition variable and the use variable set and condition variable set that influence it, which can be expressed as follows:

DDV: (vdi, {Vui, Vci})

where all the variables and variable sets are in one transition. Vd is the definition variable set in the action or event, vdi is a definition variable in the action or event, vdi ∈ Vd. Vui is the set of use variables influencing the value of vdi in the action, which can be null. Vci is the set of condition variables influencing the value of vdi in the condition, which can be null. Here d ∈ I, u ∈ I, c ∈ I, where I represents the integers. (vdi, {Vui, Vci}) indicates that the value of vdi is data dependent on Vui and Vci, or rather that Vui and Vci influence the value of vdi. {Vu−Vui} is called vdi's irrelevant use variable data dependence set, {Vc−Vci} is called vdi's irrelevant condition variable data dependence set, and {Vu−Vui} ∪ {Vc−Vci} is vdi's irrelevant data dependence set. This article deletes the set of variables unrelated to vdi in order to reduce redundancy.
Definition 2: Intra-Transition Data Dependence (IaTDD) [8]

The data dependence of an intra-transition is the set of data dependences composed of each definition variable together with its use variable set and condition variable set:

IaTDD T: {(vd1, {Vu1, Vc1}), (vd2, {Vu2, Vc2}), ..., (vdi, {Vui, Vci}), ...}

where vdi is a definition variable in the action or an input variable in the event, vdi ∈ Vd. vd1, vd2, ..., vdi, ... constitute the universal set of definition variables in the action or event, and they are pairwise distinct: vdi ≠ vdj for i ≠ j. Vui is the set of use variables influencing the value of vdi in the action, which can be null. Vci is the set of condition variables influencing the value of vdi in the condition, which can be null. Vd ⊂ VT, Vu ⊂ VT, Vc ⊂ VT, where VT is the variable set of the transition; d ∈ I, u ∈ I, c ∈ I, i ∈ I. IaTDD indicates all data dependences between variables in a transition.
Definition variables include the input variables in the event and the input variables, definition variables and output variables in the action. The specific dependences are as follows:

1. If a variable is an input variable in the event of transition T, its data depends on the empty set. The complete set of condition variables in the condition and of use variables in the action is the irrelevant set of this dependent variable. For example, for EventName(vin1, vin2, ...), IaTDD(T, vini): (vini, { }), i ∈ I, where I is the integers. At this point, Vu ∪ Vc has nothing to do with the dependent variable set of the variable vini, so Vu ∪ Vc is removed from the dependent variable set of vini to reduce redundancy. An event is like the definition of a function in a computer programming language; the input variables are the formal parameters of the function, and the condition and action are like the body of the function. It is also possible that the event and the action are executed only if the condition is true; in that case the input variable's data depends on the condition variables. This article focuses on the former situation. For example, the data dependence of the variable pin in the event card(pin) is described as IaTDD(T, pin): (pin, { }); Vu ∪ Vc in the condition and action sequences has nothing to do with the dependent variable set of pin.
2. If a variable is an input variable in the action, its data depends on the empty set or on the set of condition variables. The complete set of use variables is its irrelevant dependent variable set. If no condition variable exists in T, its data depends on the empty set; otherwise, if condition variables exist, it depends on the condition variables. In both cases the complete set of use variables is the irrelevant dependent variable set. For example, for Input(vin1, vin2, ...), IaTDD(T, vini): (vini, {Vc}), i ∈ I, where I is the integers. At this point, Vu has nothing to do with the dependent variable set of the variable vini, so Vu is removed from the dependent variable set of vini to reduce redundancy. For example, if the condition is empty, the data dependence of the variable p in the input statement Input(p) is described as IaTDD(T, p): (p, { }). If the condition is not empty and is, say, attempts==3, the data dependence of p in Input(p) is described as IaTDD(T, p): (p, {attempts}).
3. If a variable is an output variable in the action, it depends on the use variables appearing in the output expression and, if present, on the condition variables. The remaining use variables Vu−Vout have nothing to do with the dependent variable set of vout, so Vu−Vout is removed from the dependent variable set of vout to reduce redundancy. For example, if the condition is empty, the data dependence of the variable p in the output statement Output(p) is described as IaTDD(T, p): (p, {p}). If the condition is not empty and is attempts==3, the data dependence of p in Output(p) is described as IaTDD(T, p): (p, {p, attempts}). In both cases Vu−{p} has nothing to do with p.
In dealing with condition variables, the common practice is to treat condition variables as use variables. In order to facilitate follow-up studies and lay a good foundation for dynamic and conditional slicing, our research separates the condition variables from the set of use variables.
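To make the difference concrete before the formal comparison in the next section, the sketch below models one transition and derives both the K&M-style dependence (every defined variable depends on all use and condition variables) and the IaTDD-style dependence (only the variables appearing in the defining expression, plus the condition variables where the definitions require them). The transition structure and the example values are illustrative assumptions, not taken from the paper's EFSM models.

```python
# A tiny illustration of K&M vs. IaTDD dependence for one EFSM transition.
# Transition T: event card(pin), condition attempts == 3,
# actions: balance = balance + amount; Output(balance)

transition = {
    "event_inputs": ["pin"],                     # defined by the event, no deps
    "condition_vars": ["attempts"],
    "actions": {                                 # defined var -> vars used in its expression
        "balance": ["balance", "amount"],
    },
    "outputs": {"balance_out": ["balance"]},
}

def km_dependence(t):
    """K&M: every defined variable depends on ALL use and condition variables."""
    all_uses = sorted({v for uses in t["actions"].values() for v in uses} |
                      {v for uses in t["outputs"].values() for v in uses})
    deps = {}
    for d in list(t["event_inputs"]) + list(t["actions"]) + list(t["outputs"]):
        deps[d] = (all_uses, t["condition_vars"])
    return deps

def iatdd_dependence(t):
    """IaTDD: event inputs depend on nothing; action/output variables depend
    only on the uses in their own expression plus the condition variables."""
    deps = {d: ([], []) for d in t["event_inputs"]}
    for d, uses in list(t["actions"].items()) + list(t["outputs"].items()):
        deps[d] = (sorted(uses), t["condition_vars"])
    return deps

print(km_dependence(transition))
print(iatdd_dependence(transition))
```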
III. DIFFERENT INTRA-TRANSITION DATA DEPENDENCE METHODS

For IaTDD and the traditional methods (here the K and M methods, hereinafter referred to as the K&M method), the data dependence is as follows:

K&M T: {(vd1, {Vu, Vc}), (vd2, {Vu, Vc}), ..., (vdi, {Vu, Vc}), ...}

IaTDD T: {(vd1, {Vu1, Vc1}), (vd2, {Vu2, Vc2}), ..., (vdi, {Vui, Vci}), ...}

where ∪_{i=1,...,n} {vdi} = Vd, vdi ∈ Vd, and Vd is the definition variable set of the transition T. vdi is a definition variable, Vu is the use variable set, and Vc is the condition variable set. In the K&M method, a definition variable vdi is dependent on the complete set of use variables Vu and the complete set of condition variables Vc, described as (vdi, {Vu, Vc}). In the IaTDD method, a definition variable vdi is dependent on the use variable set Vui and the condition variable set Vci that influence the changes of vdi, each of which can be empty, described as (vdi, {Vui, Vci}), Vui ⊂ Vu, Vci ⊂ Vc. Thus, for the same definition variable, the use variable set and condition variable set on which it depends in IaTDD form a subset of those in the K&M method. That is, both methods have the same number of intra-transition definition variables, but the K&M method obtains more dependences than IaTDD, and in fact those extra variables do not affect the definition variables, which results in redundancy.

∀ vdi ∈ Vd, vdi(Method="K&M") = vdi(Method="IaTDD"), ∴ Vd(Method="K&M") = Vd(Method="IaTDD").

∀ (vdi, {Vu, Vc}) ∈ K&M T and (vdi, {Vui, Vci}) ∈ IaTDD T, Vui is the use variable set related to vdi, i.e. the variables that affect vdi or are affected by vdi, not including the use variables and condition variables irrelevant to vdi, ∴ Vui ⊂ Vu, Vu = Vui + (Vu−Vui), Vui ∩ (Vu−Vui) = Φ. Similarly, Vci ⊂ Vc, Vc = Vci + (Vc−Vci), Vci ∩ (Vc−Vci) = Φ.

Based on the above two conditions, it can be concluded that IaTDD T ⊂ K&M T.
The relationship between definition variables and use and condition variables is shown in Figure 1. Let A = Vd = {vd1, vd2, ..., vdi, ...}, that is, the complete set of definition variables; A can be empty. Let B1 = B2 = ... = Bi = ... = B = Vu ∪ Vc, where Bi indicates the set of use variables and condition variables. A is dependent on B.

Figure 1. Data Dependence of Variables in Transition T Derived by the K&M Method

∀ (vdi, {Vu, Vc}) ∈ K&M T, each definition variable is dependent on the complete set of use variables and condition variables, but this is not actually the case. In fact, vdi depends only on the use and condition variables relevant to it, omitting the use variables and condition variables irrelevant to vdi. Thus, a new dependence is formed, described as (vdi, {Vui, Vci}), Vui ⊂ Vu, Vci ⊂ Vc. In Figure 1, set B1 deletes the dependent variable set irrelevant to vd1, that is, B1 deletes {Vu−Vu1, Vc−Vc1}, B2 deletes {Vu−Vu2, Vc−Vc2}, ..., Bi deletes {Vu−Vui, Vc−Vci}, and so on. In other words, B1 equals {Vu1, Vc1}, B2 equals {Vu2, Vc2}, ..., Bi equals {Vui, Vci}, and so on. Reconstructing the data dependence in this way gives the IaTDD T shown in Figure 2.

Figure 2. Comparison of Data Dependence between the IaTDD and K&M Methods
∀ (vdi, {Vu, Vc}) ∈ K&M T, deleting (vdi, {Vu−Vui, Vc−Vci}) gives (vdi, {Vui, Vci}) ∈ IaTDD T, so IaTDD T ⊂ K&M T. Figure 2 shows that set A is dependent on set C, that is, each definition variable of set A depends on the use variables and condition variables of set C associated with it. Each definition variable has nothing to do with the use variables and condition variables of set D; if such relations are kept, redundancy results.
Figure 3. Variable Set Dependent on by a Definition Variable in the IaTDD and K&M Methods (panels (a), (b) and (c))
∀ (vdi, {Vu, Vc}), (vdj, {Vu, Vc}) ∈ K&M T and (vdi, {Vui, Vci}), (vdj, {Vuj, Vcj}) ∈ IaTDD T, we get Vu = Vui + (Vu−Vui), Vui ∩ (Vu−Vui) = Φ, and Vu = Vuj + (Vu−Vuj), Vuj ∩ (Vu−Vuj) = Φ. But Vui and (Vu−Vuj), and Vuj and (Vu−Vui), may intersect; that is, repeated elements may exist. Consider mainly the use variables and condition variables that are depended on. ∵ ∃ Vui ∩ (Vu−Vuj) ≠ Φ and Vuj ∩ (Vu−Vui) ≠ Φ, as shown in (a) and (b) of Figure 3 (the illustrated variables depended on by a definition variable include use variables and condition variables), ∴ ∃ (Vui ∪ Vuj) ∩ ((Vu−Vui) ∪ (Vu−Vuj)) ≠ Φ. Similarly, ∃ (Vci ∪ Vcj) ∩ ((Vc−Vci) ∪ (Vc−Vcj)) ≠ Φ. As (c) of Figure 3 shows, ∃ C ∩ D ≠ Φ, where C = {Vu1, Vc1, Vu2, Vc2, ..., Vui, Vci, Vuj, Vcj, ...} and D = {Vu−Vu1, Vc−Vc1, Vu−Vu2, Vc−Vc2, ..., Vu−Vui, Vc−Vci, Vu−Vuj, Vc−Vcj, ...}.
IV. EXPERIMENT AND ANALYSIS

This paper compares the EFSM models commonly used in the literature to analyze the impact of intra-transition data dependence.

A. Experimental Model

The specific experimental model data are shown in Table 1, in which #S is the number of states and #T is the number of transitions.
TABLE 1. EXPERIMENTAL MODELS

EFSM Model | #S | #T
ATM [6] | 9 | 23
Cashier [9] | 12 | 21
Cruise Control [10] | 5 | 17
Fuel Pump [10] | 13 | 25
PrintToken [9] | 11 | 89
Door Control [11] | 6 | 12
Vending Machine [9] | 7 | 28
INRES protocol [12] | 8 | 18
TCP [13] | 12 | 57
B. Experimental Data

This section reports experiments on the 9 EFSM models in Table 1. Using the IaTDD method and the K&M method, the experiments obtain the number of data dependences between variables and the number of redundant variables, and compare the number of definition variables, the number of data dependence relations, and the number of redundancies.

Comparing the results of the experiments that apply the two methods, as shown in Table 2, "#K&M method" indicates the number of data dependences acquired without using the IaTDD method, and "#IaTDD method" indicates the number of data dependences acquired with the IaTDD method.
TABLE 2. NUMBER OF DATA DEPENDENCES BETWEEN INTRA-TRANSITION VARIABLES DERIVED BY THE TWO METHODS

EFSM Model | #K&M method | #IaTDD method
ATM | 28 | 28
Cashier | 30 | 30
Cruise Control | 50 | 50
Fuel Pump | 44 | 44
PrintToken | 49 | 49
Door Control | 6 | 6
Vending Machine | 30 | 30
INRES Protocol | 14 | 14
TCP | 135 | 135
The results in Table 2 show that the two methods obtain the same number of intra-transition data dependences between variables, but the IaTDD method does not produce redundant variables, while the K&M method produces many redundant variables. The numbers of redundant variables for the nine specific models are shown in Figure 4, which describes the number of redundant variables contained in each transition of the models; the horizontal axis indicates the specific transition, and the vertical axis indicates the number of redundant variables contained in the transition. With the K&M method, the Door Control model does not produce redundant variables, because the Door Control model includes at most one definition variable and its data dependence is therefore simple. In reality, however, the situation is not always so simple. The other eight models, as Figure 4 shows, produce redundant variables to different extents. The redundancy caused by the K&M method is shown in Table 3.
Figure 4. Number of Redundant Variables Derived by K&M Method. Panels (a)–(i) show, for the ATM, Cashier, Cruise Control, Fuel Pump, PrintToken, Door Control, Vending Machine, INRES Protocol and TCP models respectively, the number of redundant variables in each transition (horizontal axis: transition; vertical axis: number of redundant variables).
TABLE 3<br />
REDUNDANT VARIABLES DERIVED BY K&M METHOD<br />
EFSM Model    Number of Redundant Variables    Number of Data Dependences among Redundant Variables
ATM 16 7<br />
Cashier 17 10<br />
Cruise Control 83 31<br />
Fuel Pump 117 28<br />
PrintToken 8 8<br />
Door Control 0 0<br />
Vending Machine 7 4<br />
INRES protocol 13 7<br />
TCP 139 78<br />
290 transition samples from the nine models were collected for a comparative experiment on the number of definition variables contained in each transition and the number of data dependences among variables, as shown in Figure 5.
Figure 5. Comparison of Number of Definition Variables and Number of Data Dependence among Variables in Each Transition (horizontal axis: number of definition variables in each transition; vertical axis: number of data dependences among variables in each transition).
Figure 5 shows that the more definition variables a transition contains, the more data dependences exist among its variables and the more complex the transition structure becomes. Figure 6 and Figure 7 show the comparative experiments on the relationship between the number of definition variables contained in each transition and the number of redundant variables produced by the K&M method.
Figure 6. Scatter Diagram of Number of Definition Variables and Redundant Variables in Each Transition of Traditional Method (horizontal axis: number of definition variables in each transition; vertical axis: number of redundant variables in each transition).
Figure 7. Histogram of Number of Definition Variables and Redundant Variables in Each Transition of K&M Method (horizontal axis: the 290 sampled transitions; vertical axis: number of redundant variables).
Figures 6 and 7 show that the more definition variables a transition contains, the more redundant variables are produced. In Figure 7, 105 transitions have no definition variable, 105 have one definition variable, and 80 have two or more definition variables. After sorting the experimental data, Figure 7 shows that from the 210th transition onward the number of redundant variables increases significantly.
C. Results<br />
The two methods produce the same number of intra-transition data dependences between variables, but the K&M method produces more redundant variables. These redundant variables lead to errors and new redundancies in the subsequent phases.
However, if each transition contains at most one definition variable, or the average number of data dependences per transition is very low (for instance, within each transition of the Door Control model the average number of data dependences is 0.416667), then the K&M method alone obtains all the data dependences; in other words, there is no need to use the IaTDD method. If an EFSM model has relatively few definition variables, use variables and condition variables, each action sequence of a transition is relatively simple, and therefore the correspondence between the variables is also relatively simple. In that case one may choose not to use the IaTDD method, provided that the focus of the research is not the data and that redundancy, as well as a small number of errors, can be tolerated. But when a transition contains two or more definition variables, the K&M method produces more and more redundant variables, and the method proposed in this paper becomes more appropriate: it simplifies the result further and is a necessary method. The IaTDD method can also be applied during the pre-EFSM stage: as soon as an EFSM input file has been scanned, the intra-transition data dependences are created. Even with one-pass scanning, the time complexity depends only on the number of statements; the analysis is not time-consuming and can be completed by a forward traversal of all the statements within each transition. It is therefore a feasible method.
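As a rough illustration of the one-pass forward traversal mentioned above, the sketch below computes per-definition dependence sets for a single transition in one pass over its action statements. The statement representation (a list of (defined variable, used variables) pairs) and the treatment of condition variables are assumptions made for this example; this is not the authors' implementation.

```python
def intra_transition_dependences(statements, condition_vars):
    """One forward pass over a transition's action statements.

    statements: list of (defined_var, [used_vars]) pairs in execution order.
    condition_vars: variables appearing in the transition's guard.
    Returns {definition variable: set of variables it depends on}.
    """
    # For each variable, the set of variables its current value depends on.
    reaches = {v: {v} for v in condition_vars}
    deps = {}
    for defined, used in statements:
        depends_on = set()
        for u in used:
            depends_on |= reaches.get(u, {u})
        deps[defined] = depends_on
        # Later uses of `defined` inherit these dependences.
        reaches[defined] = depends_on | {defined}
    return deps

# Example transition: guard uses c; actions are x := a + b; y := x * c
result = intra_transition_dependences([("x", ["a", "b"]), ("y", ["x", "c"])], ["c"])
print({k: sorted(v) for k, v in result.items()})
# {'x': ['a', 'b'], 'y': ['a', 'b', 'c', 'x']}
```

The traversal touches each statement exactly once, so its cost is linear in the number of statements, consistent with the complexity argument above.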
V. SUMMARIES<br />
Because directly applying existing data dependence methods to the EFSM model can produce redundant variables, this paper compares the intra-transition data dependence method with the K&M method and shows that the new method further reduces redundancy and is a necessary and feasible method, providing a theoretical basis for follow-up study.
With the application of model slicing in different areas, the qualitative description of intra-transition data dependence is our next subject of research.
ACKNOWLEDGMENT<br />
We would like to acknowledge the generous financial support of the Fundamental Research Funds for the Central Universities (ZZ1136).
REFERENCES<br />
[1] Weiser M. Program slices: formal, psychological, and practical investigations of an automatic program abstraction method [D]. Ann Arbor: University of Michigan, 1979.
[2] Heimdahl M P E, Whalen M W. Reduction and slicing of hierarchical state machines [A]. In Proc. Fifth ACM SIGSOFT Symposium on the Foundations of Software Engineering [C]. Springer Verlag, 1997.
[3] Heimdahl M P E, Thompson J M, Whalen M W. On the effectiveness of slicing hierarchical state machines: A case study [A/J]. In EUROMICRO '98: Proceedings of the 24th Conference on EUROMICRO [C]. IEEE Computer Society, USA, 1998, 10435.
[4] Savage P, Walters S, Stephenson M. Automated Test Methodology for Operational Flight Programs [A]. Proceedings of IEEE Aerospace Conference [C]. 1997, 4: 293-305.
[5] Dssouli R, Saleh K, Aboulhamid E, En-Nouaary A, Bourhfir C. Test Development For Communication Protocols: Towards Automation [J]. Computer Networks, 1999, 31: 1835–1872.
[6] Korel B, Singh I, Tahat L, Vaysburg B. Slicing of state based models [A]. In IEEE International Conference on Software Maintenance (ICSM'03) [C]. USA: IEEE Computer Society Press, Sept. 2003, 34–43.
[7] Miao Li, Zhang Dafang. Computing Backward Slice of EFSMs [J]. Journal of Software, China, 2004, 15: 169-178.
[8] Shenghui Shi, Qunxiong Zhu, Wenxing Xu. Intra-Transition and Inter-Transition Data Dependence for EFSM [C]. 2011 International Conference on Computer Application and System Modeling (ICCASM 2011).
[9] Korell B. Private communication, 2009.
[10] Korel B, Koutsogiannakis G, Tahat L H. Model-based test prioritization heuristic methods and their evaluation [A]. In A-MOST '07: Proceedings of the 3rd International Workshop on Advances in Model-Based Testing [C]. USA: ACM, 2007, 34–43.
[11] Strobl F, Wisspeintner A. Specification of an elevator control system – an AutoFocus case study [R]. Technical Report TUM-I9906, Technische Universität München, 1999.
[12] Bourhfir C, Dssouli R, Aboulhamid E, Rico N. Automatic executable test case generation for extended finite state machine protocols [A]. In IWTCS'97 [C]. 1997, 75–90.
[13] Zaghal R Y, Khan J I. EFSM/SDL modeling of the original TCP standard (RFC793) and the congestion control mechanism of TCP Reno [R]. Technical Report TR2005-07-22, Internetworking and Media Communications Research Laboratories, Department of Computer Science, Kent State University, 2005.
Shenghui Shi, 1974-, China, Ph.D, Lecturer in Beijing University of Chemical Technology. Area of Research: Slicing Technology, Compiler Technology, Fault Detection, Safety Analysis. Email: shish@mail.buct.edu.cn
Qunxiong Zhu, 1960-, China, Ph.D, Professor, Dean of the College of Information Science and Technology in Beijing University of Chemical Technology. Area of Research: Fault Detection, Artificial Intelligence, Data Mining, Decision-Making and Control. Email: zhuqx@mail.buct.edu.cn

Zhiqiang Geng, 1973-, China, Ph.D, Associate Professor in Beijing University of Chemical Technology. Area of Research: Artificial Intelligence, Control. Email: gengzhiqiang@mail.buct.edu.cn

Wenxing Xu, 1982-, China, Ph.D candidate in Beijing University of Chemical Technology. Area of Research: Intelligent Computing. Email: estellaxu@163.com
Research of ArtemiS in a Ring-Plate Cycloid Reducer
Bing Yang 1,2<br />
1. School of Mechanical Engineering, Dalian Jiaotong University, Dalian, China
2. State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing, China
Email: yangbing@djtu.edu.cn<br />
Yan Liu 3,2<br />
3. School of Traffic and Transportation Engineering, Dalian Jiaotong University, Dalian, China
Email: ly@djtu.edu.cn<br />
Guixin Ye<br />
Xiamen Dianzhu Environmental Protection Co, LTD., Xiamen, China<br />
Email: yeguixin2009@163.com<br />
Abstract—The noise and vibration of a ring-plate pin-cycloid gear reducer driven by two motors are measured at different rotational speeds and different loads. The data are collected and analyzed using the HEAD acoustics multi-channel noise test and analysis system. ArtemiS is an analysis system for improving product quality in all areas of sound and vibration. From the frequency spectrum analysis, the following conclusions can be drawn. The vibration acceleration of the input shaft in the X direction increases as the rotational speed and load go up. The vibration acceleration of the input shaft in the Z direction does not vary much as the rotational speed and load vary, and the same holds for the output shaft in the Z direction. When the rotational speed is fixed, the sound pressure level curves have basically the same variation trend. The loudness increases as the speed goes up and is mainly concentrated in the low frequency bands. The sharpness decreases as the speed goes up.
Index Terms—ring-plate cycloid gear reducer; ArtemiS;<br />
noise<br />
I. INTRODUCTION<br />
Noise pollution, water pollution and air pollution are<br />
the world’s three major environmental pollution problems.<br />
Mechanical noise has become one of the main noise sources with the development of industry. The hazards of mechanical noise to the human body are various. Noise can cause ear discomfort, such as tinnitus, hearing loss, sleep disturbance and other harmful effects. According to clinical statistics, people who work and live in a high-noise environment for a long time easily become deaf. Noise also reduces efficiency: studies have found that noise makes people upset and unable to concentrate on their work. Noise can further cause nervous-system problems such as mental disorders and endocrine disorders, and can even increase the accident rate. A noisy environment can cause dizziness, headache, insomnia, dreaminess, malaise, memory loss, fear, irritability, low self-esteem and even insanity. Noise can also harm children's physical and mental health. According to statistics, there are more than 70 million deaf people in the world today, a considerable part of whose deafness is caused by noise.
(The work is supported by the Fundamental Research Funds of the State Key Laboratory of Mechanical Transmission, Chongqing University, China. Project No. SKLMT-KFKT-200902.)
1. output shaft 2. cycloid gear<br />
3. ring-plate with pin gear 4. input shaft<br />
Figure 1. Transmission sketch <strong>of</strong> the Ring-Plate Cycloid reducer<br />
A ring-plate-type cycloid gear reducer driven by two motors provides many advantages, including low volume, light weight, a high reduction ratio, smooth transmission and so on [1-3]. The transmission sketch diagram of the ring-plate-type cycloid gear reducer is shown in Fig. 1 [4].
Two input shafts are driven by two motors respectively.<br />
The ring-plates mounted on the input shaft rotate and the<br />
cycloid gear rotates, and then the output shaft rotates. At present, the noise and vibration of the double-crank ring-plate-type cycloid reducer have not been studied. Some studies of the ring-plate type reducer and the planetary gear reducer have been done [5-7]; however, the noise of the double ring-plate-type cycloid gear reducer still needs to be studied. In this paper, noise and vibration tests of the ring-plate-type cycloid gear reducer are carried out, and the vibration and noise characteristics of the ring-plate-type cycloid gear reducer are obtained.
II. TEST CONTENT<br />
A. Test Equipment
The HEAD acoustics multi-channel noise test and analysis system is selected for the measurement. The system consists of the SQLab II data acquisition recorder, the HMS III artificial head, the PEQ IV digital equalizer, microphones and acceleration sensors, the ArtemiS software, and so on [6]. The SQLab II data acquisition recorder is a multi-channel analog and digital I/O front end with a variety of available conditioning. The HMS III artificial head is a stand-alone, mobile measuring device that is ready to perform aurally accurate binaural recordings immediately after powering up. The programmable equalizer PEQ IV provides the highest quality reproduction of aurally accurate recordings. The data collection and analysis process of the HEAD acoustics multi-channel noise test and analysis system is shown in Fig. 2.
Figure 2. Data collection and analysis process
Figure 3. Interface of ArtemiS
ArtemiS is the abbreviation of Advanced Research Technology for Measurement and Investigation of Sound and Vibration. ArtemiS is recording, analysis and playback software developed to deal with tasks in the field of sound and vibration quickly and efficiently. Simultaneous listening, analyzing and interactive filtering is of vital importance when using ArtemiS. ArtemiS improves and expedites the testing process; its intuitive interface and ease of use allow for quick test configuration and analysis. An analysis interface of ArtemiS is shown in Fig. 3.
B. Test Setup and Recording<br />
For the noise measurement, one microphone is placed 1 m vertically above the reducer and another microphone is placed 1 m horizontally away from the reducer.
Four vibration test points were positioned in the test<br />
procedure. The location <strong>of</strong> the vibration test points is<br />
shown in Fig. 4. The detailed information of the test points is shown in Table I.
TABLE I. TEST POINT LOCATION EXPLANATION
Point name    Test point location
1     On the reducer, near the input shaft (X direction)
2     On the reducer, near the output shaft (Z direction)
3     On the reducer, near the output shaft (X direction)
4     On the reducer (Z direction)
01    Noise test, 1 m vertically above the reducer
02    Noise test, 1 m horizontally away from the reducer
Figure 4. Test point location<br />
III. TEST RESULTS AND ANALYSIS<br />
A. Modal Analysis of Main Parts
The motion differential equation of the ring-plate pin-cycloid gear reducer system can be expressed as
$[M]\{\ddot{x}(t)\} + [C]\{\dot{x}(t)\} + [K]\{x(t)\} = \{F(t)\}$   (1)
Here, [M] is the mass matrix, [C] is the damping matrix, [K] is the stiffness matrix, $\{x\}$ is the displacement vector, $\{\dot{x}\}$ is the velocity vector, $\{\ddot{x}\}$ is the acceleration vector, and {F} is the excitation vector [8,9].
Let $\{x\} = [\varphi][q]$, where $[\varphi]$ is the vibration mode matrix and [q] is the modal matrix; equation (1) can then be expressed as
$\left([K] - \omega^{2}[M] + j\omega[C]\right)[\varphi][q] = \{F\}$   (2)
Thus, the vibration characteristics of the parts, including the natural frequencies and the corresponding vibration modes, can be obtained through equation (2). The three-dimensional model of the cycloid gear of the reducer is established, and the meshing of the cycloid gear is shown in Fig. 5.
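As a minimal numerical illustration of equation (2) (with damping neglected, [C] = 0), the natural frequencies and mode shapes are the generalized eigenpairs of the stiffness and mass matrices. The 2-DOF matrices below are placeholder values chosen only to show the computation; the cycloid gear itself was meshed and solved in ANSYS as described next.

```python
import numpy as np
from scipy.linalg import eigh

# Placeholder 2-DOF system (illustrative values, not the cycloid gear model).
M = np.array([[2.0, 0.0],
              [0.0, 1.0]])            # mass matrix [kg]
K = np.array([[ 6.0e6, -2.0e6],
              [-2.0e6,  2.0e6]])      # stiffness matrix [N/m]

# ([K] - w^2 [M]) {phi} = 0  ->  generalized eigenvalue problem
w2, phi = eigh(K, M)                  # eigenvalues w^2, mode shapes as columns
f = np.sqrt(w2) / (2.0 * np.pi)       # natural frequencies [Hz]
print(np.round(f, 1))                 # [159.2 318.3] for these placeholder matrices
```

For the full finite-element model of the gear, the same eigenproblem is solved on much larger matrices, which is why a sparse method such as Block Lanczos is used in ANSYS below.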
Figure 5. Meshing of cycloid gear
The modal analysis is carried out on the three-dimensional model using ANSYS. Ten modes are extracted, and the extraction method is Block Lanczos. The first five natural frequencies of the cycloid gear are shown in Table II, and the first-order vibration mode of the cycloid gear is shown in Fig. 6.
TABLE II.
NATURAL FREQUENCY OF CYCLOID GEAR<br />
order 1 2 3 4 5<br />
f[Hz] 357 515 824 1565 1826<br />
Figure 6. Vibration model <strong>of</strong> cycloid gear<br />
B. Noise Frequency Spectrum Analysis
Table III shows the sound pressure levels under different working conditions. Fig. 7 shows the noise frequency spectrum at 500 rotations per minute at test point 01 under loads of 50%, 70% and 80% of the full load. Fig. 8 shows the noise frequency spectrum at 750 rotations per minute at test point 01 under different loads. Fig. 9 shows the noise frequency spectrum at 1000 rotations per minute at test point 01. Fig. 10 shows the noise frequency spectrum at 500 rotations per minute at test point 02. Fig. 11 shows the noise frequency spectrum at 750 rotations per minute at test point 02.
TABLE III. SOUND PRESSURE LEVEL
Rotation speed    Load    SPL of test point 01 /dB(A)    SPL of test point 02 /dB(A)
500 50% 69.1 68.0<br />
500 70% 69.9 68.9<br />
500 80% 70.4 69.8<br />
750 50% 74.7 73.0<br />
750 75% 75.6 74.5<br />
750 80% 76.0 75.0<br />
1000 50% 77.3 76.2<br />
1000 70% 78.0 76.9<br />
1000 80% 79.2 78.2
Figure 7. Spectrum diagram of different loads at the speed of 500 r/min of test point 01.
Figure 8. Spectrum diagram of different loads at the speed of 750 r/min of test point 01.
Figure 9. Spectrum diagram of different loads at the speed of 1000 r/min of test point 01.
Figure 10. Spectrum diagram of different loads at the speed of 500 r/min of test point 02.
Figure 11. Spectrum diagram of different loads at the speed of 750 r/min of test point 02.
(Each diagram plots SPL/dB(A) against frequency from 50 Hz to 3150 Hz for the 50%, 70% and 80% load cases.)
From the above diagrams, the following can be observed. When the rotational speed is fixed, the sound pressure level curves have basically the same variation trend. At 500 rotations per minute, the main noise frequency band of test point 01 is from 500 to 1250 Hz; at 750 rotations per minute it is from 400 to 1250 Hz; and at 1000 rotations per minute it is from 400 to 1600 Hz. The sound pressure level of test point 01 is slightly higher than that of test point 02, and the trend of the test point 02 sound pressure level is basically the same as that of test point 01. The sound pressure level of both test points goes up as the rotational speed increases.
C. Vibration Frequency Spectrum Analysis<br />
Figs. 12 to 15 show the vibration frequency spectra at 750 rotations per minute at test points 1, 2, 3 and 4 under loads of 50%, 70% and 80% of the full load. When the rotational speed is fixed, the vibration acceleration curves follow basically the same trend as the load changes; the peak frequencies are basically the same, but the peak values vary.
Figure 12. Spectrum diagram of different loads at the speed of 750 r/min of test point 1.
Figure 13. Spectrum diagram of different loads at the speed of 750 r/min of test point 2.
Figure 14. Spectrum diagram of different loads at the speed of 750 r/min of test point 3.
Figure 15. Spectrum diagram of different loads at the speed of 750 r/min of test point 4.
(Each diagram plots vibration acceleration in mg against frequency from 40 Hz to 2500 Hz for the 50%, 70% and 80% load cases.)
From the above diagrams, the following can be observed. The vibration acceleration of the input shaft in the X direction increases as the rotational speed and load go up. The vibration acceleration of the input shaft in the Z direction does not vary much as the rotational speed and load vary. The vibration acceleration of the output shaft in the Z direction does not vary much as the rotational speed and load vary. The vibration acceleration of the input shaft in the X direction increases as the rotational speed goes up.
D. Sound Quality Analysis<br />
From the above analysis, we found that the sound pressure level of the ring-plate pin-cycloid gear reducer is not too high, but people around the reducer still feel uncomfortable. So the feeling of people should be taken into consideration besides the sound pressure level; that is, psychoacoustic analysis is necessary. Psychoacoustics is
the scientific study <strong>of</strong> sound perception. More<br />
specifically, it is the branch <strong>of</strong> science studying the<br />
psychological and physiological responses associated<br />
with sound. Psychoacoustics covers many aspects, such as loudness, sharpness, fluctuation strength, roughness, and so on.
Loudness is the attribute of auditory sensation in terms of which sounds can be ordered on a scale extending from quiet to loud. Loudness analysis can lead to more precise results than magnitude estimations; for this reason the loudness level is calculated. Loudness can be expressed as [10]
$N = \int_0^{24} N'\,dz$   (3)
where N' is the specific loudness, which can be expressed as
$N' = 0.08\left(\frac{E_{TQ}}{E_0}\right)^{0.23}\left[\left(0.5 + 0.5\,\frac{E}{E_{TQ}}\right)^{0.23} - 1\right]$   (4)
where $E_{TQ}$ is the excitation at the threshold in quiet and $E_0$ is the excitation corresponding to the reference intensity $I_0 = 10^{-12}$ W/m².
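A small numerical sketch of equations (3) and (4) is given below: specific loudness is evaluated from the excitation on the critical-band-rate scale and then integrated over 0–24 Bark. The excitation values are made-up samples used only to show the calculation; they are not measurement data from the reducer.

```python
import numpy as np

def specific_loudness(E, E_TQ, E_0):
    """Eq. (4): specific loudness N' [sone/Bark] from the excitation E."""
    return 0.08 * (E_TQ / E_0) ** 0.23 * ((0.5 + 0.5 * E / E_TQ) ** 0.23 - 1.0)

def total_loudness(z, N_prime):
    """Eq. (3): total loudness N [sone], trapezoidal integration over 0..24 Bark."""
    return float(np.sum(0.5 * (N_prime[1:] + N_prime[:-1]) * np.diff(z)))

z = np.linspace(0.0, 24.0, 241)                  # critical-band rate [Bark]
E_TQ = np.full_like(z, 1e-3)                     # made-up threshold-in-quiet excitation
E = E_TQ * (2.0 + 8.0 * np.exp(-((z - 4.0) / 4.0) ** 2))   # made-up excitation, above threshold
N_prime = specific_loudness(E, E_TQ, E_0=1e-6)
print(round(total_loudness(z, N_prime), 2))      # total loudness in sone
```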
Figure 16. Point 01 loudness of different speeds at 50% load.
Figure 17. Point 01 loudness of different speeds at 70% load.
Figure 18. Point 01 loudness of different speeds at 80% load.
(Each diagram plots specific loudness in sone/Bark against critical-band rate from 0 to 22 Bark at 500, 750 and 1000 r/min.)
Figs. 16 to 18 show the loudness of test point 01 at different speeds under loads of 50%, 70% and 80% of the full load. From the diagrams, the following can be observed. The loudness of test point 01 increases as the speed goes up. The loudness is mainly concentrated in the low frequency bands. The trends of the three diagrams are basically the same.
Sharpness is a sensation that can be considered separately, and it is possible, for example, to compare the sharpness of one sound with the sharpness of another. One of the important variables influencing the sensation of sharpness is the spectral content. The sharpness can be expressed as [7]
$S = 0.11\,\dfrac{\int_0^{24} N'(z)\,g(z)\,dz}{\int_0^{24} N'(z)\,dz}$   (5)
where S is the sharpness to be calculated and the denominator gives the total loudness N, which has already been calculated.
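The sharpness of equation (5) can be evaluated from the same specific-loudness samples. The weighting function g(z) is not reproduced in the paper, so the sketch below takes it as a parameter and uses a unity placeholder; only the mechanics of the calculation are illustrated.

```python
import numpy as np

def trapz(y, x):
    """Simple trapezoidal integration."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def sharpness(z, N_prime, g):
    """Eq. (5): S = 0.11 * int(N'(z) g(z) dz) / int(N'(z) dz), in acum."""
    return 0.11 * trapz(N_prime * g(z), z) / trapz(N_prime, z)

z = np.linspace(0.0, 24.0, 241)                # critical-band rate [Bark]
N_prime = np.exp(-((z - 4.0) / 3.0) ** 2)      # made-up specific loudness samples
print(round(sharpness(z, N_prime, g=np.ones_like), 3))   # 0.11 with the unity placeholder
```

With the unity placeholder the ratio simply collapses to 0.11 acum; an actual evaluation substitutes the critical-band weighting of the standard psychoacoustic model, which emphasizes high critical-band rates.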
Figure 19. Point 01 sharpness of different speeds at 50% load.
Figure 20. Point 01 sharpness of different speeds at 70% load.
(Each diagram plots sharpness in acum against time from 0 to 24 s at 500, 750 and 1000 r/min.)
The sharpness of test point 01 at the speed of 500 rotations per minute is higher than that at 750 rpm and 1000 rpm, which is very different from the trends of the sound pressure level and the vibration acceleration. The trend of the sharpness curves of test point 01 at 750 rpm is basically the same as that at 1000 rpm. The sharpness of test point 01 is 3.02 acum at 500 rpm, 2.78 acum at 750 rpm, and 2.77 acum at 1000 rpm.
Figure 21. Point 01 sharpness of different speeds at 80% load.
Figs. 19 to 21 show the sharpness of test point 01 at different speeds under loads of 50%, 70% and 80% of the full load. The diagrams show that the sharpness of test point 01 decreases as the speed goes up.
IV. CONCLUSION<br />
A ring-plate-type cycloid gear reducer is a new type of reducer, and the study of its noise and vibration is very important. In this paper, noise and vibration tests of the ring-plate-type cycloid gear reducer are carried out, and the vibration and noise characteristics of the reducer are obtained. The three-dimensional model of the cycloid gear of the ring-plate-type cycloid reducer is established, and modal analysis is carried out on the model using ANSYS. The natural frequencies and the corresponding vibration modes of the cycloid gear are obtained. The sound quality of the reducer is also analyzed.
ACKNOWLEDGMENT<br />
This study is supported by the Fundamental Research Funds of the State Key Laboratory of Mechanical Transmission, Chongqing University, China (Project No. SKLMT-KFKT-200902).
REFERENCES<br />
[1] W.D. He and X. Li, "Study on double crank ring-plate-type cycloid drive," Chinese Journal of Mechanical Engineering, vol. 36, pp. 84-88, May 2000.
[2] L.X. Li and X. Li, "Experimental study of double crank ring-plate-type pin-cycloidal gear planetary drive," Journal of Dalian Railway Institute, vol. 26, pp. 1-4, January 2005.
[3] S. Kasap and K. Benkrid, "Parallel processor design and implementation for molecular dynamics simulations on a FPGA-based supercomputer," Journal of Computers, vol. 7(6), pp. 1312-1328, 2012.
[4] X. Li and W.D. He, "A new cycloid drive with high-load capacity and high efficiency," ASME Journal of Mechanical Design, vol. 126, pp. 683-686, April 2004.
[5] T.J. Lin, Y.J. Liao, et al., "Numerical simulation and experimental study on radiation noise of double-ring gear reducer," Journal of Vibration and Shock, vol. 29, pp. 43-47, March 2010.
[6] G.O. Dyrkolbotn, K. Wold and E. Snekkenes, "Layout dependent phenomena a new side-channel power mode," Journal of Computers, vol. 7(4), pp. 827-837, 2012.
[7] C.C. Zhu, D.T. Qin, et al., "Study on surface noise distribution of three-ring reducer," Journal of Chongqing University (Natural Science Edition), vol. 23, pp. 18-21, April 2000.
[8] HEAD acoustics, "ArtemiS tool packs," Application note, in press, March 2006.
[9] D. Xinyu and G. Guowei, "Elementary application of geometry face modeling under VRML complex object muster," Journal of Computers, vol. 6(4), pp. 683-690, 2011.
[10] E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models, 2nd ed., New York: Springer, 1999.
Bing Yang was born in 1974. She received her M.S. degree from Dalian University of Technology in 2003. Her current research interests include noise and vibration control and industrial engineering.

Yan Liu was born in 1956. He is a professor at Dalian Jiaotong University. His current research interests include noise and vibration control and vehicle engineering.
A New Access Control Model for Manufacturing<br />
Grid<br />
Zhihui Ge, Taoshen Li<br />
School of Computer Science, Electronics and Information, Guangxi University, Nanning, China, 530004
Email: gezhihui@foxmail.com<br />
Abstract—In order to protect sensitive information in a collaborative manufacturing grid environment, an access control solution is proposed to satisfy the inherently dynamic nature of the Manufacturing Grid, including its dynamic business flow and system environment. Activity is introduced to encapsulate role and permission. Activity state, activity hierarchy and activity dependence are used to provide dynamic authorization and flexible multi-granularity permission management, which can adapt to dynamic, flexible modern business processes. UNIX-like permissions guarantee default minimum read/write/delete permissions. The proposed model can meet the needs of the Manufacturing Grid.
Index Terms—manufacturing grid, access control, RBAC,<br />
task context, environment context<br />
I. INTRODUCTION<br />
The manufacturing grid is a platform that integrates various resources into a giant "virtual computer" provided for modern enterprises. In this "virtual organization" (VO), users can share resources and cooperate in problem solving. The manufacturing grid therefore not only has the same features as a traditional network, but also exhibits high dynamics, precise organization, large scale, and so on.
As a mass-customized distributed cooperation system, the manufacturing grid processes large amounts of data, which may come from different users, departments and enterprises, and these data are associated with private information such as funds, costs, products, project processes and staff management. How to guarantee the security of those data is therefore very challenging. As an important component of security, access control can prevent data from suffering illegal modification and damage.
Figure 1. Cooperation in VO among Enterprises<br />
Figure 1 shows a typical scenario, within which<br />
multiple manufacturing enterprises cooperate with each<br />
other based on virtual organization.<br />
We assume enterprise A has found an opportunity in the market. In order to respond quickly, A authorizes B and C to complete the design of components and the final product integration. When the design work is completed, A delegates the manufacturing task to D. During this commercial activity, multiple enterprises associate and cooperate with each other through resource sharing and division of labor. The different services provided by the various enterprises form a temporary virtual organization; once the manufacturing activity ends, the VO naturally dissolves.
During the phases of cooperation, in order to ensure that the final product satisfies the original requirements, the staff participating in the collaborative design work have to communicate with each other, and the manager also needs to track and control the project process and coordinate developers and resources. All the tasks mentioned above need mutual cross-domain access. However, each enterprise joining the VO has its own security requirements and access control policy.
In this paper, a context-aware access control model is proposed, which can effectively support interoperability with dynamically changing permission policies for collaboration in a manufacturing grid environment.
II. RELATED WORK<br />
Researchers have done much work in the field of grid security. The Grid Security Infrastructure (GSI) [3] is the essential middleware for authentication in a grid environment. GSI maps a global user who needs to access resources to an account on the local resource servers. Because of the huge number of resources and users in a grid environment, the mapping table is also huge; furthermore, GSI has no effective global/local permission management scheme. Ian Foster et al. proposed the Community Authorization Service (CAS) [4]. CAS allows resource providers to delegate some of the authority for maintaining fine-grained access control policies to communities, while still maintaining ultimate control over their resources. In order to gain access to a CAS-managed community resource, a user must first acquire a capability from the CAS server, so the final permission assigned to the user is the intersection of the VO (Virtual Organization) policy and the resource provider's policy. But CAS is a static delegation of authority, which cannot satisfy the requirement of dynamic authorization.
There has also been much research on extending traditional RBAC to the grid environment. The model proposed in Ref. [5] can provide dynamic permissions according to the grid environment context, but its drawback is that the dynamic changes of tasks in a manufacturing project are not considered.
In recent years, along with the rapid development of the manufacturing grid, several access control models have been proposed. In order to support the interaction between global and local, dynamic and static security strategies in a dynamic heterogeneous manufacturing grid, Ref. [6] proposes an extension of RBAC. The model is based on CAS, but it is too complicated and cannot effectively reflect changes of the environment context so as to control the view of the CAD model.
III. GROUP-ACTIVITY BASED ACCESS CONTROL MODEL<br />
A. Design Philosophy<br />
The access control model we designed is shown in Figure 2. It is activity-centered, with role and permission encapsulated in the activity. The work flow can easily be organized and coordinated to achieve global management and dynamic authorization. The autonomous management provided by groups and the UNIX-like permission configuration both improve the efficiency of the administrator.
Figure 2. GA_RBAC Access Control Model<br />
B. Definitions<br />
Definition 1. Object and UNIX-like permission object base class: The most important part of an access control system is to determine the objects to be controlled. But all data and resources increase dynamically in the manufacturing grid, and most of them are stored in databases, so it is very hard for the administrator to assign all the permissions. We therefore define the following UNIX-like configuration for objects.
obj_id and owner_id are the object and owner identities respectively. unix_perms is a 32-bit integer that records the read/write/delete access rights. A shared object may be composed of multiple objects organized in a hierarchical structure. The products designed by different designers, who are affiliated with different enterprises, belong to different organizations; the owner_group_set is used for this situation, and it solves the rights-inheritance problem of sub-objects.
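A minimal sketch of such a UNIX-like object record and its permission check is shown below. The field names follow the text (obj_id, owner_id, owner_group_set, unix_perms); the specific bit layout of the 32-bit integer (owner/group/others classes, each with read-write-delete bits) is an assumption made only for illustration, since the paper does not fix it.

```python
from dataclasses import dataclass, field

READ, WRITE, DELETE = 0b100, 0b010, 0b001      # assumed per-class permission bits

@dataclass
class SharedObject:
    obj_id: str
    owner_id: str
    owner_group_set: set = field(default_factory=set)
    unix_perms: int = 0o640                    # assumed default: owner rw-, group r--, others ---

def allowed(obj, user_id, user_groups, needed):
    """Check read/write/delete bits the way a UNIX file mode is checked."""
    if user_id == obj.owner_id:
        bits = (obj.unix_perms >> 6) & 0b111   # owner class
    elif user_groups & obj.owner_group_set:
        bits = (obj.unix_perms >> 3) & 0b111   # owning-group class
    else:
        bits = obj.unix_perms & 0b111          # others
    return (bits & needed) == needed

cad = SharedObject("partA.cad", "designer1", {"enterpriseB.design"})
print(allowed(cad, "designer1", set(), READ | WRITE))            # True  (owner)
print(allowed(cad, "designer2", {"enterpriseB.design"}, READ))   # True  (owner group)
print(allowed(cad, "outsider", set(), DELETE))                   # False (others)
```

The check mirrors a UNIX file-mode check: the requester is matched against the owner, then against the owner group set, and otherwise falls back to the "others" bits, which provides the default minimum permissions mentioned in the abstract.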
Definition 2. Subject: a subject is an object that is assigned some role and participates in collaborative activities with the corresponding permissions. A subject initiates resource requests; it is an abstract concept, which can be a person, a program or even a group.
Definition 3. Groups and Group Hierarchy (GH): an enterprise is composed of multiple departments that are associated with different roles. The organizations in a VO may be enterprises and departments coming from different domains, so we give the following definition to describe this relation.
gid is the group identity; super_gid is the parent group of gid; u_set and r_set are the user set and the corresponding role set of the group respectively.
Definition 4. User: a user is a staff member with some role who takes part in some activity.
uid is the user identity, group_set is the set of groups the user belongs to, and r_set is the role set assigned to the user.
Definition 5. Group-User Relation (GUA): GUA<br />
indicates the many-to-one relation between user and<br />
group set.<br />
Definition 6. Role and Role Hierarchy (RH): a role is the entity that owns rights; it is related to semantics such as task function and responsibility.
rid is the role identity, super_rset is the set of roles it inherits from, group_set and u_set are the group set and user set respectively, and actset is the set of activities the role participates in.
In manufacturing grid, role can be system role such as<br />
administrator, service provider, service caller etc, and it<br />
also can be set according to the specific task and position,<br />
such as design engineer, process engineer, design leader<br />
etc.<br />
Definition 7. Permission: a permission is the set of permitted operations on a specific object.
Definition 8. Activity and Activity Hierarchy (AH): an activity is the base unit of a decomposed task; activities form a hierarchy relation, and each activity is encapsulated with role, permission and other objects.
aid is the activity identity, state is the activity state, superAct is the parent activity, subActs is the set of child activities, rset is the role set of the participants, and peSet is the set of permission records.
During the cooperation process of a VO, a project cycle may be divided into several activities; furthermore, an activity can be divided into tasks, so the activity hierarchy structure expresses this relation well.
Definition 9. Subject-Role Relation (SR): SR is a<br />
kind <strong>of</strong> many-to-many relation.<br />
sid can be group id or user id, rid is role identity.<br />
Definition 10. Role-Activity Relation (RA): RA is<br />
many-to-many relation between role and activity.
rid is role identity, aid is activity identity.<br />
Definition 11. Activity-Permission Relation (AP): AP is a many-to-many relation between activity and permission.
Activity A can be divided into seven sub-activities that are correlated with each other; Figure 3 shows the activity dependence graph among them.
Figure 3. Activity Dependence Graph<br />
Definition 12: Activity State Migration (ASM) is the migration of activity state, which is used to describe the changes of activity state.
The all state is used to guarantee that certain rights are available at any time, such as the rights management of the administrator. The active and inactive states indicate that the activity is in its active or inactive state respectively. The denied state is used to describe backtracking of activity dependence. The complete state indicates that the activity is completed; all rights are then unavailable except the read right of its owner. The migration of activity states is shown in Figure 4.
Figure 4. Activity State Migration<br />
Definition 13: Activity Dependence (AD) is the relationship between activities, which includes:
(1) Mutually Exclusive Dependence: the exclusive dependence of different activities when executing. For any two activities ai and aj, aj cannot enter a state once ai has entered it, and vice versa, denoted ai(state) ↔ ¬aj(state) or aj(state) ↔ ¬ai(state). Mutually exclusive dependence satisfies non-reflexivity, non-transitivity and symmetry.
(2) Sequence Dependence: the precedence relationship of different activities when executing. For any two activities ai and aj, when ai enters the complete state, the active state of aj is activated, denoted ai(complete) → aj(active).
(3) Failure Dependence: the denial relation of different activities when executing. For any two activities ai and aj, when ai enters the denied state, the active state of aj is activated, denoted ai(denied) → aj(active).
(4) Synchronous Dependence: synchronous execution of different activity states. For any two activities ai and aj, when ai is executed, aj has to be executed at the same time, and vice versa, denoted ai(state) ↔ aj(state). Synchronous dependence is an equivalence relation, which satisfies reflexivity, transitivity and symmetry.
Definition 14: Activity Property (AP): we define activity A as having the following properties.
(1) Start Activity Set: in the activity hierarchy and activity dependence, the start activity set of activity A is composed of all starting sub-activities, i.e., the set of sub-activities whose sequence-dependence in-degree is zero.
(2) End Activity: in order to guarantee that the cooperation proceeds successfully, each activity is set to have only one ending sub-activity; the end activity is the sub-activity whose sequence-dependence out-degree is zero.
(3) Activate Activity: when some activity instance ai is running and all its conditions are satisfied, ai is activated and all of its permission set becomes available. deptype denotes the activity dependence type.
(4) Prior Activity Set: for the convenience of managing the activation of activities, we need to find the prior activity set of a given activity.
(5) Follow-up Activity Set: we also need to find the follow-up activity set of a given activity.
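The sets described above can be derived directly from the dependence graph of an activity's sub-activities. The sketch below uses an assumed edge-list representation (not the authors' formalization) with a hypothetical dependence graph, merely to show how the start activity set, end activity, prior activity set and follow-up activity set are computed.

```python
ORDER, FAILURE = "order", "failure"

deps = [                       # hypothetical dependence edges for activity A's sub-activities
    ("a1", "a2", ORDER),
    ("a1", "a3", ORDER),
    ("a2", "a4", ORDER),
    ("a3", "a4", ORDER),
    ("a2", "a3", FAILURE),     # if a2 is denied, a3 is activated as the fallback
]
acts = {"a1", "a2", "a3", "a4"}

def start_acts(acts, deps):
    """Sub-activities whose sequence-dependence in-degree is zero."""
    targets = {t for _, t, k in deps if k == ORDER}
    return acts - targets

def end_acts(acts, deps):
    """Sub-activities whose sequence-dependence out-degree is zero."""
    sources = {s for s, _, k in deps if k == ORDER}
    return acts - sources

def prior_acts(a, deps, deptype):
    return {s for s, t, k in deps if t == a and k == deptype}

def follow_acts(a, deps, deptype):
    return {t for s, t, k in deps if s == a and k == deptype}

print(start_acts(acts, deps))            # {'a1'}
print(end_acts(acts, deps))              # {'a4'}
print(prior_acts("a4", deps, ORDER))     # {'a2', 'a3'}
print(follow_acts("a2", deps, FAILURE))  # {'a3'}
```

These helpers correspond to the StartAct, EndAct, PriorAct and FollowAct operations used by the activity-management functions in Section IV.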
IV. MODEL OPERATIONS<br />
The implementation process of the access control model mainly includes static permission assignment, dynamic activity management and user authorization.
A. Static Permission Assignment<br />
Any implementation of an access control model has to provide permission policies for the access control system. Static permission assignment is a method of achieving the permission policy, and it initializes the data for the access control system. In our model, static permission assignment includes activity partition, activity permission assignment, activity-role association and subject-role association.
Activity partition is the first step. The administrative staff organize and partition the activities at the different stages of the work flow and decide their dependence relationships; this task is specific to the application. Once the activities and activity dependences are worked out, permissions can be assigned through the RA and AP mappings, and the available roles are then mapped to users and their groups according to the SR.
B. Activity Management and Dynamic Permission<br />
Adjustment<br />
Activity management is one key component for achieving the goal of dynamically adjusting users' permissions. According to the features of the access control model for the manufacturing grid, we design three operations to complete the activity management.
(1) The Startup() function is used to activate an activity and its corresponding sub-activities. In the AH, when we want to activate an activity from its inactive state, we must keep searching for the start activity set within the sub-activity sets until there are no more sub-activities, owing to the hierarchical organization of the AH. The operation can be constructed according to the definition of the Start Activity Set; the pseudo code is as follows:
Startup(act)
Step1: tmp_act := act
       if tmp_act.state != all
           tmp_act.state := active
Step2: if tmp_act.subActSet.length = 0
           return
Step3: if tmp_act.subActSet.length != 0
           if tmp_act.SubActs_Dep != null
               for each start_act in StartAct(tmp_act)
                   Startup(start_act)
           else
               for each sub_act in tmp_act.subActSet
                   Startup(sub_act)
Figure 5. Function for Starting up an Activity<br />
(2) After an activity is completed or denied, the activities that have a dependence on it have to be moved to the proper state. ActivateAct() is used to dynamically adjust the states of activities. The main idea of this operation is to find a given activity ai's FollowAct(ai, deptype) set according to the state of ai in AD. If an activity in the FollowAct(ai, deptype) set satisfies the dependence relation and the corresponding state, it is activated.
ActivateAct(ai, state)
Step1: ai.state := state
Step2: if state = denied
           followActs := FollowAct(ai, failure)
       else if state = complete
           followActs := FollowAct(ai, order)
Step3: for each tmp_act in followActs
           preFailureSet := PriorAct(tmp_act, failure)
           preOrderSet := PriorAct(tmp_act, order)
           for each act in preFailureSet
               if act.state = denied continue
               else return
           for each act in preOrderSet
               if act.state = complete continue
               else return
           tmp_act.state := active
Figure 6. Function for Activating an Activity<br />
(3) CompleteAct() is used to mark the completed state of an activity. In AD, when the ending sub-activity of an activity is completed, this indicates that the activity itself is also completed.
CompleteAct(act)
Step1: if act.state = all
           return
       else
           act.state := complete
           goto Step2
Step2: tmp_act := act.super_act
Step3: if tmp_act = null
           return
       else
           if EndAct(tmp_act) = act
               ActivateAct(act, order)
               CompleteAct(tmp_act)
           else
               return
Figure 7. Function for Completing an Activity<br />
C. User Authorization<br />
The purpose of access control is to judge the requests of users and then decide whether the requested access to the resource can be carried out. So, in our access control model, we introduce activity state, group and a corresponding UNIX-like permission control. The request of a user can be expressed as Request := (u_id, group_set, r, act_id, op, obj), where u_id is the user identity, group_set is the group set of the user, r is the activity role of the user in the current session, act_id is the activity identity of the user's current request, op is the
requested operation, obj is the requested object. We<br />
design the following strategy:<br />
(1) If the activity is in its inactive state, then none of its rights are available.
(2) If the activity is in its denied state, then the activity is reset into the active state.
(3) If the activity is in the active or all state, then all the rights are available.
(4) If the activity is in the completed state, then only the read-only right is available for the corresponding users.
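A minimal sketch of this strategy is given below, assuming the activity states and a read-only operation set as named here (none of these identifiers come from the paper; it is an illustration, not the prototype's code):

import java.util.Set;

// Sketch: the authorization decision depends only on the state of the activity
// named in the request and on whether the requested operation is read-only.
public class ActivityStateChecker {
    enum ActivityState { INACTIVE, ACTIVE, ALL, DENIED, COMPLETED }

    // Which operations count as read-only is an assumption for this sketch.
    private static final Set<String> READ_ONLY_OPS = Set.of("read", "load");

    public boolean authorize(ActivityState state, String op) {
        switch (state) {
            case INACTIVE:
                return false;                       // rule (1): no rights available
            case DENIED:
                return true;                        // rule (2): reset to active, then allow
            case ACTIVE:
            case ALL:
                return true;                        // rule (3): all rights available
            case COMPLETED:
                return READ_ONLY_OPS.contains(op);  // rule (4): read-only access only
            default:
                return false;
        }
    }
}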
V. IMPLEMENTATION<br />
A. Framework<br />
In order to illustrate the effectiveness and security of our access control model, we construct a prototype based on Globus Toolkit 4. The framework of our prototype is shown in Figure 8.
Figure 8. Framework <strong>of</strong> Prototype<br />
(1) CSGService is a grid service based on Globus<br />
Toolkit 4, which is used to save the properties<br />
<strong>of</strong> resource objects.<br />
(2) CSGClient is the client, which is used to submit the users' requests.
(3) CSGListener is the subscriber <strong>of</strong> CSG service,<br />
which is used to monitor the state changing <strong>of</strong><br />
resources in CSGService.<br />
(4) AccessControlManager is used to coordinate the<br />
interaction between CSGService and<br />
GA_RBAC.<br />
(5) GA_RBAC is our proposed group-activity based<br />
access control model, which is used to authorize<br />
the users’ requests.<br />
B. WSDL based Service Description<br />
We use WSDL (Web Service Description Language)<br />
to describe the interaction protocol between server and<br />
client.<br />
(1) Submitting the user's request. The cooperative work among clients is implemented by invoking the grid service of the server. We define the message format of the SubmitRequest operation as follows:
SubmitRequest := AccessRequest
AccessRequest := <User, ActID, Op, PermissionBaseObject>
User := <u_id, u_name, u_ip, time, groupmembership>
ActID := identification of the current user's activity
Op := add | create | modify | delete | save | load | read | write | …
PermissionBaseObject := base class of the operation object
The most important part of our prototype is access control, so we add all the items needed for authorization to SubmitRequest. User indicates the sender of the request, in which u_id is the user's identification, u_name is the name of the user, u_ip is the IP address, time is the request sending time, and groupmembership is the user's group. ActID is the id of the current activity, and Op is the operation of the user request. PermissionBaseObject is the base class of the operation object. Our prototype simulates the permission management system of UNIX to simplify the configuration of permissions, so we abstract the basic permission class of UNIX as PermissionBaseObject.
(2) Submitting the response to the user. After the user submits a request, the server sends feedback to the client according to the request. If the user's request is accepted, the server updates the user's UI and the correct result is sent back to the client to assure the consistency of the client and the server. Otherwise, the server must notify the client that the request is not accepted. So we define the message format of SubmitResponse as follows:
SubmitResponse := <StatusCode, PermissionBaseObject>
StatusCode indicates the completion state code of the operation, a Boolean value. PermissionBaseObject is the permission object, which is the XML data sent back by the server to the client.
(3) Notification mechanism. The notification mechanism is achieved by using the feature of grid services that they can maintain the state of the objects the users operate on. When the state of some resource changes in the grid service, the client monitoring that resource can be notified immediately. We design an interface, CSGServicePortType, inherited from GetMultipleResourceProperties and NotificationProducer, which are defined in the WSRF specification. Part of the WSDL file is shown in Figure 9.
Figure 9. Part <strong>of</strong> WSDL File<br />
C. GA_RBAC Implementation<br />
(1) Permission Basic Class<br />
Figure 10. Class Diagram of GA_RBAC
Figure 10 is the class diagram of the GA_RBAC model. In order to simplify the administrator's work, we provide a minimal permission set for each object, which includes the three operations read, write and delete. We abstract a permission basic class, PermissionBaseObject, for permission judgment, as shown in Figure 11.
Figure 11. PermissionBaseObject Class<br />
(2) Permission Record and PermissionChecker<br />
The PermissionEntry class is composed of two-tuples, and it overrides the equals() and hashCode() functions to judge whether two permission records are equal. The class description is shown in Figure 12.
Figure 12. PermissionEntry Class<br />
UnixPermChecker is used to check the permissions of objects. The class description is shown in Figure 13.
Figure 13. UnixPermChecker Class<br />
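Since Figures 11-13 are not reproduced here, the following sketch only illustrates, under assumed field and method names, what a two-tuple PermissionEntry with overridden equals()/hashCode() and a set-based checker in the spirit of UnixPermChecker might look like:

import java.util.Objects;
import java.util.Set;

// Illustrative sketch; the actual field names of the paper's classes are unknown.
public final class PermissionEntry {
    private final String objectId;   // the protected object
    private final String operation;  // e.g. read / write / delete

    public PermissionEntry(String objectId, String operation) {
        this.objectId = objectId;
        this.operation = operation;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PermissionEntry)) return false;
        PermissionEntry other = (PermissionEntry) o;
        return objectId.equals(other.objectId) && operation.equals(other.operation);
    }

    @Override
    public int hashCode() {
        return Objects.hash(objectId, operation);
    }
}

// A checker in the spirit of UnixPermChecker: a request is granted when the
// corresponding entry is present in the permission set assigned to the subject.
class SimplePermChecker {
    boolean check(Set<PermissionEntry> granted, String objectId, String operation) {
        return granted.contains(new PermissionEntry(objectId, operation));
    }
}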
(3) Subject, Group and User Class<br />
In a manufacturing grid, the staff of enterprise alliances and organizations are the subjects of collaboration. Usually, each enterprise has a predefined organizational structure. For example, an enterprise has
several departments, each composed of several groups. There are several staff members in each group, who may also belong to different groups and departments. In order to represent this kind of structure and realize authorization over the original organization, the subject class is abstracted and the groups adopt a hierarchical structure. Users and groups are in a many-to-many relation. The class description is shown in Figure 14.
Figure 14. Subject, Group and User Class<br />
(4) Role and Session<br />
In the GA_RBAC model, a role class instance indicates a role, which is associated with a specific subject and activity. In this way, the subject can operate according to the permissions in the given activity. A session is mainly used to maintain the available roles for logged-in users and to operate according to the user's request. The class description is shown in Figure 15.
Figure 15. Role and Session Class<br />
(5) Activity and Activity Dependence<br />
Activity and activity dependence are two very important parts of GA_RBAC; they are the key to realizing dynamic permission adjustment. In our prototype, we use an activity diagram to present the activity dependence relation. Our activity-centered authority judgment procedure is shown in Figure 17.
Figure 16. Activity and Activity Dependence Class<br />
Figure 17. Authority Judgment Procedure Example<br />
D. Example<br />
We use the scenario shown in Figure 1 to illustrate our access control model. A product design project, CSGProject, is carried out by three groups: Ga, Gb and Gc. Gb and Gc are responsible for component design and integration design respectively. The project analyzer and project supervisor in Ga provide requirement analysis and project tracking for Gb and Gc. We assume there are users {ua1, ua2}, {ub1, ub2}, {uc1, uc2}, and the activities include stages such as requirement analysis, analysis review, system design, component A design, component B design and integration design. The whole cooperation procedure is as follows.
When CSGProject is started, the state of the requirement analysis activity is transformed from inactive to active. At this time the project analyzer can perform the requirement analysis and create, modify and edit the corresponding documents. When the analyzer's work is done, his activity changes to the completed state; the analysis review activity and the permissions of the monitor are activated once the requirement documents are submitted. If the analysis review does not pass, its state is changed to denied, and the requirement analysis activity, which has a failure dependence relation with it, re-enters the active state. Otherwise, the review activity enters the complete state and system design is activated. Requirement analysis and analysis review have a mutual exclusion dependence, so they must be carried out by different roles and users. The cooperative design will finally succeed. The activity dependence and task division are shown in Figure 18.
Figure 18. Activity Dependence<br />
According to the static permission assignment rules, the related tables are as follows.
TABLE 1
ACTIVITY OF CSGPROJECT
AH (CSGPro) | Available Roles | Available Permissions
Analyse | Analyser | Create Document, Edit Document, default
Verify | Verifier | Note Document, default
GlobalDesign | Designer | Create Document, Edit Document, default
PaDesign | ADesigner | Edit Solid, default
PbDesign | BDesigner | Edit Solid, default
Assemble | Assembler | Create Solid, Boolean Solid, default
admin | Admin, Manager | Manage Role, Manage Activity, default
view | ProjectMember | Read
TABLE 2
ROLES HIERARCHY
Role Name | Parent Role
ProjectMember | (none)
Analyser | ProjectMember
Verifier | ProjectMember
Designer | ProjectMember
PaMember | ProjectMember
PbMember | ProjectMember
Assembler | ProjectMember
Admin | ProjectMember
TABLE 3
GROUP HIERARCHY
Parent Group | Group
A | Analyser
A | SubProjectLeader(SubPL)
B | Designer
B | Draftman
B | SubProjectLeader(SubPL)
C | Draftman
TABLE 4
USER-GROUP MAP
User Name | Group
ua1 | A.Analyser
ua2 | A.SubProjectLeader(SubPL)
ub1 | B.Designer
ub1 | B.SubPL
ub2 | B.Draftman
uc1 | C.Draftman
TABLE 5
SUBJECT-ROLE MAP
Subject Name | Role
A | ProjectMember
B | PaMember
C | PbMember
A.Analyser | Analyser
A.SubPL | Admin
B.Designer | Designer
B.Draftman | PaMember, Assembler
C.Draftman | PbMember
C.Draftman | Assembler
VI. CONCLUSIONS<br />
Traditional manufacturing grid access control models find it hard to deal with the cooperation between dynamic and non-dynamic businesses, and they cannot effectively support global control together with local autonomous management. In this paper, we propose a group-activity based access control model. Activity, activity hierarchy, activity state and activity dependence are introduced into the model; by this means, we can clearly describe the dependence among operations and authorize dynamically. The UNIX-like permission control makes rights management much easier.
ACKNOWLEDGMENT<br />
This work is supported by Guangxi Nature Science<br />
Foundation (09-007-05S018, 2010GXNSFD013037),<br />
Guangxi Key Laboratory <strong>of</strong> Manufacturing System &<br />
Advanced Manufacturing Technology Foundation (11-<br />
031-12S02-02).<br />
REFERENCES<br />
[1] Fan Yushun. Concept and Architecture of Manufacturing Grid. Aeronautical Manufacturing Technology, 2005, 10, 42-45.
[2] Foster I, Kesselman C and Tuecke S. The anatomy <strong>of</strong> the<br />
grid: enabling scalable virtual organizations. International<br />
<strong>Journal</strong> <strong>of</strong> Supercomputer Applications, 2001,15 (3) : 200-<br />
222.<br />
[3] I.Foster,C. Kesselman, G. Tsudik, S. Tuecke. A security<br />
architecture for computational grids. Proceedings <strong>of</strong> the<br />
Fifth ACM Conference on Computer and Communications<br />
Security, November 1998, pp. 83–92.<br />
[4] Pearlman L, Welch V, Foster I. A community<br />
authorization service for Group collaboration. IEEE 3rd<br />
International Workshop on Policies for Distributed<br />
Systems and Networks. 2002, pp:50-59.
[5] YAO Han-Bing HU He-Ping LU Zheng-Ding LI Rui-Xuan.<br />
Dynamic Role and Context-Based Access Control for Grid<br />
Applications. PDCAT 2005, 404 - 406.<br />
[6] CAI Hong-xia, YU Tao, FANG Ming-lun. Access control<br />
<strong>of</strong> manufacturing grid. Computer Integrated Manufacturing<br />
Systems. 2007,4(13) : 717-720.<br />
[7] Hongxia Cai, Tao Yu, Minglun Fang. Access Control<br />
Model <strong>of</strong> Manufacturing Grid. IFIP International<br />
Federation for Information Processing, 2006, p. 938-943.<br />
[8] F. Tao, L. Zhang, K. Lu & D. Zhao. Research on<br />
manufacturing grid resource service optimal-selection and<br />
composition framework. Enterprise Information Systems,<br />
2012,Vol. 6, No. 2, 237-264.<br />
[9] Dongming Zhao ; Yefa Hu ; Zude Zhou. Resource<br />
Service Composition and Its Optimal-Selection Based on<br />
Particle Swarm Optimization in Manufacturing Grid<br />
System. IEEE Transactions on Industrial Informatics,<br />
2008,Vol. 4, No. 4, 315 – 327.<br />
[10] Wenjun Xu, Zude Zhou, D. T. Pham, Quan Liu, C. Ji and<br />
Wei Meng. Quality <strong>of</strong> service in manufacturing networks:<br />
a service framework and its implementation. The<br />
International <strong>Journal</strong> <strong>of</strong> Advanced Manufacturing<br />
Technology. 2012,1-11.<br />
[11] Zhengqiu He, Lifa Wu, Huabo Li, Haiguang Lai, Zheng<br />
Hong. Semantics-based Access Control Approach for Web<br />
Service. <strong>Journal</strong> <strong>of</strong> Computers, Vol 6, No 6 (2011), 1152-<br />
1161.<br />
[12] Bailing Liu. Efficient Trust Negotiation based on Trust<br />
Evaluations and Adaptive Policies. <strong>Journal</strong> <strong>of</strong> Computers,<br />
Vol 6, No 2 (2011), 240-245.<br />
[13] Xiaoming Wang, Yanchun Lin. An Efficient Access<br />
control scheme for Outsourced Data. <strong>Journal</strong> <strong>of</strong> Computers,<br />
Vol 7, No 4 (2012), 918-922.<br />
Zhihui GE was born in Hebei, China, in<br />
1978. He received a B.S. degree in<br />
Computer Science from the Beijing<br />
Technology and Business University,<br />
Beijing, China, in 2001, and a M.S.<br />
degree in Computer Science from the<br />
Guangxi University, Guangxi, China, in<br />
2004 and a Ph.D. degree in Computer<br />
Science from the Central South<br />
University, Hunan, China, in 2007.<br />
He is an associate professor at Guangxi University, Nanning, Guangxi, China. His research interests are in networks and security, with special emphasis on distributed systems.
S<strong>of</strong>tware Design and Development <strong>of</strong><br />
Chang’E-1 Fault Diagnosis System<br />
Ai-Wu Zheng 1,2,3<br />
1 School <strong>of</strong> Astronautics, Beihang University, Beijing 100191, P.R. China<br />
2 Science and Technology on Aerospace Flight Dynamics Laboratory, Beijing 100094, P.R. China<br />
3 Beijing Aerospace Control Center , Beijing 100094, P.R. China<br />
Email:awzheng@163.com<br />
Yu-Hui Gao 2 , Yong-Ping Ma 2 and Jian-Ping Zhou 1<br />
1 School <strong>of</strong> Astronautics, Beihang University, Beijing 100191, P.R. China<br />
2 Beijing Aerospace Control Center , Beijing 100094, P.R. China<br />
Abstract— During a mission, detecting failure accurately<br />
and implementing countermeasures in time may decide the<br />
success <strong>of</strong> a space mission. Chang’E-1 mission is the first<br />
lunar exploration mission <strong>of</strong> China. The satellite has several<br />
different telemetry formats and bit rates. Moreover, the<br />
accuracy requirement <strong>of</strong> orbit control is high and demands<br />
real-time handling during contingency or abort operations.<br />
In addition, the time duration <strong>of</strong> the mission is very long,<br />
lasting more than a year. All these cause some difficulties in<br />
fault detection, judgment and handling in the mission.<br />
Therefore, in order to detect faults accurately and duly,<br />
determine the failure modes in time, implement<br />
countermeasures effectively, and provide warning and trend<br />
analysis, s<strong>of</strong>tware <strong>of</strong> fault diagnosis system is specifically<br />
developed for Chang’E-1 mission. The system adopts plugin<br />
s<strong>of</strong>tware structure and advanced Berkeley DB real-time<br />
database. In addition, ice middleware is exploited for<br />
message passing between processes. It is also the first time<br />
that the failure criterion <strong>of</strong> the mission is described by<br />
extensible markup language (XML), and this well solves the<br />
problem <strong>of</strong> separating failure criterion from program<br />
coding. The fault diagnosis system can not only well and<br />
truly judge failures with clear criterion, but can also<br />
provides trend warnings for failures without clear<br />
descriptions, and is very helpful for the personnel who are<br />
in charge <strong>of</strong> fault detection. The advanced technologies<br />
applied in the system make it expansible. It can also be used<br />
in other space missions.<br />
Index Terms—Chang’E-1 mission, fault diagnosis, plug-in,<br />
real-time database, middleware<br />
I. INTRODUCTION<br />
Because <strong>of</strong> the complexity <strong>of</strong> the space environment<br />
and testing limitations <strong>of</strong> a spacecraft, sometimes there<br />
will be contingency or system failure during the space<br />
missions. The four most serious accidents in manned space flight history in which astronauts were killed are Apollo 4A in January 1967, Soyuz-1 in April 1967, Soyuz-11 in June 1971 and Challenger in January 1986 [1]. Serious accidents also happened to the Hubble Space Telescope of the U.S. in 1990, and to the Mars Climate Orbiter and Mars Polar Lander in 1999. In November 2000, the Indian satellite INSA-2B lost the ability to point to the Earth, and the accident also caused huge losses [2]. These accidents made countries attach importance to research on spacecraft fault diagnosis techniques, in order to detect failures in time and to avoid or reduce casualties and equipment losses as much as possible, thus saving spacecraft launch and operation costs [3, 4].
(Manuscript received August 1, 2011; revised September 15, 2011; accepted September 20, 2011. Project 11173005 supported by the National Natural Science Foundation of China. doi:10.4304/jsw.7.12.2687-2694)
Chang'E-1 was the first lunar exploration mission of China. The satellite has several different telemetry formats and bit rates. The orbit control requires high accuracy and demands real-time fault handling. In addition, the duration of the mission is very long, lasting more than a year. All of these cause difficulties in fault detection, judgment and handling. According to the orbit design scheme of the mission, the satellite needs two weeks from launch to finally entering the target orbit, where it stays for a year [5]. If fault diagnosis relied entirely on manual monitoring, the personnel would be exhausted and it would still be easy to miss failures. According to the analysis during mission preparation, there would be more than eighty possible failure modes, and some failures would occur within a very concentrated period of time. For example, before and after the first braking maneuver near the moon, nearly twenty types of fault may occur intensively, and most of them are emergencies requiring urgent disposal; otherwise the satellite may be endangered or may fly out of the gravitational field of the moon.
Therefore, in order to monitor the satellite status and detect faults accurately and promptly in the Chang'E-1 mission, determine prepared failure modes in real time, implement countermeasures effectively, and provide warnings and trend analysis, the software of the fault diagnosis system was specifically developed as an assistant to improve the timeliness, accuracy and efficiency of fault diagnosis and to guarantee the success of the mission.
II. SPACECRAFT FAULT DIAGNOSIS TECHNOLOGY
TABLE I
A BRIEF SUMMARY OF FAULT DIAGNOSIS TECHNOLOGY

Signal processing-based fault diagnosis method
  Advantage: a threshold model is adopted, and it is the basis of other diagnostic methods.
  Disadvantage: 1) covers limited failure modes; 2) may misdiagnose or fail to diagnose.

Rule-based expert system diagnosis method
  Advantage: 1) simple, intuitive and convenient; 2) rapid diagnosis; 3) requires a relatively small data storage space; 4) easy to program and implement.
  Disadvantage: cannot diagnose unexpected failures.

Fault tree based fault diagnosis method
  Advantage: 1) rapid diagnosis; 2) easy to dynamically modify the knowledge library and maintain consistency; 3) domain-independent: as long as the corresponding fault tree is given, the diagnosis can be achieved.
  Disadvantage: diagnosis results rely heavily on the completeness of the fault tree.

Neural network based fault diagnosis method
  Advantage: 1) highly nonlinear; 2) high fault tolerance; 3) associative memory.
  Disadvantage: 1) cannot reveal potential relationships within the system and cannot give a clear explanation of the diagnostic process; 2) cannot diagnose failures that never appeared in the training samples, and may even draw wrong diagnosis conclusions.

Model-based fault diagnosis method
  Advantage: 1) high accuracy; 2) can diagnose unforeseen failures, with no need for experience and knowledge.
  Disadvantage: 1) the diagnosis is slow; 2) strong dependence on the accuracy of the model; any uncertainty of the model can lead to false alarms.

Petri net-based fault diagnosis method
  Advantage: can dynamically describe the generation and propagation of fault phenomena, making it easy to diagnose from the changes of the system behavior.
  Disadvantage: 1) totally dependent on the Petri net model; 2) difficult to locate the source of the fault, and the problem-solving process is prone to conflict.
According to the domestic and foreign development history of fault diagnosis technology, a brief summary of the main methods is listed in Table I [6-9].
Currently, fault diagnosis systems have developed from fault diagnosis expert systems for single subsystems (such as the power system or the thermal control system) to integrated spacecraft health management systems that combine system status monitoring, fault diagnosis and fault repair in one [10]. Fault diagnosis methods have developed from single diagnostic methods (such as rule-based diagnosis, fault tree based diagnosis, etc.) to combinations of various methods. With the rapid development of computer technology, many new technologies are used in fault diagnosis systems, such as network technology, information fusion technology, distribution theory, agent and multi-agent system technology, and excellent man-machine interfaces. These technologies provide strong support to the development and maintenance of fault diagnosis systems.
Domestic research on spacecraft fault diagnosis technology started late. In recent years, domestic aerospace experts have gradually recognized the importance and urgency of work in this area and have done some theoretical research, but most of the diagnosis systems developed so far are demonstration systems.
The Chang'E-1 fault diagnosis system is the first fault diagnosis system applied to an actual mission. The system is a rule-based expert system. It successfully separates the specific failure criteria from the program code by using an XML-based markup language description. It also uses the Berkeley DB real-time database, Ice middleware and a plug-in software architecture. The system provides successful experience in fault diagnosis for future missions.
III. CHANG'E-1 FAULT DIAGNOSIS SYSTEM DESIGN
A. System Functional Requirements<br />
The task of the Chang'E-1 fault diagnosis system is to provide technical support for centralized satellite status monitoring, fault diagnosis and emergency disposal for commanders and flight control operators. The system improves the ground capability of centralized monitoring and real-time analysis of important satellite status, provides early warnings and trend alarms, and helps mission decisions.
The functional requirements of the system include:
1) Telemetry monitoring, alerting and analysis
The system receives real-time telemetry data packages from the monitor display network, checks the correctness of the data formats and interprets telemetry frames for associated judgments. When there is a telemetry interruption or overrun, alarm messages are given and the corresponding parameters are displayed.
2) Real-time monitoring and alarm of important satellite parameters and events
During telemetry monitoring, alarm and analysis, the system also receives mission control plans, station tracking forecasts, telecommand sending sequences, orbit elements and other information to determine important satellite status and mission-critical events.
3) Real-time fault inspection, diagnosis and alarm
According to the different phases of the mission, one or more failure modes that are most likely to occur are selected, and the corresponding telemetry parameters of the failure criteria are displayed and monitored in a centralized way.
4) Fault handling support
Once a fault is confirmed to have occurred, the system prompts the corresponding disposal measures step by step. During the fault handling, the parameters involved and the status changes are displayed in real time to determine the effect of the implementation.
5) Fault trend warning and analysis
Some parameter or state changes are gradual processes, such as temperature, pressure and voltage; the system monitors and analyzes these trends. If a trend is close to the failure threshold, it gives early warnings.
B. System Flow<br />
The fault diagnosis system receives telemetry packages, mission control plans, station tracking forecasts, telecommand sending sequences, orbit elements and other information to determine important satellite status and mission-critical events. The outputs are alarms of failures or other unusual events, decision support information, all kinds of status messages, and data analysis and statistics. The system block diagram is shown in Figure 1.
Figure 1. Block diagram of fault diagnosis system (data collection process, data analysis process, real-time database, configuration profiles, telemetry channel monitoring and alarm, telemetry overrun judgment and alarm, important event prediction and alarm, and diagnosis and alarm of 84 failure modes)
C. System Hardware Structure
The fault diagnosis system is independent of the Chang'E-1 mission control system. It is connected to the monitor display network as shown in Figure 2, and it receives the original data and results issued from the monitor display server directly, in the same way as the monitor display terminals.
Figure 2. Network connection diagram of fault diagnosis system (mission control system, monitor display net, monitor display server, monitor display terminals, fault diagnosis terminals)
Each fault diagnosis terminal is equal: the fault diagnosis system can run on any one of them, and consistency of monitoring, early warning and analysis is maintained by performing information and version synchronization operations during fault diagnosis.
D. Software Hierarchy
The software hierarchy model of the fault diagnosis system is shown in Figure 3.
Figure 3. Fault diagnosis system hierarchy (man-machine interface layer; special plug-in layer: fault diagnosis, countermeasures, trend warning; common plug-in layer: rule analysis, logic process; special service layer: status inquiry, status extraction, status save; basic service layer: network processing, file system, database)
There are five layers in the system:<br />
1) Basic service layer<br />
Basic service layer is the foundation <strong>of</strong> the fault<br />
diagnosis system including network processing, file<br />
system and real-time database. It provides the ability <strong>of</strong><br />
obtaining various mission information, spacecraft status<br />
and configuration information from the network, files and<br />
database for special service and higher layers.<br />
2) Special service layer<br />
Special service layer provides inquiry, extraction and<br />
storage <strong>of</strong> spacecraft status, configuration information<br />
and fault rules for fault analysis and diagnosis.<br />
3) Common plug-in layer
2690 JOURNAL OF SOFTWARE, VOL. 7, NO. 12, DECEMBER 2012<br />
The common plug-in layer includes rule explanation, logic processing, etc. The rule explanation plug-in interprets the labels, methods and logic operations defined in the XML files that describe failure modes. The logic process plug-in provides the functions listed in Table II to evaluate logical expressions.
4) Special plug-in layer<br />
Special plug-in layer includes plug-ins <strong>of</strong> fault<br />
diagnosis, countermeasures and trend warning. They are<br />
based on common plug-in layer, and provide the ability<br />
<strong>of</strong> fault diagnosis, countermeasures and trend warning by<br />
rule explanation and logic process.<br />
5) Man-machine interface layer<br />
Man-machine interface layer provides interface for<br />
operations, configuration and management.<br />
E. XML-Based Fault Rule Design<br />
In order to determine the failure modes of the mission quickly and accurately, the spacecraft status and its changes must be translated into a language that a computer can recognize. This language must be easy to understand, easy to write, and able to give a clear description of failures. Because XML has the ability to describe data in a structured, self-describing way and can be defined and used according to the needs of the application, it is selected for the rule description of spacecraft fault diagnosis.
First, mathematical and logic operators are defined. The criteria of failure modes and significant events are expressed in this language to generate formulas over a single telemetry parameter or several parameters, a trend of change, or even a combination of several judgments. The formulas form rules and are written into files. Finally, the files are stored in the rule library as fault diagnosis criteria. When the data analysis process reads the rules from the files, it parses them so that the computer can evaluate them.
Rules are independent of the program, so the program needs no modification when rules change. Rules can be added or modified at any time, achieving the goal of separating specific faults from the program. When the criterion of an important event or a failure mode changes, only the corresponding rule file needs to be modified. The rule library requires regular maintenance to stay consistent with the mission.
F. Design of Formulas
The rule description of status is the key problem to be solved. Rule formulas can use common mathematical and logical operators directly, or introduce external functions to extend the operation capabilities.
External functions are divided into several types: citing, extraction, judgment, computation and operation. Citing functions take the form "FromXXX" and get related status information directly from telemetry parameters, orbit forecasting and control strategies. Extraction functions take the form "GetXXX" and obtain auxiliary information about other parameters through appropriate processing; for example, "GetPrev" obtains the previous value of a specified parameter, and "GetBit" obtains a particular bit of a specified parameter. Judgment functions take the form "JudgeXXX" and determine the trend of a parameter; for example, "JudgeDecrease" is used to determine whether a specific parameter is in a decreasing trend during a certain period, and "JudgeChange" is used to determine whether a specific parameter remains unchanged during a certain period. The computation type has only one function, named "Compute", which gives the result of a logical expression. The operation type is used to call a specified function to perform an operation such as assigning a value; for example, "OperateLog" performs a log record operation. Some of the functions are listed in Table II.
TABLE II
FUNCTIONS FOR FAULT RULES
Function name | Function contents | Parameter description
FromGlobal | Direct from external global parameters | [parameter identification]
FromTM | Direct from external telemetry library | [telemetry library, parameter identification]
FromPlan | Direct from external plan | [plan identification]
FromPredict | Direct from external orbit forecasting | [forecasting identification]
GetFree | Read a parameter from a user's input | [parameter type, default value]
GetBit | Get a bit from a byte | [parameter code, starting bit, total bits]
GetPrev | Get the previous value of a telemetry parameter | [parameter code]
GetAverage | Get the average value of a parameter during a certain period | [parameter code, time(s)]
GetUpdateTime | Get the updated value of a parameter | [parameter code]
GetLastTrueTime | Get the latest time an event happened | [sub-event code]
GetLastTrueBeginTime | Get the earliest time an event happened | [sub-event code]
GetHistory | Get the history values of a parameter | [parameter code, numbers]
GetDecVal | Get the decrease of a parameter during a certain period | [parameter code, time(s)]
GetIncVal | Get the increase of a parameter during a certain period | [parameter code, time(s)]
GetAdd | Get the sum of several parameters | [parameter code1, parameter code2, …]
GetMulti | Get the product of several parameters | [parameter code1, parameter code2, …]
GetMax | Get the maximum of a parameter during a certain period | [parameter code, time(s)]
GetMin | Get the minimum of a parameter during a certain period | [parameter code, time(s)]
GetTmUpdateTime | Get the time of telemetry updating | -
JudgeIncrease | Judge whether a parameter is in an increasing trend during a certain period | [parameter code, time(s)]
JudgeExceed | Judge whether a parameter has exceeded its limits | [parameter code, lower limit, upper limit]
JudgeChange | Judge whether a parameter has changed | [parameter code, time(s)]
Compute | Compute the value of a formula | [logical formula]
OperateEvaluate | Assign a value to a variable | [function expression]
OperateLog | Record an operation log | -
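To illustrate how such functions might be evaluated against stored telemetry, the sketch below (illustrative only; the storage layout, class and method names are assumptions, not the mission code) implements the GetAverage and JudgeExceed entries of Table II:

import java.util.*;

// Sketch: telemetry samples per parameter code, kept in a time-ordered map.
public class RuleFunctions {
    // parameter code -> time-ordered samples (timestamp in seconds -> value)
    private final Map<String, NavigableMap<Long, Double>> history = new HashMap<>();

    public void record(String code, long timeSec, double value) {
        history.computeIfAbsent(code, k -> new TreeMap<>()).put(timeSec, value);
    }

    // GetAverage: average value of a parameter over the last 'seconds' seconds.
    public double getAverage(String code, long seconds, long nowSec) {
        NavigableMap<Long, Double> samples =
                history.getOrDefault(code, new TreeMap<>()).tailMap(nowSec - seconds, true);
        return samples.values().stream()
                .mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    // JudgeExceed: true when the latest value of the parameter is out of range.
    public boolean judgeExceed(String code, double lower, double upper) {
        NavigableMap<Long, Double> samples = history.get(code);
        if (samples == null || samples.isEmpty()) return false;
        double latest = samples.lastEntry().getValue();
        return latest < lower || latest > upper;
    }
}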
G. Real-time Database Technology<br />
The Berkeley DB real-time database, an embedded database, is applied in the system. It is embedded in the application and is suitable for managing vast amounts of simple data, up to 256 TB. Because the application and the database management system run in the same process space, cumbersome inter-process communication is avoided in data operations, so it is more efficient than a relational database.
The advantages of an embedded database are:
Firstly, cumbersome inter-process communication, such as the establishment of socket connections, is avoided, so the communication overhead is greatly reduced.
Secondly, simple function calls, instead of the frequently used SQL language, are used to complete all database operations, which saves the time needed to parse and process a query language.
In the Chang'E-1 fault diagnosis system, the real-time database is used to save the initial satellite status obtained from the mission monitor and display network. Quick and accurate fault diagnosis needs to know the real-time satellite status, as well as the historical satellite status for trend warning. The Berkeley DB database can meet these demands.
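The paper does not show database code. The following minimal sketch uses the Berkeley DB Java Edition API to illustrate the embedded put/get style described above; the use of the Java Edition, the database and key names, and the storage of values as strings are all assumptions (the mission software may have used a different Berkeley DB edition):

import com.sleepycat.je.*;
import java.io.File;
import java.nio.charset.StandardCharsets;

// Sketch: an embedded key-value store holding the latest sample of a parameter.
public class TelemetryStore {
    public static void main(String[] args) throws Exception {
        File home = new File("telemetry-env");
        home.mkdirs();                                 // JE requires an existing directory
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        Environment env = new Environment(home, envConfig);

        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        Database db = env.openDatabase(null, "telemetry", dbConfig);

        // put: key = parameter code, value = latest sample (no IPC involved)
        DatabaseEntry key = new DatabaseEntry("TM_BATTERY_V".getBytes(StandardCharsets.UTF_8));
        DatabaseEntry value = new DatabaseEntry("28.4".getBytes(StandardCharsets.UTF_8));
        db.put(null, key, value);

        // get: read it back with a simple function call instead of SQL
        DatabaseEntry found = new DatabaseEntry();
        if (db.get(null, key, found, LockMode.DEFAULT) == OperationStatus.SUCCESS) {
            System.out.println(new String(found.getData(), StandardCharsets.UTF_8));
        }
        db.close();
        env.close();
    }
}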
H. Middleware Technology<br />
Ice middleware is exploited for message passing and status synchronization between processes in the system. Ice is an object-oriented middleware platform; basically, it provides tools, APIs and library support for object-oriented client-server applications. Ice applications are suitable for use in a heterogeneous environment, where client and server can use different programming languages, different operating systems, different machine architectures and a variety of network technologies. So the fault diagnosis system can use a different environment from the monitor display server.
IV. PAGE DESIGN AND SOFTWARE IMPLEMENTATION<br />
A. Design <strong>of</strong> Functional Modules<br />
The fault diagnosis system uses multiple processes and is mainly composed of two of them. One is the data collection process: it collects data from the monitor display net, interprets the data frames, and stores the data into the real-time database. The other is the data analysis process: it extracts data from the real-time database for analysis, judges whether a failure mode has happened based on the analysis, and finally gives warnings or alarms. The real-time database provides storage and access of data; the data collection process acts as a write user of the database while the data analysis process acts as a read-only user. These relations are shown in Figure 1.
At the same time, each process uses a plug-in design. A plug-in architecture is a flexible component software architecture: functional modules are implemented by plug-ins instead of a conventional monolithic program. A plug-in can be independent of the program module, can be developed separately, loaded into the system at run time, and deleted or replaced at any time. Thus, the extensibility and flexibility of the software are improved [11].
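The paper does not name its plug-in mechanism. Purely as an illustration of loading and replacing functional modules at run time, the sketch below uses java.util.ServiceLoader with an assumed plug-in interface (the interface name, method and discovery mechanism are not from the paper):

import java.util.ServiceLoader;

// Assumed plug-in contract: each plug-in handles one concern
// (e.g. fault diagnosis, countermeasures, trend warning).
interface DiagnosisPlugin {
    String name();
    void process(byte[] telemetryFrame);
}

public class PluginHost {
    public static void main(String[] args) {
        // Implementations are discovered from META-INF/services entries on the
        // classpath, so they can be added or replaced without rebuilding the host.
        for (DiagnosisPlugin plugin : ServiceLoader.load(DiagnosisPlugin.class)) {
            System.out.println("loaded plug-in: " + plugin.name());
        }
    }
}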
B. Fault Diagnosis Process<br />
Figure 4. Fault diagnosis process (receiving data from the net, correctness analysis of the data, data frame explanation, storing the data into the real-time database, data access, rule parsing, fault diagnosis, leveled alarms and result display; incorrect data are discarded)
The fault diagnosis process is shown in Figure 4. On one hand, the software stores the real-time online data from the monitoring net into the database. On the other hand, it reads the fault rules in the library, calls the interpreter to parse the rules, determines the status of the satellite and the fault conditions, and then gives alarms. Diagnostic results are displayed in pages.
C. Implementation <strong>of</strong> Fault Rules<br />
The fault rules include three levels of description: status, event, and parameter.
The parameter rule is the basic unit of representation. In the parameter configuration, satellite telemetry parameters can be used directly, or new parameters can be created by formulas. A parameter configuration is defined as follows:
<Parameter Configuration>
<Parameter Code="xxx" Description="xxx" Formula="xxx"/>
<Parameter Code="xxx" Description="xxx" Formula="xxx"/>
……
</Parameter Configuration>
The event rule is expressed by logic operations over one or more parameters; the code and description of the event are needed.
The status rule is used for satellite status description. A status can be expressed by one or more events. The code, description and type of the status can be defined, and the level of each event can also be defined, as follows:
<Status Configuration>
<Status Code="xxx" Description="xxx" Type="xxx">
<Event Code="xxx" Level="xxx"/>
<Event Code="xxx" Level="xxx"/>
……
</Status>
</Status Configuration>
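As an illustration of reading such rule files, the sketch below uses the standard Java DOM API. The file name is an assumption, and the sketch presumes a well-formed variant of the parameter configuration above (a single-word root element wrapping the Parameter elements); it is not the mission's rule interpreter:

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch: load every Parameter rule and print its code, description and formula.
public class RuleLoader {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File("parameter-rules.xml"));   // assumed file name
        NodeList params = doc.getElementsByTagName("Parameter");
        for (int i = 0; i < params.getLength(); i++) {
            Element p = (Element) params.item(i);
            // Each rule carries a code, a human-readable description and a formula
            // string that the rule interpreter evaluates against telemetry.
            System.out.printf("%s: %s => %s%n",
                    p.getAttribute("Code"),
                    p.getAttribute("Description"),
                    p.getAttribute("Formula"));
        }
    }
}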
D. Fault Checking Periods<br />
Some failure modes can only occur in particular periods, such as the moment when the satellite separates from the rocket, or before and during orbit maneuvers. Other failure modes may occur throughout the mission. As for disposal methods, the same fault occurring at different times may need different disposal. So the period is introduced into fault diagnosis: failures are checked according to the period in which they may possibly occur, improving the efficiency of the system, and the corresponding countermeasure is given according to the period.
E. Alarm Mode<br />
Alarms are notified in following ways:<br />
1) Sound alarm<br />
Give voice prompts according to the fault content, play<br />
the corresponding sound files.<br />
2) Color Alarm<br />
The color is distinguished by yellow and red, directly<br />
colored in the parameter or the corresponding button.<br />
Yellow is a warning color, suggesting that failure may<br />
occur. Red is an alarm color, suggesting that failure has<br />
occurred.<br />
3) Warning dialog box
A warning dialog box pops up according to the failure content, giving the specific cause of the failure that occurred.
4) Alarm log
All abnormal information monitored by the system and all alarms are written into alarm logs. The alarm color is retained for the convenience of users' queries.
F. Page Design
Many different pages have been designed for the fault diagnosis system. One of them is the page displaying all prepared failure modes, including alarm diagnosis, alarm and monitoring, shown in Figure 5.
Figure 5. Judgments and alarm <strong>of</strong> prepared failure mode.<br />
The comprehensive monitoring <strong>of</strong> each failure can be<br />
inquired in detail from the page shown in Figure 6.<br />
Figure 6. Comprehensive monitoring <strong>of</strong> failure mode<br />
Figure 7 is the display <strong>of</strong> several tab pages, including<br />
real-time display <strong>of</strong> state judgments, parameter<br />
monitoring, alarm log, mission process viewer and<br />
telecommand sending monitoring.
Figure 7. Tab pages display<br />
G. System Testing<br />
After the completion of the program design and coding of the Chang'E-1 fault diagnosis system, the software was tested many times in failure drills during mission preparation, and the fault rules were modified according to the diagnosis results. For a clearly described failure mode, only 3 telemetry frames, in other words 6 seconds, are needed to judge the failure mode (one telemetry frame corresponds to 2 seconds when the telemetry bit rate is 512 bps; the software uses a two-out-of-three decision mechanism over consecutive frames).
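One possible reading of this two-out-of-three mechanism is sketched below; it is illustrative only, since the actual mission decision logic is not published in this paper:

import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: confirm a failure criterion only when at least 2 of the last 3
// telemetry frames satisfied it.
public class FrameVoter {
    private final Deque<Boolean> lastJudgments = new ArrayDeque<>();

    public boolean confirmed(boolean frameJudgment) {
        lastJudgments.addLast(frameJudgment);
        if (lastJudgments.size() > 3) {
            lastJudgments.removeFirst();
        }
        long hits = lastJudgments.stream().filter(b -> b).count();
        return lastJudgments.size() == 3 && hits >= 2;
    }
}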
However, some failures, such as attitude abnormalities, cannot be quantitatively described, so it is difficult to judge whether the failure has happened; the system can still provide trend analysis according to the changing trends of some parameters.
Because the rules are independent of the program, once the rules are modified and reloaded, the system automatically diagnoses according to the new rules. This makes the system very easy to upgrade.
V. CONCLUSION
The Chang'E-1 fault diagnosis system is a multi-process program. The data collection process collects data from the monitor display net, processes the data format and then stores the data into the real-time database. At the same time, several data analysis processes extract data from the database for analysis, judge failure modes, and give warnings or alarms. Every process adopts a plug-in software structure. The Berkeley DB real-time database of the system not only saves huge amounts of data and results, but also provides highly efficient data queries. The system also uses Ice middleware for message passing between processes. Based on the experience of satellite flight control, an XML-based language is used for the rule description of failure modes; this is a solution to separating failure criteria from the program. The system can not only rapidly judge failures that are clearly described, but can also provide trend warnings for modes that cannot be quantitatively described, helping the mission operators to analyze. The Chang'E-1 fault diagnosis system was very helpful for fault diagnosis in the mission. It is the first fault diagnosis system applied to an actual mission, and it provides successful experience in fault diagnosis for future missions.
ACKNOWLEDGMENT<br />
The authors wish to thank Dr. Song-Jie HU. This work<br />
is supported in part by a grant from National Natural<br />
Science Foundation <strong>of</strong> China (Grant No. 11173005).<br />
BIOGRAPHY<br />
Ai-wu Zheng was born in 1974. She received her M.S. degree from the National University of Defense Technology in 1999. She is now a doctoral student in the School of Astronautics of Beihang University, majoring in spacecraft design and engineering, and is a senior engineer. Her main research area is mission design.
Yu-hui Gao was born in 1978. He received his M.S. degree from Beihang University in 2010. He is an engineer. His main research area is computer architecture.
Yong-ping Ma was born in 1964. He received his Ph.D. degree in spacecraft design from Beijing University of Aeronautics and Astronautics in 2010. He is a researcher. His main research area is spacecraft flight control.
Jian-ping Zhou received his Ph.D. degree in solid mechanics from the National University of Defense Technology in 1989. He is a Ph.D. supervisor, a professor, and the Chief Designer of China's manned space flight program.
An Improved Implementation <strong>of</strong> Preconditioned<br />
Conjugate Gradient Method on GPU<br />
Yechen Gui<br />
Shenzhen Institute <strong>of</strong> Advanced Technology, Chinese <strong>Academy</strong> <strong>of</strong> Sciences, Shenzhen, China<br />
Email:liaoran919@yahoo.com.cn<br />
Guijuan Zhang<br />
Shenzhen Institute <strong>of</strong> Advanced Technology, Chinese <strong>Academy</strong> <strong>of</strong> Sciences, Shenzhen, China<br />
Email:gj.zhang@siat.ac.cn<br />
Abstract—An improved implementation <strong>of</strong> the<br />
Preconditioned Conjugate Gradient method on GPU using<br />
CUDA (Compute Unified Device Architecture) is proposed.<br />
It aims at solving the Poisson equation arising in liquid
animation with high efficiency. We consider the features <strong>of</strong><br />
the linear system obtained from the Poisson equation and<br />
propose an optimization method to solve it. First, a novel<br />
storage format called mDIA (modified diagonal storage<br />
format) is presented to improve the efficiency <strong>of</strong> the Sparse<br />
Matrix-Vector product (SpMV) operation. Second, a<br />
parallel Jacobi iterative method is proposed when using the<br />
Incomplete Cholesky preconditioner to explore inherent<br />
parallelism. Third, CUDA streams are also introduced to<br />
overlap computations among separate streams. The<br />
proposed optimization technique is embedded into our GPU<br />
based PCG algorithm. Results on a GeForce G100 show that our SpMV kernel yields an improvement of nearly 100% for large sparse matrices with more than 300,000 rows. Also, a speedup of more than 7 is obtained for the PCG method, making a real-time physics engine possible.
Index Terms—CUDA, PCG, Incomplete Cholesky<br />
preconditioner, SpMV, Poisson equation<br />
I. INTRODUCTION<br />
In the past few years, Graphics Processing Unit (GPU)<br />
has evolved into a unified powerful many-core processor.<br />
The modern GPUs are well suited for compute-intensive<br />
tasks and massively parallel computation (e.g., solving<br />
matrix problems [1] [2]). As one <strong>of</strong> the most common and<br />
important matrix problems, solving the large-scale linear<br />
system can be significantly accelerated if corresponding<br />
algorithms can be mapped well to the structure <strong>of</strong> the<br />
GPU and be accord with SIMD (Single Instruction,<br />
Multiple Data) pattern.<br />
In this paper, we focus on the problem <strong>of</strong> solving the<br />
Poisson equation. The equation arises in many<br />
applications such as computational fluid dynamics,<br />
electrostatics, magnetostatics, etc. Numerical solution <strong>of</strong><br />
the Poisson equation leads to a large sparse linear system.<br />
It is usually solved by iterative methods such as the best-known conjugate gradient (CG) method instead of direct
methods (e.g., Gaussian elimination). The CG method<br />
can be easily implemented to solve linear systems that have a symmetric positive definite (SPD) matrix [3].
However, it is <strong>of</strong>ten used with a suitable preconditioner in<br />
order to achieve high convergence rates in large scale<br />
applications. A CG algorithm with a preconditioner is<br />
called preconditioned conjugate gradient algorithm (PCG)<br />
and it has been proven to be efficient and robust in a wide<br />
range <strong>of</strong> applications [4].<br />
Our goal is to solve Poisson equation efficiently by<br />
applying PCG algorithm on the Nvidia GPU architecture<br />
using CUDA [5]. Since the SpMV routine is the<br />
bottleneck <strong>of</strong> PCG algorithm that consumes nearly 80%<br />
<strong>of</strong> the total time, we present a novel storage format called<br />
mDIA storage format to optimize it. Moreover, we<br />
parallelize the traditional Jacobi iterative method to solve<br />
the lower Cholesky triangular equation when using the<br />
Incomplete Cholesky (IC) preconditioner. In addition, to<br />
effectively overlap the computation, CUDA streams are<br />
also adopted in this paper. Results show that our method<br />
obtains a speedup <strong>of</strong> 7 for PCG algorithm on Geforce<br />
G100.<br />
The paper is organized as follows. The next section<br />
introduces the background <strong>of</strong> our method. The related<br />
work on GPU based PCG methods is reviewed first, and
then we give a brief introduction <strong>of</strong> our linear system<br />
generated from Poisson equation. GPU architecture and<br />
our optimization algorithm based on GPU are presented<br />
in Section 3. In this section, the optimization techniques<br />
are discussed in detail. Section 4 shows experimental<br />
results followed by conclusions in Section 5.<br />
II. BACKGROUND<br />
2.1. Related Work<br />
Jeff Bolz et al. [6] were the first to implement the CG method on GPU using a shader language, and the speedup was about 1.5x. They also showed the feasibility of using the Compressed Row Storage format for the SpMV routine.
After the advent <strong>of</strong> NVIDIA CUDA, GPU based iterative<br />
methods have been widely used to solve the sparse linear<br />
systems [7][8][9]. For example, Georgescu et al. [7]
discussed how CG method could be aligned to the GPU<br />
architecture. They also discussed the problem with<br />
precision and applied different preconditioners to
accelerate convergence. In particular, they stated that for double-precision calculations, problems having a condition number less than 10 may converge and also give a speedup. In 2009, Buatois et al. [10] introduced their
framework CNC for solving general sparse linear systems<br />
on GPU with a Jacobi-preconditioned CG method. Their<br />
method achieved a speedup <strong>of</strong> 3.2. However, they warned<br />
that GPU is only able to provide comparable accuracy<br />
because as the iterations increase, the precision drops in<br />
comparison to CPU. They also exploited some <strong>of</strong> the<br />
techniques like register blocking to optimize their<br />
algorithm. In [11], the CG method with Incomplete<br />
Poisson preconditioning was implemented on a multi-<br />
GPU platform. It mainly focused on overlapping<br />
communication between different GPUs by interactively<br />
exchanging boundary stream and inner stream. Their<br />
results showed that the performance can grow<br />
proportionally to the problem size and showed a good<br />
scalability. In work published by A.Asgasri[9], the author<br />
parallelized a Chebyshev polynomial preconditioner to<br />
improve the performance <strong>of</strong> PCG method based on GPU.<br />
As for the CG algorithm, nearly 80% <strong>of</strong> the total time<br />
is consumed by SpMV routine. It yields only a small<br />
fraction <strong>of</strong> the machine peak performance due to its<br />
indirect and irregular memory access. Therefore, there<br />
exists a large amount <strong>of</strong> work focusing on speeding up<br />
SpMV routine. Typical methods <strong>of</strong>ten use CSR<br />
(Compressed sparse row) format, COO (the coordinate)<br />
format and the DIA (diagonal) storage formats to mitigate<br />
the irregularity [12]. In a recent study by Nathan Bell [13],<br />
a hybrid method that used the modified ELL-COO format<br />
to store the sparse matrix delivered high throughput.<br />
However, it relied on an additional sweep operation to<br />
find out the number <strong>of</strong> nonzero elements in the matrix.<br />
2.2. The Linear System Derived from the Poisson<br />
Equation<br />
The Poisson equation in liquid animation is a second-order PDE, as shown in equation (1). It is used to compute the pressure p:
∇²p = (ρ/Δt) ∇·u   (1)
Note that in the Poisson equation (1), p is the pressure, ρ is the density and Δt is the time step. It can be further transformed into equation (2) according to the finite difference method:
−p_{i−1,j,k} − p_{i,j−1,k} − p_{i,j,k−1} + 6p_{i,j,k} − p_{i+1,j,k} − p_{i,j+1,k} − p_{i,j,k+1}
  = −Δx² (ρ/Δt) (u_{i+1,j,k} − u_{i,j,k} + v_{i,j+1,k} − v_{i,j,k} + w_{i,j,k+1} − w_{i,j,k}) / Δx
  = −Δx (ρ/Δt) (u_{i+1,j,k} − u_{i,j,k} + v_{i,j+1,k} − v_{i,j,k} + w_{i,j,k+1} − w_{i,j,k})   (2)
In equation (2), Δx is the space interval and u = (u, v, w) is the velocity field. Let
b = −Δx (ρ/Δt) (u_{i+1,j,k} − u_{i,j,k} + v_{i,j+1,k} − v_{i,j,k} + w_{i,j,k+1} − w_{i,j,k}).   (3)
Equation (2) can then be converted into Ap = b, and our goal is to solve for the unknown vector p.
It can be proved that A is a sparse, symmetric, positive-definite matrix. In addition, we also explore some other features of matrix A in order to design more efficient algorithms. As can be seen from the left side of equation (2), each row of A has no more than 7 nonzero elements. In such a row, the diagonal element is a nonzero integer while the other nonzero elements equal −1. Other rows also have
similar structures. All these features will be considered in<br />
our new algorithm. Details will be given in the following<br />
sections.<br />
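To make this structure concrete, the following host-side sketch (our illustration, not code from the paper) assembles one row of such a matrix directly into the diag/col/Rowptr arrays that the mDIA format of Section 3.3.2 will store. The helper name append_row, the cell indexing idx(), and the convention that the diagonal entry equals the number of in-grid neighbours are assumptions made only for this example.

#include <vector>

void append_row(int i, int j, int k, int nx, int ny, int nz,
                std::vector<float>& diag, std::vector<int>& col, std::vector<int>& Rowptr)
{
    // Rowptr must contain a single 0 before the first call, as in CSR.
    auto idx = [=](int a, int b, int c) { return (c * ny + b) * nx + a; };
    const int nb[6][3] = { {i-1,j,k}, {i+1,j,k}, {i,j-1,k}, {i,j+1,k}, {i,j,k-1}, {i,j,k+1} };
    int count = 0;
    for (const auto& n : nb) {
        if (n[0] >= 0 && n[0] < nx && n[1] >= 0 && n[1] < ny && n[2] >= 0 && n[2] < nz) {
            col.push_back(idx(n[0], n[1], n[2]));   // off-diagonal entry; its value is the constant -1
            ++count;
        }
    }
    diag.push_back(static_cast<float>(count));       // diagonal entry (at most 6), stored explicitly
    Rowptr.push_back(static_cast<int>(col.size()));  // end of this row inside col
}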
2.3. The PCG Algorithm<br />
Consider
Ax = b,   (4)
where x is an unknown vector, b is a known vector, and A is a known SPD matrix. According to the PCG algorithm, equation (4) can be written as
M⁻¹ A x = M⁻¹ b,   (5)
where the matrix M is a preconditioner [4].
Given the inputs A, b, a starting vector x_0, a preconditioner M, a maximum number of iterations k_max and an error tolerance err, the PCG algorithm can be described as in Fig. 1. In this figure, a set of A-orthogonal search directions h_0, h_1, h_2, …, h_n is constructed by conjugation of the residues r_0, r_1, r_2, …, r_n, respectively. Then, in the k-th iteration step, x_k takes exactly one step of length α along the direction h_{k−1}. The iteration terminates when the convergence condition err < ε is met or when k reaches k_max. Note that in our method, we set x_0 = 0.
To compute M⁻¹ r_k at each iteration step, we choose the IC preconditioner in our method to improve the convergence rate [4]. An IC preconditioner can be obtained by factoring the matrix A into the form L Lᵀ, where L is a lower Cholesky triangular matrix. L is restricted to have the same pattern of nonzero elements as A, and the other elements of L are thrown away. Therefore, M equals L Lᵀ, and z_k ← M⁻¹ r_k can be converted into (L Lᵀ) z_k = r_k. As a result, z_k can be computed directly by forward and then backward substitutions.
III. IMPLEMENTATION ON GPU<br />
3.1 GPU Architecture and CUDA
The new generation <strong>of</strong> GPU adopts the unified shader<br />
architecture CUDA and promise up to 900 Gflops(single<br />
precision) <strong>of</strong> computational power.<br />
A GPU can be seen as a SIMD processer. What this<br />
means is that there are an army <strong>of</strong> processers executing<br />
the same instructions in parallel independently. Take fig.<br />
2 for example, a GPU has a scalable array <strong>of</strong><br />
multithreaded Streaming Multiprocessors (SPs). Each<br />
multiprocessor creates, manages, schedules, and executes<br />
groups <strong>of</strong> 32 parallel threads which are called warps.<br />
Individual threads composing a warp start together at the<br />
r0 ←b− Ax0<br />
−1<br />
z0 ← M r0<br />
h0 ← z0,<br />
errold ←< r0, z0<br />
><br />
errnew ← errold<br />
While (err new < ε || k<br />
xk ← xk−1+ α hk−1<br />
rk ← rk−1−αAhk−1 zk −1<br />
← M rk<br />
errold ← errnew<br />
errnew ←< rk , zk<br />
errnew<br />
β ←<br />
errold<br />
><br />
h ← z + β h<br />
k k k−1<br />
Figure 1. PCG algorithm.<br />
same program address, but have their own instruction<br />
address counter and register state. Therefore they are free<br />
to branch and execute independently.<br />
On GPU chip, each multiprocessor has a set <strong>of</strong><br />
memories associated with it. They are: on-chip shared<br />
memory, global memory, read-only constant cache, and<br />
read-only texture cache. Among these memories, global<br />
memory is the biggest in size but with highest access<br />
latency. On the other hand, Shared memory, constant<br />
cache as well as the texture cache resides on chip and can<br />
be accessed more efficiently. Note that shared memory is<br />
only visible to one block and threads <strong>of</strong> other block<br />
cannot access the data stored in it.<br />
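As a simple illustration of the per-block visibility of shared memory (our example, not code from the paper; the kernel name partial_dot and the fixed block size of 256 threads are assumptions), the following kernel computes one partial dot product per block, using shared memory for the in-block reduction; the partial sums would be combined afterwards on the host or in a second kernel.

__global__ void partial_dot(const float* a, const float* b, float* block_sums, int n)
{
    __shared__ float cache[256];                    // visible only to the threads of this block
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    float sum = 0.0f;
    for (int i = tid; i < n; i += blockDim.x * gridDim.x)
        sum += a[i] * b[i];
    cache[threadIdx.x] = sum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {  // tree reduction inside the block
        if (threadIdx.x < s)
            cache[threadIdx.x] += cache[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        block_sums[blockIdx.x] = cache[0];          // one partial sum per block
}
// Launch with 256 threads per block, e.g. partial_dot<<<num_blocks, 256>>>(a, b, sums, n);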
The scalable characteristic <strong>of</strong> modern GPU provides<br />
coarse-grained and the fine-grained data parallelism.<br />
They guide the programmer to partition the problem into<br />
sub-problems that can be solved independently in parallel<br />
by blocks <strong>of</strong> threads. Moreover, each sub-problem can be<br />
divided into finer pieces that can be solved cooperatively<br />
in parallel by all threads within the block. Fig. 3 gives an<br />
example. In this figure, kernels are launched on GPU<br />
device and executed by multiple equally-shaped thread<br />
blocks.<br />
3.2 Overview<br />
Algorithm 1 shows our framework for PCG<br />
implementation on GPU. In this framework, we use two<br />
kernels to complete the computation involved in the for<br />
loop: the SpMV kernel, which computes matrix-vector products such as A h_{k−1}, and the preconditioning kernel, which computes M⁻¹ r_k.
Besides the above kernels, other operations such as dot<br />
products among vectors can be done efficiently by cublas<br />
library [14] because they all belong to level-1 BLAS<br />
(Basic Linear Algebra Subprograms) functions.<br />
Figure 2. GPU architecture.<br />
Figure 3. Serial code executes on the host while parallel code executes<br />
on the device.
In order to improve the efficiency <strong>of</strong> our GPU based<br />
PCG algorithm, we focus on two most expensive kernels<br />
here: the SpMV kernel and the preconditioning kernel. In<br />
addition, to get a higher level <strong>of</strong> concurrency, we use a<br />
technique called streams [15] to run independent kernels<br />
asynchronously so as to overlap the computation.<br />
Algorithm 1. Algorithm for our PCG method on GPU
r_0 ← b − A x_0                     // SpMV kernel
z_0 ← M⁻¹ r_0                       // preconditioning kernel
h_0 ← z_0
err_old ← <r_0, z_0>                // use cublasSdot() to compute dot products and obtain err
err_new ← err_old
while (err_new > ε && k < k_max)
    x_k ← x_{k−1} + α h_{k−1}   (2)  // use cublasSaxpy()
    r_k ← r_{k−1} − α A h_{k−1}  (3)  // SpMV kernel
    z_k ← M⁻¹ r_k                    // preconditioning kernel
    err_old ← err_new                // device assignment
    err_new ← <r_k, z_k>         (4)  // use cublasSdot()
    β ← err_new / err_old
    h_k ← z_k + β h_{k−1}        (5)  // use cublasSaxpy()
    k++
end while
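The level-1 BLAS pieces of this loop map directly onto CUBLAS calls. The fragment below is an illustrative sketch of one iteration's vector updates written with the legacy CUBLAS API, the style used in the paper's figures; it is not the authors' code. The function name pcg_iteration_blas and the device-pointer names (d_x, d_r, d_h, d_Ah, d_z) are ours; cublasInit() is assumed to have been called beforehand, and d_Ah = A h_{k−1} is assumed to have been produced by the SpMV kernel of Section 3.3.

#include <cublas.h>

void pcg_iteration_blas(int N, float* d_x, float* d_r, float* d_h,
                        const float* d_Ah, const float* d_z,
                        float& err_old, float& err_new)
{
    float hAh   = cublasSdot(N, d_h, 1, d_Ah, 1);   // <h_{k-1}, A h_{k-1}>
    float alpha = err_new / hAh;                    // standard PCG step length

    cublasSaxpy(N,  alpha, d_h,  1, d_x, 1);        // (2) x_k = x_{k-1} + alpha*h_{k-1}
    cublasSaxpy(N, -alpha, d_Ah, 1, d_r, 1);        // (3) r_k = r_{k-1} - alpha*A*h_{k-1}

    // ... here the preconditioning kernel of Section 3.4 fills d_z = M^{-1} r_k (omitted) ...

    err_old = err_new;
    err_new = cublasSdot(N, d_r, 1, d_z, 1);        // (4) <r_k, z_k>

    float beta = err_new / err_old;
    cublasSscal(N, beta, d_h, 1);                   // (5) h_k = z_k + beta*h_{k-1},
    cublasSaxpy(N, 1.0f, d_z, 1, d_h, 1);           //     done as a scale followed by an axpy
}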
3.3 SpMV Kernel<br />
Sparse Matrix-vector multiplication (SpMV) is one <strong>of</strong><br />
the most fundamental and important operations in sparse<br />
matrix computations. It is the dominant cost in many<br />
iterative methods for solving large-scale linear systems.<br />
Recently, several research groups have reported their implementations on CUDA-compatible GPUs and shown that storage patterns such as the CSR, ELL, and HYBRID formats can be efficiently accessed by CUDA threads.
However, except for the CSR format, all other formats<br />
have to fill in zeros to keep the array strictly aligned, thus<br />
causing memory waste.<br />
Since the SpMV kernel with CSR format provides<br />
important insight for understanding our algorithm, we<br />
discuss the implementation <strong>of</strong> SpMV GPU kernels with<br />
CSR and mDIA respectively in the following subsections.<br />
3.3.1 SpMV with CSR format<br />
CSR format is one <strong>of</strong> the most popular sparse matrix<br />
representations. In this format, an N-by-N sparse matrix<br />
with K nonzero elements is stored as two arrays: one<br />
array val holds the K nonzero elements and the other<br />
array col holds the column indexes <strong>of</strong> these nonzero<br />
elements. What’s more, an additional array Rowptr with<br />
the length N+1 is used. The first N components <strong>of</strong><br />
Rowptr record the indexes <strong>of</strong> the first element in each<br />
row while the last one denotes the number <strong>of</strong> nonzero<br />
elements in the matrix. Fig. 4 gives an example. Unlike<br />
the ELL or DIA, CSR format doesn’t waste any memory<br />
space.<br />
Figure 4. CSR format for sparse Matrix.<br />
To parallelize the SpMV operation with CSR format, a<br />
scheme called scalar CSR kernel [13] is used. In this<br />
kernel, one thread is used to fetch one row <strong>of</strong> the matrix<br />
A and then complete the dot product for one component<br />
<strong>of</strong> the result vector. The computations <strong>of</strong> all threads are<br />
independent. The data parallelism as well as the access<br />
pattern <strong>of</strong> scalar CSR kernel is shown in Fig. 5. It gives a<br />
simplified example <strong>of</strong> the allocation <strong>of</strong> the threads, in<br />
which the array data, Col Index, and Rowptr are stored in<br />
global memory for the dot product operation.<br />
Example for Figure 4 (CSR storage of a 5×5 matrix):
    [ 1 3 0 0 0 ]
    [ 9 7 0 8 2 ]
    [ 0 6 1 0 0 ]
    [ 1 0 3 9 0 ]
    [ 0 0 0 0 2 ]
val    = [1 3 9 7 8 2 6 1 1 3 9 2]
col    = [0 1 0 1 3 4 1 2 0 2 3 2]
Rowptr = [0 2 6 8 11 12]
Figure 5. Scalar SpMV kernel with CSR format: one thread per row (thread i handles row i), with the arrays val (Data), col (Col Index) and Rowptr read from global memory.
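For reference, a minimal sketch of such a scalar CSR kernel [13] is given below. This is our illustration (the kernel name spmv_csr_scalar and its parameter names are assumptions), not the authors' code; it simply maps one thread to one matrix row as described above.

__global__ void spmv_csr_scalar(const float* val, const int* col, const int* Rowptr,
                                const float* x, float* y, int num_rows)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows) {
        float sum = 0.0f;
        for (int jj = Rowptr[row]; jj < Rowptr[row + 1]; ++jj)
            sum += val[jj] * x[col[jj]];   // indirect, potentially uncoalesced access into x
        y[row] = sum;
    }
}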
3.3.2 SpMV with mDIA format<br />
(1) Our mDIA storage format<br />
Our matrix has a regular pattern that the elements <strong>of</strong>f<br />
the main diagonal are all assigned to a constant integer -<br />
1while the nonzero diagonal elements are also integers.<br />
The number <strong>of</strong> nonzero elements per row varies from 3 to<br />
7 and this irregularity makes ELL or hybrid pattern<br />
infeasible, since they will cause large zero fill-ins.<br />
In mDIA format, the constant value is stored in the<br />
constant memory. As a result, 1D array Diag only needs<br />
to store the diagonal elements. Because all diagonal<br />
elements in our matrix are nonzero values, the row and
column indexes <strong>of</strong> them can be easily obtained from Diag.<br />
Array col only needs to record column indexes <strong>of</strong> the<br />
constant value. Besides, Array Rowptr is used to tell<br />
where a new row begins in the array col, similar to array<br />
Rowptr in CSR format. Fig. 6 shows a portion <strong>of</strong> one<br />
matrix and its corresponding storage mechanism. Note<br />
that the constant -1 is stored in the constant memory.<br />
Since most <strong>of</strong> the nonzero elements are <strong>of</strong>f the main<br />
diagonal in our matrix and they are stored in the constant<br />
memory, the memory usage can be significantly reduced<br />
compared with the CSR format. Also, the constant<br />
memory, a small high-speed cache residing in the global<br />
memory on GPU, enables us to fetch data efficiently [15].<br />
    [  6 -1  0  0  0 ]
    [ -1  7  0 -1  0 ]
    [  0 -1  6  0  0 ]
    [ -1  0 -1  3 -1 ]
    [  0  0 -1  0  2 ]
diag   = [6 7 6 3 2]
col    = [1 0 3 1 0 2 4 2]
Rowptr = [0 1 3 4 7 8]
__device__ __constant__ float val = -1;
Figure 6. mDIA storage format.<br />
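A host-side conversion into this layout is straightforward. The sketch below (our illustration, not code from the paper; the function name csr_to_mdia and its parameters are assumptions) builds the diag/col/Rowptr arrays of Fig. 6 from a CSR matrix whose off-diagonal entries are the constant −1, as described in Section 2.2.

#include <vector>

void csr_to_mdia(int n, const std::vector<float>& csr_val, const std::vector<int>& csr_col,
                 const std::vector<int>& csr_rowptr,
                 std::vector<float>& diag, std::vector<int>& col, std::vector<int>& rowptr)
{
    diag.assign(n, 0.0f);
    col.clear();
    rowptr.assign(1, 0);
    for (int row = 0; row < n; ++row) {
        for (int jj = csr_rowptr[row]; jj < csr_rowptr[row + 1]; ++jj) {
            if (csr_col[jj] == row)
                diag[row] = csr_val[jj];        // diagonal entries go into diag
            else
                col.push_back(csr_col[jj]);     // off-diagonal entries are the constant -1,
                                                // so only their column index is kept
        }
        rowptr.push_back(static_cast<int>(col.size()));  // where the next row starts in col
    }
}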
(2) The SpMV kernel<br />
To implement the SpMV kernel with mDIA format on<br />
GPU, we assign one thread to compute one component <strong>of</strong><br />
the result vector. Take thread i for example. It computes<br />
the dot product between the i-th row <strong>of</strong> our matrix and the<br />
vector. First it fetches the diagonal nonzero element from<br />
diag[i]. Then the column indexes <strong>of</strong> the constant<br />
elements from col[Rowptr[i]] to col[Rowptr[i+1]] are<br />
read contiguously. Finally, the dot products operation is<br />
executed.<br />
Considering that the vector is reused in the<br />
computation <strong>of</strong> the dot product, we bind it to a 1D texture.<br />
This can bring potentially higher bandwidth and can be<br />
used to avoid uncoalesced loads from global memory<br />
[15]. Other arrays such as col are stored in the global
memory. Fig. 7 presents the pseudo-code <strong>of</strong> our<br />
implementation.<br />
3.4 Preconditioning Kernel using Jacobi Method<br />
Another important kernel in our PCG algorithm is the preconditioning kernel. According to the IC preconditioner, z_k ← M⁻¹ r_k is converted into (L Lᵀ) z_k = r_k, where L is the lower Cholesky triangular matrix. Thus, z_k can be obtained by solving L (Lᵀ z_k) = r_k in two stages: a forward substitution L y = r_k and a backward substitution Lᵀ z_k = y.
However, the direct method <strong>of</strong> backward and forward<br />
substitution cannot be used to solve all the components<br />
© 2012 ACADEMY PUBLISHER<br />
simultaneously on GPU because the computation <strong>of</strong> the<br />
ith component <strong>of</strong> zk relies on all its previous components.<br />
To get a higher level <strong>of</strong> parallelism, we use Jacobi<br />
method instead <strong>of</strong> direct method in this paper. Jacobi<br />
iterative method is data independent that can be well<br />
aligned to SIMD pattern and improves parallelism<br />
significantly.<br />
__constant__ float val = -1;

__global__ void SpMV(const float *diag, const int *col, const int *Rowptr,
                     const float *x, float *y, int num_rows)
{
    int thread_id = blockDim.x * blockIdx.x + threadIdx.x;
    int grid_size = gridDim.x * blockDim.x;
    for (int row = thread_id; row < num_rows; row += grid_size)
    {
        const int row_beg = Rowptr[row];
        const int row_end = Rowptr[row + 1];
        // contribution of the diagonal element, read from diag
        float sum = diag[row] * fetch_x(row, x, UsedTex);
        // contributions of the constant off-diagonal elements (-1), read via the texture
        for (int jj = row_beg; jj < row_end; jj++)
            sum += val * fetch_x(col[jj], x, UsedTex);
        y[row] = sum;
    }
}
Figure 7. Pseudo-code <strong>of</strong> our SpMV kernel<br />
3.4.1 Jacobi iterative method<br />
Jacobi iterative method is a numerical solution <strong>of</strong> a<br />
system <strong>of</strong> linear equations with largest absolute values in<br />
each row and column dominated by the diagonal element.<br />
To solve our lower Cholesky triangular<br />
equation Ly = rk<br />
, we first decompose the lower<br />
Cholesky triangular matrix L into a diagonal matrix D<br />
and a lower triangle matrix R. Then the system <strong>of</strong> linear<br />
equations becomes<br />
( D + Ry ) = r,<br />
(6)<br />
and finally<br />
Therefore, y can be solved iteratively by<br />
k<br />
Dy = rk− Ry.<br />
(7)<br />
1<br />
y = D ( r − Ry ) . (8)<br />
−<br />
k+ 1<br />
k k<br />
Next, zk can be computed by solving upper Cholesky<br />
triangular equation T<br />
Lzk = yin<br />
the same way.<br />
3.4.2 Parallel Jacobi algorithm<br />
Fig. 8 (a) shows the iterative process <strong>of</strong> our parallel<br />
Jacobi algorithm. In this figure, d_old stores the result<br />
from the previous step and d_new is used to update the<br />
current computation. The constant vector d_const<br />
−1<br />
= D rk<br />
, where the diagonal matrix D is stored in the<br />
array d_diag. d_R stores the lower triangular matrix R<br />
and d_res is used to compute the residue. Note that most
2700 JOURNAL OF SOFTWARE, VOL. 7, NO. 12, DECEMBER 2012<br />
−1<br />
<strong>of</strong> the operations such as D b can be converted to<br />
level-1 BLAS operation among the vectors. These<br />
operations could be easily parallelized using CUDA. Fig.<br />
8 (b) shows the kernel named VecDiv_kernel for<br />
−1<br />
computing D b and the kernel named VectorSub_kernel<br />
for computing the substraction betweeen two vectors.<br />
while (err < tol && k < k_max)
{
    // d_yNew ← R y_k
    SpMV(d_R, d_yOld, d_yNew);
    // d_yNew ← D^{-1}(R y_k)
    VecDiv_kernel(d_yNew, d_diag, d_yNew, N);
    // d_yNew ← D^{-1}(R y_k) + D^{-1} r_k
    cublasSaxpy(N, 1.0, d_const, 1, d_yNew, 1);
    // d_res ← residue between successive iterates
    VectorSub_kernel(d_yNew, d_yOld, d_res, N);
    err = cublasSasum(N, d_res, 1);
    // d_yOld ← d_yNew
    cublasScopy(N, d_yNew, 1, d_yOld, 1);
    k++;
}
(a) The iterative process.<br />
__global__ void VectorSub_kernel(const float* A, const float* B, float* C, int N)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < N)
        C[i] = A[i] - B[i];              // element-wise subtraction of two vectors
}

__global__ void VecDiv_kernel(const float* A, const float* B, float* C, int N)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < N)
        C[i] = __fdividef(A[i], B[i]);   // fast element-wise division
}
(b) Two GPU kernels involved in this process.<br />
Figure 8. Parallel Jacobi iterative method.<br />
3.5 Parallelism with Streams<br />
CUDA applications can manage concurrency through streams. A stream is a sequence of commands that execute in order. Different streams, on the other hand, may execute their commands out of order, concurrently. Fig. 9 illustrates the GPU time flow for sequential (Fig. 9 (a)) and concurrent (Fig. 9 (b)) kernel executions.
Figure 9. GPU time flow with CUDA streams: (a) sequential kernel execution without streams (Kernel_1 … Kernel_4 run one after another); (b) concurrent kernel execution with streams (kernels from different streams overlap).
We adopt this optimization technique for our two<br />
independent tasks as shown in fig. 10.<br />
SetKernelStream(streams[0]);
x_k ← x_{k−1} + α h_{k−1};
SetKernelStream(streams[1]);
r_k ← r_{k−1} − α A h_{k−1};
Figure 10. The operations using CUDA streams.
As a result, the computation <strong>of</strong> the two tasks can be<br />
overlapped and the GPU resources can be used more<br />
effectively.<br />
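As a concrete illustration of this overlap (our sketch, not the authors' code), the fragment below issues the two independent updates of Fig. 10 on separate CUDA streams; the kernel name axpy_kernel, the wrapper overlapped_updates and the device-pointer names are assumptions, while cudaStreamCreate, cudaDeviceSynchronize and cudaStreamDestroy are standard CUDA runtime calls. The paper's figure appears to route CUBLAS calls through SetKernelStream instead; this sketch uses plain kernel launches on explicit streams.

__global__ void axpy_kernel(int n, float a, const float* x, float* y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += a * x[i];                  // y <- y + a*x
}

void overlapped_updates(int N, float alpha, const float* d_h, float* d_x,
                        const float* d_Ah, float* d_r)
{
    cudaStream_t streams[2];
    cudaStreamCreate(&streams[0]);
    cudaStreamCreate(&streams[1]);
    int block = 256, grid = (N + block - 1) / block;

    // x_k = x_{k-1} + alpha*h_{k-1}, issued on stream 0
    axpy_kernel<<<grid, block, 0, streams[0]>>>(N,  alpha, d_h,  d_x);
    // r_k = r_{k-1} - alpha*A h_{k-1}, issued on stream 1 (A h_{k-1} precomputed in d_Ah)
    axpy_kernel<<<grid, block, 0, streams[1]>>>(N, -alpha, d_Ah, d_r);

    cudaDeviceSynchronize();                      // both results are needed before the next step
    cudaStreamDestroy(streams[0]);
    cudaStreamDestroy(streams[1]);
}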
IV. RESULTS<br />
We use Geforce G100 to test the performance <strong>of</strong> our<br />
parallel PCG algorithm. Geforce G100 has 8 CUDA<br />
cores and the peak performance for single precision is<br />
10.4 Gflops. The CPU used is AMD 7750 dual-core<br />
processers with the core frequency <strong>of</strong> 2.7GHz. The CPU<br />
implementation <strong>of</strong> PCG is single-threaded.<br />
Table 1 shows the matrices generated from our Poisson equation; they will be used in our performance tests. In this section, we start by testing the performance of our SpMV kernel, then test the Jacobi iterative method for solving the lower Cholesky triangular equations, followed by the CUDA stream results.
TABLE 1. MATRICES USED FOR EXPERIMENTS
Matrix     #N        #Nonzeros
Matrix_1   8,087     49,915
Matrix_2   22,028    131,118
Matrix_3   65,043    394,877
Matrix_4   140,120   876,518
Matrix_5   209,908   1,325,066
Matrix_6   274,949   1,755,263
Matrix_7   304,207   1,962,311
4.1.SpMV Kernel Test<br />
4.1.1 SpMV kernels performance<br />
Table 2 shows the performance <strong>of</strong> our SpMV kernel<br />
against the SpMV kernel with CSR format. In this table,<br />
GPU Time accounts for the total time consumed for the
SpMV kernel in our PCG algorithm and the speedup is<br />
the ratio compared with CPU Time. On CPU, the SpMV<br />
routine is implemented using CSR format. According to<br />
the results, our SpMV kernel runs an average <strong>of</strong> one time<br />
faster than that in CSR format and it <strong>of</strong>fers an average <strong>of</strong><br />
about 10 times speedup compared with the CPU version.<br />
TABLE 2. GPU/CPU SPMV KERNEL PERFORMANCE (SECONDS)
Matrix     CPU time   GPU CSR   GPU mDIA   Speedup (CSR)   Speedup (mDIA)
Matrix_1   0.48       0.09      0.05       5.4             9.0
Matrix_2   1.34       0.25      0.14       5.4             9.5
Matrix_3   5.50       1.00      0.57       5.5             9.6
Matrix_4   15.15      2.63      1.52       5.8             10
Matrix_5   29.67      5.34      2.99       5.6             9.9
Matrix_6   44.82      8.03      4.50       5.5             9.96
Matrix_7   54.69      9.50      5.41       5.1             10.8
4.1.2 The performance <strong>of</strong> CG&PCG algorithm with<br />
our SpMV kernel<br />
We embed our SpMV kernel into the CG and PCG methods on GPU. Here, the IC preconditioner of the PCG method is solved by the direct method. A further improvement using the Jacobi iterative method will be presented in the next subsection.
We compare our results with the CG and PCG methods using the CSR format, as shown in Table 3. It illustrates that, thanks to our SpMV kernel, the CG algorithm outperforms the CSR-based version by nearly 50% when the number of nonzero elements reaches 1,962,311. However, when the PCG method is applied, the table shows that our advantage over the CSR-based method becomes less obvious; when the dimension of the matrix reaches 304,207, the performance improvement is only about 10%. This is due to the sequential nature of the direct method, as mentioned before.
4.2 Jacobi Method for IC Preconditioner<br />
In this section, we adopt two variants, the direct method and the Jacobi iterative method, to explore the GPU performance of the PCG algorithm.
TABLE 3. TIME COST FOR THE CG AND PCG ALGORITHMS IMPLEMENTED ON GPU (SECONDS)
Matrix     CG steps   CG mDIA   CG CSR   PCG steps   PCG mDIA   PCG CSR
Matrix_1   84         0.64      0.68     17          0.50       0.50
Matrix_2   90         0.76      0.86     21          1.15       1.17
Matrix_3   123        1.40      1.80     23          2.63       2.71
Matrix_4   146        2.42      3.69     25          4.8        5.09
Matrix_5   192        4.44      6.70     30          7.9        8.32
Matrix_6   221        6.41      9.87     34          10.98      11.87
Matrix_7   234        7.25      11.53    35          11.35      12.77
When solving the lower Cholesky triangular equation, the timings of the parallel Jacobi iterative method and of the direct method are both accumulated at every iteration step, and their final costs are shown in Table 4. From this table, it can be seen that the Jacobi iterative solver improves GPU performance by about 3 times over the direct solver.
TABLE 4. TIME COST FOR THE GPU SOLVERS USING THE JACOBI METHOD AND THE DIRECT METHOD WHEN APPLYING THE IC PRECONDITIONER (SECONDS)
Matrix     Jacobi solver   Direct solver
Matrix_2   0.18            1.001
Matrix_3   0.50            2.533
Matrix_4   1.225           3.912
Matrix_5   2.288           6.030
Matrix_6   3.377           9.454
Matrix_7   3.893           11.021
4.3 The Speedup <strong>of</strong> Our GPU based PCG Algorithm<br />
Fig. 11 shows the speedup <strong>of</strong> our parallel PCG<br />
algorithm after using our SpMV kernel, Jacobi iterative<br />
method, as well as CUDA streams. It can be seen from<br />
the figure that there has been at least 16% increase for our<br />
PCG algorithm compared that with CSR formats.<br />
Furthermore, our PCG algorithm with three optimization<br />
techniques proposed obtains an average <strong>of</strong> 6 times<br />
speedup, while the CSR method only <strong>of</strong>fers an average <strong>of</strong><br />
4.<br />
Figure 11. Speedup <strong>of</strong> our GPU based PCG algorithm.<br />
V. CONCLUSIONS<br />
In this work, we propose an optimization method for<br />
GPU based PCG algorithm. It is designed to solve the<br />
Poisson equation arising in liquid animation efficiently.<br />
By utilizing optimized SpMV kernel, iterative Jacobi<br />
method and CUDA streams, our method improves the<br />
efficiency <strong>of</strong> solving large sparse linear systems<br />
significantly. Experimental results also show the<br />
effectiveness <strong>of</strong> our method.<br />
Next we will focus on seeking out other potential bottlenecks of the PCG algorithm using the CUDA Visual Profiler. Besides, studies on finding more suitable preconditioners for the GPU-based PCG algorithm should be undertaken.
ACKNOWLEDGMENT<br />
The authors wish to thank Prof. Shengzhong Feng. This work was supported in part by the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 2009AA01A129-2, the Science and Technology Project of Guangdong Province under Grant No. 2010A090100028, the National Natural Science Foundation of P. R. China under Grant No. 60903116, the Knowledge Innovation Project of the Chinese Academy of Sciences under Grant No. KGCX2-YW-131, and the Science and Technology Project of Shenzhen under Grants No. JC200903170443A and ZD201006100023A.
VI. REFERENCES
[1] J.D. Hall, N.A. Carr and J.C. Hart, "Cache and bandwidth aware matrix multiplication on the GPU," UIUC Technical Report UIUCDCS-R-2003-2328, 2003.
[2] N. Galoppo, N.K. Govindaraju, "LU-GPU: Efficient algorithms for solving dense linear systems on graphics hardware," in SC '05: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, page 3, Washington D.C., USA, 2005. IEEE Computer Society. ISBN 1-59593-061-2.
[3] Kendall A. Atkinson, An Introduction to Numerical Analysis (2nd ed.), Section 8.9, John Wiley and Sons, 1988.
[4] A.V. Knyazev, I. Lashuk, "Steepest descent and conjugate gradient methods with variable preconditioning," SIAM J. Matrix Analysis and Applications, 29(4), pp. 1267-1280, 2007.
[5] Nvidia CUDA. Website, 2009, http://www.nvidia.com/cuda.
[6] J. Bolz, I. Farmer, E. Grinspun, and P. Schröder, "Sparse matrix solvers on the GPU: Conjugate gradients and multigrid," in SIGGRAPH '03: ACM SIGGRAPH 2003 Papers, 2003, pp. 917-924.
[7] W.A. Wiggers, V. Bakker, A.B.J. Kokkeler and G.J.M. Smit, "Implementing the conjugate gradient algorithm on multi-core systems," in J. Nurmi, J. Takala and O. Vainio, editors, Proceedings of the International Symposium on System-on-Chip, Tampere, pp. 11-14, Piscataway, NJ, November 2007, IEEE. ISBN 1-4244-1367-2.
[8] A. Maringanti, V. Athavale, S.B. Patkar, "Acceleration of Conjugate Gradient Method for circuit simulation using CUDA," in High Performance Computing (HiPC), 2009 International Conference, Kochi, pp. 438-444.
[9] A. Asgasri, J.E. Tate, "Implementing the Chebyshev Polynomial Preconditioner for the iterative solution of linear systems on massively parallel graphics processors," http://www.ele.utoronto.ca/zeb/publications/, 2009.
[10] L. Buatois, G. Caumon, and B. Levy, "Concurrent number cruncher: a GPU implementation of a general sparse linear solver," Int. J. Parallel Emerg. Distrib. Syst., 24(3):205-223, 2009. ISSN 1744-5760.
[11] Marco Ament, Gunter Knittel, Daniel Weiskopf, Wolfgang Strasser, "A Parallel Preconditioned Conjugate Gradient Solver for the Poisson Problem on a Multi-GPU Platform," in 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing (PDP), pp. 583-592, 2010.
[12] S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel, "Optimization of sparse matrix-vector multiplication on emerging multicore platforms," in Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, 2007, pp. 1-12.
[13] Nathan Bell, "Implementing Sparse Matrix-Vector Multiplication on Throughput-Oriented Processors," in Proc. Supercomputing '09, November 2009.
[14] Nvidia, "CUDA Toolkit 4.0 CUBLAS Library," NVIDIA Corporation, Santa Clara, April 2011.
[15] Nvidia, "CUDA Programming Guide 4.0," NVIDIA Corporation, Santa Clara, April 2011.

Born in Jiangxi province in China, Yechen Gui is a graduate of Xidian University, where she earned her bachelor degree in 2006, majoring in biomedical engineering. After that, she went to Southern Medical University for her postgraduate studies. During that time, she became interested in parallel algorithms for medical image processing. She implemented parallel ray-casting and CT reconstruction algorithms on GPU using CUDA and published two articles in two different core journals.
After Yechen gained her master degree in 2009, she has worked as a research assistant at the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences. During this time, she has published 4 articles on parallel algorithms on GPU, one of which was accepted at the conference 2010 GPU Solutions to Multi-scale Problems in Science and Engineering (GPU-SMP'2010); the others are all indexed by EI. Her research now focuses on computational fluid animation as well as parallel algorithms.
Joint Polarization and Angle Estimation<br />
for Robust Multi-Target Localization in Bistatic<br />
MIMO Radar<br />
Hong Jiang 1, 2 , Yu Zhang 2 , Hong-Jun Liu 1 , Xiao-Hui Zhao 2 and Chang Liu 3<br />
1 Military Simulation Technology Institute, Aviation University <strong>of</strong> Air Force, Changchun, China<br />
2 College <strong>of</strong> Communication Engineering, Jilin University, China<br />
3 Department <strong>of</strong> Electrical and Computer Engineering, Kansas State University, Manhattan, USA<br />
Email: jiangh@jlu.edu.cn<br />
Abstract—In the paper, we propose a novel algorithm using<br />
joint polarization and angle information for robust and<br />
high-resolution multi-target localization in a bistatic multiple-input multiple-output (MIMO) radar system. The proposed
algorithm exploits the singular value decomposition (SVD)<br />
<strong>of</strong> cross-correlation matrix <strong>of</strong> the received data from two<br />
transmitter subarrays to obtain robust performance in noise.<br />
Polarization sensitive array-based ESPRIT technologies are<br />
employed to estimate the direction <strong>of</strong> departure (DOD), the<br />
direction <strong>of</strong> arrival (DOA) and the polarization parameters.<br />
The Cramer-Rao bounds (CRBs) are given. In the method,<br />
the closely spaced targets can be well distinguished by<br />
polarization diversity. Also, the DODs, DOAs, and<br />
polarizations <strong>of</strong> multiple targets can be well paired. The<br />
simulation results demonstrate that the proposed algorithm<br />
can work well and achieve high-resolution identification and<br />
robust localization <strong>of</strong> multiple targets.<br />
Index Terms— MIMO radar, joint parameter estimation,<br />
polarization, singular value decomposition, Cramer-Rao<br />
bound<br />
I. INTRODUCTION<br />
Multiple-input multiple-output (MIMO) radar [1] and its applications in localization and direction-finding [2]-[4] have recently become a hot research topic. Specifically, methods of target localization for bistatic MIMO radar have been studied to estimate both the direction of departure (DOD) and the direction of arrival (DOA) [5]-[9]. However, attention must be paid to the fact that the resolution of these algorithms is greatly degraded when multiple targets are closely spaced and cannot be well distinguished in the spatial domain. Polarimetric radar offers tremendous advantages in target estimation, detection and tracking technology [10] [11]. The echoes, which have different polarization states of the electromagnetic wave, can be independent of each other because the targets are at different locations. By making full use of polarization diversity in MIMO radar, the accuracy of multi-target identification and localization can be improved.
In this paper, we propose a novel algorithm that jointly uses polarization and angle information for robust multi-target localization in a bistatic MIMO radar system. We use the singular value decomposition (SVD) of the cross-correlation matrix of the data received from two transmitter subarrays to obtain robust performance in a noisy environment. Polarization sensitive array processing [12] and ESPRIT technologies are used to estimate targets for bistatic MIMO radar. By partitioning the transmitter array into two subarrays and matching the received data with the transmitted signals of the two subarrays, we obtain two groups of received data corresponding to the transmitter subarrays. Then, the DOAs, DODs and polarizations of multiple targets can be effectively estimated and paired automatically.
This paper is organized as follows. A signal model for<br />
polarimetric MIMO radar is presented in Section 2. A<br />
novel algorithm for robust multi-target localization by<br />
jointing DOA, DOD and polarization estimation is<br />
proposed in Section 3. The Cramer-Rao bound is derived<br />
in Section 4. Some simulations are conducted to verify<br />
the performance <strong>of</strong> the proposed method in Section 5.<br />
Finally, a conclusion is drawn in Section 6.<br />
II. SIGNAL MODEL<br />
For a bistatic MIMO radar system, as shown in Fig. 1, we assume that the transmitter is composed of two uniform linear subarrays having M_1 and M_2 sensors, and that the receiver is a polarization sensitive array with N pairs of crossed dipoles. The inter-element spacings at the transmitter array and the receiver array are d_t and d_r, respectively, both being no more than half a wavelength. Assume that the targets' range is much larger than the apertures of the transmitter and receiver arrays. At the transmitter site, the M = M_1 + M_2 elements of the transmitter, whose waveforms are orthogonal to each other, simultaneously transmit from subarray 1 and subarray 2. The waveforms have identical center frequency and bandwidth but are temporally coded. P targets appear in the far field of the transmitter and receiver. For the p-th target, p = 1, …, P, its DOD and DOA are θ_p and φ_p, respectively, and two polarization phase factors γ_p and η_p denote its polarization information.
For the l-th snapshot, l = 1, …, L, the observed data matrix at the receiver array is denoted as

Figure 1. Bistatic MIMO radar with polarization sensitive receiver array.

X^(l) = A_r(φ, γ, η) B^(l) [A_t1(θ)ᵀ, A_t2(θ)ᵀ] S + Z^(l)   (1)

where S ∈ C^{M×K} denotes the transmitted baseband coded signal matrix, B^(l) ∈ C^{P×P} denotes the reflected target signal matrix, A_t1(θ) ∈ C^{M_1×P} and A_t2(θ) ∈ C^{M_2×P} denote the transmitter steering matrices of subarrays 1 and 2, respectively, A_r(φ, γ, η) ∈ C^{2N×P} denotes the manifold matrix of the receiver array, and Z^(l) ∈ C^{2N×K} denotes the noise matrix.
S = [s_1ᵀ, s_2ᵀ, …, s_Mᵀ]ᵀ, where s_m ∈ C^{K×1} is the signal vector of the m-th transmit element, with length K in a repetition interval T_s. Let S Sᴴ = K I for the M orthogonal transmitted waveforms. B^(l) = diag{β_1^(l) e^{j2π f_d1 t_l}, …, β_P^(l) e^{j2π f_dP t_l}}, where f_dp is the Doppler frequency of the p-th target and β_p^(l) is a complex amplitude proportional to the RCS of the p-th target, which is time-varying in each snapshot. t_l = l T_S is the slow time. A_ti(θ) = [a_ti(θ_1), …, a_ti(θ_P)], where a_ti(θ_p) ∈ C^{M_i×1} is the p-th steering vector of the i-th transmitter subarray, i = 1, 2.
a_t1(θ_p) = [1, e^{−j2π(d_t/λ) sinθ_p}, …, e^{−j2π(M_1−1)(d_t/λ) sinθ_p}]ᵀ   (2a)
a_t2(θ_p) = [e^{−j2π M_1 (d_t/λ) sinθ_p}, …, e^{−j2π(M−1)(d_t/λ) sinθ_p}]ᵀ   (2b)
The manifold matrix of the receiver array is A_r(φ, γ, η) = [a_r(φ_1, γ_1, η_1), …, a_r(φ_P, γ_P, η_P)], where a_r(φ_p, γ_p, η_p) ∈ C^{2N×1} is the p-th manifold vector of the receiver array,
a_r(φ_p, γ_p, η_p) = q_p ⊗ v_p   (3)
where ⊗ denotes the Kronecker product. q_p ∈ C^{N×1} is the p-th steering vector of the arrival signal,
q_p = [1, e^{−j2π(d_r/λ) sinφ_p}, …, e^{−j2π(N−1)(d_r/λ) sinφ_p}]ᵀ   (4)
and v_p ∈ C^{2×1} is the p-th polarization vector of the arrival signal,
v_p = [−cosγ_p,  sinγ_p cosφ_p e^{jη_p}]ᵀ   (5)
⎢⎣ sinγ pcosφpe⎥⎦ The received signals are matched with M = M1+ M2<br />
transmitted waveforms. The matched filter matrix is<br />
( 1<br />
H<br />
K ) S . The output <strong>of</strong> the matched filter<br />
( l ) 2 N× M<br />
is Y ∈ , which<br />
() l () l H<br />
Y = ( 1 K ) X S , i.e.,<br />
can be written by<br />
( )<br />
T<br />
() l () ⎡At1 θ ⎤<br />
()<br />
() l () l<br />
l l H<br />
= K r ( φγη , , ) ⎢ ⎥ + (1/ K)<br />
= 1 | 2<br />
At<br />
2 ( θ)<br />
Y A B Z S ⎡Y Y ⎤ (6)<br />
⎢ ⎥<br />
⎣ ⎦<br />
⎣ ⎦<br />
where ( 1 )<br />
(5)<br />
() l H<br />
K Z S denotes the noise matrix after<br />
matched filter. ( l )<br />
Y 1 ∈<br />
2 N × M 1 and ( l )<br />
2 ∈<br />
2 N× M 2<br />
( l)<br />
the two sub-matrices <strong>of</strong> Y .<br />
Vectorize the sub-matrices () l<br />
Y and 1<br />
( l)<br />
2<br />
we have<br />
() l () l<br />
1 = vec 1<br />
Y denote<br />
( )<br />
Y , respectively,<br />
y Y (7)<br />
( )<br />
() l () l<br />
2 = vec 2<br />
y Y (8)<br />
Thus, the two column vectors () l 2M1N× 1<br />
y1<br />
∈<br />
( l) y2 ∈<br />
2M2N× 1 are composed <strong>of</strong> M 1 and 2<br />
( )<br />
vectors <strong>of</strong> 2N × 1 , respectively. l<br />
y and 1<br />
( l)<br />
further written as<br />
( )<br />
( l) ( l) ( l)<br />
1 = 1 θφγη , , , + 1<br />
and<br />
M column<br />
y can be<br />
y A b v (9)<br />
( )<br />
( l) ( l) ( l)<br />
y A θφγη b v (10)<br />
2 = 2 , , , + 2<br />
where A_1(θ, φ, γ, η) ∈ C^{2M_1N×P} and A_2(θ, φ, γ, η) ∈ C^{2M_2N×P} denote the two manifold matrices with respect to the two transmitter subarrays,
A_1(θ, φ, γ, η) = A_t1(θ) ◊ A_r(φ, γ, η) = [a_t1(θ_1) ⊗ a_r(φ_1, γ_1, η_1), …, a_t1(θ_P) ⊗ a_r(φ_P, γ_P, η_P)]   (11)
A_2(θ, φ, γ, η) = A_t2(θ) ◊ A_r(φ, γ, η) = [a_t2(θ_1) ⊗ a_r(φ_1, γ_1, η_1), …, a_t2(θ_P) ⊗ a_r(φ_P, γ_P, η_P)]   (12)
where ◊ denotes the Khatri-Rao product. After matched filtering with the M_1 and M_2 transmitted waveforms, which are orthogonal to each other, the two noise vectors v_1^(l) and v_2^(l) are uncorrelated, zero-mean complex Gaussian distributed. Here b^(l) = K [β_1^(l) e^{j2π f_d1 t_l}, …, β_P^(l) e^{j2π f_dP t_l}]ᵀ is the reflected signal vector.
Based on the signal model in (9) and (10), the problem of interest is to jointly estimate the multiple parameters (θ_p, φ_p, γ_p, η_p) for the p-th target.
III. JOINT POLARIZATION AND ANGLE ESTIMATION<br />
ALGORITHM FOR ROBUST MULTI-TARGET LOCALIZATION<br />
The cross-correlation matrix between y_1^(l) and y_2^(l) is given by
R_c = E{ y_1^(l) (y_2^(l))ᴴ } = A_1(θ, φ, γ, η) R_b A_2ᴴ(θ, φ, γ, η)   (13)
where R_b is the covariance matrix of b^(l). Here the cross-correlation matrix of the noise vectors v_1^(l) and v_2^(l) has been canceled, since v_1^(l) and v_2^(l) are uncorrelated. Performing a singular value decomposition (SVD) on R_c, we obtain
R_c = U Σ Vᴴ   (14)
where U ∈ C^{2M_1N×2M_1N} and V ∈ C^{2M_2N×2M_2N} are two unitary matrices composed of the left and right singular vectors corresponding to all the singular values, respectively, and
Σ = [ Σ_0  0 ;  0  0 ]   (15)
where Σ_0 = diag(σ_1, σ_2, …, σ_P) is a diagonal matrix, and σ_1, σ_2, …, σ_P are the first P non-zero singular values of the matrix R_c, which are real and positive, such that σ_1 ≥ … ≥ σ_P > 0. Partition U into
U = [ U_s  U_n ]   (16)
where U_s and U_n are the first P columns and the last 2M_1N − P columns of U, respectively. Then we have the following relation:
span{U_s} = span{A_1(θ, φ, γ, η)}   (17)
That is to say, the columns in U_s ∈ C^{2M_1N×P} span the same signal subspace as the column vectors in A_1(θ, φ, γ, η). Thus, there is a unique non-singular matrix T such that
U_s = A_1(θ, φ, γ, η) T   (18)
Thus U_s can be used to form multiple appropriate subsets, because it preserves the invariance properties of A_1(θ, φ, γ, η).
Since both the transmitter array and the receiver array possess the shift-invariance properties of the ESPRIT method, we can obtain the estimates of the DOD, DOA and polarization parameters based on the matrix U_s.
A. DOD Estimation<br />
Since 1 ( θ , φγη , , )<br />
according to row, each is a 2N P<br />
A is composed <strong>of</strong> M 1 blocks<br />
× matrix. If we let A f 1<br />
and A be the two f 2<br />
2( M1 − 1)<br />
N× P sub-matrices <strong>of</strong><br />
A 1 ( θ , φγη , , ) formed with the first and last M1 − 1<br />
blocks <strong>of</strong> A 1 ( θ, φγη , , ) , respectively, then, Af2 = Af1Λ , f<br />
where Λ is a P× P diagonal matrix,<br />
f<br />
d<br />
2 t d<br />
sin 1 2 t d<br />
− j π θ − j π sinθ2 − j2π<br />
t sinθ<br />
λ λ λ P<br />
{ , ,<br />
}<br />
Λ f = diag e e Le<br />
(19)<br />
Since the Kroneker product is used in the structure <strong>of</strong> the<br />
column in s U , we divide s U into 1 M blocks according<br />
to row, each <strong>of</strong> which is a 2N× P matrix. The subspace<br />
U spanned by the column in s<br />
A and f 1 A are the same<br />
f 2<br />
except for the phase rotation caused by the diagonal<br />
matrix f Λ . Let U f 1 ∈<br />
2( M 1 − 1)<br />
N × P be a sub-matrix<br />
formed with the first M1 − 1 blocks <strong>of</strong> U , and let<br />
s<br />
U f 2 ∈<br />
2( M 1 − 1)<br />
N× Pbe<br />
a subset formed with the last 1 1 M −<br />
blocks <strong>of</strong> U s in the same way the A and f 1 A are f 2<br />
A θ, φγη , , . We have<br />
formed from ( )<br />
1<br />
U = A T (20)<br />
f1 f1<br />
U = A T = A Λ T (21)<br />
f 2 f 2 f1 f<br />
Then $\mathrm{span}\{\mathbf{U}_{f1}\} = \mathrm{span}\{\mathbf{U}_{f2}\} = \mathrm{span}\{\mathbf{A}_{f1}\}$, and
$\mathbf{U}_{f2} = \mathbf{U}_{f1}\mathbf{T}^{-1}\boldsymbol{\Lambda}_f\mathbf{T} = \mathbf{U}_{f1}\boldsymbol{\Psi}_f$ (22)
$\boldsymbol{\Psi}_f = \mathbf{T}^{-1}\boldsymbol{\Lambda}_f\mathbf{T} = \mathbf{Q}\boldsymbol{\Lambda}_f\mathbf{Q}^{-1}$ (23)
where the diagonal elements of $\boldsymbol{\Lambda}_f$ are the $P$ eigenvalues of the matrix $\boldsymbol{\Psi}_f$, and $\mathbf{Q} = \mathbf{T}^{-1}$ is composed of the eigenvectors corresponding to the eigenvalues of the matrix $\boldsymbol{\Psi}_f$.
Therefore, the DOD of the $p$-th target can be obtained from $a_{tp} = e^{-j2\pi d_t \sin\theta_p/\lambda}$, $p = 1, 2, \ldots, P$, i.e., from the diagonal elements of $\boldsymbol{\Lambda}_f$.
B. DOA Estimation<br />
For the estimation of the DOAs, let $\mathbf{A}_{q1}$ and $\mathbf{A}_{q2}$ be the two $2M_1(N-1) \times P$ sub-matrices of $\mathbf{A}_1(\theta,\phi,\gamma,\eta)$ formed with the first and the last $2(N-1)$ rows of each block of $\mathbf{A}_1(\theta,\phi,\gamma,\eta)$, respectively. Then $\mathbf{A}_{q2} = \mathbf{A}_{q1}\boldsymbol{\Lambda}_q$, where $\boldsymbol{\Lambda}_q$ is also a $P \times P$ diagonal matrix,
$\boldsymbol{\Lambda}_q = \mathrm{diag}\{e^{-j2\pi d_r \sin\phi_1/\lambda},\; e^{-j2\pi d_r \sin\phi_2/\lambda},\; \ldots,\; e^{-j2\pi d_r \sin\phi_P/\lambda}\}$ (24)
Similarly, we could form two sub-matrices based on $\mathbf{U}_s$ in the same way that $\mathbf{A}_{q1}$ and $\mathbf{A}_{q2}$ are formed from $\mathbf{A}_1(\theta,\phi,\gamma,\eta)$. However, the DOD and DOA of each target may not be well paired in this way. In order to pair the DODs and DOAs automatically, we first form a new matrix
$\tilde{\mathbf{U}}_s = \mathbf{U}_s\mathbf{Q}$ (25)
Substituting (18) into (25) and considering $\mathbf{Q} = \mathbf{T}^{-1}$ yields
$\tilde{\mathbf{U}}_s = \mathbf{A}_1(\theta,\phi,\gamma,\eta)\mathbf{T}\mathbf{Q} = \mathbf{A}_1(\theta,\phi,\gamma,\eta)$ (26)
Therefore, $\tilde{\mathbf{U}}_s$ has the same structure as $\mathbf{A}_1(\theta,\phi,\gamma,\eta)$. Thus, we divide $\tilde{\mathbf{U}}_s$ into $M_1$ blocks according to rows, each of which is a $2N \times P$ matrix. Let $\mathbf{U}_{q1} \in \mathbb{C}^{2M_1(N-1) \times P}$ be a sub-matrix formed with the first $2(N-1)$ rows of each block of $\tilde{\mathbf{U}}_s$, and $\mathbf{U}_{q2} \in \mathbb{C}^{2M_1(N-1) \times P}$ be a subset formed with the last $2(N-1)$ rows of each block of $\tilde{\mathbf{U}}_s$, in the same way that $\mathbf{A}_{q1}$ and $\mathbf{A}_{q2}$ are formed from $\mathbf{A}_1(\theta,\phi,\gamma,\eta)$. We have
$\mathbf{U}_{q1} = \mathbf{A}_{q1}\mathbf{T}\mathbf{Q} = \mathbf{A}_{q1}$ (27)
$\mathbf{U}_{q2} = \mathbf{A}_{q2}\mathbf{T}\mathbf{Q} = \mathbf{A}_{q1}\boldsymbol{\Lambda}_q\mathbf{T}\mathbf{Q} = \mathbf{A}_{q1}\boldsymbol{\Lambda}_q$ (28)
Then $\mathrm{span}\{\mathbf{U}_{q1}\} = \mathrm{span}\{\mathbf{U}_{q2}\} = \mathrm{span}\{\mathbf{A}_{q1}\}$, and
$\mathbf{U}_{q2} = \mathbf{U}_{q1}\boldsymbol{\Lambda}_q$ (29)
Thus, the DOA of the $p$-th target can be obtained from the $p$-th diagonal element of $\boldsymbol{\Lambda}_q$, i.e., $a_{qp} = e^{-j2\pi d_r \sin\phi_p/\lambda}$, and the DOA and DOD of the $p$-th target are automatically paired with each other.
C. Polarization Estimation<br />
For the estimation of the two polarization parameters, let $\mathbf{A}_{r1}$ and $\mathbf{A}_{r2}$ be the two $M_1N \times P$ sub-matrices of $\mathbf{A}_1(\theta,\phi,\gamma,\eta)$ formed with the even and the odd rows of $\mathbf{A}_1(\theta,\phi,\gamma,\eta)$, respectively. Then $\mathbf{A}_{r2} = \mathbf{A}_{r1}\boldsymbol{\Lambda}_r$, where $\boldsymbol{\Lambda}_r$ is the diagonal matrix
$\boldsymbol{\Lambda}_r = \mathrm{diag}\{r_1, r_2, \ldots, r_P\}$ (30)
where $r_p$ is the ratio of the first element to the second element of the vector in (5),
$r_p = \dfrac{-\cos\gamma_p}{\sin\gamma_p \cos\phi_p\, e^{j\eta_p}}$ (31)
Similarly, according to (26), let $\mathbf{U}_{r1} \in \mathbb{C}^{M_1N \times P}$ be a sub-matrix formed with the even rows of $\tilde{\mathbf{U}}_s$, and let $\mathbf{U}_{r2} \in \mathbb{C}^{M_1N \times P}$ be a subset formed with the odd rows of $\tilde{\mathbf{U}}_s$, in the same way that $\mathbf{A}_{r1}$ and $\mathbf{A}_{r2}$ are formed from $\mathbf{A}_1(\theta,\phi,\gamma,\eta)$. We have
$\mathbf{U}_{r1} = \mathbf{A}_{r1}\mathbf{T}\mathbf{Q} = \mathbf{A}_{r1}$ (32)
$\mathbf{U}_{r2} = \mathbf{A}_{r2}\mathbf{T}\mathbf{Q} = \mathbf{A}_{r1}\boldsymbol{\Lambda}_r\mathbf{T}\mathbf{Q} = \mathbf{A}_{r1}\boldsymbol{\Lambda}_r$ (33)
Then $\mathrm{span}\{\mathbf{U}_{r1}\} = \mathrm{span}\{\mathbf{U}_{r2}\} = \mathrm{span}\{\mathbf{A}_{r1}\}$, and
$\mathbf{U}_{r2} = \mathbf{U}_{r1}\boldsymbol{\Lambda}_r$ (34)
Therefore, the polarization parameters of the $p$-th target can be obtained from the diagonal elements of $\boldsymbol{\Lambda}_r$, i.e., $r_p = \dfrac{-\cos\gamma_p}{\sin\gamma_p \cos\phi_p\, e^{j\eta_p}}$, and the polarization parameters of the $p$-th target can be paired with the DOD of the $p$-th target.
The DOD and DOA of the $p$-th target are calculated by
$\theta_p = \arcsin\left\{-\dfrac{\lambda}{2\pi d_t}\angle(a_{tp})\right\}$ (35)
$\phi_p = \arcsin\left\{-\dfrac{\lambda}{2\pi d_r}\angle(a_{qp})\right\}$ (36)
The two polarization parameters of the $p$-th target are calculated by
$\gamma_p = \arctan\left(\left|\dfrac{1}{r_p\cos\phi_p}\right|\right)$ (37)
$\eta_p = \angle\left(-\dfrac{1}{r_p\cos\phi_p}\right)$ (38)
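As an illustration of (35)–(38), the following Python/NumPy sketch converts the diagonal elements of Λ_f, Λ_q and Λ_r into angle and polarization estimates. It is only a minimal sketch of this final parameter-recovery step, not the authors' implementation; the spacings d_t, d_r, the wavelength lam, and the previously estimated diagonal vectors are assumed inputs.

```python
import numpy as np

def recover_parameters(lam_f_diag, lam_q_diag, lam_r_diag, d_t, d_r, lam):
    """Map the diagonal elements of Lambda_f, Lambda_q, Lambda_r to
    DOD, DOA and polarization estimates via (35)-(38)."""
    # (35): DOD from the phase of the transmit-side eigenvalues
    theta = np.arcsin(-lam / (2 * np.pi * d_t) * np.angle(lam_f_diag))
    # (36): DOA from the phase of the receive-side eigenvalues
    phi = np.arcsin(-lam / (2 * np.pi * d_r) * np.angle(lam_q_diag))
    # (37)-(38): from r_p = -cos(gamma)/(sin(gamma)cos(phi)e^{j eta}),
    # the quantity g = -1/(r_p cos(phi_p)) equals tan(gamma_p)*e^{j eta_p}
    g = -1.0 / (lam_r_diag * np.cos(phi))
    gamma = np.arctan(np.abs(g))    # (37)
    eta = np.angle(g)               # (38)
    return theta, phi, gamma, eta
```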
The steps for the proposed algorithm are given as<br />
follows:<br />
Step1: Construct the cross-correlation matrix $\mathbf{R}_c$ between $\mathbf{y}_1^{(l)}$ and $\mathbf{y}_2^{(l)}$, and perform an SVD on it.
Step2: Form $\mathbf{U}_s$ from the left singular vectors of $\mathbf{R}_c$. Divide $\mathbf{U}_s$ into blocks, form the two sub-matrices $\mathbf{U}_{f1}$, $\mathbf{U}_{f2}$, and calculate $\boldsymbol{\Psi}_f = \mathbf{U}_{f1}^{+}\mathbf{U}_{f2}$.
Step3: Perform an eigen-decomposition of $\boldsymbol{\Psi}_f$ to obtain its eigenvalue matrix $\boldsymbol{\Lambda}_f$ and eigenvector matrix $\mathbf{Q}$.
Step4: Form the matrix $\tilde{\mathbf{U}}_s = \mathbf{U}_s\mathbf{Q}$, and divide $\tilde{\mathbf{U}}_s$ into $M_1$ blocks according to rows.
Step5: Form the two sub-matrices $\mathbf{U}_{q1}$, $\mathbf{U}_{q2}$ and the two sub-matrices $\mathbf{U}_{r1}$, $\mathbf{U}_{r2}$ using different blocks and rows of $\tilde{\mathbf{U}}_s$, then calculate $\boldsymbol{\Lambda}_q = \mathbf{U}_{q1}^{+}\mathbf{U}_{q2}$ and $\boldsymbol{\Lambda}_r = \mathbf{U}_{r1}^{+}\mathbf{U}_{r2}$ based on (29) and (34), respectively.
Step6: Estimate $\theta_p$, $\phi_p$, $\gamma_p$ and $\eta_p$ from the diagonal elements of $\boldsymbol{\Lambda}_f$, $\boldsymbol{\Lambda}_q$ and $\boldsymbol{\Lambda}_r$ using (35)–(38).
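The following Python/NumPy sketch outlines Steps 1–6 under simplifying assumptions: the cross-correlation matrix R_c is assumed to be already estimated from the two subarray outputs, and the block sizes follow the dimensions used above (M_1 blocks of 2N rows, P targets). It is an illustration of the subspace operations, not the authors' reference implementation.

```python
import numpy as np

def esprit_bistatic(R_c, M1, N, P):
    """Sketch of Steps 1-6: joint DOD/DOA/polarization subspace estimation."""
    # Steps 1-2: SVD of the cross-correlation matrix, keep the signal subspace
    U, s, Vh = np.linalg.svd(R_c)
    Us = U[:, :P]                                   # 2*M1*N x P

    # Step 2: transmit-side invariance (first / last M1-1 blocks of 2N rows)
    Uf1 = Us[: 2 * N * (M1 - 1), :]
    Uf2 = Us[2 * N :, :]
    Psi_f = np.linalg.pinv(Uf1) @ Uf2

    # Step 3: eigen-decomposition gives Lambda_f (eigenvalues) and Q (eigenvectors)
    lam_f, Q = np.linalg.eig(Psi_f)

    # Step 4: rotate the subspace so it shares the structure of A_1
    Us_t = Us @ Q
    blocks = Us_t.reshape(M1, 2 * N, P)

    # Step 5: receive-side and polarization invariances on the rotated subspace
    Uq1 = blocks[:, : 2 * (N - 1), :].reshape(-1, P)
    Uq2 = blocks[:, 2:, :].reshape(-1, P)
    lam_q = np.diag(np.linalg.pinv(Uq1) @ Uq2)

    Ur1 = Us_t[0::2, :]   # alternate rows for the two dipole components
    Ur2 = Us_t[1::2, :]   # (0-based indexing of the interleaved rows)
    lam_r = np.diag(np.linalg.pinv(Ur1) @ Ur2)

    # Step 6: feed lam_f, lam_q, lam_r into (35)-(38)
    return lam_f, lam_q, lam_r
```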
For a bistatic MIMO radar system, the distance between the transmitter and the receiver is known. Thus we can localize a target in a 2-D plane: $\theta_p$ and $\phi_p$ can be used to determine the position of a target via trilateration, and $\gamma_p$ and $\eta_p$ can be used in target identification and improvement of resolution.
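As a concrete illustration of this 2-D localization step, the sketch below intersects the two bearing lines given the known baseline. The geometry (transmitter at the origin, receiver on the x-axis, both angles measured from broadside toward the other array) and the baseline value are assumptions of the example, not conventions fixed by the paper.

```python
import numpy as np

def locate_target(theta_p, phi_p, B):
    """Locate a target in the 2-D plane from its DOD theta_p (transmitter at the
    origin) and DOA phi_p (receiver at (B, 0)). Both angles are measured from
    broadside (the y-axis), positive toward the other array, in radians."""
    y = B / (np.tan(theta_p) + np.tan(phi_p))   # range from the baseline
    x = y * np.tan(theta_p)                     # offset along the baseline
    return x, y

# Example: theta = 10 deg, phi = 50 deg, baseline B = 10 km
x, y = locate_target(np.deg2rad(10.0), np.deg2rad(50.0), 10e3)
```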
IV. CRAMER-RAO BOUND<br />
The Cramer-Rao bound (CRB) provides a lower bound on the variance of any unbiased estimator. The CRBs of the DOA, DOD and polarization parameters in MIMO radar are considered here. Based on the composite data of $\mathbf{y}_1^{(l)}$ and $\mathbf{y}_2^{(l)}$, the signal models in (9) and (10) can be jointly written as
$\mathbf{y}^{(l)} = \mathbf{A}(\theta,\phi,\gamma,\eta)\mathbf{b}^{(l)} + \mathbf{v}^{(l)}$ (39)
where $\mathbf{A}(\theta,\phi,\gamma,\eta) = \begin{bmatrix} \mathbf{A}_1(\theta,\phi,\gamma,\eta) \\ \mathbf{A}_2(\theta,\phi,\gamma,\eta) \end{bmatrix}$ and $\mathbf{v}^{(l)} = \begin{bmatrix} \mathbf{v}_1^{(l)} \\ \mathbf{v}_2^{(l)} \end{bmatrix}$.
Here the CRB expression of the JADE (Joint Angle and Delay Estimation) approach is extended to the polarimetric MIMO radar system, and the CRB for the parameters of interest is given as
$\mathrm{CRB}(\theta,\phi,\gamma,\eta) = \dfrac{\sigma_v^2}{2}\left\{\sum_{l=1}^{L} \mathrm{real}\left[\mathbf{B}_b^{(l)H}\mathbf{D}_A^H\mathbf{P}_A^{\perp}\mathbf{D}_A\mathbf{B}_b^{(l)}\right]\right\}^{-1}$ (40)
where $\mathbf{P}_A^{\perp} = \mathbf{I} - \mathbf{A}(\theta,\phi,\gamma,\eta)\mathbf{A}^{+}(\theta,\phi,\gamma,\eta)$, $\mathbf{B}_b^{(l)} = \mathbf{I}_4 \otimes \mathrm{diag}\{\mathbf{b}^{(l)}\}$, and $\mathbf{I}_4$ denotes an identity matrix of order 4. $\mathbf{D}_A$ denotes the differentiation matrix, in which each column is differentiated with respect to the corresponding parameter. $\mathbf{D}_A$ can be expressed as
$\mathbf{D}_A = [\dot{\mathbf{A}}_t(\theta) \diamond \mathbf{A}_r(\phi,\gamma,\eta),\; \mathbf{A}_t(\theta) \diamond \dot{\mathbf{A}}_r(\phi,\gamma,\eta)] = [\dot{\mathbf{A}}_\theta, \dot{\mathbf{A}}_\phi, \dot{\mathbf{A}}_\gamma, \dot{\mathbf{A}}_\eta]$ (41)
where $\mathbf{A}_t(\theta) = [\mathbf{A}_{t1}(\theta)^T,\ \mathbf{A}_{t1}(\theta)^T]^T$, and
$\dot{\mathbf{A}}_\theta = \left[\dfrac{\partial \mathbf{a}_t(\theta_1)}{\partial\theta_1} \otimes \mathbf{a}_r(\phi_1,\gamma_1,\eta_1),\; \ldots,\; \dfrac{\partial \mathbf{a}_t(\theta_P)}{\partial\theta_P} \otimes \mathbf{a}_r(\phi_P,\gamma_P,\eta_P)\right]$
$\dot{\mathbf{A}}_\phi = \left[\mathbf{a}_t(\theta_1) \otimes \dfrac{\partial \mathbf{a}_r(\phi_1,\gamma_1,\eta_1)}{\partial\phi_1},\; \ldots,\; \mathbf{a}_t(\theta_P) \otimes \dfrac{\partial \mathbf{a}_r(\phi_P,\gamma_P,\eta_P)}{\partial\phi_P}\right]$
$\dot{\mathbf{A}}_\gamma = \left[\mathbf{a}_t(\theta_1) \otimes \dfrac{\partial \mathbf{a}_r(\phi_1,\gamma_1,\eta_1)}{\partial\gamma_1},\; \ldots,\; \mathbf{a}_t(\theta_P) \otimes \dfrac{\partial \mathbf{a}_r(\phi_P,\gamma_P,\eta_P)}{\partial\gamma_P}\right]$
$\dot{\mathbf{A}}_\eta = \left[\mathbf{a}_t(\theta_1) \otimes \dfrac{\partial \mathbf{a}_r(\phi_1,\gamma_1,\eta_1)}{\partial\eta_1},\; \ldots,\; \mathbf{a}_t(\theta_P) \otimes \dfrac{\partial \mathbf{a}_r(\phi_P,\gamma_P,\eta_P)}{\partial\eta_P}\right]$
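A compact numerical sketch of evaluating (40) is given below, assuming the stacked steering matrix A, the derivative matrix D_A and the matrices B_b^{(l)} have already been built for a given parameter set; these inputs and the noise variance sigma_v2 are assumptions of the sketch, not quantities constructed by code in the paper.

```python
import numpy as np

def crb(A, D_A, B_list, sigma_v2):
    """Evaluate the CRB matrix of (40).
    A      : stacked steering matrix A(theta, phi, gamma, eta)
    D_A    : differentiation matrix with one column per parameter
    B_list : list of B_b^{(l)} = I_4 kron diag(b^{(l)}) over the L pulses
    """
    n = A.shape[0]
    # Orthogonal projector onto the noise subspace: P = I - A A^+
    P_perp = np.eye(n) - A @ np.linalg.pinv(A)
    F = sum(np.real(B.conj().T @ D_A.conj().T @ P_perp @ D_A @ B) for B in B_list)
    return (sigma_v2 / 2.0) * np.linalg.inv(F)
```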
V. SIMULATION RESULTS<br />
Simulations are conducted in this section to verify the effectiveness of the proposed method. The intervals between adjacent array elements of the transmit and receive arrays are all half a wavelength. The length of the sequences is 1024. The RCSs are given by $\beta_p^{(l)} = 1$. The number of snapshots is $L = 100$, and the number of Monte Carlo trials is 100. The performance of the algorithm is evaluated through the root-mean-square error (RMSE).
Example 1: Suppose that the transmit array and the receive array of the polarimetric MIMO radar consist of $M_1 = 3$ and $M_2 = 2$ elements and $N = 4$ pairs of crossed dipoles, respectively. There are $M_1 = 3$ antennas in the first subarray and $P = 3$ targets. Their DODs, DOAs and states of polarization are $\theta = (10^\circ, 40^\circ, -30^\circ)$, $\phi = (50^\circ, -20^\circ, 20^\circ)$, $\gamma = (9\pi/20, \pi/5, \pi/4)$ and $\eta = (2\pi/5, \pi/5, 4\pi/5)$. The Doppler frequencies of the three targets are 100 Hz, 2000 Hz and 5000 Hz, respectively. The estimation results of the DOD, DOA and polarization parameters when the SNR is 10 dB are shown in Fig. 2(a) and (b), respectively. The RMSE and root CRB versus SNR for the DOD, DOA and polarization estimation are shown in Fig. 3.
[Fig. 2(a): scatter of the estimated DOA φ (deg) versus DOD θ (deg). Fig. 2(b): scatter of the estimated polarization parameters η/π (rad) versus γ/π (rad).]
Figure 2. The estimation results of DOD, DOA and polarizations when SNR is 10 dB
[Fig. 3: four panels (DOD estimation, DOA estimation, polarization γ estimation, polarization η estimation), each plotting the RMSE (degree) of the 3 targets and the root CRB of the 3 targets versus SNR (dB) from 0 to 30 dB.]
Figure 3. The performances of RMSE and root CRB versus SNR for the DOD, DOA and polarizations estimation
From Fig. 2 and Fig. 3, it can be seen that the proposed method effectively estimates the DOD, DOA and polarization parameters for bistatic MIMO radar. In addition, automatic pairing between the DOD, DOA and polarization parameters is obtained.
Example 2: The numbers of transmit elements, receive elements and snapshots are the same as in Example 1. The number of targets is 2. The SNR changes from 0 dB to 30 dB. We consider two closely spaced targets, $\theta = (10^\circ, 11^\circ)$, $\phi = (20^\circ, 20^\circ)$, $\gamma = (\pi/6, \pi/3)$ and $\eta = (0, \pi/4)$.
We compare the performance of the proposed DOD/DOA/polarization parameter estimation algorithm with Chen's method [5] for DOD/DOA estimation. The RMSE versus SNR for the two targets is shown in Fig. 4.
[Fig. 4: RMSE (degree) versus SNR (dB) from 0 to 30 dB for target 1 and target 2, comparing our method with Chen's method.]
Figure 4. The performance comparison of DOD/DOA estimation.
The simulation shows that the proposed algorithm clearly outperforms the algorithm from previous research. Because polarization diversity is adopted in our algorithm, the multi-target resolution of the DOD/DOA estimation is markedly improved.
VI. CONCLUSION<br />
In this paper, we propose a robust algorithm to jointly estimate the DOD, DOA and polarization parameters for multiple targets in a bistatic MIMO radar system via the ESPRIT algorithm, polarization sensitive array processing, and the SVD of the cross-correlation matrix of the received data from two transmitter subarrays. The simulation results show that the proposed method can effectively estimate multiple parameters for each target, i.e., the angles of departure and arrival and the two polarization parameters, in a noisy environment.
Also, the parameters can be paired automatically. Using the polarization diversity technique, the estimation performance is improved, especially when two targets are closely spaced and cannot be well separated in the spatial domain.
ACKNOWLEDGMENT<br />
This work is supported by the National Natural Science Funds of China (61071140 and 60901060), as well as the Jilin Province Natural Science Foundation of China (201215014).
The authors wish to thank the anonymous reviewers<br />
for their valuable comments and suggestions which<br />
greatly improved the manuscript.<br />
REFERENCES<br />
[1] E. Fishler, A. Haimovich, and R. Blum, "MIMO radar: an idea whose time has come," Proc. of the IEEE Int. Conf. on Radar, Philadelphia, PA, 2004, pp. 71-78.
[2] I. Bekkerman and J. Tabrikian, "Target detection and localization using MIMO radars and sonars," IEEE Trans. on Signal Processing, 54(10), 2006, pp. 3873-3883.
[3] Lehmann, E. Fishler, A. Haimovich, et al., "Evaluation of transmit diversity in MIMO radar direction finding," IEEE Transactions on Signal Processing, 55(5), 2007, pp. 2215-2225.
[4] J. Li and P. Stoica, "MIMO radar with colocated antennas," IEEE Signal Processing Magazine, 2007, pp. 106-114.
[5] J. Chen, H. Gu, and W. Su, "A new method for joint DOD and DOA estimation in bistatic MIMO radar," Signal Processing, 90, 2009, pp. 714-718.
[6] J. Chen, H. Gu, and W. Su, "Angle estimation using ESPRIT in MIMO radar," Electronics Letters, 44(24), 2008, pp. 1422-1423.
[7] C. Duofang, C. Baixiao, and Q. Guodong, "Angle estimation using ESPRIT in MIMO radar," Electronics Letters, 44(12), 2008, pp. 770-771.
[8] H. Yan, J. Li, and G. Liao, "Multitarget identification and localization using bistatic MIMO radar systems," EURASIP Journal on Advances in Signal Processing, 2008, pp. 283-483.
[9] M. Jin, G. Liao, and J. Li, "Joint DOD and DOA estimation for bistatic MIMO radar," Signal Processing, 89(2), 2009, pp. 244-251.
[10] M. Hurtado, J. JunXiao, and A. Nehorai, "Target estimation, detection, and tracking," IEEE Signal Processing Magazine, 2009, pp. 42-52.
[11] M. Hurtado and A. Nehorai, "Polarimetric detection of targets in heavy inhomogeneous clutter," IEEE Trans. Signal Process., 56, 2008, pp. 1349-1361.
[12] Z. W. Zhuang, Z. H. Xu, S. P. Xiao, et al., Signal Processing of Polarization Sensitive Array (in Chinese), China National Defence Industrial Press, Mar. 2005.
Hong Jiang is an Associate Professor with the College of Communication Engineering, Jilin University, China. She is an IEEE member and a senior member of the Chinese Institute of Electronics (CIE). Her current research fields focus on statistical array processing and localization for radar and wireless communications. She has published over 50 papers. She received the B.S. degree in wireless communication from Tianjin University, China, in 1989, the M.S. degree in Communication and Electronic System from Jilin University of Technology, China, in 1996, and the Ph.D. degree in Communication and Information System from Jilin University, China, in 2005. From 2010 to 2011, she worked as a visiting research fellow at McMaster University, Canada. Currently, she is working as a Post-Doctoral fellow in the Military Simulation Technology Institute, Aviation University of Air Force, China.
Yu Zhang was born in Changchun, China on June 15, 1987. She is currently a graduate student in the College of Communication Engineering, Jilin University, China. Her current research interest is target localization in MIMO radar and its parameter estimation.
Hong-Jun Liu is a Professor with the Military Simulation Technology Institute, Aviation University of Air Force, Changchun, China. His research fields focus on flight simulators and their applications.
Xiao-Hui Zhao is a Professor with the College of Communication Engineering, Jilin University, China. His current research fields focus on signal processing for wireless communication.
Chang Liu was born in Zhengzhou, China on Mar. 17, 1987. She received the M.S. degree in Communication and Information System from Jilin University, China. Currently, she is a PhD student in the Dept. of Electrical and Computer Engineering, Kansas State University, USA. Her current research interest is localization in wireless communication networks.
The Central DOA Estimation Algorithm Based<br />
on Support Vector Regression for Coherently<br />
Distributed Source<br />
Yinghua Han<br />
Northeastern University at Qinhuangdao, Qinhuangdao, China<br />
Email: yhhan723@126.com<br />
Jinkuan Wang<br />
Northeastern University at Qinhuangdao, Qinhuangdao, China<br />
Email: wjk@mail.neuq.edu.cn<br />
Abstract—In this paper, the problem of estimating the central direction of arrival (DOA) of a coherently distributed source impinging upon a uniform linear array is considered. An efficient method based on support vector regression is proposed. After a training phase in which several known input/output mappings are used to determine the parameters of the support vector machines, the unknown mapping between the outputs of the array and the central DOA of the impinging plane waves is approximated by means of a family of support vector machines. So they perform well in response to input signals that have not been initially included in the training set. Furthermore, a particle swarm optimization (PSO) algorithm is applied to determine the support vector machine parameters, which are crucial for its learning results and generalization ability. Several numerical results are provided for the validation of the proposed approach.
Index Terms—coherently distributed source; the central<br />
DOA; angular spread; support vector machines; particle<br />
swarm optimization<br />
I. INTRODUCTION<br />
Sensor array processing plays a prominent role in applications involving the propagation of plane waves through a medium. The problem of finding the directions of waves impinging on an antenna or sensor array, namely, direction finding or DOA estimation, has been of interest for several decades. This is because the direction is a useful parameter for several systems, such as wireless communications, radar, navigation, etc. In most DOA estimation algorithms, it is commonly assumed that the received signals originate from far-field point sources and give rise to perfectly planar wavefronts which impinge on the array from discrete and fixed DOAs. However, in many practical applications, such as radar, sonar and mobile communications, the sensor array often receives sources which have been reflected by a number of scatterers. The
Manuscript received July 25, 2011; revised September 25, 2011;<br />
accepted March 28, 2012.<br />
Corresponding author: Jinkuan Wang.<br />
doi:10.4304/jsw.7.12.2710-2716<br />
scattered signals are received from a narrow angular region, so an alternative signal model, called the distributed source model, can be derived [1-4].
Note that depending on the relationship between the<br />
channel coherency time and the observation period, the<br />
sources can be viewed either as coherently distributed or<br />
incoherently distributed [5]. A source is called coherently<br />
distributed if the signal components arriving from<br />
different directions are replicas <strong>of</strong> the same signal,<br />
whereas in the incoherently distributed source case, all<br />
signals coming from different directions are assumed to<br />
be uncorrelated. Indeed, if the channel coherency time is<br />
much smaller than the observation period, then the<br />
incoherently distributed model is relevant. In the opposite<br />
case, the coherently distributed model or a partially<br />
coherent model can be used.<br />
Several methods have been proposed for estimating parameters for these two types of distributed sources. Indeed, in the coherently distributed source case, the rank of the noise-free covariance matrix is equal to the number of sources. On the other hand, for incoherently distributed sources, the rank of the noise-free covariance matrix increases as the angular spread increases. In particular, for a single-source case, the rank can reach the number of array sensors [6]. However, most of the signal energy is concentrated within the first few eigenvalues of the noise-free covariance matrix. The number of these eigenvalues is referred to as the effective dimension of the signal subspace. It is generally smaller than the number of sensors.
Typically, the statistics of a distributed source are parameterized by its central DOA and angular spread. A number of investigators have proposed distributed source modeling, and several parameter estimation techniques have been proposed in the literature [7-12]. To begin with, attempts at coherently distributed source modeling and parameter estimation were made in [7], where the central DOAs and angular spreads are estimated by algorithms based on MUSIC using a uniform linear array. However, this algorithm needs a two-dimensional joint search and assumes that the multiple
sources must have identical and known angular signal intensity functions. In contrast to some computationally complex approaches such as the maximum likelihood [8], the dispersed signal parameter estimator (DISPARE) [9] and the covariance fitting approach [10] have been provided. Subsequently, a classical localization algorithm has been used to estimate the virtual parameters and deduce the required ones. The focus in [11] has been on root-MUSIC, which was shown to provide better accuracy with relatively low computational complexity compared to some other point-source localization algorithms [12-13]. Other robust techniques using the array geometry have recently been developed. A typical example is low-complexity parameter estimation with the ESPRIT technique [14], which employs eigenvalue decomposition with two uniform linear arrays. The ESPRIT algorithm is still computationally extensive and time consuming, especially when the number of antenna array elements is larger than the number of incident signals. An asymptotic maximum likelihood method for the joint estimation of the central DOAs and angular spreads of multiple distributed sources is presented in [15]. Though it has the best precision, the computational load is high.
Recently, many low-complexity methods have been proposed to reduce the computational burden of the estimators [16-18]. For example, the decoupled COMET-EXIP [16] uses two successive one-dimensional searches instead of a two-dimensional search for parameter estimation of a single incoherently distributed source.
Furthermore, methods based on the use of neural networks and radial basis function networks have also been efficiently applied for point source DOA estimation [19-20]. In these works, the outputs of the array, properly preprocessed, are used as input data for a family of neural networks trained with a subset of the possible configurations of the impinging sources.
In this paper, an alternative algorithm is proposed, which is based on support vector regression. In particular, the support vector regression approximates the unknown function that relates the received signals to the angles of incidence. Support vector regression is based on the theory of support vector machines, which are a nonlinear generalization of the generalized portrait algorithm. In the past few years, there has been great interest in the development of support vector machines, mainly because they have yielded excellent generalization performance in applications [21-22]. Moreover, fast iterative algorithms based on support vector machines, which are relatively simple to implement, have been developed [23-24].
The remainder of this paper is organized as follows. Section II presents the array configuration and the system model. Section III proposes a central DOA estimation algorithm for coherently distributed sources based on support vector regression. Section IV describes particle swarm optimization for the parameter selection of the support vector regression. Section V gives simulation results, and Section VI concludes the paper.
II. PROBLEM STATEMENT AND PRELIMINARIES
In this work, a uniform linear array composed of M elements with interelement spacing d is considered. q electromagnetic narrowband plane waves impinge on the array from different directions.
The complex envelope of the array output can be written as
$\mathbf{X}(t) = \sum_{i=1}^{q}\mathbf{S}_i(t) + \mathbf{N}(t)$ (1)
where $\mathbf{X}(t)$ is the array snapshot vector, $\mathbf{S}_i(t)$ is the vector that describes the contribution of the $i$-th signal source to the array output, and the noise $\mathbf{N}(t)$ is zero-mean, spatially and temporally white and Gaussian,
$E\{\mathbf{N}(t)\mathbf{N}^H(t')\} = \sigma^2\mathbf{I}\,\delta_{tt'}$ (2)
and
$E\{\mathbf{N}(t)\mathbf{N}^T(t')\} = \mathbf{0}, \quad \forall t, t'$ (3)
where $\sigma^2$ is the noise variance, $\mathbf{I}$ denotes the identity matrix, and $\delta_{tt'}$ is the Kronecker delta function with $\delta_{tt'} = 1$ for $t = t'$ and $\delta_{tt'} = 0$ for $t \neq t'$. We also assume that the signal is uncorrelated with the noise.
In the point source model, the baseband signal of the $i$-th source is modeled as
$\mathbf{S}_i(t) = s_i(t)\,\mathbf{a}(\theta_i)$ (4)
where $s_i(t)$ is the complex envelope of the $i$-th source, $\theta_i$ is its DOA, $\mathbf{a}(\theta_i) = [1,\; e^{-j2\pi d\sin\theta_i/\lambda},\; \ldots,\; e^{-j2\pi(M-1)d\sin\theta_i/\lambda}]^T$ is the corresponding steering vector, $d$ is the distance between two adjacent sensors, and $\lambda$ is the wavelength of the impinging signal.
In many environments for modern radio communications, the transmitted signal is often obstructed by buildings, vehicles, trees, etc., and/or reflected by rough surfaces. Hence, the absence of a single Line-Of-Sight (LOS) ray will violate the classical point source assumption.
Assume a single narrow point source that contributes a large number of wavefronts originating from multi-path reflections near the source and during transmission. If we observe the baseband signals received at the antenna array, it is possible to regard the source as a spatially distributed source, as shown in Fig. 1.
Fig.1 Distributed source model<br />
In the distributed signal model, the source energy is considered to be spread over some angular volume. Hence, $\mathbf{S}_i(t)$ is written as
$\mathbf{S}_i(t) = \int_{\vartheta\in\Theta}\mathbf{a}(\vartheta)\,\varsigma(\vartheta, \psi_i, t)\,d\vartheta$ (5)
where $\Theta$ is the parameter space of interest over which the steering vector is defined, and $\varsigma(\vartheta, \psi_i, t)$ is a complex random angular-temporal signal intensity, which can be expressed as
$\varsigma(\vartheta, \psi_i, t) = s(t)\,l(\vartheta; \psi_i)$ (6)
under the coherently distributed source assumption, where $\psi_i$ is the location parameter. Examples of the parameter vector are the mean and standard deviation of a source with Gaussian angular signal intensity.
The steering vector of the distributed source is defined as
$\mathbf{b}(\psi_i) = \int_{\vartheta\in\Theta}\mathbf{a}(\vartheta)\,l(\vartheta; \psi_i)\,d\vartheta$ (7)
As a common example of the coherently distributed source, assume that the deterministic angular signal intensity $l(\vartheta; \psi_i)$ has the Gaussian shape
$l(\vartheta; \theta_i, \sigma_{\theta_i}) = \dfrac{1}{\sqrt{2\pi}\,\sigma_{\theta_i}}\exp\left(-\dfrac{(\vartheta - \theta_i)^2}{2\sigma_{\theta_i}^2}\right)$ (8)
Here $\psi_i = [\theta_i, \sigma_{\theta_i}]$, where $\theta_i$ is the central DOA and $\sigma_{\theta_i}$ is the angular spread.
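For instance, the coherently distributed steering vector of (7) with the Gaussian intensity of (8) can be approximated by numerical integration over a grid of angles. The sketch below is only meant to make the integral concrete; the half-wavelength spacing, grid width and grid size are assumptions of the example.

```python
import numpy as np

def point_steering(theta, M, d_over_lambda=0.5):
    """Point-source steering vector a(theta) for an M-element ULA, theta in radians."""
    m = np.arange(M)
    return np.exp(-1j * 2 * np.pi * d_over_lambda * m * np.sin(theta))

def distributed_steering(theta0, sigma, M, n_grid=721):
    """Approximate b(psi_i) of (7) with the Gaussian intensity l of (8)."""
    grid = np.linspace(theta0 - 5 * sigma, theta0 + 5 * sigma, n_grid)
    l = np.exp(-(grid - theta0) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    a = np.stack([point_steering(v, M) for v in grid], axis=1)   # M x n_grid
    return np.trapz(a * l, grid, axis=1)                          # integrate over angle
```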
Using the above definitions, the covariance matrix of the output signal vector can be written as
$\mathbf{R}_{XX} = E\left[\mathbf{X}(t)\mathbf{X}^H(t)\right]$ (9)
In practical situations, the true covariance matrix of $\mathbf{X}(t)$ is unavailable but can be estimated. Therefore, the sample covariance matrix with $N$ snapshots is defined as
$\hat{\mathbf{R}}_{XX} = \dfrac{1}{N}\sum_{t=1}^{N}\mathbf{X}(t)\mathbf{X}^H(t)$ (10)
III. THE CENTRAL DOA ESTIMATION BASED ON SUPPORT<br />
VECTOR REGRESSION<br />
The support vector regression is based on the theory of support vector machines, which are a nonlinear generalization of the generalized portrait algorithm developed by Vapnik [25]. In particular, support vector machines have a rigorous mathematical foundation, which is based on learning theory.
Since the correlation matrix $\mathbf{R}_{XX}$ is symmetric, only the upper triangular part is considered. These matrix elements are organized in an array $\mathbf{V}$, given by
$\mathbf{V} = [r_{11}, r_{12}, \ldots, r_{1M}, r_{22}, r_{23}, \ldots, r_{2M}, \ldots, r_{mm}, \ldots, r_{mM}, \ldots, r_{MM}]$ (11)
where $r_{hk} = [\mathbf{R}]_{hk}$, $h, k = 1, \ldots, M$.
The array $\mathbf{V}$ is then normalized in order to obtain the input data $\mathbf{Z}$,
$\mathbf{Z} = \dfrac{\mathbf{V}}{\|\mathbf{V}\|}$ (12)
Since $\mathbf{Z} \in \Sigma$, $\Sigma \subset \mathbb{C}^{M(M+1)/2}$, and $\boldsymbol{\theta} = [\theta_1 \cdots \theta_q] \in \Theta$, $\Theta \subset \mathbb{R}^q$, a mapping $G: \Theta \rightarrow \Sigma$ exists. The problem of the central DOA estimation can be thought of as the retrieval of $\boldsymbol{\theta}$, starting from the knowledge of the array $\mathbf{Z}$. To this end, the unknown inverse mapping $F: \Sigma \rightarrow \Theta$ has to be found. The components of $F$ are estimated by using a regression approach in which, starting from the knowledge of several input/output pairs $(\mathbf{Z}, \boldsymbol{\theta})$, an approximation $\tilde{F}$ to $F$ is constructed at the end of the training phase.
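As an illustration of (10)-(12), the following sketch builds the normalized input vector Z from the array snapshots. The snapshot matrix X is an assumed input, and stacking the real and imaginary parts of the upper-triangular entries is one possible way to feed a real-valued regressor, not a detail prescribed by the paper.

```python
import numpy as np

def covariance_features(X):
    """X: complex array snapshots of shape (M, N_snapshots).
    Returns the normalized feature vector Z built from the upper
    triangular part of the sample covariance matrix, cf. (10)-(12)."""
    M, N = X.shape
    R = (X @ X.conj().T) / N                    # sample covariance, (10)
    iu = np.triu_indices(M)                     # upper triangular part, (11)
    V = R[iu]
    Z = V / np.linalg.norm(V)                   # normalization, (12)
    # stack real and imaginary parts to obtain a real-valued input vector
    return np.concatenate([Z.real, Z.imag])
```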
By using the support vector regression, $\tilde{F}$ is defined as
$\tilde{F}(\mathbf{Z}) = \langle \mathbf{w}, \Phi(\mathbf{Z})\rangle + b$ (13)
where $\langle\cdot,\cdot\rangle$ denotes the scalar product, $\Phi$ is a nonlinear function that transforms the input array from the space $\Sigma$ to a high-dimensional space, and $\mathbf{w}$ and $b$ are parameters obtained by minimizing the regression risk, defined as
$R_{reg} = \dfrac{1}{2}\|\mathbf{w}\|^2 + \tilde{C}_1\sum_{i=1}^{L}f(\mathbf{Z}_i, \theta_i)$ (14)
where $\tilde{C}_1$ is a constant and $f(\mathbf{Z}_i, \theta_i)$ is the so-called $\varepsilon$-insensitive loss function, given by
$f(\mathbf{Z}_i, \theta_i) = \begin{cases} 0, & \text{if } |\theta_i - F(\mathbf{Z}_i)| \leq \varepsilon \\ |\theta_i - F(\mathbf{Z}_i)| - \varepsilon, & \text{otherwise} \end{cases} \quad i = 1, 2, \ldots, L$ (15)
Considering the regression error, (15) can be rewritten as
$f(\mathbf{Z}_i, \theta_i) = \begin{cases} 0, & \text{if } |\theta_i - F(\mathbf{Z}_i)| \leq \varepsilon \\ \theta_i - \mathbf{w}\cdot\Phi(\mathbf{Z}_i) - b \leq \varepsilon + \xi_i, & \\ \mathbf{w}\cdot\Phi(\mathbf{Z}_i) + b - \theta_i \leq \varepsilon + \xi_i', & \text{otherwise} \end{cases}$ (16)
with $\xi_i, \xi_i' \geq 0$, $i = 1, 2, \ldots, L$, where $\xi_i$ and $\xi_i'$ are slack variables. So the problem is equivalent to minimizing
$\min \; \dfrac{1}{2}\|\mathbf{w}\|^2 + \tilde{C}_1\left(\sum_{i=1}^{L}\xi_i + \sum_{i=1}^{L}\xi_i'\right)$
subject to
$\begin{cases} \theta_i - \mathbf{w}\cdot\Phi(\mathbf{Z}_i) - b \leq \varepsilon + \xi_i \\ \mathbf{w}\cdot\Phi(\mathbf{Z}_i) + b - \theta_i \leq \varepsilon + \xi_i' \\ \xi_i \geq 0, \quad \xi_i' \geq 0 \end{cases}$ (17)
The corresponding Lagrangian is
$L(\mathbf{w}, b, \boldsymbol{\xi}, \boldsymbol{\xi}') = \dfrac{1}{2}\|\mathbf{w}\|^2 + \tilde{C}_1\sum_{i=1}^{L}(\xi_i + \xi_i') - \sum_{i=1}^{L}(\lambda_i\xi_i + \lambda_i'\xi_i') + \sum_{i=1}^{L}\alpha_i\left[\theta_i - \mathbf{w}^T\Phi(\mathbf{Z}_i) - b - \varepsilon - \xi_i\right] + \sum_{i=1}^{L}\alpha_i'\left[\mathbf{w}^T\Phi(\mathbf{Z}_i) + b - \theta_i - \varepsilon - \xi_i'\right]$ (18)
Besides, the KKT (Karush-Kuhn-Tucker) conditions force $\dfrac{\partial L}{\partial \mathbf{w}} = 0$, $\dfrac{\partial L}{\partial b} = 0$, $\dfrac{\partial L}{\partial \xi_i} = \dfrac{\partial L}{\partial \xi_i'} = 0$, $\lambda_i\xi_i = 0$, $\lambda_i'\xi_i' = 0$. Applying them, we obtain the optimal support vector regression weights $\mathbf{w} = \sum_{i=1}^{L}(\alpha_i - \alpha_i')\Phi(\mathbf{Z}_i)$. So (18) can be expressed as
$L(\alpha, \alpha') = -\dfrac{1}{2}\sum_{i=1}^{L}\sum_{j=1}^{L}(\alpha_i - \alpha_i')\,\Phi(\mathbf{Z}_i)\cdot\Phi(\mathbf{Z}_j)\,(\alpha_j - \alpha_j') + \sum_{i=1}^{L}\theta_i(\alpha_i - \alpha_i') - \varepsilon\sum_{i=1}^{L}(\alpha_i + \alpha_i')$ (19)
subject to
$\sum_{i=1}^{L}(\alpha_i - \alpha_i') = 0, \quad 0 \leq \alpha_i, \alpha_i' \leq \tilde{C}_1, \quad i = 1, 2, \ldots, L$ (20)
The dual variables $\alpha_i - \alpha_i'$ and $b$ are computed by using the KKT conditions.
The regression function for tracking the coherently distributed source is given by
$\tilde{F}(\mathbf{Z}) = \sum_{i=1}^{L}(\alpha_i - \alpha_i')\,\Phi(\mathbf{Z}_i)\cdot\Phi(\mathbf{Z}) + b = \sum_{i=1}^{L}(\alpha_i - \alpha_i')\,K(\mathbf{Z}_i\cdot\mathbf{Z}) + b$ (21)
where $K(\mathbf{Z}_i\cdot\mathbf{Z})$ is the kernel function working on the original space $\Sigma$.
Several kernel functions help the support vector regression in obtaining the optimal solution. The most frequently used kernel functions are the polynomial, sigmoid and radial basis function (RBF) kernels, as follows [26]:
$K(\mathbf{x}, \mathbf{x}_i) = \left[(\mathbf{x}\cdot\mathbf{x}_i) + 1\right]^q$ (22)
$K(\mathbf{x}, \mathbf{x}_i) = \tanh\left(v(\mathbf{x}\cdot\mathbf{x}_i) + e\right)$ (23)
$K(\mathbf{x}, \mathbf{x}_i) = \exp\left\{-\dfrac{\|\mathbf{x} - \mathbf{x}_i\|^2}{\gamma^2}\right\}$ (24)
The RBF kernel is applied most frequently, because it can handle multi-dimensional data, unlike a linear kernel function. Additionally, the RBF kernel has fewer parameters to set than a polynomial kernel, while RBF and other kernel functions have similar overall performance. Consequently, the RBF is an effective option for the kernel function. Therefore, this study applies an RBF kernel function in the support vector regression to obtain the optimal solution.
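A minimal sketch of this training and prediction stage is given below, assuming the feature vectors Z (built as in (10)-(12)) and the corresponding central DOAs are available as NumPy arrays. scikit-learn's epsilon-SVR with an RBF kernel is used purely for illustration, and the parameter values shown are placeholders rather than the values selected in Section IV; note also that scikit-learn's gamma multiplies the squared distance, i.e. it corresponds to 1/γ² in the notation of (24).

```python
import numpy as np
from sklearn.svm import SVR

def train_doa_svr(Z_train, theta_train, C=100.0, epsilon=0.01, gamma=1.0):
    """epsilon-insensitive SVR with an RBF kernel, cf. (14)-(24).
    Z_train: rows are normalized covariance feature vectors.
    theta_train: known central DOAs for the training set."""
    model = SVR(kernel="rbf", C=C, epsilon=epsilon, gamma=gamma)
    model.fit(Z_train, theta_train)
    return model

# Prediction for new array outputs:
# theta_hat = train_doa_svr(Z_train, theta_train).predict(Z_test)
```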
IV. PARTICLE SWARM OPTIMIZATION FOR PARAMETER SELECTION OF SUPPORT VECTOR REGRESSION
The determination and selection of the parameters of the support vector machine is important in most applications. Two major RBF parameters applied in the support vector machine, $\tilde{C}_1$ and $\gamma$, must be set appropriately. Parameter $\tilde{C}_1$ represents the cost of the penalty. The choice of the value for $\tilde{C}_1$ influences the classification outcome. If $\tilde{C}_1$ is too large, then the classification accuracy rate is very high in the training phase, but very low in the testing phase. If $\tilde{C}_1$ is too small, then the classification accuracy rate is unsatisfactory, making the model useless. Parameter $\gamma$ has a much greater influence on classification outcomes than $\tilde{C}_1$, because its value affects the partitioning outcome in the feature space. An excessively large value for the parameter results in over-fitting, while a disproportionately small value leads to under-fitting.
Grid search is the most common method to determine appropriate values for $\tilde{C}_1$ and $\gamma$ [27]. Values of the parameters $\tilde{C}_1$ and $\gamma$ that lead to the highest classification accuracy rate in a given interval can be found by setting appropriate values for the upper and lower bounds (the search interval) and the jumping interval of the search. Nevertheless, this approach is a local search method and is vulnerable to local optima. Additionally, setting the search interval is a problem: too large a search interval wastes computational resources, while too small a search interval might render a satisfactory outcome impossible.
In addition to the commonly used grid search, other techniques are employed in support vector machines to improve the possibility of a correct choice of parameter values. Pai and Hong proposed an SA-based approach to obtain parameter values for the support vector machine and applied it to real data [28]; however, this approach does not address feature selection, and therefore may exclude the optimal result.
As well as the two parameters $\tilde{C}_1$ and $\gamma$, other factors, such as the quality of the feature dataset, may influence the classification accuracy rate. For instance, the correlations between features influence the classification result. Accidental removal of important features might lower the classification accuracy rate. Additionally, some dataset features may have no influence at all, or may contain a high level of noise. Removing such features can improve the searching speed and the accuracy rate.
Here the particle swarm optimization (PSO) algorithm is used to optimize the parameters of the support vector machine. PSO is an emerging population-based meta-heuristic that simulates social behavior, such as birds flocking to a promising position, to achieve precise objectives in a multidimensional space [29]. Like evolutionary algorithms, PSO performs searches using a population (called a swarm) of individuals (called particles) that are updated from iteration to iteration. To discover the optimal solution, each particle changes its searching direction according to two factors: its own best previous experience and the best experience of all other members.
Each particle represents a candidate position. A particle is considered as a point in a $D$-dimensional space, and its status is characterized by its position and velocity. The $D$-dimensional position of particle $i$ at iteration $t$ can be represented as $\mathbf{x}_i^t = \{x_{i1}^t, x_{i2}^t, \ldots, x_{iD}^t\}$. Likewise, the velocity, i.e., the distance change, which is also a $D$-dimensional vector, of particle $i$ at iteration $t$ can be described as $\mathbf{v}_i^t = \{v_{i1}^t, v_{i2}^t, \ldots, v_{iD}^t\}$. Let $\mathbf{p}_i^t = \{p_{i1}^t, p_{i2}^t, \ldots, p_{iD}^t\}$ represent the best solution that particle $i$ has found up to iteration $t$, and $\mathbf{p}_g^t = \{p_{g1}^t, p_{g2}^t, \ldots, p_{gD}^t\}$ denote the best solution obtained among the $\mathbf{p}_i^t$ in the population at iteration $t$. To search for the optimal solution, each particle changes its
velocity according to the cognition and social parts as follows:
$V_{id}^{t} = V_{id}^{t-1} + c_1 r_1 (p_{id}^{t} - x_{id}^{t}) + c_2 r_2 (p_{gd}^{t} - x_{id}^{t}), \quad d = 1, 2, \ldots, D$ (25)
where $c_1$ and $c_2$ are accelerating factors, and $r_1$ and $r_2$ are random numbers uniformly distributed in $(0,1)$. Each particle then moves to a new potential solution based on the following equation:
$X_{id}^{t+1} = X_{id}^{t} + V_{id}^{t}, \quad d = 1, 2, \ldots, D$ (26)
id id id<br />
The generalization ability <strong>of</strong> support vector machine<br />
algorithms depends on a set <strong>of</strong> parameters, including the<br />
penalty actor 1 C% , the estimated accuracy ε and the RBF<br />
kernel parameter γ . Defining i = ( Ci, εi, γ i)<br />
( v , v , v )<br />
i i1 i2 i3<br />
U % , the speed<br />
V =<br />
, the history optimal location<br />
( p1, p 2, p 3 )<br />
P =<br />
,the global optimal location<br />
i i i i<br />
( g , g , g )<br />
Gi = i1 i2 i3<br />
for ith particle. The update for location<br />
and speed are written as<br />
C% i = C% i + vi1<br />
(27)<br />
vik = wv 1 i1+ c1× rand( ) × ( pi1−C% i )<br />
+ c × rand × g −C%<br />
(28)<br />
( ) ( i i)<br />
2 1<br />
where w 1 is inertia factor,<br />
wmax −wmin<br />
w1( itc) = wmin + × ( itcmax − itc)<br />
(29)<br />
itc<br />
where itc is the number <strong>of</strong> iteration, itcmax is the maximal<br />
number <strong>of</strong> iteration, wmax and wmin are the maximal and<br />
minimum inertia factors, respectively. The fitness<br />
function for the proposed central DOA estimation<br />
algorithm is defined as<br />
( ) 2<br />
1<br />
1<br />
K )<br />
fits = ∑ θi −θi<br />
(30)<br />
K i=<br />
The basic process <strong>of</strong> the particle swarm optimization<br />
for parameter selection <strong>of</strong> support vector regression is<br />
given as follows,<br />
Step1,(Initialization) Randomly generate initial<br />
particles;<br />
Step2, (Fitness) Measure the fitness <strong>of</strong> each particle in<br />
the population;<br />
Step3, (Update) Compute the velocity <strong>of</strong> each particle<br />
with (28);<br />
Step4, (Construction) For each particle, move to the<br />
next position according to (26);<br />
Step5, (Termination) Stop the algorithm if termination<br />
criterion is satisfied; Return to Step 2 otherwise.<br />
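The loop below is a small sketch of Steps 1-5 for tuning (C, ε, γ). The swarm size, iteration count, inertia schedule and the routine doa_fitness (which should train an SVR with the candidate parameters and return the error of (30) on a validation set) are assumptions of this illustration, not values fixed by the paper.

```python
import numpy as np

def pso_select_svr_params(doa_fitness, bounds, n_particles=20, n_iter=50,
                          c1=2.0, c2=2.0, w_max=0.9, w_min=0.4):
    """Particle swarm search over (C, epsilon, gamma) within the given bounds.
    bounds: array of shape (3, 2) holding [low, high] per parameter."""
    lo, hi = bounds[:, 0], bounds[:, 1]
    x = lo + np.random.rand(n_particles, 3) * (hi - lo)        # positions
    v = np.zeros_like(x)                                        # velocities
    p_best, p_val = x.copy(), np.array([doa_fitness(p) for p in x])
    g_best = p_best[np.argmin(p_val)]

    for it in range(n_iter):
        w = w_min + (w_max - w_min) * (n_iter - it) / n_iter    # inertia, cf. (29)
        r1, r2 = np.random.rand(n_particles, 3), np.random.rand(n_particles, 3)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)   # (25)/(28)
        x = np.clip(x + v, lo, hi)                              # (26)/(27)
        val = np.array([doa_fitness(p) for p in x])
        improved = val < p_val
        p_best[improved], p_val[improved] = x[improved], val[improved]
        g_best = p_best[np.argmin(p_val)]
    return g_best                                               # (C, epsilon, gamma)
```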
V. NUMERICAL RESULTS
Several numerical simulations have been performed in order to validate the proposed approach. An array composed of 8 elements with interelement distance $d = 0.5\lambda$ is considered. The kernel function used in this work is a radial kernel.
We investigate the performance of the proposed fast DOA estimation of a single coherently distributed source with Gaussian deterministic angular signal density. Moreover, in the considered simulations, following a widely used approach, an estimate of the correlation matrix $\mathbf{R}_{XX}$ is simply computed by averaging the values of 50 snapshots of $\mathbf{X}$. In the first example, we numerically illustrate the proposed algorithm for selecting the parameters of the support vector machine. Consider a single coherently distributed source with angular spread and SNR = 10 dB. In order to cover the whole region of interest, the range of the central DOA in the training sample set is $[-90^\circ, 90^\circ]$. After the training phase, the test phase is performed by considering different values of the central DOAs of the impinging waves. The support vector machine parameters are initialized as $\tilde{C} \in [30, 500]$, $\varepsilon \in [0, 0.02]$ and $\gamma \in [0.01, 2]$. The speed ranges are $[-500, 500]$, $[-0.02, 0.02]$ and $[-2, 2]$, respectively. The fitness function is initialized as 0. Fig. 2 illustrates the iterative process. When the number of iterations is about 35, the fitness function converges. The optimal location of the particle is $(230.4331, 0, 0.5734)$, which gives the support vector regression parameters.
Fig.2 The iterative process
The estimated DOA values are reported in Fig. 3. On the abscissa, the indexes of the samples belonging to the test set are indicated. The corresponding actual and estimated values of the incident angles are reported. As can be seen, the proposed method is able to obtain quite good results for almost all of the considered DOAs.
Fig.3 The actual and estimated values of the central DOA
The root-mean-squared errors (RMSEs) of the central DOA estimated by the proposed method are illustrated at different SNRs in Fig. 4. As can be seen, the proposed algorithm has a better estimation performance at low SNR.
Fig.4 The RMSE of the estimated central DOA versus SNR
The influence of the number of snapshots is investigated in Fig. 5 for SNR = 10 dB. It can be observed that the proposed algorithm presents effective performance even for a small number of snapshots.
Fig.5 The RMSE of the estimated central DOA versus the number of snapshots
VI. CONCLUSION<br />
A method for estimating the central DOA of a coherently distributed source has been proposed. The developed method is based on the use of a support vector regression approach for the approximation of the unknown mapping that transforms the outputs of the smart array into the central DOA of the coherently distributed source, and it can be used when the sample set is small. Furthermore, the reported results show that the method is able to produce accurate estimates even for large angular spreads. The approach is able to reach this goal in a very short time and with good generalization capabilities.
ACKNOWLEDGMENT<br />
This work was supported by the National Natural Science Foundation of China under Grant Nos. 60904035, 61004052 and 61104005, by the Natural Science Foundations of Liaoning Province and Hebei Province under Grant Nos. 201202073 and F2011501052, by the Central University Fundamental Research Foundation under Grant No. N110423004, and by the Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20110042120015.
REFERENCES<br />
[1] D. Asztely and B. Ottersten, “The effects <strong>of</strong> local<br />
scattering on direction <strong>of</strong> arrival estimation with MUSIC<br />
and ESPRIT,” Proceedings <strong>of</strong> ICASSP1998, Seattle,<br />
Washington, pp. 3333-3336, 1998.<br />
[2] A. Hassanien, S. Shahbazpanahi and A. B. Gershman, “A<br />
generalized Capon estimator for localization <strong>of</strong> multiple<br />
spread sources,” IEEE Trans. Signal Processing, vol.52, pp.<br />
280-283, January 2004.<br />
[3] J. Lee, J. Joung and J. D. Kim, “A method for the<br />
direction-<strong>of</strong>-arrival estimation <strong>of</strong> incoherently distributed<br />
sources,” IEEE Trans. Vehicular Technology, vol.57, pp.<br />
2885-2893, May 2008.<br />
[4] M. Souden, S. Affes and J. Benesty, “A two-stage<br />
approach to estimate the angles <strong>of</strong> arrival and the angular<br />
spreads <strong>of</strong> locally scatters sources,” IEEE Trans. Signal<br />
Processing, vol.56, pp. 1968-1983, May 2008.<br />
[5] R. Raich, J. Goldberg and H.Messor, “Bearing estimation<br />
for a distributed source: modeling, inherent accuracy<br />
limitations and algorithm,” IEEE Trans. Signal Processing,<br />
vol.48, pp. 429-441, February 2000.<br />
[6] A. Zoubir, Y. Wang and P. Charge, “Efficient subspacebased<br />
estimator for localization <strong>of</strong> multiple incoherently<br />
distributed source,” IEEE Trans. Signal Processing, vol.56,<br />
pp. 532-542, February 2008.<br />
[7] S.Valaee, B. Champagne and P. Kabal, “Parameter<br />
localization <strong>of</strong> distributed sources,” IEEE Trans. Signal<br />
Processing, vol. 43, pp. 2144-2153, February 1995.<br />
[8] T.Trump and B. Ottersten, “Estimation <strong>of</strong> nominal<br />
direction <strong>of</strong> arrival and angular spread using an array <strong>of</strong><br />
sensors,” Signal Processing, vol.50, pp.57-69, April 1996.<br />
[9] Y.Meng, P.Stoica and K.Wong, “Estimation <strong>of</strong> the<br />
directions <strong>of</strong> arrival <strong>of</strong> spatially dispersed signals in array<br />
processing,” Inst.Elect.Eng.Proc.Radar,Sonar, Navigat.,<br />
vol.143, pp.1-9, February 1996.<br />
[10] S.Shahbazpanahi, S.Valaee and A. B. Gershman, “A<br />
covariance fitting approach to parametric localization <strong>of</strong><br />
multiple incoherently distributed source, ”IEEE Trans.<br />
Signal Processing, vol.52, pp.592-600, March 2004.<br />
[11] M. Bengtsson and B. Ottersten, “Low-complexity<br />
estimators for distributed sources,” IEEE Trans.Signal<br />
Processing, vol.48, pp.2185-2194, August 2000.<br />
[12] P.Stoica and K.C.Sharman, “Novel eigenanalysis method<br />
for direction estimation,” IEE Proc.F Radar, Signal<br />
Processing, vol.137, pp.19-26, February 1990.<br />
[13] R.Roy and T.Kailath, “ESPRIT-estimation <strong>of</strong> signal<br />
parameters via rotational invariance techniques,” IEEE<br />
Trans.Acoust.,Speech,Signal Processing, vol.37, pp.984-<br />
995, July 1989.<br />
[14] S. Shahbazpanahi, S. Valaee and M. H. Bastani,<br />
“Distributed source localization using ESPRIT algorithm,”<br />
IEEE Trans. Signal Processing, vol. 49, pp. 2169-2178,<br />
October 2001.<br />
[15] B. T. Sieskul, “ An asymptotic maximum likelihood for<br />
joint estimation <strong>of</strong> nominal angles and angular spreads <strong>of</strong><br />
multiple spatially distributed sources,” IEEE Trans.<br />
Vehicular Technology, vol.59, pp. 1534-1538, March 2010.
2716 JOURNAL OF SOFTWARE, VOL. 7, NO. 12, DECEMBER 2012<br />
[16] O.Besson, P.Stoica, “Decoupled estimation <strong>of</strong> DOA and<br />
angular spread for a spatially distributed source,” IEEE<br />
Trans. Signal Processing, vol.48, pp. 1872-1882, July<br />
2000.<br />
[17] A.Zoubir, Y.Wang, “Robust generalized Capon algorithm<br />
for estimating the angular parameters <strong>of</strong> multiple<br />
incoherently distributed sources,” IET Signal Process., pp.<br />
163-168, February 2008.<br />
[18] A. Monakov ,O. Besson, “ Direction finding for an<br />
extended target with possibly non-symmetric spatial<br />
spectrum,” IEEE Trans. Signal Processing, vol. 52, pp.<br />
283-287, January 2004.<br />
[19] A.H.EI Zooghby, C.G.Christodoulou and M.<br />
Georgiopoulos, “performance <strong>of</strong> radial-basis function<br />
network for direction arrival estimaion with antenna<br />
arrays,” IEEE Trans. Antennas Propag., vol.45, pp.1611-<br />
1617, November 1997.<br />
[20] A.H.EI Zooghby, C.G.Christodoulou and M.<br />
Georgiopoulos, “A neural network-based smart antenna for<br />
multiple source tracking,” IEEE Trans. Antennas Propag.,<br />
vol.48, pp.768-776, May 2000.<br />
[21] D. Zhu, M. Tao and L. Sun, “Fuzzy support vector<br />
machine control for 6-DOF parallel robot, ” <strong>Journal</strong> <strong>of</strong><br />
Computers, vol.6, pp.1926-1934, June 2011.<br />
[22] V. A. Sotiris, P.W. Tse and M.G. Pecht, “Anomaly<br />
detection through a Bayesian support vector machine,”<br />
IEEE Trans.Reliability, vol.59, pp.277-286, Feb. 2010.<br />
[23] C. Xie, C. Shao and D. Zhao, “Parameters optimization <strong>of</strong><br />
least squares support vector machines and its application,”<br />
<strong>Journal</strong> <strong>of</strong> Computers, vol.6, pp.1935-1941, September<br />
2011.<br />
[24] D. Geebelen, J. Suykens and J.Vandewalle, “Reducing the<br />
number <strong>of</strong> support vectors <strong>of</strong> SVM classifiers using the<br />
smoothed separable case approximation,” IEEE Trans.<br />
Neural Networks and Learning Systems, vol.23, pp.682-<br />
688, April 2012.<br />
[25] V.Vapnik and A.Lerner, “Pattern recognition using<br />
generalized portrait method,” Automation Remote Control,<br />
vol. 24, pp. 774-780, June 1963.<br />
© 2012 ACADEMY PUBLISHER<br />
[26] Y. Liao, S.C. Fang and H.L.W. Nuttle, “A neural network<br />
model with bounded-weights for pattern classification,”<br />
Computers and Operations Research, vol.31, pp.1411-<br />
1426, 2004.<br />
[27] T. Shon, Y.Kim, C. Lee and J.Moon, “A machine learning<br />
framework for network anomaly detection using SVM and<br />
GA,” in Proceedings of IEEE Workshop on Information
Assurance and Security, pp. 176-183, 2005.<br />
[28] P.F. Pai and W.C.Hong, “Support vector machines with<br />
simulated annealing algorithm in electricity load<br />
forecasting,”Energy Conversion and Management, vol.46,<br />
pp.2669-2688, 2005.<br />
[29] J. Kennedy and R. C. Eberhart, "Particle swarm optimization," Proceedings of IEEE International Conference on Neural Networks, vol.4, pp.1942-1948, 1995.
Yinghua Han was born on July 23, 1979 in Jilin Province, China. She received her M.S. and Ph.D. degrees in Navigation, Guidance and Control from Northeastern University, China, in 2005 and 2008, respectively. She is currently an associate professor. Her research interests include array signal processing and mobile communication.
Jinkuan Wang was born on April 4, 1957 in Liaoning Province, China. He received the Ph.D. degree in Electronic Information and Communication Engineering from the University of Electro-Communications, Japan, in 1993. Since 1998, he has been with the College of Information Science and Engineering, Northeastern University, China, where he is now a professor. His current research interests are in wireless communication, multiple antenna array communication systems and adaptive signal processing.
Web Services Evaluation Predication Model<br />
based on Time Utility<br />
Guisheng Yin, Xiaohui Cui, Yuxin Dong, Jianguo Zhang<br />
College of Computer Science and Technology, Harbin Engineering University, Harbin, China
Email: {yinguisheng, cuixiaohui}@hebru.edu.cn<br />
Abstract—Current predication models fail to consider the time utility of the web services evaluation predication and treat the different historical ratings in the same way. To solve this problem, we put forward a web services evaluation predication model based on time utility. In the model, a naive quantification method and a complex quantification method are proposed to obtain distinct and proper time utility for the services evaluation predication procedure. The quantification results are then used to optimize the length of the predication windows. A feedback control strategy is also involved to enhance the robustness of the model when facing malicious ratings. Experimental results show that our model calculates the proper time utility and obtains a lower predication error than current predication models. The feedback control strategy is an effective method to reduce the impact of malicious ratings and guarantees a lower predication error than the model without the feedback control strategy.

Index Terms—web services, time utility, quantification of time utility, predication, feedback control
I. INTRODUCTION<br />
The evaluation predication is the principal application of the web services system. It helps users obtain the available web services relevant to their requests. The traditional predication models, such as Quality of Service (QoS) in Ref. [1,2,3,4,5], the Web services Evaluation System (WES) in Ref. [4], and K Nearest Neighbor (KNN) in Ref. [7,8,9], fail to analyze the time utility of the historical ratings when evaluating web services. In reality, recent ratings are more valuable than ratings from the past. Therefore, it is worthwhile to incorporate the decay process of the time utility into the predication procedures of the web services evaluation. In the past decade, some research related to the time utility has been done in other fields of computer science. In Machine Learning, Koychev held that the time utility of ratings decays gradually with time and that this decay can be represented by a linear function, in Ref. [11]. In Recommendation Systems, Ref. [13] and [14] assumed that a core function would be a suitable description of the decay process. In Concept Drift Systems, several scholars applied the exponential function in Ref. [15] to describe the decay phenomenon.
The above models of the decay process of the time utility are hard to apply directly to the web services evaluation predication system since there are some
problems to be solved. Firstly, current models usually assume that the time utility decays at a static ratio even for different web services. According to the study of Indre Zliobaite, ratings are the belief of the subjects that the evaluated objects will accomplish a task, in Ref. [16]. This belief is distinct for different web services, so it is appropriate to use different decay procedures to describe the time utility of different web services. Secondly, no existing work has mentioned how to incorporate the time utility to optimize the predication procedures of the web services evaluation system. Thirdly, the predication procedures heavily rely on the historical ratings, while the web services evaluation system is vulnerable to attack by malicious ratings.
To solve the above problems, we propose a web services evaluation predication model based on time utility (WSEPM-TU). In this model, a complex quantification method of the time utility is proposed to unfold distinct quantification results for different web services. We then apply the quantification results to optimize the length of the predication windows, so as to enhance the performance of WSEPM-TU. Finally, WSEPM-TU provides a feedback control strategy to reduce the negative impact of malicious ratings.
The remainder of the paper is organized as follows: Section II states the architecture of WSEPM-TU. Section III designs the naive quantification method and the complex quantification method of the time utility. Section IV describes the optimization method for the length of the predication windows. Section V provides the feedback control strategy against malicious ratings. Section VI presents the experiments to analyze the performance of WSEPM-TU. Section VII concludes the paper.
II. ARCHITECTURE OF WSEPM-TU<br />
The web services evaluation predication system (WSEPS) is a prototype system of WSEPM-TU. WSEPS adopts the predication procedures to predicate the web services evaluation based on the quantification of the time utility. The feedback control strategy filters out malicious ratings. Fig. 1 shows the architecture of WSEPS.
The detailed predication procedures of WSEPS are as follows.
(1) Through the user's interface, WSEPS wraps the<br />
predication requests <strong>of</strong> the web services into the request<br />
entities and delivers them to the predication model.
(2) The predication model extracts the identifiers <strong>of</strong> the<br />
web services in the request entities and delivers them to<br />
the time utility model.<br />
(3) The time utility model searches the historical ratings table, calculates the current time utility of the different web services, and then returns the quantification results to the predication model.
(4) The predication model applies the quantification<br />
results to optimize the length <strong>of</strong> the predication windows<br />
and estimates the evaluation predication <strong>of</strong> the web<br />
services. Then it returns evaluation results to the user's<br />
interface.<br />
The detailed feedback procedures of WSEPS are as follows.
(1) Through the user's interface, WSEPS wraps the<br />
user's feedback ratings <strong>of</strong> the web services into the<br />
feedback entities and delivers them to the feedback<br />
control model.<br />
(2) The feedback control model resolves the service identifiers and the users' ratings, and investigates whether the ratings are malicious. If a rating is malicious, the feedback control model adjusts it and stores the adjusted rating in the historical ratings table. If a rating is valid, the feedback control model stores it directly in the historical ratings table.
In the procedures of predication and feedback, the time utility model, the predication model, the feedback control model and the adjustment of the decay ratio model are the main focus of this paper.

Figure 1. Architecture of WSEPS
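To make the interaction of these components concrete, the following minimal Python sketch (our own illustration of the description above, not the authors' implementation; all class and method names are hypothetical) wires a time utility model, a predication model and a feedback control model around a shared historical ratings table.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Rating:
    service_id: str
    value: float
    timestamp: float

@dataclass
class HistoricalRatingsTable:
    rows: Dict[str, List[Rating]] = field(default_factory=dict)

    def add(self, r: Rating) -> None:
        self.rows.setdefault(r.service_id, []).append(r)

    def of_service(self, service_id: str) -> List[Rating]:
        return self.rows.get(service_id, [])

class WSEPS:
    """Hypothetical orchestration of the predication and feedback procedures."""
    def __init__(self, time_utility_model, predication_model, feedback_model):
        self.table = HistoricalRatingsTable()
        self.tu, self.pm, self.fc = time_utility_model, predication_model, feedback_model

    def predicate(self, service_id: str, now: float) -> float:
        # Predication steps (1)-(4): request -> time utility -> optimized window -> evaluation.
        utility = self.tu.quantify(service_id, self.table, now)
        return self.pm.predicate(service_id, utility, self.table)

    def feedback(self, rating: Rating) -> None:
        # Feedback steps (1)-(2): check the rating, adjust it if malicious, then store it.
        self.fc.handle(rating, self.table)
```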
III. QUANTIFICATION METHOD OF THE TIME UTILITY<br />
System Dynamics is a common way to analyze complex sequential systems. It adopts quantitative and qualitative methods to confirm the cause and effect among different system factors and to construct the dynamic system equations. The general steps of System Dynamics are: constructing the cause-and-effect relationship diagram among the different system factors; transforming the cause-and-effect relationship diagram into the system flow diagram; analyzing the characteristics of the variables in the flow diagram to obtain the difference equations; transforming the difference equations into differential equations; and solving the differential equations to obtain the primitive functions. The decay procedure of the time utility can be regarded as a whole system affected by several system factors. To obtain a proper quantification method of the time utility, we use System Dynamics.
A. Naive Quantification Method<br />
The naive quantification system <strong>of</strong> the time utility<br />
assumes that all the decay procedures <strong>of</strong> the time utility<br />
are identical for the web services. In the naive<br />
quantification system, the system factors include<br />
time_utility, decay_speed and decay_ratio. According to<br />
the natural decay characteristics <strong>of</strong> the time utility,<br />
increment <strong>of</strong> time_utility leads to the growth <strong>of</strong><br />
decay_speed <strong>of</strong> the time utility per unit time, and it means<br />
the causal relationship between time_utility and<br />
decay_speed is positive. In turn, the increment <strong>of</strong><br />
decay_speed leads to the decline <strong>of</strong> time_utility, and it<br />
means the causal relationship between decay_speed and<br />
time_utility is negative. Decay_ratio is a constant in this<br />
system. The cause and effect diagram <strong>of</strong> the naive<br />
quantification system is shown in Fig. 2.<br />
Figure 2. Cause and effect diagram <strong>of</strong> the naive quantification system<br />
There is a first order negative causal loop in Fig. 2. We<br />
assume that the direction into time_utility is positive and<br />
it can be inferred that decay_ratio is less than 0.<br />
According to System Dynamics, time_utility is a level<br />
variable, decay_speed is a rate variable and decay_ratio<br />
is an auxiliary variable. Fig. 3 shows the system flow<br />
diagram <strong>of</strong> the naive quantification system.<br />
Figure 3. System flow diagram <strong>of</strong> the naive quantification system<br />
Assuming that J, K and L are sequential time points and that DT is the interval between sequential time points, the dynamic equations of the naive quantification system are shown as (1) and (2).
time_utility.K=time_utility.J–decay_speed.JK*DT. (1)
decay_speed.KL = time_utility.K * decay_ratio. (2)
The equivalent differential equation <strong>of</strong> (1) and (2) is<br />
(3).<br />
dtime_utility/dt = time_utility * decay_ratio. (3)<br />
The primitive function to describe the dynamic<br />
characteristics <strong>of</strong> the naive quantification system is (4).<br />
time_utility = time_utility_0 * e^(decay_ratio * t). (4)
In this paper, time_utility_0 = 1, since the time utility always decays from 1. To simplify the expression of the different system factors, we set X(t) = time_utility and decay_ratio = K. The naive quantification method of the time utility is shown as (5).

X(t) = e^(Kt). (5)
The graph of (5) is a curve converging to 0, which meets the hypothesis of the naive quantification system. In the web services evaluation system, however, it is unreasonable to use the same decay procedure to describe the time utility of all web services. A proper way is to extend the naive quantification method to a complex quantification method by assigning different decay ratios to different web services, so as to provide distinct quantification results.
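As a check on this derivation, the following Python sketch (an illustration only, not part of the original paper) iterates the naive system with a small step DT, in the spirit of the level/rate structure of Fig. 3, and compares the result with the primitive function (5); the agreement improves as DT shrinks.

```python
import math

def simulate_naive(decay_ratio: float, t_end: float, dt: float) -> float:
    """Euler iteration of dX/dt = decay_ratio * X with X(0) = 1, cf. (3)."""
    x = 1.0                                  # time_utility starts from 1
    for _ in range(int(t_end / dt)):
        decay_speed = x * decay_ratio        # rate variable, cf. (2)
        x = x + decay_speed * dt             # level update over one step DT
    return x

K, t_end = -0.5, 4.0                         # decay_ratio < 0, as inferred from Fig. 2
print("closed form (5):", math.exp(K * t_end))
for dt in (0.1, 0.01, 0.001):
    print("DT =", dt, "->", simulate_naive(K, t_end, dt))
```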
B. Complex Quantification Method<br />
The complex quantification method of the time utility uses the frequency of users' ratings to adjust the decay ratio, so as to unfold distinct quantification results for different web services.
When adjusting the decay ratio, we draw on the psychological phenomenon of memory enhancement. According to the experiments, relearning enhances the belief and starts a new naive decay procedure of the time utility with a lower decay ratio than the prior decay procedure. If the time utility of a web service is the object to remember, the whole decay procedure of the time utility is the accumulation of sequential naive decay procedures. Fig. 4 indicates a general decay procedure of the time utility for a web service.
Figure 4. General decay procedure <strong>of</strong> the time utility<br />
In order to calculate the decay ratio, we select two sequential decay procedures. Assume the last time point at which the decay ratio was adjusted is t_m; when WSEPS receives the user's feedback rating at time point t_n, the decay ratio changes from K_m to K_n. We make the curves of K_m and K_n share the same starting point to obtain the numeric relationship between K_m and K_n, as shown in Fig. 5.
Figure 5. Neighbor naive procedures <strong>of</strong> the time utility<br />
In Fig. 5, X_n(t_n) and X_m(t_n) represent the time utility curves. At the time point t_n, {X_n(t_n) − X_m(t_n)} is the adjustment degree due to the user's feedback, and {1 − X_m(t_n)} is the upper bound of the adjustment degree. If δ is the adjustment percentage, the relationship between {X_n(t_n) − X_m(t_n)} and {1 − X_m(t_n)} can be depicted by (6).

[1 − X_m(t_n)] / [X_n(t_n) − X_m(t_n)] = δ. (6)

The larger δ is, the smaller the effect of the user's rating. Generally, δ is an integer greater than 1. Substituting X(t) from (5) into (6), we can calculate the adjusted decay ratio by (7).
K_n = [ln(1 + (δ − 1) e^(−K_m(t_n − t_m))) − ln δ] / (t_m − t_n). (7)
By (7), if we know the initial decay ratio K_0, the decay ratio of an arbitrary procedure can be obtained. For a specified web service, assuming the sequence of time points at which the decay ratio is adjusted is t = {t_0, t_1, ...} and K_m indicates the decay ratio between the neighboring time points t_m and t_n, we can use (8) to calculate the complex quantification results of the time utility.

X_m(t) = e^(−K_m(t − t_m)), t ∈ [t_m, t_n]. (8)
In WSEPS, the time utility model utilizes (8) to<br />
calculate the time utility. Meanwhile, the adjustment <strong>of</strong><br />
the decay ratio model utilizes (7) to update the records <strong>of</strong><br />
the decay ratio in the time utility table.<br />
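A minimal Python sketch of the complex quantification method is given below (our illustration under the notation above, not the authors' code): whenever a feedback rating arrives at time t_n, (7) re-estimates the decay ratio from the previous ratio K_m, and between adjustments the time utility is evaluated with (8).

```python
import math

def adjust_decay_ratio(k_m: float, t_m: float, t_n: float, delta: float) -> float:
    """Update of the decay ratio at feedback time t_n, cf. (7)."""
    num = math.log(1.0 + (delta - 1.0) * math.exp(-k_m * (t_n - t_m))) - math.log(delta)
    return num / (t_m - t_n)

def time_utility(t: float, k_m: float, t_m: float) -> float:
    """Complex quantification of the time utility on [t_m, t_n], cf. (8)."""
    return math.exp(-k_m * (t - t_m))

# Example: initial ratio K0 = 1.0 (Table II), delta = 50, feedback at t = 2 and t = 5.
k, t_start, delta = 1.0, 0.0, 50
for t_fb in (2.0, 5.0):
    print("utility just before feedback:", time_utility(t_fb, k, t_start))
    k = adjust_decay_ratio(k, t_start, t_fb, delta)   # a smaller ratio -> slower decay
    t_start = t_fb
print("decay ratio after both adjustments:", k)
```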
IV. PREDICATION OF THE WEB SERVICES EVALUATION<br />
In current predication models, KNN is the most common method to fuse the ratings. However, the length of the predication windows in KNN has to be predefined manually, and an unreasonable window length degrades the performance of the predication process. In WSEPM-TU, we make use of the quantification results of the complex quantification method to optimize the length of the predication windows (abbr. pre_win).
In the predication system, the system factors include the decay_speed and the maximum length of the predication windows (abbr. max_win).
Using the time point t as the intermediate variable, we analyze the cause and effect relationship between the decay_speed and pre_win as follows.
(1) The causal relationship between Δt and pre_win. We assume Δt = t − t_m, where t_m and t indicate the starting point of the current decay procedure and the current time point respectively. By the decreasing property of (8), the larger Δt is, the lower X(t) is; to guarantee the reliability of the predication we then need more historical ratings, and vice versa. Therefore, pre_win ∝ Δt.
(2) The causal relationship between X'(t) and pre_win. By the graph of X'(t), the larger Δt is, the larger X'(t) is. Therefore, Δt ∝ X'(t).
Since ∝ is transitive, from pre_win ∝ Δt and Δt ∝ X'(t) we obtain pre_win ∝ X'(t) and X'(t) ∝ pre_win, so pre_win and X'(t) form a positive causal relation. Fig. 6 shows the cause and effect diagram of the predication system.
Setting the direction into the predication window as positive, the decay ratio is greater than 0.
Figure 6. Cause and effect diagram <strong>of</strong> the predication system<br />
According to the characteristics <strong>of</strong> the system factors,<br />
max_win indicates the upper bound <strong>of</strong> the predication<br />
windows, which is a constant. var_win is an auxiliary<br />
variable. Fig.7 shows the system flow diagram<br />
corresponding to Fig.6.<br />
Figure 7. System flow diagram <strong>of</strong> the predication system<br />
Assuming that J, K and L are sequential time points and that DT is the interval between sequential time points, the dynamic equations of the predication system are shown by (9), (10) and (11).

pre_win.K = pre_win.J + decay_speed.JK * DT. (9)
var_win.K = max_win − pre_win.K. (10)
decay_speed.KL = var_win.K * decay_ratio. (11)

Assuming pre_win |t=0 = 0, we solve (9)-(11) and obtain the equivalent function shown by (12).

pre_win = max_win * (1 − e^(−decay_ratio * t)). (12)
To unify the notation in (12), let n := pre_win and N := max_win. The length of the predication window is then described by (13).

n = ⌈N(1 − X_m(t))⌉, t ∈ [t_m, t_n]. (13)
In WSEPM-TU, the predication model retrieves the corresponding amount of historical ratings in accordance with (13), as sketched below.
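The sketch below is a minimal illustration of how (13) can drive the predication step; the fusion rule over the selected ratings is assumed here to be a simple mean of the most recent n ratings, which is our own simplification rather than the paper's exact rule.

```python
import math
from typing import List

def window_length(x_m_t: float, max_win: int) -> int:
    """Predication window length, cf. (13): n = ceil(N * (1 - X_m(t)))."""
    n = math.ceil(max_win * (1.0 - x_m_t))
    return max(1, min(max_win, n))

def predicate(history: List[float], x_m_t: float, max_win: int = 10) -> float:
    """Fuse the most recent n historical ratings (assumed simple mean)."""
    recent = history[-window_length(x_m_t, max_win):]
    return sum(recent) / len(recent)

# Low current time utility -> a wider window reaching back over older ratings.
print(predicate([4, 4, 5, 3, 4, 2, 5, 4, 4, 3], x_m_t=0.2, max_win=10))
```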
Assuming B={b1,b2,…,bn}(n
ratings range from m to n. The relative confidence interval is described by (16), in which a and b are the lower and the upper bound of the confidence interval of the whole ratings set.

[min{⌊a − 0.1 × (m − n + 1)⌋, m}, min{⌈b + 0.1 × (m − n + 1)⌉, n}]. (16)
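Because part of the derivation of a and b lies in the preceding (partially omitted) text, the following sketch is only a rough illustration of interval-based feedback control: it assumes, hypothetically, that [a, b] is a central quantile interval of the historical ratings, widens it by 10% of the rating-scale width in the spirit of (16), and adjusts ratings that fall outside the widened interval before storing them.

```python
from typing import List, Tuple

def confidence_interval(history: List[float]) -> Tuple[float, float]:
    # Hypothetical choice: central 80% quantile interval of the historical ratings.
    s = sorted(history)
    return s[int(0.1 * (len(s) - 1))], s[int(0.9 * (len(s) - 1))]

def widened_interval(a: float, b: float, lo: float, hi: float) -> Tuple[float, float]:
    # Simplified reading of (16): widen [a, b] by 10% of the rating-scale width,
    # without leaving the rating scale [lo, hi] itself.
    pad = 0.1 * (hi - lo + 1)
    return max(lo, a - pad), min(hi, b + pad)

def handle_feedback(rating: float, history: List[float],
                    scale: Tuple[float, float] = (1, 5)) -> float:
    a, b = confidence_interval(history)
    low, high = widened_interval(a, b, *scale)
    if rating < low or rating > high:            # judged malicious
        rating = min(max(rating, low), high)     # adjust before storage
    history.append(rating)
    return rating

hist = [4, 4, 5, 3, 4, 4, 5, 3, 4, 4]
print(handle_feedback(1, hist))                  # an outlier rating is pulled back
```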
VI. EXPERIMENTS AND ANALYSIS<br />
A. Data Preparation<br />
To evaluate the performance of our model, we use the ratings of applications in the App Store [18]. The App Store is the most mature market and rating platform for software, and the requirement to bind a credit card guarantees the reliability of its ratings.
Using the Search API, we capture 100,000 application ratings in JSON form. After preprocessing the raw ratings, we choose 5600 sequential ratings each of applications 105, 661 and 1084 as the data sets (named D1, D2 and D3 respectively) for the following experiments. The statistics of these 3 data sets are shown in Table I, where Me, Var, Avg and IQR denote median, variance, average and inter-quartile range respectively.
TABLE I.<br />
STATISTICS OF SIMULATION DATASETS<br />
Num Me Var Avg Min Max IQR<br />
D1 5600 3.00 1.249 3.38 1 5 1<br />
D2 5600 3.00 0.932 3.41 1 5 1<br />
D3 5600 4.00 0.769 3.96 1 5 2<br />
B. Evaluation Metrics and Simulation Parameters<br />
We use the Mean Absolute Error (MAE) in [19] to measure the performance of the various predication models, as defined by (17). In (17), U is the length of the data segment, b(n) is the predicated value and b^p(n) is the standard value. The lower the MAE is, the better the predication model performs. Table II presents the simulation parameters of WSEPS.

MAE = (1/U) * Σ_{n=1}^{U} |b(n) − b^p(n)|. (17)
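For reference, (17) can be computed with a few lines of Python (a straightforward illustration, not tied to any particular model):

```python
def mae(predicted, standard):
    """Mean Absolute Error over a data segment of length U, cf. (17)."""
    assert len(predicted) == len(standard)
    u = len(predicted)
    return sum(abs(p - s) for p, s in zip(predicted, standard)) / u

print(mae([3.2, 4.1, 2.8], [3, 5, 3]))   # -> 0.5
```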
TABLE II.<br />
PARAMETERS OF WSEPS<br />
Name Meaning Value<br />
|Y| Size of the initial feedback sample 30
K0 Initial decay ratio 1.0<br />
N max_win 10<br />
C. Experimental Procedures<br />
We adopt incremental learning for the experiments. The procedures are as follows (a sketch of this loop is given after the list).
(1) Select the top-|Y| ratings from the data sets by the<br />
ascending order <strong>of</strong> the ratings' time stamp and put the<br />
selected ratings into the historical ratings table as the<br />
initial records.<br />
(2) For each rating left in the data sets, we treat it as a predication request of the evaluation and carry out the different predication models to obtain the predicated evaluation.
(3) Measure the variance by (18) and insert the feedback ratings into the historical ratings table. Return to step 2 until the last record in the data sets is processed.
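The loop can be summarized by the following hedged Python sketch (our paraphrase of steps (1)-(3); predict_one stands in for whichever predication model is under test and is a hypothetical placeholder):

```python
def predict_one(history):
    # Hypothetical stand-in for the model under test: mean of the last 10 ratings.
    recent = history[-10:]
    return sum(recent) / len(recent)

def run_incremental(dataset, y_size=30, segment=560):
    """dataset: ratings sorted by time stamp; returns one MAE value per segment."""
    history = list(dataset[:y_size])              # step (1): initial |Y| records
    errors, segment_maes = [], []
    for rating in dataset[y_size:]:               # step (2): remaining ratings as requests
        predicted = predict_one(history)
        errors.append(abs(predicted - rating))    # error against the actual rating
        history.append(rating)                    # step (3): feedback into the table
        if len(errors) == segment:
            segment_maes.append(sum(errors) / len(errors))
            errors = []
    return segment_maes
```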
D. Parameters Analysis<br />
This experiment examines the impact of various settings of δ. When δ is equal to 10, 50, 100 and 300, Fig. 8, 9 and 10 show the MAE analysis of WSEPM-TU on D1-D3, with U = 560.
Figure 8. MAE analysis <strong>of</strong> various δ settings on D1<br />
Figure 9. MAE analysis <strong>of</strong> various δ settings on D2<br />
Figure 10. MAE analysis <strong>of</strong> various δ settings on D3<br />
In Fig. 8, the MAE values fluctuate slightly in a small range. For instance, when δ = 300, the maximum of the MAE values emerges at the predication periods from 3921 to 4480 and the minimum of the MAE values arises in the predication periods from 5041 to 5600. In the other predication periods, the MAE values remain at 0.99 ± 0.1. Meanwhile, the average of the MAE values keeps at 0.88 ± 0.1 and 0.78 ± 0.1 on D2 and D3 respectively, as shown in Fig. 9 and Fig. 10.
According to the experimental results, the trends of the fluctuations are similar under the various settings of δ and the range of fluctuation is [−0.1, +0.1]. In conclusion, the performance of WSEPM-TU is not sensitive to the setting of δ. In the following experiments, we set δ = 50.
E. Performance <strong>of</strong> Various Predication Models<br />
Using the quantification methods of the time utility presented in [15] and [17], we construct a web services evaluation predication model based on the exponential function (WSEPM-E) and analyze the performance of WSEPM-E and WSEPM-TU on the data sets. WSEPM-E uses static quantification procedures and fails to consider the distinctness and dynamics of the time utility.
Fig. 11, Fig. 12 and Fig. 13 indicate the MAE values of WSEPM-TU and WSEPM-E with different static decay ratios (WSEPM-E(0.5), WSEPM-E(1) and WSEPM-E(2)) on the data sets, with U = 30.
Figure 11. MAE Analysis <strong>of</strong> WSEMP-TU and WSEPM-E on D1<br />
Figure 12. MAE Analysis <strong>of</strong> WSEMP-TU and WSEPM-E on D2<br />
In Fig. 11, the MAE values of WSEPM-TU remain at 0.99 ± 0.1 across all the predication periods. For WSEPM-E, on the one hand, its MAE values fluctuate with the predication periods and show a rising trend. For instance, the MAE value of WSEPM-E(0.5) is 0.99002 at the predication periods from 1 to 560 and increases to 2.28268 at the predication periods from 5046 to 5600. On the other hand, owing to the lack of distinct quantification procedures for the various web services, WSEPM-E(0.5) uses static decay ratios to simulate the decay procedures of the time utility, and therefore performs poorly. The analysis of the MAE on D2 and D3 is similar, as shown in Fig. 12 and Fig. 13.
Figure 13. MAE Analysis <strong>of</strong> WSEMP-TU and WSEPM-E on D3<br />
Figure 14. MAE analysis <strong>of</strong> WSEPM-TU and WSEPM-KNN on D1<br />
Figure 15. MAE analysis <strong>of</strong> WSEPM-TU and WSEPM-KNN on D2<br />
Figure 16. MAE analysis <strong>of</strong> WSEPM-TU and WSEPM-KNN on D3<br />
According to the experimental results, WSEPM-E fluctuates in a large range and performs more poorly than WSEPM-TU. In conclusion, WSEPM-TU better simulates the time utility and performs better than the models with static quantification methods.
WSEPM-KNN [7-9] uses a static length of the predication windows to predicate the web services evaluation, while WSEPM-TU uses a dynamic window length. This experiment compares the performance of WSEPM-KNN and WSEPM-TU.
Fig. 14, Fig. 15 and Fig. 16 indicate the MAE values of WSEPM-KNN (n = 10) and WSEPM-TU on the data sets, with U = 560.
In Fig. 14, the MAE values of WSEPM-TU and WSEPM-KNN both fluctuate with the predication periods. For WSEPM-TU, the minimum and the maximum of the MAE values are 0.90536 and 1.11518, which appear at the predication periods from 1168 to 2240 and from 2800 to 3360 respectively. For WSEPM-KNN, the minimum and the maximum of the MAE values are 0.92679 and 1.9929, which appear at the predication periods from 1680 to 2240 and from 2920 to 4480 respectively.
Only at the predication periods from 2800 to 3360 are the MAE values of WSEPM-TU 4.61% higher than those of WSEPM-KNN. At the other predication periods, WSEPM-TU obtains lower MAE values than WSEPM-KNN. Results are similar on D2 and D3, as shown in Fig. 15 and Fig. 16.
According to the experimental results, WSEPM-KNN cannot properly meet the users' expectations and consumes more computational resources than WSEPM-TU. In conclusion, WSEPM-TU utilizes the quantification results of the time utility to effectively optimize the length of the predication windows and shows better performance on the various data sets.
F. Feedback Control Strategy Analysis<br />
This experiment analyzes the performance after introducing the feedback control strategy against malicious ratings. As a comparison, we remove the feedback control strategy from WSEPM-TU and allow the malicious ratings to be stored directly in the historical ratings table.
To show how the rating distribution changes under the feedback control strategy, we use box-plots to describe the statistics from D1 to D3. In the box-plots, the middle line of each box is the median, the upper and lower bound lines are the maximum and the minimum, and the isolated points are the malicious data. In our experiments, each box covers 560 ratings.
Fig. 17, Fig. 18, Fig. 20, Fig. 21, Fig. 23 and Fig. 24 show the change in the distribution of the historical ratings after running WSEPM-TU with and without the feedback control strategy. Fig. 19, Fig. 22 and Fig. 25 indicate the MAE values of WSEPM-TU with and without the feedback control strategy.
In Fig. 17, the overall ratings without the feedback control strategy are distributed from 3 to 4. There are 6 boxes with isolated points among all the predication periods. In Fig. 18, there are only 3 boxes with isolated points, which shows that the feedback control strategy effectively filters out the malicious ratings. Meanwhile, the feedback control strategy has no impact on the valid ratings, since the box shapes in Fig. 17 and Fig. 18 are similar. In Fig. 19, WSEPM-TU with the feedback control strategy performs better than the model without it.
For the experimental results on D2, the feedback control strategy likewise filters out the malicious ratings and provides reliable historical ratings.
For the experimental results on D3, the data set possesses more malicious ratings, as shown in Fig. 23. After the feedback control, the number of malicious ratings is reduced, as shown in Fig. 24.
Figure 17. Data distribution on D1 without the feedback control<br />
Figure 18. Data distribution on D1 with the feedback control<br />
Figure 19. MAE analysis on D1 with various feedback control strategies<br />
Figure 20. Data distribution on D2 without the feedback control
Figure 21. Data distribution on D2 with the feedback control<br />
Figure 22. MAE analysis on D2 with various feedback control strategies<br />
Figure 23. Data distribution on D3 without the feedback control<br />
Figure 24. Data distribution on D3 with the feedback control<br />
Figure 25. MAE analysis on D3 with various feedback control strategies<br />
In Fig. 18, Fig. 21 and Fig. 24, the initial isolated points at the predication periods from 1 to 560 originate from the initial |Y| ratings. In the experiments, we store the initial |Y| ratings directly in the historical ratings table; if there are isolated points among these |Y| ratings, the malicious ratings are drawn in the first box-plot. In fact, these malicious ratings only affect the following |Y| predication periods, so over the whole predication periods the malicious ratings in the first box-plot have little impact on the predication performance of WSEPM-TU.
According to the experimental results, WSEPM-TU with the feedback control strategy filters out the malicious ratings and outperforms the model without the feedback control strategy.
VII. CONCLUSION<br />
This paper proposes a web services evaluation predication model based on time utility. The model uses the complex quantification method to obtain distinct quantification results for different web services. The quantification results are then used to optimize the length of the predication windows. A feedback control strategy is also involved in WSEPM-TU to filter out the malicious ratings. According to the experimental results, WSEPM-TU with the feedback control strategy filters out the malicious ratings and outperforms the other predication models.
WSEPM-TU adopts the memory enhancement phenomenon to obtain the distinct quantification results. In the field of psychology, there are other, more controversial models that could be used to obtain distinct quantification results. In the future, we will compare these models with the complex quantification model on practical rating datasets to obtain a more suitable quantification model for the web services predication process.
ACKNOWLEDGMENT<br />
This work is sponsored by the National Natural<br />
Science Foundation <strong>of</strong> China under Grant No. 60973075,<br />
the Provincial Natural Science Foundation under Grant<br />
No.F200937 and No. F201110, the Foundation <strong>of</strong> Harbin<br />
Science and Technology Bureau under Grant No.<br />
RC2009XK010003.<br />
REFERENCES<br />
[1] WU Minghui, XIONG Xianghui, YING Jing, et al, "QoS-driven global optimization approach for large-scale web services composition," Journal of Computers, vol. 6, no. 7, pp. 1452-1460, July 2011. doi: 10.4304/jcp.6.7.1452-1460
[2] GUO Guangjun, "A Method for Semantic Web Service Selection Based on QoS Ontology," Journal of Computers, vol. 6, no. 2, pp. 377-386, February 2011. doi: 10.4304/jcp.6.2.377-386
[3] SHAO Ling-Shuang, ZHOU Li, ZHAO Jun-Feng, XIE Bing, MEI Hong, "An extensible management framework for web service QoS," Journal of Software, vol. 20, no. 8, pp. 2062-2073, April 2009. (in Chinese)
[4] M. Serhani, A. Benharref, "Enforcing Quality of Service within Web Services Communities," Journal of Software, vol. 6, no. 4, pp. 554-563, April 2011. doi: 10.4304/jsw.6.4.554-563
[5] E. Badidi, L. Esmahi, "A Scalable Framework for Policy-based QoS Management in SOA Environments," Journal of Software, vol. 6, no. 4, pp. 544-553, April 2011. doi: 10.4304/jsw.6.4.544-553
[6] YANG Wenjun, LI Juan-Zi, WANG Ke-Hong, "Domain-adaptive service evaluation model," Chinese Journal of Computers, vol. 28, no. 4, pp. 514-523, August 2005. (in Chinese)
[7] QIAO Baiyou, DING Linlin, WEI Yong, WANG Xiaoyang, "A KNN query processing algorithm over high-dimensional data objects in P2P systems," in Proc. of the 2011 2nd International Congress on Computer Applications and Computational Science, Heidelberg, Berlin, Springer Press, vol. 144, pp. 133-139, 2012. doi: 10.1007/978-3-642-28314-7_19
[8] Wikipedia, "k-nearest neighbor algorithm," 2012-3-18. Available from: http://en.wikipedia.org/wiki/Knearest_neighbor_algorithm.
[9] YE Tao, ZHU Xue-Feng, LI Xiang-Yang, SHI Bu-Hai, "Soft sensor modeling based on a modified k-nearest neighbor regression algorithm," ACTA AUTOMATICA SINICA, vol. 33, no. 9, pp. 996-999, September 2007. (in Chinese)
[10] LI Xiaoyong, GUI Xiaolin, "Cognitive model of dynamic trust forecasting," Journal of Software, vol. 21, no. 1, pp. 163-176, January 2008. (in Chinese)
[11] I. Koychev, I. Schwab, "Adaptation to Drifting User's Interests," in Proc. of ECML 2000 Workshop: Machine Learning in New Information Age, Barcelona, Spain, IEEE Press, pp. 39-46, 2000.
[12] J. Krücker, XU Sheng, V. Aradhana, K. Locklin, A. Hayet, G. Neil, et al, "Clinical utility of real-time fusion guidance for biopsy and ablation," Journal of Vascular and Interventional Radiology, vol. 22, no. 4, pp. 515-524, April 2011. doi: 10.1016/j.jvir.2010.10.033
[13] M. Kukar, S. Ljubljana, "Drifting concepts as hidden factors in clinical studies," in Proc. of the 9th Conference on Artificial Intelligence in Medicine in Europe, AIME 2003, Protaras, Cyprus, IEEE Press, pp. 355-364, October 2003.
[14] SUN Chao, JIANG Bo, "The research on model of network teaching system based on agent and recommendation technology," Journal of Zhengzhou University, vol. 40, no. 3, pp. 84-87, September 2009. (in Chinese)
[15] YIN Guisheng, WANG Shuyin, "Dynamic trust model of internet-ware based on task classification," in Proc. of the 2nd International Conference on Mechanical Engineering and Green Manufacturing, MEGM 2012, Chongqing, China, IEEE Press, March 2012. doi: 10.4028/www.scientific.net/AMM.155-156.221
[16] I. Zliobaite, "Learning under Concept Drift: an Overview," Technical Report, Faculty of Mathematics and Informatics, Vilnius University, Vilnius, Lithuania, 2010.
[17] H. Ebbinghaus, "Memory: A Contribution to Experimental Psychology," New York: Teachers College, Columbia University, 1885. http://psy.ed.asu.edu/~classics/Ebbinghaus/index.htm
[18] Apple. 2011-05-12. Available from http://www.apple.com.cn/mac/app-store.
[19] ZHANG Jinbo, LIN Zhi-qing, XIAO Bo, ZHANG Chuang, "An optimized item-based collaborative filtering recommendation algorithm," in Proc. of Network Infrastructure and Digital Content, IC-NIDC 2009, Beijing, China, IEEE Press, pp. 414-418, November 2009. doi: 10.1109/ICNIDC.2009.5360986
[20] J. Haddad, M. Maude, R. Guillermo, R. Marta, "QoS-driven selection of web services for transactional composition," in Proc. of the IEEE International Conference on Web Services, ICWS 2008, Beijing, China, IEEE Press, pp. 653-660, September 2008. doi: 10.1109/ICWS.2008.116
[21] KIM Youngae, PHALAK Rasik, "A trust prediction framework in rating-based experience sharing social networks without a web of trust," Information Sciences, vol. 191, no. 15, pp. 128-145, January 2012. doi: 10.1016/j.ins.2011.12.021
[22] ZHAO Laiping, REN Yizhi, LI Mingchu, S. Kouichi, "Flexible service selection with user-specific QoS support in service-oriented architecture," Journal of Network and Computer Applications, vol. 35, no. 3, pp. 962-973, May 2012. doi: 10.1016/j.jnca.2011.03.013
Guisheng Yin was born in China. He received the PhD degree in automatic control from Harbin Engineering University, where he is a Full Professor and Doctoral Advisor, the Acting Dean of the College of Computer Science and Technology and the Dean of the School of Software Engineering. He worked at Tokyo University before joining his current university. His research interests include databases and web services discovery.
Xiaohui Cui was born in China. He received his bachelor's degree from Harbin Engineering University in 2007. He is now a PhD candidate in computer application in the College of Computer Science and Technology of Harbin Engineering University. His research interests include web services discovery and web services evaluation.
Yuxin Dong was born in China. She received the PhD degree in Computer Application from Harbin Engineering University, where she is an associate professor. Her research interests include Internetware and trust evaluation.
Jianguo Zhang was born in China. His research interests<br />
include trust evaluation.
Speech Emotion Recognition based on Optimized<br />
Support Vector Machine<br />
Bo Yu 1,2
1 School of Computer Science and Technology/Harbin Institute of Technology, Harbin, China
2 Software College/Harbin University of Science and Technology, Harbin, China
Email: hrbust_yubo1981@163.com<br />
Haifeng Li and Chunying Fang<br />
School of Computer Science and Technology/Harbin Institute of Technology, Harbin, China
Email: lihaifeng@hit.edu.cn, fcy3333@163.com<br />
Abstract—Speech emotion recognition is a very important speech technology. In this paper, Mel Frequency Cepstral Coefficients (MFCC) are used to represent the speech signal as emotional features: the MFCCs plus the energy of an utterance are used as the input for a Support Vector Machine. The Support Vector Machine (SVM) has been profoundly successful in the area of pattern recognition, and in recent years SVM has also been used for speech recognition. Many kinds of kernel functions are available for SVM to map an input space problem to high-dimensional spaces, but we lack guidelines on choosing a better kernel with optimized SVM parameters. Some kernels are better for some problems but worse for others, and which is better for speech emotion recognition is unknown; thus this paper studies the SVM classifier and proposes methods to select a better kernel with optimized parameters. The new method proposed in this paper obtains optimized parameters more efficiently than common methods. In order to improve the recognition accuracy rate of the speech emotion recognition system, a speech emotion recognition method based on an optimized support vector machine is proposed. Experimental studies are performed on the HIT Emotional Speech Database established by the Speech Processing Lab in the School of Computer Science and Technology at HIT. The experimental results show that speech emotion recognition based on the optimized SVM can effectively improve the performance of the emotion recognition system.
Index Terms—speech emotion recognition, MFCC, optimized SVM, kernel function
I. INTRODUCTION<br />
Recognizing the emotional state of a person from the speech signal has become increasingly important, especially in natural human-computer interaction [1]. Speech emotion recognition can be used in a wide range of applications, such as remote call customer service centers [2-5], speech emotion network communication systems [6], improving the robustness of speech recognition [7], detecting emotional states in voice mail messages [8], speech emotion recognition systems in web learning [9], interactive movies [10], monitoring a driver's emotion to ensure safe driving [11], and so on. Therefore speech emotion recognition has important research value and great potential for development.
Global statistical prosodic and voice quality features have been broadly used in speech emotion recognition and have gained great success. Besides the global statistical prosodic and voice quality features, spectral features such as Mel Frequency Cepstral Coefficients (MFCC) are also useful for describing the speech emotion signal. MFCCs of the speech signal have already been successfully used as features for emotion recognition. Therefore, we use MFCCs plus energy, with their delta and acceleration coefficients, as speech emotion features.
Hidden Markov models (HMM) and Gaussian mixture models (GMM) using MFCC have achieved valuable results on speech emotion recognition [12]. However, there is a problem when using GMM to recognize speech emotion states: effective training of a GMM requires a great deal of data, while collecting emotional speech utterances is costly, so the available training data are usually scant.
SVM has better classification performance on a small number of training samples. However, we lack guidelines on choosing a better kernel with optimized SVM parameters. Some kernels are better for some problems but worse for others, and there is no uniform rule for choosing the SVM with its parameters and the kernel function with its parameters. This paper proposes methods for selecting optimized parameters and the kernel function of SVM.
The paper is organized as follows. In Section 2, we give a brief description of the process of speech emotion feature extraction. Section 3 covers three tasks: establishing the emotion recognition model based on the optimized SVM, studying support vector machine classification, and proposing the method for optimizing the SVM. The experimental data and results, with our analysis, are shown in Section 4. Finally, the conclusion and future work are given in Section 5.
II. EXPERIMENT DATA ACQUISITION AND FEATURE<br />
EXTRACTION
A. Speech Emotional Database Description<br />
Referring to domestic and foreign research, this paper divides emotion into four categories: anger, happiness, sadness, and surprise, and tries to include all kinds of feelings in them. In order to obtain experimental utterances, some non-professionals were invited to record their emotions, thus creating an emotional database. The design of the experiment is speaker-independent and gender-independent; the students who took part in the experiment, aged about 20, include 5 males and 9 females. Recorded with Cool Edit Pro 2.0, all the data are sampled at 16 kHz on a single channel with 16-bit quantization and are stored on a PC in WAV form. Besides, we also invited another two groups of people, who were not engaged in the recording, to carry out a distinguishing experiment; each group involved several people. The first group's task was to distinguish the emotions, thus getting rid of the unqualified recordings and selecting 1256 sentences as emotion utterances stored for later use in the experiment. The second group tried to tell the differences among the 1256 sentences, which provides a subjective evaluation of the emotion utterances in the experiment. The four types of emotions (anger, happiness, sadness, surprise) are labeled 1, 2, 3 and 4 respectively. Fig. 1 shows the emotion class distribution of the 1256 samples. The 1256 items consist of 636 sentences from the males and 620 sentences from the females; each of the four main emotions has roughly 300 sentences, giving an even distribution.
B. Speech Emotional Features Extraction<br />
Speech emotional feature extraction is a process that extracts a small number of parameters from the speech signal which can later be used to represent each utterance. Speech extraction techniques include temporal analysis and spectral analysis techniques: the waveform of the speech signal is used in temporal analysis, while the spectral form of the speech signal is used in spectral analysis. Mel Frequency Cepstral Coefficients are a spectral analysis technique. In recent years, the MFCC feature has been widely used not only for speaker recognition but also for speech recognition.
Mel Frequency Cepstral Coefficients are a set of features reported to be robust in various kinds of pattern classification tasks on speech signals. Psychological studies have shown that human perception of the frequency content of sounds in speech signals does not follow a linear scale. Therefore, for each tone with an actual frequency f measured in Hz, a subjective pitch is measured on a scale called the Mel scale [14]. The Mel frequency scale is linear below 1000 Hz and logarithmic above 1000 Hz. Consequently, we can use the following formula to compute the Mel value for a given frequency f (Hz).
f_Mel = 2595 × log(1 + f / 700). (1)
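As a quick illustration of (1), the conversion can be written directly, assuming the base-10 logarithm:

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Mel value of a frequency in Hz, cf. (1)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

print(hz_to_mel(1000))   # roughly 1000 mel near the 1 kHz corner of the scale
```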
Figure 1. The distribution <strong>of</strong> four kinds <strong>of</strong> emotion.<br />
Fig. 2 shows the MFCC extraction algorithm. The MFCC process converts a linear spectrum into a nonlinear Mel spectrum.
1). Pre-emphasis
In our system, each of the utterances is sampled at 16 kHz. Pre-emphasizing the sampled speech signal with a filter is the first step in feature extraction. The purpose of pre-emphasis is to spectrally flatten the signal. The z-transform of the filter is

H(z) = 1 − μz^(−1), 0.94 ≤ μ ≤ 0.97.
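In the time domain this filter corresponds to y[n] = x[n] − μx[n−1]; a minimal NumPy sketch (illustrative only, with μ = 0.97 chosen from the stated range) is:

```python
import numpy as np

def pre_emphasis(x: np.ndarray, mu: float = 0.97) -> np.ndarray:
    """Apply H(z) = 1 - mu * z^-1, i.e. y[n] = x[n] - mu * x[n-1]."""
    y = np.copy(x).astype(float)
    y[1:] = x[1:] - mu * x[:-1]
    return y

# Example on a short dummy signal
print(pre_emphasis(np.array([0.0, 0.5, 0.8, 0.3])))
```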
Figure 2. MFCC extraction algorithm.<br />
X(k) = Σ_{n=0}^{N−1} x_a(n) e^(−j2πnk/N), 0 ≤ k < N. (4)
5). Mel scale filter banks
After the FFT processing, the frequency spectrum of each frame is filtered by a group of filters, and the power of each filter band is computed. A filter bank spaced uniformly on the Mel scale is used to simulate the subjective spectrum. The filter banks divide the magnitude spectrum into a number of bands; triangular overlapping windows give low frequencies more weight than high frequencies and sum the frequency content of each band. This process reflects the selectivity of the human ear [15].
6). Logarithm
This operation simulates the perception of loudness. We calculate the Mel Frequency Cepstral Coefficients from the output power of the filter bank using the logarithm; that is, we take the logarithm of the amplitudes of the spectrum mapped onto the Mel scale mentioned above.
7). Discrete Cosine Transform
The Discrete Cosine Transform (DCT) converts the log-power spectrum back to the time domain, so the Mel Frequency Cepstral Coefficients are real numbers. After the DCT operation, we get a feature vector with 12-dimensional MFCC.
After computing the MFCC, we calculate the logarithmic energy of each frame as one additional coefficient. Up to now we have a 13-dimensional coefficient vector consisting of 12 cepstral coefficients and one energy term. To enhance the performance of the speech emotion recognition, the delta and acceleration coefficients are also computed. After all the calculations, the total number of coefficients per frame is 39. In order to predict the emotion label of one sentence, we calculate the mean value over all frames of the sentence. Therefore, we get a 39-dimensional feature vector for each sample.
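One possible realization of this 39-dimensional feature is sketched below using librosa; this is an illustration rather than the authors' exact pipeline, and librosa's defaults (framing, filter banks, and the use of C0 rather than the log energy described above) may differ from the system in this paper.

```python
import numpy as np
import librosa

def emotion_feature(wav_path: str) -> np.ndarray:
    """13 base coefficients (12 cepstra plus an energy-like C0) with their deltas
    and accelerations, averaged over all frames -> 39-dimensional utterance vector."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape: (13, frames)
    delta = librosa.feature.delta(mfcc)
    accel = librosa.feature.delta(mfcc, order=2)
    feats = np.vstack([mfcc, delta, accel])              # shape: (39, frames)
    return feats.mean(axis=1)                            # mean over frames
```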
III. SPEECH EMOTION RECOGNITION BASED ON<br />
OPTIMIZED SVM<br />
Figure 3. System model architecture.<br />
A. System Model Architecture
Fig. 3 shows the speech emotion recognition system model architecture based on the optimized SVM. The process of the system is as follows:
STEP 1: Extract the speech emotion features from the 1256 utterances; the feature extraction method is described in Section II-B.
STEP 2: The main task of the optimization process is to improve the classification accuracy rate of the SVM; the method and algorithm are studied in Section III-C.
STEP 3: After the optimization process, the system trains an optimized model used for classification.
STEP 4: The system gives a classification result (class label or recognition rate) for the test samples.
The principle of the SVM method is studied in the next part.
B. Support Vector Machine Classification Method
Instead of the empirical risk minimization (ERM) commonly used in statistical learning, SVM is built on structural risk minimization (SRM). ERM minimizes only the training error, whereas SRM minimizes an upper bound on the generalization error; thus SVMs generalize well. The main principle of SVM is to construct a hyperplane as the decision surface that maximizes the margin of separation between negative and positive samples. SVM is therefore designed for two-class pattern classification; multi-class problems can be solved by combining several binary support vector machines.
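As a hedged illustration of this last point (not part of the original paper), the sketch below builds a multi-class decision from pairwise binary SVMs by one-vs-one voting; the use of scikit-learn and the voting scheme are our assumptions.

from itertools import combinations
import numpy as np
from sklearn.svm import SVC

def one_vs_one_predict(X_train, y_train, X_test):
    """Combine binary SVMs into a multi-class classifier by pairwise voting."""
    classes = np.unique(y_train)
    votes = np.zeros((len(X_test), len(classes)), dtype=int)
    for i, j in combinations(range(len(classes)), 2):
        mask = np.isin(y_train, [classes[i], classes[j]])
        clf = SVC(kernel="rbf").fit(X_train[mask], y_train[mask])
        pred = clf.predict(X_test)
        votes[:, i] += (pred == classes[i])
        votes[:, j] += (pred == classes[j])
    return classes[votes.argmax(axis=1)]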
1). Linear classification
Fig. 4 illustrates the optimal hyperplane for linear classification. Triangular and square points represent the two types of training samples. H denotes the hyperplane, and B1 and B2 are the planes parallel to H that pass through the points closest to H. The distance between B1 and B2 is called the margin, which is a measure of the expected generalization ability. Many linear hyperplanes separate the samples, but only one of them achieves the maximum margin; the optimal hyperplane is the one with the maximum margin between the classes. Using a large margin to separate the classes minimizes the bound on the expected generalization error.
Figure 4. The idea of an optimal hyperplane for linear classification.
Given a linearly separable set of training samples (x_i, y_i), i = 1, 2, ..., N, where x_i ∈ R^d (d is the dimension of the sample space) is a real-world data instance and y_i ∈ {1, −1} is the class label of x_i, a hyperplane can be represented as

f(x) = w · x + b = 0,    (1)
where x is an input vector, w is an adjustable weight vector, and b is a bias. Given a point x, if f(x) > 0 the point belongs to class 1, and if f(x) < 0 it belongs to class 2; that is, f(x, w, b) = sign(w · x + b). Let f(x) be normalized so that all samples satisfy |f(x)| ≥ 1. The vectors closest to H, which satisfy |f(x)| = 1, are called support vectors; in Fig. 4 the circled triangular and square points are the support vectors. Since the closest vectors to H satisfy |f(x)| = 1, the margin equals 2/||w||, so maximizing the margin of separation between the classes is equivalent to minimizing the Euclidean norm of the weight vector w. The classification problem can therefore be transformed into a constrained optimization problem:
min Φ(w) = (1/2)||w||²    (2)
subject to y_i(w · x_i + b) − 1 ≥ 0, i = 1, 2, ..., N.
The primal problem can be solved with the method of Lagrange multipliers. The corresponding Lagrangian is

J(w, b, α) = (1/2) w · w − Σ_{i=1}^{N} α_i { y_i(w · x_i + b) − 1 },    (3)
where α_i is a Lagrange multiplier. The solution is determined by the saddle point of the Lagrangian. After a series of transformations we obtain the following dual problem:

max Q(α) = Σ_{i=1}^{N} α_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j x_i · x_j    (4)
subject to Σ_{i=1}^{N} α_i y_i = 0 and α_i ≥ 0, i = 1, 2, ..., N.
The dual problem (4) can be solved by a standard quadratic programming method, which gives the optimum Lagrange multipliers α_i*. From them we can compute the optimum weight vector w*:

w* = Σ_{i=1}^{N} α_i* y_i x_i.    (5)
To compute the optimum bias b*, we can use w* and a positive support vector x_i (y_i = 1):

b* = 1 − w* · x_i.    (6)
Consequently, the optimal hyperplane, a multidimensional linear decision surface in the input space, is defined by

w* · x + b* = 0.    (7)

Accordingly, the decision function f(x) = sign(w* · x + b*) determines the label of a new sample. The optimal hyperplane for nonseparable patterns is studied in the next part.
2). Optimal Hyperplane for nonseparable patterns
The case discussed above is hard-margin classification; no sample point is allowed to be misclassified. In practice we cannot exclude the presence of noisy points, so not all points satisfy the constraints of the hyperplane. This problem is solved by introducing slack variables, and the corresponding optimization problem becomes

min Φ(w, ξ) = (1/2)||w||² + C Σ_{i=1}^{N} ξ_i    (8)
subject to y_i(w · x_i + b) − 1 + ξ_i ≥ 0 and ξ_i ≥ 0, i = 1, 2, ..., N,
where ξ_i is a slack variable and C is the penalty imposed on misclassifying a point. The larger C is, the less likely the SVM model is to misclassify training points; however, too large a value of C leads to overfitting. The parameter C controls the trade-off between the complexity of the machine and the number of nonseparable points, and is determined experimentally via the standard use of a training/test set [16].
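Purely as an illustration of this trade-off (the scikit-learn library, the synthetic dataset, and the chosen C values are our assumptions, not part of the paper's experiments):

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # A small C tolerates more margin violations (simpler machine, more support vectors);
    # a large C penalizes violations heavily and can overfit the training data.
    print(C, clf.n_support_.sum(), clf.score(X, y))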
3). Kernel function
Not all training data are linearly separable. To handle this situation, a kernel is introduced: the SVM performs a non-linear mapping from the low-dimensional input space to a high-dimensional feature space, so that training samples which are not linearly separable in the input space become linearly separable in the feature space.
The main idea of the kernel function is to let the necessary operations be performed in the low-dimensional input space instead of the potentially high-dimensional feature space, so the inner product need not be computed explicitly in the feature space. According to Mercer's theorem, there always exists a kernel function K which satisfies
K(x_i, x) = φ(x) · φ(x_i).    (9)

The vector φ(x) is the "image" induced in the feature space by the input vector x. The inner product can therefore be evaluated through the kernel function, which reduces the scale of the problem. The dual problem is redefined as
max Q(α) = Σ_{i=1}^{N} α_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j k(x_i, x_j)    (10)
where Σ_{i=1}^{N} α_i y_i = 0 and α_i ≥ 0, i = 1, 2, ..., N.
We then compute the optimum weight vector w* and the bias:

w* = Σ_{i=1}^{N} α_i* y_i φ(x_i),  b* = 1 − w* · x_i.    (11)
Finally, the classifier based on the support vectors is

f(x) = sign( Σ_{i=1}^{N} α_i y_i k(x_i, x) + b ).    (12)
In our system, we consider the following four kernel functions as learning machines:
(1) Linear function
It has the simplest form, given by the inner product of two vectors in the low-dimensional space plus an optional constant COEF:

K(x_i, x_j) = x_i · x_j + COEF.    (13)
(2) Polynomial function
It is an unstable kernel and is suited to problems in which all the training samples are normalized. Its adjustable parameters are the slope γ, the constant term COEF and the polynomial degree d:

K(x_i, x_j) = (γ x_i · x_j + COEF)^d, γ > 0.    (14)
(3) Radial basis function (RBF)
Its common form is Gaussian. The adjustable parameter γ plays a major role in the performance of the kernel and should be tuned carefully for the specific problem:

K(x_i, x_j) = exp(−γ ||x_i − x_j||²), γ > 0.    (15)
(4) Sigmoid function
It is also known as the Multi-Layer Perceptron (MLP) kernel; an SVM with a sigmoid kernel is equivalent to a two-layer perceptron neural network. The adjustable parameters are the slope γ and the intercept constant COEF. A common value for γ is 1/N, where N is the data dimension:

K(x_i, x_j) = tanh(γ x_i · x_j + COEF).    (16)
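For illustration only, the four kernels of equations (13)-(16) and the decision function of equation (12) can be written as in the following sketch; the NumPy implementation and the default parameter values are our assumptions.

import numpy as np

# The four kernels of equations (13)-(16); gamma, coef and degree are assumed defaults.
def linear(xi, xj, coef=0.0):              return xi @ xj + coef
def polynomial(xi, xj, gamma=1.0, coef=0.0, d=3):  return (gamma * (xi @ xj) + coef) ** d
def rbf(xi, xj, gamma=1.0):                return np.exp(-gamma * np.sum((xi - xj) ** 2))
def sigmoid(xi, xj, gamma=1.0, coef=0.0):  return np.tanh(gamma * (xi @ xj) + coef)

def decide(x, support_vectors, alphas, labels, b, kernel=rbf):
    """Decision function of equation (12): sign of the kernel expansion over support vectors."""
    s = sum(a * y * kernel(sv, x) for a, y, sv in zip(alphas, labels, support_vectors))
    return np.sign(s + b)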
C. Optimizing SVM Method
In our work, we propose a method for optimizing the SVM that covers both the selection of a kernel function and the selection of its parameters. We compare typical kernels with different parameter settings and select the better one for the speech emotion recognition task. The system uses the K-fold Cross Validation (K-CV) algorithm with grid search to select the kernel parameters. Its main idea, in pseudo-code form for two parameters, is as follows; the process for more than two parameters is similar.
FOR(parameter1 = begin1; parameter1 <= end1; parameter1 += step1)
  FOR(parameter2 = begin2; parameter2 <= end2; parameter2 += step2)
    accuracy = K-CV(parameter1, parameter2)
    IF(accuracy > bestAccuracy OR (accuracy == bestAccuracy AND parameter1 < bestParameter1))
      bestAccuracy = accuracy; bestParameter1 = parameter1; bestParameter2 = parameter2
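A minimal Python sketch of this coarse-to-fine grid search, assuming scikit-learn's SVC with the RBF kernel and 5-fold cross validation (the library, helper names, and exact ranges are our assumptions; the paper itself uses LibSVM in Matlab):

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def grid_search(X, y, log2c_range, log2g_range, step, k=5):
    """Return (accuracy, C, gamma) with the highest k-fold CV accuracy, preferring smaller C on ties."""
    best = (-1.0, None, None)
    for log2c in np.arange(*log2c_range, step):
        for log2g in np.arange(*log2g_range, step):
            C, gamma = 2.0 ** log2c, 2.0 ** log2g
            acc = cross_val_score(SVC(kernel="rbf", C=C, gamma=gamma), X, y, cv=k).mean()
            if acc > best[0] or (acc == best[0] and C < best[1]):
                best = (acc, C, gamma)
    return best

# Coarse search over 2^-10 .. 2^10 with exponent step 1.2, then a finer search
# around the best point, e.g. grid_search(X, y, (0.5, 3.4), (-8.5, -6.7), 0.15).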
IV. SPEECH EMOTION RECOGNITION EXPERIMENTS AND<br />
RESULTS ANALYSIS<br />
We perform experiments in Matlab 7.11 on the HIT speech database described in Section II. The objective of these experiments is to find an optimized SVM that improves the speech emotion recognition accuracy. First, the emotion features of the 1256 utterances are extracted; the spectral feature of each utterance consists of 12 MFCC and one energy coefficient per frame, together with their delta and acceleration. We use LibSVM v3.1 [17] as the implementation of support vector machine classification.
A. Selection of Optimized Parameters
We use the RBF kernel with different parameters to perform the parameter selection experiments. An SVM model with the RBF kernel has two adjustable parameters, g (γ in the Gaussian function) and C (the penalty parameter). The search scope of g and C is from 2^-10 to 2^10, and the search step is 1.2. Five-fold cross validation is performed for parameter selection. Fig. 5 and Fig. 6 show the 5-CV parameter selection result in contour form and in a 3-dimensional view [18]. Observing that g and C varying within a smaller scope already reach a higher recognition accuracy rate, we can narrow the scope and the step of the grid search. Fig. 7 shows the parameter selection result in contour form with the smaller search scope. Parameters C and g are then concentrated in a smaller region but still give a high recognition accuracy rate.
Figure 5. Parameter selection result in contour form (best c = 4, g = 0.0051543; 5-CV recognition accuracy = 71.4172%).
Figure 6. Parameter selection result in 3-Dimensional view (best c = 4, g = 0.0051543; 5-CV recognition accuracy = 71.4172%).
If several pairs of g and C correspond to the same recognition accuracy rate, the method chooses the pair with the smaller C, because a large value of C leads to overfitting. Table I shows the optimized C and g for grid searches with different scopes and steps, and gives the recognition accuracy rate on 80 test utterances for the corresponding parameters. From the results of experiments NO. 1-4, which use the method described in Section III.C, we can see that as the search scope and the search step decrease, the recognition accuracy rate on the training set improves, while the accuracy on the test set remains high. Through the four experiments we obtain the parameters C and g with the highest training-set accuracy, namely C = 2.3784 and g = 0.0057191. We compare our method with the common method that uses a large search scope together with a small search step; experiment NO. 5* shows its result. It yields the same optimized parameters as experiment NO. 4, but its total runtime of 38027.12 s is far greater than the total runtime of experiments NO. 1-4 (606.93 + 333.18 + 300.16 + 331.76 = 1572.03 s). The experiments show that the new method avoids unnecessary parameter-optimization work, sharply reduces the time complexity of parameter selection, and improves the speed of training and classification, while maintaining the recognition accuracy rate. The method is especially useful when training on very large sample sets.
Figure 7. Contour with smaller search scope (best c = 2.3784, g = 0.0057191; 5-CV recognition accuracy = 71.8949%).
B. Selection of Kernel Function
Table II shows the speech emotion recognition accuracy of the optimized SVM using the four kernels described in Section III.B. The optimized parameters of the other kernels are selected in the same way as for the RBF kernel. It can be seen that the RBF kernel gives the best recognition accuracy on both the training set and the test set.
TABLE II.
RECOGNITION ACCURACY OF SPEECH EMOTION USING DIFFERENT KERNELS

Kernel     | Optimized Parameters                             | Recognition Accuracy (Train set) | Recognition Accuracy (Test set)
Linear     | COEF = 0, C = 2.3784                             | 61.305%                          | 66.25%
Polynomial | COEF = 0, C = 0.056328, d = 3, g = 0.0046453 (γ) | 68.1529%                         | 82.5%
RBF        | C = 2.3784, g = 0.0057191                        | 71.8949%                         | 88.75%
Sigmoid    | COEF = 0, C = 0.4278, g = 0.00097656 (γ)         | 48.8854%                         | 48.75%

V. CONCLUSIONS
In this paper, we propose methods for selecting the optimized parameters and the kernel function of an SVM and establish a speech emotion recognition model based on the optimized SVM. The support vector machine is studied as the emotion classifier, and the Mel Frequency Cepstral Coefficients plus energy, together with their delta and acceleration, are used as the speech emotion features that form the input of the SVM. Experiments show that the proposed parameter selection method sharply reduces the time complexity compared with the common method while maintaining the recognition accuracy rate. Comparative experiments with four kernels show that the RBF kernel performs better in speech emotion recognition than the other kernels. Based on the selected parameters and the RBF kernel, we obtain an optimized SVM that effectively improves the performance of the emotion recognition system. Other methods for optimizing the SVM will be studied in future work.

TABLE I.
OPTIMIZED PARAMETERS AND RECOGNITION ACCURACY

NO. | Parameter | Scope           | Step | Optimized Values | Train set | Test set | Runtime (s)
1   | C         | 2^-10 ~ 2^10    | 1.2  | 4                | 71.4172%  | 92.5%    | 606.93
    | g         | 2^-10 ~ 2^10    | 1.2  | 0.0051543        |           |          |
2   | C         | 2^-2.8 ~ 2^8.8  | 0.6  | 2.639            | 71.7357%  | 95%      | 333.18
    | g         | 2^-10 ~ 2^-4    | 0.6  | 0.0078125        |           |          |
3   | C         | 2^-0.4 ~ 2^4.6  | 0.3  | 2.1435           | 71.7357%  | 90%      | 300.16
    | g         | 2^-9.4 ~ 2^-5.8 | 0.3  | 0.0063457        |           |          |
4   | C         | 2^0.5 ~ 2^3.4   | 0.15 | 2.3784           | 71.8949%  | 88.75%   | 331.76
    | g         | 2^-8.5 ~ 2^-6.7 | 0.15 | 0.0057191        |           |          |
5*  | C         | 2^-10 ~ 2^10    | 0.15 | 2.3784           | 71.8949%  | 88.75%   | 38027.12
    | g         | 2^-10 ~ 2^10    | 0.15 | 0.0057191        |           |          |
ACKNOWLEDGMENT
We thank the Speech Processing Lab of HIT for supporting this work. The work presented in this paper is supported by the National Natural Science Foundation of China under Grants No. 61171186 and 60772076, and by Project HIT.KLOF.2009015 supported by the Key Laboratory Opening Funding of the MOE-Microsoft Key Laboratory of Natural Language Processing and Speech. The authors are grateful to the anonymous reviewers for their constructive comments.
REFERENCES<br />
[1] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis,<br />
S. Kollias and et al, “Emotion Recognition in Human-<br />
Computer Interaction,” IEEE Signal Processing Magazine.<br />
America, vol. 18, pp. 32-80, January 2001.<br />
[2] L. Devillers, C. Vaudable, and C. Chastagnol, “Real-life<br />
emotion-related states detection in call centers: a crosscorpora<br />
study,” in Proc. INTERSPEECH 2010. Chiba, pp.<br />
2350-2353, September 2010.<br />
[3] D. Morrison, R. Wang, and L. C. De Silva, “Ensemble methods for spoken emotion recognition in call-centers,” Speech Communication. Amsterdam, vol. 49, pp. 98-112, February 2007.
[4] A. Batliner, K. Fischer, R. Huber, J. Spilker and E. Noth,<br />
“How to find trouble in communication,” Speech<br />
Communication. Amsterdam, vol. 40, pp. 117-143, April<br />
2003.<br />
[5] C. M. Lee and S. S. Narayanan, “Toward detecting<br />
emotions in spoken dialogs,” IEEE Trans. Speech and<br />
Audio Processing. America, vol. 13, pp. 293-303, March<br />
2005.<br />
[6] Scherer KR, “Vocal communication <strong>of</strong> emotion: A review<br />
<strong>of</strong> research paradigms,” Speech Communication.<br />
Amsterdam, vol. 40, pp. 227-256, April 2003.<br />
[7] L. Ten Bosch, “Emotion, speech and the ASR framework,”<br />
Speech Communication. Amsterdam, vol. 40, pp. 213-225,<br />
April 2003.<br />
[8] Z. Inanoglu and R. Caneel, "Emotive alert: HMM-based<br />
emotion detection in voicemail messages," in Proc.
JOURNAL OF SOFTWARE, VOL. 7, NO. 12, DECEMBER 2012 2733<br />
Intelligent User Interfaces. San Diego, pp. 251-253,<br />
January 2005.<br />
[9] K. Chen, GX Yue, F. Yu, Y. Shen and A.Q. Zhu,<br />
“Research on speech emotion recognition system in elearning,”<br />
Lecture Notes in Computer Science. Berlin, vol.<br />
4489, pp. 555-558, 2007.<br />
[10] R. Nakatsu, N. Tosa and T. Ochi, “Construction of an
interactive movie system for multi-person participation,” in<br />
Proc. Multimedia Computing and Systems. Austin, pp. 228-<br />
232, 1998.<br />
[11] C. M. Jones and I. M. Jonsson, “Performance analysis of acoustic emotion recognition for in-car conversational interfaces,” Lecture Notes in Computer Science. Berlin, vol. 4555, pp. 411-420, 2007.
[12] I. Luengo, E. Navasm, I. Hernaez and J. Sanchez,<br />
"Automatic Emotion Recognition using Prosodic<br />
Parameters," in Proc. INTERSPEECH 2005. Lisbon, pp.<br />
493-496, 2005.<br />
[13] H. Hao, X. Ming-Xing and W. Wei, “GMM Supervector Based SVM with Spectral Features for Speech Emotion Recognition,” in Conf. ICASSP 2007 (Acoustics, Speech and Signal Processing). Honolulu, pp. 413-420, 2007.
[14] S. S. Stevens, J. Volkmann and E. B. Newman, “A scale for the measurement of the psychological magnitude pitch,” Journal of the Acoustical Society of America, pp. 185-190, August 1937.
[15] N. Kamaruddin and A. Wahab, “Speech Emotion<br />
Verification System (SEVS) based on MFCC for real time<br />
applications,” in Conf. Intelligent Environments 2008.<br />
Seattle, July 2008.<br />
[16] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed., Pearson Education, 1999, pp. 340–372.
[17] Ch. Chang, Ch. Lin, LIBSVM: a Library for Support<br />
Vector Machines, http://www.csie.ntu.edu.tw/~cjlin/libsvm,<br />
2005.<br />
[18] MATLAB Chinese forum, Thirty Neural Networks<br />
Examples Analysis in MATLAB, Publication <strong>of</strong> Beijing<br />
University <strong>of</strong> Aeronautics and Astronautics, pp.110-pp.127,<br />
2010.<br />
Bo Yu received his B.E. in Computer Science and Technology from Harbin University of Science and Technology, Harbin, Heilongjiang Province, China, in 2004 and his M.E. in Technology of Computer Application from Harbin University of Science and Technology, China, in 2007. He is currently working towards the PhD degree at Harbin Institute of Technology. His major fields are speech recognition, pattern recognition and software development.
He started his teaching career in 2007 in the College of Software at Harbin University of Science and Technology and was promoted to lecturer in 2008. He is the Deputy Secretary of the Software Engineering Department. He taught Computer Theory Test in C Language in 2005 at Harbin University of Science and Technology, was a programmer at Hua Ze Digital Company from Oct. 2005 to Mar. 2006, and did field work at Neusoft Group from Sep. 2007 to Jan. 2008. One of his published articles is "Multi-agent Web Texts Mining Based on Galois Lattice," 2010 International Forum on Information Technology and Applications (IFITA 2010). His current research interests are speech emotion, pattern recognition and software development; his previous research interest is Web text mining.
Mr. Yu received a Certificate of Achievement as coach for his students winning the Second Prize in the Fifth Heilongjiang Province Programming Contest (Asia Provincial-National Contests), May 16, 2010.
Haifeng Li is a doctoral supervisor, Director of the Speech Processing Lab in the School of Computer Science and Technology at HIT, and an IEEE member. He is currently the Dean of the Honors School. He received a Doctor's Degree in Electro-Magnetic Measuring Technique and Instrumentation from HIT in 1997 and a Doctor's Degree in Computer, Communication and Electronic Science from the University of Paris VI, France, in 2002.
He started his teaching career at HIT in 1994, and was promoted to lecturer in 1995 and to professor in 2003. From 1997 to 2002 he was engaged in post-doctoral research at the University of Paris VI and presided over the Speech Noise Reduction Research project for France Telecom. In August 2004 he became the Assistant Dean of the School of Software. His research fields are audio information retrieval and processing and artificial neural networks. He has undertaken many projects of the National Natural Science Foundation and provincial and ministry science foundations, and has published over 30 papers in journals and conferences at home and abroad.
Chunying Fang was born in 1978. She is currently working toward the PhD degree at the Computer Science Department, Harbin Institute of Technology (HIT), Harbin, China. Her present research interests include speech recognition.
System Dynamics Modeling and Simulation for Competition and Cooperation of Supply Chain on Dual-Replenishment Policy
Shidi Miao
Department of Software Engineering, School of Software, Harbin University of Science and Technology, Harbin, China
Email: msdking@hrbust.edu.cn
Chunxian Teng, Lu Zhang
The System Engineering Research Institute, Harbin University of Science and Technology, Harbin, China
Email: tengcx@hrbust.edu.cn
Abstract—The need for holistic modeling efforts that support supply chain enterprises at a strategic level has been clearly recognized, first by industry and more recently by academia. In order to increase the profitability of the entire chain, strategic decision-makers need comprehensive models to guide them toward efficient decisions. The determination of the optimal network configuration, inventory management policies, supply contracts, distribution strategies, supply chain integration and information technology are prime examples of strategic decisions that affect the long-term profitability of the entire supply chain. With the main aim of supply chain management being to maximize the profits of supply chains, we first describe a benchmark model of the competitive behavior of two supply chains under fluctuating demand. On that basis, we design a cooperation contract between the chains for the retailers' inventory replenishment, and then extend this contract to a dual-replenishment policy in order to raise the profits of the supply chains and their members. System dynamics simulation shows that this cooperation contract increases profit to some degree, but the inventory fluctuation of the supply chain members is aggravated and the inventory cost increases. Consequently, we analyze in depth this key issue of strategic supply chain management, namely the regulation parameter. Finally, we demonstrate the applicability of the contract model with the dual-replenishment policy through computer simulation.
Index Terms—System dynamics, Dual-replenishment policy,<br />
Competition and Cooperation, Contract, Supply chain<br />
I. INTRODUCTION<br />
An increasingly vocal and popular sentiment holds that competition in the future will not be between companies but between supply chains; the supply chain has become an important means of winning the future [1-4]. The rise of global manufacturing and the boom in information technology make the structure of supply chains more complex, and many research fields are involved. Inventory management is one of the research hot spots for both domestic and foreign scholars. Cachon [5-7] analyzed the competition and cooperation strategy of a supplier and a retailer from the perspective of a single chain. Towill [8] studied inventory competition between two symmetrical supply chains, each consisting of a manufacturer and a retailer, but did not consider cooperation. Bernstein [9-12] studied the equilibrium state of competing retailers in a decentralized supply chain under demand uncertainty. Zhang and Xiao [13-15] made a large contribution to supply chain network competition, but they considered only price, service, and demand.
Most of these studies take a single standpoint, considering only across-chain competition or only replenishment. Building on the above studies, this paper constructs, under demand uncertainty, a model of competition between two supply chains, each consisting of one manufacturer and one retailer. Considering inventory levels and profit fluctuation, we establish a cooperation contract based on inventory replenishment and extend the contract to a dual-replenishment policy to achieve a win-win situation. Finally, we further demonstrate the effectiveness of the model by comparing simulation results.
II. A SYSTEM DYNAMICS MODEL OF THE TWO-ECHELON SUPPLY CHAIN COMPETITION
We assume that the competition model is composed of two supply chains, each consisting of a manufacturer and a retailer. The products managed by the two chains are homogeneous and completely substitutable. The consumer-market demand over which the two supply chains compete follows a normal distribution, as shown in Fig. 1.
Figure 1. Structure of the two-echelon supply chain competition
Project supported by the National Natural Science Foundation of China (Grant No. 70871031).
doi:10.4304/jsw.7.12.2734-2741
In each supply chain, when the retailer places an order with the manufacturer, it always uses a periodic-review replenishment strategy, i.e., it checks the inventory state periodically. There is no replenishment between the corresponding nodes of the two supply chains, and the relationship between these nodes is fully competitive. If the existing stock is above the replenishment point, no replenishment is made at the review; otherwise the retailer replenishes the stock. Fig. 3 presents the system dynamics model of supply chain competition.
The actual replenishment behavior is triggered by the retailer's replenishment signal. When the retailer's accumulated orders are greater than or equal to its economic order quantity, or its inventory is less than or equal to its order point, the replenishment signal becomes one and replenishment is triggered. At that moment the retailer's actual replenishment quantity is the minimum of the manufacturer's inventory and the economic order quantity; otherwise the replenishment quantity is zero. The DYNAMO equation of the retailer's replenishment signal is:

REPSIGNALR.K = 1, if ORDERRA.K ≥ EOQR.K or INVR.K ≤ ORDERPR.K
REPSIGNALR.K = 0, if ORDERRA.K ≤ EOQR.K and INVR.K ≥ ORDERPR.K    (1)

The DYNAMO equation of the retailer's actual replenishment quantity is:

REALREPR.KL = 0, if REPSIGNALR.K ≠ 1
REALREPR.KL = MIN(INVM.K, EOQR.K), if REPSIGNALR.K = 1    (2)
In this paper, our research relates only to the finished-goods stock in the two-echelon supply chain. In the manufacturer's stock model we do not consider other raw material costs and pay close attention to the price of raw material, which is directly related to the stock of finished products. Under this competition mode, the retailer's inventory is determined by the retailer's shipment rate (SALER) and receiving rate (SHIPTOR), and its DYNAMO equation is:

INVR.K = INVR.J + DT*(SHIPTOR.JK − SALER.JK)    (3)
The two supply chains share the same market but have different market shares. We assume that the DYNAMO equations relating demand and price are:

DEMANDSC1.K = DEMAND.K − β1*PRICER1.K + β2*PRICER2.K
DEMANDSC2.K = DEMAND.K − β1*PRICER2.K + β2*PRICER1.K    (4)
Here the total market demand is a nonnegative random variable whose distribution function and density function are both continuous. The market demands of SC1 (supply chain 1) and SC2 (supply chain 2) are determined by the retail prices of the products of R1 (Retailer 1) and R2 (Retailer 2). When the price of R1's product increases, its own market demand decreases; we call this the crowding-out effect. When the price of R2's product rises, its rival's market demand may increase; we call this the attraction effect. The parameters β1 and β2 measure these two effects, respectively.
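Purely as an illustration (the function names and the Python form are our assumptions, not part of the DYNAMO model listing), equations (1)-(4) can be sketched as follows:

def chain_demands(total_demand, price_r1, price_r2, beta1, beta2):
    """Equation (4): demand split with crowding-out (beta1) and attraction (beta2) effects."""
    demand_sc1 = total_demand - beta1 * price_r1 + beta2 * price_r2
    demand_sc2 = total_demand - beta1 * price_r2 + beta2 * price_r1
    return demand_sc1, demand_sc2

def retailer_replenishment(accumulated_orders, inv_retailer, eoq, order_point, inv_manufacturer):
    """Equations (1)-(2): periodic-review trigger and actual replenishment quantity."""
    signal = 1 if (accumulated_orders >= eoq or inv_retailer <= order_point) else 0
    quantity = min(inv_manufacturer, eoq) if signal == 1 else 0
    return signal, quantity

def update_inventory(inv_retailer, ship_to_retailer, sales, dt=1.0):
    """Equation (3): retailer inventory level integration over one time step."""
    return inv_retailer + dt * (ship_to_retailer - sales)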
In the model shown in Fig. 1, before each marketing cycle every manufacturer and retailer makes production and ordering decisions according to its sales expectation, the market demand, and its own stock level. The retailers' demand on the manufacturers depends on the customers' demand. The manufacturers deliver products to the retailers, and the retailers receive the goods and sell them to customers. Each manufacturer's cost consists mainly of four parts: the cost of processing the orders it has received, the purchase cost of raw material, the production cost of turning raw material into products, and the storage cost of products and raw material. The manufacturer's income comes from selling products to the retailer, and its profit is the difference between total income and total cost. Each retailer's cost consists mainly of three parts: the purchase cost of buying products from the manufacturer, the ordering cost when the stock is insufficient, and the inventory holding cost when stock remains. The retailer's revenue equals its sales receipts, and its total profit is the difference between total revenue and total cost. The total profit of the entire supply chain is the sum of the total profits of its manufacturer and retailer.
Ⅲ. A SYSTEM DYNAMICS MODEL FOR AN ACROSS-CHAIN COOPERATION CONTRACT BASED ON RETAILERS' INVENTORY REPLENISHMENT
The foregoing model is an inner-chain cooperation mode: it uses only within-chain replenishment to meet demand. Because each chain pursues the single goal of increasing its own profit, the winning node of one supply chain may run out of stock while, conversely, the losing node may be left with surplus stock. We therefore present an across-chain cooperation contract based on retailers' inventory replenishment to coordinate the competing supply chains, as shown in Fig. 2.
Figure 2. Structure of across-chain cooperation
We suppose that the competing nodes are geographically close and ignore replenishment lead time and replenishment delay. The across-chain cooperation contract is triggered when one retailer is out of stock and the other has surplus stock. Whether a retailer has a surplus is judged by the difference between customer demand and the retailer's inventory: when one difference is greater than zero and the other is less than zero, the cooperation signal is triggered and the two retailers of the different chains begin to cooperate. Based on the cooperative product price and batch size, the two retailers are coordinated through the added across-chain replenishment cooperation contract, which brings the retailers, the manufacturers and the whole supply chains to a win-win situation.
Figure 3. Causal loop diagram of supply chain competition
Fig. 7 presents the model of the across-chain cooperation contract. The difference (DISPERSION) is the difference between the retailer's inventory (INVR) and the customer demand function (BDEMANDF). The DYNAMO equation of the cooperation signal (CSIGNAL) is:

CSIGNAl1.K = 1, if DISPERSION1.K > 0 and DISPERSION2.K < 0; otherwise 0    (5)

The cooperative unit price charged to R1 depends on the across-chain replenishment quantity R2TOR1: when the quantity reaches 100 or more, the cooperative price is 110% of the transfer price TPRICER1, otherwise it is 130%:

CPRICER1.K = TPRICER1*(1+10%), if R2TOR1.K ≥ 100
CPRICER1.K = TPRICER1*(1+30%), if R2TOR1.K < 100    (7)
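A minimal illustrative sketch of this trigger and tiered pricing logic (the threshold of 100 units, the 10%/30% mark-ups, and the Python form follow the reconstruction above and the manufacturer rule in Section IV, and are our assumptions):

def cooperation_signal(dispersion_1, dispersion_2):
    """Across-chain cooperation fires when one retailer is short and the other has surplus."""
    return 1 if (dispersion_1 > 0 and dispersion_2 < 0) else 0

def cooperative_price(transfer_price, quantity, threshold=100,
                      low_markup=0.10, high_markup=0.30):
    """Tiered cooperative unit price: larger cooperation batches get the lower mark-up."""
    return transfer_price * (1 + (low_markup if quantity >= threshold else high_markup))

# Example: R2 delivers 150 units to R1 at 110% of the transfer price.
price = cooperative_price(transfer_price=20.0, quantity=150)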
Figure 7. Causal loop diagram of across-chain cooperation
Figure 8. Causal loop diagram of across-chain cooperation based on dual-replenishment
Ⅳ. A SYSTEM DYNAMICS MODEL OF AN ACROSS-CHAIN COOPERATION CONTRACT BASED ON DUAL-REPLENISHMENT
On the basis of the retailers' across-chain cooperation, we introduce an across-chain cooperation contract between the manufacturers in the same way, and establish the model of an across-chain cooperation contract for both manufacturers and retailers, shown in Fig. 9, to further improve the overall profits.
Figure 9. Structure of dual-replenishment cooperation
Suppose that across-chain replenishment is used when a manufacturer is short of stock; the unit price of the across-chain replenished raw materials depends on the cooperation quantity. When the quantity reaches 200 or more, the unit price of the cooperative raw materials is 110% of the original price; when it is less than 200, the unit price is 130% of the original price.
To compare the benchmark competition model, the retailers' replenishment contract, and the dual-replenishment policy, we simulate them and contrast the results. As shown in Fig. 10, Fig. 11, Fig. 12 and Fig. 13, the profit of the supply chain using dual-replenishment improves further. Although the cooperation cost raises the total cost of manufacturers and retailers, and the inventory cost of each node increases owing to the larger demand, across-chain cooperative replenishment reduces other costs, offsets the additional cost, and eventually lowers the total cost.
Figure 10. Contrast of SC1's profit (1: single-replenishment 2: competition 3: dual-replenishment)
Figure 11. Contrast of R2's profit (1: single-replenishment 2: competition 3: dual-replenishment)
Figure 12. Contrast of M1's inventory cost (1: dual-replenishment 2: single-replenishment 3: competition)
Figure 13. Contrast of M1's cost (1: dual-replenishment 2: single-replenishment 3: competition)
As the simulation results in Fig. 14 show, the manufacturers and retailers do not receive replenishment at the same time; that is, stock-outs do not happen simultaneously. When a node is short of stock, its upstream (or downstream) node is either amply supplied or is itself receiving cooperative replenishment.
Figure 14. Contrast of replenishment (1: M2 delivers to M1 2: R2 delivers to R1)
Across-chain cooperation improves both the nodes' profits and the total chain's profits, but owing to the increased demand the inventory fluctuates more frequently and the inventory cost also rises, which will be the focus of further research.
Ⅴ. CONCLUSIONS
In order to enhance profit, we use system dynamics to construct a supply chain competition model and an across-chain cooperation model based on retailers' inventory replenishment, and then extend the across-chain cooperation model to a dual-replenishment policy covering both manufacturers and retailers to raise profit further. The study shows that the profits of the whole supply chains and of their nodes increase step by step as competition is transformed into cooperation and the single-replenishment policy is transformed into the dual-replenishment policy. Owing to the continuing growth of demand, however, the inventory fluctuates considerably and the inventory cost also increases. The simulation results also show that replenishments do not occur at the same time, which is consistent with reality. The improved model can be further used to analyze many supply chain policies and to answer questions about the operation of supply chains, using total supply chain profit as the measure of performance. The model can also be tailored and used in a wide range of manufacturing supply chains, so it may prove useful to policy-makers, regulators, and decision-makers dealing with a wide spectrum of strategic supply chain management issues.
REFERENCES<br />
[1] Rice J B, HOPPE R M. Supply chain vs. supply chain: the<br />
hype and the reality[J]. Supply Chain Management<br />
Review,2011.22(5):47–54.<br />
[2] Christopher M. Logistics and supply chain management:<br />
strategies for reducing cost and improving service[M].<br />
London: Financial Times Press, 1999: 5–25.<br />
[3] Barnes D. Competing supply chain are the future[N].<br />
Financial Times, 2006,11–08(8).<br />
[4] Beamon B M. Supply chain design and analysis: models<br />
and methods[J]. International <strong>Journal</strong> <strong>of</strong> Production<br />
Economics, 1998, 55(3): 281–294.<br />
[5] Cachon H. Competitive and cooperative inventory policies<br />
in a two-stage supply chain[J]. Management Science,<br />
1999, 45(7): 936–952.<br />
[6] Cachon G P. Stock wars: Inventory competition in a twoechelon<br />
supply chain with multiple retailers[J]. Operations<br />
Research, 2001,49(5):658–674.<br />
[7] Cachon G P. Supply chain coordination with contracts:<br />
Handbooks in Operations Research and Management<br />
Science: Supply Chain Management[M]. North Holland:<br />
Elsevier Publishing, 2003: 229–239.<br />
[8] Towill D R. Decoupling for supply chain competitiveness<br />
[J]. IEE Manufacturing Engineering, 2005,84(1) :36–39.<br />
[9] Bernstein F, Federgruen A. Decentralized supply chain<br />
competing retailers demand uncertainty[J]. Management<br />
Science, 2005, 51(1):18–29.<br />
[10] Bernstein F, Federgruen A. Dynamic inventory and<br />
pricing Models for competing retailers[J]. Naval Research<br />
Logistics, 2004, 51(2) 258–274.<br />
[11] Bernstein F, Federgruen A. Pricing and replenishment<br />
strategies in a distribution system with competing<br />
retailers[J]. Operations Research, 2003, 51(3):409–426.<br />
[12] Bernstein F, Federgruen A. Coordination mechanisms for<br />
supply chains under price and service competition.<br />
Working Paper, 2003<br />
[13] Zhang D, Dong J. A supply chain network economy:<br />
modeling and qualitative analysis [A]. In: Anna Nagurney,<br />
Innovations in financial and economic networks [M],<br />
Edward Elgar Publishing Inc, 2003.<br />
[14] Zhang D. A network economic model for supply chain<br />
versus supply chain competition[J]. Omega: The<br />
International <strong>Journal</strong> <strong>of</strong> Management Science, 2006 ,34(3),<br />
283–295.<br />
[15] Xiao T J, YANG D Q. Price and service competition <strong>of</strong><br />
supply chains with risk-averse retailers under demand<br />
uncertainty[J]. International <strong>Journal</strong> <strong>of</strong> Production<br />
Economics, 2008,114(1): 187–200.<br />
Shidi Miao (1979- ) (Tel.: +8613796092201, +86045186397007) is a lecturer in the Department of Software Engineering, School of Software, Harbin University of Science and Technology, China. He received the master's degree in computer application technology and is studying for a doctorate in management science and engineering. His research interests are in supply management and system dynamics applications. He teaches Oracle database, ERP, electronic commerce and software project management.
Chunxian Teng (1947-) (Tel.: +86045186390845) is a professor at the College of Management, Harbin University of Science and Technology, China. He is a Ph.D. supervisor in supply management and in system analysis and optimization. He is currently the director of the System Engineering Research Institute at Harbin University of Science and Technology.
Lu Zhang (1986-) (zhanglu-sunny@hotmail.com) is an operator at the Huizhou Refinery of China National Offshore Oil Corporation. She received the master's degree in management science and engineering.
Audio Error Concealment Based on Wavelet Decomposition and Reconstruction
Fei Xiao, Hexin Chen and Yan Zhao
School of Communication Engineering, Jilin University, Changchun, China
Email: xiaofei200411@163.com, {chx, zhao_y}@jlu.edu.cn
Abstract—Wavelet decomposition and reconstruction are applied to audio error concealment in this paper. When a frame of the audio signal is lost, the correctly received frames that lie before and after the lost frame are first wavelet-decomposed. The two sets of wavelet coefficients obtained from this decomposition are then used to estimate the wavelet coefficients of the lost frame. Finally, the concealment of the lost frame is completed by wavelet reconstruction from these coefficients. Compared with the traditional audio error concealment algorithm, the proposed method reconstructs lost audio frames better.
Index Terms—error concealment, audio, wavelet, MALLAT,<br />
CELP<br />
I. INTRODUCTION<br />
People live in an environment full of sounds: language conveys information in social communication, and music expresses the feelings of people. Sound thus has a dual nature: it is an objective physical reality and, at the same time, a subjective perception. When part of a compressed audio signal is lost during storage or transmission, audio error concealment (EC) is needed to deal with the loss. Audio error concealment exploits the short-term stationarity of the audio signal, together with the characteristics of the human auditory system, to cover up decoding errors caused by damaged storage media or transmission channel errors and thereby improve the audio playback quality.
So far, researchers have put forward many techniques for dealing with lost audio. One is waveform substitution, which reconstructs missing packets by substituting past waveform segments and includes pattern matching and pitch detection [1][2]. Another type of audio error concealment algorithm constructs audio packets at the receiver from encoder parameters, based on CELP, to complete the concealment [3]. Because this method derives the encoder state from the packets surrounding the loss and generates a replacement for the lost packet from that state, it is complex to implement but can give good results. Yet another error concealment algorithm focuses on reconstructing the linear predictive coding (LPC) coefficients, which represent the short-term spectral information of speech within a frame; preserving them plays a major role in the quality of the reconstructed speech [4].
Previous studies show that when the short-term stationarity of the audio signal does not hold, or the waveform of the audio signal is non-repetitive, some of these methods cannot reconstruct the lost audio satisfactorily. Therefore, in this paper an audio error concealment algorithm based on wavelet decomposition and reconstruction is proposed to improve the quality of the concealed audio.
II. BASIC THEORY OF WAVELET DECOMPOSITION AND<br />
RECONSTRUCTION<br />
The MALLAT algorithm, based on the compactly supported DAUBECHIES wavelets, is used in the proposed method to decompose and reconstruct the audio signal.
A. MALLAT Algorithm<br />
Let $\{V_j\}$ be a given multi-resolution analysis scale space, and let $f \in V_{J_1}$ ($J_1$ is a definite integer) be an arbitrary signal, which has the following wavelet decomposition [5]:

$f(t) = A_{J_1} f(t) = A_{J_1+1} f(t) + D_{J_1+1} f(t)$   (1)

where

$A_{J_1+1} f(t) = \sum_{m=-\infty}^{\infty} C_{J_1+1,m}\, \varphi_{J_1+1,m}$   (2)

$D_{J_1+1} f(t) = \sum_{m=-\infty}^{\infty} d_{J_1+1,m}\, \psi_{J_1+1,m}$   (3)

Define

$H = (H_{m,k}) = (h_{k-2m}), \quad G = (G_{m,k}) = (g_{k-2m})$   (4)

Then

$C_{J_1+1} = H C_{J_1}, \quad d_{J_1+1} = G C_{J_1}$   (5)
Similarly, there is

$f(t) = A_{J_2} f(t) + \sum_{j=J_1+1}^{J_2} D_j f(t)$   (6)

where

$C_{j+1} = H C_j, \quad d_{j+1} = G C_j, \quad j = J_1, J_1+1, \ldots, J_2$   (7)

Figure 1. MALLAT decomposition

Figure 2. Flow chart for wavelet decomposition of the audio signal
This is the MALLAT pyramid decomposition algorithm, where $A_j f$ is called the continuous approximation of $f$ under the resolution $2^{j}$ [5]. The discrete approximation signal $C_{j-1}$ passes through the filter $H$ to give the discrete approximation signal $C_j$ under the resolution $2^{j}$, and the discrete detail signal $d_j$ is obtained by passing $C_{j-1}$ through the filter $G$ under the resolution $2^{j}$. The MALLAT wavelet decomposition is shown in Fig. 1. Obviously, the inverse process of the wavelet decomposition is also valid. The MALLAT reconstruction algorithm is [5]:

$C_j = H^{*} C_{j+1} + G^{*} d_{j+1}, \quad j = J_2-1, J_2-2, \ldots, J_1$   (8)

where $H^{*}$ and $G^{*}$ are the dual operators of $H$ and $G$, respectively [5].
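As a concrete illustration of the pyramid decomposition (7) and reconstruction (8), the sketch below uses the PyWavelets library, which implements this algorithm for Daubechies wavelets. The wavelet name 'db4', the 3-level depth and the synthetic test frame are illustrative choices, not values prescribed by the paper.

```python
import numpy as np
import pywt

# One frame of audio samples (synthetic placeholder: a 440 Hz tone at 8 kHz).
frame = np.sin(2 * np.pi * 440 * np.arange(160) / 8000.0)

# MALLAT-style pyramid decomposition: coeffs = [a_3, d_3, d_2, d_1].
coeffs = pywt.wavedec(frame, 'db4', level=3)
a3, d3, d2, d1 = coeffs

# Inverse process (reconstruction), as in Eq. (8) / Fig. 1 run backwards.
reconstructed = pywt.waverec(coeffs, 'db4')
print(np.allclose(frame, reconstructed[:len(frame)]))  # True: perfect reconstruction
```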
B. Wavelet Decomposition of the Audio Signal
The wavelet decomposition method is applied to the audio signal. According to (7) of the MALLAT decomposition, the smooth version (averages) of the audio signal is obtained through the filter H, and the detail version (details) of the audio signal is obtained through the filter G. The smooth version is then further wavelet transformed to obtain a smoother version and a further detail version. The wavelet decomposition process of the audio signal is shown in Fig. 2 [6].
The compactly supported DAUBECHIES wavelet is used to decompose and reconstruct the audio signal. Among finite-length FIR filters, the DAUBECHIES wavelet function has the greatest regularity, so the waveform of the wavelet is relatively smooth and its time-frequency localization is good [7]. Let $H(\omega)$ be the Fourier transform of $h(n)$, that is

$H(\omega) = \sum_n h(n) e^{-j\omega n}$   (9)

After determining $H(\omega)$, the scaling coefficients $h_n$ and the wavelet coefficients $g_n$ can be obtained, which are defined as [6]:

$_N\varphi(t) = \sqrt{2} \sum_{n=0}^{2N-1} h_n\, \varphi(2t-n)$   (10)

$_N\psi(t) = \sqrt{2} \sum_{n=0}^{2N-1} g_n\, \varphi(2t-n)$   (11)

$g_n = (-1)^n h_{2N-n-1}, \quad n = 0, 1, 2, \ldots, 2N-1$   (12)

Through the above operations, the high-frequency coefficients of the decomposed layers and the low-frequency coefficients of the last layer can be obtained [6].
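The quadrature relation (12) between the low-pass coefficients $h_n$ and the high-pass coefficients $g_n$ translates directly into code. The sketch below is only an illustration of that formula; the Daubechies D4 filter used as the example input is standard, but any filter of even length 2N could be supplied.

```python
import numpy as np

def wavelet_from_scaling(h):
    """Apply Eq. (12): g_n = (-1)^n * h_{2N-n-1}, for n = 0..2N-1."""
    h = np.asarray(h, dtype=float)
    n = np.arange(len(h))
    return ((-1.0) ** n) * h[::-1]

# Example: the Daubechies D4 (N = 2) scaling filter.
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
g = wavelet_from_scaling(h)
```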
C. Wavelet Reconstruction of the Audio Signal
The reverse process of the wavelet decomposition is the wavelet reconstruction of the audio signal. The decomposed high-frequency coefficients and low-frequency coefficients can be used to reconstruct the audio signal.
III. THE PROPOSED METHOD<br />
A. Overview<br />
The core idea of the proposed algorithm is to use the wavelet function and the MALLAT algorithm to perform wavelet decomposition of the audio signals near the lost frame. The wavelet coefficients of these neighboring frames are then used to estimate the wavelet coefficients of the lost frame. Finally, the estimated wavelet coefficients are used for wavelet reconstruction to replace the lost signal and complete the recovery of the audio signal.
Fig. 3 shows the overall block diagram of the proposed method.
The audio signal is assumed to have L frames in total. Two cases are considered in our experiments: unilateral concealment and bilateral concealment. In unilateral concealment, only the old frames before the lost frame are used to conceal it. In bilateral concealment, both the old and future frames of the lost frame are used for error concealment.
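For orientation, the following minimal sketch shows how an audio signal might be split into L non-overlapping frames and how a single lost frame can be simulated. The frame length of 160 samples is only an example taken from the experiments reported later; the paper does not prescribe this helper.

```python
import numpy as np

def split_into_frames(signal, frame_len=160):
    """Split a 1-D audio signal into consecutive, non-overlapping frames."""
    n_frames = len(signal) // frame_len
    return signal[:n_frames * frame_len].reshape(n_frames, frame_len)

# Simulate the loss of frame l (0-based index in this sketch).
frames = split_into_frames(np.random.randn(8000))
l = 25
frames[l] = 0.0  # the decoder has no data for this frame
```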
B. Unilateral Concealment of Wavelet Coefficients
Assume the number of the lost frame is $l$. The frame $l-1$ is decomposed by an $n$-layer wavelet decomposition and the corresponding wavelet coefficients $\vec{g}_{l-1} = (a_{(l-1)n}, d_{(l-1)n}, d_{(l-1)(n-1)}, \ldots, d_{(l-1)1})$ are obtained, where $a_{(l-1)n}$ is the smooth coefficient and $d_{(l-1)n}, d_{(l-1)(n-1)}, \ldots, d_{(l-1)1}$ are the detail coefficients. Similarly, the frames numbered $1$ to $l-2$ of the audio signal are decomposed respectively. The coefficients of every frame are then obtained and stored in the matrix $C$:

$C = \begin{bmatrix} a_{1n} & d_{1n} & d_{1(n-1)} & \cdots & d_{11} \\ a_{2n} & d_{2n} & d_{2(n-1)} & \cdots & d_{21} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{(l-2)n} & d_{(l-2)n} & d_{(l-2)(n-1)} & \cdots & d_{(l-2)1} \end{bmatrix}_{(l-2)\times(n+1)}$

The cross-correlation value $a_i$, $i = 1, 2, \ldots, l-2$, between $a_{(l-1)n}$ in $\vec{g}_{l-1}$ and $a_{in}$, $i = 1, 2, \ldots, l-2$, in $C$ is calculated; these values constitute a row vector $\vec{A} = (a_1, a_2, \ldots, a_{l-3}, a_{l-2})_{1\times(l-2)}$. The maximum value in $\vec{A}$ is found and the frame number $m_1$ corresponding to this maximum is obtained; $a_{(m_1+1)n}$ is then used as the smooth wavelet coefficient of the corresponding lost frame. In the same way, the maximum correlation values for all the detail coefficients $d_{(l-1)n}, d_{(l-1)(n-1)}, \ldots, d_{(l-1)1}$ are calculated, the frame numbers $m_2, m_3, \ldots, m_n, m_{n+1}$ are obtained in turn, and $d_{(m_2+1)n}, d_{(m_3+1)(n-1)}, \ldots, d_{(m_n+1)2}, d_{(m_{n+1}+1)1}$ are used as the substitute wavelet coefficients of the corresponding lost frame. At last the resulting vector $\vec{g}_l = (a_{(m_1+1)n}, d_{(m_2+1)n}, d_{(m_3+1)(n-1)}, \ldots, d_{(m_{n+1}+1)1})$ is obtained. Using this vector as the wavelet coefficients corresponding to the lost frame of the audio signal, we can finally reconstruct the lost signal by wavelet reconstruction, thus completing the error concealment process.
Figure 3. Overall block diagram<br />
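A minimal sketch of this unilateral selection rule follows. It assumes each frame is decomposed with PyWavelets as in the earlier example, and it uses normalized correlation of whole coefficient subbands as the similarity measure; the paper does not specify the exact cross-correlation definition, so the helper names and that choice are illustrative assumptions.

```python
import numpy as np
import pywt

def subbands(frame, wavelet='db4', level=3):
    # Returns [a_n, d_n, ..., d_1], matching the notation of the text.
    return pywt.wavedec(frame, wavelet, level=level)

def similarity(x, y):
    # Normalized correlation between two subbands of equal length.
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12))

def estimate_backward(frames, l, wavelet='db4', level=3):
    """Coefficient estimate g_l for lost frame l (0-based) from earlier frames."""
    ref = subbands(frames[l - 1], wavelet, level)
    cand = [subbands(f, wavelet, level) for f in frames[:l - 2]]  # frames before l-1
    est = []
    for b in range(len(ref)):
        m = int(np.argmax([similarity(ref[b], c[b]) for c in cand]))
        # take subband b from the frame that FOLLOWS the best-matching frame
        est.append(subbands(frames[m + 1], wavelet, level)[b])
    return est

def conceal_unilateral(frames, l, wavelet='db4', level=3):
    return pywt.waverec(estimate_backward(frames, l, wavelet, level), wavelet)
```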
C. Bilateral Concealment of Wavelet Coefficients
In this case, both the old and future frames of the lost frame are used to conceal it. For the old frames, the unilateral concealment method above is used. For the future frames, the following process is applied. The frame $l+1$ is decomposed by an $n$-layer wavelet decomposition and the wavelet coefficients $\vec{g}_{l+1} = (a_{(l+1)n}, d_{(l+1)n}, d_{(l+1)(n-1)}, \ldots, d_{(l+1)1})$ are obtained. The frames numbered $l+2$ to $L$ of the audio signal are decomposed respectively, and the coefficients of every frame are stored in the matrix $D$:

$D = \begin{bmatrix} a_{(l+2)n} & d_{(l+2)n} & d_{(l+2)(n-1)} & \cdots & d_{(l+2)1} \\ a_{(l+3)n} & d_{(l+3)n} & d_{(l+3)(n-1)} & \cdots & d_{(l+3)1} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{Ln} & d_{Ln} & d_{L(n-1)} & \cdots & d_{L1} \end{bmatrix}_{(L-l-1)\times(n+1)}$

The cross-correlation value $b_j$, $j = l+2, l+3, \ldots, L$, between $a_{(l+1)n}$ in $\vec{g}_{l+1}$ and $a_{jn}$, $j = l+2, l+3, \ldots, L$, in $D$ is computed; these values constitute a row vector $\vec{B} = (b_{l+2}, b_{l+3}, \ldots, b_L)_{1\times(L-l-1)}$. The maximum value in $\vec{B}$ and its corresponding frame number $p_1$ are determined, and $a_{(p_1-1)n}$ is used as the smooth wavelet coefficient of the corresponding lost frame. Similarly, the maximum correlation values for all the detail coefficients $d_{(l+1)n}, d_{(l+1)(n-1)}, \ldots, d_{(l+1)1}$ are calculated, the frame numbers $p_2, p_3, \ldots, p_n, p_{n+1}$ are obtained in turn, and $d_{(p_2-1)n}, d_{(p_3-1)(n-1)}, \ldots, d_{(p_n-1)2}, d_{(p_{n+1}-1)1}$ are used as the wavelet coefficients of the corresponding lost frame. At last the resulting vector $\vec{G}_l = (a_{(p_1-1)n}, d_{(p_2-1)n}, d_{(p_3-1)(n-1)}, \ldots, d_{(p_{n+1}-1)1})$ is obtained, and then $\vec{G} = (2\vec{g}_l + \vec{G}_l)/3$ is calculated. Using the vector $\vec{G}$ as the wavelet coefficients corresponding to the lost frame of the audio signal, the lost signal is finally reconstructed by wavelet reconstruction, thus completing the error concealment process.
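Under the same assumptions, the bilateral case can reuse the helpers from the previous sketch (subbands, similarity, estimate_backward) in both directions and blend the two estimates with the 2:1 weighting $\vec{G} = (2\vec{g}_l + \vec{G}_l)/3$ given above; this is only an illustrative sketch, not the authors' implementation.

```python
def estimate_forward(frames, l, wavelet='db4', level=3):
    """Coefficient estimate G_l for lost frame l from the future frames."""
    ref = subbands(frames[l + 1], wavelet, level)
    cand_idx = list(range(l + 2, len(frames)))
    cand = [subbands(frames[j], wavelet, level) for j in cand_idx]
    est = []
    for b in range(len(ref)):
        p = int(np.argmax([similarity(ref[b], c[b]) for c in cand]))
        # take subband b from the frame that PRECEDES the best-matching frame
        est.append(subbands(frames[cand_idx[p] - 1], wavelet, level)[b])
    return est

def conceal_bilateral(frames, l, wavelet='db4', level=3):
    g_l = estimate_backward(frames, l, wavelet, level)   # from the past
    G_l = estimate_forward(frames, l, wavelet, level)    # from the future
    blended = [(2.0 * a + b) / 3.0 for a, b in zip(g_l, G_l)]
    return pywt.waverec(blended, wavelet)
```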
IV. EXPERIMENTAL RESULTS<br />
In order to evaluate the performance of the proposed method, we test it in both a waveform coding system and a parameter coding system. Test audio sequences with different features are used in our experiments, and the wavelet function used is the DAUBECHIES wavelet.
A. Results in the Waveform Coding<br />
The first part compares the results of the proposed method with the original audio error concealment based on pattern matching [2]. Without loss of generality, we assume that a data packet consists of one frame of the signal, so the packet loss rate equals the frame error rate (FER). The recovery quality of the audio signal is measured by the SNR (signal-to-noise ratio) [8].
Firstly, a pure background music signal with a sampling frequency of 8 kHz is used as the test sequence. Each frame has 160 sampling values and the length of each frame is 20 ms. The pattern matching method uses 80 samples for the template. The wavelet decomposition level is 3.
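The SNR figure used throughout the tables can be computed as the ratio of signal energy to reconstruction-error energy. The sketch below is one common definition and is an assumption, since the paper only cites [8] for the measure.

```python
import numpy as np

def snr_db(original, concealed):
    """SNR in dB between the original signal and the error-concealed signal."""
    noise = original - concealed
    return 10.0 * np.log10(np.sum(original ** 2) / (np.sum(noise ** 2) + 1e-12))
```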
Table 1 gives the SNR results of the concealed audio signal under different FERs (frame error rates). From Table 1 we can see that the proposed method achieves a much higher SNR than the pattern matching method in [2], for both bilateral and unilateral concealment, under all five FERs. The results in Table 1 also show that the recovery quality of the background music using bilateral concealment is better than that using unilateral concealment, because more adjacent information is utilized in the bilateral concealment.
Fig. 4 shows the comparison curves of the different methods. It is clear that the concealed audio quality is improved under every FER compared with the pattern matching method in [2].
Secondly, a music signal with a sampling frequency of 16 kHz is used as the test sequence. Each frame has 640 sampling values and the length of each frame is 40 ms.
TABLE I.
SNR (dB) FOR THE PURE BACKGROUND MUSIC

FER(%)   Bilateral concealment   Unilateral concealment   D. Goodman's Method in [2]
2        24.199                  22.439                   19.775
5        19.449                  18.879                   10.611
8        15.838                  14.733                   9.3641
10       14.478                  12.809                   9.3382
15       13.238                  11.533                   6.1407
The number of samples in the template is 160 and the wavelet decomposition level is 3. The concealment results are shown in Table 2 and Fig. 5, which show that the recovery quality of the proposed method is again better than that of the pattern matching method for music under different FERs. Comparing the data in Table 2 with the data in Table 1, we can also see that the recovery results for the pure background music are better than those for the music, because more correlation exists in the pure background music.
Figure 4. Comparison of different methods in SNR for pure background music
TABLE II.
SNR (dB) FOR THE MUSIC

FER(%)   Bilateral concealment   Unilateral concealment   D. Goodman's Method in [2]
1        17.243                  17.183                   15.854
2        15.691                  15.362                   13.942
3        15.086                  14.605                   13.290
5        12.072                  11.746                   11.142
10       7.9959                  7.4146                   7.9177
Figure 5. Comparison of different methods in SNR for music
Finally, a speech signal with a sampling frequency of 16 kHz is used as the test sequence. Each frame has 320 sampling values and the length of each frame is 20 ms. The number of samples in the template is 160 and the wavelet decomposition level is 3. The concealment results are shown in Table 3 and Fig. 6.
From Table 3 and Fig. 6, we can see that the method in this paper improves the SNR of the audio signal compared with the method based on pattern matching [2], and that bilateral concealment is better than unilateral concealment in most cases.
It is also obvious from Fig. 6 that the recovery quality of the proposed method improves much more than that of the pattern matching method as the FER increases. The test sequence here is a speech signal, so there may be pauses in the conversation between two persons. The pattern matching method searches for matching samples before the lost frame within a limited range; if the amplitude of the speech signal is weak in that range, the matching result will be very poor. The proposed method instead makes use of two adjacent frames, which may have a strong correlation with the lost frame, to conceal it and thus obtains better recovery results.
TABLE III.
SNR (dB) FOR THE SPEECH AS FER CHANGED

FER(%)   Bilateral concealment   Unilateral concealment   D. Goodman's Method in [2]
1        15.365                  15.365                   15.251
2        15.153                  15.155                   14.815
3        15.058                  15.066                   14.730
5        11.545                  11.535                   10.070
10       10.989                  10.981                   9.1442
Figure 6. Comparison of different methods in SNR for speech signal
B. Results in the Parameter Coding<br />
The second part of our experiments compares the results of the proposed method with the CELP-based audio error concealment technique [3]. The CELP-based technique uses a number of characteristic parameters of the audio signal to recover the lost audio, so the restoration of the audio waveform is biased and recovery results measured by the SNR value are not accurate. In this case, PESQ (Perceptual Evaluation of Speech Quality) is often used to measure the recovery quality of the audio [9].
The background sound signal, with a sampling frequency of 16 kHz and a frame length of 20 ms, is used as the test sequence. Table 4 and Fig. 7 give the comparison of the proposed method and M. Chibani's method [3] in PESQ. From Table 4 and Fig. 7, we can see that the proposed method gives better concealment results than M. Chibani's method under different FERs, and that bilateral concealment is better than unilateral concealment in most cases. In the special case of FER = 1%, unilateral concealment is better than bilateral concealment because the lost signal is more similar to the previous samples.
TABLE IV.
PESQ UNDER DIFFERENT FER

FER(%)   Bilateral concealment   Unilateral concealment   M. Chibani's Method in [3]
1        4.1581                  4.1807                   3.6821
5        4.1513                  3.2195                   2.9083
10       3.2072                  3.1100                   2.8060
15       3.0433                  2.9181                   2.7699
20       3.0323                  2.9120                   2.7664
Figure 7. Comparison of different methods in PESQ
C. Results <strong>of</strong> Using Different Decomposition Level<br />
The above two parts give the experimental results of the proposed audio error concealment algorithm compared with existing audio error concealment methods. Those experiments are done with a fixed wavelet decomposition level. The wavelet decomposition level is a factor that affects the recovered audio quality, so selecting a suitable level is important. We choose the wavelet decomposition level n according to experimental results.
The background sound signal, with a sampling frequency of 16 kHz and a frame length of 20 ms, is taken as the test sequence. When the FER is 5%, the SNR values vary for different n; the specific results are shown in Table 5 and Fig. 8. The experimental data in Table 5 and the curves in Fig. 8 show that the maximum SNR value is obtained when the level n is 7. However, the larger the level, the higher the computational complexity, and the SNR varies very little with the decomposition level n. Therefore, the best choice of the decomposition level is 3, considering both reconstruction quality and complexity.
V. CONCLUSIONS<br />
In this paper, we propose an audio error concealment<br />
algorithm based on the wavelet decomposition and<br />
reconstruction. This algorithm includes two cases which<br />
TABLE V.
SNR (dB) WITH DIFFERENT n (FER = 5%)

n    Bilateral concealment   Unilateral concealment
3    19.449                  18.879
5    19.359                  18.709
7    19.491                  18.887
9    19.335                  18.497
11   19.339                  18.497
Figure 8. Comparison of using different n
are unilateral concealment and bilateral concealment<br />
respectively.<br />
The proposed algorithm uses information from the old frames, or from both the old and future frames, of the lost frame to conceal it. Compared with traditional audio error concealment algorithms, it can obtain more information related to the lost frame. In addition, the proposed algorithm can be used in both waveform coding systems and parameter coding systems, so it can be widely applied.
The experimental results show that the proposed algorithm recovers the lost audio signal better than the traditional methods: the SNR and PESQ values of the reconstructed audio signal are improved. The results also show that the bilateral treatment of the wavelet coefficients restores the lost audio signal better than the unilateral treatment in most cases. However, the bilateral concealment method introduces a small delay, which makes it unsuitable for strictly real-time applications such as interactive audio.
In general, the algorithm proposed in this paper reaches the expected recovery quality for the audio signal.
ACKNOWLEDGMENT<br />
This work was supported by the National Natural Science Foundation of China under Grants 60832002 and 61171078, in part by the Research Fund for the Doctoral Program of Higher Education of China under Grant 20110061110084, and by the Outstanding Youth Foundation of Jilin University under Grant 200905018.
REFERENCES<br />
[1] O. J. Wasem, D. J. Goodman, C. A. Dvorak and H. G. Page, "The effect of waveform substitution on the quality of PCM packet communications", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 3, pp. 342-348, Mar. 1988.
[2] D. Goodman, G. Lockhart and W. C. Wong, "Waveform Substitution Techniques for Recovering Missing Speech Segments in Packet Voice Communications", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, pp. 1440, Dec. 1986.
[3] Mohamed Chibani, Roch Lefebvre, Philippe Gournay, "Fast Recovery for a CELP-Like Speech Codec After a Frame Erasure", IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2485-2495, 2007.
[4] Farshad Lahouti, A. K. Khandani, "Soft Reconstruction of Speech in the Presence of Noise and Packet Loss", IEEE Transactions on Audio, Speech and Language Processing, vol. 15, pp. 44, Jan. 2007.
[5] Dongmei He, Wen Gao, "Complexity Scalability Audio Coding Algorithm Based on Wavelet Packet Decomposition", IEEE International Conference on Signal Processing, pp. 659, 2000.
[6] Pramila Srinivasan, Leah H. Jamieson, "High-Quality Audio Compression Using an Adaptive Wavelet Packet Decomposition and Psychoacoustic Modeling", IEEE Transactions on Signal Processing, vol. 46, pp. 1085, Apr. 1998.
[7] Fang Chen, Wei Li, Xiaoqiang Li, "Audio quality-based authentication using wavelet packet decomposition and best tree selection", International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 1265, August 2008.
[8] Nurgun Erdol, Claude Castelluccia and Ali Zilouchian, "Recovery of Missing Speech Packets Using the Short-Time Energy and Zero-Crossing Measurements", IEEE Transactions on Speech and Audio Processing, vol. 1, pp. 295, Jul. 1993.
[9] Antony W. Rix, John G. Beerends, Michael P. Hollier, et al., "Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs", IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 749-752, May 2001.
Fei Xiao was born in Harbin, China, in 1985. She received the B.S. degree from Jilin University in 2008. She is now pursuing her M.S. degree in the School of Communication Engineering, Jilin University, China. Her research interests include audio signal processing, error concealment and digital video processing.
Hexin Chen was born in Jilin, China, in 1949. He received the M.S. and Ph.D. degrees in communication and electronics in 1982 and 1990 from Jilin University of Technology, respectively.
He was a visiting scholar at the University of Alberta from 1987 to 1988. From Feb. 1993 to Aug. 1993, he was a visiting professor at Tampere University of Technology in Finland. He is currently a professor of communication engineering. His research interests include image and video coding, multidimensional signal processing, image and video retrieval, and audio and video synchronization.
Yan Zhao was born in Jilin, China, in 1971. She received the B.S. degree in communication engineering in 1993 from Changchun Institute of Posts and Telecommunications, the M.S. degree in communication and electronics in 1999 from Jilin University of Technology, and the Ph.D. degree in communication and information systems in 2003 from Jilin University.
She was a postdoctoral researcher at the Digital Media Institute of Tampere University of Technology in Finland from Mar. 2003 to Dec. 2003. From Mar. 2008 to Aug. 2008, she was a visiting professor at the Institute of Communications and Radio-Frequency Engineering at the Vienna University of Technology. She is currently an associate professor of communication engineering. Her research interests include image and video coding, multimedia signal processing and error concealment for audio and video transmitted over unreliable networks.
Dr. Zhao is a member of IEEE.
Reputation Based Academic Evaluation in a<br />
Research Platform<br />
Kun Yu<br />
Department of Computer Engineering, Huaiyin Institute of Technology, Huaian, Jiangsu, China
Email: varguard@yeah.net<br />
Jianhong Chen<br />
Department of Computer Engineering, Huaiyin Institute of Technology, Huaian, Jiangsu, China
Email: chenjianhong@hyit.edu.cn<br />
Abstract—Researchers face a huge amount of information in their daily work, and it is hard for them to screen out valuable information from such a large volume of data. The reputation of literature, publications or scholars can help researchers relieve this burden and advance their research ability. In this paper, the screening problem is addressed in an implemented research platform. Reputation is modeled by synthesizing four elements: literature, author, source and reader. Perceptible interactions, such as references, comments and P2P communication, are treated as the relationships between each pair of elements and help improve the accuracy of reputation. The reputation we build is similar to the impact factor and PageRank, but it is more complex and is expected to be more robust in realistic environments, which is supported by simulations. An iterative algorithm is introduced to evaluate reputation in a distributed mode. Simulations prove the practicality and effectiveness of the proposed scheme.
Index Terms—academic evaluating; reputation model;<br />
impact factor<br />
I. INTRODUCTION<br />
Researchers today benefit from the remarkable development of computer science and communication technology, but they must also face new problems brought about by the information explosion, namely the rapid increase of papers, web pages and other new media. Scientific research becomes more convenient and, at the same time, more difficult. How to find the most appropriate information is becoming a key factor in the success of research [1].
Traditionally, face-to-face interaction is the most common and trustworthy way to gain knowledge and clues for the next step of research. Direct talk plays the role of a content sifter, but the speed of knowledge propagation is slow and limits the efficiency of research output. New computer and communication technologies, such as BBS, Email, instant messaging and the WWW, facilitate knowledge acquisition in modern studies. A main way to gain new knowledge in academic research is reading scientific documents, which can easily be obtained from online databases. In the meantime, researchers can exchange their ideas through Email, BBS or other internet tools. These tools show a great advantage over traditional methods if researchers can easily get the information they really need.
Reputation, or rank, is usually an efficient way to help researchers choose the most suitable content. Among applications of this kind, the impact factor [2] and PageRank [3] are perhaps the two most famous, and both have been used widely. Not by chance, both use a similar algorithm that regards the relationship between two elements as a recommendation which finally contributes to the reputation of the receptor: in the impact factor algorithm references play this role, and in PageRank hyperlinks do. However, in both reputation models the recommendations belong to a single class; that is, each recommendation carries only a weight, with no difference in kind. For example, in the impact factor the citation of a paper brings the paper a weighted recommendation that depends only on the IF of the journal the paper is published in. As we will see in the next section, this is not always appropriate when more factors are taken into consideration. In those cases, a recommendation is characterized not only by the importance of the referrer but also by the class of the recommendation. A multi-dimensional reputation makes it possible to introduce more clues and evaluate reputation more exactly.
In this paper, we develop a cooperative research platform, CRP. The platform helps researchers and learners meet their needs with the least effort. The system includes a research forum and a P2P communication tool. Literature indexes and user comments are the main parts of the research forum, and the P2P tool facilitates the exchange of ideas and reviews about a paper or a research work. There are therefore many clues about the relationship between any two elements, such as comments, access times and the friend list in the P2P tool. The friend list is especially important because it expresses a direct evaluation of someone else, without the bias of review articles, and it includes factors that do not directly affect academic value but concern interest similarity, reliability, participation degree and incentive mechanism.
Searching is a key function of the platform, including paper search and user search. It combines keyword search with reputation to sort information by quality. A user can also search for people who suit his research field or his unsolved problem. Reciprocity and trustworthiness are also considered: the reputation is used to filter out users who cannot be trusted enough, and users with similar interests can be located based on their behaviors and the relationship network. In fact, searching provides an incentive mechanism: a user with higher reputation has more chances to take part in interactions and to be helped, so he is willing to contribute more to the system.
II. BACKGROUND<br />
Reputation is a metric of entity quality which can only be evaluated by others who once interacted with the entity, directly or indirectly; the indirect interaction is called recommendation. In academic research, the most representative reputation measure is SCI [2]. However, in calculating the SCI impact factor only periodicals and references are considered, which cannot distinguish different literatures of the same journal [4]. The distribution of the IF is not even, so papers in some fields cannot have a high IF, which hampers comparison between different domains [5]. In the past few years, several new measures have been put forward which provide readers more choices than SCI to evaluate a paper [6].
Usually, academic reputation is not judged on a single quantitative variable but by synthesizing multiple subjective indexes. Index Copernicus Scientists [7], besides providing scientists with global networking and international research collaboration, presents a multi-parameter career assessment system which analyses the researcher's individual profile. The Journal to Field Impact Score (JFIS) [8] is an alternative system for journal impact evaluation whose sources include literatures, technical reports, notes and reviews. With an extended data source, Castelnuovo focused on the reputation of a single researcher, called the Single Researcher Impact Factor [9].
PageRank [3,10] is a more efficient reputation evaluation algorithm, initially used for web page ranking. Compared with the impact factor, which only takes citation counts into account, PageRank introduces weighted links to improve the validity of the information recommendation [11]. Google Scholar is a special application of PageRank to the academic field with a broader range of open data sources: books, technical reports and so on. Popular journals, such as review journals with little prestige, can have a very high IF and a very low weighted PageRank. EngerTrust [12] models reputation through the concept of recommendation, which is similar to the hyperlink and the citation, and also weighs recommendations by the recommender's credibility, like weighted citation analysis. Recently some new methods for reputation or trust models have been proposed as well [13][14].
A recommender system [15] is a useful tool to help users research or learn. Among such systems, the reputation-based method is a promising one; it usually combines user reputation, bias, behavior models and the relationships among different users to filter information efficiently and present results that satisfy the user's demand. Recommendation is also a strong incentive for users, because better results are only presented to users with better reputation.
Our scheme is a modified version of PageRank and EngerTrust and takes users, papers, comments and private relationships into the reputation computation. All these data can be easily obtained in the platform by systematic means. This paper examines how to integrate multiple clues of different qualities into a complete reputation metric and discusses a feasible reputation evaluation algorithm. The details of the academic platform are described first, in the next section.
III. BASIC ACADEMIC RESEARCH PLATFORM<br />
The platform includes an academic forum and a P2P user network. The forum is a thought-exchange platform for researchers studying literature. In this forum each literature is a basic item, and other researchers post replies to feed back their personal assessments. In fact, the researchers' explicit feedback can serve as a kind of literature reputation, which can be evaluated more accurately by combining the impact factor of the literature's source with the author's reputation, comments, comment reputation and so on.
Fig. 1 Platform Structure<br />
The forum is a public platform for researchers to exchange their ideas. The forum database stores literatures that have been formally published in journals or conference proceedings. Other users can search for them and get their detailed information from the web pages of the platform. Users can also post comments giving personal opinions about articles or about the comments posted by other users. When a user posts a comment, he is required to mark the commented item at the same time; marks can be regarded as the weight of the recommendation.
A user can use the P2P tool to communicate directly with another user if the latter is willing to interact with him. This is a kind of private exchange that can only be seen by the participants. However, it is very important for reputation evaluation, because the user can evaluate the partner not only by the explicit reputation but also by his personal judgment of the partner, formed through reading the articles the partner has published and the interaction history with him. Such judgment is usually more pertinent than the calculated reputation.
The reputation module is the core of the system: it computes reputation by interacting with multiple datasets and software modules and stores the results in the reputation database, as shown in Fig. 2.
Fig. 2 Modules and structure of the academic platform
A search engine provides users with a literature filter. The user first inputs key words and restrictive conditions, and the system then works out a literature list sorted by reputation. When the user selects one or more items to read, the system records the hits and the executor of each hit in the background. Search and peer user selection use the reputation data in the reputation database. Search, comment and P2P communication use the network and user interaction module to transfer data and show the results on the screen.
In order to improve the accuracy of reputation evaluation and mitigate cheating and malicious behaviors, user identities are divided into two classes: authorized and unauthorized. An authorized user is the author of a formally published paper and has passed an email check: the checker first sends an authentication message to the corresponding email address of the paper; if the checker later receives a reply from that address, he labels the user as authorized. Otherwise, the user is an unauthorized one.
IV. REPUTATION COMPUTATION ALGORITHM<br />
Fig.3 Reputation model<br />
The literature, the user and the comment each has its own reputation, but each is related to the others and its reputation depends on the others' reputations, as shown in Fig. 3. This is quite different from PageRank, which has only one kind of element, the web page. So the basic strategy is to compute the local reputations first and then integrate them, iterating the process until every element reaches a stable status. The P2P network is a partly independent system, but it also contributes to reputation computing: although users gain their P2P reputations only from the relationships in the P2P network, these reputations are also introduced into the reputations in the forum.
The above model is too complex, so in order to calculate user reputation effectively it needs some simplification. Referring to the PageRank and EngerTrust reputation models, a literature review or a citation can be regarded as a recommendation relationship whose weight is decided by the presenter's reputation and the total number of recommendations, as shown in Fig. 4.
Fig.4 Construction of recommendation relation graph
A. Literature Reputation Model<br />
Literature reputation $R_p$ shall consider at least three factors: authors, citations and comments. Formula (1) gives the reputation evaluation of literature $i$:

$R_p(i) = \lambda_1 R_s(s(i)) + \lambda_2 R_a(a(i)) + \lambda_3 R_c(i) + (1-\lambda_1-\lambda_2-\lambda_3) R_r(i)$   (1)

Here, $d$ is a decay factor which is usually set to 0.85. $R_s$ is the initial reputation of the journal or proceedings the article is published in; usually it is proportional to its impact factor, and if it has no impact factor, its initial reputation is 0. $R_a$ is the reputation of the first author, defined in formula (3). $R_c(i)$ is the score from the comments about the article, as shown in formula (4). $R_r$ denotes the citation value of the article:

$R_r(i) = \sum_{j \in ref(i)} \frac{R_p(j)}{|ref(j)|}$   (2)

where $|ref(j)|$ denotes the number of citations of literature $j$.
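A direct transcription of formulas (1) and (2) is sketched below. The dictionary-based layout (papers keyed by id, with 'source', 'author' and 'refs' fields) and the reading of ref(j) as the reference list of paper j, in analogy with PageRank, are illustrative assumptions rather than the platform's actual schema.

```python
L1, L2, L3 = 0.2, 0.3, 0.2  # lambda_1..lambda_3 (the values used later in the simulations)

def citation_value(i, papers, Rp):
    # Formula (2): sum over papers j citing i of Rp(j) / |ref(j)|.
    return sum(Rp[j] / max(len(papers[j]['refs']), 1)
               for j in papers if i in papers[j]['refs'])

def literature_reputation(i, papers, Rs, Ra, Rc, Rp):
    # Formula (1): weighted mix of source, author, comment and citation reputation.
    p = papers[i]
    return (L1 * Rs[p['source']] + L2 * Ra[p['author']]
            + L3 * Rc.get(i, 0.0)
            + (1 - L1 - L2 - L3) * citation_value(i, papers, Rp))
```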
B. User Reputation Model<br />
User reputation reflects the user's academic authority and the academic value of his articles and reviews. User reputation has two sources: the forum and the P2P network. There are two kinds of user reputation: that of an author and that of a common reader. Fig. 4 shows that the relation between two users is built on comments or articles. Therefore, we can calculate the two kinds of reputation by formula (3):
$R_a(u) = \frac{\lambda_4 \sum_i R_p(i) + (1-\lambda_4) \sum_j R_c(j) + R_a^p(u)}{2}$   (3)

Here, $i$ ranges over the papers issued by the author $u$ and $j$ over the comments about $u$. $R_a^p(u)$ is the reputation gained from the P2P network. We simply assume that the reputation from the forum is as important as that from the P2P network.
C. Comment Reputation Model<br />
Comment reputation of a comment $k$ includes two parts: the commentator's reputation and the comments on this comment by other readers:

$R_c(k) = \lambda_5 \frac{R_a(a(k))\, Val(p(k), a(k))}{|comments(a(k))|} + (1-\lambda_5) \sum_{i \in sub(k)} \frac{R_c(i)\, Val(k, i)}{|comments(i)|}$   (4)

where $|comments(a(k))|$ denotes the total number of comments the user $a(k)$ has posted, $Val(p(k), a(k))$ is the score $a(k)$ marks on $p(k)$, and $|comments(i)|$ is the total number of comments posted by user $i$.
The comments form a tree structure: a father comment can have several child comments, and each child may have children of its own. The reputation formula is therefore recursive: a leaf node in the comment tree is first evaluated using only $R_a$, and its father then combines $R_a$ with the reputations of the child comments just obtained. The variables in the formulas above are listed in Table 1.
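The recursion of formula (4) over the comment tree can be sketched as below. The Comment structure and the interpretation of |comments(·)| as the number of comments posted by the relevant author are illustrative assumptions, since the paper only defines these terms informally.

```python
from dataclasses import dataclass, field
from typing import List

L5 = 0.5  # lambda_5

@dataclass
class Comment:
    author: str                     # a(k)
    mark: float                     # Val(parent, author): score given to the parent item
    children: List["Comment"] = field(default_factory=list)

def comment_reputation(k, Ra, n_comments):
    """Formula (4), evaluated bottom-up over the comment tree.
    Ra: author reputations; n_comments: comments posted per author."""
    own = Ra[k.author] * k.mark / max(n_comments[k.author], 1)
    children = sum(comment_reputation(c, Ra, n_comments) * c.mark
                   / max(n_comments[c.author], 1)
                   for c in k.children)
    return L5 * own + (1 - L5) * children
```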
D. Reputation Computing in P2P Network<br />
In the P2P user network, each user has several friends and all the friend relationships construct a friend network. Each edge in the network has a weight equal to the reputation of the partner, so a PageRank-like algorithm can be used here. Different from PageRank, however, the friend relationship is bidirectional, so it needs to be transformed into two directed relationships with opposite directions, as shown in Fig. 5. Assume that $a$ is a friend of $b$; the reputation of $a$ is denoted by $R_a^p(a)$ and $out(a)$ denotes the number of friends of $a$. When the link between $a$ and $b$ is divided into two directed connections, the connection from $a$ to $b$ has a weight of $R_a^p(a)/out(a)$.
Reputation can be computed by an iterative algorithm. In the algorithm, the basic formula of $u$'s reputation is:

$R_a^p(u) = \sum_{v \in friend(u)} \frac{R_a^p(v)}{out(v)}$   (5)

In the iteration, $u$ is assigned an initial reputation equal to $R_a(u)$, the reputation of $u$ computed by the forum module.
Fig.5 Transformation of friend relationship
The evaluation algorithm is detailed below:<br />
    for each u ∈ S:  R_a^p(u)_0 = R_a(u)      // S is the set of users having at least one friend
    while ( |R_a^p(u)_i − R_a^p(u)_{i−1}| > ε ) {
        for each u ∈ S:
            R_a^p(u)_i = Σ_{v ∈ friend(u)} R_a^p(v)_{i−1} / out(v)
    }
Because every node has at least one friend, each node has at least one directed connection to another user, so there is no dangling problem in this network.
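A runnable version of the iteration in (5) is sketched below. The friends argument is assumed to be a dict mapping each user in S to the non-empty list of his friends, and the initial values come from the forum reputation Ra, as stated above; the stopping threshold and iteration cap are illustrative.

```python
def p2p_reputation(friends, Ra, eps=1e-6, max_iter=100):
    """Iterate formula (5) until the largest change falls below eps."""
    rep = {u: Ra[u] for u in friends}              # initial values from the forum
    for _ in range(max_iter):
        new = {u: sum(rep[v] / len(friends[v]) for v in friends[u])
               for u in friends}
        if max(abs(new[u] - rep[u]) for u in friends) <= eps:
            return new
        rep = new
    return rep
```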
E. Dangling Problem<br />
Similar to the dangling-page problem in PageRank, a user who has no paper citing other papers and has posted no comment is called a dangling user, and a dangling paper is a paper without any citation. A comment always has an outgoing link, so it has no dangling problem. Dangling entities can disturb the reputation evaluation of other entities.
Two methods can solve the problem and keep the reputation computation convergent: (1) add a virtual link from the dangling entity to all other available entities (a user can link to literatures and comments, a literature can link to literatures); (2) ignore all converse links pointing to the entity in the computation until it adds a new link to another entity, for example when the user posts a comment.
In fact, it is hard to imagine that a paper without any citation is high in quality, so deleting such a paper directly has little effect on the accuracy of the reputation evaluation.
F. Globe Reputation Algorithm and Convergence<br />
Evidently, the reputation variables in (1)~(4) are not independent, so reputation computing is an iterative process that must converge. The adopted algorithm follows the idea of recommendation networks. The evaluation algorithm is detailed in Fig. 6.
    10  Procedure Evaluation
    20  Begin
    30    For each user u and literature i: Ra(u) = 1 and Rp(i) = 1
    40    Calculate initial reputation Rp(i) by formulas (1), (2) and (5)
    50    Store the reputations of all entities t in R(t)
    60    Recalculate user reputation Ra(u) with formula (3)
    70    Recalculate literature reputation Rp(i) with formula (1)
    80    Recalculate comment reputation Rc(k) with formula (4)
    90    Store the reputations of all entities t in R'(t)
    100   If max_{∀t} |R(t) − R'(t)| > ε goto 50
    110 End
Fig. 6 The iterative reputation evaluation algorithm<br />
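The outer loop of Fig. 6 is essentially a fixed-point iteration over the three update formulas. The hedged skeleton below passes the three updates in as callables (their bodies are formulas (1), (3) and (4) above) and assumes, purely for illustration, that entity ids start with 'u', 'p' or 'c'.

```python
def evaluate_all(entities, update_user, update_paper, update_comment, eps=1e-4):
    """Fixed-point iteration of Fig. 6: repeat the three updates until stable."""
    rep = {t: 1.0 for t in entities}                             # line 30
    while True:
        prev = dict(rep)                                         # line 50
        for t in entities:
            if t.startswith('u'):
                rep[t] = update_user(t, rep)                     # line 60, formula (3)
            elif t.startswith('p'):
                rep[t] = update_paper(t, rep)                    # line 70, formula (1)
            else:
                rep[t] = update_comment(t, rep)                  # line 80, formula (4)
        if max(abs(rep[t] - prev[t]) for t in entities) <= eps:  # line 100
            return rep
```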
G. Trade<strong>of</strong>f Scheme<br />
The simulation below proves the convergence of the algorithm, but at a large computational cost. A tradeoff scheme is therefore proposed: the reputation computing steps, such as steps 60, 70 and 80, are only triggered when a new event happens. The event may be the submission of a literature or comment, or the registration of a user or periodical. When it happens, the corresponding formula is
called once but does not trigger the iterative process. This means that only the reputation of the entity the new entity directly links to changes, and the change does not diffuse to other entities.
The reputation renewal is local, which greatly reduces the computing cost. On the other hand, the tradeoff scheme cannot guarantee the accuracy of the reputation: the speed of convergence to the global reputation depends on the events related to each entity. To speed up convergence, the global iterative algorithm may be run periodically.
V. INITIAL REPUTATION<br />
Formula (1) can be used to compute the initial literature reputation, where $R_s(j)$, the reputation of journal or proceedings $j$, is obtained from literature databases. For now we choose the most well-known databases, including SCI journals, the EI source and the list of Chinese core journals of PKU. The normalized computation is as in (6):

$R_s(j) = \begin{cases} 1 - 0.5^{factor(j)} & j \in \text{SCI} \\ 0.3 & j \in \text{EI} \\ 0.1 & j \in \text{core journal list of PKU} \end{cases}$   (6)

Then, with the initial literature reputations above, the initial user reputations can be computed by (3). An authenticated user has an initial reputation corresponding to the reputation of the journal where he has published a paper; an unauthenticated user is assigned a reputation of 0.01:

$R_a(u) = \begin{cases} R_s(j) & u \text{ is authorized and } j \text{ is the journal} \\ 0.01 & u \text{ is unauthorized} \end{cases}$   (7)
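Formulas (6) and (7) translate directly into code; the journal metadata fields ('index', 'factor') and the user field 'authorized' are illustrative assumptions about the data layout.

```python
def journal_reputation(j):
    # Formula (6): j carries its index ('SCI'/'EI'/'PKU') and its impact factor.
    if j['index'] == 'SCI':
        return 1.0 - 0.5 ** j['factor']
    if j['index'] == 'EI':
        return 0.3
    return 0.1                      # core journal list of PKU

def initial_user_reputation(user, journal=None):
    # Formula (7): authorized users start from their journal's reputation.
    if user.get('authorized') and journal is not None:
        return journal_reputation(journal)
    return 0.01
```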
VI. SIMULATIONS AND PERFORMANCE ANALYSIS<br />
The simulations extract literature indexes randomly from the SCI and EI databases to construct the test base dataset, which covers the period from 1995 to now and comes mainly from the field of computer science. The first authors of the papers in the base dataset are added to the test user set. The dataset contains approximately 33,500 articles with 510,000 citations, and about 15,000 authors. Two authors are considered identical if their full names match. A few citations from an article to another article published outside the paper set were removed. At the same time, the test generates a number of users who are not authors and only issue comments without publishing new literature; the proportion of such users to authors is 2. Some comments are randomly issued for each literature, including direct comments on the literature and indirect comments, i.e., comments on other comments. The numbers of direct and indirect comments follow the Poisson distribution P(1). Commentators are chosen randomly from all users. Here, λ1=0.2, λ2=0.3, λ3=0.2, λ4=0.8, λ5=0.5. The comment mark level is an integer selected from 0 to 5.
We first run an experiment to check the convergence of the global iterative algorithm. The experimental papers are selected randomly from the base dataset. The number of documents in the base dataset is called the dataset scale. At first, we specify a series of dataset scales from 400 to 33,500 and run the global algorithm to calculate the reputations. The end condition is ε
Fig. 8 The availability of the local algorithm
Fig. 8 shows that after the test started, the accuracy of the local reputation evaluation kept improving. Here s is in fact the update speed of the reputation. When s increases, the evaluation precision increases. But the improvement cannot continue once the deviation drops below a certain level. This means the local algorithm cannot achieve better performance unless other methods are adopted. A feasible way is to collect all information and compute the reputations of all entities periodically, for example on weekends or after ten o'clock at night every Monday.
VII. CONCLUSION
We present a new reputation evaluation mechanism that can be used to evaluate the reputation of literatures and researchers. The basic idea is that there are relationships among a literature, its author, periodicals and readers. These relationships can be regarded as mutual recommendations, in some ways similar to the hyperlinks used by Google. Therefore, we build a similar reputation model and propose a PageRank-like algorithm. By simulations, we show that the mechanism is feasible and convergent. We also discuss strategies for applying the algorithm to gain better performance. A tradeoff scheme is proposed that replaces the global reputation with a local one. The outstanding advantage of the local scheme is its simplicity and decentralization. Therefore, it is easy to deploy in a huge forum system.
Some issues remain unsolved. One issue is whether the reputation model and the evaluation method can accurately indicate the academic value of literatures or researchers, and what the standard should be. Although the impact factor is widely accepted, whether it is the most scientific measure is in doubt. Another problem is how to decide the values of the parameters, such as λ1~λ5; here the values are specified subjectively. Lastly, because comments are a key factor of the reputation model, how to encourage users to issue their reviews is future work.
Kun Yu was born in Huaian, Jiangsu, China, in 1972. He received the Ph.D. degree in computer science and technology from Southeast University, China, in 2010. Currently, he is a teacher at Huaiyin Institute of Technology, China. His research interests include computer networks and protocol analysis.
Jianhong Chen was born in Huaian, Jiangsu, China, in 1972. He received his Ph.D. degree in Computer Science and Technology from Shanghai Jiao Tong University in 2011. Currently, he is a teacher at Huaiyin Institute of Technology, China. His research interests include public key cryptosystems and network security.
Analyzing ChIP-seq Data based on Multiple<br />
Knowledge Sources for Histone Modification<br />
Dafeng Chen<br />
School <strong>of</strong> Information Science, Nanjing Audit University, Nanjing 211815, China<br />
Email: windking@nau.edu.cn<br />
Deyu Zhou<br />
School <strong>of</strong> Computer Science and Engineering, Southeast University, Nanjing 210029, China<br />
Email: zhoudeyu@gmail.com<br />
Yuliang Zhuang<br />
School <strong>of</strong> Information Science, Nanjing Audit University, Nanjing 211815, China<br />
Email: Zhuangyl@nau.edu.cn<br />
Abstract—ChIP-seq is able to capture the genomic pr<strong>of</strong>iles<br />
for histone modification by combining chromatin<br />
immunoprecipitation (ChIP) with next generation<br />
sequencing. However, enriched regions generated from peak<br />
finding algorithms are evaluated only based on the limited<br />
knowledge acquired from manually examining the relevant<br />
biological literature. This paper proposes a novel<br />
framework <strong>of</strong> incorporating multiple knowledge sources,<br />
consisting <strong>of</strong> information extracted from biological<br />
literature, Gene Ontology, and microarray data, in order to<br />
precisely analyze ChIP-seq data for histone modification.<br />
The information is combined in a unified probabilistic<br />
model to rerank the enriched regions generated from peak<br />
finding algorithms. Through filtering the reranked enriched<br />
regions using a predefined threshold, more reliable and precise results can be generated. The combination of the multiple knowledge sources with the peak finding algorithm produces a new paradigm for ChIP-seq data analysis.
Index Terms—ChIP-seq, histone modification, reranking,<br />
information extraction<br />
I. INTRODUCTION<br />
Histones, acting as spools around which DNA winds, are the chief protein components of chromatin. Histones are subject to many posttranslational modifications, such as lysine acetylation, lysine and arginine methylation, serine and threonine phosphorylation, and lysine ubiquitination and sumoylation [1]. Histone modifications may alter the electrostatic charge of the histone, resulting in a structural change in histones or in their binding to DNA. Histone modifications may also be the binding sites for protein recognition modules which recognize acetylated or methylated lysines. Overall, histone modifications affect chromosome function in many ways. Thus, posttranslational modifications of histones create a mechanism for the regulation of a variety of normal and disease-related processes.
ChIP-seq [2], which combines chromatin immunoprecipitation<br />
(ChIP) with next generation sequencing, is able<br />
to capture the genomic profiles for histone modification and transcription factor (TF) binding. It is characterized by high resolution, cost effectiveness and few complications. A large amount of data has recently been generated using the ChIP-seq technique, therefore calling for new analysis algorithms.
To discover the exact locations <strong>of</strong> TF binding sites<br />
from ChIP-seq data, a number <strong>of</strong> algorithms, such as<br />
CisGenome [3], MACS [4], PeakSeq [5], QuEST [6], sPP<br />
[7], Useq [8] and SISSRs [9], have been proposed. TF<br />
binding is mainly governed by sequence specificity.<br />
Therefore TF binding sites are typically correlated with<br />
very localized ChIP-seq signals in the genome. In contrast, many modification marks consist of broad domains, which are believed to stabilize the chromatin state. Moreover, the signals for histone modifications, histone variants and histone-modifying enzymes are usually diffuse and lack well-defined peaks, spanning from several nucleosomes to large domains encompassing multiple genes. As such, peak-finding algorithms designed to find TF binding sites with strong local enrichment are unsuitable for discovering these generally weak signals from DNA modification marks.
To the best of our knowledge, only a few methods, e.g. ChIPDiff [10] and SICER [11], have been published that focus on analyzing ChIP-seq data specifically for histone modification. ChIPDiff attempts to identify differential histone modification sites by computationally comparing two ChIP-seq libraries generated from different cell types. Rather than simply partitioning the genome into bins and computing the fold-change of the number of ChIP fragments in each bin, ChIPDiff models the correlation between consecutive bins as a hidden Markov model (HMM) whose transition probabilities are trained automatically in an unsupervised manner. By inferring the states of histone modification changes using the trained HMM parameters, the correlation between consecutive bins is taken into account. Nevertheless, ChIPDiff cannot compare more than two ChIP-seq libraries. Instead of
comparing two ChIP-seq libraries, SICER partitions the genome into non-overlapping windows of fixed size. Islands (potential ChIP-enriched domains) are identified as clusters of eligible windows separated by gaps smaller than a predetermined threshold. A clustering method is then employed to score each island.
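The window-and-island idea just described can be sketched as follows; the eligibility rule (a simple read-count threshold) and the gap limit are simplified placeholders and not SICER's actual statistical model.

def find_islands(window_counts, window_size, min_reads=5, max_gap=3):
    # Keep "eligible" windows (read count above a threshold) and merge them
    # into islands when the gaps between them are at most max_gap windows.
    eligible = [i for i, c in enumerate(window_counts) if c >= min_reads]
    islands, start, prev = [], None, None
    for i in eligible:
        if start is None:
            start = prev = i
        elif i - prev - 1 <= max_gap:           # gap small enough: extend the island
            prev = i
        else:                                    # gap too large: close the island
            islands.append((start * window_size, (prev + 1) * window_size))
            start = prev = i
    if start is not None:
        islands.append((start * window_size, (prev + 1) * window_size))
    return islands                               # list of (start, end) genomic coordinates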
After discovering enriched regions using a peak finding algorithm, validation of the results is typically performed based on limited knowledge acquired from the biomedical literature, such as experimentally validated genes related to the histone modification. It is also possible to validate the correctness of the discovered enriched regions through qPCR (real-time quantitative polymerase chain reaction) experiments, but this is costly and labor intensive and is therefore seldom adopted in practice. Thus, the prevailing approach to validating the discovered enriched regions is the former, which uses limited knowledge acquired from the biomedical literature. However, it suffers from the following drawbacks:
• Amount of knowledge for validation. Most knowledge for validation is obtained by hand-curating the relevant experimental results described in the biomedical literature, which is laborious, time consuming, and error-prone. Moreover, it has been demonstrated that the biomedical literature is growing at a double-exponential pace; it thus becomes extremely hard for biologists to keep up with the most up-to-date knowledge from the biomedical literature.
• Source of knowledge for validation. Existing approaches mainly use knowledge extracted from the biomedical literature for validation. It is worthwhile to exploit knowledge from other sources, such as results from microarray data analysis or knowledge inferred from Gene Ontology.
• Handling of contradictory knowledge. It is possible that the results discovered by peak finding algorithms contradict the knowledge obtained from the biomedical literature. There is a lack of effective methods for handling such a situation.
This paper explores an efficient way to improve the precision of genome-wide chromatin modification profiles. A framework that incorporates information extraction into a probabilistic model for reranking discovered enriched regions (candidate histone modification sites) is comprehensively investigated. To improve the histone modification site discovery results, external knowledge sources, such as information extracted from the biomedical literature, microarray data, and Gene Ontology, are employed to re-score enriched regions. The rationale behind this is that the biomedical literature, microarray data, and Gene Ontology are reliable resources for describing gene expression levels in specific cell lines, while histone modifications are major epigenetic factors regulating gene expression. Therefore, there is a causal relationship between histone modifications and the
knowledge sources which can be used to improve the<br />
accuracy <strong>of</strong> discovered histone modification sites.<br />
II. RELATED WORK<br />
This section presents the existing work in two areas,<br />
information extraction for genes regulated by histone<br />
modification, and reranking based on multiple knowledge<br />
sources.<br />
A. Information Extraction for Genes Regulated by<br />
Histone Modification<br />
A large amount of experimental and computational biomedical data, specifically in the areas of genomics and proteomics, has been generated along with new discoveries, accompanied by an exponential increase in the number of biomedical publications describing these discoveries. In the meantime, there has been great interest within scientific communities in literature mining tools that sort through this abundance of literature and find the nuggets of information, such as protein-protein interactions and gene regulation, which are most relevant and useful for specific analysis tasks.
To mine information from the biomedical literature, two steps are crucial. One is named entity recognition (NER), which recognizes names of biomedical entities such as genes, proteins, cells and diseases. The other is information extraction. In general, current approaches for biomedical information extraction can be divided into three categories: computational linguistics-based methods, rule-based methods, and machine learning and statistical methods.
Kolarik et al. [12] developed an approach for identifying histone modifications in the biomedical literature with Conditional Random Fields (CRFs) and for resolving the recognized histone modification term variants by term standardization. Many systems [13-17], including EDGAR [18], BioRAT [19] and GeneWays [20], have been developed to extract protein-protein interactions from text. To the best of our knowledge, there are no existing approaches focusing on mining gene information regulated by histone modification.
B. Reranking based on Multiple Knowledge Sources<br />
Recently, reranking algorithms have become quite popular in data mining and natural language processing. The idea behind reranking is that some information which is crucial for generating ranking scores is not incorporated in the ranking algorithm used. Therefore, a reranking algorithm is needed to rerank the results by incorporating this information.
For example, documents can be represented in the vector space model used in information retrieval. In traditional information retrieval, given a query q, retrieved documents are presented in decreasing order of their ranking scores with respect to the content information. In addition to content, documents are interconnected to each other through explicit or latent links. Thus, many recent methods take into account link-
based information. However, one of the issues is that these ranking algorithms typically treat the content and link information separately, and each document is assigned a score independent of other documents for the same query. Reranking algorithms leverage the interconnections between documents and entities to improve the ranking of retrieved results [21].
Reranking approaches in the natural language processing domain attempt to improve upon an existing probabilistic parser by reranking the output of the parser. Reranking has benefited applications such as named-entity extraction [22], semantic parsing [23] and semantic role labeling [24]. Most reranking approaches are based on discriminative models, while base parsers are mostly based on generative models. The reason is that generative probability models such as hidden Markov models (HMMs) or hidden vector state (HVS) models provide a principled way of treating missing information and dealing with variable length sentences. On the other hand, discriminative methods such as support vector machines (SVMs) enable us to construct flexible decision boundaries and often result in performance superior to that of generative models. The combination of generative and discriminative models can leverage the advantages of both approaches.
III. PROPOSED FRAMEWORK<br />
The overall process of the proposed framework is shown in Figure 1 and consists of three main steps. Firstly, millions of short reads generated by the deep sequencing platform are mapped to the reference genome; after peak finding, enriched regions are discovered. Secondly, information extraction based on a statistical model aims to extract information about genes which are regulated by histone modification; information about the environment of these regulations is also extracted. Finally, the extracted information is combined with external knowledge sources, such as Gene Ontology and results mined from microarray data, to form the inputs to a probabilistic model, which is then employed for re-ranking the discovered enriched regions.
A. Information Extraction based on the Conditional HVS Model
In order to extract genes regulated by histone modification, the genes need first to be identified through named entity recognition. After that, the genes regulated by histone modification can be extracted through relation extraction. For the first step, CRFs or SVMs can be employed to recognize genes and histone modifications. For the second step, we are particularly interested in relation extraction from the biomedical literature based on the Hidden Vector State (HVS) model. The HVS model was originally proposed in [25] and has been successfully applied in the biomedical domain for protein-protein interaction extraction [26, 27].
Given a model and an observed word sequence W = (w1 ... wT), semantic parsing can be viewed as a pattern recognition problem and the most likely semantic representation can be found through statistical decoding. If we assume that the hidden data take the form of a semantic parse tree C, then the model should be a pushdown automaton which can generate the pair <W, C> through some canonical sequence of moves D = (d1 ... dT).
When considering a constrained form of automaton in which the stack has finite depth and is built by repeatedly popping 0 to n labels off the stack, pushing exactly one new label onto the stack and then generating the next word, we obtain the HVS model, in which conventional grammar rules are replaced by three probability tables. Given a word sequence W, a concept vector sequence C and a sequence of stack pop operations N, the joint probability P(W,C,N) can be decomposed as
where Ct, the vector state at word position t, is a vector of Dt semantic concept labels (tags), i.e. Ct = [Ct[1], Ct[2], ..., Ct[Dt]], where Ct[1] is the preterminal concept label and Ct[Dt] is the root concept label (SS in Fig. 2); nt is the vector stack shift operation at word position t and takes values in the range 0, ..., Dt-1; and Ct[1] = cwt is the new preterminal semantic tag assigned to word wt at word position t.
An example parse tree is illustrated in Figure 2 which<br />
shows the sequence <strong>of</strong> HVS stack states corresponding to<br />
the given parse tree. State transitions are factored into<br />
separate stack pop and push operations constrained to<br />
give a tractable search space. The result is a model which<br />
is complex enough to capture hierarchical structure but<br />
which can be trained automatically from only lightly<br />
annotated data.<br />
The HVS model computes a hierarchical parse tree for<br />
each word string W, and then extracts semantic concepts<br />
C from this tree. Each semantic concept consists <strong>of</strong> a<br />
name-value pair where the name is a dotted list <strong>of</strong><br />
primitive semantic concept labels. For example, the top<br />
part <strong>of</strong> Figure 2 shows a typical semantic parse tree and<br />
the semantic concepts extracted from this parse would be those given in (3):
HistoneM = H3 acetylation<br />
HistoneM.HistoneM = H3K4me3<br />
HistoneM.HistoneM.REL.GENE = IL17<br />
HistoneM.HistoneM.REL.GENE = IL17f (3)<br />
The HVS model parameters are estimated using an EM<br />
algorithm and then used to compute parse trees at runtime<br />
using Viterbi decoding. In training, each word string<br />
W is marked with the set <strong>of</strong> semantic concepts C that it<br />
contains. For example, if the sentence shown in Figure 2<br />
was in the training set, then it would be marked with the<br />
four semantic concepts given in equation 3. For each<br />
word wk <strong>of</strong> each training sentence W, EM training uses<br />
the forward-backward algorithm to compute the<br />
probability <strong>of</strong> the model being in stack state c when wk is<br />
processed. Without any constraints, the set <strong>of</strong> possible<br />
stack states would be intractably large. However, in the<br />
HVS model this problem can be avoided by pruning out<br />
all states which are inconsistent with the semantic<br />
© 2012 ACADEMY PUBLISHER<br />
concepts associated with W. The details <strong>of</strong> how this is<br />
done are given in [25].<br />
The original HVS model takes the form of a generative model, which makes it difficult to incorporate background knowledge or non-local features. We propose to represent the model as a conditionally trained graphical model similar to CRFs. The HVS model can be viewed as a graphical model, assuming the vector state stack depth is limited to 4, that is, there are at most 4 semantic tags (states) relating to each word position. Ct is the vector state corresponding to the word wt. St is the stack shift operation, which consists of popping Nt semantic tags from the previous vector state Ct-1 and pushing one preterminal semantic tag onto the stack, thus producing Ct.
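The stack shift operation just described can be illustrated with a tiny sketch; the tag names in the example are taken from the parse in (3), and the depth limit of 4 is the assumption stated above.

def shift_vector_state(prev_state, n_pop, new_tag, max_depth=4):
    # prev_state is ordered preterminal-first, with the root label ("SS") last
    assert 0 <= n_pop < len(prev_state), "the root label is never popped"
    state = [new_tag] + prev_state[n_pop:]     # pop n_pop labels, push one preterminal tag
    assert len(state) <= max_depth, "stack depth limited to 4 as assumed above"
    return state

# Example: popping one label from [REL, HistoneM, SS] and pushing GENE
# yields [GENE, HistoneM, SS].
print(shift_vector_state(["REL", "HistoneM", "SS"], 1, "GENE"))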
Given a word sequence W, concept vector sequence C<br />
and a sequence <strong>of</strong> stack pop operations N, the conditional<br />
HVS model takes the form<br />
where Θ = {λk, µk, νk} is the parameter vector of the conditional HVS model; fk, gk and hk are arbitrary feature functions over their respective arguments, and λk, µk and νk are the corresponding learned weights for each feature function.
Inference for the conditional HVS model can be performed efficiently with dynamic programming. Parameter estimation can be performed with standard optimization procedures such as iterative scaling, conjugate gradient descent, or a limited-memory quasi-Newton method (L-BFGS).
B. Reranking based on a Probabilistic Model<br />
To rerank the enriched regions generated from a peak<br />
finding algorithm, we need to first select some essential
features based on the multiple knowledge sources.<br />
Suppose the enriched region R has a related gene G. The information extracted from text, IT, the results mined from microarray data, IM, and the information inferred from Gene Ontology, IO, are defined as follows:
• Information extracted from Text IT , for the pair <<br />
Histone Modification, G>, is defined as the<br />
probabilistic score that is generated from the<br />
conditional HVS model.<br />
• Results mined from Microarray, IM is defined as<br />
the expression level results obtained from<br />
microarray data for G.<br />
• Information inferred from Gene Ontology IO<br />
describes the trust level <strong>of</strong> inference that this gene<br />
is regulated by the histone modification. IO is<br />
defined as the score <strong>of</strong> inference based on gene<br />
ontology.<br />
Overall, it can be observed that the higher the values of IT, IM and IO, the stronger the confidence in the correctness of the enriched region. We use the parameters IT, IM and IO to calculate Score, the overall score of the enriched region R. Based on these scores, the enriched regions generated from peak finding are reranked. It should be noted that up to this point the relationship between Score and these parameters is not apparent; it could be linear or nonlinear. We thus investigate several ways to describe this relationship by constructing three models: a log-linear regression model, neural networks, and support vector machines.
1. Log-linear Regression Model
For the log-linear regression model, Score is defined as a combination of the three parameters defined above. To estimate the coefficients β = (βt, βm, βo, β0), the method of least squares is applied: the coefficients β are selected to minimize the residual sum of squares over the M training data points, where Score′ denotes the true value of Score.
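A minimal sketch of the least-squares fit is given below. The functional form log(Score) = βt·IT + βm·IM + βo·IO + β0 is an assumption made here for illustration, consistent with the coefficient names above; the exact equation appears in the original typeset paper.

import numpy as np

def fit_log_linear(IT, IM, IO, true_scores):
    # design matrix [IT, IM, IO, 1]; lstsq minimizes the residual sum of squares
    X = np.column_stack([IT, IM, IO, np.ones(len(IT))])
    y = np.log(true_scores)                     # log Score'
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta                                 # (bt, bm, bo, b0)

def rescore(beta, IT, IM, IO):
    bt, bm, bo, b0 = beta
    return np.exp(b0 + bt * IT + bm * IM + bo * IO)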
2. Neural Networks
The central idea of neural networks is to extract linear combinations of the inputs as derived features, and then model the target as a nonlinear function of these features. The model based on neural networks has the form
where X = (IT, IM, IO) and ωm, m = 1, 2, ..., M, are unit 3-vectors of unknown parameters.
3. Support Vector Machines
Support vector machines produce nonlinear boundaries by constructing a linear boundary in a large, transformed version of the feature space. The model based on support vector machines has the form
where hm(X), m = 1, ..., M, are basis functions and X = (IT, IM, IO).
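For the neural-network and SVM variants, one concrete (but not authoritative) way to fit nonlinear Score predictors on the three features is with off-the-shelf scikit-learn regressors, as sketched below; the hyperparameters are illustrative only.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

def fit_rerankers(features, scores):
    X = np.asarray(features)        # rows of (IT, IM, IO)
    y = np.asarray(scores)
    nn = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000).fit(X, y)
    svm = SVR(kernel="rbf").fit(X, y)
    return nn, svm

def rerank(model, regions, features):
    # sort enriched regions by the predicted Score, highest first
    preds = model.predict(np.asarray(features))
    return [r for _, r in sorted(zip(preds, regions), key=lambda p: -p[0])]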
IV. EXPERIMENTAL RESULTS<br />
The proposed framework for analyzing ChIP-seq data based on multiple knowledge sources for histone modification is evaluated in two parts: information extraction, and re-ranking based on multiple knowledge sources.
The information extraction system works as follows. At the beginning, abstracts are retrieved from MEDLINE and split into sentences. Gene names and other biological terms are then identified based on a pre-constructed biological term dictionary, and histone modifications are identified using a classification model. After that, each sentence is parsed by the semantic parser employing the conditional HVS model. Finally, information about genes related to histone modification is extracted from the tagged sentences using a set of manually defined simple rules. An example of the procedure is illustrated in Figure 3.
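The pipeline can be summarized in the following skeleton; all six arguments are placeholders for the components named above (the MEDLINE abstracts, the sentence splitter, the dictionary-based NER, the histone-modification classifier, the conditional HVS parser, and the manually defined rules), so this is only an outline of the data flow.

def extract_histone_gene_pairs(abstracts, split_sentences, tag_terms,
                               classify_mods, hvs_parse, apply_rules):
    pairs = []
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            genes = tag_terms(sentence)          # dictionary-based NER
            mods = classify_mods(sentence)       # histone-modification classifier
            if not genes or not mods:
                continue                         # keep only informative sentences
            parse = hvs_parse(sentence)          # conditional HVS semantic parse
            pairs.extend(apply_rules(parse))     # e.g. (H3K27me3, INK4A) style pairs
    return pairs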
To investigate the performance of the information extraction system, abstracts from PubMed and PubMed Central are selected. Based on the search keyword "h3k4me3", 211 abstracts are retrieved. Similarly, 731 abstracts are retrieved from PubMed based on the search keyword "histone h3 lysine 4 methylation". Abstracts about "h3k9ac" and "h3k27me3" are also retrieved in the same way. These abstracts are split into sentences. The sentences with at least one gene or protein name and a histone modification are kept, and other sentences are filtered out. All the kept sentences are collected as the input for the information extraction system. After the parsing process described in Section 3, a list of histone modification-gene name pairs is generated. An example of a histone modification and gene name pair is given in Figure 3. To evaluate the precision of the extracted pairs of histone modifications and genes, annotators qualified to PhD level worked on the abstracts and the extracted pairs. The evaluation results show that the information extraction system achieved a precision as high as 73.2%, so the extracted pairs can be further annotated by experienced researchers to ensure their correctness with little effort.
To investigate the performance of the proposed framework, we worked on the ChIP-seq data for the histone modifications "H3K4Me3", "H3K9Ac" and "H3K27Me3" in three cell lines, EB, MK and HUVEC. The data were generated by Dr. Willem Ouwehand's research group in the Department of Haematology of the University of Cambridge. The short reads are mapped to the reference genome using the Maq program. After mapping, the enriched regions are generated using a peak finding program; here, SISSRs [9] is employed. Part of
the enriched regions and their related genes for “H3K9Ac” in EB cell line are listed in Table 1.<br />
TABLE 1<br />
AN EXAMPLE OF ENRICHED REGIONS AND THEIR RELATED GENES GENERATED FROM THE PEAK FINDING ALGORITHM<br />
For the enriched regions ranked and selected by the peak finding algorithm, there are several possible changes of score, as shown in Table 2. As the purpose of analyzing ChIP-seq data is to make novel discoveries, regions of type II or IV, whose final re-ranking scores fall in the medium range, receive more attention. In the following, two examples are given to illustrate how regions of type II and IV are discovered; they also show the feasibility of our proposed framework.
For type II, a region from position 87046200 to 87062199 on chromosome 16 is assigned a score of 1178 based on the number of short reads mapped to the region. The ChIP-seq data are generated for H3K9Ac in the EB cell line, and the gene related to the region is ENSG00000179588 (ZFPM1). However, we cannot find the pair of H3K9Ac and ZFPM1 in the list of pairs generated from the information extraction system. Based on the search keywords "ZFPM1" and "Histone", no results at all are retrieved from PubMed. Moreover, no information from microarray data or Gene Ontology is found to support this high-score region. Based on our proposed framework, the region's score is decreased, and more attention will be paid to the region and the related gene.
For type IV, a detailed example is shown in Figure 4. Firstly, thousands of enriched regions are discovered from ChIP-seq data by a peak finding algorithm. Among the output regions, one region is initially not considered an enriched region because of the low score it receives from the peak finding algorithm. However, if we check the related gene against other biologists' findings based on microarray data, experimental results described in the biomedical literature, and the Gene Ontology, the region would be enriched by H3K27me3. In particular, based on the sentence "The results from these studies showed that H3K27me3 is associated primarily with the INK4A, and not the ARF, locus in the explanted fibroblasts", the pair of H3K27me3 and INK4A is extracted by the information extraction system mentioned above. Such an error may be ascribed to the peak finding algorithm's inability to process diffuse signals. By employing the reranking model, the region is assigned a new score and will be considered an enriched region. From this example, we speculate that employing the reranking model based on the multiple knowledge sources can improve the recall and reliability of the enriched region detection results.
V. CONCLUSION<br />
In this paper, we have presented a novel framework <strong>of</strong><br />
incorporating multiple knowledge sources in order to<br />
precisely analyze ChIP-seq data for histone modification.<br />
Information extracted from text, Gene Ontology, and<br />
knowledge mined from microarray data are combined in
a unified probabilistic model to rerank the enriched<br />
regions detected from peak finding algorithms. By<br />
filtering the reranked enriched regions, more reliable and<br />
precise results are generated. A case study has been<br />
presented to illustrate its feasibility. In future work we will continue the development of the gene expression data clustering component and the gene ontology inference component, and conduct large-scale experiments to evaluate the system performance.
VI. ACKNOWLEDGEMENT<br />
We would like to thank Augusto Rendon and Peter<br />
Smethurst for constructive suggestions on the proposed<br />
framework and Sylvia Nünberg for providing the ChIP-<br />
seq data. This article is supported by the social science fund project of Jiangsu, whose ID is 12DDB011.
REFERENCES<br />
[1] Alejandro Vaquero, Alejandra Loyola, and Danny<br />
Reinberg. The constantly changing face <strong>of</strong> chromatin.<br />
Sci.Aging Knowl. Environ, 2003, 2003.<br />
[2] Elaine R Mardis. Chip-seq: welcome to the new frontier.<br />
Nature Methods, (4):613 – 614, 2007.<br />
[3] Hongkai Ji, Hui Jiang, Wenxiu Ma, David S Johnson,<br />
Richard M Myers, and Wing H Wong. An integrated<br />
s<strong>of</strong>tware system for analyzing chip-chip and chip-seq data.<br />
Nature Biotechnology, 26:1293–1300, 2008.<br />
[4] Yong Zhang, Tao Liu, Clifford Meyer, Jerome Eeckhoute,<br />
David Johnson, Bradley Bernstein, Chad Nussbaum,<br />
TABLE 2<br />
ENRICHED REGIONS BEFORE AND AFTER RE-RANKING<br />
Richard Myers, Myles Brown, Wei Li, and X Shirley Liu.<br />
Model-based analysis <strong>of</strong> chip-seq (macs).Genome Biology,<br />
9(9):R137, 2008.<br />
[5] Joel Rozowsky, Ghia Euskirchen, Raymond K Auerbach,<br />
Zhengdong D Zhang, Theodore Gibson, Robert Bjornson,<br />
Nicholas Carriero, Michael Snyder, and Mark B Gerstein.<br />
Peakseq enables systematic scoring <strong>of</strong> chip-seq<br />
experiments relative to controls. Nature Biotechnology,<br />
(27):66 – 75, 2009.<br />
[6] Anton Valouev, David S Johnson, and Andreas Sundquist.<br />
Genome-wide analysis <strong>of</strong> transcription factor binding sites<br />
based on chip-seq data. Nature Methods, 5:829–834, 2008.<br />
[7] Peter V Kharchenko, Michael Y Tolstorukov, and Peter J<br />
Park. Design and analysis of chip-seq experiments for dna-binding
proteins. Nature Biotechnology, 26:1351-1359,<br />
2008.<br />
[8] David A Nix, Samir J Courdy, and Kenneth M Boucher.
Empirical methods for controlling false positives and<br />
estimating confidence in chip-seq peaks. BMC<br />
Bioinformatics, 9(523), 2008.<br />
[9] Raja Jothi, Suresh Cuddapah, Artem Barski, Kairong Cui, and Keji Zhao. Genome-wide identification of in
vivo protein-dna binding sites from chip-seq data. Nucleic
Acids Research, 36:5221–5231, 2008.<br />
[10] Han Xu, Chia-Lin Wei, Feng Lin, and Wing-Kin Sung. An<br />
hmm approach to genome-wide identification <strong>of</strong><br />
differential histone modification sites from chip-seq data.<br />
Bioinformatics, 24(20):2344–2349, October 2008.<br />
[11] Chongzhi Zang, Dustin E. Schones, Chen Zeng, Kairong<br />
Cui, Keji Zhao, andWeiqun Peng. A clustering approach<br />
for identification <strong>of</strong> enriched domains from histone<br />
modification chip-seq data. Bioinformatics, 25(15):1952–<br />
1958, August 2009.<br />
[12] Corinna Kolarik, Roman Klinger, and Martin H<strong>of</strong>mann-<br />
Apitius. Identification <strong>of</strong> histone modifications in<br />
biomedical text for supporting epigenomic research. BMC<br />
Bioinformatics, 10:S28, 2009.<br />
[13] L. Wong. PIES, a protein interaction extraction system. In<br />
Proceedings <strong>of</strong> the Pacific Symposium on Biocomputing.,<br />
pages 520–531, Hawaii, U.S.A, 2001.<br />
[14] Christian Blaschke and Alfonso Valencia. The Frame-<br />
Based Module <strong>of</strong> the SUISEKI Information Extraction<br />
system. IEEE Intelligent Systems, 17(2):14–20, 2002.<br />
[15] I. Donaldson, J. Martin, B. de Bruijn, and C. Wolting.<br />
PreBIND and Textomy–mining the biomedical literature<br />
for protein-protein interactions using a support vector<br />
machine. BMC Bioinformatics, 4(11), 2003.<br />
[16] Jung-Hsien Chiang, Hsu-Chun Yu, and Huai-Jen Hsu.GIS:<br />
a biomedical text-mining system for gene information<br />
discovery. Bioinformatics, 20(1):120–121, 2004.<br />
[17] Syed Toufeeq Ahmed, Deepthi Chidambaram, Hasan<br />
Davulcu, and Chitta Baral. IntEx: A Syntactic Role Driven<br />
Protein-Protein Interaction Extractor for BioMedical Text.<br />
In Proceedings <strong>of</strong> the ACL-ISMB Workshop on Linking<br />
Biological Literature, Ontologies and Database 2005,<br />
pages 54–61, 2005.<br />
[18] TC Rindflesch, L Tanabe, JN.Weinstein, and L. Hunter.<br />
EDGAR: extraction <strong>of</strong> drugs, genes and relations from the<br />
biomedical literature. In Proceedings <strong>of</strong> Pacific<br />
Symposium Biocomputing, pages 517–28, 2000.<br />
[19] David P. A. Corney, Bernard F. Buxton, William B.<br />
Langdon, and David T. Jones. BioRAT: extracting<br />
biological information from full-length papers. Bioinformatics,<br />
20(17):3206–3213, 2004.<br />
[20] Rzhetsky A, Iossifov I, Koike T, Krauthammer M, Kra P, Morris M, Yu H, Duboué PA, Weng W, Wilbur WJ, Hatzivassiloglou V, and Friedman C. GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data. Journal of Biomedical Informatics, 37(1):43-53, February 2004.
[21] Hongbo Deng, Michael R. Lyu, and Irwin King. Effective<br />
latent space graph-based re-ranking model with global<br />
consistency. In Proceedings <strong>of</strong> the Second ACM<br />
International Conference on Web Search and Data Mining,<br />
pages 212–221, Barcelona, Spain, 2009.<br />
[22] M. Collins. Ranking algorithms for named-entity<br />
extraction: Boosting and the voted perceptron. In<br />
Proceedings <strong>of</strong> the Annual meeting <strong>of</strong> the Association for<br />
Computational Linguistics (ACL) 2002, pages 489–496,<br />
2002.<br />
[23] Ruifang Ge and Raymond J. Mooney. Discriminative<br />
reranking for semantic parsing. In Proceedings <strong>of</strong> the<br />
conference <strong>of</strong> the International Committee on<br />
Computational Linguistics and the Association for<br />
Computational Linguistics (COLING/ACL) 2006, pages<br />
263–270, 2006.<br />
[24] Kristina Toutanova, Aria Haghighi, and Christopher D.<br />
Manning. Joint learning improves semantic role labeling.<br />
In Proceedings <strong>of</strong> the Annual meeting <strong>of</strong> the Association<br />
for Computational Linguistics (ACL) 2005, pages 589 –<br />
596, 2005.<br />
[25] Y. He and S. Young. Semantic processing using the hidden<br />
vector state model. Computer Speech and<br />
Language,19(1):85–106, 2005.<br />
[26] Deyu Zhou, Yulan He, and Chee Keong Kwoh. Extracting<br />
Protein-Protein Interactions from the Literature using the<br />
Hidden Vector State Model. International <strong>Journal</strong> <strong>of</strong><br />
Bioinformatics Research and Applications, 4:64–80, 2008.<br />
[27] Xiaopeng Hua and Shifei Ding. Incremental Learning Algorithm for Support Vector Data Description. Journal of Software, Vol 6, No 7 (2011), 1166-1173, Jul 2011.
[28] Deyu Zhou and Yulan He. Discriminative Training of the Hidden Vector State Model for Semantic Parsing. IEEE Transactions on Knowledge and Data Engineering, in press, 2008.
[29] Xixiang Zhang, Guangxue Yue, Xiajie Zheng and Fei Yu. Assigning Method for Decision Power Based on Linguistic 2-tuple Judgment Matrices. Journal of Software, Vol 6, No 3 (2011), 508-515, Mar 2011.
[30] S. M. Masud Karim.Data Exchange: Algorithm for<br />
Computing Maybe Answers for Relational Algebra<br />
Queries.<strong>Journal</strong> <strong>of</strong> S<strong>of</strong>tware, Vol 6, No 1 (2011), 3-9, Jan<br />
2011<br />
Dafeng Chen, male, was born in 1977 and received the master's degree in Engineering from Southeast University, China. He has been a lecturer at the Institute of Information Science and Technology, Nanjing Audit University, since December 2000. As a principal investigator or researcher, he has completed 3 national or ministry-level projects. He has wide research interests, mainly including computer audit, measuring and testing techniques, and intelligent control.
Deyu Zhou, male, received the BS degree in mathematics and the ME degree in computer science from Nanjing University, China, in 2000 and 2003, respectively. In 2009, he received the PhD degree from the School of Systems Engineering, University of Reading, United Kingdom. Currently, he works at the School of Computer Science and Engineering, Southeast University. His interests are statistical methods for mining knowledge from biomedical data.
Yuliang Zhuang, male, Professor, Ph.D., supervisor of Ph.D. candidates. As a principal investigator or researcher, he has completed a number of national or ministry-level projects. He has wide research interests and engages in the study of management information systems, electronic commerce, and logistics and supply chain management, etc.
A New Semi-supervised Method for Lip Contour<br />
Detection<br />
Kunlun Li, Miao Wang, Ming Liu, Ruining Xin, Pan Wang<br />
College <strong>of</strong> Electronic and Information Engineering, Hebei University, Baoding, 071002,China<br />
E-mail: likunlun@hbu.edu.cn, hbhdwm800@sina.com<br />
Abstract—Lip contour detection is regarded as an essential issue in many applications such as personal identification, facial expression classification, and man-machine interaction. Moreover, semi-supervised learning automatically exploits unlabeled data in addition to labeled data to improve the performance of certain machine learning approaches. In this paper, three preprocessing approaches for eliminating lip image noise, i.e., average filtering, bilateral filtering, and edge preserving smoothing, are compared. Furthermore, a hybrid approach combining level set theory and the semi-supervised Fisher transformation for lip contour detection is proposed. Experimental results show that the proposed semi-supervised strategy for lip contour detection is effective.
Index Terms—Semi-supervised learning, level set, lip contour detection, semi-supervised FDA
I. INTRODUCTION<br />
Lip contour detection is a challenging and important issue in computer vision due to the variation of human expressions and environmental conditions. It has numerous applications in computer vision, such as audio-video speech and speaker recognition.
Lip contour detection is an active research topic, and the related methods can be classified into three categories. The first category is threshold based methods [1, 2], which enjoy a central position in image segmentation applications because of their intuitive properties. The second is edge and line oriented approaches [3-5], in which lines or boundaries defined by contrast in luminance, color or texture are detected. The third is hybrid approaches [6], which aim at consistency between regions and region boundaries.
In our previous relevant research, we have proposed an improved level set method for lip contour detection [xx]. It can optimize the gradient information and enhance the accuracy of lip contour detection by combining the YCbCr color space and the Fisher transformation [30]. In [31], we improved the Fisher transformation with semi-supervised learning and combined level set theory with the improved Fisher transformation.
As is well known, the Fisher transformation requires a large amount of manually labeled data. Therefore, in this paper we propose a hybrid approach that combines the improved level set method and the semi-supervised Fisher transformation for lip contour detection [30, 31]. It comprehensively utilizes both labeled and unlabeled pixel information and obtains good contour extraction results even with little label information.
Manuscript received December 11, 2010; revised June 1, 2011; accepted July 1, 2011.
Corresponding author: Kunlun Li, likunlun@hbu.edu.cn
The rest of the paper is organized as follows. In Section 2, we discuss the preprocessing techniques which aim at improving the results of lip contour detection. We compare the three categories of lip contour detection methods in Section 3 and propose a new method combined with semi-supervised learning in Section 4. Finally, we present some experimental results in Section 5 and give the conclusions in Section 6.
II. PREPROCESSING<br />
In this section, we describe the Average filtering,<br />
Bilateral filtering and Edge preserving smoothing<br />
techniques to eliminate noise.<br />
A. Average Filtering
The average filter aims at smoothing the image data, thus eliminating noise. This filter performs spatial filtering on each individual pixel in an image using a square or rectangular window surrounding each pixel.
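A brief illustration of average filtering with OpenCV (the file name and window size are illustrative only):

import cv2
img = cv2.imread("lip.png")              # file name is illustrative
smoothed = cv2.blur(img, (5, 5))         # each pixel becomes the mean of its 5x5 window
cv2.imwrite("lip_mean.png", smoothed)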
B. Bilateral Filtering
In the bilateral filtering method [7, 8], the original value of every point in the image is replaced by an average of the values of nearby pixels with similar gray levels. The simplest commonly used bilateral filter is the shift-invariant Gaussian filter, in which the spatial closeness function and the gray-level similarity function are Gaussian functions of the Euclidean distance in their parameter spaces:
c(\xi, x) = e^{-\frac{1}{2}\left(\|\xi - x\| / \sigma_d\right)^2} \quad (1)
s(\xi, x) = e^{-\frac{1}{2}\left(\|f(\xi) - f(x)\| / \sigma_r\right)^2} \quad (2)
where c(ξ, x) is the spatial closeness between the point ξ and the geometric center x, and s(ξ, x) is the gray-level similarity between the point ξ and the geometric center x.
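Bilateral filtering as in (1)-(2) can be applied directly with OpenCV; sigmaSpace plays the role of σ_d and sigmaColor the role of σ_r, and the concrete values below are illustrative only.

import cv2
img = cv2.imread("lip.png")                                   # file name is illustrative
filtered = cv2.bilateralFilter(img, d=9, sigmaColor=40, sigmaSpace=5)
cv2.imwrite("lip_bilateral.png", filtered)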
The average filter and bilateral filtering are very<br />
common for removing noise from images. Even though noise
reduction can be achieved by these methods, some
valuable information is lost and the details <strong>of</strong> object<br />
boundaries are deformed. In order to solve such problems, we use edge preserving smoothing for
image preprocessing.<br />
Figure 2. (a) Input image; (b) result of bilateral filtering.
C. Edge Preserving Smoothing
Edge preserving smoothing [9] is an adaptive mean filter in which the amount of blurring for each pixel is determined after gathering local information in a specified n × n neighborhood. It is a simple and effective edge preserving smoothing filter with low computation time.
The edge preserving smoothing filter is applied independently to every image pixel using different coefficients. To calculate the coefficients of the convolution mask for every pixel, the Manhattan color distances di, i = 1, ..., 8, between the central pixel and the eight neighboring pixels in a 3 × 3 window are computed and normalized to the range [0, 1]. That is:
d_i = \frac{|R_{ac} - R_{ai}| + |G_{ac} - G_{ai}| + |B_{ac} - B_{ai}|}{3 \times 255}, \quad 0 \le d_i \le 1 \quad (3)
where R_{ac}, G_{ac}, B_{ac} are the RGB values of the central pixel. To compute the coefficients of the filter's convolution mask, the following equation is used:
c_i = (1 - d_i)^p, \quad p \ge 1 \quad (4)
As p gets larger, coefficients with a small color distance from the central pixel increase their relative value compared with coefficients with a large color distance, so the blurring effect decreases. We experimented with p = 1, 3, 5, 10, and a fixed value p = 5 is used for all of our experiments. The central pixel of the convolution mask is set to zero to remove impulsive noise.
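A straightforward (unoptimized) sketch of this filter, following (3)-(4), is given below; it assumes an 8-bit RGB image and leaves the one-pixel border untouched.

import numpy as np

def edge_preserving_smooth(img, p=5):
    # For every interior pixel, weight the 8 neighbours by c_i = (1 - d_i)^p,
    # where d_i is the normalized Manhattan color distance of formula (3);
    # the centre coefficient is zero, as stated in the text.
    img = img.astype(np.float64)
    h, w, _ = img.shape
    out = img.copy()
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = img[y, x]
            weights, values = [], []
            for dy, dx in offsets:
                nb = img[y + dy, x + dx]
                d = np.abs(centre - nb).sum() / (3 * 255.0)   # formula (3)
                weights.append((1.0 - d) ** p)                # formula (4)
                values.append(nb)
            weights = np.array(weights)
            if weights.sum() > 0:
                out[y, x] = (weights[:, None] * np.array(values)).sum(axis=0) / weights.sum()
    return out.astype(np.uint8)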
Figure 3. (a) Original image; (b) result of edge preserving smoothing with p=5; (c) result with p=10; (d)-(f) the RGB pixel values of (a), (b) and (c), respectively.
The preprocessing methods aim at reducing texture and noise while preserving the edges needed for lip contour detection. We compared the average filter, bilateral filtering and edge preserving smoothing, and finally adopted the edge preserving smoothing method for preprocessing.
III. SOME APPROACHES FOR CONTOUR DETECTION<br />
Contour detection methods can be broadly classified into three categories: threshold based methods, edge and line oriented approaches, and hybrid approaches. In this paper we compare the three kinds of methods and present a hybrid approach combining an improved level set method with a semi-supervised FDA algorithm for lip contour detection.
A. Threshold based Method<br />
Threshold segmentation is widely used in image segmentation. The image is regarded as the combination of a target area and a background region. A suitable threshold value is selected to decide whether each pixel of the image belongs to the target or to the background, producing the corresponding binary image.
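For reference, a short NumPy sketch of Otsu's threshold selection, one representative threshold based method used here, might look as follows (the variable names are illustrative).

```python
import numpy as np

def otsu_threshold(gray):
    """Return the grey level that maximises the between-class variance of the
    histogram, splitting the image into background and target."""
    hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
    total = gray.size
    sum_all = float(np.dot(np.arange(256), hist))
    sum_bg, w_bg = 0.0, 0
    best_t, best_var = 0, 0.0
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# binary = (gray > otsu_threshold(gray)).astype(np.uint8)
```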
Figure 4. (a) Gray image; (b) iterative threshold binary image; (c) Otsu threshold method.
The experiments show that the threshold methods perform poorly here, because the difference between the grey-level information of the lip colour and that of the skin colour is small.
B. Edge and Line Features Oriented Approaches
1. Local contour detector<br />
Differential methods<br />
The earliest linear filtering approaches, such as the Sobel, Prewitt, Roberts and Canny detectors [10-13], are based on measures of matching between the pixel values in the neighborhood of each pixel and an edge template. These methods belong to the differential methods. Their most significant limitation is that they cannot distinguish texture edges from region boundaries and object contours.
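As an illustration of such a differential detector, the following sketch convolves the image with the Sobel kernels and thresholds the gradient magnitude; the SciPy convolution and the threshold value are illustrative assumptions, not part of the cited detectors.

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_edges(gray, threshold=50.0):
    """Classic differential edge detection: convolve with the Sobel kernels,
    take the gradient magnitude and threshold it."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    g = gray.astype(np.float64)
    gx = convolve(g, kx)
    gy = convolve(g, ky)
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold
```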
Morphological edge detectors<br />
Mathematical morphology theory was introduced by Matheron for analyzing the geometric structure of metallic and geologic samples, and was first applied to image analysis by Serra [14]. Based on this theory, mathematical morphology, which is built on set operations, provides an approach to the development of nonlinear signal processing operators that incorporate shape information of a signal [15].
Figure 5. (a) Gray image; (b) Sobel; (c) LoG; (d) Canny; (e) Prewitt; (f) Roberts.
In mathematical morphological operations there are always two sets involved: the signal and a structuring element. The shape of a signal is determined by the values that the signal takes on, and the shape information of the signal is extracted by using the structuring element to operate on the signal.
There are two basic morphological operators: erosion and dilation. These operators are usually applied in tandem; opening and closing are two derived operations defined in terms of erosion and dilation. Erosion of a grey-level image F by a structuring element B, denoted F \ominus B, is defined as

(F \ominus B)(m, n) = \min_{(s,t)} \{ F(m+s, n+t) - B(s, t) \}.    (5)
Erosion is a “shrinking” operator: the values of F \ominus B are always less than or equal to the values of F. Dilation of a grey-level image F by a structuring element B, denoted F \oplus B, is defined as

(F \oplus B)(m, n) = \max_{(s,t)} \{ F(m+s, n+t) + B(s, t) \}.    (6)

Dilation is an “expansion” operator: the values of F \oplus B are always greater than or equal to the values of F.
Using the erosion and dilation operators defined above, we can detect the edges of an image F. The edge map E_e(F) is defined as the difference between the original image F and the erosion of F; this is known as the erosion residue edge detector:

E_e(F) = F - (F \ominus B).    (7)

The edge map E_d(F) is defined as the difference between the dilation of F and the original image F; this is known as the dilation residue edge detector:

E_d(F) = (F \oplus B) - F.    (8)
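A minimal sketch of the erosion residue and dilation residue detectors of (7) and (8), assuming a flat square structuring element and SciPy's grey-scale morphology, is given below.

```python
import numpy as np
from scipy.ndimage import grey_erosion, grey_dilation

def morphological_edges(gray, size=3):
    """Erosion-residue (eq. 7) and dilation-residue (eq. 8) edge detectors
    with a flat size x size structuring element."""
    f = gray.astype(np.float64)
    eroded = grey_erosion(f, size=(size, size))    # eq. (5)
    dilated = grey_dilation(f, size=(size, size))  # eq. (6)
    ee = f - eroded       # erosion residue edge detector, eq. (7)
    ed = dilated - f      # dilation residue edge detector, eq. (8)
    return ee, ed
```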
Figure 6. (a) Gray image; (b) dilation; (c) erosion; (d) morphological edge detection.
Traditional morphological algorithms only consider the intensity information of the edge and ignore its direction, so the edge information they use is not comprehensive. This causes two problems: (1) the detected edges are wide and have a low resolution; (2) if the morphological gradient threshold method is used alone to detect edges, part of the low-intensity edges is lost.
Statistical approaches<br />
In this category of methods, we take the EM algorithm and the FDA algorithm as examples.
(1) Expectation-maximization algorithm<br />
Suppose we have a number <strong>of</strong> samples drawn from a<br />
distribution which can be approximated by a mixture <strong>of</strong><br />
Gaussian distributions and we wish to estimate the<br />
parameters <strong>of</strong> each Gaussian and assign each datum to a<br />
particular one. The Expectation-Maximization algorithm provides a framework for this.
Expectation-maximization [16], as expected, works in<br />
two alternating steps. Expectation refers to computing<br />
the probability that each datum is a member <strong>of</strong> each class;<br />
maximization refers to altering the parameters <strong>of</strong> each<br />
class to maximize those probabilities. Eventually it<br />
converges, though not necessarily correctly. The<br />
Expectation step is defined by the following equation:<br />
E[Z_{ij}] = \frac{p(x = x_i \mid \mu = \mu_j)}{\sum_{n=1}^{k} p(x = x_i \mid \mu = \mu_n)} = \frac{e^{-(x_i - \mu_j)^2 / 2\sigma^2}}{\sum_{n=1}^{k} e^{-(x_i - \mu_n)^2 / 2\sigma^2}}    (9)
This equation states that the expectation, or weight, of pixel x_i with respect to partition j equals the probability that x is x_i given that \mu is \mu_j, divided by the sum of the same probability over all k partitions; this yields the right-hand expression for the weights. The \sigma^2 in that expression is the variance of the pixel data. Once the E step has been performed and every pixel has a weight for each partition, the M step, or maximization step, begins. This step is defined by the following equation:
\mu_j \leftarrow \frac{1}{m} \sum_{i=1}^{m} E[Z_{ij}] \, x_i    (10)
This equation states that the partition value j is<br />
changed to the weighted average <strong>of</strong> the pixel values<br />
where the weights are the weights from the E step for this<br />
particular partition.<br />
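The following sketch alternates the two steps on grey-level pixels for a mixture with a shared, fixed sigma; note that it normalises the M step by the sum of the weights, which is the usual form of the update behind equation (10), and that the initialisation is an illustrative assumption.

```python
import numpy as np

def em_segment(pixels, k=2, sigma=20.0, iters=50):
    """Simplified EM for a k-component Gaussian mixture with a shared, fixed
    sigma: alternate the E step of eq. (9) and an M step based on eq. (10)."""
    x = pixels.astype(np.float64).ravel()
    mu = np.linspace(x.min(), x.max(), k)   # illustrative initial partition means
    for _ in range(iters):
        # E step: weight of every pixel for every partition (eq. 9).
        diff = x[:, None] - mu[None, :]
        w = np.exp(-0.5 * (diff / sigma) ** 2)
        w /= w.sum(axis=1, keepdims=True)
        # M step: move each mean to the weighted average of the pixel values.
        mu = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
    labels = w.argmax(axis=1).reshape(pixels.shape)
    return mu, labels
```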
Although the EM algorithm is an effective method in decision-supporting systems, its disadvantage is slow convergence. It is also not well suited to this task, because the difference between the grey-level information of the lip colour and that of the skin colour is small. The experiment results are shown in Figure 7.
Figure 7. The EM algorithm results.
(2) Fisher linear discriminant analysis (LDA)
Fisher linear discriminant analysis (LDA) is a traditional statistical technique that has been widely used and proven successful in many real-world applications. Here we use Fisher linear discriminant analysis to enhance the gradient information of the lip contour so that the lip contour can be obtained clearly [17].
Here, the G and B components of the RGB colour space form a vector x = (G, B)^T that is used to distinguish lip colour from skin colour. In the training process, we manually extract patches of 50 people's lips and skin regions as the training samples. By applying the Fisher transformation to the vector x, we obtain a function that can be used to discriminate the two classes. This function is calculated using the within-class scatter matrix and is defined as:
\mathrm{Fisher}(x) = W^{T} x    (11)

The projection vector is calculated by

W = S_W^{-1} (m_1 - m_2).    (12)

The within-class scatter matrix S_W is defined as

S_W = S_1 + S_2,    (13)

S_1 = \sum_{x \in X_1} (x - m_1)(x - m_1)^{T},    (14)

S_2 = \sum_{x \in X_2} (x - m_2)(x - m_2)^{T}.    (15)
The sample mean vector of each class, m_k, is defined as

m_k = \frac{1}{n_k} \sum_{x \in X_k} x, \quad k = 1, 2,    (16)

where X_k is the set of vectors in the k-th class and n_k is the number of vectors in that class.
Figure 8. (a) Original image; (b) gray image; (c) the result of the FDA algorithm.
Figure 8 shows the results before and after the Fisher transformation. By combining the YCbCr colour space with the Fisher transformation, the approach enhances the gradient information and thus the accuracy of the lip contour detection. However, this method requires a large amount of manually tagged data, which takes a lot of time to label.
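A compact NumPy sketch of this two-class Fisher projection on (G, B) vectors, following equations (11)-(16), might look as follows; the sample containers and the way the projection is applied to an image are assumptions.

```python
import numpy as np

def fisher_projection(lip_samples, skin_samples):
    """Two-class Fisher discriminant on (G, B) colour vectors: returns the
    projection vector W of eq. (12) from the within-class scatter matrix."""
    x1 = np.asarray(lip_samples, dtype=np.float64)   # shape (n1, 2)
    x2 = np.asarray(skin_samples, dtype=np.float64)  # shape (n2, 2)
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)        # eq. (16)
    s1 = (x1 - m1).T @ (x1 - m1)                     # eq. (14)
    s2 = (x2 - m2).T @ (x2 - m2)                     # eq. (15)
    sw = s1 + s2                                     # eq. (13)
    return np.linalg.solve(sw, m1 - m2)              # eq. (12)

# Fisher(x) = w . x (eq. 11) can then be evaluated for every image pixel, e.g.:
# fisher_map = image[..., 1:3].astype(float) @ fisher_projection(lips, skin)
```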
2. Global algorithms<br />
Most of the edge features reviewed in this section are based on local methods, and a serious limitation of these local methods is that the decision of whether a pixel belongs to a contour is based only on a small neighborhood of each point. It is easy to produce images in which local patterns that are visually similar to an edge do not belong to object contours, and vice versa. We therefore take global information into account for contour detection.
Global methods include the level set method and the snake model. The level set method easily tracks topological changes of the objects and is a powerful tool for modelling objects that change with time. The level set based method was proposed by Osher and Sethian; it is an effective computational tool for dealing with the changing geometric topology of a closed, time-evolving moving interface [29].
Given a closed curve C, the whole plane is divided into the exterior and interior domains of the curve. Define a signed distance function \Phi(x, y, t) = \pm d in the plane, where d is the shortest distance from the point (x, y) to the curve C, the sign of the function indicates whether the point lies in the interior or exterior domain of the curve, and t represents time. The curve C can then be represented by the zero level set of the distance function \Phi(x, y, t), that is,

C(t) = \{ (x, y) : \Phi(x, y, t) = 0 \}.
In the evolution process, the curve points always satisfy the following equation:

\Phi(x, y, t) = 0.    (17)
Differentiating both sides of equation (17) with respect to t, we get

\frac{\partial \phi}{\partial t} + \frac{\partial \phi}{\partial x}\frac{dx}{dt} + \frac{\partial \phi}{\partial y}\frac{dy}{dt} = 0.    (18)
Suppose the speed of the points on the curve is F. The level set method restricts the movement of every point on the curve to the curve's normal direction, that is, the gradient direction of \phi at that point, so the evolution of the entire curve can be expressed as

\frac{\partial \phi}{\partial t} - F |\nabla \phi| = 0,    (19)
\phi(x, y, 0) = \phi_0(x, y).    (20)
Equation (19) is called the level set evolution equation, |\nabla \phi| is the gradient norm of the level set function, and F is the velocity function along the normal direction, i.e. the direction of the gradient of the curve at each point. Equation (20) is the initial surface equation.
The level set method is an effective tool for describing curves evolving with a curvature-related speed [8]. It has been widely used in image segmentation and computer vision. It is driven by changes in the image gradient; when the object contour is not clear or the gradient information of the image is too weak, the final detection result can be far from the expected one. The experiment results are shown in Figure 9.
Figure 9. The results of the traditional level set method (final contours after 200 iterations).
The level set method described in this section is based on changes in the image gradient. As Figure 9 shows, the contour detection may not be accurate when the object contour is not obvious or the gradient information is too weak.
IV. HYBRID APPROACHES<br />
From the analysis and experiments above, we can see that the three kinds of methods have their own advantages and shortcomings. The semi-supervised FDA can learn from both labeled and unlabeled data and enhance the gradient information of the lip contour; it can improve the level set method and overcome the limitation of the traditional level set method. In this paper we therefore present a new hybrid approach that combines the semi-supervised FDA method with the level set method.
A. Semi-supervised Fisher<br />
FDA is a classical supervised learning method. In the context of pattern classification, FDA seeks the projection subspace in which the ratio of the between-class scatter to the within-class scatter is maximized. However, labeled data are often time-consuming and expensive to obtain, as they require the effort of human annotators; in contrast, large amounts of unlabeled data are in many cases far easier to obtain. Effectively combining unlabeled data with labeled data is therefore a central task in machine learning.
We first introduce some background on semi-supervised learning. It is known that unlabeled examples can improve the performance of the learned hypothesis. Semi-supervised learning [18, 19] deals with methods for automatically exploiting unlabeled data in addition to labeled data to improve learning performance, where no human intervention is assumed.
In semi-supervised learning, a labeled training set L = {(x_1, y_1), (x_2, y_2), ..., (x_{|L|}, y_{|L|})} and an unlabeled training set U = {x_1', x_2', ..., x_{|U|}'} are presented to the learning algorithm, which constructs a function f: X \to Y for predicting the labels of unseen instances, where X and Y are the input space and output space respectively, x_i, x_j' \in X (i = 1, 2, ..., |L|, j = 1, 2, ..., |U|) are d-dimensional feature vectors drawn from X, and y_i \in Y is the label of x_i. Usually |L| \ll |U|.
Semi-supervised learning is generally regarded as having originated from [20]; in fact, some straightforward uses of unlabeled examples appeared even earlier [18-21]. Owing to the difficulty of incorporating unlabeled data directly into conventional supervised learning methods and the lack of a clear understanding of the value of unlabeled data in the learning process, the study of semi-supervised learning only attracted attention after the mid-1990s. As the demand for automatic exploitation of unlabeled data increased and the value of unlabeled data was disclosed by some early analyses [23, 24], semi-supervised learning has become a hot topic.
In this paper, we improve the traditional FDA [28] by using a semi-supervised learning strategy. The steps of the algorithm are as follows:
a) Find an orthogonal basis of the subspace R(X_k): perform the singular value decomposition of X_k as X_k = U \Sigma V^T. U = [u_1, ..., u_r, u_{r+1}, ..., u_m] is an orthogonal matrix, where r is the rank of X_k. The vector set {u_1, ..., u_r} forms an orthogonal basis of R(X_k), and the matrix P = [u_1, ..., u_r] gives a vector space projection from R^m to the subspace R(X_k).

b) Map the data points into the subspace R(X_l):

x_i \to z_i = P^T x_i, \quad i = 1, ..., l, l+1, ..., n.    (21)

c) Construct the scatter matrices: construct the between-class scatter matrix S_b and the total scatter matrix S_t as

S_b = Z_l B Z_l^T = P^T X_l B X_l^T P,    (22)

S_t = Z_l C Z_l^T = P^T X_l C X_l^T P.    (23)

d) Eigen-problem: when k = n, we use regularization to ensure the nonsingularity of S_t: S_t = S_t + \delta I_r, where \delta (\delta > 0) is the regularization parameter and I_r is the r \times r identity matrix. Compute the eigenvectors with respect to the nonzero eigenvalues of the generalized eigenvector problem

S_b b_i = \lambda_i S_t b_i, \quad i = 1, 2, ..., c - 1.    (24)

We obtain the transformation matrix B = [b_1, ..., b_{c-1}].

e) Construct the adjacency graph: construct the p-nearest neighbor graph matrix W to model the relationship between nearby data points and calculate the graph Laplacian matrix L = D - W.
f) Eigen-problem: compute the eigenvectors with respect to the nonzero eigenvalues of the generalized eigenvector problem

S_b a_i = \lambda \left( \alpha S_t + (1 - \alpha) P^T X L X^T P \right) a_i, \quad i = 1, ..., c - 1.    (25)

There are at most c - 1 nonzero eigenvalues for this eigenvector problem. We obtain the transformation matrix A = [a_1, ..., a_{c-1}].

g) Embedding: the data points can be embedded into the (c - 1)-dimensional subspace by

x \to y = A^T z = A^T P^T x = (P A)^T x.    (26)
B. Hybrid Approaches<br />
Because the Fisher transform requires a lot of time for manually tagging data, we propose semi-supervised FDA learning based on an improved level set method, which effectively solves this problem. The use of a large number of unlabeled data improves the efficiency of lip contour extraction.
The steps of the new method are as follows:
(1) Apply the edge preserving smoothing method to reduce texture and noise while preserving the edges needed for lip contour detection.
(2) Use the semi-supervised FDA method, with both labeled and unlabeled data, to enhance the gradient information of the lip contour. The manually labeled lip colour points and skin colour points are taken as the labeled data; Table I shows a data sample. The pairwise constraints between the data are generated from the manually tagged data according to their class labels, and the unconstrained data are taken directly from the remaining unlabeled samples. After training, the semi-supervised clustering method is used to distinguish lip colour from skin colour; Figure 10 shows the clustering results.
(3) Use a Gaussian filter and morphological erosion and dilation to remove noise and discrete points.
(4) Apply the level set method to the result of the semi-supervised FDA algorithm and obtain the segmentation result.
TABLE I. SOME MANUALLY TAGGED DATA

Lip color pixel point          Skin color pixel point
G     B     label              G     B     label
34    33    0                  79    87    1
33    37    0                  72    78    1
90    109   0                  91    97    1
Figure 10. (a) Original image; (b) result with 20 constraints; (c) result with 50 constraints; (d) noise and discrete points removed; (e) segmentation result.
V. EXPERIMENT<br />
To test the performance of our approach, we built a colour lip image database containing 100 images. The data set includes variations in lighting, facial expression and pose.
Figure 11. Some of the accurate results of the lip contour detection.
Figure 11 shows some of the better results. From the failure cases of our method we observed that there are leakages between the upper lip and the nose, and around the edge of the lip. Figure 12 depicts such poor detection results.
Figure 13. Detection results of the traditional ASM method.
We compared the improved approach with traditional Active Shape Models (ASM). Figure 13 shows the detection results obtained with traditional ASM. The traditional ASM needs a large training set that includes all types of lips, and its accuracy and speed depend on the initialization of the model parameters: an inaccurate initialization enlarges the contour matching scope and slows down detection. It is also often trapped in local optima, whereas the improved level set method avoids the problem of easily falling into local optima.
VI. CONCLUSION<br />
In this paper, we compared three preprocessing methods for eliminating lip image noise, i.e. average filtering, bilateral filtering and the edge preserving smoothing technique. Moreover, two approaches for lip contour detection, i.e. the threshold based method and the edge and line oriented approaches, were also compared. Table II lists the comparison of the three categories of contour detection approaches.
TABLE II. THE COMPARISON OF THE THREE CATEGORIES OF CONTOUR DETECTION APPROACHES

Contour detection method | Example | Advantages | Disadvantages
Threshold based method | Otsu threshold method | Simple and easy to implement | More sensitive to the noise of the input images
Edge and line oriented approach | Differential methods | Fast and conceptually simple | Fail in capturing mid-level and high-level visual cues
Edge and line oriented approach | Statistical approaches | EM: a good way in decision-supporting systems; FDA: conceptually simple; semi-supervised FDA: automatically exploits unlabeled data in addition to labeled data to improve learning performance | EM: slow convergence; FDA: the labeled data often consume much time and are expensive to obtain
Edge and line oriented approach | Global algorithms | Higher performance | Computationally more demanding
Hybrid approaches | Level set method + semi-supervised FDA | Good performance |
Based on the experimental results and the analysis of the advantages and disadvantages of these approaches, we propose a hybrid approach that combines the level set method with the semi-supervised Fisher transformation for lip contour detection. Applying the semi-supervised idea to lip contour detection improves the accuracy of the detected lip contours, and the proposed method efficiently reduces the time complexity and the effort of human annotators. The experimental results demonstrate that the proposed method can improve the accuracy of detecting the lip contours.
ACKNOWLEDGMENT<br />
This work is supported by the National Natural<br />
Science Foundation <strong>of</strong> China under Grant No. 61073121<br />
and the Natural Science Foundation <strong>of</strong> Hebei Province<br />
under Grant No. F2009000215.<br />
REFERENCES<br />
[1] Fu K S, Mui J K: A survey on image segmentation. Pattern Recognition, 13:3-16, (1981).
[2] Murthy C A, Pal S K: Histogram thresholding by<br />
minimizing gray-level fuzziness Information Sciences,<br />
60:107-135(2002).<br />
[3] P.T.Jackway: Morphological scale-spaces,<br />
Adv.Imag.Elec.Phys.119 123-189(2001).<br />
[4] T.McInerney, D.Terzopoulos:Deformable models in<br />
medical image analysis: a survey, Med. Image Anal.1(2)<br />
91-108(1996).<br />
[5] X. Cufi, X.Munoz, J. Freixenet, J.Marti: A review <strong>of</strong><br />
image segmentation techniques integrating region and<br />
boundary information, Adv. Imaging Electron. Phys.120 1-<br />
39(2002).<br />
[6] D. ziou, S. Tabbone, Edge detection techniques-an<br />
overview, Int. J.Pattern Recognit, Image Anal.88 (1994)<br />
297-345.<br />
[7] G. Papari, N. Petkov, P. Campisi, Artistic edge and corner enhancing smoothing, IEEE Transactions on Image Processing, 16(10) (2007) 2449-2462.
[8] C.Tomasi, R.Manduchi, Bilateral filtering for gray and<br />
color images: Proceedings <strong>of</strong> the Sixth International<br />
Conference on Computer Vision, Narosa Publishing<br />
House,Bombay, p.839(1998).<br />
[9] Nikos Nikolaou, Nikos Papamarkos: Color Reduction for<br />
Complex Document Images Vol. 19, 14–26 (2009)<br />
[10] K.Anil, Jain, Fundamentals <strong>of</strong> Digital Image Processing.<br />
Prentice-Hall, Englewood Cliffs (1989).<br />
[11] L Ganesan, P.Bhattacharyya, Edge detection in untextured<br />
and textured images-a common computational<br />
framework,IEEE SMC 27(5)823-834 (1997).<br />
[12] J.F.Canny, A computational approach to edge detection,<br />
IEEE T-PAMI 8(6)679-698(1986).<br />
[13] J.F.Canny, A variational approach to edge detection,<br />
in: M. A. Genesereth (Ed.), Proceedings of the National
Conference on Artificial Intelligence,AAAI Press,<br />
Washington,D.C,1983,PP.54-58,(August).<br />
[14] C. Chu and J. K. Aggarwal: The integration <strong>of</strong> image<br />
segmentation maps using region and edge<br />
information,IEEE Trans. Pattern Anal. Machine Intell., vol.<br />
15, pp. 1241–1252, 1993.<br />
[15] J. Haddon and J. Boyce, “Image segmentation by unifying<br />
region and boundary information,” IEEE Trans. Pattern<br />
Anal. Machine Intell., vol.12, pp. 929–948, 1990.<br />
[16] Arthur Dempster, Nan Laird, and Donald<br />
Rubin.:Maximum likelihood from incomplete data via the<br />
EM algorithm, <strong>Journal</strong> <strong>of</strong> the Royal Statistical Society,<br />
Series B, 39(1):1–38, 1977<br />
[17] Wang Rui, Gao Wen, Ma Jiyong, "An Approach to Robust<br />
and Fast Locating <strong>of</strong> Lip Motion". Chinese <strong>Journal</strong> <strong>of</strong><br />
Computers, pp.866-871, 2001.<br />
[18] X. Zhu. Semi-supervised learning literature survey.<br />
Technical Report 1530, Department <strong>of</strong> Computer Sciences,<br />
University <strong>of</strong> Wisconsin at Madison, Madison, WI, 2006.<br />
[19] O. Chapelle, B. Sch¨olkopf, and A. Zien, editors. Semi-<br />
Supervised Learning. MIT Press, Cambridge,MA, 2006.<br />
[20] B. Shahshahani and D. Landgrebe. The effect <strong>of</strong> unlabeled<br />
samples in reducing the small sample size problem and<br />
mitigating the hughes phenomenon. IEEE Transactions on<br />
Geoscience and Remote Sensing, 32(5):1087–1095, 1994.<br />
[21] W. Hosmer. A comparison <strong>of</strong> iterative maximum<br />
likelihood estimates <strong>of</strong> the parameters <strong>of</strong> a mixture <strong>of</strong> two<br />
normal distributions under three different types <strong>of</strong> sample.<br />
Biometrics, 29(4):761–770, 1973.<br />
[22] R. P. Lippmann. Pattern classification using neural<br />
networks. IEEE Communications, 27(11):47–64, 1989.<br />
[23] D. J. Miller and H. S. Uyar. A mixture <strong>of</strong> experts classifier<br />
with learning based on both labelled and unlabelled data.<br />
In M. Mozer, M. I. Jordan, and T. Petsche, editors,
Advances in Neural Information Processing Systems 9,<br />
pages 571–577. MIT Press, Cambridge, MA,1997.<br />
[24] T. Zhang and F. J. Oles. A probability analysis on the<br />
value <strong>of</strong> unlabeled data for classification problems. In<br />
Proceedings <strong>of</strong> 17th International Conference on Machine<br />
Learning, pages 1191–1198, Stanford, CA, 2000.<br />
[25] Luo J, Chen H, Tang Y. Analysis <strong>of</strong> graph-based semisupervised<br />
regression. In: Proceedings <strong>of</strong> the 5th<br />
International Conference on Fuzzy Systems and<br />
Knowledge Discovery. Ji nan, China: IEEE, 2008. 111-115.<br />
[26] Gong Y C, Chen C L, Tian Y J. Graph-based<br />
semisupervised learning with redundant views. In:<br />
Proceedings <strong>of</strong> the 19th International Conference on<br />
Pattern Recognition. Tampa,USA: IEEE, 2008. 1-4.<br />
[27] Cai D, He X F, Han J W. Semi-supervised discriminant analysis. In: Proceedings of the 11th IEEE International Conference on Computer Vision. IEEE, 2007.
[28] Yang Wu-Yi, Liang Wei, Xin Le, Zhang Shu-Wu: Subspace Semi-supervised Fisher Discriminant Analysis. ACTA AUTOMATICA SINICA, Vol. 35, No. 12, December 2009.
[29] Yang Meng, Wang Guoping, Dong Shihai, "Curves Evolving Based on Level Set Method". Journal of Software, pp. 1858-1864, 2002.
[30] Kunlun Li, Miao Wang, Ming Liu, Aimin Zhao: Improved Level Set Method for Lip Contour Detection. In: Proceedings of the 17th International Conference on Image Processing. Hong Kong: IEEE, 673-676. (2010)
[31] Kunlun Li, Ruining Xin, Miao Wang, Hui Bai. Research of Lip Contour Extraction in Face Recognition. 2011 International Conference on Electronic Engineering, Communication and Management. In: Lecture Notes in Electrical Engineering, Volume 140(2), 333-339. (2011)
Kunlun Li, born in 1962, received his Ph.D. from the School of Computer and Information Technology, Beijing Jiaotong University. Currently he is a professor in the College of Electronic and Information Engineering, Hebei University. His main research interests include pattern recognition and artificial intelligence, machine learning and data mining, information security, biological information and image processing.
Miao Wang was born in 1986 and received her B.E. degree from the College of Electronic and Information Engineering, Hebei University. Currently she is an M.E. candidate in the College of Electronics and Information Engineering, Hebei University. Her main research interests include pattern recognition and image processing.
Liu Ming received the B.S. and M.S. degrees in applied mathematics and computer science from Hebei University, Baoding, China, in 1991 and 2000, and the Ph.D. degree in signal and information processing from Beijing Jiaotong University in 2009. He is currently an Associate Professor in the College of Electronic and Information Engineering, Hebei University. His main research interests include biometrics and computer vision.
Ruining Xin was born in 1986 and received her B.E. degree from Hebei University. Currently she is an M.E. candidate in the College of Electronics and Information Engineering, Hebei University. Her main research interests include pattern recognition and image processing.
Pan Wang was born in 1988 and received her B.E. degree from Hebei University. Currently she is an M.E. candidate in the College of Electronics and Information Engineering, Hebei University. Her main research interests include pattern recognition and image processing.
Trusted S<strong>of</strong>tware Constitution Model Based on<br />
Trust Engine<br />
Junfeng Tian<br />
College <strong>of</strong> Mathematics and Computer Science <strong>of</strong> HeBei University, Baoding, China<br />
Email: tjf@hbu.edu.cn<br />
Ye Zhu and Jianlei Feng<br />
Shijiazhuang Posts and Telecommunications Technical College, Shijiazhuang, China<br />
College <strong>of</strong> Mathematics and Computer Science <strong>of</strong> HeBei University, Baoding, China<br />
Email: zhuye626@163.com, snuboy_2008@126.com<br />
Abstract—The guaranty of trustiness was not sufficiently considered in traditional software development methods, so software developed with those methods lacks effective measures for ensuring its trustiness. Combining agent techniques with the trusted computing support provided by the TPM, a trusted software constitution model based on a Trust Engine (TSCMTE) is demonstrated in this paper. The Trust Engine extends the TCG "chain of trust" into the application layer and cooperates with the TPM to perform integrity measurement of the software entity to ensure static trustiness; it then guarantees the dynamic trustiness of software behavior by verifying whether the dynamic behavior of the software satisfies the trustiness constraints at runtime. For the purpose of improving the accuracy of the trustiness constraints, a strategy for determining the weights of characteristic attributes based on information entropy is proposed. Simulation experiments illustrate that the trustiness of software developed with TSCMTE is improved effectively without performance degradation.
Index Terms—Trusted S<strong>of</strong>tware Constitution, Trust Engine,<br />
Trust Control and Evaluation, Trust View, S<strong>of</strong>tware<br />
Behavior Trace<br />
I. INTRODUCTION<br />
The increase in software size and the complexity of the external environment have resulted in a continuing decline of software quality. Once software failures and malfunctions occur, especially when software is attacked maliciously, they bring tremendous loss to people's work and life. Trusted software largely avoids malfunction or failure even under malicious attacks or system errors [1]. Ensuring the trustiness of software is therefore an inexorable trend in software development and application.
Trusted software constitution technology is an important guarantee of software's correct execution, ensuring that software always works in the intended way and moves in the intended direction [2]. Building on the theory and technology of traditional software, this paper presents a trusted software constitution model based on a Trust Engine (TSCMTE).
The remainder <strong>of</strong> this paper is organized as follows:<br />
Section 2 covers the previous related works on the<br />
research <strong>of</strong> s<strong>of</strong>tware’s trustiness. Section 3 introduces the<br />
framework <strong>of</strong> TSCMTE, and Section 4 presents the<br />
s<strong>of</strong>tware behavior and trustiness evaluation in detail.<br />
Section 5 states the simulation experiments and results.<br />
Finally, Section 6 draws conclusion and outlines future<br />
extension <strong>of</strong> this work.<br />
II. RELATED WORKS<br />
In recent years, the trustiness of software has drawn more and more attention, and a large number of research achievements have accumulated in the field of software trustiness [3].
Formal methods [4] ensure software's trustiness in a rigorous way. Cleanroom Software Engineering [5] places the software development process under statistical control and can be used to develop software with high-reliability certification. Aspect-oriented programming (AOP) [6] can be used to separate the monitoring and controlling concerns when developing trusted software, recording the process of software execution to guarantee trustiness. Qu Yanwen et al. [7] described the trustiness of software in terms of Software Behavior, which is defined by the expectations of the software's correct execution, and classified trustiness into several classes. Lin Huimin et al. [8] carried out formal research on highly trusted software by converting the question "whether the software has the intended characteristics" into the mathematical question "whether the software behavior S satisfies the software character F", ensuring that the behavior of software is always consistent with what is intended. Wang Huaimin et al. [9] proposed a trustworthiness classification specification for software, mainly including the definitions of the trustiness classes and measurements of trusted evidence. Liu Jing [10] discussed how to integrate UTP and UML into a unified modeling system, which not only allows MDA technology to be used to constitute trusted software, but also adopts formal specification of software and model checking techniques in the software
development process to ensure s<strong>of</strong>tware’s trustiness<br />
fundamentally.<br />
From this brief review of research on software trustiness, it can be concluded that existing achievements have clarified the causes of untrusted software and that some countermeasures have been adopted. However, research on the constitution of trusted software remains scarce. To bridge this gap, this paper demonstrates a trusted software constitution model based on a Trust Engine (TSCMTE) and aims to:
1. present the framework of TSCMTE, combining agent techniques with the trusted computing support provided by the TPM;
2. introduce software behavior and trustiness evaluation in detail, including the guaranty of the static integrity of software with the Trust Engine and the definition, representation and extraction of the Software Intended Behavior Trace;
3. propose a strategy for determining the weights of characteristic attributes based on information entropy, to improve the accuracy of the constraints.
III. THE TRUSTED SOFTWARE CONSTITUTION MODEL<br />
BASED ON TRUST ENGINE<br />
The basic idea of TSCMTE is not only to guarantee the static integrity of software, but also to effectively constrain the dynamic behavior during software execution. The trustiness of software is mainly manifested in the trustiness of its static integrity and of its dynamic behavior. On the basis of ensuring static trustiness, the TSCMTE is driven by the software's dynamic behavior: by monitoring the dynamic behavior during software execution, it verifies whether the dynamic behavior is always consistent with the intended behavior, and then adjusts and controls the dynamic behavior where necessary to ensure the dynamic trustiness of the software.
A. Framework <strong>of</strong> TSCMTE<br />
Software based on traditional theories faces two typical security threats. First, the static integrity of the software may be broken, for example by a virus, which changes its dynamic behavior. Second, the software may be illegally injected into or interrupted by other processes during execution, for instance by a buffer overflow attack, which can change the dynamic behavior of the software without breaking its static integrity.
Figure 1. Framework of TSCMTE.
Therefore, in order to identify the above-mentioned<br />
security threats and ensure s<strong>of</strong>tware’s trustiness,<br />
combining agent technique with the support <strong>of</strong> trusted<br />
computing provided by TPM, the framework <strong>of</strong><br />
TSCMTE is proposed, as shown in Fig. 1.<br />
The reasons why the TSCMTE is proposed are listed<br />
as follows:<br />
1. A trusted computer based on the TPM has only temporary trustiness in the initial phase [11]. When software is running, the trustiness of its dynamic behavior cannot be guaranteed if the static integrity has been broken.
2. Monitoring and constraining the dynamic behavior of software is effective for ensuring dynamic trustiness. By injecting the monitoring capability in an appropriate way, the entity agent [12] can autonomously monitor the dynamic behavior and extract the related information together with its context.
The TSCMTE is made up of the Application Software, Trust Engine, TPM and Operating System.
1. Application S<strong>of</strong>tware: the source <strong>of</strong> s<strong>of</strong>tware’s<br />
dynamic behavior.<br />
2. Trust Engine: extends the TCG "chain of trust" into the application layer and cooperates with the TPM to perform integrity measurement of the software entity to ensure static trustiness. Based on agent techniques, by monitoring and extracting the dynamic behavior of the software, it verifies whether the dynamic behavior satisfies the trustiness constraints and then adjusts and controls the software behavior to ensure dynamic trustiness. It is composed of the Trust Monitor, Context, Trust Control and Evaluation, Trust View and the Trusted Communication Agent Interface.
(1) Trust Monitor: it can monitor the process <strong>of</strong><br />
s<strong>of</strong>tware execution to extract related information.<br />
(2) Context: it refers to the necessary conditions for<br />
the operation and interaction <strong>of</strong> s<strong>of</strong>tware, which<br />
includes the time, environmental factors and other<br />
related information. Whether the behavior is<br />
trusted or not is closely related to the specific<br />
context.
(3) Trust Control and Evaluation: it’s responsible for<br />
trustiness evaluation, and then it can adjust the<br />
dynamic behavior and take appropriate measures<br />
to control anomalous behavior according to the<br />
result <strong>of</strong> evaluation.<br />
(4) Trust View: it is defined to represent the characteristics of software behavior.
(5) Trusted Communication Agent Interface (TCAI): it is in effect a security bus with trustiness, privacy and integrity [13], responsible for communication with other modules. That is, it can guarantee the security of identity and ensure that the transmission of messages is invisible to other processes and cannot be modified without authorization.
The Trust Engine, as the core of TSCMTE, interacts with the other modules to ensure the trustiness of software.
3. TPM: a trusted computer based on the TPM has temporary trustiness in the initial phase.
The TSCMTE focuses on the logical framework and the design of the Trust Engine.
B. Trust Engine<br />
Because of the openness of the operational environment, there is inevitably a security threat to the static integrity of software. Following the TCG "chain of trust" and the trusted computing support provided by the TPM [14], the TCG sets up a root of trust in the computer system and builds a chain of trust that extends from the root of trust to the hardware platform and the operating system. The Trust Engine in TSCMTE provides effective measurements to constrain the static integrity of the Software Module and then passes trust on, so that trust can be extended into the whole computer system. The chain of trust of TCG with the Trust Engine is shown in Fig. 2.
Figure 2. Chain of trust of TCG with Trust Engine.
Figure 3. Framework of Trust Control and Evaluation.
When the Application Software is loaded, the Trust Engine first interacts with the TPM to check the static integrity, and trust is then extended from the root of trust into the Application Software. After that, the dynamic behavior of the Application Software during execution is monitored and extracted to ensure dynamic trustiness.
C. Trust Control and Evaluation<br />
The Application Software is the source of Software Behavior. The dynamic behavior monitored by the Trust Monitor must be verified for consistency with the intended behavior; the intended behavior, as the benchmark of evaluation, is therefore the essential prerequisite of the dynamic trustiness constraints. The Context extracted during software execution guarantees the objectivity and credibility of the software behavior.
The proposed Trust Control and Evaluation is made up of the Trusted Network Interface, Event Channel, Evaluation and Trust Control. Its framework is shown in Fig. 3:
1. Trusted Network Interface: network-based software interacts with other entities through the Trusted Network Interface.
2. Event Channel: receives the related information extracted by monitoring the process of software execution.
3. Evaluation: verifies whether the dynamic behavior satisfies the trustiness constraints.
4. Trust Control: according to the evaluation result, adjusts the dynamic behavior and takes appropriate measures to control anomalous behavior.
IV. SOFTWARE BEHAVIOR AND TRUSTINESS EVALUATION<br />
A. Related Definitions<br />
The TCG defines trustiness in terms of the behavior of an entity: an entity can be trusted if it always behaves in the expected manner for the intended purpose [15]. Therefore, as an entity, the trustiness of software depends on whether its behavior is trusted. The related definitions are listed below:
Definition 1. S<strong>of</strong>tware Behavior: S<strong>of</strong>tware behavior<br />
refers to any changes, influences or any operations made<br />
to the other independent entities when the s<strong>of</strong>tware works<br />
as an independent entity [7], that is, s<strong>of</strong>tware is able to<br />
perform its function by consuming computer resources.<br />
Definition 2. Trustiness <strong>of</strong> S<strong>of</strong>tware Behavior: If the<br />
dynamic behavior in the process <strong>of</strong> s<strong>of</strong>tware execution is<br />
always consistent with the intended behavior, it can be<br />
considered as trusted.<br />
Definition 3. S<strong>of</strong>tware Intended Behavior Trace: it is<br />
the representation <strong>of</strong> intended behavior <strong>of</strong> s<strong>of</strong>tware,<br />
which is composed <strong>of</strong> S<strong>of</strong>tware Intended Operation Trace<br />
and S<strong>of</strong>tware Intended Function Trace.<br />
Definition 4. S<strong>of</strong>tware Intended Operation Trace:<br />
represents the intended routes on which some significant<br />
positions are selected orderly as monitor points. It can be<br />
denoted by ordered vectors.<br />
Definition 5. S<strong>of</strong>tware Intended Function Trace:<br />
describes the intended functions performed on monitor<br />
points. It is constituted by a series <strong>of</strong> functions with<br />
related information and also denoted by ordered vectors.<br />
B. S<strong>of</strong>tware Intended Operation Trace<br />
Definition 6. Check Point: It’s a significant point<br />
which was set up as a monitor on the route <strong>of</strong> s<strong>of</strong>tware<br />
execution. It contains two types: Ordinary Check Point<br />
and Branch Check Point. Ordinary Check Point records<br />
function with its related information, and Branch Check<br />
Point records transfer condition and other related<br />
information.<br />
The ordered vector of Check Points guarantees the dynamic trustiness of the software operation trace from the aspect of intended routes. However, where to place the Check Points along the route of software execution is a significant problem. Two considerations apply: 1. from the viewpoint of route coverage, setting up as many Check Points as possible improves the accuracy of the constraints on the software's dynamic operation trace; 2. from the viewpoint of execution efficiency, more Check Points mean extracting and storing more related information, which reduces the efficiency of software execution. A balance between accuracy and efficiency therefore has to be found. In addition, the granularity of the Check Points determines the degree of the software's dynamic trustiness. Check Points are set up on the routes of software execution according to the following rules:
Rule 1: Set up Ordinary Check Points at significant system calls. In order to perform certain functions, most software needs to interact with the kernel through system calls, and system call sequences reflect software behavior to a certain degree [16]; they can therefore be used to evaluate the trustiness of the software's dynamic behavior.
Rule 2: Set up Branch Check Points at each conditional branch, and set up Ordinary Check Points in the body of each branch separately. Because of the non-determinism caused by branches, software is easily attacked at conditional branches and may execute an unexpected branch path that is difficult to detect, so Branch Check Points are necessary for ensuring the trustiness of dynamic behavior.
Rule 3: Set up Ordinary Check Points at the end of each basic function. The results of software execution can be used to evaluate whether the independent Software Module performed the intended operation.
C. S<strong>of</strong>tware Intended Function Trace<br />
Definition 7. Scene: It’s a vector <strong>of</strong> n-tuples which<br />
records the background and function during s<strong>of</strong>tware<br />
execution. Ordinary Check Point contains certain<br />
function with function name, function arguments, CPU<br />
load, memory usage, result <strong>of</strong> s<strong>of</strong>tware execution and so<br />
on; Branch Check Point concludes CPU load, memory<br />
usage, branch data, transfer condition and so on.<br />
Definition 8. Time Interval: It is an interval that<br />
s<strong>of</strong>tware consumes between adjacent Check Points in the<br />
process <strong>of</strong> execution, which can ensure the dynamic<br />
trustiness <strong>of</strong> s<strong>of</strong>tware behavior between adjacent Check<br />
Points.<br />
The vectors which records Scene and Time Interval<br />
can guarantee the dynamic trustiness <strong>of</strong> S<strong>of</strong>tware<br />
Operation Trace from the aspect <strong>of</strong> intended function.<br />
However, for different Check Points, the same attribute,<br />
due to the difference <strong>of</strong> values, contributes to ensuring<br />
the dynamic trustiness <strong>of</strong> S<strong>of</strong>tware Intended Function<br />
Trace differently.<br />
Information entropy is the measure of the uncertainty of a random variable [17]. For the purpose of improving the accuracy of the constraints of dynamic trustiness, a strategy of determining the weights of the characteristic attributes based on information entropy is proposed.
Suppose that there is a set of n samples, denoted by E = {e1, e2, ..., en}, obtained at a certain Ordinary Check Point during software execution. Each sample ei is represented by the vector of characteristic attributes ei = <Function, Argument, CPU, Memory, Result, TimeInterval>. So the matrix E = { e(i,j) }, 1≤i≤n, 1≤j≤6, is the source for determining the weights of the characteristic attributes based on information entropy. The strategy mainly contains the following 5 steps:
1. The probability $p_{ij}$ when the value of attribute $j$ is $e(i,j)$:

$$p_{ij} = \frac{e(i,j)}{\sum_{i=1}^{n} e(i,j)}, \qquad i = 1, 2, \ldots, n \qquad (1)$$
where $e(i,j)$ is the frequency with which attribute $j$ takes the value $e(i,j)$, so that $\sum_{i=1}^{n} e(i,j) = n$.
2. The information entropy of attribute $j$ is:

$$e_j = -k \sum_{i=1}^{n} p_{ij} \ln p_{ij}, \qquad k = \begin{cases} 1 & n_a = 1 \\ \dfrac{1}{\ln n_a} & n_a \ge 2 \end{cases} \qquad (2)$$

where $n_a$ is the count of different values of attribute $j$ and $\ln n_a$ is the maximum value of the information entropy, so $0 \le e_j \le 1$.
3. Let

$$g_j = 1 - e_j \qquad (3)$$

4. The sum of the information entropy of $E = \{e(i,j)\}$ is:

$$E' = \sum_{j=1}^{6} e_j \qquad (4)$$
5. The weight of attribute $j$ after normalization is:

$$\omega_j = \frac{g_j}{6 - E'}, \qquad j = 1, 2, \ldots, 6, \qquad 0 \le \omega_j \le 1, \qquad \sum_{j=1}^{6} \omega_j = 1 \qquad (5)$$
However, the strategy has its limitation: it does not consider the relations among the characteristic attributes, and an intensive study of this will be made in the future.
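The five steps above are easy to script. The following is a minimal Python sketch, assuming the six characteristic attributes of Definition 7; it interprets Eqs. (1)-(2) as the usual entropy over the distinct values an attribute takes, and the sample values are made up for illustration only.

```python
import math
from collections import Counter

ATTRIBUTES = ["Function", "Argument", "CPU", "Memory", "Result", "TimeInterval"]

def entropy_weights(samples):
    """Steps 1-5: value probabilities, per-attribute entropy e_j,
    g_j = 1 - e_j, and normalized weights w_j = g_j / (6 - E')."""
    m = len(ATTRIBUTES)
    e = []
    for j in range(m):
        column = [s[j] for s in samples]
        counts = Counter(column)                   # frequency of each distinct value
        probs = [c / len(column) for c in counts.values()]   # step 1
        n_a = len(counts)                          # number of distinct values
        if n_a == 1:
            e_j = 0.0
        else:
            k = 1.0 / math.log(n_a)
            e_j = -k * sum(p * math.log(p) for p in probs)   # step 2
        e.append(e_j)
    g = [1.0 - ej for ej in e]                     # step 3
    E_prime = sum(e)                               # step 4
    return [gj / (m - E_prime) for gj in g]        # step 5

# hypothetical Scene samples collected at one Ordinary Check Point
samples = [
    ("open", "cfg", "low", "low", "ok", "short"),
    ("open", "cfg", "low", "low", "ok", "short"),
    ("open", "tmp", "high", "low", "ok", "long"),
    ("open", "cfg", "low", "high", "ok", "short"),
]
print(dict(zip(ATTRIBUTES, entropy_weights(samples))))
```

Attributes whose values never vary at this Check Point (here Function and Result) receive the largest weights, since any deviation from them is the most distinguishing.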
D. Extraction of Software Intended Behavior
Extracting the intended behavior of software is crucial to the generation of the Trust View. The methods of extraction generally fall into two categories: static extraction and dynamic extraction. Static extraction does not need to execute the software; it can obtain the control flow of the software by analyzing the source code, but it cannot extract background information [18]. Dynamic extraction obtains the execution model by monitoring the dynamic behavior. The model is incomplete because its generation relies on the input and operation [19], but it contains the context, execution time and related information of software execution.
The TSCMTE makes full use of both static extraction and dynamic extraction. Firstly, static extraction is performed to select some significant points on the intended routes of the software as Check Points, from which the intended operation trace of the software is obtained. Secondly, while the software is executing, dynamic extraction is performed to extract the Scene, the Time Interval and other related information by weaving sensors at the positions of the Check Points. Specifically, a sensor is essentially a program for extracting a function with its related information, and it is triggered automatically. In this way the intended function trace during software execution is obtained.
The intended operation trace and the intended function trace complement each other, and together they constitute the software intended behavior trace accurately. The generation process of the software intended behavior trace is shown in Fig. 4.

Figure 4. Generation process.
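The paper describes a sensor only as a program that extracts a function with its related information and is triggered automatically at a Check Point, so its concrete form is left open. The Python sketch below shows one plausible realization as a decorator that records a Scene-like record (function name, arguments, CPU load, memory usage, result) and the time interval since the previous Check Point; every name in it (check_point, TRACE, the use of os.getloadavg) is an illustrative assumption, not the paper's implementation.

```python
import os, time, resource, functools

TRACE = []            # dynamically extracted behavior trace (list of Scene records)
_last_hit = [None]    # time stamp of the previously reached Check Point

def check_point(func):
    """Hypothetical sensor woven at an Ordinary Check Point: records a Scene
    and the Time Interval relative to the previous Check Point."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        now = time.monotonic()
        interval = None if _last_hit[0] is None else now - _last_hit[0]
        _last_hit[0] = now
        TRACE.append({
            "function": func.__name__,
            "arguments": (args, kwargs),
            "cpu_load": os.getloadavg()[0],   # 1-minute load average (Unix only)
            "memory_kb": resource.getrusage(resource.RUSAGE_SELF).ru_maxrss,
            "result": result,
            "time_interval": interval,
        })
        return result
    return wrapper

@check_point
def write_report(path, data):
    # basic function whose end is monitored according to Rule 3
    with open(path, "w") as f:
        f.write(data)
    return "ok"
```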
E. Representation of Software Intended Behavior
The existing representations of the dynamic behavior of software, such as Petri nets [20] and automata theory [21], do not take the relationship between the behavior and the trustiness of software into account. Qu Yanwen [7] proposed the behavior tree, which is a representation of the behavior trace of software; the behavior tree is in fact an effective representation of Software Intended Behavior. The TSCMTE adopts the Trust View to represent Software Intended Behavior, which can be described as TrustView(V, E, T, v0, Ve):
V is the set of check points and can be expressed as V(Id, Type, Scene), where Id is the unique identity; Type describes the type of the check point: 0 represents an ordinary check point and 1 represents a branch check point; Scene characterizes the function and related information of the current check point. If Type = 0, the Scene records the function name, arguments, CPU load, memory usage and execution result; if Type = 1, the Scene records the CPU load, memory usage, branch data and transfer condition.
E is the set of edges connecting check points associated with transitions. It is a subset of the 2-dimensional space V×V. Its element ei is a directed edge and can be described as ei = <vi, vj>.
T is the set of time intervals. For any edge ei = <vi, vj> ∈ E, the weight of ei represents the transfer interval between vi and vj.
v0 is the initial check point, v0 ∈ V.
Ve is the set of final check points, Ve ⊆ V.
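To make the TrustView(V, E, T, v0, Ve) definition concrete, here is a minimal data-structure sketch in Python. The field names follow the definition above; everything else (class names, the edge-consistency check) is an illustrative assumption rather than the paper's code.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple

@dataclass
class CheckPoint:
    id: int
    type: int                # 0 = ordinary check point, 1 = branch check point
    scene: Dict[str, Any]    # function/branch information as in Definition 7

@dataclass
class TrustView:
    V: Dict[int, CheckPoint] = field(default_factory=dict)         # check points by Id
    E: Set[Tuple[int, int]] = field(default_factory=set)           # directed edges (vi, vj)
    T: Dict[Tuple[int, int], float] = field(default_factory=dict)  # time interval per edge
    v0: int = 0                                                     # initial check point
    Ve: Set[int] = field(default_factory=set)                       # final check points

    def add_edge(self, vi: int, vj: int, interval: float) -> None:
        """Add a transition between two known check points with its intended time interval."""
        assert vi in self.V and vj in self.V, "edges must connect known check points"
        self.E.add((vi, vj))
        self.T[(vi, vj)] = interval
```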
F. Trustiness Evaluation
Based on the above research on software behavior, a strategy of trustiness evaluation is proposed; the process is given in Fig. 5.
The evaluation of the software intended operation trace verifies whether the identification of each check point reached in practice is consistent with the intended one. For the evaluation of the software intended function trace, we adopt the above-mentioned strategy of determining the weights of the characteristic attributes based on information entropy. After obtaining the weights of all attributes, we sum the per-attribute judgments with their weighting factors to perform the evaluation. The value of every attribute is described by an abstract value range [22]. If the value of attribute j belongs to the intended range, then cj = 1, else cj = 0. The value of the evaluation C is:
$$C = \sum_{j=1}^{6} \omega_j c_j \qquad (6)$$
A trusted threshold T is defined, and C is compared with T to decide whether the dynamic behavior is consistent with the intended behavior.
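A small sketch of Eq. (6) in Python. The weights and the per-attribute judgments are taken from the "With weights" rows of Table I; how the detection score reported in the table relates to C is not fully spelled out in the text, so the comparison against the deviation score 1 - C below is my reading of the table, not a rule stated by the authors.

```python
# entropy-derived weights for (Function, Argument, CPU, Memory, Result, TimeInterval),
# taken from the "With weights" row of Table I
weights = [0.37, 0.23, 0.13, 0.05, 0.20, 0.02]

def evaluate(c):
    """Eq. (6): C = sum_j w_j * c_j, where c_j = 1 when the value of attribute j
    stays inside its intended abstract value range and 0 otherwise.
    Returns C together with the complementary deviation score 1 - C."""
    C = sum(w * cj for w, cj in zip(weights, c))
    return C, 1.0 - C

T = 0.4  # trusted threshold used in Table I

# attack sample of Table I: Function and Argument leave the intended range
C_attack, dev_attack = evaluate([0, 0, 1, 1, 1, 1])   # C = 0.40, deviation = 0.60
# normal sample of Table I
C_normal, dev_normal = evaluate([1, 0, 1, 0, 1, 0])   # C = 0.70, deviation = 0.30

# the detection values reported in Table I (Attack 0.6, Normal 0.3) coincide with
# the deviation scores; comparing them with T reproduces the table's verdicts
print("attack flagged:", dev_attack >= T)   # True
print("normal flagged:", dev_normal >= T)   # False
```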
TABLE I.
WEIGHTS OF CHARACTERISTIC ATTRIBUTES AND DETECTION RESULTS

Characteristic Attributes         Function  Argument  CPU    Memory  Result  TimeInterval  Detection Results
Without weights   Weights         0.166     0.166     0.166  0.166   0.166   0.166         ——
                  Attack          ×         ×         √      √       √       √             False Negative (0.33)
                  Normal          √         ×         √      ×       √       ×             False Positive (0.498)
With weights      Weights         0.37      0.23      0.13   0.05    0.20    0.02          ——
                  Attack          ×         ×         √      √       √       √             Attack (0.6)
                  Normal          √         ×         √      ×       √       ×             Normal (0.3)

Trusted threshold T is 0.4: √ represents consistency with the intended; × represents deviation from the intended.
The overhead of the TSCMTE mainly comes from verifying whether the dynamic behavior trace extracted during software execution is in accordance with the intended one. The investigation shows that regular software has a considerably low system call density. Taking gzip as an example, when the model was enabled the CPU load increased by about 15% and the memory usage grew by about 10%. The TSCMTE therefore has an acceptable performance overhead while greatly improving the trustiness of the software.
In addition, this model has been adopted in the information management system of the Department of Science and Technology of Hebei Province, China, where its effective trustiness and acceptable performance have been confirmed.
VI. CONCLUSION
In this paper, a trusted software constitution model based on a Trust Engine (TSCMTE) is proposed. The Software Intended Behavior Trace is introduced to describe the intended behavior of software; it consists of the Intended Operation Trace and the Intended Function Trace. The former represents the intended routes and can be denoted by vectors of ordered Check Points; the latter describes the intended functions and is constituted by a series of functions with related information. The Time Interval ensures the trustiness of software behavior between adjacent Check Points. For the purpose of ensuring the software's trustiness, the TSCMTE enforces constraints on both the software's static integrity and its dynamic behavior. Furthermore, in order to improve the accuracy of the constraints, a strategy of determining the weights of the characteristic attributes based on information entropy is proposed. The simulation experiments and practical application show that the trustiness of software developed with the TSCMTE is improved greatly without performance degradation. However, the TSCMTE has its own limitations; intensive research will be carried out in the future on the trustiness evaluation and on the temporal and spatial correlations among characteristic attributes for determining the weights.
ACKNOWLEDGEMENT
This work is supported by the National Natural Science Foundation of China (Grant No. 60873203), the Foundation of the Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education (Grant No. AISTC2009_03), the Hebei National Funds for Distinguished Young Scientists (Grant No. F2010000317), and the National Science Foundation of Hebei Province (Grant No. F2010000319).
REFERENCES
[1] Chen Huowang, Wang Ji, Dong Wei, "High Confidence Software Engineering Technologies," Acta Electronica Sinica, vol. 31, no. 12A, pp. 1933–1937, 2003.
[2] Li Renjie, Zhang Zhuxi, Jiang Haiyan, Wang Huaimin, "Research and implementation of trusted software constitution based on monitoring," Application Research of Computers, vol. 26, no. 12, pp. 4585–4588, Dec. 2009.
[3] Mei Hong, Liu Xizhe, "Software techniques evolved by the Internet: Current situation and future trend," Chinese Sci Bull, vol. 55, no. 13, pp. 1214–1220, 2010.
[4] Edmund M. Clarke, Jeannette M. Wing, "Formal methods: State of the art and future directions," ACM Computing Surveys, vol. 28, pp. 626–643, 1996.
[5] S. J. Prowell, C. J. Trammell, R. C. Linger, J. H. Poore, Cleanroom Software Engineering: Technology and Process, Boston: Addison-Wesley Professional, 1999.
[6] V. O. Safonov, Using Aspect-Oriented Programming for Trustworthy Software Development, Wiley Interscience, Jun. 2008.
[7] Qu Yanwen, Software Behavior, Beijing: Electronic Industry Press, Oct. 2004.
[8] Liu Huimin, Zhang Wenhui, "Model Checking: Theories, Techniques and Applications," Chinese Journal of Electronics, vol. 30, no. 12A, pp. 1907–1912, Dec. 2002.
[9] Wang Huaimin, Liu Xudong, Xie Bing, Software Trustworthiness Classification Specification, TRUSTIE-STC V2.0, May 2009.
[10] Liu Jing, He Jifeng, Miao Huaikou, "A strategy for model construction and integration in MDA," Journal of Software, vol. 17, no. 6, pp. 1141–1422, Jun. 2006.
[11] Li Xiaoyong, Shen Changxiang, "Research to a dynamic application transitive trust model," J. Huazhong Univ. of Sci. & Tech. (Nature Science Edition), vol. 33, pp. 310–312, Dec. 2005.
[12] Liu Dayou, Yang Kun, Chen Jianzhong, "Agents: Present Status and Trends," Journal of Software, vol. 11, no. 3, pp. 315–321, Nov. 2000.
[13] David Challener, Kent Yoder, Ryan Catherman, A Practical Guide to Trusted Computing, Upper Saddle River, NJ: IBM Press, 2009.
[14] Shen Changxiang, Zhang Huanguo, Wang Huaimin, "Research on trusted computing and its development," SCIENCE CHINA Information Sciences, vol. 53, pp. 405–433, Mar. 2010.
[15] Trusted Computing Group, TCG Specification Architecture Overview, https://www.trustedcomputinggroup.org/groups/TCG_1_0_Architecture_Overview.pdf.
[16] Yao Lihong, Zi Xiaochao, Huang Hao, Mao Bing, Xie Li, "Research of System Call Based Intrusion Detection," Acta Electronica Sinica, vol. 31, no. 8, pp. 1134–1137, Aug. 2003.
[17] T. M. Cover and J. A. Thomas, Elements of Information Theory, New York: Wiley, 1991.
[18] M. Christodorescu, S. Jha, "Static analysis of executables to detect malicious patterns," in Proceedings of the 12th USENIX Security Symposium (Security'03), pp. 169–186, Aug. 2003.
[19] L. Wendehals, "Improving Design Pattern Instance Recognition by Dynamic Analysis," in Proc. of the ICSE 2003 Workshop on Dynamic Analysis (WODA), pp. 29–32, May 2003.
[20] Luo Junzhou, Shen Jun, Gu Guanqun, "From Petri Nets to Formal Description Techniques and Protocol Engineering," Journal of Software, vol. 11, no. 5, pp. 606–615, Nov. 2000.
[21] S. Helke, F. Kammüller, "Representing hierarchical automata in interactive theorem provers," in Proc. of the 14th International Conference on Theorem Proving in Higher Order Logics, London: Springer-Verlag, pp. 233–248, 2001.
[22] Xiao Qing, Gong Yunzhan, Yang Zhaohong, Jin Dahai, Wang Yawen, "Path Sensitive Static Defect Detecting Method," Journal of Software, vol. 21, no. 2, pp. 209–217, Feb. 2010.
Junfeng Tian, born in Baoding, China, in 1965, received the Ph.D. degree in computer science from the University of Science and Technology of China in 2004.
He is a professor of Computer Science at Hebei University. In the past few years, he has published many technical papers in refereed journals and conference proceedings. His research interests include network security, trusted computing and distributed computing.

Ye Zhu, born in Shijiazhuang, China, in 1985, received the Master's degree in computer science from Hebei University in 2011.
He works at Shijiazhuang Posts and Telecommunications Technical College. His research interests include network security and trusted computing.

Jianlei Feng, born in Cangzhou, China, in 1984, received the Master's degree in computer science from Hebei University in 2011.
He studied in the College of Mathematics and Computer Science of Hebei University. His research interests include network security and trusted computing.
Fuzzy Evaluation on Supply Chains' Overall Performance Based on AHM and M(1,2,3)

Jing Yang
School of Kexin, Hebei University of Engineering, Handan, China
hdjianghua@126.com

Hua Jiang
School of Economics and Management, Hebei University of Engineering, Handan, China
hdjianghua@126.com
Abstract—Effectively measuring supply chain performance is one of the most important aspects of supply chain management; it helps decision makers analyze historical performance and current status and set future performance targets. We first build on the Supply Chain Operations Reference model (SCOR-model) to construct an index system for evaluating supply chains' overall performance, and then use the Analytic Hierarchical Model (AHM) to determine the weight of every index in the system. In order to evaluate the supply chains' overall performance effectively, we define the distinguishable weight to eliminate redundant index data and extract valid values to compute object membership. Lastly, we use an example to illustrate the effectiveness of the proposed approach; the results show that the combined model can effectively evaluate supply chains' overall performance and identify the aspects to improve.

Index Terms—supply chains, overall performance, fuzzy evaluation, analytic hierarchical model, M(1,2,3) model
I. INTRODUCTION
Today's fierce market conditions drive enterprises to assess the overall performance of their supply chains effectively and to determine the aspects that need improvement in order to gain a competitive advantage. In recent decades, enterprises have been improving their internal performance by using practices such as JIT, Kanban, Kaizen, and TQM. Meanwhile, new methods in Supply Chain Management have forced enterprises to enhance not only their internal performance but also their supply chain performance. Many companies have not succeeded in maximizing the potential of their supply chains, because they often fail to develop the performance metrics needed to integrate the supply chain fully [1]. Lee and Billington [2] observed that the discrete sites in a supply chain do not maximize efficiency if each pursues its goals independently. In recent years, more and more researchers and practitioners have paid much attention to supply chain performance measurement.
In order to assess supply chain performance, many scholars have, from different perspectives, proposed corresponding evaluation index systems, which can generally be classified into three kinds: the evaluation index
system based on the Supply Chain Operations Reference model (SCOR-model), the evaluation system based on the Supply Chain Balanced Scorecard, and the ROF (Resources, Output, Flexibility) system proposed by Beamon [3]. Of the three evaluation systems, the SCOR-model is the most influential and the most widely applied; it can measure and improve enterprises' internal and external business processes, making the implementation of Strategic Enterprise Management (SEM) possible [4]. Bullinger et al. [5], following the SCOR framework, carried out a "bottom-up" performance evaluation of supply chains. Kee-hung Lai et al. [6], based on the SCOR-model and various established measures, proposed a measurement model and a measurement instrument for supply chain performance in transport logistics.
Robert S. Kaplan et al. [7] proposed the Balanced Scorecard (BSC) evaluation system. The BSC is not only an evaluation system but also a manifestation of management thinking. Since the BSC was proposed, with its simplicity and ease of operation, it has been widely recognized. Kleijnen J.P.C. et al. [8] and Ma Shi-hua [9] applied the basic principles of the BSC to supply chain performance evaluation and established a supply chain balanced scorecard evaluation system according to the characteristics of supply chains. Beamon [3], starting from the strategic objectives of supply chains, determined a few key factors influencing the strategic objectives to establish an index system framework for supply chain performance evaluation.
In addition to the above-mentioned supply chain performance evaluation index systems, more scholars have put forward corresponding evaluation index systems from other perspectives, but these systems are not built from the perspective of the overall supply chain, and the proposed indexes are numerous and complex, even containing redundant data. Although many scholars have discussed evaluation indexes at the theoretical level, few have done so at the operational level.
The process of supply chain operations contains a lot of vague information that is difficult to measure and quantify with conventional methods. In addition, the characteristics of the supply chain itself require its decision-making issues to seek an integrated, coordinated balance and overall optimization, which leaves supply chain performance evaluation with a number of qualitative indicators. These factors bring a certain degree of difficulty to the performance evaluation of supply chains. Currently, the main methods for supply chain performance evaluation include the Analytic Hierarchy Process (AHP), fuzzy decision-making evaluation methods, Data Envelopment Analysis and so on.
AHP, proposed by T. L. Saaty in the early 1970s, is a flexible and practical method for multi-criteria decision-making. As supply chain performance evaluation is a typical multi-objective decision-making issue, AHP has been widely used in this area. F.T.S. Chan [10] took the electronics industry as an example to demonstrate the suitability of the AHP technique for performance measurement in a supply chain. R. Bhagwat [11] used the AHP methodology as an aid in making SCM evaluation decisions. The application of AHP to supply chain performance evaluation brings a systematic analysis to the problem, providing a more convincing basis for scientific management and decision-making. However, AHP also has its limitations, so many scholars have tried a variety of ways to improve and perfect it in order to overcome its shortcomings. F.T.S. Chan and Qi H.J. [12] proposed a novel channel-spanning performance measurement method from a system perspective and introduced fuzzy set theory to address the real situations encountered in the judgement and evaluation processes of supply chains. Rajat Bhagwat [13] proposed a new mathematical model to optimize the overall performance measurement of SCM for Small- and Medium-sized Enterprises (SMEs).
Besides the above evaluation methods, there are also many other attempts. Qinghua Zhu et al. [14], taking 341 Chinese manufacturers as samples, applied confirmatory factor analysis to test and compare two measurement models of green supply chain management (GSCM) practices implementation.
Although there have been many studies on the performance measurement of supply chains, few address the overall performance evaluation of supply chains, so evaluating the overall performance of supply chains is of important practical and theoretical significance. The objective of this paper is to apply the Analytic Hierarchical Model (AHM) and a new membership transformation method, M(1,2,3), to evaluate the supply chains' overall performance. The contributions of this study include: i) constructing an index system for assessing the overall performance of supply chains; ii) using AHM to determine the index weights in the system; iii) building an evaluation model with M(1,2,3) for evaluating the supply chains' overall performance.
The rest of the paper is organized as follows. Section 2 introduces the models of AHM and M(1,2,3) and establishes the framework for applying the models to evaluate the performance of supply chains. Section 3, according to the SCOR-model proposed by the Supply Chain Council (SCC), constructs an evaluation index system of the overall performance of supply chains. Section 4 determines the weight of every index by applying the Analytic Hierarchical Model. Section 5 applies the M(1,2,3) model in the fuzzy evaluation of supply chains' overall performance. The last section concludes our discussion by summarizing our findings and implications for future research.
II. THE MODELS
This section introduces the AHM and the M(1,2,3) model. AHM is a multi-criteria decision-making tool which can be used to evaluate alternative programs or to determine the weights of evaluation indexes. The M(1,2,3) model is a more accurate evaluation model. We also construct the framework for applying AHM and M(1,2,3) to evaluate the overall performance of supply chains, which is the guideline for the following sections.

A. Analytic Hierarchical Model
AHM differs from AHP [15]: there is no eigenvalue calculation or consistency test in AHM, which is often called the Ball Game model. The concrete contents are as follows [16]:
Assume that there are $n$ elements $u_1, u_2, \ldots, u_n$, which respectively represent $n$ ball teams. Two teams meet in each game, so there are $\frac{1}{2}n(n-1)$ games in total. Every game allocates one score, with $\mu_{ij}$ and $\mu_{ji}$ representing the corresponding scores of $u_i$ and $u_j$ ($i \ne j$) in the same game. The score denotes the criterion, C for short. Under criterion C, we can sort $u_1, u_2, \ldots, u_n$ according to their gained scores.
$\mu_{ij}$ and $\mu_{ji}$ should satisfy the following conditions:

$$\begin{cases} \mu_{ij} \ge 0,\ \mu_{ji} \ge 0 \\ \mu_{ij} + \mu_{ji} = 1, \quad i \ne j \\ \mu_{ii} = 0 \ \ \text{(a team cannot play against itself)} \end{cases} \qquad (1)$$

In practical problems, $\mu_{ij}$ can take any real value from 0 to 1. We call $\mu_{ij}$ the Relative Measurement of $u_i$ and $u_j$ ($i \ne j$) and call $\mu = (\mu_{ij})_{n \times n}$ the pair-wise comparison matrix.
If $\mu_{ij} > \mu_{ji}$, it means that $u_i$ is stronger than $u_j$, denoted by $u_i > u_j$; that is, after the game, if the score of $u_i$ is larger than that of $u_j$, $u_i$ is the winner. If $(\mu_{ij})_{n \times n}$ satisfies: $u_i > u_j$ and $u_j > u_k$ imply $u_i > u_k$, then the comparison matrix satisfies the consistency.
The final score of $u_i$ is $f_i = \sum_{j=1}^{n} \mu_{ij}$; obviously, $\sum_{i=1}^{n} f_i = \frac{1}{2}n(n-1)$, since there are $\frac{1}{2}n(n-1)$ games in total and the total score must equal $\frac{1}{2}n(n-1)$. Supposing:

$$\omega^C = \left(\omega^C_{u_1}, \omega^C_{u_2}, \ldots, \omega^C_{u_n}\right)^T, \qquad \omega^C_{u_i} = \frac{2}{n(n-1)} \sum_{j=1}^{n} \mu_{ij} \qquad (2)$$

where $\omega^C$ is called the Relative Weight Vector.
Usually it is not easy to obtain the comparison matrix $(\mu_{ij})_{n \times n}$ in AHM directly, but it can be deduced from the comparison matrix $(a_{ij})_{n \times n}$ in AHP. When $(a_{ij})_{n \times n}$ satisfies the consistency, we can sort $u_1, u_2, \ldots, u_n$ according to the relative components of $\omega^C$.
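As a quick illustration of Eqs. (1) and (2), the following Python sketch checks the conditions on a pair-wise comparison matrix and computes the relative weight vector; the 3 × 3 matrix used here anticipates the converted Reliability matrix that appears in Section IV, so the output can be compared with the weights reported there.

```python
def check_conditions(mu):
    """Eq. (1): non-negative entries, mu_ij + mu_ji = 1 for i != j, zero diagonal."""
    n = len(mu)
    for i in range(n):
        assert abs(mu[i][i]) < 1e-9
        for j in range(n):
            assert mu[i][j] >= 0
            if i != j:
                assert abs(mu[i][j] + mu[j][i] - 1.0) < 1e-9

def relative_weights(mu):
    """Eq. (2): omega_i = 2 / (n(n-1)) * sum_j mu_ij."""
    n = len(mu)
    return [2.0 / (n * (n - 1)) * sum(row) for row in mu]

mu = [[0.0, 0.8, 0.8],
      [0.2, 0.0, 0.5],
      [0.2, 0.5, 0.0]]
check_conditions(mu)
print(relative_weights(mu))   # [0.533..., 0.233..., 0.233...]
```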
B. M(1,2,3) Model
Assume that there are $m$ indexes which affect the evaluation object $Q$, where the importance weight $\lambda_j(Q)$ of the $j$th index ($j = 1 \sim m$) with respect to the object $Q$ is given and satisfies:

$$0 \le \lambda_j(Q) \le 1, \qquad \sum_{j=1}^{m} \lambda_j(Q) = 1 \qquad (3)$$

Every index is classified into $p$ classes. $C_K$ represents the $K$th class and $C_K$ is prior to $C_{K+1}$. The membership $\mu_{jK}(Q)$ of the $j$th index belonging to $C_K$ is given, where $K = 1 \sim p$ and $j = 1 \sim m$, and $\mu_{jK}(Q)$ satisfies:

$$0 \le \mu_{jK}(Q) \le 1, \qquad \sum_{K=1}^{p} \mu_{jK}(Q) = 1 \qquad (4)$$
1) The distinguishable weight
Let $\alpha_j(Q)$ represent the normalized, quantized value describing how much the $j$th index contributes to the classification. It can be described quantitatively by the entropy $H_j(Q)$; therefore, $\alpha_j(Q)$ is a function of $H_j(Q)$:

$$H_j(Q) = -\sum_{k=1}^{p} \mu_{jk}(Q) \cdot \log \mu_{jk}(Q) \qquad (5)$$

$$v_j(Q) = 1 - \frac{1}{\log p} H_j(Q) \qquad (6)$$

$$\alpha_j(Q) = v_j(Q) \Big/ \sum_{t=1}^{m} v_t(Q), \qquad (j = 1 \sim m) \qquad (7)$$
Definition 1: If $\mu_{jk}(Q)$ ($k = 1 \sim p$, $j = 1 \sim m$) is the membership of the $j$th index belonging to $C_k$ and satisfies Eq. (4), then $\alpha_j(Q)$ obtained by Eqs. (5)–(7) is called the distinguishable weight of the $j$th index corresponding to $Q$. Obviously, $\alpha_j(Q)$ satisfies

$$0 \le \alpha_j(Q) \le 1, \qquad \sum_{j=1}^{m} \alpha_j(Q) = 1 \qquad (8)$$
2) The effective value
Definition 2: If $\mu_{jk}(Q)$ ($k = 1 \sim p$, $j = 1 \sim m$) is the membership of the $j$th index belonging to $C_k$ and satisfies Eq. (4), and $\alpha_j(Q)$ is the distinguishable weight of the $j$th index corresponding to $Q$, then

$$\alpha_j(Q) \cdot \mu_{jk}(Q) \qquad (k = 1 \sim p) \qquad (9)$$

is called the effective distinguishable value of the $K$th class membership of the $j$th index, or the $K$th class effective value for short.
3) The comparable value
Definition 3: If $\alpha_j(Q) \cdot \mu_{jk}(Q)$ is the $K$th class effective value of the $j$th index, and $\beta_j(Q)$ is the importance weight of the $j$th index related to the object $Q$, then

$$\beta_j(Q) \cdot \alpha_j(Q) \cdot \mu_{jk}(Q) \qquad (k = 1 \sim p) \qquad (10)$$

is called the comparable effective value of the $K$th class membership of the $j$th index, or the $K$th class comparable value for short.
Definition 4: If $\beta_j(Q) \cdot \alpha_j(Q) \cdot \mu_{jk}(Q)$ is the $K$th class comparable value of the $j$th index of $Q$, where $j = 1 \sim m$, then

$$M_k(Q) = \sum_{j=1}^{m} \beta_j(Q) \cdot \alpha_j(Q) \cdot \mu_{jk}(Q) \qquad (k = 1 \sim p) \qquad (11)$$

is named the $K$th class comparable sum of the object $Q$.
Definition 5: If $M_k(Q)$ is the $K$th class comparable sum of the object $Q$, and $\mu_k(Q)$ is the membership of the object $Q$ belonging to $C_K$, then

$$\mu_k(Q) \triangleq M_k(Q) \Big/ \sum_{t=1}^{p} M_t(Q) \qquad (k = 1 \sim p) \qquad (12)$$

Obviously, the membership degree $\mu_k(Q)$ given by Eq. (12) satisfies:

$$0 \le \mu_k(Q) \le 1, \qquad \sum_{k=1}^{p} \mu_k(Q) = 1 \qquad (13)$$

The above membership transformation method can be summarized as "effective, comparison and composition", and is denoted as the M(1,2,3) model [17].
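The whole transformation of Eqs. (5)–(12) fits into one function. Below is a minimal Python sketch of the M(1,2,3) membership transformation under the definitions above; the function name m123 and the choice of the natural logarithm are assumptions (the base cancels in Eq. (6)), and the example input reuses the Reliability data that appears in Section V, so the output can be compared with the membership vector computed there.

```python
import math

def m123(memberships, importance):
    """M(1,2,3) membership transformation.
    memberships: m x p matrix, row j = membership of index j in classes C1..Cp (Eq. 4).
    importance:  length-m importance weights beta_j (Eq. 3).
    Returns the length-p membership vector of the evaluated object (Eq. 12)."""
    m, p = len(memberships), len(memberships[0])
    # Eqs. (5)-(6): entropy H_j and v_j = 1 - H_j / log p
    v = []
    for row in memberships:
        H = -sum(x * math.log(x) for x in row if x > 0)
        v.append(1.0 - H / math.log(p))
    alpha = [vj / sum(v) for vj in v]                     # Eq. (7): distinguishable weights
    # Eqs. (10)-(11): comparable values and their class-wise sums
    M = [sum(importance[j] * alpha[j] * memberships[j][k] for j in range(m))
         for k in range(p)]
    total = sum(M)
    return [Mk / total for Mk in M]                       # Eq. (12)

# example: three indexes, five ordered classes (the Reliability data of Section V)
memberships = [[0.24, 0.22, 0.22, 0.20, 0.12],
               [0.30, 0.28, 0.20, 0.14, 0.08],
               [0.28, 0.28, 0.20, 0.12, 0.12]]
importance = [0.534, 0.233, 0.233]
print(m123(memberships, importance))   # roughly (0.278, 0.264, 0.205, 0.150, 0.103)
```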
C. The Framework of Applying AHM and M(1,2,3) to Evaluate Supply Chains' Overall Performance
According to the calculation processes of AHM and M(1,2,3), we can construct the framework for applying the two models to evaluate the overall performance of supply chains, as Fig. 1 shows. The first step is to construct an evaluation index system of supply chains' overall performance; the second step is to apply AHM to determine the weight of every index; the third step is to establish the fuzzy evaluation matrix of supply chains' overall performance; the fourth step is to calculate the evaluation results with the M(1,2,3) model; the fifth step is to analyze the results; and the last step is to propose improvement measures.

Figure 1. The framework of applying AHM and M(1,2,3) to evaluate supply chains' overall performance.
III. THE EVALUATION INDEX SYSTEM OF SUPPLY CHAINS' OVERALL PERFORMANCE
We establish, according to the SCOR-model, an evaluation index system of supply chains' total performance, as Table I shows [18].

TABLE I.
THE EVALUATION INDEX SYSTEM OF SUPPLY CHAINS' OVERALL PERFORMANCE

The goal: Evaluation on supply chains' total performance G
Criteria layer        Index layer
C1: Reliability       F11: Delivery performance; F12: Order fill rate; F13: On time delivery
C2: Responsiveness    F21: Order lead-time; F22: Planning cycle time; F23: Information transmission rate
C3: Flexibility       F31: Supply chain responsiveness time; F32: Production flexibility; F33: Delivery flexibility
C4: Cost              F41: Supply chain total costs; F42: Value-added employee productivity; F43: Quality warranty costs
C5: Assets            F51: Cash turn age; F52: Inventory days; F53: Asset turns

IV. APPLYING AHM TO DETERMINE THE WEIGHT OF EVERY INDEX IN THE EVALUATION INDEX SYSTEM OF SUPPLY CHAINS' OVERALL PERFORMANCE
This section applies AHM to determine the weight of every index in the evaluation index system of supply chains' total performance.

A. 1-9 Proportional Scaling Method
The $n$ elements $u_1, u_2, \ldots, u_n$ are compared pairwise for importance, so there are $\frac{n(n-1)}{2}$ comparisons. The importance ratio of $u_i$ and $u_j$ is $a_{ij}$. The problem is how to obtain $a_{ij}$; AHP uses the 1-9 proportional scaling method to determine $a_{ij}$.
B. Constructing the Pairwise Comparison Judgment Matrices in AHP
We compare the elements of the criteria layer with respect to the goal of improving the overall performance of supply chains, which yields a 5 × 5 comparison matrix, and we compare the factors of the index layer with respect to each corresponding criterion, which yields five 3 × 3 comparison matrices. The comparison matrices are as follows:
G                    Reliability C1   Responsiveness C2   Flexibility C3   Cost C4   Assets C5
Reliability C1       1                2                   2                3         5
Responsiveness C2    1/2              1                   1                2         3
Flexibility C3       1/2              1                   1                2         3
Cost C4              1/3              1/2                 1/2              1         2
Assets C5            1/5              1/3                 1/3              1/2       1
Reliability C1               F11    F12    F13
Delivery performance F11     1      2      2
Order fill rate F12          1/2    1      1
On time delivery F13         1/2    1      1

Responsiveness C2                    F21    F22    F23
Order lead-time F21                  1      2      3
Planning cycle time F22              1/2    1      2
Information transmission rate F23    1/3    1/2    1

Flexibility C3                          F31    F32    F33
Supply chain responsiveness time F31    1      2      4
Production flexibility F32              1/2    1      2
Delivery flexibility F33                1/4    1/2    1

Costs C4                                 F41    F42    F43
Supply chain total costs F41             1      2      2
Value-added employee productivity F42    1/2    1      1
Quality warranty costs F43               1/2    1      1

Assets C5            F51    F52    F53
Cash turn age F51    1      3      5
Inventory days F52   1/3    1      2
Asset turns F53      1/5    1/2    1

C. The Pairwise Comparison Judgment Matrices in AHM after Converting from AHP
Using the models mentioned in Section II, we can convert the comparison judgment matrices $(a_{ij})_{n \times n}$ in AHP into the comparison judgment matrices $(\mu_{ij})_{n \times n}$ in AHM, as follows:
G                    Reliability C1   Responsiveness C2   Flexibility C3   Cost C4   Assets C5
Reliability C1       0                0.8                 0.8              0.857     0.909
Responsiveness C2    0.2              0                   0.5              0.8       0.857
Flexibility C3       0.2              0.5                 0                0.8       0.857
Cost C4              0.143            0.2                 0.2              0         0.8
Assets C5            0.091            0.143               0.143            0.2       0

Reliability C1               F11      F12    F13
Delivery performance F11     0        0.8    0.8
Order fill rate F12          0.2      0      0.5
On time delivery F13         0.2      0.5    0

Responsiveness C2                    F21      F22    F23
Order lead-time F21                  0        0.8    0.857
Planning cycle time F22              0.2      0      0.8
Information transmission rate F23    0.143    0.2    0

Flexibility C3                          F31      F32    F33
Supply chain responsiveness time F31    0        0.8    0.889
Production flexibility F32              0.2      0      0.8
Delivery flexibility F33                0.111    0.2    0

Costs C4                                 F41    F42    F43
Supply chain total costs F41             0      0.8    0.8
Value-added employee productivity F42    0.2    0      0.5
Quality warranty costs F43               0.2    0.5    0

Assets C5            F51      F52      F53
Cash turn age F51    0        0.857    0.909
Inventory days F52   0.143    0        0.8
Asset turns F53      0.091    0.2      0
From the above conversion results we can see that all the converted comparison judgment matrices in AHM satisfy the consistency.

D. Calculating the Relative Weights in AHM under the Single Criterion
Under the single criterion C, the relative weight of each factor is calculated by the following formula:

$$\omega^C = \left(\omega^C_{u_1}, \omega^C_{u_2}, \ldots, \omega^C_{u_n}\right)^T, \qquad \omega^C_{u_i} = \frac{2}{n(n-1)} \sum_{j=1}^{n} \mu_{ij}$$
The detailed values of the relative weights are as follows:

$$\omega_G = \left(\omega_{C_1}, \omega_{C_2}, \omega_{C_3}, \omega_{C_4}, \omega_{C_5}\right)^T = (0.337,\ 0.236,\ 0.236,\ 0.133,\ 0.058)^T$$

$$\omega_{C_1} = \left(\omega^{F_{11}}_{C_1}, \omega^{F_{12}}_{C_1}, \omega^{F_{13}}_{C_1}\right)^T = (0.534,\ 0.233,\ 0.233)^T$$

$$\omega_{C_2} = \left(\omega^{F_{21}}_{C_2}, \omega^{F_{22}}_{C_2}, \omega^{F_{23}}_{C_2}\right)^T = (0.552,\ 0.334,\ 0.114)^T$$

$$\omega_{C_3} = \left(\omega^{F_{31}}_{C_3}, \omega^{F_{32}}_{C_3}, \omega^{F_{33}}_{C_3}\right)^T = (0.562,\ 0.334,\ 0.104)^T$$

$$\omega_{C_4} = \left(\omega^{F_{41}}_{C_4}, \omega^{F_{42}}_{C_4}, \omega^{F_{43}}_{C_4}\right)^T = (0.534,\ 0.233,\ 0.233)^T$$

$$\omega_{C_5} = \left(\omega^{F_{51}}_{C_5}, \omega^{F_{52}}_{C_5}, \omega^{F_{53}}_{C_5}\right)^T = (0.588,\ 0.314,\ 0.098)^T$$
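The text does not spell out the rule used to convert the AHP judgments $a_{ij}$ into the AHM measurements $\mu_{ij}$, but the converted matrices above are consistent with the commonly used AHM rule $\mu_{ij} = 2k/(2k+1)$ when $a_{ij} = k > 1$, $\mu_{ij} = 1/(2k+1)$ when $a_{ij} = 1/k$, and $\mu_{ij} = 0.5$ when $a_{ij} = 1$ ($i \ne j$). The Python sketch below applies that assumed rule to the criteria-layer AHP matrix and recomputes $\omega_G$ with Eq. (2); it reproduces 0.337, 0.236, 0.236, 0.133, 0.058 up to rounding.

```python
from fractions import Fraction

def ahp_to_ahm(a):
    """Assumed conversion: a_ij = k > 1 -> 2k/(2k+1); a_ij = 1/k -> 1/(2k+1); a_ij = 1 -> 0.5."""
    n = len(a)
    mu = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            k = Fraction(a[i][j]).limit_denominator()
            if k == 1:
                mu[i][j] = 0.5
            elif k > 1:
                mu[i][j] = float(2 * k / (2 * k + 1))
            else:
                inv = 1 / k
                mu[i][j] = float(1 / (2 * inv + 1))
    return mu

def relative_weights(mu):
    """Eq. (2)."""
    n = len(mu)
    return [round(2.0 / (n * (n - 1)) * sum(row), 3) for row in mu]

# criteria-layer AHP judgment matrix from Section IV.B
a_G = [[1, 2, 2, 3, 5],
       [1/2, 1, 1, 2, 3],
       [1/2, 1, 1, 2, 3],
       [1/3, 1/2, 1/2, 1, 2],
       [1/5, 1/3, 1/3, 1/2, 1]]
print(relative_weights(ahp_to_ahm(a_G)))   # [0.337, 0.236, 0.236, 0.134, 0.058]
```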
E. Calculating the Synthetic Weight of Each Factor to the Goal
According to the relative weights under the single criterion in every layer obtained above, we can calculate the synthetic weights of the factors in the bottom layer with respect to the goal (the fuzzy evaluation on supply chains' total performance S), which are as follows:

$$\omega_G^{F} = \left(\omega_G^{F_{11}}, \omega_G^{F_{12}}, \omega_G^{F_{13}}, \omega_G^{F_{21}}, \omega_G^{F_{22}}, \omega_G^{F_{23}}, \omega_G^{F_{31}}, \omega_G^{F_{32}}, \omega_G^{F_{33}}, \omega_G^{F_{41}}, \omega_G^{F_{42}}, \omega_G^{F_{43}}, \omega_G^{F_{51}}, \omega_G^{F_{52}}, \omega_G^{F_{53}}\right)^T$$
$$= (0.180,\ 0.079,\ 0.079,\ 0.130,\ 0.079,\ 0.027,\ 0.133,\ 0.079,\ 0.025,\ 0.071,\ 0.031,\ 0.031,\ 0.034,\ 0.018,\ 0.006)^T$$
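Each synthetic weight is simply the product of a criterion's weight in $\omega_G$ and the factor's relative weight under that criterion. A short Python check of the vector above (values agree up to rounding):

```python
criterion_weights = {"C1": 0.337, "C2": 0.236, "C3": 0.236, "C4": 0.133, "C5": 0.058}
factor_weights = {
    "C1": [0.534, 0.233, 0.233],   # F11, F12, F13
    "C2": [0.552, 0.334, 0.114],   # F21, F22, F23
    "C3": [0.562, 0.334, 0.104],   # F31, F32, F33
    "C4": [0.534, 0.233, 0.233],   # F41, F42, F43
    "C5": [0.588, 0.314, 0.098],   # F51, F52, F53
}
synthetic = [round(criterion_weights[c] * w, 3)
             for c in ("C1", "C2", "C3", "C4", "C5")
             for w in factor_weights[c]]
print(synthetic)
# [0.18, 0.079, 0.079, 0.13, 0.079, 0.027, 0.133, 0.079, 0.025,
#  0.071, 0.031, 0.031, 0.034, 0.018, 0.006]
```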
V. FUZZY EVALUATION ON SUPPLY CHAINS' TOTAL PERFORMANCE BASED ON M(1,2,3)

A. The Fuzzy Evaluation Matrix of Supply Chains' Total Performance
According to the evaluation index system of the total performance of supply chains constructed in Section III, we invited fifty domain experts, including the top leaders of the supply chain, to evaluate the total performance of a supply chain. The evaluation results for each base index are shown in Table II. In Table II, the values in the brackets following each index represent the corresponding importance weights, and the values behind the base indexes represent the experts' ratings, which are classified into five levels: G1: Very satisfied, G2: Satisfied, G3: General, G4: Dissatisfied, G5: Very dissatisfied.
B. Fuzzy Evaluation Based on the M(1,2,3) Model
(1) We take the criterion C1 (Reliability) as the example. The calculation of its membership vector proceeds as follows:
1) From Table II, on the index "Delivery performance", 24% of the experts regarded it as very satisfied, 22% as satisfied, 22% as general, 20% as dissatisfied and 12% as very dissatisfied, so its evaluation membership vector is [0.24 0.22 0.22 0.20 0.12].
TABLE II.
THE INDEX DATA OF SUPPLY CHAINS' TOTAL PERFORMANCE

Criteria                  Indexes                                           Very satisfied  Satisfied  General  Dissatisfied  Very dissatisfied
C1: Reliability (0.337)   F11: Delivery performance (0.534)                 12              11         11       10            6
                          F12: Order fill rate (0.233)                      15              14         10       7             4
                          F13: On time delivery (0.233)                     14              14         10       6             6
C2: Responsiveness        F21: Order lead-time (0.552)                      6               7          14       13            10
    (0.236)               F22: Planning cycle time (0.334)                  5               4          14       14            13
                          F23: Information transmission rate (0.114)        10              11         14       10            5
C3: Flexibility (0.236)   F31: Supply chain responsiveness time (0.562)     11              16         14       6             3
                          F32: Production flexibility (0.334)               16              14         12       6             2
                          F33: Delivery flexibility (0.104)                 18              12         11       6             3
C4: Costs (0.133)         F41: Supply chain total costs (0.534)             11              12         13       10            4
                          F42: Value-added employee productivity (0.233)    10              13         13       10            4
                          F43: Quality warranty costs (0.233)               14              14         10       8             4
C5: Assets (0.058)        F51: Cash turn age (0.588)                        13              14         10       10            3
                          F52: Inventory days (0.314)                       12              11         12       10            5
                          F53: Asset turns (0.098)                          14              12         12       10            2
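Each membership vector used below is just a row of expert counts from Table II divided by the 50 respondents. A two-line Python illustration (the dictionary covers only the Reliability indexes; extending it to the other criteria is mechanical):

```python
counts = {           # expert counts from Table II, criterion C1 only
    "F11": [12, 11, 11, 10, 6],
    "F12": [15, 14, 10, 7, 4],
    "F13": [14, 14, 10, 6, 6],
}
memberships = {k: [c / 50 for c in v] for k, v in counts.items()}
print(memberships["F11"])   # [0.24, 0.22, 0.22, 0.2, 0.12]
```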
According to fuzzy theory, we can form the evaluation matrix of the criterion "Reliability" as follows:

$$U(C_1) = \begin{pmatrix} 0.24 & 0.22 & 0.22 & 0.20 & 0.12 \\ 0.30 & 0.28 & 0.20 & 0.14 & 0.08 \\ 0.28 & 0.28 & 0.20 & 0.12 & 0.12 \end{pmatrix}$$

According to the $j$th row $F_{1j}$ of $U(C_1)$, the distinguishable weights of $F_{1j}$ are obtained, and the distinguishable weight vector is:

$$\alpha(C_1) = (0.1334 \quad 0.5065 \quad 0.3600)$$

2) In Table II, the importance weight vector of $F_{11} \sim F_{13}$ on $C_1$ is given:

$$\beta(C_1) = (0.534 \quad 0.233 \quad 0.233)$$

3) Calculate the $K$th class comparable values of $F_{1j}$ and obtain the comparable value matrix $N(C_1)$ of $C_1$:

$$N(C_1) = \begin{pmatrix} 0.0171 & 0.0157 & 0.0157 & 0.0142 & 0.0085 \\ 0.0354 & 0.0330 & 0.0236 & 0.0165 & 0.0094 \\ 0.0235 & 0.0235 & 0.0168 & 0.0101 & 0.0101 \end{pmatrix}$$

4) According to $N(C_1)$, calculate the $K$th class comparable sums of $C_1$ and obtain the comparable sum vector:

$$M(C_1) = (0.0760 \quad 0.0722 \quad 0.0561 \quad 0.0408 \quad 0.0281)$$

5) According to $M(C_1)$, calculate the membership vector $\mu(C_1)$ of $C_1$:

$$\mu(C_1) = (0.2782 \quad 0.2644 \quad 0.2052 \quad 0.1495 \quad 0.1027)$$

Following the same steps, we can calculate $\mu(C_2)$, $\mu(C_3)$, $\mu(C_4)$ and $\mu(C_5)$, which, together with $\mu(C_1)$, form the evaluation matrix $U(S)$ of supply chains' total performance:

$$U(S) = \begin{pmatrix} \mu(C_1) \\ \mu(C_2) \\ \mu(C_3) \\ \mu(C_4) \\ \mu(C_5) \end{pmatrix} = \begin{pmatrix} 0.2782 & 0.2644 & 0.2052 & 0.1495 & 0.1027 \\ 0.1152 & 0.1139 & 0.2800 & 0.2663 & 0.2246 \\ 0.2724 & 0.2966 & 0.2586 & 0.1200 & 0.0524 \\ 0.2325 & 0.2559 & 0.2429 & 0.1886 & 0.0800 \\ 0.2598 & 0.2644 & 0.2124 & 0.2000 & 0.0634 \end{pmatrix}$$

(2) According to $U(S)$ and the weights of the criteria in the criterion layer, we can calculate the final membership vector $\mu(S)$ of the goal $S$:

$$\mu(S) = \left(\mu_1(S), \ldots, \mu_5(S)\right) = (0.2366 \quad 0.2437 \quad 0.2481 \quad 0.1634 \quad 0.1081)$$
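The worked numbers above are consistent with applying the M(1,2,3) transformation once more, now treating the five criteria as the indexes, with U(S) supplying their memberships and the criterion weights of Section IV supplying their importance weights; the paper does not state this explicitly, so the sketch below should be read as an assumption. It repeats the helper sketched in Section II and reproduces the published figures to within about ±0.005, which is consistent with intermediate rounding in the worked example.

```python
import math

def m123(memberships, importance):
    """M(1,2,3) transformation, as in Section II (Eqs. 5-12)."""
    m, p = len(memberships), len(memberships[0])
    v = [1.0 - (-sum(x * math.log(x) for x in row if x > 0)) / math.log(p)
         for row in memberships]
    alpha = [vj / sum(v) for vj in v]
    M = [sum(importance[j] * alpha[j] * memberships[j][k] for j in range(m))
         for k in range(p)]
    return [Mk / sum(M) for Mk in M]

# U(S): criterion-level membership vectors mu(C1)..mu(C5) from the worked example
U_S = [[0.2782, 0.2644, 0.2052, 0.1495, 0.1027],
       [0.1152, 0.1139, 0.2800, 0.2663, 0.2246],
       [0.2724, 0.2966, 0.2586, 0.1200, 0.0524],
       [0.2325, 0.2559, 0.2429, 0.1886, 0.0800],
       [0.2598, 0.2644, 0.2124, 0.2000, 0.0634]]
omega_G = [0.337, 0.236, 0.236, 0.133, 0.058]   # criterion weights from Section IV

mu_S = m123(U_S, omega_G)
print([round(x, 3) for x in mu_S])
# approximately [0.238, 0.246, 0.246, 0.168, 0.103], versus the published
# (0.2366, 0.2437, 0.2481, 0.1634, 0.1081); the gap reflects intermediate rounding
```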
C. Recognition
Because the evaluation grades of the overall performance of supply chains are ordered, that is, $G_k$ is superior to $G_{k+1}$, we apply the confidence recognition rule to determine the grade of the overall performance of the supply chain.
Let $\lambda$ ($\lambda > 0.7$) represent the confidence degree; then we can calculate

$$K_0 = \min\left\{ k \;\middle|\; \sum_{t=1}^{k} \mu_t(S) \ge \lambda,\ 1 \le k \le 5 \right\}$$

and judge that $S$ belongs to the $K_0$th grade, of which the confidence degree is no lower than $\sum_{t=1}^{K_0} \mu_t(S)$.
In this example, according to the final membership vector $\mu(S)$ obtained in the above subsection, we can judge that the overall performance of the supply chain $S$ belongs to the G3 (General) level, with confidence degree 72.84% (0.2366 + 0.2437 + 0.2481 = 0.7284).
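The confidence recognition rule is a cumulative-sum lookup. A minimal Python sketch, using the μ(S) values from the example and λ = 0.7 for illustration:

```python
def recognize(mu_S, lam=0.7):
    """Confidence recognition: smallest k whose cumulative membership reaches lambda."""
    cumulative = 0.0
    for k, mu_k in enumerate(mu_S, start=1):
        cumulative += mu_k
        if cumulative >= lam:
            return k, cumulative
    return len(mu_S), cumulative

grades = ["G1: Very satisfied", "G2: Satisfied", "G3: General",
          "G4: Dissatisfied", "G5: Very dissatisfied"]
mu_S = [0.2366, 0.2437, 0.2481, 0.1634, 0.1081]
k, conf = recognize(mu_S)
print(grades[k - 1], round(conf, 4))   # G3: General 0.7284
```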
D. Results Analysis
We have judged the total performance of the supply chain to be at the "General" level with a confidence level of 72.84%. From the evaluation matrix U(S), judging it as "Very satisfied" would carry a confidence level of only 23.66%, indicating that the supply chain should improve its total performance greatly in every aspect, especially in the "Responsiveness" aspect, which has the lowest confidence level, 50.91%, when judged at the "General" level.
VI. CONCLUSIONS
In this paper, we integrated the Analytic Hierarchical Model (AHM) and a new membership transformation method, M(1,2,3), to evaluate the overall performance of supply chains. The contributions of this study include: i) constructing an index system for assessing the overall performance of supply chains; ii) using AHM to determine the index weights in the system; iii) building an evaluation model with M(1,2,3) for evaluating the supply chains' overall performance. With the proposed approach, we can not only judge the overall level of a supply chain but also find out which aspects the decision makers should enhance to increase the overall performance.
However, this study has several limitations. First, we only used the first-layer indicators of the SCOR-model, which was proposed in the late 1990s, and did not consider modern factors such as green and ecological aspects [14]. Second, in Section IV the pairwise comparison matrices were determined only according to the authors' own judgment, which is rather subjective. The focus of this research was to propose a new evaluation method for evaluating supply chains' total performance; whether the method achieves effective and scientific results also depends on the indexes and the data used in the method. Future research can therefore focus on improving the evaluation index system of supply chains' total performance and on developing accurate ways to obtain index data.
[1]<br />
REFERENCES<br />
Hua Jiang, “Fuzzy comprehensive evaluation on supply<br />
chains' total performance based on improved algorithm”,<br />
2009 2nd International Conference on Power Electronics<br />
and Intelligent Transportation System (PEITS), 19-20 Dec.<br />
2009, pp.148-151.<br />
[2] Lee, H.L. and Billington, C., “Managing supply chain<br />
inventory: Pitfalls and opportunities”, Sloan Management<br />
Review, 1992, vol.33, no.3, pp.65-73.<br />
[3] Beamon, M. B. “Measuring supply chain performance”,<br />
International <strong>Journal</strong> <strong>of</strong> Operations & Production<br />
Management, 1999, vol.19, no.3, pp.275-292.<br />
[4] Chai Yue-ting and Liu Yi, Agile Supply Chain<br />
[5]<br />
Management, Beijing: Tsinghua University Press, 2001.<br />
Bullinger H., Kuhner M., and Ho<strong>of</strong> A.V., “Analyzing<br />
supply chain performance using a balanced measurement<br />
method”, International <strong>Journal</strong> <strong>of</strong> Production Research,<br />
2002, vol.40, no.15, pp.3533-3543.<br />
[6] Kee-hung Lai, E. W. T. Ngai, and T. C. E. Cheng,<br />
“Measures for evaluating supply chain performance in<br />
transport logistics”, Transportation Research Part E:<br />
Logistics and Transportation Review, 2002, vol.38, no.6,<br />
pp.439-456.<br />
[7] Robert S. Kaplan and David P. Norton, “The Balanced<br />
Scorecard-Measures That Drive Performance”, Harvard<br />
Business Review, 1992, vol.70, no.1, pp.71-79.<br />
[8] Kleijnen J.P.C. and Smits M.T., “Performance metrics in<br />
supply chain management”, <strong>Journal</strong> <strong>of</strong> the Operational<br />
Research Society, 2003,vol.54, no5, pp.507-514.<br />
[9] Ma Shi-hua, Li Hua-yan, and Lin Yong, “Application <strong>of</strong><br />
Balanced Scorecard in Supply Chain Performance<br />
Measurement”, Industrial Engineering and Management,<br />
2003, vol.7, no.4, pp.5-10.<br />
[10] F.T.S. Chan, “Performance Measurement in a Supply<br />
Chain”, International <strong>Journal</strong> <strong>of</strong> Advanced Manufacturing<br />
Technology, 2003, vol.21, no.7, pp.534-548.<br />
[11] R. Bhagwat and M. K. Sharma, "Performance
measurement <strong>of</strong> supply chain management using the<br />
analytical hierarchy process”, Production Planning and<br />
Control, 2007, vol.18, no.8, pp.666-680.<br />
[12] F.T.S. Chan and H. J. Qi, “An innovative performance<br />
measurement method for supply chain management”,<br />
Supply Chain Management, 2003, vol.8, no.3-4,<br />
pp.209-223.<br />
[13] Rajat Bhagwat, T.S. Felix Chan and M. K. Sharma,<br />
“Performance measurement model for supply chain<br />
management in SMEs”, International <strong>Journal</strong> <strong>of</strong><br />
Globalisation and Small Business, 2008, vol.2, no.4,<br />
pp.428-445.<br />
[14] Qinghua Zhu, Joseph Sarkis, and Kee-hung Lai,<br />
“Confirmation <strong>of</strong> a measurement model for green supply<br />
chain management practices implementation”,<br />
International <strong>Journal</strong> <strong>of</strong> Production Economics, 2008,<br />
vol.111, no.2, pp.261-273.<br />
[15] Saaty, T.L., The Analytical Hierarchy Process,<br />
McGraw-Hill: New York, 1980.<br />
[16] Hua Jiang, and Fangshan Wang, “Analysis <strong>of</strong> influencing<br />
factors on performance evaluation <strong>of</strong> agricultural products<br />
network marketing based on AHM”, 2009 2nd<br />
International Conference on Power Electronics and<br />
Intelligent Transportation System (PEITS), 19-20 Dec.<br />
2009, pp. 140- 143.<br />
[17] Hua Jiang, and Junhu Ruan, “Fuzzy Evaluation on<br />
Network Security Based on the New Algorithm <strong>of</strong><br />
Membership Degree Transformation—M(1,2,3)”, <strong>Journal</strong><br />
<strong>of</strong> Networks, 2009, vol.4, no.5, pp.324-331.<br />
[18] Hua Jiang, and Zhanping Hou, “Analysis on the evaluation<br />
index system <strong>of</strong> supply chains' total performance based on<br />
rough set theory”, 2009 2nd International Conference on<br />
Power Electronics and Intelligent Transportation System<br />
(PEITS), 19-20 Dec. 2009, pp. 144 - 147.<br />
Jing Yang, born in 1979, Handan, Hebei Province, China. In<br />
May, 2009, she graduated from Hebei University <strong>of</strong><br />
Engineering and obtained her postgraduate qualifications. Her<br />
main research fields include: Enterprise e-business applications,<br />
Information Management, Supply Chain Management.<br />
She is now a Lecturer in the School of Kexin, Hebei University of Engineering. Her current research interests include supply chain management and scientific decision-making.
Hua Jiang, born in 1977, Handan, Hebei Province, China. In<br />
March, 2006, he graduated from Hebei University <strong>of</strong><br />
Engineering and obtained his postgraduate qualifications. His<br />
main research fields include: network security, information<br />
management, supply chain management.<br />
He is now an Associate Professor in the Information Management Department, Economics and Management School, Hebei University of Engineering. His current research interests include management optimization and scientific decision-making.
A Novel Combine Forecasting Method for<br />
Predicting News Update Time<br />
Mengmeng Wang<br />
College <strong>of</strong> Computer Science and Technology, Jilin University, Changchun, China<br />
Key Laboratory <strong>of</strong> Symbolic Computation and Knowledge Engineering attached to the Ministry <strong>of</strong> Education, Jilin<br />
University, Changchun, China
Email: wmmwwlh@126.com<br />
Xianglin Zuo<br />
College <strong>of</strong> Computer Science and Technology, Jilin University, Changchun, China<br />
Email: 295228473@qq.com<br />
Wanli Zuo and Ying Wang<br />
College <strong>of</strong> Computer Science and Technology, Jilin University, Changchun, China<br />
Key Laboratory <strong>of</strong> Symbolic Computation and Knowledge Engineering attached to the Ministry <strong>of</strong> Education, Jilin<br />
University, Changchun, China
Email: {zuowl, wangying2010}@jlu.edu.cn
Abstract—With the rapid development of the Internet, the information it provides has shown explosive growth. Faced with massive and constantly updated information, how users can quickly access the most valuable information has become a hot research topic. The update times of Web pages appear erratic, so forecasting the update time of news reports is even more difficult. From an application point of view, mathematical models can approximate this variation as closely as possible, although not with complete accuracy; predicting the update time of news in this way helps improve the news crawler's scheduling policy. In this paper, we propose a combined prediction algorithm for news updates. To predict the update time of news, we first apply the Exponential Smoothing method to our dataset and select its optimal parameters. Second, we leverage the Naive Bayes Model for prediction. Finally, we combine the two methods into a Combination Forecasting method and compare it with the individual methods. Through experiments on Sohu News, we show that the Combination Forecasting method outperforms the other methods when estimating localized rates of update.
Index Terms—Exponential Smoothing Method, Naive Bayes<br />
Model, Combination Forecasting, News Update Time<br />
I. INTRODUCTION<br />
News provides information on recent events, and timeliness is therefore essential to news. Timeliness means that the report and the event it describes occur nearly synchronously, in order to meet the needs of the audience.
(Corresponding author: Wanli Zuo. Tel.: +1-359-608-5187; e-mail: zuowl@jlu.edu.cn.)
The rapid development of Internet technology makes the demand for real-time information grow geometrically, and the online news media has seen unprecedented development. Nicholas Negroponte once said, "Network
media is the traditional media's grave digger". A recent<br />
survey shows that about 90% <strong>of</strong> decision information can<br />
be acquired from the web[1].<br />
The Web is growing explosively [2], and it is almost impossible to download all new pages. The update times of web pages appear erratic, and news is extremely time-sensitive by nature, so forecasting the update time of news reports is even more difficult. From an application point of view, we can use mathematical models to approximate this variation as closely as possible, although not with complete accuracy. Faced with massive and constantly updated information on the Internet, predicting the time of news page updates helps improve the news crawler's scheduling policy. Fetterly et al. [3] analyzed several million pages with the aim of measuring the rate and degree of changes to web pages; their statistical observations showed that page size was a strong predictor of both the frequency and the degree of change. A real-time Web crawling system relies on active crawling. Since the update times of news pages are unknown, crawling at a fixed cycle obviously cannot approximate the news page update frequency well; instead, pages should be crawled at a dynamic frequency that is continually adjusted in the application environment. Focused crawlers download only pages related to a given topic [4][5][6][7]. These works are similar to ours: we also build several kinds of focused crawlers, each in charge of one category of news.
In this paper, we propose a combined prediction algorithm for news updates. To predict the update time of news, we first apply the Exponential Smoothing method to our dataset and select its optimal parameters. Second, we leverage the Naive Bayes Model for prediction. Finally, we combine the two methods into a Combination Forecasting method and compare it with the individual methods. Through experiments on Sohu News, we show that Combination Forecasting achieves better results than the other two methods. Our study differs from previous studies in that we derive a Combination Forecasting method that combines the Naive Bayes Model and Exponential Smoothing to predict the time of news updates.
The rest of this paper is organized as follows: related work is discussed in Section 2; Section 3 briefly describes the problem formulation; Section 4 introduces the proposed approach; the experiments and results are analyzed in Section 5; and Section 6 concludes the paper.
II. RELATED WORK<br />
The rapid development of Web 2.0 technology places higher requirements on timeliness. The flood of information in news, blogs and microblogs is explosive and changes rapidly over time. A breaking event that happens in the morning may die down by noon; if the information is not passed to the user until the afternoon, the user is no longer interested.
Time-sensitive information can be divided into two forms: static time-stamp information and dynamic time-stamp information. News, for instance, belongs to static time-stamp information: information that does not change over time and is related only to a point in time or a time period. Once generated, it is tightly bound to that time and changes only at the moment it is produced. Although such information no longer changes after generation, when and where it is generated is random, and it is almost impossible to fully grasp it precisely [8][9].
Forecasting page update frequency is a very difficult task, and related work was published quite early; research on update frequency can be traced back to the 1970s. There are two main approaches to studying page variation. One is experimental: sample Web pages, then collect and inspect the samples to study and estimate the change patterns of the Web, as in [10][11][12][13][14][15][16][17]. The other is the more classic modeling approach: build a Poisson model, analyze and validate it through experiments, and estimate the related parameters so as to predict when pages change, as in [10][11][13][14][18][19][20]. The Poisson distribution is often used to model events that occur independently at a fixed rate.
Nowadays, the Poisson distribution model is widely used to simulate website update behavior. Although this idea has been studied for a long time, it is difficult to achieve a substantial breakthrough, and the accuracy and stability with which such a plain mathematical model fits Web update behavior are not very high.
Ashutosh Dixit and A. K. Sharma [21] have proposed an architecture for an incremental web crawler that manages the process of revisiting a web site with a view to maintaining fairly fresh documents at the search engine site. The computation of the update time helps improve the effectiveness of the crawling system by efficiently managing the revisiting frequency of a website.
Niraj Singhal, Ashutosh Dixit, and A. K. Sharma [22] developed a novel method for computing the revisiting frequency that helps the crawler overcome its deficiencies by dynamically adjusting this frequency, thereby improving efficiency. The proposed mechanism not only reduces network traffic but also increases quality by providing updated pages. In our work, we continually adjust the news update time in the application environment by leveraging the Combination Forecasting method.
III. PROBLEM FORMULATION<br />
We present Combination Forecasting, a novel method that combines the Naive Bayes Model and Exponential Smoothing to predict the time of news updates.
Fig. 1 shows the architecture of Combination Forecasting. The inputs are news files extracted from Sohu News pages, preprocessed and converted to XML. The first component of the system is predictor 1, which predicts the time of news updates with the Exponential Smoothing method. The second component is predictor 2, which predicts the time of news updates with the Naive Bayes Model. The outputs of predictor 1 and predictor 2 are sent to the final predictor, which computes the combined result. Crawlers can then download pages according to this result.
Figure 1. The architecture of Combination Forecasting
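To make the dataflow concrete, the following is a minimal Python sketch of the pipeline in Fig. 1 under stated assumptions: the XML tag names, the predictor callables and the scheduling hook are illustrative placeholders, not the paper's implementation.

```python
# Minimal sketch of the Fig. 1 pipeline; tag names and predictors are assumed.
import xml.etree.ElementTree as ET

def load_update_history(xml_path):
    """Read one preprocessed news file and return its update timestamps
    (the 'news' element and 'updateTime' attribute are illustrative names)."""
    root = ET.parse(xml_path).getroot()
    return [item.get("updateTime") for item in root.iter("news")]

def combination_forecast(history, predictor1, predictor2, final_predictor):
    """Run both predictors on the same history and merge their outputs."""
    p1 = predictor1(history)         # e.g. exponential smoothing estimate
    p2 = predictor2(history)         # e.g. Naive Bayes estimate
    return final_predictor(p1, p2)   # combined estimate of the next update time
```

A crawler could then schedule its next fetch of the page at the returned time.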
IV. COMBINATION FORECASTING METHOD<br />
A. Exponential Smoothing Method
The Exponential Smoothing method was proposed by Robert G. Brown. Brown believed that a time series has a stable or regular trend and can therefore reasonably be extrapolated; he held that the trend of the recent past will, to some extent, continue into the future, so larger weights should be placed on recent data.
Exponential smoothing is a common method for production forecasting and is also used for short-term economic trend prediction; among prediction methods it is one of the most widely used. The full-period average method weights all historical time series data equally; the moving average method ignores longer-term data; and the weighted moving average method gives more weight to recent data. Exponential smoothing is compatible with the full-period average and moving average methods in that it does not abandon past data but gives it progressively diminishing weight.
B. Naive Bayes Model<br />
The Naïve Bayes model is a tree-structured Bayesian network consisting of a root node and a number of leaf nodes. It uses probability to represent all forms of uncertainty and performs learning and reasoning by probabilistic rules. The model is based on Bayes' theorem and reduces computational cost through conditional independence assumptions. An unknown sample is predicted to belong to the class with the highest posterior probability.
C. Combination Forecasting Method
Combination Forecasting uses two or more different prediction methods for the same problem. It can be a combination of several quantitative or qualitative methods; a combination of qualitative and quantitative methods is often used. The main purpose of the combination is to leverage the information provided by the various methods so as to improve prediction accuracy as much as possible.
The combination forecast has two basic forms:
(a) Equivalent Weight Combination Forecasting, in which the predictive values of the individual methods are combined into a new predictive value using the same weight for each;
(b) Non-equivalent Weight Combination Forecasting, in which different prediction methods are given different weights.
The principle and application of the two forms are the same; only the weights differ. In our work, we combine the Exponential Smoothing method and the Naive Bayes Model for Combination Forecasting.
Exponential smoothing, in its simplest form, involves successive applications of the formula

S_t = α y_t + (1 − α) S_{t−1},    (1)

where, in our work, y_t is the value of the time interval between the two latest news update times, S_t is the "smoothed" value representing the next time interval of news updates, and 0 ≤ α ≤ 1.
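As a minimal illustration of (1), the following Python sketch smooths a history of observed update intervals; the initialization with the first observation and the sample data are assumptions, since the paper does not specify them.

```python
def exponential_smoothing(intervals, alpha):
    """Apply S_t = alpha*y_t + (1 - alpha)*S_{t-1} over observed update
    intervals (in minutes) and return the final smoothed value, which is
    used as the predicted next interval."""
    s = intervals[0]               # assumed initialization: S_1 = first observation
    for y in intervals[1:]:
        s = alpha * y + (1 - alpha) * s
    return s

# Illustrative history of gaps (minutes) between consecutive IT-news updates
history = [30, 45, 20, 60, 35]
print(exponential_smoothing(history, alpha=0.0005))
```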
The news update calculation also fits the Naive Bayes model, in which the update time is determined by the type of news and the size of the news site: the news update time interval can be seen as the root node of the Naive Bayes model, while the type of news and the size of the news site can be seen as its leaf nodes.
The news update time interval is therefore computed as in (2):

P(N_i = i | N_t = j, W_s = h)
  = P(N_i = i, N_t = j, W_s = h) / [P(N_t = j) P(W_s = h)]
  = P(N_i = i) P(N_t = j | N_i = i) P(W_s = h | N_i = i) / [P(N_t = j) P(W_s = h)],    (2)
where i, j and h are the values of the corresponding discrete intervals, and Ni, Nt and Ws stand for the news update time interval, the type of news and the size of the news site, respectively. They are defined as follows:
Ni is the interval between the news' next update time and its latest update time;
Nt is the type of news, including IT news, house news, health news, education news, economic news, travel news, media news, auto news, fashion news, sports news, culture news, game news and entertainment news;
Ws is the number of occurrences of the news site in the training dataset. The model is presented in Fig. 2.
Figure 2. The Naïve Bayes model of the news update time interval
We will discuss the value of α, the discrete intervals and the parameters of Combination Forecasting in the next section.
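Before turning to the experiments, here is a minimal Python sketch of the Naive Bayes prediction step of (2). The discretization boundaries follow Table II in the next section; the helper names, the absence of probability smoothing and the sample structure are assumptions rather than the paper's implementation.

```python
from collections import Counter, defaultdict

def discretize_ni(minutes):      # Table II: fast [0,134], middle (134,853], slow (853,+inf)
    return "fast" if minutes <= 134 else ("middle" if minutes <= 853 else "slow")

def discretize_ws(count):        # Table II: small [0,11], middle (11,62], large (62,+inf)
    return "small" if count <= 11 else ("middle" if count <= 62 else "large")

def train(samples):
    """samples: (ni_label, news_type, ws_label) tuples from the training set."""
    prior = Counter(ni for ni, _, _ in samples)           # counts of the root node Ni
    cond_t, cond_w = defaultdict(Counter), defaultdict(Counter)
    for ni, nt, ws in samples:
        cond_t[ni][nt] += 1                               # counts for P(Nt | Ni)
        cond_w[ni][ws] += 1                               # counts for P(Ws | Ni)
    return prior, cond_t, cond_w

def predict(prior, cond_t, cond_w, nt, ws):
    """Return the Ni label maximizing P(Ni) P(Nt|Ni) P(Ws|Ni); the evidence
    term in the denominator of (2) is constant and can be ignored."""
    total = sum(prior.values())
    def score(ni):
        return (prior[ni] / total) * \
               (cond_t[ni][nt] / prior[ni]) * \
               (cond_w[ni][ws] / prior[ni])
    return max(prior, key=score)
```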
V. EXPERIMENT AND EVALUATION<br />
A. Dataset
Since the update rates of the sources may vary with time, localized estimation provides more detailed, useful and accurate information. We chose Sohu news from April 23, 2012 to June 23, 2012, two months of data, as our data set. Within these data, news from April 23, 2012 to May 16, 2012 serves as the training dataset and news from May 16, 2012 to June 23, 2012 as the test dataset.
The news is sorted into 13 categories. Each category has an XML file; the file format is presented in Fig. 3, and all categories share the same format.
Figure 3. XML file of the IT news
B. Experiments
1) Exponential Smoothing<br />
For the exponential smoothing forecasting method, the smoothing constant determines the degree of smoothing and the speed of response to differences between predicted and actual values. The closer the smoothing constant α is to 1, the faster the influence of older actual values on the current smoothed value decays; the closer α is to 0, the more influence older actual values retain. Hence, when the time series is relatively stable, a larger α should be selected, while a smaller α should be selected when the time series fluctuates, so that the influence of long-term values is not ignored. To evaluate the effect of α, we compare the results obtained by assigning different values to α: 0.5, 0.05, 0.005, 0.0005 and 0.00005. Table I compares the MAD (Mean Absolute Difference, in minutes) of the different parameters. The experimental results show that 0.0005 was the optimal parameter.
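For reference, the MAD used here is the mean absolute difference, in minutes, between predicted and actual intervals. The following is a minimal sketch of the parameter sweep, reusing the exponential_smoothing helper from the earlier sketch; the sample structure is an assumption.

```python
def mad(errors):
    """Mean absolute difference of the prediction errors, in minutes."""
    return sum(abs(e) for e in errors) / len(errors)

def best_alpha(samples, candidates=(0.5, 0.05, 0.005, 0.0005, 0.00005)):
    """samples: (history, actual_next_interval) pairs from the test set."""
    scores = {}
    for alpha in candidates:
        errors = [exponential_smoothing(history, alpha) - actual
                  for history, actual in samples]
        scores[alpha] = mad(errors)          # lower MAD means a better alpha
    return min(scores, key=scores.get), scores
```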
2) Naive Bayes Model<br />
For the Bayesian approach, News Type is a discrete attribute, but some variables (NewsTime and WebSize) are continuous. To calculate the conditional probability of continuous attributes, the Naive Bayes model offers two options:
(a) discretize the continuous attributes and use discrete intervals in their place;
(b) use a probability distribution function for the calculation.
We use the first option to calculate the conditional probabilities. The discrete intervals of Ni are defined as fast, middle and slow, while large, middle and small are the intervals of Ws. The discrete interval of each attribute is defined as shown in Table II; the boundary values are determined according to the experiment.
In line with the probability distribution of the news updates, the update frequency is divided into three levels (fast, middle and slow), and the forecast of news updates is made accordingly.
TABLE I.
THE MAD (MEAN ABSOLUTE DIFFERENCE, IN MINUTES) FOR DIFFERENT VALUES OF α
News Type α=0.5 α=0.05 α=0.005 α=0.0005 α=0.00005
IT news 71 61 45 44 44<br />
house news 54 45 34 34 34<br />
health news 39 31 25 25 25<br />
education news 44 36 28 28 28<br />
economic news 44 38 30 30 30<br />
travel news 64 56 44 44 44<br />
media news 69 60 45 45 45<br />
auto news 45 36 29 29 29<br />
fashion news 58 50 38 38 38<br />
sports news 167 155 117 117 117<br />
culture news 68 62 48 48 48<br />
game news 100 92 62 62 62<br />
entertainment news 47 38 29 29 29<br />
3) Combination Forecasting<br />
Finally, we use Non-equivalent Weight Combination Forecasting, assigning weights to the exponential smoothing and Bayesian methods. The tuple ω = (1 − β, β) represents the weight assignment, where 1 − β is the weight assigned to the exponential smoothing method and β is the weight assigned to the Bayesian model. We compared the results obtained by setting the parameter ω to (0.85, 0.15), (0.95, 0.05), (0.98, 0.02), (0.99, 0.01) and (0.995, 0.005). Table III shows the MAD (Mean Absolute Difference, in minutes) of the different parameters. Experiments showed that ω = (0.99, 0.01) was the optimal parameter.
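A minimal sketch of the weighted combination follows, assuming the first tuple entry weights predictor 1 (exponential smoothing) and the second weights predictor 2 (the Naive Bayes estimate mapped to a representative interval in minutes, a mapping the paper does not detail).

```python
def combine(es_minutes, nb_minutes, weights=(0.99, 0.01)):
    """Non-equivalent weight combination of the two interval predictions,
    both expressed in minutes; weights default to the optimal tuple above."""
    w_es, w_nb = weights
    return w_es * es_minutes + w_nb * nb_minutes

# Example with the optimal tuple reported in Table III (illustrative inputs)
print(combine(30.0, 120.0))   # -> 30.9 minutes
```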
C. Evaluation<br />
The comparisons of the different estimators are based on the same datasets. The following experiment compares our Combination Forecasting method with the baseline methods discussed above, the Exponential Smoothing method and the Naive Bayes Model.
Table IV shows the MAD (Mean Absolute Difference, in minutes) for each method. As illustrated in Table IV, the Combination Forecasting method, which leverages the information provided by the individual methods, achieved the best performance.
In Table IV, ES, B and CF stand for the Exponential Smoothing method, the Naive Bayes Model and the Combination Forecasting method, respectively. Due to space limitations, Fig. 4(a), (b) and (c) show only the predicted and actual update times of media news for each method with the optimal parameter. We can conclude that the proposed Combination Forecasting method outperforms the other methods in most cases.
The following diagrams are obtained from the statistics of variation on the collected pages, where the X axis is the index of the news item and the Y axis is the news update time on a logarithmic scale; '*' stands for the predicted news update time and '-' for the actual news update time. From Fig. 4(a), (b) and (c) we find that news updates show an obvious temporal-locality pattern, similar to the finding of Tao Meng et al. [23].
TABLE II.
THE OPTIMAL DISCRETE INTERVAL OF EACH ATTRIBUTE
Variable Interval1 Interval2 Interval3
Ni fast [0,134] middle (134,853] slow (853,+∞)
Ws large (62,+∞) middle (11,62] small [0,11]

VI. CONCLUSION
The update time of a web page appears to be erratic, and news is extremely time-sensitive by nature. From the application point of view, we can use mathematical models to forecast the update time of news reports, although this cannot be completely accurate. Predicting the time of news page updates helps improve the news crawler's scheduling policy. In this paper, we proposed a new prediction policy for news updates.
TABLE III.
THE MAD (MEAN ABSOLUTE DIFFERENCE, IN MINUTES) FOR DIFFERENT WEIGHT TUPLES ω
News Type (0.85,0.15) (0.95,0.05) (0.98,0.02) (0.99,0.01) (0.995,0.005)
IT news 26 21 20 20 20<br />
house news 19 19 19 19 19<br />
health news 21 17 17 17 18<br />
education news 22 21 21 21 21<br />
economic news 11 9 9 8 8<br />
travel news 22 21 21 21 21<br />
media news 36 21 17 15 15<br />
auto news 48 25 19 18 18<br />
fashion news 35 25 23 23 23<br />
sports news 77 31 20 17 17<br />
culture news 31 24 23 23 23<br />
game news 60 60 60 60 60<br />
entertainment news 22 19 18 17 17<br />
TABLE IV.<br />
THE BEST VALUE OF MAD (MEAN ABSOLUTE DIFFERENCE) OF DIFFERENT METHODS<br />
Method IT house health education economic travel media auto fashion sports culture game entertainment<br />
ES 44 34 25 28 30 44 45 29 38 117 48 62 29<br />
B 369 184 251 224 254 241 684 546 374 1045 459 147 425<br />
CF 20 19 17 21 8 21 15 18 23 17 23 60 17<br />
Figure 4. The predicted and actual update times of media news for the three methods: (a) Exponential Smoothing method; (b) Naive Bayes Model; (c) Combination Forecasting
To predict the time of news updates, we first applied the Exponential Smoothing method to our dataset and selected the optimal parameters. Second, we leveraged the Naive Bayes Model for prediction. Finally, we proposed a new method that combines the two for Combination Forecasting and compared it with the other two methods. In a scenario with an inconsistent rate of updates, Combination Forecasting provides more detailed and useful information than global estimation. Tests on the datasets confirm that the proposed Combination Forecasting method outperforms using the Exponential Smoothing method or the Naive Bayes Model alone when estimating localized rates of update.
ACKNOWLEDGEMENTS<br />
This work is supported by the National Natural Science Foundation of China under Grant No. 60973040; the National Natural Science Foundation of China under Grant No. 60903098; the basic scientific research foundation for the interdisciplinary research and innovation project of Jilin University under Grant No. 201103129; and the Science Foundation for China Postdoctor under Grant No. 2012M510879.
REFERENCES<br />
[1] S. Thompson, C. Y. Wing. Assessing the Impact <strong>of</strong> Using<br />
the Internet for Competitive Intelligence. Information &<br />
Management, 2001.<br />
[2] Brewington, B. & Cybenko, G. How Dynamic is the Web.<br />
Proceedings <strong>of</strong> WWW –9th International World Wide Web<br />
Conference, 2000.<br />
[3] D. Fetterly, M. Manasse, M. Najork, and J. Wiener. A<br />
Large-Scale Study <strong>of</strong> the Evolution <strong>of</strong> Web Pages.<br />
S<strong>of</strong>tware: Practice & Experience, 2004.<br />
[4] Menczer F., Belew R. Adaptive Retrieval Agents: Internalizing Local Context and Scaling up to the Web. Machine Learning, 2000.
[5] Pant G, Menczer F. Topical Crawling for Business<br />
Intelligence. Proc 7th European Conference on Research<br />
and Advanced Technology for Digital Libraries, 2003.<br />
[6] K. Stamatakis, V. Karkaletsis, G. Paliouras, J. Horlock, et<br />
al. Domain-specific Web site identification: the<br />
CROSSMARC focused Web crawler. Proceedings <strong>of</strong> the<br />
2nd International Workshop on Web Document Analysis,<br />
2003.<br />
[7] Filippo Menczer, Gautam Pant, Padmini Srinivasan,<br />
Topical web crawlers: Evaluating adaptive algorithms.<br />
ACM Transactions on Internet Technology, 2004.<br />
[8] Dennis Fetterly, Mark Manasse, Marc Najork, Janet L.<br />
Wiener: A large-scale study <strong>of</strong> the evolution <strong>of</strong> Web pages.<br />
S<strong>of</strong>tware: Practice & Experience, 2004.<br />
[9] Judit Bar-Ilan: Search Engine Ability to Cope With the<br />
Changing Web. Web Dynamics, 2004.<br />
[10] Junghoo Cho, Hector Garcia-Molina: The Evolution <strong>of</strong> the<br />
Web and Implications for an Incremental Crawler. Very<br />
Large Data Base Endowment Inc., 2000.<br />
[11] Fred Douglis, Anja Feldmann, Balachander Krishnamurthy,<br />
Jeffrey C. Mogul: Rate <strong>of</strong> Change and other Metrics: a<br />
Live Study <strong>of</strong> the World Wide Web. USENIX Symposium<br />
on Internet Technologies and Systems, 1997.
[12] Dennis Fetterly, Mark Manasse, Marc Najork: On The<br />
Evolution <strong>of</strong> Clusters <strong>of</strong> Near-Duplicate Web Pages. J.<br />
<strong>Journal</strong> <strong>of</strong> Web Engineering, 2004.<br />
[13] Brian E. Brewington, George Cybenko: How dynamic is<br />
the Web? Computer Networks, 2000.<br />
[14] Brian E. Brewington, George Cybenko: Keeping Up with<br />
the Changing Web. IEEE Computer, 2000.<br />
[15] Luis Francisco-Revilla, Frank M. Shipman III, Richard<br />
Furuta, Unmil Karadkar, Avital Arora: Perception <strong>of</strong><br />
content, structure, and presentation changes in Web-based<br />
hypertext. Hypertext , 2001.<br />
[16] Tao Meng II, Hongfei Yan, Jimin Wang, Xiaoming Li: The<br />
Evolution <strong>of</strong> Link-Attributes for Pages and Its Implications<br />
on Web Crawling. Web Intelligence, 2004.<br />
[17] Dennis Fetterly, Mark Manasse, Marc Najork, Janet L.<br />
Wiener: A large-scale study <strong>of</strong> the evolution <strong>of</strong> web pages.<br />
Proceedings <strong>of</strong> WWW –12th International World Wide<br />
Web Conference, 2003.<br />
[18] Judit Bar-Ilan, Bluma C. Peritz: Evolution, continuity, and<br />
disappearance <strong>of</strong> documents on a specific topic on the Web:<br />
A longitudinal study <strong>of</strong> informetrics. <strong>Journal</strong> <strong>of</strong> the<br />
American Society for Information Science and Technology,<br />
2004.<br />
[19] XM Li. An Estimation <strong>of</strong> the Quantity <strong>of</strong> Web Pages Ever<br />
in China. <strong>Journal</strong> <strong>of</strong> Peking University (Science and<br />
Technology), 2003. (in Chinese with English abstract)<br />
[20] Sandeep Pandey, Christopher Olston: User-centric Web<br />
crawling. Proceedings <strong>of</strong> WWW –15th International World<br />
Wide Web Conference, 2005.<br />
[21] Ashutosh Dixit and A.K. Sharma. A Mathematical Model<br />
for Crawler Revisit Frequency. Proceedings <strong>of</strong> IEEE 2nd<br />
International Advance Computing Conference, 2010.<br />
[22] Niraj Singhal, Ashutosh Dixit, and Dr. A. K. Sharma.<br />
Design <strong>of</strong> a Priority Based Frequency Regulated<br />
Incremental Crawler. International <strong>Journal</strong> <strong>of</strong> Computer<br />
Applications, 2010.<br />
[23] Tao Meng, Hongfei Yan, Jimin Wang, Characterizing<br />
Temporal Locality in Changes <strong>of</strong> Web Documents. <strong>Journal</strong><br />
<strong>of</strong> the China Society for Scientific and Technical<br />
Information, 2005.<br />
Mengmeng Wang was born in 1987 in the city of Changchun. She graduated from the College of Computer Science and Technology of Jilin University in 2011. She is currently a graduate student in computer software and theory at Jilin University, China, working in the fields of social computing and data mining. Her research interests include data mining, information retrieval and social networks.
Xianglin Zuo is a student at Jilin University, China. His major is Computer Science and Technology. His research interests include data structures, data mining and web crawling.
Ying Wang was born in 1981. She received her master's degree (computer application technology) from Jilin University in 2007 and her Ph.D. (computer application technology) from Jilin University in 2010. Currently she is an instructor in the College of Computer Science and Technology at Jilin University, China. She has many publications in international journals and conferences. Her research interests include data mining, information retrieval and social networks. In 2009, she won the second prize of the Jilin Province Scientific and Technological Progress Award.
Wanli Zuo received his Ph.D. (computer software and theory) from Jilin University in 1985. Currently he is a professor in the College of Computer Science and Technology at Jilin University, China. His research interests include database theory, machine learning, data mining, web mining and Internet search engines. He has guided several Ph.D. students, published five teaching materials and monographs, and has more than 110 publications in international journals and conferences.
Prof. Zuo is a member of the System Software Professional Committee of the China Computer Federation and of the teaching commission of the College of Computer Science and Technology at Jilin University. He is among the first batch of Jilin Province top-notch innovative talents and holds the titles of Jilin University teacher's morality pacesetter and Jilin University teaching demonstration teacher. He has been recognized as one of Jilin Province's young and middle-aged professional and technical personnel with outstanding contributions. Prof. Zuo has received several awards, such as the National Teaching Achievement Prize, the Baosteel Education Fund Excellent Teacher award, and five provincial-level awards.
Information-based Study <strong>of</strong> E-Commerce<br />
Website Design Course<br />
Xinwei Zheng<br />
The Key Lab <strong>of</strong> E-Business Market Application on Technology <strong>of</strong> Guangdong Province, Guangzhou, China<br />
Email: caiweihust@163.com<br />
Abstract—Information-based teaching is becoming more and more popular in colleges, and how to construct an Electronic Commerce website design course is an important research topic in the construction of informatization. This paper analyses the shortcomings of information-based teaching and presents corresponding methods and an implementation plan. It also presents the structure diagram of an information-based teaching platform. Moreover, this paper proposes a new teaching method named the "teaching method with eight incremental steps". Finally, statistical data show that information-based teaching inspires students' interest in studying, and that the platform improves interaction between teachers and students and enhances the quality and effectiveness of the course.
Index Terms—information-based teaching, EC website<br />
design lesson, learning resources<br />
I. INTRODUCTION<br />
With the rapid development and wide use of information technology, using technologies such as computers and networks to deepen teaching content, innovate teaching modes, promote the effective use of teaching resources, expand the coverage of education and improve teaching quality is the inevitable trend of information-based teaching in universities and an important strategy for the healthy and sustainable development of higher education. The construction of information-based teaching is the emphasis of higher information-based education, and its level of construction is an important measure of the effectiveness, reputation and status of a college. The national document [1], a compendium of educational innovation and development for the long term, states that information technology has a revolutionary effect on educational development and must be given more attention. UCISA (Universities and Colleges Information System Association) in Britain conducted a study of 264 colleges. The results showed that, on average, 76% of colleges had drawn up E-learning development plans in order to improve teaching quality and satisfy students' study needs [2].
As a professional network teaching platform product, Blackboard not only facilitates teaching but also strengthens communication and evaluation, and it is easy to use and powerful [3]. Through the Blackboard teaching platform, teachers can efficiently manage lessons, create content, assign homework, and test online. These functions help students study easily, communicate with pleasure and participate with enthusiasm; they also help improve the level of online teaching and realize a profound innovation in teaching systems and methods.
II. EXISTING DEFECT OF INFORMATION-BASED TEACHING<br />
Although the EC (electronic commerce) website design lesson is only one elective course of the EC major, the arrangement of its teaching content, the organization of its procedures and the level of teaching have a great effect on basic education, practical experience and comprehensive quality training. Currently, students' communication tools after class are the telephone and email, so there are many problems in teaching, summarized in the following three points:
A. Conservative Ideas and Backward Form <strong>of</strong> Teaching<br />
A 45-minute traditional lesson is very limited, and many demos cannot be demonstrated to students. Although Guangdong University of Business Studies has introduced the Blackboard network teaching system, to which many learning resources can be uploaded, the practical effect is not ideal. Some teachers still follow the traditional teaching model and only upload PowerPoint slides, without using information-based technology to facilitate teaching.
B. Isolated Island <strong>of</strong> Teaching Resources<br />
From instances of network lesson use, we find that many teachers' resources are relatively independent, like isolated islands. These resources are rarely shared with other teachers, so the resource utilization rate is very low.
C. Lack <strong>of</strong> Monitor, Evaluation, Feedback in Teaching<br />
We carried out a questionnaire survey on the question "For the existing network curriculums, which aspect do you think is the most urgent to improve?" As shown in Table I, the survey indicates that the most urgent aspect to improve is "the intercommunion between student and teacher is too little", at 64.2%. Students are eager to communicate and interact through the network learning platform. With Blackboard facilities such as notifications, discussion boards, messages and the homework area, not only human-computer interaction but also communication between teachers and students, and among students, can be achieved. The forum can be divided into several discussion sections, for example "webpage art design", "advertising design", "commercial websites", and so on. Through communication in the forum, students can improve their independent thinking and written expression abilities. At the same time, in curriculum design teachers can take full account of the nature of the teaching content, provide better themes, promote the development of discussion, and achieve the desired teaching effect.
TABLE I.<br />
THE QUESTIONNAIRE SURVEY AIMED AT THE PROBLEM “FOR THE<br />
EXISTING NETWORK CURRICULUMS, WHICH DO YOU THINK IS THE<br />
MOST URGENT TO IMPROVE”<br />
No. Aspect Percentage of respondents
1 The quantity <strong>of</strong> pictures is too small 10.6%<br />
2 Video contents are not enough. 30.7%<br />
3 Animation amount is too little. 16.8%<br />
4 The interface is not beautiful. 33.5%<br />
5 Interactions are not enough. 50.8%<br />
6 The intercommunion between student and<br />
teacher is too little.<br />
64.2%<br />
7 Expanding documents are too few. 48.6%<br />
8 Teaching the courses is mainly in the traditional<br />
way, therefore Inquiry learning, case studies<br />
and other forms <strong>of</strong> teaching are required.<br />
57.5%<br />
In traditional teaching, we evaluate the effect through feedback and surveys from students and tracking assessments. However, we cannot track exact information, and this becomes a bottleneck for improving the study effect and affects the assessment of the lesson.
III. MEASURE OF INFORMATION-BASED TEACHING<br />
A. Create a Diversified and Reasonable Classification <strong>of</strong><br />
Learning Resources<br />
Through information-based teaching, not only courseware but also other resources can be uploaded to Blackboard. Resources on the BB platform are grouped into lesson cells, including teachers' information, an introduction to learning methods, the teaching program, teaching plans, teaching resources (PPT slides, exercises after lessons, self-testing practice with answers, classic PPTs with comments), expanded resources, homework, online testing, a discussion area, videos, assistant resources, an exercise database with answers, many classic cases, website materials and so on; every cell has a description and a definition of its function and content [4].
As to the specific application, the EC lessons group the resources into ASP.NET, JSP and PHP according to the different web development technologies, in order to help students master several languages for developing websites. Students who formerly could not obtain resources from lessons can now obtain more and more knowledge from the BB platform.
B. Create a Way of Learning Based on Student-Independent, Teacher-Led and Teacher-Student Interaction
To realize information-based teaching, we must cast off the fetters of traditional ideas and transform the idea of knowledge inculcation into one in which students are the main body and teachers lead students to study. We should give students more learning autonomy, make reasonable use of teaching resources, and design, develop, organize, manage and monitor the whole teaching procedure [5]. This paper proposes a new teaching method named the "teaching method with eight progressive steps", which is based on student independence, teacher guidance and teacher-student interaction.
C. Strengthen the Resources Sharing <strong>of</strong> Teachers<br />
Teachers and students construct study resources together and extend the scope of the resources. Learners can upload the resources they collect, e.g. text, images, video, audio, PPTs, learning websites, network lessons, blogs and so on, and can also upload hyperlinks to the corresponding cells related to those resources, to realize resource sharing and raise the utilization rate of resources. Learners can publish their opinions about uploaded resources and communicate and discuss with each other. A teacher who teaches the EC website lesson can share resources with teachers who teach the EC systems lesson or web page development and management, since many chapters overlap. In this way teachers can share teaching experience and resources with each other.
D. Create Interactive Learning Resources and Strengthen Feedback
With information-based teaching on the Internet, the classroom can stretch to every location covered by the Internet. Information-based teaching therefore helps shorten the distance between teachers and students and helps them communicate efficiently. The Blackboard network teaching system has many resources, so we must design teaching content for before, during and after the lesson. For example, we can assign teaching tasks before the lesson and let students prepare well; we can teach the lesson combined with network resources such as videos, papers and cases; and we can assign homework and tests after the lesson. We can carry out teaching activities through the network, such as answering frequently asked questions, individual coaching, discussion of problems and discussion of cases. Through the discussion area of Blackboard, we can realize interaction among teachers, between teachers and students, and among students; students can communicate and discuss the integration of IT and the lesson, the design scheme of teaching, teaching content and video cases. Students and teachers discuss the same subject through the BB platform, so the discussion is pertinent and analyzed in depth. In this way students can improve their abilities in analyzing problems, solving problems and logical thinking. Besides obtaining more knowledge, students can give feedback more quickly and communicate more easily.
The Blackboard network teaching platform is a lesson management system delivered over the Internet [6]. Through the BB platform, we can upload teaching files related to the EC website design lesson, such as the teaching program, teaching plans, PPTs, exercises and teaching videos. In this way we can remedy the limited class hours and low information content of traditional teaching, strengthen interaction among students in the discussion area, improve learning autonomy, extend the scope of study from inside to outside the classroom, inspire enthusiasm for studying and improve the teaching effect. The Blackboard network teaching platform requires teachers to pay more attention to the dynamic changes of the website, answer students' questions and update the website's resources. In this way, students keep more interest in the lessons and the network platform can give full play to its function.
IV. INFORMATION-BASED IMPLEMENTATION PLAN FOR THE ELECTRONIC BUSINESS WEBSITE DESIGN LESSON
A. Create a Staged Curriculum Plan
First, the talent training goal of the e-commerce website design lesson is subdivided and positioned. The goal is that students can independently design, develop, produce and maintain a website through teaching and experiments on the information platform.
Secondly, the required knowledge system is constructed for this training goal. Students need to master webpage making, website design, website maintenance and other curriculum knowledge; at the same time, in order to build a site they need to master image processing, animation and the supporting back-end database technology. The structure diagram of the knowledge system is shown in Fig. 1.
Figure 1. The structure diagram of the knowledge system (webpage making, website design and website maintenance, supported by the information-based platform)
Finally, the stages and scheme of the curriculum design are set down stage by stage, as shown in Table II. Students achieve a specific goal at each stage under the staged curriculum training mode. The information-based platform is used throughout the course: teachers organize and plan the teaching, while students make full use of the platform for learning. In this way, students are cultivated into professional talents.
TABLE II.<br />
THE STAGED CURRICULUM PLAN OF ELECTRONIC BUSINESS WEBSITE<br />
DESIGN LESSON<br />
Stage Staged training goal<br />
1 Learn to make simple webpages with Dreamweaver<br />
tool<br />
2 Skilled in using Photoshop and JavaScript to make<br />
dynamic webpage, build a simple personal website<br />
3 Learn through the information-based platform, and assess the effect of personal website webpage design and making.
4 Learn to connect database, and design the interactive<br />
website through teaching and information platform.<br />
5 Learn to design the function module <strong>of</strong> website, define<br />
the table and keywords <strong>of</strong> database, and define classes.<br />
6 Learn through information-based platform, and<br />
comprehensively assess own personal website in the<br />
aspects <strong>of</strong> design, function, interface, maintenance and<br />
updating management.<br />
B. Construct Personal Information-based Teaching<br />
Platform according to BB Platform Function<br />
Based on the Blackboard platform, this paper independently designs a teaching platform for the electronic commerce website design lesson. The structure diagram of the platform is shown in Fig. 2.
Based on the Blackboard network teaching platform, the content of the information-based teaching platform for the electronic business website design lesson is divided into two main parts, the Public Column and the Teaching Column. The Public Column introduces the course, and the Teaching Column guides the working process.
(1) Teaching resources can be obtained through the Public Column, i.e. videos, electronic teaching plans, the different experimental tasks, reference materials, excellent works, etc. The teaching resources include many multimedia materials, which benefit students a lot. As described in [7], instructional activities in a multimedia technology environment can stimulate students' interest and mobilize their enthusiasm for learning.
The different experimental tasks in particular not only deepen students' understanding of the course but also improve their abilities in analysis, design, production and site management. Salmon pointed out that if we want to bring a satisfactory network learning experience to students, management is one of the important means of successfully constructing network teaching [8]. When designing and using the platform, we must provide clear guidance information and give clear requirements and evaluation standards for students to participate, and at the same time we must provide a platform for independent participation.
Students look forward to receiving confirmation from teachers about their work. For every experimental task, the teacher chooses some representative works and uploads them to the Blackboard platform to share, and students can express their views in the forum. In this way, learners' motivation is promoted.
Figure 2. The structure diagram of the information-based teaching platform in the Electronic Commerce website design lesson (Public Column: course navigation, teacher team, teaching resources, experimental tasks, research achievements, learning notes, course introduction, teaching program; Teaching Column: learning platform, collaboration tools, exam and homework, grade management, forum, course statistics)
(2) In the Teaching Column, using Exam and Homework, teachers can make the final evaluation of individual homework, interest-group homework, special homework and project homework, carry out statistical analysis of students' grade data through Grade Management, and finally make an accurate judgment of the students' ability.
The collaborative tools of the Blackboard platform, which include the Virtual Classroom, the Chat Room and Electronic Interaction, can effectively support students' collaborative learning. The Chat Room supports real-time text chat, which enhances students' real-time communication. In addition to text chat, the Virtual Classroom also supports the Whiteboard, Collaborative Browsing, Questions and Answers Collection, as well as exiting from the Virtual Classroom.
Teachers can also use the Course Statistics of the Blackboard network teaching platform, which has a strong background monitoring function. Using the content tracking tools in Course Statistics, teachers can obtain tracking statistics of study information and accurate counts of students' login clicks. But if teachers evaluate only from the number of visits and the completion of each piece of work, they can draw only a one-sided conclusion. Thus, teachers should set specific scoring criteria for learning. According to students' learning time, participation in the network forum and speech quality, real-time interaction and resource contribution, and by integrating resource-based teaching strategies with the experimental-task-based teaching strategy through the Virtual Classroom, the Chat Room, Electronic Interaction, Exam and Homework, Grade Management, the forum and so on, a single summative evaluation can be made of students' present experimental teaching effects. This evaluation also serves as a diagnostic evaluation for the next experimental task, so that teachers can finally give a pluralistic evaluation to the students.
The Course Statistics data of the Blackboard platform of Guangdong University of Business Studies are shown in Fig. 3 and Table III. The diagram and table show that the most frequently accessed section for students is the forum for communicating, followed by the teaching resources for downloading, and then Exam and Homework. The data prove that the information-based platform provides powerful functions that help teachers and students interact with each other. At the same time, the tracking function of the platform can help teachers track the learning effect and improve their teaching methods and teaching focus.
Figure 3. The pie chart of statistical data of the course in October 2012 (main slices: experimental tasks, content area, forum, e-mail, others)
TABLE III. THE STATISTICAL DATA OF THE COURSE IN OCTOBER 2012 (AREA ID, NUMBER OF CLICKS, PERCENTAGE)

Personal information, 1, 0.01%
Content area, 2602, 15.14%
Forum, 2172, 12.64%
Electric Blackboard, 50, 0.29%
Experimental task, 3445, 20.05%
Collaboration, 895, 5.21%
Exchange area, 36, 0.21%
E-mail, 2067, 12.03%
Digital transceiver cabinet, 457, 2.66%
Electronic filing, 146, 0.85%
My grades, 1299, 7.56%
Notice, 1524, 8.87%
News, 2, 0.01%
Grade Center, 919, 5.35%
Grade indicator board, 115, 0.67%
Copy a file to Collection, 753, 4.38%
Check the Collection link, 689, 4.01%
Content Collection, 0, 0%
Address book, 1, 0.01%
Glossary, 1, 0.01%
Roster, 0, 0%
Safe Assign, 0, 0%
Tools area, 6, 0.03%
Early warning system, 1, 0.01%
Schedule, 0, 0%
Total, 17181, 100%
Figure 4. Teaching method with eight incremental steps (T: teacher, S: student), progressing from teacher leading to student autonomy. The steps are: Where do we want to be?, How do we get there?, How exactly do we get there?, Are you ready?, Follow me!, Have you succeeded?, Begin sprinting!, and Am I the best?. At each step the teacher's role (showing the task of the website, leading the design scheme, answering technical difficulties, demoing the preparatory work and example steps, assigning expanding experimental tasks, grouping and guiding the website design, and grading every group together with students) is paired with the students' role (understanding the aim, discussing the design flow, researching the corresponding knowledge, preparing, imitating the operations, assessing their own and each other's tasks, learning autonomously in groups, and strengthening operational skills).
C. A New Teaching Method: "Teaching Method with Eight Progressive Steps"
Through the design procedure of the EC website lesson, and based on the different teaching content, this paper proposes a new teaching method named the "teaching method with eight progressive steps", as shown in Fig. 4. The aim of the lessons is to
design an EC website with a friendly interface, convenient communication, simple operation and a dynamic appearance. The experimental tasks should be finished in groups, so during the process of cooperative learning team members support and communicate with each other, which can enhance the trust between team members and help them develop good relationships [9]. In designing the experiment, we should broaden the conditions of the topic so that students can select the techniques they prefer to develop the website, and divide students into groups to finish the website together. At the same time, we should demo and explain the excellent schemes of former students and encourage students to explore more advanced techniques. As to some difficult points, we can design expanding homework that lets students overcome these basic thresholds through the homework cases, for example linking to the database, so that students can save more time for studying more advanced website design knowledge. At the end of the lesson, every group demonstrates its website, and teachers and students together give a score to every group. In this way students can learn from and encourage each other.
VI. POSSIBLE PROBLEMS<br />
A. Some Possible Copyright Problems<br />
Firstly, the "coming out" of a course is not a simple matter: only after a lot of hard work by teachers, such as producing, modifying, perfecting and selecting, does the course become the final product. So copyright protection of the original courseware is very necessary.
Furthermore, the resources that teachers collect from the Internet inevitably involve some copyright problems; in particular, we must mark the source on the site when downloading commercial software.
B. Some Possible Safety Problems with the Server of the Information Technology Center
Any web site may be vulnerable to hackers and virus attacks, so there must be a security policy to ensure that legitimate users receive legitimate service. We must have data backup, access control and anti-virus strategies.
VII. CONCLUSIONS<br />
In conclusion, college teachers and students should actively study and apply information knowledge and information technology, continuously improve their information literacy, actively use the information-based environment and the information platforms of colleges and universities, and effectively integrate information technology with their own teaching and learning. Because of its technological nature, information-based teaching has won the favor of more and more universities, and it will become the trend of new teaching model reformation and development. The exploration and research of information-based teaching construction will be an unavoidable path for colleges and universities to advance resource sharing, strengthen the communication between teachers and students, and consolidate and promote the teaching reformation.
ACKNOWLEDGMENT<br />
This work was supported by the Guangdong Natural Science Foundation (No. 10151032001000003), the Open Research Foundation of the Guangdong Province Key Lab of EC Market Application Technology, and the Guangdong University of Business Studies Foundation (No. 09YB52001, No. 53026535).
REFERENCES<br />
[1] Zhang Yichun, Jia Xiaoyuan, Liu Pingchuang, "Connotation and development strategy of new college information-based teaching construction", Modern Distance Education Research, 2011(4), pp. 26-32.
[2] Tom Browne, Roger Hewitt, Martin Jenkins, and Richard Walker, 2008 Survey of Technology Enhanced Learning for Higher Education in the UK, 2008. http://www.ucisa.ac.uk/publications/~/media/290DD5217DA5422C8775F246791F5523.ashx
[3] Gu Zengjun, "Network teaching environment construction based on the Blackboard teaching platform", Laboratory Research and Exploration, 2011, 30(7), pp. 174-177.
[4] Wang Ailin, "Information-based analysis of Guangdong University of Business Studies", Technology Guide, 2012(3), p. 70.
[5] Li Longlong, Wang Jin, "Study on the construction of information-based teaching", Value Engineering, 2012(01), pp. 228-230.
[6] Zhan Yu, Liu Jun, "Network curriculum construction of an experimental mechanics lesson based on the Blackboard platform", China Education Innovation Herald, 2011(22), p. 184.
[7] Zhihua Tan, Song Li, "Multimedia Technology in Physical Education", International Symposium on Computer Network and Multimedia Technology, 2009, pp. 1-4.
[8] G. Salmon, E-Moderating: The Key to Teaching & Learning Online. London: Taylor & Francis, Ltd., 2003.
[9] S. Bulut, "A cross-cultural study on the usage of cooperative learning techniques in graduate level education in five different countries", Revista Latinoamericana de Psicología, vol. 42, pp. 111-118, 2010.
Xinwei Zheng is an instructor in the Information Science School at Guangdong University of Business Studies. She received her Master's and Ph.D. degrees in Computer Science from Huazhong University of Science and Technology in 2003 and 2008. Her current research interests include E-learning, computer network technology and streaming media technology.
An Empirical Study on the Correlation and<br />
Coordination Degree <strong>of</strong> Linkage Development<br />
between Manufacturing and Logistics<br />
Rui Zhang<br />
College of Computer and Information Management, Zhejiang Gongshang University, Hangzhou, China
Email: zhangrui@mail.zjgsu.edu.cn
Chunhua Ju
College of Computer and Information Management, Zhejiang Gongshang University, Hangzhou, China
Email: jch@mail.zjgsu.edu.cn<br />
Abstract—Manufacturing is a major driving force and an important pillar of national economic development, and an important source of national income. Currently, the low level of development of the logistics industry significantly restricts the prosperity and development of manufacturing and affects the overall competitiveness of the supply chain. The paper first reviews the relevant research on industrial linkage and coordination degree, and creatively forms the development paths of industry linkage between manufacturing and logistics. Then, taking the development of the manufacturing and logistics industries in China from 2000 to 2008 as its sample, the paper builds an evaluation model of the coordination of manufacturing and logistics industry linkage and analyzes the changes in the coordination degree between manufacturing and logistics. The research results show that manufacturing is positively related to the logistics industry, but their coordination degree is in the critical state between coordination and disharmony. In the economic restructuring, making efforts to promote linkage development has great significance for enhancing the competitiveness of industries in China.
Index Terms—correlation, coordination degree,<br />
manufacturing, logistics, industry linkage<br />
I. INTRODUCTION<br />
Since the beginning of the 21st century, China's manufacturing industry has been in a stage of rapid development and has started to move toward advanced manufacturing, which is characterized by knowledge, information and technology intensity, output with high added value, low resource consumption, little environmental pollution, and so on. Even so, it faces many constraints. Especially under the new supply chain management, manufacturing enterprises increasingly face pressures such as lowering costs, shortening delivery times, and improving product quality and service.
The current logistics model of manufacturing and the low level of development of the logistics industry significantly restrict the prosperity and development of manufacturing, and affect the overall strength of the supply chain.
Logistics in manufacturing is an important part of the logistics industry. It is the key to enhancing the core competitiveness of manufacturing industries, and it is also the basic source of demand for logistics development [1]. According to surveys, from raw materials to finished products, general product processing accounts for no more than 10% of the total time, while 90% of the time is spent in storage, transport, handling, packaging, distribution and other logistics activities. Promoting the linkage development of the manufacturing and logistics industries is therefore not only an important way to adjust the industrial structure and transform the mode of economic growth, but also a common requirement and urgent desire of manufacturing and logistics companies [2].
II. LITERATURE REVIEW<br />
A. Industry Linkage<br />
Currently, the term industry linkage is widely used in China, but its connotations have not been clearly defined. Rui Nie [3] analyzed industrial linkage in terms of both "industry" and "linkage". Industries are sets or systems formed by the interaction of economic organizations and activities with the same features. Linkage refers to a number of associated things that are in "contact" and "interaction": when one makes a movement or change, the others follow. Based on industry association, he defined industrial linkage as collaborative activities launched along the links of the industrial chain in order to reduce transaction costs and business risk between the same or different enterprises. Lan Ling [4] considered industry linkage as the main form of regional interaction, in which economic organizations with similar characteristics are integrated into economic cooperation organizations or economic groups based on an institutional framework and regulatory mechanism. Its purpose is to achieve complementary and coordinated development among regional industries, optimize the regional industrial structure, upgrade the industry level, and enhance the competitiveness of regional industries.
B. Correlation
Current research on the correlation between two linked industries is mainly reflected in the following studies.
Based on a correlation analysis of large enterprises' patent counts and profits, Baizhou Li [5] applied statistical software to carry out Granger tests. The results showed that the number of invention patents had a positive impact on corporate profits; he also calculated that profits rose by 0.5615% when the number of invention patents increased by 1%. Starting from the empirical evidence, Hongqiong Zhu [6] approached this problem from tax revenue and total economic output. He carried out regression analysis, structural decomposition and systematic analysis of each factor affecting revenue growth and economic growth, in order to overcome the lack of aggregate analysis and arrive at quantitative and more convincing findings.
After measuring an information development index for the regional information development level and an economic growth index for the regional economic growth level, Yukai Shao, Huanchen Wang and Saixing Zeng [7] used the latest statistical data to analyze their correlation and found that China's economic growth and information development show strong regional imbalance, with information development showing an even greater regional imbalance, and that information development is strongly correlated with the region's economic growth.
III. DEVELOPING PATHS OF INDUSTRY LINKAGE<br />
At present, China is in a period of transition in its mode of economic development. Analyzing the development path of the linkage between logistics and manufacturing, exploring the continuous and stable development of the two industries, and promoting the process of new industrialization therefore have important practical significance and value. Based on previous studies, this section divides the development path into five stages, namely industrial cells, the division of labor, interaction, integration and linkage, and concludes that the linkage development of the two industries is the future trend of industrial development.
A. Industrial Cell
Cells are the basic unit of life activities. Most family businesses stem from early small workshops, which, like single cells, are the basis for industrial production and growth. Although small, they are independent and can be described as small but complete.
With the deepening impact of the industrial revolution, manufacturing achieved unprecedented development. Cell-type enterprises that are unable to adapt to the new and changing economic forms are more likely to face bankruptcy.
B. The Division <strong>of</strong> Labor<br />
Adam Smith, a British classical economist, was the first to systematically discuss the causes of the division of labor. As early as 200 years ago, Adam Smith proposed in his book that the division of labor arises from the need for transactions, and that trading capacity in turn affects the development of the division of labor. Marx pointed out that the natural division of labor and the growth of social productivity led to commodity exchange, which further led to specialized commodity production and the social division of labor; the driving force behind all of this is interest [8]. Jingdong Huo [9] summarized the industrial division of labor and pointed out that transaction costs were the direct cause of the separation of the manufacturing and logistics industries. When the internal logistics cost is higher than the cost of acquiring the service externally, the enterprise will implement logistics outsourcing; stimulated by this demand, the logistics industry emerged.
C. Interaction<br />
Interaction manifests itself as the mutual influence, interdependence and common development of logistics and manufacturing [10]. With the expansion of the manufacturing sector, demand for the logistics industry increases rapidly, which in turn improves the productivity of manufacturing, and vice versa. Moreover, with economic development, the two industries rely on each other at a deeper level. Payne noted that in China's special environment of institutional transformation, logistics and manufacturing have entered a stage of high relevance and complementarity [11].
D. Integration<br />
With the development and wide application of information and communication technology, the boundaries between services and manufacturing have become increasingly blurred, and integration has emerged. Porter proposed the theory of convergence in 2001, arguing that the new economy and the old economy were becoming increasingly integrated and that the boundaries between IT companies and traditional businesses were tending to disappear [12].
E. Industrial Linkage<br />
In recent years, manufacturing logistics has developed rapidly, but many problems remain to be solved, mainly due to the lack of communication and convergence between the manufacturing and logistics industries. Manufacturers do not trust logistics service capabilities, and logistics companies do not understand the real needs of manufacturers. Thus, achieving industry interaction and enhancing the exchange between the two industries is not only conducive to the development of the manufacturing and logistics industries, but also helps improve the competitiveness of the whole industry chain.
IV. DEVELOPMENT SITUATION OF MANUFACTURING AND<br />
LOGISTICS INDUSTRY IN CHINA<br />
Although the scope of the manufacturing industry is wide, the channels for obtaining China's manufacturing statistical data are few. In this paper, industrial output value is therefore used as a proxy for the manufacturing sector (MGDP). The Statistics Bureau of China has not yet treated the logistics industry as an independent industry, so in this paper the output value of the tertiary industry is used in place of the production value of the logistics industry (PESR).
A. Analysis <strong>of</strong> MGDP and PESR<br />
From the two growth trends in Fig. 1, obtained using the Eviews 3.1 software, we can see that MGDP rose from 4.0034 trillion Yuan to 12.9112 trillion Yuan from 2000 to 2008, with an average annual growth rate of 15.84%. PESR increased from 3.8714 trillion Yuan to 12.0487 trillion Yuan, with an average annual growth rate of 15.29%. The average annual growth rate of the logistics industry is slightly lower than that of the manufacturing industry but quite similar; we can see that the logistics industry in China, although it started relatively late, has achieved great development.
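The average annual growth rates quoted above behave as compound rates between the endpoint values. A minimal Python sketch of that calculation (only the endpoint figures come from the text; the helper function is generic):

```python
def cagr(start, end, years):
    # compound average annual growth rate between two endpoint values
    return (end / start) ** (1.0 / years) - 1.0

# MGDP: 4.0034 -> 12.9112 trillion Yuan over 2000-2008 (8 yearly steps)
print(round(cagr(4.0034, 12.9112, 8) * 100, 2))   # ~15.8%, close to the reported 15.84%
# PESR: 3.8714 -> 12.0487 trillion Yuan
print(round(cagr(3.8714, 12.0487, 8) * 100, 2))   # ~15.3%, close to the reported 15.29%
```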
Figure 1. Growth trends of MGDP and PESR from 2000 to 2008
B. Analysis <strong>of</strong> Growth Rate <strong>of</strong> MGDP and PESR<br />
As can be seen from Fig. 2, there are two distinct peaks, in 2004 and 2007, followed by a clear decline in 2008. The reason is that the Chinese manufacturing and logistics industries went through a stage of rapid development as China opened up and its markets improved. Since China joined the WTO in 2001, China's logistics industry and manufacturing have entered a rapid development process, creating a peak in 2004. Around the 2008 Olympic Games, Chinese economic development was strong and reached a new peak. In the second half of 2008, due to the U.S. subprime mortgage crisis, the world economy went into a trough; China was also affected, so the economic situation became worrying and manufacturing and the logistics industry declined significantly.
Figure 2. Changing trends of the growth rates of MGDP and PESR from 2000 to 2008
Fig. 2 also shows that the growth trends of the logistics industry and manufacturing are strongly consistent, and that logistics and manufacturing have a strong positive correlation. Of course, changes in logistics growth slightly lag behind the growth of manufacturing, especially in 2003.
V. CASE ANALYSIS ON CORRELATION OF MANUFACTURING<br />
AND LOGISTICS INDUSTRIES<br />
A. Analysis <strong>of</strong> the Correlation between Variables by<br />
EVIEWS<br />
• OLS Regression Analysis<br />
The paper selects MGDP to represent the level of the manufacturing industry and PSER to represent the development of the logistics industry; the unit of both is 100 million Yuan. The selected sample interval is from 2000 to 2008, and the data source is the China Statistical Yearbook published in 2009. Analysis of the data relies on the Eviews 3.1 software.
Taking the natural logarithm of the data does not alter the integration relationship between the variables, can linearize the trends, and to some extent can also eliminate heteroscedasticity in the time series; the paper therefore takes the natural logarithms of MGDP and PSER, denoted LNMGDP and LNPSER respectively. The models are built as follows:

MGDP = a + b * PSER    (1)

LNMGDP = c1 + c2 * LNPSER    (2)

where c1 is a constant and c2 is the coefficient of LNPSER. Eviews 3.1 is used to carry out ordinary least squares regression for equations (1) and (2), with heteroscedasticity and autocorrelation treated; the final results are shown in Table I.
TABLE I. CORRELATION COEFFICIENTS BETWEEN VARIABLES

          MGDP       PSER
MGDP      1.000000   0.996974
PSER      0.996974   1.000000

          LNMGDP     LNPSER
LNMGDP    1.000000   0.994394
LNPSER    0.994394   1.000000
Figure 3. Regression results of LNMGDP and LNPSER

Making the OLS regression of LNMGDP on LNPSER, the regression results shown in Figure 3 are:

LNMGDP = -0.907216 + 1.083865 * LNPSER
s = (0.341575) (0.030759)
t = (-2.655977) (35.23789)
R^2 = 0.994394, DW = 1.078497, F = 1241.709
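For readers without Eviews, the log-log OLS regression of equation (2) can be sketched in Python with statsmodels. This is only a minimal sketch: the series values below are hypothetical placeholders (only the 2000 and 2008 endpoints come from the text) and should be replaced with the MGDP and PSER figures (100 million Yuan) from the China Statistical Yearbook.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical placeholder series (100 million Yuan), 2000-2008;
# only the first and last values are taken from the text.
mgdp = np.array([40034, 46300, 53600, 62000, 71800, 83100, 96200, 111400, 129112], float)
pser = np.array([38714, 44500, 51200, 58900, 67700, 77900, 89600, 103100, 120487], float)

ln_mgdp, ln_pser = np.log(mgdp), np.log(pser)
X = sm.add_constant(ln_pser)          # regressors: constant c1 and ln(PSER)
model = sm.OLS(ln_mgdp, X).fit()
print(model.params)                   # c1 (intercept) and c2 (elasticity of MGDP w.r.t. PSER)
print(model.summary())                # includes R^2, the F statistic and the Durbin-Watson value
```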
The results show that, in the sample period, the added value of manufacturing and that of logistics not only have a positive relationship, but the relationship is significant. c2 = 1.083865 indicates that, at this stage, when the added value of logistics increases by one percent, the added value of manufacturing increases by 1.083865 percent. It can be seen that during 2000-2008 the role of manufacturing in the growth of the logistics industry is obvious; similarly, manufacturing also has a positive role in stimulating and promoting the growth of the logistics industry.
• Variable Stability Test<br />
Although the correlation between the variables is high, a spurious regression may exist in measurements involving time series data: if two non-stationary time series show a consistent trend, traditional methods may yield a high coefficient of determination even if there is no economic relationship between them.
To test whether the regression is spurious, it is necessary to check the stationarity of the time series data. If the mean and variance of a random time series are constant, and the covariance between any two periods depends only on their distance or lag rather than on the actual time at which the covariance is calculated, we call the series stationary. In the economic field, many observed time series are non-stationary, and stationarity occupies an important position in economic modeling. Therefore, it is necessary to check the stationarity of the time series data. Stationarity tests include the unit root test and graphical tests. As there is a significant growth trend in the GDP time series, this paper adopts the ADF method to conduct unit root tests on LNMGDP and LNPSER. The test results are shown in Fig. 4 and Fig. 5.
Figure 4. Unit root test results of LNMGDP
From the test results, at the three significance levels of 1%, 5% and 10%, the critical values of the unit root test for LNPSER are -5.2459, -3.5507 and -2.9312, while the t statistic is 2.475771, greater than all three critical values, indicating that the sequence is non-stationary.
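The ADF unit root test reported in Fig. 4 and Fig. 5 can be sketched in Python with statsmodels as follows; ln_pser is the (hypothetical) log series from the OLS sketch above, and with only nine annual observations the computed statistic and critical values will differ from those quoted in the text.

```python
from statsmodels.tsa.stattools import adfuller

# ADF unit root test on the log series; maxlag is set explicitly because the
# sample is very short (nine annual observations).
result = adfuller(ln_pser, maxlag=1, regression="c")
adf_stat, p_value, crit_values = result[0], result[1], result[4]
print(adf_stat, p_value, crit_values)  # compare the statistic with the 1%/5%/10% critical values
```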
Based on the above, the study of the correlation between manufacturing and logistics using Eviews suffers from spurious regression. This differs somewhat from the desired result, but it is a normal outcome: first, the sample used in this paper is small; second, domestic GDP is influenced by many economic factors and, in most cases, grows steadily over time.
Figure 5. Unit root test results of LNPSER

TABLE II. CORRELATIONS

                                         X                 Y
X   Pearson Correlation                  1                 .989(**)
    Sig. (2-tailed)                                        .000
    Sum of Squares and Cross-products    760144941.432     1894938830.270
    Covariance                           95018117.679      236867353.784
    N                                    9                 9
Y   Pearson Correlation                  .989(**)          1
    Sig. (2-tailed)                      .000
    Sum of Squares and Cross-products    1894938830.270    4830116200.000
    Covariance                           236867353.784     603764525.000
    N                                    9                 9
Note: ** Correlation is significant at the 0.01 level (2-tailed).
The results show that the correlation coefficient between variable X, the total profit of manufacturing (100 million Yuan), and variable Y, the cargo turnover of the logistics industry (100 million ton-km), is 0.989, which is significant at the 0.01 level. This can be seen in two ways: first, the "**" attached to the correlation coefficient of 0.989; second, the significance probability (Sig.) of the two-tailed test in the second line, 0.000, is less than 0.01. Thus, the total profit of manufacturing and the cargo turnover of the logistics industry are significantly related.
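The Pearson correlation in Table II can be reproduced with SciPy. A minimal sketch, with hypothetical placeholder values standing in for the nine annual observations of total manufacturing profit (X) and logistics cargo turnover (Y):

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical placeholders: replace with the 2000-2008 observations of total
# manufacturing profit (X) and cargo turnover (Y) that underlie Table II.
x = np.array([4000, 4700, 5600, 8300, 11300, 14800, 19500, 27100, 30500], float)
y = np.array([44300, 47700, 50700, 53900, 69400, 80300, 88800, 101400, 110300], float)

r, p = pearsonr(x, y)
print(round(r, 3), p)   # the paper reports r = 0.989 with a two-tailed p below 0.01
```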
VI. EVALUATION MODEL OF COORDINATION DEGREE<br />
A. Index Selection<br />
Based on the actual situation and on principles of the indicator system such as comprehensiveness and independence, this paper determines the index system of coordination degree shown in Table III.
It should be noted that these indicators are based on the China Statistical Yearbook (2009) and the corresponding statistics collected for 2000-2008. The manufacturing data mainly cover industrial enterprises above the designated size, while the development of the logistics industry is assessed indirectly through transport-related indicators. The selection of indicators will be corrected and improved in future work.
B. Calculation <strong>of</strong> Coordination Degree<br />
• Methods<br />
Coordination degree is a quantitative indicator used to measure the level of coordination among systems or elements. Studies on the degree of coordination fall into four main categories: the coupled coordination degree model, the entropy change equation, the interval-valued judgment method and the grey relational model [13].
The coupled coordination degree model couples the properties of the parties to evaluate the level of harmony among systems at different stages [14]. The entropy equation can be used to describe the law by which an isolated system evolves from a non-equilibrium to an equilibrium state [15]. The interval-valued judgment method establishes mathematical models to determine whether the system is coordinated or not. The grey model can calculate the correlation between each indicator of one system and each indicator of another system, and identify the major factors driving the impact; it is superior to the other methods for this purpose [16]. Therefore, this paper constructs a grey model to study the coordination degree of manufacturing and logistics.
• Calculation Procedure<br />
Firstly, in order to eliminate the dimensional differences among the original indicator data, the data are normalized before the correlation analysis is carried out; in other words, the data are standardized with SPSS (the same standardization used in principal component analysis) so that they are comparable. Only after standardization are the data comparable; the general standard adopted is the Z standardization, i.e. mean 0 and variance 1.
Secondly, on the basis of the standardized data, the correlation coefficients are calculated:

ξij(t) = [min_i min_j |zxi(t) - zyj(t)| + ρ · max_i max_j |zxi(t) - zyj(t)|] / [|zxi(t) - zyj(t)| + ρ · max_i max_j |zxi(t) - zyj(t)|]    (3)

where ρ is the resolution coefficient, generally taken as 0.5; zxi(t) and zyj(t) are the standardized values of each indicator at time t; and ξij(t) is the correlation coefficient at time t.
Thirdly, the mean of the correlation coefficients is calculated over the sample in order to obtain the correlation coefficient matrix, which reflects the relationships between the manufacturing and logistics industries. By comparing the sizes of the correlations γij, one can analyze whether the relationship between a factor of manufacturing and a factor of the logistics industry is close or not. The calculation is:

γij = (1/k) Σ_{t=1..k} ξij(t)    (4)

On the basis of the correlation matrix, the means are then taken by rows and by columns; according to the size of these averages, the most important factors through which manufacturing and the logistics industry affect each other can be identified.
Fourthly, the coordination degree is defined. In order to determine the overall coordination degree between the two systems, the coordination degree can be further defined on the basis of equation (3) [17]:

C(t) = (1/(m·l)) Σ_{i=1..m} Σ_{j=1..l} ξij(t)    (5)
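Equations (3)-(5) can be implemented directly. The sketch below assumes zx and zy are the standardized (z-score) indicator matrices with one row per year and one column per indicator, and, as an assumption, takes the min/max in equation (3) over all indicator pairs and years, as is common in grey relational analysis.

```python
import numpy as np

def grey_coordination(zx, zy, rho=0.5):
    """Grey relational coefficients xi_ij(t) (eq. 3), relational matrix
    gamma_ij (eq. 4) and coordination degree C(t) (eq. 5) for standardized
    indicator matrices zx (k years x m indicators) and zy (k years x l)."""
    k = zx.shape[0]
    diff = np.abs(zx[:, :, None] - zy[:, None, :])        # |zx_i(t) - zy_j(t)|, shape (k, m, l)
    d_min, d_max = diff.min(), diff.max()                 # extremes over all i, j and t (assumption)
    xi = (d_min + rho * d_max) / (diff + rho * d_max)     # eq. (3)
    gamma = xi.mean(axis=0)                               # eq. (4): time average, shape (m, l)
    coord = xi.reshape(k, -1).mean(axis=1)                # eq. (5): C(t) for every year
    return xi, gamma, coord
```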
VII. CASE STUDY<br />
A. Data Standardization<br />
According to the China Statistical Yearbook (2009), the corresponding data for the indicators are obtained; SPSS 11.5 is then used to apply the Z standardization to these data.
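A minimal sketch of this standardization step in Python rather than SPSS; x_raw and y_raw stand for hypothetical 9-by-6 matrices of the Table III indicators for 2000-2008, and grey_coordination is the function from the sketch in the previous section.

```python
import numpy as np

def zscore(a):
    # column-wise Z standardization: mean 0, variance 1, as the paper does in SPSS
    return (a - a.mean(axis=0)) / a.std(axis=0)

# zx, zy = zscore(x_raw), zscore(y_raw)            # x_raw, y_raw: 9 years x 6 indicators each
# xi, gamma, coord = grey_coordination(zx, zy)     # grey linkage matrix and coordination degree
```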
B. Correlation Matrix<br />
According to equations (3), (4) and (5), MATLAB is used to build the grey linkage model, and a program is written to calculate the grey correlation matrix between the manufacturing system indicators x1~x6 and the logistics system indicators y1~y6.
The evaluation criteria for the coordination degree are as follows: if γ = 1, the relevance between an indicator in the logistics system and one in the manufacturing system is at its maximum; if 0 <
coefficient ranges from 0.6197 to 0.7142, a relatively strong coordinating role, in which industrial output value is associated with the logistics industry at the strongest level, 0.7142.
As to the degree to which the manufacturing indexes are associated with the logistics industry, the correlation coefficient ranges from 0.4049 to 0.8680, so the coordinating intensity is unequal; the coordinating role of freight and manufacturing is the strongest, at 0.8680. The specific associations between manufacturing and logistics are shown in Fig. 5.
each other is shown in Fig.5:<br />
1.2<br />
1<br />
0.8<br />
0.6<br />
0.4<br />
0.2<br />
制造业与物流业关联强弱散点图<br />
0<br />
0 0.2 0.4 0.6 0.8 1 1.2<br />
系列1<br />
系列2<br />
系列3<br />
系列4<br />
系列5<br />
Figure.5 Scatter point for the strength <strong>of</strong> manufacturing associated with<br />
logistics<br />
As seen from Table IV and Fig. 5, in general the relationship between manufacturing and the logistics industry is in phase and positively correlated, characterizing the status quo of the relationship between the two industries. Among the 36 association values of the 6 manufacturing indicators and the 6 logistics indicators, 9 show strong association, 10 show relatively strong association, 12 show moderate correlation, and only 5 show weak correlation; the shares of strong, relatively strong, medium and weak associations in the 36 values are 25.00%, 27.78%, 33.33% and 13.89% respectively.
C. The Change <strong>of</strong> Coordination Degree between<br />
Manufacturing and Logistics<br />
According to equation (5), the coordination degree curve of manufacturing and logistics since 2000 is shown in Fig. 6.
Fig. 6 shows the changes in the coordination of the two systems. On the whole, the coordination degree fluctuates between 0.6251 and 0.7542, showing close coordination between manufacturing and logistics. Seen from the curve, the coordination degree does not fluctuate strongly from 2000 to 2008, remaining around 0.67. Between 2001 and 2005 there were the largest ups and downs: from the peak of 0.7542 in 2001 it rapidly declined to a low of 0.6335, rose again to a second peak of about 0.728 in 2004, and from 2004 to 2005 fell to the valley bottom of 0.6251.
Figure 6. China's manufacturing and logistics coordination degree curve (coordination degree by year)
Overall, from the change of the coordination degree between the Chinese manufacturing and logistics systems during 2000 to 2008, we can see that the coordination degree reaches its highest point of 0.7542, in the relatively strong stage. But this is far from full coordination; in other words, the coordinated development of the two industries is far from ideal.
VIII. CONCLUSIONS<br />
The OLS estimation in Eviews showed that, in the sample period, the added value of manufacturing and that of logistics not only have a positive relationship, but also that when the added value of logistics increases by one percent, the added value of manufacturing increases by 1.083865 percent. Similarly, manufacturing also plays a positive role in stimulating and promoting the growth of the logistics industry. However, the analysis of the correlation between manufacturing and logistics suffers from spurious regression, heteroscedasticity and other problems, which may influence the reliability of the results.
From the above, manufacturing is positively related to the logistics industry and the coordination of the two industries is fairly close, but the coordination degree averages only about 0.66, in the critical state between coordination and disharmony. In the economic restructuring, efforts to promote linkage development therefore have far-reaching significance for enhancing the competitiveness of China's industries.
The industry linkage between manufacturing and logistics is a linkage in ideas, organization, mode of operation, cost effectiveness and the exchange of human resources. For manufacturing, the linkage development of the two industries can promote the internal division of labor and a focus on core business, so as to reduce logistics costs, improve manufacturing competitiveness and enhance the manufacturing sector's ability to cope with the financial crisis. For the logistics industry, linkage development can promote the optimization and integration of resources and improve service levels; meanwhile, deepening logistics outsourcing can increase the total market share of the logistics industry and provide more space for its survival and development.
ACKNOWLEDGMENT<br />
This work was supported by the Scientific Research Fund of the Zhejiang Provincial Education Department (Y200804863), the Zhejiang Provincial Natural Science Foundation of China (Y6090015) and the Key Foundation of Philosophy and Social Science of Zhejiang Province (09CGYD012Z).
REFERENCES<br />
[1] Tuanying He, Tianshan Ma, “Thinking about linkage development<br />
On manufacturing and logistics industry”, Innovation Exploration,<br />
vol. 5, pp. 28-29, 2009.<br />
[2] Jian Zhou, “Empirical Study on Interaction between<br />
manufacturing and producer services-a case study Jiangsu<br />
Province(1990-2007)”, Market Weekly(Disquisition Edition), vol.<br />
4, pp. 38-40, 2009.<br />
[3] Rui Nie, Tao Lv, “Industrial Linkage: Strategic Options for<br />
Sustainable Development and Utilization <strong>of</strong> Western Energy<br />
Resources”, Natural Resource Economics <strong>of</strong> China, vol. 1, pp. 12-<br />
14, 2008.<br />
[4] Lan Lin, Sen Ye, Gang Zeng, “STUDY ON REGIONAL<br />
INDUSTRY COOPERATION IN THE YANGTZE RIVER<br />
DELTA”, Economic Geography, vol. 30, pp. 6-10, 2010.<br />
[5] Baizhou Li, Yuanyuan Dong, “China's Large Enterprises'<br />
Evaluation <strong>of</strong> the Original Innovation Ability Based on AHP”,<br />
Science & Technology Progress and Policy, vol. 27, pp. 125-129,<br />
2010.<br />
[6] Hongqiong Zhu, “A Study on Relation between Tax Increase and<br />
Economic Growth”, Productivity Research, vol. 19, pp. 38-40,<br />
2007.<br />
[7] Yukai Shao, Huanchen Wang, Saixing Zeng, “The Relative<br />
Analyse between the Informatization Index and the Economy<br />
Growth in China”, Information Science, vol. 24, pp. 172-174,<br />
2006.<br />
[8] Liufu Chen, “Formation and deepening <strong>of</strong> the division <strong>of</strong> labor”,<br />
Frontier, vol. 12,pp. 47-49, 2002.<br />
[9] Jingdong Huo, Jiechang Xia, “The International Comparison on<br />
R&D Competitiveness <strong>of</strong> Modern Service Industry”, China S<strong>of</strong>t<br />
Science, vol. 10, pp.10-13, 2007.<br />
[10] Xian Chen, Jianfeng Huang, “Division <strong>of</strong> Labor, Interactions and<br />
Convergence: Empirical Research on the Evolvement <strong>of</strong> the<br />
Relationship between Service and Manufacturing Industries”,<br />
China S<strong>of</strong>t Science, vol. 10, pp. 65-71, 2004.<br />
[11] Jianhui Shi, Peirong Ding, “Interaction between manufacturing<br />
and producer services in developing relations - Analysis based on<br />
the paradigm stage”, North Trade, vol. 11, pp. 45-46, 2008.<br />
[12] Shixian Zhang, “The efficiency <strong>of</strong> industrial investment and<br />
industrial structure <strong>of</strong> the Empirical Study”, Management World,<br />
vol. 5, pp. 79-85, 2000.<br />
[13] Shengbin, Bo Yu, “Research on Coupling Degree Model and<br />
Application <strong>of</strong> the Enterprise's Technology Capacity and<br />
Technology Management Capacity”, Forecasting, vol. 27, pp. 12-<br />
15, 2008.<br />
[14] Tong Yang, Nengmin Wang, “An Empirical Study on the<br />
Coupling Model <strong>of</strong> Ecological Environment and Urban<br />
Competitiveness”, Ecological Economy, vol. 10, pp. 33-36, 2008.<br />
[15] Jichu Tang, “The Entropy Variation Equation <strong>of</strong> Thermodynamic<br />
Open System”, <strong>Journal</strong> <strong>of</strong> Systemic Dialectics, vol. 8, pp. 88-90,<br />
2000.<br />
[16] Hong Mi, Guoli Ji, “Chinese county-level regional population,<br />
resources, environment and sustainable development <strong>of</strong> the<br />
coordinated development <strong>of</strong> economic systems theory and<br />
evaluation methods”, POPULATION & ECONOMICS, vol. 6,pp.<br />
17-24, 1999.<br />
[17] Xiaoli Han, Li Wang, “The Quantitative Analysing on<br />
Coordinated Development Between Manufacture and Logistics”,<br />
Value Engineering, vol. 1, pp. 84-86, 2009.<br />
[18] Weidong Chen, Hua Wang, “Education and Economic<br />
Development by Gray Correlation Analysis - A Case Study <strong>of</strong><br />
Shenzhen City”, <strong>Journal</strong> <strong>of</strong> Southwest University for<br />
Nationalities(Humanities and Social Science), vol. 3, pp. 195-197,<br />
2007.<br />
TABLE III. THE INDEX SYSTEM OF COORDINATION DEGREE OF INDUSTRY LINKAGE

Coordination evaluation system:
Manufacturing system: x1 industrial output value (billion); x2 number of enterprises (unit); x3 main business revenue (billion); x4 total profit (billion); x5 annual average number of workers (million); x6 total assets (million).
Logistics system: y1 tertiary industry output value (billion); y2 passenger volume (million); y3 total cargo throughput (tons); y4 cargo turnover (100 million ton-km); y5 civilian car ownership (10,000); y6 port cargo throughput of coastal ports (tons).
TABLE IV. GREY LINKAGE MATRIX

        y1      y2      y3      y4      y5      y6      mean
x1      0.9702  0.3843  0.8124  0.5706  0.9763  0.5716  0.7142
x2      0.7514  0.4551  0.8845  0.4635  0.6997  0.4642  0.6197
x3      0.9050  0.4127  0.9507  0.5177  0.8311  0.5185  0.6893
x4      0.7334  0.3407  0.6395  0.7043  0.7903  0.7059  0.6524
x5      0.9335  0.4070  0.9211  0.5269  0.8551  0.5278  0.6952
x6      0.8331  0.4296  1.0000  0.4933  0.7701  0.4941  0.6700
mean    0.8544  0.4049  0.8680  0.5461  0.8204  0.5470
Rui Zhang was born in 1976 in Jilin Province, China. She is a Ph.D. candidate and an associate professor at the Contemporary Business and Trade Research Center of Zhejiang Gongshang University. She holds an MSc from Zhejiang University of Technology in Zhejiang, China.
Her current teaching area is logistics system planning and design, while her research areas are logistics simulation, logistics planning (logistics network infrastructure, logistics capability and logistics modeling) and the application of modern optimization techniques to complex
engineering problems, including production planning and control. She has published several teaching books and more than 10 articles in journals and conference proceedings. Ms. Zhang is a researcher and a Chartered Member of the China Federation of Logistics & Purchasing, and she is also a member of the China Materials Storage & Transportation Association.
Chunhua Ju is a professor and doctoral supervisor at the School of Business Administration, Zhejiang Gongshang University. His main research interests are business informatization management, E-commerce and logistics, decision support systems, etc.
Tourism Crisis Management System Based on<br />
Ecological Mechanism<br />
Xiaohua Hu<br />
Changhai Hospital, Second Military Medical University, Shanghai 200438, P.R.China<br />
Email: huhexu@hotmail.com<br />
Xuan Zhou, Weihui Dai, Zhaozong Zhan and Xiaoyi Liu<br />
Sichuan Conservatory of Music, Chengdu, 610021, China
School <strong>of</strong> Management, Fudan University, Shanghai 200433, China<br />
School <strong>of</strong> S<strong>of</strong>tware, Fudan University, Shanghai 200433, China<br />
School <strong>of</strong> Information Science and Engineering, Fudan University, Shanghai 200433, China<br />
Email: 465029725@qq.com, whdai@fudan.edu.cn, 043053314@fudan.edu.cn, liuxiaoyi92@gmail.com<br />
Abstract—The tourism industry has become one of the biggest and most dynamic industries in the world, giving a great impetus to economic development. However, its vulnerability and its crisis management have become momentous problems. Research findings and practice have shown that a tourism crisis can arise from the destruction of system balance, which is similar to an ecological crisis in the natural world. This paper builds an ecological system model to study the formation mechanism of tourism crises, and designs a crisis management system for preventing, monitoring, controlling and recovering from tourism crises arising from both the natural and the social environment.
Index Terms—Tourism Crisis, Crisis management,<br />
Management System, Ecological Mechanism<br />
I. INTRODUCTION<br />
The tourism industry has become one of the biggest and most dynamic industries in the world, giving a great impetus to economic development. However, as the tourism industry has grown into a pillar of economic development, its vulnerability has been seriously exposed. Since the 1980s, a series of crises, such as the tsunami in Indonesia in 2004 and the hurricane in New Orleans in 2005, have struck the tourism industry, severely damaging the travel economy as well as people's lives and property. How to avoid and reduce the possibility of crisis while maintaining the fast development of the tourism industry has become a pressing problem.
Manuscript received August 6, 2012; revised October 16, 2012; accepted November 12, 2012. This research was supported by the National Natural Science Foundation of China (No. 90924013). Corresponding author: Xuan Zhou. doi: 10.4304/jsw.7.12.2808-2815.

The management of tourism crises has received much attention, but it is difficult to carry out because of the distinctiveness of the tourism industry. On one hand, the industry involves many sectors with different characteristics, for instance, hotels and travel agencies. On the other
hand, the tourism industry has a strong dependence on the<br />
natural and social environment. The loose industrial structure and strong dependence on the environment make it even more difficult to resolve crises once they occur. Crisis management demands not only regulations and organizations, but also fast communication of information, quick data processing, and good coordination of every part. Therefore, applying information technology to establish a crisis management system that can widely collect information on travel operations and crises and respond to them efficiently is increasingly regarded as a necessity for realizing efficient crisis management.
Research on tourism crises started in the 1970s. In 1974, the Travel Research Association first listed travel research as one of its most important projects. Since then, the areas of crime and travel [1], terrorism and travel [2], war and travel [3], financial crisis and travel [4], natural disaster and travel [5], public health and travel [6], etc., have been researched by many scholars. However, only Faulkner [7] and a few other scholars have presented a theoretical system of crisis management. In conclusion, research is still scarce on data acquisition for general tourism crises and their response systems, on the mechanism of the tourism system, and on tourism crisis management systems built with modern techniques.
In the area of ecological management research, ecology was introduced into management studies in 1970 by H. E. Aldrich and J. Pfeffer [8]. Hannan and Freeman [9] put forward the population ecology theory of organizations in 1977. James F. Moore [10] in 1996 built a business ecological system, and Richard L. Daft [11] in 1998 further developed the cooperation and conflict between ecological systems, reaching a competitive and cooperative mode and an organization for study. A thorough introduction to ecological management was given by K. Dong and X. Z. Cui in 2003 [12].
This paper uses the ecological mechanism to model the tourism ecological system and analyzes the formation and influence of its crises. On this basis, a management system for tourism crises is presented to provide a responsive solution to the tourism crisis, and the system's module functions and architecture are analyzed further. In later sections, we present a compound ecological system of Buddhism cultural travel and put forward an ecological-chain-based development mode to maintain its sustainable development.
II. TOURISM ECOLOGICAL SYSTEM AND ITS CRISIS<br />
MECHANISM<br />
A. Tourism Ecological System<br />
The natural ecological system is characteristically fragile, since the ecological community and its living environment depend on each other. Likewise, the tourism system is integrated into its environment, where each small fluctuation can lead to serious disaster for the tourism industry. With such consideration, it is meaningful to introduce the concept of the ecological system into the research of tourism crises and to learn from the experience of natural ecological degradation and recovery.
In the natural ecological system, producers, consumers<br />
and reducers constitute the whole organic ecological<br />
community. The ecological system is not the simple<br />
assembly <strong>of</strong> various kinds <strong>of</strong> creatures and their<br />
environment. Instead, it is an organic system in which all<br />
the creatures and environmental factors are<br />
interdependent and mutually constraining. Since the ecological system has the ability of self-adjustment, it is relatively stable under normal conditions. Generally, there are two kinds of stable states for one organism. One is the
homeostasis, namely the self-adjusting ability to respond to environmental changes; the other is the state in which every ecological factor lies within the range of its ecological valence, which defines the highest level that an ecological factor is allowed to reach. Once some ecological factor exceeds its ecological valence, an ecological crisis occurs, leading further to the deterioration of the ecological system. Among the various causes of such a crisis, external interference, especially human interference, is the primary one. Ecological recovery also proceeds while the deterioration is going on: the natural ecological system retains its structure and function in this process thanks to favorable environmental factors and its self-adjusting ability.
In the tourism industry, travel activity is the foundation, in which tourists, tourism resources and tourism businesses constitute the three "T" factors of the travel industry. To put it one step further, the travel industrial system is an organized and functional entity constituted by travel activity organizers and participants through the use of information flow, power flow, material flow and value flow, and exists in the form of tourism environmental resources [13][14].
Like the natural ecological system discussed above, the tourism industrial ecological system is also formed by producers, consumers, reducers and the environment they depend on. Apart from this, the flows of information, power, value, etc. are also part of the tourism ecological system. By referring to the natural ecological system and its operation mechanism presented by W. H. Dai et al. in 2005 [15], we established the
ecological system model of tourism shown in Figure 1.

Figure 1. Tourism ecological system model

In this system model, producers, consumers and reducers promote the circulation of the tourism ecological system by making use of the information flow, consuming flow, power flow and value flow. The tourism corporations drive the consumers' travelling desire through the
tourism information flow, which leads to the formation of the tourism consuming flow; after being digested by the tourism corporations, the tourism power flow forms, runs through the whole process of travel activity, and produces industrial nutrition. The positive power flow changes into the travel value flow, raising the system to a higher level and thus starting a new circulation, while the negative power flow may damage the social and natural environment by discharging waste.
B. Tourism Crisis Mechanism and Management
The concept of tourism crisis has such a wide range, including the tourism economic crisis, the tourism cultural and social relationship crisis, the tourism security crisis, etc. [16]-[21], that it is necessary to give it a definite definition for our research. In this paper, we define a tourism crisis as a natural or man-made unexpected disaster that affects travel industry operations and has to be responded to within a limited time and under uncertain conditions. In other words, the tourism crisis within our research scope mainly refers to crises related to tourism security. On the whole, tourism crises can be divided into two types: one is the external crisis that occurs when nature and society are regarded as the environment of tourism industrial development, and the other is the internal tourism crisis that happens in the travelling process, for example, in housing, eating and transportation. A tourism crisis is not a pure crisis coming from inside the industry alone. On the contrary, its causes are quite complicated, its occurrences are unpredictable and varied, its influence covers various regions, and it is difficult to recover from.
Since tourism crises cannot be avoided only by securing the operating procedures in travel operational activities, their management becomes especially important. At present, the management systems for tourism crises are mostly based on institutions and organizations, lacking a practical management system. The general method is to divide the whole crisis management process into three stages, namely the warning mechanism before the crisis, the processing mechanism during the crisis, and the recovering mechanism after the crisis. In the warning stage, most of the work, such as analyzing and evaluating the original information, is completed by the management information system to
evaluate and predict the risks. In the crisis processing stage, the role of the crisis management mechanism changes to responding quickly, in the shortest possible time, to prevent the development of the crisis. In the last, recovering stage, travel places and public confidence are to be restored by the combined efforts of institutions and organizations.
Though some developed countries have already established public crisis processing systems and crisis information management systems, real cases have revealed shortcomings: the traditional crisis management system is weak in continuous risk management and lacks quantitative indicators and the ability to process unstructured data.
As addressed by Z. F. Yang et al. [14] in 2005, the natural ecological system has a potential ability to maintain its service function and health, called the ecological carrying capacity. From this perspective, the tourism ecological system also possesses an ecological carrying capacity concerning its ecological health. The natural and social environments provide the resource carrying capacity for the tourism ecological system, and the tourism ecological system supports travelers' activities within a certain resilience range. As the natural ecological system does, the tourism ecological system maintains a stable state with its dependent natural and social environments, supported by self-resilience and carrying capacity.
We have seen that imbalance in the natural ecological system finally leads to crisis, and that such a crisis is mainly driven by external interference. Referring to this mechanism, we conclude that a tourism crisis is caused by tourism crisis factors, which are in fact external interference. Once the quantity of tourism crisis factors exceeds the ecological valence that the system is able to carry, the system deteriorates and may even crash. To be more specific, one kind of crisis will bring about other kinds of crises, finally forming a crisis chain, because of the food web that all kinds of tourism factors build in the power flow and value flow processes. The crisis spreads through the three mechanisms shown in Figure 2.

Figure 2. Ecological mechanism of tourism crisis
As the figure shows, a crisis may occur when the material flow or the value flow breaks down. A series of crises may also arise when the information flow affects psychological factors and causes more and more crises. Finally, a crisis may affect the environment, and the environment in turn affects other ecological factors, turning them into interfering factors that cause other kinds of tourism crises.
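To make the valence-and-chain idea above concrete, the following minimal Python sketch (purely illustrative; the factor names, valence thresholds and spread links are hypothetical and not taken from this paper) checks monitored crisis factors against their ecological valence and propagates a warning along a simple crisis chain.

    # Illustrative sketch only: factor names, thresholds and links are hypothetical.
    from collections import deque

    # Ecological valence (maximum tolerable level) for each monitored crisis factor.
    VALENCE = {"flood_level": 3.5, "epidemic_cases": 50, "negative_news": 200}

    # Simplified "food web": which factors a crisis in one factor tends to disturb next.
    SPREADS_TO = {"flood_level": ["negative_news"],
                  "epidemic_cases": ["negative_news"],
                  "negative_news": []}

    def warn(readings):
        """Return all factors pushed into crisis, directly or via the crisis chain."""
        in_crisis = {f for f, value in readings.items() if value > VALENCE[f]}
        queue, affected = deque(in_crisis), set(in_crisis)
        while queue:                      # follow the crisis chain (information/power flow)
            for nxt in SPREADS_TO[queue.popleft()]:
                if nxt not in affected:
                    affected.add(nxt)
                    queue.append(nxt)
        return affected

    print(warn({"flood_level": 4.1, "epidemic_cases": 12, "negative_news": 80}))
    # -> {'flood_level', 'negative_news'}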
III. TOURISM CRISIS MANAGEMENT SYSTEM<br />
A. Framework and Procedures <strong>of</strong> Tourism Crisis<br />
Management<br />
Applying the ecological mechanism to the research of the tourism industry and its crises gives us a much better understanding of the cause and developing process of tourism crises. In the same way, we can apply the methods used for crises in the natural ecological system to manage travel crises. On the whole, tourism crisis management based on the ecological
mechanism uses ecological principles to analyze the cause and developing process of tourism crises, and responds to them with methods drawn from the ecological system.
In the end, the everlasting objective of tourism crisis management is to promote and maintain the health and stability of the travel ecological system. According to what we have analyzed about the ecological mechanism of tourism crises, the tourism system is a self-adjusting system with a certain carrying ability that can respond to the crisis at each stage. Thus it can further be concluded that ecological-mechanism-based tourism crisis management ultimately aims to guide the travel system in exerting its self-adjusting ability to prevent, warn of, control and recover from crises.
In order to build a tourism crisis management system based on the ecological mechanism, we analyzed the desired system functions in the separate stages of crisis management.
In the preventing stage, a security tree model is built to analyze and decompose the crisis into its smallest factors, so that the crisis can be prevented by avoiding the conditions that these smallest crisis factors require.
In the warning stage, monitoring the crisis factors is the primary and most difficult task. Because many tourism crises emerge from their related environments, it is essential to build a well-organized information communication network that helps to provide the crisis
factors’ information. In the controlling stage, the system<br />
needs to identify the crises and especially their causes<br />
first and make the intervention decision according to the available labor and technical conditions. So a regional crisis management
system based on ecological mechanism is to be<br />
constructed.<br />
In the recovery stage, we may choose the manual way or the natural way to help the tourism system recover. The choice depends on whether the crisis is caused by natural or social factors. A crisis caused by natural factors can be recovered from by a combination of manual and natural methods, whereas a crisis caused by social factors can only be resolved by manual methods.
On the whole, the tourism crisis management system should realize the following functions: identifying the tourism crisis, analyzing the crisis factors, modeling the crisis, making the responding decision, returning the evaluation feedback, and publishing the information. Here we present the tourism crisis management system based on the ecological mechanism; its function framework and processing procedures are designed as shown in Figure 3.

Figure 3. Framework and procedures of tourism crisis management
According to the function framework, three working levels and two sub-systems constitute the whole system. On the first working level, pre-processing work, including crisis identification, crisis factor analysis and crisis modeling, is completed. Since these tasks demand different models, there is no unified system to process them; instead, crisis factor analysis and crisis modeling proceed separately after the crisis is identified.
On the second working level is the tourism crisis decision and support system, which occupies the heart of the whole management system. On this level, analysis and decisions are made with the support of the database and decision models, and reference models for managing the crisis become available later.
On the third level is the feedback system, which evaluates the crisis, checks the situation afterwards, gives a complete understanding of the crisis, and accumulates cases for future use.
Apart from the three working levels, a dialogue sub-system is constructed for data input and information declaration. A security sub-system is also set up, with hardware and software such as identity authentication, firewalls and antivirus tools.
B. Architecture <strong>of</strong> Tourism Crisis Management System<br />
According to the requirements of the framework and procedures of tourism crisis management, we give the architecture design of the crisis management system as shown in Figure 4.

Figure 4. Architecture of Tourism Crisis Management System
The architecture includes integration at two levels. One is the integration of tourism corporations that have the same objective and cooperate frequently: databases are connected, and related information scattered across different departments is shared at this level. The other is information integration with regional governments, the ministry of public security, fire departments, and other departments. Technologies such as Extranet and Internet are used to realize the information integration of these sub-systems.
In this architecture, there are three major platforms: the<br />
system access platform, the application service platform,<br />
and the fundamental supporting platform. Various kinds<br />
of users connect to the system through the access platform and receive services from the application service platform. The network infrastructure, technological standards, agreements and regulations constitute the fundamental supporting platform.
On the software side, we choose JESS technology to build a three-tier browser/server structure for the tourism crisis management system, so as to realize information and resource integration. On the network side, a regional special-purpose network is established and connected with other networks to realize a reliable and widely connected network system.
IV. APPLICATION IN BUDDHISM CULTURAL TRAVEL<br />
Cultural travel has become a new and fashionable kind of travel in recent years. Since Buddhism cultural travel resources are abundant all over the world and play an important role in promoting the local travel industry and spreading Buddhist culture, developing and managing them in a sustainable way is of great importance. However, the present development work has revealed some problems. Development has mostly covered material cultural resources rather than the non-material culture, so that travelers cannot fully understand the Buddhist culture contained in the places of interest. Besides, the Buddhism cultural travel industrial chain is weak in some parts, with travel commodities shallow in cultural connotation, etc.
The sustainable development of the tourism industry depends on the balance of a compound ecological system, a concept presented by S. J. Ma [22] in 1984; the system includes a social sub-system, an economic sub-system and a natural sub-system. In this paper, the development of the tourism industry is regarded as the
manual interference process acting on the compound ecological system. The development of Buddhism cultural travel presents complicated relationships within this compound ecological system. According to the theory of the compound ecological system and ecological community succession, we analyzed the influence mechanism of tourism development on the subjects of the tourism industrial ecological system, and also analyzed the community succession process of Buddhism cultural travel development. Afterwards, we established the compound ecological system model of Buddhism cultural travel development and its sustainable development mechanism.
As concerns the influence mechanism of Buddhism cultural travel development on the compound ecological system, the developed Buddhism cultural travel resources come into the consumer market in the form of Buddhism cultural travel products and their derivatives. The consumers create investment returns by consuming, and the Buddhism cultural travel industry and its related industries are promoted as a result. The social and natural environments receive both the positive and the negative effects of these industries' development. For the government, maintaining and improving the tourism resource conditions is the primary responsibility.
The compound ecological system model of Buddhism cultural travel development presented in this paper is shown in Figure 5.

Figure 5. Compound ecological system model of the Buddhism cultural travel
In this model, there are three sub-systems, the social, economic and natural systems, which interact as the arrows show. In the social system, the government, corporations, non-profit organizations, investors, travelling consumers and other related workers are interrelated through political and economic relationships. In the economic system, the Buddhism cultural travel industry drives other related industries, and the tourism resources provide the attractions on which the whole industry is based. Last, in the natural system, natural environments such as geology, climate and transportation offer the grounds for activity sites and living facilities for travelling. This natural system is also the general foundation of the whole Buddhism cultural travel industry.
A tourism crisis may take place when the balance of this ecological system is seriously broken by social, economic or natural factors. Based on the ecological mechanism, we developed a tourism crisis management system for Buddhism cultural travel, which has been successfully applied in China.
V. CONCLUSION<br />
Inspired by the natural ecological system, this paper researched the ecological mechanism of the tourism system and tourism crises. To help resolve the present situation in tourism crisis response, a new design of a crisis management system was presented, based on that ecological mechanism, for preventing, monitoring, controlling and recovering from tourism crises arising from both the natural and the social environment.
With this system, efficient processing can be conducted the moment a general tourism crisis happens. The system has been successfully applied in the crisis management of Buddhism cultural travel. Improving its practical application is our future work.
VI. ACKNOWLEDGEMENT<br />
This research was supported by National Natural<br />
Science Foundation <strong>of</strong> China (No. 90924013).<br />
REFERENCES<br />
[1] Fujii, “Tourism and crime: implications for regional<br />
development policy.” Regional Studies, Vol.14, pp.27-36,<br />
1980.<br />
[2] Richter, “Tourism politics and political science: A case <strong>of</strong><br />
not so benign neglect.” Annals <strong>of</strong> Tourism Research,<br />
Vol.10, pp.313-335, 1983.<br />
[3] V. L. Smith, "War and tourism: An American
ethnography.” Annals <strong>of</strong> Tourism Research, Vol.25(1),<br />
pp.202-227, 1998.<br />
[4] Pine, “The current and future impact <strong>of</strong> Asia’s economic<br />
downturn on the region’s hospitality industry,”<br />
International <strong>Journal</strong> <strong>of</strong> Contemporary Hospitality<br />
Management, Vol.10(7), pp.252-256, 1998.<br />
[5] Faulkner, “Towards a framework for tourism disaster<br />
management,” Tourism Management, Vol.22, pp.135-147,<br />
2001.<br />
[6] Frisby, “Communicating in a crisis: the British Tourist<br />
Authority's responses to the foot-and-mouth outbreak and
11th September, 2001,” <strong>Journal</strong> <strong>of</strong> Vacation Marketing,<br />
Vol.9(1), pp.89-100, 2002,<br />
[7] B. Faulkner, “Towards a framework for tourism disaster<br />
management.” Tourism Management, Vol.22, pp.135-147,<br />
2001.<br />
[8] H. E. Aldrich, J. Pfeffer, “Environments <strong>of</strong><br />
organizations,” Annual Review <strong>of</strong> Sociology, Vol.2 pp.79-<br />
105, 1976.<br />
[9] M. T. Hannan, J. H. Freeman, "The population ecology of organizations," The American Journal of Sociology, Vol.82(5), pp.929-964, 1977.
[10] J. F. Moore, The Death <strong>of</strong> Competition: Leadership and<br />
Strategy in the Age <strong>of</strong> Business Ecosystems, Harpercollins<br />
Publishing, 1996<br />
[11] R. L. Daft: Organization Theory and Design, South-<br />
Western College Publishing, 1998<br />
[12] K. Dong, X. Z. Cui, “Management ecology-the<br />
management in the 21st century," Modern Management Science,
Vol.2, pp.6-8, 2003.<br />
[13] G. L. Hou, “Tourism crisis: types, influential mechanism<br />
and management model,” Nankai Management Comment,<br />
Vol.1, pp. 78-82, 2005.<br />
[14] Z. F. Yang, et al., "Assessment of the ecological carrying
capacity based on the ecosystem health,” Acta Scientiae<br />
Circumstantiae, Vol.5, pp.586-594, 2005.<br />
[15] W. H. Dai, Ecological Community Pattern and Strategy <strong>of</strong><br />
Independent Innovation: Some Pillar Industries in<br />
Shanghai, Project Report <strong>of</strong> Shanghai Municipal Science<br />
and Technology Commission, 2005.<br />
[16] J. Huang, “On the establishment <strong>of</strong> tourism crisis<br />
management mechanism,” Social Scientist, Vol. 7, pp. 76-<br />
79, 2003.<br />
[17] The World Tourism Organization, Crisis Guidelines for<br />
the Tourism Industry, http://www.world-tourism.org.<br />
[18] J. Q. Li, et al., "Crisis accident and its management in
tourism,” Human Geography, Vol.12, pp.35-39, 2003.
[19] H. F. Guo, X. Y. Zeng, “Risk analysis method research,”<br />
Computer Engineering, Vol.3, pp.131-132, 2001.<br />
[20] J. Zhou, “Research on the systematic mechanism analysis<br />
and strategic countermeasures <strong>of</strong> tourism crisis<br />
management,” <strong>Journal</strong> <strong>of</strong> Guilin Institute <strong>of</strong> Tourism,<br />
Vol.1, pp.20-27, 2003.<br />
[21] J. Zhang, “A study <strong>of</strong> tourism crisis,” <strong>Journal</strong> <strong>of</strong> Huaiyin<br />
Industry College, Vol.4, pp.8-10, 2004.<br />
[22] S. J. Ma, R. S. Wang, “The social-economic-natural<br />
complex ecosystem,” Acta Ecologica Sinica, Vol.1, 1984.<br />
Xiaohua Hu received his Master degree in Software Engineering in 2011 from Fudan University, China. He is currently a software engineer at Changhai Hospital, Second Military Medical University, China.

Xuan Zhou received her Master degree in Digital Arts in 2009 from Shanghai University, China. She is currently a teacher at the Sichuan Conservatory of Music, China.
Weihui Dai received his B.S. degree in Automation<br />
Engineering in 1987, his Master degree in Automobile<br />
Electronics in 1992, and his Ph.D. in Biomedical Engineering<br />
in 1996, all from Zhejiang University, China. He is currently an
Associate Pr<strong>of</strong>essor at the Department <strong>of</strong> Information<br />
Management and Information Systems, School <strong>of</strong> Management,<br />
Fudan University, China.<br />
Dr. Dai has published more than 120 papers in S<strong>of</strong>tware<br />
Engineering, Information Management and Information<br />
Systems, Financial Intelligence, and Complex Adaptive System<br />
and Socioeconomic Ecology, etc. Dr. Dai became a member <strong>of</strong><br />
IEEE in 2003, a senior member <strong>of</strong> China Computer Society in<br />
2004, and a senior member <strong>of</strong> China Society <strong>of</strong> Technology<br />
Economics in 2004.<br />
Zhaozong Zhan received his Master degree in S<strong>of</strong>tware<br />
Engineering in 2006 from Fudan University, China.<br />
Xiaoyi Liu received her B.S. degree in Optical Information<br />
Science and Technology from Fudan University in 2009. She is<br />
currently a master student <strong>of</strong> Accountancy at University <strong>of</strong><br />
Denver, USA.
Image Fusion Method based on Non-Subsampled<br />
Contourlet Transform<br />
Hui Liu<br />
School <strong>of</strong> Science, Jiangxi University <strong>of</strong> Science and Technology, 341000 Ganzhou, China<br />
Email: lxyliuhui@163.com<br />
Abstract—Considering the human visual system and the characteristics of images, a novel image fusion strategy is presented for panchromatic high-resolution images and multispectral images in the non-subsampled contourlet transform (NSCT) domain. The NSCT can give an asymptotically optimal representation of edges and contours in an image by virtue of its good multi-resolution, shift-invariance, and high directionality. An intensity component addition strategy based on the NMF algorithm is introduced into the NSCT domain to preserve spatial resolution and color content. Experiments show that the proposed algorithm not only reduces computational complexity, but also achieves better performance, both visually and statistically, than the traditional principal component analysis (PCA) method, the intensity-hue-saturation (IHS) transform technique, the wavelet transform weighted fusion method, the corresponding wavelet transform-based fusion method, and the contourlet transform-based fusion method.
Index Terms—Image fusion, Non-Subsampled Contourlet<br />
Transform, Non-Negative Matrix Factorization.<br />
I. INTRODUCTION<br />
Image fusion is the process of combining two or more source images from different modalities or instruments into a single image with more information. The
successful fusion is <strong>of</strong> great importance in many<br />
applications, such as remote sensing, computer vision,<br />
medical imaging, and so on. In the pixel level fusion,<br />
some generic requirements can be imposed on the fusion<br />
results [1]:<br />
1) The fused image should preserve all relevant<br />
information contained in the source images as closely as<br />
possible.<br />
2) The fusion process should not introduce any artifacts
or inconsistencies, which can distract or mislead the<br />
human observer, or any subsequent image processing<br />
steps.<br />
3) In the fused image, irrelevant features and noise<br />
should be suppressed to a maximum extent.<br />
Manuscript received February 20, 2011; revised March 22, 2011; accepted June 25, 2011. doi: 10.4304/jsw.7.12.2816-2822.

Panchromatic (PAN) images of high spatial resolution can provide detailed geometric information, such as the shapes, features, and structures of objects on the earth's surface, while multispectral (MS) images, usually of lower resolution, are used to obtain the spectral information necessary for environmental applications. The different
objects within images <strong>of</strong> high spectral resolution are<br />
easily identified. Data fusion methods aim to obtain the<br />
images with high spatial and spectral resolution,<br />
simultaneously. The PAN and MS remote sensing image<br />
fusion is different from other fusion applications, such as<br />
image fusion in military missions or computer-aided<br />
quality control. The specificity is to preserve the spectral<br />
information for subsequent classification <strong>of</strong> ground cover.<br />
The classical fusion methods are principal component analysis (PCA), intensity-hue-saturation (IHS) transform, etc. In recent years, with the development of wavelet transform theory and multi-resolution analysis, two-dimensional separable wavelets have been widely used in image fusion and have achieved good results [2-4].
However, the fusion algorithms mentioned above can hardly achieve this by themselves. They usually cause some characteristic degradation, spectral loss, or color distortion. For example, the IHS transform can enhance texture information and spatial features of fused images, but suffers from much spectral distortion. The PCA method loses some original spectral features in the process of principal component substitution. The wavelet transform (WT) can preserve spectral information efficiently but cannot express spatial characteristics well. Furthermore, isotropic wavelets lack shift-invariance and multi-directionality and fail to provide an optimal expression of highly anisotropic edges and contours in images.
Image decomposition is an important link in image fusion and affects the quality of information extraction, and even the whole fusion quality. In recent years, along with the development and application of wavelet theory, the favorable time-frequency localization for expressing local signals has made the wavelet a candidate for multi-sensor image fusion. However, wavelet bases are isotropic and of limited directionality, and fail to represent highly anisotropic edges and contours in images well. Multi-scale geometric analysis (MGA) has emerged; it stems from wavelet multi-resolution analysis but goes beyond it. MGA can take full advantage of the geometric regularity of image intrinsic structures and obtain an asymptotically optimal representation. As an MGA tool, the contourlet transform (CT) has the characteristics of localization, multi-directionality, and anisotropy [5]. The CT
can give an asymptotically optimal representation of contours and has been applied effectively in image fusion. However, the CT lacks shift-invariance, which results in artifacts along the edges to some extent.
Until recently, multi-resolution decomposition based algorithms have been widely used in the multi-source image fusion field, and have effectively overcome the problem of spectrum distortion. Among them, the wavelet transform enjoys good time-frequency analytical features and has been the focus of multi-source image fusion. In 2002, Do and Vetterli proposed the flexible contourlet transform, which can efficiently capture the geometric structure of images thanks to its properties of multi-resolution, localization and directionality [1]. However, a spectrum aliasing phenomenon occurs, caused by the unfavorable smoothness of the basis functions. Cunha et al. put forward the NSCT (Non-Subsampled Contourlet Transform) [2] in 2006; it improves on the limitations of the contourlet and is a transform with the attributes of shift-invariance, multi-scale and multi-directionality [3].
NMF (Non-Negative Matrix Factorization) is a matrix analysis method [4] presented by Lee and Seung in 1999, whose iterations were proved to converge to a local minimum in 2000 [5]. The non-negative constraints imposed on NMF lead to extensive applications, and it has been successfully applied to image analysis, text clustering, data mining, speech processing, robot control, face recognition, biomedicine and chemical engineering. In the current literature, Miao et al. applied NMF to multi-focus image fusion [6]; Novak et al. utilized NMF in language modeling for grammar identification [7]; Feng et al. used NMF in a face recognition program [8, 9]. Since pixel values are generally non-negative in digital image processing, the results arising from NMF directly express a specific physical meaning.
In this paper, an improved NMF algorithm is proposed and applied to an image fusion scheme combined with the NSCT, in which the novel NMF approach is used to fuse the low-frequency information in the NSCT domain, while the fusion of high-frequency details is realized by adopting the technique called NHM (Neighborhood Homogeneous Measurement). The experimental results demonstrate that the proposed fusion method can effectively extract useful information from the source images and inject it into the final fused image, which has better visual effect and occupies less CPU time compared with the algorithm in [10].
This paper discusses the fusion of multispectral and panchromatic remote sensing images. The rest of this paper is organized as follows. Section II describes the NSCT of images. Section III introduces the NMF fusion algorithm and its improvement. Section IV proposes a new algorithm based on the combination of ANMF and NSCT. Section V reports the fusion experiments on PAN and MS image sets using the proposed algorithm, and conclusions are drawn in the final section.
II. NON-SUBSAMPLED CONTOURLET TRANSFORM<br />
A. Contourlet Transform<br />
Do and Vetterli proposed a “true” two-dimensional<br />
transform called contourlet transform, which is based on<br />
nonseparable filter banks and provides an efficient<br />
directional multiresolution image representation. The CT<br />
represents an image by first applying a multiscale transform,
followed by a local directional transform to gather the<br />
nearby basis functions at the same scale into linear<br />
structures. For example, the Laplacian pyramid (LP) is<br />
first used to capture the point discontinuities, and then<br />
followed by a directional filter bank (DFB) to link point
discontinuities into linear structures. In particular,<br />
contourlets have elongated supports at various scales,
directions, and aspect ratios. The contourlets satisfy<br />
anisotropy principle and can capture intrinsic geometric<br />
structure information <strong>of</strong> images and achieve better<br />
expression than discrete wavelet transform (DWT),<br />
especially for the edges and contours.<br />
However, because of the downsampling and upsampling, the CT lacks shift-invariance and produces ringing artifacts, whereas shift-invariance is desirable in image analysis applications such as edge detection, contour characterization, and image fusion [7].
In particular, during the realization of the CT, the analysis and synthesis filter banks of the LP decomposition are non-separable bi-orthogonal filter banks with bandwidth larger than π/2. According to multirate sampling theory, downsampling the filtered image may result in lowpass and highpass frequency aliasing. Therefore, frequency aliasing effects appear in the directional subbands, which come from the highpass subbands filtered by the DFB. This frequency aliasing causes information in one direction to appear in several directional subbands at the same time, which inevitably weakens the directional selectivity of contourlets.
B. Non-subsampled Contourlet Transform<br />
NSCT is proposed on the grounds <strong>of</strong> contourlet<br />
conception [1], which discards the sampling step during<br />
image decomposition and reconstruction stages.<br />
Furthermore, the NSCT achieves shift-invariance, multi-resolution and multi-directionality for image representation by iteratively using non-subsampled filter banks.
The structure of the NSCT consists of two parts: the NSP (Non-Subsampled Pyramid) and the NSDFB (Non-Subsampled Directional Filter Bank). The NSP, a multi-scale decomposition structure, is a dual-channel non-subsampled filter bank developed from the à trous algorithm, and it contains no subsampling process. Fig. 1(a) shows the framework of the NSP: for each decomposition at the next level, the filter H(z) is first upsampled by two, with sampling matrix D = (2, 0; 0, 2); then the low-frequency components derived from the last level are decomposed iteratively just as their predecessors were. As a result, a tree-like structure that enables multi-scale decomposition is obtained. The NSDFB is constructed from the fan-out DFB presented by Bamberger and Smith. It includes neither upsampling nor subsampling of the image, but relies on upsampling the corresponding filters in the DFB with D = (1, 1; 1, -1), as illustrated in Fig. 1(b).
Fig. 1. Diagram of NSP and NSDFB: (a) three-level NSP; (b) decomposition of NSDFB
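As a rough illustration of the non-subsampled pyramid idea, the Python sketch below performs one decomposition level in the spirit of the à trous scheme: the lowpass filter is upsampled by inserting zeros instead of downsampling the image, and the bandpass detail is the difference. It is a simplified stand-in under stated assumptions (a B3-spline kernel and scipy for filtering), not the actual NSP/NSDFB filter bank of the paper.

    # Simplified single NSP-style level (à trous): illustrative stand-in for the real filters.
    import numpy as np
    from scipy.ndimage import convolve

    def nsp_level(img, level):
        """One non-subsampled pyramid level: lowpass via an upsampled B3-spline kernel."""
        b3 = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
        # "Upsample the filter by two" for deeper levels: insert 2**level - 1 zeros between taps.
        step = 2 ** level
        kernel1d = np.zeros((len(b3) - 1) * step + 1)
        kernel1d[::step] = b3
        kernel2d = np.outer(kernel1d, kernel1d)          # separable 2-D lowpass
        low = convolve(img, kernel2d, mode="nearest")    # no subsampling: output keeps full size
        high = img - low                                 # bandpass detail at this scale
        return low, high

    img = np.random.rand(64, 64)
    low0, high0 = nsp_level(img, level=0)
    low1, high1 = nsp_level(low0, level=1)               # iterate on the lowpass output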
III. IMPROVED NONNEGATIVE MATRIX FACTORIZATION
A. Nonnegative Matrix Factorization<br />
NMF is a recently developed matrix analysis algorithm [4, 5] which not only describes the low-dimensional intrinsic structure of data in a high-dimensional space, but also achieves a linear representation of the original sample data under non-negativity constraints. It makes all components non-negative after decomposition (i.e., a purely additive description) and at the same time realizes a non-linear dimension reduction. NMF is defined as follows.
Conduct N observations of an M-dimensional stochastic vector v and record the data as $v_j$, j = 1, 2, ..., N; let $V = [V_{\cdot 1}, V_{\cdot 2}, \ldots, V_{\cdot N}]$, where $V_{\cdot j} = v_j$. NMF is required to find a non-negative M × L basis matrix $W = [W_{\cdot 1}, W_{\cdot 2}, \ldots, W_{\cdot L}]$ and an L × N coefficient matrix $H = [H_{\cdot 1}, H_{\cdot 2}, \ldots, H_{\cdot N}]$ such that V ≈ WH [4]. The equation can also be written in the more intuitive form

$$V_{\cdot j} \approx \sum_{i=1}^{L} W_{\cdot i} H_{ij},$$

where L should be chosen to satisfy (M + N)L < MN.
In order to find the appropriate factors W and H when solving the NMF problem, the two commonly used objective functions are [5]:

$$E(V\,\|\,WH) = \|V - WH\|_{F}^{2} = \sum_{i=1}^{M}\sum_{j=1}^{N}\left(V_{ij} - (WH)_{ij}\right)^{2}, \qquad (1)$$

$$D(V\,\|\,WH) = \sum_{i=1}^{M}\sum_{j=1}^{N}\left(V_{ij}\log\frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij}\right). \qquad (2)$$

In formulas (1) and (2), for all i, a, j we require $W_{ia} > 0$ and $H_{aj} > 0$, and $\|\cdot\|_{F}$ is the Frobenius norm. Objective (1) is called the Euclidean distance, while (2) is referred to as the K-L divergence function. Note that finding an approximate solution to V ≈ WH is equivalent to optimizing the two objective functions mentioned above.
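For reference, a small numpy sketch of the two objective functions (1) and (2), assuming V, W and H are non-negative arrays of compatible shapes; a small epsilon guards the logarithm.

    import numpy as np

    def euclid_objective(V, W, H):
        """Squared Frobenius distance of Eq. (1)."""
        return np.sum((V - W @ H) ** 2)

    def kl_objective(V, W, H, eps=1e-12):
        """Generalized K-L divergence of Eq. (2)."""
        WH = W @ H
        return np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH)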
B. Accelerated Nonnegative Matrix Factorization<br />
Roughly speaking, the NMF algorithm has high time complexity, which limits the overall performance of the algorithm, so the introduction of improved iteration rules to optimize NMF is crucial for promoting efficiency. From the point of view of algorithm optimization, NMF is an optimization problem subject to non-negativity constraints. Until now, a wide range of decomposition algorithms have been investigated on the basis of the non-negativity constraint, such as multiplicative iteration rules, alternating non-negative least squares, the gradient method and the projected gradient [11]. Among them, the projected gradient approach is capable of reducing the time complexity of the iterations, making NMF applicable under mass-data conditions. In addition, these works are distinguished by meaningful physical significance, effective sparse data, enhanced classification accuracy and a striking decrease in time. We propose a modified version of projected gradient NMF that greatly reduces the complexity of the iterations; the main idea of the algorithm is given below.
As is well known, the Lee-Seung algorithm alternately updates H and W, each while fixing the other, by taking a step in a certain weighted negative-gradient direction, namely:

$$H_{ij} \leftarrow H_{ij} - \eta_{ij}\left[\frac{\partial f}{\partial H}\right]_{ij} \equiv H_{ij} + \eta_{ij}\left(W^{T}A - W^{T}WH\right)_{ij}, \qquad (3)$$

$$W_{ij} \leftarrow W_{ij} - \zeta_{ij}\left[\frac{\partial f}{\partial W}\right]_{ij} \equiv W_{ij} + \zeta_{ij}\left(AH^{T} - WHH^{T}\right)_{ij}, \qquad (4)$$
where $\eta_{ij}$ and $\zeta_{ij}$ are individual weights for the corresponding gradient elements, expressed as follows:

$$\eta_{ij} = \frac{H_{ij}}{(W^{T}WH)_{ij}}, \qquad \zeta_{ij} = \frac{W_{ij}}{(WHH^{T})_{ij}}, \qquad (5)$$

and then the updating formulas become

$$H_{ij} \leftarrow H_{ij}\,\frac{(W^{T}A)_{ij}}{(W^{T}WH)_{ij}}, \qquad W_{ij} \leftarrow W_{ij}\,\frac{(AH^{T})_{ij}}{(WHH^{T})_{ij}}. \qquad (6)$$
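The multiplicative updates (6) translate almost directly into numpy; the sketch below is a minimal version with random initialization, a fixed iteration count and a small epsilon to avoid division by zero, rather than the authors' implementation.

    import numpy as np

    def nmf_multiplicative(A, L, iters=200, eps=1e-9, seed=0):
        """Lee-Seung multiplicative updates of Eq. (6) for A ~ W H with A, W, H >= 0."""
        rng = np.random.default_rng(seed)
        m, n = A.shape
        W = rng.random((m, L))
        H = rng.random((L, n))
        for _ in range(iters):
            H *= (W.T @ A) / (W.T @ W @ H + eps)     # update H with W fixed
            W *= (A @ H.T) / (W @ H @ H.T + eps)     # update W with H fixed
        return W, H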
We notice that the optimal H for a fixed W can be obtained, column by column, by independently solving

$$\min_{He_{j} \ge 0}\ \frac{1}{2}\left\|Ae_{j} - WHe_{j}\right\|_{2}^{2}, \qquad (7)$$
where $e_j$ is the j-th column of the n × n identity matrix. Similarly, we can acquire the optimal W relative to a fixed H by solving, row by row,

$$\min_{W^{T}e_{i} \ge 0}\ \frac{1}{2}\left\|A^{T}e_{i} - H^{T}W^{T}e_{i}\right\|_{2}^{2}, \qquad (8)$$

where $e_i$ is the i-th column of the m × m identity matrix. Actually, both problems (7) and (8) can be put into the ordinary form

$$\min_{x \ge 0}\ \frac{1}{2}\left\|Ax - b\right\|_{2}^{2}, \qquad (9)$$

where A ≥ 0 and b ≥ 0. Since the variables and the given data are all nonnegative, the problem is called a TNNLS (Totally Nonnegative Least Squares) problem.
We propose to revise the algorithm of [4] by applying the same update rule with step-length α as in [10] to the successive updates that improve the objective functions of the two TNNLS problems in formulas (7) and (8). As a result, this brings about a modified form of the Lee-Seung algorithm that successively updates the matrix H column by column and W row by row, with individual step-lengths α and β for each column of H and each row of W respectively. We therefore write the update rules as
$$H_{ij} \leftarrow H_{ij} + \alpha_{j}\,\eta_{ij}\left(W^{T}A - W^{T}WH\right)_{ij}, \qquad (10)$$

$$W_{ij} \leftarrow W_{ij} + \beta_{i}\,\zeta_{ij}\left(AH^{T} - WHH^{T}\right)_{ij}, \qquad (11)$$

where $\eta_{ij}$ and $\zeta_{ij}$ are set equal to some small positive number as described in [10], and $\alpha_j$ (j = 1, 2, ..., n) and $\beta_i$ (i = 1, 2, ..., m) are step-length parameters computed as follows. Let x > 0, $q = A^{T}(b - Ax)$ and $p = [x \,./\, (A^{T}Ax)] \circ q$, where the symbol "./" means component-wise division and "$\circ$" denotes component-wise multiplication. Then, introducing a variable τ ∈ (0, 1),

$$\alpha = \min\left(\frac{p^{T}q}{p^{T}A^{T}Ap},\ \tau\,\max\{\hat{\alpha} : x + \hat{\alpha}p \ge 0\}\right). \qquad (12)$$
We can easily obtain the step-length formula for $\alpha_j$ or $\beta_i$ if (A, b, x) is replaced by (W, $Ae_j$, $He_j$) or ($H^{T}$, $A^{T}e_i$, $W^{T}e_i$), respectively. It is necessary to point out that q is the negative gradient of the objective function in [10], and the search direction p is a diagonally scaled negative-gradient direction. The step-length α or β is either the minimizer of the objective function along the search direction or a τ-fraction of the step to the boundary of the nonnegative quadrant.
It is shown in [10] that both quantities, $p^{T}q / (p^{T}A^{T}Ap)$ and $\max\{\hat{\alpha} : x + \hat{\alpha}p \ge 0\}$, are greater than 1 in the definition of the step α. Thereby, we make $\alpha_j \ge 1$ and $\beta_i \ge 1$ by choosing τ sufficiently close to 1. In our experiments, we choose τ = 0.99, which practically guarantees that α and β are always greater than 1.
Obviously, when α = 1 or β = 1, the update formulas (10) and (11) reduce to the updates (3) and (4). In the algorithm, the step-length parameters are allowed to be greater than 1, which indicates that for any given (W, H) we can obtain at least the same or a greater decrease in the objective function than the algorithm in [10]. Hence, we call the proposed algorithm ANMF (accelerated NMF). Besides, the experiments in a later section demonstrate that the ANMF algorithm is indeed superior to that algorithm, generating better test results, especially when the number of iterations is not too large.
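The step-length rule (12) can be read off directly. The sketch below is a schematic rendering of one step for the generic TNNLS subproblem (9), not the authors' code: it forms the diagonally scaled direction p and takes α as the smaller of the exact minimizer along p and a τ-fraction of the step to the non-negative boundary. In the ANMF, (A, b, x) would be instantiated column-wise as (W, Ae_j, He_j) and row-wise as (H^T, A^T e_i, W^T e_i).

    import numpy as np

    def tnnls_step(A, b, x, tau=0.99, eps=1e-12):
        """One step for min 0.5*||Ax - b||^2 s.t. x >= 0, following Eqs. (9)-(12); x > 0 assumed."""
        q = A.T @ (b - A @ x)                      # negative gradient of the objective
        p = (x / (A.T @ (A @ x) + eps)) * q        # diagonally scaled direction
        exact = (p @ q) / (p @ (A.T @ (A @ p)) + eps)      # minimizer along p
        neg = p < 0
        to_boundary = np.min(-x[neg] / p[neg]) if np.any(neg) else np.inf
        alpha = min(exact, tau * to_boundary)      # step length of Eq. (12)
        return np.maximum(x + alpha * p, 0.0)      # safeguard: stay in the non-negative quadrant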
IV. THE ANMF AND NSCT COMBINED ALGORITHM
A. The Selection <strong>of</strong> Fusion Rules<br />
As is known, the approximation characteristics of an image belong to the low-frequency part, while the high-frequency counterpart exhibits detailed features of edges and texture. In this paper, the NSCT is utilized to separate the high- and low-frequency components of the source images in the frequency domain, and then the two parts are processed with different fusion rules according to their features. As a result, the fused image can be more complementary, reliable, clear and easier to understand.
By and large, the low-pass sub-band coefficients approximate the original image at low resolution; they generally represent the image contour, but high-frequency details such as edges and region contours are not contained in them. So we use the ANMF algorithm to obtain the fused low-pass sub-band coefficients, which include the holistic features of the two source images. The band-pass directional sub-band coefficients embody the particular information: edges, lines, and region boundaries; the main aim here is to obtain as much spatial detail as possible. In our paper, an NHM-based local self-adaptive fusion method is adopted in the band-pass directional sub-band coefficient acquisition phase: the identical degree of the corresponding neighborhoods is calculated to determine the selection of the band-pass coefficient fusion rules.
B. The Course <strong>of</strong> Image Fusion<br />
Given two source images A and B of the same size, both already registered, let F denote the fused image. The fusion process is shown in Fig. 2 and the steps are given as follows.
(1) Apply the NSCT to perform the multi-scale and multi-direction decompositions of the source images A and B, obtaining the sub-band coefficients { C_{i0}^A(m, n), C_{i,l}^A(m, n) } and { C_{i0}^B(m, n), C_{i,l}^B(m, n) }.
Fig. 2. Flowchart <strong>of</strong> fusion algorithm
(2) Construct the matrix V from the low-pass sub-band coefficients C_{i0}^A(m, n) and C_{i0}^B(m, n):

    V = [ v_A, v_B ] = ⎡ v_{1A}  v_{1B} ⎤
                       ⎢ v_{2A}  v_{2B} ⎥ ,          (13)
                       ⎢   ⋮       ⋮    ⎥
                       ⎣ v_{nA}  v_{nB} ⎦
where v_A and v_B are column vectors consisting of the pixels of A and B respectively, scanned row by row, and n is the number of pixels of each source image. The ANMF algorithm described above is then performed on V, from which W, which is in fact the vector of low-pass sub-band coefficients of the fused image F, is separated. We set the maximum number of iterations to 1000 and τ = 0.99.
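A sketch of this step, written by us for illustration only (not the authors' code): anmf_factorize stands in for the ANMF routine of Section III, and the choice of factorization rank 1 is an assumption, made because W must supply a single fused low-pass band.

    import numpy as np

    def fuse_lowpass(low_a, low_b, anmf_factorize):
        # Equation (13): stack the two low-pass sub-bands, scanned row by row,
        # as the two columns of V.
        V = np.column_stack([low_a.ravel(), low_b.ravel()])
        # Factorize V ~= W H with the accelerated NMF routine (placeholder callable);
        # W carries the fused low-pass coefficients.
        W, H = anmf_factorize(V, rank=1, max_iter=1000, tau=0.99)
        return W[:, 0].reshape(low_a.shape)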
The fusion rule NHM is applied to the band-pass directional sub-band coefficients C_{i,l}^A(m, n) and C_{i,l}^B(m, n) of the source images A and B. The NHM is calculated as:

    NHM_{i,l}(m, n) = 2 · Σ_{(k, j) ∈ N_{i,l}(m, n)} | C_{i,l}^A(k, j) | · | C_{i,l}^B(k, j) |  /  ( E_{i,l}^A(m, n) + E_{i,l}^B(m, n) ),          (14)
where E_{i,l}(m, n) is the neighborhood energy at resolution 2^l in direction i, and N_{i,l}(m, n) is the 3 × 3 neighborhood centered at the point (m, n). In fact, NHM quantifies the identical degree of the corresponding neighborhoods of the two images: the higher the identical degree, the greater the NHM value. Because 0 ≤ NHM_{i,l}(m, n) ≤ 1, we define a threshold T, generally with 0.5 < T < 1.
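For illustration only, the following NumPy/SciPy sketch computes such an NHM map; it is not the authors' code, the window-energy definition (sum of squared coefficients over the same 3 × 3 window) is our assumption, and the names nhm_map, c_a and c_b are hypothetical.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def nhm_map(c_a, c_b, win=3):
        # Neighborhood sums over a win x win window (uniform_filter returns means,
        # so multiply back by the window area).
        area = win * win
        cross = uniform_filter(np.abs(c_a) * np.abs(c_b), size=win) * area
        e_a = uniform_filter(c_a ** 2, size=win) * area   # assumed neighborhood energy
        e_b = uniform_filter(c_b ** 2, size=win) * area
        # Equation (14); the small constant avoids division by zero.
        return 2.0 * cross / (e_a + e_b + 1e-12)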
superior to the two. The statistics in Table 1 confirm the visual impression.
Table 1 shows that the proposed method has an advantage over the others, since three of the four criteria, covering detail rendering and detail preservation, are superior to those of the remaining algorithms. The IE index of our method exceeds that of M1, M2 and M3 by 3.1%, 1.3% and 1.5% respectively. The RCE index of the latter three methods is relatively high compared with that of M1 when assessing the deviation level. For the AG index the value of our method excels, which indicates that our method gives a better visual effect. As for the Q index, the value 0.9844 of our method represents the best performance compared with the values of the former three algorithms.
TABLE 1.<br />
COMPARISON OF THE FUSION METHODS FOR MULTI-FOCUS IMAGES<br />
M1 M2 M3 Proposed method<br />
IE 7.3276 7.4594 7.4486 7.5608<br />
RCE 0.2875 0.4940 0.4832 0.4539<br />
AG 8.4581 8.2395 8.4595 8.6109<br />
Q Index 0.9579 0.9723 0.9706 0.9844<br />
C. Visible and Infrared Image Fusion<br />
One set of registered visible and infrared images, showing a person walking in front of a house, is labeled as Fig. 4(a) and Fig. 4(b); the images are 360 by 240 pixels. Fig. 4(a) has a clear background but fails to reveal the foreground, while Fig. 4(b) highlights the person and the house but renders the other surroundings weakly. As in the previous experiment, the four methods are applied one by one to fuse these images.
We find that the image produced by method M1 is the worst in overall effect, especially the dark area around the person, which is partly caused by the significant differences between the two source images. Method M2 produces smoother details than M1; for example, the road on the right side of the image and the grass on the other side can easily be recognized thanks to the enhanced intensity. Similar effects, shown in Fig. 4(e) and Fig. 4(f), are achieved by M3 and by our method, from which we can easily distinguish most parts of the scene, except that the lighting beside the house in Fig. 4(e) can hardly be observed. It is difficult to judge between M3 and our method by visual inspection alone without the concrete data provided in Table 2.
For the IE index, the value of our method is 6.7962, surpassing that of M3 by 1.72%, which indicates that our method has a distinct superiority over the other algorithms as far as the amount of information is concerned. As for RCE, the smallest deviation value is achieved by M1. For the AG index the optimal value is obtained by our method, while that of M2 takes the last place. The Q index of our method is again the best in Table 2.
Fig. 4. Visible and infrared source images and fusion results: a – visible<br />
band image; b – Infrared band image; c – fused image based on M1; d –<br />
fused image based on M2; e – fused image based on M3; f – fused<br />
image based on our method<br />
TABLE 2.<br />
COMPARISON OF THE FUSION METHODS FOR VISIBLE AND<br />
INFRARED IMAGES<br />
M1 M2 M3 Proposed method<br />
IE 6.2103 6.3278 6.7012 6.7962<br />
RCE 1.3254 5.8970 4.4375 4.0261<br />
AG 3.2746 3.0833 3.3695 3.5428<br />
Q Index 0.9761 0.9784 0.9812 0.9903<br />
D. Numerical Experiment on ANMF<br />
In this section we compare the performance of ANMF with that of the algorithm presented in [10] in order to demonstrate its advantage. Both algorithms are implemented in Matlab and applied to the Equinox face database [13]. The contrast experiments are conducted 4 times, where p is as described above and n denotes the number of images chosen from the face database. The Y axis of Fig. 5 represents the number of iterations performed by the two algorithms and the X axis is the elapsed time. We present one group of these experiments in Fig. 5, with p = 100 and n = 1000, in which the algorithm in [10] is first run for a given number of iterations and the elapsed time is recorded, and then our algorithm is run until it has consumed the same amount of time. We note that our algorithm offers an improvement at all of the given time points; however, the relative improvement of our method over the algorithm in [10] decreases as the number of iterations grows. Specifically, the performance of our method improves by about 36.8%, 26.4%, 15.7%, 12.6% and 7.5% respectively in the five comparisons. In other words, our method converges faster, especially at the early stages, but the improvement percentage tends to decline, which implies that this property is most useful for real-time applications that are not of very large scale.
Fig. 5. Comparison <strong>of</strong> our method and algorithm in [10]<br />
V. CONCLUSION<br />
In this paper we presented a technique for image fusion based on the NSCT and the ANMF model. The accelerated NMF method modifies the previous update rules of W and H and achieves a better effect by exploiting the theory of matrix decomposition. The existing NMF-based approaches usually need more iterations to converge than the proposed method, whereas a satisfactory result can be attained by our technique with fewer iterations. The simulation results show that the proposed algorithm not only reduces the computational cost but also achieves better performance than the other techniques mentioned, both visually and statistically.
ACKNOWLEDGMENT<br />
This paper is sponsored by the Foundation of Jiangxi Educational Committee (GJJ10478).
REFERENCES<br />
[1] Do M. N., Vetterli M. “The Contourlet Transform: an<br />
Efficient Directional Multi-resolution Image<br />
Representation”. IEEE Transactions on Image Process.<br />
14(12):2091–2106, 2005.<br />
[2] Cunha A. L., Zhou J. P., Do M. N. “The Non-subsampled<br />
Contourlet Transform: Theory, Design and Applications”.<br />
IEEE Transactions on Image Process. 15(10):3089–<br />
3101,2006.<br />
[3] Qu X. B., Yan J. W., Yang G. D. “Multi-focus Image<br />
Fusion Method <strong>of</strong> Sharp Frequency Localized Contourlet<br />
Transform Domain based on Sum-modified-laplacian”.<br />
Opt. Precision Eng.17(5):1203–1211,2009.<br />
[4] Lee D. D., Seung H. S. “Learning the Parts <strong>of</strong> Objects by<br />
Nonnegative Matrix Factorization” Nature.<br />
401(6755):788–791,1999.<br />
[5] Miao Q. G., Wang B. S. “Multi-focus Image Fusion based<br />
on Nonnegative Matrix Factorization” Acta Optica Sinica.<br />
25(6):755–759,2005.<br />
[6] Novak M., Mammone R. “Use <strong>of</strong> Non-negative Matrix<br />
Factorization for Language Model Adaptation in a Lecture<br />
Transcription Task”. Proc. <strong>of</strong> IEEE International<br />
Conference on Acoustics, Speech and Signal Processing. -<br />
Salt Lake, P. 541-544,2001.<br />
[7] Guillamet D., Bressan M., Vitria J. “A Weighted Nonnegative<br />
Matrix Factorization for Local Representations”.<br />
Proc. <strong>of</strong> IEEE Computer Society Conference on Computer<br />
Vision and Pattern Recognition. - Kauai, 942-947,2001.<br />
[8] Feng T., Li S. Z., Shun H. Y. “Local Non-negative Matrix<br />
Factorization as a Visual Representation”. Proc. <strong>of</strong> 2nd<br />
International Conference on Development and Learning. -<br />
Cambridge, P.1-6, 2002.<br />
[9] Michael M., Zhang Y. “An Interior-point Gradient Method<br />
for Large-scale Totally Nonnegative Least Squares<br />
Problems”.International <strong>Journal</strong> <strong>of</strong> Optimization Theory<br />
and Applications. 126(1):191–202, 2005.<br />
[10] Li L., Zhang Y. J. “A Survey on Algorithms <strong>of</strong> Nonnegative<br />
Matrix Factorization”.Acta Electronica Sinica.<br />
36(4):737–742, 2008.<br />
[11] Anjali M., Bhirud S. G. “Objective Criterion for<br />
Performance Evaluation <strong>of</strong> Image Fusion<br />
Techniques”International <strong>Journal</strong> <strong>of</strong> Computer<br />
Applications. 11(5):57–60, 2010.<br />
[12] Rockinger O. “Pixel-level fusion <strong>of</strong> image sequences using<br />
wavelet frames”. Proceedings <strong>of</strong> Image Fusion and Shape<br />
Variability Techniques. P.149–154, 1996.<br />
[13] Gonzalez A M, Saleta J L, Catalan R G, Garcia R. “Fusion<br />
<strong>of</strong> multispectral and panchromatic images using improved<br />
IHS and PCA mergers based on wavelet decomposition”.<br />
IEEE Transactions on Geoscience and Remote Sensing,<br />
42(6):1291–1299,2004,
Research on Intrusion Detection Model <strong>of</strong><br />
Heterogeneous Attributes Clustering<br />
Linquan Xie<br />
School <strong>of</strong> Science, Jiangxi University <strong>of</strong> Science and Technology, 341000 Ganzhou, China<br />
Email: lq_xie@163.com<br />
Ying Wang 1 , Fei Yu 2, Chen Xu 2, 3 and GuangXue Yue 4<br />
1 School <strong>of</strong> Science, Jiangxi University <strong>of</strong> Science and Technology, 341000 Ganzhou, China<br />
2 Jiangsu Provincial Key Laboratory for Computer Information Processing Technology,<br />
Soochow University, 215000 Soochow, China<br />
3 School <strong>of</strong> Information Science and Engineering, Hunan University, 416000 Changsha, China<br />
4 Department <strong>of</strong> Computer Science and Technology, Huaihua University, Huaihua, China<br />
Email: hunanyufei@126.com<br />
Abstract—A fuzzy clustering algorithm for intrusion detection based on heterogeneous attributes is proposed in this paper. Firstly, the algorithm modifies the similarity measurement for the categorical attributes according to the Hamming distance formula; then, to address the shortcomings of the fuzzy C-means clustering algorithm, namely sensitivity to initialization and a tendency to fall into local optima, the presented algorithm is optimized with the GuoTao approach. We simulate our algorithm on the KDDCUP99 data set, and the results show that the convergence rate of the new algorithm is faster than that of the original fuzzy C-means clustering algorithm and that its performance is more stable.
Index Terms—Intrusion Detection, Heterogeneous Attributes, Fuzzy Clustering
I. INTRODUCTION<br />
Computer networks, represented by the Internet, have developed rapidly. They provide a convenient and efficient way to openly access, spread and share information. At the same time, the network faces various security issues that are becoming more and more serious. Intrusion detection is an important part of network security research, and how to build intrusion detection models that can detect intrusions quickly and precisely is currently the key issue in this area.
Intrusion detection can be considered as a classification problem over the given data sets: which data are normal and which data are abnormal [1]. Clustering, as an unsupervised anomaly detection technique, can classify large data sets without pre-defined data types or a training set of labeled data, avoiding the high cost of labeling the data.
Manuscript received June 11, 2011; revised June 29, 2011; accepted<br />
August 27, 2011.<br />
doi:10.4304/jsw.7.12.2823-2831<br />
How to continuously improve the detection efficiency of intrusion detection systems has always been a hot research topic. The fuzzy C-means clustering algorithm has good efficiency and scalability when processing large data sets, but it can only handle continuous data and is helpless with discrete data. In fact, the KDDCUP99 data set used in the simulation below consists of both continuous and discrete data. If the research focuses only on the continuous data, or simply replaces the discrete data with numerical values, the efficiency of intrusion detection may suffer. In this paper both the continuous data and the discrete data are considered, and the similarity measure formula for the discrete attributes is improved so that the detection efficiency can be enhanced. We also propose an intrusion detection algorithm of fuzzy clustering based on heterogeneous attributes; then, to address the shortcomings of the fuzzy C-means clustering algorithm, namely sensitivity to initialization and a tendency to fall into local optima, the presented algorithm is optimized by combining it with the GuoTao algorithm.
In this paper, the first section introduces the intrusion detection algorithm of fuzzy clustering based on heterogeneous attributes; the second section describes the intrusion detection system built on the algorithm; the third section presents the simulation and the performance analysis; the last part draws the conclusions.
II. FUZZY CLUSTERING ALGORITHM FOR INTRUSION<br />
DETECTION BASED ON HETEROGENEOUS ATTRIBUTES<br />
A. Fuzzy C-means Clustering Algorithm<br />
The fuzzy C-means clustering algorithm divides a data set X = { X_i | X_i ∈ R^p, i = 1, 2, ..., n } containing n instances into K categories (1 < K < N) according to the
minimization principle of the within-group sum of squares: the category each datum belongs to is determined by its membership, and the clustering centers are calculated so that the objective function is minimized [2]. The classification matrix is written as U = ( u_ij ), i = 1, 2, ..., n; j = 1, 2, ..., k, where u_ij is the membership of instance i in category j, and it must satisfy the following condition:
    Σ_{j=1}^{k} u_ij = 1,  ∀ i = 1, ..., n.          (1)

After the fuzzy partition produced by fuzzy C-means clustering, the membership values, which lie between 0 and 1, determine the cluster to which each data object belongs; the elements of the matrix U take values in the interval [0, 1]. We define the objective function as follows [2]:

    J_m(U, C) = Σ_{i=1}^{n} Σ_{j=1}^{k} u_ij^m d_ij^2( X_i, C_j ).          (2)
In this function, J_m is the sum of squared distances between each instance and the cluster centers; C_j ∈ I denotes a clustering center, with C = { C_j | C_j ∈ I, j = 1, 2, ..., k }; X_i ∈ I is the set of instances; u_ij indicates the membership of instance i in clustering center j, taking values in [0, 1]; U = { u_ij } is an n × k matrix, and C = [ C_1, C_2, ..., C_k ] is an s × k matrix; X_i ∈ R^p denotes a data object; d_ij( X_i, C_j ) is the distance between instance i and clustering center j; m (1 ≤ m < ∞) is the fuzzy coefficient; k is the number of categories, which is given in advance and determined by the initial clustering. The necessary conditions for minimizing J_m, obtained by the Lagrange multiplier method, are [2]:
    u_ij = 1 / Σ_{l=1}^{k} ( d_ij / d_il )^{2/(m−1)},  ∀ i, j,          (3)

    c_j = ( Σ_{i=1}^{n} u_ij^m x_i ) / ( Σ_{i=1}^{n} u_ij^m ),  ∀ j.          (4)

The fuzzy coefficient m is a scalar that controls the fuzzy clustering algorithm in these formulas; it measures the degree of blur of the membership matrix U, and the greater the value of m, the more blurred the partition becomes. When m = 1, the fuzzy C-means clustering algorithm reduces to the traditional C-means clustering algorithm. To minimize the objective function, the fuzzy C-means clustering algorithm has to be computed iteratively.
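For illustration, a compact NumPy sketch of one pass of updates (3) and (4) might look as follows; it is our own sketch, not code from [2], and the array names X, C, U and the small epsilon guard are assumptions:

    import numpy as np

    def fcm_update(X, C, m=2.0):
        # Distances d_ij between the n instances in X (n x p) and k centers in C (k x p).
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-12
        # Equation (3): membership update.
        ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
        U = 1.0 / ratio.sum(axis=2)
        # Equation (4): center update with memberships raised to the power m.
        Um = U ** m
        C_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
        return U, C_new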
B. Heterogeneous Attributes <strong>of</strong> Fuzzy Clustering<br />
For the heterogeneous attributes of the sample data, distance measures are only suitable for numerical data, so we solve the problem with a new approach following paper [3]. The distance between x_i and x_j on a categorical attribute k is described as:

    d_c( x_ik, x_jk ) = a if x_ik ≠ x_jk, and 0 if x_ik = x_jk.          (5)
In (5), a indicates the distance between x_i and x_j when attribute k takes dissimilar values. Assume the numbers of continuous attributes and categorical attributes are p and q respectively; the distance between objects can then be expressed as:

    d( x_i, x_j ) = d_n( x'_i, x'_j ) + d_c( x_i, x_j ),          (6)

where d_n( x'_i, x'_j ) is the distance between objects on the numeric attributes after standardization, and d_c( x_i, x_j ) is the distance between objects on the categorical attributes.
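A minimal sketch of this mixed distance, written by us for illustration: it assumes each record stores its p standardized continuous attributes first and its q categorical attributes afterwards, the squared form of the numeric part follows the objective (7), and a = 0.28 is the value used later in the experiments.

    import numpy as np

    def mixed_distance(x_i, x_j, p, a=0.28):
        # First p entries: standardized continuous attributes (squared Euclidean part).
        d_numeric = float(np.sum((x_i[:p] - x_j[:p]) ** 2))
        # Remaining q entries: categorical attributes, equation (5) summed over them.
        d_categorical = a * int(np.sum(x_i[p:] != x_j[p:]))
        # Equation (6): total heterogeneous distance.
        return d_numeric + d_categorical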
The objective function for data sets with heterogeneous attributes can be obtained by modifying formula (2); it can be expressed as [3]:

    J_m(U, C) = Σ_{i=1}^{n} Σ_{j=1}^{k} μ_ij^m { Σ_{l=1}^{p} ( x'_il − x'_jl )^2 + Σ_{l=p+1}^{p+q} d_c( x_il, x_jl ) }.          (7)
In (7), m > 1 is the fuzzy coefficient, used to control the degree of blur of the membership matrix U. Suppose

    C_i^n = Σ_{j=1}^{k} u_ij^m Σ_{l=1}^{p} ( x'_il − x'_jl )^2,          (8)

    C_i^c = Σ_{j=1}^{k} u_ij^m Σ_{l=p+1}^{p+q} λ d_c( x_il, x_jl ).          (9)
Because C_i^n and C_i^c are both non-negative, we can minimize J_m(U, C) by minimizing C_i^n and C_i^c respectively. Meanwhile, the membership expression obtained by the Lagrangian multiplier method is [3]:

    u_ij = { Σ_{l=1}^{k} [ d( x_i, x_j ) / d( x_l, x_j ) ]^{2/(m−1)} }^{−1},  ∀ i.          (10)
The formula for the cluster centers can be corrected as follows:

    C_jl = ( Σ_{i=1}^{n} u_ij^m x'_il ) / ( Σ_{i=1}^{n} u_ij^m ),   l = 1, 2, ..., p;
    C_jl = C_l^max,   l = p+1, ..., p+q.          (11)

Because m > 1, the algorithm can be proved to be convergent.
D. Fuzzy Clustering Algorithm <strong>of</strong> Heterogeneous<br />
Attributes<br />
The optimization process of the fuzzy clustering algorithm can be summarized as follows:
Step 1: initialize the membership matrix U with values between 0 and 1 so that it satisfies the constraint Σ_{j=1}^{k} u_ij = 1, ∀ i = 1, ..., n.
Step 2: for the different attribute types of the data, calculate the cluster centers by formula (11).
Step 3: calculate the new membership matrix U with formula (10).
Step 4: calculate the objective function by formula (7). When the value of the objective function, or its change between iterations, is less than a given threshold, the algorithm stops and the clustering result is output; otherwise it returns to Step 2 for another iteration.
With the above method we consider not only the continuous attributes of the sample data set but also the categorical attributes. The data are analysed comprehensively, which reduces the error rate, and combining this method with the optimization method for the fuzzy clustering algorithm can further improve the detection efficiency.
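The following NumPy sketch performs one pass of Steps 2-4 for heterogeneous data; it is our own illustration, not the authors' implementation. In particular, taking the membership-weighted mode as the categorical center C_l^max and using the exponent 1/(m − 1) on the already-squared mixed distance are assumptions about how (10) and (11) are instantiated here.

    import numpy as np

    def weighted_mode(column, weights):
        # Categorical "center": value with the largest membership-weighted count.
        values = np.unique(column)
        return values[np.argmax([weights[column == v].sum() for v in values])]

    def hetero_fcm_iteration(Xn, Xc, U, m=4.0, a=0.28):
        # Xn: n x p continuous part (standardized), Xc: n x q categorical part,
        # U: n x k membership matrix.  One pass of Steps 2-4.
        Um = U ** m
        # Equation (11): weighted means for continuous attributes,
        # most frequent values for categorical attributes.
        Cn = (Um.T @ Xn) / Um.sum(axis=0)[:, None]
        Cc = np.array([[weighted_mode(Xc[:, l], Um[:, j]) for l in range(Xc.shape[1])]
                       for j in range(U.shape[1])])
        # Mixed squared distance between every instance and every center, eqs. (5)-(7).
        d = ((Xn[:, None, :] - Cn[None, :, :]) ** 2).sum(-1) \
            + a * (Xc[:, None, :] != Cc[None, :, :]).sum(-1) + 1e-12
        # Equation (10): membership update (exponent 1/(m-1) because d is squared).
        U_new = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (1.0 / (m - 1.0))).sum(axis=2)
        # Equation (7): objective value used in the Step-4 stopping test.
        J = float((U_new ** m * d).sum())
        return U_new, Cn, Cc, J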
E. GuoTao Algorithm<br />
The problems of the fuzzy C-means algorithm when processing large data sets are as follows: it takes a lot of time, it is sensitive to the initialization, and it easily falls into a local minimum. There are many improved methods, such as neural networks, that save time, but they cannot solve the initialization sensitivity; a genetic algorithm, however, is a global optimization method that can overcome the initialization sensitivity of the fuzzy C-means clustering algorithm.
The genetic algorithm is built on the basis of biological evolution; it is a search algorithm based on natural selection and genetic mechanisms. A genetic algorithm does not need to build a model or perform complex computations for complex problems; it can find the optimal value using only the genetic operators.
The GuoTao algorithm is a kind of random search algorithm based on the genetic algorithm; it is an improved genetic algorithm. It was proposed by Guo Tao in 1999 [4]; it combines the sub-space search method with a group hill-climbing method and is suitable for solving functions with inequality constraints. The GuoTao algorithm is conducive to finding the global optimum in the search space; the random search strategy applied in the subspace reflects the non-convexity of the random search in the sub-space, which is expressed as:

    X' = Σ_{i=1}^{m} a_i X_i,   Σ_{i=1}^{m} a_i = 1,   −0.5 ≤ a_i ≤ 1.5.          (12)
Assume the search space is V = { X | X = ( x_1, x_2, ..., x_d )^T ∧ x_min ≤ x_i ≤ x_max, i = 1, 2, ..., d }, where d is the dimension and the objective function f(X) is to be minimized. Suppose X'_j = ( x'_j1, x'_j2, ..., x'_jd )^T, j = 1, 2, ..., m, are m different points in V, and write the subspace as V' = { X ∈ V | X = Σ_{i=1}^{m} a_i X'_i }, where the a_i must satisfy Σ_{i=1}^{m} a_i = 1 and −0.5 ≤ a_i ≤ 1.5. Let l_i(X) = 0 if g_i(X) ≤ 0 and l_i(X) = g_i(X) otherwise, and let L(X) = Σ_i l_i(X). The logic function Better can then be defined as [6]:

    Better( X_1, X_2 ) = 1, if L(X_1) < L(X_2) or ( L(X_1) = L(X_2) ∧ f(X_1) ≤ f(X_2) );
    Better( X_1, X_2 ) = 0, if L(X_1) > L(X_2) or ( L(X_1) = L(X_2) ∧ f(X_1) > f(X_2) ).          (13)

Better( X_1, X_2 ) = 1 means that X_1 is better than X_2.
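A small sketch of these two ingredients, written by us for illustration; the rejection-sampling scheme for drawing the coefficients a_i is one simple choice, not necessarily the original one, and f, L and the helper names are assumptions:

    import numpy as np

    def better(x1, x2, f, L):
        # Equation (13): x1 is "better" if it violates the constraints less,
        # or ties on violation and has the smaller objective value.
        return int(L(x1) < L(x2) or (L(x1) == L(x2) and f(x1) <= f(x2)))

    def subspace_offspring(parents, rng=None):
        # Equation (12): random affine combination of m parents with coefficients
        # in [-0.5, 1.5] that sum to 1 (a non-convex combination).
        rng = np.random.default_rng() if rng is None else rng
        m = len(parents)
        while True:
            a = rng.uniform(-0.5, 1.5, size=m)
            s = a.sum()
            if abs(s) > 1e-8:
                a = a / s
                if np.all(a >= -0.5) and np.all(a <= 1.5):
                    return a @ np.asarray(parents)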
The advantages of the GuoTao algorithm can be summarized in the following five aspects [5]: 1) it can be implemented in fewer than one hundred lines of C, so it is a simple algorithm; 2) it is versatile and can be used to solve optimization problems for complex functions; 3) it does not require parameter tuning; only the function expression needs to be changed for different problems; 4) it usually obtains the global optimum in a relatively short time; 5) when the optimum is not unique, the algorithm may also find more than one optimum.
F. Algorithm Optimization of Heterogeneous Attributes of the Fuzzy C-means Clustering
Analysing the original fuzzy C-means clustering algorithm, its shortcomings can be expressed as follows: first, the performance of the algorithm is not stable enough, mainly because of its sensitivity to initialization; second, the algorithm easily falls into a local optimum; third, when the value of the objective function attains its minimum, the detection rate is not necessarily the highest, which contradicts the assumption of the original fuzzy C-means clustering algorithm that the minimum of the objective function corresponds to the highest detection rate.
To avoid these shortcomings, we first choose the GuoTao algorithm, described above, to solve the problem of easily falling into a local optimum.
When combining with the GuoTao algorithm, a population with multiple individuals has to be initialized first. Running the fuzzy C-means clustering algorithm with heterogeneous attributes iteratively, the cluster centers and the memberships of each individual of the population must be updated in turn. Since each update requires an initialization and the random initial values may differ, the performance of the algorithm would be unstable. In the modified algorithm a parallel scheme is used to deal with this problem: a FOR loop updates the cluster centers of all individuals before the best and the worst individuals are selected.
Owing to the group search strategy of evolutionary computation, the GuoTao algorithm covers the whole search space and is conducive to obtaining the optimal set in the global scope; at the same time, the algorithm eliminates only the individual with the worst fitness, so the selection pressure is minimal, which preserves the diversity of the population and makes sure the individual with the best fitness is retained. For the problem that the minimum of the objective function does not correspond to the highest detection rate, we introduce a crossover probability p, defined as follows:

    p = min J_m^{(n)} / avg J_m^{(n)}.          (14)
In (14), n is the number of individuals of the initial population; iterating with the parallel scheme yields n objective-function values, of which min J_m^{(n)} is the minimum and avg J_m^{(n)} is the average. The introduction of the crossover probability p makes the algorithm perform the crossover only with probability p, rather than in every iteration for every individual.
Before the crossover operation of each iteration, a random probability is generated and compared with the crossover probability p; when the random probability is less than p, the crossover operation is performed. m individuals are selected from the n individuals and composed into a new individual; if the fitness of the new one is worse than that of the worst individual in the original population, a new individual is composed again for the crossover operation.
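A short illustration of this gate, assuming J_values holds the current objective values of the population (our own sketch):

    import numpy as np

    def crossover_gate(J_values, rng=None):
        # Equation (14): p = min(J) / average(J) over the population's objective
        # values; cross over only when a uniform draw falls below p.
        rng = np.random.default_rng() if rng is None else rng
        p = np.min(J_values) / np.mean(J_values)
        return rng.random() < p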
The GuoTao algorithm is used to optimize the fuzzy C-means clustering algorithm with heterogeneous attributes; the process can be described as follows:
Step 1: initialize the population P = { X_1, X_2, ..., X_n }, X_i ∈ V, and set generation = 1, X_best = 0, X_worst = 0, the threshold ε, and the maximum number of cycles MAXgen.
Step 2: decode, i.e. obtain the cluster centers from the genotype of each individual chromosome.
Step 3: use the fuzzy C-means clustering algorithm with heterogeneous attributes: calculate the membership μ_ij of each cluster center according to equation (10), where i = 1, 2, ..., n and j = 1, 2, ..., c, and then calculate the value of the objective function by equation (7).
Step 4: calculate the fitness F(X) = 1 / (1 + J_m) of each individual X_i; compare the fitness of X_i with the fitness of X_worst: if the fitness of X_i is greater than the fitness of X_worst, replace X_worst with X_i, while if the fitness of X_i is less than the fitness of X_best, replace X_best with X_i.
Step 5: while the number of iterations generation is less than the maximum number of cycles MAXgen, update the cluster centers of all the individuals.
Step 6: select (X_best, X_worst) satisfying Better(X_best, X) = 1 and Better(X_worst, X) = 0; calculate the fitness of X_worst and of X_best; if their difference is greater than ε, the value of the logic function is 1, otherwise it is 0.
Step 7: when Better(X_best, X_worst) = 1, calculate the crossover probability p by formula (14); if the random probability is less than p, generate the subspace V' = { X_1, X_2, ..., X_m }, X_i ∈ V, and randomly select X' ∈ V'.
Step 8: obtain X_best: calculate the fitness of X'; if Better(X', X_worst) = 1, then set X_worst = X' and replace the fitness of X_worst with the fitness of X'; if Better(X', X_best) = 1, then X_worst is assigned to X_best.
Step 9: obtain X_worst: assume X_worst = 0, and calculate the fitness of X_worst and of each individual; if Better(X_worst, i) = 1, that is, X_worst is better than individual i, then replace X_worst with i; loop until all the individuals have been compared.
Step 10: choose (X_best, X_worst) once again, making sure that Better(X_best, X) = 1 and Better(X_worst, X) = 0; calculate the fitness of X_worst and of X_best; if their difference is greater than ε, the value of the logic function is 1, otherwise it is 0.
Step 11: calculate the fitness of the finally obtained X_best, and at the same time set generation = generation + 1.
Step 12: judge the evolution condition: if generation is less than or equal to MAXgen, return to Step 1; otherwise, exit the evolutionary loop.
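Purely as a sketch of how the pieces above could fit together, and not the authors' Steps 1-12 verbatim: for brevity it ranks individuals by the objective value J instead of the fitness 1/(1 + J_m), crosses over flattened membership matrices, and reuses the hypothetical helpers hetero_fcm_iteration, crossover_gate and subspace_offspring defined earlier.

    import numpy as np

    def guotao_fcm(Xn, Xc, k, n_pop=10, max_gen=200, eps=1e-6, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        n = Xn.shape[0]
        # Population of random membership matrices with rows summing to 1.
        pop = [rng.random((n, k)) for _ in range(n_pop)]
        pop = [U / U.sum(axis=1, keepdims=True) for U in pop]
        for gen in range(max_gen):
            # One heterogeneous-FCM pass per individual (parallel update of all centers).
            results = [hetero_fcm_iteration(Xn, Xc, U) for U in pop]
            pop = [r[0] for r in results]
            J = np.array([r[3] for r in results])
            best, worst = int(J.argmin()), int(J.argmax())
            if J[worst] - J[best] <= eps:
                break
            if crossover_gate(J, rng):
                # Subspace crossover of a few parents; the offspring replaces the worst.
                parents = [pop[i].ravel() for i in rng.choice(n_pop, size=3, replace=False)]
                child = subspace_offspring(parents, rng).reshape(n, k)
                child = np.clip(child, 1e-12, None)
                child = child / child.sum(axis=1, keepdims=True)
                pop[worst] = child
        return pop[best]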
III. INTRUSION DETECTION SYSTEM OF CLUSTERING WITH<br />
HETEROGENEOUS ATTRIBUTES<br />
The anomaly detection model based on the fuzzy C-means algorithm is mainly composed of three parts [7]: a data pre-processor, a fuzzy C-means clustering classifier, and the anomaly detection system, as shown in Figure 1.
When network data are input, the pre-processor selects the attributes of the data and performs the data pre-processing, which includes data standardization, normalization, and so on; the fuzzy C-means clustering classifier clusters the preprocessed data and provides the obtained cluster centers to the anomaly detection system; the anomaly detection system determines whether the data in the test set are normal or abnormal.
Figure 1 anomaly detection system <strong>of</strong> fuzzy C-means clustering<br />
algorithm<br />
Unsupervised clustering-based anomaly detection algorithms have one thing in common: all of them are built on the hypothesis that the number of normal records is much larger than the number of abnormal records. When this assumption holds, a cluster can be judged as normal or abnormal according to its size: data in the larger cluster can be judged as normal, while the smaller cluster usually contains the abnormal data.
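A minimal sketch of this size-based labeling rule, assuming U is the final n × k membership matrix (our own illustration; for k = 2 the larger cluster is marked normal):

    import numpy as np

    def label_clusters(U):
        # Hard-assign each record to its highest-membership cluster, then mark
        # the larger cluster as "normal" and the rest as "abnormal".
        assign = U.argmax(axis=1)
        sizes = np.bincount(assign, minlength=U.shape[1])
        normal_cluster = sizes.argmax()
        return np.where(assign == normal_cluster, "normal", "abnormal")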
IV. EXPERIMENTS AND ANALYSIS OF PERFORMANCE<br />
A. Data Selection<br />
In intrusion detection research one generally uses the KDDCUP99 network intrusion detection data, especially the kddcup_data_10percent packet, which was formed from 10% of the kddcup_data packet (about 4.9 million data records) [8]. In the later experiments we choose 5000 records from this 10% test set as a sample set. Two assumptions have to be satisfied: 1) in practical applications the number of normal records is much larger than the number of abnormal records; 2) intrusions and normal behaviour are really different. Of the 5000 selected records, 1000 are intrusion data, which is consistent with the first requirement. The types and numbers of the selected samples are shown in Table 1.
TABLE 1<br />
SELECTION OF EXPERIMENTAL DATA<br />
Identification type number<br />
normal 4000<br />
dos 450<br />
probing 250<br />
r2l 250<br />
u2r 50<br />
Sample records are as follows:
1)0,tcp,http,SF,181,5450,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,<br />
0,8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,9,9,<br />
1.00,0.00,0.11,0.00,0.00,0.00,0.00,0.00,normal.<br />
2)0,tcp,http,SF,239,486,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,<br />
8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,19,<br />
19,1.00,0.00,0.05,0.00,0.00,0.00,0.00,0.00,normal.<br />
3)0,tcp,http,SF,235,1337,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,<br />
0,8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,29,<br />
29,1.00,0.00,0.03,0.00,0.00,0.00,0.00,0.00,normal.<br />
4)0,tcp,http,SF,219,1337,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,<br />
0,6,6,0.00,0.00,0.00,0.00,1.00,0.00,0.00,39,<br />
39,1.00,0.00,0.03,0.00,0.00,0.00,0.00,0.00,normal.<br />
………………………………<br />
1000)0,tcp,http,SF,211,5693,0,0,0,0,0,1,0,0,0,0,0,0,0,<br />
0,0,0,5,5,0.00,0.00,0.00,0.00,1.00,0.00,0.00,<br />
5,255,1.00,0.00,0.20,0.07,0.00,0.00,0.00,0.00,normal.<br />
1001)0,tcp,http,SF,211,2271,0,0,0,0,0,1,0,0,0,0,0,0,0,<br />
0,0,0,15,15,0.00,0.00,0.00,0.00,1.00,0.00,<br />
0.00,15,255,1.00,0.00,0.07,0.07,0.00,0.00,0.00,0.00,n<br />
ormal.<br />
1002)0,tcp,http,SF,511,238,0,0,0,0,0,1,0,0,0,0,0,0,0,0,<br />
0,0,1,3,0.00,0.00,0.00,0.00,1.00,0.00,0.67,1,<br />
255,1.00,0.00,1.00,0.07,0.00,0.00,0.00,0.00,normal.<br />
………………………………<br />
5000)47,tcp,telnet,SF,2402,3816,0,0,0,3,0,1,2,1,0,0,0,<br />
0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,<br />
0.00,10,10,1.00,0.00,0.10,0.00,0.00,0.00,0.10,0.10,buffer<br />
_overflow.
B. Data Standardization<br />
In the data analysis process we must first standardize the data; common methods are min-max standardization, decimal scaling, and Z-score standardization [9].
In general, clustering algorithms cluster the data by computing distances; however, two types of attributes exist: discrete and continuous [10]. For discrete attributes one usually encodes the attribute values into continuous values; in this paper we instead modify the similarity measure of the discrete attributes, so the discrete data are not standardized or normalized. For the continuous attributes the measurement scales are not the same, so in order to avoid the phenomenon of "the large numbers eating the small ones" we generally need to standardize the attribute values before clustering the data set.
Suppose the number of network connection records in the test data set is n, i.e. the number of data objects is n, and the attribute vector of each record is written as X_ij (1 ≤ i ≤ n, 1 ≤ j ≤ 41). X_ij is standardized as [8]:

    X'_ij = [ X_ij − (1/n) Σ_{r=1}^{n} X_rj ] / [ (1/n) Σ_{r=1}^{n} | X_rj − (1/n) Σ_{s=1}^{n} X_sj | ],          (15)

where X'_ij denotes the value of X_ij after standardization.
C. Data Normalization<br />
The process of restricting the data objects to a certain range is called data normalization. Normalized data are convenient to process, and normalization helps to improve the convergence rate.
We choose a linear function to normalize the data objects X'_ij and map the normalized values into the interval [0, 1]. Suppose X''_ij is the value of X'_ij after normalization; the normalization is performed as follows [8]:

    X''_ij = ( X'_ij − min{ X'_ij } ) / ( max{ X'_ij } − min{ X'_ij } ),   1 ≤ i ≤ n, 1 ≤ j ≤ 41.          (16)
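A short NumPy sketch of (15) and (16) applied column-wise to the continuous attributes (our own illustration; the absolute values in the mean-deviation denominator and the small epsilon guards are assumptions):

    import numpy as np

    def preprocess_numeric(X):
        # Equation (15): center each column on its mean and divide by the
        # mean absolute deviation.
        mean = X.mean(axis=0)
        mad = np.abs(X - mean).mean(axis=0) + 1e-12
        Xs = (X - mean) / mad
        # Equation (16): min-max normalization of the standardized values to [0, 1].
        lo, hi = Xs.min(axis=0), Xs.max(axis=0)
        return (Xs - lo) / (hi - lo + 1e-12)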
D. Analysis of Performance
Performance analysis of the fuzzy C-means clustering algorithm with heterogeneous attributes.
In the fuzzy C-means clustering algorithm the parameter values are as follows: the number of clusters is 2, i.e. the data are clustered into two categories, normal data and abnormal data; the fuzzy coefficient is 4; and the distance a between two data objects with different discrete attribute values is 0.28. We now analyse the performance of the fuzzy C-means clustering algorithm with heterogeneous attributes and compare it with the original fuzzy C-means clustering algorithm. Table 2 shows the performance of the two algorithms when the parameters C and m take the same values.
TABLE 2
THE PERFORMANCE COMPARISON OF THE TWO ALGORITHMS

Algorithm                              Number of iterations   Rate of detection   Rate of error detection   Value of function
FCM with heterogeneous attributes      79                     0.973               0.01925                   605.4851
FCM                                    511                    0.975               0.0195                    740.0417
As seen from the table above, the detection rates and error detection rates of the two algorithms are approximately equal, while the number of iterations of the fuzzy C-means clustering algorithm with heterogeneous attributes is far smaller than that of the original algorithm. This shows that considering both the discrete and the continuous attributes of the data improves the speed of convergence and the detection efficiency without affecting the rate of detection or the rate of error detection.
A further experiment was run for the fuzzy C-means clustering algorithm with heterogeneous attributes: Table 3 shows the results with m = 4 and a = 0.28 over 10 runs, and the simulation results are shown in Figure 2.
TABLE 3
THE PERFORMANCE ANALYSIS OF FUZZY C-MEANS CLUSTERING ALGORITHM WITH HETEROGENEOUS ATTRIBUTES

Run times   Number of iterations   Rate of detection   Rate of error detection   Value of function
1 77 0.973 0.01925 605.4851<br />
2 56 0.612 0.42275 571.5549<br />
3 84 0.973 0.01925 605.4851<br />
4 78 0.973 0.01925 605.4851<br />
5 75 0.973 0.01925 605.4851<br />
6 100 0.973 0.01925 605.4851<br />
7 77 0.973 0.01925 605.4851<br />
8 57 0.612 0.42275 571.5549<br />
9 75 0.973 0.01925 605.4851<br />
10 86 0.973 0.01925 605.4851<br />
average 77 0.901 0.09995 598.6991<br />
From the table above and the figure it can be seen that the performance of the algorithm is not stable enough: the number of iterations, the rate of detection, the rate of error detection and the value of the objective function all show a certain volatility. This indicates that the fuzzy C-means clustering algorithm with heterogeneous attributes is sensitive to the initialization and easily falls into a local optimum.
In order to avoid these problems, we combine the fuzzy C-means clustering algorithm with heterogeneous attributes with a global optimization algorithm, and the following experiments analyse the optimized algorithm.
Experiment II: performance comparison between the fuzzy C-means clustering algorithm with heterogeneous attributes and the optimized algorithm.
For the three problems of the fuzzy C-means clustering algorithm with heterogeneous attributes, namely that the initialization sensitivity makes the performance of the algorithm unstable, that the algorithm easily falls into a local optimum, and that the minimum value of the objective function does not correspond to the highest detection rate, we will consider the rate of detection, the rate of error detection and the value of the objective function to analyse whether the optimized algorithm can solve these problems or not.
Suppose the data set is clustered into two categories, that is, C = 2; the fuzzy coefficient m = 4; a = 0.28; and the number of iterations is 200. The fuzzy C-means clustering algorithm with heterogeneous attributes and the optimized algorithm are each run 10 times; the results are shown in Table 4 and Table 5 respectively.
The tables illustrate that the performance of the fuzzy C-means clustering algorithm with heterogeneous attributes is not stable enough, that is to say, the algorithm is not robust. By combining it with the GuoTao algorithm and introducing the crossover probability, the performance of the algorithm becomes stable: in the 10 runs, the rate of detection and the rate of error detection remain unchanged, showing strong stability. For the values of the objective function, since every run operates on all the individuals of the population, we use the average of the function values as the measure. The instability of the original algorithm and the stability of the optimized algorithm are shown more clearly in Figure 3 and Figure 4.

Figure 2. Variation trend of performance of FCM with heterogeneous attributes

TABLE 4
PERFORMANCE OF FUZZY C-MEANS CLUSTERING ALGORITHM WITH HETEROGENEOUS ATTRIBUTES

Run times   Rate of detection   Rate of error detection   Value of function
1           0.973               0.01925                   605.4874
2           0.973               0.01925                   605.4874
3           0.973               0.01925                   605.4874
4           0.612               0.42275                   571.5569
5           0.973               0.01925                   605.4874
6           0.973               0.01925                   605.4874
7           0.973               0.01925                   605.4874
8           0.973               0.01925                   605.4874
9           0.973               0.01925                   605.4874
10          0.612               0.42275                   571.5569
average     0.901               0.09995                   598.7013

TABLE 5
PERFORMANCE OF THE OPTIMIZED ALGORITHM

Run times   Rate of detection   Rate of error detection   Average value of function
1           0.973               0.01925                   605.4874
2           0.973               0.01925                   598.7013
3           0.973               0.01925                   591.9152
4           0.973               0.01925                   595.3082
5           0.973               0.01925                   602.0943
6           0.973               0.01925                   605.4874
7           0.973               0.01925                   602.0943
8           0.973               0.01925                   605.4874
9           0.973               0.01925                   595.3082
10          0.973               0.01925                   602.0943
average     0.973               0.01925                   600.3978
From the performance comparison of the original algorithm and the optimized algorithm we can see that the algorithm using the parallel scheme has stable performance. From the experimental data, the algorithm combined with the GuoTao algorithm does not fall into a local optimum: the rate of detection is 97.3% and the rate of error detection is 1.925%, so its detection efficiency is stable. In the experimental results the minimum average value of the objective function is 591.9152, and the corresponding rate of detection, 97.3%, is the highest, with an error detection rate of 1.925%; hence the optimized algorithm solves the problem that the minimum of the objective function does not correspond to the highest detection rate, which improves the usability of the optimized algorithm.
Varying the number of iterations from 10 to 100 in 10 steps, we obtain the rate of detection and the rate of error detection shown in Table 6. The experimental data in the table are the averages of the obtained results.
TABLE 6
THE RATE OF DETECTION AND THE RATE OF ERROR DETECTION OF FCM WITH HETEROGENEOUS ATTRIBUTES AND THE OPTIMIZED ALGORITHM

Number of iterations   Detection rate (original)   Detection rate (optimized)   Error detection rate (original)   Error detection rate (optimized)
10                     0.9054                      0.979                        0.10135                           0.021
20                     0.8288                      0.974                        0.18145                           0.02
30                     0.8286                      0.973                        0.18075                           0.0195
40                     0.973                       0.973                        0.01925                           0.01925
50                     0.8286                      0.973                        0.18065                           0.01925
60                     0.8768                      0.973                        0.09995                           0.01925
70                     0.973                       0.973                        0.01925                           0.01925
80                     0.8286                      0.973                        0.18065                           0.01925
90                     0.8768                      0.973                        0.09995                           0.01925
100                    0.8768                      0.973                        0.09995                           0.01925
Figure 3 the rate <strong>of</strong> detection and the rate <strong>of</strong> error detection <strong>of</strong> FCM<br />
with heterogeneous attributes<br />
Figure 4 the rate <strong>of</strong> detection and the rate <strong>of</strong> error detection <strong>of</strong> the<br />
optimized algorithm<br />
Figure 5 the rate <strong>of</strong> detection comparison between the original algorithm<br />
and the optimized algorithm<br />
Figure 6 the rate <strong>of</strong> error detection comparison between the original<br />
algorithm and the optimized algorithm
From the table above, when the number of iterations is the same, the rate of detection of the fuzzy clustering algorithm is lower than that of the optimized algorithm. As the number of iterations increases, the averages of the detection rate and the error detection rate of the former show no obvious trend because its performance is not stable, whereas the optimized algorithm shows a certain stability and begins to converge at around 30 iterations. The comparisons of the detection rate and the error detection rate between the two are shown in Figure 5 and Figure 6.
From the above analysis we can conclude that combining the algorithm with the GuoTao algorithm and introducing the crossover probability improves the speed of convergence.
V. CONCLUSION<br />
To handle the experimental data with heterogeneous attributes that are commonly used in intrusion detection, an intrusion detection method based on a fuzzy clustering algorithm with heterogeneous attributes is proposed, and the model is further optimized by combining it with the GuoTao algorithm. Simulations on the data set show that the algorithm converges more quickly, the detection efficiency is improved, and the optimized algorithm is more stable and more robust, solving the problems of the original algorithm.
ACKNOWLEDGMENT
This paper is sponsored by the Foundation of Jiangxi Educational Committee (GJJ10478), by the Key Laboratory for Computer Information Processing Technology, Jiangsu Province, China (2008-03), and by the Hunan Provincial Natural Science Foundation of China (07JJ6113).
Vector-Distance and Neighborhood Development<br />
for High Dimensional Data<br />
Ping Ling<br />
College <strong>of</strong> Computer Science and Technology, Xuzhou Normal University, Xuzhou, 221116, China<br />
Email: lingicehan@yahoo.cn<br />
Xiangsheng Rong<br />
Training Department, Xuzhou Air Force College <strong>of</strong> P. L. A, Xuzhou 221000, China<br />
Email: rxs12@126.com<br />
Xiangyang You<br />
Training Department, Xuzhou Air Force College <strong>of</strong> P. L. A, Xuzhou 221000, China<br />
Email: xyyou@126.com<br />
Ming Xu<br />
Department <strong>of</strong> Logistic Command, Xuzhou Air Force College <strong>of</strong> P. L. A, Xuzhou 221000, China<br />
Email: mingxu@sina.com<br />
Abstract—This paper presents a novel distance concept, Vector-Distance (VD), for high dimensional data. VD extends the traditional scalar distance to a vector-like fashion by collecting multiple partial distances from diverse angles. These partial distances are derived from random projections, and they preserve the individual features of dimensions as much as possible. Based on the VD definition, a method family for neighborhood development is proposed, in which the methods consist of several norm definitions and certain constraints specified for various purposes. Experiments on real datasets verify that the quality of the neighborhoods produced by the proposed method family is better than or competitive with that of the neighborhoods produced by the state of the art.
Index Terms—vector-distance (VD), high dimensional data,<br />
partial distances, neighborhood development<br />
I. INTRODUCTION<br />
High dimensional data has received renewed interest in recent years thanks to the increase in available computer hardware, while the software and approaches for handling it have not improved as sharply as the hardware, owing to the difficulty of learning the structure of high dimensional data. The structure of a high dimensional space defies the usual 3-dimensional geometric intuition, and such a space is extremely sparse, with data points far away from each other. If a conventional metric is used to explore a data point's neighborhood, only a small number of neighbors may be inferred; sufficient neighbors can be covered only if the neighborhood radius is set large, but that leads to the loss of locality. This phenomenon is known in the statistical literature as the curse of dimensionality, and its effect increases exponentially with the dimension [1, 2, 3].
Dimension reduction is the natural solution to this issue. This approach family includes dimension selection, which chooses important dimensions; dimension extraction, which derives new dimensions from the original ones; and dimension weighting, which equips dimensions with significance coefficients. Their idea relies on defining data-dependent metrics that can capture local distribution features from dimension analysis so as to generate a new representation of the data. These metrics provide a scalar value, which reflects the distance information from a single angle. Yet in high dimensional space a single-source-based metric might not succeed in exploring exact distance information everywhere, because dimension significance might vary from region to region. For example, for high dimensional data x, y and z, the dimensions that are critical for measuring the distance between x and y may be unimportant for x and z. That inspires us to define a multi-source-based metric.
This paper defines a vector-fashion distance concept for high dimensional data, named Vector-Distance (VD). VD equips a pair of data points with a vector as their distance; the components of that vector are partial distance values derived using the random projections technique. The random projections are realized by iterations of data space partition. This idea is rooted in Locality-Sensitive Hashing (LSH) [4]. LSH focuses on nearest neighbor searching and develops a neighborhood by collecting the hash table buckets that the query is projected to. The neighborhood produced by LSH is an unsorted set, so an extra metric has to be consulted to find the nearest neighbor. Compared with classical LSH, VD is characterized by the ability to sort neighbors.
Based on VD, a method family for neighborhood formulation is proposed, in which various metrics plus some constraints specify diverse manners of neighborhood formulation. Heuristics are given to facilitate the VD computation, so the method does not suffer from the huge cost of parameter tuning that many random-projection-based methods need.
The rest of this paper is organized as follows. Section 2 reviews related work, namely the state of the art of neighborhood development for high dimensional data. Section 3 presents the VD definition and the self-tuning of the random projection parameters, and Section 4 presents the method family for neighborhood development. Experimental evidence and analysis are given in Section 5, followed by the conclusion in the last section.
II. RELATED WORK<br />
LSH originally aims to solve the ε–approximate<br />
nearest neighbor problem <strong>of</strong> high dimensional data. The<br />
relaxation from finding an exact answer to an<br />
approximate answer removes the curse.<br />
LSH is asked to return a point whose distance from the query is at most (1+ε) times the distance from the query to its nearest point. The appeal of this approximation fashion is that in many cases an approximate nearest neighbor is almost as good as the exact one. The general LSH schema relies on the existence of locality-sensitive hash functions.
Definition (Locality-Sensitive Hashing). A family H is called (r, (1+ε)r, p1, p2)-sensitive if, for any p, q ∈ R^d and h drawn uniformly from H:
1) if ||p − q|| ≤ r, then Pr[h(q) = h(p)] ≥ p1;
2) if ||p − q|| ≥ (1+ε)r, then Pr[h(q) = h(p)] ≤ p2.
Various families of hash functions can be defined to yield various LSH schemas, but the precondition is that the functions satisfy the locality-sensitivity property, that is, the basic parameters r, p1 and p2 can be computed.
Many works in the literature have touched on this issue. Reference [5] defines the hash function as a mapping from the original space to a Hamming space, and a rectangular-shaped cell acts as the basic grid to form the neighborhood. Reference [6] projects the data to an R^1 space, where the projected line in R^1 is partitioned into equal-length intervals; the hash function returns the index of the interval containing the projection of the query.
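As a concrete illustration of the interval-style hashing just described for reference [6], the following minimal Python sketch projects a point onto a random direction and returns the index of the equal-length interval that the projection falls into. This is our own illustrative sketch under assumed parameter names (w for the interval width), not code from the cited papers.

```python
import numpy as np

def make_interval_hash(dim, w, rng):
    """One interval-based hash in the spirit of [6]: project onto a random
    direction and return the index of the equal-length interval of width w."""
    a = rng.standard_normal(dim)      # random projection direction
    b = rng.uniform(0.0, w)           # random offset so interval borders vary
    def h(x):
        return int(np.floor((np.dot(a, x) + b) / w))
    return h

# Example: two nearby points usually fall into the same interval.
rng = np.random.default_rng(0)
h = make_interval_hash(dim=50, w=4.0, rng=rng)
x = rng.standard_normal(50)
print(h(x), h(x + 0.01))
```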
Literature [7] partitions the data space with a ball-shaped grid; the resulting ball boundaries actually correspond to the hash function definitions. In that paper, the number of balls is parameterized to ensure that the union of the balls can cover the whole space. In the above methods, the interval, rectangle or ball that contains the query is collected to form the neighborhood.
A recent analysis in reference [8] uses one hash function to store all data. To search the neighborhood of a query, not only the query itself but also some generated neighbor-like points are hashed to find the interesting hash buckets.
III. VECTOR-DISTANCE AND PARAMETERIZATION<br />
A. VD Definition<br />
The novelty of the Vector-Distance definition is that it employs a vector as the distance representation of a pair of points. The elements of such a vector are partial distance values derived from a number of partitions of the data space, and each partition is generated by random projections, a technique commonly used to handle high dimensional data.
Assume a dataset S = {x1, …, xN}, xi ∈ R^n, and let q be the query. S is tessellated P times with random partitions. In each partition, a partial distance value is derived from C dimensions that are selected at random. In more detail, each partition is defined by C pairs of random numbers (d, vd), where d is an integer between 1 and n and vd is a value within the range of the data along the d-th coordinate. vd acts as a benchmark coordinate to measure the local distance between q and xi in the d-th dimension. Denoting qd and xid as the d-th coordinates, the local distance is defined as:

$A_d(q, x_i) = \exp\big(-(q_d - v_d)(x_{id} - v_d)\big)$   (1)
Simultaneously, vd is used to form the inequality 'xid < vd', and q and xi each yield a true or false result under that inequality. After a partition, q and xi each yield a C-length Boolean vector, with 0 meaning false and 1 meaning true. This Boolean vector is the projection of the data under C random embeddings.
Denote PD_I as the set of selected dimensions of the I-th partition and b^I(x) as the Boolean vector of x, with b^I(x)_d being its d-th component. We employ b^I(q)_d and b^I(xi)_d to weight the local distance between qd and xid, generating the partial distance of the I-th partition:
$A^I(q, x_i) = \sum_{d \in PD_I} \alpha_d \cdot A_d(q, x_i)$, with $\alpha_d = \exp\big(|b^I(q)_d - b^I(x_i)_d|\big)$.   (2)
The factor α_d strengthens the d-th local distance between q and xi if they respond differently to the inequality xid < vd. The VD between q and xi is:
$VD(q, x_i) = \big(A^1(q, x_i), \ldots, A^P(q, x_i)\big)$   (3)
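To make the construction of equations (1)-(3) concrete, the following minimal Python sketch (our own illustration, not the authors' implementation) draws P random partitions, each defined by C random (d, v_d) pairs, and assembles the Vector-Distance of a pair of points; the helper names make_partitions and vector_distance are assumptions.

```python
import numpy as np

def make_partitions(X, P, C, rng):
    """Each partition is C random (dimension, benchmark value) pairs, with the
    benchmark drawn within the data range of the chosen dimension."""
    n = X.shape[1]
    lo, hi = X.min(axis=0), X.max(axis=0)
    parts = []
    for _ in range(P):
        dims = rng.integers(0, n, size=C)
        vals = rng.uniform(lo[dims], hi[dims])
        parts.append((dims, vals))
    return parts

def vector_distance(q, x, partitions):
    """VD(q, x): one partial distance per partition, following eqs. (1)-(3)."""
    vd = []
    for dims, vals in partitions:
        b_q = (q[dims] < vals).astype(float)                   # Boolean projections
        b_x = (x[dims] < vals).astype(float)
        alpha = np.exp(np.abs(b_q - b_x))                      # weights, eq. (2)
        local = np.exp(-(q[dims] - vals) * (x[dims] - vals))   # local distances, eq. (1)
        vd.append(float(np.sum(alpha * local)))                # partial distance A^I
    return np.array(vd)                                        # the VD vector, eq. (3)

# Example usage on toy data
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 30))
parts = make_partitions(X, P=8, C=6, rng=rng)
print(vector_distance(X[0], X[1], parts))
```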
Points with the same Boolean vector are grouped into the same cell. If one partition is viewed as a hash function, then a cell is actually a hash table bucket. Compared with [4], which uses a one-dimensional interval as the bucket, our schema extends the bucket from one dimension to multiple dimensions, and so brings richer hashing meaning and locality sensitiveness without expensive cost. That low computation cost also makes cell-bucket hashing competitive with ball-bucket hashing [9], which needs nonlinear embeddings to fulfill the random projections; of course, that nonlinearity is rewarded with stronger hashing power and locality sensitiveness. Considering both performance and cost, cell-bucket hashing is a fine choice. The VD definition integrates distance measurement into the cell formulation, i.e. the random projections, so that neighbors are inherently sorted according to their closeness to the query.
Basic parameters <strong>of</strong> VD, r, p1 and p2, are specified as<br />
in [10]. The time complexity is O(N 1+1/(1+ε) ), and the<br />
query time is O(dN 1/(1+ε) ).<br />
B. Parameterization <strong>of</strong> P and C<br />
P and C have a direct influence on the size of the cells and the quality of the neighborhood, but it is hard to integrate them explicitly into a cost function and find the optimal configuration through optimization. Reference [5] gave an approach that runs over all configuration pairs and chooses the pair that incurs the least time cost under a pre-specified error upper bound. Here we search for P and C in an empirical way; that is, we perform search trials within an appropriate range, where a measurement of neighborhood quality is employed to find the best parameter pair.
When C increases, the number of cells increases and the average volume of one cell drops. When P becomes larger, more cells are produced and the final neighborhood becomes larger. For a fixed C, only values of P below an upper bound are of interest, because once P exceeds some bound, the neighbors it finds have already been covered by smaller values of P, and larger values of P only bring extra computation without any improvement in neighborhood quality.
Therefore this paper fixes C in advance, setting its value as an integer randomly generated from the range [√n, n].
As for P, we run P over a specified range to find the value that brings the best neighborhood quality. The neighborhood quality is measured as Qua(P) = ave{|MEM_i| / |NEI_i|}, where NEI_i is the neighborhood of x_i and MEM_i is the set of x_i's same-class members in NEI_i. The optimal P is then obtained by the following steps (a code sketch is given after the discussion of the bounds below):
1) Select m data points at random;
2) For P = P_down : P_up
3)   Develop neighborhoods for the m data points: {NEI_i}, (i = 1…m);
4)   Qua(P) = ave{|MEM_i| / |NEI_i|};
5) EndFor
6) P_optimal = argmax_P {Qua(P)}.
The upper bound P_up can be specified by the memory available or by the problem at hand. The lower bound is set as P_down = max(n/C, C). The underlying idea is as follows:
a) When n/C > C, that is, C < √n, only a few conditions specify a cell, which leads to coarse cell boundaries. P should therefore be large enough to produce more cells, so that the quality of the final neighborhood can be guaranteed. In this case, P_down is set to n/C.
b) When n/C < C, a cell is constrained by a moderate number of conditions, so it is refined sufficiently. Since every cell is of fine quality, a good neighborhood can be obtained without the need to yield many cells.
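The following Python sketch illustrates the six-step search above under stated assumptions: labels of the sampled points are available for computing Qua(P), and neighborhood_fn(q, X, P) stands for whichever neighborhood development rule of Section 4 is chosen; the function and parameter names are ours, not the paper's.

```python
import numpy as np

def tune_P(X, labels, neighborhood_fn, P_down, P_up, m=20, seed=0):
    """Empirical search for P (steps 1-6): for every candidate P, develop
    neighborhoods for m random points and score Qua(P) = ave{|MEM_i| / |NEI_i|}."""
    rng = np.random.default_rng(seed)
    sample = rng.choice(len(X), size=min(m, len(X)), replace=False)
    best_P, best_qua = P_down, -1.0
    for P in range(P_down, P_up + 1):
        ratios = []
        for i in sample:
            nei = neighborhood_fn(X[i], X, P)                 # indices of NEI_i
            if len(nei) == 0:
                continue
            same = int(np.sum(labels[nei] == labels[i]))      # |MEM_i|
            ratios.append(same / len(nei))
        qua = float(np.mean(ratios)) if ratios else 0.0
        if qua > best_qua:
            best_P, best_qua = P, qua
    return best_P
```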
IV. METHOD FAMILY OF NEIGHBORHOOD DEVELOPMENT<br />
VD spans a vector space in which diverse norms can be defined for different purposes. Writing μ = VD(q, xi), we give the following five norm definitions:
$\|\mu\|_1 = \min_j |\mu_j|$   (4)

$\|\mu\|_\infty = \max_j |\mu_j|$   (5)

$\|\mu\|_2 = \sum_j \mu_j^2$   (6)

$\|\mu\|_3 = \mathrm{ave}_j\{|\mu_j|\}$   (7)

$\|\mu\|_4 = \dfrac{\|\mu\|_\infty - \|\mu\|_1}{\|\mu\|_3}$   (8)
The neighborhood is specified according to ||μ||_F < δ, where F = 1, 2, 3, 4, or ∞. The threshold δ is probed by the following steps:
1) Sort the VD values between q and the xi in ascending order: {||VD(q, x1)||_F, …, ||VD(q, xN)||_F};
2) Find the maximum gap between two adjacent values of that list, and let δ = ||VD(q, x_gap)||_F, where gap is defined by (9) below:
$\mathrm{gap} = \arg\max_j \left\{ \dfrac{\|VD(q, x_{j+1})\|_F - \|VD(q, x_j)\|_F}{\|VD(q, x_{j+1})\|_F} \right\}$   (9)
The idea behind this strategy is that the maximum gap reveals the boundary of the dense area around q, and this boundary provides a natural estimate of the neighborhood. Points located before the gap are taken as q's neighbors.
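A minimal sketch of this neighborhood development, assuming vd_fn(q, x) returns the VD vector of a pair (for instance, the vector_distance helper sketched in Section 3); the norm choice F and all function names are our own illustration of norms (4)-(8) and the gap rule (9).

```python
import numpy as np

def vd_norm(mu, F):
    """The five norms (4)-(8) applied to a VD vector mu."""
    mu = np.abs(np.asarray(mu, dtype=float))
    if F == 1:
        return float(mu.min())
    if F == "inf":
        return float(mu.max())
    if F == 2:
        return float(np.sum(mu ** 2))
    if F == 3:
        return float(mu.mean())
    if F == 4:
        return float((mu.max() - mu.min()) / mu.mean())
    raise ValueError("F must be 1, 2, 3, 4 or 'inf'")

def develop_neighborhood(q, X, vd_fn, F=3):
    """Sort points by ||VD(q, x_i)||_F and cut at the largest relative gap, eq. (9)."""
    norms = np.array([vd_norm(vd_fn(q, x), F) for x in X])
    order = np.argsort(norms)
    s = norms[order]
    rel_gaps = (s[1:] - s[:-1]) / np.maximum(s[1:], 1e-12)  # guard against zero denominators
    cut = int(np.argmax(rel_gaps))     # the gap lies between positions cut and cut+1
    return order[:cut + 1]             # points located before the gap are the neighbors
```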
V. EXPERIMENTS<br />
A. Compared with Other Metric<br />
As a kind of metric definition, VD is compared with some popular metrics: the Euclidean metric, Machete [11], Scythe [11], DANN [12] and Adamenn [13].
These methods aim to reduce dimensionality and formulate a new data representation by learning dimension relevance and then weighting the dimensions. Machete is a recursive splitting procedure in which the input variable used for splitting at each step is the one that maximizes the estimated local relevance. Scythe is a generalization of the Machete method. DANN works as an adaptive nearest neighbor classifier. Adamenn is an adaptive nearest neighbor approach based on probability programming. The Gaussian kernel function, viewed as a kind of metric, is also compared, with its width parameter tuned by cross-validation.
These metrics are introduced into ||x − q||_F < δ to develop neighborhoods. The neighborhood size takes two fashions: the pre-specified size NS1 and the adaptive size NS2. NS1 is expressed as a selectivity percentage that tells how many data points are selected from the dataset as neighbors. NS2 is computed using our strategy of Section 4. Neighborhood quality is evaluated by:

$\eta = \mathrm{ave}\{|MEM_q| / |NEI_q|\}$   (10)
The News Group dataset [14] is used as the experimental dataset. It contains about 20,000 articles (email messages), evenly divided into 20 newsgroups. In this paper, each newsgroup is labeled as follows:
NG1: alt.atheism;<br />
NG2: comp.graphics;<br />
NG3: comp.os.ms.windows.misc;<br />
NG4: comp.sys.ibm.pc.hardware;<br />
NG5: comp.sys.mac.hardware;<br />
NG6: comp.windows.x;<br />
NG7: misc.forsale;<br />
NG8: rec.autos;
NG9: rec.motorcycles;
NG10: rec.sport.baseball;<br />
NG11: rec.sport.hockey;<br />
NG12: sci.crypt;<br />
NG13: sci.electronics;<br />
NG14: sci.med;<br />
NG15: sci.space;<br />
NG16: soc.religion.christian;<br />
NG17: talk.politics.guns;<br />
NG18: talk.politics.mideast;<br />
NG19: talk.politics.misc;<br />
NG20: talk.religion.misc.<br />
We apply the usual tf-idf weighting scheme to represent the documents. We delete words that appear too few times and normalize each document vector to have unit Euclidean length. We conduct experiments on the whole dataset with ever-increasing NS1.
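A minimal preprocessing sketch of the described setup using scikit-learn; the min_df threshold standing in for "words that appear too few times" is an assumed value, not the authors' setting.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# tf-idf weighting with rare words removed and L2 (unit Euclidean length)
# normalization; min_df=5 is an assumed cutoff for "appear too few times".
vectorizer = TfidfVectorizer(min_df=5, norm="l2")
docs = ["free space telescope launch", "hockey playoff game tonight"]
X = vectorizer.fit_transform(docs)      # sparse document-term matrix
print(X.shape)
```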
Figure 1 shows the average dependence of η on NS1, where 0.5% of the data are sampled randomly as queries.

Figure 1. Neighborhood quality comparison (η versus NS1, in units of 10^-3; legend: Euclidean, Kernel, Machete, Scythe, DANN, Adamenn).
From the results in Figure 1, it is easy to see that as NS1 increases, the η values of these methods drop at different speeds. On average, the drop for VD is the least sharp, and Adamenn is competitive with VD; their ratio curves are relatively gentle. The other methods yield somewhat sharper tendency curves, which suggests that their performance is unsteady and that they are readily influenced by changes in the neighborhood size. VD has a local peak at about NS1 = 0.004; this NS1 value could be considered the desired neighborhood size found by cross-validation under the VD metric.
Now consider the other methods. Firstly, Adamenn also has a local peak, but its peak is located at about 0.005, a larger size than that of VD. This is because Adamenn extracts complete and profound dimension relevance from the data distribution, and consequently shows more effectiveness when more data are absorbed into the neighborhood.
Figure 2. Neighborhood quality comparison on subsets 1)-6) under each method's adaptive size NS2 (in units of 10^-3); legend: VD, Euclidean, Kernel, Machete, Scythe.

Secondly, consider DANN and Scythe. The performance of DANN and Scythe follows the first two methods, and they produce similar results, but the curve of Scythe is a little sharper than that of DANN because DANN
is equipped with adaptation to the data distribution. However, DANN approximates the weighted Chi-squared distance, which causes it to fail on datasets with non-Gaussian distributions. The local peaks of DANN and Scythe lie between those of Adamenn and VD. Thirdly, Machete shares the same spirit as Scythe but employs a greedy idea, so it does not do as good a job as Scythe.
Finally, the Euclidean metric works poorly, and the reason lies in the mismatch between its measurement meaning and the high-dimensional data space. Its ratio curve is also somewhat erratic, with more local peaks than the other methods, which shows the unsteady behavior of this metric. Although the Kernel results are better than the Euclidean ones, its curve also experiences more local peaks. Compared with the first five methods, the Euclidean metric and the Kernel metric present unsteady performance, and their multiple local peaks prevent finding an optimal NS1 value. Between Adamenn and VD, VD is a fine choice because it has the computational ease brought by its parameterization strategies, while Adamenn has six parameters that must be tuned carefully.
We then conduct experiments on subsets consisting of several newsgroups. The six experimental subsets are:
1) {NG1, NG2, NG7} (400);<br />
2) {NG2, NG8, NG12, NG17} (300);<br />
3) {NG11, NG12, NG16, NG19} (400);<br />
4) {NG2 (200), NG3 (350), NG4 (400)};<br />
5) {NG4 (200), NG5 (300), NG6 (300), NG7 (200)};<br />
6) {NG17 (300) NG18 (500), NG19 (300)}.<br />
Therein, the number in brackets is the size of the random sample selected from the original set. Now suppose that all methods use their own NS2 value as the neighborhood size. The corresponding ratios are shown in Figure 2.
In Figure 2, all ratios are lower than the corresponding peaks of Figure 1. This is because those peak ratios are found by searching, while the ratios of Figure 2 are computed under fixed NS2 values. If the ratios of Figure 2 are close to the peak ratios of Figure 1, it suggests that NS2 is a qualified neighborhood size and that the specification heuristic is sound. As expected, the difference between them is not large. If cost is taken into consideration, the NS2-based procedure is preferable to the NS1-based methods.
In Figure 2, the analysis of the Euclidean and Kernel methods and the four dimension-derivation methods follows the above patterns. Between VD and Adamenn, the former is competitive with, or even outperforms, the latter. In the subsets containing similar newsgroups, namely 4), 5) and 6), VD has more of an advantage over Adamenn. In those cases the class boundaries are not distinct, the original data representations are not well suited to revealing class features, and consequently the probability derivation based on these representations is less confident. That leads to the metric produced by Adamenn being not so informative. VD relies on repeated random projections without much direct dependence on the data representation, and is therefore less influenced.
TABLE I. CLASSIFICATION ACCURACY COMPARISON (%)

Data    kNN   AkNN  VkNN  SVM1r  ASVM1r  VSVM1r  DAGSVM  ADAGSVM  VDAGSVM
Vote    92.2  96.8  96.8  96.1   96.7    96.1    96.1    96.5     96.7
BC1     88.3  92.5  92    89     93.9    93.7    91.1    94.2     94.1
BC2     86.2  90.7  90.5  88.4   92.6    92.6    89.8    93.4     93.2
Musk1   89.6  93.8  94.1  94.8   95.9    96      94.8    95.9     96
Musk2   59.5  62.4  63.1  62.8   67.3    68      62.8    63.7     63.8
Iris    94    97    97    95.9   97      97      96.2    96.4     96.8
Wine    92    93.9  94.7  93.1   93      90.9    93.6    94       90.6
1)      83    86.9  88.1  84.2   86.3    86.5    85.3    86.8     86.9
2)      87.1  89.2  90.5  90.1   92.1    91.6    91.2    92.8     93
3)      79.6  82    82.2  82.6   84.7    84.7    83      83.6     84.1
4)      67.7  70.3  70.1  70.2   72.9    73.4    71.1    73.5     73
5)      69.4  73.2  73.5  73.5   75.1    75.9    73.4    75.1     75.3
6)      70.4  72    72.1  71.8   73.2    73.5    72      73.8     74.1
B. Test VD Performance through Classifiers<br />
In this experimental section, the VD definition and Adamenn are introduced into several metric-based classifiers to fulfill the classification task. The metric-based classifiers are: pure kNN [15], SVM1r [16, 17] and DAGSVM [18].
With the introduction of the VD definition and Adamenn, the resulting methods form two groups of variants of the original algorithms, denoted by adding the prefixes 'V-' and 'A-' to the original names; for kNN, for instance, there are two variants, AkNN and VkNN. For kNN the two metrics can work directly; for SVM1r and DAGSVM, the two metrics appear in the Gaussian kernel as follows:
$K(x, y) = \exp\big(-[VD(x, y)]^2 / \sigma^2\big)$   (11)

$K(x, y) = \exp\big(-\|x - y\|_{\mathrm{Adamenn}}^2 / \sigma^2\big)$   (12)
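The sketch below illustrates how a VD-based kernel in the spirit of (11) can be plugged into a kernel classifier through a precomputed kernel matrix in scikit-learn. Since VD(x, y) is a vector, it must be reduced to a scalar before squaring; using the average norm ||·||_3 for this is our assumption, as are the function names and the vd_fn callable (e.g. the vector_distance helper sketched earlier).

```python
import numpy as np
from sklearn.svm import SVC

def vd_gaussian_kernel(A, B, vd_fn, sigma=1.0):
    """Kernel matrix in the spirit of eq. (11). The VD vector is reduced to a
    scalar with the average norm ||.||_3 before squaring (our assumption)."""
    K = np.zeros((len(A), len(B)))
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            d = float(np.mean(np.abs(vd_fn(a, b))))
            K[i, j] = np.exp(-(d ** 2) / sigma ** 2)
    return K

# Usage with scikit-learn's precomputed-kernel SVM (a V-variant in spirit):
# clf = SVC(kernel="precomputed")
# clf.fit(vd_gaussian_kernel(X_train, X_train, vd_fn), y_train)
# y_pred = clf.predict(vd_gaussian_kernel(X_test, X_train, vd_fn))
```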
The original classifiers and their variants are compared on several real datasets taken from the UCI Machine Learning Repository [11]. For each dataset, 1% of the data are sampled at random as training data, and metric quality is measured by the classification accuracy listed in Table I.
Note that for bi-classification cases, SVM1r and DAGSVM yield the same result because both of them train one basic SVM. It is easy to see that the 'A-' and 'V-' variants improve their original models, which indicates that the two metrics do improve effectiveness. But the two metrics have their own advantages in different scenarios: in low-dimensional datasets the 'A-' methods behave better, while in high-dimensional spaces the 'V-' family is preferable.
According to the experimental evidence of this section, we conclude that VD is more suitable for high-dimensional spaces than Adamenn, because the latter derives its new metric from statistics of the data distribution. Those statistics, as measurements, might themselves be caught by the curse of dimensionality, and consequently the resulting metric becomes less informed. VD, by contrast, focuses on learning valid information from repeated projections and thus does not suffer from this problem. As for SVM1r and DAGSVM, the latter amounts to a weighted framework of the basic SVM, while the former is the non-weighted framework, so naturally the latter gives higher accuracies.
C. Test VD Performance for High-dimensional Data<br />
After investigating the performance of VD by comparing it with other metrics and other classifiers, the experiments in this section compare VD with some NN searching techniques developed specially for high-dimensional data. These are the well-known VA-file [19, 20, 21, 22, 23], iLSH [6], cLSH [10] and bLSH [9]. To check their properties, another evaluation measure is used: neighborhood cohesion. For q's neighborhood NEI(q), the cohesion is defined by formula (13):
$\mathrm{cohesion}(NEI(q)) = \sum_{x, y \in NEI(q)} \dfrac{\exp(-\|x - y\|^2)}{|NEI(q)|}$   (13)
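A direct transcription of formula (13) follows, assuming the sum runs over all ordered pairs of neighbors (the text does not state whether pairs are ordered); the function name is ours.

```python
import numpy as np

def cohesion(neighborhood):
    """Eq. (13): sum of exp(-||x - y||^2) over pairs in NEI(q), divided by |NEI(q)|."""
    NEI = np.asarray(neighborhood, dtype=float)
    total = 0.0
    for x in NEI:
        for y in NEI:
            total += np.exp(-np.sum((x - y) ** 2))
    return total / len(NEI)
```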
Suppose that each method finds its own NS2 value. Here, 10% of the data are sampled randomly as queries. The average cohesion values of each method are collected and shown in Table II.
TABLE II. COMPARISON OF LSH-BASED APPROACHES THROUGH AVERAGE COHESION

Data     1)    2)    3)    4)    5)    6)
VA-file  0.73  0.78  0.86  0.7   0.67  0.65
iLSH     0.71  0.75  0.83  0.68  0.68  0.63
cLSH     0.74  0.74  0.87  0.71  0.67  0.66
bLSH     0.79  0.79  0.86  0.73  0.71  0.68
VD       0.78  0.79  0.85  0.75  0.71  0.7
From Table II, it is clear that bLSH and VD behave similarly, each with its own advantage in various cases. bLSH does better in subsets with clear class boundaries, while VD exhibits its merit in handling subsets with blurred classes. cLSH and VA-file yield close results because they actually construct the same-shaped bucket using different strategies. iLSH is relatively poor due to the weak hashing power of its one-dimensional hashing embeddings. bLSH can be seen as the best of the four indexing approaches because the shape of its bucket best approximates the inherent shape of a neighborhood, and thus it has more ability to formulate a good neighborhood that contains true neighbors.
The mismatch between the shape of the neighborhood region and the rectangle, cell or interval leads to some irrelevant data being absorbed into the neighborhood, and so affects the quality of the neighborhoods spanned by the other three methods. Although VD's bucket is of a cell-like shape, it exploits the informed weighted metric to sort neighbors, which removes the influence brought by the mismatch.
D. Test VD Performance for Real Datasets
Finally, VD is evaluated on the real datasets Musk [24] and Mutagenesis [25]. The Musk dataset has two versions, Musk1 and Musk2, which record 476 and 6598 conformations of musk and non-musk molecules respectively. We fix the data with some normalization and develop its relation frame, shown in Figure 3.
Figure 3. Dataset relationship of Musk (RK: Molecule-Name with common attributes; FK: M-Name with Feature1 … Feature166).
The Mutagenesis dataset records 230 aromatic and heteroaromatic nitro compounds, divided into two groups based on mutagenicity: the active and the inactive. Its relation schema is formulated in Figure 4.
Figure 4. Dataset relationship of Mutagenesis (RK: Molecule-ID in MOLECULE; FK: Molecule-ID in BOND; FK: Molecule-ID in ATOM).
Here, for the two datasets, the tables are integrated into one big table that includes all records. Note that since there are two types of Mutagenesis molecules, regression-friendly and regression-unfriendly, data are sampled from the two subsets and from the whole dataset respectively. Table III shows the classification accuracy of the above methods.
TABLE III. COMPARISON OF LSH-BASED APPROACHES THROUGH CLASSIFICATION ACCURACY (%)

Data     Musk   Mutagenesis (friendly)   Mutagenesis (unfriendly)   Mutagenesis (full)
VA-file  86.3   79.5                     63.2                       73.1
iLSH     87.6   82.9                     68.7                       75.2
cLSH     85.2   92.7                     70.2                       78.9
bLSH     88.4   90.5                     70.5                       77.2
VD       90.5   93.5                     70.6                       79.5
According to the experimental results, VD exhibits outstanding behavior, presenting the highest classification accuracy among the five methods, which matches the conclusion of the previous section. The other LSH-based methods follow VD.
Through the above experiments, the fine performance of the VD definition is verified.
VI. CONCLUSION<br />
This paper proposes a novel distance concept customized to high-dimensional data, Vector-Distance (VD). VD is a vector whose entries reflect multi-aspect information summarized from random projections. By applying norms to VD values, a family of neighborhood formulation methods is obtained. Empirical evidence on real datasets demonstrates the fine performance of VD and the proposed method family.
On the other hand, VD provides another angle of thinking by extending the common scalar distance information to a compound fashion. Consequently the new distance collects richer information, and this information reveals more about the relationships among data points. That is expected to be of greater importance in highly structured data environments.
Furthermore, since high-dimensional data continues to attract a lot of interest in diverse applications, more work is needed to promote the improvement of algorithms for high-dimensional, and even highly structured, data.
ACKNOWLEDGMENT<br />
This work is supported by the Youth National Natural<br />
Science Foundation <strong>of</strong> China under Grant No. 61105129.<br />
The authors also thank Dr. Xiangsheng Rong for his contribution.
REFERENCES<br />
[1] E Novak, K Ritter, “The Curse <strong>of</strong> Dimension and a<br />
Universal Method for Numerical Integration,” Multivariate<br />
Approximation and Splines, pp. 177-187, 1998.<br />
[2] Michael E. Houle, Hans-Peter Kriegel, Peer Kröger, Erich<br />
Schubert and Arthur Zimek, “Can Shared-Neighbor<br />
Distances Defeat the Curse <strong>of</strong> Dimensionality?” Lecture<br />
Notes in Computer Science, Volume 6187, pp. 482-500,<br />
2010.<br />
[3] Jiajun Liu, Zi Huang, Heng Tao Shen and Xia<strong>of</strong>ang Zhou,<br />
“Efficient Histogram-Based Similarity Search in Ultra-<br />
High Dimensional Space,” Lecture Notes in Computer<br />
Science, Volume 6588, pp. 1-15, 2011.<br />
[4] A. Gionis, P. Indyk, and R. Motwani, “Similarity Search in<br />
High Dimensions via Hashing,” Proceeding <strong>of</strong><br />
International Conference on Very Large Data Bases, pp.<br />
518–529, 1999.<br />
[5] D. Comaniciu, P. Meer: Mean shift, “A robust approach<br />
toward feature space analysis,” IEEE Transactions on<br />
pattern analysis and machine intelligence, Vol. 24, No. 5,<br />
pp. 603-619, 2002.<br />
[6] M. Datar, N Immorlica, P Indyk, “Locality-sensitive<br />
hashing scheme based on p-stable distributions,”<br />
Proceeding <strong>of</strong> 20th annual symposium on Computational<br />
geometry, pp. 253-262, 2004.
[7] A. Chakrabarti, S.Khot, S. Xiaodong, “Near-Optimal<br />
Lower Bounds on the Multi-Party Communication<br />
Complexity <strong>of</strong> Set Disjointness,” Proceeding <strong>of</strong> 18th<br />
Annual IEEE Conference on Computational Complexity,<br />
pp. 107-117, 2003.<br />
[8] X.G. Cheng, C. Yang, J. Yu, “A New Approach to Group<br />
Signature Schemes,” <strong>Journal</strong> <strong>of</strong> Computers, Vol. 6, No 4.<br />
pp. 812-817, 2011.<br />
[9] A. Andoni, P. Indyk, “Near-Optimal Hashing Algorithms<br />
for Approximate Nearest Neighbor in High Dimensions,”<br />
Proceeding <strong>of</strong> 47th Annual IEEE Symposium on<br />
Foundations <strong>of</strong> Computer Science, pp. 459-468, 2006.<br />
[10] P. Indyk and R. Motwani, “Approximate Nearest<br />
Neighbors: towards Removing the Curse <strong>of</strong><br />
Dimensionality,” Proceeding <strong>of</strong> Symposium on Theory <strong>of</strong><br />
Computing, pp. 604-613, 1998.<br />
[11] http://www.uncc.edu/knowledgediscovery<br />
[12] T. Hastie, R. Tibshirani, “Discriminant Adaptive Nearest<br />
Neighbor Classification,” IEEE Trans. on Pattern Analysis<br />
and Machine Intelligence. Vol. 18(6), pp. 607-615, 1996.<br />
[13] C. Domeniconi, J. Peng, D. Gunopulos, “An Adaptive<br />
Metric Machine for Pattern Classification,” Advances in<br />
Neural Information Processing Systems, Vol. 13. pp. 458-<br />
464, 2001.<br />
[14] http://www.cs.cmu.edu/afs/cs/project/theo-11/www/naivebayes.html<br />
[15] K.N.N. Unni, R. de Bettignies, S.-D. Seignon and J.-M.<br />
Nunzi, "Applied Physics Letters," Vol. 85, p. 1823,
2004.
[16] V.Murino, M. Bicego, I.A. Rossi, “Statistical classification<br />
<strong>of</strong> raw textile defects,” Proceedings <strong>of</strong> the 17th<br />
International Conference on Pattern Recognition, Vol. 4,<br />
pp. 311-314, 2004.<br />
[17] C.C. Chang, C.J. Lin, “LIBSVM: A library for support<br />
vector machines,” ACM Transactions on Intelligent<br />
Systems and Technology, Volume 2 Issue 3, pp. 35-63,<br />
2011.<br />
[18] J. C. Platt, N. Cristianini, J. Shawe-Taylor, “Large margin<br />
DAG’s for Multiclass classification,” Advances in neural<br />
information processing systems. MIT press, Cambridge,<br />
MA, 12: 547-553, 2000.<br />
[19] R. Weber, H. Schek, and S. Blott, “A quantitative analysis<br />
and performance study for similarity-search methods in<br />
high-dimensional spaces,” Proceeding <strong>of</strong> 24th<br />
International Conference on Very Large Data Bases, pp.<br />
194–205. 1998.<br />
[20] Peisen Yuan, Cha<strong>of</strong>eng Sha, Xiaoling Wang, Bin Yang,<br />
Aoying Zhou, “Efficient Approximate Similarity Search<br />
Using Random Projection Learning”, Lecture Notes in<br />
Computer Science, Volume 6897, pp. 517-529, 2011.<br />
[21] C.H. Zhou, M.L. Cao, M. Ye, Z.H. Qian, “SAT-based<br />
Algorithmic Verification <strong>of</strong> Noninterference,” <strong>Journal</strong> <strong>of</strong><br />
Computers, Vol. 6, No 11, 2310-2320, 2011.<br />
[22] Z.Q. Wang, X. Sun, “An Efficient Discriminant Analysis<br />
Algorithm for Document Classification,” <strong>Journal</strong> <strong>of</strong><br />
Computer, Vol 6, No 7, pp. 1265-1272, 2011.<br />
[23] Chunyang Ma, Yongluan Zhou, Lidan Shou, Dan Dai,<br />
Gang Chen, “Matching Query Processing in Highdimensional<br />
space,” Proceeding <strong>of</strong> the 20 th ACM<br />
International Conference on Information and Knowledge<br />
Management, pp. 32-40, 2011.<br />
[24] J. Renders, “Kernel Methods in Natural Language<br />
Processing,” Learning Methods for Text Understanding<br />
and Mining Conference Tutorial, 2004.<br />
[25] L. Lopriore, “Page Protection in Multithreaded Systems,”<br />
<strong>Journal</strong> <strong>of</strong> Computers, Vol. 5, No 9, pp. 1297-1304, 2010.<br />
Ping Ling was born in Xuzhou,<br />
Jiangsu Province, China, Feb. 1979.<br />
She received her Bachelor’s degree<br />
in 2000, from College <strong>of</strong> Computer<br />
Science and Technology, Xuzhou<br />
Normal University. She then received her Master's degree and Ph.D. from the College of Computer Science and Technology, Jilin University, in 2006 and 2010 respectively. Her research focuses on data mining, intelligent computing, support vector machines and support vector clustering.
Xiangsheng Rong was born in<br />
Yanggu, Shandong Province, China,<br />
1975. He received his Bachelor’s<br />
degree in 1997, from Department <strong>of</strong><br />
Logistic Command, Xuzhou Air<br />
Force College <strong>of</strong> P. L. A. And then<br />
he received his Master’s degree in<br />
2003 from Xuzhou Air Force<br />
College <strong>of</strong> P. L. A. His major research directions include<br />
the application <strong>of</strong> information technology and dynamic<br />
programming technique in military logistic command,<br />
intelligence command in combined operations <strong>of</strong> a sham<br />
battle, etc.<br />
Xiangyang You was born in Xuzhou, Jiangsu Province,<br />
China, 1972. He received his Bachelor’s degree in<br />
College <strong>of</strong> Computer Science and Technology, Harbin<br />
Institute <strong>of</strong> Technology. His research directions are<br />
operational research in military logistic command, etc.<br />
Ming Xu was born in Suqian, Jiangsu Province, China, 1968.
He received his Bachelor’s degree in 1994, from<br />
Department <strong>of</strong> Logistic Command, Xuzhou Air Force<br />
College <strong>of</strong> P. L. A. And then he received his Master’s<br />
degree in 2005 from Xuzhou Air Force College <strong>of</strong> P. L. A.<br />
His research fields are centered in data integration, data<br />
mining, semantic network, etc.
Evaluation and Comparison on the Techniques <strong>of</strong><br />
Vertex Chain Codes<br />
Linghua Li<br />
College <strong>of</strong> Computer Science and Engineering, Dalian Nationalities University, Dalian, China<br />
Email: linghl@139.com<br />
Yining Liu<br />
College <strong>of</strong> Communication Engineering, Jilin University, Changchun, China<br />
Email: 1337687488@qq.com<br />
Yongkui Liu<br />
College <strong>of</strong> Computer Science and Engineering, Dalian Nationalities University, Dalian, China<br />
Email: ykliu@dlnu.edu.cn<br />
Borut Žalik
Computer Science, University of Maribor, Maribor, Slovenia
Email: borut.zalik@um.si
Abstract—This paper first describes the techniques of six representative vertex chain codes: the original vertex chain code, the extended vertex chain code, the variable-length vertex chain code, the variable-length compressed vertex chain code, the dynamic vertex chain code, and the equal-length compressed vertex chain code. The description covers the main idea and the encoding method of each vertex chain code. Then the chain length (the number of codes), the memory occupancy (the total number of binary bits), and the average code length (bits per code) of each vertex chain code are compared through a large number of experiments. Finally, an evaluation and comparison are given from the viewpoint of chain code efficiency. The goal of the paper is to provide convenience and a reference for chain code researchers and users.
Index Terms—chain code, vertex chain code, comparison,<br />
evaluation<br />
I. INTRODUCTION<br />
Chain code has been a research topic for more than five decades. Since the pioneering work of Freeman in 1961, different approaches to chain coding have been proposed to improve the various aspects involved in chain codes [1-5]. Because of the broad applicability of chain codes in many parts of pattern recognition and image processing [6-12], the techniques of chain coding have increased rapidly. A chain code is an efficient representation of binary images composed of contours [13-15]. The idea of a chain code is based on identifying and storing the directions from each pixel to its neighbor pixel on each contour. Chain code techniques fall into two families: pixel-based chain codes and edge-based chain codes. For the former, some representative pixel-based chain codes include the 8-direction Freeman
chain code proposed by Freeman, 4-direction Freeman<br />
chain code commonly used by people, angle differences<br />
Freeman chain code (ADFCC) proposed by Y. K. Liu et<br />
al. [1], enhanced relative 8-direction Freeman chain code<br />
(ERDFCC) proposed by S. Zahir et al. [2], orthogonal<br />
three-direction Freeman chain code (3OT) proposed by H.<br />
Sánchez-Cruz et al. [14], arithmetic coding applied to<br />
3OT chain code (Arith_3OT) proposed by H. Sánchez-<br />
Cruz et al. [3], and modified directional Freeman chain<br />
code in eight directions by a set <strong>of</strong> nine symbols (MDF9)<br />
proposed by H. Sánchez-Cruz et al. [4], etc. For the latter,<br />
the six vertex chain codes introduced in this paper are all edge-based chain codes. From the viewpoint of information compression, there are two kinds: lossy and lossless. The six vertex chain codes evaluated and compared in this paper are all lossless-compression chain codes.
II. OVERVIEW OF TECHNIQUES OF VERTEX CHAIN CODES<br />
We mainly describe the techniques <strong>of</strong> six<br />
representative vertex chain codes including the original<br />
vertex chain code (VCC), the extended vertex chain code<br />
(E_VCC), the variable-length vertex chain code<br />
(V_VCC), the variable-length compressed vertex chain<br />
code (VC_VCC), the dynamic vertex chain code<br />
(D_VCC), and the equal-length compressed vertex chain<br />
code (EC_VCC).<br />
A. Original Vertex Chain Code (VCC)<br />
In 1999, E. Bribiesca first introduced the original<br />
vertex chain code (VCC) [16]. The VCC is based on the<br />
numbers <strong>of</strong> cell vertices which are in touch with the<br />
bounding contour <strong>of</strong> the shape. This code determines the<br />
number <strong>of</strong> pixels <strong>of</strong> the binary shape that are in touch<br />
with the observed vertex <strong>of</strong> the shape’s boundary contour.
The latter represents a connected sequence <strong>of</strong> edges and<br />
vertices on the border between the shape and its exterior.<br />
In the VCC, the boundaries or contours <strong>of</strong> any discrete<br />
shape that are composed <strong>of</strong> regular cells can be<br />
represented by chains. Therefore, these chains represent<br />
closed boundaries. The minimum perimeter <strong>of</strong> closed<br />
boundary corresponds to the shape composed only <strong>of</strong> one<br />
cell. An element <strong>of</strong> a chain indicates the number <strong>of</strong> cell<br />
vertices, which are in touch with the bounding contour <strong>of</strong><br />
the shape in that element position.<br />
Figure 1 is an illustration of the VCC representing a shape. It is apparent that only three numbers, 1, 2, and 3, are needed to represent a boundary composed of pixels on a square grid.
In order to represent these numbers of cell vertices digitally, two bits are needed for each number. The elements 1, 2, and 3 of the VCC can be represented by their 2-bit binary equivalents, as shown in Table I.
TABLE I.<br />
ENCODING OF THE ELEMENT OF ORIGINAL VCC<br />
VCC 1 2 3<br />
Binary code 01 10 11<br />
As an illustration, consider the shape of Figure 1. Starting at the top-left point and walking along the boundary in a counter-clockwise direction, the VCC of the contour is
1 2 3 1 2 2 1 2 2 2 1 3 1 2 2 1 3 1 2 2,<br />
as shown in Figure 1, or in binary form,<br />
01 10 11 01 10 10 01 10 10 10 01 11 01 10 10 01 11<br />
01 10 10.<br />
Figure 1. Obtaining the original VCC from the shape.<br />
As can be seen from the above, the number of chain codes of the VCC for the shape of Figure 1 is 20, and the number of binary bits is 40.
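The following small sketch encodes a VCC symbol sequence with the 2-bit equivalents of Table I and reports the chain length and bit count; the sequence is the one read from Figure 1, and the function names are ours.

```python
# 2-bit binary equivalents of the VCC elements (Table I)
VCC_BITS = {1: "01", 2: "10", 3: "11"}

def encode_vcc(codes):
    """Return the binary string, the chain length and the total bit count."""
    bits = "".join(VCC_BITS[c] for c in codes)
    return bits, len(codes), len(bits)

vcc = [1, 2, 3, 1, 2, 2, 1, 2, 2, 2, 1, 3, 1, 2, 2, 1, 3, 1, 2, 2]
bits, n_codes, n_bits = encode_vcc(vcc)
print(n_codes, n_bits)   # 20 codes, 40 bits, matching the text
```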
The VCC is invariant under translation and rotation. Using this concept of chain code it is possible to relate the chain length to the contact perimeter, which corresponds to the sum of the boundaries of neighboring cells of the shape, and to relate the chain nodes to the contact vertices, which correspond to the vertices of neighboring cells. In this way, the relations between the chain and the characteristics of the interior of the shape allow one to obtain interesting properties [16].
Since the VCC was proposed, it has found many applications. For example, in [17] the VCC was used for image recognition; in [18] it was used in neural network corner detection; in [19] it was used in skew detection for form documents; and in [20] it was used for calculating the compactness and posture ratios of an image region.
B. Extended Vertex Chain Code (E_VCC)<br />
Considering the compression efficiency <strong>of</strong> chain code,<br />
in 2007, Y.K. Liu et al. proposed extended vertex chain<br />
code (E_VCC) which is developed based on the original<br />
VCC [21]. Considering that the original VCC uses 2 bits<br />
to represent only three code elements 1, 2, and 3, E_VCC<br />
is introduced to add an element 0 without increasing the<br />
average bits per code. In E_VCC, the element “0” is<br />
added according to the experiments which show that the<br />
combination <strong>of</strong> element 1 and 3 is the most <strong>of</strong>ten<br />
occurring combination. Then the combination <strong>of</strong> element<br />
1 and 3 is substituted by code 0.<br />
In order to represent these codes digitally, two bits are needed for each code, so the number of binary bits per code is not increased compared with the original VCC. Table II shows the code relationship between the E_VCC and the original VCC, together with the binary form of the E_VCC, which consists of four code symbols.
TABLE II. RELATIONSHIP BETWEEN E_VCC AND ORIGINAL VCC

E_VCC            0        1    2    3
VCC              1 and 3  1    2    3
Binary of E_VCC  00       01   10   11
As an illustration, consider again the shape of Figure 1. Starting at the top-left point and walking along the boundary in a counter-clockwise direction, the E_VCC of the contour is
1 2 3 1 2 2 1 2 2 2 0 1 2 2 0 1 2 2,<br />
as shown in Figure 2, or in binary form,<br />
01 10 11 01 10 10 01 10 10 10 00 01 10 10 00 01 10<br />
10.<br />
Figure 2. Obtaining the E_VCC from the shape.<br />
As can be seen from the above, the number of chain codes of the E_VCC for the shape of Figure 1 is 18, which is less than for the original VCC, and the number of binary bits is 36 under the same conditions.
From the example it is clear that the length of the E_VCC is less than that of the original VCC, because two code symbols of the original VCC are substituted by one code symbol of the E_VCC with the same symbol length.
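The conversion from VCC to E_VCC can be sketched as below: each adjacent pair (1, 3) is replaced by the new code 0, following the substitution described above; the greedy left-to-right scan and the function names are our assumptions.

```python
# E_VCC: replace each adjacent pair (1, 3) in the VCC with the new code 0.
E_VCC_BITS = {0: "00", 1: "01", 2: "10", 3: "11"}

def vcc_to_evcc(codes):
    out, i = [], 0
    while i < len(codes):
        if i + 1 < len(codes) and codes[i] == 1 and codes[i + 1] == 3:
            out.append(0)          # combination "1 3" becomes 0
            i += 2
        else:
            out.append(codes[i])
            i += 1
    return out

vcc = [1, 2, 3, 1, 2, 2, 1, 2, 2, 2, 1, 3, 1, 2, 2, 1, 3, 1, 2, 2]
evcc = vcc_to_evcc(vcc)
print(evcc)                        # 18 codes, matching the text
print(2 * len(evcc))               # 36 bits with the 2-bit encoding
```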
C. Variable-Length Vertex Chain Code (V_VCC)<br />
Reference [21] also proposed a new vertex chain code<br />
named variable-length vertex chain code (V_VCC) which
is also developed based on the original VCC. The<br />
V_VCC also uses the codes <strong>of</strong> VCC which has three<br />
elements 1, 2, and 3. The difference between them is that<br />
the definition <strong>of</strong> the V_VCC considers the probability <strong>of</strong><br />
the three codes occurring. The experiments showed that the probability of code 2 is greater than that of the other two. Therefore, in the V_VCC, the one-bit binary digit 0 is used to represent code 2, and the two-bit binary digits 10 and 11 are used to represent code 1 and code 3 respectively; this is variable-length coding obtained by applying the Huffman coding concept.
Table III shows the code relationship between the V_VCC and the original VCC; the binary form of each code symbol is also listed.

TABLE III. RELATIONSHIP BETWEEN V_VCC AND ORIGINAL VCC

V_VCC            1    2    3
VCC              1    2    3
Binary of VCC    01   10   11
Binary of V_VCC  10   0    11
As an illustration, consider again the shape of Figure 1. Starting at the top-left point and walking along the boundary in a counter-clockwise direction, the V_VCC of the contour is the same as the original VCC, namely
1 2 3 1 2 2 1 2 2 2 1 3 1 2 2 1 3 1 2 2,<br />
as shown in Figure 1, but the binary form <strong>of</strong> the V_VCC<br />
(shown in figure 3) is different from the original VCC, it<br />
is<br />
10 0 11 10 0 0 10 0 0 0 10 11 10 0 0 10 11 10 0 0.<br />
Figure 3. Obtaining the V_VCC from the shape.<br />
As we can see from the above, the numbers <strong>of</strong> chain<br />
code <strong>of</strong> V_VCC for the shape <strong>of</strong> Figure 1 are 20 which<br />
are as same as the original VCC, and the numbers <strong>of</strong><br />
binary bit <strong>of</strong> it are 30 because that the numbers <strong>of</strong> binary<br />
bit for code 2 are decreased.<br />
From the example it is clear that the function <strong>of</strong> the<br />
V_VCC is the same as the original VCC, but the length<br />
<strong>of</strong> the V_VCC is less than the original VCC.<br />
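The bit assignment of Table III can be applied mechanically; the short Python sketch below (an illustration of ours, not the implementation of [21]) maps a VCC sequence to its V_VCC bit string and reproduces the 30-bit total for the chain of Figure 1.

V_VCC_BITS = {1: "10", 2: "0", 3: "11"}  # code 2 is the most frequent, so it gets one bit

def vcc_to_vvcc_bits(vcc):
    """V_VCC keeps the VCC symbol sequence; only the binary representation changes."""
    return "".join(V_VCC_BITS[c] for c in vcc)

vcc = [1, 2, 3, 1, 2, 2, 1, 2, 2, 2, 1, 3, 1, 2, 2, 1, 3, 1, 2, 2]
print(len(vcc), len(vcc_to_vvcc_bits(vcc)))  # 20 codes, 30 bits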
D. Variable-Length Compressed Vertex Chain Code (VC_VCC)
Reference [21] also proposed a third new vertex chain code, the variable-length compressed vertex chain code (VC_VCC), which is developed from the original VCC and the Huffman coding concept by considering the different probabilities of the codes. The VC_VCC combines the ideas of the E_VCC
and the V_VCC and consists of five codes: Code 1, Code 2, Code 3, Code 4 and Code 5. The two new codes represent two combinations: one is the pair consisting of code 1 followed by code 3, the other the pair consisting of code 3 followed by code 1. The probabilities of the codes were obtained experimentally [21]; the experiments show that the probabilities of the two combinations are equal. According to these statistical probabilities, the binary Huffman codes 0, 10, 110, 1110, and 1111 are assigned to Code 1, Code 2, Code 3, Code 4 and Code 5 respectively. Table IV shows the relationship between VC_VCC and the original VCC, together with the probability and binary form of each code.
TABLE IV. RELATIONSHIP BETWEEN VC_VCC AND ORIGINAL VCC

VC_VCC            Code 1  Code 2   Code 3   Code 4  Code 5
VCC               2       1 and 3  3 and 1  1       3
Probability       0.657   0.138    0.138    0.034   0.033
Binary of VC_VCC  0       10       110      1110    1111
As an illustration, consider again the shape of Figure 1. Starting at the top-left point and walking along the boundary in a counter-clockwise direction, the VC_VCC of the contour is
c4 c1 c3 c1 c1 c4 c1 c1 c1 c2 c4 c1 c1 c2 c4 c1 c1,
as shown in Figure 4, where c1, c2, c3, c4 and c5 abbreviate Code 1, Code 2, Code 3, Code 4 and Code 5 respectively, or in binary form
1110 0 110 0 0 1110 0 0 0 10 1110 0 0 10 1110 0 0.
Figure 4. Obtaining the VC_VCC from the shape.
For the shape of Figure 1, the VC_VCC therefore contains 17 codes, the fewest among the vertex chain codes discussed above, but requires 33 binary bits, which is not the smallest bit count. The reason is that the advantage of Huffman coding only appears when the contours to be described are long: the longer the shape contour, the fewer binary bits VC_VCC needs compared with the chain codes that do not use Huffman coding.
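Reference [21] does not spell out the order in which the pairs are substituted; the following Python sketch assumes a greedy left-to-right scan, and under that assumption it reproduces the 17-code, 33-bit VC_VCC of Figure 4.

VC_VCC_BITS = {"c1": "0", "c2": "10", "c3": "110", "c4": "1110", "c5": "1111"}

def vcc_to_vcvcc(vcc):
    """Map VCC pairs (1,3)/(3,1) to Code 2/Code 3, and single 2/1/3 to Code 1/4/5."""
    out, i = [], 0
    while i < len(vcc):
        pair = tuple(vcc[i:i + 2])
        if pair == (1, 3):
            out.append("c2"); i += 2
        elif pair == (3, 1):
            out.append("c3"); i += 2
        elif vcc[i] == 2:
            out.append("c1"); i += 1
        elif vcc[i] == 1:
            out.append("c4"); i += 1
        else:
            out.append("c5"); i += 1
    return out

vcc = [1, 2, 3, 1, 2, 2, 1, 2, 2, 2, 1, 3, 1, 2, 2, 1, 3, 1, 2, 2]
codes = vcc_to_vcvcc(vcc)
print(len(codes), sum(len(VC_VCC_BITS[c]) for c in codes))  # 17 codes, 33 bits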
Among all of the vertex chain codes described above, the comparison in [21] shows that the VC_VCC is the most efficient, and it remains one of the most efficient codes for compressing binary objects. In [22], the VC_VCC was compressed further by applying the concept of segments to it.
E. Dynamic Vertex Chain Code (D_VCC)
To decrease the chain length of the vertex chain code, G. F. Yu et al. proposed the dynamic vertex chain code (D_VCC) in 2010, also developed from the original VCC [23]. The main idea of D_VCC is to change the meaning of the original VCC, in which each code expresses the number of cell vertices, and to assign new code values to the chain code elements. D_VCC has ten elements, the decimal digits 0 to 9. The digit 0 (or 9) represents code "1" of the original VCC, the digit 9 (or 0) represents code "3", and the digits 1 to 8 directly represent code "2" together with the number of sequential occurrences of code "2". For instance, the digit 1 represents a single code "2", the digit 2 represents the two sequential codes "22" of the VCC, and so on. Table V shows the relationship between D_VCC and the original VCC.
TABLE V. RELATIONSHIP BETWEEN D_VCC AND ORIGINAL VCC
D_VCC VCC<br />
0 1 or 3<br />
1 2<br />
2 22<br />
3 222<br />
4 2222<br />
5 22222<br />
6 222222<br />
7 2222222<br />
8 22222222<br />
9 3 or 1<br />
As an illustration, consider the shape of Figure 5. Starting at the top-left point and walking along the boundary in a counter-clockwise direction, the VCC of the contour is
1 2 3 1 3 1 1 3 2 2 2 2 2 1 1 3 2 2 2 2 2 2 2 1 1 2 3 1 3<br />
1 3 1 3 3 2 2 2 2 1 2 1 2 3 1 1 2 3 3 2 1 3 1 2 2,<br />
as shown in Figure 5, or in binary form,<br />
01 10 11 01 11 01 01 11 10 10 10 10 10 01 01 11 10<br />
10 10 10 10 10 10 01 01 10 11 01 11 01 11 01 11 11 10<br />
10 10 10 01 10 01 10 11 01 01 10 11 11 10 01 11 01 10<br />
10.<br />
Figure 5. Obtaining the VCC from the shape.<br />
Under the same conditions, when the contour is expressed with D_VCC, the chain code is
0 1 9 0 9 0 0 9 5 0 0 9 7 0 0 2 9 0 9 0 9 0 9 9 4 0 1 0 1<br />
9 0 0 1 9 9 1 0 9 0 2,<br />
as shown in Figure 6, where the codes "0" and "9" of D_VCC represent the codes "1" and "3" of the VCC respectively. Expressed in binary form, it is
0000 0001 1001 0000 1001 0000 0000 1001 0101<br />
0000 0000 1001 0111 0000 0000 0010 1001 0000 1001<br />
0000 1001 0000 1001 1001 0100 0000 0001 0000 0001<br />
1001 0000 0000 0001 1001 1001 0001 0000 1001 0000<br />
0010.<br />
Figure 6. Obtaining the D_VCC from the shape.<br />
For the shape of Figure 5, the D_VCC therefore contains 40 codes, fewer than the 54 codes of the original VCC, because the codes of D_VCC apply the method of arithmetic coding; however, it requires 160 binary bits, more than the 108 bits of the original VCC, because the D_VCC codes are still encoded with equal length. Moreover, D_VCC has ten codes but uses four binary bits for each code; since four binary bits can encode up to sixteen codes, the unused combinations introduce redundancy.
Clearly, then, the goal of D_VCC is to reduce the overall number of chain codes while disregarding the memory occupied by the chain code; in other words, the overall number of binary bits increases considerably.
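The mapping of Table V amounts to run-length coding of code 2 together with a relabelling of codes 1 and 3, each digit then being stored in four bits as in the binary example above. The Python sketch below is our illustration of that scheme, not the implementation of [23].

def vcc_to_dvcc(vcc):
    """Digit 0 for VCC code 1, digit 9 for code 3, digits 1-8 for runs of code 2."""
    out, i = [], 0
    while i < len(vcc):
        if vcc[i] == 1:
            out.append(0); i += 1
        elif vcc[i] == 3:
            out.append(9); i += 1
        else:
            run = 0
            while i < len(vcc) and vcc[i] == 2 and run < 8:
                run += 1; i += 1
            out.append(run)
    return out

def dvcc_bits(dvcc):
    return "".join(format(d, "04b") for d in dvcc)  # four bits per digit

# The 54-code VCC of Figure 5 becomes 40 D_VCC digits and 160 binary bits.
vcc = [1, 2, 3, 1, 3, 1, 1, 3, 2, 2, 2, 2, 2, 1, 1, 3, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 3,
       1, 3, 1, 3, 1, 3, 3, 2, 2, 2, 2, 1, 2, 1, 2, 3, 1, 1, 2, 3, 3, 2, 1, 3, 1, 2, 2]
dvcc = vcc_to_dvcc(vcc)
print(len(dvcc), len(dvcc_bits(dvcc)))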
F. Equal-Length Compressed Vertex Chain Code (EC_VCC)
To further decrease the chain length of the vertex chain code, another vertex chain code, the equal-length compressed vertex chain code (EC_VCC), was proposed in [23]; it is developed from the E_VCC. The main idea of EC_VCC is to replace the bit, the minimum unit of the other vertex chain codes, with the byte. EC_VCC uses one byte (eight binary bits) per code and divides the eight bits into two parts: two high bits and six low bits. The two high bits encode the four code values of EC_VCC, and the six low bits encode the number of sequential occurrences of the code indicated by the two high bits. Since six binary bits can encode
at most 64 values, a new byte is started whenever the number of sequential codes exceeds 64.
Table VI shows the relationship between EC_VCC and the original VCC. The four values of the two high bits of a byte are "00", "01", "10", and "11", representing codes 0, 1, 2 and 3 of E_VCC respectively. The six low bits are denoted "b5b4b3b2b1b0", where "b" means bit and the numbers 5 to 0 indicate the position of each bit within the byte. The binary value of the six low bits ranges from "000000" to "111111", i.e. the decimal values 0 to 63: the decimal value 0 indicates that the number of sequential codes is one, the decimal value 1 indicates two sequential codes, and so on, up to the decimal value 63, which indicates 64 sequential codes. When the number of sequential codes exceeds 64, a new byte is started.
TABLE VI. RELATIONSHIP BETWEEN EC_VCC AND ORIGINAL VCC

EC_VCC (eight binary bits)   E_VCC
00 b5b4b3b2b1b0              Code 0 and its sequential number(s)
01 b5b4b3b2b1b0              Code 1 and its sequential number(s)
10 b5b4b3b2b1b0              Code 2 and its sequential number(s)
11 b5b4b3b2b1b0              Code 3 and its sequential number(s)
As an illustration, consider again the shape of Figure 5. Under the same conditions, when the contour is encoded with EC_VCC, the binary form of the chain code is
01000000 10000000 11000000 00000000 01000001<br />
11000000 10000100 01000001 11000000 10000110<br />
01000001 10000000 11000000 00000010 11000000<br />
10000011 01000000 10000000 01000000 10000000<br />
11000000 01000001 10000000 11000001 10000000<br />
00000000 01000000 10000001.<br />
For the shape of Figure 5, the EC_VCC therefore contains 28 codes, fewer than the 54 codes of the original VCC, for the same reason as D_VCC, namely that its codes apply the method of arithmetic coding; however, it requires 224 binary bits, more than the 160 bits of D_VCC and the 108 bits of the original VCC, again because, as with D_VCC, its codes are encoded with equal length. Moreover, EC_VCC reserves 64 codes to represent code "1" of the VCC, although only two of them are ever needed, because code "1" occurs at most twice in sequence. The code values of EC_VCC therefore generate even more redundancy than those of D_VCC.
Clearly, the goal of EC_VCC, like that of D_VCC, is to reduce the overall number of chain codes while disregarding the memory occupied by the chain code. Each EC_VCC code has eight binary bits, and EC_VCC has 256 codes in total.
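The byte layout described above is simple to emulate; the Python sketch below (ours, taking an E_VCC sequence as input) packs each run of identical E_VCC codes into one byte, two high bits for the code and six low bits for the run length minus one. Applied to the E_VCC of the Figure 5 shape it yields the totals reported above, 28 codes and 224 bits, although the exact byte sequence depends on how the 1,3 pairs were merged when forming the E_VCC.

def evcc_to_ecvcc(evcc):
    """One byte per run: (code << 6) | (run length - 1), with runs capped at 64."""
    out, i = [], 0
    while i < len(evcc):
        code, run = evcc[i], 1
        while i + run < len(evcc) and evcc[i + run] == code and run < 64:
            run += 1
        out.append((code << 6) | (run - 1))
        i += run
    return out

def ecvcc_bits(ecvcc):
    return " ".join(format(b, "08b") for b in ecvcc)

print(ecvcc_bits(evcc_to_ecvcc([1, 2, 2, 2, 0])))  # 01000000 10000010 00000000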
III. COMPARISON AND EVALUATION<br />
A. Method of Evaluation for Chain Codes
In [21], a method for evaluating the efficiency of chain codes is proposed. The efficiency E of a chain code is defined as
E = C / L (1)
where C is the average expression ability per code and L is the average number of bits per code (the average code length). In other words, the efficiency of a chain code is proportional to the average expression ability per code and inversely proportional to the average number of bits per code; it measures the average length of contour represented by each binary bit.
The expression ability per code C is the average length of contour (or digital curve), measured in pixel units, that can be represented by a single code of the chain code. When a code represents the relationship between two edge-adjacent pixels, such as codes "1", "2" and "3" of E_VCC, its expression ability is 1; when a code represents the relationship between two corner-adjacent pixels, such as code "0" of E_VCC, its expression ability is 2.
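Given the two averages, the efficiency is a one-line computation. The Python sketch below simply applies equation (1) to the C and L values that will be obtained in Tables X and XI, reproducing the E column of Table XII.

codes = {             # name: (C = average expression ability, L = average bits per code)
    "VCC":    (1.00, 2.00),
    "E_VCC":  (1.23, 2.00),
    "V_VCC":  (1.00, 1.55),
    "VC_VCC": (1.34, 1.71),
    "D_VCC":  (1.40, 4.00),
    "EC_VCC": (2.08, 8.00),
}
for name, (c, l) in codes.items():
    print(f"{name}: E = {c / l:.2f}")   # 0.50, 0.62, 0.65, 0.78, 0.35, 0.26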
B. Comparison of Chain Codes
Figure 7 shows the ten sample shapes, taken from black-and-white raster images, to which we applied the vertex chain codes; their sizes and numbers of pixels are given in Table VII. We compared the six vertex chain codes described above from two aspects: experimental results and theoretical analysis. All contours were traversed counter-clockwise in the experiments.
Figure 7. Sample shapes utilized to apply vertex chain codes.<br />
TABLE VII. SIZE AND NUMBER OF PIXELS OF EACH SAMPLE SHAPE
Shape Size Pixels<br />
Tiger 133×129 17157<br />
Dragon 127×143 18161<br />
Gun 557×258 143706<br />
Mouse 508×466 236728<br />
Flower 484×478 231352<br />
Handwriting 352×335 117920<br />
Angel wings 654×315 206010<br />
Moon cake 504×463 233352<br />
Table 364×400 145600<br />
Cuttlefish 396×376 148896<br />
1) Comparison on Experimental Results
First, we calculated the chain length of each contour shape, i.e. the number of chain codes, for each of the six
vertex chain codes described above; the results are shown in Table VIII.
As can be seen in Table VIII, the longest chain length comes from the VCC and V_VCC, with 29600 codes in total, followed by the E_VCC with 24124. On the other hand, the shortest chain length comes from the EC_VCC, with 14241 codes in total, followed by the D_VCC with 22310 and then the VC_VCC with 22265, which is very close to the D_VCC.
Then, we calculated the memory requirement of each contour shape, i.e. the number of binary bits, for the six vertex chain codes; the results are shown in Table IX.
TABLE VIII. CHAIN LENGTH, IN NUMBERS OF CHAIN CODE, FOR THE SAMPLE SHAPES
VCC E_VCC V_VCC VC_VCC D_VCC EC_VCC<br />
Tiger 868 698 868 648 666 424<br />
Dragon 1134 939 1134 843 876 608<br />
Gun 1884 1462 1884 1401 1346 766<br />
Mouse 4140 3235 4140 2992 3334 2085<br />
Flower 4602 3801 4602 3468 3599 2419<br />
Handwriting 4112 3389 4112 3175 3414 2344<br />
Angel wings 3454 2760 3454 2485 2780 1745<br />
Moon cake 2026 1640 2026 1476 1532 956<br />
Table 3288 3031 3288 2913 1336 819<br />
Cuttlefish 4092 3169 4092 2864 3427 2075<br />
Total 29600 24124 29600 22265 22310 14241<br />
TABLE IX. MEMORY CAPABILITY, IN NUMBERS OF BINARY BIT, FOR THE SAMPLE SHAPES
VCC E_VCC V_VCC VC_VCC D_VCC EC_VCC<br />
Tiger 1736 1396 1364 1120 2664 3392<br />
Dragon 2268 1878 1784 1481 3504 4864<br />
Gun 3768 2924 2884 2131 5384 6128<br />
Mouse 8280 6470 6642 5183 13336 16680<br />
Flower 9204 7602 7206 6149 14396 19352<br />
Handwriting 8224 6778 6606 6378 13656 18752<br />
Angel wings 6908 5520 5566 4413 11120 13960<br />
Moon cake 4052 3280 3172 2410 6128 7648<br />
Table 6576 6062 4100 3626 5344 6552<br />
Cuttlefish 8184 6338 6700 5109 13708 16600<br />
Total 59200 48248 46024 38000 89240 113928<br />
As can be seen in Table IX, the largest number of binary bits is produced by the EC_VCC, with 113928 bits of total memory, followed by the D_VCC with 89240 bits. On the other hand, the smallest number of binary bits is produced by the VC_VCC, with 38000 bits of total memory, followed by the V_VCC with 46024 bits.
Then, based on the chain lengths in Table VIII and the memory requirements in Table IX, we calculated the average code length, in bits per symbol, of the six vertex chain codes; the results are shown in Table X, which also lists the number of symbols of each vertex chain code.
TABLE X. AVERAGE LENGTH OF EACH CODE, IN NUMBERS OF BITS PER SYMBOL
VCC E_VCC V_VCC VC_VCC D_VCC EC_VCC<br />
Numbers <strong>of</strong> symbols 3 4 3 5 10 256<br />
Bits/symbol 2 2 1.55 1.71 4 8<br />
As can be seen in Table X, the longest average code length is produced by the EC_VCC, with 8 bits per symbol, followed by the D_VCC with 4 bits per symbol. On the other hand, the shortest average code length is produced by the V_VCC, with 1.55 bits per symbol, followed by the VC_VCC with 1.71 bits per symbol.
Table X also shows that the EC_VCC has the largest number of symbols, 256, followed by the D_VCC with 10 symbols, while the VCC and V_VCC have the smallest number of symbols, 3, followed by the E_VCC with 4 symbols and then the VC_VCC with 5 symbols.
The results for D_VCC and EC_VCC show that the more symbols a chain code has, the more bits per symbol it needs, while the results for V_VCC and VC_VCC show that applying the Huffman coding concept decreases the bits per symbol in comparison with the VCC and E_VCC.
Next, we calculated the average expression ability of the six vertex chain codes. For the VCC and V_VCC, the expression ability of each code is 1, because each code represents the relationship between two edge-
adjacency pixels. Namely, the average expression<br />
abilities <strong>of</strong> VCC and V_VCC are 1.<br />
For the E_VCC and VC_VCC, some codes have expression ability 1 because they represent the relationship between two edge-adjacent pixels (codes 1, 2 and 3 of E_VCC, codes c1, c4 and c5 of VC_VCC), while the remaining codes have expression ability 2 because they represent the relationship between two corner-adjacent pixels (code 0 of E_VCC, codes c2 and c3 of VC_VCC). We therefore calculated the probabilities of the codes with expression abilities 1 and 2 respectively and, from them, the average expression abilities of E_VCC and VC_VCC.
For the D_VCC and EC_VCC, some codes represent the relationship between two edge-adjacent pixels, such as codes 0, 1 and 9 of D_VCC and the binary codes 01000000, 10000000 and 11000000 of EC_VCC, while others represent the relationship between two corner-adjacent pixels, such as the binary code 00000000 of EC_VCC; these codes have expression ability 1 and 2 respectively. At the same time, because some codes are produced by counting, they represent not only the relationship between two pixels but a whole run. For instance, the expression abilities of codes 2 to 8 of D_VCC are 2 to 8, because these codes represent runs of two to eight sequential occurrences of code 2. Likewise, the EC_VCC codes "01000001", "01000010", …, "01111111", the codes "10000001", "10000010", …, "10111111", and the codes "11000001", "11000010", …, "11111111" have expression abilities from 2 and 3 up to 64, and the codes "00000001", "00000010", …, "00111111" have expression abilities from 4 and 6 up to 128. Summing up, we calculated the probabilities of the codes with each expression ability and, from them, the average expression abilities of D_VCC and EC_VCC.
Table XI shows, for each vertex chain code applied to the ten sample shapes above, the number of chain symbols, the groups of chain symbols sharing the same expression ability together with their probabilities, and the resulting average expression ability. In Table XI the chain symbols of EC_VCC are given in decimal form: the symbols 0 to 255 stand for the one-byte binary codes "00000000" to "11111111" respectively.
The last row of Table XI gives only the average expression ability of EC_VCC. Because EC_VCC has a great many symbols (0 to 255) whose expression abilities vary from 1 to 128, the groups of symbols with the same expression ability and their individual probabilities are not listed; only the total probability of 1.000 over all symbols is shown. Nevertheless, the average expression ability of EC_VCC was calculated from the same ten sample shapes with the same method used for the other vertex chain codes; the only difference is that the amount of calculation is much larger.
TABLE XI. AVERAGE EXPRESSION ABILITY CALCULATED BASED ON THE SAMPLE SHAPES

         Number of symbols  Chain symbols  Expression ability  Probability  Average expression ability
VCC      3                  1,2,3          1                   1.000        1.00
V_VCC    3                  1,2,3          1                   1.000        1.00
E_VCC    4                  1,2,3          1                   0.769        1.23
                            0              2                   0.231
VC_VCC   5                  c1,c4,c5       1                   0.664        1.34
                            c2,c3          2                   0.336
D_VCC    10                 0,1,9          1                   0.869        1.40
                            2              2                   0.047
                            3              3                   0.028
                            4              4                   0.015
                            5              5                   0.008
                            6              6                   0.004
                            7              7                   0.004
                            8              8                   0.025
EC_VCC   256                0~255          1~128               1.000        2.08
Combining the above analysis and data, we calculated the efficiency E of the six vertex chain codes from the average code length L in Table X and the average expression ability per code C in Table XI; the results are shown in Table XII.
TABLE XII. EFFICIENCY OF VERTEX CHAIN CODES
C L E<br />
VCC 1 2.00 0.50<br />
E_VCC 1.23 2.00 0.62<br />
V_VCC 1 1.55 0.65<br />
VC_VCC 1.34 1.71 0.78<br />
D_VCC 1.40 4.00 0.35<br />
EC_VCC 2.08 8.00 0.26<br />
As can be seen in Table XII, the highest efficiency is achieved by the VC_VCC with 0.78, followed by the V_VCC with 0.65 and then the E_VCC with 0.62. On the other hand,
the lowest efficiency is produced by the EC_VCC with 0.26, followed by the D_VCC with 0.35 and then the VCC with 0.50.
2) Comparison on Theoretical Analysis
Analyzing the reasons for these different efficiencies, the two highest efficiencies are produced by the two vertex chain codes that have few symbols and use the variable-length codes produced by Huffman coding; that is, the number of bits per symbol varies according to an optimal coding. On the other hand, the two lowest efficiencies are produced by the two vertex chain codes that use equal-length codes and many symbols. Even so, D_VCC and EC_VCC apply the idea of arithmetic coding, which can serve as a reference when designing new chain code methods; their low efficiency stems from the redundant codes they contain and from a number of symbols that is unnecessarily large.
To improve the efficiency of a vertex chain code, one can therefore consider the choice of the number of symbols and the coding of the chain code symbols with variable-length encoding, arithmetic encoding, etc.
IV. CONCLUSION
The vertex chain code methods discussed here are by no means exhaustive. In this paper we described six vertex chain code techniques from two aspects: the description of the techniques themselves and their comparison and evaluation.
The main reason for the popularity of vertex chain codes is their compression capability and their increasingly economical use of memory. We hope that this survey can serve as a convenient reference for chain code researchers and users. Other vertex chain code approaches, not covered in this paper, have also been proposed. This line of research is relevant to pattern recognition and image processing, and open problems remain, such as extending the vertex chain code with new algorithms; these are the subject of our future work.
ACKNOWLEDGMENT<br />
This work was supported by the National Natural<br />
Science Foundation <strong>of</strong> China (60675008) and the Natural<br />
Science Foundation <strong>of</strong> Liaoning Province (201102042).<br />
REFERENCES<br />
[1] Y.K. Liu, B. Zalik, “An Efficient Chain Code with<br />
Huffman Coding,” Pattern Recognition, 2005, 38 (4):553–<br />
557.<br />
[2] S. Zahir, K. Dhou, “A new chain coding based method for<br />
binary image compression and reconstruction,” PCS,<br />
Lisbon, Portugal, 2007, pp. 1321–1324.<br />
[3] H. Sánchez-Cruz, M.A. Rodríguez-Díaz, “Coding Long<br />
Contour Shapes <strong>of</strong> Binary Objects,” 14th Iberoamerican<br />
Congress on Pattern Recognition, CIARP, 2009, pp. 45–52.<br />
[4] H. Sánchez-Cruz, “Proposing a new code by considering<br />
pieces <strong>of</strong> discrete straight lines in contour shapes,” J. Vis.<br />
Commun. Image R. 2010, 21: 311–324.<br />
[5] S. Priyadarshini, G. Sahoo, “A new lossless chain code<br />
compression scheme based on substitution,” International<br />
<strong>Journal</strong> <strong>of</strong> Signal and Imaging Systems Engineering, 2011,<br />
4 (1):50–56.<br />
[6] H. Sun, J.Y. Yang, M.W. Ren, “A fast watershed algorithm<br />
based on chain code and its application in image<br />
segmentation,” Pattern Recognition Letters, 2005, 26(9):<br />
1266–1274.<br />
[7] F. Arrebola, F. Sandoval, “Corner detection and curve<br />
segmentation by multiresolution chain-code linking,”<br />
Pattern Recognition, 2005, 38(10): 1596–1614.<br />
[8] S. Zhao, Y. Xu, H. Li, H. Yang, “A Comparison <strong>of</strong><br />
Lossless Compression Methods for Palmprint Images”,<br />
<strong>Journal</strong> <strong>of</strong> S<strong>of</strong>tware, 2012, 7(3 ): 594–598.<br />
[9] Z. Zeng, W. Yang, “Design Patent Image Retrieval Based<br />
on Shape and Color Features”, <strong>Journal</strong> <strong>of</strong> S<strong>of</strong>tware, 2012,<br />
7(6 ): 1179-1186.<br />
[10] R.K. Gupta, B. Gurumoorthy, “Automatic extraction <strong>of</strong><br />
free-form surface features (FFSFs),” Computer-Aided<br />
Design, 2010, 44(2): 99–112.<br />
[11] X.L. Yan, L.P. Bu, L.M. Wang, “A Flame Apex Angle<br />
Recognition Arithmetic Based on Chain Code,” Advances<br />
in Intelligent and S<strong>of</strong>t Computing, 2012, 116: 29–35.<br />
[12] I.K.G.D. Putra, M.A. Sentosa, “Hand Geometry<br />
Verification based on Chain Code and Dynamic Time<br />
Warping,” International <strong>Journal</strong> <strong>of</strong> Computer Applications,<br />
2012, 38(12): 17–22.<br />
[13] E. Bribiesca, “A chain code for representing 3D curves,”<br />
Pattern Recognition, 2000, 33(5): 755–765.<br />
[14] H. Sánchez-Cruz, E. Bribiesca, R.M. Rodríguez-Dagnino, “Efficiency of chain codes to represent binary
objects,” Pattern Recognition, 2007, 40(6): 1660–1674.<br />
[15] M. Maitre, M.N. Do, “Depth and depth–color coding using<br />
shape-adaptive wavelets,” <strong>Journal</strong> <strong>of</strong> Visual<br />
Communication and Image Representation, 2010, 21(5):<br />
513–522.<br />
[16] E. Bribiesca, “A new chain code,” Pattern Recognition,<br />
1999, 32(2): 235–251.<br />
[17] M. Salem, A. Sewisy, U. Elyan, “A vertex chain code<br />
approach for image recognition,” ICGST International<br />
<strong>Journal</strong> on Graphics, Vision and Image Processing, 2005,<br />
5(3): 1–8.<br />
[18] S.H. Subri, H. Haron, R. Sallehuddin, “Neural network corner detection of vertex chain code,” AIML Journal,
2006, 6(1): 37–43.<br />
[19] S.X. Zhang, W. Zhang, G.Q. Li, G.Q. Gu, “Skew detection<br />
for form document using Vertex-chain-code,” <strong>Journal</strong> <strong>of</strong><br />
East China Normal University: Natural Science, 2004,<br />
9(3): 54–58.(In Chinese)<br />
[20] D. Q. Wang, W. Zhang, G. Q. Gu, “Calculating the<br />
compact and posture ratio <strong>of</strong> an image region based on<br />
VCC,” <strong>Journal</strong> <strong>of</strong> East China Normal University: Natural<br />
Science, 2005, 3(1): 59–62. (In Chinese)<br />
[21] Y. K. Liu, W. Wei, P. J. Wang, B. Žalik, “Compressed<br />
vertex chain codes,” Pattern Recognition, 2007, 40(11):<br />
2908–2913.<br />
[22] P. Y. Chen, C. P. Chang, “The segmented vertex chain<br />
code,” <strong>Journal</strong> <strong>of</strong> the Chinese Institute <strong>of</strong> Engineers, 2011,<br />
34(6): 40–44.<br />
[23] G. F. Yu, L. Wang, “Research on compression-type vertex<br />
chain code,” <strong>Journal</strong> <strong>of</strong> Image and Graphics, 2010, 15(10):<br />
1465–1470. (In Chinese)
Linghua Li female, was born in<br />
Liaoning China in February 1975, a<br />
lecturer in Computer Science Department<br />
at Dalian Nationalities University, Dalian,<br />
China. She received her BSc in<br />
application <strong>of</strong> electronic technique from<br />
Liaoning Normal University in 1998. She<br />
received her MSc in physics from<br />
Liaoning Normal University in 2001. Her<br />
areas <strong>of</strong> interest are pattern recognition and image processing &<br />
recognition.<br />
Yining Liu female, was born in<br />
Liaoning China in January 1991, a<br />
junior in College <strong>of</strong> Communication<br />
Engineering, Jilin University,<br />
Changchun, China.<br />
Yongkui Liu male, was born in Liaoning<br />
China in May 1961, a pr<strong>of</strong>essor in Computer<br />
Science Department at Dalian Nationalities<br />
University, Dalian, China. He received his BSc in computer science from Jilin University in 1982, his MSc in computer application from Shenyang University of Technology in 1987, and his PhD in CAD and computer graphics from Zhejiang University in 1999. His research interests lie in computer graphics, pattern recognition and image processing.
Borut Žalik is a pr<strong>of</strong>essor <strong>of</strong> Computer<br />
Science at University <strong>of</strong> Maribor, Slovenia.<br />
He obtained PhD in computer science in<br />
1993 from the University <strong>of</strong> Maribor,<br />
Slovenia. He is the head <strong>of</strong> Laboratory for<br />
Geometric modeling and multimedia<br />
algorithms. His research interests include<br />
computational geometry, geometric data<br />
compression, scientific visualization and<br />
geographic information systems. He is an<br />
author or co-author <strong>of</strong> more than 70 journal and 90 conference<br />
papers.
Research on Web Query Translation based on<br />
Ontology<br />
Xin Wang<br />
Changchun Institute <strong>of</strong> Technology/College <strong>of</strong> S<strong>of</strong>tware, Changchun, China<br />
Email: wangxccs@126.com<br />
Ying Wang<br />
Jilin University/College <strong>of</strong> Computer Science and Technology, Changchun, China<br />
Email: {wangying2010}@jlu.edu.cn<br />
Abstract—This paper presents an ontology-based framework for query translation between different query interfaces (forms), which translates queries into suitable formats, hides the differences between underlying sources, and enables users to access data sources conveniently. First, a domain ontology concept model (DOCM) is constructed; then, predicate matching relationships between the source form and the target forms are found based on the DOCM; finally, a query rewrite strategy containing four type rewriters is analyzed to automatically construct a set of constraint mapping rules, so that the system can use the user-provided input values to fill out the appropriate form fields of different web database forms. Experimental results show that this method is feasible and effective.
Index Terms—Ontology, Query Translation, Predicate<br />
Matching, Query Rewrite<br />
I. INTRODUCTION
Large portions of the web are buried behind user-oriented interfaces (forms) and can only be accessed by filling out those forms. When a user poses queries on a global query form, a web query translation system rewrites the queries into formats suitable for the sources and automatically submits them to the different underlying physical sources, making access to the individual physical sources transparent to the user [1]. Because multiple heterogeneous data sources must be integrated to provide users with a unified view, web query translation is a complex and challenging key problem.
One of the earliest efforts at automated form filling was the ShopBot project [2], which uses domain-specific heuristics to fill out forms for the purpose of comparison shopping; it did not, however, propose a general-purpose mechanism for filling out forms in non-shopping domains. Raghavan and Garcia-Molina [3] built the Hidden Web Exposer (HiWE), which fills out a searchable form, submits one or more queries, extracts the returned contents, stores them in a repository and builds an index to support user queries. Shu et al. [4] propose a method to model the querying capability of a query interface based on the concept of atomic queries, and also present an approach to construct the querying capability
automatically. Fangjiao Jiang et al. [5] study the query translation problem for exclusive query interfaces and present a novel Zipf-based selectivity estimation approach for infinite-value attributes; experimental results on several large-scale web databases indicate that the approach achieves high precision in selectivity estimation of infinite-value attributes for exclusive query translation. Yongquan Dong et al. [6] propose a novel query interface matching approach for the Deep Web based on extended evidence theory; to address the limitations of existing combination methods, their approach considers the credibilities of different matchers and incorporates them into exponentially weighted evidence theory to combine the results of multiple matchers. Tan et al. [7] introduce personalized recommendation into Deep Web data querying, proposing a user interest model based on fine-grained management of structured data and a similarity matching algorithm based on attribute eigenvectors.
A few efforts have been dedicated to query translation, which is responsible for translating a user's query from the source form to the target forms. Research on query translation has mainly fallen into two categories, attribute mapping and constraint mapping, and these methods do not apply background knowledge, which is important for understanding problems and situations. An ontology is an explicit, formal specification of a shared conceptualization of a domain of interest; therefore, in this paper we propose a novel method of ontology-assisted query translation between different query forms.
II. WEB QUERY TRANSLATION ANALYSIS
Web query translation can be defined as translating a user's query from a source form into different target forms (Fig. 1); it has three characteristics:
Figure 1. Query translation between the source form S and the target forms Ti.
Characteristic 1, query generality: the queries on the target forms are similar to the query on the source form.
Characteristic 2, domain generality: the source form and the target forms belong to the same specific domain.
Characteristic 3, minimum subsumption: web query translation does not miss any correct answer and returns as few incorrect answers as possible; at the same time, it is independent of the domain content.
Definition 1 (Query Translation): Given a source form S and a set of target forms T = {T1, T2, …, Tn}, if the user-provided input on the source form S is Qs, then the output of the query translator is the query combination Qt* = {Q1*, Q2*, …, Qn*} over the target forms, where Qi* = δi(q1, q2, …, qn), 1 ≤ i ≤ n, and the following constraints are satisfied [8][9]:
(1) Each query qj (1 ≤ j ≤ n) in Qi* must be a valid query for the target form Ti.
(2) Qi* is as "close" to the source query Qs as possible, in order to minimize the processing cost of the target form and improve the retrieval quality.
Definition 2 (Minimum Subsumption Strategy, MinSS): Given a global query Qs on the source form S and a set of target forms T = {T1, T2, …, Tn}, Qi* (1 ≤ i ≤ n) is a web query translation using MinSS if the following three conditions are satisfied at the same time:
(1) Qi* is a valid query of the target form, i.e., Qi* is acceptable to the target form Ti.
(2) Qi* semantically subsumes Qs, i.e., for any back-end database D, Qs(D) ⊆ Qi*(D); the information retrieved by Qs is a subset of that retrieved by Qi*.
(3) Qi* is the minimum such subsumption, i.e., there does not exist any query Qt that satisfies (1) and (2) with Qt(D) ⊆ Qi*(D).
MinSS minimizes the amount of data to be sent from the web database and the effort needed to filter out undesired results.
III. QUERY TRANSLATION FRAMEWORK
To realize query translation between different query forms, we need to reconcile the heterogeneities at the predicate level and the value level, and then generate a
query plan expressed over each target form. The framework of ontology-based web query translation is shown in Fig. 2.
The predicate recognition module finds the attribute (predicate) matching relationships between the source form and the target forms based on the ontology; the predicate mapping module matches the source form and the target forms to determine the mapping type; the query rewrite module performs the value translation of the mapped predicates; and the query submission module triggers the submit button to send the query plans to the respective sources, so that the user immediately receives query results from multiple data sources.
Figure 2. The framework of web query translation, which mainly contains predicate recognition, predicate mapping, query rewrite and query submission.
A. Ontology Construction
An ontology is a concept model that describes an information system at the semantic and knowledge level; user queries and relevant data can be mapped onto this concept model, so the ontology can be seen as a knowledge system describing concepts and relationships [10][11]. The process of ontology construction is shown in Fig. 3.
Figure 3. Ontology construction: the process is semi-automatic; the concepts and relationships of the ontology are extracted from searchable forms.
The process of ontology construction is as follows:
(1) Extract domain terminology from searchable forms and, at the same time, capture the semantic relationships among the terms.
(2) Extract domain concepts. The domain concepts need to be explicit, so the terminology must be disambiguated and each term described with its most common word.
(3) Construct the hierarchical relationships of the concepts, such as Is-A, Part-Of, Synonymous, etc.
(4) Refine the domain ontology. Ontology engineers revise concepts and semantic relationships under the guidance of ontology engineering standards to ensure that the domain knowledge is shared and consistent.
(5) After the ontology has been constructed, the semantics of the concepts in the domain ontology can be enriched by adding concept-level knowledge based on three philosophical notions: identity, rigidity, and dependency.
Definition 3 (Domain Ontology Concept Model, DOCM): The DOCM is a data model that describes a set of concepts and relationships that may appear in a specific domain. It should be machine-understandable so that it can be used to reason about these objects within that domain. Each object can be denoted as Class = {CM, DT, {Si}, {CAi}, {SCi}}, which describes the relevant information of the object.
CM: the main class of the object, which is universal and easy for users to understand; it can be seen as the keyword of the object, e.g. "Author" can be seen as the CM of the name object in the Book domain.
DT: the data type of the object, such as "string", "numerical" and so on, e.g. the DT of "Author" is "string".
{Si}: the synonyms of the CM, namely the concept aliases, e.g. "Writer" is an alias of "Author".
{CAi}: the condition property set of the object, which stands in a "Part-Of" relationship to the CM, e.g. "First name" and "Last name" are two condition properties of "Author".
{SCi}: the sub-class set of the CM, which stands in an "Is-A" relationship to the CM, e.g. "Author keywords" and "Title keywords" are in an "Is-A" relationship to the CM "Keywords".
The DOCM has a good organizational structure and represents high-level background knowledge with concepts and relationships. In this paper the ontology is implemented with the Protégé API and represented in the Web Ontology Language (OWL) [12]; operating on the DOCM is therefore equivalent to operating on the OWL file.
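As an illustration only (the paper's actual implementation uses the Protégé API and OWL), a DOCM object can be modelled with a simple Python structure:

from dataclasses import dataclass, field
from typing import List

@dataclass
class DocmClass:
    cm: str                                                    # main class, e.g. "Author"
    dt: str                                                    # data type, e.g. "string"
    synonyms: List[str] = field(default_factory=list)          # {Si}
    condition_attrs: List[str] = field(default_factory=list)   # {CAi}, Part-Of
    sub_classes: List[str] = field(default_factory=list)       # {SCi}, Is-A

author = DocmClass("Author", "string",
                   synonyms=["Writer"],
                   condition_attrs=["First name", "Last name"])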
B. Predicate Recognition
Predicate recognition is used to find the predicate matching relationships among different query forms; an example is shown in Fig. 4.
Figure 4. Predicate recognition over three query forms in the Book domain; predicate correspondences can be found, such as those indicated by matching shapes or underlining.
© 2012 ACADEMY PUBLISHER<br />
The predicate matching relationships between the source form and the target forms are complex:
1) Predicate labels belonging to the same concept often differ, e.g. the source form denotes the name using "Author", while a target form uses "Writer".
2) Complex predicate relationships are common, e.g. the source form denotes time using "date", while a target form uses "year, month, day".
Predicate recognition can be seen as schema matching between different query forms, classified as attribute-level (predicate-level) matching (Fig. 5).
Figure 5. Predicate recognition, which can be divided into simple predicate recognition (SPR) and complex predicate recognition (CPR).
Definition 4 (Simple Predicate Recognition, SPR): a predicate of the source form and a predicate of the target form form a 1:1 match.
Definition 5 (Complex Predicate Recognition, CPR): the predicates of the source form and the predicates of the target form form an M:N match.
Definition 6 (Ontology-assisted Predicate Recognition Algorithm): the semantic similarity between two predicate concepts is measured through the DOCM, which acts as the "bridge" for obtaining the predicate matching relationships among different query forms. Assume that A* is a predicate label from a target form, Ai is the main class of concept node Ci in the DOCM, {Si} is the synonym set of Ai, {CAi} is the condition property set of concept node Ci, {SCi} is the sub-class set of concept node Ci, Sim(A*, Ai) denotes the similarity between A* and Ai, and σ denotes the matching-degree threshold:
a) If A* ∈ {Si} or A* = Ai, then there is a 1:1 matching (SPR) between the predicate label A* of the target form and the ontology concept Ai.
b) If A* ∉ {Si}, A* ≠ Ai, and Sim(A*, Ai) ≥ σ, then there is a 1:1 matching (SPR) between the predicate label A* and the ontology concept Ai, and we add A* to {Si} as a new synonym of the ontology concept Ai.
c) If A* ∈ {CAi} ∪ {SCi}, then there is an M:1 matching (CPR) between the predicate label A* of the target form and the ontology concept Ai.
d) If A* ∉ DOCM and Sim(A*, Ai) < σ for every Ai, then there is no matching between the predicate label A* of the target form and any ontology concept Ai, and we add A* to the DOCM as a new ontology concept.
e) The same procedure is applied to the other target forms; in this way we obtain the ontology-form predicate matching table, which records the matching relationships between the different forms and the ontology concepts. The structure of the predicate matching table is shown in Fig. 6.
Figure 6. Predicate recognition, which can be divided into simple<br />
predicate recognition(SPR) and complex predicate recognition(CPR).<br />
We exploit this "bridging" effect when matching predicates from a large number of query forms against the DOCM. Through predicate recognition we capture the semantic relationships of the predicates among the different query forms.
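Rules a)-d) of Definition 6 can be summarized in a few lines of Python; the sketch below is purely illustrative (the similarity function sim and the default threshold are assumptions, not the paper's implementation).

def recognize(label, concept_cm, synonyms, condition_attrs, sub_classes,
              sim, sigma=0.8):
    """Classify one target-form predicate label A* against one DOCM concept Ai."""
    if label == concept_cm or label in synonyms:            # rule a)
        return "SPR"
    if sim(label, concept_cm) >= sigma:                     # rule b)
        synonyms.append(label)                              # A* becomes a new synonym
        return "SPR"
    if label in condition_attrs or label in sub_classes:    # rule c)
        return "CPR"
    return None                                             # rule d): caller adds A* to the DOCM

# e.g. recognize("Writer", "Author", ["Writer"], ["First name", "Last name"], [], sim)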
C. Predicate Mapping
Predicate mappings are crucial for the system to reformulate a user query into queries on the sources. If we understand their semantics, we can query the target forms with the nearest queries and thus optimize the query. Because the predicate types used by different forms are not uniform, relying on a single predicate type is not sufficient, and the match results of individual types are often inaccurate and uncertain. By observing 150 web query forms, [13] finds that the most commonly used predicate templates are [attribute; value] and [attribute; value ∈ {D}]. Through observation we found that the predicate types mainly comprise "text", "select", "numeric", "date time" and "specific type". Therefore, we propose a type-based approach for query rewriting that combines multiple type rewriters to obtain the appropriate matching value.
Predicate mapping matches the source form and the target forms to determine the mapping type; its structure is shown in Fig. 7.
Figure 7. The structure of predicate mapping, which contains the text, select, numeric and specific types.
© 2012 ACADEMY PUBLISHER<br />
In this paper we define a predicate template as a four-tuple [attribute; type; value; constraint], where attribute denotes the predicate label, type the predicate type, value the filling value of the predicate, and constraint the predicate constraint condition. For example, in Fig. 4(b), if a user fills in "Deep Web" for the predicate "Title/Keywords" and selects "All Terms" as the constraint condition, the predicate template can be expressed as [Title/Keywords; text; Deep Web; All Terms]. For the target forms, according to the predicate type, the predicate template can be divided into the text predicate template [attribute; text; value; constraint], the select predicate template [attribute; select; value; constraint], the numeric and date-time predicate template [attribute; numeric; value; constraint] and the specific predicate template [attribute; specific; value; constraint]. When a user fills out the query form, the source binding template is formed, the corresponding type is recognized by predicate mapping, and finally the template is sent to the appropriate type rewriter to rewrite the query.
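For illustration, the four-tuple template and the dispatch to a type rewriter can be sketched as follows (the names are ours, not the paper's code):

from typing import NamedTuple, Optional

class Predicate(NamedTuple):
    attribute: str
    type: str                      # "text", "select", "numeric" or "specific"
    value: str
    constraint: Optional[str] = None

def rewrite(pred, rewriters):
    """Send the bound source predicate to the rewriter registered for its type."""
    return rewriters[pred.type](pred)

source = Predicate("Title/Keywords", "text", "Deep Web", "All Terms")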
D. Query Rewrite
Query rewrite is responsible for rewriting the query into an equivalent form that can be executed against the physical databases; it is classified as value-level matching. Query rewriting based mostly on inefficient keyword matching techniques has become a bottleneck of the web, so we need to automatically construct a set of constraint mapping rules that allow the system to use the user-provided input values to fill out the appropriate form fields of different web database forms [14]. To guarantee MinSS, the query on a target form field can be seen as a union query Qt*, a union of queries on the target form that retrieves the relevant answers from the target database. We want the union query Qt* to be as "close" to the source query Qs as possible so that it retrieves the fewest extra answers.
Generally, these form-filling requirements are referred to as access limitations, in that access to the data can only take place according to given patterns, namely the search space [15].
Definition 7 (Search Space): For a predicate P, the search space of P is I(P), the set of all possible instances of P.
Type-based query rewriting can be seen as a search problem whose search space is determined by the predicate type. Each type rewriter implements a type-driven search: if the predicate value in the source form is Ws and the target has no predefined values, the target value is simply Ws; if the target form has a predefined search space, the value is taken from those predefined values, i.e. from the search space [16]. The type rewriters differ considerably in how they rewrite queries, so we use some heuristics to perform the rewriting strategy and obtain the final matches.
(1) Text type rewriter: If the predicate type in the target form is "text", i.e. its predicate template is [attribute; text; value; constraint], then the text control of the target form is filled with the same value as the source form and, at the same time, the constraint is filled with the
same value <strong>of</strong> source form. If there does not exist<br />
constraint condition for the predicate in target form, then<br />
constraint is set null . For select type rewriter, numeric<br />
type rewriter and specific type rewriter, we adopt the<br />
same approach to obtain the constraint value <strong>of</strong> target<br />
form.<br />
(2) Select type rewriter: If the predicate type in the target form is "select", namely, its predicate template is [attribute; select; value; constraint], and its search space is the set of string instances in the select list, then we need to compute the matching degree between the source predicate value and the target search space and select the filling value using the null-correspondence strategy [17] as the union query of the target form; with this strategy we no longer need a threshold to filter out likely incorrect value correspondences. In this union query, the value semantically closest to the source predicate value is taken as the best translation.
Definition 8 (Null Correspondence): Given a string s and a set of strings {t_i} (i = 1, 2, ..., n), the null correspondence of the string s with respect to {t_i} is defined by formula (1):

sim_n(s, null) = \prod_{i=1}^{n} (1 - sim_p(s, t_i))    (1)
where sim_p(s, t_i) denotes the semantic similarity between the string s and the string t_i. Once the null correspondence sim_n(s, null) is obtained, it is compared with all sim_p(s, t_i) (i = 1, 2, ..., n). If some sim_p(s, t_i) is the maximum value, the string s is most similar to t_i, so t_i is selected as the filling value; if the maximum value is sim_n(s, null), the string s is not similar to any t_i (i = 1, 2, ..., n), so a null or default value is filled into the query field.
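A minimal sketch of the null-correspondence selection just described, assuming a semantic similarity function sim_p(s, t) with values in [0, 1] (for example, the ontology-assisted similarity mentioned in the text) is supplied by the caller:

```python
from math import prod

def select_by_null_correspondence(s, candidates, sim_p):
    """Return the candidate to fill in, or None when the null correspondence wins."""
    sims = [sim_p(s, t) for t in candidates]
    sim_null = prod(1.0 - x for x in sims)       # formula (1)
    best_sim = max(sims) if sims else 0.0
    if sim_null >= best_sim:
        return None                              # fill a null or default value
    return candidates[sims.index(best_sim)]
```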
There are two cases for the "select" type:
1) The predicate type in the source form is "select", namely, its predicate template is [attribute; select; value; constraint], and the predicate type in the target form is also "select". Since the set of select values is fixed, users cannot enter mismatched values. In this case, we can use the ontology-assisted similarity algorithm to obtain the desired union query.
2) The predicate type in the source form is "text", namely, its predicate template is [attribute; text; value; constraint], and the predicate type in the target form is "select". Not all users are professionals; they may not know how to describe exactly what they want, and sometimes they type a wrong word into the query field. The query system may therefore return results the user does not want, or even no result at all. Such a query is called a failed query and it introduces uncertainty. In order to return a fixed quantity of results satisfying the query, the following strategy is used: first compute the similarity based on the edit distance matcher explained in Definition 9; if the similarity does not exceed a threshold, the two phrases are not similar in form, and the ontology-assisted similarity algorithm is called instead.
Definition 9 (Edit Distance Matcher): Given two character strings S1 and S2, the edit distance between S1 and S2 is the number of edit operations (insertions, deletions and substitutions) necessary to transform one string into the other. We define the similarity based on edit distance between two strings (phrases) S1 and S2 as formula (2):

Sim(S_1, S_2) = \frac{1}{1 + ed(S_1, S_2)}    (2)

where ed(S1, S2) is the edit distance between S1 and S2.
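For concreteness, a small sketch of the edit-distance similarity of formula (2); the Levenshtein routine below is a standard dynamic-programming implementation and is not taken from the paper:

```python
def edit_distance(s1: str, s2: str) -> int:
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        cur = [i]
        for j, c2 in enumerate(s2, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (c1 != c2)))    # substitution
        prev = cur
    return prev[-1]

def edit_similarity(s1: str, s2: str) -> float:
    return 1.0 / (1.0 + edit_distance(s1, s2))           # formula (2)

print(edit_similarity("web mining", "web minning"))      # one insertion -> 0.5
```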
(3) Numeric type rewriter: the predicate type in the target form is "numeric", namely, its predicate template is [attribute; numeric; value; constraint]. Numeric and date-time values are very similar in nature: both form a linear space and support similar operators, so the mapping techniques for the two types are essentially the same.
Numeric or date-time values can be selected in two ways: discrete data selection and continuous data selection. For discrete data selection, we call the discrete numeric data matcher explained in Definition 10.
Definition 10 (Discrete Numeric Data Matcher): Let the user-provided input value in the source form be a discrete value q, and let the corresponding search space in the target form be D = {q1, q2, ..., qn}, where q1, q2, ..., qn are discrete values. The similarity between q and qi (1 ≤ i ≤ n) is defined as formula (3):

Sim(q, q_i) = 1 - \frac{|q - q_i|}{\max(q, q_i)}    (3)

Let σ be the threshold of the matching degree. If Sim(q, q_m) = \max\{Sim(q, q_1), Sim(q, q_2), ..., Sim(q, q_n)\} ≥ σ, with 1 ≤ m ≤ n, then q_m is selected as the union query of the target form.
Example 1: If the user-provided input value in the source form is "100" for the predicate label "Price", the corresponding search space in the target form is {90, 95, 98}, and σ = 0.8, then from formula (3) we get Sim(100, 90) = 0.9, Sim(100, 95) = 0.95 and Sim(100, 98) = 0.98. The value "98" of the search space has the greatest similarity with "100", so we select "98" as the union query of the target form.
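A minimal sketch of the discrete numeric data matcher with the threshold test, reproducing Example 1 (σ = 0.8 is the value used in the example):

```python
def discrete_match(q: float, space, sigma: float = 0.8):
    def sim(q, qi):
        return 1.0 - abs(q - qi) / max(q, qi)   # formula (3)
    best = max(space, key=lambda qi: sim(q, qi))
    return best if sim(q, best) >= sigma else None

print(discrete_match(100, [90, 95, 98]))        # -> 98 (Sim(100, 98) = 0.98)
```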
In query forms, the user is often given ranges of values to choose from (Fig. 8). To help the query rewriter understand different ranges, the system needs to know the meanings of the range modifiers. For this purpose, we build a semantic dictionary that keeps commonly used range modifiers for numeric domains (Table I).
Figure 8. An example of range modifiers; the ranges are often formed using numeric values and range modifiers together.
TABLE I.
THE SEMANTIC DICTIONARY FOR RANGE MODIFIERS. WITH THIS DICTIONARY, THE CORRESPONDING NUMERIC RANGES CAN BE INFERRED FROM THE RANGE MODIFIERS.
Range modifier | Meaning
More than      | >
Less than      | <
under          | <
from           | >=
...            | ...
Any            | all range
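A small sketch of how the range-modifier dictionary of Table I can be used to turn a phrase such as "under 35" into a comparison operator and a bound; the modifier map covers only the entries shown in the table, and the regular expression is illustrative:

```python
import re

MODIFIERS = {"more than": ">", "less than": "<", "under": "<", "from": ">="}

def parse_range(phrase: str):
    """Return (operator, number) for a phrase built from a range modifier and a value."""
    for modifier, op in MODIFIERS.items():
        m = re.match(rf"{modifier}\s+\$?([\d.]+)", phrase.strip(), re.IGNORECASE)
        if m:
            return op, float(m.group(1))
    return None  # e.g. "Any" -> the whole range

print(parse_range("under 35"))   # ('<', 35.0)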
After converting the range modifiers, we need to select an advisable filling value. In order to keep the undesired intermediate results to a minimum, we choose the minimum target range, namely, the target range that is exactly a minimum subsumption of the original one. The most appropriate value can easily be found by projecting the original value onto the number or time axis, as explained in Definition 11.
Definition 11 (Continuous Numeric Data Matcher): Let the user-provided input value in the source form be q, a continuous value with range [a, b], and let the selective values of the corresponding target range be the continuous ranges S = {[n_1, n_1'], [n_2, n_2'], ..., [n_m, n_m']}, with n_i ≤ n_i' and n_i' ≤ n_{i+1} for 1 ≤ i ≤ m − 1. The overlap grade between q and S reflects the matching degree. There are four cases for the overlap between [a, b] and [n_i, n_i']:
a) n_i ≤ a and a ≤ n_i' ≤ b
b) a ≤ n_i and n_i' ≤ b
c) a ≤ n_i ≤ b and n_i' ≥ b
d) n_i ≤ a and n_i' ≥ b
In this condition, the union query is the union of the target ranges satisfying the four cases, namely:
q_i = Condition_a ∪ Condition_b ∪ Condition_c ∪ Condition_d
If the source query range becomes large, the cost of MinSS will be high, because the number of target queries sent to the web database increases. An example of the minimum subsumption strategy is shown in Fig. 9.
Figure 9. An example of the minimum subsumption strategy: for the predicate label "Price range", suppose the filling value in the source form is "under 35", whose meaning, by the range-modifier table, is "less than 35". The search space in the target form contains five selective values "0-25", "25-45", "45-65", "65-85" and "85-". By numeric projection, the projection area of the source form covers the two target search-space values "0-25" and "25-45", so the minimum subsumption (MinSS) in the target form is {"0-25", "25-45"}, namely, the union of target ranges that contains the source projection area.
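The minimum subsumption of Fig. 9 can be sketched as follows: every target range that overlaps the projected source range is kept, so that their union is the smallest set of target values covering the source query (the open-ended "85-" range is omitted here for simplicity):

```python
def minimum_subsumption(source, targets):
    """source = (a, b); targets = list of (n_i, n_i') ranges; keep the overlapping targets."""
    a, b = source
    return [(lo, hi) for (lo, hi) in targets if lo <= b and hi >= a]

# "under 35" projected onto the target search space of Fig. 9 (0 taken as the lower bound):
print(minimum_subsumption((0, 35), [(0, 25), (25, 45), (45, 65), (65, 85)]))
# -> [(0, 25), (25, 45)]
```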
(4) Specific type rewriter: The above type rewriters can only handle 1:1 mappings to find corresponding filling values; they cannot handle complex mappings. A complex mapping binds a source schema element to a target schema element through an appropriate mapping expression. Therefore, we need the specific type rewriter to realize complex mappings by DOCM, in which the user-provided input values are stored in the corresponding class. If the predicate type is "text", the specific type rewriter assigns the value of the source form to the target form according to a predefined order; if the predicate type is "numeric", it assigns the value of the source form to the corresponding field of the target form according to a predefined format.
E. Query Submission
After query rewriting, the user's query has been filled into each target form, and the system then automatically triggers the submit button to execute these query plans. When a form is submitted, the web browser encodes the data the user filled into the source form; the final query sent to the web server follows the pattern "variable name 1 = value & variable name 2 = value & ...", and this data, formatted as URL parameters, is appended to the basic URL address to obtain the complete URL for the web server. In summary, to complete the automatic submission of each target form, the system has to simulate the browser, namely, to construct the URL address with parameters and trigger this URL. In this way, the query posted to the source form can be precisely translated into a query against the target form.
Example 2: Given the URL http://www.abbeys.com.au/search.asp, the base URL is http://www.abbeys.com.au and the action path is /search.asp; the form fields are "Author", "Keywords", "Category", "Price Range", "ISBN" and "Publication data". If we fill out this form with the values "Bing Liu", "web mining", "Computing and information technology", "All Prices" and "After the year 2000" respectively, the query is constructed as: http://www.abbeys.com.au/search.asp?Author=Bing+Liu&Title/Keywords=web+mining&Category=Computing+and+information+technology&PriceRange=All+Prices&ISBN=&Publication data=After+the+year+2000. If we send this query directly to the web site, we will get the desired results from the target site [18].
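A minimal sketch of assembling such a GET request with Python's standard library; the field names follow Example 2, while the exact percent-encoding a real browser produces may differ slightly from the URL shown above:

```python
from urllib.parse import urlencode

base = "http://www.abbeys.com.au/search.asp"
fields = {
    "Author": "Bing Liu",
    "Title/Keywords": "web mining",
    "Category": "Computing and information technology",
    "PriceRange": "All Prices",
    "ISBN": "",
    "Publication data": "After the year 2000",
}
# urlencode joins the pairs as "name=value&..." and encodes spaces as "+"
url = base + "?" + urlencode(fields)
print(url)
```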
IV. EXPERIMENTS
Through the above analysis, we implemented a graphical interface for web query translation, which is shown in Fig. 10.
Figure 10. The graphical interface for web query translation.<br />
To evaluate our approach, we select the Book-Domain query interfaces from UIUC [19] as the test samples. Query translation is measured by Precision, Recall and F-measure. Precision denotes the percentage of correctly translated query templates over all generated query translation templates, Recall denotes the percentage of correctly translated query templates over all predicate templates in the query forms, and F-measure is a comprehensive assessment of Precision and Recall. The formulas are defined as (4), (5) and (6):
Precision = \frac{\text{correctly translated query templates}}{\text{all query translation templates}} \times 100\%    (4)

Recall = \frac{\text{correctly translated query templates}}{\text{all predicate templates in the query forms}} \times 100\%    (5)

F\text{-measure} = \frac{(1 + \mu^2) \times Precision \times Recall}{\mu^2 \times Precision + Recall} \times 100\%    (6)
where μ denotes the weight coefficient used to balance the importance of Recall and Precision.
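With μ = 1, formula (6) reduces to the usual F1 score; the small check below reproduces the F-measure reported for the first row of Table II from its Precision and Recall values:

```python
def f_measure(precision: float, recall: float, mu: float = 1.0) -> float:
    # formula (6) without the percentage scaling
    return (1 + mu ** 2) * precision * recall / (mu ** 2 * precision + recall)

print(round(f_measure(0.932, 0.884), 3))  # 0.907, matching the first row of Table II
```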
The results are shown in Table II:
TABLE II.
THE RESULTS OF QUERY TRANSLATION FOR BOOK-DOMAIN QUERY INTERFACES
Form number | Predicate template number | Predicate templates correctly translated | Recall | Precision | F-measure
25      | 132 | 123 | 0.884 | 0.932 | 0.907
50      | 256 | 235 | 0.869 | 0.922 | 0.895
75      | 364 | 337 | 0.867 | 0.926 | 0.896
average |     |     | 0.873 | 0.927 | 0.899
From the results in Table II, we can see that the query translation accuracy remains stable and high as the number of forms increases, which indicates that the ontology-based query translation method is feasible and effective. In order to verify the validity of the various predicate types during query translation, we measure the precision of predicate mapping; the results are shown in Table III:
TABLE III.
THE MAPPING RESULTS OF FOUR DIFFERENT PREDICATE TYPES
Predicate type | Predicate template number | Predicate templates correctly mapped | Precision
Text     | 135 | 129 | 95.6%
Select   | 73  | 66  | 90.4%
Numeric  | 17  | 15  | 88.2%
Specific | 26  | 21  | 80.8%
Total    | 251 | 231 | 92.0%
From the results in Table III, we can see that the predicate type "text" accounts for the largest number of predicate templates, and its precision is also the best. The main causes of "text" translation mistakes are predicate matching errors and text fields not being rationally combined with their corresponding constraints during schema extraction. Mistakes for the "select" type occur when users input a dissimilar value instead of one close to the real value, so that the edit-distance similarity computation fails. Inconsistent representation of "numeric" content is the main cause of mistakes for that type. The "specific" type is more complicated; its mistakes are mainly due to unreasonably split or combined source queries, and partly to schema extraction and predicate matching. The statistical results show that the precision of all four predicate-type mappings reaches a high level, but the precision of "specific" type query translation could be further improved.
V. CONCLUSION
We have presented a novel ontology-based approach to automatic query translation, which transforms queries posed on a source form into queries against the underlying physical databases. The experimental results show that our approach is feasible and effective.
For our future work, we plan to enrich the domain ontology, continually improve the query translation algorithm, and further investigate the complex matching problem. We believe that our approach, with appropriate extensions, can achieve better results.
REFERENCES<br />
[1] Fan Wang, Gagan Agrawal, and Ruoming Jin. Query<br />
Planning for Searching Inter-dependent Deep-Web<br />
Databases. In Proceedings <strong>of</strong> SSDBM, 2008, pp24-41.<br />
[2] R. B. Doorenbos, O. Etzioni, and D. S. Weld. A scalable comparison-shopping agent for the world wide web. In Proceedings of the First International Conference on Autonomous Agents, Marina del Rey, 1997, pp. 39-48.
[3] Raghavan,S. and Garcia-Molina,H. Crawling the hidden<br />
web. In Proceedings <strong>of</strong> the 27th International Conference<br />
on Very Large Data Bases (VLDB), 2000, pp129-138.<br />
[4] Shu L, Meng W, He H, et al. Querying capability modeling<br />
and construction. In Proceedings <strong>of</strong> the 8th International<br />
Conference on Web Information Systems Engineering<br />
(WISE), 2007, pp13-25.<br />
[5] Fangjiao Jiang, Weiyi Meng, Xiaofeng Meng. Selectivity
estimation for exclusive query translation in deep web data
integration. Lecture Notes in Computer Science(LNCS)<br />
<strong>Journal</strong>, 2009, 5463: 595-600.<br />
[6] Yongquan Dong, Qingzhong Li, Yanhui Ding, Zhaohui<br />
Peng. A query interface matching approach based on<br />
extended evidence theory for deep web. <strong>Journal</strong> <strong>of</strong><br />
Computer Science and Technology, 2010, 25(3): 537-547.<br />
[7] Tao Tan, Hongjun Chen. A Personalization Recommendation<br />
Method Based on Deep Web Data Query. <strong>Journal</strong> <strong>of</strong><br />
Computers, 2012, 7(7): 1599-1606.<br />
[8] K.C.-C.Chang and H.Garcia-Molina. Approximate query<br />
mapping: accounting for translation closeness. VLDB<br />
<strong>Journal</strong>, 2001.<br />
[9] K.C.-C.Chang and H.Garcia-Molina. Approximate query<br />
translation across heterogeneous information sources. In<br />
Proceedings <strong>of</strong> the 26th VLDB Conference, 2000, pp566-<br />
577.<br />
[10] Kerui Chen, Wanli Zuo, Fan Zhang, Fengling He,<br />
Yongheng Chen. Robust and Efficient Annotation based on<br />
Ontology Evolution for Deep Web Data. <strong>Journal</strong> <strong>of</strong><br />
Computers, 2011, 6(10): 2029-2036.<br />
[11] Weifeng Su, Jiying Wang, Frederick H.Lochovsky. ODE:<br />
Ontology-assisted data extraction. ACM Transactions on<br />
Database Systems, 2009, 34(2):1-35.<br />
[12] Matthew Horridge, Bijan Parsia, Ulrike Sattler.<br />
Explanation <strong>of</strong> OWL entailments in Protege4. In<br />
proceedings <strong>of</strong> International Semantic Web Conference,<br />
2008.<br />
[13] Zhen Zhang, Bin He, Kevin Chen-Chuan Chang. Lightweight<br />
domain-based form assistant: querying web<br />
databases on the fly. In Proceedings <strong>of</strong> the 31st Very Large<br />
Data Bases Conference (VLDB), 2005, pp97-108.<br />
[14] Jie Bao, Doina Caragea, Vasant Honavar. Query<br />
translation for ontology-extended data sources. American<br />
Association for Artificial Intelligence, 2007.<br />
[15] Andrea Cali, Davide Martinenghi. Querying the deep web.<br />
EDBT, 2010.<br />
[16] Ye Ma, Derong Shen, Yue Kou, and Wei Liu. An effective<br />
query relaxation solution for the deep web. In Proceedings<br />
<strong>of</strong> APWeb, 2008, pp649-659.<br />
[17] Zhongtian He, Hong Jun, D A.Bell. A Prioritized<br />
Collective Selection Strategy for Schema Matching across<br />
Query Interfaces [C]. Proceedings <strong>of</strong> the 26th British<br />
national conference on Databases (BNCOD’09), LNCS<br />
5588, 2009: 21-32.<br />
[18] Stephen W.Liddle,David W.Embley, Del T. Scott.<br />
Extracting data behind web forms. Proceedings <strong>of</strong> the 28th<br />
VLDB Conference, 2008, pp402-413.<br />
[19] The UIUC Web Integration Repository.<br />
http://metaquerier.cs.uiuc.edu/repository.<br />
Xin Wang was born in HuLuDao,<br />
LiaoNing Province, China, on October 21,<br />
1981. He received his Master <strong>of</strong> computer<br />
science from Harbin Engineering<br />
University in 2008. He has been a Ph.D. candidate in computer science at Jilin University since 2011. His main research
interests include information retrieval,<br />
machine learning, and web mining.<br />
Ying Wang was born in 1981. She is a<br />
lecturer at the Jilin University and a CCF<br />
member. She received her Ph.D. degree<br />
from Jilin University. Her research area is<br />
Web Information Mining, Ontology and<br />
Web search engine.
Data Modeling <strong>of</strong> Knowledge Rules: An Oracle<br />
Prototype<br />
Rajeev Kaula<br />
Computer Information Systems Department, Missouri State University, Springfield, MO 65897 (USA)<br />
E-Mail: RajeevKaula@missouristate.edu<br />
Abstract—Knowledge rules are outlined declaratively in a<br />
knowledge base repository. Each rule is created<br />
independently for storage in the repository. This paper<br />
provides an approach to apply the techniques <strong>of</strong> traditional<br />
entity-relationship data modeling to structure the<br />
knowledge rules for storage as a database schema in a<br />
relational database management system. Utilization of the entity-relationship model and a relational database for modeling knowledge rules provides a more standardized mechanism for structuring them. Storage of knowledge rules in a relational database also brings about improved integration with business applications, besides making available the services provided for transactional database applications. The paper utilizes the
Oracle database for illustrating the application <strong>of</strong> the<br />
concepts through a sample set <strong>of</strong> knowledge rules. The<br />
approach is explained through a prototype in Oracle's<br />
PL/SQL Server Pages.<br />
Index Terms—Data Modeling, Knowledge Rules, Expert<br />
Systems, Knowledge Base, Entity-Relationship Diagram,<br />
Relational model.<br />
I. INTRODUCTION<br />
Knowledge rules (or production rules) are the primary<br />
mechanisms to define knowledge in rule based systems<br />
like expert systems or knowledge-based systems [4, 5, 10,<br />
11, 12, 13, 17, 21, 25, 27, 28, 32]. Such rules typically<br />
express decision-making guidelines. Each knowledge rule<br />
is written declaratively in constraint-action terminology<br />
represented as IF constraint THEN action statements. A<br />
constraint is some condition, while the action clause<br />
reflects the decision or advice. Figure 1 shows an<br />
example <strong>of</strong> a knowledge rule that describes a set <strong>of</strong><br />
constraints applicable for approving a loan application.<br />
Figure 1. Sample Knowledge Rule<br />
Each rule is independently outlined in the knowledge<br />
base repository. In general the structuring and storage <strong>of</strong><br />
knowledge rules is done through special programming<br />
languages like Prolog or Lisp [26, 29] or some<br />
proprietary development environments like CLIPS [9],<br />
and JESS [7]. This paper outlines an approach to<br />
structure knowledge rules as a database schema for<br />
storage in relational databases using traditional entity<br />
relationship (ER) modeling techniques. As relational<br />
database and the associated language SQL are widely<br />
considered the ANSI/ISO standard for data storage and<br />
manipulation (viz. SQL:2008), data modeling <strong>of</strong><br />
knowledge rules for storage as a relational database<br />
schema provides a more standardized structure for<br />
knowledge base (repository). Besides, representation <strong>of</strong><br />
the knowledge rules as a relational database schema<br />
enables utilization <strong>of</strong> SQL for rule definition,<br />
maintenance, and manipulation.<br />
Even though there have been attempts toward<br />
integration <strong>of</strong> knowledge base and database, such<br />
attempts have traditionally focused on (i) improving the<br />
database working in the form <strong>of</strong> intelligent databases [1,<br />
2, 18, 20, 22, 23, 30, 31], or (ii) using knowledge based<br />
techniques to extract meaningful data from databases in<br />
the form <strong>of</strong> knowledge discovery [6, 8, 15, 24]. Whereas<br />
intelligent databases deal with the utilization <strong>of</strong> artificial<br />
intelligence techniques to capture the heuristics needed to<br />
control data in databases, the knowledge discovery<br />
approach involves utilization <strong>of</strong> artificial intelligence<br />
techniques to discover new knowledge in the form <strong>of</strong> data<br />
mining. Modeling <strong>of</strong> knowledge rules as a knowledge<br />
repository in relational database is an alternative<br />
approach to structure knowledge rules and enhance its<br />
utilization or integration with application development.<br />
The modeling <strong>of</strong> knowledge rules as relational<br />
database schema is now outlined in the following sections.<br />
First the entity relationship concepts for modeling<br />
knowledge rules are outlined. This is followed by a<br />
prototype that illustrates the entity relationship modeling<br />
through a sample set <strong>of</strong> knowledge rules, along with their<br />
transformation and retrieval from an Oracle database. The<br />
approach is illustrated on an Oracle 11g database through<br />
a prototype in PL/SQL Server Pages [3, 14]. PL/SQL<br />
server pages is a server-side scripting approach for<br />
developing database driven dynamic Web pages. The<br />
PL/SQL server page uses Oracle's primary database<br />
language PL/SQL as a scripting language along with<br />
HTML to generate database driven Web pages. Even<br />
though the prototype utilizes Oracle technology, due to<br />
the standardization <strong>of</strong> relational database concepts, such<br />
modeling and manipulation can be accomplished through
any other relational database product like MySQL, SQL<br />
Server, etc.<br />
II. RELATIONAL MODEL SCHEMA FOR KNOWLEDGE<br />
RULES<br />
The relational model schema for knowledge rules<br />
begins by modeling the structure <strong>of</strong> knowledge rules<br />
through entity relationship modeling followed by their<br />
transformation into a relational model. The modeling and<br />
transformation process consists <strong>of</strong> the following elements:<br />
(i) subject area schema specification, (ii) entity type<br />
structure specification, (iii) entity relationship<br />
specification, (iv) relational table representation, and (v)<br />
sharing <strong>of</strong> subject area schemas. Each element builds on<br />
one another.<br />
A. Subject Area Schema Specification<br />
Modeling <strong>of</strong> knowledge rules begins with the concept<br />
<strong>of</strong> categorizing organizational knowledge into subject<br />
areas. A subject area is the decision making area <strong>of</strong><br />
business. Each subject area contains knowledge rules<br />
specific to its domain <strong>of</strong> working. For example, there<br />
could be customer loan subject area, sales analysis<br />
subject area, and so on.<br />
While the collection <strong>of</strong> knowledge rules within a<br />
subject area provide the action or decision support for<br />
that area, from a data modeling perspective such rules<br />
define the schema <strong>of</strong> knowledge belonging to the subject<br />
area. In other words, each subject area is a database<br />
schema supporting the knowledge as defined through its<br />
knowledge rules. So, for instance, there could be a<br />
database schema for customer loan subject area<br />
knowledge rules, another one for sales analysis subject<br />
area knowledge rules, and so on. The specification <strong>of</strong><br />
entity types within a subject area schema is outlined now.<br />
B. Entity Type Structure<br />
The knowledge rules within a subject area are<br />
represented through a collection <strong>of</strong> entity types. Such<br />
entity types essentially follow the abstract structure <strong>of</strong> a<br />
knowledge rule statement as shown in Figure 2. In the<br />
figure each “constraint-i operator value” clause is some<br />
constraint, the “AND/OR” entries are logical operators<br />
joining constraint clauses, and the “subject area action”<br />
clause is some action representing the decision when the<br />
constraint conditions are true.<br />
Figure 2. Abstract Structure <strong>of</strong> Knowledge Rule<br />
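As a minimal sketch (not part of the paper's prototype), the abstract structure of Figure 2 can be written down as a plain data structure: an ordered list of constraint clauses, each carrying the logical operator that joins it to the next clause, plus the subject area action.

```python
# One of the customer loan rules used later in the paper, in abstract form.
rule = {
    "subject_area": "customer loan",
    "constraints": [
        # (constraint name, operator, value, logical operator joining the next clause)
        ("credit risk", "is", "High", "AND"),
        ("loan to value", "is less than", "80%", "OR"),
        ("loan requested", "is less than", "$40,000", None),  # last clause: no connector
    ],
    "action": "Approve with 5% APR",
}
```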
Each constraint clause is represented through<br />
individual constraint entity types. Each constraint entity<br />
type will consist <strong>of</strong> three attributes as shown in Figure 3.<br />
The “Constraint Name” entry that defines the name <strong>of</strong> the<br />
constraint clause entity type is the name <strong>of</strong> the constraint<br />
entry in the constraint clause <strong>of</strong> the knowledge rule<br />
statement. The “Constraint ID” attribute is the primary<br />
key, the “Operator” attribute is the condition operator in<br />
the constraint clause, while the “Constraint Value”<br />
attribute is the value assigned to the constraint condition.<br />
Figure 3. Constraint Entity Type<br />
Instances <strong>of</strong> the constraint entity type are the<br />
individual constraint clauses in an associated knowledge<br />
rule statement. For example, consider two knowledge<br />
rules pertaining to subject area “customer loan” as shown<br />
in Figure 4.<br />
Figure 4. Sample Customer Loan Knowledge Rules<br />
The modeling <strong>of</strong> the constraint clause is shown in<br />
Figure 5. In the figure, part (a) shows the structure<br />
(definition) <strong>of</strong> constraint entity type, while part (b) shows<br />
the entity instances pertaining to the different clauses for<br />
the constraint in the two knowledge rule statements.<br />
Figure 5. Constraint Entity Instances for Customer Loan Example<br />
The subject area action clause is represented through a<br />
subject entity type. This entity type consists <strong>of</strong> two<br />
attributes as shown in Figure 6. To ensure symmetry<br />
within the modeling process, the subject entity type is<br />
named after the subject area. The “Action ID” attribute is<br />
the primary key, while the “Action Value” attribute is the<br />
value assigned to the subject area action entry in the
associated knowledge rule statement. So, for instance if<br />
the knowledge rule belongs to “customer loan” subject<br />
area, then the subject entity type would be titled<br />
“customer loan.”<br />
Figure 6. Action Entity Type<br />
Instances <strong>of</strong> the subject entity type are the subject area<br />
action clauses within different knowledge rule statements.<br />
For example, consider the knowledge rules <strong>of</strong> Figure 4 to<br />
outline the details <strong>of</strong> subject entity type. Since the subject<br />
area for the knowledge rules is “customer loan” the<br />
subject entity type is also named “customer loan.” Figure<br />
7 shows the modeling <strong>of</strong> the subject area action clauses<br />
for these knowledge rules. Part (a) <strong>of</strong> the figure shows the<br />
structure (definition) <strong>of</strong> subject (Customer Loan) entity<br />
type. Part (b) <strong>of</strong> the figure shows the entity instances<br />
pertaining to the different clauses for the subject area<br />
action values in the associated knowledge rule statements.<br />
Figure 7. Action Entity Instance for Customer Loan Example<br />
C. Entity Relationship Specification<br />
The various entity types <strong>of</strong> a knowledge rule will have<br />
binary relationship with the subject entity type. The<br />
binary relationship will be one-to-many (1:N) as shown<br />
in Figure 8. The logical operator binding the constraint<br />
clauses within a knowledge rule shall become the<br />
relationship attribute <strong>of</strong> the binary relationship between<br />
the constraint entity type and the subject entity type. The<br />
minimum cardinality is optional to mandatory from the<br />
constraint entity type to subject entity type. Each instance<br />
<strong>of</strong> constraint entity type now will be associated with one<br />
or more subject entity instances. On the other hand, the<br />
subject entity instances may optionally be associated with<br />
different constraint entity instances.<br />
Figure 8. Knowledge Rule Entity-Relationship Model<br />
The 1:N relationship between the constraint and<br />
subject entity types binds the constraint entity instances<br />
with the subject entity instances to represent a complete<br />
knowledge rule statement. Further, as knowledge rules<br />
are structured into distinct entity types, the entity<br />
relationship model <strong>of</strong> a subject area represents a database<br />
schema <strong>of</strong> entity types. For example, there can be a<br />
schema <strong>of</strong> entity types for the customer loan subject area<br />
representing its various knowledge rules. The<br />
transformation <strong>of</strong> entity relationship model <strong>of</strong> a subject<br />
area into a relational model is outlined now.<br />
D. Relational Table Representation<br />
Each constraint entity type is represented as a separate<br />
table in a relational database. For example, Figure 9<br />
which is an extension <strong>of</strong> Figure 5 shows the table<br />
structure <strong>of</strong> the credit risk and loan requested constraint<br />
entity types.<br />
Figure 9. Database Tables for Customer Loan Example Constraints<br />
Similarly the subject entity type will be represented as<br />
a separate table in the relational database. For example,<br />
Figure 10 extends Figure 7 through the table structure <strong>of</strong><br />
the Customer Loan subject entity type. The<br />
CreditRisk_Logical attribute represents the logical operator value that binds the credit risk constraint with the loan requested constraint for this rule. The LoanRequested_Logical attribute, being associated with the last constraint clause within the rule structure, will be null.
Part (a) <strong>of</strong> the figure shows the 1:N relationship between<br />
the subject (Customer Loan) entity type with the<br />
constraint Credit Risk and Loan Requested, while part (b)<br />
shows the table structure <strong>of</strong> the Customer Loan entity<br />
type. The foreign key CreditRisk_ID and<br />
LoanRequested_ID in Customer Loan table represents the<br />
1:N relationship with the CreditRisk and LoanRequested<br />
tables respectively. Included in the CustomerLoan table is<br />
also the value <strong>of</strong> the logical operators.<br />
The database schema <strong>of</strong> constraint entity types tables<br />
and subject entity type table represents a collection <strong>of</strong>
knowledge rules for the subject area in a relational<br />
database.<br />
Figure 10. Database Tables for Customer Loan Example<br />
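For illustration only, the table structure of Figures 9 and 10 can be expressed as the following DDL (shown here with SQLite rather than Oracle): one table per constraint entity type, plus a subject-area table holding the action value, the foreign keys, and the logical-operator columns.

```python
import sqlite3

ddl = """
CREATE TABLE CreditRisk     (CreditRisk_ID    INTEGER PRIMARY KEY, Operator TEXT, Value TEXT);
CREATE TABLE LoanRequested  (LoanRequested_ID INTEGER PRIMARY KEY, Operator TEXT, Value TEXT);
CREATE TABLE CustomerLoan (
    CustomerLoan_ID       INTEGER PRIMARY KEY,
    Value                 TEXT,   -- the action value, e.g. 'Reject'
    CreditRisk_ID         INTEGER REFERENCES CreditRisk(CreditRisk_ID),
    CreditRisk_Logical    TEXT,   -- logical operator joining the next constraint clause
    LoanRequested_ID      INTEGER REFERENCES LoanRequested(LoanRequested_ID),
    LoanRequested_Logical TEXT    -- NULL for the last constraint clause
);
"""
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)
```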
E. Sharing <strong>of</strong> Subject Area Schemas<br />
Subject area database schemas can share entity types with each other. Such sharing represents the chaining of knowledge among knowledge rules belonging to different subject areas. The first type of sharing occurs when
the constraint in one subject area set <strong>of</strong> knowledge rules<br />
is also a constraint in another subject area knowledge<br />
rules. For example, consider two knowledge rules in two<br />
subject areas. The first rule belongs to the credit risk<br />
subject area with three constraint clauses as shown below:<br />
IF credit score is less than 600 AND<br />
debt to income is greater than 40% OR<br />
house is investment<br />
THEN Credit Risk is High<br />
The second rule belongs to the customer loan subject<br />
area with three constraint clauses as shown below.<br />
IF credit score is less than 600 AND<br />
loan to value is less than 80% OR<br />
loan requested is less than $40,000<br />
THEN Approve with 5% APR<br />
The first constraint “credit score is less than 600” in<br />
both rules is the same. From a modeling perspective, the<br />
entity type representing this constraint is defined once in<br />
either <strong>of</strong> the two subject area schema, and then shared<br />
between the two subject area schemas. Such sharing <strong>of</strong><br />
constraint entity types across subject area schemas can be<br />
viewed symbolically in Figure 11.<br />
Figure 11. Sharing Constraints Entity Types among Subject Areas<br />
The second type of sharing occurs when an action value in one subject area's knowledge rules serves as a constraint clause in another subject area's knowledge rules. For example, consider two knowledge rules in two
subject areas. The first rule belongs to the credit risk<br />
subject area with three constraint clauses as shown below:<br />
IF credit score is less than 600 AND<br />
debt to income is greater than 40% OR<br />
house is investment<br />
THEN Credit Risk is High<br />
The second rule belongs to the customer loan subject<br />
area with three constraint clauses as shown below.<br />
IF credit risk is High AND<br />
loan to value is less than 80% OR<br />
loan requested is less than $40,000<br />
THEN Approve with 5% APR<br />
The action value of the credit risk subject area is referenced as the constraint "credit risk is High" in the customer loan subject area. This reference helps validate the constraint value against the separate set of knowledge rules in the credit risk subject area. From a modeling perspective, such
reference is represented through the relationship between<br />
the subject entity types between the involved subject<br />
areas as shown symbolically in Figure 12.<br />
Figure 12. Sharing Action Entity Types among Subject Areas<br />
III. KNOWLEDGE RULES MODELING PROTOTYPE<br />
A prototype for modeling knowledge rules based on<br />
two subject areas belonging to finance discipline is<br />
outlined in this section. The proposed entity relationship<br />
model is transformed for storage in an Oracle database<br />
through the SQL language. The prototype also shows the<br />
retrieval <strong>of</strong> database stored knowledge rules in<br />
declarative format through a database procedure using the<br />
Oracle’s PL/SQL database language.<br />
A. Modeling Subject Area Knowledge Rules<br />
The subject areas for the prototype are customer loan<br />
and credit risk. The credit risk knowledge rules are listed<br />
first, followed by the entity relationship diagram for the<br />
credit risk subject area as shown in Figure 13.<br />
Credit Risk<br />
Rule 1:<br />
IF credit score is less than 600 AND<br />
debt to income is greater than 40% OR<br />
house is investment<br />
THEN High<br />
Rule 2:<br />
IF customer credit score is more than 600 AND
debt to income is less than 40% OR<br />
house is primary<br />
THEN Low<br />
Figure 13. Credit Risk Knowledge Rules Entity-Relationship Model<br />
The customer loan knowledge rules are listed now,<br />
followed by the entity relationship diagram for the<br />
customer loan subject area as shown in Figure 14.<br />
Customer Loan<br />
Rule 1:<br />
IF credit risk is High AND<br />
loan to value is greater than 80% OR<br />
loan requested is greater than $40,000<br />
THEN Reject<br />
Rule 2:<br />
IF credit risk is High AND<br />
loan to value is less than 80% OR<br />
loan requested is less than $40,000<br />
THEN Approve with 5% APR<br />
Rule 3:<br />
IF credit risk is High AND<br />
loan to value is 100% OR<br />
loan requested is greater than $75,000<br />
THEN Reject<br />
Rule 4:<br />
IF credit risk is Low AND<br />
loan to value is less than 80% OR<br />
loan requested is greater than $150,000<br />
THEN Approve with 4.5% APR<br />
Figure 14. Customer Loan Knowledge Rules Entity-Relationship Model<br />
The prototype also illustrates the second type <strong>of</strong><br />
sharing between the two subject areas. The credit risk<br />
constraint entity type within the customer loan schema is<br />
itself a separate subject area. Consequently, there is<br />
sharing <strong>of</strong> credit risk subject entity type with customer<br />
loan subject entity type. The composite entity relationship<br />
model for the two subject area schemas is shown in<br />
Figure 15.<br />
Figure 15. Sharing among Customer Loan and Credit Risk Subject<br />
Areas<br />
B. Relational Model Representation<br />
The entity relationship model <strong>of</strong> the two subject area<br />
schemas is transformed into a relational model for storage<br />
in a relational database. The table structure along with the<br />
row values for the various entity types is now outlined.<br />
To facilitate understanding <strong>of</strong> the concepts, the credit risk<br />
schema tables are outlined first, followed by the customer<br />
loan schema tables.<br />
TABLE I.<br />
CREDITSCORE TABLE<br />
CreditScore_ID Operator Value<br />
1 is less than 600<br />
2 is more than 600<br />
TABLE II.
DEBTTOINCOME TABLE
DebtToIncome_ID | Operator        | Value
1               | is less than    | 40%
2               | is greater than | 40%

TABLE III.
HOUSE TABLE
House_ID | Operator | Value
1        | is       | Investment
2        | is       | Primary

TABLE IV.
CREDITRISK TABLE (PART 1)
CreditRisk_ID | Value | CreditScore_ID | CreditScore_Logical | DebtToIncome_ID
1             | High  | 1              | AND                 | 2
2             | Low   | 2              | AND                 | 1

TABLE V.
CREDITRISK TABLE (PART 2)
CreditRisk_ID | DebtToIncome_Logical | House_ID | House_Logical
1             | OR                   | 1        | (null)
2             | OR                   | 2        | (null)

where CreditScore_ID is foreign key to CreditScore table; DebtToIncome_ID is foreign key to DebtToIncome table; and House_ID is foreign key to House table.
TABLE VI.
LOANTOVALUE TABLE
LoanToValue_ID | Operator        | Value
1              | is greater than | 80%
2              | is              | 100%
3              | is less than    | 80%

TABLE VII.
LOANREQUESTED TABLE
LoanRequested_ID | Operator        | Value
1                | is greater than | 40000
2                | is less than    | 40000
3                | is greater than | 75000
4                | is greater than | 150000

TABLE VIII.
CUSTOMERLOAN TABLE (PART 1)
CustomerLoan_ID | Value                  | CreditRisk_ID | CreditRisk_Logical | LoanToValue_ID
1               | Reject                 | 1             | AND                | 1
2               | Approve with 5% APR    | 1             | AND                | 3
3               | Reject                 | 1             | AND                | 2
4               | Approve with 4.5% APR  | 2             | AND                | 3

TABLE IX.
CUSTOMERLOAN TABLE (PART 2)
CustomerLoan_ID | LoanToValue_Logical | LoanRequested_ID | LoanRequested_Logical
1               | OR                  | 1                | (null)
2               | OR                  | 2                | (null)
3               | OR                  | 3                | (null)
4               | OR                  | 4                | (null)

where CreditRisk_ID is foreign key to CreditRisk table; LoanToValue_ID is foreign key to LoanToValue table; and LoanRequested_ID is foreign key to LoanRequested table.
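A minimal sketch of the rule lookup that the prototype's select_rule_web procedure performs, shown here with SQLite instead of Oracle PL/SQL and assuming the tables above have been created and populated: the user's inputs are first mapped to constraint keys, then the CustomerLoan row whose foreign keys match those keys is retrieved.

```python
import sqlite3

def find_loan_rule(conn: sqlite3.Connection, creditrisk_id: int,
                   loantovalue_id: int, loanrequested_id: int):
    """Return (CustomerLoan_ID, Value) for the rule matching the given constraint keys."""
    return conn.execute(
        """SELECT CustomerLoan_ID, Value
             FROM CustomerLoan
            WHERE CreditRisk_ID = ? AND LoanToValue_ID = ? AND LoanRequested_ID = ?""",
        (creditrisk_id, loantovalue_id, loanrequested_id),
    ).fetchone()

# Example: the keys (1, 1, 1) correspond to Rule 1 and return (1, 'Reject').
```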
C. Web Prototype<br />
The relational schema tables <strong>of</strong> the two subject areas<br />
are installed in an Oracle 11g database. Once the subject<br />
area schema is in the database, it can be queried for<br />
© 2012 ACADEMY PUBLISHER<br />
decision support. At this stage the prototype performs a simple retrieval of knowledge rules; the retrieved knowledge rules are displayed in declarative format. The prototype consists of two Web pages, whose interaction within the Web architecture is shown in Figure 16.
Figure 16. Prototype Web Architecture<br />
The user requests for the first Web page titled “input<br />
rule.” This page displays a Web form with text boxes to<br />
input data needed to search for valid knowledge rule in<br />
the database (knowledge) repository. Figure 17 shows a<br />
view <strong>of</strong> the input rule Web page.<br />
Figure 17. input_rule Web Form<br />
The input rule Web page is generated through a Web procedure titled "input_rule_web." Once the user completes the Web form and clicks the "Provide Advice" button, the browser forwards the form input data through the Web (HTTP) server to the second Web page in the database.
The second Web page is titled “select rule.” This Web<br />
page receives the form input data, completes the database<br />
processing for searching the valid knowledge rule, and<br />
returns the outcome to the Web server, which in turn<br />
forwards the page to the Web browser. Figure 18 shows a<br />
view <strong>of</strong> the output using the inputs entered in the first<br />
Web page.
Figure 18. Select rule Web output<br />
The select rule Web page is generated through two<br />
Web procedures. The first Web procedure titled<br />
“select_rule_web” searches for the valid knowledge rule,<br />
while the second Web procedure titled<br />
“cust_loan_output_web” formats the selected knowledge<br />
rule in declarative format for display in the Web browser.<br />
Figure 19 shows the pseudocode logic for knowledge rule<br />
search strategy. The select_rule_web Web procedure is<br />
listed in Appendix-A, while the cust_loan_output_web<br />
Web procedure is listed in Appendix-B.<br />
Figure 19. Select rule Web page logic<br />
IV. CONCLUSION<br />
Entity relationship model is generally utilized to model<br />
data in a transactional database or data warehouse [13,<br />
16]. However, modeling knowledge rules through an entity relationship model and storing them in a relational database provides a standards-based mechanism for structuring knowledge rules. Storage of knowledge rules
as a knowledge base in a relational DBMS allows for the<br />
utilization <strong>of</strong> services similar to those provided for<br />
transactional database like conceptually centralized<br />
management, access optimization, recovery and<br />
concurrency controls, and so on.<br />
The knowledge rules representation as a relational<br />
database schema can facilitate some additional features<br />
like:<br />
1. Any access to knowledge can be restricted to<br />
only those rules that are pertinent to that user.<br />
This is similar to how access is restricted to data<br />
in a transactional database.<br />
2. Rules are queryable and updatable through the widely known SQL language: (i) new rules can be added or existing rules dropped to keep the knowledge rules current, (ii) new constraints can be added and existing constraints dropped or modified, and (iii) even individual attributes can be modified, since the knowledge rules are stored as relational tables.
3. Rule consistency can be maintained with regard<br />
to its format and relationships.<br />
4. Rules may be integrated with business or<br />
enterprise applications, wherein such<br />
applications shall always get current knowledge<br />
from the database.<br />
As the relational database SQL concepts are<br />
standardized since SQL was adopted as a standard by the<br />
American National Standards Institute (ANSI) in 1986<br />
and International Organization for Standardization (ISO)<br />
in 1987 (the current standard is SQL:2008), the data<br />
model (relational database schema) can be easily ported<br />
to all major enterprise DBMS like Oracle, SQL Server,<br />
DB2, MySQL, and so on. The manipulation <strong>of</strong> the<br />
schema through a database language as illustrated<br />
through the prototype will vary, even though the nature <strong>of</strong><br />
such manipulations will be conceptually similar.<br />
Further research is in progress to extend the modeling<br />
of knowledge rules. This involves (i) developing rule engine mechanisms to query rules for specific constraints, (ii) incorporating additional complexity in rule specification, (iii) developing techniques to link constraint values with transactional databases, and (iv) exploring the notion of inferencing rule chains.
APPENDIX A WEB PROCEDURE TO SEARCH VALID<br />
KNOWLEDGE RULE<br />
<%@ page language="PL/SQL" %>
<%@ plsql procedure="select_rule_web" %>
<%@ plsql parameter="cs_in" type="NUMBER" %>
<%@ plsql parameter="dti_in" type="NUMBER" %>
<%@ plsql parameter="h_in" type="VARCHAR2" %>
<%@ plsql parameter="ltv_in" type="NUMBER" %>
<%@ plsql parameter="lr_in" type="NUMBER" %>
<%! cs_key number; dti_key number; h_key number;
    cr_key number; cr_value varchar2(30);
    ltv_key number; lr_key number;
    cl_key number; cl_value varchar2(50); %>
<%
-- assumed: the page directives, declarations and credit-score lookup, inferred
-- from the input_rule form fields and the lookups that follow
if cs_in < 600 then
select creditscore_id into cs_key
from creditscore
where operator = 'is less than' and value = '600';
else
select creditscore_id into cs_key
from creditscore
where operator = 'is more than' and value = '600';
end if;
if dti_in < 40 then<br />
select debttoincome_id into dti_key<br />
from debttoincome<br />
where operator = 'is less than' and value = '40%';<br />
else<br />
select debttoincome_id into dti_key<br />
from debttoincome<br />
where operator = 'is greater than' and value = '40%';<br />
end if;<br />
if h_in = 'Investment' then<br />
select house_id into h_key<br />
from house<br />
where value = 'Investment';<br />
else<br />
select house_id into h_key<br />
from house<br />
where value = 'Primary';<br />
end if;<br />
select creditrisk_id, value into cr_key, cr_value<br />
from creditrisk<br />
where creditscore_id = cs_key and debttoincome_id = dti_key<br />
and house_id = h_key;<br />
if (ltv_in < 80) then<br />
select loantovalue_id into ltv_key<br />
from loantovalue<br />
where operator = 'is less than' and value = '80%';<br />
end if;<br />
if ((ltv_in > 80) and (ltv_in < 100)) then<br />
select loantovalue_id into ltv_key<br />
from loantovalue<br />
where operator = 'is greater than' and value = '80%';<br />
end if;<br />
if (ltv_in = 100) then<br />
select loantovalue_id into ltv_key<br />
from loantovalue<br />
where operator = 'is' and value = '100%';<br />
end if;<br />
if (lr_in < 40000) then<br />
select loanrequested_id into lr_key<br />
from loanrequested<br />
where operator = 'is less than' and value = 40000;<br />
end if;<br />
if ((lr_in > 40000) and (lr_in < 75000)) then<br />
select loanrequested_id into lr_key<br />
from loanrequested<br />
where operator = 'is greater than' and value = 40000;<br />
end if;<br />
if ((lr_in > 75000) and (lr_in < 150000)) then<br />
select loanrequested_id into lr_key<br />
from loanrequested<br />
where operator = 'is greater than' and value = 75000;<br />
end if;<br />
if (lr_in > 150000) then<br />
select loanrequested_id into lr_key<br />
from loanrequested<br />
where operator = 'is greater than' and value = 150000;<br />
end if;<br />
select customerloan_id, value into cl_key, cl_value<br />
from customerloan<br />
where creditrisk_id = cr_key and loantovalue_id = ltv_key<br />
and loanrequested_id = lr_key;<br />
cust_loan_output_web(cr_key, cl_key); %><br />
APPENDIX B WEB PROCEDURE TO DISPLAY IN<br />
DECLARATIVE FORMAT<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Select Rule<br />
<br />
<br />
Knowledge Rules Query<br />
Prototype<br />
<br />
The appropriate knowledge rule for advice is:<br />
<br />
<br />
IF<br />
<br />
Credit Score is <br />
<br />
Debt to Income <br />
<br />
<br />
House is <br />
<br />
THEN  Credit Risk<br />
<br />
<br />
<br />
<br />
IF<br />
where creditrisk_id = cl_row.creditrisk_id; %><br />
Credit Risk is <br />
<br />
Loan to Value <br />
<br />
<br />
<br />
Loan Requested <br />
<br />
<br />
THEN  Customer Loan<br />
<br />
<br />
<br />
<br />
<br />
REFERENCES<br />
[1] S. Antony, D. Batra, and R. Santhanam, “The use <strong>of</strong> a<br />
knowledge-based system in conceptual data modeling,”<br />
Decision Support Systems, vol. 41, pp. 176 - 188, 2005.<br />
[2] E. Babiker, D. Simmons, R. Shannon, and N. Ellis, “A<br />
Model for Reengineering Legacy Expert Systems to<br />
Object-Oriented Architecture,” Expert Systems with<br />
Applications, vol. 12, pp. 363-371, 1997.<br />
[3] S. Boardman, M. Caffrey, S. Morse, and B. Rosenzweig,<br />
Oracle Web Application Programming for PL/SQL<br />
Developers, Upper Saddle River, NJ: Prentice-Hall, 2003.<br />
[4] R.J. Brachman and H. J. Levesque, Knowledge<br />
Representation and Reasoning, San Francisco, CA:<br />
Morgan Kaufmann, 2004.<br />
[5] Y. Duan and P. Burrell, “Some issues in developing expert<br />
marketing systems,” <strong>Journal</strong> <strong>of</strong> Business & Industrial<br />
Marketing, vol. 12, pp. 149-162, 1997.<br />
[6] U. Fayyad and R. Uthurusamy, “Data Mining and<br />
Knowledge Discovery in Databases,” Communications <strong>of</strong><br />
the ACM, vol. 39, pp. 24-26, 1996.<br />
[7] E. Friedman-Hill, Jess in Action: Rule Based Systems in<br />
Java, Greenwich, CT: Manning Publications, 2003.<br />
[8] C. Gertosio and A. Dussauchoy, “Knowledge discovery<br />
from industrial databases,” <strong>Journal</strong> <strong>of</strong> Intelligent<br />
Manufacturing, vol. 15, pp. 29-37, 2004.<br />
[9] J. C. Giarratano and G.D. Riley, Expert Systems:<br />
Principles and Programming, Boston, MA: Course<br />
Technology, 1998.<br />
[10] F. Gomez and C. Segami, “Semantic interpretation and<br />
knowledge extraction,” Knowledge-Based Systems, vol. 20,<br />
pp. 51–60, 2007.<br />
[11] Y. Guo, Z. Pan, and J. Heflin, “Choosing the best<br />
knowledge base system for large semantic web<br />
applications,” in Proceedings <strong>of</strong> the 13th international<br />
World Wide Web conference, New York, NY, pp. 302 -<br />
303, 2004.<br />
[12] H. Hayes-Roth and N. Jacobstein, “The State <strong>of</strong><br />
Knowledge-Based Systems,” Communications <strong>of</strong> the ACM,<br />
vol. 37, pp. 27-39, 1994.<br />
[13] R.D. Hull and F. Gomez, “Automatic acquisition <strong>of</strong><br />
biographic knowledge from encyclopedic texts,” Expert<br />
Systems with Applications, vol. 16, pp. 261–270, 1999 .<br />
[14] R. Kaula, Oracle 11g: Developing AJAX Applications with<br />
PL/SQL Server Pages, New York, NY: Mc-Graw-Hill,<br />
2008.<br />
[15] Y. Kim and W.N. Street, “An intelligent system for<br />
customer targeting: a data mining approach,” Decision<br />
Support Systems, vol. 37, pp. 215 - 228, 2004.<br />
[16] R. Kimball, The Data Warehouse Toolkit, New York, NY:<br />
John Wiley & Sons, 1996.<br />
[17] S. Liao, “Expert system methodologies and applications—<br />
a decade review from 1995 to 2004,” Expert Systems with<br />
Applications, vol. 28, pp. 93-103, 2005.<br />
[18] B. Lin, “An Overview <strong>of</strong> Intelligent Database,” <strong>Journal</strong> <strong>of</strong><br />
Computer Information Systems, vol. 33, pp. 8-12, 1993.<br />
[19] M. Mannino, Database Design, Application Development,<br />
and Administration , New York, NY: McGraw-Hill, 2006.<br />
[20] F. Manola, “Object-Oriented Knowledge Bases,” AI Expert,<br />
vol. 5, pp. 46 - 57, 1990.<br />
[21] Y. Ma, B. Jin, and Y. Feng, “Dynamic evolutions based on<br />
ontologies,” Knowledge-Based Systems, vol. 20, pp. 98–<br />
109, 2007.<br />
[22] B. Martin, A. Mitrovic, P. Suraweera, and A. Weerasinghe,<br />
“DB-Suite: Experiences with Three Intelligent, Web-Based<br />
Database Tutors,” <strong>Journal</strong> <strong>of</strong> Interactive Learning<br />
Research, vol. 15, pp. 409-432, 2004.<br />
[23] M.M.O. Owrang and F.H. Grupe, “Database Tools to<br />
Acquire Knowledge for Rule-Based Expert Systems,”<br />
Information and S<strong>of</strong>tware Technology, vol. 39, pp. 607-<br />
616, 1997.<br />
[24] S. K. Pal and P. Mitra, Pattern Recognition Algorithms for<br />
Data Mining, Boca Raton, FL: CRC Press, 2004.<br />
[25] J.B. Quinn, Intelligent Enterprise: A Knowledge and<br />
Service Based Paradigm for Industry, New York, NY: The<br />
Free Press, 1992.<br />
[26] P. Seibel, Practical Common Lisp, New York, NY: Apress,<br />
2005.<br />
[27] Y.P. Shao, “The Infusion <strong>of</strong> Expert Systems in Banking:<br />
An Exploratory Study,” Expert Systems with Applications,<br />
vol. 12, pp. 429-440, 1997.<br />
[28] J.F. Sowa, Knowledge Representation: Logical,<br />
Philosophical, and Computational Foundations, New<br />
York, NY: Brooks/Cole, 2000.<br />
[29] L. Sterling and E. Shapiro, The Art <strong>of</strong> Prolog, Second<br />
Edition: Advanced Programming Techniques (Logic<br />
Programming), Boston, MA: The MIT Press, 1994.<br />
[30] P. Suraweera and A. Mitrovic, “An Intelligent Tutoring<br />
System for Entity Relationship Modelling,” International<br />
<strong>Journal</strong> <strong>of</strong> Artificial Intelligence in Education, vol. 14, pp.<br />
375-417, 2004.<br />
[31] Z. Yuanhui, L. Yuchang, and S. Chunyi, “A Connectionist<br />
Approach to Extracting Knowledge from Databases,” in<br />
X.Liu, P. Cohen, and M. Berthold (ed.), Advances in<br />
Intelligent Data Analysis, Berlin, Germany: Springer-<br />
Verlag, pp. 465-475, 1997.<br />
[32] O.M. Vasil’ev, D. P. Vetrov, and D. A. Kropotov,<br />
“Knowledge Representation and Acquisition in Expert<br />
Systems for Pattern Recognition,” Computational<br />
Mathematics and Mathematical Physics, Vol. 47, pp.<br />
1373–1397, 2007.
OPC (OLE for Process Control) based<br />
Calibration System for Embedded Controller<br />
Ming Cen<br />
School of Automation, Chongqing University of Posts and Telecommunications, Chongqing, P. R. China
Email: m_cen0104@sina.com<br />
Qian Liu, Yi Yan<br />
School of Automation, Chongqing University of Posts and Telecommunications, Chongqing, P. R. China
Email: liuqian.122678@163.com, yanyi210@126.com<br />
Abstract—During the development of embedded software for complex control systems, calibration is an important means of obtaining optimal parameters for the embedded software. Typical calibration systems, however, adapt poorly to controllers with different communication interfaces and calibration protocols. To solve this problem, an improved architecture for embedded controller calibration systems is proposed that supports additional communication buses and protocols. In this architecture, OPC (OLE for Process Control) technology is introduced to separate the host software of the calibration system into an OPC server and an OPC client. The OPC server masks the differences between calibration protocols and the details of the communication devices, and provides a unified access interface to the controller parameters. The OPC client calibrates and acquires the parameters by calling the interface provided by the OPC server. With this method, when the communication device or calibration protocol changes, only the corresponding OPC server needs to be replaced; the details of communication devices and calibration protocols no longer have to be considered when developing a calibration system, and the generality and openness of the calibration system are greatly enhanced. A calibration system built on this architecture was applied to an engine controller to verify the effectiveness of the presented method.
Index Terms—Embedded software, embedded controller, calibration system, OPC technology
I. INTRODUCTION<br />
Embedded systems have spread rapidly into nearly all industrial control fields, such as automotive, aerospace, military and other manufacturing domains. For example, the safety, comfort and efficiency of modern automobiles depend mainly on various embedded electronic control systems, so vehicles are equipped with more and more components built around embedded controllers, called electronic control units (ECUs), covering the powertrain, chassis and body control, etc. [1, 2]. Especially in new energy vehicles, such as hybrid electric
vehicles and pure electric vehicles, the number of ECUs is growing considerably.
Nowadays, electronics accounts for about 90% of the innovations in the automotive industry, and 80% of those come from software [3]. Because of the increasing complexity and functionality of these systems, the amount of software in a modern automobile grows continuously, and roughly 50-70% of ECU development costs are software costs [4]. For example, some cars contain more than 50 controllers, more than 600,000 lines of code, three different bus systems and approximately 150 messages and 600 signals [5]. At the same time, because of high time and cost pressure, the productivity of software development is of great importance to automotive manufacturers. The situation is similar in other industrial control fields.
To reduce the growing software development costs and time-to-market and to enhance reliability, a series of approaches have been applied to improve the efficiency and quality of software development. One solution is to increase the reuse of development results, including requirements, models, functions and software components. For example, AUTOSAR (AUTomotive Open System ARchitecture) provides a standardized embedded software architecture and interfaces to facilitate the reuse of software components across vehicle platforms, OEMs and suppliers [6]. Another is to manage the software development and maintenance processes, including requirements engineering, design, coding, software and systems integration, quality assurance and maintenance. The international standards IEC 61508 and ISO 26262 provide a safety lifecycle for developing safety-related automotive systems [7, 8].
Unlike ordinary embedded software, and because of the complexity of automotive and other industrial control systems, the performance of this embedded software depends strongly on working parameters, both of the controlled objects and of the controllers. The working parameters of embedded controllers must be determined
and optimized through calibration and matching experiments, from the first trials to the finalization of the product. Calibration is therefore one of the key technologies in the development of embedded controllers, and a calibration system with high efficiency and adaptability can greatly improve both the development efficiency and the quality of embedded controllers [9].
Currently, calibration systems mostly follow the ASAM standard architecture. ASAM [10, 11] is a family of standards defined by the Association for Standardization of Automation and Measuring Systems (ASAM). It provides standard interfaces for the measurement, calibration and fault diagnosis of automotive ECUs and of embedded controllers in other industrial fields. In this architecture, communication interfaces such as serial, CAN, USB and Ethernet are supported, as are calibration protocols including CCP (CAN Calibration Protocol), XCP (eXtended Calibration Protocol) and KWP2000. Many calibration systems have been reported. For example, a calibration system based on serial communication was used to calibrate the parameters of a vehicular battery management system [12]. A hybrid electric vehicle ECU calibration system based on CCP was developed with LabVIEW and implemented the calibration and measurement of ECU parameters by calling the driver library of a CAN communication adapter [13]. A gas engine controller calibration system based on CCP was developed with Borland C++ Builder, using a USB-CAN adapter to connect the host and the ECU [14]. There are further applications of the CCP protocol to automotive controllers as well [15]. The KWP2000 protocol was used in a diesel engine calibration system, with a serial port connecting the calibration system to a high-pressure common-rail diesel engine [16]. To support different types of ECUs with different parameters, a calibration system architecture supporting user interface customization and reconstruction has also been proposed [17].
However, the calibration systems in the above solutions are strongly coupled to a specific communication device or calibration protocol. Whenever the communication device or the calibration protocol changes, the calibration system must be updated accordingly. These solutions therefore cannot accommodate different calibration protocols and hardware interfaces, and the generality and openness of the calibration system are severely restricted. To solve this problem, an improved calibration system architecture is presented in which the different calibration protocols and communication devices are encapsulated with OPC technology to provide a unified data access interface to the ECUs. The development of the OPC server and client of the calibration system is also discussed and verified.
II. ARCHITECTURE OF CALIBRATION SYSTEM BASED ON<br />
MIDDLEWARE<br />
A. ASAM Architecture of the Calibration System
The ASAM working group defines the concept of the MCD (Measurement, Calibration and Diagnostics) model and
provides a corresponding standard architecture to guide the development of calibration systems. A typical calibration system conforming to the ASAM standard architecture is shown in Figure 1.
Figure 1. ASAM standard architecture of the calibration system
In this architecture there are three interfaces or protocols, called ASAM MCD-1, ASAM MCD-2 and ASAM MCD-3. The calibration and measurement parameters of the ECU originate from the map file (*.map) generated by the microcontroller compiler. The map file is transformed into the ASAM MCD-2 database file (*.a2l) by the ASAP editor. Using the ASAM MCD-2 database, the host of the MCD system can upload or download parameters to the ECU over the ASAM MCD-1 interface to carry out calibration and measurement.
In this architecture, the user interface of the host calibration software is generally strongly coupled to the calibration protocol and to the underlying hardware layer. For example, a calibration system that supports only a serial port cannot be used to calibrate ECUs that have only a CAN interface, and host calibration software written for a USB-CAN adapter cannot support a PCI-CAN adapter. Furthermore, when the calibration protocol changes to XCP, the original CCP-based calibration system fails. Whenever the calibration protocol or the hardware interface layer changes, the calibration system must be updated accordingly, so it cannot meet the requirement of customization and reconstruction as the ECUs change.
A calibration system that supports customization and automatic generation of the host software interface allows the user interface to be reconstructed, which improves its adaptability to different parameters or ECUs [17]. However, for lack of a standard abstract communication interface that encapsulates the different underlying hardware and calibration protocols, the host software is still coupled to the hardware or calibration protocol layer. It is therefore necessary to introduce a mechanism that shields the differences of the communication hardware and protocol layers and provides a unified data access interface, so as to improve the generality and adaptability of the calibration system.
2868 JOURNAL OF SOFTWARE, VOL. 7, NO. 12, DECEMBER 2012<br />
B. Architecture of a Middleware-based Calibration System
There are two main approaches to masking different underlying hardware and calibration protocols: standard APIs and middleware technology. Standard APIs can encapsulate the details of the communication hardware and the calibration protocol, and provide a unified access interface for the calibration host software. The scheme of a calibration system based on an API is shown in Figure 2. In practice, however, for lack of an API specification the API differs from one communication device to another, so if the driver interface or the device changes, the calibration system has to be updated as well.
Figure 2. Scheme of a calibration system based on an API
Middleware technology [18] can likewise provide a unified interface that shields the details of the underlying communication devices and calibration protocols. Once the device driver or communication protocol changes, only the middleware needs to be updated and the calibration host software stays the same. The development time and cost of the calibration system are thus both reduced, and the improved development efficiency in turn speeds up the development of the embedded software. The architecture of a middleware-based calibration system is shown in Figure 3.
Figure 3. Architecture of a middleware-based calibration system
The middleware of the calibration system provides a unified standard interface that shields the differences between network communication interfaces such as USB, Ethernet, CAN and LIN, as well as between calibration protocols such as CCP, XCP and KWP2000.
In this way the adaptability and generality of the calibration system are greatly improved.
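To make the middleware idea concrete, the following minimal Python sketch shows one way such a unified host-side interface could look. The class and method names (CalibrationTransport, UsbCanCcpTransport, connect/read/write) are illustrative assumptions of this rewrite, not an API defined by the paper or by any real driver library, and the CCP behavior is stubbed out.

```python
from abc import ABC, abstractmethod


class CalibrationTransport(ABC):
    """Illustrative middleware interface: hides the bus/driver and the
    calibration protocol behind read/write operations on ECU memory."""

    @abstractmethod
    def connect(self, station_address: int) -> None: ...

    @abstractmethod
    def read(self, address: int, size: int) -> bytes: ...

    @abstractmethod
    def write(self, address: int, data: bytes) -> None: ...


class UsbCanCcpTransport(CalibrationTransport):
    """One possible implementation: CCP over a USB-CAN adapter (stubbed)."""

    def connect(self, station_address: int) -> None:
        print(f"CCP CONNECT to station 0x{station_address:04X} via USB-CAN")

    def read(self, address: int, size: int) -> bytes:
        # A real implementation would issue SET_MTA followed by UPLOAD.
        return bytes(size)

    def write(self, address: int, data: bytes) -> None:
        # A real implementation would issue SET_MTA followed by DNLOAD.
        print(f"write {len(data)} byte(s) at 0x{address:08X}")


def calibrate(transport: CalibrationTransport) -> None:
    """Host-side code depends only on the abstract interface, so swapping
    the adapter or the protocol does not touch this function."""
    transport.connect(0x0200)
    transport.write(0x40001000, b"\x12\x34")
    print(transport.read(0x40001000, 2))


if __name__ == "__main__":
    calibrate(UsbCanCcpTransport())
```

The design point mirrors the paper's argument: calibrate() depends only on the abstract interface, so exchanging the adapter or the protocol means supplying a different implementation rather than editing the host software.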
III. CALIBRATION SYSTEM BASED ON OPC TECHNOLOGY<br />
A. Selection of the Calibration System Middleware
There are two basic middleware solutions. One is the COM/DCOM (Component Object Model / Distributed COM) standard proposed by Microsoft [19, 20]; the other is the CORBA (Common Object Request Broker Architecture) standard proposed by the OMG (Object Management Group) [21]. OPC is a COM/DCOM-based industrial standard that specifies the communication of real-time plant data between control devices from different manufacturers. With OPC technology, the heterogeneity of low-level device drivers is handled effectively, and development costs and incompatibilities between devices are reduced; the unified interface shields the differences between devices and improves the performance of industrial control systems. This solution, however, can only be used on the Windows platform. CORBA is an open industry standard developed by the OMG that offers platform independence, programming language independence and transparent message passing, but it is rather heavyweight and complicated.
Currently, the hosts of calibration systems are mostly PCs and industrial computers, which are largely based on the Windows platform. Considering also that calibration systems are generally small in scale, the OPC approach is more suitable for the proposed method than CORBA.
B. Architecture of the OPC-based General Calibration System
The calibration of an embedded control system is a process that adjusts the control parameters of the embedded controller to optimize the working state of the system. A typical calibration system should provide data acquisition, display, modification and storage. Beyond these basic features, the presented calibration system is required to cope with different calibration parameters of embedded controllers, different HMIs (human-machine interfaces), different calibration protocols and the details of underlying devices on various communication buses. To achieve these goals, the calibration host software must be sufficiently general and adaptable.
The configuration-based approach introduces an XML (eXtensible Markup Language) file as the interface describing the configuration of the HMI and the parameters, and separates the calibration software into an editing environment and a running environment. The editing environment provides a visual development interface for customizing how the calibration parameters are presented, stored as an XML configuration file, and the running environment parses the XML file to generate the HMI of the calibration host software automatically. When the parameters or the embedded controller are updated, only the HMI customization in the editing environment is needed; no new
software update is required. Adaptability to different calibration parameters and HMIs is thus obtained.
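As an illustration of the configuration-driven approach, the sketch below parses a small XML description and "generates" one widget per entry. The element and attribute names are hypothetical; the real schema used by the paper's editing environment is not given in this text.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration produced by the editing environment.
CONFIG = """
<calibration_hmi controller="EngineECU">
  <parameter name="InjPulseWidth" unit="ms"  widget="table"  min="0"   max="20"/>
  <parameter name="IgnAdvance"    unit="deg" widget="slider" min="-10" max="45"/>
  <signal    name="EngineSpeed"   unit="rpm" widget="gauge"  period_ms="100"/>
</calibration_hmi>
"""

def build_hmi(xml_text: str) -> None:
    """Parse the configuration and 'generate' one widget per entry."""
    root = ET.fromstring(xml_text)
    print(f"HMI for controller: {root.get('controller')}")
    for element in root:
        kind = "calibration" if element.tag == "parameter" else "monitor"
        print(f"  {kind:11s} widget={element.get('widget'):6s} "
              f"{element.get('name')} [{element.get('unit')}]")

if __name__ == "__main__":
    build_hmi(CONFIG)
```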
Furthermore, by combining the configuration approach with the OPC approach, a new solution with even more generality and adaptability can be obtained that accommodates various calibration protocols, underlying devices and communication buses, further improving the generality of the calibration system. The architecture of the corresponding calibration system is shown in Figure 4.
Figure 4. Architecture of the OPC-based general calibration system
In this architecture, the running environment of the calibration system is separated into an OPC server and an OPC client. The OPC server encapsulates the calibration protocols and the driver of the communication device, and provides a unified data access interface to the embedded controller for the OPC client. The OPC client is therefore independent of the details of the calibration protocol and the communication device, and acquires and modifies the parameters of the embedded controller through the OPC interface. The structure of the OPC-based running environment is shown in Figure 5.
Figure 5. Structure of the calibration system based on OPC
In this structure, like the user interface layer of a conventional calibration system, the OPC client provides the HMI for measurement and calibration. Corresponding to the measurement and calibration functions, there are two types of HMI: the monitor interface and the calibration interface. The monitor interface displays the running state of the embedded controller in various forms, while the calibration interface is used to manipulate the calibration parameters. Similarly, like the communication layer of a conventional calibration system, the OPC server accesses the parameters of the embedded controller via the CCP or XCP protocol, but it exposes a unified communication interface to the OPC client above it. The OPC server consists of the communication hardware driver, the calibration protocol, a data buffer and the OPC interface; behind the standard OPC interface, the details of the hardware driver and the calibration protocol are concealed.
Furthermore, because an OPC client can access multiple servers simultaneously, the client and the servers of the OPC-based calibration system can be deployed at different network nodes to support distributed calibration. The OPC client can concurrently access data from different embedded controllers that use different protocols and hardware. Repeated development of calibration systems for different embedded controllers is thus avoided, and the improved universality of the calibration system raises the development efficiency of the embedded software.
C. Design of the OPC Server
As in the industrial control field, the OPC server of the calibration system is designed to shield the differences between underlying devices. A standard OPC server is a COM component that complies with the OPC specifications, an industrial standard established by the OPC Foundation. The most fundamental of these, the Data Access specification, defines a mechanism for real-time data communication [20]. Functionally, the OPC server can be divided into four layers: the communication interface layer, the communication layer (calibration protocol stack), the data management layer and the OPC interface layer. The structure of the calibration system OPC server is shown in Figure 6.
2870 JOURNAL OF SOFTWARE, VOL. 7, NO. 12, DECEMBER 2012<br />
Figure 6. Structure of the calibration system OPC server
The communication interface layer is the lowest layer of the OPC server; it contains the driver interfaces provided by the different physical layer devices and the communication interfaces provided by the various buses. Because the OPC client accesses the data of the embedded controller through the OPC interface rather than through the driver of the underlying device, even if the device or driver changes, only the corresponding OPC server has to be replaced; the calibration system itself does not need to be updated to accommodate the new hardware. The calibration system can therefore support various devices and buses, such as CAN, LIN, Ethernet and FlexRay.
The communication layer encapsulates the calibration protocols; CCP and XCP are the two typical calibration protocols supporting different buses. Again, the OPC client does not access the calibration protocol stack directly, so only the corresponding OPC server needs to be replaced when the calibration protocol changes, which improves the generality of the calibration system accordingly.
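For readers unfamiliar with CCP, the sketch below builds a few 8-byte Command Receive Objects (CROs) of the kind such a communication layer would exchange over CAN. The command codes and byte layouts are assumptions based on common descriptions of CCP 2.1 and should be verified against the ASAM specification before any real use; the byte order of the parameters in particular is assumed here.

```python
import struct

# CCP 2.1 command codes as commonly documented; verify against the ASAM
# CCP specification before use -- these values are assumptions.
CONNECT, SET_MTA, DNLOAD = 0x01, 0x02, 0x03


def cro(cmd: int, ctr: int, payload: bytes = b"") -> bytes:
    """Build an 8-byte Command Receive Object: command code, command
    counter, then parameters padded to 8 bytes."""
    assert len(payload) <= 6
    return bytes([cmd, ctr]) + payload.ljust(6, b"\x00")


def connect_frame(ctr: int, station_address: int) -> bytes:
    # Station address packed little-endian here; byte order is an assumption.
    return cro(CONNECT, ctr, struct.pack("<H", station_address))


def set_mta_frame(ctr: int, address: int, extension: int = 0) -> bytes:
    # MTA number 0, address extension, then a 32-bit address (big-endian
    # assumed for illustration).
    return cro(SET_MTA, ctr, bytes([0, extension]) + struct.pack(">I", address))


def dnload_frame(ctr: int, data: bytes) -> bytes:
    # Download up to 5 data bytes to the current MTA.
    assert 1 <= len(data) <= 5
    return cro(DNLOAD, ctr, bytes([len(data)]) + data)


if __name__ == "__main__":
    for frame in (connect_frame(1, 0x0200),
                  set_mta_frame(2, 0x40001000),
                  dnload_frame(3, b"\x12\x34")):
        print(frame.hex(" "))
```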
The data management layer manages two kinds of data: metadata from the ASAM database, and measurement and calibration data from the embedded controller. Following the ASAM architecture, the metadata from the ASAM database describes the parameters of the embedded controller, such as variable name, data type and address. The OPC server uses this metadata to configure the DAQ-ODT (Data AcQuisition - Object Descriptor Table) tables of the embedded controller, after which the controller sends the specified data to the calibration host automatically according to the DAQ-ODT configuration. For the second kind of data, the layer provides a buffer that temporarily stores the real-time measurement data from the controller as well as the calibration data from the OPC client; essentially, this data buffer is an image of the embedded controller's memory.
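A minimal sketch of how the data management layer might pack measurement signals, described by the ASAM metadata, into ODT entries of a DAQ list is shown below. The data structures and the 7-byte payload limit per ODT (typical for CCP DTOs on CAN after the PID byte) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Signal:                  # metadata taken from the ASAM MCD-2 database
    name: str
    address: int
    size: int                  # bytes

@dataclass
class Odt:                     # one Object Descriptor Table
    entries: List[Signal] = field(default_factory=list)
    def used(self) -> int:
        return sum(s.size for s in self.entries)

def build_daq_list(signals: List[Signal], odt_payload: int = 7) -> List[Odt]:
    """Pack signals into ODTs; with CCP over CAN each DTO typically carries
    7 data bytes after the PID byte (an assumption to verify per transport)."""
    odts: List[Odt] = [Odt()]
    for sig in sorted(signals, key=lambda s: s.size, reverse=True):
        if odts[-1].used() + sig.size > odt_payload:
            odts.append(Odt())
        odts[-1].entries.append(sig)
    return odts

if __name__ == "__main__":
    metadata = [Signal("EngineSpeed", 0x40002000, 2),
                Signal("ManifoldPressure", 0x40002002, 2),
                Signal("CoolantTemp", 0x40002004, 1),
                Signal("InjPulseWidth", 0x40002010, 4)]
    for i, odt in enumerate(build_daq_list(metadata)):
        print(f"ODT {i}: " + ", ".join(f"{s.name}@0x{s.address:08X}"
                                       for s in odt.entries))
```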
The OPC interface layer provides standard interfaces conforming to the OPC Data Access specification. Through the
OPC interface, calibration data from the client can be sent to the controller and measurement data from the controller can be acquired, so the details of the communication devices and calibration protocols are completely concealed.
D. Design of the OPC Client
The goal of the calibration system client is to provide a friendly HMI and effective communication. The former is achieved through user interface customization and configuration, and the latter by accessing the embedded controller through the OPC server. The OPC client of the calibration system consists of three layers: the HMI layer, the data management layer and the OPC interface layer. The OPC client model is shown in Figure 7.
The OPC interface layer of the client provides the access mechanism to the OPC server. Through these interfaces a callback mechanism is implemented to achieve bidirectional communication between client and server, so the OPC client can send calibration data to the controller and acquire the controller's measurement data from the OPC server asynchronously, in either polling or publish/subscribe mode.
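The sketch below illustrates the publish/subscribe interaction in plain Python. It is not the real OPC Data Access COM API; a toy in-process server stands in for the OPC server so the callback and polling paths can be shown side by side.

```python
import threading
import time
from typing import Callable, Dict, List


class SimpleOpcServer:
    """Toy stand-in for the OPC server: holds item values and notifies
    subscribed clients when they change (publish/subscribe mode)."""

    def __init__(self) -> None:
        self._items: Dict[str, float] = {}
        self._subs: List[Callable[[str, float], None]] = []
        self._lock = threading.Lock()

    def subscribe(self, callback: Callable[[str, float], None]) -> None:
        self._subs.append(callback)

    def write(self, item: str, value: float) -> None:
        with self._lock:
            self._items[item] = value
        for cb in self._subs:
            cb(item, value)

    def read(self, item: str) -> float:          # polling mode
        with self._lock:
            return self._items[item]


def on_data_change(item: str, value: float) -> None:
    """Client-side callback: update the monitor HMI (here, just print)."""
    print(f"update: {item} = {value}")


if __name__ == "__main__":
    server = SimpleOpcServer()
    server.subscribe(on_data_change)

    def daq() -> None:
        # Simulated DAQ task pushing measurement data into the server.
        for rpm in (800, 1500, 2400):
            server.write("EngineSpeed", rpm)
            time.sleep(0.1)

    threading.Thread(target=daq).start()
    server.write("InjPulseWidth", 2.4)           # calibration value from HMI
    time.sleep(0.5)
    print("polled:", server.read("EngineSpeed"))
```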
Figure 7. Model of the calibration system client
The client-side data management layer parses the ASAM database to obtain the calibration metadata, and provides a data buffer that stores the real-time measurement data of the embedded controller received from the OPC server together with the calibration data. As on the server side, this data buffer corresponds to the memory of the embedded controller, but its data are converted according to the calibration metadata.
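To show what "converted according to calibration metadata" can mean in practice, here is a simplified record of an ASAM MCD-2 (A2L) measurement entry and a linear raw-to-physical conversion. The field set is heavily reduced, a real A2L entry carries much more (conversion methods, limits, record layouts), and the example values are invented.

```python
from dataclasses import dataclass
import struct


@dataclass
class MeasurementMeta:
    """Simplified view of one ASAM MCD-2 (A2L) MEASUREMENT entry."""
    name: str
    fmt: str        # struct format of the raw value, e.g. "<H" = uint16 LE
    address: int    # ECU address
    factor: float   # linear conversion: physical = factor * raw + offset
    offset: float
    unit: str


def to_physical(meta: MeasurementMeta, raw_bytes: bytes) -> float:
    raw = struct.unpack(meta.fmt, raw_bytes)[0]
    return meta.factor * raw + meta.offset


if __name__ == "__main__":
    speed = MeasurementMeta("EngineSpeed", "<H", 0x40002000,
                            factor=0.25, offset=0.0, unit="rpm")
    # Raw bytes as they would arrive in a DTO from the controller.
    print(to_physical(speed, b"\x40\x1f"), speed.unit)   # 0x1F40 * 0.25 = 2000.0
```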
Figure 8. Sequence diagram of the OPC-based calibration system
The human-machine interface layer includes the calibration and monitor interfaces, which display the calibration and measurement parameters respectively. Users modify the calibration parameters through the calibration interfaces, and the modified parameters are sent promptly to the server through the OPC interface; the OPC server then sends the data to the embedded controller after the necessary data conversion. The monitor interfaces periodically update the display of the measurement data in various styles, such as scopes, meters or grids.
The OPC client and server described above are integrated to realize the host software of the calibration system. The sequence diagram of the interaction between the OPC server and client is shown in Figure 8. During the calibration process, the OPC server first connects to the embedded controller, configures the controller's DAQ-ODT tables according to the parsed calibration metadata, and then issues the DAQ start command so that the controller parameters are received continuously. The OPC client then connects to the server and begins measurement and calibration.
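The start-up order just described for Fig. 8 can be summarized in code form. The server and client objects below are bare stubs whose method names are invented for this sketch; only the ordering of the calls reflects the text.

```python
class StubServer:
    metadata = ["EngineSpeed", "InjPulseWidth"]
    def connect_ecu(self, station): print(f"CONNECT ECU 0x{station:04X}")
    def configure_daq(self, meta):  print("configure DAQ-ODT for", meta)
    def start_daq(self):            print("start DAQ: measurement streaming")

class StubClient:
    def connect(self, server):      print("client connected to OPC server")
    def subscribe(self, cb):        print("client subscribed for data changes")
    def on_data_change(self, *a):   pass

def start_calibration_session(server, client, ecu_station=0x0200):
    """Start-up order sketched from the description of Fig. 8."""
    server.connect_ecu(ecu_station)           # 1. connect to the ECU
    server.configure_daq(server.metadata)     # 2. write DAQ-ODT tables
    server.start_daq()                        # 3. measurement streaming begins
    client.connect(server)                    # 4. OPC client attaches
    client.subscribe(client.on_data_change)   # 5. HMI receives updates

if __name__ == "__main__":
    start_calibration_session(StubServer(), StubClient())
```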
IV. TEST AND APPLICATION<br />
To verify the effectiveness of the presented method, a calibration system following the method was developed and used to calibrate an engine ECU. In the calibration bench shown in Figure 9, an industrial PC with a USB interface serves as the host computer, and a USB-CAN adapter connects the host to the ECU.
Figure 9. Calibration bench for engine ECU<br />
The OPC server corresponds to the USB-CAN adapter, and CCP is used as the calibration protocol. Both the OPC server and the client of the calibration host software are installed on the industrial PC.
In the calibration experiment, the OPC server runs first and carries out a series of initialization and configuration operations, such as importing the ASAM database to obtain the metadata of the calibration parameters, connecting to the engine ECU and configuring the DAQ-ODT tables. Measurement data describing the running state of the ECU are then sent to the OPC server continuously.
Subsequently, the OPC client starts and connects to the OPC server. The ASAM database and the XML configuration file are imported to generate the calibration and monitor interfaces automatically, and the measurement data are acquired and displayed continuously. At the same time, the control parameters of the engine ECU can be updated in the HMI, sent to the OPC server through the OPC interface and forwarded to the ECU to perform the calibration. The corresponding host software HMI of the engine ECU calibration system is shown in Figure 10.
Using the calibration bench together with the corresponding measurement instruments and equipment, the engine ECU was calibrated to obtain a set of optimal control parameters. The main parameters include the fuel injection pulse width, the ignition advance angle and so on. Because the fuel injection pulse width directly influences the air-fuel ratio, its calibration is one of the most important tasks for an engine ECU. Figure 11 shows the calibration result for the fuel injection pulse width as a MAP graph, where the blue axis denotes engine speed and the red axis denotes intake manifold pressure; Figure 12 shows the corresponding fuel injection pulse width table.
Figure 10. The client interface of the ECU
Figure 11. MAP graph of the fuel injection pulse width
Figure 12. Fuel Injection Pulse Width Table<br />
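The MAP of Figs. 11 and 12 is a two-dimensional table indexed by engine speed and intake manifold pressure. A common way for a controller to use such a table is bilinear interpolation between breakpoints, sketched below with made-up numbers; the actual calibrated values are those of Fig. 12.

```python
import bisect

# Made-up breakpoints and pulse-width values (ms); real calibration data
# would come from the table in Fig. 12.
SPEED = [800, 1600, 2400, 3200]          # rpm
PRESSURE = [30, 60, 90]                  # kPa
PW = [[1.8, 2.6, 3.4],                   # one row per speed breakpoint
      [2.0, 2.9, 3.8],
      [2.2, 3.2, 4.2],
      [2.4, 3.5, 4.6]]

def interp_axis(axis, x):
    """Return (lower index, fraction) for interpolation along one axis."""
    x = min(max(x, axis[0]), axis[-1])
    i = max(min(bisect.bisect_right(axis, x), len(axis) - 1) - 1, 0)
    frac = (x - axis[i]) / (axis[i + 1] - axis[i]) if i + 1 < len(axis) else 0.0
    return i, frac

def pulse_width(speed, pressure):
    """Bilinear interpolation in the speed x pressure map."""
    i, fx = interp_axis(SPEED, speed)
    j, fy = interp_axis(PRESSURE, pressure)
    jn = min(j + 1, len(PRESSURE) - 1)
    i2 = min(i + 1, len(SPEED) - 1)
    row_lo = PW[i][j] * (1 - fy) + PW[i][jn] * fy
    row_hi = PW[i2][j] * (1 - fy) + PW[i2][jn] * fy
    return row_lo * (1 - fx) + row_hi * fx

if __name__ == "__main__":
    print(round(pulse_width(2000, 75), 3), "ms")   # -> 3.525 ms
```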
After replacing the USB-CAN adapter with a PCI-CAN adapter, developing the corresponding OPC server for the new adapter and repeating the calibration experiment, the same result as above was obtained. This shows that when the underlying hardware interface changes, only the middleware needs to be updated and the rest of the calibration host software stays the same. A subsequent vehicle test also verified the effectiveness of the calibrated parameters. The calibration system built according to the presented method can therefore adapt to different underlying hardware.
The adaptability of the calibration system to different communication buses and calibration protocols can be verified in the same way. The presented method thus improves adaptability and generality, reduces the development cost and time of the calibration system and, ultimately, enhances the development efficiency and quality of embedded controller software.
V. CONCLUSIONS<br />
Calibration is one of the key technologies in the development of embedded controller software. Existing calibration systems are strongly coupled to a specific communication device or calibration protocol, which makes it difficult to meet the demand for fast, efficient and reliable development of embedded software.
The OPC-based calibration system presented here is separated into an OPC server and an OPC client in order to isolate the underlying
hardware and calibration protocol from the rest of the system. The OPC server masks the differences between calibration protocols and the details of the communication devices, and provides a unified data access interface. Once the hardware or the calibration protocol changes, only the OPC server needs to be updated and the rest of the calibration system stays the same. The adaptability and openness of the calibration system are thus enhanced, improving the development efficiency of embedded software in industrial control fields.
ACKNOWLEDGMENT<br />
This work is supported by the Science and Technology Project of the Chongqing Municipal Education Commission under Grant No. KJ110521, and by the Chunhui Project of the Ministry of Education of China under Grant No. z2009-1-63019.
REFERENCES<br />
[1] Kiesel Rainer, Streubühr Martin, Haubelt Christian,<br />
Löhlein Otto and Teich Jürgen, “Calibration and validation<br />
of software performance models for pedestrian detection
systems,” In Int. Conf. Embedded Comput. Syst.: Archit.,<br />
Model. Simul., IC-SAMOS, pp.182-189, 2011<br />
[2] Huizong Feng, Ming Cen, Yu Zhang, Jianchun Jiang and<br />
Huasheng Dai, “A weak coupled calibration system<br />
architecture for electronic control unit,” In IEEE Veh.<br />
Power Propul. Conf., VPPC, pp.1-4, 2008<br />
[3] Bernd Hardung, Thorsten Kölzow and Andreas Krüger,<br />
“Reuse of software in distributed embedded automotive
systems,” In Fourth ACM Int. Conf. Embedded Softw.,
EMSOFT, pp.203-210, 2004<br />
[4] Manfred Broy, “Challenges in automotive software
engineering,” In Int. Conf. Software Eng., ICSE'06, pp.33-
42, 2006<br />
[5] Klaus Grimm, “Software technology in an automotive
company - Major challenges,” In Int. Conf. Software Eng.,
pp.498-503, 2003<br />
[6] Th. Scharnhorst, H. Heinecke, K.-P. Schnelle and H.<br />
Fennel, et al. “AUTOSAR - Challenges and achievements<br />
2005,” VDI Berichte, no.1907, pp.395-408, 2005<br />
[7] Ron Bell, “Introduction & revision of IEC 61508,”
Measurement and Control, vol.42, no.6, pp.174-179, 2009<br />
[8] B. Dion and J. Gartner, “Efficient development of
embedded automotive software with IEC 61508 objectives
using SCADE drive,” VDI Berichte, no.1907, pp.237-247,<br />
2005<br />
[9] M. Beham, M. Etzel, and D. L. Yu, “Development of a new
automatic calibration method for control of variable valve
timing,” Proc. Inst. Mech. Eng. Part D J. Automob. Eng.,<br />
vol.218, pp.707-718, 2004<br />
[10] ASAM, “AE MCD-2MC: ASAP2 interface specification<br />
v.51, ” http://www.asam.net, 2009<br />
[11] S. Bienk, “ASAM ODX: syntax as semantics,” In Int. Conf.<br />
Software Eng., pp.583-592, 2008
[12] John Chatzakis, Kostas Kalaitzakis, Nicholas C. Voulgaris<br />
and Stefanos N. Manias, “Designing a new generalized<br />
battery management system,” IEEE Trans. Ind. Electron.,<br />
vol.50, pp.990-996, 2003<br />
[13] C. M. Vong, P. K. Wong and H. Huang, “Case-based<br />
reasoning for automotive engine electronic control unit<br />
calibration,” In Int. IEEE Conf. Inf. Autom., ICIA, pp.1380-<br />
1385, 2009
[14] Shiwei Yang, Lin Yang and Bin Zhuo, “Developing a<br />
multi-node calibration system for can bus based vehicle,”<br />
In IEEE Int. Conf. Veh. Electron. Saf., ICVES, pp.199-203,<br />
2006<br />
[15] G. S. King, R. Peter Jones, Andrew D. Bailey,<br />
“Application <strong>of</strong> systems modeling and simulation in the<br />
discrete ratio automatic transmission calibration process<br />
for an automotive,” ME Dyn Syst Control Div Publ DSC,<br />
vol.72, pp.935-944, 2003<br />
[16] Xiaojun Fang, Yinnan Yuan, Jiayi Du, Xing Wu and Mingquan<br />
Jia, “Development of ECU calibration system for
electronic controlled engine based on Labview,” In Int.<br />
Conf. Electr. Inf. Control Eng., ICEICE - Proc., pp.4930-<br />
4933, 2011<br />
[17] Ming Cen, Yi Yan and Huasheng Dai, “General calibration<br />
system architecture of automotive electronic control unit,”
Journal of Computers, vol.5, pp.1894-1898, 2010
[18] G. Blair, A. T. Campbell and D. C. Schmidt, “Middleware<br />
technologies for future communication networks,” IEEE<br />
Network, vol.18, pp.1-4, 2004<br />
[19] Al Chisholm, “Technical overview of the OPC data access
interfaces,” ISA TECH EXPO Technol. Update, vol. 2, no.1,<br />
pp.63-72, 1998<br />
[20] Renjie Huang and Feng Liu, “Research on OPC UA based<br />
on electronic device description,” In IEEE Conf. Ind.<br />
Electron. Appl., ICIEA, pp.2162-2166, 2008<br />
[21] F. G. Chatzipapadopoulos, M. K. Perdikeas and I. S.<br />
Venleris, “Mobile agent and CORBA technologies in the<br />
broadband intelligent network,” IEEE Commun. Mag.,<br />
vol.38, pp.116-124, 2000<br />
Ming Cen received the PhD degree in Optical Engineering from the Graduate University of the Chinese Academy of Sciences. He has worked on a number of projects related to automation and communication techniques. His research interests include information fusion, target tracking and recognition, embedded systems and intelligent vehicles.
Qian Liu and Yi Yan are Master's degree candidates in Control Theory and Control Engineering at Chongqing University of Posts and Telecommunications. Their main research interests are automotive electronics and embedded systems.
WS-mcv: An Efficient Model Driven<br />
Methodology for Web Services Composition<br />
Fayçal Bachtarzi<br />
Department of Computer Science, University Mentouri, Constantine, Algeria
Email: bachtarzi@misc-umc.org<br />
Allaoua Chaoui<br />
Department of Computer Science, University Mentouri, Constantine, Algeria
Email: a chaoui2001@yahoo.com<br />
Elhillali Kerkouche<br />
Department of Computer Science, University of Jijel, Jijel, Algeria
Email: elhillalik@yahoo.fr<br />
Abstract— Web services are applications available on the Web that can be invoked by users to accomplish a given business task. To meet users' requirements, however, it often becomes necessary to dynamically organize existing services and combine them so that they serve a new purpose. In this paper, we propose a methodology called WS-mcv (Web Service Modeling, Composing and Verifying) that addresses the main problems arising in the Web service composition area. WS-mcv is an efficient, modular, multistep approach obtained by breaking service composition into three processes: service modeling, automatic composition and formal verification. The methodology uses the G-Net framework to ease the modeling of basic, existing services, and we propose a collection of expressive G-Net-based operators that handles complex Web service composition. WS-mcv also defines means of ensuring composition correctness. All the processes of WS-mcv have been automated in a visual environment based on model transformation.
Index Terms— Web services composition, G-Nets, MDE, graph transformation, AToM3, G-Net algebra
I. INTRODUCTION<br />
Web services are software components available on the Web that implement business collaborations between corporations. They can be invoked over the Internet to accomplish a given business task, making interactions between applications and e-customers possible. Programs or external users access Web services using standard Internet protocols such as Universal Description, Discovery, and Integration (UDDI) [1], the Web Service Description Language (WSDL) [2], and the Simple Object Access Protocol (SOAP) [3]. Individual Web services provide specific or general functionality and, in most cases, cannot satisfy a user's requirements on their own. To provide users with customized services, it becomes necessary to combine existing basic services; the process that achieves this is called Web service composition. Current solutions based on UDDI, WSDL and SOAP address the description, publication, discovery and interoperability of Web services, but they do not handle their complex composition. Research in the area of service composition has therefore focused on providing models expressed in different
formalisms. Some proposals use different kinds of Petri nets: basic Petri nets [4], colored Petri nets [5] [6] and object-oriented Petri nets [7]. Other proposals exploit the semantic features offered by ontologies [8] [9] [10].
In this paper, we address the Web service composition problem by defining an efficient multistep methodology called WS-mcv (Web Service Modeling, Composing and Verifying). WS-mcv has the advantage of addressing the main problems arising in the Web service composition area: it breaks the service composition process into several phases and offers solutions for 1) specifying services, 2) composing them automatically and 3) ensuring their correctness. For Web service specification, we propose a set of modeling rules that allows Web services to be modeled in a high-level Petri net framework called G-Nets [11]. For service composition, we define a G-Net-based algebra that handles complex compositions; the algebra supports basic constructs as well as more elaborate ones, and all of its operators are syntactically and semantically defined by means of G-Nets. To ensure the correctness of the composed Web services, we exploit translation rules [12] that transform G-Net specifications into equivalent Predicate/Transition nets (PrT-nets) [13]. Unlike approaches that develop their own verification tools [14], we perform this transformation in order to exploit existing tools offering a variety of analysis techniques for PrT-nets. Each of the well-defined phases of our methodology is carried out by a different process, and each process has been automated. Since a main requirement of our approach is to offer a high level of genericity and to abstract away from any particular implementation, we use Model Driven Engineering (MDE) techniques, which support model evolution and manipulate models as instances of meta-models. The modeling process is implemented as a visual environment that allows services to be designed according to a G-Net meta-model, while the composition and verification processes, including the proposed operators, are implemented with graph transformation techniques. The remainder of this paper is organized as follows. In the next section, we present some related work. Section 3 outlines
the overall approach and presents the different phases of the WS-mcv methodology. We introduce the modeling, composition and verification processes in Sections 4, 5 and 6 respectively; for each of them, we describe its operating mode and the solutions adopted for its implementation. We conclude the paper by summarizing the main contributions and identifying future research directions.
II. RELATED WORK<br />
Various techniques for Web service composition have been suggested in the literature. Most of them try to provide languages, semantic models and platforms offering efficient solutions to this problem. Syntactic (XML-based) service composition [15] has a limited ability to support automatic composition, essentially because of the absence of semantic representations of the available services. Composition languages such as BPEL4WS [16] provide a set of primitives that allows interaction between the services being composed, but in these approaches the process flow and the bindings between services are specified in advance. In contrast, semantic approaches [8] [10] [5] [6] allow various aspects of Web services to be described with machine-understandable semantics or on a solid mathematical basis. Semantic approaches fall mainly into two categories: ontology-driven approaches and Petri-net-based approaches. In the following, we present some works belonging to each of these classes.
A. Ontology-driven approaches
Ontology-driven approaches to Web service composition [8] [9] [17] [10] use terms from pre-agreed ontologies to declare the preconditions and effects of the services concerned. The work in [10] led to OWL-S, an ontology for declaring and describing services; OWL-S provides a standard vocabulary that can be used together with the other aspects of the OWL description language to create service descriptions. Similarly, SAWSDL [8] defines a set of extension attributes for annotating WSDL interfaces and operations, which are then used to publish a Web service in a registry. The proposal of [17] is different in that it uses semantic graph transformations to model Web services: each Web service operation is associated with a semantic annotation that describes the specifications of its input and output messages using RDF graph patterns. The main difference is that in [10] and [8] the inputs and outputs are expressed as concepts, whereas [17] describes them in terms of instance-based graph patterns. While these approaches have the advantage of making the meaning of the messages clear, their main drawback remains the difficulty of discovering the explicit goal of the services, which is a key element when composition is performed by AI planners [18]. WSML [9] also provides a formal syntax for Web service modeling based on Description Logics, First-Order Logic and Logic Programming; it allows axioms with variables to be specified in the pre- and post-conditions of a service capability.
However, it does not have an explicit model for defining the components of a message and their semantics. In all these works, the composition problem is modeled as a planning problem based on a reasoning process that uses semantic descriptions of services. Composing by reasoning is a challenging task, as it is time consuming and relies on a set of goals, plans and rules to design complex processes.
B. Petri-net-based approaches
Existing Web service composition works also use the Petri net framework, both simple Petri nets [4] [19] and high-level Petri nets [7] [5] [6]. In [4] the authors propose a Petri-net-based algebra for modeling Web service control flows. Their model is expressive enough to allow the creation of dynamic and temporary relationships among services, but its main drawback is that data types cannot be distinguished because an elementary Petri net model is used. The work of [6] addresses this problem by modeling and composing Web services with colored Petri nets (CPN) [20]; the proposal offers semantic support that improves the reliability and maintainability of composite services, and also allows the availability, confidentiality and integrity of the composite services to be analyzed. The CPN framework is also exploited in [5], where an efficient algebra is suggested for modeling Web service composition and algorithms for constructing and executing a composite service are provided. These two works appear closely related, even though in [5] the service composition sequence cannot be generated automatically because pre-defined conditions are required.
In the Object-Oriented Petri Nets (OOPN) based approach [7], Web service composition relies on mapping each service to collaborating objects, so that their behavior and communication are easily described with the OOPN model. The approach is particularly interesting because it offers a Web process design tool (WPDT) that allows the composition to be carried out graphically, and the behavior and performance of a system can be checked by studying the process in action. This survey highlights the challenges of integrating existing services to create new value-added ones and the solutions proposed so far. Thanks to their solid theoretical basis, semantic methods are well suited not only to modeling and composing Web services but also to verifying their behavioral correctness. Ontologies are not expressive enough for this task, because they are better at describing the features of a system than its behavior. In our work, we use a kind of object-oriented Petri net for the specification of complex Web services, namely the G-Net framework, which is powerful enough to capture the semantics of Web service combinations. The following section outlines the proposed methodology along with the phases and technologies involved.
III. WS-MCV METHODOLOGY OVERVIEW<br />
Our approach aims to achieve Web service composition through a simple yet powerful methodology, illustrated in Fig. 1.
Figure 1. WS-mcv methodology<br />
breaks the composition process into four phases:<br />
1) Modeling <strong>of</strong> Web services using the basic concepts<br />
<strong>of</strong> the G-Net framework,<br />
2) Pre-verification <strong>of</strong> each modeled Web service,<br />
3) Composition <strong>of</strong> the verified Web services using the<br />
G-Net algebra, and<br />
4) Post-verification <strong>of</strong> the resulting service.<br />
The separation <strong>of</strong> the Web service composition process<br />
into a number <strong>of</strong> well-defined phases has several advantages.<br />
First, it simplifies the composition process, since<br />
each phase has a specific goal. Second, it facilitates the<br />
verification process, since it becomes easier to target<br />
the potential errors. Third, it offers more flexibility for
automating the whole composition process, since each<br />
phase can be independently implemented. The complete<br />
composition process based on the proposed methodology<br />
is illustrated by the UML activity diagram of Fig. 1, which shows the execution order of the different phases as well as the interactions between them. Each phase uses the results of the previous one and is achieved by an independent process. In the following, we give more details about each phase.
A. Modeling phase<br />
The modeling phase is the first step <strong>of</strong> WS-mcv. Its role<br />
is to translate the services specifications into G-Nets. The<br />
basic concepts <strong>of</strong> G-Nets are: the interface and the internal<br />
structure. These two concepts are used for designing<br />
the component services while taking into account the<br />
modeling constraints imposed by the G-Net framework.<br />
The intention is, on the one hand, to benefit from the modularity and flexibility offered by this formalism and, on the other hand, to exploit its ease of conceptual modeling. This phase is achieved by the modeling process, which is fully described in Section IV.
B. Composition phase<br />
The composition phase takes as input the G-Nets representing the services to compose, together with a composition formula, and generates a new value-added service as a G-Net. To perform this operation, we propose a G-Net based algebra. This algebra offers a representative set of operators that can be applied to G-Net services. The composition formula provided as input is in fact an algebraic expression whose operands are the handled services and whose operators are graph-manipulation operations performed on them. The complete composition process accomplishing this phase is presented in detail in Section V.
C. Pre/Post-verification phases<br />
These are respectively the second and the fourth phases of WS-mcv, and both are achieved by the verification process. Performing verification both before and after composition has the main advantage of facilitating this operation: traditional approaches either do not perform verification at all or only verify the resulting service, which makes it difficult to localize potential errors. Unlike these approaches, ours detects whether anomalies occur in the component services or in the composite one. The pre-verification phase is carried out before composition. It verifies whether the modeled G-Net services will execute as expected and do not contain behavioral inconsistencies such as deadlock or livelock; it is preferable to detect and correct possible errors as early as possible. If necessary, steps (1) and (2) are repeated until the specification of the modeled services passes the verification. The post-verification phase is applied after composition in order to check the correctness of the resulting composition, i.e., that the integration of the partner services runs correctly. Hence, steps (3) and (4) may also be repeated until the composition of the concerned services passes the verification.
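Expressed as a control loop, the four phases and their repetitions can be sketched as follows (an illustrative Python sketch only; model_service, pre_verify, compose and post_verify are hypothetical placeholders standing for the processes described in this section, stubbed out here so that the sketch runs):

def model_service(spec):        # Phase 1 placeholder: translate a specification into a G-Net
    return {"spec": spec}

def pre_verify(gnet):           # Phase 2 placeholder: check one G-Net for deadlock/livelock
    return True

def compose(gnets, formula):    # Phase 3 placeholder: apply the G-Net algebra
    return {"formula": formula, "parts": gnets}

def post_verify(composite):     # Phase 4 placeholder: check the composite service
    return True

def ws_mcv(specs, formula):
    gnets = []
    for spec in specs:
        gnet = model_service(spec)
        while not pre_verify(gnet):         # repeat steps (1) and (2) until verification passes
            gnet = model_service(spec)
        gnets.append(gnet)
    composite = compose(gnets, formula)
    while not post_verify(composite):       # repeat steps (3) and (4) until verification passes
        composite = compose(gnets, formula)
    return composite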
D. WS-mcv realization<br />
As defined above, the WS-mcv methodology accomplishes Web service composition in several steps, each achieved by a specific process. Our proposal, as we will see in the next sections, does not remain at the descriptive level: we describe not only the operating mode of each process, but also the techniques adopted for its implementation. To implement the WS-mcv processes, we have identified three requirements that must be met by our system:
1) It shall support the evolution of the modeling language used, i.e. possible extensions of the G-Net framework.
2) It shall be user-friendly, allowing users to design and manipulate models (G-Net specifications) in a direct and intuitive way.
3) It shall offer a high level of genericity that allows users to abstract away from any particular implementation.
To meet these requirements, we propose to:
1) Specify the syntax of the visual modeling language (G-Net) by means of meta-modeling.
2) Exploit a visual environment that allows designing services according to the G-Net meta-model.
3) Express model manipulation (i.e. composition) by means of graph transformation.
4) Use MDE (Model Driven Engineering) techniques to provide a generic approach that manipulates models as instances of meta-models.
IV. MODELING USING G-NET FRAMEWORK<br />
In this section, we first present the G-Net framework,<br />
and then we give formal definitions of G-Net services and Web services, together with some modeling rules. We
finally describe the operating mode and the implementation<br />
<strong>of</strong> the modeling process.<br />
A. The G-net framework<br />
G-Net is a Petri Net based framework introduced by<br />
[11]. It is used for the modular design and specification<br />
<strong>of</strong> complex and distributed information systems. This<br />
framework provides a formalism that extensively adopts object-oriented structuring into Petri Nets. The intention is to take advantage of the formal treatment and expressive power of Petri Nets and, at the same time, to benefit from the object-oriented approach (reusable software, extensible components, encapsulation, etc.). A system designed with the G-Net framework consists of a set of autonomous and loosely coupled modules called G-Nets. Similar to an object in object-oriented programming, a G-Net satisfies the encapsulation property: a module can only access another one through a well-defined mechanism called the G-Net abstraction. A G-Net is composed of two parts: the Generic Switch Place (GSP) and the Internal Structure (IS) of the G-Net. The GSP is a special place that represents the visible part of the G-Net, i.e. the interface between a G-Net and the other ones. The internal structure is the hidden part of the G-Net; it represents the internal realization of the designed system. The notation used for the IS specification is very close to the Petri Net notation [21]. For a more elaborate introduction to G-Nets, the reader is referred to [11] [22]. Like a G-Net system, a set of Web services can be viewed as a distributed system consisting of loosely coupled modules that communicate through message exchange. Thus, modeling Web services using G-Nets is straightforward.
B. Web services as G-nets<br />
In order to reduce specification ambiguity and to help designers understand the description and possible behaviors of Web services, we give formal definitions of G-Net services and Web services.
Definition 1. (G-Net Service) A G-Net service is a G-Net S = (GSP, IS) where:
• GSP = (MS, AS) is a special place that represents the abstraction of the service, where:
– MS is a set of executable methods of the form <MtdName><description> = {[P1: description, ..., Pn: description](<InitPl>)}, where <MtdName> and <description> are respectively the name and the description of the method, <P1: description, ..., Pn: description> is the set of arguments of the method, and <InitPl> is the name of the initial place of the method.
– AS is a set of attributes of the form <attribute-name> = {<type>}, where <attribute-name> is the name of the attribute and <type> is its type.
• IS = (P, T, W, l) is the internal structure of the service, a modified predicate/transition net [13], where:
– P = NP ∪ ISP ∪ GP is a finite and nonempty set of places, where NP is the set of normal places (denoted by circles), ISP is the set of instantiated switch places (denoted by ellipses) used to interconnect G-Nets, and GP is the set of goal places (denoted by double circles) used to represent the final state of a method's execution.
– T is a set of transitions.
– W ⊆ (P × T) ∪ (T × P) is a set of directed arcs (the flow relation).
– l : P → O ∪ {τ} is a labeling function, where O is a set of operation names and τ is a silent operation.
Definition 2. (Web Service) A Web service is a tuple S = (NameS, Desc, Loc, URL, CS, SGN) where:
• NameS is the name of the service, used as its unique identifier;
• Desc is the description of the provided service; it summarizes what functionalities the service offers;
• Loc is the server on which the service is located;
• URL is the invocation address of the Web service;
• CS is the set of component services of the Web service; if CS = {NameS} then S is a basic service, otherwise S is a composite service;
• SGN = (GSP, IS) is the G-Net modeling the dynamic behavior of the service.
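For illustration only, the two definitions can be mirrored by minimal Python data structures (a sketch under our own naming; the field names paraphrase the definitions and do not belong to any existing library):

from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

@dataclass
class GSP:
    methods: Dict[str, dict]        # MS: method name -> {description, arguments, initial place}
    attributes: Dict[str, str]      # AS: attribute name -> type

@dataclass
class InternalStructure:
    places: Set[str]                # P = NP ∪ ISP ∪ GP
    transitions: Set[str]           # T
    arcs: Set[Tuple[str, str]]      # W ⊆ (P x T) ∪ (T x P)
    labels: Dict[str, str]          # l: place -> operation name or 'tau'

@dataclass
class GNetService:
    gsp: GSP
    internal: InternalStructure

@dataclass
class WebService:
    name: str                       # NameS: unique identifier
    desc: str                       # Desc: what the service offers
    loc: str                        # Loc: hosting server
    url: str                        # URL: invocation address
    components: List[str]           # CS: component services
    sgn: GNetService                # SGN: G-Net modeling the dynamic behavior

    def is_basic(self) -> bool:     # CS = {NameS} characterizes a basic service
        return self.components == [self.name]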
Since Web service designers may be unfamiliar with G-Nets, we present rules for modeling a Web service in terms of G-Net concepts (a small illustrative sketch follows this list).
1) Each Web service is represented by a distinct G-Net.
2) A service operation is modeled by a method of the G-Net; each method is then associated with a piece of Petri Net in the IS of the G-Net.
3) Messages exchanged between the service and its customers are modeled by tokens.
4) The state of the service is modeled by the position of the tokens in the G-Net.
5) Synchronization and coordination of information exchange between places is modeled by a transition with its input and output arcs.
6) Interconnection between different G-Nets is carried out by the ISP notation, which represents the primary communication mechanism (for example, integrating the ISP of a server into the IS of a customer service specifies a client/server relation).
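As a toy illustration of rules 1) to 5), using plain dictionaries and the Checkout example discussed later in this section (the transition name 'pay' is invented for the example), a one-operation service could be mapped as follows:

# Illustrative only: a one-operation service expressed with the G-Net concepts named by the rules.
checkout_gnet = {
    "GSP": {                                    # rule 1: one G-Net per Web service
        "methods": {"Collect": {"args": ["Bill"], "initial_place": "PMC", "goal_place": "GP"}},  # rule 2
    },
    "IS": {                                     # the piece of Petri net associated with the method
        "places": {"PMC", "GP"},
        "transitions": {"pay"},                 # rule 5: a transition synchronizes the two places
        "arcs": {("PMC", "pay"), ("pay", "GP")},
    },
}
state = {"PMC": 1}                              # rules 3 and 4: one token = one pending message,
                                                # and the token position gives the service state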
C. Operating mode<br />
The MDE approach is founded on the massive use of models during all the steps of an application's life cycle. It ensures model stability by using meta-models as structuring elements. For application designers, these techniques avoid the need to hard-code their applications when creating custom modeling environments (domain-specific tools), since MDE raises the level of abstraction from source code to models. Once the meta-model is defined, it is easy to make small modifications to obtain customized variations of the modeling formalism for specific uses. To implement the modeling process, we exploit the powerful features of a tool called AToM3, A Tool for Multi-formalism and Meta-Modeling [23]. AToM3 allows describing (or meta-modeling) the different kinds of formalisms used to model systems. In AToM3, the Entity-Relationship (ER) formalism extended with constraints is available at the meta-meta-level. Therefore, the designer must use the ER formalism when modeling
new meta-formalisms. Given the meta-model <strong>of</strong><br />
the G-Net formalism, ATOM 3 can automatically generate<br />
a visual modeling tool to create and edit models<br />
in this formalism. In the context <strong>of</strong> our work,<br />
we make use <strong>of</strong> the G-Net meta-model defined by<br />
[12].

Figure 2. The G-Nets meta-model

As shown in Fig. 2, the meta-model contains four classes (G-NetsGSP, G-NetsIS, G-NetsPlace, G-NetsTransition) and five relations (G-NetsRealisation, G-Nets-hasPlaceInsid, G-Nets-hasTransitionInsid, G-NetsPl2Tr, G-NetsTr2Pl). To ensure a correct appearance of G-Net models, the G-Nets meta-model associates graphical constraints with each G-Net entity. For example, a place is associated with a circle and a transition with a rectangle. These constraints are specified when creating the meta-model in AToM3. Once the tool is
generated (according to the meta-model), the user interface<br />
buttons allow the designer to create entities <strong>of</strong><br />
his model defined in the G-Nets meta-model. He then<br />
applies the modeling rules defined above to conceptualize<br />
any service in the G-Net formalism. The created G-<br />
Net services can be stored, edited and modified. Fig. 3 illustrates the complete modeling process.

Figure 3. Customized modeling tool with AToM3

After the compilation of the G-Net meta-model, AToM3 only accepts syntactically correct models in this formalism. The right window in the figure shows an example of a modeled service edited by the generated tool. This G-Net service reproduces the behavior of a checkout service. The GSP of the service contains one method (mtd.Collect[Bill: data](PMC)(GP)), which receives the attribute 'Bill' and has PMC and GP as its initial and goal places respectively. The Checkout service checks the payment mode invoked by the client (PMC) and, according to the payment mode, performs the necessary operations.
V. COMPOSING USING THE G-NET ALGEBRA<br />
This section first presents the G-Net based algebra that<br />
allows combining G-Net services and then shows how<br />
the proposed set <strong>of</strong> operators is implemented using Graph<br />
transformation techniques.<br />
A. The G-net based algebra<br />
WS-mcv allows combining existing G-Net services to obtain a new value-added service that best meets end users' requirements. For example, a hotel booking service
can collaborate with a Web mapping service like Google<br />
Maps API Web Service [24] to inform customers about<br />
the location <strong>of</strong> hotels. The collaboration <strong>of</strong> these services<br />
generates a composed Web service which performs the<br />
original individual tasks as well as a new one. Various<br />
constructs for Web service composition have been discussed in previous works [4] [25] [26]. Based on these works, we present an algebra that combines existing Web services to build more complex ones. We take Sequence, Parallel, Alternative, Iteration and Arbitrary Sequence as basic constructs, and we define three more elaborate constructs: Discriminator, Delegation and Selection.
The BNF-like notation below describes the grammar defining the set of services that can be generated using our algebra's operators:

S ::= ε | X | S ◮ S | S ◭◮ S | ⟲S | S ⇔ S | S ∥ S | (S ⊡ S) ≫ S | Deleg(S1, o, S2) | Select[S1 : Sn]
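As an illustration only (our own encoding, not part of the G-Net algebra itself), such composition formulas can be mirrored by a small expression tree in Python, where leaves are service names (the X of the grammar) and inner nodes are operators; the service names anticipate the examples used in the operator descriptions below:

from dataclasses import dataclass
from typing import List, Union

Expr = Union[str, "Op"]            # a leaf is a service name; "epsilon" stands for the empty service

@dataclass
class Op:
    name: str                      # 'seq', 'alt', 'iter', 'arb_seq', 'par', 'disc', 'deleg', 'select'
    operands: List[Expr]
    operation: str = ""            # the delegated operation o, only used by 'deleg'

# (Registration followed by Confirmation) in parallel with an iterated ordering service:
formula = Op("par", [Op("seq", ["Registration", "Confirmation"]), Op("iter", ["Ordering"])])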
In what follows, we first give an informal definition <strong>of</strong><br />
each operator and then we define its syntax and formal<br />
semantics in terms <strong>of</strong> G-Nets.<br />
The Empty service (ε) is the Zero Service, i.e. it performs no operation; it is used for technical and theoretical reasons.
The Sequence operator (S1 ◮ S2) allows the construction of a service composed of two services executed one after the other. This construct is used when a service must wait for the execution result of another service before starting its own execution. For example, when subscribing to a forum, the service Registration is executed before the service Confirmation.
The Alternative (or Mutual Exclusion) operator (S1 ◭◮ S2) builds a composite service that, when applied to a pair of services S1 and S2, reproduces either the behavior of S1 or that of S2, but not both. For example, the service Identification is followed either by the service Allow-access or by the service Deny-access.
The Iteration operator (⟲S) represents a composite service in which one service is executed several times in a row. A typical use of this construct is a customer ordering a good from a service a certain number of times.
The Arbitrary Sequence operator (S1 ⇔ S2) is an unordered operator that executes two services which must not run concurrently. This construct is useful when there is no benefit in executing the services in parallel, for example when there is no deadline for accomplishing the global task and parallelism would generate additional costs.
The Parallel operator (S1 ∥ S2) builds a composite service. Given two services S1 and S2, it performs S1 and S2 at the same time and independently (without communication or interaction between them). The resulting service completes when both services have completed. This construct is useful when a service executes several completely independent atomic services.
The Discriminator operator ((S1 ⊡ S2) ≫ S3) is a composite service built on three services S1, S2 and S3. It submits redundant orders to different services performing the same task (S1 and S2, for example) and waits for the outputs of S1 and S2. The first service (among S1 and S2) that responds to the request activates the service S3; all later responses are ignored. Note that S1 and S2 are performed in parallel and without communication. The main goal of this operator is to increase the reliability and reduce the response delays of services over the Web: for customers, the best services are those which respond in optimal time and are constantly available.
The Delegation operator (Deleg(S1, o, S2)) replaces an operation o of a service S1 (o ∈ O1, O1 being the set of operations of S1) by the ISP of another, more specialized service S2. In a given service, this operator is used to delegate a task to another service that has more ability to execute it. It contributes to increasing the quality of service, enhances cooperation between enterprises and decreases development effort.
The Selection operator (Select[S1 : Sn]) is a complex operator applied to n services (S1, ..., Sn); it sends requests to the different services through messages passed to their ISPs. According to the responses and ranking criteria, the Selection operator chooses, among its competitors, the best service for performing a particular task that a company wants to subcontract. This operator provides ways to maintain relationships with different suppliers which may offer different prices and different levels of quality of service. It thus contributes to increasing the independence of a company from its suppliers.
The proposed algebra satisfies the closure property: the result of any operation on services is itself a service to which the algebra's operators can be applied. We are thus able to build more complex services by aggregating and reusing existing services through declarative expressions of the service algebra. The semantics of each composition operator is characterized by describing the GSP and IS parts of the resulting service in terms of the component services; we thereby focus on the dynamic behavior of the resulting service, which is what the Web service composition problem requires. Table I summarizes the G-Net algebra operators; in particular, it gives their syntax and formal semantics (a small stand-alone sketch of the Sequence row follows the table). The notations common to all the operators are:
• NameS is the name of the new service,
• Desc is the description of the new service,
• Loc is the location of the new service; it can be the same server as one of the component service(s) or a new server,
• URL is the invocation address of the new service.

TABLE I. THE G-NETS BASED ALGEBRA FOR WEB SERVICE COMPOSITION

Sequence. Syntax: S1 ◮ S2 = (NameS, Desc, Loc, URL, CS, SGN). Semantics: CS = CS1 ∪ CS2; SGN = (GSP, IS) where GSP = (MS, AS) with MS = Mtd.seq{[...](p1)}, AS = ∅; IS = (P, T, W, L) with P = {p1, p2, p3}, T = {t1, t2}, W = {(p1,t1), (t1,p2), (p2,t2), (t2,p3)}, L = {(p1, Isp(S1)), (p2, Isp(S2)), (p3, goal)}.

Alternative. Syntax: S1 ◭◮ S2 = (NameS, Desc, Loc, URL, CS, SGN). Semantics: CS = CS1 ∪ CS2; SGN = (GSP, IS) where GSP = (MS, AS) with MS = Mtd.Alt{[...](p1)}, AS = ∅; IS = (P, T, W, L) with P = {p1, p2, p3, p4}, T = {t1, t2, t3, t4}, W = {(p1,t1), (t1,p2), (p2,t3), (t3,p4), (p1,t2), (t2,p3), (p3,t4), (t4,p4)}, L = {(p1, τ), (p2, Isp(S1)), (p3, Isp(S2)), (p4, goal)}.

Iteration. Syntax: ⟲S1 = (NameS, Desc, Loc, URL, CS, SGN). Semantics: CS = CS1; SGN = (GSP, IS) where GSP = (MS, AS) with MS = Mtd.iter{[...](p1)}, AS = ∅; IS = (P, T, W, l) with P = {p1, p2}, T = {t1, t2}, W = {(p1,t1), (t1,p1), (p1,t2), (t2,p2)}, l = {(p1, Isp(S1)), (p2, goal)}.

Arbitrary Sequence. Syntax: S1 ⇔ S2 = (NameS, Desc, Loc, URL, CS, SGN). Semantics: CS = CS1 ∪ CS2; SGN = (GSP, IS) where GSP = (MS, AS) with MS = Mtd.ar.seq{[...](p1)}, AS = ∅; IS = (P, T, W, L) with P = {p1, ..., p9}, T = {t1, ..., t6}, W = {(p1,t1), (t1,p2), (t1,p3), (t1,p4), (p2,t2), (t2,p5), (p5,t4), (t4,p7), (t4,p3), (p7,t6), (t6,p9), (p3,t3), (p3,t6), (p4,t3), (t3,p6), (p6,t5), (t5,p3), (t5,p8), (p8,t6)}, L = {(p1, τ), (p2, τ), (p3, τ), (p4, τ), (p7, τ), (p8, τ), (p5, Isp(S1)), (p6, Isp(S2)), (p9, goal)}.

Parallel. Syntax: S1 ∥ S2 = (NameS, Desc, Loc, URL, CS, SGN). Semantics: CS = CS1 ∪ CS2; SGN = (GSP, IS) where GSP = (MS, AS) with MS = Mtd.par{[...](p1)}, AS = ∅; IS = (P, T, W, l) with P = {p1, p2, p3, p4}, T = {t1, t2}, W = {(p1,t1), (t1,p2), (t1,p3), (p2,t2), (p3,t2), (t2,p4)}, l = {(p1, τ), (p2, Isp(S1)), (p3, Isp(S2)), (p4, goal)}.

Discriminator. Syntax: (S1 ⊡ S2) ≫ S3 = (NameS, Desc, Loc, URL, CS, SGN). Semantics: CS = CS1 ∪ CS2 ∪ CS3; SGN = (GSP, IS) where GSP = (MS, AS) with MS = Mtd.disc{[...](p1)}, AS = ∅; IS = (P, T, W, l) with P = {p1, p2, p3, p4, p5, p6, p7}, T = {t1, t2, t3, t4, t5}, W = {(p1,t1), (t1,p2), (t1,p3), (t1,p5), (p2,t2), (p3,t3), (t2,p4), (t3,p4), (p4,t4), (p5,t4), (t4,p6), (p6,t5), (t5,p7)}, l = {(p1, τ), (p4, τ), (p5, τ), (p2, Isp(S1)), (p3, Isp(S2)), (p6, Isp(S3)), (p7, goal)}.

Delegation. Syntax: Deleg(S1, o, S2) = (NameS, Desc, Loc, URL, CS, SGN). Semantics: CS = CS1 ∪ CS2; SGN = (GSP, IS) where GSP = (MS, AS) with MS = MS1, AS = AS1; IS = (P, T, W, L) with P = P1 \ L^{-1}(o) ∪ {Isp(S2)}, T = T1, W = W1 ∪ {(t, Isp(S2)) | t ∈ •L^{-1}(o)} ∪ {(Isp(S2), t) | t ∈ L^{-1}(o)•} \ {(p,t) | p ∈ L^{-1}(o)} \ {(t,p) | p ∈ L^{-1}(o)}, L = L1 ∪ {(Px, Isp(S2))}.

Selection. Syntax: Select[S1 : Sn] = (NameS, Desc, Loc, URL, CS, SGN). Semantics: CS = ∪_{i=1..n} CSi; SGN = (GSP, IS) where GSP = (MS, AS) with MS = Mtd.Select[](p1), AS = ASn+1; IS = (P, T, W, L) with P = {p1, ..., p2n+3}, T = {t1, ..., t2n+2}, W = {(p1,t1)} ∪ ∪_{i=2..n+1}{(t1,pi)} ∪ ∪_{i=2..n+1}{(pi,t2)} ∪ {(t2,pn+2)} ∪ ∪_{i=1..n}{(pn+2,ti)} ∪ ∪_{i=3..2n+2}{(ti,pn+i)} ∪ ∪_{i=n+3..2n+2}{(pi,ti)} ∪ ∪_{i=n+3..2n+2}{(pi,t2n+3)}, L = {(p1, τ)} ∪ {(p2, Isp(S1.req)), ..., (pn+1, Isp(Sn.req))} ∪ {(pn+2, SelectService)} ∪ {(pn+3, Isp(S1.mtd)), ..., (p2n+2, Isp(Sn.mtd))} ∪ {(p2n+2, goal)}.
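To make the table more concrete, the following stand-alone Python sketch builds the internal structure of S1 ◮ S2 exactly as given in the Sequence row (an illustrative encoding with plain sets and strings; Isp(S1), Isp(S2) and the goal label are represented as label strings):

def sequence_is(s1: str, s2: str) -> dict:
    """Internal structure of the Sequence composition, following the Sequence row of Table I."""
    return {
        "P": {"p1", "p2", "p3"},
        "T": {"t1", "t2"},
        "W": {("p1", "t1"), ("t1", "p2"), ("p2", "t2"), ("t2", "p3")},
        "L": {"p1": f"Isp({s1})", "p2": f"Isp({s2})", "p3": "goal"},
    }

print(sequence_is("Registration", "Confirmation"))   # the forum-subscription example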
B. Operator’s implementation<br />
For the implementation <strong>of</strong> the composition process, we<br />
make use <strong>of</strong> meta-modeling and model transformation<br />
techniques based on the G-Net modeling language. The syntax of the class of models (G-Nets) is graphically meta-modeled in an appropriate formalism, Entity-Relationship diagrams. Since the abstract syntax of the models used is graph-like, graph rewriting can be used to perform the model transformation. With respect to existing classification criteria, the kind of transformation applied in our approach is:
• Endogenous (in contrast with exogenous model transformation), since the same meta-model is used to express both the source and target models, and
• Horizontal (in contrast with vertical model transformation), since the source and target models reside at the same level of abstraction.
To implement the G-Net algebra's operators, we have defined a graph grammar which consists of a set of transformation rules. The complete grammar includes twelve rules, which can be applied to perform any operator present in a submitted composition formula. When the graph grammar execution finishes, we obtain a new G-Net service that models the composite service. Due to page limitations, we show in Fig. 4 only three of these rules.

Figure 4. Some rules of the graph grammar for G-Nets services composition

In all of these rules, the nodes and the different connections are labeled by numbers that identify them. These identifiers are used during the application of the rules.
If an identifier is present in both the left hand side (LHS) and the right hand side (RHS) of a rule, the corresponding element (node or connection) is preserved in the result. If the identifier appears only in the LHS, the corresponding element is deleted. If it appears only in the RHS, the corresponding element is created. As we will see, the identifiers are also used in the Python code that computes the attribute values marked '<SPECIFIED>'. The attribute values of the elements in the LHSs of the rules are compared with the attribute values of the elements of the host graph during the matching process. The first rule implements the Sequence operator. The LHS of the rule corresponds to the GSPs of the two G-Net operands; when representing only the G-Net interface (GSP), we abstract away from the internal structure. In the LHS, all the attribute values are set to <ANY>. The RHS represents the resulting G-Net service; in the RHS, the attributes of nodes 3, 4 and 6 carry the additional label '<SPECIFIED>'. This label indicates that the attribute value is computed by Python code defined in the rule's 'Actions'. The code is executed only if the rule is applied, and the computation of the value is based on the attributes of the LHS nodes. For example, in the first rule, the action nodeWithLabel(4).InvokedGnet = LHS.nodeWithLabel(1).name.getValue() assigns the value of the attribute 'name' of node (1) to the attribute 'InvokedGnet' of node (4). The two other rules follow the same reasoning to perform the Arbitrary Sequence and Discriminator operators.
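The preserve/delete/create convention can be illustrated independently of AToM3 by the toy sketch below (this is not AToM3's API; rule sides and the host graph are plain dictionaries keyed by the numeric labels):

def apply_rule(host, lhs, rhs):
    """Toy illustration: keep labels in both LHS and RHS, drop LHS-only labels, create RHS-only labels."""
    result = dict(host)
    for label in lhs:
        result.pop(label, None)                              # labels only in the LHS are deleted...
    result.update({k: host[k] for k in lhs if k in rhs})     # ...unless they also appear in the RHS
    result.update({k: rhs[k] for k in rhs if k not in lhs})  # RHS-only labels are created
    return result

# Two matched GSP nodes (1, 2); node 4 is created and its attribute is then computed
# from node 1, mimicking the 'Actions' code of the first rule.
host = {1: {"name": "S1"}, 2: {"name": "S2"}}
lhs = {1: {"name": "<ANY>"}, 2: {"name": "<ANY>"}}
rhs = {1: {}, 2: {}, 4: {"InvokedGnet": "<SPECIFIED>"}}
out = apply_rule(host, lhs, rhs)
out[4]["InvokedGnet"] = host[1]["name"]                      # node(4).InvokedGnet := name of node(1)
print(out)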
Like the modeling process, the composition process<br />
is also performed with AToM 3 , since this tool <strong>of</strong>fers<br />
capabilities for model manipulation by graph transformation.<br />
The graph grammar presented above is stored in the
user area of AToM3. The actions associated with each rule are specified by Python code. Fig. 5 illustrates the implementation of the first rule in AToM3.

Figure 5. The first rule in AToM3

The reader can see the LHS and RHS of the rule as well as the Python code (at the top of the figure) which specifies the action discussed above. The composition process starts with the importation of previously modeled G-Net services. The user then enters a composition formula according to the defined algebra. Since the formula may involve several operators, the corresponding rule(s) of each operator are applied to the imported operand services. Consider the example composition scenario in which a customer wants to get a product as soon as possible. The customer submits redundant orders to two Provider services; once he obtains a response from the fastest service, he starts the payment procedure. This scenario can be performed using the Discriminator operator. The payment is achieved by the Checkout G-Net service presented above, and the providers are modeled by the G-Net services Provider1 and Provider2. The Provider1 and Provider2 services start by checking the availability of the required product (CA). If the product is available, they make a bill and send it to the customer. In the case where
the product is not available, they trigger their respective<br />
restock procedure and recheck availability.<br />
In Fig. 6, we present the services composition using our graph-transformation-based tool.

Figure 6. G-Nets services composition using our tool

Once the three services have been modeled and imported into the tool, the user enters the formula (Provider1 ⊡ Provider2) ≫ Checkout, and AToM3 applies our graph grammar. At the end of the grammar execution, we obtain the composite service Disc shown on the right side of Fig. 6. We can see that Disc invokes Provider1 and Provider2 through their ISPs; the first service that responds to the request activates the Checkout service.
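As a small usage illustration (our own nested-tuple encoding, unrelated to AToM3's internal model representation), the scenario's formula can be written down and the operator rules that would fire can be enumerated:

# Hypothetical encoding: ("disc", S1, S2, S3) stands for (S1 [+] S2) >> S3.
formula = ("disc", "Provider1", "Provider2", "Checkout")

def rules_to_apply(expr):
    """List the grammar rules needed for a formula (one rule application per operator)."""
    if isinstance(expr, str):              # a basic, already modeled G-Net service
        return []
    op, *operands = expr
    rules = [r for sub in operands for r in rules_to_apply(sub)]
    rules.append(op + "-rule")
    return rules

print(rules_to_apply(formula))             # ['disc-rule']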
VI. VERIFICATION PROCESS<br />
The WS-mcv methodology incorporates formal verification in order to detect and repair design errors even before the (composed) service is actually run. The intention is to raise the reliability of Web service composition by ensuring that a composite G-Net service will behave as required by its specification and that the system and its components contain no errors or behavioral anomalies (such as deadlock and livelock). To perform formal verification on systems modeled in Petri-net-like languages, current research exploits the powerful features of reachability graph analysis techniques. Since there is no reachability analyzer for the G-Net framework, we make use of translation rules that transform G-Net specifications into equivalent Predicate/Transition Nets (PrT-Nets). This transformation is performed in order to exploit an existing tool offering a variety of analysis techniques for PrT-Nets, namely the PROD reachability analyzer [27]. PROD creates the reachability graph of a system modeled as a PrT-Net. From that graph, users can search for terminal nodes, for paths leading to those terminal nodes, and for paths satisfying given properties. The complete manual of PROD is available in [28].
Unlike other approaches, the WS-mcv methodology performs verification both before and after the composition in order to detect whether anomalies occur in the component services or in the composite one. The pre- and post-verification phases proceed in the same way, except that the former concerns the operand services while the latter concerns the resulting service. In this way we guarantee correct Web service modeling and composition. The verification process is divided into two tasks: 1) the transformation of a G-Net service into an equivalent PrT-Net, and 2) the translation of the obtained PrT-Net into a PROD net description.
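The kind of check carried out in these two phases can be illustrated on a tiny 1-safe place/transition net: the self-contained Python sketch below (a toy, not PROD itself) enumerates the reachability graph breadth-first and reports terminal markings, i.e. potential deadlocks whenever they are not intended goal markings:

from collections import deque

def reachability(initial, transitions):
    """Enumerate reachable markings of a 1-safe net; return (all markings, terminal markings)."""
    seen = {initial}
    terminal = set()
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        successors = []
        for pre, post in transitions:               # a transition is a pair (consumed, produced)
            if pre <= marking:                      # enabled when all consumed places are marked
                successors.append(frozenset((marking - pre) | post))
        if not successors:
            terminal.add(marking)                   # no enabled transition: terminal node
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, terminal

# Toy net: p1 --t1--> p2 --t2--> goal
transitions = [(frozenset({"p1"}), frozenset({"p2"})),
               (frozenset({"p2"}), frozenset({"goal"}))]
markings, terminals = reachability(frozenset({"p1"}), transitions)
print(len(markings), terminals)                     # 3 reachable markings; only {'goal'} is terminal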
A. G-net/PrT-net transformation<br />
To perform this task, we again use MDE techniques to carry out the model transformation. The work of [12] presents a graph-transformation-based framework that allows transforming a G-Net specification into its equivalent PrT-Net using a defined graph grammar. The latter has a slight drawback: when applied to a source model, it progressively deletes it. As a consequence, we have improved these grammar rules in order to preserve the source (G-Net) model, and we exploit the modified grammar in the AToM3 environment. Once our tool is provided with the meta-model of the PrT-Net formalism [12] and the modified G-Net/PrT-Net grammar, it supports both the G-Net and PrT-Net formalisms and can automatically generate PrT-Net specifications from G-Net ones. The verification process starts by specifying the G-Net service the user wants to analyze. The transformation rules are then applied to the G-Net model; once the transformation finishes, we obtain an equivalent PrT-Net specification. This transformation is illustrated by the user interface of AToM3 in Fig. 7. On the left side of the figure, we can see the Provider1 service in the G-Net formalism; the right side of the same figure represents its equivalent resulting PrT-Net.

Figure 7. Provider1 G-Net service and its equivalent PrT-Net
B. Prt-net/PROD net description transformation<br />
To perform the analysis with PROD, we need to convert the PrT-Net specification into PROD's net description language, which is the C preprocessor language extended with net description directives. PROD compiles this net description and generates the full reachability graph. At present, however, this task is still performed manually; once the user interface of our tool is complete, the user will no longer have to be familiar with PROD. Using the reachability graph, we can verify many important properties, such as boundedness, liveness and reachability, which can be used as general criteria for the correctness of a composition. The result of this transformation is illustrated in Fig. 8, which shows the resulting PROD net description file of the service Provider1 previously described in the PrT-Net formalism.

Figure 8. PROD net description for the Provider1 PrT-Net
VII. CONCLUSION<br />
In this paper, an efficient model-driven methodology for Web service composition has been presented. The proposed methodology offers solutions for modeling existing services, composing them, and verifying their correctness. The main contributions of this paper are:
• The definition <strong>of</strong> a set <strong>of</strong> modeling rules for Web<br />
service specification into G-Net concepts.<br />
• The proposition <strong>of</strong> a G-Net based algebra that allows<br />
combining G-Net services by means <strong>of</strong> basic and<br />
complex operators.<br />
• The formal definitions <strong>of</strong> G-Net services as well as<br />
the introduced operators.<br />
• The implementation <strong>of</strong> the proposed operators by an<br />
efficient graph grammar.<br />
• The specification <strong>of</strong> a verification method to ensure<br />
composition correctness.<br />
All the phases of our methodology are realized by dedicated processes. The modeling and composition processes have been implemented with a customized visual tool which allows editing and manipulating models in the G-Net formalism. The verification process, which is partially automated, is based on model transformations performed by AToM3 to produce models that can be verified by PROD. To the best of our knowledge, WS-mcv is the only approach that makes use of the G-Net framework to provide a complete solution for Web service composition. Compared to other Petri-net-based approaches, ours presents several advantages: it requires less effort when modeling complex services, produces more compact models, offers a visual tool, and supports formal verification. In future work, we propose to meta-model the WSDL description language and to define a graph grammar for translating Web services described in WSDL into equivalent G-Net services. We will then extend our tool with a new module that can import existing services described in WSDL and automatically model them in G-Net concepts. We also plan to improve the verification process by automating the transformation from PrT-Nets to PROD's net description language, which will spare users from having to be familiar with PROD net descriptions.
REFERENCES<br />
[1] F. Curbera, M. Duftler, R. Khalaf, W. Nagy, N. Mukhi,<br />
and S. Weerawarana, “Unraveling the web services web<br />
an introduction to soap, wsdl, and uddi,” IEEE INTERNET<br />
COMPUTING.<br />
[2] E. Christensen, F. Curbera, G. Meredith, and<br />
S. Weerawarana, “Web services description language<br />
(wsdl) 1.1,” Mar 2001, [Online]. Available:<br />
http://www.w3.org/TR/wsdl.<br />
[3] D. Box, D. Ehnebuske, G. Kakivaya, A. Layman,<br />
N. Mendelsohn, , H. F. Nielsen, S. Thatte, and D. Winer,<br />
“Simple object access protocol (soap) 1.1,” May 2000,<br />
[Online]. Available: http://www.w3.org/TR/2000/NOTE-<br />
SOAP-20000508/.<br />
[4] R. Hamadi and B. Benatallah, “A petri net based-model for<br />
web service composition,” in proc. the 14th australasian<br />
database conference, adelaide. Darlinghurst: Australian<br />
Computer Society, 2003, pp. 191–200.<br />
[5] G. Yubin, D. Yuyue, and X. Jianqing, “A CP-net model and operation properties for web service composition,” Chinese Journal of Computers (Chinese edition), vol. 29, no. 7, pp. 1067–1075, 2006.
[6] Z. Zhang, F. hong, and H. xiao, “A colored petri net-based<br />
model for web service composition,” <strong>Journal</strong> <strong>of</strong> Shanghai<br />
University (English Edition), vol. 12, Number 4, pp. 323–<br />
329, 2008.<br />
[7] X. Feng, Q. Liu, and Z. Wang, “A web service composition<br />
modeling and evaluation method used petri net,” in Proc.<br />
APWeb Workshops, 2006, pp. 905–911.<br />
[8] R. Akkiraju and B. Sapkota, “Semantic annotations for<br />
wsdl and xml schema usage guide,” (2007), [Online].<br />
Available: http://www.w3.org/TR/sawsdl-guide/.<br />
[9] J. Bruijn, D. Fensel, U. Keller, H. M Lausen, R. Krummenacher,<br />
A. Polleres, and L. Predoiu, “The web service<br />
modeling language wsml,” 2005, [Online]. Available:<br />
http://www.wsmo.org/wsml/.<br />
[10] D. Martin, M. Burstein, J. Hobbs, and al, “Owl-s: Semantic<br />
markup for web services,” [Online]. Available:<br />
http://www.w3.org/Submission/OWL-S/.<br />
[11] Y. Deng, S. K. Chang, J. C. A. De Figueiredo, and A. Perkusich, “Integrating software engineering methods and petri nets for the specification and prototyping of complex information systems,” in Proc. 14th International Conference on Application and Theory of Petri Nets, Chicago, June 21–25, 1993, pp. 206–223.
[12] E. H. Kerkouche and A. Chaoui, “A formal framework and a tool for the specification and analysis of G-Nets models based on graph transformation,” in Proc. International Conference on Distributed Computing and Networking CDCN09, India, January 2009, pp. 206–211.
[13] H. J. Genrich and K. Lautenbach, “System modeling with high level petri nets,” Theoretical Computer Science, vol. 13, pp. 109–136, 1981.
[14] T. He and L. Li, “Research on verification tool for s<strong>of</strong>tware<br />
requirements,” JOURNAL OF SOFTWARE, vol. 07, Issue<br />
7, pp. 1069–1616, JULY 2012.<br />
[15] M. Ter Beek, A. Bucchiarone, and S. Gnesi, “Formal<br />
methods for service composition,” Annals <strong>of</strong> Mathematics,<br />
Computing and Teleinformatics, vol. 1, Issue 5, pp. 1–10,<br />
2007.<br />
[16] F. Curbera, Y. Goland, J. Klein, F. Leymann, D. Roller,<br />
S. Thatte, and S. Weerawarana, . Business Process Execution<br />
Language for Web Service (BPEL4WS) 1.0. Published<br />
on the World Wide Web by BEA Corp, IBM Corp and<br />
Micros<strong>of</strong>t Corp, Aug 2002.<br />
[17] Z. Liu, A. Ranganathan, and A. Riabov, “Modeling web<br />
services using semantic graph transformations to aid automatic<br />
composition,” in IEEE International Conference on<br />
Web Services (ICWS 2007), Salt Lake City, Utah, July 13–
19, 2007.<br />
[18] B. Srivastava and J. Koehler, “Web service composition - current solutions and open problems,” in ICAPS 2003 Workshop on Planning for Web Services, July 22, 2003.
[19] B. Li, Y. Xu, J. Wu, and J. Zhu, “A petri-net and qos based<br />
model for automatic web service composition,” JOURNAL<br />
OF SOFTWARE, vol. 07, Issue 1, pp. 149–155, JANUARY<br />
2012.<br />
[20] K. Jensen, “Coloured petri nets- a high level language for<br />
system design and analysis,” in Lecture Notes in Computer<br />
Science 483. Advances in Petri Nets 1990 Springer-verlag,<br />
1990.<br />
[21] C. A. Petri, “Kommunikation mit automaten (in german),”<br />
Ph.D. dissertation, University <strong>of</strong> Bonn, Germany, 1962.<br />
[22] A. Perkusich and J. C. A. De Figueiredo, “G-nets: A<br />
petri net based approach for logical and timing analysis<br />
<strong>of</strong> complex s<strong>of</strong>tware systems,” <strong>Journal</strong> <strong>of</strong> Systems and<br />
S<strong>of</strong>tware, vol. 39, Issue 1, pp. 39–59, Oct 1997.<br />
[23] J. De Lara and H. Vangheluwe, “AToM3: A tool for multi-formalism modelling and meta-modelling,” in Proc. European Conferences on Theory and Practice of Software Engineering (ETAPS 2002), 2002, pp. 174–188.
[24] Google, “Google maps api web service,”<br />
(2005), [Online]. Available: From:<br />
http://code.google.com/intl/com/apis/maps/.
[25] S. Narayanan and S. McIlraith, “Analysis and simulation<br />
<strong>of</strong> web services,” Computer Networks, vol. 42, Number 5,<br />
pp. 675–693, 2003.<br />
[26] D. Zhovtobryukh, “Context-aware web service composition,”<br />
Ph.D. dissertation, University <strong>of</strong> Jyvaskyla, Finland,<br />
2006.<br />
[27] PROD, “Prod: An advanced tool for efficient reachability<br />
analysis, version 3.4.01.” 1995, [Online]. Available:<br />
http://www.tcs.hut.fi/S<strong>of</strong>tware/prod/.<br />
[28] K. Varpaaniemi, J. Halme, K. Hiekkanen, and T. Pyssysalo,<br />
Helsinki University <strong>of</strong> technology, Tech. Rep.<br />
Fayçal Bachtarzi is currently a Ph.D. candidate at Mentouri University of Constantine, Algeria. He received his Master's degree in computer science from the same university in 2010. His research interests include Web service composition, model-driven engineering, formal verification and distributed systems.

Allaoua Chaoui is a full Professor in the Department of Computer Science, Faculty of Engineering, University Mentouri Constantine, Algeria. He received his PhD degree in 1998 from the University of Constantine (in cooperation with the CEDRIC Laboratory of CNAM in Paris, France). His research interests include mobile computing, formal specification and verification of distributed systems, and graph transformation systems.

Elhillali Kerkouche is an Associate Professor in the Department of Computer Science, University of Jijel, Algeria. His research field is formal methods and distributed systems.
Object Search for the Internet <strong>of</strong> Things Using<br />
Tag-based Location Signatures<br />
Jung-Sing Jwo and Ting-Chia Chen
Department of Computer Science, Tunghai University, Taichung, Taiwan
Email: {jwo, g96350005}@thu.edu.tw

Mengru Tu
Industrial Technology Research Institute, Hsinchu, Taiwan
Email: tuarthur@itri.org.tw

Corresponding author: Jung-Sing Jwo.
doi:10.4304/jsw.7.12.2886-2893
Abstract—In this paper, an object search solution for the<br />
Internet <strong>of</strong> Things (IoT) is proposed. This study first<br />
differentiates localization and searching. Localization is to<br />
calculate an object’s current location. Searching is to return<br />
a set <strong>of</strong> locations where a target object could be. It is<br />
possible that the locations <strong>of</strong> the returned set are not<br />
contiguous. Searching accuracy can be improved if the<br />
number <strong>of</strong> the returned locations is small. Even though<br />
localization techniques are applicable to searching applications, a simpler and easier solution will attract more enterprise
users. In this paper, based on a concept called location<br />
signature, defined by a set <strong>of</strong> reference tags, an object<br />
searching method named Location Signature Search (LSS)<br />
is proposed. The study <strong>of</strong> LSS shows that the searching<br />
accuracy can be very high if a location signature is not<br />
shared by too many locations. Since location signatures are<br />
affected by the deployment <strong>of</strong> the reference tags, trade-<strong>of</strong>f<br />
between searching accuracy and implementation cost is<br />
achievable. A real world experiment is conducted in this<br />
research. The results show that LSS indeed is a practical<br />
method for object searching applications.<br />
Index Terms—Internet <strong>of</strong> Things, location signature, object<br />
search, RFID, ubiquitous computing<br />
I. INTRODUCTION<br />
The Internet <strong>of</strong> Things (IoT) envisions a world where<br />
each everyday object has a unique identity and is able to<br />
connect to a wireless data network [1][3]. Being a digital<br />
identity for an object, Radio Frequency Identification<br />
(RFID) technology has recently been adopted by a wide<br />
range <strong>of</strong> industries such as retail and pharmaceuticals.<br />
The successful utilization of RFID technology can also help realize the IoT vision – a global infrastructure of networked physical objects [2]. In fact, with IoT, people can live in a smarter world [4].
Recent IoT applications supporting enterprise operations can be seen in manufacturing [5][6][7][8] and supply chains [9][10][11]. The ability to bridge the virtual world of digital information and the real world of products and logistical units is the key reason why IoT
becomes more and more promising in solving existing<br />
business problems [7][12]. On the other hand, IoT has<br />
also attracted great attention from indoor tracking and<br />
localization applications [13][15-20][22-26][28-35].<br />
Indoor localization, especially accurate positioning,
is crucial for many ubiquitous computing applications<br />
[21][27]. In fact, for many enterprise applications,<br />
searching and identifying where an important asset is, for<br />
example, a specific mold inside a factory, is very<br />
important [27]. Even though Global Positioning System<br />
(GPS) technology is widely used to track moving objects<br />
outdoors, it performs quite poorly when operating indoors.<br />
Solutions using RFID technology for indoor<br />
localization or positioning have been proposed recently<br />
by many research teams. Examples include SpotON [20],<br />
and LANDMARC [23]. SpotON utilizes the RF signal<br />
strength to perform location calculation. LANDMARC<br />
uses reference tags, RF map and a large number <strong>of</strong><br />
received signal strength data stored in a database to<br />
position an active tagged object’s location. Triangulation<br />
is another popular technique for RFID-based localization and positioning. In recent years, many studies have employed triangulation algorithms for indoor localization in places like factory assembly lines or conveyor belts [31][32][33]. Tracking so many tagged objects in an enterprise requires deploying many RFID readers; thus, integrating RFID technology with wireless sensor networks to form a wireless RFID network [35] is another burgeoning trend in IoT-based localization.
Instead of tracking a tag attached to the object, another IoT indoor localization solution uses tags as known location references and tracks a moving reader mounted on the object [15][18][19][22][26][34]. To determine where a given object is, a mathematical analysis of the sampled set of reference tags sensed by its attached reader allows its current location to be
estimated. These types <strong>of</strong> solutions are especially useful<br />
for moving robot systems or tracking moving wafer<br />
boxes in semiconductor manufacturing or testing<br />
facilities [30].<br />
In this paper, instead of emphasizing localization, we are more interested in the issue of searching for an object in
a known area where locations in that area are well<br />
marked. The difference between localization and searching is that localization calculates an object's current coordinates, while searching identifies a limited set of locations where an object could appear. If all the locations in an area are well marked, knowing a restricted set of positions for an object being searched in that area can greatly increase searching accuracy; but the same set of positions may not yield a meaningful coordinate for that object, since those positions may be totally unrelated to actual coordinates.
In order to develop a simpler and easier object<br />
searching solution, a concept called location signature is<br />
introduced. By first deploying reference tags on a target<br />
area, each location inside the area will have its own<br />
location signature defined by a subset <strong>of</strong> the deployed<br />
tags. Based on location signature, an indoor searching<br />
solution named LSS (Location Signature Search) is<br />
proposed. In order to study the characteristics <strong>of</strong> LSS,<br />
simulations and experiments are conducted. The results<br />
show that a good reference tag deployment scheme can<br />
dramatically reduce the number <strong>of</strong> reference tags used to<br />
build location signatures and still maintain the uniqueness<br />
property for each location. However, if some positions <strong>of</strong><br />
an area allow lower accuracy resolution, i.e. their location<br />
signatures are shared by other locations, the number <strong>of</strong><br />
reference tags used to build location signatures can be<br />
further reduced. In fact, if 95% searching accuracy is acceptable, the number of reference tags needed to build location signatures is less than 200 in a 100 × 100 logical
grid area.<br />
This paper is organized as follows. Section 2 describes<br />
the concept <strong>of</strong> location signature and object searching<br />
solution LSS. Section 3 studies the characteristics <strong>of</strong> LSS.<br />
Experiments and observations are given in Section 4.<br />
Section 5 is the concluding remarks.<br />
II. LOCATION SIGNATURES AND OBJECT SEARCHING<br />
In an enterprise, objects required to be searched,<br />
usually valuable assets, are either mounted with mobile<br />
readers or loaded on a recyclable pallet/trolley equipped<br />
with a mobile reader. Figure 1 shows an example of an RFID-equipped trolley from a semiconductor testing firm. The trolley is mounted with an RFID reader and two antennas: one for sensing the RFID-tagged wafer boxes loaded on the trolley (antenna 1), and the other (antenna 2) for detecting the location tags placed beneath the floor. In this research, UHF RFID technology with frequency range 902-928 MHz is used to conduct the experiments. Instead of placing reference tags beneath the floor, we deploy the tags on the ceiling, which is a more cost-effective approach for the experimental environment.
Figure 1. A trolley equipped with an RFID reader to facilitate enterprise asset tracking
Even though localization solutions can be used to perform object searching, searching is not necessarily as complicated as localization. Therefore, it is possible that a simpler searching solution exists than those based on the RFID positioning techniques discussed above.
In order to develop a simpler and easier object searching solution, the location signature is introduced. By first deploying reference tags carefully over an area, each location inside the area obtains its own signature, defined by the set of deployed tags related to that location. If the location signature can be uniquely determined for each location in the area, an indoor searching solution with the expected accuracy becomes possible. Also, since the location signatures of a given area are fixed and can be pre-computed, the location of any given spot can be easily retrieved by using the location signature as the corresponding index. Therefore, instead of tracking and calculating the location of a given object, the object can be searched for simply by checking its current location signature.
Consider a target area A. Define the expected accuracy resolution of a given searching requirement as r. Accuracy resolution r means that all objects located inside an r × r square are considered to be at the same position. If r is not large, say one meter, an object can be identified within a one-square-meter area, which is good enough for object searching. The boundary between any two squares is assumed to belong to one specific square, so there is no ambiguous position. To simplify the discussion, let A be an N × N square area where N is a multiple of r; area A then becomes an n × n grid, where each grid cell is r × r and n = N/r.
Let P = { pij | 1≦i≦n and 1≦j≦n } be the set of all the physical locations inside area A. Assume the effective detecting radius of a mobile reader attached to an object is d. Let
S(pij) = { taguv | taguv is located inside the circle centered at position pij with radius d }. (2.1)
Then, S(pij) is recognized as the location signature of pij. Consider an arbitrary position pst. If S(pst) ≠ S(pij) for all (i, j) ≠ (s, t) with 1≦i≦n and 1≦j≦n, the location pst can be uniquely identified by S(pst). In other words, when an object is at this location, it can be identified by the location signature S(pst) with 100%
accuracy. If the reference tag deployment is not good enough, it is possible that many neighboring locations share the same signature. Assume there are m other locations sharing the same location signature S(pst); then the searching accuracy for location pst is:
(n × n − m) / (n × n) × 100%. (2.2)
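To make the construction concrete, the following is a minimal Python sketch of equations 2.1 and 2.2. The function names (build_signatures, accuracy_map) and the simple Euclidean in-range test are our own illustrative assumptions; the paper does not prescribe an implementation.

from collections import defaultdict

def build_signatures(n, tags, d):
    # Equation 2.1: the signature of grid location (i, j) is the set of reference
    # tags lying inside the circle of radius d centred at (i, j).
    sig = {}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            sig[(i, j)] = frozenset(k for k, (tx, ty) in enumerate(tags)
                                    if (tx - i) ** 2 + (ty - j) ** 2 <= d * d)
    return sig

def accuracy_map(sig, n):
    # Equation 2.2: accuracy of p is (n*n - m) / (n*n), where m counts the OTHER
    # locations that share p's signature.
    count = defaultdict(int)
    for s in sig.values():
        count[s] += 1
    return {p: (n * n - (count[s] - 1)) / (n * n) for p, s in sig.items()}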
In the real world, because of the uncertainty of equipment and environment, the tags detected by a reader may not always be the same, even at the same location. Therefore, several situations need to be discussed. First, let S be the set of reference tags sensed by the reader attached to some object x at location p, and assume the location signature of p is T.
If S = T, the location(s) with the location signature T<br />
is (are) returned as the identified location(s) for searching<br />
object x.<br />
If S ≠ T, there are two possible cases: 1. S is also a valid location signature; 2. S is not a valid location signature.
For the first case, it is clear that wrong location(s) will be returned for searching, and therefore object x cannot be found at the returned location(s).
For the second case, fewer or extra tags are detected at location p. Since S is not a valid signature, no location can be returned for searching directly. Let Ŝ denote the set of location signatures whose members are either subsets or supersets of S; in other words, by adding or removing some tags, S becomes a valid location signature. Let mi be the number of locations sharing the same location signature Si, where Si ∈ Ŝ. Then we say
that the searching accuracy for object x at position p is:<br />
(n × n − Σ_{Si ∈ Ŝ} mi) / (n × n) × 100%. (2.3)
Based on the above definitions, LSS can be described as follows:
Step 1. Let A be the searching target area. Choose values<br />
for r and d. These two values decide the<br />
searching precision <strong>of</strong> the target area.<br />
Step 2. Based on r and d, define a reference tag deployment scheme for area A. The deployment scheme can be arbitrary or follow any preferred pattern; obviously, the location signatures are highly related to the chosen scheme.
Step 3. Following equation 2.1, build the location signatures for all locations in area A. Each location and its location signature are
paired together. It is possible that two or more locations share the same signature.
Step 4. When searching for an object x, LSS requests the mobile reader attached to x to return the reference tags it currently detects through the wireless network. LSS uses the returned tag set S to represent x's current location.
Step 5. If a location signature T is found to be equal to S, the location(s) paired with T are returned for searching. If object x cannot be found in the returned location(s), go back to Step 4.
Step 6. If no location signature is equal to S, build the set Ŝ and return all the locations paired with the location signatures in Ŝ. Search for object x within the returned locations. If object x cannot be found there, go back to Step 4.
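The query side of LSS (Steps 4-6) can be sketched in a few lines of Python. This is only one illustrative reading of the steps above; the function name lss_search and the decision to skip empty signatures in the fallback are our assumptions, not part of the paper.

def lss_search(detected_tags, sig):
    # Step 4: S is the set of reference tags currently reported by x's reader.
    s = frozenset(detected_tags)
    # Step 5: if S equals some stored signature T, return the location(s) paired with T.
    exact = [p for p, t in sig.items() if t == s]
    if exact:
        return exact
    # Step 6: otherwise build Ŝ -- signatures that are subsets or supersets of S --
    # and return every location paired with a member of Ŝ.  Empty signatures are
    # skipped here (an assumption; the paper does not discuss them).
    return [p for p, t in sig.items() if t and (t <= s or t >= s)]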
One <strong>of</strong> the major issues <strong>of</strong> using tag-based location<br />
signatures for LSS is how to guarantee that each location<br />
is paired with a unique location signature. This issue<br />
depends on how many reference tags are used and how<br />
they are deployed. In order to investigate the searching<br />
accuracy problem caused by the tag deployment, further<br />
studies with experiments are given in the next section.<br />
III. CHARACTERISTICS OF LSS<br />
In order to study the characteristics of LSS, an LSS simulator is developed. It provides a user interface that shows the deployed reference tags as green dots on the target area. Based on the mobile reader mounted on the object, the simulator calculates all the location signatures and shows a visualized accuracy map with the corresponding searching accuracy. A dark blue dot on the map indicates that the location's signature is unique and therefore its searching accuracy is 100%. A lighter blue dot means the location shares its signature with other locations, so by equation 2.2 its searching accuracy is less than 100%. The lighter the blue, the more locations share the signature and the lower that location's searching accuracy.
The simulation parameters <strong>of</strong> LSS in this section are<br />
N = 100, r = 1, n = N / r = 100, and d =8. It is clear that<br />
the N × N square area is divided into n × n =10,000 grid<br />
locations. Before continuing the discussion, an extreme case is introduced first. Assume a reference tag is deployed at every location in the target area. The sets of tags detected at different locations are then all different, which guarantees the uniqueness of every location in the area. However, this extreme case requires 10,000 reference tags, which is not realistic when
considering the deployment cost. Therefore, in the<br />
following studies this case is treated as the benchmark for<br />
searching accuracy and deployment cost.<br />
The first case we are interested in is a random deployment. Suppose 800 reference tags are deployed randomly in the
target area. In this case, the deployment cost is only 8% of the benchmark case. Around 70% of the locations in the whole area have unique location signatures, and therefore the searching accuracies for these locations are 100%. In fact, the worst searching accuracy in this area is still larger than 99.8%, which means that under this random tag deployment the worst situation is a location signature shared by no more than 20 locations. In other words, instead of going through all 10,000 locations, any object inside the area can be identified within 20 locations. The tag deployment and searching accuracy for this case are given in Figure 2 (a) and (b), respectively. Figure 2 (c) shows the distribution of the searching accuracy for all the locations in the area.
Figure 2. Case 1: (a) 800 reference tags randomly deployed in the area,<br />
(b) accuracy map <strong>of</strong> the area, (c) distribution <strong>of</strong> the searching accuracies<br />
for all locations.<br />
The results of the random case raise a new question: is it possible to design a better tag deployment that increases the searching accuracy while further reducing the deployment cost? Figure 3 shows the results for this new case. This deployment uses 576 reference tags, deployed as a mesh over the area; the distance between any two vertically or horizontally adjacent tags is 4 grid locations. The deployment cost is lower than that of the random case, at 5.76% of the benchmark case. 74.1% of the locations have unique location signatures and therefore 100% searching accuracy, and the worst searching accuracy is larger than 99.9%. In other words, any object inside the area can be found by searching fewer than 10 locations. In terms of both deployment cost and search accuracy, this case is better than the previous random case; that is, a good deployment design really can improve the searching accuracy and reduce the deployment cost.
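As a rough illustration of how the random and mesh deployments of Cases 1 and 2 could be generated and evaluated, the sketch below reuses build_signatures and accuracy_map from the earlier sketch. The mesh alignment (tags at multiples of the spacing) is an assumption the paper does not fully specify, although this particular choice also yields 576 tags for a spacing of 4.

import random

def random_deployment(n, num_tags, seed=0):
    rng = random.Random(seed)
    return [(rng.uniform(1, n), rng.uniform(1, n)) for _ in range(num_tags)]

def mesh_deployment(n, spacing):
    coords = range(spacing, n, spacing)   # assumed alignment; 24 x 24 = 576 tags for spacing 4
    return [(x, y) for x in coords for y in coords]

n, d = 100, 8
for name, tags in [("random-800", random_deployment(n, 800)),
                   ("mesh-4", mesh_deployment(n, 4))]:
    acc = accuracy_map(build_signatures(n, tags, d), n)
    unique = sum(1 for a in acc.values() if a == 1.0) / len(acc)
    print(f"{name}: {len(tags)} tags, {unique:.1%} unique signatures, "
          f"worst accuracy {min(acc.values()):.2%}")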
Figure 3. Case 2: (a) 576 reference tags deployed as a mesh in the area,<br />
(b) accuracy map <strong>of</strong> the area, (c) distribution <strong>of</strong> the searching accuracies<br />
for all locations.<br />
In the above two cases, there still exist some locations<br />
sharing their signatures with other locations. Therefore,<br />
another question is raised: is it possible to design a tag<br />
deployment such that the searching accuracy in the whole<br />
area is 100% while using fewer tags? Figure 4 shows the results for Case 3. Observing the accuracy map of the previous case, it is obvious that the location signatures in the boundary area are not unique. Therefore, we redesign the tag deployment of the previous case by refining the boundary: the distance between any two vertically or horizontally adjacent tags in the boundary is changed to 2 grid locations. In this case, 1,161 reference tags are deployed in the area and the deployment cost is 11.61% of the benchmark case, which is higher than in the previous two cases. However, every location now has its own unique location signature, and therefore the searching accuracy is 100% for all locations in the area. In other words, any object appearing in the area can be identified at its exact location without any further consideration.
Based on the above discussions, it is clear that with<br />
enough tags well deployed in the target area, 100%<br />
searching accuracy can be achieved. A further question then arises: if only a few tags can be deployed as reference tags, how badly will LSS perform? Figure 5 shows the results for an example using fewer than 100 tags. In this case, only 98 tags are deployed in the area, in a diamond pattern, so the deployment cost is very low. Only 2.16% of the locations have unique location signatures, which suggests that an object appearing in this area would be very difficult to identify at an exact location. However, the searching accuracy over the whole area is still higher than 99.5%; that is, in the worst case an object can be found by going through no more than 50 different locations. Since LSS returns the exact positions of the candidate locations, having to search over 50 locations should not cause too much trouble for general enterprise applications.
Figure 4. Case 3: (a) 1,161 reference tags deployed as a mesh with fine<br />
grained boundary, (b) accuracy map <strong>of</strong> the area, (c) distribution <strong>of</strong> the<br />
searching accuracies for all locations.<br />
Figure 5. Case 4: (a) 98 reference tags deployed in the area with<br />
diamond pattern, (b) accuracy map <strong>of</strong> the area, (c) distribution <strong>of</strong> the<br />
searching accuracies for all locations.<br />
The studies <strong>of</strong> the previous four cases have shown<br />
some fundamental characteristics of LSS. Next, we explore the use of LSS under some real-world constraints.
First, we consider the constraint that some locations in the target area require higher searching accuracy while other locations do not. As an example, Figure 6 shows a rectangular hotspot in the area that requires 100% search accuracy. The tag deployment design for this case combines two schemes: in the hotspot area, tags are deployed uniformly at a spacing of 4 locations, and in the remaining area at a spacing of 8 locations. With this hybrid deployment, a total of 248 tags is used. The worst searching accuracy in the remaining area is 99.7%.
Figure 6. Case 5: (a) 248 tags deployed in the area, (b) accuracy map<br />
<strong>of</strong> the area.<br />
Finally, we consider the situation where the target area is not a completely open space, i.e., there exist non-free locations on which objects cannot be placed. Figure 7 gives an example of this kind. In this case, 504 reference tags are deployed uniformly at a spacing of 4 locations; naturally, no reference tags are deployed on the non-free locations. Even so, the searching accuracy in this area is still maintained above 99.8%.
Figure 7. Case 6: (a) 504 tags deployed in the open area, (b) accuracy<br />
map <strong>of</strong> the open area.<br />
The above six simulations show that LSS can indeed be developed into a practical object searching solution for IoT applications. In order to further verify the behavior of LSS in the real world, an experiment is introduced in the next section.
IV. EXPERIMENTS<br />
The schematic view <strong>of</strong> an LSS solution is depicted in<br />
Figure 8. The Location Signature Deployment (LSD)<br />
process begins with devising and simulating various<br />
reference tag deployment schemes. Then, a tag<br />
deployment scheme is adopted and saved in the location<br />
signature database. Upon receiving a query from a user<br />
who is inquiring where an enterprise asset is, LSS is<br />
performed to return a set <strong>of</strong> possible locations for<br />
searching.
Figure 8. A schematic view <strong>of</strong> LSS system.<br />
In this section, the system described above is implemented to conduct a real experiment. We use our lab as the target area; the lab is a 24 × 12 m² rectangular area. The RFID reader mounted on the object in this experiment is a UHF AWID MPR-2010BN reader with frequency range 902-928 MHz and a reading range of 5 meters. The reference tags deployed in the area are UPM Raflatac UHF tags with frequency range 860-960 MHz. The location resolution chosen for the experiment is 0.6 meters, which means 800 different locations can be identified for searching in the area. The tag deployment is designed as a mesh and the reference tags are attached to the ceiling, as shown in Figure 9. The distance between adjacent tags is 2.4 meters, i.e., the location distance between any two adjacent tags is four. This deployment was evaluated first: it requires 66 tags, 45.5% of the locations possess unique location signatures, and the worst searching accuracy in the area is 99.25%.
Figure 9. Experiment environment: a 24 × 12 m² rectangular area with reference tags deployed on the ceiling.
When performing an experiment, the target object with its mounted reader is first placed at a randomly chosen available location in the area. Due to the uncertainty caused by the experiment environment and equipment, the number of matching location signatures may be zero, one, or more than one. Therefore, based on LSS Step 6 given in Section 2, the number of returned locations may be quite large. It should also be noted that the object may not be at any of the returned locations.
Twenty experiments are conducted, and Table 1 contains the results. The second column of the table gives the number of location signatures returned by LSS, the third column the number of locations identified by those signatures, and the fourth column the searching accuracy. If the target object cannot be found among the returned locations, the accuracy is 0%. If the target object can be found among the returned locations, the searching accuracy is computed using equation 2.2 or 2.3. For example, with the lab's 800 possible locations, experiment 1 returned 54 candidate locations, so its accuracy is (800 − 53) / 800 ≈ 93.4%.
TABLE I.<br />
RESULTS OF TWENTY EXPERIMENTS<br />
Experiment    Number of Possible Signatures    Number of Locations Returned    Search Accuracy
1 29 54 93.4%<br />
2 13 28 96.6%<br />
3 29 64 92.1%<br />
4 15 40 95.1%<br />
5 12 34 95.9%<br />
6 12 34 95.9%<br />
7 12 34 95.9%<br />
8 15 42 94.9%<br />
9 14 26 96.9%<br />
10 14 26 96.9%<br />
11 3 7 99.3%<br />
12 3 7 99.3%<br />
13 3 3 99.8%<br />
14 12 12 98.6%<br />
15 1 1 100.0%<br />
16 1 1 100.0%<br />
17 1 1 100.0%<br />
18 1 1 100.0%<br />
19 4 7 99.3%<br />
20 4 7 99.3%<br />
The first observation from these experiments is that no experiment has a searching accuracy of 0%; in other words, the target object was found in all of them.
The second observation is that in some cases the exact location signature indeed cannot be identified. Even in these situations, however, the searching accuracies are still better than 92.1%.
The third observation is that the largest number of locations returned in any case is 64. Since each location is a 0.6 × 0.6 m² square, searching
for a target object within such an area is really not a time-consuming job; in fact, searching through these 64 locations may take only around ten minutes.
Finally, the average searching accuracy of the twenty experiments is 97.4%, and the average number of locations returned for searching is 21.45. These real-world experiments show the potential of LSS for object searching applications.
V. CONCLUSION<br />
In this paper, object localization and searching are first differentiated. For some IoT searching applications, knowing the possible locations where a target object could be is good enough. To satisfy this requirement, an object searching solution named LSS is proposed. The idea behind LSS is based on a concept called the location signature. Unlike known positioning and localization techniques, LSS is a simpler and readily applicable object searching solution. The simulations and experiments conducted in this research show that the searching accuracy and the implementation cost of LSS are highly related to the tag deployment design. The study of this issue will therefore be our future work.
REFERENCES<br />
[1] N. Gershenfeld, R. Krikorian, and D. Cohen, “The internet<br />
<strong>of</strong> things,” Scientific American, vol. 291, no. 4, pp. 76–81,<br />
2004.<br />
[2] G. Kortuem, F. Kawsar, D. Fitton, and V. Sundramoorthy,<br />
“Smart objects as building blocks for the Internet <strong>of</strong><br />
Things,” IEEE Internet Computing , vol.14, no.1, pp.44-51,<br />
2010.<br />
[3] E. Welbourne, L. Battle, G. Cole, K. Gould, K. Rector, S.<br />
Raymer, M. Balazinska, and G. Borriello, “Building the<br />
Internet <strong>of</strong> Things using RFID,” IEEE Internet Computing,<br />
vol.13, no.3, pp.48-55, 2009.<br />
[4] Till Quack, Herbert Bay, and Luc Van Gool, “Object<br />
Recognition for the Internet <strong>of</strong> Things,” Proceedings <strong>of</strong> the<br />
1st International Conference on Internet <strong>of</strong> Things (IOT<br />
08), pp.230-246, 2008.<br />
[5] Ruey-Shun Chen, Yung-Shun Tsai, and Arthur Tu, “An<br />
RFID-based manufacturing control framework for loosely<br />
coupled distributed manufacturing system supporting mass<br />
customization,” IEICE Transactions on Information and<br />
Systems, vol.E91-D, no.12, pp.2834-2845, 2008.<br />
[6] Luciana Moreira S´a de Souza, Patrik Spiess, Dominique<br />
Guinard, Moritz Kohler, Stamatis Karnouskos, and<br />
Domnic Savio, “SOCRADES: A web service based shop<br />
floor integration infrastructure,” Proceedings <strong>of</strong> the 1st<br />
International Conference on Internet <strong>of</strong> Things (IOT 08),<br />
pp.50-67, 2008.<br />
[7] Ruey-Shun Chen, Mengru Arthur Tu, and Jung-Sing Jwo,<br />
“An RFID-based enterprise application integration<br />
framework for real-time management <strong>of</strong> dynamic<br />
manufacturing processes,” International <strong>Journal</strong> <strong>of</strong><br />
Advanced Manufacturing Technology, published online: 18<br />
March 2010.<br />
[8] Mengru (Arthur) Tu, Jia-Hong Lin, Ruey-Shun Chen, Kai-<br />
Ying Chen, and Jung-Sing Jwo, “Agent-Based Control<br />
Framework for mass customization manufacturing with<br />
UHF RFID technology,” IEEE Systems <strong>Journal</strong>, vol.3,<br />
no.3, pp.343-359, 2009.<br />
[9] Chris Kurschner, Cosmin Condea, Oliver Kasten, and<br />
Frédéric Thiesse, “Discovery Service Design in the<br />
EPCglobal Network,” Proceedings <strong>of</strong> the 1st International<br />
Conference on Internet <strong>of</strong> Things (IOT 08), pp.19-34, 2008.<br />
[10] Ali Dada and Frédéric Thiesse, “Sensor Applications in the<br />
Supply Chain: The Example <strong>of</strong> Quality-Based Issuing <strong>of</strong><br />
Perishables,” Proceedings <strong>of</strong> the 1st International<br />
Conference on Internet <strong>of</strong> Things (IOT 08), pp.140-154,<br />
2008.<br />
[11] Bo Yan and Guangwen Huang, “Supply Chain Information<br />
Transmission based on RFID and Internet <strong>of</strong> Things,”<br />
ISECS International Colloquium on Computing,<br />
Communication, Control, and Management, pp.166-169,<br />
2009.<br />
[12] Patrik Spiess, Stamatis Karnouskos, Dominique Guinard,<br />
Domnic Savio, Oliver Baecker, Luciana Moreira S´a de<br />
Souza, and Vlad Trifa, “SOA-based Integration <strong>of</strong> the<br />
Internet <strong>of</strong> Things in Enterprise Services,” IEEE<br />
International Conference on Web Services, pp.968-975,<br />
2009.<br />
[13] K. Tanaka, Y. Kimuro, K. Yamano, M. Hirayama, E.<br />
Kondo, and M. Matsumoto, “A supervised learning<br />
approach to robot localization using a short-range RFID<br />
sensor,” IEICE Transactions on Information and Systems,<br />
vol.E90-D, no.11, pp.1762-1771, 2007.<br />
[14] Pedro Coronel, Simeon Furrer, Wolfgang Schott, and Beat<br />
Weiss, “Indoor Location Tracking Using Inertial<br />
Navigation Sensors and Radio Beacons,” Proceedings <strong>of</strong><br />
the 1st International Conference on Internet <strong>of</strong> Things<br />
(IOT 08), pp.325-340, 2008.<br />
[15] X. Bao and G. Wang, “Random sampling algorithm in<br />
RFID indoor location system,” 3rd IEEE International<br />
Workshop on Electronic Design, Test and Applications, pp.<br />
168-176, Jan. 2006.<br />
[16] K. Cho, S. Pack, T. Kwon, and Y. Choi, “SRMS: SIP-based
RFID management system,” IEEE International<br />
Conference on Pervasive Services, pp.11-18, July 2007.<br />
[17] E. Coca and V. Popa, “Experimental results and EMC<br />
considerations on RFID location systems,” Proceedings <strong>of</strong><br />
the 1st Annual RFID Eurasia, pp.1-5, Sept. 2007.<br />
[18] F. Dellaert, D. Fox, W. Burgard, and S. Thrun, “Monte<br />
Carlo localization for mobile robots,” Proceedings <strong>of</strong> the<br />
IEEE International Conference on Robotics and<br />
Automation, pp.1322-1328, May1999.<br />
[19] S. Han, H. Lim, and J. Lee, ”An Efficient Localization<br />
Scheme for a Differential-Driving Mobile Robot Based on<br />
RFID System ,” IEEE Trans. Ind. Electron., vol. 54, no. 6,<br />
pp. 3362-3369, Dec. 2007.<br />
[20] J. Hightower, R. Want , and G. Borriello, “SpotON: An<br />
indoor 3D location sensing technology based on RF signal<br />
strength,” University <strong>of</strong> Washington, Department <strong>of</strong><br />
Computer Science and Engineering, Seattle, Feb. 2000.<br />
[21] J. Hightower, and G. Borriello, “Location systems for<br />
ubiquitous computing,” IEEE Computer, vol. 34, no. 8, pp.<br />
57-66, Aug. 2001.<br />
[22] J.-H. Lee and H. Hashimoto, “Controlling mobile robots in<br />
distributed intelligent sensor network,” IEEE Trans. Ind.<br />
Electron., vol. 50, no. 5, pp. 890-902, Oct. 2003.<br />
[23] L. M. Ni, Y. Liu, Y. C. Lau, and A. P. Patil,<br />
“LANDMARC: Indoor location sensing using active<br />
RFID,” Wireless Network, pp. 701–710, Nov. 2004.<br />
[24] S. Polito, D. Biondo, A. Iera, M. Mattei, and A. Molinaro,<br />
“Performance Evaluation <strong>of</strong> Active RFID Location<br />
Systems based on RF Power Measures,” The 18th Annual<br />
IEEE International Symposium on Personal, Indoor and<br />
Mobile Radio Communications, pp. 1-5, Sept. 2007.
[25] C. Roduner , and C. Floerkemeier ,”Towards an Enterprise<br />
Location Service,” International Symposium on<br />
Applications and the Internet Workshops (SAINTW'06),<br />
pp. 84-87, Jan. 2006.<br />
[26] K. Yamano, K. Tanaka, M. Hirayama, E. Kondo, Y.<br />
Kimuro, and M. Matsumoto, “Self-localization <strong>of</strong> mobile<br />
robots with RFID system by using support vector<br />
machine,” Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst.,<br />
vol. 4, pp. 3756-3761, Sep. 2004.<br />
[27] L. Yan, Y. Zhang, L. T. Yang, and H. Ning, The Internet<br />
<strong>of</strong> Things: From RFID to the Next-Generation Pervasive<br />
Networked Systems (Wireless Networks and Mobile<br />
Communications), Auerbach Publications, Taylor &<br />
Francis Group, FL, USA, 2008.<br />
[28] T. Zhang, Z. Xiong, and Y. Ouyang, “A Framework <strong>of</strong><br />
Networked RFID System Supporting Location<br />
Tracking,”2nd IEEE/IFIP International Conference in<br />
Central Asia on Internet, pp. 1-4, Sept. 2006.<br />
[29] J. Zhao, Y. Zhang, and M. Ye, “Research on the received<br />
signal strength indication location algorithm for RFID<br />
system,” International Symposium on Communications<br />
and Information Technologies, pp. 881-885, Sept. 2006.<br />
[30] Frédéric Thiesse, Markus Dierkes, and Elgar
Fleisch, "LotTrack: RFID-Based Process Control in the<br />
Semiconductor Industry," IEEE Pervasive Computing, vol.<br />
5, no. 1, pp. 47-53, Jan.-Mar. 2006.<br />
[31] Jiahao Wang, Zongwei Luo, and Edward C Wong, “RFID-enabled
tracking in flexible assembly line,” International<br />
<strong>Journal</strong> <strong>of</strong> Advance Manufacturing Technology, 46:351–<br />
360, June 2009.<br />
[32] Yimin Zhang, Moeness G. Amin, and Shashank Kaushik,<br />
“Localization and Tracking <strong>of</strong> Passive RFID Tags Based<br />
on Direction Estimation,” International <strong>Journal</strong> <strong>of</strong><br />
Antennas and Propagation, vol. 2007, Article ID 17426, 9<br />
pages, 2007.<br />
[33] Guang-yao Jin, Xiao-yi Lu, and Myong-Soon Park, “An<br />
Indoor Localization Mechanism Using Active RFID Tags,”<br />
in Proceedings <strong>of</strong> the IEEE International Conference on<br />
Sensor Networks, Ubiquitous, and Trustworthy Computing<br />
(SUTC'06), vol.1, pp. 40–43, Taichung, Taiwan, June 2006.<br />
[34] Jurgen Bohn, “Prototypical implementation of location-aware
services based on a middleware architecture for<br />
super-distributed RFID tag infrastructures,” Personal and<br />
Ubiquitous Computing, 12:155-166, 2008.<br />
[35] Ching-Hsien Hsu, Yi-Min Chen, and Heau-Jo Kang,<br />
“Performance-Effective and Low-Complexity Redundant<br />
Reader Detection in Wireless RFID Networks,” EURASIP<br />
<strong>Journal</strong> on Wireless Communications and Networking, vol.<br />
2008, Article ID 604747, 9 pages, 2008.<br />
Jung-Sing Jwo received his BSE degree from the National<br />
Taiwan University in 1984, MSE and PhD degrees from the<br />
University <strong>of</strong> Oklahoma, in 1988 and 1991 respectively, both in<br />
computer science. He is currently with the Department <strong>of</strong><br />
Computer Science, Tunghai University, Taiwan. His research<br />
interests include distributed computing, enterprise computing<br />
and s<strong>of</strong>tware engineering.<br />
Ting-Chia Chen is a graduate student at the Department <strong>of</strong><br />
Computer Science, Tunghai University, Taiwan. His research<br />
interests are in RFID and Internet <strong>of</strong> Things (IoT).<br />
Mengru Tu received his MBA degree in Information<br />
Management from the University <strong>of</strong> Texas at Austin and M.S.<br />
degree in Computer Science from the Northwestern Polytechnic<br />
University, United States, in 1998 and 2002 respectively. He<br />
received his PhD degree in Information Management from the<br />
National Chiao Tung University, Taiwan, in Jan 2011.<br />
He is currently a full-time researcher at the Industrial<br />
Technology Research Institute <strong>of</strong> Taiwan and has been working<br />
there since 2004. His research interests include Internet <strong>of</strong><br />
Things (IoT) and RFID, intelligent agents, logistics information<br />
systems, e-commerce, and empirical research in information<br />
systems.
Fractional Order Ship Tracking Correlation<br />
Algorithm<br />
Mingliang Hou<br />
School of Computer Engineering / Huaihai Institute of Technology, Lianyungang, China
Email: ioehml@yahoo.cn
Yuran Liu
School of Computer Engineering / Huaihai Institute of Technology, Lianyungang, China
Email: cnlyr@sina.com.cn
Abstract—To address the fuzzy association of ship tracks observed by radar and AIS sensors, which is caused by their different errors, this article proposes a fractional-order ship tracking correlation model. From the mathematical point of view, the model extends the integer-order correlation measure to a fractional-order correlation measure; from the correlation point of view, it extends the non-process correlation of point information to the process correlation of line information. An example shows that the fractional-order association algorithm can provide much more correlation information and enhance the ship tracking correlation accuracy.
Index Terms—fractional order, ship tracking association,<br />
process association, association.<br />
I. INTRODUCTION<br />
In recent years, the rapid economic development in<br />
various regions and countries in the world and the global<br />
economic integration have paved the way for the<br />
prosperous development <strong>of</strong> shipping industry. With the<br />
thriving of water transportation and the increasing density of maritime navigation, ships encounter one another more frequently and offshore traffic accidents are inevitably on the increase. Consequently, accurate tracking and detection of ships has become an important research subject. Currently, radar and the automatic identification system (AIS) are the main means of ship tracking and detection; nevertheless, the information provided by radar is subject to the influence of sea conditions and topography, while the non-autonomous detection of AIS and the fact that not all ships are fitted with AIS also limit its application. Therefore, in order to integrate the complementary radar and AIS information, the AIS information and the radar information first need to be correlated, namely a tracking correlation must be established. Tracking correlation is the
process of comparing the degree of similarity between two tracks and judging whether tracks from different sensors with different errors belong to the same target.
II. REVIEW<br />
A. Tracking-related Review<br />
Tracking-related issues are essentially time-series correlation problems [1,2], first put forward by Singer and Kanyuck. At present, the major algorithms include
the weighted statistical distance test, the revised weighted<br />
statistical distance test, the nearest neighbor(NN), the<br />
classic distribution method[3], the likelihood ratio test,<br />
the maximum likelihood method, K neighbor, multivariant<br />
hypothesis test, generalized correlation method,<br />
interacting multi-model method[4], grey correlation<br />
algorithm[5-8] and various other fuzzy methods[9].<br />
Because targets are concentrated in the port area, target tracks often intersect, there are many maneuvering tracks, and the data have a high degree of greyness without a typical distribution pattern. Under these conditions the algorithms above produce many wrong or missing track associations and thus fail to meet the accuracy requirement of tracking correlation. Grey correlation analysis, which is based on grey system theory, can fill this gap: it analyzes the correlation degree between the various factors of a system by comparing the geometric relations of the system's data sequences.
In recent years, the grey correlation algorithm has developed significantly, and many scholars have made great contributions. In terms of the relational degree itself, it evolved from grey relational algorithms with no differential metrical information (such as Tang's correlation, absolute correlation II, relative correlation, correlation interval I, and range correlation II) to grey relational algorithms with first-order differential metrical information (such as absolute relational degree I, slope correlation, and T-type correlation), and then to grey relational algorithms with second-
order differential metrical information (B-type correlation,<br />
C-type correlation).<br />
The above indicates that introducing high-order and fractional-order information into the association metrics of a series is the development trend of correlation algorithms, and it also corresponds to the need for accurate track association.
In this paper, the notion of fractional order is introduced into the tracking correlation model, and the similarities of both global shape and local shape are taken into consideration. In addition, point identification is expanded to line identification so as to improve spatial resolution and system reliability and to reduce the uncertainty of system information, thus enhancing the accuracy of tracking correlation.
B. Summary <strong>of</strong> Fractional Differential<br />
Fractional calculus refers to calculus of any real-number order. For more than three centuries many famous scientists did basic work on fractional calculus, but it only really began to grow in the last 30 years. Oldham and Spanier discussed mathematical calculations of fractional order and their application in areas such as physics, engineering, finance and biology. In 1993, Samko gave a systematic and comprehensive exposition of fractional integrals and derivatives, their properties and their applications. Many researchers have found that fractional derivative models can describe memory and hereditary properties of materials and distribution processes more accurately than integer-order derivative models [10]. The global and memory characteristics of fractional operators are widely used in physics, chemistry, materials science, fractal theory [11], image processing [12] and other fields. Currently, the analysis of fractional differentials has become an active research area that has attracted great attention from scholars at home and abroad, and it is a leading-edge and hot research field worldwide.
III. ALGORITHM PRINCIPLE<br />
In this paper, the notion of fractional order is introduced into the tracking correlation model. The similarities of both overall shape and partial shape are considered, so that the original point recognition is expanded to line recognition; this reduces the uncertainty of the information system and improves the accuracy of tracking correlation [13,14].
A. Fractional Order Differential<br />
Differential operations enhance the high-frequency components of a signal and weaken its low-frequency components. A fractional differential operation does this nonlinearly: as the order grows, it boosts the high frequencies more and attenuates the low frequencies less. From the perspective of information extraction, the order of integer-order operations is discrete whereas the fractional order is continuous, so fractional differentials can provide more track correlation information to help identify the target track.
Each observed value on a track is the combined result of a variety of subjective and objective factors and of the evolution of all previous observations; a track therefore has global and memory characteristics. The fractional differential operator is a differential operator with global and memory characteristics, whereas the integer-order operator does not have this feature. From the description of a track it can therefore be concluded that the fractional differential describes the memory nature of a track more accurately than the integer-order one, and it is introduced here to calculate the relevancy of tracks.
B. Fractional Differential Difference Form<br />
Since a time series is discrete, when fractional differentials are used in its association calculation, the definition of the fractional differential must be changed into a difference form. We therefore derive the fractional differential difference formula via the Grünwald-Letnikov definition.
The v-order fractional differential in the Grünwald-Letnikov definition is

${}_{a}^{G}D_{t}^{v}s(t) = \lim_{h\to 0} s_{h}^{(v)}(t) = \lim_{\substack{h\to 0 \\ nh = t-a}} h^{-v} \sum_{r=0}^{n} C_{r}^{-v}\, s(t-rh)$   (1)

where

$C_{r}^{-v} = \dfrac{(-v)(-v+1)\cdots(-v+r-1)}{r!}$.
According to Expression (1), if the duration of s(t) is $t \in [a, t]$ and $[a, t]$ is divided into equal parts with unit interval $h = 1$, then

$h = 1, \qquad n = \left[\dfrac{t-a}{h}\right] = [\,t-a\,]$.
Then the v-order fractional differential difference expression for a unitary signal s(t) is

$\dfrac{d^{v}s(t)}{dt^{v}} \approx s(t) + (-v)\,s(t-1) + \dfrac{(-v)(-v+1)}{2}\,s(t-2) + \dfrac{(-v)(-v+1)(-v+2)}{6}\,s(t-3) + \cdots + \dfrac{\Gamma(-v+1)}{n!\,\Gamma(-v+n+1)}\,s(t-n)$.
From this difference expression, the difference coefficients of the fractional differential are

$a_{0} = 1,\; a_{1} = -v,\; a_{2} = \dfrac{(-v)(-v+1)}{2},\; a_{3} = \dfrac{(-v)(-v+1)(-v+2)}{6},\; \ldots,\; a_{n} = \dfrac{\Gamma(-v+1)}{n!\,\Gamma(-v+n+1)}$.   (2)
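A small Python sketch of these coefficients and of the resulting v-order difference of a discrete series is given below. The recursion a_r = a_{r-1}(r − 1 − v)/r reproduces the coefficients in (2); the function names and the finite memory window are our own illustrative choices, not the authors' implementation.

def gl_coefficients(v, n):
    # Coefficients a_0 .. a_n of expression (2); a_r = a_{r-1} * (r - 1 - v) / r.
    a = [1.0]
    for r in range(1, n + 1):
        a.append(a[-1] * (r - 1 - v) / r)
    return a

def fractional_difference(s, v, memory=20):
    # v-order Gruenwald-Letnikov difference of the discrete series s (unit step h = 1),
    # using at most `memory` past samples at each point.
    a = gl_coefficients(v, memory)
    return [sum(a[r] * s[j - r] for r in range(min(j, memory) + 1))
            for j in range(len(s))]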
C. The Nature <strong>of</strong> Fractional Differential<br />
The fractional differential operator satisfies commutativity and superposition:

$D^{v_1}D^{v_2}s(t) = D^{v_2}D^{v_1}s(t) = D^{v_1+v_2}s(t)$.
A differential order in (0, 1) measures the overall behavior of the sequence, and results for other differential orders can all be obtained by iterating integer-order differentiation on it. The first-order differential reflects the slope of the track and the second-order differential reflects its curvature; both respond to the local trends of the track. To take both global and local trends into account, non-integer orders in (0, 1) and the integer orders one and two are considered; therefore, this paper only analyzes the correlation information of differentials of order (0, 3).
IV. ALGORITHM MODEL<br />
Based on the time series of each track in the x, y and z directions, the respective correlation curves are computed and the correlation degree of each curve at each differential order is analyzed so as to confirm track correlation.
Suppose t tracks are acquired: $s_{11}, s_{12}, \ldots, s_{1m}$; $s_{21}, s_{22}, \ldots, s_{2m}$; $\ldots$; $s_{t1}, s_{t2}, \ldots, s_{tm}$, where $S_{ij} = (x_{ij}, y_{ij})$, $i = 1, 2, \ldots, t$, $j = 1, 2, \ldots, m$, and $(x_{ij}, y_{ij})$ is the spatial coordinate of $S_{ij}$. Take $s_{11}, s_{12}, \ldots, s_{1m}$ as the reference vector sequence and compute the correlation degree between $s_{i1}, s_{i2}, \ldots, s_{im}$, $i = 2, 3, \ldots, t$, and the reference vector sequence. The sequence correlated with the reference sequence is then found, so as to discover the correlated track.
A. Fractional Order Correlation<br />
Based on the vector sequence $S_{ij} = (x_{ij}, y_{ij})$, $i = 1, 2, \ldots, t$, $j = 1, 2, \ldots, m$, generate the following sequences in the x and y directions respectively:

$X_{i} = x_{i1}, x_{i2}, \ldots, x_{im}$, $i = 1, 2, \ldots, t$
$Y_{i} = y_{i1}, y_{i2}, \ldots, y_{im}$, $i = 1, 2, \ldots, t$

Calculate the correlation degree of the above sequences respectively, and define the correlation degree $Q_{i}(v)$ between sequence $Q_{1}$ and sequence $Q_{i}$, $i = 2, 3, \ldots, t$, as

$Q_{i}(v) = \dfrac{1}{m-7} \sum_{j=8}^{m} \left( Q_{1j}^{v} - Q_{ij}^{v} \right)^{2}, \qquad i = 2, 3, \ldots, t,\; v \in (0, 3),$

where

$Q_{1j}^{v} = \sum_{d=j-5}^{j} q_{1d}\, a_{j-d+1}, \qquad Q_{ij}^{v} = \sum_{d=j-5}^{j} q_{id}\, a_{j-d+1},$

$i = 2, 3, \ldots, t$; $Q = X, Y$; $q = x, y$.
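The following Python sketch shows one reading of this correlation degree, reusing gl_coefficients from the earlier sketch; the 6-sample window and the coefficient index j − d + 1 follow the formula as printed, which is our interpretation of the garbled typesetting rather than a verified reproduction of the authors' code.

def weighted_window(q, j, a):
    # Q^v_{.j} = sum_{d=j-5}^{j} q_d * a_{j-d+1}  (0-based list q, 1-based coefficient index)
    return sum(q[d] * a[j - d + 1] for d in range(j - 5, j + 1))

def correlation_degree(q1, qi, v):
    # Q_i(v) = (1 / (m - 7)) * sum_{j=8}^{m} (Q^v_{1j} - Q^v_{ij})^2,  v in (0, 3)
    a = gl_coefficients(v, 6)
    m = len(q1)
    return sum((weighted_window(q1, j, a) - weighted_window(qi, j, a)) ** 2
               for j in range(7, m)) / (m - 7)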
B. Correlation Judgment<br />
(1) Judgment for numerical value <strong>of</strong> correlation:<br />
The greater the value of $Q_{i}(v)$, the smaller the correlation degree between sequence $Q_{i}$ and $Q_{1}$; conversely, the smaller the value of $Q_{i}(v)$, the greater the correlation degree between sequence $Q_{i}$ and $Q_{1}$.
(2) Relation between order and correlation:
Compared with high-order differentials, low-order differentials extract more low-frequency information and less high-frequency information. For a track, low-order differentials extract more long-term-effect information while high-order differentials extract more short-term-effect information.
When $Q_{i}(v) < Q_{j}(v)$, $i, j = 2, 3, \ldots, t$, and v ranges between 0 and 1, sequence $Q_{i}$ has a greater correlation with sequence $Q_{1}$ than sequence $Q_{j}$ does in the long term; when v ranges between 1 and 3, sequence $Q_{i}$ has a greater correlation with sequence $Q_{1}$ than sequence $Q_{j}$ does in the short term.
V. ALGORITHM SIMULATION<br />
Based on the tracks $S_{ij} = (x_{ij}, y_{ij})$, $i = 1, 2, \ldots, 7$, $j = 1, 2, \ldots, 26$, generate the following sequences in the x and y directions respectively:

$X_{i} = x_{i1}, x_{i2}, \ldots, x_{i26}$, $i = 1, 2, \ldots, 7$, where $x_{ij} \in [0, 1000]$, see Fig. 1;
$Y_{i} = y_{i1}, y_{i2}, \ldots, y_{i26}$, $i = 1, 2, \ldots, 7$, where $y_{ij} \in [0, 1000]$, see Fig. 2.
Figure 1. Tracking data in x direction
Figure 2. Tracking data in y direction<br />
Taking into account the detection errors of radar and AIS, which are about 20 meters and 2 meters respectively, the simulation experiments are structured as follows. Add noise in the range [-2, 2] to the reference sequences $X_{1}$, $Y_{1}$ to get $X_{11}$, $Y_{11}$; add noise in the range [-20, 20] to all the tracks to get $X_{i2}$, $Y_{i2}$, $i = 1, 2, \ldots, 7$. Then calculate separately the correlation degree curves (see Figs. 3 and 4) over the order range (0, 3) between $X_{i2}$ and $X_{11}$ and between $Y_{i2}$ and $Y_{11}$. In the figures, mqhk stands for the correlation degree between qh and qk, where m denotes the correlation degree, $q = x, y$, and h and k are the subscripts of the sequences generated after adding noise.
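A rough reproduction of this set-up is sketched below. The tracks here are synthetic stand-ins for the data of Figs. 1-2 (which are not reproduced), and correlation_degree is the sketch from Section IV; with well-separated tracks, the noisy copy of track 1 should yield the smallest correlation values against the reference, as expected above.

import random

def add_noise(seq, amp, rng):
    return [s + rng.uniform(-amp, amp) for s in seq]

rng = random.Random(1)
# Seven synthetic, well-separated tracks of 26 samples each (stand-ins for Figs. 1-2).
tracks = [[100 * i + 30 * j + rng.uniform(0, 5) for j in range(26)] for i in range(7)]

x11 = add_noise(tracks[0], 2, rng)              # reference track + AIS-like noise [-2, 2]
xi2 = [add_noise(t, 20, rng) for t in tracks]   # every track + radar-like noise [-20, 20]

orders = [k / 10 for k in range(1, 30)]         # differential orders v in (0, 3)
for i, x in enumerate(xi2, start=1):
    curve = [correlation_degree(x11, x, v) for v in orders]
    print(f"track {i}: min correlation value over v = {min(curve):.3f}")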
Figure 3. Curves of correlation degree over the order range (0, 3) between $X_{i2}$ and $X_{11}$, i = 1, 2, ..., 7.
Figure 4. Curves of correlation degree over the order range (0, 3) between $Y_{i2}$ and $Y_{11}$, i = 1, 2, ..., 7.
From the curves of correlation degree over the order range (0, 3) between $X_{i2}$ and $X_{11}$ and between $Y_{i2}$ and $Y_{11}$ in Figs. 3 and 4, it can be seen that the correlation values between $X_{12}$ and $X_{11}$ and between $Y_{12}$ and $Y_{11}$ are far smaller than the other correlation values over the whole order range. Therefore, it can be concluded that $X_{12}$ and $X_{11}$, and $Y_{12}$ and $Y_{11}$, are correlated, and thus $S_{11}$ and $S_{12}$ are tracks of the same target. The results agree with the experimental expectation.
VI. CONCLUSIONS<br />
The experiments show that added error affects the high-frequency information of a sequence and that, as the error increases, the influence gradually extends from higher orders to lower orders. The order used to calculate the correlation degree can therefore be adjusted according to the error: the greater the error, the smaller the order that should be chosen. The experiments also show that when noise in the ranges [-20, 20] and [-2, 2] is added to sequences with values in [0, 1000], the required correlation accuracy can still be achieved with this algorithm.
ACKNOWLEDGMENT<br />
This work was financially supported by the National<br />
Natural Science Foundation <strong>of</strong> China (Grant No.<br />
10731050) and Ministry <strong>of</strong> Education Innovation Team<br />
<strong>of</strong> China(No. IRT00742).<br />
REFERENCES<br />
[1] Agafonov E, Bargiela A, and Burke E. "Mathematical<br />
justification of a heuristic for statistical correlation of real-life
time series". European <strong>Journal</strong> <strong>of</strong> Operational<br />
Research, vol.198,pp. 275-286,2009.<br />
[2] Kharrat A., Chabchoub H., and Aouni B. "Serial<br />
correlation estimation through the imprecise Goal<br />
Programming model". European <strong>Journal</strong> <strong>of</strong> Operational<br />
Research, vol.177,pp.1839-1851.2007.
[3] Chang C.B. and Youens L.C.. "Measurement correlation<br />
for multiple sensors tracking in a dense target<br />
environment". IEEE Trans AC ,vol.27,pp.1250-1252,1982.<br />
[4] Zhang J. S. , Yang W. Q. and Hu S. Q.. "Target Tracking<br />
Using the Interactive Multiple Model Method". <strong>Journal</strong> <strong>of</strong><br />
Beijing Institute <strong>of</strong> Technology, vol.7,pp.299-304,1998.<br />
[5] Deng J. L. and Zhou C. S.. "Sufficient conditions for the<br />
stability <strong>of</strong> a class <strong>of</strong> interconnected dynamic systems".<br />
Systems & Control Letters. vol.7,pp.105-108,1986.<br />
[6] Deng J. L.. "Introduction to grey mathematical resources.<br />
<strong>Journal</strong> <strong>of</strong> Grey System". vol.20,pp.87-92,2008.<br />
[7] Guan X. , He Y. and Yi X.. "Gray track-to-track<br />
correlation algorithm for distributed multitarget tracking<br />
system". Signal Processing, vol.86,pp.3448-3455,2006.<br />
[8] Tsai M. S. and Hsu F. Y.. "Application <strong>of</strong> in grey<br />
correlation analysis Evolutionary Programming for<br />
Distribution System Feeder Reconfiguration". IEEE<br />
Transactions on Power Systems.vol.25,pp.1126-<br />
1133,2010.<br />
[9] Ye. J.. "Fuzzy decision-making method based on the<br />
weighted correlation coefficient under intuitionistic fuzzy<br />
environment". European <strong>Journal</strong> <strong>of</strong> Operational Research,<br />
vol.205,pp.202-204,2010.<br />
[10] Valdes-Parada F.J., Oehoa-Tapia J.A. and Alvarez-<br />
Ramirez J.."Effective Medium Equation for Fractional<br />
Cattaneo’s Diffusion and Heterogeneous Reaction in<br />
Disordered Porous Media". Physica A: Statistical<br />
Mechanics and its Applications , vol.369,pp.318-<br />
328,2006.<br />
[11] Carpinteri A., Cornetti.P.."A Fractional Calculus<br />
Approach to the Description <strong>of</strong> Stress and Strain<br />
Localization in Fractal Media".Chaos,Solitons and<br />
Fractals,vol.13,pp.85-94,2002.<br />
[12] Hou M. L., Liu Y. R., Wang Q.. "An image information<br />
extraction algorithm for salt and pepper noise on<br />
fractional differentials". Advanced Materials Research,<br />
vol.179, pp.1011-1015,2011.<br />
[13] Liu Y. R., Luo M. K., Ma H., Hou M. L.."Fractional<br />
Order Correlation Algorithm <strong>of</strong> Uncertain Time<br />
Sequence". <strong>Journal</strong> <strong>of</strong> Grey System. vol.14,pp.55-60,2011.<br />
[14] Liu Y. R.,Hu Y., Hou M. L. "Fractional Order Gray<br />
Prediction Algorithm". vol.14,pp.5-10,2011.<br />
Ming-Liang Hou received his M.S. degree from China University of Petroleum in 2004 and his Ph.D. degree from the Institute of Optics and Electronics, Chinese Academy of Sciences, in 2008. He is now a Lecturer in the School of Computer Engineering, Huaihai Institute of Technology, Lianyungang, China. His current research interests range over the fields of Grey Theory, Pattern Recognition, Virtual Simulation, and Intelligent Fault Diagnosis Technology.

Yu-Ran Liu received her M.S. degree from China University of Petroleum in 2004 and her Ph.D. degree from the Institute of Optics and Electronics, Chinese Academy of Sciences, in 2009. She is now an Associate Professor in the School of Computer Engineering, Huaihai Institute of Technology, Lianyungang, China. Her current research interests lie in the fields of Grey Theory, Operational Research, Image Processing, Pattern Recognition, etc.
A Label Correcting Algorithm for Dynamic Tourist Trip Planning

Jin Li and Peihua Fu
School of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou, P.R. China
Email: jinli@mail.zjgsu.edu.cn, fph@mail.zjgsu.edu.cn

doi:10.4304/jsw.7.12.2899-2905
Abstract—One of the most important considerations for a tourist is how to design an optimal trip plan. Selecting the most interesting points of interest and designing a personalized tourist trip is called the tourist trip design problem (TTDP), and it can be modeled as an orienteering problem. However, most previous studies solve this problem in a static network, which can lead to unreasonable results. In this paper, to formulate the orienteering problem in a time-dependent network, a mathematical programming model is built by introducing a time-aggregated graph. Then a novel label correcting algorithm (LCA) is presented to solve this problem based on the ideas of network planning and dynamic programming. Finally, a numerical example is given to show the effectiveness and feasibility of the algorithm.
Index Terms—tourist trip planning, orienteering problem,<br />
time-dependent network, label correcting algorithm,<br />
tourism<br />
I. INTRODUCTION<br />
Recently, China has hosted various sightseeing and exhibition activities, such as the Beijing Olympic Games, the Shanghai World Expo, and many types of commercial exhibitions. Many tourists visit a region or a city during only one or a few days. Obviously, it is impossible to visit everything, and tourists want to select what they believe to be the most attractive points of interest (POI). Nowadays, this selection can be made according to information on web sites, articles in magazines, or guidebooks available in bookstores. Once the decision is made, the tourists decide on the time to visit each point and choose a route between the points. However, it is difficult for tourists to design such a schedule, as many factors, such as crowded places, traffic accidents, or roads under construction or closed, introduce uncertainties into trip plans. The goal for tourists is to decide on a trip plan in which as many POIs as possible are visited within a time budget. In fact, traditional tourist trip design in a static network cannot meet this demand. With the help of mobile and sensing technology, and taking real-time traffic information into account, only tourist trip design in a dynamic network is able to provide personalized and high-quality services for tourists.
Nowadays, the tourist trip design problem (TTDP) is commonly seen as an extension of the orienteering problem (OP) [1]. The OP is also known as the Selective Traveling Salesperson Problem (STSP) [2-4], the knapsack problem [5], the Maximum Collection Problem (MCP) [6], and the bank robber problem [7]. Furthermore, the OP can be formulated as a special case of the Resource-Constrained TSP (RCTSP) [8]. Since these kinds of problems are NP-hard, most research has focused on heuristics and metaheuristics [9-14], e.g., guided local search and ant colony algorithms. Dynamic trip design relies mainly on mobile communication technology to provide tourists with location-based, real-time, and fast navigation services. Meanwhile, it can also help tourists choose target points and find the shortest route to reach them. Such mobile tourist guides include Cyberguide [15], Gulliver's Genie [16], GUIDE [17], CRUMPET [18], etc. However, tourists' preferences are diverse [19] with the rising quality of life and aesthetic taste. Static trip design does not take into account the specific situation of the tourist, e.g., the tourist's start and end visiting times, the current time, the total time budget, and the weather conditions. So personalized mobile tourist guides based on context and location become an inevitable trend. Ten Hagen et al. [20] designed a mobile tourist guide, the Dynamic Tour Guide (DTG), to decide on the tour for visiting a city during a period of time. They saw the tour as a set of Tour Building Blocks (TBB), which denote attractions, hotels, and other items chosen by the tourist. Each TBB then receives an interest-matching score from a semantic matching algorithm, which calculates the degree of similarity between the TBB and the profile of the user. Finally, a heuristic approximation algorithm is used to calculate the tour. Souffriau et al. [21] proposed a personalized tourist trip design algorithm for mobile tourist guides. By combining an artificial intelligence approach and a guided local search metaheuristic, this algorithm provided fast decision support for tourists using mobile devices, and it was compared with other algorithms using a real data set from the city of Ghent.
The above studies consider the TTDP only in a static network, based on the start point, destination point, and time budget, without addressing the tourist trip design problem in a dynamic, time-dependent network. However, the tourist trip network is actually time-dependent in real exhibitions or tourism because of crowded places, emergencies, temporary shows, promotions, and so on. For example, the tourist travel time differs significantly between peak and off-peak times in exhibitions. With the development of advanced sensor networks, information systems, and databases, dynamic traffic information is available. Therefore, to formulate the tourist trip design problem in a time-dependent network, based on previous research on the time-aggregated graph [22], this paper proposes a mathematical programming model and puts forward a label correcting algorithm (LCA) to solve this problem.
This paper is organized as follows. The problem formulation for the tourist trip design problem in a time-dependent network is described in Section 2. In Section 3, the label correcting algorithm is given. Section 4 validates the proposed label correcting algorithm with a numerical example. Finally, concluding remarks and further research are given in Section 5.
II. PROBLEM DESCRIPTION<br />
A. Problem Assumptions
The tourist trip design problem in a time-dependent network can be expressed as follows. Given a set of visiting points and the tourist's preference values, each visiting point has an average stay time and can be visited only once. The objective of the problem is to decide on the time to visit each point and to select a route between the points so as to maximize the total utility of the tourist trip (that is, the sum of the preference values of all selected visiting points) within a given time budget. This paper solves the tourist trip design problem in the time-dependent network under the following assumptions:
• In a time-dependent network, the travel time on an edge depends on the time of entering it, i.e., once an edge is entered, the travel time is determined by the entry time. It is assumed that time is discretized into small units (such as 1 hour, or less than 10 minutes).
• A tourist has a preference value for every visiting point, set as a random integer from the interval [1, 10]. This value represents the tourist's interest in the point. In practical applications, this preference value can be obtained using information retrieval or semantic identification technology via mobile tourist guide devices [20][21].
• A tourist usually decides on a trip plan according to his or her schedule, so the time budget is set to be fixed and constant.
• Generally speaking, a tourist avoids going to a visiting point more than once, so each point is assumed to be visited at most once.
• The phenomena of returning and circling on a road are not considered.
• Taking into account the wide use of traffic sensor networks, information systems, and databases [23][24], the travel times at the discrete time instants are assumed to be dynamic and available.
B. Time-aggregated Network<br />
Given a transportation network $G = (V, E)$, where $V = \{V_i \mid i = 1, 2, \ldots, n\}$ is the node set and $E = \{e_{ij} \mid V_i, V_j \in V\}$ is the edge set: if $V_i$ is adjacent to $V_j$, there is one edge $e_{ij}$ between them, and $|E| = m$. Since both the existence of an edge and the travel time on an edge are time-dependent, it is not straightforward to describe how the tourist moves on each edge. We use a time-aggregated graph [22] to represent the network change at each time instant.
In the time-aggregated graph, each edge $e_{ij}$ has two properties: a time series $ET_{ij}$ indicating the time instants at which the edge is present, and a travel time series $TT_{ij}$ representing the travel times at the various instants. $ET_{ij} \subseteq [1, 2, \ldots, T]$ represents the existing state of the edge at each instant; it is the set of instants at which the edge is connected, and thus captures how the network topology varies with time. The travel time series $TT_{ij} = (T_{ij}(1), T_{ij}(2), \ldots, T_{ij}(T))$ gives the travel time when the edge is present, where $T_{ij}(t)$ is the travel time on edge $e_{ij}$ when the entry time is $t$. If edge $e_{ij}$ is not connected at time $t$, then $T_{ij}(t) = \infty$, which means the edge cannot be traversed and the tourist must wait. For example, a time-dependent network at time instants $t = 1, 2, 3$ is shown in Figure 1. The topological structure and the travel times of this network vary with time. Edge $e_{21}$ is present at times $t = 1, 2$ but disappears at time $t = 3$, i.e., no passing is possible. Moreover, the travel time of edge $e_{21}$ equals 1 at time $t = 1$ and becomes 5 at time $t = 2$. The corresponding time-aggregated graph is illustrated in Figure 1(d), where edge $e_{21}$ carries two series: $[1, 2]$ is the time series of the instants at which the edge is present, and $(1, 5, \infty)$ is the travel time series giving the travel times at instants 1, 2, and 3.
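To make the time-aggregated representation concrete, the following minimal Python sketch (ours, not from the paper; all class and field names are illustrative) stores, for each directed edge, the presence series ET_ij and the travel time series TT_ij, and answers travel-time queries for a given entry instant.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

INF = float("inf")

@dataclass
class TimeAggregatedGraph:
    """Time-aggregated graph: for every directed edge (i, j) we keep the instants at
    which the edge is present (ET_ij) and the travel times at instants 1..T (TT_ij)."""
    horizon: int                                                     # number of time instants T
    present: Dict[Tuple, List[int]] = field(default_factory=dict)    # ET_ij
    travel: Dict[Tuple, List[float]] = field(default_factory=dict)   # TT_ij, index 0 = instant 1

    def add_edge(self, i, j, et, tt):
        self.present[(i, j)] = list(et)
        self.travel[(i, j)] = list(tt)

    def travel_time(self, i, j, t):
        """Travel time on edge (i, j) when it is entered at instant t (infinite if absent)."""
        if (i, j) not in self.present or t not in self.present[(i, j)]:
            return INF                              # edge absent: cannot be traversed, must wait
        return self.travel[(i, j)][t - 1]

# Edge e21 of Figure 1(d): present at instants 1 and 2, travel time series (1, 5, inf).
g = TimeAggregatedGraph(horizon=3)
g.add_edge(2, 1, et=[1, 2], tt=[1, 5, INF])
assert g.travel_time(2, 1, 1) == 1 and g.travel_time(2, 1, 3) == INF
```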
C. Trip Denotation
In the time-dependent network, temporal and spatial dimensions are used to represent the tourist trip, where time is described in discrete units and space is expressed as $V = \{V_i \mid i \in [1, |V|]\}$. The tourist trip is thus a list of elements $V_i(t_i)$, where $V_i(t_i)$ denotes that node $V_i$ is reached at time $t_i$. The tourist trip is represented as $P(V_1, V_n) = \{V_1(t_1), V_2(t_2), \ldots, V_n(t_n)\}$, where $V_1$ is the source node and $V_n$ is the destination node. According to the time-aggregated graph, if the stay time at node $V_i$ is $vt_i$, then the earliest time at which edge $e_{ij}$ can be entered is $te = \min\{t \mid t \ge t_i + vt_i, \; t \in ET_{ij}\}$. For the two nodes $V_i(t_i)$ and $V_j(t_j)$ on edge $e_{ij}$, the time constraint is expressed as $t_j = t_i + vt_i + T_{ij}(te)$.
Figure 1. Denotation <strong>of</strong> time-dependent network using time-aggregated graph.<br />
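To illustrate the time constraint above, the small helpers below (again an illustrative sketch with hypothetical names, assuming the travel time series is a list indexed from instant 1) compute the earliest instant te at which edge e_ij can be entered and the resulting arrival time at V_j according to t_j = t_i + vt_i + T_ij(te).

```python
def earliest_entry(t_i, vt_i, et_ij):
    """te = min{t : t >= t_i + vt_i, t in ET_ij}; returns None if the edge never reopens."""
    ready = t_i + vt_i
    candidates = [t for t in et_ij if t >= ready]
    return min(candidates) if candidates else None

def arrival_time(t_i, vt_i, et_ij, tt_ij):
    """t_j = t_i + vt_i + T_ij(te), following the constraint of Section II.C."""
    te = earliest_entry(t_i, vt_i, et_ij)
    if te is None:
        return None
    return t_i + vt_i + tt_ij[te - 1]

# Edge e21 of Figure 1: arriving at V2 at t=1 with no stay, the tourist reaches V1 at t=2.
print(arrival_time(1, 0, [1, 2], [1, 5, float("inf")]))   # -> 2
```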
D. Mathematical Model
Notations and meanings:
$G = (V, E)$: time-dependent network;
$P(i)$: set of predecessor nodes of node $V_i$;
$S(i)$: set of successor nodes of node $V_i$;
$T$: total time budget of the trip;
$t_0$: traveling start time;
$p_i$: tourist preference value for node $V_i$;
$vt_i$: stay time at node $V_i$;
$t_i$: arrival time at node $V_i$;
$t_{ij}(t)$: travel time on edge $e_{ij}$ when the entry time is $t$;
$x_{ij}(t) = 1$ if edge $e_{ij}$ is entered at time $t$, and $0$ otherwise.

Choosing node $V_1$ as the source point and node $V_n$ as the destination point, we establish the following mixed integer programming model:

$$\max \sum_{t=t_0}^{T} \sum_{i=2}^{n-1} \sum_{j \in S(i)} p_i \, x_{ij}(t) \qquad (1)$$

subject to

$$\sum_{t=t_0}^{T} \sum_{j \in S(1)} x_{1j}(t) = \sum_{t=t_0}^{T} \sum_{i \in P(n)} x_{in}(t) = 1 \qquad (2)$$

$$\sum_{t=t_0}^{T} \sum_{i \in P(k)} x_{ik}(t) = \sum_{t=t_0}^{T} \sum_{j \in S(k)} x_{kj}(t), \quad \forall k = 2, \ldots, n-1 \qquad (3)$$

$$\sum_{t=t_0}^{T} \sum_{j \in S(i)} x_{ij}(t) \le 1, \quad \forall i = 2, \ldots, n-1 \qquad (4)$$

$$\sum_{t=t_0}^{T} \sum_{i \in P(j)} \bigl(t + t_{ij}(t)\bigr) x_{ij}(t) \le t_j, \quad \forall j = 2, \ldots, n \qquad (5)$$

$$\sum_{t=t_0}^{T} \sum_{j \in S(i)} t \, x_{ij}(t) \ge t_i + vt_i, \quad \forall i = 1, \ldots, n-1 \qquad (6)$$

$$t_1 \ge t_0, \quad t_n \le T \qquad (7)$$

$$t_i \ge 0, \quad \forall i = 1, \ldots, n \qquad (8)$$

$$x_{ij}(t) \in \{0, 1\}, \quad \forall e_{ij} \in E, \; \forall t = 1, \ldots, T \qquad (9)$$
The objective of the problem is to maximize the total utility, as shown in (1). In this formulation, constraints (2) and (3) are flow-conservation constraints. Constraint (4) ensures that every point is visited at most once.
Constraints (5) and (6) guarantee that if an edge is traversed in a given tour, the arrival time at the node following the edge is the sum of the preceding arrival time, the visiting time, and the edge travel time. Constraint (7) gives the start time and end time restrictions. Constraints (8) and (9) are the variable constraints.
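For readers who want to experiment with formulation (1)-(9), the sketch below encodes it with the open-source PuLP modeller in Python. It is only an illustration, not the authors' code: all identifiers and the instance data layout (travel, pref, stay) are our own assumptions, travel times are assumed finite at every instant, and a big-M guard (not present in the formulation above) is added so that constraint (6) is inactive at unvisited nodes.

```python
# A minimal sketch of model (1)-(9) with PuLP; identifiers and data layout are illustrative.
import pulp

def build_ttdp_model(nodes, edges, travel, pref, stay, T, source, dest, t0=1):
    """x[(i, j, t)] = 1 iff edge (i, j) is entered at instant t; arr[i] is the arrival
    time at node i.  travel[(i, j)][t] gives t_ij(t) and is assumed finite for t0..T."""
    times = list(range(t0, T + 1))
    succ = {i: [j for (a, j) in edges if a == i] for i in nodes}
    pred = {j: [i for (i, b) in edges if b == j] for j in nodes}

    prob = pulp.LpProblem("TTDP", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", [(i, j, t) for (i, j) in edges for t in times],
                              cat="Binary")
    arr = pulp.LpVariable.dicts("arr", nodes, lowBound=0)
    M = T + max(stay.values())          # big-M used only to relax (6) for skipped nodes

    # (1) maximise the collected preference values
    prob += pulp.lpSum(pref[i] * x[(i, j, t)]
                       for i in nodes if i not in (source, dest)
                       for j in succ[i] for t in times)
    # (2) leave the source exactly once and enter the destination exactly once
    prob += pulp.lpSum(x[(source, j, t)] for j in succ[source] for t in times) == 1
    prob += pulp.lpSum(x[(i, dest, t)] for i in pred[dest] for t in times) == 1
    for k in nodes:
        if k in (source, dest):
            continue
        inflow = pulp.lpSum(x[(i, k, t)] for i in pred[k] for t in times)
        outflow = pulp.lpSum(x[(k, j, t)] for j in succ[k] for t in times)
        prob += inflow == outflow       # (3) flow conservation
        prob += outflow <= 1            # (4) visit each intermediate node at most once
    # (5) arrival time at j is no earlier than the entry time plus the edge travel time
    for j in nodes:
        if j != source and pred[j]:
            prob += pulp.lpSum((t + travel[(i, j)][t]) * x[(i, j, t)]
                               for i in pred[j] for t in times) <= arr[j]
    # (6) an edge leaving i can only be entered after the arrival plus the stay at i;
    #     the big-M term (our addition) deactivates the constraint when i is skipped
    for i in nodes:
        if i != dest and succ[i]:
            visited = pulp.lpSum(x[(i, j, t)] for j in succ[i] for t in times)
            prob += (pulp.lpSum(t * x[(i, j, t)] for j in succ[i] for t in times)
                     >= arr[i] + stay[i] - M * (1 - visited))
    # (7) start and end time window
    prob += arr[source] >= t0
    prob += arr[dest] <= T
    return prob, x, arr
```

Calling prob.solve() and reading pulp.value(x[(i, j, t)]) would then recover the selected edges; we stress again that this is a sketch of the formulation, not an implementation used in the paper.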
III. LABEL CORRECTING ALGORITHM<br />
A. The Idea of the Algorithm
The objective of the time-dependent tourist trip design problem is to determine an optimal trip that maximizes the tourist's total utility within the time budget, taking into account the tourist's preferences and the time-dependent network, and to provide the tourist with real-time navigation service via mobile communication devices (e.g., a PDA or cell phone). In this dynamic network, the trip design and the travel times depend on the start time at the source node. Consequently, starting from an optimal start time, a best trip is produced to help the tourist visit the points of interest within the time budget. We present a label correcting algorithm to solve this problem. The algorithm produces an optimal trip plan and achieves the maximum utility within a given time budget.
Some notations used in the algorithm are defined in the following.
Definition 1: $Q$ is the priority queue of nodes to be processed; it satisfies the first-in-first-out (FIFO) principle.
Definition 2: $C_i(t)$ is the cost of node $V_i$ at time $t$, which represents the travel time from node $V_i$ to the destination node at time $t$.
Definition 3: $U_i(V_j, t)$ represents the total utility at node $V_i$ at time $t$ ($t \in ET_{ij}$), where $V_j$ is the successor node ($V_j \in S(i)$).
The start time is variable. In order to find the optimal start time and travel route, we run a label correcting procedure from the destination node back to the source node. Since the edge travel time series give the edge travel time for every time unit, the label records, for every time unit, the travel time and total utility from the current node to the destination node. The algorithm calculates each node's trip utility and updates the labels of its predecessor nodes. These iterations are repeated until the optimal route with the best start time is obtained.
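Before the detailed steps, one convenient way to store the quantities of Definitions 2 and 3 (our own sketch, not prescribed by the paper) is a small record kept per (node, departure time) pair:

```python
from dataclasses import dataclass
from typing import Hashable, Optional

@dataclass
class Label:
    """Backward label attached to node V_i for a given departure time t."""
    succ: Optional[Hashable]   # successor node V_j on the best route found so far
    cost: float                # C_i(t): travel time from V_i to the destination node
    utility: float             # U_i(V_j, t): total preference collected from V_i onwards
```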
B. The Detailed Steps of the Algorithm
According to the idea <strong>of</strong> network planning and<br />
dynamic programming[25], a novel label correcting<br />
algorithm is presented in the following.<br />
Step 1: Initialization
Each edge $e_{ij}$ is given two properties: the edge time series $ET_{ij}$ and the travel time series $TT_{ij}$. $V_1$ is the source node and $V_n$ is the destination node. $p_i$ represents the tourist's preference value for visiting point $V_i$, and $vt_i$ is the stay time. Let $p_1 = p_n = 0$ and $vt_1 = vt_n = 0$.
Step 2: For the destination node $V_n$, set $C_n(t) = 0$ and $U_n(V_{n+1}, t) = p_n$ for $t = 1, 2, \ldots, T$, where $V_{n+1}$ is the virtual successor node of $V_n$; let $Q = \{V_n\}$.
Step 3: Processing the nodes in the priority queue $Q$
3.1. For $V_j$ in $Q$, delete $V_j$ from $Q$;
3.2. For each $V_i \in P(j)$, calculate $C_i(t) = C_j(t + vt_j + T_{ij}(t)) + T_{ij}(t) + vt_j$, $\forall t \in ET_{ij}$. If $C_i(t) > T$, set $T_{ij}(t) = \infty$; otherwise, if $C_i(t) \le T$, go to step 3.3;
3.3. Calculate the tourist's total utility and the label. The total utility from node $V_i$ to the destination node at time $t$ is $U_i(V_j, t) = U_j(V_k, t + vt_j + T_{ij}(t)) + p_i$, with $V_j \in S(i)$, $V_k \in S(j)$, $\forall t \in ET_{ij}$; record the label $(V_j, t, C_i(t), U_i(V_j, t))$;
3.4. Insert node $V_i$ into $Q$ and update the queue;
Step 4: Judge whether all the nodes have been processed
If $Q \neq \emptyset$, go to step 3; otherwise go to step 5;
Step 5: Calculate the route's total utility
Calculate $U(V_1) = \max_{V_r \in S(1),\, t} U_1(V_r, t)$, $\forall t \in ET_{1r}$, $e_{1r} \in E$;
Step 6: Trace back the tourist's route with the maximum total utility
According to the label with the maximum utility, we obtain the optimal start time $t_{best}$. Going forward from this start time along the recorded successors, we obtain the optimal route from the source node $V_1$ to the destination node $V_n$: $P(V_1, V_n) = \{V_1(t_{best}), V_2(t_2), \ldots, V_n(t_n)\}$. If there are several routes with the maximum total utility, choose the route with minimum $C_1(t)$ as the final optimal route.
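The following compact Python sketch renders Steps 1-6; it is our own simplified reading of the procedure, not the authors' code. Names are illustrative; edges are assumed to have finite, positive integer travel times at every instant (clamped beyond T as in the example of Section IV); labels are kept as plain (successor, cost, utility) tuples; candidates whose cost exceeds the budget are simply discarded; and, as in the steps above, the labels do not explicitly enforce that each point is visited at most once.

```python
from collections import deque

INF = float("inf")

def label_correcting_trip(nodes, pred, travel, pref, stay, budget, source, dest, t0=1):
    """Backward label correcting (Steps 1-6).  pred[j] lists the predecessors of node j,
    travel[(i, j)][t] is the travel time when edge (i, j) is entered at instant t, and
    pref/stay hold preference values and stay times (assumed 0 at source and destination).
    Returns (best start time, maximal utility, route as a list of (node, arrival time))."""
    horizon = 2 * budget                 # clock times may exceed the duration budget T
    clamp = lambda t: min(t, budget)     # T_ij(t) = T_ij(T) for t > T, as in Section IV
    labels = {v: {} for v in nodes}      # labels[v][t] = (successor, C_v(t), U_v(., t))
    queue = deque([dest])                # Step 2: start the backward search at the destination

    while queue:                         # Steps 3-4: FIFO label correcting
        j = queue.popleft()
        for i in pred.get(j, []):
            for t in range(t0, horizon + 1):            # candidate entry times of edge (i, j)
                tt = travel.get((i, j), {}).get(clamp(t), INF)
                if tt == INF:
                    continue
                depart_j = t + tt + stay.get(j, 0)      # instant at which the tour leaves V_j
                nxt = (None, 0, pref.get(dest, 0)) if j == dest else labels[j].get(depart_j)
                if nxt is None:
                    continue
                cost = tt + stay.get(j, 0) + nxt[1]     # Step 3.2: C_i(t)
                if cost > budget:                       # exceeds the time budget: discard
                    continue
                util = nxt[2] + pref.get(i, 0)          # Step 3.3: U_i(V_j, t)
                cur = labels[i].get(t)
                if cur is None or util > cur[2] or (util == cur[2] and cost < cur[1]):
                    labels[i][t] = (j, cost, util)      # correct the label of V_i
                    if i not in queue:                  # Step 3.4: requeue the corrected node
                        queue.append(i)

    if not labels[source]:
        return None
    # Steps 5-6: best start time = source label with maximal utility (ties: minimal cost)
    t_best = max(labels[source], key=lambda t: (labels[source][t][2], -labels[source][t][1]))
    util_best = labels[source][t_best][2]
    route, node, t = [(source, t_best)], source, t_best
    while node != dest:                                 # trace the route forward
        succ = labels[node][t][0]
        arrive = t + travel[(node, succ)][clamp(t)]
        route.append((succ, arrive))
        node, t = succ, arrive + stay.get(succ, 0)
    return t_best, util_best, route
```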
C. Time Complexity Analysis of the Algorithm
According to the steps of the algorithm, when the travel time and the label of each node are scalars, the worst-case time complexity is $O(|V|^2 |E|)$. Since each label is generated over a time series of length $T$, the total computational time is $O(|V|^2 |E| T)$.
The above analysis indicates that our algorithm is a polynomial-time algorithm and can meet real-time application requirements.
IV. NUMERICAL EXAMPLE<br />
The efficiency and feasibility of the algorithm are demonstrated in this section with the following numerical example.
Example: Consider the exhibition graph shown in Figure 2, where $V_1$ is the entrance (source node), $V_6$ is the exit (destination node), and the other nodes are visiting points. The first figure at each node denotes the stay time and the second figure denotes the preference value of the tourist. The time budget of the trip is $T = 10$, and the figures on the edges are the travel time series. To simplify the problem, all the time series on the edges are set as $ET_{ij} = [1, 2, \ldots, T]$, $\forall e_{ij} \in E$. If $t > T$, then $T_{ij}(t) = T_{ij}(T)$.
Figure 2. Time-dependent exhibition graph.<br />
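For illustration only, the label_correcting_trip sketch from Section III could be driven as below on a tiny made-up instance (this is NOT the Figure 2 data, whose full edge series appear only in the figure):

```python
# Hypothetical 4-node instance (not the Figure 2 graph): V1 -> {V2, V3} -> V4, budget 6.
nodes = ["V1", "V2", "V3", "V4"]
pred = {"V2": ["V1"], "V3": ["V1"], "V4": ["V2", "V3"]}
travel = {("V1", "V2"): {t: 1 for t in range(1, 7)},
          ("V1", "V3"): {t: 2 for t in range(1, 7)},
          ("V2", "V4"): {t: 2 for t in range(1, 7)},
          ("V3", "V4"): {t: 1 for t in range(1, 7)}}
pref = {"V1": 0, "V2": 4, "V3": 7, "V4": 0}
stay = {"V1": 0, "V2": 1, "V3": 1, "V4": 0}

print(label_correcting_trip(nodes, pred, travel, pref, stay,
                            budget=6, source="V1", dest="V4"))
# expected, by hand: start at t=1, visit V3 (utility 7), reach V4 at t=5
```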
TABLE I. TRACE OF THE LABELS USING THE LABEL CORRECTING ALGORITHM
We shall use the proposed label correcting algorithm to solve the above example; the optimal trip is calculated in detail as follows.
[Table I traces, step by step, the label triples recorded at nodes V1-V6 together with the contents of the priority queue Q.]
First, initialization: the time series $ET_{ij}$ and travel time series $TT_{ij}$ are given on each edge $e_{ij}$; $V_1$ is the source node and $V_n$ is the destination node. For every node $V_i$, the preference value is $p_i$ and the stay time is $vt_i$; $p_1 = p_n = 0$ and $vt_1 = vt_n = 0$. For the destination node $V_n$, $C_n(t) = 0$ and $U_n(V_{n+1}, t) = p_n$ for $t = 1, 2, \ldots, T$, where the successor node of $V_n$ is assumed to be the virtual node $V_{n+1}$; $Q = \{V_n\}$.
.<br />
Each node in priority queue Q is processed in order,<br />
and the trace <strong>of</strong> the label is shown in table 1. It is noted<br />
that the label for each node is denoted as a triple<br />
{ Sub , TS,<br />
TU}<br />
, where Sub represents the sub index <strong>of</strong><br />
the successor <strong>of</strong> the labeled node, TS is the travel time<br />
series from the labeled node to the destination node, and<br />
TU is the total utility from the labeled node to the<br />
destination node. For each travel time series, if the travel<br />
time exceeds the time budget at any time, it is denoted as<br />
“–”.<br />
According to the labels at the source node $V_1$, the maximal total utility is $U(V_1) = 17$ and the best start time is $t_{best} = 3$. Tracing back the route based on the node labels, the optimal route is obtained as follows:
$P(V_1, V_6) = \{V_1(3), V_2(4), V_3(6), V_5(8), V_4(10), V_6(13)\}$.
From the above implementation of the algorithm, it is clear that the presented label correcting algorithm is effective in deciding the optimal start time and in providing a best trip that improves tourist satisfaction.
V. CONCLUSION<br />
A variety of sightseeing and exhibition activities, such as urban tourism, business exhibitions, and theme parks, are developing rapidly in China, and they have played an important role in promoting economic growth. The tourist trip design problem is not only a vital part of the tourists' trip plans but also the key to providing high-quality service for tourists. With the development of mobile communication technology and the popularization of cell phones, PDAs, etc., mobile tourist guides have become possible. They are able to provide tourists with real-time and personalized services based on context and location. Previous studies ignored the time-dependent characteristics of exhibition networks. The traffic network in tourism is a complex system with great uncertainties: the tourist travel time in an exhibition network may change because of crowded places, temporary shows, promotions, and so on. Therefore, by introducing a time-aggregated graph, this paper establishes a time-dependent tourist trip design model and proposes a label correcting algorithm; the time complexity of the algorithm is also analyzed. The efficiency and feasibility of the algorithm are then demonstrated by a numerical example.
Further studies will focus on the multi-trip design problem over several days, providing tourists with more satisfactory and higher-quality services while considering more realistic factors such as location opening times and capacity constraints.
ACKNOWLEDGMENT<br />
The authors wish to thank the reviewers for their<br />
valuable comments. This work was supported in part by<br />
the National Natural Science Foundation <strong>of</strong> China (Grant<br />
No.71171178), Humanities and Social Sciences<br />
Foundation <strong>of</strong> Ministry <strong>of</strong> Education <strong>of</strong> China (Grant No.<br />
12YJC630091), Zhejiang Provincial Natural Science<br />
Foundation <strong>of</strong> China (Grant No. LQ12G02007), and<br />
Zhejiang Provincial Commonweal Technology Applied<br />
Research Projects <strong>of</strong> China (Grant No.2011C23076).<br />
REFERENCES<br />
[1] P. Vansteenwegen, D. Van Oudheusden, “The mobile<br />
tourist guide: an OR opportunity,” OR Insights, vol.20, no.<br />
3, pp.21–27, 2007.<br />
[2] G. Laporte, S. Martello, “The selective traveling salesman<br />
problem,” Discrete Applied Mathematics, vol.26, pp.193–<br />
207, 1990.<br />
[3] J. Pekny, D. Miller, “An exact parallel algorithm for the<br />
resource constrained traveling salesman problem with<br />
application to scheduling with an aggregate deadline,” In:<br />
Proceedings <strong>of</strong> the 1990 ACM annual conference on<br />
cooperation,1990.<br />
[4] D. Feillet, P. Dejax, M. Gendreau, “Traveling salesman<br />
problems with pr<strong>of</strong>its,” Transportation Science, vol.39,<br />
pp.188–205, 2005.<br />
[5] S. Martello, and P. Toth, “Knapsack problems-algorithm<br />
and computer implementations,” John Wiley & Sons. New<br />
York, NY, USA, 1990.<br />
[6] S. Butt, T.A. Cavalier, "A heuristic for the multiple tour maximum collection problem," Computers & Operations Research, vol.21, pp.101–111, 1994.
[7] E. Arkin, J. Mitchell, G. Narasimhan, "Resource-constrained
geometric network optimization,” In:<br />
Proceedings <strong>of</strong> the 14th ACM symposium on<br />
computational geometry, pp.307–316, 1998.<br />
[8] G. Righini, M. Salani, “Decrement state space relaxation<br />
strategies and initialization heuristics for solving the<br />
orienteering problem with time windows with dynamic<br />
programming,” Computers & Operations Research, vol.4,<br />
pp.1191–203, 2009.<br />
[9] P. Vansteenwegen, W. Souffriau, G. Vanden Berghe, D.<br />
Van Oudheusden, “Metaheuristics for tourist trip<br />
planning,” Metaheuristics in the Service Industry. In:<br />
Lecture notes in economics and mathematical systems, vol.<br />
624. Berlin: Springer; to appear (ISBN: 978-3-642-00938-<br />
9), 2009.<br />
[10] P. Vansteenwegen, W. Souffriau, G. Vanden Berghe, D.<br />
Van Oudheusden, “A guided local search metaheuristic for<br />
the team orienteering problem,” European <strong>Journal</strong> <strong>of</strong><br />
Operational Research, vol.196, no.1, pp.118–27, 2009.<br />
[11] L. Ke, C. Archetti, Z. Feng, “Ants can solve the team<br />
orienteering problem,” Computers & Industrial<br />
Engineering, vol.54, pp.648–65, 2008.<br />
[12] J. Li, “Research on team orienteering problem with<br />
dynamic travel times,” <strong>Journal</strong> <strong>of</strong> S<strong>of</strong>tware, vol.7, no.2, pp.<br />
249-255, 2012.<br />
[13] F.Q. Zhao. “An improved PSO algorithm with decline<br />
disturbance index,” <strong>Journal</strong> <strong>of</strong> Computers, vol.6, no.4,<br />
pp.691-697, 2011.
[14] X.Y. Liu, H.F. Lin, C. Zhang. “An improved HITS<br />
algorithm based on page-query similarity and page<br />
popularity,” <strong>Journal</strong> <strong>of</strong> Computers, vol.7, no.1, pp.130-134,<br />
2012.<br />
[15] D. Abowd, G. Atkeson, J. Hong, S. Long, R. Kooper, and<br />
M. Pinkerton, “Cyberguide: a mobile context-aware tour<br />
guide,” ACM Wireless Networks, vol.3, pp.421–433, 1997.<br />
[16] G. O’Hare, M. O’Grady, “Gulliver’s Genie, a multi-agent<br />
system for ubiquitous and intelligent content delivery,”<br />
Computer Communications, vol.26,pp.1177–1187, 2003.<br />
[17] N. Davies, K. Cheverst, K. Mitchell, A. Efrat, “Using and<br />
determining location in a context sensitive tour guide”,<br />
IEEE Computer, vol.34, no.8, pp.35–41, 2001.<br />
[18] B. Schmidt-Belz, H. Laamanen, S. Poslad, A. Zipf,<br />
“Location-based mobile tourist services– first user<br />
experiences,” In: Enter 2003, Helsinki, Finland.<br />
[19] R. Kramer, M. Modsching, and K. ten Hagen, “A city<br />
guide agent creating and adapting individual sightseeing<br />
tours based on field trial results,” International <strong>Journal</strong> <strong>of</strong><br />
Computational Intelligence Research, vol.2, no.2, pp.191–<br />
206, 2006.<br />
[20] K. ten Hagen, R. Kramer, M. Hermkes, B. Schumann, and<br />
P. Mueller, “Semantic matching and heuristic search for a<br />
dynamic tour guide,” In: Information and Communication<br />
Technologies in Tourism, eds. Frew A. J., M. Hitz, and P.<br />
O’Connor, Vienna: Springer, 2005.<br />
[21] W. Souffriau, P. Vansteenwegen, J. Vertommen, G.<br />
Vanden Berghe, D. Van Oudheusden, “A personalized<br />
tourist trip design algorithm for mobile tourist guides,”<br />
Applied Artificial Intelligence, vol.22, no.10,pp.964-985,<br />
2008.<br />
[22] B. George, and S. Shekhar, “Time aggregated graphs: a<br />
model for spatial-temporal network,” <strong>Journal</strong> on Data<br />
Semantics, vol.11, December 2007.
[23] B. George, S. Kim, and S. Shekhar “Spatial-temporal<br />
network databases and routing algorithms: a summary <strong>of</strong><br />
results,” Proceedings <strong>of</strong> International Symposium on<br />
Spatial and Temporal Databases (SSTD’07), July 2007.<br />
[24] B. George, S. Shekhar, “Time-aggregated graphs for<br />
modeling spatial-temporal networks-an extended abstract,”<br />
Proceedings <strong>of</strong> Workshops at International Conference on<br />
Conceptual Modeling, 2006.<br />
[25] B.V. Cherkassky, A.V. Goldberg, T. Radzik, “Shortest<br />
paths algorithms: theory and experimental evaluation,”<br />
Mathematical Programming, vol.73, pp.129–174, 1996.<br />
Jin Li was born in Jiangsu province, China. He received the B.S.<br />
and M.S. degrees from Nanchang University, Nanchang,<br />
Jiangxi Province, China in 2004 and 2007 respectively. He<br />
received the Ph.D. degree from the Department of Management Science and Engineering, School of Management, Fudan University, Shanghai, China, in 2010.
Since July 2010 he has been an assistant professor
in the Department <strong>of</strong> Management Science and Engineering,<br />
School <strong>of</strong> Computer Science and Information Engineering,<br />
Zhejiang Gongshang University, Hangzhou, Zhejiang Province,<br />
China. He has authored/coauthored more than 20 scientific<br />
papers. His research interests include logistics and supply chain<br />
management, system modeling and simulation, network<br />
optimization, and emergency management.<br />
Dr. Li is a fellow <strong>of</strong> China Society <strong>of</strong> Logistics (CSL),<br />
Chinese Computer Federation (CCF), and Chinese Association<br />
for System Simulation (CASS).
Aims and Scope.<br />
Call for Papers and Special Issues<br />
<strong>Journal</strong> <strong>of</strong> S<strong>of</strong>tware (JSW, ISSN 1796-217X) is a scholarly peer-reviewed international scientific journal focusing on theories, methods, and<br />
applications in software. It provides a high-profile, leading-edge forum for academic researchers, industrial professionals, engineers, consultants,
managers, educators and policy makers working in the field to contribute and disseminate innovative new work on s<strong>of</strong>tware.<br />
We are interested in well-defined theoretical results and empirical studies that have potential impact on the construction, analysis, or management<br />
<strong>of</strong> s<strong>of</strong>tware. The scope <strong>of</strong> this <strong>Journal</strong> ranges from the mechanisms through the development <strong>of</strong> principles to the application <strong>of</strong> those principles to<br />
specific environments. JSW invites original, previously unpublished, research, survey and tutorial papers, plus case studies and short research notes,<br />
on both applied and theoretical aspects <strong>of</strong> s<strong>of</strong>tware. Topics <strong>of</strong> interest include, but are not restricted to:<br />
• S<strong>of</strong>tware Requirements Engineering, Architectures and Design, Development and Maintenance, Project Management,<br />
• S<strong>of</strong>tware Testing, Diagnosis, and Validation, S<strong>of</strong>tware Analysis, Assessment, and Evaluation, Theory and Formal Methods<br />
• Design and Analysis <strong>of</strong> Algorithms, Human-Computer Interaction, S<strong>of</strong>tware Processes and Workflows<br />
• Reverse Engineering and S<strong>of</strong>tware Maintenance, Aspect-Orientation and Feature Interaction, Object-Oriented Technology<br />
• Component-Based S<strong>of</strong>tware Engineering, Computer-Supported Cooperative Work, Agent-Based S<strong>of</strong>tware Systems, Middleware Techniques<br />
• AI and Knowledge Based S<strong>of</strong>tware Engineering, Empirical S<strong>of</strong>tware Engineering and Metrics<br />
• S<strong>of</strong>tware Security, Safety and Reliability, Distribution and Parallelism, Databases<br />
• S<strong>of</strong>tware Economics, Policy and Ethics, Tools and Development Environments, Programming Languages and S<strong>of</strong>tware Engineering<br />
• Mobile and Ubiquitous Computing, Embedded and Real-time S<strong>of</strong>tware, Database, Data Mining, and Data Warehousing<br />
• Internet and Information Systems Development, Web-Based Tools, Systems, and Environments, State-Of-The-Art Survey<br />
Special Issue Guidelines<br />
Special issues feature specifically aimed and targeted topics <strong>of</strong> interest contributed by authors responding to a particular Call for Papers or by<br />
invitation, edited by guest editor(s). We encourage you to submit proposals for creating special issues in areas that are <strong>of</strong> interest to the <strong>Journal</strong>.<br />
Preference will be given to proposals that cover some unique aspect <strong>of</strong> the technology and ones that include subjects that are timely and useful to the<br />
readers <strong>of</strong> the <strong>Journal</strong>. A Special Issue is typically made <strong>of</strong> 10 to 15 papers, with each paper 8 to 12 pages <strong>of</strong> length.<br />
The following information should be included as part <strong>of</strong> the proposal:<br />
• Proposed title for the Special Issue<br />
• Description <strong>of</strong> the topic area to be focused upon and justification<br />
• Review process for the selection and rejection <strong>of</strong> papers.<br />
• Name, contact, position, affiliation, and biography <strong>of</strong> the Guest Editor(s)<br />
• List <strong>of</strong> potential reviewers<br />
• Potential authors to the issue<br />
• Tentative time-table for the call for papers and reviews<br />
If a proposal is accepted, the guest editor will be responsible for:<br />
• Preparing the “Call for Papers” to be included on the <strong>Journal</strong>’s Web site.<br />
• Distribution <strong>of</strong> the Call for Papers broadly to various mailing lists and sites.<br />
• Getting submissions, arranging the review process, making decisions, and carrying out all correspondence with the authors. Authors should be informed of the Instructions for Authors.
• Providing us the completed and approved final versions <strong>of</strong> the papers formatted in the <strong>Journal</strong>’s style, together with all authors’ contact<br />
information.<br />
• Writing a one- or two-page introductory editorial to be published in the Special Issue.<br />
Special Issue for a Conference/Workshop<br />
A special issue for a Conference/Workshop is usually released in association with the committee members <strong>of</strong> the Conference/Workshop like<br />
general chairs and/or program chairs who are appointed as the Guest Editors <strong>of</strong> the Special Issue. Special Issue for a Conference/Workshop is<br />
typically made <strong>of</strong> 10 to 15 papers, with each paper 8 to 12 pages <strong>of</strong> length.<br />
Guest Editors are involved in the following steps in guest-editing a Special Issue based on a Conference/Workshop:<br />
• Selecting a Title for the Special Issue, e.g. “Special Issue: Selected Best Papers <strong>of</strong> XYZ Conference”.<br />
• Sending us a formal “Letter <strong>of</strong> Intent” for the Special Issue.<br />
• Creating a “Call for Papers” for the Special Issue, posting it on the conference web site, and publicizing it to the conference attendees.<br />
Information about the <strong>Journal</strong> and <strong>Academy</strong> <strong>Publisher</strong> can be included in the Call for Papers.<br />
• Establishing criteria for paper selection/rejections. The papers can be nominated based on multiple criteria, e.g. rank in review process plus<br />
the evaluation from the Session Chairs and the feedback from the Conference attendees.<br />
• Selecting and inviting submissions, arranging the review process, making decisions, and carrying out all correspondence with the authors. Authors should be informed of the Author Instructions. Usually, the Proceedings manuscripts should be expanded and enhanced.
• Providing us the completed and approved final versions <strong>of</strong> the papers formatted in the <strong>Journal</strong>’s style, together with all authors’ contact<br />
information.<br />
• Writing a one- or two-page introductory editorial to be published in the Special Issue.<br />
More information is available on the web site at http://www.academypublisher.com/jsw/.
(Contents Continued from Back Cover)
Reputation Based Academic Evaluation in a Research Platform
Kun Yu and Jianhong Chen (page 2749)
Analyzing ChIP-seq Data based on Multiple Knowledge Sources for Histone Modification
Dafeng Chen, Deyu Zhou, and Yuliang Zhuang (page 2755)
A New Semi-supervised Method for Lip Contour Detection
Kunlun Li, Miao Wang, Ming Liu, Ruining Xin, and Pan Wang (page 2763)
Trusted Software Constitution Model Based on Trust Engine
Junfeng Tian, Ye Zhu, and Jianlei Feng (page 2771)
Fuzzy Evaluation on Supply Chains' Overall Performance Based on AHM and M(1,2,3)
Jing Yang and Hua Jiang (page 2779)
A Novel Combine Forecasting Method for Predicting News Update Time
Mengmeng Wang, Xianglin Zuo, Wanli Zuo, and Ying Wang (page 2787)
Information-based Study of E-Commerce Website Design Course
Xinwei Zheng (page 2794)
An Empirical Study on the Correlation and Coordination Degree of Linkage Development between Manufacturing and Logistics
Rui Zhang and Chunhua Ju (page 2800)
Tourism Crisis Management System Based on Ecological Mechanism
Xiaohua Hu, Xuan Zhou, Weihui Dai, Zhaozong Zhan, and Xiaoyi Liu (page 2808)
Image Fusion Method based on Non-Subsampled Contourlet Transform
Hui Liu (page 2816)
Research on Intrusion Detection Model of Heterogeneous Attributes Clustering
Linquan Xie, Ying Wang, Fei Yu, Chen Xu, and Guangxue Yue (page 2823)
Vector-Distance and Neighborhood Development for High Dimensional Data
Ping Ling, Xiangsheng Rong, Xiangyang You, and Ming Xu (page 2832)
Evaluation and Comparison on the Techniques of Vertex Chain Codes
Linghua Li, Yining Liu, Yongkui Liu, and Borut Zalik (page 2840)
Research on Web Query Translation based on Ontology
Xin Wang and Ying Wang (page 2849)
Data Modeling of Knowledge Rules: An Oracle Prototype
Rajeev Kaula (page 2857)
OPC (OLE for Process Control) based Calibration System for Embedded Controller
Ming Cen, Qian Liu, and Yi Yan (page 2866)
WS-mcv: An Efficient Model Driven Methodology for Web Services Composition
Faycal Bachtarzi, Allaoua Chaoui, and Elhillali Kerkouche (page 2874)
Object Search for the Internet of Things Using Tag-based Location Signatures
Jung-sing Jwo, Ting-chia Chen, and Mengru Tu (page 2886)
Fractional Order Ship Tracking Correlation Algorithm
Mingliang Hou and Yuran Liu (page 2894)
A Label Correcting Algorithm for Dynamic Tourist Trip Planning
Jin Li and Peihua Fu (page 2899)