ISAST Transactions on Computers and Intelligent Systems, No. 1, Vol. 2, 2010 (ISSN 1798-2448)

Gang LIU, Gang CUI, Hongwei LIU, and Zhibo WU:
A Reliability Enhancement Adaptive Routing Mechanism for Mobile Ad Hoc Networks

J. C. Chedjou, K. Kyamakya, U. A. Khan, and M. A. Latif:
Potential Contribution of CNN-based Solving of Stiff ODEs & PDEs to Enabling Real-Time Computational Engineering

Y. Labrador, M. Karimi, N. Pissinou, and D. Pan:
Performance Comparison of OFDM and Single Carrier Modulations over Satellite Channels

Z. R. Ghobadi and H. Rashidi:
Software Rejuvenation Technique - An Improvement in Applications with Multiple Versions

S. A. Asghari, H. Pedram and H. Taheri:
A New Attitude based on Real Time Operating System for NoC in Hotspot Traffic Model

V. Y. Kontorovich, Z. Lovtchikova, J. A. Meda-Campaña, and K. Tinsley:
Nonlinear Filtering Algorithms for Chaotic Signals: A Comparative Study

H. D. Vankayalapati and K. Kyamakya:
Nonlinear Feature Extraction Approaches for Scalable Face Recognition Applications

R. Karthikeyan, A. Karthikeyan and S. Sivaperumal:
Artificial Human Limbs – A Design Approach for Military Application

K. Kyamakya, J. C. Chedjou, M. A. Latif, and U. A. Khan:
A Novel Image Processing Approach Combining a ‘Coupled Nonlinear Oscillators’-based Paradigm with Cellular Neural Networks for Dynamic Robust Contrast Enhancement

JianLi GUO, HongWei LIU, and XiaoZong YANG:
Common-neighbor Monitoring Enhanced Cooperation Enforcement Scheme for MANETs

J. M. Blackledge:
Systemic Risk Assessment using a Non-stationary Fractional Dynamic Stochastic Model for the Analysis of Economic Signals

J. M. Blackledge and D. A. Dubovitskiy:
An Optical Machine Vision System for Applications in Cytopathology


A Reliability Enhancement Adaptive Routing Mechanism for Mobile Ad Hoc Networks

Gang LIU, Gang CUI, Hongwei LIU, and Zhibo WU 

Abstract—Selecting a stable route for data packet transmission reduces the control packet traffic and energy consumption caused by frequent route reconstruction and maintenance in dynamic mobile Ad Hoc networks, and thus improves the efficiency and extends the lifetime of the network. This paper proposes an algorithm, based on information entropy, that measures the dynamic character of a mobile node: it analyzes the uncertainty in the behavior of the node's neighbor set as that set changes over time, and uses the result as a metric for selecting stable routes in mobile Ad Hoc networks. Simulation results show that the stable routing measurement method can remarkably improve key performance indicators of mobile Ad Hoc networks, such as packet delivery ratio and packet end-to-end delay.

Index Terms—Mobile Ad Hoc networks, stable routing, uncertainty, information entropy. 

1 INTRODUCTION 

A mobile Ad Hoc network is a temporary, self-organizing, multi-hop wireless network system without a central node, composed of a group of autonomous wireless mobile nodes. The nodes can move freely and can join or leave the network at any time while it is running, without sending any warning information. Therefore, the network topology, the mutual relations between the nodes, and the states of the wireless links change constantly. In such a dynamic network environment, selecting a stable route for data transmission can effectively reduce the network bandwidth and energy consumed by the frequent route reconstruction and maintenance that occur during transmission. This improves the utilization efficiency of network resources and prolongs the lifetime of the network, which is of great significance for mobile Ad Hoc networks, where network resources and energy supply are relatively limited.

• Gang LIU, Gang CUI, Hongwei LIU and Zhibo WU are with the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China, 150001. E-mail: lg.hit@163.com

• This paper is partially supported by the Hi-Tech Research and Development Program (863) of China under grant No. 2008AA01A201 and the National Natural Science Foundation of China under grant No. 60503015.

Manuscript received April 19, 2009; revised September 11, 2009.

A common strategy in current approaches to stable route selection is to formalize the movement model of the mobile Ad Hoc network nodes: the wireless link behavior that results from a specific node movement model is analyzed and used as the stability metric during route selection and establishment [1], [2], [3], [4], [5]. The main problem of this strategy is that its applicability is limited to networks whose nodes follow the assumed movement model. LENDERS [6] analyzed the impact of actual node mobility on the connection state of wireless links, but gave only qualitative conclusions, without formal quantitative test results.

Another common approach uses precise real-time information about the physical locations of the wireless mobile nodes and the changes in their relative speed as the stability metric of a link or route [7], [8], [9], [10]. Obtaining and updating the location and speed of wireless mobile nodes in real time requires special facilities (such as GPS) to provide the necessary technical support, so this approach is only suitable for certain specific network environments.

ROHIT [11] and GEUNHWI [12] used the change in signal strength and the stability of mobile nodes observed during data transmission as the route selection criteria. In the signal stability based adaptive routing protocol (SSA), the authors used an extended device driver interface to maintain the signal stability table (SST), which provides the cross-layer operation that the protocol stack requires. In addition, XU [13] analyzed the dynamic characteristics of the network topology and applied the analysis to a hierarchical network architecture, a clustering algorithm and a cluster-based routing protocol in order to maximize network performance and stability. KIM [14] put forward a scalable Ad Hoc routing protocol that uses the logical topology information of the network to improve the capacity of the routing protocol to adapt to the increasing scale of Ad Hoc networks. In [15], the authors proposed a path stability calculation method based on the correlation factor of the wireless links along the path, which considers the correlation between the state changes of adjacent wireless links, rather than treating each wireless link separately, when evaluating the stability of the full route.

Unlike the above methods, this paper uses an information uncertainty metric. It treats the changing behavior of a wireless node's neighbor set as an information source with uncertain properties, and uses a quantitative measurement of this uncertainty as the benchmark for route selection in a dynamic network environment, which provides the necessary support for reliable data transmission.

The rest of this paper is organized as follows. Section 2 introduces a novel method to measure routing stability based on information entropy, and a new routing protocol called BSORP that combines the stability measurement method with the AODV protocol. Section 3 compares the performance of AODV and BSORP to validate the effect of the stability measurement method introduced in this paper. Finally, Section 4 concludes the paper.

2 MODEL OF ROUTING STABILITY 

2.1 Stability measurement for wireless mobile nodes

Assume that each wireless mobile node in the mobile Ad Hoc network has a unique identifier and that all nodes have the same wireless transmission radius. The network topology at time $t$ can then be modeled as an undirected graph $G_t = (V, E_t)$, where $V$ denotes the set of all wireless nodes, $|V|$ denotes the number of wireless nodes in $V$, and $E_t$ denotes the set of $|E_t|$ bidirectional wireless links at time $t$.

If node $m$ periodically inspects the members of its set of neighbor nodes with inspection period $\Delta t$, it obtains an ordered sequence of the neighbor sets observed at different times:

$$S^T_{NS}(N) = (ns_{t_1}, ns_{t_2}, \cdots, ns_{t_N}), \quad N = T/\Delta t \tag{1}$$

In the sequence $S^T_{NS}(N)$, regard the neighbor set inspected at time $t_i$ as a random variable $NS$. The set of distinct neighbor sets that occur in the sequence is then the value range of this random variable, namely $NS^T = \{ns_1, ns_2, \cdots, ns_k\}$, where $ns_{t_i} \in NS^T$, $i = 1, \cdots, N$, and $1 \le k \le N$. The probability distribution over the elements of $NS^T$ can be computed from the sequence $S^T_{NS}(N)$.
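As a minimal sketch of this step, the snippet below estimates the probability of each distinct neighbor set from a sampled sequence by relative frequency. The function name and the sample data are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def neighbor_set_distribution(samples):
    """Estimate p(ns_j) for every distinct neighbor set in the sampled sequence.

    `samples` is the ordered sequence of neighbor sets observed by one node,
    one entry per inspection period. Relative frequency is assumed as the
    probability estimate.
    """
    if not samples:
        raise ValueError("need at least one sampled neighbor set")
    # frozenset makes each neighbor set hashable so it can be counted.
    counts = Counter(frozenset(s) for s in samples)
    n = len(samples)
    return {ns: count / n for ns, count in counts.items()}


# Hypothetical samples: node identifiers are plain strings.
samples = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "b"}]
distribution = neighbor_set_distribution(samples)
```

Here the value range contains k = 2 distinct neighbor sets out of N = 4 samples.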

The sequence $S^T_{NS}(N)$ reflects the dynamic correlation between the moving wireless node $m$ and its neighbor nodes. Information entropy [16], which measures uncertainty, can therefore provide a stability criterion for building and selecting stable routes in mobile Ad Hoc networks; the uncertainty is computed by quantifying the sequence $S^T_{NS}(N)$ and the set $NS^T$.

In order to measure the uncertainty of the sequence $S^T_{NS}(N)$, we give another expression of it, obtained by merging consecutive identical neighbor sets and recording the length of each run of consecutive appearances, namely

$$R^T_{NS} = (R_1(ns_1), R_2(ns_2), \ldots, R_l(ns_l)), \quad \text{where } \sum_{i=1}^{l} R_i(ns_i) = N,\ ns_i \in NS^T,\ l \ge k \tag{2}$$

We can measure the uncertainty of the neighbor set of wireless mobile node $m$ from the sequence $R^T_{NS}$ by the weighted entropy

$$H_W(R^T_{NS}) = -\sum_{i=1}^{l} RC^i_{NS}\, \frac{R_i(ns_i)}{N} \log \frac{R_i(ns_i)}{N} \tag{3}$$

In the above formula, the weight $RC^i_{NS}$ is the change ratio of the members of the neighbor set between consecutive sub-sequences (runs) of the sequence $R^T_{NS}$:

$$RC^i_{NS} = \begin{cases} 1, & i = 1 \\ 1 - \dfrac{|ns_{i-1} \cap ns_i|}{|ns_{i-1} \cup ns_i|}, & i > 1 \end{cases} \tag{4}$$

In the same way, the uncertainty of the set $NS^T$ is measured by the standard information entropy of a discrete random variable

$$H(NS^T) = -\sum_{j=1}^{k} p(ns_j) \log p(ns_j) \tag{5}$$

From the above results, the metric stability of mobile node $m$ in the Ad Hoc network is defined as

$$MS(m) = 1 - \frac{H_W(R^T_{NS})}{\log N} \cdot \frac{H(NS^T)}{\log N} \tag{6}$$

The characteristics of the metric stability are as follows.

1. By the properties of information entropy, the value ranges of $H(NS^T)$ and $H_W(R^T_{NS})$ are $[0, \log N]$, so the value range of $MS(m)$ is $[0, 1]$. The uncertainty increases as the number of members of the set grows and as the probability distribution tends toward uniform, and the metric stability $MS(m)$ decreases accordingly. In the worst case, a completely different neighbor set is obtained at every sampling, so $MS(m)$ equals 0 and a route through the node cannot transfer data reliably. Conversely, if the same neighbor set is obtained at every sampling, $MS(m)$ equals 1.

2. The effect of changes in the membership of the neighbor set is introduced into the uncertainty metric through the weighting used in computing $H_W(R^T_{NS})$: the more drastically the members of the neighbor set change, the larger the uncertainty and the lower the stability of the wireless mobile node.

3. The dynamic character of the wireless mobile node is described compactly from both the statistical and the behavioral aspects, because the uncertainty is measured by the value range of the neighbor set and by the distribution of the members in the neighbor set.
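The definitions in this subsection can be put together in a short sketch. The code below is one possible reading of Eqs. (2) to (6), not the authors' implementation: it assumes that probabilities are relative frequencies, that entropies carry the usual negative sign (so that they lie in [0, log N]), that the weight in Eq. (4) uses set cardinalities, and that at least two samples are available so that log N is non-zero. The function name is hypothetical.

```python
import math
from collections import Counter
from itertools import groupby


def metric_stability(samples):
    """Compute MS(m) from an ordered sequence of sampled neighbor sets."""
    seq = [frozenset(s) for s in samples]
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two samples so that log N > 0")

    # Eq. (5): entropy of the distinct neighbor sets (relative frequencies).
    counts = Counter(seq)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())

    # Eq. (2): merge consecutive identical neighbor sets into runs.
    runs = [(ns, len(list(group))) for ns, group in groupby(seq)]

    # Eqs. (3)-(4): entropy of the run lengths, weighted by how much the
    # neighbor set changed relative to the previous run.
    hw = 0.0
    previous = None
    for ns, length in runs:
        if previous is None:
            weight = 1.0
        else:
            # Adjacent runs always differ, so the union is never empty.
            weight = 1.0 - len(previous & ns) / len(previous | ns)
        p = length / n
        hw -= weight * p * math.log(p)
        previous = ns

    # Eq. (6): both entropies are normalized by log N.
    return 1.0 - (hw / math.log(n)) * (h / math.log(n))


stable = metric_stability([{1, 2}, {1, 2}, {1, 2}, {1, 2}])
churning = metric_stability([{1}, {2}, {3}, {4}])
partial = metric_stability([{1, 2}, {1, 2}, {1, 3}, {1, 3}])
```

Under these assumptions, an unchanged neighbor set yields a stability of 1, a sequence of pairwise disjoint neighbor sets yields 0, and partial changes fall in between, which matches property 1 above.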

2.2 Stability measurement of the routing 

Following the above analysis of the stability measurement of a wireless mobile node, the stability measurement $SR(S,D)$ of the route from the source node $S$ to the destination node $D$ is defined as the product of the stability measurements of all wireless mobile nodes participating in the route, namely

$$SR(S,D) = \prod_{i \in R(S,D)} S(i) \tag{7}$$

where $S(i)$ denotes the metric stability $MS(i)$ of node $i$. Because each stability measurement $S(i)$ falls into $[0, 1]$, the maximum of $SR(S,D)$ is 1. The route stability measurement is affected by two factors: the length of the route $R(S,D)$ and the stability of the wireless mobile nodes on the route. The fewer hops the route $R(S,D)$ has and the more stable its nodes are, the closer $SR(S,D)$ is to 1 and the more stable the route is.
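A small sketch of Eq. (7) illustrates how route length and node stability interact. The function name and the per-node stability values are made up for illustration.

```python
import math


def route_stability(node_stabilities):
    """Product of the per-node stability values along a route, as in Eq. (7)."""
    values = list(node_stabilities)
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("each node stability must lie in [0, 1]")
    return math.prod(values)


# Hypothetical routes: two fairly stable hops versus five more stable hops.
short_route = route_stability([0.9, 0.9])
long_route = route_stability([0.95, 0.95, 0.95, 0.95, 0.95])
```

In this example the shorter route scores higher even though each of its nodes is individually less stable, because every additional node multiplies the result by a factor of at most 1.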

2.3 On-demand routing protocol based on stability measurement (BSORP)

On the basis of the Ad Hoc on-demand distance vector routing protocol (AODV) [17], BSORP is put forward to build and select stable routes. The effect on network performance of the stability calculation method presented in this paper is analyzed by simulation, which compares the performance of the improved routing protocol with that of the original on-demand distance vector routing protocol in the same network environment.

To use the stability measurement of mobile nodes as the path selection metric during route establishment, the routing table and the associated control packet data structures of the routing protocol need to be extended. Each routing table entry of the original AODV routing protocol is extended with a route stability metric field, which records the route stability metric from the source node to the current forwarding node. For the route discovery process, each RREQ packet is extended with a field for the stability metric of the current route (the stability metric from the RREQ source node to the node currently forwarding the RREQ packet). When an intermediate forwarding node receives a valid RREQ control packet, it first updates and records the route stability metric from the node that originated the RREQ to the current node, and then continues the route discovery process. In addition, the RREP control packet is extended with a route stability metric field that serves as the stability indicator for the source node of the RREQ packet; the value recorded in the routing table entry of the RREQ source node is the stability measurement of the full route.

When a wireless mobile node receives several copies of the same RREQ packet during route discovery, it does not discard a new copy if the route stability carried by that copy is better than the value recorded in the node's current routing table entry, or if the stability measurements are equal but the new copy has traversed fewer hops. In that case the node updates the stability metric field of the routing table entry and redirects the next hop of the entry to the node that forwarded this RREQ copy; otherwise it discards the duplicate copy. In addition, in order to obtain more stable routes, the RREQ destination node continues to answer further RREQ copies whose route stability measurement is better than the best value it has recorded so far, until its RREQ response timer times out.
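The duplicate-RREQ rule can be sketched as follows. This is a simplified illustration, not the BSORP implementation: the class and field names are hypothetical, a higher value is assumed to mean a more stable route, each node is assumed to multiply its own stability into the accumulated metric, and timers, sequence numbers and RREP handling are omitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Rreq:
    source: str        # node that originated the route request
    request_id: int    # identifies copies of the same request
    prev_hop: str      # node that forwarded this copy
    hop_count: int     # hops traversed so far
    stability: float   # accumulated route stability so far


@dataclass
class ReverseRoute:
    next_hop: str
    hop_count: int
    stability: float


class Node:
    def __init__(self, name: str, own_stability: float):
        self.name = name
        self.own_stability = own_stability
        self.reverse_routes: Dict[Tuple[str, int], ReverseRoute] = {}

    def handle_rreq(self, rreq: Rreq) -> Optional[Rreq]:
        """Return the RREQ copy to rebroadcast, or None if it is discarded."""
        key = (rreq.source, rreq.request_id)
        stability = rreq.stability * self.own_stability
        hops = rreq.hop_count + 1
        known = self.reverse_routes.get(key)
        better = (
            known is None
            or stability > known.stability
            or (stability == known.stability and hops < known.hop_count)
        )
        if not better:
            return None  # duplicate copy that offers no improvement
        # Record the better route and redirect the next hop to the forwarder.
        self.reverse_routes[key] = ReverseRoute(rreq.prev_hop, hops, stability)
        return Rreq(rreq.source, rreq.request_id, self.name, hops, stability)


node = Node("n", own_stability=0.9)
first = node.handle_rreq(Rreq("S", 1, "a", 1, 0.5))
worse = node.handle_rreq(Rreq("S", 1, "b", 1, 0.4))
improved = node.handle_rreq(Rreq("S", 1, "c", 2, 0.8))
```

In this run the second copy is dropped because it offers a less stable route, while the third copy replaces the stored entry and redirects the next hop to node "c" even though it has traversed more hops.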

The route redirection performed by forwarding nodes while processing RREQ packets and the multiple responses of the final destination node of the RREQ packet during route establishment in a mobile Ad Hoc network are shown in Fig. 1.

3 SIMULATION RESULTS 

3.1 Simulation evaluation indicators 

The simulation is based on the NS2 (v2.31) network simulator. The movement model of the wireless mobile nodes is the Random Waypoint Model (RWM), and the distributed coordination function (DCF) of the IEEE 802.11 specification is used as the MAC protocol. All wireless mobile nodes in the Ad Hoc network move randomly within a rectangular area of 1800 m × 900 m.

Fig. 1. Routing redirection and multiple response operations in the process of routing search

The other relevant simulation parameters are shown in Table 1.

TABLE 1
Simulation parameters

Parameter             Value
Simulation time       800 s
Transmission range    250 m
Receiver range        250 m
Number of nodes       50
Maximum pause time    50 s
Traffic type          CBR
Packet size           512 bytes
CBR rate              5 packets/s

3.2 Simulation environment 

In the simulation, we compare the performance of the original AODV routing protocol and the BSORP routing protocol under the same parameter settings in terms of key network performance indicators: the packet successful delivery ratio at the application layer, the packet end-to-end delay at the application layer, and the routing protocol control load overhead.

Packet successful delivery ratio: the ratio of the number of packets successfully received at the application layer of the destination nodes of all CBR data flows to the total number of packets issued by the source nodes of all CBR data flows in the mobile Ad Hoc network.

Packet end-to-end delay: the average transmission delay from the moment the source node of a CBR data flow sends an application layer data packet until the packet reaches the application layer of its final destination node, that is, the end-to-end delay of application layer data packets.

Routing protocol control load overhead: the ratio of the number of control packets sent by the network layer routing protocol during the simulation to the total number of packets sent.
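The three indicators can be computed from simulation counters roughly as follows. This is a sketch under stated assumptions: the function names and input formats are hypothetical and do not correspond to any NS2 trace format, the delay is averaged over delivered packets only, and the total number of packets sent is taken as control packets plus data packets.

```python
def packet_delivery_ratio(packets_sent, packets_received):
    """Packets received at destination application layers / packets issued by sources."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    return packets_received / packets_sent


def average_end_to_end_delay(send_times, receive_times):
    """Mean delay over delivered packets; both arguments map packet id to a time in seconds."""
    delays = [receive_times[pid] - send_times[pid] for pid in receive_times]
    if not delays:
        raise ValueError("no packets were delivered")
    return sum(delays) / len(delays)


def control_overhead(control_packets, data_packets):
    """Control packets sent / all packets sent (control plus data)."""
    total = control_packets + data_packets
    if total <= 0:
        raise ValueError("no packets were sent")
    return control_packets / total


# Made-up counters for illustration only.
pdr = packet_delivery_ratio(packets_sent=1000, packets_received=850)
delay = average_end_to_end_delay({1: 0.0, 2: 1.0, 3: 2.0}, {1: 0.5, 2: 2.0})
overhead = control_overhead(control_packets=200, data_packets=800)
```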

3.3 Analysis of the simulation results 

The packet successful delivery ratios of the two routing protocols in the simulation are compared in Fig. 2. The simulation results make clear that the moving speed of the wireless mobile nodes in an Ad Hoc network has a considerable impact on data transmission quality. The number of successfully delivered packets decreases significantly for both routing protocols as the nodes move faster. However, because the BSORP routing protocol selects relatively stable network nodes for data transmission during route establishment, it effectively improves the packet successful delivery ratio and slows the rapid decline of the delivery ratio as node speed increases.

The effect of BSORP on the packet end-to-end delay in the Ad Hoc network is shown in Fig. 3. The simulation results show that the BSORP protocol significantly reduces the end-to-end delay of data packet transmission. The stable route selection strategy of the BSORP protocol accumulates the stability metric by multiplication during route discovery, so the number of forwarding nodes is an important factor: a route with fewer nodes has a competitive advantage. Moreover, by choosing relatively stable mobile nodes to take part in data forwarding, the protocol significantly reduces the packet delay caused by the route maintenance and reconstruction that follow frequent wireless link disruptions.

Fig. 2. Packet successful delivery ratio at different motion speed (horizontal axis: motion speed of wireless node, m/s; vertical axis: packet successful delivery ratio; curves: AODV and BSORP)

Fig. 3. Packet end-to-end delay at different motion speed (vertical axis: packet end-to-end delay, s; curves: AODV and BSORP)

The differences between the AODV routing protocol and the BSORP routing protocol in route disruptions and control load during data transfer under the same network environment are shown in Fig. 4 and Fig. 5. In mobile Ad Hoc networks, the route maintenance and reconstruction operations triggered by frequent wireless link disruptions are a major cause of increased control load. This holds especially for on-demand routing protocols, in which a route disruption leads to heavy control overhead because the network is flooded to establish or maintain the necessary routes. Fig. 4 shows that choosing relatively stable mobile nodes to participate in data forwarding significantly reduces the likelihood of route disruption, and that the advantage of the stability-based routing strategy becomes more pronounced as node speed and network dynamics increase. Route disruptions and control load are significantly correlated in a dynamic network environment, as the control load overhead of the two routing protocols shows: the control load of an on-demand routing protocol can be effectively reduced by avoiding frequent route disruptions.

Fig. 4. The number of routing disruption for mobile routing (vertical axis: route broken times; curves: AODV and BSORP)

From the above simulation results, we can see that stable route selection has a major impact on the performance of Ad Hoc networks: the relevant network performance indicators improve clearly. This means that the network can achieve higher resource utilization efficiency and a longer lifetime in a resource-constrained network environment, which is of important significance for the practical application of Ad Hoc networks.

Fig. 5. Routing control load (vertical axis: routing overhead; curves: AODV and BSORP)

4 CONCLUSIONS

This paper presents a route stability calculation method that measures the stability of mobile wireless nodes by the uncertainty of the membership changes in their neighbor node sets. The method is used to improve the on-demand distance vector routing protocol (AODV): stable wireless mobile nodes are chosen as the active participants in a route in order to improve network performance in a dynamic network environment. Simulation results show that the proposed stable routing method can effectively improve important network performance indicators, such as the packet delivery ratio and the end-to-end delay.

5 ACKNOWLEDGMENTS 

The authors would like to thank the National Natural Science Foundation of China (60503015) and the National High Technology Research and Development Program of China (863) (2008AA01A201).

REFERENCES 

[1] C. Carofiglio, C. Chiasserini, M. Garetto, and E. Leonardi, “Route stability in MANETs under the random direction mobility model,” IEEE Transactions on Mobile Computing, vol. 8, no. 9, pp. 1167–1179, 2009.

[2] M. Garetto and E. Leonardi, “Analysis of random mobility models with partial differential equations,” IEEE Transactions on Mobile Computing, vol. 6, no. 11, pp. 1204–1217, 2007.

[3] W. Su, S. Lee, and M. Gerla, “Mobility prediction and routing in Ad Hoc wireless networks,” International Journal of Network Management, vol. 11, no. 1, pp. 3–30, 2001.

[4] S. Misra, S. Dhurandher, M. Obaidat, N. Nangia, N. Bhardwaj, P. Goyal, and S. Aggarwal, “Node stability-based location updating in mobile Ad-Hoc networks,” IEEE Systems Journal, vol. 2, no. 2, pp. 237–247, 2008.

[5] J. Liu, W. Guo, X. B.L., and F. Huang, “Path holding probability based Ad Hoc on-demand routing protocol,” Journal of Software, vol. 18, no. 3, pp. 693–701, 2007.

[6] V. Lenders, J. Wagner, S. Heimlicher, M. May, and B. Plattner, “An empirical study of the impact of mobility on link failures in an 802.11 Ad Hoc network,” IEEE Wireless Communications, vol. 15, no. 6, pp. 16–21, 2008.

[7] S. Prince and B. Stephen, “On the behavior of communication links of a node in a multi-hop mobile environment,” in Proceedings of 5th ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2004, pp. 145–156.

[8] W. Tang and W. Guo, “A path reliable routing protocol in mobile ad hoc networks,” in Proceedings of 4th International Conference on Mobile Ad-Hoc and Sensor Networks, 2008, pp. 203–207.

[9] S. Xu, K. Blackmore, and H. Jones, “Mobility assessment for MANETS requiring persistent links,” in Proceedings of International Conference on Mobile System, Applications and Services, 2005, pp. 39–44.

[10] J. Sumesh and V. Anand, “Mobility aware path maintenance in ad hoc networks,” in Proceedings of the 2009 ACM Symposium on Applied Computing, 2009, pp. 201–206.

[11] D. Rohit, D. Cynthia, K. Wang, and K. Satish, “Signal stability based adaptive routing (SSA) for Ad Hoc mobile networks,” IEEE Personal Communications, vol. 4, no. 1, pp. 36–45, 1997.

[12] G. Lim, K. Shim, S. Kim, and H. Yoon, “Signal strength-based link stability estimation in ad hoc wireless networks,” Electronics Letters, vol. 39, no. 5, pp. 485–486, 2003.

[13] Y. Xu and W. Wang, “Topology stability analysis and its application in hierarchical mobile ad hoc networks,” IEEE Transactions on Vehicular Technology, vol. 58, no. 3, pp. 1546–1560, 2009.

[14] H. Kim and M. Yoo, “A Scalable Ad Hoc routing protocol based on logical topology for ubiquitous community network,” in Proceedings of the 9th International Conference on Advanced Communication Technology, vol. 2, 2007, pp. 1306–1310.

[15] H. Zhang and Y. Dong, “A novel path stability computation model for wireless Ad Hoc Networks,” IEEE Signal Processing Letters, vol. 14, no. 12, pp. 928–931, 2007.

[16] C. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, and no. 4, pp. 623–656, 1948.

[17] C. Perkins and E. Royer, “Ad hoc on-demand distance vector routing (AODV),” 2003.

Gang LIU is a Ph.D. student at HIT. His research interests include ad hoc networks and dependable computing.

Gang CUI is a professor at HIT. His research interests include fault-tolerant computing technology, computer architecture, ad hoc networks, and wireless sensor networks.

Hongwei LIU is a professor at HIT. His research interests include fault-tolerant computing technology, ad hoc networks, and wireless sensor networks.

Zhibo WU is a professor at HIT. His research interests include fault-tolerant computing technology, computer architecture, ad hoc networks, and wireless sensor networks.


Potential Contribution of CNN-based Solving of Stiff ODEs & PDEs to Enabling Real-Time Computational Engineering

Jean Chamberlain Chedjou ( 1 ), Cyrille Kalenga Wa Ngoy ( 2 ), Michel Matalatala Tamasala ( 2 ), Kyandoghere Kyamakya( 1 ) 

( 1 ): Transportation Informatics Group, Institute of Smart Systems Technologies, University of Klagenfurt (Austria), 

Email: kyandoghere.kyamakya@uni-klu.ac.at ; jean.chedjou@uni-klu.ac.at 

( 2 ): Department of Electrical and Computer Engineering, Polytechnic Faculty, University of Kinshasa (D. R. Congo) 

Email: Cyrille.Kalenga@vodacom.cd ; Michel.Matalatala@vodacom.cd 

Abstract — One of the most common approaches to avoiding complexity when numerically solving stiff ordinary differential equations (ODEs) is to approximate them by ignoring the nonlinear terms. For stiff partial differential equations (PDEs) the same is done by suppressing the nonlinear terms of the Taylor series expansion. In so doing, the traditional methods for solving stiff PDEs and ODEs compromise on both the efficiency and the precision of the resulting computations. This inevitably leads to less accurate results, which cannot provide the full insight that may be needed into the ‘real’ nonlinear dynamical behavior of the various engineering and natural systems (generally modeled by nonlinear ODEs or PDEs) that are analyzed within the novel discipline called Computational Engineering. For many of these systems, even real-time simulation and/or control of the behavior is desired or needed; this places extremely demanding requirements on the computing capability with regard to both speed and precision. This paper develops, and validates through a series of examples, a comprehensive high-precision and ultra-fast computing concept for solving stiff ODEs and PDEs with Cellular Neural Networks (CNN). The core of this concept is a straightforward scheme that we call ‘Nonlinear Adaptive Optimization (NAOP)’, which is used for precise template calculation for solving any (stiff) nonlinear ODE through CNN processors. One of the key contributions of this work, and a real breakthrough, is to demonstrate the possibility of mapping different types of nonlinearities displayed by various classical and well-known oscillators (e.g. van der Pol, Rayleigh, Duffing, Rössler, Lorenz, and Jerk oscillators, to name a few) onto first-order CNN elementary cells, thereby enabling the easy derivation of the corresponding CNN templates. Furthermore, for PDE solving, the same concept also allows a mapping onto first-order CNN cells while considering one or more nonlinear terms of the Taylor series expansion generally used to transform a PDE into a set of coupled nonlinear ODEs. Therefore, the concept of this paper contributes significantly to the consolidation of CNN as a universal and ultra-fast solver of stiff differential equations (both ODEs and PDEs). This clearly enables CNN-based, real-time, ultra-precise, and low-cost Computational Engineering. As proof of concept, some well-known prototypes of stiff equations (van der Pol, Lorenz, and Rössler oscillators) have been considered; the corresponding precise CNN templates are derived to obtain precise solutions of these equations. The concept can be implemented even on embedded digital platforms (e.g. FPGA, DSP, GPU); this opens a broad range of applications. Ongoing work (as outlook) uses NAOP to derive precise templates for a selected set of practically interesting PDE models such as Navier-Stokes, Schrödinger, and Maxwell.

Keywords: Stiff ODEs and PDEs, CNN-based differential equation solving, high-precision computing, ultra-fast computing, NAOP scheme for CNN templates’ calculation.

I. INTRODUCTION 

The last decades have witnessed a tremendous attention on 

solving nonlinear and stiff models (ODEs and/or PDEs) with 

the CNN paradigm [1]. The interest devoted to solving stiff 

models can be explained by their multiple potential 

applications especially in the so-called Computational 

Engineering context. Indeed, nonlinear models have been 

intensively used to understand, predict and describe the 

dynamical behavior of various engineering or natural systems. 

In the field of transportation and logistics, for example, traffic 

models do take the form of ODEs and/or PDEs [2]. Still, in the 

field of transportation, various image processing tasks which 

are of high importance for visual sensors in Advanced Driver 

Assistance Systems (e.g. contrast enhancement, segmentation, 

edge detection, etc.) can be expressed through solving 

corresponding stiff ODEs and/or PDEs [3]. 

Diverse contributions have been made to develop 

analytical, numerical and even hardware-based approaches to 

solve stiff ODEs and/or PDEs [1]-[20]. Amongst these 

contributions some have retained our attention namely “the 

solutions of PDEs and ODEs using the CNN-paradigm”. In 

fact, the flexibility of the CNN paradigm and its huge potential 

to enable a renaissance of the old “analog computing” through 

an emulation on digital platforms (e.g. FPGA or GPU, etc.) to 

perform ultra-fast and accurate computing of nonlinear models 

are some of its strongest points. Nevertheless, the relevant 

state-of-the-art does not provide significant information related 

to a straight-forward method to calculate the CNN templates 

needed for solving stiff ODEs and/or PDEs with the CNN 

paradigm. Despite some intensive works developed in this 

direction it is still unclear how to solve PDEs and/or ODEs 

with good accuracy or high precision. Only approximate 

solutions exist, for example the use of CNN processors in an 

approximation of numerical solutions of PDEs involving the 

finite difference method [7], [10]-[14]. This latter approach 

does not provide accurate results because the Taylor series 

expansion is truncated after the first-order term (i.e. 

linear expansion). A further interesting published approach to


solve PDEs is the group of learning schemes involved in an 

approximated solution of PDEs through CNN processors 

[15]-[20]. This latter approach requires some initial solutions 

along with some critical parameter settings of the equations 

under investigation in order to enable the training process. This 

is a clearly significant drawback as it is not always possible to 

provide this data/information whenever dealing with stiff 

ODEs and/or PDEs. 

Our aim in this paper is therefore to contribute to the 

enrichment of the relevant state-of-the-art by 

proposing/developing a systematic methodology (based on the 

CNN paradigm) which should help to clear some of the 

problems actually unsolved by the classical above described 

approaches. The key challenge thereby is developing a CNN-based 

computing concept for performing both ultra-fast and 

high-precision computing of stiff differential equations. The 

proposed method is based on a nonlinear adaptive optimization 

scheme to which we give the acronym “NAOP”. For proof of 

concept, the novel approach developed in this paper is applied 

to derive solutions of selected classical and well-known 

examples of stiff ODEs. In the following, the flexibility of the 

approach developed is extensively discussed, and we then 

explain how this approach can be extended to 

solve stiff PDEs with similar efficiency. 

The rest of the paper is organized as follows. Section 2 

presents an in-depth description of the novel concept. The 

quintessence of NAOP is explained and we thereby describe 

the scheme for deriving appropriate CNN templates values for 

any given nonlinear ODE. Section 3 then focuses on the 

proof of concept through a selected nonlinear differential 

equation that is solved using the new concept developed in this 

paper: the van der Pol equation. For this, corresponding 

‘precise’ templates are calculated through NAOP. In section 4 

the possible extension of the novel scheme involving NAOP 

for solving PDEs is discussed. And finally, a series of 

concluding remarks are presented in Section 5 along with the 

presentation of some interesting open research questions 

(outlook) that are under investigation in some of our on-going 

works. 

II. THE CONCEPT OF “NAOP” FOR CNN TEMPLATE 

CALCULATION AND SOLUTIONS OF STIFF ODES 

This section describes the approach based on the Nonlinear 

Adaptive Optimization (NAOP) for solving ODEs. The 

overall flow diagram of this approach is schematically 

displayed by the synoptic representation in Fig. 1. 

The NAOP is performed by a complex computing 

module (procedure) which works on two inputs. 

The first input contains wave-solutions generated by the 

state-control CNN network modeled by (1): 

$$\frac{dx_i}{dt} = -x_i + \sum_{j=1}^{M}\left[\hat{A}_{ij}x_j + A_{ij}y_j + B_{ij}u_j\right] + I_i \qquad (1)$$
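As a minimal sketch of how the right-hand side of (1) could be evaluated numerically, the following Python function computes dx_i/dt for all M cells. The output function y_j = 0.5(|x_j + 1| - |x_j - 1|) is the standard piecewise-linear CNN output from [1] and is an assumption here, since this section does not restate it. The names `cnn_output` and `cnn_rhs` and the example values are hypothetical.

```python
def cnn_output(x):
    """Piecewise-linear CNN output y = 0.5 * (|x + 1| - |x - 1|) (assumed, see [1])."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_rhs(x, A_hat, A, B, u, I):
    """Evaluate dx_i/dt of Eq. (1) for every cell i.

    x, u, I are lists of length M; A_hat, A, B are M x M nested lists holding
    the state-control, feedback, and control templates respectively.
    """
    M = len(x)
    y = [cnn_output(v) for v in x]
    dxdt = []
    for i in range(M):
        coupling = sum(
            A_hat[i][j] * x[j] + A[i][j] * y[j] + B[i][j] * u[j]
            for j in range(M)
        )
        dxdt.append(-x[i] + coupling + I[i])
    return dxdt


# Hypothetical two-cell example (values chosen only for illustration).
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
example = cnn_rhs([0.5, -2.0], zeros, identity, zeros, [0.0, 0.0], [0.1, 0.0])
```

The second cell in the example starts in the saturated region (x = -2, so y = -1), which shows how the output nonlinearity enters the coupling term.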

The second input contains wave-solutions of the model 

(i.e. the linear or nonlinear differential equation) under 

investigation, which can be rewritten in the following 

simplified form as a set (or pair) of second-order ODEs (see 

(2)): 

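$$\frac{d^2 y_i}{dt^2} = F\left(y_i,\, y_i^{n},\, \dot{y}_i^{m},\, z_i,\, z_i^{n},\, \dot{z}_i^{m},\, t\right) \qquad (2a)$$

$$\frac{d^2 z_j}{dt^2} = F\left(z_j,\, z_j^{n},\, \dot{z}_j^{m},\, y_j,\, y_j^{n},\, \dot{y}_j^{m},\, t\right) \qquad (2b)$$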

Figure 1. Synoptic representation of the key steps involved in the NAOP 

approach used for precise template calculation for solving both linear and 

nonlinear differential equations. 

The output of the NAOP system will generate, after 

extensive iterative computations or ‘training’ steps, 

appropriate CNN-templates to solve the corresponding ODEs 

(see (2)) when the convergence of the training process is 

achieved. 

The global process to derive the CNN-templates (i.e. 

NAOP) can be summarized as follows. The learning/training 

process is based on a mapping between the two inputs of the 

NAOP procedure. A convergence to local minima is the key 

purpose governing this template calculation process, the so-called 

NAOP. To achieve this, various basins of attraction are 

investigated sequentially, and corresponding CNN templates 

are determined for those various initial conditions. If some 

local attractors diverge from a local minimum, new sets of 

initial conditions are automatically generated to annihilate the 

divergence leading to a possible convergence to a local 

minimum. A large number of randomly generated attractors 

(either regular or chaotic) are obtained through various 

numerical simulations whereby each attractor corresponds to a 

specific set of CNN-templates. An attempt to map these 

attractors to those generated by the model under investigation 

is performed in a sequential process leading to the 

convergence to a local minimum when the mapping is 

achieved successfully. However, it is worth 

mentioning that during the training process our various 

numerical simulations have revealed that it is very 

difficult to find the optimal solution (i.e. the local 

minimum). This difficulty can be explained by the well-known 

inherent local minimum problem of the Hopfield neural 

network [8]-[9]. To overcome this problem, various basins of 

attractions are therefore generated within the NAOP process


and this generation is conducted in a sequential way until the 

internal dynamics of the global network of coupled oscillators 

converges to stable states. This convergence must be achieved 

in both the ‘CNN-templates’ and the ‘attractors’ which are all 

considered to be dynamic variables during the 

learning/training process. It is further worth mentioning that 

the essence of the NAOP concept is at its core an 

adaptive training process that is very comparable to the 

concept developed for the training of Hopfield neural 

networks towards an efficient tracking of local minima. 

Nevertheless, NAOP has been demonstrated capable of 

mapping all known nonlinearities of ODEs onto appropriate 

templates of a first-order CNN processor matrix. 
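The internal update rules of NAOP are not specified in this section. Purely as an illustration of the multi-start idea described above (several starting points are tried in sequence and the best local minimum is kept), the following sketch fits one scalar template of a one-cell linear model to a reference trajectory by restarted gradient descent. The model, the loss, the step size, and the restart count are all assumptions made for this example; this is not the authors' algorithm.

```python
import random


def simulate(a_hat, x0=1.0, dt=0.01, steps=300):
    """Euler-integrate the one-cell model dx/dt = -x + a_hat * x."""
    x, trajectory = x0, []
    for _ in range(steps):
        trajectory.append(x)
        x += dt * (-x + a_hat * x)
    return trajectory


def loss(a_hat, target):
    """Mean squared mismatch between the model trajectory and the target."""
    model = simulate(a_hat)
    return sum((p - q) ** 2 for p, q in zip(model, target)) / len(target)


def local_descent(a0, target, lr=0.2, iters=400, h=1e-5):
    """Plain gradient descent with a central finite-difference gradient."""
    a = a0
    for _ in range(iters):
        grad = (loss(a + h, target) - loss(a - h, target)) / (2.0 * h)
        a -= lr * grad
    return a, loss(a, target)


def multistart_fit(target, restarts=4, seed=0):
    """Run local descent from several random starts and keep the best result."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        candidate = local_descent(rng.uniform(-1.0, 1.0), target)
        if best is None or candidate[1] < best[1]:
            best = candidate
    return best


# Reference trajectory: stands in for the ODE dy/dt = -0.5 * y.
target = simulate(0.5)
a_fit, err = multistart_fit(target)
```

In this toy problem every start reaches the same minimum near a_hat = 0.5; the restarts only matter for loss surfaces with several local minima, which is the situation the text describes.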

[Figure 2a panels: Â11, Â12, Â21, Â22] 

Figure 2a. Convergence of the state-control CNN templates as achieved by the 

NAOP process for the following values of the system parameters: ε = 0.25 and 

ω = 1. 

III. APPLICATIONS TO SOLVING STIFF ODES 

We restrict our analysis to the case of the van der Pol 

oscillator which is a good prototype of a well-known self-sustained 

oscillator having the interesting characteristic of 

being able to generate sinusoidal-, quasi-periodic-, and 

relaxation- oscillations (see (3)) 

$$\frac{d^2 x}{dt^2} - \varepsilon\left(1 - x^2\right)\frac{dx}{dt} + \omega x = 0 \qquad (3)$$

Two possible states can be generated by (3). The first is the 

sinusoidal or almost sinusoidal state (ε ≪ 1); the second is the relaxation state (ε ≫ 1). We now want to solve 

(3) using the CNN-paradigm. We envisage the case where 

ε = 0.25 and ω = 1. For these parameter values the NAOP concept 

has been exploited to calculate the corresponding templates 

after convergence of the training process. This convergence is 

clearly illustrated by the plots presented in Figs. (2a) and (2b) 

showing the temporal evolution of both the state-control 

templates Âij (see Fig. (2a)) and the feedback templates Aij 

(see Fig. (2b)). As it appears in these figures, the convergence 

is achieved after a long transient phase displayed by the global 

training network. It is worth mentioning that the 

convergence of the process is achieved for suitable basins of 

attraction. From Fig. 2, one can easily read the following 

corresponding CNN templates that are then used to solve the 

van der Pol equation: 

Â11 = 1.0770, Â12 = −0.6300, Â21 = 1.3450, Â22 = 0.5850, 

A11 = 0.4473, A12 = −0.2586, A21 = 0.4846, A22 = 0.1310. 

[Figure 2b panels: A11, A12, A21, A22] 

Figure 2b. Convergence of the feedback templates achieved by the NAOP 

process for the following values of the system parameters: ε = 0.25 and ω = 1. 

This set of template values has been inserted in Fig. 3 to 

obtain the solution of (3) through the CNN paradigm. Indeed 

Fig. 3 is a general representation in SIMULINK of a CNN 

processor platform to solve second-order nonlinear ordinary 

differential equations. The key contribution of our approach, 

which is a breakthrough, is that we are now capable of 

transforming/mapping any type of nonlinearity displayed by 

nonlinear coupled and uncoupled ODEs into the type of 

nonlinearity displayed by the elementary first-order CNN- cell 

model. As proof of concept of the approach developed in this 

paper, we have used the CNN templates derived by this 

scheme to obtain the exact solutions of (3). The graphical 

representation of the CNN-processors for second order ODEs 

presented in Fig.3 has been used for rapid prototyping 

purposes (a hardware implementation in either DSP or FPGA 

or GPU platforms is then straight-forward). A direct numerical 

simulation of the same equation, i.e. (3) has also been 

performed using MATLAB and a comparison between these 



two results is shown in Fig. 4. As clearly appears there, the 

solution of (3) obtained by the approach based on the CNN-paradigm 

developed in this paper (Figs. (4a) and (4c)) and the solution of the 

same equation obtained through direct numerical simulation in 

MATLAB (Figs. (4b) and (4d)) are in very good agreement (i.e. same 

amplitude and same frequency of 

oscillations). 
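For readers who want to reproduce the direct numerical reference solution outside MATLAB, the following is a minimal sketch that integrates (3) with a classical fourth-order Runge-Kutta scheme for ε = 0.25 and ω = 1. The integrator, the step size, the initial condition, and the function names are assumptions made for illustration; the solver used by the authors is not stated in the text.

```python
def vdp_rhs(state, eps, omega):
    """First-order form of Eq. (3): x' = v, v' = eps * (1 - x^2) * v - omega * x."""
    x, v = state
    return [v, eps * (1.0 - x * x) * v - omega * x]


def rk4_step(f, state, dt, eps, omega):
    """One classical Runge-Kutta (RK4) step."""
    k1 = f(state, eps, omega)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)], eps, omega)
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)], eps, omega)
    k4 = f([s + dt * k for s, k in zip(state, k3)], eps, omega)
    return [
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def simulate_vdp(eps, omega, x0=0.1, v0=0.0, dt=0.01, t_end=200.0):
    """Return the sampled waveform x(t) on a uniform time grid."""
    state = [x0, v0]
    xs = []
    for _ in range(int(round(t_end / dt))):
        xs.append(state[0])
        state = rk4_step(vdp_rhs, state, dt, eps, omega)
    return xs


xs = simulate_vdp(eps=0.25, omega=1.0)
# Oscillation amplitude once the transient has died out (last 20 time units).
amplitude = max(abs(x) for x in xs[-2000:])
```

Plotting `xs` against time gives the waveform, and plotting x against its derivative gives the phase portrait. The steady-state amplitude is close to 2, the familiar limit-cycle amplitude of the van der Pol oscillator for small ε.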

Figure 3. SIMULINK graphical representation of the CNN computing 

platform to solve (3). 

The method proposed in this paper addresses a challenging problem: it 

demonstrates a systematic and straightforward way to 

solve nonlinear ordinary differential equations by the CNN 

paradigm. The key challenge has been to establish whether, and 

by which algorithm, any type of 

nonlinearity can be mapped onto the nonlinearity displayed by the elementary 

CNN cell. Therefore, the approach developed in this work is 

very flexible as it can be applied to solve different types of 

nonlinear and stiff ODEs. The template calculation scheme 

based on NAOP has also been successfully applied for solving 

Rayleigh, Lorenz and Rössler equations and corresponding 

CNN- templates have been successfully derived (due to space 

constraints we cannot present all these results in this paper). 

One interesting issue under investigation is the 

establishment/development of a library of CNN template-sets 

to solve the most common nonlinear and stiff ODEs including 

the ones already cited above. 

The next section addresses the extension 

of the approach developed in this paper to solving nonlinear 

and stiff PDEs. In fact, it will be shown that a discretization 

process could help to transform PDEs into sets of coupled or 

uncoupled nonlinear ODEs in order to make them solvable by 

the CNN-paradigm while thereby applying the scheme 

developed in this paper. 

Figure 4a. Waveform solution of (3) obtained by our new approach 

based on the CNN paradigm for ε = 0.25 and ω = 1. 

Figure 4b. Waveform solution of (3) obtained through direct 

numerical simulation of (3) in MATLAB for ε = 0.25 and ω = 1. 

Figure 4c. Waveform solution of (3) obtained by our new approach 

based on the CNN paradigm for ε = 1 and ω = 1. 

Figure 4d. Waveform solution of (3) obtained through direct 

numerical simulation of (3) in MATLAB for ε = 1 and ω = 1. 

[Figure 4 panel titles: CNN waveform; CNN phase portrait; MATLAB waveform; MATLAB phase portrait] 

IV. EXTENSION OF THE NAOP SCHEME TO SOLVING 

STIFF PARTIAL DIFFERENTIAL EQUATIONS 

This section explains the possibility of extending/applying 

the approach developed in this paper to solving PDEs. Unlike 

the traditional approach of solving stiff PDEs through CNN 

which takes into consideration only the linear terms of the 

Taylor’s series expansion, we include the higher order 

derivative terms in the Taylor’s series expansion of any given 

PDE in order to improve the accuracy of the obtained 

solutions. We consider, for illustration, Burgers' equation 

(4), which is a well-known prototype of partial differential 

equations and which has multiple potential applications 

in the field of transportation. 

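$$\frac{\partial u}{\partial t} = \frac{1}{R}\frac{\partial^2 u}{\partial x^2} - u\frac{\partial u}{\partial x} \qquad (4)$$

In order to solve (4) by the CNN-paradigm, applying an 

expansion (at the first order) based on the Taylor’s series does 

lead to the following equivalent form of (4): 

$$\frac{du_i}{dt} = \frac{1}{R}\,\frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} - u_i\,\frac{u_{i+1} - u_{i-1}}{2h} \qquad (5)$$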
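The semi-discretized system (5) can be written as a short function that returns du_i/dt for every interior grid node. This is a minimal sketch: holding the two boundary nodes fixed is an assumption made here, since the boundary treatment is not given in the text, and the function name `burgers_rhs` is hypothetical.

```python
def burgers_rhs(u, R, h):
    """Right-hand side of Eq. (5) on a uniform grid with spacing h.

    u is the list of nodal values; the first and last nodes are treated as
    fixed boundary values (assumption), so their time derivative is zero.
    """
    n = len(u)
    dudt = [0.0] * n
    for i in range(1, n - 1):
        diffusion = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / (R * h * h)
        advection = u[i] * (u[i + 1] - u[i - 1]) / (2.0 * h)
        dudt[i] = diffusion - advection
    return dudt


# Linear profile u = x on [0, 0.4]: the diffusion term vanishes and
# each interior node obeys du_i/dt = -u_i.
profile = [0.0, 0.1, 0.2, 0.3, 0.4]
rates = burgers_rhs(profile, R=10.0, h=0.1)
```

Each interior equation couples node i to its two neighbours and contains the quadratic products u_i u_{i+1} and u_i u_{i-1}, which is the quadratic nonlinearity that the text compares to the Lorenz and Rössler systems.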

One can see that (5) is a well-known prototype of a set of first-order 

coupled nonlinear ODEs. As it appears in (5), the 

discretization performed has resulted in a set of coupled 

ODEs with quadratic nonlinear terms (i.e. of types similar to 

Lorenz or Rössler). This type of nonlinearity is solvable by 

our approach (NAOP) developed in the preceding paragraph as 

we could already solve more complex types of nonlinearity 

(e.g. the nonlinearity in the van der Pol equation). As 

discussed in Section 1, taking the truncated Taylor’s series 

(only the linear terms) has been done reluctantly in the many 

published works, since there has been no way so far, 

according to the literature, to deal with the increased 

complexity and the nonlinearity that appear otherwise. It is 

obvious that the results produced in the case of a linear 

approximation are de facto less precise. While considering the 

higher-order (in this case second-order) derivative terms in 

order to increase precision, the Taylor’s series expansion 

could be applied to (4) and this could lead to results presented 

in (6): 

$$\frac{du_i}{dt} = \frac{1}{R}\left[\frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} - \frac{u_{i+1} - 3u_i + 3u_{i-1} - u_{i-2}}{2h^2} - \dots\right] - u_i\left[\frac{u_{i+1} - u_i}{h} - \frac{u_{i+1} - 2u_i + u_{i-1}}{2h} - \dots\right] \qquad (6)$$

Therefore, while considering (6), it becomes obvious that the 

NAOP developed in this paper is a good candidate for a 

straightforward derivation of the appropriate CNN-templates 

to solve (6). 

NAOP is thus also applicable for solving PDEs. The PDE must 

first be transformed into a set of coupled nonlinear ODEs. In 

this process, even nonlinear terms of the Taylor series 

expansion can be kept. Then NAOP is used to determine 

appropriate templates for solving those complex sets of 

generally coupled nonlinear ODEs. 

V. CONCLUDING REMARKS 

We have proposed and validated a theoretical concept 

based on the CNN paradigm for ultra-fast, potentially low-cost 

and high-precision computing of stiff ODEs and PDEs. Since 

we can solve these through CNN independently of the actual 

nonlinearity, we have reached a clear breakthrough that has 

the potential to enable a really ‘real-time’ Computational 

Engineering. 

The main benefit of solving ODEs and PDEs using CNN is the 

offered flexibility through NAOP to extract the CNN 

parameters through which CNN can solve any type of ODE or 

PDE. Another strong point of the CNN-paradigm is the 

resulting ultra-fast processing depending on the CNN 

implementation: DSP, FPGA, GPU, or CNN-Chip. One key 

objective of this work has been to advance the relevant state-of-the-art 

by proposing a novel framework to solve stiff 

ODEs and PDEs with high precision. To achieve this goal, 

we have proposed and demonstrated that the Nonlinear 

Adaptive Optimization (NAOP) technique is an 

efficient scheme to cope with solutions of any ODE or PDE. 

The NAOP is a learning/training method for mapping the 

wave solutions of the models describing the dynamics of a 

CNN-network to that of a given model (ODE). Taking just 

these two inputs, the learning process leads to the convergence 

to a local minimum where the complete mapping of the two 

models is achieved and CNN-templates are produced. 

Using the same technique, we proposed high-precision 

computing of stiff PDEs while accounting even for nonlinear 

terms (i.e. higher-order terms) in the Taylor series expansion 

used while transforming the PDE into a set of coupled 

nonlinear ODEs. In order to overcome the problem related to 

the speed of computation, an implementation of the concept 

developed in this work on FPGA, DSP, or GPU is 

possible and straightforward.

REFERENCES 

[1] Leon O. Chua, and Lin Yang, “Cellular Neural Networks: 

Theory,” IEEE Transactions on Circuits and Systems, vol. 35, 

no. 10, October 1988. 

[2] Milka Uzunova, Daniel Jolly, Emil Nikolov, and Kamel 

Boumediene, “The Macroscopic LWR Model of the Transport 

Equation Viewed as a Distributed Parameter System,” 

Proceedings of the 5th international conference on Soft 

computing as transdisciplinary science and technology, pp. 572- 

576, October 2008. 

[3] Song Chun Zhu, and David Mumford, “Gibbs Reaction and 

Diffusion Equations,” Proceedings of the 6th International 

Conference on Computer Vision, pp. 847, January 1998. 

[4] Tamer A. Abassy, Magdy A. El-Tawil, H. El-Zoheiry, “Exact 

Solutions of Some Nonlinear Partial Differential Equations 

Using the Variational Iteration Method Linked With Laplace 

Transforms and the Pade Technique,” Computers and 

Mathematics with Applications, vol. 54, pp. 940-954, October 

2007. 

[5] N. H. Sweilam, “Variational Iteration Method for Solving Cubic 

Nonlinear Schrodinger Equation,” Journal of Computational and 

Applied Mathematics, vol. 207, pp. 155-163, October 2007. 

[6] Michael Striebel, Andreas Bartel, and Michael Gunther, “A 

Multirate ROW-scheme for Index-1 Network Equations,” 

Applied Numerical Mathematics, vol. 59, pp. 800-814, March 

2009. 

[7] Tamas Roska, Leon O.Chua, Dietrich Wolf, Tibor Kozek, 

Ronald Tetzlaff, and Frank Puffer, “Simulating Nonlinear 

Waves and Partial Differential Equations via CNN-Part I:Basic 

Techniques,” IEEE Transactions on Circuits and Systems- 

I:Fundamental Theory and Applications, vol. 42, no. 10, 

October 1995. 

[8] J. J. Hopfield, and D. W. Tank, “Neural computation of 

decisions in optimization problems,” Biol. Cybernet. N 52, pp. 

141-152, 1985. 

[9] K. A. Smith, “Neural network for combinatorial optimization: A 

review of more than a decade of research,” INFORMS J, 

Computing, vol. 11, no. 1, pp. 15-34, 1999. 

[10] C. Del Negro, L. Fortuna, and A. Vicari, “Modelling Lava Flows 

by Cellular Nonlinear Networks (CNN): Preliminary Results,” 

Nonlinear Processes in Geophysics, vol. 12, pp. 505-513, 2005. 

[Figure 5 block labels: Users; Internet; Internet Connection; Central Server; API; Middleware; Platform BUS; GPIO; Multi Channel Memory Controller; DSP Cluster; FPGA Cluster; GPU Cluster; Power PC Cluster; Memory; Shared Memory; Hyper-Computer; Emulated Analog Computing or CNN processors; CNN implementation; HPC] 

Figure 5. Global architecture of the computing platform planned to enable a 

real-time computational Engineering. Diverse users may access the CNN 

processor platforms in a remote way through the Internet 

[11] I. Krstic, D. Kandic, and B. Reljin, “Cellular Neural Networks- 

An Analogous Model for Stress Analysis of Prismatic Bars 

Subjected to Torsion,” FME transactions, vol. 31, pp. 7-14, 

2003. 

[12] J.-H. Niu, H.-Z. Wang, H.-X. Zhang, J.-Y. Yan, and Y.-S. Zhu, 

“Cellular Neural Network Analysis for Two-Dimensional 

Bioheat Transfer Equation,” Medical & Biological Engineering 

& Computing, vol. 39, pp. 601-604, 2001. 

[13] Tibor Kozek, Leon O.Chua, Tamas Roska, Dietrich Wolf, 

Ronald Tetzlaff, Frank Puffer, and Karoly Lotz, “Simulating 

Nonlinear Waves and Partial Differential Equations via CNN- 

Part II: Typical Examples,” IEEE transactions on circuits and 

systems-I: Fundamental theory and applications, vol.42, No. 10, 

October 1995. 

[14] T. Kozek and T. Roska, “A Double Time-Scale CNN for 

Solving 2-D Navier-Stokes Equations,” CNNA-94 3rd IEEE 

International Workshop on Cellular Neural Networks and their 

Applications, December 1994. 

[15] Puffer, R. Tetzlaff, and D. Wolf, “A Learning Algorithm for 

Cellular Neural Networks (CNN) Solving Nonlinear Partial 

Differential Equations,” ISSSE Proceedings, 1995. 

[16] P. Lucie Aarts, and P. van der Veer, “Neural Network Method 

for Solving Partial Differential Equations,” J. Neural Processing 

Letters, vol. 14, no. 3, pp. 261-271, December, 2001 

[17] I. G. Tsoulos, D. Gavrilis, and E. Glavas, “Solving differential 

equations with constructed neural networks,” J. 

Neurocomputing, vol. 72, pp. 2385-2391, June 2009. 

[18] L O Chua, M Hasler, G S Moschytz, J Neirynck, “Autonomous 

cellular neural networks: A unified paradigm for pattern 

formation and active wave propagation,” IEEE Transactions on 

Circuits & Systems-I, Fundamental Theory and Applications, 

vol. 42, no.10, October 1995. 

[19] F. Puffer , R. Tetzlaff , D. Wolf, “Modeling Nonlinear Systems 

With Cellular Neural Networks”, IEEE Transactions on 

Acoustics, Speech, and Signal Processing ICASSP-96, vol. 6, 

pp. 3513-3516, 1996. 

[20] Josef A. Nossek, “Design and Learning With Cellular Neural 

Networks,” International Journal of Circuit Theory & 

Applications, vol. 24, pp. 15 – 24, 31 Dec 1998. 

[Figure 6 block labels: Job Request; Server Architecture; Web Server; Task Scheduler; Bill Manager; API & Abstract Layer; Platform BUS; Synthesis Tools; Finite Element; Image Processing; PDE; ODE] 

Figure 6. Core idea of the server architecture intended for the CNN based 

super-computing platform to enable real-time Computational Engineering. 

It is a detailed description of the central server given in Fig. 5. 


Kyandoghere Kyamakya obtained 

the ‘Ir. Civil’ degree in Electrical 

Engineering in 1990 at the 

University of Kinshasa. In 1999 he 

received his Doctorate in Electrical 

Engineering at the University of 

Hagen in Germany. He then worked 

three years as post-doctorate 

researcher at the Leibniz University 

of Hannover in the field of Mobility 

Management in Wireless Networks. From 2002 to 2005 he 

was junior professor for Positioning Location Based 

Services at Leibniz University of Hannover. Since 2005 he 

has been full Professor for Transportation Informatics and Director 

of the Institute for Smart Systems Technologies at the 

University of Klagenfurt in Austria. 

Cyrille Kalenga Wa Ngoy obtained the ‘Ir. Civil’ degree in 

Electrical Engineering at the University of Kinshasa. He has 

been an Assistant at the same University, in the 

Department of Electrical and Computer Engineering, for about ten years. 

Michel Matalatala Tamasala obtained the ‘Ir. Civil’ 

degree in Electrical Engineering at the University of 

Kinshasa. He has been an Assistant at the same 

University, in the Department of Electrical and Computer 

Engineering, for about four years. 

Jean Chamberlain Chedjou 

received in 2004 his doctorate in 

Electrical Engineering at the 

Leibniz University of Hanover, 

Germany. He has been a DAAD 

(Germany) scholar and also an 

AUF research Fellow (Postdoc.). 

From 2000 to date he has been a 

Junior Associate researcher in the 

Condensed Matter section of the ICTP (Abdus Salam 

International Centre for Theoretical Physics) Trieste, Italy. 

Currently, he is a senior researcher at the Institute for Smart 

Systems Technologies of the Alpen-Adria University of 

Klagenfurt in Austria. His research interests include 

Electronics Circuits Engineering, Chaos Theory, Analog 

Systems Simulation, Cellular Neural Networks, Nonlinear 

Dynamics, Synchronization and related Applications in 

Engineering. He has authored and co-authored 3 books and 

more than 40 journal and conference papers.

Performance Comparison of OFDM and Single 

Carrier Modulations over Satellite Channels 

Yuri Labrador, Masoumeh Karimi, Niki Pissinou, and Deng Pan 

Abstract — This paper aims to explore the feasibility of using 

OFDM over satellite channels with high order modulation 

techniques such as QPSK and 16QAM, and strong error 

correction algorithms. Moreover, a performance comparison 

between currently used single-carrier techniques and OFDM is 

presented. 

Index Terms — Satellite, OFDM, QPSK, 16 QAM. 

DIGITAL modulation techniques over satellite have 

become, in the past five years, the main transmission 

technique used in television and video transmission because 

digital modulation combined with video compression can more 

efficiently use the satellite bandwidth. The single carrier 

modulation currently used performs very well over satellite 

channels in fixed receiver environments. When we deal with 

mobile users the channel presents multipath effects as well as 

Doppler shifts. In this scenario single carrier modulation does 

not perform as well as in fixed environments. Orthogonal Frequency 

Division Multiplexing, on the other hand, performs much 

better in multipath and frequency selective channels; it also 

uses the available spectrum in a very efficient way. This paper 

aims to explore the feasibility of using OFDM over satellite 

channels with high order modulation techniques such as QPSK 

and 16QAM, and strong error correction algorithms. A 

performance comparison between currently used single-carrier 

techniques and OFDM is presented as well. 

II. SINGLE CARRIER MODULATION VERSUS OFDM 

Single carrier modulation presents two main problems when 

used in frequency selective channels. These two problems are 

[1]: (1) frequency selective channels introduce inter symbol 

interference at the receiver; and (2) equalization at the receiver 

may also amplify noise in frequencies where channel response 

is poor. As a result, single carrier modulation is degraded by 

high attenuation in some bands. Since the same carrier uses 

the entire bandwidth, this problem can become very serious 

(see Figure 1). 

Manuscript received January 21, 2010. 

The authors are with Florida International University, Miami, FL, e-mails: 

{ylabr001, mkari001, pissinou, pand}@fiu.edu. 

The bandwidth must be divided into many small bands, and 

then a carrier may be allocated in each one. Furthermore, the 

data stream should be divided into many parallel data streams, 

modulating individual carriers. Then, the signals can be added 

together and transmitted. Thus, the entire bandwidth will be 

used, but with many individual and smaller carriers as shown 

in Figure 2. 
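The scheme just described (parallel sub-streams, one carrier each, summed into a single transmit signal) can be sketched in a few lines. The sketch assumes complex-exponential carriers φ_k[n] = exp(j2πkn/M) and a 1/M normalisation; neither is fixed by the text, and the function names are hypothetical.

```python
import cmath


def ofdm_modulate(symbols):
    """Sum M sub-carriers, each weighted by one symbol X_k, into x[n]."""
    M = len(symbols)
    return [
        sum(symbols[k] * cmath.exp(2j * cmath.pi * k * n / M) for k in range(M)) / M
        for n in range(M)
    ]


def ofdm_demodulate(samples):
    """Recover X_k by correlating x[n] with each orthogonal carrier."""
    M = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * cmath.pi * k * n / M) for n in range(M))
        for k in range(M)
    ]


# Four QPSK symbols, one per sub-carrier (illustrative values).
tx_symbols = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
time_signal = ofdm_modulate(tx_symbols)
rx_symbols = ofdm_demodulate(time_signal)
```

Over an ideal channel the demodulator returns the transmitted symbols, because the carriers are orthogonal over the M-sample block. In practice the two sums are computed with an inverse FFT and an FFT.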

[Figure panel labels: H(jΩ) versus f; sub-streams X_0, X_1, …, X_{M−1}; carriers φ_0[n], φ_1[n], …, φ_{M−1}[n]; x[n]] 

Fig. 1. Channel Response 

Some advantages of OFDM are as follows [8]: (1) the 

available spectrum is divided into smaller sub-bands; (2) data 

is divided in the transmitter site, and each sub-stream 

modulates one sub-carrier; (3) power and rate of transmission 

in a band depend on the channel response on that band; and (4) 

no ISI, since in each narrow sub-band, the channel response is 

almost flat [7] (see Figure 3). 

In general an OFDM transmission can be represented as 

shown in Figure 4. 

Fig. 2. OFDM Principle of Operation. 

The power required in each sub-channel is distributed, 

depending on the value of Hi. Then, the number of bits to be 

transmitted to each sub-channel is determined. The number of 

bits and the constellation can be chosen for a sub-channel 

based on the SNR in that particular sub-channel and the 

required probability of error. 
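One common way to turn a per-sub-channel SNR and a target error probability into a number of bits is the "SNR gap" approximation, b_i = floor(log2(1 + SNR_i / Γ)), where the gap Γ grows as the required error probability shrinks. The text does not say which rule the authors use, so the formula, the function name, and the 9.8 dB gap in the example are assumptions made only to illustrate the idea.

```python
import math


def bits_per_subchannel(snr_db, gap_db):
    """Integer bit allocation per sub-channel using the SNR-gap rule (assumed)."""
    gap_lin = 10.0 ** (gap_db / 10.0)
    bits = []
    for s in snr_db:
        snr_lin = 10.0 ** (s / 10.0)
        bits.append(int(math.floor(math.log2(1.0 + snr_lin / gap_lin))))
    return bits


# Hypothetical sub-channel SNRs in dB, from a good band to a poor one.
allocation = bits_per_subchannel([30.0, 20.0, 10.0, 0.0], gap_db=9.8)
```

Sub-channels with a good channel response carry denser constellations, while a sub-channel whose SNR falls below the gap carries no bits at all.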

Fig. 3. Orthogonal Carriers (vertical axis: amplitude). 

The signal spectrum of a single carrier and an OFDM 

modulation differ in two main characteristics (Figure 5). 

1) Single carrier shows one main frequency which is 

modulated using some digital scheme such as QPSK or 

8PSK. 

2) OFDM is composed of a series of carriers individually 

modulated; these carriers are orthogonal with respect to 

each other. 

III. SATELLITE CHANNEL MODELS 

A fixed receiver satellite channel is modeled for practical 

application as an Additive White Gaussian Noise (AWGN) channel with 

a path loss block that takes into consideration the distance 

between the satellite and the receiver antenna and the 

operating frequency. This produces a path loss attenuation that 

varies depending on the type of satellite used. For 

geostationary satellites this attenuation in C Band can be on the 

order of hundreds of dB. These parameters, when simulated 

in MATLAB, give a very close representation of real life 

scenarios in terms of Bit Error Rate (BER) calculation and 


Signal to Noise Ratio (SNR). Several simulation runs and real 

life tests have been performed to demonstrate that the channel 

model is a correct approximation of real life events [10]. 
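To give a feel for the size of the path loss term, the sketch below evaluates the standard free-space path loss formula, FSPL(dB) = 20 log10(d_km) + 20 log10(f_GHz) + 92.45. Using free-space loss for the path loss block, the geostationary altitude of 35 786 km as the distance, and a 4 GHz C Band downlink frequency are assumptions for illustration; the paper does not list its link parameters here.

```python
import math


def free_space_path_loss_db(distance_km, freq_ghz):
    """Free-space path loss in dB for a distance in km and a frequency in GHz."""
    return 20.0 * math.log10(distance_km) + 20.0 * math.log10(freq_ghz) + 92.45


# Assumed geostationary link: 35 786 km slant range, 4 GHz downlink.
geo_loss_db = free_space_path_loss_db(35786.0, 4.0)
```

For these assumed values the loss comes out at a little under 200 dB, which is the order of magnitude referred to in the text.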

When dealing with mobile receivers a more complex model 

needs to be considered. 

Propagation characteristics in satellite channels are more 

susceptible to weather impairments, especially at higher 

frequencies [2], [3]. Average rain and shadowing may 

completely disrupt the communication link. A mobile satellite 

channel model that takes into consideration potential weather 

impairments and the multipath-fading phenomenon is 

necessary in order to represent the satellite channel. Some 

models have been proposed, but they only consider one of 

either the multipath effects or the weather effect. The 

propagation effects present in a mobile satellite link include 

those related to the troposphere (rain, etc.), or effects caused 

by the receiver’s environment (multipath). The troposphere 

effects are denoted by α, and the environmental effects are 

denoted by β. The two effects are assumed to be statistically 

independent because their underlying mechanisms are 

independent. The amplitude of the received signal can be 

described as: 

A = α ⋅ β (1) 

The satellite channel model has two states (good and bad 

states): one is a non-shadowing state, and the other is a 

shadowing state [4], [5]. This two-state model forms a Markov 

model. In the non-shadowing state, the received signal 

amplitude can be described as a Rician distribution : 

$$p_{\text{non-shadowing}}(A) = 2K \cdot A \cdot \exp\left[-K\left(A^{2}+1\right)\right] \cdot I_{0}(2K \cdot A) \qquad (2)$$

where K is the Rice factor.

In the shadowing state, where no LOS exists, the channel is described as Rayleigh multipath fading. The signal at the receiver is expressed as:

$$p_{\text{shadowing}}(A) = \frac{2A}{s_{0}} \exp\left(-\frac{A^{2}}{s_{0}}\right) \qquad (3)$$

Fig. 4. OFDM Transmitter and Receiver.
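The two amplitude densities of Eqs. 2 and 3 can be checked numerically. The following is a minimal sketch, not the authors' MATLAB model: it assumes that I0 is the modified Bessel function of the first kind of order zero, and the values K = 5 and s0 = 0.5 as well as all function names are illustrative assumptions.

```python
import math


def bessel_i0(x):
    """Modified Bessel function of the first kind, order zero (power series)."""
    term = 1.0
    total = 1.0
    k = 0
    # Series: sum over k of ((x/2)^(2k)) / (k!)^2; all terms are positive.
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def rician_pdf(a, k_factor):
    """Non-shadowing amplitude density of Eq. 2."""
    return 2.0 * k_factor * a * math.exp(-k_factor * (a * a + 1.0)) * bessel_i0(2.0 * k_factor * a)


def rayleigh_pdf(a, s0):
    """Shadowing amplitude density of Eq. 3."""
    return 2.0 * a / s0 * math.exp(-a * a / s0)


def integrate(f, lo, hi, steps=4000):
    """Midpoint-rule integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


# Illustrative parameter values (assumptions, not taken from the paper).
rician_area = integrate(lambda a: rician_pdf(a, 5.0), 0.0, 6.0)
rayleigh_area = integrate(lambda a: rayleigh_pdf(a, 0.5), 0.0, 6.0)
print(rician_area, rayleigh_area)
```

Both areas should come out very close to 1, which is a quick sanity check that the reconstructed densities are properly normalized.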


Of all potential weather impairments, rain is the most 

critical, especially in tropical weather, where rainfall can be 

severe. The long-term statistics of potential rainfall can be 

described by a lognormal equation: 

$$P_{L}(L) = \frac{1}{\sigma_{d} L \sqrt{2\pi}} \exp\left[-\frac{(\ln L - m_{d})^{2}}{2\sigma_{d}^{2}}\right], \quad L \geq 0 \qquad (4)$$
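As an illustration of Eq. 4, the sketch below evaluates the lognormal density and the probability that the attenuation L exceeds a given level. The values of m_d and σ_d are placeholders chosen only for demonstration; the paper does not state them, and the function names are hypothetical.

```python
import math


def lognormal_pdf(l, m_d, sigma_d):
    """Lognormal density of Eq. 4; zero for non-positive L."""
    if l <= 0.0:
        return 0.0
    z = (math.log(l) - m_d) / sigma_d
    return math.exp(-0.5 * z * z) / (sigma_d * l * math.sqrt(2.0 * math.pi))


def lognormal_cdf(l, m_d, sigma_d, steps=4000):
    """P(L <= l), integrating the density over u = ln L with the midpoint rule."""
    if l <= 0.0:
        return 0.0
    lo = m_d - 10.0 * sigma_d          # lower tail beyond this point is negligible
    hi = math.log(l)
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        u = lo + (i + 0.5) * h
        # dL = e^u du, so the integrand is pdf(e^u) * e^u.
        total += lognormal_pdf(math.exp(u), m_d, sigma_d) * math.exp(u)
    return total * h


def exceedance_probability(l, m_d, sigma_d):
    """P(L > l): fraction of time the attenuation exceeds l."""
    return 1.0 - lognormal_cdf(l, m_d, sigma_d)


M_D, SIGMA_D = 0.5, 0.8   # assumed placeholder parameters
median_level = math.exp(M_D)
print(exceedance_probability(median_level, M_D, SIGMA_D))
```

At the median level exp(m_d) the exceedance probability is one half, which the sketch reproduces numerically.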

Studies on rain attenuation between fixed and mobile 

systems show that the probability distribution of the envelope 

of a mobile receiver can be described by the one used for the 

fixed system, multiplied by a factor that changes between 0.5 

and 2.0 and is independent of rain attenuation [6]. Figure 6 

demonstrates the probability density function versus the 

amplitude. 

Figure 7 shows measurements from real-life C-Band transponders, illustrating the effect of rain on the current of the on-board solid-state amplifier.

IV. SIMULATIONS 

MATLAB is used in this paper to simulate these modulation techniques. The simulation includes constellations for QPSK, 8PSK and 16QAM and signal spectra for both single carrier and OFDM techniques. We include 16QAM in order to show a type of digital modulation that varies both amplitude and phase, in contrast to QPSK and 8PSK, which only modulate the phase of the carrier signal [9].

For each simulation we created five blocks:

1. Ground station block that includes:
   a) Random Digital Source that generates digital pulses.
   b) Error correction blocks for a code rate of 3/4.
   c) QPSK, 8PSK or 16QAM Modulator that performs the actual modulation.
   d) OFDM modulator (for the OFDM simulations).
   e) Raised Cosine Transmit Filter.
   f) High Power Amplifier.
   g) Transmitting Antenna.

2. Uplink Path block that includes:
   a) Free Space Path Loss; this block simulates the Uplink free space attenuation due to frequency and distance from the Uplink site to the satellite. The Uplink frequency is 6245 MHz and the distance 35600 km, giving a total attenuation of 199 dB.
   b) Phase/Frequency offset.

3. Satellite block that includes:
   a) Satellite receiving antenna.
   b) Satellite receiver system temperature.
   c) Phase Noise.
   d) I/Q Balance.
   e) Phase/Frequency offset.
   f) Power Amplifier.
   g) Satellite Transmitting Antenna.

4. Downlink Path block that includes:
   a) Free Space Path Loss; this block simulates the Downlink free space attenuation due to frequency and distance from the Satellite to the Receiving Station. The Downlink frequency is 4020 MHz and the distance 35600 km, giving a total attenuation of 196 dB.
   b) Phase/Frequency offset.
   c) Rician Multipath fading Channel (for mobile receiver scenarios).

5. Receiving Earth Station block that includes:
   a) Receiving Antenna.
   b) Receiver Noise Temperature.
   c) Phase Noise.
   d) I/Q Balance.
   e) Phase/Frequency Offset.
   f) Raised Cosine Receive Filter.
   g) OFDM demodulator (for OFDM simulations).
   h) QPSK, 8PSK or 16QAM Demodulator.
   i) Error correction decoder.

Fig. 5. Single carrier vs. OFDM spectrum.

Fig. 6. Probability Density Functions for Non-shadowing State.

Fig. 7. On board SSPA Current Attenuation in the Satellite Transponder.
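The free space attenuation figures quoted for the Uplink and Downlink Path blocks can be reproduced with the standard free-space path loss formula, 20·log10(4πdf/c). The sketch below is an independent check in Python rather than the MATLAB block used in the paper; the function and constant names are illustrative.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def free_space_path_loss_db(frequency_mhz, distance_km):
    """Free-space path loss 20*log10(4*pi*d*f/c), in dB."""
    distance_m = distance_km * 1e3
    frequency_hz = frequency_mhz * 1e6
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_PER_S
    )


# Frequencies and distance as stated for the Uplink and Downlink Path blocks.
uplink_db = free_space_path_loss_db(6245.0, 35600.0)
downlink_db = free_space_path_loss_db(4020.0, 35600.0)
print(f"uplink {uplink_db:.1f} dB, downlink {downlink_db:.1f} dB")
```

Rounded to the nearest dB, the two results agree with the 199 dB and 196 dB stated in the block list.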

The simulation allows several parameters to be varied for different scenarios, such as the TX and RX antenna diameter and gain; in that way we can see how the received spectrum is affected when the sizes of the antennas are changed.

Other parameters of interest that can be changed and that affect the received signal are: HPA gain; Uplink and Downlink frequencies, and thus Uplink and Downlink free space attenuation; noise temperatures, which were originally set to the typical value of 290 K; phase noise; phase correction; Doppler error; AGC type; phase and frequency offsets; and the order of the error correction algorithms used. Note that the simulations include several spectrum monitors and constellation displays that can be moved to different parts of the diagram to check the shape of the spectrum at any point along the path.

Figures 8 and 9 show the effects of channel models on the 

transmitted and received spectrum for both single carrier 

modulation and OFDM. 

Figure 10 shows the values of BER of the OFDM simulation for different values of the Rician factor K. Under this channel model the BER detection threshold is reached at BER values of $10^{-3}$ with an $E_b/N_0$ of 8 dB. The bandwidth is 5 MHz.

If Turbo codes are used, the values of BER of the OFDM QPSK signal are those shown in Figure 11. The same Rician factors K were used.

Fig. 8. Transmitted and Received Spectrums Single Carrier Modulation. 

Fig. 9. Transmitted and Received Spectrums OFDM Modulation. [Plot: OFDM spectrum versus frequency, BW = 5 MHz; traces: Uplink Signal, HPA Effects on the OFDM Signal.]

V. PERFORMANCE COMPARISON AND EXPERIMENTAL 

RESULTS SIMULATIONS 

Table I compares the data rate of the existing single carrier satellite modulation techniques with that of a multiple carrier (OFDM) scheme using time diversity.

TABLE I
PERFORMANCE OF OFDM TIME DIVERSITY IN SATELLITE CHANNELS.

Modulation  Bandwidth (MHz)  FEC  Single carrier data rate,  Proposed OFDM time
                                  DVB-S scheme (Mbps)        diversity scheme (Mbps)
QPSK        5                1/2  3.5                        4.5
QPSK        5                3/4  5.3                        6.75
QPSK        5                5/6  5.9                        7.5
16 QAM      5                1/2  n/a                        9.01
16 QAM      5                3/4  10.6                       13.5
16 QAM      5                5/6  12.4                       15.1

The test aimed to produce a working version of the OFDM QPSK modulation system for performance verification, and to:

1) Conduct a review of actual RF performance over a typical C-Band transponder.

2) Demonstrate the inherent robustness of the system in the presence of normal satellite transmission impairments.

The test was located at the Univision Network Communications Uplink facility in Miami, FL. The output of the modulator at 70 MHz was fed into a Radyne Upconverter with the +7 dBm output option, which then fed an MCL Klystron C-Band HPA.

The Uplink antenna used was a 9.1 m Scientific Atlanta with a 4-port feed, transmitting onto transponder 16 on AMC-1 at 103 West. The downlink available for the test was a 3.1 m receiving antenna in Miami. The downlink is equipped with standard digital-quality DRO-based LNBs. Typical noise temperatures of 25 to 35 Kelvin were noted.

The modulator was set to a nominal output level of -8 dBm. The spectrum was noted as very clean, with only low-level spurs noted at the IF frequency of 70 MHz. The resulting IF spectrum exhibited at least -30 dBc at ±2.5 MHz, indicating that very little RF power would be wasted in out-of-band transmissions absorbed by the transponder filters. The RF output of the Upconverter was set to a nominal output level of +1 dBm, well below the +7 dBm saturation level of the upconverter output.

The RF input to the HPA was set to a nominal level of -22 dBm. Checking the HPA output via a 57.1 dB coupler, the spurious emissions were noted to be -55 dBc or lower. No special tuning of the Klystron was performed, as it was deemed unnecessary for the purpose of the test. In order to determine the proper operating point for the service in the transponder, a series of RF level tests were performed, using both CW and modulated carriers. The CW tests indicated that the -1 dB saturation point for the transponder SSPA was reached with a transmit level of 80 Watts, as measured by the HPA output coupler.

An operating point of 0.5 dB below the -1 dB saturation point was chosen as the nominal operation level, to maximize downlink performance without introducing significant distortion to the modulation. Saturating the transponder is to be avoided because it increases Inter Symbol Interference in the demodulator within the receiver, causing loss of RF margin. The local 3.1 m antenna was peaked on the satellite and was used as the reference antenna for the bulk of the tests, as it constituted a stress-test scenario for the system.

Upon modulating the OFDM carriers, the system performance was measured over the full range from 10 dB OBO to saturation, with the signal to noise ratio (SNR) displayed by the receiver providing nominal increases up to the 1 dB saturation point. This result indicates that if there was an increase in distortion of the OFDM signal, it was not discernible by the receiver. Further tests should be performed to determine the actual extent of such distortion, independent of the SNR readout.


The overall system performance is shown below:

Parameter              3.1m  3.6m  3.7m  5.0m  7.3m
SNR (dB)               11.0  11.7  11.7  14.4  18.0
Signal Level           57    61    57    53    57
Margin (3/4 FEC) (dB)  2.0   2.7   2.7   5.4   9.0
Margin (5/6 FEC) (dB)  0.6   1.3   1.3   4.0   7.6

Fig. 10. BER Values QPSK OFDM Satellite Channel. [Plot: Bit Error Rate (10^0 down to 10^-8) versus E_b/N_0 (0 to 40 dB); Satellite HPA Saturation Level = 1 dB.]

The system was tested at FEC rates of 3/4, 5/6 and 8/9 (not shown), with most of the testing performed at the 3/4 and 5/6 rates, as these were thought to be the most likely operational rates for the system.

Fig. 11. BER Values QPSK OFDM Turbo Coded Satellite Channel. [Plot titles: OFDM QPSK Satellite Link K=5; OFDM QPSK Satellite Link Turbo Code 1/3 K=5; HPA Saturation Level 1 dB.]

Some problems encountered:

Transponder 16 operations were significantly affected by 

adjacent-satellite interference, both uplink and downlink, from


a co-frequency, co-polarized analog video uplink on Galaxy 4, 

at 99 West, 4 degrees away. This is a perfectly legitimate 

interference situation, and is typical of the interference to be 

expected while operating on a C-Band transponder in the 

middle of the dense cable neighborhood portion of the US 

domestic arc. 

The interference was mainly downlink-dominated on the 

smaller diameter receiving antennas. It is estimated that the 

interference contributed to a general 1.5 to 3 dB degradation 

of the system performance on the 3.1 m antenna. It was also 

noted that the SNR reading on the receiver monitoring the 3.1 

m antenna would occasionally fluctuate 0.1 to 0.2 dB, 

probably due to changes in the nature of the overall 

interference level. 

Using 3/4 FEC, the system performed with about 3 dB margin for the worst-case antenna of 3.1 m. A 3/4 FEC rate operating into any location within the 39 dBW contour, using a 3.1 m antenna or better, should have adequate margin.

The following data show a comparison between OFDM QPSK and OFDM 16QAM and the available margins over threshold:

Modulation (OFDM)       QPSK  QPSK  16QAM  16QAM
Coding                  RSV   RSV   RSTC   RSTC
Transponder BW (MHz)    5     5     5      5
FEC                     3/4   5/6   3/4    5/6
Total Data Rate (Mbps)  6.75  7.5   13.5   15.1
C/No Threshold (dB)     6.9   7.9   9.0    10.4
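The margins listed in the system performance table are numerically consistent with the measured SNR minus the OFDM 16QAM thresholds of 9.0 dB (3/4 FEC) and 10.4 dB (5/6 FEC) given above. Whether the authors computed the margins exactly this way is an assumption; the sketch below only checks the arithmetic, and its names are illustrative.

```python
# Measured SNR per receive antenna diameter, from the system performance table.
SNR_DB = {"3.1m": 11.0, "3.6m": 11.7, "3.7m": 11.7, "5.0m": 14.4, "7.3m": 18.0}

# OFDM 16QAM thresholds per FEC rate, from the threshold table (assumed pairing).
THRESHOLD_DB = {"3/4": 9.0, "5/6": 10.4}


def link_margin_db(snr_db, threshold_db):
    """Margin over threshold in dB, rounded to one decimal like the table."""
    return round(snr_db - threshold_db, 1)


margins = {
    fec: {antenna: link_margin_db(snr, threshold) for antenna, snr in SNR_DB.items()}
    for fec, threshold in THRESHOLD_DB.items()
}

for fec, per_antenna in margins.items():
    print(fec, per_antenna)
```

A positive margin means the link closes with room to spare; the smallest antenna at the 5/6 rate is the tightest case.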

A similar analysis is performed with existing single carrier modulation using QPSK and 8PSK.

Modulation (Single Carrier)  QPSK  QPSK  8PSK  8PSK
Coding                       RSV   RSV   RSTC  RSTC
Transponder BW (MHz)         5     5     5     5
FEC                          3/4   5/6   3/4   5/6
Total Data Rate (Mbps)       5.3   5.9   9.7   11.5
C/No Threshold (dB)          5.8   7.2   8.4   10.1

The C/No threshold increases when the FEC rate goes from 3/4 to 5/6 and also when the modulation order is higher. The OFDM QPSK 3/4 threshold is about 2 dB lower than the OFDM 16QAM 3/4 threshold (6.9 dB versus 9.0 dB). This is in accordance with theoretical analysis, because with 16QAM the signal to noise ratio has to be better in order for the receiver to distinguish constellation points that are closer together than in QPSK modulation.
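The geometric reason for the higher 16QAM threshold can be illustrated by comparing the minimum distance between constellation points at equal average symbol energy. This is a textbook sketch under the assumption of ideal square constellations; it ignores coding and the per-bit energy normalization, so it does not reproduce the measured 2 dB difference, only its direction.

```python
import itertools
import math


def normalised(points):
    """Scale a constellation to unit average symbol energy."""
    average_energy = sum(abs(p) ** 2 for p in points) / len(points)
    return [p / math.sqrt(average_energy) for p in points]


def min_distance(points):
    """Smallest Euclidean distance between any two constellation points."""
    return min(abs(a - b) for a, b in itertools.combinations(points, 2))


qpsk = normalised([complex(i, q) for i in (-1, 1) for q in (-1, 1)])
qam16 = normalised([complex(i, q) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)])

# How much closer (in dB) the nearest 16QAM points are than the nearest QPSK points.
gap_db = 20.0 * math.log10(min_distance(qpsk) / min_distance(qam16))
print(f"minimum-distance gap: {gap_db:.2f} dB")
```

The closer the nearest neighbours, the less noise is needed to push a received sample across a decision boundary, which is why the higher-order modulation needs a better signal to noise ratio.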

VI. CONCLUSION 

The single carrier modulation currently used performs very well over satellite channels in fixed receiver environments. When we deal with mobile users, the channel presents multipath effects as well as Doppler shifts. In this scenario, single carrier modulation does not perform as well as in fixed environments. Orthogonal Frequency Division Multiplexing, on the other hand, performs much better in multipath and frequency selective channels; it also uses the available spectrum very efficiently. This paper has explored the feasibility of using OFDM over satellite channels with high order modulation techniques such as QPSK and 16QAM, and strong error correction algorithms. Furthermore, a performance comparison between currently used single carrier techniques and OFDM has been presented.

REFERENCES 

[1] K. J. Ray Liu, Ahmed K. Sadek, Weifeng Su, and Andres Kwasinski, 

“Cooperative Communications and Networks,” Cambridge, 2009. 

[2] M. Rice, J. Slack, and B. Humphreys, “K-Band land mobile satellite 

channel characterization,” Int. J. Satellite Communications, Vol. 14, 

pp. 283-296, 1996. 

[3] E. Kubista, F. Perez Fontan, M. Angeles Vazquez Castro, S. Buonomo,

B. R. Arbesser-Rastburg, and J.P.V. Poiares Baptista, “Ka-band

propagation measurements and statistics for land mobile satellite 

applications,” IEEE Transactions on Vehicular Technology, Vol. 49, 

pp. 973-983, May 2000. 

[4] Wenzhen Li, Choi Look Law, V. K. Dubey, and J. T. Ong, “Ka-band 

land mobile satellite channel model incorporating weather effects,” 

IEEE Communications Letters, Vol. 5, Issue 5, pp. 194-196, May 2001. 

[5] C. Loo and J. S. Butterworth, “Land mobile satellite channel 

measurements and modeling,” Proc. IEEE, Vol. 86, pp. 1442-1463, July 

1998. 

[6] E. Lutz, D. Cygan, M. Dippold, F. Dolainsky, and W. Papke, “The land

mobile satellite communication channel-Recording, statistics and 

channel model,” IEEE Transactions on Vehicular Technology, Vol. 40, 

pp. 375-384, May 1991. 

[7] Yuri Labrador, Masoumeh Karimi, Deng Pan, and Jerry Miller, “OFDM 

MIMO Space Diversity in Terrestrial Channels,” International Journal of 

Computer Science and Network Security (IJCSNS), Vol.9, No.10, pp. 

52-61, October 2009. 

[8] Simon Plass, Armin Dammann, Gerd Richter, and Martin Bossert, 

“Channel Correlation Properties in OFDM by using Time-Varying 

Cyclic Delay Diversity,” Journal of Communications, Vol. 3, No. 3, July 

2008. 

[9] Yuri Labrador, Masoumeh Karimi, Deng Pan, and Jerry Miller, “An 

Approach to Cooperative Satellite Communications for 4G Mobile 

Systems,” Journal of Communications, Vol. 4, No. 10, November 2009. 

[10] Oh-Soon Shin, A. M. Chan, H. T. Kung, and V. Tarokh, “Design of an 

OFDM Cooperative Space-Time Diversity System,” IEEE Transactions 

on Vehicular Technology, Vol. 56, No. 4, July 2007.


Software Rejuvenation Technique-An Improvement in Applications with Multiple Versions

Zahra Rahmani Ghobadi 1, Hassan Rashidi 2

1. Zahra Rahmani Ghobadi is an MSc student in the Department of Computer Engineering, Qazvin Azad University, Qazvin, Iran (phone: 0989358224714 e-mail: m.rah62@gmail.com).

2. Hassan Rashidi is an Assistant Professor in the Department of Computer Engineering, Qazvin Azad University, Qazvin, Iran (phone: 0989126772017 e-mail: hrashi@gmail.com).

Abstract — With the spread of software technology and modern applications, software reliability and availability have become serious problems. Software fault tolerance techniques improve these capabilities. One such technique is software rejuvenation, which counteracts software aging. Software aging may lead to performance degradation, crash/hang failure, or both. In this paper, we address this technique for an application with one version, and then extend the model to multiple versions. The numerical results show that using more software versions can greatly reduce expected downtime and improve the availability of the application.

Index Terms— Software rejuvenation, Availability, reliability

I. INTRODUCTION

With the increase in the complexity of computer systems, the loss caused by software inefficiency is an increasingly widespread problem. One way to reduce this loss is to improve system reliability. At present, software fault tolerance is the most effective approach to the problem [1]. Traditional fault-tolerant techniques are passive and work in a reactive way: they carry out a recovery operation only after the system has failed. The software rejuvenation technique, in contrast, is an active technique, which prevents or slows down system failures before their occurrence [1].

When software applications execute continuously for long periods of time (scientific and analytical applications run for days or weeks, servers in client-server systems are expected to run forever), the processes corresponding to the software in execution age, or slowly degrade, with respect to effective use of their system resources. The causes of process aging are memory leaks, unreleased file locks, file descriptor leaks, data corruption in the operating environment of system resources, etc. Process aging will affect the performance of the application and eventually cause the application to fail [2]. The software rejuvenation technique terminates the program when its performance declines to a certain degree, then restarts it to clean the internal state, so that the software performance is restored.

Huang et al. (1995) used a continuous-time Markov process to build a two-phase software rejuvenation model that includes a healthy state, an aging-probable state, a system failure state and a rejuvenation state [8]. Using a Markov decision process, Pfening et al. (1996) proposed a software rejuvenation framework and applied it to an AT&T communication system. Garg et al. (1998) constructed a rejuvenation model of a transaction processing system based on queuing theory [7]. Dohi et al. (2000) set up a software rejuvenation model of a client/server system and adopted non-parametric statistical analysis to estimate the optimal software rejuvenation interval [8] [9]. For cluster systems, Garg et al. (1998) and Wei et al. (2004) presented a stochastic Petri net approach to analyze software rejuvenation. Vaidyanathan et al. (2001) used a stochastic reward net to model and analyze a cluster system that employed software rejuvenation [10]. Bao et al. (2005) and Vaidyanathan and Trivedi (2005) took the system workload into account in building a model to estimate resource exhaustion times [5].

We extend the software rejuvenation model to multiple software versions. To assess the improvement in application reliability, the system availability formula is derived. Finally, numerical results are given to validate the proposed model.

II. SOFTWARE REJUVENATION 

Software rejuvenation is a proactive fault management technique that aims to clean up the internal state of the system in order to prevent the occurrence of more severe crash failures in the future. It involves occasionally terminating an application or a system, cleaning its internal state and restarting it [3]. The application is unavailable during rejuvenation. Although rejuvenation may sometimes increase the downtime of an application, these are usually planned and scheduled downtimes. If care is taken to schedule rejuvenation during the idlest times of an application, then the cost due to those downtimes is expected to be low. Downtime costs are the costs incurred due to the unavailability of the service during downtime of an application [2].

Let $p_{ij}(t)$ be the transition probability function of a continuous-time Markov process and $q_{ij}$ be the transition rate. The Kolmogorov forward equation is defined as follows:

$$\frac{dP_{ij}(t)}{dt} = \sum_{k=0}^{N} P_{ik}(t)\, q_{kj}, \quad i, j = 0, 1, 2, \ldots \qquad (1)$$

Letting P(t) be the matrix of transition probability functions $p_{ij}(t)$ (i, j = 0, 1, 2, …) and Q be the matrix of transition rates $q_{ij}$ (i, j = 0, 1, 2, …), Eq. 1 can be expressed in matrix form as follows:

$$P'(t) = P(t)\, Q \qquad (2)$$

A. Software Rejuvenation Model of One-Node Application

First, we study the software rejuvenation model for an application with one software version, based on a Markov process, as shown in Fig. 1. The system has three states: the working state 0 (denoted as S0), the failure state 1 (denoted as SF) and the rejuvenation state 2 (denoted as SR). In the beginning, the application stays in the working state 0. As system performance degrades over time, a failure may occur. If a system failure occurs before software rejuvenation is triggered, the application changes from the working state 0 to the failure state 1 and the system recovery operation is started immediately. Otherwise, the application changes from the working state 0 to the rejuvenation state 2 and software rejuvenation is carried out. After completing the system repair or rejuvenation, the application becomes as good as new and returns to the working state 0. We define the time interval from one start of the working state to the next as one cycle.

According to the model described above, at any time t the application can be in any one of three states: up and available for service (the working state 0), recovering from a failure (the failure state 1), or undergoing software rejuvenation (the rejuvenation state 2). To formally describe the software rejuvenation model of the single-version application, a continuous-time Markov process denoted as Z = (Z_t; t ≥ 0) is used, where Z_t represents the state of the application at time t. The transition probability function of Z is expressed as follows [6]:

$$P_{ij}(t) = P(Z_t = j \mid Z_0 = i) \quad (\forall i, j \in \Omega,\ t \geq 0) \qquad (3)$$

where Ω = {0, 1, 2} is the state space.

For the software rejuvenation model in Fig. 1, λ1, μ1, r1, and R1 represent the failure rate from the working state to the failure state, the transition rate to trigger software rejuvenation, the rejuvenation rate from the rejuvenation state to the working state, and the recovery rate from the failure state to the working state, respectively. Let Q be the transition rate matrix. According to the state transition relationship of the single-version application, the transition rate matrix for the continuous-time Markov process Z can be easily derived as:

$$Q = \begin{pmatrix} -(\mu_1+\lambda_1) & \lambda_1 & \mu_1 \\ R_1 & -R_1 & 0 \\ r_1 & 0 & -r_1 \end{pmatrix} \qquad (4)$$
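For a chain with the rate matrix of Eq. 4, the long-run state probabilities satisfy P·Q = 0 together with P0 + P1 + P2 = 1. The second and third columns of Q give P1 = (λ1/R1)·P0 and P2 = (μ1/r1)·P0, so P0 = 1/(1 + λ1/R1 + μ1/r1). The sketch below evaluates this closed form and checks it against Q; the numeric values are the ones the paper uses in its numerical experiment, while the function names are illustrative.

```python
def single_version_rate_matrix(lam, mu, r, R):
    """Rate matrix of Eq. 4; states: 0 working, 1 failure, 2 rejuvenation."""
    return [
        [-(mu + lam), lam, mu],
        [R, -R, 0.0],
        [r, 0.0, -r],
    ]


def single_version_steady_state(lam, mu, r, R):
    """Closed-form stationary probabilities (P0, P1, P2)."""
    p0 = 1.0 / (1.0 + lam / R + mu / r)
    p1 = lam / R * p0   # failure state
    p2 = mu / r * p0    # rejuvenation state
    return [p0, p1, p2]


def balance_residual(p, q):
    """Largest |(p Q)_j|; it vanishes for a stationary vector."""
    n = len(p)
    return max(abs(sum(p[i] * q[i][j] for i in range(n))) for j in range(n))


# Failure rate 0.005, trigger rate 0.002, rejuvenation rate 1, recovery rate 0.1.
probabilities = single_version_steady_state(lam=0.005, mu=0.002, r=1.0, R=0.1)
rate_matrix = single_version_rate_matrix(lam=0.005, mu=0.002, r=1.0, R=0.1)
availability = probabilities[0]
print(probabilities, balance_residual(probabilities, rate_matrix))
```

With these values the application is in the working state about 95% of the time, and most of the remaining time is spent recovering from failures rather than rejuvenating.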

Let P(t) be the matrix of transition probability functions $p_{ij}(t)$ (∀i, j ∈ Ω). According to the Kolmogorov forward equation (Eq. 1), the transition probability matrix P(t) satisfies:

$$P'(t) = P(t)\, Q, \qquad P(0) = I \qquad (5)$$

where I is the identity matrix.

Let $P_j$, j ∈ Ω, be the steady-state probability of the single-version application being in state j. According to the limit distribution theorem, $P_j$ is given by:

$$P_j = \lim_{t \to \infty} P_{ij}(t) \quad (\forall i, j \in \Omega) \qquad (6)$$

By substituting Eq. 4 and Eq. 6 into Eq. 5, the following equations are derived:

$$\begin{cases} -(\mu_1+\lambda_1) P_0 + R_1 P_1 + r_1 P_2 = 0 \\ -R_1 P_1 + \lambda_1 P_0 = 0 \\ -r_1 P_2 + \mu_1 P_0 = 0 \\ \sum_{i=0}^{2} P_i = 1 \end{cases} \qquad (7)$$

where $P_i$, i = 0, 1, 2, can be obtained by solving Eq. 7.

The application is available for service requests in the working state 0 and unavailable in the failure state 1 and the rejuvenation state 2. Therefore, the system availability of the single-version application is given by:

$$P_{A1} = P_0 \qquad (8)$$

Fig. 1. Software rejuvenation model of single application. [State diagram: S0(0) to SF(1) at rate λ1, SF(1) to S0(0) at rate R1, S0(0) to SR(2) at rate μ1, SR(2) to S0(0) at rate r1.]

B. Software Rejuvenation Model of Two-Node Application

We extend the software rejuvenation model of the single application to a two-dimensional state space and derive the software rejuvenation model of the two-node application shown in Fig. 2. The states of the application are denoted by a set S of 2-tuples, formally defined as S = {(i, j) | i, j ∈ {H, F, R}}, where i is the state of the first version of the application and j is the state of the second version. For the first version, λ1, μ1, r1, and R1 represent the failure rate from the working state to the failure state, the transition rate to trigger software rejuvenation, the rejuvenation rate from the rejuvenation state to the working state, and the recovery rate from the failure state to the working state, respectively. Correspondingly, for the second version, λ2, μ2, r2, and R2 denote the failure rate, the transition rate to trigger software rejuvenation, the rejuvenation rate and the recovery rate, respectively.

To keep the model simple, we make the following assumptions:

Assumption 1: Software rejuvenation is not allowed to be carried out for both versions concurrently.

Assumption 2: At any time t, only one version can be in the rejuvenation state.

Assumption 3: If one version is in the failure state, the other version cannot transfer to the rejuvenation state.

Assumption 4: The rejuvenation rate from the rejuvenation state to the working state is faster than the recovery rate from the failure state to the working state.

It is also assumed that Z_t is the state of the application at time t and that Ω′ = {0, 1, 2, …, 7} is the state space. Similarly, we use a continuous-time Markov process, denoted as Z = (Z_t; t ≥ 0), to describe the software rejuvenation model of the two-node application. The transition probability function of Z is expressed as Eq. 10 and $P_j$, j ∈ Ω′, is given by [4]:

$$P_j = \lim_{t \to \infty} P_{ij}(t) \quad (\forall i, j \in \Omega') \qquad (9)$$

Fig. 2. Software rejuvenation model of two applications. [State diagram with states 0 (H,H), 1 (F,H), 2 (H,F), 3 (F,F), 4 (R,H), 5 (H,R), 6 (R,F), 7 (F,R) and transition rates λ1, λ2, μ1, μ2, r1, r2, R1, R2.]

The transition rate matrix Q of this model is:

$$Q = \begin{pmatrix}
-(\lambda_1+\lambda_2+\mu_1+\mu_2) & \lambda_1 & \lambda_2 & 0 & \mu_1 & \mu_2 & 0 & 0 \\
R_1 & -(R_1+\lambda_2) & 0 & \lambda_2 & 0 & 0 & 0 & 0 \\
R_2 & 0 & -(R_2+\lambda_1) & \lambda_1 & 0 & 0 & 0 & 0 \\
0 & R_2 & R_1 & -(R_1+R_2) & 0 & 0 & 0 & 0 \\
r_1 & 0 & 0 & 0 & -(r_1+\lambda_2) & 0 & \lambda_2 & 0 \\
r_2 & 0 & 0 & 0 & 0 & -(r_2+\lambda_1) & 0 & \lambda_1 \\
0 & 0 & r_1 & 0 & R_2 & 0 & -(r_1+R_2) & 0 \\
0 & r_2 & 0 & 0 & 0 & R_1 & 0 & -(r_2+R_1)
\end{pmatrix} \qquad (10)$$

Correspondingly, the transition probability matrix P(t) also satisfies the condition in Eq. 5. By substituting Eq. 9 and Eq. 10 into Eq. 5, Eq. 11 can be derived [5]:

$$\begin{cases}
-(\mu_1+\mu_2+\lambda_1+\lambda_2) P_0 + R_1 P_1 + R_2 P_2 + r_1 P_4 + r_2 P_5 = 0 \\
-(R_1+\lambda_2) P_1 + \lambda_1 P_0 + R_2 P_3 + r_2 P_7 = 0 \\
-(R_2+\lambda_1) P_2 + \lambda_2 P_0 + R_1 P_3 + r_1 P_6 = 0 \\
-(R_1+R_2) P_3 + \lambda_2 P_1 + \lambda_1 P_2 = 0 \\
-(r_1+\lambda_2) P_4 + \mu_1 P_0 + R_2 P_6 = 0 \\
-(r_2+\lambda_1) P_5 + \mu_2 P_0 + R_1 P_7 = 0 \\
-(r_1+R_2) P_6 + \lambda_2 P_4 = 0 \\
-(r_2+R_1) P_7 + \lambda_1 P_5 = 0 \\
\sum_{i=0}^{7} P_i = 1
\end{cases} \qquad (11)$$
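The balance equations of Eq. 11 form a small linear system that can be solved numerically. The following is a minimal sketch in pure Python, not the authors' implementation: it builds the matrix of Eq. 10, replaces one balance equation by the normalization condition, and solves by Gaussian elimination. The rates are the equal per-version values used in the paper's numerical experiment; the function names are illustrative.

```python
def two_version_rate_matrix(lam1, lam2, mu1, mu2, r1, r2, R1, R2):
    """Rate matrix of Eq. 10; state order 0 (H,H) ... 7 (F,R)."""
    return [
        [-(lam1 + lam2 + mu1 + mu2), lam1, lam2, 0.0, mu1, mu2, 0.0, 0.0],
        [R1, -(R1 + lam2), 0.0, lam2, 0.0, 0.0, 0.0, 0.0],
        [R2, 0.0, -(R2 + lam1), lam1, 0.0, 0.0, 0.0, 0.0],
        [0.0, R2, R1, -(R1 + R2), 0.0, 0.0, 0.0, 0.0],
        [r1, 0.0, 0.0, 0.0, -(r1 + lam2), 0.0, lam2, 0.0],
        [r2, 0.0, 0.0, 0.0, 0.0, -(r2 + lam1), 0.0, lam1],
        [0.0, 0.0, r1, 0.0, R2, 0.0, -(r1 + R2), 0.0],
        [0.0, r2, 0.0, 0.0, 0.0, R1, 0.0, -(r2 + R1)],
    ]


def stationary_distribution(q):
    """Solve p Q = 0 with sum(p) = 1 by Gaussian elimination."""
    n = len(q)
    # Row j of `a` is column j of Q, i.e. the balance equation for state j.
    a = [[q[i][j] for i in range(n)] for j in range(n)]
    b = [0.0] * n
    a[n - 1] = [1.0] * n      # replace the last balance equation by normalization
    b[n - 1] = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    p = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(a[row][k] * p[k] for k in range(row + 1, n))
        p[row] = (b[row] - tail) / a[row][row]
    return p


q_two = two_version_rate_matrix(0.005, 0.005, 0.002, 0.002, 1.0, 1.0, 0.1, 0.1)
p_two = stationary_distribution(q_two)
# Unavailable states: 3 (F,F), 6 (R,F) and 7 (F,R).
unavailability_two = p_two[3] + p_two[6] + p_two[7]
print(unavailability_two)
```

With these rates the two-version unavailability comes out at roughly 0.0023, compared with roughly 0.049 for a single version with the same rates.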

By solving the above equations, we can obtain the values of $P_i$, i = 0, 1, 2, …, 7. According to the rejuvenation model in Fig. 2, the application is unavailable in the states (F, F), (R, F), and (F, R). Therefore, the availability of the two-node application is given by:

$$P_{A2} = P_0 + P_1 + P_2 + P_4 + P_5 = 1 - (P_3 + P_6 + P_7) \qquad (12)$$

Fig. 3. Software rejuvenation model of three applications. [State diagram with states 0 (H,H,H), 1 (H,H,F), 2 (H,F,H), 3 (F,H,H), 4 (H,F,F), 5 (F,H,F), 6 (F,F,H), 7 (F,F,F), 8 (H,H,R), 9 (H,R,H), 10 (R,H,H), 11 (H,F,R), 12 (H,R,F), 13 (F,H,R), 14 (F,R,H), 15 (R,F,H), 16 (R,H,F), 17 (R,F,F), 18 (F,R,F), 19 (F,F,R) and transition rates λ1, λ2, λ3, μ1, μ2, μ3, r1, r2, r3, R1, R2, R3.]

C. Software Rejuvenation Model of Three-Node Application

We extend this work to a three-dimensional state space and obtain a lower unavailability with the software rejuvenation model of the three-node application shown in Fig. 3. Q is the transition rate matrix, given in Eq. 14.

By solving the resulting equations, we obtain the values of $P_i$, i = 0, 1, 2, …, 19. According to the rejuvenation model in Fig. 3, the application is unavailable in the states (F,F,F), (R,F,F), (F,R,F) and (F,F,R). Therefore, the system availability of the three-node application is given by:

$$P_{A3} = 1 - (P_7 + P_{17} + P_{18} + P_{19}) \qquad (13)$$

The transition rate matrix Q, with its diagonal entries denoted by the letters A to T, is:

A λ1 λ2 λ3 0 0 0 0 μ1 μ2 μ3 0 0 0 0 0 0 0 0 0 

R1 B 0 0 λ2 λ3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 

R2 0 C 0 λ1 0 λ3 0 0 0 0 0 0 0 0 0 0 0 0 0 

R3 0 0 D 0 λ1 λ2 0 0 0 0 0 0 0 0 0 0 0 0 0 

0 R2 R1 0 E 0 0 λ3 0 0 0 0 0 0 0 0 0 0 0 0 

0 R3 0 R1 0 F 0 λ2 0 0 0 0 0 0 0 0 0 0 0 0 

0 0 R3 R2 0 0 G λ1 0 0 0 0 0 0 0 0 0 0 0 0 

0 0 0 0 R3 R2 R1 H 0 0 0 0 0 0 0 0 0 0 0 0 

r1 0 0 0 0 0 0 0 I 0 0 λ2 λ3 0 0 0 0 0 0 0 

r2 0 0 0 0 0 0 0 0 J 0 0 0 λ3 λ1 0 0 0 0 0 

r3 0 0 0 0 0 0 0 0 0 K 0 0 0 0 λ1 λ2 0 0 0 

0 0 r1 0 0 0 0 0 0 0 0 L 0 0 0 0 0 λ3 0 0 

0 0 0 r1 0 0 0 0 0 0 0 0 M 0 0 0 0 λ2 0 0 

0 0 0 r2 0 0 0 0 0 0 0 0 0 N 0 0 0 0 λ1 0 

0 0 0 r2 0 0 0 0 0 0 0 0 0 0 O 0 0 0 λ3 0 

0 0 r3 0 0 0 0 0 0 0 0 0 0 0 0 P 0 0 0 λ2 

0 r3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Q 0 0 λ1 

0 0 0 0 0 0 r1 0 0 0 0 0 0 0 0 0 0 R 0 0 

0 0 0 0 0 r2 0 0 0 0 0 0 0 0 0 0 0 0 S 0 

0 0 0 0 r3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T 

(14) 

III. NUMERICAL RESULTS AND ANALYSIS 

To obtain a reliability measure for the application, we perform numerical experiments using system unavailability as the evaluation indicator.

The unavailability of the single application $P_{U1}$, the two-node application $P_{U2}$, and the three-node application $P_{U3}$ can be evaluated as follows:

$$P_{U1} = 1 - P_{A1} = P_1 + P_2$$

$$P_{U2} = 1 - P_{A2} = P_3 + P_6 + P_7$$

$$P_{U3} = 1 - P_{A3} = P_7 + P_{17} + P_{18} + P_{19}$$

TABLE I

PARAMETER VALUES USED IN THE EXPERIMENT

r1=r2=…=rn   R1=R2=…=Rn   λ1=λ2=…=λn   μ1=μ2=…=μn
1            0.1          0.005        0.002

The default parameter values of the software rejuvenation model are given in Table I: the rejuvenation rate is 1, the recovery rate is 0.1, the failure rate is 0.005 and the transition rate to trigger software rejuvenation is 0.002. All parameter values are selected from experimental experience for demonstration purposes. To simplify the numerical experiment, we assume that the failure rate, recovery rate and rejuvenation rate are equal across all versions.

Figure 4 shows the system unavailability versus the number of versions. The number of versions strongly influences system reliability: as the number of versions increases, the system unavailability decreases rapidly and approaches a steady value.

IV. CONCLUSION 

In this paper, we presented the software rejuvenation structure and set up the software rejuvenation model in one-, two-, and three-dimensional state spaces for one application. In the model, the system availability formula is derived from a continuous-time Markov process. The numerical results show that the system unavailability decreases substantially as the number of versions increases.

Fig. 4. The system unavailability versus number of versions in the application with multiple versions.


REFERENCES 

[1] S.Yu, CH.Qi, H.Xin, “Positive software fault-tolerate technique based 

on time policy”, Journal of Communication and Computer, ISSN1548- 

7709, Volume 4, No.8 (Serial No.33), 2007. 

[2] Y. Huang, C. Kintala, N. Koletis, and N.D. Fulton, “Software 

Rejuvenation: Analysis, Module and Applications”, in Proc. 25th 

Symposium on Fault Tolerant Computer Systems, pp. 381-390, 1995. 

[3] T.Thein, J.Sou Park, Member, IEEE, “Availability Analysis of 

Application Servers Using Software Rejuvenation and Virtualization”, 

JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 24(2): 

339-346 Mar. 2009. 

[4] S. Pfening, S. Garg, A. Puliafito, M. Telek and K. S.Trivedi, “Optimal 

Rejuvenation for Tolerating Soft Failures”, Performance Evaluation,

27/28, pp. 491–506, 1996.

[5] Q. Yong, M. Haining, H. Di, Ch. Ying, “A Study on Software

Rejuvenation Model of Application Server Cluster in Two-Dimension 

State Space Using Markov Process”, Information Technology Journal 

7(1): 98-104, 2008. 

[6] T.Dohi, S.Trivedi, “Statistical Non-Parametric Algorithms to Estimate 

the Optimal Software Rejuvenation Schedule”, Dept. of Electrical and 

Computer Engineering, Duke University, Durham, NC 27708-0294, 

USA,2000. 

[7] W. Xiea, Y. Hong, K. Trivedi. “Analysis of a two-level software 

rejuvenation policy”, Reliability Engineering and System Safety 87 

(2005) 13–22. 

[8] Y. Huang, C. Kintala, N. Koletis, N.D. Fulton, “Software rejuvenation: 

analysis, module and application”, in: Proc. of 25 th Symposium on Fault 

Tolerant Computing, June 1995. 

[9] T. Dohi, K. Goseva-Popstojanova, K.S. Trivedi, Statistical nonparametric 

algorithms to estimate the optimal software rejuvenation 

schedule, in: Proceedings of the 2000 Pacific Rim International 

Symposium on Dependable Computing, December 2000. 

[10] K. Vaidyanathan, R.E. Harper, S.W. Hunter, K.S. Trivedi, “Analysis and 

implementation of software rejuvenation in cluster systems”, ACM 

SIGMETRICS Performance Evaluation Review, in: Proceedings of the 

2001 ACM SIGMETRICS International Conference on Measurement 

and Modeling of Computer Systems, vol. 29 (1), June 2001.

A New Attitude based on Real Time Operating System for NoC in Hotspot Traffic Model

Seyyed Amir Asghari, Hossein Pedram and Hassan Taheri

Abstract— The reduction in transistor size has increased the number of transistors on a chip to several billion. New techniques are therefore needed to manage this large quantity of transistors on a single chip. Network on Chip (NoC) is an implementation technique that addresses this problem. NoC management is nevertheless a challenging job, and communication management needs regular scheduling and configuration. One approach to NoC management is to use a Real Time Operating System (RTOS) for scheduling, task definition, dynamic assignment of task priorities, and message passing. In this paper, the MicroC/OS-II RTOS is used. The RTOS is ported to a Motorola ColdFire microprocessor located in the core of a node of a mesh-topology NoC. The traffic model considered in this paper is the hotspot model.

Index Terms—MicroC/OS-II, Motorola ColdFire Microprocessor, Network on Chip, Real Time Operating System.

The System on Chip (SoC) can include different components such as a processor, I/O units and various types of memory. Each of these components can use a different communication protocol [1].

Generally, processing elements are interconnected through ports. In a multiprocessor SoC (MPSoC) with numerous processing elements, these ports are expected to become bottlenecks in terms of latency, scalability and energy consumption.

Therefore, the idea of a NoC, which consists of routers connected by links, was introduced. Communication management in a NoC is, however, a challenging job, and an RTOS can take charge of it. Such an OS can be ported to the microprocessor of a NoC node. In this paper, the MicroC/OS-II RTOS is ported to the central node of a mesh-topology NoC under the hotspot traffic model, although the OS could be ported to all nodes. The idea of applying an OS to NoCs has also been used in previous work such as [9].

In this paper, MicroC/OS-II is used in a new way, as the RTOS of a NoC node. Unlike OSs such as Windows and Linux, this OS is not monolithic, and application programs do not affect the kernel. Its kernel also has few lines of code, which has a favorable impact on power consumption compared with conventional OSs. As NoCs are power constrained, this is considered an advantage [11].

Section II introduces the NoC structure and its components. Section III introduces the MicroC/OS-II RTOS and its advantageous features. Section IV explains different types of traffic models; the model considered in this paper is the hotspot traffic model. Section V introduces the Motorola ColdFire processors; the MCF5484 ColdFire processor is used in the implementation of the OS-based NoC. Section VI introduces the microprocessor programming and debugging tools. Section VII compares two approaches, one using an OS and one without, and presents the advantages of the OS-based NoC. Section VIII introduces the PrioRout routing algorithm. Section IX presents the implementation, and Section X concludes the paper.

II. NOC STRUCTURE 

A NoC is formed of routers and links. The IP blocks are connected to each other by means of network interfaces (NI), and the routers communicate with each other over links. A router determines the paths of packets in the network. It consists of buffers, a routing function unit, a selection function unit and a switch that transmits packets toward their destinations [2] [10].

A network interface adapts between the communication protocol of the IP block and the packet transmission protocol of the router. Each network interface can connect several IP blocks to a router.


Figure 1. A router with its components 

III. MICROC/OS-II RTOS 

MicroC/OS-II is an RTOS for embedded applications. Given a toolchain (a compiler, assembler and linker), the OS can be added to an application. MicroC/OS-II has a fully preemptive real-time kernel, which means that the OS always runs the highest-priority task that is ready to run. Many traditional kernels are also preemptive; the features that distinguish MicroC/OS-II are described in Section VII.

OSs with a monolithic kernel (such as Windows and Linux) consist of millions of lines of code, so analyzing them when a problem occurs is difficult, and such OSs are unlikely to be bug free.

The kernel of MicroC/OS-II has only 5000 lines of code, and it is considered to have reached a level that is close to bug free [3].

A. Multitasking feature 

MicroC/OS-II can manage up to 64 tasks. However, it reserves the four highest-priority and the four lowest-priority tasks for its own use, which leaves 56 tasks for the application.

B. Task creation

To use the task management capability of MicroC/OS-II, a task must first be created with one of these functions:

• OSTaskCreate()

• OSTaskCreateExt()

OSTaskCreateExt() is an extended version of OSTaskCreate() with some extra features. Multitasking requires at least one task to be created. A task cannot be created from within an Interrupt Service Routine (ISR). Figure 2 shows the prototype of the OSTaskCreate() function:

INT8U OSTaskCreate (void (*task)(void *pd),
                    void *pdata, OS_STK *ptos, INT8U prio)

Figure 2. OSTaskCreate function

As shown above, the function needs four arguments:

task: a pointer to the task code.

pdata: a pointer to an argument that is passed to the task when it first starts.

ptos: a pointer to the top of the stack assigned to the task.

prio: the priority of the task.
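The four arguments can be illustrated with a small host-side sketch. It does not use the real kernel: the types, the return codes (STUB_OK and so on), the task table and the NorthTask names are hypothetical stand-ins introduced here, and only the parameter list of OSTaskCreate() is taken from Figure 2. The sketch also assumes a stack that grows downward, so the last element of the stack array is passed as ptos.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-ins, NOT the real MicroC/OS-II definitions. */
typedef uint8_t  INT8U;
typedef uint32_t OS_STK;

#define STUB_MAX_TASKS    64u  /* the text states that up to 64 tasks are managed */
#define STUB_OK            0u  /* hypothetical return codes for this sketch */
#define STUB_PRIO_EXIST    1u
#define STUB_PRIO_INVALID  2u

#define TASK_STK_SIZE 512u

/* One slot per priority; a non-NULL entry means the priority is taken. */
static void (*stub_task_table[STUB_MAX_TASKS])(void *pd);

/* Stub with the same parameter list as the prototype in Figure 2. */
INT8U OSTaskCreate(void (*task)(void *pd), void *pdata, OS_STK *ptos, INT8U prio)
{
    (void)pdata;  /* a real kernel would hand this to the task at start-up */
    (void)ptos;   /* a real kernel would initialise the task stack here */
    if (task == NULL || prio >= STUB_MAX_TASKS) {
        return STUB_PRIO_INVALID;
    }
    if (stub_task_table[prio] != NULL) {
        return STUB_PRIO_EXIST;   /* each priority identifies one task */
    }
    stub_task_table[prio] = task;
    return STUB_OK;
}

static OS_STK NorthTaskStk[TASK_STK_SIZE];

/* Hypothetical task body; a real task would loop forever. */
static void NorthTask(void *pd)
{
    (void)pd;
}

/* Create the task for the north port at an application priority (11). */
INT8U create_north_task(void)
{
    return OSTaskCreate(NorthTask, NULL,
                        &NorthTaskStk[TASK_STK_SIZE - 1u], 11u);
}
```

Calling create_north_task() twice shows why priorities must be unique in this sketch: the second call is rejected because priority 11 is already occupied.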

IV. TRAFFIC MODELS 

The traffic model is one of the important parameters in evaluating the latency of interconnection networks. Traffic models are derived from the application programs that run on the machine, and different applications call for different models. A traffic model is defined by three parameters [4]:

• The time at which messages enter the network

• Message length

• Address distribution type

A. The uniform traffic model 

The uniform traffic model is the simplest traffic model and is used in most evaluations. In this model, each node sends messages to every other node in the network with equal probability. For example, in a 6 × 6 mesh topology, each node sends a message to each of the other 35 nodes with a probability of 1/35 ≈ 2.86%.

All source and destination nodes are selected with equal probability, and the selection of the source and destination node for each message is independent of other messages [4].
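A minimal sketch of uniform destination selection follows. The function names are hypothetical, nodes are assumed to be numbered 0 to num_nodes - 1, and the raw random number r is assumed to come from a generator whose range is much larger than the number of nodes, so that the modulo operation is close to uniform.

```c
#include <stddef.h>

/* Map a raw random number r to a destination chosen among the
 * (num_nodes - 1) nodes other than src. Requires num_nodes >= 2. */
size_t uniform_destination(size_t src, size_t num_nodes, unsigned long r)
{
    /* Pick an index among the other nodes, then skip over src. */
    size_t dest = (size_t)(r % (unsigned long)(num_nodes - 1));
    if (dest >= src) {
        dest++;
    }
    return dest;
}

/* Probability that one particular other node is the destination. */
double uniform_probability(size_t num_nodes)
{
    return 1.0 / (double)(num_nodes - 1);
}
```

For the 6 × 6 mesh of the example, uniform_probability(36) evaluates to 1/35, and uniform_destination() never returns the source node itself.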

B. Hotspot traffic model 

In the hotspot traffic model, more messages are sent to one particular node, the hot node, than to the other nodes. Usually a single node is considered the hot node. Because a share of the packets created in the network is sent to this node, the traffic around it is heavier than elsewhere.

Equalizing protocols and OS functions are examples of sources of this kind of traffic. The most strongly colored node in Figure 3 is the hot node, and the traffic congestion around it is clearly visible.

Figure 3. Hotspot traffic model

C. Permutation traffic model

The permutation traffic model matches the behavior of many parallel programs, such as FFT, matrix problems, and fault-tolerant routing algorithms.

In this model, the destination address is obtained by applying a permutation function to the source address, so each source address always has a destination address. Bit reversal, first and second matrix transpose, shuffle and butterfly are examples of permutation traffic models. Take the matrix transpose model as an example: if M and N are the dimension sizes of the 2-D network and (i, j) is the source node address, the destination address in the first matrix transpose is produced as follows:

\[ (i, j) \rightarrow (M \times N - 1 - j,\; M \times N - 1 - i) \tag{1} \]

The destination address in the second matrix transpose is produced as follows:

\[ (i, j) \rightarrow (j, i) \tag{2} \]
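Two of the permutation functions named above can be sketched as follows. The NodeAddr type and the function names are hypothetical. The second matrix transpose follows Eq. (2) directly and assumes a square mesh; the bit-reversal function uses the usual definition (the address bits are written in reverse order), which the text names but does not spell out.

```c
#include <stdint.h>

/* Hypothetical 2-D node address (i, j). */
typedef struct {
    uint32_t row;
    uint32_t col;
} NodeAddr;

/* Second matrix transpose, Eq. (2): (i, j) -> (j, i). */
NodeAddr second_transpose(NodeAddr src)
{
    NodeAddr dst = { src.col, src.row };
    return dst;
}

/* Bit reversal: reverse the lowest `bits` bits of a linear node address. */
uint32_t bit_reversal(uint32_t addr, unsigned bits)
{
    uint32_t out = 0;
    for (unsigned b = 0; b < bits; b++) {
        /* Shift the result left and append bit b of the source address. */
        out = (out << 1) | ((addr >> b) & 1u);
    }
    return out;
}
```

Both functions are permutations of the address space, so every source has one destination and no two sources share a destination.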

D. Local Traffic model 

The local traffic model resembles the behavior of application programs. In this model, each node sends a given share of the messages it creates to its neighbors. The number of neighbors depends on the distance that defines the neighborhood (called the neighbor radius). Radius one is shown in Figure 4, in which the black nodes are the neighbors of the node.

Figure 4. Local traffic model

In all of the traffic models explained above, some percentage of the messages is distributed according to the permutation, local or hotspot pattern, and the remaining messages are distributed in another way, usually uniformly.

V. COLDFIRE MICROPROCESSOR INTRODUCTION 

Motorola is one of the pioneers in producing 8-, 16- and 32-bit microprocessors and microcontrollers. The ColdFire microprocessor family is among the company's best-known and most successful products. These processors are based on the M68000 architecture and are suitable for use in real-time systems. The MCF5484 is used in this paper.

VI. BDM MODULE AS A DEBUGGING AND PROGRAMMING 

TOOL 

Figure 5 shows the interface of the debug module with the processor core and its other interfaces. The debug module is connected to the main bus of the microprocessor, so in some cases it can work in parallel with the ColdFire CPU core.

Figure 5. BDM interfaces

The capabilities of this module are divided into three groups:


A. Real time trace support

The module can dynamically determine the execution path, which is useful for debugging. ColdFire can place 8 bits of parallel data on the emulator; this data shows the microprocessor status and memory data.

B. Background Debug Mode (BDM)

This capability provides low-level debugging for ColdFire. In this mode, memory can be accessed without stopping the microprocessor, but changing register values requires halting the microprocessor.

C. Real time Debug Support

In this mode, a debug interrupt routine quickly saves the register values and variable data, and the system returns to normal operation without stopping the main program.

BDM is useful for the following reasons:

• BDM is always accessible for debugging and firmware upgrading

• It is used for programming external flash

• It provides full control of the microprocessor and therefore of the whole system.

Because of these features, the same tools that are used for programming the microprocessor can also be used for debugging it.

Most BDM commands do not halt the microprocessor and can run concurrently with a program. The conditions that do stop the microprocessor are as follows:

• Fault occurrence in the BDM system

• Breakpoints

• A halt command; the microprocessor can then be reactivated with 'Go' from BDM

VII. PERFORMANCE COMPARISON IN TWO CASES: WITH AND WITHOUT AN OS

In this section, we compare two approaches in a mesh-topology NoC. For the comparison we use a 3 × 3 mesh topology under the hotspot traffic model. In topologies with few nodes, using an OS is worthwhile if the communication between nodes is complex or the number of defined tasks is large. In the approach without an OS, a large amount of traffic passes through the central node of the hotspot traffic model. As a result, packets may become congested when input packets are assigned to outputs. Virtual channels are used to remove this problem, but they add considerable overhead: each added channel increases the power consumption of this approach.

If virtual channels are used, the router needs a MUX and a DEMUX for packet selection. Figure 6 shows how packets are placed in the virtual channels and how a packet is selected from a virtual channel.

Figure 6. Virtual channel

As can be seen, this approach requires components such as a MUX, a DEMUX and buffers. These components bring complications such as buffer management and packet selection from the buffers. In the OS-based approach, we define tasks based on the I/O ports (the local port is neglected). As a result there are four tasks: North, South, East and West. The OS assigns one task priority to each port. Based on the priorities assigned to these tasks, the routing of an input packet to an output port can be managed easily.

The OS is responsible for scheduling and task management. The priority assignment is programmed so that each time the requested output port is busy, one of the free ports is used, chosen according to PrioRout.

A. Deterministic: 

The execution time of all MicroC/OS-II functions and services is deterministic. This means that you can always know how much

time MicroC/OS-II will take to execute a function or a service. 

Furthermore, except for one service, execution time of all 

MicroC/OS-II services does not depend on the number of tasks 

running in your application. 

B. Task stacks: 

Each task requires its own stack. However, MicroC/OS-II 

allows each task to have a different stack size. This allows you 

to reduce the amount of RAM needed in your application. 

With MicroC/OS-II's stack checking feature, you can determine exactly how much stack space each task actually requires.

C. Services: 

MicroC/OS-II provides a number of system services such as 

mailboxes, queues, semaphores, fixed-sized memory 

partitions, time related functions, etc. 

D. Interrupt Management: 

Interrupts can suspend the execution of a task and, if a higher 

priority task is awakened as a result of the interrupt, the 

highest priority task will run as soon as all nested interrupts 

complete. Interrupts can be nested up to 255 levels deep.


E. Critical section of code 

A critical section of code, or critical section for short, is code that must be atomic and must run as one indivisible block. Code placed in such a section is therefore uninterruptible. To ensure this, all interrupts are disabled before the critical section runs and are enabled again after it.
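The disable/enable pattern can be sketched on a host machine as below. MicroC/OS-II names its macros OS_ENTER_CRITICAL() and OS_EXIT_CRITICAL(); the definitions here are stand-ins that only toggle a flag, whereas a real port disables and restores the processor interrupts. The shared flag east_port_busy and the function name are hypothetical.

```c
#include <stdbool.h>

/* Host-side stand-ins: a real port disables and re-enables interrupts. */
static bool stub_interrupts_enabled = true;
#define OS_ENTER_CRITICAL() (stub_interrupts_enabled = false)
#define OS_EXIT_CRITICAL()  (stub_interrupts_enabled = true)

/* Hypothetical flag shared between the port tasks. */
static bool east_port_busy = false;

/* Claim the east port atomically; returns true if the caller got it. */
bool try_claim_east_port(void)
{
    bool claimed = false;

    OS_ENTER_CRITICAL();          /* nothing may interrupt test-and-set */
    if (!east_port_busy) {
        east_port_busy = true;
        claimed = true;
    }
    OS_EXIT_CRITICAL();           /* interrupts are enabled again */

    return claimed;
}
```

Keeping the protected region this short matters on a real target, because interrupt latency grows with the time spent inside the critical section.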

VIII. PRIOROUT ROUTING ALGORITHM 

In this algorithm, an input packet chooses its output port based on its input port and its destination. Figure 7 shows a 3 × 3 mesh topology of a NoC, in which the OS has been ported to the router marked with a distinct color.

Figure 7. Mesh topology based NoC

Figure 8. The Central router ports

The number of router ports depends on the location of the router. For example, the router situated in the north east has three ports: Eastern port, Southern port and Local port. The router to which the OS has been ported is located at the center of the mesh topology and has five ports: Southern port, Northern port, Eastern port, Western port and Local port. Figure 8 shows the ports of the central router.

In PrioRout routing, if the input port is the northern one, the requested output port is the eastern one and the eastern port is free, the output port is the eastern port. If the eastern port is busy, the output port is the southern port; if the southern port is also busy, which is the worst case, the output port is the western port. So the task priorities in this example are:

TP_{North to East} = 3

TP_{North to South} = 2

TP_{North to West} = 1

As a result, there is no need for storing or buffering. In the same manner, for every packet whose destination is a neighbor port, the highest task priority belongs to that port. The next priority belongs to the frontal port and the lowest priority to the remaining port. If the input port is the eastern one, the requested output port is the western one and the western port is free, the output port is the western port. If the western port is busy, the output port is either the northern port or the southern port; in this case we choose the free port in clockwise order. So we have:

TP_{East to West} = 3

TP_{East to North} = 2

TP_{East to South} = 1

Table I shows the packet routing when the OS is used.

TABLE I. PACKET ROUTING BASED ON PRIOROUT ROUTING

Task    Routing          Best Output Case   Mean Output Case   Worst Output Case
North   North to South   South              East               West
        North to East    East               South              West
        North to West    West               South              East
South   South to North   North              West               East
        South to East    East               North              West
        South to West    West               North              East
East    East to West     West               South              North
        East to North    North              West               South
        East to South    South              West               North
West    West to East     East               North              South
        West to North    North              East               South
        West to South    South              East               North
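The selection rule behind Table I can be sketched as a lookup over the three candidate outputs of a routing, tried in priority order. This is an illustration under assumptions, not the authors' implementation: the Port and RouteChoice types, the function name and the busy array are hypothetical, and only the "North to East" row is encoded.

```c
#include <stdbool.h>

/* Hypothetical port identifiers; the local port is left out, as in the text. */
typedef enum {
    PORT_NORTH = 0,
    PORT_SOUTH,
    PORT_EAST,
    PORT_WEST,
    PORT_COUNT,
    PORT_NONE = -1
} Port;

/* One row of Table I: output ports in decreasing priority. */
typedef struct {
    Port best;
    Port mean;
    Port worst;
} RouteChoice;

/* Row "North to East" of Table I. */
const RouteChoice NORTH_TO_EAST = { PORT_EAST, PORT_SOUTH, PORT_WEST };

/* Return the highest-priority free output port, or PORT_NONE if all
 * three candidates are busy (the packet then has to wait). */
Port priorout_select(const RouteChoice *rc, const bool busy[PORT_COUNT])
{
    if (!busy[rc->best]) {
        return rc->best;
    }
    if (!busy[rc->mean]) {
        return rc->mean;
    }
    if (!busy[rc->worst]) {
        return rc->worst;
    }
    return PORT_NONE;
}
```

With all ports free the packet leaves through the east port; marking east busy moves it to south, and marking south busy as well moves it to west.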


IX. EXPERIMENTAL RESULTS 

A packet travelling from north to east was analyzed based on PrioRout routing and the MicroC/OS-II features. The OS dynamically assigns the updated priorities, which gives the priority table for packet routing shown in Table II.

Table II. Task (priority) table for routing

Priority   Destination
3          East
1          West
2          South

Consequently, the input packet follows this priority assignment once it reaches the task (North port). Four Boolean global variables are defined during task implementation and creation to show whether or not the ports are busy. If the highest-priority port is busy, the port with the next-highest priority (the south port in this example) is selected. The other tasks follow the same routing. To sum up, when an OS is used, packet flits passing through the router do not need to be stored in buffers, so the power consumption is lower than in the case without an OS. In the worst case (Figure 10-b), if all ports are busy, the packets can be stored in the input buffers and task stacks, which means that virtual channels are not needed. In the case without an OS (Figure 10-a), higher-priority packets may have to wait for lower-priority packets, whereas with an OS packets are sent according to their priorities, that is, according to their importance. Note, however, that a new output packet cannot interrupt until all flits of the previous packet have been sent. The OS-based approach also makes message passing possible. When two packets arrive at different ports, one packet at a time is selected based on the critical section feature. This selection can be made by priority assignment; for example, the forward path has higher priority than the neighbor path. Figure 10 compares the two approaches (with and without an OS). The horizontal axis shows the task priorities and the vertical axis the packet transmission time. As a result, the transmission time of a higher-priority packet is lower.

There is one task for each port; therefore, there are four tasks altogether. There are 12 paths, as shown in Table I. The creation of the four tasks in MicroC/OS-II is shown in Figure 9:

#define TASK_STK_SIZE   512   // Size of each task's stack (# of WORDs)
#define TASK_START_ID   0     // Application task IDs
#define TASK_1_ID       1


#define TASK_START_PRIO 4     // Application task priorities
#define TASK_1_PRIO     1


// Create the first task
OSTaskCreate(TestTask1, (void *)11, &TestTaskStk1[TASK_STK_SIZE], 11);
// Create the second task

// Create the third task

// Create the fourth task

Figure 9. Creation of four tasks in MicroC/OS-II

Figure 10. a) Worst case without using the OS; b) normal case using the OS. Both panels plot time (0 to 3) against priority (1 to 3).


Table III. Packet transmission status with northern source

Source port (sends 1000 packets)   Other port status                           Destination port and number of received packets
North                              East is free                                East: 1000
North                              East is busy and South is free              South: 1000
North                              East and South are busy and West is free    West: 1000

In our simulation, a phytech evaluation board to which MicroC/OS-II has been ported is used. The results are shown in Table III.

X. CONCLUSION 

In this paper, the use of a real-time OS in a NoC framework under the hotspot traffic model has been analyzed. Communication management in a NoC needs precise planning, scheduling, resource allocation and message passing, and these requirements must be met efficiently. An RTOS has been used for this purpose. Since the NoC is power constrained and the OS used has few lines of code, the choice of an RTOS has a significant effect on minimizing the power consumption. The implementation shows that RTOS features can be used in a NoC.

REFERENCES 

[1] Allan, D. Edenfeld, W. H. Joyner, A. B. Kahng, M. Rodgers, Yervant 

Zorian, "2001 Technology Roadmap for Semiconductors," Computer, 

vol. 35, no. 1, pp. 42-53, Jan., 2002 

[2] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., 

vol. 2. Oxford: Clarendon, 1892, pp.68–73. 

[3] http://www.micrium.com/ 

[4] W. Hsh, Performance issues in wire-limited hierarchical networks, 

PhD Thesis, University of Illinois-Urbana Champaign, 1992. 

[5] G.J. Pfister, V.A. Norton, “Hotspot contention and combining in 

multistage interconnection networks,” IEEE Transactions on 

Computers, Vol. 34, No. 10, 1985, pp. 943-948. 

[6] K. Hwang, Advanced computer architecture: parallelism, scalability 

and programmability, McGraw-Hill (Ed.), 1993. 

[7] J. Duato, S. Yalamanchili, and L. Ni, Interconnection Networks—An 

Engineering Approach. Morgan Kaufmann, 2002. 

[8] MCF548x Integrated Microprocessor Electrical Characteristics 

Applies to the MCF5480, MCF5481, MCF5482, MCF5483, 

MCF5484, and MCF5485, © Freescale Semiconductor, Inc., 2004. 

[9] V. Nollet, T. Marescaux, and D. Verkest, "Operating-system controlled network on chip," in Proc. 41st Design Automation Conference (DAC), 2004, pp. 256-259.

[10] S. A. Asghari, H. Pedram, P. Yaghini and M. Khademi, "Designing and Implementation of a Network on Chip Router based on Handshaking Communication Mechanism," World Applied Science Journal, 6 (1), pp. 88-93, 2009.

[11] N. Eisley and L. Peh, “High-Level Power Analysis for On-Chip Networks,” CASES’04, September 22–25, 2004, Washington, DC, USA.

Seyyed Amir Asghari was born in Lashte Nesha in the Guilan province of Iran on June 26, 1984. He received his BS degree in Computer Engineering from Amirkabir University of Technology in 2007 and his MSc degree from the same university. He is a research assistant in the Asynchronous Design Laboratory at the same school.

Hossein Pedram received his BS degree from Sharif University in 1977 and his MS degree from Ohio State University in 1980 in Electrical Engineering. He received his PhD degree from Washington State University in 1992 in Computer Engineering.

Dr Pedram has served as a faculty member in the Computer Engineering 

Department in Amirkabir University of Technology since 1992. He teaches 

courses in computer architecture and distributed systems. His research interests 

include innovative methods in computer architecture such as asynchronous 

circuits, management of computer networks, distributed systems, and robotics. 

Hassan Taheri received his BS degree from Amirkabir University of Technology in 1975 and his MS degree from the University of Manchester Institute of Science and Technology (UMIST) in 1978 in Electrical Engineering. He received his PhD degree from UMIST in 1988 in Electrical Engineering.

Dr Taheri has served as a faculty member in the Electrical Engineering 

Department in Amirkabir University of Technology. He teaches courses in Data 

Communication Network, Computer Communication, Teletraffic Engineering, 

Electronic Switching, Digital Communications, Telephone Switching, 

Probability and Statistics.


Nonlinear Filtering Algorithms for Chaotic 

Signals: A Comparative Study 

Valeri Ya. Kontorovich, Zinaida Lovtchikova, Jesús A. Meda-Campaña, and Keith Tinsley

Abstract— In this work, a comparative analysis of some approximate nonlinear filtering algorithms for chaos is presented, assuming that the output signals of chaotic attractors are affected by additive white noise. The estimation accuracy and computational complexity of the filtering algorithms are taken into account in the comparison.

It is shown that, for certain levels of the Signal-to-Noise Ratio (SNR), the nonlinear filtering algorithm for chaos can be interpreted as close to singular, which dramatically decreases the Mean-Square Error (MSE) of filtering.

Index Terms—Chaotic signals, Markov theory, nonlinear filtering


Theoretically, chaos is represented as an output signal of dissipative continuous dynamic systems (strange attractors) (see, for example, [4]):

\[ \dot{x} = f(x(t)), \quad x \in R^{n}, \quad x(t_0) = x_0, \tag{1} \]

where $f = [f_1(x), \ldots, f_n(x)]^{T}$ is a differentiable vector function.

According to the idea of Kolmogorov, the equations for the strange attractors (1) can be transformed into an equivalent stochastic form, a stochastic differential equation (SDE) [4], [11]:

\[ \dot{x} = f(x(t)) + \varepsilon \xi(t), \tag{2} \]

Manuscript received October 9, 2009. This work was supported through 

the grant “Intel-VK” from INTEL Corporation. 

V. Ya. Kontorovich is with the Communications Section of the Electrical 

Engineering Department of CINVESTAV-IPN, Av. IPN 2508 Col. San Pedro 

Zacatenco. C.P. 07360 México, D.F. Apartado postal 14-740, 07000 México, 

D.F. Phone: +52 55 57473764. Fax: +52 55 50613977 (Email: 

valeri@cinvestav.mx) 

Z. Lovtchikova is with the Engineering and Advanced Technology 

Interdisciplinary Professional Unit, UPIITA-IPN. Colonia Laguna de 

Ticomán. C.P. 07340. México, D.F. Phone +52 55 57296000x56848. (Email: 

lovtchikova@ipn.mx) 

J. A. Meda-Campaña is with the Mechanical Engineering Department of 

SEPI-ESIME Zacatenco, IPN, Av. IPN s/n. Edificio 5, piso 3. Unidad 

Profesional Adolfo López Mateos. Zacatenco. Col. Lindavista. C.P. 07738. 

México D.F. México. Phone +52 55 5729600x54737(Email: 

jmedac@ipn.mx). 

K. Tinsley is with INTEL Labs, INTEL Corporation, Hillsboro, Oregon 97124, USA. Phone: +503 712 1790. (Email: keith.r.tinsley@intel.com)

where ξ(t) is a vector of “weak” external white noise with the related positive definite matrix of “intensities” $\varepsilon = [\varepsilon_{ij}]_{n \times n}$. The assumption of a weak white noise component in (2) guarantees the existence of the stationary distribution $W_{st}(x)$ for all $\varepsilon_{ij} \to 0$ [1]. The latter was considered as an invariant physical measure for the statistical characterization of strange attractors [11], [12], [17].

Statistical description of chaotic systems and noise effects 

in chaotic trajectories are deeply analyzed in [1], but it is 

rather difficult to apply those results in engineering 

applications. In this regard, the authors proposed earlier the 

so-called “degenerate cumulant equations method” [14] for 

applied statistical analysis of the strange attractors. 

It sounds logical to suppose that if one can model some 

stochastic phenomena by means of dynamic chaos (SDE (2)), 

then its filtering could be carried out through the same 

approach [13]. 

Chaos modeling using SDE (2) gives an opportunity to 

provide the filtering of chaotic signals by means of the 

classical approach of nonlinear filtering for Markov processes, 

first proposed at the beginning of the 60’s by R. Stratonovich 

and H. Kushner [15], [20] and intensively developed in the 

last 40 years [2], [6], [7], [8], [9], [19]. 

It is worth mentioning here that letting the intensities in SDE (2) tend to zero has to be done with some caution, as this formally changes the characteristics of the Markov process generated by (2).

This problem will be considered in detail in further publications; here we would like to stress that the intensities of the process noise in (2) will be considered very small, close or equal to zero.

As follows from the above-mentioned references, the nonlinear filtering approach is, by its nature, mainly an approximate one, because the differential equations for the a posteriori Probability Density Functions (Stratonovich-Kushner equations) do not admit an analytical solution.

During more than 40 years of intensive developments, 

many approximate methods for non-linear filtering have been 

proposed. For the purpose of this paper, the most important of 

them will be presented in the next section. 

It is worth stressing here that the comparison of the 

accuracy of the approximate methods does not provide a 

sustainable certainty, mainly because their creation is rather 

heuristic. Moreover, for certain methods the attempts to

increase the precision by increasing the number of 

approximation terms, etc. can give exactly the opposite effect 

and reduce the accuracy [7], [19]. 

The main goal of this paper is to present a comparative 

study of some nonlinear algorithms bearing in mind possible 

applications to the filtering of chaotic signals provided by 

Lorenz, Chua and Rössler attractors in presence of additive 

white noises (channel noises). 

The rest of the work is organized as follows. In section II, 

Markov theory of nonlinear filtering is briefly recalled. 

Section III summarizes some of the approximate approaches 

for nonlinear filtering, while chaotic filtering is analyzed in 

section IV. Afterwards, numerical simulations are discussed in 

section V. Finally, in section VI, some conclusions are drawn. 

II. MARKOV THEORY OF NON-LINEAR FILTERING 

Let us consider the following filtering scenario, where the received signal is:

\[ y(t) = s(t, x(t)) + n_0(t), \tag{3} \]

where y(t) is the vector of the received signal with dimension m, s(⋅) is a vector function of the desired signal of the same dimension m, and $n_0(t)$ is a vector of additive white noise with the intensity matrix $N_0$ (m × m).

Here the signal s(⋅) depends on the “message” x(t), which is the subject of filtering and is modeled by means of the following SDE as an n-dimensional Markov diffusion process:

$$\dot{x} = g(t, x) + \xi(t). \qquad (4)$$

Formally, SDE (4) coincides with (2) and the vector 

function g (⋅) is similar to f (⋅) in (2); the matrix of intensities 

for ξ (⋅) in (4) corresponds to ε in (2). 
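The signal model (3)-(4) can be illustrated with a short simulation. The following is a minimal sketch, assuming an Euler-Maruyama discretization, the classical Lorenz parameter values (σ = 10, R = 28, B = 8/3, which are not taken from this paper), process noise of intensity ε acting on the first component only, and a scalar observation of the first component. The function names `lorenz_drift` and `simulate` are hypothetical.

```python
import numpy as np

def lorenz_drift(x, sigma=10.0, R=28.0, B=8.0 / 3.0):
    """Drift g(x) of the Lorenz system (illustrative parameter values)."""
    x1, x2, x3 = x
    return np.array([sigma * (x2 - x1), R * x1 - x2 - x1 * x3, x1 * x2 - B * x3])

def simulate(drift, x0, dt, n_steps, eps=0.0, n0_std=0.0, seed=0):
    """Euler-Maruyama integration of dx = g(x) dt + noise (on x1 only),
    together with a noisy observation y = x1 + n0 at every step."""
    rng = np.random.default_rng(seed)
    x = np.empty((n_steps + 1, len(x0)))
    x[0] = x0
    for k in range(n_steps):
        noise = np.zeros(len(x0))
        # Assumption: a white noise of intensity eps gives increments of variance eps*dt.
        noise[0] = np.sqrt(eps * dt) * rng.standard_normal()
        x[k + 1] = x[k] + drift(x[k]) * dt + noise
    y = x[:, 0] + n0_std * rng.standard_normal(n_steps + 1)
    return x, y

x, y = simulate(lorenz_drift, np.array([1.0, 1.0, 1.0]), dt=0.005, n_steps=1000)
```

With `eps=0.0` the trajectory is fully deterministic, which corresponds to the vanishing process noise considered throughout this paper; only the channel noise `n0_std` then makes the observation random.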

As is well known (see, for example, [18] and [20]), under this assumption the a-priori Probability Density Function (a-priori PDF) of x(t) obeys the so-called Fokker-Planck-Kolmogorov (FPK) equation:

$$\frac{\partial W_{PR}(x,t)}{\partial t} = -\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\left[g_i(t,x)\,W_{PR}(x,t)\right] + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2}{\partial x_i\,\partial x_j}\left[\varepsilon_{ij}\,W_{PR}(x,t)\right], \qquad (5)$$

where $W_{PR}(x,t_0) = W(x_0)$.

Equation (5) can be rewritten in another form [9], [21]:

$$\frac{\partial W_{PR}(x,t)}{\partial t} = -\operatorname{div}\,\pi(x,t), \qquad (6a)$$

or

$$\frac{\partial W_{PR}(x,t)}{\partial t} = L_{PR}\{W_{PR}(x,t)\}, \qquad (6b)$$

where π(x, t) is the probabilistic “flow” with the components:

$$\pi_i(x,t) = g_i(x,t)\,W_{PR}(x,t) - \frac{1}{2}\sum_{j=1}^{n}\frac{\partial}{\partial x_j}\left[\varepsilon_{ij}\,W_{PR}(x,t)\right]. \qquad (7)$$

In (5)-(7), $\{g_i(x,t)\}_1^n$ are the drift coefficients and $\{\varepsilon_{ij}\}$ are the diffusion coefficients of the Markov process; note that in the following they are defined in the Stratonovich sense [18], [20]. $L_{PR}\{\cdot\}$ is the linear FPK operator.

Then, as was shown in [20], the integro-differential equation for the a-posteriori PDF $W_{PS}(x,t)$ is given in the following equivalent forms:

$$\frac{\partial W_{PS}(x,t)}{\partial t} = L_{PR}\{W_{PS}(x,t)\} + \frac{1}{2}\left[F(x,t) - \int_{-\infty}^{\infty} F(x,t)\,W_{PS}(x,t)\,dx\right]W_{PS}(x,t), \qquad (8a)$$

or

$$\frac{\partial W_{PS}(x,t)}{\partial t} = -\operatorname{div}\,\hat{\pi}(x,t) + \frac{1}{2}\left[F(x,t) - \langle F(x,t)\rangle\right]W_{PS}(x,t), \qquad (8b)$$

where $\langle F(x,t)\rangle = \int_{-\infty}^{\infty} F(x,t)\,W_{PS}(x,t)\,dx$, $\hat{\pi}(x,t)$ is the flow (7) in which $W_{PR}(x,t)$ is substituted by $W_{PS}(x,t)$, and:

$$F(x,t) = \left[y(t) - \tfrac{1}{2}s(x,t)\right]^{T} N_0^{-1}\left[y(t) - \tfrac{1}{2}s(x,t)\right]. \qquad (9)$$

Equations (8) together with (9) are called the Stratonovich-Kushner nonlinear equations (SKE) and have a rather attractive physical interpretation: the first summand in (8) describes the dynamics of the a-priori data on x(t), and the second summand describes the innovation of the a-priori data obtained from the analysis of the observations.

The optimum estimate $\hat{x}(t)$ of x(t), by any known optimization criterion, is the result of filtering x(t); it is obtained from the solution of (8), where the input signal is y(t) (see (3)).

When the intensity $N_0$ of the additive noise vector is large, the influence of the first summand in (8) prevails, equation (8) turns into the FPK equation (6) and the filtering accuracy diminishes drastically. On the contrary, when the signal-to-noise ratio increases, $W_{PS}(x,t)$ tends to a unimodal Gaussian PDF [7], [20]. Note that the SKE fully describes the “evolution” of $W_{PS}(x,t)$ in time but does not provide exact analytical solutions.

There are very few exceptions: the linear SDE (4), which yields the well-known Kalman filtering algorithm [2], [6]-[9], [15], [16], [18]-[21]; the Zakai approach [22]; etc. For this reason, nonlinear filtering algorithms are practically always approximate.
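The role of the second (innovation) summand can be illustrated in discrete time. The sketch below is not the continuous-time SKE itself: it is a single Bayes measurement update on a grid for a scalar observation y = x + n with Gaussian noise, which is an assumed simplification. The function name `bayes_update` and all numerical values are illustrative.

```python
import numpy as np

def bayes_update(grid, prior, y, noise_var):
    """One measurement update on a uniform grid:
    posterior is proportional to prior times the Gaussian likelihood of y = x + n."""
    likelihood = np.exp(-0.5 * (y - grid) ** 2 / noise_var)
    post = prior * likelihood
    dx = grid[1] - grid[0]
    return post / (post.sum() * dx)  # renormalize so that the PDF integrates to 1

grid = np.linspace(-10.0, 10.0, 4001)
dx = grid[1] - grid[0]
prior = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)  # a-priori PDF N(0, 1)

post = bayes_update(grid, prior, y=1.0, noise_var=1.0)
post_mean = (grid * post).sum() * dx
post_var = ((grid - post_mean) ** 2 * post).sum() * dx

# Very noisy observation: the posterior stays close to the a-priori PDF.
post_noisy = bayes_update(grid, prior, y=1.0, noise_var=1e6)
noisy_mean = (grid * post_noisy).sum() * dx
```

For a large noise variance the observation carries almost no information and the a-posteriori PDF remains practically equal to the a-priori one, which mirrors the statement above about large $N_0$.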

Over more than 40 years the bibliography on nonlinear filtering algorithms has become enormous. In the next section we will consider only some of them, taking into account the following considerations:

− The models of the desired signals applied for filtering are the equations of the Lorenz, Chua and Rössler strange attractors with n=3, i.e., of rather low dimension.

− The algorithms of interest have to be adequate for real-time applications, so they have to be of reduced computational complexity.

− The algorithms for nonlinear filtering have to be able to perform satisfactorily in scenarios with low signal-to-noise ratios (SNR), although the Gaussian assumption for $W_{PS}(x)$ is not always valid.

− $s(t, x(t)) \cong x(t)$. (10)

− All εij are equal to zero, except ε11 ≅ ε1 [1].

Advances in the cumulant statistical analysis of chaos [11], [12] suggest that, for low SNR, it might be reasonable to consider the application of higher-order cumulants (HOS) (see, for example, [9], [10], [19]).

III. APPROXIMATE APPROACHES FOR NON LINEAR FILTERING

It is always “better” to approximate the a-posteriori PDF $W_{PS}(x,t)$ than the nonlinearity in (4), (8) [2], [8], [19]. In this context, let us mention the following approximate approaches for $W_{PS}(x,t)$:

− Gaussian approximations: Extended Kalman Filter (EKF) [2], [6]-[9], [15], [16], [18]-[21]; Unscented Kalman Filter (UKF) [8]; Quadrature Kalman Filter (QKF) [2]; Gauss-Hermite Quadrature Filter (GHF) [6]; Iterated Kalman Filter (IKF), etc.

− Functional approximations for $W_{PS}(x,t)$ [9], [19];

− Integral or Global approximations for $W_{PS}(x,t)$ [7];

− HOS approximations for $W_{PS}(x,t)$ [10]; etc.

Due to the lack of space, it is hardly feasible to give a complete overview of all those methods; moreover, not all of them are adequate in view of the considerations introduced at the end of section II, but some comments will be made in section V.

Let us start with the Extended Kalman Filter (EKF). Considering $W_{PS}(x,t)$ as a three-dimensional Gaussian PDF $\hat{W}_G(x,t)$, from (8) it is possible to obtain the following per-component equations for the mean estimates $\{\hat{x}_i\}_1^3$ and for the estimates of the elements of the a-posteriori covariance matrix $\{\hat{R}_{ij}\}_{i,j=1}^{3}$:

$$\dot{\hat{x}}_i = \int_{-\infty}^{\infty}\left(\hat{\pi}^{T}(x,t)\,\operatorname{grad}\,x_i\right)dx + \frac{1}{2}\left[\int_{-\infty}^{\infty} x_i\,F(x,t)\,\hat{W}_G(x,t)\,dx - \hat{x}_i\int_{-\infty}^{\infty} F(x,t)\,\hat{W}_G(x,t)\,dx\right],$$

$$\dot{\hat{R}}_{ij} = \int_{-\infty}^{\infty}\left(\hat{\pi}^{T}(x,t)\,\operatorname{grad}\,\mathring{x}_i\mathring{x}_j\right)dx + \frac{1}{2}\left[\int_{-\infty}^{\infty} \mathring{x}_i\mathring{x}_j\,F(x,t)\,\hat{W}_G(x,t)\,dx - \hat{R}_{ij}\int_{-\infty}^{\infty} F(x,t)\,\hat{W}_G(x,t)\,dx\right], \qquad (11)$$

where $\mathring{x}_i = x_i - \hat{x}_i$, $\mathring{x}_j = x_j - \hat{x}_j$.

Equations (11) can be presented in matrix form [7], [15], [19], [20] as well, but for concrete applications the per-component representation (11) might be more suitable (see the following).

Practically, it is possible to assume that all $\hat{R}_{ij}(t)$ converge to their stationary values $R_{ij}$ when t→∞; in consequence, the second equation in (11) usually tends to a system of nonlinear algebraic equations, which can be solved numerically.

This assumption can significantly simplify the implementation of the corresponding EKF algorithms for real-time scenarios.

Functional approximation for $W_{PS}(x,t)$. It follows from [9], [19]:

$$W_{PS}(x,t) = \prod_{i=1}^{3} W_{PS}(x_i)\left[1 + \sum_{q=2}^{3}\sum_{j=1}^{q-1}\frac{R_{qj}}{R_{qq}R_{jj}}\,(x_q - \hat{x}_q)(x_j - \hat{x}_j)\right]. \qquad (12)$$

From (12) we see that the Functional Approximation for the PDF is sufficiently non-Gaussian (the marginal $W_{PS}(x_i)$ are arbitrary), but for the “joint” characterization of the vector $\hat{x}$ only the elements of the a-posteriori covariance matrix $\{\hat{R}_{ij}\}$ are considered.

It can be shown that the equations for $\{\hat{x}_i\}_1^n$ and $\{\hat{R}_{ij}\}$ are the same as in (11), the only difference being that the approximation (12) has to be substituted into (11) instead of $\hat{W}_G(x,t)$. The corresponding integrals can be solved analytically or by the Gauss-Hermite quadrature formula [2], [6] (see below).
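For orientation, the Gaussian (EKF) approximation can be sketched in its common discrete-time form. This is not the continuous-time per-component system (11); it is the textbook predict/update cycle, given here under the assumptions of a linear observation matrix `H` and user-supplied transition `f` and Jacobian `F_jac`. All names are hypothetical.

```python
import numpy as np

def ekf_step(x_hat, P, y, f, F_jac, H, Q, R):
    """One predict/update cycle of a discrete-time extended Kalman filter.
    f: state transition, F_jac: Jacobian of f, H: linear observation matrix,
    Q, R: process and observation noise covariances."""
    # Predict: propagate the mean through f and the covariance through the Jacobian.
    F = F_jac(x_hat)
    x_pred = f(x_hat)
    P_pred = F @ P @ F.T + Q
    # Update: correct the prediction with the innovation y - H x_pred.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x_hat)) - K @ H) @ P_pred
    return x_new, P_new

# Scalar linear example: the covariance converges to a stationary value,
# which can then be precomputed instead of being updated online.
a, q, r = 0.9, 0.1, 1.0
H = np.array([[1.0]]); Q = np.array([[q]]); R = np.array([[r]])
x_hat, P = np.array([0.0]), np.array([[1.0]])
for _ in range(200):
    x_hat, P = ekf_step(x_hat, P, np.array([0.0]),
                        lambda x: a * x, lambda x: np.array([[a]]), H, Q, R)
```

In the scalar linear example the covariance recursion does not depend on the data and settles to a constant, which is the discrete-time counterpart of replacing the second equation in (11) by algebraic equations for the stationary values.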

Integral or Global approximation for $W_{PS}(x,t)$. The reader has already realized that the previous two approximations are in some sense “local”, because they provide the estimate $\{\hat{x}_i\}$ as the maximum of $W_{PS}(x,t)$, together with $\{\hat{R}_{ij}\}$. When the SNR is considerably high this is quite enough, but when the SNR is low one has to look for another approach, which is called Integral approximation. This approach was proposed for the successful approximation of $W_{PS}(x,t)$ including the PDF’s “tails”, i.e. for the whole span of x.

Let us assume that $W_{PS}(x,t)$ can be represented in the form:

$$W_{PS}(x,t) = W_{PS}(x, \alpha(t)), \qquad (13)$$

where α is an unknown vector of approximation parameters. Then, applying the well-known Kullback measure as an approximation criterion, we obtain the following equation for the unknown vector α:

+ 

LPR −1 

{ h( 

x, 

t) 

} + V ( t) 

h( 

x, 

t) 

F ( x, 

t) 

α& = 

, (14) 

where:

$$h(x,t) = \frac{\partial \ln W_{PS}(x,\alpha(t))}{\partial \alpha},$$

and

∞ 

T 

⎡∂ 

lnWPS 

( x, 

α( 

t)) 

⎤ 

V ( t) 

= − ∫ ⎢ 

( , α( 

)) 

α ⎥ WPS 

x t dx 

⎣ ∂ 

−∞ 

⎦ 

2 

∂ WPS 

( x, 

α( 

t)) 

= − 

, Τ 

∂α∂α 

$L_{PR}^{+}\{\cdot\}$ is the operator adjoint to the FPK operator [18].

Now, as an integral approximation of $W_{PS}(x,\alpha(t))$, let us choose the so-called “Dynkin PDF”, with α(t) as a vector of sufficient statistics for $W_{PS}(\cdot)$:

$$W_{PS}(x,\alpha(t)) = C\exp\left\{\sum_{p=1}^{K}\alpha_p(t)\,\varphi_p(x) + \varphi_0(x)\right\}, \qquad (15)$$

where $\{\varphi_p(x)\}$ is a complete set of orthogonal multidimensional functions: Hermite, Laguerre, etc.

One can see that there is a high degree of similarity between (15) and the orthogonal series representation of $W_{PS}(x,\alpha(t))$ [18]: in both cases series of orthogonal functions are applied, but in (15) this is done for the monotonic transform $\ln\{W_{PS}(x,\alpha(t))\}$ and not for $W_{PS}(x,\alpha(t))$ itself. So, the coefficients {αp(t)} can always be represented through the cumulants of $W_{PS}(x)$. This opens the opportunity to search for equations for the cumulants (HOS) of $W_{PS}(x,t)$ directly (see, for example, [10], [19]), instead of searching for a solution of (15), which cannot be obtained analytically.

Since the last problem was extensively tackled in the mentioned references, the HOS approach will not be fully addressed in the following. However, one comment is in order: for n > 1, equations (14) and the equations for HOS are rather complex when real-time solutions are required; for n = 1 there might be no significant difference between both methods (see [10] for details). Then, in order to apply the last two approaches to the approximate nonlinear filtering of chaos, it is necessary to decrease the dimension of the SDE (4). In other words, for chaos one has to find an adequate equation that is statistically equivalent to the SDE (4). This can be achieved by synthesizing the equivalent SDE (see [18]).

IV. FILTERING ALGORITHMS FOR CHAOTIC SIGNALS 

For simplicity, let us consider the following special case of the one-dimensional scenario:

$$y(t) = x_1(t) + n_0(t), \qquad (16)$$

where $x_1(t)$ is the first (observable) component of any strange attractor (Lorenz, Chua, Rössler) and $n_0(t)$ is a scalar white noise [12], [17]. For the sake of completeness we present in Table I all the features which will be required here.

Let us consider the Lorenz, Chua and Rössler attractors. It can be seen from Table I that the marginal PDFs of the components of the Lorenz attractor are practically Gaussian, or their orthogonal representation has a Gaussian kernel PDF; for the Rössler attractor the orthogonal representation with the Gaussian kernel PDF is also valid for the “x” and “y” components of the attractor [12]. The opposite situation takes place for the Chua attractor (Table I): this attractor represents a clearly non-Gaussian case.

Next, when the SNR is low, the influence of the second summand in the SKE (8) on $W_{PS}(x,t)$ is low as well, and to a first approximation it is possible to assume that the marginal a-posteriori PDFs are close to their a-priori shapes.

Therefore, it is plausible that EKF algorithms will be rather adequate for both high and low SNR scenarios for the Lorenz and Rössler attractors, but not for the Chua attractor. Now, let us consider the Chua attractor with the Integral (Global) approximation for the a-posteriori PDF, assuming (Table I) that the first component has a symmetric $W_{PS}(x_1,t)$. Supposing that $\{\varphi_i(x_1)\}_1^K$ are Hermite polynomials and K = 4, from (15) it follows:

$$W_{PS}(x_1,t) = C\exp\{\alpha_1(t)H_1(x_1) + \alpha_2(t)H_2(x_1) + \alpha_3(t)H_3(x_1) + \alpha_4(t)H_4(x_1)\}. \qquad (17)$$

With the help of the definition of the Hermite polynomials one can get for (15):

$$W_{PS}(x_1,t) = \mathrm{Const}\,\exp[-\alpha_2(t) - 3\alpha_4(t)]\cdot\exp\{Ax_1 + Bx_1^2 + Cx_1^3 + Dx_1^4\}, \qquad (18)$$

where:

$$A = \alpha_1(t) - 3\alpha_3(t); \quad B = \alpha_2(t) - 6\alpha_4(t); \quad C = \alpha_3(t); \quad D = \alpha_4(t).$$

As $\{\alpha_i(t)\}_1^4$ are sufficient statistics for $W_{PS}(x_1,t)$, and invoking the symmetry and normalization conditions for the a-posteriori PDF, one can get: A = C = 0, $C_1(\alpha_2,\alpha_4) = C\cdot\exp\{-\alpha_2(t) - 3\alpha_4(t)\}$, and

$$W_{PS}(x_1) = C_1(\alpha_2,\alpha_4)\cdot\exp\{Bx_1^2 - Dx_1^4\}. \qquad (19)$$
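The passage from a Hermite expansion of the exponent to a plain polynomial in x₁ can be checked numerically. The sketch below assumes the probabilists' Hermite polynomials (He₁ = x, He₂ = x² − 1, He₃ = x³ − 3x, He₄ = x⁴ − 6x² + 3), which reproduce the coefficients A, B, C, D quoted in the text; the helper name `exponent_coeffs` and the numerical values of α are illustrative only.

```python
import numpy as np
from numpy.polynomial import hermite_e as He

def exponent_coeffs(alpha):
    """Power-series coefficients [c0, c1, c2, c3, c4] of
    alpha_1*He_1(x) + alpha_2*He_2(x) + alpha_3*He_3(x) + alpha_4*He_4(x)."""
    a1, a2, a3, a4 = alpha
    return He.herme2poly([0.0, a1, a2, a3, a4])

alpha = [0.3, -0.5, 0.2, -0.1]   # illustrative values
c = exponent_coeffs(alpha)
A, B, C, D = c[1], c[2], c[3], c[4]
```

Here `c[1:]` holds the coefficients of x, x², x³ and x⁴, i.e. A = α₁ − 3α₃, B = α₂ − 6α₄, C = α₃ and D = α₄. The constant term `c[0]` only rescales the normalization constant and is not used further in this sketch.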

No Name of the 

Strange 

attractor 

1 Lorenz, n = 3 

It is worth mentioning that for the case of low SNR (19) coincides with the a-priori PDF $W_{PR}(x_1,t)$ for the Chua attractor (Table I). Now, from (14) it follows:

$$f(x_1)\,\varphi_i'(x_1) + \frac{\varepsilon}{2}\,\varphi_i''(x_1) + h_i(x_1)\,F(t,x_1) = 0, \qquad (20)$$

where i = 2, 4.

The statistically equivalent SDE-1 with PDF (19) can be found in [18]:

$$f(x_1) = \varepsilon\,(Bx_1 - 2Dx_1^3).$$
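The link between a quartic-exponential PDF and a cubic drift can be verified numerically. The sketch assumes a scalar SDE with constant noise intensity ε, for which zero probability flow gives f(x) = (ε/2) d ln W / dx; the values of B, D and ε are illustrative and the helper name `drift_from_log_pdf` is hypothetical.

```python
import numpy as np

def drift_from_log_pdf(x, log_w, eps):
    """Drift f(x) = (eps/2) * d/dx ln W(x) of a scalar SDE with constant
    noise intensity eps whose stationary PDF is W (zero-flow condition)."""
    return 0.5 * eps * np.gradient(log_w, x)

B, D, eps = 3.5, 1.5, 1e-2            # illustrative values only
x = np.linspace(-2.0, 2.0, 4001)
log_w = B * x**2 - D * x**4           # ln W up to an additive constant
f_numeric = drift_from_log_pdf(x, log_w, eps)
f_closed = eps * (B * x - 2.0 * D * x**3)
max_err = np.max(np.abs(f_numeric[1:-1] - f_closed[1:-1]))
```

The numerical derivative agrees with the closed form ε(Bx − 2Dx³) up to the finite-difference error at interior grid points; the two end points are excluded because `np.gradient` is only first-order accurate there by default.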

Then, for i=2, one gets ( ε → 0) 

the following equation: 

where 


⎡x1 

⎤ 

⎢ ⎥ 

x = 

⎢ 

x2 

⎥ 

⎢ 

⎣x 

⎥ 3 ⎦ 

2 Chua, n = 3 

⎡x1 

⎤ 

⎢ ⎥ 

x = 

⎢ 

x2 

⎥ 

⎢ 

⎣x 

⎥ 3 ⎦ 

3 Rössler, n = 

3 

⎡x1 

⎤ 

⎢ ⎥ 

x = 

⎢ 

x2 

⎥ 

⎢ 

⎣x 

⎥ 3 ⎦ 

2 

2 

4 y ( t) 

2 

[ x1 

− 2D 

x1 

] = [ 1 −1] 

1 

2ε 

B x , (21) 

N 

2m 

1 

x 

g(x) 

⎧σ( 

x2 

− x1) 

⎪ 

⎨Rx1 

− x2 

− x3x 

⎪ 

⎩x1 

x2 

− Bx3 

σ, 

R, 

B ≥ 0 

1 

1 

− 

2 

0 

⎛ 1 ⎞ 

Γ⎜m 

+ ⎟D 

⎝ 2 ⎠ 

= 

πD 

1 

1 

−m− 

2 

( − δ) 

m 

( − δ)( 

2D)2 

m = 1,2,…; D (⋅) 

is function of parabolic cylinder, 

4 

1 

TABLE I 

Strange attractors and their statistical characteristics 

WPR(xi) Comments 

Ι. ε 

ε11 = ε →0 

ε 12 = ε 13 = 

ε 23 = ε 21 = 

ε 32 = ε 33 = 

0 

(22) 

B 

δ = . 

2D 

x1 ~ WG(⋅) 

x2 ~ WG(⋅) 

x3 ~ WG(⋅) 

2 4 

⎧β1( 

x2 

− x1) 

− αh( 

x1) 

ε 11 = ε →0 x1 

~ C exp( p1x1 

− q1x1 

) 

⎪ 

⎨β2 

( x1 

− x2 

) + β4 

x 

ε 12 = ε 13 = 

3 

x2 

~ WG 

( ⋅) 

⎪ 

ε 23 = ε 21 = 

⎩− 

β3 

x 

2 4 

2 

ε 32 = ε 33 = x3 

~ C exp( p3x3 

− q3x3 

) 

β − β ≥ 0, 

α < 0 0 p , p , q , q > 0 

⎧− 

x2 

− x3 

⎪ 

⎨x1 

+ ax2 

⎪ 

⎩b 

+ x3x1 

− x3c 

a, 

b, 

c, 

≥ 0 

ε 11 = ε →0 

ε 12 = ε 13 = 

ε 23 = ε 21 = 

ε 32 = ε 33 = 

0 

1 

with 

2 

1 

x1 

~ W ( 1) 

⎡ 

G x 1 + 

⎢⎣ 

x2 

~ W ( ) 

⎡ 

G x2 

1 + 

⎢⎣ 

x ~ W ( ⋅) 

3 

G 

R 

33 

2 

< 1 

Analogically for i=4 

ε 

y 

= 

N 

( ) 0 → ε 

, it yields: 

4 { 4 x 

2 

( B − 6ε 

−12B 

x 

6 

− 8ε 

x 

2 ) } + 6ε 

x = 

2 

( t) 

4 [ x 

2 

− 6 x + 3] 

+ 

1 6 [ x 

4 

− 6 x 

2 

+ 3 x ]. 

0 

N 

0 

2 

( t 

(23) 

Assuming that in (21)-(23) $y^2(t)$ tends to its stationary value $\overline{y^2(t)}$ as t → ∞, and substituting it into (21)-(23), one can get nonlinear algebraic equations for the stationary parameters α2, α4, which are obviously related to the a-posteriori variance (MSE) and the fourth moment (cumulant) of $W_{PS}(x_1)$.

Therefore, α2 can be used as a measure of the filtering accuracy, calculated taking into account the influence of the fourth a-posteriori moments (cumulants).

A similar approach based on higher-order statistics (HOS) will be presented below, where the equation for the estimate $\hat{x}_1$ of $x_1$ will obviously be the same as for the Integral Approximation.

It is worth mentioning here that for the case of low SNR so-called asymptotic algorithms can be developed as well.

For example, the asymptotical filtering algorithm for 

∆ 

( 1) 

γ3 

3! 

( 2) 

γ3 

3! 

( 1) 

H 3 ( x1) 

+ γ 4 H 4 ( x1) 

⎤ 

⎥⎦ 

H ( x ) + γ 

3 

2 

( 2) 

4 

H 4 ( x2 

) 

⎤ 

⎥⎦ 

Normalized dates, 

WG(⋅) – Gaussian 

PDF 


p1~ 3.5 

p3 ~ 3.5 

q1~ 1.5 

q3 ~ 2.5 


( 1) 

4 

( 2) 

3 

( 2) 

4 

~ 0. 

2 

~ 0. 

6 

x1( 

t) 

= x( 

t) 

of Chua attractor in discrete time can be 

represented in a way: 

γ 

γ 

γ 

γ 

( 1) 

3 

~ 0. 

2 

~ 0. 

6

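$$\hat{x}_{j+1} = \hat{x}_j + T_0\,f(\hat{x}_j) + \sigma_{\varepsilon j}^{2}\,\frac{d}{dx_{j+1}}\ln W_{PS}\left[(y_{j+1} - x_{j+1})\right]\Big|_{x_{j+1}=\hat{x}_j}, \qquad (24)$$

where $T_0$ is the sampling interval and $\sigma_{\varepsilon}^{2}$ is the a-posteriori filtering variance (MSE).

This a-posteriori variance can be calculated through α2 and α4 (see above), but might also be found from the following equation:

$$\hat{\sigma}_{\varepsilon j+1}^{2} = \hat{\sigma}_{\varepsilon j}^{2} + 2\hat{\sigma}_{\varepsilon j}^{2}\,f'(\hat{x}_j)\,T_0 + \hat{\sigma}_{\varepsilon j}^{4}\,\frac{\partial^{2}}{\partial x_{j+1}^{2}}\ln W_{PS}\left[(y_{j+1} - x_{j+1})\right]\Big|_{x_{j+1}=\hat{x}_j}. \qquad (25)$$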

If the SNR is low and n0(t) is a Gaussian additive white noise, then, applying a Taylor series expansion of $\ln W_{PS}(\cdot)$, in this asymptotic case one can get:

2 

σ ε 

xˆ 

j+ 

1 = xˆ 

j + T0 

f 

= 

σ n 

σ = ˆ σ + 2 ˆ σ 

j ( xˆ 

j ) + 2 [ ( y j x j ) ] 2 + 1 − + 1 x j+ 

1 xˆ 

j 

ˆ ε j + 1 

2 

ε j 

2 

ε j 2 

ε j 

which, in stationary conditions is: 

εT 

T 

ˆ = = 

j 

2 f 

ε 

2 

σ ε 

fˆ 

( xˆ 

j ) T0 

( xˆ 

), 

ε ˆ σ 

0 

0 

' 

2 

( xˆ 

) 2( 

B − 6 xˆ 

) 

j 

(26) 

(27) 

It can be seen from (27) that the accuracy of the filtering depends on the absolute value of $\hat{x}$, which is a specific feature of the asymptotic algorithm. This interesting feature follows from the dependence of $\hat{\sigma}_{\varepsilon}^{2}$ on the derivative of the nonlinear drift $f'(\hat{x}_j)$.

Now, let us take the low SNR scenario and apply the Functional Approximation (12) for $W_{PS}(x,t)$. When we assume the low SNR case, $W_{PS}(x,t)$ becomes:

$$W_{PS}(x,t) = C\exp(p_1x_1^2 - q_1x_1^4)\,\exp(p_3x_3^2 - q_3x_3^4)\,\frac{1}{\sqrt{2\pi\hat{R}_{22}}}\exp\left(-\frac{x_2^2}{2\hat{R}_{22}}\right)\left[1 + \sum_{i=1}^{3}\sum_{j=1}^{i-1}\frac{\hat{R}_{ij}\,(x_j - \hat{x}_j)(x_i - \hat{x}_i)}{R_{ii}R_{jj}}\right]. \qquad (28)$$

Substituting (28) into (11) and after rather simple, but 

cumbersome developments, one can get: 

x& ˆ1 = −2εxˆ 

1( 

p1 

+ q1) 

+ 2εq 

ˆ 1R11 

+ 

2 ( y ( t) 

− R ) 

2( 

y( 

t) 

− xˆ 

) Rˆ 

xˆ 

+ . (29) 

N 

1 11 1 − 

ˆ 

11 

0 N0 

If ε → 0 (see section I) and the SNR is low, then from (29) and (11) it follows:

$$\dot{\hat{x}}_1 = -2\varepsilon\,\hat{x}_1(p_1 + q_1) + \frac{2\,(y(t) - \hat{x}_1)}{N_0}\,\hat{R}_{11}, \qquad (30)$$

and one can immediately obtain:

2 

Rˆ 

11 

R ˆ& ε 

ˆ 

11 = − + + 4ε 

( p1 

+ q1) 

R11 

. (31) 

2 N 

0 

One can see that (30), (31) coincide totally with the EKF for the single component x1. Why did this happen? The answer is simple: it happened because of the practical linearity of the equations of the Chua attractor with the exception of h(x1), the symmetry of WPS(x, t) in all arguments and the symmetry of h(x1), which finally provide an “implicit linearization” of the SDE for x1 in the case of the Chua attractor.

It is also interesting that, for the analyzed scenario, the statistically equivalent SDE for x1(t) is practically linear with time constant 2D(p1+q1).

For t → ∞, $\hat{R}_{11}(t)$ tends to its stationary value $R_{11}$, which coincides with the a-posteriori variance (MSE) and can be simply calculated as:

$$R_{11} = \frac{-4\varepsilon(p_1+q_1) + \sqrt{16\varepsilon^{2}(p_1+q_1)^{2} + \dfrac{2\varepsilon}{N_0}}}{\dfrac{2}{N_0}} \ge 0; \qquad (32)$$

invoking ε → 0, $R_{11} \cong 0.71\cdot\sqrt{N_0\,\varepsilon}$.
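The small-ε behaviour of the stationary variance can be checked numerically. The sketch below assumes that the stationary value is the positive root of R²/N₀ + 4ε(p₁+q₁)R − ε/2 = 0; this quadratic is a reading of the stationary condition chosen because it reproduces the factor 0.71 quoted in the text, and it should be treated as an assumption. The values p₁ ≈ 3.5 and q₁ ≈ 1.5 are the ones listed for the Chua attractor in Table I; the function name `stationary_r11` is hypothetical.

```python
import math

def stationary_r11(eps, n0, p1, q1):
    """Positive root of R**2/n0 + 4*eps*(p1+q1)*R - eps/2 = 0
    (assumed form of the stationary variance equation)."""
    b = 4.0 * eps * (p1 + q1)
    return (-b + math.sqrt(b * b + 2.0 * eps / n0)) / (2.0 / n0)

eps, n0, p1, q1 = 1e-10, 1.0, 3.5, 1.5
r11 = stationary_r11(eps, n0, p1, q1)
ratio = r11 / math.sqrt(n0 * eps)          # approaches 1/sqrt(2) as eps -> 0
residual = r11**2 / n0 + 4.0 * eps * (p1 + q1) * r11 - eps / 2.0
```

For vanishing ε the ratio `r11 / sqrt(n0 * eps)` approaches 1/√2 ≈ 0.71, so the stationary variance itself goes to zero for any fixed noise intensity, which is the singular behaviour discussed in this section.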

If one assumes that N0 ≅ 1, then $R_{11}$ is almost zero and does not depend on the SNR, i.e., it is a singular case!

Some further developments with the help of HOS can be achieved for the case of n=1, assuming that the nonlinear statistically equivalent SDE for x1 is [18]:

ε 

2 

x & 1 = − ( p1x1 

− 2q1x1 

) + ξ ( t) 

ε . (33) 

2 

Then, it can be shown that with the help of the first four 

cumulants (HOS), the filtering equations are [10]: 

ε 

2 

& κ1 = − ( p1κ1 

− 2q1κ1 

) + F' 

( κ1 

) κ 2 − 

2 

2 1 

2 

− ε ( p 1 κ1 

− 2q1κ 

1 ) κ 2 + F' 

'( 

κ1) 

( κ 4 + 2κ 

2 ) = 0, 

(34) 

2 

where, as before, the upper line denotes the time averaging procedure, $\kappa_i$ denotes the i-th cumulant, $\kappa_3 = 0$, $\kappa_4 \cong -2\kappa_2^2$ (see [4]), $\kappa_2 = R_{11}^{H}$, $\hat{x}_1 = \kappa_1$, and $\kappa_3$ and $\kappa_4$ coincide with their a-priori values for the low SNR case.

From (34) it easily follows: 

2 

& κ = −2ε 

( p κ − 2q 

κ ) + F' 

( κ ) κ 

1 

1 1 

1 1 

2ε 

( pκ 

1 − 2qκ 

1 ) ⎡ 

κ 2 = 

⎢ 1+ 

F' 

'( 

κ1) 

⎢ 

⎣ 

2 

1 

2 

F' 

'( 

κ ) ε 

1 

2 2 

[ 2( 

pκ 

1 − 2qκ 

1 ) ] 

⎤ 

−1⎥. 

⎥ 

⎦ 

(35) 

After some simple, but rather cumbersome algebra, one can find that for the case of low SNR:

$$R_{11}^{H} \sim N_0\,\varepsilon.$$

So, $R_{11} \ge R_{11}^{H}$, which agrees with the Cramér-Rao bounds for nonlinear filtering, but also tends to zero!

Therefore, as follows from (30) and (35), the EKF shows its adequacy for application to the case of the Chua attractor, at least for the low SNR scenarios.

Taking into account that all analytical developments were done with a certain degree of approximation, it is mandatory to check them by numerical simulations. The corresponding results are presented in the next section.

Finally, let us make some general observations regarding singularity in the chaos filtering problem.

From (1) it definitely follows that its solution is

$$x(t) = \Phi(t_0, t, x_0), \qquad (36)$$

defined by $x_0$ and f(x).

Then, from the SKE (8): 

where 

W 

PS 

( t, 

x) 

= Cδ 

0 

t 

⎪⎧ 

⋅ exp⎨∫ 

F 

⎪⎩ t0 

det 

with the elements 

( t , t, 

x) 

0 

⎪⎧ 

T 

∂Φ 

0 

⎨ 

⎪⎩ 

∂x 

j 

0 

( Φ( 

t , t, 

x ) − x ) det( 

t , t, 

x) 

( t , t, 

x) 

0 

[ τ , Φ( 

t , t, 

x) 

] dτ 

, 

T ⎡∂Φ 

= det⎢ 

⎣ ∂x 

h 

⎪⎫ 

⎬ 

⎪⎭ 

i, 

j= 

1 

0 

≠ 0 

and δ(·) is a delta function; F[⋅] is (9). 

then 


0 

⎪⎫ 

⎬ 

⎪⎭ 

( t , , x) 

⎤ 0 t 

⎥⎦ 

Taking into account that the fundamental matrix is: 

0 

⋅ 

(37) 

(38) 

(39) 

x( t) = Φ( 

t0 

, t, 

x0 

) 

, (40) 

x = Φ t , t, 

x). 

0 

( 0 

(41) 

If 

and if N0→∞, then 

W PS 

y t) = Φ( 

t , t, 

x ) + n ( t) 

( 0 0 0 

( x Φ( 

t , t, 

x) 

) det( 

t , , x), 

( t, 

x) = Cδ 

− 0 

0 t 

(42) 

(43) 

i.e., it is not zero only when the filtering solution is $\Phi(t_0, t, x_0)$.

When N0→0, WPS(t,x) is not equal to zero if and only if $x_0 = \Phi(t_0, t, x)$, i.e., once more it is singular.

So, for both those limiting cases WPS(t,x) “memorizes” the solution of (1).

For approximate algorithms, these phenomena take place starting from some low but finite values of N0.

V. NUMERICAL SIMULATIONS 

For the numerical simulations the same attractors as mentioned before were considered: Rössler, Lorenz and Chua, but neglecting the process noise in (2). This is done in order to verify how fast the a-posteriori variance (MSE) tends to zero, independently of the SNR level, which is actually the sign of the singularity of filtering.

From the previous analysis, the EKF algorithm seems to work practically in singular conditions: the algorithm fully exploits the a-priori information on the attractor and the output signals are deterministic, so the results do not have to depend on the SNR. This is another reason for an optimistic prognosis for the EKF in low SNR scenarios in the case of chaos filtering.

The conditions on the process noise (see (2)) were analyzed in detail, and it was shown that even a small fraction of process noise produces a drastic growth of the MSE, which is not acceptable for filtering. So, the solution is to let this noise tend to zero.

In order to compare the efficiency and accuracy of the above-mentioned nonlinear approaches, the Rössler, Lorenz and Chua attractors are filtered (estimated) using the EKF, the unscented Kalman filter (UKF) [8], the Gauss-Hermite quadrature filter (GHF) [6] and the quadrature Kalman filter (QKF) [2]. Before proceeding with the comparisons, brief descriptions of these nonlinear filters are given.

Unscented Kalman filter (UKF) 

The UKF is based on the unscented transformation, which 

considers the idea that it is easier to approximate a probability 

distribution than an arbitrary nonlinear function. To achieve 

this, a set of sigma points with adequate mean and covariance 

are chosen (see also the “Functional Approximation method”, 

mentioned in [9], [18] and [19]). This approach differs from 

particle filters because the sigma points are chosen in a 

deterministic way instead of randomly as in particle filters [5], 

[8]. 

This method does not include the linearization of state

and/or output equations. But, although the sigma weights are 

computed before the filtering process begins, the sigma points 

need to be calculated in each algorithm iteration, and 

afterwards the sigma points must be propagated through the 

nonlinear system [5], [8]. 

A detailed derivation of the UKF algorithm is given in [8]. 
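The unscented transformation described above can be sketched as follows. This is the commonly used scaled sigma-point scheme with parameters `alpha`, `beta` and `kappa`; the specific parameter values and the function names are assumptions for illustration and are not taken from [8].

```python
import numpy as np

def sigma_points(mean, cov, alpha=1.0, beta=2.0, kappa=1.0):
    """Deterministic sigma points and their mean/covariance weights."""
    n = len(mean)
    lam = alpha**2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * cov)      # (n + lam) * cov = L @ L.T
    # Central point plus symmetric offsets along the columns of L.
    pts = np.vstack([mean, mean + L.T, mean - L.T])
    wm = np.full(2 * n + 1, 0.5 / (n + lam))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    return pts, wm, wc

def unscented_transform(func, mean, cov, **kw):
    """Propagate (mean, cov) through func using the sigma points."""
    pts, wm, wc = sigma_points(mean, cov, **kw)
    ys = np.array([func(p) for p in pts])        # nonlinear propagation of each point
    y_mean = wm @ ys
    d = ys - y_mean
    y_cov = (wc[:, None] * d).T @ d
    return y_mean, y_cov

m = np.array([1.0, -2.0])
P = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]])
b = np.array([0.5, 0.0, -1.0])
y_mean, y_cov = unscented_transform(lambda x: A @ x + b, m, P)
```

For a linear map the transform reproduces the exact mean and covariance, which is a convenient sanity check; for a nonlinear map such as the attractor dynamics, each of the 2n+1 points must be propagated at every iteration, which is the source of the extra cost noted above.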

Gauss-Hermite quadrature filter (GHF)

As is well known, the Gauss-Hermite quadrature rule allows one to approximate integrals of the form

$$I = \int_{\mathbb{R}^n} f(t)\,\frac{1}{\left((2\pi)^{n}\det\Sigma\right)^{1/2}}\exp\left(-\frac{1}{2}(t - x)^{T}\Sigma^{-1}(t - x)\right)dt, \qquad (44)$$

where Σ is the covariance matrix.

The Gauss-Hermite quadrature rule is given by

$$\int_{-\infty}^{\infty} f(x)\,\frac{1}{(2\pi)^{1/2}}\,e^{-x^{2}}\,dx = \sum_{i=1}^{m} w_i\,f(q_i), \qquad (45)$$

which holds for all polynomials of degree up to 2m-1, where $q_i$ are the quadrature points and $w_i$ the corresponding weights [6]. So, through (45) the GHF algorithm approximates the integrals involved in the Gaussian estimation. It is important to remark that the quadrature points and quadrature weights used by the GHF algorithm are computed before the filtering process is started. Therefore, this method requires less computational effort than the UKF algorithm.
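A one-dimensional Gauss-Hermite rule can be sketched with NumPy. The example assumes the probabilists' weight exp(−x²/2), which is the convention of `numpy.polynomial.hermite_e.hermegauss` and may differ from the normalization written in (45); the function name `gauss_hermite_expectation` is hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gauss_hermite_expectation(func, mean=0.0, std=1.0, m=3):
    """Approximate E[func(X)] for X ~ N(mean, std^2) with an m-point rule."""
    q, w = hermegauss(m)   # nodes and weights for the weight function exp(-x^2/2)
    # The weights sum to sqrt(2*pi), so divide to obtain a Gaussian expectation.
    return np.sum(w * func(mean + std * q)) / np.sqrt(2.0 * np.pi)

fourth_moment = gauss_hermite_expectation(lambda x: x**4, m=3)   # degree 4 <= 2m-1
second_moment = gauss_hermite_expectation(lambda x: x**2, mean=1.0, std=2.0, m=3)
sixth_moment = gauss_hermite_expectation(lambda x: x**6, m=3)    # degree 6 > 2m-1
```

With m = 3 the rule is exact up to degree 5, so the fourth moment of a standard Gaussian is recovered exactly, while the sixth moment is not. Because the nodes and weights depend only on m, they can be computed once before filtering starts, as remarked above.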

Quadrature Kalman filter (QKF) 

The QKF is a more simplified version of the GHF, which 

considers the nonlinear filtering problem from a statistical 

linear regression (SLR) point of view. In other words, QKF 

uses SLR to linearize a nonlinear function by means of a set of 

Gauss-Hermite quadrature points and weights [2]. Although 

QKF is algebraically equivalent to the GHF, in simulations, 

the filtering process carried out with QKF algorithm is solved 

in a faster way than the filtering performed through GHF. The 

QKF algorithm is derived for the first time in [2]. 

Comparison between nonlinear filters 

The computational complexity of the algorithms is briefly 

presented in the following table, where additions 

(subtractions), multiplications (divisions), Cholesky 

decompositions, Jacobian calculations (linearization) and 

nonlinear propagations are included. 

From Table II, it can easily be seen that UKF involves the greatest complexity, while EKF seems to be the simplest algorithm. However, the linearization process performed by the Jacobian calculation involves partial derivatives. For that reason, and depending on the mathematical model of the attractor, the EKF may not always be the fastest algorithm, although this is not the case in our study.

TABLE II
COMPUTATIONAL COMPLEXITY

                          EKF   UKF   GHF   QKF
Additions                   8    50    25    25
Multiplications            15    77    33    40
Cholesky decomposition      1     2     2     2
Nonlinear propagation       0    15    21     6
Jacobian calculation        1     0     0     0

On the other hand, the complexity of each algorithm is also assessed by measuring the time consumed by the different filtering methods. To this end, the algorithms were applied to the chaotic attractors, which evolved during 3000 sample times.

One has to notice that the time plots of the above-mentioned attractors are completely different: the Lorenz attractor produces noise-like chaos, while the Rössler and Chua outputs look more like “modulated sine waves”. Thus, for the same filtering accuracy, the Lorenz outputs have to be sampled more frequently (“oversampled”) than the Rössler and Chua signals. Oversampling has to be applied in order to achieve a required filtering performance, and this statement will be illustrated with concrete data on the sampling times for the attractors considered in this work.

In other words, bigger sampling times demand better accuracy of the filtering procedures. Consequently, extremely large sample periods may destroy the effectiveness of the filtering algorithms, while very small sample times require a great amount of data storage and faster processors.

It is worth mentioning that the algorithms are executed on an Intel Core 2 6420 @ 2.13GHz, with 1.5GB of RAM.

Fig. 1 shows the MSE versus SNR for Chua attractor when 

the process noise is not present. 

Fig. 1. MSE vs. SNR for Chua attractor. 

As can be seen, the EKF is outperformed by GHF, QKF and UKF. This is because the approximation carried out by the EKF through linearization is not as good as the Gaussian approximations (GHF and QKF) or the unscented transformation (UKF). Even so, the MSE generated by the EKF is really small for SNR not less than 0.5.

For the Lorenz attractor, the performance of the nonlinear filters is depicted in Fig. 2.

Fig. 2. MSE vs. SNR for Lorenz attractor. 

From the previous figure, it can be observed that GHF, QKF and EKF give better results than UKF when the nonlinear filters are applied to the Lorenz chaotic system. This is due to the a-posteriori distribution of the Lorenz attractor, which can be better approximated by the Gaussian filters (GHF and QKF), while the linearization approach involved in the EKF is sufficient to approximate the chaotic dynamics.

It can also be deduced that UKF does not match the performance of the Gaussian filters and the EKF, even though UKF is the most complex algorithm.

Finally, the results obtained from using the nonlinear filters 

on Rössler attractor are depicted in Fig. 4. 

Fig. 4. MSE vs. SNR for Rössler attractor. 

For the Rössler system, EKF works better than GHF, QKF and UKF. This is because the a-posteriori PDF for the Rössler attractor can be successfully represented through a Gram-Charlier series [12], such that the approximation by linearization is close to the real system. Notice that the MSE for GHF and QKF does not tend to zero.

Another way to compare the nonlinear filters is through the time needed to complete the filtering process for the different attractors. As mentioned above, only the first 3000 samples of the nonlinear filtering processes are considered. Table III is intended to give an idea of the efficiency of the filtering algorithms.

It is important to remark that, although QKF runs faster than EKF, GHF and UKF, the EKF algorithm is simple and fast enough to be considered a good choice for chaotic filtering.

Consequently, corroborating the analysis presented in the previous sections, EKF is suggested as the best option for the filtering and estimation of the Chua, Lorenz and Rössler attractors.

VI. CONCLUSION 

In this report the effectiveness of the extended Kalman filter
(EKF), the unscented Kalman filter (UKF), the Gauss-Hermite
quadrature filter (GHF), and the quadrature Kalman filter
(QKF) is compared for state estimation of chaotic
attractors, in both high and rather low SNR scenarios.

It was shown that, in contrast to SDE modeling of non-Gaussian
signals, the chaos representation of statistically
equivalent signals (in terms of PDFs) provides a “force
sensing driving” of the filtering algorithms toward the singular
conditions, starting from rather low SNR limits, and as a
consequence yields high filtering accuracy (low
MSE rates) that is practically invariant to the SNR level.

This fact follows from the absence of process noise
components in the SDE of chaos, and it was first predicted
theoretically in [13]. To the best of our knowledge, this
phenomenon has not been discussed in the existing literature.

On the basis of the filtering results, the analysis shows that
EKF achieves very acceptable performance for the Chua, Lorenz
and Rössler attractors. Although UKF, GHF and QKF work
better than EKF for Chua and Lorenz, these filters might be
much more complex for real-time implementations.

TABLE III
SIMULATION TIME FOR CHAOTIC ATTRACTORS

Attractor   EKF      UKF      GHF      QKF
Chua        1.30 s   2.07 s   1.52 s   1.07 s
Lorenz      1.26 s   2.19 s   1.58 s   1.09 s
Rössler     1.25 s   2.17 s   1.62 s   1.08 s

From a computational complexity point of view, the EKF
and QKF require less effort than the GHF and UKF, while
UKF involves the most complicated filtering procedure.

Finally, it is important to remark that the EKF algorithm is the
one with the smallest code, so together with the previous
observations, the analysis suggests EKF as the best filtering
choice for real-time applications.

ACKNOWLEDGMENT 

The authors would like to acknowledge the valuable help of
Dr. Fernando Ramos and M. Sc. Beatriz Rodríguez in the
preparation of this paper.


REFERENCES 

[1] Anischenko, V. S. et al, “Statistical properties of dynamical chaos”, 

Physics--Uspekhi, vol. 48, no. 2, pp. 151-166, 2005. 

[2] Arasaratnam, I. et al, “Discrete-Time Nonlinear Filtering Algorithms 

Using Gauss-Hermite Quadrature”, Proceedings of the IEEE, vol. 95, no. 

5, pp. 953-977, 2007. 

[3] Chui, C.K., and Chen, G., Kalman Filtering with Real-Time 

Applications, Springer-Verlag Berlin Heidelberg, 1999. 

[4] Eckmann, J., and Ruelle, D., “Ergodic Theory and Strange Attractors”,
Reviews of Modern Physics, vol. 57, pp. 617-656, July 1985.

[5] Haykin, S., Kalman Filtering and Neural Networks, John Wiley & Sons,
2001.

[6] Ito, K., and Xiong, K., “Gaussian Filters for Nonlinear Filtering
Problems”, IEEE Transactions on Automatic Control, vol. 48, no. 5, pp.
910-927, 2000.

[7] Jazwinski, A., Stochastic Processes and Filtering Theory,
Academic Press, N.Y., 1970.

[8] Julier, S. J., et al, “Unscented Filtering and Nonlinear Estimation”, 

Proceedings of the IEEE, vol. 92, no. 3, pp. 401-422, 2004. 

[9] Kazakov, I., and Artemiev, V., Optimization of Dynamic Systems with 

Random Structure, Nauka, 1980. (In Russian). 

[10] Kontorovich, V., “Non-Linear Filtering for Markov Stochastic Processes 

using High-Order Statistics (HOS) Approach”, Non-Linear Analysis: 

Theory, Methods and Applications, vol. 30, no. 5, pp. 3165-3170,1997. 

[11] Kontorovich, V., “Applied Statistical Analysis for Strange Attractors 

and Related Problems”, Mathematical Methods in the Applied Sciences, 

vol. 30, pp. 1705-1717, 2007. 

[12] Kontorovich, V., et al., “Analysis of Rössler Attractor and its 

Applications”, Special Issue on Nonlinear Dynamics and 

Synchronization in The Open Cybernetics and Systemics Journal, 2009. 

(In press) 

[13] Kontorovich, V., Lovtchikova, Z., “Nonlinear filtering algorithms for 

chaotic signals: a comparative study”. Proceedings of INDS’09. Second 

International Workshop on Nonlinear Dynamics and Synchronization. 

Klagenfurt, pp. 221-227. Austria. July, 2009. 

[14] Kontorovich, V., Lovtchikova, Z., “Cumulant analysis of strange
attractors. Theory and applications”. Recent Advances in Nonlinear
Dynamics and Synchronization. SCI 254, 2009. (In press)

[15] Kushner, H., “Dynamical Equations for Optimal Nonlinear Filtering”, 

Journal of Differential Equations, vol. 3, pp. 179-190, 1967. 

[16] Kushner, H. and Budhiraja, A., “A Nonlinear Filtering Algorithm Based 

on an Approximation of the Conditional Distribution”, IEEE Trans. on 

Automatic Control, vol. 45, no. 3, pp. 580-585, March 2000. 

[17] Mijangos, M., Kontorovich, V., and Aguilar-Torrentera, J., “Some
Statistical Properties of Strange Attractors: Engineering View”, Journal
of Physics: Conference Series: 012147 (6pp), vol. 96, March 2008.

[18] Primak, S., Kontorovich, V., and Lyandres, V., Stochastic Methods and 

their Applications to Communications: Stochastic Differential Equations 

Approach, John Wiley & Sons, 2004. 

[19] Pugachev, V., and Sinitsyn, I., Stochastic Differential Systems. Analysis 

and Filtering, John Wiley & Sons, 1987. 

[20] Stratonovich, R., Topics of the Theory of Random Noise, vol 1 and vol. 

2, Gordon and Breach, 1963. 

[21] Van Trees, H., Detection, Estimation and Modulation Theory, John 

Wiley & Sons, 2001. 

[22] Zakai, M., “On the Optimal Filtering of Diffusion Processes”,
Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 11,
pp. 230-243, 1969.


Nonlinear Feature Extraction Approaches for 

Scalable Face Recognition Applications 

Hima Deepthi Vankayalapati 

Institute of Smart Systems Technologies 

University of Klagenfurt 

9020 Klagenfurt, Austria 

hvankaya@edu.uni-klu.ac.at 

Abstract—The human ability to identify thousands of people,
even after many years, has motivated many researchers to work on
face recognition systems. The majority of real world applications
demand robust, scalable and computationally efficient
face recognition techniques that can operate under complex
viewing and environmental conditions. The appearance based
linear subspace techniques are very useful in data classification
and dimensionality reduction tasks; however, these algorithms
only classify linear data. The scalability of the linear subspace
techniques is limited, as the computational load and memory
requirements increase dramatically with the size of the database. This
paper evaluates different nonlinear feature extraction approaches
for face recognition applications, namely the wavelet transform, the radon
transform and cellular neural networks (CNN). In this work, a
combination of radon and wavelet transform based approaches is
used to extract multi-resolution features, which are invariant
to facial expression and illumination conditions. The efficiency of
the wavelet and radon based nonlinear approaches is demonstrated
with simulation results obtained
over the FERET database. This paper also presents the use of
CNN for extracting nonlinear facial features, which improves the
recognition rate as well as the computational speed compared to the
other stated nonlinear approaches over the ORL database.

Index Terms—Feature extraction, Face recognition, Linear 

subspace techniques, Cellular neural network, Wavelet transform, 

Radon transform. 


I. INTRODUCTION

In computer vision, a feature is a set of measurements. Each
measurement contains a piece of information and specifies a
property or characteristic of the object present in the image
[1]. Linear features are more advantageous when the given
data is Gaussian distributed in terms of the mean. However, in most
real world face recognition applications, the facial features of the
face image are not purely Gaussian distributed (they vary with
complex viewing and environmental conditions).

Researchers have developed various biometric techniques to
identify or recognize persons by their physical characteristics,
such as fingerprint, voice and face. Each of these biometric techniques has
its own advantages and drawbacks [2]. Among all
the biometric techniques, face recognition has the distinct
advantage that the required data (i.e. the image) can be collected without
any cooperation from the person [3]. Face recognition is
a complex visual classification task which plays an important
role in computer vision, image processing and pattern recognition.

Kyandoghere Kyamakya 

Institute of Smart Systems Technologies 

University of Klagenfurt 

9020 Klagenfurt, Austria 

kyandoghere.kyamakya@uni-klu.ac.at 

Research on face recognition started in the
1960s [4]. Different face recognition techniques have been
proposed during the last decades, namely feature based, model
based and appearance based techniques [5], [6]. Feature
based techniques describe the position
and size of each feature (eye, nose, mouth or face outline)
[7]. In this approach, extracting features under different poses
(viewing conditions) and lighting conditions is a very complex
task. For applications with large databases, we have a large
set of features with different sizes and positions, making it
difficult to identify the required feature points [8]. In the
model based approach, a 3D model is constructed based on
the facial variations in the image or important information
related to the image. The difficulties in this approach are that
a very expensive camera (stereo vision) is needed to capture the
facial variations clearly, that the construction of the 3D model is
difficult, and that constructing the model takes more time for
large databases [6]. The limited availability of large 3D data sets is
another obstacle that makes the model based
methods unsuitable for real world applications dealing with
large databases.

In the 1990s, researchers introduced appearance based linear
subspace techniques, which are statistics related techniques, to solve face
recognition problems. The introduction of the linear subspace
techniques is a milestone in face recognition. The
performance of appearance based techniques heavily depends
on the quality of the features extracted from the image [9]. The
appearance based linear subspace techniques extract global
features, as these techniques use statistical properties like
the mean and variance of the image [6]. The major difficulty
in applying these techniques over large databases is that the
computational load and memory requirements for calculating
features increase dramatically with the size of the database [3]. In order
to increase the performance of the face recognition techniques,
nonlinear feature extraction techniques are introduced.

In order to improve the performance of the face recognition
technique, we have to extract both linear and nonlinear features.
Many nonlinear feature extraction techniques exist,
such as the radon transform and the wavelet transform. The radon
transform based nonlinear feature extraction gives the direction
of local features. This process extracts the spatial frequency
components in the direction in which the radon projection is computed
[10]. When features are extracted using the radon transform, the
variations in this facial frequency are also boosted [10]. The
wavelet transform gives the spatial and frequency components
present in an image [11]. However, these nonlinear feature
extraction techniques are computationally expensive. In order
to improve the computational speed of the nonlinear feature
extraction process, the cellular neural network (CNN) concept
is proposed.

The novel scheme will involve, at its heart, CNN based 

processors, which will be the key component of the analog 

computing based ultra-fast solver for image processing tasks. 

CNN based analog computing has the very attractive advantage 

of easy implementation or emulation on digital platforms. 

The objective of this paper is to present the use of CNN in 

extracting nonlinear features using the ORL database. 

The paper is organized as follows: in section 2, the importance 

and methodologies of the linear subspace techniques 

are explained briefly. In section 3, the basics and importance 

of the radon transform are explained briefly. In section 4, 

wavelet transform is briefly described. In section 5, cellular 

neural network is introduced. Genetic algorithm based template 

calculation method is also briefly described in section 

5. The experimental simulation results using the FERET and 

the ORL databases are described in section 6. Section 7 deals 

with some concluding remarks and outlooks. 

II. LINEAR SUBSPACE TECHNIQUES 

Principal Component Analysis (PCA), Independent Component
Analysis (ICA) and Linear Discriminant Analysis (LDA)
are appearance based linear subspace techniques
[6]. These linear subspace techniques use statistics (mean and
co-variance). The mean and co-variance are
calculated from the train data set, which forms the data matrix
X. In the data matrix X, each column x_i represents one image
of the train data set, and N is the number of train images. The mean image of the train data set is
expressed as shown in Eq. 1.

m = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (1)

The co-variance matrix C of the random vector x is calculated
using Eq. 2.

C = \frac{1}{N} \sum_{i=1}^{N} (x_i - m)(x_i - m)^T \quad \text{(or)} \quad C = A A^T \qquad (2)

Calculating the co-variance matrix by using Eq. 2 requires a large amount of
memory because of the dimensions of C. The size of A, the matrix whose
columns are the centered images x_i − m, is
LM x N, where LM is the number of pixels per image. The size of C is LM x LM, which is very large.
So the matrix L = A^T A is considered instead of C. The
dimension of L is N x N, which is much smaller than the
dimensions of C. After the co-variance matrix, each technique
(PCA, ICA and LDA) uses a specific approach to calculate the
key parameters of the feature space.

In the linear subspace techniques, all the images in the train data
set are represented as points in the feature space, as shown in
Fig. 1. The given test image is also represented as a point in
the same space, and the train data set image at minimum distance
gives the best match.

Fig. 1. Image representation in the high dimensional space (axes X1, X2, X3)
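The replacement of C by the much smaller matrix L = A^T A can be motivated by a short derivation. It is a standard linear algebra argument that the text does not spell out; the symbols v_k, \mu_k and e_k are introduced here only for illustration.

```latex
A^T A \, v_k = \mu_k \, v_k
\;\Longrightarrow\;
A A^T (A v_k) = \mu_k \, (A v_k),
\qquad
e_k = \frac{A v_k}{\lVert A v_k \rVert}
```

Under this reading, every eigenvector v_k of the N x N matrix L with a non-zero eigenvalue \mu_k gives an eigenvector e_k of the LM x LM matrix A A^T with the same eigenvalue, so the large matrix never has to be formed explicitly.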

A. Principal Component Analysis (PCA) 

PCA highlights the similarities and differences between the
variables in the data [12], [13]. After calculating the co-variance
matrix, we calculate its eigenvalues and
eigenvectors. We then arrange all
eigenvalues in descending order and take the first few highest
eigenvalues and the corresponding eigenvectors. This operation is
the evaluation of the principal components [14]. The eigenvectors
e_1, e_2, ..., e_n are shown in Eq. 3.

W_{pca} = [e_1, e_2, \ldots, e_n] \qquad (3)

We neglect the remaining less significant eigenvalues and
the corresponding eigenvectors. Neglecting these eigenvalues
leads to a very small information loss [15]. The principal
component axis passes through the mean values. A new
transformation matrix W_pca is obtained by projecting the
principal components on to the original data set.
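The PCA steps above, together with the minimum distance matching described for Fig. 1, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pca_basis` and `best_match` are hypothetical, the Euclidean distance is assumed as the distance measure, and the small matrix L = A^T A is used in place of C.

```python
import numpy as np

def pca_basis(X, n_components):
    """Return the mean image and W_pca for a data matrix X (pixels x N)."""
    m = X.mean(axis=1, keepdims=True)            # Eq. 1: mean image
    A = X - m                                    # centered images as columns
    L = A.T @ A                                  # small N x N matrix used instead of C
    eigvals, V = np.linalg.eigh(L)               # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    W = A @ V[:, order]                          # map eigenvectors back to image space
    W /= np.linalg.norm(W, axis=0)               # unit-length columns e_1..e_n (Eq. 3)
    return m, W

def best_match(W, m, X_train, y):
    """Index of the train image closest to test image y in the feature space."""
    P_train = W.T @ (X_train - m)                # train images as points
    p_test = W.T @ (y.reshape(-1, 1) - m)        # test image as a point
    distances = np.linalg.norm(P_train - p_test, axis=0)   # assumed Euclidean
    return int(np.argmin(distances))
```

Because the centered data has rank at most N − 1, `n_components` should stay below N in this sketch; otherwise a column with zero norm would be normalized.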

B. Independent Component Analysis (ICA) 

ICA uses the higher order statistics of the input data
to find the independent components. Independence is
determined starting from the uncorrelated data. ICA is a
special case of the blind source separation problem [16]. One of the simplest
applications of ICA is found in the cocktail party problem.
The ICA technique is thus a generalization of the PCA technique.
In this technique we first calculate the PCA transformation
matrix W_pca, transform the centered matrix P = [x_1 − m, x_2 − m,
..., x_n − m] using W_pca, and then form a new matrix Z
(a square matrix of size N x N), which contains the random
vector z, whose elements are uncorrelated, as shown in Eq. 4.

Z = W_{pca}^T P \qquad (4)

The next important stage is the rotation stage, in which
the fixed point algorithm is used to find W_k [17]. After
that, we calculate the overall transformation matrix as shown
in Eq. 5.

W_{ica} = W_{pca} W_k \qquad (5)
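The decorrelation step of Eq. 4 can be checked numerically. The sketch below is illustrative only: `decorrelate` is a hypothetical name, it keeps a reduced number of components rather than the full N x N matrix described in the text, and it does not implement the fixed point rotation stage that finds W_k.

```python
import numpy as np

def decorrelate(X, n_components):
    """Project centered data onto the PCA basis, as in Eq. 4 (Z = W_pca^T P)."""
    m = X.mean(axis=1, keepdims=True)
    P = X - m                                    # centered matrix P
    eigvals, V = np.linalg.eigh(P.T @ P)         # small-matrix eigen decomposition
    order = np.argsort(eigvals)[::-1][:n_components]
    W_pca = P @ V[:, order]
    W_pca /= np.linalg.norm(W_pca, axis=0)       # unit-length eigenvectors
    Z = W_pca.T @ P                              # Eq. 4
    return W_pca, Z
```

The rows of Z are uncorrelated in the sense that Z Z^T is diagonal up to rounding error, which is the property the rotation stage then builds on.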


C. Linear Discriminant Analysis (LDA) 

The main objective of LDA is to minimize the within
class variance and maximize the between class variance in
the given data set. In other words, it groups the images of the same
class and separates the images of different classes [18]. A class
means the collection of data (images) belonging to the same
object or the same person. In LDA, we have to calculate the mean
image of each class i, which is represented as m_i.

S_i = \frac{1}{N_i} \sum_{x \in X_i} (x - m_i)(x - m_i)^T \qquad (6)

Eq. 6 represents the class dependent scatter matrix, and it gives
the co-variance matrix of the centered images in
each class. X_i represents the data matrix corresponding to
class i, N_i represents the number of images present in class i, and c represents
the total number of classes. The within class scatter matrix S_w
is calculated from Eq. 7.

S_w = \sum_{i=1}^{c} S_i \qquad (7)

This leads to the evaluation of the amount of variance between
the images in each class. S_b represents the between class
scatter matrix [3], and it measures the variance between the
classes. Each of its terms is formed from the difference between
the mean of a class and the total mean m of all classes.
S_b is expressed in Eq. 8.

S_b = \sum_{i=1}^{c} (m_i - m)(m_i - m)^T \qquad (8)

If S_w is non-singular, we solve the generalized eigen
problem for the transformation matrix W, following the linear discriminant
analysis in Fig. 2. This transformation matrix should
maximize the between class scatter and minimize the
within class scatter [19]. There are many ways to
solve the generalized eigen problem [20]. One method
is to take the inverse of S_w and
solve S_w^{-1} S_b W = W \lambda. This task is
derived from Eq. 9.

S_b W = S_w W \lambda \qquad (9)

\lambda is a diagonal matrix containing the eigen values of the matrix
S_w^{-1} S_b. The above algorithm is applicable only when the within
class scatter matrix is non-singular. If the within class scatter matrix
is singular, we should use the direct LDA technique [15].
The direct LDA is performed in the following steps, as shown
in Fig. 2.

The first step is to find the eigen vectors of the
between class scatter matrix S_b = P_b P_b^T (in practice computed from
the much smaller matrix P_b^T P_b), where P_b is
calculated by subtracting the mean face image of all images
from the mean face image of each class, as expressed in Eq. 10.

P_b = [m_1 - m, m_2 - m, \ldots, m_c - m] \qquad (10)

The second step takes the most significant eigen values and
the corresponding eigen vectors V. These eigen vectors are used
to calculate Y = P_b V and D_b = Y^T S_b Y. This leads to the
evaluation of the whitening transform as Z = Y D_b^{-1/2}. S_b and
S_w are projected onto the new subspace spanned by Z. The
small matrix Z^T S_w Z can be diagonalized. The relationship
between them is expressed in Eq. 11.

U^T Z^T S_w Z U = \lambda_w \qquad (11)

U and \lambda_w are the eigen vectors and eigen values of the
matrix Z^T S_w Z. The corresponding eigen matrix is represented
as R. The overall transformation matrix is calculated from
W = ZR. The original space is then linearly transformed into a new
reduced dimensional feature space P_x = W^T X (i.e. the
transformation matrix is projected on to the train data set) [6].
The next operation projects this
transformation matrix on to the test data set to obtain P_y = W^T y.
The best match is found by calculating the distance between
P_x and P_y using a distance measure and taking the minimum. The overall
linear discriminant analysis technique for face recognition is
shown in Fig. 2.

Fig. 2. Linear discriminant analysis technique for face recognition
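For the case in which S_w is non-singular, Eqs. 6 to 9 can be sketched as follows. This is a minimal illustration, not the direct LDA procedure of Fig. 2: the function name `lda_transform` is hypothetical, and the eigen problem is solved through S_w^{-1} S_b with a general eigen solver, which is only one of several possible approaches.

```python
import numpy as np

def lda_transform(X, labels, n_components):
    """X holds one sample per column; returns the transformation matrix W."""
    labels = np.asarray(labels)
    d = X.shape[0]
    m = X.mean(axis=1, keepdims=True)            # total mean of all samples
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[:, labels == c]
        mc = Xc.mean(axis=1, keepdims=True)      # class mean m_i
        Sw += (Xc - mc) @ (Xc - mc).T / Xc.shape[1]   # Eq. 6 summed as in Eq. 7
        Sb += (mc - m) @ (mc - m).T                   # Eq. 8
    # Eq. 9 rewritten as S_w^{-1} S_b W = W lambda (requires non-singular S_w)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real
```

With face images the pixel dimension usually exceeds the number of train images, so S_w is singular and this direct inversion fails; that is the situation the direct LDA steps address.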

                                   PCA            ICA            LDA
Year                               1990           1999           1997
Iterative                          No             Yes            No
Class information usage            No             No             Yes
Order of statistics                Second order   Higher order   Second order
Recognition rate (for 80
persons database)                  70%            79%            89%
Speed                              medium         very low       high
Scalability                        low            low            high

Fig. 3. Comparison of linear subspace techniques (PCA, ICA and LDA)

The performance of the linear subspace techniques
PCA, ICA and LDA is evaluated. Experiments are conducted
to understand the performance (recognition rate and speed) of
these linear subspace techniques over the FERET database.
Among the linear subspace techniques, LDA gives both a high
recognition rate and a high speed when compared with PCA and
ICA, as shown in Fig. 3. However, LDA is not scalable, and its
recognition rate is also not sufficient for real world applications.
In linear subspace techniques, the computational load
and memory requirements increase dramatically with the
size of the database.

III. RADON TRANSFORM 

The two dimensional radon transform was introduced by 

Austrian mathematician Johann Radon in 1917. This transform 

gives the integral of the set of lines present in a given image 

[10]. Due to this, it captures the direction of the local features 

(lines, curves and circles) which are present in the image. This 

transform is useful in many line, circle and curve detection 

applications, related to image processing and computer vision 

[10]. The radon transform of the two dimensional function 

f(x, y) in the (r, θ) plane (Fig. 4(a)) is shown in Eq. 12.

R(r, \theta)[f(x, y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, \delta(x \cos\theta + y \sin\theta - r)\, dx\, dy \qquad (12)

where δ(.) is the Dirac function, r ∈ [−∞, ∞] is the
perpendicular distance of a line from the origin and θ ∈ [0, π] is
the angle formed by the distance vector [10]. The δ function
converts the two dimensional integral to a line integral dl
along the line x cos θ + y sin θ = r. The simplified form of
R(r, θ)[f(x, y)] is Rf, shown in Eq. 13.

Rf = \int_{-\infty}^{\infty} f(r \cos\theta - l \sin\theta,\; r \sin\theta + l \cos\theta)\, dl \qquad (13)

The transformed function R(r, θ) is referred to as the sinogram
of f(x, y). The δ function transforms a point in f to a
sinusoidal line in the (r, θ) plane. Rf is defined
as a function of straight lines. The radon transform of the two
dimensional image shown in Fig. 4(b) extracts the direction
of the lines present in that image, as shown in Fig. 4(c).

Fig. 4. (a) The radon transform of an image (b) The original image
(c) Radon transform of the image for angles from 0 to 180 degrees (r versus θ in degrees)

The sinogram (Fig. 4(c)) of the given image has 181 radon
projections. Each projection in the image is a feature vector.
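A discrete counterpart of Eq. 12 can be sketched by accumulating each pixel into the bin of its distance r = x cos θ + y sin θ. This is a simplified nearest-bin approximation written for illustration; the function name `radon_sinogram`, the centering of the origin in the image, and the one degree angular step are assumptions, not details taken from the paper.

```python
import numpy as np

def radon_sinogram(image, angles_deg=None):
    """Nearest-bin discrete radon transform; one column per projection angle."""
    if angles_deg is None:
        angles_deg = np.arange(181)              # 0..180 degrees, 181 projections
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    y, x = np.mgrid[0:rows, 0:cols]
    x = x - (cols - 1) / 2.0                     # origin at the image center (assumed)
    y = y - (rows - 1) / 2.0
    r_max = int(np.ceil(np.hypot(rows, cols) / 2.0))
    n_bins = 2 * r_max + 1
    sinogram = np.zeros((n_bins, len(angles_deg)))
    for k, theta in enumerate(np.deg2rad(angles_deg)):
        r = x * np.cos(theta) + y * np.sin(theta)     # argument of the delta in Eq. 12
        bins = np.rint(r).astype(int) + r_max         # shift so that bins start at 0
        sinogram[:, k] = np.bincount(bins.ravel(), weights=img.ravel(),
                                     minlength=n_bins)
    return sinogram
```

Each column of the result is one projection and can be used as a feature vector; the sum of every column equals the sum of the image intensities, since each pixel is assigned to exactly one bin per angle.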

IV. WAVELET TRANSFORM 

Morlet introduced the wavelet transform in the early 1980s
[21]. The wavelet is named ’ondelette’ in French, which means
’small wave’ [11]. A wavelet gives both the spatial and the
frequency information of the images. In the frequency representation,
the signal is cut into several parts and each part
is analyzed separately. Commonly used discrete wavelets are
the Daubechies wavelets [22]. A one level wavelet decomposition
is performed by using the high pass filter g and the low
pass filter h. Convolution with the low pass filter gives the
approximation information, while convolution with the high
pass filter gives the detail information [23]. The wavelet
decomposition process of a two dimensional signal f(x, y) is
shown in Fig. 5. The overall process is modeled in Eqs. (14)-(17).

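Fig. 5. Wavelet coefficients decomposition in discrete wavelet transform

A = \big[ h * [ h * f ]_{x \downarrow 2} \big]_{y \downarrow 2} \qquad (14)

H = \big[ g * [ h * f ]_{x \downarrow 2} \big]_{y \downarrow 2} \qquad (15)

V = \big[ h * [ g * f ]_{x \downarrow 2} \big]_{y \downarrow 2} \qquad (16)

D = \big[ g * [ g * f ]_{x \downarrow 2} \big]_{y \downarrow 2} \qquad (17)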

The star (∗) represents the convolution operation, and ↓ 2
represents downsampling by 2 along the direction x or
y [11]. To correct the sample rate, the
filter output is downsampled by two (by simply throwing away every second
coefficient). The Daubechies family contains many wavelet
functions. In this work, db4 (because of the symmetry) is used.
db4 leads to the four wavelet coefficients A, H, V and D
and the corresponding images. In this decomposition, A gives
the approximation information, and its image is a blurred
version of the input, as shown in Fig. 5. H gives the horizontal features, V
gives the vertical features and D gives the diagonal features
present in the image. The wavelet coefficient A gives the highest
performance when compared to the remaining three wavelet
coefficients, while D gives the lowest performance. Using the
A + H + V + D wavelet coefficients leads to a performance
which is nearly equal to that of A alone.
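The structure of Eqs. 14 to 17 can be illustrated with a one level decomposition. Note that the sketch below swaps in the two-tap Haar filters instead of the db4 filters used in the paper, purely to keep the example short and self-contained; the function names are hypothetical and the image sides are assumed to be even.

```python
import numpy as np

def split(a, axis):
    """Haar low/high pass filtering followed by downsampling by 2 along one axis."""
    a = np.moveaxis(np.asarray(a, dtype=float), axis, 0)
    even, odd = a[0::2], a[1::2]
    low = (even + odd) / np.sqrt(2.0)            # h * a, every second sample kept
    high = (even - odd) / np.sqrt(2.0)           # g * a, every second sample kept
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def dwt2_haar(f):
    """One level 2D decomposition following the filter order of Eqs. 14-17."""
    lo_x, hi_x = split(f, axis=1)                # filter along x (columns) first
    A, H = split(lo_x, axis=0)                   # then along y: Eq. 14 and Eq. 15
    V, D = split(hi_x, axis=0)                   # Eq. 16 and Eq. 17
    return A, H, V, D
```

Each of the four outputs has half the height and half the width of the input, and because the Haar filters are orthonormal the total energy of A, H, V and D equals the energy of the input image.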



V. CELLULAR NEURAL NETWORK 

The concept of the CNN (cellular neural network)
was introduced in 1988 by Leon O. Chua and Lin Yang. The
basic building block in the CNN model is the cell. The CNN
model consists of a regularly spaced array of cells. It can be
identified as the combination of cellular automata [24] and
neural networks [25]. Adjacent cells communicate directly
with their nearest neighbours, and other cells communicate
indirectly because of the propagation effects in the model.
The original idea was to use an array of simple, non-linearly
coupled dynamic circuits to process large amounts
of data in parallel and in real time [25].

Cells are multiple input, single output nonlinear processors.
Cells in the CNN processor have a fixed location and a fixed
topology. Inputs, initial state, and output variables are used to
define the CNN processor behavior. Professor Leon O. Chua
proposed the diagram of an isolated cell, as shown in Fig. 6.
The state variable is not observable from outside the cell itself.

Fig. 6. Representation of an isolated cell C_ij (input U_ij, threshold Z_ij, state X_ij, output Y_ij)

The cell is a lumped circuit, and it contains both linear and
nonlinear elements, such as resistors, capacitors and nonlinear
controlled sources, as shown in Fig. 7(a). The CNN processor
is modeled by Eqs. (18)-(19), with x_ij, y_ij and u_ij as state,
output and input variables respectively. The schematic model
of a CNN cell is shown in Fig. 8.

\dot{x}_{ij} = -x_{ij} + \sum_{c(j) \in N_r(i)} \left( A_{ij} y_{ij} + B_{ij} u_{ij} \right) + I \qquad (18)

y_{ij} = \frac{1}{2} \left( |x_{ij} + 1| - |x_{ij} - 1| \right) \qquad (19)

The values of the coefficients A_ij and B_ij, the synaptic weights,
completely define the behavior of the network for given input
and initial conditions, as shown in Eq. 18. These values are
called the templates. For ease of representation, they can
be written as matrices. We have three types of templates:
the first is the feedforward or control template, the second
is the feedback template and the third is the bias. All these space
invariant templates are called cloning templates. CNNs are
particularly interesting because of their programmable nature,
i.e. changeable templates.

These templates are expressed in the form
of a matrix and are repeated in every neighborhood cell. The
template set for an r = 1 CNN contains 19 coefficients (A-template
9, B-template 9 and bias 1).
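The cell dynamics of Eqs. 18 and 19 can be simulated on a digital platform with a simple integration loop. The following is a minimal sketch under several assumptions that are not taken from the paper: forward Euler integration with a hypothetical step `dt`, zero valued cells outside the image border, and template entry [k, l] weighting the neighbour at offset (k − 1, l − 1).

```python
import numpy as np

def output(x):
    """Piecewise linear output nonlinearity of Eq. 19."""
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def neighbourhood_sum(template, field):
    """Weighted sum over the r = 1 neighbourhood of every cell (3 x 3 template)."""
    padded = np.pad(field, 1, mode="constant")   # assumed zero boundary cells
    rows, cols = field.shape
    total = np.zeros_like(field, dtype=float)
    for k in range(3):
        for l in range(3):
            total += template[k, l] * padded[k:k + rows, l:l + cols]
    return total

def simulate_cnn(u, A, B, I, x0=None, dt=0.05, steps=400):
    """Integrate Eq. 18 with forward Euler and return the output image y."""
    u = np.asarray(u, dtype=float)
    x = np.zeros_like(u) if x0 is None else np.asarray(x0, dtype=float).copy()
    feedforward = neighbourhood_sum(B, u) + I    # constant part of Eq. 18
    for _ in range(steps):
        dx = -x + neighbourhood_sum(A, output(x)) + feedforward
        x = x + dt * dx
    return output(x)
```

With a zero feedback template and a B-template whose only non-zero entry is a centre value of 1, the state simply relaxes to the input, which makes a convenient sanity check before trying learned templates.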

Fig. 7. (a) Electronic circuit model of the isolated cell (b) The classical output
nonlinear function for each cell

Fig. 8. Schematic representation of the CNN

A = \begin{bmatrix} A_{-1,-1} & A_{-1,0} & A_{-1,1} \\ A_{0,-1} & A_{0,0} & A_{0,1} \\ A_{1,-1} & A_{1,0} & A_{1,1} \end{bmatrix};

B = \begin{bmatrix} B_{-1,-1} & B_{-1,0} & B_{-1,1} \\ B_{0,-1} & B_{0,0} & B_{0,1} \\ B_{1,-1} & B_{1,0} & B_{1,1} \end{bmatrix};

The genetic algorithm is used to estimate the A, B and I 

templates, depending upon the given application. The template 

set is unique for each application. In this work, we use the 

genetic algorithm to obtain the template set for the ORL 

database. 

A. Genetic algorithm 

In order to extract the facial features from a frontal face
image, we assume that the template set values will have symmetrical
behavior, as the front view of the face is symmetrical.


Because of this symmetry, instead of 19 template elements,
we calculate only 11 template elements (A-template 5,
B-template 5 and bias 1). Each template element is encoded
in the 32 bit floating point format. The genetic algorithm (GA)
uses a population of binary strings called chromosomes. In
the learning process, initially 72 random chromosomes, with
a length of 11 ∗ 32 bits each, are constructed. The genetic algorithm
is explained in detail in the following steps:

• Construct the random population matrix with size 

72X(11∗32) i.e. each row represents a chromosome (for 

11 template elements) of length 11 ∗ 32 = 352. 

• The IEEE 754 floating point standard is used to calculate
the template (A, B and I) elements from each chromosome
[24]. In each chromosome, the first 11 bits represent
the first bit of each of the 11 template elements, the second 11
bits represent the second bit of each of the 11 template elements,
and so on, with the elements ordered as given in Eq. 20.

S = [A_{11}, A_{12}, A_{13}, A_{21}, A_{22}, B_{11}, B_{12}, B_{13}, B_{21}, B_{22}, I] \qquad (20)

• After template calculation, these templates are given as
input to the CNN. The first CNN works with the template
of the first chromosome. After the CNN output has become
stable, the cost function is calculated by using this CNN
output image P and the target image T. This process
is repeated for the template set of each chromosome in the
population matrix [24]. The cost function is selected as
shown in Eq. 21.

cost(A, B, I) = \sum_{i=1}^{m} \sum_{j=1}^{n} P_{i,j} \oplus T_{i,j} \qquad (21)

Here m and n are the dimensions of the image in pixels, and ⊕
represents the XOR operation.

• After calculating the cost function, the fitness function 

for each chromosome is evaluated as given in Eq. 22. 

fitness(A, B, I) = m ∗ n − cost(A, B, I) (22) 

• The whole process is repeated for each chromosome until
the fitness value exceeds the stop criterion, which is
taken as stcriteria = 0.99 ∗ m ∗ n. The chromosome with the maximum
fitness value in the population matrix
is selected.

• The next step is reproduction. In this process, the
chromosomes are sorted in descending order of their fitness
values. All the fitness values are normalized by the
sum of the fitness values. The chromosomes with bad fitness
values are deleted. The most successful
chromosomes produce the next generation.

• Take the two chromosomes with the highest fitness values,
S1 and S2, and apply the crossover and mutation
operations to generate the children [24]. The crossover operation
exchanges substrings between the two chromosomes
S1 and S2. In this work, one-point crossover is
used and its cross site is selected with uniform probability
along the chromosome length. If the mutation probability
is set to 0.01, then 253 bits (about 1% of the 72 ∗ 352 = 25344
bits in the population) are selected randomly
and inverted.

• Take these new chromosomes and apply the same steps,
from template calculation to the stop criterion.
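The chromosome handling in the steps above can be sketched as follows. The sketch assumes that each 32 bit word is stored most significant bit first (sign bit, then exponent, then mantissa) and interleaved across the 11 elements as the text describes; all function names are hypothetical, and the selection and normalization steps are omitted.

```python
import numpy as np

N_ELEMENTS, N_BITS = 11, 32

def encode(values):
    """Interleave the IEEE 754 single precision bits of 11 template elements."""
    words = np.asarray(values, dtype=np.float32).view(np.uint32)
    shifts = np.arange(N_BITS - 1, -1, -1, dtype=np.uint32)
    bits = (words[:, None] >> shifts) & np.uint32(1)     # shape (11, 32), MSB first
    return bits.T.reshape(-1).astype(np.uint8)           # bit k of all elements together

def decode(chromosome):
    """Recover the 11 template elements, ordered as in Eq. 20."""
    bits = np.asarray(chromosome, dtype=np.uint32).reshape(N_BITS, N_ELEMENTS).T
    shifts = np.arange(N_BITS - 1, -1, -1, dtype=np.uint32)
    words = (bits << shifts).sum(axis=1, dtype=np.uint32)
    return words.view(np.float32)

def fitness(P, T):
    """Eq. 22: number of pixels minus the XOR cost of Eq. 21 (binary images)."""
    P = np.asarray(P, dtype=bool)
    T = np.asarray(T, dtype=bool)
    cost = np.count_nonzero(P ^ T)
    return P.size - cost

def crossover(s1, s2, rng):
    """One-point crossover with a uniformly chosen cross site."""
    site = rng.integers(1, len(s1))
    return (np.concatenate([s1[:site], s2[site:]]),
            np.concatenate([s2[:site], s1[site:]]))

def mutate(population, n_flips, rng):
    """Invert n_flips randomly selected bits of the population matrix."""
    flat = population.copy().reshape(-1)
    idx = rng.choice(flat.size, size=n_flips, replace=False)
    flat[idx] ^= 1
    return flat.reshape(population.shape)
```

Under this bit layout, the first 11 bits of a chromosome are the sign bits of the 11 elements, so a template with positive A entries and alternating signs in B is visible directly at the start of the string.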

This learning process is repeated to find the best chromosome. 

After satisfying the stop criteria, the template elements are 

calculated from the best chromosome. The template elements 

to extract features from the frontal face images for ORL 

database are obtained as: 

$$A = \begin{bmatrix} 2.7612 & 7.3152 & 1.7566 \\ 1.5916 & 8.5273 & 1.5916 \\ 1.7566 & 7.3152 & 2.7612 \end{bmatrix}; \quad B = \begin{bmatrix} -6.1912 & 2.8350 & -7.9270 \\ 1.3044 & -2.7349 & 1.3044 \\ -7.9270 & 2.8350 & -6.1912 \end{bmatrix}; \quad I = 0.4414$$

The corresponding best chromosome is

S = [000001010101100111101000110000101
00110000101001100001010011000010100110000101
00110000101011101011010111010100111100011111
10000011000010110010100000011111001010100110
00010011011101100011010010101010110011011101
11110110101001111010111100110010111001100101
10011001001100110111100010110100000000011001
01001001110010101010110111011110101001100011
00100100000]

Fig. 9. (a) The input image of the genetic cellular neural network (b) The
genetic cellular neural network output image

The two dimensional image shown in Fig. 9(a) is given
as the input image for CNN to extract the important frontal
facial features present in that image and the output image with
extracted feature set is shown in Fig. 9(b).

VI. EXPERIMENTAL RESULT 

In this section, we evaluate the performance of the wavelet
and Radon transform based feature extraction approaches using
the FERET database. The performance of the CNN based approach
is compared to the other stated face recognition approaches
over the ORL database. On the FERET database, the performance
is evaluated for frontal images (fa or fb), half-left or
half-right pose images at an angle of 67.5 degrees (hl or hr),
and profile-left or profile-right pose images at an angle of
90 degrees (pl or pr) [26]. For the ORL database, the performance
is evaluated for facial expressions and varying light conditions.

1) Performance evaluation of radon and wavelet transforms:
The Radon transform gives the direction of the
local features (lines, circles). It preserves the
variation in pixel intensities. While computing the Radon
projections, the pixel intensities along a line are added. This
process extracts the spatial frequency components in the
direction of the Radon projection. When features are extracted
using the Radon transform, the variations in this spatial frequency
are also boosted. The wavelet transform gives the spatial and
frequency components present in an image.
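The statement that pixel intensities along a line are added can be made concrete with a deliberately restricted sketch. It computes only the two axis-aligned projections (along horizontal and along vertical lines) of an image stored as a nested list; a full Radon transform repeats this summation over many projection angles, which is not shown. The function name is hypothetical.

```python
def axis_projections(img):
    """Sum pixel intensities along each horizontal line and each vertical line."""
    along_rows = [sum(row) for row in img]           # one value per horizontal line
    along_cols = [sum(col) for col in zip(*img)]     # one value per vertical line
    return along_rows, along_cols
```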

A. Different wavelet functions versus recognition rate

The Daubechies wavelet family contains different wavelet functions.
The recognition rates of two different wavelet functions, db1
and db4, are compared in Fig. 10. db1 is the Haar
wavelet and it encodes the constant component. db4 encodes
both constant and linear components. db4 achieves a higher
recognition rate than db1.

B. Different wavelet coefficients versus recognition rate

The db4 Daubechies wavelet decomposition yields four sets of wavelet
coefficients, whose values depend on the wavelet
function used. The four sets of wavelet coefficients are A, H, V and
D. The wavelet coefficient A gives approximate information
on the features. H, V and D give information about the
horizontal, vertical and diagonal features present in the given
image respectively.
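To illustrate how the four coefficient sets arise, the following sketch computes a single-level two-dimensional decomposition with the Haar (db1) wavelet, chosen here only because its filters are short enough to write out by hand; the experiments reported here also use db4. The image is assumed to be a nested list with even dimensions, the function name is hypothetical, and the assignment of the H and V labels follows one common convention that differs between libraries.

```python
def haar_level1(img):
    """One-level 2D Haar decomposition into A, H, V and D coefficient arrays."""
    A, H, V, D = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            p, q = img[r][c], img[r][c + 1]          # top-left, top-right
            s, t = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((p + q + s + t) / 2)  # approximation (local average)
            h_row.append((p + q - s - t) / 2)  # top vs bottom: horizontal edges
            v_row.append((p - q + s - t) / 2)  # left vs right: vertical edges
            d_row.append((p - q - s + t) / 2)  # diagonal differences
        A.append(a_row)
        H.append(h_row)
        V.append(v_row)
        D.append(d_row)
    return A, H, V, D
```

Each output array has half the height and width of the input, and with this normalization the sum of squared coefficients equals the sum of squared pixel values.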

The wavelet coefficient A gives the highest recognition rate
of the four wavelet coefficients, and
D gives the lowest (see Fig. 10). Using
the A + H + V + D wavelet coefficients together leads to a recognition
rate that is nearly equal to that of A alone.

[Figure: recognition rate (%), from 0% to 100%, plotted against the wavelet coefficients D, V, H and A]

Fig. 10. Performance comparison of the different wavelet functions db1 and db4

[Figure: panels (a) to (c) plot recognition rate (%) against the number of subjects in the database (40, 100, 150, 200, 400) for PCA, LDA, Radon, Wavelet, Radon+wavelet, Radon+LDA and Wavelet+LDA; panel (d) plots recognition rate (%) for PCA, LDA, Radon, Wavelet, Radon+wavelet, Radon+LDA, Wavelet+LDA and CNN]

Fig. 11. (a) Performance comparison of different face recognition approaches
with front images (FERET database) (b) Performance comparison of different
face recognition approaches with half right images (FERET database) (c)
Performance comparison of different face recognition approaches with profile
right images (FERET database) (d) Performance comparison of different face
recognition approaches with ORL database

The next experiments are conducted on the FERET database
with one frontal image (fb) of each subject as the test image
and five images in different poses of each subject in the training
database. The performance evaluation is shown in Fig. 11(a).
The experiments are repeated with pose variant images such as
hr and pr as the test image for each subject, and five images,
excluding the test image, for each subject in the training database. The
results are shown in Fig. 11(b) and Fig. 11(c) respectively. For
best matching, the Euclidean distance measure is used here.
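The matching step with the Euclidean distance can be sketched as a nearest-neighbor search over the training feature vectors. This is an illustration under assumptions: feature vectors are plain lists of numbers, the gallery is a list of (label, features) pairs, and the names `nearest_subject`, `probe` and `gallery` are hypothetical.

```python
import math

def nearest_subject(probe, gallery):
    """Return (label, distance) of the gallery entry closest to the probe vector."""
    best_label, best_dist = None, math.inf
    for label, features in gallery:
        d = math.dist(probe, features)  # Euclidean distance between the two vectors
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist
```

A test image is counted as correctly recognized when the returned label equals the true identity of the subject.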

The recognition rate depends upon the number of subjects
in the data set. It is more difficult to recognize a subject in a
large data set than in a small one. The experiments
are conducted with different sizes of the FERET database,
using linear subspace techniques (principal component
analysis (PCA) and linear discriminant analysis (LDA)), the Radon
transform and the wavelet transform. When linear subspace
techniques are applied to large databases, the computational load and memory
requirements increase dramatically with the size of the
database. This affects the performance of PCA and LDA on
large data sets, as shown in Fig. 11.

The Radon transform and wavelet transform are mostly independent
of the size of the database. The combination of Radon and
wavelet transforms gives multi-resolution features, which
are more useful in face recognition. This has been validated
by the experimental results shown in Fig. 11. Even though
the combination of Radon and wavelet transforms gives better
performance, there is still a need for improvement in pose
variant face recognition, as shown in Fig. 11(b) and Fig. 11(c).

2) Performance evaluation of cellular neural networks:
The CNN based face recognition approach and the other stated
approaches are applied to the ORL database. The ORL database
contains images of 40 subjects. All images are taken in frontal
position against a dark homogeneous background. The performance
of the various algorithms evaluated on the ORL database
is shown in Fig. 11(d). CNN, with its parallel computing
paradigm, promises to outperform the other approaches over
the ORL database.

VII. CONCLUSION 

The face recognition performance has been systematically
evaluated using different sizes of the database. To improve
the performance of the face recognition technique, the wavelet
transform, the Radon transform and the combination of both
have been proposed to extract the nonlinear features. The
results of the evaluation have shown that the recognition rate is
considerably increased with the combination of Radon and
wavelet transforms compared to PCA and LDA. In addition to
these two approaches, this work also shows that a CNN based feature
extraction approach for face recognition outperforms both
Radon and wavelet transforms for the ORL database. However, this
should be validated for the FERET database, where the images are
in different poses. The CNN algorithm should be able to detect
the pose, and then apply the appropriate template to extract
the relevant feature set.

Future work should focus on running the recognition algorithm
over videos, as many applications demand real
time recognition. Further, such a system may be integrated
in a driver assistance system to either recognize the driver of a
car, or extract facial expressions that may provide information
about the driver's mood or fatigue.

REFERENCES 

[1] I. Guyon and A. Elisseeff, “An Introduction to Feature Extraction,” 

Zurich Research Laboratory, (2004). 

[2] M. Aleemuddin, “A Pose Invariant Face Recognition system using 

Subspace Techniques,” Deanship of Graduate studies, (2004). 

[3] G. Shakhnarovich and B. Moghaddam, Face Recognition in Subspaces. 

Springer-verlag, May (2004). 

[4] M. Kirby and L. Sirovich, “Application of the karhunen-loeve procedure 

for the characterization of human faces,” IEEE Trans. Pattern Anal. 

Mach. Intell., vol. 12, no. 1, pp. 103–108, (1988). 

[5] A. SATO, H. IMAOKA, T. SUZUKI, and T. HOSOI, “Advances in Face 

Detection and Recognition Technologies,” NEC Journal of Advanced 

Technology, vol. 2, no. 1, (2005). 

[6] O. Toygar and A. Acan, “Face Recognition using PCA, LDA and ICA 

approaches on colored images,” Electrical and Electronics engineering, 

vol. 3, no. 1, pp. 735–743, (2003). 

[7] R. Brunelli, T. Poggio, and I. P. Trento, “Face recognition through 

geometrical features,” in European Conference on Computer Vision 

(ECCV), pp. 792–800, (1992). 

[8] R. Brunelli and T. Poggio, “Face recognition: Features vs. templates.,” 

IEEE Transactions on Pattern Analysis and Machine Intelligence, 

vol. 15, no. 10, pp. 1042–1052, (1993). 

[9] B. J. Lei, E. A. Hendriks, and M. Reinders, “On Feature Extraction from 

Images,” Technical Report on MCCWS project, (1999). 

[10] Q. W. Yan CHEN and X. HE, “Human Action Recognition by Radon 

Transform,” IEEE International Conference on Data Mining Workshops, 

May (2008). 

[11] N. Shams, I. Hosseini, M. Sadri, and E. Azarnasab, “Low cost FPGA-based
highly accurate face recognition system using combined wavelets
with subspace methods,” pp. 2077–2080, (2006).

[12] P. N. Belhumeur, J. a. P. Hespanha, and D. J. Kriegman, “Eigenfaces 

vs. fisherfaces: Recognition using class specific linear projection.,” IEEE 

Transactions on Pattern Analysis and Machine Intelligence, vol. 19, 

pp. 711–720, July (1997). 

[13] W. S.Yambor, “Analysis of PCA based and Fisher discriminant based 

image recognition algorithms,” Degree of Master of Science, (2000). 

[14] B. A. Draper, K. Baek, M. S. Bartlett, and J. R. Beveridge, “Recognizing 

faces with pca and ica,” Comput. Vis. Image Underst., vol. 91, no. 1-2, 

pp. 115–137, (2003). 

[15] P. N.Belhumeur, J. P.Hespanha, and D. J.Kriegman, “Eigenfaces vs. 

Fisherfaces:Recognition Using Class Specific Linear Projection,” vol. 19, 

no. 7, (1997). 

[16] J. Kim, M.-J. Choi, M.-J. Yi, and M. Turk, “Effective representation 

using ica for face recognition robust to local distortion and partial 

occlusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 12, 

pp. 1977–1981, (2005). 

[17] A. Hyvärinen, “The Fixed-Point Algorithm and Maximum Likelihood
estimation for Independent Component Analysis,” pp. 1–5, (1999).

[18] W. S. Y. B. A. D. J. R. Beveridge, “Analyzing PCA-based Face 

Recognition Algorithms: Eigenvector Selection and Distance Measures,” 

Department of Computer Science, (2000). 

[19] P. N.Belhumeur, J. P.Hespanha, and D. J.Kriegman, “Eigenfaces vs. 

Fisherfaces:Recognition Using Class Specific Linear Projection,” European 

Conference on Computer Vision, (1996). 

[20] W.Zhao, R.Chellappa, and P.J.Phillips, “Subspace Linear Discriminant 

Analysis for Face Recognition,” Department of Electrical and Electronic 

Engineering, (1999). 

[21] C. Garcia, G. Zikos, and G. Tziritas, “A wavelet-based framework for 

face recognition,” in Workshop on Advances in Facial Image Analysis 

and Recognition Technology, 5 th European Conference on Computer 

Vision, pp. 84–92, Publications, (1998). 

[22] M. I. M. D. Fatma H. Elfouly, Mohamed I. Mahmoud and S. Deyab, 

“Comparison between haar and daubechies wavelet transformations on 

fpga technology,” International Journal of Computer, Information, and 

Systems Science, and Engineering, vol. 2, no. 1, pp. 1047–1061, (2006). 

[23] C. C. LIU, D. Q. Dai, and H. Yan, “Local Discriminant Wavelet 

Packet Coordinates for Face Recognition,” Journal of Machine learning 

Research, pp. 1165–1195, May (2007). 

[24] T. R. Tibor Kozek and L. . Chua, “Genetic Algorithm for CNN Template 

Learning,” IEEE Transactions on circuits and systems, vol. 40, no. 6, 

(1993). 

[25] L. Chua and T. Roska, Cellular neural networks and visual computing: 

foundations and applications. Cambridge University Press, (2005).


[26] P. Phillips, H. Wechsler, J. Huang, and P. Rauss, “The feret database and
evaluation procedure for face-recognition algorithms,” vol. 16, pp. 295–306,
April (1998).


ARTIFICIAL HUMAN LIMBS – A DESIGN APPROACH FOR MILITARY 

APPLICATION 

Abstract— The most essential purposes of automation are
saving human lives, safeguarding belongings, protecting
property and arranging tasks in a systematic way.
This paper deals with the design of a real
time human limb which acts according to the design
configuration of the prescribed data as per the calibrated
sensors. This research proposes to overcome current
limitations using three axis optimal inertial sensors
combined with an Embedded Controller on which the
filter algorithm as well as the analog to digital converter is
implemented for correcting drift and angular motion
through all orientations. The mechanical design will have
miniature or hybrid stepper motors with associated
mechanical elements to move the limbs on all the axes, such as
up/down, roll, elevation and azimuth.

R. Karthikeyan, Department of EIE, Veltech, Member IEEE,
rkarthiekeyan@gmail.com

Anitha Karthikeyan, Department of ECE, Meenakchi College of Engineering,
mrs.anithakarthikeyan@gmail.com

S. Sivaperumal, Department of ECE, Vel HIGHTECH SRS Engineering College,
sivaperumals@gmail.com

I. INTRODUCTION

Networked synthetic environments
(SE) stand to revolutionize the fields of education, training,
business, retailing and entertainment. They will
fundamentally alter our societies and the way in which
mankind views the world. In the educational field, synthetic
environments will offer the ultimate in hands-on learning and
visualization of difficult concepts. They will allow training
to transpire in a place much like that in which the skills
being practiced will be used, without exposure to possible
hazards and at less cost. In the workplace, employees will be
able to work “side by side” even though they may be
physically separated by hundreds or even thousands of
miles [Durlach-1995]. Using synthetic environments,
corporations will obtain a safe, economical and efficient
method of testing new concepts and systems. Retailers will
create virtual department stores where consumers will be
able to try out products to an unprecedented degree before
actually buying them.

Using synthetic environments, the entertainment
industry will be able to create entire worlds in which
customers will be able to experience thrills and live out
entire fantasy lives [Zyda-1997]. The power of the synthetic
environment lies in its ability to immerse users in a different
world. The more complete the immersion, the more effective
the synthetic environment. For complete immersion, the user
should sense and interact with the synthetic environment in
the same manner in which interaction with the natural world
takes place. Interaction in the natural world results from
body motion. Information regarding the surrounding
environment is obtained through the five senses. Changes in
body posture and position directly affect what is seen, heard,
felt and smelled [Mavor-1995].

The parameters sensed in the environment are altered and
manipulated by the actions of the body. Thus, in order for a
user to interact with a synthetic environment in a natural
way and have the synthetic environment present appropriate
information to the senses, it is imperative that data regarding
body motion and posture be obtained [Skopowski, 1996].
Body posture and location data are also needed in multi-user
environments to drive the animation of avatars which
represent the actions of users of the environment to each
other. At this time, there is no practical and intuitive
interface that allows an individual human to be inserted into
an SE in a fully immersive manner [Badler, N., 1993].

Numerous motion tracking technologies are currently in
use, but each suffers from its own set of limitations.
Depending on the technology, these limitations may include
marginal accuracy, user encumbrance, restricted range,
susceptibility to interference and noise, poor registration,
occlusion difficulties and high latency. Due to these
problems, real-time animations of avatars must be largely
script-based using motion libraries. For the most part, only a
single user may be tracked in a small working volume. Thus,
none of the current technologies fulfills the need for wide-area
tracking of multiple users. The ideal motion tracking
technology must meet several requirements. It should have
low latency, be tolerant to noise and other environmental
interference, track multiple users and maintain both
adequate accuracy and registration throughout a large
working volume [MoletAubel-1999].

The primary reason current tracking systems fail to 

meet the requirements described above is the dependence of 

these systems on a generated “source” to determine 

orientation and location information. This source may be 

sent by transmitters to body-based receivers or it may be 

sent from body-based transmitters to receivers positioned at 

known locations throughout the working volume. Usually


the effective range of this source is extremely limited or 

there may be compromises between resolution and range. 

Interference with or distortion of this source will at best 

result in erroneous orientation and position measurements. 

II. MOTIVATION

Motion tracking technologies currently fail to
provide accurate wide area tracking of multiple users
without interference and occlusion problems. This research
proposes to overcome current limitations using three axis
optimal inertial sensors combined with an Embedded
Controller on which the filter algorithm as well as the analog to
digital converter is implemented for correcting drift and
angular motion through all orientations. The mechanical
design will have miniature or hybrid stepper motors with
associated mechanical elements to move the limbs on all the
axes, such as up/down, roll, elevation and azimuth. An
appropriate electronic circuit is used for isolation between
the stepper motors and the Embedded Controller in the
computer.

The electronic system is suitable for up to 5 A, as needed for a
70 kg cm stepper motor, but in this research the stepper motor
used is only 7 kg cm. Joint angle determination for robots
with flexible links is difficult. The use of Bluetooth technology
will enable sensors to wirelessly transmit data from body
extremities to the wearable PC. Inertial orientation tracking
combined with RF positioning is also tried to provide an
accurate method for determining orientation and location. This
paper describes a system designed to determine the posture of an
articulated body in real time. Finally, this work describes
the design, implementation and calibration algorithm for the
sensors, and the testing of the inertial tracking system of a human limb
segment.

IV. OBJECTIVES

Based on the above discussion, the objectives of the present
research work are:

• Orientation tracking of human limb segments using three
axis inertial sensors.

• Calibration of individual sensors without the use of any
specialized equipment.

• Sufficient dynamic response and update rate (100 Hz or
better) to capture faster human body limb motion.

• Ability to change the rotation of the three stepper motors
according to the assigned threshold value.

• Three sensors are attached to the human limb. If the
threshold value is 360 or below, the three
stepper motors rotate in the forward direction. The
axis direction and the data of the three sensors are also displayed
graphically on the computer as per the limb movement
of the human body.

• Three sensors are attached to the human limb. If the
threshold value is 400 or above, the three
stepper motors rotate in the reverse direction. The
axis direction and the data of the three sensors are also displayed
graphically on the computer as per the limb movement
of the human body.

• If the sensors are not attached to the human limb, the
threshold value, the rotation of the stepper motors, the axis
direction and the data of the three sensors
should all remain in the initial condition.

• Automatic accounting for the peculiarities related to the
mounting of a sensor on an associated limb segment.

• Creation of data files for recording data relating to the limb
as per the embedded software filter.

• Use of Bluetooth technology to enable sensors to
wirelessly transmit data from body extremities to the
wearable PC.

• Use of RF positioning, which is also tried, to provide an accurate
method for determining orientation and location.

• Description of the design, implementation and calibration of
the sensors, and testing of the inertial tracking system of a
human limb segment.
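The threshold behaviour described in the objectives above can be sketched as a small decision function. This is an illustration, not the embedded firmware: the two limits are taken from the text, while treating a reading of zero as the initial (stopped) condition and holding the motor for readings between the two limits are assumptions made here, since the text does not specify them. The function name is hypothetical.

```python
FORWARD_MAX = 360   # "360 and below" from the objectives
REVERSE_MIN = 400   # "400 and above" from the objectives

def motor_direction(reading):
    """Map one sensor reading to the state of its stepper motor."""
    if reading == 0:
        return "Stopped"   # assumption: zero reading means the initial condition
    if reading <= FORWARD_MAX:
        return "Forward"
    if reading >= REVERSE_MIN:
        return "Reverse"
    return "Stopped"       # assumption: no rotation between the two limits
```

For example, the sensor triple (110, 0, 0) maps to forward rotation of the first motor with the other two stopped.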

III. DESIGN EQUATIONS OF LIMB

Force Calculations of Joints

The point of doing force calculations is motor selection.
We must make sure that the motor we choose can support not only
the weight of the robot arm, but also what the robot
arm will carry (the blue ball in the image below).

The first step is to label the free body diagram (FBD, Diagram 1), with the robot
arm stretched out to its maximum length.

Diagram 1

Next we do a moment arm calculation, multiplying each
downward force by its linkage length. This calculation
must be done for each lifting actuator. This particular design
has just two degrees of freedom that require lifting,
and the center of mass of each linkage is assumed to be
at Length/2.

Torque About Joint 1: 

M1 = L1/2 * W1 + L1 * W4 + (L1 + L2/2) * W2 + (L1 + L3) * W3


Torque About Joint 2: 

M2 = L2/2 * W2 + L3 * W3 
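The two torque expressions can be evaluated directly for motor selection. The sketch below transcribes them as written; the meaning of each length L1 to L3 and weight W1 to W4 is defined by Diagram 1, which is not reproduced here, so the argument names simply mirror the symbols and the function name is hypothetical.

```python
def joint_torques(L1, L2, L3, W1, W2, W3, W4):
    """Return (M1, M2): the static torques about joint 1 and joint 2."""
    M1 = L1 / 2 * W1 + L1 * W4 + (L1 + L2 / 2) * W2 + (L1 + L3) * W3
    M2 = L2 / 2 * W2 + L3 * W3
    return M1, M2
```

With lengths in cm and weights in kg the result is in kg cm, the unit used for the stepper motor ratings in this paper, so each value can be compared against a candidate motor's rated torque.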

Forward Kinematics 

Forward kinematics is the method for determining the
orientation and position of the end effector, given the joint
angles and link lengths of the robot arm. Here we calculate
the end effector location of our robot arm from given joint
angles and link lengths.

Diagram 2

Assume that the base is located at x=0 and y=0. The first
step is to locate x and y of each joint as shown in
Diagram 2.

Joint 0 (with x and y at base equaling 0): 

x0 = 0 

y0 = L0 

Joint 1 (with x and y at J1 equaling 0): 

cos(psi) = x1/L1 => x1 = L1*cos(psi) 

sin(psi) = y1/L1 => y1 = L1*sin(psi) 

Joint 2 (with x and y at J2 equaling 0): 

sin(theta) = x2/L2 => x2 = L2*sin(theta) 

cos(theta) = y2/L2 => y2 = L2*cos(theta) 

End Effector Location (make sure your signs are correct):

x = x0 + x1 + x2 = 0 + L1*cos(psi) + L2*sin(theta)

y = y0 + y1 + y2 = L0 + L1*sin(psi) + L2*cos(theta)

z equals alpha, in cylindrical coordinates
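The end effector location can be computed with a few lines of Python. The sketch follows the joint-by-joint expressions above, with angles in radians and the angle conventions of Diagram 2 taken as given; the function name is hypothetical and the z (alpha) coordinate is omitted.

```python
import math

def forward_kinematics(L0, L1, L2, psi, theta):
    """Return (x, y) of the end effector from link lengths and joint angles."""
    x = 0.0 + L1 * math.cos(psi) + L2 * math.sin(theta)   # x0 + x1 + x2
    y = L0 + L1 * math.sin(psi) + L2 * math.cos(theta)    # y0 + y1 + y2
    return x, y
```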

Inverse Kinematics 

Inverse kinematics is the opposite of forward kinematics.
Here we have a desired end effector position, but
need to know the joint angles required to achieve it. The
robot sees a kitten and wants to grab it: what angle should
each joint go to? Although far more useful than forward
kinematics, this calculation is also much more complicated.

psi = arccos((x^2 + y^2 - L1^2 - L2^2) / (2 * L1 * L2))

theta = arcsin((y * (L1 + L2 * c2) - x * L2 * s2) / (x^2 + y^2))

where c2 = (x^2 + y^2 - L1^2 - L2^2) / (2 * L1 * L2);

and s2 = sqrt(1 - c2^2);
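The inverse kinematics expressions can be checked numerically with the sketch below. It assumes the usual planar two-link convention, which the text does not state explicitly: theta is the angle of the first link from the x axis and psi is the angle of the second link relative to the first. Under that assumption the helper `two_link_position` (a hypothetical name, as is `inverse_kinematics`) gives the position the two angles should reproduce. Because arcsin is used as in the equations, theta is only recovered between -90 and +90 degrees.

```python
import math

def inverse_kinematics(x, y, L1, L2):
    """Return (psi, theta) in radians for a target (x, y), per the equations above."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    if abs(c2) > 1:
        raise ValueError("target is outside the reachable workspace")
    s2 = math.sqrt(1 - c2 * c2)
    psi = math.acos(c2)
    theta = math.asin((y * (L1 + L2 * c2) - x * L2 * s2) / (x * x + y * y))
    return psi, theta

def two_link_position(L1, L2, psi, theta):
    """Position of the end effector under the assumed relative-angle convention."""
    x = L1 * math.cos(theta) + L2 * math.cos(theta + psi)
    y = L1 * math.sin(theta) + L2 * math.sin(theta + psi)
    return x, y
```

A target farther away than L1 + L2 raises an error, which corresponds to the zero-solution case.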

There are very likely multiple, sometimes infinitely many,
solutions (as shown below). How
would the arm choose which is optimal, based on torques,
previous arm position, gripping angle, etc.? There is also the
possibility of zero solutions. Maybe the location is outside
the workspace, or maybe the point within the workspace
must be gripped at an impossible angle (Diagram 3).

Diagram 3

Singularities, places of infinite acceleration, can blow up
equations and/or leave motors lagging behind (motors cannot
achieve infinite acceleration).

And lastly, exponential equations take a very long time to calculate
on a microcontroller. There is no point in having advanced equations
on a processor that cannot keep up.

Motion Planning 

Motion planning on a robot arm is fairly complex, so only
the basics are given here.

Diagram 4

Suppose the robot arm has objects within its workspace
(Diagram 4). How does the arm move through the workspace
to reach a certain point? To do this, assume the robot arm is
just a simple mobile robot navigating in 3D space. The end
effector will traverse the space just like a mobile robot,
except now it must also make sure the other joints and links
do not collide with anything. This is extremely difficult
to do.

What if you want the robot end effector to draw straight lines
with a pencil? Getting it to go from point A to point B in a
straight line is relatively simple to solve. What the robot
should do, by using inverse kinematics, is go to many points
between point A and point B. The final motion will come
out as a smooth straight line. We can use this
method not only with straight lines, but with curved ones too. On
expensive professional robotic arms all we need to do is
program two points, and tell the robot how to go between
the two points (straight line, fast as possible, etc.).
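The idea of visiting many intermediate points between A and B can be sketched as a simple linear interpolation. The function name and the number of steps are assumptions for illustration; each returned waypoint would then be passed to an inverse kinematics routine to obtain the joint angles, which is not shown.

```python
def straight_line_waypoints(a, b, steps):
    """Return steps + 1 evenly spaced (x, y) points from a to b, inclusive."""
    (ax, ay), (bx, by) = a, b
    return [
        (ax + (bx - ax) * k / steps, ay + (by - ay) * k / steps)
        for k in range(steps + 1)
    ]
```

More steps give a path that stays closer to the ideal straight line, at the cost of more inverse kinematics evaluations on the controller.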

Velocity (and more Motion Planning) 

Calculating end effector velocity is mathematically complex,
so we will go only into the basics. The simplest way to do it
is to assume that the robot arm (held straight out) is a rotating
wheel whose radius is the arm length L. If the joint rotates at Y rpm, then
the velocity is

Velocity of end effector on straight arm = 2 * pi * radius * rpm

However, the end effector does not just rotate about the base,
but can go in many directions. The end effector can follow a
straight line, or a curve, etc.
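As a small worked example of the velocity formula, the sketch below evaluates it for a straight arm. The function name is hypothetical; because the rotation rate is given in revolutions per minute, the result is a distance per minute in whatever length unit is used for the radius.

```python
import math

def end_effector_speed(radius, rpm):
    """Tip speed of a straight arm: circumference travelled per minute."""
    return 2 * math.pi * radius * rpm
```

Dividing the result by 60 gives the speed per second; for instance, a 0.5 m arm turning at 60 rpm moves its tip at about 3.14 m/s.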

With robot arms, the quickest way between two points is
often not a straight line. If two joints have two different
motors, or carry different loads, then their maximum velocities can
differ. When we tell the end effector to go from one
point to the next, we have two options: have it follow a
straight line between both points, or tell all the joints to go
as fast as possible, leaving the end effector to possibly
swing wildly between those points.

In Diagram 5 the end effector of the robot arm is moving
from the blue point to the red point. In the top example, the
end effector travels a straight line. This is the only possible
motion this arm can perform to travel a straight line. In the
bottom example, the arm is told to get to the red point as fast
as possible. Given many different trajectories, the arm follows
the one that allows the joints to rotate the fastest.

Diagram 5

There are many deciding factors in selecting the best method.
Usually we want straight lines when the object the arm
moves is really heavy, as moving it requires a large change in
momentum (momentum = mass * velocity). But for
maximum speed (perhaps the arm isn't carrying anything, or
just light objects) we would want maximum joint speeds.

Now suppose we want the robot arm to operate at a certain
rotational velocity. How much torque would a joint need?
First, let us go back to our free body diagram
(Diagram 6):

Diagram 6

Now suppose we want joint J0 to rotate 180 degrees in
under 2 seconds. What torque does the J0 motor need?
J0 is not affected by gravity, so all we need to consider is
momentum and inertia. Putting this in equation form we get:

torque = moment_of_inertia * angular_acceleration

Breaking that equation into sub-components we get:

torque = (mass * distance^2) * (change_in_angular_velocity / change_in_time)

and

change_in_angular_velocity = (angular_velocity1) - (angular_velocity0)

angular_velocity = change_in_angle / change_in_time

Now assuming that angular_velocity0 is zero at start time 0,
we get

torque = (mass * distance^2) * (angular_velocity / change_in_time)

where distance is defined as the distance from the rotation
axis to the center of mass of the arm:

center of mass of the arm = distance = 1/2 * (arm_length)
(use arm mass)

but we also need to account for the object the arm holds:

center of mass of the object = distance = arm_length
(use object mass)

So we calculate the torque for the arm and then again for
the object, then add the two torques together for the total:

torque(of_object) + torque(of_arm) = torque(for_motor)

And of course, if J0 were additionally affected by gravity, we would
add the torque required to lift the arm to the torque required
to reach the velocity needed.
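The torque estimate above can be worked through numerically. The sketch follows the equations as written, treating the arm and the object each as a point mass at its center of mass; the masses and arm length in the usage note are made-up illustrative values, the function names are hypothetical, and SI units are assumed (kg, m, rad, s, giving N m).

```python
import math

def point_mass_torque(mass, distance, angle, duration):
    """torque = (mass * distance^2) * (angular_velocity / change_in_time)."""
    angular_velocity = angle / duration
    angular_acceleration = angular_velocity / duration  # starting from rest
    return mass * distance ** 2 * angular_acceleration

def j0_motor_torque(arm_mass, object_mass, arm_length, angle, duration):
    """Sum the torque for the arm (at arm_length / 2) and the object (at arm_length)."""
    torque_of_arm = point_mass_torque(arm_mass, arm_length / 2, angle, duration)
    torque_of_object = point_mass_torque(object_mass, arm_length, angle, duration)
    return torque_of_object + torque_of_arm
```

For a hypothetical 2 kg arm of length 1 m holding a 1 kg object, rotating 180 degrees (pi radians) in 2 seconds, `j0_motor_torque(2, 1, 1, math.pi, 2)` evaluates to 3 * pi / 8, roughly 1.18 N m.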

V. IMPLEMENTATION OF INERTIAL TRACKING OF
HUMAN LIMB SEGMENTS

The implementation of inertial tracking of a human limb
segment is shown in figure 1. Three inertial sensors are
mounted on the body of the human limb segment. The
analog output from the limb is 20 mV for adults (5 mV for
infants). It is fed to a signal conditioner circuit which, for better
ADC accuracy, amplifies the 20 mV signal to 5 V; the
corresponding amplifier gain is 5000/20 = 250.

The 5 V analog output is given to a filter circuit, which provides
high speed noise filtering with a constant frequency of
approximately 100 Hz, and then to the Embedded Controller on
which the software filter algorithm as well as the analog to
digital converter is implemented. The output is digitized by
the associated inbuilt A/D converter. The digitized output
from the Embedded Controller is connected to the PC through
an RS 232 converter. All data processing and calculations
are performed by software running on this single processor.

An appropriate electronic optocoupler circuit is used for
isolation between the three stepper motors and the Embedded
Controller. The driver circuit drives the three stepper motors
in different directions. The electronic system is suitable
for up to 5 A, as needed for a 70 kg cm stepper motor, but in this research the
stepper motor used is only 7 kg cm. The rotation depends
upon the human limb movement on all the axes, such as up/down,
roll, elevation and azimuth, as per the assigned threshold
value. The threshold value, the direction of the three stepper motors,
the axis movement, the graphical representation and the sensing
system data can all be displayed on the
monitor by means of a program written in the C programming language. The
optimal filter theory is implemented in the filter software on the Flash
Embedded Controller, and the visual simulation software runs on
a single standard Pentium III processor in the computer
(R. H. Barnett and L. O’Cull, 2004). The use of Bluetooth technology will
enable sensors to wirelessly transmit data from body
extremities to the wearable PC. Inertial orientation tracking
combined with RF positioning is also tried to provide an
accurate method for determining orientation and location.
Finally, the overall hardware kit of the prototype sensor system is
shown in figure 2.

Static Stability of the system 

Figure 3 plots the magnitude of the quaternion 

filter criterion function versus time. The drift characteristics 

of the quaternion filter algorithm and the MARG sensor 

over extended periods were evaluated using static tests. 

Average total drift is about 1%. During the experiment shown, the filter gain k was set to unity. It is expected that increasing the filter gain to 4.0 would reduce the drift error by a factor of four, to about 0.25 percent. Further

experiments indicated that nearly all drift was due to bias in 

the rate sensors. Experiments are currently underway using 

improved sensors containing rate-sensor capacitive coupling 

conditioning circuitry designed to remove these biases. 

Dynamic Response of the system 

Preliminary experiments were conducted to establish the 

accuracy of the orientation estimates and the dynamic 

response of the system. The preliminary test procedure 

consisted of repeatedly cycling the sensor through various 

angles of roll, pitch and yaw at rates ranging from 10 to 30 

deg./sec. Accuracy was measured to be better than one 

degree. The overall smoothness of the plot shows excellent 

dynamic response. 

VI. EXPERIMENTAL TEST RESULTS OF HUMAN LIMB SEGMENT

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Table 1. Threshold value and axis rotation for sensors S1, S2, S3 and stepper motors M1, M2, M3: 400 and above, Reverse; 360 and below, Forward.

Table 2. Simulation Results of Human Limb Segment

Stepper   Stepper   Stepper   Sensor  Sensor  Sensor  Axis      Threshold
Motor M1  Motor M2  Motor M3  S1      S2      S3      Rotation  Value
Forward   Stopped   Stopped   110     000     000     Forward   360 and below
Stopped   Forward   Stopped   000     160     000     Forward   360 and below
Stopped   Stopped   Forward   000     000     141     Forward   360 and below
Reverse   Stopped   Stopped   450     000     000     Reverse   400 and above
Stopped   Reverse   Stopped   000     580     000     Reverse   400 and above
Stopped   Stopped   Reverse   000     000     650     Reverse   400 and above
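The threshold behaviour summarized in Table 2 can be sketched in a few lines of Python. This is only an illustration: the function names, the treatment of a zero reading as an idle sensor, and the dead band between 360 and 400 are assumptions inferred from the table, not details given in the text (the display software described above is written in C).

```python
# Illustrative sketch only; names and the dead-band handling are assumptions.
FORWARD_MAX = 360   # readings at or below this value map to forward rotation
REVERSE_MIN = 400   # readings at or above this value map to reverse rotation


def motor_direction(reading):
    """Map one sensor reading to a stepper motor command."""
    if reading == 0:
        return "Stopped"      # assumed: a zero reading means the sensor is idle
    if reading <= FORWARD_MAX:
        return "Forward"
    if reading >= REVERSE_MIN:
        return "Reverse"
    return "Stopped"          # assumed dead band between the two thresholds


def motor_commands(s1, s2, s3):
    """Commands for motors M1, M2, M3 from sensors S1, S2, S3."""
    return tuple(motor_direction(r) for r in (s1, s2, s3))


if __name__ == "__main__":
    # Sensor triples taken from the rows of Table 2.
    for row in [(110, 0, 0), (0, 160, 0), (0, 0, 141),
                (450, 0, 0), (0, 580, 0), (0, 0, 650)]:
        print(row, motor_commands(*row))
```

Each sensor drives its own motor independently, which matches the one-motor-at-a-time pattern of the rows in Table 2.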

The above results were obtained with the hardware and software running at an update rate of 100 Hz. The roll, pitch and yaw test results are presented in Figures 4, 5, 6 and 7. The smoothness of the graphs indicates excellent dynamic response. It is expected that adjusting the filter gain values would improve the overall accuracy and dynamic response. The transition times observed in the plots are around 4.5-5 seconds, as expected for a rotation of 45 degrees at 10 degrees per second. In qualitative tests, the system was able to track the limb segment through all orientations, including those in which pitch equaled 90 degrees; such orientations normally cause singularities in Euler angle filters. The qualitative tests also show that the system could easily be combined with a simulation program and track motion in real time.

The purpose of the human body tracking system is to 

estimate the orientation of multiple human limb segments 

and use the resulting estimates to set the posture of the 

human body model that is visually displayed. Numerous 

experiments were conducted to qualitatively evaluate and 

demonstrate this capability. 

In each experiment three sensors were attached to the limb segments to be tracked. Due to the small number of sensors available, tracking was limited to a single arm or leg. For arm and leg segments, the sensors were attached with elastic bandages. In most cases this method appeared to keep the sensors fixed relative to the limb. Body tracking was also performed using various gains.

VII. CONCLUSIONS

This research has demonstrated an alternative technology for tracking the posture of an articulated rigid body. A high-speed Embedded Controller avoids electronic complexity, Bluetooth technology enables sensors to wirelessly transmit data from the body to the PC, and inertial sensors determine the orientation of each link in the rigid body. RF positioning complements the sourceless capability of inertial sensing and enables tracking of multiple users over a wide area. At the core of the system is an efficient complementary filter that uses a quaternion representation of orientation; the filter can continuously track the orientation of human body limb segments (Robert B. McGhee, 2000). Drift corrections are also made. By using an Embedded Controller, this research avoids much of the analysis and calculation required in previous work. An embedded software filter processes the data from three-axis inertial sensors attached to the human limb segment. Sensor calibration is achieved without any specialized equipment. An accurate calibration algorithm compensates for the misalignment between the sensor and limb segment coordinate axes. Hybrid stepper motors with associated mechanical elements are used to move the limbs on all the axes (up/down, roll, elevation and azimuth). The implemented system tracks human limb segments accurately with a 100 Hz update rate. Experimental results demonstrate that inertial orientation estimation is a practical method of tracking human body posture. With additional sensors, the architecture could easily be scaled to full-body tracking. Due to its sourceless nature, inertial tracking could overcome many of the limitations of motion tracking technologies currently in widespread use. It is potentially capable of providing wide-area tracking of multiple users for synthetic environment and augmented reality applications.




A Novel Image Processing Approach Combining a ‘Coupled 

Nonlinear Oscillators’-based Paradigm with Cellular Neural 

Networks for Dynamic Robust Contrast Enhancement 

Kyandoghere Kyamakya( 1 ), Cyrille Kalenga Wa Ngoy ( 2 ), Michel Matalatala Tamasala ( 2 ), Jean Chamberlain Chedjou ( 1 ) 

( 1 ): Transportation Informatics Group, Institute of Smart Systems Technologies, University of Klagenfurt (Austria), 

Email: kyandoghere.kyamakya@uni-klu.ac.at ; jean.chedjou@uni-klu.ac.at 

( 2 ): Department of Electrical and Computer Engineering, Polytechnic Faculty, University of Kinshasa (D. R. Congo) 

Email: Cyrille.Kalenga@vodacom.cd ; Michel.Matalatala@vodacom.cd 

Abstract: In this paper, the pros and cons of two well-known traditional approaches to image contrast enhancement are discussed systematically. The first approach is based on the CNN paradigm and the second one is based on the coupled nonlinear oscillators’ paradigm for image processing. In the latter case an extensive bifurcation analysis is carried out and analytical formulas are derived to define the various states of the system. Both equilibrium and oscillatory states of the system are depicted. It is shown that each of these states has a significant impact on the quality of the resulting image contrast enhancement. A benchmark is carried out in which the results obtained by CNN-based processing are compared with those obtained by ‘coupled nonlinear oscillators’ based processing. The superiority of the latter approach (for contrast enhancement) is demonstrated both analytically and through various experiments. A major drawback of CNN-based image processing is the practical inability to adjust/re-calculate templates in real time when facing a dynamic scene whose input images experience visibility- and/or lighting-related spatio-temporal dynamics. Finally, a novel hybrid approach integrating both schemes in an efficient way is proposed: the ‘coupled nonlinear oscillators’ based image processing is the main processing scheme, which is however realized on top of a CNN processors’ framework. The hybrid approach overcomes key practical problems faced by both original approaches.

Keywords: Cellular neural networks (CNN), Nonlinear coupled 

oscillators, van der Pol oscillator, Duffing oscillator, contrast 

enhancement, stability, bifurcation, Routh-Hurwitz theorem 


I. INTRODUCTION

The last decades have witnessed tremendous attention devoted to the study of nonlinear coupled oscillators [2]-[17], with related applications in diverse areas such as electrical engineering [18], [19], mechanics [15], electromechanics [14] and electronics [16], to name a few. In previous works [18]-[21], we have shown some interesting applications of the coupling between van der Pol and Duffing oscillators in both electronics and electromechanics. Further, the recent literature contains a good number of notable contributions showing various applications of the paradigm of nonlinear dynamics in image processing [1]-[13]: (a) the use of the CNN paradigm for contrast enhancement [1], edge detection [11], [17] and image segmentation [2]-[10], [12], [13]; and (b) the use of the so-called LEGION model (involving nonlinear coupled oscillators), mainly for image segmentation [3]. However, the relevant literature does not provide sufficient information concerning the application of nonlinear coupled/uncoupled oscillators in image processing, especially for the specific task of contrast enhancement. In fact, only a single paper can be found in which image contrast enhancement has been done using this latter paradigm [1].

In contrast, the cellular neural network paradigm has shown, through numerous publications, its rich potential to solve many important low-level image processing tasks, e.g. image contrast enhancement [22]-[24], edge detection [25] and segmentation [26], [27], to name a few. Although the CNN paradigm offers an ideal framework for parallel and therefore ultrafast image processing, some important related issues still need a better theoretical foundation. One of these open questions is that of a comprehensive and straightforward methodology to derive appropriate CNN templates for a given image processing task. Currently known approaches determine the templates through a form of supervised learning, most commonly using genetic algorithms, simulated annealing or particle swarm optimization. A template obtained through such a ‘supervised learning’-like approach therefore depends strongly on the reference image(s) used. Consequently, this traditional way of calculating templates fails in a dynamic environment, which would require an adaptive, real-time determination/re-calculation of the appropriate templates in reaction to visibility- and lighting-related environmental changes. Indeed, for a specific processing task (e.g. contrast enhancement, segmentation, etc.), the optimal CNN templates must be adjusted/recalculated depending on the varying input image.

An important open issue, not yet answered by the relevant scientific community, is therefore that of providing a comprehensive, robust and general framework that allows a real-time adaptation of the CNN templates for a specific image processing task to variations in all aspects of the input image(s). It is known that CNN templates are very sensitive to the quality of the input image and must be adjusted for dynamic images to achieve optimal processing. The supervised learning template calculation paradigm is therefore not appropriate when the input image experiences visibility-related temporal dynamics; it is almost impossible to recalculate the templates in real time.

Thus, a key objective of this paper is to propose an image processing framework (in this case for “contrast enhancement”) that is robust to both the temporal and the spatial changes in the quality of the input image(s). The novel approach proposed here combines the paradigm of coupled nonlinear oscillators with that of cellular neural networks. It is shown how this integration is best realized. It is then demonstrated that the new architectural framework results in invariant templates while still being capable of robustly adapting the image processing to the spatio-temporal dynamics of the input image.

The nonlinear coupled oscillator system model used in this paper consists of the coupling between van der Pol and Duffing type oscillators. The focus is on the application of this coupled system to the specific image processing task of “contrast enhancement”. Contrast enhancement is an important issue in difficult and dynamic visual environments such as the ones faced by advanced driver assistance systems (ADAS), where it could help improve the image quality or the visibility in real time.

We propose implementing the coupled nonlinear oscillators’ image processing concept on top of a cellular neural network framework. Here, the CNN processors are viewed as a slave system used to solve, in real time, the nonlinear ordinary differential equations describing the coupled nonlinear oscillators’ model. Image processing based on coupled nonlinear oscillators has one very strong feature: its processing efficiency depends neither on the actual image quality nor on lighting variations or states, but solely on the coefficients of the nonlinear differential equations describing the coupled oscillators’ model. The appropriate coefficients/parameters of the coupled nonlinear oscillators are determined in an offline bifurcation analysis process, which is explained further in this paper. The resulting challenge is then that of solving these nonlinear differential equations in real time. Note that these ordinary differential equations (ODEs) have constant coefficients, which have been selected, as explained above, from the results of the bifurcation analysis.

Therefore, the problem setting for the CNN processor system, on top of which the coupled oscillators will be implemented, is that of solving in real time a set of highly stiff nonlinear differential equations with constant coefficients. The input images are the frames, which are taken as initial conditions for the coupled oscillator system. The real-time constraint is determined by the actual frame rate. The new key challenge therefore becomes that of determining the appropriate templates for solving the set of stiff nonlinear differential equations. This has been an open issue in the relevant literature. We were, however, able to address and efficiently solve this challenging issue in a subsequent work. The results obtained are presented in full detail in another paper that we also publish in the conference proceedings of CNNA 2010, entitled “CNN-based Real-time Computational Engineering” (see [28]).

The previous explanations clearly highlight how the two concepts can be combined into a powerful and highly efficient real-time processing framework: a ‘coupled nonlinear oscillators’ based image processing scheme on top of a CNN processors framework.

Contrast enhancement is an issue of prime importance in dynamic environments. It is one of the major low-level image processing tasks that must be performed before further processing of an image is possible at higher levels. Things become more challenging in a continuously changing environment like the one experienced by driver assistance systems on the road: weather, lighting, etc. result in significant spatio-temporal variations of the input image quality. The real-time processing constraint makes the overall scenario more challenging still: the higher the car speed, the faster the image processing must be. A continuously changing environment requires the system to be adaptive, i.e. the system should process/enhance the input image in such a way that the corresponding output image always has the best possible contrast regardless of the environmental conditions affecting the input image (darkness, non-uniform lighting, rain, fog, etc.). This implies that the output image should have sufficient contrast for all of the objects contained in it to be easily distinguishable by the system for further processing such as scene analysis.

The implementation of the coupled nonlinear oscillatory systems’ paradigm on top of a cellular neural network is a strong candidate for meeting this need. To develop such a paradigm, a systematic analytical framework should provide tools/methods for the straightforward design and parameter calculation of a related robust and ultra-fast image processing scheme.

The rest of the paper is organized as follows. Section 2 exploits the Routh-Hurwitz theorem to address the stability analysis of the nonlinear coupled oscillatory system. Three main states of the coupled system are depicted, namely the equilibrium, quenched and oscillatory states. Analytical formulas/relations are derived under which each of these states can be displayed by the coupled system. The quality of the image ‘contrast enhancement’ is discussed in each of the possible states of the coupled system. Windows of the system parameters are determined, under which either a good or a poor contrast enhancement can be predicted. Section 3 deals with the numerical study. An in-depth explanation of the image processing concept involving coupled nonlinear oscillators is provided. For rapid prototyping purposes, a computing platform based on MATLAB/SIMULINK is developed. It is then used for a set of processing tasks on images with poor contrast as well as on images with very good contrast. Section 4 deals with the benchmarking. This benchmarking shows how far the novel approach outperforms the classical CNN-based way of doing the same task, since the CNN templates used for contrast enhancement (or published in relevant books or papers) are in reality only optimal for the images used in the related training process. The latter is traditionally based on offline optimization processes involving genetic algorithms, simulated annealing or particle swarm optimization [29]-[34]. As a proof of concept of the approach developed in this paper, our results are compared with those provided by the relevant literature for CNN-based contrast enhancement. We further discuss a possible implementation of the coupled nonlinear oscillators on top of a CNN computing platform. The challenge here is that of transforming, as far as possible, the nonlinearity types present in both the ‘van der Pol’ and ‘Duffing’ oscillators into a type of nonlinearity similar to that displayed by the elementary CNN cell. We use a novel optimization concept/process to achieve this goal. Section 5 presents a set of concluding remarks. Furthermore, a summary of the key results obtained is provided.

II. ANALYTICAL STUDY 

The dynamics of a system consisting of a van der Pol 

oscillator coupled to a Duffing oscillator is described by the 

following equations: 

$$\frac{d^2x}{dt^2} - \varepsilon_1\left(1 - x^2\right)\frac{dx}{dt} + \omega_1^2 x = c_1 y + c_2\frac{dy}{dt} \tag{1a}$$

$$\frac{d^2y}{dt^2} + \varepsilon_2\frac{dy}{dt} + \omega_2^2 y + c_0 y^3 = c_3 x + c_4\frac{dx}{dt} \tag{1b}$$

where $c_1$ and $c_3$ are the elastic coupling parameters, and $c_2$ and $c_4$ are the dissipative coupling parameters. $x(t)$ and $y(t)$ represent the coordinates of the coupled oscillators (i.e. van der Pol and Duffing, respectively). The stability analysis of the equilibrium points is carried out by restricting our investigation to the case where the elastic couplings (respectively the dissipative couplings) are identical. From (1), we obtain the following equilibrium points ($c_2 = c_4 = 0$):

$$P_1\left(\sqrt{\frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\ 0,\ \sqrt{\frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\ 0\right) \tag{2a}$$

$$P_2\left(\sqrt{\frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\ 0,\ -\sqrt{\frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\ 0\right) \tag{2b}$$

$$P_3\left(-\sqrt{\frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\ 0,\ \sqrt{\frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\ 0\right) \tag{2c}$$

$$P_4\left(-\sqrt{\frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\ 0,\ -\sqrt{\frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\ 0\right) \tag{2d}$$

These points exist under the conditions $c_1 < \omega_1\omega_2$ and $c_0 < 0$, or $c_1 > \omega_1\omega_2$ and $c_0 > 0$. We also obtain a critical equilibrium point $P_c(0, 0, 0, 0)$. The stability of the above equilibrium points can be investigated by re-writing (1) in the following form:

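$$\frac{dx}{dt} = v \tag{3a}$$

$$\frac{dv}{dt} = \varepsilon_1\left(1 - x^2\right)v - \omega_1^2 x + c_1 y \tag{3b}$$

$$\frac{dy}{dt} = z \tag{3c}$$

$$\frac{dz}{dt} = -\varepsilon_2 z - \omega_2^2 y - c_0 y^3 + c_1 x \tag{3d}$$

and linearizing around a given equilibrium state $(x_0, v_0, y_0, z_0)$ to obtain the Jacobian matrix $M_J$:

$$M_J = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -\omega_1^2 - 2\varepsilon_1 x_0 v_0 & \varepsilon_1\left(1 - x_0^2\right) & c_1 & 0 \\ 0 & 0 & 0 & 1 \\ c_1 & 0 & -\omega_2^2 - 3c_0 y_0^2 & -\varepsilon_2 \end{bmatrix} \tag{4}$$

The eigenvalues of the 4x4 Jacobian matrix $M_J$ are the solutions of (5):

$$a_0\lambda^4 + a_1\lambda^3 + a_2\lambda^2 + a_3\lambda + a_4 = 0 \tag{5}$$

where the coefficients $a_l$ are defined as follows:

$$a_0 = 1 \tag{6a}$$

$$a_1 = \varepsilon_2 - \varepsilon_1\left(1 - x_0^2\right) \tag{6b}$$

$$a_2 = \omega_1^2 + \omega_2^2 + 2\varepsilon_1 x_0 v_0 + 3c_0 y_0^2 - \varepsilon_1\varepsilon_2\left(1 - x_0^2\right) \tag{6c}$$

$$a_3 = \varepsilon_2\left(\omega_1^2 + 2\varepsilon_1 x_0 v_0\right) - \varepsilon_1\left(1 - x_0^2\right)\left(\omega_2^2 + 3c_0 y_0^2\right) \tag{6d}$$

$$a_4 = \left(\omega_1^2 + 2\varepsilon_1 x_0 v_0\right)\left(\omega_2^2 + 3c_0 y_0^2\right) - c_1^2 \tag{6e}$$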
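The stability test built on (5) and (6) can be tried out numerically. The following Python sketch evaluates the coefficients of (6) at a chosen equilibrium state and applies the standard Routh-Hurwitz conditions for a monic quartic. The function names are hypothetical, and the sample parameter values are the ones quoted later in the paper for the quenching state; this is a reader's check under those assumptions, not the authors' own procedure.

```python
def char_poly_coeffs(eps1, eps2, w1, w2, c0, c1, x0=0.0, v0=0.0, y0=0.0):
    """Coefficients a1..a4 of (5) as defined in (6), with a0 = 1."""
    a_term = w1 ** 2 + 2.0 * eps1 * x0 * v0    # omega_1^2 + 2 eps_1 x0 v0
    b_term = eps1 * (1.0 - x0 ** 2)            # eps_1 (1 - x0^2)
    d_term = w2 ** 2 + 3.0 * c0 * y0 ** 2      # omega_2^2 + 3 c0 y0^2
    a1 = eps2 - b_term
    a2 = a_term + d_term - eps2 * b_term
    a3 = eps2 * a_term - b_term * d_term
    a4 = a_term * d_term - c1 ** 2
    return a1, a2, a3, a4


def is_hurwitz_stable(a1, a2, a3, a4):
    """Routh-Hurwitz test for l^4 + a1 l^3 + a2 l^2 + a3 l + a4 = 0."""
    delta2 = a1 * a2 - a3
    delta3 = a3 * delta2 - a1 ** 2 * a4
    return a1 > 0 and delta2 > 0 and delta3 > 0 and a4 > 0


if __name__ == "__main__":
    # Parameters quoted in the paper for the quenching state, evaluated at Pc.
    coeffs = char_poly_coeffs(eps1=0.4, eps2=1.0, w1=1.0, w2=1.0, c0=0.5, c1=0.8)
    print("coefficients:", coeffs)
    print("Pc stable:", is_hurwitz_stable(*coeffs))
```

Passing a nonzero `x0` and `y0` checks one of the points of (2) instead of the origin.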

It can be shown (by the analysis of the oscillatory states of the 

coupled system and by exploiting the Routh-Hurwitz theorem) 

that three possible states of the system can be depicted. The 

first state is the quenching state (i.e. the death of oscillations). 

The second is the state of equilibrium. And the last one is the 

oscillatory state. 

The system exhibits its quenching state when the critical equilibrium point $P_c(0, 0, 0, 0)$ is stable. It can be shown, using the Routh-Hurwitz theorem, that the critical equilibrium point is stable if the following relationships are satisfied (assuming that the natural frequencies of the coupled oscillators are equal):

$$\varepsilon_1 < \varepsilon_2 \tag{7a}$$

$$\omega_1\sqrt{\varepsilon_1\varepsilon_2} < c_1 < \omega_1^2 \tag{7b}$$
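As a check on (7), the conditions can be recovered from (5) and (6) in a few lines. The sketch below assumes $\omega_1 = \omega_2$, $c_1 > 0$, positive $\varepsilon_1$ and $\varepsilon_2$, and the standard Routh-Hurwitz determinants for a monic quartic; it is a reconstruction for the reader, not necessarily the route taken by the authors.

```latex
% At P_c: x_0 = v_0 = y_0 = 0 and \omega_2 = \omega_1, so (6) gives
\begin{align*}
a_1 &= \varepsilon_2 - \varepsilon_1, &
a_2 &= 2\omega_1^2 - \varepsilon_1\varepsilon_2, \\
a_3 &= \omega_1^2(\varepsilon_2 - \varepsilon_1), &
a_4 &= \omega_1^4 - c_1^2 .
\end{align*}
% Routh-Hurwitz conditions for
% \lambda^4 + a_1\lambda^3 + a_2\lambda^2 + a_3\lambda + a_4 = 0:
\begin{align*}
\Delta_1 &= a_1 > 0
  &&\Rightarrow\ \varepsilon_1 < \varepsilon_2, \\
\Delta_2 &= a_1 a_2 - a_3
  = (\varepsilon_2 - \varepsilon_1)(\omega_1^2 - \varepsilon_1\varepsilon_2) > 0
  &&\Rightarrow\ \varepsilon_1\varepsilon_2 < \omega_1^2, \\
\Delta_3 &= a_3\Delta_2 - a_1^2 a_4
  = (\varepsilon_2 - \varepsilon_1)^2 (c_1^2 - \omega_1^2\varepsilon_1\varepsilon_2) > 0
  &&\Rightarrow\ c_1 > \omega_1\sqrt{\varepsilon_1\varepsilon_2}, \\
a_4 &> 0
  &&\Rightarrow\ c_1 < \omega_1^2 .
\end{align*}
```

The condition from $\Delta_2$ is implied whenever the interval in (7b) is non-empty, which is why only (7a) and (7b) need to be stated. With $\varepsilon_1 = 0.4$, $\varepsilon_2 = 1$ and $\omega_1 = 1$, the interval is roughly $0.632 < c_1 < 1$.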

It can also be shown (using the Routh-Hurwitz theorem) that the nonzero equilibrium points $P_i$ ($i = 1, 2, 3, 4$) are stable for

$$\varepsilon_1 < \varepsilon_2 \tag{8a}$$

$$c_1 > \omega_1^2 \tag{8b}$$

Under the conditions described by (8) all the neighboring orbits of the critical equilibrium points are stable. It can be shown, using the oscillatory states analysis method (e.g. the multiple time scale method), that the coupled system displays oscillatory states. These states could be observed under the following conditions:

$$\varepsilon_1 < \varepsilon_2 \tag{9a}$$

0


some preliminary results of the processing (contrast enhancement) of images with a very poor initial contrast. The main focus will be on showing that the quality of the contrast enhancement is different in each of the various ‘parameter windows’ established analytically in (7)-(9). The advantage of this feature is the possibility of predicting either good or poor image processing; both depend on the selected parameter values of the coupled nonlinear oscillators’ model.

III. NUMERICAL STUDY 

A. Description of the concept 

The proposed coupled oscillatory system consists of two 

nonlinear oscillators, i.e., a van der Pol oscillator and a Duffing 

oscillator, each represented by a second order nonlinear 

differential equation as given in (1). From a nonlinear 

dynamics perspective the scheme to solve this oscillatory 

system is straightforward. 

[Figure 1 diagram: the input image is discretized and vectorized into the vectors $\vec{x} = [x_1, x_2, \ldots, x_n]^T$ and $\vec{y} = [y_1, y_2, \ldots, y_n]^T$; these vectors and their time derivatives $d\vec{x}/dt = [dx_1/dt, dx_2/dt, \ldots, dx_n/dt]^T$ and $d\vec{y}/dt = [dy_1/dt, dy_2/dt, \ldots, dy_n/dt]^T$ are passed to the coupled oscillatory paradigm.]

Figure 1. Image processing through the oscillatory model

To exploit the coupled nonlinear model/equations for image processing tasks (e.g. contrast enhancement, edge detection, segmentation, etc.), the basic idea remains the same, although it is a bit trickier. The input image is first pixelized, i.e. it must take a grid-like form. The pixelized image is then vectorized, so that the elements of the vector image are the individual pixels. This vector image serves as the initial condition vector for the coupled oscillatory system. Solving a 2nd-order ordinary differential equation requires two initial conditions (i.e. position and velocity), and the same holds here. To process each pixel, the system needs four values: the ‘position’ and ‘velocity’ values for both the van der Pol oscillator and the Duffing oscillator. The initial conditions vector therefore has four elements, each of the same size as the input image: two vector elements for the initial positions and two further vector elements for the initial velocities. The key steps of the overall principle are shown in Fig. 1.
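The vectorize, integrate and reshape steps described above can be sketched in plain Python. Several things here are assumptions made for illustration and are not taken from the paper: pixel values are scaled to [0, 1], each pixel evolves independently under the first-order form (3), the image is loaded in x only, and a fixed-step fourth-order Runge-Kutta scheme stands in for the Simulink solver. The function names, step size and end time are likewise hypothetical.

```python
def coupled_rhs(state, p):
    """Right-hand side of the first-order form (3) for a single pixel."""
    x, v, y, z = state
    dv = p["eps1"] * (1.0 - x * x) * v - p["w1"] ** 2 * x + p["c1"] * y
    dz = -p["eps2"] * z - p["w2"] ** 2 * y - p["c0"] * y ** 3 + p["c1"] * x
    return (v, dv, z, dz)


def rk4_step(state, dt, p):
    """One classical Runge-Kutta step (an assumed stand-in for the solver)."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = coupled_rhs(state, p)
    k2 = coupled_rhs(shifted(state, k1, dt / 2.0), p)
    k3 = coupled_rhs(shifted(state, k2, dt / 2.0), p)
    k4 = coupled_rhs(shifted(state, k3, dt), p)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def process_image(image, p, t_end=150.0, dt=0.05):
    """Vectorize the image, integrate every pixel, reshape x and y back."""
    rows, cols = len(image), len(image[0])
    pixels = [value for row in image for value in row]   # vectorization
    steps = int(round(t_end / dt))
    x_out, y_out = [], []
    for value in pixels:
        state = (value, 0.0, 0.0, 0.0)   # image loaded in x; v, y, z start at 0
        for _ in range(steps):
            state = rk4_step(state, dt, p)
        x_out.append(state[0])
        y_out.append(state[2])

    def reshape(vec):
        return [vec[r * cols:(r + 1) * cols] for r in range(rows)]

    return reshape(x_out), reshape(y_out)


if __name__ == "__main__":
    # Parameter values quoted in the paper for the quenching state.
    params = {"eps1": 0.4, "eps2": 1.0, "w1": 1.0, "w2": 1.0,
              "c0": 0.5, "c1": 0.8}
    tiny_image = [[0.05, 0.10], [0.15, 0.20]]   # hypothetical 2x2 test image
    x_img, y_img = process_image(tiny_image, params)
    print(x_img)
    print(y_img)
```

Other initialization scenarios only change the tuple assigned to `state` before the inner loop, for example placing the pixel value in the velocity slot instead of the position slot.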

The system generates two solutions at each time step: one for the van der Pol oscillator and one for the Duffing oscillator. These solutions are obtained in the form of vector images, which must be converted back to the grid-like shape. Normally, the input images are loaded (as initial conditions) either in $\vec{x}$ or $\vec{y}$ or in both. But there are different possible scenarios for initializing the model. Some of these scenarios are listed in the following:

• Loading the image in $\vec{x}$

• Loading the image in $\vec{y}$

• Loading the image in $\vec{x}$ and $\vec{y}$

• Loading the image in $d\vec{x}/dt$

• Loading the image in $d\vec{y}/dt$

• Loading the image in $d\vec{x}/dt$ and $d\vec{y}/dt$

• Loading the image in $\vec{x}$ and $d\vec{x}/dt$

The SIMULINK model (i.e. a graphical representation) that has 

been used for the simulations of this paper (i.e. for image 

contrast enhancement) is shown in Fig. 2. This graphical model 

is a representation of the nonlinear coupled oscillatory system 

from the nonlinear dynamics perspective. 

B. Results 

Our objective in this part is to connect the results obtained 

analytically (different states of the nonlinear oscillator system) 

to some sample image processing examples obtained through 

numerical simulations. The key issue hereby is that of 

establishing a correlation between the formulas derived 

analytically and the related image processing results obtained 

numerically (i.e. contrast enhancement). 

It has been shown analytically that the equilibrium points (i.e. both $P_c$ and $P_i$) are stable under the analytical conditions described by (7), (8) and (9). We now want to exploit these equations to show the quality of the image processing tasks performed by the coupled oscillators’ system in its equilibrium states. It is worth mentioning that two main equilibrium states of the coupled system have been depicted analytically. The first state is the quenching state, under which the critical point (i.e. the point at the origin) $P_c(0, 0, 0, 0)$ is stable. At the critical point both oscillators are mutually damped (i.e. complete damping), leading to the quenching phenomenon. When this phenomenon occurs, the image processing produces an image which is completely dark (see Fig. 3b), whatever the quality (good or poor) of the input image may be (see Fig. 3a). The following set of parameters has been used to obtain the quenching state under the conditions described in (7): $\varepsilon_1 = 0.4$, $\varepsilon_2 = 1$, $\omega_1 = 1$, $\omega_2 = 1$, $c_1 = 0.8$, $c_3 = 0.8$, $c_2 = 0$, $c_4 = 0$, and $c_0 = 0.5$.


Figure 2. Simulink representation of the coupled oscillators’ model 

(a) (b) 

Figure 3: Result of the image processing in the case where the system is at the critical equilibrium point $P_c(0, 0, 0, 0)$: input image (a) and result of the processing (b), leading to an output image which is completely dark (quenching phenomenon).

An important observation that can be drawn from Fig. 4 is
that the image processing quality is highest for equilibrium
points that lie far away from the critical point.
For instance, the parameter values c1 = 1.05, c1 = 1.15, and
c1 = 1.3 lead to the equilibrium points
P1(0.475, 0, 0.455, 0), P1(0.923, 0, 0.803, 0), and
P1(1.52, 0, 1.17, 0), respectively. Therefore, by increasing c1,
the equilibrium points Pi move away from the critical
equilibrium Pc, thereby leading to a significant
improvement in the quality of the image processing (contrast
enhancement). The resulting images are presented in Fig. 4.
We have also performed a series of image processing
numerical simulations in the ‘oscillatory states’ of the coupled
system described by (9). Using the same set of parameters as
in Fig. 4, c1 has been used as the control parameter. The results
in Fig. 5 reveal that, in the oscillatory state of
the coupled system, the quality of the processing increases with
decreasing c1 (see the results of the processing in Fig. 5b, Fig.
5c and Fig. 5d).
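The trend in the equilibrium states can be checked numerically from the three equilibrium points quoted above. The following is a minimal sketch, assuming the Euclidean norm as the measure of distance from the critical point Pc(0, 0, 0, 0); the text does not name a metric, and the names `EQUILIBRIA` and `distance_from_critical` are hypothetical.

```python
import math

# Equilibrium points reported in the text for three values of c1.
# Using the Euclidean norm as the distance measure is an assumption
# made for this illustration; the text does not name a metric.
CRITICAL_POINT = (0.0, 0.0, 0.0, 0.0)
EQUILIBRIA = {
    1.05: (0.475, 0.0, 0.455, 0.0),
    1.15: (0.923, 0.0, 0.803, 0.0),
    1.30: (1.52, 0.0, 1.17, 0.0),
}

def distance_from_critical(point):
    """Euclidean distance between an equilibrium point and Pc."""
    return math.dist(point, CRITICAL_POINT)

distances = {c1: distance_from_critical(p) for c1, p in EQUILIBRIA.items()}
for c1, d in sorted(distances.items()):
    print(f"c1 = {c1:.2f}: distance from Pc = {d:.3f}")
```

Under this assumption the distance grows monotonically with c1 (roughly 0.66, 1.22 and 1.92), which is consistent with the statement that increasing c1 moves the equilibrium away from Pc.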

(a) (b) 

(c) (d) 

Figure 4. Effects of the control parameter c1 on the image processing quality when the
system is in different equilibrium states: (a) is the input image; (b) is the
related image processing result for c1 = 1.05; (c) is the image processing result
for c1 = 1.15; and (d) is the image processing result for c1 = 1.3, the latter
leading to the optimal result obtained in the corresponding
equilibrium state of the coupled system.

(a) (b) 

(c) (d) 

Figure 5: Effects of the control parameter c1 on the processing quality of
the input image (a) when the system is in different oscillatory states; the results of
the processing are: (b) for c1 = 0.6, (c) for c1 = 0.55, and (d) for c1 = 0.50, the
latter leading to the optimal result obtained in the corresponding oscillatory
state of the coupled system.

IV. BENCHMARKING 

In this section we discuss and compare the results of the
CNN based image contrast enhancement techniques published
in the literature so far with those obtained through
processing by the coupled nonlinear oscillators’ paradigm. A
first attempt at CNN based contrast enhancement was
presented by Márton Csapodi et al. [22]. In this concept,
another well known contrast enhancement approach, i.e.
adaptive histogram equalization, has been emulated by
performing a piecewise linear approximation of different
mapping functions. The technique is computationally intensive
since, for each contextual region, it requires a histogram
generation, a mapping function calculation and a rescaling of
pixel values according to the new mapping. Mátyás Brendel et
al. [23] addressed the contrast enhancement problem by
proposing a set of linear templates, which however do not
provide good results for all test images due to the highly
nonlinear nature of the images. A. Gacsádi et al. [24] have
designed a set of templates for image enhancement by
minimizing the image energy function.


(a) (b) 

(c ) (d) 

Figure 6. Results of CNN based contrast enhancement schemes: (a) image
enhancement based on the approach developed by Márton Csapodi et al. [22];
(b) image enhancement based on the approach developed by Mátyás Brendel et
al. [23]; (c) image enhancement based on the approach by A. Gacsádi et al. [24];
(d) edge preservation observed in the approach by A. Gacsádi et al. [24]. All
these results are obtained w.r.t. the input image of Fig. 4a.

The energy function considered consists of two terms,
a smoothness constraint and an edge penalty. Thus, to obtain an
optimum result, a tradeoff between image smoothness and edge
detection had to be found and adjusted. Applying the approaches
based on the CNN paradigm to the same input image of Fig.
4a, we have obtained an enhanced contrast w.r.t. the input
image, but with a loss of key information (see Fig. 6). The parts
of the input image with small gray level differences have been
lost, e.g. the driver’s face, the road, the round lane and the
background. In contrast, the optimum results (Fig. 4d,
Fig. 5d) obtained through the coupled nonlinear oscillators
processing paradigm clearly show that almost all of the basic
information of the same input image is restored during the
image enhancement processing. A comparison of Figures 5 and
6 underscores the superiority of the coupled nonlinear
oscillator based contrast enhancement when compared to CNN
based approaches. The reason for the weakness of the CNN
based approach lies in the essentially “supervised
training/learning”-like process used to determine the templates.
Because of this, the (linear) templates obtained are only optimal for
the test images. Beyond that, those
templates cannot be expected to be optimal for images experiencing temporal
and/or spatial dynamics.

Having seen the superiority of the coupled oscillators’
based approach, we now propose a hybrid architecture that
combines the strong points of both CNN and the coupled
nonlinear oscillators’ based image processing. The core
processing will be the latter. We thus propose the
realization of the coupled nonlinear oscillators’ image
processing concept on top of a cellular neural network
processor framework. Here, the CNN processors play
the role of a slave-system used to solve, in real-time, the
nonlinear ordinary differential equations describing the
coupled nonlinear oscillators’ model. The image processing
based on coupled nonlinear oscillators has one major
strength: its processing efficiency depends
neither on the actual image quality nor on its variations or
states, but solely on the coefficients of the nonlinear
differential equations describing the coupled oscillators. The
appropriate coefficients/parameters of the coupled nonlinear
oscillators are determined, as explained in the previous
sections, in an offline bifurcation analysis process. The new
challenge for the CNN processor is therefore that of
being capable of solving these nonlinear differential equations
in real-time. Thus, the problem formulation for the CNN
processor system is that of solving a set of highly stiff
nonlinear differential equations having constant coefficients.
The key challenge for this task is solely that of determining
the appropriate templates. This is not trivial at all and is still
an open issue in the current state of the relevant
literature. We could however solve it, and we present the
related results in a separate paper published in
the proceedings of the CNNA-2010 conference, entitled
“CNN based Real-time Computational Engineering” [28].

V. CONCLUDING REMARKS 

CNN needs well optimized templates to perform any
specific image processing task, and it is well known that
template optimization is still an unsolved problem that stands in the way of
straightforward and efficient CNN based computing. For
dynamic environments, linear templates do not provide
robust processing, since every image differs from the
previous one and in reality requires a new set of
appropriate templates for the same processing task. Nonlinear
templates require some preprocessing to be performed on each
image to obtain the template values that are appropriate to the
actual image, which leads to serious problems for real-time
applications. The proposed paradigm of coupled nonlinear
oscillators based processing has shown (both analytically and
numerically) domains of efficient contrast enhancement
in which the processing quality remains
constant and is insensitive to any spatio-temporal
dynamics that the input images may experience. Furthermore,
in CNN based processing the template-based computing
also involves the pixel’s neighborhood while processing each
pixel. This is not the case in the nonlinear oscillators based
processing paradigm, as each pixel is processed independently
without taking its neighborhood into account. For both
paradigms, i.e. CNN and nonlinear oscillators, the input image
serves as the initial condition. Both frameworks offer parallel
image processing, with two differences: (a) CNN
templates appear to be sensitive to the training conditions and
lack adaptivity to dynamic environments; (b) the performance
of the coupled oscillator model is insensitive to
both the quality and the dynamics of the input image.

In summary, after analyzing the results obtained, we
propose the realization of the coupled nonlinear oscillators
based image processing concept on top of a cellular neural
network processor system. Here, the CNN framework is
viewed as a slave-system used to solve, in real-time, the
nonlinear ordinary differential equations describing the
coupled nonlinear oscillators’ model. The image processing
based on coupled nonlinear oscillators has demonstrated its
major strength, which is that its processing efficiency
depends neither on the actual image quality nor on its
variations or states, but solely on the coefficients of the
nonlinear differential equations describing the coupled
oscillators. The appropriate coefficients/parameters of the
coupled nonlinear oscillators are determined in an offline
bifurcation analysis process, which has been explained
earlier in this paper. These coefficients remain
constant and do not need to be recalculated in real-time.


REFERENCES 

[1] S. Morfu and J. C. Comte, “A nonlinear oscillators 

network devoted to image processing,” International 

Journal of Bifurcation and Chaos, vol. 14, no. 4, pp. 1385-1394, 2004.

[2] Naoko Kurata, Hitoshi Mahara, Tatsunari Sakurai, 

Atsushi Nomura and Hidetoshi Miike, “Image processing 

by a coupled non-linear oscillator system,” 23rd

International Technical Conference on Circuits/Systems, 

Computers and Communications (ITC-CSCC 2008). 

[3] De Liang Wang and David Terman, “Locally excitatory 

globally inhibitory oscillator networks,” IEEE 

Transactions on Neural Networks, vol. 6, no. 1, January 

1995. 

[4] Xiuwen Liu and DeLiang Wang, “Range image 

segmentation using a LEGION network,” IEEE 

Transactions on Neural Networks, vol. 10, no. 3, May 

1999. 

[5] D. L. Wang, “Object selection based on oscillatory 

correlation,” Elsevier Transactions on Neural Networks, 

vol. 12, pp. 579-592, 1999. 

[6] Hiroshi Ando, Takashi Morie, Makoto Nagata and 

Atsushi Iwata, “A non-linear oscillator network for grey 

level image segmentation and PWM / PPM circuits for its 

VLSI implementation,” IEICE Trans. Fundamentals, vol. 

E83-A, no. 2, pp. 329-336, February 2000. 

[7] Hiroshi Ando, Takashi Morie, Makoto Nagata and 

Atsushi Iwata, “Image segmentation/extraction using nonlinear 

cellular networks and their VLSI implementation 

using pulse-coupled modulation techniques,” IEICE 

Trans. Fundamentals, vol. E85-A, no.2, pp. 381-388, 

February 2002. 

[8] Hidehiro Nakano and Toshimichi Saito, “Grouping 

synchronization in a pulse-coupled network of chaotic 

spiking oscillators,” IEEE Transactions on Neural 

Networks, vol 15, no.5, September 2004. 

[9] Yakov Kazanovich and Roman Borisyuk, “Object 

selection by an oscillatory neural network,” Elsevier 

Transactions on Biosystems, vol. 67, pp. 103-111, August 

2002. 

[10] Ke Chen and DeLiang Wang, “A dynamically coupled 

neural oscillator network for image segmentation,” 

Elsevier Transactions on Neural Networks, vol. 15, pp. 

423-439, April 2002. 

[11] M. Strzelecki, “Texture boundary detection using network 

of synchronised oscillators,” IEEE Electronics letters, vol. 

40, pp. 466-467, ISSN 0013-5194, April 2004. 

[12] Michal Strzelecki, Jacques de Certaines, and Suhong Ko, 

“Segmentation of 3D MR liver images using 

synchronised oscillators networks,” Proceedings of the 

2007 IEEE International Symposium on Information 

Technology Convergence, ISBN: 0-7695-3045-1, pp. 

259-263, 2007. 

[13] Balarey Yuri, Cohen Alexander, Johnson Walter and 

Elinson, “Image processing by oscillatory media,” 

Proceedings of SPIE-the International Society for Optical 

Engineering, vol. 2430, pp. 198-207, 1994. 

[14] R. Forke, Dirk Scheibner, Wolfram Dötzel and Jan 

Mehner, “Measurement unit for tunable low frequency 

vibration detection with MEMS force coupled 

oscillators,” Elsevier Transactions on Sensors and 

Actuators, vol. 156, pp. 59-65, 2009. 

[15] R. Sepulchre, Derek Paley and Naomi Leonard, Lecture 

notes in control and information sciences, vol. 309/2004, 

pp:189-205. ISBN:978-3-540-22861-5, ISSN:0170-8643, 

Springer Berlin/Heidelberg, November 2004. 

[16] James F. Buckwalter, Aydin Babakhani, Abbas Komijani 

and Ali Hajimiri, “An integrated subharmonic coupled-oscillator
scheme for a 60-GHz phased-array transmitter,”

IEEE Transactions on Microwave Theory and 

Techniques, vol. 54, no.12, pp. 4271-4280, December 

2006. 

[17] G. W. Wei and Y. Q. Jia, “Synchronization-based image 

edge detection,” Europhysics Letters, vol. 59, pp. 814-819,

2002. 

[18] J. C. Chedjou, On the analysis of nonlinear 

electromechanical systems with applications, Shaker 

Verlag, ISBN 978-3-8322-3750, 2005. 

[19] J. C. Chedjou, H. B. Fotsin, P. Woafo, and S. Domngang, 

“Analog simulation of the dynamics of a van der Pol 

oscillator coupled to a Duffing oscillator,” IEEE 

Transactions on Circuits and Systems-I, vol. 48, no. 06, 

pp. 748-757, 2001. 

[20] J. C. Chedjou, P. Woafo, and S. Domngang, “Shilnikov 

chaos and dynamics of a self-sustained electromechanical 

transducer,” ASME Journal of Vibration and

Acoustics, vol. 123, pp. 170-174, 2001. 

[21] J. C. Chedjou, K. Kyamakya, I. Moussa, H. P. 

Kuchenbecker, and W. Mathis, “Behavior of a self-sustained
electromechanical transducer and routes to
chaos,” ASME Journal of Vibration and Acoustics,

vol. 128, pp. 183-192, 2006. 

[22] Márton Csapodi and Tamás Roska, “Adaptive histogram

equalization with cellular neural network,” CNNA’96: 

Fourth IEEE International Workshop on Cellular Neural 

Networks and their Applications, Seville, Spain, June 24- 

26, 1996. 

[23] Mátyás Brendel and Tamás Roska, “Adaptive image
sensing and enhancement using the adaptive cellular
neural network universal machine,” Proceedings of the 6th

IEEE International Workshop on Cellular Neural 

Networks and their Applications, Catania, Italy, May 23- 

25, 2000. 

[24] A. Gacsadi, C. Grava and A. Grava, “Medical image 

enhancement by using cellular neural networks,” 

Proceedings of the IEEE International Conference on
Computers in Cardiology, Lyon, France, Sep 25-28,

2005. 

[25] Masaru Nakano and Yoshifumi Nishio, “A method of 

edge detection using small world cellular neural 

network”, International Symposium on Nonlinear Theory 

and its Applications, NOLTA’07, Vancouver, Canada, 

Sep 16-19, 2007. 

[26] D. L. Vilarino, D. Cabello, X. M. Pardo and V. M. Bera, 

“Cellular neural networks and active contours: A tool for 

image segmentation”, Transaction of Elsevier on Image 

and Vision Computing, vol. 21, pp. 189-204, 2003.


[27] Yao Li, Liu Jiamin, Xie Yonggui and Pei Liuqing, 

“Medical image segmentation based on cellular neural 

networks”, Science in China series F: information 

sciences, ISSN: 1009-2757(print) 1862-2836(online), vol. 

44, pp. 68-72, 2007. 

[28] J.C. Chedjou, K. Kyamakya, U.A. Khan, M.A. Latif, 

“CNN-based real-time computational engineering,” 

Proceedings of CNNA 2010, February 3-5, 2010, 

Berkeley, California – USA. 

[29] S. Taraglio and A. Zanela, “Cellular neural networks: a
genetic algorithm for parameters optimization in artificial
vision applications,” Proceedings of the 4th IEEE

International workshop on Cellular Neural Network and 

their Applications, pp. 315-320, ISBN: 0-7803-3261-X, 

Spain, 1996. 

[30] M. Zamparelli, “Genetically trained cellular neural 

networks,” Transactions of Elsevier on neural networks, 

vol. 10, pp. 1143-1151, 1997. 

[31] Samuel Xavier-de-Souza, Müstak E. Yalcin,
Joos Vandewalle, Johan A. K. Suykens,

“Automatic chip-specific CNN template optimization 

using adaptive simulated annealing,” Proceedings of the 

European conference on Circuit Theory and Design 

(ECCTD ‘03). 

[32] Brett Chandler, Csaba Rekeczky, Yoshifumi Nishio, Akio 

Ushida, “Adaptive simulated annealing in CNN template 

learning,” IEICE Trans. Fundamentals, vol. E82-A, no. 

02, February 1999. 

[33] H. L. Wei, S. A. Billings, “Generalized cellular neural 

networks constructed using particle swarm optimization 

for spatio-temporal evolutionary pattern identification,” 

International journal of Bifurcation and Chaos, vol. 18, 

pp. 3611-3624, 2008. 

[34] Te-Jen Su, Tzu-Hsiang Lin, Jia-Wei Liu, “Particle swarm 

optimization for gray scale image noise cancellation,” 

Proceedings of the 4th IEEE International Conference on

Intelligent Information hiding and Multimedia Signal 

Processing, Harbin-China, 2008. 

Kyandoghere Kyamakya obtained the 

M.S. in Electrical Engineering in 1990 

at the University of Kinshasa. In 1999 

he received his Doctorate in Electrical 

Engineering at the University of Hagen 

in Germany. He then worked three 

years as post-doctorate researcher at the 

Leibniz University of Hannover in the 

field of Mobility Management in 

Wireless Networks. From 2002 to 2005 

he was junior professor for Positioning Location Based 

Services at Leibniz University of Hannover. Since 2005 he has been

full Professor for Transportation Informatics and Director of 

the Institute for Smart Systems Technologies at the University 

of Klagenfurt in Austria. 

Jean Chamberlain Chedjou received 

in 2004 his doctorate in Electrical 

Engineering at the Leibniz University 

of Hanover, Germany. He has been a 

DAAD (Germany) scholar and also an 

AUF research Fellow (Postdoc.). From 

2000 to date he has been a Junior 

Associate researcher in the Condensed 

Matter section of the ICTP (Abdus 

Salam International Centre for Theoretical Physics) Trieste, 

Italy. Currently, he is a senior researcher at the Institute for 

Smart Systems Technologies of the Alpen-Adria University of 

Klagenfurt in Austria. His research interests include 

Electronics Circuits Engineering, Chaos Theory, Analog 

Systems Simulation, Cellular Neural Networks, Nonlinear 

Dynamics, Synchronization and related Applications in 

Engineering. He has authored and co-authored 3 books and 

more than 40 journal and conference papers.

Cyrille Kalenga Wa Ngoy obtained the ‘Ir. Civil’ degree in 

Electrical Engineering at the University of Kinshasa. He has been
an Assistant at the same University for about ten years, in the

Department of Electrical and Computer Engineering. 

Michel Matalatala Tamasala obtained the ‘Ir. Civil’ degree 

in Electrical Engineering at the University of Kinshasa. He has been
an Assistant at the same University for about four years, in the

Department of Electrical and Computer Engineering.


Common-neighbor Monitoring Enhanced 

Cooperation Enforcement Scheme for MANETs 

JianLi GUO, HongWei LIU, and XiaoZong YANG 

Abstract—Ad hoc networks are distributed, self-organized 

wireless networks. By their nature, it is easy for selfish nodes
to save their energy by not forwarding packets. The existence of
selfish nodes can degrade the network performance severely. A
new cooperation enforcement scheme called CMC was proposed
to mitigate this problem. The common-neighbor monitoring
technique was introduced, with whose help the watchdog could
monitor all packets transmitted around it. The system could
detect the non-cooperative nodes quickly and easily. In the
route discovery phase, the control messages that contained non-cooperative
nodes were dropped, decreasing the probability of a
well-behaved node using a bad route for data transmission. The
ns-2 simulation results indicated that CMC could improve the
throughput of well-behaved nodes by 10%-40% in the presence
of 10%-60% non-cooperative nodes.

Index Terms—Mobile Ad Hoc networks, selfish node, reputation, 

cooperation. 


A mobile ad hoc network [1], [2] is a multi-hop temporary
autonomous system composed of a group
of mobile nodes. In this environment, the transmission range
of each node is limited to a small area, so two mobile
nodes that are geographically distant require other nodes
to forward their packets in order to communicate. At present, the mature
routing protocols for mobile ad hoc networks, such as DSR [3]
and AODV [4], all assume that nodes are cooperative and
are happy to forward data for other nodes. In recent years, with
the development of hardware technology, all kinds of civilian
ad hoc networks, such as temporary wireless networks in
classrooms, have appeared. In these networks, the nodes
belong to different individuals or organizations,
they have no common purpose, and cooperation among nodes
cannot be guaranteed. In a mobile ad hoc network, where nodes
are always powered by battery, energy is very valuable, and
the wireless interface consumes substantial energy (more
than 40%) [2], [5]. In order to save energy, selfish nodes may

discard packets that need to be forwarded, thus showing non-cooperative
behavior. In [6], by employing game
theory, the authors proved that, in mobile ad hoc networks,
spontaneous cooperation does not exist and that external mechanisms
ensuring cooperation among nodes are required.
In a mobile ad hoc network, even if only a small number of
nodes show non-cooperative behavior, there may be a
great impact on network performance. Reference [7] further
pointed out that, if 10%-40% of the nodes in the network are selfish,
the performance of the entire network drops by
16%-32%.

Manuscript received April 9, 2009. This work was supported in part by
the Hi-Tech Research and Development Program (863) of China under grant
No. 2008AA01A201 and the National Natural Science Foundation of China
under grant No. 60503015.
JianLi GUO is with the School of Computer Science and Technology, Harbin
Institute of Technology, Harbin, China, 150001 email: gjl@ftcl.hit.edu.cn.
HongWei LIU is with the School of Computer Science and Technology,
Harbin Institute of Technology, Harbin, China, 150001 email:
lhw@ftcl.hit.edu.cn.
XiaoZong YANG is with the School of Computer Science and Technology,
Harbin Institute of Technology, Harbin, China, 150001 email:
yxz@ftcl.hit.edu.cn.

For the node cooperation problem in mobile ad hoc networks,
researchers have proposed many solutions [8], [9],
which fall mainly into two categories: virtual currency based
schemes and reputation based schemes. In the virtual currency
based schemes [5], [10], [11], [12], [13], nodes that
forward packets for other nodes are compensated with some
virtual currency to motivate them to cooperate. However, these
schemes have several drawbacks. They need special
hardware [10] or a central server [5], [11], [12], [13], which violates
the distributed nature of ad hoc networks. Nodes located
at the edge of the network have few opportunities to forward
packets for other nodes, cannot earn enough currency,
and may therefore be starved [10]. Finally, in
order to calculate the optimal compensation [11], [12], [13],
nodes need to exchange substantial information, which introduces
a large number of control packets into the network. These drawbacks limit the
application of such schemes in ad hoc networks.

In the reputation based schemes, each node is given a reputation
value [14] according to its behavior, and the selfish ones
are punished. Reference [7] first proposed the use of a watchdog
to detect selfish nodes. After that, [15], [16]
additionally used second-hand information to compute reputation
values in order to speed up detection, and at the same time used a
Bayesian statistical method to guard against rumor
attacks. Reference [17] focused on the security problems in
the process of calculating the reputation value and proposed a
secure scheme named SORI. References [18], [19] pointed out
that the detection techniques based on the watchdog were not
accurate enough, and put forward a two-hop ACK detection
method that could detect the selfish nodes more accurately. But
this approach introduced a large number of ACK messages, which
consumed considerable network bandwidth and made the network
more vulnerable to congestion.

References [14], [20] analyzed and summarized the calculation
methods for reputation values, and pointed out that the
use of second-hand information led to several disadvantages:
each node needed to save reputation values for every node in
the network, occupying substantial storage space; the dissemination
of second-hand information among nodes used up a lot
of network bandwidth; each time a node received second-hand
information, it needed to recalculate the reputation values for
all nodes in the network, taking up considerable CPU resources;
and the scheme was more vulnerable to attack. All of these make the reputation
calculation method based on first-hand information
more applicable to ad hoc networks. However, the method
that used first-hand information also had one drawback:
the detection of selfish nodes was slower, and
the non-cooperative nodes could not be separated from the
network quickly and effectively.

In this paper, we propose the CMC scheme (Common-neighbor
Monitoring enabled Cooperation enforcement
scheme), which uses the common-neighbor monitoring
technique to speed up the detection of non-cooperative
nodes. At the same time, the control messages
(RREQs or RREPs) are filtered by the CMC scheme,
which discards the control messages that contain
non-cooperative nodes, so that the routes chosen by nodes
bypass the non-cooperative nodes as far as possible,
thereby improving the network performance.

II. CMC SCHEME

A. The Structure of CMC

CMC is a reputation based cooperation scheme
in which direct information (first-hand information) was used to
calculate the reputation value of each node. Like all reputation
based schemes [7], [15], [16], [17], [18], [19], [20], CMC
assumed that the non-cooperative nodes took part in the route
discovery phase but, in the forwarding phase, might discard
packets for the purpose of saving their own resources (such
as energy).

CMC was based on the DSR [3] routing protocol and was
located between the network layer and the MAC layer. It included
five components: Watchdog, Filter, Neighbor Manager,
Reputation Manager and the Second Chance Mechanism,
as shown in Fig. 1. Among them, the Neighbor Manager
was responsible for maintaining a list of neighbors, as well
as periodically sending Hello messages; the Watchdog was
responsible for eavesdropping on the channel, and its monitoring
results were passed to the Reputation Manager; the Reputation
Manager calculated the reputation value for each node and
added the non-cooperative nodes to the Black-list; the Filter
had two functions: one was to punish the non-cooperative
nodes, and the other was to filter the routing control messages,
suppressing the routes containing non-cooperative nodes.

The functions and code of the DSR protocol remained
unchanged; only the FindRoute() function was rewritten. The
FindRoute() function searched for routes in the route cache in
accordance with the Black-list, and only routes that did not contain
black-listed nodes were returned. When the node had
data to send, the FindRoute() function was first called to
search for an available route in the route cache. On a hit, the
available route was used to send the data; otherwise, the route
discovery phase had to be restarted to search for new
routes.
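The route-cache lookup described here can be sketched as follows. This is a minimal illustration and not the authors' ns-2 code: the names `find_route`, `route_cache` and `blacklist`, the representation of a route as a list of node IDs, and the preference for the shortest clean route are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Set

Route = List[str]  # a source route as an ordered list of node IDs (assumed representation)

def find_route(route_cache: Dict[str, List[Route]],
               blacklist: Set[str],
               destination: str) -> Optional[Route]:
    """Return a cached route to `destination` that avoids black-listed nodes.

    Returns None on a cache miss, in which case the caller would
    restart route discovery.
    """
    candidates = [
        route for route in route_cache.get(destination, [])
        if not any(node in blacklist for node in route)
    ]
    # Preferring the shortest clean route is an assumption of this sketch.
    return min(candidates, key=len) if candidates else None

cache = {"D": [["S", "A", "B", "D"], ["S", "A", "C", "E", "D"]]}
print(find_route(cache, {"B"}, "D"))       # ['S', 'A', 'C', 'E', 'D']
print(find_route(cache, {"B", "C"}, "D"))  # None
```

A `None` result corresponds to the cache miss in the text, where the route discovery phase is restarted.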

The packets generated by the DSR protocol were first processed
by the Watchdog; after that, they were handed down to the
MAC layer to be sent. The node was set to promiscuous mode,
and the packets received by the MAC layer were handed up
to the Watchdog. Only those packets that were addressed to this
node or needed to be forwarded by this node could pass through
the Watchdog. After that, the packets were handed up to the
Filter module, and finally arrived at the DSR protocol, where
they were processed.

Fig. 1. The architecture of the CMC scheme

B. Neighbor Manager 

The neighbor list recorded the currently active neighbor nodes. In
the neighbor list, each neighbor corresponded to one item, in which
the node’s ID, rating value and timeout value were stored.
The rating value corresponded to the reputation of the
neighbor and was initialized to 0.

The Neighbor Manager was used to maintain the list of
neighbors. The interface, which was set to promiscuous mode,
monitored the channel, and each time it received a packet,
it sent a copy to the Neighbor Manager. The Neighbor
Manager extracted the node ID from the packet and searched for it
in the neighbor list. If it was found, the corresponding timeout value
was updated; otherwise, the node was treated as a new neighbor,
and its ID was put into the neighbor list. If the timeout of an
item in the neighbor list expired (10 s in our experiment), the
node corresponding to this item was assumed to have moved
out of transmission range, and the item was deleted from the
neighbor list. If the node did not send any data within TNeib
(3 s in our experiment), the Neighbor Manager had to
broadcast a Hello message in order to avoid being deleted
from the neighbor lists of its neighbors.
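The bookkeeping described in this subsection can be sketched as below. Only the two timing constants (10 s and 3 s) come from the text; the class and method names, and the use of an explicit `now` argument as a stand-in for the simulator clock, are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Dict

NEIGHBOR_TIMEOUT = 10.0  # seconds before a silent neighbor is dropped (value from the text)
HELLO_INTERVAL = 3.0     # TNeib: longest silence before a Hello is due (value from the text)

@dataclass
class NeighborEntry:
    rating: float      # reputation of the neighbor, initialized to 0
    expires_at: float  # absolute time at which the entry times out

class NeighborManager:
    def __init__(self) -> None:
        self.neighbors: Dict[str, NeighborEntry] = {}
        self.last_sent = 0.0

    def on_packet_overheard(self, sender_id: str, now: float) -> None:
        """Refresh the sender's timeout, or add it as a new neighbor."""
        entry = self.neighbors.get(sender_id)
        if entry is None:
            self.neighbors[sender_id] = NeighborEntry(0.0, now + NEIGHBOR_TIMEOUT)
        else:
            entry.expires_at = now + NEIGHBOR_TIMEOUT

    def purge_expired(self, now: float) -> None:
        """Delete neighbors whose timeout has expired."""
        self.neighbors = {nid: e for nid, e in self.neighbors.items()
                          if e.expires_at > now}

    def on_packet_sent(self, now: float) -> None:
        """Record the time of this node's most recent transmission."""
        self.last_sent = now

    def hello_due(self, now: float) -> bool:
        """True if this node has been silent for at least HELLO_INTERVAL."""
        return now - self.last_sent >= HELLO_INTERVAL

nm = NeighborManager()
nm.on_packet_overheard("B", now=0.0)
nm.on_packet_overheard("C", now=4.0)
nm.purge_expired(now=11.0)
print(sorted(nm.neighbors))  # ['C']
```

In the usage lines, node B was last heard at t = 0 s and is dropped at t = 11 s, while node C, heard at t = 4 s, is still within its 10 s window.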

C. Watchdog 

The Watchdog was mainly used to monitor nodes in the neighborhood
to observe whether they forwarded packets, as
well as whether they modified the packet contents. The Watchdog
maintained one data structure: the packet monitoring buffer.
Packets that needed to be monitored were stored in the
packet monitoring buffer, and each packet corresponded to one
item. The item was composed of the content of the packet, the
ID of the expected forwarding node and the timeout value.

The packets sent by the routing layer (DSR protocol) were
first processed by the Watchdog. As long as the next hop node of a packet
was not the destination node, a copy of the packet was put
into the packet monitoring buffer.

The packets received by the interface were handed up to
the MAC layer, and then the MAC layer delivered them to the
Watchdog. Each time the Watchdog received a packet, it
searched for this packet in the packet monitoring buffer. If it was found
and the packet had not been tampered with, a positive event was sent
to the Reputation Manager, increasing the rating value
of the forwarding node. Then the Watchdog
checked the next hop field in the header of the packet. If it
matched the address of this node, the packet was handed
up to the Filter for further processing. Otherwise, the packet
was not addressed to this node and was discarded directly. If
the Watchdog failed to observe the forwarding behavior before
the packet in the packet monitoring buffer timed out, a negative
event was sent to the Reputation Manager to reduce
the rating value of the forwarding node.
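The buffer logic of the Watchdog can be sketched as follows. This is a simplified illustration under several assumptions: the `report` callback stands in for the interface to the Reputation Manager, packets are identified by a hypothetical integer `packet_id`, tampering is detected by comparing payloads only, the 1.0 s timeout is a placeholder (the text gives no value), and a tampered packet is simply left in the buffer until it times out. The hand-over of packets to the Filter is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class MonitoredPacket:
    payload: bytes           # content of the packet as it was sent
    expected_forwarder: str  # ID of the node expected to forward it
    expires_at: float        # absolute time at which monitoring gives up

class Watchdog:
    def __init__(self, report: Callable[[str, bool], None],
                 timeout: float = 1.0) -> None:
        # report(node_id, positive) is a hypothetical stand-in for the
        # Reputation Manager; the default timeout is a placeholder value.
        self.report = report
        self.timeout = timeout
        self.buffer: Dict[int, MonitoredPacket] = {}

    def on_send(self, packet_id: int, payload: bytes, next_hop: str,
                destination: str, now: float) -> None:
        """Buffer an outgoing packet unless the next hop is the destination."""
        if next_hop != destination:
            self.buffer[packet_id] = MonitoredPacket(
                payload, next_hop, now + self.timeout)

    def on_overhear(self, packet_id: int, payload: bytes, forwarder: str) -> None:
        """Report a positive event if a buffered packet was forwarded unmodified."""
        entry = self.buffer.get(packet_id)
        if (entry is not None and entry.expected_forwarder == forwarder
                and entry.payload == payload):
            del self.buffer[packet_id]
            self.report(forwarder, True)

    def on_tick(self, now: float) -> None:
        """Report a negative event for every buffered packet that timed out."""
        expired = [pid for pid, e in self.buffer.items() if e.expires_at <= now]
        for pid in expired:
            self.report(self.buffer.pop(pid).expected_forwarder, False)

events = []
wd = Watchdog(lambda node, positive: events.append((node, positive)))
wd.on_send(1, b"data", next_hop="B", destination="D", now=0.0)
wd.on_send(2, b"data", next_hop="B", destination="D", now=0.0)
wd.on_overhear(1, b"data", forwarder="B")
wd.on_tick(now=2.0)
print(events)  # [('B', True), ('B', False)]
```

In the usage lines, node B forwards packet 1 but not packet 2, so it earns one positive and one negative event.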

D. Common-neighbor Monitoring 

From the description of the Watchdog in the previous
section, we know that only nodes located on the route
can watch the forwarding behavior of the next hop node.
Studying the network topology carefully, we found that
nodes located at certain special positions could also watch the
forwarding behavior of the next hop node. So, the common-neighbor
monitoring technique was introduced. Each time the
Watchdog captured a packet, if its current forwarding node and
its next forwarding node were both in the neighbor list, the
packet was put into the packet monitoring buffer and
watched by the Watchdog.

Fig. 2. Common-neighbor monitoring technique 

As shown in Fig. 2, node S sends data to node D through
nodes A, B and C, and nodes M and N are both located in the
transmission range of node B and node C. When node B sends
a packet to node C, node M and node N are able to capture
the packet and find that the packet’s current forwarding node
(node B) and next forwarding node (node C) are both their
own neighbors. So nodes M and N put this packet into their
packet monitoring buffers and watch whether
this packet is forwarded.
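The decision rule of the common-neighbor technique reduces to a membership test on the neighbor list. The sketch below is illustrative only; the function name `should_monitor` and the concrete neighbor sets are assumptions, since the exact topology of Fig. 2 is not reproduced in the text.

```python
from typing import Set

def should_monitor(current_forwarder: str, next_forwarder: str,
                   neighbors: Set[str]) -> bool:
    """Common-neighbor rule: monitor an overheard packet only if both the
    node that just transmitted it and the node expected to forward it next
    are in this node's neighbor list."""
    return current_forwarder in neighbors and next_forwarder in neighbors

# Assumed neighbor sets for the example: M hears both B and C, A hears S and B.
neighbors_of_M = {"B", "C", "N"}
neighbors_of_A = {"S", "B"}
print(should_monitor("B", "C", neighbors_of_M))  # True
print(should_monitor("B", "C", neighbors_of_A))  # False
```

With these assumed sets, node M watches the packet sent from B to C, whereas node A cannot, because C is outside its transmission range.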

In order to punish non-cooperative nodes, Filters discard
all packets originating from them. As shown in Fig. 2, assume
that the source node S is a non-cooperative node and has been
detected by node A; as a punishment, all packets sent by node S
will be discarded by node A. To prevent the Watchdog from
regarding this kind of punishment as non-cooperation, we require
that the Watchdog does not monitor the forwarding behavior of
the first relaying node in the route. That is, node P will not
monitor the forwarding behavior of node A in Fig. 2.
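The monitoring rule of this section, together with the exemption for the first relaying node, can be sketched as follows. This is a minimal illustration rather than the ns-2 implementation used in the paper: the function name `should_monitor`, its arguments and the assumption that the route's source can be read from the packet header are all hypothetical.

```python
from typing import Set


def should_monitor(forwarder: str, next_hop: str, source: str,
                   neighbors: Set[str]) -> bool:
    """Decide whether an overheard packet goes into the monitoring buffer.

    forwarder: node that has just transmitted the packet
    next_hop:  node expected to forward it next
    source:    originator of the route (assumed readable from the header)
    neighbors: neighbor list of the observing node
    """
    # Exemption: when the source itself transmits, the next hop is the
    # first relaying node, which may be dropping packets as a punishment.
    if forwarder == source:
        return False
    # Common-neighbor rule: both ends of the hop must be our neighbors.
    return forwarder in neighbors and next_hop in neighbors
```

With the topology of Fig. 2 and assumed neighbor lists {B, C} for node M and {S, A} for node P, `should_monitor("B", "C", "S", {"B", "C"})` is true, so M watches node C, whereas `should_monitor("S", "A", "S", {"S", "A"})` is false, so P does not watch node A.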

E. Reputation Management 

Reputation Management was responsible for updating each
node's reputation value. A good reputation system [14] should
have the following characteristics: the reputation value
accurately reflects the behavior of the node; a node's recent
actions have a greater impact on the reputation value than its
past behavior; and non-cooperative nodes can be diagnosed
quickly. So, the Reputation Manager calculated the reputation
values for nodes in accordance with equation (1):

\[
R_{\mathrm{new}} =
\begin{cases}
0.95 \times R_{\mathrm{old}} + 0.05, & \text{if positive event} \\
0.90 \times R_{\mathrm{old}} - 0.1, & \text{if negative event}
\end{cases}
\tag{1}
\]

If the Reputation Manager received a positive event, the
new reputation value of the node was the discounted old value
(multiplied by 0.95) plus 0.05; if it received a negative event,
the new reputation value was the discounted old value (multiplied
by 0.9) minus 0.1. Finally, if the reputation value of the node
fell below a threshold (-0.5 in our experiment), the node was
considered non-cooperative and put into the Black-list.
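A minimal sketch of the update rule in equation (1) and the threshold test is given below. The class and function names are hypothetical, and the initial reputation value of 0 for a previously unseen node is an assumption (the text only implies it when it says values are "not reset to 0").

```python
from dataclasses import dataclass, field
from typing import Dict, Set

BLACKLIST_THRESHOLD = -0.5  # threshold used in the experiments


def update_reputation(r_old: float, positive: bool) -> float:
    """Apply equation (1) to a single positive or negative event."""
    if positive:
        return 0.95 * r_old + 0.05
    return 0.90 * r_old - 0.1


@dataclass
class ReputationManager:
    ratings: Dict[str, float] = field(default_factory=dict)
    blacklist: Set[str] = field(default_factory=set)

    def report(self, node: str, positive: bool) -> float:
        # Assumption: a node that has not been rated yet starts at 0.0.
        r_new = update_reputation(self.ratings.get(node, 0.0), positive)
        self.ratings[node] = r_new
        if r_new < BLACKLIST_THRESHOLD:
            self.blacklist.add(node)
        return r_new
```

Under these assumptions the value stays within [-1, 1], since 1 and -1 are the fixed points of the two branches. After n consecutive negative events a node starting at 0 has the value -(1 - 0.9^n), so it crosses the -0.5 threshold on the seventh event.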

F. Filter 

The Filter screened passing routing control messages
(RREQs or RREPs) and data packets against the Black-list,
which contains the non-cooperative nodes. If a routing control
message contained a node in the Black-list, the route being
discovered was considered “bad” and the control message was
discarded. In addition, the Filter checked all passing data
packets. As a punishment to non-cooperative nodes, each packet
whose source node was in the Black-list was thrown away. See
Table I for the detailed algorithm.

In the route discovery phase, all nodes located between
the source and destination nodes performed this route filtering.
Therefore, the routes found by the source node bypassed
the non-cooperative nodes as far as possible, increasing the
success rate of sending data.

TABLE I 

FILTER ALGORITHM 

Receive a packet;
if RREP or RREQ, and contains nodes in black-list then
    Suppress this packet, and return;
else if data packet, and source node in black-list then
    Suppress this packet, and return;
end if
Hand this packet to route layer;
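The steps of Table I can be written as a small runnable function. This is only a sketch: the `Packet` fields and the function name `filter_packet` are hypothetical, and the packet is reduced to the three pieces of information the algorithm inspects.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Packet:
    kind: str                          # "RREQ", "RREP" or "DATA"
    source: str                        # originating node
    route_nodes: Tuple[str, ...] = ()  # nodes listed in a control message


def filter_packet(packet: Packet, blacklist: Set[str]) -> bool:
    """Return True if the packet should be handed to the routing layer."""
    if packet.kind in ("RREQ", "RREP"):
        # A control message naming any black-listed node marks a bad route.
        if any(node in blacklist for node in packet.route_nodes):
            return False
    elif packet.kind == "DATA" and packet.source in blacklist:
        # Punishment: data originated by a black-listed node is dropped.
        return False
    return True
```

Returning a boolean instead of forwarding directly keeps the decision separate from the routing layer, which makes the rule easy to check in isolation.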



G. Second Chance Mechanism 

References [7], [19] and [20] pointed out that several factors,
such as signal collisions, network congestion and temporary link
failures, can affect the detection results of the Watchdog and
may lead to cooperative nodes being wrongly marked as
non-cooperative (in our experiments, we also found that when
the network load was heavy, network congestion was likely to
happen and cooperative nodes could appear in the Black-list).
In addition, nodes that had been detected as non-cooperative
early on may behave cooperatively later. To allow nodes that
had been isolated from the network to re-join it, giving them a
“rehabilitation” opportunity, CMC introduced the second chance
mechanism. After a fixed period of time, nodes were released from
the Black-list, but their reputation values were not reset to
0; they kept their current values. Once these nodes behaved
non-cooperatively again, they could be quickly re-added to the
Black-list.
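The release step can be sketched as a Black-list whose entries expire. The class name, method names and the 100 s release period are hypothetical; the paper only states that the period is fixed. Reputation values are assumed to be stored elsewhere and are deliberately left untouched on release.

```python
from typing import Dict, List

RELEASE_AFTER_S = 100.0  # hypothetical value: the text says "a fixed period"


class SecondChanceBlacklist:
    """Black-list whose entries expire after a fixed period."""

    def __init__(self, release_after_s: float = RELEASE_AFTER_S) -> None:
        self.release_after_s = release_after_s
        self._listed_at: Dict[str, float] = {}

    def add(self, node: str, now_s: float) -> None:
        # Keep the original listing time if the node is already listed.
        self._listed_at.setdefault(node, now_s)

    def release_expired(self, now_s: float) -> List[str]:
        """Remove and return nodes listed for at least release_after_s."""
        released = [node for node, t in self._listed_at.items()
                    if now_s - t >= self.release_after_s]
        for node in released:
            del self._listed_at[node]
        return released

    def __contains__(self, node: str) -> bool:
        return node in self._listed_at
```

Because the reputation value is kept, a released node sits just below the threshold. For example, a node released at -0.52 that causes one more negative event moves to 0.90 × (-0.52) - 0.1 = -0.568 under equation (1), which is again below -0.5, so it is re-listed at once.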

III. SIMULATION AND RESULTS ANALYSIS 

In this section, ns-2 [21] was used to verify the CMC
scheme and to observe its impact on network performance.
In the simulation, the setdest tool from CMU was used to
generate movement scenarios for the nodes. The traffic
generation tool from CMU could not meet our requirements and
was modified so that: the source and destination nodes of each
connection are randomly distributed in the network; the start
time of each connection is uniformly distributed in [0 s, 1000 s];
and the duration of each connection is uniformly distributed in
[50 s, 100 s].
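The three traffic requirements above can be sketched as a small generator. It is not the modified CMU tool: the names are hypothetical, and uniform selection of two distinct endpoints is an assumption, since the text only says the endpoints are randomly distributed.

```python
import random
from typing import List, NamedTuple, Optional


class Connection(NamedTuple):
    src: int
    dst: int
    start_s: float
    duration_s: float


def generate_connections(num_nodes: int, num_connections: int,
                         seed: Optional[int] = None) -> List[Connection]:
    """Draw CBR connections with the start/duration distributions above."""
    rng = random.Random(seed)
    connections = []
    for _ in range(num_connections):
        # Assumption: endpoints are distinct and chosen uniformly.
        src, dst = rng.sample(range(num_nodes), 2)
        start_s = rng.uniform(0.0, 1000.0)     # start time in [0 s, 1000 s]
        duration_s = rng.uniform(50.0, 100.0)  # duration in [50 s, 100 s]
        connections.append(Connection(src, dst, start_s, duration_s))
    return connections
```

Calling `generate_connections(50, 50, seed=1)` gives a reproducible set of 50 connections for the 50-node scenarios. Connections that start close to 1000 s would run past the end of the simulation; how the original tool handled that case is not stated.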

The basic simulation parameters are shown in Table II.
The sending and receiving transmission ranges are 250 m, the
maximum transmission rate is 2 Mbit/s, the simulation duration
is 1000 seconds, the traffic type is CBR, and the size of each
CBR packet is 512 bytes. In the simulation, the proportion of
non-cooperative nodes is varied from 10% to 60%, and the impact
of non-cooperative nodes on network performance is observed.
For each group of parameters, the simulation is run 10 times
and the averaged result is used.

TABLE II 

BASIC PARAMETERS FOR SIMULATION 

Parameter              Value
Simulation time        1000 s
Transmission range     250 m
Receiver range         250 m
Carrier sense range    550 m
Maximum pause time     100 s
Traffic type           CBR
Packet size            512 bytes
CBR rate               5 pkt/s

The following two metrics are used to assess the network
performance:

• throughput: the ratio of the number of packets that
successfully arrived at the destination nodes to the number of
packets sent by the source nodes.

• forwarding throughput: the throughput computed after
removing the packets that were sent directly to their
destination nodes.
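Both metrics can be computed from per-packet records, as in the following sketch. The record layout is hypothetical; in particular, the `direct` flag assumes that the trace tells us whether a packet was sent straight from source to destination without any relay.

```python
from typing import Iterable, NamedTuple


class PacketRecord(NamedTuple):
    delivered: bool  # True if the packet reached its destination
    direct: bool     # True if sent source -> destination with no relay


def throughput(records: Iterable[PacketRecord]) -> float:
    """Delivered packets divided by sent packets (0.0 if nothing was sent)."""
    records = list(records)
    if not records:
        return 0.0
    return sum(r.delivered for r in records) / len(records)


def forwarding_throughput(records: Iterable[PacketRecord]) -> float:
    """Throughput restricted to packets that needed at least one relay."""
    return throughput(r for r in records if not r.direct)
```

For four packets of which two are direct and delivered, one is relayed and delivered, and one is relayed and lost, the throughput is 3/4 while the forwarding throughput is 1/2. This illustrates why the second metric is more sensitive to non-cooperative relays.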

Fig. 3. Throughput in static network (670 × 670 m², 50 nodes). [Axes: throughput versus fraction of misbehaving nodes; curves: CMC, Pathrater, Defenseless.]


Fig. 4. Forwarding throughput in static network (670 × 670 m², 50 nodes)

First, the impact of non-cooperative nodes on throughput
in a static network is studied. The simulation region is
670 × 670 m², 50 nodes are randomly distributed in it, and the
number of CBR connections is 50. The simulation results are
shown in Fig. 3 to Fig. 5; the curve named defenseless means
that no cooperation scheme is used, and the curve named
pathrater denotes the scheme proposed in [7]. As can be seen in
Fig. 3, as the number of non-cooperative nodes in the network
increases, all three curves descend. The throughput of the CMC
scheme shows no obvious improvement over that of the pathrater
scheme, although once the proportion of non-cooperative nodes
exceeds 50%, the CMC scheme shows a slight advantage. The
main reason is that, in a static network, the pathrater scheme
will eventually detect all the non-cooperative nodes (only more
slowly), and the routes subsequently chosen by source nodes can
bypass these non-cooperative nodes.

Fig. 5. Average route length in static network (670 × 670 m², 50 nodes)

Forwarding throughput is shown in Fig. 4. As the proportion
of non-cooperative nodes increases, the curve corresponding to
defenseless declines sharply, which indicates that
non-cooperative nodes have a significant impact on network
performance. In addition, the figure shows that, compared with
the pathrater scheme, the CMC scheme clearly improves
forwarding throughput.

In the experiments, the average route lengths of the packets
that arrived at destination nodes were also recorded, and the
results are shown in Fig. 5. The CMC and pathrater schemes
have comparable average route lengths, which are clearly
greater than that of the defenseless scheme. The reason is
that, in the CMC and pathrater schemes, source nodes with data
to send do not select the shortest routes but choose routes
that are able to bypass the non-cooperative nodes.

Next, throughput in a small dynamic network is studied.
The random waypoint model is chosen for node movement, the
maximum node velocity is 10 m/s, and the maximum pause time
is 100 s. The simulation region is 670 × 670 m², the number
of nodes is 50, and the number of CBR connections is 50.
Simulation results are shown in Fig. 6 to Fig. 8. When the
proportion of non-cooperative nodes in the network is greater
than 30%, the CMC scheme is superior to the pathrater scheme.
In addition, compared with the static network, all three curves
in the dynamic network have declined. The main reason is that,
in a dynamic network, node movement often breaks links,
resulting in frequent route changes, which lead to some packet
loss.

Fig. 8 shows the average route length of packets in the
dynamic network. The CMC and pathrater schemes have
longer average route lengths than the defenseless scheme.

Last, throughput in a large dynamic network is studied:
the number of nodes is increased to 100, and the simulation
region is increased from 670 × 670 m² to 1200 × 1200 m².
Simulation results are shown in Fig. 9 to Fig. 11. The CMC
scheme is clearly superior to the pathrater scheme, mainly
because the expansion of the network makes the average route
length larger. In Fig. 11, the average route length of the CMC
scheme is more than 3.6 and that of the pathrater scheme is
also greater than 3.4, which means that, on average, each
packet is relayed by about 2.5 nodes. In the pathrater scheme,
the source nodes can only find non-cooperative nodes within
their one-hop scope. For nodes located two or more hops away,
the source nodes are not able to judge their behavior. Thus,
the source nodes are likely to choose routes containing
non-cooperative nodes, decreasing the throughput. In contrast,
in the CMC scheme, the routing control messages are filtered,
which makes the routes chosen by the source nodes more likely
to bypass the non-cooperative nodes.

Fig. 6. Throughput in small dynamic network (670 × 670 m², 50 nodes)

Fig. 7. Forwarding throughput in small dynamic network (670 × 670 m², 50 nodes)

Fig. 8. Average route length in small dynamic network (670 × 670 m², 50 nodes)

Fig. 9. Throughput in large dynamic network (1200 × 1200 m², 100 nodes)

IV. CONCLUSION AND FUTURE WORK 

In this paper, the CMC method was proposed, which can
quickly detect non-cooperative nodes in mobile ad hoc
networks, isolating them and reducing their impact on network
performance. The common-neighbor monitoring technique speeds
up the detection of non-cooperative nodes, allowing the system
to isolate selfish nodes quickly. In the CMC method, all nodes
between the source and destination nodes suppress control
messages that contain non-cooperative nodes, further reducing
the impact of non-cooperative nodes on the network.
The simulation results showed that, when non-cooperative
nodes exist in the network, CMC can significantly improve
node throughput and network performance.

Fig. 10. Forwarding throughput in large dynamic network (1200 × 1200 m², 100 nodes)


Fig. 11. Average route length in large dynamic network (1200 × 1200 m², 100 nodes)

In the next step, we will port the code from ns-2 to Linux to
study the performance of CMC in a real-life environment.
We are also considering the impact that different kinds of
end-user applications (e.g. video and audio) have on the
performance of CMC.

REFERENCES 

[1] I. Chlamtac, M. Conti, and J. Liu, “Mobile ad hoc networking: imperatives 

and challenges,” Ad Hoc Networks, vol. 1, no. 1, pp. 13–64, 2003. 

[2] S. Basagni, M. Conti, S. Giordano, and I. Stojmenovic, Mobile
Ad Hoc Networking. New Jersey: Wiley-IEEE Press, 2004.

[3] D. Johnson, D. Maltz, Y. Hu, and J. Jetcheva, “The dynamic source 

routing protocol for mobile ad hoc networks (DSR),” 2002. 

[4] C. Perkins and E. Royer, “Ad-hoc on-demand distance vector routing,”
in Proceedings of the 2nd IEEE Workshop on Mobile Computing Systems
and Applications, vol. 2, 1999, pp. 90–100.


[5] D. J. G, “Game-theoretic power management in mobile ad hoc networks,” 

Ph.D thesis, Carnegie Mellon University Department of Electrical 

and Computer Engineering, Pittsburgh, Pennsylvania, Aug. 2004. 

[6] M. Felegyhazi, J. Hubaux, and L. Buttyan, “Nash equilibria of packet 

forwarding strategies in wireless ad hoc networks,” IEEE Transactions 

on Mobile Computing, vol. 5, no. 5, pp. 463–476, 2006. 

[7] S. Marti, T. Giuli, K. Lai, and M. Baker, “Mitigating routing misbehavior 

in mobile ad hoc networks,” in Proceedings of the 6th annual 

international conference on Mobile computing and networking. ACM 

New York, NY, USA, 2000, pp. 255–265. 

[8] G. Marias, P. Georgiadis, D. Flitzanis, and K. Mandalas, “Cooperation 

enforcement schemes for MANETs: A survey,” Wireless Communications 

and Mobile Computing, vol. 6, no. 3, pp. 319–332, 2006. 

[9] Y. Yoo and D. Agrawal, “Why does it pay to be selfish in a MANET?” 

IEEE Wireless Communications, vol. 13, no. 6, pp. 87–97, 2006. 

[10] L. Buttyan and J. Hubaux, “Nuglets: a virtual currency to stimulate 

cooperation in self-organized mobile ad hoc networks,” ICCA, Swiss 

Federal Institute of Technology, 2001. 

[11] L. Anderegg and S. Eidenbenz, “Ad hoc-VCG: a truthful and
cost-efficient routing protocol for mobile ad hoc networks with selfish
agents,” in Proceedings of the 9th annual international conference on
Mobile computing and networking. ACM New York, NY, USA, 2003,
pp. 245–259.

[12] Y. Wang and M. Singhal, “On improving the efficiency of truthful routing 

in MANETs with selfish nodes,” Pervasive and Mobile Computing, 

vol. 3, no. 5, pp. 537–559, 2007. 

[13] S. Eidenbenz, G. Resta, and P. Santi, “The COMMIT Protocol for 

Truthful and Cost-Efficient Routing in Ad Hoc Networks with Selfish 

Nodes,” IEEE Transactions on Mobile Computing, vol. 7, no. 1, pp. 

19–33, 2008. 

[14] S. Buchegger, J. Mundinger, and J. Le Boudec, “Reputation Systems
for Self-Organized Networks: Lessons Learned,” IEEE Technology &
Society Magazine, 2007.

[15] S. Buchegger, “Coping with misbehavior in mobile ad-hoc networks,” 

Ph.D. dissertation, Ecole Polytechnique Federale DE Lausanne, 2004. 

[16] S. Buchegger and J. Le Boudec, “Self-policing mobile ad hoc networks
by reputation systems,” IEEE Communications Magazine, vol. 43, no. 7,
pp. 101–107, 2005.

[17] Q. He, D. Wu, and P. Khosla, “A secure incentive architecture for ad 

hoc networks,” Wireless Communications and Mobile Computing, vol. 6, 

no. 3, 2006. 

[18] K. Liu, J. Deng, P. Varshney, and K. Balakrishnan, “An 

acknowledgment-based approach for the detection of routing 

misbehavior in MANETs,” IEEE Transactions on Mobile Computing, 

vol. 6, no. 5, pp. 536–550, 2007. 

[19] D. Djenouri and N. Badache, “Struggling against selfishness and black 

hole attacks in MANETs,” Wireless Communications and Mobile Computing, 

vol. 8, no. 6, 2008. 

[20] H. J. Y, “Cooperation in mobile ad hoc networks,” 

http://www.cs.fsu.edu/research/reports/TR-050111.pdf, January 2005. 

[21] K. Fall and K. Varadhan, “The ns manual (formerly ns notes and
documentation),” The VINT Project, 2008.

Jianli Guo received his BS and MS in computer science and technology from
Harbin Institute of Technology (HIT) in 2002 and 2004, respectively. He is
now a Ph.D. student at HIT. His research interests include ad hoc networks
and wireless sensor networks.

Hongwei Liu is a professor at HIT. His research interests include
fault-tolerant computing technology, ad hoc networks and wireless sensor
networks.

Xiaozong Yang is a professor at HIT. His research interests include
fault-tolerant computing technology, computer architecture, ad hoc networks
and wireless sensor networks.


Systemic Risk Assessment using a Non-stationary 

Fractional Dynamic Stochastic Model for the 

Analysis of Economic Signals 

Jonathan M Blackledge, Fellow, IET, Fellow, IoP, Fellow, IMA, Fellow, RSS 

Abstract— This paper considers the Fractal Market Hypothesis 

(FMH) for assessing the risk(s) in developing a financial portfolio 

based on data that is available through the Internet from an 

increasing number of sources. Most financial risk management 

systems are still based on the Efficient Market Hypothesis which 

often fails due to the inaccuracies of the statistical models that 

underpin the hypothesis, in particular, that financial data are 

based on stationary Gaussian processes. The FMH considered 

in this paper assumes that financial data are non-stationary and 

statistically self-affine so that a risk analysis can, in principle, be

applied at any time scale provided there is sufficient data to make

the output of an FMH analysis statistically significant. This paper

considers a numerical method and an algorithm for accurately 

computing a parameter - the Fourier dimension - that serves 

in the assessment of a financial forecast and is applied to data 

taken from the Dow Jones and FTSE financial indices. A more 

detailed case study is then presented based on an FMH analysis

of Sub-Prime Credit Default Swap Market ABX Indices. 

Index Terms— Risk assessment of economy, Risk assessment 

statistics and numerical data, Fractal Market Hypothesis, FTSE, 

Dow Jones and ABX index. 


I. INTRODUCTION

Attempts to develop stochastic models for financial time
series are commonplace in financial mathematics and
econometrics in general. Financial time series are essentially
digital signals composed of ‘tick data’ that provides traders
with daily tick-by-tick data of trade price, trade time, and
volume traded, for example, at different sampling rates [1], [2].
Stochastic financial models can be traced back to the early
Twentieth Century when Louis Bachelier [3] proposed that
fluctuations in the prices of stocks and shares (which appeared
to be yesterday’s price plus some random change) could be viewed
in terms of random walks in which price changes were entirely
independent of each other. Thus, one of the simplest models
for price variation is based on the sum of independent random
numbers. This is the basis for Brownian motion [4] in which
the random numbers are considered to conform to a normal
distribution. This model is the basis for the Efficient Market
Hypothesis (EMH) which has a number of questionable
assumptions as discussed in the following section. In this paper,
we consider a method for processing financial time series
data based on the Fractal Market Hypothesis. The underlying
rationale for this model is discussed and example results are
presented to illustrate the ability of the model to provide
an improved risk assessment of an economy with regard to
predicting the characteristics of an economic time series based
on a risk assessment statistic computed from numerical data. A
case study is presented that is based on the sub-prime credit
default swap market ABX index, which is acknowledged as
being one of the principal markets whose collapse triggered
the current global recession.

Manuscript completed in December, 2009. The work reported in this paper
is supported by the Science Foundation Ireland.

Jonathan Blackledge (jonathan.blackledge@dit.ie) is the Stokes Professor
of Digital Signal Processing, School of Electrical Engineering
Systems, Faculty of Engineering, Dublin Institute of Technology
(http://eleceng.dit.ie/blackledge).

II. BROWNIAN MOTION AND THE EFFICIENT MARKET 

HYPOTHESIS 

Random walk models, which underpin the so-called Efficient

Market Hypothesis (EMH) [5]-[12], have been the basis

for financial time series analysis since the work of Bachelier 

in the late Nineteenth Century. Although the Black-Scholes 

equation [13], developed in the 1970s for valuing options, is 

deterministic (one of the first financial models to achieve determinism), 

it is still based on the EMH, i.e. stationary Gaussian 

statistics. The EMH is based on the principle that the current 

price of an asset fully reflects all available information relevant 

to it and that new information is immediately incorporated 

into the price. Thus, in an efficient market, the modelling 

of asset prices is concerned with modelling the arrival of 

new information. New information must be independent and 

random, otherwise it would have been anticipated and would 

not be new. The arrival of new information can send ‘shocks’ 

through the market (depending on the significance of the 

information) as people react to it and then to each other’s 

reactions. The EMH assumes that there is a rational and 

unique way to use the available information and that all agents 

possess this knowledge. Further, the EMH assumes that this 

‘chain reaction’ happens effectively instantaneously. These 

assumptions are clearly questionable at any and all levels of 

a complex financial system. 

The EMH implies independence of price increments and is 

typically characterised by a normal or Gaussian Probability

Density Function (PDF) which is chosen because most price 

movements are presumed to be an aggregation of smaller 

ones, the sums of independent random contributions having a 

Gaussian PDF. However, it has long been known that financial 

time series do not follow random walks. The shortcomings 

of the EMH model include: failure of the independence and 

Gaussian distribution of increments assumption, clustering, 

apparent non-stationarity and failure to explain momentous


financial events such as ‘crashes’ leading to recession and, 

in some extreme cases, depression. These limitations have 

prompted a new class of methods for investigating time series 

obtained from a range of disciplines. For example, Re-scaled 

Range Analysis (RSRA), e.g. [14]-[16], which is essentially 

based on computing the Hurst exponent [17], is a useful tool 

for revealing some well disguised properties of stochastic time 

series such as persistence (and anti-persistence) characterized 

by non-periodic cycles. Non-periodic cycles correspond to 

trends that persist for irregular periods but with a degree of 

statistical regularity often associated with non-linear dynamical 

systems. RSRA is particularly valuable because of its 

robustness in the presence of noise. The principal assumption 

associated with RSRA is concerned with the self-affine or 

fractal nature of the statistical character of a time-series rather 

than the statistical ‘signature’ itself. Ralph Elliott first reported 

on the fractal properties of financial data in 1938 (e.g. [18] and 

reference therein). He was the first to observe that segments 

of financial time series data of different sizes could be scaled 

in such a way that they were statistically the same, producing

so-called Elliott waves.
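As an illustration of the rescaled range idea, the following sketch estimates the Hurst exponent as the slope of log(R/S) against log(window length). It is a textbook-style outline rather than the procedure of [14]-[17]: the function names, the use of non-overlapping windows and the population standard deviation are assumptions, and the input is assumed to be a series of increments (for example daily price changes) rather than price levels.

```python
import math
import random
from typing import Sequence


def rescaled_range(series: Sequence[float]) -> float:
    """Return R/S for one window: the range of the cumulative
    mean-adjusted sums divided by the standard deviation."""
    n = len(series)
    mean = sum(series) / n
    cumulative, total = [], 0.0
    for value in series:
        total += value - mean
        cumulative.append(total)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return spread / std if std > 0 else 0.0


def hurst_exponent(series: Sequence[float],
                   window_sizes: Sequence[int]) -> float:
    """Least-squares slope of log(mean R/S) against log(window size).

    Needs at least two window sizes that fit inside the series.
    """
    xs, ys = [], []
    for w in window_sizes:
        chunks = [series[i:i + w] for i in range(0, len(series) - w + 1, w)]
        values = [rs for rs in map(rescaled_range, chunks) if rs > 0]
        if values:
            xs.append(math.log(w))
            ys.append(math.log(sum(values) / len(values)))
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator
```

For independent increments, such as `[random.Random(0).gauss(0.0, 1.0) for _ in range(2048)]` with window sizes 16 to 256, the estimate should lie in the neighbourhood of 0.5, although short windows are known to bias it. Values well above 0.5 indicate persistence and values below 0.5 indicate anti-persistence.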

III. RISK ASSESSMENT AND REPEATING ECONOMIC 

PATTERNS 

A good stochastic financial model should ideally consider 

all the observable behaviour of the financial system it is 

attempting to model. It should therefore be able to provide 

some predictions on the immediate future behaviour of the 

system within an appropriate confidence level. Predicting the 

markets has become (for obvious reasons) one of the most 

important problems in financial engineering. Although, at least 

in principle, it might be possible to model the behaviour of 

each individual agent operating in a financial market, one 

can never be sure of obtaining all the necessary information 

required on the agents themselves and their modus operandi. 

This principle plays an increasingly important role as the 

scale of the financial system, for which a model is required, 

increases. Thus, while quasi-deterministic models can be of 

value in the understanding of micro-economic systems (with 

known ‘operational conditions’), in an ever increasing global 

economy (in which the operational conditions associated with 

the fiscal policies of a given nation state are increasingly open), 

we can take advantage of the scale of the system to describe 

its behaviour in terms of functions of random variables. 

A. Elliott Waves

The stochastic nature of financial time series is well known 

from the values of the major stock market indices such as the

FTSE (Financial Times Stock Exchange) in the UK and the Dow

Jones in the US, which are frequently quoted. A principal aim

of investors is to attempt to obtain information that can provide 

some confidence in the immediate future of the stock markets 

often based on patterns of the past. One of the principal components 

of this aim is based on the observation that there are 

‘waves within waves’ and ‘events within events’ that appear to 

permeate financial signals when studied with sufficient detail 

and imagination. It is these repeating patterns that occupy both 

the financial investor and the systems modeller alike and it is 

clear that although economies have undergone many changes 

in the last one hundred years, the dynamics of market data 

do not appear to change significantly (ignoring scale). For 

example, with data obtained from [19], Figure 1 shows the rescaled 

signals and associated ‘macrotrends’ (i.e. normalised 

time series and associated time series after application of 

a Gaussian lowpass filter) associated with FTSE Close-of- 

Day (COD) illustrating the ‘development’ of three different 

‘crashes’; those of 1987, 1997 and the most recent crash of 

2007. The macrotrends are computed by filtering each signal 

in Fourier space using a Gaussian lowpass filter exp(−βω²)

with β = 0.1 where ω is the angular frequency. 
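The macrotrend computation can be sketched as follows: transform the signal, multiply each Fourier coefficient by exp(−βω²), transform back and rescale to [0, 1]. The function names are hypothetical and a naive discrete Fourier transform is used to keep the example free of external libraries. The units of ω are not stated in the text; the sketch assumes radians per sample, so the amount of smoothing obtained with β = 0.1 may differ from that in Figure 1.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[complex]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform (adequate for a sketch)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum: Sequence[complex]) -> List[complex]:
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def macrotrend(signal: Sequence[float], beta: float = 0.1) -> List[float]:
    """Apply the Gaussian lowpass filter exp(-beta * omega^2) in Fourier space."""
    n = len(signal)
    filtered = []
    for k, coeff in enumerate(dft(signal)):
        k_signed = k if k <= n // 2 else k - n  # signed frequency index
        omega = 2.0 * math.pi * k_signed / n    # assumed unit: rad/sample
        filtered.append(coeff * math.exp(-beta * omega * omega))
    return [value.real for value in idft(filtered)]


def normalise(signal: Sequence[float]) -> List[float]:
    """Rescale to [0, 1]; a constant signal maps to all zeros."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return [0.0 for _ in signal]
    return [(v - lo) / (hi - lo) for v in signal]
```

A typical call would be `normalise(macrotrend(close_of_day_values))`, where `close_of_day_values` is a hypothetical list of daily index values. Because the filter depends only on ω², it is real and even, so a real input signal yields a real output. For series of several thousand samples a fast Fourier transform would be the natural replacement for the naive transform.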

Fig. 1. Evolution of the 1987, 1997 and 2007 financial crashes. Normalised
data (left) and macrotrends (right; the data has been smoothed and
rescaled to values between 0 and 1 inclusive) of the daily FTSE value
(close-of-day) for 02-04-1984 to 24-12-1987 (blue), 05-04-1994 to
24-12-1997 (green) and 02-04-2004 to 24-09-2007 (red).

The similarity in behaviour of these signals is remarkable 

and clearly indicates a wavelength of approximately 1000 

days. This is indicative of the quest to understand economic 

signals in terms of some universal phenomenon from which 

appropriate (macro) economic models can be generated. In an 

efficient market, only the revelation of some dramatic information 

can cause a crash, yet post-mortem analysis of crashes

typically fails to (convincingly) tell us what this information

must have been. 

One cause of correlations in market price changes (and 

volatility) is mimetic behaviour, known as herding. In general, 

market crashes happen when large numbers of agents place sell 

orders simultaneously creating an imbalance to the extent that 

market makers are unable to absorb the other side without 

lowering prices substantially. Most of these agents do not 

communicate with each other, nor do they take orders from 

a leader. In fact, most of the time they are in disagreement, 

and submit roughly the same amount of buy and sell orders. 

This is a healthy non-crash situation; it is a diffusive (random-walk)

process which underlies the EMH and financial portfolio

rationalization. 

B. Non-equilibrium Systems 

Financial markets can be considered to be non-equilibrium 

systems because they are constantly driven by transactions that


occur as the result of new fundamental information about firms 

and businesses. They are complex systems because the market 

also responds to itself, often in a highly non-linear fashion, and 

would carry on doing so (at least for some time) in the absence 

of new information. The ‘price change field’ is highly non-linear

and very sensitive to exogenous shocks and it is probable

that all shocks have a long term effect. Market transactions 

generally occur globally at the rate of hundreds of thousands 

per second. It is the frequency and nature of these transactions 

that dictate stock market indices, just as it is the frequency and 

nature of the sand particles that dictates the statistics of the 

avalanches in a sand pile. These are all examples of random 

scaling fractals [21]-[26]. 

IV. THE FRACTAL MARKET HYPOTHESIS 

Developing mathematical models to simulate stochastic 

processes has an important role in financial analysis and 

information systems in general where it should be noted that 

information systems are now one of the most important aspects 

in terms of regulating financial systems, e.g. [27]-[30]. A good 

stochastic model is one that accurately predicts the statistics 

we observe in reality, and one that is based upon some well 

defined rationale. Thus, the model should not only describe 

the data, but also help to explain and understand the system. 

There are two principal criteria used to define the characteristics 

of a stochastic field: (i) the PDF or the Characteristic

Function (i.e. the Fourier transform of the PDF); (ii) the Power

Spectral Density Function (PSDF). The PSDF is the function

that describes the envelope or shape of the power spectrum of 

a signal. In this sense, the PSDF is a measure of the field 

correlations. The PDF and the PSDF are two of the most 

fundamental properties of any stochastic field and various 

terms are used to convey these properties. For example, the 

term ‘zero-mean white Gaussian noise’ refers to a stochastic 

field characterized by a PSDF that is effectively constant over 

all frequencies (hence the term ‘white’ as in ‘white light’) and 

has a PDF with a Gaussian profile whose mean is zero. 

Stochastic fields can of course be characterized using transforms 

other than the Fourier transform (from which the PSDF 

is obtained) but the conventional PDF-PSDF approach serves 

many purposes in stochastic systems theory. However,

there is no general connectivity between the PSDF

and the PDF either in terms of theoretical prediction and/or 

experimental determination. It is not generally possible to 

compute the PSDF of a stochastic field from knowledge of 

the PDF or the PDF from the PSDF. Hence, in general, the 

PDF and PSDF are fundamental but non-related properties 

of a stochastic field. However, for some specific statistical 

processes, relationships between the PDF and PSDF can 

be found, for example, between Gaussian and non-Gaussian 

fractal processes [31] and for differentiable Gaussian processes 

[32]. 

There are two conventional approaches to simulating a 

stochastic field. The first of these is based on predicting the 

PDF (or the Characteristic Function) theoretically (if possible). 

A pseudo random number generator is then designed whose 

output provides a discrete stochastic field that is characteristic 

of the predicted PDF. The second approach is based on 

considering the PSDF of a field which, like the PDF, is ideally 

derived theoretically. The stochastic field is then typically 

simulated by filtering white noise. A ‘good’ stochastic model 

is one that accurately predicts both the PDF and the PSDF 

of the data. It should take into account the fact that, in 

general, stochastic processes are non-stationary. In addition, it 

should, if appropriate, model rare but extreme events in which 

significant deviations from the norm occur. 
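The two simulation approaches described above can be illustrated with a minimal sketch. The exponential PDF, the Gaussian-shaped transfer function and the names `sample_from_pdf` and `filter_white_noise` are assumptions chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def sample_from_pdf(n_samples, rate=2.0, seed=0):
    """Approach 1: draw samples with a prescribed PDF.

    Inverse-transform sampling for p(x) = rate * exp(-rate * x),
    an illustrative choice of target PDF.
    """
    rng = np.random.default_rng(seed)
    uniform = rng.random(n_samples)        # uniform deviates in [0, 1)
    return -np.log1p(-uniform) / rate      # inverse of the exponential CDF

def filter_white_noise(n_samples, transfer, seed=0):
    """Approach 2: shape the spectrum of white Gaussian noise.

    `transfer` maps frequencies (cycles/sample) to real filter gains, so the
    output power spectrum is the noise power spectrum times transfer**2.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_samples)
    freqs = np.fft.rfftfreq(n_samples)
    spectrum = np.fft.rfft(noise) * transfer(freqs)
    return np.fft.irfft(spectrum, n=n_samples), noise

x = sample_from_pdf(100_000)
field, noise = filter_white_noise(1024, lambda f: np.exp(-(f / 0.05) ** 2))
```

The first function fixes the PDF but says nothing about correlations; the second fixes the PSDF but leaves the PDF Gaussian, which reflects the lack of a general link between the two properties.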

One explanation for crashes involves a replacement for the 

EMH by the Fractal Market Hypothesis (FMH) which is the 

basis of the model considered in this paper. The FMH proposes 

the following: (i) The market is stable when it consists of 

investors covering a large number of investment horizons 

which ensures that there is ample liquidity for traders; (ii) 

information is more related to market sentiment and technical 

factors in the short term than in the long term - as investment

horizons increase, longer-term fundamental information

dominates; (iii) if an event occurs that puts the validity

of fundamental information in question, long-term investors 

either withdraw completely or invest on shorter terms (i.e. 

when the overall investment horizon of the market shrinks 

to a uniform level, the market becomes unstable); (iv) prices 

reflect a combination of short-term technical and long-term 

fundamental valuation and thus, short-term price movements 

are likely to be more volatile than long-term trades - they are 

more likely to be the result of crowd behaviour; (v) if a security 

has no tie to the economic cycle, then there will be no long-term

trend and short-term technical information will dominate.

Unlike the EMH, the FMH states that information is valued 

according to the investment horizon of the investor. Because 

the different investment horizons value information differently, 

the diffusion of information will also be uneven. Unlike most 

complex physical systems, the agents of the economy, and 

perhaps to some extent the economy itself, have an extra 

ingredient, an extra degree of complexity. This ingredient is 

consciousness. 

V. MATHEMATICAL MODEL FOR THE FMH 

We consider an economic time series to be a solution to

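the fractional diffusion equation [33]-[38]

$$\left(\frac{\partial^2}{\partial x^2} - \sigma^q \frac{\partial^q}{\partial t^q}\right) u(x,t) = \delta(x)\,n(t) \qquad (1)$$

where σ is the fractional diffusion coefficient, q > 0 is the ‘Fourier dimension’ and n(t) is ‘white noise’. Let

$$u(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} U(x,\omega)\exp(i\omega t)\,d\omega$$

and

$$n(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} N(\omega)\exp(i\omega t)\,d\omega.$$

Using the result

$$\frac{\partial^q}{\partial t^q}u(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} U(x,\omega)(i\omega)^q\exp(i\omega t)\,d\omega$$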

we can then transform the fractional diffusion equation to the form

$$\left(\frac{\partial^2}{\partial x^2} + \Omega_q^2\right) U(x,\omega) = \delta(x)\,N(\omega)$$

where we take

$$\Omega_q = i(i\omega\sigma)^{q/2}.$$

Defining the Green’s function g [39] to be the solution of

$$\left(\frac{\partial^2}{\partial x^2} + \Omega_q^2\right) g(|x-y|,\omega) = \delta(x-y)$$

where δ is the delta function, we obtain the solution

$$U(x,\omega) = N(\omega)\int_{-\infty}^{\infty} g(|x-y|,\omega)\,\delta(y)\,dy = N(\omega)\,g(|x|,\omega)$$

where [40]

$$g(|x|,\omega) = \frac{i}{2\Omega_q}\exp(i\Omega_q |x|)$$

under the assumption that u and ∂u/∂x → 0 as x → ±∞. The Green’s function characterises the response of a system modelled by equation (1) to an impulse at x = y and it is clear that

$$\lim_{x\to 0} U(x,\omega) = \frac{iN(\omega)}{2\Omega_q}$$

or

$$U(\omega) = \frac{1}{2\sigma^{q/2}}\frac{N(\omega)}{(i\omega)^{q/2}}.$$

The time series associated with this asymptotic solution is 

then obtained by Fourier inversion giving (ignoring scaling by $[2\sigma^{q/2}\Gamma(q/2)]^{-1}$)

$$u(t) = \frac{1}{t^{1-q/2}} \otimes n(t) \qquad (2)$$

where ⊗ defines the convolution integral. This equation is the Riemann-Liouville transform (ignoring scaling by $[\Gamma(q/2)]^{-1}$) [41] which is a fractional integral and defines a function u(t) which is statistically self-affine, i.e. for a scaling parameter λ > 0,

$$\lambda^{q/2}\Pr[u(\lambda t)] = \Pr[u(t)]$$

where Pr[u(t)] denotes the Probability Density Function of 

u(t). Thus, equation (2) can be considered to be the temporal 

solution of equation (1) as x → 0 and u(t) is taken to be a 

random scaling fractal signal. Note that for $|x| > 0$ the phase $\Omega_q |x|$ does not affect the $\omega^{-q}$ scaling law of the power spectrum, i.e. ∀x,

$$|U(x,\omega)|^2 = \frac{|N(\omega)|^2}{4\sigma^q\omega^q}, \quad \omega > 0.$$

Thus for a uniformly distributed spectrum N(ω) the Power Spectrum Density Function of U is determined by $\omega^{-q}$ and the algorithm developed to compute q given in Section VI applies ∀x and not just for the case when x → 0. However, since we

can write

$$U(x,\omega) = \frac{i}{2\Omega_q}N(\omega)\exp(i\Omega_q|x|) = \frac{1}{2(i\omega\sigma)^{q/2}}N(\omega)\left[1 + i(i\omega\sigma)^{q/2}|x| - \frac{1}{2!}(i\omega\sigma)^{q}|x|^2 + \ldots\right]$$

unconditionally, by inverse Fourier transforming, we obtain the following expression for u(x, t) (ignoring scaling factors):

$$u(x,t) = n(t)\otimes\frac{1}{t^{1-q/2}} + i|x|\,n(t) + \sum_{k=1}^{\infty}\frac{i^{k+1}}{(k+1)!}|x|^{2k}\frac{d^{kq/2}}{dt^{kq/2}}n(t)$$

Here, the solution is composed of three terms: (i) a fractional integral; (ii) the source term n(t); (iii) an infinite series of fractional differentials of order kq/2.
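Equation (2) can be sketched numerically through its spectral form U(ω) = N(ω)/(iω)^{q/2}, ignoring scaling. The function name `fractal_signal`, the use of Gaussian white noise and the choice to set the zero-frequency gain to zero are assumptions made for this illustration.

```python
import numpy as np

def fractal_signal(n_samples, q, seed=0):
    """Filter white noise with the transfer function (i*omega)**(-q/2).

    Returns the filtered signal u and the white noise n that produced it.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_samples)
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_samples)   # rad/sample
    gain = np.zeros(omega.size, dtype=complex)
    gain[1:] = (1j * omega[1:]) ** (-q / 2.0)           # DC gain left at zero (assumption)
    u = np.fft.irfft(np.fft.rfft(noise) * gain, n=n_samples)
    return u, noise

u, n = fractal_signal(1024, q=1.5)
```

By construction the ratio of the power spectra of u and n follows the ω^{-q} law at every non-zero frequency below the Nyquist frequency, which is the property exploited later to estimate q.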

A. Rationale for the Model - Hurst Processes 

A Hurst process describes fractional Brownian motion and 

is based on the generalization of Brownian motion quantified by the equation $A(t) = \sqrt{t}$ to

$$A(t) = t^H, \quad H \in (0, 1]$$

for a unit random step length in the plane where A is the 

most likely position in the plane after time t with respect to an 

initial position in the plane at t = 0. This scaling law makes 

no prior assumptions about any underlying distributions. It 

simply tells us how the system is scaling with respect to 

time. Processes of this type appear to exhibit cycles, but with 

no predictable period. The interpretation of such processes 

in terms of the Hurst exponent H is as follows: We know 

that H = 0.5 is consistent with an independently distributed 

system. The range 0.5 < H ≤ 1, implies a persistent time 

series, and a persistent time series is characterized by positive 

correlations. Theoretically, what happens today will ultimately 

have a lasting effect on the future. The range 0 < H < 0.5

indicates anti-persistence which means that the time series 

covers less ground than a random process. In other words, 

there are negative correlations. For a system to cover less 

distance, it must reverse itself more often than a random 

process. 
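The H = 0.5 scaling of an independent random walk can be checked with a short simulation. This sketch uses the root-mean-square distance from the origin as a stand-in for A(t), and the names `rms_displacement` and `scaling_exponent` are hypothetical; it is an illustration of the scaling law rather than an estimator used in the paper.

```python
import numpy as np

def rms_displacement(n_walks, n_steps, seed=0):
    """RMS distance from the origin after t = 1..n_steps unit steps in the plane."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=(n_walks, n_steps))
    x = np.cumsum(np.cos(theta), axis=1)
    y = np.cumsum(np.sin(theta), axis=1)
    return np.sqrt(np.mean(x ** 2 + y ** 2, axis=0))

def scaling_exponent(rms):
    """Slope of log(rms) against log(t), i.e. the exponent H in A(t) = t**H."""
    t = np.arange(1, rms.size + 1)
    slope, _ = np.polyfit(np.log(t), np.log(rms), 1)
    return slope

H = scaling_exponent(rms_displacement(2000, 256))
```

For independent, uniformly distributed step angles the fitted exponent should lie close to 0.5; a persistent walk would give a larger value.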

Given that random walks with H = 0.5 describe processes 

whose macroscopic behaviour is characterised by the diffusion 

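equation, then, by induction, Hurst processes should be characterised by generalizing the diffusion operator

$$\frac{\partial^2}{\partial x^2} - \sigma\frac{\partial}{\partial t}$$

to the fractional form

$$\frac{\partial^2}{\partial x^2} - \sigma^q\frac{\partial^q}{\partial t^q}$$

where q ∈ (0, 2]. Fractional diffusive processes can therefore be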

interpreted as intermediate between classical diffusive (random 

phase walks with H = 0.5; diffusive processes with q = 1) 

and ‘propagative process’ (coherent phase walks for H = 

1; propagative processes with q = 2), e.g. [42] and [43]. 

The relationship between the Hurst exponent H, the Fourier 

dimension q and the Fractal dimension $D_F$ is given by [44]

$$D_F = D_T + 1 - H = 1 - q + \frac{3}{2}D_T$$

where DT is the topological dimension. Thus, a Brownian 

process, where H = 1/2, has a fractal dimension of 1.5. 

Fractional diffusion processes are based on random walks 

which exhibit a bias with regard to the distribution of angles 

used to change the direction. By induction, it can be expected 

that as the distribution of angles reduces, the corresponding 

walk becomes more and more coherent, exhibiting longer and 

longer time correlations until the process conforms to a fully 

coherent walk. A simulation of such an effect is given in 

Figure 2 which shows a random walk in the (real) plane 

as the (uniform) distribution of angles decreases. The walk 

becomes less and less random as the width of the distribution is 

reduced. Each position of the walk $(x_j, y_j)$, j = 1, 2, 3, ..., N is computed using

$$x_j = \sum_{i=1}^{j}\cos(\theta_i), \quad y_j = \sum_{i=1}^{j}\sin(\theta_i)$$

where

$$\theta_i = \alpha\pi\frac{n_i}{\|n\|_\infty}$$

and $n_i$ are random numbers computed using the linear congruential pseudo random number generator

$$n_{i+1} = a n_i \bmod P, \quad i = 1, 2, ..., N, \quad a = 7^7, \quad P = 2^{31} - 1.$$

The parameter 0 ≤ α ≤ 2 defines the width of the

distribution of angles, θi ∈ [0, απ], such that as α → 0, the walk becomes

increasingly coherent or ‘propagative’.

Fig. 2. Random phase walks in the plane for a uniform distribution of angles 

θi ∈ [0, 2π] (top left), θi ∈ [0, 1.9π] (top right), θi ∈ [0, 1.8π] (bottom left) 

and θi ∈ [0, 1.2π] (bottom right). 
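The random phase walk of Figure 2 can be sketched as follows, using the linear congruential generator with a = 7^7 and P = 2^31 − 1 given in the text. The seed value, the function names `lcg` and `phase_walk`, and the reading of α as a multiplier of π (so that α = 2 corresponds to θi ∈ [0, 2π]) are assumptions of this sketch.

```python
import numpy as np

def lcg(n, seed=1, a=7 ** 7, P=2 ** 31 - 1):
    """Linear congruential generator n_{i+1} = a * n_i mod P."""
    values = np.empty(n)
    state = seed
    for i in range(n):
        state = (a * state) % P     # exact integer arithmetic
        values[i] = state
    return values

def phase_walk(n_steps, alpha, seed=1):
    """Positions (x_j, y_j) of a unit-step walk with angles in [0, alpha*pi]."""
    n = lcg(n_steps, seed)
    theta = alpha * np.pi * n / np.max(np.abs(n))   # normalise by the infinity norm
    return np.cumsum(np.cos(theta)), np.cumsum(np.sin(theta))

x_random, y_random = phase_walk(1000, alpha=2.0)     # fully random phase walk
x_coherent, y_coherent = phase_walk(1000, alpha=0.2) # narrow angle distribution
```

Reducing `alpha` narrows the range of step angles, so the end-to-end distance of the walk grows towards the fully coherent limit of one unit per step.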

In considering a $t^H$ scaling law with Hurst exponent H ∈

(0, 1], Hurst paved the way for an appreciation that most natural

stochastic phenomena which, at first sight, appear random,

have certain trends that can be identified over a given period 

of time. In other words, many natural random patterns have a 

bias to them that leads to time correlations in their stochastic 

behaviour, a behaviour that is not an inherent characteristic of 

a random walk model and fully diffusive processes in general. 

This aspect of stochastic field theory is the basis for Lévy 

processes [45]. 

B. Lévy Processes 

Lévy processes are random walks whose distribution has 

infinite moments. The statistics of (conventional) physical 

systems are usually concerned with stochastic fields that have 

PDFs where (at least) the first two moments (the mean and 

variance) are well defined and finite. Lévy statistics is concerned 

with statistical systems where all the moments (starting 

with the mean) are infinite. Many distributions exist where the 

mean and variance are finite but are not representative of the 

process, e.g. the tail of the distribution is significant, where 

rare but extreme events occur. These distributions include 

Lévy distributions. Lévy’s original approach to deriving such 

distributions is based on the following question: Under what 

circumstances does the distribution associated with a random 

walk of a few steps look the same as the distribution after 

many steps (except for scaling)? This question is effectively 

the same as asking under what circumstances do we obtain a 

random walk that is statistically self-affine. The characteristic 

function (i.e. the Fourier transform) P (k) of such a distribution 

p(x) was first shown by Lévy to be given by (for symmetric 

distributions only) 

$$P(k) = \exp(-a|k|^\gamma), \quad 0 < \gamma \le 2 \qquad (3)$$

where a is a constant and γ is the Lévy index. For γ ≥ 2, the 

second moment of the Lévy distribution exists and the sums of 

large numbers of independent trials are Gaussian distributed. 

For example, if the result were a random walk with a step 

length distribution governed by p(x), γ > 2, then the result 

would be normal (Gaussian) diffusion, i.e. a Brownian process. 

For γ < 2 the second moment of this PDF (the mean square), 

diverges and the characteristic scale of the walk is lost. For 

values of γ between 0 and 2, Lévy’s characteristic function 

corresponds to a PDF of the form

$$p(x) \sim \frac{1}{x^{1+\gamma}}, \quad x \to \infty.$$

This type of random walk is called a Lévy flight and is an example of a non-stationary fractal walk.

Lévy processes are consistent with a fractional diffusion

equation [46]. The basic evolution equation for a random 

Brownian particle process is given by

$$u(x, t+\tau) = \int_{-\infty}^{\infty} u(x+\lambda, t)\,p(\lambda)\,d\lambda$$

where u(x, t) is the concentration of particles and τ is the 

interval of time in which a particle moves some distance 

between λ and λ + dλ with a probability p(λ) satisfying the 

condition p(λ) = p(−λ). We note that 

u(x, t + τ) = u(x, t) ⊗ p(x) 

and that in Fourier space, this equation is 

U(k, t + τ) = U(k, t)P (k)

where U and P are the Fourier transforms of u and p 

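respectively. From equation (3),

$$P(k) \simeq 1 - a|k|^\gamma$$

so that we can write

$$\frac{U(k, t+\tau) - U(k,t)}{\tau} \simeq -\frac{a}{\tau}|k|^\gamma U(k,t)$$

which for τ → 0 gives the fractional diffusion equation

$$\sigma\frac{\partial}{\partial t}u(x,t) = \frac{\partial^\gamma}{\partial x^\gamma}u(x,t), \quad \gamma \in (0, 2] \qquad (4)$$

where σ = τ/a and we have used the result

$$\frac{\partial^\gamma}{\partial x^\gamma}u(x,t) = -\frac{1}{2\pi}\int_{-\infty}^{\infty}|k|^\gamma U(k,t)\exp(ikx)\,dk.$$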

The solution to this equation with the singular initial condition 

u(x, 0) = δ(x) is given by

$$u(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp(ikx - t|k|^\gamma/\sigma)\,dk$$

which is itself Lévy distributed. This derivation of the fractional 

diffusion equation reveals its physical origin in terms of 

Lévy statistics, i.e. Lévy’s characteristic function. Note that the 

diffusion equation is fractional in the spatial derivative rather 

than the temporal derivative as given in equation (1). However, 

since the Green’s function for equation (4) is given by

$$g(|x|,\omega) = \frac{i}{2\Omega_\gamma}\exp(i\Omega_\gamma|x|)$$

where

$$\Omega_\gamma = i^{2/\gamma}(i\omega\sigma)^{1/\gamma},$$

by induction, we obtain a relationship between the Lévy index γ and the Fourier dimension q given by

$$\frac{1}{\gamma} = \frac{q}{2}.$$

Gaussian processes associated with the classical diffusion 

equation are thus recovered when γ = 2 and q = 1. 
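The symmetric Lévy PDF has no closed form for general γ, but it can be evaluated by numerically inverting the characteristic function of equation (3). The sketch below uses a plain trapezoidal rule; the function name `levy_pdf`, the truncation `k_max` and the grid size are assumptions chosen for illustration.

```python
import numpy as np

def levy_pdf(x, gamma, a=1.0, k_max=50.0, n_k=200_001):
    """Evaluate p(x) = (1/pi) * integral_0^inf cos(k x) exp(-a k**gamma) dk."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    k = np.linspace(0.0, k_max, n_k)
    dk = k[1] - k[0]
    integrand = np.cos(np.outer(x, k)) * np.exp(-a * k ** gamma)
    # Trapezoidal rule along the k axis
    integral = (integrand.sum(axis=1)
                - 0.5 * (integrand[:, 0] + integrand[:, -1])) * dk
    return integral / np.pi

points = np.array([0.0, 1.0, 3.0, 10.0])
gaussian_case = levy_pdf(points, gamma=2.0)   # reduces to a Gaussian
cauchy_case = levy_pdf(points, gamma=1.0)     # reduces to a Cauchy distribution
```

For γ = 2 the result matches the Gaussian exp(−x²/4a)/√(4πa), and for γ = 1 the Cauchy form a/[π(a² + x²)], whose tail decays as 1/x², consistent with the 1/x^{1+γ} behaviour quoted above.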

C. Fractional Differentials 

Fractional differentials of any order need to be considered 

in terms of the definition for a fractional differential given by

$$\hat{D}^q f(t) = \frac{d^m}{dt^m}\left[\hat{I}^{m-q}f(t)\right], \quad m - q > 0$$

where m is an integer and Î is the fractional integral operator (the Riemann-Liouville transform) given by

$$\hat{I}^p f(t) = \frac{1}{\Gamma(p)}f(t)\otimes\frac{1}{t^{1-p}}, \quad p > 0.$$

The reason for this is that direct fractional differentiation can

yield divergences. However, there is a deeper interpretation of 

this result that has a synergy with the issue over a macroeconomic 

system having ‘memory’ and is based on observing that 

the evaluation of a fractional differential operator depends on 

the history of the function in question. Thus, unlike an integer 

differential operator of order m, a fractional differential operator 

of order q has ‘memory’ because the value of $\hat{I}^{m-q}f(t)$ at a time t depends on the behaviour of f(t) from −∞ to t via the convolution of f(t) with $t^{(m-q)-1}/\Gamma(m-q)$. The convolution process is dependent on the history of a function f(t) for a given kernel and thus, in this context, we can consider a fractional derivative defined by $\hat{D}^q$ to have ‘memory’. In this

sense, the operator

$$\frac{\partial^2}{\partial x^2} - \sigma^q\frac{\partial^q}{\partial t^q}$$

describes a process, compounded in a field u(x, t), that has memory association with regard to the temporal characteristics of the system it is attempting to model. This is not an intrinsic characteristic of systems that are purely diffusive (q = 1) or propagative (q = 2).
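The ‘memory’ of the Riemann-Liouville fractional integral is visible in a discrete version: each output sample is a weighted sum over all earlier input samples. The sketch below assumes a causal signal starting at t = 0 on a uniform grid and integrates the kernel exactly over each sampling interval (a product-integration rule); the function name `fractional_integral` is hypothetical.

```python
import math
import numpy as np

def fractional_integral(f, p, h):
    """Approximate (1/Gamma(p)) * f(t) convolved with t**(p-1) at t_n = n*h.

    f is treated as piecewise constant on each interval [t_k, t_{k+1}), and
    the singular kernel is integrated exactly over that interval.
    """
    f = np.asarray(f, dtype=float)
    n = f.size
    j = np.arange(n)
    weights = h ** p * ((j + 1.0) ** p - j ** p) / math.gamma(p + 1.0)
    conv = np.convolve(f, weights)[:n]   # sums over the whole history of f
    out = np.zeros(n)
    out[1:] = conv[:-1]                  # the value at t_n uses f_0 .. f_{n-1}
    return out

h = 0.01
t = np.arange(200) * h
half_integral_of_one = fractional_integral(np.ones(200), 0.5, h)
```

For f(t) = 1 the result reproduces the known fractional integral t^p/Γ(p + 1), and for p = 1 the weights are all equal to h, so the operator reduces to an ordinary running sum.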

D. Non-stationary Model 

The fractional diffusion operator used in equation (1) is 

appropriate for modelling fractional diffusive processes that 

are stationary. For non-stationary fractional diffusion, we could 

consider the case where the diffusivity is time variant as 

defined by the function σ(t). However, a more interesting 

case arises when the characteristics of the diffusion processes 

change over time becoming less or more diffusive. This is 

illustrated in terms of the random walk in the plane given in 

Figure 3. Here, the walk starts off being fully diffusive (i.e. 

H = 0.5 and q = 1), changes to being fractionally diffusive 

(0.5 < H < 1 and 1 < q < 2) and then changes back to 

being fully diffusive. In terms of fractional diffusion, this is 

equivalent to having an operator

$$\frac{\partial^2}{\partial x^2} - \sigma^q\frac{\partial^q}{\partial t^q}$$

where q = 1, t ∈ (0, T1]; q > 1, t ∈ (T1, T2]; q = 1, t ∈ (T2, T3] where T3 > T2 > T1. If we want to generalise

such processes over arbitrary periods of time, then we should 

consider q to be a function of time. We can then introduce a 

non-stationary fractional diffusion operator given by

$$\frac{\partial^2}{\partial x^2} - \sigma^{q(t)}\frac{\partial^{q(t)}}{\partial t^{q(t)}}.$$

This operator is the theoretical basis for the Fractal Market Hypothesis considered in this paper. In terms of using this model to develop an FMH risk management metric based on the analysis of economic time series, the principal hypothesis is that a change in q(t) precedes a change in a macroeconomic index. This requires accurate numerical methods for computing q(t) for a given index which are discussed later.

Real economic signals exhibit non-stationary fractal walks. An example of this is illustrated in Figure 4 which shows a non-stationary walk in the complex plane obtained by taking the Hilbert transform of an economic signal, i.e. computing the analytic signal

$$s(t) = u(t) + \frac{i}{\pi t}\otimes u(t)$$

and plotting the real and imaginary components of this signal in the complex plane.


Fig. 3. Non-stationary random phase walk in the plane. 

Fig. 4. Non-stationary fractal walk in the complex plane (right) obtained by

computing the Hilbert transform of the economic signal (left) - FTSE Close-of-Day

from 02-04-1984 to 24-12-1987.
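A discrete analytic signal of the kind plotted in Figure 4 can be computed with the FFT by suppressing negative frequencies. This is a standard construction rather than the author's code, and the name `analytic_signal` is an assumption of this sketch.

```python
import numpy as np

def analytic_signal(u):
    """Return s = u + i*H[u], where H is the discrete Hilbert transform."""
    u = np.asarray(u, dtype=float)
    n = u.size
    spectrum = np.fft.fft(u)
    weight = np.zeros(n)
    weight[0] = 1.0                       # keep DC
    if n % 2 == 0:
        weight[n // 2] = 1.0              # keep the Nyquist term
        weight[1:n // 2] = 2.0            # double positive frequencies
    else:
        weight[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weight) # negative frequencies are zeroed

t = np.arange(256)
s = analytic_signal(np.cos(2.0 * np.pi * 8.0 * t / 256))
walk_x, walk_y = s.real, s.imag           # coordinates of the walk in the complex plane
```

For a pure cosine the imaginary part is the corresponding sine, so the trajectory in the complex plane is a circle; an economic signal traces an irregular walk instead.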

The non-stationary model considered here exhibits behaviour 

that is similar to Lévy processes. However, the aim 

is not to derive a statistical model for a stochastic process 

using a stationary fractional diffusion of the type given by 

equation (4) but to be able to compute a function - namely 

q(t) - which is a measure of the non-stationary behaviour 

especially with regard to a ‘future flight’. This is because, 

in principle, the value of q(t) should reflect the early stages 

of a change in the behaviour of u(t), a principle that is the 

basis for the financial data processing and analysis discussed 

in the following section. 

VI. FINANCIAL DATA ANALYSIS 

If we consider the case where the Fourier dimension is 

a relatively slowly varying function of time, then we can 

legitimately consider q(t) to be composed of a sequence of 

different states qi = q(ti). This approach allows us to develop 

a stationary solution for a fixed q over a fixed period of time. 

Non-stationary behaviour can then be introduced by using the 

same solution for different values of q over fixed (or varying) 

periods of time and concatenating the solutions for all q to 

produce an output digital signal. 
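The concatenation of quasi-stationary segments described above can be sketched as follows. Each segment is generated by filtering white noise with (iω)^{-q/2}; shifting each segment so that it starts at the level where the previous one ended is an assumption of this sketch, as are the function names and the equal segment lengths.

```python
import numpy as np

def stationary_segment(n_samples, q, rng):
    """One stationary segment with spectrum N(omega) / (i*omega)**(q/2)."""
    noise = rng.standard_normal(n_samples)
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_samples)
    gain = np.zeros(omega.size, dtype=complex)
    gain[1:] = (1j * omega[1:]) ** (-q / 2.0)
    return np.fft.irfft(np.fft.rfft(noise) * gain, n=n_samples)

def nonstationary_signal(q_values, segment_length, seed=0):
    """Concatenate stationary segments, one for each state q_i."""
    rng = np.random.default_rng(seed)
    segments = []
    level = 0.0
    for q in q_values:
        segment = stationary_segment(segment_length, q, rng)
        segment = segment - segment[0] + level   # join to the previous segment (assumption)
        segments.append(segment)
        level = segment[-1]
    return np.concatenate(segments)

# Diffusive, then fractionally diffusive, then diffusive again
signal = nonstationary_signal([1.0, 1.6, 1.0], segment_length=256)
```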

The FMH model for a quasi-stationary segment of a financial 

signal is given by

$$u(t) = \frac{1}{t^{1-q/2}}\otimes n(t), \quad q > 0$$

which has characteristic spectrum

$$U(\omega) = \frac{N(\omega)}{(i\omega)^{q/2}}.$$

The PSDF is thus characterised by $\omega^{-q}$, ω ≥ 0, and our problem is thus to compute q from the data $P(\omega) = |U(\omega)|^2$, ω ≥ 0. For this data, we consider the PSDF

$$\hat{P}(\omega) = \frac{c}{\omega^q}$$

or

$$\ln\hat{P}(\omega) = C - q\ln\omega$$

where C = ln c. The problem is therefore reduced to implementing 

an appropriate method to compute q (and C) by 

finding a best fit of the line $\ln\hat{P}(\omega)$ to the data ln P(ω).

Application of the least squares method for computing q, which is based on minimizing the error

$$e(q, C) = \|\ln P(\omega) - \ln\hat{P}(\omega, q, C)\|_2^2$$

with regard to q and C, leads to errors in the estimates for 

q which are not compatible with market data analysis. The 

reason for this is that relative errors at the start and end 

of the data ln P may vary significantly especially because 

any errors inherent in the data P will be ‘amplified’ through 

application of the logarithmic transform required to linearise 

the problem. In general, application of a least squares approach 

is very sensitive to statistical heterogeneity [47] and in this 

application, may provide values of q that are not compatible 

with the rationale associated with the FMH (i.e. values of 1 < 

q < 2 that are intermediate between diffusive and propagative 

processes). For this reason, an alternative approach must be 

considered which, in this paper, is based on Orthogonal Linear 

Regression (OLR) [48] [49]. 

Applying a standard moving window, q(t) is computed by 

repeated application of OLR based on the m-code available 

from [51]. This provides a numerical estimate of the function 

q(t) whose values reflect the state of a financial signal

(assumed to be a non-stationary random fractal) in terms of a 

stable or unstable economy, from which a risk analysis can be 

performed. Since q is, in effect, a statistic, its computation 

is only as good as the quantity (and quality) of data that 

is available for its computation. For this reason, a relatively 

large window is required whose length is compatible with the 

number of samples available. 
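Orthogonal linear regression fits a line by minimising perpendicular rather than vertical distances, and one common way to compute it is from the principal direction of the centred data. The sketch below uses a singular value decomposition for this; it is not the m-code referenced in the text, and the names `olr_line` and `estimate_q` are hypothetical.

```python
import numpy as np

def olr_line(x, y):
    """Slope and intercept of the orthogonal (total least squares) line fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    centred = np.column_stack([x - x_mean, y - y_mean])
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    dx, dy = vt[0]                 # direction of largest variance
    slope = dy / dx                # assumes the fitted line is not vertical
    return slope, y_mean - slope * x_mean

def estimate_q(power, omega):
    """Estimate q and C from a power spectrum modelled as c / omega**q."""
    slope, intercept = olr_line(np.log(omega), np.log(power))
    return -slope, intercept

omega = np.arange(1.0, 101.0)
q_est, C_est = estimate_q(3.0 * omega ** (-1.5), omega)
```

On noise-free power-law data the fit recovers q and C = ln c to within rounding error; with real spectra the estimate depends on the window length, as noted above.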

A. Numerical Algorithm 

The principal algorithm associated with the application of 

the FMH analysis is as follows: 

Step 1: Read data (financial time series) from file into 

operating array a[i], i = 1, 2, ..., N.


Step 2: Set length L < N of moving window w to be used. 

Step 3: For j = 1, assign the elements a[j], a[j + 1], ..., a[L + j − 1] to the array

w[i], i = 1, 2, ..., L.

Step 4: Compute the power spectrum P [i] of w[i] using a 

Discrete Fourier Transform (DFT). 

Step 5: Compute the logarithm of the spectrum excluding the 

DC, i.e. compute log(P [i])∀i ∈ [2, L/2]. 

Step 6: Compute q[j] using the OLR algorithm whose m-code 

is given in Appendix I. 

Step 7: For j = j + 1 repeat Step 3 - Step 6 stopping when

j = N − L.

Step 8: Write the signal q[j] to file for further analysis and 

post processing. 

The following points should be noted: 

(i) The DFT is taken to generate an output in standard form 

where the zero frequency component of the power spectrum 

is taken to be P [1]. 

(ii) With $L = 2^m$ for integer m, a Fast Fourier Transform can

be used.

(iii) The minimum window size that should be used in order

to provide statistically significant values of q[j] is L = 64 when

q can be computed accurate to 2 decimal places.
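Steps 1 to 8 can be sketched in a few lines, with file input and output omitted. The orthogonal regression here is an SVD-based stand-in for the m-code of Appendix I, frequencies are measured in units of the DFT bin index (which does not change the slope), and the names `olr_slope` and `moving_q` are assumptions of this sketch.

```python
import numpy as np

def olr_slope(x, y):
    """Slope of the orthogonal linear regression line through (x, y)."""
    centred = np.column_stack([x - x.mean(), y - y.mean()])
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[0, 1] / vt[0, 0]

def moving_q(a, L):
    """Compute q[j] for each of the N - L windows of length L (Steps 3 to 7)."""
    a = np.asarray(a, dtype=float)
    N = a.size
    k = np.arange(1, L // 2)               # DFT bins 1 .. L/2 - 1, i.e. DC excluded
    log_omega = np.log(k)
    q = np.empty(N - L)
    for j in range(N - L):
        w = a[j:j + L]                     # Step 3: current window
        P = np.abs(np.fft.fft(w)) ** 2     # Step 4: power spectrum
        log_P = np.log(P[k])               # Step 5: log spectrum without DC
        q[j] = -olr_slope(log_omega, log_P)  # Step 6: slope of the fit is -q
    return q

rng = np.random.default_rng(0)
series = np.cumsum(rng.standard_normal(600))   # stand-in for a financial time series
q_signal = moving_q(series, L=512)
```

The zero-based bins 1 to L/2 − 1 correspond to the one-based range [2, L/2] in Step 5.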

An example of the output generated by this algorithm for 

a 1024 element window is given in Figure 5 using Dow 

Jones Close-of-Day data obtained from [20]. Inspection of the 

signals illustrates a qualitative relationship between trends in 

the financial data and q(t) in accordance with the theoretical 

model considered. In particular, over periods of time in which 

q increases in value, the amplitude of the financial signal u(t) 

decreases. Moreover, and more importantly, an upward trend 

in q appears to be a precursor to a downward trend in u(t), a 

correlation that is compatible with the idea that a rise in the 

value of q relates to the ‘system’ becoming more propagative, 

which in stock market terms, indicates the likelihood for the 

markets becoming ‘bear’ dominant in the future. 

The method discussed above not only

provides a general appraisal of different macroeconomic

financial time series, but, with regard to the size of the selected

window used, an analysis of data at any point in time.

The output can be interpreted in terms of ‘persistence’ and 

‘anti-persistence’ and in terms of the existence or absence 

of after-effects (macroeconomic memory effects). For those 

periods in time when q(t) is relatively constant, the existing 

market tendencies usually remain. Changes in the existing 

trends tend to occur just after relatively sharp changes in 

q(t) have developed. This behaviour indicates the possibility 

of using the time series q(t) for identifying the behaviour 

of a macroeconomic financial system in terms of both intermarket 

and between-market analysis. These results support the 

possibility of using q(t) as an independent volatility predictor 

to give a risk assessment associated with the likely future 

behaviour of different economic time series. Further, because this analysis is based on equation (2), which defines a (stationary) random scaling fractal signal, the results are, in principle, scale invariant.

Fig. 5. Application of the FMH using a 1024 element window for analysing financial time series composed of Dow Jones Close-of-Day data from 02-11-1932 to 25-03-2009. Above: Dow Jones Close-of-Day data (blue) and q(t) (red) computed using a window of 1024; Below: Histogram of q(t) for 100 bins.

B. Equivalence with a Wavelet Transform 

The wavelet transform is defined in terms of projections of 

f(t) onto a family of functions that are all normalized dilations 

and translations of a prototype ‘wavelet’ function w [50], i.e.

$$W[f(t)] = F_L(t) = \int_{-\infty}^{\infty} f(\tau)\,w_L(\tau, t)\,d\tau$$

where

$$w_L(\tau, t) = \frac{1}{\sqrt{L}}\,w\!\left(\frac{\tau - t}{L}\right), \quad L > 0.$$

The independent variables L and t are continuous dilation and 

translation parameters respectively. The wavelet transformation 

is essentially a convolution transform where wL(t) is the 

convolution kernel with dilation variable L. The introduction 

of this factor provides dilation and translation properties into 

the convolution integral that gives it the ability to analyse 

signals in a multi-resolution role (the convolution integral is 

now a function of L), i.e. 

FL(t) = wL(t) ⊗ f(t), L > 0. 

In this sense, the asymptotic solution (ignoring scaling)

$$u(t) = \frac{1}{t^{1-q/2}}\otimes n(t), \quad q > 0$$

is compatible with the case of a wavelet transform where

$$w_1(t) = \frac{1}{t^{1-q/2}}$$

for the stationary case and where, for the non-stationary case,

$$w_1(t, \tau) = \frac{1}{t^{1-q(\tau)/2}}.$$


C. Macrotrend Analysis 

In order to develop a macrotrend signal that has optimal 

properties with regard to the assessment of risk (i.e. the likely

future behaviour of an economic signal), it is important that: (i) the

filter used is consistent with the properties of a Variation

Diminishing Smoothing Kernel (VDSK); (ii) the last few

values of the trend signal are ‘data consistent’. VDSKs are 

convolution kernels with properties that guarantee smoothness 

around points of discontinuity of a given signal where the 

smoothed function is composed of a similar succession of 

concave or convex arcs equal in number to those of the signal.

VDSKs also have ‘geometric properties’ that preserve the 

‘shape’ of the signal. There are a range of VDSKs of which the 

most common is a Gaussian function and, for completeness, 

Appendix II provides an overview of the principal analytical

properties, including fundamental Theorems and Proofs of 

such kernels including the Gaussian kernel. 

In practice, the computation of the smoothing process using 

a VDSK must be performed in such a way that the initial and 

final elements of the output data are entirely data consistent 

with the input array within the locality of any element. Since 

a VDSK is a non-localised filter which tends to zero at 

infinity, in order to optimise the numerical efficiency of the 

smoothing process, filtering is undertaken in Fourier space. 

However, in order to produce a data consistent macrotrend 

signal using a Discrete Fourier Transform, wrapping effects 

must be eliminated. The solution is to apply an ‘end point 

extension’ scheme which involves padding the input vector 

with elements equal to the first and last values of the vector. 

The length of each of the ‘padding vectors’ is taken to be at least

half the size of the input vector. The output vector is obtained 

by deleting the filtered padding vectors. 
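The end point extension scheme can be sketched with a Gaussian filter applied in Fourier space. The frequency variable is taken here to be the angular frequency in radians per sample, which is an assumption since the text does not state the units of ω; the function name `macrotrend` is also hypothetical.

```python
import numpy as np

def macrotrend(signal, beta):
    """Smooth a signal with the filter exp(-beta * omega**2) using end point extension."""
    x = np.asarray(signal, dtype=float)
    n = x.size
    pad = (n + 1) // 2                                  # at least half the input length
    padded = np.concatenate([np.full(pad, x[0]), x, np.full(pad, x[-1])])
    omega = 2.0 * np.pi * np.fft.fftfreq(padded.size)   # rad/sample (assumption)
    kernel = np.exp(-beta * omega ** 2)                 # Gaussian filter in Fourier space
    smoothed = np.fft.ifft(np.fft.fft(padded) * kernel).real
    return smoothed[pad:pad + n]                        # delete the filtered padding

rng = np.random.default_rng(0)
prices = np.cumsum(rng.standard_normal(500))            # stand-in for an economic signal
trend = macrotrend(prices, beta=10.0)
```

Because the padding repeats the first and last values, the wrap-around discontinuity of the DFT is moved away from the retained samples, so the ends of the trend stay close to the local data.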

Figures 6 and 7 show examples of macrotrend analysis 

applied to the economic time series obtained from [19] and 

[20] and the signal q(t) using the VDSK filter $\exp(-\beta\omega^2)$.

Table 1 provides quantitative information of the statistics of the 

signal q(t). Figures 6 and 7 include the normalised gradients 

computed using a ‘forward differencing scheme’ which clearly 

illustrate ‘phase shifts’ associated with the two signals. From 

Table 1, the mean value of q(t) for the Dow Jones index 

is slightly lower than the mean for the FTSE and in both 

cases, the Null Hypothesis test as to whether q(t) is Gaussian 

distributed is negative, i.e. the ‘Composite Normality’ is of 

type ‘Reject’. 

VII. CASE STUDY: ANALYSIS OF ABX INDICES 

ABX indices serve as a benchmark of the market for 

securities backed by home loans issued to borrowers with weak 

credit. The index is administered by the London-based Markit 

Group which specialises in credit derivative pricing [52]. 

A. What is an ABX index? 

The index is based on a basket of Credit Default Swap 

(CDS) contracts for the sub-prime housing equity sector. 

Credit Default Swaps operate as a type of insurance policy 

for banks or other holders of bad mortgages. If the mortgage 

goes bad, then the seller of the CDS must pay the bank for the lost mortgage payments. Alternatively, if the mortgage stays good then the seller makes a lot of money. The riskier the bundle of mortgages the lower the rating.

Fig. 6. Analysis of FTSE Close-of-Day data from 25-04-1988 to 20-03-2009. Top-left: FTSE data (blue) and q(t) (red) computed using a 1024 moving window; Top-right: 100 bin histogram; Bottom-left: Macrotrends (β = 0.1); Bottom-right: Normalised gradients of macrotrends.

Fig. 7. Analysis of DJ Close-of-Day data from 25-04-1988 to 20-03-2009. Top-left: DJ data (blue) and q(t) (red) computed using a window of 1024; Top-right: 100 bin histogram; Bottom-left: Macrotrends (β = 0.1); Bottom-right: Normalised gradients of macrotrends.

TABLE I

STATISTICAL VALUES ASSOCIATED WITH q(t) COMPUTED FOR FTSE AND DJ CLOSE-OF-DAY DATA FROM 25-04-1988 TO 20-03-2009 GIVEN IN FIGURES 6 AND 7 RESPECTIVELY.

Statistical Parameter    q(t)-FTSE    q(t)-DJ
Minimum Value            0.9876       0.9752
Maximum Value            1.5067       1.5154
Range                    0.5190       0.5402
Mean                     1.2482       1.2218
Median                   1.2639       1.2452
Standard Deviation       0.1017       0.1269
Variance                 0.0104       0.0161
Skew                     -0.4080      -0.2881
Kurtosis                 2.3745       1.8233
Composite Normality      Reject       Reject

The original goal of the index was to create visibility and 

transparency but it was not clear at the time of its inception 

that the index would be so closely followed. As subprime 

securities have become increasingly uncertain, the ABX index 

has become a key point of reference for investors navigating 

risky mortgage debt on an international basis. Hence, in light 

of the current financial crisis (i.e. from 2008 to date), and given

that most economists agree that the subprime mortgage market was a

primary catalyst for the crisis, analysis of the ABX index has 

become a key point of reference for investors navigating the 

world of risky mortgage debt. 

On asset-backed securities such as home equity loans the 

CDS provides an insurance against the default of a specific 

security. The index enables users to trade in a security without 

being limited to the physical outstanding amount of that 

security, thereby giving investors liquid access to the most

frequently traded home equity tranches in a basket form. The 

ABX uses five indices that range from triple-A to triple- 

B minus. Each index chooses deals from 20 of the largest 

sub-prime home equity shelves by issuance amount from 

the previous six months. The minimum deal size is $500 

million and each tranche referenced must have an average 

life of between four and six years, except for the triple-A 

tranche, which must have a weighted average life greater than 

five years. Each of the indices is referenced by differently

rated tranches, i.e. AAA, AA, A, BBB and BBB-. They are

selected through identification of the most recently issued 

deals that meet the specific size and diversity criteria. The 

principal ‘market-makers’ in the index were/are: Bank of 

America, Bear Stearns, Citigroup, Credit Suisse, Deutsche 

Bank, Goldman Sachs, J P Morgan, Lehman Brothers, Merrill 

Lynch (now Bank of America), Morgan Stanley, Nomura 

International, RBS Greenwich Capital, UBS and Wachovia. 

However, during the financial crisis that developed in 2008, 

a number of changes have taken place. For example, on 

September 15, 2008, Lehman Brothers filed for bankruptcy 

protection following a massive exodus of most of its clients, 

drastic losses in its stock, and devaluation of its assets by 

credit rating agencies. Also in 2008, Merrill Lynch was acquired

by Bank of America at which point Bank of America merged 

its global banking and wealth management division with the 

newly acquired firm. The Bear Stearns Companies, Inc. was a 

global investment bank and securities trading and brokerage, 

until its collapse and fire sale to J P Morgan Chase in 2008. 

ABX contracts are commonly used by investors to speculate 

on or to hedge against the risk that the underlying mortgage

securities are not repaid as expected. The ABX swaps offer 

protection if the securities are not repaid as expected, in 

return for regular insurance-like premiums. A decline in the 

ABX index signifies investor sentiment that subprime mortgage 

holders will suffer increased financial losses from those 

investments. Likewise, an increase in the ABX index signifies 

investor sentiment looking for subprime mortgage holdings to 

perform better as investments. 

B. ABX and the Sub-prime Market 

Prime loans are often packaged into securities and sold to 

investors to help lenders reduce risk. More than $500B of 

such securities were issued in the US in 2006. The problem 

for investors who bought 2006’s crop of high-risk mortgage originations was that, as the US housing market slowed, so did mortgage applications. To prop up the market, mortgage lenders relaxed their underwriting standards, lending to ever-riskier borrowers at ever more favourable terms.

In the last few weeks of 2006, the poor credit quality of 

the 2006 vintage subprime mortgage origination started to 

become apparent. Delinquencies and foreclosures among high-risk

borrowers increased at a dramatic rate, weakening the 

performance of the mortgage pools. In one security backed by 

subprime mortgages issued in March 2006, foreclosure rates 

were already 6.09% by December that year, while 5.52% of 

borrowers were late on their payments by more than 30 days. 

Lenders also began shutting their doors, sending shock waves 

through the high-risk mortgage markets throughout 2007. The 

problem kept new investor money at bay, and dramatically 

weakened a key derivative index tied to the performance of 

2006 high-risk mortgages, i.e. the ABX index. As a result, the ABX index plummeted, starting in December 2006 when the BBB- index fell below 100 for the first time.

The most heavily traded subindex, representing loans rated 

BBB-, fell as hedge funds flocked to bet on the downturn and 

pushed up the cost of insuring against default. This led to a 

knock-on effect as lenders withdrew from the ABX market.

In early 2007 the issues were seen as: (i) Which investors 

were bearing the losses from having bought sub-prime mortgage 

backed securities? (ii) How large and concentrated were 

these losses? (iii) Had this sub-prime securitization distributed 

the risk among many players in the financial system or were

the positions and losses concentrated among a few players? 

(iv) What were the potential systemic risk effects of these 

losses? We now know that the systemic risk had a devastating 

effect on the global economy and became known as the ‘Credit

Crunch’. One of the catalysts for the problem was a US 

bill allowing bankruptcy judges to alter loan balances which 

nobody dealing in CDS had considered. The second key factor 

was the speed of deterioration of the ABX Indices in 2007


which shocked investors and left them waiting to see the 

bottom of the market before getting back in - they are still 

waiting. The third key factor was the failure of the US Treasury 

to provide foreclosure relief for distressed home owners which 

Congress had approved. The following series of reactions (denoted by →) were triggered as a result: The Treasury said it would not take steps to prevent home foreclosures, so that prices

of mortgage securities collapsed → bank equity was wiped 

out → banks, with shrunken equity capital, were forced to cut 

back on all types of credit → financing for anything, especially 

residential mortgage loans, dried up → market values of homes 

declined further → mortgage securities declined further, and 

the downward spiral became self-perpetuating.

C. Effect of ABX on Bank Equities 

At the end of February 2007 a price of 92.5 meant that a 

protection buyer would need to pay the protection seller 7.5%

upfront and then 0.64% per year. At the time, this kind of 

mortgage yield was about 6.5%, so the upfront charge was 

more than the yield per year. By April 2009 the A grade index 

had fallen to 8 meaning that the protection seller would want 

92% upfront which meant that the sub-prime market ‘died’. In 

July 2007 AAA mortgage securities started trading at prices 

materially below par, or below 100. Until then, many banks 

had bulked up on mortgage securities that were rated AAA at

the time of issue. This was because they believed that AAA 

bonds could always be traded at prices close to par, and 

consequently the bonds’ value would have a very small impact 

on the earnings and equity capital. The mystique about AAA 

ratings dated back more than 80 years. From 1920 onward, 

the default experience on AAA rated bonds, even during the 

Great Depression, was nominal. 
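The pricing arithmetic quoted at the start of this section can be sketched in a few lines. This is a minimal illustration only, assuming that the upfront payment is par (100) minus the quoted index price, which is what the two examples in the text imply. The function names `upfront_percent` and `protection_exceeds_yield` are hypothetical, and the 0.64% coupon and 6.5% yield are the figures quoted above.

```python
def upfront_percent(index_price, par=100.0):
    """Upfront payment (percent of notional) implied by a quoted index price."""
    return par - index_price


def protection_exceeds_yield(index_price, annual_coupon, mortgage_yield):
    """True when first-year protection cost (upfront plus coupon) exceeds the annual yield."""
    return upfront_percent(index_price) + annual_coupon > mortgage_yield


# February 2007 quote from the text: price 92.5, coupon 0.64%, mortgage yield about 6.5%.
print(upfront_percent(92.5))
print(protection_exceeds_yield(92.5, annual_coupon=0.64, mortgage_yield=6.5))

# April 2009 quote from the text: the A grade index at 8.
print(upfront_percent(8.0))
```

Under this reading, a falling index price translates directly into a rising upfront cost of protection, which is why a price of 8 leaves no economic room for the underlying loans.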

The way the securities are structured is that different classes 

of creditors, or different tranches, all hold ownership interests 

in the same pool of mortgages. However, the tranches with 

the lower ratings - BBB, A, AA - take the first credit 

losses and they are supposed to be destroyed before the 

AAA bondholders lose anything. Typically, AAA bondholders 

represent about 75-80% of the entire mortgage pool. During 

the Great Depression (1929-1933), national average home 

prices held their value far better than they have since 2007. The assumptions of a highly liquid trading market and gradual price declines have proved to be wrong. Beginning in the last half of 2007, the price decline of AAA bonds was steep,

and the trading market suddenly became very illiquid. Under 

standard accounting rules, those securities must be marked 

to market every fiscal quarter, and the banks’ equity capital 

shrank beyond all expectations. Hundreds of billions of dollars 

have been lost as a result. However, the losses in mortgage 

securities, and from financial institutions such as Lehman that

were undone by mortgage securities, dwarf everything else. 

Before the end of each fiscal quarter, bank managements must 

also budget for losses associated with mortgage securities. But 

since they cannot control market prices at a future date, they 

compensate by adjusting what they can control, which is all 

discretionary extensions of credit. Banks cannot legally lend 

beyond a certain multiple of their capital. 

D. Credit Default Swap Index 

This index is used to hedge credit risk or to take a position 

on a basket of credit entities. Unlike a credit default swap, a 

credit default swap index is a completely standardised credit 

security and may therefore be more liquid and trade at a 

smaller bid-offer spread. This means that it can be cheaper 

to hedge a portfolio of credit default swaps or bonds with a 

CDS index than to buy many CDS to achieve a similar effect. 

Credit-default swap indexes are benchmarks for protecting 

investors owning bonds against default, and traders use them to 

speculate on changes in credit quality. There are currently two 

main families of CDS indices: CDX and iTraxx. CDX indices 

contain North American and Emerging Market companies and 

are administered by CDS Index Company and marketed by 

Markit Group Limited, and iTraxx contain companies from 

the rest of the world and are managed by the International 

Index Company (IIC). A new series of CDS indices is issued 

every six months by Markit Group and IIC. Running up 

to the announcement of each series, a group of investment 

banks is polled to determine the credit entities that will form 

the constituents of the new issue. This process is intended 

to ensure that the index does not become ‘cluttered’ with

instruments that no longer exist, or which trade illiquidly. On 

the day of issue a fixed coupon is decided for the whole index 

based on the credit spread of the entities in the index. Once this 

has been decided the index constituents and the fixed coupon are published and the indices can be actively traded.

E. Analysis of Sub-Prime CDS Market ABX Indices using the 

FMH 

The US Sub-Prime Housing Market is widely viewed as 

the source of the current economic crisis. The reason that 

it has had such a devastating effect on the global economy 

is that investment grade bonds were purchased by many 

substantial international financial institutions but in reality 

the method used to designate the relatively low risk required 

for investment grade securities was seriously flawed. This 

resulted in the investment grade bonds becoming virtually 

worthless very quickly when systemic risks that wrongly had 

been ignored undermined the entire market. About 80% of 

the market was designated investment grade (AAA - highest, 

AA and A - lowest) with protection provided by the high-risk

grades (BBB- and BBB). The flawed risk model was based 

on an assumption that the investment grades would always be 

protected by the higher risk grades that would take all of the 

first 20% of defaults. Once defaults exceeded 20% the ‘house 

of cards’ was demolished. It is therefore of interest to see if an FMH based analysis of the ABX indices could have been used as a predictive tool in order to develop a superior risk model.
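The subordination assumption described above can be illustrated with a toy loss waterfall. This is a sketch only: it assumes a two-layer structure in which the high-risk grades form the bottom 20% of the pool and the investment grades the top 80%, as stated in the text, and it treats every default as a total loss with no recovery. The function name `allocate_losses` is hypothetical.

```python
def allocate_losses(pool_loss, subordinate_share=0.20):
    """Split a pool loss (fraction of the whole pool) between two layers.

    Returns (subordinate_writedown, senior_writedown), each expressed as a
    fraction of that layer's own size. The subordinate layer absorbs losses
    first; the senior layer is hit only once the subordinate layer is wiped out.
    """
    senior_share = 1.0 - subordinate_share
    subordinate_hit = min(pool_loss, subordinate_share)
    senior_hit = max(pool_loss - subordinate_share, 0.0)
    return subordinate_hit / subordinate_share, senior_hit / senior_share


for loss in (0.05, 0.20, 0.30):
    sub, senior = allocate_losses(loss)
    print(f"pool loss {loss:.0%}: subordinate written down {sub:.0%}, senior {senior:.1%}")
```

In this simplified picture the senior layer is untouched until pool losses reach the subordinate share, after which every further loss falls on the investment grades.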

Figure 8 shows the ABX index for each grade using data 

supplied by the Systemic Risk Assessment Division of the 

Bank of England. During the second week of December 2006 

the BBB- index slipped to 99.76 for a couple of days but then 

recovered. In March 2007 the index for BBB- slipped just 

below 90 and seemed to be recovering and by mid-May was 

above 90 again. In June 2007 the BBB- really began to slide 

and this time it never recovered and was closely followed by the collapse of the BBB index after which there was no further protection for the investment grades. The default swaps work like an insurance so that if the cost of insuring against risk becomes greater than the annual return from the loan then the market is effectively dead. By February 2008 the AAA grade was below this viable level.

Fig. 8. Grades for the ABX Indices from 19 January 2006 to 2 April 2009 based on Close-of-Day prices.

The results of applying the FMH based on the algorithms discussed in Section 6 are given in Figures 9-13. Table 2 provides

a list of the statistical variables associated with q(t) for 

each case. In each case, q(t) initially has values > 2 but this 

falls rapidly prior to a change of the index. Also, in each case, 

the turning point of the normalised gradient of the Gaussian 

filtered signal (i.e. point in time of the minimum value) is 

an accurate reflection of the point in time prior to when the 

index falls rapidly relative to the prior data. This turning

point occurs before the equivalent characteristic associated 

with the smoothed index. The model consistently ‘signals’ the 

coming meltdown with sufficient notice for orderly withdrawal 

from the market. For example, the data used for Figure 9 

reflects the highest Investment Grade and would be regarded 

as particularly safe. The normalised gradient of the output data 

provides a very early signal of a change in trend, in this case, 

at approximately 180 days from the start of the run,

which is equivalent to early April 2007 at which point the 

index was just above 100. In fact the AAA index appears to 

be viable as an investment right up to early November 2008 

after which it falls dramatically. In Figure 11, a trend change is again observed in the normalised gradient at approximately 190 days which is equivalent to mid-April 2007. It is not until the

second week of July 2007 that this index begins to fall rapidly. 

In Figure 13 the normalised gradient signals a trend change 

at around 170 for the highest risk grade. This is equivalent to 

the third week of March 2007. At this stage the index was 

only just below 90 and appeared to be recovering. 

Fig. 9. Analysis of AAA ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009. Top-left: AAA data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Fig. 10. Analysis of AA ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009 for a moving window of 128 elements. Top-left: AA data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Fig. 11. Analysis of A ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009 for a moving window of 128 elements. Top-left: A data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Fig. 12. Analysis of BBB ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009 for a moving window of 128 elements. Top-left: BBB data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Fig. 13. Analysis of BBB- ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009 for a moving window of 128 elements. Top-left: BBB- data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Statistical
Parameter    AAA      AA       A        BBB      BBB-

Min.         1.1834   1.0752   1.0522   1.0610   1.0646
Max.         3.1637   2.8250   2.7941   2.4476   2.5371
Range        1.9803   1.7499   1.7420   1.3867   1.4726
Mean         2.0113   1.7869   1.6663   1.5141   1.4722
Median       1.9254   1.7001   1.4923   1.3425   1.3243
SD           0.3928   0.4244   0.4384   0.3746   0.3476
Variance     0.1543   0.1801   0.1922   0.1404   0.1208
Skew         0.7173   0.3397   0.6614   0.8359   1.0345
Kurtosis     2.7117   1.8479   2.0809   2.2480   2.7467
CN           Reject   Reject   Reject   Reject   Reject

TABLE II
STATISTICAL VALUES ASSOCIATED WITH q(t) COMPUTED FOR ABX.HE INDICES (2006 H1 VINTAGE) BY RATING (CLOSING PRICES) FROM 24-07-2006 TO 02-04-2009. NOTE THAT THE ACRONYMS SD AND CN STAND FOR ‘STANDARD DEVIATION’ AND ‘COMPOSITE NORMALITY’ RESPECTIVELY.
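The descriptive statistics listed in Table II can be computed for any sampled q(t) with a few lines of code. The sketch below is illustrative only. It assumes the sample standard deviation, population (biased) moment estimators for skew and kurtosis, and kurtosis in its non-excess form (equal to 3 for a Gaussian); the paper does not state which conventions were used, and the composite normality test is not reproduced. The function name `describe` is hypothetical.

```python
import statistics


def describe(q):
    """Return the Table II style summary statistics for a sequence q."""
    n = len(q)
    mean = statistics.fmean(q)
    sd = statistics.stdev(q)  # sample standard deviation (n - 1 denominator)
    # Central moments with an n denominator (assumed convention).
    m2 = sum((v - mean) ** 2 for v in q) / n
    m3 = sum((v - mean) ** 3 for v in q) / n
    m4 = sum((v - mean) ** 4 for v in q) / n
    return {
        "min": min(q),
        "max": max(q),
        "range": max(q) - min(q),
        "mean": mean,
        "median": statistics.median(q),
        "sd": sd,
        "variance": sd ** 2,
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # non-excess form
    }


summary = describe([1.0, 2.0, 3.0, 4.0, 5.0])
for name, value in summary.items():
    print(f"{name:9s} {value:.4f}")
```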

VIII. CONCLUSION 

In terms of the non-stationary fractional diffusion model 

considered in this paper, the time varying Fourier dimension 

q(t) can be interpreted in terms of a ‘gauge’ on the characteristics 

of a dynamical system. This includes the management 

processes from which all modern economies may be assumed 

to be derived. In this sense, the FMH is based on three principal 

considerations: (i) the non-stationary behaviour associated 

with any system undergoing continuous change that is driven 

by a management infrastructure; (ii) the cause and effect that is 

inherent at all scales (i.e. all levels of management hierarchy); 

(iii) the self-affine nature of outcomes relating to points (i) 

and (ii). 

In a modern economy, the principal issue associated with


any form of financial management is based on the flow 

of information and the assessment of this information at 

different points connecting a large network. In this sense, a 

macroeconomy can be assessed in terms of its information 

network which consists of a distribution of nodes from which 

information can flow in and out. The ‘efficiency’ of the system 

is determined by the level of randomness associated with the 

direction of flow of information to and from each node. The 

nodes of the system are taken to be individuals or small 

groups of individuals whose assessment of the information 

they acquire together with their remit, responsibilities and 

initiative, determines the direction of the information flow 

from one node to the next. The determination of the efficiency 

of a system in terms of randomness is the most critical in terms 

of the model developed. It suggests that the performance of a 

business is related to how well information flows through an 

organisation. 

The FMH has a number of fundamental differences with 

regard to the EMH which are tabulated in Table 3. 

EMH                                       FMH

Gaussian statistics                       Non-Gaussian statistics
Stationary process                        Non-stationary process
No memory - no historical correlations    Memory - historical correlations
No repeating patterns at any scale        Many repeating patterns at all scales - ‘Elliott waves’
Continuously stable at all scales         Continuously unstable at any scale - ‘Lévy Flights’

TABLE III
PRINCIPAL DIFFERENCES BETWEEN THE EFFICIENT MARKET HYPOTHESIS (EMH) AND THE FRACTAL MARKET HYPOTHESIS (FMH).

The non-stationary nature of the model presented in this 

paper is taken to account for stochastic processes that can vary 

in time and are intermediate between diffusive and propagative 

or persistent behaviour. Application of Orthogonal Linear 

Regression to macroeconomic time series data provides an 

accurate and robust method to compute q(t) when compared to 

other statistical estimation techniques such as the least squares 

method. As a result of the physical interpretation associated 

with the fractional diffusion equation and the ‘meaning’ of 

q(t), we can, in principle, use the signal q(t) as a predictive measure in the sense that as the value of q(t) continues to increase, there is a greater likelihood for volatile behaviour

of the markets. This is reflected in the data analysis based 

on the examples given in which a Gaussian lowpass filter 

exp(−βω²) has been used to smooth both u(t) and q(t) to

produce the associated macrotrends in which the value of β 

determines the level of detail they contain. From the examples 

provided, it is clear that the turning points of the gradients 

of a macrotrend in q(t) flag a future change in the trend of

the economic signal u(t). This is compounded in the phase 

shifts that exist in the normalised gradients of u(t) and q(t) 

over frequency bands determined by the value of β. Although 

the interpretation of these phase shifts requires further study, 

from the results presented in this paper, it is clear that they 

provide an assessment of the risk associated with investing 

in a particular economic time series provided the series in 

question is a random scaling fractal. The ‘case study’ on the 

ABX Close-of-Day indices clearly illustrates the ability for the 

model to flag a point in time after which the indices change 

rapidly. The ABX indices exhibit a clear transition between 

a period when q(t) > 2 and when 1 < q(t) < 2 - Figures 

9-13 - which precedes the ‘collapse’ of the indices in 2008 and thereby the onset of the ‘Credit Crunch’.
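The macrotrend and normalised-gradient steps described above can be sketched as follows. This is not the authors' implementation: it is a minimal pure-Python illustration that assumes ω is the signed angular frequency in radians per sample, that the gradient is a first difference scaled to unit peak magnitude, and that the turning point is the index of the gradient minimum. The names `dft`, `idft`, `macrotrend` and `normalised_gradient` are hypothetical, and the value of β used here is chosen for the toy series only.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (adequate for short series)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft; the real part is returned since the input signal is real."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def macrotrend(signal, beta):
    """Smooth a signal by multiplying its spectrum with exp(-beta * omega**2)."""
    n = len(signal)
    filtered = []
    for k, value in enumerate(dft(signal)):
        # Signed angular frequency in radians per sample (assumed convention).
        omega = 2 * math.pi * (k if k <= n // 2 else k - n) / n
        filtered.append(value * math.exp(-beta * omega ** 2))
    return idft(filtered)


def normalised_gradient(trend):
    """First difference of the trend, scaled so its largest magnitude is 1."""
    grad = [b - a for a, b in zip(trend, trend[1:])]
    peak = max(abs(g) for g in grad)
    return [g / peak for g in grad] if peak > 0 else grad


# Toy series: flat at 100, then a steady decline, with alternating noise.
u = [100.0 - max(0, t - 40) + (0.5 if t % 2 else -0.5) for t in range(64)]
trend = macrotrend(u, beta=5.0)
gradient = normalised_gradient(trend)
turning_point = min(range(len(gradient)), key=gradient.__getitem__)
print("index of gradient minimum:", turning_point)
```

Because the discrete Fourier transform treats the series as periodic, the ends of the smoothed trend are distorted by wrap-around; a practical implementation would pad or window the data first.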

In a statistical sense, q(t) is just another measure that may, 

or may not, be of value to market traders. In comparison

with other statistical measures, this can only be assessed 

through its practical application in a live trading environment. 

However, in terms of its relationship to a stochastic model 

for macroeconomic data, q(t) does provide a measure that 

is consistent with the physical principles associated with a 

random walk that includes a directional bias, i.e. fractional 

Brownian motion. The model considered, and the signal 

processing algorithm proposed, has a close association with 

re-scaled range analysis for computing the Hurst exponent H 

[35]. In this sense, the principal contribution of this paper 

has been to consider a model that is quantified in terms of 

a physically significant (but phenomenological) model that 

is compounded in a specific (fractional) partial differential 

equation. As with other financial time series, their derivatives, 

transforms etc., a range of statistical measures can be used 

to characterise q(t) examples of which have been provided in 

this paper. It should be noted that in all cases studied to date, 

the composite normality of the signal q(t) is of type ‘Reject’. 

In other words, the statistics of q(t) are non-Gaussian. Further, 

assuming that a financial time series is statistically self-affine, 

the computation of q(t) can be applied over any time scale 

provided there is sufficient data for the computation of q(t) 

to be statistically significant. Thus, the results associated with 

the Close-of-Day data studied in this paper are, in principle, 

applicable to economic time series associated with tick data 

over a range of time scales. 

APPENDIX I 

M-CODE FOR THE ORTHOGONAL LINEAR REGRESSION 

ALGORITHM 

The following m-code is used to compute the Fourier 

dimension q from the power spectrum of a random fractal 

signal and is based on the code given in [51]. 

function x=linortfit(xdata,ydata)
% Input arrays are
%
% xdata: 2,3,...,L/2
% ydata: P[2], P[3], ..., P[L/2]
%
% Output value is x = [intercept, slope] of the orthogonal line fit,
% which gives the Fourier dimension q for input data P[i].
%
% Cost: sum of squared orthogonal distances from the data to the line.
fun=@(p) sum((p(1)+p(2)*xdata-ydata).^2)/(1+p(2)^2);
% Initial guess from ordinary least squares, reordered to [intercept, slope].
x0=fliplr(polyfit(xdata,ydata,1));
options=optimset('TolX',1e-6,'TolFun',1e-6);
x=fminsearch(fun,x0,options);
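For readers without access to MATLAB, the same objective can be minimised in closed form. The sketch below is not the m-code above translated line by line: it swaps the iterative `fminsearch` step for the standard closed-form solution of orthogonal (total least squares) line fitting, which minimises the same cost. The function names `orthogonal_fit` and `objective` are hypothetical.

```python
import math


def orthogonal_fit(xdata, ydata):
    """Return (intercept, slope) minimising squared orthogonal distances to the line."""
    n = len(xdata)
    xm = sum(xdata) / n
    ym = sum(ydata) / n
    sxx = sum((x - xm) ** 2 for x in xdata)
    syy = sum((y - ym) ** 2 for y in ydata)
    sxy = sum((x - xm) * (y - ym) for x, y in zip(xdata, ydata))
    if sxy == 0:
        raise ValueError("slope undefined: x and y are uncorrelated")
    # Root of sxy*m**2 + (sxx - syy)*m - sxy = 0 that minimises the cost.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return ym - slope * xm, slope


def objective(p, xdata, ydata):
    """The cost used in linortfit: squared residuals divided by (1 + slope**2)."""
    return sum((p[0] + p[1] * x - y) ** 2 for x, y in zip(xdata, ydata)) / (1 + p[1] ** 2)


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1]
fit = orthogonal_fit(x, y)
print("intercept, slope:", fit)
print("cost at the fit:", objective(fit, x, y))
```

The closed form avoids any dependence on a starting guess or tolerance settings, at the cost of being specific to a straight-line model.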

APPENDIX II 

VARIATION DIMINISHING SMOOTHING KERNELS 

Variation Diminishing Smoothing Kernels (VDSK) are convolution 

kernels with properties that guarantee smoothness and 

thereby, eliminate Gibbs’ effect around points of discontinuity 

of a given function. Further the smoothed function can be 

shown to be made up of a similar succession of concave or 

convex arcs equal in number to those of the function. Thus, we 

consider the following question: let there be given a continuous 

or discontinuous function f whose graph is composed of a 

succession of alternating concave or convex arcs. Is there 

a smoothing kernel (or a set of them) which produces a 

smoothed function whose graph is also made up of a similar 

succession of concave or convex arcs equal in number to those 

of f?¹

II.1 Laguerre-Pólya Class Entire Functions

The class of kernels which relate to this question is a class of entire functions, which shall be called class E, originally studied by E Laguerre and G Pólya. An entire function E(z), z ∈ C belongs to the class E

⇐⇒

$$E(z) = \exp(bz - cz^2) \prod_{\ell=1}^{\infty} \left(1 - \frac{z}{a(\ell)}\right) \exp[z/a(\ell)], \tag{II.1.1}$$

where b, c, a(ℓ) ∈ R, c ≥ 0, and

$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty. \tag{II.1.2}$$

where ⇐⇒ is taken to denote ‘if and only if’ - iff. The convergence 

of the series (II.1.2) guarantees that the product in 

(II.1.1) converges and represents an entire function. Laguerre 

proved, and Pólya added a refinement, that a sequence of

polynomials, having real roots only, which converge uniformly 

in every compact set of the complex plane C, approaches a 

function of class E in the uniform limit of such a sequence. 

For example,

$$\exp(-z^2) = \lim_{\ell \to \infty} \left(1 - \frac{z^2}{\ell^2}\right)^{\ell^2},$$

and the polynomials $(1 - z^2/\ell^2)$ have real roots only. In this definition, it is not assumed that the a(ℓ) are distinct. To include the case in which the product has a finite number of factors or reduces to 1 without additional notation, it is assumed that from a certain point on all the a(ℓ) may be ∞. Furthermore, it is assumed, without loss of generality, that the roots a(ℓ) are arranged in an order of increasing absolute values,

$$0 < |a(1)| \le |a(2)| \le |a(3)| \le \dots$$

¹ Based on an edited version of material developed by A Domingez-Torres, ‘Fourier Based Method in CAD’, PhD Thesis, Cranfield University, 1991

Examples of functions belonging to class E are

$$1, \quad 1 - z, \quad \exp(z), \quad \exp(-z^2), \quad \cos z, \quad \frac{\sin z}{z}, \quad \Gamma^{-1}(1 - z), \quad \Gamma^{-1}(z).$$

Note that the product of two functions of this class produces a new function of the same class.

II.2 Variation Diminishing Smoothing Kernels (VDSKs) 

A function k is variation diminishing iff it is of the form 

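$$k(x) = (2\pi i)^{-1} \int_{-i\infty}^{i\infty} [E(z)]^{-1} \exp(zx)\, dz, \tag{II.2.1}$$

where E(z) ∈ E is given by

$$E(z) = \exp(bz - cz^2) \prod_{\ell=1}^{\infty} \left(1 - \frac{z}{a(\ell)}\right) \exp[z/a(\ell)], \tag{II.2.2}$$

with b, c, a(ℓ) ∈ R, c ≥ 0, and

$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$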

In other words, a frequency function k is variation diminishing 

iff its bilateral Laplace transform equals $[E(z)]^{-1}$:

$$[E(z)]^{-1} = \int_{-\infty}^{\infty} k(x) \exp(-zx)\, dx. \tag{II.2.3}$$

In order to define a smoothing kernel, the function k given in 

(II.2.1) must be an even function. For, if k(x) is even, then 

the corresponding bilateral Laplace transform $[E(z)]^{-1}$ is also even. This fact follows readily from

$$[E(z)]^{-1} = \int_{-\infty}^{\infty} k(x) \exp(-zx)\, dx = \int_{-\infty}^{\infty} k(-x) \exp(-zx)\, dx = \int_{-\infty}^{\infty} k(x) \exp(zx)\, dx = [E(-z)]^{-1}.$$

Conversely, if $[E(z)]^{-1}$ is even, then its inverse bilateral

transform is even since a component of convergence of (II.2.3) 

contains the imaginary axis. This follows from the fact that 

the component of convergence of each one of the functions 

which compose E(z) contains completely the imaginary axis. 

Further, it follows that 

$$[E(iu)]^{-1} = K(u), \tag{II.2.4}$$

where K(u) is the FT of k. From the evenness of $[E(z)]^{-1}$ it follows that K(u) is real, hence k is even. But E(z) is even iff b = 0 and a(2ℓ − 1) = −a(2ℓ), ℓ = 1, 2, . . . . Therefore E(z) is taken to be

$$E(z) = \exp(-cz^2) \prod_{\ell=1}^{\infty} \left(1 - \frac{z^2}{a^2(\ell)}\right), \tag{II.2.5}$$

with c, a(ℓ) ∈ R, c ≥ 0, and

$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$

Equation (II.2.4) establishes the relationship between the 

bilateral Laplace transform and the Fourier transform of k. 

Thus, any analysis associated with use of the bilateral Laplace 

transform can be undertaken in terms of the Fourier transform. 

Using equation (II.2.4) the Fourier transform of (II.2.1) is 

given by 

$$k(x) \leftrightarrow K(u) = [E(iu)]^{-1} = \exp(-cu^2) \prod_{\ell=1}^{\infty} \left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right] \tag{II.2.6}$$

where ↔ denotes transformation from real to Fourier space, c, a(ℓ) ∈ R, c ≥ 0, and

$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$

Because equation (II.2.6) is a variation diminishing function 

by construction and |K(0)| ≤ 1, then the following result 

holds. 

Theorem II.2.1 (VDSKs) 

k defined as in equation (II.2.6) 

=⇒ 

1. k is a smoothing kernel belonging to SK1, 

2. k is variation diminishing, 

3. k(x) ≥ 0, x ∈ R. 

In order to make a complete study of the VDSKs, such 

kernels will be divided in three classes: The Finite VDSKs, 

The Non-Finite VDSKs, and The Gaussian VDSK. 

II.3 The Finite VDSKs 

The finite and the non-finite VDSKs are kernels which can 

be synthesized from the following basic function:

$$e(x) = \frac{1}{2} \exp(-|x|), \quad x \in \mathbb{R}. \tag{II.3.1}$$

The finite VDSKs are made up by a finite number of convolutions of functions a(ℓ) e[a(ℓ)x], ℓ = 1, 2, . . . . Clearly e(x) is a VDSK with mean ν = 0 and variance σ² = 2 and its Fourier transform is given by

$$e(x) \leftrightarrow \frac{1}{1 + u^2}. \tag{II.3.2}$$

Note that if a > 0, then a e(ax) is again a VDSK. Using the similarity property of the Fourier transform and equation (II.3.2), its Fourier transform is given by

$$a\, e(ax) \leftrightarrow \frac{a^2}{a^2 + u^2}. \tag{II.3.3}$$

Its mean ν again vanishes and its variance takes the value σ² = 2/a².
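The transform pair (II.3.3) and the variance 2/a² can be checked numerically. The following is a minimal sketch, assuming the Fourier transform convention K(u) = ∫ k(x) exp(−iux) dx and a simple midpoint rule on a truncated interval; the function names and the sample values a = 2, u = 1.5 are illustrative choices, not taken from the source.

```python
import math


def kernel(x, a):
    """Scaled basic kernel a*e(ax), with e(x) = exp(-|x|)/2."""
    return 0.5 * a * math.exp(-a * abs(x))


def integrate(f, lo=-30.0, hi=30.0, steps=60000):
    """Midpoint rule on [lo, hi]; the kernel is negligible outside this range for a >= 1."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def fourier_transform(u, a):
    """K(u) for the even kernel: only the cosine part of exp(-iux) contributes."""
    return integrate(lambda x: kernel(x, a) * math.cos(u * x))


a, u = 2.0, 1.5
print("numerical K(u):", fourier_transform(u, a), " closed form:", a ** 2 / (a ** 2 + u ** 2))
print("numerical variance:", integrate(lambda x: x * x * kernel(x, a)), " closed form:", 2 / a ** 2)
```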

Let a(1), a(2), . . . , a(n) > 0 be constants, some or all 

of which may be coincident. The following VDSKs are 

introduced 

$$k_\ell(x) = a(\ell)\, e[a(\ell)x], \quad \ell = 1, 2, \dots, n. \tag{II.3.4}$$

The combination of these functions by convolution gives a new VDSK with properties quantified in the following theorem.

Theorem II.3.1 (Properties of The Finite VDSKs) 

1. a(ℓ) > 0, ℓ = 1, 2, . . . ,

2. $k_\ell(x) = a(\ell)\, e[a(\ell)x]$,

3. $k = k_1 \otimes k_2 \otimes \cdots \otimes k_n$,

4. $K(u) = \prod_{\ell=1}^{n} \left[ a^2(\ell)/(a^2(\ell) + u^2) \right]$

=⇒

A. k is a VDSK,

B. k(x) ↔ K(u),

C. k has mean ν = 0,

D. k has variance $\sigma^2 = \sum_{\ell=1}^{n} \left[ 2/a^2(\ell) \right] < \infty$.

Proof. A. The assertion follows from mathematical induction. 

B. It follows from Convolution Theorem and mathematical 

induction. 

C. Let $k_\ell(x) \leftrightarrow K_\ell(u)$. Then because each $k_\ell$ is a VDSK, it follows that the respective mean, $\nu_\ell$, is given by

$$\nu_\ell = iK_\ell'(0) = 0, \quad \ell = 1, 2, \dots, n.$$

Moreover, if n = 2, then the mean ν of k is given by

$$\nu = iK'(0) = i(K_1K_2)'(0) = i(K_1K_2' + K_1'K_2)(0) = i(0) = 0.$$

The assertion follows from this result and mathematical induction.

D. Let $k_\ell(x) \leftrightarrow K_\ell(u)$. Then because $k_\ell$ is a VDSK, it follows that the respective variance, $\sigma_\ell^2$, is given by

$$\sigma_\ell^2 = -K_\ell''(0) = \frac{2}{a^2(\ell)}, \quad \ell = 1, 2, \dots, n.$$

Furthermore, from the result given in C above, if n = 2, then the variance σ² of k is given by

$$\sigma^2 = -K''(0) = -(K_1K_2)''(0) = (-K_1K_2'' - 2K_1'K_2' - K_1''K_2)(0) = \frac{2}{a^2(1)} + \frac{2}{a^2(2)}.$$

The assertion follows from this result and mathematical induction.

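From the explicit expression of K(u) given in Theorem II.3.1 it follows that

$$K(u) = \prod_{\ell=1}^{n} \left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right] = \prod_{\ell=1}^{n} \left[\frac{a(\ell)}{a(\ell) - iu}\right] \left[\frac{-a(\ell)}{-a(\ell) - iu}\right] = \left\{\prod_{\ell=1}^{n} \left[\frac{a(\ell)}{a(\ell) - iu}\right]\right\} \left\{\prod_{\ell=1}^{n} \left[\frac{-a(\ell)}{-a(\ell) - iu}\right]\right\} = \prod_{\ell=1}^{2n} \left[\frac{d(\ell)}{d(\ell) - iu}\right]$$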


where d(ℓ) = a(ℓ) for ℓ = 1, 2, . . . , n and d(ℓ) = −a(ℓ − n) for ℓ = n + 1, n + 2, . . . , 2n. Thus k is of degree 2n and the

following theorem holds. 

Theorem II.3.2 (Degree of Differentiability of The Finite 

VDSKs) 

k a finite VDSK, 

=⇒ 

1. $k \in C^{2n-2}(\mathbb{R}, \mathbb{R})$,

2. $k \in C^{2n-1}(\mathbb{R}, \mathbb{R})$ except at x = 0, where $k^{(2n-1)}(0^+)$, $k^{(2n-1)}(0^-)$ both exist.

The asymptotic behaviour of k and its Fourier transform, 

K, will be now studied. 

Theorem II.3.3 (Asymptotic Behaviour of The Fourier 

transform of The Finite VDSKs) 

1. k a finite VDSK, 

2. k(x) ↔ K(u) 

=⇒ 

$$|K(u)| = O(|u|^{-2n}), \quad |u| \to \infty.$$

Proof. k is made up of a finite number of convolution operations of functions $k_\ell(x) = a(\ell)\, e[a(\ell)x]$, where a(ℓ) > 0, ℓ = 1, 2, . . . , n; and whose FT, $K_\ell(u)$, satisfy the inequality

$$|K_\ell(u)| = \left|\frac{a^2(\ell)}{a^2(\ell) + u^2}\right| \le \frac{a^2(\ell)}{|u|^2}, \quad \ell = 1, 2, \dots, n.$$

Thus

$$|K(u)| = \left|\prod_{\ell=1}^{n} K_\ell(u)\right| \le \prod_{\ell=1}^{n} \left[\frac{a^2(\ell)}{|u|^2}\right] = |u|^{-2n} \prod_{\ell=1}^{n} a^2(\ell). \tag{II.3.5}$$

From the above theorem we construct the following corollaries.

Corollary II.3.4 (Absolute and Quadratic Integrability 

of The Fourier transform of The Finite VDSKs) 

1. k a finite VDSK, 

2. k(x) ↔ K(u) 

=⇒ 

K(u) ∈ L(R, R) ∩ L²(R, R).

Corollary II.3.5 (Absolute and Quadratic Integrability 

of The Finite VDSKs) 

k a finite VDSK, 

=⇒ 

k(x) ∈ L(R, R) ∩ L²(R, R).

The Fourier transform of K(u), itself the Fourier transform of k, is given by

K(u) ↔ 2πk(−x).

Since k is an even function then

K(u) ↔ 2πk(x).

This result, in conjunction with Corollary II.3.4 and the Riemann-Lebesgue Lemma, proves the following theorem.

Theorem II.3.6 (Asymptotic Behaviour of The Finite 

VDSKs) 

k a finite VDSK 

=⇒ 

k(x) → 0 as |x| → ∞. 

II.4 The Non-Finite VDSKs 

We now study kernels k holding the property

$$k(x) \leftrightarrow K(u) = \prod_{\ell=1}^{\infty} \left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right] \tag{II.4.1}$$

which are non-finite kernels. In particular, the infinite product in equation (II.4.1) may have only a finite number of factors, so that the finite VDSKs of the last section are included. Kernels holding equation (II.4.1) can be synthesized from the basic kernel

$$e(x) = \frac{1}{2} \exp(-|x|), \quad x \in \mathbb{R}.$$

The non-finite VDSKs are composed of a non-finite number 

of functions a(ℓ) e[a(ℓ)x], ℓ = 1, 2, . . . . The properties of such 

kernels are given in the following theorem. 

Theorem II.4.1 (Properties of The Non-Finite VDSKs) 

1. a(ℓ) > 0, ℓ = 1, 2, . . . , 

2. $k_\ell(x) = a(\ell)\, e[a(\ell)x]$,

3. $k = k_1 \otimes k_2 \otimes \cdots \otimes k_n \otimes \cdots$,

4. $K(u) = \prod_{\ell=1}^{\infty} \left[ a^2(\ell)/(a^2(\ell) + u^2) \right]$

=⇒ 


B. k(x) ↔ K(u), 


D. k has variance $\sigma^2 = \sum_{\ell=1}^{\infty} \left[ 2/a^2(\ell) \right] < \infty$.

Since k (Theorem II.4.1) is made up by a non-finite number 

of convolution operations, then it is of degree infinity, which

leads to the following. 

Theorem II.4.2 (Degree of Differentiability of The Non-Finite VDSKs)

k a non-finite VDSK

=⇒

$k \in C^{\infty}(\mathbb{R}, \mathbb{R})$.

The asymptotic behaviour of the Fourier transform of a non-finite kernel is established in the following theorem.

Theorem II.4.3 (Asymptotic Behaviour of The Fourier 

transform of The Non-Finite VDSKs) 

1. k a non-finite VDSK, 

2. k(x) ↔ K(u), 

3. R, p > 0 

=⇒ 

$|K(u)| = O(|u|^{-2p})$, $|u| \to \infty$.

Proof. Choose N > p and so large that |a(ℓ)| ≥ R when ℓ > N, which is possible since |a(ℓ)| → ∞ as ℓ → ∞. Set

\[
K_N(u) = \prod_{\ell=N+1}^{\infty} \frac{a^2(\ell)}{a^2(\ell) + u^2}.
\]

By equation (II.3.5), it follows that

\[
|K(u)| \le \frac{|K_N(u)|}{|u|^{2N}} \prod_{\ell=1}^{N} a^2(\ell).
\]

Because every factor of $K_N(u)$ lies in (0, 1], $|K_N(u)| \le 1$ for all u ∈ R. Hence, for a suitable constant M,

\[
|K(u)| \le \frac{M}{|u|^{2N}}.
\]

Since N > p, this gives $|K(u)| = O(|u|^{-2p})$ as |u| → ∞.


In particular, if p = 1 in the above theorem and because k is a 

variation diminishing function, the following corollary results. 

Corollary II.4.4 (Absolute Integrability of The Non-Finite Kernels and Their FT)

1. k a non-finite VDSK,

2. k(x) ↔ K(u)

=⇒

$k, K \in L(\mathbb{R}, \mathbb{R})$.

Application of the symmetry property of the Fourier transform, 

the Riemann-Lebesgue Lemma and the above corollary 

proves the following theorem. 

Theorem II.4.5 (Asymptotic Behaviour of The Non-Finite 

VDSKs) 

k a non-finite VDSK 

=⇒ 

k(x) → 0 as |x| → ∞. 

Some examples of non-finite VDSKs are: 

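\[
\frac{\pi}{4} \operatorname{sech}^2\!\left(\frac{\pi x}{2}\right) \leftrightarrow u \operatorname{csch} u = \prod_{\ell=1}^{\infty} \frac{\ell^2 \pi^2}{\ell^2 \pi^2 + u^2}, \tag{II.4.2}
\]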

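\[
\frac{1}{2} \operatorname{sech}\!\left(\frac{\pi x}{2}\right) \leftrightarrow \operatorname{sech} u = \prod_{\ell=1}^{\infty} \frac{(2\ell - 1)^2 \pi^2}{(2\ell - 1)^2 \pi^2 + 4u^2}. \tag{II.4.3}
\]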
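The two product expansions can be checked numerically by truncating the infinite products. The sketch below is only illustrative: it takes a(ℓ) = ℓπ for the u csch u product and, as an assumption of this sketch based on the classical product expansion of cosh u, a(ℓ) = (2ℓ − 1)π/2 for the sech u product. The truncation level, the test value of u and all function names are hypothetical choices. The same parameters are used to evaluate partial sums of the variance series Σ 2/a²(ℓ) of Theorem II.4.1.

```python
import math

def truncated_product(u, a_of, terms):
    """Product of a(l)^2 / (a(l)^2 + u^2) for l = 1..terms."""
    result = 1.0
    for l in range(1, terms + 1):
        a2 = a_of(l) ** 2
        result *= a2 / (a2 + u * u)
    return result

def truncated_variance(a_of, terms):
    """Partial sum of 2 / a(l)^2 for l = 1..terms."""
    return sum(2.0 / a_of(l) ** 2 for l in range(1, terms + 1))

TERMS = 100_000  # truncation level (assumption); the neglected tail shrinks like 1/TERMS
u = 1.3          # arbitrary test frequency

def a_csch(l):
    return l * math.pi

def a_sech(l):
    return (2 * l - 1) * math.pi / 2

csch_product = truncated_product(u, a_csch, TERMS)
sech_product = truncated_product(u, a_sech, TERMS)
csch_variance = truncated_variance(a_csch, TERMS)
sech_variance = truncated_variance(a_sech, TERMS)

print("u csch u :", csch_product, "vs", u / math.sinh(u))
print("sech u   :", sech_product, "vs", 1.0 / math.cosh(u))
print("variances:", csch_variance, sech_variance)
```

With these parameters the partial variance sums approach 1/3 and 1 respectively, which agree with the known variances of the logistic-type and hyperbolic secant densities on the left-hand sides.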

Note that a non-finite VDSK does not necessarily belong to

$L^2(\mathbb{R}, \mathbb{R})$, e.g. the kernel given by equation (II.4.3).

II.5 The Gaussian VDSK 

The Gaussian VDSK, k, is defined by the relation

\[ k(x) \leftrightarrow K(u) = \exp(-cu^2), \quad c > 0. \tag{II.5.1} \]

With the substitution $c \to 1/4c^2$, the Gaussian VDSK is now defined as

\[ k(x) \leftrightarrow K(u) = \exp(-u^2/4c^2), \quad c > 0. \tag{II.5.2} \]

The basic properties of the above kernel follow directly and 

are collated together in the following theorem. 

Theorem II.5.1 (Basic Properties of The Gaussian 

VDSK) 

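1. $k(x) = c\,\mathrm{gauss}(cx)$, $c > 0$,

2. $K(u) = \exp(-u^2/4c^2)$, $c > 0$,

3. $p > 0$

=⇒

B. $k(x) \leftrightarrow K(u)$,

D. k has variance $\sigma^2 = 1/2c^2$,

E. $k, K \in L(\mathbb{R}, \mathbb{R}) \cap L^2(\mathbb{R}, \mathbb{R})$,

F. $k, K \in C^{\infty}(\mathbb{R}, \mathbb{R})$,

G. $|k(x)| = o(|x|^{-p})$,

H. $|K(u)| = o(|u|^{-p})$.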

If, in equation (II.5.1), c is considered as a variable, say t, then after taking the inverse Fourier transform with respect to u we obtain a real-valued function of two variables, i.e.

\[ k(x, t) = \frac{1}{\sqrt{4\pi t}} \exp(-x^2/4t). \tag{II.5.3} \]

This new function is the familiar source solution of the diffusion equation

\[
\left( \frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial t} \right) k(x, t) = 0. \tag{II.5.4}
\]
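That the function in (II.5.3) satisfies (II.5.4) can be verified numerically with central finite differences. The following sketch is an illustration only: the step size h, the sample points and the function names are assumptions, and the residual is expected to be small rather than exactly zero because of discretisation error.

```python
import math

def heat_kernel(x, t):
    """k(x, t) = exp(-x^2 / 4t) / sqrt(4 pi t), defined for t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def diffusion_residual(x, t, h=1e-3):
    """Central-difference estimate of (d^2/dx^2 - d/dt) k(x, t)."""
    d2_dx2 = (heat_kernel(x + h, t) - 2.0 * heat_kernel(x, t)
              + heat_kernel(x - h, t)) / (h * h)
    d_dt = (heat_kernel(x, t + h) - heat_kernel(x, t - h)) / (2.0 * h)
    return d2_dx2 - d_dt

# Hypothetical sample points (x, t) with t well away from zero.
for x, t in ((0.0, 0.5), (0.7, 1.5), (-2.0, 3.0)):
    print(f"x = {x:5.2f}, t = {t:4.2f}, residual = {diffusion_residual(x, t):.3e}")
```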

II.6 Geometric Properties of The VDSKs 

We consider the general geometric properties shared by the 

finite, non-finite and the Gaussian VDSKs where k denotes 

either a finite, non-finite or Gaussian VDSK throughout. 

Theorem II.6.1 (Geometric Properties of The VDSKs) 

1. k a VDSK, 

2. f : R → R bounded and convex (concave) 

=⇒ 

A. For a, b ∈ R 

V [k(x) ⊗ f(x) − a − bx] ≤ V [f(x) − a − bx], (II.6.1) 

B. (k ⊗ f)(x) is convex (concave). 

Proof. A. Inequality (II.6.1) follows by a direct application 

of the variation diminishing property of k. 

B. It is well known that f is convex iff

\[
\Delta_h^2 f(x) = f(x + 2h) - 2f(x + h) + f(x) \ge 0
\]

for all x ∈ R, h > 0. Because k is a non-negative function,

\[
\Delta_h^2 [(k \otimes f)(x)] = \Delta_h^2 \left[ \int_{-\infty}^{\infty} k(y) f(x - y)\, dy \right] = \int_{-\infty}^{\infty} k(y)\, \Delta_h^2 f(x - y)\, dy \ge 0.
\]

Thus the inequality follows. The case for which f is concave follows using a similar argument but with $\Delta_h^2 f(x) \le 0$ for all x ∈ R, h > 0.
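The argument in part B has a direct discrete analogue: convolving samples of a convex function with non-negative weights cannot make any second difference negative. The sketch below illustrates this on a sampled version of f(x) = |x| smoothed by a truncated, normalised Gaussian. The grid, the test function, the kernel width and the helper names are all assumptions made for illustration, and a small tolerance is needed when inspecting the results because of floating-point rounding.

```python
import math

def second_difference(values):
    """Discrete analogue of Delta_h^2 f: v[i+2] - 2 v[i+1] + v[i]."""
    return [values[i + 2] - 2.0 * values[i + 1] + values[i]
            for i in range(len(values) - 2)]

def convolve_valid(kernel, samples):
    """Discrete convolution, kept only where the kernel fits entirely inside the samples."""
    m = len(kernel)
    return [sum(kernel[j] * samples[i + m - 1 - j] for j in range(m))
            for i in range(len(samples) - m + 1)]

h = 0.1
xs = [i * h for i in range(-100, 101)]
f = [abs(x) for x in xs]  # convex test function with a kink at x = 0

# Non-negative weights summing to one (a truncated Gaussian, chosen arbitrarily).
raw = [math.exp(-((j * h) ** 2)) for j in range(-20, 21)]
total = sum(raw)
kernel = [w / total for w in raw]

smoothed = convolve_valid(kernel, f)
curvature_in = second_difference(f)
curvature_out = second_difference(smoothed)

print("min second difference of f      :", min(curvature_in))
print("min second difference of k (x) f:", min(curvature_out))
```

The smoothed samples spread the single kink of |x| over several grid points, but every second difference stays non-negative up to rounding, so convexity is preserved.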

The geometric significance of inequality (II.6.1) is that the 

number of intersections of the straight line y = a + bx, a, b ∈ 

R, with (k⊗f)(x) does not exceed the number of intersections 

of y = a + bx with y = f(x). As a special instance of such 

an inequality, it follows that (k ⊗ f)(x) is non-negative if f 

is non-negative. 

Corollary II.6.2 (Non-Negativity of k ⊗ f) 

1. k a VDSK, 

2. f : R → R, f ≥ 0, and bounded 

=⇒ 

(k ⊗ f)(x) ≥ 0, x ∈ R. 

From the above results, it is clear that if f is composed of 

a succession of alternating convex or concave arcs, then k ⊗ f 

is also made up of a similar succession of convex or concave 

arcs equal in number to those of f. Thus, a VDSK is shape 

preserving. 

ACKNOWLEDGMENTS 

The ABX data was provided by the Systemic Risk Analysis 

Division, Bank of England, who originally commissioned the 

research.


REFERENCES 

[1] http://www.tickdata.com/ 

[2] http://www.vhayu.com/ 

[3] http://en.wikipedia.org/wiki/Louis_Bachelier

[4] http://en.wikipedia.org/wiki/Robert_Brown_(botanist)

[5] T. R. Copeland, J. F. Weston and K. Shastri, Financial Theory and 

Corporate Policy, 4th Edition, Pearson Addison Wesley, 2003. 

[6] J. D. Martin, S. H. Cox, R. F. McMinn and R. D. Maminn, The Theory of 

Finance: Evidence and Applications, International Thomson Publishing, 

1997. 

[7] R. C. Merton, Continuous-Time Finance, Blackwell Publishers, 1992.

[8] T. J. Watsham and K. Parramore, Quantitative Methods in Finance, 

Thomson Business Press, 1996. 

[9] E. Fama, The Behavior of Stock Market Prices, Journal of Business Vol. 

38, 34-105, 1965. 

[10] P. Samuelson, Proof That Properly Anticipated Prices Fluctuate Randomly, 

Industrial Management Review Vol. 6, 41-49, 1965. 

[11] E. Fama, Efficient Capital Markets: A Review of Theory and Empirical 

Work, Journal of Finance Vol. 25, 383-417, 1970. 

[12] G. M. Burton, Efficient Market Hypothesis, The New Palgrave: A 

Dictionary of Economics, Vol. 2, 120-23, 1987. 

[13] F. Black and M. Scholes, The Pricing of Options and Corporate 

Liabilities, Journal of Political Economy, Vol. 81(3), 637-659, 1973. 

[14] http://uk.finance.yahoo.com/q/hp?s=%5EFTSE 

[15] B. B. Mandelbrot and J. R. Wallis, Robustness of the Rescaled Range 

R/S in the Measurement of Noncyclic Long Run Statistical Dependence, 

Water Resources Research, Vol. 5(5), 967-988, 1969. 

[16] B. B. Mandelbrot, Statistical Methodology for Non-periodic Cycles: 

From the Covariance to R/S Analysis, Annals of Economic and Social 

Measurement, Vol. 1(3), 259-290, 1972. 

[17] H. E. Hurst, A Short Account of the Nile Basin, Cairo, Government

Press, 1944. 

[18] http://en.wikipedia.org/wiki/Elliott_wave_principle

[19] http://uk.finance.yahoo.com/q/hp?s=%5EFTSE 

[20] http://uk.finance.yahoo.com/q/hp?s=%5EDJI 

[21] B. B. Mandelbrot, The Fractal Geometry of Nature, Freeman, 1983. 

[22] J. Feder, Fractals, Plenum Press, 1988. 

[23] K. J. Falconer, Fractal Geometry, Wiley, 1990. 

[24] P. Bak, How Nature Works, Oxford University Press, 1997. 

[25] N. Lam and L. De Cola, Fractals in Geography, Prentice-Hall, 1993.

[26] H. O. Peitgen and D. Saupe (Eds.), The Science of Fractal Images, 

Springer, 1988. 

[27] A. J. Lichtenberg and M. A. Lieberman, Regular and Stochastic Motion: 

Applied Mathematical Sciences, Springer-Verlag, 1983. 

[28] J. J. Murphy, Intermarket Technical Analysis: Trading Strategies for the 

Global Stock, Bond, Commodity and Currency Market, Wiley Finance 

Editions, Wiley, 1991. 

[29] J. J. Murphy, Technical Analysis of the Futures Markets: A Comprehensive 

Guide to Trading Methods and Applications, New York Institute

of Finance, Prentice-Hall, 1999. 

[30] T. R. DeMark, The New Science of Technical Analysis, Wiley, 1994. 

[31] J. O. Matthews, K. I. Hopcraft, E. Jakeman and G. B. Siviour, Accuracy 

Analysis of Measurements on a Stable Power-law Distributed Series of 

Events, J. Phys. A: Math. Gen. 39, 13967-13982, 2006.

[32] W. H. Lee, K. I. Hopcraft, and E. Jakeman, Continuous and Discrete 

Stable Processes, Phys. Rev. E 77, American Physical Society, 011109, 

1-4. 

[33] A. Einstein, On the Motion of Small Particles Suspended in Liquids at 

Rest Required by the Molecular-Kinetic Theory of Heat, Annalen der 

Physik, Vol. 17, 549-560, 1905. 

[34] J. M. Blackledge, G. A. Evans and P. Yardley, Analytical Solutions to 

Partial Differential Equations, Springer, 1999. 

[35] H. Hurst, Long-term Storage Capacity of Reservoirs, Transactions of 

American Society of Civil Engineers, Vol. 116, 770-808, 1951. 

[36] M. F. Shlesinger, G. M. Zaslavsky and U. Frisch (Eds.), Lévy Flights 

and Related Topics in Physics, Springer 1994. 

[37] R. Hilfer, Foundations of Fractional Dynamics, Fractals Vol. 3(3), 549- 

556, 1995. 

[38] A. Compte, Stochastic Foundations of Fractional Dynamics, Phys. Rev 

E, Vol. 53(4), 4191-4193, 1996. 

[39] P. M. Morse and H. Feshbach, Methods of Theoretical Physics, McGraw- 

Hill, 1953. 

[40] G. F. Roach, Green’s Functions (Introductory Theory with Applications), 

Van Nostrand Reinhold, 1970.

[41] T. F. Nonnenmacher, Fractional Integral and Differential Equations for 

a Class of Lévy-type Probability Densities, J. Phys. A: Math. Gen. Vol. 

23, L697S-L700S, 1990. 

[42] R. Hilfer, Exact Solutions for a Class of Fractal Time Random Walks, 

Fractals, Vol. 3(1), 211-216, 1995. 

[43] R. Hilfer and L. Anton, Fractional Master Equations and Fractal Time 

Random Walks, Phys. Rev. E, Vol. 51(2), R848-R851, 1995. 

[44] M. J. Turner, J. M. Blackledge and P. Andrews, Fractal Geometry in 

Digital Imaging, Academic Press, 1997. 

[45] M. F. Shlesinger, G. M. Zaslavsky and U. Frisch (Eds.), Lévy Flights 

and Related Topics in Physics, Springer 1994. 

[46] S. Abe and S. Thurner, Anomalous Diffusion in View of Einstein's 1905

Theory of Brownian Motion, Physica A 356, 403-407, Elsevier, 2005.

[47] I. Lvova, Application of Statistical Fractional Methods for the Analysis 

of Time Series of Currency Exchange Rates, PhD Thesis, De Montfort 

University, 2006. 

[48] C. R. Rao, Linear Statistical Inference and its Applications, Wiley, 1973. 

[49] http://webscripts.softpedia.com/script/Scientific-Engineering-Ruby/Mathematics/Orthogonal-Linear-Regression-33745.html

[50] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, ISBN: 

0-12-466606-X, 1999. 

[51] http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6716&objectType=File

[52] http://www.markit.com/en/home.page. 

Jonathan Blackledge graduated in physics from 

Imperial College in 1980. He gained a PhD in theoretical 

physics from London University in 1984 and 

was then appointed a Research Fellow of Physics 

at King's College, London, from 1984 to 1988,

specializing in inverse problems in electromagnetism 

and acoustics. During this period, he worked on 

a number of industrial research contracts undertaking 

theoretical and computational research into 

the applications of inverse scattering theory for the 

analysis of signals and images. In 1988, he joined 

the Applied Mathematics and Computing Group at Cranfield University as 

Lecturer and later, as Senior Lecturer and Head of Group where he promoted 

postgraduate teaching and research in applied and engineering mathematics 

in areas which included computer aided engineering, digital signal processing 

and computer graphics. While at Cranfield, he co-founded Management and 

Personnel Services Limited through the Cranfield Business School which 

was originally established for the promotion of management consultancy 

working in partnership with the Chamber of Commerce. He managed the 

growth of the company from 1993 to 2007 to include the delivery of a 

range of National Vocational Qualifications, primarily through the City and 

Guilds London Institute, including engineering, ICT, business administration 

and management. In 1994, Jonathan Blackledge was appointed Professor of 

Applied Mathematics and Head of the Department of Mathematical Sciences 

at De Montfort University where he expanded the post-graduate and research 

portfolio of the Department and established the Institute of Simulation 

Sciences. From 2002-2008 he was appointed Visiting Professor of Information 

and Communications Technology in the Advanced Signal Processing Research 

Group, Department of Electronics and Electrical Engineering at Loughborough 

University, England (a group which he co-founded in 2003 as part 

of his appointment). In 2004 he was appointed Professor Extraordinaire of 

Computer Science in the Department of Computer Science at the University 

of the Western Cape, South Africa. His principal roles at these institutes 

include the supervision of MSc and MPhil/PhD students and the delivery 

of specialist short courses for their Continuous Professional Development 

programmes. He currently holds the prestigious Stokes Professorship funded 

by the Science Foundation Ireland at Dublin Institute of Technology and 

is Distinguished Professor in the Centre for Advanced Studies at Warsaw 

University of Technology.


An Optical Machine Vision System for 

Applications in Cytopathology 

Jonathan M Blackledge, Fellow, IET and Dmitry A Dubovitskiy, Member, IET 

Abstract— This paper discusses a new approach to the processes 

of object detection, recognition and classification in a 

digital image, focusing on problems in cytopathology. A unique

self-learning procedure is presented in order to incorporate expert

knowledge. The classification method is based on the application 

of a set of features which includes fractal parameters such as the 

Lacunarity and Fourier dimension. Thus, the approach includes 

the characterisation of an object in terms of its fractal properties 

and texture characteristics. The principal issues associated with 

object recognition are presented which include the basic model 

and segmentation algorithms. The self-learning procedure for 

designing a decision making engine using fuzzy logic and membership 

function theory is also presented and a novel technique 

for the creation and extraction of information from a membership

function is considered. The methods discussed and the algorithms

developed have a range of applications and, in this work, we

focus on the engineering of a system for automating a Papanicolaou

screening test. 

Index Terms— Computer vision, Segmentation, Object recognition, 

Contour detection, Edge detection, Decision making, 

Self-learning, Fuzzy logic, Image morphology, Cytopathology, 

Cervical smear analysis, Papanicolaou screening test. 


I. INTRODUCTION

The cervix is an important site for pathological studies,

particularly in women of reproductive age. It protects the 

uterine cavity from intrusion of pathogenic micro-organisms, 

promotes the movement of spermatozoa to the ovule and holds 

a fetus in the uterus during pregnancy. The conventional study

of cellular structures on stained glass slides for cytological 

reporting is a routine procedure for the early detection of 

pre-carcinoma conditions. Visual inspection allows an estimate 

to be made of the state of the cervix and a diagnosis to be 

developed based on the cytological pattern observed providing 

an adequate specimen is available. Worldwide, approximately 

471,000 women are diagnosed with invasive carcinoma of 

the cervix each year and of the order of 233,000 die from the

disease. Although mortality from cervical cancer continues to 

decrease due to improved screening programmes, it remains 

among the most common female cancers in many countries. 

For example, in the United Kingdom, it is ranked eleventh among

cancers in women. Sexually transmitted infection by certain strains

of the human papillomavirus is the major cause of the

condition.

Manuscript completed in December, 2009. The work reported in this paper 

is supported by the Science Foundation Ireland. 

Jonathan Blackledge (email: jonathan.blackledge@dit.ie) is SFI (Science 

Foundation Ireland) Stokes Professor, School of Electrical Engineering Systems, 

Faculty of Engineering, Dublin Institute of Technology, Kevin Street, 

Dublin 8, Ireland - http://eleceng.dit.ie/blackledge. Dr Dmitry Dubovitskiy is 

Director of Oxford Recognition Limited (email: dda@oxreco.com). 

A. Papanicolaou Screening 

Cervical cancer is preceded by a precancerous condition 

called Cervical Intraepithelial Neoplasia (CIN) which can be 

easily treated if detected. It is therefore important to identify 

CINs through a Papanicolaou screening test commonly known 

as a ‘PAP test’. A small sample of cells from the surface of 

the cervix is removed and smeared onto a glass slide and 

the material is fixed in alcohol. The slide is then stained 

and the sample(s) examined under a microscope, a search 

being carried out to detect abnormal cells. Examination typically

involves observing the nucleus of a cell and inspecting it 

for characteristics that point toward abnormalities that include 

size, texture and colour. For example, if the nucleus is enlarged 

relative to the area of the cytoplasm as shown in the example 

given in Figure 1 then there is a likelihood of abnormal activity 

within the nucleus. 

Fig. 1. Example of normal (left) and abnormal cell clusters (right) where, 

in the latter case, the nucleus-to-cytoplasm area ratio is enlarged.

Of the order of four million cervical smears are taken annually
in the UK and fifty million in the USA, for example, and a principal
diagnostic problem is that about one fifth of the borderline
preparations show the disease at an advanced stage on referral
and biopsy. Overall, there is a 50% ‘failure’ rate in detecting
significant diseases within borderline cases. In addition, there
is a 50% ‘failure’ rate in detecting significant diseases within
negative cases. The reasons for this range from the extraction of
a sample and the preparation of the slide to, most of all,
the sequential reading of a slide in the diagnostic laboratory,
where human error occurs.

In current practices world-wide a diagnosis is performed 

manually. It typically takes 8-10 minutes for a cytopathologist 

to screen a slide and involves up to 300 movements of a

microscope over the slide. This approach not only takes time 

but inevitably leads to outcomes in which it is not possible to 

guarantee consistent and accurate results as many borderline 

results are generated, for example. It is therefore of significant


value if accurate image analysis and object recognition techniques 

can be developed in an attempt to automate the process 

and produce a system that provides a reliable, consistent and 

quantitative estimation of CINs and other abnormalities to 

improve upon the subjective assessments of a cytopathologist. 

A typical screening session involves a cytopathologist 

analysing a slide under the microscope with a magnification 

up to 400x. The output is related to the number of slides 

and working hours per cytologist, and an increase in either

reduces the speed and reliability of the results. Telecytology 

[5] provides a large number of digital images for consideration 

which can lead to increased human error. Moreover, in

telecytology, the cytopathologist is not usually able to examine

cellular details and to change the focal plane of the image. 

In virtual microscopy a digital image of the entire slide is 

generated and consequently the image file can become very 

large (∼4-7 Gb). Another problem with virtual microscopy is

that the focal plane limits the representation of the specimen. 

Virtual microscopy is used for proficiency tests and there are a 

number of commercially available medical imaging assistant 

tools [11], [12], [13]. However, a cytopathologist is still an 

important factor in the ‘diagnostic cycle’. Furthermore, due 

to compression and/or differences in the focal depth, many 

images may not provide a clear enough representation of a 

cell in comparison to those obtained using conventional microscopy. 

Thus, the development of automated recognition and 

classification systems provides the potential for introducing 

quality control in national screening procedures. 

B. Image Analysis and Pattern Recognition 

Conventional microscopy, as applied to cytopathology, involves 

the use of image processing methods that are often 

designed in an attempt to provide a machine interpretation 

of an image, ideally in a form that allows some decision 

criterion to be applied, such that a pattern and/or object can 

be recognised [1], [2]. Pattern recognition uses a range of 

different approaches that are not necessarily based on any 

one particular theme or unified theoretical approach. The main 

problem is that, to date, there is no complete theoretical model 

for simulating the processes that take place when a human 

interprets an image generated by the eye, i.e. there is no 

fully compatible model, currently available, for explaining the 

processes of visual image comprehension. Hence, machine or 

computer vision remains a rather elusive subject area in which 

automatic inspection systems are advanced without having a 

fully operational theoretical framework as a guide. Nevertheless, 

numerous algorithms for interpreting two- and three-dimensional

objects in a digital image have been and continue to be

researched in order to design systems that can provide reliable 

automatic object detection and recognition in an independent 

environment, e.g. [3], [4], [14], [16], [25]. 

Vision can be thought of as a process of linking parts of 

the visual field (objects) with stored information or templates 

about their significance for the observer. There are a number of 

questions concerning vision such as: (i) what are the goals and 

constraints? (ii) what type of algorithm or set of algorithms 

is required to effect vision? (iii) what are the implications 

for the process given the types of hardware that might be 

available? (iv) what are the levels of representation required 

to achieve vision? The levels of representation are dependent 

on what type of segmentation can and/or should be applied 

to an image. For example, we may be able to produce a 

primal sketch from an image via some measure of the intensity 

changes in a scene which are recorded as place tokens and 

stored in a database. This allows sets of raw components 

to be generated, e.g. regions of pixels with similar intensity 

values or sets of lines obtained by isolating the edges of an 

image scene and computed by locating regions where there is 

a significant difference in the intensity. However, such sets are 

subject to inherent ambiguities when computed from a given 

input image and associated with those from which an existing 

database has been constructed. Such ambiguities can only be 

overcome by the application of high-level rules, based on how 

humans interpret images, but the nature of this interpretation is 

not always clear. Nevertheless, parts of an image will tend to 

have an association if they share size, colour, figural similarity, 

continuity, shading and texture, for example. For this purpose, 

we are required to consider how best to segment an image and 

what form this segmentation should take. 

The identification of the edges of an object in an image 

scene is an important aspect of the human visual system 

because it provides information on the basic topology of the 

object from which an interpretative match can be achieved. In 

other words, the segmentation of an image into a complex 

of edges is a useful pre-requisite for object identification. 

However, although many low-level processing methods can be 

applied for this purpose, the problem is to decide which object 

boundary each pixel in an image falls within and which high-level

constraints are necessary. Thus, in many cases, a principal

question is, which comes first, recognition or segmentation? 

Compared to image processing, computer vision (which 

incorporates machine vision) is more than automated image 

processing. It results in a conclusion, based on a machine 

performing an inspection of its own. The machine must be 

programmed to be sensitive to the same aspects of the visual 

field as humans find meaningful. Segmentation is concerned 

with the process of dividing an image into meaningful regions 

or segments. It is used in image analysis to separate features or 

regions of a pre-determined type from the background; it is the 

first step in automatic image analysis and pattern recognition. 

Segmentation is broadly based on one of two properties in 

an image: (i) similarity; (ii) discontinuity. The first property 

is used to segment an image into regions which have grey 

(or colour) levels within a predetermined range. The second 

property segments the image into regions of discontinuity 

where there is a more or less abrupt change in the values 

of the grey (or colour) levels. 
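The two segmentation properties can be made concrete with a minimal Python sketch. It is not the segmentation algorithm developed in this paper: it simply thresholds grey levels (similarity) and flags abrupt grey-level jumps between neighbouring pixels (discontinuity). The function names, the threshold values and the tiny test image are assumptions introduced for illustration.

```python
def segment_by_similarity(image, low, high):
    """Mask of pixels whose grey level lies within the range [low, high]."""
    return [[low <= pixel <= high for pixel in row] for row in image]

def segment_by_discontinuity(image, min_jump):
    """Mask of pixels whose grey level differs from the right or lower
    neighbour by at least min_jump (a crude measure of abrupt change)."""
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            right = abs(image[r][c + 1] - image[r][c]) if c + 1 < cols else 0
            down = abs(image[r + 1][c] - image[r][c]) if r + 1 < rows else 0
            mask[r][c] = max(right, down) >= min_jump
    return mask

# Hypothetical 4 x 5 grey-level image: a bright object on a dark background.
image = [
    [10, 10, 10, 10, 10],
    [10, 200, 200, 200, 10],
    [10, 200, 200, 200, 10],
    [10, 10, 10, 10, 10],
]

region = segment_by_similarity(image, 150, 255)   # interior of the object
edges = segment_by_discontinuity(image, 100)      # pixels adjacent to a jump

for row in region:
    print("".join("#" if flag else "." for flag in row))
for row in edges:
    print("".join("#" if flag else "." for flag in row))
```

The similarity mask recovers the object as a filled region, whereas the discontinuity mask marks only the pixels next to its boundary, which is why the two properties lead to different families of segmentation methods.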

In this paper, we consider an approach to object detection in 

an image that is based on a new segmentation (edge detection)

algorithm using a Contour Tracing Algorithm and a space-oriented

filter [6]. The image usually requires enhancing

before it is processed and, for this purpose, a novel self-adjusting

sharpening filter has been developed as discussed in this paper.

The segmented object is then analysed in terms of metrics derived

from both a Euclidean and fractal geometric perspective, the


output fields being used to train a fuzzy inference engine and 

the recognition structure being based on some of the methods 

reported in [15], for example. The approach considered is 

generic in that it can, in principle, be applied to any type 

of imaging modality. There are numerous applications of 

this technique, especially when self-calibration and learning are

mandatory. Example applications may include remote sensing, 

non-destructive evaluation and testing and many other applications 

which specifically require the classification of objects 

that are textural. However, in this paper we focus on one 

particular application, namely, the diagnosis of cervical cancer 

based on standard Papanicolaou screening test images. 

II. OBJECT RECOGNITION ARCHITECTURE 

Suppose we have an image which is given by a function 

f(x, y) and contains some object described by a set S = 

{s1, s2, ..., sn}. We consider the case when it is necessary 

to define a sample which is somewhat ‘close’ to this object. 

This task can be reduced to the construction of some function 

determining a degree of proximity of the object to a sample 

- a template of the object. Recognition is the process of 

comparing individual features against some pre-established 

template subject to a set of conditions and tolerances. The 

process of recognition commonly takes place in four definable 

stages: (i) image acquisition and filtering (as required for the 

removal of noise, for example); (ii) object location (which 

may include edge detection); (iii) measurement of object 

parameters; (iv) object class estimation. We now consider the 

common aspects of each step. In particular, we consider details 

on the design features and their implementation together with 

their advantages, disadvantages and proposals for a solution 

whose application, in this paper, focuses on problems in 

cytopathology. 
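The idea of a function that measures the degree of proximity of an object to a template, subject to tolerances, can be sketched as follows. This is a hypothetical illustration rather than the function used in this work: the tolerance-scaled Euclidean distance, the mapping to the interval (0, 1] and all names are assumptions.

```python
import math

def proximity(features, template, tolerances):
    """Degree of proximity in (0, 1]: equal to 1 for an exact match and
    decreasing with the tolerance-scaled Euclidean distance."""
    if not (len(features) == len(template) == len(tolerances)):
        raise ValueError("feature, template and tolerance vectors must have equal length")
    scaled = [(s - t) / tol for s, t, tol in zip(features, template, tolerances)]
    distance = math.sqrt(sum(d * d for d in scaled))
    return 1.0 / (1.0 + distance)

def within_tolerance(features, template, tolerances):
    """True if every feature lies within its own tolerance of the template value."""
    return all(abs(s - t) <= tol for s, t, tol in zip(features, template, tolerances))

# Hypothetical template and object feature vectors (for example, area and perimeter).
template = [4.0, 6.0]
tolerances = [1.0, 1.0]
print(proximity([4.0, 6.0], template, tolerances))   # exact match
print(proximity([1.0, 2.0], template, tolerances))   # distant object
print(within_tolerance([4.5, 5.5], template, tolerances))
```

Scaling each feature by its tolerance keeps features with different units comparable, which matters once Euclidean, fractal and statistical measures are combined in one feature vector.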

Image acquisition depends on the technology that is best 

suited for integration with a particular application. For pattern 

recognition in cytopathology, for example, high fidelity digital 

images are required for image analysis whose resolution is, 

at least, compatible with that of the image acquisition equipment used

for human inspection. For cytopathology this involves optical 

microscopy and for the application considered in this work, 

the microscope is equipped with a digital camera. The colour

images generated, examples of which are presented in this 

paper are, in general, relatively noise free and are digitised 

using a standard CCD camera. Nevertheless, it is important 

that good quality images are obtained that are homogeneous 

with regard to brightness and contrast, for example. Unless 

consistently high quality images can be generated that are 

compatible with the sample images used to design a given 

computer vision system, then that same system can be severely 

compromised. 

The system discussed in this paper is based on an object 

detection technique that includes a novel segmentation method 

and must be adjusted or ‘fine tuned’ for each area of

application. The necessary features associated with the ‘object’ 

must be computed for a particular area of application. In the 

work reported here, this includes objects for which fractal 

models are well suited [23], [1], [2]. The system provides 

an output (i.e. a decision) using a knowledge database and 

outputs a result by subscribing different objects. The ‘expert 

data’ in the application field creates the knowledge database 

by using a supervised training system with a number of model 

objects [18]. The recognition process is illustrated in Figure 2, 

a process that includes the following steps: 

Fig. 2. Recognition processes: (1) image acquisition, yielding the digital image $\{f_{m,n}\}$; (2) special transform, yielding the transformed image $\{\tilde{f}_{m,n}\}$; (3) segmentation, yielding the object images $\{f^1_{m,n}\}, \{f^2_{m,n}\}, \dots$; (4) feature detection, yielding the feature vectors $\{x^1_k\}, \{x^2_k\}, \dots$; (5) decision making, yielding the class probability vectors $\{p^1_j\}, \{p^2_j\}, \dots$; (6) reporting.

1) Image Acquisition and Filtering 

A physical object is digitally imaged and the data 

transferred to memory using current image acquisition 

hardware available commercially. The image is filtered 

to reduce noise and to remove unnecessary features such 

as light flecks. 

2) Special Transform: Edge Detection 

The digital image function $f_{m,n}$ is transformed into

$\tilde{f}_{m,n}$ to identify regions of interest and provide an

input dataset for the segmentation and feature detection 

operations [17]. This transform avoids the use of edge 

detection filters which have proved to be highly unreliable 

in the present application. 

3) Segmentation 

The image $\{f_{m,n}\}$ is segmented into individual objects

$\{f^1_{m,n}\}, \{f^2_{m,n}\}, \dots$ to perform a separate analysis

of each region. This step includes such operations as 

thresholding, morphological analysis and contour tracing 

using the convex hull method developed in [6]. 

4) Feature Detection 

Feature vectors $\{x^1_k\}, \{x^2_k\}, \dots$ are computed from the

object images $\{f^1_{m,n}\}, \{f^2_{m,n}\}, \dots$ and the corresponding

$\{\tilde{f}^1_{m,n}\}, \{\tilde{f}^2_{m,n}\}, \dots$. The features are numeric parameters

that characterize the object inclusive of its texture.


The feature vectors computed consist of a number of Euclidean 

and fractal geometric parameters together with 

statistical measures in both one- and two-dimensions. 

The one-dimensional features correspond to the border 

of an object whereas the two-dimensional features relate 

to the surface within and/or around the object. 

5) Decision Making 

This involves assigning a probability to a predefined 

set of classes [21]. Probability theory and fuzzy logic

[19] are applied to estimate the class probability vectors

$\{p^1_j\}, \{p^2_j\}, \dots$ from the object feature vectors

$\{x^1_k\}, \{x^2_k\}, \dots$. A fundamental problem is to establish

a quantitative relationship between features and class

probabilities, i.e.

$\{p_j\} \leftrightarrow \{x_k\}$.

A ‘decision’ is the estimated class of the object coupled

with a probabilistic accuracy [20].
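The decision step, in which a feature vector is mapped to class probabilities and the most probable class is reported, can be sketched with simple fuzzy membership functions. The sketch below is an assumption-laden illustration and not the decision engine described in this paper: the triangular membership functions, the use of the minimum as the fuzzy conjunction, the normalisation into probabilities, the class names and the numerical ranges are all hypothetical.

```python
def triangular_membership(x, left, peak, right):
    """Triangular fuzzy membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def class_probabilities(feature_vector, class_models):
    """Map a feature vector {x_k} to a class probability vector {p_j}.

    class_models maps each class name to one (left, peak, right) triple
    per feature; memberships are combined with min (fuzzy AND) and then
    normalised so that the non-zero scores sum to one."""
    scores = {}
    for name, triples in class_models.items():
        memberships = [triangular_membership(x, *triple)
                       for x, triple in zip(feature_vector, triples)]
        scores[name] = min(memberships)
    total = sum(scores.values())
    if total == 0.0:
        return {name: 0.0 for name in scores}
    return {name: score / total for name, score in scores.items()}

def decide(feature_vector, class_models):
    """Return the estimated class together with its probability."""
    probabilities = class_probabilities(feature_vector, class_models)
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]

# Hypothetical one-feature model: nucleus-to-cytoplasm area ratio per class.
CLASS_MODELS = {
    "normal": [(0.0, 0.2, 0.5)],
    "abnormal": [(0.3, 0.7, 1.0)],
}

print(decide([0.4], CLASS_MODELS))
print(decide([0.6], CLASS_MODELS))
```

In a trained system the membership functions would be derived from expert-labelled examples rather than fixed by hand as they are here.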

The application considered in this paper is based on algorithms 

that have been designed to solve problems associated 

with the above steps, details of which are given in [6], which

provides algorithms on threshold selection and a contour 

tracing algorithm using the ‘convex hull’ property. However, 

the application considered here requires some additional algorithms 

to solve the object recognition problem associated with 

cytopathology. This is because edge detection is particularly 

difficult to solve for images consisting of many cells and 

a special space-oriented filter has therefore been designed 

to extract parameters associated with the spatial distribution 

of object borders. This includes a self-adjustable filter for 

enhanced object sharpness that has been considered as an 

inter-medium mask filter in order to clarify a cellular border. 

For characterisation, the line of objects found using the steps 

described above, need to be considered in terms of their major 

properties. 

With regard to the design of a decision making engine, 

the approach proposed is based on establishing an expert 

learning procedure in which a Knowledge Data Base (KDB) 

is constructed based on answers that an expert makes during 

a manual mode. Once the KDB has been developed, the 

system is ready for application in the field and provides results 

automatically. However, the accuracy and robustness of the

output depend critically on the extent and completeness

of the KDB as well as the quality of the input image, primarily 

in terms of its compatibility with those images that have been 

used to generate the KDB. The algorithm discussed in Section 

IV has no analogy with previous contour tracing algorithms 

and has been designed to trace a contour of an object with 

any level of complexity to produce an output that consists 

of a consecutive list of coordinates of an object’s edge. The 

algorithm is optimised in terms of computational efficiency 

and can be realised in a compact form suitable for hardware 

implementation. 

III. REGION OF INTEREST SEGMENTATION 

For applications in cytopathology, a fundamental requirement 

is to select Regions of Interest (ROI) for detailed review.

The ROI is not taken to be the object itself but its local 

boundary. This approach improves the efficiency associated 

with the process of recognition, a process that is recursive and 

involves different settings required to evaluate the probability 

of the presence of a cell in the image. The algorithm used

for ROI segmentation is based on adaptive thresholding and 

morphological analysis. The adaptive image threshold is given 

by 

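$$T_x = \frac{1}{2}\left(\min_y\left(\max_x f(x,y)\right) - \left\langle \max_x f(x,y) \right\rangle_y\right) + \left\langle \max_x f(x,y) \right\rangle_y,$$

$$T_y = \frac{1}{2}\left(\min_x\left(\max_y f(x,y)\right) - \left\langle \max_y f(x,y) \right\rangle_x\right) + \left\langle \max_y f(x,y) \right\rangle_x,$$

$$T = \begin{cases} T_x, & T_x \geq T_y, \\ T_y, & \text{otherwise,} \end{cases}$$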

where 〈·〉x and 〈·〉y are the means within column x and row y, 

respectively. This approach provides a solution for extracting 

the most significant features in the image, in this case, the 

nucleus of cells. If these objects cover an extensive area of the 

image, then this ‘filter’ provides the fastest compact solution. 

An example of the output generated by this algorithm is shown 

in Figure 3. In order to obtain a clear boundary, morphological

analysis is applied to select objects with a predefined area. This 

is discussed in the following section. 
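As an illustration only, the adaptive threshold can be sketched in a few lines of Python. The sketch assumes the reading of the threshold expressions in which T_x is the midpoint between the smallest and the mean of the per-row maxima (and T_y likewise for the per-column maxima). The function name `adaptive_threshold` and the list-of-rows image layout are hypothetical and not taken from the original system.

```python
def adaptive_threshold(image):
    """Return the adaptive threshold T for a grey-level image.

    `image` is a list of rows; image[y][x] plays the role of f(x, y).
    Assumption: T_x = 0.5 * (min_y(max_x f) - <max_x f>_y) + <max_x f>_y,
    and T_y is the same expression with the roles of x and y swapped.
    """
    # max over x for every row y, and max over y for every column x
    row_max = [max(row) for row in image]
    col_max = [max(col) for col in zip(*image)]

    mean_row_max = sum(row_max) / len(row_max)
    mean_col_max = sum(col_max) / len(col_max)

    t_x = 0.5 * (min(row_max) - mean_row_max) + mean_row_max
    t_y = 0.5 * (min(col_max) - mean_col_max) + mean_col_max

    # T = T_x if T_x >= T_y, otherwise T_y
    return t_x if t_x >= t_y else t_y


if __name__ == "__main__":
    demo = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]]
    print(adaptive_threshold(demo))
```

Whether pixels above or below T are retained depends on the staining and on the polarity of the image, so that step is deliberately left out of the sketch.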

Fig. 3. Example of ROI segmentation where + points to the location in the 

image where there is a cell. 

IV. SPACE ORIENTED FILTER DESIGN FOR EDGE 

DETECTION 

Edge detection is used to identify the edges in an image 

which are those areas that correspond to object boundaries. 

To find these edges, an algorithm is designed that looks for 

places in the image where the intensity changes rapidly; this 

is typically based on using one of two principal criteria: 


(i) areas where the first derivative of the intensity is larger 

in magnitude than some threshold; 

(ii) regions where the second derivative of the intensity has 

a zero crossing. 

There are many standard digital filters available for this 

process. Taking into account that in many images, high frequency 

noise (white noise) is usually present, we consider an 

appropriate adaptive filtering strategy. 

A. Noise Reduction by Adaptive Wiener Filtering 

Edge detection methods typically require an effective noise reduction algorithm, and the noise reduction should be undertaken adaptively. A well known adaptive filter is the Wiener filter, which can be applied to an image adaptively, tailoring itself to the local image variance. When the variance is large, the Wiener filter performs little smoothing; when the variance is small, it performs more smoothing. This approach often produces better results than linear filtering. The adaptive filter is more selective than a comparable linear filter, preserving edges and other high frequency parts of an image. Although the Wiener filter requires greater computational time than linear filtering, it performs better when the noise is constant-power (‘white’) additive noise, such as Gaussian noise, which is one of the conditions required to simplify the result of applying a least squares criterion.

The Wiener filter algorithm uses a pixel-wise adaptive filtering 

procedure with neighborhoods of size m-by-n to estimate 

the local image mean and standard deviation. It estimates the 

local mean and variance around each pixel given respectively 

by 

$$\mu = \frac{1}{nm} \sum_{r,c \in \eta} I_s(r, c) \quad \text{(mean of the brightness of the image)}$$

$$\sigma^2 = \frac{1}{nm} \sum_{r,c \in \eta} \left( I_s^2(r, c) - \mu^2 \right) \quad \text{(dispersion)}$$

where the sum is taken over the n-by-m local neighborhood 

of each pixel in the image I. The algorithm then creates a 

pixel-wise Wiener filter using the following estimates 

$$I_D(r, c) = \mu + \frac{\sigma^2 - \nu^2}{\sigma^2} \left( I_s(r, c) - \mu \right)$$

where $\nu^2$ is the noise variance. If the noise variance is not

given, the filter uses the average of all the local estimated 

variances. In this work, the Wiener filter is used as a first step 

to processing the image prior to applying a space oriented edge 

detection filter in order to provide an image that is optimal with 

regard to solving the edge detection problem for applications 

in cytopathology. Example results are shown in Figures 4 

and 5. Figure 4 shows the original image and Figure 5 is 

the result of applying the Wiener filter described above. 
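The pixel-wise procedure described above can be sketched as follows. This is a minimal pure-Python illustration, not the implementation used in the system. The function name `adaptive_wiener`, the clipping of neighbourhoods at the image border and the clamping of the gain at zero (so that the filter never inverts contrast when the local variance falls below the noise variance) are assumptions added for the example.

```python
def adaptive_wiener(image, m=3, n=3, noise_var=None):
    """Pixel-wise adaptive Wiener filter over m-by-n neighbourhoods.

    `image` is a list of rows of numbers. Neighbourhoods are clipped at
    the image border (an assumption; other border rules are possible).
    """
    rows, cols = len(image), len(image[0])
    hr, hc = m // 2, n // 2
    mean = [[0.0] * cols for _ in range(rows)]
    var = [[0.0] * cols for _ in range(rows)]

    # Local mean and variance around each pixel
    for r in range(rows):
        for c in range(cols):
            window = [image[i][j]
                      for i in range(max(0, r - hr), min(rows, r + hr + 1))
                      for j in range(max(0, c - hc), min(cols, c + hc + 1))]
            mu = sum(window) / len(window)
            mean[r][c] = mu
            var[r][c] = sum(v * v for v in window) / len(window) - mu * mu

    # If no noise variance is given, use the average of the local variances
    if noise_var is None:
        noise_var = sum(sum(row) for row in var) / (rows * cols)

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            s2 = var[r][c]
            # Gain (sigma^2 - nu^2) / sigma^2, clamped at zero (assumption)
            gain = max(s2 - noise_var, 0.0) / s2 if s2 > 0 else 0.0
            out[r][c] = mean[r][c] + gain * (image[r][c] - mean[r][c])
    return out
```

In flat regions the gain falls to zero and the output is the local mean (strong smoothing); near edges the local variance dominates the noise variance, the gain approaches one and the pixel is left almost unchanged.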

B. Edge Detection 

Edge detection methods are based on a number of derivative estimators. For some of these estimators, it is possible to specify whether the operation should be sensitive to horizontal or vertical edges, or both. In each case, the aim is to return a binary image, i.e. an array containing elements which are either 0 or 1, where 1 represents an element of an edge and 0 represents an empty edge space. Moreover, within the context of the overall approach, it is assumed that different edge detectors will yield minimal differences. In this application a Canny filter [8] is used to provide a first estimate of the edge boundaries of a cell nucleus.

Fig. 4. Original image of a cell cluster obtained from a cervical smear after staining.

Fig. 5. Adaptive Wiener filtered image.

The Canny edge detector is based on a functional analysis 

to derive an optimal function for edge detection, starting 

with three optimisation criteria, namely, good detection, good 

localization, and only one response per edge under white noise 

conditions. The 1D ‘Canny function’ is accurately approximated 

by the derivative of a Gaussian function which is then 

combined with a Gaussian of identical standard deviation in 

the perpendicular direction, truncated at 0.001 of its peak 

value, and split into suitable masks. Underlying this method is

the idea of locating edges at the local maxima of the gradient 

magnitudes of a Gaussian-smoothed image. In addition, the 

Canny implementation employs a hysteresis operation on edge 

magnitude in order to make edges reasonably connected. 

Finally, a multiple-scale method is employed to analyse the


output of the edge detector. 

Fig. 6. Application of a Canny Filter to Figure 5. 

An example of applying a Canny filter to Figure 5 is given in Figure 6. This result illustrates that it is typically not possible to tell uniquely where the edge of a cell or nucleus occurs, especially where one edge connects with another gradient, in which case Canny edge detection introduces errors. It is therefore necessary to design a new filter, which is discussed in the following section.

C. Space Oriented Filtering 

In some cases, the nuclei of the cells in a cervical smear 

can appear very close together, or be in touch with a foreign 

object such as a bacterium. In this case, an extra filter must be 

used to obtain a contour boundary. For this purpose, a space-oriented

filter for the detection of ‘holes’ has been developed.

The nuclei represent a ‘hole’ if the image is visualised in terms 

of a surface in which the nuclei are regions of lower intensity. 

The filter has been designed to take account of the following: 

(i) objects should be of a quasi-spherical form; (ii) the search 

space should include objects with lower intensity (i.e. which 

have a darker colour); (iii) it is necessary to find only the 

surface of a cell without a hysteresis zone. An example of a 

profile that is characteristic of a nucleus is given in Figure 7. 

The same principle can of course be used for other objects. 

The solution to this problem is compounded in the algorithm 

that is now described, the basic procedure being illustrated in 

Figure 8. To start with, we estimate the brightness of the central area (using a window of 9×9 pixels) and a circle (a layer consisting of 2 pixels). If the center is dark, we suppose that it is part of a nucleus and compare the intensity along the white line in Figure 8 with the central zone. If the profile along this line has a maximum and minimum gradient, we consider the angle between them. If the angle lies in the range 79° to 248° then we assume that we are near to the border of a nucleus. This angle can be estimated automatically or established as a constant and ‘hard-wired’ into the algorithm. The next step is to apply the hole detection method (red and brown lines in Figure 8). This hole detection algorithm is extended in a procedure to decide whether the area under investigation is a nucleus or otherwise. In Figure 8, the maximum length of the brown line is approximately 70 pixels (which depends on the image resolution) and can be chosen automatically. A useful procedure is to check the direction toward the center of a nucleus but this is application dependent. If, for a period, there is no hole, then the present position is ignored. If the test for detecting a hole gives a positive result, as in an index figure, the line from the center of a hole up to the border of a hysteresis is drawn.

Fig. 7. Example intensity profile of a Nucleus.

Fig. 8. Mask used for space-oriented filtering.

In the central part of the image (Figure 5) one can see five joined kernels (nuclei). To automatically find the edges between all of these nuclei requires a special algorithm for object separation. The sequence of steps associated with the algorithm designed for this purpose can be divided into the following list:

(i) estimation of the edge; 

(ii) search the boundaries of the cell; 

(iii) calculate the direction to the center of a core; 

(iv) search the opposite edge of the core; 

(v) calculate the centers of the kernels; 

(vi) save the index map of the figure. 

Estimation of edge expectation 

Pre-processing can be used to form part of the estimated 

performance for edge expectation. This allows for accelerated


scanning of the image. For this purpose a structure estimation 

operator is applied at the central part of the mask as shown in 

Figure 8. This selects only those nuclei of interest and avoids 

spending computer time processing other parts of the image. 

Searching the boundaries of the cell (Step 1) 

The ring around the central part of a mask (Figure 8) is decomposed using the operator

$$R = [x_1, x_2, \ldots, x_n]$$

In the following analysis we evaluate the gradient sequence:

$$g_{1..n} = \frac{dR}{dn}$$

Upon demarcation of a core and after the derivation, the 

gradient window will contain two maxima - positive and 

negative. The polar angle then gives the direction of the 

nuclear center θ1. 
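A hedged sketch of this step is given below. It assumes that the ring R is sampled at n equally spaced polar angles, that the derivative is approximated by a circular forward difference, and that θ1 is taken as the bisector of the dark arc running from the strongest negative gradient to the strongest positive gradient. The function name `ring_direction` and the default angular limits (the 79° to 248° range quoted earlier) are used for illustration only.

```python
def ring_direction(ring, lo_deg=79.0, hi_deg=248.0):
    """Estimate the direction theta_1 (degrees) towards a dark nucleus.

    `ring` holds n intensity samples taken at equal angular steps around
    the central part of the mask. Returns (theta_1, separation, near_border).
    """
    n = len(ring)
    # Circular forward difference g_i = R[i+1] - R[i]
    grad = [ring[(i + 1) % n] - ring[i] for i in range(n)]

    i_max = max(range(n), key=grad.__getitem__)  # dark -> bright transition
    i_min = min(range(n), key=grad.__getitem__)  # bright -> dark transition

    step = 360.0 / n
    a_max, a_min = i_max * step, i_min * step

    # Angle swept from the negative extremum to the positive extremum
    separation = (a_max - a_min) % 360.0
    near_border = lo_deg <= separation <= hi_deg

    # Assumption: theta_1 is the bisector of that arc
    theta_1 = (a_min + separation / 2.0) % 360.0
    return theta_1, separation, near_border
```

For a ring that is bright everywhere except for one dark arc, the two gradient extrema sit at the ends of the arc and the returned direction points at its middle.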

Calculation of the direction of the center (Step 2) 

In this step, the expected direction to the center is updated 

by means of a check on the position of the angle on a plane 

between the maxima obtained in the previous step. In general, 

for the purpose of recognition, a point on the binary map uses 

a convolution technique with a series of masks for searching 

the exact point on the object edge. The sequence of masks 

used is as follows: 

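$$M = \left\{
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\right\}$$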

The appropriate mask is applied in the direction of a local 

gradient rate and gives a maximal convolution between both 

the points obtained from the previous step. From the definition 

of the angle θ2, utilizing the a priori results, we form the mean

$$\theta = \frac{\theta_1 + \theta_2}{2}$$

The logical conformity of the mask and adjacent points of the 

binary map is further evaluated and the binary representation 

of object is determined via 

$$I_B(r, c) = \begin{cases} 1, & \text{if } M \notin I_g; \\ 0, & \text{if } M \in I_g. \end{cases}$$

The profile information (gradient and amplitude) is memorized 

for Step 3 (discussed below). The dimensions of IB(r, c) correspond to those of the starting map Ig(r, c).

Search for the opposite edge of a core (Step 3) 

The opposite gradient is searched for by finding the centre of a nucleus together with the gradient on the opposite end, which serves as a final confirmation of the coordinates of the object. In Figure 9 these lines are illustrated in brown. The

opposite profile has to have the same properties as at Step 

2. This prevents any wrong detection through irregularities in 

the image. If the opposite profile is found, then a green line 

is ‘painted’ on the index binary image from the center to the 

boundary of the nucleus as in Figure 10. 

Fig. 9. Mask of the space-oriented filter with an image. 

Fig. 10. Result of applying the space oriented filter to an image. 

Calculation of the centre of a kernel

The centre calculation algorithm is based on the weighted 

mean from the total number of bars detected in the previous 

steps - Figure 3. The calculation depends on the kind of 

implementation used to design the processing engine. If the 

calculations are implemented in a programmed logic, the data 

are better stored in an index space. For a PC, the data are

stored as an array of coordinates.
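For the PC case, where the detected bars are held as an array of coordinates, the weighted mean can be sketched as below. The text does not state how the weights are chosen, so the `(row, col, weight)` layout and the use of, say, the bar length as the weight are assumptions made for this example, and `kernel_centre` is a hypothetical name.

```python
def kernel_centre(bars):
    """Weighted mean position of the bars detected for one kernel.

    `bars` is a list of (row, col, weight) tuples. The choice of weight
    (for example the length of the bar) is an assumption of this sketch.
    """
    total = sum(w for _, _, w in bars)
    if total == 0:
        raise ValueError("at least one bar with non-zero weight is required")
    row = sum(r * w for r, _, w in bars) / total
    col = sum(c * w for _, c, w in bars) / total
    return row, col


if __name__ == "__main__":
    print(kernel_centre([(0, 0, 1), (4, 0, 1), (2, 6, 2)]))
```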

Saving the index map (Figure 11) 

After application of the algorithm, a connected area can 

be detected which serves as an index for further processing. 

An example of an index image is given in Figure 11 

which includes the application of erosion and dilation for the 

subdivision of closely located objects.

V. TWO DIMENSIONAL ALGORITHM FOR IMAGE 

SHARPENING 

In this section, we consider the procedures necessary during 

object recognition. These procedures are adaptive and are not 

bound to a particular range of applications.


Fig. 11. Segmentation of nuclei (Index Image). 

A. Self-adjustable filter for enhanced object sharpness 

The task of edge searching of an object in an image is 

a part of the process of object recognition. In the case of 

an image with no preliminary information on the quantity of 

the points on each edge, resolution or particular boundary, 

it is possible to convert the data into an auxiliary map with 

an increased contrast range. With existing algorithms image 

contrast enhancement does not provide sufficient fidelity to 

cope with unknown levels of difference between objects. 

Typically, noise appears causing an increase in the level of 

transformation parameters and at a low level there is poor 

detection of an object’s edge.

Fig. 12. Cytology cells - Mild dyskaryosis.

An image I is represented in computer memory in terms of an array of r × c points, and the value of a particular point is denoted by I(r, c). One of the approaches to applying a filter or transformation to a two-dimensional information representation is in terms of a sequence of masks M over m × n points and the subsequent calculation of a value for a central pixel depending on its environment. We now consider an algorithm for calculating the value of a central point in a moving window M with m × n points. The algorithm is applied sequentially and not recursively to all points of an image. For example, consider the image given in Figure 12. The characteristic property of the given image is that during preparation of a sample, a cell can be fixed at a given angle and consequently, it can have a different gradient rate on different boundaries. The mask sizes m and n are selected according to the size of the object in proportion to the image. The method is compounded in the following stages:

1. The first step is to sort the array M[m × n] in terms of increasing values. The result of applying this operation is represented in terms of a one-dimensional array S[i] as illustrated in Figure 13.

Fig. 13. Profile obtained by sorting an image into an array of increasing pixel values.

2. We define the index imax as the point of the sorted array with the greatest value of the gradient rate, Simax. If this maximal gradient rate is such that the given position of the window M does not correspond to a boundary of the object, it is then possible to apply general filtering methods, e.g. to calculate the average value or to take the value of a point with a predetermined index, and to assign this value to the central point. For example, in Figure 13 Simax is the point shown by the red arrow.

3. We estimate in which part of the sorted array S[i] from 

mask M there exists a value of the original central 

mask point Ic(r, c). For example, in Figure 13, this 

is indicated by the green arrow. We denote this part of 

the array by Sc[i] (see Figure 13). 

4. We estimate the parameter established by the user which 

sets a factor on a boundary excretion - in percentage 

terms, 50% for example - and then define the value of 

point Scr[i] of the array Sc[i] from the beginning of 

the array. This value is the resultant solution Ic(r, c) = 

Scr[i] displayed by the cyan arrow in Figure 13. 
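The four stages can be sketched for a single window position as follows. This is one possible reading of the procedure, not the authors' implementation. The name `sharpen_pixel`, the `min_gradient` parameter used to decide that a window does not contain a boundary, and the use of the window mean as the fallback value are assumptions.

```python
def sharpen_pixel(window, centre_value, fraction=0.5, min_gradient=0.0):
    """Return the new value of the central pixel of one mask position.

    `window` is the flattened m-by-n mask, `centre_value` is the original
    value of the central pixel and `fraction` is the user parameter
    (0.5 corresponds to 50%).
    """
    # Stage 1: sort the mask values into increasing order, S[i]
    s = sorted(window)

    # Stage 2: locate the greatest gradient of the sorted profile
    grads = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    i_max = max(range(len(grads)), key=grads.__getitem__)
    if grads[i_max] < min_gradient:
        # No boundary in this window: fall back to a general filter
        return sum(s) / len(s)

    # Stage 3: find the part of S[i] that contains the central value
    lower, upper = s[:i_max + 1], s[i_max + 1:]
    part = lower if centre_value <= s[i_max] else upper

    # Stage 4: take the value at the requested fraction of that part
    idx = min(int(fraction * len(part)), len(part) - 1)
    return part[idx]
```

Applied to every pixel in turn (sequentially, not recursively), each output value is drawn from the side of the largest jump on which the original pixel lies, which is what steepens the transition at a boundary.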

An example result of applying this procedure is shown in 

Figure 14. Application of this filter allows us to observe very 

precisely the evolution of cell boundaries during the operation 

of the object recognition system.


Fig. 14. Filtered image. 

VI. PRECISION CALCULATIONS ON THE MEASURE OF 

STRUCTURE 

For characterization, the line of objects obtained using the method described in the previous section needs to be considered in terms of its major properties. The modern

requirements for recognition systems establish structures as 

main features for natural objects such as the measures defined 

by Tamura [9]. 

For structure classification, we apply fractal geometry for 

a description of natural objects. A fundamental property of 

a fractal is its Fractal Dimension. There are a number of ways to calculate this feature of a fractal object and many different approaches to computing the Fractal Dimension have been considered [23]. For example, the origins of the ‘box dimension’ are hard to trace, but it would have been considered by pioneers of the Hausdorff measure and dimension and was probably rejected as being less satisfactory from a computational viewpoint. The precision of the calculation is less than two decimal places. Computation of the Fourier

dimension provides a better result [10]. However, in our case, 

we have to estimate the dimension from an image with a 

lower resolution than that at which the object ‘exists’ using a 

frequency spectrum that is subject to additive noise. 

Many signal processing applications are based on the use 

of different transforms. The signals under consideration are 

written as a linear combination (or series) of some predefined 

set of functions. Traditionally, orthogonal basis functions have 

been used for this purpose, for example, the discrete Fourier 

transform. The theory for orthogonal basis and Hilbert spaces 

can, however, be generalized to other sequences of functions 

called frames which have been used in this work to develop 

measures of structure with high precision. 

If we consider the profile of a typical cytopathology image, 

then the curve does not coincide with a sine-wave signal. 

To obtain adequate accuracy, it is necessary to magnify the 

resolution of the image, which in turn introduces distortion. 

For increased accuracy on low-resolution data, we consider a 

convolution function of a form more consistent with the profile 

of a video signal. For a signal I we consider the representation 

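$$F(k) = \sum_{n=1}^{N} I(n) \left\{ \arccos\left[\cos\left(\frac{2\pi(k-1)(n-1)}{N} - \frac{\pi}{2}\right)\right] - \frac{\pi}{2} - i \arcsin\left[\cos\left(\frac{2\pi(k-1)(n-1)}{N}\right)\right] \right\} \quad (1)$$

and for an image I with resolution M × N,

$$F(p, q) = \sum_{m=1}^{M} \sum_{n=1}^{N} I(m, n) \Bigg\{ \left( \arccos\left[\cos\left(\frac{2\pi(p-1)(m-1)}{M} - \frac{\pi}{2}\right)\right] - \frac{\pi}{2} \right) \times \left( \arccos\left[\cos\left(\frac{2\pi(q-1)(n-1)}{N} - \frac{\pi}{2}\right)\right] - \frac{\pi}{2} \right)$$

$$\qquad - i \arcsin\left[\cos\left(\frac{2\pi(p-1)(m-1)}{M}\right)\right] \times \arcsin\left[\cos\left(\frac{2\pi(q-1)(n-1)}{N}\right)\right] \Bigg\} \quad (2)$$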
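A direct O(N²) evaluation of the one-dimensional representation F(k) can be sketched as follows. The sketch assumes that the kernel is read as arccos[cos(x − π/2)] − π/2 for the real part and −arcsin[cos(x)] for the imaginary part, with x = 2π(k − 1)(n − 1)/N, i.e. piecewise-linear (triangular) waves in place of the sine and cosine of the discrete Fourier transform. The function name `triangular_transform` is hypothetical.

```python
import math


def triangular_transform(signal):
    """Evaluate F(k), k = 1..N, for a real one-dimensional signal I(n).

    Assumed kernel: arccos(cos(x - pi/2)) - pi/2 - i * arcsin(cos(x)),
    with x = 2*pi*(k - 1)*(n - 1)/N.
    """
    size = len(signal)
    spectrum = []
    for k in range(1, size + 1):
        acc = 0j
        for n in range(1, size + 1):
            x = 2.0 * math.pi * (k - 1) * (n - 1) / size
            real = math.acos(math.cos(x - math.pi / 2)) - math.pi / 2
            imag = -math.asin(math.cos(x))
            acc += signal[n - 1] * complex(real, imag)
        spectrum.append(acc)
    return spectrum


def power_spectrum(signal):
    """Squared modulus of F(k), as used by the power spectrum method."""
    return [abs(f) ** 2 for f in triangular_transform(signal)]
```

Because the sum is evaluated directly, no power-of-two length or large array is required, which matches the stated motivation of working with small objects.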

In this work, application of the power spectrum method used to compute the fractal dimension of a cell boundary and cell surface is based on the above representations for F(k) and F(p, q) respectively. We then consider the power spectrum of an ideal fractal signal given by $P = c|k|^{-\beta}$, where c is a constant and β is the spectral exponent. In two dimensions, the power spectrum is given by $P(k_x, k_y) = c|k|^{-\beta}$, where $|k| = \sqrt{k_x^2 + k_y^2}$. In both cases, application of the least squares method or Orthogonal Linear Regression yields a solution for β and c [23], the relationship between β and the Fractal Dimension $D_F$ being given by [23]

$$D_F = \frac{3D_T + 2 - \beta}{2}$$

for Topological Dimension DT . This approach allows us to 

drop the limits on the recognition of small objects since 

application of the FFT (for computing the power spectrum) 

works well (in terms of computational accuracy) only for 

large data sets, i.e. array sizes larger than 256 and 256×256.

Tests on the accuracy associated with computing the fractal 

dimension using equations (1) and (2) show an improvement of 

5% over computations based on conventional Discrete Fourier 

Transform. 
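The least squares step can be illustrated with a short sketch. Taking logarithms of P = c|k|^(−β) gives log P = log c − β log|k|, so β and c follow from an ordinary straight-line fit, and the Fractal Dimension follows from D_F = (3D_T + 2 − β)/2. The function name `fractal_dimension` and its argument layout are assumptions for the example; orthogonal linear regression, also mentioned in the text, is not shown.

```python
import math


def fractal_dimension(k_values, power, topological_dim=1):
    """Fit log P = log c - beta * log|k| and return (D_F, beta, c).

    `k_values` must be positive (the zero-frequency term is excluded)
    and `power` holds the matching power spectrum samples.
    """
    xs = [math.log(k) for k in k_values]
    ys = [math.log(p) for p in power]
    n = len(xs)

    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Ordinary least squares slope and intercept
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n

    beta = -slope
    c = math.exp(intercept)
    d_f = (3 * topological_dim + 2 - beta) / 2
    return d_f, beta, c
```

For a boundary profile the topological dimension is 1, and for a surface it is 2; for example, a profile with β = 2 gives D_F = 1.5.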

VII. FEATURE DETERMINATION 

Features (which are typically compounded in a set of metrics - floating point or decimal integer numbers) describe the object state in an image and provide the input for a decision making engine as illustrated in Figure 2. The features considered in this paper are computed in the spatial domains of the original image $\{f_{m,n}\}$ and transformed image $\{\tilde{f}_{m,n}\}$.

Further, these features are extracted from the three colour 

channels - Red (R), Green (G) and Blue (B) - captured


by the CCD array. The issue of what type and how many 

features should be used to develop a computer vision system 

is critical to the design associated with a specific application. 

The system considered here has been developed to include 

features associated with the texture of an object which include 

the Fractal Dimension. Texture is particularly important in 

medical image classification and of primary importance in the 

application considered in this paper. The following features 

or their derivatives have been considered (primarily through a 

process of ‘trial and error’) in the recognition system reported

in this paper: 

Average Gradient G 

describes how the intensity changes when scanning 

from the object center to the border. The object 

gradient is computed using the least squares method 

in polar coordinates as compounded in the following 

result: 

$$g = \frac{N \sum_{(m,n) \in S} r_{m,n} \tilde{f}_{m,n} - \sum_{(m,n) \in S} r_{m,n} \sum_{(m,n) \in S} \tilde{f}_{m,n}}{N \sum_{(m,n) \in S} r_{m,n}^2 - \left( \sum_{(m,n) \in S} r_{m,n} \right)^2}$$

where N is the number of object pixels and $r_{m,n}$ is the distance between (m, n) and the center (m′, n′), i.e.

$$r_{m,n} = \sqrt{(m - m')^2 + (n - n')^2}.$$

The centers (m′, n′) correspond to the local maxima of $\tilde{f}_{m,n}$ within the cluster. The cluster gradient

is the average of object gradients, 

G = 〈gi〉i∈I 

where i ∈ I is the object index. 
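The object gradient is the slope of a least squares line fitted to intensity against radial distance, which can be sketched as below. The function names `object_gradient` and `cluster_gradient` and the `((m, n), value)` pixel layout are assumptions introduced for the example.

```python
import math


def object_gradient(pixels, centre):
    """Least squares slope g of intensity against distance from the centre.

    `pixels` is a list of ((m, n), value) pairs for the object region S,
    and `centre` is the pair (m', n').
    """
    m0, n0 = centre
    count = len(pixels)
    radii = [math.hypot(m - m0, n - n0) for (m, n), _ in pixels]
    values = [value for _, value in pixels]

    sum_r = sum(radii)
    sum_f = sum(values)
    sum_rf = sum(r * f for r, f in zip(radii, values))
    sum_rr = sum(r * r for r in radii)

    return (count * sum_rf - sum_r * sum_f) / (count * sum_rr - sum_r ** 2)


def cluster_gradient(object_gradients):
    """Cluster gradient G: the average of the object gradients g_i."""
    return sum(object_gradients) / len(object_gradients)
```

An object whose intensity grows linearly with distance from its centre, f = 2r + 3 say, gives g = 2 regardless of the offset.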

Colour Composites Υ and Υ^D

characterise the relationship between the R, G and B layers of the transformed image. The triangle formula

$$r(a, b, c) = \sqrt{\frac{(s - a)(s - b)(s - c)}{s}}, \qquad s = \frac{1}{2}(a + b + c)$$

is applied to the ‘colour triangle’ RGB such that the following pixel colour composite is obtained

$$\upsilon_{m,n} = r(a, b, c)$$

where

$$a = \tilde{f}^R_{m,n}, \quad b = \tilde{f}^G_{m,n}, \quad c = \tilde{f}^B_{m,n}$$

and $\upsilon^D_{m,n} = r_{\mathrm{incircle}}(a, b, c)$ with

$$a = |\tilde{f}^R_{m,n} - \tilde{f}^G_{m,n}|, \quad b = |\tilde{f}^G_{m,n} - \tilde{f}^B_{m,n}|, \quad c = |\tilde{f}^R_{m,n} - \tilde{f}^B_{m,n}|.$$

The average colour composites are then given by

$$\Upsilon = \langle \upsilon_{m,n} \rangle_{(m,n) \in S}, \qquad \Upsilon^D = \langle \upsilon^D_{m,n} \rangle_{(m,n) \in S}.$$
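The triangle formula r(a, b, c) is the radius of the circle inscribed in a triangle with sides a, b and c. A small sketch of the composite Υ follows; returning zero when the three channel values cannot form a triangle is an assumption of the example (the text does not say how such pixels are treated), and the names `triangle_r` and `colour_composite` are hypothetical.

```python
import math


def triangle_r(a, b, c):
    """r(a, b, c) = sqrt((s - a)(s - b)(s - c) / s) with s = (a + b + c) / 2."""
    s = 0.5 * (a + b + c)
    if s == 0:
        return 0.0
    product = (s - a) * (s - b) * (s - c)
    # Assumption: values violating the triangle inequality map to zero
    return math.sqrt(max(product, 0.0) / s)


def colour_composite(rgb_pixels):
    """Average of r(R, G, B) over the object pixels (m, n) in S."""
    values = [triangle_r(r, g, b) for r, g, b in rgb_pixels]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(triangle_r(3, 4, 5))
```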

Fourier Dimension q 

determines the frequency characteristics of the object 

and is related to the fractal dimension DF by q =

4 − DF [1], [2]. It represents a measure of texture

[23] and is computed using the approach discussed 

in Section VI. 

Lacunarity (Gap Dimension) Λk 

characterizes the way the ‘gaps’ are distributed in an 

image [2]. The gap dimension is, roughly speaking, 

the number of light or dark spots in the image. It is 

defined for the given degree k by 

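$$\Lambda_k = \left\langle \left| \frac{f_{m,n}}{\langle f_{m,n} \rangle} - 1 \right|^k \right\rangle^{\frac{1}{k}},$$

where $\langle f_{m,n} \rangle = \frac{1}{N} \sum f_{m,n}$ denotes the mean value.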

In the system described in this paper, an average of 

local lacunarities of the degree k = 2 is measured in 

the spatial and frequency domains. 
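The definition of Λk translates directly into code. The sketch below computes the lacunarity of a single window of pixel values; the averaging of local lacunarities over many windows, and the frequency domain variant, are not shown. The function name `lacunarity` is an assumption.

```python
def lacunarity(values, k=2):
    """Lacunarity of degree k: < |f / <f> - 1|^k > ^ (1/k).

    `values` is a flat list of pixel values with a non-zero mean.
    """
    mean = sum(values) / len(values)
    deviations = [abs(v / mean - 1.0) ** k for v in values]
    return (sum(deviations) / len(deviations)) ** (1.0 / k)


if __name__ == "__main__":
    print(lacunarity([1, 3]))      # two-level window
    print(lacunarity([4, 4, 4]))   # uniform window
```

A uniform window has zero lacunarity, and the value grows as the pixel values spread further from their mean. For k = 2 the measure is the standard deviation divided by the mean.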

Symmetry Features Sn and M 

are estimated by morphological analysis in three-dimensional

space, i.e. two-dimensional spatial coordinates

and intensity. A symmetry feature Sn is measured 

for a given degree of symmetry n (currently 

n = {2, 4}). This value shows the deviation from a 

perfectly symmetric object, i.e. Sn is close to zero 

when the object is symmetric and Sn > 0 otherwise. 

Feature M describes the fluctuation of the centre of

mass for pixels with different intensities; M = 0 for 

symmetric objects and M > 0 otherwise. 

Structure γ 

provides an estimation of the 2D curvature of the 

object in terms of the following: 

γ < 0, if the object bulging is less than a threshold, 

γ = 0, if the object has the standard bulging, 

γ > 0, if the object has a higher level of bulging. 

Geometrical Features 

include the minimum Rmin and maximum radius 

Rmax of the object (or ratio Rmax/Rmin), object 

area S, object perimeter P (or ratio S/P 2 ) and the 

coefficient of infill S/SR, where SR is the area of 

the bounding polygon which, in this application, is 

determined using the convex hull algorithm given in 

Section V. 

The system reported in this paper classifies objects using 

mixed mode features that are based on Euclidean and fractal 

geometric metrics. The procedure of object detection is performed 

at the segmentation stage and needs to be adjusted 

for each area of application. The recognition algorithm then 

makes a decision using a knowledge database and outputs a 

result by subscribing objects based on the features defined 

above. The ‘expert data’ associated with a given application 

creates a knowledge database by using a supervised training 

system with a number of model objects. This is discussed in 

the following section.


VIII. OBJECT RECOGNITION 

In order to characterize an object, the ‘system’ must have 

a mathematical representation compounded in metrics that 

are used to compose a feature vector. The basis for the 

application considered in this paper are the textural features 

(Fourier dimension and Lacunarity) for an object coupled with 

the Euclidean and morphological measures as defined in the 

previous section. In the case of a general application, all 

objects are represented by a list of parameters for implementation 

of supervised learning in which a fuzzy logic engine 

automatically adjusts the weight coefficients for the remaining 

features. The methods developed represent a contribution to 

pattern recognition based on fractal geometry (at least in a 

partial sense), fuzzy logic and the implementation of a fully 

automatic recognition scheme as illustrated in Figure 15 for 

the Fractal Dimension D (just one element of the feature 

vector used in practice). The recognition procedure uses the 

decision making rules from fuzzy logic theory [21], [18], [19], 

[20] based on all, or a selection, of the features defined and 

discussed in Section VII which are combined to produce a 

feature vector x. 

Fig. 15. Basic architecture of the diagnostic system based on the Fractal 

Dimension D (a single feature) and decision making criteria β. 

A. Decision Making 

The class probability vector $p = \{p_j\}$ is estimated from the object feature vector $x = \{x_i\}$ and membership functions $m_j(x)$ defined in the knowledge database. If $m_j(x)$ is a membership function, the probability for each j-th class and i-th feature is given by

$$p_j(x_i) = \max\left[ \frac{\sigma_j}{|x_i - x_{j,i}|} \cdot m_j(x_{j,i}) \right]$$

for weight coefficient matrix given by $w_j = w_{j,i}$, where $\sigma_j$ is the distribution density of values $x_j$ at the point $x_i$ of the membership function. The next step is to compute the mean class probability given by

$$\langle p \rangle = \frac{1}{j} \sum_j w_j p_j$$

where the distance from the mean probability selects the class associated with

$$p(j) = \min\left[ (p_j \cdot w_j - \langle p \rangle) \geq 0 \right]$$

providing a result for the decision making of the j-th class. The weight coefficient matrix is adjusted during the learning stage of the algorithm.
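The selection rule can be sketched as follows, under the assumptions that the class probabilities p_j and weights w_j have already been computed, that the factor 1/j in the mean denotes division by the number of classes, and that the selected class is the one whose weighted probability lies closest to the mean from above. The function name `select_class` is hypothetical.

```python
def select_class(probabilities, weights):
    """Return (index of the selected class, mean weighted probability).

    Implements <p> = (1/J) * sum_j w_j p_j and selects the class with the
    smallest non-negative distance w_j p_j - <p>.
    """
    weighted = [w * p for w, p in zip(weights, probabilities)]
    mean_p = sum(weighted) / len(weighted)

    # The largest weighted probability is never below the mean,
    # so the candidate list is never empty.
    candidates = [(wp - mean_p, j)
                  for j, wp in enumerate(weighted) if wp - mean_p >= 0]
    _, selected = min(candidates)
    return selected, mean_p


if __name__ == "__main__":
    print(select_class([0.2, 0.6, 0.9], [1.0, 1.0, 1.0]))
```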

The decision criterion method considered here represents a 

weighing-density minimax expression. The estimation of the 

decision accuracy is achieved by using the density function 

$$d_i = |x_{\sigma_{\max}} - x_i|^3 + \left( \sigma_{\max}(x_{\sigma_{\max}}) - p_j(x_i) \right)^3$$

with an accuracy determined by

$$P = w_j p_j - w_j p_j \frac{2}{\pi} \sum_{i=1}^{N} d_i.$$

B. Supervised Learning Process

The supervised learning procedure is the most important 

part of the system for operation in automatic recognition mode. 

The training set of sample objects should cover all ranges of 

class characteristics with a uniform distribution together with 

a universal membership function. This rule should be taken 

into account for all classes participating in the training of the 

system. An expert defines the class and accuracy for each 

model object where the accuracy is the level of self confidence 

that the object belongs to a given class. During this procedure, 

the system computes and transfers to a knowledge database 

a vector of values of parameters x = {xi} which forms the 

membership function mj(x). The matrix of weight factors wj,i 

is formed at this stage accordingly for the i th parameter and 

j th class using the following expression: 

$$w_{i,j} = \left| 1 - \sum_{k=1}^{N} \left( p_{i,j}(x^k_{i,j}) - \langle p_{i,j}(x_{i,j}) \rangle \right) p_{i,j}(x^k_{i,j}) \right|.$$

The result of the weight matching procedure is that all 

parameters which have been computed but have not made any 

contribution to the characteristic set of an object are removed 

from the decision making algorithm by setting wj,i to null. 

IX. DISCUSSION 

The methods discussed in the previous sections represent a novel approach to designing an object recognition system that is robust in classifying textured features. The application considered in this paper required a symbiosis of the parametric representation of an object and its geometrical invariant properties. In comparison with existing methods, the approach adopted here has the following advantages:

Speed of operation. The approach uses a limited but effective 

parameter set (feature vector) associated with an object 

instead of a representation using a large set of values (pixel 

values, for example). This provides a considerably higher operational 

speed in comparison with existing schemes, especially 

with composite tasks, where the large majority of methods 

require object separation. The principal computational effort


is that associated with the computation of the feature vector 

using the metrics discussed in Section VII.

Accuracy. The methods constructed for the analysis of 

sets of geometrical primitives are, in general, more precise. 

Because the parameters are feature values, which are not 

connected to an orthogonal grid, it is possible to design 

different transformations (shifts, rotational displacements and 

scaling) without any significant loss of accuracy compared 

with a set of pixels, for example. On the other hand, the overall 

accuracy of the method is directly influenced by the accuracy 

of the procedure used to extract the required geometrical tags. 

Generally, the accuracy of the method will always be lower

than that of, for example, classical correlative techniques, where,

due to padding, error can arise during the extraction of a 

parameter set. However, by using precise parameterization 

structures based on fractal geometry, remarkably good results 

are obtained. 
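The invariance argument above can be illustrated with a small sketch. It uses compactness (4πA/P², with A the area and P the perimeter) as a stand-in geometric feature; this feature, the polygon representation and all function names are assumptions chosen for illustration and are not claimed to be the metrics used by the system.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of a closed polygon."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1])
    )


def compactness(points):
    """4*pi*area / perimeter**2: dimensionless, so scale cancels out."""
    return 4.0 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2


def transform(points, scale, angle, shift):
    """Scale, rotate (angle in radians) and then shift a list of (x, y) points."""
    c, s = math.cos(angle), math.sin(angle)
    dx, dy = shift
    return [
        (scale * (c * x - s * y) + dx, scale * (s * x + c * y) + dy)
        for x, y in points
    ]


if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    moved = transform(square, scale=3.0, angle=0.7, shift=(5.0, -2.0))
    # Both values agree up to floating-point rounding.
    print(compactness(square), compactness(moved))
```

Because the feature is computed from contour coordinates and not from a pixel grid, shifting, rotating or scaling the contour leaves the value unchanged apart from rounding error.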

Reliability. The proposed approach relies first and foremost 

on the reliability of the extraction procedure used to establish 

the geometrical and parametric properties of objects, which, 

in turn, depends on the quality of the image, principally in

terms of the quality of the contours. It should be noted that

the image quality is a common problem in any visual system 

and that in conditions of poor visibility and/or resolution, all 

vision systems will fail. In other words, the reliability of the 

system is fundamentally dependent on the quality of the input 

data. 

An additional feature of the system discussed in this paper

is that the sub-products of the image processes can be used 

for tasks that are related to image analysis such as a search for 

objects in a field of view, object identification, maintaining an 

object in a view field, optical correction of a view point and 

so on. These can include tasks involving the relative motion 

of an object with respect to another object or with respect to 

background for which the method considered can also be

applied - collision avoidance tasks, for example. 

Among the characteristic disadvantages of the approach, it 

should be noted that: (i) The method requires a considerable 

number of different calculations to be performed and appropriate 

hardware is therefore mandatory in the

development of a real time system; (ii) the accuracy of the 

method is intimately connected with the required computing 

speed - an increase in accuracy can be achieved but may be 

incompatible with acceptable computing costs. In general, it 

is often difficult to acquire a template of samples under real 

life or field trial conditions which have a uniform distribution 

of membership functions. If a large number of training objects 

are non-uniformly distributed, it is, in general, not possible to 

generate an accurate recognition system.

The original approach to the decision process proposed here includes the following important steps: (i) the density distribution is accurately estimated from the original samples in the membership function during a supervised learning phase, which improves the recognition accuracy under non-ideal conditions; (ii) the pre-filtering procedures provide a good response to the required features of the object without generating noise; (iii) the segmentation procedures discussed in Section III efficiently select only those objects required; (iv) the computation of fractal parameters, in particular the average lacunarity, helps to characterize the textural features (in terms of their classification) associated with the object.
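To make the idea of an average lacunarity concrete, the following sketch uses the widely used gliding-box estimator, in which the lacunarity at box size r is the ratio of the mean squared box mass to the squared mean box mass. This is a standard textbook definition and is an assumption here; the definition used by the authors is given elsewhere in the paper and may differ. The binary image and the box sizes are illustrative.

```python
def box_masses(image, r):
    """Foreground pixel count in every r x r box slid over a binary image."""
    rows, cols = len(image), len(image[0])
    if r < 1 or r > rows or r > cols:
        raise ValueError("box size must fit inside the image")
    masses = []
    for top in range(rows - r + 1):
        for left in range(cols - r + 1):
            masses.append(
                sum(
                    image[y][x]
                    for y in range(top, top + r)
                    for x in range(left, left + r)
                )
            )
    return masses


def lacunarity(image, r):
    """Gliding-box lacunarity: mean(M^2) / mean(M)^2 over all box masses M."""
    masses = box_masses(image, r)
    mean = sum(masses) / len(masses)
    if mean == 0:
        raise ValueError("image has no foreground pixels")
    mean_sq = sum(m * m for m in masses) / len(masses)
    return mean_sq / (mean * mean)


def average_lacunarity(image, sizes):
    """Arithmetic mean of the lacunarity over several box sizes."""
    return sum(lacunarity(image, r) for r in sizes) / len(sizes)


if __name__ == "__main__":
    full = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # homogeneous texture
    sparse = [[1, 0], [0, 0]]                  # one isolated pixel
    print(average_lacunarity(full, [1, 2, 3]))
    print(lacunarity(sparse, 1))
```

A homogeneous texture gives a value of one at every box size, while gaps and clustering push the value above one, which is why the quantity is useful as a texture descriptor.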

The integration of Euclidean with fractal geometric parameters 

provides a more complete suite of tools for pattern 

recognition in combination with supervised learning through 

fuzzy logic criteria. In the following section, we consider the 

application of our approach for the design of a cytological 

screening system. 

X. APPLICATION TO CERVICAL SMEAR SCREENING 

The application considered in this section has focused on 

screening programmes that utilize Liquid Based Cytology 

(LBC). Cells are collected from the cervix in the same way as 

a PAP smear, but using a very small brush instead of a spatula.

The head of the brush is broken off and maintained in a liquid 

environment instead of smearing the cells directly onto a slide. 

This preserves the cells and so the results of the test are more 

reliable. At present, about one in twelve PAP smears have to 

be done again because they cannot be read properly. With the

LBC approach, far fewer tests have to be repeated. However, the

LBC method is not, as yet, in widespread use. Nevertheless, 

the system reported in this paper has been designed to operate 

in conjunction with screening centres that use LBC. 

A. Classes of Cervical Cells 

There are two main types of cervical cancer: (i) Squamous 

cell cancer; (ii) Adenocarcinoma. They are named after the 

type of cell that becomes cancerous. Squamous cells are 

the flat skin-like cells that cover the surface of the cervix. 

Squamous cell cancer is the most common type of cervical 

cancer. Adenocarcinoma cells are glandular cells that produce 

mucus. The cervix has these glandular cells along the inside 

of the passageway that runs from the cervix to the womb 

(the endocervical canal). Adenocarcinoma is a cancer of these 

cell types. It is less common than squamous cell cancer, but 

has become more commonly recognised in recent years. Only 

about one in five to one in ten cases of cervical cancer are 

adenocarcinoma. Adenocarcinoma is associated with a similar 

precancerous phase. It is treated in the same way as squamous 

cell cancer of the cervix. 

Tables I and II explain the relationship between the classes used by the current system and the Bethesda 2001 classification (http://www.aafp.org/afp/2003/1115/p1992.html). The first class represents normal cells and the last represents malignant (cancerous) cells. Intermediate classes represent different degrees of abnormality; it is important to detect these as well. The classification on which the system is ‘focused’ is simplified because, unlike Bethesda 2001, it provides a fuzzy estimation of class membership, which gives a better description of the cell state. An additional class, Exudate, is defined to describe irrelevant structures in the image.

TABLE I
CLASSIFICATION OF SQUAMOUS CELLS.

System         Bethesda 2001
Normal Sq      Normal squamous cells
Normal Sq      Atypical squamous cells – ‘undetermined significance’ (ASC-US)
Normal Sq      Atypical squamous cells – ‘cannot exclude high grade disease’ (ASC-H)
LSIL           Low grade squamous intra-epithelial lesion (LSIL)
HSIL           High grade squamous intra-epithelial lesion (HSIL) – CIN2
HSIL           High grade squamous intra-epithelial lesion (HSIL) – CIN3
Invasive Sq    Invasive squamous carcinoma

TABLE II
CLASSIFICATION OF GLANDULAR CELLS.

System            Bethesda 2001
Normal Gl         Normal glandular cells
Normal Gl         Atypical glandular cells (AGC) – endocx/endom/not specified
Normal Gl         Atypical glandular cells (AGC) – favour neoplasia
AIS               Adenocarcinoma in situ (AIS)
Invasive Adeno    Adenocarcinoma

With current techniques, all cervical smear tests are examined by ‘screeners’ who have only a few minutes per slide. This means that the screening is done at low magnification and high speed so it is not surprising that mistakes can be made. The ‘screeners’ look for abnormal variations in the ratio of the size of the nucleus relative to the size of the cell, as well as other markers of diseased tissue. When they identify suspect areas of the slide they mark these with a felt tip pen and pass them on for further inspection. These slides are then looked at by ‘checkers’ who have more experience and examine the slide more carefully and at higher magnification. If they are not satisfied that ‘all is well’, then they pass the suspect slides to a cytopathologist for further, more detailed analysis and diagnosis. Even at this final stage, mistakes can be made as each slide is prepared differently and it is common for cells to overlie each other, further compounding the problem of accurate diagnosis. New techniques that use cytocentrifuge preparations (e.g. http://www.tharmac.com/?p=15) can overcome this last problem but have yet to be introduced in general.

One of the major criteria of assessing whether a cell is premalignant 

or malignant is the ratio of the size of the nucleus of 

the cell compared with that of the whole cytoplasm - the nuclear/cytoplasmic 

ratio. The rapid identification of variations in 

these ratios enables ‘checkers’ to quickly and more accurately 

determine if there are abnormalities by examining cells that 

are located in a small area. To estimate the condition of the 

cells, the cytologist typically makes up to 300 slide movements

over a period of 8-10 minutes on a desk microscope and may

consequently miss many important features. This approach not

only takes time but inevitably cannot guarantee consistent

and accurate estimates of the condition of the cells. With an 

increasing number of screening projects taking place together 

with the variability of different preparations, diagnostic errors 

can lead to a number of fatalities due to false negatives and 

lack of appropriate treatment in the early stages of cervical 

cancer. 
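The nuclear/cytoplasmic ratio mentioned above can be sketched as follows, assuming that segmentation has already produced two binary masks of equal size, one for the whole cell and one for its nucleus. The mask representation, the function name and the choice of taking the cytoplasm as the cell area minus the nucleus area are assumptions made for illustration.

```python
def nuclear_cytoplasmic_ratio(cell_mask, nucleus_mask):
    """Nucleus area divided by cytoplasm area, both measured in pixels.

    cell_mask and nucleus_mask are equally sized 2D lists of 0/1 values.
    Nucleus pixels are only counted where they lie inside the cell mask.
    """
    cell_area = sum(sum(row) for row in cell_mask)
    nucleus_area = sum(
        n
        for cell_row, nucleus_row in zip(cell_mask, nucleus_mask)
        for c, n in zip(cell_row, nucleus_row)
        if c
    )
    cytoplasm_area = cell_area - nucleus_area
    if cytoplasm_area <= 0:
        raise ValueError("cell mask must be larger than the nucleus mask")
    return nucleus_area / cytoplasm_area


if __name__ == "__main__":
    cell = [[1, 1, 1, 1] for _ in range(4)]      # 16 cell pixels
    nucleus = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]                                            # 4 nucleus pixels
    print(nuclear_cytoplasmic_ratio(cell, nucleus))
```

An automated system can evaluate such a ratio for every segmented cell on a slide, whereas a human observer can only estimate it by eye for the cells currently in view.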

At present, there are no commercial or experimental systems 

available for the automatic identification and classification 

of tissue cells without human participation. Obtaining results 

from cytology diagnostics in real time with a robust least error 

criterion is a widespread and important problem for screening 

the cervix uteri. The automatic coloring (staining) and 

scanning of the material creates preconditions in designing an 

algorithm and technical devices for the automatic identification 

and classification in cytopathology. A key point is to identify 

and classify the condition of the cell nuclei using a suitable 

recognition process. 

There is a range of techniques that aim to improve the examination of slides using integrated optical densitometry. For example, SurePath (http://www.pathlabsofark.com/surepathliquidpap.html) uses integrated optical density of conventional smears. The

aim of the system reported in this paper is to exclude 25% 

of samples without visual examination. Unlike a human 

expert, the automatic scanning method can count the cells 

and estimate their statistical distribution among classes or 

states. The system delivers high accuracy and automation due 

to the following innovations: 

Fractal analysis 

Biological structures (such as body tissues) have 

natural fractal properties. Numerical measurements 

of these properties provide for the efficient and

effective detection of abnormalities. 

Extended set of detectable features 

High accuracy is achieved when multiple features are 

measured together and combined into a result.

Advanced fuzzy logic engine 

The knowledge-based recognition scheme enables 

highly accurate diagnosis. 
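The statement that automatic scanning can count the cells and estimate their distribution among classes can be illustrated with a short sketch. The class labels follow Tables I and II; the function names, the handling of the Exudate class and the triage rule (a slide is excluded from visual examination only if no abnormal cell is found) are hypothetical and are not the criterion used by the system reported here.

```python
from collections import Counter

# Labels taken from Tables I and II; treating these two as 'normal' is an assumption.
NORMAL_CLASSES = {"Normal Sq", "Normal Gl"}


def class_distribution(labels, ignore=("Exudate",)):
    """Fraction of classified cells per class, skipping irrelevant structures."""
    counts = Counter(label for label in labels if label not in ignore)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells left after removing ignored classes")
    return {name: count / total for name, count in counts.items()}


def can_skip_visual_check(labels, max_abnormal_fraction=0.0):
    """Hypothetical triage rule: allow at most the given fraction of abnormal cells."""
    distribution = class_distribution(labels)
    abnormal = sum(
        fraction
        for name, fraction in distribution.items()
        if name not in NORMAL_CLASSES
    )
    return abnormal <= max_abnormal_fraction


if __name__ == "__main__":
    slide = ["Normal Sq", "Normal Sq", "Exudate", "Normal Gl", "LSIL"]
    print(class_distribution(slide))
    print(can_skip_visual_check(slide))
```

In practice the threshold would have to be set and validated by cytopathologists; the default of zero simply means that any abnormal cell sends the slide for visual examination.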

B. System Overview 

It is proposed that the approach described in this paper and 

the system developed may assist cytopathologists in reducing 

the workload by eliminating in a secure manner a percentage 

of normal smears, thus allowing more time for the evaluation 

of the abnormal cases. The ‘software solutions’ detect abnormalities 

in organic structures such as cells by digital image 

analysis. Cancer experts create the knowledge database by 

training the system with a number of case study images. The 

recognition algorithm is composed of the following steps: 

Filtering 

The image is filtered to reduce noise and remove 

unnecessary features (bacteria, broken cells). 

Segmentation 

The image is segmented to perform a separate analysis 

of each object. In order to separate connected 

objects a new algorithm has been designed. An 

example of the GUI developed is given in Figure 16 

which shows the stage at which the nuclei of suspect 

cells have been identified and located. 

Feature Detection 

For each object, a set of recognition features is

detected. The features are numeric parameters that 

describe the object inclusive of fractal geometric 

parameters. The system captures a variety of geometrical, 

fractal and statistical features in one and two dimensions. One-dimensional features correspond to the border of objects, whereas two-dimensional features relate to the surface within and around objects.

Decision Making

The system uses fuzzy logic to combine features into a decision. A decision is the estimated class of the object together with an accuracy probability. In-between states are determined by the probability. For example, 35% normal is equivalent to 65% abnormal and suggests careful analysis by cancer specialists. In the extended training version for cervical cancer, the system provides up to 10 classes (CINs) depending on the classification system and the number and extent of available samples for learning procedures.

Fig. 16. GUI associated with the cervical smear analysis system.

The system has been developed to operate with a range of image acquisition hardware, examples of which are provided in Table III.

TABLE III
IMAGING ACQUISITION HARDWARE.

Model and Supplier: Nikon Coolscope (Nikon Instruments Europe BV)
Advantages: Available on the market. Magnification 40x. Complete solution with a slide feeder.
Shortcomings: Very slow (several hours/slide). Small focus depth and automatic focus does not find the optimal z-position. Dynamic range to be adjusted. Tiling scan.

Model and Supplier: Aperio Scanscope (Aperio Technologies: DakoCytomation)
Advantages: High scanning speed (20 min/slide). Magnification 40x. Non-tiling scan. Better focus. Better dynamic range.
Shortcomings: Not fully developed. Problem to achieve 60x.

Model and Supplier: Nikon Eclipse E8000 + JVC 3-CCD KY-F55B
Advantages: Variable resolution 4X-80X. Manual focus. Manual brightness.
Shortcomings: Manual image capture. Can be used only for testing.
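The decision-making step of the recognition algorithm, in which features are combined into a class and an accuracy probability, can be sketched as follows. The sketch replaces the fuzzy logic engine with a plain weighted average of per-feature membership values followed by normalisation across classes; this simplification, the feature names and the numbers are assumptions for illustration and do not reproduce the engine used by the system.

```python
def classify(memberships, weights):
    """Combine per-feature memberships into (best_class, probability).

    memberships[cls][feature] is a membership value in [0, 1];
    weights[cls][feature] is the matching non-negative weight factor.
    """
    scores = {}
    for cls, features in memberships.items():
        total_weight = sum(weights[cls][name] for name in features)
        if total_weight > 0:
            weighted = sum(weights[cls][name] * m for name, m in features.items())
            scores[cls] = weighted / total_weight
        else:
            # All weights set to null: this class cannot be supported.
            scores[cls] = 0.0
    norm = sum(scores.values())
    if norm == 0:
        raise ValueError("no class received a non-zero score")
    best = max(scores, key=scores.get)
    return best, scores[best] / norm


if __name__ == "__main__":
    memberships = {
        "normal": {"nc_ratio": 0.35, "lacunarity": 0.35},
        "abnormal": {"nc_ratio": 0.65, "lacunarity": 0.65},
    }
    weights = {
        "normal": {"nc_ratio": 1.0, "lacunarity": 1.0},
        "abnormal": {"nc_ratio": 1.0, "lacunarity": 1.0},
    }
    print(classify(memberships, weights))
```

With two classes the normalised scores are complementary, which mirrors the statement that 35% normal is equivalent to 65% abnormal.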

XI. CONCLUSION 

This paper has been concerned with the task of developing 

a methodology and implementing applications that are concerned 

with two key tasks: (i) the partial analysis of an image 

in terms of its fractal structure and the fractal properties that 

characterize that structure; (ii) the use of a fuzzy logic engine 

to classify an object based on both its Euclidean and fractal 

geometric properties. The combination of these two aspects 

has been used to define a processing and image analysis engine 

that is unique in its modus operandi but entirely generic in 

terms of the applications to which it can be applied. 

The research has investigated numerous processes for pattern 

recognition using fractal geometry as a central processing 

kernel. This has led to the design of a new library of pattern 

recognition algorithms. Images of the types considered carry

about 80% of the environmental information that is useful to a human.

With rapid advances in video technology, the content of a 

video stream is increasing at a rate that is far beyond the 

human brain capacity for decision making. This creates

a need for developing an automatic image processing and

decision making system using artificial intelligence. Such 

systems are required in search engines, information databases, 

navigation in unknown terrain, interpretation of two-dimensional

data, etc. 

The creation of logic and general purpose hardware for 

artificial intelligence is a basic theme for any future development 

based on the results reported in this paper for 

the applications developed and beyond. The results of the 

current system can be utilized in a number of different areas 

although medical imaging would appear to be one of the 

most natural fields of interest because of the nature of the 

images available, their complex structures and the difficulty 

of obtaining accurate diagnostic results which are efficient 

and time effective. A further extension of our approach is to 

consider the effect of replacing the fuzzy logic engine used 

to date with an appropriate Artificial Neural Network. It is 

not clear as to whether the application of an ANN could 

provide a more effective system and whether it could provide 

greater flexibility with regard to the type of images used and 

the classifications that may be required. Within the context 

of this paper, algorithms have been designed that focus on 

solving the detection and classification problems associated 

with the analysis of cervical smear images. In this respect, a 

new set of image processing algorithms have been developed 

that may have value in a wider class of image processing 

and pattern recognition applications, particularly with regard to

medical image analysis. 

ACKNOWLEDGMENTS 

This work is supported by the Science Foundation Ireland. 

The authors are grateful for the advice and help of Dr Alastair 

Deery (Department of Cellular Pathology, St Georges Hospital, 

London), Professor Jonathan Brostoff (Kings College, London 

University) and Professor Irina Shabalova (Russian Medical 

Academy of Postgraduate Education, Moscow).



Jonathan Blackledge received a BSc in Physics 

from Imperial College, London University in 1980, 

a Diploma of Imperial College in Plasma Physics 

in 1981 and a PhD in Theoretical Physics from 

Kings College, London University in 1983. As a Research 

Fellow of Physics at Kings College (London 

University) from 1984 to 1988, he specialized in 

information systems engineering undertaking work 

primarily for the defence industry. This was followed 

by academic appointments at the Universities of 

Cranfield (Senior Lecturer in Applied Mathematics) 

and De Montfort (Professor in Applied Mathematics and Computing) 

where he established new post-graduate MSc/PhD programmes and research 

groups in computer aided engineering and informatics. In 1994, he cofounded 

Management and Personnel Services Limited where he is currently 

Executive Director. His work for Microsharp (Director of R & D, 1998- 

2002) included the development of manufacturing processes now being 

used for digital information display units. In 2002, he co-founded a group 

of companies specializing in information security and cryptology for the 

defence and intelligence communities, actively creating partnerships between 

industry and academia (e.g. Lexicon Data Limited). He is currently holder 

of the Stokes Professorship in Digital Signal Processing and Information and 

Communications Technology based at Dublin Institute of Technology and has 

published over one hundred scientific and engineering research papers and 

technical reports for industry, six industrial software systems, fifteen patents, 

ten books and been supervisor to sixty research (PhD) graduates. His current 

research interests include computational geometry and computer graphics, 

image analysis, nonlinear dynamical systems modelling and computer network 

security, working in both an academic and commercial context. He holds 

Fellowships with England’s leading scientific and engineering Institutes and 

Societies including the Institute of Physics, the Institute of Mathematics and 

its Applications, the Institution of Electrical Engineers, the Institution of 

Mechanical Engineers, the British Computer Society, the Royal Statistical 

Society and the Chartered Management Institute. 

Dmitry Dubovitskiy received a BSc and Diploma 

in Aeronautical Engineering from Saratov Aviation 

Technical College in 1993, an MSc in Computer 

Science and Information Technology from Baumann 

Moscow State Technical University in 1999 and a 

PhD in Computer Science from De Montfort University 

in 2005 under the supervision of Professor 

J M Blackledge. As a project leader in medical 

imaging at Microsharp Limited from 2002 to 2005, 

he specialized in information systems engineering, 

developing image recognition systems for medical 

applications for real time operational diagnosis. He founded Oxford Recognition 

Limited in 2005 which specialises in the applications of artificial 

intelligence for computer vision. He has developed a range of computer vision 

systems for industry including applications for 3D image visualisation and has 

been coordinator for the INTAS project in distributed automated systems for 

acquiring and analysing eye tracking data.
