
ISAST Transactions on Computers and Intelligent Systems, No. 1, Vol. 2, 2010 (ISSN 1798-2448)

Gang LIU, Gang CUI, Hongwei LIU, and Zhibo WU:
A Reliability Enhancement Adaptive Routing Mechanism for Mobile Ad Hoc Networks

J. C. Chedjou, K. Kyamakya, U. A. Khan, and M. A. Latif:
Potential Contribution of CNN-based Solving of Stiff ODEs & PDEs to Enabling Real-Time Computational Engineering

Y. Labrador, M. Karimi, N. Pissinou, and D. Pan:
Performance Comparison of OFDM and Single Carrier Modulations over Satellite Channels

Z. R. Ghobadi and H. Rashidi:
Software Rejuvenation Technique-An Improvement in Applications with Multiple Versions

S. A. Asghari, H. Pedram and H. Taheri:
A New Attitude based on Real Time Operating System for NoC in Hotspot Traffic Model

V. Y. Kontorovich, Z. Lovtchikova, J. A. Meda-Campaña, and K. Tinsley:
Nonlinear Filtering Algorithms for Chaotic Signals: A Comparative Study

H. D. Vankayalapati and K. Kyamakya:
Nonlinear Feature Extraction Approaches for Scalable Face Recognition Applications

R. Karthikeyan, A. Karthikeyan and S. Sivaperumal:
Artificial Human Limbs – A Design Approach for Military Application

K. Kyamakya, J. C. Chedjou, M. A. Latif, and U. A. Khan:
A Novel Image Processing Approach Combining a ‘Coupled Nonlinear Oscillators’-based Paradigm with Cellular Neural Networks for Dynamic Robust Contrast Enhancement

JianLi GUO, HongWei LIU, and XiaoZong YANG:
Common-neighbor Monitoring Enhanced Cooperation Enforcement Scheme for MANETs

J. M. Blackledge:
Systemic Risk Assessment using a Non-stationary Fractional Dynamic Stochastic Model for the Analysis of Economic Signals

J. M. Blackledge and D. A. Dubovitskiy:
An Optical Machine Vision System for Applications in Cytopathology



A Reliability Enhancement Adaptive Routing Mechanism for Mobile Ad Hoc Networks

Gang LIU, Gang CUI, Hongwei LIU, and Zhibo WU

Abstract: Selecting a stable route for data packet transmission reduces the control packet traffic and the energy consumption caused by frequent route reconstruction and maintenance in dynamic mobile Ad Hoc networks, and therefore improves the efficiency and extends the lifetime of the network. This paper proposes an algorithm, based on information entropy, that measures the dynamic character of a mobile node by analyzing the uncertainty in how its neighbor set changes over time, and uses the result as a metric for selecting stable routes in mobile Ad Hoc networks. Simulation results show that the stable routing measurement method markedly improves key performance indicators of mobile Ad Hoc networks, such as packet delivery ratio and packet end-to-end delay.

Index Terms: Mobile Ad Hoc networks, stable routing, uncertainty, information entropy.

1 INTRODUCTION

A mobile Ad Hoc network is a temporary, self-organizing, decentralized multi-hop wireless network formed by a group of autonomous wireless mobile nodes. The nodes can move freely and can join or leave the network at any time, without sending any warning, while the network is running. As a result, the network topology, the relations between nodes, and the state of the wireless links change constantly. In such a dynamic environment, selecting a stable route for data transmission can effectively reduce the network bandwidth and energy consumed by the frequent route reconstruction and maintenance that occur during transmission. This improves the utilization of network resources and prolongs the lifetime of the network, which matters greatly in mobile Ad Hoc networks, where network resources and energy supply are relatively limited.

• Gang LIU, Gang CUI, Hongwei LIU and Zhibo WU are with the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China, 150001. E-mail: lg.hit@163.com
• This paper is partially supported by the Hi-Tech Research and Development Program (863) of China under grant No. 2008AA01A201 and the National Natural Science Foundation of China under grant No. 60503015.
Manuscript received April 19, 2009; revised September 11, 2009.

One common strategy for selecting stable routes starts from a formal movement model of the nodes in the mobile Ad Hoc network: the wireless link behavior implied by the specific movement model is analyzed and used as the stability metric during route establishment and selection [1], [2], [3], [4], [5]. The main problem with this strategy is that the resulting stability measure applies only where the assumed movement model holds. LENDERS [6] analyzed the impact of actual node mobility on the connection state of wireless links, but gave only qualitative conclusions, with no formal quantitative results.

Another common approach obtains accurate real-time information about the physical positions and relative speeds of the wireless mobile nodes and uses the changes in this information as the stability metric of a link or route [7], [8], [9], [10]. Acquiring and updating node positions and speeds in real time requires special facilities (such as GPS), so this approach suits only certain application environments. ROHIT [11] and GEUNHWI [12] used the strength and stability of the signal received from mobile nodes during data transmission as the route selection criterion. The signal stability based adaptive routing protocol (SSA) uses an extended device driver interface to maintain the signal stability table (SST), which the protocol stack needs for its cross-layer operation. In addition, XU [13] analyzed the dynamic characteristics of the network topology and applied the analysis to a hierarchical network architecture, a clustering algorithm, and a cluster-based routing protocol in order to maximize network performance and stability. KIM [14] proposed a scalable Ad Hoc routing protocol that uses the logical topology information of the network to help the routing protocol adapt to the growing scale of Ad Hoc networks. The authors of [15] proposed a path stability calculation method based on the correlation factor of the wireless links on a path, which considers the correlated state changes of adjacent wireless links, rather than each link in isolation, when evaluating the stability of the full route.

Unlike the methods above, this paper uses an information uncertainty metric. The changing neighbor set of a wireless node is treated as an information source with uncertain properties, and a quantitative measure of that uncertainty serves as the route selection criterion in the dynamic network environment, which provides the necessary support for reliable data transmission.

The rest of this paper is organized as follows. Section 2 introduces a novel method for measuring routing stability based on information entropy, together with a new routing protocol, BSORP, that combines the stability measurement method with the AODV protocol. Section 3 compares the performance of AODV and BSORP to validate the effect of the stability measurement method. Section 4 concludes the paper.

2 MODEL OF ROUTING STABILITY

2.1 Stability measurement for wireless mobile nodes

Assume that each wireless mobile node in the mobile Ad Hoc network has a unique identifier and that all nodes have the same wireless transmission radius. The network topology at time $t$ can then be modeled as an undirected graph $G_t = (V, E_t)$, where $V$ denotes the set of all wireless nodes, $|V|$ denotes the number of nodes in $V$, and $E_t$ denotes the set of $|E_t|$ bidirectional wireless links at time $t$.

If node $m$ inspects the members of its neighbor set periodically, with inspection period $\Delta t$, then over a time $T$ it obtains an ordered sequence of the neighbor sets seen at successive times:

$$S^{T}_{NS}(N) = (ns_{t_1}, ns_{t_2}, \cdots, ns_{t_N}), \quad N = T/\Delta t \tag{1}$$

In the sequence $S^{T}_{NS}(N)$, regard the neighbor set observed at time $t_i$ as a random variable $NS$. The distinct neighbor sets that occur in the sequence form the value range of this random variable, namely $NS^{T} = \{ns_1, ns_2, \cdots, ns_k\}$, where $ns_{t_i} \in NS^{T}$, $i = 1, \cdots, N$, and $1 \le k \le N$. The probability of each element of $NS^{T}$ can then be computed from the sequence $S^{T}_{NS}(N)$.
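As an illustration of this step, the following minimal Python sketch estimates the probability of each distinct neighbor set from its relative frequency in the sampled sequence. Using relative frequencies is an assumption about how the distribution is computed, and the function name and sample data are hypothetical.

```python
from collections import Counter


def neighbor_set_distribution(samples):
    """Map each distinct neighbor set to its relative frequency.

    `samples` is the ordered sequence of neighbor sets seen by one node,
    one entry per inspection period (hypothetical input format).
    """
    seq = [frozenset(s) for s in samples]  # frozenset makes sets hashable
    counts = Counter(seq)
    n = len(seq)
    return {ns: count / n for ns, count in counts.items()}


# Four inspections (N = 4) yielding two distinct neighbor sets (k = 2).
samples = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "b"}]
dist = neighbor_set_distribution(samples)
```

Here the neighbor set {a, b} appears in three of the four samples, so it receives probability 0.75, and {a, c} receives 0.25.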

The sequence $S^{T}_{NS}(N)$ reflects the dynamic relationship between the moving wireless node $m$ and its neighbor nodes. Information entropy [16] measures uncertainty, so quantifying the uncertainty of the sequence $S^{T}_{NS}(N)$ and of the set $NS^{T}$ yields a stability criterion for building and selecting stable routes in mobile Ad Hoc networks.

To measure the uncertainty of the sequence $S^{T}_{NS}(N)$, we re-express it by merging consecutive identical neighbor sets and recording the length of each run of consecutive appearances, namely

$$RNS^{T} = (R_1(ns^1), R_2(ns^2), \ldots, R_l(ns^l)), \quad \text{where } \sum_{i=1}^{l} R_i(ns^i) = N,\ ns^i \in NS^{T},\ l \ge k \tag{2}$$
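The run-length form of Eq. (2) can be sketched in Python with `itertools.groupby`, which groups consecutive equal items. The function name and the sample data are hypothetical and serve only as an illustration.

```python
from itertools import groupby


def run_lengths(samples):
    """Collapse consecutive identical neighbor sets into (set, length) runs."""
    seq = [frozenset(s) for s in samples]
    # groupby only merges adjacent equal items, so a neighbor set that
    # reappears later starts a new run; this is why l can exceed k.
    return [(ns, len(list(group))) for ns, group in groupby(seq)]


samples = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "b"}]
runs = run_lengths(samples)
lengths = [length for _, length in runs]
```

In this example the sequence has N = 4 samples and k = 2 distinct neighbor sets, but l = 3 runs, because {a, b} occurs in two separate runs. The run lengths sum to N, as Eq. (2) requires.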

The uncertainty of the neighbor set of wireless mobile node $m$ can be measured from the sequence $RNS^{T}$ by the weighted entropy

$$H_W(RNS^{T}) = -\sum_{i=1}^{l} RC^{i}_{NS}\, \frac{R_i(ns^i)}{N} \log \frac{R_i(ns^i)}{N} \tag{3}$$

In this formula, the weight $RC^{i}_{NS}$ is the ratio of change in membership between the neighbor sets of adjacent runs in the sequence $RNS^{T}$:

$$RC^{i}_{NS} = \begin{cases} 1, & i = 1 \\ 1 - \dfrac{|ns^{i-1} \cap ns^{i}|}{|ns^{i-1} \cup ns^{i}|}, & i > 1 \end{cases} \tag{4}$$

In the same way, the uncertainty of the set $NS^{T}$ is measured by the standard information entropy of a discrete random variable:

$$H(NS^{T}) = -\sum_{j=1}^{k} p(ns_j) \log p(ns_j) \tag{5}$$
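The weight of Eq. (4) can be sketched as follows, reading the ratio of sets as a ratio of their cardinalities (an assumption; under that reading the weight is the Jaccard distance between the neighbor sets of adjacent runs). The function names and the empty-set convention are hypothetical.

```python
def change_ratio(prev_set, cur_set):
    """Weight for i > 1: one minus the overlap ratio of two neighbor sets."""
    prev_set, cur_set = frozenset(prev_set), frozenset(cur_set)
    union = prev_set | cur_set
    if not union:
        return 0.0  # assumption: two empty neighbor sets count as no change
    return 1.0 - len(prev_set & cur_set) / len(union)


def run_weights(run_sets):
    """Weights for a list of per-run neighbor sets; the first run gets 1."""
    return [
        1.0 if i == 0 else change_ratio(run_sets[i - 1], run_sets[i])
        for i in range(len(run_sets))
    ]


weights = run_weights([{"a", "b"}, {"a", "c"}, {"d"}])
```

Moving from {a, b} to {a, c} keeps one of three distinct members, so the weight is 1 - 1/3. Moving from {a, c} to {d} keeps none, so the weight is 1.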

From these results, the metric stability of mobile node $m$ in the Ad Hoc network is defined as

$$MS(m) = 1 - \frac{H_W(RNS^{T})\, H(NS^{T})}{\log N \cdot \log N} \tag{6}$$

The properties of the metric stability are as follows.

1. By the properties of information entropy, $H(NS^{T})$ and $H_W(RNS^{T})$ both take values in $[0, \log N]$, so $MS(m)$ takes values in $[0, 1]$. The uncertainty increases as the number of distinct neighbor sets grows and as their probability distribution approaches uniform, and the metric stability $MS(m)$ decreases accordingly. In the worst case, every sample yields a different neighbor set and $MS(m)$ equals 0, so a route through the node cannot transfer data reliably. In the best case, every sample yields the same neighbor set and $MS(m)$ equals 1.

2. The weighting in the computation of $H_W(RNS^{T})$ brings the effect of membership changes in the neighbor set into the uncertainty metric: the more drastically the members of the neighbor set change, the larger the uncertainty and the lower the stability of the wireless mobile node.

3. Because the uncertainty is measured from both the value range of the neighbor set and the distribution of its members, the metric describes the dynamic character of the wireless mobile node compactly, in terms of both statistics and behavior.
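The properties above can be checked numerically with a minimal Python sketch of Eqs. (3) to (6). It is an illustration under stated assumptions, not the authors' implementation: probabilities are relative frequencies, both entropies use the usual Shannon sign convention (a leading minus, so that they lie in $[0, \log N]$ as property 1 states), the set ratio in Eq. (4) is read as a ratio of cardinalities, and a sequence with fewer than two samples is given stability 1. The function name is hypothetical. The logarithm base cancels in Eq. (6), so natural logarithms are used.

```python
import math
from collections import Counter
from itertools import groupby


def metric_stability(samples):
    """Metric stability MS(m) of one node from its sampled neighbor sets."""
    seq = [frozenset(s) for s in samples]
    n = len(seq)
    if n < 2:
        return 1.0  # assumption: log N = 0, so treat as fully stable

    # Eq. (5): entropy of the distinct neighbor sets.
    h = -sum((c / n) * math.log(c / n) for c in Counter(seq).values())

    # Eq. (3): weighted entropy over runs of identical neighbor sets.
    hw, prev = 0.0, None
    for ns, group in groupby(seq):
        p = len(list(group)) / n
        # Eq. (4): weight 1 for the first run, else the change ratio.
        # Consecutive runs differ, so the union below is never empty.
        w = 1.0 if prev is None else 1.0 - len(prev & ns) / len(prev | ns)
        hw -= w * p * math.log(p)
        prev = ns

    # Eq. (6).
    return 1.0 - (hw * h) / (math.log(n) ** 2)


stable = metric_stability([{"a", "b"}] * 5)
unstable = metric_stability([{"a"}, {"b"}, {"c"}, {"d"}])
mixed = metric_stability([{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "b"}])
```

With an unchanging neighbor set the sketch returns 1. With a different, pairwise disjoint neighbor set at every sample it returns 0 up to rounding. The mixed sequence falls strictly between the two.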

2.2 Stability measurement of the routing

Building on the stability measurement of a wireless mobile node, the stability measurement $SR(S,D)$ of the route from source node $S$ to destination node $D$ is the product of the stability measurements of all wireless mobile nodes participating in the route, namely

$$SR(S,D) = \prod_{i \in R(S,D)} S(i) \tag{7}$$

where $S(i)$ is the stability measurement of node $i$ on the route $R(S,D)$, as defined in Section 2.1. Because each $S(i)$ lies in $[0, 1]$, the maximum of $SR(S,D)$ is 1. Two factors affect the route stability measurement: the length of the route $R(S,D)$ and the stability of the wireless mobile nodes on it. The fewer hops the route has and the more stable its nodes are, the closer $SR(S,D)$ is to 1 and the more stable the route.
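A short Python sketch of the product in Eq. (7) shows how route length and node stability both lower the route measurement. The function name and the sample values are hypothetical; `math.prod` requires Python 3.8 or later.

```python
import math


def route_stability(node_stabilities):
    """Product of the per-node stability measurements along a route."""
    return math.prod(node_stabilities)


two_hops = route_stability([0.9, 0.9])
three_hops = route_stability([0.9, 0.9, 0.9])
weak_node = route_stability([0.9, 0.5])
```

With equally stable nodes, the three-node route scores lower than the two-node route (0.729 against 0.81), and a single unstable node pulls a short route down to 0.45. This is why the product favors short routes through stable nodes.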

2.3 On-demand routing protocol based on stability measurement (BSORP)

BSORP builds on the Ad Hoc on-demand distance vector routing protocol (AODV) [17] and adds the construction and selection of stable routes. The effect of the proposed stability calculation method on network performance is analyzed by simulation, comparing the improved protocol with the original AODV protocol in the same network environment.

To use the mobile node stability measurement as the path selection metric during route construction, the routing table and the associated control packet structures of the protocol must be extended. Each routing table entry of the original AODV protocol gains a route stability metric field, which records the route stability metric from the source node to the current forwarding node. Each RREQ packet gains a current route stability metric field (the stability metric from the RREQ source node to the node currently forwarding the packet). When an intermediate forwarding node receives a valid RREQ control packet, it first updates and records the route stability metric from the node that originated the RREQ to itself, and then continues the route discovery process. In addition, each RREP control packet gains a route stability metric field that serves as the stability indicator for the source node of the RREQ packet: it carries the stability measurement of the full route, which is stored in the routing table entry of the RREQ source node.

When a wireless mobile node receives several copies of the same RREQ packet during route discovery, it compares each new copy with its current routing table entry. If the route stability metric of the newly received copy is less than the minimum recorded in the entry, or if the two metrics are equal but the new copy has traveled fewer hops, the copy is not discarded: the node updates the stability metric field of the routing table entry and redirects the next hop of the entry to the node that forwarded this copy. Otherwise, the duplicate copy is discarded. In addition, to obtain a more stable route, the destination node of the RREQ keeps answering later copies whose route stability metric is less than the minimum it has recorded so far, until its RREQ response timer expires.
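The duplicate-handling rule can be sketched as below. The sketch follows the comparison exactly as the text words it (a copy wins when its recorded metric is smaller, or equal with fewer hops). The text does not say how the recorded field is encoded, so the comparison direction should be treated as an assumption. The class, field, and function names are hypothetical and do not come from AODV or NS2.

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    """Hypothetical routing table entry with the added stability field."""
    metric: float  # recorded route stability metric field
    hops: int      # hop count from the RREQ source
    next_hop: str  # node the entry currently points back to


def handle_rreq_copy(entry, metric, hops, sender):
    """Process a duplicate RREQ; return True if accepted, False if discarded."""
    # Comparison direction follows the text as written (assumption).
    # Exact float equality is used for the tie-break for simplicity.
    accept = metric < entry.metric or (
        metric == entry.metric and hops < entry.hops
    )
    if accept:
        # Update the stability field and redirect to the forwarding node.
        entry.metric, entry.hops, entry.next_hop = metric, hops, sender
    return accept


entry = RouteEntry(metric=0.5, hops=3, next_hop="n2")
first = handle_rreq_copy(entry, 0.4, 4, "n5")   # smaller metric: accepted
second = handle_rreq_copy(entry, 0.4, 3, "n6")  # equal metric, fewer hops
third = handle_rreq_copy(entry, 0.6, 1, "n7")   # larger metric: discarded
```

After the three copies, the entry points to `n6`, the sender of the last accepted copy.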

Fig. 1 shows the route redirection performed by forwarding nodes while processing RREQ packets and the multiple responses sent by the final destination node of the RREQ during route establishment in the mobile Ad Hoc network.

3 SIMULATION RESULTS

3.1 Simulation evaluation indicators

The simulation is based on the NS2 (v2.31) network simulator. The wireless mobile nodes follow the Random Waypoint Model (RWM), and the distributed coordination function (DCF) of the IEEE 802.11 specification serves as the MAC protocol. All wireless mobile nodes in the Ad Hoc network move randomly within a rectangular area of 1800 m × 900 m.

Fig. 1. Routing redirection and multiple response operations in the process of routing search. [Figure not reproduced: a topology of numbered nodes.]

The other simulation parameters are listed in Table 1.

TABLE 1
Simulation parameters

Parameter             Value
Simulation time       800 s
Transmission range    250 m
Receiver range        250 m
Number of nodes       50
Maximum pause time    50 s
Traffic type          CBR
Packet size           512 bytes
CBR rate              5 packets/s

3.2 Simulation environment

In the simulation, we compare the original AODV routing protocol and the BSORP routing protocol under the same parameter settings on three key network performance indicators: the application-layer packet successful delivery ratio, the application-layer packet end-to-end delay, and the routing control load overhead.

Packet successful delivery ratio: the ratio of the number of packets successfully received by the application layer of the destination nodes of all CBR data flows to the total number of packets issued by the source nodes of all CBR data flows in the mobile Ad Hoc network.

Packet end-to-end delay: the average transmission delay from the moment the application layer of a CBR source node sends a data packet until the packet reaches the application layer of its final destination node, that is, the application-layer end-to-end delay of data packets.

Routing protocol control load overhead: the ratio of the number of control packets sent by the network-layer routing protocol during the simulation to the number of all packets sent.
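The three indicators can be computed from simple counters, as in the following Python sketch. The function names and the sample numbers are hypothetical. In particular, taking "all packets sent" to mean control packets plus data packets is an assumption, and the delivery ratio is computed as received over sent, which matches the value range of 0 to 1 in Fig. 2.

```python
from statistics import mean


def packet_delivery_ratio(sent, received):
    """Application-layer packets received divided by packets sent."""
    return received / sent if sent else 0.0


def mean_end_to_end_delay(deliveries):
    """Average delay over (send_time, receive_time) pairs, in seconds.

    Only delivered packets have a receive time, so lost packets are
    not part of `deliveries`.
    """
    delays = [rx - tx for tx, rx in deliveries]
    return mean(delays) if delays else 0.0


def control_overhead(control_packets, data_packets):
    """Control packets as a share of all packets sent (assumed total)."""
    total = control_packets + data_packets
    return control_packets / total if total else 0.0


pdr = packet_delivery_ratio(sent=100, received=80)
delay = mean_end_to_end_delay([(0.0, 0.5), (1.0, 2.5)])
overhead = control_overhead(control_packets=25, data_packets=75)
```

For these made-up counters the delivery ratio is 0.8, the mean delay is 1.0 s, and the control overhead is 0.25.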

3.3 Analysis of the simulation results

Fig. 2 compares the packet successful delivery ratio of the two routing protocols in the simulation. The results show that the moving speed of the wireless mobile nodes has a considerable impact on the quality of data transmission in Ad Hoc networks: the delivery ratio of both protocols decreases significantly as the node speed increases. However, because BSORP selects relatively stable network nodes for data transmission when it establishes a route, it improves the packet successful delivery ratio and slows the rapid decline of the delivery ratio as node speed increases.

Fig. 3 shows the packet end-to-end delay of the two protocols. The simulation results show that BSORP significantly reduces the end-to-end delay of data packet transmission. Because the stable route selection strategy of BSORP accumulates the node stability measurements by multiplication during route discovery, the number of forwarding nodes is an important factor: a route with fewer forwarding nodes has a competitive advantage. In addition, choosing relatively stable mobile nodes for data forwarding significantly reduces the packet delay caused by the route maintenance and reconstruction that follow frequent wireless link disruptions.

Fig. 2. Packet successful delivery ratio at different motion speeds. [Plot not reproduced. x-axis: motion speed of wireless node (m/s), 5 to 25; y-axis: packet successful delivery ratio, 0.40 to 1.00; curves: AODV and BSORP.]

Fig. 3. Packet end-to-end delay at different motion speeds. [Plot not reproduced. x-axis: motion speed of wireless node (m/s), 5 to 25; y-axis: packet end-to-end delay (s), 0.00 to 2.00; curves: AODV and BSORP.]

Fig. 4 and Fig. 5 show how the AODV and BSORP routing protocols differ in the number of route breaks and in control load during data transfer under the same network environment. In mobile Ad Hoc networks, the route maintenance and reconstruction operations triggered by frequent wireless link disruptions are a major cause of increased control load. This is especially true for on-demand routing protocols, in which a route break leads to heavy control overhead because the protocol floods the network to establish or repair the necessary route. Fig. 4 shows that choosing relatively stable mobile nodes to take part in data forwarding significantly reduces the probability of a route break, and that the advantage of the stability-based routing strategy becomes more pronounced as node speed and the dynamics of the network increase. Route breaks and control load are strongly correlated in a dynamic network environment, as the control load overhead of the two routing protocols shows: the control load of an on-demand routing protocol can be reduced effectively by avoiding frequent route breaks.

Fig. 4. The number of routing disruptions for mobile routing. [Plot not reproduced. x-axis: motion speed of wireless node (m/s), 5 to 25; y-axis: route broken times, 0 to 4000; curves: AODV and BSORP.]

These simulation results show that stable route selection has a major impact on the performance of Ad Hoc networks. The relevant performance indicators improve clearly, which means that the network can achieve higher resource utilization and a longer lifetime in a resource-constrained environment. Stable route selection is therefore important for the practical application of Ad Hoc networks.

Fig. 5. Routing control load. [Plot not reproduced. x-axis: motion speed of wireless node (m/s), 5 to 25; y-axis: routing overhead, 0.00 to 0.60; curves: AODV and BSORP.]

4 CONCLUSIONS

This paper presents a method for calculating route stability that measures the stability of a mobile wireless node by the uncertainty of the membership changes in its neighbor set. The method is used to improve the on-demand distance vector routing protocol (AODV): choosing stable wireless mobile nodes as the active participants of a route improves network performance in a dynamic network environment. Simulation results show that the proposed stable routing method effectively improves important network performance indicators, such as the packet delivery ratio and the end-to-end delay.

5 ACKNOWLEDGMENTS

The authors would like to thank the National Natural Science Foundation of China (60503015) and the National High Technology Research and Development Program of China (863) (2008AA01A201).

REFERENCES

[1] C. Carofiglio, C. Chiasserini, M. Garetto, and E. Leonardi, “Route stability in MANETs under the random direction mobility model,” IEEE Transactions on Mobile Computing, vol. 8, no. 9, pp. 1167–1179, 2009.

[2] M. Garetto and E. Leonardi, “Analysis of random mobility models with partial differential equations,” IEEE Transactions on Mobile Computing, vol. 6, no. 11, pp. 1204–1217, 2007.

[3] W. Su, S. Lee, and M. Gerla, “Mobility prediction and routing in Ad Hoc wireless networks,” International Journal of Network Management, vol. 11, no. 1, pp. 3–30, 2001.

[4] S. Misra, S. Dhurandher, M. Obaidat, N. Nangia, N. Bhardwaj, P. Goyal, and S. Aggarwal, “Node stability-based location updating in mobile Ad-Hoc networks,” IEEE Systems Journal, vol. 2, no. 2, pp. 237–247, 2008.

[5] J. Liu, W. Guo, X. B.L., and F. Huang, “Path holding probability based Ad Hoc on-demand routing protocol,” Journal of Software, vol. 18, no. 3, pp. 693–701, 2007.

[6] V. Lenders, J. Wagner, S. Heimlicher, M. May, B. Plattner, and E. Zurich, “An empirical study of the impact of mobility on link failures in an 802.11 Ad Hoc network,” IEEE Wireless Communications, vol. 15, no. 6, pp. 16–21, 2008.

[7] S. Prince and B. Stephen, “On the behavior of communication links of a node in a multi-hop mobile environment,” in Proceedings of 5th ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2004, pp. 145–156.

[8] W. Tang and W. Guo, “A path reliable routing protocol in mobile ad hoc networks,” in Proceedings of 4th International Conference on Mobile Ad-Hoc and Sensor Networks, 2008, pp. 203–207.

[9] S. Xu, K. Blackmore, and H. Jonees, “Mobility assessment for MANETS requiring persistent links,” in Proceedings of International Conference on Mobile System, Applications and Services, 2005, pp. 39–44.

[10] J. Sumesh and V. Anand, “Mobility aware path maintenance in ad hoc networks,” in Proceedings of the 2009 ACM Symposium on Applied Computing, 2009, pp. 201–206.

[11] D. Rohit, D. Cynthia, K. Wang, and K. Satish, “Signal stability based adaptive routing (SSA) for Ad Hoc mobile networks,” IEEE Personal Communications, vol. 4, no. 1, pp. 36–45, 1997.

[12] G. Lim, K. Shim, S. Kim, and H. Yoon, “Signal strength-based link stability estimation in ad hoc wireless networks,” Electronics Letters, vol. 39, no. 5, pp. 485–486, 2003.

[13] Y. Xu and W. Wang, “Topology stability analysis and its application in hierarchical mobile ad hoc networks,” IEEE Transactions on Vehicular Technology, vol. 58, no. 3, pp. 1546–1560, 2009.

[14] H. Kim and M. Yoo, “A Scalable Ad Hoc routing protocol based on logical topology for ubiquitous community network,” in Proceedings of the 9th International Conference on Advanced Communication Technology, vol. 2, 2007, pp. 1306–1310.

[15] H. Zhang and Y. Dong, “A novel path stability computation model for wireless Ad Hoc Networks,” IEEE Signal Processing Letters, vol. 14, no. 12, pp. 928–931, 2007.

[16] C. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 12, pp. 379–423, 623–656, 1948.

[17] C. Perkins and E. Royer, “Ad hoc on-demand distance vector routing (AODV),” 2003.

Gang Liu is a Ph.D. student at HIT. His research interests include ad hoc networks and dependable computing.

Gang CUI is a professor at HIT. His research interests include fault-tolerant computing technology, computer architecture, ad hoc networks, and wireless sensor networks.

Hongwei Liu is a professor at HIT. His research interests include fault-tolerant computing technology, ad hoc networks, and wireless sensor networks.

Zhibo WU is a professor at HIT. His research interests include fault-tolerant computing technology, computer architecture, ad hoc networks, and wireless sensor networks.



ISAST Transactions on Computers and Intelligent Systems, No. 1, Vol. 2, 2010 (ISSN 1798-2448)

Potential Contribution of CNN-based Solving of Stiff ODEs & PDEs to Enabling Real-Time Computational Engineering

Jean Chamberlain Chedjou (1), Cyrille Kalenga Wa Ngoy (2), Michel Matalatala Tamasala (2), Kyandoghere Kyamakya (1)

(1): Transportation Informatics Group, Institute of Smart Systems Technologies, University of Klagenfurt (Austria), Email: kyandoghere.kyamakya@uni-klu.ac.at ; jean.chedjou@uni-klu.ac.at

(2): Department of Electrical and Computer Engineering, Polytechnic Faculty, University of Kinshasa (D. R. Congo), Email: Cyrille.Kalenga@vodacom.cd ; Michel.Matalatala@vodacom.cd

Abstract: One of the most common ways to avoid complexity when numerically solving stiff ordinary differential equations (ODEs) is to approximate them by ignoring the nonlinear terms. For stiff partial differential equations (PDEs), the same is done by suppressing the nonlinear terms of the Taylor series expansion. In doing so, the traditional methods for solving stiff PDEs and ODEs compromise on both the efficiency and the precision of the resulting computations. The less accurate results cannot provide the full insight that may be needed into the ‘real’ nonlinear dynamical behavior of the various engineering and natural systems (generally modeled by nonlinear ODEs or PDEs) that are analyzed in the discipline called Computational Engineering. For many of these systems, real-time simulation and/or control of the behavior is desired or required; this places extremely demanding requirements on the computing capability with regard to both speed and precision. This paper develops, and validates through a series of examples, a comprehensive high-precision and ultra-fast computing concept for solving stiff ODEs and PDEs with Cellular Neural Networks (CNN). The core of this concept is a straightforward scheme that we call ‘Nonlinear Adaptive Optimization (NAOP)’, which is used for precise template calculation for solving any (stiff) nonlinear ODE through CNN processors. One of the key contributions of this work, which we regard as a real breakthrough, is to demonstrate the possibility of mapping the different types of nonlinearity displayed by various classical and well-known oscillators (e.g. van der Pol, Rayleigh, Duffing, Rössler, Lorenz, and Jerk oscillators, to name a few) onto first-order CNN elementary cells, thereby enabling the easy derivation of the corresponding CNN templates. Furthermore, in the case of PDE solving, the same concept also allows a mapping onto first-order CNN cells while retaining one or more nonlinear terms of the Taylor series expansion generally used to transform a PDE into a set of coupled nonlinear ODEs. The concept of this paper therefore contributes significantly to the consolidation of CNN as a universal and ultra-fast solver of stiff differential equations (both ODEs and PDEs). This enables a CNN-based, real-time, ultra-precise, and low-cost Computational Engineering. As proof of concept, some well-known prototypes of stiff equations (van der Pol, Lorenz, and Rössler oscillators) have been considered; the corresponding CNN templates are derived to obtain precise solutions of these equations. The concept can be implemented even on embedded digital platforms (e.g. FPGA, DSP, GPU), which opens a broad range of applications. Ongoing work (as outlook) uses NAOP to derive precise templates for a selected set of practically interesting PDE models such as the Navier-Stokes, Schrödinger, and Maxwell equations.

Keywords: Stiff ODEs and PDEs, CNN-based differential equation solving, high-precision computing, ultra-fast computing, NAOP scheme for CNN template calculation.

I. INTRODUCTION<br />

The last decades have witnessed considerable attention to solving nonlinear and stiff models (ODEs and/or PDEs) with the CNN paradigm [1]. The interest in solving stiff models can be explained by their many potential applications, especially in the context of so-called Computational Engineering. Indeed, nonlinear models have been used intensively to understand, predict, and describe the dynamical behavior of various engineering and natural systems. In the field of transportation and logistics, for example, traffic models take the form of ODEs and/or PDEs [2]. Also in transportation, various image processing tasks that are of high importance for visual sensors in Advanced Driver Assistance Systems (e.g. contrast enhancement, segmentation, edge detection) can be expressed as the solution of corresponding stiff ODEs and/or PDEs [3].

Diverse contributions have been made to develop analytical, numerical, and even hardware-based approaches to solving stiff ODEs and/or PDEs [1]-[20]. Among these, the solution of PDEs and ODEs using the CNN paradigm has particularly attracted our attention. The strongest points of the CNN paradigm are its flexibility and its potential to enable a renaissance of classical analog computing through emulation on digital platforms (e.g. FPGA or GPU) in order to perform ultra-fast and accurate computing of nonlinear models. Nevertheless, the relevant state of the art provides little information on a straightforward method for calculating the CNN templates needed to solve stiff ODEs and/or PDEs with the CNN paradigm. Despite intensive work in this direction, it is still unclear how to solve PDEs and/or ODEs with good accuracy or high precision. Only approximate solutions exist, for example the use of CNN processors to approximate numerical solutions of PDEs by the finite difference method [7], [10]-[14]. This latter approach does not provide accurate results because the Taylor series expansion is carried only to first order (i.e. a linear expansion). A further interesting published approach to solving PDEs is the group of learning schemes that produce an approximate solution of PDEs through CNN processors [15]-[20]. This approach requires some initial solutions, along with some critical parameter settings of the equations under investigation, in order to enable the training process. This is a significant drawback, as it is not always possible to provide such data when dealing with stiff ODEs and/or PDEs.

Our aim in this paper is therefore to enrich the relevant state of the art by developing a systematic methodology, based on the CNN paradigm, that helps to resolve some of the problems left unsolved by the classical approaches described above. The key challenge is to develop a CNN-based computing concept that performs both ultra-fast and high-precision computing of stiff differential equations. The proposed method is based on a nonlinear adaptive optimization scheme, to which we give the acronym “NAOP”. As proof of concept, the approach is applied to derive solutions of selected classical and well-known examples of stiff ODEs. The flexibility of the approach is then discussed, and we explain how it extends to solving stiff PDEs with similar efficiency.

The rest of the paper is organized as follows. Section 2 presents an in-depth description of the novel concept: the essence of NAOP is explained, and we describe the scheme for deriving appropriate CNN template values for any given nonlinear ODE. Section 3 focuses on the proof of concept through a selected nonlinear differential equation that is solved using the new concept, namely the van der Pol equation; the corresponding ‘precise’ templates are calculated through NAOP. Section 4 discusses the possible extension of the NAOP-based scheme to solving PDEs. Finally, Section 5 presents concluding remarks along with some open research questions (outlook) that are under investigation in our ongoing work.

II. THE CONCEPT OF “NAOP” FOR CNN TEMPLATE CALCULATION AND SOLUTIONS OF STIFF ODES

This section describes the approach based on Nonlinear Adaptive Optimization (NAOP) for solving ODEs. The overall flow of this approach is shown schematically in Fig. 1.

NAOP is performed by a computing module that works on two inputs. The first input contains the wave solutions generated by the state-control CNN network modeled by (1):

$$\frac{dx_i}{dt} = -x_i + \sum_{j=1}^{M}\left[\hat{A}_{ij}\,x_j + A_{ij}\,y_j + B_{ij}\,u_j\right] + I_i \tag{1}$$
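As an illustration of how the right-hand side of (1) is evaluated, the following minimal Python sketch computes dx_i/dt for M cells. It assumes the standard piecewise-linear CNN output y_j = 0.5(|x_j + 1| - |x_j - 1|), which the text does not state explicitly; the names `cnn_output` and `cnn_rhs` are hypothetical and not taken from the paper.

```python
def cnn_output(x):
    """Assumed standard CNN output: y = 0.5 * (|x + 1| - |x - 1|)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_rhs(x, u, A_hat, A, B, I):
    """Right-hand side of (1) for M coupled cells.

    x, u, I     : lists of length M (states, inputs, biases)
    A_hat, A, B : M x M nested lists (state-control, feedback and
                  control templates)
    """
    M = len(x)
    y = [cnn_output(xj) for xj in x]
    dxdt = []
    for i in range(M):
        # Sum over j of A_hat[i][j]*x_j + A[i][j]*y_j + B[i][j]*u_j
        coupling = sum(
            A_hat[i][j] * x[j] + A[i][j] * y[j] + B[i][j] * u[j]
            for j in range(M)
        )
        dxdt.append(-x[i] + coupling + I[i])
    return dxdt
```

With all templates set to zero, each cell simply relaxes towards its bias I_i, which is a convenient sanity check before non-trivial templates are inserted.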

The second input contains the wave solutions of the model under investigation, i.e. the linear or nonlinear differential equation, which can be rewritten in the following simplified form as a set of coupled second-order ODEs (see (2)):

$$\frac{d^2 y_i}{dt^2} = F\left(y_i^{n},\, y_i^{m},\, \dot{y}_i,\, z_i^{n},\, z_i^{m},\, \dot{z}_i,\, t\right) \tag{2a}$$

$$\frac{d^2 z_j}{dt^2} = F\left(z_j^{n},\, z_j^{m},\, \dot{z}_j,\, y_j^{n},\, y_j^{m},\, \dot{y}_j,\, t\right) \tag{2b}$$

Figure 1. Synoptic representation of the key steps involved in the NAOP approach used for precise template calculation for solving both linear and nonlinear differential equations.

After extensive iterative computations (‘training’ steps), and once the training process has converged, the NAOP system outputs CNN templates appropriate for solving the corresponding ODEs (see (2)).

The global process of deriving the CNN templates (i.e. NAOP) can be summarized as follows. The learning/training process is based on a mapping between the two inputs of the NAOP procedure. Convergence to a local minimum is the key objective governing this template calculation process. To achieve it, various basins of attraction are investigated sequentially, and the corresponding CNN templates are determined for the various initial conditions. If some local attractors diverge from a local minimum, new sets of initial conditions are automatically generated to suppress the divergence, leading to a possible convergence to a local minimum. A large number of randomly generated attractors (either regular or chaotic) are obtained through numerical simulations, whereby each attractor corresponds to a specific set of CNN templates. An attempt to map these attractors to those generated by the model under investigation is performed sequentially, and the process converges to a local minimum when the mapping succeeds. It is worth mentioning, however, that our numerical simulations have revealed that it is very difficult to find the optimal solution (i.e. the local minimum) during the training process. This difficulty can be explained by the well-known inherent local minimum problem of the Hopfield neural network [8]-[9]. To overcome this problem, various basins of attraction are generated within the NAOP process, and this generation is conducted sequentially until the internal dynamics of the global network of coupled oscillators converges to stable states. This convergence must be achieved in both the CNN templates and the attractors, all of which are considered dynamic variables during the learning/training process. It is further worth mentioning that NAOP is at its core an adaptive training process that is closely comparable to the concept developed for training Hopfield neural networks towards an efficient tracking of local minima. Nevertheless, NAOP has been demonstrated capable of mapping all known types of ODE nonlinearity onto appropriate templates of a first-order CNN processor matrix.
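The idea of probing several basins of attraction in sequence and keeping the best local minimum can be sketched with a generic multi-start search. This is not the authors' NAOP procedure: it is a simplified stand-in that assumes a linear two-cell network (no output nonlinearity, inputs, or bias), the harmonic oscillator y'' + y = 0 as the model under investigation, and a random-perturbation descent. All function names and numerical settings are hypothetical.

```python
import random


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for d(state)/dt = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def trajectory(f, start, dt=0.05, steps=100):
    """Sampled wave solution of d(state)/dt = f(state) from `start`."""
    states = [list(start)]
    for _ in range(steps):
        states.append(rk4_step(f, states[-1], dt))
    return states


def target_rhs(s):
    """Second input: the model under investigation, y'' + y = 0."""
    return [s[1], -s[0]]


START = [1.0, 0.0]
TARGET = trajectory(target_rhs, START)


def cost(a_hat):
    """Squared mismatch between network and target wave solutions."""
    def network_rhs(x):
        # Linear two-cell case of (1): dx_i/dt = -x_i + sum_j a_hat[i][j]*x_j
        return [-x[i] + sum(a_hat[i][j] * x[j] for j in range(2))
                for i in range(2)]
    states = trajectory(network_rhs, START)
    return sum((p - q) ** 2
               for s, t in zip(states, TARGET) for p, q in zip(s, t))


def multistart_fit(restarts=4, iterations=200, step=0.2, seed=0):
    """Return the best templates and cost found over several restarts."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(restarts):          # each restart probes another basin
        cand = [[rng.uniform(-2, 2) for _ in range(2)] for _ in range(2)]
        cand_cost = cost(cand)
        for _ in range(iterations):    # local descent by random perturbation
            trial = [[v + rng.gauss(0, step) for v in row] for row in cand]
            trial_cost = cost(trial)
            if trial_cost < cand_cost:
                cand, cand_cost = trial, trial_cost
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
    return best, best_cost
```

For this toy target the mismatch vanishes for the templates [[1, 1], [-1, 1]], since -x_i plus the weighted sum then reproduces x_1' = x_2 and x_2' = -x_1; a search of this kind only approaches that point and offers no convergence guarantee.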

Figure 2a. Convergence of the state-control CNN templates (Â11, Â12, Â21, Â22) achieved by the NAOP process for the system parameters ε = 0.25 and ω = 1.

III. APPLICATIONS TO SOLVING STIFF ODES<br />

We restrict our analysis to the van der Pol oscillator, a well-known prototype of a self-sustained oscillator that can generate sinusoidal, quasi-periodic, and relaxation oscillations (see (3)):

$$\frac{d^2 x}{dt^2} - \varepsilon\left(1 - x^2\right)\frac{dx}{dt} + \omega x = 0 \tag{3}$$

Two possible states can be generated by (3). The first is the sinusoidal or almost sinusoidal state (ε ≪ 1); the second is the relaxation state (ε ≫ 1). We now want to solve (3) using the CNN paradigm. We consider the case ε = 0.25 and ω = 1. For these parameter values, the NAOP concept has been used to calculate the corresponding templates after convergence of the training process. This convergence is illustrated by the plots in Figs. 2a and 2b, which show the temporal evolution of the state-control templates Âij (Fig. 2a) and of the feedback templates Aij (Fig. 2b). As these figures show, convergence is achieved after a long transient phase of the global training network. It is worth mentioning that the convergence of the process is achieved for suitable basins of attraction. From Fig. 2, one can read off the following CNN templates, which are then used to solve the van der Pol equation:

Â11 = 1.0770, Â12 = −0.6300, Â21 = 1.3450, Â22 = 0.5850,

A11 = 0.4473, A12 = −0.2586, A21 = 0.4846, A22 = 0.1310.

Figure 2b. Convergence of the feedback templates (A11, A12, A21, A22) achieved by the NAOP process for the system parameters ε = 0.25 and ω = 1.

This set of template values has been inserted in the platform of Fig. 3 to obtain the solution of (3) through the CNN paradigm. Fig. 3 is a general SIMULINK representation of a CNN processor platform for solving second-order nonlinear ordinary differential equations. The key contribution of our approach, which is a breakthrough, is that we are now capable of mapping any type of nonlinearity displayed by nonlinear coupled and uncoupled ODEs onto the type of nonlinearity displayed by the elementary first-order CNN cell model. As proof of concept, we have used the CNN templates derived by this scheme to obtain the exact solutions of (3). The graphical representation of the CNN processors for second-order ODEs presented in Fig. 3 has been used for rapid prototyping purposes (a hardware implementation on DSP, FPGA, or GPU platforms is then straightforward). A direct numerical simulation of the same equation (3) has also been performed using MATLAB, and a comparison between the two results is shown in Fig. 4. As Figs. 4a and 4c show, the solution of (3) obtained by the CNN-based approach developed in this paper is in very good agreement with the direct numerical solution of (3) in MATLAB (Figs. 4b and 4d): both exhibit the same amplitude and the same frequency of oscillation.
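The direct numerical simulation that serves as the reference can be reproduced in outline with a classical fourth-order Runge-Kutta integrator. The sketch below is only a stand-in for the MATLAB run, whose solver and step size the text does not specify: it fixes ω = 1, so that the restoring term is simply x, and uses a hypothetical initial condition x(0) = 0.1, dx/dt(0) = 0. All names are illustrative.

```python
def vdp_rhs(x, v, eps):
    """First-order form of (3) for omega = 1: x' = v, v' = eps*(1 - x^2)*v - x."""
    return v, eps * (1.0 - x * x) * v - x


def simulate_vdp(eps=0.25, x0=0.1, v0=0.0, dt=0.01, t_end=200.0):
    """Integrate with classical RK4 and return the list of x samples."""
    x, v = x0, v0
    xs = [x]
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = vdp_rhs(x, v, eps)
        k2x, k2v = vdp_rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, eps)
        k3x, k3v = vdp_rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, eps)
        k4x, k4v = vdp_rhs(x + dt * k3x, v + dt * k3v, eps)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        xs.append(x)
    return xs


def steady_amplitude(xs, tail_fraction=0.25):
    """Peak |x| over the last part of the run, after the transient."""
    tail = xs[int(len(xs) * (1.0 - tail_fraction)):]
    return max(abs(value) for value in tail)


if __name__ == "__main__":
    # The two parameter sets shown in Fig. 4 (eps = 0.25 and eps = 1).
    for eps in (0.25, 1.0):
        print(eps, round(steady_amplitude(simulate_vdp(eps=eps)), 3))
```

For both parameter values the steady-state amplitude settles close to 2, the well-known limit-cycle amplitude of the van der Pol oscillator, which offers a simple cross-check on the waveforms being compared.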

Figure 3. SIMULINK graphical representation of the CNN computing platform to solve (3).

The method proposed in this paper addresses a challenging problem: it demonstrates a systematic and straightforward way to solve nonlinear ordinary differential equations with the CNN paradigm. The key challenge has been to establish that any type of nonlinearity can be mapped onto the nonlinearity displayed by the elementary CNN cell, and then to find an appropriate algorithm for doing so. The approach developed in this work is therefore very flexible, as it can be applied to solve different types of nonlinear and stiff ODEs. The NAOP-based template calculation scheme has also been applied successfully to the Rayleigh, Lorenz, and Rössler equations, and the corresponding CNN templates have been derived (due to space constraints we cannot present all these results in this paper). One interesting issue under investigation is the development of a library of CNN template sets for solving the most common nonlinear and stiff ODEs, including those cited above.

The next section addresses the extension of the approach developed in this paper to solving nonlinear and stiff PDEs. It will be shown that a discretization process can transform PDEs into sets of coupled or uncoupled nonlinear ODEs, which makes them solvable by the CNN paradigm with the scheme developed in this paper.

Figure 4a. Waveform solution of (3) obtained by our new approach based on the CNN paradigm for ε = 0.25 and ω = 1.

Figure 4b. Waveform solution of (3) obtained through direct numerical simulation of (3) in MATLAB for ε = 0.25 and ω = 1.

Figure 4c. Waveform solution of (3) obtained by our new approach based on the CNN paradigm for ε = 1 and ω = 1 (panels: CNN waveform, CNN phase portrait).

Figure 4d. Waveform solution of (3) obtained through direct numerical simulation of (3) in MATLAB for ε = 1 and ω = 1 (panels: MATLAB waveform, MATLAB phase portrait).

IV. EXTENSION OF THE NAOP SCHEME TO SOLVING STIFF PARTIAL DIFFERENTIAL EQUATIONS

This section explains how the approach developed in this paper can be extended to solving PDEs. Unlike the traditional approach to solving stiff PDEs through CNN, which takes into consideration only the linear terms of the Taylor series expansion, we include the higher-order derivative terms of the Taylor series expansion of any given PDE in order to improve the accuracy of the solutions obtained. For illustration, we consider Burgers' equation (4), a well-known prototype of partial differential equations with multiple potential applications in the field of transportation.

$$\frac{\partial u}{\partial t} = \frac{1}{R}\,\frac{\partial^2 u}{\partial x^2} - u\,\frac{\partial u}{\partial x} \tag{4}$$

In order to solve (4) by the CNN-paradigm, applying an expansion (at the first order) based on the Taylor’s series does lead to the following equivalent form of (4):

$$\frac{du_i}{dt} = \frac{1}{R}\,\frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} - u_i\left[\frac{u_{i+1} - u_{i-1}}{2h}\right] \tag{5}$$
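To make the structure of (5) concrete, the following Python sketch evaluates its right-hand side on a uniform grid and advances it in time with a classical Runge-Kutta step. The boundary treatment (end values held fixed at zero), the initial profile sin(πx) on [0, 1], and the values of R, the grid size, and the time step are assumptions chosen for illustration; none of them is specified in the text, and the function names are hypothetical.

```python
import math


def burgers_rhs(u, R, h):
    """Right-hand side of (5) at interior nodes; end values are held fixed."""
    du = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        diffusion = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / (R * h * h)
        convection = u[i] * (u[i + 1] - u[i - 1]) / (2.0 * h)
        du[i] = diffusion - convection
    return du


def rk4_step(u, dt, R, h):
    """One classical Runge-Kutta step for the coupled ODE set (5)."""
    def shifted(k, factor):
        return [ui + factor * dt * ki for ui, ki in zip(u, k)]
    k1 = burgers_rhs(u, R, h)
    k2 = burgers_rhs(shifted(k1, 0.5), R, h)
    k3 = burgers_rhs(shifted(k2, 0.5), R, h)
    k4 = burgers_rhs(shifted(k3, 1.0), R, h)
    return [ui + dt * (a + 2 * b + 2 * c + d) / 6.0
            for ui, a, b, c, d in zip(u, k1, k2, k3, k4)]


def solve_burgers(n=41, R=10.0, dt=0.001, steps=500):
    """Integrate (5) from u(x, 0) = sin(pi * x) with u = 0 at both ends."""
    h = 1.0 / (n - 1)
    u = [math.sin(math.pi * i * h) for i in range(n)]
    u[-1] = 0.0  # remove the rounding residue of sin(pi)
    for _ in range(steps):
        u = rk4_step(u, dt, R, h)
    return u
```

Each entry of the list `u` plays the role of one coupled ODE variable u_i, and the product term in `convection` is the quadratic nonlinearity referred to in the text.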

One can see that (5) is a set of first-order coupled nonlinear ODEs. The discretization has resulted in a set of coupled ODEs with quadratic nonlinear terms (i.e. of a type similar to the Lorenz or Rössler systems). This type of nonlinearity is solvable by the NAOP approach developed in the preceding sections, since we could already solve more complex types of nonlinearity (e.g. the nonlinearity of the van der Pol equation). As discussed in Section 1, many published works have truncated the Taylor series to its linear terms only, reluctantly, since according to the literature there has so far been no way to deal with the increased complexity and the nonlinearity that appear otherwise. The results produced by a linear approximation are necessarily less precise. To increase precision, the higher-order (in this case second-order) derivative terms can be retained when the Taylor series expansion is applied to (4), which leads to the result presented in (6):

$$\frac{du_i}{dt} = \frac{1}{R}\left[\frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} - \frac{u_{i+1} - 3u_i + 3u_{i-1} - u_{i-2}}{2h^2} - \dots\right] - u_i\left[\frac{u_{i+1} - u_i}{h} - \frac{u_{i+1} - 2u_i + u_{i-1}}{2h} - \dots\right] \tag{6}$$
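The two terms shown in the second bracket of (6) can be traced back to the Taylor expansion in a few lines. The following worked step assumes a uniform grid spacing h and writes u'_i and u''_i for the first and second spatial derivatives at node i (a notation introduced here for illustration only):

```latex
\begin{align*}
u_{i+1} &= u_i + h\,u'_i + \frac{h^2}{2}\,u''_i + \mathcal{O}(h^3) \\
\Rightarrow\quad u'_i &= \frac{u_{i+1}-u_i}{h} - \frac{h}{2}\,u''_i + \mathcal{O}(h^2), \\
u''_i &\approx \frac{u_{i+1}-2u_i+u_{i-1}}{h^2}
\quad\Rightarrow\quad
u'_i \approx \frac{u_{i+1}-u_i}{h} - \frac{u_{i+1}-2u_i+u_{i-1}}{2h}
      = \frac{u_{i+1}-u_{i-1}}{2h}.
\end{align*}
```

Under these assumptions the two terms shown combine into the central difference that already appears in (5), so any additional accuracy in the convection term would come from the further terms indicated by the dots.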

Considering (6), it becomes clear that the NAOP scheme developed in this paper is a strong candidate for a straightforward derivation of the appropriate CNN templates to solve (6).
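The gain in precision from keeping one more Taylor term can be checked numerically on a function with a known derivative. The sketch below is an illustration rather than part of the original study: it compares the plain forward difference with the forward difference corrected by the second-difference term, using sin(x) at x = 1 as a hypothetical test case.

```python
import math


def forward_difference(f, x, h):
    """Estimate of f'(x) that keeps only the linear Taylor term."""
    return (f(x + h) - f(x)) / h


def corrected_difference(f, x, h):
    """Forward difference minus the second-difference correction term."""
    second = f(x + h) - 2.0 * f(x) + f(x - h)
    return forward_difference(f, x, h) - second / (2.0 * h)


def errors(h, x=1.0):
    """Absolute errors of both estimates for f = sin, whose derivative is cos."""
    exact = math.cos(x)
    return (abs(forward_difference(math.sin, x, h) - exact),
            abs(corrected_difference(math.sin, x, h) - exact))
```

Calling `errors(0.1)` and `errors(0.05)` shows the corrected estimate to be markedly closer to cos(1) than the plain forward difference, with the gap widening as h shrinks.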

NAOP is thus also applicable to solving PDEs. The PDE must first be transformed into a set of coupled nonlinear ODEs; in this process, even the nonlinear terms of the Taylor series expansion can be kept. NAOP is then used to determine appropriate templates for solving the resulting sets of generally coupled nonlinear ODEs.

V. CONCLUDING REMARKS<br />

We have proposed and validated a theoretical concept based on the CNN paradigm for ultra-fast, potentially low-cost, and high-precision computing of stiff ODEs and PDEs. Since we can solve these through CNN independently of the actual nonlinearity, we have reached a clear breakthrough that has the potential to enable a truly real-time Computational Engineering.

The main benefit of solving ODEs and PDEs using CNN is the flexibility offered by NAOP to extract the CNN parameters (templates) with which CNN can solve any type of ODE or PDE. Another strong point of the CNN paradigm is the resulting ultra-fast processing, which depends on the CNN implementation: DSP, FPGA, GPU, or CNN chip. One key objective of this work has been to advance the relevant state of the art by proposing a novel framework for solving stiff ODEs and PDEs with high precision. To achieve this goal, we have proposed the Nonlinear Adaptive Optimization (NAOP) technique and demonstrated that it is an efficient scheme for obtaining solutions of any ODE or PDE. NAOP is a learning/training method for mapping the wave solutions of the model describing the dynamics of a CNN network to those of a given model (ODE). Taking just these two inputs, the learning process converges to a local minimum where the complete mapping of the two models is achieved and the CNN templates are produced.

Using the same technique, we have proposed a high-precision computing of stiff PDEs that accounts even for nonlinear terms (i.e. higher-order terms) of the Taylor series expansion used to transform the PDE into a set of coupled nonlinear ODEs. To address the speed of computation, an implementation of the concept developed in this work on FPGA, DSP, or GPU is possible and straightforward.


REFERENCES<br />

[1] Leon O. Chua and Lin Yang, “Cellular Neural Networks: Theory,” IEEE Transactions on Circuits and Systems, vol. 35, no. 10, October 1988.

[2] Milka Uzunova, Daniel Jolly, Emil Nikolov, and Kamel Boumediene, “The Macroscopic LWR Model of the Transport Equation Viewed as a Distributed Parameter System,” Proceedings of the 5th International Conference on Soft Computing as Transdisciplinary Science and Technology, pp. 572-576, October 2008.

[3] Song Chun Zhu and David Mumford, “Gibbs Reaction and Diffusion Equations,” Proceedings of the 6th International Conference on Computer Vision, pp. 847, January 1998.

[4] Tamer A. Abassy, Magdy A. El-Tawil, and H. El-Zoheiry, “Exact Solutions of Some Nonlinear Partial Differential Equations Using the Variational Iteration Method Linked With Laplace Transforms and the Pade Technique,” Computers and Mathematics with Applications, vol. 54, pp. 940-954, October 2007.

[5] N. H. Sweilam, “Variational Iteration Method for Solving Cubic Nonlinear Schrodinger Equation,” Journal of Computational and Applied Mathematics, vol. 207, pp. 155-163, October 2007.

[6] Michael Striebel, Andreas Bartel, and Michael Gunther, “A Multirate ROW-scheme for Index-1 Network Equations,” Applied Numerical Mathematics, vol. 59, pp. 800-814, March 2009.

[7] Tamas Roska, Leon O. Chua, Dietrich Wolf, Tibor Kozek, Ronald Tetzlaff, and Frank Puffer, “Simulating Nonlinear Waves and Partial Differential Equations via CNN-Part I: Basic Techniques,” IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications, vol. 42, no. 10, October 1995.

[8] J. J. Hopfield and D. W. Tank, “Neural computation of decisions in optimization problems,” Biological Cybernetics, vol. 52, pp. 141-152, 1985.

[9] K. A. Smith, “Neural networks for combinatorial optimization: A review of more than a decade of research,” INFORMS Journal on Computing, vol. 11, no. 1, pp. 15-34, 1999.

[10] C. Del Negro, L. Fortuna, and A. Vicari, “Modelling Lava Flows by Cellular Nonlinear Networks (CNN): Preliminary Results,” Nonlinear Processes in Geophysics, vol. 12, pp. 505-513, 2005.

Figure 5. Global architecture of the computing platform planned to enable real-time Computational Engineering. Diverse users may access the CNN processor platforms remotely through the Internet. (Diagram labels: Users; API; Internet connection; Central Server; Middleware; GPIO; Platform bus; Multi-channel memory controller; DSP, FPGA, GPU, and PowerPC clusters, each with memory; Shared memory; Hyper-Computer; Emulated analog computing or CNN processors; CNN implementation; HPC.)

[11] I. Krstic, D. K<strong>and</strong>ic, <strong>and</strong> B. Reljin, “Cellular Neural Networks-<br />

An Analogous Model for Stress Analysis of Prismatic Bars<br />

Subjected to Torsion,” FME transactions, vol. 31, pp. 7-14,<br />

2003.<br />

[12] J.-H. Niu, H.-Z. Wang, H.-X. Zhang, J.-Y. Yan, and Y.-S. Zhu,<br />

“Cellular Neural Network Analysis for Two-Dimensional<br />

Bioheat Transfer Equation,” Medical & Biological Engineering<br />

& Computing, vol. 39, pp. 601-604, 2001.<br />

[13] Tibor Kozek, Leon O.Chua, Tamas Roska, Dietrich Wolf,<br />

Ronald Tetzlaff, Frank Puffer, <strong>and</strong> Karoly Lotz, “Simulating<br />

Nonlinear Waves <strong>and</strong> Partial Differential Equations via CNN-<br />

Part II: Typical Examples,” IEEE transactions on circuits <strong>and</strong><br />

systems-I: Fundamental theory <strong>and</strong> applications, vol.42, No. 10,<br />

October 1995.<br />

[14] T. Kozek <strong>and</strong> T. Roska, “A Double Time-Scale CNN for<br />

Solving 2-D Navier-Stokes Equations,” CNNA-94 3 rd IEEE<br />

International Workshop on Cellular Neural Networks <strong>and</strong> their<br />

Applications, December 1994.<br />

[15] F. Puffer, R. Tetzlaff, and D. Wolf, “A Learning Algorithm for<br />

Cellular Neural Networks (CNN) Solving Nonlinear Partial<br />

Differential Equations,” ISSSE Proceedings, 1995.<br />

[16] P. Lucie Aarts, <strong>and</strong> P. van der Veer, “Neural Network Method<br />

for Solving Partial Differential Equations,” J. Neural Processing<br />

Letters, vol. 14, no. 3, pp. 261-271, December, 2001<br />

[17] I. G. Tsoulos, D. Gavrilis, <strong>and</strong> E. Glavas, “Solving differential<br />

equations with constructed neural networks,” J.<br />

Neurocomputing, vol. 72, pp. 2385-2391, June 2009.<br />

[18] L O Chua, M Hasler, G S Moschytz, J Neirynck, “Autonomous<br />

cellular neural networks: A unified paradigm for pattern<br />

formation <strong>and</strong> active wave propagation,” IEEE Transactions on<br />

Circuits & <strong>Systems</strong>-I, Fundamental Theory <strong>and</strong> Applications,<br />

vol. 42, no.10, October 1995.<br />

[19] F. Puffer , R. Tetzlaff , D. Wolf, “Modeling Nonlinear <strong>Systems</strong><br />

With Cellular Neural Networks”, IEEE Transactions on<br />

Acoustics, Speech, <strong>and</strong> Signal Processing ICASSP-96, vol. 6,<br />

pp. 3513-3516, 1996.<br />

[20] Josef A. Nossek, “Design <strong>and</strong> Learning With Cellular Neural<br />

Networks,” International Journal of Circuit Theory &<br />

Applications, vol. 24, pp. 15 – 24, 31 Dec 1998.<br />

Figure 6. Core idea of the server architecture intended for the CNN-based super-computing platform to enable real-time computational engineering. It is a detailed description of the central server given in Fig. 5. (Diagram blocks: job request, web server, task scheduler, bill manager, API and abstract layer, platform bus, synthesis tools, finite element, image processing, PDE, ODE.)<br />



Ky<strong>and</strong>oghere Kyamakya obtained<br />

the ‘Ir. Civil’ degree in Electrical<br />

Engineering in 1990 at the<br />

University of Kinshasa. In 1999 he<br />

received his Doctorate in Electrical<br />

Engineering at the University of<br />

Hagen in Germany. He then worked<br />

three years as post-doctorate<br />

researcher at the Leibniz University<br />

of Hannover in the field of Mobility<br />

Management in Wireless Networks. From 2002 to 2005 he<br />

was junior professor for Positioning Location Based<br />

Services at Leibniz University of Hannover. Since 2005 he<br />

has been full Professor for Transportation Informatics and Director<br />

of the Institute for Smart <strong>Systems</strong> Technologies at the<br />

University of Klagenfurt in Austria.<br />

Cyrille Kalenga Wa Ngoy obtained the ‘Ir. Civil’ degree in Electrical Engineering at the University of Kinshasa. He has been an Assistant at the same university, in the Department of Electrical and Computer Engineering, for about ten years.<br />

Michel Matalatala Tamasala obtained the ‘Ir. Civil’ degree in Electrical Engineering at the University of Kinshasa. He has been an Assistant at the same university, in the Department of Electrical and Computer Engineering, for about four years.<br />

Jean Chamberlain Chedjou<br />

received in 2004 his doctorate in<br />

Electrical Engineering at the<br />

Leibniz University of Hanover,<br />

Germany. He has been a DAAD<br />

(Germany) scholar <strong>and</strong> also an<br />

AUF research Fellow (Postdoc.).<br />

From 2000 to date he has been a<br />

Junior Associate researcher in the<br />

Condensed Matter section of the ICTP (Abdus Salam<br />

International Centre for Theoretical Physics) Trieste, Italy.<br />

Currently, he is a senior researcher at the Institute for Smart<br />

<strong>Systems</strong> Technologies of the Alpen-Adria University of<br />

Klagenfurt in Austria. His research interests include<br />

Electronics Circuits Engineering, Chaos Theory, Analog<br />

<strong>Systems</strong> Simulation, Cellular Neural Networks, Nonlinear<br />

Dynamics, Synchronization <strong>and</strong> related Applications in<br />

Engineering. He has authored and co-authored 3 books and<br />

more than 40 journal and conference papers.


Performance Comparison of OFDM and Single Carrier Modulations over Satellite Channels<br />

Yuri Labrador, Masoumeh Karimi, Niki Pissinou, and Deng Pan<br />

Abstract: This paper explores the feasibility of using OFDM over satellite channels with high order modulation techniques such as QPSK and 16QAM, combined with strong error correction algorithms. A performance comparison between currently used single-carrier techniques and OFDM is also presented.<br />

Index Terms: Satellite, OFDM, QPSK, 16QAM.<br />

I. INTRODUCTION<br />

Digital modulation over satellite has become, in the past five years, the main transmission technique for television and video, because digital modulation combined with video compression uses the satellite bandwidth more efficiently. The single-carrier modulation currently used performs very well over satellite channels in fixed-receiver environments. With mobile users, however, the channel exhibits multipath effects as well as Doppler shifts, and single-carrier modulation no longer performs as it does in fixed environments. Orthogonal Frequency Division Multiplexing (OFDM), on the other hand, performs much better in multipath and frequency-selective channels; it also uses the available spectrum very efficiently. This paper explores the feasibility of using OFDM over satellite channels with high order modulation techniques such as QPSK and 16QAM, and strong error correction algorithms. A performance comparison between currently used single-carrier techniques and OFDM is presented as well.<br />

II. SINGLE CARRIER MODULATION VERSUS OFDM<br />

Single-carrier modulation presents two main problems when used in frequency-selective channels [1]: (1) frequency-selective channels introduce intersymbol interference (ISI) at the receiver; and (2) equalization at the receiver may amplify noise at frequencies where the channel response is poor. As a result, single-carrier modulation suffers from high attenuation in some bands. Since a single carrier occupies the entire bandwidth, this problem can become very serious (see Figure 1).<br />

Manuscript received January 21, 2010. The authors are with Florida International University, Miami, FL (e-mails: {ylabr001, mkari001, pissinou, pand}@fiu.edu).<br />

To overcome this problem, the bandwidth must be divided into many small bands, with a carrier allocated to each one. The data stream is likewise divided into many parallel data streams, each modulating an individual carrier. The signals are then added together and transmitted. Thus the entire bandwidth is used, but by many individual, narrower carriers, as shown in Figure 2.<br />
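The operation described above (splitting the data stream into parallel sub-streams, modulating one carrier per sub-stream, and summing the carriers) can be sketched as follows. This is only an illustrative sketch, not the simulation used in this paper: it assumes QPSK on every sub-carrier, an ideal noiseless channel, and a direct (slow) inverse DFT, and all function names are hypothetical.<br />

```python
import cmath
import math

def qpsk_map(bits):
    """Map pairs of bits to unit-energy QPSK symbols (Gray mapping assumed)."""
    scale = 1 / math.sqrt(2)
    return [complex(1 - 2 * bits[i], 1 - 2 * bits[i + 1]) * scale
            for i in range(0, len(bits), 2)]

def ofdm_modulate(symbols):
    """Sum M modulated carriers: x[n] = (1/M) * sum_k X_k * exp(j*2*pi*k*n/M)."""
    m = len(symbols)
    return [sum(symbols[k] * cmath.exp(2j * math.pi * k * n / m)
                for k in range(m)) / m
            for n in range(m)]

def ofdm_demodulate(samples):
    """Recover each X_k by correlating the samples with carrier k (forward DFT)."""
    m = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * math.pi * k * n / m)
                for n in range(m))
            for k in range(m)]

# 16 bits -> 8 QPSK symbols -> 8 parallel sub-carriers
bits = [0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0]
tx_symbols = qpsk_map(bits)
time_signal = ofdm_modulate(tx_symbols)
rx_symbols = ofdm_demodulate(time_signal)

# Over an ideal channel the recovered symbols match the transmitted ones
max_error = max(abs(a - b) for a, b in zip(tx_symbols, rx_symbols))
```

In a practical system the two transforms would be computed with an FFT, and a guard interval would normally be added; both are omitted here for brevity.<br />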

Fig. 1. Channel Response (H(jΩ) versus frequency f).<br />

Some advantages of OFDM are as follows [8]: (1) the available spectrum is divided into smaller sub-bands; (2) the data is divided at the transmitter, and each sub-stream modulates one sub-carrier; (3) the power and transmission rate in a band depend on the channel response in that band; and (4) there is practically no ISI, since in each narrow sub-band the channel response is almost flat [7] (see Figure 3).<br />

In general, an OFDM transmission can be represented as shown in Figure 4.<br />

Fig. 2. OFDM Principle of Operation.<br />


The power allocated to each sub-channel depends on the value of $H_i$, the channel response in that sub-channel. The number of bits to be transmitted on each sub-channel is then determined: the number of bits and the constellation for a sub-channel can be chosen based on the SNR in that particular sub-channel and the required probability of error.<br />
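One common way to turn a per-sub-channel SNR and a target error probability into a number of bits is the SNR-gap approximation for uncoded M-QAM. The sketch below uses that approximation as a stand-in, since the text does not state which rule was applied; the function names, the restriction to QPSK and 16QAM, and the example SNR values are all assumptions made for illustration.<br />

```python
import math

def snr_gap(target_ber):
    """SNR gap for uncoded M-QAM, from BER ~ 0.2*exp(-1.5*SNR/(M-1))."""
    return -math.log(5.0 * target_ber) / 1.5

def bits_per_subchannel(snr_db_list, target_ber=1e-3, max_bits=4):
    """Choose 0, 2 (QPSK) or 4 (16QAM) bits for each sub-channel."""
    gamma = snr_gap(target_ber)
    allocation = []
    for snr_db in snr_db_list:
        snr = 10.0 ** (snr_db / 10.0)
        bits = int(math.floor(math.log2(1.0 + snr / gamma)))
        bits -= bits % 2                      # keep square constellations only
        allocation.append(max(0, min(max_bits, bits)))
    return allocation

# Hypothetical sub-channel SNRs in dB: poor, moderate, good
example_allocation = bits_per_subchannel([0.0, 15.0, 25.0])
```

With these illustrative inputs, the poor sub-channel carries no data, the moderate one carries QPSK, and the good one carries 16QAM.<br />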

Fig. 3. Orthogonal Carriers (vertical axis: amplitude).<br />

The signal spectrum of a single carrier <strong>and</strong> an OFDM<br />

modulation differ in two main characteristics (Figure 5).<br />

1) Single carrier shows one main frequency which is<br />

modulated using some digital scheme such as QPSK or<br />

8PSK.<br />

2) OFDM is composed of a series of carriers individually<br />

modulated; these carriers are orthogonal with respect to<br />

each other.<br />
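The orthogonality of the OFDM carriers can be checked numerically. The sketch below assumes complex exponential carriers spaced by exactly one cycle per OFDM symbol (the usual construction, not spelled out in the text) and a hypothetical symbol length of 64 samples.<br />

```python
import cmath
import math

def carrier_inner_product(k, m, n_samples):
    """Inner product of carriers k and m over one OFDM symbol of n_samples."""
    return sum(cmath.exp(2j * math.pi * k * n / n_samples) *
               cmath.exp(-2j * math.pi * m * n / n_samples)
               for n in range(n_samples))

same_carrier = abs(carrier_inner_product(3, 3, 64))       # equals the symbol length
different_carriers = abs(carrier_inner_product(3, 5, 64)) # vanishes up to rounding
```

Because the inner product of two different carriers vanishes, each sub-carrier can be demodulated without interference from the others even though their spectra overlap.<br />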

III. SATELLITE CHANNEL MODELS<br />

A fixed-receiver satellite channel is modeled for practical applications as an Additive White Gaussian Noise (AWGN) channel with a path loss block that accounts for the distance between the satellite and the receiver antenna and for the operating frequency. This produces a path loss attenuation that varies depending on the type of satellite used. For geostationary satellites this attenuation in C-band is on the order of 200 dB. These parameters, when simulated in MATLAB, give a very close representation of real-life scenarios in terms of Bit Error Rate (BER) and Signal-to-Noise Ratio (SNR). Several simulation runs and real-life tests have been performed to demonstrate that the channel model is a correct approximation of real-life events [10].<br />

When dealing with mobile receivers, a more complex model needs to be considered.<br />

Propagation in satellite channels is susceptible to weather impairments, especially at higher frequencies [2], [3]. Average rain and shadowing may completely disrupt the communication link. A mobile satellite channel model that takes into consideration both potential weather impairments and the multipath fading phenomenon is therefore necessary to represent the satellite channel. Some models have been proposed, but they consider only one of the two: either the multipath effects or the weather effects. The propagation effects present in a mobile satellite link include those related to the troposphere (rain, etc.) and those caused by the receiver’s environment (multipath). The tropospheric effects are denoted by $\alpha$, and the environmental effects are denoted by $\beta$. The two effects are assumed to be statistically independent because their underlying mechanisms are independent. The amplitude of the received signal can be described as:<br />

$$A = \alpha \cdot \beta \qquad (1)$$<br />

The satellite channel model has two states: a good (non-shadowing) state and a bad (shadowing) state [4], [5]. This two-state model forms a Markov model. In the non-shadowing state, the received signal amplitude follows a Rician distribution:<br />

$$p_{\text{non-shadowing}}(A) = 2KA \exp\left[-K\left(A^{2}+1\right)\right] I_{0}(2KA) \qquad (2)$$<br />

where $K$ is the Rice factor and $I_0$ is the modified Bessel function of the first kind of order zero.<br />
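As a quick sanity check on equation (2), the sketch below evaluates the Rician density numerically and confirms that it integrates to one for several Rice factors. The series implementation of the Bessel function, the integration limits, and the function names are choices made for this illustration only.<br />

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order zero (power series)."""
    term, total, m = 1.0, 1.0, 0
    while term > 1e-17 * total:
        m += 1
        term *= (x / 2.0) ** 2 / (m * m)
        total += term
    return total

def rician_pdf(a, k):
    """Equation (2): amplitude density in the non-shadowing state."""
    return 2.0 * k * a * math.exp(-k * (a * a + 1.0)) * bessel_i0(2.0 * k * a)

def integrate(f, lo, hi, steps=6000):
    """Trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    inner = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * f(lo) + inner + 0.5 * f(hi))

# Area under the density for Rice factors K = 1..5 (upper limit 6 is an assumption)
areas = {k: integrate(lambda a, k=k: rician_pdf(a, k), 0.0, 6.0)
         for k in (1, 2, 3, 4, 5)}
```

Larger values of K concentrate the density around A = 1, that is, around the line-of-sight component.<br />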

In the shadowing state, where no line-of-sight (LOS) path exists, the channel is described by Rayleigh multipath fading. The signal at the receiver is expressed as:<br />

$$p_{\text{shadowing}}(A \mid s_0) = \frac{2A}{s_0} \exp\left(-\frac{A^{2}}{s_0}\right) \qquad (3)$$<br />

Fig. 4. OFDM Transmitter and Receiver (parallel symbols X_0 ... X_{M-1}, transmitted samples x[n], channel h[n], received samples y[n], and per-sub-carrier equalizers 1/H_0 ... 1/H_{M-1}).<br />



Of all potential weather impairments, rain is the most<br />

critical, especially in tropical weather, where rainfall can be<br />

severe. The long-term statistics of potential rainfall can be<br />

described by a lognormal equation:<br />

$$P_L(L) = \frac{1}{\sigma_d L \sqrt{2\pi}} \exp\left[-\frac{(\ln L - m_d)^{2}}{2\sigma_d^{2}}\right], \quad L \ge 0 \qquad (4)$$<br />
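Equation (4) can be evaluated directly. In the sketch below, $m_d$ and $\sigma_d$ are the mean and standard deviation of $\ln L$, as in any lognormal law; the numerical values chosen for them are hypothetical and serve only to illustrate the shape of the density.<br />

```python
import math

def rain_attenuation_pdf(l, m_d, sigma_d):
    """Equation (4): lognormal density of the rain attenuation L."""
    if l <= 0.0:
        return 0.0
    z = (math.log(l) - m_d) / sigma_d
    return math.exp(-0.5 * z * z) / (sigma_d * l * math.sqrt(2.0 * math.pi))

M_D, SIGMA_D = 0.5, 0.8      # hypothetical parameters, for illustration only

# Simple Riemann sum over L in (0, 200] to confirm the density is normalized
step = 0.005
area = sum(rain_attenuation_pdf(i * step, M_D, SIGMA_D)
           for i in range(1, 40001)) * step

# For a lognormal law the median attenuation is exp(m_d)
median_attenuation = math.exp(M_D)
```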

Studies comparing rain attenuation in fixed and mobile systems show that the probability distribution of the envelope at a mobile receiver can be described by the one used for the fixed system, multiplied by a factor that varies between 0.5 and 2.0 and is independent of the rain attenuation [6]. Figure 6 shows the probability density function versus the amplitude.<br />

Figure 7 shows measurements from real C-band transponders, illustrating the effect of rain on the on-board solid-state amplifier current.<br />

IV. SIMULATIONS<br />

MATLAB is used in this paper to simulate these types of modulation techniques. The simulation includes constellations for QPSK, 8PSK and 16QAM, and signal spectra for both single-carrier and OFDM techniques. We decided to include 16QAM in order to show a type of digital modulation that varies both amplitude and phase, in contrast to QPSK and 8PSK, which modulate only the phase of the carrier signal [9].<br />

Fig. 5. Single carrier vs. OFDM spectrum<br />

Fig. 6. Probability Density Functions for Non-shadowing State (vertical axis: probability density).<br />

For each simulation we created five blocks:<br />

1. Ground station block that includes:<br />
a) Random Digital Source that generates digital pulses.<br />
b) Error correction blocks for a code rate of 3/4.<br />
c) QPSK, 8PSK or 16QAM Modulator that performs the actual modulation.<br />
d) OFDM modulator (for the OFDM simulations).<br />
e) Raised Cosine Transmit Filter.<br />
f) High Power Amplifier.<br />
g) Transmitting Antenna.<br />
2. Uplink Path block that includes:<br />
a) Free Space Path Loss: this block simulates the uplink free space attenuation due to frequency and distance from the uplink site to the satellite. The uplink frequency is 6245 MHz and the distance 35600 km, giving a total attenuation of 199 dB.<br />
b) Phase/Frequency offset.<br />
3. Satellite block that includes:<br />
a) Satellite receiving antenna.<br />
b) Satellite receiver system temperature.<br />
c) Phase Noise.<br />
d) I/Q Balance.<br />
e) Phase/Frequency offset.<br />
f) Power Amplifier.<br />
g) Satellite Transmitting Antenna.<br />
4. Downlink Path block that includes:<br />
a) Free Space Path Loss: this block simulates the downlink free space attenuation due to frequency and distance from the satellite to the receiving station. The downlink frequency is 4020 MHz and the distance 35600 km, giving a total attenuation of 196 dB.<br />
b) Phase/Frequency offset.<br />
c) Rician Multipath Fading Channel (for mobile receiver scenarios).<br />
5. Receiving Earth Station block that includes:<br />
a) Receiving Antenna.<br />
b) Receiver Noise Temperature.<br />
c) Phase Noise.<br />
d) I/Q Balance.<br />
e) Phase/Frequency Offset.<br />
f) Raised Cosine Receive Filter.<br />
g) OFDM demodulator (for OFDM simulations).<br />
h) QPSK, 8PSK or 16QAM Demodulator.<br />
i) Error correction decoder.<br />
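The free space path loss values quoted for the uplink and downlink blocks follow from the standard formula $20\log_{10}(4\pi d f / c)$. The short sketch below reproduces them; the function name is hypothetical, and the formula is the textbook free-space expression rather than code taken from the simulation itself.<br />

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def free_space_path_loss_db(distance_km, frequency_mhz):
    """Free-space path loss in dB: 20*log10(4*pi*d*f/c)."""
    distance_m = distance_km * 1e3
    frequency_hz = frequency_mhz * 1e6
    return 20.0 * math.log10(4.0 * math.pi * distance_m * frequency_hz
                             / SPEED_OF_LIGHT)

uplink_loss_db = free_space_path_loss_db(35600, 6245)    # uplink at 6245 MHz
downlink_loss_db = free_space_path_loss_db(35600, 4020)  # downlink at 4020 MHz
```

Rounded to the nearest dB, the two results agree with the 199 dB and 196 dB figures used in the uplink and downlink path blocks.<br />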

Fig. 7. On board SSPA Current Attenuation in the Satellite Transponder.<br />

The simulation allows several parameters to be varied for different scenarios, such as the TX and RX antenna diameters and gains; in this way we can see how the received spectrum is affected when the antenna sizes are changed.<br />

Other parameters of interest that can be changed and that affect the received signal are: HPA gain; uplink and downlink frequencies, and thus the uplink and downlink free space attenuation; noise temperatures, originally set to a typical value of 290 K; phase noise; phase correction; Doppler error; AGC type; phase and frequency offsets; and the order of the error correction algorithms used. Note that the simulations include several spectrum monitors and constellation displays that can be moved to different parts of the diagram to check the shape of the spectrum at any point along the path.<br />

Figures 8 <strong>and</strong> 9 show the effects of channel models on the<br />

transmitted <strong>and</strong> received spectrum for both single carrier<br />

modulation <strong>and</strong> OFDM.<br />

Figure 10 shows the BER of the OFDM simulation for different values of the Rician factor K. Under this channel model the BER detection threshold of $10^{-3}$ is reached at $E_b/N_0 = 8$ dB. The bandwidth is 5 MHz.<br />

If Turbo codes are used, the BER values of the OFDM QPSK signal are those shown in Figure 11. The same Rician factors K were used.<br />


Fig. 8. Transmitted <strong>and</strong> Received Spectrums Single Carrier Modulation.<br />

Fig. 9. Transmitted and Received Spectrums OFDM Modulation (panels: Uplink Signal; HPA Effects on the OFDM Signal; horizontal axis: frequency, BW = 5 MHz).<br />

V. PERFORMANCE COMPARISON AND EXPERIMENTAL<br />

RESULTS SIMULATIONS<br />

Table I compares the data rates of the existing single-carrier satellite modulation techniques with those of a multi-carrier (OFDM) scheme using time diversity.<br />

The test aimed to produce a working version of the OFDM QPSK modulation system for performance verification, with the following objectives:<br />

1) Conduct a review of actual RF performance over a typical C-band transponder.<br />

2) Demonstrate the inherent robustness of the system in the presence of normal satellite transmission impairments.<br />

The test was conducted at the Univision Network Communications uplink facility in Miami, FL. The 70 MHz output of the modulator was fed into a Radyne upconverter with the +7 dBm output option, which in turn fed an MCL klystron C-band HPA.<br />

The uplink antenna was a 9.1 m Scientific Atlanta with a four-port feed, transmitting to transponder 16 on AMC-1 at 103 West. The downlink available for the test was a 3.1 m receiving antenna in Miami, equipped with standard digital-quality DRO-based LNBs. Typical noise temperatures of 25 to 35 K were noted.<br />

The modulator was set to a nominal output level of -8 dBm. The spectrum was very clean, with only low-level spurs at the IF frequency of 70 MHz. The resulting IF spectrum exhibited at least -30 dBc at ±2.5 MHz, indicating that very little RF power would be wasted in out-of-band transmissions absorbed by the transponder filters. The RF output of the upconverter was set to a nominal level of +1 dBm, well below its +7 dBm saturation level.<br />

The RF input to the HPA was set to a nominal level of -22 dBm. The HPA output was checked via a 57.1 dB coupler, and the spurious emissions were noted to be -55 dBc or lower. No special tuning of the klystron was performed, as it was deemed unnecessary for the purpose of the test. To determine the proper operating point for the service in the transponder, a series of RF level tests was performed using both CW and modulated carriers. The CW tests indicated that the -1 dB saturation point of the transponder SSPA was reached at a transmit level of 80 W, as measured at the HPA output coupler.<br />

An operating point 0.5 dB below the -1 dB saturation point was chosen as the nominal operating level, to maximize downlink performance without introducing significant distortion to the modulation. Saturating the transponder is to be avoided because it increases intersymbol interference in the demodulator within the receiver, causing a loss of RF margin. The local 3.1 m antenna was peaked on the satellite and was used as the reference antenna for the bulk of the tests, as it constituted a stress-test scenario for the system.<br />

With the OFDM carriers modulated, the system performance was measured over the full range from 10 dB output back-off (OBO) to saturation; the signal-to-noise ratio (SNR) displayed by the receiver increased nominally up to the 1 dB saturation point. This result indicates that if there was any increase in distortion of the OFDM signal, it was not discernible by the receiver. Further tests should be performed to determine the actual extent of such distortion, independent of the SNR readout.<br />

TABLE I<br />
PERFORMANCE OF OFDM TIME DIVERSITY IN SATELLITE CHANNELS.<br />
Modulation | Bandwidth (MHz) | FEC | Single carrier data rate, DVB-S scheme (Mbps) | Proposed OFDM time diversity scheme (Mbps)<br />
QPSK | 5 | 1/2 | 3.5 | 4.5<br />
QPSK | 5 | 3/4 | 5.3 | 6.75<br />
QPSK | 5 | 5/6 | 5.9 | 7.5<br />
16 QAM | 5 | 1/2 | n/a | 9.01<br />
16 QAM | 5 | 3/4 | 10.6 | 13.5<br />
16 QAM | 5 | 5/6 | 12.4 | 15.1<br />
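The rates in Table I can be compared directly. The sketch below simply restates the tabulated values and derives two quantities that the table does not list, the spectral efficiency of the OFDM scheme and its gain over the single-carrier scheme; the variable and function names are hypothetical.<br />

```python
# (modulation, FEC, single-carrier Mbps, OFDM Mbps) as listed in Table I
TABLE_I = [
    ("QPSK",  "1/2", 3.5,  4.5),
    ("QPSK",  "3/4", 5.3,  6.75),
    ("QPSK",  "5/6", 5.9,  7.5),
    ("16QAM", "1/2", None, 9.01),   # single-carrier rate listed as n/a
    ("16QAM", "3/4", 10.6, 13.5),
    ("16QAM", "5/6", 12.4, 15.1),
]
BANDWIDTH_MHZ = 5.0

def compare(rows, bandwidth_mhz):
    """Return (modulation, FEC, OFDM bit/s/Hz, gain in percent or None)."""
    results = []
    for modulation, fec, single_mbps, ofdm_mbps in rows:
        efficiency = ofdm_mbps / bandwidth_mhz
        gain = None if single_mbps is None else 100.0 * (ofdm_mbps / single_mbps - 1.0)
        results.append((modulation, fec, efficiency, gain))
    return results

results = compare(TABLE_I, BANDWIDTH_MHZ)
for modulation, fec, efficiency, gain in results:
    gain_text = "n/a" if gain is None else f"{gain:.1f}%"
    print(f"{modulation:6s} {fec}  {efficiency:.2f} bit/s/Hz  gain: {gain_text}")
```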


Fig. 10. BER Values QPSK OFDM Satellite Channel (Bit Error Rate versus Eb/N0 in dB; satellite HPA saturation level = 1 dB).<br />

The overall system performance is shown below:<br />

Parameter | 3.1 m | 3.6 m | 3.7 m | 5.0 m | 7.3 m<br />
SNR (dB) | 11.0 | 11.7 | 11.7 | 14.4 | 18.0<br />
Signal Level | 57 | 61 | 57 | 53 | 57<br />
Margin (3/4 FEC) (dB) | 2.0 | 2.7 | 2.7 | 5.4 | 9.0<br />
Margin (5/6 FEC) (dB) | 0.6 | 1.3 | 1.3 | 4.0 | 7.6<br />
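Assuming that the margin is defined as the measured SNR minus the demodulation threshold (a common convention, although the text does not state it explicitly), the threshold implied by each column of the performance table can be recovered by subtraction. The sketch below does this with the tabulated values; the names are hypothetical.<br />

```python
# Values copied from the system performance table, keyed by antenna diameter
SNR_DB = {"3.1m": 11.0, "3.6m": 11.7, "3.7m": 11.7, "5.0m": 14.4, "7.3m": 18.0}
MARGIN_DB = {
    "3/4": {"3.1m": 2.0, "3.6m": 2.7, "3.7m": 2.7, "5.0m": 5.4, "7.3m": 9.0},
    "5/6": {"3.1m": 0.6, "3.6m": 1.3, "3.7m": 1.3, "5.0m": 4.0, "7.3m": 7.6},
}

def implied_thresholds(snr_db, margin_db):
    """Threshold = SNR - margin, per FEC rate and antenna (assumed definition)."""
    return {fec: [round(snr_db[antenna] - margins[antenna], 1) for antenna in snr_db]
            for fec, margins in margin_db.items()}

thresholds = implied_thresholds(SNR_DB, MARGIN_DB)
```

Under this assumption every antenna size yields the same implied threshold for a given FEC rate (9.0 dB at 3/4 and 10.4 dB at 5/6), which is what one would expect if the tabulated margins were computed against a fixed threshold.<br />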

The system was tested at FEC rates of 3/4, 5/6 and 8/9 (not shown), with most of the testing performed at the 3/4 and 5/6 rates, as these were thought to be the most likely operational rates for the system.<br />

Fig. 11. BER Values QPSK OFDM Turbo Coded Satellite Channel (Bit Error Rate versus Eb/N0 in dB; rate-1/3 Turbo code; curves for K = 1 to 5, as in Fig. 10; HPA saturation level 1 dB).<br />

Some problems encountered:<br />

Transponder 16 operations were significantly affected by adjacent-satellite interference, both uplink and downlink, from a co-frequency, co-polarized analog video uplink on Galaxy 4, at 99 West, 4 degrees away. This is a perfectly legitimate interference situation, and is typical of the interference to be expected while operating on a C-band transponder in the middle of the dense cable neighborhood portion of the US domestic arc.<br />

The interference was mainly downlink-dominated on the<br />

smaller diameter receiving antennas. It is estimated that the<br />

interference contributed to a general 1.5 to 3 dB degradation<br />

of the system performance on the 3.1 m antenna. It was also<br />

noted that the SNR reading on the receiver monitoring the 3.1<br />

m antenna would occasionally fluctuate 0.1 to 0.2 dB,<br />

probably due to changes in the nature of the overall<br />

interference level.<br />

Using 3/4 FEC, the system performed with about 3 dB of margin for the worst-case 3.1 m antenna. A 3/4 FEC rate operating into any location within the 39 dBW contour, using a 3.1 m antenna or better, should have adequate margin.<br />

The following data show a comparison between OFDM QPSK and OFDM 16QAM and the available margins over threshold:<br />

Modulation (OFDM) | QPSK | QPSK | 16QAM | 16QAM<br />
Coding | RSV | RSV | RSTC | RSTC<br />
Transponder BW (MHz) | 5 | 5 | 5 | 5<br />
FEC | 3/4 | 5/6 | 3/4 | 5/6<br />
Total Data Rate (Mbps) | 6.75 | 7.5 | 13.5 | 15.1<br />
C/No Threshold (dB) | 6.9 | 7.9 | 9.0 | 10.4<br />

A similar analysis is performed with existing single-carrier modulation using QPSK and 8PSK.<br />

Modulation (Single Carrier) | QPSK | QPSK | 8PSK | 8PSK<br />
Coding | RSV | RSV | RSTC | RSTC<br />
Transponder BW (MHz) | 5 | 5 | 5 | 5<br />
FEC | 3/4 | 5/6 | 3/4 | 5/6<br />
Total Data Rate (Mbps) | 5.3 | 5.9 | 9.7 | 11.5<br />
C/No Threshold (dB) | 5.8 | 7.2 | 8.4 | 10.1<br />

The C/N threshold increases when moving from FEC 3/4 to 5/6, and also when the modulation order is higher. The OFDM QPSK 3/4 threshold is about 2 dB lower than the OFDM 16QAM 3/4 threshold (6.9 dB versus 9.0 dB). This agrees with theoretical analysis: with 16QAM the signal-to-noise ratio must be higher for the receiver to distinguish the constellation points, which are closer together than in QPSK.<br />

VI. CONCLUSION<br />

The single-carrier modulation currently used performs very well over satellite channels in fixed-receiver environments. With mobile users, however, the channel exhibits multipath effects as well as Doppler shifts, and single-carrier modulation no longer performs as it does in fixed environments. Orthogonal Frequency Division Multiplexing, on the other hand, performs much better in multipath and frequency-selective channels; it also uses the available spectrum very efficiently. This paper has explored the feasibility of using OFDM over satellite channels with high order modulation techniques such as QPSK and 16QAM, and strong error correction algorithms. Furthermore, a performance comparison between currently used single-carrier techniques and OFDM has been presented.<br />




Software Rejuvenation Technique: An Improvement in Applications with Multiple Versions

Zahra Rahmani Ghobadi 1, Hassan Rashidi 2

1. Zahra Rahmani Ghobadi is an MSc student in the Department of Computer Engineering, Qazvin Azad University, Qazvin, Iran (phone: 0989358224714; e-mail: m.rah62@gmail.com).

2. Hassan Rashidi is an Assistant Professor in the Department of Computer Engineering, Qazvin Azad University, Qazvin, Iran (phone: 0989126772017; e-mail: hrashi@gmail.com).

Abstract: With the expansion of software technology and modern applications, software reliability and availability have become serious concerns. Software fault tolerance techniques improve these properties. One such technique is software rejuvenation, which counteracts software aging. Software aging may lead to performance degradation, crash/hang failures, or both. In this paper, we address this technique for an application with one version and then extend the model to multiple versions. The numerical results show that using more software versions can greatly reduce expected downtime and improve the availability of the application.

Index Terms: Software rejuvenation, availability, reliability

I. INTRODUCTION

With the increasing complexity of computer systems, the loss caused by software failures is an increasingly widespread problem. One way to reduce this loss is to improve system reliability. At present, software fault tolerance is the most effective approach to the problem [1]. Traditional fault-tolerant techniques are passive and work reactively: they act only once the system has failed. Software rejuvenation, in contrast, is an active technique that prevents or delays system failures before they occur [1].

When software applications execute continuously for long periods of time (scientific and analytical applications run for days or weeks, and servers in client-server systems are expected to run forever), the processes corresponding to the software in execution age, that is, they slowly degrade with respect to the effective use of their system resources. The causes of process aging include memory leaks, unreleased file locks, file descriptor leaks and data corruption in the operating environment of system resources. Process aging affects the performance of the application and eventually causes the application to fail [2]. The software rejuvenation technique terminates the program when its performance has declined to a certain degree and then restarts it to clean its internal state, which restores the software's performance.

Huang et al. (1995) used a continuous-time Markov process to build a two-phase software rejuvenation model that includes a healthy state, a failure-probable (aging) state, a failure state and a rejuvenation state [8]. Using a Markov decision process, Pfening et al. (1996) proposed a software rejuvenation framework and applied it to an AT&T communication system. Garg et al. (1998) constructed a rejuvenation model of a transaction processing system based on queuing theory [7]. Dohi et al. (2000) set up a software rejuvenation model of a client/server system and used non-parametric statistical analysis to estimate the optimal software rejuvenation interval [8] [9]. For cluster systems, Garg et al. (1998) and Wei et al. (2004) presented a stochastic Petri net approach to analyzing software rejuvenation. Vaidyanathan et al. (2001) used stochastic reward nets to model and analyze a cluster system that employed software rejuvenation [10]. Bao et al. (2005) and Vaidyanathan and Trivedi (2005) took the system workload into account when building a model to estimate resource exhaustion times [5].

We extend the software rejuvenation model to multiple software versions. To improve the reliability of the application, a formula for the system availability is derived. Finally, numerical results are given to validate the proposed model.

II. SOFTWARE REJUVENATION<br />

Software rejuvenation is a proactive fault management technique that aims to clean up the internal state of the system in order to prevent more severe crash failures in the future. It involves occasionally terminating an application or a system, cleaning its internal state and restarting it [3]. The application is unavailable during rejuvenation. Although rejuvenation may sometimes increase the downtime of an application, this downtime is usually planned and scheduled. If care is taken to schedule rejuvenation during the idlest times of an application, the cost of this downtime is expected to be low. Downtime costs are the costs incurred due to the unavailability of the service while the application is down [2].

Let Pij(t) be the transition probability function of a continuous-time Markov process and qij the transition rate. The Kolmogorov forward equation is defined as follows:

$$\frac{dP_{ij}(t)}{dt} = \sum_{k=0}^{N} P_{ik}(t)\, q_{kj}, \qquad i, j = 0, 1, 2, \dots \tag{1}$$

Letting P(t) be the matrix of the transition probability functions Pij(t) (i, j = 0, 1, 2, …) and Q the matrix of the transition rates qij (i, j = 0, 1, 2, …), formula (1) can be expressed in matrix form as follows:

$$P'(t) = P(t)\, Q \tag{2}$$

A. Software Rejuvenation Model of One-Node Application<br />

First, we study the software rejuvenation model for an application with one software version, based on a Markov process, as shown in Fig. 1. The system has three states: the working state 0 (denoted S0), the failure state 1 (denoted SF) and the rejuvenation state 2 (denoted SR). Initially, the application is in the working state 0. As system performance degrades over time, a failure may occur. If a failure occurs before software rejuvenation is triggered, the application moves from the working state 0 to the failure state 1 and the system recovery operation starts immediately. Otherwise, the application moves from the working state 0 to the rejuvenation state 2, where software rejuvenation is carried out. After repair or rejuvenation completes, the application is as good as new and returns to the working state 0. We define one cycle as the time interval from the start of one working period to the start of the next.

According to the model described above, at any time t the application is in one of three states: up and available for service (the working state 0), recovering from a failure (the failure state 1), or undergoing software rejuvenation (the rejuvenation state 2). To describe the software rejuvenation model of a single-version application formally, we use a continuous-time Markov process Z = (Zt; t ≥ 0), where Zt represents the state of the application at time t. The transition probability function of Z is expressed as follows [6]:

$$P_{ij}(t) = P(Z_t = j \mid Z_0 = i), \qquad \forall i, j \in \Omega,\ t \ge 0 \tag{3}$$

where Ω = {0, 1, 2} is the state space.

For the software rejuvenation model in Fig. 1, λ1, µ1, r1 and R1 denote, respectively, the failure rate from the working state to the failure state, the rate at which software rejuvenation is triggered, the rejuvenation rate from the rejuvenation state to the working state, and the recovery rate from the failure state to the working state. Let Q be the transition rate matrix. From the state transition relationships of the single-version application, the transition rate matrix of the continuous-time Markov process Z is:

$$Q = \begin{pmatrix} -(\mu_1 + \lambda_1) & \lambda_1 & \mu_1 \\ R_1 & -R_1 & 0 \\ r_1 & 0 & -r_1 \end{pmatrix} \tag{4}$$

Let P(t) be the matrix of the transition probability functions Pij(t) (∀i, j ∈ Ω). According to the Kolmogorov forward equation (Eq. 1), the transition probability matrix P(t) satisfies:

$$P'(t) = P(t)\, Q, \qquad P(0) = I \tag{5}$$

where I is the identity matrix.

Let Pj, j ∈ Ω, be the steady-state probability that the single-version application is in state j. According to the limit distribution theorem, Pj is given by:

$$P_j = \lim_{t \to \infty} P_{ij}(t), \qquad \forall i, j \in \Omega \tag{6}$$

Substituting Eq. 4 and Eq. 6 into Eq. 5 gives the following equations:

$$\begin{aligned} -(\mu_1 + \lambda_1) P_0 + R_1 P_1 + r_1 P_2 &= 0 \\ -R_1 P_1 + \lambda_1 P_0 &= 0 \\ -r_1 P_2 + \mu_1 P_0 &= 0 \\ \sum_{i=0}^{2} P_i &= 1 \end{aligned} \tag{7}$$

where Pi, i = 0, 1, 2, is obtained by solving Eq. 7.

The application is available for service requests in the working state 0 and unavailable in the failure state 1 and the rejuvenation state 2. The system availability of the single-version application is therefore given by:

$$P_{A1} = P_0 \tag{8}$$
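As an illustration, Eq. 7 can be solved by hand: the second and third balance equations give P1 = λ1P0/R1 and P2 = µ1P0/r1, and the normalization condition then gives P0 = 1/(1 + λ1/R1 + µ1/r1). The Python sketch below evaluates this closed form. The function name is ours, and the numeric values are the demonstration values listed in Table I of Section III.

```python
def single_version_steady_state(lam, mu, r, R):
    """Steady-state probabilities (P0, P1, P2) of the three-state model in Eq. 7.

    lam: failure rate, mu: rejuvenation trigger rate,
    r: rejuvenation rate, R: recovery rate.
    """
    p0 = 1.0 / (1.0 + lam / R + mu / r)   # from the normalization sum(P) = 1
    p1 = lam * p0 / R                     # from -R*P1 + lam*P0 = 0
    p2 = mu * p0 / r                      # from -r*P2 + mu*P0 = 0
    return p0, p1, p2

# Demonstration values from Table I.
p0, p1, p2 = single_version_steady_state(lam=0.005, mu=0.002, r=1.0, R=0.1)
availability = p0                         # Eq. 8
print(f"availability = {availability:.4f}, unavailability = {p1 + p2:.4f}")
# prints "availability = 0.9506, unavailability = 0.0494"
```

With these values the denominator is 1 + 0.05 + 0.002 = 1.052, so most of the unavailability comes from the failure state, not from rejuvenation.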

B. Software Rejuvenation Model of Two-Node Application<br />

We extend the software rejuvenation model of a single application to a two-dimensional state space and derive the software rejuvenation model of a two-node application, shown in Fig. 2. The states of the application are denoted by a 2-tuple from the set S = {(i, j) | i, j ∈ {H, F, R}}, where i is the state of the first version of the application and j is the state of the second version (H: working, F: failure, R: rejuvenation). For the first version, λ1, μ1, r1 and R1 denote, respectively, the failure rate from the working state to the failure state, the rate at which software rejuvenation is triggered, the rejuvenation rate from the rejuvenation state to the working state, and the recovery rate from the failure state to the working state. Correspondingly, for the second version, λ2, μ2, r2 and R2 denote the failure rate, the rate at which software rejuvenation is triggered, the rejuvenation rate and the recovery rate, respectively.

Fig. 1. Software rejuvenation model of single application. (States SR (2), S0 (0) and SF (1); transition rates λ1, µ1, r1 and R1.)

For simplicity, we make the following assumptions, which limit the model:

Assumption 1: Software rejuvenation cannot be carried out for both versions concurrently.

Assumption 2: At any time t, only one version can be in the rejuvenation state.

Assumption 3: If one version is in the failure state, the other versions cannot move to the rejuvenation state.

Assumption 4: The rejuvenation rate from the rejuvenation state to the working state is higher than the recovery rate from the failure state to the working state.

We also let Zt be the state of the application at time t and Ω′ = {0, 1, 2, …, 7} be the state space. As before, we use a continuous-time Markov process Z = (Zt; t ≥ 0) to describe the software rejuvenation model of the two-node application. The transition rate matrix of Z is given in Eq. 10, and the steady-state probability Pj, j ∈ Ω′, is given by [4]:

$$P_j = \lim_{t \to \infty} P_{ij}(t), \qquad \forall i, j \in \Omega' \tag{9}$$

Fig. 2. Software rejuvenation model of two applications. (States: 0 (H,H), 1 (F,H), 2 (H,F), 3 (F,F), 4 (R,H), 5 (H,R), 6 (R,F), 7 (F,R); transition rates λ1, λ2, µ1, µ2, r1, r2, R1 and R2.)

Correspondingly, the transition probability matrix P(t) also satisfies the condition in Eq. 5. Substituting Eq. 9 and Eq. 10 into Eq. 5 yields Eq. 11 [5]:

$$Q = \begin{pmatrix}
-(\lambda_1 + \lambda_2 + \mu_1 + \mu_2) & \lambda_1 & \lambda_2 & 0 & \mu_1 & \mu_2 & 0 & 0 \\
R_1 & -(R_1 + \lambda_2) & 0 & \lambda_2 & 0 & 0 & 0 & 0 \\
R_2 & 0 & -(R_2 + \lambda_1) & \lambda_1 & 0 & 0 & 0 & 0 \\
0 & R_2 & R_1 & -(R_1 + R_2) & 0 & 0 & 0 & 0 \\
r_1 & 0 & 0 & 0 & -(r_1 + \lambda_2) & 0 & \lambda_2 & 0 \\
r_2 & 0 & 0 & 0 & 0 & -(r_2 + \lambda_1) & 0 & \lambda_1 \\
0 & 0 & r_1 & 0 & R_2 & 0 & -(r_1 + R_2) & 0 \\
0 & r_2 & 0 & 0 & 0 & R_1 & 0 & -(r_2 + R_1)
\end{pmatrix} \tag{10}$$

$$\begin{aligned}
-(\mu_1 + \mu_2 + \lambda_1 + \lambda_2) P_0 + R_1 P_1 + R_2 P_2 + r_1 P_4 + r_2 P_5 &= 0 \\
-(R_1 + \lambda_2) P_1 + \lambda_1 P_0 + R_2 P_3 + r_2 P_7 &= 0 \\
-(R_2 + \lambda_1) P_2 + \lambda_2 P_0 + R_1 P_3 + r_1 P_6 &= 0 \\
-(R_1 + R_2) P_3 + \lambda_2 P_1 + \lambda_1 P_2 &= 0 \\
-(r_1 + \lambda_2) P_4 + \mu_1 P_0 + R_2 P_6 &= 0 \\
-(r_2 + \lambda_1) P_5 + \mu_2 P_0 + R_1 P_7 &= 0 \\
-(r_1 + R_2) P_6 + \lambda_2 P_4 &= 0 \\
-(r_2 + R_1) P_7 + \lambda_1 P_5 &= 0 \\
\sum_{i=0}^{7} P_i &= 1
\end{aligned} \tag{11}$$

Solving the above equations gives the values of Pi, i = 0, 1, 2, …, 7. According to the rejuvenation model in Fig. 2, the application is unavailable in the states (F, F), (R, F) and (F, R). The availability of the two-node application is therefore given by:

$$P_{A2} = P_0 + P_1 + P_2 + P_4 + P_5 = 1 - (P_3 + P_6 + P_7) \tag{12}$$

Fig. 3. Software rejuvenation model of three applications. (States: 0 (H,H,H), 1 (H,H,F), 2 (H,F,H), 3 (F,H,H), 4 (H,F,F), 5 (F,H,F), 6 (F,F,H), 7 (F,F,F), 8 (H,H,R), 9 (H,R,H), 10 (R,H,H), 11 (H,F,R), 12 (H,R,F), 13 (F,H,R), 14 (F,R,H), 15 (R,F,H), 16 (R,H,F), 17 (R,F,F), 18 (F,R,F), 19 (F,F,R); transition rates λ1, λ2, λ3, μ1, μ2, μ3, r1, r2, r3, R1, R2 and R3.)
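The two-node balance equations (Eq. 11) are tedious to solve by hand, so a numerical check is useful. The Python sketch below is one possible way to do it, not the procedure used in the paper: it enters the transition rate matrix of Eq. 10 with the demonstration values of Table I and approximates the steady-state vector by repeatedly applying the uniformized transition matrix I + Q/Λ. The function name, the choice of Λ and the iteration count are our assumptions.

```python
# Demonstration rates from Table I, assumed identical for both versions.
lam1 = lam2 = 0.005   # failure rates
mu1 = mu2 = 0.002     # rejuvenation trigger rates
r1 = r2 = 1.0         # rejuvenation rates
R1 = R2 = 0.1         # recovery rates

# Transition rate matrix of Eq. 10; state order follows Fig. 2:
# 0 (H,H), 1 (F,H), 2 (H,F), 3 (F,F), 4 (R,H), 5 (H,R), 6 (R,F), 7 (F,R).
Q = [
    [-(lam1 + lam2 + mu1 + mu2), lam1, lam2, 0, mu1, mu2, 0, 0],
    [R1, -(R1 + lam2), 0, lam2, 0, 0, 0, 0],
    [R2, 0, -(R2 + lam1), lam1, 0, 0, 0, 0],
    [0, R2, R1, -(R1 + R2), 0, 0, 0, 0],
    [r1, 0, 0, 0, -(r1 + lam2), 0, lam2, 0],
    [r2, 0, 0, 0, 0, -(r2 + lam1), 0, lam1],
    [0, 0, r1, 0, R2, 0, -(r1 + R2), 0],
    [0, r2, 0, 0, 0, R1, 0, -(r2 + R1)],
]

def steady_state(Q, iterations=5000):
    """Approximate pi with pi*Q = 0 and sum(pi) = 1 by repeatedly applying
    the uniformized transition matrix I + Q/rate to a starting distribution."""
    n = len(Q)
    rate = 2.0 * max(-Q[i][i] for i in range(n))  # larger than every exit rate
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [pi[j] + sum(pi[i] * Q[i][j] for i in range(n)) / rate
              for j in range(n)]
    return pi

P = steady_state(Q)
availability_2 = 1.0 - (P[3] + P[6] + P[7])   # Eq. 12
print(f"two-version availability = {availability_2:.4f}")
```

Because every row of Q sums to zero, each iteration preserves the total probability, so no separate normalization step is needed.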


C. Software Rejuvenation Model of Three-Node Application<br />

We extend this work to a three-dimensional state space and obtain a lower unavailability with the software rejuvenation model of a three-node application, shown in Fig. 3. Q is the transition rate matrix, given in Eq. 14.

Solving the resulting equations gives the values of Pi, i = 0, 1, 2, …, 19. According to the rejuvenation model in Fig. 3, the application is unavailable in the states (F,F,F), (R,F,F), (F,R,F) and (F,F,R). The system availability of the three-node application is therefore given by:

$$P_{A3} = 1 - (P_7 + P_{17} + P_{18} + P_{19}) \tag{13}$$

A λ1 λ2 λ3 0 0 0 0 μ1 μ2 μ3 0 0 0 0 0 0 0 0 0<br />

R1 B 0 0 λ2 λ3 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />

R2 0 C 0 λ1 0 λ3 0 0 0 0 0 0 0 0 0 0 0 0 0<br />

R3 0 0 D 0 λ1 λ2 0 0 0 0 0 0 0 0 0 0 0 0 0<br />

0 R2 R1 0 E 0 0 λ3 0 0 0 0 0 0 0 0 0 0 0 0<br />

0 R3 0 R1 0 F 0 λ2 0 0 0 0 0 0 0 0 0 0 0 0<br />

0 0 R3 R2 0 0 G λ1 0 0 0 0 0 0 0 0 0 0 0 0<br />

0 0 0 0 R3 R2 R1 H 0 0 0 0 0 0 0 0 0 0 0 0<br />

r1 0 0 0 0 0 0 0 I 0 0 λ2 λ3 0 0 0 0 0 0 0<br />

r2 0 0 0 0 0 0 0 0 J 0 0 0 λ3 λ1 0 0 0 0 0<br />

r3 0 0 0 0 0 0 0 0 0 K 0 0 0 0 λ1 λ2 0 0 0<br />

0 0 r1 0 0 0 0 0 0 0 0 L 0 0 0 0 0 λ3 0 0<br />

0 0 0 r1 0 0 0 0 0 0 0 0 M 0 0 0 0 λ2 0 0<br />

0 0 0 r2 0 0 0 0 0 0 0 0 0 N 0 0 0 0 λ1 0<br />

0 0 0 r2 0 0 0 0 0 0 0 0 0 0 O 0 0 0 λ3 0<br />

0 0 r3 0 0 0 0 0 0 0 0 0 0 0 0 P 0 0 0 λ2<br />

0 r3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Q 0 0 λ1<br />

0 0 0 0 0 0 r1 0 0 0 0 0 0 0 0 0 0 R 0 0<br />

0 0 0 0 0 r2 0 0 0 0 0 0 0 0 0 0 0 0 S 0<br />

0 0 0 0 r3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T<br />

(14)<br />

III. NUMERICAL RESULTS AND ANALYSIS<br />

To obtain a reliability measure for the application, we perform numerical experiments with system unavailability as the evaluation indicator.

The unavailability of the single application PU1, the two-node application PU2 and the three-node application PU3 is evaluated as follows:

$$\begin{aligned}
P_{U1} &= 1 - P_{A1} = P_1 + P_2 \\
P_{U2} &= 1 - P_{A2} = P_3 + P_6 + P_7 \\
P_{U3} &= 1 - P_{A3} = P_7 + P_{17} + P_{18} + P_{19}
\end{aligned}$$

TABLE I

PARAMETER VALUES USED IN THE EXPERIMENT

r1=r2=…=rn    R1=R2=…=Rn    λ1=λ2=…=λn    µ1=µ2=…=µn

1             0.1           0.005         0.002

The default parameter values of the software rejuvenation model are given in Table I: the rejuvenation rate is 1, the recovery rate is 0.1, the failure rate is 0.005 and the rate at which software rejuvenation is triggered is 0.002. All parameter values were chosen from experience for demonstration purposes. To simplify the numerical experiment, we assume that the failure, recovery and rejuvenation rates are the same for all versions.

Figure 4 shows the system unavailability versus the number of versions. The number of versions strongly influences system reliability: as the number of versions increases, the system unavailability falls rapidly and approaches a steady value.
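The trend described above can be explored with a short script. The Python sketch below is an illustration under our own assumptions and is not the authors' program: instead of copying Eq. 14 entry by entry, it generates the state space from Assumptions 1 to 3 (at most one version in state R, and a version may start rejuvenation only when all versions are working), which gives 3, 8 and 20 states for one, two and three versions. It then solves the balance equations by Gauss-Jordan elimination. All function names are hypothetical, and the resulting numbers need not match Fig. 4 exactly.

```python
from itertools import product

def build_generator(n, lam, mu, r, R):
    """States and transition rate matrix for n identical versions."""
    states = [s for s in product("HFR", repeat=n) if s.count("R") <= 1]
    index = {s: k for k, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for s in states:
        i = index[s]
        all_working = all(c == "H" for c in s)
        for v, c in enumerate(s):
            if c == "H":
                moves = [("F", lam)] + ([("R", mu)] if all_working else [])
            elif c == "F":
                moves = [("H", R)]      # recovery after a failure
            else:
                moves = [("H", r)]      # rejuvenation completes
            for new, rate in moves:
                Q[i][index[s[:v] + (new,) + s[v + 1:]]] += rate
        Q[i][i] = -sum(Q[i])            # diagonal = minus the total exit rate
    return states, Q

def steady_state(Q):
    """Solve pi*Q = 0 with sum(pi) = 1 by Gauss-Jordan elimination."""
    m = len(Q)
    A = [[Q[j][i] for j in range(m)] + [0.0] for i in range(m)]
    A[-1] = [1.0] * m + [1.0]           # replace one equation by normalization
    for col in range(m):
        pivot = max(range(col, m), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for k in range(m):
            if k != col and A[k][col] != 0.0:
                f = A[k][col]
                A[k] = [x - f * y for x, y in zip(A[k], A[col])]
    return [A[i][m] for i in range(m)]

def unavailability(n, lam, mu, r, R):
    """Probability that no version is in the working state H."""
    states, Q = build_generator(n, lam, mu, r, R)
    pi = steady_state(Q)
    return sum(p for s, p in zip(states, pi) if "H" not in s)

# Demonstration values from Table I.
results = {n: unavailability(n, lam=0.005, mu=0.002, r=1.0, R=0.1) for n in (1, 2, 3)}
for n, u in results.items():
    print(f"{n} version(s): unavailability = {u:.6f}")
```

Under these assumptions the unavailability drops with each added version, which is consistent with the trend reported for Fig. 4.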

IV. CONCLUSION<br />

In this paper, we presented a software rejuvenation structure and set up software rejuvenation models in one-, two- and three-dimensional state spaces for one application. In the model, the system availability formula is derived from a continuous-time Markov process. The numerical results show that system unavailability decreases substantially as the number of versions increases.

Fig. 4. The system unavailability versus number of versions in the application with multiple versions.



REFERENCES<br />

[1] S. Yu, CH. Qi, and H. Xin, “Positive software fault-tolerate technique based on time policy,” Journal of Communication and Computer, ISSN 1548-7709, Vol. 4, No. 8 (Serial No. 33), 2007.

[2] Y. Huang, C. Kintala, N. Kolettis, and N. D. Fulton, “Software rejuvenation: Analysis, module and applications,” in Proc. 25th Symposium on Fault Tolerant Computer Systems, pp. 381-390, 1995.

[3] T. Thein and J. Sou Park, “Availability analysis of application servers using software rejuvenation and virtualization,” Journal of Computer Science and Technology, 24(2): 339-346, Mar. 2009.

[4] S. Pfening, S. Garg, A. Puliafito, M. Telek, and K. S. Trivedi, “Optimal rejuvenation for tolerating soft failure,” Performance Evaluation, 27/28, pp. 491-506, 1996.

[5] Q. Yong, M. Haining, H. Di, and Ch. Ying, “A study on software rejuvenation model of application server cluster in two-dimension state space using Markov process,” Information Technology Journal, 7(1): 98-104, 2008.

[6] T. Dohi and S. Trivedi, “Statistical non-parametric algorithms to estimate the optimal software rejuvenation schedule,” Dept. of Electrical and Computer Engineering, Duke University, Durham, NC 27708-0294, USA, 2000.

[7] W. Xie, Y. Hong, and K. Trivedi, “Analysis of a two-level software rejuvenation policy,” Reliability Engineering and System Safety, 87 (2005) 13-22.

[8] Y. Huang, C. Kintala, N. Kolettis, and N. D. Fulton, “Software rejuvenation: Analysis, module and application,” in Proc. 25th Symposium on Fault Tolerant Computing, June 1995.

[9] T. Dohi, K. Goseva-Popstojanova, and K. S. Trivedi, “Statistical non-parametric algorithms to estimate the optimal software rejuvenation schedule,” in Proc. 2000 Pacific Rim International Symposium on Dependable Computing, December 2000.

[10] K. Vaidyanathan, R. E. Harper, S. W. Hunter, and K. S. Trivedi, “Analysis and implementation of software rejuvenation in cluster systems,” ACM SIGMETRICS Performance Evaluation Review, in Proc. 2001 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, vol. 29 (1), June 2001.


A New Attitude based on Real Time Operating System for NoC in Hotspot Traffic Model

Seyyed Amir Asghari, Hossein Pedram and Hassan Taheri

Abstract: The reduction in transistor size has increased the number of transistors on a chip to several billion. New techniques are therefore needed to manage this large quantity of transistors on a single chip. Network on Chip (NoC) is an implementation technique that addresses this problem. However, NoC management is a challenging job, and communication management needs regular scheduling and configuration. One approach to NoC management is to use a Real Time Operating System (RTOS) for scheduling, task creation, dynamic assignment of task priorities and message passing. In this paper, the MicroC/OS-II RTOS is used. The RTOS is ported to a Motorola ColdFire microprocessor, which is located in the core of a node of a mesh-topology NoC. The traffic model in this paper is the hotspot model.

Index Terms: MicroC/OS-II, Motorola ColdFire Microprocessor, Network on Chip, Real Time Operating System.

I. INTRODUCTION

The System on Chip (SoC) can include different components such as a processor, an I/O unit and various types of memories. Each of these components can have a different communication protocol [1].

Generally, processing elements are interconnected through ports. In a multiprocessor SoC (MPSoC) with numerous processing elements, however, these ports are expected to become bottlenecks in terms of latency, scalability and energy consumption.

Therefore, the idea of an NoC, which consists of routers connected by links, was introduced. Communication management in an NoC is challenging, however, so an RTOS is used to take charge of it. The OS can be ported to the microprocessor of an NoC node. In this paper, the MicroC/OS-II RTOS is ported to the central node of a mesh-topology NoC under the hotspot traffic model, although the OS could be ported to all nodes. The idea of applying NoCs has also been used in previous work such as [9].

In this paper, MicroC/OS-II is used in an innovative way. In contrast to OSs such as Windows and Linux, it is not monolithic, and application programs do not affect the kernel. Its kernel also has a small number of code lines, which has a favorable impact on power consumption compared with conventional OSs. As NoCs are power constrained, this is considered an advantage [11].

The second part of this paper introduces the NoC structure and its components. The third part introduces the MicroC/OS-II RTOS and its main features. The fourth part explains different types of traffic models; the model considered here is the hotspot traffic model. The fifth part introduces the Motorola ColdFire processors; the MCF5484 ColdFire processor is used in the implementation of the OS-based NoC. The sixth part introduces the microprocessor programming and debugging tools. The seventh part compares two approaches, one using an OS and one without, and presents the advantages of the OS-based NoC; it also introduces the PrioRout routing algorithm. The eighth part presents the implementation, and the last part gives the conclusion.

II. NOC STRUCTURE<br />

An NoC is formed of routers and links. The IP blocks are connected to each other by means of network interfaces (NIs), and the routers communicate with each other over links. A router determines the paths of packets in the network. It consists of buffers, a routing function unit, a selection function unit and a switch that forwards packets toward their destinations [2] [10].

A network interface adapts between the communication protocol of an IP block and the packet transmission protocol of the router. Each network interface can connect several IP blocks to the routers.



Figure 1. A router with its components<br />

III. MICROC/OS-II RTOS<br />

MicroC/OS-II is an RTOS for embedded applications. It can be added to any system for which a toolchain (compiler, assembler and linker) is available. MicroC/OS-II has a fully preemptive real-time kernel, which means that the OS always runs the highest-priority task that is ready to run. Many traditional kernels are also preemptive, but MicroC/OS-II compares favorably with them.

OSs with a monolithic kernel (such as Windows and Linux) consist of millions of lines of code, so analyzing them when a problem occurs is difficult, and such OSs are unlikely to be bug free.

The kernel of MicroC/OS-II has only about 5000 lines of code, which makes it feasible to reach a level where it can be considered bug free [3].

A. Multitasking feature<br />

MicroC/OS-II can manage up to 64 tasks. However, it reserves the four highest-priority and the four lowest-priority tasks for its own use, which leaves 56 tasks for the application.

B. Task creation

To use the task management capability of MicroC/OS-II, a task must first be created. A task can be created with one of these functions:

• OSTaskCreate()

• OSTaskCreateExt()

OSTaskCreateExt() is an extended version of OSTaskCreate() that offers some extra features. Multitasking requires at least one task to be created. A task cannot be created from an Interrupt Service Routine (ISR). Figure 2 shows the prototype of the OSTaskCreate() function:

INT8U OSTaskCreate(void (*task)(void *pd), void *pdata, OS_STK *ptos, INT8U prio);

Figure 2. OSTaskCreate function<br />

As shown above, the function takes four arguments:

task: a pointer to the task code.

pdata: a pointer to an argument that is passed to the task when it first starts.

ptos: a pointer to the top of the stack that is assigned to the task.

prio: the priority of the task.

IV. TRAFFIC MODELS<br />

The traffic model is one of the important parameters in evaluating the latency of interconnection networks. Traffic models are derived from the application programs that run on the machine, and different applications call for different models. A traffic model is defined by three parameters [4]:

• The time at which messages enter the network

• Message length

• Address distribution type

A. The uniform traffic model<br />

The uniform traffic model is the simplest traffic model and is used in most evaluations. In this model, each node sends messages to every other node in the network with equal probability. For example, in a 6 × 6 mesh topology, each node sends a message to each of the other 35 nodes with a probability of 1/35, or about 2.86%.

All source and destination nodes are selected with equal probability. The selection of the source and destination nodes for each message is independent of the other messages [4].
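The 6 × 6 figure can be reproduced with a one-line calculation. The function name `uniform_dest_probability` is a hypothetical helper introduced only for this sketch.

```c
/* Probability that a node picks one particular other node as the
 * destination under uniform traffic in a rows x cols mesh. */
double uniform_dest_probability(int rows, int cols)
{
    int nodes = rows * cols;
    return 1.0 / (double)(nodes - 1);   /* every other node is equally likely */
}
```

For a 6 × 6 mesh this evaluates to 1/35.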

B. Hotspot traffic model<br />

In the hotspot traffic model, more messages are sent to one special node, called the hot node, than to the other nodes. Usually a single node is considered the hot node. Because some of the packets of the messages created in the network are sent to this spot, the traffic around this node is heavier than around the other nodes.

Equalizing protocols and OS functions are examples of mechanisms that produce this kind of traffic. The most strongly colored node in Figure 3 is the hot node, and the traffic congestion around it is clearly visible.
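A minimal sketch of destination selection under hotspot traffic is given below. It assumes that a fixed fraction `hot_fraction` of the messages goes to the hot node and that the rest is spread uniformly; this parameter, the function name, and the convention that the caller supplies the random numbers `u_hot` and `u_uni` (each uniform in [0, 1)) are assumptions made for illustration, not details taken from the paper.

```c
/* Hypothetical sketch: pick a destination node index in 0..nodes-1.
 * u_hot and u_uni are uniform random numbers in [0, 1) from the caller. */
int hotspot_destination(int src, int nodes, int hot_node,
                        double hot_fraction, double u_hot, double u_uni)
{
    /* With probability hot_fraction the message goes to the hot node. */
    if (src != hot_node && u_hot < hot_fraction)
        return hot_node;

    /* Otherwise choose uniformly among the nodes - 1 other nodes. */
    int dst = (int)(u_uni * (nodes - 1));   /* 0 .. nodes - 2 */
    if (dst >= src)
        dst++;                              /* skip the source itself */
    return dst;
}
```

Passing the random numbers in as arguments keeps the function deterministic, which makes the selection rule easy to check in isolation.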



Figure 3. Hotspot traffic model

C. Permutation traffic model

The permutation traffic model is another traffic model; many parallel programs, such as FFT, matrix problems, and fault-tolerant routing algorithms, behave like it.

In this model, the destination address is obtained by applying a permutation function to the source address, so each source address always has exactly one destination address. Bit reversal, first and second matrix transpose, shuffle, and butterfly are some examples of permutation traffic models. As an example, consider the first matrix transpose model: if M and N are the dimension sizes of the 2-D network and (i, j) is the source node address, the destination address is produced as follows:

$$(i, j) \rightarrow (M \times N - 1 - j,\ M \times N - 1 - i) \qquad (1)$$

The destination address in the second matrix transpose model is produced as follows:

$$(i, j) \rightarrow (j, i) \qquad (2)$$
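Two of the permutation functions named above can be sketched in a few lines. The type `NodeAddr` and both function names are hypothetical; the bit-reversal variant assumes that nodes are identified by a single index that is `bits` wide, which the paper does not specify.

```c
typedef struct { int i; int j; } NodeAddr;

/* Second matrix transpose, equation (2): (i, j) -> (j, i). */
NodeAddr transpose2_destination(NodeAddr src)
{
    NodeAddr dst = { src.j, src.i };
    return dst;
}

/* Bit reversal on a node index that is `bits` wide
 * (assumed addressing scheme). */
unsigned bit_reversal_destination(unsigned src, unsigned bits)
{
    unsigned dst = 0;
    for (unsigned b = 0; b < bits; b++) {
        dst = (dst << 1) | (src & 1u);  /* move the lowest bit of src to dst */
        src >>= 1;
    }
    return dst;
}
```

In both cases the mapping is one-to-one, so every source has exactly one destination.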

D. Local Traffic model<br />

The local traffic model resembles the behavior of application programs. In this model, each node sends a given share of the messages it creates to its neighbors. The number of neighbors depends on the distance allowed between neighboring nodes (called the neighbor radius). Figure 4 shows a radius of one; the black nodes are the neighbors of the node.

Figure 4. Local traffic model<br />

In all of the traffic models described above, a certain percentage of the messages is distributed according to the permutation, local, or hotspot pattern, and the remaining messages are distributed in another way, which is usually uniform.

V. COLDFIRE MICROPROCESSOR INTRODUCTION<br />

Motorola is one of the pioneers in producing 8-, 16-, and 32-bit microprocessors and microcontrollers. The ColdFire microprocessor family is one of the company's best-known and most successful products. These processors are based on the M68000 architecture and are suitable for use in real-time systems. For the purposes of this paper, the MCF5484 is used.

VI. BDM MODULE AS A DEBUGGING AND PROGRAMMING TOOL

Figure 5 shows the interface of this module with the processor core and its other interfaces. As can be seen, the debug module is connected to the main bus of the microprocessor, so in some cases it can work in parallel with the ColdFire CPU core.

Figure 5. BDM interfaces<br />

The capabilities of this module are divided into three groups:



A. Real time trace support<br />

The module can dynamically compute the execution path, which is useful for debugging. ColdFire can place 8 bits of parallel data on the emulator. These data show the microprocessor status and memory data.

B. Background Debug Mode (BDM)

This capability provides low-level debugging for ColdFire. In this mode, memory can be accessed without stopping the microprocessor, but changing the values of registers requires halting the microprocessor.

C. Real time Debug Support

In this mode, a debug interrupt routine quickly saves the values of the registers and the variable data, and the system then returns to normal operation without stopping the main program.

BDM is useful for the following reasons:

• BDM is always accessible for debugging and firmware upgrading

• It is used for programming external flash

• It provides complete control of the microprocessor and thus of the whole system.

As a result, the microprocessor can be debugged with the same tools that are used to program it.

Most BDM commands do not halt the microprocessor and can run concurrently with a program. The conditions that do halt the microprocessor are as follows:

• A fault in the BDM system

• Breakpoints

• A Halt command, after which execution can be resumed with 'Go' from BDM

VII. PERFORMANCE COMPARISON IN TWO CASES: WITH AND WITHOUT AN OS

In this section, we compare two approaches in a NoC based on a mesh topology. For this comparison, we use a 3 × 3 mesh topology under the hotspot traffic model. Using an OS in a topology with few nodes is worthwhile if the communication between nodes is complex or the number of defined tasks is large. In the approach without an OS, a large amount of traffic passes through the central node of the hotspot traffic model. As a result, packets may become congested when input packets are assigned to outputs. To remove this problem, virtual channels are used. However, virtual channels add considerable overhead: each added channel increases the power consumption of this approach.

If virtual channels are used, the router needs a MUX and a DEMUX for packet selection. Figure 6 shows how packets are placed in the virtual channels and how a packet is selected from them.

Figure 6. Virtual channel<br />

As can be seen, this approach requires components such as a MUX, a DEMUX, and buffers. These components introduce complications such as buffer management and packet selection from the buffers. In the approach based on an OS, we define tasks based on the I/O ports (the Local port is neglected). As a result, there are four tasks: North, South, East, and West. The OS assigns one task priority to each port. Based on the priorities assigned to these tasks, the routing of an input packet to an output port can be managed easily.

The OS is responsible for scheduling and task management. In this scheme, priority assignment is programmed as follows: whenever the preferred output port is busy, a free port is chosen according to PrioRout.

A. Deterministic:<br />

The execution time of all MicroC/OS-II functions and services is deterministic. This means that you can always know how much time MicroC/OS-II will take to execute a function or a service.

Furthermore, except for one service, execution time of all<br />

MicroC/OS-II services does not depend on the number of tasks<br />

running in your application.<br />

B. Task stacks:<br />

Each task requires its own stack. However, MicroC/OS-II<br />

allows each task to have a different stack size. This allows you<br />

to reduce the amount of RAM needed in your application.<br />

With MicroC/OS-II's stack checking feature, you can<br />

determine exactly how much stack space, each task actually<br />

requires.<br />

C. Services:<br />

MicroC/OS-II provides a number of system services such as<br />

mailboxes, queues, semaphores, fixed-sized memory<br />

partitions, time related functions, etc.<br />

D. Interrupt Management:<br />

Interrupts can suspend the execution of a task and, if a higher priority task is awakened as a result of the interrupt, the highest priority task will run as soon as all nested interrupts complete. Interrupts can be nested up to 255 levels deep.



E. Critical section of code<br />

A critical section of code, or critical section for short, is code that must be atomic and must run as an indivisible block. Code placed in such a section is therefore uninterruptible. To ensure this, all interrupts are disabled before the critical section runs and are enabled again afterwards.
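The pattern can be sketched as follows. The two macros below are simple stand-ins that only record the interrupt state so that the sketch is self-contained; on MicroC/OS-II the corresponding port-specific macros are OS_ENTER_CRITICAL() and OS_EXIT_CRITICAL(). The `port_busy` flags and the function `try_claim_port` are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Stand-ins for the RTOS primitives (assumption: a real port would
 * disable and re-enable interrupts here). */
static bool interrupts_enabled = true;
#define ENTER_CRITICAL()  (interrupts_enabled = false)
#define EXIT_CRITICAL()   (interrupts_enabled = true)

enum { PORT_NORTH, PORT_SOUTH, PORT_EAST, PORT_WEST, PORT_COUNT };
static bool port_busy[PORT_COUNT];

/* Atomically claim an output port; returns true if it was free. */
bool try_claim_port(int port)
{
    bool claimed = false;

    ENTER_CRITICAL();               /* interrupts off: test-and-set is atomic */
    if (!port_busy[port]) {
        port_busy[port] = true;
        claimed = true;
    }
    EXIT_CRITICAL();                /* interrupts on again */

    return claimed;
}
```

Keeping the protected region to a single test-and-set keeps the time with interrupts disabled short.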

VIII. PRIOROUT ROUTING ALGORITHM<br />

In this algorithm, an input packet chooses its output port based on the input port on which it arrived and on its destination. Figure 7 shows a 3 × 3 mesh topology of a NoC. In this topology, the OS has been ported to the router that is marked with a special color.

Figure 7. Mesh topology based NoC<br />

Figure 8. The Central router ports<br />

The number of router ports depends on the location of the router. For example, the router situated in the north-east has three ports: the Eastern port, the Southern port, and the Local port. The router to which the OS has been ported is located in the center of the mesh topology and has five ports: the Southern port, the Northern port, the Eastern port, the Western port, and the Local port. Figure 8 shows the ports of the central router.

In PrioRout routing, if the input port is the northern one, the requested output port is the eastern one, and the eastern port is free, then the output port is the eastern port. If the eastern port is busy, the output port is the southern port; if the southern port is also busy, then in the worst case the western port is the output port. So the task priorities in this example are:

TP(North to East) = 3

TP(North to South) = 2

TP(North to West) = 1

As a result, there is no need for storing or buffering packets. In the same manner, for all packets whose destination is a neighboring port, the highest task priority belongs to that port. The next priority goes to the frontal port, and the lowest priority goes to the remaining port. In PrioRout routing, if the input port is the eastern one, the requested output port is the western one, and the western port is free, then the output port is the western port. If the western port is busy, the output port is either the northern port or the southern port; in this case, we choose the free port in the clockwise direction. Consistent with Table I, we have:

TP(East to West) = 3

TP(East to South) = 2

TP(East to North) = 1

Table I shows the packet routing when the OS is used.

TABLE I. PACKET ROUTING BASED ON PRIOROUT ROUTING

Task | Routing | Best Case Output | Mean Case Output | Worst Case Output
North | North to South | South | East | West
North | North to East | East | South | West
North | North to West | West | South | East
South | South to North | North | West | East
South | South to East | East | North | West
South | South to West | West | North | East
East | East to West | West | South | North
East | East to North | North | West | South
East | East to South | South | West | North
West | West to East | East | North | South
West | West to North | North | East | South
West | West to South | South | East | North
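The routing table can be turned into a lookup that returns the first free port in best, mean, worst order. This is a sketch under stated assumptions rather than the authors' implementation: the `Port` type, the `prio_rout` array, and the function `prio_rout_select` are hypothetical names, and the busy flags are passed in as a plain array.

```c
#include <stdbool.h>

typedef enum { NORTH, SOUTH, EAST, WEST, PORT_COUNT, NO_PORT = -1 } Port;

/* prio_rout[in][want] holds the best, mean and worst output ports
 * from Table I; entries with in == want are unused. */
static const Port prio_rout[PORT_COUNT][PORT_COUNT][3] = {
    [NORTH] = { [SOUTH] = { SOUTH, EAST,  WEST  },
                [EAST]  = { EAST,  SOUTH, WEST  },
                [WEST]  = { WEST,  SOUTH, EAST  } },
    [SOUTH] = { [NORTH] = { NORTH, WEST,  EAST  },
                [EAST]  = { EAST,  NORTH, WEST  },
                [WEST]  = { WEST,  NORTH, EAST  } },
    [EAST]  = { [WEST]  = { WEST,  SOUTH, NORTH },
                [NORTH] = { NORTH, WEST,  SOUTH },
                [SOUTH] = { SOUTH, WEST,  NORTH } },
    [WEST]  = { [EAST]  = { EAST,  NORTH, SOUTH },
                [NORTH] = { NORTH, EAST,  SOUTH },
                [SOUTH] = { SOUTH, EAST,  NORTH } },
};

/* Return the highest-priority free output port, or NO_PORT. */
Port prio_rout_select(Port in, Port want, const bool busy[PORT_COUNT])
{
    if (in == want)
        return NO_PORT;                 /* Table I has no such entry */
    for (int k = 0; k < 3; k++) {
        Port out = prio_rout[in][want][k];
        if (!busy[out])
            return out;
    }
    return NO_PORT;                     /* all three candidates are busy */
}
```

For a packet entering from the north and requesting the east, the lookup returns East, then South, then West as the earlier choices become busy.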



IX. EXPERIMENTAL RESULTS<br />

A packet travelling from north to east has been analyzed based on PrioRout routing and the MicroC/OS-II features.

The OS dynamically assigns the updated priorities. Accordingly, there is a priority table for packet routing, which is shown in Table II.

Table II. Task (priority) table for routing

Priority | Destination
3 | East
1 | West
2 | South

Consequently, an input packet follows this priority assignment once it reaches the task (the North port). Four Boolean global variables are defined during task implementation and creation to show whether or not the ports are busy. If the preferred port is busy, the port with the next-highest priority (the south port in this example) is selected. The other tasks follow the same routing scheme. To sum up, when an OS is used, packet flits that pass through the router do not need to be stored in buffers, so the power consumption is lower than in the case without an OS. In the worst case (Figure 10-b), if all ports are busy, the packets can be stored in the input buffers and task stacks, which means that virtual channels are not needed. In the case without an OS (Figure 10-a), higher-priority packets may have to wait for lower-priority packets. In the approach with an OS, in contrast, packets are sent in order of their priority, that is, according to their importance. Note, however, that a new output packet cannot interrupt until all flits of the previous packet have been sent. The approach with an OS also enables message passing. When packets arriving at two ports contend, one packet is selected at a time by means of the critical section feature, and this section can perform the priority assignment. For example, the forward path has a higher priority than the neighbor path. Figure 10 compares the two approaches (with and without an OS). The horizontal axis shows the task priorities and the vertical axis shows the packet transmission time. As a result, the transmission time of a higher-priority packet is lower.

There is one task for each port; therefore, there are four tasks altogether. There are 12 paths, as shown in Table I.

The creation of the four tasks in MicroC/OS-II is shown in Figure 9:

#define TASK_STK_SIZE 512 // Size of each task's stack (# of WORDs)
#define TASK_START_ID 0 // Application task IDs
#define TASK_1_ID 1
#define TASK_2_ID 2
#define TASK_3_ID 3
#define TASK_4_ID 4
#define TASK_START_PRIO 4 // Application task priorities
// MicroC/OS-II requires a distinct priority for every task;
// the values 11 to 14 below are illustrative.
#define TASK_1_PRIO 11
#define TASK_2_PRIO 12
#define TASK_3_PRIO 13
#define TASK_4_PRIO 14
// The top of stack is the last array element (assumes a stack that grows downward).
// Create the first task
OSTaskCreate(TestTask1, (void *)11, &TestTaskStk1[TASK_STK_SIZE - 1], TASK_1_PRIO);
// Create the second task
OSTaskCreate(TestTask2, (void *)11, &TestTaskStk2[TASK_STK_SIZE - 1], TASK_2_PRIO);
// Create the third task
OSTaskCreate(TestTask3, (void *)11, &TestTaskStk3[TASK_STK_SIZE - 1], TASK_3_PRIO);
// Create the fourth task
OSTaskCreate(TestTask4, (void *)11, &TestTaskStk4[TASK_STK_SIZE - 1], TASK_4_PRIO);

Figure 9. Creation of four tasks in MicroC/OS-II

Figure 10. Packet transmission time (vertical axis, Time, 0 to 3) versus task priority (horizontal axis, Priority, 1 to 3): a) worst case without an OS; b) normal case with an OS.



Table III. Packet transmission status with northern source

Source port that sends 1000 packets | Other port status | Destination port and number of received packets
North | East is free | East: 1000
North | East is busy and South is free | South: 1000
North | East and South are busy and West is free | West: 1000

In our simulation, a Phytec evaluation board is used, to which MicroC/OS-II has been ported. The results obtained are shown in Table III.

X. CONCLUSION<br />

In this paper, the use of a real-time OS in a NoC framework under the hotspot traffic model has been analyzed. Communication management in a NoC requires precise planning, scheduling, resource allocation, and message passing, and these requirements must be satisfied efficiently. For this purpose, an RTOS has been used. Since the NoC is power-constrained and the OS used has only a few lines of code, this choice (an RTOS) has a significant effect on minimizing the power consumption. The implementation shows that RTOS features can be used in a NoC.

REFERENCES<br />

[1] Allan, D. Edenfeld, W. H. Joyner, A. B. Kahng, M. Rodgers, Y. Zorian, "2001 Technology Roadmap for Semiconductors," Computer, vol. 35, no. 1, pp. 42-53, Jan. 2002.

[2] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68-73.

[3] http://www.micrium.com/

[4] W. Hsh, Performance issues in wire-limited hierarchical networks, PhD Thesis, University of Illinois-Urbana Champaign, 1992.

[5] G. J. Pfister, V. A. Norton, "Hotspot contention and combining in multistage interconnection networks," IEEE Transactions on Computers, vol. 34, no. 10, pp. 943-948, 1985.

[6] K. Hwang, Advanced Computer Architecture: Parallelism, Scalability and Programmability, McGraw-Hill, 1993.

[7] J. Duato, S. Yalamanchili, and L. Ni, Interconnection Networks: An Engineering Approach. Morgan Kaufmann, 2002.

[8] MCF548x Integrated Microprocessor Electrical Characteristics (applies to the MCF5480, MCF5481, MCF5482, MCF5483, MCF5484, and MCF5485), Freescale Semiconductor, Inc., 2004.

[9] V. Nollet, T. Marescaux, D. Verkest, "Operating-system controlled network on chip," Proceedings of the 41st Design Automation Conference (DAC), 2004, pp. 256-259.

[10] S. A. Asghari, H. Pedram, P. Yaghini, and M. Khademi, "Designing and Implementation of a Network on Chip Router based on Handshaking Communication Mechanism," World Applied Science Journal 6 (1), pp. 88-93, 2009.

[11] N. Eisley and L. Peh, "High-Level Power Analysis for On-Chip Networks," CASES'04, September 22-25, 2004, Washington, DC, USA.

Seyyed Amir Asghari was born in Lashte Nesha in Guilan province of Iran, on June 26, 1984. He received his BS degree in Computer Engineering from Amirkabir University of Technology in 2007. He received his MSc degree from the same university. He is a research assistant at the Asynchronous Design Laboratory in the same school.

Hossein Pedram received his BS degree from Sharif University in 1977 and his MS degree from Ohio State University in 1980, both in Electrical Engineering. He received his PhD degree in Computer Engineering from Washington State University in 1992.

Dr Pedram has served as a faculty member in the Computer Engineering Department at Amirkabir University of Technology since 1992. He teaches courses in computer architecture and distributed systems. His research interests include innovative methods in computer architecture such as asynchronous circuits, management of computer networks, distributed systems, and robotics.

Hassan Taheri received his BS degree from Amirkabir University of Technology in 1975 and his MS degree from the University of Manchester Institute of Science and Technology (UMIST) in 1978, both in Electrical Engineering. He received his PhD degree in Electrical Engineering from UMIST in 1988.

Dr Taheri has served as a faculty member in the Electrical Engineering Department at Amirkabir University of Technology. He teaches courses in Data Communication Network, Computer Communication, Teletraffic Engineering, Electronic Switching, Digital Communications, Telephone Switching, Probability and Statistics.



Nonlinear Filtering Algorithms for Chaotic Signals: A Comparative Study

Valeri Ya Kontorovich, Zinaida Lovtchikova, Jesús A. Meda-Campaña, and Keith Tinsley

Abstract: In this work, a comparative analysis of some approximate nonlinear filtering algorithms for chaos is addressed, assuming that the output signals of chaotic attractors are affected by additive white noises. The estimation accuracy and computational complexity of the filtering algorithms are taken into account during the comparison process.

It is shown that the nonlinear filtering algorithm of chaos can be interpreted, for certain levels of the Signal-Noise Ratio (SNR), as a close to singular one, which dramatically decreases the Mean-Square Error (MSE) of filtering.

Index Terms: Chaotic signals, Markov theory, nonlinear filtering

I. INTRODUCTION<br />

THEORETICALLY, chaos is represented as an output signal of dissipative continuous dynamic systems (strange attractors) (see, for example, [4]):

$$\dot{x} = f(x(t)), \quad x \in \mathbb{R}^n, \quad x(t_0) = x_0, \qquad (1)$$

where $f = [f_1(x), \ldots, f_n(x)]^T$ is a differentiable vector function.

According to the idea of Kolmogorov, the equations for the strange attractors (1) can be successfully transformed into an equivalent stochastic form as a stochastic differential equation (SDE) [4], [11]:

$$\dot{x} = f(x(t)) + \varepsilon \xi(t), \qquad (2)$$

where ξ(t) is a vector of “weak” external white noise with the related positive definite matrix of “intensities” $\varepsilon = [\varepsilon_{ij}]_{n \times n}$. The assumption of the weak white noise component in (2) guarantees the existence of the stationary distribution $W_{st}(x)$, $\forall\, \varepsilon_{ij} \rightarrow 0$ [1]. The latter was considered as an invariant physical measure for statistical characterization of the strange attractors [11], [12], [17].

Manuscript received October 9, 2009. This work was supported through the grant “Intel-VK” from INTEL Corporation.

V. Ya. Kontorovich is with the Communications Section of the Electrical Engineering Department of CINVESTAV-IPN, Av. IPN 2508 Col. San Pedro Zacatenco. C.P. 07360 México, D.F. Apartado postal 14-740, 07000 México, D.F. Phone: +52 55 57473764. Fax: +52 55 50613977 (Email: valeri@cinvestav.mx)

Z. Lovtchikova is with the Engineering and Advanced Technology Interdisciplinary Professional Unit, UPIITA-IPN. Colonia Laguna de Ticomán. C.P. 07340. México, D.F. Phone +52 55 57296000x56848. (Email: lovtchikova@ipn.mx)

J. A. Meda-Campaña is with the Mechanical Engineering Department of SEPI-ESIME Zacatenco, IPN, Av. IPN s/n. Edificio 5, piso 3. Unidad Profesional Adolfo López Mateos. Zacatenco. Col. Lindavista. C.P. 07738. México D.F. México. Phone +52 55 5729600x54737 (Email: jmedac@ipn.mx).

K. Tinsley is with INTEL Labs, INTEL Corporation. Hillsboro, Oregón 97124, USA. Phone: +503 712 1790. (Email: keith.r.tinsley@intel.com)

Statistical description of chaotic systems and noise effects in chaotic trajectories are deeply analyzed in [1], but it is rather difficult to apply those results in engineering applications. In this regard, the authors proposed earlier the so-called “degenerate cumulant equations method” [14] for applied statistical analysis of the strange attractors.

It sounds logical to suppose that if one can model some<br />

stochastic phenomena by means of dynamic chaos (SDE (2)),<br />

then its filtering could be carried out through the same<br />

approach [13].<br />

Chaos modeling using SDE (2) gives an opportunity to provide the filtering of chaotic signals by means of the classical approach of nonlinear filtering for Markov processes, first proposed at the beginning of the 1960s by R. Stratonovich and H. Kushner [15], [20] and intensively developed in the last 40 years [2], [6], [7], [8], [9], [19].

It is worth mentioning here that letting the intensities in SDE (2) tend to zero has to be done with certain caution, as it formally changes the characteristics of the Markov process generated by (2).

This problem will be considered in our further publications with all necessary details; here we would like to stress that the intensities of the process noise in (2) will be considered as very small and close or equal to zero.

As follows from the above-mentioned references, the nonlinear filtering approach is, by definition, mainly an approximate one, since the differential equations for the a-posteriori Probability Density Functions (Stratonovich-Kushner equations) do not admit an analytical solution.

During more than 40 years of intensive developments,<br />

many approximate methods for non-linear filtering have been<br />

proposed. For the purpose of this paper, the most important of<br />

them will be presented in the next section.<br />

It is worth stressing here that the comparison of the<br />

accuracy of the approximate methods does not provide a<br />

sustainable certainty, mainly because their creation is rather<br />

heuristic. Moreover, for certain methods the attempts to


increase the precision by increasing the number of approximation terms, etc. can give exactly the opposite effect and reduce the accuracy [7], [19].

The main goal of this paper is to present a comparative study of some nonlinear algorithms bearing in mind possible applications to the filtering of chaotic signals provided by Lorenz, Chua and Rössler attractors in the presence of additive white noises (channel noises).
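To make the chaotic signal model concrete, the following sketch advances a Lorenz attractor driven by weak noise, in the spirit of SDE (2), by one Euler-Maruyama step. It is an illustration under several assumptions that are not taken from the paper: the classical Lorenz parameters (10, 28, 8/3), a simple explicit time discretization, and noise increments `dw` that the caller has already drawn with the variance implied by the chosen intensity and step size. The names `State`, `lorenz_drift`, and `em_step` are hypothetical.

```c
typedef struct { double x, y, z; } State;

/* Lorenz drift f(x) with the classical parameters (assumed here). */
State lorenz_drift(State s)
{
    const double sigma = 10.0, rho = 28.0, beta = 8.0 / 3.0;
    State d = { sigma * (s.y - s.x),
                s.x * (rho - s.z) - s.y,
                s.x * s.y - beta * s.z };
    return d;
}

/* One Euler-Maruyama step: x_next = x + f(x) * dt + dw.
 * dw holds the three noise increments supplied by the caller
 * (all zeros gives the noise-free attractor). */
State em_step(State s, double dt, const double dw[3])
{
    State d = lorenz_drift(s);
    State next = { s.x + d.x * dt + dw[0],
                   s.y + d.y * dt + dw[1],
                   s.z + d.z * dt + dw[2] };
    return next;
}
```

Supplying the increments from outside keeps the step deterministic and leaves the choice of random number generator to the caller.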

The rest of the work is organized as follows. In section II,<br />

Markov theory of nonlinear filtering is briefly recalled.<br />

Section III summarizes some of the approximate approaches<br />

for nonlinear filtering, while chaotic filtering is analyzed in<br />

section IV. Afterwards, numerical simulations are discussed in<br />

section V. Finally, in section VI, some conclusions are drawn.<br />

II. MARKOV THEORY OF NON-LINEAR FILTERING<br />

Let us consider the following filtering scenario, where the received signal is:

$$y(t) = s(t, x(t)) + n_0(t), \qquad (3)$$

where y(t) is the vector of the received signal with dimension m, s(·) is the vector function of the desired signal with the same dimension m, and $n_0(t)$ is the vector of additive white noises with the intensity matrix $N_0$ (m × m).

Here the signal s(·) depends on the “message” x(t), which is the subject of filtering and is modeled as an n-dimensional Markov diffusion process by means of the following SDE:

$$\dot{x} = g(t, x) + \xi(t). \qquad (4)$$

Formally, SDE (4) coincides with (2), and the vector function g(·) is similar to f(·) in (2); the matrix of intensities for ξ(·) in (4) corresponds to ε in (2).

As it is well known (see [18] <strong>and</strong> [20] for example), with<br />

this assumption the a-priori Probability Density Function, or<br />

a-priori PDF, for x(t) follows the so-called Fokker-Plank-<br />

Kolmogorov (FPK) equation:<br />

∂WPR<br />

( x,<br />

t)<br />

= −<br />

∂t<br />

1<br />

+<br />

2<br />

n<br />

n<br />

∑<br />

i=<br />

1<br />

n<br />

∑∑<br />

i=<br />

1 j=<br />

1<br />

∂<br />

[ g i ( t,<br />

x)<br />

WPR<br />

( x,<br />

t)<br />

]+<br />

∂x<br />

i<br />

∂<br />

∂x<br />

∂x<br />

i<br />

2<br />

j<br />

[ W ( x,<br />

t)<br />

]<br />

ε , (5)<br />

where WPR(x,t0) =W(x0)<br />

Equation (5) can be rewritten in another form [9], [21]:

$$ \frac{\partial W_{PR}(x,t)}{\partial t} = -\operatorname{div}\,\pi(x,t), \tag{6a} $$

or

$$ \frac{\partial W_{PR}(x,t)}{\partial t} = L_{PR}\{W_{PR}(x,t)\}, \tag{6b} $$

where $\pi(x,t)$ is a probabilistic “flow” with the components:

$$ \pi_i(x,t) = g_i(x,t)\,W_{PR}(x,t) - \frac{1}{2}\sum_{j=1}^{n}\frac{\partial}{\partial x_j}\left[\varepsilon_{ij}\,W_{PR}(x,t)\right]. \tag{7} $$

In (5)-(7), $\{g_i(x,t)\}_1^n$ are the drift coefficients and $\{\varepsilon_{ij}\}$ are the diffusion coefficients of the Markov process (in the following they are defined in the Stratonovich sense [18], [20]); $L_{PR}\{\cdot\}$ is the linear FPK operator.
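As a hedged worked example of the FPK equation, consider the scalar case $n = 1$ with a constant diffusion coefficient $\varepsilon$ (an assumption made here for simplicity). Setting the probability flow to zero gives the stationary PDF in closed form; the cubic drift in the last line is chosen only because a PDF of this quartic-exponential type appears for the Chua attractor in Section IV.

```latex
% Scalar FPK equation with constant diffusion coefficient \varepsilon
\frac{\partial W(x,t)}{\partial t}
  = -\frac{\partial}{\partial x}\bigl[g(x)\,W(x,t)\bigr]
    + \frac{\varepsilon}{2}\,\frac{\partial^{2} W(x,t)}{\partial x^{2}}

% Zero-flow (stationary) condition and its solution
g(x)\,W_{st}(x) - \frac{\varepsilon}{2}\,\frac{dW_{st}(x)}{dx} = 0
\quad\Longrightarrow\quad
W_{st}(x) = C\,\exp\!\left\{\frac{2}{\varepsilon}\int g(x)\,dx\right\}

% Example: cubic drift
g(x) = \varepsilon\,(Bx - 2Dx^{3})
\quad\Longrightarrow\quad
W_{st}(x) = C\,\exp\{Bx^{2} - Dx^{4}\}, \qquad D > 0
```

The stationary shape does not depend on $\varepsilon$ in this example, because the drift was scaled by the same $\varepsilon$; only the relaxation speed does.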

Then, as shown in [20], the integro-differential equation for the a-posteriori PDF $W_{PS}(x,t)$ can be written in the following equivalent forms:

$$ \frac{\partial W_{PS}(x,t)}{\partial t} = L_{PR}\{W_{PS}(x,t)\} + \frac{1}{2}\left[F(x,t) - \int_{-\infty}^{\infty} F(x,t)\,W_{PS}(x,t)\,dx\right] W_{PS}(x,t), \tag{8a} $$

or

$$ \frac{\partial W_{PS}(x,t)}{\partial t} = -\operatorname{div}\,\hat{\pi}(x,t) + \frac{1}{2}\left[F(x,t) - \langle F(x,t)\rangle\right] W_{PS}(x,t), \tag{8b} $$

where $\langle F(x,t)\rangle = \int_{-\infty}^{\infty} F(x,t)\,W_{PS}(x,t)\,dx$, $\hat{\pi}(x,t)$ is the flow (7) with $W_{PR}(x,t)$ replaced by $W_{PS}(x,t)$, and:

$$ F(x,t) = \left[y(t) - \tfrac{1}{2}\,s(x,t)\right]^{T} N_0^{-1} \left[y(t) - \tfrac{1}{2}\,s(x,t)\right]. \tag{9} $$

Equations (8) together with (9) are called the Stratonovich-Kushner nonlinear equations (SKE) and have a rather attractive physical interpretation: the first summand in (8) describes the dynamics of the a-priori data on $x(t)$, and the second summand describes the innovation of the a-priori data obtained from the analysis of the observations.

The optimum estimate of $x(t)$, denoted $\hat{x}(t)$, under any known optimization criterion is the result of filtering $x(t)$; it is obtained from the solution of (8), where the input signal is $y(t)$ (see (3)).

When the intensity $N_0$ of the additive noise vector is large, the influence of the first summand in (8) prevails, equation (8) turns into the FPK equation (6), and the filtering accuracy diminishes drastically. Conversely, when the signal-to-noise ratio increases, $W_{PS}(x,t)$ tends to a unimodal Gaussian PDF [7], [20]. Note that the SKE fully describes the “evolution” of $W_{PS}(x,t)$ in time but, in general, does not admit exact analytical solutions.

There are very few exceptions: the linear SDE (4), which yields the well-known Kalman filtering algorithm [2], [6]-[9], [15], [16], [18]-[21]; the Zakai approach [22], etc. Because of this, nonlinear filtering algorithms are practically always approximate.

Over more than 40 years the bibliography on nonlinear filtering algorithms has become enormous. In the next section we will consider only some of them, taking into account the following considerations:

− The models of the desired signals applied for filtering are the equations of the Lorenz, Chua and Rössler strange attractors with n = 3, i.e., of rather low dimension.

− The algorithms of interest have to be adequate for real-time applications, so they have to be of reduced computational complexity.

− The algorithms for nonlinear filtering have to be able to perform satisfactorily in scenarios with low signal-to-noise ratios (SNR), although the Gaussian assumption for $W_{PS}(x)$ is not always valid.

− $s(t, x(t)) \cong x(t)$. (10)

− All $\varepsilon_{ij}$ are equal to zero, except $\varepsilon_{11} \cong \varepsilon_1$ [1].

Advances in the cumulant statistical analysis of chaos [11], [12] under low SNR suggest that it might be reasonable to consider the application of high order cumulants (HOS) (see, for example, [9], [10], [19]), etc.

III. APPROXIMATE APPROACHES FOR NON LINEAR FILTERING

It is always “better” to approximate the a-posteriori PDF $W_{PS}(x,t)$ than the nonlinearity in (4), (8) [2], [8], [19]. In this context, let us mention the following approximate approaches for $W_{PS}(x,t)$:

− Gaussian approximations: Extended Kalman Filter (EKF) [2], [6]-[9], [15], [16], [18]-[21]; Unscented Kalman Filter (UKF) [8]; Quadrature Kalman Filter (QKF) [2]; Gauss-Hermite Quadrature Filter (GHF) [6]; Iterated Kalman Filter (IKF), etc.

− Functional approximations for $W_{PS}(x,t)$ [9], [19];

− Integral or Global approximations for $W_{PS}(x,t)$ [7];

− HOS approximations for $W_{PS}(x,t)$ [10]; etc.

Due to the lack of space, it is hardly feasible to give a complete overview of all those methods; moreover, not all of them are adequate in view of the considerations introduced at the end of Section II, but some comments will be made in Section V.

Let us start with the Extended Kalman Filter (EKF). Considering $W_{PS}(x,t)$ as a three-dimensional Gaussian PDF $\hat{W}_G(x,t)$, from (8) it is possible to obtain the following per-component equations for the mean estimates $\{\hat{x}_i\}_1^3$ and for the estimates of the elements of the a-posteriori covariance matrix $\{\hat{R}_{ij}\}_{i,j=1}^{3}$:

$$ \dot{\hat{x}}_i = \int_{-\infty}^{\infty}\left(\hat{\pi}^{T}(x,t)\,\operatorname{grad}\,x_i\right)dx + \frac{1}{2}\left[\int_{-\infty}^{\infty} x_i\,F(x,t)\,\hat{W}_G(x,t)\,dx - \hat{x}_i\int_{-\infty}^{\infty} F(x,t)\,\hat{W}_G(x,t)\,dx\right], $$

$$ \dot{\hat{R}}_{ij} = \int_{-\infty}^{\infty}\left(\hat{\pi}^{T}(x,t)\,\operatorname{grad}\,\mathring{x}_i\mathring{x}_j\right)dx + \frac{1}{2}\left[\int_{-\infty}^{\infty}\mathring{x}_i\mathring{x}_j\,F(x,t)\,\hat{W}_G(x,t)\,dx - \hat{R}_{ij}\int_{-\infty}^{\infty} F(x,t)\,\hat{W}_G(x,t)\,dx\right], \tag{11} $$

where $\mathring{x}_i = x_i - \hat{x}_i$, $\mathring{x}_j = x_j - \hat{x}_j$.

Equations (11) can also be presented in matrix form [7], [15], [19], [20], but for concrete applications the per-component representation (11) might be more suitable (see the following).

In practice, it is possible to assume that all $\hat{R}_{ij}(t)$ converge to stationary values $R_{ij}$ as $t \to \infty$; consequently, the second equation in (11) usually reduces to a system of nonlinear algebraic equations, which can be solved numerically.

This assumption can significantly simplify the implementation of the corresponding EKF algorithms for real-time scenarios.

Functional approximation for $W_{PS}(x,t)$. It follows from [9], [19]:

$$ W_{PS}(x,t) = \prod_{i=1}^{3} W_{PS}(x_i)\left[1 + \sum_{q=2}^{3}\sum_{j=1}^{q-1}\frac{R_{qj}}{R_{qq}R_{jj}}\,(x_q - \hat{x}_q)(x_j - \hat{x}_j)\right]. \tag{12} $$

From (12) we see that the Functional Approximation for the PDF is sufficiently non-Gaussian (the marginal $W_{PS}(x_i)$ are arbitrary), but for the “joint” characterization of the vector $\hat{x}$ only the elements of the a-posteriori covariance matrix $\hat{R}_{ij}$ are considered.

It can be shown that the equations for $\{\hat{x}_i\}_1^n$ and $\{\hat{R}_{ij}\}$ are the same as in (11), the only difference being that, instead of $\hat{W}_G(x,t)$, one has to substitute into (11) the approximation (12) for $W_{PS}(x,t)$. The corresponding integrals can be evaluated analytically or by the Gauss-Hermite quadrature formula [2], [6] (see below).
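The Gaussian (EKF) approximation can be sketched in code. The example below is not the continuous-time form (11): it swaps in the common discrete-time predict/update EKF with an Euler step, applied to the Lorenz drift with only $x_1$ observed. The parameter values, the small regularising process noise `q`, and the names `drift`, `jacobian`, `ekf_step` and `run_demo` are assumptions introduced for illustration.

```python
import math
import random

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # conventional Lorenz parameters (assumed)


def drift(x):
    return [SIGMA * (x[1] - x[0]),
            R * x[0] - x[1] - x[0] * x[2],
            x[0] * x[1] - B * x[2]]


def jacobian(x):
    return [[-SIGMA, SIGMA, 0.0],
            [R - x[2], -1.0, -x[0]],
            [x[1], x[0], -B]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def ekf_step(x, p, y, dt, q, r_obs):
    """One predict/update cycle; only x1 is observed, so H = [1, 0, 0]."""
    g = drift(x)
    jac = jacobian(x)
    x_pred = [x[i] + dt * g[i] for i in range(3)]
    # Linearised one-step transition matrix: I + dt * Jacobian.
    phi = [[(1.0 if i == k else 0.0) + dt * jac[i][k] for k in range(3)]
           for i in range(3)]
    phi_t = [[phi[k][i] for k in range(3)] for i in range(3)]
    p_pred = matmul(matmul(phi, p), phi_t)
    for i in range(3):
        p_pred[i][i] += q  # small regularising process noise (assumed)
    s = p_pred[0][0] + r_obs                      # innovation variance
    gain = [p_pred[i][0] / s for i in range(3)]   # Kalman gain
    innov = y - x_pred[0]
    x_new = [x_pred[i] + gain[i] * innov for i in range(3)]
    p_new = [[p_pred[i][k] - gain[i] * p_pred[0][k] for k in range(3)]
             for i in range(3)]
    # Enforce symmetry against round-off.
    p_new = [[0.5 * (p_new[i][k] + p_new[k][i]) for k in range(3)]
             for i in range(3)]
    return x_new, p_new


def run_demo(n_steps=1000, dt=0.01, r_obs=1.0, seed=1):
    """Filter a noiseless Lorenz trajectory observed through noisy x1."""
    rng = random.Random(seed)
    truth = [1.0, 1.0, 1.0]
    est = [0.0, 0.0, 0.0]
    p = [[10.0 if i == k else 0.0 for k in range(3)] for i in range(3)]
    for _ in range(n_steps):
        g = drift(truth)
        truth = [truth[i] + dt * g[i] for i in range(3)]
        y = truth[0] + math.sqrt(r_obs) * rng.gauss(0.0, 1.0)
        est, p = ekf_step(est, p, y, dt, q=1e-6, r_obs=r_obs)
    return truth, est, p
```

Because the observation is scalar, the gain and covariance update reduce to operations on the first row and column of the predicted covariance, which keeps the sketch free of matrix inversions.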

Integral or Global approximation for $W_{PS}(x,t)$. The reader has already realized that the previous two approximations for $W_{PS}(x,t)$ are in some sense “local”, because they provide the estimates $\{\hat{x}_i\}$ as the maximum of $W_{PS}(x,t)$, together with $\{\hat{R}_{ij}\}$. When the SNR is considerably high this is quite enough, but when the SNR is low one has to look for another approach, which is called the Integral approximation. This approach was proposed for successful approximation of $W_{PS}(x,t)$ including the PDF’s “tails”, i.e., over the whole span of $x$.

Let us assume that $W_{PS}(x,t)$ can be represented in the form:

$$ W_{PS}(x,t) = W_{PS}(x, \alpha(t)), \tag{13} $$

where $\alpha$ is an unknown vector of approximation parameters.

Then, applying the well-known Kullback measure as the approximation criterion, we obtain the following equation for the unknown vector $\alpha$:

$$ \dot{\alpha} = L^{+}_{PR}\{h(x,t)\} + V^{-1}(t)\,h(x,t)\,F(x,t), \tag{14} $$

where:

$$ h(x,t) = \frac{\partial \ln W_{PS}(x,\alpha(t))}{\partial \alpha}, $$

and

$$ V(t) = -\int_{-\infty}^{\infty}\left[\frac{\partial \ln W_{PS}(x,\alpha(t))}{\partial \alpha}\right]^{T} W_{PS}(x,\alpha(t))\,dx = -\frac{\partial^{2} W_{PS}(x,\alpha(t))}{\partial\alpha\,\partial\alpha^{T}}, $$

$L^{+}_{PR}\{\cdot\}$ is the operator adjoint to the FPK operator [18].

Now, as an integral approximation of $W_{PS}(x,\alpha(t))$, let us choose the so-called “Dynkin PDF”, with $\alpha(t)$ as a vector of sufficient statistics for $W_{PS}(\cdot)$:

$$ W_{PS}(x,\alpha(t)) = C\exp\left\{\sum_{p=1}^{K}\alpha_p(t)\,\varphi_p(x) + \varphi_0(x)\right\}, \tag{15} $$

where $\{\varphi_p(x)\}$ is a complete set of orthogonal multidimensional functions: Hermite, Laguerre, etc.

One can see that there is a high degree of similarity between (15) and the orthogonal series representation of $W_{PS}(x,\alpha(t))$ [18]: in both cases series of orthogonal functions are applied, but in (15) this is done for the monotonic transform $\ln\{W_{PS}(x,\alpha(t))\}$ and not for $W_{PS}(x,\alpha(t))$ itself. So, the coefficients $\{\alpha_p(t)\}$ can always be represented through the cumulants of $W_{PS}(x)$. This opens the opportunity to search for equations for the cumulants (HOS) of $W_{PS}(x,t)$ directly (see, for example, [10], [19]), instead of searching for a solution of (15), which cannot be obtained analytically.

Since the last problem was extensively tackled in the mentioned references, the HOS approach will not be fully addressed in the following. However, one comment is in order: for n > 1, equations (14) and the equations for HOS are rather complex when real-time solutions are required; for n = 1 there might be no significant difference between both methods (see [10] for details). Then, in order to apply the last two approaches to approximate nonlinear filtering of chaos, it is necessary to decrease the dimension of the SDE (4). In other words, for chaos one has to find an equation that is statistically equivalent to the SDE (4). This can be achieved by synthesizing the equivalent SDE (see [18]).

IV. FILTERING ALGORITHMS FOR CHAOTIC SIGNALS<br />

For simplicity, let us consider the following special case of the one-dimensional scenario:

$$ y(t) = x_1(t) + n_0(t), \tag{16} $$

where $x_1(t)$ is the first (observable) component of any strange attractor (Lorenz, Chua, Rössler) and $n_0(t)$ is a scalar white noise [12], [17]. For the sake of completeness, Table I presents all the features that will be required here.

Let us consider the Lorenz, Chua and Rössler attractors. It can be seen from Table I that the marginal PDFs of the components of the Lorenz attractor are practically Gaussian, or their orthogonal representation has a Gaussian kernel PDF; for the Rössler attractor the orthogonal representation with the Gaussian kernel PDF is also valid for the “x” and “y” components of the attractor [12]. The opposite situation takes place for the Chua attractor (Table I): this attractor represents a clearly non-Gaussian case.

Next, when the SNR is low, the influence of the second summand in the SKE (8) on $W_{PS}(x,t)$ is low as well, and as a first approximation it is possible to assume that the marginal a-posteriori PDFs are close to their a-priori shapes.

Therefore, it is plausible that EKF algorithms will be rather adequate in both high and low SNR scenarios for the Lorenz and Rössler attractors, but not for the Chua attractor. Now, let us consider the Chua attractor with the Integral (Global) approximation for the a-posteriori PDF, assuming (Table I) that the first component has a symmetric $W_{PS}(x_1,t)$. Supposing that $\{\varphi_i(x_i)\}_1^K$ are Hermite polynomials and K = 4, from (15) it follows:

$$ W_{PS}(x_1,t) = C\exp\{\alpha_1(t)H_1(x_1) + \alpha_2(t)H_2(x_1) + \alpha_3(t)H_3(x_1) + \alpha_4(t)H_4(x_1)\}. \tag{17} $$

With the help of the definition of the Hermite polynomials one can get for (15):

$$ W_{PS}(x_1,t) = \mathrm{Const}\cdot\exp\left[-\alpha_2(t) + 3\alpha_4(t)\right]\cdot\exp\left\{A x_1 + B x_1^2 + C x_1^3 + D x_1^4\right\}, \tag{18} $$

where:

$$ A = \alpha_1(t) - 3\alpha_3(t); \quad B = \alpha_2(t) - 6\alpha_4(t); \quad C = \alpha_3(t); \quad D = \alpha_4(t). $$

As $\{\alpha_i(t)\}_1^4$ are sufficient statistics for $W_{PS}(\cdot)$, and invoking the symmetry and normalization conditions for the a-posteriori PDF, one can get:

$$ A = C = 0, \quad C_1(\alpha_2,\alpha_4) = C\cdot\exp\{-\alpha_2(t) + 3\alpha_4(t)\}, \quad \text{and} \quad W_{PS}(x_1) = C_1(\alpha_2,\alpha_4)\cdot\exp\{B x_1^2 - D x_1^4\}, \tag{19} $$

where normalizability requires a negative coefficient of $x_1^4$, so that in (19) and below $D$ stands for $-\alpha_4(t) > 0$.
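The passage from the Hermite expansion to the power-series coefficients can be checked numerically. The sketch below assumes the probabilists' Hermite polynomials ($He_1 = x$, $He_2 = x^2 - 1$, $He_3 = x^3 - 3x$, $He_4 = x^4 - 6x^2 + 3$), which is the convention that reproduces the coefficients $A$ and $B$ quoted in the text; the function names are hypothetical.

```python
def hermite_exponent_coeffs(a1, a2, a3, a4):
    """Return (const, A, B, C, D) such that
    a1*He1 + a2*He2 + a3*He3 + a4*He4 = const + A*x + B*x^2 + C*x^3 + D*x^4."""
    const = -a2 + 3.0 * a4          # constant terms of He2 and He4
    coeff_a = a1 - 3.0 * a3         # linear terms of He1 and He3
    coeff_b = a2 - 6.0 * a4         # quadratic terms of He2 and He4
    return const, coeff_a, coeff_b, a3, a4


def exponent_direct(x, a1, a2, a3, a4):
    """Evaluate the Hermite expansion of the exponent directly."""
    he1 = x
    he2 = x * x - 1.0
    he3 = x ** 3 - 3.0 * x
    he4 = x ** 4 - 6.0 * x * x + 3.0
    return a1 * he1 + a2 * he2 + a3 * he3 + a4 * he4


def exponent_from_coeffs(x, coeffs):
    """Evaluate the same exponent from the power-series coefficients."""
    const, ca, cb, cc, cd = coeffs
    return const + ca * x + cb * x * x + cc * x ** 3 + cd * x ** 4
```

For a symmetric PDF the odd coefficients vanish, which in this parameterisation means $\alpha_3 = 0$ and then $\alpha_1 = 0$.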


TABLE I
Strange attractors and their statistical characteristics

No | Name of the strange attractor | x | g(x) | ε | W_PR(x_i) | Comments
1 | Lorenz, n = 3 | $x = [x_1, x_2, x_3]^T$ | $\sigma(x_2 - x_1)$; $R x_1 - x_2 - x_3 x_1$; $x_1 x_2 - B x_3$; $\sigma, R, B \ge 0$ | $\varepsilon_{11} = \varepsilon \to 0$; $\varepsilon_{12} = \varepsilon_{13} = \varepsilon_{23} = \varepsilon_{21} = \varepsilon_{32} = \varepsilon_{33} = 0$ | $x_1 \sim W_G(\cdot)$; $x_2 \sim W_G(\cdot)$; $x_3 \sim W_G(\cdot)$ | Normalized data; $W_G(\cdot)$ is a Gaussian PDF
2 | Chua, n = 3 | $x = [x_1, x_2, x_3]^T$ | $\beta_1(x_2 - x_1) - \alpha h(x_1)$; $\beta_2(x_1 - x_2) + \beta_4 x_3$; $-\beta_3 x_2$; $\beta_1, \ldots, \beta_4 \ge 0$, $\alpha < 0$ | $\varepsilon_{11} = \varepsilon \to 0$; $\varepsilon_{12} = \varepsilon_{13} = \varepsilon_{23} = \varepsilon_{21} = \varepsilon_{32} = \varepsilon_{33} = 0$ | $x_1 \sim C\exp(p_1 x_1^2 - q_1 x_1^4)$; $x_2 \sim W_G(\cdot)$; $x_3 \sim C\exp(p_3 x_3^2 - q_3 x_3^4)$; $p_1, p_3, q_1, q_3 > 0$ | Normalized data; $p_1 \sim 3.5$, $p_3 \sim 3.5$, $q_1 \sim 1.5$, $q_3 \sim 2.5$
3 | Rössler, n = 3 | $x = [x_1, x_2, x_3]^T$ | $-x_2 - x_3$; $x_1 + a x_2$; $b + x_3 x_1 - x_3 c$; $a, b, c \ge 0$ | $\varepsilon_{11} = \varepsilon \to 0$; $\varepsilon_{12} = \varepsilon_{13} = \varepsilon_{23} = \varepsilon_{21} = \varepsilon_{32} = \varepsilon_{33} = 0$ | $x_1 \sim W_G(x_1)\left[1 + \frac{\gamma_3^{(1)}}{3!}H_3(x_1) + \gamma_4^{(1)}H_4(x_1)\right]$; $x_2 \sim W_G(x_2)\left[1 + \frac{\gamma_3^{(2)}}{3!}H_3(x_2) + \gamma_4^{(2)}H_4(x_2)\right]$; $x_3 \sim W_G(\cdot)$ | Normalized data; $\gamma_3^{(1)}, \gamma_4^{(1)}, \gamma_3^{(2)}, \gamma_4^{(2)}$ take values ~ 0.2 and ~ 0.6

It is worth mentioning that for the case of low SNR (19) coincides with the a-priori PDF $W_{PR}(x_1,t)$ of the Chua attractor (Table I). Now, from (14) it follows:

$$ f(x_1)\,\varphi_i'(x_1) + \frac{\varepsilon}{2}\,\varphi_i''(x_1) + h_i(x_1)\,F(t,x_1) = 0, \tag{20} $$

where i = 2, 4.

The statistically equivalent SDE-1 with PDF (19) can be found in [18]:

$$ f(x_1) = \varepsilon\left(B x_1 - 2 D x_1^3\right). $$

Then, for i = 2, one gets ($\varepsilon \to 0$) the following equation:

2<br />
2<br />
4 y ( t)<br />
2<br />
[ x1<br />
− 2D<br />
x1<br />
] = [ 1 −1]<br />
1<br />
2ε<br />
B x , (21)<br />
N<br />

where

$$ \overline{x_1^{2m}} = \frac{\Gamma\left(m + \frac{1}{2}\right) D_{-m-\frac{1}{2}}(-\delta)}{\sqrt{\pi}\,D_{-\frac{1}{2}}(-\delta)\,(2D)^{\frac{m}{2}}}, \tag{22} $$

m = 1, 2, …; $D_{\nu}(\cdot)$ is the parabolic cylinder function, and $\delta = \frac{B}{\sqrt{2D}}$.

Analogously, for i = 4 ($\varepsilon \to 0$), it yields:

ε<br />
y<br />
=<br />
N<br />
4 { 4 x<br />
2<br />
( B − 6ε<br />
−12B<br />
x<br />
6<br />
− 8ε<br />
x<br />
2 ) } + 6ε<br />
x =<br />
2<br />
( t)<br />
4 [ x<br />
2<br />
− 6 x + 3]<br />
+<br />
1 6 [ x<br />
4<br />
− 6 x<br />
2<br />
+ 3 x ].<br />
0<br />
N<br />
0<br />
(23)<br />

Assuming that in (21)-(23) $y^2(t)$ tends to its stationary value $\overline{y^2(t)}$ as $t \to \infty$ and substituting into (21)-(23), one can get nonlinear algebraic equations for the stationary parameters $\alpha_2$, $\alpha_4$, which are obviously related to the a-posteriori variance (MSE) and the fourth moment (cumulant) of $W_{PS}(x_1)$.

Therefore, $\alpha_2$ can be used as a measure of the filtering accuracy, being calculated with the influence of the fourth a-posteriori moments (cumulants).

A similar approach with the application of higher-order statistics (HOS) will be presented below, where the equation for the estimate $\hat{x}_1$ of $x_1$ will obviously be the same as for the Integral Approximation.

It is worth mentioning here that for the case of low SNR so-called asymptotic algorithms can be developed as well. For example, the asymptotic filtering algorithm for $x_1(t) = x(t)$ of the Chua attractor in discrete time can be represented as:


$$ \hat{x}_{j+1} = \hat{x}_j + T_0 f(\hat{x}_j) + \sigma_{\varepsilon j}^{2}\left.\frac{d}{dx_{j+1}}\ln W_{PS}\left[(y_{j+1} - x_{j+1})\right]\right|_{x_{j+1} = \hat{x}_j}, \tag{24} $$

where $T_0$ is the sampling interval and $\sigma_{\varepsilon}^{2}$ is the a-posteriori filtering variance (MSE).

This a-posteriori variance can be calculated through $\alpha_2$ and $\alpha_4$ (see above), but it can also be found from the following equation:

$$ \hat{\sigma}_{\varepsilon, j+1}^{2} = \hat{\sigma}_{\varepsilon j}^{2} + 2\hat{\sigma}_{\varepsilon j}^{2} f'(\hat{x}_j)\,T_0 + \hat{\sigma}_{\varepsilon j}^{4}\left.\frac{\partial^{2}}{\partial x_{j+1}^{2}}\ln W_{PS}\left[(y_{j+1} - x_{j+1})\right]\right|_{x_{j+1} = \hat{x}_j}. \tag{25} $$

If the SNR is low and $n_0(t)$ is a Gaussian additive white noise, then, applying a Taylor series expansion to $\ln W_{PS}(\cdot)$, one can get in this asymptotic regime:

2<br />

σ ε<br />

xˆ<br />

j+<br />

1 = xˆ<br />

j + T0<br />

f<br />

=<br />

σ n<br />

σ = ˆ σ + 2 ˆ σ<br />

j ( xˆ<br />

j ) + 2 [ ( y j x j ) ] 2 + 1 − + 1 x j+<br />

1 xˆ<br />

j<br />

ˆ ε j + 1<br />

2<br />

ε j<br />

2<br />

ε j 2<br />

ε j<br />

which, in stationary conditions is:<br />

εT<br />

T<br />

ˆ = =<br />

j<br />

2 f<br />

ε<br />

2<br />

σ ε<br />

fˆ<br />

( xˆ<br />

j ) T0<br />

( xˆ<br />

),<br />

ε ˆ σ<br />

0<br />

0<br />

'<br />

2<br />

( xˆ<br />

) 2(<br />

B − 6 xˆ<br />

)<br />

j<br />

(26)<br />

(27)<br />

It can be seen from (27) that the accuracy of the filtering depends on the absolute value of $\hat{x}$, which is a specific feature of the asymptotic algorithm. This interesting issue follows from the dependence of $\hat{\sigma}_{\varepsilon}^{2}$ on the derivative of the nonlinear drift $f'(\hat{x}_j)$.
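As a hedged illustration of why the stationary variance depends on $|\hat{x}|$, the following sketch works out the Gaussian special case. It assumes that the observation noise samples have variance $\sigma_n^2$, that the recursions have the mean/variance form written on the first two lines, and that the drift is the cubic $f(x) = \varepsilon(Bx - 2Dx^3)$ introduced earlier in this section; it is a derivation under these assumptions, not a transcription of (26)-(27).

```latex
% Assumed recursions (mean and variance), with \ell(x) = \ln W(y_{j+1} - x)
\hat{x}_{j+1} = \hat{x}_j + T_0 f(\hat{x}_j)
   + \hat{\sigma}_{\varepsilon j}^{2}\,\ell'(\hat{x}_j), \qquad
\hat{\sigma}_{\varepsilon,j+1}^{2} = \hat{\sigma}_{\varepsilon j}^{2}
   + 2\hat{\sigma}_{\varepsilon j}^{2} f'(\hat{x}_j) T_0
   + \hat{\sigma}_{\varepsilon j}^{4}\,\ell''(\hat{x}_j)

% Gaussian noise: \ell(x) = -\frac{(y_{j+1}-x)^2}{2\sigma_n^2} + \mathrm{const}
\ell'(x) = \frac{y_{j+1} - x}{\sigma_n^{2}}, \qquad
\ell''(x) = -\frac{1}{\sigma_n^{2}}

% Resulting recursions
\hat{x}_{j+1} = \hat{x}_j + T_0 f(\hat{x}_j)
   + \frac{\hat{\sigma}_{\varepsilon j}^{2}}{\sigma_n^{2}}\,(y_{j+1} - \hat{x}_j), \qquad
\hat{\sigma}_{\varepsilon,j+1}^{2} = \hat{\sigma}_{\varepsilon j}^{2}
   + 2\hat{\sigma}_{\varepsilon j}^{2} f'(\hat{x}_j) T_0
   - \frac{\hat{\sigma}_{\varepsilon j}^{4}}{\sigma_n^{2}}

% Stationary variance (non-trivial root, meaningful when f'(\hat{x}) > 0)
\hat{\sigma}_{\varepsilon}^{2} = 2\,\sigma_n^{2}\,T_0\,f'(\hat{x}), \qquad
f'(\hat{x}) = \varepsilon\,(B - 6D\hat{x}^{2})
```

Under these assumptions the stationary variance is proportional to the local slope of the drift, so it changes with $\hat{x}^2$, which is the dependence the paragraph above points out.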

Now, let us take the low SNR scenario and apply the Functional Approximation (12) for $W_{PS}(x,t)$. In the low SNR case $W_{PS}(x,t)$ becomes:

$$ W_{PS}(x,t) = C\exp\left(p_1 x_1^2 - q_1 x_1^4\right)\exp\left(p_3 x_3^2 - q_3 x_3^4\right)\frac{1}{\sqrt{2\pi\hat{R}_{22}}}\exp\left(-\frac{x_2^2}{2\hat{R}_{22}}\right)\left[1 + \sum_{i=2}^{3}\sum_{j=1}^{i-1}\frac{R_{ij}\,(x_j - \hat{x}_j)(x_i - \hat{x}_i)}{R_{ii}R_{jj}}\right]. \tag{28} $$

Substituting (28) into (11), after simple but cumbersome manipulations, one can get:

x& ˆ1 = −2εxˆ<br />

1(<br />

p1<br />

+ q1)<br />

+ 2εq<br />

ˆ 1R11<br />

+<br />

2 ( y ( t)<br />

− R )<br />

2(<br />

y(<br />

t)<br />

− xˆ<br />

) Rˆ<br />

xˆ<br />

+ . (29)<br />

N<br />

1 11 1 −<br />

ˆ<br />

11<br />

0 N0<br />

If $\varepsilon \to 0$ (see Section I) and the SNR is low, then from (29) and (11) it follows:

$$ \dot{\hat{x}}_1 = -2\varepsilon\hat{x}_1(p_1 + q_1) + \frac{2\left(y(t) - \hat{x}_1\right)}{N_0}\hat{R}_{11}, \tag{30} $$

and one can immediately obtain:

$$ \dot{\hat{R}}_{11} = -\frac{\varepsilon}{2} + \frac{\hat{R}_{11}^{2}}{N_0} + 4\varepsilon(p_1 + q_1)\hat{R}_{11}. \tag{31} $$

One can see that (30), (31) coincide totally with the EKF for the single component $x_1$. Why did this happen? The answer is simple: it happened because of the practical linearity of the equations of the Chua attractor with the exception of $h(x_1)$, the symmetry of $W_{PS}(x,t)$ in all arguments and the symmetry of $h(x_1)$, which finally provides an “implicit linearization” of the SDE for $x_1$ in the case of the Chua attractor.

It is also interesting that for the analyzed scenario the statistically equivalent SDE for $x_1(t)$ is practically linear with time constant $2D(p_1 + q_1)$.

For $t \to \infty$, $\hat{R}_{11}(t)$ tends to its stationary value $R_{11}$, which coincides with the a-posteriori variance or MSE and can be simply calculated as:

$$ R_{11} = \frac{-4\varepsilon(p_1 + q_1) + \sqrt{16\varepsilon^{2}(p_1 + q_1)^{2} + \dfrac{2\varepsilon}{N_0}}}{\dfrac{2}{N_0}} \ge 0, \tag{32} $$

and, invoking $\varepsilon \to 0$, $R_{11} \cong 0.71\cdot\sqrt{N_0\varepsilon}$.
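The stationary value and its small-$\varepsilon$ limit can be checked numerically. The sketch below assumes that the stationary condition is the quadratic $R_{11}^2/N_0 + 4\varepsilon(p_1+q_1)R_{11} - \varepsilon/2 = 0$ and takes $p_1 = 3.5$, $q_1 = 1.5$ from Table I; the function names are hypothetical.

```python
import math


def stationary_r11(eps, n0, p1, q1):
    """Positive root of R^2/N0 + 4*eps*(p1+q1)*R - eps/2 = 0."""
    a = 4.0 * eps * (p1 + q1)
    return 0.5 * n0 * (-a + math.sqrt(a * a + 2.0 * eps / n0))


def stationary_residual(r, eps, n0, p1, q1):
    """Left-hand side of the assumed stationary quadratic; zero at the root."""
    return r * r / n0 + 4.0 * eps * (p1 + q1) * r - 0.5 * eps


def small_eps_limit(eps, n0):
    """Asymptote sqrt(N0*eps/2), i.e. about 0.71*sqrt(N0*eps)."""
    return math.sqrt(0.5 * n0 * eps)
```

For example, with `eps = 1e-8` and `n0 = 1.0` the exact root and the asymptote agree to within about one percent, and both shrink as the square root of $\varepsilon$.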

If one assumes that $N_0 \cong 1$, then $R_{11}$ is almost zero and does not depend on the SNR, i.e., it is a singular case!

Some further developments with the help of HOS can be<br />

achieved for the case of n=1, assuming that the nonlinear<br />

statistically equivalent SDE for x1 is [18]:<br />

ε<br />

2<br />

x & 1 = − ( p1x1<br />

− 2q1x1<br />

) + ξ ( t)<br />

ε . (33)<br />

2<br />

Then, it can be shown that with the help of the first four<br />

cumulants (HOS), the filtering equations are [10]:<br />

ε<br />

2<br />

& κ1 = − ( p1κ1<br />

− 2q1κ1<br />

) + F'<br />

( κ1<br />

) κ 2 −<br />

2<br />

2 1<br />

2<br />

− ε ( p 1 κ1<br />

− 2q1κ<br />

1 ) κ 2 + F'<br />

'(<br />

κ1)<br />

( κ 4 + 2κ<br />

2 ) = 0,<br />

(34)<br />

2<br />

where, as before, the upper line denotes the time averaging procedure, $\kappa_i$ denotes the i-th cumulant, $\kappa_3 = 0$, $\kappa_4 \cong -2\kappa_2^{2}$ (see [4]), $\kappa_2 = R_{11}^{H}$, $\hat{x}_1 = \kappa_1$, and $\kappa_3$ and $\kappa_4$ coincide with their a-priori values for the low SNR case.


From (34) it easily follows:<br />

2<br />

& κ = −2ε<br />

( p κ − 2q<br />

κ ) + F'<br />

( κ ) κ<br />

1<br />

1 1<br />

1 1<br />

2ε<br />

( pκ<br />

1 − 2qκ<br />

1 ) ⎡<br />

κ 2 =<br />

⎢ 1+<br />

F'<br />

'(<br />

κ1)<br />

⎢<br />

⎣<br />

2<br />

1<br />

2<br />

F'<br />

'(<br />

κ ) ε<br />

1<br />

2 2<br />

[ 2(<br />

pκ<br />

1 − 2qκ<br />

1 ) ]<br />

⎤<br />

−1⎥.<br />

⎥<br />

⎦<br />

(35)<br />

After some simple but rather cumbersome algebra, one can find that for the case of low SNR $R_{11}^{H} \sim \sqrt{N_0\varepsilon}$.

So, $R_{11} \ge R_{11}^{H}$, which coincides with the Cramér-Rao bounds for nonlinear filtering, but also tends to zero!

Therefore, as follows from (30) and (35), the EKF shows its adequacy for application to the case of the Chua attractor, at least for low SNR scenarios.

Taking into account that all analytical developments were made with a certain degree of approximation, it is mandatory to check them by numerical simulations. The corresponding results are presented in the next section.

Finally, let us make some general observations regarding the singularity of the chaos filtering problem.

From (1) it definitely follows that its solution is

$$ x(t) = \Phi(t_0, t, x_0), \tag{36} $$

defined by $x_0$ and $f(x)$.

Then, from the SKE (8):

where<br />

W<br />

PS<br />

( t,<br />

x)<br />

= Cδ<br />

0<br />

t<br />

⎪⎧<br />

⋅ exp⎨∫<br />

F<br />

⎪⎩ t0<br />

det<br />

with the elements<br />

( t , t,<br />

x)<br />

0<br />

⎪⎧<br />

T<br />

∂Φ<br />

0<br />

⎨<br />

⎪⎩<br />

∂x<br />

j<br />

0<br />

( Φ(<br />

t , t,<br />

x ) − x ) det(<br />

t , t,<br />

x)<br />

( t , t,<br />

x)<br />

0<br />

[ τ , Φ(<br />

t , t,<br />

x)<br />

] dτ<br />

,<br />

T ⎡∂Φ<br />

= det⎢<br />

⎣ ∂x<br />

h<br />

⎪⎫<br />

⎬<br />

⎪⎭<br />

i,<br />

j=<br />

1<br />

0<br />

≠ 0<br />

and δ(·) is a delta function; F[⋅] is given by (9).<br />

then<br />


0<br />

⎪⎫<br />

⎬<br />

⎪⎭<br />

( t , , x)<br />

⎤ 0 t<br />

⎥⎦<br />

Taking into account that the fundamental matrix is:<br />

0<br />

⋅<br />

(37)<br />

(38)<br />

(39)<br />

x( t) = Φ(<br />

t0<br />

, t,<br />

x0<br />

)<br />

, (40)<br />

x = Φ t , t,<br />

x).<br />

0<br />

( 0<br />

(41)<br />

If<br />

and if N0→∞, then<br />

W PS<br />

y t) = Φ(<br />

t , t,<br />

x ) + n ( t)<br />

( 0 0 0<br />

( x Φ(<br />

t , t,<br />

x)<br />

) det(<br />

t , , x),<br />

( t,<br />

x) = Cδ<br />

− 0<br />

0 t<br />

(42)<br />

(43)<br />

i.e., it is nonzero only when the filtering solution is $\Phi(t_0, t, x_0)$.

When $N_0 \to 0$, $W_{PS}(t,x)$ is not equal to zero if and only if $x_0 = \Phi(t_0, t, x)$, i.e., once more it is singular.

So, in both of these limiting cases $W_{PS}(t,x)$ “memorizes” the solution of (1).

For approximate algorithms, these phenomena take place starting from some low but finite values of $N_0$.

V. NUMERICAL SIMULATIONS<br />

For the numerical simulations the same attractors as before were considered: Rössler, Lorenz and Chua, but with the process noise in (2) neglected. This is done in order to verify how fast the a-posteriori variance or MSE tends to zero, independently of the SNR level, which is actually the sign of the singularity of filtering.

From the previous analysis, the EKF algorithm seems to work practically in singular conditions: the algorithm fully exploits the a-priori information on the attractor and the output signals are deterministic, so the results do not have to depend on the SNR. This is another reason for an optimistic prognosis for the EKF in low SNR scenarios in the case of chaos filtering.

The conditions on the process noise (see (2)) were analyzed in detail, and it was shown that even a small fraction of process noise causes a drastic growth of the MSE, which is not acceptable for filtering. So, the solution is definitely to let this noise tend to zero.

In order to compare the efficiency and accuracy of the<br />
above-mentioned nonlinear approaches, the Rössler, Lorenz and<br />
Chua attractors are filtered (estimated) using the EKF, the<br />
unscented Kalman filter (UKF) [8], the Gauss-Hermite<br />
quadrature filter (GHF) [6], and the Quadrature Kalman filter<br />
(QKF) [2]. Before proceeding with the comparisons, brief<br />
descriptions of these nonlinear filters are given.<br />

Unscented Kalman filter (UKF)<br />

The UKF is based on the unscented transformation, which<br />
rests on the idea that it is easier to approximate a probability<br />
distribution than an arbitrary nonlinear function. To achieve<br />
this, a set of sigma points with adequate mean and covariance<br />
is chosen (see also the “Functional Approximation method”<br />
mentioned in [9], [18] and [19]). This approach differs from<br />
particle filters because the sigma points are chosen in a<br />
deterministic way instead of randomly as in particle filters [5],<br />
[8].<br />

This method does not include the linearization of the state<br />
and/or output equations. However, although the sigma weights are<br />
computed before the filtering process begins, the sigma points<br />
need to be calculated in each iteration of the algorithm, and<br />
afterwards the sigma points must be propagated through the<br />
nonlinear system [5], [8].<br />
A detailed derivation of the UKF algorithm is given in [8].<br />
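The sigma-point idea can be illustrated with a minimal scalar sketch.<br />
It is not the UKF of [8]: the function name, the choice kappa = 2 and the<br />
test nonlinearity f(x) = x^2 are assumptions made only for illustration.<br />

```python
import math

def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using 3 sigma points.

    Hypothetical helper for illustration; kappa is an assumed tuning value.
    """
    n = 1  # state dimension
    spread = math.sqrt((n + kappa) * var)
    # Sigma points are placed deterministically around the mean.
    points = [mean, mean + spread, mean - spread]
    # Weights depend only on n and kappa, so they can be computed beforehand.
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Each sigma point is propagated through the nonlinearity (no Jacobian).
    images = [f(p) for p in points]
    y_mean = sum(w * y for w, y in zip(weights, images))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, images))
    return y_mean, y_var

ut_mean, ut_var = unscented_transform_1d(1.0, 0.25, lambda x: x * x)
```

For this quadratic example the transformed mean equals mean^2 + var, which<br />
is the exact value for a Gaussian input.<br />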

Gauss-Hermite quadrature filter (GHF)<br />

As is well known, the Gauss-Hermite quadrature rule allows one<br />
to approximate integrals of the form<br />
$$I = \int_{\mathbb{R}^n} f(t)\,\frac{1}{\left((2\pi)^n \det\Sigma\right)^{1/2}} \exp\!\left(-\frac{1}{2}(t-x)^T \Sigma^{-1}(t-x)\right) dt, \tag{44}$$<br />
where $\Sigma$ is the covariance matrix.<br />

The Gauss-Hermite quadrature rule is given by<br />
$$\int_{-\infty}^{\infty} f(x)\,\frac{1}{(2\pi)^{1/2}}\,e^{-x^2}\,dx = \sum_{i=1}^{m} w_i f(q_i), \tag{45}$$<br />
which holds exactly for all polynomials of degree up to 2m-1, where<br />
$q_i$ are the quadrature points and $w_i$ the corresponding weights<br />
[6]. So, through (45) the GHF algorithm approximates the<br />
integrals involved in the Gaussian estimation. It is important<br />
to remark that the quadrature points and quadrature weights<br />
used by the GHF algorithm are computed before the filtering<br />
process is started. Therefore, this method requires less<br />
computational effort than the UKF algorithm.<br />
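The exactness property can be checked with a small sketch of the classical<br />
three-point rule (m = 3) for the weight exp(-x^2). The constant factor of<br />
(45) is left out here, which is an assumption of this example; it can be<br />
absorbed into the weights.<br />

```python
import math

# Three-point Gauss-Hermite rule for the weight exp(-x^2).
# The nodes are the roots of the Hermite polynomial H_3(x) = 8x^3 - 12x.
NODES = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
WEIGHTS = [math.sqrt(math.pi) / 6,
           2 * math.sqrt(math.pi) / 3,
           math.sqrt(math.pi) / 6]

def gauss_hermite(f):
    """Approximate the integral of f(x) * exp(-x^2) over the real line."""
    return sum(w * f(q) for w, q in zip(WEIGHTS, NODES))

# Degree 4 <= 2m - 1 = 5: the rule is exact.
approx_x4 = gauss_hermite(lambda x: x ** 4)
exact_x4 = 0.75 * math.sqrt(math.pi)

# Degree 6 > 5: the rule is only an approximation.
approx_x6 = gauss_hermite(lambda x: x ** 6)
exact_x6 = 1.875 * math.sqrt(math.pi)
```

Because the nodes and weights are constants, they are fixed once before<br />
any filtering starts, in line with the remark above.<br />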

Quadrature Kalman filter (QKF)<br />

The QKF is a simplified version of the GHF, which<br />
considers the nonlinear filtering problem from a statistical<br />
linear regression (SLR) point of view. In other words, the QKF<br />
uses SLR to linearize a nonlinear function by means of a set of<br />
Gauss-Hermite quadrature points and weights [2]. Although the<br />
QKF is algebraically equivalent to the GHF, in simulations<br />
the filtering process carried out with the QKF algorithm runs<br />
faster than the filtering performed through the GHF. The<br />
QKF algorithm was first derived in [2].<br />

Comparison between nonlinear filters<br />

The computational complexity of the algorithms is briefly<br />
presented in Table II, which lists additions<br />
(subtractions), multiplications (divisions), Cholesky<br />
decompositions, Jacobian calculations (linearization) and<br />
nonlinear propagations.<br />

From Table II, it can easily be seen that the UKF involves the<br />
greatest complexity, while the EKF seems to be the simplest<br />
algorithm. However, the linearization process performed by<br />
the Jacobian calculation involves partial derivatives. For that<br />
reason, and depending on the mathematical model of the<br />
attractor, the EKF may not always be the fastest algorithm,<br />
although in our study it is not the case.<br />

TABLE II. COMPUTATIONAL COMPLEXITY<br />
Operation | EKF | UKF | GHF | QKF<br />
Additions | 8 | 50 | 25 | 25<br />
Multiplications | 15 | 77 | 33 | 40<br />
Cholesky decomposition | 1 | 2 | 2 | 2<br />
Nonlinear propagation | 0 | 15 | 21 | 6<br />
Jacobian calculation | 1 | 0 | 0 | 0<br />

In addition, the complexity of each algorithm is assessed<br />
by measuring the time consumed by the different filtering<br />
methods. To this end, the algorithms were applied to the<br />
chaotic attractors, which evolved over 3000 sample times.<br />

Note that the time plots of the above-mentioned attractors<br />
are completely different: the Lorenz attractor produces<br />
noise-like chaos, while the Rössler and Chua outputs look<br />
more like “modulated sine waves”. Thus, for the same filtering<br />
accuracy, the Lorenz outputs have to be sampled more frequently<br />
(“oversampled”) than the Rössler and Chua signals.<br />
Oversampling has to be applied in order to achieve a<br />
required filtering performance, and this statement will be<br />
illustrated with concrete data on the sampling times for the<br />
attractors considered in this work.<br />

In other words, larger sampling times demand better<br />
accuracy of the filtering procedures. Consequently, extremely<br />
large sample periods may destroy the effectiveness of the filtering<br />
algorithms, while very small sample times require a large<br />
amount of data storage and faster processors.<br />
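As a rough sketch of how such sampled attractor outputs can be generated,<br />
the Lorenz system may be integrated with a fixed step. The classical<br />
parameters (sigma = 10, rho = 28, beta = 8/3), the Runge-Kutta integrator,<br />
the initial state and the step dt = 0.01 are assumptions of this example<br />
and are not taken from the simulations reported here.<br />

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (assumed classical parameters)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def sample_trajectory(f, state, dt, n_samples):
    """Return n_samples states spaced dt apart, starting from `state`."""
    samples = [state]
    for _ in range(n_samples - 1):
        state = rk4_step(f, state, dt)
        samples.append(state)
    return samples

# 3000 samples, matching the record length used in the comparison.
trajectory = sample_trajectory(lorenz, (1.0, 1.0, 1.0), dt=0.01, n_samples=3000)
```

Changing dt in this sketch is a simple way to explore the trade-off between<br />
sampling time, storage and filtering accuracy described above.<br />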

It is worth mentioning that the algorithms were executed<br />
on an Intel Core 2 6420 @ 2.13 GHz with 1.5 GB of RAM.<br />

Fig. 1 shows the MSE versus SNR for Chua attractor when<br />

the process noise is not present.<br />

Fig. 1. MSE vs. SNR for Chua attractor.<br />

As can be seen, the EKF is outperformed by the GHF, QKF<br />
and UKF. This is because the approximation carried out by the<br />
EKF through linearization is not as good as the Gaussian<br />
approximations (GHF and QKF) or the unscented<br />
transformation (UKF). Even so, the MSE generated by the<br />
EKF is really small for SNR not less than 0.5.<br />
For the Lorenz attractor, the performance of the nonlinear<br />
filters is depicted in Fig. 2.<br />

Fig. 2. MSE vs. SNR for Lorenz attractor.<br />

From the previous figure, it can be observed that the GHF, QKF<br />
and EKF give better results than the UKF when the nonlinear<br />
filters are applied to the Lorenz chaotic system. This is due to the<br />
a-posteriori distribution of the Lorenz attractor, which can be<br />
better approximated by the Gaussian filters (GHF and QKF),<br />
while the linearization approach involved in the EKF is sufficient<br />
to approximate the chaotic dynamics.<br />

It can also be deduced that the UKF does not match the performance<br />
of the Gaussian filters and the EKF, even though the UKF is the most<br />
complex algorithm.<br />

Finally, the results obtained from using the nonlinear filters<br />

on Rössler attractor are depicted in Fig. 4.<br />

Fig. 4. MSE vs. SNR for Rössler attractor.<br />

For the Rössler system, the EKF works better than the GHF, QKF and<br />
UKF. This is because the a-posteriori PDF of the Rössler<br />
attractor can be successfully represented through a Gram-Charlier<br />
series [12], such that the approximation by linearization<br />
is close to the real system. Notice that the MSE for the GHF and<br />
QKF does not tend to zero.<br />

Another way to compare the nonlinear filters is through the<br />
time needed to complete the filtering process for the<br />
different attractors. As mentioned above, only the first 3000<br />
samples of the nonlinear filtering processes are considered.<br />
Table III is intended to give an idea of the<br />
efficiency of the filtering algorithms.<br />

It is important to remark that, although the QKF executes<br />
faster than the EKF, GHF and UKF, the EKF algorithm is<br />
simple and fast enough to be considered a good choice for<br />
chaotic filtering.<br />

Consequently, corroborating the analysis presented in the<br />
previous sections, the EKF is suggested as the best option for<br />
filtering and estimation of the Chua, Lorenz and Rössler<br />
attractors.<br />

VI. CONCLUSION<br />

In this report the effectiveness of the extended Kalman filter<br />
(EKF), unscented Kalman filter (UKF), Gauss-Hermite<br />
quadrature filter (GHF), and Quadrature Kalman filter<br />
(QKF) is compared for state estimation of chaotic<br />
attractors, in both high and rather low SNR scenarios.<br />

It was shown that, in contrast to SDE modeling of non-Gaussian<br />
signals, the chaos representation of statistically<br />
equivalent signals (in terms of PDFs) effectively forces<br />
the filtering algorithms into singular conditions<br />
even at rather low SNRs; as a consequence, high filtering<br />
accuracy (low MSE) is achieved, practically invariant<br />
to the SNR level.<br />

This fact follows from the absence of process noise<br />
components in the SDE of chaos, and it was first predicted<br />
theoretically in [13]. To the best of our knowledge, this<br />
phenomenon has not been discussed in the existing literature.<br />

On the basis of the filtering results, the analysis shows that the<br />
EKF achieves very acceptable performance for the Chua, Lorenz<br />
and Rössler attractors. Although the UKF, GHF and QKF work<br />
better than the EKF for Chua and Lorenz, these filters might be<br />
much more complex for real-time implementations.<br />

TABLE III. SIMULATION TIME FOR CHAOTIC ATTRACTORS<br />
Attractor | EKF | UKF | GHF | QKF<br />
Chua | 1.30 s | 2.07 s | 1.52 s | 1.07 s<br />
Lorenz | 1.26 s | 2.19 s | 1.58 s | 1.09 s<br />
Rössler | 1.25 s | 2.17 s | 1.62 s | 1.08 s<br />

From a computational complexity point of view, the EKF<br />
and QKF require less effort than the GHF and UKF, while the<br />
UKF involves the most complicated filtering procedure.<br />
Finally, it is important to remark that the EKF algorithm is the<br />
one with the smallest code, so together with the previous<br />
observations, the analysis suggests the EKF as the better filtering<br />
choice for real-time applications.<br />

ACKNOWLEDGMENT<br />

The authors would like to acknowledge the valuable help of<br />
Dr. Fernando Ramos and M. Sc. Beatriz Rodríguez in the<br />
preparation of this paper.<br />



REFERENCES<br />

[1] Anischenko, V. S. et al., “Statistical properties of dynamical chaos”, Physics-Uspekhi, vol. 48, no. 2, pp. 151-166, 2005.<br />
[2] Arasaratnam, I. et al., “Discrete-Time Nonlinear Filtering Algorithms Using Gauss-Hermite Quadrature”, Proceedings of the IEEE, vol. 95, no. 5, pp. 953-977, 2007.<br />
[3] Chui, C. K., and Chen, G., Kalman Filtering with Real-Time Applications, Springer-Verlag Berlin Heidelberg, 1999.<br />
[4] Eckmann, J., and Ruelle, D., “Ergodic Theory and Strange Attractors”, Reviews of Modern Physics, vol. 57, pp. 617-656, July 1985.<br />
[5] Haykin, S., Kalman Filtering and Neural Networks, John Wiley & Sons, 2001.<br />
[6] Ito, K., and Xiong, K., “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, vol. 45, no. 5, pp. 910-927, 2000.<br />
[7] Jazwinski, A., Stochastic Processes and Filtering Theory, Academic Press, New York, 1970.<br />
[8] Julier, S. J., et al., “Unscented Filtering and Nonlinear Estimation”, Proceedings of the IEEE, vol. 92, no. 3, pp. 401-422, 2004.<br />
[9] Kazakov, I., and Artemiev, V., Optimization of Dynamic Systems with Random Structure, Nauka, 1980. (In Russian).<br />
[10] Kontorovich, V., “Non-Linear Filtering for Markov Stochastic Processes using High-Order Statistics (HOS) Approach”, Non-Linear Analysis: Theory, Methods and Applications, vol. 30, no. 5, pp. 3165-3170, 1997.<br />
[11] Kontorovich, V., “Applied Statistical Analysis for Strange Attractors and Related Problems”, Mathematical Methods in the Applied Sciences, vol. 30, pp. 1705-1717, 2007.<br />
[12] Kontorovich, V., et al., “Analysis of Rössler Attractor and its Applications”, Special Issue on Nonlinear Dynamics and Synchronization in The Open Cybernetics and Systemics Journal, 2009. (In press)<br />
[13] Kontorovich, V., and Lovtchikova, Z., “Nonlinear filtering algorithms for chaotic signals: a comparative study”, Proceedings of INDS’09, Second International Workshop on Nonlinear Dynamics and Synchronization, Klagenfurt, Austria, pp. 221-227, July 2009.<br />
[14] Kontorovich, V., and Lovtchikova, Z., “Cumulant analysis of strange attractors. Theory and applications”, Recent Advances in Nonlinear Dynamics and Synchronization, SCI 254, 2009. (In press)<br />
[15] Kushner, H., “Dynamical Equations for Optimal Nonlinear Filtering”, Journal of Differential Equations, vol. 3, pp. 179-190, 1967.<br />
[16] Kushner, H., and Budhiraja, A., “A Nonlinear Filtering Algorithm Based on an Approximation of the Conditional Distribution”, IEEE Trans. on Automatic Control, vol. 45, no. 3, pp. 580-585, March 2000.<br />
[17] Mijangos, M., Kontorovich, V., and Aguilar-Torrentera, J., “Some Statistical Properties of Strange Attractors: Engineering View”, Journal of Physics: Conference Series, vol. 96, 012147 (6pp), March 2008.<br />
[18] Primak, S., Kontorovich, V., and Lyandres, V., Stochastic Methods and their Applications to Communications: Stochastic Differential Equations Approach, John Wiley & Sons, 2004.<br />
[19] Pugachev, V., and Sinitsyn, I., Stochastic Differential Systems. Analysis and Filtering, John Wiley & Sons, 1987.<br />
[20] Stratonovich, R., Topics in the Theory of Random Noise, vol. 1 and vol. 2, Gordon and Breach, 1963.<br />
[21] Van Trees, H., Detection, Estimation and Modulation Theory, John Wiley & Sons, 2001.<br />
[22] Zakai, M., “On the Optimal Filtering of Diffusion Processes”, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 11, pp. 230-243, 1969.<br />



Nonlinear Feature Extraction Approaches for<br />

Scalable Face Recognition Applications<br />

Hima Deepthi Vankayalapati<br />

Institute of Smart Systems Technologies<br />

University of Klagenfurt<br />

9020 Klagenfurt, Austria<br />

hvankaya@edu.uni-klu.ac.at<br />

Abstract—The human skill of identifying thousands of people<br />
even after many years has motivated many researchers to focus on<br />
face recognition systems. The majority of real world applications<br />
demand more robust, scalable and computationally efficient<br />
face recognition techniques which can operate under complex<br />
viewing and environmental conditions. The appearance based<br />
linear subspace techniques are very useful in data classification<br />
and dimensionality reduction tasks; however, these algorithms<br />
classify only linear data. The scalability of the linear subspace<br />
techniques is limited, as the computational load and memory<br />
requirements increase dramatically with the size of the database. This<br />
paper evaluates different nonlinear feature extraction approaches<br />
for face recognition applications, namely the wavelet transform, the radon<br />
transform and cellular neural networks (CNN). In this work, the<br />
combination of radon and wavelet transform based approaches is<br />
used to extract multi-resolution features, which are invariant<br />
to facial expression and illumination conditions. The efficiency of<br />
the stated wavelet and radon based nonlinear approaches<br />
is demonstrated with simulation results obtained<br />
on the FERET database. This paper also presents the use of<br />
CNN for extracting nonlinear facial features, improving the<br />
recognition rate as well as the computational speed compared to the<br />
other stated nonlinear approaches on the ORL database.<br />

Index Terms—Feature extraction, Face recognition, Linear<br />

subspace techniques, Cellular neural network, Wavelet transform,<br />

Radon transform.<br />

I. INTRODUCTION<br />

In computer vision, a feature is a set of measurements. Each<br />
measurement contains a piece of information and specifies a<br />
property or characteristic of the object present in the image<br />
[1]. Linear features are more advantageous when the given<br />
data is Gaussian distributed in terms of mean. However, in most<br />
real world face recognition applications, the facial features of the<br />
face image are not purely Gaussian distributed (they vary with<br />
complex viewing and environmental conditions).<br />

Researchers have developed various biometric techniques to<br />
identify or recognize persons by their physical characteristics<br />
such as finger, voice and face. These biometric techniques have<br />
their own advantages and drawbacks [2]. Among all<br />
the biometric techniques, face recognition has the distinct<br />
advantage of collecting the required data (i.e., the image) without<br />
any cooperation from the person [3]. Face recognition is<br />
a complex visual classification task which plays an important<br />
role in computer vision, image processing and pattern recognition.<br />

Kyandoghere Kyamakya<br />
Institute of Smart Systems Technologies<br />
University of Klagenfurt<br />
9020 Klagenfurt, Austria<br />
kyandoghere.kyamakya@uni-klu.ac.at<br />

Research on face recognition started around the<br />
1960s [4]. Different face recognition techniques have been<br />
proposed during the last decades, namely feature based, model<br />
based and appearance based techniques [5], [6]. Feature<br />
based techniques describe the position<br />
and size of each feature (eye, nose, mouth or face outline)<br />
[7]. In this approach, extracting features under different poses<br />
(viewing conditions) and lighting conditions is a very complex<br />
task. For applications with large databases, there is a large<br />
set of features with different sizes and positions, making it<br />
difficult to identify the required feature points [8]. In the<br />
model based approach, a 3D model is constructed based on<br />
the facial variations in the image or important information<br />
related to the image. The difficulties in this approach are that<br />
a very expensive camera (stereo vision) is needed to capture the<br />
facial variations clearly; furthermore, constructing the 3D model is<br />
difficult and takes more time for<br />
large databases [6]. Obtaining large amounts of 3D data is also<br />
a complex task, which makes the model based<br />
methods unsuitable for real world applications dealing with<br />
large databases.<br />

In the 1990s, researchers introduced appearance based linear<br />
subspace techniques, which are statistics related techniques, to solve face<br />
recognition problems. The introduction of the linear subspace<br />
techniques is a milestone in face recognition. The<br />
performance of appearance based techniques heavily depends<br />
on the quality of the features extracted from the image [9]. The<br />
appearance based linear subspace techniques extract global<br />
features, as these techniques use statistical properties like<br />
the mean and variance of the image [6]. The major difficulty<br />
in applying these techniques to large databases is that the<br />
computational load and memory requirements for calculating<br />
features increase dramatically with the size of the database [3]. In order<br />
to increase the performance of the face recognition techniques,<br />
nonlinear feature extraction techniques are introduced.<br />

In order to improve the performance of the face recognition<br />
technique, we have to extract both linear and nonlinear features.<br />
There are many nonlinear feature extraction techniques,<br />
such as the radon transform and the wavelet transform. The radon<br />
transform based nonlinear feature extraction gives the direction<br />
of local features. This process extracts the spatial frequency<br />
components in the direction in which the radon projection is computed<br />
[10]. When features are extracted using the radon transform, the<br />
variations in this facial frequency are also boosted [10]. The<br />
wavelet transform gives the spatial and frequency components<br />
present in an image [11]. However, these nonlinear feature<br />
extraction techniques are computationally expensive. In order<br />
to improve the computational speed of the nonlinear feature<br />
extraction process, the cellular neural network (CNN) concept<br />
is proposed.<br />

The novel scheme will involve, at its heart, CNN based<br />

processors, which will be the key component of the analog<br />

computing based ultra-fast solver for image processing tasks.<br />

CNN based analog computing has the very attractive advantage<br />

of easy implementation or emulation on digital platforms.<br />

The objective of this paper is to present the use of CNN in<br />

extracting nonlinear features using the ORL database.<br />

The paper is organized as follows: in section 2, the importance<br />
and methodologies of the linear subspace techniques<br />
are explained briefly. In section 3, the basics and importance<br />
of the radon transform are explained briefly. In section 4, the<br />
wavelet transform is briefly described. In section 5, the cellular<br />
neural network is introduced, and the genetic algorithm based template<br />
calculation method is also briefly described.<br />
The experimental simulation results using the FERET and<br />
the ORL databases are described in section 6. Section 7 gives<br />
some concluding remarks and an outlook.<br />

II. LINEAR SUBSPACE TECHNIQUES<br />

Principal Component Analysis (PCA), Independent Component<br />
Analysis (ICA) and Linear Discriminant Analysis (LDA)<br />
are appearance based linear subspace techniques<br />
[6]. These linear subspace techniques use statistics (mean and<br />
co-variance). The mean and co-variance are calculated<br />
from the train data set, which forms the data matrix<br />
X. In the data matrix X, each column xi represents an image<br />
of the train data set. The mean image of the train data set is<br />
expressed as shown in Eq. 1.<br />

$$m = \frac{1}{N}\sum_{i=1}^{N} x_i \tag{1}$$<br />
The co-variance matrix C of the random vector x is calculated<br />
using Eq. 2.<br />
$$C = \frac{1}{N}\sum_{i=1}^{N}(x_i - m)(x_i - m)^T \quad \text{(or)} \quad C = AA^T \tag{2}$$<br />
Calculating the co-variance matrix by using Eq. 2 requires a large amount of<br />
memory because of the dimensions of C. The size of A is<br />
LM×N, so the size of C is LM×LM, which is very large.<br />
So the matrix $L = A^T A$ is considered instead of C. The<br />
dimension of L is N×N, which is much smaller than the<br />
dimensions of C. After the co-variance matrix, each technique<br />
(PCA, ICA and LDA) uses a specific approach to calculate the<br />
key parameters of the feature space.<br />
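Why the small matrix can replace the large one may be seen from a toy<br />
sketch: if v is an eigenvector of A^T A, then Av is an eigenvector of<br />
A A^T with the same eigenvalue. The 4×2 matrix A below is hypothetical<br />
data, and treating A as the matrix of centered images is an assumption.<br />

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Hypothetical centered data: LM = 4 pixels (rows), N = 2 images (columns).
A = [[1.0, -1.0],
     [2.0, 0.0],
     [0.0, 1.0],
     [-1.0, 2.0]]

L = matmul(transpose(A), A)   # small N x N matrix (2 x 2)
C = matmul(A, transpose(A))   # large LM x LM matrix (4 x 4)

# Largest eigenpair of the symmetric 2 x 2 matrix L in closed form.
a, b, d = L[0][0], L[0][1], L[1][1]
lam = 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)
v = [[b], [lam - a]]          # eigenvector of L, stored as a column

# Map the small eigenvector back to image space.
u = matmul(A, v)              # candidate eigenvector of C
Cu = matmul(C, u)
residual = max(abs(cu[0] - lam * ui[0]) for cu, ui in zip(Cu, u))
```

A vanishing residual confirms that the eigenvectors of the large matrix can<br />
be obtained from the small N×N problem.<br />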

In the linear subspace technique, all the images in the train data<br />
set are represented as points in the feature space as shown in<br />
Fig. 1. The given test image is also represented as a point in<br />
the same space, and the train data set image at the minimum distance<br />
gives the best match.<br />
Fig. 1. Image representation in the high dimensional space (axes X1, X2, X3)<br />
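The minimum distance matching can be sketched as follows. The Euclidean<br />
distance, the function names and the three-dimensional feature points are<br />
assumptions chosen for illustration; other distance measures may be used.<br />

```python
import math

def euclidean(p, q):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def best_match(train_points, test_point):
    """Index of the train point closest to the test point."""
    distances = [euclidean(p, test_point) for p in train_points]
    return min(range(len(train_points)), key=distances.__getitem__)

# Hypothetical projected train images and one projected test image.
train = [(0.0, 0.0, 1.0), (2.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
match_index = best_match(train, (1.8, 1.2, 0.1))
```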

A. Principal Component Analysis (PCA)<br />

PCA highlights the similarities and differences between the<br />
variables in the data [12], [13]. After calculating the covariance<br />
matrix, we calculate its eigenvalues and<br />
eigenvectors. Then we arrange all<br />
eigenvalues in descending order and take the first few highest<br />
eigenvalues and the corresponding eigenvectors. This operation is<br />
the evaluation of the principal components [14]. The eigenvectors<br />
e1, e2, ..., en are shown in Eq. 3.<br />
$$W_{pca} = [e_1, e_2, \ldots, e_n] \tag{3}$$<br />
We neglect the remaining, less significant eigenvalues and<br />
the corresponding eigenvectors. Neglecting these eigenvalues<br />
leads to a very small information loss [15]. The principal<br />
component axis passes through the mean values. A new<br />
transformation matrix Wpca is obtained by projecting the<br />
principal components onto the original data set.<br />
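The selection of the most significant eigenvalues can be sketched as<br />
below. The stopping rule (keep components until a chosen fraction of the<br />
total variance is retained), the threshold of 0.92 and the eigenvalues<br />
themselves are assumptions for illustration only.<br />

```python
def select_components(eigenvalues, keep_fraction=0.92):
    """Return indices of the largest eigenvalues and the variance fraction kept.

    keep_fraction is a hypothetical threshold, not a value from the paper.
    """
    # Indices sorted by eigenvalue, largest first.
    order = sorted(range(len(eigenvalues)),
                   key=lambda i: eigenvalues[i], reverse=True)
    total = sum(eigenvalues)
    kept, running = [], 0.0
    for i in order:
        kept.append(i)
        running += eigenvalues[i]
        if running / total >= keep_fraction:
            break
    return kept, running / total

# Hypothetical eigenvalues of a covariance matrix.
kept_indices, retained = select_components([0.5, 6.0, 0.1, 3.0, 0.4])
```

The eigenvectors at the kept indices would then form the columns of the<br />
transformation matrix of Eq. 3.<br />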

B. Independent Component Analysis (ICA)<br />

ICA uses the higher order statistics of the input data<br />
to find the independent components. The independence is<br />
determined starting from the uncorrelated data. ICA is a<br />
special case of the blind source separation problem [16]. One of the simplest<br />
applications of ICA is found in the cocktail party problem.<br />
So the ICA technique is a generalization of the PCA technique.<br />
In this technique we first calculate the PCA transformation<br />
matrix Wpca, transform the centered matrix P = [x1 − m, x2 − m, ..., xn − m]<br />
using Wpca and then form a new matrix Z<br />
(a square matrix of size N×N), which contains the random<br />
vector z, whose elements are uncorrelated, as shown in Eq. 4.<br />
$$Z = W_{pca}^T P \tag{4}$$<br />
The next important stage is the rotation stage, in which<br />
the fixed point algorithm is used to find Wk [17]. After<br />
that, we calculate the overall transformation matrix as shown<br />
in Eq. 5.<br />
$$W_{ica} = W_{pca} W_k \tag{5}$$<br />



C. Linear Discriminant Analysis (LDA)<br />

The main objective of LDA is to minimize the within<br />
class variance and maximize the between class variance in<br />
the given data set. In other words, it groups the images of the same<br />
class and separates the images of different classes [18]. A class<br />
is the collection of data (images) belonging to the same<br />
object or person. In LDA, we have to calculate the mean<br />
image of each class i, which is represented as mi.<br />

$$S_i = \frac{1}{N_i}\sum_{x \in X_i}(x - m_i)(x - m_i)^T \tag{6}$$<br />
Eq. 6 represents the class dependent scatter matrix, i.e., the<br />
co-variance matrix of the centered images in<br />
class i. Xi represents the data matrix corresponding to<br />
class i, Ni the number of images present in class i, and c<br />
the total number of classes. The within class scatter matrix Sw<br />
is calculated from Eq. 7.<br />
$$S_w = \sum_{i=1}^{c} S_i \tag{7}$$<br />
This leads to the evaluation of the amount of variance between<br />
the images in each class. Sb represents the between class<br />
scatter matrix [3] and it calculates the variance between the<br />
classes by using Eq. 8. It is built from<br />
the difference between the mean of each class and the<br />
total mean of all classes.<br />
$$S_b = \sum_{i=1}^{c}(m_i - m)(m_i - m)^T \tag{8}$$<br />
If Sw is non-singular, we should solve the generalized eigen<br />
problem of the transformation matrix W by the linear discriminant<br />
analysis in Fig. 2. This transformation matrix should<br />
maximize the between class scatter matrix and minimize the<br />
within class scatter matrix [19]. There are many methods to<br />
solve the generalized eigen problem [20]. One method<br />
is to take the inverse of Sw and<br />
solve $S_w^{-1} S_b W = W\lambda$, which is<br />
derived from Eq. 9.<br />
$$S_b W = S_w W \lambda \tag{9}$$<br />
λ is a diagonal matrix containing the eigenvalues of the matrix<br />
$S_w^{-1} S_b$. The above algorithm is applicable only when the within<br />
class scatter matrix is non-singular. If the within class scatter matrix<br />
is singular, we should use the direct LDA technique [15].<br />
The direct LDA is performed in the following steps, as shown<br />
in Fig. 2.<br />
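The scatter matrices of Eqs. 6 to 8 can be sketched for a toy data set.<br />
The helper names and the two classes of two-dimensional points are<br />
hypothetical; real face images would be long pixel vectors.<br />

```python
def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def add(m1, m2):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def scale(m, s):
    return [[a * s for a in row] for row in m]

def zeros(d):
    return [[0.0] * d for _ in range(d)]

def scatter_matrices(classes):
    """Within class (S_w) and between class (S_b) scatter matrices."""
    dim = len(classes[0][0])
    m = mean_vector([x for cls in classes for x in cls])  # mean of all images
    s_w, s_b = zeros(dim), zeros(dim)
    for cls in classes:
        m_i = mean_vector(cls)                            # class mean
        s_i = zeros(dim)
        for x in cls:
            diff = [a - b for a, b in zip(x, m_i)]
            s_i = add(s_i, outer(diff, diff))
        s_w = add(s_w, scale(s_i, 1.0 / len(cls)))        # Eq. 6 and Eq. 7
        diff_b = [a - b for a, b in zip(m_i, m)]
        s_b = add(s_b, outer(diff_b, diff_b))             # Eq. 8
    return s_w, s_b

# Two hypothetical classes with two samples each.
classes = [[(1.0, 2.0), (3.0, 2.0)], [(7.0, 8.0), (9.0, 8.0)]]
S_w, S_b = scatter_matrices(classes)
```

In this toy example S_w turns out to be singular (its second row is zero),<br />
which is the situation in which the inverse of S_w cannot be used.<br />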

The first step is related to find the eigen vectors of the<br />

between class scatter matrix Sb = P T b Pb, where Pb is<br />

calculated by subtracting the mean face images of each class<br />

from the mean face image of all images as expressed in Eq. 10.<br />

Pb = [m1 − m, m2 − m, ...., mc − m] (10)<br />

The second step takes the most significant eigenvalues and the corresponding<br />
eigenvectors V. These eigenvectors are used to calculate $Y = P_b V$ and<br />
$D_b = Y^T S_b Y$. This leads to the evaluation of the whitening transform as<br />
$Z = Y D_b^{-1/2}$. Sb and Sw are projected onto the new subspace spanned by Z.<br />
The small matrix $Z^T S_w Z$ can then be diagonalized, as expressed in Eq. 11.<br />
$$U^T Z^T S_w Z U = \lambda_w \qquad (11)$$<br />
U and λw are the eigenvectors and eigenvalues of the matrix $Z^T S_w Z$. The<br />
corresponding eigen matrix is represented as R. The overall transformation<br />
matrix is calculated from W = ZR. The original space is then mapped into a new<br />
reduced dimensional feature space by the linear transformation $P_x = W^T X$<br />
(i.e. the train data set is projected onto this transformation matrix) [6].<br />
Fig. 2. Linear discriminant analysis technique for face recognition (flowchart: the data matrix X and the test image y are centered with the mean image; Sw and Sb are calculated; a test on whether Sw is singular selects either the direct calculation of the eigenvectors and eigenvalues of Sb and Sw or the calculation of the eigenvectors of Sb, the whitening transform Z and the eigenvectors of $Z^T S_w Z$; the projections $P_x = W^T X$ and $P_y = W^T y$ are compared by minimum distance to give the recognition result)<br />

Next, the test data set is projected onto this transformation matrix to obtain Py.<br />
The best match is found by calculating the distance between Px and Py using a<br />
distance measure. The overall linear discriminant analysis technique for face<br />
recognition is shown in Fig. 2.<br />

Technique | Year | Iterative | Class information usage | Order of statistics | Recognition rate (for 80 persons database) | Speed | Scalability<br />
Principal component analysis (PCA) | 1990 | No | No | Second order | 70% | medium | low<br />
Independent component analysis (ICA) | 1999 | Yes | No | Higher order | 79% | very low | low<br />
Linear discriminant analysis (LDA) | 1997 | No | Yes | Second order | 89% | high | high<br />
Fig. 3. Comparison of linear subspace techniques (PCA, ICA and LDA)<br />

The performance of different linear subspace techniques like PCA, ICA and LDA<br />
is evaluated. Experiments are conducted to understand the performance<br />
(recognition rate and speed) of these linear subspace techniques over the FERET<br />
database. Among linear subspace techniques, LDA gives both a higher recognition<br />
rate and a higher speed than PCA and ICA, as shown in Fig. 3. But LDA is not<br />
scalable, and the recognition rate is also not sufficient for real world<br />
applications. In linear subspace techniques, the computational load and memory<br />
requirements increase dramatically with the size of the database.<br />

III. RADON TRANSFORM<br />

The two dimensional radon transform was introduced by the Austrian<br />
mathematician Johann Radon in 1917. This transform gives the integrals of a<br />
given image along a set of lines [10]. Due to this, it captures the direction of<br />
the local features (lines, curves and circles) which are present in the image.<br />
This transform is useful in many line, circle and curve detection applications<br />
related to image processing and computer vision [10]. The radon transform of the<br />
two dimensional function f(x, y) in the (r, θ) plane (Fig. 4(a)) is given in Eq. 12.<br />
$$R(r, \theta)[f(x, y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, \delta(x\cos\theta + y\sin\theta - r)\, dx\, dy \qquad (12)$$<br />
where δ(.) is the Dirac function, $r \in (-\infty, \infty)$ is the perpendicular<br />
distance of a line from the origin and $\theta \in [0, \pi]$ is the angle formed<br />
by the distance vector [10]. The δ function converts the two dimensional integral<br />
to a line integral dl along the line x cos θ + y sin θ = r. The simplified form<br />
of R(r, θ)[f(x, y)] is Rf, shown in Eq. 13.<br />
$$R_f = \int_{-\infty}^{\infty} f(r\cos\theta - l\sin\theta,\; r\sin\theta + l\cos\theta)\, dl \qquad (13)$$<br />
Fig. 4. (a) The radon transform of an image (b) The original image (c) Radon transform of the image for angles from 0 to 180 degrees (axes: r from −100 to 100 against θ in degrees)<br />

The transformed function R(r, θ) is referred to as the sinogram of f(x, y). The<br />
δ function transforms a point in f into a sinusoidal line in the (r, θ) plane.<br />
Rf is defined as a function of straight lines. The radon transform of the two<br />
dimensional image shown in Fig. 4(b) extracts the direction of the lines present<br />
in that image, as shown in Fig. 4(c). The sinogram (Fig. 4(c)) of the given image<br />
has 181 radon projections, one for each angle from 0 to 180 degrees. Each<br />
projection is a feature vector.<br />
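A discrete version of Eq. 12 can be sketched by adding every pixel intensity into the bin of its rounded distance r = x cos θ + y sin θ. This is only an illustrative nearest-bin approximation with the origin assumed at the image centre; the paper does not say how its projections were computed, and the function names are hypothetical.

```python
import math


def radon_projection(image, theta_deg):
    """One radon projection: pixel sums binned by r = x*cos(theta) + y*sin(theta)."""
    rows, cols = len(image), len(image[0])
    cx, cy = (cols - 1) / 2.0, (rows - 1) / 2.0      # assumed origin: image centre
    half = int(math.ceil(math.hypot(cx, cy)))         # largest possible |r|
    bins = [0.0] * (2 * half + 1)
    theta = math.radians(theta_deg)
    c, s = math.cos(theta), math.sin(theta)
    for y in range(rows):
        for x in range(cols):
            r = (x - cx) * c + (y - cy) * s           # signed distance of the pixel
            bins[int(round(r)) + half] += image[y][x]
    return bins


def sinogram(image, angles=range(181)):
    """Stack one projection per angle; 0..180 degrees gives 181 projections."""
    return [radon_projection(image, a) for a in angles]


# A 5 x 5 test image containing a single vertical line in the centre column.
image = [[1.0 if x == 2 else 0.0 for x in range(5)] for y in range(5)]
sino = sinogram(image)
```

The vertical line collapses into a single bin at θ = 0 and spreads over five bins at θ = 90 degrees, which is how the sinogram encodes line direction.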

IV. WAVELET TRANSFORM<br />

Morlet introduced the wavelet transform in the early 1980's [21]. The word<br />
wavelet comes from the French 'ondelette', which means 'small wave' [11]. A<br />
wavelet gives both the spatial and frequency information of the images. In the<br />
frequency representation, the signal is cut into several parts and each part is<br />
analyzed separately. Commonly used discrete wavelets are the Daubechies<br />
wavelets [22]. A one level wavelet decomposition is performed by using the high<br />
pass filter g and the low pass filter h. Convolution with the low pass filter<br />
gives the approximation information, while convolution with the high pass<br />
filter gives the detail information [23]. The wavelet decomposition process of<br />
a two dimensional signal f(x, y) is shown in Fig. 5. The overall process is<br />
modeled in Eqs. (14 - 17).<br />

Fig. 5. Wavelet coefficients decomposition in discrete wavelet transform (filter bank: the input X(n) passes through high pass (HP) and low pass (LP) filters, each followed by downsampling by 2, giving the coefficients A, H, V and D)<br />

A = [h ∗ [h ∗ f]x ↓ 2]y ↓ 2 (14)<br />

H = [g ∗ [h ∗ f]x ↓ 2]y ↓ 2 (15)<br />

V = [h ∗ [g ∗ f]x ↓ 2]y ↓ 2 (16)<br />

D = [g ∗ [g ∗ f]x ↓ 2]y ↓ 2 (17)<br />

The star (∗) represents the convolution operation, and ↓ 2 represents the<br />
downsampling by 2 along the direction x or y [11]. Downsampling by two, which<br />
corrects the sample rate after filtering, is performed by simply throwing away<br />
every second coefficient. The Daubechies family contains many wavelet<br />
functions. In this work, db4 (because of the symmetry) is used. db4 leads to the<br />
four wavelet coefficients A, H, V and D and the corresponding images. In this<br />
decomposition A gives the approximation information, and its image is a blurred<br />
version of the input, as shown in Fig. 5. H gives the horizontal features, V<br />
gives the vertical features and D gives the diagonal features present in the<br />
image. The wavelet coefficient A gives the highest performance when compared to<br />
the remaining three wavelet coefficients, while D gives the lowest performance.<br />
Using the A + H + V + D wavelet coefficients leads to a performance which is<br />
nearly equal to that of A alone.<br />
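Eqs. 14 to 17 can be sketched as a one level decomposition that filters along x, then along y, with downsampling by 2 in each direction. To keep the example short and exact it uses the two-tap Haar (db1) filters rather than db4, and the sign convention of the high pass filter is an assumption; the helper names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def low(seq):
    """Haar low pass filter h followed by downsampling by 2."""
    return [(seq[i] + seq[i + 1]) / SQRT2 for i in range(0, len(seq) - 1, 2)]


def high(seq):
    """Haar high pass filter g followed by downsampling by 2."""
    return [(seq[i] - seq[i + 1]) / SQRT2 for i in range(0, len(seq) - 1, 2)]


def along_x(image, filt):
    """Apply a 1-D filter to every row (direction x)."""
    return [filt(row) for row in image]


def along_y(image, filt):
    """Apply a 1-D filter to every column (direction y)."""
    cols = [filt(list(col)) for col in zip(*image)]
    return [list(row) for row in zip(*cols)]


def dwt2_haar(f):
    """One level 2-D decomposition following the structure of Eqs. 14-17."""
    lo_x = along_x(f, low)       # [h * f]_x, downsampled
    hi_x = along_x(f, high)      # [g * f]_x, downsampled
    A = along_y(lo_x, low)       # approximation
    H = along_y(lo_x, high)      # horizontal features
    V = along_y(hi_x, low)       # vertical features
    D = along_y(hi_x, high)      # diagonal features
    return A, H, V, D


# Horizontal stripes: only A and H should respond.
f = [[2.0] * 4, [0.0] * 4, [2.0] * 4, [0.0] * 4]
A, H, V, D = dwt2_haar(f)
```

For the striped test image the energy ends up in A and H only, which matches the statement that H carries the horizontal features.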




V. CELLULAR NEURAL NETWORK<br />

The concept of cellular neural networks (CNN) was introduced in 1988 by<br />
Leon O. Chua and Lin Yang. The basic building block in the CNN model is the<br />
cell. The CNN model consists of a regularly spaced array of cells. It can be<br />
identified as the combination of cellular automata [24] and neural networks<br />
[25]. Adjacent cells communicate directly with their nearest neighbours, and<br />
other cells communicate indirectly because of the propagation effects in the<br />
model. The original idea was to use an array of simple, non-linearly coupled<br />
dynamic circuits to process large amounts of data in parallel and in real time [25].<br />

Cells are multiple input, single output nonlinear processors. Cells in the CNN<br />
processor have a fixed location and a fixed topology. Inputs, initial state, and<br />
output variables are used to define the CNN processor behavior. Professor<br />
Leon O. Chua proposed the diagram of an isolated cell, as shown in Fig. 6. The<br />
state variable is not observable from outside the cell itself.<br />

Fig. 6. Representation of an isolated cell (cell Cij with input Uij, threshold Zij, state Xij and output Yij)<br />

The cell is a lumped circuit, and it contains both linear and nonlinear<br />
elements, such as resistors, capacitors and nonlinear controlled sources, as<br />
shown in Fig. 7. The CNN processor is modeled by Eqs. (18 - 19), with xij, yij<br />
and uij as state, output and input variables respectively. The schematic model<br />
of a CNN cell is shown in Fig. 8.<br />
$$\dot{x}_{ij} = -x_{ij} + \sum_{c(j) \in N_r(i)} \left( A_{ij} y_{ij} + B_{ij} u_{ij} \right) + I \qquad (18)$$<br />
$$y_{ij} = \frac{1}{2} \left( |x_{ij} + 1| - |x_{ij} - 1| \right) \qquad (19)$$<br />

The coefficients Aij and Bij, the synaptic weights, completely define the<br />
behavior of the network for given input and initial conditions, as shown in<br />
Eq. 18. These values are called the templates. For ease of representation, they<br />
are expressed in the form of a matrix and are repeated in every neighborhood<br />
cell. We have three types of templates: the first one is the feedforward or<br />
control template, the second is the feedback template and the third is the bias.<br />
All these space invariant templates are called cloning templates. CNNs are<br />
particularly interesting because of their programmable nature, i.e. changeable<br />
templates. The template set for an r = 1 CNN contains 19 coefficients<br />
(A-template 9, B-template 9 and bias 1).<br />

Fig. 7. (a) Electronic circuit model of the isolated cell (b) The classical output nonlinear function for each cell<br />
Fig. 8. Schematic representation of the CNN (blocks: templates A and B as convolution blocks acting on the outputs and inputs from neighbouring cells, bias I, summation, −1 gain block, integration with initial state Xij(0), and the output function giving Yij)<br />
$$A = \begin{bmatrix} A_{-1,-1} & A_{-1,0} & A_{-1,1} \\ A_{0,-1} & A_{0,0} & A_{0,1} \\ A_{1,-1} & A_{1,0} & A_{1,1} \end{bmatrix}; \quad B = \begin{bmatrix} B_{-1,-1} & B_{-1,0} & B_{-1,1} \\ B_{0,-1} & B_{0,0} & B_{0,1} \\ B_{1,-1} & B_{1,0} & B_{1,1} \end{bmatrix};$$<br />

The genetic algorithm is used to estimate the A, B <strong>and</strong> I<br />

templates, depending upon the given application. The template<br />

set is unique for each application. In this work, we use the<br />

genetic algorithm to obtain the template set for the ORL<br />

database.<br />

A. Genetic algorithm<br />

In order to extract the facial features from a frontal face<br />

image, we assume that the template set values will have symmetrical<br />

behavior, as the front view of the face is symmetrical.<br />




Because of this symmetry, only 11 template elements are calculated instead of<br />
19 (A-template 5, B-template 5 and bias 1). Each template element is encoded in<br />
the 32 bit floating point format. The genetic algorithm (GA) uses a population<br />
of binary strings called chromosomes. In the learning process, initially 72<br />
random chromosomes, with a length of 11 ∗ 32 bits each, are constructed. The<br />
genetic algorithm is explained in detail in the following steps:<br />

• Construct the random population matrix with size 72 × (11 ∗ 32), i.e. each row<br />
represents a chromosome (for 11 template elements) of length 11 ∗ 32 = 352.<br />
• The IEEE 754 floating point standard is used to calculate the template (A, B<br />
and I) elements from each chromosome [24]. In each chromosome the first 11 bits<br />
represent the first bit of each of the 11 template elements, the second 11 bits<br />
represent the second bit of each of the 11 template elements, and so on, with<br />
the elements ordered as given in Eq. 20.<br />
S = [A11, A12, A13, A21, A22, B11, B12, B13, B21, B22, I] (20)<br />

• After template calculation, these templates are given as input to the CNN.<br />
The CNN first works with the template of the first chromosome. After the CNN<br />
output has become stable, the cost function is calculated from this CNN output<br />
image P and the target image T. This process is repeated for the template set of<br />
each chromosome in the population matrix [24]. The cost function is selected as<br />
shown in Eq. 21.<br />
$$cost(A, B, I) = \sum_{i=1}^{m} \sum_{j=1}^{n} P_{i,j} \oplus T_{i,j} \qquad (21)$$<br />
Here m and n are the dimensions of the image in pixels and ⊕ represents the XOR<br />
operation.<br />

• After calculating the cost function, the fitness function for each chromosome<br />
is evaluated as given in Eq. 22.<br />
fitness(A, B, I) = m ∗ n − cost(A, B, I) (22)<br />
• The whole process is repeated for each chromosome until the fitness value<br />
exceeds the stop criterion, which is set to stcriteria = 0.99 ∗ m ∗ n. The<br />
chromosome with the maximum fitness value in the population matrix is selected.<br />
• The next step is reproduction. In this process, the chromosomes are sorted in<br />
descending order of their fitness values. All the fitness values are normalized<br />
by the sum of the fitness values. The chromosomes with bad fitness values are<br />
deleted. The most successful chromosomes produce the next generation.<br />
• Take the two chromosomes S1 and S2 with the highest fitness values and apply<br />
the crossover and mutation operations to generate the children [24]. The<br />
crossover operation exchanges substrings between the two chromosomes S1 and S2.<br />
In this work, one-point crossover is used and its cross site is selected with<br />
uniform probability along the chromosome length. If the mutation probability is<br />
set to 0.01, then 253 bits (about 1% of the 72 ∗ 352 = 25344 bits in the<br />
population matrix) are selected randomly and inverted.<br />
• Take these new chromosomes and apply the same steps from template calculation<br />
to stop criteria.<br />
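The chromosome layout, the cost and fitness functions (Eqs. 21 and 22) and the one-point crossover from the steps above can be sketched as follows. The sketch assumes that the most significant (sign) bit of each 32 bit IEEE 754 value comes first; this is consistent with the best chromosome reported in this section, whose first 11 bits (00000101010) match the signs of the reported template elements. The function names and the sample values are hypothetical, and the CNN evaluation that produces the output image P is not included.

```python
import struct


def decode_chromosome(bits, n_elements=11, n_bits=32):
    """Decode an interleaved bit string into float32 template elements.

    Bit k of element e is assumed to sit at position k * n_elements + e.
    """
    values = []
    for e in range(n_elements):
        word = "".join(bits[k * n_elements + e] for k in range(n_bits))
        values.append(struct.unpack(">f", int(word, 2).to_bytes(4, "big"))[0])
    return values


def encode_chromosome(values, n_bits=32):
    """Inverse of decode_chromosome, used here only to build test data."""
    words = [format(struct.unpack(">I", struct.pack(">f", v))[0], "032b")
             for v in values]
    return "".join(w[k] for k in range(n_bits) for w in words)


def cost(P, T):
    """Eq. 21: number of pixels where the binary images P and T differ."""
    return sum(p ^ t for row_p, row_t in zip(P, T) for p, t in zip(row_p, row_t))


def fitness(P, T):
    """Eq. 22: m * n minus the cost."""
    return len(T) * len(T[0]) - cost(P, T)


def one_point_crossover(s1, s2, site):
    """Exchange the tails of two chromosomes after the cross site."""
    return s1[:site] + s2[site:], s2[:site] + s1[site:]


# Hypothetical template values in the order of Eq. 20 (exact in float32).
values = [2.5, -1.0, 0.5, 1.25, 8.0, -6.0, 2.75, -7.5, 1.5, -2.5, 0.25]
bits = encode_chromosome(values)
decoded = decode_chromosome(bits)

P = [[1, 0], [0, 1]]          # binary CNN output image (toy example)
T = [[1, 1], [0, 1]]          # binary target image
score = fitness(P, T)         # compared against 0.99 * m * n in the stop test
children = one_point_crossover("0000", "1111", 1)
```

In a full learning loop the cross site and the mutated bit positions would be drawn at random; they are passed in explicitly here so that the example stays deterministic.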

This learning process is repeated to find the best chromosome.<br />

After satisfying the stop criteria, the template elements are<br />

calculated from the best chromosome. The template elements<br />

to extract features from the frontal face images for ORL<br />

database are obtained as:<br />
$$A = \begin{bmatrix} 2.7612 & 7.3152 & 1.7566 \\ 1.5916 & 8.5273 & 1.5916 \\ 1.7566 & 7.3152 & 2.7612 \end{bmatrix};$$<br />
$$B = \begin{bmatrix} -6.1912 & 2.8350 & -7.9270 \\ 1.3044 & -2.7349 & 1.3044 \\ -7.9270 & 2.8350 & -6.1912 \end{bmatrix};$$<br />
$$I = 0.4414$$<br />

The corresponding best chromosome is<br />

S = [000001010101100111101000110000101<br />

00110000101001100001010011000010100110000101<br />

00110000101011101011010111010100111100011111<br />

10000011000010110010100000011111001010100110<br />

00010011011101100011010010101010110011011101<br />

11110110101001111010111100110010111001100101<br />

10011001001100110111100010110100000000011001<br />

01001001110010101010110111011110101001100011<br />

00100100000]<br />

Fig. 9. (a) The input image of the genetic cellular neural network (b) The genetic cellular neural network output image<br />

The two dimensional image shown in Fig. 9(a) is given as the input image to the<br />
CNN to extract the important frontal facial features present in that image, and<br />
the output image with the extracted feature set is shown in Fig. 9(b).<br />

VI. EXPERIMENTAL RESULT<br />

In this section, we evaluate the performance of the wavelet and radon transform<br />
based feature extraction approaches using the FERET database. The performance<br />
of the CNN based approach is compared to the other stated face recognition<br />
approaches over the ORL database. On the FERET database, the performance is<br />
evaluated for frontal images (fa or fb), pose variant images turned half left or<br />
half right at an angle of 67.5 degrees (hl or hr), and profile left or right<br />
images at an angle of 90 degrees (pl or pr) [26]. For the ORL database, the<br />
performance is evaluated for facial expressions and varying light conditions.<br />



1) Performance evaluation of radon and wavelet transforms: The radon transform<br />
gives the direction of the local features (lines, circles). The radon transform<br />
preserves the variation in pixel intensities. While computing the radon<br />
projections, the pixel intensities along a line are added. This process extracts<br />
the spatial frequency components in the direction of the radon projection. When<br />
features are extracted using the radon transform, the variations in this spatial<br />
frequency are also boosted. The wavelet transform gives the spatial and<br />
frequency components present in an image.<br />

A. Different wavelet functions versus recognition rate<br />

The Daubechies family contains different wavelet functions. The recognition<br />
rates of two different wavelet functions, db1 and db4, are compared in Fig. 10.<br />
db1 is the Haar wavelet and it encodes the constant component. db4 encodes both<br />
constant and linear components. The recognition rate of db4 is higher than that<br />
of db1.<br />

B. Different wavelet coefficients versus recognition rate<br />

In the db4 Daubechies wavelet decomposition, there are four wavelet<br />
coefficients: A, H, V and D. These coefficients vary in terms of the wavelet<br />
functions. The wavelet coefficient A gives approximate information on the<br />
features. H, V and D give the information about the horizontal, vertical and<br />
diagonal features present in the given image respectively.<br />

The wavelet coefficient A gives the highest recognition rate when compared to<br />
the remaining three wavelet coefficients, while D gives the lowest recognition<br />
rate (see Fig. 10). Using the A + H + V + D wavelet coefficients leads to a<br />
recognition rate which is nearly equal to that of A alone.<br />

Fig. 10. Performance comparison of different wavelet function db1 and db4 (bar chart of recognition rate (%) against the wavelet coefficients D, V, H and A)<br />

The next experiments are conducted on the FERET database with one frontal image<br />
(fb) for each subject as the test image, and five images in different poses for<br />
each subject in the train database. The performance evaluation is shown in<br />
Fig. 11(a). The experiments are repeated with pose variant images like hr and pr<br />
as the test image for each subject, and five images excluding the test image for<br />
each subject in the train database. The results are shown in Fig. 11(b) and<br />
Fig. 11(c) respectively. For best matching, the Euclidean distance measure is<br />
used here.<br />

Fig. 11. (a) Performance comparison of different face recognition approaches with front images (FERET database) (b) Performance comparison of different face recognition approaches with half right images (FERET database) (c) Performance comparison of different face recognition approaches with profile right images (FERET database) (d) Performance comparison of different face recognition approaches with ORL database (each panel plots the recognition rate (%) of PCA, LDA, Radon, Wavelet, Radon+wavelet, Radon+LDA and Wavelet+LDA; panels (a) to (c) use 40, 100, 150, 200 and 400 subjects in the database, and panel (d) also includes CNN)<br />
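The Euclidean matching step used in these experiments can be sketched as a nearest neighbour search over the training feature vectors. The feature vectors, labels and function names below are hypothetical placeholders; in the experiments the features would come from the radon, wavelet, subspace or CNN based extraction.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(train_features, labels, test_feature):
    """Return the label of the closest training vector and its distance."""
    distances = [euclidean(f, test_feature) for f in train_features]
    idx = min(range(len(distances)), key=distances.__getitem__)
    return labels[idx], distances[idx]


# Hypothetical two-dimensional features for three subjects.
train_features = [[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]]
labels = ["subject1", "subject2", "subject3"]
label, dist = best_match(train_features, labels, [3.0, 3.0])
```

A test image is counted as correctly recognized when the returned label is the true subject, and the recognition rate is the fraction of test images for which this holds.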

The recognition rate depends upon the number of subjects in the data set. It is<br />
more difficult to recognize a subject in a large data set than in a small one.<br />
The experiments are conducted with different sizes of the FERET database, by<br />
using linear subspace techniques (principal component analysis (PCA), linear<br />
discriminant analysis (LDA)), the radon transform and the wavelet transform.<br />
When linear subspace techniques are applied to large databases, the<br />
computational load and memory requirements increase dramatically with the size<br />
of the database. This affects the performance of PCA and LDA on large data sets,<br />
as shown in Fig. 11.<br />

The radon transform and wavelet transform are mostly independent of the size of<br />
the database. The combination of radon and wavelet transform gives<br />
multi-resolution features, which are more useful in face recognition. This has<br />
been validated with the experimental results shown in Fig. 11. Even though the<br />
combination of radon and wavelet transform gives better performance, there is<br />
still a need for improvement in pose variant face recognition, as shown in<br />
Fig. 11(b) and Fig. 11(c).<br />

2) Performance evaluation of cellular neural networks: The CNN based face<br />
recognition approach and the other stated approaches are applied to the ORL<br />
database. The ORL database contains images of 40 subjects. All images are taken<br />
in frontal position against a dark homogeneous background. The performance of<br />
the various algorithms evaluated on the ORL database is shown in Fig. 11(d). CNN<br />
with its parallel computing paradigm promises to outperform the other approaches<br />
over the ORL database, as shown in Fig. 11(d).<br />

VII. CONCLUSION<br />

The face recognition performance has been systematically<br />

evaluated by using different sizes of the database. To improve<br />

the performance of the face recognition technique, wavelets,<br />

radon <strong>and</strong> combination of both radon <strong>and</strong> wavelet transform<br />

have been proposed to extract the nonlinear features. The<br />

results of the evaluation have shown that the recognition rate is<br />

considerably increased with the combination of both radon <strong>and</strong><br />

wavelet transform compared to PCA and LDA. In addition to these two approaches,<br />
this work also shows that the CNN based feature extraction approach for face<br />
recognition outperforms both radon and wavelet transforms on the ORL database.<br />
However, this should be validated on the FERET database, where the images are<br />
in different poses. The CNN algorithm should be able to detect the pose, and<br />
then apply the appropriate template to extract the relevant feature set.<br />

Future work should focus on running the recognition algorithm over videos, as<br />
many applications demand real time recognition. Further, such a system may be<br />
integrated in a driver assistance system to either recognize the driver of a<br />
car, or extract facial expressions that may provide information about the<br />
driver's mood or fatigue.<br />

REFERENCES<br />

[1] I. Guyon <strong>and</strong> A. Elisseeff, “An Introduction to Feature Extraction,”<br />

Zurich Research Laboratory, (2004).<br />

[2] M. Aleemuddin, “A Pose Invariant Face Recognition system using<br />

Subspace Techniques,” Deanship of Graduate studies, (2004).<br />

[3] G. Shakhnarovich <strong>and</strong> B. Moghaddam, Face Recognition in Subspaces.<br />

Springer-verlag, May (2004).<br />

[4] M. Kirby <strong>and</strong> L. Sirovich, “Application of the karhunen-loeve procedure<br />

for the characterization of human faces,” IEEE Trans. Pattern Anal.<br />

Mach. Intell., vol. 12, no. 1, pp. 103–108, (1988).<br />

[5] A. SATO, H. IMAOKA, T. SUZUKI, <strong>and</strong> T. HOSOI, “Advances in Face<br />

Detection <strong>and</strong> Recognition Technologies,” NEC Journal of Advanced<br />

Technology, vol. 2, no. 1, (2005).<br />

[6] O. Toygar <strong>and</strong> A. Acan, “Face Recognition using PCA, LDA <strong>and</strong> ICA<br />

approaches on colored images,” Electrical <strong>and</strong> Electronics engineering,<br />

vol. 3, no. 1, pp. 735–743, (2003).<br />

[7] R. Brunelli, T. Poggio, <strong>and</strong> I. P. Trento, “Face recognition through<br />

geometrical features,” in European Conference on Computer Vision<br />

(ECCV), pp. 792–800, (1992).<br />

[8] R. Brunelli <strong>and</strong> T. Poggio, “Face recognition: Features vs. templates.,”<br />

IEEE Transactions on Pattern Analysis <strong>and</strong> Machine Intelligence,<br />

vol. 15, no. 10, pp. 1042–1052, (1993).<br />

[9] B. J. Lei, E. A. Hendriks, <strong>and</strong> M. Reinders, “On Feature Extraction from<br />

Images,” Technical Report on MCCWS project, (1999).<br />

[10] Q. W. Yan CHEN <strong>and</strong> X. HE, “Human Action Recognition by Radon<br />

Transform,” IEEE International Conference on Data Mining Workshops,<br />

May (2008).<br />

[11] N. Shams, I. Hosseini, M. Sadri, <strong>and</strong> E. Azarnasab, “Low cost fpgabased<br />

highly accurate face recognition system using combined wavelets<br />

with subspace methods,” pp. 2077–2080, (2006).<br />

[12] P. N. Belhumeur, J. a. P. Hespanha, <strong>and</strong> D. J. Kriegman, “Eigenfaces<br />

vs. fisherfaces: Recognition using class specific linear projection.,” IEEE<br />

Transactions on Pattern Analysis <strong>and</strong> Machine Intelligence, vol. 19,<br />

pp. 711–720, July (1997).<br />

[13] W. S.Yambor, “Analysis of PCA based <strong>and</strong> Fisher discriminant based<br />

image recognition algorithms,” Degree of Master of Science, (2000).<br />

[14] B. A. Draper, K. Baek, M. S. Bartlett, <strong>and</strong> J. R. Beveridge, “Recognizing<br />

faces with pca <strong>and</strong> ica,” Comput. Vis. Image Underst., vol. 91, no. 1-2,<br />

pp. 115–137, (2003).<br />

[15] P. N.Belhumeur, J. P.Hespanha, <strong>and</strong> D. J.Kriegman, “Eigenfaces vs.<br />

Fisherfaces:Recognition Using Class Specific Linear Projection,” vol. 19,<br />

no. 7, (1997).<br />

[16] J. Kim, M.-J. Choi, M.-J. Yi, <strong>and</strong> M. Turk, “Effective representation<br />

using ica for face recognition robust to local distortion <strong>and</strong> partial<br />

occlusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 12,<br />

pp. 1977–1981, (2005).<br />

[17] A. Hyvärinen, “The Fixed-Point Algorithm and Maximum Likelihood<br />
estimation for Independent Component Analysis,” pp. 1–5, (1999).<br />

[18] W. S. Y. B. A. D. J. R. Beveridge, “Analyzing PCA-based Face<br />

Recognition Algorithms: Eigenvector Selection <strong>and</strong> Distance Measures,”<br />

Department of Computer Science, (2000).<br />

[19] P. N.Belhumeur, J. P.Hespanha, <strong>and</strong> D. J.Kriegman, “Eigenfaces vs.<br />

Fisherfaces:Recognition Using Class Specific Linear Projection,” European<br />

Conference on Computer Vision, (1996).<br />

[20] W.Zhao, R.Chellappa, <strong>and</strong> P.J.Phillips, “Subspace Linear Discriminant<br />

Analysis for Face Recognition,” Department of Electrical <strong>and</strong> Electronic<br />

Engineering, (1999).<br />

[21] C. Garcia, G. Zikos, <strong>and</strong> G. Tziritas, “A wavelet-based framework for<br />

face recognition,” in Workshop on Advances in Facial Image Analysis<br />

<strong>and</strong> Recognition Technology, 5 th European Conference on Computer<br />

Vision, pp. 84–92, Publications, (1998).<br />

[22] M. I. M. D. Fatma H. Elfouly, Mohamed I. Mahmoud <strong>and</strong> S. Deyab,<br />

“Comparison between haar <strong>and</strong> daubechies wavelet transformations on<br />

fpga technology,” International Journal of Computer, Information, <strong>and</strong><br />

<strong>Systems</strong> Science, <strong>and</strong> Engineering, vol. 2, no. 1, pp. 1047–1061, (2006).<br />

[23] C. C. LIU, D. Q. Dai, <strong>and</strong> H. Yan, “Local Discriminant Wavelet<br />

Packet Coordinates for Face Recognition,” Journal of Machine learning<br />

Research, pp. 1165–1195, May (2007).<br />

[24] T. R. Tibor Kozek <strong>and</strong> L. . Chua, “Genetic Algorithm for CNN Template<br />

Learning,” IEEE Transactions on circuits <strong>and</strong> systems, vol. 40, no. 6,<br />

(1993).<br />

[25] L. Chua <strong>and</strong> T. Roska, Cellular neural networks <strong>and</strong> visual computing:<br />

foundations <strong>and</strong> applications. Cambridge University Press, (2005).



[26] P. Phillips, H. Wechsler, J. Huang, and P. Rauss, “The feret database and evaluation procedure for face-recognition algorithms,” vol. 16, pp. 295–306, April (1998).



ARTIFICIAL HUMAN LIMBS – A DESIGN APPROACH FOR MILITARY APPLICATION

Abstract: The most essential purpose of automation is to save human life, safeguard belongings, protect property and organize these tasks in a systematic way. This paper deals with the design of a real-time human limb that acts on the data supplied by calibrated sensors, according to its design configuration. The research proposes to overcome current limitations using three-axis optimal inertial sensors combined with an Embedded Controller on which the filter algorithm and the analog-to-digital converter are implemented, correcting drift and angular motion through all orientations. The mechanical design uses miniature or hybrid stepper motors with associated mechanical elements to move the limb about all axes: up/down, roll, elevation and azimuth.

R. Karthikeyan, Department of EIE, Veltech, Member IEEE, rkarthiekeyan@gmail.com

Anitha Karthikeyan, Department of ECE, Meenakchi College of Engineering, mrs.anithakarthikeyan@gmail.com

S. Sivaperumal, Department of ECE, Vel HIGHTECH SRS Engineering College, sivaperumals@gmail.com

I. INTRODUCTION

Networked synthetic environments (SE) stand to revolutionize the fields of education, training, business, retailing and entertainment. They will fundamentally alter our societies and the way in which mankind views the world. In the educational field, synthetic environments will offer the ultimate in hands-on learning and visualization of difficult concepts. They will allow training to take place in a setting much like that in which the skills being practiced will be used, without exposure to possible hazards and at less cost. In the workplace, employees will be able to work “side by side” even though they may be physically separated by hundreds or even thousands of miles [Durlach-1995]. Using synthetic environments, corporations will obtain a safe, economical and efficient method of testing new concepts and systems. Retailers will create virtual department stores where consumers will be able to try out products to an unprecedented degree before actually buying them.

Using synthetic environments, the entertainment industry will be able to create entire worlds in which customers can experience thrills and live out entire fantasy lives [Zyda-1997]. The power of the synthetic environment lies in its ability to immerse users in a different world. The more complete the immersion, the more effective the synthetic environment. For complete immersion, the user should sense and interact with the synthetic environment in the same manner in which interaction with the natural world takes place. Interaction in the natural world results from body motion. Information regarding the surrounding environment is obtained through the five senses. Changes in body posture and position directly affect what is seen, heard, felt and smelled [Mavor-1995].

The parameters sensed in the environment are altered and manipulated by the actions of the body. Thus, for a user to interact with a synthetic environment in a natural way, and for the synthetic environment to present appropriate information to the senses, it is imperative that data regarding body motion and posture be obtained [Skopowski-1996]. Body posture and location data are also needed in multi-user environments to drive the animation of avatars, which represent the actions of users of the environment to each other. At this time, there is no practical and intuitive interface that allows an individual human to be inserted into an SE in a fully immersive manner [Badler-1993].

Numerous motion tracking technologies are currently in use, but each suffers from its own set of limitations. Depending on the technology, these limitations may include marginal accuracy, user encumbrance, restricted range, susceptibility to interference and noise, poor registration, occlusion difficulties and high latency. Because of these problems, real-time animation of avatars must be largely script-based, using motion libraries. For the most part, only a single user may be tracked in a small working volume. Thus, none of the current technologies fulfills the need for wide-area tracking of multiple users. The ideal motion tracking technology must meet several requirements. It should have low latency, be tolerant of noise and other environmental interference, track multiple users and maintain both adequate accuracy and registration throughout a large working volume [MoletAubel-1999].

The primary reason current tracking systems fail to meet the requirements described above is their dependence on a generated “source” to determine orientation and location information. This source may be sent by transmitters to body-based receivers, or from body-based transmitters to receivers positioned at known locations throughout the working volume. Usually the effective range of this source is extremely limited, or there may be compromises between resolution and range. Interference with or distortion of this source will at best result in erroneous orientation and position measurements.

II. MOTIVATION

Motion tracking technologies currently fail to provide accurate wide-area tracking of multiple users without interference and occlusion problems. This research proposes to overcome current limitations using three-axis optimal inertial sensors combined with an Embedded Controller on which the filter algorithm and the analog-to-digital converter are implemented, correcting drift and angular motion through all orientations. The mechanical design uses miniature or hybrid stepper motors with associated mechanical elements to move the limb about all axes: up/down, roll, elevation and azimuth. An appropriate electronic circuit provides isolation between the stepper motors and the Embedded Controller.

The electronic system is rated up to 5 A, sufficient for a 70 kg cm stepper motor, although the stepper motor used in this research is only 7 kg cm. Joint angle determination for robots with flexible links is difficult. Bluetooth technology enables the sensors to transmit data wirelessly from the body extremities to a wearable PC. Inertial orientation tracking combined with RF positioning is also tried, to provide an accurate method for determining orientation and location. The paper describes a system designed to determine the posture of an articulated body in real time. Finally, this work describes the design, implementation, sensor calibration algorithm and testing of an inertial tracking system for a human limb segment.

IV. OBJECTIVES

Based on the above discussion, the objectives of the present research work are:

• Orientation tracking of human limb segments using three-axis inertial sensors.

• Calibration of individual sensors without the use of any specialized equipment.

• Sufficient dynamic response and update rate (100 Hz or better) to capture fast human limb motion.

• Ability to change the rotation of the three stepper motors according to the assigned threshold value.

• With the three sensors attached to the human limb, if the sensed value is 360 or below, the three stepper motors rotate in the forward direction. The axis direction and the three sensor readings are also displayed graphically on the computer, following the movement of the limb.

• With the three sensors attached to the human limb, if the sensed value is 400 or above, the three stepper motors rotate in the reverse direction. The axis direction and the three sensor readings are likewise displayed graphically on the computer.

• If the sensors are not attached to the human limb, the threshold value, stepper motor rotation, axis direction and the three sensor readings should all remain in their initial condition.

• Automatic accounting for the peculiarities of mounting a sensor on its associated limb segment.

• Creation of data files recording limb data as produced by the embedded software filter.

• Use of Bluetooth technology to let sensors transmit data wirelessly from the body extremities to the wearable PC.

• Use of RF positioning to provide an accurate method for determining orientation and location.

• Description of the design, implementation, sensor calibration and testing of the inertial tracking system for a human limb segment.
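The threshold behaviour listed in the objectives can be sketched in a few lines of Python. This is only an illustration and not the authors' firmware: the names `motor_direction` and `limb_command` are hypothetical, a reading of 0 is assumed to mean that the sensor is not attached (as the zero entries in Table 2 suggest), and readings between the two thresholds are assumed to leave the motor stopped, which the paper does not state.

```python
FORWARD_MAX = 360   # "360 and below" -> forward rotation
REVERSE_MIN = 400   # "400 and above" -> reverse rotation


def motor_direction(reading):
    """Map one sensor reading to a stepper motor command."""
    if reading <= 0:
        return "stopped"      # assumption: no signal, initial condition
    if reading <= FORWARD_MAX:
        return "forward"
    if reading >= REVERSE_MIN:
        return "reverse"
    return "stopped"          # assumption: dead band between the thresholds


def limb_command(s1, s2, s3):
    """Commands for motors M1, M2, M3 from sensors S1, S2, S3."""
    return tuple(motor_direction(r) for r in (s1, s2, s3))
```

Calling `limb_command(110, 0, 0)` corresponds to the first row of Table 2, where only the motor paired with the active sensor moves.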

III. DESIGN EQUATIONS OF LIMB

Force Calculations of Joints

The purpose of the force calculations is motor selection. The chosen motor must support not only the weight of the robot arm but also whatever the arm will carry (the blue ball in Diagram 1).

The first step is to label the free body diagram (FBD, Diagram 1), with the robot arm stretched out to its maximum length.

Diagram 1

Next we do a moment arm calculation, multiplying each downward force by its distance from the joint along the linkages. This calculation must be done for each lifting actuator. This particular design has just two degrees of freedom that require lifting, and the center of mass of each linkage is assumed to be at Length/2.

Torque About Joint 1:

M1 = L1/2 * W1 + L1 * W4 + (L1 + L2/2) * W2 + (L1 + L3) * W3

Torque About Joint 2:

M2 = L2/2 * W2 + L3 * W3
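As a rough check on motor sizing, the two torque expressions above can be evaluated directly. The sketch below simply transcribes M1 and M2; the meaning of each length and weight follows Diagram 1, which is not reproduced here, so the function takes them as plain inputs. The function name and the sample values are hypothetical and chosen only for illustration.

```python
def joint_torques(L1, L2, L3, W1, W2, W3, W4):
    """Static torques about joints 1 and 2 for the outstretched arm.

    Lengths and weights may be in any consistent units, for example
    cm and kg, which gives torques in kg cm.
    """
    m1 = L1 / 2 * W1 + L1 * W4 + (L1 + L2 / 2) * W2 + (L1 + L3) * W3
    m2 = L2 / 2 * W2 + L3 * W3
    return m1, m2


# Hypothetical values, not taken from the paper.
m1, m2 = joint_torques(L1=10, L2=10, L3=10, W1=0.2, W2=0.2, W3=0.1, W4=0.1)
```

The larger of the two results, with a safety margin, indicates the holding torque the corresponding lifting motor would need.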

Forward Kinematics

Forward kinematics is the method for determining the orientation and position of the end effector, given the joint angles and link lengths of the robot arm. Here we calculate the end effector location of our robot arm from its joint angles and link lengths.

Diagram 2

Assume that the base is located at x=0 and y=0. The first step is to locate x and y of each joint, as shown in Diagram 2.

Joint 0 (with x and y at base equaling 0):

x0 = 0

y0 = L0

Joint 1 (with x and y at J1 equaling 0):

cos(psi) = x1/L1 => x1 = L1*cos(psi)

sin(psi) = y1/L1 => y1 = L1*sin(psi)

Joint 2 (with x and y at J2 equaling 0):

sin(theta) = x2/L2 => x2 = L2*sin(theta)

cos(theta) = y2/L2 => y2 = L2*cos(theta)

End Effector Location (make sure your signs are correct):

x = x0 + x1 + x2 = 0 + L1*cos(psi) + L2*sin(theta)

y = y0 + y1 + y2 = L0 + L1*sin(psi) + L2*cos(theta)

z equals alpha, in cylindrical coordinates
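The joint-by-joint sums above translate directly into code. The following minimal sketch assumes angles in radians and the sign convention of the equations as written (psi measured so that x1 uses the cosine, theta so that x2 uses the sine); the function name `end_effector` is hypothetical.

```python
import math


def end_effector(L0, L1, L2, psi, theta):
    """End effector (x, y) from link lengths and joint angles in radians."""
    x0, y0 = 0.0, L0                                     # joint 0 above the base
    x1, y1 = L1 * math.cos(psi), L1 * math.sin(psi)      # link 1 contribution
    x2, y2 = L2 * math.sin(theta), L2 * math.cos(theta)  # link 2 contribution
    return x0 + x1 + x2, y0 + y1 + y2
```

With both angles at zero, link 1 lies along x and link 2 along y, which is a convenient configuration for checking the signs against Diagram 2.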

Inverse Kinematics

Inverse kinematics is the opposite of forward kinematics: the desired end effector position is known, and the joint angles required to achieve it must be found. If the robot sees a kitten and wants to grab it, what angle should each joint go to? Although far more useful than forward kinematics, this calculation is also much more complicated.

psi = arccos((x^2 + y^2 - L1^2 - L2^2) / (2 * L1 * L2))

theta = arcsin((y * (L1 + L2 * c2) - x * L2 * s2) / (x^2 + y^2))

where c2 = (x^2 + y^2 - L1^2 - L2^2) / (2 * L1 * L2)

and s2 = sqrt(1 - c2^2)

Several complications arise. There are very likely to be multiple, sometimes infinitely many, solutions. How should the arm choose which is optimal, based on torques, previous arm position, gripping angle, etc.? There may also be zero solutions: the location may be outside the workspace, or the point within the workspace may have to be gripped at an impossible angle (Diagram 3).

Diagram 3

Singularities, places of infinite acceleration, can blow up equations and/or leave motors lagging behind (motors cannot achieve infinite acceleration).

Lastly, equations involving powers, square roots and inverse trigonometric functions can take a long time to calculate on a microcontroller. There is no point in having advanced equations on a processor that cannot keep up.
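The closed-form expressions for psi and theta can be sketched as follows. The code reads them with the usual planar two-link convention, in which theta is the angle of the first link from the x axis and psi is the angle of the second link relative to the first; this is an assumption, since the labelling differs from the forward kinematics section. The function name is hypothetical. The zero-solution case is reported by returning None, and because arcsin is used, theta is only recovered between -90 and +90 degrees.

```python
import math


def inverse_kinematics(x, y, L1, L2):
    """Return (psi, theta) in radians, or None when no solution exists."""
    r2 = x * x + y * y
    if r2 == 0.0:
        return None  # target on the first joint: theta is undefined
    c2 = (r2 - L1 ** 2 - L2 ** 2) / (2 * L1 * L2)
    if abs(c2) > 1.0:
        return None  # target outside the workspace: zero solutions
    s2 = math.sqrt(1.0 - c2 ** 2)  # positive root selects one elbow pose
    psi = math.acos(c2)
    ratio = (y * (L1 + L2 * c2) - x * L2 * s2) / r2
    theta = math.asin(max(-1.0, min(1.0, ratio)))  # clamp rounding error
    return psi, theta
```

Taking the negative root for s2 would give the mirrored elbow pose, which is one concrete instance of the multiple solutions discussed above.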

Motion Planning

Motion planning on a robot arm is fairly complex, so only the basics are given here.

Diagram 4

Suppose the robot arm has objects within its workspace (Diagram 4). How does the arm move through the workspace to reach a certain point? To answer this, treat the end effector as a simple mobile robot navigating in 3D space. The end effector traverses the space just like a mobile robot, except that it must now also make sure the other joints and links do not collide with anything. This is extremely difficult to do.

What if you want the robot end effector to draw straight lines with a pencil? Getting it to go from point A to point B in a straight line is relatively simple. Using inverse kinematics, the robot should go to many points between point A and point B. The final motion will come out as a smooth straight line. The same method works not only for straight lines but for curved ones too. On expensive professional robotic arms, all we need to do is program two points and tell the robot how to go between them (straight line, fast as possible, etc.).
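The idea of visiting many points between A and B can be illustrated with a small helper that generates the intermediate targets. This is a sketch only: the name `line_waypoints` is hypothetical, the points are planar (x, y) pairs, and each waypoint would still have to be converted to joint angles by an inverse kinematics routine before being sent to the motors.

```python
def line_waypoints(a, b, steps):
    """Evenly spaced (x, y) points from a to b, endpoints included."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (ax, ay), (bx, by) = a, b
    return [
        (ax + (bx - ax) * i / steps, ay + (by - ay) * i / steps)
        for i in range(steps + 1)
    ]
```

More steps give a path that stays closer to the ideal line, at the cost of more inverse kinematics evaluations per move.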

Velocity (and more Motion Planning)

Calculating end effector velocity is mathematically complex, so we will go only into the basics. The simplest way to do it is to assume the robot arm (held straight out) is a rotating wheel whose radius is the arm length L. If the joint rotates at Y rpm, the velocity, in units of length per minute, is

Velocity of end effector on straight arm = 2 * pi * radius * rpm
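A short numeric sketch of this formula, converting the result to units of length per second, is given below. The function name and the sample arm are hypothetical.

```python
import math


def tip_speed(arm_length, rpm):
    """Tip speed of a straight arm, in length units per second."""
    per_minute = 2 * math.pi * arm_length * rpm  # formula from the text
    return per_minute / 60.0


# Hypothetical arm: 0.5 m long, joint turning at 30 rpm.
speed = tip_speed(0.5, 30)  # metres per second
```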

However, the end effector does not just rotate about the base; it can move in many directions, following a straight line, a curve, etc.

With robot arms, the quickest way between two points is often not a straight line. If two joints have different motors, or carry different loads, their maximum velocities can differ. When we tell the end effector to go from one point to the next, we have two choices: have it follow a straight line between the points, or tell all the joints to go as fast as possible, leaving the end effector to possibly swing wildly between those points.

In Diagram 5 the end effector of the robot arm is moving from the blue point to the red point. In the top example, the end effector travels in a straight line. This is the only motion this arm can perform that follows a straight line. In the bottom example, the arm is told to get to the red point as fast as possible. Of the many possible trajectories, the arm takes the one that allows the joints to rotate the fastest.

Diagram 5

Many factors decide which method is best. Usually we want straight lines when the object the arm moves is really heavy, because moving it requires a change of momentum (momentum = mass * velocity). But for maximum speed (perhaps the arm isn't carrying anything, or just light objects) we would want maximum joint speeds.

Now suppose we want the robot arm to operate at a certain rotational velocity. How much torque would a joint need? First, let's go back to our free body diagram (Diagram 6):

Diagram 6

Now suppose we want joint J0 to rotate 180 degrees in under 2 seconds. What torque does the J0 motor need? J0 is not affected by gravity, so all we need to consider is momentum and inertia. In equation form:

torque = moment_of_inertia * angular_acceleration

Breaking that equation into subcomponents we get:

torque = (mass * distance^2) * (change_in_angular_velocity / change_in_time)

and

change_in_angular_velocity = (angular_velocity1) - (angular_velocity0)

angular_velocity = change_in_angle / change_in_time

Now assuming that at start time 0 angular_velocity0 is zero, we get

torque = (mass * distance^2) * (angular_velocity / change_in_time)

where distance is defined as the distance from the rotation axis to the center of mass of the arm:

center of mass of the arm = distance = 1/2 * (arm_length)    (use arm mass)

but we also need to account for the object the arm holds:

center of mass of the object = distance = arm_length    (use object mass)

So we calculate the torque for the arm and then again for the object, and add the two torques together for the total:

torque(of_object) + torque(of_arm) = torque(for_motor)

And of course, if J0 were additionally affected by gravity, the torque required to lift the arm would be added to the torque required to reach the velocity needed.
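The J0 calculation above can be worked through numerically. The sketch below follows the text's simplifications exactly: each body is treated as a point mass at its center of mass, and the angular acceleration is approximated as angular_velocity / change_in_time starting from rest. The function names and the sample masses and lengths are hypothetical, in SI units with angles in radians.

```python
import math


def spin_up_torque(mass, distance, angle, duration):
    """torque = (mass * distance^2) * (angular_velocity / change_in_time)."""
    angular_velocity = angle / duration  # rad/s, starting from rest
    return mass * distance ** 2 * angular_velocity / duration


def j0_motor_torque(arm_mass, arm_length, object_mass, angle, duration):
    """Sum of the arm term (at arm_length / 2) and the object term."""
    arm = spin_up_torque(arm_mass, arm_length / 2, angle, duration)
    obj = spin_up_torque(object_mass, arm_length, angle, duration)
    return arm + obj


# Hypothetical arm: 0.5 kg, 0.4 m long, holding 0.2 kg, 180 degrees in 2 s.
torque = j0_motor_torque(0.5, 0.4, 0.2, math.pi, 2.0)  # in N m
```

The point-mass model is a simplification; a uniform rod rotating about one end has a moment of inertia of mass * length^2 / 3, slightly more than the mass * (length/2)^2 used here, so a margin on the result is advisable.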

V. IMPLEMENTATION OF INERTIAL TRACKING OF HUMAN LIMB SEGMENTS

The implementation of inertial tracking of a human limb segment is shown in Figure 1. Three inertial sensors are mounted on the limb segment. The analog output from the limb is 20 mV for adults (5 mV for infants) and is fed to a signal conditioner circuit. For better ADC accuracy, this circuit amplifies the 20 mV signal to 5 V; the corresponding amplifier gain is 5000/20 = 250.

The 5 V analog output is passed to a filter circuit, which provides high-speed noise filtering at a constant frequency of approximately 100 Hz, and then to the Embedded Controller, on which the software filter algorithm and the analog-to-digital converter are implemented. The signal is digitized by the controller's built-in A/D converter. The digitized output of the Embedded Controller is sent to the PC through an RS-232 converter. All data processing and calculations are performed by software running on this single processor.

An optocoupler circuit provides isolation between the three stepper motors and the Embedded Controller. The driver circuit drives the three stepper motors in different directions. The electronic system is rated up to 5 A, sufficient for a 70 kg cm stepper motor, although the stepper motor used in this research is only 7 kg cm. The rotation depends on the movement of the human limb about all axes (up/down, roll, elevation and azimuth) and on the assigned threshold value. The threshold value, the direction of the three stepper motors, the axis movement, the graphical representation and the sensor readings are all displayed on the monitor by a program written in the C programming language. The filter software, based on optimal filter theory, runs on the Flash Embedded Controller, and the visual simulation software runs on a single standard Pentium III processor in the computer (R. H. Barnett, L. O’Cull, 2004). Bluetooth technology enables the sensors to transmit data wirelessly from the body extremities to the wearable PC. Inertial orientation tracking combined with RF positioning is also tried, to provide an accurate method for determining orientation and location. Finally, the hardware kit of the overall prototype sensor system is shown in Figure 2.

Static Stability of the system

Figure 3 plots the magnitude of the quaternion filter criterion function versus time. The drift characteristics of the quaternion filter algorithm and the MARG sensor over extended periods were evaluated using static tests. Average total drift is about 1%. During the experiment shown, the filter gain k was set to unity. It is expected that increasing the filter gain to 4.0 would reduce the drift error by a factor of four, to about 0.25 percent. Further experiments indicated that nearly all drift was due to bias in the rate sensors. Experiments are currently underway using improved sensors containing capacitively coupled rate-sensor conditioning circuitry designed to remove these biases.

Dynamic Response of the system

Preliminary experiments were conducted to establish the accuracy of the orientation estimates and the dynamic response of the system. The preliminary test procedure consisted of repeatedly cycling the sensor through various angles of roll, pitch and yaw at rates ranging from 10 to 30 deg/s. Accuracy was measured to be better than one degree. The overall smoothness of the plot shows excellent dynamic response.

VI. EXPERIMENTAL TEST RESULTS OF HUMAN LIMB SEGMENT

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Table 1

Sensor   Stepper Motor   Threshold Value 400 & above   Threshold Value 360 & below   Axis Rotation
S1       M1              Reverse                       Forward                       Forward/Reverse
S2       M2              Reverse                       Forward                       Forward/Reverse
S3       M3              Reverse                       Forward                       Forward/Reverse


Table 2 Simulation Results Of Human Limb Segment

Stepper Motor M1   Stepper Motor M2   Stepper Motor M3   Sensor S1   Sensor S2   Sensor S3   Axis Rotation   Threshold Value
Forward            Stopped            Stopped            110         000         000         Forward         360 and below
Stopped            Forward            Stopped            000         160         000         Forward         360 and below
Stopped            Stopped            Forward            000         000         141         Forward         360 and below
Reverse            Stopped            Stopped            450         000         000         Reverse         400 and above
Stopped            Reverse            Stopped            000         580         000         Reverse         400 and above
Stopped            Stopped            Reverse            000         000         650         Reverse         400 and above

The above results were obtained using the hardware and software described, at an update rate of 100 Hz. The roll, pitch and yaw test results are presented in Figures 4, 5, 6 and 7. The smoothness of the graphs indicates excellent dynamic response. It is expected that adjusting the filter gain values would further improve the overall accuracy and dynamic response. The transition times observed in the plots are around 4.5 to 5 seconds, as expected for a rotation to 45 degrees at 10 degrees per second. In qualitative tests, the system was able to track the limb segment through all orientations, including those in which pitch equaled 90 degrees; such orientations normally cause singularities in Euler angle filters. The qualitative tests also show that the system could easily be combined with a simulation program to track motion in real time.

The purpose of the human body tracking system is to estimate the orientation of multiple human limb segments and use the resulting estimates to set the posture of the human body model that is visually displayed. Numerous experiments were conducted to qualitatively evaluate and demonstrate this capability.

In each experiment three sensors were attached to the limb segments to be tracked. Because of the small number of sensors available, tracking was limited to a single arm or leg. In the case of arm and limb segments, the sensors were attached with elastic bandages. In most cases this method appeared to keep the sensors fixed relative to the limb. Body tracking was also performed using various gains.

VII. CONCLUSIONS

This research has demonstrated an alternative technology for tracking the posture of an articulated rigid body. A high-speed Embedded Controller avoids electronic complexity, Bluetooth technology enables the sensors to transmit data wirelessly from the body to the PC, and inertial sensors determine the orientation of each link of the rigid body. RF positioning, together with the sourceless capability of inertial sensing, enables tracking of multiple users over a wide area. At the core of the system is an efficient complementary filter that uses a quaternion representation of orientation; the filter can continuously track the orientation of human limb segments (Robert B. McGhee, 2000). Drift corrections are also made. By using an Embedded Controller, this research avoids much of the analysis and calculation required by previous researchers. The embedded software filter processes the data from the three-axis inertial sensors attached to the human limb segment. Sensor calibration is achieved without any specialized equipment. An accurate calibration algorithm compensates for the misalignment between the sensor and limb segment coordinate axes. Hybrid stepper motors with associated mechanical elements are used to move the limb about all axes: up/down, roll, elevation and azimuth. The implemented system tracks human limb segments accurately at a 100 Hz update rate. Experimental results demonstrate that inertial orientation estimation is a practical method of tracking human body posture. With additional sensors, the architecture could easily be scaled up for full body tracking. Owing to its sourceless nature, inertial tracking could overcome many of the limitations of the motion tracking technologies currently in widespread use. It is potentially capable of providing wide-area tracking of multiple users for synthetic environment and augmented reality applications.



VIII. REFERENCES

[1] An, K.N., Chao, E.Y., Cooney, W.P., and Linscheid, R.L., 1979, “Normative Model of Human Hand for Biomechanical Analysis,” J. Biomechanics, vol. 12, pp. 775-788.

[2] Bennet, D.J., Hollerbach, J.M., 1990, “Closed-loop Kinematic Calibration of the Utah-MIT Hand,” in Experimental Robotics I: The First International Symp., V. Hayward, O. Khatib, (eds.), Springer-Verlag, N.Y., pp. 539-552.

[3] Cooney, W.P., Lucca, M.J., Chao, E.Y.S., Linscheid, R.L., 1981, “The kinesiology of the thumb trapeziometacarpal joint,” J. Bone Joint Surg. 63A:1371-1381.

[4] Fischer, M., van der Smagt, P., Hirzinger, G., 1998, “Learning Techniques in a Dataglove Based Telemanipulation System for the DLR Hand,” 1998 IEEE ICRA, pp. 1603-1608.

[5] Hollister, A., Buford, W.L., Myers, L.M., Giurintano, D.J., Novick, A., 1992, “The Axes of Rotation of the Thumb Carpometacarpal Joint,” J. of Orthopaedic Res., vol. 10, pp. 454-460.

[6] Khatib, O., 1987, “Unified Approach for motion and force control of robot manipulators: the operational space formulation,” IEEE J. of Robotics and Automation, vol. 3, no. 1, pp. 43-53.

[7] Kramer, J.F., “Determination of Thumb Position Using Measurements of Abduction and Rotation,” U.S. Patent #5,482,056.

[8] Kuch, J.J., Huang, T.S., 1995, “Human Computer Interaction via the Human Hand: A Hand Model,” 1995 Asilomar Conf. on Signals, Systems, and Computers, pp. 1252-1256.

[9] Rohling, R.N., Hollerbach, J.M., 1993, “Calibrating the Human Hand for Haptic Interfaces,” Presence, vol. 2, no. 4, pp. 281-296.

[10] Rohling, R.N., Hollerbach, J.M., Jacobsen, S.C., 1993, “Optimized Fingertip Mapping: A General Algorithm for Robotic Hand Teleoperation,” Presence, vol. 2, no. 3, pp. 203-220.

[11] Turner, M.L., Gomez, D.H., Tremblay, M.R., and Cutkosky, M.R., 1998, “Preliminary Tests of an Arm-Grounded Haptic Feedback Device in Telemanipulation,” 1998 ASME IMECE Symp. on Haptic Interfaces, pp. 145-149.

[12] Turner, M.L., Findley, R.P., Griffin, W.B., Cutkosky,<br />

M.R.,<br />

Gomez, D.H., 2000, “Development <strong>and</strong> Testing of a<br />

Telemanipulation System with Arm <strong>and</strong> H<strong>and</strong> Motion,”<br />

Accepted to 2000 ASME IMECE Symp. on Haptic<br />

Interfaces.<br />

[13] Wampler, C.W., Hollerbach, J.M., Arai, T., 1995, “An<br />

Implicit Loop Method for Kinematic Calibration <strong>and</strong> its<br />

Application to Closed-chain mechanisms,” IEEE Trans.<br />

Robotics <strong>and</strong> Automation, vol. 11, no. 5, pp. 710-724.<br />

[14] Wright, A.K., Stanisic, M.M., 1990, “Kinematic<br />

Mapping<br />

between the EXOS H<strong>and</strong>master Exoskeleton <strong>and</strong> the Utah-<br />

MIT Dextrous H<strong>and</strong>,” 1990 IEEE Int’l Conf. on <strong>Systems</strong><br />

Engineering, pp. 809-811.<br />

[15] www.societyofrobots.com /robot building ideas



A Novel Image Processing Approach Combining a ‘Coupled Nonlinear Oscillators’-based Paradigm with Cellular Neural Networks for Dynamic Robust Contrast Enhancement<br />
Kyandoghere Kyamakya (1), Cyrille Kalenga Wa Ngoy (2), Michel Matalatala Tamasala (2), Jean Chamberlain Chedjou (1)<br />
(1): Transportation Informatics Group, Institute of Smart Systems Technologies, University of Klagenfurt (Austria),<br />
Email: kyandoghere.kyamakya@uni-klu.ac.at ; jean.chedjou@uni-klu.ac.at<br />
(2): Department of Electrical and Computer Engineering, Polytechnic Faculty, University of Kinshasa (D. R. Congo)<br />
Email: Cyrille.Kalenga@vodacom.cd ; Michel.Matalatala@vodacom.cd<br />

Abstract−− This paper systematically discusses the pros and cons of two well-known approaches to image contrast enhancement. The first is based on the cellular neural network (CNN) paradigm; the second is based on the ‘coupled nonlinear oscillators’ paradigm for image processing. For the latter, an extensive bifurcation analysis is carried out and analytical formulas are derived that define the various states of the system. Both equilibrium and oscillatory states of the system are depicted, and each of these states is shown to have a significant impact on the quality of the resulting contrast enhancement. A benchmark compares the results of CNN-based processing with those of ‘coupled nonlinear oscillators’-based processing. The superiority of the latter approach for contrast enhancement is demonstrated both analytically and through various experiments. A major drawback of CNN-based image processing is the practical inability to adjust or recalculate templates in real time for a dynamic scene whose input images undergo visibility- and/or lighting-related spatio-temporal changes. Finally, a novel hybrid approach that integrates both schemes efficiently is proposed: the ‘coupled nonlinear oscillators’-based image processing is the main processing scheme, but it is realized on top of a framework of CNN processors. The hybrid approach overcomes key practical problems faced by both original approaches.<br />

Keywords: Cellular neural networks (CNN), Nonlinear coupled<br />

oscillators, van der Pol oscillator, Duffing oscillator, contrast<br />

enhancement, stability, bifurcation, Routh-Hurwitz theorem<br />

I. INTRODUCTION<br />

Recent decades have seen considerable attention devoted to the study of nonlinear coupled oscillators [2]-[17], with applications in diverse areas such as electrical engineering [18], [19], mechanics [15], electromechanics [14] and electronics [16], to name a few. In previous works [18]-[21], we have shown some interesting applications of the coupling between van der Pol and Duffing oscillators in both electronics and electromechanics. Further, a good number of notable recent contributions show applications of the nonlinear dynamics paradigm in image processing [1]-[13]: (a) the use of the CNN paradigm for contrast enhancement [1], edge detection [11], [17] and image segmentation [2]-[10], [12], [13]; and (b) the use of the so-called LEGION model (involving nonlinear coupled oscillators), mainly for image segmentation [3]. However, the relevant literature provides little information on the application of nonlinear coupled or uncoupled oscillators to image processing, especially for the specific task of contrast enhancement. In fact, only a single paper can be found in which image contrast enhancement has been performed with this latter paradigm [1].<br />

In contrast, the cellular neural network paradigm has shown, through numerous publications, its rich potential for solving many important low-level image processing tasks, e.g. image contrast enhancement [22]-[24], edge detection [25] and segmentation [26], [27], to name a few. Although the CNN paradigm offers an ideal framework for parallel and therefore ultrafast image processing, some important related issues still need a better theoretical foundation. One open question is that of a comprehensive and straightforward methodology for deriving appropriate CNN templates for a given image processing task. Currently known approaches determine the templates through a form of supervised learning, most commonly using genetic algorithms, simulated annealing or particle swarm optimization. A template obtained in this way depends strongly on the reference image(s) used. This traditional way of calculating templates therefore fails in a dynamic environment, which requires an adaptive, real-time recalculation of the appropriate templates in reaction to visibility- and lighting-related environmental changes. Indeed, for a specific processing task (e.g. contrast enhancement, segmentation, etc.) the optimal CNN templates must be adjusted or recalculated as the input image varies.<br />

An important open issue, not yet resolved by the relevant scientific community, is therefore the development of a comprehensive, robust and general framework that allows the CNN templates for a specific image processing task to be adapted in real time to variations in all aspects of the input image(s). CNN templates are known to be very sensitive to the quality of the input image and must be adjusted for optimal processing when the image is dynamic. The supervised-learning template calculation paradigm is therefore not appropriate where the input image undergoes visibility-related temporal dynamics; it is almost impossible to recalculate the templates in real time.<br />

Thus, a key objective of this paper is to propose an image processing framework (here for contrast enhancement) that is robust to both temporal and spatial changes in the quality of the input image(s). The novel approach combines the paradigm of coupled nonlinear oscillators with that of cellular neural networks. We show how this integration is best realized, and then demonstrate that the new architectural framework results in invariant templates while still adapting the image processing robustly to the spatio-temporal dynamics of the input image.<br />

The nonlinear coupled oscillator model used in this paper consists of a van der Pol oscillator coupled to a Duffing oscillator. The focus is on applying this coupled system to the specific image processing task of contrast enhancement. Contrast enhancement is an important issue in difficult and dynamic visual environments such as those faced by advanced driver assistance systems (ADAS), where it could help improve image quality or visibility in real time.<br />

We propose implementing the coupled nonlinear oscillators’ image processing concept on top of a cellular neural network framework. The CNN processors are viewed as a slave system used to solve, in real time, the nonlinear ordinary differential equations (ODEs) describing the coupled nonlinear oscillators’ model. A major strength of image processing based on coupled nonlinear oscillators is that its processing efficiency is sensitive neither to the actual image quality nor to lighting variations; it depends solely on the coefficients of the nonlinear differential equations describing the coupled oscillators’ model. The appropriate coefficients are determined in an offline bifurcation analysis, which is explained later in this paper. The resulting challenge is then to solve these nonlinear differential equations in real time. Note that these ODEs have constant coefficients, which are selected, as explained above, from the results of the bifurcation analysis.<br />

Therefore, the problem for the CNN processor system on top of which the coupled oscillators are implemented is that of solving, in real time, a set of highly stiff nonlinear differential equations with constant coefficients. The input frames are taken as initial conditions for the coupled oscillator system, and the real-time constraint is determined by the actual frame rate. The key challenge thus becomes that of determining the appropriate templates for solving the set of stiff nonlinear differential equations, which remains an open issue in the current literature. We have, however, addressed and efficiently solved this challenging issue in a subsequent work. The results are presented in full detail in another paper, published in the conference proceedings of CNNA 2010 under the title “CNN-based Real-time Computational Engineering” (see [28]).<br />

The preceding explanations highlight how the two concepts can be combined in a powerful and highly efficient real-time processing framework: a ‘coupled nonlinear oscillators’-based image processing scheme on top of a CNN processors’ framework.<br />

Contrast enhancement is an issue of prime importance in dynamic environments. It is one of the major low-level image processing tasks that must be completed before an image can be processed further at higher levels. The task becomes more challenging in a continuously changing environment such as the one experienced by driver assistance systems on the road, where weather, lighting, etc. cause significant spatio-temporal variations in input image quality. The real-time processing constraint makes the overall scenario more demanding still: the higher the car speed, the faster the image processing must be. A continuously changing environment requires the system to be adaptive, i.e. the system should enhance the input image so that the output image always has the best possible contrast regardless of the environmental conditions affecting the input image (darkness, non-uniform lighting, rain, fog, etc.). The output image should thus contain enough contrast for all objects in it to be easily distinguished by the system in further processing such as scene analysis.<br />

Implementing the coupled nonlinear oscillatory systems’ paradigm on top of a cellular neural network is a strong candidate for meeting this need. To develop such a paradigm, a systematic analytical framework should provide tools and methods for the straightforward design and parameter calculation of robust, ultra-fast image processing.<br />

The rest of the paper is organized as follows. Section 2 uses the Routh-Hurwitz theorem to analyze the stability of the nonlinear coupled oscillatory system. Three main states of the coupled system are depicted, namely equilibrium, quenched and oscillatory states. Analytical relations are derived under which the coupled system displays each of these states. The quality of the contrast enhancement is discussed for each possible state, and windows of the system parameters are determined within which either good or poor contrast enhancement can be predicted. Section 3 deals with the numerical study. It explains in depth the image processing concept involving coupled nonlinear oscillators. For rapid prototyping, a computing platform based on MATLAB/SIMULINK is developed and used for a set of processing tasks on images with poor contrast as well as on images with very good contrast. Section 4 deals with the benchmarking, which shows how far the novel approach outperforms the classical CNN-based way of doing the same task, since the CNN templates used for contrast enhancement (or published in relevant books or papers) are in reality optimal only for the images used in the related training process. The latter is traditionally based on offline optimization involving genetic algorithms, simulated annealing or particle swarm optimization [29]-[34]. As a proof of concept, our results are compared with those reported in the relevant literature for CNN-based contrast enhancement.<br />


We further discuss a possible implementation of the coupled nonlinear oscillators on top of a CNN computing platform. The challenge is to transform, as far as possible, the nonlinearities present in the van der Pol and Duffing oscillators into a type of nonlinearity similar to that of the elementary CNN cell. We use a novel optimization process to achieve this goal. Section 5 presents concluding remarks together with a summary of the key results obtained.<br />

II. ANALYTICAL STUDY<br />

The dynamics of a system consisting of a van der Pol<br />

oscillator coupled to a Duffing oscillator is described by the<br />

following equations:<br />

$$\frac{d^2x}{dt^2} - \varepsilon_1\left(1 - x^2\right)\frac{dx}{dt} + \omega_1^2 x = c_1 y + c_2\frac{dy}{dt} \tag{1a}$$
$$\frac{d^2y}{dt^2} + \varepsilon_2\frac{dy}{dt} + \omega_2^2 y + c_0 y^3 = c_3 x + c_4\frac{dx}{dt} \tag{1b}$$
where $c_1$ and $c_3$ are the elastic coupling parameters, and $c_2$ and $c_4$ are the dissipative coupling parameters. $x(t)$ and $y(t)$ represent the coordinates of the coupled oscillators (i.e. van der Pol and Duffing respectively). The stability analysis of the equilibrium points is carried out by restricting our investigation to the case where the elastic couplings (respectively the dissipative couplings) are identical. From (1), we obtain the following equilibrium points ($c_2 = c_4 = 0$):<br />
$$P_1\left(\sqrt{\frac{c_1^2\left(c_1^2-\omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\ 0,\ \sqrt{\frac{c_1^2-\omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\ 0\right) \tag{2a}$$
$$P_2\left(\sqrt{\frac{c_1^2\left(c_1^2-\omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\ 0,\ -\sqrt{\frac{c_1^2-\omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\ 0\right) \tag{2b}$$
$$P_3\left(-\sqrt{\frac{c_1^2\left(c_1^2-\omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\ 0,\ \sqrt{\frac{c_1^2-\omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\ 0\right) \tag{2c}$$
$$P_4\left(-\sqrt{\frac{c_1^2\left(c_1^2-\omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\ 0,\ -\sqrt{\frac{c_1^2-\omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\ 0\right) \tag{2d}$$
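The closed-form equilibrium coordinates can be checked numerically. The following is a minimal Python sketch, not the authors' code: it evaluates the first point of (2) from the radicands $x^2 = c_1^2(c_1^2-\omega_1^2\omega_2^2)/(c_0\omega_1^6)$ and $y^2 = (c_1^2-\omega_1^2\omega_2^2)/(c_0\omega_1^2)$, assuming $c_2 = c_4 = 0$ and $c_3 = c_1$. The function name `equilibrium_p1` is illustrative only.

```python
import math


def equilibrium_p1(c1, omega1, omega2, c0):
    """Return P1 = (x, 0, y, 0) from (2a), or None if the point does not exist.

    Assumes c2 = c4 = 0 and c3 = c1 (identical elastic couplings).
    """
    numerator = c1**2 - (omega1 * omega2) ** 2
    # The radicands are real and nonzero only if numerator / c0 > 0.
    if c0 == 0 or numerator / c0 <= 0:
        return None
    y_eq = math.sqrt(numerator / (c0 * omega1**2))
    x_eq = math.sqrt(c1**2 * numerator / (c0 * omega1**6))
    return (x_eq, 0.0, y_eq, 0.0)


if __name__ == "__main__":
    for c1 in (1.05, 1.15, 1.3):
        print(c1, equilibrium_p1(c1, omega1=1.0, omega2=1.0, c0=0.5))
```

With $\omega_1 = \omega_2 = 1$ and $c_0 = 0.5$, the sketch gives approximately (0.475, 0, 0.453, 0) for $c_1 = 1.05$, close to the value the paper quotes in its numerical study; the remaining points of (2) differ from P1 only in the signs of the first and third coordinates.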

These points exist under the conditions $c_1 < -\omega_1\omega_2$ or $c_1 > \omega_1\omega_2$, and $c_0 > 0$. We also obtain a critical equilibrium point $P_c(0,0,0,0)$. The stability of the above equilibrium points can be investigated by re-writing (1) in the following form:<br />
$$\frac{dx}{dt} = v \tag{3a}$$
$$\frac{dv}{dt} = \varepsilon_1\left(1 - x^2\right)v - \omega_1^2 x + c_1 y \tag{3b}$$
$$\frac{dy}{dt} = z \tag{3c}$$
$$\frac{dz}{dt} = -\varepsilon_2 z - \omega_2^2 y - c_0 y^3 + c_1 x \tag{3d}$$
and linearizing around a given equilibrium state $(x_0, v_0, y_0, z_0)$ to obtain the Jacobian matrix $M_J$:<br />
$$M_J = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -\omega_1^2 - 2\varepsilon_1 x_0 v_0 & \varepsilon_1\left(1 - x_0^2\right) & c_1 & 0 \\ 0 & 0 & 0 & 1 \\ c_1 & 0 & -\omega_2^2 - 3c_0 y_0^2 & -\varepsilon_2 \end{bmatrix} \tag{4}$$
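The first-order system (3) and its Jacobian (4) translate directly into code. The sketch below is a plain-Python illustration under the same assumptions as (3) ($c_3 = c_1$, $c_2 = c_4 = 0$). The helper `finite_difference_jacobian` is not part of the paper; it is added only to cross-check the analytical matrix against central differences.

```python
def rhs(state, eps1, eps2, omega1, omega2, c0, c1):
    """Right-hand side of the first-order system (3) with c3 = c1, c2 = c4 = 0."""
    x, v, y, z = state
    return [
        v,
        eps1 * (1.0 - x * x) * v - omega1**2 * x + c1 * y,
        z,
        -eps2 * z - omega2**2 * y - c0 * y**3 + c1 * x,
    ]


def jacobian(state, eps1, eps2, omega1, omega2, c0, c1):
    """Analytical Jacobian (4) evaluated at state = (x0, v0, y0, z0)."""
    x0, v0, y0, _z0 = state
    return [
        [0.0, 1.0, 0.0, 0.0],
        [-omega1**2 - 2.0 * eps1 * x0 * v0, eps1 * (1.0 - x0 * x0), c1, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [c1, 0.0, -omega2**2 - 3.0 * c0 * y0 * y0, -eps2],
    ]


def finite_difference_jacobian(state, params, step=1e-6):
    """Central-difference Jacobian, used only to cross-check the formula."""
    size = len(state)
    columns = []
    for j in range(size):
        plus, minus = list(state), list(state)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = rhs(plus, *params), rhs(minus, *params)
        columns.append([(a - b) / (2.0 * step) for a, b in zip(f_plus, f_minus)])
    # columns[j][i] holds d f_i / d state_j, so transpose to get row i, column j.
    return [[columns[j][i] for j in range(size)] for i in range(size)]
```

Comparing `jacobian(...)` with `finite_difference_jacobian(...)` at an arbitrary state is a quick guard against sign errors in the linearization.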

The eigenvalues of the 4×4 Jacobian matrix $M_J$ are the solutions of (5):<br />
$$a_0\lambda^4 + a_1\lambda^3 + a_2\lambda^2 + a_3\lambda + a_4 = 0 \tag{5}$$
where the coefficients $a_l$ are defined as follows:<br />
$$a_0 = 1 \tag{6a}$$
$$a_1 = \varepsilon_2 - \varepsilon_1\left(1 - x_0^2\right) \tag{6b}$$
$$a_2 = \omega_1^2 + \omega_2^2 + 2\varepsilon_1 x_0 v_0 + 3c_0 y_0^2 - \varepsilon_1\varepsilon_2\left(1 - x_0^2\right) \tag{6c}$$
$$a_3 = \varepsilon_2\left(\omega_1^2 + 2\varepsilon_1 x_0 v_0\right) - \varepsilon_1\left(1 - x_0^2\right)\left(\omega_2^2 + 3c_0 y_0^2\right) \tag{6d}$$
$$a_4 = \left(\omega_1^2 + 2\varepsilon_1 x_0 v_0\right)\left(\omega_2^2 + 3c_0 y_0^2\right) - c_1^2 \tag{6e}$$
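The coefficients of the characteristic polynomial can be evaluated at any equilibrium and tested with the Routh-Hurwitz criterion. The Python sketch below is an illustration, not the authors' implementation. It uses the standard Routh-Hurwitz conditions for a quartic with $a_0 > 0$, namely $a_1, a_3, a_4 > 0$ and $a_1 a_2 a_3 > a_0 a_3^2 + a_1^2 a_4$; the function names are hypothetical.

```python
def char_poly_coeffs(state, eps1, eps2, omega1, omega2, c0, c1):
    """Coefficients (a0, a1, a2, a3, a4) of (5), evaluated at state = (x0, v0, y0, z0)."""
    x0, v0, y0, _z0 = state
    stiff_vdp = omega1**2 + 2.0 * eps1 * x0 * v0   # minus the (2,1) entry of the Jacobian
    damp_vdp = eps1 * (1.0 - x0 * x0)              # the (2,2) entry of the Jacobian
    stiff_duf = omega2**2 + 3.0 * c0 * y0 * y0     # minus the (4,3) entry of the Jacobian
    a1 = eps2 - damp_vdp
    a2 = stiff_vdp + stiff_duf - damp_vdp * eps2
    a3 = eps2 * stiff_vdp - damp_vdp * stiff_duf
    a4 = stiff_vdp * stiff_duf - c1 * c1
    return (1.0, a1, a2, a3, a4)


def is_hurwitz_stable(coeffs):
    """Routh-Hurwitz test for a0*s^4 + a1*s^3 + a2*s^2 + a3*s + a4 with a0 > 0."""
    a0, a1, a2, a3, a4 = coeffs
    return (
        a1 > 0
        and a3 > 0
        and a4 > 0
        and a1 * a2 * a3 > a0 * a3 * a3 + a1 * a1 * a4
    )


if __name__ == "__main__":
    origin = (0.0, 0.0, 0.0, 0.0)
    for c1 in (0.5, 0.8, 1.05):
        coeffs = char_poly_coeffs(origin, 0.4, 1.0, 1.0, 1.0, 0.5, c1)
        print(c1, is_hurwitz_stable(coeffs))
```

For example, with $\varepsilon_1 = 0.4$, $\varepsilon_2 = 1$, $\omega_1 = \omega_2 = 1$ and $c_0 = 0.5$, the origin passes the test for $c_1 = 0.8$ and fails it for $c_1 = 0.5$ and $c_1 = 1.05$.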

It can be shown (by the analysis of the oscillatory states of the<br />

coupled system <strong>and</strong> by exploiting the Routh-Hurwitz theorem)<br />

that three possible states of the system can be depicted. The<br />

first state is the quenching state (i.e. the death of oscillations).<br />

The second is the state of equilibrium. And the last one is the<br />

oscillatory state.<br />

The system exhibits its quenching state when the critical equilibrium point $P_c(0,0,0,0)$ is stable. It can be shown, using the Routh-Hurwitz theorem, that the critical equilibrium point is stable if the following relationships are satisfied (assuming that the natural frequencies of the coupled oscillators are equal):<br />
$$\varepsilon_1 < \varepsilon_2 \tag{7a}$$
$$\omega_1\sqrt{\varepsilon_1\varepsilon_2} < c_1 < \omega_1^2 \tag{7b}$$
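As a check on the quenching conditions, the Routh-Hurwitz inequalities can be worked out explicitly at the origin. The following is a sketch of that derivation, assuming $\omega_1 = \omega_2 = \omega$, $c_2 = c_4 = 0$, $c_1 > 0$ and positive damping parameters; it uses only the coefficients of the characteristic polynomial evaluated at $x_0 = v_0 = y_0 = 0$.

```latex
% Coefficients of the characteristic polynomial at P_c = (0,0,0,0), with omega_1 = omega_2 = omega
\begin{aligned}
a_1 &= \varepsilon_2 - \varepsilon_1, &
a_2 &= 2\omega^2 - \varepsilon_1\varepsilon_2, &
a_3 &= \omega^2(\varepsilon_2 - \varepsilon_1), &
a_4 &= \omega^4 - c_1^2 .
\end{aligned}

% Routh-Hurwitz conditions for a quartic with a_0 = 1
\begin{aligned}
a_1 > 0,\ a_3 > 0 \;&\Longleftrightarrow\; \varepsilon_1 < \varepsilon_2, \\
a_4 > 0 \;&\Longleftrightarrow\; c_1 < \omega^2, \\
a_1 a_2 a_3 - a_3^2 - a_1^2 a_4
  &= (\varepsilon_2 - \varepsilon_1)^2\left(c_1^2 - \varepsilon_1\varepsilon_2\,\omega^2\right) > 0
  \;\Longleftrightarrow\; c_1 > \omega\sqrt{\varepsilon_1\varepsilon_2} .
\end{aligned}
```

Taken together, these inequalities give $\varepsilon_1 < \varepsilon_2$ and $\omega\sqrt{\varepsilon_1\varepsilon_2} < c_1 < \omega^2$, which is the quenching window stated above.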

It can also be shown (using the Routh-Hurwitz theorem) that the nonzero equilibrium points $P_i$ ($i = 1, 2, 3, 4$) are stable for<br />
$$\varepsilon_1 < \varepsilon_2 \tag{8a}$$
$$c_1 > \omega_1^2 \tag{8b}$$
Under the conditions described by (8) all the neighboring orbits of the critical equilibrium points are stable. It can be shown, using the oscillatory states analysis method (e.g. the multiple time scale method), that the coupled system displays oscillatory states. These states could be observed under the following conditions:<br />
$$\varepsilon_1 < \varepsilon_2 \tag{9a}$$
0



some preliminary results of the processing (contrast enhancement) of images with very poor initial contrast. The main focus is on showing that the quality of the contrast enhancement differs in each of the ‘parameter windows’ established analytically in (7)-(9). The advantage of this feature is that good or poor image processing can be predicted from the selected parameter values of the coupled nonlinear oscillators’ model.<br />
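Because the parameter windows are simple inequalities, the regime of a given parameter set can be looked up mechanically. The sketch below covers only the two windows that are fully stated in the text: the quenching window (7), $\varepsilon_1 < \varepsilon_2$ with $\omega_1\sqrt{\varepsilon_1\varepsilon_2} < c_1 < \omega_1^2$, and the nonzero-equilibrium window (8), $\varepsilon_1 < \varepsilon_2$ with $c_1 > \omega_1^2$. It assumes $\omega_1 = \omega_2$, $c_1 > 0$ and positive damping parameters; the function name and the returned labels are illustrative.

```python
import math


def parameter_window(eps1, eps2, omega1, c1):
    """Classify a parameter set against windows (7) and (8).

    Assumes omega1 == omega2, c1 > 0 and eps1, eps2 > 0.
    """
    if not eps1 < eps2:
        return "outside (7) and (8)"
    if omega1 * math.sqrt(eps1 * eps2) < c1 < omega1**2:
        return "quenching (7)"
    if c1 > omega1**2:
        return "nonzero equilibrium (8)"
    return "outside (7) and (8)"


if __name__ == "__main__":
    for c1 in (0.5, 0.8, 1.3):
        print(c1, parameter_window(eps1=0.4, eps2=1.0, omega1=1.0, c1=c1))
```

Parameter sets that fall outside both windows are deliberately left unlabeled here, since the sketch does not attempt to reproduce the oscillatory-state conditions.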

III. NUMERICAL STUDY<br />

A. Description of the concept<br />

The proposed coupled oscillatory system consists of two nonlinear oscillators, i.e., a van der Pol oscillator and a Duffing oscillator, each represented by a second-order nonlinear differential equation as given in (1). From a nonlinear dynamics perspective the scheme to solve this oscillatory system is straightforward.<br />

[Figure 1: block diagram. Input image → discretization → vectorization into the state vectors $\vec{x} = (x_1, x_2, \ldots, x_n)^T$ and $\vec{y} = (y_1, y_2, \ldots, y_n)^T$, with their time derivatives $d\vec{x}/dt$ and $d\vec{y}/dt$ → coupled oscillatory paradigm.]<br />
Figure 1. Image processing through the oscillatory model<br />

To exploit the coupled nonlinear equations for image processing tasks (e.g. contrast enhancement, edge detection, segmentation, etc.), the basic idea remains the same, although it is a little more involved. The input image is first pixelized, i.e. put into grid form, and the pixelized image is then vectorized, so that the elements of the vector image are the individual pixels. This vector image serves as the initial-condition vector for the coupled oscillatory system. Solving a second-order ordinary differential equation requires two initial conditions (position and velocity), and the same holds here. To process each pixel the system needs four values: the ‘position’ and ‘velocity’ values of both the van der Pol oscillator and the Duffing oscillator. The initial conditions therefore comprise four vectors, each of the same size as the input image: two for the initial positions and two for the initial velocities. The key steps of the overall principle are shown in Fig. 1.<br />
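The vectorization and initialization steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the image is a list of rows of gray levels, the four initial-condition vectors are named `x`, `dx`, `y` and `dy` (hypothetical names), and any vector that does not receive the image is set to zero, which the paper does not specify.

```python
def vectorize(image):
    """Flatten a grid (list of rows) of pixel intensities into one vector, row by row."""
    return [float(pixel) for row in image for pixel in row]


def initial_conditions(image, load_into=("x",)):
    """Build the four initial-condition vectors (x, dx/dt, y, dy/dt).

    The image is copied into every vector named in load_into; the other
    vectors are filled with zeros (an assumption made for this sketch).
    """
    pixels = vectorize(image)
    state = {name: [0.0] * len(pixels) for name in ("x", "dx", "y", "dy")}
    for name in load_into:
        if name not in state:
            raise ValueError(f"unknown state vector: {name}")
        state[name] = list(pixels)
    return state


if __name__ == "__main__":
    image = [[0.1, 0.2], [0.3, 0.4]]
    print(initial_conditions(image, load_into=("x", "y")))
```

Each of the four vectors has one entry per pixel, so pixel k of the image is associated with the initial state (x[k], dx[k], y[k], dy[k]).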

The system generates two solutions at each time step: that of the van der Pol oscillator and that of the Duffing oscillator. These solutions are obtained as vector images, which must be converted back to grid form. Normally, the input images are loaded (as initial conditions) into $\vec{x}$, into $\vec{y}$, or into both, but there are different possible scenarios for initializing the model. Some of these scenarios are listed below:<br />
• Loading the image in $\vec{x}$<br />
• Loading the image in $\vec{y}$<br />
• Loading the image in $\vec{x}$ and $\vec{y}$<br />
• Loading the image in $d\vec{x}/dt$<br />
• Loading the image in $d\vec{y}/dt$<br />
• Loading the image in $d\vec{x}/dt$ and $d\vec{y}/dt$<br />
• Loading the image in $\vec{x}$ and $d\vec{x}/dt$<br />

The SIMULINK model (i.e. a graphical representation) that has<br />

been used for the simulations of this paper (i.e. for image<br />

contrast enhancement) is shown in Fig. 2. This graphical model<br />

is a representation of the nonlinear coupled oscillatory system<br />

from the nonlinear dynamics perspective.<br />
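For readers without SIMULINK, the same simulation loop can be approximated in plain Python. The sketch below is a stand-in, not the authors' model. It treats every pixel as an independent van der Pol-Duffing pair governed by the first-order form of (1) with $c_3 = c_1$ and $c_2 = c_4 = 0$, loads the pixel value into $x$ with the other three initial values set to zero, and integrates with a classical fixed-step fourth-order Runge-Kutta scheme. The step size, the end time and the dictionary keys are assumptions of this sketch.

```python
def coupled_rhs(state, p):
    """First-order form of the coupled system for one pixel (c3 = c1, c2 = c4 = 0)."""
    x, v, y, z = state
    return (
        v,
        p["eps1"] * (1.0 - x * x) * v - p["omega1"] ** 2 * x + p["c1"] * y,
        z,
        -p["eps2"] * z - p["omega2"] ** 2 * y - p["c0"] * y**3 + p["c1"] * x,
    )


def rk4_step(state, dt, p):
    """One classical Runge-Kutta (RK4) step of size dt."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = coupled_rhs(state, p)
    k2 = coupled_rhs(shifted(state, k1, dt / 2.0), p)
    k3 = coupled_rhs(shifted(state, k2, dt / 2.0), p)
    k4 = coupled_rhs(shifted(state, k3, dt), p)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def process_image(image, p, t_end=100.0, dt=0.05):
    """Integrate every pixel from (pixel, 0, 0, 0) and return the x and y grids."""
    steps = int(round(t_end / dt))
    x_out, y_out = [], []
    for row in image:
        x_row, y_row = [], []
        for pixel in row:
            state = (float(pixel), 0.0, 0.0, 0.0)
            for _ in range(steps):
                state = rk4_step(state, dt, p)
            x_row.append(state[0])   # van der Pol solution for this pixel
            y_row.append(state[2])   # Duffing solution for this pixel
        x_out.append(x_row)
        y_out.append(y_row)
    return x_out, y_out


if __name__ == "__main__":
    params = {"eps1": 0.4, "eps2": 1.0, "omega1": 1.0, "omega2": 1.0, "c0": 0.5, "c1": 0.8}
    x_img, y_img = process_image([[0.01, 0.0], [0.02, 0.0]], params)
    print(x_img)
```

The returned grids have the same shape as the input, which corresponds to converting the vector solutions back to image form. A fixed-step explicit scheme is used only for brevity; the paper describes the equations as highly stiff, in which case an implicit or adaptive solver would be the safer choice.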

B. Results<br />

Our objective in this part is to connect the analytical results (the different states of the nonlinear oscillator system) to sample image processing results obtained through numerical simulation. The key issue is to establish a correlation between the formulas derived analytically and the image processing (i.e. contrast enhancement) results obtained numerically.<br />
It has been shown analytically that the equilibrium points (i.e. both $P_c$ and $P_i$) are stable under the conditions described by (7) and (8). We now exploit these conditions to show the quality of the image processing performed by the coupled oscillators’ system in its equilibrium states. Two main equilibrium states of the coupled system have been identified analytically. The first is the quenching state, in which the critical point (i.e. the origin) $P_c(0, 0, 0, 0)$ is stable. At the critical point both oscillators are mutually and completely damped, which leads to the quenching phenomenon. When it occurs, the processing produces an image that is completely dark (see Fig. 3b), whatever the quality (good or poor) of the input image (see Fig. 3a). The following set of parameters, which satisfies the conditions in (7), has been used to obtain the quenching state: $\varepsilon_1 = 0.4$, $\varepsilon_2 = 1$, $\omega_1 = 1$, $\omega_2 = 1$, $c_1 = 0.8$, $c_3 = 0.8$, $c_2 = 0$, $c_4 = 0$, and $c_0 = 0.5$.



Figure 2. Simulink representation of the coupled oscillators’ model<br />

(a) (b)<br />
Figure 3. Result of the image processing when the system is at the critical equilibrium point $P_c(0, 0, 0, 0)$: input image (a) and result of the processing (b), an output image that is completely dark (quenching phenomenon).<br />

An important observation that can be drawn from Fig. 4 is that the image processing quality is highest for equilibrium points that lie far from the critical point. For instance, the parameter values $c_1 = 1.05$, $c_1 = 1.15$ and $c_1 = 1.3$ lead to the equilibrium points $P_1(0.475, 0, 0.455, 0)$, $P_1(0.923, 0, 0.803, 0)$ and $P_1(1.52, 0, 1.17, 0)$, respectively. Increasing $c_1$ therefore moves the equilibrium points $P_i$ away from the critical equilibrium $P_c$, which leads to a significant improvement in the quality of the image processing (contrast enhancement). The resulting images are presented in Fig. 4. We have also performed a series of numerical image processing simulations in the ‘oscillatory states’ of the coupled system described by (9), using the same set of parameters as in Fig. 4 with $c_1$ as the control parameter. The results in Fig. 5 reveal that, in the oscillatory state of the coupled system, the quality of the processing increases with decreasing $c_1$ (see Fig. 5b, Fig. 5c and Fig. 5d).<br />

(a) (b)<br />
(c) (d)<br />
Figure 4. Effects of the control parameter $c_1$ on the image processing quality when the system is in different equilibrium states: (a) is the input image; (b) is the image processing result for $c_1 = 1.05$; (c) is the result for $c_1 = 1.15$; and (d) is the result for $c_1 = 1.3$, the latter leading to the optimal processing obtained in the corresponding equilibrium state of the coupled system.<br />

Figure 5. Effects of the control parameter c1 on the processing quality of<br />

the input image (a) when the system is in different oscillatory states; the results of<br />

the processing are: (b) for c1 = 0.6, (c) for c1 = 0.55, and (d) for c1 = 0.50, the<br />

latter leading to the optimal result obtained in the corresponding oscillatory<br />

state of the coupled system.<br />

IV. BENCHMARKING<br />

In this section we discuss and compare the results of the<br />

CNN based image contrast enhancement techniques published<br />

in the literature so far with those obtained through<br />

processing by the coupled nonlinear oscillators’ paradigm. A<br />

first attempt at CNN based contrast enhancement was<br />

presented by Márton Csapodi et al. [22]. In this concept,<br />

another well-known contrast enhancement approach, i.e.<br />

adaptive histogram equalization, is emulated by<br />

performing a piecewise linear approximation of different<br />

mapping functions. The technique is computationally intensive<br />

since, for each contextual region, it requires a histogram<br />

generation, a mapping function calculation and a rescaling of<br />

pixel values according to the new mapping. Mátyás Brendel et<br />

al. [23] addressed the contrast enhancement problem by<br />

proposing a set of linear templates, which however do not<br />

provide good results for all test images due to the highly<br />

nonlinear nature of the images. A. Gacsádi et al. [24] have<br />

designed a set of templates for image enhancement by<br />

minimizing the image energy function.<br />
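The three per-region steps named above for adaptive histogram equalization<br />
(histogram generation, mapping function calculation, rescaling) can be<br />
sketched as follows. This is the textbook equalization of a single<br />
contextual region, not the CNN emulation of [22] with its piecewise linear<br />
approximation. The function name and the flat-list image representation<br />
are assumptions made for illustration.<br />

```python
from collections import Counter


def equalize_region(pixels, levels=256):
    """Histogram-equalize one contextual region given as a flat list of gray levels."""
    if not pixels:
        return []
    n = len(pixels)
    # Step 1: histogram generation for the region.
    histogram = Counter(pixels)
    # Step 2: mapping function built from the cumulative distribution.
    cdf, running = {}, 0
    for level in sorted(histogram):
        running += histogram[level]
        cdf[level] = running
    cdf_min = min(cdf.values())
    if n == cdf_min:
        # A flat region has a single gray level; there is nothing to stretch.
        return list(pixels)
    mapping = {
        level: round((count - cdf_min) * (levels - 1) / (n - cdf_min))
        for level, count in cdf.items()
    }
    # Step 3: rescale every pixel according to the new mapping.
    return [mapping[p] for p in pixels]


region = [10, 10, 20, 30]
equalized = equalize_region(region)
```

Repeating these steps for every contextual region of every frame is what<br />
makes the approach computationally intensive.<br />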



Figure 6. Results of CNN based contrast enhancement schemes: (a) image<br />

enhancement based on the approach developed by Márton Csapodi et al. [22];<br />

(b) image enhancement based on the approach developed by Mátyás Brendel et<br />

al. [23]; (c) image enhancement based on the approach by A. Gacsádi et al. [24];<br />

(d) edge preservation observed in the approach by A. Gacsádi et al. [24]. All<br />

these results are obtained w.r.t. the input image of Fig. 4a.<br />

The energy function considered consists of two terms, namely a<br />

smoothness constraint and an edge penalty. Thus, to obtain an<br />

optimum result, a tradeoff between image smoothness and edge<br />

detection had to be found and adjusted. Applying the approaches<br />

based on the CNN paradigm to the same input image of Fig.<br />

4a, we have obtained an enhanced contrast w.r.t. the input<br />

image but with a loss of key information (see Fig. 6). The parts<br />

of the input image with small gray level differences have been<br />

lost, e.g. the driver’s face, the road, the round lane and the<br />

background. In contrast, the optimum results (Fig. 4d,<br />

Fig. 5d) obtained through the coupled nonlinear oscillators<br />

processing paradigm clearly show that almost all of the basic<br />

information of the same input image is restored during the<br />

image enhancement processing. A comparison of Figures 5 and<br />

6 underscores the superiority of the coupled nonlinear<br />

oscillator based contrast enhancement over CNN<br />

based approaches. The reason for the weakness of the CNN<br />

based approach lies in the essentially “supervised<br />

training/learning”-like process used to determine the templates.<br />

As a result, the (linear) templates obtained are only optimal for<br />

the test images. Beyond that, those templates cannot be<br />

optimal for images experiencing temporal<br />

and/or spatial dynamics.<br />

Having seen the superiority of the coupled oscillators’<br />

based approach, we now propose a hybrid architecture that<br />

combines the strong points of both CNN and the coupled<br />

nonlinear oscillators’ based image processing. The core<br />

processing will be the latter. We therefore propose the<br />

realization of the coupled nonlinear oscillators’ image<br />

processing concept on top of a cellular neural network<br />

processor framework. Here, the CNN processors play<br />

the role of a slave system used to solve, in real time, the<br />

nonlinear ordinary differential equations describing the<br />

coupled nonlinear oscillators’ model. Image processing<br />

based on coupled nonlinear oscillators has one particularly strong<br />

feature: its processing efficiency is sensitive<br />

neither to the actual image quality nor to its variations or<br />

states, and depends solely on the coefficients of the nonlinear<br />

differential equations describing the coupled oscillators. The<br />

appropriate coefficients/parameters of the coupled nonlinear<br />

oscillators are determined, as explained in the previous<br />

sections, in an offline bifurcation analysis process. The new<br />

challenge for the CNN processor is now that of<br />

solving these nonlinear differential equations<br />

in real time. Thus, the problem formulation for the CNN<br />

processor system is that of solving a set of highly stiff<br />

nonlinear differential equations having constant coefficients.<br />

The key challenge for this task is solely that of determining<br />

the appropriate templates. This is not trivial at all and is still<br />

an open issue in the relevant<br />

literature. We could however solve it, and we present the<br />

related results in another paper published in<br />

the proceedings of the CNNA-2010 conference,<br />

entitled “CNN based Real-time Computational Engineering”<br />

(see [28]).<br />

V. CONCLUDING REMARKS<br />

CNN needs well-optimized templates to perform any<br />

specific image processing task, and it is well known that<br />

template optimization remains an unsolved problem that stands<br />

in the way of straightforward and efficient CNN based computing. For<br />

dynamic environments, linear templates do not provide<br />

robust processing, since every image is different from the<br />

previous one and in reality requires a new set of<br />

appropriate templates for the same processing task. Nonlinear<br />

templates require some preprocessing on each<br />

image to obtain the template values that are appropriate to the<br />

actual image, which is a serious obstacle for real-time<br />

applications. The proposed paradigm of coupled nonlinear<br />

oscillators based processing has shown (both analytically and<br />

numerically) domains of good/efficient contrast enhancement<br />

processing in which the processing quality remains<br />

robust/constant and is insensitive to any spatio-temporal<br />

dynamics that the input images may experience. Furthermore,<br />

in CNN based processing the template-based computing<br />

also involves the pixel’s neighborhood while processing each<br />

pixel. This is not the case in the nonlinear oscillators based<br />

processing paradigm, as each pixel is processed independently<br />

without taking its neighborhood into account. For both<br />

paradigms, i.e. CNN and nonlinear oscillators, the input image<br />

serves as an initial condition. Both frameworks offer parallel<br />

image processing, with two differences: (a) CNN<br />

templates appear to be sensitive to the training conditions and<br />

lack adaptivity to dynamic environments; (b) the performance<br />

of the coupled oscillator model is insensitive to<br />

both the quality and the dynamics of the input image.<br />
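The structural points made above (the input image serves as the initial<br />
condition, each pixel is processed independently, and the behavior depends<br />
only on constant coefficients) can be illustrated with a toy example.<br />
The sketch below is NOT the coupled oscillator model (9). It swaps in a<br />
simple bistable equation, dx/dt = gain * x * (1 - x) * (x - threshold),<br />
integrated with explicit Euler steps. The equation, the parameter values and<br />
the function names are all assumptions chosen only for illustration.<br />

```python
def enhance_pixel(x0, threshold=0.5, gain=4.0, dt=0.05, steps=400):
    """Evolve one pixel with a stand-in bistable ODE.

    NOTE: dx/dt = gain*x*(1-x)*(x-threshold) is a toy equation assumed for
    illustration; it is not the coupled oscillator model of the paper.
    """
    x = x0  # the pixel's gray level (in [0, 1]) is the initial condition
    for _ in range(steps):
        # Explicit Euler step with constant coefficients.
        x += dt * gain * x * (1.0 - x) * (x - threshold)
    return x


def enhance_image(image):
    """Process every pixel independently; no neighborhood is consulted."""
    return [[enhance_pixel(p) for p in row] for row in image]


image = [[0.20, 0.45],
         [0.55, 0.80]]
result = enhance_image(image)
```

In this toy setting, gray levels just below and just above the threshold are<br />
driven apart, and the outcome for a pixel is the same whether it is processed<br />
alone or as part of an image.<br />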

In summary, after analyzing the results obtained, we<br />

propose the realization of the coupled nonlinear oscillators<br />

based image processing concept on top of a cellular neural<br />

network processor system. Here, the CNN framework is<br />

viewed as a slave system used to solve, in real time, the<br />

nonlinear ordinary differential equations describing the<br />

coupled nonlinear oscillators’ model. Image processing<br />

based on coupled nonlinear oscillators has demonstrated its<br />

main strength: its processing efficiency is<br />

sensitive neither to the actual image quality nor to its<br />

variations or states, and depends solely on the coefficients of the<br />

nonlinear differential equations describing the coupled<br />

oscillators. The appropriate coefficients/parameters of the<br />

coupled nonlinear oscillators are determined in an offline<br />

bifurcation analysis process, which has been extensively<br />

explained earlier in this paper. These coefficients remain<br />

constant and do not need to be recalculated in real time.<br />



REFERENCES<br />

[1] S. Morfu <strong>and</strong> J. C. Comte, “A nonlinear oscillators<br />

network devoted to image processing,” International<br />

Journal of Bifurcation <strong>and</strong> Chaos, vol. 14, no. 4(2004)<br />

1385-1394.<br />

[2] Naoko Kurata, Hitoshi Mahara, Tatsunari Sakurai,<br />

Atsushi Nomura <strong>and</strong> Hidetoshi Miike, “Image processing<br />

by a coupled non-linear oscillator system,” 23 rd<br />

International Technical Conference on Circuits/<strong>Systems</strong>,<br />

<strong>Computers</strong> <strong>and</strong> Communications (ITC-CSCC 2008).<br />

[3] De Liang Wang <strong>and</strong> David Terman, “Locally excitatory<br />

globally inhibitory oscillator networks,” IEEE<br />

Transactions on Neural Networks, vol. 6, no. 1, January<br />

1995.<br />

[4] Xiuwen Liu <strong>and</strong> DeLiang Wang, “Range image<br />

segmentation using a LEGION network,” IEEE<br />

Transactions on Neural Networks, vol. 10, no. 3, May<br />

1999.<br />

[5] D. L. Wang, “Object selection based on oscillatory<br />

correlation,” Elsevier Transactions on Neural Networks,<br />

vol. 12, pp. 579-592, 1999.<br />

[6] Hiroshi Ando, Takashi Morie, Makoto Nagata <strong>and</strong><br />

Atsushi Iwata, “A non-linear oscillator network for grey<br />

level image segmentation <strong>and</strong> PWM / PPM circuits for its<br />

VLSI implementation,” IEICE Trans. Fundamentals, vol.<br />

E83-A, no. 2, pp. 329-336, February 2000.<br />

[7] Hiroshi Ando, Takashi Morie, Makoto Nagata <strong>and</strong><br />

Atsushi Iwata, “Image segmentation/extraction using nonlinear<br />

cellular networks <strong>and</strong> their VLSI implementation<br />

using pulse-coupled modulation techniques,” IEICE<br />

Trans. Fundamentals, vol. E85-A, no.2, pp. 381-388,<br />

February 2002.<br />

[8] Hidehiro Nakano <strong>and</strong> Toshimichi Saito, “Grouping<br />

synchronization in a pulse-coupled network of chaotic<br />

spiking oscillators,” IEEE Transactions on Neural<br />

Networks, vol 15, no.5, September 2004.<br />

[9] Yakov Kazanovich <strong>and</strong> Roman Borisyuk, “Object<br />

selection by an oscillatory neural network,” Elsevier<br />

Transactions on Biosystems, vol. 67, pp. 103-111, August<br />

2002.<br />

[10] Ke Chen <strong>and</strong> DeLiang Wang, “A dynamically coupled<br />

neural oscillator network for image segmentation,”<br />

Elsevier Transactions on Neural Networks, vol. 15, pp.<br />

423-439, April 2002.<br />

[11] M. Strzelecki, “Texture boundary detection using network<br />

of synchronised oscillators,” IEEE Electronics letters, vol.<br />

40, pp. 466-467, ISSN 0013-5194, April 2004.<br />

[12] Michal Strzelecki, Jacques de Certaines, <strong>and</strong> Suhong Ko,<br />

“Segmentation of 3D MR liver images using<br />

synchronised oscillators networks,” Proceedings of the<br />

2007 IEEE International Symposium on Information<br />

Technology Convergence, ISBN: 0-7695-3045-1, pp.<br />

259-263, 2007.<br />

[13] Balarey Yuri, Cohen Alex<strong>and</strong>er, Johnson Walter <strong>and</strong><br />

Elinson, “Image processing by oscillatory media,”<br />

Proceedings of SPIE-the International Society for Optical<br />

Engineering, vol. 2430, pp. 198-207, 1994.<br />

[14] R. Forke, Dirk Scheibner, Wolfram Dötzel <strong>and</strong> Jan<br />

Mehner, “Measurement unit for tunable low frequency<br />

vibration detection with MEMS force coupled<br />

oscillators,” Elsevier Transactions on Sensors <strong>and</strong><br />

Actuators, vol. 156, pp. 59-65, 2009.<br />

[15] R. Sepulchre, Derek Paley <strong>and</strong> Naomi Leonard, Lecture<br />

notes in control <strong>and</strong> information sciences, vol. 309/2004,<br />

pp:189-205. ISBN:978-3-540-22861-5, ISSN:0170-8643,<br />

Springer Berlin/Heidelberg, November 2004.<br />

[16] James F. Buckwalter, Aydin Babakhani, Abbas Komijani<br />

and Ali Hajimiri, “An integrated subharmonic coupled-oscillator<br />

scheme for a 60-GHz phased-array transmitter,”<br />

IEEE Transactions on Microwave Theory and<br />

Techniques, vol. 54, no. 12, pp. 4271-4280, December<br />

2006.<br />

[17] G. W. Wei <strong>and</strong> Y. Q. Jia, “Synchronization-based image<br />

edge detection,“ Europhysics letters, vol. 59, pp.814-819,<br />

2002.<br />

[18] J. C. Chedjou, On the analysis of nonlinear<br />

electromechanical systems with applications, Shaker<br />

Verlag, ISBN 978-3-8322-3750, 2005.<br />

[19] J. C. Chedjou, H. B. Fotsin, P. Woafo, <strong>and</strong> S. Domngang,<br />

“Analog simulation of the dynamics of a van der Pol<br />

oscillator coupled to a Duffing oscillator,” IEEE<br />

Transactions on Circuits <strong>and</strong> <strong>Systems</strong>-I, vol. 48, no. 06,<br />

pp. 748-757, 2001.<br />

[20] J. C. Chedjou, P. Woafo, and S. Domngang, “Shilnikov<br />

chaos and dynamics of a self-sustained electromechanical<br />

transducer,” ASME Transactions on Vibration and<br />

Acoustics, vol. 123, pp. 170-174, 2001.<br />

[21] J. C. Chedjou, K. Kyamakya, I. Moussa, H. P.<br />

Kuchenbecker, and W. Mathis, “Behavior of a self-sustained<br />

electromechanical transducer and routes to<br />

chaos,” ASME Transactions on Vibration and Acoustics,<br />

vol. 128, pp. 183-192, 2006.<br />

[22] Márton Csapodi and Tamás Roska, “Adaptive histogram<br />

equalization with cellular neural network,” CNNA’96:<br />

Fourth IEEE International Workshop on Cellular Neural<br />

Networks and their Applications, Seville, Spain, June 24-26,<br />

1996.<br />

[23] Mátyás Brendel and Tamás Roska, “Adaptive image<br />

sensing and enhancement using the adaptive cellular<br />

neural network universal machine,” Proceedings of the 6th<br />

IEEE International Workshop on Cellular Neural<br />

Networks and their Applications, Catania, Italy, May 23-25,<br />

2000.<br />

[24] A. Gacsadi, C. Grava and A. Grava, “Medical image<br />

enhancement by using cellular neural networks,”<br />

Proceedings of the IEEE International Conference on<br />

Computers in Cardiology, Lyon, France, Sep 25-28,<br />

2005.<br />

[25] Masaru Nakano <strong>and</strong> Yoshifumi Nishio, “A method of<br />

edge detection using small world cellular neural<br />

network”, International Symposium on Nonlinear Theory<br />

<strong>and</strong> its Applications, NOLTA’07, Vancouver, Canada,<br />

Sep 16-19, 2007.<br />

[26] D. L. Vilarino, D. Cabello, X. M. Pardo <strong>and</strong> V. M. Bera,<br />

“Cellular neural networks <strong>and</strong> active contours: A tool for<br />

image segmentation”, Transaction of Elsevier on Image<br />

<strong>and</strong> Vision Computing, vol. 21, pp. 189-204, 2003.



[27] Yao Li, Liu Jiamin, Xie Yonggui <strong>and</strong> Pei Liuqing,<br />

“Medical image segmentation based on cellular neural<br />

networks”, Science in China series F: information<br />

sciences, ISSN: 1009-2757(print) 1862-2836(online), vol.<br />

44, pp. 68-72, 2007.<br />

[28] J.C. Chedjou, K. Kyamakya, U.A. Khan, M.A. Latif,<br />

“CNN-based real-time computational engineering,”<br />

Proceedings of CNNA 2010, February 3-5, 2010,<br />

Berkeley, California – USA.<br />

[29] Taraglio S., Zanela A., “Cellular neural networks: a<br />

genetic algorithm for parameters optimization in artificial<br />

vision applications,” Proceedings of the 4th IEEE<br />

International Workshop on Cellular Neural Networks and<br />

their Applications, pp. 315-320, ISBN: 0-7803-3261-X,<br />

Spain, 1996.<br />

[30] M. Zamparelli, “Genetically trained cellular neural<br />

networks,” Transactions of Elsevier on neural networks,<br />

vol. 10, pp. 1143-1151, 1997.<br />

[31] Samuel Xavier-de-Souza, Mustak E. Yalcin,<br />

Joos Vandewalle, Johan A. K. Suykens,<br />

“Automatic chip-specific CNN template optimization<br />

using adaptive simulated annealing,” Proceedings of the<br />

European Conference on Circuit Theory and Design<br />

(ECCTD ‘03).<br />

[32] Brett Ch<strong>and</strong>ler, Csaba Rekeczky, Yoshifumi Nishio, Akio<br />

Ushida, “Adaptive simulated annealing in CNN template<br />

learning,” IEICE Trans. Fundamentals, vol. E82-A, no.<br />

02, February 1999.<br />

[33] H. L. Wei, S. A. Billings, “Generalized cellular neural<br />

networks constructed using particle swarm optimization<br />

for spatio-temporal evolutionary pattern identification,”<br />

International journal of Bifurcation <strong>and</strong> Chaos, vol. 18,<br />

pp. 3611-3624, 2008.<br />

[34] Te-Jen Su, Tzu-Hsiang Lin, Jia-Wei Liu, “Particle swarm<br />

optimization for gray scale image noise cancellation,”<br />

Proceedings of the 4 th IEEE International Conference on<br />

<strong>Intelligent</strong> Information hiding <strong>and</strong> Multimedia Signal<br />

Processing, Harbin-China, 2008.<br />

Kyandoghere Kyamakya obtained the<br />

M.S. in Electrical Engineering in 1990<br />

at the University of Kinshasa. In 1999<br />

he received his Doctorate in Electrical<br />

Engineering at the University of Hagen<br />

in Germany. He then worked for three<br />

years as a post-doctoral researcher at the<br />

Leibniz University of Hannover in the<br />

field of Mobility Management in<br />

Wireless Networks. From 2002 to 2005<br />

he was junior professor for Positioning Location Based<br />

Services at Leibniz University of Hannover. Since 2005 he has been<br />

full Professor for Transportation Informatics and Director of<br />

the Institute for Smart Systems Technologies at the University<br />

of Klagenfurt in Austria.<br />

Jean Chamberlain Chedjou received<br />

in 2004 his doctorate in Electrical<br />

Engineering at the Leibniz University<br />

of Hanover, Germany. He has been a<br />

DAAD (Germany) scholar <strong>and</strong> also an<br />

AUF research Fellow (Postdoc.). From<br />

2000 to date he has been a Junior<br />

Associate researcher in the Condensed<br />

Matter section of the ICTP (Abdus<br />

Salam International Centre for Theoretical Physics) Trieste,<br />

Italy. Currently, he is a senior researcher at the Institute for<br />

Smart <strong>Systems</strong> Technologies of the Alpen-Adria University of<br />

Klagenfurt in Austria. His research interests include<br />

Electronics Circuits Engineering, Chaos Theory, Analog<br />

<strong>Systems</strong> Simulation, Cellular Neural Networks, Nonlinear<br />

Dynamics, Synchronization <strong>and</strong> related Applications in<br />

Engineering. He has authored and co-authored 3 books and<br />

more than 40 journal and conference papers.<br />

Cyrille Kalenga Wa Ngoy obtained the ‘Ir. Civil’ degree in<br />

Electrical Engineering at the University of Kinshasa. He has<br />

been an Assistant at the same University, in the<br />

Department of Electrical and Computer Engineering, for about ten years.<br />

Michel Matalatala Tamasala obtained the ‘Ir. Civil’ degree<br />

in Electrical Engineering at the University of Kinshasa. He has<br />

been an Assistant at the same University, in the<br />

Department of Electrical and Computer Engineering, for about four years.<br />



Common-neighbor Monitoring Enhanced<br />

Cooperation Enforcement Scheme for MANETs<br />

JianLi GUO, HongWei LIU, <strong>and</strong> XiaoZong YANG<br />

Abstract—Ad hoc networks are distributed, self-organized<br />

wireless networks. By their nature, it is easy for selfish nodes<br />

to save their energy by not forwarding packets. The existence of<br />

selfish nodes can degrade the network performance severely. A<br />

new cooperation enforcement scheme called CMC is proposed<br />

to mitigate this problem. The common-neighbor monitoring<br />

technique is introduced, with whose help the watchdog can<br />

monitor all packets transmitted around it. The system can<br />

detect non-cooperative nodes quickly and easily. In the<br />

route discovery phase, the control messages that contain non-cooperative<br />

nodes are dropped, decreasing the probability of a<br />

well-behaved node using a bad route for data transmission. The<br />

ns-2 simulation results indicate that CMC can improve the<br />

throughput of well-behaved nodes by 10%-40% in the presence<br />

of 10%-60% non-cooperative nodes.<br />

Index Terms—Mobile Ad Hoc networks, selfish node, reputation,<br />

cooperation.<br />

I. INTRODUCTION<br />

A MOBILE ad hoc network [1], [2] is a multi-hop temporary<br />

autonomous system composed of a group<br />

of mobile nodes. In this environment, the transmission range<br />

of each node is limited to a small area, so two mobile<br />

nodes that are geographically distant rely on other nodes<br />

to forward their packets in order to communicate. At present, the mature<br />

routing protocols for mobile ad hoc networks, such as DSR [3]<br />

and AODV [4], all assume that nodes are cooperative and<br />

willing to forward data for other nodes. In recent years, with<br />

the development of hardware technology, all kinds of civilian<br />

ad hoc networks, such as temporary wireless networks in<br />

classrooms, have appeared. In these networks, the nodes<br />

belong to different individuals or organizations,<br />

they have no common purpose, and cooperation among nodes<br />

cannot be guaranteed. In a mobile ad hoc network, where nodes<br />

are typically battery powered, energy is very valuable, and<br />

the wireless interface consumes a substantial share of it (more<br />

than 40%) [2], [5]. In order to save energy, selfish nodes may<br />

discard packets that need to be forwarded, thus showing non-cooperative<br />

behavior. In [6], the authors proved by means of game<br />

theory that, in mobile ad hoc networks,<br />

spontaneous cooperation does not exist and that external mechanisms<br />

ensuring cooperation among nodes are required.<br />

In a mobile ad hoc network, even a small number of<br />

nodes showing non-cooperative behavior may have a<br />

great impact on network performance. Reference [7] furthermore<br />

pointed out that, if 10%-40% of the nodes in the network<br />

are selfish, the performance of the entire network drops by<br />

16%-32%.<br />

Manuscript received April 9, 2009. This work was supported in part by<br />

the Hi-Tech Research and Development Program (863) of China under grant<br />

No. 2008AA01A201 and the National Natural Science Foundation of China<br />

under grant No. 60503015.<br />

JianLi GUO is with the School of Computer Science and Technology, Harbin<br />

Institute of Technology, Harbin, China, 150001 email: gjl@ftcl.hit.edu.cn.<br />

HongWei LIU is with the School of Computer Science and Technology,<br />

Harbin Institute of Technology, Harbin, China, 150001 email:<br />

lhw@ftcl.hit.edu.cn.<br />

XiaoZong YANG is with the School of Computer Science and Technology,<br />

Harbin Institute of Technology, Harbin, China, 150001 email:<br />

yxz@ftcl.hit.edu.cn.<br />

For the node cooperation problem in mobile ad hoc networks,<br />

researchers have proposed many solutions [8], [9],<br />

which fall mainly into two categories: virtual currency based<br />

schemes and reputation based schemes. In the virtual currency<br />

based schemes [5], [10], [11], [12], [13], nodes that<br />

forward packets for other nodes are compensated with<br />

virtual currency to motivate them to cooperate. However, these<br />

schemes have several drawbacks: they need special<br />

hardware [10] or a central server [5], [11], [12], [13], which violates<br />

the distributed nature of ad hoc networks; nodes located<br />

at the edge of the network have few opportunities to forward<br />

packets for other nodes, cannot earn enough currency, and<br />

may therefore starve [10]; and in<br />

order to calculate the optimal compensation [11], [12], [13],<br />

nodes need to exchange substantial information, introducing<br />

a large number of control packets into the network. These<br />

drawbacks limit their application in ad hoc networks.<br />

In the reputation based schemes, each node is given a reputation<br />

value [14] according to its behavior, and the selfish ones<br />

are punished. Reference [7] first proposed the use of a watchdog<br />

to detect selfish nodes. After that, [15], [16] additionally<br />

used second-hand information to compute reputation<br />

values in order to speed up detection, while a<br />

Bayesian statistical method was used to resist<br />

rumor-spreading attacks. Reference [17] focused on the security problems in<br />

the process of calculating the reputation value and proposed a<br />

secure scheme named SORI. References [18], [19] pointed out<br />

that the detection techniques based on the watchdog were not<br />

accurate enough, and put forward a two-hop ACK detection<br />

method that could detect selfish nodes more accurately. However,<br />

this approach introduced a substantial number of ACK messages,<br />

which consumed considerable network bandwidth and made the network<br />

more vulnerable to congestion.<br />

References [14], [20] analyzed and summarized the calculation<br />

methods for reputation values, and pointed out that the<br />

use of second-hand information leads to several disadvantages:<br />

each node needs to store reputation values for every node in<br />

the network, occupying substantial storage space; the dissemination<br />

of second-hand information among nodes uses up a lot<br />

of network bandwidth; each time a node receives second-hand<br />

information, it needs to recalculate the reputation values for<br />

all nodes in the network, taking up considerable CPU resources;<br />

and the scheme becomes more vulnerable to attack. For these reasons, reputation<br />

calculation based on first-hand information is<br />

more applicable to ad hoc networks. However, the<br />

first-hand method also has one drawback:<br />

the detection of selfish nodes is slower, and<br />

the non-cooperative nodes cannot be separated from the<br />

network quickly and effectively.<br />

In this paper, we propose the CMC scheme (Common-neighbor<br />

Monitoring enabled Cooperation enforcement<br />

scheme), which uses the common-neighbor monitoring<br />

technique to speed up the detection of non-cooperative<br />

nodes. At the same time, the control messages<br />

(RREQs or RREPs) are filtered by the CMC scheme,<br />

which discards the control messages that contain<br />

non-cooperative nodes, so that the routes chosen by nodes<br />

bypass the non-cooperative nodes as far as possible,<br />

thereby improving the network performance.<br />

II. CMC SCHEME<br />

A. The Structure of CMC<br />

CMC is a reputation based cooperation scheme<br />

in which direct (first-hand) information was used to<br />

calculate the reputation value of each node. Like all reputation<br />

based schemes [7], [15], [16], [17], [18], [19], [20], CMC also<br />

assumed that the non-cooperative nodes took part in the route<br />

discovery phase but, in the forwarding phase, might discard<br />

packets in order to save their own resources (such<br />

as energy).<br />

CMC was based on the DSR [3] routing protocol and was<br />

located between the network layer and the MAC layer. It included<br />

five components: Watchdog, Filter, Neighbor Manager,<br />

Reputation Manager and the Second Chance Mechanism,<br />

as shown in Fig. 1. Among them, the Neighbor Manager<br />

was responsible for maintaining a list of neighbors, as well<br />

as periodically sending Hello messages; the Watchdog was<br />

responsible for eavesdropping on the channel, and passed the monitoring<br />

results to the Reputation Manager; the Reputation<br />

Manager calculated the reputation value of each node and<br />

added the non-cooperative nodes to the Black-list; the Filter<br />

had two functions: one was to punish the non-cooperative<br />

nodes, and the other was to filter the routing control messages,<br />

suppressing the routes containing non-cooperative nodes.<br />

The functions and code of the DSR protocol remained unchanged, except that the FindRoute() function was rewritten. FindRoute() searched the route cache with reference to the Black-list and returned only routes that contained no black-listed nodes. When a node had data to send, it first called FindRoute() to look for an available route in the route cache. On a hit, that route was used to send the data; otherwise, the route discovery phase was restarted to search for new routes.
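The route lookup described above can be sketched as follows. This is a minimal illustration, not the authors' ns-2 code: the name `find_route`, the representation of a route as a tuple of node IDs, and the preference for the shortest surviving route are all assumptions made for the example.

```python
from typing import Iterable, List, Optional, Sequence, Set

Route = Sequence[str]  # ordered node IDs from source to destination (assumed representation)


def find_route(route_cache: Iterable[Route], destination: str,
               black_list: Set[str]) -> Optional[Route]:
    """Return a cached route to `destination` that contains no black-listed node.

    Returns None on a cache miss, in which case the caller would restart
    route discovery.
    """
    candidates: List[Route] = [
        route for route in route_cache
        if route and route[-1] == destination
        and not any(node in black_list for node in route)
    ]
    # Assumption: prefer the fewest hops; the paper does not state a tie-break.
    return min(candidates, key=len, default=None)


cache = [("S", "A", "D"), ("S", "B", "C", "D")]
print(find_route(cache, "D", {"A"}))       # ('S', 'B', 'C', 'D')
print(find_route(cache, "D", {"A", "C"}))  # None
```

A `None` result corresponds to the cache miss in the text, after which new routes have to be discovered.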

Packets generated by the DSR protocol were first processed by the Watchdog and then handed down to the MAC layer to be sent. Each node was set to promiscuous mode, and packets received by the MAC layer were handed up to the Watchdog. Only packets that were addressed to this node or had to be forwarded by it could pass the Watchdog. Those packets were then handed up to the Filter module and finally reached the DSR protocol for processing.

Fig. 1. The architecture of the CMC scheme

B. Neighbor Manager<br />

The neighbor list recorded the currently active neighbor nodes. Each neighbor corresponded to one entry in the list, which stored the node’s ID, its rating value and a timeout value. The rating value represented the reputation of the neighbor and was initialized to 0.

The Neighbor Manager maintained the neighbor list. The interface, set to promiscuous mode, monitored the channel and sent a copy of every packet it received to the Neighbor Manager. The Neighbor Manager extracted the node ID from the packet and looked it up in the neighbor list. If the ID was found, the corresponding timeout value was updated; otherwise, the sender was treated as a new neighbor and its ID was added to the list. If the timeout of an entry expired (10 s in our experiment), the corresponding node was assumed to have moved out of transmission range and the entry was deleted. If the node itself had not sent any data for a period TNeib (3 s in our experiment), its Neighbor Manager broadcast a Hello message so that neighboring nodes would not delete it from their neighbor lists.
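The bookkeeping performed by the Neighbor Manager can be sketched as below. The class and method names are hypothetical and time is passed in explicitly as a number of seconds; only the 10 s entry timeout, the 3 s TNeib interval and the initial rating of 0 come from the text.

```python
from typing import Dict

NEIGHBOR_TIMEOUT = 10.0  # seconds; entry lifetime reported in the text
T_NEIB = 3.0             # seconds of silence before a Hello message is due


class NeighborManager:
    """Hypothetical sketch of the neighbor list maintenance."""

    def __init__(self) -> None:
        # node ID -> {"rating": reputation value, "expires": absolute expiry time}
        self.neighbors: Dict[str, Dict[str, float]] = {}
        self.last_sent = 0.0

    def on_packet_overheard(self, sender_id: str, now: float) -> None:
        # Unknown senders become new neighbors with rating 0; known ones are refreshed.
        entry = self.neighbors.setdefault(sender_id, {"rating": 0.0, "expires": 0.0})
        entry["expires"] = now + NEIGHBOR_TIMEOUT

    def on_packet_sent(self, now: float) -> None:
        self.last_sent = now

    def purge_expired(self, now: float) -> None:
        # Drop neighbors assumed to have moved out of transmission range.
        self.neighbors = {nid: e for nid, e in self.neighbors.items()
                          if e["expires"] > now}

    def hello_due(self, now: float) -> bool:
        # True when this node has been silent for at least T_NEIB seconds.
        return now - self.last_sent >= T_NEIB
```

In a simulator, `purge_expired` and `hello_due` would typically be driven by a periodic timer; how often CMC checks them is not stated in the paper.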

C. Watchdog<br />

The Watchdog monitored nodes in the neighborhood to observe whether they forwarded packets and whether they modified the packet contents. It maintained one data structure, the packet monitoring buffer, which stored the packets that needed to be monitored. Each packet corresponded to one entry, composed of the content of the packet, the ID of the node expected to forward it, and a timeout value.

Packets sent by the routing layer (the DSR protocol) were first processed by the Watchdog. Whenever the next hop of a packet was not its destination node, a copy of the packet was placed in the packet monitoring buffer.

Packets received by the interface were handed up to the MAC layer, which delivered them to the Watchdog. Each time the Watchdog received a packet, it searched for that packet in the packet monitoring buffer. If the packet was found and had not been tampered with, a positive event was sent to the Reputation Manager, which increased the rating value of the forwarding node. The Watchdog then checked the next-hop field in the packet header. If it matched the address of this node, the packet was handed up to the Filter for further processing; otherwise, the packet was not addressed to this node and was discarded. If a packet in the monitoring buffer timed out before the Watchdog had observed it being forwarded, a negative event was sent to the Reputation Manager to reduce the rating value of the forwarding node.
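The monitoring buffer and its three triggers (packet sent, packet overheard, timeout) can be sketched as follows. All names are hypothetical, the `report` callback stands in for the interface to the Reputation Manager, and the timeout is a parameter because the paper gives no value. The sketch also assumes that a tampered copy is simply ignored, so that the entry later expires and produces a negative event; the paper does not spell out this case.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class MonitoredPacket:
    payload: bytes    # packet content expected to be forwarded unchanged
    forwarder: str    # ID of the node expected to forward the packet
    expires: float    # time at which the Watchdog stops waiting


class Watchdog:
    def __init__(self, report: Callable[[str, bool], None], timeout: float) -> None:
        self.report = report    # hypothetical hook: report(node_id, positive_event)
        self.timeout = timeout  # assumed parameter; the text gives no value
        self.buffer: Dict[int, MonitoredPacket] = {}

    def on_send(self, packet_id: int, payload: bytes, next_hop: str,
                destination: str, now: float) -> None:
        # Monitor only packets that the next hop must forward further.
        if next_hop != destination:
            self.buffer[packet_id] = MonitoredPacket(payload, next_hop, now + self.timeout)

    def on_overhear(self, packet_id: int, payload: bytes, sender: str) -> None:
        item = self.buffer.get(packet_id)
        if item is not None and sender == item.forwarder and payload == item.payload:
            del self.buffer[packet_id]
            self.report(sender, True)   # forwarded untampered: positive event

    def on_tick(self, now: float) -> None:
        expired = [pid for pid, item in self.buffer.items() if item.expires <= now]
        for pid in expired:
            self.report(self.buffer.pop(pid).forwarder, False)  # negative event
```

Matching packets by an integer `packet_id` is a simplification; a real implementation would need some stable identifier taken from the packet header.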

D. Common-neighbor Monitoring<br />

From the description of the Watchdog in the previous section, only nodes located on the route can watch the forwarding behavior of the next-hop node. On closer study of the network topology, we found that nodes at certain other positions can also observe that behavior, so the common-neighbor monitoring technique was introduced. Each time the Watchdog captured a packet whose current forwarding node and next forwarding node were both in the neighbor list, the packet was placed in the packet monitoring buffer and watched by the Watchdog.

Fig. 2. Common-neighbor monitoring technique<br />

As shown in Fig. 2, node S sends data to node D through nodes A, B and C, and nodes M and N are both within the transmission range of node B and node C. When node B sends a packet to node C, nodes M and N can capture the packet and find that its current forwarding node (node B) and next forwarding node (node C) are both their own neighbors. Nodes M and N therefore put this packet into their packet monitoring buffers and watch whether node C forwards it.
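The common-neighbor test reduces to a membership check on the neighbor list. The sketch below uses a hypothetical function name and assumes, for the sake of the example, that node M in Fig. 2 has exactly nodes B and C as neighbors.

```python
from typing import Set


def should_monitor(current_forwarder: str, next_forwarder: str,
                   neighbor_list: Set[str]) -> bool:
    """A bystander monitors a packet only if it can hear both ends of the hop."""
    return current_forwarder in neighbor_list and next_forwarder in neighbor_list


# Assumed neighbor list for node M in Fig. 2.
neighbors_of_m = {"B", "C"}
print(should_monitor("B", "C", neighbors_of_m))  # True
print(should_monitor("A", "B", neighbors_of_m))  # False
```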

To punish non-cooperative nodes, the Filter discarded all packets originating from them. Suppose that, in Fig. 2, the source node S is a non-cooperative node and has been detected by node A; as punishment, node A discards all packets sent by node S. To prevent the Watchdog from regarding this kind of punishment as non-cooperation, we require that the Watchdog does not monitor the forwarding behavior of the first relaying node on a route. That is, node P will not monitor the forwarding behavior of node A in Fig. 2.

E. Reputation Management<br />

Reputation Management was responsible for updating each node’s reputation value. A good reputation system [14] should have the following characteristics: the reputation value accurately reflects the behavior of the node; a node’s recent actions have a greater impact on the reputation value than its past behavior; and non-cooperative nodes can be diagnosed quickly. The Reputation Manager therefore calculated the reputation values of nodes according to equation (1):

$$
R_{new} =
\begin{cases}
0.95 \times R_{old} + 0.05 & \text{if positive event} \\
0.90 \times R_{old} - 0.1 & \text{if negative event}
\end{cases}
\tag{1}
$$

If the Reputation Manager received a positive event, the new reputation value of the node was the discounted old value (multiplied by 0.95) plus 0.05. If it received a negative event, the new reputation value was the discounted old value (multiplied by 0.90) minus 0.1. Finally, if the reputation value of a node fell below a threshold (-0.5 in our experiment), the node was considered non-cooperative and was put into the Black-list.
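Equation (1) and the threshold test can be worked through numerically. The sketch below uses hypothetical function names; the coefficients and the -0.5 threshold are taken from the text, and the initial rating of 0 from the Neighbor Manager section.

```python
BLACKLIST_THRESHOLD = -0.5  # threshold used in the paper's experiments


def update_reputation(r_old: float, positive: bool) -> float:
    """Apply equation (1) to one observed event."""
    if positive:
        return 0.95 * r_old + 0.05
    return 0.90 * r_old - 0.1


def negatives_until_blacklisted(r_start: float = 0.0) -> int:
    """Count the consecutive negative events needed to fall below the threshold."""
    r, count = r_start, 0
    while r >= BLACKLIST_THRESHOLD:
        r = update_reputation(r, positive=False)
        count += 1
    return count


print(negatives_until_blacklisted())  # 7
```

Starting from 0, seven consecutive negative events are needed, since the value after n negative events is -(1 - 0.9^n). The two update rules have fixed points at +1 and -1 respectively, so a rating that starts at 0 stays within [-1, 1], and negative events move it faster than positive ones.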

F. Filter<br />

The Filter screened passing routing control messages (RREQs or RREPs) and data packets against the Black-list. If a routing control message contained a node in the Black-list, the route being discovered was considered "bad" and the control message was discarded. The Filter also checked all passing data packets: as punishment for non-cooperative nodes, every packet whose source node was in the Black-list was dropped. See Table I for the detailed algorithm.

In the route discovery phase, all nodes located between the source and destination nodes performed this route filtering. The routes found by the source node could therefore bypass non-cooperative nodes as far as possible, increasing the success rate of data delivery.

TABLE I
FILTER ALGORITHM
Receive a packet;
if RREP or RREQ, and contains nodes in black-list then
    Suppress this packet, and return;
else if data packet, and source node in black-list then
    Suppress this packet, and return;
end if
Hand this packet to route layer;
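Table I can be expressed as a small function. This is a sketch under assumptions: the `Packet` fields and the boolean return convention (True means "hand to the routing layer") are hypothetical and not taken from the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Packet:
    kind: str      # "RREQ", "RREP" or "DATA"
    source: str    # ID of the originating node
    route: List[str] = field(default_factory=list)  # node IDs recorded in the packet


def filter_packet(packet: Packet, black_list: Set[str]) -> bool:
    """Return True if the packet should be handed to the routing layer."""
    if packet.kind in ("RREQ", "RREP") and any(n in black_list for n in packet.route):
        return False   # suppress routes through black-listed nodes
    if packet.kind == "DATA" and packet.source in black_list:
        return False   # punish black-listed sources
    return True
```

As in Table I, a data packet is dropped only when its source is black-listed; a black-listed relay in a data packet's route does not cause it to be suppressed here.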


G. Second Chance Mechanism<br />

References [7], [19], [20] point out that several factors can affect the detection results of the Watchdog, such as signal collisions, network congestion and temporary link failures, which may lead to cooperative nodes being wrongly marked as non-cooperative. (In our experiments, we also found that network congestion was likely under heavy network load, and that cooperative nodes could then appear in the Black-list.) In addition, nodes that were detected as non-cooperative early on may behave cooperatively later. To give nodes that had been isolated from the network a "rehabilitative" opportunity to rejoin it, CMC introduced the second chance mechanism. After a fixed period of time, nodes were released from the Black-list, but their reputation values were not reset to 0; they kept their current values. If these nodes behaved non-cooperatively again, they could be quickly re-added to the Black-list.
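One way to realize the timed release is to store the time at which each node was black-listed. The sketch below is hypothetical: the class name, the `release_after` parameter and its unit are assumptions, since the paper only speaks of "a fixed period of time".

```python
from typing import Dict


class BlackList:
    def __init__(self, release_after: float) -> None:
        self.release_after = release_after      # assumed fixed period, in seconds
        self.listed_at: Dict[str, float] = {}   # node ID -> time it was black-listed

    def add(self, node_id: str, now: float) -> None:
        self.listed_at[node_id] = now

    def release_expired(self, now: float) -> None:
        # Reputation values live elsewhere and are deliberately left untouched,
        # so a released node that misbehaves again is re-listed quickly.
        self.listed_at = {n: t for n, t in self.listed_at.items()
                          if now - t < self.release_after}

    def __contains__(self, node_id: str) -> bool:
        return node_id in self.listed_at
```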

III. SIMULATION AND RESULTS ANALYSIS<br />

In this section, ns-2 [21] was used to verify the CMC scheme and observe its impact on network performance. The setdest tool from CMU was used to generate node movement scenarios. The traffic generation tool from CMU did not meet our requirements and was modified so that: the source and destination nodes of each connection are randomly distributed in the network; the start time of each connection is uniformly distributed in [0 s, 1000 s]; and the duration of each connection is uniformly distributed in [50 s, 100 s].
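The three requirements on the traffic pattern can be illustrated with a short generator. This is not the modified CMU tool; the function name, the `Connection` record and the use of a seeded generator are assumptions, and the sketch also assumes that source and destination are distinct nodes.

```python
import random
from typing import List, NamedTuple


class Connection(NamedTuple):
    source: int
    destination: int
    start: float     # seconds
    duration: float  # seconds


def generate_connections(num_nodes: int, num_connections: int,
                         seed: int = 0) -> List[Connection]:
    rng = random.Random(seed)
    connections = []
    for _ in range(num_connections):
        # Two distinct nodes chosen uniformly at random.
        source, destination = rng.sample(range(num_nodes), 2)
        connections.append(Connection(source, destination,
                                      start=rng.uniform(0.0, 1000.0),
                                      duration=rng.uniform(50.0, 100.0)))
    return connections


connections = generate_connections(num_nodes=50, num_connections=50)
print(len(connections))  # 50
```

A connection that starts late in the run would be cut off by the 1000 s simulation limit; the paper does not say how such connections were treated.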

The basic simulation parameters are shown in Table II. The sending and receiving transmission ranges are 250 m, the maximum transmission rate is 2 Mbit/s, the simulation duration is 1000 seconds, the traffic type is CBR, and the size of each CBR packet is 512 bytes. The proportion of non-cooperative nodes is varied from 10% to 60%, and its impact on network performance is observed. For each set of parameters, the simulation is run 10 times and the averaged result is used.

TABLE II
BASIC PARAMETERS FOR SIMULATION
Parameter            Value
Simulation time      1000 s
Transmission range   250 m
Receiver range       250 m
Carrier sense range  550 m
Maximum pause time   100 s
Traffic type         CBR
Packet size          512 bytes
CBR rate             5 pkt/s

The following two metrics are used to assess network performance:
• throughput: the ratio of the number of packets that successfully arrived at their destination nodes to the number of packets sent by the source nodes.
• forwarding throughput: the throughput computed after removing the packets that were sent directly to their destination nodes.
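The two metrics defined above can be computed from a per-packet log as follows. The `PacketRecord` layout is hypothetical; in particular, treating a route of one hop as "sent directly to the destination" is an assumption about how the authors classified packets.

```python
from typing import Iterable, NamedTuple


class PacketRecord(NamedTuple):
    hops: int         # hops on the route used by the source (1 = sent directly)
    delivered: bool   # True if the packet reached its destination


def throughput(records: Iterable[PacketRecord]) -> float:
    records = list(records)
    return sum(r.delivered for r in records) / len(records) if records else 0.0


def forwarding_throughput(records: Iterable[PacketRecord]) -> float:
    # Drop packets sent directly to the destination, then apply the same ratio.
    return throughput(r for r in records if r.hops > 1)


log = [PacketRecord(1, True), PacketRecord(1, True),
       PacketRecord(3, True), PacketRecord(2, False)]
print(throughput(log))             # 0.75
print(forwarding_throughput(log))  # 0.5
```

Because direct packets never depend on a relay, forwarding throughput isolates the effect of non-cooperative nodes more sharply than plain throughput does.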

Fig. 3. Throughput in static network (670 × 670 m², 50 nodes). Vertical axis: throughput (0.0 to 1.0); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

Fig. 4. Forwarding throughput in static network (670 × 670 m², 50 nodes). Vertical axis: throughput (0.0 to 1.0); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

First, the impact of non-cooperative nodes on throughput in a static network is studied. The simulation region is 670 × 670 m², 50 nodes are randomly distributed in the region, and there are 50 CBR connections. The simulation results are shown in Figs. 3 to 5, where the curve named defenseless means that no cooperation scheme is used and the curve named pathrater denotes the scheme proposed in [7]. As can be seen in Fig. 3, all three curves descend as the number of non-cooperative nodes in the network increases. The throughput of the CMC scheme shows no obvious improvement over that of the pathrater scheme, although CMC shows a slight advantage once the proportion of non-cooperative nodes exceeds 50%. The main reason is that, in a static network, the pathrater scheme will eventually detect all the non-cooperative nodes (only more slowly), and the routes subsequently chosen by source nodes can bypass them.

Fig. 5. Average route length in static network (670 × 670 m², 50 nodes). Vertical axis: average route length (2.0 to 2.4); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

Forwarding throughput is shown in Fig. 4. As the proportion of non-cooperative nodes increases, the defenseless curve declines sharply, which indicates that non-cooperative nodes have a significant impact on network performance. The figure also shows that the CMC scheme clearly improves forwarding throughput compared with the pathrater scheme.

In the experiments, the average route length of the packets that arrived at their destination nodes was also recorded; the results are shown in Fig. 5. The CMC and pathrater schemes have comparable average route lengths, which are clearly longer than that of the defenseless scheme. The reason is that, in the CMC and pathrater schemes, source nodes with data to send do not simply select the shortest routes but choose routes that bypass the non-cooperative nodes.

Next, throughput in a small dynamic network is studied. The random waypoint model is used for node movement, with a maximum node velocity of 10 m/s and a maximum pause time of 100 s. The simulation region is 670 × 670 m², with 50 nodes and 50 CBR connections. Simulation results are shown in Figs. 6 to 8. When the proportion of non-cooperative nodes in the network is greater than 30%, the CMC scheme is superior to the pathrater scheme. In addition, all three curves are lower in the dynamic network than in the static network. The main reason is that node movement in a dynamic network often breaks links, causing frequent route changes and, in turn, some packet loss.

Fig. 8 shows the average route length of packets in the small dynamic network. The CMC and pathrater schemes have longer average route lengths than the defenseless scheme.

Last, throughput in a large dynamic network is studied. The number of nodes is increased to 100, and the simulation region is enlarged from 670 × 670 m² to 1200 × 1200 m². Simulation results are shown in Figs. 9 to 11. The CMC scheme is clearly superior to the pathrater scheme, mainly because the larger network makes the average route longer. In Fig. 11, the average route length of the CMC scheme is more than 3.6 and that of the pathrater scheme is greater than 3.4, which means that, on average, each packet is relayed by about 2.5 nodes. In the pathrater scheme, source nodes can only detect non-cooperative nodes within one hop. They cannot judge the behavior of nodes two or more hops away, so they are likely to choose routes containing non-cooperative nodes, which decreases the throughput. In the CMC scheme, by contrast, the routing control messages are filtered, which makes the routes chosen by the source nodes more likely to bypass the non-cooperative nodes.

Fig. 6. Throughput in small dynamic network (670 × 670 m², 50 nodes). Vertical axis: throughput (0.0 to 1.0); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

Fig. 7. Forwarding throughput in small dynamic network (670 × 670 m², 50 nodes). Vertical axis: forwarding throughput (0.0 to 1.0); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

Fig. 8. Average route length in small dynamic network (670 × 670 m², 50 nodes). Vertical axis: average route length (1.6 to 2.4); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

Fig. 9. Throughput in large dynamic network (1200 × 1200 m², 100 nodes). Vertical axis: throughput (0.0 to 1.0); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

IV. CONCLUSION AND FUTURE WORK<br />

In this paper, the CMC method was proposed, which can quickly detect non-cooperative nodes in mobile ad hoc networks, isolate them and reduce their impact on network performance. The common-neighbor monitoring technique speeds up the detection of non-cooperative nodes, allowing the system to isolate selfish nodes quickly. In the CMC method, all nodes between the source and destination nodes suppress control messages that contain non-cooperative nodes, further reducing the impact of those nodes on the network. The simulation results showed that, when non-cooperative nodes are present in the network, CMC can significantly improve node throughput and network performance.

Fig. 10. Forwarding throughput in large dynamic network (1200 × 1200 m², 100 nodes). Vertical axis: forwarding throughput (0.0 to 1.0); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

Fig. 11. Average route length in large dynamic network (1200 × 1200 m², 100 nodes). Vertical axis: average route length (2.6 to 4.0); horizontal axis: fraction of misbehaving nodes (10% to 60%); curves: CMC, Pathrater, Defenseless.

As a next step, we will port the code from ns-2 to Linux to study the performance of CMC in a real-life environment. We are also considering the impact that different kinds of end-user applications (e.g. video and audio) have on the performance of CMC.

REFERENCES<br />

[1] I. Chlamtac, M. Conti, and J. Liu, “Mobile ad hoc networking: imperatives and challenges,” Ad Hoc Networks, vol. 1, no. 1, pp. 13–64, 2003.
[2] S. Basagni, M. Conti, S. Giordano, and I. Stojmenovic, Mobile Ad Hoc Networking. New Jersey: Wiley-IEEE Press, 2004.
[3] D. Johnson, D. Maltz, Y. Hu, and J. Jetcheva, “The dynamic source routing protocol for mobile ad hoc networks (DSR),” 2002.
[4] C. Perkins and E. Royer, “Ad-hoc on-demand distance vector routing,” in Proceedings of the 2nd IEEE Workshop on Mobile Computing Systems and Applications, vol. 2, 1999, pp. 90–100.
[5] D. J. G, “Game-theoretic power management in mobile ad hoc networks,” Ph.D. thesis, Carnegie Mellon University, Department of Electrical and Computer Engineering, Pittsburgh, Pennsylvania, Aug. 2004.
[6] M. Felegyhazi, J. Hubaux, and L. Buttyan, “Nash equilibria of packet forwarding strategies in wireless ad hoc networks,” IEEE Transactions on Mobile Computing, vol. 5, no. 5, pp. 463–476, 2006.
[7] S. Marti, T. Giuli, K. Lai, and M. Baker, “Mitigating routing misbehavior in mobile ad hoc networks,” in Proceedings of the 6th Annual International Conference on Mobile Computing and Networking. ACM, New York, NY, USA, 2000, pp. 255–265.
[8] G. Marias, P. Georgiadis, D. Flitzanis, and K. Mandalas, “Cooperation enforcement schemes for MANETs: A survey,” Wireless Communications and Mobile Computing, vol. 6, no. 3, pp. 319–332, 2006.
[9] Y. Yoo and D. Agrawal, “Why does it pay to be selfish in a MANET?” IEEE Wireless Communications, vol. 13, no. 6, pp. 87–97, 2006.
[10] L. Buttyan and J. Hubaux, “Nuglets: a virtual currency to stimulate cooperation in self-organized mobile ad hoc networks,” ICCA, Swiss Federal Institute of Technology, 2001.
[11] L. Anderegg and S. Eidenbenz, “Ad hoc-VCG: a truthful and cost-efficient routing protocol for mobile ad hoc networks with selfish agents,” in Proceedings of the 9th Annual International Conference on Mobile Computing and Networking. ACM, New York, NY, USA, 2003, pp. 245–259.
[12] Y. Wang and M. Singhal, “On improving the efficiency of truthful routing in MANETs with selfish nodes,” Pervasive and Mobile Computing, vol. 3, no. 5, pp. 537–559, 2007.
[13] S. Eidenbenz, G. Resta, and P. Santi, “The COMMIT Protocol for Truthful and Cost-Efficient Routing in Ad Hoc Networks with Selfish Nodes,” IEEE Transactions on Mobile Computing, vol. 7, no. 1, pp. 19–33, 2008.
[14] S. Buchegger, J. Mundinger, and J. Le Boudec, “Reputation Systems for Self-Organized Networks: Lessons Learned,” IEEE Technology & Society Magazine, 2007.
[15] S. Buchegger, “Coping with misbehavior in mobile ad-hoc networks,” Ph.D. dissertation, Ecole Polytechnique Federale de Lausanne, 2004.
[16] S. Buchegger and J. Le Boudec, “Self-policing mobile ad hoc networks by reputation systems,” IEEE Communications Magazine, vol. 43, no. 7, pp. 101–107, 2005.
[17] Q. He, D. Wu, and P. Khosla, “A secure incentive architecture for ad hoc networks,” Wireless Communications and Mobile Computing, vol. 6, no. 3, 2006.
[18] K. Liu, J. Deng, P. Varshney, and K. Balakrishnan, “An acknowledgment-based approach for the detection of routing misbehavior in MANETs,” IEEE Transactions on Mobile Computing, vol. 6, no. 5, pp. 536–550, 2007.
[19] D. Djenouri and N. Badache, “Struggling against selfishness and black hole attacks in MANETs,” Wireless Communications and Mobile Computing, vol. 8, no. 6, 2008.
[20] H. J. Y, “Cooperation in mobile ad hoc networks,” http://www.cs.fsu.edu/research/reports/TR-050111.pdf, January 2005.
[21] K. Fall and K. Varadhan, “The ns manual (formerly ns notes and documentation),” The VINT Project, 2008.

Jianli Guo received BS and MS degrees in computer science and technology from Harbin Institute of Technology (HIT) in 2002 and 2004, respectively, and is now a PhD student at HIT. Research interests include ad hoc networks and wireless sensor networks.

Hongwei Liu is a professor at HIT. His research interests include fault-tolerant computing technology, ad hoc networks and wireless sensor networks.

Xiaozong Yang is a professor at HIT. His research interests include fault-tolerant computing technology, computer architecture, ad hoc networks and wireless sensor networks.



Systemic Risk Assessment using a Non-stationary<br />

Fractional Dynamic Stochastic Model for the<br />

Analysis of Economic Signals<br />

Jonathan M Blackledge, Fellow, IET, Fellow, IoP, Fellow, IMA, Fellow, RSS<br />

Abstract: This paper considers the Fractal Market Hypothesis (FMH) for assessing the risk(s) in developing a financial portfolio based on data that is available through the Internet from an increasing number of sources. Most financial risk management systems are still based on the Efficient Market Hypothesis, which often fails due to the inaccuracies of the statistical models that underpin the hypothesis, in particular the assumption that financial data are based on stationary Gaussian processes. The FMH considered in this paper assumes that financial data are non-stationary and statistically self-affine, so that a risk analysis can, in principle, be applied at any time scale provided there is sufficient data to make the output of an FMH analysis statistically significant. This paper considers a numerical method and an algorithm for accurately computing a parameter, the Fourier dimension, that serves in the assessment of a financial forecast and is applied to data taken from the Dow Jones and FTSE financial indices. A more detailed case study is then presented based on an FMH analysis of Sub-Prime Credit Default Swap Market ABX Indices.

Index Terms: Risk assessment of economy, Risk assessment statistics and numerical data, Fractal Market Hypothesis, FTSE, Dow Jones and ABX index.

I. INTRODUCTION<br />

Attempts to develop stochastic models for financial time series are commonplace in financial mathematics and econometrics in general. Financial time series are essentially digital signals composed of ‘tick data’ that provides traders with daily tick-by-tick data of trade price, trade time, and volume traded, for example, at different sampling rates [1], [2]. Stochastic financial models can be traced back to the early Twentieth Century when Louis Bachelier [3] proposed that fluctuations in the prices of stocks and shares (which appeared to be yesterday’s price plus some random change) could be viewed in terms of random walks in which price changes were entirely independent of each other. Thus, one of the simplest models for price variation is based on the sum of independent random numbers. This is the basis for Brownian motion [4] in which the random numbers are considered to conform to a normal distribution. This model is the basis for the Efficient Market Hypothesis (EMH) which has a number of questionable assumptions as discussed in the following section. In this paper, we consider a method for processing financial time series data based on the Fractal Market Hypothesis. The underlying rationale for this model is discussed and example results are presented to illustrate the ability of the model to provide an improved risk assessment of an economy with regard to predicting the characteristics of an economic time series based on a risk assessment statistic computed from numerical data. A case study is presented that is based on the sub-prime credit default swap market ABX index, which is acknowledged as being one of the principal markets whose collapse triggered the current global recession.

Manuscript completed in December, 2009. The work reported in this paper is supported by the Science Foundation Ireland.

Jonathan Blackledge (jonathan.blackledge@dit.ie) is the Stokes Professor of Digital Signal Processing, School of Electrical Engineering Systems, Faculty of Engineering, Dublin Institute of Technology (http://eleceng.dit.ie/blackledge).

II. BROWNIAN MOTION AND THE EFFICIENT MARKET<br />

HYPOTHESIS<br />

Random walk models, which underpin the so-called Efficient<br />

Market Hypothesis (EMH) [5]-[12], have been the basis<br />

for financial time series analysis since the work of Bachelier<br />

at the turn of the Twentieth Century. Although the Black-Scholes<br />

equation [13], developed in the 1970s for valuing options, is<br />

deterministic (one of the first financial models to achieve determinism),<br />

it is still based on the EMH, i.e. stationary Gaussian<br />

statistics. The EMH is based on the principle that the current<br />

price of an asset fully reflects all available information relevant<br />

to it <strong>and</strong> that new information is immediately incorporated<br />

into the price. Thus, in an efficient market, the modelling<br />

of asset prices is concerned with modelling the arrival of<br />

new information. New information must be independent <strong>and</strong><br />

r<strong>and</strong>om, otherwise it would have been anticipated <strong>and</strong> would<br />

not be new. The arrival of new information can send ‘shocks’<br />

through the market (depending on the significance of the<br />

information) as people react to it <strong>and</strong> then to each other’s<br />

reactions. The EMH assumes that there is a rational <strong>and</strong><br />

unique way to use the available information <strong>and</strong> that all agents<br />

possess this knowledge. Further, the EMH assumes that this<br />

‘chain reaction’ happens effectively instantaneously. These<br />

assumptions are clearly questionable at any <strong>and</strong> all levels of<br />

a complex financial system.<br />

The EMH implies independence of price increments and is<br />

typically characterised by a normal or Gaussian Probability<br />

Density Function (PDF) which is chosen because most price<br />

movements are presumed to be an aggregation of smaller<br />

ones, the sums of independent r<strong>and</strong>om contributions having a<br />

Gaussian PDF. However, it has long been known that financial<br />

time series do not follow r<strong>and</strong>om walks. The shortcomings<br />

of the EMH model include: failure of the independence <strong>and</strong><br />

Gaussian distribution of increments assumption, clustering,<br />

apparent non-stationarity <strong>and</strong> failure to explain momentous



financial events such as ‘crashes’ leading to recession <strong>and</strong>,<br />

in some extreme cases, depression. These limitations have<br />

prompted a new class of methods for investigating time series<br />

obtained from a range of disciplines. For example, Re-scaled<br />

Range Analysis (RSRA), e.g. [14]-[16], which is essentially<br />

based on computing the Hurst exponent [17], is a useful tool<br />

for revealing some well disguised properties of stochastic time<br />

series such as persistence (<strong>and</strong> anti-persistence) characterized<br />

by non-periodic cycles. Non-periodic cycles correspond to<br />

trends that persist for irregular periods but with a degree of<br />

statistical regularity often associated with non-linear dynamical<br />

systems. RSRA is particularly valuable because of its<br />

robustness in the presence of noise. The principal assumption<br />

associated with RSRA is concerned with the self-affine or<br />

fractal nature of the statistical character of a time-series rather<br />

than the statistical ‘signature’ itself. Ralph Elliott first reported<br />

on the fractal properties of financial data in 1938 (e.g. [18] and<br />

references therein). He was the first to observe that segments<br />

of financial time series data of different sizes could be scaled<br />

in such a way that they were statistically the same, producing<br />

so-called Elliott waves.<br />
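As an illustration of the rescaled range analysis referred to above, the following Python sketch estimates a Hurst exponent from the slope of log(R/S) against log(window size).<br />

It is a minimal sketch and not the specific procedure of [14]-[16]: the function names, the window sizes and the Gaussian test series are assumptions made for this example only.<br />

```python
import math
import random


def rescaled_range(x):
    """Return R/S for one window: range of cumulative deviations over std."""
    n = len(x)
    mean = sum(x) / n
    cum, lo, hi = 0.0, 0.0, 0.0
    for v in x:
        cum += v - mean                 # running sum of deviations from the mean
        lo, hi = min(lo, cum), max(hi, cum)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return (hi - lo) / std if std > 0 else 0.0


def hurst_rs(x, window_sizes):
    """Estimate H as the least-squares slope of log(R/S) against log(w)."""
    pts = []
    for w in window_sizes:
        # Average R/S over non-overlapping windows of length w.
        vals = [rescaled_range(x[i:i + w]) for i in range(0, len(x) - w + 1, w)]
        vals = [v for v in vals if v > 0]
        if vals:
            pts.append((math.log(w), math.log(sum(vals) / len(vals))))
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    num = sum((a - mx) * (b - my) for a, b in pts)
    den = sum((a - mx) ** 2 for a, _ in pts)
    return num / den


# Hypothetical test data: independent Gaussian increments (not market data).
rng = random.Random(0)
increments = [rng.gauss(0.0, 1.0) for _ in range(2048)]
h_est = hurst_rs(increments, [8, 16, 32, 64, 128, 256])
print(h_est)
```

The input is a series of increments (e.g. price changes), not the price series itself. For short windows the R/S estimate is known to be biased, so the value obtained for independent data need not equal 0.5 exactly.<br />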

III. RISK ASSESSMENT AND REPEATING ECONOMIC<br />

PATTERNS<br />

A good stochastic financial model should ideally consider<br />

all the observable behaviour of the financial system it is<br />

attempting to model. It should therefore be able to provide<br />

some predictions on the immediate future behaviour of the<br />

system within an appropriate confidence level. Predicting the<br />

markets has become (for obvious reasons) one of the most<br />

important problems in financial engineering. Although, at least<br />

in principle, it might be possible to model the behaviour of<br />

each individual agent operating in a financial market, one<br />

can never be sure of obtaining all the necessary information<br />

required on the agents themselves <strong>and</strong> their modus oper<strong>and</strong>i.<br />

This principle plays an increasingly important role as the<br />

scale of the financial system, for which a model is required,<br />

increases. Thus, while quasi-deterministic models can be of<br />

value in the underst<strong>and</strong>ing of micro-economic systems (with<br />

known ‘operational conditions’), in an ever increasing global<br />

economy (in which the operational conditions associated with<br />

the fiscal policies of a given nation state are increasingly open),<br />

we can take advantage of the scale of the system to describe<br />

its behaviour in terms of functions of r<strong>and</strong>om variables.<br />

A. Elliott Waves<br />

The stochastic nature of financial time series is well known<br />

from the values of the major stock market indices, such as the<br />

FTSE (Financial Times Stock Exchange) in the UK and the Dow<br />

Jones in the US, which are frequently quoted. A principal aim<br />

of investors is to attempt to obtain information that can provide<br />

some confidence in the immediate future of the stock markets<br />

often based on patterns of the past. One of the principal components<br />

of this aim is based on the observation that there are<br />

‘waves within waves’ <strong>and</strong> ‘events within events’ that appear to<br />

permeate financial signals when studied with sufficient detail<br />

<strong>and</strong> imagination. It is these repeating patterns that occupy both<br />

the financial investor <strong>and</strong> the systems modeller alike <strong>and</strong> it is<br />

clear that although economies have undergone many changes<br />

in the last one hundred years, the dynamics of market data<br />

do not appear to change significantly (ignoring scale). For<br />

example, with data obtained from [19], Figure 1 shows the rescaled<br />

signals <strong>and</strong> associated ‘macrotrends’ (i.e. normalised<br />

time series <strong>and</strong> associated time series after application of<br />

a Gaussian lowpass filter) associated with FTSE Close-of-<br />

Day (COD) illustrating the ‘development’ of three different<br />

‘crashes’; those of 1987, 1997 <strong>and</strong> the most recent crash of<br />

2007. The macrotrends are computed by filtering each signal<br />

in Fourier space using a Gaussian lowpass filter $\exp(-\beta\omega^2)$<br />

with $\beta = 0.1$ where $\omega$ is the angular frequency.<br />
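The macrotrend computation described above can be sketched in Python as follows.<br />

This is an illustrative sketch only: the discretisation of the angular frequency (taken here as radians per sample), the naive discrete Fourier transform and the function names are assumptions, since the text does not specify how ω is sampled.<br />

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short series)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * math.pi * m * k / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def macrotrend(signal, beta=0.1):
    """Multiply the spectrum by exp(-beta * omega^2) and transform back."""
    n = len(signal)
    spectrum = dft(signal)
    filtered = []
    for k, coeff in enumerate(spectrum):
        k_signed = k if k <= n // 2 else k - n      # signed frequency index
        omega = 2.0 * math.pi * k_signed / n        # assumed: radians per sample
        filtered.append(coeff * math.exp(-beta * omega ** 2))
    return [v.real for v in dft(filtered, inverse=True)]


def normalise(x):
    """Rescale a series to the closed interval [0, 1]."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x]
```

Applying `normalise(macrotrend(series))` to a close-of-day series gives a smoothed curve with values between 0 and 1. A constant series passes through the filter unchanged because the filter gain at ω = 0 is one.<br />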

Fig. 1. Evolution of the 1987, 1997 and 2007 financial crashes. Normalised<br />

data (left) and macrotrends (right, where the data has been smoothed and<br />

rescaled to values between 0 and 1 inclusive) of the daily FTSE value<br />

(close-of-day) for 02-04-1984 to 24-12-1987 (blue), 05-04-1994 to 24-12-1997<br />

(green) and 02-04-2004 to 24-09-2007 (red).<br />

The similarity in behaviour of these signals is remarkable<br />

<strong>and</strong> clearly indicates a wavelength of approximately 1000<br />

days. This is indicative of the quest to underst<strong>and</strong> economic<br />

signals in terms of some universal phenomenon from which<br />

appropriate (macro) economic models can be generated. In an<br />

efficient market, only the revelation of some dramatic information<br />

can cause a crash, yet post-mortem analysis of crashes<br />

typically fails to (convincingly) tell us what this information<br />

must have been.<br />

One cause of correlations in market price changes (<strong>and</strong><br />

volatility) is mimetic behaviour, known as herding. In general,<br />

market crashes happen when large numbers of agents place sell<br />

orders simultaneously creating an imbalance to the extent that<br />

market makers are unable to absorb the other side without<br />

lowering prices substantially. Most of these agents do not<br />

communicate with each other, nor do they take orders from<br />

a leader. In fact, most of the time they are in disagreement,<br />

<strong>and</strong> submit roughly the same amount of buy <strong>and</strong> sell orders.<br />

This is a healthy non-crash situation; it is a diffusive (random-walk)<br />

process which underlies the EMH <strong>and</strong> financial portfolio<br />

rationalization.<br />

B. Non-equilibrium <strong>Systems</strong><br />

Financial markets can be considered to be non-equilibrium<br />

systems because they are constantly driven by transactions that



occur as the result of new fundamental information about firms<br />

<strong>and</strong> businesses. They are complex systems because the market<br />

also responds to itself, often in a highly non-linear fashion, <strong>and</strong><br />

would carry on doing so (at least for some time) in the absence<br />

of new information. The ‘price change field’ is highly nonlinear<br />

<strong>and</strong> very sensitive to exogenous shocks <strong>and</strong> it is probable<br />

that all shocks have a long term effect. Market transactions<br />

generally occur globally at the rate of hundreds of thous<strong>and</strong>s<br />

per second. It is the frequency <strong>and</strong> nature of these transactions<br />

that dictate stock market indices, just as it is the frequency <strong>and</strong><br />

nature of the s<strong>and</strong> particles that dictates the statistics of the<br />

avalanches in a s<strong>and</strong> pile. These are all examples of r<strong>and</strong>om<br />

scaling fractals [21]-[26].<br />

IV. THE FRACTAL MARKET HYPOTHESIS<br />

Developing mathematical models to simulate stochastic<br />

processes has an important role in financial analysis <strong>and</strong><br />

information systems in general where it should be noted that<br />

information systems are now one of the most important aspects<br />

in terms of regulating financial systems, e.g. [27]-[30]. A good<br />

stochastic model is one that accurately predicts the statistics<br />

we observe in reality, <strong>and</strong> one that is based upon some well<br />

defined rationale. Thus, the model should not only describe<br />

the data, but also help to explain <strong>and</strong> underst<strong>and</strong> the system.<br />

There are two principal criteria used to define the characteristics<br />

of a stochastic field: (i) the PDF or the Characteristic<br />

Function (i.e. the Fourier transform of the PDF); (ii) the Power<br />

Spectral Density Function (PSDF). The PSDF is the function<br />

that describes the envelope or shape of the power spectrum of<br />

a signal. In this sense, the PSDF is a measure of the field<br />

correlations. The PDF <strong>and</strong> the PSDF are two of the most<br />

fundamental properties of any stochastic field <strong>and</strong> various<br />

terms are used to convey these properties. For example, the<br />

term ‘zero-mean white Gaussian noise’ refers to a stochastic<br />

field characterized by a PSDF that is effectively constant over<br />

all frequencies (hence the term ‘white’ as in ‘white light’) <strong>and</strong><br />

has a PDF with a Gaussian profile whose mean is zero.<br />

Stochastic fields can of course be characterized using transforms<br />

other than the Fourier transform (from which the PSDF<br />

is obtained) but the conventional PDF-PSDF approach serves<br />

many purposes in stochastic systems theory. However,<br />

there is no general connection between the PSDF<br />

and the PDF in terms of either theoretical prediction or<br />

experimental determination. It is not generally possible to<br />

compute the PSDF of a stochastic field from knowledge of<br />

the PDF or the PDF from the PSDF. Hence, in general, the<br />

PDF <strong>and</strong> PSDF are fundamental but non-related properties<br />

of a stochastic field. However, for some specific statistical<br />

processes, relationships between the PDF <strong>and</strong> PSDF can<br />

be found, for example, between Gaussian <strong>and</strong> non-Gaussian<br />

fractal processes [31] <strong>and</strong> for differentiable Gaussian processes<br />

[32].<br />

There are two conventional approaches to simulating a<br />

stochastic field. The first of these is based on predicting the<br />

PDF (or the Characteristic Function) theoretically (if possible).<br />

A pseudo r<strong>and</strong>om number generator is then designed whose<br />

output provides a discrete stochastic field that is characteristic<br />

of the predicted PDF. The second approach is based on<br />

considering the PSDF of a field which, like the PDF, is ideally<br />

derived theoretically. The stochastic field is then typically<br />

simulated by filtering white noise. A ‘good’ stochastic model<br />

is one that accurately predicts both the PDF <strong>and</strong> the PSDF<br />

of the data. It should take into account the fact that, in<br />

general, stochastic processes are non-stationary. In addition, it<br />

should, if appropriate, model rare but extreme events in which<br />

significant deviations from the norm occur.<br />

One explanation for crashes involves a replacement for the<br />

EMH by the Fractal Market Hypothesis (FMH) which is the<br />

basis of the model considered in this paper. The FMH proposes<br />

the following: (i) The market is stable when it consists of<br />

investors covering a large number of investment horizons<br />

which ensures that there is ample liquidity for traders; (ii)<br />

information is more related to market sentiment and technical<br />

factors in the short term than in the long term; as investment<br />

horizons increase, longer-term fundamental information<br />

dominates; (iii) if an event occurs that puts the validity<br />

of fundamental information in question, long-term investors<br />

either withdraw completely or invest on shorter terms (i.e.<br />

when the overall investment horizon of the market shrinks<br />

to a uniform level, the market becomes unstable); (iv) prices<br />

reflect a combination of short-term technical <strong>and</strong> long-term<br />

fundamental valuation <strong>and</strong> thus, short-term price movements<br />

are likely to be more volatile than long-term trades and<br />

more likely to be the result of crowd behaviour; (v) if a security<br />

has no tie to the economic cycle, then there will be no long-term<br />

trend and short-term technical information will dominate.<br />

Unlike the EMH, the FMH states that information is valued<br />

according to the investment horizon of the investor. Because<br />

the different investment horizons value information differently,<br />

the diffusion of information will also be uneven. Unlike most<br />

complex physical systems, the agents of the economy, <strong>and</strong><br />

perhaps to some extent the economy itself, have an extra<br />

ingredient, an extra degree of complexity. This ingredient is<br />

consciousness.<br />

V. MATHEMATICAL MODEL FOR THE FMH<br />

We consider an economic time series to be a solution to<br />

the fractional diffusion equation [33]-[38]<br />

$$\left(\frac{\partial^2}{\partial x^2} - \sigma^q \frac{\partial^q}{\partial t^q}\right) u(x, t) = \delta(x) n(t) \qquad (1)$$<br />

where $\sigma$ is the fractional diffusion coefficient, $q > 0$ is the<br />

‘Fourier dimension’ and $n(t)$ is ‘white noise’. Let<br />

$$u(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} U(x, \omega) \exp(i\omega t)\, d\omega$$<br />

and<br />

$$n(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} N(\omega) \exp(i\omega t)\, d\omega.$$<br />

Using the result<br />

$$\frac{\partial^q}{\partial t^q} u(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} U(x, \omega) (i\omega)^q \exp(i\omega t)\, d\omega$$<br />


we can then transform the fractional diffusion equation to the<br />

form<br />

$$\left(\frac{\partial^2}{\partial x^2} + \Omega_q^2\right) U(x, \omega) = \delta(x) N(\omega)$$<br />

where we take<br />

$$\Omega_q = i(i\omega\sigma)^{q/2}.$$<br />

Defining the Green’s function $g$ [39] to be the solution of<br />

$$\left(\frac{\partial^2}{\partial x^2} + \Omega_q^2\right) g(| x - y |, \omega) = \delta(x - y)$$<br />

where $\delta$ is the delta function, we obtain the solution<br />

$$U(x, \omega) = N(\omega) \int_{-\infty}^{\infty} g(| x - y |, \omega) \delta(y)\, dy = N(\omega) g(| x |, \omega)$$<br />

where [40]<br />

$$g(| x |, \omega) = \frac{i}{2\Omega_q} \exp(i\Omega_q | x |)$$<br />

under the assumption that $u$ and $\partial u/\partial x \to 0$ as $x \to \pm\infty$. The<br />

Green’s function characterises the response of a system modelled<br />

by equation (1) due to an impulse at $x = y$ and it is clear that<br />

$$\lim_{x \to 0} U(x, \omega) = \frac{i N(\omega)}{2\Omega_q}$$<br />

or<br />

$$U(\omega) = \frac{1}{2\sigma^{q/2}} \frac{N(\omega)}{(i\omega)^{q/2}}.$$<br />

The time series associated with this asymptotic solution is<br />

then obtained by Fourier inversion giving (ignoring scaling by<br />

$[2\sigma^{q/2}\Gamma(q/2)]^{-1}$)<br />

$$u(t) = \frac{1}{t^{1-q/2}} \otimes n(t) \qquad (2)$$<br />

where $\otimes$ defines the convolution integral. This equation<br />

is the Riemann-Liouville transform (ignoring scaling by<br />

$[\Gamma(q/2)]^{-1}$) [41] which is a fractional integral and defines a<br />

function $u(t)$ which is statistically self-affine, i.e. for a scaling<br />

parameter $\lambda > 0$,<br />

$$\lambda^{q/2} \Pr[u(\lambda t)] = \Pr[u(t)]$$<br />

where $\Pr[u(t)]$ denotes the Probability Density Function of<br />

$u(t)$. Thus, equation (2) can be considered to be the temporal<br />

solution of equation (1) as $x \to 0$ and $u(t)$ is taken to be a<br />

random scaling fractal signal. Note that for $| x | > 0$ the phase<br />

$\Omega_q | x |$ does not affect the $\omega^{-q}$ scaling law of the power<br />

spectrum, i.e. $\forall x$,<br />

$$| U(x, \omega) |^2 = \frac{| N(\omega) |^2}{4\sigma^q \omega^q}, \quad \omega > 0$$<br />

Thus for a uniformly distributed spectrum $N(\omega)$ the Power<br />

Spectrum Density Function of $U$ is determined by $\omega^{-q}$ and the<br />

algorithm developed to compute $q$ given in Section 6 applies<br />

$\forall x$ and not just for the case when $x \to 0$. However, since we<br />

can write<br />

$$U(x, \omega) = \frac{i}{2\Omega_q} N(\omega) \exp(i\Omega_q | x |) = N(\omega) \frac{1}{2(i\omega\sigma)^{q/2}} \left[1 + i(i\omega\sigma)^{q/2} | x | - \frac{1}{2!}(i\omega\sigma)^q | x |^2 + \ldots\right]$$<br />

unconditionally, by inverse Fourier transforming, we obtain<br />

the following expression for $u(x, t)$ (ignoring scaling factors):<br />

$$u(x, t) = n(t) \otimes \frac{1}{t^{1-q/2}} + i | x | n(t) + \sum_{k=1}^{\infty} \frac{i^{k+1}}{(k+1)!} | x |^{2k} \frac{d^{kq/2}}{dt^{kq/2}} n(t)$$<br />

Here, the solution consists of three terms: (i)<br />

a fractional integral; (ii) the source term $n(t)$; (iii) an infinite<br />

series of fractional differentials of order $kq/2$.<br />
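The leading term of this solution, i.e. the convolution in equation (2), can be simulated numerically as sketched below.<br />

This is a hedged illustration and not the algorithm used later in the paper: the unit time step, the Gaussian pseudo random noise, the step-integrated kernel weights and the function name are all assumptions, and scaling factors are ignored as in equation (2).<br />

```python
import random


def fractal_signal(noise, q):
    """Discrete form of u(t) = t^(q/2 - 1) convolved with n(t), unit time step.

    The kernel is integrated over each step, h_k = ((k+1)^p - k^p) / p with
    p = q/2, which avoids evaluating t^(p-1) at the singular point t = 0.
    """
    p = q / 2.0
    h = [((k + 1) ** p - k ** p) / p for k in range(len(noise))]
    # Causal convolution: u_i depends on n_0, ..., n_i only.
    return [sum(h[k] * noise[i - k] for k in range(i + 1)) for i in range(len(noise))]


# Hypothetical input: Gaussian pseudo random numbers standing in for n(t).
rng = random.Random(1)
n = [rng.gauss(0.0, 1.0) for _ in range(256)]
u = fractal_signal(n, q=1.5)
```

For q = 2 the kernel is constant, so the convolution reduces to a running sum of the noise. Smaller values of q weight the recent past more heavily.<br />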

A. Rationale for the Model - Hurst Processes<br />

A Hurst process describes fractional Brownian motion and<br />

is based on the generalization of Brownian motion quantified<br />

by the equation $A(t) = \sqrt{t}$ to<br />

$$A(t) = t^H, \quad H \in (0, 1]$$<br />

for a unit random step length in the plane where $A$ is the<br />

most likely position in the plane after time $t$ with respect to an<br />

initial position in the plane at $t = 0$. This scaling law makes<br />

no prior assumptions about any underlying distributions. It<br />

simply tells us how the system is scaling with respect to<br />

time. Processes of this type appear to exhibit cycles, but with<br />

no predictable period. The interpretation of such processes<br />

in terms of the Hurst exponent H is as follows: We know<br />

that H = 0.5 is consistent with an independently distributed<br />

system. The range 0.5 < H ≤ 1, implies a persistent time<br />

series, <strong>and</strong> a persistent time series is characterized by positive<br />

correlations. Theoretically, what happens today will ultimately<br />

have a lasting effect on the future. The range 0 < H < 0.5<br />

indicates anti-persistence which means that the time series<br />

covers less ground than a r<strong>and</strong>om process. In other words,<br />

there are negative correlations. For a system to cover less<br />

distance, it must reverse itself more often than a r<strong>and</strong>om<br />

process.<br />

Given that r<strong>and</strong>om walks with H = 0.5 describe processes<br />

whose macroscopic behaviour is characterised by the diffusion<br />

equation, then, by induction, Hurst processes should be<br />

characterised by generalizing the diffusion operator<br />

$$\frac{\partial^2}{\partial x^2} - \sigma \frac{\partial}{\partial t}$$<br />

to the fractional form<br />

$$\frac{\partial^2}{\partial x^2} - \sigma^q \frac{\partial^q}{\partial t^q}$$<br />

where $q \in (0, 2]$. Fractional diffusive processes can therefore be<br />

interpreted as intermediate between classical diffusive (random<br />

phase walks with $H = 0.5$; diffusive processes with $q = 1$)<br />

and ‘propagative’ processes (coherent phase walks for $H = 1$;<br />

propagative processes with $q = 2$), e.g. [42] and [43].<br />

The relationship between the Hurst exponent $H$, the Fourier<br />

dimension $q$ and the Fractal dimension $D_F$ is given by [44]<br />

$$D_F = D_T + 1 - H = 1 - q + \frac{3}{2} D_T$$<br />

where $D_T$ is the topological dimension. Thus, a Brownian<br />

process, where $H = 1/2$, has a fractal dimension of 1.5.<br />

Fractional diffusion processes are based on r<strong>and</strong>om walks<br />

which exhibit a bias with regard to the distribution of angles<br />

used to change the direction. By induction, it can be expected<br />

that as the distribution of angles reduces, the corresponding<br />

walk becomes more <strong>and</strong> more coherent, exhibiting longer <strong>and</strong><br />

longer time correlations until the process conforms to a fully<br />

coherent walk. A simulation of such an effect is given in<br />

Figure 2 which shows a r<strong>and</strong>om walk in the (real) plane<br />

as the (uniform) distribution of angles decreases. The walk<br />

becomes less <strong>and</strong> less r<strong>and</strong>om as the width of the distribution is<br />

reduced. Each position of the walk $(x_j, y_j)$, $j = 1, 2, 3, ..., N$<br />

is computed using<br />

$$x_j = \sum_{i=1}^{j} \cos(\theta_i), \quad y_j = \sum_{i=1}^{j} \sin(\theta_i)$$<br />

where<br />

$$\theta_i = \alpha\pi \frac{n_i}{\| n \|_\infty}$$<br />

and $n_i$ are random numbers computed using the linear congruential<br />

pseudo random number generator<br />

$$n_{i+1} = a n_i \bmod P, \quad i = 1, 2, ..., N, \quad a = 7^7, \quad P = 2^{31} - 1.$$<br />

The parameter $0 \le \alpha \le 2$ defines the width of the<br />

distribution of angles such that as $\alpha \to 0$, the walk becomes<br />

increasingly coherent or ‘propagative’.<br />
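The random phase walk defined above can be reproduced with a few lines of Python.<br />

The sketch follows the generator and the angle formula as stated in the text; the seed value, the function names and the use of the end-to-end distance as a measure of coherence are assumptions made for illustration.<br />

```python
import math


def lcg(count, seed=1, a=7 ** 7, p=2 ** 31 - 1):
    """n_{i+1} = a * n_i mod P; the seed value is an assumption."""
    out, n = [], seed
    for _ in range(count):
        n = (a * n) % p
        out.append(n)
    return out


def phase_walk(steps, alpha, seed=1):
    """Positions (x_j, y_j) of a unit-step walk with angles in [0, alpha*pi]."""
    n = lcg(steps, seed)
    n_max = max(n)                      # the uniform norm of the sequence n
    x = y = 0.0
    path = []
    for ni in n:
        theta = alpha * math.pi * ni / n_max
        x += math.cos(theta)            # running sums give x_j and y_j
        y += math.sin(theta)
        path.append((x, y))
    return path


def end_distance(path):
    """Distance of the final position from the origin."""
    return math.hypot(*path[-1])
```

Comparing `end_distance(phase_walk(500, 2.0))` with `end_distance(phase_walk(500, 0.2))` illustrates the effect: narrowing the distribution of angles makes the walk travel much further from its starting point in the same number of steps.<br />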

Fig. 2. R<strong>and</strong>om phase walks in the plane for a uniform distribution of angles<br />

θi ∈ [0, 2π] (top left), θi ∈ [0, 1.9π] (top right), θi ∈ [0, 1.8π] (bottom left)<br />

<strong>and</strong> θi ∈ [0, 1.2π] (bottom right).<br />

In considering a $t^H$ scaling law with Hurst exponent $H \in (0, 1]$,<br />

Hurst paved the way for an appreciation that most natural<br />

stochastic phenomena which, at first sight, appear random,<br />

have certain trends that can be identified over a given period<br />

of time. In other words, many natural r<strong>and</strong>om patterns have a<br />

bias to them that leads to time correlations in their stochastic<br />

behaviour, a behaviour that is not an inherent characteristic of<br />

a r<strong>and</strong>om walk model <strong>and</strong> fully diffusive processes in general.<br />

This aspect of stochastic field theory is the basis for Lévy<br />

processes [45].<br />

B. Lévy Processes<br />

Lévy processes are r<strong>and</strong>om walks whose distribution has<br />

infinite moments. The statistics of (conventional) physical<br />

systems are usually concerned with stochastic fields that have<br />

PDFs where (at least) the first two moments (the mean <strong>and</strong><br />

variance) are well defined <strong>and</strong> finite. Lévy statistics is concerned<br />

with statistical systems where all the moments (starting<br />

with the mean) are infinite. Many distributions exist where the<br />

mean <strong>and</strong> variance are finite but are not representative of the<br />

process, e.g. the tail of the distribution is significant, where<br />

rare but extreme events occur. These distributions include<br />

Lévy distributions. Lévy’s original approach to deriving such<br />

distributions is based on the following question: Under what<br />

circumstances does the distribution associated with a r<strong>and</strong>om<br />

walk of a few steps look the same as the distribution after<br />

many steps (except for scaling)? This question is effectively<br />

the same as asking under what circumstances do we obtain a<br />

r<strong>and</strong>om walk that is statistically self-affine. The characteristic<br />

function (i.e. the Fourier transform) P (k) of such a distribution<br />

p(x) was first shown by Lévy to be given by (for symmetric<br />

distributions only)<br />

$$P(k) = \exp(-a | k |^\gamma), \quad 0 < \gamma \le 2 \qquad (3)$$<br />

where a is a constant <strong>and</strong> γ is the Lévy index. For γ ≥ 2, the<br />

second moment of the Lévy distribution exists <strong>and</strong> the sums of<br />

large numbers of independent trials are Gaussian distributed.<br />

For example, if the result were a r<strong>and</strong>om walk with a step<br />

length distribution governed by p(x), γ > 2, then the result<br />

would be normal (Gaussian) diffusion, i.e. a Brownian process.<br />

For $\gamma < 2$ the second moment of this PDF (the mean square)<br />

diverges and the characteristic scale of the walk is lost. For<br />

values of $\gamma$ between 0 and 2, Lévy’s characteristic function<br />

corresponds to a PDF of the form<br />

$$p(x) \sim \frac{1}{x^{1+\gamma}}, \quad x \to \infty.$$<br />

This type of random walk is called a Lévy flight and is an<br />

example of a non-stationary fractal walk.<br />
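The effect of the power-law tail on a walk can be illustrated with the following Python sketch.<br />

It is an approximation under stated assumptions: step lengths are drawn from a Pareto law on [1, ∞), which has the tail $1/x^{1+\gamma}$ but is not an exact Lévy stable distribution, and the function names and seeds are hypothetical.<br />

```python
import random


def heavy_tailed_steps(count, gamma, seed=0):
    """Step lengths with tail p(x) ~ 1/x^(1+gamma), drawn by inverse transform.

    A Pareto law is used as a stand-in; it shares the power-law tail only.
    """
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], so the power is always defined and >= 1.
    return [(1.0 - rng.random()) ** (-1.0 / gamma) for _ in range(count)]


def flight(count, gamma, seed=0):
    """One-dimensional walk with heavy-tailed step lengths and random signs."""
    rng = random.Random(seed + 1)
    pos, path = 0.0, []
    for length in heavy_tailed_steps(count, gamma, seed):
        pos += length if rng.random() < 0.5 else -length
        path.append(pos)
    return path
```

For small γ the walk is dominated by a few very large jumps, which is the loss of characteristic scale described above. As γ approaches 2 the largest steps become comparable with typical steps.<br />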

Lévy processes are consistent with a fractional diffusion<br />

equation [46]. The basic evolution equation for a random<br />

Brownian particle process is given by<br />

$$u(x, t + \tau) = \int_{-\infty}^{\infty} u(x + \lambda, t) p(\lambda)\, d\lambda$$<br />

where u(x, t) is the concentration of particles <strong>and</strong> τ is the<br />

interval of time in which a particle moves some distance<br />

between λ <strong>and</strong> λ + dλ with a probability p(λ) satisfying the<br />

condition p(λ) = p(−λ). We note that<br />

u(x, t + τ) = u(x, t) ⊗ p(x)<br />

<strong>and</strong> that in Fourier space, this equation is<br />

U(k, t + τ) = U(k, t)P (k)


where $U$ and $P$ are the Fourier transforms of $u$ and $p$<br />

respectively. From equation (3),<br />

$$P(k) \simeq 1 - a | k |^\gamma$$<br />

so that we can write<br />

$$\frac{U(k, t + \tau) - U(k, t)}{\tau} \simeq -\frac{a}{\tau} | k |^\gamma U(k, t)$$<br />

which for $\tau \to 0$ gives the fractional diffusion equation<br />

$$\sigma \frac{\partial}{\partial t} u(x, t) = \frac{\partial^\gamma}{\partial x^\gamma} u(x, t), \quad \gamma \in (0, 2] \qquad (4)$$<br />

where $\sigma = \tau/a$ and we have used the result<br />

$$\frac{\partial^\gamma}{\partial x^\gamma} u(x, t) = -\frac{1}{2\pi} \int_{-\infty}^{\infty} | k |^\gamma U(k, t) \exp(ikx)\, dk$$<br />

The solution to this equation with the singular initial condition<br />

$u(x, 0) = \delta(x)$ is given by<br />

$$u(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp(ikx - t | k |^\gamma / \sigma)\, dk$$<br />

which is itself Lévy distributed. This derivation of the fractional<br />

diffusion equation reveals its physical origin in terms of<br />

Lévy statistics, i.e. Lévy’s characteristic function. Note that the<br />

diffusion equation is fractional in the spatial derivative rather<br />

than the temporal derivative as given in equation (1). However,<br />

since the Green’s function for equation (4) is given by<br />

$$g(| x |, \omega) = \frac{i}{2\Omega_\gamma} \exp(i\Omega_\gamma | x |)$$<br />

where<br />

$$\Omega_\gamma = i^{2/\gamma} (i\omega\sigma)^{1/\gamma},$$<br />

by induction, we obtain a relationship between the Lévy index<br />

$\gamma$ and the Fourier dimension $q$ given by<br />

$$\frac{1}{\gamma} = \frac{q}{2}.$$<br />

Gaussian processes associated with the classical diffusion<br />

equation are thus recovered when γ = 2 <strong>and</strong> q = 1.<br />
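The relationship between the two indices is simple enough to state as a pair of conversion functions.<br />

The function names below are hypothetical and the sketch only restates the relation 1/γ = q/2 given above.<br />

```python
def levy_index(q):
    """Levy index gamma from the Fourier dimension q via 1/gamma = q/2."""
    return 2.0 / q


def fourier_dimension(gamma):
    """Fourier dimension q from the Levy index gamma via 1/gamma = q/2."""
    return 2.0 / gamma
```

Since γ = 2/q, requiring both γ ∈ (0, 2] and q ∈ (0, 2] restricts the correspondence to q ∈ [1, 2] and γ ∈ [1, 2], with the classical case at q = 1, γ = 2.<br />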

C. Fractional Differentials<br />

Fractional differentials of any order need to be considered<br />

in terms of the definition for a fractional differential given by<br />

$$\hat{D}^q f(t) = \frac{d^m}{dt^m} [\hat{I}^{m-q} f(t)], \quad m - q > 0$$<br />

where $m$ is an integer and $\hat{I}$ is the fractional integral operator<br />

(the Riemann-Liouville transform) given by<br />

$$\hat{I}^p f(t) = \frac{1}{\Gamma(p)} f(t) \otimes \frac{1}{t^{1-p}}, \quad p > 0.$$<br />

The reason for this is that direct fractional differentiation can<br />

yield divergences. However, there is a deeper interpretation of<br />

this result that has a synergy with the issue over a macroeconomic<br />

system having ‘memory’ <strong>and</strong> is based on observing that<br />

the evaluation of a fractional differential operator depends on<br />

the history of the function in question. Thus, unlike an integer<br />

differential operator of order m, a fractional differential operator<br />

of order q has ‘memory’ because the value of Îm−qf(t) at<br />

a time t depends on the behaviour of f(t) from −∞ to t via the<br />

convolution of f(t) with t (m−q)−1 /Γ(m−q). The convolution<br />

process is dependent on the history of a function f(t) for<br />

a given kernel <strong>and</strong> thus, in this context, we can consider a<br />

fractional derivative defined by ˆ Dq to have ‘memory. In this<br />

sense, the operator<br />

∂2 ∂q<br />

− σq<br />

∂x2 ∂tq describes a process, compounded in a field u(x, t), that has<br />

memory association with regard to the temporal characteristics<br />

of the system it is attempting to model. This is not an intrinsic<br />

characteristic of systems that are purely diffusive (q = 1) or<br />

propagative (q = 2).<br />
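The Riemann-Liouville integral defined above can be checked numerically. The following is a minimal Python sketch, assuming a causal function (f(t) = 0 for t < 0) so that the convolution runs from 0 to t. The name `frac_integral` and the product-midpoint quadrature are illustrative choices and are not taken from the paper.<br />

```python
import math


def frac_integral(f, t, p, n=2000):
    """Approximate I^p f(t) = (1/Gamma(p)) * int_0^t f(tau) (t - tau)^(p-1) dtau.

    Assumes f is causal (zero for tau < 0) and p > 0. The singular kernel is
    integrated exactly on each sub-interval; f is sampled at the midpoint.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    h = t / n
    total = 0.0
    for k in range(n):
        a, b = k * h, (k + 1) * h
        # Exact integral of (t - tau)^(p-1) over [a, b]; the max() guards
        # against a tiny negative base caused by rounding at the last step.
        weight = ((t - a) ** p - max(t - b, 0.0) ** p) / p
        total += f(0.5 * (a + b)) * weight
    return total / math.gamma(p)


# The whole history of f on [0, t] contributes to the value at t,
# which is the 'memory' property discussed in the text.
half_integral_of_one = frac_integral(lambda tau: 1.0, 2.0, 0.5)
```

For f(t) = 1 the result should agree with the closed form t^p/Γ(p + 1), which gives a simple sanity check on the quadrature.<br />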

D. Non-stationary Model<br />

The fractional diffusion operator used in equation (1) is<br />

appropriate for modelling fractional diffusive processes that<br />

are stationary. For non-stationary fractional diffusion, we could<br />

consider the case where the diffusivity is time variant as<br />

defined by the function σ(t). However, a more interesting<br />

case arises when the characteristics of the diffusion processes<br />

change over time becoming less or more diffusive. This is<br />

illustrated in terms of the random walk in the plane given in<br />

Figure 3. Here, the walk starts off being fully diffusive (i.e.<br />

H = 0.5 and q = 1), changes to being fractionally diffusive<br />

(0.5 < H < 1 and 1 < q < 2) and then changes back to<br />

being fully diffusive. In terms of fractional diffusion, this is<br />

equivalent to having an operator<br />

$$\frac{\partial^2}{\partial x^2} - \sigma^q \frac{\partial^q}{\partial t^q}$$<br />

where q = 1 for t ∈ (0, T1]; q > 1 for t ∈ (T1, T2]; and q = 1 for t ∈<br />

(T2, T3], with T3 > T2 > T1. If we want to generalise<br />

such processes over arbitrary periods of time, then we should<br />

consider q to be a function of time. We can then introduce a<br />

non-stationary fractional diffusion operator given by<br />

$$\frac{\partial^2}{\partial x^2} - \sigma^{q(t)} \frac{\partial^{q(t)}}{\partial t^{q(t)}}.$$<br />

This operator is the theoretical basis for the Fractal Market<br />

Hypothesis considered in this paper. In terms of using this<br />

model to develop an FMH risk management metric based on<br />

the analysis of economic time series, the principal hypothesis<br />

is that a change in q(t) precedes a change in a macroeconomic<br />

index. This requires accurate numerical methods for<br />

computing q(t) for a given index, which are discussed later.<br />

Real economic signals exhibit non-stationary fractal walks. An<br />

example of this is illustrated in Figure 4 which shows a non-stationary<br />

walk in the complex plane obtained by taking the<br />

Hilbert transform of an economic signal, i.e. computing the<br />

analytic signal<br />

$$s(t) = u(t) + \frac{i}{\pi t} \otimes u(t)$$<br />

and plotting the real and imaginary components of this signal<br />

in the complex plane.
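<br />
As an illustration of the analytic signal above, the following Python sketch computes s = u + iH[u] for a sampled signal by suppressing the negative-frequency half of its discrete spectrum, which is a common discrete equivalent of convolving with 1/(πt). The function names and the naive O(N²) transform are assumptions made to keep the example self-contained; they are not the authors' code.<br />

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(u):
    """Return s = u + i*H[u] by zeroing negative frequencies of the spectrum."""
    n = len(u)
    spectrum = dft(u)
    gain = [0.0] * n
    gain[0] = 1.0                      # keep the DC term once
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist term once
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        gain[k] = 2.0                  # double the positive frequencies
    return dft([g * c for g, c in zip(gain, spectrum)], inverse=True)


# Plotting (s.real, s.imag) traces the walk in the complex plane.
u = [math.cos(2 * math.pi * 3 * m / 32) for m in range(32)]
walk = [(s.real, s.imag) for s in analytic_signal(u)]
```

For a pure cosine input the imaginary part is the corresponding sine, so the walk is a circle; a financial series would produce an irregular path of the kind shown in Figure 4.<br />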



Fig. 3. Non-stationary random phase walk in the plane.<br />

Fig. 4. Non-stationary fractal walk in the complex plane (right) obtained by<br />

computing the Hilbert transform of the economic signal (left) - FTSE Close-of-Day<br />

from 02-04-1984 to 24-12-1987.<br />

The non-stationary model considered here exhibits behaviour<br />

that is similar to Lévy processes. However, the aim<br />

is not to derive a statistical model for a stochastic process<br />

using a stationary fractional diffusion of the type given by<br />

equation (4) but to be able to compute a function - namely<br />

q(t) - which is a measure of the non-stationary behaviour<br />

especially with regard to a ‘future flight’. This is because,<br />

in principle, the value of q(t) should reflect the early stages<br />

of a change in the behaviour of u(t), a principle that is the<br />

basis for the financial data processing and analysis discussed<br />

in the following section.<br />

VI. FINANCIAL DATA ANALYSIS<br />

If we consider the case where the Fourier dimension is<br />

a relatively slowly varying function of time, then we can<br />

legitimately consider q(t) to be composed of a sequence of<br />

different states qi = q(ti). This approach allows us to develop<br />

a stationary solution for a fixed q over a fixed period of time.<br />

Non-stationary behaviour can then be introduced by using the<br />

same solution for different values of q over fixed (or varying)<br />

periods of time and concatenating the solutions for all q to<br />

produce an output digital signal.<br />

The FMH model for a quasi-stationary segment of a financial<br />

signal is given by<br />

$$u(t) = \frac{1}{t^{1-q/2}} \otimes n(t), \quad q > 0$$<br />

which has characteristic spectrum<br />

$$U(\omega) = \frac{N(\omega)}{(i\omega)^{q/2}}.$$<br />
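A quasi-stationary segment of this kind can be synthesised by filtering white noise in Fourier space. The Python sketch below is one way to do this under stated assumptions: n(t) is taken to be white Gaussian noise, ω is measured in radians per sample, and the singular DC term and the Nyquist term are set to zero. The name `fractal_segment` is hypothetical.<br />

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def fractal_segment(n, q, seed=0):
    """Return (noise, u) where U(omega) = N(omega) / (i*omega)^(q/2)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    spectrum = dft(noise)
    filtered = [0j] * n                # DC (and Nyquist) terms stay zero
    for k in range(1, (n - 1) // 2 + 1):
        omega = 2.0 * math.pi * k / n
        h = (1j * omega) ** (-q / 2.0)
        filtered[k] = spectrum[k] * h
        # Hermitian symmetry keeps the synthesised signal real.
        filtered[n - k] = spectrum[n - k] * h.conjugate()
    u = [c.real for c in dft(filtered, inverse=True)]
    return noise, u


noise, u = fractal_segment(64, q=1.5, seed=1)
```

Non-stationary behaviour could then be imitated by calling `fractal_segment` with different values of q and concatenating the outputs, as described at the start of this section.<br />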

The PSDF is thus characterised by $\omega^{-q}$, ω ≥ 0, and our<br />

problem is thus to compute q from the data $P(\omega) = |U(\omega)|^2$,<br />

ω ≥ 0. For this data, we consider the PSDF<br />

$$\hat{P}(\omega) = \frac{c}{\omega^q}$$<br />

or<br />

$$\ln \hat{P}(\omega) = C - q \ln \omega$$<br />

where C = ln c. The problem is therefore reduced to implementing<br />

an appropriate method to compute q (and C) by<br />

finding a best fit of the line $\ln \hat{P}(\omega)$ to the data ln P(ω).<br />

Application of the least squares method for computing q,<br />

which is based on minimizing the error<br />

$$e(q, C) = \| \ln P(\omega) - \ln \hat{P}(\omega, q, C) \|_2^2$$<br />

with regard to q and C, leads to errors in the estimates for<br />

q which are not compatible with market data analysis. The<br />

reason for this is that relative errors at the start and end<br />

of the data ln P may vary significantly especially because<br />

any errors inherent in the data P will be ‘amplified’ through<br />

application of the logarithmic transform required to linearise<br />

the problem. In general, application of a least squares approach<br />

is very sensitive to statistical heterogeneity [47] and in this<br />

application, may provide values of q that are not compatible<br />

with the rationale associated with the FMH (i.e. values of 1 <<br />

q < 2 that are intermediate between diffusive and propagative<br />

processes). For this reason, an alternative approach must be<br />

considered which, in this paper, is based on Orthogonal Linear<br />

Regression (OLR) [48] [49].<br />
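For orientation, the sketch below shows a textbook closed-form orthogonal (total least squares) line fit in Python, which minimises perpendicular rather than vertical distances to the line. It is not the m-code referred to in the paper; the function name and the handling of the degenerate case are assumptions.<br />

```python
import math


def orthogonal_linear_regression(x, y):
    """Fit y = intercept + slope * x by minimising perpendicular distances."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxy == 0.0:
        if sxx >= syy:
            return 0.0, my             # horizontal line is the best fit
        raise ValueError("best-fit line is vertical")
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx


# With x = ln(omega) and y = ln P(omega), a power law P = c / omega^q
# appears as a straight line of slope -q and intercept C = ln c.
slope, intercept = orthogonal_linear_regression([0.0, 1.0, 2.0, 3.0],
                                                [3.0, 1.5, 0.0, -1.5])
q_estimate = -slope
```

Because both coordinates are treated symmetrically, the fit is less dominated by large vertical residuals at the ends of the log-spectrum than an ordinary least squares fit would be.<br />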

Applying a standard moving window, q(t) is computed by<br />

repeated application of OLR based on the m-code available<br />

from [51]. This provides a numerical estimate of the function<br />

q(t) whose values reflect the state of a financial signal<br />

(assumed to be a non-stationary random fractal) in terms of a<br />

stable or unstable economy, from which a risk analysis can be<br />

performed. Since q is, in effect, a statistic, its computation<br />

is only as good as the quantity (and quality) of data that<br />

is available for its computation. For this reason, a relatively<br />

large window is required whose length is compatible with the<br />

number of samples available.<br />

A. Numerical Algorithm<br />

The principal algorithm associated with the application of<br />

the FMH analysis is as follows:<br />

Step 1: Read data (financial time series) from file into<br />

operating array a[i], i = 1, 2, ..., N.



Step 2: Set length L < N of moving window w to be used.<br />

Step 3: For j = 1, assign the L elements a[j], a[j + 1], ..., a[j + L − 1] to array<br />

w[i], i = 1, 2, ..., L.<br />

Step 4: Compute the power spectrum P [i] of w[i] using a<br />

Discrete Fourier Transform (DFT).<br />

Step 5: Compute the logarithm of the spectrum excluding the<br />

DC component, i.e. compute log(P[i]) ∀ i ∈ [2, L/2].<br />

Step 6: Compute q[j] using the OLR algorithm whose m-code<br />

is given in Appendix I.<br />

Step 7: For j = j + 1 repeat Step 3 - Step 6 stopping when<br />

j = N − L.<br />

Step 8: Write the signal q[j] to file for further analysis and<br />

post processing.<br />
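The steps above can be sketched in Python as follows. This is an illustrative outline under stated assumptions rather than the authors' implementation: file input and output are omitted, a naive DFT replaces the FFT, the orthogonal fit uses a textbook closed form, the fit is made against ln k (which differs from ln ω only by a constant and so gives the same slope), and every complete window is processed. All function names are hypothetical.<br />

```python
import cmath
import math


def power_spectrum(w):
    """Step 4: |DFT|^2 of the window, computed naively for clarity."""
    n = len(w)
    spectrum = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += w[m] * cmath.exp(-2j * math.pi * k * m / n)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def olr_slope(x, y):
    """Slope of the orthogonal (total least squares) best-fit line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxy == 0.0:
        return 0.0
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)


def estimate_q(window):
    """Steps 4-6: log power spectrum without DC, then q = -slope."""
    p = power_spectrum(window)
    half = len(window) // 2
    ks = range(1, half)                # 0-based bins 1 .. L/2 - 1
    xs = [math.log(k) for k in ks]
    ys = [math.log(max(p[k], 1e-300)) for k in ks]   # guard against log(0)
    return -olr_slope(xs, ys)


def moving_q(a, L):
    """Steps 3 and 7: slide a window of length L over the series a."""
    return [estimate_q(a[j:j + L]) for j in range(len(a) - L + 1)]
```

A call such as `moving_q(prices, 1024)` would return one value of q per window position, which can then be written to file as in Step 8; with the naive transform used here a long series would be slow, so an FFT would be substituted in practice.<br />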

The following points should be noted:<br />

(i) The DFT is taken to generate an output in standard form<br />

where the zero frequency component of the power spectrum<br />

is taken to be P [1].<br />

(ii) With $L = 2^m$ for integer m, a Fast Fourier Transform can<br />

be used.<br />

(iii) The minimum window size that should be used in order<br />

to provide statistically significant values of q[j] is L = 64, for which<br />

q can be computed accurate to 2 decimal places.<br />

An example of the output generated by this algorithm for<br />

a 1024 element window is given in Figure 5 using Dow<br />

Jones Close-of-Day data obtained from [20]. Inspection of the<br />

signals illustrates a qualitative relationship between trends in<br />

the financial data and q(t) in accordance with the theoretical<br />

model considered. In particular, over periods of time in which<br />

q increases in value, the amplitude of the financial signal u(t)<br />

decreases. Moreover, and more importantly, an upward trend<br />

in q appears to be a precursor to a downward trend in u(t), a<br />

correlation that is compatible with the idea that a rise in the<br />

value of q relates to the ‘system’ becoming more propagative,<br />

which in stock market terms, indicates the likelihood for the<br />

markets becoming ‘bear’ dominant in the future.<br />

The method discussed above not only<br />

provides a general appraisal of different macroeconomic<br />

financial time series but also, subject to the size of the<br />

window used, an analysis of the data at any point in time.<br />

The output can be interpreted in terms of ‘persistence’ and<br />

‘anti-persistence’ and in terms of the existence or absence<br />

of after-effects (macroeconomic memory effects). For those<br />

periods in time when q(t) is relatively constant, the existing<br />

market tendencies usually remain. Changes in the existing<br />

trends tend to occur just after relatively sharp changes in<br />

q(t) have developed. This behaviour indicates the possibility<br />

of using the time series q(t) for identifying the behaviour<br />

of a macroeconomic financial system in terms of both inter-market<br />

and between-market analysis. These results support the<br />

possibility of using q(t) as an independent volatility predictor<br />

to give a risk assessment associated with the likely future<br />

behaviour of different economic time series. Further, because<br />

this analysis is based on equation (2), which defines a<br />

(stationary) random scaling fractal signal, the results are, in<br />

principle, scale invariant.<br />

Fig. 5. Application of the FMH using a 1024 element window for analysing<br />

financial time series composed of Dow Jones Close-of-Day data from<br />

02-11-1932 to 25-03-2009. Above: Dow Jones Close-of-Day data (blue) and<br />

q(t) (red) computed using a window of 1024; Below: Histogram of q(t) for<br />

100 bins.<br />

B. Equivalence with a Wavelet Transform<br />

The wavelet transform is defined in terms of projections of<br />

f(t) onto a family of functions that are all normalized dilations<br />

and translations of a prototype ‘wavelet’ function w [50], i.e.<br />

$$W[f(t)] = F_L(t) = \int_{-\infty}^{\infty} f(\tau)\, w_L(\tau, t)\, d\tau$$<br />

where<br />

$$w_L(\tau, t) = \frac{1}{\sqrt{L}}\, w\!\left(\frac{\tau - t}{L}\right), \quad L > 0.$$<br />

The independent variables L and t are continuous dilation and<br />

translation parameters respectively. The wavelet transformation<br />

is essentially a convolution transform where $w_L(t)$ is the<br />

convolution kernel with dilation variable L. The introduction<br />

of this factor provides dilation and translation properties into<br />

the convolution integral that gives it the ability to analyse<br />

signals in a multi-resolution role (the convolution integral is<br />

now a function of L), i.e.<br />

$$F_L(t) = w_L(t) \otimes f(t), \quad L > 0.$$<br />

In this sense, the asymptotic solution (ignoring scaling)<br />

$$u(t) = \frac{1}{t^{1-q/2}} \otimes n(t), \quad q > 0$$<br />

is compatible with the case of a wavelet transform where<br />

$$w_1(t) = \frac{1}{t^{1-q/2}}$$<br />

for the stationary case and where, for the non-stationary case,<br />

$$w_1(t, \tau) = \frac{1}{t^{1-q(\tau)/2}}.$$<br />



C. Macrotrend Analysis<br />

In order to develop a macrotrend signal that has optimal<br />

properties with regard to the assessment of risk (i.e. the likely<br />

future behaviour of an economic signal), it is important that: (i) the<br />

filter used is consistent with the properties of a Variation<br />

Diminishing Smoothing Kernel (VDSK); (ii) the last few<br />

values of the trend signal are ‘data consistent’. VDSKs are<br />

convolution kernels with properties that guarantee smoothness<br />

around points of discontinuity of a given signal where the<br />

smoothed function is composed of a similar succession of<br />

concave or convex arcs equal in number to those of the signal.<br />

VDSKs also have ‘geometric properties’ that preserve the<br />

‘shape’ of the signal. There is a range of VDSKs of which the<br />

most common is a Gaussian function and, for completeness,<br />

Appendix II provides an overview of the principal analytical<br />

properties, including fundamental Theorems and Proofs, of<br />

such kernels including the Gaussian kernel.<br />

In practice, the computation of the smoothing process using<br />

a VDSK must be performed in such a way that the initial and<br />

final elements of the output data are entirely data consistent<br />

with the input array within the locality of any element. Since<br />

a VDSK is a non-localised filter which tends to zero at<br />

infinity, in order to optimise the numerical efficiency of the<br />

smoothing process, filtering is undertaken in Fourier space.<br />

However, in order to produce a data consistent macrotrend<br />

signal using a Discrete Fourier Transform, wrapping effects<br />

must be eliminated. The solution is to apply an ‘end point<br />

extension’ scheme which involves padding the input vector<br />

with elements equal to the first and last values of the vector.<br />

The length of each ‘padding vector’ is taken to be at least<br />

half the size of the input vector. The output vector is obtained<br />

by deleting the filtered padding vectors.<br />
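The end point extension scheme described above can be sketched in Python as follows. The Gaussian filter exp(−βω²) is applied in Fourier space to the padded vector and the padding is then discarded. The function name `macrotrend`, the use of a naive DFT and the choice of ω in radians per sample are assumptions made for illustration; the paper does not state the frequency units associated with β.<br />

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def macrotrend(signal, beta):
    """Gaussian low-pass filtering with end point extension."""
    n = len(signal)
    pad = (n + 1) // 2                 # at least half the input length
    padded = [signal[0]] * pad + list(signal) + [signal[-1]] * pad
    m = len(padded)
    spectrum = dft(padded)
    for k in range(m):
        # Signed angular frequency in radians per sample (assumed units).
        omega = 2.0 * math.pi * (k if k <= m // 2 else k - m) / m
        spectrum[k] *= math.exp(-beta * omega * omega)
    smooth = dft(spectrum, inverse=True)
    # Delete the filtered padding vectors to recover the original length.
    return [c.real for c in smooth[pad:pad + n]]


trend = macrotrend([1.0, 1.2, 0.9, 1.4, 1.3, 1.7, 1.6, 2.0, 1.9], beta=5.0)
```

Because the padding repeats the first and last samples, the wrap-around inherent in the discrete transform mixes each end of the signal only with copies of its own end value rather than with the opposite end.<br />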

Figures 6 and 7 show examples of macrotrend analysis<br />

applied to the economic time series obtained from [19] and<br />

[20] and the signal q(t) using the VDSK filter $\exp(-\beta\omega^2)$.<br />

Table 1 provides quantitative information on the statistics of the<br />

signal q(t). Figures 6 and 7 include the normalised gradients<br />

computed using a ‘forward differencing scheme’ which clearly<br />

illustrate ‘phase shifts’ associated with the two signals. From<br />

Table 1, the mean value of q(t) for the Dow Jones index<br />

is slightly lower than the mean for the FTSE and in both<br />

cases, the Null Hypothesis test as to whether q(t) is Gaussian<br />

distributed is negative, i.e. the ‘Composite Normality’ is of<br />

type ‘Reject’.<br />
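A forward differencing scheme of the kind mentioned above can be written in a few lines of Python. The normalisation is not specified in the text; dividing by the largest absolute difference, as done here, is an assumption, and the function name is hypothetical.<br />

```python
def normalised_gradient(x):
    """Forward differences x[i+1] - x[i], scaled to lie in [-1, 1]."""
    if len(x) < 2:
        return []
    diffs = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    peak = max(abs(d) for d in diffs)
    if peak == 0:
        return [0.0] * len(diffs)
    return [d / peak for d in diffs]


gradient = normalised_gradient([0.0, 1.0, 3.0, 2.0])
```

Applying the same function to the macrotrend of the index and to the macrotrend of q(t) puts both gradients on a common scale, which makes any lead or lag between their turning points easier to compare.<br />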

VII. CASE STUDY: ANALYSIS OF ABX INDICES<br />

ABX indices serve as a benchmark of the market for<br />

securities backed by home loans issued to borrowers with weak<br />

credit. The index is administered by the London-based Markit<br />

Group which specialises in credit derivative pricing [52].<br />

A. What is an ABX index?<br />

The index is based on a basket of Credit Default Swap<br />

(CDS) contracts for the sub-prime housing equity sector.<br />

Credit Default Swaps operate as a type of insurance policy<br />

for banks or other holders of bad mortgages. If the mortgage<br />

goes bad, then the seller of the CDS must pay the bank for the<br />

lost mortgage payments. Alternatively, if the mortgage stays<br />

good then the seller makes a lot of money. The riskier the<br />

bundle of mortgages the lower the rating.<br />

Fig. 6. Analysis of FTSE Close-of-Day data from 25-04-1988 to 20-03-2009.<br />

Top-left: FTSE data (blue) and q(t) (red) computed using a 1024 moving<br />

window; Top-right: 100 bin histogram; Bottom-left: Macrotrends (β = 0.1);<br />

Bottom-right: Normalised gradients of macrotrends.<br />

Fig. 7. Analysis of DJ Close-of-Day data from 25-04-1988 to 20-03-2009.<br />

Top-left: FTSE data (blue) and q(t) (red) computed using a window of 1024;<br />

Top-right: 100 bin histogram; Bottom-left: Macrotrends (β = 0.1);<br />

Bottom-right: Normalised gradients of macrotrends.<br />

TABLE I<br />

STATISTICAL VALUES ASSOCIATED WITH q(t) COMPUTED FOR FTSE AND<br />

DJ CLOSE-OF-DAY DATA FROM 25-04-1988 TO 20-03-2009 GIVEN IN<br />

FIGURES 6 AND 7 RESPECTIVELY.<br />

Statistical Parameter | q(t)-FTSE | q(t)-DJ<br />

Minimum Value | 0.9876 | 0.9752<br />

Maximum Value | 1.5067 | 1.5154<br />

Range | 0.5190 | 0.5402<br />

Mean | 1.2482 | 1.2218<br />

Median | 1.2639 | 1.2452<br />

Standard Deviation | 0.1017 | 0.1269<br />

Variance | 0.0104 | 0.0161<br />

Skew | -0.4080 | -0.2881<br />

Kurtosis | 2.3745 | 1.8233<br />

Composite Normality | Reject | Reject<br />

The original goal of the index was to create visibility and<br />

transparency but it was not clear at the time of its inception<br />

that the index would be so closely followed. As subprime<br />

securities have become increasingly uncertain, the ABX index<br />

has become a key point of reference for investors navigating<br />

risky mortgage debt on an international basis. Hence, in light<br />

of the current financial crisis (i.e. from 2008 to date), and given<br />

that most economists agree that the subprime mortgage market was a<br />

primary catalyst for the crisis, analysis of the ABX index has<br />

become a key point of reference for investors navigating the<br />

world of risky mortgage debt.<br />

On asset-backed securities such as home equity loans the<br />

CDS provides an insurance against the default of a specific<br />

security. The index enables users to trade in a security without<br />

being limited to the physical outstanding amount of that<br />

security, thereby giving investors liquid access to the most<br />

frequently traded home equity tranches in a basket form. The<br />

ABX uses five indices that range from triple-A to triple-<br />

B minus. Each index chooses deals from 20 of the largest<br />

sub-prime home equity shelves by issuance amount from<br />

the previous six months. The minimum deal size is $500<br />

million and each tranche referenced must have an average<br />

life of between four and six years, except for the triple-A<br />

tranche, which must have a weighted average life greater than<br />

five years. Each of the indices references a differently<br />

rated tranche, i.e. AAA, AA, A, BBB and BBB-. They are<br />

selected through identification of the most recently issued<br />

deals that meet the specific size and diversity criteria. The<br />

principal ‘market-makers’ in the index were/are: Bank of<br />

America, Bear Stearns, Citigroup, Credit Suisse, Deutsche<br />

Bank, Goldman Sachs, J P Morgan, Lehman Brothers, Merrill<br />

Lynch (now Bank of America), Morgan Stanley, Nomura<br />

International, RBS Greenwich Capital, UBS and Wachovia.<br />

However, during the financial crisis that developed in 2008,<br />

a number of changes have taken place. For example, on<br />

September 15, 2008, Lehman Brothers filed for bankruptcy<br />

protection following a massive exodus of most of its clients,<br />

drastic losses in its stock, and devaluation of its assets by<br />

credit rating agencies. Also in 2008, Merrill Lynch was acquired<br />

by Bank of America at which point Bank of America merged<br />

its global banking and wealth management division with the<br />

newly acquired firm. The Bear Stearns Companies, Inc. was a<br />

global investment bank and securities trading and brokerage firm<br />

until its collapse and fire sale to J P Morgan Chase in 2008.<br />

ABX contracts are commonly used by investors to speculate<br />

on or to hedge against the risk that the underlying mortgage<br />

securities are not repaid as expected. The ABX swaps offer<br />

protection if the securities are not repaid as expected, in<br />

return for regular insurance-like premiums. A decline in the<br />

ABX index signifies investor sentiment that subprime mortgage<br />

holders will suffer increased financial losses from those<br />

investments. Likewise, an increase in the ABX index signifies<br />

investor sentiment looking for subprime mortgage holdings to<br />

perform better as investments.<br />

B. ABX and the Sub-prime Market<br />

Prime loans are often packaged into securities and sold to<br />

investors to help lenders reduce risk. More than $500B of<br />

such securities were issued in the US in 2006. The problem<br />

for investors who bought 2006’s crop of high-risk mortgage<br />

originations was that, as the US housing market slowed, so<br />

did mortgage applications. To prop up the market, mortgage<br />

lenders relaxed their underwriting standards, lending to ever-riskier<br />

borrowers at ever more favourable terms.<br />

In the last few weeks of 2006, the poor credit quality of<br />

the 2006 vintage subprime mortgage origination started to<br />

become apparent. Delinquencies and foreclosures among high-risk<br />

borrowers increased at a dramatic rate, weakening the<br />

performance of the mortgage pools. In one security backed by<br />

subprime mortgages issued in March 2006, foreclosure rates<br />

were already 6.09% by December that year, while 5.52% of<br />

borrowers were late on their payments by more than 30 days.<br />

Lenders also began shutting their doors, sending shock waves<br />

through the high-risk mortgage markets throughout 2007. The<br />

problem kept new investor money at bay, and dramatically<br />

weakened a key derivative index tied to the performance of<br />

2006 high-risk mortgages, i.e. the ABX index. As a result,<br />

the ABX index plummeted, starting in<br />

December 2006 when BBB- fell below 100 for the first time.<br />

The most heavily traded subindex, representing loans rated<br />

BBB-, fell as hedge funds flocked to bet on the downturn and<br />

pushed up the cost of insuring against default. This led to a<br />

knock-on effect as lenders withdrew from the ABX market.<br />

In early 2007 the issues were seen as: (i) Which investors<br />

were bearing the losses from having bought sub-prime mortgage-backed<br />

securities? (ii) How large and concentrated were<br />

these losses? (iii) Had sub-prime securitization distributed<br />

the risk among many players in the financial system or were<br />

the positions and losses concentrated among a few players?<br />

(iv) What were the potential systemic risk effects of these<br />

losses? We now know that the systemic risk had a devastating<br />

effect on the global economy and became known as the ‘Credit<br />

Crunch’. One of the catalysts for the problem was a US<br />

bill allowing bankruptcy judges to alter loan balances which<br />

nobody dealing in CDS had considered. The second key factor<br />

was the speed of deterioration of the ABX Indices in 2007



which shocked investors and left them waiting to see the<br />

bottom of the market before getting back in - they are still<br />

waiting. The third key factor was the failure of the US Treasury<br />

to provide foreclosure relief for distressed home owners which<br />

Congress had approved. The following series of reactions<br />

(denoted by →) was triggered as a result: the Treasury said it<br />

would not take steps to prevent home foreclosures, so that prices<br />

of mortgage securities collapsed → bank equity was wiped<br />

out → banks, with shrunken equity capital, were forced to cut<br />

back on all types of credit → financing for anything, especially<br />

residential mortgage loans, dried up → market values of homes<br />

declined further → mortgage securities declined further, and<br />

the downward spiral became self-perpetuating.<br />

C. Effect of ABX on Bank Equities<br />

At the end of February 2007 a price of 92.5 meant that a<br />

protection buyer would need to pay the protection seller 7.5%<br />

upfront and then 0.64% per year. At the time, this kind of<br />

mortgage yield was about 6.5%, so the upfront charge was<br />

more than the yield per year. By April 2009 the A grade index<br />

had fallen to 8 meaning that the protection seller would want<br />

92% upfront which meant that the sub-prime market ‘died’. In<br />

July 2007 AAA mortgage securities started trading at prices<br />

materially below par, or below 100. Until then, many banks<br />

had bulked up on mortgage securities that were rated AAA at<br />

the time of issue. This was because they believed that AAA<br />

bonds could always be traded at prices close to par, and<br />

consequently the bonds’ value would have a very small impact<br />

on earnings and equity capital. The mystique about AAA<br />

ratings dated back more than 80 years. From 1920 onward,<br />

the default experience on AAA rated bonds, even during the<br />

Great Depression, was nominal.<br />
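The price quotations at the start of this subsection follow a simple rule: the upfront payment, as a percentage of the amount insured, is 100 minus the index price. The Python sketch below restates that arithmetic; it is a simplified reading of the quotation convention used in the text, and the function names are hypothetical.<br />

```python
def upfront_payment_percent(index_price):
    """Upfront cost of protection as a percentage of notional: 100 - price."""
    return 100.0 - index_price


def first_year_cost_percent(index_price, annual_premium_percent):
    """Upfront payment plus one year of the running premium."""
    return upfront_payment_percent(index_price) + annual_premium_percent


feb_2007_upfront = upfront_payment_percent(92.5)        # 7.5 per cent
apr_2009_upfront = upfront_payment_percent(8.0)         # 92 per cent
feb_2007_first_year = first_year_cost_percent(92.5, 0.64)
```

With the figures quoted in the text, the first-year cost of protection in February 2007 already exceeded the mortgage yield of about 6.5%, which is the comparison being made above.<br />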

The way the securities are structured is that different classes<br />

of creditors, or different tranches, all hold ownership interests<br />

in the same pool of mortgages. However, the tranches with<br />

the lower ratings - BBB, A, AA - take the first credit<br />

losses and they are supposed to be destroyed before the<br />

AAA bondholders lose anything. Typically, AAA bondholders<br />

represent about 75-80% of the entire mortgage pool. During<br />

the Great Depression (1929-1933), national average home<br />

prices held their value far better than they have since 2007. The<br />

assumptions of a highly liquid trading market and gradual<br />

price declines have proved to be wrong. Beginning in the last<br />

half of 2007, the price decline of AAA bonds was steep,<br />

and the trading market suddenly became very illiquid. Under<br />

standard accounting rules, those securities must be marked<br />

to market every fiscal quarter, and the banks’ equity capital<br />

shrank beyond all expectations. Hundreds of billions of dollars<br />

have been lost as a result. However, the losses in mortgage<br />

securities, and from financial institutions such as Lehman that<br />

were undone by mortgage securities, dwarf everything else.<br />

Before the end of each fiscal quarter, bank managements must<br />

also budget for losses associated with mortgage securities. But<br />

since they cannot control market prices at a future date, they<br />

compensate by adjusting what they can control, which is all<br />

discretionary extensions of credit. Banks cannot legally lend<br />

beyond a certain multiple of their capital.<br />

D. Credit Default Swap Index<br />

This index is used to hedge credit risk or to take a position<br />

on a basket of credit entities. Unlike a credit default swap, a<br />

credit default swap index is a completely standardised credit<br />

security and may therefore be more liquid and trade at a<br />

smaller bid-offer spread. This means that it can be cheaper<br />

to hedge a portfolio of credit default swaps or bonds with a<br />

CDS index than to buy many CDS to achieve a similar effect.<br />

Credit-default swap indexes are benchmarks for protecting<br />

investors owning bonds against default, and traders use them to<br />

speculate on changes in credit quality. There are currently two<br />

main families of CDS indices: CDX and iTraxx. CDX indices<br />

contain North American and Emerging Market companies and<br />

are administered by CDS Index Company and marketed by<br />

Markit Group Limited, and iTraxx indices contain companies from<br />

the rest of the world and are managed by the International<br />

Index Company (IIC). A new series of CDS indices is issued<br />

every six months by Markit Group and IIC. Running up<br />

to the announcement of each series, a group of investment<br />

banks is polled to determine the credit entities that will form<br />

the constituents of the new issue. This process is intended<br />

to ensure that the index does not become ‘cluttered’ with<br />

instruments that no longer exist, or which trade illiquidly. On<br />

the day of issue a fixed coupon is decided for the whole index<br />

based on the credit spread of the entities in the index. Once this<br />

has been decided, the index constituents and the fixed coupon<br />

are published and the indices can be actively traded.<br />

E. Analysis of Sub-Prime CDS Market ABX Indices using the<br />

FMH<br />

The US Sub-Prime Housing Market is widely viewed as<br />

the source of the current economic crisis. The reason that<br />

it has had such a devastating effect on the global economy<br />

is that investment grade bonds were purchased by many<br />

substantial international financial institutions but in reality<br />

the method used to designate the relatively low risk required<br />

for investment grade securities was seriously flawed. This<br />

resulted in the investment grade bonds becoming virtually<br />

worthless very quickly when systemic risks that had wrongly<br />

been ignored undermined the entire market. About 80% of<br />

the market was designated investment grade (AAA - highest,<br />

AA and A - lowest) with protection provided by the high-risk<br />

grades (BBB- and BBB). The flawed risk model was based<br />

on an assumption that the investment grades would always be<br />

protected by the higher risk grades that would take all of the<br />

first 20% of defaults. Once defaults exceeded 20% the ‘house<br />

of cards’ was demolished. It is therefore of interest to see if a<br />

FMH based analysis of the ABX indices could have been used as<br />

a predictive tool in order to develop a superior risk model.<br />
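The protection assumption described above can be made concrete with a small loss allocation sketch in Python. It is a deliberately simplified illustration: the 20%/80% split is the approximate figure quoted in the text, losses are applied strictly in order of seniority, and the function and tranche names are hypothetical.<br />

```python
def allocate_losses(pool_loss, tranches):
    """Apply a pool loss to tranches ordered from most junior to most senior.

    pool_loss and tranche sizes are fractions of the whole mortgage pool.
    Returns a dict mapping tranche name to the loss it absorbs.
    """
    remaining = pool_loss
    absorbed = {}
    for name, size in tranches:
        hit = min(remaining, size)     # a tranche cannot lose more than its size
        absorbed[name] = hit
        remaining -= hit
    return absorbed


# Junior (high-risk) grades first, then the investment grades.
structure = [("BBB- and BBB", 0.20), ("A, AA and AAA", 0.80)]
mild = allocate_losses(0.10, structure)     # investment grades untouched
severe = allocate_losses(0.30, structure)   # investment grades start to lose
```

In the first case the investment grades are fully protected; in the second the junior grades are wiped out and the remaining losses fall on the investment grades, which is the point at which the model's assumption fails.<br />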

Figure 8 shows the ABX index for each grade using data<br />

supplied by the Systemic Risk Assessment Division of the<br />

Bank of England. During the second week of December 2006<br />

the BBB- index slipped to 99.76 for a couple of days but then<br />

recovered. In March 2007 the index for BBB- slipped just<br />

below 90 <strong>and</strong> seemed to be recovering <strong>and</strong> by mid-May was<br />

above 90 again. In June 2007 the BBB- really began to slide<br />

<strong>and</strong> this time it never recovered <strong>and</strong> was closely followed by


87 ISAST Transactions on <strong>Computers</strong> <strong>and</strong> <strong>Intelligent</strong> <strong>Systems</strong>, No. 1, Vol. 2, 2010 (ISSN 1798-2448)<br />

Fig. 8. Grades for the ABX Indices from 19 January 2006 to 2 April 2009<br />

based on Close-of-Day prices.<br />

the collapse of the BBB index after which there was no further<br />

protection for the investment grades. The default swaps work<br />

like an insurance so that if the cost of insuring against risk<br />

becomes greater than the annual return from the loan then the<br />

market is effectively dead. By February 2008 the AAA grade<br />

was below this viable level.<br />

The results of applying the FMH based on the algorithms discussed in Section 6 are given in Figures 9-13. Table 2 provides a list of the statistical variables associated with q(t) for each case. In each case, q(t) initially has values > 2 but this falls rapidly prior to a change of the index. Also, in each case, the turning point of the normalised gradient of the Gaussian filtered signal (i.e. the point in time of the minimum value) is an accurate reflection of the point in time prior to when the index falls rapidly relative to the prior data. This turning point occurs before the equivalent characteristic associated with the smoothed index. The model consistently ‘signals’ the coming meltdown with sufficient notice for orderly withdrawal from the market. For example, the data used for Figure 9 reflects the highest Investment Grade and would be regarded as particularly safe. The normalised gradient of the output data provides a very early signal of a change in trend, in this case at approximately 180 days from the start of the run, which is equivalent to early April 2007, at which point the index was just above 100. In fact the AAA index appears to be viable as an investment right up to early November 2008, after which it falls dramatically. In Figure 11, a trend change is again observed in the normalised gradient at approximately 190 days, which is equivalent to mid-April 2007. It is not until the second week of July 2007 that this index begins to fall rapidly. In Figure 13 the normalised gradient signals a trend change at around 170 days for the highest risk grade. This is equivalent to the third week of March 2007. At this stage the index was only just below 90 and appeared to be recovering.
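The turning-point test described above (smooth with a Gaussian lowpass filter, take the gradient, normalise it, and locate its minimum) can be sketched in a few lines. The Python below is a minimal illustration rather than the code used to produce Figures 9-13: the function names, the frequency grid, the forward-difference gradient and the normalisation by the largest magnitude are all assumptions made for the sketch, and a naive O(n^2) discrete Fourier transform is used so that only the standard library is needed.

```python
import cmath
import math


def gaussian_lowpass(signal, beta):
    """Smooth a real sequence by multiplying its DFT by exp(-beta * w**2).

    The angular frequency grid w_k = 2*pi*k/n (wrapped to negative values
    above n/2) is an assumption made for this sketch."""
    n = len(signal)
    spectrum = [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    filtered = []
    for k, value in enumerate(spectrum):
        k_wrapped = k if k <= n // 2 else k - n
        w = 2.0 * math.pi * k_wrapped / n
        filtered.append(value * math.exp(-beta * w * w))
    # Inverse DFT; the imaginary part is rounding noise for real input.
    return [sum(filtered[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]


def normalised_gradient(trend):
    """Forward-difference gradient scaled so its largest magnitude is 1."""
    grad = [b - a for a, b in zip(trend, trend[1:])]
    scale = max(abs(g) for g in grad) or 1.0
    return [g / scale for g in grad]


def turning_point(trend):
    """Index at which the normalised gradient attains its minimum."""
    grad = normalised_gradient(trend)
    return min(range(len(grad)), key=grad.__getitem__)
```

Because the DFT treats the record as periodic, a series whose first and last values differ strongly will show edge effects after filtering; in practice the ends of the record would need to be tapered or discarded before the minimum is located.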

Fig. 9. Analysis of AAA ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009. Top-left: AAA data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Fig. 10. Analysis of AA ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009 for a moving window of 128 elements. Top-left: AA data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Fig. 11. Analysis of A ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009 for a moving window of 128 elements. Top-left: A data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Fig. 12. Analysis of BBB ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009 for a moving window of 128 elements. Top-left: BBB data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Fig. 13. Analysis of BBB- ABX.HE indices (2006 H1 vintage) by rating (closing prices) from 24-07-2006 to 02-04-2009 for a moving window of 128 elements. Top-left: BBB- data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.

Statistical
Parameter    AAA      AA       A        BBB      BBB-
Min.         1.1834   1.0752   1.0522   1.0610   1.0646
Max.         3.1637   2.8250   2.7941   2.4476   2.5371
Range        1.9803   1.7499   1.7420   1.3867   1.4726
Mean         2.0113   1.7869   1.6663   1.5141   1.4722
Median       1.9254   1.7001   1.4923   1.3425   1.3243
SD           0.3928   0.4244   0.4384   0.3746   0.3476
Variance     0.1543   0.1801   0.1922   0.1404   0.1208
Skew         0.7173   0.3397   0.6614   0.8359   1.0345
Kurtosis     2.7117   1.8479   2.0809   2.2480   2.7467
CN           Reject   Reject   Reject   Reject   Reject

TABLE II
STATISTICAL VALUES ASSOCIATED WITH q(t) COMPUTED FOR ABX.HE INDICES (2006 H1 VINTAGE) BY RATING (CLOSING PRICES) FROM 24-07-2006 TO 02-04-2009. NOTE THAT THE ACRONYMS SD AND CN STAND FOR ‘STANDARD DEVIATION’ AND ‘COMPOSITE NORMALITY’ RESPECTIVELY.

VIII. CONCLUSION<br />

In terms of the non-stationary fractional diffusion model considered in this paper, the time varying Fourier dimension q(t) can be interpreted in terms of a ‘gauge’ on the characteristics of a dynamical system. This includes the management processes from which all modern economies may be assumed to be derived. In this sense, the FMH is based on three principal considerations: (i) the non-stationary behaviour associated with any system undergoing continuous change that is driven by a management infrastructure; (ii) the cause and effect that is inherent at all scales (i.e. all levels of management hierarchy); (iii) the self-affine nature of outcomes relating to points (i) and (ii).

In a modern economy, the principal issue associated with any form of financial management is based on the flow of information and the assessment of this information at different points connecting a large network. In this sense, a macroeconomy can be assessed in terms of its information network, which consists of a distribution of nodes from which information can flow in and out. The ‘efficiency’ of the system is determined by the level of randomness associated with the direction of flow of information to and from each node. The nodes of the system are taken to be individuals or small groups of individuals whose assessment of the information they acquire, together with their remit, responsibilities and initiative, determines the direction of the information flow from one node to the next. The determination of the efficiency of a system in terms of randomness is the most critical aspect of the model developed. It suggests that the performance of a business is related to how well information flows through an organisation.

The FMH has a number of fundamental differences with<br />

regard to the EMH which are tabulated in Table 3.<br />

EMH                                      FMH
Gaussian Statistics                      Non-Gaussian Statistics
Stationary Process                       Non-stationary Process
No memory - no historical correlations   Memory - historical correlations
No repeating patterns at any scale       Many repeating patterns at all scales - ‘Elliott waves’
Continuously stable at all scales        Continuously unstable at any scale - ‘Lévy Flights’

TABLE III
PRINCIPAL DIFFERENCES BETWEEN THE EFFICIENT MARKET HYPOTHESIS (EMH) AND THE FRACTAL MARKET HYPOTHESIS (FMH).

The non-stationary nature of the model presented in this paper is taken to account for stochastic processes that can vary in time and are intermediate between diffusive and propagative or persistent behaviour. Application of Orthogonal Linear Regression to macroeconomic time series data provides an accurate and robust method to compute q(t) when compared to other statistical estimation techniques such as the least squares method. As a result of the physical interpretation associated with the fractional diffusion equation and the ‘meaning’ of q(t), we can, in principle, use the signal q(t) as a predictive measure in the sense that as the value of q(t) continues to increase, there is a greater likelihood of volatile behaviour of the markets. This is reflected in the data analysis based on the examples given, in which a Gaussian lowpass filter $\exp(-\beta\omega^2)$ has been used to smooth both u(t) and q(t) to produce the associated macrotrends, where the value of β determines the level of detail they contain. From the examples provided, it is clear that the turning points of the gradients of a macrotrend in q(t) flag a future change in the trend of the economic signal u(t). This is compounded in the phase shifts that exist in the normalised gradients of u(t) and q(t) over frequency bands determined by the value of β. Although the interpretation of these phase shifts requires further study, from the results presented in this paper, it is clear that they provide an assessment of the risk associated with investing in a particular economic time series provided the series in question is a random scaling fractal. The ‘case study’ on the ABX Close-of-Day indices clearly illustrates the ability of the model to flag a point in time after which the indices change rapidly. The ABX indices exhibit a clear transition between a period when q(t) > 2 and when 1 < q(t) < 2 (Figures 9-13), which precedes the ‘collapse’ of the indices in 2008 and thereby the onset of the ‘Credit Crunch’.

In a statistical sense, q(t) is just another measure that may, or may not, be of value to market traders. In comparison with other statistical measures, this can only be assessed through its practical application in a live trading environment. However, in terms of its relationship to a stochastic model for macroeconomic data, q(t) does provide a measure that is consistent with the physical principles associated with a random walk that includes a directional bias, i.e. fractional Brownian motion. The model considered, and the signal processing algorithm proposed, have a close association with re-scaled range analysis for computing the Hurst exponent H [35]. In this sense, the principal contribution of this paper has been to consider a model that is quantified in terms of a physically significant (but phenomenological) model that is compounded in a specific (fractional) partial differential equation. As with other financial time series, their derivatives, transforms etc., a range of statistical measures can be used to characterise q(t), examples of which have been provided in this paper. It should be noted that in all cases studied to date, the composite normality of the signal q(t) is of type ‘Reject’. In other words, the statistics of q(t) are non-Gaussian. Further, assuming that a financial time series is statistically self-affine, the computation of q(t) can be applied over any time scale provided there is sufficient data for the computation of q(t) to be statistically significant. Thus, the results associated with the Close-of-Day data studied in this paper are, in principle, applicable to economic time series associated with tick data over a range of time scales.

APPENDIX I<br />

M-CODE FOR THE ORTHOGONAL LINEAR REGRESSION ALGORITHM

The following m-code is used to compute the Fourier dimension q from the power spectrum of a random fractal signal and is based on the code given in [51].

function x = linortfit(xdata, ydata)
% Input arrays are
%
%   xdata: 2, 3, ..., L/2
%   ydata: P[2], P[3], ..., P[L/2]
%
% Output value is x which gives the Fourier
% dimension q for input data P[i].
%
% Sum of squared orthogonal distances to the line y = p(1) + p(2)*x.
fun = @(p) sum((p(1) + p(2)*xdata - ydata).^2) / (1 + p(2)^2);
% Initial guess from ordinary least squares, reordered to [intercept, slope].
x0 = fliplr(polyfit(xdata, ydata, 1));
options = optimset('TolX', 1e-6, 'TolFun', 1e-6);
x = fminsearch(fun, x0, options);
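For a straight-line fit, the objective minimised by the m-code above (the sum of squared residuals divided by 1 + slope^2) also has a closed-form minimiser, which can serve as an independent check on the iterative search. The Python sketch below is a swapped-in technique, not the method used in this paper or in [51]; the function name `orthogonal_line_fit` is hypothetical, and the sketch simply refuses the degenerate case of zero covariance.

```python
import math


def orthogonal_line_fit(xdata, ydata):
    """Return (intercept, slope) of the line minimising the sum of squared
    perpendicular distances to the points (closed-form total least squares)."""
    n = len(xdata)
    if n != len(ydata) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xdata) / n
    mean_y = sum(ydata) / n
    sxx = sum((x - mean_x) ** 2 for x in xdata)
    syy = sum((y - mean_y) ** 2 for y in ydata)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xdata, ydata))
    if sxy == 0.0:
        raise ValueError("zero covariance is not handled by this sketch")
    # Root of sxy*b^2 + (sxx - syy)*b - sxy = 0 with the same sign as sxy.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    print(orthogonal_line_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]))
```

Unlike ordinary least squares, the orthogonal fit treats both coordinates symmetrically: exchanging the roles of `xdata` and `ydata` yields the reciprocal slope.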

APPENDIX II<br />

VARIATION DIMINISHING SMOOTHING KERNELS<br />

Variation Diminishing Smoothing Kernels (VDSK) are convolution kernels with properties that guarantee smoothness and thereby eliminate Gibbs’ effect around points of discontinuity of a given function. Further, the smoothed function can be shown to be made up of a similar succession of concave or convex arcs equal in number to those of the function. Thus, we consider the following question: let there be given a continuous or discontinuous function f whose graph is composed of a succession of alternating concave or convex arcs. Is there a smoothing kernel (or a set of them) which produces a smoothed function whose graph is also made up of a similar succession of concave or convex arcs equal in number to those of f?¹

II.1 Laguerre-Pólya Class Entire Functions

The class of kernels which relate to this question are a class of entire functions, which shall be called class E, originally studied by E Laguerre and G Pólya. An entire function E(z), z ∈ C belongs to the class E ⇐⇒

$$E(z) = \exp(bz - cz^2)\prod_{\ell=1}^{\infty}\left(1 - \frac{z}{a(\ell)}\right)\exp[z/a(\ell)], \qquad \text{(II.1.1)}$$

where b, c, a(ℓ) ∈ R, c ≥ 0, and

$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty, \qquad \text{(II.1.2)}$$

where ⇐⇒ is taken to denote ‘if and only if’ (iff). The convergence of the series (II.1.2) guarantees that the product in (II.1.1) converges and represents an entire function. Laguerre proved, and Pólya added a refinement, that a sequence of polynomials, having real roots only, which converges uniformly in every compact set of the complex plane C, approaches a function of class E in the uniform limit of such a sequence.

For example,

$$\exp(-z^2) = \lim_{\ell\to\infty}\left(1 - \frac{z^2}{\ell^2}\right)^{\ell^2},$$

and the polynomials $(1 - z^2/\ell^2)$ have real roots only. In this definition, it is not assumed that the a(ℓ) are distinct. To include the case in which the product has a finite number of factors or reduces to 1 without additional notation, it is assumed that some or all of the a(ℓ) may be ∞. Furthermore, it is assumed, without loss of generality, that the roots a(ℓ) are arranged in an order of increasing absolute values,

$$0 < |a(1)| \le |a(2)| \le |a(3)| \le \dots$$

Examples of functions belonging to class E are

$$1,\quad 1 - z,\quad \exp(z),\quad \exp(-z^2),\quad \cos z,\quad \frac{\sin z}{z},\quad \Gamma^{-1}(1 - z),\quad \Gamma^{-1}(z).$$

Note that the product of two functions of this class produces a new function of the same class.

¹ Based on an edited version of material developed by A Domingez-Torres, ‘Fourier Based Method in CAD’, PhD Thesis, Cranfield University, 1991.
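The limit quoted in the example above, in which polynomials with the real roots z = ±ℓ approach exp(−z²), is easy to examine numerically. The short Python sketch below is illustrative only (the function name `polynomial_approximation` and the sample point z = 0.5 are arbitrary choices); it evaluates the polynomial for increasing ℓ and prints the distance from the limit.

```python
import math


def polynomial_approximation(z, ell):
    """Evaluate (1 - z**2 / ell**2) ** (ell**2) for real z and integer ell >= 1."""
    return (1.0 - z * z / (ell * ell)) ** (ell * ell)


if __name__ == "__main__":
    z = 0.5
    target = math.exp(-z * z)
    for ell in (2, 10, 100, 1000):
        approx = polynomial_approximation(z, ell)
        print(ell, approx, abs(approx - target))
```

The printed error shrinks as ℓ grows, which is the behaviour expected of a sequence converging to exp(−z²) at a fixed point.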

II.2 Variation Diminishing Smoothing Kernels (VDSKs)<br />

A function k is variation diminishing iff it is of the form

$$k(x) = (2\pi i)^{-1}\int_{-i\infty}^{i\infty} [E(z)]^{-1}\exp(zx)\,dz, \qquad \text{(II.2.1)}$$

where E(z) ∈ E is given by

$$E(z) = \exp(bz - cz^2)\prod_{\ell=1}^{\infty}\left(1 - \frac{z}{a(\ell)}\right)\exp[z/a(\ell)], \qquad \text{(II.2.2)}$$

with b, c, a(ℓ) ∈ R, c ≥ 0, and

$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$

In other words, a frequency function k is variation diminishing iff its bilateral Laplace transform equals $[E(z)]^{-1}$:

$$[E(z)]^{-1} = \int_{-\infty}^{\infty} k(x)\exp(-zx)\,dx. \qquad \text{(II.2.3)}$$

In order to define a smoothing kernel, the function k given in (II.2.1) must be an even function. If k(x) is even, then the corresponding bilateral Laplace transform $[E(z)]^{-1}$ is also even. This fact follows readily from

$$[E(z)]^{-1} = \int_{-\infty}^{\infty} k(x)\exp(-zx)\,dx = \int_{-\infty}^{\infty} k(-x)\exp(-zx)\,dx = \int_{-\infty}^{\infty} k(x)\exp(zx)\,dx = [E(-z)]^{-1}.$$

Conversely, if $[E(z)]^{-1}$ is even, then its inverse bilateral transform is even since a component of convergence of (II.2.3) contains the imaginary axis. This follows from the fact that the component of convergence of each one of the functions which compose E(z) contains completely the imaginary axis. Further, it follows that

$$[E(iu)]^{-1} = K(u), \qquad \text{(II.2.4)}$$

where K(u) is the FT of k. From the evenness of $[E(z)]^{-1}$ it follows that K(u) is real, hence k is even. But E(z) is even iff b = 0 and a(2ℓ − 1) = −a(2ℓ), ℓ = 1, 2, . . . . Therefore E(z) is taken to be

$$E(z) = \exp(-cz^2)\prod_{\ell=1}^{\infty}\left(1 - \frac{z^2}{a^2(\ell)}\right), \qquad \text{(II.2.5)}$$

with c, a(ℓ) ∈ R, c ≥ 0, and

$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$

Equation (II.2.4) establishes the relationship between the bilateral Laplace transform and the Fourier transform of k. Thus, any analysis associated with use of the bilateral Laplace transform can be undertaken in terms of the Fourier transform. Using equation (II.2.4) the Fourier transform of (II.2.1) is given by

$$k(x) \leftrightarrow K(u) = [E(iu)]^{-1} = \exp(-cu^2)\prod_{\ell=1}^{\infty}\left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right], \qquad \text{(II.2.6)}$$

where ↔ denotes transformation from real to Fourier space, c, a(ℓ) ∈ R, c ≥ 0, and

$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$

Because equation (II.2.6) is a variation diminishing function by construction and |K(0)| ≤ 1, the following result holds.

Theorem II.2.1 (VDSKs)<br />

k defined as in equation (II.2.6)<br />

=⇒<br />

1. k is a smoothing kernel belonging to SK1,<br />

2. k is variation diminishing,<br />

3. k(x) ≥ 0, x ∈ R.<br />

In order to make a complete study of the VDSKs, such kernels will be divided into three classes: The Finite VDSKs, The Non-Finite VDSKs, and The Gaussian VDSK.

II.3 The Finite VDSKs<br />

The finite and the non-finite VDSKs are kernels which can be synthesized from the following basic function:

$$e(x) = \frac{1}{2}\exp(-|x|), \quad x \in R. \qquad \text{(II.3.1)}$$

The finite VDSKs are made up by a finite number of convolutions of functions a(ℓ) e[a(ℓ)x], ℓ = 1, 2, . . . . Clearly e(x) is a VDSK with mean ν = 0 and variance $\sigma^2 = 2$ and its Fourier transform is given by

$$e(x) \leftrightarrow \frac{1}{1 + u^2}. \qquad \text{(II.3.2)}$$

Note that if a > 0, then a e(ax) is again a VDSK. Using the similarity property of the Fourier transform and equation (II.3.2), its Fourier transform is given by

$$a\,e(ax) \leftrightarrow \frac{a^2}{a^2 + u^2}. \qquad \text{(II.3.3)}$$

Its mean ν again vanishes and its variance takes the value $\sigma^2 = 2/a^2$.
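The transform pair (II.3.3) and the variance 2/a² can be checked by direct numerical integration. The Python sketch below is an illustration under stated assumptions: the helper names, the truncation of the integrals to [−40, 40] and the trapezoidal rule with 40000 steps are choices made for this example only.

```python
import math


def scaled_kernel(x, a):
    """a * e(a*x) with e(x) = 0.5 * exp(-|x|)."""
    return 0.5 * a * math.exp(-a * abs(x))


def integrate(func, lower, upper, steps):
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    total += sum(func(lower + i * h) for i in range(1, steps))
    return total * h


def fourier_transform(u, a, half_width=40.0, steps=40000):
    """Real part of the FT; the imaginary part vanishes because the kernel is even."""
    return integrate(lambda x: scaled_kernel(x, a) * math.cos(u * x),
                     -half_width, half_width, steps)


def variance(a, half_width=40.0, steps=40000):
    """Second moment of the kernel, which equals its variance since the mean is 0."""
    return integrate(lambda x: x * x * scaled_kernel(x, a),
                     -half_width, half_width, steps)


if __name__ == "__main__":
    a, u = 2.0, 1.5
    print(fourier_transform(u, a), a * a / (a * a + u * u))
    print(variance(a), 2.0 / (a * a))
```

Each printed pair compares the numerical value with the closed form, so the two entries on a line should agree to several decimal places.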

Let a(1), a(2), . . . , a(n) > 0 be constants, some or all of which may be coincident. The following VDSKs are introduced

$$k_\ell(x) = a(\ell)\,e[a(\ell)x], \quad \ell = 1, 2, \dots, n. \qquad \text{(II.3.4)}$$

The combination of these functions by convolution gives a new VDSK with properties quantified in the following theorem.

Theorem II.3.1 (Properties of The Finite VDSKs)
1. a(ℓ) > 0, ℓ = 1, 2, . . . ,
2. $k_\ell(x) = a(\ell)\,e[a(\ell)x]$,
3. $k = k_1 \otimes k_2 \otimes \cdots \otimes k_n$,
4. $K(u) = \prod_{\ell=1}^{n}\left[a^2(\ell)/(a^2(\ell) + u^2)\right]$
=⇒
A. k is a VDSK,
B. k(x) ↔ K(u),
C. k has mean ν = 0,
D. k has variance $\sigma^2 = \sum_{\ell=1}^{n}\left[2/a^2(\ell)\right] < \infty$.

Proof. A. The assertion follows from mathematical induction.

B. It follows from the Convolution Theorem and mathematical induction.

C. Let $k_\ell(x) \leftrightarrow K_\ell(u)$. Then because each $k_\ell$ is a VDSK, it follows that the respective mean, $\nu_\ell$, is given by

$$\nu_\ell = iK_\ell'(0) = 0, \quad \ell = 1, 2, \dots, n.$$

Moreover, if n = 2, then the mean ν of k is given by

$$\nu = iK'(0) = i(K_1K_2)'(0) = i(K_1K_2' + K_1'K_2)(0) = i(0) = 0.$$

The assertion follows from this result and mathematical induction.

D. Let $k_\ell(x) \leftrightarrow K_\ell(u)$. Then because $k_\ell$ is a VDSK, it follows that the respective variance, $\sigma_\ell^2$, is given by

$$\sigma_\ell^2 = -K_\ell''(0) = \frac{2}{a^2(\ell)}, \quad \ell = 1, 2, \dots, n.$$

Furthermore, from the result given in C above, if n = 2, then the variance $\sigma^2$ of k is given by

$$\sigma^2 = -K''(0) = -(K_1K_2)''(0) = (-K_1K_2'' - 2K_1'K_2' - K_1''K_2)(0) = \frac{2}{a^2(1)} + \frac{2}{a^2(2)}.$$

The assertion follows from this result and mathematical induction.

From the explicit expression of K(u) given in Theorem II.3.1, it follows that

$$\begin{aligned}
K(u) &= \prod_{\ell=1}^{n}\left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right]
= \prod_{\ell=1}^{n}\left[\frac{a(\ell)}{a(\ell) - iu}\right]\left[\frac{-a(\ell)}{-a(\ell) - iu}\right] \\
&= \left(\prod_{\ell=1}^{n}\left[\frac{a(\ell)}{a(\ell) - iu}\right]\right)\left(\prod_{\ell=1}^{n}\left[\frac{-a(\ell)}{-a(\ell) - iu}\right]\right)
= \prod_{\ell=1}^{2n}\left[\frac{d(\ell)}{d(\ell) - iu}\right],
\end{aligned}$$

where d(ℓ) = a(ℓ) for ℓ = 1, 2, . . . , n and d(ℓ) = −a(ℓ − n) for ℓ = n + 1, n + 2, . . . , 2n. Thus k is of degree 2n and the following theorem holds.

Theorem II.3.2 (Degree of Differentiability of The Finite VDSKs)
k a finite VDSK,
=⇒
1. $k \in C^{2n-2}(R, R)$,
2. $k \in C^{2n-1}(R, R)$ except at x = 0, where $k^{(2n-1)}(0^+)$ and $k^{(2n-1)}(0^-)$ both exist.
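As a worked illustration of Theorems II.3.1 and II.3.2 (a sketch, assuming n = 2 and writing the two constants as distinct positive numbers $a_1 = a(1)$ and $a_2 = a(2)$), a partial-fraction expansion of K(u) followed by term-by-term inversion with the pair (II.3.3) gives the kernel explicitly:

```latex
K(u) = \frac{a_1^2 a_2^2}{(a_1^2 + u^2)(a_2^2 + u^2)}
     = \frac{a_1^2 a_2^2}{a_2^2 - a_1^2}
       \left[\frac{1}{a_1^2 + u^2} - \frac{1}{a_2^2 + u^2}\right],
\qquad
k(x) = \frac{a_1 a_2}{2\,(a_2^2 - a_1^2)}
       \left[a_2\, e^{-a_1 |x|} - a_1\, e^{-a_2 |x|}\right].
```

Differentiating for x > 0 gives $k'(0^+) = 0$ and $k'''(0^+) = a_1^2 a_2^2/2$; since k is even, $k'''(0^-) = -a_1^2 a_2^2/2$. Hence k, k′ and k″ are continuous while the third derivative has a finite jump at x = 0, in agreement with $k \in C^{2n-2}(R, R)$ for n = 2.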

The asymptotic behaviour of k and its Fourier transform, K, will now be studied.

Theorem II.3.3 (Asymptotic Behaviour of The Fourier transform of The Finite VDSKs)
1. k a finite VDSK,
2. k(x) ↔ K(u)
=⇒
$|K(u)| = O(|u|^{-2n})$, |u| → ∞.

Proof. k is made up of a finite number of convolution operations of functions $k_\ell(x) = a(\ell)\,e[a(\ell)x]$, where a(ℓ) > 0, ℓ = 1, 2, . . . , n, and whose FTs, $K_\ell(u)$, satisfy the inequality

$$|K_\ell(u)| = \left|\frac{a^2(\ell)}{a^2(\ell) + u^2}\right| \le \frac{a^2(\ell)}{|u|^2}, \quad \ell = 1, 2, \dots, n.$$

Thus

$$|K(u)| = \left|\prod_{\ell=1}^{n} K_\ell(u)\right| \le \prod_{\ell=1}^{n}\left[\frac{a^2(\ell)}{|u|^2}\right] = |u|^{-2n}\prod_{\ell=1}^{n} a^2(\ell). \qquad \text{(II.3.5)}$$

From the above theorem we construct the following corollaries.

Corollary II.3.4 (Absolute and Quadratic Integrability of The Fourier transform of The Finite VDSKs)
1. k a finite VDSK,
2. k(x) ↔ K(u)
=⇒
$K(u) \in L(R, R) \cap L^2(R, R)$.

Corollary II.3.5 (Absolute and Quadratic Integrability of The Finite VDSKs)
k a finite VDSK,
=⇒
$k(x) \in L(R, R) \cap L^2(R, R)$.

The Fourier transform of K(u), which is itself the Fourier transform of k, is given by

K(u) ↔ 2πk(−x).

Since k is an even function, then

K(u) ↔ 2πk(x).

This result, in conjunction with Corollary II.3.4 and the Riemann-Lebesgue Lemma, proves the following theorem.

Theorem II.3.6 (Asymptotic Behaviour of The Finite VDSKs)
k a finite VDSK
=⇒
k(x) → 0 as |x| → ∞.

II.4 The Non-Finite VDSKs<br />

We now study kernels k holding the property

$$k(x) \leftrightarrow K(u) = \prod_{\ell=1}^{\infty}\left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right], \qquad \text{(II.4.1)}$$

which are non-finite kernels. In particular, the infinite product in equation (II.4.1) may have only a finite number of factors, so that the finite VDSKs of the last section are included. Kernels holding equation (II.4.1) can be synthesized from the basic kernel

$$e(x) = \frac{1}{2}\exp(-|x|), \quad x \in R.$$

The non-finite VDSKs are composed of a non-finite number of functions a(ℓ) e[a(ℓ)x], ℓ = 1, 2, . . . . The properties of such kernels are given in the following theorem.

Theorem II.4.1 (Properties of The Non-Finite VDSKs)
1. a(ℓ) > 0, ℓ = 1, 2, . . . ,
2. $k_\ell(x) = a(\ell)\,e[a(\ell)x]$,
3. $k = k_1 \otimes k_2 \otimes \cdots \otimes k_n \otimes \cdots$,
4. $K(u) = \prod_{\ell=1}^{\infty}\left[a^2(\ell)/(a^2(\ell) + u^2)\right]$
=⇒
A. k is a VDSK,
B. k(x) ↔ K(u),
C. k has mean ν = 0,
D. k has variance $\sigma^2 = \sum_{\ell=1}^{\infty}\left[2/a^2(\ell)\right] < \infty$.

Since k (Theorem II.4.1) is made up by a non-finite number of convolution operations, it is of degree infinity, which leads to the following.

Theorem II.4.2 (Degree of Differentiability of The Non-Finite VDSKs)
k a non-finite VDSK
=⇒
$k \in C^{\infty}(R, R)$.

The asymptotic behaviour of the Fourier transform of a non-finite kernel is established in the following theorem.

Theorem II.4.3 (Asymptotic Behaviour of The Fourier transform of The Non-Finite VDSKs)
1. k a non-finite VDSK,
2. k(x) ↔ K(u),
3. R, p > 0
=⇒
$|K(u)| = O(|u|^{-2p})$, |u| → ∞.

Proof. Choose N > p and so large that |a(ℓ)| ≥ R when ℓ > N, which is possible since |a(ℓ)| → ∞ as ℓ → ∞. Set

$$K_N(u) = \prod_{\ell=N+1}^{\infty}\left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right].$$

By equation (II.3.5), it follows that

$$|K(u)| \le \frac{|K_N(u)|}{|u|^{2N}}\prod_{\ell=1}^{N} a^2(\ell).$$

Because $|K_N(u)|$ never vanishes and is continuous for all u ∈ R, then it has a positive lower bound. Hence, for a suitable constant M

$$|K(u)| \le \frac{M}{|u|^{2N}}.$$



In particular, if p = 1 in the above theorem and because k is a variation diminishing function, the following corollary results.

Corollary II.4.4 (Absolute Integrability of The Non-Finite Kernels and Their FT)
1. k a non-finite VDSK,
2. k(x) ↔ K(u)
=⇒
k, K ∈ L(R, R).

Application of the symmetry property of the Fourier transform, the Riemann-Lebesgue Lemma and the above corollary proves the following theorem.

Theorem II.4.5 (Asymptotic Behaviour of The Non-Finite VDSKs)
k a non-finite VDSK
=⇒
k(x) → 0 as |x| → ∞.

Some examples of non-finite VDSKs are:

$$\frac{\pi}{4}\,\mathrm{sech}^2\left(\frac{\pi x}{2}\right) \leftrightarrow u\,\mathrm{csch}\,u = \prod_{\ell=1}^{\infty}\left[\frac{\ell^2\pi^2}{\ell^2\pi^2 + u^2}\right], \qquad \text{(II.4.2)}$$

$$\frac{1}{2}\,\mathrm{sech}\left(\frac{\pi x}{2}\right) \leftrightarrow \mathrm{sech}\,u = \prod_{\ell=1}^{\infty}\left[\frac{(2\ell - 1)^2\pi^2}{(2\ell - 1)^2\pi^2 + 4u^2}\right]. \qquad \text{(II.4.3)}$$

Note that a non-finite VDSK does not necessarily belong to $L^2(R, R)$, e.g. the kernel given by equation (II.4.3).
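The infinite product in (II.4.2) corresponds to the choice a(ℓ) = ℓπ in (II.4.1), and its convergence to u csch u can be observed numerically. The Python sketch below is illustrative only; the function names and the truncation levels are arbitrary choices for this example.

```python
import math


def partial_product(u, terms):
    """Product of l^2*pi^2 / (l^2*pi^2 + u^2) for l = 1..terms."""
    product = 1.0
    for ell in range(1, terms + 1):
        a_squared = (ell * math.pi) ** 2
        product *= a_squared / (a_squared + u * u)
    return product


def u_csch_u(u):
    """Closed form u / sinh(u), valid for u != 0."""
    return u / math.sinh(u)


if __name__ == "__main__":
    u = 1.0
    for terms in (10, 100, 1000, 20000):
        print(terms, partial_product(u, terms), u_csch_u(u))
```

Every factor is smaller than 1, so the partial products decrease monotonically towards the closed form; the slow convergence reflects the 1/ℓ² decay of the neglected terms.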

II.5 The Gaussian VDSK<br />

The Gaussian VDSK, k, is defined by the relation

$$k(x) \leftrightarrow K(u) = \exp(-cu^2), \quad c > 0. \qquad \text{(II.5.1)}$$

With $c \to 1/(4c^2)$, the Gaussian VDSK is now defined as

$$k(x) \leftrightarrow K(u) = \exp(-u^2/4c^2), \quad c > 0. \qquad \text{(II.5.2)}$$

The basic properties of the above kernel follow directly and are collated together in the following theorem.

Theorem II.5.1 (Basic Properties of The Gaussian VDSK)
1. k(x) = c gauss(cx), c > 0,
2. $K(u) = \exp(-u^2/4c^2)$, c > 0,
3. p > 0
=⇒
A. k is a VDSK,
B. k(x) ↔ K(u),
C. k has mean ν = 0,
D. k has variance $\sigma^2 = 1/(2c^2)$,
E. $k, K \in L(R, R) \cap L^2(R, R)$,
F. $k, K \in C^{\infty}(R, R)$,
G. $|k(x)| = o(|x|^{-p})$,
H. $|K(u)| = o(|u|^{-p})$.

If in equation (II.5.1), c is considered as a variable, say t, then after taking the inverse Fourier transform with respect to u we obtain a real valued function of two variables, i.e.

$$k(x, t) = \frac{1}{\sqrt{4\pi t}}\exp(-x^2/4t). \qquad \text{(II.5.3)}$$

This new function is the familiar source solution of the diffusion equation

$$\left(\frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial t}\right)k(x, t) = 0. \qquad \text{(II.5.4)}$$
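That (II.5.3) satisfies (II.5.4) can be confirmed with a finite-difference check. The Python sketch below is an illustration only: the function names, the step size h and the sample point are arbitrary, and central differences stand in for the exact derivatives.

```python
import math


def source_solution(x, t):
    """k(x, t) = exp(-x^2 / (4 t)) / sqrt(4 pi t), defined for t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def diffusion_residual(x, t, h=1e-3):
    """Central-difference estimate of d2k/dx2 - dk/dt at (x, t)."""
    d2k_dx2 = (source_solution(x + h, t) - 2.0 * source_solution(x, t)
               + source_solution(x - h, t)) / (h * h)
    dk_dt = (source_solution(x, t + h) - source_solution(x, t - h)) / (2.0 * h)
    return d2k_dx2 - dk_dt


if __name__ == "__main__":
    print(diffusion_residual(0.7, 1.3))
```

The residual is limited by the O(h²) truncation error of the differences, so it is small compared with the individual derivative terms rather than exactly zero.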

II.6 Geometric Properties of The VDSKs<br />

We consider the general geometric properties shared by the finite, non-finite and the Gaussian VDSKs, where k denotes either a finite, non-finite or Gaussian VDSK throughout.

Theorem II.6.1 (Geometric Properties of The VDSKs)
1. k a VDSK,
2. f : R → R bounded and convex (concave)
=⇒
A. For a, b ∈ R

$$V[k(x) \otimes f(x) - a - bx] \le V[f(x) - a - bx], \qquad \text{(II.6.1)}$$

B. (k ⊗ f)(x) is convex (concave).

Proof. A. Inequality (II.6.1) follows by a direct application of the variation diminishing property of k.

B. It is well known that f is convex iff

$$\Delta_h^2 f(x) = f(x + 2h) - 2f(x + h) + f(x) \ge 0,$$

for all x ∈ R, h > 0. Because k is a non-negative function, then

$$\Delta_h^2[(k \otimes f)(x)] = \Delta_h^2\left[\int_{-\infty}^{\infty} k(y)f(x - y)\,dy\right] = \int_{-\infty}^{\infty} k(y)\,\Delta_h^2 f(x - y)\,dy \ge 0.$$

Thus the inequality follows. The case for which f is concave follows using a similar argument but with $\Delta_h^2 f(x) \le 0$, for all x ∈ R, h > 0.

The geometric significance of inequality (II.6.1) is that the<br />

number of intersections of the straight line y = a + bx, a, b ∈<br />

R, with (k⊗f)(x) does not exceed the number of intersections<br />

of y = a + bx with y = f(x). As a special instance of such<br />

an inequality, it follows that (k ⊗ f)(x) is non-negative if f<br />

is non-negative.<br />

Corollary II.6.2 (Non-Negativity of k ⊗ f)<br />

1. k a VDSK,<br />

2. f : R → R, f ≥ 0, and bounded<br />

=⇒<br />

(k ⊗ f)(x) ≥ 0, x ∈ R.<br />

From the above results, it is clear that if f is composed of<br />

a succession of alternating convex or concave arcs, then k ⊗ f<br />

is also made up of a similar succession of convex or concave<br />

arcs equal in number to those of f. Thus, a VDSK is shape<br />

preserving.<br />
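The convexity and non-negativity results can be illustrated on sampled data. The sketch below is a discrete analogue under our own assumptions (a truncated, renormalised Gaussian weight sequence and a 'valid' convolution that discards the borders); it is not the continuous VDSK of the theorem. Because the weights are non-negative, the second difference of the smoothed sequence is a weighted sum of second differences of the input, which is the argument used in the proof of part B.

```python
import math


def gaussian_weights(c, half_width, dx):
    """Truncated Gaussian weights exp(-(c*j*dx)^2), renormalised to sum to 1.

    Truncation and renormalisation are assumptions made for this sketch.
    """
    raw = [math.exp(-(c * j * dx) ** 2) for j in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [v / total for v in raw]


def smooth_valid(f, w):
    """Discrete convolution of samples f with weights w, keeping only the
    positions where the whole weight window lies inside the data."""
    n = len(w)
    return [sum(w[j] * f[i + n - 1 - j] for j in range(n))
            for i in range(len(f) - n + 1)]


def second_differences(g):
    """Forward second differences g[i+2] - 2 g[i+1] + g[i]."""
    return [g[i + 2] - 2.0 * g[i + 1] + g[i] for i in range(len(g) - 2)]
```

For a convex, non-negative sample sequence, for example f(x) = |x| + 0.1x² on a uniform grid, the smoothed sequence should again have non-negative second differences and non-negative values.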

ACKNOWLEDGMENTS<br />

The ABX data was provided by the Systemic Risk Analysis Division, Bank of England, who originally commissioned the research.



REFERENCES<br />

[1] http://www.tickdata.com/<br />

[2] http://www.vhayu.com/<br />

[3] http://en.wikipedia.org/wiki/Louis_Bachelier<br />

[4] http://en.wikipedia.org/wiki/Robert_Brown_(botanist)<br />

[5] T. R. Copeland, J. F. Weston and K. Shastri, Financial Theory and Corporate Policy, 4th Edition, Pearson Addison Wesley, 2003.<br />

[6] J. D. Martin, S. H. Cox, R. F. McMinn and R. D. Maminn, The Theory of Finance: Evidence and Applications, International Thomson Publishing, 1997.<br />

[7] R. C. Merton, Continuous-Time Finance, Blackwell Publishers, 1992.<br />

[8] T. J. Watsham and K. Parramore, Quantitative Methods in Finance, Thomson Business Press, 1996.<br />

[9] E. Fama, The Behavior of Stock Market Prices, Journal of Business Vol. 38, 34-105, 1965.<br />

[10] P. Samuelson, Proof That Properly Anticipated Prices Fluctuate Randomly, Industrial Management Review Vol. 6, 41-49, 1965.<br />

[11] E. Fama, Efficient Capital Markets: A Review of Theory and Empirical Work, Journal of Finance Vol. 25, 383-417, 1970.<br />

[12] G. M. Burton, Efficient Market Hypothesis, The New Palgrave: A Dictionary of Economics, Vol. 2, 120-23, 1987.<br />

[13] F. Black and M. Scholes, The Pricing of Options and Corporate Liabilities, Journal of Political Economy, Vol. 81(3), 637-659, 1973.<br />

[14] http://uk.finance.yahoo.com/q/hp?s=%5EFTSE<br />

[15] B. B. Mandelbrot and J. R. Wallis, Robustness of the Rescaled Range R/S in the Measurement of Noncyclic Long Run Statistical Dependence, Water Resources Research, Vol. 5(5), 967-988, 1969.<br />

[16] B. B. Mandelbrot, Statistical Methodology for Non-periodic Cycles: From the Covariance to R/S Analysis, Annals of Economic and Social Measurement, Vol. 1(3), 259-290, 1972.<br />

[17] H. E. Hurst, A Short Account of the Nile Basin, Cairo, Government Press, 1944.<br />

[18] http://en.wikipedia.org/wiki/Elliott_wave_principle<br />

[19] http://uk.finance.yahoo.com/q/hp?s=%5EFTSE<br />

[20] http://uk.finance.yahoo.com/q/hp?s=%5EDJI<br />

[21] B. B. Mandelbrot, The Fractal Geometry of Nature, Freeman, 1983.<br />

[22] J. Feder, Fractals, Plenum Press, 1988.<br />

[23] K. J. Falconer, Fractal Geometry, Wiley, 1990.<br />

[24] P. Bak, How Nature Works, Oxford University Press, 1997.<br />

[25] N. Lam and L. De Cola, Fractals in Geography, Prentice-Hall, 1993.<br />

[26] H. O. Peitgen and D. Saupe (Eds.), The Science of Fractal Images, Springer, 1988.<br />

[27] A. J. Lichtenberg and M. A. Lieberman, Regular and Stochastic Motion: Applied Mathematical Sciences, Springer-Verlag, 1983.<br />

[28] J. J. Murphy, Intermarket Technical Analysis: Trading Strategies for the Global Stock, Bond, Commodity and Currency Market, Wiley Finance Editions, Wiley, 1991.<br />

[29] J. J. Murphy, Technical Analysis of the Futures Markets: A Comprehensive Guide to Trading Methods and Applications, New York Institute of Finance, Prentice-Hall, 1999.<br />

[30] T. R. DeMark, The New Science of Technical Analysis, Wiley, 1994.<br />

[31] J. O. Matthews, K. I. Hopcraft, E. Jakeman and G. B. Siviour, Accuracy Analysis of Measurements on a Stable Power-law Distributed Series of Events, J. Phys. A: Math. Gen. 39, 13967-13982, 2006.<br />

[32] W. H. Lee, K. I. Hopcraft, and E. Jakeman, Continuous and Discrete Stable Processes, Phys. Rev. E 77, American Physical Society, 011109, 1-4.<br />

[33] A. Einstein, On the Motion of Small Particles Suspended in Liquids at Rest Required by the Molecular-Kinetic Theory of Heat, Annalen der Physik, Vol. 17, 549-560, 1905.<br />

[34] J. M. Blackledge, G. A. Evans and P. Yardley, Analytical Solutions to Partial Differential Equations, Springer, 1999.<br />

[35] H. Hurst, Long-term Storage Capacity of Reservoirs, Transactions of American Society of Civil Engineers, Vol. 116, 770-808, 1951.<br />

[36] M. F. Shlesinger, G. M. Zaslavsky and U. Frisch (Eds.), Lévy Flights and Related Topics in Physics, Springer, 1994.<br />

[37] R. Hilfer, Foundations of Fractional Dynamics, Fractals Vol. 3(3), 549-556, 1995.<br />

[38] A. Compte, Stochastic Foundations of Fractional Dynamics, Phys. Rev. E, Vol. 53(4), 4191-4193, 1996.<br />

[39] P. M. Morse and H. Feshbach, Methods of Theoretical Physics, McGraw-Hill, 1953.<br />

[40] G. F. Roach, Green's Functions (Introductory Theory with Applications), Van Nostrand Reinhold, 1970.<br />

[41] T. F. Nonnenmacher, Fractional Integral and Differential Equations for a Class of Lévy-type Probability Densities, J. Phys. A: Math. Gen. Vol. 23, L697S-L700S, 1990.<br />

[42] R. Hilfer, Exact Solutions for a Class of Fractal Time Random Walks, Fractals, Vol. 3(1), 211-216, 1995.<br />

[43] R. Hilfer and L. Anton, Fractional Master Equations and Fractal Time Random Walks, Phys. Rev. E, Vol. 51(2), R848-R851, 1995.<br />

[44] M. J. Turner, J. M. Blackledge and P. Andrews, Fractal Geometry in Digital Imaging, Academic Press, 1997.<br />

[45] M. F. Shlesinger, G. M. Zaslavsky and U. Frisch (Eds.), Lévy Flights and Related Topics in Physics, Springer, 1994.<br />

[46] S. Abe and S. Thurner, Anomalous Diffusion in View of Einstein's 1905 Theory of Brownian Motion, Physica A(356) 403-407, Elsevier, 2005.<br />

[47] I. Lvova, Application of Statistical Fractional Methods for the Analysis of Time Series of Currency Exchange Rates, PhD Thesis, De Montfort University, 2006.<br />

[48] C. R. Rao, Linear Statistical Inference and its Applications, Wiley, 1973.<br />

[49] http://webscripts.softpedia.com/script/Scientific-Engineering-Ruby/Mathematics/Orthogonal-Linear-Regression-33745.html<br />

[50] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, ISBN: 0-12-466606-X, 1999.<br />

[51] http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6716&objectType=File<br />

[52] http://www.markit.com/en/home.page.<br />

Jonathan Blackledge graduated in physics from Imperial College in 1980. He gained a PhD in theoretical physics from London University in 1984 and was then appointed a Research Fellow of Physics at King's College, London, from 1984 to 1988, specializing in inverse problems in electromagnetism and acoustics. During this period, he worked on a number of industrial research contracts undertaking theoretical and computational research into the applications of inverse scattering theory for the analysis of signals and images. In 1988, he joined the Applied Mathematics and Computing Group at Cranfield University as Lecturer and later, as Senior Lecturer and Head of Group, where he promoted postgraduate teaching and research in applied and engineering mathematics in areas which included computer aided engineering, digital signal processing and computer graphics. While at Cranfield, he co-founded Management and Personnel Services Limited through the Cranfield Business School, which was originally established for the promotion of management consultancy working in partnership with the Chamber of Commerce. He managed the growth of the company from 1993 to 2007 to include the delivery of a range of National Vocational Qualifications, primarily through the City and Guilds London Institute, including engineering, ICT, business administration and management. In 1994, Jonathan Blackledge was appointed Professor of Applied Mathematics and Head of the Department of Mathematical Sciences at De Montfort University, where he expanded the post-graduate and research portfolio of the Department and established the Institute of Simulation Sciences. From 2002-2008 he was appointed Visiting Professor of Information and Communications Technology in the Advanced Signal Processing Research Group, Department of Electronics and Electrical Engineering at Loughborough University, England (a group which he co-founded in 2003 as part of his appointment). In 2004 he was appointed Professor Extraordinaire of Computer Science in the Department of Computer Science at the University of the Western Cape, South Africa. His principal roles at these institutes include the supervision of MSc and MPhil/PhD students and the delivery of specialist short courses for their Continuous Professional Development programmes. He currently holds the prestigious Stokes Professorship funded by the Science Foundation Ireland at Dublin Institute of Technology and is Distinguished Professor in the Centre for Advanced Studies at Warsaw University of Technology.



An Optical Machine Vision System for<br />

Applications in Cytopathology<br />

Jonathan M Blackledge, Fellow, IET and Dmitry A Dubovitskiy, Member, IET<br />

Abstract— This paper discusses a new approach to the processes<br />

of object detection, recognition and classification in a digital image, focusing on problems in cytopathology. A unique self-learning procedure is presented in order to incorporate expert knowledge. The classification method is based on the application of a set of features which includes fractal parameters such as the lacunarity and Fourier dimension. Thus, the approach includes the characterisation of an object in terms of its fractal properties and texture characteristics. The principal issues associated with object recognition are presented, which include the basic model and segmentation algorithms. The self-learning procedure for designing a decision making engine using fuzzy logic and membership function theory is also presented, and a novel technique for the creation and extraction of information from a membership function is considered. The methods discussed and the algorithms developed have a range of applications and, in this work, we focus on the engineering of a system for automating a Papanicolaou screening test.<br />

Index Terms— Computer vision, Segmentation, Object recognition,<br />

Contour detection, Edge detection, Decision making,<br />

Self-learning, Fuzzy logic, Image morphology, Cytopathology,<br />

Cervical smear analysis, Papanicolaou screening test.<br />

I. INTRODUCTION<br />

THE cervix is an important site for pathological studies, particularly in women of reproductive age. It protects the uterine cavity from intrusion of pathogenic micro-organisms, promotes the movement of spermatozoa to the ovule and holds a fetus in the uterus during pregnancy. The conventional study of cellular structures on stained glass slides for cytological reporting is a routine procedure for the early detection of pre-carcinoma conditions. Visual inspection allows an estimate to be made of the state of the cervix and a diagnosis to be developed based on the cytological pattern observed, provided an adequate specimen is available. Worldwide, approximately 471,000 women are diagnosed with invasive carcinoma of the cervix each year and of the order of 233,000 die from the disease. Although mortality from cervical cancer continues to decrease due to improved screening programmes, it remains among the most common female cancers in many countries. For example, in the United Kingdom, it is ranked eleventh for women. Sexually transmitted infection by certain strains of the human papilloma virus is the major cause of the condition.<br />

Manuscript completed in December, 2009. The work reported in this paper is supported by the Science Foundation Ireland.<br />

Jonathan Blackledge (email: jonathan.blackledge@dit.ie) is SFI (Science Foundation Ireland) Stokes Professor, School of Electrical Engineering Systems, Faculty of Engineering, Dublin Institute of Technology, Kevin Street, Dublin 8, Ireland - http://eleceng.dit.ie/blackledge. Dr Dmitry Dubovitskiy is Director of Oxford Recognition Limited (email: dda@oxreco.com).<br />

A. Papanicolaou Screening<br />

Cervical cancer is preceded by a precancerous condition called Cervical Intraepithelial Neoplasia (CIN) which can be easily treated if detected. It is therefore important to identify CINs through a Papanicolaou screening test, commonly known as a ‘PAP test’. A small sample of cells from the surface of the cervix is removed and smeared onto a glass slide and the material is fixed in alcohol. The slide is then stained and the sample(s) examined under a microscope, a search being carried out to detect abnormal cells. Examination typically involves observing the nucleus of a cell and inspecting it for characteristics that point toward abnormalities, including size, texture and colour. For example, if the nucleus is enlarged relative to the area of the cytoplasm, as shown in the example given in Figure 1, then there is a likelihood of abnormal activity within the nucleus.<br />

Fig. 1. Example of normal (left) and abnormal cell clusters (right) where, in the latter case, the nucleus to cytoplasm area ratio is enlarged.<br />
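The area comparison described above can be sketched for a cell that has already been segmented. The label values (0 for background, 1 for cytoplasm, 2 for nucleus) and the function name are hypothetical choices for illustration; the paper does not specify a mask format or a decision threshold.

```python
def nucleus_to_cytoplasm_ratio(labels, nucleus=2, cytoplasm=1):
    """Ratio of nucleus area to cytoplasm area, counted in pixels.

    `labels` is a list of rows of integer labels; the label values are
    assumptions made for this sketch.
    """
    nucleus_area = sum(row.count(nucleus) for row in labels)
    cytoplasm_area = sum(row.count(cytoplasm) for row in labels)
    if cytoplasm_area == 0:
        raise ValueError("no cytoplasm pixels in the labelled mask")
    return nucleus_area / cytoplasm_area


# Two toy masks: the second has an enlarged nucleus within the same outline.
normal_cell = [
    [0, 1, 1, 1, 0],
    [1, 1, 2, 1, 1],
    [0, 1, 1, 1, 0],
]
abnormal_cell = [
    [0, 1, 1, 1, 0],
    [1, 2, 2, 2, 1],
    [0, 1, 1, 1, 0],
]
```

A larger ratio for the second mask reflects the enlarged nucleus; any cut-off used to flag a cell as abnormal would have to come from expert data.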

Of the order of four million cervical smears are taken annually in the UK and fifty million in the USA, for example. A principal diagnostic problem is that about one fifth of the borderline preparations show the disease at an advanced stage on referral and biopsy. Overall, there is a 50% ‘failure’ rate in detecting significant disease within borderline cases. In addition, there is a 50% ‘failure’ rate in detecting significant disease within negative cases. The reasons for this range from the extraction of a sample and the preparation of the slide to, most of all, the sequential reading of a slide in the diagnostic laboratory, when human error occurs.<br />

In current practice worldwide, a diagnosis is performed manually. It typically takes 8-10 minutes for a cytopathologist to screen a slide, which involves up to 300 movements of a microscope over the slide. This approach not only takes time but inevitably leads to outcomes in which it is not possible to guarantee consistent and accurate results; many borderline results are generated, for example. It is therefore of significant



value if accurate image analysis and object recognition techniques can be developed in an attempt to automate the process and produce a system that provides a reliable, consistent and quantitative estimation of CINs and other abnormalities to improve upon the subjective assessments of a cytopathologist.<br />

A typical screening session involves a cytopathologist analysing a slide under the microscope with a magnification of up to 400x. The output is related to the number of slides and working hours per cytologist, and an increase in either reduces the speed and reliability of the results. Telecytology [5] provides a large number of digital images for consideration, which can lead to increased human error. Moreover, in telecytology the cytopathologist is not usually able to examine cellular details or to change the focal plane of the image. In virtual microscopy a digital image of the entire slide is generated and consequently the image file can become very large (∼4-7Gb). Another problem with virtual microscopy is that the focal plane limits the representation of the specimen. Virtual microscopy is used for proficiency tests and there are a number of commercially available medical imaging assistant tools [11], [12], [13]. However, a cytopathologist is still an important factor in the ‘diagnostic cycle’. Furthermore, due to compression and/or differences in the focal depth, many images may not provide a clear enough representation of a cell in comparison to those obtained using conventional microscopy. Thus, the development of automated recognition and classification systems provides the potential for introducing quality control in national screening procedures.<br />

B. Image Analysis and Pattern Recognition<br />

Conventional microscopy, as applied to cytopathology, involves the use of image processing methods that are often designed in an attempt to provide a machine interpretation of an image, ideally in a form that allows some decision criterion to be applied, such that a pattern and/or object can be recognised [1], [2]. Pattern recognition uses a range of different approaches that are not necessarily based on any one particular theme or unified theoretical approach. The main problem is that, to date, there is no complete theoretical model for simulating the processes that take place when a human interprets an image generated by the eye, i.e. there is no fully compatible model, currently available, for explaining the processes of visual image comprehension. Hence, machine or computer vision remains a rather elusive subject area in which automatic inspection systems are advanced without having a fully operational theoretical framework as a guide. Nevertheless, numerous algorithms for interpreting two- and three-dimensional objects in a digital image have been, and continue to be, researched in order to design systems that can provide reliable automatic object detection and recognition in an independent environment, e.g. [3], [4], [14], [16], [25].<br />

Vision can be thought of as a process of linking parts of the visual field (objects) with stored information or templates about their significance for the observer. There are a number of questions concerning vision, such as: (i) what are the goals and constraints? (ii) what type of algorithm or set of algorithms is required to effect vision? (iii) what are the implications for the process given the types of hardware that might be available? (iv) what are the levels of representation required to achieve vision? The levels of representation are dependent on what type of segmentation can and/or should be applied to an image. For example, we may be able to produce a primal sketch from an image via some measure of the intensity changes in a scene, which are recorded as place tokens and stored in a database. This allows sets of raw components to be generated, e.g. regions of pixels with similar intensity values or sets of lines obtained by isolating the edges of an image scene, computed by locating regions where there is a significant difference in the intensity. However, such sets are subject to inherent ambiguities when computed from a given input image and associated with those from which an existing database has been constructed. Such ambiguities can only be overcome by the application of high-level rules, based on how humans interpret images, but the nature of this interpretation is not always clear. Nevertheless, parts of an image will tend to have an association if they share size, colour, figural similarity, continuity, shading and texture, for example. For this purpose, we are required to consider how best to segment an image and what form this segmentation should take.<br />

The identification of the edges of an object in an image scene is an important aspect of the human visual system because it provides information on the basic topology of the object from which an interpretative match can be achieved. In other words, the segmentation of an image into a complex of edges is a useful pre-requisite for object identification. However, although many low-level processing methods can be applied for this purpose, the problem is to decide which object boundary each pixel in an image falls within and which high-level constraints are necessary. Thus, in many cases, a principal question is: which comes first, recognition or segmentation?<br />

Computer vision (which incorporates machine vision) is more than automated image processing. It results in a conclusion, based on a machine performing an inspection of its own. The machine must be programmed to be sensitive to the same aspects of the visual field that humans find meaningful. Segmentation is concerned with the process of dividing an image into meaningful regions or segments. It is used in image analysis to separate features or regions of a pre-determined type from the background; it is the first step in automatic image analysis and pattern recognition. Segmentation is broadly based on one of two properties in an image: (i) similarity; (ii) discontinuity. The first property is used to segment an image into regions which have grey (or colour) levels within a predetermined range. The second property segments the image into regions of discontinuity where there is a more or less abrupt change in the values of the grey (or colour) levels.<br />
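The two properties can be illustrated on a small grey-level array. This is a minimal sketch under our own assumptions (a fixed grey-level range for similarity and a fixed neighbour-difference threshold for discontinuity); it is not the contour tracing method developed in this paper, and all names and threshold values are hypothetical.

```python
def segment_by_similarity(image, lo, hi):
    """Mark pixels whose grey level lies within the range [lo, hi]."""
    return [[1 if lo <= v <= hi else 0 for v in row] for row in image]


def segment_by_discontinuity(image, jump):
    """Mark pixels where the grey level changes by at least `jump`
    relative to the right-hand or lower neighbour."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            right = abs(image[m][n + 1] - image[m][n]) if n + 1 < cols else 0
            down = abs(image[m + 1][n] - image[m][n]) if m + 1 < rows else 0
            if max(right, down) >= jump:
                edges[m][n] = 1
    return edges


# A toy image: a dark region on the left and a bright region on the right.
toy_image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
```

On the toy image, the similarity rule with the range 150 to 255 selects the bright region as a whole, whereas the discontinuity rule with a jump of 100 marks only the column at which the grey level changes abruptly.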

In this paper, we consider an approach to object detection in an image that is based on a new segmentation (edge detection) algorithm built on a Contour Tracing Algorithm and a space-oriented filter [6]. The image usually requires enhancing before it is processed and, for this purpose, a novel self-adjusting sharpening filter has been developed as discussed in this paper. The segmented object is then analysed in terms of metrics derived from both a Euclidean and fractal geometric perspective, the



output fields being used to train a fuzzy inference engine and the recognition structure being based on some of the methods reported in [15], for example. The approach considered is generic in that it can, in principle, be applied to any type of imaging modality. There are numerous applications of this technique, especially when self-calibration and learning are mandatory. Example applications may include remote sensing, non-destructive evaluation and testing and many other applications which specifically require the classification of objects that are textural. However, in this paper we focus on one particular application, namely, the diagnosis of cervical cancer based on standard Papanicolaou screening test images.<br />

II. OBJECT RECOGNITION ARCHITECTURE<br />

Suppose we have an image which is given by a function $f(x, y)$ and contains some object described by a set $S = \{s_1, s_2, \ldots, s_n\}$. We consider the case when it is necessary to define a sample which is somewhat ‘close’ to this object. This task can be reduced to the construction of some function determining a degree of proximity of the object to a sample, i.e. a template of the object. Recognition is the process of comparing individual features against some pre-established template subject to a set of conditions and tolerances. The process of recognition commonly takes place in four definable stages: (i) image acquisition and filtering (as required for the removal of noise, for example); (ii) object location (which may include edge detection); (iii) measurement of object parameters; (iv) object class estimation. We now consider the common aspects of each step. In particular, we consider details of the design features and their implementation together with their advantages, disadvantages and proposals for a solution whose application, in this paper, focuses on problems in cytopathology.<br />
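One simple way to realise a ‘degree of proximity’ between an object and a template is sketched below. The scaled Euclidean distance, the mapping 1/(1 + d) and the per-feature tolerance check are our own illustrative assumptions; the paper does not prescribe a particular proximity function.

```python
import math


def proximity(features, template, scales):
    """Degree of proximity in (0, 1]: 1 for an exact match, decreasing
    with the scaled Euclidean distance to the template.

    `scales` sets the unit of each feature; its values are assumptions.
    """
    d2 = sum(((s - t) / w) ** 2 for s, t, w in zip(features, template, scales))
    return 1.0 / (1.0 + math.sqrt(d2))


def within_tolerances(features, template, tolerances):
    """True if every feature lies within its tolerance of the template value."""
    return all(abs(s - t) <= tol
               for s, t, tol in zip(features, template, tolerances))
```

Here `features` plays the role of the set S and `template` the pre-established sample; recognition then amounts to selecting the template with the highest proximity among those that satisfy the tolerances.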

Image acquisition depends on the technology that is best suited for integration with a particular application. For pattern recognition in cytopathology, for example, high fidelity digital images are required for image analysis whose resolution is, at least, compatible with that of the image acquisition equipment used for human inspection. For cytopathology this involves optical microscopy and, for the application considered in this work, the microscope is equipped with a digital camera. The colour images generated, examples of which are presented in this paper, are, in general, relatively noise free and are digitised using a standard CCD camera. Nevertheless, it is important that good quality images are obtained that are homogeneous with regard to brightness and contrast, for example. Unless consistently high quality images can be generated that are compatible with the sample images used to design a given computer vision system, that same system can be severely compromised.<br />

The system discussed in this paper is based on an object detection technique that includes a novel segmentation method and must be adjusted or ‘fine tuned’ for each area of application. The necessary features associated with the ‘object’ must be computed for a particular area of application. In the work reported here, this includes objects for which fractal models are well suited [23], [1], [2]. The system provides an output (i.e. a decision) using a knowledge database and outputs a result by subscribing different objects. The ‘expert data’ in the application field creates the knowledge database by using a supervised training system with a number of model objects [18]. The recognition process is illustrated in Figure 2, a process that includes the following steps:<br />

Fig. 2. Recognition processes: (1) image acquisition, giving the digital image $\{f_{m,n}\}$; (2) special transform, giving the transformed image $\{\tilde{f}_{m,n}\}$; (3) segmentation, giving the object images $\{f^1_{m,n}\}, \{f^2_{m,n}\}, \ldots$; (4) feature detection, giving the feature vectors $\{x^1_k\}, \{x^2_k\}, \ldots$; (5) decision making, giving the class probability vectors $\{p^1_j\}, \{p^2_j\}, \ldots$; (6) reporting.<br />

1) Image Acquisition <strong>and</strong> Filtering<br />

A physical object is digitally imaged <strong>and</strong> the data<br />

transferred to memory using current image acquisition<br />

hardware available commercially. The image is filtered<br />

to reduce noise <strong>and</strong> to remove unnecessary features such<br />

as light flecks.<br />

2) Special Transform: Edge Detection<br />

The digital image function fm,n is transformed into<br />

˜fm,n to identify regions of interest <strong>and</strong> provide an<br />

input dataset for the segmentation <strong>and</strong> feature detection<br />

operations [17]. This transform avoids the use of edge<br />

detection filters which have proved to be highly unreliable<br />

in the present application.<br />

3) Segmentation<br />

The image {fm,n} is segmented into individual objects<br />

{f 1 m,n}, {f 2 m,n}, . . . to perform a separate analysis<br />

of each region. This step includes such operations as<br />

thresholding, morphological analysis <strong>and</strong> contour tracing<br />

using the convex hull method developed in [6].<br />

4) Feature Detection<br />

Feature vectors $\{x^1_k\}, \{x^2_k\}, \ldots$ are computed from the<br />

object images $\{f^1_{m,n}\}, \{f^2_{m,n}\}, \ldots$ and the corresponding<br />

transformed images $\{\tilde{f}^1_{m,n}\}, \{\tilde{f}^2_{m,n}\}, \ldots$. The features are numeric parameters<br />

that characterize the object inclusive of its texture.



The feature vectors computed consist of a number of Euclidean<br />

<strong>and</strong> fractal geometric parameters together with<br />

statistical measures in both one- <strong>and</strong> two-dimensions.<br />

The one-dimensional features correspond to the border<br />

of an object whereas the two-dimensional features relate<br />

to the surface within <strong>and</strong>/or around the object.<br />

5) Decision Making<br />

This involves assigning a probability to a predefined<br />

set of classes [21]. Probability theory <strong>and</strong> fuzzy logic<br />

[19] are applied to estimate the class probability vectors<br />

$\{p^1_j\}, \{p^2_j\}, \ldots$ from the object feature vectors<br />

$\{x^1_k\}, \{x^2_k\}, \ldots$. A fundamental problem is to establish<br />

a quantitative relationship between features <strong>and</strong> class<br />

probabilities, i.e.<br />

$$\{p_j\} \leftrightarrow \{x_k\}.$$

A ‘decision’ is the estimated class of the object coupled<br />

with a probabilistic accuracy [20].<br />

The application considered in this paper is based on algorithms<br />

that have been designed to solve problems associated<br />

with the above steps. Details are given in [6], which<br />

provides algorithms for threshold selection and a contour<br />

tracing algorithm using the ‘convex hull’ property. However,<br />

the application considered here requires some additional algorithms<br />

to solve the object recognition problem associated with<br />

cytopathology. This is because edge detection is particularly<br />

difficult to solve for images consisting of many cells <strong>and</strong><br />

a special space-oriented filter has therefore been designed<br />

to extract parameters associated with the spatial distribution<br />

of object borders. This includes a self-adjustable filter for<br />

enhanced object sharpness that has been considered as an<br />

inter-medium mask filter in order to clarify a cellular border.<br />

For characterisation, the line of objects found using the steps<br />

described above needs to be considered in terms of its major<br />

properties.<br />

With regard to the design of a decision making engine,<br />

the approach proposed is based on establishing an expert<br />

learning procedure in which a Knowledge Data Base (KDB)<br />

is constructed based on answers that an expert makes during<br />

a manual mode. Once the KDB has been developed, the<br />

system is ready for application in the field <strong>and</strong> provides results<br />

automatically. However, the accuracy and robustness of the<br />

output depend critically on the extent and completeness<br />

of the KDB as well as the quality of the input image, primarily<br />

in terms of its compatibility with those images that have been<br />

used to generate the KDB. The algorithm discussed in Section<br />

IV has no analogy with previous contour tracing algorithms<br />

<strong>and</strong> has been designed to trace a contour of an object with<br />

any level of complexity to produce an output that consists<br />

of a consecutive list of coordinates of an object’s edge. The<br />

algorithm is optimised in terms of computational efficiency<br />

<strong>and</strong> can be realised in a compact form suitable for hardware<br />

implementation.<br />

III. REGION OF INTEREST SEGMENTATION<br />

For applications in cytopathology, a fundamental requirement<br />

is to select Regions of Interest (ROI) for detailed review.<br />

The ROI is not taken to be the object itself but its local<br />

boundary. This approach improves the efficiency associated<br />

with the process of recognition, a process that is recursive <strong>and</strong><br />

involves different settings required to evaluate the probability<br />

of the presence of a cell in the image. The algorithm used<br />

for ROI segmentation is based on adaptive thresholding <strong>and</strong><br />

morphological analysis. The adaptive image threshold is given<br />

by<br />

$$T_x = \frac{1}{2}\left[\min_y\left(\max_x f(x,y)\right) - \left\langle \max_x f(x,y) \right\rangle_y\right] + \left\langle \max_x f(x,y) \right\rangle_y,$$

$$T_y = \frac{1}{2}\left[\min_x\left(\max_y f(x,y)\right) - \left\langle \max_y f(x,y) \right\rangle_x\right] + \left\langle \max_y f(x,y) \right\rangle_x,$$

$$T = \begin{cases} T_x, & T_x \geq T_y, \\ T_y, & \text{otherwise}, \end{cases}$$

where 〈·〉x <strong>and</strong> 〈·〉y are the means within column x <strong>and</strong> row y,<br />

respectively. This approach provides a solution for extracting<br />

the most significant features in the image, in this case, the<br />

nucleus of cells. If these objects cover an extensive area of the<br />

image, then this ‘filter’ provides the fastest compact solution.<br />

An example of the output generated by this algorithm is shown<br />

in Figure 3. In order to obtain a clear boundary, morphological<br />

analysis is applied to select objects with a predefined area. This<br />

is discussed in the following section.<br />
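As an illustration only, the adaptive threshold defined above can be sketched in a few lines of Python. The function name `adaptive_threshold` and the indexing convention `f[y][x]` (a list of rows) are assumptions made for this sketch and are not taken from the original system.

```python
def adaptive_threshold(f):
    """Return (T_x, T_y, T) for a greyscale image f given as a list of rows.

    Assumption of this sketch: f[y][x] is the intensity at column x, row y.
    """
    def half_way(maxima):
        mean = sum(maxima) / len(maxima)
        # 1/2 * (minimum of the maxima - mean of the maxima) + mean of the maxima
        return 0.5 * (min(maxima) - mean) + mean

    row_maxima = [max(row) for row in f]        # max over x, one value per y
    col_maxima = [max(col) for col in zip(*f)]  # max over y, one value per x
    t_x = half_way(row_maxima)
    t_y = half_way(col_maxima)
    # T is T_x when T_x >= T_y and T_y otherwise
    return t_x, t_y, (t_x if t_x >= t_y else t_y)
```

For the 2×2 image `[[0, 10], [0, 2]]` the row maxima are 10 and 2 and the column maxima are 0 and 10, so the sketch returns `T_x = 4.0`, `T_y = 2.5` and `T = 4.0`.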

Fig. 3. Example of ROI segmentation where + points to the location in the<br />

image where there is a cell.<br />

IV. SPACE ORIENTED FILTER DESIGN FOR EDGE<br />

DETECTION<br />

Edge detection is used to identify the edges in an image<br />

which are those areas that correspond to object boundaries.<br />

To find these edges, an algorithm is designed that looks for<br />

places in the image where the intensity changes rapidly; this<br />

is typically based on using one of two principal criteria:<br />



(i) areas where the first derivative of the intensity is larger<br />

in magnitude than some threshold;<br />

(ii) regions where the second derivative of the intensity has<br />

a zero crossing.<br />

There are many st<strong>and</strong>ard digital filters available for this<br />

process. Taking into account that in many images, high frequency<br />

noise (white noise) is usually present, we consider an<br />

appropriate adaptive filtering strategy.<br />

A. Noise Reduction by Adaptive Wiener Filtering<br />

Edge detection methods typically require an effective noise<br />

reduction algorithm, which should<br />

be applied adaptively. A well known adaptive filter is the<br />

Wiener filter which can be applied to an image adaptively,<br />

tailoring itself to the local image variance. When the variance<br />

is large, the Wiener filter performs little smoothing; when<br />

the variance is small, it performs more smoothing. This<br />

approach often produces better results than linear filtering. The<br />

adaptive filter is more selective than a comparable linear filter,<br />

preserving edges <strong>and</strong> other high frequency parts of an image.<br />

Although the Wiener filter requires greater computational time<br />

than linear filtering, it performs better when the noise is<br />

constant-power or ’white’ additive noise, such as Gaussian<br />

noise which is one of the conditions required to simplify the<br />

result of applying a least squares criterion.<br />

The Wiener filter algorithm uses a pixel-wise adaptive filtering<br />

procedure with neighborhoods of size m-by-n to estimate<br />

the local image mean <strong>and</strong> st<strong>and</strong>ard deviation. It estimates the<br />

local mean <strong>and</strong> variance around each pixel given respectively<br />

by<br />

$$\mu = \frac{1}{nm} \sum_{r,c \in \eta} I_s(r,c)$$

(the mean of the brightness of the image) and<br />

$$\sigma^2 = \frac{1}{nm} \sum_{r,c \in \eta} \left( I_s^2(r,c) - \mu^2 \right)$$

(the dispersion),<br />

where the sum is taken over the n-by-m local neighborhood<br />

of each pixel in the image I. The algorithm then creates a<br />

pixel-wise Wiener filter using the following estimates<br />

$$I_D(r,c) = \mu + \frac{\sigma^2 - \nu^2}{\sigma^2}\left(I_s(r,c) - \mu\right)$$

where $\nu^2$ is the noise variance. If the noise variance is not<br />

given, the filter uses the average of all the local estimated<br />

variances. In this work, the Wiener filter is used as a first step<br />

to processing the image prior to applying a space oriented edge<br />

detection filter in order to provide an image that is optimal with<br />

regard to solving the edge detection problem for applications<br />

in cytopathology. Example results are shown in Figures 4<br />

<strong>and</strong> 5. Figure 4 shows the original image <strong>and</strong> Figure 5 is<br />

the result of applying the Wiener filter described above.<br />
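The pixel-wise estimate can be sketched as follows. This is a minimal pure-Python illustration rather than the implementation used in this work: the names `local_stats` and `wiener`, the square window of half-width `half`, the truncation of the window at the image border and the clamping of the gain at zero are all assumptions of the sketch.

```python
def local_stats(img, r, c, half):
    """Local mean and variance in the window centred on (r, c).

    Assumption: the window is truncated at the image border.
    """
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - half), min(rows, r + half + 1))
            for j in range(max(0, c - half), min(cols, c + half + 1))]
    mu = sum(vals) / len(vals)
    var = sum(v * v for v in vals) / len(vals) - mu * mu
    return mu, max(var, 0.0)  # guard against tiny negative rounding errors


def wiener(img, half=1, noise_var=None):
    """Pixel-wise estimate mu + (var - noise_var) / var * (I - mu)."""
    rows, cols = len(img), len(img[0])
    stats = [[local_stats(img, r, c, half) for c in range(cols)]
             for r in range(rows)]
    if noise_var is None:
        # noise variance not given: use the average of the local variances
        noise_var = sum(s[1] for row in stats for s in row) / (rows * cols)
    out = []
    for r in range(rows):
        line = []
        for c in range(cols):
            mu, var = stats[r][c]
            # assumption: the gain is clamped at zero where var <= noise_var
            gain = max(var - noise_var, 0.0) / var if var > 0 else 0.0
            line.append(mu + gain * (img[r][c] - mu))
        out.append(line)
    return out
```

With `noise_var=0` the gain is one wherever the local variance is non-zero and the image is returned unchanged; a large local variance relative to the noise therefore gives little smoothing, as described above.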

B. Edge Detection<br />

Edge detection methods are based on a number of derivative<br />

estimators. For some of these estimators, it is possible to<br />

specify whether the operation should be sensitive to horizontal<br />

or vertical edges, or both. In each case, the aim is to return<br />

a binary image - an array containing elements which are<br />

either 0 or 1 where 1 represents an element of an edge and 0<br />

represents an empty edge space. Moreover, within the context<br />

of the overall approach, it is assumed that different edge<br />

detectors will yield minimal differences. In this application<br />

a Canny filter [8] is used to provide a first estimate of the<br />

edge boundaries of a cell nucleus.<br />

Fig. 4. Original image of a cell cluster obtained from a cervical smear after<br />

staining.<br />

Fig. 5. Adaptive Wiener filtered image.<br />

The Canny edge detector is based on a functional analysis<br />

to derive an optimal function for edge detection, starting<br />

with three optimisation criteria, namely, good detection, good<br />

localization, <strong>and</strong> only one response per edge under white noise<br />

conditions. The 1D ‘Canny function’ is accurately approximated<br />

by the derivative of a Gaussian function which is then<br />

combined with a Gaussian of identical st<strong>and</strong>ard deviation in<br />

the perpendicular direction, truncated at 0.001 of its peak<br />

value, and split into suitable masks. Underlying this method is<br />

the idea of locating edges at the local maxima of the gradient<br />

magnitudes of a Gaussian-smoothed image. In addition, the<br />

Canny implementation employs a hysteresis operation on edge<br />

magnitude in order to make edges reasonably connected.<br />

Finally, a multiple-scale method is employed to analyse the



output of the edge detector.<br />

Fig. 6. Application of a Canny Filter to Figure 5.<br />

An example of applying a Canny filter to Figure 5 is<br />

given in Figure 6. This result illustrates that it<br />

is typically not possible to tell uniquely where the edge of a cell or<br />

nucleus occurs, especially where one edge<br />

connects with another gradient, where Canny edge detection<br />

introduces errors. For this reason, it is necessary to design a<br />

new filter, which is discussed in the following section.<br />

C. Space Oriented Filtering<br />

In some cases, the nuclei of the cells in a cervical smear<br />

can appear very close together, or be in touch with a foreign<br />

object such as a bacterium. In this case, an extra filter must be<br />

used to obtain a contour boundary. For this purpose, a space-oriented<br />

filter for the detection of ‘holes’ has been developed.<br />

The nuclei represent a ‘hole’ if the image is visualised in terms<br />

of a surface in which the nuclei are regions of lower intensity.<br />

The filter has been designed to take account of the following:<br />

(i) objects should be of a quasi-spherical form; (ii) the search<br />

space should include objects with lower intensity (i.e. which<br />

have a darker colour); (iii) it is necessary to find only the<br />

surface of a cell without a hysteresis zone. An example of a<br />

profile that is characteristic of a nucleus is given in Figure 7.<br />

The same principle can of course be used for other objects.<br />

The solution to this problem is embodied in the algorithm<br />

that is now described, the basic procedure being illustrated in<br />

Figure 8. To start with, we estimate the brightness of the<br />

central area (using a window of 9×9 pixels) and a circle (a<br />

layer consisting of 2 pixels). If the center is dark, we suppose<br />

that it is part of a nucleus and compare the intensity along the<br />

white line in Figure 8 with the central zone. If the profile along<br />

this line has a maximum and minimum gradient, we consider<br />

the angle between them. If the angle lies in the range 79° to<br />

248° then we assume that we are near to the border<br />

of a nucleus. This angle can be estimated automatically or<br />

established as a constant and ‘hard-wired’ into the algorithm.<br />

The next step is to apply the hole detection method (red<br />

<strong>and</strong> brown lines in Figure 8). This hole detection algorithm<br />

is extended in a procedure to decide whether the area under<br />

investigation is a nucleus or otherwise. In Figure 8, the<br />

maximum length of the brown line is approximately 70 pixels<br />

(which depends on the image resolution) and can be chosen<br />

automatically. A useful procedure is to check the direction<br />

toward the center of a nucleus but this is application dependent.<br />

If, for a period, there is no hole, then the present position is<br />

ignored. If the test for detecting a hole gives a positive result,<br />

as in an index figure, the line from the center of a hole up to<br />

the border of a hysteresis is drawn.<br />

Fig. 7. Example intensity profile of a Nucleus.<br />

Fig. 8. Mask used for space-oriented filtering.<br />

In the central part of the image (Figure 5) one can see 5<br />

joint kernels. To automatically find<br />

the edges between all of these nuclei requires a special algorithm<br />

for object separation. The sequence of steps associated<br />

with the algorithm designed for this purpose can be divided<br />

into the following list:<br />

(i) estimation of the edge;<br />

(ii) search the boundaries of the cell;<br />

(iii) calculate the direction to the center of a core;<br />

(iv) search the opposite edge of the core;<br />

(v) calculate the centers of the kernels;<br />

(vi) save the index map of the figure.<br />

Estimation of edge expectation<br />

Pre-processing can be used to form part of the estimated<br />

performance for edge expectation. This allows for accelerated



scanning of the image. For this purpose a structure estimation<br />

operator is applied at the central part of the mask as shown in<br />

Figure 8. This selects only those nuclei of interest <strong>and</strong> avoids<br />

spending computer time processing other parts of the image.<br />

Searching the boundaries of the cell (Step 1)<br />

The ring around the central part of a mask (Figure 8) is<br />

decomposed using the operator<br />

$$R = [x_1, x_2, \ldots, x_n].$$

In the following analysis we evaluate the gradient sequence<br />

$$g_{1 \ldots n} = \frac{dR}{dn}.$$

Upon demarcation of a core <strong>and</strong> after the derivation, the<br />

gradient window will contain two maxima - positive <strong>and</strong><br />

negative. The polar angle then gives the direction of the<br />

nuclear center θ1.<br />
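One possible reading of this step is sketched below, assuming the ring is sampled at n equally spaced polar angles. The function names, the circular forward difference used for the gradient, and the choice of the middle of the dark arc as the direction estimate are assumptions of this sketch; only the 79° to 248° range is taken from the description of the mask above.

```python
def ring_extrema(ring):
    """Angles (degrees) of the largest positive and negative gradient on a ring.

    ring: intensities x_1..x_n sampled at equally spaced polar angles
    (an assumption of this sketch).
    Returns (angle_pos, angle_neg, separation, theta_1).
    """
    n = len(ring)
    # circular forward difference as a discrete stand-in for dR/dn
    grad = [ring[(i + 1) % n] - ring[i] for i in range(n)]
    i_pos = max(range(n), key=lambda i: grad[i])
    i_neg = min(range(n), key=lambda i: grad[i])
    step = 360.0 / n
    # gradient sample i sits half-way between ring samples i and i + 1
    angle_pos = (i_pos + 0.5) * step
    angle_neg = (i_neg + 0.5) * step
    # angular width of the dark arc, from the falling to the rising edge
    separation = (angle_pos - angle_neg) % 360.0
    # assumption: the nucleus lies toward the middle of the dark arc
    theta_1 = (angle_neg + separation / 2.0) % 360.0
    return angle_pos, angle_neg, separation, theta_1


def near_border(separation, low=79.0, high=248.0):
    """Angle test quoted in the text for being near the border of a nucleus."""
    return low <= separation <= high
```

For a ring of 36 samples that is bright except for a dark arc covering samples 9 to 17, the two extrema are found at 85° and 175°, the separation is 90° and the estimated direction is 130°.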

Calculation of the direction of the center (Step 2)<br />

In this step, the expected direction to the center is updated<br />

by means of a check on the position of the angle on a plane<br />

between the maxima obtained in the previous step. In general,<br />

for the purpose of recognition, a point on the binary map uses<br />

a convolution technique with a series of masks for searching<br />

the exact point on the object edge. The sequence of masks<br />

used is as follows:<br />

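$$M = \left\{
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\right\}$$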

The appropriate mask is applied in the direction of a local<br />

gradient rate and gives a maximal convolution between both<br />

the points obtained from the previous step. From the definition<br />

of the angle $\theta_2$, utilizing the a priori results, we form the mean<br />

$$\theta = \frac{\theta_1 + \theta_2}{2}.$$

The logical conformity of the mask <strong>and</strong> adjacent points of the<br />

binary map is further evaluated <strong>and</strong> the binary representation<br />

of object is determined via<br />

$$I_B(r,c) = \begin{cases} 1, & \text{if } M \notin I_g; \\ 0, & \text{if } M \in I_g. \end{cases}$$

The profile information (gradient and amplitude) is memorized<br />

for Step 3 (discussed below). The dimensions of $I_B(r,c)$<br />

correspond to those of the starting map $I_g(r,c)$.<br />
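The comparison of the six masks with the binary map can be illustrated as follows. This is a sketch of one possible reading only: the ‘logical conformity’ is taken here to be the number of cells in which a mask and the 3×3 neighbourhood of a point agree, and the names `MASKS` and `best_mask` are hypothetical.

```python
# The six 3x3 masks of the sequence M, in the order given in the text.
MASKS = [
    [[0, 0, 0], [0, 1, 0], [0, 1, 0]],
    [[0, 0, 0], [0, 1, 0], [1, 1, 0]],
    [[0, 0, 0], [0, 1, 0], [0, 1, 1]],
    [[0, 0, 0], [0, 1, 0], [1, 1, 1]],
    [[0, 0, 0], [0, 1, 1], [1, 1, 1]],
    [[0, 0, 0], [1, 1, 0], [1, 1, 1]],
]


def best_mask(binary_map, r, c):
    """Return (index, score) of the mask that best fits the point (r, c).

    Assumptions of this sketch: (r, c) is an interior point of binary_map and
    the score is the number of agreeing cells (9 means an exact match).
    """
    best_index, best_score = -1, -1
    for index, mask in enumerate(MASKS):
        score = sum(1
                    for i in range(3)
                    for j in range(3)
                    if mask[i][j] == binary_map[r - 1 + i][c - 1 + j])
        if score > best_score:
            best_index, best_score = index, score
    return best_index, best_score
```

A neighbourhood identical to one of the masks gives a score of 9 for that mask and a lower score for the other five, since the masks are mutually distinct.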

Search for the opposite edge of a core (Step 3)<br />

The opposite gradient is searched for by finding the centre<br />

of a nucleus together with the gradient on the opposite end,<br />

which serves as a final confirmation for the coordinates of<br />

the object. In Figure 9 these lines are illustrated in brown. The<br />

opposite profile has to have the same properties as at Step<br />

2. This prevents any wrong detection through irregularities in<br />

the image. If the opposite profile is found, then a green line<br />

is ‘painted’ on the index binary image from the center to the<br />

boundary of the nucleus as in Figure 10.<br />

Fig. 9. Mask of the space-oriented filter with an image.<br />

Fig. 10. Result of applying the space oriented filter to an image.<br />

Calculation of the center of a kernel<br />

The centre calculation algorithm is based on the weighted<br />

mean from the total number of bars detected in the previous<br />

steps - Figure 3. The calculation depends on the kind of<br />

implementation used to design the processing engine. If the<br />

calculations are implemented in a programmed logic, the data<br />

are better stored in an index space. For a PC, the data are<br />

stored as an array of coordinates.<br />

Saving the index map (Figure 11)<br />

After application of the algorithm, a connected area can<br />

be detected which serves as an index for further processing.<br />

An example of an index image is given in Figure 11<br />

which includes the application of erosion <strong>and</strong> dilation for the<br />

subdivision of closely located objects.<br />

V. TWO DIMENSIONAL ALGORITHM FOR IMAGE<br />

SHARPENING<br />

In this section, we consider the procedures necessary during<br />

object recognition. These procedures are adaptive <strong>and</strong> are not<br />

bound to a particular range of applications.



Fig. 11. Segmentation of nuclei (Index Image).<br />

A. Self-adjustable filter for enhanced object sharpness<br />

The task of edge searching of an object in an image is<br />

a part of the process of object recognition. In the case of<br />

an image with no preliminary information on the quantity of<br />

the points on each edge, resolution or particular boundary,<br />

it is possible to convert the data into an auxiliary map with<br />

an increased contrast range. With existing algorithms image<br />

contrast enhancement does not provide sufficient fidelity to<br />

cope with unknown levels of difference between objects.<br />

Typically, noise appears causing an increase in the level of<br />

transformation parameters <strong>and</strong> at a low level there is poor<br />

detection of an object’s edge.<br />

An image I is represented in a computer memory in terms<br />

of an array r × c of points <strong>and</strong> the value of a particular<br />

point is determined as I(r, c). One of the approaches to applying<br />

a filter or transformation to two-dimensional information<br />

representation is in terms of a sequence of masks M over<br />

m × n points <strong>and</strong> the subsequent calculation of a value for a<br />

central pixel depending on its environment.<br />

Fig. 12. Cytology cells - Mild dyskaryosis.<br />

We now consider an algorithm for calculating the value of a central point in<br />

a moving window M with m × n points. The algorithm is<br />

applied sequentially and not recursively to all points of an<br />

image. For example, consider the image given in Figure 12.<br />

The characteristic property of the given image is that during<br />

preparation of a sample, a cell can be fixed at a given angle <strong>and</strong><br />

consequently, it can have a different gradient rate on different<br />

boundaries. The mask sizes m <strong>and</strong> n are selected according to<br />

the proportional sizes of the object to the image. The method<br />

comprises the following stages:<br />

1. The first step is to sort the array M[m × n] in<br />

terms of increasing values. The result of applying this<br />

operation gives information represented in terms of<br />

a one-dimensional array S[i] as illustrated in Figure 13.<br />

Fig. 13. Profile obtained by sorting an image into an array of increasing<br />

pixel values.<br />

2. We define the index $i_{\max}$ as the point of S[i] with the greatest value<br />

of the gradient rate, $S_{i_{\max}}$. If the<br />

maximal gradient rate is such that the given position of<br />

the window M does not correspond to a boundary of<br />

the object, it is then possible to apply general filtering<br />

methods, e.g. to calculate the average value or to take<br />

the value of a point with a predetermined index, and<br />

assign this value to the central point. For example, in<br />

Figure 13 $S_{i_{\max}}$ is the point shown by the red arrow.<br />

3. We determine in which part of the sorted array S[i] from<br />

mask M the value of the original central<br />

mask point Ic(r, c) lies. For example, in Figure 13, this<br />

is indicated by the green arrow. We denote this part of<br />

the array by Sc[i] (see Figure 13).<br />

4. We take the parameter established by the user which<br />

sets a factor for boundary extraction - in percentage<br />

terms, 50% for example - and then define the value of<br />

point Scr[i] of the array Sc[i], counted from the beginning of<br />

the array. This value is the resultant solution Ic(r, c) =<br />

Scr[i] displayed by the cyan arrow in Figure 13.<br />

An example result of applying this procedure is shown in<br />

Figure 14. Application of this filter allows us to observe very<br />

precisely the evolution of cell boundaries during the operation<br />

of the object recognition system.
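The four stages can be sketched in Python as follows. This is one possible reading of the procedure and not the implementation used in this work: the names `sharpen_pixel` and `sharpen`, the use of the largest jump between consecutive sorted values as the ‘gradient rate’, the `min_jump` test for deciding that a window does not straddle a boundary, and the fall-back to the window mean are all assumptions.

```python
def sharpen_pixel(window, centre, fraction=0.5, min_jump=0):
    """New value for the central point of one window position.

    window:   the m*n values under the mask M
    centre:   original value Ic(r, c) of the central point
    fraction: user parameter of stage 4 (0.5 corresponds to 50%)
    min_jump: assumed test; a largest jump not above this value is taken to
              mean that the window does not lie on a boundary
    """
    s = sorted(window)                                    # stage 1: S[i]
    jumps = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    if not jumps or max(jumps) <= min_jump:
        return sum(s) / len(s)                            # general filter: mean
    i_max = jumps.index(max(jumps))                       # stage 2: largest gradient
    # stage 3: the part of S[i] that holds the original central value
    part = s[:i_max + 1] if centre <= s[i_max] else s[i_max + 1:]
    # stage 4: value at the requested fraction from the start of that part
    index = min(int(fraction * len(part)), len(part) - 1)
    return part[index]


def sharpen(image, m=3, n=3, fraction=0.5, min_jump=0):
    """Apply sharpen_pixel to every interior point (border rows and columns are copied).

    Values are always read from the input image, so the filter is applied
    sequentially and not recursively.
    """
    rows, cols = len(image), len(image[0])
    hr, hc = m // 2, n // 2
    out = [row[:] for row in image]
    for r in range(hr, rows - hr):
        for c in range(hc, cols - hc):
            window = [image[r + dr][c + dc]
                      for dr in range(-hr, hr + 1)
                      for dc in range(-hc, hc + 1)]
            out[r][c] = sharpen_pixel(window, image[r][c], fraction, min_jump)
    return out
```

In a window that straddles a boundary, the central point is therefore moved to a representative value of the side it already belongs to, which steepens the transition between the two sides.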



Fig. 14. Filtered image.<br />

VI. PRECISION CALCULATIONS ON THE MEASURE OF<br />

STRUCTURE<br />

For characterization, the line of objects obtained using<br />

the method described in the previous section needs to be<br />

considered in terms of its major properties. The modern<br />

requirements for recognition systems establish structures as<br />

main features for natural objects such as the measures defined<br />

by Tamura [9].<br />

For structure classification, we apply fractal geometry for<br />

a description of natural objects. A fundamental property of<br />

a fractal is its Fractal Dimension. There are a number of ways<br />

to calculate this feature of a fractal object and many different<br />

approaches to computing the Fractal Dimension have<br />

been considered [23]. For example, the origins of the ‘box<br />

dimension’ are hard to trace but would have been considered<br />

by pioneers of the Hausdorff measure <strong>and</strong> dimension<br />

<strong>and</strong> was probably rejected as being less satisfactory from<br />

a computational viewpoint. The precision of the calculation<br />

is less than two decimal places. Computation of the Fourier<br />

dimension provides a better result [10]. However, in our case,<br />

we have to estimate the dimension from an image with a<br />

lower resolution than that at which the object ‘exists’ using a<br />

frequency spectrum that is subject to additive noise.<br />

Many signal processing applications are based on the use<br />

of different transforms. The signals under consideration are<br />

written as a linear combination (or series) of some predefined<br />

set of functions. Traditionally, orthogonal basis functions have<br />

been used for this purpose, for example, the discrete Fourier<br />

transform. The theory for orthogonal basis <strong>and</strong> Hilbert spaces<br />

can, however, be generalized to other sequences of functions<br />

called frames which have been used in this work to develop<br />

measures of structure with high precision.<br />

If we consider the profile of a typical cytopathology image,<br />

then the curve does not coincide with a sine-wave signal.<br />

To obtain adequate accuracy, it is necessary to magnify the<br />

resolution of the image, which in turn introduces distortion.<br />

For increased accuracy on low-resolution data, we consider a<br />

convolution function of a form more consistent with the profile<br />

of a video signal. For a signal I we consider the representation<br />

$$F(k) = \sum_{n=1}^{N} I(n) \left\{ \arccos\left[\cos\left(\frac{2\pi(k-1)(n-1)}{N} - \frac{\pi}{2}\right)\right] - \frac{\pi}{2} - i \arcsin\left[\cos\left(\frac{2\pi(k-1)(n-1)}{N}\right)\right] \right\} \tag{1}$$

and for an image I with resolution M × N,<br />

$$F(p,q) = \sum_{m=1}^{M}\sum_{n=1}^{N} I(m,n) \Bigg\{ \left[\arccos\left(\cos\left(\frac{2\pi(p-1)(m-1)}{M} - \frac{\pi}{2}\right)\right) - \frac{\pi}{2}\right] \times \left[\arccos\left(\cos\left(\frac{2\pi(q-1)(n-1)}{N} - \frac{\pi}{2}\right)\right) - \frac{\pi}{2}\right] - i \arcsin\left(\cos\left(\frac{2\pi(p-1)(m-1)}{M}\right)\right) \times \arcsin\left(\cos\left(\frac{2\pi(q-1)(n-1)}{N}\right)\right) \Bigg\} \tag{2}$$

In this work, application of the power spectrum method used<br />

to compute the fractal dimension of a cell boundary and<br />

cell surface is based on the above representations for F(k) and<br />

F(p, q) respectively. We then consider the power spectrum<br />

of an ideal fractal signal given by $P = c|k|^{-\beta}$, where c is a<br />

constant and β is the spectral exponent. In two dimensions,<br />

the power spectrum is given by $P(k_x, k_y) = c|k|^{-\beta}$, where<br />

$|k| = \sqrt{k_x^2 + k_y^2}$. In both cases, application of the least squares<br />

method or Orthogonal Linear Regression yields a solution<br />

for β and c [23], the relationship between β and the Fractal<br />

Dimension $D_F$ being given by [23]<br />

$$D_F = \frac{3D_T + 2 - \beta}{2}$$

for Topological Dimension DT . This approach allows us to<br />

drop the limits on the recognition of small objects since<br />

application of the FFT (for computing the power spectrum)<br />

works well (in terms of computational accuracy) only for<br />

large data sets, i.e. array sizes larger than 256 and 256×256.<br />

Tests on the accuracy associated with computing the fractal<br />

dimension using equations (1) <strong>and</strong> (2) show an improvement of<br />

5% over computations based on conventional Discrete Fourier<br />

Transform.<br />
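A direct, unoptimised sketch of the one-dimensional representation and of the estimate of β and $D_F$ is given below. The function names are hypothetical, the loop indices `k` and `n` start at zero and stand for (k − 1) and (n − 1) in equation (1), and ordinary least squares on the log-log power spectrum is used as one of the two fitting options mentioned above.

```python
import math


def tri_transform(signal):
    """Evaluate F(k) of equation (1) by direct summation."""
    size = len(signal)
    result = []
    for k in range(size):              # k stands for (k - 1) in the text
        acc = 0j
        for n in range(size):          # n stands for (n - 1) in the text
            x = 2.0 * math.pi * k * n / size
            # clamp the cosines to [-1, 1] before the inverse functions
            c_shift = max(-1.0, min(1.0, math.cos(x - math.pi / 2)))
            c_plain = max(-1.0, min(1.0, math.cos(x)))
            real = math.acos(c_shift) - math.pi / 2
            imag = -math.asin(c_plain)
            acc += signal[n] * complex(real, imag)
        result.append(acc)
    return result


def spectral_exponent(power):
    """Least squares fit of log P = log c - beta * log k.

    power[i] is the power at frequency k = i + 1 (zero frequency excluded);
    all values must be positive. Returns (beta, c).
    """
    xs = [math.log(i + 1) for i in range(len(power))]
    ys = [math.log(p) for p in power]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope, math.exp(my - slope * mx)


def fractal_dimension(beta, d_t=1):
    """D_F = (3 * D_T + 2 - beta) / 2 for Topological Dimension D_T."""
    return (3 * d_t + 2 - beta) / 2
```

In use, the power spectrum would be taken as `abs(F) ** 2` for each positive frequency of `tri_transform(profile)` before calling `spectral_exponent`; for an ideal spectrum with β = 2 and $D_T$ = 1 the sketch gives $D_F$ = 1.5.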

VII. FEATURE DETERMINATION<br />

Features (which are typically collected in a set of<br />

metrics - floating point or decimal integer numbers) describe<br />

the object state in an image and provide the input for a<br />

decision making engine as illustrated in Figure 2. The features<br />

considered in this paper are computed in the spatial domains<br />

of the original image {fm,n} <strong>and</strong> transformed image { ˜ fm,n}.<br />

Further, these features are extracted from the three colour<br />

channels - Red (R), Green (G) <strong>and</strong> Blue (B) - captured



by the CCD array. The issue of what type <strong>and</strong> how many<br />

features should be used to develop a computer vision system<br />

is critical to the design associated with a specific application.<br />

The system considered here has been developed to include<br />

features associated with the texture of an object which include<br />

the Fractal Dimension. Texture is particularly important in<br />

medical image classification <strong>and</strong> of primary importance in the<br />

application considered in this paper. The following features<br />

or their derivatives have been considered (primarily through a<br />

process of ’trial <strong>and</strong> error’) in the recognition system reported<br />

in this paper:<br />

Average Gradient G<br />

describes how the intensity changes when scanning<br />

from the object center to the border. The object<br />

gradient is computed using the least squares method<br />

in polar coordinates as expressed in the following<br />

result:<br />

$$g = \frac{N \sum\limits_{(m,n)\in S} r_{m,n}\tilde{f}_{m,n} - \sum\limits_{(m,n)\in S} r_{m,n} \sum\limits_{(m,n)\in S} \tilde{f}_{m,n}}{N \sum\limits_{(m,n)\in S} r_{m,n}^2 - \left(\sum\limits_{(m,n)\in S} r_{m,n}\right)^2}$$

where N is the number of object pixels and $r_{m,n}$ is<br />

the distance between (m, n) and the center (m′, n′),<br />

i.e.<br />

$$r_{m,n} = \sqrt{(m - m')^2 + (n - n')^2}.$$

The centers (m′, n′) correspond to the local maxima<br />

of $\tilde{f}_{m,n}$ within the cluster. The cluster gradient<br />

is the average of object gradients,<br />

$$G = \langle g_i \rangle_{i \in I}$$

where i ∈ I is the object index.<br />
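The least squares expression for g is the slope of a straight line fitted to intensity against distance from the center, which can be sketched as follows. The function names and the representation of an object as a list of `(m, n, value)` triples are assumptions of this sketch.

```python
import math


def object_gradient(pixels, centre):
    """Least squares slope of intensity against distance from the centre.

    pixels: iterable of (m, n, value) triples for the object S
    centre: (m', n')
    """
    count = sum_r = sum_f = sum_rf = sum_rr = 0.0
    for m, n, value in pixels:
        r = math.hypot(m - centre[0], n - centre[1])
        count += 1
        sum_r += r
        sum_f += value
        sum_rf += r * value
        sum_rr += r * r
    denominator = count * sum_rr - sum_r * sum_r
    if denominator == 0:
        raise ValueError("all pixels lie at the same distance from the centre")
    return (count * sum_rf - sum_r * sum_f) / denominator


def cluster_gradient(object_gradients):
    """G: the average of the object gradients g_i."""
    return sum(object_gradients) / len(object_gradients)
```

For a synthetic object whose intensity grows as 3 + 2r from its center, the sketch recovers a gradient of 2.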

Colour Composites $\Upsilon$ and $\Upsilon^D$<br />

characterise the relationship between R, G and B<br />

layers of the transformed image. The triangle formula<br />

$$r(a,b,c) = \sqrt{\frac{(s-a)(s-b)(s-c)}{s}}, \qquad s = \frac{1}{2}(a+b+c)$$

is applied to the ‘colour triangle’ RGB such that the<br />

following pixel colour composite is obtained<br />

$$\upsilon_{m,n} = r(a,b,c)$$

where<br />

$$a = \tilde{f}^R_{m,n}, \quad b = \tilde{f}^G_{m,n}, \quad c = \tilde{f}^B_{m,n}$$

and $\upsilon^D_{m,n} = r_{\text{incircle}}(a,b,c)$ with<br />

$$a = |\tilde{f}^R_{m,n} - \tilde{f}^G_{m,n}|, \quad b = |\tilde{f}^G_{m,n} - \tilde{f}^B_{m,n}|$$

and<br />

$$c = |\tilde{f}^R_{m,n} - \tilde{f}^B_{m,n}|.$$

The average colour composites are then given by<br />

$$\Upsilon = \langle \upsilon_{m,n} \rangle_{(m,n)\in S}, \qquad \Upsilon^D = \langle \upsilon^D_{m,n} \rangle_{(m,n)\in S}.$$

Fourier Dimension q<br />

determines the frequency characteristics of the object<br />
and is related to the fractal dimension $D_F$ by $q = 4 - D_F$<br />
[1], [2]. It represents a measure of texture<br />
[23] and is computed using the approach discussed<br />
in Section VI.<br />

Lacunarity (Gap Dimension) Λk<br />

characterizes the way the ‘gaps’ are distributed in an<br />

image [2]. The gap dimension is, roughly speaking,<br />

the number of light or dark spots in the image. It is<br />

defined for the given degree k by<br />

$$\Lambda_k = \left\langle \left| \frac{f_{m,n}}{\langle f_{m,n} \rangle} - 1 \right|^{k} \right\rangle^{\frac{1}{k}},$$<br />
where $\langle f_{m,n} \rangle = \frac{1}{N} \sum_{m,n} f_{m,n}$ denotes the mean value.<br />

In the system described in this paper, an average of<br />

local lacunarities of the degree k = 2 is measured in<br />

the spatial <strong>and</strong> frequency domains.<br />

Symmetry Features Sn <strong>and</strong> M<br />

are estimated by morphological analysis in three-dimensional<br />

space, i.e. two-dimensional spatial coordinates<br />

<strong>and</strong> intensity. A symmetry feature Sn is measured<br />

for a given degree of symmetry n (currently<br />

n = {2, 4}). This value shows the deviation from a<br />

perfectly symmetric object, i.e. Sn is close to zero<br />

when the object is symmetric <strong>and</strong> Sn > 0 otherwise.<br />

Feature M describes the fluctuation of the centre of<br />
mass for pixels with different intensities; M = 0 for<br />

symmetric objects <strong>and</strong> M > 0 otherwise.<br />

Structure γ<br />

provides an estimation of the 2D curvature of the<br />

object in terms of the following:<br />

γ < 0, if the object bulging is less than a threshold,<br />

γ = 0, if the object has the standard bulging,<br />

γ > 0, if the object has a higher level of bulging.<br />

Geometrical Features<br />

include the minimum Rmin <strong>and</strong> maximum radius<br />

Rmax of the object (or ratio Rmax/Rmin), object<br />

area S, object perimeter P (or ratio S/P 2 ) <strong>and</strong> the<br />

coefficient of infill S/SR, where SR is the area of<br />

the bounding polygon which, in this application, is<br />

determined using the convex hull algorithm given in<br />

Section V.<br />
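The intensity-based features in the list above (average gradient, colour composite and lacunarity) can be sketched in a few lines. The following is a minimal illustration only, assuming a NumPy array `f` for the transformed image, a boolean `mask` for the object pixel set S and a known centre (m′, n′); the function names are hypothetical and are not taken from the system described in this paper.

```python
import numpy as np

def object_gradient(f, mask, centre):
    """Least-squares slope g of intensity against distance r from the centre."""
    m, n = np.nonzero(mask)                       # pixel coordinates in S
    r = np.hypot(m - centre[0], n - centre[1])    # r_{m,n}
    v = f[m, n].astype(float)
    N = r.size
    denom = N * np.sum(r * r) - np.sum(r) ** 2
    if denom == 0:                                # assumption: undefined slope -> 0
        return 0.0
    return float((N * np.sum(r * v) - np.sum(r) * np.sum(v)) / denom)

def inradius(a, b, c):
    """Triangle formula r(a, b, c) = sqrt((s-a)(s-b)(s-c)/s), s = (a+b+c)/2."""
    s = 0.5 * (a + b + c)
    prod = (s - a) * (s - b) * (s - c)
    if s <= 0 or prod <= 0:                       # assumption: degenerate triangle -> 0
        return 0.0
    return float(np.sqrt(prod / s))

def lacunarity(f, mask, k=2):
    """Lambda_k = < |f/<f> - 1|^k >^(1/k) over the object pixels (mean assumed non-zero)."""
    v = f[mask].astype(float)
    return float(np.mean(np.abs(v / v.mean() - 1.0) ** k) ** (1.0 / k))
```

For a pixel with layer values (R, G, B), the composite υ would then be `inradius(R, G, B)`, and the cluster gradient G is simply the mean of `object_gradient` over the objects in the cluster.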

The system reported in this paper classifies objects using<br />

mixed mode features that are based on Euclidean <strong>and</strong> fractal<br />

geometric metrics. The procedure of object detection is performed<br />

at the segmentation stage <strong>and</strong> needs to be adjusted<br />

for each area of application. The recognition algorithm then<br />

makes a decision using a knowledge database <strong>and</strong> outputs a<br />

result by subscribing objects based on the features defined<br />

above. The ‘expert data’ associated with a given application<br />

creates a knowledge database by using a supervised training<br />

system with a number of model objects. This is discussed in<br />

the following section.
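
The geometrical features listed in this section can be illustrated in the same spirit. The sketch below assumes that an object is available as an ordered list of contour vertices and that its area is taken as the polygon area (the system itself may count pixels). It uses Andrew's monotone chain for the convex hull, which is a stand-in and not necessarily the algorithm of Section V; all names are hypothetical.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula for a simple polygon."""
    pairs = zip(poly, poly[1:] + poly[:1])
    return 0.5 * abs(sum(x0 * y1 - x1 * y0 for (x0, y0), (x1, y1) in pairs))

def polygon_perimeter(poly):
    return sum(math.dist(p, q) for p, q in zip(poly, poly[1:] + poly[:1]))

def geometric_features(contour, centre):
    """Return R_max/R_min, S/P^2 and the coefficient of infill S/S_R."""
    radii = [math.dist(p, centre) for p in contour]
    S = polygon_area(contour)
    P = polygon_perimeter(contour)
    S_R = polygon_area(convex_hull(contour))      # area of the bounding polygon
    return {
        "radius_ratio": max(radii) / min(radii),
        "compactness": S / P ** 2,
        "infill": S / S_R,
    }
```

A convex object gives an infill of 1, and the value falls below 1 as the contour becomes more indented.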



VIII. OBJECT RECOGNITION<br />

In order to characterize an object, the ‘system’ must have<br />

a mathematical representation compounded in metrics that<br />

are used to compose a feature vector. The basis for the<br />
application considered in this paper is the set of textural features<br />

(Fourier dimension <strong>and</strong> Lacunarity) for an object coupled with<br />

the Euclidean <strong>and</strong> morphological measures as defined in the<br />

previous section. In the case of a general application, all<br />

objects are represented by a list of parameters for implementation<br />

of supervised learning in which a fuzzy logic engine<br />

automatically adjusts the weight coefficients for the remaining<br />

features. The methods developed represent a contribution to<br />

pattern recognition based on fractal geometry (at least in a<br />

partial sense), fuzzy logic <strong>and</strong> the implementation of a fully<br />

automatic recognition scheme as illustrated in Figure 15 for<br />

the Fractal Dimension D (just one element of the feature<br />

vector used in practice). The recognition procedure uses the<br />

decision making rules from fuzzy logic theory [21], [18], [19],<br />

[20] based on all, or a selection, of the features defined <strong>and</strong><br />

discussed in Section VII which are combined to produce a<br />

feature vector x.<br />

Fig. 15. Basic architecture of the diagnostic system based on the Fractal<br />

Dimension D (a single feature) <strong>and</strong> decision making criteria β.<br />

A. Decision Making<br />

The class probability vector p = {p_j} is estimated from<br />
the object feature vector x = {x_i} and the membership functions<br />
m_j(x) defined in the knowledge database. If m_j(x) is a membership<br />
function, the probability for the j-th class and i-th feature<br />
is given by<br />
$$p_j(x_i) = \max\left[ \frac{\sigma_j}{|x_i - x_{j,i}|} \cdot m_j(x_{j,i}) \right]$$<br />
for the weight coefficient matrix w_j = w_{j,i}, where σ_j<br />
is the distribution density of values x_j at the point x_i of the<br />
membership function. The next step is to compute the mean<br />
class probability given by<br />
$$\langle p \rangle = \frac{1}{j} \sum_{j} w_j p_j$$<br />
where the distance from the mean probability selects the class<br />
associated with<br />
$$p(j) = \min\left[ (p_j \cdot w_j - \langle p \rangle) \geq 0 \right]$$<br />

providing a result for the decision making of the j th class.<br />

The weight coefficient matrix is adjusted during the learning<br />

stage of the algorithm.<br />
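
One literal reading of the decision rule above can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: the stored samples x_{j,i} and their membership values are passed as plain arrays, a small `eps` is added to the distance to avoid division by zero, the factor 1/j is read as one over the number of classes, and the function names are hypothetical.

```python
import numpy as np

def feature_class_probability(x_i, samples, membership, sigma, eps=1e-9):
    """p_j(x_i) = max over stored samples of sigma_j / |x_i - x_{j,i}| * m_j(x_{j,i})."""
    d = np.abs(x_i - np.asarray(samples, dtype=float)) + eps   # eps is an assumption
    return float(np.max(sigma / d * np.asarray(membership, dtype=float)))

def select_class(p, w):
    """Pick the class with the smallest non-negative value of w_j p_j - <p>."""
    s = np.asarray(w, dtype=float) * np.asarray(p, dtype=float)  # w_j p_j
    mean_p = s.mean()                                            # <p>
    excess = s - mean_p
    candidates = np.where(excess >= 0)[0]    # never empty: the maximum is >= the mean
    j = candidates[np.argmin(excess[candidates])]
    return int(j), float(mean_p)
```

Read this way, the rule selects the class whose weighted probability lies closest above the mean, which is not necessarily the class with the largest weighted probability.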

The decision criterion method considered here represents a<br />

weighing-density minimax expression. The estimation of the<br />

decision accuracy is achieved by using the density function<br />

$$d_i = |x_{\sigma_{\max}} - x_i|^{3} + \left( \sigma_{\max}(x_{\sigma_{\max}}) - p_j(x_i) \right)^{3}$$<br />
with an accuracy determined by<br />
$$P = w_j p_j - w_j p_j \frac{2}{\pi} \sum_{i=1}^{N} d_i.$$<br />

B. Supervised Learning Process<br />

The supervised learning procedure is the most important<br />

part of the system for operation in automatic recognition mode.<br />

The training set of sample objects should cover all ranges of<br />

class characteristics with a uniform distribution together with<br />

a universal membership function. This rule should be taken<br />

into account for all classes participating in the training of the<br />

system. An expert defines the class <strong>and</strong> accuracy for each<br />

model object where the accuracy is the level of self confidence<br />

that the object belongs to a given class. During this procedure,<br />

the system computes <strong>and</strong> transfers to a knowledge database<br />

a vector of values of parameters x = {xi} which forms the<br />

membership function mj(x). The matrix of weight factors wj,i<br />

is formed at this stage accordingly for the i th parameter <strong>and</strong><br />

j th class using the following expression:<br />

$$w_{i,j} = \left| 1 - \sum_{k=1}^{N} \left( p_{i,j}(x^{k}_{i,j}) - \langle p_{i,j}(x_{i,j}) \rangle \right) p_{i,j}(x^{k}_{i,j}) \right|.$$<br />

The result of the weight matching procedure is that all<br />

parameters which have been computed but have not made any<br />

contribution to the characteristic set of an object are removed<br />

from the decision making algorithm by setting wj,i to null.<br />
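
The weight expression above can be evaluated directly from the N class probabilities recorded for one feature and one class during training. The sketch below is a minimal illustration with a hypothetical function name. Since the sum of (p_k − ⟨p⟩) p_k equals the sum of p_k² minus N⟨p⟩², the expression amounts to |1 − N·Var(p)|, where Var is the population variance of the recorded probabilities.

```python
import numpy as np

def weight_factor(p_samples):
    """w = |1 - sum_k (p_k - <p>) p_k| for the training probabilities of one (i, j) pair."""
    p = np.asarray(p_samples, dtype=float)
    return float(abs(1.0 - np.sum((p - p.mean()) * p)))
```

Identical probabilities across all model objects give a weight of 1, and the weight moves away from 1 as the probabilities spread out.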

IX. DISCUSSION<br />

The methods discussed in the previous sections represent<br />

a novel approach to designing an object recognition system<br />

that is robust in classifying textured features, the application<br />

considered in this paper, having required a symbiosis of the<br />

parametric representation of an object <strong>and</strong> its geometrical<br />

invariant properties. In comparison with existing methods, the<br />

approach adopted here has the following advantages:<br />

Speed of operation. The approach uses a limited but effective<br />

parameter set (feature vector) associated with an object<br />

instead of a representation using a large set of values (pixel<br />

values, for example). This provides a considerably higher operational<br />

speed in comparison with existing schemes, especially<br />

with composite tasks, where the large majority of methods<br />

require object separation. The principal computational effort



is that associated with the computation of the feature vector<br />

using the metrics discussed in Section VII.<br />

Accuracy. The methods constructed for the analysis of<br />

sets of geometrical primitives are, in general, more precise.<br />

Because the parameters are feature values, which are not<br />

connected to an orthogonal grid, it is possible to design<br />

different transformations (shifts, rotational displacements and<br />
scaling) without any significant loss of accuracy compared<br />
with a set of pixels, for example. On the other hand, the overall<br />
accuracy of the method is directly influenced by the accuracy<br />
of the procedure used to extract the required geometrical tags.<br />
Generally, the accuracy of the method will always be lower<br />
than, for example, that of classical correlative techniques, where,<br />
due to padding, errors can arise during the extraction of a<br />
parameter set. However, by using precise parameterization<br />
structures based on fractal geometry, remarkably good results<br />
are obtained.<br />

Reliability. The proposed approach relies first <strong>and</strong> foremost<br />

on the reliability of the extraction procedure used to establish<br />

the geometrical <strong>and</strong> parametric properties of objects, which,<br />

in turn, depends on the quality of the image; principally in<br />

terms of the quality of the contours. It should be noted, that<br />

the image quality is a common problem in any visual system<br />

<strong>and</strong> that in conditions of poor visibility <strong>and</strong>/or resolution, all<br />

vision systems will fail. In other words, the reliability of the<br />

system is fundamentally dependent on the quality of the input<br />

data.<br />

An additional feature of the system discussed in this paper<br />
is that the by-products of the image processing can be used<br />
for tasks that are related to image analysis, such as searching for<br />
objects in a field of view, object identification, maintaining an<br />
object in the field of view, optical correction of a view point and<br />
so on. These can include tasks involving the relative motion<br />
of an object with respect to another object or with respect to the<br />
background, for which the method considered can also be<br />
applied - collision avoidance tasks, for example.<br />

Among the characteristic disadvantages of the approach, it<br />
should be noted that: (i) the method requires a considerable<br />
number of different calculations to be performed and appropriate<br />
hardware is therefore mandatory in the<br />
development of a real time system; (ii) the accuracy of the<br />
method is intimately connected with the required computing<br />
speed - an increase in accuracy can be achieved but may be<br />
incompatible with acceptable computing costs. In general, it<br />
is often difficult to acquire a template of samples under real<br />
life or field trial conditions which have a uniform distribution<br />
of membership functions. If a large number of training objects<br />
are non-uniformly distributed, it is, in general, not possible to<br />
generate an accurate recognition system.<br />

The original approach to the decision process proposed here<br />
includes the following important steps: (i) the<br />
density distribution is accurately estimated from the original<br />
samples in the membership function during a supervised<br />
learning phase, which improves the recognition accuracy under<br />
non-ideal conditions; (ii) the pre-filtering procedures provide<br />
a good response to the required features of the object without<br />
generating noise; (iii) the segmentation procedures discussed<br />
in Section III efficiently select only those objects required; (iv)<br />
the computation of fractal parameters, in particular the average<br />
lacunarity, helps to characterize the textural features (in terms<br />
of their classification) associated with the object.<br />

The integration of Euclidean with fractal geometric parameters<br />

provides a more complete suite of tools for pattern<br />

recognition in combination with supervised learning through<br />

fuzzy logic criteria. In the following section, we consider the<br />

application of our approach for the design of a cytological<br />

screening system.<br />

X. APPLICATION TO CERVICAL SMEAR SCREENING<br />

The application considered in this section focuses on<br />
screening programmes that utilize Liquid Based Cytology<br />
(LBC). Cells are collected from the cervix in the same way as<br />
for a PAP smear, but using a very small brush instead of a spatula.<br />
The head of the brush is broken off and maintained in a liquid<br />
environment instead of smearing the cells directly onto a slide.<br />
This preserves the cells and so the results of the test are more<br />
reliable. At present, about one in twelve PAP smears has to<br />
be done again because it cannot be read properly. With the<br />
LBC approach, far fewer tests have to be repeated. However, the<br />
LBC method is not, as yet, in widespread use. Nevertheless,<br />
the system reported in this paper has been designed to operate<br />
in conjunction with screening centres that use LBC.<br />

A. Classes of Cervical Cells<br />

There are two main types of cervical cancer: (i) Squamous<br />

cell cancer; (ii) Adenocarcinoma. They are named after the<br />

type of cell that becomes cancerous. Squamous cells are<br />

the flat skin-like cells that cover the surface of the cervix.<br />

Squamous cell cancer is the most common type of cervical<br />

cancer. Adenocarcinoma cells are glandular cells that produce<br />

mucus. The cervix has these glandular cells along the inside<br />

of the passageway that runs from the cervix to the womb<br />

(the endocervical canal). Adenocarcinoma is a cancer of these<br />

cell types. It is less common than squamous cell cancer, but<br />

has become more commonly recognised in recent years. Only<br />

about one in five to one in ten cases of cervical cancer are<br />

adenocarcinoma. Adenocarcinoma is associated with a similar<br />

precancerous phase. It is treated in the same way as squamous<br />

cell cancer of the cervix.<br />

Tables I and II explain the relationship between<br />
the current system and the Bethesda 2001 classifications -<br />
http://www.aafp.org/afp/2003/1115/p1992.html. The first class<br />
represents normal cells and the last represents malignant (cancerous)<br />
cells. Intermediate classes represent different degrees<br />
of abnormality; it is important to detect these as well. The<br />
classification on which the system is ‘focused’ is simplified<br />
because, unlike Bethesda 2001, it provides a fuzzy estimation<br />
of class membership, which gives a better description of the<br />
cell state. An additional class, Exudate, is defined to describe<br />
irrelevant structures in the image.<br />

With current techniques, all cervical smear tests are examined<br />

by ‘screeners’ who have only a few minutes per slide.<br />

This means that the screening is done at low magnification <strong>and</strong><br />

high speed so it is not surprising that mistakes can be made.<br />

The ‘screeners’ look for abnormal variations in the ratio of the



TABLE I<br />
CLASSIFICATION OF SQUAMOUS CELLS.<br />
System | Bethesda 2001<br />
Normal Sq | Normal squamous cells<br />
Normal Sq | Atypical squamous cells – ’undetermined significance’ (ASC-US)<br />
Normal Sq | Atypical squamous cells – ’cannot exclude high grade disease’ (ASC-H)<br />
LSIL | Low grade squamous intra-epithelial lesion (LSIL)<br />
HSIL | High grade squamous intra-epithelial lesion (HSIL) – CIN2<br />
HSIL | High grade squamous intra-epithelial lesion (HSIL) – CIN3<br />
Invasive Sq | Invasive squamous carcinoma<br />

TABLE II<br />
CLASSIFICATION OF GLANDULAR CELLS.<br />
System | Bethesda 2001<br />
Normal Gl | Normal glandular cells<br />
Normal Gl | Atypical glandular cells (AGC) – endocx/endom/not specified<br />
Normal Gl | Atypical glandular cells (AGC) – favour neoplasia<br />
AIS | Adenocarcinoma in situ (AIS)<br />
Invasive Adeno | Adenocarcinoma<br />

size of the nucleus relative to the size of the cell, as well as<br />

other markers of diseased tissue. When they identify suspect<br />

areas of the slide they mark these with a felt tip pen <strong>and</strong> pass<br />

them on for further inspection. These slides are then looked<br />

at by ‘checkers’ who have more experience <strong>and</strong> examine the<br />

slide more carefully <strong>and</strong> at higher magnification. If they are not<br />

satisfied that ‘all is well’, then they pass the suspect slides to a<br />

cytopathologist for further, more detailed analysis <strong>and</strong> diagnosis.<br />

Even at this final stage, mistakes can be made as each slide<br />

is prepared differently <strong>and</strong> it is common for cells to overlie<br />

each other, compounding the problem of accurate diagnosis<br />

further. New techniques that use cytocentrifuge preparations<br />

(e.g. http://www.tharmac.com/?p=15) can overcome this last<br />

problem but have yet to be introduced in general.<br />

One of the major criteria of assessing whether a cell is premalignant<br />

or malignant is the ratio of the size of the nucleus of<br />

the cell compared with that of the whole cytoplasm - the nuclear/cytoplasmic<br />

ratio. The rapid identification of variations in<br />

these ratios enables ‘checkers’ to quickly <strong>and</strong> more accurately<br />

determine if there are abnormalities by examining cells that<br />

are located in a small area. To estimate the condition of the<br />
cells, the cytologist typically makes up to 300 slide movements<br />
over a period of 8-10 minutes on a desk microscope and may<br />
consequently miss many important features. This approach not<br />
only takes time but inevitably cannot guarantee consistent<br />
and accurate estimates of the condition of the cells. With an<br />

increasing number of screening projects taking place together<br />

with the variability of different preparations, diagnostic errors<br />

can lead to a number of fatalities due to false negatives <strong>and</strong><br />

lack of appropriate treatment in the early stages of cervical<br />

cancer.<br />
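
As a small illustration of the nuclear/cytoplasmic ratio described above, the sketch below assumes that the areas of the nucleus and of the whole cell are already available in pixels from a prior segmentation, and that the cytoplasm area is the cell area minus the nucleus area. The function name and this convention are assumptions made for the example only.

```python
def nc_ratio(nucleus_area, cell_area):
    """Nuclear/cytoplasmic ratio, with cytoplasm = whole cell minus nucleus (assumed)."""
    cytoplasm_area = cell_area - nucleus_area
    if nucleus_area < 0 or cytoplasm_area <= 0:
        raise ValueError("nucleus must be non-negative and smaller than the cell")
    return nucleus_area / cytoplasm_area
```

For example, a nucleus of 20 pixels inside a cell of 100 pixels gives a ratio of 20/80 = 0.25.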

At present, there are no commercial or experimental systems<br />

available for the automatic identification <strong>and</strong> classification<br />

of tissue cells without human participation. Obtaining results<br />

from cytology diagnostics in real time with a robust least error<br />

criterion is a widespread <strong>and</strong> important problem for screening<br />

the cervix uteri. The automatic coloring (staining) <strong>and</strong><br />

scanning of the material creates preconditions in designing an<br />

algorithm <strong>and</strong> technical devices for the automatic identification<br />

<strong>and</strong> classification in cytopathology. A key point is to identify<br />

<strong>and</strong> classify the condition of the cell nuclei using a suitable<br />

recognition process.<br />

There is a range of techniques that aim to<br />

improve the examination of slides using integrated<br />

optical densitometry. For example, SurePath -<br />

http://www.pathlabsofark.com/surepathliquidpap.html -<br />

uses integrated optical density of conventional smears. The<br />

aim of the system reported in this paper is to exclude 25%<br />

of samples without visual examination. Unlike a human<br />

expert, the automatic scanning method can count the cells<br />

<strong>and</strong> estimate their statistical distribution among classes or<br />

states. The system delivers high accuracy <strong>and</strong> automation due<br />

to the following innovations:<br />

Fractal analysis<br />

Biological structures (such as body tissues) have<br />

natural fractal properties. Numerical measurements<br />
of these properties provide for the efficient and<br />
effective detection of abnormalities.<br />

Extended set of detectable features<br />

High accuracy is achieved when multiple features are<br />
measured together and combined into a result.<br />

Advanced fuzzy logic engine<br />

The knowledge-based recognition scheme enables<br />

highly accurate diagnosis.<br />

B. System Overview<br />

It is proposed that the approach described in this paper <strong>and</strong><br />

the system developed may assist cytopathologists in reducing<br />

the workload by eliminating in a secure manner a percentage<br />

of normal smears, thus allowing more time for the evaluation<br />

of the abnormal cases. The ‘software solutions’ detect abnormalities<br />

in organic structures such as cells by digital image<br />

analysis. Cancer experts create the knowledge database by<br />

training the system with a number of case study images. The<br />

recognition algorithm is composed of the following steps:<br />

Filtering<br />

The image is filtered to reduce noise <strong>and</strong> remove<br />

unnecessary features (bacteria, broken cells).<br />

Segmentation<br />

The image is segmented to perform a separate analysis<br />

of each object. In order to separate connected<br />

objects a new algorithm has been designed. An<br />

example of the GUI developed is given in Figure 16<br />

which shows the stage at which the nuclei of suspect<br />

cells have been identified <strong>and</strong> located.<br />

Feature Detection<br />

For each object, a set of recognition features are<br />

detected. The features are numeric parameters that<br />

describe the object inclusive of fractal geometric<br />

parameters. The system captures a variety of geometrical,<br />
fractal and statistical features in one and two dimensions.<br />
One-dimensional features correspond to<br />



TABLE III<br />
IMAGING ACQUISITION HARDWARE.<br />
Model and Supplier | Advantages | Shortcomings<br />
Nikon Coolscope (Nikon Instruments Europe BV) | Available on the market. Magnification 40x. | Very slow (several hours/slide). Small focus depth and automatic focus does not find the optimal z-position. Dynamic range to be adjusted. Tiling scan.<br />
Aperio Scanscope (Aperio Technologies: DakoCytomation) | Complete solution with a slide feeder. High scanning speed (20 min/slide). Magnification 40x. Non-tiling scan. Better focus. Better dynamic range. | Not fully developed. Problem to achieve 60x.<br />
Nikon Eclipse E8000 + JVC 3-CCD KY-F55B | Variable resolution 4X-80X. Manual focus. Manual brightness. | Manual image capture. Can be used only for testing.<br />

the border of objects, whereas two-dimensional features<br />

relate to the surface within <strong>and</strong> around objects.<br />

Decision Making<br />

The system uses fuzzy logic to combine features<br />

into a decision. A decision is the estimated class<br />

of object <strong>and</strong> accuracy probability. In-between states<br />

are determined by the probability. For example,<br />

35% normal is equivalent to 65% abnormal <strong>and</strong><br />

suggests careful analysis by cancer specialists. In<br />

the extended training version for cervical cancer, the<br />

system provides up to 10 classes (CINs) depending on<br />

the classification system <strong>and</strong> the number <strong>and</strong> extent<br />

of available samples for learning procedures.<br />
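
The four steps above (filtering, segmentation, feature detection, decision making) can be sketched as a simple processing chain. The example below is a schematic illustration only: a 3x3 mean filter, a fixed threshold and 4-connected labelling are stand-ins for the filtering and segmentation algorithms of the system, the two features are placeholders for the full feature vector, and `classify` is a hypothetical callable that plays the role of the fuzzy logic engine.

```python
from collections import deque
import numpy as np

def box_filter(img):
    """3x3 mean filter with edge replication (stand-in for the filtering step)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def label_objects(binary):
    """4-connected component labelling by breadth-first search."""
    labels = np.zeros(binary.shape, dtype=int)
    count = 0
    for start in zip(*np.nonzero(binary)):
        if labels[start]:
            continue                      # pixel already belongs to an object
        count += 1
        labels[start] = count
        queue = deque([start])
        while queue:
            m, n = queue.popleft()
            for dm, dn in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                q = (m + dm, n + dn)
                inside = 0 <= q[0] < binary.shape[0] and 0 <= q[1] < binary.shape[1]
                if inside and binary[q] and not labels[q]:
                    labels[q] = count
                    queue.append(q)
    return labels, count

def recognise(img, threshold, classify):
    """Filtering -> segmentation -> feature detection -> decision, one result per object."""
    smooth = box_filter(img)                              # 1. filtering
    labels, count = label_objects(smooth > threshold)     # 2. segmentation
    results = []
    for k in range(1, count + 1):
        mask = labels == k
        features = {"area": int(mask.sum()),              # 3. feature detection
                    "mean": float(smooth[mask].mean())}
        results.append(classify(features))                # 4. decision making
    return results
```

In a fuller sketch, `features` would hold the feature vector x of Section VII and `classify` would return the estimated class together with its accuracy probability.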

Fig. 16. GUI associated with the cervical smear analysis system.<br />

The system has been developed to operate with a range of<br />

image acquisition hardware, examples of which are provided<br />

in Table III.<br />

XI. CONCLUSION<br />

This paper has been concerned with the task of developing<br />

a methodology <strong>and</strong> implementing applications that are concerned<br />

with two key tasks: (i) the partial analysis of an image<br />

in terms of its fractal structure <strong>and</strong> the fractal properties that<br />

characterize that structure; (ii) the use of a fuzzy logic engine<br />

to classify an object based on both its Euclidean <strong>and</strong> fractal<br />

geometric properties. The combination of these two aspects<br />

has been used to define a processing and image analysis engine<br />

that is unique in its modus operandi but entirely generic in<br />

terms of the applications to which it can be applied.<br />

The research has investigated numerous processes for pattern<br />

recognition using fractal geometry as a central processing<br />

kernel. This has led to the design of a new library of pattern<br />

recognition algorithms. The image types considered contain<br />

about 80% useful environmental information for the human.<br />

With rapid advances in video technology, the content of a<br />

video stream is increasing at a rate that is far beyond the<br />

human brain capacity for decision making. This creates<br />
a need for an automatic image processing and<br />
decision making system using artificial intelligence. Such<br />

systems are required in search engines, information databases,<br />

navigation in unknown terrain, interpretation of two dimensional<br />

data, etc.<br />

The creation of logic <strong>and</strong> general purpose hardware for<br />

artificial intelligence is a basic theme for any future development<br />

based on the results reported in this paper for<br />

the applications developed <strong>and</strong> beyond. The results of the<br />

current system can be utilized in a number of different areas<br />

although medical imaging would appear to be one of the<br />

most natural fields of interest because of the nature of the<br />

images available, their complex structures <strong>and</strong> the difficulty<br />

of obtaining accurate diagnostic results which are efficient<br />

<strong>and</strong> time effective. A further extension of our approach is to<br />

consider the effect of replacing the fuzzy logic engine used<br />

to date with an appropriate Artificial Neural Network. It is<br />

not clear as to whether the application of an ANN could<br />

provide a more effective system <strong>and</strong> whether it could provide<br />

greater flexibility with regard to the type of images used <strong>and</strong><br />

the classifications that may be required. Within the context<br />

of this paper, algorithms have been designed that focus on<br />

solving the detection <strong>and</strong> classification problems associated<br />

with the analysis of cervical smear images. In this respect, a<br />
new set of image processing algorithms has been developed<br />
that may have value in a wider class of image processing<br />
and pattern recognition applications, particularly with regard to<br />
medical image analysis.<br />

ACKNOWLEDGMENTS<br />

This work is supported by the Science Foundation Ireland.<br />

The authors are grateful for the advice <strong>and</strong> help of Dr Alastair<br />

Deery (Department of Cellular Pathology, St Georges Hospital,<br />

London), Professor Jonathan Brostoff (Kings College, London<br />

University) <strong>and</strong> Professor Irina Shabalova (Russian Medical<br />

Academy of Postgraduate Education, Moscow).



REFERENCES<br />

[1] J. M. Blackledge, Digital Signal Processing, 2 nd Edition, Horwood<br />

Publishing, 2006.<br />

[2] J. M. Blackledge, Digital Image Processing, Horwood Publishing, 2005.<br />

[3] E. R. Davies. Machine Vision: Theory, Algorithms, Practicalities, Academic<br />

press, London, 1997.<br />

[4] H. Freeman, Machine vision. Algorithms, Architectures, <strong>and</strong> <strong>Systems</strong>,<br />

Academic press, London, 1988.<br />

[5] M. G. Rojo, G. B. Garcia, C. P. Mateos, J. G. Garcia <strong>and</strong> M. C. Vicente,<br />

Critical Comparison of 31 Commercially Available Digital Slide <strong>Systems</strong><br />

in Pathology, Int. J. Surg. Pathol., 14, 285-30, 2006.<br />

[6] J. M. Blackledge <strong>and</strong> D. Dubovitskiy, Object Detection <strong>and</strong> Classification<br />

with Applications to Skin Cancer Screening, ISAST Transactions<br />

on <strong>Intelligent</strong> <strong>Systems</strong>, No. 1, Vol. 1, 34-45, ISSN:1797-1802, 2008.<br />

[7] J. M. Blackledge, D, Dubovitskiy, Surface Inspection using a Computer<br />

Vision System that Includes Fractal Analysis, ISAST Transaction on<br />

Electronics <strong>and</strong> Signal Processing, No. 2, Vol. 3, 76 -89, ISSN:1797-<br />

2329, 2008<br />

[8] Canny J. A computational approach to edge detection. IEEE Trans.<br />

Pattern Analysis <strong>and</strong> Machine Intelligence, (PAMI-8):679–698, 1986.<br />

[9] Shunji Mori, Hideyuki Tamura and Takashi Yamawaki. Textural features<br />
corresponding to visual perception. IEEE Man. and Cybernetics, 6,<br />
1978.<br />

[10] Falconer K. Fractal Geometry. Wiley, 1990.<br />

[11] Pantanowitz L, Henricks W, Beckwith B. Medical laboratory informatics.<br />

Clin Lab Med, 27:823-43, 2007.<br />

[12] Pantanowitz L, Hornish MA, Goulart RA. Computer-assisted cervical<br />

cytology. Medical information Science, 2008.<br />

[13] Yagi Y, Gilberson JR. Digital imaging in pathology: The case for<br />

st<strong>and</strong>ardization. J Telemed Telecare, 11:109-16, 2005.<br />

[14] Jr Louis J. Galbiati. Machine vision <strong>and</strong> digital image processing<br />

fundamentals. State University of New York, New-York, 1990.<br />

[15] Roger Boyle Milan Sonka, Vaclav Hlavac. Image Processing, Analysis<br />

<strong>and</strong> Machine Vision. PWS, USA, 1999.<br />

[16] Wesley E.Snyder Hairong Qi. Machine Vision. Cambridge University<br />

Press, Engl<strong>and</strong>, 2004.<br />

[17] V.S Nalwa <strong>and</strong> T.O.Binford. On detecting edge. IEEE Trans. Pattern<br />

Analysis <strong>and</strong> Machine Intelligence, (PAMI-8):699–714, 1986.<br />

[18] Lotfi A. Zadeh. Fuzzy sets <strong>and</strong> their applications to cognitive <strong>and</strong><br />

decision processes. Academic Press, New York, 1975.<br />

[19] E.H.Mamdani. Advances in linguistic synthesis of fuzzy controllers.<br />

J.Man Mach., 8:669–678, 1976.<br />

[20] E.Sanchez. Resolution of composite fuzzy relation equations.<br />

Inf.Control, 30:38–48, 1976.<br />

[21] N.Vadiee. Fuzzy rule based expert system-I. Prentice Hall, Englewood,<br />

1993.<br />

[22] Contour Tracing Algorithms http://www.cs.mcgill.ca/ aghnei/alg.html<br />

[23] Patrick R.Andrews Martin J.Turner, Jonathan M.Blackledge. Fractal<br />

Geometry in Digital Imaging. Academic Press, London, 1998.<br />

[24] Cancer research uk. http://www.cancerresearchuk.org/<br />

aboutcancer/reducingyourrisk/9314.<br />

[25] J. S. Lim, Two-Dimensional Signal <strong>and</strong> Image Processing, Prentice-Hall,<br />

1990.<br />

Jonathan Blackledge received a BSc in Physics from Imperial College, London University in 1980, a Diploma of Imperial College in Plasma Physics in 1981 and a PhD in Theoretical Physics from King’s College, London University in 1983. As a Research Fellow of Physics at King’s College (London University) from 1984 to 1988, he specialized in information systems engineering, undertaking work primarily for the defence industry. This was followed by academic appointments at the Universities of Cranfield (Senior Lecturer in Applied Mathematics) and De Montfort (Professor in Applied Mathematics and Computing), where he established new post-graduate MSc/PhD programmes and research groups in computer aided engineering and informatics. In 1994, he co-founded Management and Personnel Services Limited, where he is currently Executive Director. His work for Microsharp (Director of R & D, 1998-2002) included the development of manufacturing processes now being used for digital information display units. In 2002, he co-founded a group of companies specializing in information security and cryptology for the defence and intelligence communities, actively creating partnerships between industry and academia (e.g. Lexicon Data Limited). He currently holds the Stokes Professorship in Digital Signal Processing and Information and Communications Technology at Dublin Institute of Technology. He has published over one hundred scientific and engineering research papers and technical reports for industry, six industrial software systems, fifteen patents and ten books, and has supervised sixty research (PhD) graduates. His current research interests include computational geometry and computer graphics, image analysis, nonlinear dynamical systems modelling and computer network security, working in both an academic and commercial context. He holds Fellowships with England’s leading scientific and engineering Institutes and Societies, including the Institute of Physics, the Institute of Mathematics and its Applications, the Institution of Electrical Engineers, the Institution of Mechanical Engineers, the British Computer Society, the Royal Statistical Society and the Chartered Management Institute.

Dmitry Dubovitskiy received a BSc and Diploma in Aeronautical Engineering from Saratov Aviation Technical College in 1993, an MSc in Computer Science and Information Technology from Bauman Moscow State Technical University in 1999 and a PhD in Computer Science from De Montfort University in 2005 under the supervision of Professor J. M. Blackledge. As a project leader in medical imaging at Microsharp Limited from 2002 to 2005, he specialized in information systems engineering, developing image recognition systems for real-time operational diagnosis in medical applications. He founded Oxford Recognition Limited in 2005, which specialises in the applications of artificial intelligence for computer vision. He has developed a range of computer vision systems for industry, including applications for 3D image visualisation, and has been coordinator for the INTAS project in distributed automated systems for acquiring and analysing eye tracking data.
