Computers and Intelligent Systems - isast
ISAST Transactions on Computers and Intelligent Systems, No. 1, Vol. 2, 2010 (ISSN 1798-2448)
Gang LIU, Gang CUI, Hongwei LIU, and Zhibo WU:
A Reliability Enhancement Adaptive Routing Mechanism for Mobile Ad Hoc Networks
J. C. Chedjou, K. Kyamakya, U. A. Khan, and M. A. Latif:
Potential Contribution of CNN-based Solving of Stiff ODEs & PDEs to Enabling Real-Time Computational Engineering
Y. Labrador, M. Karimi, N. Pissinou, and D. Pan:
Performance Comparison of OFDM and Single Carrier Modulations over Satellite Channels
Z. R. Ghobadi and H. Rashidi:
Software Rejuvenation Technique-An Improvement in Applications with Multiple Versions
S. A. Asghari, H. Pedram and H. Taheri:
A New Attitude based on Real Time Operating System for NoC in Hotspot Traffic Model
V. Y. Kontorovich, Z. Lovtchikova, J. A. Meda-Campaña, and K. Tinsley:
Nonlinear Filtering Algorithms for Chaotic Signals: A Comparative Study
H. D. Vankayalapati and K. Kyamakya:
Nonlinear Feature Extraction Approaches for Scalable Face Recognition Applications
R. Karthikeyan, A. Karthikeyan and S. Sivaperumal:
Artificial Human Limbs – A Design Approach for Military Application
K. Kyamakya, J. C. Chedjou, M. A. Latif, and U. A. Khan:
A Novel Image Processing Approach Combining a ‘Coupled Nonlinear Oscillators’-based Paradigm with Cellular Neural Networks for Dynamic Robust Contrast Enhancement
JianLi GUO, HongWei LIU, and XiaoZong YANG:
Common-neighbor Monitoring Enhanced Cooperation Enforcement Scheme for MANETs
J. M. Blackledge:
Systemic Risk Assessment using a Non-stationary Fractional Dynamic Stochastic Model for the Analysis of Economic Signals
J. M. Blackledge and D. A. Dubovitskiy:
An Optical Machine Vision System for Applications in Cytopathology
A Reliability Enhancement Adaptive Routing Mechanism for Mobile Ad Hoc Networks
Gang LIU, Gang CUI, Hongwei LIU, and Zhibo WU
Abstract: Selecting a stable route for data packet transmission effectively reduces the control packet traffic and energy consumption caused by frequent route reconstruction and maintenance in dynamic mobile Ad Hoc networks, and thereby improves the efficiency and extends the lifetime of the network. This paper proposes an algorithm, based on information entropy, for measuring the dynamic characteristics of mobile nodes. It analyzes the uncertainty in the behavior of a node's neighbor set as the set changes over time and uses this uncertainty as a metric for selecting stable routes in mobile Ad Hoc networks. Simulation results show that the stable routing measurement method can markedly improve key performance indicators of mobile Ad Hoc networks, such as packet delivery ratio and packet end-to-end delay.
Index Terms: Mobile Ad Hoc networks, stable routing, uncertainty, information entropy.
1 INTRODUCTION
A mobile Ad Hoc network is a temporary, self-organizing, decentralized multi-hop wireless network formed by a group of autonomous wireless mobile nodes. The nodes can move freely and can join or leave the network at any time while it is running, without sending any notification. As a result, the network topology, the relations between nodes, and the state of the wireless links change constantly. In such a dynamic environment, selecting a stable route for data transmission can effectively reduce the network bandwidth and energy consumed by frequent route reconstruction and maintenance during transmission, and thus improve the utilization of network resources and prolong network lifetime. This is of great significance for mobile Ad Hoc networks, in which network resources and energy supply are relatively limited.

• Gang LIU, Gang CUI, Hongwei LIU and Zhibo WU are with the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China, 150001. E-mail: lg.hit@163.com
• This paper is partially supported by the Hi-Tech Research and Development Program (863) of China under grant No. 2008AA01A201 and the National Natural Science Foundation of China under grant No. 60503015.
Manuscript received April 19, 2009; revised September 11, 2009.
A common strategy in current approaches to stable route selection is to formalize the movement model of the mobile Ad Hoc network nodes. By analyzing the specific movement model of the mobile nodes, the resulting behavior of the wireless links is derived and used as a stability metric during route establishment and selection [1], [2], [3], [4], [5]. The main problem is that the applicability of the resulting route stability measure is limited to the assumed movement model. LENDERS [6] analyzed the impact of actual node mobility on the connection state of wireless links, but gave only qualitative, summary conclusions and no formal quantitative test results.
Another common approach uses accurate, real-time information about the physical positions of the wireless mobile nodes and the changes in their relative speeds as a link or route stability metric [7], [8], [9], [10]. Acquiring and updating node positions and speed changes in real time requires special facilities (such as GPS) to provide the necessary technical support, so this approach is suitable only for certain specific network environments.
ROHIT [11] and GEUNHWI [12] used the changes in signal strength and the signal stability of mobile nodes observed during data transmission as the route selection criterion. In the signal stability based adaptive routing protocol (SSA), the authors used an extended device driver interface to maintain the signal stability table (SST), which provides the cross-layer operation that the protocol stack requires. In addition, XU [13] analyzed the dynamic characteristics of the network topology and applied the results to a hierarchical network architecture, a clustering algorithm and a cluster-based routing protocol in order to maximize network performance and stability. KIM [14] put forward a scalable Ad Hoc routing protocol that uses the logical topology information of the network to improve the protocol's ability to adapt to the growing scale of Ad Hoc networks. In [15], the authors proposed a path stability calculation method based on the correlation factor of the wireless links along a path. It takes into account the correlated state changes of adjacent wireless links, rather than treating each wireless link separately, when evaluating the stability of the full route.
Unlike the above methods, this paper uses an information uncertainty metric. It treats the changing neighbor set of a wireless node as an information source with uncertain properties and uses a quantitative measurement of that uncertainty as the route selection criterion in a dynamic network environment, which provides the necessary support for reliable data transmission.
The rest of this paper is organized as follows. Section 2 introduces a novel method for measuring route stability based on information entropy, together with a new routing protocol, called BSORP, that combines the stability measurement method with the AODV protocol. Section 3 compares the performance of AODV and BSORP to validate the effect of the stability measurement method introduced in this paper. Finally, Section 4 concludes the paper.
2 MODEL OF ROUTING STABILITY
2.1 Stability measurement for wireless mobile nodes
Assume that each wireless mobile node in a mobile Ad Hoc network has a unique identifier and that all nodes have the same wireless transmission radius. The network topology at time $t$ can then be modeled as an undirected graph $G_t = (V, E_t)$, where $V$ denotes the set of all wireless nodes, $|V|$ denotes the number of wireless nodes in $V$, and $E_t$ denotes the set of $|E_t|$ bidirectional wireless links at time $t$.
If node $m$ periodically inspects the members of its set of neighbor nodes with inspection period $\Delta t$, it obtains an ordered sequence of the neighbor sets observed at successive times:

$$S^{T}_{NS}(N) = (ns_{t_1}, ns_{t_2}, \cdots, ns_{t_N}), \quad N = T/\Delta t \tag{1}$$
In the sequence $S^{T}_{NS}(N)$, we regard the neighbor set observed at time $t_i$ as a random variable $NS$. The set of distinct neighbor sets occurring in the sequence is then the value range of this random variable, namely $NS^{T} = \{ns_1, ns_2, \cdots, ns_k\}$, where $ns_{t_i} \in NS^{T}$, $i = 1, \cdots, N$, and $1 \le k \le N$. The probability distribution over the elements of $NS^{T}$ can be computed from the sequence $S^{T}_{NS}(N)$.
The sequence $S^{T}_{NS}(N)$ reflects the dynamic correlation between the moving wireless node $m$ and its neighbor nodes. Information entropy [16], which measures uncertainty, can therefore provide a convergence stability criterion for building and selecting stable routes in mobile Ad Hoc networks. The uncertainty is computed by quantifying the sequence $S^{T}_{NS}(N)$ and the set $NS^{T}$.
To measure the uncertainty of the sequence $S^{T}_{NS}(N)$, we give it another representation by merging consecutive identical neighbor sets and recording the length of each continuous run, namely

$$R^{T}_{NS} = (R_1(ns^1), R_2(ns^2), \ldots, R_l(ns^l)), \quad \text{where } \sum_{i=1}^{l} R_i(ns^i) = N,\; ns^i \in NS^{T},\; l \ge k \tag{2}$$
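The run-length representation in (2) can be illustrated with a minimal sketch. The function name `run_length_encode` and the sample neighbor sets are hypothetical and are not taken from the paper; the sketch only assumes that each sampled neighbor set is available as a collection of node identifiers.

```python
from itertools import groupby


def run_length_encode(neighbor_sets):
    """Collapse consecutive identical neighbor sets into (set, run_length) pairs."""
    # frozenset makes each sampled neighbor set comparable regardless of order
    frozen = [frozenset(ns) for ns in neighbor_sets]
    # groupby groups only adjacent equal items, which is exactly a run
    return [(ns, sum(1 for _ in run)) for ns, run in groupby(frozen)]


# Hypothetical samples of node m's neighbor set at t1..t6 (N = 6)
samples = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "c"}, {"a", "c"}, {"a", "b"}]
runs = run_length_encode(samples)
run_lengths = [length for _, length in runs]
```

In this example there are l = 3 runs but only k = 2 distinct neighbor sets, and the run lengths add up to N, which matches the constraints stated with (2).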
We can measure the uncertainty of the neighbor set of wireless mobile node $m$ from the sequence $R^{T}_{NS}$ by the weighted entropy

$$H_W(R^{T}_{NS}) = -\sum_{i=1}^{l} RC^{i}_{NS}\,\frac{R_i(ns^i)}{N}\,\log\frac{R_i(ns^i)}{N} \tag{3}$$
In the above formula, the weight $RC^{i}_{NS}$ is the change ratio of the members of the neighbor set between consecutive subsequences of the sequence $R^{T}_{NS}$:

$$RC^{i}_{NS} = \begin{cases} 1, & i = 1 \\ 1 - \dfrac{|ns^{i-1} \cap ns^{i}|}{|ns^{i-1} \cup ns^{i}|}, & i > 1 \end{cases} \tag{4}$$
In the same way, the uncertainty of the set $NS^{T}$ is measured by the standard information entropy of a discrete random variable:

$$H(NS^{T}) = -\sum_{j=1}^{k} p(ns_j) \log p(ns_j) \tag{5}$$
From the above results, the metric stability of mobile node $m$ in the Ad Hoc network is defined as

$$MS(m) = 1 - \frac{H_W(R^{T}_{NS})\, H(NS^{T})}{\log N \, \log N} \tag{6}$$
The properties of the metric stability are as follows.
1. By the properties of information entropy, $H(NS^{T})$ and $H_W(R^{T}_{NS})$ both take values in $[0, \log N]$, so $MS(m)$ takes values in $[0, 1]$. The uncertainty increases as the number of members of the set grows and as the probability distribution tends toward uniform, and the metric stability $MS(m)$ decreases accordingly. In the worst case, a different neighbor set is obtained at every sampling, so $MS(m)$ equals 0 and the route cannot transfer data reliably. Conversely, if the same neighbor set is obtained at every sampling, $MS(m)$ equals 1.
2. The differing effects of membership changes in the neighbor set are introduced into the uncertainty metric through the weighting used in computing $H_W(R^{T}_{NS})$. The more strongly the membership of the neighbor set changes, the larger the uncertainty and the lower the stability of the wireless mobile node.
3. The dynamic characteristics of the wireless mobile node are described compactly from both the statistical and the behavioral aspects, because the uncertainty is measured from the value range of the neighbor set and from the distribution of the members within the neighbor set.
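The computation in (2) to (6) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `metric_stability` is hypothetical, the natural logarithm is used, both entropies are taken in the standard non-negative Shannon form (with a leading minus sign), the change ratio in (4) is computed on set cardinalities, and a sequence with fewer than two samples is treated as fully stable because $\log N$ would be zero.

```python
import math
from collections import Counter
from itertools import groupby


def metric_stability(neighbor_sets):
    """Return MS(m) in [0, 1] for a sequence of N sampled neighbor sets."""
    seq = [frozenset(ns) for ns in neighbor_sets]
    n = len(seq)
    if n < 2:
        return 1.0  # assumption: log N = 0, so treat as no observed uncertainty

    # Eq. (5): Shannon entropy over the k distinct neighbor sets
    counts = Counter(seq)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())

    # Eq. (2): runs of consecutive identical neighbor sets
    runs = [(ns, sum(1 for _ in grp)) for ns, grp in groupby(seq)]

    # Eq. (3) and (4): run-length entropy weighted by the membership change ratio
    hw = 0.0
    prev = None
    for ns, length in runs:
        if prev is None:
            weight = 1.0  # first run has weight 1
        else:
            union = prev | ns
            weight = (1.0 - len(prev & ns) / len(union)) if union else 0.0
        p = length / n
        hw -= weight * p * math.log(p)
        prev = ns

    # Eq. (6): normalize by (log N)^2 and subtract from 1
    return 1.0 - (hw * h) / (math.log(n) ** 2)


stable = metric_stability([{"a", "b"}] * 5)                # identical samples
unstable = metric_stability([{"a"}, {"b"}, {"c"}, {"d"}])  # disjoint samples
mixed = metric_stability(
    [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "c"}, {"a", "c"}, {"a", "b"}]
)
```

With these hypothetical inputs, identical samples give a stability of 1, pairwise disjoint samples give a value of 0 up to floating-point error, and the mixed sequence falls strictly between the two, in line with property 1.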
2.2 Stability measurement of the routing
Based on the above stability measurement for wireless mobile nodes, the stability measurement $SR(S,D)$ of the route from the source node $S$ to the destination node $D$ is the product of the stability measurements of all wireless mobile nodes participating in the route, namely

$$SR(S,D) = \prod_{i \in R(S,D)} S(i) \tag{7}$$

Because each stability measurement $S(i)$ falls in $[0, 1]$, the maximum of the route stability measurement is 1. The route stability measurement is affected by two factors: the length of the route $R(S, D)$ and the stability of the wireless mobile nodes on the route. The fewer hops the route $R(S, D)$ has and the more stable its nodes are, the closer $SR(S,D)$ is to 1 and the more stable the route.
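A short sketch shows why the product in (7) favors shorter routes. The function name `route_stability` and the per-node values are hypothetical and chosen only for illustration.

```python
import math


def route_stability(node_stabilities):
    """Eq. (7): product of the per-node stability values along a route."""
    for s in node_stabilities:
        if not 0.0 <= s <= 1.0:
            raise ValueError("each node stability must lie in [0, 1]")
    # math.prod returns 1 for an empty sequence, the upper bound of the metric
    return math.prod(node_stabilities)


two_hop = route_stability([0.9, 0.9])             # short route
four_hop = route_stability([0.9, 0.9, 0.9, 0.9])  # longer route, same node quality
```

Because every factor is at most 1, adding a node can never raise the product, so with equally stable nodes the two-hop route scores higher than the four-hop route.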
2.3 On-demand routing protocol based on stability measurement (BSORP)
On the basis of the Ad Hoc on-demand distance vector routing protocol (AODV) [17], we put forward BSORP, which builds and selects stable routes. The effect of the route stability calculation method presented in this paper on network performance is analyzed by simulation, comparing the performance of the improved routing protocol with that of the original on-demand distance vector routing protocol in the same network environment.
To use the mobile node stability measurement as the path selection metric during route establishment, the routing table and the associated control packet structures of the routing protocol need to be extended. A route stability metric field is added to the routing table entry of the original AODV routing protocol. It records the route stability metric from the source node to the current forwarding node. In the route discovery process, each RREQ packet carries an additional field for the current route stability metric (the stability metric of the route from the RREQ source node to the node currently forwarding the RREQ packet). When an intermediate forwarding node receives a valid RREQ control packet, it first updates and records the route stability metric from the node that originated the RREQ to the current node, and then continues the route discovery process. In addition, the RREP control packet carries a route stability metric field as the stability indicator for the source node of the RREQ packet. The value recorded in the routing table entry of the RREQ source node is the stability measurement of the full route.
When a wireless mobile node receives multiple copies of the same RREQ packet during route discovery, it compares the route stability carried by the newly received copy with the value recorded in its current routing table entry. If the new copy indicates a more stable route, or the stability measurements are equal but the new copy has traveled fewer hops, the RREQ control packet is not discarded. Instead, the node updates the stability metric field of the routing table entry and redirects the next hop to the node that forwarded this RREQ copy. Otherwise, the duplicate copy of the RREQ is discarded. In addition, in order to obtain a more stable route, the RREQ destination node continues to answer further copies of the RREQ packet whose route stability measurement is better than the best value it has recorded so far, until its RREQ response timer times out.
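The duplicate-RREQ rule can be sketched as a small decision function. This is an illustration only: the names `RouteEntry` and `handle_duplicate_rreq` are hypothetical and do not come from AODV or from the paper, and the sketch assumes that a larger stability value denotes a more stable route, consistent with $SR(S,D)$ tending to 1 for stable routes in Section 2.2.

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    stability: float  # best route stability recorded so far for this RREQ
    hop_count: int    # hops traveled by the copy that set the entry
    next_hop: str     # neighbor from which that copy was received


def handle_duplicate_rreq(entry, stability, hop_count, sender):
    """Return True if the duplicate RREQ updates the entry and is processed further."""
    # Assumption: higher stability is better (SR close to 1 means a stable route)
    more_stable = stability > entry.stability
    tie_but_shorter = stability == entry.stability and hop_count < entry.hop_count
    if more_stable or tie_but_shorter:
        entry.stability = stability
        entry.hop_count = hop_count
        entry.next_hop = sender  # redirect toward the node that forwarded this copy
        return True
    return False  # otherwise the duplicate copy is discarded


entry = RouteEntry(stability=0.5, hop_count=3, next_hop="n1")
accepted_better = handle_duplicate_rreq(entry, 0.6, 4, "n2")   # more stable route
rejected_same = handle_duplicate_rreq(entry, 0.6, 4, "n3")     # no improvement
accepted_shorter = handle_duplicate_rreq(entry, 0.6, 3, "n4")  # tie, fewer hops
```

Stability is compared first and hop count is used only to break ties, so a longer but more stable route can replace a shorter, less stable one.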
Fig. 1 shows the route redirection at a forwarding node during RREQ processing and the multiple responses of the final destination node of the RREQ packet during route establishment in a mobile Ad Hoc network.
Fig. 1. Routing redirection and multiple response operations in the process of routing search

3 SIMULATION RESULTS
3.1 Simulation evaluation indicators
The simulation is based on the NS2 (v2.31) network simulator. The movement model of the wireless mobile nodes is the Random Waypoint Model (RWM), and the distributed coordination function (DCF) of the IEEE 802.11 specification is used as the MAC protocol. All wireless mobile nodes in the Ad Hoc network move randomly within a rectangular area of 1800 m × 900 m.
Other relevant simulation parameters are shown in Table 1.

TABLE 1
Simulation parameters
Parameter             Value
Simulation time       800 s
Transmission range    250 m
Receiver range        250 m
Number of nodes       50
Maximum pause time    50 s
Traffic type          CBR
Packet size           512 bytes
CBR rate              5 pkt/s
3.2 Simulation environment
In the simulation, we compare the performance of the original AODV routing protocol and the BSORP routing protocol under the same parameter settings in terms of key network performance indicators: the packet successful delivery ratio at the application layer, the packet end-to-end delay at the application layer, and the network control load overhead.
Packet successful delivery ratio: the ratio of the number of packets successfully received at the application layer by the destination nodes of all CBR data flows to the total number of packets issued by the source nodes of all CBR data flows in the mobile Ad Hoc network.
Packet end-to-end delay: the average transmission delay of the CBR data flows, measured from the time the application layer of the source node sends a data packet until the packet reaches the application layer of its final destination node.
Routing protocol control load overhead: the ratio of the number of control packets sent by the network layer routing protocol during the simulation to the total number of packets sent.
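The three indicators can be expressed as simple functions of simulation counters. The sketch below is illustrative: all function and parameter names are hypothetical, it is not tied to the NS2 trace format, and it assumes that "all packets sent" in the overhead definition means control packets plus data packets.

```python
def delivery_ratio(received, sent):
    """Application-layer packets received divided by packets sent."""
    return received / sent if sent else 0.0


def average_delay(send_times, recv_times):
    """Mean end-to-end delay in seconds over delivered packets.

    Both arguments map a packet id to a timestamp; lost packets are
    simply absent from recv_times and do not contribute to the mean.
    """
    delays = [recv_times[p] - send_times[p] for p in recv_times if p in send_times]
    return sum(delays) / len(delays) if delays else 0.0


def control_overhead(control_packets, data_packets):
    """Control packets as a fraction of all packets sent (assumed: control + data)."""
    total = control_packets + data_packets
    return control_packets / total if total else 0.0


pdr = delivery_ratio(received=80, sent=100)
delay = average_delay({1: 0.0, 2: 1.0, 3: 2.0}, {1: 0.5, 2: 2.0})
overhead = control_overhead(control_packets=30, data_packets=70)
```

In the hypothetical run above, packet 3 is never delivered, so it lowers the delivery ratio in a real trace but is excluded from the delay average.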
3.3 Analysis of the simulation results
Fig. 2 compares the packet successful delivery ratio of the two routing protocols in the simulation. The results show that the moving speed of the wireless mobile nodes in an Ad Hoc network has a considerable impact on the quality of data transmission: the delivery ratio of both routing protocols decreases significantly as the nodes move faster. However, because BSORP selects relatively stable network nodes for data transmission when establishing a route, it effectively improves the packet delivery ratio and slows the rapid decline of the delivery ratio as node speed increases.
Fig. 3 shows the effect of BSORP on the packet end-to-end delay in the Ad Hoc network. The simulation results show that the BSORP protocol significantly reduces the end-to-end delay of data packet transmission. Because the stable route selection strategy of BSORP accumulates the stability metric multiplicatively during route discovery, the number of forwarding nodes is an important factor: a route with fewer forwarding nodes has a competitive advantage. In addition, by choosing relatively stable mobile nodes to take part in data forwarding, BSORP significantly reduces the packet delay caused by the route maintenance and reconstruction that follow frequent wireless link disruptions.
Fig. 2. Packet successful delivery ratio at different motion speed
[Plot: packet successful delivery ratio (axis range 0.40 to 1.00) versus motion speed of wireless node (5 to 25 m/s); series: AODV, BSORP]
Fig. 3. Packet end-to-end delay at different motion speed
[Plot: packet end-to-end delay in seconds (axis range 0.00 to 2.00) versus motion speed of wireless node (5 to 25 m/s); series: AODV, BSORP]
Fig. 4 and Fig. 5 show further performance differences between the AODV routing protocol and the BSORP routing protocol during data transfer in the same network environment, namely the number of route interruptions and the control load. In mobile Ad Hoc networks, the route maintenance and reconstruction operations triggered by frequent wireless link disruptions are a major cause of increased control load. This is especially true for on-demand routing protocols, in which a route disruption leads to heavy control overhead because the network is flooded to establish or maintain the necessary route. Fig. 4 shows that choosing relatively stable mobile nodes to participate in data forwarding significantly reduces the likelihood of route interruption, and the advantage of the stability-based routing strategy becomes more pronounced as node speed and network dynamics increase. Route interruptions and control load are strongly correlated in a dynamic network environment, as the control load overhead of the two routing protocols shows: the control load of an on-demand routing protocol can be effectively reduced by avoiding frequent route disruptions.
Fig. 4. The number of routing disruption for mobile routing
[Plot: route broken times (axis range 0 to 4000) versus motion speed of wireless node (5 to 25 m/s); series: AODV, BSORP]
The above simulation results show that stable route selection has a major impact on network performance in Ad Hoc networks: the relevant performance evaluation indicators improve clearly. This means that the network can achieve higher resource utilization efficiency and a longer lifetime in a resource-constrained environment, which is of important significance for the practical application of Ad Hoc networks.
Fig. 5. Routing control load
[Plot: routing overhead (axis range 0.00 to 0.60) versus motion speed of wireless node (5 to 25 m/s); series: AODV, BSORP]

4 CONCLUSIONS
This paper presents a method for calculating route stability, which measures the stability of mobile wireless nodes by the uncertainty of the membership changes in their neighbor sets. The method is used to improve the on-demand distance vector routing protocol (AODV): stable wireless mobile nodes are chosen as the active participants of a route in order to improve network performance in a dynamic network environment. Simulation results show that the proposed stable routing method effectively improves several important network performance indicators, such as the packet delivery ratio and the end-to-end delay.
5 ACKNOWLEDGMENTS
The authors would like to thank the National Natural Science Foundation of China (60503015) and the National High Technology Research and Development Program of China (863) (2008AA01A201).
REFERENCES
[1] C. Carofiglio, C. Chiasserini, M. Garetto, and E. Leonardi, “Route stability in MANETs under the random direction mobility model,” IEEE Transactions on Mobile Computing, vol. 8, no. 9, pp. 1167–1179, 2009.
[2] M. Garetto and E. Leonardi, “Analysis of random mobility models with partial differential equations,” IEEE Transactions on Mobile Computing, vol. 6, no. 11, pp. 1204–1217, 2007.
[3] W. Su, S. Lee, and M. Gerla, “Mobility prediction and routing in Ad Hoc wireless networks,” International Journal of Network Management, vol. 11, no. 1, pp. 3–30, 2001.
[4] S. Misra, S. Dhurandher, M. Obaidat, N. Nangia, N. Bhardwaj, P. Goyal, and S. Aggarwal, “Node stability-based location updating in mobile Ad-Hoc networks,” IEEE Systems Journal, vol. 2, no. 2, pp. 237–247, 2008.
[5] J. Liu, W. Guo, X. B.L., and F. Huang, “Path holding probability based Ad Hoc on-demand routing protocol,” Journal of Software, vol. 18, no. 3, pp. 693–701, 2007.
[6] V. Lenders, J. Wagner, S. Heimlicher, M. May, B. Plattner, and E. Zurich, “An empirical study of the impact of mobility on link failures in an 802.11 Ad Hoc network,” IEEE Wireless Communications, vol. 15, no. 6, pp. 16–21, 2008.
[7] S. Prince and B. Stephen, “On the behavior of communication links of a node in a multi-hop mobile environment,” in Proceedings of 5th ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2004, pp. 145–156.
[8] W. Tang and W. Guo, “A path reliable routing protocol in mobile ad hoc networks,” in Proceedings of 4th International Conference on Mobile Ad-Hoc and Sensor Networks, 2008, pp. 203–207.
[9] S. Xu, K. Blackmore, and H. Jonees, “Mobility assessment for MANETS requiring persistent links,” in Proceedings of International Conference on Mobile System, Applications and Services, 2005, pp. 39–44.
[10] J. Sumesh and V. Anand, “Mobility aware path maintenance in ad hoc networks,” in Proceedings of the 2009 ACM Symposium on Applied Computing, 2009, pp. 201–206.
[11] D. Rohit, D. Cynthia, K. Wang, and K. Satish, “Signal stability based adaptive routing (SSA) for Ad Hoc mobile networks,” IEEE Personal Communications, vol. 4, no. 1, pp. 36–45, 1997.
[12] G. Lim, K. Shim, S. Kim, and H. Yoon, “Signal strength-based link stability estimation in ad hoc wireless networks,” Electronics Letters, vol. 39, no. 5, pp. 485–486, 2003.
[13] Y. Xu and W. Wang, “Topology stability analysis and its application in hierarchical mobile ad hoc networks,” IEEE Transactions on Vehicular Technology, vol. 58, no. 3, pp. 1546–1560, 2009.
[14] H. Kim and M. Yoo, “A Scalable Ad Hoc routing protocol based on logical topology for ubiquitous community network,” in Proceedings of the 9th International Conference on Advanced Communication Technology, vol. 2, 2007, pp. 1306–1310.
[15] H. Zhang and Y. Dong, “A novel path stability computation model for wireless Ad Hoc Networks,” IEEE Signal Processing Letters, vol. 14, no. 12, pp. 928–931, 2007.
[16] C. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 12, pp. 379–423, 623–656, 1948.
[17] C. Perkins and E. Royer, “Ad hoc on-demand distance vector routing (AODV),” 2003.
Gang Liu is a Ph.D. student at HIT. His research interests include<br />
ad hoc networks and dependable computing.<br />
Gang CUI is a professor at HIT. His research interests include<br />
fault-tolerant computing technology, computer architecture, ad<br />
hoc networks, and wireless sensor networks.<br />
Hongwei Liu is a professor at HIT. His research interests include<br />
fault-tolerant computing technology, ad hoc networks, and<br />
wireless sensor networks.<br />
Zhibo WU is a professor at HIT. His research interests include<br />
fault-tolerant computing technology, computer architecture, ad<br />
hoc networks, and wireless sensor networks.<br />
8 ISAST Transactions on <strong>Computers</strong> <strong>and</strong> <strong>Intelligent</strong> <strong>Systems</strong>, No. 1, Vol. 2, 2010 (ISSN 1798-2448)<br />
Potential Contribution of CNN-based Solving of Stiff ODEs<br />
& PDEs to Enabling Real-Time Computational Engineering<br />
Jean Chamberlain Chedjou (1), Cyrille Kalenga Wa Ngoy (2), Michel Matalatala Tamasala (2), Kyandoghere Kyamakya (1)<br />
(1): Transportation Informatics Group, Institute of Smart Systems Technologies, University of Klagenfurt (Austria),<br />
Email: kyandoghere.kyamakya@uni-klu.ac.at ; jean.chedjou@uni-klu.ac.at<br />
(2): Department of Electrical and Computer Engineering, Polytechnic Faculty, University of Kinshasa (D. R. Congo)<br />
Email: Cyrille.Kalenga@vodacom.cd ; Michel.Matalatala@vodacom.cd<br />
Abstract: One of the most common ways to avoid<br />
complexity when numerically solving stiff ordinary differential<br />
equations (ODEs) is to approximate them by ignoring the<br />
nonlinear terms. For stiff partial differential equations<br />
(PDEs) the same is done by suppressing the nonlinear<br />
terms of the Taylor series expansion. In doing so, the<br />
traditional methods for solving stiff PDEs and ODEs<br />
compromise on both the efficiency and the precision of the resulting<br />
computations. This inevitably leads to less accurate results,<br />
which cannot provide the full insight that may be<br />
needed into the ‘real’ nonlinear<br />
dynamical behavior of the various engineering and<br />
natural systems (generally modeled by nonlinear differential<br />
equations of ODE or PDE type) that are analyzed in the<br />
novel discipline called Computational Engineering.<br />
For many of these systems, real-time simulation and/or<br />
control of the behavior is desired or even required; this places<br />
extremely demanding requirements on the computing<br />
capability with regard to both speed and precision.<br />
This paper proposes, and validates through a series of<br />
examples, a comprehensive high-precision and ultra-fast<br />
computing concept for solving stiff ODEs and PDEs with<br />
Cellular Neural Networks (CNN). The core of this concept is a<br />
straightforward scheme that we call ‘Nonlinear Adaptive<br />
Optimization (NAOP)’, which is used for a precise template<br />
calculation for solving any (stiff) nonlinear ODE through CNN<br />
processors. One of the key contributions of this work, which we<br />
regard as a real breakthrough, is to demonstrate the possibility of<br />
mapping the different types of nonlinearity displayed<br />
by various classical and well-known oscillators (e.g. van der Pol,<br />
Rayleigh, Duffing, Rössler, Lorenz, and Jerk oscillators, to<br />
name just a few) onto first-order CNN elementary cells,<br />
thereby enabling the easy derivation of the corresponding CNN<br />
templates. Furthermore, for PDE solving, the same concept<br />
also allows a mapping onto first-order CNN cells while<br />
retaining one or more nonlinear terms of the Taylor<br />
series expansion generally used to transform a PDE into<br />
a set of coupled nonlinear ODEs. The concept of this<br />
paper therefore contributes significantly to the consolidation of CNN<br />
as a universal and ultra-fast solver of stiff differential equations<br />
(both ODEs and PDEs). This enables a CNN-based, real-time,<br />
ultra-precise, and low-cost Computational Engineering. As<br />
proof of concept, some well-known prototypes of stiff equations<br />
(van der Pol, Lorenz, and Rössler oscillators) have been<br />
considered; the corresponding precise CNN templates are<br />
derived to obtain precise solutions of these equations.<br />
The concept can be implemented even on<br />
embedded digital platforms (e.g. FPGA, DSP, GPU); this<br />
opens a broad range of applications. Ongoing work (as outlook)<br />
uses NAOP to derive precise templates for a selected set<br />
of practically interesting PDE models such as Navier-Stokes,<br />
Schrödinger, Maxwell, etc.<br />
Keywords: Stiff ODEs and PDEs, CNN-based differential equation<br />
solving, high-precision computing, ultra-fast computing, NAOP<br />
scheme for CNN template calculation.<br />
I. INTRODUCTION<br />
The last decades have witnessed tremendous interest in<br />
solving nonlinear and stiff models (ODEs and/or PDEs) with<br />
the CNN paradigm [1]. The interest devoted to solving stiff<br />
models can be explained by their many potential<br />
applications, especially in the context of so-called Computational<br />
Engineering. Indeed, nonlinear models have been<br />
used intensively to understand, predict and describe the<br />
dynamical behavior of various engineering or natural systems.<br />
In the field of transportation and logistics, for example, traffic<br />
models take the form of ODEs and/or PDEs [2]. Also in the<br />
field of transportation, various image processing tasks that<br />
are of high importance for visual sensors in Advanced Driver<br />
Assistance Systems (e.g. contrast enhancement, segmentation,<br />
edge detection) can be expressed as the solution of<br />
corresponding stiff ODEs and/or PDEs [3].<br />
Diverse contributions have been made to develop<br />
analytical, numerical and even hardware-based approaches to<br />
solving stiff ODEs and/or PDEs [1]-[20]. Among these<br />
contributions, those addressing “the<br />
solutions of PDEs and ODEs using the CNN-paradigm” have retained our attention. The<br />
strongest points of the CNN paradigm are its flexibility and its huge potential<br />
to enable a renaissance of the old “analog computing” through<br />
emulation on digital platforms (e.g. FPGA or GPU) that<br />
perform ultra-fast and accurate computing of nonlinear models.<br />
Nevertheless, the relevant<br />
state of the art provides no significant information on<br />
a straightforward method for calculating the CNN templates<br />
needed to solve stiff ODEs and/or PDEs with the CNN<br />
paradigm. Despite intensive work in this<br />
direction, it is still unclear how to solve PDEs and/or ODEs<br />
with good accuracy or high precision. Only approximate<br />
solutions exist, for example the use of CNN processors to<br />
approximate numerical solutions of PDEs via the<br />
finite difference method [7], [10]-[14]. This latter approach<br />
does not provide accurate results because the Taylor series<br />
expansion is considered only up to the first order (i.e.<br />
linear expansion). A further interesting published approach to
solve PDEs is the group of learning schemes that produce an<br />
approximate solution of PDEs through CNN processors<br />
[15]-[20]. This latter approach requires some initial solutions<br />
along with some critical parameter settings of the equations<br />
under investigation in order to enable the training process. This<br />
is clearly a significant drawback, as it is not always possible to<br />
provide this data when dealing with stiff<br />
ODEs and/or PDEs.<br />
Our aim in this paper is therefore to enrich<br />
the relevant state of the art by<br />
developing a systematic methodology (based on the<br />
CNN paradigm) that should help to resolve some of the<br />
problems left unsolved by the classical approaches described<br />
above. The key challenge is to develop a CNN-based<br />
computing concept for both ultra-fast and<br />
high-precision computing of stiff differential equations. The<br />
proposed method is based on a nonlinear adaptive optimization<br />
scheme to which we give the acronym “NAOP”. As proof of<br />
concept, the novel approach developed in this paper is applied<br />
to derive solutions of selected classical and well-known<br />
examples of stiff ODEs. The flexibility of the<br />
approach is then discussed, and we<br />
explain how it extends easily to<br />
solving stiff PDEs with similar efficiency.<br />
The rest of the paper is organized as follows. Section 2<br />
presents an in-depth description of the novel concept. The<br />
essence of NAOP is explained, and we describe<br />
the scheme for deriving appropriate CNN template values for<br />
any given nonlinear ODE. Section 3 then focuses on the<br />
proof of concept through a selected nonlinear differential<br />
equation that is solved using the new concept developed in this<br />
paper: the van der Pol equation. For this, corresponding<br />
‘precise’ templates are calculated through NAOP. Section 4<br />
discusses the possible extension of the NAOP-based scheme<br />
to solving PDEs. Finally,<br />
concluding remarks are presented in Section 5, along with<br />
some interesting open research questions<br />
(outlook) that are under investigation in our ongoing<br />
work.<br />
II. THE CONCEPT OF “NAOP” FOR CNN TEMPLATE<br />
CALCULATION AND SOLUTIONS OF STIFF ODES<br />
This section describes the approach based on the Nonlinear<br />
Adaptive Optimization (NAOP) for solving ODEs. The<br />
overall flow diagram of this approach is schematically<br />
displayed by the synoptic representation in Fig. 1.<br />
NAOP is performed by a complex computing<br />
procedure that works on two inputs.<br />
The first input contains wave-solutions generated by the<br />
state-control CNN network modeled by (1):<br />
$$\frac{dx_i}{dt} = -x_i + \sum_{j=1}^{M}\left[\hat{A}_{ij}x_j + A_{ij}y_j + B_{ij}u_j\right] + I_i \tag{1}$$<br />
The second input contains wave-solutions of the model,<br />
i.e. the linear or nonlinear differential equation under<br />
investigation, which can be rewritten in the following<br />
simplified form as a set of coupled second-order ODEs (see<br />
(2)):<br />
$$\frac{d^2 y_i}{dt^2} = F\left(y_i, y_i^{n}, \dot{y}_i^{m}, z_i, z_i^{n}, \dot{z}_i^{m}, t\right) \tag{2a}$$<br />
$$\frac{d^2 z_j}{dt^2} = F\left(z_j, z_j^{n}, \dot{z}_j^{m}, y_j, y_j^{n}, \dot{y}_j^{m}, t\right) \tag{2b}$$<br />
Figure 1. Synoptic representation of the key steps involved in the NAOP<br />
approach used for precise template calculation for solving both linear and<br />
nonlinear differential equations.<br />
The output of the NAOP system will generate, after<br />
extensive iterative computations or ‘training’ steps,<br />
appropriate CNN-templates to solve the corresponding ODEs<br />
(see (2)) when the convergence of the training process is<br />
achieved.<br />
The global process to derive the CNN templates (i.e.<br />
NAOP) can be summarized as follows. The learning/training<br />
process is based on a mapping between the two inputs of the<br />
NAOP procedure. Convergence to a local minimum is the key<br />
objective governing this template calculation process.<br />
To achieve this, various basins of attraction are<br />
investigated sequentially, and corresponding CNN templates<br />
are determined for the various initial conditions. If some<br />
local attractors diverge from a local minimum, new sets of<br />
initial conditions are automatically generated to counter the<br />
divergence, leading to a possible convergence to a local<br />
minimum. A large number of randomly generated attractors<br />
(either regular or chaotic) are obtained through various<br />
numerical simulations, whereby each attractor corresponds to a<br />
specific set of CNN templates. An attempt to map these<br />
attractors to those generated by the model under investigation<br />
is performed in a sequential process, which converges<br />
to a local minimum when the mapping is<br />
achieved successfully. However, it is worth<br />
mentioning that our various<br />
numerical simulations of the training process have revealed that it is very<br />
difficult to find the optimal solution (i.e. the local<br />
minimum). This difficulty can be explained by the well-known<br />
inherent local minimum problem of the Hopfield neural<br />
network [8]-[9]. To overcome this problem, various basins of<br />
attraction are generated within the NAOP process,<br />
and this generation is conducted sequentially until the<br />
internal dynamics of the global network of coupled oscillators<br />
converges to stable states. This convergence must be achieved<br />
in both the ‘CNN templates’ and the ‘attractors’, which are all<br />
treated as dynamic variables during the<br />
learning/training process. It is further worth mentioning that<br />
NAOP is in essence an<br />
adaptive training process that is closely comparable to the<br />
concept developed for training Hopfield neural<br />
networks towards an efficient tracking of local minima.<br />
Nevertheless, NAOP has been demonstrated capable of<br />
mapping all known nonlinearities of ODEs onto appropriate<br />
templates of a first-order CNN processor matrix.<br />
Figure 2a: Convergence of the state-control CNN templates (panels Â11, Â12, Â21, Â22) as achieved by the<br />
NAOP process for the following values of the system parameters: Є = 0.25 and<br />
ω = 1.<br />
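The restart-and-refine loop described in this section can be illustrated with a generic multi-start random search. This is not the NAOP algorithm itself, whose update rules are not given in the text: the linear two-cell surrogate model, the hypothetical function names (`simulate`, `mismatch`, `multi_start_fit`) and every numerical setting below are assumptions chosen only for illustration.<br />

```python
import random


def simulate(T, x0, dt=0.01, steps=300):
    """Euler trajectory of an assumed 2-cell linear surrogate dx/dt = -x + T x."""
    x = list(x0)
    out = [tuple(x)]
    for _ in range(steps):
        dx = [-x[i] + T[i][0] * x[0] + T[i][1] * x[1] for i in range(2)]
        x = [x[i] + dt * dx[i] for i in range(2)]
        out.append(tuple(x))
    return out


def mismatch(T, target, x0):
    """Sum of squared differences between the surrogate and target trajectories."""
    traj = simulate(T, x0, steps=len(target) - 1)
    return sum((a - c) ** 2 + (b - d) ** 2
               for (a, b), (c, d) in zip(traj, target))


def multi_start_fit(target, x0, restarts=4, iters=150, seed=0):
    """Restart a greedy random search from several random templates.

    Each restart plays the role of a new 'basin of attraction'; the best
    local minimum over all restarts is returned.
    """
    rng = random.Random(seed)
    best_loss, best_T, history = float("inf"), None, []
    for _ in range(restarts):
        T = [[rng.uniform(-2.0, 2.0) for _ in range(2)] for _ in range(2)]
        cur = mismatch(T, target, x0)
        step = 0.5
        for _ in range(iters):
            cand = [[t + rng.gauss(0.0, step) for t in row] for row in T]
            c = mismatch(cand, target, x0)
            if c < cur:            # accept only improvements
                T, cur = cand, c
            else:                  # otherwise shrink the search radius
                step = max(0.01, step * 0.97)
        history.append(cur)
        if cur < best_loss:
            best_loss, best_T = cur, T
    return best_T, best_loss, history


if __name__ == "__main__":
    x0 = (1.0, 0.0)
    target = simulate([[0.5, -1.0], [1.0, 0.5]], x0)  # synthetic target
    T, loss, hist = multi_start_fit(target, x0)
    print("loss per restart:", hist)
    print("best template:", T)
```

Different restarts generally end at different losses, which mirrors the local-minimum difficulty noted above; keeping the best of several restarts is the simplest way to mitigate it.<br />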
III. APPLICATIONS TO SOLVING STIFF ODES<br />
We restrict our analysis to the case of the van der Pol<br />
oscillator, a well-known prototype of a self-sustained<br />
oscillator that has the interesting characteristic of<br />
being able to generate sinusoidal, quasi-periodic, and<br />
relaxation oscillations (see (3)):<br />
$$\frac{d^2x}{dt^2} - \varepsilon\left(1 - x^2\right)\frac{dx}{dt} + \omega x = 0 \tag{3}$$<br />
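Equation (3) can be integrated directly once it is written as the first-order system dx/dt = v, dv/dt = Є(1 − x²)v − ωx. The following is a minimal sketch of such a direct simulation with a classical fourth-order Runge-Kutta step; the function names, the initial state, the step size and the run length are assumptions made for illustration and are not taken from the paper.<br />

```python
def vdp_rhs(state, eps, omega):
    """Right-hand side of (3) as a first-order system in (x, v = dx/dt)."""
    x, v = state
    return (v, eps * (1.0 - x * x) * v - omega * x)


def rk4_step(f, s, dt):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(s)
    k2 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
    k3 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
    k4 = f(tuple(si + dt * ki for si, ki in zip(s, k3)))
    return tuple(si + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def simulate_vdp(eps=0.25, omega=1.0, x0=0.1, v0=0.0, dt=0.01, steps=20000):
    """Return the sampled waveform x(t) of the van der Pol oscillator."""
    s = (x0, v0)
    xs = [x0]
    for _ in range(steps):
        s = rk4_step(lambda q: vdp_rhs(q, eps, omega), s, dt)
        xs.append(s[0])
    return xs


def late_amplitude(xs, tail=5000):
    """Peak |x| over the last `tail` samples, after the transient has died out."""
    return max(abs(x) for x in xs[-tail:])


if __name__ == "__main__":
    for eps in (0.25, 1.0):
        amp = late_amplitude(simulate_vdp(eps=eps))
        print("eps =", eps, "steady-state amplitude ~", round(amp, 3))
```

For both parameter values the trajectory settles on a limit cycle with amplitude close to 2, regardless of the small initial state.<br />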
Two possible states can be generated by (3). The first is the<br />
sinusoidal or almost sinusoidal state (Є ≪ 1); the second is the<br />
relaxation state (Є ≫ 1). We now want to solve<br />
(3) using the CNN paradigm. We consider the case where<br />
Є = 0.25 and ω = 1. For these parameter values the NAOP concept<br />
has been used to calculate the corresponding templates<br />
after convergence of the training process. This convergence is<br />
illustrated by the plots presented in Figs. 2a and 2b,<br />
which show the temporal evolution of both the state-control<br />
templates Âij (see Fig. 2a) and the feedback templates Aij<br />
(see Fig. 2b). As these figures show, convergence<br />
is achieved after a long transient phase of the global<br />
training network. It is worth mentioning that<br />
convergence is achieved for suitable basins of<br />
attraction. From Fig. 2, one can read off the following<br />
CNN templates, which are then used to solve the<br />
van der Pol equation:<br />
Â11 = 1.0770, Â12 = −0.6300, Â21 = 1.3450, Â22 = 0.5850,<br />
A11 = 0.4473, A12 = −0.2586, A21 = 0.4846, A22 = 0.1310.<br />
Figure 2b: Convergence of the feedback templates (panels A11, A12, A21, A22) achieved by the NAOP<br />
process for the following values of the system parameters: Є = 0.25 and ω = 1.<br />
This set of template values has been inserted in Fig. 3 to<br />
obtain the solution of (3) through the CNN paradigm. Fig. 3<br />
is a general SIMULINK representation of a CNN<br />
processor platform for solving second-order nonlinear ordinary<br />
differential equations. The key contribution of our approach,<br />
which we regard as a breakthrough, is that we are now capable of<br />
mapping any type of nonlinearity displayed by<br />
nonlinear coupled and uncoupled ODEs onto the type of<br />
nonlinearity displayed by the elementary first-order CNN cell<br />
model. As proof of concept of the approach developed in this<br />
paper, we have used the CNN templates derived by this<br />
scheme to obtain the exact solutions of (3). The graphical<br />
representation of the CNN processors for second-order ODEs<br />
presented in Fig. 3 has been used for rapid prototyping<br />
purposes (a hardware implementation on DSP, FPGA<br />
or GPU platforms is then straightforward). A direct numerical<br />
simulation of the same equation, i.e. (3), has also been<br />
performed using MATLAB, and a comparison between these<br />
two results is shown in Fig. 4. As clearly appears in Fig.<br />
4a and Fig. 4c, the result (i.e. the solution of (3)) of the<br />
approach based on the CNN paradigm developed in this paper<br />
and the result (i.e. Fig. 4b and Fig. 4d) of a<br />
direct numerical solution of (3) in<br />
MATLAB are in very good agreement (i.e. same<br />
amplitude and same frequency of<br />
oscillations).<br />
Figure 3. SIMULINK graphical representation of the CNN computing<br />
platform to solve (3).<br />
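As a rough software counterpart to the two-cell platform, the state equation (1) can be integrated with the template values quoted in this section. The sketch below is not a reproduction of the SIMULINK model: it assumes the standard Chua-Yang piecewise-linear output y = 0.5(|x + 1| − |x − 1|), zero control input and zero bias (B = 0, I = 0), and an arbitrary small initial state, none of which is specified in the text. The function names are hypothetical.<br />

```python
# Template values as quoted in the text (for eps = 0.25, omega = 1).
A_HAT = [[1.0770, -0.6300], [1.3450, 0.5850]]   # state-control templates
A_FB = [[0.4473, -0.2586], [0.4846, 0.1310]]    # feedback templates


def cnn_output(x):
    """Assumed Chua-Yang output nonlinearity y = 0.5(|x + 1| - |x - 1|)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_rhs(x):
    """Right-hand side of (1) for M = 2 cells with B = 0 and I = 0 (assumed)."""
    y = [cnn_output(v) for v in x]
    return [-x[i] + sum(A_HAT[i][j] * x[j] + A_FB[i][j] * y[j]
                        for j in range(2))
            for i in range(2)]


def rk4_step(f, x, dt):
    """One classical fourth-order Runge-Kutta step for a list-valued state."""
    k1 = f(x)
    k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + dt * ki for xi, ki in zip(x, k3)])
    return [xi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate_cnn(x0=(0.1, 0.0), dt=0.01, steps=20000):
    """Return the list of cell states [x1, x2] sampled every dt."""
    x = list(x0)
    traj = [x]
    for _ in range(steps):
        x = rk4_step(cnn_rhs, x, dt)
        traj.append(x)
    return traj


if __name__ == "__main__":
    tail = simulate_cnn()[-5000:]
    print("late peak |x1|:", max(abs(p[0]) for p in tail))
    print("late peak |x2|:", max(abs(p[1]) for p in tail))
```

Under these assumptions the origin is an unstable focus inside the linear region of the output function, while the dynamics contract once both cells saturate, so the states settle into a bounded self-sustained oscillation. Whether its waveform matches Fig. 4 depends on the unstated details of the original setup.<br />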
The method proposed in this paper is of interest because it<br />
demonstrates a systematic and straightforward way to<br />
solve nonlinear ordinary differential equations with the CNN<br />
paradigm. The key challenge has been to establish the possibility of, and then<br />
an appropriate algorithm for, mapping any type of<br />
nonlinearity onto the nonlinearity displayed by the elementary<br />
CNN cell. The approach developed in this work is therefore<br />
very flexible, as it can be applied to solve different types of<br />
nonlinear and stiff ODEs. The template calculation scheme<br />
based on NAOP has also been applied successfully to the<br />
Rayleigh, Lorenz and Rössler equations, and the corresponding<br />
CNN templates have been derived (due to space<br />
constraints we cannot present all these results in this paper).<br />
One interesting issue under investigation is the<br />
development of a library of CNN template sets<br />
for solving the most common nonlinear and stiff ODEs, including<br />
the ones cited above.<br />
The next section addresses the extension<br />
of the approach developed in this paper to solving nonlinear<br />
and stiff PDEs. It will be shown that a discretization<br />
process can transform PDEs into sets of coupled or<br />
uncoupled nonlinear ODEs, making them solvable with<br />
the CNN paradigm by applying the scheme<br />
developed in this paper.<br />
Figure 4a. Wave-form solution of (3) obtained by our new approach<br />
based on the CNN paradigm for Є = 0.25 and ω = 1.<br />
Figure 4b. Wave-form solution of (3) obtained through direct<br />
numerical simulation of (3) in MATLAB for Є = 0.25 and ω = 1.<br />
Figure 4c. Wave-form solution of (3) obtained by our new approach<br />
based on the CNN paradigm for Є = 1 and ω = 1 (panels: CNN waveform, CNN phase portrait).<br />
Figure 4d. Wave-form solution of (3) obtained through direct<br />
numerical simulation of (3) in MATLAB for Є = 1 and ω = 1 (panels: MATLAB waveform, MATLAB phase portrait).<br />
IV. EXTENSION OF THE NAOP SCHEME TO SOLVING<br />
STIFF PARTIAL DIFFERENTIAL EQUATIONS<br />
This section explains how<br />
the approach developed in this paper can be extended to solving PDEs. Unlike<br />
the traditional approach of solving stiff PDEs through CNN,<br />
which takes into consideration only the linear terms of the<br />
Taylor series expansion, we include the higher-order<br />
derivative terms of the Taylor series expansion of any given<br />
PDE in order to improve the accuracy of the obtained<br />
solutions. We consider, for illustration, Burgers' equation<br />
(4), which is a well-known prototype of partial differential<br />
equations and has multiple potential applications<br />
in the field of transportation:<br />
$$\frac{\partial u}{\partial t} = \frac{1}{R}\frac{\partial^2 u}{\partial x^2} - u\frac{\partial u}{\partial x} \tag{4}$$<br />
In order to solve (4) with the CNN paradigm, applying a<br />
first-order expansion based on the Taylor series<br />
leads to the following equivalent form of (4):<br />
$$\frac{du_i}{dt} = \frac{1}{R}\,\frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} - u_i\,\frac{u_{i+1} - u_{i-1}}{2h} \tag{5}$$<br />
One can see that (5) is a well-known prototype of a set of first-order<br />
coupled nonlinear ODEs. As (5) shows, the<br />
discretization has resulted in a set of coupled<br />
ODEs with quadratic nonlinear terms (i.e. of types similar to<br />
Lorenz or Rössler). This type of nonlinearity is solvable by<br />
the NAOP approach developed in the preceding section, as<br />
we could already solve more complex types of nonlinearity<br />
(e.g. the nonlinearity in the van der Pol equation). As<br />
discussed in Section 1, truncating the Taylor series<br />
(keeping only the linear terms) has been done reluctantly in many<br />
published works, since according to the literature there has<br />
so far been no way to deal with the increased<br />
complexity and the nonlinearity that appear otherwise.<br />
The results produced by a linear<br />
approximation are inevitably less precise. To increase precision,<br />
the higher-order (in this case second-order) derivative terms<br />
can be considered: the Taylor series expansion<br />
applied to (4) then leads to the result presented<br />
in (6):<br />
$$\frac{du_i}{dt} = \frac{1}{R}\left[\frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} - \frac{u_{i+1} - 3u_i + 3u_{i-1} - u_{i-2}}{2h^2} - \dots\right] - u_i\left[\frac{u_{i+1} - u_i}{h} - \frac{u_{i+1} - 2u_i + u_{i-1}}{2h} - \dots\right] \tag{6}$$<br />
Therefore, considering (6), it becomes clear that the<br />
NAOP developed in this paper is a strong candidate for a<br />
straightforward derivation of the appropriate CNN templates<br />
to solve (6).<br />
NAOP is thus also applicable to solving PDEs. The PDE must<br />
first be transformed into a set of coupled nonlinear ODEs. In<br />
this process, even nonlinear terms of the Taylor series<br />
expansion can be kept. NAOP is then used to determine<br />
appropriate templates for solving those complex sets of<br />
generally coupled nonlinear ODEs.<br />
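The first step of this procedure, turning the PDE into coupled ODEs, can be sketched for the first-order form (5) with a plain method-of-lines integration. This only illustrates the discretization, not the CNN or NAOP part; the value R = 10, the unit-length grid, the sine initial profile, the fixed (Dirichlet) boundary nodes and the function names are all assumptions made for the example.<br />

```python
import math


def burgers_rhs(u, R, h):
    """Right-hand side of (5) at interior nodes; boundary nodes are held fixed."""
    n = len(u)
    du = [0.0] * n
    for i in range(1, n - 1):
        diffusion = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / (h * h)
        advection = u[i] * (u[i + 1] - u[i - 1]) / (2.0 * h)
        du[i] = diffusion / R - advection
    return du


def rk4_step(f, u, dt):
    """One classical fourth-order Runge-Kutta step for a list-valued state."""
    k1 = f(u)
    k2 = f([ui + 0.5 * dt * ki for ui, ki in zip(u, k1)])
    k3 = f([ui + 0.5 * dt * ki for ui, ki in zip(u, k2)])
    k4 = f([ui + dt * ki for ui, ki in zip(u, k3)])
    return [ui + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for ui, a, b, c, d in zip(u, k1, k2, k3, k4)]


def solve_burgers(R=10.0, n=51, t_end=0.5, dt=1e-3):
    """Integrate the coupled ODEs (5) on [0, 1] from u(x, 0) = sin(pi x)."""
    h = 1.0 / (n - 1)
    u = [math.sin(math.pi * i * h) for i in range(n)]
    u[0] = 0.0  # enforce the assumed boundary value exactly
    for _ in range(int(round(t_end / dt))):
        u = rk4_step(lambda v: burgers_rhs(v, R, h), u, dt)
    return u


if __name__ == "__main__":
    u = solve_burgers()
    print("max u at t = 0.5:", max(u))
```

Each interior node u_i is coupled only to its two neighbours and the only nonlinearity is the quadratic product u_i(u_{i+1} − u_{i−1}), which is the structure the text compares to Lorenz- or Rössler-type systems.<br />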
V. CONCLUDING REMARKS<br />
We have proposed and validated a concept<br />
based on the CNN paradigm for ultra-fast, potentially low-cost<br />
and high-precision computing of stiff ODEs and PDEs. Since<br />
we can solve these through CNN independently of the actual<br />
nonlinearity, we consider this a clear breakthrough that has<br />
the potential to enable a truly ‘real-time’ Computational<br />
Engineering.<br />
The main benefit of solving ODEs and PDEs using CNN is the<br />
flexibility offered by NAOP in extracting the CNN<br />
parameters with which CNN can solve any type of ODE or<br />
PDE. Another strong point of the CNN paradigm is the<br />
resulting ultra-fast processing, which depends on the CNN<br />
implementation: DSP, FPGA, GPU, or CNN chip. One key<br />
objective of this work has been to advance the relevant state of the art<br />
by proposing a novel framework to solve stiff<br />
ODEs and PDEs with high precision. To achieve this goal,<br />
we have proposed and demonstrated that the Nonlinear<br />
Adaptive Optimization (NAOP) technique is an<br />
efficient scheme for obtaining solutions of any ODE or PDE.<br />
NAOP is a learning/training method for mapping the<br />
wave solutions of the model describing the dynamics of a<br />
CNN network to those of a given model (ODE). Taking just<br />
these two inputs, the learning process converges<br />
to a local minimum where the complete mapping of the two<br />
models is achieved and CNN templates are produced.<br />
Using the same technique, we proposed a high-precision<br />
computing of stiff PDEs that accounts even for nonlinear<br />
terms (i.e. higher-order terms) in the Taylor series expansion<br />
used to transform the PDE into a set of coupled<br />
nonlinear ODEs. To address the problem of<br />
computation speed, an implementation of the concept developed<br />
in this work on FPGA, DSP or GPU is<br />
possible and straightforward.
REFERENCES<br />
[1] Leon O. Chua and Lin Yang, “Cellular Neural Networks:<br />
Theory,” IEEE Transactions on Circuits and Systems, vol. 35,<br />
no. 10, October 1988.<br />
[2] Milka Uzunova, Daniel Jolly, Emil Nikolov, and Kamel<br />
Boumediene, “The Macroscopic LWR Model of the Transport<br />
Equation Viewed as a Distributed Parameter System,”<br />
Proceedings of the 5th international conference on Soft<br />
computing as transdisciplinary science and technology, pp. 572-<br />
576, October 2008.<br />
[3] Song Chun Zhu and David Mumford, “Gibbs Reaction and<br />
Diffusion Equations,” Proceedings of the 6th International<br />
Conference on Computer Vision, pp. 847, January 1998.<br />
[4] Tamer A. Abassy, Magdy A. El-Tawil, H. El-Zoheiry, “Exact<br />
Solutions of Some Nonlinear Partial Differential Equations<br />
Using the Variational Iteration Method Linked With Laplace<br />
Transforms and the Pade Technique,” Computers and<br />
Mathematics with Applications, vol. 54, pp. 940-954, October<br />
2007.<br />
[5] N. H. Sweilam, “Variational Iteration Method for Solving Cubic<br />
Nonlinear Schrodinger Equation,” Journal of Computational and<br />
Applied Mathematics, vol. 207, pp. 155-163, October 2007.<br />
[6] Michael Striebel, Andreas Bartel, and Michael Gunther, “A<br />
Multirate ROW-scheme for Index-1 Network Equations,”<br />
Applied Numerical Mathematics, vol. 59, pp. 800-814, March<br />
2009.<br />
[7] Tamas Roska, Leon O. Chua, Dietrich Wolf, Tibor Kozek,<br />
Ronald Tetzlaff, and Frank Puffer, “Simulating Nonlinear<br />
Waves and Partial Differential Equations via CNN-Part I: Basic<br />
Techniques,” IEEE Transactions on Circuits and Systems-<br />
I: Fundamental Theory and Applications, vol. 42, no. 10,<br />
October 1995.<br />
[8] J. J. Hopfield and D. W. Tank, “Neural computation of<br />
decisions in optimization problems,” Biol. Cybernet., vol. 52, pp.<br />
141-152, 1985.<br />
[9] K. A. Smith, “Neural network for combinatorial optimization: A<br />
review of more than a decade of research,” INFORMS J.<br />
Computing, vol. 11, no. 1, pp. 15-34, 1999.<br />
[10] C. Del Negro, L. Fortuna, and A. Vicari, “Modelling Lava Flows<br />
by Cellular Nonlinear Networks (CNN): Preliminary Results,”<br />
Nonlinear Processes in Geophysics, vol. 12, pp. 505-513, 2005.<br />
Figure 5. Global architecture of the computing platform planned to enable a<br />
real-time Computational Engineering. Diverse users may access the CNN<br />
processor platforms remotely through the Internet. (Block labels: Users, API, Internet,<br />
Middleware, Central Server, Platform Bus, Multi-Channel Memory Controller, GPIO, DSP Cluster,<br />
FPGA Cluster, GPU Cluster, PowerPC Cluster, Memory, Shared Memory, Hyper-Computer,<br />
Emulated Analog Computing or CNN processors, CNN implementation, HPC.)<br />
[11] I. Krstic, D. Kandic, and B. Reljin, “Cellular Neural Networks-<br />
An Analogous Model for Stress Analysis of Prismatic Bars<br />
Subjected to Torsion,” FME Transactions, vol. 31, pp. 7-14,<br />
2003.<br />
[12] J.-H. Niu, H.-Z. Wang, H.-X. Zhang, J.-Y. Yan, and Y.-S. Zhu,<br />
“Cellular Neural Network Analysis for Two-Dimensional<br />
Bioheat Transfer Equation,” Medical & Biological Engineering<br />
& Computing, vol. 39, pp. 601-604, 2001.<br />
[13] Tibor Kozek, Leon O. Chua, Tamas Roska, Dietrich Wolf,<br />
Ronald Tetzlaff, Frank Puffer, and Karoly Lotz, “Simulating<br />
Nonlinear Waves and Partial Differential Equations via CNN-<br />
Part II: Typical Examples,” IEEE Transactions on Circuits and<br />
Systems-I: Fundamental Theory and Applications, vol. 42, no. 10,<br />
October 1995.<br />
[14] T. Kozek and T. Roska, “A Double Time-Scale CNN for<br />
Solving 2-D Navier-Stokes Equations,” CNNA-94, 3rd IEEE<br />
International Workshop on Cellular Neural Networks and their<br />
Applications, December 1994.<br />
[15] F. Puffer, R. Tetzlaff, and D. Wolf, “A Learning Algorithm for<br />
Cellular Neural Networks (CNN) Solving Nonlinear Partial<br />
Differential Equations,” ISSSE Proceedings, 1995.<br />
[16] P. Lucie Aarts and P. van der Veer, “Neural Network Method<br />
for Solving Partial Differential Equations,” J. Neural Processing<br />
Letters, vol. 14, no. 3, pp. 261-271, December 2001.<br />
[17] I. G. Tsoulos, D. Gavrilis, and E. Glavas, “Solving differential<br />
equations with constructed neural networks,” J.<br />
Neurocomputing, vol. 72, pp. 2385-2391, June 2009.<br />
[18] L. O. Chua, M. Hasler, G. S. Moschytz, and J. Neirynck, “Autonomous<br />
cellular neural networks: A unified paradigm for pattern<br />
formation and active wave propagation,” IEEE Transactions on<br />
Circuits & Systems-I: Fundamental Theory and Applications,<br />
vol. 42, no. 10, October 1995.<br />
[19] F. Puffer, R. Tetzlaff, and D. Wolf, “Modeling Nonlinear Systems<br />
With Cellular Neural Networks,” IEEE Transactions on<br />
Acoustics, Speech, and Signal Processing ICASSP-96, vol. 6,<br />
pp. 3513-3516, 1996.<br />
[20] Josef A. Nossek, “Design and Learning With Cellular Neural<br />
Networks,” International Journal of Circuit Theory &<br />
Applications, vol. 24, pp. 15-24, 31 Dec 1998.<br />
[Figure 6 block diagram; labels: Job Request, Server Architecture, Web Server, Task Scheduler, Bill Manager, API & Abstract Layer, Platform BUS, Synthesis Tools, Finite Element, Image Processing, PDE, ODE.]<br />
Figure 6. Core idea of the server architecture intended for the CNN-based<br />
super-computing platform to enable real-time Computational Engineering.<br />
It is a detailed description of the central server given in Fig. 5.<br />
Kyandoghere Kyamakya obtained the ‘Ir. Civil’ degree in Electrical<br />
Engineering in 1990 at the University of Kinshasa. In 1999 he<br />
received his Doctorate in Electrical Engineering at the University of<br />
Hagen in Germany. He then worked for three years as a post-doctoral<br />
researcher at the Leibniz University of Hannover in the field of Mobility<br />
Management in Wireless Networks. From 2002 to 2005 he<br />
was junior professor for Positioning Location Based<br />
Services at the Leibniz University of Hannover. Since 2005 he<br />
has been full Professor of Transportation Informatics and Director<br />
of the Institute for Smart Systems Technologies at the<br />
University of Klagenfurt in Austria.<br />
Cyrille Kalenga Wa Ngoy obtained the ‘Ir. Civil’ degree in<br />
Electrical Engineering at the University of Kinshasa. He has been<br />
an Assistant at the same university for about ten years, in the<br />
Department of Electrical and Computer Engineering.<br />
Michel Matalatala Tamasala obtained the ‘Ir. Civil’<br />
degree in Electrical Engineering at the University of<br />
Kinshasa. He has been an Assistant at the same<br />
university for about four years, in the Department of Electrical and Computer<br />
Engineering.<br />
Jean Chamberlain Chedjou received his doctorate in<br />
Electrical Engineering in 2004 at the<br />
Leibniz University of Hannover,<br />
Germany. He has been a DAAD<br />
(Germany) scholar and also an<br />
AUF research Fellow (Postdoc.).<br />
Since 2000 he has been a<br />
Junior Associate researcher in the<br />
Condensed Matter section of the ICTP (Abdus Salam<br />
International Centre for Theoretical Physics), Trieste, Italy.<br />
Currently, he is a senior researcher at the Institute for Smart<br />
Systems Technologies of the Alpen-Adria University of<br />
Klagenfurt in Austria. His research interests include<br />
Electronic Circuits Engineering, Chaos Theory, Analog<br />
Systems Simulation, Cellular Neural Networks, Nonlinear<br />
Dynamics, Synchronization and related Applications in<br />
Engineering. He has authored and co-authored 3 books and<br />
more than 40 journal and conference papers.
ISAST Transactions on Computers and Intelligent Systems, No. 1, Vol. 2, 2010 (ISSN 1798-2448)<br />
Performance Comparison of OFDM and Single<br />
Carrier Modulations over Satellite Channels<br />
Yuri Labrador, Masoumeh Karimi, Niki Pissinou, and Deng Pan<br />
Abstract — This paper explores the feasibility of using<br />
OFDM over satellite channels with high order modulation<br />
techniques such as QPSK and 16QAM and strong error<br />
correction algorithms. A performance comparison<br />
between currently used single carrier techniques and OFDM is also<br />
presented.<br />
Index Terms — Satellite, OFDM, QPSK, 16 QAM.<br />
I. INTRODUCTION<br />
DIGITAL modulation techniques over satellite have<br />
become, in the past five years, the main transmission<br />
technique used in television and video transmission, because<br />
digital modulation combined with video compression uses<br />
the satellite bandwidth more efficiently. The single carrier<br />
modulation currently used performs very well over satellite<br />
channels in fixed receiver environments. With<br />
mobile users, however, the channel presents multipath effects as well as<br />
Doppler shifts, and single carrier modulation does<br />
not perform as well as in fixed environments. Orthogonal Frequency<br />
Division Multiplexing (OFDM), on the other hand, performs much<br />
better in multipath and frequency selective channels; it also<br />
uses the available spectrum very efficiently. This paper<br />
explores the feasibility of using OFDM over satellite<br />
channels with high order modulation techniques such as QPSK<br />
and 16QAM and strong error correction algorithms. A<br />
performance comparison between currently used single<br />
carrier techniques and OFDM is presented as well.<br />
II. SINGLE CARRIER MODULATION VERSUS OFDM<br />
Single carrier modulation presents two main problems when<br />
used in frequency selective channels [1]: (1) frequency selective<br />
channels introduce inter-symbol<br />
interference (ISI) at the receiver; and (2) equalization at the receiver<br />
may also amplify noise at frequencies where the channel response<br />
is poor. As a result, single carrier modulation suffers from<br />
high attenuation in some bands. Since a single carrier occupies<br />
the entire bandwidth, this problem can become very serious<br />
(see Figure 1).<br />
Manuscript received January 21, 2010.<br />
The authors are with Florida International University, Miami, FL, e-mails:<br />
{ylabr001, mkari001, pissinou, pand}@fiu.edu.<br />
To avoid this problem, the bandwidth can be divided into many small bands, with<br />
a carrier allocated to each one. The<br />
data stream is likewise divided into many parallel data streams,<br />
each modulating an individual carrier. The resulting signals are added<br />
together and transmitted. Thus, the entire bandwidth is still<br />
used, but with many individual, narrower carriers, as shown<br />
in Figure 2.<br />
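The splitting and summation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB model: it assumes a Gray-coded QPSK mapping and complex exponential carriers (so that the summation is an inverse DFT), and the names `ofdm_modulate` and `ofdm_demodulate` are hypothetical. No cyclic prefix, filtering, or channel is included.

```python
import cmath

# Gray-coded QPSK mapping (an assumption; any constellation could be used).
QPSK = {(0, 0): 1 + 1j, (0, 1): -1 + 1j, (1, 1): -1 - 1j, (1, 0): 1 - 1j}


def ofdm_modulate(bits, num_carriers):
    """Split bits into parallel streams, modulate one carrier each, and sum."""
    if len(bits) != 2 * num_carriers:
        raise ValueError("QPSK needs exactly 2 bits per sub-carrier")
    # Serial-to-parallel: bits 2k and 2k+1 modulate sub-carrier k.
    symbols = [QPSK[(bits[2 * k], bits[2 * k + 1])] for k in range(num_carriers)]
    m = num_carriers
    # Sum of all modulated carriers at each time index n (an inverse DFT).
    samples = [
        sum(symbols[k] * cmath.exp(2j * cmath.pi * k * n / m) for k in range(m)) / m
        for n in range(m)
    ]
    return symbols, samples


def ofdm_demodulate(samples):
    """Recover the sub-carrier symbols by correlating with each carrier (a DFT)."""
    m = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * cmath.pi * k * n / m) for n in range(m))
        for k in range(m)
    ]
```

Because the carriers are mutually orthogonal over one symbol, correlating the summed signal with each carrier returns the symbol that modulated it, which is why the individual streams can share the band.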
Fig. 1. Channel Response. [Plot of the channel response H(jΩ) versus frequency f.]<br />
Some advantages of OFDM are as follows [8]: (1) the<br />
available spectrum is divided into smaller sub-bands; (2) data<br />
is divided at the transmitter side, and each sub-stream<br />
modulates one sub-carrier; (3) the power and transmission rate<br />
in a band depend on the channel response in that band; and (4)<br />
ISI is largely avoided, since in each narrow sub-band the channel response is<br />
almost flat [7] (see Figure 3).<br />
In general, an OFDM transmission can be represented as<br />
shown in Figure 4.<br />
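Fig. 2. OFDM Principle of Operation. [Block diagram: symbols X_0, X_1, ..., X_{M-1} modulate carriers φ_0[n], φ_1[n], ..., φ_{M-1}[n]; the branches are summed (Σ) to form x[n].]<br />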
The power allocated to each sub-channel depends<br />
on the value of the channel response H_i in that sub-channel. The number of<br />
bits to be transmitted on each sub-channel is then determined: the number of<br />
bits and the constellation can be chosen for a sub-channel<br />
based on the SNR in that particular sub-channel and the<br />
required probability of error.<br />
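One common way to turn this idea into numbers is the SNR-gap approximation, sketched below. This is an assumed illustration, not the allocation rule used by the authors: the function name `allocate_bits`, the gap value, and the sample gains are hypothetical, and only the constellations discussed in the paper (QPSK with 2 bits, 16QAM with 4 bits) are offered.

```python
import math

# Bits per symbol on offer: 0 (sub-channel unused), 2 (QPSK), 4 (16QAM).
ALLOWED_BITS = (0, 2, 4)


def allocate_bits(channel_gains, powers, noise_power, gap_db):
    """Return the number of bits per symbol chosen for each sub-channel."""
    gap = 10 ** (gap_db / 10)  # SNR gap standing in for the target error rate
    allocation = []
    for h_i, p_i in zip(channel_gains, powers):
        # SNR of sub-channel i follows from its power and channel response H_i.
        snr_i = p_i * abs(h_i) ** 2 / noise_power
        # Gap approximation: bits supportable at the required error probability.
        supportable = math.log2(1 + snr_i / gap)
        # Pick the largest offered constellation that fits.
        allocation.append(max(b for b in ALLOWED_BITS if b <= supportable))
    return allocation
```

A larger `gap_db` corresponds to a stricter error requirement and therefore to smaller constellations on the weaker sub-channels.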
Fig. 3. Orthogonal Carriers. [Plot; vertical axis: Amplitude.]<br />
The signal spectra of single carrier and OFDM<br />
modulation differ in two main characteristics (Figure 5).<br />
1) Single carrier shows one main frequency which is<br />
modulated using some digital scheme such as QPSK or<br />
8PSK.<br />
2) OFDM is composed of a series of carriers individually<br />
modulated; these carriers are orthogonal with respect to<br />
each other.<br />
III. SATELLITE CHANNEL MODELS<br />
A fixed receiver satellite channel is modeled for practical<br />
applications as an Additive White Gaussian Noise (AWGN) channel with<br />
a path loss block that takes into account the distance<br />
between the satellite and the receiver antenna and the<br />
operating frequency. This produces a path loss attenuation that<br />
varies depending on the type of satellite used. For<br />
geostationary satellites this attenuation in C-Band is close to<br />
200 dB (199 dB uplink and 196 dB downlink in Section IV). These parameters, when simulated<br />
in MATLAB, give a very close representation of real life<br />
scenarios in terms of Bit Error Rate (BER) and<br />
Signal to Noise Ratio (SNR). Several simulation runs and real<br />
life tests have been performed to demonstrate that the channel<br />
model is a good approximation of real life events [10].<br />
When dealing with mobile receivers, a more complex model<br />
needs to be considered.<br />
Propagation characteristics in satellite channels are<br />
susceptible to weather impairments, especially at higher<br />
frequencies [2], [3]. Average rain and shadowing may<br />
completely disrupt the communication link. A mobile satellite<br />
channel model that takes into consideration potential weather<br />
impairments and the multipath fading phenomenon is<br />
necessary in order to represent the satellite channel. Some<br />
models have been proposed, but they consider only one of the<br />
two: either the multipath effects or the weather effects. The<br />
propagation effects present in a mobile satellite link include<br />
those related to the troposphere (rain, etc.) and those caused<br />
by the receiver’s environment (multipath). The troposphere<br />
effects are denoted by α, and the environmental effects are<br />
denoted by β. The two effects are assumed to be statistically<br />
independent because their underlying mechanisms are<br />
independent. The amplitude of the received signal can be<br />
described as:<br />
$A = \alpha \cdot \beta$ (1)<br />
The satellite channel model has two states (good and bad):<br />
a non-shadowing state and a<br />
shadowing state [4], [5]. This two-state model forms a Markov<br />
model. In the non-shadowing state, the received signal<br />
amplitude follows a Rician distribution:<br />
$p_{\text{non-shadowing}}(A) = 2K A \exp\left[-K\left(A^{2}+1\right)\right] I_{0}(2KA)$ (2)<br />
where K is the Rice factor and $I_{0}$ is the modified Bessel function of the first kind of order zero.<br />
In the shadowing state, where no line of sight (LOS) exists, the channel is<br />
described by Rayleigh multipath fading. The amplitude of the signal at the<br />
receiver is distributed as:<br />
$p_{\text{shadowing}}(A) = \frac{2A}{s_{0}} \exp\left(-\frac{A^{2}}{s_{0}}\right)$ (3)<br />
Fig. 4. OFDM Transmitter and Receiver. [Block diagram; labels: X_0 ... X_{M-1}, x[0] ... x[M-1], x[n], h[n], y[n], x'[0] ... x'[M-1], X'_0 ... X'_{M-1}, 1/H_0 ... 1/H_{M-1}.]<br />
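A minimal sketch of how such a two-state channel could be sampled is given below. It is not the authors' simulation: the transition probabilities, the seed, and all function names are hypothetical. The Rician draw assumes a unit-amplitude line-of-sight component, which is what the density in (2) implies, and the Rayleigh draw treats s_0 in (3) as the mean received power.

```python
import math
import random


def rician_amplitude(k_factor, rng):
    """Draw A with density 2K*A*exp(-K(A^2+1))*I0(2K*A) (unit LOS amplitude)."""
    sigma = math.sqrt(1.0 / (2.0 * k_factor))  # per-component scatter std dev
    return math.hypot(1.0 + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def rayleigh_amplitude(s0, rng):
    """Draw A with density (2A/s0)*exp(-A^2/s0), i.e. mean power s0."""
    sigma = math.sqrt(s0 / 2.0)
    return math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def simulate_channel(n, k_factor, s0, p_good_to_bad, p_bad_to_good, seed=0):
    """Two-state Markov chain: 'good' (Rician) and 'bad' (Rayleigh) states."""
    rng = random.Random(seed)
    state = "good"
    trace = []
    for _ in range(n):
        if state == "good":
            trace.append((state, rician_amplitude(k_factor, rng)))
            if rng.random() < p_good_to_bad:
                state = "bad"
        else:
            trace.append((state, rayleigh_amplitude(s0, rng)))
            if rng.random() < p_bad_to_good:
                state = "good"
    return trace
```

With these conventions the mean power is 1 + 1/K in the good state and s_0 in the bad state, which gives a quick sanity check on any generated trace.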
Of all potential weather impairments, rain is the most<br />
critical, especially in tropical climates, where rainfall can be<br />
severe. The long-term statistics of potential rainfall can be<br />
described by a lognormal density:<br />
$P_{L}(L) = \frac{1}{\sigma_{d} L \sqrt{2\pi}} \exp\left[-\frac{(\ln L - m_{d})^{2}}{2\sigma_{d}^{2}}\right], \quad L \geq 0$ (4)<br />
where $m_{d}$ and $\sigma_{d}$ are the mean and standard deviation of $\ln L$.<br />
Studies on rain attenuation between fixed and mobile<br />
systems show that the probability distribution of the envelope<br />
of a mobile receiver can be described by the one used for the<br />
fixed system, multiplied by a factor that varies between 0.5<br />
and 2.0 and is independent of rain attenuation [6]. Figure 6<br />
shows the probability density function versus the<br />
amplitude.<br />
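For readers who want to evaluate the lognormal density in (4) numerically, a small sketch follows. The function name and the parameter values used in any call are hypothetical; the paper does not state values for m_d or σ_d.

```python
import math


def rain_lognormal_pdf(level, m_d, sigma_d):
    """Evaluate the lognormal density of equation (4) at a given level L."""
    if level <= 0.0:
        return 0.0  # the density has no mass at or below zero
    exponent = -((math.log(level) - m_d) ** 2) / (2.0 * sigma_d ** 2)
    return math.exp(exponent) / (sigma_d * level * math.sqrt(2.0 * math.pi))
```

The density peaks at L = exp(m_d - σ_d^2) and integrates to one, which are convenient checks when fitting m_d and σ_d to measured data.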
Figure 7 shows measurements on real C-Band transponders,<br />
illustrating the effect of rain on the on-board solid state<br />
amplifier current.<br />
IV. SIMULATIONS<br />
MATLAB is used in this paper to simulate these<br />
modulation techniques. The simulation includes<br />
constellations for QPSK, 8PSK and 16QAM and signal<br />
spectra for both single carrier and OFDM techniques. We<br />
decided to include 16QAM in order to show a type of<br />
digital modulation that varies both the amplitude<br />
and the phase, in contrast to QPSK and 8PSK, which only<br />
modulate the phase of the carrier signal [9].<br />
For each simulation we created five blocks:<br />
1. Ground station block that includes:<br />
a) Random Digital Source that generates digital pulses.<br />
b) Error correction blocks for a code rate of 3/4.<br />
c) QPSK, 8PSK or 16QAM Modulator that performs the<br />
actual modulation.<br />
d) OFDM modulator (for the OFDM simulations).<br />
e) Raised Cosine Transmit Filter.<br />
f) High Power Amplifier.<br />
g) Transmitting Antenna.<br />
2. Uplink Path block that includes:<br />
a) Free Space Path Loss: this block simulates the Uplink<br />
free space attenuation due to frequency and distance<br />
from the Uplink site to the satellite. The Uplink<br />
frequency is 6245 MHz and the distance 35600 km,<br />
giving a total attenuation of 199 dB.<br />
b) Phase/Frequency offset.<br />
3. Satellite block that includes:<br />
a) Satellite receiving antenna.<br />
b) Satellite receiver system temperature.<br />
c) Phase Noise.<br />
d) I/Q Balance.<br />
e) Phase/Frequency offset.<br />
f) Power Amplifier.<br />
g) Satellite Transmitting Antenna.<br />
4. Downlink Path block that includes:<br />
a) Free Space Path Loss: this block simulates the<br />
Downlink free space attenuation due to frequency<br />
and distance from the Satellite to the Receiving<br />
Station. The Downlink frequency is 4020 MHz and<br />
the distance 35600 km, giving a total attenuation of<br />
196 dB.<br />
b) Phase/Frequency offset.<br />
c) Rician Multipath Fading Channel (for mobile<br />
receiver scenarios).<br />
5. Receiving Earth Station block that includes:<br />
a) Receiving Antenna.<br />
b) Receiver Noise Temperature.<br />
c) Phase Noise.<br />
d) I/Q Balance.<br />
e) Phase/Frequency Offset.<br />
f) Raised Cosine Receive Filter.<br />
g) OFDM demodulator (for OFDM simulations).<br />
h) QPSK, 8PSK or 16QAM Demodulator.<br />
i) Error correction decoder.<br />
Fig. 5. Single carrier vs. OFDM spectrum.<br />
Fig. 6. Probability Density Functions for Non-shadowing State. [Plot; vertical axis: Probability Density.]<br />
Fig. 7. On board SSPA Current Attenuation in the Satellite Transponder.<br />
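The free space attenuation values quoted for the uplink and downlink path blocks can be cross-checked with the standard free-space path loss formula, FSPL(dB) = 20 log10(d in km) + 20 log10(f in MHz) + 32.45. The sketch below is only such a cross-check; the function name is hypothetical and it is not part of the authors' MATLAB model.

```python
import math


def free_space_path_loss_db(freq_mhz, distance_km):
    """Free-space path loss in dB for a frequency in MHz and distance in km."""
    return 20.0 * math.log10(distance_km) + 20.0 * math.log10(freq_mhz) + 32.45


# Link parameters taken from the block descriptions in the text.
uplink_loss = free_space_path_loss_db(6245.0, 35600.0)
downlink_loss = free_space_path_loss_db(4020.0, 35600.0)
```

With the frequencies and distance given in the text, this evaluates to roughly 199.4 dB for the uplink and 195.6 dB for the downlink, in agreement with the 199 dB and 196 dB stated for the two path blocks.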
The simulation allows several parameters to be varied for<br />
different scenarios, such as TX and RX antenna diameter and<br />
gain; in that way we can see how the received spectrum is<br />
affected when the sizes of the antennas are changed.<br />
Other parameters of interest that can be changed and<br />
that affect the received signal are: HPA gain, Uplink and<br />
Downlink frequencies (and thus Uplink and Downlink free<br />
space attenuation), noise temperatures (originally set<br />
to a typical value of 290 K), phase noise, phase correction,<br />
Doppler error, AGC type, phase and frequency offsets, and the order<br />
of the error correction algorithms used. Note that the<br />
simulations include several spectrum monitors and<br />
constellation displays that can be moved to different<br />
parts of the diagram to check the form of the spectrum at any<br />
point along the path.<br />
Figures 8 and 9 show the effects of channel models on the<br />
transmitted and received spectrum for both single carrier<br />
modulation and OFDM.<br />
Figure 10 shows the BER of the OFDM simulation<br />
for different values of the Rician factor K. Under this channel<br />
model the BER detection threshold is reached at a BER<br />
of $10^{-3}$ with Eb/N0 = 8 dB. The bandwidth is 5 MHz.<br />
If Turbo codes are used, the BER values of the<br />
OFDM QPSK signal are as shown in Figure 11. The same Rician<br />
factors K were used.<br />
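For orientation only, the quoted operating point can be set against the textbook bit error rate of uncoded, Gray-coded QPSK on an ideal AWGN channel, Pb = 0.5 erfc(sqrt(Eb/N0)). This is a reference baseline under that assumption, not the Rician satellite channel or the coded system simulated in the paper, and the function name is hypothetical.

```python
import math


def qpsk_ber_awgn(ebn0_db):
    """Theoretical BER of uncoded Gray-coded QPSK over an ideal AWGN channel."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)  # convert dB to a linear ratio
    return 0.5 * math.erfc(math.sqrt(ebn0))


reference_ber_at_8_db = qpsk_ber_awgn(8.0)
```

At Eb/N0 = 8 dB this baseline is on the order of 2e-4, so a simulated value near 1e-3 at the same Eb/N0 gives a rough sense of the penalty attributable to fading and the other impairments in the link model.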
Fig. 8. Transmitted and Received Spectrums Single Carrier Modulation.<br />
Fig. 9. Transmitted and Received Spectrums OFDM Modulation. [Plot; panel titles: OFDM Spectrum, Uplink Signal, HPA Effects on the OFDM Signal; horizontal axis: Frequency, -2.5 to 2.5; BW = 5 MHz.]<br />
V. PERFORMANCE COMPARISON AND EXPERIMENTAL<br />
RESULTS SIMULATIONS<br />
Table I presents a performance comparison between the data rates of<br />
existing single carrier satellite modulation techniques and<br />
those of a multiple carrier (OFDM) scheme using time diversity.<br />
TABLE I<br />
PERFORMANCE OF OFDM TIME DIVERSITY IN SATELLITE CHANNELS.<br />
Modulation | Bandwidth (MHz) | FEC | Single carrier data rate, DVB-S scheme (Mbps) | Proposed OFDM time diversity scheme (Mbps)<br />
QPSK | 5 | 1/2 | 3.5 | 4.5<br />
QPSK | 5 | 3/4 | 5.3 | 6.75<br />
QPSK | 5 | 5/6 | 5.9 | 7.5<br />
16 QAM | 5 | 1/2 | n/a | 9.01<br />
16 QAM | 5 | 3/4 | 10.6 | 13.5<br />
16 QAM | 5 | 5/6 | 12.4 | 15.1<br />
The test was aimed at producing a working version of the<br />
OFDM QPSK modulation system, for performance<br />
verification, with the following objectives:<br />
1) Conduct a review of actual RF performance over a typical<br />
C-Band transponder.<br />
2) Demonstrate the inherent robustness of the system in the<br />
presence of normal satellite transmission impairments.<br />
The test was carried out at the Univision Network Communications<br />
Uplink facility in Miami, FL. The output of the modulator at 70<br />
MHz was fed into a Radyne Upconverter with the +7 dBm output<br />
option, which then fed an MCL Klystron C-Band HPA.<br />
The Uplink antenna used was a 9.1 m Scientific Atlanta with<br />
a 4-port feed, transmitting onto transponder 16 on AMC-1 at<br />
103 West. The downlink available for the test was a 3.1 m<br />
receiving antenna in Miami. The downlink is equipped with<br />
standard DRO-based LNBs of digital quality. Typical noise<br />
temperatures of 25 to 35 Kelvin were noted.<br />
The modulator was set to a nominal output level of -8<br />
dBm. The spectrum was noted as very clean, with only low-level<br />
spurs noted at the IF frequency of 70 MHz. The resulting<br />
IF spectrum exhibited at least -30 dBc at ±2.5 MHz,<br />
indicating that very little RF power would be wasted in out-of-band<br />
transmissions absorbed by the transponder<br />
filters. The RF output of the Upconverter was set to a nominal<br />
output level of +1 dBm, well below the +7 dBm saturation<br />
level of the upconverter output.<br />
The RF input to the HPA was set to a nominal level of -22<br />
dBm. The HPA output was checked via a 57.1 dB coupler, and the<br />
spurious emissions were noted to be -55 dBc or lower. No<br />
special tuning of the Klystron was performed, as it was<br />
deemed unnecessary for the purpose of the test. In order to<br />
determine the proper operating point for the service in the<br />
transponder, a series of RF level tests were performed, using<br />
both CW and a modulated carrier. The CW tests indicated that the<br />
-1 dB saturation point for the transponder SSPA was reached with a<br />
transmit level of 80 Watts, as measured at the HPA output<br />
coupler.<br />
An operating point 0.5 dB below the -1 dB saturation<br />
point was chosen as the nominal operating level, to maximize<br />
downlink performance without introducing significant<br />
distortion to the modulation. Saturating the<br />
transponder is to be avoided because it increases Inter Symbol<br />
Interference in the demodulator within the receiver, causing<br />
loss of RF margin performance. The local 3.1 m antenna was<br />
peaked on the satellite and was used as the reference antenna for<br />
the bulk of the tests, as it constituted a stress-test scenario for<br />
the system.<br />
Upon modulating the OFDM carriers, the system<br />
performance was measured over the full range from 10 dB OBO to<br />
saturation, with the signal to noise ratio (SNR) displayed by the<br />
receiver showing nominal increases up to the 1 dB saturation<br />
point. This result indicates that if there was an increase in<br />
distortion of the OFDM signal, it was not discernible by the<br />
receiver. Further tests should be performed to determine the<br />
actual extent of such distortion, independent of the SNR<br />
readout.<br />
Fig. 10. BER Values QPSK OFDM Satellite Channel. [Plot: Bit Error Rate (10^0 down to 10^-8) versus Eb/N0 (0 to 40 dB); curves for OFDM QPSK Satellite Link with K = 1 to 5; Satellite HPA Saturation Level = 1 dB.]<br />
The overall system performance is shown below:<br />
Parameter 3.1m 3.6m 3.7m 5.0m 7.3m<br />
SNR (dB) 11.0 11.7 11.7 14.4 18.0<br />
Signal Level 57 61 57 53 57<br />
Margin (3/4 FEC) (dB) 2.0 2.7 2.7 5.4 9.0<br />
Margin (5/6 FEC) (dB) 0.6 1.3 1.3 4.0 7.6<br />
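The margin rows of the table are consistent with subtracting a fixed threshold from the measured SNR: 9.0 dB for 3/4 FEC and 10.4 dB for 5/6 FEC. These two threshold values are inferred from the tabulated numbers rather than stated at this point in the text, so the sketch below should be read as an arithmetic check with hypothetical names, not as the authors' procedure.

```python
# Measured SNR per receiving antenna diameter, copied from the table (dB).
SNR_DB = {"3.1m": 11.0, "3.6m": 11.7, "3.7m": 11.7, "5.0m": 14.4, "7.3m": 18.0}

# Thresholds inferred from the margin rows (assumption, in dB).
THRESHOLD_DB = {"3/4": 9.0, "5/6": 10.4}


def link_margin_db(snr_db, threshold_db):
    """Margin over threshold in dB, rounded to one decimal as in the table."""
    return round(snr_db - threshold_db, 1)


margins = {
    fec: {antenna: link_margin_db(snr, thr) for antenna, snr in SNR_DB.items()}
    for fec, thr in THRESHOLD_DB.items()
}
```

Recomputing the margins this way reproduces every entry of both margin rows, which suggests the table was built from a single threshold per FEC rate.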
The system was tested at FEC rates of 3/4, 5/6 and 8/9 (not<br />
shown), with most of the testing performed at the 3/4 and 5/6<br />
rates, as these were thought to be the most likely operational rates<br />
for the system.<br />
Fig. 11. BER Values QPSK OFDM Turbo Coded Satellite Channel. [Plot: Bit Error Rate versus Eb/N0 (0 to 40 dB); curves for OFDM QPSK Satellite Link, Turbo Code 1/3, with K = 1 to 5; HPA Saturation Level 1 dB.]<br />
Some problems encountered:<br />
Transponder 16 operations were significantly affected by<br />
adjacent-satellite interference, both uplink and downlink, from<br />
a co-frequency, co-polarized analog video uplink on Galaxy 4,<br />
at 99 West, 4 degrees away. This is a perfectly legitimate<br />
interference situation, and is typical of the interference to be<br />
expected while operating on a C-Band transponder in the<br />
middle of the dense cable neighborhood portion of the US<br />
domestic arc.<br />
The interference was mainly downlink-dominated on the<br />
smaller diameter receiving antennas. It is estimated that the<br />
interference contributed to a general 1.5 to 3 dB degradation<br />
of the system performance on the 3.1 m antenna. It was also<br />
noted that the SNR reading on the receiver monitoring the 3.1<br />
m antenna would occasionally fluctuate by 0.1 to 0.2 dB,<br />
probably due to changes in the nature of the overall<br />
interference level.<br />
Using 3/4 FEC, the system performed with about 3 dB<br />
margin for the worst-case antenna of 3.1 m. A 3/4 FEC rate<br />
operating into any location within the 39 dBW contour, using<br />
a 3.1 m antenna or better, should have adequate margin.<br />
The following data show a comparison between OFDM<br />
QPSK and OFDM 16QAM and the available margins over<br />
threshold:<br />
Modulation (OFDM) QPSK QPSK 16QAM 16QAM<br />
Coding RSV RSV RSTC RSTC<br />
Transponder BW (MHz) 5 5 5 5<br />
FEC 3/4 5/6 3/4 5/6<br />
Total Data Rate (Mbps) 6.75 7.5 13.5 15.1<br />
C/No Threshold (dB) 6.9 7.9 9.0 10.4<br />
A similar analysis is performed for existing single carrier<br />
modulation using QPSK and 8PSK.<br />
Modulation (Single Carrier) QPSK QPSK 8PSK 8PSK<br />
Coding RSV RSV RSTC RSTC<br />
Transponder BW (MHz) 5 5 5 5<br />
FEC 3/4 5/6 3/4 5/6<br />
Total Data Rate (Mbps) 5.3 5.9 9.7 11.5<br />
C/No Threshold (dB) 5.8 7.2 8.4 10.1<br />
The C/N threshold increases from 3/4 to 5/6 FEC and also when the<br />
modulation order is higher. The OFDM<br />
QPSK 3/4 threshold is about 2 dB lower than the OFDM 16QAM 3/4<br />
threshold (6.9 dB versus 9.0 dB). This is in accordance with theoretical analysis,<br />
because with 16QAM modulation the signal to noise<br />
ratio has to be better for the receiver to distinguish<br />
constellation points that are closer together than in QPSK modulation.<br />
VI. CONCLUSION<br />
The single carrier modulation currently used performs very<br />
well over satellite channels in fixed receiver environments.<br />
With mobile users, however, the channel presents multipath<br />
effects as well as Doppler shifts, and single<br />
carrier modulation does not perform as well as in fixed environments.<br />
Orthogonal Frequency Division Multiplexing, on the other<br />
hand, performs much better in multipath and frequency<br />
selective channels; it also uses the available spectrum very<br />
efficiently. This paper has explored the feasibility<br />
of using OFDM over satellite channels with high order<br />
modulation techniques such as QPSK and 16QAM and strong<br />
error correction algorithms. Furthermore, a performance<br />
comparison between currently used single carrier techniques<br />
and OFDM has been presented.<br />
REFERENCES<br />
[1] K. J. Ray Liu, Ahmed K. Sadek, Weifeng Su, and Andres Kwasinski,<br />
“Cooperative Communications and Networks,” Cambridge, 2009.<br />
[2] M. Rice, J. Slack, and B. Humphreys, “K-Band land mobile satellite<br />
channel characterization,” Int. J. Satellite Communications, Vol. 14,<br />
pp. 283-296, 1996.<br />
[3] E. Kubista, F. Perez Fontan, M. Angeles Vazquez Castro, S. Buonomo,<br />
B. R. Arbesser-Rastburg, and J. P. V. Poiares Baptista, “Ka-band<br />
propagation measurements and statistics for land mobile satellite<br />
applications,” IEEE Transactions on Vehicular Technology, Vol. 49,<br />
pp. 973-983, May 2000.<br />
[4] Wenzhen Li, Choi Look Law, V. K. Dubey, and J. T. Ong, “Ka-band<br />
land mobile satellite channel model incorporating weather effects,”<br />
IEEE Communications Letters, Vol. 5, Issue 5, pp. 194-196, May 2001.<br />
[5] C. Loo and J. S. Butterworth, “Land mobile satellite channel<br />
measurements and modeling,” Proc. IEEE, Vol. 86, pp. 1442-1463, July<br />
1998.<br />
[6] E. Lutz, D. Cygan, M. Dippold, F. Dolainsky, and W. Papke, “The land<br />
mobile satellite communication channel-Recording, statistics and<br />
channel model,” IEEE Transactions on Vehicular Technology, Vol. 40,<br />
pp. 375-384, May 1991.<br />
[7] Yuri Labrador, Masoumeh Karimi, Deng Pan, and Jerry Miller, “OFDM<br />
MIMO Space Diversity in Terrestrial Channels,” International Journal of<br />
Computer Science and Network Security (IJCSNS), Vol. 9, No. 10, pp.<br />
52-61, October 2009.<br />
[8] Simon Plass, Armin Dammann, Gerd Richter, and Martin Bossert,<br />
“Channel Correlation Properties in OFDM by using Time-Varying<br />
Cyclic Delay Diversity,” Journal of Communications, Vol. 3, No. 3, July<br />
2008.<br />
[9] Yuri Labrador, Masoumeh Karimi, Deng Pan, and Jerry Miller, “An<br />
Approach to Cooperative Satellite Communications for 4G Mobile<br />
Systems,” Journal of Communications, Vol. 4, No. 10, November 2009.<br />
[10] Oh-Soon Shin, A. M. Chan, H. T. Kung, and V. Tarokh, “Design of an<br />
OFDM Cooperative Space-Time Diversity System,” IEEE Transactions<br />
on Vehicular Technology, Vol. 56, No. 4, July 2007.
Software Rejuvenation Technique-An<br />
Improvement in Applications with Multiple<br />
Versions<br />
Zahra Rahmani Ghobadi 1 , Hassan Rashidi 2<br />
1. Zahra Rahmani Ghobadi is an MSc student in the Department of Computer<br />
Engineering, Qazvin Azad University, Qazvin, Iran (phone: 0989358224714<br />
e-mail: m.rah62@ gmail.com).<br />
2. Hassan Rashidi is an Assistant Professor in the Department of Computer<br />
Engineering, Qazvin Azad University, Qazvin, Iran (phone: 0989126772017<br />
e-mail: hrashi@gmail.com).<br />
Abstract: With the spread of software technology and modern<br />
applications, software reliability and availability have become<br />
serious concerns. Software fault tolerance techniques improve<br />
both. One of these techniques is software rejuvenation,<br />
which counteracts software aging. Software aging may lead to<br />
performance degradation, crash/hang failures, or both. In this<br />
paper, we first address this technique for an application with one<br />
software version and then extend the model to multiple versions.<br />
The numerical results show that using more software versions<br />
greatly reduces the expected downtime and improves the<br />
availability of the application.<br />
Index Terms: Software rejuvenation, Availability, Reliability<br />
I. INTRODUCTION<br />
With the increasing complexity of computer systems, the loss<br />
caused by software inefficiency is a more and more widespread<br />
problem. One way to reduce this loss is to improve the reliability<br />
of the system. At present, software fault tolerance is the most<br />
effective approach to the problem [1]. Traditional fault-tolerant<br />
techniques are passive and work in a reactive way: they act only<br />
once the system has failed. The software rejuvenation technique,<br />
in contrast, is an active technique that prevents or slows down<br />
system failures before they occur [1].<br />
When software applications execute continuously for long<br />
periods of time (scientific and analytical applications run for<br />
days or weeks, servers in client-server systems are expected to<br />
run forever), the processes corresponding to the software in<br />
execution age, that is, they slowly degrade with respect to the<br />
effective use of their system resources. The causes of process<br />
aging include memory leaks, unreleased file locks, file descriptor<br />
leaks and data corruption in the operating environment of system<br />
resources. Process aging affects the performance of the<br />
application and eventually causes the application to fail [2].<br />
The software rejuvenation technique terminates the program<br />
when its performance has declined to a certain degree and then<br />
restarts it, which cleans its internal state and restores its<br />
performance.<br />
Huang et al. (1995) introduced a continuous-time Markov<br />
process to build a two-phase software rejuvenation model that<br />
includes a healthy state, an aging (failure-probable) state, a<br />
system failure state and a rejuvenation state [8]. Using a Markov<br />
decision process, Pfening et al. (1996) proposed a software<br />
rejuvenation framework and applied it to an AT&T communication<br />
system. Garg et al. (1998) constructed a rejuvenation model of a<br />
transaction processing system based on queuing theory [7]. Dohi<br />
et al. (2000) set up a software rejuvenation model of a<br />
client/server system and adopted non-parametric statistical<br />
analysis to estimate the optimal software rejuvenation interval<br />
[8] [9]. For cluster systems, Garg et al. (1998) and Wei et al.<br />
(2004) presented a stochastic Petri net approach to analyze<br />
software rejuvenation. Vaidyanathan et al. (2001) used a<br />
stochastic reward net to model and analyze a cluster system that<br />
employed software rejuvenation [10]. Bao et al. (2005) and<br />
Vaidyanathan and Trivedi (2005) took the system workload<br />
into account when building a model to estimate resource<br />
exhaustion times [5].<br />
We extend the software rejuvenation model to multiple<br />
software versions. To assess the reliability of the application,<br />
the system availability formula is derived.<br />
Finally, numerical results are given to validate the<br />
proposed model.<br />
II. SOFTWARE REJUVENATION<br />
Software rejuvenation is a proactive fault management<br />
technique that cleans up the internal state of the<br />
system to prevent more severe crash failures<br />
in the future. It involves occasionally terminating an<br />
application or a system, cleaning its internal state and<br />
restarting it [3]. The application is unavailable during<br />
rejuvenation. Although rejuvenation may sometimes increase<br />
the downtime of an application, these downtimes are usually<br />
planned and scheduled. If care is taken to schedule rejuvenation<br />
during the idlest times of an application, the cost due to<br />
those downtimes is expected to be low. Downtime costs are<br />
the costs incurred due to the unavailability of the service<br />
during the downtime of an application [2].<br />
Let pij(t) be the transition probability function of a continuous-time<br />
Markov process and qij the transition rate. The Kolmogorov<br />
forward equation is defined as follows:<br />
$$\frac{dP_{ij}(t)}{dt} = \sum_{k=0}^{N} P_{ik}(t)\, q_{kj}, \quad i, j = 0, 1, 2, \ldots \tag{1}$$<br />
Letting P(t) be the matrix of the transition probability<br />
functions pij(t) (i, j = 0, 1, 2, …) and Q the matrix of the transition<br />
rates qij (i, j = 0, 1, 2, …), formula (1) can be expressed in<br />
matrix form as follows:<br />
$$P'(t) = P(t)\,Q \tag{2}$$<br />
A. Software Rejuvenation Model of One-Node Application<br />
First, we study the software rejuvenation model for an<br />
application with one software version. The model is based on a<br />
Markov process, as shown in Fig. 1. The system has three states: the<br />
working state 0 (denoted S0), the failure state 1 (denoted<br />
SF) and the rejuvenation state 2 (denoted SR). Initially,<br />
the application is in the working state 0. As<br />
system performance degrades over time, a failure may occur.<br />
If a system failure occurs before software rejuvenation is<br />
triggered, the application moves from the working state 0<br />
to the failure state 1 and the system recovery<br />
operation starts immediately. Otherwise, the application<br />
moves from the working state 0 to the rejuvenation<br />
state 2, where software rejuvenation is carried out. After<br />
the system repair or rejuvenation completes, the application<br />
is as good as new and returns to the<br />
working state 0. We define one cycle as the time interval from<br />
one entry into the working state to the next.<br />
According to the model described above, at any time t the<br />
application is in one of three states: up and available<br />
for service (the working state 0), recovering from a failure (the<br />
failure state 1), or undergoing software rejuvenation (the<br />
rejuvenation state 2). To formally describe the software<br />
rejuvenation model of the single-version application, a<br />
continuous-time Markov process denoted Z = (Zt; t ≥ 0) is used, where Zt<br />
represents the state of the application at time t. The transition<br />
probability function of Z is expressed as follows [6]:<br />
$$P_{ij}(t) = P(Z_t = j \mid Z_0 = i), \quad \forall i, j \in \Omega,\ t \ge 0 \tag{3}$$<br />
where Ω = {0, 1, 2} is the state space.<br />
For the software rejuvenation model in Fig. 1, λ1, μ1, r1 and<br />
R1 denote the failure rate from the working state to the<br />
failure state, the transition rate that triggers software<br />
rejuvenation, the rejuvenation rate from the rejuvenation<br />
state to the working state and the recovery rate from the<br />
failure state to the working state, respectively. Let Q be the<br />
transition rate matrix. According to the state<br />
transition relationship of the single-version application, the<br />
transition rate matrix of the continuous-time Markov process<br />
Z can be easily derived as:<br />
$$Q = \begin{pmatrix} -(\mu_1 + \lambda_1) & \lambda_1 & \mu_1 \\ R_1 & -R_1 & 0 \\ r_1 & 0 & -r_1 \end{pmatrix} \tag{4}$$<br />
Let P(t) be the matrix of the transition probability functions<br />
pij(t) (∀i, j ∈ Ω). According to the Kolmogorov forward equation (Eq. 1),<br />
the transition probability matrix P(t) satisfies:<br />
$$P'(t) = P(t)\,Q, \qquad P(0) = I \tag{5}$$<br />
where I is the identity matrix.<br />
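As an illustration of Eq. 5, the following minimal sketch integrates the forward equation for the three-state matrix of Eq. 4 with explicit Euler steps. The function name, the step size and the time horizon are assumptions made for this example, and the rate values are borrowed from Table I; it is not the procedure used in the paper.<br />

```python
def integrate_forward(q, t_end=200.0, dt=0.01):
    """Integrate P'(t) = P(t) Q from P(0) = I with explicit Euler steps."""
    n = len(q)
    # P(0) = I: the chain starts in state i with probability one.
    p = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(int(round(t_end / dt))):
        # One Euler step: P <- P + dt * (P Q).
        pq = [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
        p = [[p[i][j] + dt * pq[i][j] for j in range(n)] for i in range(n)]
    return p


# Rates borrowed from Table I (failure, trigger, rejuvenation, recovery).
lam, mu, r, R = 0.005, 0.002, 1.0, 0.1
Q = [
    [-(mu + lam), lam, mu],   # state 0: working
    [R, -R, 0.0],             # state 1: failure
    [r, 0.0, -r],             # state 2: rejuvenation
]
P = integrate_forward(Q)
for row in P:
    print([round(x, 6) for x in row])
```

After a long enough horizon the three rows should agree with one another, which is the numerical counterpart of the statement that the limit of Pij(t) does not depend on the initial state i.<br />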
Let Pj, j ∈ Ω, be the steady-state probability that the single-version<br />
application is in state j. According to the limit<br />
distribution theorem, Pj, j ∈ Ω, is given by:<br />
$$P_j = \lim_{t \to \infty} P_{ij}(t), \quad \forall i, j \in \Omega \tag{6}$$<br />
Substituting Eq. 4 and Eq. 6 into Eq. 5 yields the following<br />
system of equations:<br />
$$\begin{aligned}
-(\mu_1 + \lambda_1) P_0 + R_1 P_1 + r_1 P_2 &= 0 \\
-R_1 P_1 + \lambda_1 P_0 &= 0 \\
-r_1 P_2 + \mu_1 P_0 &= 0 \\
\sum_{i=0}^{2} P_i &= 1
\end{aligned} \tag{7}$$<br />
The probabilities Pi, i = 0, 1, 2, are obtained by solving Eq. 7.<br />
The application is available for service requests in the working<br />
state 0 and unavailable in the failure state 1 and the<br />
rejuvenation state 2. The system availability of the<br />
single-version application is therefore given by:<br />
$$P_{A1} = P_0 \tag{8}$$<br />
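Eq. 7 can be solved by hand: the second and third equations give P1 = (λ1/R1) P0 and P2 = (μ1/r1) P0, and the normalization then gives P0 = 1 / (1 + λ1/R1 + μ1/r1). The short sketch below evaluates this closed form for the Table I values; the function name is hypothetical and chosen only for this example.<br />

```python
def single_version_steady_state(lam, mu, r, R):
    """Return (P0, P1, P2) solving Eq. 7 for the three-state model."""
    # Balance equations give P1 = (lam / R) * P0 and P2 = (mu / r) * P0;
    # the normalisation P0 + P1 + P2 = 1 then fixes P0.
    p0 = 1.0 / (1.0 + lam / R + mu / r)
    return p0, (lam / R) * p0, (mu / r) * p0


# Table I values: failure, trigger, rejuvenation and recovery rates.
p0, p1, p2 = single_version_steady_state(lam=0.005, mu=0.002, r=1.0, R=0.1)
availability = p0            # Eq. 8
unavailability = p1 + p2     # complement of Eq. 8
print(f"P_A1 = {availability:.5f}, P_U1 = {unavailability:.5f}")
```

With these values the availability is 1/1.052, roughly 0.9506, so the single-version application is unavailable about 4.9% of the time.<br />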
Fig. 1. Software rejuvenation model of single application (states SR(2), S0(0) and SF(1); transition rates λ1, μ1, r1 and R1).<br />
B. Software Rejuvenation Model of Two-Node Application<br />
We extend the software rejuvenation model of the single<br />
application to a two-dimensional state space and derive the software<br />
rejuvenation model of the two-node application shown in Fig. 2.<br />
The state of the application is denoted by a 2-tuple from the set<br />
S = {(i, j) │ i, j ∈ {H, F, R}}, where i is the<br />
state of the first version of the application, j is the state of the<br />
second version, and H, F and R stand for the working, failure<br />
and rejuvenation states. For the first version of the<br />
application, λ1, μ1, r1 and R1 denote the failure rate from the<br />
working state to the failure state, the transition rate that<br />
triggers software rejuvenation, the rejuvenation rate from the<br />
rejuvenation state to the working state and the recovery rate<br />
from the failure state to the working state, respectively.<br />
Correspondingly, for the second version of the application, λ2, μ2,<br />
r2 and R2 denote the failure rate, the transition rate that triggers<br />
software rejuvenation, the rejuvenation rate and the recovery<br />
rate, respectively.<br />
For simplicity, the model is restricted by the following<br />
assumptions:<br />
Assumption 1: Software rejuvenation may not be carried out<br />
on both versions concurrently.<br />
Assumption 2: At any time t, only one version can be in the<br />
rejuvenation state.<br />
Assumption 3: If one version is in the failure state, the other<br />
version cannot move to the rejuvenation state.<br />
Assumption 4: The rejuvenation rate from the<br />
rejuvenation state to the working state is higher than the<br />
recovery rate from the failure state to the working<br />
state.<br />
It is also assumed that Zt is the state of the application at time t and<br />
that Ω′ = {0, 1, 2, …, 7} is the state space. Similarly, we use a<br />
continuous-time Markov process, denoted Z = (Zt; t ≥ 0), to<br />
describe the software rejuvenation model of the two-node<br />
application. The transition probability function of Z is<br />
defined as in Eq. 3 with Ω′ in place of Ω, the transition rate<br />
matrix Q is given in Eq. 10, and Pj, j ∈ Ω′, is given by [4]:<br />
$$P_j = \lim_{t \to \infty} P_{ij}(t), \quad \forall i, j \in \Omega' \tag{9}$$<br />
Fig. 2. Software rejuvenation model of two applications (states: 0 (H,H), 1 (F,H), 2 (H,F), 3 (F,F), 4 (R,H), 5 (H,R), 6 (R,F), 7 (F,R)).<br />
$$Q = \begin{pmatrix}
-(\lambda_1 + \lambda_2 + \mu_1 + \mu_2) & \lambda_1 & \lambda_2 & 0 & \mu_1 & \mu_2 & 0 & 0 \\
R_1 & -(R_1 + \lambda_2) & 0 & \lambda_2 & 0 & 0 & 0 & 0 \\
R_2 & 0 & -(R_2 + \lambda_1) & \lambda_1 & 0 & 0 & 0 & 0 \\
0 & R_2 & R_1 & -(R_1 + R_2) & 0 & 0 & 0 & 0 \\
r_1 & 0 & 0 & 0 & -(r_1 + \lambda_2) & 0 & \lambda_2 & 0 \\
r_2 & 0 & 0 & 0 & 0 & -(r_2 + \lambda_1) & 0 & \lambda_1 \\
0 & 0 & r_1 & 0 & R_2 & 0 & -(r_1 + R_2) & 0 \\
0 & r_2 & 0 & 0 & 0 & R_1 & 0 & -(r_2 + R_1)
\end{pmatrix} \tag{10}$$<br />
Correspondingly, the transition probability matrix P(t) also<br />
satisfies the condition in Eq. 5. Substituting Eq. 9 and Eq. 10 into<br />
Eq. 5 yields Eq. 11 [5]:<br />
$$\begin{aligned}
-(\mu_1 + \mu_2 + \lambda_1 + \lambda_2) P_0 + R_1 P_1 + R_2 P_2 + r_1 P_4 + r_2 P_5 &= 0 \\
-(R_1 + \lambda_2) P_1 + \lambda_1 P_0 + R_2 P_3 + r_2 P_7 &= 0 \\
-(R_2 + \lambda_1) P_2 + \lambda_2 P_0 + R_1 P_3 + r_1 P_6 &= 0 \\
-(R_1 + R_2) P_3 + \lambda_2 P_1 + \lambda_1 P_2 &= 0 \\
-(r_1 + \lambda_2) P_4 + \mu_1 P_0 + R_2 P_6 &= 0 \\
-(r_2 + \lambda_1) P_5 + \mu_2 P_0 + R_1 P_7 &= 0 \\
-(r_1 + R_2) P_6 + \lambda_2 P_4 &= 0 \\
-(r_2 + R_1) P_7 + \lambda_1 P_5 &= 0 \\
\sum_{i=0}^{7} P_i &= 1
\end{aligned} \tag{11}$$<br />
Solving the above equations gives the values of<br />
Pi, i = 0, 1, 2, …, 7. According to the rejuvenation model in Fig. 2,<br />
the application is unavailable in the states (F, F), (R, F) and<br />
(F, R). The availability of the two-node application is therefore<br />
given by:<br />
$$P_{A2} = 1 - (P_3 + P_6 + P_7) = P_0 + P_1 + P_2 + P_4 + P_5 \tag{12}$$<br />
Fig. 3. Software rejuvenation model of three applications (states: 0 (H,H,H), 1 (H,H,F), 2 (H,F,H), 3 (F,H,H), 4 (H,F,F), 5 (F,H,F), 6 (F,F,H), 7 (F,F,F), 8 (H,H,R), 9 (H,R,H), 10 (R,H,H), 11 (H,F,R), 12 (H,R,F), 13 (F,H,R), 14 (F,R,H), 15 (R,F,H), 16 (R,H,F), 17 (R,F,F), 18 (F,R,F), 19 (F,F,R); transition rates λ1 to λ3, μ1 to μ3, r1 to r3 and R1 to R3).<br />
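The two-node model of Eqs. 10 to 12 can be checked numerically. The sketch below builds the 8 × 8 rate matrix of Eq. 10, solves the balance equations of Eq. 11 with a plain Gauss-Jordan elimination and evaluates the unavailability P3 + P6 + P7. All function names are hypothetical, the solver is a stand-in chosen for self-containment (the paper does not say how its equations were solved), and the rates are the Table I values.<br />

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                for k in range(col, n + 1):
                    m[row][k] -= factor * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def steady_state(q):
    """Solve P Q = 0 together with sum(P) = 1 for the row vector P."""
    n = len(q)
    a = [[q[i][j] for i in range(n)] for j in range(n)]  # transpose of Q
    a[-1] = [1.0] * n          # replace one redundant equation by sum(P) = 1
    b = [0.0] * (n - 1) + [1.0]
    return solve_linear(a, b)


def two_version_generator(lam1, lam2, mu1, mu2, r1, r2, R1, R2):
    """Build the 8 x 8 rate matrix of Eq. 10 (state numbering of Fig. 2)."""
    q = [[0.0] * 8 for _ in range(8)]
    q[0][1], q[0][2], q[0][4], q[0][5] = lam1, lam2, mu1, mu2   # from (H,H)
    q[1][0], q[1][3] = R1, lam2                                  # from (F,H)
    q[2][0], q[2][3] = R2, lam1                                  # from (H,F)
    q[3][1], q[3][2] = R2, R1                                    # from (F,F)
    q[4][0], q[4][6] = r1, lam2                                  # from (R,H)
    q[5][0], q[5][7] = r2, lam1                                  # from (H,R)
    q[6][2], q[6][4] = r1, R2                                    # from (R,F)
    q[7][1], q[7][5] = r2, R1                                    # from (F,R)
    for i in range(8):
        q[i][i] = -sum(q[i])   # diagonal entry: minus the total exit rate
    return q


# Table I values, identical for both versions.
Q2 = two_version_generator(0.005, 0.005, 0.002, 0.002, 1.0, 1.0, 0.1, 0.1)
P = steady_state(Q2)
PU2 = P[3] + P[6] + P[7]       # unavailability, complement of Eq. 12
PA2 = 1.0 - PU2
print(f"P_A2 = {PA2:.5f}, P_U2 = {PU2:.5f}")
```

For these rates the two-version unavailability should come out well below the single-version value of about 0.049, which is consistent with the trend described for Fig. 4.<br />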
C. Software Rejuvenation Model of Three-Node Application<br />
We extend this work to a three-dimensional state space and<br />
obtain a lower unavailability with the software rejuvenation model of<br />
the three-node application shown in Fig. 3. Q is the<br />
transition rate matrix given in Eq. 14.<br />
Solving the resulting equations gives the values of<br />
Pi, i = 0, 1, 2, …, 19. According to the rejuvenation model in<br />
Fig. 3, the application is unavailable in the states<br />
(F,F,F), (R,F,F), (F,R,F) and (F,F,R). The system<br />
availability of the three-node application is therefore given by:<br />
$$P_{A3} = 1 - (P_7 + P_{17} + P_{18} + P_{19}) \tag{13}$$<br />
A λ1 λ2 λ3 0 0 0 0 μ1 μ2 μ3 0 0 0 0 0 0 0 0 0<br />
R1 B 0 0 λ2 λ3 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
R2 0 C 0 λ1 0 λ3 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
R3 0 0 D 0 λ1 λ2 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
0 R2 R1 0 E 0 0 λ3 0 0 0 0 0 0 0 0 0 0 0 0<br />
0 R3 0 R1 0 F 0 λ2 0 0 0 0 0 0 0 0 0 0 0 0<br />
0 0 R3 R2 0 0 G λ1 0 0 0 0 0 0 0 0 0 0 0 0<br />
0 0 0 0 R3 R2 R1 H 0 0 0 0 0 0 0 0 0 0 0 0<br />
r1 0 0 0 0 0 0 0 I 0 0 λ2 λ3 0 0 0 0 0 0 0<br />
r2 0 0 0 0 0 0 0 0 J 0 0 0 λ3 λ1 0 0 0 0 0<br />
r3 0 0 0 0 0 0 0 0 0 K 0 0 0 0 λ1 λ2 0 0 0<br />
0 0 r1 0 0 0 0 0 0 0 0 L 0 0 0 0 0 λ3 0 0<br />
0 0 0 r1 0 0 0 0 0 0 0 0 M 0 0 0 0 λ2 0 0<br />
0 0 0 r2 0 0 0 0 0 0 0 0 0 N 0 0 0 0 λ1 0<br />
0 0 0 r2 0 0 0 0 0 0 0 0 0 0 O 0 0 0 λ3 0<br />
0 0 r3 0 0 0 0 0 0 0 0 0 0 0 0 P 0 0 0 λ2<br />
0 r3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Q 0 0 λ1<br />
0 0 0 0 0 0 r1 0 0 0 0 0 0 0 0 0 0 R 0 0<br />
0 0 0 0 0 r2 0 0 0 0 0 0 0 0 0 0 0 0 S 0<br />
0 0 0 0 r3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 T<br />
(14)<br />
III. NUMERICAL RESULTS AND ANALYSIS<br />
To obtain a reliability measure of the application, we perform<br />
numerical experiments using system unavailability as the<br />
evaluation indicator.<br />
The unavailability of the single application PU1, the two-node<br />
application PU2 and the three-node application PU3 can be<br />
evaluated as follows:<br />
$$\begin{aligned}
P_{U1} &= 1 - P_{A1} = P_1 + P_2 \\
P_{U2} &= 1 - P_{A2} = P_3 + P_6 + P_7 \\
P_{U3} &= 1 - P_{A3} = P_7 + P_{17} + P_{18} + P_{19}
\end{aligned}$$<br />
TABLE I<br />
PARAMETER VALUES USED IN THE EXPERIMENT<br />
r1=r2=…=rn | R1=R2=…=Rn | λ1=λ2=…=λn | µ1=µ2=…=µn<br />
1 | 0.1 | 0.005 | 0.002<br />
The default parameter values of the software<br />
rejuvenation model are given in Table I: the<br />
rejuvenation rate is 1, the recovery rate is 0.1, the failure rate is<br />
0.005 and the transition rate that triggers software rejuvenation is<br />
0.002. All parameter values are chosen from experimental<br />
experience for demonstration purposes. To simplify the<br />
numerical experiment, we assume that the failure rate, recovery<br />
rate and rejuvenation rate are equal for all versions.<br />
Figure 4 shows the system unavailability versus the number of<br />
versions. The number of versions strongly<br />
influences system reliability: as the number of versions<br />
increases, the system unavailability drops rapidly and<br />
approaches a steady value.<br />
IV. CONCLUSION<br />
In this paper, we presented the software rejuvenation structure<br />
and set up software rejuvenation models in one-, two- and<br />
three-dimensional state spaces for one application. In these models,<br />
the system availability formula is derived from a continuous-time<br />
Markov process. The numerical results show<br />
that the system unavailability decreases substantially as the<br />
number of versions increases.<br />
Fig. 4. The system unavailability versus the number of versions in the application<br />
with multiple versions.
REFERENCES<br />
[1] S. Yu, CH. Qi, H. Xin, “Positive software fault-tolerate technique based<br />
on time policy”, Journal of Communication and Computer, ISSN 1548-7709,<br />
Volume 4, No. 8 (Serial No. 33), 2007.<br />
[2] Y. Huang, C. Kintala, N. Kolettis, and N. D. Fulton, “Software<br />
Rejuvenation: Analysis, Module and Applications”, in Proc. 25th<br />
Symposium on Fault Tolerant Computer Systems, pp. 381-390, 1995.<br />
[3] T. Thein, J. Sou Park, “Availability Analysis of<br />
Application Servers Using Software Rejuvenation and Virtualization”,<br />
Journal of Computer Science and Technology 24(2):<br />
339-346, Mar. 2009.<br />
[4] S. Pfening, S. Garg, A. Puliafito, M. Telek and K. S. Trivedi, “Optimal<br />
Rejuvenation for Tolerating Soft Failure”, Performance Evaluation,<br />
27/28, pp. 491-506, 1996.<br />
[5] Q. Yong, M. Haining, H. Di, Ch. Ying, “A Study on Software<br />
Rejuvenation Model of Application Server Cluster in Two-Dimension<br />
State Space Using Markov Process”, Information Technology Journal<br />
7(1): 98-104, 2008.<br />
[6] T. Dohi, S. Trivedi, “Statistical Non-Parametric Algorithms to Estimate<br />
the Optimal Software Rejuvenation Schedule”, Dept. of Electrical and<br />
Computer Engineering, Duke University, Durham, NC 27708-0294,<br />
USA, 2000.<br />
[7] W. Xie, Y. Hong, K. Trivedi, “Analysis of a two-level software<br />
rejuvenation policy”, Reliability Engineering and System Safety 87<br />
(2005) 13-22.<br />
[8] Y. Huang, C. Kintala, N. Kolettis, N. D. Fulton, “Software rejuvenation:<br />
analysis, module and application”, in: Proc. of 25th Symposium on Fault<br />
Tolerant Computing, June 1995.<br />
[9] T. Dohi, K. Goseva-Popstojanova, K. S. Trivedi, “Statistical non-parametric<br />
algorithms to estimate the optimal software rejuvenation<br />
schedule”, in: Proceedings of the 2000 Pacific Rim International<br />
Symposium on Dependable Computing, December 2000.<br />
[10] K. Vaidyanathan, R. E. Harper, S. W. Hunter, K. S. Trivedi, “Analysis and<br />
implementation of software rejuvenation in cluster systems”, ACM<br />
SIGMETRICS Performance Evaluation Review, in: Proceedings of the<br />
2001 ACM SIGMETRICS International Conference on Measurement<br />
and Modeling of Computer Systems, vol. 29 (1), June 2001.
A New Attitude based on Real Time Operating<br />
System for NoC in Hotspot Traffic Model<br />
Seyyed Amir Asghari, Hossein Pedram and Hassan Taheri<br />
Abstract: The reduction in the size of transistors has increased<br />
the number of transistors on a chip to several<br />
billion. New techniques are therefore needed<br />
to manage this large quantity of transistors on a single chip.<br />
Network on Chip (NoC) is an implementation technique that addresses<br />
this problem. However, NoC management is a challenging job, and<br />
communication management needs regular scheduling and<br />
configuration. One approach to NoC management is to<br />
use a Real Time Operating System (RTOS) for scheduling, task<br />
creation, dynamic assignment of priorities to tasks and<br />
message passing. In this paper, the MicroC/OS-II RTOS is<br />
therefore used. This RTOS is ported to a Motorola ColdFire<br />
microprocessor, which is located in the core of a node of a mesh<br />
topology based NoC. The traffic model in this paper is hotspot.<br />
Index Terms: MicroC/OS-II, Motorola ColdFire<br />
Microprocessor, Network on Chip, Real Time Operating<br />
System.<br />
I. INTRODUCTION<br />
The System on Chip (SoC) can include different<br />
components such as a processor, an I/O unit and various types<br />
of memories. Each of these components can have a different<br />
communication protocol [1].<br />
Generally, the processing elements in NoC are interconnected<br />
through ports, whereas in a multiprocessor SoC (MPSoC)<br />
with numerous processing elements these ports are expected<br />
to become bottlenecks in terms of latency, scalability and<br />
energy consumption.<br />
Therefore, the idea of a NoC, which consists of routers<br />
connected by links, was introduced. However,<br />
communication management in a NoC is a challenging job, so<br />
an RTOS is put in charge of managing it.<br />
This OS can be ported to the microprocessor of a NoC node.<br />
In this paper, the MicroC/OS-II RTOS is ported<br />
to the central node of a mesh topology NoC operating under the<br />
hotspot traffic model. However, the OS can be ported to all nodes.<br />
The idea of applying NoCs has also been used in previous<br />
works such as [9].<br />
In this paper, MicroC/OS-II is used in an innovative way.<br />
In contrast with OSs such as Windows and Linux, this OS is not<br />
monolithic and application programs do not affect the kernel.<br />
Its kernel also has few lines of code, which has a favorable<br />
impact on power consumption compared with conventional OSs.<br />
As NoCs are power constrained, this is considered an<br />
advantageous feature [11].<br />
In the 2nd part of this paper, the NoC structure and its components<br />
are introduced.<br />
In the 3rd part, the MicroC/OS-II RTOS and its advantageous features<br />
are introduced.<br />
In the 4th part, different types of traffic models are explained.<br />
The specific traffic model considered here is the<br />
hotspot traffic model.<br />
In the 5th part, Motorola ColdFire processors are introduced. The<br />
MCF5484 ColdFire processor is used in the implementation of the<br />
OS-based NoC.<br />
In the 6th part, microprocessor programming and debugging<br />
tools are introduced.<br />
In the 7th part, two approaches, one using an OS and the<br />
other without an OS, are compared and the<br />
advantages of the OS-based NoC are presented. The PrioRout<br />
routing algorithm is also introduced in this section.<br />
In the 8th part, the implementation is presented, and<br />
the last part gives the final conclusion.<br />
II. NOC STRUCTURE<br />
A NoC is formed of routers and links. The IP blocks<br />
are connected to each other by means of network<br />
interfaces (NIs), and the routers communicate with each other<br />
over links. A router determines the paths of packets in the network.<br />
It consists of buffers, a routing function<br />
unit, a selection function unit and a switch that forwards<br />
packets toward their destinations [2] [10].<br />
A network interface adapts the communication protocol of the IP block<br />
to the packet transmission protocol of the router. Each<br />
network interface can connect several IP blocks to the routers.
Figure 1. A router with its components<br />
III. MICROC/OS-II RTOS<br />
MicroC/OS-II is an RTOS intended for embedded<br />
applications. If a toolchain (a compiler, an assembler<br />
and a linker) is available, the OS can be added to it.<br />
MicroC/OS-II has a fully preemptive real-time kernel, which<br />
means that the OS always runs the highest-priority task that is<br />
ready to run. Many traditional kernels are also preemptive,<br />
but MicroC/OS-II performs much better than they do.<br />
OSs with a monolithic kernel (such as Windows and<br />
Linux) consist of millions of lines of code, so they are<br />
difficult to analyze when a problem occurs and are<br />
almost never bug free.<br />
The kernel of MicroC/OS-II has only 5000 lines of code, so<br />
it can be checked thoroughly enough to be considered bug free<br />
[3].<br />
A. Multitasking feature<br />
MicroC/OS-II can manage up to 64 tasks. However, it<br />
reserves the four highest-priority tasks and the<br />
four lowest-priority tasks for its own use, which leaves 56<br />
tasks for the application.<br />
B. Task creation<br />
To use the task management capability of MicroC/OS-II, a task<br />
must first be created. A task can be created with one of<br />
these functions:<br />
• OSTaskCreate()<br />
• OSTaskCreateExt()<br />
OSTaskCreateExt() is an extended version of<br />
OSTaskCreate() with some extra features. For<br />
multitasking, at least one task must be created. A task<br />
cannot be created from within an Interrupt Service Routine (ISR).<br />
Figure 2 shows the prototype of the OSTaskCreate()<br />
function:<br />
INT8U OSTaskCreate (void (*task)(void *pd),<br />
void *pdata, OS_STK *ptos, INT8U prio)<br />
Figure 2. OSTaskCreate function<br />
As shown above, the function needs four arguments:<br />
task: a pointer to the task code.<br />
pdata: a pointer to an argument that is passed to<br />
the task when it first starts running.<br />
ptos: a pointer to the top of the stack<br />
assigned to the task.<br />
prio: the priority of the task.<br />
IV. TRAFFIC MODELS<br />
The traffic model is one of the important parameters in<br />
evaluating the latency of interconnection networks.<br />
Traffic models are derived from the application<br />
programs that run on the machine, and different<br />
applications call for different models. A traffic model is<br />
defined by three parameters [4]:<br />
• The time at which messages enter the network<br />
• Message length<br />
• Address distribution type<br />
A. The uniform traffic model<br />
The uniform traffic model is the simplest traffic model and is used<br />
in most evaluations. In this model, each node sends messages<br />
to every other node in the network with equal probability. For<br />
example, in a 6 × 6 mesh topology, each node sends a message<br />
to each of the other 35 nodes with a probability of 1/35, about 2.86%.<br />
All source and destination nodes are selected with equal<br />
probability, and the source and destination nodes of<br />
each message are selected independently of other messages [4].<br />
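The uniform selection rule can be sketched as follows. The function name, the linear node numbering 0 to 35 and the choice of node 7 as the source are assumptions made for this example only; the fixed seed just makes the run repeatable.<br />

```python
import random


def uniform_destination(source, num_nodes, rng):
    """Pick a destination uniformly among all nodes except the source."""
    dest = rng.randrange(num_nodes - 1)
    # Skip over the source so every other node has probability 1/(num_nodes-1).
    return dest if dest < source else dest + 1


NUM_NODES = 6 * 6                       # 6 x 6 mesh
print(f"per-destination probability: {1 / (NUM_NODES - 1):.4%}")

rng = random.Random(0)                  # fixed seed, for repeatability only
counts = [0] * NUM_NODES
for _ in range(35_000):
    counts[uniform_destination(7, NUM_NODES, rng)] += 1
# The source never sends to itself; the other 35 nodes share the messages.
print(counts[7], sum(counts))
```

Each of the 35 possible destinations should receive close to 1,000 of the 35,000 messages, while the source itself receives none.<br />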
B. Hotspot traffic model<br />
In the hotspot traffic model, more messages are sent to one special node, the hot node, than to the other nodes. Usually a single node is considered the hot node. Because some of the packets created in the network are sent to this spot, the traffic around this node is heavier than elsewhere.<br />
Equalizing protocols and OS functions are examples of what produces this kind of traffic. The most strongly colored node in Figure 3 is the hot node, and the traffic congestion around it is clearly visible.<br />
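A hotspot destination generator might look like the following sketch. The parameter `hot_percent` (the share of messages aimed at the hot node) and the function name are assumptions made for illustration; the paper does not state the share it used. Messages that are not sent to the hot node fall back to the uniform choice.<br />

```c
#include <stdlib.h>

/* Pick a destination for a message created at node src in an n-node network.
 * hot_percent (0..100) is a hypothetical parameter: the percentage of
 * messages that are directed at hot_node; the rest are spread uniformly. */
int hotspot_destination(int src, int n, int hot_node, int hot_percent)
{
    if (src != hot_node && (rand() % 100) < hot_percent) {
        return hot_node;                 /* hotspot share of the traffic */
    }
    int d = rand() % (n - 1);            /* uniform over the other nodes */
    return (d >= src) ? d + 1 : d;       /* never return the source      */
}
```

In a 3 × 3 mesh with row-major numbering, the central node would be node 4.<br />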
Figure 3. Hotspot traffic model<br />
C. Permutation traffic model<br />
The permutation traffic model is another traffic model; many parallel programs, such as FFT, matrix problems, and fault-tolerant routing algorithms, behave like it.<br />
In this model, the destination address is obtained by applying a permutation function to the source address, so for each source address there is always a destination address. Bit reversal, first (second) matrix transpose, shuffle, and butterfly traffic models are some examples of the permutation model. As an instance, consider the matrix transpose traffic model: if M and N are the dimension sizes of the 2-D network and (i, j) is the source node address, the destination address in the first matrix transpose is produced as follows:<br />
$$(i, j) \rightarrow (M \times N - 1 - j,\; M \times N - 1 - i) \qquad (1)$$<br />
The destination address in the second matrix transpose is produced as follows:<br />
$$(i, j) \rightarrow (j, i) \qquad (2)$$<br />
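Two of the permutation functions named above are simple enough to sketch in C. The `Node` type and function names are hypothetical; `transpose2` follows equation (2), and `bit_reverse` is the usual bit-reversal permutation on a node address that is assumed to be `bits` bits wide.<br />

```c
/* Node address in a 2-D network (illustrative type). */
typedef struct { int i; int j; } Node;

/* Second matrix transpose, equation (2): (i, j) -> (j, i). */
Node transpose2(Node src)
{
    Node dst = { src.j, src.i };
    return dst;
}

/* Bit-reversal permutation: the destination address is the source address
 * with its lowest `bits` bits written in reverse order. */
unsigned bit_reverse(unsigned addr, unsigned bits)
{
    unsigned r = 0;
    for (unsigned b = 0; b < bits; ++b) {
        r = (r << 1) | (addr & 1u);   /* move the lowest bit of addr into r */
        addr >>= 1;
    }
    return r;
}
```

Because each function is a bijection on the address space, every source maps to one destination and no two sources share a destination.<br />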
D. Local Traffic model<br />
The local traffic model resembles the behavior of application programs. In this model, each node sends a specified share of the messages it creates to its neighbors. The number of neighbors depends on the distance within which nodes count as neighbors (called the neighbor radius). Radius one is shown in Figure 4, in which the black nodes are the neighbors of the node.<br />
Figure 4. Local traffic model<br />
In all of the traffic models explained above, some percentage of the messages is distributed according to the permutation, sent locally, or sent to the hotspot, and the remaining messages are distributed in another way, which is usually uniform.<br />
V. COLDFIRE MICROPROCESSOR INTRODUCTION<br />
Motorola is one of the pioneers in producing 8-, 16-, and 32-bit microprocessors and microcontrollers. The ColdFire microprocessor family is the most famous and successful product of the company. These processors have an M68000-based architecture and are suitable for use in real-time systems. For the purposes of this paper, the MCF5484 is used.<br />
VI. BDM MODULE AS A DEBUGGING AND PROGRAMMING<br />
TOOL<br />
Figure 5 shows the interface of this module with the processor core and its other interfaces. As can be seen, the debug module is connected to the main bus of the microprocessor, so in some cases it can work in parallel with the ColdFire CPU core.<br />
Figure 5. BDM interfaces<br />
The capabilities of this module are divided into three groups:
A. Real time trace support<br />
The module can dynamically determine the execution path, which is useful for debugging. ColdFire can present 8 bits of parallel data to an emulator; this data shows the microprocessor status and memory data.<br />
B. Background Debug Mode (BDM)<br />
This capability provides low-level debugging for ColdFire. In this mode, the memory can be accessed without stopping the microprocessor, but changing register values requires halting the microprocessor.<br />
C. Real time Debug Support<br />
In this mode, a debug interrupt routine quickly saves the register values and variable data, and the system then returns to the normal operation of the main program.<br />
BDM mode is useful for the following reasons:<br />
• BDM is always accessible for debugging and firmware upgrading<br />
• It is used for programming external flash<br />
• It provides complete control of the microprocessor and thus of the whole system.<br />
Because of these features, the microprocessor can be debugged with the same tools that are used to program it.<br />
Most BDM commands do not halt the microprocessor and can run concurrently with a program. The conditions that do stop the microprocessor are as follows:<br />
• Fault occurrence in the BDM system<br />
• Breakpoints<br />
• Halt command (execution can be restarted with 'Go' from BDM)<br />
VII. PERFORMANCE COMPARISON IN TWO CASES: WITH AND WITHOUT AN OS<br />
In this section, we compare two different approaches in a mesh-topology-based NoC. For this comparison, we use a 3 × 3 mesh topology with the hotspot traffic model. Using an OS in topologies with a limited number of nodes is worthwhile if the communication among nodes is complicated or the number of defined tasks is large. In the approach without an OS, the traffic passing through the central node of the hotspot traffic model is heavy. As a result, packets may become congested when input packets are assigned to outputs. To remove this problem, virtual channels are used. However, virtual channels add considerable overhead: each added channel increases the power consumption of this approach.<br />
If virtual channels are used, the router needs a MUX and a DEMUX for the selection of packets. Figure 6 shows how packets are placed in virtual channels and how a packet is selected from a virtual channel.<br />
Figure 6. Virtual channel<br />
As can be seen, this approach requires components such as a MUX, a DEMUX, and buffers. These components introduce complications such as buffer management and packet selection from buffers. In the approach based on the OS, we define tasks based on the I/O ports (the Local port is neglected). As a result there are four tasks: North, South, East, and West. The OS assigns one task priority to each port. Based on the assignment of priorities to these tasks, we can easily manage the routing of a packet from an input port to an output port.<br />
The OS is responsible for scheduling and task management. Priority assignment is programmed as follows: whenever the desired output port is busy, a free port is selected according to PrioRout.<br />
A. Deterministic:<br />
The execution time of all MicroC/OS-II functions and services is<br />
deterministic. This means that you can always know how much<br />
time MicroC/OS-II will take to execute a function or a service.<br />
Furthermore, except for one service, execution time of all<br />
MicroC/OS-II services does not depend on the number of tasks<br />
running in your application.<br />
B. Task stacks:<br />
Each task requires its own stack. However, MicroC/OS-II<br />
allows each task to have a different stack size. This allows you<br />
to reduce the amount of RAM needed in your application.<br />
With MicroC/OS-II's stack checking feature, you can<br />
determine exactly how much stack space each task actually<br />
requires.<br />
C. Services:<br />
MicroC/OS-II provides a number of system services such as<br />
mailboxes, queues, semaphores, fixed-sized memory<br />
partitions, time related functions, etc.<br />
D. Interrupt Management:<br />
Interrupts can suspend the execution of a task and, if a higher<br />
priority task is awakened as a result of the interrupt, the<br />
highest priority task will run as soon as all nested interrupts<br />
complete. Interrupts can be nested up to 255 levels deep.
E. Critical section of code<br />
A critical section of code, or critical section for short, is code that must be atomic and run as one indivisible block. Code placed in such a section is therefore uninterruptible. To ensure this, all interrupts are disabled before the critical section runs and are enabled again after it.<br />
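The pattern can be sketched as follows. MicroC/OS-II provides the macros OS_ENTER_CRITICAL() and OS_EXIT_CRITICAL() for this purpose; in the sketch below they are replaced by stand-ins that only toggle a flag, so that the example is self-contained and does not need the kernel headers. The `port_busy` array and `try_claim_port` are hypothetical names chosen for illustration, not code from the paper.<br />

```c
/* Stand-ins for the MicroC/OS-II macros of the same name. In the real
 * kernel they disable and re-enable CPU interrupts; here they only toggle
 * a flag so that the sketch compiles without the kernel. */
static volatile int interrupts_enabled = 1;
#define OS_ENTER_CRITICAL() (interrupts_enabled = 0)
#define OS_EXIT_CRITICAL()  (interrupts_enabled = 1)

/* Hypothetical shared state: one busy flag per output port. */
static volatile int port_busy[4];

/* Test and set a port's busy flag as one indivisible step.
 * Returns 1 if the port was free and is now claimed, 0 if it was busy. */
int try_claim_port(int port)
{
    int claimed = 0;
    OS_ENTER_CRITICAL();              /* no interrupt can split test and set */
    if (!port_busy[port]) {
        port_busy[port] = 1;
        claimed = 1;
    }
    OS_EXIT_CRITICAL();               /* interrupts are enabled again */
    return claimed;
}
```

Keeping the protected region this short matters, because interrupt latency grows with the time spent between the two macros.<br />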
VIII. PRIOROUT ROUTING ALGORITHM<br />
In this algorithm, an input packet chooses an output port based on its input port and its destination. Figure 7 shows a 3 × 3 mesh topology of a NoC. In this topology, the OS has been ported to the router that is marked with a special color.<br />
Figure 7. Mesh topology based NoC<br />
Figure 8. The Central router ports<br />
The number of router ports depends on the router's location. For example, the router situated in the north-east has three ports: Eastern port, Southern port, and Local port. The router to which the OS has been ported is located at the center of the mesh topology and has five ports: Southern port, Northern port, Eastern port, Western port, and Local port. Figure 8 shows the ports of the central router.<br />
In PrioRout routing, if the input port is the northern one, the desired output port is the eastern one, and the eastern port is free, the output port is the eastern port. If the eastern port is busy, the output port would be the southern port; if the southern port is also busy, then in the worst case the western port would be the output port. So the task priorities in this example would be:<br />
$TP_{\text{North to East}} = 3$<br />
$TP_{\text{North to South}} = 2$<br />
$TP_{\text{North to West}} = 1$<br />
As a result, there is no need for saving or buffering. In the same manner, for all packets whose destination is a neighbor port, the highest task priority belongs to this port. The next priority goes to the frontal port, and the lowest priority belongs to the remaining output port. In PrioRout routing, if the input port is the eastern one, the desired output port is the western one, and the western port is free, the output port would be the western one. If the western port is busy, the output port would be either the northern port or the southern port; in this case, we choose the free port in clockwise order. So we should have:<br />
$TP_{\text{East to West}} = 3$<br />
$TP_{\text{East to North}} = 2$<br />
$TP_{\text{East to South}} = 1$<br />
Table I shows the packet routing when the OS is used.<br />
TABLE I. PACKET ROUTING BASED ON PRIOROUT ROUTING<br />
Task | Routing | Best Output Case | Mean Output Case | Worst Output Case<br />
North | North to South | South | East | West<br />
North | North to East | East | South | West<br />
North | North to West | West | South | East<br />
South | South to North | North | West | East<br />
South | South to East | East | North | West<br />
South | South to West | West | North | East<br />
East | East to West | West | South | North<br />
East | East to North | North | West | South<br />
East | East to South | South | West | North<br />
West | West to East | East | North | South<br />
West | West to North | North | East | South<br />
West | West to South | South | East | North<br />
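The routing table lends itself to a direct lookup. The sketch below encodes the twelve rows of Table I and returns the first free candidate in best, mean, worst order. The type and function names and the `busy` array layout are assumptions made for illustration; the paper's own implementation runs as MicroC/OS-II tasks and is not reproduced here.<br />

```c
typedef enum { NORTH, SOUTH, EAST, WEST, PORT_COUNT, NO_PORT = -1 } Port;

/* candidates[in][want] lists the best, mean and worst output port for a
 * packet that entered on port `in` and wants to leave on port `want`
 * (rows of Table I). Entries with in == want are unused. */
static const Port candidates[PORT_COUNT][PORT_COUNT][3] = {
    [NORTH] = { [SOUTH] = { SOUTH, EAST,  WEST  },
                [EAST]  = { EAST,  SOUTH, WEST  },
                [WEST]  = { WEST,  SOUTH, EAST  } },
    [SOUTH] = { [NORTH] = { NORTH, WEST,  EAST  },
                [EAST]  = { EAST,  NORTH, WEST  },
                [WEST]  = { WEST,  NORTH, EAST  } },
    [EAST]  = { [WEST]  = { WEST,  SOUTH, NORTH },
                [NORTH] = { NORTH, WEST,  SOUTH },
                [SOUTH] = { SOUTH, WEST,  NORTH } },
    [WEST]  = { [EAST]  = { EAST,  NORTH, SOUTH },
                [NORTH] = { NORTH, EAST,  SOUTH },
                [SOUTH] = { SOUTH, EAST,  NORTH } },
};

/* Return the first free output port in priority order, or NO_PORT when all
 * three candidates are busy (the packet then has to wait). */
Port prio_rout(Port in, Port want, const int busy[PORT_COUNT])
{
    if (in == want) {
        return NO_PORT;               /* not a valid route in Table I */
    }
    for (int k = 0; k < 3; ++k) {
        Port p = candidates[in][want][k];
        if (!busy[p]) {
            return p;
        }
    }
    return NO_PORT;
}
```

For a packet entering at the North port and heading East, the lookup yields East when East is free, South when East is busy, and West when both East and South are busy.<br />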
IX. EXPERIMENTAL RESULTS<br />
A packet traveling from north to east has been analyzed based on PrioRout routing and the MicroC/OS-II features.<br />
There is one task for each port; therefore, there are four tasks altogether. There are 12 paths, as shown in Table I. The creation of the four tasks in MicroC/OS-II is shown in Figure 9:<br />
#define TASK_STK_SIZE 512 // Size of each task's stack (# of WORDs)<br />
#define TASK_START_ID 0 // Application task IDs<br />
#define TASK_1_ID 1<br />
#define TASK_2_ID 2<br />
#define TASK_3_ID 3<br />
#define TASK_4_ID 4<br />
#define TASK_START_PRIO 4 // Application task priorities<br />
#define TASK_1_PRIO 1<br />
#define TASK_2_PRIO 1<br />
#define TASK_3_PRIO 1<br />
#define TASK_4_PRIO 1<br />
// MicroC/OS-II requires a unique priority for each task, so the four<br />
// calls below use four different priority values.<br />
// Create the first task<br />
OSTaskCreate(TestTask1, (void *)11, &TestTaskStk1[TASK_STK_SIZE], 11);<br />
// Create the second task<br />
OSTaskCreate(TestTask2, (void *)11, &TestTaskStk2[TASK_STK_SIZE], 12);<br />
// Create the third task<br />
OSTaskCreate(TestTask3, (void *)11, &TestTaskStk3[TASK_STK_SIZE], 13);<br />
// Create the fourth task<br />
OSTaskCreate(TestTask4, (void *)11, &TestTaskStk4[TASK_STK_SIZE], 14);<br />
Figure 9. Creation of four tasks in MicroC/OS-II<br />
The OS dynamically assigns the updated priorities. In this way, there is a priority table for packet routing, which is shown in Table II.<br />
Table II. Task (priority) table for routing<br />
Priority | Destination<br />
3 | East<br />
1 | West<br />
2 | South<br />
Consequently, the input packet follows this priority assignment once it reaches the task (North port). Four Boolean global variables are defined during task implementation and creation to show whether or not the ports are busy. If the port with the highest priority is busy, the port with the next-highest priority (the south port in this example) is selected. The other tasks follow the same routing. To sum up, when the OS is used, packet flits passing through the router do not need to be stored in buffers. Therefore, the power consumption is lower than in the case without the OS. In the worst case (Figure 10-b), if all ports are busy, the packets can be stored in the input buffers and task stacks. This means that virtual channels are not needed. In the case without the OS (Figure 10-a), higher-priority packets may have to wait for lower-priority packets. In the approach with the OS, by contrast, packets are sent according to their priorities, that is, according to their importance. Note, however, that a new output packet cannot interrupt until all flits of the previous packet have been sent. Also, the approach with the OS supports message passing. When two packets arrive at different ports, one packet at a time is selected using the critical section feature. This selection can be made by priority assignment; for example, the forward path has higher priority than the neighbor path.<br />
Figure 10 compares the two approaches (with and without the OS). The horizontal axis shows the task priorities and the vertical axis the packet transmission time. As a result, the transmission time of a higher-priority packet is lower.<br />
Figure 10. a) Worst case without using the OS; b) normal case using the OS. Both panels plot packet transmission time (vertical axis, 0 to 3) against task priority (horizontal axis, 1 to 3).<br />
Table III. Packet transmission status with northern source<br />
Source port that sends 1000 packets | Other port status | Destination port and number of received packets<br />
North | East is free | East: 1000<br />
North | East is busy and South is free | South: 1000<br />
North | East and South are busy and West is free | West: 1000<br />
In our simulation, a Phytec evaluation board to which MicroC/OS-II has been ported is used. The results obtained are shown in Table III.<br />
X. CONCLUSION<br />
In this paper, the use of a real-time OS in a NoC framework based on the hotspot traffic model has been analyzed. Communication management in a NoC needs precise planning, scheduling, resource allocation, and message passing, and these requirements need to be satisfied efficiently. In this paper an RTOS has been used for this purpose. Since the NoC is power constrained and the OS used has only a few lines of code, this selection (an RTOS) has a significant effect on minimizing the power consumption. Based on the implementation, RTOS features can be used in a NoC.<br />
REFERENCES<br />
[1] Allan, D. Edenfeld, W. H. Joyner, A. B. Kahng, M. Rodgers, and Y. Zorian, "2001 Technology Roadmap for Semiconductors," Computer, vol. 35, no. 1, pp. 42-53, Jan. 2002.<br />
[2] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68-73.<br />
[3] http://www.micrium.com/<br />
[4] W. Hsh, Performance issues in wire-limited hierarchical networks, PhD Thesis, University of Illinois-Urbana Champaign, 1992.<br />
[5] G. J. Pfister and V. A. Norton, "Hotspot contention and combining in multistage interconnection networks," IEEE Transactions on Computers, vol. 34, no. 10, pp. 943-948, 1985.<br />
[6] K. Hwang, Advanced computer architecture: parallelism, scalability and programmability, McGraw-Hill (Ed.), 1993.<br />
[7] J. Duato, S. Yalamanchili, and L. Ni, Interconnection Networks: An Engineering Approach. Morgan Kaufmann, 2002.<br />
[8] MCF548x Integrated Microprocessor Electrical Characteristics. Applies to the MCF5480, MCF5481, MCF5482, MCF5483, MCF5484, and MCF5485, © Freescale Semiconductor, Inc., 2004.<br />
[9] V. Nollet, T. Marescaux, and D. Verkest, "Operating-system controlled network on chip," Proceedings of the 41st Design Automation Conference (DAC), 2004, pp. 256-259.<br />
[10] S. A. Asghari, H. Pedram, P. Yaghini, and M. Khademi, "Designing and Implementation of a Network on Chip Router based on Handshaking Communication Mechanism," World Applied Science Journal 6 (1), pp. 88-93, 2009.<br />
[11] N. Eisley and L. Peh, "High-Level Power Analysis for On-Chip Networks," CASES'04, September 22-25, 2004, Washington, DC, USA.<br />
Seyyed Amir Asghari was born in Lashte Nesha in Guilan province of Iran, on June 26, 1984. He received his BS degree in Computer Engineering from Amirkabir University of Technology in 2007, and he also received his MSc degree from Amirkabir University of Technology. He is a research assistant in the Asynchronous Design Laboratory of the same school.<br />
Hossein Pedram received his BS degree from Sharif University in 1977 and his MS degree from Ohio State University in 1980, both in Electrical Engineering. He received his PhD degree in Computer Engineering from Washington State University in 1992.<br />
Dr Pedram has served as a faculty member in the Computer Engineering Department of Amirkabir University of Technology since 1992. He teaches courses in computer architecture and distributed systems. His research interests include innovative methods in computer architecture such as asynchronous circuits, management of computer networks, distributed systems, and robotics.<br />
Hassan Taheri received his BS degree from Amirkabir University of Technology in 1975 and his MS degree from the University of Manchester Institute of Science and Technology (UMIST) in 1978, both in Electrical Engineering. He received his PhD degree in Electrical Engineering from UMIST in 1988.<br />
Dr Taheri has served as a faculty member in the Electrical Engineering Department of Amirkabir University of Technology. He teaches courses in Data Communication Network, Computer Communication, Teletraffic Engineering, Electronic Switching, Digital Communications, Telephone Switching, Probability and Statistics.<br />
Nonlinear Filtering Algorithms for Chaotic<br />
Signals: A Comparative Study<br />
Valeri Ya. Kontorovich, Zinaida Lovtchikova, Jesús A. Meda-Campaña, and Keith Tinsley<br />
Abstract— In this work, a comparative analysis of some approximate nonlinear filtering algorithms for chaos is addressed, assuming that the output signals of chaotic attractors are affected by additive white noise. Estimation accuracy and computational complexity of the filtering algorithms are taken into account during the comparison process.<br />
It is shown that the nonlinear filtering algorithm for chaos can be interpreted, for certain levels of the Signal-to-Noise Ratio (SNR), as close to a singular one, which dramatically decreases the Mean-Square Error (MSE) of filtering.<br />
Index Terms—Chaotic signals, Markov theory, non linear<br />
filtering<br />
Manuscript received October 9, 2009. This work was supported through the grant “Intel-VK” from INTEL Corporation.<br />
V. Ya. Kontorovich is with the Communications Section of the Electrical Engineering Department of CINVESTAV-IPN, Av. IPN 2508 Col. San Pedro Zacatenco. C.P. 07360 México, D.F. Apartado postal 14-740, 07000 México, D.F. Phone: +52 55 57473764. Fax: +52 55 50613977 (Email: valeri@cinvestav.mx)<br />
Z. Lovtchikova is with the Engineering and Advanced Technology Interdisciplinary Professional Unit, UPIITA-IPN. Colonia Laguna de Ticomán. C.P. 07340. México, D.F. Phone +52 55 57296000x56848. (Email: lovtchikova@ipn.mx)<br />
J. A. Meda-Campaña is with the Mechanical Engineering Department of SEPI-ESIME Zacatenco, IPN, Av. IPN s/n. Edificio 5, piso 3. Unidad Profesional Adolfo López Mateos. Zacatenco. Col. Lindavista. C.P. 07738. México D.F. México. Phone +52 55 5729600x54737 (Email: jmedac@ipn.mx).<br />
K. Tinsley is with INTEL Labs, INTEL Corporation. Hillsboro, Oregon 97124, USA. Phone: +503 712 1790. (Email: keith.r.tinsley@intel.com)<br />
I. INTRODUCTION<br />
THEORETICALLY, chaos is represented as an output signal of dissipative continuous dynamic systems (strange attractors) (see, for example, [4]):<br />
$$\dot{x} = f(x(t)), \quad x \in R^n, \quad x(t_0) = x_0, \qquad (1)$$<br />
where $f = [f_1(x), \ldots, f_n(x)]^T$ is a differentiable vector function.<br />
According to the idea of Kolmogorov, the equations for the strange attractors (1) can be successfully transformed into the equivalent stochastic form as a stochastic differential equation (SDE) [4], [11]:<br />
$$\dot{x} = f(x(t)) + \varepsilon\,\xi(t), \qquad (2)$$<br />
where $\xi(t)$ is a vector of “weak” external white noise with the related positive definite matrix of “intensities” $\varepsilon = [\varepsilon_{ij}]_{n \times n}$.<br />
The assumption of the weak white noise component in (2) guarantees the existence of the stationary distribution $W_{st}(x)$, $\forall\, \varepsilon_{ij} \to 0$ [1]. The latter was considered as an invariant physical measure for statistical characterization of the strange attractors [11], [12], [17].<br />
The statistical description of chaotic systems and noise effects in chaotic trajectories are analyzed in depth in [1], but it is rather difficult to apply those results in engineering applications. In this regard, the authors proposed earlier the so-called “degenerate cumulant equations method” [14] for applied statistical analysis of the strange attractors.<br />
It sounds logical to suppose that if one can model some<br />
stochastic phenomena by means of dynamic chaos (SDE (2)),<br />
then its filtering could be carried out through the same<br />
approach [13].<br />
Chaos modeling using SDE (2) gives an opportunity to filter chaotic signals by means of the classical approach of nonlinear filtering for Markov processes, first proposed at the beginning of the 1960s by R. Stratonovich and H. Kushner [15], [20] and intensively developed in the last 40 years [2], [6], [7], [8], [9], [19].<br />
It is worth mentioning here that the tendency of the intensities in SDE (2) to zero has to be applied with certain caution, as it formally changes the characteristics of the Markov process generated by (2).<br />
This problem will be considered in our further publications with all necessary details; here we would like to stress that the intensities of the process noise in (2) will be considered very small and close or equal to zero.<br />
As follows from the above-mentioned references, the nonlinear filtering approach is, by definition, mainly an approximate one, given that the differential equations for the a-posteriori Probability Density Functions (Stratonovich-Kushner equations) do not provide an analytical solution.<br />
During more than 40 years of intensive developments,<br />
many approximate methods for non-linear filtering have been<br />
proposed. For the purpose of this paper, the most important of<br />
them will be presented in the next section.<br />
It is worth stressing here that the comparison of the accuracy of the approximate methods does not provide a sustainable certainty, mainly because their creation is rather heuristic. Moreover, for certain methods, attempts to increase the precision by increasing the number of approximation terms, etc., can have exactly the opposite effect and reduce the accuracy [7], [19].<br />
The main goal of this paper is to present a comparative study of some nonlinear algorithms, bearing in mind possible applications to the filtering of chaotic signals provided by the Lorenz, Chua, and Rössler attractors in the presence of additive white noises (channel noises).<br />
The rest of the work is organized as follows. In section II,<br />
Markov theory of nonlinear filtering is briefly recalled.<br />
Section III summarizes some of the approximate approaches<br />
for nonlinear filtering, while chaotic filtering is analyzed in<br />
section IV. Afterwards, numerical simulations are discussed in<br />
section V. Finally, in section VI, some conclusions are drawn.<br />
II. MARKOV THEORY OF NON-LINEAR FILTERING<br />
Let us consider the following filtering scenario where the received signal is:<br />
$$y(t) = s(t, x(t)) + n_0(t), \qquad (3)$$<br />
where $y(t)$ is a vector of the received signal with dimension m, $s(\cdot)$ is a vector function of the desired signal of the same dimension m, and $n_0(t)$ is a vector of the white additive noises with the intensity matrix $N_0$ (m × m).<br />
Here the signal $s(\cdot)$ depends on the “message” $x(t)$, which is the subject of filtering and is modeled by means of the following SDE as an n-dimensional Markov diffusion process:<br />
$$\dot{x} = g(t, x) + \xi(t). \qquad (4)$$<br />
Formally, SDE (4) coincides with (2) and the vector function $g(\cdot)$ is similar to $f(\cdot)$ in (2); the matrix of intensities for $\xi(\cdot)$ in (4) corresponds to $\varepsilon$ in (2).<br />
As is well known (see [18] and [20], for example), with this assumption the a-priori Probability Density Function, or a-priori PDF, for $x(t)$ follows the so-called Fokker-Planck-Kolmogorov (FPK) equation:<br />
$$\frac{\partial W_{PR}(x,t)}{\partial t} = -\sum_{i=1}^{n} \frac{\partial}{\partial x_i}\left[ g_i(t,x)\, W_{PR}(x,t) \right] + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial^2}{\partial x_i\, \partial x_j}\left[ \varepsilon_{ij}\, W_{PR}(x,t) \right], \qquad (5)$$<br />
where $W_{PR}(x, t_0) = W(x_0)$.<br />
Equation (5) can be rewritten in another form [9], [21]:<br />
$$\frac{\partial W_{PR}(x,t)}{\partial t} = -\operatorname{div} \pi(x,t), \qquad (6a)$$<br />
or<br />
$$\frac{\partial W_{PR}(x,t)}{\partial t} = L_{PR}\{ W_{PR}(x,t) \}, \qquad (6b)$$<br />
where $\pi(x,t)$ is a probabilistic “flow” with the components:<br />
$$\pi_i(x,t) = g_i(x,t)\, W_{PR}(x,t) - \frac{1}{2}\sum_{j=1}^{n} \frac{\partial}{\partial x_j}\left[ \varepsilon_{ij}\, W_{PR}(x,t) \right]. \qquad (7)$$<br />
In (5)-(7), $\{g_i(x,t)\}_1^n$ are the drift coefficients and $\{\varepsilon_{ij}\}$ are the diffusion coefficients of the Markov process; note that in the following they are defined in the Stratonovich sense [18], [20]. $L_{PR}\{\cdot\}$ is the FPK linear operator.<br />
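As a quick consistency check, which is not part of the original text, taking the divergence of the flow defined in (7) and changing its sign gives back the right-hand side of (5), which is why (5) and (6a) are equivalent:<br />

```latex
\begin{aligned}
-\operatorname{div}\pi(x,t)
  &= -\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\,\pi_i(x,t) \\
  &= -\sum_{i=1}^{n}\frac{\partial}{\partial x_i}
       \Big[ g_i(x,t)\,W_{PR}(x,t) \Big]
     + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}
       \frac{\partial^{2}}{\partial x_i\,\partial x_j}
       \Big[ \varepsilon_{ij}\,W_{PR}(x,t) \Big].
\end{aligned}
```

The first sum is the drift term and the double sum is the diffusion term of the FPK equation.<br />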
Then, as was shown in [20], the integro-differential equation for the a-posteriori PDF $W_{PS}(x,t)$ is given in the following equivalent forms:<br />
$$\frac{\partial W_{PS}(x,t)}{\partial t} = L_{PR}\{ W_{PS}(x,t) \} + \frac{1}{2}\left[ F(x,t) - \int_{-\infty}^{\infty} F(x,t)\, W_{PS}(x,t)\, dx \right] W_{PS}(x,t), \qquad (8a)$$<br />
or<br />
$$\frac{\partial W_{PS}(x,t)}{\partial t} = -\operatorname{div} \hat{\pi}(x,t) + \frac{1}{2}\left[ F(x,t) - \langle F(x,t) \rangle \right] W_{PS}(x,t), \qquad (8b)$$<br />
where $\langle F(x,t) \rangle = \int_{-\infty}^{\infty} F(x,t)\, W_{PS}(x,t)\, dx$, $\hat{\pi}(x,t)$ is (7) with $W_{PR}(x,t)$ substituted by $W_{PS}(x,t)$, and:<br />
$$F(x,t) = \left[ y(t) - \frac{1}{2}\, s(x,t) \right]^{T} N_0^{-1} \left[ y(t) - \frac{1}{2}\, s(x,t) \right]. \qquad (9)$$<br />
Equations (8) together with (9) are called the Stratonovich-Kushner nonlinear equations (SKE) and have a rather attractive physical interpretation: the first summand in (8) describes the dynamics of the a-priori data of $x(t)$, and the second summand depends on the innovation of the a-priori data from the analysis of observations.<br />
The optimum estimate of $x(t)$, denoted $\hat{x}(t)$, by any known optimization criterion is the result of filtering $x(t)$; it is obtained from the solution of (8), while the input signal is $y(t)$ (see (3)).<br />
When the intensity of the additive noise vector $N_0$ is large, the influence of the first summand in (8) prevails, equation (8) turns into the FPK equation (6), and the filtering accuracy diminishes drastically. On the contrary, when the signal-to-noise ratio increases, $W_{PS}(x,t)$ tends to the unimodal Gaussian PDF [7], [20]. Note that the SKE fully describes the “evolution” of $W_{PS}(x,t)$ in time but does not provide exact analytical solutions.<br />
Even so, there are a very few exceptions: the linear SDE (4), which yields the well-known Kalman filtering algorithm [2], [6]-[9], [15], [16], [18]-[21]; the Zakai approach [22]; etc.<br />
Due to this, the nonlinear filtering algorithms are practically always approximate.<br />
Over more than 40 years the literature on nonlinear filtering algorithms has become enormous. In the next section we will consider only some of them, taking into account the following considerations:<br />
− The models of the desired signals applied for filtering are the equations for the Lorenz, Chua and Rössler strange attractors with n=3, i.e., of rather low dimension.<br />
− The algorithms of interest have to be adequate for real time applications, and so they have to be of reduced computational complexity.<br />
− The algorithms for nonlinear filtering have to be able to perform satisfactorily in scenarios with low signal to noise ratios (SNR), although the Gaussian assumption for WPS(x) is not always valid.<br />
$$s(x(t), t) \cong x(t) \tag{10}$$<br />
− All εij are equal to zero, except ε11≅ ε1 [1].<br />
Advances in the cumulant statistical analysis of chaos [11], [12] suggest that, at low SNR, it might be reasonable to consider the application of high order cumulants (HOS); see [9], [10], [19], for example.<br />
$$\dot{\hat{x}}_i = \int_{-\infty}^{\infty}\left(\hat{\pi}^{T}(x,t)\,\operatorname{grad} x_i\right)dx + \frac{1}{2}\left[\int_{-\infty}^{\infty} x_i F(x,t)\,\hat{W}_G(x,t)\,dx - \hat{x}_i\int_{-\infty}^{\infty} F(x,t)\,\hat{W}_G(x,t)\,dx\right],$$<br />
$$\dot{\hat{R}}_{ij} = \int_{-\infty}^{\infty}\left(\hat{\pi}^{T}(x,t)\,\operatorname{grad}\left(\mathring{x}_i\mathring{x}_j\right)\right)dx + \frac{1}{2}\left[\int_{-\infty}^{\infty} \mathring{x}_i\mathring{x}_j F(x,t)\,\hat{W}_G(x,t)\,dx - \hat{R}_{ij}\int_{-\infty}^{\infty} F(x,t)\,\hat{W}_G(x,t)\,dx\right], \tag{11}$$<br />
where $\mathring{x}_i = x_i - \hat{x}_i$, $\mathring{x}_j = x_j - \hat{x}_j$.<br />
Equations (11) can be presented in matrix form [7], [15], [19], [20] as well, but for concrete applications the per-component representation (11) might be more suitable (see the following).<br />
In practice, it is possible to assume that all $\hat{R}_{ij}(t)$ converge, when t→∞, to the stationary values $R_{ij}$, and in consequence the second equation in (11) usually reduces to a system of nonlinear algebraic equations, which can be solved numerically.<br />
This assumption can significantly simplify the implementation of the corresponding EKF algorithms for real time scenarios.<br />
Functional approximation for WPS(x, t). It follows from [9], [19] and is given by (12).<br />
III. APPROXIMATE APPROACHES FOR NON LINEAR FILTERING<br />
It is always “better” to approximate the a-posteriori PDF WPS(x, t) than the nonlinearity in (4), (8) [2], [8], [19]. In this context, let us mention the following approximate approaches for WPS(x, t):<br />
− Gaussian approximations: Extended Kalman Filter (EKF) [2], [6]-[9], [15], [16], [18]-[21]; Unscented Kalman Filter (UKF) [8]; Quadrature Kalman Filter (QKF) [2]; Gauss-Hermite Quadrature Filter (GHF) [6]; Iterated Kalman Filter (IKF), etc.<br />
− Functional approximations for WPS(x, t) [9], [19];<br />
− Integral or Global approximations for WPS(x, t) [7];<br />
− HOS approximations for WPS(x, t) [10]; etc.<br />
Due to the lack of space, it is hardly feasible to give a complete overview of all those methods; moreover, not all of them are adequate taking into account the observations introduced at the end of section II, but some comments will be made in section V.<br />
Let us start with the Extended Kalman Filter (EKF): Considering WPS(x, t) as a three dimensional Gaussian PDF $\hat{W}_G(x,t)$, from (8) it is possible to obtain equations (11) for the components of the mean estimates $\{\hat{x}_i\}_1^3$ and for the estimates of the elements of the a-posteriori covariance matrix $\{\hat{R}_{ij}\}_{i,j=1}^{3}$.<br />
$$W_{PS}(x,t) = \prod_{i=1}^{3} W_{PS}(x_i)\left[1 + \sum_{q=2}^{3}\sum_{j=1}^{q-1}\frac{R_{qj}}{R_{qq}R_{jj}}\,(x_q - \hat{x}_q)(x_j - \hat{x}_j)\right] \tag{12}$$<br />
From (12) we see that the Functional Approximation for the PDF is sufficiently non-Gaussian (the marginal WPS(xi) are arbitrary), but for the “joint” characterization of the vector $\hat{x}$ only the elements of the a-posteriori covariance matrix $\{\hat{R}_{ij}\}$ are considered.<br />
It can be shown that the equations for $\{\hat{x}_i\}_1^n$ and $\{\hat{R}_{ij}\}$ are the same as in (11), the only difference being that instead of $\hat{W}_G(x,t)$ one has to substitute in (11) the approximation (12) for WPS(x, t). The corresponding integrals can be solved analytically or by the Gauss-Hermite quadrature formula [2], [6] (see below).<br />
Integral or Global approximation for WPS(x, t). The reader has already realized that the previous two approximations for WPS(x, t) are in some sense “local”, because they provide the estimate of $\{\hat{x}_i\}$ as the maximum of WPS(x, t), and $\{\hat{R}_{ij}\}$. When the SNR is considerably high this is quite enough, but when the SNR is low, one has to look for another approach, which is called Integral approximation. This approach was proposed for successful approximation of WPS(x, t) including the PDF’s “tails”, i.e. for the whole span of x.<br />
Let us assume that WPS(x, t) can be represented in the form:<br />
$$W_{PS}(x,t) = W_{PS}(x,\alpha(t)), \tag{13}$$<br />
where α is an unknown vector of approximation parameters.<br />
Then, applying the well known Kullback measure as an approximation criterion, we obtain the following equation for the unknown vector α:<br />
$$\dot{\alpha} = V^{-1}(t)\left[L_{PR}^{+}\{h(x,t)\} + h(x,t)\,F(x,t)\right], \tag{14}$$<br />
where:<br />
$$h(x,t) = \frac{\partial \ln W_{PS}(x,\alpha(t))}{\partial\alpha},$$ and<br />
$$V(t) = -\int_{-\infty}^{\infty}\left[\frac{\partial \ln W_{PS}(x,\alpha(t))}{\partial\alpha}\right]^{T} W_{PS}(x,\alpha(t))\,dx = -\frac{\partial^{2} W_{PS}(x,\alpha(t))}{\partial\alpha\,\partial\alpha^{T}},$$<br />
$L_{PR}^{+}\{\bullet\}$ is the operator adjoint to the FPK operator [18].<br />
Now, as an integral approximation of WPS(x, α(t)), let us choose the so-called “Dynkin PDF” with α(t) as a vector of sufficient statistics for WPS(⋅):<br />
$$W_{PS}(x,\alpha(t)) = C\exp\left\{\sum_{p=1}^{K}\alpha_p(t)\,\varphi_p(x) + \varphi_0(x)\right\}, \tag{15}$$<br />
where $\{\varphi_p(x)\}$ is a complete set of orthogonal multidimensional functions: Hermite, Laguerre, etc.<br />
One can see that there is a high degree of similarity between (15) and the orthogonal series representation of WPS(x, α(t)) [18]: in both cases, series of orthogonal functions are applied, but in (15) this is done for the monotonic transform ln{WPS(x, α(t))} and not for WPS(x, α(t)) itself. So, the coefficients {αp(t)} can always be represented through the cumulants of WPS(x). This opens the opportunity to search for equations for the cumulants (HOS) of WPS(x, t) directly (see [10], [19], for example), instead of searching for a solution of (15), which cannot be obtained analytically.<br />
Since the last problem was extensively tackled in the mentioned references, the HOS approach will not be completely addressed in the following. However, one comment is in order: for n > 1 equations (14) and the equations for HOS are rather complex when real time solutions are required; for n = 1 there might be no significant difference between both methods (see [10] for details). Then, in order to apply the last two approaches for approximate nonlinear filtering of chaos it is necessary to decrease the dimension of the SDE (4). In other words, for chaos one has to find an equation statistically equivalent to the SDE (4). This can be achieved by synthesizing the equivalent SDE (see [18]).<br />
IV. FILTERING ALGORITHMS FOR CHAOTIC SIGNALS<br />
For simplicity, let us consider the following special case of the one dimensional scenario:<br />
$$y(t) = x_1(t) + n_0(t), \tag{16}$$<br />
where x1(t) is the first (observable) component of any strange attractor (Lorenz, Chua, Rössler) and n0(t) is a scalar white noise [12], [17]. For the sake of completeness we present in Table I all the features which will be required here.<br />
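The observation model (16) can be illustrated with a minimal sketch that generates a Lorenz trajectory and a noisy record of its first component. Everything numerical here is an assumption made for illustration: the classical parameter values σ = 10, R = 28, B = 8/3, the initial state, the fourth-order Runge-Kutta step, and the use of independent Gaussian samples as a discrete-time stand-in for the white noise n0(t).<br />

```python
import math
import random


def lorenz_rhs(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side g(x) of the Lorenz system (parameter values are assumed)."""
    x1, x2, x3 = state
    return (sigma * (x2 - x1), r * x1 - x2 - x3 * x1, x1 * x2 - b * x3)


def rk4_step(rhs, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2.0))
    k3 = rhs(shifted(state, k2, dt / 2.0))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_observations(n_samples=3000, dt=0.01, noise_std=1.0, seed=0):
    """Return (x1, y) with y_k = x1_k + n_k, a sampled version of (16)."""
    rng = random.Random(seed)
    state = (1.0, 1.0, 1.0)  # assumed initial condition
    x1_true, y_obs = [], []
    for _ in range(n_samples):
        state = rk4_step(lorenz_rhs, state, dt)
        x1_true.append(state[0])
        y_obs.append(state[0] + rng.gauss(0.0, noise_std))
    return x1_true, y_obs
```

The pair returned by `simulate_observations` can serve as the reference signal and the filter input when experimenting with any of the algorithms discussed in this section.<br />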
Let us consider the Lorenz, Chua and Rössler attractors. It can be seen from Table I that the marginal PDF’s of the components of the Lorenz attractor are practically Gaussian, or their orthogonal representation has a Gaussian kernel PDF; for the Rössler attractor the orthogonal representation with the Gaussian kernel PDF is also valid for the “x” and “y” components of the attractor [12]. The opposite situation takes place for the Chua attractor (Table I): it can be seen that this attractor represents a clearly non-Gaussian case.<br />
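The non-Gaussian character of the Chua marginal can be checked numerically. The sketch below takes the form C exp(p1 x² − q1 x⁴) with p1 ≈ 3.5 and q1 ≈ 1.5 from Table I; the trapezoidal integration, its grid and the integration limits are assumptions chosen only for illustration.<br />

```python
import math


def unnormalized_pdf(x, p=3.5, q=1.5):
    """exp(p*x^2 - q*x^4), the Chua marginal of Table I without the constant C."""
    return math.exp(p * x * x - q * x ** 4)


def normalization_constant(p=3.5, q=1.5, half_width=4.0, n=4000):
    """Estimate C by the trapezoidal rule on [-half_width, half_width] (assumed grid)."""
    h = 2.0 * half_width / n
    total = 0.5 * (unnormalized_pdf(-half_width, p, q) + unnormalized_pdf(half_width, p, q))
    for k in range(1, n):
        total += unnormalized_pdf(-half_width + k * h, p, q)
    return 1.0 / (total * h)


def mode_location(p=3.5, q=1.5):
    """Positive mode: d/dx (p*x^2 - q*x^4) = 0 gives x = sqrt(p / (2*q))."""
    return math.sqrt(p / (2.0 * q))
```

Because the density is larger at ±`mode_location()` than at the origin, it is bimodal, which is why a single Gaussian is a poor description of this component.<br />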
Next, when the SNR is low, the influence of the second summand in SKE (8) on WPS(x, t) is low as well, and to a first approximation it is possible to assume that the marginal a-posteriori PDF’s are close to their a-priori shapes.<br />
Therefore, it is feasible that EKF algorithms will be rather adequate for both high and low SNR scenarios for the Lorenz and Rössler attractors, but not for the Chua attractor. Now, let us consider the Chua attractor with the Integral (Global) approximation for the a-posteriori PDF, assuming (Table I) that the first component has a symmetric WPS(x1, t). Supposing that $\{\varphi_i(x_i)\}_1^K$ are Hermite polynomials and K = 4, from (15) it follows:<br />
$$W_{PS}(x_1,t) = C\exp\left\{\alpha_1(t)H_1(x_1) + \alpha_2(t)H_2(x_1) + \alpha_3(t)H_3(x_1) + \alpha_4(t)H_4(x_1)\right\}. \tag{17}$$<br />
With the help of the definition of the Hermite polynomials one can get for (15):<br />
$$W_{PS}(x_1,t) = \mathrm{Const}\cdot\exp\left[-\alpha_2(t) + 3\alpha_4(t)\right]\cdot\exp\left\{Ax_1 + Bx_1^{2} + Cx_1^{3} + Dx_1^{4}\right\}, \tag{18}$$<br />
where:<br />
$A = \alpha_1(t) - 3\alpha_3(t);\quad B = \alpha_2(t) - 6\alpha_4(t);\quad C = \alpha_3(t);\quad D = \alpha_4(t).$<br />
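The coefficient mapping between the Hermite form and the power form of the exponent can be verified numerically. The sketch below assumes the probabilists' Hermite polynomials H1 = x, H2 = x² − 1, H3 = x³ − 3x, H4 = x⁴ − 6x² + 3, which is the convention that reproduces A = α1 − 3α3 and B = α2 − 6α4; under it the x-independent part of the exponent is −α2 + 3α4 and is absorbed into the normalization constant. The function names are hypothetical.<br />

```python
def hermite_he(k, x):
    """Probabilists' Hermite polynomials He_1..He_4 (assumed convention)."""
    table = {
        1: x,
        2: x ** 2 - 1.0,
        3: x ** 3 - 3.0 * x,
        4: x ** 4 - 6.0 * x ** 2 + 3.0,
    }
    return table[k]


def exponent_hermite(alpha, x):
    """Exponent written as sum_k alpha_k * He_k(x), k = 1..4."""
    return sum(alpha[k] * hermite_he(k, x) for k in (1, 2, 3, 4))


def power_coefficients(alpha):
    """Coefficients of the same exponent written in powers of x."""
    a1, a2, a3, a4 = (alpha[k] for k in (1, 2, 3, 4))
    return {
        "const": -a2 + 3.0 * a4,  # x-independent part
        "A": a1 - 3.0 * a3,       # coefficient of x
        "B": a2 - 6.0 * a4,       # coefficient of x^2
        "C": a3,                  # coefficient of x^3
        "D": a4,                  # coefficient of x^4
    }


def exponent_power(alpha, x):
    """Exponent evaluated from the power-form coefficients."""
    c = power_coefficients(alpha)
    return c["const"] + c["A"] * x + c["B"] * x ** 2 + c["C"] * x ** 3 + c["D"] * x ** 4
```

For a symmetric density the odd coefficients vanish (α1 = α3 = 0), which gives A = C = 0 directly from `power_coefficients`.<br />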
As $\{\alpha_i(t)\}_1^4$ are sufficient statistics for WPS(x1, t), and invoking the symmetry and normalization conditions for the a-posteriori PDF, one can get:<br />
A = C = 0, $C_1(\alpha_2,\alpha_4) = C\cdot\exp\{-\alpha_2(t) + 3\alpha_4(t)\}$, and<br />
$$W_{PS}(x_1) = C_1(\alpha_2,\alpha_4)\cdot\exp\left\{Bx_1^{2} - Dx_1^{4}\right\}. \tag{19}$$<br />
TABLE I<br />
Strange attractors and their statistical characteristics<br />
No | Name of the strange attractor | x | g(x) | ε | WPR(xi) | Comments<br />
1 | Lorenz, n = 3 | $x = [x_1, x_2, x_3]^T$ | $g(x) = [\sigma(x_2 - x_1),\; Rx_1 - x_2 - x_3x_1,\; x_1x_2 - Bx_3]^T$; $\sigma, R, B \ge 0$ | $\varepsilon_{11} = \varepsilon \to 0$; $\varepsilon_{12} = \varepsilon_{13} = \varepsilon_{23} = \varepsilon_{21} = \varepsilon_{32} = \varepsilon_{33} = 0$ | $x_1 \sim W_G(\cdot)$; $x_2 \sim W_G(\cdot)$; $x_3 \sim W_G(\cdot)$ | Normalized data; WG(⋅): Gaussian PDF<br />
2 | Chua, n = 3 | $x = [x_1, x_2, x_3]^T$ | $g(x) = [\beta_1(x_2 - x_1) - \alpha h(x_1),\; \beta_2(x_1 - x_2) + \beta_4x_3,\; -\beta_3x_2]^T$; $\beta_1, \ldots, \beta_4 \ge 0$, $\alpha < 0$ | $\varepsilon_{11} = \varepsilon \to 0$; $\varepsilon_{12} = \varepsilon_{13} = \varepsilon_{23} = \varepsilon_{21} = \varepsilon_{32} = \varepsilon_{33} = 0$ | $x_1 \sim C\exp(p_1x_1^2 - q_1x_1^4)$; $x_2 \sim W_G(\cdot)$; $x_3 \sim C\exp(p_3x_3^2 - q_3x_3^4)$; $p_1, p_3, q_1, q_3 > 0$ | Normalized data; p1 ~ 3.5, p3 ~ 3.5, q1 ~ 1.5, q3 ~ 2.5<br />
3 | Rössler, n = 3 | $x = [x_1, x_2, x_3]^T$ | $g(x) = [-x_2 - x_3,\; x_1 + ax_2,\; b + x_3x_1 - x_3c]^T$; $a, b, c \ge 0$ | $\varepsilon_{11} = \varepsilon \to 0$; $\varepsilon_{12} = \varepsilon_{13} = \varepsilon_{23} = \varepsilon_{21} = \varepsilon_{32} = \varepsilon_{33} = 0$ | $x_1 \sim W_G(x_1)\left[1 + \frac{\gamma_3^{(1)}}{3!}H_3(x_1) + \frac{\gamma_4^{(1)}}{4!}H_4(x_1)\right]$; $x_2 \sim W_G(x_2)\left[1 + \frac{\gamma_3^{(2)}}{3!}H_3(x_2) + \frac{\gamma_4^{(2)}}{4!}H_4(x_2)\right]$; $x_3 \sim W_G(\cdot)$ with $R_{33}^{2} < 1$ | Normalized data; $\gamma_3^{(1)}, \gamma_4^{(1)}, \gamma_3^{(2)}, \gamma_4^{(2)}$ take values ~ 0.2 and ~ 0.6<br />
It is worth mentioning that for the case of low SNR (19) coincides with the a-priori PDF WPR(x1, t) for the Chua attractor (Table I). Now, from (14) it follows:<br />
$$f(x_1)\,\varphi_i'(x_1) + \frac{\varepsilon}{2}\,\varphi_i''(x_1) + h_i(x_1)\,F(t,x_1) = 0, \tag{20}$$<br />
where i=2, 4.<br />
Statistically equivalent SDE-1 with PDF (19) can be found in [18]:<br />
$$f(x_1) = \varepsilon\left(Bx_1 - 2Dx_1^{3}\right).$$<br />
Then, for i=2, one gets (ε → 0) the following equation:<br />
$$2\varepsilon\left[B\,\overline{x_1^{2}} - 2D\,\overline{x_1^{4}}\right] = \frac{\overline{y^{2}(t)}}{N_0}\left[\overline{x_1^{2}} - 1\right], \tag{21}$$<br />
where<br />
$$\overline{x_1^{2m}} = \frac{\Gamma\left(m + \frac{1}{2}\right) D_{-m-\frac{1}{2}}(-\delta)}{\sqrt{\pi}\,D_{-\frac{1}{2}}(-\delta)\,(2D)^{m/2}}, \tag{22}$$<br />
m = 1,2,…; $D_{(\cdot)}$ is the parabolic cylinder function, and $\delta = B/\sqrt{2D}$.<br />
Analogously, for i=4 (ε → 0), it yields:<br />
$$\varepsilon\left\{4\,\overline{x_1^{4}}\,(B - 6D) - 12B\,\overline{x_1^{2}} - 8D\,\overline{x_1^{6}}\right\} + 6\varepsilon\,\overline{x_1^{2}} = \frac{\overline{y^{2}(t)}}{N_0}\left[\overline{x_1^{4}} - 6\,\overline{x_1^{2}} + 3\right] + \frac{1}{N_0}\left[\overline{x_1^{6}} - 6\,\overline{x_1^{4}} + 3\,\overline{x_1^{2}}\right]. \tag{23}$$<br />
Assuming that in (21)-(23) $y^{2}(t)$ tends to its stationary value $\overline{y^{2}(t)}$ while t → ∞ and substituting into (21)-(23), one can get nonlinear algebraic equations for the stationary parameters α2, α4, which are obviously related to the a-posteriori variance (MSE) and fourth moment (cumulant) of WPS(x1).<br />
Therefore, α2 can be used as a measure of the filtering accuracy, being calculated with the influence of the fourth a-posteriori moments (cumulants).<br />
A similar approach with application of higher-order statistics (HOS) will be presented below, where the equation for the estimate $\hat{x}_1$ of x1 will obviously be the same as for the Integral Approximation.<br />
It is worth mentioning here that for the case of low SNR so-called asymptotical algorithms can be developed as well. For example, the asymptotical filtering algorithm for x1(t) = x(t) of the Chua attractor in discrete time can be represented as:<br />
$$\hat{x}_{j+1} = \hat{x}_j + T_0 f(\hat{x}_j) + \sigma_{\varepsilon j}^{2}\,\frac{d}{dx_{j+1}}\ln W_{PS}\left[(y_{j+1} - x_{j+1})\right]\Big|_{x_{j+1} = \hat{x}_j}, \tag{24}$$<br />
where T0 is the sampling interval and $\sigma_{\varepsilon}^{2}$ is the a-posteriori filtering variance (MSE).<br />
This a-posteriori variance can be calculated through α2 and α4 (see above), but also might be found from the following equation:<br />
$$\hat{\sigma}_{\varepsilon j+1}^{2} = \hat{\sigma}_{\varepsilon j}^{2} + 2\hat{\sigma}_{\varepsilon j}^{2}\,f'(\hat{x}_j)\,T_0 + \hat{\sigma}_{\varepsilon j}^{4}\,\frac{\partial^{2}}{\partial x_{j+1}^{2}}\ln W_{PS}\left[(y_{j+1} - x_{j+1})\right]\Big|_{x_{j+1} = \hat{x}_j} \tag{25}$$<br />
If the SNR is low and n0(t) is a Gaussian additive white noise, then applying a Taylor series expansion for ln WPS(⋅), in this asymptotic case one can get:<br />
$$\hat{x}_{j+1} = \hat{x}_j + T_0 f(\hat{x}_j) + \frac{\sigma_{\varepsilon j}^{2}}{\sigma_n^{2}}\left[(y_{j+1} - x_{j+1})\right]_{x_{j+1} = \hat{x}_j},$$<br />
$$\hat{\sigma}_{\varepsilon j+1}^{2} = \hat{\sigma}_{\varepsilon j}^{2} + 2\hat{\sigma}_{\varepsilon j}^{2}\,f'(\hat{x}_j)\,T_0 - \frac{\hat{\sigma}_{\varepsilon j}^{4}}{\sigma_n^{2}}, \tag{26}$$<br />
which, in stationary conditions is:<br />
$$\hat{\sigma}_{\varepsilon}^{2} = 2\sigma_n^{2}\,f'(\hat{x}_j)\,T_0 = 2\sigma_n^{2}\,\varepsilon T_0\left(B - 6D\hat{x}_j^{2}\right). \tag{27}$$<br />
It can be seen from (27) that the accuracy of the filtering depends on the absolute value of $\hat{x}$, which is the specific feature of the asymptotical algorithm. This interesting issue follows from the dependence of $\hat{\sigma}_{\varepsilon}^{2}$ on the derivative of the nonlinear drift $f'(\hat{x}_j)$.<br />
Now, let us take the low SNR scenario and apply the Functional Approximation (12) for WPS(x, t). In the low SNR case WPS(x, t) becomes:<br />
$$W_{PS}(x,t) = C\exp\left(p_1x_1^{2} - q_1x_1^{4}\right)\exp\left(p_3x_3^{2} - q_3x_3^{4}\right)\frac{1}{\sqrt{2\pi\hat{R}_{22}}}\exp\left(-\frac{x_2^{2}}{2\hat{R}_{22}}\right)\left[1 + \sum_{i=2}^{3}\sum_{j=1}^{i-1}\frac{R_{ij}\,(x_j - \hat{x}_j)(x_i - \hat{x}_i)}{R_{ii}R_{jj}}\right] \tag{28}$$<br />
Substituting (28) into (11), after rather simple but cumbersome developments, one can get:<br />
x& ˆ1 = −2εxˆ<br />
1(<br />
p1<br />
+ q1)<br />
+ 2εq<br />
ˆ 1R11<br />
+<br />
2 ( y ( t)<br />
− R )<br />
2(<br />
y(<br />
t)<br />
− xˆ<br />
) Rˆ<br />
xˆ<br />
+ . (29)<br />
N<br />
1 11 1 −<br />
ˆ<br />
11<br />
0 N0<br />
If ε → 0 (see section I) and the SNR is low, then from (29) and (11) it follows:<br />
$$\dot{\hat{x}}_1 = -2\varepsilon\hat{x}_1(p_1 + q_1) + \frac{2\left(y(t) - \hat{x}_1\right)}{N_0}\hat{R}_{11} \tag{30}$$<br />
and one can immediately obtain:<br />
$$\dot{\hat{R}}_{11} = -\frac{\varepsilon}{2} + \frac{\hat{R}_{11}^{2}}{N_0} + 4\varepsilon(p_1 + q_1)\hat{R}_{11}. \tag{31}$$<br />
One can see that (30), (31) coincide totally with the EKF for one component x1. Why did this happen? The answer is simple: it happened because of the practical linearity of the equations for the Chua attractor with the exception of h(x1), the symmetry of WPS(x, t) for all arguments and the symmetry of h(x1), which finally provide an “implicit linearization” of the SDE for x1 in the case of the Chua attractor.<br />
It is also interesting that for the analyzed scenario, the statistically equivalent SDE for x1(t) is practically linear with time constant 2D(p1+q1).<br />
For t → ∞, $\hat{R}_{11}(t)$ tends to its stationary value $R_{11}$, which coincides with the a-posteriori variance or MSE and can be simply calculated as:<br />
$$R_{11} = \frac{-4\varepsilon(p_1 + q_1) + \sqrt{16\varepsilon^{2}(p_1 + q_1)^{2} + \dfrac{2\varepsilon}{N_0}}}{\dfrac{2}{N_0}} \ge 0, \tag{32}$$<br />
and, invoking ε → 0, $R_{11} \cong 0.71\cdot\sqrt{N_0\,\varepsilon}$.<br />
If one assumes that N0 ≅ 1, then $R_{11}$ is almost zero and does not depend on the SNR, i.e. it is a singular case!<br />
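The stationary variance can be evaluated numerically. The sketch below assumes that (32) is the non-negative root of the quadratic R²/N0 + 4ε(p1+q1)R − ε/2 = 0; this reading reproduces the quoted small-ε limit 0.71·√(N0 ε). The default values of p1 and q1 are the Chua values of Table I, and the function names are hypothetical.<br />

```python
import math


def stationary_variance(eps, n0, p1=3.5, q1=1.5):
    """Non-negative root of R^2/N0 + 4*eps*(p1+q1)*R - eps/2 = 0 (assumed form of (32))."""
    b = 4.0 * eps * (p1 + q1)
    return (-b + math.sqrt(b * b + 2.0 * eps / n0)) / (2.0 / n0)


def small_eps_asymptote(eps, n0):
    """Limit of the root for eps -> 0: sqrt(N0*eps/2), i.e. about 0.71*sqrt(N0*eps)."""
    return math.sqrt(n0 * eps / 2.0)
```

For example, with N0 = 1 and ε = 1e-8 the two functions agree to within about 0.2 percent, and both are of the order of 7e-5, which illustrates how close to zero the stationary MSE becomes.<br />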
Some further developments with the help of HOS can be<br />
achieved for the case of n=1, assuming that the nonlinear<br />
statistically equivalent SDE for x1 is [18]:<br />
ε<br />
2<br />
x & 1 = − ( p1x1<br />
− 2q1x1<br />
) + ξ ( t)<br />
ε . (33)<br />
2<br />
Then, it can be shown that with the help of the first four<br />
cumulants (HOS), the filtering equations are [10]:<br />
ε<br />
2<br />
& κ1 = − ( p1κ1<br />
− 2q1κ1<br />
) + F'<br />
( κ1<br />
) κ 2 −<br />
2<br />
2 1<br />
2<br />
− ε ( p 1 κ1<br />
− 2q1κ<br />
1 ) κ 2 + F'<br />
'(<br />
κ1)<br />
( κ 4 + 2κ<br />
2 ) = 0,<br />
(34)<br />
2<br />
where, as before, the upper line denotes the time averaging procedure, κi denotes the i-th cumulant, $\kappa_3 = 0$, $\kappa_4 \cong -2\kappa_2^{2}$ (see [4]), $\kappa_2 = R_{11}^{H}$, $\kappa_1 = \hat{x}_1$, and κ3 and κ4 coincide with their a-priori values for the low SNR case.<br />
From (34) it easily follows:<br />
2<br />
& κ = −2ε<br />
( p κ − 2q<br />
κ ) + F'<br />
( κ ) κ<br />
1<br />
1 1<br />
1 1<br />
2ε<br />
( pκ<br />
1 − 2qκ<br />
1 ) ⎡<br />
κ 2 =<br />
⎢ 1+<br />
F'<br />
'(<br />
κ1)<br />
⎢<br />
⎣<br />
2<br />
1<br />
2<br />
F'<br />
'(<br />
κ ) ε<br />
1<br />
2 2<br />
[ 2(<br />
pκ<br />
1 − 2qκ<br />
1 ) ]<br />
⎤<br />
−1⎥.<br />
⎥<br />
⎦<br />
(35)<br />
After some simple, but rather cumbersome algebra, one can find that for the case of low SNR:<br />
$$R_{11}^{H} \sim \sqrt{N_0\,\varepsilon}.$$<br />
So, $R_{11} \ge R_{11}^{H}$, which coincides with the Cramér-Rao bounds for non-linear filtering, but also tends to zero!<br />
Therefore, as follows from (30) and (35), the EKF shows its adequacy for application to the case of the Chua attractor, at least for the low SNR scenarios.<br />
Since all analytical developments were done with a certain degree of approximation, it is mandatory to check them by numerical simulations. The corresponding results are presented in the next section.<br />
Finally, let us mention some general observations regarding singularity in the chaos filtering problem.<br />
From (1) it definitely follows that its solution is<br />
$$x(t) = \Phi(t_0, t, x_0), \tag{36}$$<br />
defined by x0 and f(x).<br />
Then, from the SKE (8):<br />
$$W_{PS}(t,x) = C\,\delta\left(\Phi(t_0,t,x) - x_0\right)\det(t_0,t,x)\cdot\exp\left\{\int_{t_0}^{t} F\left[\tau, \Phi(t_0,t,x)\right]d\tau\right\}, \tag{37}$$<br />
where<br />
$$\det(t_0,t,x) = \det\left[\frac{\partial\Phi^{T}(t_0,t,x)}{\partial x}\right] \tag{38}$$<br />
and δ(·) is a delta function; F[⋅] is (9).<br />
Taking into account that the fundamental matrix with the elements $\partial\Phi_i^{T}(t_0,t,x)/\partial x_j$ satisfies<br />
$$\det\left\{\frac{\partial\Phi_i^{T}(t_0,t,x)}{\partial x_j}\right\}_{i,j=1}^{n} \ne 0, \tag{39}$$<br />
then, if<br />
$$x(t) = \Phi(t_0, t, x_0), \tag{40}$$<br />
it follows that<br />
$$x_0 = \Phi(t_0, t, x). \tag{41}$$<br />
If<br />
$$y(t) = \Phi(t_0, t, x_0) + n_0(t) \tag{42}$$<br />
and if N0→∞, then<br />
$$W_{PS}(t,x) = C\,\delta\left(x_0 - \Phi(t_0,t,x)\right)\det(t_0,t,x), \tag{43}$$<br />
i.e., it is not zero only when the filtering solution is Φ(t0, t, x0).<br />
When N0→0, WPS(t,x) is not equal to zero if and only if x0 = Φ(t0, t, x), i.e., once more it is singular.<br />
So, for both of those limiting cases WPS(t,x) “memorizes” the solution of (1).<br />
For approximate algorithms, these phenomena take place from some low but finite values of N0.<br />
V. NUMERICAL SIMULATIONS<br />
The numerical simulations consider the same attractors as mentioned before: Rössler, Lorenz and Chua, but neglecting the process noise in (2). This is done in order to verify how fast the a-posteriori variance (MSE) tends to zero, independently of the SNR level, which is actually the sign of the singularity of filtering.<br />
From the previous analysis, the EKF algorithm seems to be working practically in singular conditions: the algorithm fully applies the a-priori information of the attractor and the output signals are deterministic, so the results do not have to depend on the SNR. This is another reason for an optimistic prognosis for the EKF in low SNR scenarios in the case of chaos filtering.<br />
The conditions for the process noise (see (2)) were analyzed in detail, and it was shown that even a small fraction of process noise causes a drastic growth of the MSE, which is not acceptable for filtering. So, the solution is to let this noise tend to zero.<br />
In order to compare the efficiency and accuracy of the above mentioned nonlinear approaches, the Rössler, Lorenz and Chua attractors are filtered (estimated) using the EKF, the unscented Kalman filter (UKF) [8], the Gauss-Hermite quadrature filter (GHF) [6], and the Quadrature Kalman filter (QKF) [2]. Before proceeding with the comparisons, some brief descriptions of the mentioned nonlinear filters are given.<br />
Unscented Kalman filter (UKF)<br />
The UKF is based on the unscented transformation, which rests on the idea that it is easier to approximate a probability distribution than an arbitrary nonlinear function. To achieve this, a set of sigma points with adequate mean and covariance is chosen (see also the “Functional Approximation method”, mentioned in [9], [18] and [19]). This approach differs from particle filters because the sigma points are chosen in a deterministic way instead of randomly as in particle filters [5], [8].<br />
This method does not include the linearization of the state and/or output equations. However, although the sigma weights are computed before the filtering process begins, the sigma points need to be calculated in each iteration of the algorithm, and afterwards the sigma points must be propagated through the nonlinear system [5], [8].<br />
A detailed derivation of the UKF algorithm is given in [8].<br />
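The unscented transformation can be sketched for a scalar state as follows. This is a minimal illustration and not the UKF of [8]: the symmetric three-point sigma set, the scaling parameter kappa (set to 2 so that n + kappa = 3 for n = 1) and the function names are assumptions.<br />

```python
import math


def sigma_points(mean, var, kappa=2.0):
    """Deterministic sigma points and weights for a scalar state (n = 1)."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    return points, weights


def unscented_transform(func, mean, var, kappa=2.0):
    """Propagate the sigma points through func and return the output mean and variance."""
    points, weights = sigma_points(mean, var, kappa)
    ys = [func(p) for p in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var
```

No derivative of `func` is needed; the nonlinearity is only evaluated at the sigma points, which is the property that distinguishes this approach from the Jacobian-based linearization of the EKF.<br />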
Gauss-Hermite quadrature filter (GHF)<br />
As is well known, the Gauss-Hermite quadrature rule allows one to approximate integrals of the form<br />
$$I = \int_{R^{n}} f(t)\,\frac{1}{\left((2\pi)^{n}\det\Sigma\right)^{1/2}}\exp\left(-\frac{1}{2}(t - x)^{T}\Sigma^{-1}(t - x)\right)dt, \tag{44}$$<br />
where Σ is the covariance matrix.<br />
The Gauss-Hermite quadrature rule is given by<br />
$$\int_{-\infty}^{\infty} f(x)\,\frac{1}{(2\pi)^{1/2}}\,e^{-x^{2}/2}\,dx = \sum_{i=1}^{m} w_i f(q_i), \tag{45}$$<br />
which holds for all polynomials of degree up to 2m-1, where qi are the quadrature points and wi the corresponding weights [6]. So, through (45) the GHF algorithm approximates the integrals involved in the Gaussian estimation. It is important to remark that the quadrature points and quadrature weights used by the GHF algorithm are computed before the filtering process is started. Therefore, this method requires less computer effort than the UKF algorithm.<br />
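The quadrature rule can be illustrated with m = 3 points. For the standard normal weight the three-point Gauss-Hermite nodes are 0 and ±√3 with weights 2/3 and 1/6; the change of variable to a general mean and variance and the function name are assumptions of this sketch.<br />

```python
import math

# Three-point Gauss-Hermite rule for the standard normal weight:
# exact for polynomials of degree up to 2*3 - 1 = 5.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def gauss_hermite_expectation(func, mean=0.0, var=1.0):
    """Approximate E[func(X)] for X ~ N(mean, var) with the fixed three-point rule."""
    std = math.sqrt(var)
    return sum(w * func(mean + std * q) for q, w in zip(NODES, WEIGHTS))
```

The nodes and weights are constants computed once, before any filtering starts; only the shift and scaling by the current mean and standard deviation change from step to step. The sixth moment of a standard normal variable, which is 15, is beyond the exactness limit of this rule and is not reproduced.<br />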
Quadrature Kalman filter (QKF)<br />
The QKF is a simplified version of the GHF, which considers the nonlinear filtering problem from a statistical linear regression (SLR) point of view. In other words, the QKF uses SLR to linearize a nonlinear function by means of a set of Gauss-Hermite quadrature points and weights [2]. Although the QKF is algebraically equivalent to the GHF, in simulations the filtering process carried out with the QKF algorithm is solved faster than the filtering performed through the GHF. The QKF algorithm was first derived in [2].<br />
Comparison between nonlinear filters<br />
The computational complexity of the algorithms is briefly presented in the following table, where additions (subtractions), multiplications (divisions), Cholesky decompositions, Jacobian calculations (linearization) and nonlinear propagations are included.<br />
From Table II, it can be easily seen that the UKF involves the greatest complexity, while the EKF seems to be the simplest algorithm. However, the linearization process performed by the Jacobian calculation involves partial derivatives. For that reason, and depending on the mathematical model of the attractor, the EKF may not always be the fastest algorithm, although in our study this is not the case.<br />
TABLE II.<br />
COMPUTATIONAL COMPLEXITY<br />
Operation | EKF | UKF | GHF | QKF<br />
Additions | 8 | 50 | 25 | 25<br />
Multiplications | 15 | 77 | 33 | 40<br />
Cholesky decomposition | 1 | 2 | 2 | 2<br />
Nonlinear propagation | 0 | 15 | 21 | 6<br />
Jacobian calculation | 1 | 0 | 0 | 0<br />
On the other hand, the complexity involved in each of<br />
the algorithms is also analyzed by measuring the time consumed<br />
by the different filtering methods. To this end, the<br />
algorithms were applied to the chaotic attractors, which<br />
evolved during 3000 sample times.<br />
Note that the time plots of the above<br />
mentioned attractors are completely different: the Lorenz attractor<br />
produces noise-like chaos, while the Rössler and Chua<br />
outputs look more like “modulated sine-waves”.<br />
Thus, for the same filtering accuracy, Lorenz outputs have to<br />
be sampled more frequently (“oversampled”) than Rössler and Chua<br />
signals. Oversampling has to be applied in order to achieve a<br />
required filtering performance, and this statement will be<br />
illustrated with concrete data for the sampling times of the<br />
attractors considered in this work.<br />
In other words, larger sampling times demand better<br />
accuracy of the filtering procedures. Consequently, extremely<br />
large sample periods may destroy the effectiveness of the filtering<br />
algorithms, while very small sample times require a large<br />
amount of data storage and faster processors.<br />
It is worth mentioning that the algorithms were executed<br />
on an Intel Core 2 6420 @ 2.13GHz, with 1.5GB of RAM.<br />
Fig. 1 shows the MSE versus SNR for Chua attractor when<br />
the process noise is not present.<br />
Fig. 1. MSE vs. SNR for Chua attractor.<br />
As can be seen, the EKF is outperformed by GHF, QKF<br />
and UKF. This is because the approximation carried out by<br />
EKF through the linearization is not as good as the Gaussian<br />
approximations (GHF and QKF) or the unscented<br />
transformation (UKF). Even so, the MSE generated by the<br />
EKF is really small for SNR not less than 0.5. For the Lorenz<br />
attractor, the performance of the nonlinear filters is depicted in<br />
Fig. 2.<br />
Fig. 2. MSE vs. SNR for Lorenz attractor.<br />
From Fig. 2, it can be observed that GHF, QKF<br />
and EKF give better results than UKF when the nonlinear<br />
filters are applied to a Lorenz chaotic system. This is due to the<br />
a-posteriori distribution of the Lorenz attractor, which can be<br />
better approximated by the Gaussian filters (GHF and QKF),<br />
while the linearization approach involved in EKF is sufficient<br />
to approximate the chaotic dynamics.<br />
It is also deduced that UKF does not match the performance<br />
of the Gaussian filters and EKF, even though the UKF is the most<br />
complex algorithm.<br />
Finally, the results obtained by applying the nonlinear filters<br />
to the Rössler attractor are depicted in Fig. 4.<br />
Fig. 4. MSE vs. SNR for Rössler attractor.<br />
For the Rössler system, EKF works better than GHF, QKF, and<br />
UKF. This is because the a-posteriori PDF of the Rössler<br />
attractor can be successfully represented through a Gram-Charlier<br />
series [12], such that the approximation by linearization<br />
is close to the real system. Notice that the MSE for GHF and<br />
QKF does not tend to zero.<br />
Another way to compare the nonlinear filters is through the<br />
time needed to complete the filtering process for the<br />
different attractors. As mentioned above, only the first 3000<br />
samples of the nonlinear filtering processes are considered.<br />
Table III is intended to give an idea of the<br />
efficiency of the filtering algorithms.<br />
It is important to remark that, although QKF runs<br />
faster than EKF, GHF and UKF, the EKF algorithm is<br />
simple and fast enough to be considered a good choice for<br />
chaotic filtering.<br />
Consequently, corroborating the analysis presented in<br />
previous sections, EKF is suggested as the best option for<br />
filtering and estimation of the Chua, Lorenz and Rössler<br />
attractors.<br />
VI. CONCLUSION<br />
In this report, the effectiveness of the extended Kalman filter<br />
(EKF), unscented Kalman filter (UKF), Gauss-Hermite<br />
quadrature filter (GHF), and quadrature Kalman filter<br />
(QKF) is compared for state estimation of chaotic<br />
attractors, in both high and rather low SNR scenarios.<br />
It was shown that, in contrast to SDE modeling of non-Gaussian<br />
signals, the chaos representation of statistically<br />
equivalent signals (in terms of PDFs) provides a “force<br />
sensing driving” of the filtering algorithms toward the singular<br />
conditions even from rather low SNR limits and, as a<br />
consequence, yields high filtering accuracy (low<br />
MSE), practically invariant to the SNR level.<br />
This fact follows from the absence of process noise<br />
components in the SDE of chaos, and it was first predicted<br />
theoretically in [13]. To the best of our knowledge, this<br />
phenomenon has not been discussed in the existing literature.<br />
On the basis of the filtering results, the analysis shows that<br />
EKF achieves very acceptable performance for the Chua, Lorenz<br />
and Rössler attractors. Although UKF, GHF and QKF work<br />
better than EKF for Chua and Lorenz, these filters might be<br />
much more complex for real-time implementations.<br />
TABLE III.<br />
SIMULATION TIME FOR CHAOTIC ATTRACTORS<br />
Attractor | EKF | UKF | GHF | QKF<br />
Chua | 1.30s | 2.07s | 1.52s | 1.07s<br />
Lorenz | 1.26s | 2.19s | 1.58s | 1.09s<br />
Rössler | 1.25s | 2.17s | 1.62s | 1.08s<br />
From a computational complexity point of view, the EKF<br />
and QKF require less effort than the GHF and UKF, while<br />
UKF involves the most complicated filtering procedure.<br />
Finally, it is important to remark that the EKF algorithm is the<br />
one with the smallest code, so, together with the previous<br />
observations, the analysis suggests EKF as the best filtering<br />
choice for real-time applications.<br />
ACKNOWLEDGMENT<br />
The authors would like to acknowledge the valuable help of<br />
Dr. Fernando Ramos and M. Sc. Beatriz Rodríguez in the<br />
preparation of this paper.<br />
REFERENCES<br />
[1] Anischenko, V. S., et al., “Statistical properties of dynamical chaos”,<br />
Physics-Uspekhi, vol. 48, no. 2, pp. 151-166, 2005.<br />
[2] Arasaratnam, I., et al., “Discrete-Time Nonlinear Filtering Algorithms<br />
Using Gauss-Hermite Quadrature”, Proceedings of the IEEE, vol. 95, no.<br />
5, pp. 953-977, 2007.<br />
[3] Chui, C. K., and Chen, G., Kalman Filtering with Real-Time<br />
Applications, Springer-Verlag Berlin Heidelberg, 1999.<br />
[4] Eckmann, J., and Ruelle, D., “Ergodic Theory of Chaos and Strange Attractors”,<br />
Reviews of Modern Physics, vol. 57, pp. 617-656, July 1985.<br />
[5] Haykin, S., Kalman Filtering and Neural Networks, John Wiley & Sons,<br />
2001.<br />
[6] Ito, K., and Xiong, K., “Gaussian Filters for Nonlinear Filtering<br />
Problems”, IEEE Transactions on Automatic Control, vol. 45, no. 5, pp.<br />
910-927, 2000.<br />
[7] Jazwinski, A., Stochastic Processes and Filtering Theory, N.Y.<br />
Academic, 1970.<br />
[8] Julier, S. J., et al., “Unscented Filtering and Nonlinear Estimation”,<br />
Proceedings of the IEEE, vol. 92, no. 3, pp. 401-422, 2004.<br />
[9] Kazakov, I., and Artemiev, V., Optimization of Dynamic Systems with<br />
Random Structure, Nauka, 1980. (In Russian).<br />
[10] Kontorovich, V., “Non-Linear Filtering for Markov Stochastic Processes<br />
using High-Order Statistics (HOS) Approach”, Non-Linear Analysis:<br />
Theory, Methods and Applications, vol. 30, no. 5, pp. 3165-3170, 1997.<br />
[11] Kontorovich, V., “Applied Statistical Analysis for Strange Attractors<br />
and Related Problems”, Mathematical Methods in the Applied Sciences,<br />
vol. 30, pp. 1705-1717, 2007.<br />
[12] Kontorovich, V., et al., “Analysis of Rössler Attractor and its<br />
Applications”, Special Issue on Nonlinear Dynamics and<br />
Synchronization in The Open Cybernetics and Systemics Journal, 2009.<br />
(In press)<br />
[13] Kontorovich, V., and Lovtchikova, Z., “Nonlinear filtering algorithms for<br />
chaotic signals: a comparative study”, Proceedings of INDS’09, Second<br />
International Workshop on Nonlinear Dynamics and Synchronization,<br />
Klagenfurt, Austria, pp. 221-227, July 2009.<br />
[14] Kontorovich, V., and Lovtchikova, Z., “Cumulant analysis of strange<br />
attractors. Theory and applications”, Recent Advances in Nonlinear<br />
Dynamics and Synchronization, SCI 254, 2009. (In press)<br />
[15] Kushner, H., “Dynamical Equations for Optimal Nonlinear Filtering”,<br />
Journal of Differential Equations, vol. 3, pp. 179-190, 1967.<br />
[16] Kushner, H., and Budhiraja, A., “A Nonlinear Filtering Algorithm Based<br />
on an Approximation of the Conditional Distribution”, IEEE Trans. on<br />
Automatic Control, vol. 45, no. 3, pp. 580-585, March 2000.<br />
[17] Mijangos, M., Kontorovich, V., and Aguilar-Torrentera, J., “Some<br />
Statistical Properties of Strange Attractors: Engineering View”, Journal<br />
of Physics: Conference Series: 012147 (6pp), vol. 96, March 2008.<br />
[18] Primak, S., Kontorovich, V., and Lyandres, V., Stochastic Methods and<br />
their Applications to Communications: Stochastic Differential Equations<br />
Approach, John Wiley & Sons, 2004.<br />
[19] Pugachev, V., and Sinitsyn, I., Stochastic Differential Systems. Analysis<br />
and Filtering, John Wiley & Sons, 1987.<br />
[20] Stratonovich, R., Topics in the Theory of Random Noise, vol. 1 and vol.<br />
2, Gordon and Breach, 1963.<br />
[21] Van Trees, H., Detection, Estimation and Modulation Theory, John<br />
Wiley & Sons, 2001.<br />
[22] Zakai, M., “On the Optimal Filtering of Diffusion Processes”,<br />
Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 11, pp. 230-243, 1969.<br />
Nonlinear Feature Extraction Approaches for<br />
Scalable Face Recognition Applications<br />
Hima Deepthi Vankayalapati<br />
Institute of Smart Systems Technologies<br />
University of Klagenfurt<br />
9020 Klagenfurt, Austria<br />
hvankaya@edu.uni-klu.ac.at<br />
Kyandoghere Kyamakya<br />
Institute of Smart Systems Technologies<br />
University of Klagenfurt<br />
9020 Klagenfurt, Austria<br />
kyandoghere.kyamakya@uni-klu.ac.at<br />
Abstract—The human ability to identify thousands of people,<br />
even after many years, has motivated many researchers to focus on<br />
face recognition systems. The majority of real world applications<br />
demand more robust, scalable and computationally efficient<br />
face recognition techniques which can operate under complex<br />
viewing and environmental conditions. The appearance based<br />
linear subspace techniques are very useful in data classification<br />
and dimensionality reduction tasks; however, these algorithms<br />
can only classify linear data. The scalability of the linear subspace<br />
techniques is limited, as the computational load and memory<br />
requirements increase dramatically with the size of the database. This<br />
paper evaluates different nonlinear feature extraction approaches<br />
for face recognition applications, namely the wavelet transform, the radon<br />
transform and cellular neural networks (CNN). In this work, the<br />
combination of radon and wavelet transform based approaches is<br />
used to extract multi-resolution features, which are invariant<br />
to facial expression and illumination conditions. The efficiency of<br />
the wavelet and radon based nonlinear approaches<br />
is demonstrated with simulation results obtained<br />
on the FERET database. This paper also presents the use of<br />
CNN for extracting nonlinear facial features, which improves the<br />
recognition rate as well as the computational speed compared to the<br />
other nonlinear approaches on the ORL database.<br />
Index Terms—Feature extraction, Face recognition, Linear<br />
subspace techniques, Cellular neural network, Wavelet transform,<br />
Radon transform.<br />
I. INTRODUCTION<br />
In computer vision, a feature is a set of measurements. Each<br />
measurement contains a piece of information and specifies a<br />
property or characteristic of the object present in the image<br />
[1]. Linear features are more advantageous when the given<br />
data is Gaussian distributed in terms of the mean. However, in most<br />
real world face recognition applications, the facial features of the<br />
face image are not purely Gaussian distributed (they vary with<br />
complex viewing and environmental conditions).<br />
Researchers have developed various biometric techniques to<br />
identify or recognize persons by their physical characteristics<br />
such as finger, voice, face, etc. These biometric techniques have<br />
their own advantages and drawbacks [2]. Among all<br />
the biometric techniques, face recognition has the distinct<br />
advantage of collecting the required data (i.e., the image) without<br />
any cooperation from the person [3]. Face recognition is<br />
a complex visual classification task which plays an important<br />
role in computer vision, image processing and pattern recognition.<br />
Research on face recognition started in the<br />
1960’s [4]. Different face recognition techniques have been<br />
proposed during the last decades, namely feature based, model<br />
based and appearance based techniques [5], [6]. Feature<br />
based techniques describe the position<br />
and size of each feature (eye, nose, mouth or face outline)<br />
[7]. In this approach, extracting the features under different poses<br />
(viewing conditions) and lighting conditions is a very complex<br />
task. For applications with large databases, there is a large<br />
set of features with different sizes and positions, making it<br />
difficult to identify the required feature points [8]. In the<br />
model based approach, a 3D model is constructed based on<br />
the facial variations in the image or important information<br />
related to the image. The difficulties in this approach are that<br />
a very expensive camera (stereo vision) is needed to capture the<br />
facial variations clearly, that the construction of the 3D model is<br />
difficult, and that it takes more time to construct the model for<br />
large databases [6]. Obtaining large 3D data sets is also<br />
a complex task, which makes the model based<br />
methods unsuitable for real world applications dealing with<br />
large databases.<br />
In the 1990’s, researchers introduced appearance based linear<br />
subspace techniques, which are statistical techniques, to solve face<br />
recognition problems. The introduction of the linear subspace<br />
techniques is a milestone in face recognition. The<br />
performance of appearance based techniques heavily depends<br />
on the quality of the features extracted from the image [9]. The<br />
appearance based linear subspace techniques extract global<br />
features, as these techniques use statistical properties such as<br />
the mean and variance of the image [6]. The major difficulty<br />
in applying these techniques to large databases is that the<br />
computational load and memory requirements for calculating<br />
the features increase dramatically with the database size [3]. In order<br />
to increase the performance of the face recognition techniques,<br />
nonlinear feature extraction techniques have been introduced.<br />
In order to improve the performance of a face recognition<br />
technique, both linear and nonlinear features have to be extracted.<br />
Many nonlinear feature extraction techniques are available,<br />
such as the radon transform and the wavelet transform. The radon<br />
transform based nonlinear feature extraction gives the direction<br />
of local features. This process extracts the spatial frequency<br />
components in the direction in which the radon projection is computed<br />
[10]. When features are extracted using the radon transform, the<br />
variations in this facial frequency are also boosted [10]. The<br />
wavelet transform gives the spatial and frequency components<br />
present in an image [11]. However, these nonlinear feature<br />
extraction techniques are computationally expensive. In order<br />
to improve the computational speed of the nonlinear feature<br />
extraction process, the cellular neural network (CNN) concept<br />
is proposed.<br />
The novel scheme will involve, at its heart, CNN based<br />
processors, which will be the key component of the analog<br />
computing based ultra-fast solver for image processing tasks.<br />
CNN based analog computing has the very attractive advantage<br />
of easy implementation or emulation on digital platforms.<br />
The objective of this paper is to present the use of CNN in<br />
extracting nonlinear features using the ORL database.<br />
The paper is organized as follows: in Section II, the importance<br />
and methodologies of the linear subspace techniques<br />
are explained briefly. In Section III, the basics and importance<br />
of the radon transform are explained briefly. In Section IV, the<br />
wavelet transform is briefly described. In Section V, the cellular<br />
neural network is introduced. The genetic algorithm based template<br />
calculation method is also briefly described in Section<br />
V. The experimental simulation results using the FERET and<br />
the ORL databases are described in Section VI. Section VII presents<br />
some concluding remarks and an outlook.<br />
II. LINEAR SUBSPACE TECHNIQUES<br />
Principal Component Analysis (PCA), Independent Component<br />
Analysis (ICA) and Linear Discriminant Analysis (LDA)<br />
are appearance based linear subspace techniques<br />
[6]. These linear subspace techniques use statistics (mean and<br />
co-variance). The mean and co-variance are<br />
calculated from the train data set, which forms the data matrix<br />
X. In the data matrix X, each column xi represents an image<br />
of the train data set, and N is the number of train images. The mean image of the train data set is<br />
given in Eq. 1.<br />
$$m = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad (1)$$<br />
The co-variance matrix C of the random vector x is calculated<br />
using Eq. 2, where A denotes the matrix of centered images.<br />
$$C = \frac{1}{N}\sum_{i=1}^{N} (x_i - m)(x_i - m)^T \quad \text{(or)} \quad C = AA^T \qquad (2)$$<br />
Calculating the co-variance matrix by using Eq. 2 requires a large<br />
amount of memory because of the dimensions of C. The size of A is<br />
LM x N. The size of C is LM x LM, which is very large.<br />
So the matrix L = A^T A is considered instead of C. The<br />
dimension of L is N x N, which is much smaller than the<br />
dimensions of C. After the co-variance matrix is obtained, each technique<br />
(PCA, ICA and LDA) uses a specific approach to calculate the<br />
key parameters of the feature space.<br />
In linear subspace techniques, all the images in the train data<br />
set are represented as points in the feature space, as shown in<br />
Fig. 1. The given test image is also represented as a point in<br />
the same space, and the train data set image at minimum distance<br />
gives the best match.<br />
Fig. 1. Image representation in the high dimensional space (axes X1, X2, X3)<br />
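The memory argument for using L = A^T A in place of C = AA^T can be illustrated numerically. The sketch below is only an illustration on synthetic data: the sizes, the random data matrix and the 1/sqrt(N) scaling of A (chosen so that AA^T matches Eq. 2) are assumptions and not values from the paper.<br />

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 400, 6                        # d plays the role of LM (pixels), N the number of images
X = rng.random((d, N))               # synthetic stand-in for the data matrix, one image per column

m = X.mean(axis=1, keepdims=True)    # Eq. 1: mean image
A = (X - m) / np.sqrt(N)             # centered and scaled so that A @ A.T equals C in Eq. 2

L_small = A.T @ A                    # N x N matrix instead of the d x d matrix C
vals, V = np.linalg.eigh(L_small)    # eigenvalues in ascending order
keep = vals > 1e-10 * vals.max()     # centering removes one degree of freedom
U = A @ V[:, keep]                   # eigenvectors of C recovered from those of L
U /= np.linalg.norm(U, axis=0)       # rescale to unit length

C = A @ A.T                          # formed here only to check the result
residual = np.abs(C @ U - U * vals[keep]).max()
```

If v is an eigenvector of A^T A with eigenvalue λ, then Av is an eigenvector of AA^T with the same eigenvalue, so the large matrix C never has to be decomposed directly.<br />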
A. Principal Component Analysis (PCA)<br />
PCA highlights the similarities and differences between the<br />
variables in the data [12], [13]. After calculating the co-variance<br />
matrix, we calculate its eigenvalues and<br />
eigenvectors. Then we arrange all<br />
eigenvalues in descending order and keep the few highest<br />
eigenvalues and the corresponding eigenvectors. This operation is<br />
the evaluation of the principal components [14]. The eigenvectors<br />
e1, e2, ..., en are collected in Eq. 3.<br />
$$W_{pca} = [e_1, e_2, \ldots, e_n] \qquad (3)$$<br />
We neglect the remaining less significant eigenvalues and<br />
the corresponding eigenvectors. Neglecting these eigenvalues<br />
leads to a very small information loss [15]. The principal<br />
component axis passes through the mean values. The<br />
transformation matrix Wpca is then used to project the<br />
original data set onto the principal components.<br />
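The PCA steps above, together with the minimum distance matching in the feature space, can be sketched as follows. This is a minimal sketch on synthetic data; the function names `pca_fit` and `best_match`, the Euclidean distance and the data sizes are assumptions made for illustration and are not taken from the paper.<br />

```python
import numpy as np

def pca_fit(X, k):
    """X has one image per column; returns the mean image and the top-k eigenvectors."""
    m = X.mean(axis=1, keepdims=True)
    C = (X - m) @ (X - m).T / X.shape[1]       # co-variance matrix of Eq. 2
    vals, vecs = np.linalg.eigh(C)             # ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]         # descending order, keep the k largest
    return m, vecs[:, order]                   # columns form W_pca of Eq. 3

def best_match(W, m, X_train, y):
    """Index of the train image closest to the test image y in the feature space."""
    P_train = W.T @ (X_train - m)
    p_test = W.T @ (y.reshape(-1, 1) - m)
    return int(np.argmin(np.linalg.norm(P_train - p_test, axis=0)))

rng = np.random.default_rng(1)
X_train = rng.random((20, 8))                  # synthetic data: 8 "images" of 20 pixels
m, W_pca = pca_fit(X_train, k=4)
y = X_train[:, 3] + 0.001 * rng.standard_normal(20)   # slightly perturbed copy of image 3
match = best_match(W_pca, m, X_train, y)
```

For realistic image sizes the direct co-variance matrix would be replaced by the smaller A^T A matrix discussed at the beginning of this section.<br />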
B. Independent Component Analysis (ICA)<br />
ICA uses the higher order statistics of the input data<br />
to find the independent components. Independence is<br />
established starting from uncorrelated data. ICA is a<br />
special case of the blind source separation problem [16]. One of the simplest<br />
applications of ICA is the cocktail party problem.<br />
The ICA technique can thus be seen as a generalization of the PCA technique.<br />
In this technique we first calculate the PCA transformation<br />
matrix Wpca, transform the centered matrix P = [x1 − m, x2 −<br />
m, ..., xN − m] using Wpca and then form a new matrix Z<br />
(a square matrix of size N x N), which contains the random<br />
vector z, whose elements are uncorrelated, as shown in Eq. 4.<br />
$$Z = W_{pca}^T P \qquad (4)$$<br />
The next important stage is the rotation stage, in which<br />
the fixed point algorithm is used to find Wk [17]. After<br />
that, we calculate the overall transformation matrix as shown<br />
in Eq. 5.<br />
$$W_{ica} = W_{pca} W_k \qquad (5)$$<br />
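The two ICA stages (decorrelation by Wpca, then rotation by Wk) can be sketched as below. The paper does not spell out the fixed point algorithm of [17], so the rotation here is an assumed FastICA-style symmetric iteration with a tanh nonlinearity; the scaling of Wpca to unit variance, the function names and the synthetic two-source mixture are likewise assumptions.<br />

```python
import numpy as np

def whiten(P, k):
    """PCA stage: returns a scaled W_pca and Z = W_pca^T P (Eq. 4) with identity co-variance."""
    vals, vecs = np.linalg.eigh(P @ P.T / P.shape[1])
    order = np.argsort(vals)[::-1][:k]
    W_pca = vecs[:, order] / np.sqrt(vals[order])
    return W_pca, W_pca.T @ P

def fixed_point_rotation(Z, n_iter=200, seed=0):
    """Symmetric fixed-point iteration (assumed variant); returns an orthogonal W_k."""
    k, N = Z.shape
    W = np.linalg.qr(np.random.default_rng(seed).standard_normal((k, k)))[0]
    for _ in range(n_iter):
        G = np.tanh(W.T @ Z)                              # nonlinearity applied to each component
        W = Z @ G.T / N - W * (1.0 - G**2).mean(axis=1)   # fixed-point update of every column
        U, _, Vt = np.linalg.svd(W)
        W = U @ Vt                                        # symmetric orthogonalisation
    return W

rng = np.random.default_rng(0)
S = np.vstack([rng.uniform(-1, 1, 5000), rng.laplace(size=5000)])   # two non-Gaussian sources
X = np.array([[1.0, 0.5], [0.3, 1.0]]) @ S                          # observed mixtures
P = X - X.mean(axis=1, keepdims=True)                               # centered matrix P
W_pca, Z = whiten(P, k=2)
W_k = fixed_point_rotation(Z)
W_ica = W_pca @ W_k                                                 # Eq. 5
Y = W_ica.T @ P                                                     # estimated components
```

Because Wk is orthogonal, the components in Y stay uncorrelated; the rotation only changes the higher order statistics.<br />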
C. Linear Discriminant Analysis (LDA)<br />
The main objective of LDA is to minimize the within<br />
class variance and maximize the between class variance in<br />
the given data set. In other words, it groups the images of the same<br />
class and separates the images of different classes [18]. A class<br />
is the collection of data (images) belonging to the same<br />
object or the same person. In LDA, we have to calculate the mean<br />
image of each class i, which is denoted mi.<br />
$$S_i = \frac{1}{N_i}\sum_{x \in X_i} (x - m_i)(x - m_i)^T \qquad (6)$$<br />
Eq. 6 represents the class dependent scatter matrix, i.e.<br />
the co-variance matrix of the centered images of<br />
class i. Xi represents the data matrix corresponding to<br />
class i. Ni represents the number of images in class i. c represents<br />
the total number of classes. The within class scatter matrix Sw<br />
is calculated from Eq. 7.<br />
$$S_w = \sum_{i=1}^{c} S_i \qquad (7)$$<br />
This evaluates the amount of variance between<br />
the images in each class. Sb represents the between class<br />
scatter matrix [3], which measures the variance between the<br />
classes. Each term of Sb is built from<br />
the difference between the mean of a class and the<br />
total mean m of all classes. Sb is expressed in Eq. 8.<br />
$$S_b = \sum_{i=1}^{c} (m_i - m)(m_i - m)^T \qquad (8)$$<br />
If Sw is non-singular, we solve the generalized eigenvalue<br />
problem for the transformation matrix W, following the linear discriminant<br />
analysis flow in Fig. 2. This transformation matrix should<br />
maximize the between class scatter and minimize the<br />
within class scatter [19]. There are many ways to<br />
solve the generalized eigenvalue problem [20]. One method<br />
is to take the inverse of Sw and<br />
solve $S_w^{-1} S_b W = W\lambda$, which is<br />
derived from Eq. 9.<br />
$$S_b W = S_w W \lambda \qquad (9)$$<br />
λ is a diagonal matrix containing the eigenvalues of the matrix<br />
$S_w^{-1} S_b$. The above algorithm is applicable only when the within<br />
class scatter matrix is non-singular. If the within class scatter matrix<br />
is singular, we should use the direct LDA technique [15].<br />
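For the non-singular case, Eqs. 6 to 9 can be sketched as follows. This is a minimal sketch on synthetic data with more images than features, so that Sw can be inverted; the function names `scatter_matrices` and `lda_transform` and the three synthetic classes are assumptions for illustration.<br />

```python
import numpy as np

def scatter_matrices(X, labels):
    """X has one image per column; returns S_w (Eq. 7) and S_b (Eq. 8)."""
    labels = np.asarray(labels)
    d = X.shape[0]
    m = X.mean(axis=1, keepdims=True)                  # total mean of all images
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[:, labels == c]
        mc = Xc.mean(axis=1, keepdims=True)            # class mean m_i
        Sw += (Xc - mc) @ (Xc - mc).T / Xc.shape[1]    # S_i of Eq. 6
        Sb += (mc - m) @ (mc - m).T
    return Sw, Sb

def lda_transform(Sw, Sb, k):
    """Solve S_w^{-1} S_b W = W lambda (non-singular S_w only) and keep k eigenvectors."""
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1][:k]
    return vecs[:, order].real, vals[order].real

rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 30)                      # 3 classes, 30 samples each
centers = np.array([[0, 4, 0], [0, 0, 4], [0, 0, 0], [0, 0, 0]], dtype=float)
X = centers[:, labels] + rng.standard_normal((4, 90))  # 4 features, 90 samples
Sw, Sb = scatter_matrices(X, labels)
W, lam = lda_transform(Sw, Sb, k=2)
```

With c classes, Sb has rank at most c − 1, so at most c − 1 eigenvalues are nonzero; here k = 2 keeps all of them.<br />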
The direct LDA is performed in the following steps as shown<br />
in Fig. 2.<br />
The first step is to find the eigenvectors of the<br />
between class scatter matrix Sb = Pb Pb^T. They are obtained<br />
from the much smaller matrix Pb^T Pb, where Pb is<br />
calculated by subtracting the mean face image of all images<br />
from the mean face image of each class, as expressed in Eq. 10.<br />
$$P_b = [m_1 - m,\; m_2 - m,\; \ldots,\; m_c - m] \qquad (10)$$<br />
The second step takes the most significant eigenvalues and<br />
the corresponding eigenvectors V. These eigenvectors are used<br />
to calculate Y = Pb V and Db = Y^T Sb Y. This leads to the<br />
whitening transform $Z = Y D_b^{-1/2}$. Sb and<br />
Sw are projected onto the new subspace spanned by Z. The<br />
small matrix Z^T Sw Z can be diagonalized, as<br />
expressed in Eq. 11.<br />
$$U^T Z^T S_w Z U = \lambda_w \qquad (11)$$<br />
U and λw are the eigenvectors and eigenvalues of the<br />
matrix Z^T Sw Z. The corresponding eigen matrix is represented<br />
as R. The overall transformation matrix is calculated from<br />
W = ZR. A new transformation can be performed by using<br />
the linear transformation of the original space into a new<br />
reduced dimensional feature space Px = W^T X (i.e., this<br />
transformation matrix is projected onto the train data set) [6].<br />
The next operation is the projection of this<br />
transformation matrix onto the test data set to obtain Py.<br />
The best match is found by calculating the distance between<br />
Px and Py using the distance measure technique. The overall<br />
linear discriminant analysis technique for face recognition is<br />
shown in Fig. 2.<br />
Fig. 2. Linear discriminant analysis technique for face recognition<br />
(flowchart: centering of the data matrix X and of the test image y, calculation of Sw and Sb, test on the singularity of Sw, eigenvector calculation or whitening transform Z, projections Px = W^T X and Py = W^T y, minimum distance, recognition result)<br />
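The direct LDA steps can be sketched as follows for a case where Sw is singular (more pixels than images). This is a sketch under stated assumptions: the function name `direct_lda`, the tolerance used to select the significant eigenvalues, the choice R = U and the synthetic data are not taken from the paper.<br />

```python
import numpy as np

def direct_lda(X, labels, tol=1e-10):
    """X has one image per column. Returns W plus S_b and S_w for checking."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    m = X.mean(axis=1, keepdims=True)
    class_means = np.column_stack([X[:, labels == c].mean(axis=1) for c in classes])
    Pb = class_means - m                          # Eq. 10
    Sb = Pb @ Pb.T                                # Eq. 8
    Sw = np.zeros_like(Sb)
    for c in classes:                             # Eqs. 6 and 7
        D = X[:, labels == c] - X[:, labels == c].mean(axis=1, keepdims=True)
        Sw += D @ D.T / D.shape[1]
    vals, V = np.linalg.eigh(Pb.T @ Pb)           # small c x c eigenproblem
    V = V[:, vals > tol * vals.max()]             # keep the significant eigenvalues
    Y = Pb @ V
    Db = np.diag(Y.T @ Sb @ Y)                    # diagonal entries of D_b
    Z = Y / np.sqrt(Db)                           # whitening: Z^T S_b Z = I
    lam_w, U = np.linalg.eigh(Z.T @ Sw @ Z)       # Eq. 11
    return Z @ U, Sb, Sw                          # W = Z R with R taken as U (assumption)

rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 4)                  # 3 persons, 4 images each
offsets = 3.0 * rng.standard_normal((50, 3))      # synthetic class centres
X = offsets[:, labels] + rng.standard_normal((50, 12))
W, Sb, Sw = direct_lda(X, labels)                 # 50 pixels > 12 images: S_w is singular
```

The resulting W makes W^T Sb W the identity and W^T Sw W diagonal without ever inverting Sw, which is what allows the method to be used when Sw is singular.<br />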
Technique | Year | Iterative | Class information usage | Order of statistics | Recognition rate (for 80 persons database) | Speed | Scalability<br />
Principal component analysis (PCA) | 1990 | No | No | Second order | 70% | medium | low<br />
Independent component analysis (ICA) | 1999 | Yes | No | Higher order | 79% | very low | low<br />
Linear discriminant analysis (LDA) | 1997 | No | Yes | Second order | 89% | high | high<br />
Fig. 3. Comparison of linear subspace techniques (PCA, ICA and LDA)<br />
The performance of different linear subspace techniques like<br />
PCA, ICA and LDA is evaluated. Experiments are conducted<br />
to understand the performance (recognition rate and speed) of<br />
these linear subspace techniques over the FERET database.<br />
Among linear subspace techniques, LDA gives both high<br />
recognition rate and speed when compared with PCA and<br />
ICA as shown in Fig. 3. But LDA is not scalable, and the<br />
recognition rate is also not sufficient for real world applications.<br />
In linear subspace techniques, the computational load<br />
and memory requirements increase dramatically with the<br />
size of the database.<br />
III. RADON TRANSFORM<br />
The two dimensional radon transform was introduced by<br />
Austrian mathematician Johann Radon in 1917. This transform<br />
gives the integral of the set of lines present in a given image<br />
[10]. Due to this, it captures the direction of the local features<br />
(lines, curves <strong>and</strong> circles) which are present in the image. This<br />
transform is useful in many line, circle <strong>and</strong> curve detection<br />
applications, related to image processing <strong>and</strong> computer vision<br />
[10]. The radon transform of the two dimensional function<br />
f(x, y) in (r, θ) plane (Fig. 4(a)) is shown in Eq. 12<br />
� ∞ � ∞<br />
R(r, θ)[f(x, y)] = f(x, y)δ(xcosθ+ysinθ−r)dxdy<br />
−∞<br />
−∞<br />
(12)<br />
where δ(·) is the Dirac delta function, r ∈ (−∞, ∞) is the<br />
perpendicular distance of a line from the origin and θ ∈ [0, π] is<br />
the angle formed by the distance vector [10]. The δ function<br />
converts the two-dimensional integral to a line integral over dl<br />
along the line x cos θ + y sin θ = r. The simplified form of<br />
R(r, θ)[f(x, y)] is Rf, given in Eq. 13:<br />
$$Rf(r, \theta) = \int_{-\infty}^{\infty} f(r\cos\theta - l\sin\theta,\; r\sin\theta + l\cos\theta)\, dl \qquad (13)$$<br />
Fig. 4. (a) Geometry of the Radon transform R(r, θ) of an image f(x, y) (b) The original image (c) Radon transform of the image for angles from 0 to 180 degrees (horizontal axis: θ in degrees; vertical axis: r)<br />
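As a quick check of the line parametrisation behind Eq. 13 (a worked step added here, not taken from the original derivation), assume a point at signed distance l along the line is written as x = r cos θ − l sin θ, y = r sin θ + l cos θ. Substituting into the argument of the δ function of Eq. 12 shows that every such point lies on the line x cos θ + y sin θ = r:<br />

```latex
\begin{aligned}
x\cos\theta + y\sin\theta
  &= (r\cos\theta - l\sin\theta)\cos\theta + (r\sin\theta + l\cos\theta)\sin\theta \\
  &= r(\cos^2\theta + \sin^2\theta) + l(-\sin\theta\cos\theta + \sin\theta\cos\theta) \\
  &= r .
\end{aligned}
```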
The transformed function Rf(r, θ) is referred to as the sinogram<br />
of f(x, y), because the δ function maps each point of f to a<br />
sinusoidal curve in the (r, θ) plane. Rf is defined<br />
as a function of straight lines. The Radon transform of the<br />
two-dimensional image shown in Fig. 4(b) extracts the direction<br />
of the lines present in that image, as shown in Fig. 4(c).<br />
The sinogram (Fig. 4(c)) of the given image has 181 Radon<br />
projections, one for each integer angle from 0 to 180 degrees.<br />
Each projection is a feature vector.<br />
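The projections described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the paper: it discretizes Eq. 12 by adding every pixel to the nearest r bin for each angle. The function name `radon_sinogram`, the centred coordinate system and the nearest-bin rounding are assumptions made for this example.<br />

```python
import numpy as np

def radon_sinogram(image, angles_deg=None):
    """Nearest-bin discretization of Eq. 12 (illustrative sketch only).

    Returns an array of shape (number of r bins, number of angles);
    each column is one Radon projection.
    """
    image = np.asarray(image, dtype=float)
    if angles_deg is None:
        angles_deg = np.arange(181)  # 0..180 degrees -> 181 projections
    rows, cols = image.shape
    # Pixel coordinates measured from the image centre (assumption).
    y, x = np.mgrid[0:rows, 0:cols]
    x = x - (cols - 1) / 2.0
    y = y - (rows - 1) / 2.0
    r_max = int(np.ceil(np.hypot(rows, cols) / 2.0))
    n_bins = 2 * r_max + 1
    sinogram = np.zeros((n_bins, len(angles_deg)))
    for k, theta in enumerate(np.deg2rad(angles_deg)):
        # Signed distance of every pixel from the origin along direction theta.
        r = x * np.cos(theta) + y * np.sin(theta)
        idx = np.rint(r).astype(int) + r_max
        # Sum the intensities of all pixels that fall on the same line.
        sinogram[:, k] = np.bincount(idx.ravel(), weights=image.ravel(),
                                     minlength=n_bins)
    return sinogram
```

Each column of the returned array plays the role of one projection, so a column (or the concatenation of columns) could serve as the feature vector mentioned in the text. An interpolating implementation would give smoother projections than this nearest-bin version.<br />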
IV. WAVELET TRANSFORM<br />
Morlet introduced the wavelet transform in the early 1980s<br />
[21]. The wavelet is named 'ondelette' in French, which means<br />
'small wave' [11]. A wavelet gives both the spatial and<br />
frequency information of an image. In the frequency representation,<br />
the signal is cut into several parts and each part<br />
is analyzed separately. Commonly used discrete wavelets are the<br />
Daubechies wavelets [22]. A one-level wavelet decomposition<br />
is performed by using the high-pass filter g and the low-pass<br />
filter h. Convolution with the low-pass filter gives the<br />
approximation information, while convolution with the high-pass<br />
filter gives the detail information [23]. The wavelet<br />
decomposition process of a two-dimensional signal f(x, y) is<br />
shown in Fig. 5. The overall process is modeled in Eqs. (14)-(17).<br />
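Fig. 5. Wavelet coefficients decomposition in discrete wavelet transform (input X(n) passed through cascaded high-pass (HP) and low-pass (LP) filters, each followed by downsampling by 2)<br />
$$A = [h \ast [h \ast f]_x \downarrow 2]_y \downarrow 2 \qquad (14)$$<br />
$$H = [g \ast [h \ast f]_x \downarrow 2]_y \downarrow 2 \qquad (15)$$<br />
$$V = [h \ast [g \ast f]_x \downarrow 2]_y \downarrow 2 \qquad (16)$$<br />
$$D = [g \ast [g \ast f]_x \downarrow 2]_y \downarrow 2 \qquad (17)$$<br />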
The star (∗) represents the convolution operation, and ↓ 2<br />
represents downsampling by 2 along the direction x or<br />
y [11]. To correct the sample rate after filtering, the filter<br />
output is downsampled by two (by simply discarding every second<br />
coefficient). The Daubechies family contains many wavelet<br />
functions. In this work, db4 is used because of its symmetry.<br />
The db4 decomposition yields the four sets of wavelet coefficients A, H, V and D<br />
and the corresponding images. In this decomposition, A gives<br />
the approximation information, and its image is a blurred<br />
version of the input, as shown in Fig. 5. H gives the horizontal features, V<br />
gives the vertical features and D gives the diagonal features<br />
present in the image. The wavelet coefficient A gives the highest<br />
performance of the four, and D gives the lowest.<br />
Using all of the A + H + V + D wavelet coefficients leads to a performance<br />
that is nearly equal to that of A alone.<br />
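Eqs. (14)-(17) can be illustrated with a one-level decomposition in NumPy. This is a minimal sketch under stated assumptions: it uses the two-tap Haar (db1) filters instead of the db4 filters used in the paper, purely to keep the example short, and it treats x as the column direction and y as the row direction. The helper names `filter_downsample` and `dwt2_one_level` are hypothetical.<br />

```python
import numpy as np

# Haar (db1) analysis filters; the paper uses db4, which has longer filters.
H_LOW = np.array([1.0, 1.0]) / np.sqrt(2.0)    # low-pass filter h
G_HIGH = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass filter g

def filter_downsample(data, kernel, axis):
    """Convolve along one axis, then keep every second sample (the '↓ 2' step)."""
    filtered = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="full"), axis, data)
    keep = np.arange(1, filtered.shape[axis], 2)
    return np.take(filtered, keep, axis=axis)

def dwt2_one_level(f):
    """Return the A, H, V, D sub-bands following the structure of Eqs. (14)-(17)."""
    f = np.asarray(f, dtype=float)
    low_x = filter_downsample(f, H_LOW, axis=1)    # [h * f]_x ↓ 2
    high_x = filter_downsample(f, G_HIGH, axis=1)  # [g * f]_x ↓ 2
    A = filter_downsample(low_x, H_LOW, axis=0)    # Eq. 14
    H = filter_downsample(low_x, G_HIGH, axis=0)   # Eq. 15
    V = filter_downsample(high_x, H_LOW, axis=0)   # Eq. 16
    D = filter_downsample(high_x, G_HIGH, axis=0)  # Eq. 17
    return A, H, V, D
```

For an image with an even number of rows and columns, each sub-band has half the height and half the width of the input, which is why A looks like a smaller, blurred copy of the image.<br />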
V. CELLULAR NEURAL NETWORK<br />
The concept of the CNN (cellular neural network)<br />
was introduced in 1988 by Leon O. Chua and Lin Yang. The<br />
basic building block of the CNN model is the cell. The CNN<br />
model consists of a regularly spaced array of cells. It can be<br />
regarded as a combination of cellular automata [24] and<br />
neural networks [25]. Adjacent cells communicate directly<br />
with their nearest neighbours, and other cells communicate<br />
indirectly, because of the propagation effects in the model.<br />
The original idea was to use an array of simple, nonlinearly<br />
coupled dynamic circuits to process large amounts<br />
of data in parallel and in real time [25].<br />
Cells are multiple-input, single-output nonlinear processors.<br />
Cells in the CNN processor have a fixed location and a fixed<br />
topology. Inputs, initial state and output variables<br />
define the behavior of the CNN processor. Professor Leon O. Chua<br />
proposed the diagram of an isolated cell shown in Fig. 6.<br />
The state variable is not observable from outside the cell itself.<br />
Fig. 6. Representation of an isolated cell Cij (input Uij, threshold Zij, state Xij, output Yij)<br />
The cell is a lumped circuit containing both linear and<br />
nonlinear elements, such as resistors, capacitors and nonlinear<br />
controlled sources, as shown in Fig. 7. The CNN processor<br />
is modeled by Eqs. (18)-(19), with xij, yij and uij as the state,<br />
output and input variables respectively. The schematic model<br />
of a CNN cell is shown in Fig. 8.<br />
$$\dot{x}_{ij} = -x_{ij} + \sum_{c(j) \in N_r(i)} \left( A_{ij} y_{ij} + B_{ij} u_{ij} \right) + I \qquad (18)$$<br />
$$y_{ij} = \frac{1}{2} \left( |x_{ij} + 1| - |x_{ij} - 1| \right) \qquad (19)$$<br />
The coefficients Aij and Bij, the synaptic weights, completely<br />
define the behavior of the network for a given input<br />
and initial conditions, as shown in Eq. 18. These values are<br />
called the templates. For ease of representation, they are<br />
written as matrices that are repeated in the neighborhood of every cell.<br />
There are three types of templates:<br />
the first is the feedforward or control template (B), the second<br />
is the feedback template (A) and the third is the bias (I). All these<br />
space-invariant templates are called cloning templates. CNNs are<br />
particularly interesting because of their programmable nature,<br />
i.e. their changeable templates.<br />
The template set for an r = 1 CNN contains 19 coefficients<br />
(9 for the A-template, 9 for the B-template and 1 for the bias).<br />
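To make the role of the templates concrete, the sketch below integrates Eqs. (18)-(19) for a whole image with 3 × 3 templates. It is an illustration under several assumptions that are not stated in the paper: forward-Euler integration with a hypothetical step size `dt`, zero values outside the image boundary, a zero initial state by default, and the usual reading of the sum in Eq. 18 as a weighted sum over the 3 × 3 neighbourhood of each cell. The function names are made up for this example.<br />

```python
import numpy as np

def output_fn(x):
    """Piecewise-linear output function of Eq. 19."""
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def apply_template(img, template):
    """Weighted sum over each cell's 3x3 neighbourhood (zero boundary assumed)."""
    rows, cols = img.shape
    padded = np.pad(img, 1, mode="constant")
    out = np.zeros((rows, cols), dtype=float)
    for di in range(3):
        for dj in range(3):
            out += template[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

def simulate_cnn(u, A, B, I, x0=None, dt=0.05, steps=400):
    """Forward-Euler integration of Eq. 18; returns the output image y."""
    u = np.asarray(u, dtype=float)
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    x = np.zeros_like(u) if x0 is None else np.array(x0, dtype=float)
    feedforward = apply_template(u, B)  # constant, since the input is fixed
    for _ in range(steps):
        y = output_fn(x)
        x = x + dt * (-x + apply_template(y, A) + feedforward + I)
    return output_fn(x)
```

Changing only the arrays passed as `A`, `B` and `I` changes what the network computes, which is the programmability the text refers to. Whether a given template set settles to a stable output within the chosen number of steps has to be checked case by case.<br />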
$$A = \begin{bmatrix} A_{-1,-1} & A_{-1,0} & A_{-1,1} \\ A_{0,-1} & A_{0,0} & A_{0,1} \\ A_{1,-1} & A_{1,0} & A_{1,1} \end{bmatrix}; \quad B = \begin{bmatrix} B_{-1,-1} & B_{-1,0} & B_{-1,1} \\ B_{0,-1} & B_{0,0} & B_{0,1} \\ B_{1,-1} & B_{1,0} & B_{1,1} \end{bmatrix}$$<br />
Fig. 7. (a) Electronic circuit model of the isolated cell (b) The classical output nonlinear function for each cell<br />
Fig. 8. Schematic representation of the CNN (outputs from neighbouring cells pass through template A and inputs from neighbouring cells through template B, both convolution blocks; the results and the bias I are summed, integrated from the initial state Xij(0) and passed through the output function)<br />
The genetic algorithm is used to estimate the A, B <strong>and</strong> I<br />
templates, depending upon the given application. The template<br />
set is unique for each application. In this work, we use the<br />
genetic algorithm to obtain the template set for the ORL<br />
database.<br />
A. Genetic algorithm<br />
In order to extract the facial features from a frontal face<br />
image, we assume that the template values are<br />
symmetrical, as the front view of the face is symmetrical.<br />
Because of this symmetry, only 11 template elements are calculated<br />
instead of 19 (5 for the A-template,<br />
5 for the B-template and 1 for the bias). Each template element is encoded<br />
in the 32-bit floating-point format. The genetic algorithm (GA)<br />
uses a population of binary strings called chromosomes. In<br />
the learning process, 72 random chromosomes, each with a<br />
length of 11 ∗ 32 bits, are constructed initially. The genetic algorithm<br />
is explained in detail in the following steps:<br />
• Construct the random population matrix of size<br />
72 × (11 ∗ 32), i.e. each row represents a chromosome (for<br />
11 template elements) of length 11 ∗ 32 = 352 bits.<br />
• The IEEE 754 floating-point standard is used to calculate<br />
the template (A, B and I) elements from each chromosome<br />
[24]. In each chromosome, the first 11 bits represent<br />
the first bit of each of the 11 template elements, the second 11<br />
bits represent the second bit of each of the 11 template elements, and<br />
so on, in the element order given in Eq. 20.<br />
$$S = [A_{11}, A_{12}, A_{13}, A_{21}, A_{22}, B_{11}, B_{12}, B_{13}, B_{21}, B_{22}, I] \qquad (20)$$<br />
• After the template calculation, the templates are given as<br />
input to the CNN. The CNN first works with the template<br />
of the first chromosome. Once the CNN output<br />
is stable, the cost function is calculated from the CNN<br />
output image P and the target image T. This process<br />
is repeated for the template set of each chromosome in the<br />
population matrix [24]. The cost function is selected as<br />
shown in Eq. 21.<br />
$$\mathrm{cost}(A, B, I) = \sum_{i}^{m} \sum_{j}^{n} P_{i,j} \oplus T_{i,j} \qquad (21)$$<br />
Here m and n are the numbers of pixels along the two image<br />
dimensions, and ⊕ represents the XOR operation.<br />
• After calculating the cost function, the fitness function<br />
for each chromosome is evaluated as given in Eq. 22.<br />
$$\mathrm{fitness}(A, B, I) = m \ast n - \mathrm{cost}(A, B, I) \qquad (22)$$<br />
• The whole process is repeated for each chromosome until<br />
the fitness value exceeds the stop criterion, which<br />
is set to stcriteria = 0.99 ∗ m ∗ n. The chromosome with the maximum<br />
fitness value in the population matrix<br />
is selected.<br />
• The next step is reproduction. In this process, the chromosomes<br />
are sorted in descending order of their fitness values.<br />
All the fitness values are normalized by the<br />
sum of the fitness values. The chromosomes with poor fitness<br />
values are deleted. The most successful<br />
chromosomes produce the next generation.<br />
• Take the two chromosomes with the highest fitness values,<br />
S1 and S2, and apply the crossover and mutation<br />
operations to generate the children [24]. The crossover operation<br />
exchanges substrings between the two chromosomes<br />
S1 and S2. In this work, one-point crossover is<br />
used, and its cross site is selected with uniform probability<br />
along the chromosome length. If the mutation probability<br />
is set to 0.01, then 253 bits (about 1% of the 72 ∗ 352 = 25344 bits<br />
in the population) are selected randomly and inverted.<br />
• Take these new chromosomes and apply the same steps,<br />
from the template calculation to the stop criterion.<br />
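The cost, fitness, crossover and mutation steps listed above can be sketched as follows. This is a minimal illustration, not the authors' code: P and T are assumed to be binary images of equal size, the function names are hypothetical, and the sorting, normalization and deletion of weak chromosomes are left out. The population size (72 × 352) and the mutation probability (0.01) are the values given in the text.<br />

```python
import numpy as np

def cost(P, T):
    """Eq. 21: number of pixels at which the binary images P and T differ."""
    return int(np.count_nonzero(np.logical_xor(P, T)))

def fitness(P, T):
    """Eq. 22: m * n minus the cost."""
    m, n = np.shape(T)
    return m * n - cost(P, T)

def stop_reached(P, T, fraction=0.99):
    """Stop criterion from the text: fitness exceeds 0.99 * m * n."""
    m, n = np.shape(T)
    return fitness(P, T) > fraction * m * n

def one_point_crossover(s1, s2, rng):
    """Swap the tails of two chromosomes at a uniformly chosen cross site."""
    site = int(rng.integers(1, len(s1)))
    child1 = np.concatenate([s1[:site], s2[site:]])
    child2 = np.concatenate([s2[:site], s1[site:]])
    return child1, child2

def mutate(population, probability, rng):
    """Invert a fixed share of randomly chosen bits in the population."""
    flat = population.copy().ravel()
    n_flip = int(probability * flat.size)
    idx = rng.choice(flat.size, size=n_flip, replace=False)
    flat[idx] ^= 1
    return flat.reshape(population.shape)

rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability
population = rng.integers(0, 2, size=(72, 352), dtype=np.uint8)
mutated = mutate(population, 0.01, rng)
```

With these sizes, `mutate` inverts int(0.01 × 25344) = 253 bits, which agrees with the figure quoted in the text.<br />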
This learning process is repeated to find the best chromosome.<br />
After the stop criterion is satisfied, the template elements are<br />
calculated from the best chromosome. The template elements<br />
that extract features from the frontal face images of the ORL<br />
database are obtained as:<br />
$$A = \begin{bmatrix} 2.7612 & 7.3152 & 1.7566 \\ 1.5916 & 8.5273 & 1.5916 \\ 1.7566 & 7.3152 & 2.7612 \end{bmatrix}; \quad B = \begin{bmatrix} -6.1912 & 2.8350 & -7.9270 \\ 1.3044 & -2.7349 & 1.3044 \\ -7.9270 & 2.8350 & -6.1912 \end{bmatrix}; \quad I = 0.4414$$<br />
The corresponding best chromosome is<br />
S = [000001010101100111101000110000101<br />
00110000101001100001010011000010100110000101<br />
00110000101011101011010111010100111100011111<br />
10000011000010110010100000011111001010100110<br />
00010011011101100011010010101010110011011101<br />
11110110101001111010111100110010111001100101<br />
10011001001100110111100010110100000000011001<br />
01001001110010101010110111011110101001100011<br />
00100100000]<br />
Fig. 9. (a) The input image of the genetic cellular neural network (b) The<br />
genetic cellular neural network output image<br />
The two-dimensional image shown in Fig. 9(a) is given<br />
as the input image to the CNN to extract the important frontal<br />
facial features present in that image, and the output image with the<br />
extracted feature set is shown in Fig. 9(b).<br />
VI. EXPERIMENTAL RESULT<br />
In this section, we evaluate the performance of the wavelet<br />
and Radon transform based feature extraction approaches using the<br />
FERET database. The performance of the CNN based approach<br />
is compared to the other stated face recognition approaches<br />
over the ORL database. On the FERET database, the performance<br />
is evaluated for frontal images (fa or fb), pose-variant<br />
images half-shifted left or right by an angle of 67.5 degrees (hl or hr),<br />
and pose-variant profile images shifted left or right by an angle of<br />
90 degrees (pl or pr) [26]. For the ORL database, the performance<br />
is evaluated for facial expressions and varying light conditions.<br />
1) Performance evaluation of Radon and wavelet transforms:<br />
The Radon transform gives the direction of the<br />
local features (lines, circles). The Radon transform preserves the<br />
variation in pixel intensities. While computing the Radon<br />
projections, the pixel intensities along a line are added. This<br />
process extracts the spatial frequency components in the<br />
direction of the Radon projection. When features are extracted<br />
using the Radon transform, the variations in this spatial frequency<br />
are also boosted. The wavelet transform gives the spatial and<br />
frequency components present in an image.<br />
A. Different wavelet functions versus recognition rate<br />
The Daubechies family contains different wavelet functions.<br />
The recognition rates of two of them, db1<br />
and db4, are compared in Fig. 10. db1 is the Haar<br />
wavelet and it encodes the constant component. db4 encodes<br />
both constant and linear components. The performance of db4 is<br />
higher than that of db1.<br />
B. Different wavelet coefficients versus recognition rate<br />
The db4 Daubechies decomposition yields four sets of wavelet<br />
coefficients: A, H, V and D. Their values depend on the wavelet<br />
function used. The wavelet coefficient A gives approximate information<br />
on the features. H, V and D give information about the<br />
horizontal, vertical and diagonal features present in the given<br />
image respectively.<br />
The wavelet coefficient A gives the highest recognition rate<br />
of the four wavelet coefficients, and<br />
D gives the lowest (see Fig. 10). Using all of<br />
the A + H + V + D wavelet coefficients leads to a recognition<br />
rate that is nearly equal to that of A alone.<br />
Fig. 10. Performance comparison of different wavelet function db1 and db4 (vertical axis: recognition rate (%); horizontal axis: wavelet coefficients D, V, H, A; series: db4, db1)<br />
The next experiments are conducted on the FERET database<br />
with one frontal image (fb) of each subject as the test image<br />
and five images in different poses of each subject in the training<br />
database. The performance evaluation is shown in Fig. 11(a).<br />
The experiments are repeated with pose-variant images (hr<br />
and pr) as the test image for each subject and five images,<br />
excluding the test image, of each subject in the training database. The<br />
results are shown in Fig. 11(b) and Fig. 11(c) respectively. For<br />
best matching, the Euclidean distance measure is used.<br />
Fig. 11. (a) Performance comparison of different face recognition approaches<br />
with front images (FERET database) (b) Performance comparison of different<br />
face recognition approaches with half right images (FERET database) (c)<br />
Performance comparison of different face recognition approaches with profile<br />
right images (FERET database) (d) Performance comparison of different face<br />
recognition approaches with ORL database (vertical axes: recognition rate (%);<br />
horizontal axes in (a)-(c): number of subjects in the database (40, 100, 150, 200, 400);<br />
compared approaches: PCA, LDA, Radon, Wavelet, Radon+wavelet, Radon+LDA,<br />
Wavelet+LDA, and additionally CNN in (d))<br />
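The Euclidean-distance matching used in these experiments can be sketched as a nearest-neighbour search over feature vectors. This is an illustrative example only: the feature vectors are assumed to be rows of NumPy arrays (for instance Radon projections or wavelet coefficients flattened to one dimension), and the function names are hypothetical.<br />

```python
import numpy as np

def nearest_neighbour_labels(train_features, train_labels, test_features):
    """Assign to each test vector the label of its closest training vector."""
    train_features = np.asarray(train_features, dtype=float)
    test_features = np.asarray(test_features, dtype=float)
    # Pairwise Euclidean distances, shape (number of tests, number of trainings).
    diff = test_features[:, None, :] - train_features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return np.asarray(train_labels)[dist.argmin(axis=1)]

def recognition_rate(predicted, actual):
    """Percentage of test images assigned to the correct subject."""
    return 100.0 * np.mean(np.asarray(predicted) == np.asarray(actual))
```

The recognition rate plotted on the vertical axes of the figures corresponds to the value returned by `recognition_rate` for one test configuration.<br />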
The recognition rate depends upon the number of subjects<br />
in the data set. It is more difficult to recognize a subject in a<br />
large data set than in a small one. The experiments<br />
are conducted with different sizes of the FERET database,<br />
using linear subspace techniques (principal component<br />
analysis (PCA) and linear discriminant analysis (LDA)), the Radon<br />
transform and the wavelet transform. When linear subspace<br />
techniques are applied to large databases, the computational load and memory<br />
requirements increase dramatically with the size of the<br />
database. This affects the performance of PCA and LDA on<br />
large data sets, as shown in Fig. 11.<br />
The Radon transform and the wavelet transform are mostly independent<br />
of the size of the database. The combination of the Radon and<br />
wavelet transforms gives multi-resolution features, which<br />
are more useful in face recognition. This has been validated<br />
by the experimental results shown in Fig. 11. Even though<br />
the combination of the Radon and wavelet transforms gives better<br />
performance, there is still a need for improvement in pose-variant<br />
face recognition, as shown in Fig. 11(b) and Fig. 11(c).<br />
2) Performance evaluation of cellular neural networks:<br />
The CNN based face recognition approach and the other stated<br />
approaches are applied to the ORL database. The ORL database<br />
contains images of 40 subjects. All images are taken in frontal<br />
position against a dark homogeneous background. The performance<br />
of the various algorithms on the ORL database<br />
is shown in Fig. 11(d). The CNN, with its parallel computing<br />
paradigm, promises to outperform the other approaches on<br />
the ORL database.<br />
VII. CONCLUSION<br />
The face recognition performance has been systematically<br />
evaluated using different sizes of the database. To improve<br />
the performance of the face recognition technique, wavelets,<br />
the Radon transform and the combination of both<br />
have been proposed to extract nonlinear features. The<br />
results of the evaluation show that the recognition rate<br />
increases considerably with the combination of the Radon and<br />
wavelet transforms compared to PCA and LDA. In addition to<br />
these two approaches, this work also shows that the CNN based feature<br />
extraction approach for face recognition outperforms both the<br />
Radon and wavelet transforms on the ORL database. However, this<br />
should be validated on the FERET database, where the images are<br />
in different poses. The CNN algorithm should be able to detect<br />
the pose and then apply the appropriate template to extract<br />
the relevant feature set.<br />
Future work should focus on running the recognition algorithm<br />
on videos, as many applications demand real-time<br />
recognition. Further, such a system may be integrated<br />
into a driver assistance system, either to recognize the driver of a<br />
car or to extract facial expressions that may provide information<br />
about the driver's mood or fatigue.<br />
REFERENCES<br />
[1] I. Guyon and A. Elisseeff, “An Introduction to Feature Extraction,” Zurich Research Laboratory, (2004).<br />
[2] M. Aleemuddin, “A Pose Invariant Face Recognition System using Subspace Techniques,” Deanship of Graduate Studies, (2004).<br />
[3] G. Shakhnarovich and B. Moghaddam, Face Recognition in Subspaces. Springer-Verlag, May (2004).<br />
[4] M. Kirby and L. Sirovich, “Application of the Karhunen-Loeve procedure for the characterization of human faces,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 1, pp. 103–108, (1988).<br />
[5] A. Sato, H. Imaoka, T. Suzuki, and T. Hosoi, “Advances in Face Detection and Recognition Technologies,” NEC Journal of Advanced Technology, vol. 2, no. 1, (2005).<br />
[6] O. Toygar and A. Acan, “Face Recognition using PCA, LDA and ICA approaches on colored images,” Electrical and Electronics Engineering, vol. 3, no. 1, pp. 735–743, (2003).<br />
[7] R. Brunelli, T. Poggio, and I. P. Trento, “Face recognition through geometrical features,” in European Conference on Computer Vision (ECCV), pp. 792–800, (1992).<br />
[8] R. Brunelli and T. Poggio, “Face recognition: Features vs. templates,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 10, pp. 1042–1052, (1993).<br />
[9] B. J. Lei, E. A. Hendriks, and M. Reinders, “On Feature Extraction from Images,” Technical Report on MCCWS project, (1999).<br />
[10] Y. Chen, Q. Wu, and X. He, “Human Action Recognition by Radon Transform,” IEEE International Conference on Data Mining Workshops, May (2008).<br />
[11] N. Shams, I. Hosseini, M. Sadri, and E. Azarnasab, “Low cost FPGA-based highly accurate face recognition system using combined wavelets with subspace methods,” pp. 2077–2080, (2006).<br />
[12] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, pp. 711–720, July (1997).<br />
[13] W. S. Yambor, “Analysis of PCA based and Fisher discriminant based image recognition algorithms,” Degree of Master of Science, (2000).<br />
[14] B. A. Draper, K. Baek, M. S. Bartlett, and J. R. Beveridge, “Recognizing faces with PCA and ICA,” Comput. Vis. Image Underst., vol. 91, no. 1-2, pp. 115–137, (2003).<br />
[15] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” vol. 19, no. 7, (1997).<br />
[16] J. Kim, M.-J. Choi, M.-J. Yi, and M. Turk, “Effective representation using ICA for face recognition robust to local distortion and partial occlusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 12, pp. 1977–1981, (2005).<br />
[17] A. Hyvärinen, “The Fixed-Point Algorithm and Maximum Likelihood estimation for Independent Component Analysis,” pp. 1–5, (1999).<br />
[18] W. S. Yambor, B. A. Draper, and J. R. Beveridge, “Analyzing PCA-based Face Recognition Algorithms: Eigenvector Selection and Distance Measures,” Department of Computer Science, (2000).<br />
[19] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” European Conference on Computer Vision, (1996).<br />
[20] W. Zhao, R. Chellappa, and P. J. Phillips, “Subspace Linear Discriminant Analysis for Face Recognition,” Department of Electrical and Electronic Engineering, (1999).<br />
[21] C. Garcia, G. Zikos, and G. Tziritas, “A wavelet-based framework for face recognition,” in Workshop on Advances in Facial Image Analysis and Recognition Technology, 5th European Conference on Computer Vision, pp. 84–92, (1998).<br />
[22] M. I. M. D., F. H. Elfouly, M. I. Mahmoud, and S. Deyab, “Comparison between Haar and Daubechies wavelet transformations on FPGA technology,” International Journal of Computer, Information, and Systems Science, and Engineering, vol. 2, no. 1, pp. 1047–1061, (2006).<br />
[23] C. C. Liu, D. Q. Dai, and H. Yan, “Local Discriminant Wavelet Packet Coordinates for Face Recognition,” Journal of Machine Learning Research, pp. 1165–1195, May (2007).<br />
[24] T. Kozek, T. Roska, and L. O. Chua, “Genetic Algorithm for CNN Template Learning,” IEEE Transactions on Circuits and Systems, vol. 40, no. 6, (1993).<br />
[25] L. Chua and T. Roska, Cellular neural networks and visual computing: foundations and applications. Cambridge University Press, (2005).<br />
[26] P. Phillips, H. Wechsler, J. Huang, and P. Rauss, “The FERET database and evaluation procedure for face-recognition algorithms,” vol. 16, pp. 295–306, April (1998).<br />
ARTIFICIAL HUMAN LIMBS – A DESIGN APPROACH FOR MILITARY<br />
APPLICATION<br />
R. Karthikeyan, Department of EIE, Veltech, Member IEEE,<br />
rkarthiekeyan@gmail.com<br />
Anitha Karthikeyan, Department of ECE, Meenakchi College of Engineering,<br />
mrs.anithakarthikeyan@gmail.com<br />
S. Sivaperumal, Department of ECE, Vel HIGHTECH SRS Engineering College,<br />
sivaperumals@gmail.com<br />
Abstract: The most essential goals of automation are saving<br />
human lives, safeguarding people's belongings and<br />
property, and organizing these tasks in a systematic<br />
way. This paper deals with the design of a real-time<br />
human limb that acts according to the design<br />
configurations prescribed by the calibrated sensor<br />
data. This research proposes to overcome current<br />
limitations using three-axis optimal inertial sensors<br />
combined with an embedded controller, on which the<br />
filter algorithm and the analog-to-digital converter are<br />
implemented to correct drift and angular motion<br />
through all orientations. The mechanical design will have<br />
miniature or hybrid stepper motors with associated<br />
mechanical elements to move the limbs on all axes, such as<br />
up/down, roll, elevation and azimuth.<br />
I. INTRODUCTION<br />
Networked synthetic environments<br />
(SE) stand to revolutionize the fields of education, training,<br />
business, retailing and entertainment. They will<br />
fundamentally alter our societies and the way in which<br />
mankind views the world. In the educational field, synthetic<br />
environments will offer the ultimate in hands-on learning and<br />
visualization of difficult concepts. They will allow training<br />
to take place in a setting much like the one in which the skills<br />
being practiced will be used, without exposure to possible<br />
hazards and at lower cost. In the workplace, employees will be<br />
able to work “side by side” even though they may be<br />
physically separated by hundreds or even thousands of<br />
miles [Durlach-1995]. Using synthetic environments,<br />
corporations will obtain a safe, economical and efficient<br />
method of testing new concepts and systems. Retailers will<br />
create virtual department stores where consumers will be<br />
able to try out products to an unprecedented degree before<br />
actually buying them.<br />
Using synthetic environments, the entertainment<br />
industry will be able to create entire worlds in which<br />
customers can experience thrills and live out<br />
entire fantasy lives [Zyda-1997]. The power of the synthetic<br />
environment lies in its ability to immerse users in a different<br />
world. The more complete the immersion, the more effective<br />
the synthetic environment. For complete immersion, the user<br />
should sense and interact with the synthetic environment in<br />
the same manner in which interaction with the natural world<br />
takes place. Interaction in the natural world results from<br />
body motion. Information regarding the surrounding<br />
environment is obtained through the five senses. Changes in<br />
body posture and position directly affect what is seen, heard,<br />
felt and smelled [Mavor-1995].<br />
The parameters sensed in the environment are altered and<br />
manipulated by the actions of the body. Thus, in order for a<br />
user to interact with a synthetic environment in a natural<br />
way and have the synthetic environment present appropriate<br />
information to the senses, it is imperative that data regarding<br />
body motion and posture be obtained [Skopowski-1996].<br />
Body posture and location data are also needed in multi-user<br />
environments to drive the animation of avatars, which<br />
represent the actions of the users of the environment to each<br />
other. At this time, there is no practical and intuitive<br />
interface that allows an individual human to be inserted into<br />
an SE in a fully immersive manner [Badler-1993].<br />
Numerous motion tracking technologies are currently in<br />
use, but each suffers from its own set of limitations.<br />
Depending on the technology, these limitations may include<br />
marginal accuracy, user encumbrance, restricted range,<br />
susceptibility to interference and noise, poor registration,<br />
occlusion difficulties and high latency. Due to these<br />
problems, real-time animations of avatars must be largely<br />
script-based, using motion libraries. For the most part, only a<br />
single user may be tracked in a small working volume. Thus,<br />
none of the current technologies fulfills the need for wide-area<br />
tracking of multiple users. The ideal motion tracking<br />
technology must meet several requirements. It should have<br />
low latency, be tolerant of noise and other environmental<br />
interference, track multiple users and maintain both<br />
adequate accuracy and registration throughout a large<br />
working volume [MoletAubel-1999].<br />
The primary reason current tracking systems fail to<br />
meet the requirements described above is the dependence of<br />
these systems on a generated “source” to determine<br />
orientation and location information. This source may be<br />
sent by transmitters to body-based receivers or it may be<br />
sent from body-based transmitters to receivers positioned at<br />
known locations throughout the working volume. Usually<br />
the effective range of this source is extremely limited or<br />
there may be compromises between resolution and range.<br />
Interference with or distortion of this source will at best<br />
result in erroneous orientation and position measurements.<br />
II. MOTIVATION<br />
Current motion tracking technologies fail to<br />
provide accurate wide-area tracking of multiple users<br />
without interference and occlusion problems. This research<br />
proposes to overcome current limitations using three-axis<br />
optimal inertial sensors combined with an Embedded<br />
Controller, on which the filter algorithm as well as the<br />
analog-to-digital converter is implemented for correcting drift and<br />
tracking angular motion through all orientations. The mechanical<br />
design will have miniature or hybrid stepper motors with<br />
associated mechanical elements to move the limbs about all<br />
axes: up/down, roll, elevation and azimuth. An<br />
appropriate electronic circuit is used for isolation between<br />
the stepper motors and the Embedded Controller.<br />
The electronic system is rated up to 5 A, which is suitable for a<br />
70 kg-cm stepper motor, but the stepper motor used in this research<br />
is only 7 kg-cm. Joint angle determination for robots<br />
with flexible links is difficult. The use of Bluetooth technology<br />
will enable sensors to wirelessly transmit data from body<br />
extremities to the wearable PC. Inertial orientation tracking<br />
combined with RF positioning is also tried, to provide an<br />
accurate method for determining orientation and location. This<br />
paper describes a system designed to determine the posture of an<br />
articulated body in real time. Finally, this work describes<br />
the design, implementation, sensor calibration algorithm<br />
and testing of an inertial tracking system for a human limb<br />
segment.<br />
IV. OBJECTIVES<br />
Based on the above discussion, the objectives of the present<br />
research work are:<br />
• Orientation tracking of human limb segments using three-axis<br />
inertial sensors.<br />
• Calibration of individual sensors without the use of any<br />
specialized equipment.<br />
• Sufficient dynamic response and update rate (100 Hz or<br />
better) to capture fast human limb motion.<br />
• Ability to change the rotation of the three stepper motors<br />
according to the assigned threshold value.<br />
• With three sensors attached to the human limb, if the<br />
threshold value is 360 or below, the three<br />
stepper motors rotate in the forward direction. The<br />
axis direction and the data from the three sensors are also displayed<br />
graphically on the computer as per the limb movement<br />
of the human body.<br />
• With three sensors attached to the human limb, if the<br />
threshold value is 400 or above, the three<br />
stepper motors rotate in the reverse direction. The<br />
axis direction and the data from the three sensors are also displayed<br />
graphically on the computer as per the limb movement<br />
of the human body.<br />
• If the sensors are not attached to the human limb, the<br />
threshold value, the rotation of the stepper motors, the axis<br />
direction and the three sensor data should all<br />
remain in the initial condition.<br />
• Automatic accounting for the peculiarities related to the<br />
mounting of a sensor on an associated limb segment.<br />
• Creation of data files for recording limb data<br />
as processed by the embedded software filter.<br />
• Use of Bluetooth technology to enable sensors to<br />
wirelessly transmit data from body extremities to the<br />
wearable PC.<br />
• Use of RF positioning to provide an accurate<br />
method for determining orientation and location.<br />
• Description of the design, implementation, sensor calibration<br />
and testing of an inertial tracking system for a<br />
human limb segment.<br />
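The threshold rule in the objectives can be illustrated with a short sketch. It is not the authors' embedded C code: the function names are hypothetical, and the handling of a zero reading and of readings between 361 and 399 is an assumption, chosen so that the sketch agrees with the rows reported in Table 2.

```python
# Thresholds taken from the objectives; everything else is illustrative.
FORWARD_MAX = 360   # "360 and below" -> forward rotation
REVERSE_MIN = 400   # "400 and above" -> reverse rotation


def motor_direction(reading: int) -> str:
    """Map one sensor reading to a stepper-motor command."""
    if reading <= 0:
        return "Stopped"   # assumed: no signal means the sensor is not attached
    if reading <= FORWARD_MAX:
        return "Forward"
    if reading >= REVERSE_MIN:
        return "Reverse"
    return "Stopped"       # assumed dead band for readings 361..399


def drive_motors(readings):
    """Return the commands for motors M1..M3 given readings from S1..S3."""
    return [motor_direction(r) for r in readings]


if __name__ == "__main__":
    print(drive_motors([110, 0, 0]))
    print(drive_motors([0, 580, 0]))
```

Keeping the two limits as named constants makes it easy to try other threshold values without touching the decision logic.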
III. DESIGN EQUATIONS OF LIMB<br />
Force Calculations of Joints<br />
The point of doing force calculations is motor selection.<br />
We must make sure that the motor we choose can support not only<br />
the weight of the robot arm, but also what the robot<br />
arm will carry (the blue ball in Diagram 1).<br />
The first step is to label the free body diagram (FBD, Diagram 1), with the robot<br />
arm stretched out to its maximum length.<br />
Diagram 1<br />
Next we do a moment arm calculation, multiplying each<br />
downward force by the corresponding linkage length. This calculation<br />
must be done for each lifting actuator. This particular design<br />
has just two degrees of freedom that require lifting,<br />
and the center of mass of each linkage is assumed to be at<br />
Length/2.<br />
Torque About Joint 1:<br />
M1 = L1/2 * W1 + L1 * W4 + (L1 + L2/2) * W2 + (L1 + L3) * W3
Torque About Joint 2:<br />
M2 = L2/2 * W2 + L3 * W3<br />
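As a minimal sketch, the two moment equations can be evaluated directly. Diagram 1 is not reproduced here, so the symbols are used exactly as they appear in the equations; the function name and the sample values are hypothetical.

```python
def joint_torques(L1, L2, L3, W1, W2, W3, W4):
    """Static holding torques about joints 1 and 2, as written in the text.

    Lengths and weights must use consistent units (for example cm and kg,
    which gives torque in kg-cm, the unit used for the stepper motors).
    """
    M1 = L1 / 2 * W1 + L1 * W4 + (L1 + L2 / 2) * W2 + (L1 + L3) * W3
    M2 = L2 / 2 * W2 + L3 * W3
    return M1, M2


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the formulas.
    M1, M2 = joint_torques(L1=2, L2=2, L3=2, W1=1, W2=1, W3=1, W4=1)
    print(M1, M2)
```

The larger of the results gives a lower bound on the holding torque the corresponding motor must supply; a safety margin on top of it is a design choice not specified in the text.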
Forward Kinematics<br />
Forward kinematics is the method for determining the<br />
orientation and position of the end effector, given the joint<br />
angles and link lengths of the robot arm. For our robot arm,<br />
we calculate the end effector location from given joint<br />
angles and link lengths.<br />
Diagram 2<br />
Assume that the base is located at x=0 and y=0. The first<br />
step is to locate x and y of each joint as shown in<br />
Diagram 2.<br />
Joint 0 (with x and y at base equaling 0):<br />
x0 = 0<br />
y0 = L0<br />
Joint 1 (with x and y at J1 equaling 0):<br />
cos(psi) = x1/L1 => x1 = L1*cos(psi)<br />
sin(psi) = y1/L1 => y1 = L1*sin(psi)<br />
Joint 2 (with x and y at J2 equaling 0):<br />
sin(theta) = x2/L2 => x2 = L2*sin(theta)<br />
cos(theta) = y2/L2 => y2 = L2*cos(theta)<br />
End Effector Location (make sure your signs are correct):<br />
x = x0 + x1 + x2, or 0 + L1*cos(psi) + L2*sin(theta)<br />
y = y0 + y1 + y2, or L0 + L1*sin(psi) + L2*cos(theta)<br />
z equals alpha, in cylindrical coordinates<br />
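The end effector equations translate directly into code. The sketch below assumes angles in radians and follows the sign conventions of the equations as printed (psi for the first link, theta for the second); the function name is hypothetical.

```python
import math


def forward_kinematics(L0, L1, L2, psi, theta):
    """Planar end effector position (x, y) from joint angles in radians."""
    x = 0 + L1 * math.cos(psi) + L2 * math.sin(theta)
    y = L0 + L1 * math.sin(psi) + L2 * math.cos(theta)
    return x, y


if __name__ == "__main__":
    # First link horizontal, second link rotated a quarter turn.
    print(forward_kinematics(L0=1.0, L1=2.0, L2=3.0, psi=0.0, theta=math.pi / 2))
```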
Inverse Kinematics<br />
Inverse kinematics is the opposite of forward kinematics.<br />
This is when we have a desired end effector position, but<br />
need to know the joint angles required to achieve it. For example, if the<br />
robot sees a kitten and wants to grab it, what angle should<br />
each joint go to? Although far more useful than forward<br />
kinematics, this calculation is also much more complicated.<br />
psi = arccos((x^2 + y^2 - L1^2 - L2^2) / (2 * L1 * L2))<br />
theta = arcsin((y * (L1 + L2 * c2) - x * L2 * s2) / (x^2 + y^2))<br />
where c2 = (x^2 + y^2 - L1^2 - L2^2) / (2 * L1 * L2)<br />
and s2 = sqrt(1 - c2^2)<br />
There is very likely to be more than one solution, and sometimes an<br />
infinite number of solutions (Diagram 3). How<br />
would the arm choose which is optimal, based on torques,<br />
previous arm position, gripping angle, etc.? There is also the<br />
possibility of zero solutions. Maybe the location is outside<br />
the workspace, or maybe the point within the workspace<br />
must be gripped at an impossible angle (Diagram 3).<br />
Diagram 3<br />
Singularities, configurations that demand infinite acceleration, can blow up<br />
the equations and/or leave motors lagging behind (motors cannot<br />
achieve infinite acceleration).<br />
Lastly, exponential equations take a long time to calculate<br />
on a microcontroller. There is no point in having advanced equations<br />
on a processor that cannot keep up.<br />
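The inverse kinematics formulas and the zero-solution case can be sketched as follows. This assumes the standard planar two-link model, in which psi is the angle of the second link relative to the first and theta is the angle of the first link; that convention is inferred from the formulas, not stated in the text. The sketch also swaps arcsin and arccos for `atan2`, which is a substitution made here to keep the correct quadrant. Function names are hypothetical.

```python
import math


def inverse_kinematics(x, y, L1, L2):
    """Return (theta, psi) in radians, or None when the target is unreachable."""
    c2 = (x**2 + y**2 - L1**2 - L2**2) / (2 * L1 * L2)
    if abs(c2) > 1 + 1e-9:
        return None                    # zero solutions: outside the workspace
    c2 = max(-1.0, min(1.0, c2))       # guard against rounding error
    s2 = math.sqrt(1 - c2**2)          # positive root: one of two elbow poses
    psi = math.atan2(s2, c2)           # same value as arccos(c2)
    theta = math.atan2(y * (L1 + L2 * c2) - x * L2 * s2,
                       x * (L1 + L2 * c2) + y * L2 * s2)
    return theta, psi


def two_link_position(theta, psi, L1, L2):
    """Forward model matching the assumed convention, used as a round-trip check."""
    x = L1 * math.cos(theta) + L2 * math.cos(theta + psi)
    y = L1 * math.sin(theta) + L2 * math.sin(theta + psi)
    return x, y


if __name__ == "__main__":
    angles = inverse_kinematics(1.0, 1.0, L1=1.0, L2=1.0)
    print(angles, two_link_position(*angles, L1=1.0, L2=1.0))
    print(inverse_kinematics(3.0, 0.0, L1=1.0, L2=1.0))
```

Taking the negative root for s2 gives the mirrored elbow pose, which is one concrete instance of the multiple-solution problem described above.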
Motion Planning<br />
Motion planning on a robot arm is fairly complex, so only<br />
the basics are given here.<br />
Diagram 4<br />
Suppose the robot arm has objects within its workspace<br />
(Diagram 4); how does the arm move through the workspace<br />
to reach a certain point? To do this, assume the robot arm is<br />
just a simple mobile robot navigating in 3D space. The end<br />
effector will traverse the space just like a mobile robot,<br />
except now it must also make sure the other joints and links<br />
do not collide with anything. This is extremely difficult<br />
to do.<br />
What if you want the robot end effector to draw straight lines<br />
with a pencil? Getting it to go from point A to point B in a<br />
straight line is relatively simple to solve. What the robot<br />
should do, by using inverse kinematics, is go to many points<br />
between point A and point B. The final motion will come<br />
out as a smooth straight line. We can use this<br />
method not only with straight lines, but with curved ones too. On<br />
expensive professional robotic arms all we need to do is<br />
program two points, and tell the robot how to go between<br />
the two points (straight line, fast as possible, etc.).<br />
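The idea of visiting many intermediate points can be sketched in a few lines. The helper names are hypothetical, and `solve_ik` stands for whatever inverse kinematics routine is available; it is passed in as a parameter so the sketch stays self-contained.

```python
def straight_line_waypoints(a, b, steps):
    """Return steps + 1 evenly spaced (x, y) points from a to b inclusive."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (ax, ay), (bx, by) = a, b
    return [(ax + (bx - ax) * i / steps, ay + (by - ay) * i / steps)
            for i in range(steps + 1)]


def joint_path(a, b, steps, solve_ik):
    """Convert each waypoint into joint angles using the supplied IK solver."""
    return [solve_ik(x, y) for (x, y) in straight_line_waypoints(a, b, steps)]


if __name__ == "__main__":
    print(straight_line_waypoints((0.0, 0.0), (2.0, 4.0), 4))
```

More steps give a straighter path at the cost of more inverse kinematics evaluations, which matters on a slow microcontroller.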
Velocity (and more Motion Planning)<br />
Calculating end effector velocity is mathematically complex,<br />
so we will go only into the basics. The simplest way to do it<br />
is to assume the robot arm (held straight out) is a rotating<br />
wheel of radius L. The joint rotates at Y rpm, so<br />
the velocity is<br />
Velocity of end effector on straight arm = 2 * pi * radius * rpm<br />
However, the end effector does not just rotate about the base,<br />
but can go in many directions. The end effector can follow a<br />
straight line, a curve, etc.<br />
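A small worked example of the straight-arm velocity formula follows. The formula as written gives distance per minute because the rotation rate is in rpm; the conversion to per-second units and the function name are additions made here for illustration.

```python
import math


def end_effector_speed(radius, rpm):
    """Return (distance per minute, distance per second) for a straight arm."""
    per_minute = 2 * math.pi * radius * rpm
    return per_minute, per_minute / 60.0


if __name__ == "__main__":
    # Hypothetical arm: 0.5 m long, joint turning at 60 rpm.
    print(end_effector_speed(radius=0.5, rpm=60))
```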
With robot arms, the quickest way between two points is<br />
often not a straight line. If two joints have two different<br />
motors, or carry different loads, then max velocity can vary<br />
between them. When we tell the end effector to go from one<br />
point to the next, we have two options: have it follow a<br />
straight line between both points, or tell all the joints to go<br />
as fast as possible, leaving the end effector to possibly<br />
swing wildly between those points.<br />
In Diagram 5 the end effector of the robot arm is moving<br />
from the blue point to the red point. In the top example, the<br />
end effector travels in a straight line. This is the only possible<br />
motion this arm can perform to travel in a straight line. In the<br />
bottom example, the arm is told to get to the red point as fast<br />
as possible. Given many different trajectories, the arm takes<br />
the one that allows the joints to rotate the fastest.<br />
Diagram 5<br />
There are many deciding factors in selecting the best method.<br />
Usually we want straight lines when the object the arm<br />
moves is really heavy, as moving it requires a change in momentum<br />
(momentum = mass * velocity). But for<br />
maximum speed (perhaps the arm isn't carrying anything, or<br />
just light objects) we would want maximum joint speeds.<br />
Now suppose we want the robot arm to operate at a certain<br />
rotational velocity; how much torque would a joint need?<br />
First, let's go back to our free body diagram<br />
(Diagram 6):<br />
Diagram 6<br />
Now let's suppose we want joint J0 to rotate 180 degrees in<br />
under 2 seconds; what torque does the J0 motor need? Well,<br />
J0 is not affected by gravity, so all we need to consider is<br />
momentum and inertia. Putting this in equation form we get<br />
this:<br />
torque = moment_of_inertia * angular_acceleration<br />
Breaking that equation into subcomponents we get:<br />
torque = (mass * distance^2) * (change_in_angular_velocity / change_in_time)<br />
and<br />
change_in_angular_velocity = (angular_velocity1) - (angular_velocity0)<br />
angular_velocity = change_in_angle / change_in_time<br />
Now assuming that at start time 0 angular_velocity0 is zero,<br />
we get<br />
torque = (mass * distance^2) * (angular_velocity / change_in_time)<br />
where distance is defined as the distance from the rotation<br />
axis to the center of mass of the arm:<br />
center of mass of the arm = distance = 1/2 * (arm_length) (use arm mass)<br />
but we also need to account for the object the arm holds:<br />
center of mass of the object = distance = arm_length (use object mass)<br />
So we calculate torque for the arm and then again for<br />
the object, then add the two torques together for the total:<br />
torque(of_object) + torque(of_arm) = torque(for_motor)<br />
And of course, if J0 were additionally affected by gravity,<br />
add the torque required to lift the arm to the torque required<br />
to reach the velocity needed.<br />
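The torque estimate for J0 can be worked through numerically. The sketch follows the point-mass approximation used in the text (mass concentrated at the center of mass, starting from rest, angular velocity taken as angle over time); the arm and object masses and the arm length are hypothetical values, and SI units are assumed.

```python
import math


def point_mass_torque(mass, distance, angle, duration):
    """torque = (mass * distance^2) * (angular_velocity / change_in_time)."""
    angular_velocity = angle / duration
    angular_acceleration = angular_velocity / duration   # starting from rest
    return mass * distance**2 * angular_acceleration


def motor_torque(arm_mass, arm_length, object_mass, angle, duration):
    """Sum of the torque for the arm (at L/2) and for the held object (at L)."""
    torque_arm = point_mass_torque(arm_mass, arm_length / 2, angle, duration)
    torque_object = point_mass_torque(object_mass, arm_length, angle, duration)
    return torque_arm + torque_object


if __name__ == "__main__":
    # Hypothetical arm: 1 kg, 0.5 m long, holding 0.2 kg; 180 degrees in 2 s.
    print(motor_torque(1.0, 0.5, 0.2, angle=math.pi, duration=2.0))
```

For a joint that also lifts against gravity, the static moment from the force calculations would be added to this result, as the text notes.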
V. IMPLEMENTATION OF INERTIAL TRACKING OF<br />
HUMAN LIMB SEGMENTS<br />
The implementation of inertial tracking of a human limb<br />
segment is shown in Figure 1. Three inertial sensors are<br />
mounted on the human limb segment. The<br />
analog output from the limb is 20 mV for adults (5 mV for<br />
infants) and is fed to a signal conditioner circuit, which<br />
amplifies the 20 mV signal to 5 V for better ADC accuracy; the<br />
corresponding amplifier gain is 5000/20 = 250.<br />
The 5 V analog output is fed to a filter circuit, which provides<br />
high-speed noise filtering with a constant frequency of<br />
approximately 100 Hz, and then to the Embedded Controller, on<br />
which the software filter algorithm as well as the analog-to-digital<br />
converter is implemented. The output is digitized by<br />
the built-in A/D converter. The digitized output<br />
from the Embedded Controller is connected to the PC through<br />
an RS-232 converter. All data processing and calculations<br />
are performed by software running on this single processor.<br />
An optocoupler circuit is used for<br />
isolation between the three stepper motors and the Embedded<br />
Controller. The driver circuit drives the three stepper motors<br />
in different directions. The electronic system is rated<br />
up to 5 A, which is suitable for a 70 kg-cm stepper motor, but the<br />
stepper motor used in this research is only 7 kg-cm. The rotation depends<br />
on the human limb movement about all axes (up/down,<br />
roll, elevation and azimuth) as per the assigned threshold<br />
value. The threshold value, the direction of the three stepper motors,<br />
the axis movement, the graphical representation and the sensor<br />
data can all be displayed on the<br />
monitor by a program written in the C programming language. The<br />
optimal filter theory is applied in the filter software on the Flash<br />
Embedded Controller, and the visual simulation software runs on<br />
a single standard Pentium III processor<br />
(R. H. Barnett, L. O’Cull, 2004). The use of Bluetooth technology will<br />
enable sensors to wirelessly transmit data from body<br />
extremities to the wearable PC. Inertial orientation tracking<br />
combined with RF positioning is also tried, to provide an<br />
accurate method for determining orientation and location.<br />
Finally, the overall prototype hardware kit is<br />
shown in Figure 2.<br />
Static Stability of the system<br />
Figure 3 plots the magnitude of the quaternion<br />
filter criterion function versus time. The drift characteristics<br />
of the quaternion filter algorithm and the MARG sensor<br />
over extended periods were evaluated using static tests.<br />
Average total drift is about 1%. During the experiment<br />
shown, the filter gain k was set to unity. It is expected that<br />
increasing the filter gain to 4.0 would reduce the drift error<br />
by a factor of four or to about 0.25 percent. Further<br />
experiments indicated that nearly all drift was due to bias in<br />
the rate sensors. Experiments are currently underway using<br />
improved sensors containing rate-sensor capacitive coupling<br />
conditioning circuitry designed to remove these biases.<br />
Dynamic Response of the system<br />
Preliminary experiments were conducted to establish the<br />
accuracy of the orientation estimates and the dynamic<br />
response of the system. The preliminary test procedure<br />
consisted of repeatedly cycling the sensor through various<br />
angles of roll, pitch and yaw at rates ranging from 10 to 30<br />
deg/s. Accuracy was measured to be better than one<br />
degree. The overall smoothness of the plot shows excellent<br />
dynamic response.<br />
VI. EXPERIMENTAL TEST RESULTS OF HUMAN<br />
LIMB SEGMENT<br />
Figure 4<br />
Figure 5<br />
Figure 6<br />
Figure 7<br />
Figure 8<br />
Figure 9<br />
Table 1<br />
Sensor | Stepper Motor | Threshold Value 400 & above | Threshold Value 360 & below | Axis Rotation<br />
S1 | M1 | Reverse | Forward | Forward/Reverse<br />
S2 | M2 | Reverse | Forward | Forward/Reverse<br />
S3 | M3 | Reverse | Forward | Forward/Reverse<br />
Table 2 Simulation Results of Human Limb Segment<br />
Stepper Motor M1 | Stepper Motor M2 | Stepper Motor M3 | Sensor S1 | Sensor S2 | Sensor S3 | Axis Rotation | Threshold Value<br />
Forward | Stopped | Stopped | 110 | 000 | 000 | Forward | 360 and below<br />
Stopped | Forward | Stopped | 000 | 160 | 000 | Forward | 360 and below<br />
Stopped | Stopped | Forward | 000 | 000 | 141 | Forward | 360 and below<br />
Reverse | Stopped | Stopped | 450 | 000 | 000 | Reverse | 400 and above<br />
Stopped | Reverse | Stopped | 000 | 580 | 000 | Reverse | 400 and above<br />
Stopped | Stopped | Reverse | 000 | 000 | 650 | Reverse | 400 and above<br />
The above results were obtained with the<br />
hardware and software running at an update rate of 100 Hz.<br />
The roll, pitch and yaw test results are presented in Figures<br />
4, 5, 6 and 7. The smoothness of the graphs<br />
indicates excellent dynamic response. It is expected that<br />
adjusting the filter gain values would improve the overall<br />
accuracy and dynamic response. The transition times<br />
observed in the plots are around 4.5-5 seconds, as expected<br />
for a rotation to 45 degrees at 10 degrees per second. In<br />
qualitative tests, the system was able to track the limb<br />
segment in all orientations, including those in which pitch equaled 90 degrees;<br />
such orientations normally cause singularities in Euler<br />
angle filters. The qualitative tests also show that the system<br />
could easily be combined with a simulation program and<br />
track motion in real time.<br />
The purpose of the human body tracking system is to<br />
estimate the orientation of multiple human limb segments<br />
and use the resulting estimates to set the posture of the<br />
human body model that is visually displayed. Numerous<br />
experiments were conducted to qualitatively evaluate and<br />
demonstrate this capability.<br />
In each experiment three sensors were attached to the limb<br />
segments to be tracked. Due to the small number of<br />
sensors available, tracking was limited to a single arm or<br />
leg. In the case of arm and leg segments, sensor attachment<br />
was achieved through the use of elastic bandages. In most<br />
cases this method appeared to keep the sensors fixed relative<br />
to the limb. Body tracking was also performed using various<br />
filter gains.<br />
VII. CONCLUSIONS<br />
This research has demonstrated an alternative<br />
technology for tracking the posture of an articulated rigid<br />
body. A high-speed Embedded Controller avoids<br />
electronic complexity, Bluetooth technology enables<br />
sensors to wirelessly transmit data from the body to the PC, and<br />
inertial sensors determine the orientation of each link of<br />
the rigid body. RF positioning complements the sourceless<br />
capability of inertial sensing and enables tracking of<br />
multiple users over a wide area. At the core of the system is<br />
an efficient complementary filter that uses a quaternion<br />
representation of orientation; the filter can continuously<br />
track the orientation of human body limb segments (Robert B.<br />
McGhee, 2000). Drift corrections are also made. This research<br />
reduces the analysis and calculation burden of previous<br />
approaches through the use of an Embedded Controller.<br />
The embedded software filter processes the data from the three-axis<br />
inertial sensors attached to the human limb<br />
segment. Sensor calibration is achieved without using any<br />
specialized equipment. An accurate calibration algorithm<br />
compensates for the misalignment between the sensor and limb<br />
segment coordinate axes. Hybrid stepper motors with<br />
associated mechanical elements are used to move the limbs<br />
about all axes: up/down, roll, elevation and azimuth. The<br />
implemented system tracks human limb segments accurately<br />
at a 100 Hz update rate. Experimental results demonstrate<br />
that inertial orientation estimation is a practical method of<br />
tracking human body posture. With additional sensors, the<br />
architecture could easily be scaled for full-body<br />
tracking. Due to its sourceless nature, inertial tracking could<br />
overcome many of the limitations of motion tracking<br />
technologies currently in widespread use. It is potentially<br />
capable of providing wide-area tracking of multiple users for<br />
synthetic environment and augmented reality applications.<br />
VIII. REFERENCES<br />
[1] An, K.N., Chao, E.Y., Cooney, W.P., <strong>and</strong> Linscheid,<br />
R.L.,<br />
1979, “Normative Model of Human H<strong>and</strong> for<br />
Biomechanical<br />
Analysis,” J. Biomechanics, vol. 12, pp. 775-788.<br />
[2] Bennet, D.J., Hollerbach, J.M., 1990, “Closed-loop<br />
Kinematic Calibration of the Utah-MIT H<strong>and</strong>,” in<br />
Experimental Robotics I: The First International Symp., V.<br />
Hayward, O. Khatib, (eds.), Springer-Verlag, N.Y., pp. 539-<br />
552.<br />
[3] Cooney, W.P., Lucca, M.J., Chao, E.Y.S., Linschied<br />
R.L.,<br />
1981, “The kinesiology of the thumb trapeziometacarpal<br />
joint,” J. Bone Joint Surg. 63A:1371-1381.<br />
[4] Fischer, M., van der Smagt, P., Hirzinger, G., 1998,<br />
“Learning Techniques in a Dataglove Based<br />
Telemanipulation<br />
System for the DLR H<strong>and</strong>,” 1998 IEEE ICRA, pp1603-<br />
1608.<br />
[5] Hollister, A., Buford, W.L., Myers, L.M., Giurintano,<br />
D.J,<br />
Novick, A., 1992, “The Axes of Rotation of the Thumb<br />
Carpometacarpal Joint,” J. of Orthopaedic Res., vol. 10, pp.<br />
454-460.<br />
[6] Khatib, O., 1987, “Unified Approach for motion <strong>and</strong><br />
force<br />
control of robot manipulators: the operational space<br />
formulation,” IEEE J. of Robotics <strong>and</strong> Automation, vol. 3,<br />
no. 1, pp. 43-53.<br />
[7] Kramer, J.F., “Determination of Thumb Position Using<br />
Measurements of Abduction <strong>and</strong> Rotation,” U.S. Patent<br />
#5,482,056.<br />
[8] Kuch, J.J., Huang, T.S., 1995, “Human Computer<br />
Interaction via the Human H<strong>and</strong>: A H<strong>and</strong> Model,” 1995<br />
Asilomar Conf. on Signals, <strong>Systems</strong>. <strong>and</strong> <strong>Computers</strong>. pp.<br />
1252-1256.<br />
[9] Rohling, R.N., Hollerbach, J.M., 1993, “Calibrating the<br />
Human H<strong>and</strong> for Haptic Interfaces,” Presence, vol. 2 no. 4,<br />
pp.281-296.<br />
[10] Rohling, R.N, Hollerbach, J.M., Jacobsen, S.C., 1993,<br />
“Optimized Fingertip Mapping: A General Algorithm for<br />
Robotic H<strong>and</strong> Teleoperation,” Presence, vol. 2 no. 3, pp.<br />
203-<br />
220.<br />
[11] Turner, M.L., Gomez, D.H. Tremblay, M.R. <strong>and</strong><br />
Cutkosky,<br />
M.R., 1998, “Preliminary Tests of an Arm-Grounded Haptic<br />
Feedback Device in Telemanipulation,” 1998 ASME<br />
IMECE<br />
Symp. on Haptic Interfaces. pp.145-149.<br />
[12] Turner, M.L., Findley, R.P., Griffin, W.B., Cutkosky,<br />
M.R.,<br />
Gomez, D.H., 2000, “Development <strong>and</strong> Testing of a<br />
Telemanipulation System with Arm <strong>and</strong> H<strong>and</strong> Motion,”<br />
Accepted to 2000 ASME IMECE Symp. on Haptic<br />
Interfaces.<br />
[13] Wampler, C.W., Hollerbach, J.M., Arai, T., 1995, “An<br />
Implicit Loop Method for Kinematic Calibration <strong>and</strong> its<br />
Application to Closed-chain mechanisms,” IEEE Trans.<br />
Robotics <strong>and</strong> Automation, vol. 11, no. 5, pp. 710-724.<br />
[14] Wright, A.K., Stanisic, M.M., 1990, “Kinematic<br />
Mapping<br />
between the EXOS H<strong>and</strong>master Exoskeleton <strong>and</strong> the Utah-<br />
MIT Dextrous H<strong>and</strong>,” 1990 IEEE Int’l Conf. on <strong>Systems</strong><br />
Engineering, pp. 809-811.<br />
[15] www.societyofrobots.com /robot building ideas
A Novel Image Processing Approach Combining a ‘Coupled<br />
Nonlinear Oscillators’-based Paradigm with Cellular Neural<br />
Networks for Dynamic Robust Contrast Enhancement<br />
Kyandoghere Kyamakya (1), Cyrille Kalenga Wa Ngoy (2), Michel Matalatala Tamasala (2), Jean Chamberlain Chedjou (1)<br />
(1): Transportation Informatics Group, Institute of Smart Systems Technologies, University of Klagenfurt (Austria),<br />
Email: kyandoghere.kyamakya@uni-klu.ac.at ; jean.chedjou@uni-klu.ac.at<br />
(2): Department of Electrical and Computer Engineering, Polytechnic Faculty, University of Kinshasa (D. R. Congo)<br />
Email: Cyrille.Kalenga@vodacom.cd ; Michel.Matalatala@vodacom.cd<br />
Abstract: In this paper, a systematic discussion of both pros and<br />
cons of two well-known traditional approaches for image contrast<br />
enhancement is conducted. The first approach is based on the<br />
CNN paradigm and the second one is based on the coupled<br />
nonlinear oscillators’ paradigm for image processing. In the latter<br />
case an extensive bifurcation analysis is carried out and<br />
analytical formulas are derived to define the various states of the<br />
system. Both equilibrium and oscillatory states of the system are<br />
depicted. It is shown that each of these states has a significant<br />
impact on the quality of the resulting image contrast<br />
enhancement. A benchmarking is considered whereby a<br />
comparison is performed between the results obtained by<br />
CNN-based processing, on one side, and those obtained by ‘coupled<br />
nonlinear oscillators’ based processing, on the other side. The<br />
superiority of the latter approach (for contrast enhancement) is<br />
demonstrated both analytically and through various experiments.<br />
A major drawback of CNN-based image processing is the<br />
practical inability to adjust/re-calculate templates in real time in<br />
the face of a dynamic scene with input images experiencing visibility<br />
and/or lighting related spatio-temporal dynamics. Finally, a novel<br />
hybrid approach integrating both schemes in an efficient way is<br />
proposed: the ‘coupled nonlinear oscillators’ based image<br />
processing is the main processing scheme, which is however realized<br />
on top of a CNN processors’ framework. The hybrid approach<br />
proves to overcome key practical problems faced by both<br />
original approaches.<br />
Keywords: Cellular neural networks (CNN), Nonlinear coupled<br />
oscillators, van der Pol oscillator, Duffing oscillator, contrast<br />
enhancement, stability, bifurcation, Routh-Hurwitz theorem<br />
I. INTRODUCTION<br />
The last decades have witnessed tremendous attention<br />
devoted to the study of nonlinear coupled oscillators [2]-[17]<br />
with various related applications in diverse areas such as<br />
electrical engineering [18], [19], mechanics [15], electromechanics<br />
[14] and electronics [16], just to name a few. In<br />
some previous works [18]-[21], we have shown some<br />
interesting applications of the coupling between van der Pol<br />
and Duffing oscillators in both electronics and electromechanics.<br />
Further, in the recent literature a good number of<br />
notable contributions have been published showing<br />
various applications of the paradigm of nonlinear dynamics in<br />
image processing [1]-[13]: (a) the use of the CNN paradigm<br />
for contrast enhancement [1], edge detection [11], [17], image<br />
segmentation [2]-[10], [12], [13]; and (b) the use of the so-called<br />
LEGION model (involving nonlinear coupled<br />
oscillators) mainly for image segmentation [3]. However, the<br />
relevant literature does not provide sufficient<br />
information concerning the application of nonlinear<br />
coupled/uncoupled oscillators in image processing, especially<br />
for the specific task of contrast enhancement. In fact, only a<br />
single paper can be found in which image contrast<br />
enhancement has been done by using this latter paradigm [1].<br />
In contrast, the cellular neural network paradigm has shown<br />
through numerous publications its rich potential to solve many<br />
important low-level image processing tasks, e.g. image contrast<br />
enhancement [22]-[24], edge detection [25] and segmentation<br />
[26], [27], to name a few. Although the CNN paradigm offers<br />
an ideal framework for parallel and therefore<br />
ultrafast image processing, some important related<br />
issues still need a better theoretical foundation. One of<br />
these open questions is that of a comprehensive and straightforward<br />
methodology to derive appropriate CNN templates for<br />
a given image processing task. Currently known approaches<br />
determine the templates through a form of supervised learning,<br />
most commonly using genetic algorithms, simulated<br />
annealing or particle swarm optimization.<br />
The template obtained through<br />
such a ‘supervised learning’ like approach therefore depends strongly<br />
on the reference image(s) used. Consequently, this traditional way<br />
of calculating templates fails in a dynamic<br />
environment, which requires an adaptive, real-time<br />
determination or re-calculation of the appropriate<br />
templates in reaction to visibility- and lighting-related<br />
environmental changes. Indeed, for a specific processing task<br />
(e.g. contrast enhancement, segmentation, etc.) the optimal<br />
CNN templates must be adjusted or<br />
recalculated depending on the varying input image.<br />
An important open issue not yet<br />
resolved by the relevant scientific community is therefore that of<br />
providing a comprehensive, robust and general<br />
framework that allows real-time adaptation of the<br />
CNN templates for a specific image processing task to<br />
variations in all aspects of the input image(s). It is known<br />
that CNN templates are very sensitive to the quality of the<br />
input image and must be adjusted for dynamic images<br />
to obtain optimal processing. The supervised learning template<br />
calculation paradigm is therefore not appropriate for a<br />
situation where the visibility of the input image varies<br />
over time; it is almost impossible to<br />
recalculate the templates in real time.<br />
Thus, a key objective of this paper is to propose an<br />
image processing framework (in this case for<br />
“contrast enhancement”) that is robust to both<br />
temporal and spatial changes in the quality of the input<br />
image(s). The novel approach proposed here combines the<br />
paradigm of coupled nonlinear oscillators with that of cellular<br />
neural networks. It is shown how this integration is best<br />
realized. It is then<br />
demonstrated that the new architectural framework results<br />
in invariant templates while still being capable of robustly<br />
adapting the image processing to the spatio-temporal<br />
dynamics of the input image.<br />
The nonlinear coupled oscillator system model used in<br />
this paper consists of a van der Pol oscillator coupled to<br />
a Duffing oscillator. The focus is on the<br />
application of this coupled system to the specific image<br />
processing task of “contrast enhancement”. Contrast<br />
enhancement is an important issue in difficult and dynamic<br />
visual environments such as the ones faced by advanced driver<br />
assistance systems (ADAS), where it could help<br />
improve the image quality or the visibility in real time.<br />
We propose the<br />
implementation of the coupled nonlinear oscillators’ image<br />
processing concept on top of a cellular neural network<br />
framework. Here, the CNN processors are viewed as a<br />
slave system used to solve, in real time, the nonlinear ordinary<br />
differential equations describing the coupled nonlinear<br />
oscillators’ model. Image processing based on coupled<br />
nonlinear oscillators has one major strength:<br />
its processing efficiency depends neither on the actual<br />
image quality nor on lighting variations or states, but solely on<br />
the coefficients of the nonlinear differential equations<br />
describing the coupled oscillators’ model. The appropriate<br />
coefficients/parameters of the coupled nonlinear oscillators are<br />
determined in an offline bifurcation analysis process, which is<br />
explained further in this paper. The resulting new challenge<br />
is that of solving these nonlinear<br />
differential equations in real time. Note that<br />
these ordinary differential equations (ODEs) have ‘constant’<br />
coefficients, which are selected, as explained above,<br />
from the results of the bifurcation analysis.<br />
Therefore, the problem setting for the CNN processor<br />
system, on top of which the coupled oscillators will be<br />
implemented, is that of solving in real time a set of highly stiff<br />
nonlinear differential equations having constant coefficients.<br />
The input images are the frames, which are<br />
taken as initial conditions for the coupled oscillator system.<br />
The real-time constraint is determined by the actual frame<br />
rate. The new key challenge is therefore that<br />
of determining the appropriate templates for solving the set of<br />
stiff nonlinear differential equations. This is still an<br />
open issue in the current state of the relevant<br />
literature. We were however able to address and efficiently solve this<br />
challenging issue in a subsequent work. The results obtained<br />
are presented in full detail in another paper that we also<br />
publish in the conference proceedings of CNNA 2010, with<br />
the title “CNN-based Real-time Computational Engineering”<br />
(see [28]).<br />
The previous explanations highlight how<br />
the two concepts can be combined in a powerful and<br />
highly efficient real-time processing framework: a ‘coupled<br />
nonlinear oscillators’ based image processing scheme on top<br />
of a CNN processor framework.<br />
Contrast enhancement is an issue of prime<br />
importance in dynamic environments. It is one of the major<br />
low-level image processing tasks that must be done before<br />
further processing of an image is possible at higher levels.<br />
Things become more challenging in a continuously changing<br />
environment like the one experienced by driver assistance<br />
systems on the road: weather, lighting, etc. result in<br />
significant spatio-temporal variations of the input image<br />
quality. The real-time processing constraint makes the<br />
overall scenario more challenging: the higher the car speed,<br />
the faster the image processing must be. A continuously<br />
changing environment requires the system to be adaptive, i.e.<br />
the system should enhance the input image in such a<br />
way that the corresponding output image always<br />
has the best possible contrast regardless of the<br />
environmental conditions affecting<br />
the input image (darkness, non-uniform lighting, rain,<br />
fog, etc.). This implies that the output image should contain<br />
sufficient contrast for all of the objects in it<br />
to be easily distinguishable by the system for further<br />
processing such as scene analysis.<br />
The implementation of the coupled nonlinear oscillatory<br />
systems’ paradigm on top of a cellular neural network is a strong<br />
candidate for meeting this<br />
need. To develop such a paradigm, a systematic analytical<br />
framework should provide tools and methods for a straightforward<br />
design and parameter calculation of robust<br />
and ultra-fast image processing.<br />
The rest of the paper is organized as follows. Section II<br />
uses the Routh-Hurwitz theorem to address the stability<br />
analysis of the nonlinear coupled oscillatory system. Three<br />
main states of the coupled system are identified, namely the<br />
equilibrium, quenched and oscillatory states. Analytical<br />
relations are derived under which each of these states<br />
can be displayed by the coupled system. The quality of the<br />
image ‘contrast enhancement’ is discussed in each of the<br />
possible states of the coupled system. Windows of the system<br />
parameters are determined, under which either a good or a<br />
poor contrast enhancement can be predicted. Section III deals<br />
with the numerical study. An in-depth explanation of the<br />
image processing concept involving coupled nonlinear<br />
oscillators is provided. For rapid prototyping purposes a<br />
computing platform based on MATLAB/SIMULINK<br />
is developed. It is then used for a set of processing<br />
tasks on images with poor contrast and on images<br />
with very good contrast. Section IV deals with the<br />
benchmarking. This benchmarking shows how far the novel<br />
approach outperforms the classical CNN-based way of<br />
doing the same task, since the CNN templates used for<br />
contrast enhancement (or published in relevant books or<br />
papers) are in reality only optimal for the images used in the<br />
related training process. The latter is traditionally based on<br />
offline optimization processes involving genetic<br />
algorithms, simulated annealing or particle swarm<br />
optimization [29]-[34]. As a proof of concept of the approach<br />
developed in this paper, our results are compared with those<br />
provided by the relevant literature for CNN-based contrast<br />
enhancement.<br />
We further discuss a possible implementation of the coupled<br />
nonlinear oscillators on top of a CNN computing platform.<br />
The challenge here is that of transforming, as far as<br />
possible, the nonlinearity types present in both ‘van der Pol’<br />
and ‘Duffing’ oscillators into a type of nonlinearity similar to<br />
that displayed by the elementary CNN cell. We use a novel<br />
optimization process to achieve this goal. Section V<br />
presents a set of concluding remarks together with a summary<br />
of the key results obtained.<br />
II. ANALYTICAL STUDY<br />
The dynamics of a system consisting of a van der Pol<br />
oscillator coupled to a Duffing oscillator is described by the<br />
following equations:<br />
$$\frac{d^2x}{dt^2} - \varepsilon_1\left(1 - x^2\right)\frac{dx}{dt} + \omega_1^2 x = c_1 y + c_2\frac{dy}{dt} \tag{1a}$$<br />
$$\frac{d^2y}{dt^2} + \varepsilon_2\frac{dy}{dt} + \omega_2^2 y + c_0 y^3 = c_3 x + c_4\frac{dx}{dt} \tag{1b}$$<br />
where $c_1$ and $c_3$ are the elastic coupling parameters, and $c_2$<br />
and $c_4$ are the dissipative coupling parameters. $x(t)$ and $y(t)$<br />
represent the coordinates of the coupled oscillators (i.e. van<br />
der Pol and Duffing respectively). The stability analysis of the<br />
equilibrium points is carried out by restricting our<br />
investigation to the case where the elastic couplings<br />
(respectively the dissipative couplings) are identical. From (1),<br />
we obtain the following equilibrium points ($c_2 = c_4 = 0$):<br />
$$P_1\left(\sqrt{\frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\; 0,\; \sqrt{\frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\; 0\right) \tag{2a}$$<br />
$$P_2\left(\sqrt{\frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\; 0,\; -\sqrt{\frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\; 0\right) \tag{2b}$$<br />
$$P_3\left(-\sqrt{\frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\; 0,\; \sqrt{\frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\; 0\right) \tag{2c}$$<br />
$$P_4\left(-\sqrt{\frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}},\; 0,\; -\sqrt{\frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}},\; 0\right) \tag{2d}$$<br />
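As a worked check of the magnitudes in (2), the following sketch assumes identical elastic couplings ($c_3 = c_1$) and $c_2 = c_4 = 0$, as stated above, and sets all time derivatives in (1) to zero:<br />

```latex
\begin{align*}
\omega_1^2 x &= c_1 y, \\
\omega_2^2 y + c_0 y^3 &= c_1 x
\quad\Longrightarrow\quad
y\left(\omega_2^2 + c_0 y^2 - \frac{c_1^2}{\omega_1^2}\right) = 0, \\
y^2 &= \frac{c_1^2 - \omega_1^2\omega_2^2}{c_0\,\omega_1^2}, \qquad
x^2 = \frac{c_1^2}{\omega_1^4}\,y^2
    = \frac{c_1^2\left(c_1^2 - \omega_1^2\omega_2^2\right)}{c_0\,\omega_1^6}.
\end{align*}
```

The root $y = 0$ corresponds to the origin. The first relation also fixes the relative sign of $x$ and $y$ through the sign of $c_1$.<br />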
These points exist under the conditions $c_1 < \omega_1\omega_2$ and $c_0 < 0$,<br />
or $c_1 > \omega_1\omega_2$ and $c_0 > 0$. We also obtain a critical equilibrium<br />
point $P_c(0,0,0,0)$. The stability of the above equilibrium points<br />
can be investigated by re-writing (1) in the following form:<br />
$$\frac{dx}{dt} = v \tag{3a}$$<br />
$$\frac{dv}{dt} = \varepsilon_1\left(1 - x^2\right)v - \omega_1^2 x + c_1 y \tag{3b}$$<br />
$$\frac{dy}{dt} = z \tag{3c}$$<br />
$$\frac{dz}{dt} = -\varepsilon_2 z - \omega_2^2 y - c_0 y^3 + c_1 x \tag{3d}$$<br />
and linearizing around a given equilibrium state $(x_0, v_0, y_0, z_0)$<br />
to obtain the Jacobian matrix $M_J$:<br />
$$M_J = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -\omega_1^2 - 2\varepsilon_1 x_0 v_0 & \varepsilon_1\left(1 - x_0^2\right) & c_1 & 0 \\ 0 & 0 & 0 & 1 \\ c_1 & 0 & -\omega_2^2 - 3c_0 y_0^2 & -\varepsilon_2 \end{bmatrix} \tag{4}$$<br />
The eigenvalues of the 4x4 Jacobian matrix $M_J$ are the solutions of (5):<br />
$$a_0\lambda^4 + a_1\lambda^3 + a_2\lambda^2 + a_3\lambda + a_4 = 0 \tag{5}$$<br />
where the coefficients $a_l$ are defined as follows:<br />
$$a_0 = 1 \tag{6a}$$<br />
$$a_1 = \varepsilon_2 - \varepsilon_1\left(1 - x_0^2\right) \tag{6b}$$<br />
$$a_2 = \omega_1^2 + \omega_2^2 + 2\varepsilon_1 x_0 v_0 + 3c_0 y_0^2 - \varepsilon_1\varepsilon_2\left(1 - x_0^2\right) \tag{6c}$$<br />
$$a_3 = \varepsilon_2\left(\omega_1^2 + 2\varepsilon_1 x_0 v_0\right) - \varepsilon_1\left(1 - x_0^2\right)\left(\omega_2^2 + 3c_0 y_0^2\right) \tag{6d}$$<br />
$$a_4 = \left(\omega_1^2 + 2\varepsilon_1 x_0 v_0\right)\left(\omega_2^2 + 3c_0 y_0^2\right) - c_1^2 \tag{6e}$$<br />
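The coefficients (6a)-(6e) can be checked numerically against the Routh-Hurwitz criterion. The following is a minimal sketch, not the authors' code: the function names `char_coeffs` and `is_stable` are hypothetical, and the stability test is the standard Routh-Hurwitz condition for a monic quartic. The parameter values are the ones quoted later in the paper for the quenching state.<br />

```python
def char_coeffs(x0, v0, y0, eps1, eps2, w1, w2, c0, c1):
    """Coefficients a0..a4 of the characteristic polynomial (5), per (6a)-(6e)."""
    A = w1**2 + 2.0 * eps1 * x0 * v0   # linearized van der Pol stiffness
    B = eps1 * (1.0 - x0**2)           # local van der Pol (anti-)damping
    C = w2**2 + 3.0 * c0 * y0**2       # linearized Duffing stiffness
    a0 = 1.0
    a1 = eps2 - B
    a2 = A + C - eps2 * B
    a3 = eps2 * A - B * C
    a4 = A * C - c1**2
    return a0, a1, a2, a3, a4


def is_stable(coeffs):
    """Routh-Hurwitz test for a monic quartic: all roots have negative real part."""
    _, a1, a2, a3, a4 = coeffs
    return a1 > 0 and a3 > 0 and a4 > 0 and a1 * a2 * a3 > a3**2 + a1**2 * a4


# Parameter values quoted in Section III for the quenching state (c1 varied here).
params = dict(eps1=0.4, eps2=1.0, w1=1.0, w2=1.0, c0=0.5)
for c1 in (0.5, 0.8, 1.05):
    stable = is_stable(char_coeffs(0.0, 0.0, 0.0, c1=c1, **params))
    print(f"c1 = {c1}: origin stable? {stable}")
```

With these values the origin passes the test only for the intermediate coupling $c_1 = 0.8$, and fails it for $c_1 = 0.5$ and $c_1 = 1.05$.<br />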
It can be shown (by the analysis of the oscillatory states of the<br />
coupled system and by exploiting the Routh-Hurwitz theorem)<br />
that three possible states of the system can be identified. The<br />
first state is the quenching state (i.e. the death of oscillations).<br />
The second is the state of equilibrium. The last one is the<br />
oscillatory state.<br />
The system exhibits its quenching state when the critical<br />
equilibrium point $P_c(0,0,0,0)$ is stable. It can be shown, using<br />
the Routh-Hurwitz theorem, that the critical equilibrium point<br />
is stable if the following relationships are satisfied (assuming<br />
that the natural frequencies of the coupled oscillators are<br />
equal):<br />
$$\varepsilon_1 < \varepsilon_2 \tag{7a}$$<br />
$$\omega_1\sqrt{\varepsilon_1\varepsilon_2} < c_1 < \omega_1^2 \tag{7b}$$<br />
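The quenching conditions can be traced back to the coefficients (6) as follows. This is a sketch of the intermediate step, assuming $\omega_1 = \omega_2 = \omega$, $c_1 > 0$, $\varepsilon_1\varepsilon_2 > 0$, and the standard Routh-Hurwitz conditions for a monic quartic ($a_1 > 0$, $a_3 > 0$, $a_4 > 0$, $a_1 a_2 a_3 > a_3^2 + a_1^2 a_4$), evaluated at $x_0 = v_0 = y_0 = 0$:<br />

```latex
\begin{align*}
a_1 &= \varepsilon_2 - \varepsilon_1, \qquad
a_2 = 2\omega^2 - \varepsilon_1\varepsilon_2, \qquad
a_3 = \omega^2\left(\varepsilon_2 - \varepsilon_1\right), \qquad
a_4 = \omega^4 - c_1^2, \\
a_1 > 0 &\;\Longleftrightarrow\; \varepsilon_1 < \varepsilon_2, \qquad
a_4 > 0 \;\Longleftrightarrow\; c_1 < \omega^2, \\
a_1 a_2 a_3 - a_3^2 - a_1^2 a_4
  &= \left(\varepsilon_2 - \varepsilon_1\right)^2
     \left(c_1^2 - \omega^2\varepsilon_1\varepsilon_2\right) > 0
  \;\Longleftrightarrow\; c_1 > \omega\sqrt{\varepsilon_1\varepsilon_2}.
\end{align*}
```

For the parameter set used later in the paper ($\varepsilon_1 = 0.4$, $\varepsilon_2 = 1$, $\omega = 1$) this window is approximately $0.632 < c_1 < 1$, which contains the value $c_1 = 0.8$ used for the quenching state.<br />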
It can also be shown (using the Routh-Hurwitz theorem) that<br />
the non-zero equilibrium points $P_i$ ($i = 1,2,3,4$) are stable for<br />
$$\varepsilon_1 < \varepsilon_2 \tag{8a}$$<br />
$$c_1 > \omega_1^2 \tag{8b}$$<br />
Under the conditions described by (8) all the neighboring<br />
orbits of the critical equilibrium points are stable. It can be<br />
shown, using the oscillatory states analysis method (e.g. the<br />
multiple time scale method), that the coupled system displays<br />
oscillatory states. These states could be observed under the<br />
following conditions:<br />
$$\varepsilon_1 < \varepsilon_2 \tag{9a}$$<br />
0
some preliminary results of the processing (contrast<br />
enhancement) of images with a very poor initial<br />
contrast. The main focus will be on showing that the quality of<br />
the contrast enhancement is different in each of the various<br />
‘parameter windows’ established analytically in (7)-(9). The<br />
advantage of this feature is the possibility of predicting<br />
either good or poor image processing; both depend on the<br />
selected parameter values of the coupled nonlinear oscillators’<br />
model.<br />
III. NUMERICAL STUDY<br />
A. Description of the concept<br />
The proposed coupled oscillatory system consists of two<br />
nonlinear oscillators, i.e., a van der Pol oscillator <strong>and</strong> a Duffing<br />
oscillator, each represented by a second order nonlinear<br />
differential equation as given in (1). From a nonlinear<br />
dynamics perspective the scheme to solve this oscillatory<br />
system is straightforward.<br />
[Figure 1 (block diagram): Input image → Discretization → Vectorization into the vectors $\vec{x} = [x_1, x_2, \ldots, x_n]^T$, $\vec{y} = [y_1, y_2, \ldots, y_n]^T$ and their derivatives $d\vec{x}/dt$, $d\vec{y}/dt$ → Coupled Oscillatory Paradigm]<br />
Figure 1. Image processing through the oscillatory model<br />
To exploit the coupled nonlinear model for<br />
image processing tasks (e.g. contrast enhancement, edge<br />
detection, segmentation, etc.) the basic idea remains the same,<br />
although it is slightly more involved. The input image is pixelized first, i.e. it<br />
must take a grid-like form. Then the pixelized image is<br />
vectorized. The elements of the vector image are the individual<br />
pixels. This vector image serves as the initial condition vector for<br />
the coupled oscillatory system. Solving a second-order ordinary<br />
differential equation requires two initial conditions (i.e.<br />
position and velocity), and the same holds here. To<br />
process each pixel the system needs four values, namely<br />
the ‘position’ and ‘velocity’ values for both the van der Pol<br />
oscillator and the Duffing oscillator. The initial<br />
conditions vector therefore has four elements, each having the<br />
same size as the input image: two vector elements<br />
for the initial positions and two further vector elements for<br />
the initial velocities. The key steps of the overall principle are<br />
shown in Fig. 1.<br />
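The steps described above (pixelize, vectorize, use the pixels as initial conditions, integrate, reshape) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SIMULINK model used in the paper: the names `rhs`, `rk4_step` and `process` are hypothetical, the classical fourth-order Runge-Kutta scheme is a swapped-in solver chosen for brevity, the image is loaded in $x$ only with all other initial values set to zero, and the step size and step count are arbitrary.<br />

```python
def rhs(state, p):
    """Right-hand side of the first-order system (3a)-(3d) for one pixel."""
    x, v, y, z = state
    dv = p["eps1"] * (1.0 - x * x) * v - p["w1"] ** 2 * x + p["c1"] * y
    dz = -p["eps2"] * z - p["w2"] ** 2 * y - p["c0"] * y ** 3 + p["c1"] * x
    return (v, dv, z, dz)


def rk4_step(state, dt, p):
    """One classical Runge-Kutta step (an assumed solver, not the paper's)."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = rhs(state, p)
    k2 = rhs(shifted(state, k1, dt / 2), p)
    k3 = rhs(shifted(state, k2, dt / 2), p)
    k4 = rhs(shifted(state, k3, dt), p)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def process(image, p, dt=0.01, steps=1000):
    """Vectorize a 2-D grey-level image, integrate each pixel, reshape back."""
    rows, cols = len(image), len(image[0])
    pixels = [float(value) for row in image for value in row]  # vectorization
    out_x, out_y = [], []
    for value in pixels:
        state = (value, 0.0, 0.0, 0.0)  # (x, dx/dt, y, dy/dt): image loaded in x
        for _ in range(steps):
            state = rk4_step(state, dt, p)
        out_x.append(state[0])  # van der Pol solution
        out_y.append(state[2])  # Duffing solution

    def reshape(vec):
        return [vec[r * cols:(r + 1) * cols] for r in range(rows)]

    return reshape(out_x), reshape(out_y)


# Usage with illustrative grey levels in [0, 1] and the quenching parameters.
params = dict(eps1=0.4, eps2=1.0, w1=1.0, w2=1.0, c0=0.5, c1=0.8)
frame = [[0.1, 0.5], [0.9, 0.3]]
van_der_pol_image, duffing_image = process(frame, params, steps=200)
```

Each pixel is integrated independently, so the loop over pixels is the part that a parallel processor array would execute simultaneously.<br />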
The system generates two solutions at each time step: one<br />
for the van der Pol oscillator and the other for the<br />
Duffing oscillator. These solutions are obtained in the form<br />
of vector images, which must be converted back to the grid-like<br />
shape. Normally, the input images are loaded (as initial<br />
conditions) in $\vec{x}$, in $\vec{y}$, or in both, but there are other<br />
possible scenarios for initializing the model. Some of these<br />
scenarios are listed in the following:<br />
• Loading the image in $\vec{x}$<br />
• Loading the image in $\vec{y}$<br />
• Loading the image in $\vec{x}$ and $\vec{y}$<br />
• Loading the image in $d\vec{x}/dt$<br />
• Loading the image in $d\vec{y}/dt$<br />
• Loading the image in $d\vec{x}/dt$ and $d\vec{y}/dt$<br />
• Loading the image in $\vec{x}$ and $d\vec{x}/dt$<br />
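The initialization scenarios listed above can be expressed as masks over the four-element initial condition vector. The sketch below is illustrative only: the scenario keys and the function `initial_conditions` are hypothetical names, and setting the components that do not receive the image to zero is an assumption that the paper does not state.<br />

```python
# State order: (x, dx/dt, y, dy/dt). A 1 means "load the image here".
SCENARIOS = {
    "x":         (1, 0, 0, 0),
    "y":         (0, 0, 1, 0),
    "x_and_y":   (1, 0, 1, 0),
    "dx":        (0, 1, 0, 0),
    "dy":        (0, 0, 0, 1),
    "dx_and_dy": (0, 1, 0, 1),
    "x_and_dx":  (1, 1, 0, 0),
}


def initial_conditions(pixels, scenario):
    """Return four vectors (x, dx/dt, y, dy/dt), each as long as the vectorized image."""
    mask = SCENARIOS[scenario]
    # Components outside the scenario are set to zero (an assumption).
    return tuple([m * float(p) for p in pixels] for m in mask)


x0, v0, y0, z0 = initial_conditions([0.1, 0.9], "x_and_dx")
```

Each returned vector has the same length as the vectorized image, matching the four-element initial condition vector described in the text.<br />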
The SIMULINK model (i.e. a graphical representation) that has<br />
been used for the simulations of this paper (i.e. for image<br />
contrast enhancement) is shown in Fig. 2. This graphical model<br />
is a representation of the nonlinear coupled oscillatory system<br />
from the nonlinear dynamics perspective.<br />
B. Results<br />
Our objective in this part is to connect the results obtained<br />
analytically (different states of the nonlinear oscillator system)<br />
to some sample image processing examples obtained through<br />
numerical simulations. The key issue hereby is that of<br />
establishing a correlation between the formulas derived<br />
analytically <strong>and</strong> the related image processing results obtained<br />
numerically (i.e. contrast enhancement).<br />
It has been shown analytically that the equilibrium points<br />
(i.e. both $P_c$ and $P_i$) are stable under the analytical conditions<br />
described by (7), (8) and (9). We now want to exploit these<br />
relations to show the quality of the image processing tasks<br />
performed by the coupled oscillators’ system in its equilibrium<br />
states. It is worth mentioning that two main equilibrium<br />
states of the coupled system have been identified analytically.<br />
The first state is the quenching state, in which the critical<br />
point (i.e. the point at the origin) $P_c(0, 0, 0, 0)$ is stable. At the<br />
critical point both oscillators are mutually damped (i.e.<br />
complete damping), leading to the quenching phenomenon.<br />
When this phenomenon occurs, the image<br />
processing yields an image that is completely dark (see<br />
Fig. 3b), whatever the quality (good or poor) of the input<br />
image may be (see Fig. 3a). The following set of parameters<br />
has been used to obtain the quenching state under the<br />
conditions described in (7): $\varepsilon_1 = 0.4$,<br />
$\varepsilon_2 = 1$, $\omega_1 = 1$, $\omega_2 = 1$,<br />
$c_1 = 0.8$, $c_3 = 0.8$, $c_2 = 0$,<br />
$c_4 = 0$, and $c_0 = 0.5$.<br />
Figure 2. Simulink representation of the coupled oscillators’ model<br />
Figure 3. Result of the image processing in the case where the system is at the<br />
critical equilibrium point $P_c(0, 0, 0, 0)$: (a) input image and (b) result of<br />
the processing, an output image which is completely dark<br />
(quenching phenomenon).<br />
An important observation that can be drawn from Fig. 4 is<br />
that the image processing quality is highest for equilibrium<br />
points that lie far away from the critical point.<br />
For instance, the parameter values $c_1 = 1.05$, $c_1 = 1.15$, and<br />
$c_1 = 1.3$ lead to the equilibrium points<br />
$P_1(0.475, 0, 0.455, 0)$, $P_1(0.923, 0, 0.803, 0)$ and<br />
$P_1(1.52, 0, 1.17, 0)$ respectively. Therefore, by increasing $c_1$,<br />
the equilibrium points $P_i$ move away from the critical<br />
equilibrium $P_c$, leading to a significant<br />
improvement in the quality of the image processing (contrast<br />
enhancement). The images obtained are presented in Fig.<br />
4. We have also performed a series of image processing<br />
numerical simulations in the ‘oscillatory states’ of the coupled<br />
system described by (9). Using the same set of parameters as<br />
in Fig. 4, $c_1$ has been used as control parameter. The results<br />
in Fig. 5 reveal that in the oscillatory state of<br />
the coupled system, the quality of the processing increases with<br />
decreasing $c_1$ (see the results of the processing in Fig. 5b, Fig.<br />
5c and Fig. 5d).<br />
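The equilibrium points quoted above can be reproduced from the closed form (2a). The following sketch assumes the values $c_0 = 0.5$ and $\omega_1 = \omega_2 = 1$ given for the quenching example also apply here (the text says the same parameter set is used); the function name `nonzero_equilibrium` is hypothetical.<br />

```python
import math


def nonzero_equilibrium(c1, c0=0.5, w1=1.0, w2=1.0):
    """Positive branch of (2a): returns (x0, y0) of P1 = (x0, 0, y0, 0)."""
    y0 = math.sqrt((c1**2 - (w1 * w2) ** 2) / (c0 * w1**2))
    x0 = c1 * y0 / w1**2
    return x0, y0


for c1 in (1.05, 1.15, 1.3):
    x0, y0 = nonzero_equilibrium(c1)
    distance = math.hypot(x0, y0)  # distance of P1 from the critical point Pc
    print(f"c1 = {c1}: P1 = ({x0:.3f}, 0, {y0:.3f}, 0), distance from Pc = {distance:.3f}")
```

Under these assumptions the computed coordinates agree with the quoted ones to within about 0.01, and the distance from $P_c$ grows monotonically with $c_1$, in line with the observation drawn from Fig. 4.<br />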
Figure 4. Effects of the control parameter $c_1$ on the image processing quality; the<br />
system is in different equilibrium states: (a) is the input image; (b) is the<br />
image processing result for $c_1 = 1.05$; (c) is the image processing result<br />
for $c_1 = 1.15$; and (d) is the image processing result for $c_1 = 1.3$, the latter<br />
leading to the optimal result obtained in the corresponding<br />
equilibrium state of the coupled system.<br />
Figure 5. Effects of the control parameter $c_1$ on the quality of the processing of<br />
the input image (a); the system is in different oscillatory states. The results of<br />
the processing are: (b) for $c_1 = 0.6$, (c) for $c_1 = 0.55$, and (d) for $c_1 = 0.50$, the<br />
latter leading to the optimal result obtained in the corresponding oscillatory<br />
state of the coupled system.<br />
IV. BENCHMARKING<br />
In this section we compare the results of the<br />
CNN-based image contrast enhancement techniques published<br />
in the literature so far with those obtained through<br />
processing by the coupled nonlinear oscillators’ paradigm. A<br />
first attempt at CNN-based contrast enhancement was<br />
presented by Márton Csapodi et al. [22]. In this concept,<br />
another well-known contrast enhancement approach, i.e.<br />
adaptive histogram equalization, is emulated by<br />
performing a piecewise linear approximation of different<br />
mapping functions. The technique is computationally intensive<br />
since for each contextual region it requires a histogram<br />
generation, a mapping function calculation and a rescaling of<br />
pixel values according to the new mapping. Mátyás Brendel et<br />
al. [23] addressed the contrast enhancement problem by<br />
proposing a set of linear templates, which however do not<br />
provide good results for all test images due to the highly<br />
nonlinear nature of the images. A. Gacsádi et al. [24]<br />
designed a set of templates for image enhancement by<br />
minimizing the image energy function.<br />
Figure 6. Results of CNN-based contrast enhancement schemes: (a) image<br />
enhancement based on the approach developed by Márton Csapodi et al. [22];<br />
(b) image enhancement based on the approach developed by Mátyás Brendel et<br />
al. [23]; (c) image enhancement based on the approach by A. Gacsádi et al. [24];<br />
(d) edge preservation observed in the approach by A. Gacsádi et al. [24]. All<br />
these results are obtained w.r.t. the input image of Fig. 4a.<br />
The energy function considered consists of two components,<br />
a smoothness constraint and an edge penalty. Thus, to obtain an<br />
optimum result a tradeoff between image smoothness and edge<br />
detection had to be found and adjusted. Applying the approaches<br />
based on the CNN paradigm to the same input image of Fig.<br />
4a, we have obtained an enhanced contrast w.r.t. the input<br />
image but with a loss of key information (see Fig. 6). The parts<br />
of the input image with small gray level differences have been<br />
lost, e.g. the driver’s face, the road, the round lane and the<br />
background. In contrast, the optimum results (Fig. 4d,<br />
Fig. 5d) obtained through the coupled nonlinear oscillators<br />
processing paradigm clearly show that almost all of the basic<br />
information of the same input image is restored during the<br />
image enhancement processing. A comparison of Figures 5 and<br />
6 underscores the superiority of the coupled nonlinear<br />
oscillator based contrast enhancement compared to CNN-based<br />
approaches. The reason for the weakness of the CNN-based<br />
approach lies in the essentially “supervised<br />
training/learning”-like process used to determine the templates.<br />
Because of this, the (linear) templates obtained are only optimal for<br />
the test images. Beyond that, those<br />
templates cannot be expected to be optimal for images experiencing temporal<br />
and/or spatial dynamics.<br />
Having seen the superiority of the coupled oscillators’<br />
based approach, we now propose a hybrid architecture that<br />
combines the strong points of both CNN and the coupled<br />
nonlinear oscillators’ based image processing. The core<br />
processing will be the latter. We thus propose the<br />
realization of the coupled nonlinear oscillators’ image<br />
processing concept on top of a cellular neural network<br />
processor framework. Here, the CNN processors play<br />
the role of a slave system used to solve, in real time, the<br />
nonlinear ordinary differential equations describing the<br />
coupled nonlinear oscillators’ model. Image processing<br />
based on coupled nonlinear oscillators has one major strength:<br />
its processing efficiency depends<br />
neither on the actual image quality nor on its variations or<br />
states, but solely on the coefficients of the nonlinear<br />
differential equations describing the coupled oscillators. The<br />
appropriate coefficients/parameters of the coupled nonlinear<br />
oscillators are determined, as explained in the previous<br />
sections, in an offline bifurcation analysis process. The new<br />
challenge for the CNN processor is then that of<br />
solving these nonlinear differential equations<br />
in real time. Thus, the problem formulation for the CNN<br />
processor system is that of solving a set of highly stiff<br />
nonlinear differential equations having constant coefficients.<br />
The key challenge for this task is solely that of determining<br />
the appropriate templates. This is not trivial and is still<br />
an open issue in the current state of the relevant<br />
literature. We were however able to solve it, and we present the<br />
related results in the other paper that we publish in<br />
the proceedings of the CNNA-2010 conference, <br />
entitled “CNN-based Real-time Computational Engineering”<br />
(see [28]).<br />
V. CONCLUDING REMARKS<br />
CNN needs well-optimized templates to perform any<br />
specific image processing task, and it is well known that<br />
template optimization is still an unsolved obstacle to<br />
straightforward and efficient CNN-based computing. For<br />
dynamic environments linear templates do not provide<br />
robust processing, since every image is different from the<br />
previous one and in reality requires a new set of<br />
appropriate templates for the same processing task. Nonlinear<br />
templates require some preprocessing to be performed on each<br />
image to obtain the template values that are appropriate to the<br />
actual image, leading to serious problems for real-time<br />
applications. The proposed paradigm of coupled nonlinear<br />
oscillators based processing has shown (both analytically and<br />
numerically) domains of efficient contrast enhancement<br />
processing in which the processing quality remains<br />
constant and is insensitive to any spatio-temporal<br />
dynamics that the input images may experience. Furthermore,<br />
in CNN-based processing the template-based computing<br />
also involves the pixel’s neighborhood while processing each<br />
pixel. This is not the case in the nonlinear oscillators based<br />
processing paradigm, as each pixel is processed independently<br />
without taking into account its neighborhood. For both<br />
paradigms, i.e. CNN and nonlinear oscillators, the input image<br />
serves as an initial condition. Both frameworks offer parallel<br />
image processing, with two differences: (a) CNN<br />
templates appear to be sensitive to the training conditions and<br />
lack adaptivity to dynamic environments; (b) the performance<br />
of the coupled oscillator model is insensitive to<br />
both the quality and the dynamics of the input image.<br />
In summary, after analyzing the results obtained, we<br />
propose realizing the coupled nonlinear oscillator<br />
image processing concept on top of a cellular neural<br />
network processor system. The CNN framework is then<br />
viewed as a slave system used to solve, in real time, the<br />
nonlinear ordinary differential equations describing the<br />
coupled nonlinear oscillator model. Image processing<br />
based on coupled nonlinear oscillators has demonstrated its<br />
main strength: its processing efficiency depends<br />
neither on the actual image quality nor on its<br />
variations or states, but solely on the coefficients of the<br />
nonlinear differential equations describing the coupled<br />
oscillators. The appropriate coefficients/parameters of the<br />
coupled nonlinear oscillators are determined in an offline<br />
bifurcation analysis process, which has been explained in<br />
detail earlier in this paper. These coefficients remain<br />
constant and do not need to be recalculated in real time.
REFERENCES<br />
[1] S. Morfu and J. C. Comte, “A nonlinear oscillators<br />
network devoted to image processing,” International<br />
Journal of Bifurcation and Chaos, vol. 14, no. 4, pp.<br />
1385-1394, 2004.<br />
[2] Naoko Kurata, Hitoshi Mahara, Tatsunari Sakurai,<br />
Atsushi Nomura and Hidetoshi Miike, “Image processing<br />
by a coupled non-linear oscillator system,” 23rd<br />
International Technical Conference on Circuits/Systems,<br />
Computers and Communications (ITC-CSCC 2008).<br />
[3] De Liang Wang <strong>and</strong> David Terman, “Locally excitatory<br />
globally inhibitory oscillator networks,” IEEE<br />
Transactions on Neural Networks, vol. 6, no. 1, January<br />
1995.<br />
[4] Xiuwen Liu <strong>and</strong> DeLiang Wang, “Range image<br />
segmentation using a LEGION network,” IEEE<br />
Transactions on Neural Networks, vol. 10, no. 3, May<br />
1999.<br />
[5] D. L. Wang, “Object selection based on oscillatory<br />
correlation,” Elsevier Transactions on Neural Networks,<br />
vol. 12, pp. 579-592, 1999.<br />
[6] Hiroshi Ando, Takashi Morie, Makoto Nagata <strong>and</strong><br />
Atsushi Iwata, “A non-linear oscillator network for grey<br />
level image segmentation <strong>and</strong> PWM / PPM circuits for its<br />
VLSI implementation,” IEICE Trans. Fundamentals, vol.<br />
E83-A, no. 2, pp. 329-336, February 2000.<br />
[7] Hiroshi Ando, Takashi Morie, Makoto Nagata <strong>and</strong><br />
Atsushi Iwata, “Image segmentation/extraction using nonlinear<br />
cellular networks <strong>and</strong> their VLSI implementation<br />
using pulse-coupled modulation techniques,” IEICE<br />
Trans. Fundamentals, vol. E85-A, no.2, pp. 381-388,<br />
February 2002.<br />
[8] Hidehiro Nakano <strong>and</strong> Toshimichi Saito, “Grouping<br />
synchronization in a pulse-coupled network of chaotic<br />
spiking oscillators,” IEEE Transactions on Neural<br />
Networks, vol 15, no.5, September 2004.<br />
[9] Yakov Kazanovich <strong>and</strong> Roman Borisyuk, “Object<br />
selection by an oscillatory neural network,” Elsevier<br />
Transactions on Biosystems, vol. 67, pp. 103-111, August<br />
2002.<br />
[10] Ke Chen <strong>and</strong> DeLiang Wang, “A dynamically coupled<br />
neural oscillator network for image segmentation,”<br />
Elsevier Transactions on Neural Networks, vol. 15, pp.<br />
423-439, April 2002.<br />
[11] M. Strzelecki, “Texture boundary detection using network<br />
of synchronised oscillators,” IEEE Electronics letters, vol.<br />
40, pp. 466-467, ISSN 0013-5194, April 2004.<br />
[12] Michal Strzelecki, Jacques de Certaines, <strong>and</strong> Suhong Ko,<br />
“Segmentation of 3D MR liver images using<br />
synchronised oscillators networks,” Proceedings of the<br />
2007 IEEE International Symposium on Information<br />
Technology Convergence, ISBN: 0-7695-3045-1, pp.<br />
259-263, 2007.<br />
[13] Balarey Yuri, Cohen Alexander, Johnson Walter and<br />
Elinson, “Image processing by oscillatory media,”<br />
Proceedings of SPIE-the International Society for Optical<br />
Engineering, vol. 2430, pp. 198-207, 1994.<br />
[14] R. Forke, Dirk Scheibner, Wolfram Dötzel <strong>and</strong> Jan<br />
Mehner, “Measurement unit for tunable low frequency<br />
vibration detection with MEMS force coupled<br />
oscillators,” Elsevier Transactions on Sensors <strong>and</strong><br />
Actuators, vol. 156, pp. 59-65, 2009.<br />
[15] R. Sepulchre, Derek Paley <strong>and</strong> Naomi Leonard, Lecture<br />
notes in control <strong>and</strong> information sciences, vol. 309/2004,<br />
pp:189-205. ISBN:978-3-540-22861-5, ISSN:0170-8643,<br />
Springer Berlin/Heidelberg, November 2004.<br />
[16] James F. Buckwalter, Aydin Babakhani, Abbas Komijani<br />
and Ali Hajimiri, “An integrated subharmonic coupled-oscillator<br />
scheme for a 60-GHz phased-array transmitter,”<br />
IEEE Transactions on Microwave Theory and<br />
Techniques, vol. 54, no. 12, pp. 4271-4280, December<br />
2006.<br />
[17] G. W. Wei and Y. Q. Jia, “Synchronization-based image<br />
edge detection,” Europhysics Letters, vol. 59, pp. 814-819,<br />
2002.<br />
[18] J. C. Chedjou, On the analysis of nonlinear<br />
electromechanical systems with applications, Shaker<br />
Verlag, ISBN 978-3-8322-3750, 2005.<br />
[19] J. C. Chedjou, H. B. Fotsin, P. Woafo, <strong>and</strong> S. Domngang,<br />
“Analog simulation of the dynamics of a van der Pol<br />
oscillator coupled to a Duffing oscillator,” IEEE<br />
Transactions on Circuits <strong>and</strong> <strong>Systems</strong>-I, vol. 48, no. 06,<br />
pp. 748-757, 2001.<br />
[20] J. C. Chedjou, P. Woafo, and S. Domngang, “Shilnikov<br />
chaos and dynamics of a self-sustained electromechanical<br />
transducer,” ASME Transactions on Vibration and<br />
Acoustics, vol. 123, pp. 170-174, 2001.<br />
[21] J. C. Chedjou, K. Kyamakya, I. Moussa, H. P.<br />
Kuchenbecker, and W. Mathis, “Behavior of a self-sustained<br />
electromechanical transducer and routes to<br />
chaos,” ASME Transactions on Vibration and Acoustics,<br />
vol. 128, pp. 183-192, 2006.<br />
[22] Márton Csapodi and Tamás Roska, “Adaptive histogram<br />
equalization with cellular neural network,” CNNA’96:<br />
Fourth IEEE International Workshop on Cellular Neural<br />
Networks and their Applications, Seville, Spain, June 24-26,<br />
1996.<br />
[23] Mátyás Brendel and Tamás Roska, “Adaptive image<br />
sensing and enhancement using the adaptive cellular<br />
neural network universal machine,” Proceedings of the 6th<br />
IEEE International Workshop on Cellular Neural<br />
Networks and their Applications, Catania, Italy, May 23-25,<br />
2000.<br />
[24] A. Gacsadi, C. Grava and A. Grava, “Medical image<br />
enhancement by using cellular neural networks,”<br />
Proceedings of the IEEE International Conference on<br />
Computers in Cardiology, Lyon, France, Sep 25-28,<br />
2005.<br />
[25] Masaru Nakano <strong>and</strong> Yoshifumi Nishio, “A method of<br />
edge detection using small world cellular neural<br />
network”, International Symposium on Nonlinear Theory<br />
<strong>and</strong> its Applications, NOLTA’07, Vancouver, Canada,<br />
Sep 16-19, 2007.<br />
[26] D. L. Vilarino, D. Cabello, X. M. Pardo and V. M. Brea,<br />
“Cellular neural networks and active contours: A tool for<br />
image segmentation,” Transaction of Elsevier on Image<br />
and Vision Computing, vol. 21, pp. 189-204, 2003.
[27] Yao Li, Liu Jiamin, Xie Yonggui <strong>and</strong> Pei Liuqing,<br />
“Medical image segmentation based on cellular neural<br />
networks”, Science in China series F: information<br />
sciences, ISSN: 1009-2757(print) 1862-2836(online), vol.<br />
44, pp. 68-72, 2007.<br />
[28] J.C. Chedjou, K. Kyamakya, U.A. Khan, M.A. Latif,<br />
“CNN-based real-time computational engineering,”<br />
Proceedings of CNNA 2010, February 3-5, 2010,<br />
Berkeley, California – USA.<br />
[29] S. Taraglio and A. Zanela, “Cellular neural networks: a<br />
genetic algorithm for parameters optimization in artificial<br />
vision applications,” Proceedings of the 4th IEEE<br />
International Workshop on Cellular Neural Networks and<br />
their Applications, pp. 315-320, ISBN: 0-7803-3261-X,<br />
Spain, 1996.<br />
[30] M. Zamparelli, “Genetically trained cellular neural<br />
networks,” Transactions of Elsevier on neural networks,<br />
vol. 10, pp. 1143-1151, 1997.<br />
[31] Samuel Xavier-de-Souza, Müştak E. Yalçın,<br />
Joos Vandewalle, Johan A. K. Suykens,<br />
“Automatic chip-specific CNN template optimization<br />
using adaptive simulated annealing,” Proceedings of the<br />
European Conference on Circuit Theory and Design<br />
(ECCTD ’03).<br />
[32] Brett Chandler, Csaba Rekeczky, Yoshifumi Nishio, Akio<br />
Ushida, “Adaptive simulated annealing in CNN template<br />
learning,” IEICE Trans. Fundamentals, vol. E82-A, no.<br />
2, February 1999.<br />
[33] H. L. Wei, S. A. Billings, “Generalized cellular neural<br />
networks constructed using particle swarm optimization<br />
for spatio-temporal evolutionary pattern identification,”<br />
International journal of Bifurcation <strong>and</strong> Chaos, vol. 18,<br />
pp. 3611-3624, 2008.<br />
[34] Te-Jen Su, Tzu-Hsiang Lin, Jia-Wei Liu, “Particle swarm<br />
optimization for gray scale image noise cancellation,”<br />
Proceedings of the 4th IEEE International Conference on<br />
Intelligent Information Hiding and Multimedia Signal<br />
Processing, Harbin, China, 2008.<br />
Kyandoghere Kyamakya obtained the<br />
M.S. in Electrical Engineering in 1990<br />
at the University of Kinshasa. In 1999<br />
he received his Doctorate in Electrical<br />
Engineering at the University of Hagen<br />
in Germany. He then worked for three<br />
years as a postdoctoral researcher at the<br />
Leibniz University of Hannover in the<br />
field of Mobility Management in<br />
Wireless Networks. From 2002 to 2005<br />
he was junior professor for Positioning Location Based<br />
Services at Leibniz University of Hannover. Since 2005 he has<br />
been full Professor for Transportation Informatics and Director of<br />
the Institute for Smart Systems Technologies at the University<br />
of Klagenfurt in Austria.<br />
Jean Chamberlain Chedjou received<br />
his doctorate in Electrical Engineering<br />
in 2004 at the Leibniz University<br />
of Hannover, Germany. He has been a<br />
DAAD (Germany) scholar and also an<br />
AUF research Fellow (Postdoc.). From<br />
2000 to date he has been a Junior<br />
Associate researcher in the Condensed<br />
Matter section of the ICTP (Abdus<br />
Salam International Centre for Theoretical Physics), Trieste,<br />
Italy. Currently, he is a senior researcher at the Institute for<br />
Smart Systems Technologies of the Alpen-Adria University of<br />
Klagenfurt in Austria. His research interests include<br />
Electronic Circuits Engineering, Chaos Theory, Analog<br />
Systems Simulation, Cellular Neural Networks, Nonlinear<br />
Dynamics, Synchronization and related Applications in<br />
Engineering. He has authored and co-authored 3 books and<br />
more than 40 journal and conference papers.<br />
Cyrille Kalenga Wa Ngoy obtained the ‘Ir. Civil’ degree in<br />
Electrical Engineering at the University of Kinshasa. He has<br />
been an Assistant at the same University, in the Department of<br />
Electrical and Computer Engineering, for about ten years.<br />
Michel Matalatala Tamasala obtained the ‘Ir. Civil’ degree<br />
in Electrical Engineering at the University of Kinshasa. He has<br />
been an Assistant at the same University, in the Department of<br />
Electrical and Computer Engineering, for about four years.
Common-neighbor Monitoring Enhanced<br />
Cooperation Enforcement Scheme for MANETs<br />
JianLi GUO, HongWei LIU, <strong>and</strong> XiaoZong YANG<br />
Abstract—Ad hoc networks are distributed, self-organized<br />
wireless networks. By their nature, it is easy for selfish nodes<br />
to save their energy by not forwarding packets. The existence of<br />
selfish nodes can degrade network performance severely. A<br />
new cooperation enforcement scheme called CMC was proposed<br />
to mitigate this problem. The common-neighbor monitoring<br />
technique was introduced, with which the watchdog could<br />
monitor all packets transmitted around it, so that the system<br />
could detect non-cooperative nodes quickly and easily. In the<br />
route discovery phase, the control messages that contained<br />
non-cooperative nodes were dropped, decreasing the probability<br />
of a well-behaved node using a bad route for data transmission.<br />
The ns-2 simulation results indicated that CMC could improve the<br />
throughput of well-behaved nodes by 10%-40% in the presence<br />
of 10%-60% non-cooperative nodes.<br />
Index Terms—Mobile Ad Hoc networks, selfish node, reputation,<br />
cooperation.<br />
I. INTRODUCTION<br />
MOBILE ad hoc networks[1], [2] are multi-hop, temporary<br />
autonomous systems composed of groups<br />
of mobile nodes. In this environment, the transmission range<br />
of each node is limited to a small area, so two mobile<br />
nodes that are geographically distant need other nodes<br />
to forward packets for them in order to communicate. At present,<br />
the mature routing protocols for mobile ad hoc networks, such as<br />
DSR[3] and AODV[4], all assume that nodes are cooperative and<br />
willing to forward data for other nodes. In recent years, with<br />
the development of hardware technology, all kinds of civilian<br />
ad hoc networks, such as temporary wireless networks in<br />
classrooms, have appeared. In these networks, the nodes<br />
belong to different individuals or organizations,<br />
they have no common purpose, and cooperation among nodes<br />
cannot be guaranteed. In a mobile ad hoc network, where nodes<br />
are typically powered by batteries, energy is very valuable, and<br />
the wireless interface consumes a substantial share of it (more<br />
than 40%)[2], [5]. In order to save energy, selfish nodes may<br />
discard packets that need to be forwarded, thus showing<br />
non-cooperative behavior. In [6], by employing game<br />
theory, the authors proved that, in mobile ad hoc networks,<br />
spontaneous cooperation does not exist and that external<br />
mechanisms to ensure cooperation among nodes are required.<br />
In a mobile ad hoc network, even a small number of<br />
non-cooperative nodes may have a<br />
great impact on network performance. Reference [7] further<br />
pointed out that, if 10%-40% of the nodes in the network<br />
were selfish, the performance of the entire network would drop<br />
by 16%-32%.<br />
Manuscript received April 9, 2009. This work was supported in part by<br />
the Hi-Tech Research and Development Program (863) of China under grant<br />
No. 2008AA01A201 and the National Natural Science Foundation of China<br />
under grant No. 60503015.<br />
JianLi GUO is with the School of computer science and technology, Harbin<br />
Institute of Technology, Harbin, China, 150001 email: gjl@ftcl.hit.edu.cn.<br />
HongWei LIU is with the School of computer science and technology,<br />
Harbin Institute of Technology, Harbin, China, 150001 email:<br />
lhw@ftcl.hit.edu.cn.<br />
XiaoZong YANG is with the School of computer science and technology,<br />
Harbin Institute of Technology, Harbin, China, 150001 email:<br />
yxz@ftcl.hit.edu.cn.<br />
For the node cooperation problem in mobile ad hoc networks,<br />
researchers have proposed many solutions[8], [9],<br />
which fall mainly into two categories: virtual currency based<br />
schemes and reputation based schemes. In the virtual currency<br />
based schemes[5], [10], [11], [12], [13], nodes that<br />
forward packets for other nodes are compensated with<br />
virtual currency to motivate them to cooperate. However, these<br />
schemes have several drawbacks: they need special<br />
hardware[10] or a central server[5], [11], [12], [13], which<br />
violates the distributed nature of ad hoc networks; nodes located<br />
at the edge of the network have few opportunities to forward<br />
packets for other nodes, cannot earn enough currency, and<br />
may therefore be starved[10]; and, in<br />
order to calculate the optimal compensation[11], [12], [13],<br />
nodes need to exchange substantial information, introducing<br />
a large number of control packets into the network. These<br />
drawbacks limit their application in ad hoc networks.<br />
In the reputation based schemes, each node is given a reputation<br />
value[14] according to its behavior, and selfish nodes<br />
are punished. Reference [7] first proposed the use of a watchdog<br />
to detect selfish nodes. Subsequently, [15], [16]<br />
used second-hand information in computing reputation<br />
values to speed up detection, and applied a<br />
Bayesian statistical method to guard against rumor<br />
attacks. Reference [17] focused on security problems in<br />
the calculation of reputation values and proposed a<br />
secure scheme named SORI. References [18], [19] pointed out<br />
that watchdog-based detection techniques were not<br />
accurate enough, and put forward a two-hop ACK detection<br />
method that could detect selfish nodes more accurately. However,<br />
this approach introduced a large number of ACK messages, which<br />
consumed considerable network bandwidth and made the network<br />
more vulnerable to congestion.<br />
References [14], [20] analyzed and summarized the calculation<br />
methods for reputation values, and pointed out that the<br />
use of second-hand information led to several disadvantages:<br />
each node needed to store reputation values for every node in<br />
the network, occupying substantial storage space; the dissemination<br />
of second-hand information among nodes used up a lot<br />
of network bandwidth; each time a node received second-hand<br />
information, it needed to recalculate the reputation values of<br />
all nodes in the network, taking up considerable CPU resources;<br />
and the scheme was more vulnerable to attack. All of this makes<br />
reputation calculation based on first-hand information<br />
more suitable for ad hoc networks. However, the method based on<br />
first-hand information also has one drawback:<br />
the detection of selfish nodes is slower, so<br />
non-cooperative nodes cannot be isolated from the<br />
network quickly and effectively.<br />
In this paper, we propose the CMC scheme (Common-neighbor<br />
Monitoring enabled Cooperation enforcement<br />
scheme), which uses the common-neighbor monitoring<br />
technique to speed up the detection of non-cooperative<br />
nodes. At the same time, the control messages<br />
(RREQs or RREPs) are filtered by the CMC scheme,<br />
which discards control messages that contain<br />
non-cooperative nodes, so that the routes chosen by nodes<br />
bypass non-cooperative nodes as far as possible,<br />
thereby improving network performance.<br />
II. CMC SCHEME<br />
A. The Structure of CMC<br />
CMC is a reputation based cooperation scheme<br />
in which direct (first-hand) information is used to<br />
calculate the reputation value of each node. Like all reputation<br />
based schemes [7], [15], [16], [17], [18], [19], [20], CMC<br />
assumed that non-cooperative nodes took part in the route<br />
discovery phase but might discard packets in the forwarding<br />
phase in order to save their own resources (such<br />
as energy).<br />
CMC was based on the DSR[3] routing protocol and was<br />
located between the network layer and the MAC layer. It comprised<br />
five components: Watchdog, Filter, Neighbor Manager,<br />
Reputation Manager and the Second Chance Mechanism,<br />
as shown in Fig. 1. Among them, the Neighbor Manager<br />
was responsible for maintaining a list of neighbors and for<br />
periodically sending Hello messages; the Watchdog was<br />
responsible for overhearing the channel and passing its<br />
monitoring results to the Reputation Manager; the Reputation<br />
Manager calculated the reputation value of each node and<br />
added non-cooperative nodes to the Black-list; the Filter<br />
had two functions, one being to punish non-cooperative<br />
nodes and the other to filter the routing control messages,<br />
suppressing routes that contained non-cooperative nodes.<br />
The functions and code of the DSR protocol remained<br />
unchanged, except that the FindRoute() function was rewritten.<br />
The new FindRoute() function searched the route cache<br />
against the Black-list and returned only routes that did not<br />
contain black-listed nodes. When a node had<br />
data to send, FindRoute() was first called to<br />
search the route cache for an available route. On a hit, the<br />
route was used to send the data; otherwise, the route<br />
discovery phase had to be restarted to search for new<br />
routes.<br />
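The cache lookup described above can be sketched as follows. This is only a<br />
minimal illustration, not the authors' ns-2 code: the name find_route, the<br />
list-of-node-IDs route representation and the example cache are assumptions.<br />

```python
from typing import Iterable, List, Optional, Sequence, Set


def find_route(route_cache: Iterable[Sequence[str]],
               destination: str,
               black_list: Set[str]) -> Optional[List[str]]:
    """Return the first cached route to `destination` that avoids black-listed nodes.

    Hypothetical stand-in for the rewritten FindRoute(); a route is a list of
    node IDs ordered from source to destination.
    """
    for route in route_cache:
        if not route or route[-1] != destination:
            continue  # this cached route leads somewhere else
        if any(node in black_list for node in route):
            continue  # route passes through a non-cooperative node
        return list(route)
    return None  # cache miss: the caller restarts route discovery


# Example cache with two routes from S to D (node names are illustrative).
cache = [["S", "A", "B", "D"], ["S", "M", "N", "D"]]
```

With node B black-listed, the lookup skips the first route and returns the<br />
second; if every cached route contains a black-listed node, it returns None<br />
and route discovery has to be restarted.<br />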
The packets generated by the DSR protocol were first processed<br />
by the Watchdog and then handed down to the<br />
MAC layer to be sent. The node was set to promiscuous mode,<br />
and the packets received by the MAC layer were handed up<br />
to the Watchdog. Only those packets that were addressed to this<br />
node or needed to be forwarded by this node could pass<br />
the Watchdog. After that, the packets were handed up to the<br />
Filter module, and finally arrived at the DSR protocol,<br />
which processed them.<br />
Fig. 1. The architecture of the CMC scheme<br />
B. Neighbor Manager<br />
The neighbor list recorded the currently active neighbor nodes.<br />
Each neighbor corresponded to one item in the list, which stored<br />
the node’s ID, its rating value and a timeout value.<br />
The rating value represented the reputation of the<br />
neighbor and was initialized to 0.<br />
The Neighbor Manager maintained the neighbor list.<br />
The interface, set to promiscuous mode,<br />
monitored the channel and, each time it received a packet,<br />
sent a copy to the Neighbor Manager. The Neighbor<br />
Manager extracted the node ID from the packet and looked it up<br />
in the neighbor list. If the ID was found, the corresponding<br />
timeout value was updated; otherwise, the node was treated as a<br />
new neighbor and its ID was added to the neighbor list. If the<br />
timeout of an item in the neighbor list expired (10s in our<br />
experiment), the corresponding node was assumed to have moved<br />
out of transmission range, and the item was deleted from the<br />
neighbor list. If a node did not send any data within TNeib<br />
(3s in our experiment), its Neighbor Manager had to<br />
broadcast a Hello message to prevent the node from being deleted<br />
from the neighbor lists of its neighbors.<br />
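The neighbor list bookkeeping can be sketched as below. This is a minimal<br />
sketch under stated assumptions: the class and method names are hypothetical,<br />
time is passed in explicitly as seconds, and only the two constants (10s<br />
timeout, 3s TNeib) come from the text.<br />

```python
from dataclasses import dataclass
from typing import Dict

NEIGHBOR_TIMEOUT = 10.0  # seconds; item lifetime used in the experiment
HELLO_INTERVAL = 3.0     # seconds; T_Neib used in the experiment


@dataclass
class NeighborEntry:
    rating: float = 0.0      # reputation of the neighbor, initialized to 0
    expires_at: float = 0.0  # absolute time at which the item times out


class NeighborManager:
    """Hypothetical sketch of the neighbor list; names are not from the paper."""

    def __init__(self) -> None:
        self.neighbors: Dict[str, NeighborEntry] = {}
        self.last_sent = 0.0  # time of this node's last transmission

    def on_overheard(self, node_id: str, now: float) -> None:
        # A known neighbor gets its timeout refreshed; an unknown one is added.
        entry = self.neighbors.setdefault(node_id, NeighborEntry())
        entry.expires_at = now + NEIGHBOR_TIMEOUT

    def on_sent(self, now: float) -> None:
        self.last_sent = now

    def tick(self, now: float) -> bool:
        """Drop expired neighbors; return True if a Hello should be broadcast."""
        expired = [n for n, e in self.neighbors.items() if e.expires_at <= now]
        for node_id in expired:
            del self.neighbors[node_id]
        return now - self.last_sent >= HELLO_INTERVAL
```

Because T_Neib (3s) is well below the 10s timeout, a silent but still present<br />
node announces itself several times before its neighbors would drop it.<br />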
C. Watchdog<br />
The Watchdog was mainly used to monitor nodes in the neighborhood<br />
and to observe whether they forwarded packets and<br />
whether they modified the packet contents. The Watchdog<br />
maintained one data structure, the packet monitoring buffer.<br />
Packets that needed to be monitored were stored in the<br />
packet monitoring buffer, and each packet corresponded to one<br />
item. An item was composed of the content of the packet, the<br />
ID of the node expected to forward it and a timeout value.<br />
Packets sent by the routing layer (DSR protocol) were<br />
first processed by the Watchdog. Unless the packet’s next hop<br />
was the destination node, a copy of the packet was put<br />
into the packet monitoring buffer.<br />
Packets received by the interface were handed up to<br />
the MAC layer, which delivered them to the<br />
Watchdog. Each time the Watchdog received a packet, it<br />
looked the packet up in the packet monitoring buffer. If it was<br />
found and had not been tampered with, a positive event was sent<br />
to the Reputation Manager, increasing the rating value<br />
of the forwarding node. The Watchdog then<br />
checked the next hop field in the packet header. If it<br />
matched the address of this node, the packet was handed<br />
up to the Filter for further processing. Otherwise, the<br />
packet was not addressed to this node and was discarded. If a<br />
packet in the packet monitoring buffer timed out before the<br />
Watchdog observed it being forwarded, a negative<br />
event was sent to the Reputation Manager to reduce<br />
the rating value of the forwarding node.<br />
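The packet monitoring buffer logic can be sketched as follows. The sketch<br />
makes several assumptions that are not in the text: packets carry a unique<br />
packet_id, tampering is detected by comparing payload bytes, a tampered<br />
copy is simply ignored so that the item later times out, and the 1s timeout<br />
is a placeholder value.<br />

```python
from typing import Callable, Dict, Tuple


class Watchdog:
    """Hypothetical sketch of the packet monitoring buffer; not the authors' code."""

    def __init__(self, report: Callable[[str, bool], None], timeout: float = 1.0):
        self.report = report    # report(node_id, positive) -> Reputation Manager
        self.timeout = timeout  # placeholder; the text gives no buffer timeout
        # packet_id -> (payload, expected forwarder ID, expiry time)
        self.buffer: Dict[str, Tuple[bytes, str, float]] = {}

    def monitor(self, packet_id: str, payload: bytes, forwarder: str, now: float) -> None:
        self.buffer[packet_id] = (payload, forwarder, now + self.timeout)

    def on_overheard(self, packet_id: str, payload: bytes, sender: str) -> None:
        entry = self.buffer.get(packet_id)
        if entry is None:
            return  # not a packet we are watching
        expected_payload, forwarder, _ = entry
        if sender == forwarder and payload == expected_payload:
            del self.buffer[packet_id]
            self.report(forwarder, True)  # forwarded unmodified: positive event

    def expire(self, now: float) -> None:
        overdue = [p for p, (_, _, t) in self.buffer.items() if t <= now]
        for packet_id in overdue:
            _, forwarder, _ = self.buffer.pop(packet_id)
            self.report(forwarder, False)  # never seen forwarded: negative event
```

In a real DSR stack the header changes at every hop, so the comparison would<br />
have to cover only the fields that are invariant along the route.<br />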
D. Common-neighbor Monitoring<br />
From the description of the Watchdog in the previous<br />
section, we know that only nodes located on the route<br />
can watch the forwarding behavior of the next hop node.<br />
Studying the network topology carefully, we found that<br />
nodes at certain other positions could also watch the<br />
forwarding behavior of the next hop node. We therefore<br />
introduced the common-neighbor monitoring technique. Each time the<br />
Watchdog captured a packet, if its current forwarding node and<br />
its next forwarding node were both in the neighbor list, the<br />
packet was put into the packet monitoring buffer and<br />
watched by the Watchdog.<br />
Fig. 2. Common-neighbor monitoring technique<br />
As shown in Fig. 2, node S sends data to node D through<br />
nodes A, B and C, and nodes M and N are both located within the<br />
transmission range of nodes B and C. When node B sends<br />
a packet to node C, nodes M and N are able to capture<br />
the packet and find that its current forwarding node<br />
(node B) and next forwarding node (node C) are both their<br />
own neighbors. Nodes M and N therefore put the packet into their<br />
packet monitoring buffers and watch whether node C forwards<br />
it.<br />
In order to punish non-cooperative nodes, the Filter<br />
discards all packets originating from them. As shown<br />
in Fig. 2, suppose that the source node S is a non-cooperative<br />
node and has been detected by node A. As a<br />
punishment, all packets sent by node S will be discarded by<br />
node A. To prevent the Watchdog from regarding this kind<br />
of punishment as non-cooperation, we require that the Watchdog<br />
does not monitor the forwarding behavior of the first relaying<br />
node on a route. That is, node P will not monitor the<br />
forwarding behavior of node A in Fig. 2.<br />
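The decision a common neighbor makes for each overheard packet can be<br />
sketched as below. The function name and the route representation are<br />
hypothetical, and skipping packets whose next hop is the destination is an<br />
assumption carried over from the Watchdog rule for locally sent packets.<br />

```python
from typing import Sequence, Set


def should_monitor(route: Sequence[str], sender: str, neighbors: Set[str]) -> bool:
    """Decide whether an overheard packet enters the packet monitoring buffer.

    route:     full source route [source, relay 1, ..., destination]
    sender:    node that just transmitted the packet (current forwarding node)
    neighbors: IDs currently in this node's neighbor list
    """
    i = route.index(sender)
    if i + 1 >= len(route):
        return False  # the sender is the destination; nobody forwards further
    next_hop = route[i + 1]
    if next_hop == route[-1]:
        return False  # assumption: the destination is not expected to forward
    if i == 0:
        return False  # next hop is the first relay, which is exempt
    # Common-neighbor rule: both forwarding nodes must be our neighbors.
    return sender in neighbors and next_hop in neighbors
```

For the route S, A, B, C, D of Fig. 2, a node whose neighbor list contains<br />
both B and C monitors the packet B sends to C, while the packet S sends to A<br />
is never monitored.<br />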
E. Reputation Management<br />
The Reputation Manager was responsible for updating each<br />
node’s reputation value. A good reputation system[14] should<br />
have the following characteristics: the reputation value<br />
accurately reflects the behavior of the node; a node’s recent<br />
actions have a greater impact on the reputation value than<br />
its past behavior; and<br />
non-cooperative nodes are diagnosed quickly. The<br />
Reputation Manager therefore calculated the reputation values of nodes<br />
in accordance with equation (1):<br />
$$R_{new} = \begin{cases} 0.95 \times R_{old} + 0.05, & \text{if positive event} \\ 0.90 \times R_{old} - 0.1, & \text{if negative event} \end{cases} \tag{1}$$<br />
If the Reputation Manager received a positive event, the<br />
new reputation value of the node was the old value<br />
discounted by a factor of 0.95, plus 0.05; if the Reputation<br />
Manager received a negative event, the new reputation<br />
value was the old value discounted by a factor of<br />
0.9, minus 0.1. Finally, if the reputation value of a<br />
node fell below a threshold (-0.5 in our experiment), the<br />
node was considered non-cooperative and put<br />
into the Black-list.<br />
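The update rule of equation (1) and the Black-list threshold can be sketched in a few lines of Python. This is a minimal illustration, not the authors' ns-2 code: the function names are hypothetical, and the starting reputation of 0 is an assumption suggested by the later remark that values are "not reset to 0".<br />

```python
# Constants taken from equation (1) and the threshold quoted in the text.
POSITIVE_DISCOUNT = 0.95
POSITIVE_BONUS = 0.05
NEGATIVE_DISCOUNT = 0.90
NEGATIVE_PENALTY = 0.1
BLACKLIST_THRESHOLD = -0.5


def update_reputation(r_old: float, positive: bool) -> float:
    """Apply one step of equation (1) to a reputation value."""
    if positive:
        return POSITIVE_DISCOUNT * r_old + POSITIVE_BONUS
    return NEGATIVE_DISCOUNT * r_old - NEGATIVE_PENALTY


def is_non_cooperative(reputation: float) -> bool:
    """A node is black-listed once its reputation is below the threshold."""
    return reputation < BLACKLIST_THRESHOLD


def events_until_blacklisted(r_start: float = 0.0) -> int:
    """Count the consecutive negative events needed to cross the threshold.

    r_start = 0.0 is an assumed initial reputation (hypothetical).
    """
    r, count = r_start, 0
    while not is_non_cooperative(r):
        r = update_reputation(r, positive=False)
        count += 1
    return count
```

Under these assumptions, repeated positive events drive the value towards 1 and repeated negative events drive it towards -1 (the fixed points of the two branches), and seven consecutive negative events are needed to take a node from 0 to below -0.5.<br />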
F. Filter<br />
Filter primarily filtered passing routing control messages<br />
(RREQs or RREPs) and data packets according to the<br />
Black-list. If a routing control message contained<br />
any node listed in the Black-list (i.e., a non-cooperative<br />
node), the route being discovered was considered “bad”<br />
and the control message was discarded. In addition, the Filter<br />
checked all passing data packets.<br />
As punishment to non-cooperative nodes, each packet whose<br />
source node was in the Black-list was dropped. See<br />
Table I for the detailed algorithm.<br />
In the route discovery phase, all nodes located<br />
between the source and destination nodes performed route filtering.<br />
Therefore, the routes found by the source node could bypass<br />
non-cooperative nodes as far as possible, increasing the<br />
success rate of sending data.<br />
TABLE I<br />
FILTER ALGORITHM<br />
Receive a packet;<br />
if RREP or RREQ, and contains nodes in black-list then<br />
  Suppress this packet, and return;<br />
else if data packet, and source node in black-list then<br />
  Suppress this packet, and return;<br />
end if<br />
Hand this packet to route layer;<br />
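The filter logic of Table I can be written as a short Python sketch. The `Packet` type and its field names are hypothetical stand-ins for whatever packet structure the routing implementation uses; only the decision logic follows the table.<br />

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Packet:
    """Hypothetical packet record used only for this illustration."""
    kind: str        # "RREQ", "RREP" or "DATA"
    source: str      # id of the originating node
    route_nodes: List[str] = field(default_factory=list)  # nodes named in a control message


def filter_packet(packet: Packet, black_list: Set[str]) -> bool:
    """Return True if the packet should be handed to the routing layer."""
    if packet.kind in ("RREQ", "RREP"):
        # Control message: suppress it if it names any black-listed node.
        if any(node in black_list for node in packet.route_nodes):
            return False
    elif packet.kind == "DATA" and packet.source in black_list:
        # Data packet: suppress it if the source is black-listed.
        return False
    return True
```

For example, with a Black-list containing node "X", a RREQ whose recorded route passes through "X" is suppressed, while a data packet from any other source is passed on.<br />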
G. Second Chance Mechanism<br />
References [7], [19], [20] observed that several factors,<br />
such as signal collisions, network congestion and temporary<br />
link failures, can affect the detection results of Watchdog<br />
and may lead to cooperative nodes being wrongly marked<br />
as non-cooperative (in our experiments, we<br />
also found that when the network load was heavy, network<br />
congestion was likely, and cooperative nodes<br />
could appear in the Black-list). In addition, nodes that had<br />
been detected as non-cooperative early on<br />
may behave cooperatively later. To give<br />
such isolated nodes a “rehabilitative” opportunity to<br />
re-join the network,<br />
CMC introduced the second chance mechanism.<br />
After a fixed period of time, nodes were released from<br />
the Black-list, but their reputation values were not reset to<br />
0; they kept their current values. If these<br />
nodes behaved non-cooperatively again, they could<br />
be quickly re-added to the Black-list.<br />
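A minimal sketch of the second chance mechanism follows. The class name, the release period `RELEASE_AFTER` (the paper only says "a fixed period of time") and the initial reputation of 0 are assumptions made for illustration; the key point is that release removes the node from the list but leaves its reputation untouched.<br />

```python
from typing import Dict

BLACKLIST_THRESHOLD = -0.5   # threshold reported for the experiments
RELEASE_AFTER = 100.0        # hypothetical isolation period, in seconds


class BlackList:
    """Toy black-list with timed release; reputations survive the release."""

    def __init__(self) -> None:
        self.reputation: Dict[str, float] = {}
        self.listed_at: Dict[str, float] = {}

    def report(self, node: str, positive: bool, now: float) -> None:
        """Update the reputation with equation (1) and list the node if needed."""
        r_old = self.reputation.get(node, 0.0)  # assumed initial value 0
        r_new = 0.95 * r_old + 0.05 if positive else 0.90 * r_old - 0.1
        self.reputation[node] = r_new
        if r_new < BLACKLIST_THRESHOLD and node not in self.listed_at:
            self.listed_at[node] = now

    def release_expired(self, now: float) -> None:
        """Release nodes whose isolation period is over, keeping reputations."""
        for node, since in list(self.listed_at.items()):
            if now - since >= RELEASE_AFTER:
                del self.listed_at[node]

    def is_listed(self, node: str) -> bool:
        return node in self.listed_at
```

Because a released node keeps its low reputation, a single further negative event puts it straight back on the list, whereas a node starting from 0 would need several.<br />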
III. SIMULATION AND RESULTS ANALYSIS<br />
In this section, ns-2 [21] was used to verify the CMC<br />
scheme and observe its impact on network performance.<br />
In the simulation, the setdest tool from CMU was<br />
used to generate node movement scenarios. The<br />
traffic generation tool from CMU could not meet<br />
our requirements, so we modified it to<br />
satisfy the following: the source and destination nodes of each connection<br />
are randomly distributed in the network; the start time of each<br />
connection is uniformly distributed in [0s, 1000s]; the duration<br />
of each connection is uniformly distributed in [50s, 100s].<br />
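The three traffic requirements above can be illustrated with a small generator. This is not the modified CMU tool; the function and type names are hypothetical, and drawing two distinct endpoints per connection is an assumption.<br />

```python
import random
from typing import List, NamedTuple


class Connection(NamedTuple):
    source: int
    destination: int
    start: float      # seconds
    duration: float   # seconds


def generate_connections(num_nodes: int, num_connections: int,
                         seed: int = 0) -> List[Connection]:
    """Draw random connections matching the stated distributions."""
    rng = random.Random(seed)
    connections = []
    for _ in range(num_connections):
        # Two distinct nodes chosen uniformly at random (assumption).
        source, destination = rng.sample(range(num_nodes), 2)
        start = rng.uniform(0.0, 1000.0)     # start time in [0 s, 1000 s]
        duration = rng.uniform(50.0, 100.0)  # duration in [50 s, 100 s]
        connections.append(Connection(source, destination, start, duration))
    return connections
```

Calling `generate_connections(50, 50)` gives a scenario of the size used in the static and small dynamic experiments.<br />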
The basic parameters of the simulations are shown in Table II.<br />
The sending and receiving transmission ranges are 250m, the<br />
maximum transmission rate is 2Mbits/s, the simulation duration<br />
is 1000 seconds, the traffic type used in the simulations is CBR,<br />
and the size of each CBR packet is 512 bytes. In the simulation,<br />
the proportion of non-cooperative nodes is varied from 10%<br />
to 60%, and the impact of non-cooperative nodes on<br />
network performance is observed. For each set of parameters,<br />
the simulation is run 10 times, and the averaged result is<br />
used.<br />
TABLE II<br />
BASIC PARAMETERS FOR SIMULATION<br />
Simulation time: 1000 s<br />
Transmission range: 250 m<br />
Receiver range: 250 m<br />
Carrier sense range: 550 m<br />
Maximum pause time: 100 s<br />
Traffic type: CBR<br />
Packet size: 512 bytes<br />
CBR rate: 5 pkt/s<br />
The following two metrics are used to assess the network<br />
performance:<br />
• throughput: the ratio of the packets that successfully<br />
arrived at the destination nodes to the packets that were<br />
sent by the source nodes.<br />
• forwarding throughput: the throughput computed after removing<br />
those packets that were sent directly to destination nodes.<br />
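The two metrics can be computed from a per-packet log as sketched below. The `PacketRecord` type is hypothetical; in particular, the `direct` flag is assumed to mark packets whose route had no relaying node.<br />

```python
from typing import Iterable, NamedTuple


class PacketRecord(NamedTuple):
    delivered: bool   # did the packet reach its destination?
    direct: bool      # was it sent straight to the destination (no relay)?


def throughput(records: Iterable[PacketRecord]) -> float:
    """Fraction of sent packets that arrived at their destination."""
    records = list(records)
    if not records:
        return 0.0
    return sum(r.delivered for r in records) / len(records)


def forwarding_throughput(records: Iterable[PacketRecord]) -> float:
    """Throughput over the packets that needed at least one relay."""
    relayed = [r for r in records if not r.direct]
    return throughput(relayed)
```

Excluding direct packets isolates the effect of non-cooperative relays, since a packet sent in one hop cannot be dropped by an intermediate node.<br />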
Fig. 3. Throughput in static network (670 × 670 m², 50 nodes). Plot of<br />
throughput (0.0 to 1.0) against the fraction of misbehaving nodes (10% to 60%)<br />
for the CMC, Pathrater and Defenseless schemes.<br />
Fig. 4. Forwarding throughput in static network (670 × 670 m², 50 nodes). Plot of<br />
forwarding throughput (0.0 to 1.0) against the fraction of misbehaving nodes (10% to 60%)<br />
for the CMC, Pathrater and Defenseless schemes.<br />
First, the impact of non-cooperative nodes on throughput<br />
in a static network is studied. The simulation region<br />
is 670 × 670 m², 50 nodes are randomly distributed in the<br />
region, and the number of CBR connections<br />
is 50. The simulation results are shown in Fig. 3 – Fig. 5, where<br />
the curve named defenseless means that no cooperation scheme<br />
is used and the curve named pathrater denotes the scheme<br />
proposed in [7]. As can be seen in Fig. 3, all three<br />
curves descend as the number of non-cooperative nodes<br />
in the network increases. The throughput of the CMC scheme<br />
shows no obvious improvement over that of the pathrater<br />
scheme, although once the proportion of non-cooperative nodes<br />
exceeds 50% the CMC scheme shows a slight advantage. The<br />
main reason is that, in a static network, the pathrater scheme will<br />
eventually detect all the non-cooperative nodes (only<br />
more slowly), and routes subsequently chosen by source nodes<br />
can bypass these non-cooperative nodes.<br />
Fig. 5. Average route length in static network (670 × 670 m², 50 nodes). Plot of<br />
average route length (2.0 to 2.4) against the fraction of misbehaving nodes (10% to 60%)<br />
for the CMC, Pathrater and Defenseless schemes.<br />
Forwarding throughput is shown in Fig. 4. As<br />
the proportion of non-cooperative nodes increases, the curve<br />
corresponding to defenseless declines sharply, which indicates<br />
that non-cooperative nodes have a significant impact<br />
on network performance. The figure also<br />
shows that, compared with the pathrater scheme, the CMC scheme<br />
clearly improves forwarding throughput.<br />
In the experiments, the average route lengths of the packets<br />
that arrived at destination nodes were recorded, and the<br />
results are shown in Fig. 5. The CMC and<br />
pathrater schemes have comparable average route lengths,<br />
which are clearly greater than that of the defenseless scheme.<br />
The reason is that, in the CMC and pathrater schemes,<br />
source nodes with data to send do not select the shortest<br />
routes, but choose routes that can bypass the<br />
non-cooperative nodes.<br />
Next, throughput in a small dynamic network is studied.<br />
The random waypoint model is chosen for node movement,<br />
the maximum node velocity is 10m/s, and the maximum pause<br />
time is 100s. The simulation region is 670 × 670 m², the number<br />
of nodes is 50, and the number of CBR<br />
connections is 50. Simulation results are shown in Fig. 6<br />
– Fig. 8. When the proportion of non-cooperative nodes in the<br />
network is greater than 30%, the CMC scheme is superior to the<br />
pathrater scheme. In addition, compared with the static network,<br />
all three curves are lower in the dynamic network. The main<br />
reason is that, in a dynamic network, node movement<br />
often causes link disconnections and hence frequent route<br />
changes, which lead to some packet loss.<br />
Fig. 8 shows the average route length of packets in the dynamic<br />
network. The CMC and pathrater schemes have<br />
longer average route lengths than the defenseless scheme.<br />
Fig. 6. Throughput in small dynamic network (670 × 670 m², 50 nodes). Plot of<br />
throughput (0.0 to 1.0) against the fraction of misbehaving nodes (10% to 60%)<br />
for the CMC, Pathrater and Defenseless schemes.<br />
Fig. 7. Forwarding throughput in small dynamic network (670 × 670 m², 50<br />
nodes). Plot of forwarding throughput (0.0 to 1.0) against the fraction of<br />
misbehaving nodes (10% to 60%) for the CMC, Pathrater and Defenseless schemes.<br />
Fig. 8. Average route length in small dynamic network (670 × 670 m², 50<br />
nodes). Plot of average route length (1.6 to 2.4) against the fraction of<br />
misbehaving nodes (10% to 60%) for the CMC, Pathrater and Defenseless schemes.<br />
Last, throughput in a large dynamic network is studied:<br />
the number of nodes is increased to 100, and the simulation<br />
region is correspondingly enlarged from 670 × 670 m² to<br />
1200 × 1200 m². Simulation results are shown in Fig. 9 –<br />
Fig. 11. The CMC scheme is clearly superior<br />
to the pathrater scheme, mainly because the expansion of the<br />
network makes the average route length larger. In Fig. 11,<br />
the average route length of the CMC scheme is more than 3.6<br />
and that of the pathrater scheme is also greater than 3.4, which<br />
means that, on average, each packet is relayed by about 2.5<br />
nodes (a route of roughly 3.5 hops has one fewer relaying node<br />
than hops). In the pathrater scheme, the source nodes can only find<br />
non-cooperative nodes within their one-hop scope. For<br />
nodes located two or more hops away, the<br />
source nodes cannot judge their behavior. Thus,<br />
the source nodes are likely to choose routes containing<br />
non-cooperative nodes, decreasing the throughput. By contrast,<br />
in the CMC scheme, the routing control messages are filtered,<br />
which makes the routes chosen by the source nodes more<br />
likely to bypass the non-cooperative nodes.<br />
Fig. 9. Throughput in large dynamic network (1200 × 1200 m², 100 nodes). Plot of<br />
throughput (0.0 to 1.0) against the fraction of misbehaving nodes (10% to 60%)<br />
for the CMC, Pathrater and Defenseless schemes.<br />
IV. CONCLUSION AND FUTURE WORK<br />
In this paper, the CMC method was proposed, which could<br />
quickly detect non-cooperative nodes in mobile ad hoc<br />
networks, isolating them and reducing their impact on network<br />
performance. The common-neighbors monitoring<br />
technique sped up the detection of non-cooperative<br />
nodes, allowing the system to isolate selfish nodes<br />
quickly. In the CMC method, all nodes between the source and<br />
destination nodes suppressed control messages that contained<br />
non-cooperative nodes, further reducing the<br />
impact that non-cooperative nodes had on the network.<br />
The simulation results showed that, when non-cooperative<br />
nodes existed in the network, CMC could significantly improve<br />
node throughput and network performance.<br />
Fig. 10. Forwarding throughput in large dynamic network (1200 × 1200 m²,<br />
100 nodes). Plot of forwarding throughput (0.0 to 1.0) against the fraction of<br />
misbehaving nodes (10% to 60%) for the CMC, Pathrater and Defenseless schemes.<br />
Fig. 11. Average route length in large dynamic network (1200 × 1200 m²,<br />
100 nodes). Plot of average route length (2.6 to 4.0) against the fraction of<br />
misbehaving nodes (10% to 60%) for the CMC, Pathrater and Defenseless schemes.<br />
In the next step, we will port the code from ns-2 to Linux to<br />
study CMC’s performance in a real-life environment.<br />
We are also considering the impact that different kinds of<br />
end-user applications (e.g. video and audio) have on CMC’s<br />
performance.<br />
REFERENCES<br />
[1] I. Chlamtac, M. Conti, and J. Liu, “Mobile ad hoc networking: imperatives<br />
and challenges,” Ad Hoc Networks, vol. 1, no. 1, pp. 13–64, 2003.<br />
[2] S. Basagni, M. Conti, S. Giordano, and I. Stojmenovic, Mobile<br />
Ad Hoc Networking. New Jersey: Wiley-IEEE Press, 2004.<br />
[3] D. Johnson, D. Maltz, Y. Hu, and J. Jetcheva, “The dynamic source<br />
routing protocol for mobile ad hoc networks (DSR),” 2002.<br />
[4] C. Perkins and E. Royer, “Ad-hoc on-demand distance vector routing,”<br />
in Proceedings of the 2nd IEEE Workshop on Mobile Computing Systems<br />
and Applications, vol. 2, 1999, pp. 90–100.<br />
[5] D. J. G, “Game-theoretic power management in mobile ad hoc networks,”<br />
Ph.D. thesis, Carnegie Mellon University Department of Electrical<br />
and Computer Engineering, Pittsburgh, Pennsylvania, Aug. 2004.<br />
[6] M. Felegyhazi, J. Hubaux, and L. Buttyan, “Nash equilibria of packet<br />
forwarding strategies in wireless ad hoc networks,” IEEE Transactions<br />
on Mobile Computing, vol. 5, no. 5, pp. 463–476, 2006.<br />
[7] S. Marti, T. Giuli, K. Lai, and M. Baker, “Mitigating routing misbehavior<br />
in mobile ad hoc networks,” in Proceedings of the 6th annual<br />
international conference on Mobile computing and networking. ACM<br />
New York, NY, USA, 2000, pp. 255–265.<br />
[8] G. Marias, P. Georgiadis, D. Flitzanis, and K. Mandalas, “Cooperation<br />
enforcement schemes for MANETs: A survey,” Wireless Communications<br />
and Mobile Computing, vol. 6, no. 3, pp. 319–332, 2006.<br />
[9] Y. Yoo and D. Agrawal, “Why does it pay to be selfish in a MANET?”<br />
IEEE Wireless Communications, vol. 13, no. 6, pp. 87–97, 2006.<br />
[10] L. Buttyan and J. Hubaux, “Nuglets: a virtual currency to stimulate<br />
cooperation in self-organized mobile ad hoc networks,” ICCA, Swiss<br />
Federal Institute of Technology, 2001.<br />
[11] L. Anderegg and S. Eidenbenz, “Ad hoc-VCG: a truthful and cost-efficient<br />
routing protocol for mobile ad hoc networks with selfish<br />
agents,” in Proceedings of the 9th annual international conference on<br />
Mobile computing and networking. ACM New York, NY, USA, 2003,<br />
pp. 245–259.<br />
[12] Y. Wang and M. Singhal, “On improving the efficiency of truthful routing<br />
in MANETs with selfish nodes,” Pervasive and Mobile Computing,<br />
vol. 3, no. 5, pp. 537–559, 2007.<br />
[13] S. Eidenbenz, G. Resta, and P. Santi, “The COMMIT Protocol for<br />
Truthful and Cost-Efficient Routing in Ad Hoc Networks with Selfish<br />
Nodes,” IEEE Transactions on Mobile Computing, vol. 7, no. 1, pp.<br />
19–33, 2008.<br />
[14] S. Buchegger, J. Mundinger, and J. Le Boudec,<br />
“Reputation Systems for Self-Organized Networks: Lessons<br />
Learned,” IEEE Technology &amp; Society Magazine, 2007.<br />
[15] S. Buchegger, “Coping with misbehavior in mobile ad-hoc networks,”<br />
Ph.D. dissertation, Ecole Polytechnique Federale de Lausanne, 2004.<br />
[16] S. Buchegger and J. Le Boudec, “Self-policing mobile ad hoc networks<br />
by reputation systems,” IEEE Communications Magazine, vol. 43, no. 7,<br />
pp. 101–107, 2005.<br />
[17] Q. He, D. Wu, and P. Khosla, “A secure incentive architecture for ad<br />
hoc networks,” Wireless Communications and Mobile Computing, vol. 6,<br />
no. 3, 2006.<br />
[18] K. Liu, J. Deng, P. Varshney, and K. Balakrishnan, “An<br />
acknowledgment-based approach for the detection of routing<br />
misbehavior in MANETs,” IEEE Transactions on Mobile Computing,<br />
vol. 6, no. 5, pp. 536–550, 2007.<br />
[19] D. Djenouri and N. Badache, “Struggling against selfishness and black<br />
hole attacks in MANETs,” Wireless Communications and Mobile Computing,<br />
vol. 8, no. 6, 2008.<br />
[20] H. J. Y, “Cooperation in mobile ad hoc networks,”<br />
http://www.cs.fsu.edu/research/reports/TR-050111.pdf, January 2005.<br />
[21] K. Fall and K. Varadhan, “The ns manual (formerly ns notes and<br />
documentation),” The VINT Project, 2008.<br />
Jianli Guo received BS and MS degrees in computer science and technology from<br />
Harbin Institute of Technology (HIT) in 2002 and 2004 respectively, and is now a<br />
PhD student at HIT, with research interests in ad hoc networks and wireless<br />
sensor networks.<br />
Hongwei Liu is a professor at HIT. His research interests include fault-tolerant<br />
computing technology, ad hoc networks and wireless sensor networks.<br />
Xiaozong Yang is a professor at HIT. His research interests include fault-tolerant<br />
computing technology, computer architecture, ad hoc networks and wireless<br />
sensor networks.<br />
Systemic Risk Assessment using a Non-stationary<br />
Fractional Dynamic Stochastic Model for the<br />
Analysis of Economic Signals<br />
Jonathan M Blackledge, Fellow, IET, Fellow, IoP, Fellow, IMA, Fellow, RSS<br />
Abstract— This paper considers the Fractal Market Hypothesis<br />
(FMH) for assessing the risk(s) in developing a financial portfolio<br />
based on data that is available through the Internet from an<br />
increasing number of sources. Most financial risk management<br />
systems are still based on the Efficient Market Hypothesis, which<br />
often fails due to the inaccuracies of the statistical models that<br />
underpin the hypothesis, in particular, that financial data are<br />
based on stationary Gaussian processes. The FMH considered<br />
in this paper assumes that financial data are non-stationary and<br />
statistically self-affine so that a risk analysis can, in principle, be<br />
applied at any time scale provided there is sufficient data to make<br />
the output of an FMH analysis statistically significant. This paper<br />
considers a numerical method and an algorithm for accurately<br />
computing a parameter - the Fourier dimension - that serves<br />
in the assessment of a financial forecast and is applied to data<br />
taken from the Dow Jones and FTSE financial indices. A more<br />
detailed case study is then presented based on an FMH analysis<br />
of Sub-Prime Credit Default Swap Market ABX Indices.<br />
Index Terms— Risk assessment of economy, Risk assessment<br />
statistics and numerical data, Fractal Market Hypothesis, FTSE,<br />
Dow Jones and ABX index.<br />
I. INTRODUCTION<br />
Attempts to develop stochastic models for financial time<br />
series are commonplace in financial mathematics and econometrics<br />
in general. Financial time series are essentially digital<br />
signals composed of ‘tick data’ that provides traders with daily<br />
tick-by-tick data of trade price, trade time, and volume traded,<br />
for example, at different sampling rates [1], [2]. Stochastic<br />
financial models can be traced back to the early Twentieth<br />
Century when Louis Bachelier [3] proposed that fluctuations<br />
in the prices of stocks and shares (which appeared to be<br />
yesterday’s price plus some random change) could be viewed<br />
in terms of random walks in which price changes were entirely<br />
independent of each other. Thus, one of the simplest models<br />
for price variation is based on the sum of independent random<br />
numbers. This is the basis for Brownian motion [4] in which<br />
the random numbers are considered to conform to a normal<br />
distribution. This model is the basis for the Efficient Market<br />
Hypothesis (EMH) which has a number of questionable assumptions<br />
as discussed in the following section. In this paper,<br />
we consider a method for processing financial time series<br />
data based on the Fractal Market Hypothesis. The underlying<br />
rationale for this model is discussed and example results are<br />
presented to illustrate the ability of the model to provide<br />
an improved risk assessment of an economy with regard to<br />
predicting the characteristics of an economic time series based<br />
on a risk assessment statistic computed from numerical data. A<br />
case study is presented that is based on the sub-prime credit<br />
default swap market ABX index which is acknowledged as<br />
being one of the principal markets whose collapse triggered<br />
the current global recession.<br />
Manuscript completed in December, 2009. The work reported in this paper<br />
is supported by the Science Foundation Ireland.<br />
Jonathan Blackledge (jonathan.blackledge@dit.ie) is the Stokes Professor<br />
of Digital Signal Processing, School of Electrical Engineering<br />
Systems, Faculty of Engineering, Dublin Institute of Technology<br />
(http://eleceng.dit.ie/blackledge).<br />
II. BROWNIAN MOTION AND THE EFFICIENT MARKET<br />
HYPOTHESIS<br />
Random walk models, which underpin the so called Efficient<br />
Market Hypothesis (EMH) [5]-[12], have been the basis<br />
for financial time series analysis since the work of Bachelier<br />
in the early Twentieth Century. Although the Black-Scholes<br />
equation [13], developed in the 1970s for valuing options, is<br />
deterministic (one of the first financial models to achieve determinism),<br />
it is still based on the EMH, i.e. stationary Gaussian<br />
statistics. The EMH is based on the principle that the current<br />
price of an asset fully reflects all available information relevant<br />
to it and that new information is immediately incorporated<br />
into the price. Thus, in an efficient market, the modelling<br />
of asset prices is concerned with modelling the arrival of<br />
new information. New information must be independent and<br />
random, otherwise it would have been anticipated and would<br />
not be new. The arrival of new information can send ‘shocks’<br />
through the market (depending on the significance of the<br />
information) as people react to it and then to each other’s<br />
reactions. The EMH assumes that there is a rational and<br />
unique way to use the available information and that all agents<br />
possess this knowledge. Further, the EMH assumes that this<br />
‘chain reaction’ happens effectively instantaneously. These<br />
assumptions are clearly questionable at any and all levels of<br />
a complex financial system.<br />
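The random-walk picture underlying the EMH (each price is the previous price plus an independent, normally distributed change) can be illustrated with a short simulation. This is only a sketch: the function name, the starting price and the step size are hypothetical values chosen for illustration.<br />

```python
import random
from typing import List


def random_walk_prices(n_steps: int, start_price: float = 100.0,
                       step_std: float = 1.0, seed: int = 0) -> List[float]:
    """Simulate prices as a running sum of independent Gaussian increments."""
    rng = random.Random(seed)
    prices = [start_price]
    for _ in range(n_steps):
        # Each change is drawn independently of all earlier changes.
        prices.append(prices[-1] + rng.gauss(0.0, step_std))
    return prices
```

In such a series the increments carry no memory, which is exactly the independence assumption that the following discussion calls into question for real financial data.<br />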
The EMH implies independence of price increments and is<br />
typically characterised by a normal or Gaussian Probability<br />
Density Function (PDF) which is chosen because most price<br />
movements are presumed to be an aggregation of smaller<br />
ones, the sums of independent random contributions having a<br />
Gaussian PDF. However, it has long been known that financial<br />
time series do not follow random walks. The shortcomings<br />
of the EMH model include: failure of the independence and<br />
Gaussian distribution of increments assumption, clustering,<br />
apparent non-stationarity and failure to explain momentous<br />
financial events such as ‘crashes’ leading to recession and,<br />
in some extreme cases, depression. These limitations have<br />
prompted a new class of methods for investigating time series<br />
obtained from a range of disciplines. For example, Re-scaled<br />
Range Analysis (RSRA), e.g. [14]-[16], which is essentially<br />
based on computing the Hurst exponent [17], is a useful tool<br />
for revealing some well disguised properties of stochastic time<br />
series such as persistence (and anti-persistence) characterized<br />
by non-periodic cycles. Non-periodic cycles correspond to<br />
trends that persist for irregular periods but with a degree of<br />
statistical regularity often associated with non-linear dynamical<br />
systems. RSRA is particularly valuable because of its<br />
robustness in the presence of noise. The principal assumption<br />
associated with RSRA is concerned with the self-affine or<br />
fractal nature of the statistical character of a time-series rather<br />
than the statistical ‘signature’ itself. Ralph Elliott first reported<br />
on the fractal properties of financial data in 1938 (e.g. [18] and<br />
reference therein). He was the first to observe that segments<br />
of financial time series data of different sizes could be scaled<br />
in such a way that they were statistically the same, producing<br />
so called Elliott waves.<br />
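As an illustration of re-scaled range analysis, the sketch below estimates a Hurst exponent from a series of increments using the textbook R/S procedure. It is not taken from the paper or from [14]-[17]; the function names, the doubling window sizes and the minimum window of 8 samples are all assumptions, and the input needs at least four minimum windows of data.<br />

```python
import math
from typing import Sequence


def rescaled_range(chunk: Sequence[float]) -> float:
    """R/S of one window: range of cumulative deviations divided by the std."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else 0.0


def hurst_exponent(increments: Sequence[float], min_window: int = 8) -> float:
    """Estimate H as the least-squares slope of log(R/S) against log(window)."""
    n = len(increments)
    log_w, log_rs = [], []
    window = min_window
    while window <= n // 2:
        # Average R/S over non-overlapping windows of the current size.
        values = [rescaled_range(increments[i:i + window])
                  for i in range(0, n - window + 1, window)]
        values = [v for v in values if v > 0]
        if values:
            log_w.append(math.log(window))
            log_rs.append(math.log(sum(values) / len(values)))
        window *= 2
    mean_x = sum(log_w) / len(log_w)
    mean_y = sum(log_rs) / len(log_rs)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_w, log_rs))
    den = sum((x - mean_x) ** 2 for x in log_w)
    return num / den
```

For independent increments the estimate should lie near 0.5, while persistent series give larger values; the simple estimator is known to be biased upwards on short windows, so values should be read comparatively rather than as exact.<br />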
III. RISK ASSESSMENT AND REPEATING ECONOMIC<br />
PATTERNS<br />
A good stochastic financial model should ideally consider<br />
all the observable behaviour of the financial system it is<br />
attempting to model. It should therefore be able to provide<br />
some predictions on the immediate future behaviour of the<br />
system within an appropriate confidence level. Predicting the<br />
markets has become (for obvious reasons) one of the most<br />
important problems in financial engineering. Although, at least<br />
in principle, it might be possible to model the behaviour of<br />
each individual agent operating in a financial market, one<br />
can never be sure of obtaining all the necessary information<br />
required on the agents themselves and their modus operandi.<br />
This principle plays an increasingly important role as the<br />
scale of the financial system, for which a model is required,<br />
increases. Thus, while quasi-deterministic models can be of<br />
value in the understanding of micro-economic systems (with<br />
known ‘operational conditions’), in an ever increasing global<br />
economy (in which the operational conditions associated with<br />
the fiscal policies of a given nation state are increasingly open),<br />
we can take advantage of the scale of the system to describe<br />
its behaviour in terms of functions of random variables.<br />
A. Elliott Waves<br />
The stochastic nature of financial time series is well known<br />
from the frequently quoted values of major stock market indices<br />
such as the FTSE (Financial Times Stock Exchange) in the UK and the Dow<br />
Jones in the US. A principal aim<br />
of investors is to attempt to obtain information that can provide<br />
some confidence in the immediate future of the stock markets,<br />
often based on patterns of the past. One of the principal components<br />
of this aim is based on the observation that there are<br />
‘waves within waves’ and ‘events within events’ that appear to<br />
permeate financial signals when studied with sufficient detail<br />
and imagination. It is these repeating patterns that occupy both<br />
the financial investor and the systems modeller alike and it is<br />
clear that although economies have undergone many changes<br />
in the last one hundred years, the dynamics of market data<br />
do not appear to change significantly (ignoring scale). For<br />
example, with data obtained from [19], Figure 1 shows the rescaled<br />
signals and associated ‘macrotrends’ (i.e. normalised<br />
time series and associated time series after application of<br />
a Gaussian lowpass filter) associated with FTSE Close-of-Day<br />
(COD) illustrating the ‘development’ of three different<br />
‘crashes’: those of 1987, 1997 and the most recent crash of<br />
2007. The macrotrends are computed by filtering each signal<br />
in Fourier space using a Gaussian lowpass filter exp(−βω²)<br />
with β = 0.1 where ω is the angular frequency.<br />
Fig. 1. Evolution of the 1987, 1997 and 2007 financial crashes. Normalised<br />
data (left) and macrotrends (right, where the data has been smoothed and<br />
rescaled to values between 0 and 1 inclusive) of the daily FTSE value<br />
(close-of-day) for 02-04-1984 to 24-12-1987 (blue), 05-04-1994 to 24-12-1997<br />
(green) and 02-04-2004 to 24-09-2007 (red).<br />
The similarity in behaviour of these signals is remarkable<br />
and clearly indicates a wavelength of approximately 1000<br />
days. This is indicative of the quest to understand economic<br />
signals in terms of some universal phenomenon from which<br />
appropriate (macro) economic models can be generated. In an<br />
efficient market, only the revelation of some dramatic information<br />
can cause a crash, yet post-mortem analysis of crashes<br />
typically fails to (convincingly) tell us what this information<br />
must have been.<br />
One cause of correlations in market price changes (<strong>and</strong><br />
volatility) is mimetic behaviour, known as herding. In general,<br />
market crashes happen when large numbers of agents place sell<br />
orders simultaneously creating an imbalance to the extent that<br />
market makers are unable to absorb the other side without<br />
lowering prices substantially. Most of these agents do not<br />
communicate with each other, nor do they take orders from<br />
a leader. In fact, most of the time they are in disagreement,<br />
<strong>and</strong> submit roughly the same amount of buy <strong>and</strong> sell orders.<br />
This is a healthy non-crash situation; it is a diffusive (random-walk)<br />
process which underlies the EMH and financial portfolio<br />
rationalization.<br />
B. Non-equilibrium <strong>Systems</strong><br />
Financial markets can be considered to be non-equilibrium<br />
systems because they are constantly driven by transactions that
occur as the result of new fundamental information about firms<br />
<strong>and</strong> businesses. They are complex systems because the market<br />
also responds to itself, often in a highly non-linear fashion, <strong>and</strong><br />
would carry on doing so (at least for some time) in the absence<br />
of new information. The ‘price change field’ is highly nonlinear<br />
<strong>and</strong> very sensitive to exogenous shocks <strong>and</strong> it is probable<br />
that all shocks have a long term effect. Market transactions<br />
generally occur globally at the rate of hundreds of thousands<br />
per second. It is the frequency and nature of these transactions<br />
that dictate stock market indices, just as it is the frequency and<br />
nature of the sand particles that dictates the statistics of the<br />
avalanches in a sand pile. These are all examples of random<br />
scaling fractals [21]-[26].<br />
IV. THE FRACTAL MARKET HYPOTHESIS<br />
Developing mathematical models to simulate stochastic<br />
processes has an important role in financial analysis <strong>and</strong><br />
information systems in general where it should be noted that<br />
information systems are now one of the most important aspects<br />
in terms of regulating financial systems, e.g. [27]-[30]. A good<br />
stochastic model is one that accurately predicts the statistics<br />
we observe in reality, <strong>and</strong> one that is based upon some well<br />
defined rationale. Thus, the model should not only describe<br />
the data, but also help to explain <strong>and</strong> underst<strong>and</strong> the system.<br />
There are two principal criteria used to define the characteristics<br />
of a stochastic field: (i) the PDF or the Characteristic<br />
Function (i.e. the Fourier transform of the PDF); (ii) the Power<br />
Spectral Density Function (PSDF). The PSDF is the function<br />
that describes the envelope or shape of the power spectrum of<br />
a signal. In this sense, the PSDF is a measure of the field<br />
correlations. The PDF <strong>and</strong> the PSDF are two of the most<br />
fundamental properties of any stochastic field <strong>and</strong> various<br />
terms are used to convey these properties. For example, the<br />
term ‘zero-mean white Gaussian noise’ refers to a stochastic<br />
field characterized by a PSDF that is effectively constant over<br />
all frequencies (hence the term ‘white’ as in ‘white light’) <strong>and</strong><br />
has a PDF with a Gaussian profile whose mean is zero.<br />
Stochastic fields can of course be characterized using transforms<br />
other than the Fourier transform (from which the PSDF<br />
is obtained) but the conventional PDF-PSDF approach serves<br />
many purposes in stochastic systems theory. However, in<br />
general, there is no connection between the PSDF<br />
and the PDF either in terms of theoretical prediction and/or<br />
experimental determination. It is not generally possible to<br />
compute the PSDF of a stochastic field from knowledge of<br />
the PDF or the PDF from the PSDF. Hence, in general, the<br />
PDF <strong>and</strong> PSDF are fundamental but non-related properties<br />
of a stochastic field. However, for some specific statistical<br />
processes, relationships between the PDF <strong>and</strong> PSDF can<br />
be found, for example, between Gaussian <strong>and</strong> non-Gaussian<br />
fractal processes [31] <strong>and</strong> for differentiable Gaussian processes<br />
[32].<br />
There are two conventional approaches to simulating a<br />
stochastic field. The first of these is based on predicting the<br />
PDF (or the Characteristic Function) theoretically (if possible).<br />
A pseudo random number generator is then designed whose<br />
output provides a discrete stochastic field that is characteristic<br />
of the predicted PDF. The second approach is based on<br />
considering the PSDF of a field which, like the PDF, is ideally<br />
derived theoretically. The stochastic field is then typically<br />
simulated by filtering white noise. A ‘good’ stochastic model<br />
is one that accurately predicts both the PDF <strong>and</strong> the PSDF<br />
of the data. It should take into account the fact that, in<br />
general, stochastic processes are non-stationary. In addition, it<br />
should, if appropriate, model rare but extreme events in which<br />
significant deviations from the norm occur.<br />
One explanation for crashes involves a replacement for the<br />
EMH by the Fractal Market Hypothesis (FMH) which is the<br />
basis of the model considered in this paper. The FMH proposes<br />
the following: (i) The market is stable when it consists of<br />
investors covering a large number of investment horizons<br />
which ensures that there is ample liquidity for traders; (ii)<br />
information is more related to market sentiment and technical<br />
factors in the short term than in the long term; as investment<br />
horizons increase, longer term fundamental information<br />
dominates; (iii) if an event occurs that puts the validity<br />
of fundamental information in question, long-term investors<br />
either withdraw completely or invest on shorter terms (i.e.<br />
when the overall investment horizon of the market shrinks<br />
to a uniform level, the market becomes unstable); (iv) prices<br />
reflect a combination of short-term technical <strong>and</strong> long-term<br />
fundamental valuation <strong>and</strong> thus, short-term price movements<br />
are likely to be more volatile than long-term trades, since they are<br />
more likely to be the result of crowd behaviour; (v) if a security<br />
has no tie to the economic cycle, then there will be no long-term<br />
trend and short-term technical information will dominate.<br />
Unlike the EMH, the FMH states that information is valued<br />
according to the investment horizon of the investor. Because<br />
the different investment horizons value information differently,<br />
the diffusion of information will also be uneven. Unlike most<br />
complex physical systems, the agents of the economy, <strong>and</strong><br />
perhaps to some extent the economy itself, have an extra<br />
ingredient, an extra degree of complexity. This ingredient is<br />
consciousness.<br />
V. MATHEMATICAL MODEL FOR THE FMH<br />
We consider an economic time series to be a solution to<br />
the fractional diffusion equation [33]-[38]<br />
$$\left(\frac{\partial^2}{\partial x^2} - \sigma^q\frac{\partial^q}{\partial t^q}\right)u(x,t) = \delta(x)n(t) \qquad (1)$$<br />
where σ is the fractional diffusion coefficient, q > 0 is the<br />
‘Fourier dimension’ and n(t) is ‘white noise’. Let<br />
$$u(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}U(x,\omega)\exp(i\omega t)\,d\omega$$<br />
and<br />
$$n(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}N(\omega)\exp(i\omega t)\,d\omega.$$<br />
Using the result<br />
$$\frac{\partial^q}{\partial t^q}u(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}U(x,\omega)(i\omega)^q\exp(i\omega t)\,d\omega$$<br />
we can then transform the fractional diffusion equation to the<br />
form<br />
$$\left(\frac{\partial^2}{\partial x^2} + \Omega_q^2\right)U(x,\omega) = \delta(x)N(\omega)$$<br />
where we take<br />
$$\Omega_q = i(i\omega\sigma)^{q/2}.$$<br />
Defining the Green’s function g [39] to be the solution of<br />
$$\left(\frac{\partial^2}{\partial x^2} + \Omega_q^2\right)g(|x-y|,\omega) = \delta(x-y)$$<br />
where δ is the delta function, we obtain the solution<br />
$$U(x,\omega) = N(\omega)\int_{-\infty}^{\infty}g(|x-y|,\omega)\delta(y)\,dy = N(\omega)g(|x|,\omega)$$<br />
where [40]<br />
$$g(|x|,\omega) = \frac{i}{2\Omega_q}\exp(i\Omega_q|x|)$$<br />
under the assumption that u and ∂u/∂x → 0 as x → ±∞. The<br />
Green’s function characterises the response of a system modelled<br />
by equation (1) due to an impulse at x = y and it is clear that<br />
$$\lim_{x\to 0}U(x,\omega) = \frac{iN(\omega)}{2\Omega_q}$$<br />
or<br />
$$U(\omega) = \frac{1}{2\sigma^{q/2}}\frac{N(\omega)}{(i\omega)^{q/2}}.$$<br />
The time series associated with this asymptotic solution is<br />
then obtained by Fourier inversion giving (ignoring scaling by<br />
$[2\sigma^{q/2}\Gamma(q/2)]^{-1}$)<br />
$$u(t) = \frac{1}{t^{1-q/2}}\otimes n(t) \qquad (2)$$<br />
where ⊗ defines the convolution integral. This equation<br />
is the Riemann-Liouville transform (ignoring scaling by<br />
$[\Gamma(q/2)]^{-1}$) [41] which is a fractional integral and defines a<br />
function u(t) which is statistically self-affine, i.e. for a scaling<br />
parameter λ > 0,<br />
$$\lambda^{q/2}\Pr[u(\lambda t)] = \Pr[u(t)]$$<br />
where Pr[u(t)] denotes the Probability Density Function of<br />
u(t). Thus, equation (2) can be considered to be the temporal<br />
solution of equation (1) as x → 0 and u(t) is taken to be a<br />
random scaling fractal signal. Note that for | x | > 0 the phase<br />
$\Omega_q|x|$ does not affect the $\omega^{-q}$ scaling law of the power<br />
spectrum, i.e. ∀x,<br />
$$|U(x,\omega)|^2 = \frac{|N(\omega)|^2}{4\sigma^q\omega^q}, \quad \omega > 0.$$<br />
Thus for a uniformly distributed spectrum N(ω) the Power<br />
Spectrum Density Function of U is determined by $\omega^{-q}$ and the<br />
algorithm developed to compute q given in Section VI applies<br />
∀x and not just for the case when x → 0. However, since we<br />
can write<br />
$$U(x,\omega) = \frac{i}{2\Omega_q}N(\omega)\exp(i\Omega_q|x|) = N(\omega)\frac{1}{2(i\omega\sigma)^{q/2}}\left[1 + i(i\omega\sigma)^{q/2}|x| - \frac{1}{2!}(i\omega\sigma)^q|x|^2 + \dots\right]$$<br />
unconditionally, by inverse Fourier transforming, we obtain<br />
the following expression for u(x, t) (ignoring scaling factors):<br />
$$u(x,t) = n(t)\otimes\frac{1}{t^{1-q/2}} + i|x|n(t) + \sum_{k=1}^{\infty}\frac{i^{k+1}}{(k+1)!}|x|^{2k}\frac{d^{kq/2}}{dt^{kq/2}}n(t)$$<br />
Here, the solution is composed of three terms: (i)<br />
a fractional integral; (ii) the source term n(t); (iii) an infinite<br />
series of fractional differentials of order kq/2.<br />
A. Rationale for the Model - Hurst Processes<br />
A Hurst process describes fractional Brownian motion and<br />
is based on the generalization of Brownian motion quantified<br />
by the equation $A(t) = \sqrt{t}$ to<br />
$$A(t) = t^H, \quad H \in (0, 1]$$<br />
for a unit r<strong>and</strong>om step length in the plane where A is the<br />
most likely position in the plane after time t with respect to an<br />
initial position in the plane at t = 0. This scaling law makes<br />
no prior assumptions about any underlying distributions. It<br />
simply tells us how the system is scaling with respect to<br />
time. Processes of this type appear to exhibit cycles, but with<br />
no predictable period. The interpretation of such processes<br />
in terms of the Hurst exponent H is as follows: We know<br />
that H = 0.5 is consistent with an independently distributed<br />
system. The range 0.5 < H ≤ 1 implies a persistent time<br />
series, and a persistent time series is characterized by positive<br />
correlations. Theoretically, what happens today will ultimately<br />
have a lasting effect on the future. The range 0 < H < 0.5<br />
indicates anti-persistence which means that the time series<br />
covers less ground than a random process. In other words,<br />
there are negative correlations. For a system to cover less<br />
distance, it must reverse itself more often than a random<br />
process.<br />
Given that random walks with H = 0.5 describe processes<br />
whose macroscopic behaviour is characterised by the diffusion<br />
equation, then, by induction, Hurst processes should be<br />
characterised by generalizing the diffusion operator<br />
$$\frac{\partial^2}{\partial x^2} - \sigma\frac{\partial}{\partial t}$$<br />
to the fractional form<br />
$$\frac{\partial^2}{\partial x^2} - \sigma^q\frac{\partial^q}{\partial t^q}$$<br />
where q ∈ (0, 2]. Fractional diffusive processes can therefore be<br />
interpreted as intermediate between classical diffusive (random<br />
phase walks with H = 0.5; diffusive processes with q = 1)<br />
and ‘propagative processes’ (coherent phase walks for H =<br />
1; propagative processes with q = 2), e.g. [42] and [43].<br />
The relationship between the Hurst exponent H, the Fourier<br />
dimension q and the Fractal dimension $D_F$ is given by [44]<br />
$$D_F = D_T + 1 - H = 1 - q + \frac{3}{2}D_T$$<br />
where $D_T$ is the topological dimension. Thus, a Brownian<br />
process, where H = 1/2, has a fractal dimension of 1.5.<br />
Fractional diffusion processes are based on random walks<br />
which exhibit a bias with regard to the distribution of angles<br />
used to change the direction. By induction, it can be expected<br />
that as the distribution of angles reduces, the corresponding<br />
walk becomes more and more coherent, exhibiting longer and<br />
longer time correlations until the process conforms to a fully<br />
coherent walk. A simulation of such an effect is given in<br />
Figure 2 which shows a random walk in the (real) plane<br />
as the (uniform) distribution of angles decreases. The walk<br />
becomes less and less random as the width of the distribution is<br />
reduced. Each position of the walk $(x_j, y_j)$, j = 1, 2, 3, ..., N<br />
is computed using<br />
$$x_j = \sum_{i=1}^{j}\cos(\theta_i), \quad y_j = \sum_{i=1}^{j}\sin(\theta_i)$$<br />
where<br />
$$\theta_i = \alpha\pi\frac{n_i}{\|n\|_\infty}$$<br />
and $n_i$ are random numbers computed using the linear congruential<br />
pseudo random number generator<br />
$$n_{i+1} = a n_i \bmod P, \quad i = 1, 2, ..., N, \quad a = 7^7, \quad P = 2^{31} - 1.$$<br />
The parameter 0 ≤ α ≤ 2 defines the width of the<br />
distribution of angles such that as α → 0, the walk becomes<br />
increasingly coherent or ‘propagative’.<br />
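The random phase walk defined above can be sketched in a few lines. This is<br />
an illustrative implementation of the stated formulas only; the function names<br />
and the seed value (which the text does not specify) are assumptions.<br />

```python
import math


def lcg(seed, count, a=7 ** 7, p=2 ** 31 - 1):
    """Linear congruential generator n_{i+1} = a * n_i mod P.

    The seed is an assumption; it must lie in [1, P - 1].
    """
    numbers = []
    n = seed
    for _ in range(count):
        n = (a * n) % p
        numbers.append(n)
    return numbers


def phase_walk(alpha, steps, seed=1):
    """Walk in the plane with unit steps and angles uniform on [0, alpha * pi]."""
    n = lcg(seed, steps)
    n_max = max(n)  # the uniform norm of the random number sequence
    x = y = 0.0
    path = []
    for value in n:
        theta = alpha * math.pi * value / n_max
        x += math.cos(theta)
        y += math.sin(theta)
        path.append((x, y))
    return path
```

Setting alpha = 0 gives a fully coherent walk along the x axis, while alpha = 2<br />
spreads the angles over [0, 2π] and gives a fully diffusive walk.<br />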
Fig. 2. R<strong>and</strong>om phase walks in the plane for a uniform distribution of angles<br />
θi ∈ [0, 2π] (top left), θi ∈ [0, 1.9π] (top right), θi ∈ [0, 1.8π] (bottom left)<br />
<strong>and</strong> θi ∈ [0, 1.2π] (bottom right).<br />
In considering a $t^H$ scaling law with Hurst exponent H ∈<br />
(0, 1], Hurst paved the way for an appreciation that most natural<br />
stochastic phenomena which, at first sight, appear random,<br />
have certain trends that can be identified over a given period<br />
of time. In other words, many natural random patterns have a<br />
bias to them that leads to time correlations in their stochastic<br />
behaviour, a behaviour that is not an inherent characteristic of<br />
a random walk model and fully diffusive processes in general.<br />
This aspect of stochastic field theory is the basis for Lévy<br />
processes [45].<br />
B. Lévy Processes<br />
Lévy processes are r<strong>and</strong>om walks whose distribution has<br />
infinite moments. The statistics of (conventional) physical<br />
systems are usually concerned with stochastic fields that have<br />
PDFs where (at least) the first two moments (the mean <strong>and</strong><br />
variance) are well defined <strong>and</strong> finite. Lévy statistics is concerned<br />
with statistical systems where all the moments (starting<br />
with the mean) are infinite. Many distributions exist where the<br />
mean <strong>and</strong> variance are finite but are not representative of the<br />
process, e.g. the tail of the distribution is significant, where<br />
rare but extreme events occur. These distributions include<br />
Lévy distributions. Lévy’s original approach to deriving such<br />
distributions is based on the following question: Under what<br />
circumstances does the distribution associated with a r<strong>and</strong>om<br />
walk of a few steps look the same as the distribution after<br />
many steps (except for scaling)? This question is effectively<br />
the same as asking under what circumstances do we obtain a<br />
r<strong>and</strong>om walk that is statistically self-affine. The characteristic<br />
function (i.e. the Fourier transform) P(k) of such a distribution<br />
p(x) was first shown by Lévy to be given by (for symmetric<br />
distributions only)<br />
$$P(k) = \exp(-a|k|^\gamma), \quad 0 < \gamma \le 2 \qquad (3)$$<br />
where a is a constant <strong>and</strong> γ is the Lévy index. For γ ≥ 2, the<br />
second moment of the Lévy distribution exists <strong>and</strong> the sums of<br />
large numbers of independent trials are Gaussian distributed.<br />
For example, if the result were a r<strong>and</strong>om walk with a step<br />
length distribution governed by p(x), γ > 2, then the result<br />
would be normal (Gaussian) diffusion, i.e. a Brownian process.<br />
For γ < 2 the second moment of this PDF (the mean square)<br />
diverges and the characteristic scale of the walk is lost. For<br />
values of γ between 0 and 2, Lévy’s characteristic function<br />
corresponds to a PDF of the form<br />
$$p(x) \sim \frac{1}{x^{1+\gamma}}, \quad x \to \infty.$$<br />
This type of random walk is called a Lévy flight and is an<br />
example of a non-stationary fractal walk.<br />
Lévy processes are consistent with a fractional diffusion<br />
equation [46]. The basic evolution equation for a random<br />
Brownian particle process is given by<br />
$$u(x, t+\tau) = \int_{-\infty}^{\infty}u(x+\lambda, t)p(\lambda)\,d\lambda$$<br />
where u(x, t) is the concentration of particles <strong>and</strong> τ is the<br />
interval of time in which a particle moves some distance<br />
between λ <strong>and</strong> λ + dλ with a probability p(λ) satisfying the<br />
condition p(λ) = p(−λ). We note that<br />
u(x, t + τ) = u(x, t) ⊗ p(x)<br />
<strong>and</strong> that in Fourier space, this equation is<br />
U(k, t + τ) = U(k, t)P (k)
where U and P are the Fourier transforms of u and p<br />
respectively. From equation (3),<br />
$$P(k) \simeq 1 - a|k|^\gamma$$<br />
so that we can write<br />
$$\frac{U(k,t+\tau) - U(k,t)}{\tau} \simeq -\frac{a}{\tau}|k|^\gamma U(k,t)$$<br />
which for τ → 0 gives the fractional diffusion equation<br />
$$\sigma\frac{\partial}{\partial t}u(x,t) = \frac{\partial^\gamma}{\partial x^\gamma}u(x,t), \quad \gamma \in (0, 2] \qquad (4)$$<br />
where σ = τ/a and we have used the result<br />
$$\frac{\partial^\gamma}{\partial x^\gamma}u(x,t) = -\frac{1}{2\pi}\int_{-\infty}^{\infty}|k|^\gamma U(k,t)\exp(ikx)\,dk.$$<br />
The solution to this equation with the singular initial condition<br />
u(x, 0) = δ(x) is given by<br />
$$u(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp(ikx - t|k|^\gamma/\sigma)\,dk$$<br />
which is itself Lévy distributed. This derivation of the fractional<br />
diffusion equation reveals its physical origin in terms of<br />
Lévy statistics, i.e. Lévy’s characteristic function. Note that the<br />
diffusion equation is fractional in the spatial derivative rather<br />
than the temporal derivative as given in equation (1). However,<br />
since the Green’s function for equation (4) is given by<br />
$$g(|x|,\omega) = \frac{i}{2\Omega_\gamma}\exp(i\Omega_\gamma|x|)$$<br />
where<br />
$$\Omega_\gamma = i^{2/\gamma}(i\omega\sigma)^{1/\gamma},$$<br />
by induction, we obtain a relationship between the Lévy index<br />
γ and the Fourier dimension q given by<br />
$$\frac{1}{\gamma} = \frac{q}{2}.$$<br />
Gaussian processes associated with the classical diffusion<br />
equation are thus recovered when γ = 2 and q = 1.<br />
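The Lévy density obtained from the characteristic function in equation (3),<br />
and hence the solution of equation (4) for a = t/σ, can be evaluated<br />
numerically. The sketch below is only an illustration: the function name, the<br />
truncation limit k_max and the number of samples are assumptions, and the<br />
truncation is only adequate when exp(−a k_max^γ) is negligible (it is not for<br />
small γ).<br />

```python
import math


def levy_pdf(x, gamma, a=1.0, k_max=50.0, samples=200001):
    """Symmetric Levy density with characteristic function exp(-a |k|^gamma).

    Uses p(x) = (1/pi) * integral from 0 to infinity of cos(k x) exp(-a k^gamma) dk,
    truncated at k_max (an assumption) and evaluated with the trapezoidal rule.
    """
    dk = k_max / (samples - 1)
    total = 0.0
    for j in range(samples):
        k = j * dk
        value = math.cos(k * x) * math.exp(-a * k ** gamma)
        weight = 0.5 if j in (0, samples - 1) else 1.0
        total += weight * value
    return total * dk / math.pi
```

For γ = 2 the result agrees with the Gaussian exp(−x²/4a)/√(4πa) and for<br />
γ = 1 with the Cauchy density a/[π(a² + x²)], whose tail decays as 1/x²,<br />
consistent with the 1/x^(1+γ) law quoted above.<br />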
C. Fractional Differentials<br />
Fractional differentials of any order need to be considered<br />
in terms of the definition for a fractional differential given by<br />
$$\hat{D}^q f(t) = \frac{d^m}{dt^m}[\hat{I}^{m-q}f(t)], \quad m - q > 0$$<br />
where m is an integer and Î is the fractional integral operator<br />
(the Riemann-Liouville transform) given by<br />
$$\hat{I}^p f(t) = \frac{1}{\Gamma(p)}f(t)\otimes\frac{1}{t^{1-p}}, \quad p > 0.$$<br />
The reason for this is that direct fractional differentiation can<br />
yield divergences. However, there is a deeper interpretation of<br />
this result that has a synergy with the issue over a macroeconomic<br />
system having ‘memory’ <strong>and</strong> is based on observing that<br />
the evaluation of a fractional differential operator depends on<br />
the history of the function in question. Thus, unlike an integer<br />
differential operator of order m, a fractional differential operator<br />
of order q has ‘memory’ because the value of $\hat{I}^{m-q}f(t)$ at<br />
a time t depends on the behaviour of f(t) from −∞ to t via the<br />
convolution of f(t) with $t^{(m-q)-1}/\Gamma(m-q)$. The convolution<br />
process is dependent on the history of a function f(t) for<br />
a given kernel and thus, in this context, we can consider a<br />
fractional derivative defined by $\hat{D}^q$ to have ‘memory’. In this<br />
sense, the operator<br />
$$\frac{\partial^2}{\partial x^2} - \sigma^q\frac{\partial^q}{\partial t^q}$$<br />
describes a process, compounded in a field u(x, t), that has<br />
memory association with regard to the temporal characteristics<br />
of the system it is attempting to model. This is not an intrinsic<br />
characteristic of systems that are purely diffusive (q = 1) or<br />
propagative (q = 2).<br />
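The ‘memory’ of the Riemann-Liouville transform is easy to see in a discrete<br />
sketch. The code below is an illustration under stated assumptions, not a<br />
method taken from the paper: the function name is hypothetical, the signal is<br />
assumed to start at t = 0 (rather than −∞) and to be piecewise constant on a<br />
uniform grid, so that the kernel can be integrated exactly over each cell.<br />

```python
import math


def fractional_integral(samples, p, step=1.0):
    """Riemann-Liouville integral of order p > 0 on a uniform grid.

    samples[k] is taken as the value of f on the cell [k*step, (k+1)*step);
    entry n - 1 of the result approximates (I^p f)(n*step).
    """
    scale = step ** p / math.gamma(p + 1.0)
    # weights[k]: exact integral of t^(p-1)/Gamma(p) over the k-th cell.
    weights = [scale * ((k + 1) ** p - k ** p) for k in range(len(samples))]
    result = []
    for n in range(1, len(samples) + 1):
        # Discrete convolution: every earlier sample contributes to the value at n.
        total = sum(weights[n - 1 - j] * samples[j] for j in range(n))
        result.append(total)
    return result
```

For p = 1 the weights are all equal and the operator reduces to a running sum.<br />
For 0 < p < 1 the weights decay slowly, so distant samples still contribute.<br />
Applying the function with p = q/2 to a white noise sequence gives, up to<br />
scaling, a discrete version of equation (2).<br />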
D. Non-stationary Model<br />
The fractional diffusion operator used in equation (1) is<br />
appropriate for modelling fractional diffusive processes that<br />
are stationary. For non-stationary fractional diffusion, we could<br />
consider the case where the diffusivity is time variant as<br />
defined by the function σ(t). However, a more interesting<br />
case arises when the characteristics of the diffusion processes<br />
change over time becoming less or more diffusive. This is<br />
illustrated in terms of the r<strong>and</strong>om walk in the plane given in<br />
Figure 3. Here, the walk starts off being fully diffusive (i.e.<br />
H = 0.5 <strong>and</strong> q = 1), changes to being fractionally diffusive<br />
(0.5 < H < 1 <strong>and</strong> 1 < q < 2) <strong>and</strong> then changes back to<br />
being fully diffusive. In terms of fractional diffusion, this is<br />
equivalent to having an operator<br />
$$\frac{\partial^2}{\partial x^2} - \sigma^q\frac{\partial^q}{\partial t^q}$$<br />
where q = 1, t ∈ (0, T1]; q > 1, t ∈ (T1, T2]; q = 1, t ∈<br />
(T2, T3] where T3 > T2 > T1. If we want to generalise<br />
such processes over arbitrary periods of time, then we should<br />
consider q to be a function of time. We can then introduce a<br />
non-stationary fractional diffusion operator given by<br />
$$\frac{\partial^2}{\partial x^2} - \sigma^{q(t)}\frac{\partial^{q(t)}}{\partial t^{q(t)}}.$$<br />
This operator is the theoretical basis for the Fractal Market<br />
Hypothesis considered in this paper. In terms of using this<br />
model to develop a FMH risk management metric based on<br />
the analysis of economic time series, the principal hypothesis<br />
is that a change in q(t) precedes a change in a macroeconomic<br />
index. This requires accurate numerical methods for<br />
computing q(t) for a given index which are discussed later.<br />
Real economic signals exhibit non-stationary fractal walks. An<br />
example of this is illustrated in Figure 4 which shows a non-stationary<br />
walk in the complex plane obtained by taking the<br />
Hilbert transform of an economic signal, i.e. computing the<br />
analytic signal<br />
$$s(t) = u(t) + \frac{i}{\pi t}\otimes u(t)$$<br />
and plotting the real and imaginary components of this signal<br />
in the complex plane.<br />
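A discrete analytic signal of the kind plotted in Figure 4 can be sketched<br />
with a DFT. This is a minimal illustration, not the authors' code: the function<br />
names are assumptions, and the DFT treats the signal as periodic, so the result<br />
is a circular approximation to the convolution with 1/(πt).<br />

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short series."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    result = []
    for k in range(n):
        total = sum(
            values[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
            for m in range(n)
        )
        result.append(total / n if inverse else total)
    return result


def analytic_signal(u):
    """Return s = u + i * H[u] by suppressing the negative frequencies."""
    n = len(u)
    spectrum = dft(u)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue  # DC and Nyquist bins are left unchanged
        if k < (n + 1) // 2:
            spectrum[k] *= 2.0  # positive frequencies are doubled
        else:
            spectrum[k] = 0.0  # negative frequencies are removed
    return dft(spectrum, inverse=True)
```

Plotting the real part of the returned values against the imaginary part<br />
traces the walk in the complex plane; for a pure cosine the walk is a circle.<br />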
Fig. 3. Non-stationary random phase walk in the plane.<br />
Fig. 4. Non-stationary fractal walk in the complex plane (right) obtained by<br />
computing the Hilbert transform of the economic signal (left): FTSE Close-of-Day<br />
from 02-04-1984 to 24-12-1987.<br />
The non-stationary model considered here exhibits behaviour<br />
that is similar to Lévy processes. However, the aim<br />
is not to derive a statistical model for a stochastic process<br />
using a stationary fractional diffusion of the type given by<br />
equation (4) but to be able to compute a function - namely<br />
q(t) - which is a measure of the non-stationary behaviour<br />
especially with regard to a ‘future flight’. This is because,<br />
in principle, the value of q(t) should reflect the early stages<br />
of a change in the behaviour of u(t), a principle that is the<br />
basis for the financial data processing <strong>and</strong> analysis discussed<br />
in the following section.<br />
VI. FINANCIAL DATA ANALYSIS<br />
If we consider the case where the Fourier dimension is<br />
a relatively slowly varying function of time, then we can<br />
legitimately consider q(t) to be composed of a sequence of<br />
different states qi = q(ti). This approach allows us to develop<br />
a stationary solution for a fixed q over a fixed period of time.<br />
Non-stationary behaviour can then be introduced by using the<br />
same solution for different values of q over fixed (or varying)<br />
periods of time <strong>and</strong> concatenating the solutions for all q to<br />
produce an output digital signal.<br />
The FMH model for a quasi-stationary segment of a financial<br />
signal is given by<br />
$$u(t) = \frac{1}{t^{1-q/2}}\otimes n(t), \quad q > 0$$<br />
which has characteristic spectrum<br />
$$U(\omega) = \frac{N(\omega)}{(i\omega)^{q/2}}.$$<br />
The PSDF is thus characterised by $\omega^{-q}$, ω ≥ 0 and our<br />
problem is thus to compute q from the data $P(\omega) = |U(\omega)|^2$,<br />
ω ≥ 0. For this data, we consider the PSDF<br />
$$\hat{P}(\omega) = \frac{c}{\omega^q}$$<br />
or<br />
$$\ln\hat{P}(\omega) = C - q\ln\omega$$<br />
where C = ln c. The problem is therefore reduced to implementing<br />
an appropriate method to compute q (and C) by<br />
finding a best fit of the line $\ln\hat{P}(\omega)$ to the data ln P(ω).<br />
Application of the least squares method for computing q,<br />
which is based on minimizing the error<br />
$$e(q, C) = \|\ln P(\omega) - \ln\hat{P}(\omega, q, C)\|_2^2$$<br />
with regard to q <strong>and</strong> C, leads to errors in the estimates for<br />
q which are not compatible with market data analysis. The<br />
reason for this is that relative errors at the start <strong>and</strong> end<br />
of the data ln P may vary significantly especially because<br />
any errors inherent in the data P will be ‘amplified’ through<br />
application of the logarithmic transform required to linearise<br />
the problem. In general, application of a least squares approach<br />
is very sensitive to statistical heterogeneity [47] <strong>and</strong> in this<br />
application, may provide values of q that are not compatible<br />
with the rationale associated with the FMH (i.e. values of 1 <<br />
q < 2 that are intermediate between diffusive <strong>and</strong> propagative<br />
processes). For this reason, an alternative approach must be<br />
considered which, in this paper, is based on Orthogonal Linear<br />
Regression (OLR) [48] [49].<br />
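For orientation, an orthogonal fit of a straight line to log-log spectral data<br />
can be sketched as follows. This is a standard closed-form orthogonal (total<br />
least squares) line fit written for illustration; it is not the m-code referred<br />
to in this paper, and the function names are assumptions.<br />

```python
import math


def orthogonal_fit(xs, ys):
    """Fit y = intercept + slope * x by minimising perpendicular distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxy == 0.0:
        raise ValueError("slope is undefined when x and y are uncorrelated")
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fourier_dimension(omegas, powers):
    """Estimate q and C from ln P = C - q ln(omega); all inputs must be positive."""
    log_w = [math.log(w) for w in omegas]
    log_p = [math.log(p) for p in powers]
    slope, intercept = orthogonal_fit(log_w, log_p)
    return -slope, intercept
```

Unlike ordinary least squares, which measures errors in ln P only, this fit<br />
treats both coordinates symmetrically. For data lying exactly on a power law<br />
the two approaches give the same q.<br />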
Applying a standard moving window, q(t) is computed by<br />
repeated application of OLR based on the m-code available<br />
from [51]. This provides a numerical estimate of the function<br />
q(t) whose values reflect the state of a financial signal<br />
(assumed to be a non-stationary random fractal) in terms of a<br />
stable or unstable economy, from which a risk analysis can be<br />
performed. Since q is, in effect, a statistic, its computation<br />
is only as good as the quantity (<strong>and</strong> quality) of data that<br />
is available for its computation. For this reason, a relatively<br />
large window is required whose length is compatible with the<br />
number of samples available.<br />
A. Numerical Algorithm<br />
The principal algorithm associated with the application of<br />
the FMH analysis is as follows:<br />
Step 1: Read data (financial time series) from file into<br />
operating array a[i], i = 1, 2, ..., N.<br />
Step 2: Set length L < N of moving window w to be used.<br />
Step 3: For j = 1 assign the L elements a[j], ..., a[j + L − 1] to array<br />
w[i], i = 1, 2, ..., L.<br />
Step 4: Compute the power spectrum P[i] of w[i] using a<br />
Discrete Fourier Transform (DFT).<br />
Step 5: Compute the logarithm of the spectrum excluding the<br />
DC, i.e. compute log(P[i]) ∀i ∈ [2, L/2].<br />
Step 6: Compute q[j] using the OLR algorithm whose m-code<br />
is given in Appendix I.<br />
Step 7: For j = j + 1 repeat Step 3 - Step 6 stopping when<br />
j = N − L.<br />
Step 8: Write the signal q[j] to file for further analysis and<br />
post processing.<br />
The following points should be noted:<br />
(i) The DFT is taken to generate an output in standard form<br />
where the zero frequency component of the power spectrum<br />
is taken to be P[1].<br />
(ii) With $L = 2^m$ for integer m, a Fast Fourier Transform can<br />
be used.<br />
(iii) The minimum window size that should be used in order<br />
to provide statistically significant values of q[j] is L = 64 when<br />
q can be computed accurate to 2 decimal places.<br />
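Steps 2 to 7 above can be sketched as a single routine. This is an<br />
illustration under stated assumptions rather than the authors' implementation:<br />
the function names are hypothetical, a naive DFT replaces the FFT, the<br />
regression is a generic closed-form orthogonal line fit (not the m-code of<br />
Appendix I), and file input and output (Steps 1 and 8) are omitted.<br />

```python
import cmath
import math


def power_spectrum(window):
    """Return |DFT|^2 for bins 1 .. L/2 - 1 (DC excluded) using a naive DFT."""
    size = len(window)
    powers = []
    for k in range(1, size // 2):
        coeff = sum(
            window[m] * cmath.exp(-2j * math.pi * k * m / size)
            for m in range(size)
        )
        powers.append(abs(coeff) ** 2)
    return powers


def orthogonal_slope(xs, ys):
    """Slope of the line minimising perpendicular distances to the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxy == 0.0:
        raise ValueError("slope is undefined when x and y are uncorrelated")
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)


def fourier_dimension_series(data, length):
    """Slide a window of the given length over data and estimate q at each position.

    Produces N - L values, matching the stopping rule j = N - L in Step 7.
    Every spectral bin used must have non-zero power, since its logarithm is taken.
    """
    log_k = [math.log(k) for k in range(1, length // 2)]
    q_values = []
    for start in range(len(data) - length):
        window = data[start:start + length]
        log_p = [math.log(p) for p in power_spectrum(window)]
        # ln P = C - q ln(omega), so q is minus the fitted slope.
        q_values.append(-orthogonal_slope(log_k, log_p))
    return q_values
```

The bin index is used in place of ω because a constant factor in the<br />
frequency only shifts C and leaves the slope, and hence q, unchanged.<br />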
An example of the output generated by this algorithm for<br />
a 1024 element window is given in Figure 5 using Dow<br />
Jones Close-of-Day data obtained from [20]. Inspection of the<br />
signals illustrates a qualitative relationship between trends in<br />
the financial data and q(t) in accordance with the theoretical<br />
model considered. In particular, over periods of time in which<br />
q increases in value, the amplitude of the financial signal u(t)<br />
decreases. Moreover, and more importantly, an upward trend<br />
in q appears to be a precursor to a downward trend in u(t), a<br />
correlation that is compatible with the idea that a rise in the<br />
value of q relates to the ‘system’ becoming more propagative,<br />
which, in stock market terms, indicates the likelihood of the<br />
markets becoming ‘bear’ dominant in the future.<br />
The results of using the method discussed above not only<br />
provide a general appraisal of different macroeconomic<br />
financial time series but, depending on the size of the<br />
window used, an analysis of data at any point in time.<br />
The output can be interpreted in terms of ‘persistence’ and<br />
‘anti-persistence’ and in terms of the existence or absence<br />
of after-effects (macroeconomic memory effects). For those<br />
periods in time when q(t) is relatively constant, the existing<br />
market tendencies usually remain. Changes in the existing<br />
trends tend to occur just after relatively sharp changes in<br />
q(t) have developed. This behaviour indicates the possibility<br />
of using the time series q(t) for identifying the behaviour<br />
of a macroeconomic financial system in terms of both inter-market<br />
and between-market analysis. These results support the<br />
possibility of using q(t) as an independent volatility predictor<br />
to give a risk assessment associated with the likely future<br />
behaviour of different economic time series. Further, because<br />
this analysis is based on equation (2), which defines a<br />
(stationary) random scaling fractal signal, the results are, in<br />
principle, scale invariant.<br />
Fig. 5. Application of the FMH using a 1024 element window for analysing<br />
financial time series composed of Dow Jones Close-of-Day data from<br />
02-11-1932 to 25-03-2009. Above: Dow Jones Close-of-Day data (blue) and<br />
q(t) (red) computed using a window of 1024; Below: Histogram of q(t) for<br />
100 bins.<br />
B. Equivalence with a Wavelet Transform<br />
The wavelet transform is defined in terms of projections of<br />
f(t) onto a family of functions that are all normalized dilations<br />
and translations of a prototype ‘wavelet’ function w [50], i.e.<br />
\[ W[f(t)] = F_L(t) = \int_{-\infty}^{\infty} f(\tau)\, w_L(\tau, t)\, d\tau \]<br />
where<br />
\[ w_L(\tau, t) = \frac{1}{\sqrt{L}}\, w\!\left(\frac{\tau - t}{L}\right), \quad L > 0. \]<br />
The independent variables L and t are continuous dilation and<br />
translation parameters respectively. The wavelet transformation<br />
is essentially a convolution transform where $w_L(t)$ is the<br />
convolution kernel with dilation variable L. The introduction<br />
of this factor provides dilation and translation properties into<br />
the convolution integral that gives it the ability to analyse<br />
signals in a multi-resolution role (the convolution integral is<br />
now a function of L), i.e.<br />
\[ F_L(t) = w_L(t) \otimes f(t), \quad L > 0. \]<br />
In this sense, the asymptotic solution (ignoring scaling)<br />
\[ u(t) = \frac{1}{t^{1-q/2}} \otimes n(t), \quad q > 0 \]<br />
is compatible with the case of a wavelet transform where<br />
\[ w_1(t) = \frac{1}{t^{1-q/2}} \]<br />
for the stationary case and where, for the non-stationary case,<br />
\[ w_1(t, \tau) = \frac{1}{t^{1-q(\tau)/2}}. \]<br />
C. Macrotrend Analysis<br />
In order to develop a macrotrend signal that has optimal<br />
properties with regard to the assessment of risk (i.e. the likely<br />
future behaviour of an economic signal), it is important that:<br />
(i) the filter used is consistent with the properties of a Variation<br />
Diminishing Smoothing Kernel (VDSK); (ii) the last few<br />
values of the trend signal are ‘data consistent’. VDSKs are<br />
convolution kernels with properties that guarantee smoothness<br />
around points of discontinuity of a given signal where the<br />
smoothed function is composed of a similar succession of<br />
concave or convex arcs equal in number to those of the signal.<br />
VDSKs also have ‘geometric properties’ that preserve the<br />
‘shape’ of the signal. There is a range of VDSKs of which the<br />
most common is a Gaussian function and, for completeness,<br />
Appendix II provides an overview of the principal analytical<br />
properties, including fundamental Theorems and Proofs, of<br />
such kernels including the Gaussian kernel.<br />
In practice, the computation of the smoothing process using<br />
a VDSK must be performed in such a way that the initial and<br />
final elements of the output data are entirely data consistent<br />
with the input array within the locality of any element. Since<br />
a VDSK is a non-localised filter which tends to zero at<br />
infinity, in order to optimise the numerical efficiency of the<br />
smoothing process, filtering is undertaken in Fourier space.<br />
However, in order to produce a data consistent macrotrend<br />
signal using a Discrete Fourier Transform, wrapping effects<br />
must be eliminated. The solution is to apply an ‘end point<br />
extension’ scheme which involves padding the input vector<br />
with elements equal to the first and last values of the vector.<br />
The length of each ‘padding vector’ is taken to be at least<br />
half the size of the input vector. The output vector is obtained<br />
by deleting the filtered padding vectors.<br />
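The end point extension scheme described above can be sketched as follows. This is an illustrative Python version under stated assumptions: the function name `macrotrend` and the parameter `pad_fraction` are hypothetical, and ω is taken to be the angular frequency in radians per sample, which the text does not specify. A different frequency scaling changes the meaning of a given value of β.

```python
import numpy as np


def macrotrend(x, beta, pad_fraction=0.5):
    """Gaussian lowpass filter exp(-beta * omega**2) with end point extension."""
    x = np.asarray(x, dtype=float)
    n = x.size
    pad = int(np.ceil(pad_fraction * n))        # at least half the input length
    # End point extension: repeat the first and last values on either side.
    padded = np.concatenate([np.full(pad, x[0]), x, np.full(pad, x[-1])])
    # Assumption: omega is the angular frequency in radians per sample.
    omega = 2.0 * np.pi * np.fft.fftfreq(padded.size)
    smoothed = np.fft.ifft(np.fft.fft(padded) * np.exp(-beta * omega ** 2)).real
    return smoothed[pad:pad + n]                # delete the filtered padding
```

Because the filter equals 1 at ω = 0, a constant input is returned unchanged, and any wrap-around from the circular convolution falls inside the padding that is discarded.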
Figures 6 and 7 show examples of macrotrend analysis<br />
applied to the economic time series obtained from [19] and<br />
[20] and the signal q(t) using the VDSK filter exp(−βω²).<br />
Table I provides quantitative information on the statistics of the<br />
signal q(t). Figures 6 and 7 include the normalised gradients<br />
computed using a ‘forward differencing scheme’ which clearly<br />
illustrate ‘phase shifts’ associated with the two signals. From<br />
Table I, the mean value of q(t) for the Dow Jones index<br />
is slightly lower than the mean for the FTSE and in both<br />
cases, the Null Hypothesis test as to whether q(t) is Gaussian<br />
distributed is negative, i.e. the ‘Composite Normality’ is of<br />
type ‘Reject’.<br />
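A forward differencing scheme of the kind mentioned above might look like the following Python sketch. The function name `normalised_gradient` is hypothetical, and the normalisation (scaling so that the largest absolute gradient is 1) is an assumption, since the text does not define it.

```python
import numpy as np


def normalised_gradient(x):
    """Forward differences x[i+1] - x[i], scaled so the largest magnitude is 1."""
    g = np.diff(np.asarray(x, dtype=float))
    peak = np.max(np.abs(g)) if g.size else 0.0
    # A flat signal has zero gradient everywhere; return it unscaled.
    return g / peak if peak > 0 else g
```

Scaling both gradients to a common range makes the relative timing of their turning points easier to compare when the two macrotrends have very different amplitudes.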
VII. CASE STUDY: ANALYSIS OF ABX INDICES<br />
ABX indices serve as a benchmark of the market for<br />
securities backed by home loans issued to borrowers with weak<br />
credit. The index is administered by the London-based Markit<br />
Group which specialises in credit derivative pricing [52].<br />
A. What is an ABX index?<br />
The index is based on a basket of Credit Default Swap<br />
(CDS) contracts for the sub-prime housing equity sector.<br />
Credit Default Swaps operate as a type of insurance policy<br />
for banks or other holders of bad mortgages. If the mortgage<br />
goes bad, then the seller of the CDS must pay the bank for the<br />
lost mortgage payments. Alternatively, if the mortgage stays<br />
good then the seller makes a lot of money. The riskier the<br />
bundle of mortgages the lower the rating.<br />
Fig. 6. Analysis of FTSE Close-of-Day data from 25-04-1988 to 20-03-2009.<br />
Top-left: FTSE data (blue) and q(t) (red) computed using a 1024 element moving<br />
window; Top-right: 100 bin histogram; Bottom-left: Macrotrends (β = 0.1);<br />
Bottom-right: Normalised gradients of macrotrends.<br />
Fig. 7. Analysis of DJ Close-of-Day data from 25-04-1988 to 20-03-2009.<br />
Top-left: DJ data (blue) and q(t) (red) computed using a window of 1024;<br />
Top-right: 100 bin histogram; Bottom-left: Macrotrends (β = 0.1);<br />
Bottom-right: Normalised gradients of macrotrends.<br />
Statistical Parameter | q(t)-FTSE | q(t)-DJ<br />
Minimum Value | 0.9876 | 0.9752<br />
Maximum Value | 1.5067 | 1.5154<br />
Range | 0.5190 | 0.5402<br />
Mean | 1.2482 | 1.2218<br />
Median | 1.2639 | 1.2452<br />
Standard Deviation | 0.1017 | 0.1269<br />
Variance | 0.0104 | 0.0161<br />
Skew | -0.4080 | -0.2881<br />
Kurtosis | 2.3745 | 1.8233<br />
Composite Normality | Reject | Reject<br />
TABLE I<br />
STATISTICAL VALUES ASSOCIATED WITH q(t) COMPUTED FOR FTSE AND<br />
DJ CLOSE-OF-DAY DATA FROM 25-04-1988 TO 20-03-2009 GIVEN IN<br />
FIGURES 6 AND 7 RESPECTIVELY.<br />
The original goal of the index was to create visibility and<br />
transparency but it was not clear at the time of its inception<br />
that the index would be so closely followed. As subprime<br />
securities have become increasingly uncertain, the ABX index<br />
has become a key point of reference for investors navigating<br />
risky mortgage debt on an international basis. Hence, in light<br />
of the current financial crisis (i.e. from 2008 to date), and given<br />
that most economists agree that the subprime mortgage market was a<br />
primary catalyst for the crisis, analysis of the ABX index has<br />
itself become important to investors navigating the<br />
world of risky mortgage debt.<br />
On asset-backed securities such as home equity loans the<br />
CDS provides an insurance against the default of a specific<br />
security. The index enables users to trade in a security without<br />
being limited to the physical outstanding amount of that<br />
security, thereby giving investors liquid access to the most<br />
frequently traded home equity tranches in a basket form. The<br />
ABX uses five indices that range from triple-A to triple-B<br />
minus. Each index chooses deals from 20 of the largest<br />
sub-prime home equity shelves by issuance amount from<br />
the previous six months. The minimum deal size is $500<br />
million and each tranche referenced must have an average<br />
life of between four and six years, except for the triple-A<br />
tranche, which must have a weighted average life greater than<br />
five years. Each of the indices references a differently<br />
rated tranche, i.e. AAA, AA, A, BBB and BBB-. The deals are<br />
selected through identification of the most recently issued<br />
deals that meet the specific size and diversity criteria. The<br />
principal ‘market-makers’ in the index were/are: Bank of<br />
America, Bear Stearns, Citigroup, Credit Suisse, Deutsche<br />
Bank, Goldman Sachs, J P Morgan, Lehman Brothers, Merrill<br />
Lynch (now Bank of America), Morgan Stanley, Nomura<br />
International, RBS Greenwich Capital, UBS and Wachovia.<br />
However, during the financial crisis that developed in 2008,<br />
a number of changes took place. For example, on<br />
September 15, 2008, Lehman Brothers filed for bankruptcy<br />
protection following a massive exodus of most of its clients,<br />
drastic losses in its stock, and devaluation of its assets by<br />
credit rating agencies. Also in 2008, Merrill Lynch was acquired<br />
by Bank of America, at which point Bank of America merged<br />
its global banking and wealth management division with the<br />
newly acquired firm. The Bear Stearns Companies, Inc. was a<br />
global investment bank and securities trading and brokerage firm<br />
until its collapse and fire sale to J P Morgan Chase in 2008.<br />
ABX contracts are commonly used by investors to speculate<br />
on or to hedge against the risk that the underlying mortgage<br />
securities are not repaid as expected. The ABX swaps offer<br />
protection if the securities are not repaid as expected, in<br />
return for regular insurance-like premiums. A decline in the<br />
ABX index signifies investor sentiment that subprime mortgage<br />
holders will suffer increased financial losses from those<br />
investments. Likewise, an increase in the ABX index signifies<br />
investor sentiment looking for subprime mortgage holdings to<br />
perform better as investments.<br />
B. ABX and the Sub-prime Market<br />
Sub-prime loans are often packaged into securities and sold to<br />
investors to help lenders reduce risk. More than $500B of<br />
such securities were issued in the US in 2006. The problem<br />
for investors who bought 2006’s crop of high-risk mortgage<br />
originations was that, as the US housing market slowed, so<br />
did mortgage applications. To prop up the market, mortgage<br />
lenders relaxed their underwriting standards, lending to ever-riskier<br />
borrowers at ever more favourable terms.<br />
In the last few weeks of 2006, the poor credit quality of<br />
the 2006 vintage subprime mortgage originations started to<br />
become apparent. Delinquencies and foreclosures among high-risk<br />
borrowers increased at a dramatic rate, weakening the<br />
performance of the mortgage pools. In one security backed by<br />
subprime mortgages issued in March 2006, foreclosure rates<br />
were already 6.09% by December that year, while 5.52% of<br />
borrowers were late on their payments by more than 30 days.<br />
Lenders also began shutting their doors, sending shock waves<br />
through the high-risk mortgage markets throughout 2007. The<br />
problem kept new investor money at bay, and dramatically<br />
weakened a key derivative index tied to the performance of<br />
2006 high-risk mortgages, i.e. the ABX index. As a result,<br />
the ABX index plummeted, starting in<br />
December 2006 when the BBB- index fell below 100 for the first time.<br />
The most heavily traded subindex, representing loans rated<br />
BBB-, fell as hedge funds flocked to bet on the downturn and<br />
pushed up the cost of insuring against default. This led to a<br />
knock-on effect as lenders withdrew from the ABX market.<br />
In early 2007 the issues were seen as: (i) Which investors<br />
were bearing the losses from having bought sub-prime<br />
mortgage-backed securities? (ii) How large and concentrated were<br />
these losses? (iii) Had sub-prime securitization distributed<br />
the risk among many players in the financial system or were<br />
the positions and losses concentrated among a few players?<br />
(iv) What were the potential systemic risk effects of these<br />
losses? We now know that the systemic risk had a devastating<br />
effect on the global economy and became known as the ‘Credit<br />
Crunch’. One of the catalysts for the problem was a US<br />
bill allowing bankruptcy judges to alter loan balances which<br />
nobody dealing in CDS had considered. The second key factor<br />
was the speed of deterioration of the ABX Indices in 2007<br />
which shocked investors and left them waiting to see the<br />
bottom of the market before getting back in - they are still<br />
waiting. The third key factor was the failure of the US Treasury<br />
to provide foreclosure relief for distressed home owners which<br />
Congress had approved. The following series of reactions<br />
(denoted by →) was triggered as a result: the Treasury said it<br />
would not take steps to prevent home foreclosures, so that prices<br />
of mortgage securities collapsed → bank equity was wiped<br />
out → banks, with shrunken equity capital, were forced to cut<br />
back on all types of credit → financing for anything, especially<br />
residential mortgage loans, dried up → market values of homes<br />
declined further → mortgage securities declined further, and<br />
the downward spiral became self-perpetuating.<br />
C. Effect of ABX on Bank Equities<br />
At the end of February 2007 a price of 92.5 meant that a<br />
protection buyer would need to pay the protection seller 7.5%<br />
upfront and then 0.64% per year. At the time, this kind of<br />
mortgage yield was about 6.5%, so the upfront charge was<br />
more than the yield per year. By April 2009 the A grade index<br />
had fallen to 8, meaning that the protection seller would want<br />
92% upfront, which meant that the sub-prime market had ‘died’. In<br />
July 2007 AAA mortgage securities started trading at prices<br />
materially below par, i.e. below 100. Until then, many banks<br />
had bulked up on mortgage securities that were rated AAA at<br />
the time of issue. This was because they believed that AAA<br />
bonds could always be traded at prices close to par, and<br />
consequently the bonds’ value would have a very small impact<br />
on earnings and equity capital. The mystique about AAA<br />
ratings dated back more than 80 years. From 1920 onward,<br />
the default experience on AAA rated bonds, even during the<br />
Great Depression, was nominal.<br />
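The arithmetic behind the upfront payments quoted above is simply the shortfall of the index price from par. The following Python lines are a small illustration only; the function name `upfront_payment_percent` is hypothetical and par is assumed to be 100, as in the text.

```python
def upfront_payment_percent(index_price):
    """Upfront payment, as a percentage of notional, implied by an index price.

    Assumes the index is quoted against a par value of 100, as in the text.
    """
    return 100.0 - index_price


# The two prices quoted in the text, compared with the 6.5% mortgage yield.
for price in (92.5, 8.0):
    upfront = upfront_payment_percent(price)
    print(f"price {price}: upfront {upfront}%, exceeds 6.5% yield: {upfront > 6.5}")
```

In both cases the upfront charge is larger than one year of yield, which is the condition the text uses to describe the market as no longer viable.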
The way the securities are structured is that different classes<br />
of creditors, or different tranches, all hold ownership interests<br />
in the same pool of mortgages. However, the tranches with<br />
the lower ratings - BBB, A, AA - take the first credit<br />
losses and they are supposed to be destroyed before the<br />
AAA bondholders lose anything. Typically, AAA bondholders<br />
represent about 75-80% of the entire mortgage pool. During<br />
the Great Depression (1929-1933), national average home<br />
prices held their value far better than they have since 2007. The<br />
assumptions of a highly liquid trading market and gradual<br />
price declines have proved to be wrong. Beginning in the last<br />
half of 2007, the price decline of AAA bonds was steep,<br />
and the trading market suddenly became very illiquid. Under<br />
standard accounting rules, those securities must be marked<br />
to market every fiscal quarter, and the banks’ equity capital<br />
shrank beyond all expectations. Hundreds of billions of dollars<br />
have been lost as a result. However, the losses in mortgage<br />
securities, and from financial institutions such as Lehman that<br />
were undone by mortgage securities, dwarf everything else.<br />
Before the end of each fiscal quarter, bank managements must<br />
also budget for losses associated with mortgage securities. But<br />
since they cannot control market prices at a future date, they<br />
compensate by adjusting what they can control, which is all<br />
discretionary extensions of credit. Banks cannot legally lend<br />
beyond a certain multiple of their capital.<br />
D. Credit Default Swap Index<br />
This index is used to hedge credit risk or to take a position<br />
on a basket of credit entities. Unlike a credit default swap, a<br />
credit default swap index is a completely standardised credit<br />
security and may therefore be more liquid and trade at a<br />
smaller bid-offer spread. This means that it can be cheaper<br />
to hedge a portfolio of credit default swaps or bonds with a<br />
CDS index than to buy many CDS to achieve a similar effect.<br />
Credit-default swap indexes are benchmarks for protecting<br />
investors owning bonds against default, and traders use them to<br />
speculate on changes in credit quality. There are currently two<br />
main families of CDS indices: CDX and iTraxx. CDX indices<br />
contain North American and Emerging Market companies and<br />
are administered by CDS Index Company and marketed by<br />
Markit Group Limited, and iTraxx indices contain companies from<br />
the rest of the world and are managed by the International<br />
Index Company (IIC). A new series of CDS indices is issued<br />
every six months by Markit Group and IIC. Running up<br />
to the announcement of each series, a group of investment<br />
banks is polled to determine the credit entities that will form<br />
the constituents of the new issue. This process is intended<br />
to ensure that the index does not become ‘cluttered’ with<br />
instruments that no longer exist, or which trade illiquidly. On<br />
the day of issue a fixed coupon is decided for the whole index<br />
based on the credit spread of the entities in the index. Once this<br />
has been decided the index constituents and the fixed coupon<br />
are published and the indices can be actively traded.<br />
E. Analysis of Sub-Prime CDS Market ABX Indices using the<br />
FMH<br />
The US Sub-Prime Housing Market is widely viewed as<br />
the source of the current economic crisis. The reason that<br />
it has had such a devastating effect on the global economy<br />
is that investment grade bonds were purchased by many<br />
substantial international financial institutions but, in reality,<br />
the method used to designate the relatively low risk required<br />
for investment grade securities was seriously flawed. This<br />
resulted in the investment grade bonds becoming virtually<br />
worthless very quickly when systemic risks that had wrongly<br />
been ignored undermined the entire market. About 80% of<br />
the market was designated investment grade (AAA - highest,<br />
AA and A - lowest) with protection provided by the high risk<br />
grades (BBB- and BBB). The flawed risk model was based<br />
on an assumption that the investment grades would always be<br />
protected by the higher risk grades that would take all of the<br />
first 20% of defaults. Once defaults exceeded 20% the ‘house<br />
of cards’ was demolished. It is therefore of interest to see if an<br />
FMH based analysis of the ABX indices could have been used<br />
as a predictive tool in order to develop a superior risk model.<br />
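The protection assumption described above amounts to a loss ‘waterfall’ in which the junior grades absorb losses first. The Python sketch below illustrates this with a hypothetical two-layer structure built from the 80/20 split quoted in the text; the function name `allocate_losses` and the layer labels are illustrative, and real deals have more tranches and more complex rules.

```python
def allocate_losses(pool_loss, tranches):
    """Allocate a pool loss to tranches, most junior first.

    pool_loss: loss as a percentage of the pool.
    tranches: list of (name, thickness) pairs ordered from junior to senior,
              with thickness given as a percentage of the pool.
    """
    remaining = pool_loss
    losses = {}
    for name, thickness in tranches:
        hit = min(remaining, thickness)     # a tranche cannot lose more than its size
        losses[name] = hit
        remaining -= hit
    return losses


# Hypothetical two-layer structure based on the 80/20 split quoted above.
structure = [("high risk (BBB-, BBB)", 20.0), ("investment grade (A, AA, AAA)", 80.0)]
print(allocate_losses(10.0, structure))
print(allocate_losses(25.0, structure))
```

With a 10% pool loss the investment grade layer is untouched; once the loss passes 20% the junior layer is exhausted and every further percentage point falls on the investment grades.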
Figure 8 shows the ABX index for each grade using data<br />
supplied by the Systemic Risk Assessment Division of the<br />
Bank of England. During the second week of December 2006<br />
the BBB- index slipped to 99.76 for a couple of days but then<br />
recovered. In March 2007 the index for BBB- slipped just<br />
below 90 and seemed to be recovering and by mid-May was<br />
above 90 again. In June 2007 the BBB- index really began to slide<br />
and this time it never recovered and was closely followed by<br />
the collapse of the BBB index after which there was no further<br />
protection for the investment grades. The default swaps work<br />
like insurance so that if the cost of insuring against risk<br />
becomes greater than the annual return from the loan then the<br />
market is effectively dead. By February 2008 the AAA grade<br />
was below this viable level.<br />
Fig. 8. Grades for the ABX Indices from 19 January 2006 to 2 April 2009<br />
based on Close-of-Day prices.<br />
The results of applying the FMH based on the algorithms<br />
discussed in Section VI are given in Figures 9-13. Table II provides<br />
a list of the statistical variables associated with q(t) for<br />
each case. In each case, q(t) initially has values > 2 but this<br />
falls rapidly prior to a change of the index. Also, in each case,<br />
the turning point of the normalised gradient of the Gaussian<br />
filtered signal (i.e. the point in time of the minimum value) is<br />
an accurate reflection of the point in time prior to when the<br />
index falls rapidly relative to the prior data. This turning<br />
point occurs before the equivalent characteristic associated<br />
with the smoothed index. The model consistently ‘signals’ the<br />
coming meltdown with sufficient notice for orderly withdrawal<br />
from the market. For example, the data used for Figure 9<br />
reflects the highest Investment Grade and would be regarded<br />
as particularly safe. The normalised gradient of the output data<br />
provides a very early signal of a change in trend, in this case,<br />
at approximately 180 days from the start of the run,<br />
which is equivalent to early April 2007 at which point the<br />
index was just above 100. In fact the AAA index appears to<br />
be viable as an investment right up to early November 2008<br />
after which it falls dramatically. In Figure 11, a trend change<br />
is again observed in the normalised gradient at approximately<br />
190 days which is equivalent to mid-April 2007. It is not until the<br />
second week of July 2007 that this index begins to fall rapidly.<br />
In Figure 13 the normalised gradient signals a trend change<br />
at around 170 days for the highest risk grade. This is equivalent to<br />
the third week of March 2007. At this stage the index was<br />
only just below 90 and appeared to be recovering.<br />
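The day counts quoted above can be related to calendar dates with a short Python check. This is a sketch under an explicit assumption: the counts are treated as trading days after 24-07-2006 (the start date given in the captions of Figures 9-13), with weekends skipped and public holidays ignored, so the dates are approximate.

```python
import numpy as np

# Assumption: day counts are trading days after 24-07-2006 (the start date in
# the captions of Figures 9-13); weekends are skipped, public holidays are not.
start = np.datetime64("2006-07-24")
for days in (170, 180, 190):
    print(days, np.busday_offset(start, days))
```

Under these assumptions the three counts fall on 19 March, 2 April and 16 April 2007, in line with the ‘third week of March’, ‘early April’ and ‘mid-April’ dates given in the text.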
Fig. 9. Analysis of AAA ABX.HE indices (2006 H1 vintage) by rating<br />
(closing prices) from 24-07-2006 to 02-04-2009. Top-left: AAA data (blue)<br />
and q(t) (red); Top-right: 100 bin histogram; Bottom-left: Macrotrends for<br />
β = 0.1; Bottom-right: Normalised gradients of macrotrends.<br />
Fig. 10. Analysis of AA ABX.HE indices (2006 H1 vintage) by rating<br />
(closing prices) from 24-07-2006 to 02-04-2009 for a 128 element moving window.<br />
Top-left: AA data (blue) and q(t) (red); Top-right: 100 bin histogram;<br />
Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised gradients<br />
of macrotrends.<br />
Fig. 11. Analysis of A ABX.HE indices (2006 H1 vintage) by rating (closing<br />
prices) from 24-07-2006 to 02-04-2009 for a 128 element moving window.<br />
Top-left: A data (blue) and q(t) (red); Top-right: 100 bin histogram; Bottom-left:<br />
Macrotrends for β = 0.1; Bottom-right: Normalised gradients of macrotrends.<br />
Fig. 12. Analysis of BBB ABX.HE indices (2006 H1 vintage) by rating<br />
(closing prices) from 24-07-2006 to 02-04-2009 for a 128 element moving<br />
window. Top-left: BBB data (blue) and q(t) (red); Top-right: 100 bin<br />
histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised<br />
gradients of macrotrends.<br />
Fig. 13. Analysis of BBB- ABX.HE indices (2006 H1 vintage) by rating<br />
(closing prices) from 24-07-2006 to 02-04-2009 for a 128 element moving<br />
window. Top-left: BBB- data (blue) and q(t) (red); Top-right: 100 bin<br />
histogram; Bottom-left: Macrotrends for β = 0.1; Bottom-right: Normalised<br />
gradients of macrotrends.<br />
Statistical Parameter | AAA | AA | A | BBB | BBB-<br />
Min. | 1.1834 | 1.0752 | 1.0522 | 1.0610 | 1.0646<br />
Max. | 3.1637 | 2.8250 | 2.7941 | 2.4476 | 2.5371<br />
Range | 1.9803 | 1.7499 | 1.7420 | 1.3867 | 1.4726<br />
Mean | 2.0113 | 1.7869 | 1.6663 | 1.5141 | 1.4722<br />
Median | 1.9254 | 1.7001 | 1.4923 | 1.3425 | 1.3243<br />
SD | 0.3928 | 0.4244 | 0.4384 | 0.3746 | 0.3476<br />
Variance | 0.1543 | 0.1801 | 0.1922 | 0.1404 | 0.1208<br />
Skew | 0.7173 | 0.3397 | 0.6614 | 0.8359 | 1.0345<br />
Kurtosis | 2.7117 | 1.8479 | 2.0809 | 2.2480 | 2.7467<br />
CN | Reject | Reject | Reject | Reject | Reject<br />
TABLE II<br />
STATISTICAL VALUES ASSOCIATED WITH q(t) COMPUTED FOR ABX.HE<br />
INDICES (2006 H1 VINTAGE) BY RATING (CLOSING PRICES) FROM<br />
24-07-2006 TO 02-04-2009. NOTE THAT THE ACRONYMS SD AND CN<br />
STAND FOR ‘STANDARD DEVIATION’ AND ‘COMPOSITE NORMALITY’<br />
RESPECTIVELY.<br />
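Descriptive statistics of the kind tabulated above can be computed from a q(t) signal as in the following Python sketch. It is illustrative only: the function name `summary_statistics` is hypothetical, the standard deviation and variance are assumed to use the sample (N − 1) form, and the kurtosis is assumed to be the non-excess form for which a Gaussian gives 3. The Composite Normality test is not reproduced here.

```python
import numpy as np


def summary_statistics(q):
    """Descriptive statistics of the kind listed in the tables for q(t)."""
    q = np.asarray(q, dtype=float)
    mean = q.mean()
    m2 = np.mean((q - mean) ** 2)               # central moments
    m3 = np.mean((q - mean) ** 3)
    m4 = np.mean((q - mean) ** 4)
    return {
        "min": q.min(),
        "max": q.max(),
        "range": q.max() - q.min(),
        "mean": mean,
        "median": np.median(q),
        "sd": q.std(ddof=1),                    # assumption: sample (N - 1) form
        "variance": q.var(ddof=1),
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,               # assumption: non-excess, Gaussian = 3
    }
```

A quick consistency check on the tabulated values is that the variance should equal the square of the standard deviation, e.g. 0.3928² ≈ 0.1543 for the AAA column.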
VIII. CONCLUSION<br />
In terms of the non-stationary fractional diffusion model<br />
considered in this paper, the time varying Fourier dimension<br />
q(t) can be interpreted in terms of a ‘gauge’ on the characteristics<br />
of a dynamical system. This includes the management<br />
processes from which all modern economies may be assumed<br />
to be derived. In this sense, the FMH is based on three principal<br />
considerations: (i) the non-stationary behaviour associated<br />
with any system undergoing continuous change that is driven<br />
by a management infrastructure; (ii) the cause <strong>and</strong> effect that is<br />
inherent at all scales (i.e. all levels of management hierarchy);<br />
(iii) the self-affine nature of outcomes relating to points (i)<br />
and (ii).<br />
In a modern economy, the principal issue associated with
any form of financial management is based on the flow<br />
of information and the assessment of this information at<br />
different points connecting a large network. In this sense, a<br />
macroeconomy can be assessed in terms of its information<br />
network which consists of a distribution of nodes from which<br />
information can flow in and out. The ‘efficiency’ of the system<br />
is determined by the level of randomness associated with the<br />
direction of flow of information to and from each node. The<br />
nodes of the system are taken to be individuals or small<br />
groups of individuals whose assessment of the information<br />
they acquire, together with their remit, responsibilities and<br />
initiative, determines the direction of the information flow<br />
from one node to the next. The determination of the efficiency<br />
of a system in terms of randomness is the most critical aspect<br />
of the model developed. It suggests that the performance of a<br />
business is related to how well information flows through an<br />
organisation.<br />
The FMH has a number of fundamental differences with<br />
regard to the EMH which are tabulated in Table III.<br />
EMH | FMH<br />
Gaussian Statistics | Non-Gaussian Statistics<br />
Stationary Process | Non-stationary Process<br />
No memory - no historical correlations | Memory - historical correlations<br />
No repeating patterns at any scale | Many repeating patterns at all scales - ‘Elliott waves’<br />
Continuously stable at all scales | Continuously unstable at any scale - ‘Lévy Flights’<br />
TABLE III<br />
PRINCIPAL DIFFERENCES BETWEEN THE EFFICIENT MARKET<br />
HYPOTHESIS (EMH) AND THE FRACTAL MARKET HYPOTHESIS (FMH).<br />
The non-stationary nature of the model presented in this<br />
paper is taken to account for stochastic processes that can vary<br />
in time and are intermediate between diffusive and propagative<br />
or persistent behaviour. Application of Orthogonal Linear<br />
Regression to macroeconomic time series data provides an<br />
accurate and robust method to compute q(t) when compared to<br />
other statistical estimation techniques such as the least squares<br />
method. As a result of the physical interpretation associated<br />
with the fractional diffusion equation and the ‘meaning’ of<br />
q(t), we can, in principle, use the signal q(t) as a predictive<br />
measure in the sense that as the value of q(t) continues to<br />
increase, there is a greater likelihood of volatile behaviour<br />
of the markets. This is reflected in the data analysis based<br />
on the examples given in which a Gaussian lowpass filter<br />
exp(−βω²) has been used to smooth both u(t) and q(t) to<br />
produce the associated macrotrends in which the value of β<br />
determines the level of detail they contain. From the examples<br />
provided, it is clear that the turning points of the gradients<br />
of a macrotrend in q(t) flag a future change in the trend of<br />
the economic signal u(t). This is compounded in the phase<br />
shifts that exist in the normalised gradients of u(t) and q(t)<br />
over frequency bands determined by the value of β. Although<br />
the interpretation of these phase shifts requires further study,<br />
from the results presented in this paper, it is clear that they<br />
provide an assessment of the risk associated with investing<br />
in a particular economic time series provided the series in<br />
question is a random scaling fractal. The ‘case study’ on the<br />
ABX Close-of-Day indices clearly illustrates the ability of the<br />
model to flag a point in time after which the indices change<br />
rapidly. The ABX indices exhibit a clear transition between<br />
a period when q(t) > 2 and when 1 < q(t) < 2 (Figures<br />
9-13) which precedes the ‘collapse’ of the indices in 2008<br />
and thereby the onset of the ‘Credit Crunch’.<br />
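The smoothing step mentioned above, in which a Gaussian lowpass filter exp(−βω²) is applied to a series, can be illustrated with a minimal Python sketch. It assumes that ω is the angular frequency in radians per sample and that the filter is applied by weighting a discrete Fourier transform; the paper's own scaling of ω may differ. The function names, the test series and the value β = 5 are illustrative choices, not taken from the paper.<br />

```python
import cmath
import math

def gaussian_lowpass(signal, beta):
    """Smooth a real sequence by weighting its DFT with exp(-beta * w**2).

    Assumption: w is the angular frequency in radians per sample,
    taken in the range (-pi, pi]; the paper's own scaling may differ.
    """
    n = len(signal)
    # Forward DFT (naive O(n^2), adequate for short illustrative series).
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Apply the Gaussian weight at the signed frequency of each bin.
    filtered = []
    for k, value in enumerate(spectrum):
        signed_k = k if k <= n // 2 else k - n
        w = 2.0 * math.pi * signed_k / n
        filtered.append(value * math.exp(-beta * w * w))
    # Inverse DFT; the weights are even in w, so the result is real.
    return [
        (sum(filtered[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

def total_variation(seq):
    """Sum of absolute increments, used here as a crude measure of detail."""
    return sum(abs(b - a) for a, b in zip(seq, seq[1:]))

# A slow trend plus a fast oscillation standing in for short-term fluctuations.
u = [0.05 * t + math.sin(2.5 * t) for t in range(64)]
trend = gaussian_lowpass(u, beta=5.0)
```

Larger values of β remove more detail. Because the weight at ω = 0 equals 1, the mean of the series is left unchanged by the filter.<br />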
In a statistical sense, q(t) is just another measure that may<br />
or may not be of value to market traders. In comparison<br />
with other statistical measures, this can only be assessed<br />
through its practical application in a live trading environment.<br />
However, in terms of its relationship to a stochastic model<br />
for macroeconomic data, q(t) does provide a measure that<br />
is consistent with the physical principles associated with a<br />
random walk that includes a directional bias, i.e. fractional<br />
Brownian motion. The model considered, and the signal<br />
processing algorithm proposed, have a close association with<br />
re-scaled range analysis for computing the Hurst exponent H<br />
[35]. In this sense, the principal contribution of this paper<br />
has been to consider an approach that is quantified in terms of<br />
a physically significant (but phenomenological) model that<br />
is compounded in a specific (fractional) partial differential<br />
equation. As with other financial time series, their derivatives,<br />
transforms etc., a range of statistical measures can be used<br />
to characterise q(t), examples of which have been provided in<br />
this paper. It should be noted that in all cases studied to date,<br />
the composite normality of the signal q(t) is of type ‘Reject’.<br />
In other words, the statistics of q(t) are non-Gaussian. Further,<br />
assuming that a financial time series is statistically self-affine,<br />
the computation of q(t) can be applied over any time scale<br />
provided there is sufficient data for the computation of q(t)<br />
to be statistically significant. Thus, the results associated with<br />
the Close-of-Day data studied in this paper are, in principle,<br />
applicable to economic time series associated with tick data<br />
over a range of time scales.<br />
APPENDIX I<br />
M-CODE FOR THE ORTHOGONAL LINEAR REGRESSION<br />
ALGORITHM<br />
The following m-code is used to compute the Fourier<br />
dimension q from the power spectrum of a random fractal<br />
signal and is based on the code given in [51].<br />
function x=linortfit(xdata,ydata)<br />
% Orthogonal linear regression of ydata on xdata.<br />
%<br />
% Input arrays are<br />
%<br />
% xdata: 2,3,...,L/2<br />
% ydata: P[2], P[3], ..., P[L/2]<br />
%<br />
% Output value is x which gives the Fourier<br />
% dimension q for input data P[i].<br />
% x(1) is the intercept and x(2) the slope of the fitted line.<br />
%<br />
% Cost: sum of squared perpendicular distances to the line p(1)+p(2)*x<br />
fun=inline('sum((p(1)+p(2)*xdata-ydata).^2)/(1+p(2)^2)',...<br />
'p','xdata','ydata');<br />
% Initial guess [intercept, slope] from an ordinary least squares fit<br />
x0=flipdim(polyfit(xdata,ydata,1),2);<br />
options=optimset('TolX',1e-6,'TolFun',1e-6);<br />
% Simplex search for the minimum of the orthogonal cost<br />
x=fminsearch(fun,x0,options,xdata,ydata);<br />
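For readers without access to MATLAB, the same objective can be minimised in closed form. The following Python sketch is not the m-code above: it replaces the simplex search with the standard closed-form solution for a straight-line orthogonal (total least squares) fit. The function names and the sample data are hypothetical, and the degenerate case in which the sample covariance vanishes is not handled.<br />

```python
import math

def orthogonal_fit(xs, ys):
    """Return (intercept, slope) minimising the summed squared perpendicular distances."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0.0:
        # Horizontal, vertical or undetermined fit: outside the scope of this sketch.
        raise ValueError("this sketch does not handle sxy == 0")
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return my - slope * mx, slope

def orthogonal_cost(p, xs, ys):
    """Same objective as the m-code: sum((p0 + p1*x - y)^2) / (1 + p1^2)."""
    return sum((p[0] + p[1] * x - y) ** 2 for x, y in zip(xs, ys)) / (1.0 + p[1] ** 2)

# Hypothetical data scattered about a straight line.
xs = [2.0, 3.0, 4.0, 5.0, 6.0]
ys = [5.1, 6.9, 9.2, 10.8, 13.1]
intercept, slope = orthogonal_fit(xs, ys)
```

The closed form avoids the need for an initial guess and tolerances, at the cost of being specific to a straight-line model.<br />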
APPENDIX II<br />
VARIATION DIMINISHING SMOOTHING KERNELS<br />
Variation Diminishing Smoothing Kernels (VDSK) are convolution<br />
kernels with properties that guarantee smoothness and<br />
thereby eliminate Gibbs’ effect around points of discontinuity<br />
of a given function. Further, the smoothed function can be<br />
shown to be made up of a similar succession of concave or<br />
convex arcs equal in number to those of the function. Thus, we<br />
consider the following question: let there be given a continuous<br />
or discontinuous function f whose graph is composed of a<br />
succession of alternating concave or convex arcs. Is there<br />
a smoothing kernel (or a set of them) which produces a<br />
smoothed function whose graph is also made up of a similar<br />
succession of concave or convex arcs equal in number to those<br />
of f? [footnote 1]<br />
II.1 Laguerre-Pólya Class Entire Functions<br />
The kernels which relate to this question form a class<br />
of entire functions, which shall be called class E, originally<br />
studied by E Laguerre and G Pólya. An entire function<br />
E(z), z ∈ C belongs to the class E<br />
⇐⇒<br />
$$E(z) = \exp(bz - cz^2) \prod_{\ell=1}^{\infty} \left(1 - \frac{z}{a(\ell)}\right) \exp[z/a(\ell)], \tag{II.1.1}$$<br />
where b, c, a(ℓ) ∈ R, c ≥ 0, and<br />
$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty, \tag{II.1.2}$$<br />
where ⇐⇒ is taken to denote ‘if and only if’ (iff). The convergence<br />
of the series (II.1.2) guarantees that the product in<br />
(II.1.1) converges and represents an entire function. Laguerre<br />
proved, and Pólya added a refinement, that if a sequence of<br />
polynomials having real roots only converges uniformly<br />
on every compact subset of the complex plane C, then its<br />
limit is a function of class E.<br />
For example,<br />
$$\exp(-z^2) = \lim_{\ell \to \infty} \left(1 - \frac{z^2}{\ell^2}\right)^{\ell^2},$$<br />
and the polynomials (1 − z²/ℓ²) have real roots only. In this<br />
definition, it is not assumed that the a(ℓ) are distinct. To<br />
include the case in which the product has a finite number<br />
of factors or reduces to 1 without additional notation, it<br />
is assumed that some or all of the a(ℓ) may be ∞.<br />
Furthermore, it is assumed, without loss of generality, that<br />
the roots a(ℓ) are arranged in order of increasing absolute<br />
value,<br />
$$0 < |a(1)| \le |a(2)| \le |a(3)| \le \dots$$<br />
1 Based on an edited version of material developed by A Domingez-Torres,<br />
‘Fourier Based Method in CAD’, PhD Thesis, Cranfield University, 1991<br />
Examples of functions belonging to class E are<br />
$$1, \quad 1 - z, \quad \exp(z), \quad \exp(-z^2), \quad \cos z, \quad \frac{\sin z}{z}, \quad \Gamma^{-1}(1 - z), \quad \Gamma^{-1}(z).$$<br />
Note that the product of two functions of this class produces a<br />
new function of the same class.<br />
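The limit quoted in this section, in which exp(−z²) is approached by the polynomials (1 − z²/ℓ²) raised to the power ℓ², can be checked numerically. The following Python sketch evaluates the approximation at one real point; the evaluation point z = 0.8 and the chosen values of ℓ are arbitrary illustrative choices.<br />

```python
import math

def polynomial_approximation(z, ell):
    """Evaluate (1 - z**2 / ell**2) ** (ell**2), whose only roots are z = +ell and z = -ell."""
    return (1.0 - z * z / (ell * ell)) ** (ell * ell)

z = 0.8
target = math.exp(-z * z)
# Absolute error of the polynomial approximation for increasing ell.
errors = [abs(polynomial_approximation(z, ell) - target) for ell in (2, 8, 32, 128)]
```

The errors shrink as ℓ grows, which is consistent with exp(−z²) being a limit of polynomials with real roots only.<br />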
II.2 Variation Diminishing Smoothing Kernels (VDSKs)<br />
A function k is variation diminishing iff it is of the form<br />
$$k(x) = (2\pi i)^{-1} \int_{-i\infty}^{i\infty} [E(z)]^{-1} \exp(zx)\, dz, \tag{II.2.1}$$<br />
where E(z) ∈ E is given by<br />
$$E(z) = \exp(bz - cz^2) \prod_{\ell=1}^{\infty} \left(1 - \frac{z}{a(\ell)}\right) \exp[z/a(\ell)], \tag{II.2.2}$$<br />
with b, c, a(ℓ) ∈ R, c ≥ 0, and<br />
$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$<br />
In other words, a frequency function k is variation diminishing<br />
iff its bilateral Laplace transform equals $[E(z)]^{-1}$:<br />
$$[E(z)]^{-1} = \int_{-\infty}^{\infty} k(x) \exp(-zx)\, dx. \tag{II.2.3}$$<br />
In order to define a smoothing kernel, the function k given in<br />
(II.2.1) must be an even function. If k(x) is even, then<br />
the corresponding bilateral Laplace transform $[E(z)]^{-1}$ is also<br />
even. This fact follows readily from<br />
$$[E(z)]^{-1} = \int_{-\infty}^{\infty} k(x) \exp(-zx)\, dx = \int_{-\infty}^{\infty} k(-x) \exp(-zx)\, dx = \int_{-\infty}^{\infty} k(x) \exp(zx)\, dx = [E(-z)]^{-1}.$$<br />
Conversely, if $[E(z)]^{-1}$ is even, then its inverse bilateral<br />
transform is even since a component of convergence of (II.2.3)<br />
contains the imaginary axis. This follows from the fact that<br />
the component of convergence of each one of the functions<br />
which compose E(z) contains completely the imaginary axis.<br />
Further, it follows that<br />
$$[E(iu)]^{-1} = K(u), \tag{II.2.4}$$<br />
where K(u) is the FT of k. From the evenness of $[E(z)]^{-1}$<br />
it follows that K(u) is real, hence k is even. But E(z) is even
iff b = 0 and a(2ℓ − 1) = −a(2ℓ), ℓ = 1, 2, . . . . Therefore<br />
E(z) is taken to be<br />
$$E(z) = \exp(-cz^2) \prod_{\ell=1}^{\infty} \left(1 - \frac{z^2}{a^2(\ell)}\right), \tag{II.2.5}$$<br />
with c, a(ℓ) ∈ R, c ≥ 0, and<br />
$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$<br />
Equation (II.2.4) establishes the relationship between the<br />
bilateral Laplace transform and the Fourier transform of k.<br />
Thus, any analysis associated with use of the bilateral Laplace<br />
transform can be undertaken in terms of the Fourier transform.<br />
Using equation (II.2.4) the Fourier transform of (II.2.1) is<br />
given by<br />
$$k(x) \leftrightarrow K(u) = [E(iu)]^{-1} = \exp(-cu^2) \prod_{\ell=1}^{\infty} \left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right] \tag{II.2.6}$$<br />
where ↔ denotes transformation from real to Fourier space,<br />
c, a(ℓ) ∈ R, c ≥ 0, and<br />
$$\sum_{\ell=1}^{\infty} a^{-2}(\ell) < \infty.$$<br />
Because equation (II.2.6) is a variation diminishing function<br />
by construction and |K(0)| ≤ 1, the following result<br />
holds.<br />
Theorem II.2.1 (VDSKs)<br />
k defined as in equation (II.2.6)<br />
=⇒<br />
1. k is a smoothing kernel belonging to SK1,<br />
2. k is variation diminishing,<br />
3. k(x) ≥ 0, x ∈ R.<br />
In order to make a complete study of the VDSKs, such<br />
kernels will be divided into three classes: The Finite VDSKs,<br />
The Non-Finite VDSKs, and The Gaussian VDSK.<br />
II.3 The Finite VDSKs<br />
The finite and the non-finite VDSKs are kernels which can<br />
be synthesized from the following basic function:<br />
$$e(x) = \frac{1}{2} \exp(-|x|), \quad x \in \mathbb{R}. \tag{II.3.1}$$<br />
The finite VDSKs are made up by a finite number of convolutions<br />
of functions a(ℓ) e[a(ℓ)x], ℓ = 1, 2, . . . . Clearly e(x)<br />
is a VDSK with mean ν = 0 and variance σ² = 2 and its<br />
Fourier transform is given by<br />
$$e(x) \leftrightarrow \frac{1}{1 + u^2}. \tag{II.3.2}$$<br />
Note that if a > 0, then a e(ax) is again a VDSK. Using<br />
the similarity property of the Fourier transform and equation<br />
(II.3.2), its Fourier transform is given by<br />
$$a\, e(ax) \leftrightarrow \frac{a^2}{a^2 + u^2}. \tag{II.3.3}$$<br />
Its mean ν again vanishes and its variance takes the value<br />
σ² = 2/a².<br />
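The transform pair for a e(ax) and the variance 2/a² stated above can be checked by direct numerical integration. The following Python sketch uses a composite trapezoidal rule on a truncated interval; the values a = 1.5 and u = 2, the truncation limit and the step count are illustrative assumptions.<br />

```python
import math

def scaled_kernel(x, a):
    """a * e(a x) with e(x) = exp(-|x|) / 2, as in (II.3.1)."""
    return 0.5 * a * math.exp(-a * abs(x))

def integrate(func, lower, upper, steps):
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    total += sum(func(lower + i * h) for i in range(1, steps))
    return total * h

a, u = 1.5, 2.0
limit, steps = 40.0, 40000  # the kernel is negligible beyond |x| = 40 for a = 1.5

# The kernel is even, so its Fourier transform reduces to a cosine integral.
transform = integrate(lambda x: scaled_kernel(x, a) * math.cos(u * x), -limit, limit, steps)
# The mean is zero, so the variance is the second moment.
variance = integrate(lambda x: x * x * scaled_kernel(x, a), -limit, limit, steps)
```

With these settings, transform should agree with a²/(a² + u²) and variance with 2/a² to within the quadrature error.<br />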
Let a(1), a(2), . . . , a(n) > 0 be constants, some or all<br />
of which may be coincident. The following VDSKs are<br />
introduced<br />
$$k_\ell(x) = a(\ell)\, e[a(\ell)x], \quad \ell = 1, 2, \dots, n. \tag{II.3.4}$$<br />
The combination of these functions by convolution gives a new<br />
VDSK with properties quantified in the following theorem.<br />
Theorem II.3.1 (Properties of The Finite VDSKs)<br />
1. a(ℓ) > 0, ℓ = 1, 2, . . . ,<br />
2. $k_\ell(x) = a(\ell)\, e[a(\ell)x]$,<br />
3. $k = k_1 \otimes k_2 \otimes \cdots \otimes k_n$,<br />
4. $K(u) = \prod_{\ell=1}^{n} \left[ a^2(\ell)/(a^2(\ell) + u^2) \right]$<br />
=⇒<br />
A. k is a VDSK,<br />
B. k(x) ↔ K(u),<br />
C. k has mean ν = 0,<br />
D. k has variance $\sigma^2 = \sum_{\ell=1}^{n} \left[ 2/a^2(\ell) \right] < \infty$.<br />
Proof. A. The assertion follows from mathematical induction.<br />
B. It follows from the Convolution Theorem and mathematical<br />
induction.<br />
C. Let $k_\ell(x) \leftrightarrow K_\ell(u)$. Then because each $k_\ell$ is a VDSK,<br />
it follows that the respective mean, $\nu_\ell$, is given by<br />
$$\nu_\ell = iK_\ell'(0) = 0, \quad \ell = 1, 2, \dots, n.$$<br />
Moreover, if n = 2, then the mean ν of k is given by<br />
$$\nu = iK'(0) = i(K_1K_2)'(0) = i(K_1K_2' + K_1'K_2)(0) = i(0) = 0.$$<br />
The assertion follows from this result and mathematical induction.<br />
D. Let $k_\ell(x) \leftrightarrow K_\ell(u)$. Then because $k_\ell$ is a VDSK, it<br />
follows that the respective variance, $\sigma_\ell^2$, is given by<br />
$$\sigma_\ell^2 = -K_\ell''(0) = \frac{2}{a^2(\ell)}, \quad \ell = 1, 2, \dots, n.$$<br />
Furthermore, from the result given in C above, if n = 2, then<br />
the variance σ² of k is given by<br />
$$\sigma^2 = -K''(0) = -(K_1K_2)''(0) = (-K_1K_2'' - 2K_1'K_2' - K_1''K_2)(0) = \frac{2}{a^2(1)} + \frac{2}{a^2(2)}.$$<br />
The assertion follows from this result and mathematical induction.<br />
From the explicit expression of K(u) given in Theorem<br />
II.3.1 it follows that<br />
$$K(u) = \prod_{\ell=1}^{n} \left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right] = \prod_{\ell=1}^{n} \left[\frac{a(\ell)}{a(\ell) - iu}\right] \left[\frac{-a(\ell)}{-a(\ell) - iu}\right] = \prod_{\ell=1}^{n} \left[\frac{a(\ell)}{a(\ell) - iu}\right] \prod_{\ell=1}^{n} \left[\frac{-a(\ell)}{-a(\ell) - iu}\right] = \prod_{\ell=1}^{2n} \left[\frac{d(\ell)}{d(\ell) - iu}\right]$$<br />
where d(ℓ) = a(ℓ) for ℓ = 1, 2, . . . , n and d(ℓ) = −a(ℓ − n) for<br />
ℓ = n + 1, n + 2, . . . , 2n. Thus k is of degree 2n and the<br />
following theorem holds.<br />
Theorem II.3.2 (Degree of Differentiability of The Finite<br />
VDSKs)<br />
k a finite VDSK,<br />
=⇒<br />
1. $k \in C^{2n-2}(R, R)$,<br />
2. $k \in C^{2n-1}(R, R)$ except at x = 0, where<br />
$$k^{(2n-1)}(0^+), \quad k^{(2n-1)}(0^-)$$<br />
both exist.<br />
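As an illustration of Theorem II.3.2, the case n = 2 can be worked explicitly. The following is a sketch under the assumption that the two constants are distinct, written a = a(1) and b = a(2) for brevity (these shorthand symbols are not used in the original). It uses a partial fraction expansion of K(u) together with the pair (II.3.3).<br />

```latex
% Worked case n = 2 with a = a(1), b = a(2), a \neq b (illustrative choice).
K(u) = \frac{a^2 b^2}{(a^2+u^2)(b^2+u^2)}
     = \frac{a^2 b^2}{b^2-a^2}\left[\frac{1}{a^2+u^2}-\frac{1}{b^2+u^2}\right],
\qquad
k(x) = \frac{ab}{2(b^2-a^2)}\left[b\,e^{-a|x|} - a\,e^{-b|x|}\right].

% One-sided derivatives at the origin:
k'(0^\pm) = 0, \qquad
k''(0^\pm) = -\frac{a^2 b^2}{2(a+b)}, \qquad
k'''(0^\pm) = \pm\frac{a^2 b^2}{2}.
```

The first and second derivatives are continuous at x = 0 while the third has a jump, so that k belongs to C² = C^{2n−2} and the one-sided limits of the derivative of order 2n − 1 = 3 both exist, in agreement with the theorem.<br />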
The asymptotic behaviour of k and its Fourier transform,<br />
K, will now be studied.<br />
Theorem II.3.3 (Asymptotic Behaviour of The Fourier<br />
transform of The Finite VDSKs)<br />
1. k a finite VDSK,<br />
2. k(x) ↔ K(u)<br />
=⇒<br />
$$|K(u)| = O(|u|^{-2n}), \quad |u| \to \infty.$$<br />
Proof. k is made up of a finite number of convolution operations<br />
of functions $k_\ell(x) = a(\ell)\, e[a(\ell)x]$, where a(ℓ) > 0, ℓ =<br />
1, 2, . . . , n, and whose FTs, $K_\ell(u)$, satisfy the inequality<br />
$$|K_\ell(u)| = \left|\frac{a^2(\ell)}{a^2(\ell) + u^2}\right| \le \frac{a^2(\ell)}{|u|^2}, \quad \ell = 1, 2, \dots, n.$$<br />
Thus<br />
$$|K(u)| = \left|\prod_{\ell=1}^{n} K_\ell(u)\right| \le \prod_{\ell=1}^{n} \left[\frac{a^2(\ell)}{|u|^2}\right] = |u|^{-2n} \prod_{\ell=1}^{n} a^2(\ell). \tag{II.3.5}$$<br />
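The bound (II.3.5) can be illustrated numerically. The following Python sketch compares K(u) for a finite VDSK with the right-hand side of the bound at a few sample frequencies; the constants a(ℓ) and the sample values of u are arbitrary illustrative choices.<br />

```python
import math

def finite_vdsk_transform(u, a_values):
    """K(u) = prod a^2 / (a^2 + u^2) for a finite VDSK, as in Theorem II.3.1."""
    result = 1.0
    for a in a_values:
        result *= a * a / (a * a + u * u)
    return result

def decay_bound(u, a_values):
    """Right-hand side of (II.3.5): |u|^(-2n) * prod a^2."""
    return abs(u) ** (-2 * len(a_values)) * math.prod(a * a for a in a_values)

a_values = [0.7, 1.0, 2.5]  # arbitrary positive constants chosen for illustration
samples = [0.5, 1.0, 5.0, 50.0, 500.0]
# Ratio of K(u) to its bound; it should never exceed 1.
ratios = [finite_vdsk_transform(u, a_values) / decay_bound(u, a_values) for u in samples]
```

The ratio stays below 1 and approaches 1 as |u| grows, which indicates that the decay rate |u|^(−2n) is attained asymptotically.<br />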
From the above theorem we construct the following corollaries.<br />
Corollary II.3.4 (Absolute and Quadratic Integrability<br />
of The Fourier transform of The Finite VDSKs)<br />
1. k a finite VDSK,<br />
2. k(x) ↔ K(u)<br />
=⇒<br />
$K(u) \in L(R, R) \cap L^2(R, R)$.<br />
Corollary II.3.5 (Absolute and Quadratic Integrability<br />
of The Finite VDSKs)<br />
k a finite VDSK,<br />
=⇒<br />
$k(x) \in L(R, R) \cap L^2(R, R)$.<br />
The Fourier transform of K(u), which is itself the Fourier<br />
transform of k, is given by<br />
$$K(u) \leftrightarrow 2\pi k(-x).$$<br />
Since k is an even function,<br />
$$K(u) \leftrightarrow 2\pi k(x).$$<br />
This result, in conjunction with Corollary II.3.4 and the<br />
Riemann-Lebesgue Lemma, proves the following theorem.<br />
Theorem II.3.6 (Asymptotic Behaviour of The Finite<br />
VDSKs)<br />
k a finite VDSK<br />
=⇒<br />
k(x) → 0 as |x| → ∞.<br />
II.4 The Non-Finite VDSKs<br />
We now study kernels k with the property<br />
$$k(x) \leftrightarrow K(u) = \prod_{\ell=1}^{\infty} \left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right] \tag{II.4.1}$$<br />
which are non-finite kernels. In particular, the infinite product<br />
in equation (II.4.1) may have only a finite number of factors,<br />
so that the finite VDSKs of the last section are included.<br />
Kernels satisfying equation (II.4.1) can be synthesized from the<br />
basic kernel<br />
$$e(x) = \frac{1}{2} \exp(-|x|), \quad x \in \mathbb{R}.$$<br />
The non-finite VDSKs are composed of a non-finite number<br />
of functions a(ℓ) e[a(ℓ)x], ℓ = 1, 2, . . . . The properties of such<br />
kernels are given in the following theorem.<br />
Theorem II.4.1 (Properties of The Non-Finite VDSKs)<br />
1. a(ℓ) > 0, ℓ = 1, 2, . . . ,<br />
2. $k_\ell(x) = a(\ell)\, e[a(\ell)x]$,<br />
3. $k = k_1 \otimes k_2 \otimes \cdots \otimes k_n \otimes \cdots$,<br />
4. $K(u) = \prod_{\ell=1}^{\infty} \left[ a^2(\ell)/(a^2(\ell) + u^2) \right]$<br />
=⇒<br />
A. k is a VDSK,<br />
B. k(x) ↔ K(u),<br />
C. k has mean ν = 0,<br />
D. k has variance $\sigma^2 = \sum_{\ell=1}^{\infty} \left[ 2/a^2(\ell) \right] < \infty$.<br />
Since k (Theorem II.4.1) is made up by a non-finite number<br />
of convolution operations, it is of degree infinity, which<br />
leads to the following.<br />
Theorem II.4.2 (Degree of Differentiability of The Non-<br />
Finite VDSKs)<br />
k a non-finite VDSK<br />
=⇒<br />
$k \in C^{\infty}(R, R)$.<br />
The asymptotic behaviour of the Fourier transform of a non-finite<br />
kernel is established in the following theorem.<br />
Theorem II.4.3 (Asymptotic Behaviour of The Fourier<br />
transform of The Non-Finite VDSKs)<br />
1. k a non-finite VDSK,<br />
2. k(x) ↔ K(u),<br />
3. R, p > 0<br />
=⇒<br />
$$|K(u)| = O(|u|^{-2p}), \quad |u| \to \infty.$$<br />
Proof. Choose N > p and so large that |a(ℓ)| ≥ R when<br />
ℓ > N, which is possible since |a(ℓ)| → ∞ as ℓ → ∞. Set<br />
$$K_N(u) = \prod_{\ell=N+1}^{\infty} \left[\frac{a^2(\ell)}{a^2(\ell) + u^2}\right].$$<br />
By equation (II.3.5), it follows that<br />
$$|K(u)| \le \frac{|K_N(u)|}{|u|^{2N}} \prod_{\ell=1}^{N} a^2(\ell).$$<br />
Because every factor of $K_N(u)$ lies in the interval (0, 1],<br />
$|K_N(u)|$ is bounded above by 1 for all u ∈ R. Hence, for a suitable<br />
constant M,<br />
$$|K(u)| \le \frac{M}{|u|^{2N}},$$<br />
and, since N > p, the stated estimate follows as |u| → ∞.<br />
In particular, if p = 1 in the above theorem and because k is a<br />
variation diminishing function, the following corollary results.<br />
Corollary II.4.4 (Absolute Integrability of The Non-Finite<br />
Kernels and Their FT)<br />
1. k a non-finite VDSK,<br />
2. k(x) ↔ K(u)<br />
=⇒<br />
k, K ∈ L(R, R).<br />
Application of the symmetry property of the Fourier transform,<br />
the Riemann-Lebesgue Lemma and the above corollary<br />
proves the following theorem.<br />
Theorem II.4.5 (Asymptotic Behaviour of The Non-Finite<br />
VDSKs)<br />
k a non-finite VDSK<br />
=⇒<br />
k(x) → 0 as |x| → ∞.<br />
Some examples of non-finite VDSKs are:<br />
$$\frac{\pi}{4}\, \mathrm{sech}^2\left(\frac{\pi x}{2}\right) \leftrightarrow u\, \mathrm{csch}\, u = \prod_{\ell=1}^{\infty} \left[\frac{\ell^2\pi^2}{\ell^2\pi^2 + u^2}\right], \tag{II.4.2}$$<br />
$$\frac{1}{2}\, \mathrm{sech}\left(\frac{\pi x}{2}\right) \leftrightarrow \mathrm{sech}\, u = \prod_{\ell=1}^{\infty} \left[\frac{(2\ell-1)^2\pi^2}{(2\ell-1)^2\pi^2 + 4u^2}\right]. \tag{II.4.3}$$<br />
Note that a non-finite VDSK does not necessarily belong to<br />
$L^2(R, R)$, e.g. the kernel given by equation (II.4.3).<br />
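The first example can be checked numerically. With a(ℓ) = ℓπ, the infinite product should reproduce u csch u, and the variance sum of Theorem II.4.1 D becomes 2/(ℓπ)² summed over ℓ, which tends to 1/3 because the sum of 1/ℓ² is π²/6. The following Python sketch truncates both the product and the sum; the truncation length and the test frequency u = 1.3 are illustrative assumptions.<br />

```python
import math

def truncated_product(u, terms):
    """Partial product of l^2 pi^2 / (l^2 pi^2 + u^2) for l = 1..terms."""
    result = 1.0
    for ell in range(1, terms + 1):
        a_squared = (ell * math.pi) ** 2
        result *= a_squared / (a_squared + u * u)
    return result

def truncated_variance(terms):
    """Partial sum of 2 / a(l)^2 with a(l) = l * pi, as in Theorem II.4.1 D."""
    return sum(2.0 / (ell * math.pi) ** 2 for ell in range(1, terms + 1))

u = 1.3
closed_form = u / math.sinh(u)                 # u csch u
product_value = truncated_product(u, 100000)   # truncated form of (II.4.2)
variance_value = truncated_variance(100000)    # tends to 1/3 as terms grows
```

Both truncations converge slowly, with an error of order 1/terms, so a large number of terms is needed for a few digits of agreement.<br />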
II.5 The Gaussian VDSK<br />
The Gaussian VDSK, k, is defined by the relation<br />
$$k(x) \leftrightarrow K(u) = \exp(-cu^2), \quad c > 0. \tag{II.5.1}$$<br />
With c → 1/4c², the Gaussian VDSK is now defined as<br />
$$k(x) \leftrightarrow K(u) = \exp(-u^2/4c^2), \quad c > 0. \tag{II.5.2}$$<br />
The basic properties of the above kernel follow directly and<br />
are collated in the following theorem.<br />
Theorem II.5.1 (Basic Properties of The Gaussian<br />
VDSK)<br />
1. k(x) = c gauss(cx), c > 0,<br />
2. $K(u) = \exp(-u^2/4c^2)$, c > 0,<br />
3. p > 0<br />
=⇒<br />
A. k is a VDSK,<br />
B. k(x) ↔ K(u),<br />
C. k has mean ν = 0,<br />
D. k has variance $\sigma^2 = 1/2c^2$,<br />
E. $k, K \in L(R, R) \cap L^2(R, R)$,<br />
F. $k, K \in C^{\infty}(R, R)$,<br />
G. $|k(x)| = o(|x|^{-p})$,<br />
H. $|K(u)| = o(|u|^{-p})$.<br />
If in equation (II.5.1), c is considered as a variable, say t,<br />
then after taking the inverse Fourier transform with respect to<br />
x we obtain a real valued function of two variables, i.e.<br />
$$k(x, t) = \frac{1}{\sqrt{4\pi t}} \exp(-x^2/4t). \tag{II.5.3}$$<br />
This new function is the familiar source solution of the<br />
diffusion equation<br />
$$\left(\frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial t}\right) k(x, t) = 0. \tag{II.5.4}$$<br />
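That the function in (II.5.3) satisfies the diffusion equation (II.5.4) can be checked numerically. The following Python sketch estimates the residual of the equation by central differences at a few points; the sample points and the step size h are illustrative assumptions, and the function names are hypothetical.<br />

```python
import math

def heat_kernel(x, t):
    """Source solution (II.5.3): exp(-x^2 / 4t) / sqrt(4 pi t), for t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def diffusion_residual(x, t, h=1e-3):
    """Central-difference estimate of k_xx - k_t at the point (x, t)."""
    k_xx = (heat_kernel(x + h, t) - 2.0 * heat_kernel(x, t) + heat_kernel(x - h, t)) / (h * h)
    k_t = (heat_kernel(x, t + h) - heat_kernel(x, t - h)) / (2.0 * h)
    return k_xx - k_t

points = [(0.0, 0.5), (0.7, 1.0), (-1.5, 2.0)]
# Each residual should vanish up to the finite-difference error.
residuals = [diffusion_residual(x, t) for x, t in points]
```

The residuals are not exactly zero because the derivatives are approximated, but they should be small compared with the size of the individual terms.<br />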
II.6 Geometric Properties of The VDSKs<br />
We consider the general geometric properties shared by the<br />
finite, non-finite and the Gaussian VDSKs where k denotes<br />
either a finite, non-finite or Gaussian VDSK throughout.<br />
Theorem II.6.1 (Geometric Properties of The VDSKs)<br />
1. k a VDSK,<br />
2. f : R → R bounded and convex (concave)<br />
=⇒<br />
A. For a, b ∈ R<br />
$$V[k(x) \otimes f(x) - a - bx] \le V[f(x) - a - bx], \tag{II.6.1}$$<br />
B. (k ⊗ f)(x) is convex (concave).<br />
Proof. A. Inequality (II.6.1) follows by a direct application<br />
of the variation diminishing property of k.<br />
B. It is well known that f is convex iff<br />
$$\Delta_h^2 f(x) = f(x + 2h) - 2f(x + h) + f(x) \ge 0,$$<br />
for all x ∈ R, h > 0. Because k is a non-negative function,<br />
then<br />
$$\Delta_h^2[(k \otimes f)(x)] = \Delta_h^2\left[\int_{-\infty}^{\infty} k(y) f(x - y)\, dy\right] = \int_{-\infty}^{\infty} k(y)\, \Delta_h^2 f(x - y)\, dy \ge 0.$$<br />
Thus the inequality follows. The case for which f is concave<br />
follows using a similar argument but with $\Delta_h^2 f(x) \le 0$, for all<br />
x ∈ R, h > 0.<br />
The geometric significance of inequality (II.6.1) is that the<br />
number of intersections of the straight line y = a + bx, a, b ∈<br />
R, with (k⊗f)(x) does not exceed the number of intersections<br />
of y = a + bx with y = f(x). As a special instance of such<br />
an inequality, it follows that (k ⊗ f)(x) is non-negative if f<br />
is non-negative.<br />
Corollary II.6.2 (Non-Negativity of k ⊗ f)<br />
1. k a VDSK,<br />
2. f : R → R, f ≥ 0, and bounded<br />
=⇒<br />
(k ⊗ f)(x) ≥ 0, x ∈ R.<br />
From the above results, it is clear that if f is composed of<br />
a succession of alternating convex or concave arcs, then k ⊗ f<br />
is also made up of a similar succession of convex or concave<br />
arcs equal in number to those of f. Thus, a VDSK is shape<br />
preserving.<br />
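The convexity-preserving property can be illustrated in a discrete setting. The following Python sketch is a simplified stand-in for the continuous theory: it smooths samples of the convex function |x| with normalised samples of the basic kernel e(x) and inspects the second differences of the result. The grid spacing, the kernel support and the function names are illustrative assumptions.<br />

```python
import math

def smooth(values, weights):
    """'Valid' discrete convolution: outputs only where the kernel fully overlaps."""
    m = len(weights)
    return [
        sum(weights[j] * values[i + m - 1 - j] for j in range(m))
        for i in range(len(values) - m + 1)
    ]

def second_differences(values):
    """Discrete analogue of f(x + 2h) - 2 f(x + h) + f(x)."""
    return [values[i + 2] - 2.0 * values[i + 1] + values[i] for i in range(len(values) - 2)]

# Samples of the basic kernel e(x) = exp(-|x|)/2 on a grid, normalised to sum to 1.
step = 0.1
offsets = [step * j for j in range(-40, 41)]
raw = [0.5 * math.exp(-abs(x)) for x in offsets]
weights = [w / sum(raw) for w in raw]

# A convex test function with a corner at x = 0.
grid = [step * i for i in range(-100, 101)]
f = [abs(x) for x in grid]
smoothed = smooth(f, weights)
```

Since the weights are non-negative, each second difference of the smoothed samples is a non-negative combination of second differences of f, so the corner at x = 0 is rounded off without introducing any concave arc.<br />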
ACKNOWLEDGMENTS<br />
The ABX data was provided by the Systemic Risk Analysis<br />
Division, Bank of England, who originally commissioned the<br />
research.<br />
REFERENCES<br />
[1] http://www.tickdata.com/<br />
[2] http://www.vhayu.com/<br />
[3] http://en.wikipedia.org/wiki/Louis_Bachelier<br />
[4] http://en.wikipedia.org/wiki/Robert_Brown_(botanist)<br />
[5] T. R. Copeland, J. F. Weston and K. Shastri, Financial Theory and<br />
Corporate Policy, 4th Edition, Pearson Addison Wesley, 2003.<br />
[6] J. D. Martin, S. H. Cox, R. F. McMinn and R. D. Maminn, The Theory of<br />
Finance: Evidence and Applications, International Thomson Publishing,<br />
1997.<br />
[7] R. C. Merton, Continuous-Time Finance, Blackwell Publishers, 1992.<br />
[8] T. J. Watsham and K. Parramore, Quantitative Methods in Finance,<br />
Thomson Business Press, 1996.<br />
[9] E. Fama, The Behavior of Stock Market Prices, Journal of Business Vol.<br />
38, 34-105, 1965.<br />
[10] P. Samuelson, Proof That Properly Anticipated Prices Fluctuate Randomly,<br />
Industrial Management Review Vol. 6, 41-49, 1965.<br />
[11] E. Fama, Efficient Capital Markets: A Review of Theory and Empirical<br />
Work, Journal of Finance Vol. 25, 383-417, 1970.<br />
[12] G. M. Burton, Efficient Market Hypothesis, The New Palgrave: A<br />
Dictionary of Economics, Vol. 2, 120-23, 1987.<br />
[13] F. Black and M. Scholes, The Pricing of Options and Corporate<br />
Liabilities, Journal of Political Economy, Vol. 81(3), 637-659, 1973.<br />
[14] http://uk.finance.yahoo.com/q/hp?s=%5EFTSE<br />
[15] B. B. Mandelbrot and J. R. Wallis, Robustness of the Rescaled Range<br />
R/S in the Measurement of Noncyclic Long Run Statistical Dependence,<br />
Water Resources Research, Vol. 5(5), 967-988, 1969.<br />
[16] B. B. Mandelbrot, Statistical Methodology for Non-periodic Cycles:<br />
From the Covariance to R/S Analysis, Annals of Economic and Social<br />
Measurement, Vol. 1(3), 259-290, 1972.<br />
[17] H. E. Hurst, A Short Account of the Nile Basin, Cairo, Government<br />
Press, 1944.<br />
[18] http://en.wikipedia.org/wiki/Elliott_wave_principle<br />
[19] http://uk.finance.yahoo.com/q/hp?s=%5EFTSE<br />
[20] http://uk.finance.yahoo.com/q/hp?s=%5EDJI<br />
[21] B. B. Mandelbrot, The Fractal Geometry of Nature, Freeman, 1983.<br />
[22] J. Feder, Fractals, Plenum Press, 1988.<br />
[23] K. J. Falconer, Fractal Geometry, Wiley, 1990.<br />
[24] P. Bak, How Nature Works, Oxford University Press, 1997.<br />
[25] N. Lam and L. De Cola, Fractals in Geography, Prentice-Hall, 1993.<br />
[26] H. O. Peitgen and D. Saupe (Eds.), The Science of Fractal Images,<br />
Springer, 1988.<br />
[27] A. J. Lichtenberg and M. A. Lieberman, Regular and Stochastic Motion:<br />
Applied Mathematical Sciences, Springer-Verlag, 1983.<br />
[28] J. J. Murphy, Intermarket Technical Analysis: Trading Strategies for the<br />
Global Stock, Bond, Commodity and Currency Market, Wiley Finance<br />
Editions, Wiley, 1991.<br />
[29] J. J. Murphy, Technical Analysis of the Futures Markets: A Comprehensive<br />
Guide to Trading Methods and Applications, New York Institute<br />
of Finance, Prentice-Hall, 1999.<br />
[30] T. R. DeMark, The New Science of Technical Analysis, Wiley, 1994.<br />
[31] J. O. Matthews, K. I. Hopcraft, E. Jakeman and G. B. Siviour, Accuracy<br />
Analysis of Measurements on a Stable Power-law Distributed Series of<br />
Events, J. Phys. A: Math. Gen. 39, 13967-13982, 2006.<br />
[32] W. H. Lee, K. I. Hopcraft, and E. Jakeman, Continuous and Discrete<br />
Stable Processes, Phys. Rev. E 77, American Physical Society, 011109,<br />
1-4.<br />
[33] A. Einstein, On the Motion of Small Particles Suspended in Liquids at<br />
Rest Required by the Molecular-Kinetic Theory of Heat, Annalen der<br />
Physik, Vol. 17, 549-560, 1905.<br />
[34] J. M. Blackledge, G. A. Evans and P. Yardley, Analytical Solutions to<br />
Partial Differential Equations, Springer, 1999.<br />
[35] H. Hurst, Long-term Storage Capacity of Reservoirs, Transactions of<br />
American Society of Civil Engineers, Vol. 116, 770-808, 1951.<br />
[36] M. F. Shlesinger, G. M. Zaslavsky and U. Frisch (Eds.), Lévy Flights<br />
and Related Topics in Physics, Springer 1994.<br />
[37] R. Hilfer, Foundations of Fractional Dynamics, Fractals Vol. 3(3), 549-<br />
556, 1995.<br />
[38] A. Compte, Stochastic Foundations of Fractional Dynamics, Phys. Rev<br />
E, Vol. 53(4), 4191-4193, 1996.<br />
[39] P. M. Morse and H. Feshbach, Methods of Theoretical Physics, McGraw-<br />
Hill, 1953.<br />
[40] G. F. Roach, Green’s Functions (Introductory Theory with Applications),<br />
Van Nostrand Reinhold, 1970.<br />
[41] T. F. Nonnenmacher, Fractional Integral and Differential Equations for<br />
a Class of Lévy-type Probability Densities, J. Phys. A: Math. Gen. Vol.<br />
23, L697S-L700S, 1990.<br />
[42] R. Hilfer, Exact Solutions for a Class of Fractal Time Random Walks,<br />
Fractals, Vol. 3(1), 211-216, 1995.<br />
[43] R. Hilfer and L. Anton, Fractional Master Equations and Fractal Time<br />
Random Walks, Phys. Rev. E, Vol. 51(2), R848-R851, 1995.<br />
[44] M. J. Turner, J. M. Blackledge and P. Andrews, Fractal Geometry in<br />
Digital Imaging, Academic Press, 1997.<br />
[45] M. F. Shlesinger, G. M. Zaslavsky and U. Frisch (Eds.), Lévy Flights<br />
and Related Topics in Physics, Springer 1994.<br />
[46] S. Abe and S. Thurner, Anomalous Diffusion in View of Einstein’s 1905<br />
Theory of Brownian Motion, Physica A(356) 403-407, Elsevier 2005.<br />
[47] I. Lvova, Application of Statistical Fractional Methods for the Analysis<br />
of Time Series of Currency Exchange Rates, PhD Thesis, De Montfort<br />
University, 2006.<br />
[48] C. R. Rao, Linear Statistical Inference and its Applications, Wiley, 1973.<br />
[49] http://webscripts.softpedia.com/script/Scientific-Engineering-Ruby/<br />
Mathematics/Orthogonal-Linear-Regression-33745.html<br />
[50] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, ISBN:<br />
0-12-466606-X, 1999.<br />
[51] http://www.mathworks.com/matlabcentral/fileexchange/<br />
loadFile.do?objectId=6716&objectType=File<br />
[52] http://www.markit.com/en/home.page.<br />
Jonathan Blackledge graduated in physics from<br />
Imperial College in 1980. He gained a PhD in theoretical<br />
physics from London University in 1984 <strong>and</strong><br />
was then appointed a Research Fellow of Physics<br />
at Kings College, London, from 1984 to 1988,<br />
specializing in inverse problems in electromagnetism<br />
<strong>and</strong> acoustics. During this period, he worked on<br />
a number of industrial research contracts undertaking<br />
theoretical <strong>and</strong> computational research into<br />
the applications of inverse scattering theory for the<br />
analysis of signals <strong>and</strong> images. In 1988, he joined<br />
the Applied Mathematics <strong>and</strong> Computing Group at Cranfield University as<br />
Lecturer <strong>and</strong> later, as Senior Lecturer <strong>and</strong> Head of Group where he promoted<br />
postgraduate teaching <strong>and</strong> research in applied <strong>and</strong> engineering mathematics<br />
in areas which included computer aided engineering, digital signal processing<br />
<strong>and</strong> computer graphics. While at Cranfield, he co-founded Management <strong>and</strong><br />
Personnel Services Limited through the Cranfield Business School which<br />
was originally established for the promotion of management consultancy<br />
working in partnership with the Chamber of Commerce. He managed the<br />
growth of the company from 1993 to 2007 to include the delivery of a<br />
range of National Vocational Qualifications, primarily through the City <strong>and</strong><br />
Guilds London Institute, including engineering, ICT, business administration<br />
<strong>and</strong> management. In 1994, Jonathan Blackledge was appointed Professor of<br />
Applied Mathematics <strong>and</strong> Head of the Department of Mathematical Sciences<br />
at De Montfort University where he exp<strong>and</strong>ed the post-graduate <strong>and</strong> research<br />
portfolio of the Department <strong>and</strong> established the Institute of Simulation<br />
Sciences. From 2002-2008 he was appointed Visiting Professor of Information<br />
<strong>and</strong> Communications Technology in the Advanced Signal Processing Research<br />
Group, Department of Electronics <strong>and</strong> Electrical Engineering at Loughborough<br />
University, England (a group which he co-founded in 2003 as part<br />
of his appointment). In 2004 he was appointed Professor Extraordinaire of<br />
Computer Science in the Department of Computer Science at the University<br />
of the Western Cape, South Africa. His principal roles at these institutes<br />
include the supervision of MSc <strong>and</strong> MPhil/PhD students <strong>and</strong> the delivery<br />
of specialist short courses for their Continuous Professional Development<br />
programmes. He currently holds the prestigious Stokes Professorship funded<br />
by the Science Foundation Ireland at Dublin Institute of Technology and<br />
is Distinguished Professor in the Centre for Advanced Studies at Warsaw<br />
University of Technology.
95 ISAST Transactions on <strong>Computers</strong> <strong>and</strong> <strong>Intelligent</strong> <strong>Systems</strong>, No. 1, Vol. 2, 2010 (ISSN 1798-2448)<br />
An Optical Machine Vision System for<br />
Applications in Cytopathology<br />
Jonathan M Blackledge, Fellow, IET <strong>and</strong> Dmitry A Dubovitskiy, Member, IET<br />
Abstract— This paper discusses a new approach to the processes<br />
of object detection, recognition <strong>and</strong> classification in a<br />
digital image, focusing on problems in Cytopathology. A unique self<br />
learning procedure is presented in order to incorporate expert<br />
knowledge. The classification method is based on the application<br />
of a set of features which includes fractal parameters such as the<br />
Lacunarity <strong>and</strong> Fourier dimension. Thus, the approach includes<br />
the characterisation of an object in terms of its fractal properties<br />
<strong>and</strong> texture characteristics. The principal issues associated with<br />
object recognition are presented which include the basic model<br />
<strong>and</strong> segmentation algorithms. The self-learning procedure for<br />
designing a decision making engine using fuzzy logic <strong>and</strong> membership<br />
function theory is also presented <strong>and</strong> a novel technique<br />
for the creation <strong>and</strong> extraction of information from a membership<br />
function considered. The methods discussed <strong>and</strong> the algorithms<br />
developed have a range of applications and, in this work, we<br />
focus on the engineering of a system for automating a Papanicolaou<br />
screening test.<br />
Index Terms— Computer vision, Segmentation, Object recognition,<br />
Contour detection, Edge detection, Decision making,<br />
Self-learning, Fuzzy logic, Image morphology, Cytopathology,<br />
Cervical smear analysis, Papanicolaou screening test.<br />
I. INTRODUCTION<br />
THE cervix is an important site for pathological studies,<br />
particularly in women of reproductive age. It protects the<br />
uterine cavity from intrusion of pathogenic micro-organisms,<br />
promotes the movement of spermatozoa to the ovule and holds<br />
a fetus in the uterus during pregnancy. The conventional study<br />
of cellular structures on stained glass slides for cytological<br />
reporting is a routine procedure for the early detection of<br />
pre-carcinoma conditions. Visual inspection allows an estimate<br />
to be made of the state of the cervix <strong>and</strong> a diagnosis to be<br />
developed based on the cytological pattern observed providing<br />
an adequate specimen is available. Worldwide, approximately<br />
471,000 women are diagnosed with invasive carcinoma of<br />
the cervix each year and of the order of 233,000 die from the<br />
disease. Although mortality from cervical cancer continues to<br />
decrease due to improved screening programmes, it remains<br />
among the most common female cancers in many countries.<br />
For example, in the United Kingdom, it is ranked eleventh<br />
among cancers in women. Sexually transmitted infection by certain strains<br />
of the human papilloma virus is the major cause of the<br />
condition.<br />
Manuscript completed in December, 2009. The work reported in this paper<br />
is supported by the Science Foundation Ireland.<br />
Jonathan Blackledge (email: jonathan.blackledge@dit.ie) is SFI (Science<br />
Foundation Ireland) Stokes Professor, School of Electrical Engineering Systems,<br />
Faculty of Engineering, Dublin Institute of Technology, Kevin Street,<br />
Dublin 8, Ireland - http://eleceng.dit.ie/blackledge. Dr Dmitry Dubovitskiy is<br />
Director of Oxford Recognition Limited (email: dda@oxreco.com).<br />
A. Papanicolaou Screening<br />
Cervical cancer is preceded by a precancerous condition<br />
called Cervical Intraepithelial Neoplasia (CIN) which can be<br />
easily treated if detected. It is therefore important to identify<br />
CINs through a Papanicolaou screening test commonly known<br />
as a ‘PAP test’. A small sample of cells from the surface of<br />
the cervix is removed <strong>and</strong> smeared onto a glass slide <strong>and</strong><br />
the material is fixed in alcohol. The slide is then stained<br />
and the sample(s) examined under a microscope, a search<br />
being carried out to detect abnormal cells. Examination typically<br />
involves observing the nucleus of a cell <strong>and</strong> inspecting it<br />
for characteristics that point toward abnormalities that include<br />
size, texture <strong>and</strong> colour. For example, if the nucleus is enlarged<br />
relative to the area of the cytoplasm as shown in the example<br />
given in Figure 1 then there is a likelihood of abnormal activity<br />
within the nucleus.<br />
Fig. 1. Example of normal (left) and abnormal cell clusters (right) where,<br />
in the latter case, the nucleus-to-cytoplasm area ratio is enlarged.<br />
Of the order of four million cervical smears are taken annually<br />
in the UK and fifty million in the USA, for example, and a principal<br />
diagnostic problem is that about one fifth of the borderline<br />
preparations show the disease at an advanced stage on referral<br />
and biopsy. Overall, there is a 50% ‘failure’ rate in detecting<br />
significant diseases within borderline cases. In addition, there<br />
is a 50% ‘failure’ rate in detecting significant diseases within<br />
negative cases. The reasons for this range from the extraction of<br />
the sample and the preparation of the slide to, most of all,<br />
human error during the sequential reading of a slide in the<br />
diagnostic laboratory.<br />
In current practice worldwide, a diagnosis is performed<br />
manually. It typically takes 8-10 minutes for a cytopathologist<br />
to screen a slide and involves up to 300 movements of a<br />
microscope over the slide. This approach not only takes time<br />
but also cannot guarantee consistent and accurate results;<br />
many borderline results are generated, for example. It is therefore of significant
value if accurate image analysis <strong>and</strong> object recognition techniques<br />
can be developed in an attempt to automate the process<br />
<strong>and</strong> produce a system that provides a reliable, consistent <strong>and</strong><br />
quantitative estimation of CINs <strong>and</strong> other abnormalities to<br />
improve upon the subjective assessments of a cytopathologist.<br />
A typical screening session involves a cytopathologist<br />
analysing a slide under the microscope with a magnification<br />
up to 400x. The output is related to the number of slides<br />
and working hours per cytologist, and an increase in either<br />
reduces the speed and reliability of the results. Telecytology<br />
[5] provides a large number of digital images for consideration,<br />
which can lead to increased human error. Moreover, in<br />
telecytology the cytopathologist is not usually able to examine<br />
cellular details and to change the focal plane of the image.<br />
In virtual microscopy a digital image of the entire slide is<br />
generated and consequently the image file can become very<br />
large (approximately 4-7 Gb). Another problem with virtual microscopy is<br />
that the focal plane limits the representation of the specimen.<br />
Virtual microscopy is used for proficiency tests <strong>and</strong> there are a<br />
number of commercially available medical imaging assistant<br />
tools [11], [12], [13]. However, a cytopathologist is still an<br />
important factor in the ‘diagnostic cycle’. Furthermore, due<br />
to compression <strong>and</strong>/or differences in the focal depth, many<br />
images may not provide a clear enough representation of a<br />
cell in comparison to those obtained using conventional microscopy.<br />
Thus, the development of automated recognition <strong>and</strong><br />
classification systems provides the potential for introducing<br />
quality control in national screening procedures.<br />
B. Image Analysis <strong>and</strong> Pattern Recognition<br />
Conventional microscopy, as applied to cytopathology, involves<br />
the use of image processing methods that are often<br />
designed in an attempt to provide a machine interpretation<br />
of an image, ideally in a form that allows some decision<br />
criterion to be applied, such that a pattern <strong>and</strong>/or object can<br />
be recognised [1], [2]. Pattern recognition uses a range of<br />
different approaches that are not necessarily based on any<br />
one particular theme or unified theoretical approach. The main<br />
problem is that, to date, there is no complete theoretical model<br />
for simulating the processes that take place when a human<br />
interprets an image generated by the eye, i.e. there is no<br />
fully compatible model, currently available, for explaining the<br />
processes of visual image comprehension. Hence, machine or<br />
computer vision remains a rather elusive subject area in which<br />
automatic inspection systems are advanced without having a<br />
fully operational theoretical framework as a guide. Nevertheless,<br />
numerous algorithms for interpreting two- and three-dimensional<br />
objects in a digital image have been, and continue to be,<br />
researched in order to design systems that can provide reliable<br />
automatic object detection <strong>and</strong> recognition in an independent<br />
environment, e.g. [3], [4], [14], [16], [25].<br />
Vision can be thought of as a process of linking parts of<br />
the visual field (objects) with stored information or templates<br />
about their significance for the observer. There are a number of<br />
questions concerning vision such as: (i) what are the goals <strong>and</strong><br />
constraints? (ii) what type of algorithm or set of algorithms<br />
is required to effect vision? (iii) what are the implications<br />
for the process given the types of hardware that might be<br />
available? (iv) what are the levels of representation required<br />
to achieve vision? The levels of representation are dependent<br />
on what type of segmentation can <strong>and</strong>/or should be applied<br />
to an image. For example, we may be able to produce a<br />
primal sketch from an image via some measure of the intensity<br />
changes in a scene which are recorded as place tokens <strong>and</strong><br />
stored in a database. This allows sets of raw components<br />
to be generated, e.g. regions of pixels with similar intensity<br />
values or sets of lines obtained by isolating the edges of an<br />
image scene <strong>and</strong> computed by locating regions where there is<br />
a significant difference in the intensity. However, such sets are<br />
subject to inherent ambiguities when computed from a given<br />
input image <strong>and</strong> associated with those from which an existing<br />
database has been constructed. Such ambiguities can only be<br />
overcome by the application of high-level rules, based on how<br />
humans interpret images, but the nature of this interpretation is<br />
not always clear. Nevertheless, parts of an image will tend to<br />
have an association if they share size, colour, figural similarity,<br />
continuity, shading <strong>and</strong> texture, for example. For this purpose,<br />
we are required to consider how best to segment an image <strong>and</strong><br />
what form this segmentation should take.<br />
The identification of the edges of an object in an image<br />
scene is an important aspect of the human visual system<br />
because it provides information on the basic topology of the<br />
object from which an interpretative match can be achieved. In<br />
other words, the segmentation of an image into a complex<br />
of edges is a useful pre-requisite for object identification.<br />
However, although many low-level processing methods can be<br />
applied for this purpose, the problem is to decide which object<br />
boundary each pixel in an image falls within and which high-level<br />
constraints are necessary. Thus, in many cases, a principal<br />
question is, which comes first, recognition or segmentation?<br />
Compared to image processing, computer vision (which<br />
incorporates machine vision) is more than automated image<br />
processing. It results in a conclusion, based on a machine<br />
performing an inspection of its own. The machine must be<br />
programmed to be sensitive to the same aspects of the visual<br />
field as humans find meaningful. Segmentation is concerned<br />
with the process of dividing an image into meaningful regions<br />
or segments. It is used in image analysis to separate features or<br />
regions of a pre-determined type from the background; it is the<br />
first step in automatic image analysis <strong>and</strong> pattern recognition.<br />
Segmentation is broadly based on one of two properties in<br />
an image: (i) similarity; (ii) discontinuity. The first property<br />
is used to segment an image into regions which have grey<br />
(or colour) levels within a predetermined range. The second<br />
property segments the image into regions of discontinuity<br />
where there is a more or less abrupt change in the values<br />
of the grey (or colour) levels.<br />
In this paper, we consider an approach to object detection in<br />
an image that uses a new segmentation (edge detection)<br />
algorithm based on a Contour Tracing Algorithm and a space-oriented<br />
filter [6]. The image usually requires enhancing<br />
before it is processed and, for this purpose, a novel self-adjusting<br />
sharpening filter has been developed, as discussed in this paper.<br />
The segmented object is then analysed in terms of metrics derived<br />
from both a Euclidean and a fractal geometric perspective, the
output fields being used to train a fuzzy inference engine <strong>and</strong><br />
the recognition structure being based on some of the methods<br />
reported in [15], for example. The approach considered is<br />
generic in that it can, in principle, be applied to any type<br />
of imaging modality. There are numerous applications of<br />
this technique, especially when self-calibration and learning are<br />
mandatory. Example applications may include remote sensing,<br />
non-destructive evaluation <strong>and</strong> testing <strong>and</strong> many other applications<br />
which specifically require the classification of objects<br />
that are textural. However, in this paper we focus on one<br />
particular application, namely, the diagnosis of cervical cancer<br />
based on standard Papanicolaou screening test images.<br />
II. OBJECT RECOGNITION ARCHITECTURE<br />
Suppose we have an image which is given by a function<br />
f(x, y) <strong>and</strong> contains some object described by a set S =<br />
{s1, s2, ..., sn}. We consider the case when it is necessary<br />
to define a sample which is somewhat ‘close’ to this object.<br />
This task can be reduced to the construction of some function<br />
determining a degree of proximity of the object to a sample<br />
- a template of the object. Recognition is the process of<br />
comparing individual features against some pre-established<br />
template subject to a set of conditions <strong>and</strong> tolerances. The<br />
process of recognition commonly takes place in four definable<br />
stages: (i) image acquisition <strong>and</strong> filtering (as required for the<br />
removal of noise, for example); (ii) object location (which<br />
may include edge detection); (iii) measurement of object<br />
parameters; (iv) object class estimation. We now consider the<br />
common aspects of each step. In particular, we consider details<br />
on the design features <strong>and</strong> their implementation together with<br />
their advantages, disadvantages <strong>and</strong> proposals for a solution<br />
whose application, in this paper, focuses on problems in<br />
cytopathology.<br />
Image acquisition depends on the technology that is best<br />
suited for integration with a particular application. For pattern<br />
recognition in cytopathology, for example, high fidelity digital<br />
images are required for image analysis whose resolution is,<br />
at least, compatible with the image acquisition equipment used<br />
for human inspection. For cytopathology this involves optical<br />
microscopy and, for the application considered in this work,<br />
the microscope is equipped with a digital camera. The colour<br />
images generated, examples of which are presented in this<br />
paper, are, in general, relatively noise free and are digitised<br />
using a standard CCD camera. Nevertheless, it is important<br />
that good quality images are obtained that are homogeneous<br />
with regard to brightness <strong>and</strong> contrast, for example. Unless<br />
consistently high quality images can be generated that are<br />
compatible with the sample images used to design a given<br />
computer vision system, then that same system can be severely<br />
compromised.<br />
The system discussed in this paper is based on an object<br />
detection technique that includes a novel segmentation method<br />
and must be adjusted or ‘fine tuned’ for each area of<br />
application. The necessary features associated with the ‘object’<br />
must be computed for a particular area of application. In the<br />
work reported here, this includes objects for which fractal<br />
models are well suited [23], [1], [2]. The system provides<br />
an output (i.e. a decision) using a knowledge database <strong>and</strong><br />
outputs a result by subscribing different objects. The ‘expert<br />
data’ in the application field creates the knowledge database<br />
by using a supervised training system with a number of model<br />
objects [18]. The recognition process is illustrated in Figure 2,<br />
a process that includes the following steps:<br />
Fig. 2. Recognition processes: (1) image acquisition, (2) special transform, (3) segmentation, (4) feature detection, (5) decision making, (6) reporting. The data passed between stages are the digital image $\{f_{m,n}\}$, the transformed image $\{\tilde{f}_{m,n}\}$, the object images $\{f^1_{m,n}\}, \{f^2_{m,n}\}, \ldots$, the feature vectors $\{x^1_k\}, \{x^2_k\}, \ldots$ and the class probability vectors $\{p^1_j\}, \{p^2_j\}, \ldots$<br />
1) Image Acquisition <strong>and</strong> Filtering<br />
A physical object is digitally imaged <strong>and</strong> the data<br />
transferred to memory using current image acquisition<br />
hardware available commercially. The image is filtered<br />
to reduce noise <strong>and</strong> to remove unnecessary features such<br />
as light flecks.<br />
2) Special Transform: Edge Detection<br />
The digital image function $f_{m,n}$ is transformed into<br />
$\tilde{f}_{m,n}$ to identify regions of interest and provide an<br />
input dataset for the segmentation <strong>and</strong> feature detection<br />
operations [17]. This transform avoids the use of edge<br />
detection filters which have proved to be highly unreliable<br />
in the present application.<br />
3) Segmentation<br />
The image $\{f_{m,n}\}$ is segmented into individual objects<br />
$\{f^1_{m,n}\}, \{f^2_{m,n}\}, \ldots$ to perform a separate analysis<br />
of each region. This step includes such operations as<br />
thresholding, morphological analysis <strong>and</strong> contour tracing<br />
using the convex hull method developed in [6].<br />
4) Feature Detection<br />
Feature vectors $\{x^1_k\}, \{x^2_k\}, \ldots$ are computed from the<br />
object images $\{f^1_{m,n}\}, \{f^2_{m,n}\}, \ldots$ and the corresponding<br />
transformed images $\{\tilde{f}^1_{m,n}\}, \{\tilde{f}^2_{m,n}\}, \ldots$. The features are numeric parameters<br />
that characterize the object inclusive of its texture.
The feature vectors computed consist of a number of Euclidean<br />
<strong>and</strong> fractal geometric parameters together with<br />
statistical measures in both one- <strong>and</strong> two-dimensions.<br />
The one-dimensional features correspond to the border<br />
of an object whereas the two-dimensional features relate<br />
to the surface within <strong>and</strong>/or around the object.<br />
5) Decision Making<br />
This involves assigning a probability to a predefined<br />
set of classes [21]. Probability theory <strong>and</strong> fuzzy logic<br />
[19] are applied to estimate the class probability vectors<br />
$\{p^1_j\}, \{p^2_j\}, \ldots$ from the object feature vectors<br />
$\{x^1_k\}, \{x^2_k\}, \ldots$. A fundamental problem is to establish<br />
a quantitative relationship between features <strong>and</strong> class<br />
probabilities, i.e.<br />
$$\{p_j\} \leftrightarrow \{x_k\}$$<br />
A ‘decision’ is the estimated class of the object coupled<br />
with a probabilistic accuracy [20].<br />
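As a minimal sketch of this final step, assume that the class probability vector has already been estimated (the fuzzy-logic estimation itself is not reproduced here). A ‘decision’ can then be represented as the most probable class together with its probability. The function name `decide` and the class labels are hypothetical and are used for illustration only.<br />

```python
def decide(class_probabilities):
    """Return (label, probability) for the most probable class.

    class_probabilities: dict mapping a class label to its estimated
    probability p_j. Labels and values are illustrative assumptions.
    """
    if not class_probabilities:
        raise ValueError("at least one class is required")
    # The decision is the class with the largest probability,
    # reported together with that probability.
    label = max(class_probabilities, key=class_probabilities.get)
    return label, class_probabilities[label]


# Hypothetical class probability vector for a single segmented object.
example = {"normal": 0.2, "abnormal": 0.7, "borderline": 0.1}
label, probability = decide(example)
```

Returning the probability alongside the label keeps the ‘probabilistic accuracy’ available to the reporting stage.<br />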
The application considered in this paper is based on algorithms<br />
that have been designed to solve problems associated<br />
with the above steps. Details are given in [6], which<br />
provides algorithms for threshold selection and a contour<br />
tracing algorithm using the ‘convex hull’ property. However,<br />
the application considered here requires some additional algorithms<br />
to solve the object recognition problem associated with<br />
cytopathology. This is because edge detection is particularly<br />
difficult for images consisting of many cells, and<br />
a special space-oriented filter has therefore been designed<br />
to extract parameters associated with the spatial distribution<br />
of object borders. This includes a self-adjustable filter for<br />
enhanced object sharpness that has been considered as an<br />
inter-medium mask filter in order to clarify a cellular border.<br />
For characterisation, the line of objects found using the steps<br />
described above needs to be considered in terms of their major<br />
properties.<br />
With regard to the design of a decision making engine,<br />
the approach proposed is based on establishing an expert<br />
learning procedure in which a Knowledge Data Base (KDB)<br />
is constructed based on answers that an expert makes during<br />
a manual mode. Once the KDB has been developed, the<br />
system is ready for application in the field <strong>and</strong> provides results<br />
automatically. However, the accuracy and robustness of the<br />
output depend critically on the extent and completeness<br />
of the KDB as well as the quality of the input image, primarily<br />
in terms of its compatibility with those images that have been<br />
used to generate the KDB. The algorithm discussed in Section<br />
IV has no analogy with previous contour tracing algorithms<br />
<strong>and</strong> has been designed to trace a contour of an object with<br />
any level of complexity to produce an output that consists<br />
of a consecutive list of coordinates of an object’s edge. The<br />
algorithm is optimised in terms of computational efficiency<br />
<strong>and</strong> can be realised in a compact form suitable for hardware<br />
implementation.<br />
III. REGION OF INTEREST SEGMENTATION<br />
For applications in cytopathology, a fundamental requirement<br />
is to select Regions of Interest (ROI) for detailed review.<br />
The ROI is not taken to be the object itself but its local<br />
boundary. This approach improves the efficiency associated<br />
with the process of recognition, a process that is recursive <strong>and</strong><br />
involves different settings required to evaluate the probability<br />
of the presence of a cell in the image. The algorithm used<br />
for ROI segmentation is based on adaptive thresholding <strong>and</strong><br />
morphological analysis. The adaptive image threshold is given<br />
by<br />
$$T_x = \frac{1}{2}\left[\min_y\left(\max_x f(x,y)\right) - \langle \max_x f(x,y)\rangle_y\right] + \langle \max_x f(x,y)\rangle_y,$$<br />
$$T_y = \frac{1}{2}\left[\min_x\left(\max_y f(x,y)\right) - \langle \max_y f(x,y)\rangle_x\right] + \langle \max_y f(x,y)\rangle_x,$$<br />
$$T = \begin{cases} T_x, & T_x \ge T_y, \\ T_y, & \text{otherwise}, \end{cases}$$<br />
where 〈·〉x <strong>and</strong> 〈·〉y are the means within column x <strong>and</strong> row y,<br />
respectively. This approach provides a solution for extracting<br />
the most significant features in the image, in this case, the<br />
nucleus of cells. If these objects cover an extensive area of the<br />
image, then this ‘filter’ provides the fastest compact solution.<br />
An example of the output generated by this algorithm is shown<br />
in Figure 3. In order to obtain a clear boundary, morphological<br />
analysis is applied to select objects with a predefined area. This<br />
is discussed in the following section.<br />
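The adaptive threshold can be sketched in a few lines of Python, under the following assumptions. The image is stored as a list of rows so that `image[y][x]` corresponds to f(x, y). T_x is read as the midpoint between the smallest row maximum and the mean of the row maxima, and T_y is the same quantity for the column maxima. The function names are hypothetical. Because the nuclei are the darker regions, the helper `dark_mask` marks pixels below the threshold; this choice of direction is an assumption, not something stated in the text.<br />

```python
def adaptive_threshold(image):
    """Compute the adaptive threshold T for image[y][x] = f(x, y)."""
    # max over x for every row y, and max over y for every column x
    row_max = [max(row) for row in image]
    col_max = [max(col) for col in zip(*image)]

    mean_row_max = sum(row_max) / len(row_max)   # <max_x f>_y
    mean_col_max = sum(col_max) / len(col_max)   # <max_y f>_x

    t_x = 0.5 * (min(row_max) - mean_row_max) + mean_row_max
    t_y = 0.5 * (min(col_max) - mean_col_max) + mean_col_max

    # T = T_x if T_x >= T_y, otherwise T_y
    return t_x if t_x >= t_y else t_y


def dark_mask(image, threshold):
    """Binary mask with 1 where the pixel is darker than the threshold.

    Assumption: objects of interest (nuclei) have lower intensity.
    """
    return [[1 if value < threshold else 0 for value in row] for row in image]


# Small synthetic image used only to exercise the functions.
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
T = adaptive_threshold(image)
mask = dark_mask(image, T)
```

Only row and column maxima are needed, so the threshold costs a single pass over the image.<br />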
Fig. 3. Example of ROI segmentation where + points to the location in the<br />
image where there is a cell.<br />
IV. SPACE ORIENTED FILTER DESIGN FOR EDGE<br />
DETECTION<br />
Edge detection is used to identify the edges in an image<br />
which are those areas that correspond to object boundaries.<br />
To find these edges, an algorithm is designed that looks for<br />
places in the image where the intensity changes rapidly; this<br />
is typically based on using one of two principal criteria:<br />
(i) areas where the first derivative of the intensity is larger<br />
in magnitude than some threshold;<br />
(ii) regions where the second derivative of the intensity has<br />
a zero crossing.<br />
There are many standard digital filters available for this<br />
process. Taking into account that in many images, high frequency<br />
noise (white noise) is usually present, we consider an<br />
appropriate adaptive filtering strategy.<br />
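The two criteria can be illustrated on a one-dimensional intensity profile. The following is a minimal sketch that assumes central differences for the first derivative and the standard three-point stencil for the second derivative. The function names and the threshold value are illustrative, and exact zeros of the second derivative are ignored for simplicity.<br />

```python
def first_derivative(signal):
    """Central differences at the interior samples 1 .. len(signal) - 2."""
    return [(signal[i + 1] - signal[i - 1]) / 2.0
            for i in range(1, len(signal) - 1)]


def second_derivative(signal):
    """Three-point second differences at the interior samples."""
    return [signal[i + 1] - 2 * signal[i] + signal[i - 1]
            for i in range(1, len(signal) - 1)]


def gradient_edges(signal, threshold):
    """Criterion (i): samples where |first derivative| exceeds a threshold."""
    d1 = first_derivative(signal)
    # d1[k] belongs to sample k + 1 of the original signal
    return [k + 1 for k, value in enumerate(d1) if abs(value) > threshold]


def zero_crossing_edges(signal):
    """Criterion (ii): positions where the second derivative changes sign.

    Returns the index of the sample just before each crossing.
    """
    d2 = second_derivative(signal)
    # d2[k] belongs to sample k + 1, so a sign change between d2[k] and
    # d2[k + 1] lies between samples k + 1 and k + 2
    return [k + 1 for k in range(len(d2) - 1) if d2[k] * d2[k + 1] < 0]


# A smoothed step from dark (0) to bright (10).
profile = [0, 0, 0, 1, 4, 9, 10, 10, 10]
strong_gradient = gradient_edges(profile, threshold=2.5)
crossings = zero_crossing_edges(profile)
```

For this profile both criteria place the edge around samples 4 and 5, where the intensity rises most steeply.<br />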
A. Noise Reduction by Adaptive Wiener Filtering<br />
Edge detection methods typically require an effective noise<br />
reduction algorithm, which should be applied<br />
adaptively. A well known adaptive filter is the<br />
Wiener filter which can be applied to an image adaptively,<br />
tailoring itself to the local image variance. When the variance<br />
is large, the Wiener filter performs little smoothing; when<br />
the variance is small, it performs more smoothing. This<br />
approach often produces better results than linear filtering. The<br />
adaptive filter is more selective than a comparable linear filter,<br />
preserving edges <strong>and</strong> other high frequency parts of an image.<br />
Although the Wiener filter requires greater computational time<br />
than linear filtering, it performs better when the noise is<br />
constant-power or ‘white’ additive noise, such as Gaussian<br />
noise, which is one of the conditions required to simplify the<br />
result of applying a least squares criterion.<br />
The Wiener filter algorithm uses a pixel-wise adaptive filtering<br />
procedure with neighborhoods of size n-by-m. It estimates the<br />
local mean and variance around each pixel given respectively<br />
by<br />
$$\mu = \frac{1}{nm}\sum_{r,c\in\eta} I_s(r,c)$$ (the mean of the brightness of the image)<br />
and<br />
$$\sigma^2 = \frac{1}{nm}\sum_{r,c\in\eta}\left(I_s^2(r,c) - \mu^2\right)$$ (the dispersion)<br />
where the sum is taken over the n-by-m local neighborhood<br />
of each pixel in the image I. The algorithm then creates a<br />
pixel-wise Wiener filter using the following estimates<br />
$$I_D(r,c) = \mu + \frac{\sigma^2 - \nu^2}{\sigma^2}\left(I_s(r,c) - \mu\right)$$<br />
where $\nu^2$ is the noise variance. If the noise variance is not<br />
given, the filter uses the average of all the local estimated<br />
variances. In this work, the Wiener filter is used as a first step<br />
to processing the image prior to applying a space oriented edge<br />
detection filter in order to provide an image that is optimal with<br />
regard to solving the edge detection problem for applications<br />
in cytopathology. Example results are shown in Figures 4<br />
<strong>and</strong> 5. Figure 4 shows the original image <strong>and</strong> Figure 5 is<br />
the result of applying the Wiener filter described above.<br />
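The pixel-wise filter described above can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the implementation used to produce Figure 5: the neighborhood is clipped at the image border (so the normalisation uses the number of pixels actually inside the window rather than nm), and the output falls back to the local mean wherever the local variance does not exceed the noise variance, a common safeguard that is not part of the formula itself. The function names are hypothetical.<br />

```python
def local_stats(image, n=3, m=3):
    """Local mean and variance in an n-by-m window around each pixel."""
    rows, cols = len(image), len(image[0])
    half_r, half_c = n // 2, m // 2
    means = [[0.0] * cols for _ in range(rows)]
    variances = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Assumption: the window is clipped at the image border.
            window = [image[i][j]
                      for i in range(max(0, r - half_r), min(rows, r + half_r + 1))
                      for j in range(max(0, c - half_c), min(cols, c + half_c + 1))]
            mu = sum(window) / len(window)
            var = sum(v * v for v in window) / len(window) - mu * mu
            means[r][c] = mu
            variances[r][c] = max(var, 0.0)  # guard against rounding below zero
    return means, variances


def adaptive_wiener(image, n=3, m=3, noise_var=None):
    """Apply I_D = mu + (sigma^2 - nu^2) / sigma^2 * (I_s - mu) per pixel."""
    means, variances = local_stats(image, n, m)
    if noise_var is None:
        # As in the text: use the average of all local variance estimates.
        flat = [v for row in variances for v in row]
        noise_var = sum(flat) / len(flat)
    output = []
    for r, row in enumerate(image):
        out_row = []
        for c, value in enumerate(row):
            mu, var = means[r][c], variances[r][c]
            if var <= noise_var:
                # Safeguard (assumption): flat or noise-dominated region.
                out_row.append(mu)
            else:
                out_row.append(mu + (var - noise_var) / var * (value - mu))
        output.append(out_row)
    return output


noisy = [[0, 0, 10], [0, 10, 10], [10, 10, 10]]
filtered = adaptive_wiener(noisy)
```

Where the local variance is large relative to the noise variance the gain approaches one and the pixel is left almost unchanged, which is how edges are preserved.<br />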
B. Edge Detection<br />
Fig. 4. Original image of a cell cluster obtained from a cervical smear after<br />
staining.<br />
Fig. 5. Adaptive Wiener filtered image.<br />
Edge detection methods are based on a number of derivative<br />
estimators. For some of these estimators, it is possible to<br />
specify whether the operation should be sensitive to horizontal<br />
or vertical edges, or both. In each case, the aim is to return<br />
a binary image - an array containing elements which are<br />
either 0 or 1 where 1 represents an element of an edge <strong>and</strong> 0<br />
represents an empty edge space. Moreover, within the context<br />
of the overall approach, it is assumed that different edge<br />
detectors will yield minimal differences. In this application<br />
a Canny filter [8] is used to provide a first estimate of the<br />
edge boundaries of a cell nucleus.<br />
The Canny edge detector is based on a functional analysis<br />
to derive an optimal function for edge detection, starting<br />
with three optimisation criteria, namely, good detection, good<br />
localization, and only one response per edge under white noise<br />
conditions. The 1D ‘Canny function’ is accurately approximated<br />
by the derivative of a Gaussian function which is then<br />
combined with a Gaussian of identical standard deviation in<br />
the perpendicular direction, truncated at 0.001 of its peak<br />
value, and split into suitable masks. Underlying this method is<br />
the idea of locating edges at the local maxima of the gradient<br />
magnitudes of a Gaussian-smoothed image. In addition, the<br />
Canny implementation employs a hysteresis operation on edge<br />
magnitude in order to make edges reasonably connected.<br />
Finally, a multiple-scale method is employed to analyse the<br />
output of the edge detector.<br />
Fig. 6. Application of a Canny Filter to Figure 5.<br />
An example of applying a Canny filter to Figure 5 is<br />
given in Figure 6. This result illustrates that it<br />
is typically not possible to tell uniquely where the edge of a cell or<br />
nucleus occurs, especially where one edge is connected<br />
with another gradient, in which case Canny edge detection<br />
introduces errors. For this reason, it is necessary to design a<br />
new filter, which is discussed in the following section.<br />
C. Space Oriented Filtering<br />
In some cases, the nuclei of the cells in a cervical smear<br />
can appear very close together, or be in contact with a foreign<br />
object such as a bacterium. In this case, an extra filter must be<br />
used to obtain a contour boundary. For this purpose, a space-oriented<br />
filter for the detection of ‘holes’ has been developed.<br />
A nucleus represents a ‘hole’ if the image is visualised in terms<br />
of a surface in which the nuclei are regions of lower intensity.<br />
The filter has been designed to take account of the following:<br />
(i) objects should be of a quasi-spherical form; (ii) the search<br />
space should include objects with lower intensity (i.e. which<br />
have a darker colour); (iii) it is necessary to find only the<br />
surface of a cell without a hysteresis zone. An example of a<br />
profile that is characteristic of a nucleus is given in Figure 7.<br />
The same principle can of course be used for other objects.<br />
The solution to this problem is embodied in the algorithm<br />
that is now described, the basic procedure being illustrated in<br />
Figure 8. To start with, we estimate the brightness of the<br />
central area (using a window of 9×9 pixels) and a circle (a<br />
layer consisting of 2 pixels). If the center is dark, we suppose<br />
that it is part of a nucleus and compare the intensity along the<br />
white line in Figure 8 with the central zone. If the profile along<br />
this line has a maximum and a minimum gradient, we consider<br />
the angle between them. If the angle lies in the range 79° to<br />
248° then we assume that we are near to the border<br />
of a nucleus. This angle can be estimated automatically or<br />
established as a constant and ‘hard-wired’ into the algorithm.<br />
The next step is to apply the hole detection method (red<br />
and brown lines in Figure 8). This hole detection algorithm<br />
is extended in a procedure to decide whether the area under<br />
investigation is a nucleus or otherwise. In Figure 8, the<br />
maximum length of the brown line is approximately 70 pixels<br />
(which depends on the image resolution) and can be chosen<br />
automatically. A useful procedure is to check the direction<br />
toward the center of a nucleus but this is application dependent.<br />
If, for a period, there is no hole, then the present position is<br />
ignored. If the test for detecting a hole gives a positive result,<br />
as in an index figure, the line from the center of a hole up to<br />
the border of a hysteresis is drawn.<br />
Fig. 7. Example intensity profile of a Nucleus.<br />
Fig. 8. Mask used for space-oriented filtering.<br />
In the central part of the image (Figure 5) one can see 5<br />
joined kernels (nuclei). To find automatically<br />
the edges between all of these nuclei requires a special algorithm<br />
for object separation. The sequence of steps associated<br />
with the algorithm designed for this purpose can be divided<br />
into the following list:<br />
(i) estimate the edge expectation;<br />
(ii) search for the boundaries of the cell;<br />
(iii) calculate the direction to the center of a core;<br />
(iv) search for the opposite edge of the core;<br />
(v) calculate the centers of the kernels;<br />
(vi) save the index map of the figure.<br />
Estimation of edge expectation<br />
Pre-processing can be used to form an estimate of where<br />
edges are expected. This allows for accelerated<br />
scanning of the image. For this purpose a structure estimation<br />
operator is applied at the central part of the mask as shown in<br />
Figure 8. This selects only those nuclei of interest and avoids<br />
spending computer time processing other parts of the image.<br />
Searching the boundaries of the cell (Step 1)<br />
The ring around the central part of the mask (Figure 8) is<br />
decomposed into the sequence<br />
$R = [x_1, x_2, \ldots, x_n]$<br />
In the following analysis we evaluate the gradient sequence:<br />
$g_{1 \ldots n} = \frac{dR}{dn}$<br />
Upon demarcation of a core and after the differentiation, the<br />
gradient window will contain two extrema, one positive and<br />
one negative. The polar angle then gives the direction of the<br />
nuclear center $\theta_1$.<br />
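The ring analysis can be illustrated with a short sketch. It assumes that the ring is sampled at n equally spaced polar angles, that the gradient is a circular forward difference, and that the direction of the nuclear center is the bisector of the dark arc between the falling and rising transitions. The function names and the bisector rule are illustrative assumptions; the 79° to 248° acceptance range is the one quoted earlier in the text.<br />

```python
def ring_gradient(ring):
    """Circular forward difference g[i] = R[i+1] - R[i] of the ring samples."""
    n = len(ring)
    return [ring[(i + 1) % n] - ring[i] for i in range(n)]

def border_direction(ring, min_angle=79.0, max_angle=248.0):
    """Return (arc, theta1) in degrees, or None if the ring does not look
    like the border of a dark, hole-like object."""
    n = len(ring)
    g = ring_gradient(ring)
    i_neg = min(range(n), key=lambda i: g[i])  # bright -> dark transition
    i_pos = max(range(n), key=lambda i: g[i])  # dark -> bright transition
    step = 360.0 / n
    # Angular extent of the dark arc between the two gradient extrema.
    arc = ((i_pos - i_neg) % n) * step
    if not (min_angle <= arc <= max_angle):
        return None
    # Transition i lies between samples i and i+1, hence the half-step offset.
    theta1 = ((i_neg + 0.5) * step + arc / 2.0) % 360.0
    return arc, theta1
```

For a ring of 36 samples that is dark between 100° and 180°, the sketch reports an arc of 90° and a direction of 140°; a uniform ring yields no direction.<br />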
Calculation of the direction of the center (Step 2)<br />
In this step, the expected direction to the center is updated<br />
by means of a check on the position of the angle on a plane<br />
between the maxima obtained in the previous step. In general,<br />
for the purpose of recognition, each point on the binary map is<br />
convolved with a series of masks in order to search for<br />
the exact point on the object edge. The sequence of masks<br />
used is as follows:<br />
$M = \left\{ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}, \right.$<br />
$\left. \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \right\}$<br />
The appropriate mask is applied in the direction of the local<br />
gradient rate and gives a maximal convolution between both<br />
the points obtained from the previous step. From the definition<br />
of the angle $\theta_2$, utilizing the a priori results, we form the mean<br />
$\theta = \frac{\theta_1 + \theta_2}{2}$<br />
The logical conformity of the mask and adjacent points of the<br />
binary map is further evaluated and the binary representation<br />
of the object is determined via<br />
$I_B(r, c) = \begin{cases} 1, & \text{if } M \notin I_g; \\ 0, & \text{if } M \in I_g. \end{cases}$<br />
The profile information (gradient and amplitude) is stored<br />
for Step 3 (discussed below). The dimension of $I_B(r, c)$<br />
corresponds to the dimension of the starting map $I_g(r, c)$.<br />
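The mask search can be sketched as follows. The six 3×3 masks are those listed above; scoring a mask by the number of cells in which it agrees with a 3×3 binary patch is an assumed stand-in for the ‘logical conformity’ test, and the function names are hypothetical.<br />

```python
# The six 3x3 masks of Step 2, written row by row.
MASKS = [
    [[0, 0, 0], [0, 1, 0], [0, 1, 0]],
    [[0, 0, 0], [0, 1, 0], [1, 1, 0]],
    [[0, 0, 0], [0, 1, 0], [0, 1, 1]],
    [[0, 0, 0], [0, 1, 0], [1, 1, 1]],
    [[0, 0, 0], [0, 1, 1], [1, 1, 1]],
    [[0, 0, 0], [1, 1, 0], [1, 1, 1]],
]

def conformity(mask, patch):
    """Number of cells (out of 9) in which mask and binary patch agree."""
    return sum(1 for i in range(3) for j in range(3) if mask[i][j] == patch[i][j])

def best_mask(patch):
    """Index and score of the mask that agrees best with a 3x3 binary patch."""
    scores = [conformity(mask, patch) for mask in MASKS]
    k = max(range(len(MASKS)), key=lambda i: scores[i])
    return k, scores[k]

def mean_direction(theta1, theta2):
    """Arithmetic mean of the two direction estimates (no angle wrap-around)."""
    return (theta1 + theta2) / 2.0
```

A patch identical to one of the masks scores 9 for that mask only, so the search returns its index.<br />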
Search for the opposite edge of a core (Step 3)<br />
The opposite gradient is searched for by finding the centre<br />
of a nucleus together with the gradient at the opposite end,<br />
which serves as a final confirmation of the coordinates of the<br />
object. In Figure 9 these lines are illustrated in brown. The<br />
opposite profile has to have the same properties as at Step<br />
2. This prevents any wrong detection through irregularities in<br />
the image. If the opposite profile is found, then a green line<br />
is ‘painted’ on the index binary image from the center to the<br />
boundary of the nucleus as in Figure 10.<br />
Fig. 9. Mask of the space-oriented filter with an image.<br />
Fig. 10. Result of applying the space oriented filter to an image.<br />
Calculation of the centre of the kernel<br />
The centre calculation algorithm is based on the weighted<br />
mean over the total number of bars detected in the previous<br />
steps - Figure 3. The calculation depends on the kind of<br />
implementation used to design the processing engine. If the<br />
calculations are implemented in programmed logic, the data<br />
are better stored in an index space. For a PC, the data are<br />
stored as an array of coordinates.<br />
Saving the index map (Figure 11)<br />
After application of the algorithm, a connected area can<br />
be detected which serves as an index for further processing.<br />
An example of an index image is given in Figure 11<br />
which includes the application of erosion and dilation for the<br />
subdivision of closely located objects.<br />
V. TWO DIMENSIONAL ALGORITHM FOR IMAGE<br />
SHARPENING<br />
In this section, we consider the procedures necessary during<br />
object recognition. These procedures are adaptive and are not<br />
bound to a particular range of applications.<br />
Fig. 11. Segmentation of nuclei (Index Image).<br />
A. Self-adjustable filter for enhanced object sharpness<br />
The task of searching for the edges of an object in an image is<br />
a part of the process of object recognition. In the case of<br />
an image with no preliminary information on the quantity of<br />
the points on each edge, resolution or particular boundary,<br />
it is possible to convert the data into an auxiliary map with<br />
an increased contrast range. With existing algorithms, image<br />
contrast enhancement does not provide sufficient fidelity to<br />
cope with unknown levels of difference between objects.<br />
Typically, noise appears when the level of the<br />
transformation parameters is increased, while at a low level there is poor<br />
detection of an object's edge.<br />
Fig. 12. Cytology cells - Mild dyskaryosis.<br />
An image I is represented in computer memory in terms<br />
of an array of r × c points and the value of a particular<br />
point is denoted by I(r, c). One of the approaches to applying<br />
a filter or transformation to a two-dimensional<br />
representation is in terms of a sequence of masks M over<br />
m × n points and the subsequent calculation of a value for a<br />
central pixel depending on its environment. We now consider<br />
an algorithm for calculating the value of a central point in<br />
a moving window M with m × n points. The algorithm is<br />
applied sequentially and not recursively to all points of an<br />
image. For example, consider the image given in Figure 12.<br />
The characteristic property of the given image is that during<br />
preparation of a sample, a cell can be fixed at a given angle and<br />
consequently, it can have a different gradient rate on different<br />
boundaries. The mask sizes m and n are selected according to<br />
the size of the object in proportion to the image. The method<br />
consists of the following stages:<br />
1. The first step is to sort the array M[m × n] in<br />
order of increasing values. The result of applying this<br />
operation is the information represented in terms of<br />
a one-dimensional array S[i] as illustrated in Figure 13.<br />
Fig. 13. Profile obtained by sorting an image into an array of increasing<br />
pixel values.<br />
2. We define an index imax as the point with the greatest value<br />
of the gradient rate, Simax. If the<br />
maximal gradient rate is such that the given position of<br />
the window M does not correspond to a boundary of<br />
the object, it is then possible to apply general filtering<br />
methods, e.g. to calculate the average value or to take<br />
the value of a point with a predetermined index, and<br />
assign this value to the central point. For example, in<br />
Figure 13 Simax is the point shown by the red arrow.<br />
3. We determine in which part of the sorted array S[i] from<br />
mask M the value of the original central<br />
mask point Ic(r, c) lies. For example, in Figure 13, this<br />
is indicated by the green arrow. We denote this part of<br />
the array by Sc[i] (see Figure 13).<br />
4. We take the parameter established by the user which<br />
sets a factor for boundary extraction - in percentage<br />
terms, 50% for example - and then define the value of<br />
point Scr[i] of the array Sc[i], counted from the beginning of<br />
the array. This value is the resultant solution Ic(r, c) =<br />
Scr[i] displayed by the cyan arrow in Figure 13.<br />
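The four stages can be sketched as follows. This is an illustrative reading only: the threshold min_jump, which decides whether the largest jump in the sorted window marks a boundary, is a hypothetical parameter, and the fall-back to the window mean is one of the ‘general filtering methods’ mentioned in stage 2. Border pixels are left unchanged, which is also an assumption.<br />

```python
def sharpen_pixel(window, centre, fraction=0.5, min_jump=10.0):
    """New value for the central point of one window position."""
    s = sorted(window)                                   # stage 1: sorted array S
    if len(s) < 2:
        return centre
    jumps = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    i_max = max(range(len(jumps)), key=lambda i: jumps[i])   # stage 2
    if jumps[i_max] < min_jump:
        return sum(s) / len(s)                           # no boundary: plain mean
    low, high = s[:i_max + 1], s[i_max + 1:]
    part = low if centre <= s[i_max] else high           # stage 3: part Sc
    idx = min(int(fraction * len(part)), len(part) - 1)  # stage 4: user factor
    return part[idx]

def sharpen(image, m=3, n=3, fraction=0.5, min_jump=10.0):
    """Apply the window filter non-recursively: always read from the input."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(m // 2, rows - m // 2):
        for c in range(n // 2, cols - n // 2):
            window = [image[i][j]
                      for i in range(r - m // 2, r + m // 2 + 1)
                      for j in range(c - n // 2, c + n // 2 + 1)]
            out[r][c] = sharpen_pixel(window, image[r][c], fraction, min_jump)
    return out
```

On a blurred step such as 0, 0, 50, 100, 100 the intermediate value 50 is moved onto the bright side of the edge, which is the sharpening effect described above.<br />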
An example result of applying this procedure is shown in<br />
Figure 14. Application of this filter allows us to observe very<br />
precisely the evolution of cell boundaries during the operation<br />
of the object recognition system.<br />
Fig. 14. Filtered image.<br />
VI. PRECISION CALCULATIONS ON THE MEASURE OF<br />
STRUCTURE<br />
For characterization, the lines of objects obtained using<br />
the method described in the previous section need to be<br />
considered in terms of their major properties. The modern<br />
requirements for recognition systems establish structures as<br />
main features for natural objects such as the measures defined<br />
by Tamura [9].<br />
For structure classification, we apply fractal geometry for<br />
a description of natural objects. A fundamental property of<br />
a fractal is its Fractal Dimension. There are a number of ways<br />
to calculate this feature of a fractal object and many different<br />
approaches to computing the Fractal Dimension have<br />
been considered [23]. For example, the origins of the ‘box<br />
dimension’ are hard to trace but it would have been considered<br />
by pioneers of the Hausdorff measure and dimension<br />
and was probably rejected as being less satisfactory from<br />
a computational viewpoint. The precision of the calculation<br />
is less than two decimal places. Computation of the Fourier<br />
dimension provides a better result [10]. However, in our case,<br />
we have to estimate the dimension from an image with a<br />
lower resolution than that at which the object ‘exists’ using a<br />
frequency spectrum that is subject to additive noise.<br />
Many signal processing applications are based on the use<br />
of different transforms. The signals under consideration are<br />
written as a linear combination (or series) of some predefined<br />
set of functions. Traditionally, orthogonal basis functions have<br />
been used for this purpose, for example, the discrete Fourier<br />
transform. The theory for orthogonal bases and Hilbert spaces<br />
can, however, be generalized to other sequences of functions<br />
called frames which have been used in this work to develop<br />
measures of structure with high precision.<br />
If we consider the profile of a typical cytopathology image,<br />
then the curve does not coincide with a sine-wave signal.<br />
To obtain adequate accuracy, it is necessary to magnify the<br />
resolution of the image, which in turn introduces distortion.<br />
For increased accuracy on low-resolution data, we consider a<br />
convolution function of a form more consistent with the profile<br />
of a video signal. For a signal I we consider the representation<br />
$F(k) = \sum_{n=1}^{N} I(n) \left\{ \arccos\left[ \cos\left( \frac{2\pi(k-1)(n-1)}{N} - \frac{\pi}{2} \right) \right] - \frac{\pi}{2} - i \arcsin\left[ \cos\left( \frac{2\pi(k-1)(n-1)}{N} \right) \right] \right\}$ (1)<br />
and for an image I with resolution M × N,<br />
$F(p, q) = \sum_{m=1}^{M} \sum_{n=1}^{N} I(m, n) \left\{ \left( \arccos\left[ \cos\left( \frac{2\pi(p-1)(m-1)}{M} - \frac{\pi}{2} \right) \right] - \frac{\pi}{2} \right) \times \left( \arccos\left[ \cos\left( \frac{2\pi(q-1)(n-1)}{N} - \frac{\pi}{2} \right) \right] - \frac{\pi}{2} \right) \right.$<br />
$\left. - i \arcsin\left[ \cos\left( \frac{2\pi(p-1)(m-1)}{M} \right) \right] \times \arcsin\left[ \cos\left( \frac{2\pi(q-1)(n-1)}{N} \right) \right] \right\}$ (2)<br />
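A minimal sketch of the one-dimensional transform is given below. It follows one reading of the printed formula, in which the kernel has real part arccos[cos(x − π/2)] − π/2 and imaginary part −arcsin[cos(x)], with x = 2π(k − 1)(n − 1)/N, so that the sinusoids of the DFT are replaced by triangular waves. The function names are hypothetical and indices are 0-based in the code.<br />

```python
import math

def tri_kernel(x):
    """Triangular-wave kernel evaluated at phase x (one reading of the formula)."""
    re = math.acos(math.cos(x - math.pi / 2)) - math.pi / 2
    im = -math.asin(math.cos(x))
    return complex(re, im)

def tri_transform(signal):
    """F(k) for k = 1..N, returned as a 0-based list of complex values."""
    N = len(signal)
    return [sum(signal[n] * tri_kernel(2 * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def power_spectrum(signal):
    """Squared modulus of the transform, as used by the power spectrum method."""
    return [abs(F) ** 2 for F in tri_transform(signal)]
```

As with the DFT, a constant signal contributes only to the first coefficient under this reading, because the triangular waves sum to zero over a whole number of periods.<br />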
In this work, application of the power spectrum method used<br />
to compute the fractal dimension of a cell boundary and<br />
cell surface is based on the above representations for F(k) and<br />
F(p, q) respectively. We then consider the power spectrum<br />
of an ideal fractal signal given by $P = c|k|^{-\beta}$, where c is a<br />
constant and β is the spectral exponent. In two dimensions,<br />
the power spectrum is given by $P(k_x, k_y) = c|k|^{-\beta}$, where<br />
$|k| = \sqrt{k_x^2 + k_y^2}$. In both cases, application of the least squares<br />
method or Orthogonal Linear Regression yields a solution<br />
for β and c [23], the relationship between β and the Fractal<br />
Dimension $D_F$ being given by [23]<br />
$D_F = \frac{3 D_T + 2 - \beta}{2}$<br />
for Topological Dimension $D_T$. This approach allows us to<br />
drop the limits on the recognition of small objects since<br />
application of the FFT (for computing the power spectrum)<br />
works well (in terms of computational accuracy) only for<br />
large data sets, i.e. array sizes larger than 256 and 256×256.<br />
Tests on the accuracy associated with computing the fractal<br />
dimension using equations (1) and (2) show an improvement of<br />
5% over computations based on the conventional Discrete Fourier<br />
Transform.<br />
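The estimation of β and c can be sketched with an ordinary least squares fit in log-log coordinates, since log P = log c − β log |k|. This is a minimal illustration assuming positive frequencies and strictly positive power values; the function names are hypothetical and Orthogonal Linear Regression is not shown.<br />

```python
import math

def fit_power_law(freqs, power):
    """Least squares fit of P = c * k**(-beta); returns (beta, c)."""
    xs = [math.log(k) for k in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope, math.exp(my - slope * mx)

def fractal_dimension(beta, topological_dim=1):
    """D_F = (3 * D_T + 2 - beta) / 2."""
    return (3 * topological_dim + 2 - beta) / 2.0
```

For a boundary (topological dimension 1) a spectral exponent of 2 gives a fractal dimension of 1.5, and for a surface (topological dimension 2) the same exponent gives 3.<br />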
VII. FEATURE DETERMINATION<br />
Features (which are typically expressed as a set of<br />
metrics - floating point or decimal integer numbers) describe<br />
the object state in an image and provide the input for a<br />
decision making engine as illustrated in Figure 2. The features<br />
considered in this paper are computed in the spatial domains<br />
of the original image $\{f_{m,n}\}$ and transformed image $\{\tilde{f}_{m,n}\}$.<br />
Further, these features are extracted from the three colour<br />
channels - Red (R), Green (G) and Blue (B) - captured<br />
by the CCD array. The issue of what type and how many<br />
features should be used to develop a computer vision system<br />
is critical to the design associated with a specific application.<br />
The system considered here has been developed to include<br />
features associated with the texture of an object which include<br />
the Fractal Dimension. Texture is particularly important in<br />
medical image classification and of primary importance in the<br />
application considered in this paper. The following features<br />
or their derivatives have been considered (primarily through a<br />
process of ‘trial and error’) in the recognition system reported<br />
in this paper:<br />
Average Gradient G<br />
describes how the intensity changes when scanning<br />
from the object center to the border. The object<br />
gradient is computed using the least squares method<br />
in polar coordinates as expressed in the following<br />
result:<br />
$g = \frac{N \sum_{(m,n) \in S} r_{m,n} \tilde{f}_{m,n} - \sum_{(m,n) \in S} r_{m,n} \sum_{(m,n) \in S} \tilde{f}_{m,n}}{N \sum_{(m,n) \in S} r_{m,n}^2 - \left( \sum_{(m,n) \in S} r_{m,n} \right)^2}$<br />
where N is the number of object pixels and $r_{m,n}$ is<br />
the distance between (m, n) and the center (m′, n′),<br />
i.e.<br />
$r_{m,n} = \sqrt{(m - m')^2 + (n - n')^2}.$<br />
The centers (m′, n′) correspond to the local maxima<br />
of $\tilde{f}_{m,n}$ within the cluster. The cluster gradient<br />
is the average of object gradients,<br />
$G = \langle g_i \rangle_{i \in I}$<br />
where i ∈ I is the object index.<br />
Colour Composites $\Upsilon$ and $\Upsilon^D$<br />
characterise the relationship between the R, G and B<br />
layers of the transformed image. The triangle formula<br />
$r(a, b, c) = \sqrt{\frac{(s - a)(s - b)(s - c)}{s}}, \quad s = \frac{1}{2}(a + b + c)$<br />
is applied to the ‘colour triangle’ RGB such that the<br />
following pixel colour composite is obtained<br />
$\upsilon_{m,n} = r(a, b, c)$<br />
where<br />
$a = \tilde{f}^R_{m,n}, \quad b = \tilde{f}^G_{m,n}, \quad c = \tilde{f}^B_{m,n}$<br />
and $\upsilon^D_{m,n} = r_{\mathrm{incircle}}(a, b, c)$ with<br />
$a = |\tilde{f}^R_{m,n} - \tilde{f}^G_{m,n}|, \quad b = |\tilde{f}^G_{m,n} - \tilde{f}^B_{m,n}|$<br />
and<br />
$c = |\tilde{f}^R_{m,n} - \tilde{f}^B_{m,n}|.$<br />
The average colour composites are then given by<br />
$\Upsilon = \langle \upsilon_{m,n} \rangle_{(m,n) \in S}, \quad \Upsilon^D = \langle \upsilon^D_{m,n} \rangle_{(m,n) \in S}.$<br />
Fourier Dimension q<br />
determines the frequency characteristics of the object<br />
and is related to the fractal dimension $D_F$ by $q =<br />
4 - D_F$ [1], [2]. It represents a measure of texture<br />
[23] and is computed using the approach discussed<br />
in Section VI.<br />
Lacunarity (Gap Dimension) $\Lambda_k$<br />
characterizes the way the ‘gaps’ are distributed in an<br />
image [2]. The gap dimension is, roughly speaking,<br />
the number of light or dark spots in the image. It is<br />
defined for the given degree k by<br />
$\Lambda_k = \left\langle \left| \frac{f_{m,n}}{\langle f_{m,n} \rangle} - 1 \right|^k \right\rangle^{\frac{1}{k}},$<br />
where $\langle f_{m,n} \rangle = \frac{1}{N} \sum f_{m,n}$ denotes the mean value.<br />
In the system described in this paper, an average of<br />
local lacunarities of the degree k = 2 is measured in<br />
the spatial and frequency domains.<br />
Symmetry Features $S_n$ and M<br />
are estimated by morphological analysis in three-dimensional<br />
space, i.e. two-dimensional spatial coordinates<br />
and intensity. A symmetry feature $S_n$ is measured<br />
for a given degree of symmetry n (currently<br />
n = {2, 4}). This value shows the deviation from a<br />
perfectly symmetric object, i.e. $S_n$ is close to zero<br />
when the object is symmetric and $S_n > 0$ otherwise.<br />
Feature M describes the fluctuation of the centre of<br />
mass for pixels with different intensities; M = 0 for<br />
symmetric objects and M > 0 otherwise.<br />
Structure γ<br />
provides an estimation of the 2D curvature of the<br />
object in terms of the following:<br />
γ < 0, if the object bulging is less than a threshold,<br />
γ = 0, if the object has the standard bulging,<br />
γ > 0, if the object has a higher level of bulging.<br />
Geometrical Features<br />
include the minimum radius $R_{min}$ and maximum radius<br />
$R_{max}$ of the object (or the ratio $R_{max}/R_{min}$), object<br />
area S, object perimeter P (or the ratio $S/P^2$) and the<br />
coefficient of infill $S/S_R$, where $S_R$ is the area of<br />
the bounding polygon which, in this application, is<br />
determined using the convex hull algorithm given in<br />
Section V.<br />
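Three of the features listed above can be sketched in plain Python: the least squares object gradient g, the pixel colour composite υ with its average Υ, and the lacunarity Λk. The function names and the data layout (a dictionary of pixel coordinates, a list of RGB triples) are assumptions made for illustration. Returning zero for a degenerate colour triangle and requiring a non-zero mean intensity are also assumptions, and the composite built from channel differences is not covered.<br />

```python
import math

def object_gradient(pixels, centre):
    """Least squares slope of intensity against distance from the centre.
    pixels maps (m, n) to the transformed intensity; centre is (m', n')."""
    N = len(pixels)
    r = {p: math.hypot(p[0] - centre[0], p[1] - centre[1]) for p in pixels}
    s_r = sum(r.values())
    s_f = sum(pixels.values())
    s_rf = sum(r[p] * pixels[p] for p in pixels)
    s_rr = sum(v * v for v in r.values())
    return (N * s_rf - s_r * s_f) / (N * s_rr - s_r ** 2)

def triangle_radius(a, b, c):
    """r(a, b, c) = sqrt((s - a)(s - b)(s - c) / s) with s = (a + b + c) / 2."""
    s = 0.5 * (a + b + c)
    prod = (s - a) * (s - b) * (s - c)
    if s <= 0 or prod <= 0:
        return 0.0  # sides do not form a proper triangle (assumption)
    return math.sqrt(prod / s)

def colour_composite(rgb_pixels):
    """Average of r(R, G, B) over the object pixels."""
    return sum(triangle_radius(*p) for p in rgb_pixels) / len(rgb_pixels)

def lacunarity(values, k=2):
    """<|f / <f> - 1|^k>^(1/k); assumes a non-zero mean intensity."""
    mean = sum(values) / len(values)
    return (sum(abs(v / mean - 1.0) ** k for v in values) / len(values)) ** (1.0 / k)
```

The cluster gradient G is then the plain average of object_gradient over the objects of a cluster.<br />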
The system reported in this paper classifies objects using<br />
mixed mode features that are based on Euclidean and fractal<br />
geometric metrics. The procedure of object detection is performed<br />
at the segmentation stage and needs to be adjusted<br />
for each area of application. The recognition algorithm then<br />
makes a decision using a knowledge database and outputs a<br />
result by assigning objects to classes based on the features defined<br />
above. The ‘expert data’ associated with a given application<br />
creates a knowledge database by using a supervised training<br />
system with a number of model objects. This is discussed in<br />
the following section.<br />
VIII. OBJECT RECOGNITION<br />
In order to characterize an object, the ‘system’ must have<br />
a mathematical representation expressed in metrics that<br />
are used to compose a feature vector. The basis for the<br />
application considered in this paper is the set of textural features<br />
(Fourier dimension and Lacunarity) for an object coupled with<br />
the Euclidean and morphological measures as defined in the<br />
previous section. In the case of a general application, all<br />
objects are represented by a list of parameters for implementation<br />
of supervised learning in which a fuzzy logic engine<br />
automatically adjusts the weight coefficients for the remaining<br />
features. The methods developed represent a contribution to<br />
pattern recognition based on fractal geometry (at least in a<br />
partial sense), fuzzy logic and the implementation of a fully<br />
automatic recognition scheme as illustrated in Figure 15 for<br />
the Fractal Dimension D (just one element of the feature<br />
vector used in practice). The recognition procedure uses the<br />
decision making rules from fuzzy logic theory [21], [18], [19],<br />
[20] based on all, or a selection, of the features defined and<br />
discussed in Section VII which are combined to produce a<br />
feature vector x.<br />
Fig. 15. Basic architecture of the diagnostic system based on the Fractal<br />
Dimension D (a single feature) and decision making criteria β.<br />
A. Decision Making<br />
The class probability vector $p = \{p_j\}$ is estimated from<br />
the object feature vector $x = \{x_i\}$ and the membership functions<br />
$m_j(x)$ defined in the knowledge database. If $m_j(x)$ is a membership<br />
function, the probability for the j-th class and i-th feature is<br />
defined by<br />
$p_j(x_i) = \max\left[ \frac{\sigma_j}{|x_i - x_{j,i}|} \cdot m_j(x_{j,i}) \right]$<br />
for the weight coefficient matrix given by $w_j = w_{j,i}$, where $\sigma_j$<br />
is the distribution density of values $x_j$ at the point $x_i$ of the<br />
membership function. The next step is to compute the mean<br />
class probability given by<br />
$\langle p \rangle = \frac{1}{j} \sum_j w_j p_j$<br />
where the distance from the mean probability selects the class<br />
associated with<br />
$p(j) = \min\left[ (p_j \cdot w_j - \langle p \rangle) \geq 0 \right]$<br />
providing a result for the decision making of the j-th class.<br />
The weight coefficient matrix is adjusted during the learning<br />
stage of the algorithm.<br />
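The decision rule can be sketched for a single feature as follows. This is a literal reading of the printed expressions under stated assumptions: each class is described by a list of training triples (reference value, membership value, density), the maximum is taken over those triples, a small eps avoids division by zero, and the selected class is the one whose weighted probability lies closest above the mean. The function names, the data layout and the fall-back used when rounding leaves no score above the mean are hypothetical.<br />

```python
def class_probability(x, samples, eps=1e-9):
    """max over training samples of density * membership / |x - x_ref|.
    samples is a list of (x_ref, membership, density) triples for one class."""
    return max(density * membership / max(abs(x - x_ref), eps)
               for x_ref, membership, density in samples)

def select_class(probabilities, weights):
    """Index of the class with the smallest non-negative (w_j * p_j - <p>)."""
    scores = [w * p for w, p in zip(weights, probabilities)]
    mean = sum(scores) / len(scores)
    above = [(s - mean, j) for j, s in enumerate(scores) if s - mean >= 0]
    if not above:
        # Rounding can leave every score marginally below the mean (assumption).
        return max(range(len(scores)), key=lambda j: scores[j])
    return min(above)[1]
```

With weighted probabilities 0.2, 0.9 and 0.6 the mean is about 0.57, so this literal reading selects the third class, the one nearest to the mean from above, and not the class with the largest probability.<br />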
The decision criterion method considered here represents a<br />
weighing-density minimax expression. The estimation of the<br />
decision accuracy is achieved by using the density function<br />
$d_i = |x_{\sigma_{\max}} - x_i|^3 + \left( \sigma_{\max}(x_{\sigma_{\max}}) - p_j(x_i) \right)^3$<br />
with an accuracy determined by<br />
$P = w_j p_j - w_j p_j \frac{2}{\pi} \sum_{i=1}^{N} d_i.$<br />
B. Supervised Learning Process<br />
The supervised learning procedure is the most important<br />
part of the system for operation in automatic recognition mode.<br />
The training set of sample objects should cover all ranges of<br />
class characteristics with a uniform distribution together with<br />
a universal membership function. This rule should be taken<br />
into account for all classes participating in the training of the<br />
system. An expert defines the class and accuracy for each<br />
model object where the accuracy is the level of self confidence<br />
that the object belongs to a given class. During this procedure,<br />
the system computes and transfers to a knowledge database<br />
a vector of values of parameters $x = \{x_i\}$ which forms the<br />
membership function $m_j(x)$. The matrix of weight factors $w_{j,i}$<br />
is formed at this stage accordingly for the i-th parameter and<br />
j-th class using the following expression:<br />
$w_{i,j} = \left| 1 - \sum_{k=1}^{N} \left( p_{i,j}(x^k_{i,j}) - \langle p_{i,j}(x_{i,j}) \rangle \right) p_{i,j}(x^k_{i,j}) \right|.$<br />
The result of the weight matching procedure is that all<br />
parameters which have been computed but have not made any<br />
contribution to the characteristic set of an object are removed<br />
from the decision making algorithm by setting $w_{j,i}$ to null.<br />
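The weight expression can be illustrated for one parameter and one class. The sketch assumes that the N training objects yield a list of probabilities for that parameter and class; the function name is hypothetical.<br />

```python
def feature_weight(probabilities):
    """w = |1 - sum_k (p_k - <p>) * p_k| for one parameter and one class."""
    mean = sum(probabilities) / len(probabilities)
    return abs(1.0 - sum((p - mean) * p for p in probabilities))
```

Since the sum equals N times the variance of the probabilities, identical probabilities across the training set give a weight of 1, and the weight moves away from 1 as the spread grows.<br />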
IX. DISCUSSION<br />
The methods discussed in the previous sections represent<br />
a novel approach to designing an object recognition system<br />
that is robust in classifying textured features, the application<br />
considered in this paper having required a symbiosis of the<br />
parametric representation of an object and its geometrical<br />
invariant properties. In comparison with existing methods, the<br />
approach adopted here has the following advantages:<br />
Speed of operation. The approach uses a limited but effective<br />
parameter set (feature vector) associated with an object<br />
instead of a representation using a large set of values (pixel<br />
values, for example). This provides a considerably higher operational<br />
speed in comparison with existing schemes, especially<br />
with composite tasks, where the large majority of methods<br />
require object separation. The principal computational effort<br />
is that associated with the computation of the feature vector<br />
using the metrics discussed in Section VII.<br />
Accuracy. The methods constructed for the analysis of<br />
sets of geometrical primitives are, in general, more precise.<br />
Because the parameters are feature values, which are not<br />
connected to an orthogonal grid, it is possible to design<br />
different transformations (shifts, rotational displacements and<br />
scaling) without any significant loss of accuracy compared<br />
with a set of pixels, for example. On the other hand, the overall<br />
accuracy of the method is directly influenced by the accuracy<br />
of the procedure used to extract the required geometrical tags.<br />
Generally, the accuracy of the method will always be lower<br />
than, for example, classical correlative techniques, where,<br />
due to padding, error can arise during the extraction of a<br />
parameter set. However, by using precise parameterization<br />
structures based on fractal geometry, remarkably good results<br />
are obtained.<br />
Reliability. The proposed approach relies first and foremost<br />
on the reliability of the extraction procedure used to establish<br />
the geometrical and parametric properties of objects, which,<br />
in turn, depends on the quality of the image, principally in<br />
terms of the quality of the contours. It should be noted that<br />
image quality is a common problem in any visual system<br />
and that in conditions of poor visibility and/or resolution, all<br />
vision systems will fail. In other words, the reliability of the<br />
system is fundamentally dependent on the quality of the input<br />
data.<br />
An additional feature of the system discussed in this paper<br />
is that the by-products of the image processing can be used<br />
for tasks that are related to image analysis, such as a search for<br />
objects in a field of view, object identification, maintaining an<br />
object in a field of view, optical correction of a view point and<br />
so on. These can include tasks involving the relative motion<br />
of an object with respect to another object or with respect to<br />
the background, for which the method considered can also be<br />
applied: collision avoidance tasks, for example.<br />
Among the characteristic disadvantages of the approach, it<br />
should be noted that: (i) the method requires a considerable<br />
number of different calculations to be performed and appropriate<br />
hardware is therefore mandatory in the<br />
development of a real time system; (ii) the accuracy of the<br />
method is intimately connected with the required computing<br />
speed: an increase in accuracy can be achieved but may be<br />
incompatible with acceptable computing costs. In general, it<br />
is often difficult to acquire a template of samples under real<br />
life or field trial conditions which have a uniform distribution<br />
of membership functions. If a large number of training objects<br />
are non-uniformly distributed, it is, in general, not possible to<br />
generate an accurate recognition system.<br />
The original approach to the decision process proposed<br />
includes the following important steps: (i) the<br />
density distribution is accurately estimated from the original<br />
samples in the membership function during a supervised<br />
learning phase, which improves the recognition accuracy under<br />
non-ideal conditions; (ii) the pre-filtering procedures provide<br />
a good response to the required features of the object without<br />
generating noise; (iii) the segmentation procedures discussed<br />
in Section III efficiently select only those objects required; (iv)<br />
computation of fractal parameters, in particular, the average<br />
lacunarity, helps to characterize the textural features (in terms<br />
of their classification) associated with the object.<br />
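As an illustration of step (iv), the sketch below computes an average lacunarity for a small grey-level patch. The definition used by the system is the one given earlier in the paper and is not repeated in this section, so the code assumes the common gliding-box definition (the ratio of the second moment of the box mass to the square of its mean), averaged over a few box sizes. The function names, box sizes and test patches are hypothetical.

```python
def gliding_box_lacunarity(image, box):
    """Gliding-box lacunarity <M^2> / <M>^2 for one box size.

    `image` is a list of equal-length rows of non-negative numbers and
    M is the sum of the values inside each box-by-box window.
    """
    rows, cols = len(image), len(image[0])
    masses = []
    for r in range(rows - box + 1):
        for c in range(cols - box + 1):
            mass = sum(image[r + i][c + j] for i in range(box) for j in range(box))
            masses.append(mass)
    mean = sum(masses) / len(masses)
    if mean == 0:
        raise ValueError("lacunarity is undefined for an empty image")
    mean_sq = sum(m * m for m in masses) / len(masses)
    return mean_sq / (mean * mean)


def average_lacunarity(image, boxes):
    """Mean of the gliding-box lacunarity over the given box sizes."""
    values = [gliding_box_lacunarity(image, b) for b in boxes]
    return sum(values) / len(values)


# A uniform patch has no gaps; a half-filled patch is strongly clustered.
uniform = [[1] * 8 for _ in range(8)]
half = [[1] * 4 + [0] * 4 for _ in range(8)]

uniform_value = average_lacunarity(uniform, [1, 2, 4])
half_value = average_lacunarity(half, [1, 2])
```

Under this definition a perfectly uniform texture gives a value of 1, and the value grows as the texture becomes more clumped, which is why a quantity of this kind is useful as a textural feature.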
The integration of Euclidean with fractal geometric parameters<br />
provides a more complete suite of tools for pattern<br />
recognition in combination with supervised learning through<br />
fuzzy logic criteria. In the following section, we consider the<br />
application of our approach for the design of a cytological<br />
screening system.<br />
X. APPLICATION TO CERVICAL SMEAR SCREENING<br />
The application considered in this section focuses on<br />
screening programmes that utilize Liquid Based Cytology<br />
(LBC). Cells are collected from the cervix in the same way as for a<br />
PAP smear, but using a very small brush instead of a spatula.<br />
The head of the brush is broken off and maintained in a liquid<br />
environment instead of smearing the cells directly onto a slide.<br />
This preserves the cells and so the results of the test are more<br />
reliable. At present, about one in twelve PAP smears have to<br />
be done again because they cannot be read properly. With the<br />
LBC approach, far fewer tests have to be repeated. However, the<br />
LBC method is not, as yet, in widespread use. Nevertheless,<br />
the system reported in this paper has been designed to operate<br />
in conjunction with screening centres that use LBC.<br />
A. Classes of Cervical Cells<br />
There are two main types of cervical cancer: (i) Squamous<br />
cell cancer; (ii) Adenocarcinoma. They are named after the<br />
type of cell that becomes cancerous. Squamous cells are<br />
the flat skin-like cells that cover the surface of the cervix.<br />
Squamous cell cancer is the most common type of cervical<br />
cancer. Adenocarcinoma cells are glandular cells that produce<br />
mucus. The cervix has these glandular cells along the inside<br />
of the passageway that runs from the cervix to the womb<br />
(the endocervical canal). Adenocarcinoma is a cancer of these<br />
cell types. It is less common than squamous cell cancer, but<br />
has become more commonly recognised in recent years. Only<br />
about one in five to one in ten cases of cervical cancer are<br />
adenocarcinoma. Adenocarcinoma is associated with a similar<br />
precancerous phase. It is treated in the same way as squamous<br />
cell cancer of the cervix.<br />
Tables I and II explain the relationship between<br />
the current system and the Bethesda 2001 classifications -<br />
http://www.aafp.org/afp/2003/1115/p1992.html. The first class<br />
represents normal cells and the last one represents malignant (cancerous)<br />
cells. Intermediate classes represent different degrees<br />
of abnormality; it is important to detect these as well. The<br />
classification on which the system is ‘focused’ is simplified<br />
because, unlike Bethesda 2001, it provides a fuzzy estimation<br />
of class membership, which gives a better description of the<br />
cell state. An additional class, Exudate, is defined to describe<br />
irrelevant structures in the image.<br />
With current techniques, all cervical smear tests are examined<br />
by ‘screeners’ who have only a few minutes per slide.<br />
This means that the screening is done at low magnification and<br />
high speed, so it is not surprising that mistakes can be made.<br />
TABLE I<br />
CLASSIFICATION OF SQUAMOUS CELLS.<br />
System | Bethesda 2001<br />
Normal Sq | Normal squamous cells<br />
Normal Sq | Atypical squamous cells – ‘undetermined significance’ (ASC-US)<br />
Normal Sq | Atypical squamous cells – ‘cannot exclude high grade disease’ (ASC-H)<br />
LSIL | Low grade squamous intra-epithelial lesion (LSIL)<br />
HSIL | High grade squamous intra-epithelial lesion (HSIL) – CIN2<br />
HSIL | High grade squamous intra-epithelial lesion (HSIL) – CIN3<br />
Invasive Sq | Invasive squamous carcinoma<br />
TABLE II<br />
CLASSIFICATION OF GLANDULAR CELLS.<br />
System | Bethesda 2001<br />
Normal Gl | Normal glandular cells<br />
Normal Gl | Atypical glandular cells (AGC) – endocx/endom/not specified<br />
Normal Gl | Atypical glandular cells (AGC) – favour neoplasia<br />
AIS | Adenocarcinoma in situ (AIS)<br />
Invasive Adeno | Adenocarcinoma<br />
The ‘screeners’ look for abnormal variations in the ratio of the<br />
size of the nucleus relative to the size of the cell, as well as<br />
other markers of diseased tissue. When they identify suspect<br />
areas of the slide they mark these with a felt tip pen and pass<br />
them on for further inspection. These slides are then looked<br />
at by ‘checkers’ who have more experience and examine the<br />
slide more carefully and at higher magnification. If they are not<br />
satisfied that ‘all is well’, then they pass the suspect slides to a<br />
cytopathologist for further, more detailed analysis and diagnosis.<br />
Even at this final stage, mistakes can be made as each slide<br />
is prepared differently and it is common for cells to overlie<br />
each other, further compounding the problem of accurate<br />
diagnosis. New techniques that use cytocentrifuge preparations<br />
(e.g. http://www.tharmac.com/?p=15) can overcome this last<br />
problem but have yet to be introduced in general.<br />
One of the major criteria for assessing whether a cell is pre-malignant<br />
or malignant is the ratio of the size of the nucleus of<br />
the cell compared with that of the whole cytoplasm: the nuclear/cytoplasmic<br />
ratio. The rapid identification of variations in<br />
these ratios enables ‘checkers’ to determine quickly and more accurately<br />
whether there are abnormalities by examining cells that<br />
are located in a small area. To estimate the condition of the<br />
cells, the cytologist typically makes up to 300 slide movements<br />
over a period of 8-10 minutes on a desk microscope and may<br />
consequently miss many important features. This approach not<br />
only takes time but also cannot guarantee consistent<br />
and accurate estimates of the condition of the cells. With an<br />
increasing number of screening projects taking place, together<br />
with the variability of different preparations, diagnostic errors<br />
can lead to a number of fatalities due to false negatives and<br />
lack of appropriate treatment in the early stages of cervical<br />
cancer.<br />
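The nuclear/cytoplasmic ratio described above can be sketched as a simple area computation. The example assumes that segmentation has already produced two binary masks of the same size, one for the whole cell and one for its nucleus, and that the cytoplasm is the cell area minus the nucleus area. The mask layout and function names are hypothetical and are not taken from the system reported here.

```python
def count_pixels(mask):
    """Number of non-zero entries in a binary mask (list of rows)."""
    return sum(1 for row in mask for value in row if value)


def nuclear_cytoplasmic_ratio(nucleus_mask, cell_mask):
    """Nucleus area divided by cytoplasm area, both measured in pixels.

    Only nucleus pixels that lie inside the cell mask are counted, and the
    cytoplasm is taken to be the cell area minus the nucleus area.
    """
    nucleus_area = sum(
        1
        for n_row, c_row in zip(nucleus_mask, cell_mask)
        for n, c in zip(n_row, c_row)
        if n and c
    )
    cytoplasm_area = count_pixels(cell_mask) - nucleus_area
    if cytoplasm_area <= 0:
        raise ValueError("cell mask must be larger than the nucleus")
    return nucleus_area / cytoplasm_area


# Hypothetical 6 x 6 cell containing a 2 x 2 nucleus.
cell = [[1] * 6 for _ in range(6)]
nucleus = [[1 if 2 <= r <= 3 and 2 <= c <= 3 else 0 for c in range(6)] for r in range(6)]

ratio = nuclear_cytoplasmic_ratio(nucleus, cell)  # 4 nucleus pixels / 32 cytoplasm pixels
```

A ratio computed per cell in this way can be tabulated over a whole slide, which is the kind of systematic measurement that a manual examination of a few hundred fields cannot guarantee.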
At present, there are no commercial or experimental systems<br />
available for the automatic identification and classification<br />
of tissue cells without human participation. Obtaining results<br />
from cytology diagnostics in real time with a robust least error<br />
criterion is a widespread and important problem for screening<br />
the cervix uteri. The automatic coloring (staining) and<br />
scanning of the material creates the preconditions for designing an<br />
algorithm and technical devices for automatic identification<br />
and classification in cytopathology. A key point is to identify<br />
and classify the condition of the cell nuclei using a suitable<br />
recognition process.<br />
There is a range of techniques that aim to<br />
improve the examination of slides using integrated<br />
optical densitometry. For example, SurePath -<br />
http://www.pathlabsofark.com/surepathliquidpap.html -<br />
uses integrated optical density of conventional smears. The<br />
aim of the system reported in this paper is to exclude 25%<br />
of samples without visual examination. Unlike a human<br />
expert, the automatic scanning method can count the cells<br />
and estimate their statistical distribution among classes or<br />
states. The system delivers high accuracy and automation due<br />
to the following innovations:<br />
Fractal analysis<br />
Biological structures (such as body tissues) have<br />
natural fractal properties. Numerical measurements<br />
of these properties provide for the efficient and<br />
effective detection of abnormalities.<br />
Extended set of detectable features<br />
High accuracy is achieved when multiple features are<br />
measured together and combined into a result.<br />
Advanced fuzzy logic engine<br />
The knowledge-based recognition scheme enables<br />
highly accurate diagnosis.<br />
B. System Overview<br />
It is proposed that the approach described in this paper and<br />
the system developed may assist cytopathologists in reducing<br />
their workload by eliminating, in a secure manner, a percentage<br />
of normal smears, thus allowing more time for the evaluation<br />
of the abnormal cases. The ‘software solutions’ detect abnormalities<br />
in organic structures such as cells by digital image<br />
analysis. Cancer experts create the knowledge database by<br />
training the system with a number of case study images. The<br />
recognition algorithm is composed of the following steps:<br />
Filtering<br />
The image is filtered to reduce noise and remove<br />
unnecessary features (bacteria, broken cells).<br />
Segmentation<br />
The image is segmented to perform a separate analysis<br />
of each object. In order to separate connected<br />
objects, a new algorithm has been designed. An<br />
example of the GUI developed is given in Figure 16,<br />
which shows the stage at which the nuclei of suspect<br />
cells have been identified and located.<br />
Feature Detection<br />
For each object, a set of recognition features is<br />
detected. The features are numeric parameters that<br />
describe the object, including fractal geometric<br />
parameters. The system captures a variety of geometrical,<br />
fractal and statistical features in one and two dimensions.<br />
One-dimensional features correspond to<br />
the border of objects, whereas two-dimensional features<br />
relate to the surface within and around objects.<br />
TABLE III<br />
IMAGING ACQUISITION HARDWARE.<br />
Model and Supplier | Advantages | Shortcomings<br />
Nikon Coolscope (Nikon Instruments Europe BV) | Available on the market. Magnification 40x. | Very slow (several hours/slide). Small focus depth and automatic focus does not find the optimal z-position. Dynamic range to be adjusted. Tiling scan.<br />
Aperio Scanscope (Aperio Technologies: DakoCytomation) | Complete solution with a slide feeder. High scanning speed (20 min/slide). Magnification 40x. Non-tiling scan. Better focus. Better dynamic range. | Not fully developed. Problem to achieve 60x.<br />
Nikon Eclipse E8000 + JVC 3-CCD KY-F55B | Variable resolution 4X-80X. Manual focus. Manual brightness. | Manual image capture. Can be used only for testing.<br />
Decision Making<br />
The system uses fuzzy logic to combine features<br />
into a decision. A decision is the estimated class<br />
of the object together with the probability that the estimate is accurate. In-between states<br />
are determined by the probability. For example,<br />
35% normal is equivalent to 65% abnormal and<br />
suggests careful analysis by cancer specialists. In<br />
the extended training version for cervical cancer, the<br />
system provides up to 10 classes (CINs) depending on<br />
the classification system and the number and extent<br />
of available samples for learning procedures.<br />
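The Decision Making step can be sketched as follows. This is only an illustration of how per-feature fuzzy memberships might be combined into class shares such as ‘35% normal, 65% abnormal’; it is not the fuzzy logic engine of the system. It assumes that each feature yields a membership value in [0, 1] for every class, that memberships are combined with the minimum operator (a common fuzzy AND) and that the results are normalised to sum to one. All names and numbers are hypothetical.

```python
def class_shares(feature_memberships):
    """Combine per-feature memberships into normalised class shares.

    `feature_memberships` maps a class name to the membership values (each
    in [0, 1]) that the measured features have in that class. The values
    are combined with min() and then normalised so the shares sum to 1.
    """
    combined = {name: min(values) for name, values in feature_memberships.items()}
    total = sum(combined.values())
    if total == 0:
        raise ValueError("no class is supported by the measured features")
    return {name: value / total for name, value in combined.items()}


def decide(feature_memberships):
    """Return the most supported class together with all class shares."""
    shares = class_shares(feature_memberships)
    best = max(shares, key=shares.get)
    return best, shares


# Hypothetical memberships of three features in two classes.
memberships = {
    "normal": [0.35, 0.60, 0.50],
    "abnormal": [0.90, 0.65, 0.80],
}
best_class, shares = decide(memberships)
```

With these illustrative numbers the object is assigned to the abnormal class with a share of about 0.65, leaving about 0.35 for normal, which is the kind of in-between state that would be referred to a specialist.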
Fig. 16. GUI associated with the cervical smear analysis system.<br />
The system has been developed to operate with a range of<br />
image acquisition hardware, examples of which are provided<br />
in Table III.<br />
XI. CONCLUSION<br />
This paper has been concerned with developing<br />
a methodology and implementing applications that address<br />
two key tasks: (i) the partial analysis of an image<br />
in terms of its fractal structure and the fractal properties that<br />
characterize that structure; (ii) the use of a fuzzy logic engine<br />
to classify an object based on both its Euclidean and fractal<br />
geometric properties. The combination of these two aspects<br />
has been used to define a processing and image analysis engine<br />
that is unique in its modus operandi but entirely generic in<br />
terms of the applications to which it can be applied.<br />
The research has investigated numerous processes for pattern<br />
recognition using fractal geometry as a central processing<br />
kernel. This has led to the design of a new library of pattern<br />
recognition algorithms. The image types considered contain<br />
about 80% of the useful environmental information for the human.<br />
With rapid advances in video technology, the content of a<br />
video stream is increasing at a rate that is far beyond the<br />
capacity of the human brain for decision making. This necessitates<br />
the development of an automatic image processing and<br />
decision making system using artificial intelligence. Such<br />
systems are required in search engines, information databases,<br />
navigation in unknown terrain, interpretation of two-dimensional<br />
data, etc.<br />
The creation of logic and general purpose hardware for<br />
artificial intelligence is a basic theme for any future development<br />
based on the results reported in this paper, for<br />
the applications developed and beyond. The results of the<br />
current system can be utilized in a number of different areas,<br />
although medical imaging would appear to be one of the<br />
most natural fields of interest because of the nature of the<br />
images available, their complex structures and the difficulty<br />
of obtaining accurate diagnostic results in an efficient<br />
and time effective manner. A further extension of our approach is to<br />
consider the effect of replacing the fuzzy logic engine used<br />
to date with an appropriate Artificial Neural Network (ANN). It is<br />
not clear whether the application of an ANN could<br />
provide a more effective system and whether it could provide<br />
greater flexibility with regard to the type of images used and<br />
the classifications that may be required. Within the context<br />
of this paper, algorithms have been designed that focus on<br />
solving the detection and classification problems associated<br />
with the analysis of cervical smear images. In this respect, a<br />
new set of image processing algorithms has been developed<br />
that may have value in a wider class of image processing<br />
and pattern recognition applications, particularly with regard to<br />
medical image analysis.<br />
ACKNOWLEDGMENTS<br />
This work is supported by the Science Foundation Ireland.<br />
The authors are grateful for the advice and help of Dr Alastair<br />
Deery (Department of Cellular Pathology, St Georges Hospital,<br />
London), Professor Jonathan Brostoff (Kings College, London<br />
University) and Professor Irina Shabalova (Russian Medical<br />
Academy of Postgraduate Education, Moscow).
REFERENCES<br />
[1] J. M. Blackledge, Digital Signal Processing, 2 nd Edition, Horwood<br />
Publishing, 2006.<br />
[2] J. M. Blackledge, Digital Image Processing, Horwood Publishing, 2005.<br />
[3] E. R. Davies. Machine Vision: Theory, Algorithms, Practicalities, Academic<br />
press, London, 1997.<br />
[4] H. Freeman, Machine vision. Algorithms, Architectures, <strong>and</strong> <strong>Systems</strong>,<br />
Academic press, London, 1988.<br />
[5] M. G. Rojo, G. B. Garcia, C. P. Mateos, J. G. Garcia <strong>and</strong> M. C. Vicente,<br />
Critical Comparison of 31 Commercially Available Digital Slide <strong>Systems</strong><br />
in Pathology, Int. J. Surg. Pathol., 14, 285-30, 2006.<br />
[6] J. M. Blackledge <strong>and</strong> D. Dubovitskiy, Object Detection <strong>and</strong> Classification<br />
with Applications to Skin Cancer Screening, ISAST Transactions<br />
on <strong>Intelligent</strong> <strong>Systems</strong>, No. 1, Vol. 1, 34-45, ISSN:1797-1802, 2008.<br />
[7] J. M. Blackledge, D, Dubovitskiy, Surface Inspection using a Computer<br />
Vision System that Includes Fractal Analysis, ISAST Transaction on<br />
Electronics <strong>and</strong> Signal Processing, No. 2, Vol. 3, 76 -89, ISSN:1797-<br />
2329, 2008<br />
[8] Canny J. A computational approach to edge detection. IEEE Trans.<br />
Pattern Analysis <strong>and</strong> Machine Intelligence, (PAMI-8):679–698, 1986.<br />
[9] Shunji Mori Hideyuki Tamura <strong>and</strong> Takashi Yamawaki. Textual features<br />
corresponding to visual perception. IEEE Man. <strong>and</strong> Cybernetics, 6,<br />
1978.<br />
[10] Falconer K. Fractal Geometry. Wiley, 1990.<br />
[11] Pantanowitz L, Henricks W, Beckwith B. Medical laboratory informatics.<br />
Clin Lab Med, 27:823-43, 2007.<br />
[12] Pantanowitz L, Hornish MA, Goulart RA. Computer-assisted cervical<br />
cytology. Medical information Science, 2008.<br />
[13] Yagi Y, Gilberson JR. Digital imaging in pathology: The case for<br />
st<strong>and</strong>ardization. J Telemed Telecare, 11:109-16, 2005.<br />
[14] Jr Louis J. Galbiati. Machine vision <strong>and</strong> digital image processing<br />
fundamentals. State University of New York, New-York, 1990.<br />
[15] Roger Boyle Milan Sonka, Vaclav Hlavac. Image Processing, Analysis<br />
<strong>and</strong> Machine Vision. PWS, USA, 1999.<br />
[16] Wesley E.Snyder Hairong Qi. Machine Vision. Cambridge University<br />
Press, Engl<strong>and</strong>, 2004.<br />
[17] V.S Nalwa <strong>and</strong> T.O.Binford. On detecting edge. IEEE Trans. Pattern<br />
Analysis <strong>and</strong> Machine Intelligence, (PAMI-8):699–714, 1986.<br />
[18] Lotfi A. Zadeh. Fuzzy sets <strong>and</strong> their applications to cognitive <strong>and</strong><br />
decision processes. Academic Press, New York, 1975.<br />
[19] E.H.Mamdani. Advances in linguistic synthesis of fuzzy controllers.<br />
J.Man Mach., 8:669–678, 1976.<br />
[20] E.Sanchez. Resolution of composite fuzzy relation equations.<br />
Inf.Control, 30:38–48, 1976.<br />
[21] N.Vadiee. Fuzzy rule based expert system-I. Prentice Hall, Englewood,<br />
1993.<br />
[22] Contour Tracing Algorithms http://www.cs.mcgill.ca/ aghnei/alg.html<br />
[23] Patrick R.Andrews Martin J.Turner, Jonathan M.Blackledge. Fractal<br />
Geometry in Digital Imaging. Academic Press, London, 1998.<br />
[24] Cancer research uk. http://www.cancerresearchuk.org/<br />
aboutcancer/reducingyourrisk/9314.<br />
[25] J. S. Lim, Two-Dimensional Signal <strong>and</strong> Image Processing, Prentice-Hall,<br />
1990.<br />
Jonathan Blackledge received a BSc in Physics<br />
from Imperial College, London University in 1980,<br />
a Diploma of Imperial College in Plasma Physics<br />
in 1981 and a PhD in Theoretical Physics from<br />
Kings College, London University in 1983. As a Research<br />
Fellow of Physics at Kings College (London<br />
University) from 1984 to 1988, he specialized in<br />
information systems engineering undertaking work<br />
primarily for the defence industry. This was followed<br />
by academic appointments at the Universities of<br />
Cranfield (Senior Lecturer in Applied Mathematics)<br />
and De Montfort (Professor in Applied Mathematics and Computing)<br />
where he established new post-graduate MSc/PhD programmes and research<br />
groups in computer aided engineering and informatics. In 1994, he co-founded<br />
Management and Personnel Services Limited where he is currently<br />
Executive Director. His work for Microsharp (Director of R & D, 1998-2002)<br />
included the development of manufacturing processes now being<br />
used for digital information display units. In 2002, he co-founded a group<br />
of companies specializing in information security and cryptology for the<br />
defence and intelligence communities, actively creating partnerships between<br />
industry and academia (e.g. Lexicon Data Limited). He is currently holder<br />
of the Stokes Professorship in Digital Signal Processing and Information and<br />
Communications Technology based at Dublin Institute of Technology and has<br />
published over one hundred scientific and engineering research papers and<br />
technical reports for industry, six industrial software systems, fifteen patents,<br />
ten books and has been supervisor to sixty research (PhD) graduates. His current<br />
research interests include computational geometry and computer graphics,<br />
image analysis, nonlinear dynamical systems modelling and computer network<br />
security, working in both an academic and commercial context. He holds<br />
Fellowships with England’s leading scientific and engineering Institutes and<br />
Societies including the Institute of Physics, the Institute of Mathematics and<br />
its Applications, the Institution of Electrical Engineers, the Institution of<br />
Mechanical Engineers, the British Computer Society, the Royal Statistical<br />
Society and the Chartered Management Institute.<br />
Dmitry Dubovitskiy received a BSc and Diploma<br />
in Aeronautical Engineering from Saratov Aviation<br />
Technical College in 1993, an MSc in Computer<br />
Science and Information Technology from Baumann<br />
Moscow State Technical University in 1999 and a<br />
PhD in Computer Science from De Montfort University<br />
in 2005 under the supervision of Professor<br />
J M Blackledge. As a project leader in medical<br />
imaging at Microsharp Limited from 2002 to 2005,<br />
he specialized in information systems engineering,<br />
developing image recognition systems for medical<br />
applications for real time operational diagnosis. He founded Oxford Recognition<br />
Limited in 2005 which specialises in the applications of artificial<br />
intelligence for computer vision. He has developed a range of computer vision<br />
systems for industry including applications for 3D image visualisation and has<br />
been coordinator for the INTAS project in distributed automated systems for<br />
acquiring <strong>and</strong> analysing eye tracking data.