In Network Processing and Data Aggregation in


Athens University of Economics and Business
Department of Computer Science

Networking Issues in wireless sensor networks and the Data Aggregation problem: A survey

Dimosthenis Pediaditakis
Athens, Monday 3 July 2006

1st Supervisor: George C. Polyzos
2nd Supervisor: George Xylomenos

A thesis submitted in partial fulfillment of the requirements of the degree of Master in Computer Science of Athens University of Economics and Business, Department of Computer Science


Networking Issues in wireless sensor networks and the Data Aggregation problem: A survey

Abstract – In recent years, a great deal of research has been devoted to wireless sensor networks (WSNs). WSNs are networks consisting of small self-powered devices that communicate with each other over the wireless medium and coordinate their operation in order to perform distributed sensing near physical phenomena. The total lifetime of the network is a major design factor for the protocols, applications and, in general, algorithms used by a WSN. It is therefore always a good choice to avoid transmitting large amounts of data. Each sensor node locally produces raw data upon sensing events. These data usually flow to a central node (called the sink) that collects them for further processing. Such a traffic pattern tends to exhaust the limited resources of the sensor network (including its energy). In-network processing, and in particular data aggregation techniques, try to reduce the traffic inside the network by trading communication, which is expensive in terms of energy, for local computation at each node. In this paper, we first point out the main networking challenges for a WSN and give an overview of the data aggregation problem. We then present a taxonomy of the most representative aggregation techniques. Finally, we discuss the main trade-offs that result from the use of data aggregation and conclude our work.

Keywords – Wireless sensor networks, data aggregation, in-network processing, data-centric routing, sensor database systems, query processing.


1 Introduction

Recently, technological advancements in hardware have led to the design of extremely powerful chips. Gordon E. Moore stated in 1965 that the complexity of integrated circuits, with respect to minimum component cost, doubles every year, a rate he revised in 1975 to every 24 months. This “law” held for several years. However, on April 13 2005, Gordon Moore himself stated in an interview that the law may not hold for much longer, since transistors may reach the limits of miniaturization at atomic levels. Now that computational speed is bounded by physical limitations, the interest of hardware experts has turned to the miniaturization of devices. The new trend of compacting systems enabled the development of coin-sized computing devices that are capable of producing digital representations of real-world phenomena. These devices are widely known as wireless sensors and they usually consist of sensing, data processing, and communicating components. Their ability to communicate with each other through wireless interfaces enables the deployment of a large number of sensor nodes in a field, forming a wireless sensor network (WSN). In order for sensors to be both small and inexpensive, they have several resource constraints:

Low bandwidth communication – The bandwidth of wireless links is usually limited to a few hundred Kbps. The network provides no quality of service, latency is highly variable and packet loss is frequent.

Power consumption – Most of the time sensors are battery powered. The batteries used are non-rechargeable, irreplaceable and very small. Therefore, while traditional networks aim to achieve high quality of service, sensor network protocols focus primarily on power conservation. Prolonging the network’s lifetime is the main goal. In addition, a WSN must support parameterized trade-off mechanisms that give the end user the option of prolonging network lifetime at the cost of lower throughput or higher transmission delay.

Computation – A node has limited computational power, but it is usually adequate to cover the needs of the communication, application and sensing activities.

Sensing accuracy – Signal processing functions convert physical events into internal data representations. Sensed data, due to limitations of the sensor, may contain environmental noise, which induces uncertainty in the readings. A damaged sensor might also generate inaccurate data.

Figure 1 illustrates a typical example of a sensor node (the Berkeley MICA Mote) so that the reader can appreciate its tiny size. Table 1 lists the hardware characteristics of the MICA Mote.

Processor: 4 MHz, 8-bit MCU (ATMEL)
Storage: 512 KB
Radio Band: 916 MHz
Comm. Range: 33 m
Data Rate: 40 Kbits/sec
Tx Power: 12 mA
Receive Power: 1.8 mA
Table 1. Mica Mote’s characteristics

Figure 1. The Berkeley MICA Mote

Wireless sensor networks can be used in a wide range of application domains:
• Environmental monitoring (coordinated sensing or information gathering in a disaster area).
• Supervising items in a factory warehouse.
• Intelligent building management.
• Organization of vehicle traffic in a large city (traffic routing).
• Military (target recognition and tracking).
• Medical and health care.

Each of the above applications has its own network, hardware and node deployment requirements. WSNs lack standardized solutions because they are application-specific networks, i.e., the design requirements of a sensor network change depending on the application. However, no matter the kind of application, the communication scenario is always identical (Figure 2). A user sends several tasks in the form of queries to the network. The user can either be directly attached to a gateway node or be connected remotely through an intermediate network (e.g. the Internet or a satellite link). After the query reaches the WSN (sink), it is automatically translated into an internal


Thus, in-network processing is one of the most effective ways to minimize the traffic inside a WSN, and it is generally performed in the form of data aggregation. A typical query asks for the average/max/min of the sensed values within a given area. Assume that the user is placed at the sink and injects into the network a query Q, asking for the average temperature of area A. The query can be answered either by using a direct-delivery approach or by performing data aggregation:

Direct Delivery
After the propagation of Q to area A, each node inside A is required to send its own readings back to the host node (sink) for processing. After receiving all data packets from the source nodes, the sink locally aggregates all of the data into a final value and reports the value back to the user.

Distributed in-network aggregation
In this technique, the sensor network forms a reverse multicast tree, as shown in Figure 3, where the sink injects query Q into the network. The sensor nodes start sending back their sensed values related to the phenomena of interest. All the information flows in a top-down manner, and as the packets arrive from multiple sensor nodes they are aggregated inside the network before they reach the sink. For example, the data of sensors A and B are aggregated at node E. In the same way, sensor node F aggregates the data from sensor nodes C and D. When a node receives two or more values from its children, it forwards down only one value (the aggregate), thereby reducing the total number of transmitted messages. In addition, the final value that reaches the sink will be the requested aggregate.
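The saving in transmitted messages can be illustrated with a small count. The following is a minimal sketch under stated assumptions, not the implementation of any surveyed system: the topology (A and B report to E, C and D report to F, E and F report to the sink) follows the example above, while the temperature readings, the function names and the choice of forwarding partial (sum, count) records are hypothetical and introduced only for illustration. Partial (sum, count) records are used because an average of averages is only correct when every subtree holds the same number of readings.

```python
from statistics import mean

# Assumed topology modelled on the example: each node maps to its parent.
PARENT = {"A": "E", "B": "E", "C": "F", "D": "F", "E": "SINK", "F": "SINK"}
# Hypothetical temperature readings of the four source nodes.
READINGS = {"A": 20.0, "B": 22.0, "C": 19.0, "D": 23.0}


def hops_to_sink(node):
    """Number of links between a node and the sink."""
    hops = 0
    while node != "SINK":
        node = PARENT[node]
        hops += 1
    return hops


def direct_delivery(readings):
    """Every reading is relayed unchanged, hop by hop, to the sink."""
    messages = sum(hops_to_sink(node) for node in readings)
    return mean(readings.values()), messages


def in_network_average(readings):
    """Each node sends a single partial (sum, count) record to its parent."""
    partial = {node: (value, 1) for node, value in readings.items()}
    messages = 0
    # Visit the deepest nodes first so children are merged before a parent sends.
    for node in sorted(PARENT, key=hops_to_sink, reverse=True):
        if node not in partial:
            continue
        node_sum, node_count = partial.pop(node)
        parent_sum, parent_count = partial.get(PARENT[node], (0.0, 0))
        partial[PARENT[node]] = (parent_sum + node_sum, parent_count + node_count)
        messages += 1
    total, count = partial["SINK"]
    return total / count, messages


if __name__ == "__main__":
    print("direct delivery:", direct_delivery(READINGS))
    print("in-network aggregation:", in_network_average(READINGS))
```

Both functions return the same average, but in this assumed tree direct delivery needs 8 link transmissions while aggregation needs 6; the gap widens as the tree gets deeper.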


Figure 3. Aggregation tree is a reverse multicast tree

The plan for the remainder of this paper is as follows: In Section 2, we present the main networking issues of wireless sensor networks and also data-centric routing. In Section 3 we discuss the data aggregation problem and distinguish the different types of aggregates that may be computed over a WSN. In Section 4 we present a taxonomy of the current, widely accepted aggregation techniques. In Section 5 we review the main trade-offs resulting from the use of data aggregation. Section 6 concludes the paper.


The following figure (Figure 4) visualizes the three different ranges mentioned above:

Figure 4. TR = Transmission range, DR = Detection range, IR = Interference range

The transmission range (and of course both the detection and interference ranges) can be increased by allowing the transceiver to consume more energy. It has been found that the power required by a sender in order to reach a receiver at a given distance d is proportional to d^2 (when the antennas of the communicating nodes are low and close to the ground, the exponent increases, to 4 at most). Many hardware manufacturers support dynamic changes of the transmitting power, so the transmission range can be adapted to the current needs of the network. This is not the case, though, for many WSNs due to cost-per-node constraints.

Bearing in mind the preceding discussion of the behavior of the wireless medium, we now list the factors that mainly affect the signal propagation process:

Shadowing: The environment in which a sensor node is placed is likely to be dynamic, and thus changes may occur unexpectedly. Often, an object may obstruct the communication between two or more sensor nodes. An example would be a large animal sleeping in the middle of the network.

Path loss: An electromagnetic wave attenuates as the distance between the two communicators increases. This is the reason why we need to transmit at higher power levels in order to reach a distant node.

Fading: The characteristics of the channel change over time and location, leading to variations in the power of the received signal.

Interference: In a network it is almost certain that all sources transmit, and all receivers listen for signals, at the same frequency. Unless there is a mechanism that can perfectly synchronize all the nodes, two or more signals will inevitably reach the same receiver simultaneously, at different power levels. That is called interference.

Multipath propagation: The same signal may follow two or more different paths on its way to the receiver because of reflections (on objects or even on the ground). The multiple paths have various lengths; different path lengths lead to different reception times, and this finally results in the blurring of the signal at the receiver.

Another function that the physical network layer may support is physical carrier sense. This is done by listening to the channel for possible transmissions in order to avoid sending signals that may interfere with others. Carrier sense is performed at random times, unless the MAC sublayer demands it.

To give an idea of the specifications of the wireless channel of a typical WSN node, we present the following table (Table 2), which lists the transceiver characteristics of the most popular commercial sensor nodes (for more information the reader is encouraged to read [14]).

Prototype | Radio | Frequency Band | Standard | Spread Spectrum | Data Rate
UC Berkeley/Crossbow WeC (1999) | TR1000 | 868 / 916 MHz | - | No | 10 kbps
UC Berkeley/Crossbow Rene (2000) | TR1000 | 868 / 916 MHz | - | No | 10 kbps
UC Berkeley/Crossbow MICA (2002) | TR1000 | 868 / 916 MHz | - | No | 40 kbps
UC Berkeley/Crossbow MICA2 (2003) | Chipcon CC1000 | 315, 433 or 868 / 916 MHz | - | Yes (s/w) | 38.4 kbps
Intel iMote (2003) | Wireless BT Zeevo | 2.4 GHz | IEEE 802.15.1 | Yes | 600 kbps
Microstrain, Galbreath et al. (2003) | RF Monolithics DR-3000-1 | 916.5 MHz | - | - | 75 kbps
Table 2. Transceiver characteristics of the most popular commercial sensor nodes
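The path-loss relation stated earlier in this section (required power proportional to d^2, with the exponent rising to at most 4 for antennas close to the ground) can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only: the function name and its parameters are assumptions, and it computes a relative scaling factor rather than an absolute power level.

```python
def tx_power_scaling(distance_ratio: float, exponent: float = 2.0) -> float:
    """Factor by which transmit power must grow when the distance to the
    receiver grows by `distance_ratio`, for a given path-loss exponent."""
    if distance_ratio <= 0:
        raise ValueError("distance_ratio must be positive")
    if not 2.0 <= exponent <= 4.0:
        # The text bounds the exponent between 2 and 4.
        raise ValueError("exponent is expected to lie between 2 and 4")
    return distance_ratio ** exponent


if __name__ == "__main__":
    for exponent in (2.0, 3.0, 4.0):
        factor = tx_power_scaling(2.0, exponent)
        print(f"exponent {exponent}: doubling the range costs {factor:.0f}x power")
```

Under this model, doubling the transmission range costs between 4 and 16 times the transmit power, depending on the exponent.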


2.1.3 Proposed schemes

Several systems have suggested physical layer solutions that provide APIs to the upper layers, for example for setting the radio into different states (sleep, idle, reception, transmission). The lack of standardization at the lower layers and in the (physical) sensor hardware is noticeable. In the following paragraphs we mention the most significant and popular schemes that have been proposed.

Bluetooth [13]
Operates in the 2.4 GHz ISM band. It employs binary Gaussian Frequency Shift Keying (GFSK) at a symbol rate of 1 MBaud. It performs frequency hopping spread spectrum, randomly hopping across 79 channels (1 MHz each) at 1600 hops per second. Although it is frequently suggested for sensor applications, the Bluetooth physical layer is not very suitable for WSNs [12]. This is mainly because it consumes a lot of power searching the band for the network on a packet-rate time scale (frequency hopping). What is more, the narrow channel separation makes the phase noise requirements of signal sources more demanding.

IEEE 802.11b WLAN [15]
Operates in the 2.4 GHz ISM band. This worldwide IEEE standard specifies three different layer-1 options at 1 Mb/s and optionally 2 Mb/s:
• Infrared,
• Frequency hopping spread spectrum at 2.4 GHz and
• Direct sequence spread spectrum at 2.4 GHz [DBPSK at 1 Mb/s and DQPSK at 2 Mb/s. Using complementary code keying (CCK) it achieves up to 11 Mb/s].
The 1- and 2-Mb/s direct sequence 802.11 physical layer is a possible option for a WSN because it has minimal hardware requirements, the provided data rate is more than enough and it does not have the problems of frequency hopping systems (like Bluetooth). Using the extended version (CCK) of the direct sequence option has a high cost in power consumption and transceiver hardware complexity (e.g. the transceiver has to support complex identification and decoding functions for the CCK sequences).

μAMPS [12]
Proposes a bottom-up approach (from the physical layer to the application layer) to the development of a protocol for a wireless sensor network. The application, MAC and physical layers have to be tightly integrated with the hardware of the sensor node. For example, the energy penalties of switching states, or the power consumed in each state of a transceiver, might determine the policy of the MAC layer scheme or even affect the network layer routing algorithm. Multiple access schemes (Time Division Multiple Access, Frequency Division Multiple Access) are compared, and the same is done for modulation schemes (binary, M-ary modulation). It is found that a hybrid TDMA-FDMA scheme and the M-ary modulation scheme achieve significant power savings. Another energy-expensive factor is the frequent turning on/off of a node's transceiver, because the control input spends a non-negligible amount of time setting the right voltage on the transceiver (start-up energy). All the previous notes show that layer 1 must not be treated as a black box when designing an upper level protocol or even an application for a WSN.

2.2 Data Link Layer (MAC)

2.2.1 Medium Access Control (MAC) functionality

Above the Physical Layer (Layer-1) resides the Data Link Layer (Layer-2). Using only the services provided by layer 1, the link layer is responsible for the transmission of data packets between two communicating nodes. The communication pattern may be point-to-point or even point-to-multipoint. Nevertheless, this function is not as simple as it sounds. There are several services that the link layer may offer to the upper layers, such as:
• medium access,
• packet encapsulation in link-layer frames and data frame detection,
• flow control,
• error detection/correction,
• data stream multiplexing and
• reliable communication mode.


Medium access functionality is so important that it constitutes an individual sub-layer, the Medium Access Control sub-layer, widely known as MAC. To convey the importance of the MAC sub-layer, its primary functions are listed below:
• medium access control before transmitting,
• bit-stream fragmentation into frames upon reception,
• level-2 frame encapsulation before transmission,
• insertion of checksums for error detection,
• insertion of the source and destination MAC addresses inside all transmitted frames and
• frame filtering by checking the destination MAC address of the packet.
For the rest of this section, we are concerned with the medium access problem.

2.2.2 MAC issues for wireless communications

Before analyzing the MAC sub-layer issues from the perspective of WSNs, it is useful to review the main problems that a general purpose wireless network faces. The wireless (radio, infrared, optical) medium has to be tightly controlled since it is broadcast in nature. When a host transmits, the nodes inside its transmission range will receive its transmission, the nodes inside its detection range will just detect its signal and finally the nodes within its interference range will receive a vague and confusing signal. When two or more nodes within the same “neighborhood” transmit packets at the same time, the nearby nodes that listen to these transmissions will get a confusing mixed signal and will eventually discard the received packets. This phenomenon is called a collision and results in the loss of all the transmitted packets. When a collision occurs, the senders have to retransmit their data.

A mechanism that plays the role of a traffic controller is therefore imperative. This mechanism must give everyone the opportunity to talk, prevent the monopolization of the conversation by a single participant and guarantee that no one can interrupt the other. There are three general families of protocols that are able to control access to the wireless medium. The protocols of each category are presented briefly:

• Channel Partitioning Protocols (Contention free)
Fixed assignments are used to avoid contention. These protocols are more applicable to static networks and/or networks with centralized control.

o Time Division Multiple Access (TDMA) – All users share the same frequency by dividing it into timeslots. Each user waits for its turn (the time that corresponds to its timeslot) and uses all or part of the provided bandwidth.
Advantages:
a) It naturally avoids collisions, so there are no extra overheads.
b) It has a short duty cycle.
c) Fairness.
Disadvantages:
a) It needs synchronization mechanisms.
b) It requires the formation of communication clusters.
c) It does not scale well because timeslots are difficult to assign dynamically.
d) The transmission rate is limited because even if a node is the only one that has something to send, it has to wait for its turn.

o Frequency Division Multiple Access (FDMA) – The channel is divided into equal smaller frequency bands that are called subdivisions. Each subdivision is assigned to a node, so every participant can use only a fraction (of equal size for all) of the total channel bandwidth. FDMA has the same advantages as TDMA. The drawbacks are also the same, except that a node that is the only one with something to transmit does not have to wait for its turn; it may, however, use only a fraction of the channel bandwidth.

o Hybrid TDMA/FDMA – This approach combines the two previous protocols, and the medium is divided on a time-frequency basis. It is a very frequent solution for wireless communications, but it has inefficiencies similar to those of TDMA and FDMA.

o Code Division Multiple Access (CDMA) – The channel is divided neither by time nor by frequency; instead, each node is assigned a code (chipping sequence) with which it encodes the data before transmission. The codes are “mutually orthogonal”, which means that nodes can transmit simultaneously without the possibility of collisions. Of course, the selection and distribution of the codes among the participants is not an easy task, especially when their total number is high. The receiver listens to a signal that is spread on the air and consists of multiple encoded sub-signals, one transmitted by each sender. Using the code of the sender, it is easy to decode only the signal of interest among the rest.
Advantages:
a) It naturally avoids collisions, so there are no extra overheads.
b) Fairness.
c) More efficient use of the channel.
d) The number of users that CDMA can support is theoretically unlimited, in contrast to TDMA (finite timeslots) and FDMA (finite sub-channels).
Disadvantages:
Near-far problem: Because all users transmit at the same frequency, the internal interference generated by the system is the most significant factor in determining system capacity and call quality. Since one transmission is noise to the others, the signal-to-noise ratio (SNR) can be low in many situations. Thus, it is not always easy to detect and receive a weak signal among stronger ones.

• Random Access Protocols (Contention based, on demand allocation)
These protocols are aware of the risk of collisions of transmitted data, and are more suitable for mobile ad hoc networks.

o Pure ALOHA – When a node has data to send, it directly transmits the data packet. A packet may well collide with other packets transmitted at the same time, because all the nodes share the same medium and use the same frequency bands. When the sender realizes that a collision has occurred, it retransmits with probability p; otherwise it waits for a fixed amount of time.
Advantages:
a) Supports many simultaneous users.
b) Ease of management.
c) Speed in initial communication.
Disadvantages:
a) It has a maximum throughput of 18.4% due to frequent collisions when many nodes try to “talk” at the same time.
b) There is no sensing mechanism to inform nodes about the channel condition.

o Slotted ALOHA – An improved version of Pure ALOHA. Nodes can start transmissions only at the beginning of a timeslot. A slot is “wasted” if two or more stations attempt to send a packet at its beginning; otherwise (only one station started transmitting at its beginning) the transmission proceeds uninterrupted for the duration of the slot. In terms of performance, it improves the maximum throughput from 18.4% to 36.8%.

o Carrier Sense Multiple Access (CSMA) – In ALOHA (pure and slotted) a station never takes into account the other nodes’ actions but acts selfishly. It neither listens to the channel for possible ongoing communications nor stops its transmission when a collision occurs. CSMA addresses the first of these shortcomings. The transmitter first tries to detect the presence of an encoded signal from another station. If a carrier is sensed, the node waits for the transmission in progress to finish before it senses the medium again. If the channel is idle, the node begins frame transmission.
Advantages:
a) Simplicity of mechanism.
b) Extremely scalable.
Disadvantages:
a) A node has no way to detect that a collision has occurred so that it can stop sending data through the wireless interface.
b) Collisions due to hidden/exposed terminals.

o CSMA/CD – An extension of the simple CSMA scheme described above. A collision may occur while a station is transmitting. If that node does not stop the communication process, the entire frame will be wasted, and the same will happen to all the other active transmitters. Collision Detection (CD) gives nodes the ability to stop the transmission in time and thus bound the amount of bandwidth wasted during a collision. The only drawback of CSMA/CD is that it has extra hardware requirements (the ability to send and receive at the same time is not simple when the communication is half-duplex), and this leads to more expensive transceivers.
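The maximum throughput figures quoted above for the two ALOHA variants (18.4% and 36.8%) follow from the classical textbook analysis. The sketch below assumes that model, which is not spelled out in the text: frames of equal length and an offered load G (attempts per frame time) that follows a Poisson process, giving S = G·e^(-2G) for pure ALOHA and S = G·e^(-G) for slotted ALOHA. The function names are hypothetical.

```python
import math


def pure_aloha_throughput(g: float) -> float:
    """S = G * exp(-2G): a frame survives only if no other frame starts
    within a vulnerable period of two frame times."""
    return g * math.exp(-2.0 * g)


def slotted_aloha_throughput(g: float) -> float:
    """S = G * exp(-G): aligning transmissions to slot boundaries halves
    the vulnerable period to one frame time."""
    return g * math.exp(-g)


if __name__ == "__main__":
    # The maxima occur at G = 0.5 (pure) and G = 1.0 (slotted).
    print(f"pure ALOHA peak:    {pure_aloha_throughput(0.5):.3f}")
    print(f"slotted ALOHA peak: {slotted_aloha_throughput(1.0):.3f}")
```

Evaluated at their respective optima, the two expressions give 1/(2e) and 1/e, which match the 18.4% and 36.8% limits stated for pure and slotted ALOHA.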


o CSMA/CA – Is also an extension of the simple CSMA scheme. Each stationfirstly listens to the medium for on-go<strong>in</strong>g transmissions. If the medium is busy it choosesa r<strong>and</strong>om back-off time-value d. Afterwards, a counter :- counts down d while the channel is (sensed) idle <strong>and</strong>- freezes when channel stops be<strong>in</strong>g idle.When the counter hits zero the station retries to send the frame by start<strong>in</strong>g the aboveprocess from the beg<strong>in</strong>n<strong>in</strong>g. Although there is no way for a node to detect a collision (like<strong>in</strong> CSMA/CD) it is less likely for all wait<strong>in</strong>g nodes to retry transmissions simultaneouslywhen the channel is sensed idle.All the CSMA schemes face a common problem that is known <strong>in</strong> wirelessnetworks bibliography as the hidden <strong>and</strong> exposed station problem.Hidden Term<strong>in</strong>al Problem<strong>In</strong> Figure 5, node B is with<strong>in</strong> the transmission range of C, but A is not.Correspond<strong>in</strong>gly B is with<strong>in</strong> the transmission range of A but C is not. Let us suppose thatC is currently transmitt<strong>in</strong>g data to B. If <strong>in</strong> the middle of the CB transmission, station Aattempts to communicate with B it will not listen that the medium is busy because A isout of the range of C <strong>and</strong> it will send also its own data to B. That is, a collision willhappen <strong>and</strong> B will not receive anyth<strong>in</strong>g from none.ABCFigure 5. Hidden Term<strong>in</strong>al ProblemExposed Term<strong>in</strong>al Problem<strong>In</strong> Figure 6, the Exposed term<strong>in</strong>al problem is illustrated. Station C sends data to D. Bis with<strong>in</strong> the transmission range of C <strong>and</strong> A is with<strong>in</strong> the transmission range of B. 
B wants to send data to A but does not sense the channel as idle because of the C-to-D communication. The C-to-D communication affects B because B is a passive listener of it: B receives the packets from C even though it is not the destination. As a result, B decides to defer its own transmission to A. This is the wrong decision, because there is no possibility of a collision at A (A is not within the transmission range of C).

Figure 6. Exposed Terminal Problem

Solution:
The 802.11 protocol uses a technique to overcome the above problems. It uses two kinds of control frames, RTS (Request to Send) and CTS (Clear to Send). The transmitter sends an RTS to reserve the channel when it senses the channel to be idle. It then waits until it receives the CTS frame from the receiver, indicating that the latter is ready to start “listening”. That is, if no CTS arrives within a given timeout t, the sender is indirectly informed that someone else is keeping the receiver busy.

2.2.3 MAC Properties for WSNs
When designing a protocol for a system, the focus should be on the design considerations and the operational requirements of that system. These special characteristics include the physical ones (hardware, node size, supported technologies etc.), the practical ones (e.g. what kind of applications will run on top of the protocol stack) and the performance-related ones (scalability, delays etc.). In section 2.2.1 we reviewed the functions that a MAC layer must support and then discussed the issues that come up when communicating through the wireless medium. The next step in designing an effective MAC scheme for WSNs is to define the properties that it should have.

A general-purpose MAC-layer protocol gives serious consideration to ensuring:

Fairness: All nodes should have equal opportunities to access the medium. It is also important that the node which has acquired the medium uses it prudently.

Small latency: It is a vital requirement for a network to respond quickly to the needs of an application. If a MAC scheme is fair enough but has very strict and complex rules, it may suffer from lengthy delays (the time elapsed between the moment a station requests to send a frame and the moment it finally manages to start sending the first bytes).

High throughput: Network resources (like bandwidth) wasted because of a rigid lower-level architecture affect the performance of the upper layers. The MAC layer sets the upper bound of the network throughput and thus has to achieve high channel utilization.

WSNs have very tight constraints on power consumption, computational capabilities, transceiver strength and storage size. They share very few characteristics with a traditional wireless system such as GSM. It may sound odd that a fair, low-latency, high-throughput MAC is not necessarily ideal for a WSN, but this is the case. A MAC protocol designed for a sensor network is considered “well defined” if it:

Is energy efficient: Sensor nodes use batteries for power supply. These batteries are usually small and difficult (if not impossible) to recharge.
The total network lifetime is therefore a major design factor. In section 2.2.4 we discuss why existing MAC schemes for wireless systems are not suitable for WSNs and define the major reasons for energy waste.

Has good scalability: Scalability with respect to changes in network size, node density and topology is another property of great importance. WSNs are frequently deployed in areas where the topology changes rapidly. Nodes may fail at any time (due to hardware failures or depleted batteries) or change location unexpectedly, and new nodes may join later. A good MAC protocol should adapt easily to such network changes.


Idle listening: A network interface may support several operating states, each consuming a different amount of energy. Usually the hierarchy is the following ($P_{xx}$ denotes the average energy consumed in state xx):

$P_{sleep} < P_{idle} < P_{receive} < P_{send}$

$P_{idle}$ is the energy spent listening to an idle channel in order to receive possible traffic. This amount is not negligible (50-100% of the energy required for receiving) and is larger than $P_{sleep}$, as it requires more circuit elements of the transceiver to be powered on.

Overemitting: Happens when a node transmits a packet but the receiver is not ready to receive it (it may be in the sleep state). The packet is dropped.

It is helpful to review some known MAC schemes for the case of a WSN (we have already presented them in section 2.2.2).

TDMA/FDMA
Both are contention-free mechanisms, so collisions never occur. Every node listens in its own timeslot (or on its own frequency), resulting in zero probability of overhearing or overemitting. No control packets need to be exchanged before two nodes communicate. The problem with TDMA/FDMA is the difficulty of managing inter-cluster communication. Moreover, when the topology changes (which happens very frequently), the medium has to be statically reassigned among a new set of nodes. This allocation is not an easy process; together with the need for intra-cluster synchronization, it makes TDMA/FDMA inappropriate for WSNs.

CDMA
In section 2.2.2 we briefly reviewed how CDMA works. We also briefly discussed the near-far problem.
The solution to this problem is to adjust the transmission power of each node dynamically so that, for a given receiver, all signals arrive with the same power. In cellular telephony systems CDMA is widely used because, within a cell, all mobile nodes interact only with the base station (BS), and the BS controls the transmitting power of all the nodes within the cell. In an ad-hoc communication pattern like that of a WSN, this problem has no such easy solution. Every node can send and receive messages. The ideal transmission power for a particular receiver may induce a large amount of noise on the channel for a neighbouring node/receiver that is unrelated to the communication in question. A CDMA-based MAC protocol for general-purpose mobile ad-hoc networks has been proposed [17]. It uses channel-gain information obtained from overheard RTS and CTS packets over an out-of-band control channel. Implementing this MAC protocol for WSNs is very complex, as it requires special transceiver capabilities and a second, control channel. Furthermore, it relies on overhearing, which, as we have seen above, is a primary reason for energy waste.

IEEE 802.11 [15]
Whenever IEEE 802.11 is used for WSNs, it operates in ad-hoc mode with the Distributed Coordination Function (DCF). Many control packets (RTS-CTS) are used in order to avoid the hidden/exposed terminal problems (see section 2.2.2). However, control packets and possible collisions are not the main causes of energy waste. Recent work [16] has shown that the energy consumption of the 802.11 MAC is very high when nodes are in idle mode. Idle listening consumes about 50-100% of the energy required to receive data. Several measurements have shown idle:receive:send ratios of 1:1.05:1.4.
To sum up the above observations, an IEEE 802.11 MAC protocol is not suitable for a WSN in terms of energy consumption, but due to its great popularity it is often chosen, with small modifications, for experiments and simulations.

2.2.5 Proposed MAC schemes for WSNs
Several MAC protocols/schemes have been proposed in the sensor network literature. Most of them use techniques like:
• turning off the radio when the channel is idle,
• having two separate channels (one for data and one for control),
• supporting several operating states with different energy consumption levels and
• performing power control (varying the transmission power).
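To get a feeling for why turning off the radio matters, the following minimal sketch compares a radio that is always listening with one that is awake for only part of each period. The idle:receive:send ratios are the ones quoted above; the sleep power, the period and the traffic figures are assumptions chosen purely for illustration.

```python
# Relative power per radio state. idle:receive:send = 1 : 1.05 : 1.4 as quoted
# in the text; the sleep value is an ASSUMPTION made for this sketch.
POWER = {"sleep": 0.01, "idle": 1.0, "receive": 1.05, "send": 1.4}


def energy(seconds_in_state):
    """Relative energy: sum over states of (state power) * (time in state)."""
    return sum(POWER[state] * t for state, t in seconds_in_state.items())


def schedule(period, rx_time, tx_time, duty_cycle):
    """Split one period into radio states.

    The radio is awake for duty_cycle * period seconds; awake time that is
    not spent receiving or sending is idle listening, the rest is sleep.
    """
    awake = period * duty_cycle
    if rx_time + tx_time > awake:
        raise ValueError("traffic does not fit into the awake window")
    return {
        "receive": rx_time,
        "send": tx_time,
        "idle": awake - rx_time - tx_time,
        "sleep": period - awake,
    }


# Hypothetical light traffic: 0.2 s receiving and 0.1 s sending every 10 s.
always_on = energy(schedule(period=10.0, rx_time=0.2, tx_time=0.1, duty_cycle=1.0))
duty_cycled = energy(schedule(period=10.0, rx_time=0.2, tx_time=0.1, duty_cycle=0.1))

print(f"always listening: {always_on:.2f}")
print(f"awake 10% of time: {duty_cycled:.2f}")
```

Under these assumed numbers almost all of the always-on budget is spent on idle listening, which is the waste that the sleep-based techniques in the list try to remove.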


Describing them one by one is beyond the scope of this paper. The reader is referred to [2] for a survey of current MAC protocols for wireless sensor networks. To give an idea of how a typical MAC protocol for WSNs operates, we briefly describe the S-MAC protocol.

S-MAC [1]
The designers of S-MAC set energy conservation and self-configuration as primary goals, while per-node fairness and latency were less important to them. Long packets lead to higher energy consumption because, in the case of a collision, more energy is wasted than for a shorter packet (more bytes are transmitted to no effect). Additionally, overhearing and control overhead due to large packets consume significant amounts of energy. This is where the first technique of S-MAC, message passing, comes in: to transmit a very long message efficiently, it divides the message into smaller fragments and transmits them in a burst. Unfortunately, this modification is not fair, as nodes that have more data to send hold the medium for more time from a per-hop MAC perspective. This drawback, though, is not crucial for WSNs compared to the energy savings. Another technique is the periodic listen and sleep of a node. In sleep mode only a few circuit elements of the transceiver have to be powered on (the radio is off), leading to minimal power needs during this operating mode. It mainly saves the energy spent on idle listening, but latency increases because the sender has to wait for the receiver to “wake up”.
Periodic listen-sleep is implemented using synchronization to form virtual clusters of nodes on the same sleep schedule, in order to minimize the additional latency through their coordinated operation. After extensive experiments, the authors of [1] found that S-MAC can reduce energy consumption by a factor of 2-6 compared to IEEE 802.11 (when messages are sent every 1-10 s). Moreover, S-MAC supports parameterized tradeoffs between energy and latency.

2.3 Network Layer

2.3.1 Routing challenges in WSNs
At first glance, wireless sensor networks seem to have a lot in common with an ordinary Mobile Ad-Hoc Network (MANET). Nodes are mobile and they can change


wild animals are monitored inside a park of several thousand square metres, the placement of nodes is totally different from the case of a factory that supervises items in a warehouse.

• Particular traffic patterns
A typical communication scenario is as follows: a physical phenomenon occurs near a field of sensors. A large number of sensor events are triggered, resulting in the mass production of messages (containing values, IDs etc.). These messages have to be routed in a multi-hop fashion to a single point, the sink. The routing pattern looks like a reversed multicast tree, where the destination of all the packets is a single node and there are multiple senders (many-to-one). Figure 7 demonstrates this pattern. This, however, does not prevent data from flowing in other forms (e.g., multicast or peer to peer). Additionally, since the data collected by multiple sensors is based on common phenomena, redundant data will surely be propagated inside the network.

Figure 7. Reverse multicast tree

• Robustness to network dynamics
The nodes inside the deployment area may change location (mobile nodes). This leads to frequent changes in the “neighborhood” of a sensor. Moreover, nodes may run out of battery or suffer hardware failures and die. The network has to maintain its connectivity no matter how many or how frequent the topology changes are.

• Scalability (using localized and distributed algorithms)


A routing algorithm must perform well regardless of the network size or density. Localized algorithms (which need only local information) and distributed algorithms are known to scale very well.

• Energy efficiency
Since sensor nodes are battery powered, it is a major concern that the routing protocol be as energy efficient as possible. The network may have to operate for months without replacing or recharging the power supplies. MANETs, in contrast, usually have this option, and so many end-to-end routing protocols proposed for ad-hoc networks are not applicable to WSNs. In addition, redundant data have to be eliminated inside the network, and load balancing between the nodes is essential, since some nodes (especially those placed close to the sink) tend to consume more energy (they perform many more transmissions and receptions).

To minimize energy consumption, several routing techniques have been proposed. They employ well-known routing tactics as well as tactics specific to WSNs, e.g., data aggregation and in-network processing, clustering, different node role assignment, and data-centric methods. Almost all of the routing protocols can be classified according to the network structure as flat, hierarchical, or location-based. A taxonomy of the current routing schemes follows:

Data Centric
All nodes have the same role and data is named in an attribute-value fashion.
Routing is performed according to the data contents and not by using predefined shortest paths (Directed Diffusion, SPIN-1 and SPIN-2).

Clustering / Hierarchical
Nodes are clustered so that cluster heads can aggregate and reduce data in order to save energy (LEACH, TTDD, TEEN).

Geographic / Location based
Position information is used to relay the data to the desired regions rather than to the whole network (GAF, GEAR).

Presenting all the routing protocols is beyond the scope of this paper, which focuses on the networking issues of wireless sensor networks that are related (directly or not) to in-network processing and especially to data aggregation. For a survey of the current routing schemes the reader is referred to [18].

2.3.2 Data-centric Routing
Many end-to-end routing schemes have been proposed in the literature for mobile ad-hoc networks, but they are not appropriate under the requirements discussed in the previous section (2.3.1). It is not possible to build a global addressing scheme for a deployment of a huge number of sensor nodes. Therefore, classical IP-based protocols cannot be applied to sensor networks, because of the great overhead that a binding service induces. Most of the existing network protocols for MANETs (like DSR, AODV etc.) assume a global identification of nodes, so they are not applicable to WSNs. Sometimes getting the data is more important than knowing the IDs of the nodes that sent it. Data is usually transmitted from every sensor node within the deployment region with significant redundancy. Since this is very inefficient in terms of energy consumption, routing protocols that are able to select a set of sensor nodes and utilize data aggregation during the relaying of data have been considered.
This consideration has led to data-centric routing, which differs from traditional address-based routing, where routes are created between addressable nodes managed in the network layer of the communication stack. In data-centric routing, the sink sends queries to certain regions and waits for data from the sensors located in the selected regions. Since data is requested through queries, attribute-based naming is necessary to specify the properties of the data. For example, if the query is something like [temperature > 60F], then only the sensor nodes that sense a temperature above 60F need to respond and report their readings. Using data-centric routing it is possible:
(a) to combine the data on their way back to the sink,
(b) to eliminate duplicates,
(c) to perform smart caching inside the network,
(d) to compute several aggregate values inside the network and finally
(e) to save energy by reducing the number of routed messages (fewer transmissions mean less energy consumed by the transceiver).

Therefore, there are two routing models that a WSN can use:

Address-centric Protocol (AC): Each source independently sends data along the shortest path to the sink, based on the route that the queries took (“end-to-end routing”).

Data-centric Protocol (DC): The sources send data to the sink, but routing nodes look at the content of the data and perform some form of aggregation / consolidation function on the data originating at multiple sources.

The main reason for using data-centric routing schemes is the reduction of the consumed energy, as stated above. To see how this goal can be achieved, we consider a simple scenario (Figure 8). A heavy truck passes by a terrain where small wireless sensor nodes are deployed. The nodes are able to perceive abnormal sound levels in the environment. Suppose that the only nodes placed near the passing truck are Node A and Node B. Nodes C and D are within the transmission range of A, and only Node D is reachable by B. Both nodes C and D can transmit directly to the sink.

Case 1: Address-centric routing
Each source (A, B) sends its own information separately to the sink.
The shortest paths are used, so Node A routes packet_A to the sink through C and Node B routes packet_B to the sink through D. There is no way for the network to “know” that packet_A and packet_B include identical values for the measured noise. Therefore, the intermediate nodes (C, D) blindly forward everything they receive to the next hop. The data contents of packets are not accessible to the network layer, as the only information provided is the address of the sender and the address of the receiver. Consider the same scenario with 100 or 1000 small sensors near a volcano and imagine how many transmissions of identical (duplicate) packets will be performed. As a result, a large amount of energy will be dissipated.

Case 2: Data-centric routing


In the data-centric approach we assume that routing is not optimal. Node A routes packet_A to the sink through a longer path (than the one used in address-centric routing), with Node D being the last hop before the sink. The data is named (attribute-value form) and the network layer protocol can access this information. It is now possible to perform several actions depending on the content of the data packets. The most reasonable action that a routing protocol can perform is duplicate suppression. Thus, Node D checks its cache, realizes that packet_A (which arrived later than packet_B) is identical to packet_B and never transmits it to the sink.

Figure 8. Address-Centric vs Data-Centric Routing (panels: Address Centric; Data Centric, with duplicate suppression)

2.3.3 Directed Diffusion [3]
In-network processing must be supported by the network layer mechanism that is used. Data-centric routing protocols are ideal for this purpose. In this section we present one of the most common routing protocols of this category, Directed Diffusion.

Directed diffusion is a data-centric (DC) and application-aware paradigm in the sense that all data generated by sensor nodes is named by attribute-value pairs.
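Because data is named by attribute-value pairs, a forwarding node can recognise a reading it has already relayed. The duplicate suppression performed by Node D in the scenario of Figure 8 could be sketched roughly as follows. The class, the attribute names and the cache policy are hypothetical and are not taken from any particular protocol.

```python
class ForwardingNode:
    """Hypothetical data-centric relay that caches the readings it has forwarded."""

    def __init__(self, name):
        self.name = name
        self.cache = set()   # keys of readings already forwarded
        self.sent = []       # readings actually passed on to the next hop

    def receive(self, reading):
        """Forward a reading unless an identical one was forwarded before.

        A reading is a dict of attribute-value pairs. The sender address is
        deliberately not part of the key: only the content matters.
        """
        key = frozenset(reading.items())
        if key in self.cache:
            return False          # duplicate: suppressed
        self.cache.add(key)
        self.sent.append(reading)
        return True               # new content: forwarded


node_d = ForwardingNode("D")

# Assumed attribute names; both sources report the same event.
packet_b = {"type": "noise", "level_db": 92, "event": 17}
packet_a = {"type": "noise", "level_db": 92, "event": 17}

forwarded_b = node_d.receive(packet_b)   # arrives first
forwarded_a = node_d.receive(packet_a)   # arrives later, identical content

print(forwarded_b, forwarded_a, len(node_d.sent))  # True False 1
```

An address-centric relay could not do this, since it would see only two packets with different source addresses.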
The main idea of the DC paradigm is to combine the data coming from different sources (in-network aggregation) by eliminating redundancy and minimizing the number of transmissions, thus saving network energy and prolonging its lifetime. Unlike traditional end-to-end routing, DC routing finds routes from multiple sources to a single destination that allow in-network consolidation of redundant data.

In directed diffusion, sensors measure events and create gradients of information in their respective neighborhoods. The base station (BS) requests data by broadcasting interests. An interest describes a task to be done by the network. An interest diffuses through the network hop by hop and is broadcast by each node to its neighbors. As the interest propagates through the network, gradients are set up to draw data satisfying the query towards the requesting node; i.e., a BS may query for data by disseminating interests, and intermediate nodes propagate these interests. Each sensor that receives the interest sets up a gradient toward the sensor nodes from which it received the interest. This process continues until gradients are set up from the sources back to the BS. More generally, a gradient specifies an attribute value and a direction. The strength of the gradient may differ towards different neighbors, resulting in different amounts of information flow. At this stage loops are not checked; they are removed at a later stage. When interests fit gradients, paths of information flow are formed from multiple paths, and then the best paths are reinforced according to a local rule so as to prevent further flooding.
In order to reduce communication costs, data is aggregated on the way. The goal is to find a good aggregation tree that gets the data from the source nodes to the BS. The BS periodically refreshes and re-sends the interest once it starts to receive data from the source(s). This is necessary because interests are not reliably transmitted throughout the network.

All sensor nodes in a directed-diffusion-based network are application-aware, which enables diffusion to achieve energy savings by selecting empirically good paths and by caching and processing data in the network. Caching can increase the efficiency, robustness and scalability of coordination between sensor nodes, which is the essence of the data diffusion paradigm. Another use of directed diffusion is to spontaneously propagate an important event to some sections of the sensor network. This type of information retrieval is well suited only to persistent queries, where requesting nodes expect data that satisfy a query over a period of time. This makes it unsuitable for one-time queries, as it is not worth setting up gradients for queries that use the path only once.
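The interest propagation and gradient setup described in this section can be illustrated with a small sketch. It is a simplification under stated assumptions: the topology is hypothetical, gradients carry no strength or attribute value, and the "local rule" used for reinforcement (prefer the neighbour from which the interest was heard first) is chosen here only for illustration.

```python
from collections import deque

# Hypothetical topology: undirected radio links between nodes.
LINKS = {
    "sink": ["D", "C"],
    "C": ["sink", "A"],
    "D": ["sink", "A", "B"],
    "A": ["C", "D"],
    "B": ["D"],
}


def flood_interest(links, sink):
    """Diffuse an interest hop by hop, starting at the sink.

    Every node records a gradient towards each neighbour it hears the
    interest from, in order of arrival. Loops are not checked at this stage.
    """
    gradients = {node: [] for node in links if node != sink}
    rebroadcast = {sink}            # nodes that have already broadcast it
    queue = deque([sink])
    while queue:
        sender = queue.popleft()
        for neighbour in links[sender]:
            if neighbour == sink:
                continue
            gradients[neighbour].append(sender)   # gradient points at sender
            if neighbour not in rebroadcast:
                rebroadcast.add(neighbour)
                queue.append(neighbour)
    return gradients


def reinforced_path(gradients, source, sink):
    """Follow, hop by hop, the first gradient each node set up (assumed rule)."""
    path = [source]
    while path[-1] != sink:
        path.append(gradients[path[-1]][0])
    return path


gradients = flood_interest(LINKS, "sink")
print(gradients["A"])                          # ['D', 'C']
print(reinforced_path(gradients, "A", "sink")) # ['A', 'D', 'sink']
print(reinforced_path(gradients, "B", "sink")) # ['B', 'D', 'sink']
```

In this assumed topology both sources end up sending through D, which is what gives D the opportunity to aggregate or suppress their data.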


3 Data Aggregation in WSNs

3.1 General
So far, we have discussed the networking issues of a WSN and concluded that, in order to minimize the energy dissipated during its operation, the transceivers have to be used very sparingly. Supposing that the lower layers (physical and MAC) are energy efficient, it is up to the network and application layer protocols to follow a conservative strategy regarding the number of transmitted/received messages. In section 2.3 we reviewed a communication paradigm that is data-centric and suppresses duplicate sensor readings inside the network. Directed Diffusion performs the simplest form of what we call data aggregation. However, most of the time the user asks for information from a group of sensors placed in a specific area. An aggregate is a value or set of values that provides information about a sensor group and can be computed from the individual sensor readings of each node of that group. The computation is performed using an aggregate function that takes as input the readings of the field and outputs the respective aggregate. Examples of aggregate functions are SUM, MAX, MIN, AVERAGE etc.

The typical communication pattern in WSNs assumes that multiple sensors/sources send data towards a single sink (receiver) that is reachable over multihop routes.
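As a rough illustration of how such aggregate functions can be evaluated along multihop routes, the sketch below lets every node send a single partial state (sum, count, min, max) to its parent and compares the number of transmissions with shipping every raw reading to the sink. The tree, the node names and the readings are hypothetical; the partial-state representation is one common way to make AVERAGE computable piecewise and is an assumption of this sketch, not something prescribed by the text.

```python
# Hypothetical routing tree (child -> parent) and readings; not from the text.
PARENT = {"n1": "sink", "n2": "n1", "n3": "n2", "n4": "n2", "n5": "n2", "n6": "n5"}
READING = {"n1": 20.0, "n2": 22.0, "n3": 21.0, "n4": 25.0, "n5": 19.0, "n6": 23.0}


def hops_to_sink(node):
    """Transmissions needed to carry one message from node to the sink."""
    hops = 0
    while node != "sink":
        node = PARENT[node]
        hops += 1
    return hops


def merge(a, b):
    """Combine two partial states of the form (sum, count, min, max)."""
    return (a[0] + b[0], a[1] + b[1], min(a[2], b[2]), max(a[3], b[3]))


def aggregate_in_network():
    """Each node sends ONE partial state to its parent after its children report."""
    state = {n: (v, 1, v, v) for n, v in READING.items()}
    at_sink = None
    messages = 0
    # Farthest nodes first, so children always report before their parent.
    for node in sorted(PARENT, key=hops_to_sink, reverse=True):
        parent = PARENT[node]
        messages += 1
        if parent == "sink":
            at_sink = state[node] if at_sink is None else merge(at_sink, state[node])
        else:
            state[parent] = merge(state[parent], state[node])
    total, count, low, high = at_sink
    return {"SUM": total, "AVERAGE": total / count, "MIN": low, "MAX": high}, messages


def direct_delivery_messages():
    """Every raw reading travels hop by hop all the way to the sink."""
    return sum(hops_to_sink(n) for n in PARENT)


aggregates, in_network_messages = aggregate_in_network()
print(aggregates)
print("in-network:", in_network_messages, "direct:", direct_delivery_messages())
```

The price of the smaller message count is that a node must wait for its children before it can send, so the response delay grows with the depth of the tree.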
The idea behind data aggregation techniques is to combine the received values at each intermediate node on their way to the sink. The corresponding packet flow thus resembles a reverse-multicast structure, which is called the data aggregation tree (a spanning tree of the sensors that perform sensing tasks for a given query).

To see how in-network data aggregation achieves significant energy savings, we consider the general approaches discussed at the end of the introduction.

Direct delivery (Figure 9a):
Each sensor value must be routed to the sink. If a sensor resides at depth n of the routing tree, this requires transmitting n-1 messages (1 transmission by the sensor itself and the rest by the intermediate nodes of the path). After receiving all messages, the sink computes the aggregated value. In Figure 9a each node is labelled with the total number of hops/messages needed to reach the sink. The total number of transmissions needed for the sink to collect all the values is 1 + 2 + 3 + 3 + 3 + 4 = 16.

Distributed in-network approach (Figure 9b):
All nodes send locally sensed values to their neighbour/parent. The value received by the sink is the ready-computed aggregate. However, each node has to wait for its children before computing the aggregate and forwarding it (the response delay grows). Moreover, the amount of transmitted data depends on the type of the aggregate function. The total number of transmissions needed for the sink to collect the final aggregated value is 1 + 1 + 1 + 1 + 1 + 1 = 6.

From the above example the reduction in transferred data is evident (the distributed approach performs 10 fewer transmissions than direct delivery). This difference translates into lower energy consumption when using the distributed approach.

Figure 9. a) Direct delivery, b) Distributed in-network aggregation

Benefits from Distributed Data Aggregation

1. Reduction in the total number of transmissions performed inside the network.
The sensor data are aggregated because they get fused at each parent node with the parent's local values and the data received from other children. Because the user is not interested in individual values, there is no loss in the quality of the result returned.


2. Fewer packet collisions.
Packet collisions occur more frequently when the network is heavily loaded. Fewer transferred packets translate into fewer packet collisions.

3. Fewer redundant readings.
In the case of direct delivery, where all the individual readings are delivered separately to the host node, a node commonly sends its values to multiple parents. As a result, the sink receives the same packet multiple times. In data aggregation, the filters drop the redundant packets (caching is used).

4. Increase in accuracy of results.
If a sensor node is temporarily down, its parent can estimate a virtual value based on the node’s previous readings. Thus the total aggregate value is not significantly affected. Of course, this benefit holds only in slow-changing environments.

3.2 Query Processing

A user can send queries to the sensor network in order to retrieve information about its state. The query is usually injected into the network from specific nodes (e.g. the sink or a gateway). Several approaches have been proposed for the syntax of a query, its lifetime, its scope, its propagation and finally its parametrization.

In Directed Diffusion [3] the queries are formed as “interests”, which are expressions with multiple attribute-value pairs (Figure 10). Interests are characterized by predefined attribute categories and subcategories and by an application-specific data representation. Data are usually named in a similar way, so the application filters are able to perform data aggregation.

Interest message example:
    type = wheeled vehicle
    interval = 20 ms // send events every 20 ms
    duration = 10 s // for the next 10 s
    rect = [-100; 100; 200; 400] // from sensors within rectangle
Figure 10
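The attribute-value naming of Figure 10 can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the dictionary keys, the `matches` and `in_rect` helpers, the coordinate fields `x` and `y`, and the reading of `rect` as (x_min, x_max, y_min, y_max) are all hypothetical and are not taken from Directed Diffusion itself.

```python
# Hypothetical encoding of the interest of Figure 10 as attribute-value pairs.
interest = {
    "type": "wheeled vehicle",
    "interval_ms": 20,               # send events every 20 ms
    "duration_s": 10,                # for the next 10 s
    "rect": (-100, 100, 200, 400),   # ASSUMED order: x_min, x_max, y_min, y_max
}


def in_rect(x, y, rect):
    """Return True if the point (x, y) lies inside the rectangle."""
    x_min, x_max, y_min, y_max = rect
    return x_min <= x <= x_max and y_min <= y <= y_max


def matches(interest, data):
    """Return True if a named data record satisfies the interest.

    A filter on an intermediate node could use such a test to decide
    whether a passing record is relevant and may be cached or aggregated.
    """
    return (data["type"] == interest["type"]
            and in_rect(data["x"], data["y"], interest["rect"]))


# A detection reported by a sensor located inside the rectangle.
detection = {"type": "wheeled vehicle", "x": 0, "y": 300}
relevant = matches(interest, detection)
```

Because data are named with the same attributes as interests, a test of this kind needs no lookup in a directory or mapping service.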


A different method is used in TAG [10], where a user query is expressed in an SQL-like language (Figure 11). The query is always performed over a single table called “sensors”.

    SELECT AVG(volume), room FROM sensors
    WHERE floor = 6
    GROUP BY room
    HAVING AVG(volume) > threshold
    EPOCH DURATION 30s
Figure 11

The duration of a query's execution is usually defined within its body (the duration attribute in Figure 10 and the EPOCH clause in Figure 11).

A query may be executed periodically many times (periodic, long-running queries), or only once, returning just a snapshot of the current state of the network (snapshot queries).

3.3 Aggregation operators

The authors of TAG [10] propose a taxonomy of aggregation operators based on several criteria:

1. Duplicate sensitivity.
This property specifies whether an aggregate function returns the same result when the dataset contains duplicate values. Examples of duplicate-sensitive aggregates are MEDIAN, AVERAGE, and COUNT. Examples of duplicate-insensitive aggregates include MIN, MAX, and COUNT DISTINCT.

2. Exemplary/Summary.
Exemplary aggregates always return a representative value present in the dataset, while summary aggregates perform some calculation over the entire dataset and return the calculated value. Summary values (such as AVERAGE and COUNT) are more easily estimated even in a lossy network, where not all data packets are received. Exemplary aggregates, on the other hand, may be highly inaccurate if even a few messages are lost. Such aggregates include MIN, MAX, and MEDIAN.

3. Monotonic aggregates.
Aggregates that allow early testing of predicates in the network are monotonic. For example, assume the user requests the MAX temperature reading in the network. As source nodes report their values toward the host node, other nodes may listen and report their own values only if they are greater than the current MAX. This reduces the overall number of messages sent through the network without affecting the result.

4. Partial state requirements.
The amount of partial state information required differs among aggregate functions. Aggregates such as SUM and COUNT require partial state records that are the same size as the final aggregate. The AVERAGE function requires a partial state record containing two values (both the SUM and COUNT). Other aggregates such as MEDIAN and HISTOGRAM require that the entire dataset be returned to the host node unless some type of compression or estimation is used (see [10]).

3.4 A Taxonomy of Current Aggregation Approaches

3.4.1 Tree-Based Techniques

Low-Level Naming – Filter Based [4]

To illustrate how in-network processing can reduce data traffic and conserve energy, an example of filter-driven data aggregation using Directed Diffusion is presented below.
The goal of an anticipated sensor application is to query a number of sensors so as to be able to take some action when one or more of the sensors is activated. Take as an example a surveillance system that notifies a biologist if an animal enters a specific region. To ensure robust coverage, the coverage of the deployed sensors will overlap, and as a result one event is likely to trigger multiple sensors of the network. Every sensor will report a detection to the user, whereas communication and energy costs could be reduced by aggregating this data as it returns to the user. The aggregate could be a binary value (a detection exists), an area (a detection exists in quadrant 2), or application specific (seismic and infrared sensors indicate 80% chance of detection).

Although the details of aggregation can be application-specific, a common problem is the design of mechanisms for establishing data dissemination paths to the sensors within the region, as well as for aggregating the responses. Consider how this kind of data fusion may be implemented in a traditional network where low-level node names are topologically assigned. First, a binding service must exist that lists the node identifiers of the sensors within a given region, in order to determine which sensors are present in that region. Once these sensors are tasked, an election algorithm must dynamically choose one or more nodes to aggregate the data and return the result to the querier.

Directed Diffusion [3], on the other hand, addresses this problem with opportunistic data aggregation. Sensors are selected and tasked by naming nodes using geographic attributes. As data is sent from the sensors to the querier, the intermediate sensors on the return path identify and cache relevant data with the aid of application-specific filters.
They can also suppress duplicate data by simply not propagating it, or they may slightly delay and aggregate data from multiple sources.

Opportunistic aggregation strategies benefit from filters in various ways, as filters provide a natural way to insert application-specific code into the network. The naming and matching of attributes allow these filters to stay inactive until they are triggered by relevant data. The significance of a common attribute set is that filters incur no network costs to interact with directory or mapping services.

More complex examples of in-network aggregation using filters can be found in [4]. Another important point is that nested queries (where one sensor cues another) can be used for triggering as well as to reduce overall energy consumption significantly. For example, the entrance of a person into a room is often correlated with changes in light or motion. Multi-modal sensor networks can exploit these correlations by triggering a secondary sensor based on the status of another, in other words by nesting one query inside another. Both the overall energy consumption and the network traffic can be reduced by reducing the duty cycle of some sensors. Energy is saved when the secondary sensor consumes more energy than the initial sensor (as with an accelerometer triggering a GPS receiver), while network traffic is reduced because, for example, a triggered imager generates much less traffic than a constant video stream. In other words, in-network processing can make the best use of a sparse resource (for example, a motion sensor triggering a steerable camera).

COUGAR [5], [6]

To enable declarative querying of sensor networks, [5], [6] propose a query layer consisting of a query proxy on every sensor node. In the architecture of the sensor node, the query proxy lies between the network layer and the application layer, and it provides higher-level services using queries that can be injected into the network from a specified gateway node.

COUGAR enables the user to parameterize the queries. A complex query may consist not only of a large number of parameters and operators but also of various user requirements on the query answers, such as a maximum permissible latency and the accuracy of the query result.

The query proxy is also responsible for data aggregation. Part of the computation can be moved from a location outside the network and pushed into the sensor network, aggregating records or eliminating irrelevant records. Compared to traditional centralized data extraction and analysis, in-network processing can reduce energy consumption and improve sensor network lifetime significantly.
This is why one of the main roles of the query proxy when processing user queries is to perform in-network processing.

Nevertheless, COUGAR has some drawbacks as well. First, the addition of a query layer on each sensor node may add extra overhead in terms of energy consumption and memory storage. Second, successful in-network computation requires synchronization among nodes (not all data are received at the same time from incoming sources) before the data are sent to the leader node. Third, the leader nodes should be dynamically maintained to prevent them from becoming hot-spots (failure prone).

As an example, suppose that we have a long-running query Q that monitors the average temperature of an office every t seconds. The query Q notifies the administrator of the network if the average temperature in the office is greater than a user-defined threshold, by generating an output record. A query optimizer generates an efficient query plan for in-network processing of query Q, in order to vastly reduce resource usage and thus extend the lifetime of the sensor network. A query plan specifies both the data flow (between sensors) and an exact computation plan (at each sensor). The computation plan determines the leader of the specific query, a designated node where the computation of the average temperature will take place. The leader could be either a fixed sensor with more remaining power and energy, or a node selected randomly by some distributed leader election algorithm. Two computation plans are produced, one for the leader node and a second for the remaining nodes in the query region. A query optimizer can also apply several techniques to improve the performance of the system. For example, it can merge a new query with existing similar queries.
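To make the example of query Q more concrete, the leader's part of the computation plan might look roughly like the following Python sketch. The function name `leader_round`, the record layout and the sample values are assumptions made purely for illustration; they do not reproduce COUGAR's actual query plans.

```python
def leader_round(readings, threshold):
    """One round of query Q at the leader (illustrative only).

    `readings` holds the temperature values that reached the leader during
    the current period of t seconds. An output record is produced only if
    the average exceeds the user-defined threshold.
    """
    if not readings:
        return None  # nothing arrived in this round
    average = sum(readings) / len(readings)
    if average > threshold:
        return {"avg_temperature": average, "count": len(readings)}
    return None


# Hypothetical round: three sensors in the office reported to the leader.
record = leader_round([21.0, 23.0, 25.0], threshold=22.0)
```

The plan for the remaining nodes would then amount to sampling the local temperature every t seconds and sending it towards the leader.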
To generate a good plan for a user query, the optimizer requires metadata about the status of the sensor network in order to evaluate the costs and benefits (latency and accuracy) of different plans.

Data aggregation in COUGAR involves two important issues:

a) From a computational point of view, the aggregation must take place at a "leader" node [leader election problem + dynamically maintain a leader + leader with physically advantageous location], unless the final computation of the aggregate is delegated to a gateway node or happens outside of the network.

b) Data records have to be delivered from the source sensor nodes to the designated leader. This can be done either by sending data records directly to the leader along multi-hop routes, so that all the computation takes place at the leader, or by pushing partial computation from the leader to internal nodes along the path in order to reduce data size on the fly. Synchronization between sensor nodes along the communication path is very important, since a node has to "wait" to receive the results that are to be aggregated.

TAG [10]

In the TAG system, users connect to the sensor network through a workstation or a base station which is directly connected to a sensor and has the role of the root node. Aggregate queries over the sensor data are formulated in a simple SQL-like language. In-network query evaluation consists of two phases, the (smart) distribution phase and the collection phase. During the distribution phase, the query is flooded through the network and organizes the nodes into an aggregation tree. More specifically, as the query is distributed across the network, a spanning tree is formed from the sensors in order to return data back to the root node. During the data collection (sensing) phase, each leaf node produces a single tuple and forwards it to its parent. The non-leaf nodes receive the tuples of their children and combine these values. Afterwards, they submit the new partial results to their own parents. Finally, the total result arrives at the root after h steps, where h is the height of the aggregation tree. If there are no failures, this technique works extremely well for decomposable aggregates, namely distributive and algebraic aggregates such as MIN, MAX, COUNT and AVG.

When a query involves an epoch, requiring readings to be collected periodically, TAG uses the periodic per-hop adjusted aggregation approach. It subdivides the epoch into slots. The length of each slot is equal to the epoch length divided by n, where n is the maximum number of hops separating the nodes that generate data from the sink. Slots are assigned to nodes in decreasing order (i.e. n, n-1, n-2,…) as the query propagates through the network. Each node transmits in its slot; thus the nodes that transmit first are the outermost nodes, whereas the nodes that transmit last are those closest to the sink.
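As a rough illustration of the collection phase and of the slot schedule described above, the following Python sketch computes AVG over a small aggregation tree using (SUM, COUNT) partial state records and derives the transmission slot of a node from its hop distance. The `Node` class, the function names and the example tree are assumptions made for illustration and are not TAG's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    reading: float
    children: list = field(default_factory=list)


def init_state(reading):
    """Partial state record for AVG: the pair (SUM, COUNT)."""
    return (reading, 1)


def merge(a, b):
    """Combine two partial state records."""
    return (a[0] + b[0], a[1] + b[1])


def evaluate(state):
    """Turn the final partial state record into the AVG value."""
    return state[0] / state[1]


def collect(node):
    """Fuse a node's own tuple with the partial results of its children."""
    state = init_state(node.reading)
    for child in node.children:
        state = merge(state, collect(child))
    return state


def slot_for_depth(depth, max_depth):
    """Slot index (1 = earliest) used by a node `depth` hops from the sink."""
    return max_depth - depth + 1


def slot_start(depth, max_depth, epoch_s):
    """Offset in seconds, from the start of the epoch, of the node's slot."""
    slot_len = epoch_s / max_depth
    return (slot_for_depth(depth, max_depth) - 1) * slot_len


# Hypothetical tree: the root has children a and b; a has children c and d.
c, d = Node(24.0), Node(26.0)
a, b = Node(22.0, [c, d]), Node(18.0)
root = Node(20.0, [a, b])
average = evaluate(collect(root))
```

Each node forwards a single two-value record regardless of the size of its subtree, which is what makes AVG decomposable in this sense; the recursion in `collect` stands in for the message exchange between children and parents.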
As in any time-slotted mechanism, clock synchronization among nodes is required so that nodes transmit in their designated slots.

Nevertheless, the tree-based approach of TAG breaks down when failures are introduced into the system. In sensor networks both node and link failures are very common. Node failures are expected to be relatively frequent, since the sensors are meant to be small, cheap and mass-produced, and they will be placed in a variety of uncontrolled environments. Link failures (and packet losses) are also expected to occur very often due to environmental interference, packet collisions, and low signal-to-noise ratios. Furthermore, if a node fails or its message does not reach its parent, the values associated with its entire subtree are lost. If the failure occurs close to the root node, the effect on the resulting aggregate can be significant.

To improve the performance of the TAG service, several optimizations have been proposed with significant results. To conserve energy, sensor nodes sleep as much as possible during each step in which the processor and radio are idle. When a timer expires or an external event occurs, the device wakes and starts the processing and communication phases. At this point, as mentioned above, it receives the messages from its children and then submits the new value(s) to its parent. If no more processing is needed for this step, the node enters sleep mode again [11]. While this approach is suitable for ideal network conditions, Madden et al. [10] proposed some methods to improve the performance of their system when conditions are not ideal. One solution is to cache previous values and reuse them if newer ones are unavailable, although the previous values may themselves reflect losses at lower levels of the tree.

Pipelined Aggregation [7]

The authors of [7] proposed a fully pipelined approach to aggregation. As this approach is an ancestor of TAG, it has many similarities with it. Both assume that a sensor network is a distributed database. Furthermore, it implements a general-purpose, SQL-style interface that can execute queries over any type of sensor data while providing opportunities for significant optimization.

Each sensor possesses a unique id that is used only for local interactions and not, as might be expected, to route the sensed data back to the sink. The approach does not use a data-centric routing mechanism; instead it forms a routing tree upon query distribution, just as in TAG. Again, one sensor plays the role of the root, at which aggregated data converge and which serves as the querying user's interface to the rest of the network.

The computation of an aggregate consists of two phases: the propagation phase, in which aggregate queries are pushed down into the sensor network, and the aggregation phase, in which the aggregate values are propagated up from children to parents.
While data flow back to the top of the tree (to the sink/root), every sensor that has children must wait for their responses before computing the local aggregate and forwarding it to its own parent. To bound the in-network delay, a sensor waits at most a timeout t to hear a message from a child. Choosing a small value for t may result in missed reports from children. The proper value of t also varies with the depth of the routing tree.

This simple approach would work well if WSNs were not inherently unreliable. The unreliability affects the accuracy of the returned aggregate values. Pipelined aggregation addresses this problem. Aggregates are propagated as described above, except that time is divided into intervals of duration i. During each interval:

a) Sensors that have heard the aggregate request transmit to their parent a fused value of:
• the partial aggregate they have already received
• the aggregates received from their children
• their local reading

b) After the first interval has elapsed, the root will have received messages containing aggregates from sensors that are only 1 hop away. After the second interval, the root will have received messages with aggregates from sensors that are 2 hops away, and so on.

In the pipelined approach described above, a new aggregate arrives every i seconds. The user who injected the query thus obtains a first rough estimate of the aggregate quickly (faster first response) and a more accurate result every i seconds thereafter. Consequently, pipelined aggregation provides users with a stream of aggregate values that changes as the sensor readings and the underlying network change.

The most significant disadvantage of this approach is that a number of additional messages are transmitted in order to extract the temporary aggregate values from all the sensors of the network in every interval i. Despite this overhead, simulation shows that the technique generally improves the robustness as well as the throughput of the network.

Several optimizations are proposed in [7]:

1) Snooping: Takes advantage of the shared radio channel, over which every message is broadcast to the other sensors of the network within range.
When a sensor misses an initial request to aggregate, it can initiate aggregation as soon as it hears another sensor reporting an aggregate value to its own parent. Going one step further, a sensor does not need to explicitly tell its children to begin aggregation. It can simply report its value to its parent, since this value will also be heard by its children. The children will assume they missed the start request and will therefore initiate aggregation locally. Snooping reduces the total number of messages required to compute the first full aggregate of the network, for a total saving of 23%.


2) Use of multiple parents: Due to the broadcast nature of radio, data redundancy is very common. While this provides more reliability, with duplicate-sensitive aggregates such as COUNT a node may be counted multiple times. A possible solution is to send part of the aggregate to one parent and the rest to the other. The benefit is that the variance of the multiple-parent COUNT is much smaller, although its expected value is the same. In other words, a single loss affects the computed value less.

3) Hypothesis testing: The above optimizations still require an input from every node of the network in order to compute an aggregate. Sometimes, however, we only need to hear the value of a particular sensor if it will affect the value of the aggregate. The node decides locally whether contributing its reading and the readings of its children will affect the value of the aggregate.

For MIN (or MAX), hypothesis testing can be done with snooping or with a pipelined aggregate over the first k levels of the aggregation tree.

For COUNT/SUM/AVERAGE the user defines an error_bound. If the value of the leaf is within the defined error_bound, it remains silent and its parent keeps the old approximate answer; otherwise the child forwards its own value to its parent.

3.4.2 Multipath Techniques

Recently the WSN research community has turned its interest from tree-based aggregation schemes to multi-path graphs, where multiple edges can exist between two nodes.
This field is continuously gaining attention because data sent from one sensor to another can easily be lost due to the high error rate of wireless communication.

The rationale behind the multi-path graph is as follows: instead of requiring each node to send its accumulated partial result to its single parent in an aggregation tree, the multi-path approach exploits the wireless broadcast medium by having each node broadcast its partial result to multiple neighbors. This approach sends the same minimal number of messages as the tree approach (i.e., one transmission per node), which makes it energy-efficient. It is also very robust to communication failures, because each reading is accounted for in many paths towards the base station, and all of them would have to fail for the reading to be lost. However, the multi-path approach has three drawbacks:

(1) for many aggregates, the known energy-efficient techniques provide only an approximate answer (with accuracy guarantees),
(2) for some aggregates, the message size is longer than when using the tree approach, thereby consuming more energy, and
(3) there is a danger of double-counting the same values because of the multiple parents.

Below we present a protocol that uses a topology called rings. In this topology, nodes are divided into levels according to their hop count from the base station, and the multi-path aggregation is performed level by level towards the base station.

SYNOPSIS DIFFUSION [8]

[8] proposes a general framework for achieving significantly more accurate and reliable answers by combining energy-efficient multi-path routing schemes with techniques that avoid double-counting. Synopsis Diffusion uses a rings topology to perform in-network aggregation. The partial result at a node is represented as a synopsis, a small digest (e.g., histogram, bit-vector, sample, etc.) of the data.

The synopsis diffusion algorithm consists of two phases:
a) a distribution phase, in which the aggregate query is flooded through the network and an aggregation topology is constructed, and
b) an aggregation phase, in which the aggregate values are continually routed towards the querying node.

Distribution Phase

To construct a rings topology, the base station first transmits, and any node hearing this transmission is in ring 1. At each subsequent step, nodes in ring i transmit, and any node hearing one of these transmissions that is not already in a ring is in ring i + 1. The ring number thus defines the level of a node in the rings topology. Aggregation proceeds level by level, with level i + 1 nodes transmitting while level i nodes are listening. In contrast to trees, the rings topology exploits the wireless broadcast medium by having all level i nodes that hear a level i + 1 partial result incorporate that result into their own. This significantly increases robustness, because each reading is accounted for in many paths towards the base station, and all of them would have to fail for the reading to be missing from the query result. As with trees, nodes can monitor link quality and change levels as warranted. A key advantage of using a rings topology is that the communication error is typically very low, in stark contrast with trees. Moreover, the rings approach is as energy-efficient as the tree approach (within 1%).
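The ring construction described above amounts to a breadth-first flood from the base station. The following Python sketch illustrates it under simplifying assumptions: radio links are symmetric and lossless, the base station is given ring number 0, and the function names and the example topology are hypothetical.

```python
from collections import deque


def assign_rings(neighbours, base):
    """Flood outward from the base station and return {node: ring number}.

    `neighbours` maps each node to the nodes within its radio range.
    The base station is assigned ring 0 (an assumption of this sketch),
    so every node that hears it directly ends up in ring 1.
    """
    ring = {base: 0}
    queue = deque([base])
    while queue:
        sender = queue.popleft()
        for node in neighbours[sender]:
            if node not in ring:  # nodes already in a ring keep their ring
                ring[node] = ring[sender] + 1
                queue.append(node)
    return ring


def listeners(neighbours, ring, node):
    """Nodes one ring closer to the base station that hear `node` broadcast."""
    return [v for v in neighbours[node] if ring.get(v) == ring[node] - 1]


# Hypothetical topology: S is the base station.
topology = {
    "S": ["A", "B"],
    "A": ["S", "B", "C"],
    "B": ["S", "A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
rings = assign_rings(topology, "S")
upstream_of_c = listeners(topology, rings, "C")
```

In this example node C is heard by both A and B, so its partial result travels towards the base station along two paths, which is the source of both the added robustness and the double-counting risk.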
Nevertheless, because each partial result is accounted for in multiple other partial results, special techniques are required to avoid double-counting.

Aggregation Phase

The aggregate computation is defined by three functions on the synopses:
• Synopsis Generation: A synopsis generation function SG(arg) takes a sensor reading (including its metadata) and generates a synopsis representing that data.
• Synopsis Fusion: A synopsis fusion function SF(arg1, arg2) takes two synopses and generates a new synopsis.
• Synopsis Evaluation: A synopsis evaluation function SE(arg) translates a synopsis into the final answer.
The exact details of the functions SG(), SF(), and SE() depend on the particular aggregate query to be answered.

During the aggregation phase, each node periodically uses SG() to convert sensor data into a local synopsis and SF() to merge two synopses into a new local synopsis. For example, whenever a node receives a synopsis from a neighbour, it may update its local synopsis by applying SF() to its current local synopsis and the received synopsis. Finally, the querying node uses SE() to translate its local synopsis into the final answer. The continuous query defines the desired period between successive answers, as well as the overall duration of the query.

Double-Counting Problem

As already mentioned, multipath techniques usually count the same values more than once, because some nodes have multiple parents. Synopsis diffusion avoids double-counting through the use of order- and duplicate-insensitive (ODI) synopses that compactly summarize intermediate results during in-network aggregation.

Main drawback

The total number of messages exchanged between the sensor nodes during the aggregation phase is much higher than in a simple tree-based approach. Therefore, the “synopsis diffusion” approach is oriented more towards robustness than towards energy efficiency.

3.4.3 Distributed Localized Algorithms

Using Distributed Estimation [19]

Boulis et al. [19] proposed a distributed localized algorithm that explores the energy/accuracy subspace for the periodic aggregation domain.
They first separated in-network processing algorithms into two types.

Processing Type-1: Includes all the “snapshot aggregation” solutions, which calculate an aggregate by simply combining the values at multiple intermediate nodes until the final value reaches the sink (this type includes all the approaches reviewed so far).

Processing Type-2: Includes the approaches that take into account that the sensed values are only approximations of the real ones. Rather than trying to calculate an exact aggregate, they attempt to achieve a “good” estimate of it.


The solution proposed in [19] belongs to the second type and is concerned with the MAX aggregation function.

Every node keeps an estimate of the global aggregated value in the form of a vector V. These estimates change dynamically over time as the nodes interact with each other by sending their own estimates to their neighborhood. There are two ways in which a node's local view (V) of the global estimate can change:
a) after receiving a global estimate from one of its neighbors (Module A, Figure 12);
b) after a new value has been generated by the local sensors (Module B, Figure 12).

It is beyond the scope of this paper to analyze the mechanism of local and global estimation fusion. Our interest is in how a node decides whether or not to transmit the fused value (V’) to the others (Modules C, D, Figure 12). Each node keeps an estimation table with the most recent estimates (V1, V2, …, Vn) received from each of its neighbors. It combines V’ with each of V1, V2, …, Vn separately to create the temporary (virtual) new neighbor estimates V1’, V2’, …, Vn’. Given a user-defined threshold parameter (Th) on the maximum difference, it does the following:
a) If |V1’ - V1| > Th, it broadcasts V’, as V’ is likely to seriously affect the others’ estimates.
b) If |V1’ - V1| < Th, it does nothing, because V’ will not seriously affect the others’ estimates.

Figure 12. The modular structure of the algorithm
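The transmit-or-suppress decision (Modules C and D) can be sketched as follows. This is an illustration under strong assumptions rather than the mechanism of [19]: estimates are reduced to plain scalars, fusing two MAX estimates is assumed to mean taking the larger one, and the threshold test, which the text states for V1, is assumed to be applied to every entry of the estimation table. The names `fuse`, `should_broadcast`, and `table` are hypothetical.

```python
def fuse(a, b):
    # Assumption: estimates are scalars and MAX fusion is simply max(a, b).
    # The actual fusion mechanism of [19] is not described in this text.
    return max(a, b)


def should_broadcast(v_new, neighbor_estimates, th):
    """Return True if V' would shift any stored neighbor estimate by more than th."""
    for v_i in neighbor_estimates.values():
        v_i_virtual = fuse(v_new, v_i)  # the virtual estimate Vi'
        if abs(v_i_virtual - v_i) > th:
            return True
    return False


# Hypothetical estimation table: latest estimate received from each neighbor.
table = {"n1": 20.0, "n2": 21.5, "n3": 19.0}

print(should_broadcast(22.0, table, 1.0))  # V' changes n1's view by 2.0
print(should_broadcast(20.5, table, 2.0))  # largest change is 1.5
```

With a small threshold almost every new maximum is broadcast. With a larger one, the node stays silent unless its fused value would noticeably change what a neighbor already believes.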


Benefits

1. It is a completely distributed and localized solution for the periodic aggregation problem and, unlike the solutions that rely on aggregation trees, it is not vulnerable to node failures, owing to the broadcast nature of the communication between the nodes.
2. Simulations showed that it achieves significant energy savings for some values of the threshold parameter (Th).

Drawbacks

It requires the exchange of large volumes of information between the nodes when the aggregate being computed is more complex than MAX/MIN (e.g. MEDIAN, HISTOGRAM, etc.).

Gossip-Based Computation of Aggregates [9]

In [9], a new theoretical method is proposed for the computation of aggregates (such as sums, averages, and random samples) with gossip-style protocols. But why should someone use a gossip-based protocol? The following characteristics make a convincing case:
a) Gossip-based algorithms are known to perform well in terms of scalability.
b) Each node communicates with one or a few nodes in its neighborhood; thus, only local information is needed.
c) Simplicity of design and operation. Gossip-style protocols usually do not require error recovery mechanisms, while often incurring only moderate overhead compared to optimal deterministic protocols such as the construction of data dissemination trees.
d) They achieve high stability under stress and disruptions.
In comparison, traditional techniques offer absolute guarantees, but are unstable or fail to make progress during periods of even modest disruption.

It is pointed out that the point-to-point Uniform Gossip protocol is not suitable for wireless sensor networks. Instead, an alternative distributed broadcast-based algorithm, namely flooding, is proposed. The convergence of the flooding algorithm is analyzed using the mixing time of the random walk on the underlying graph. The underlying graph is assumed to be ergodic and reversible [discrete events theory]; hence, the algorithm may not converge on many natural topologies, such as a grid. However, the algorithm runs very fast (logarithmic complexity class) on certain graphs. Under the assumption of a complete graph, the analysis shows that, with high probability, the values at all nodes converge exponentially fast to the true (global) average.

Advantages of the proposed protocol

As expected, the algorithm is very simple, owing to its use of gossip-based communication. Convergence is very fast: after a few rounds (iterations), the estimate of the aggregate computed by the algorithm is a good approximation of the real value. What is more, the algorithm adjusts itself automatically when nodes join or leave.

Drawbacks

The number of rounds needed for the estimate of the aggregate (e.g. the average) to come close to the real value (to converge) is critical information. Nodes, however, do not have a global view of the network (the algorithm is localized), so it is difficult to know a priori after how many rounds they can stop running the algorithm. An easy way to stop the execution is to let the node that initiated the aggregate query send a stop message to cease the computation. The querying node first samples and compares the values from nodes at different locations.
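Both the convergence behaviour and the sampling check can be illustrated with a small simulation. The sketch below uses push-sum-style averaging on a complete graph, i.e. the point-to-point uniform gossip setting rather than the broadcast-based flooding variant. The network size, the readings, the tolerance, the sampled node ids, and the helper names `push_sum_round` and `sampled_spread` are all assumptions made for illustration.

```python
import random


def push_sum_round(s, w, rng):
    """One synchronous round: every node keeps half of its (sum, weight)
    pair and sends the other half to a peer chosen uniformly at random."""
    n = len(s)
    new_s = [0.0] * n
    new_w = [0.0] * n
    for i in range(n):
        target = rng.randrange(n - 1)
        if target >= i:
            target += 1  # skip i itself: uniform over the other n - 1 nodes
        half_s, half_w = s[i] / 2.0, w[i] / 2.0
        new_s[i] += half_s
        new_w[i] += half_w
        new_s[target] += half_s
        new_w[target] += half_w
    return new_s, new_w


def sampled_spread(estimates, sample_ids):
    """Largest disagreement among the estimates of the sampled nodes."""
    values = [estimates[i] for i in sample_ids]
    return max(values) - min(values)


rng = random.Random(0)
readings = [rng.uniform(0.0, 100.0) for _ in range(50)]  # hypothetical sensor values
true_avg = sum(readings) / len(readings)

s, w = list(readings), [1.0] * len(readings)
sample_ids = [0, 10, 20, 30, 40]  # nodes the querying node is assumed to sample
rounds_used = 0
for rounds_used in range(1, 201):
    s, w = push_sum_round(s, w, rng)
    estimates = [si / wi for si, wi in zip(s, w)]
    if sampled_spread(estimates, sample_ids) < 1e-6:
        break  # sampled nodes agree: this is where a stop message would be sent

print(rounds_used, estimates[0], true_avg)
```

No node in this simulation can tell on its own that the estimates have converged. Only the comparison across sampled nodes reveals it.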
If the sampled values are all the same or within some acceptable accuracy range, the querying node disseminates the stop messages. This method incurs a delay overhead for the dissemination, and the lack of a purely distributed, local stop mechanism at each node is evident.

3.5 Exposing Trade-Offs in data-aggregation schemes

Energy-Accuracy

Recalling the algorithm that Boulis et al. [19] proposed (section 3.4.3), it is easy to see that this approach creates a system-level energy vs. accuracy knob. If an application can tolerate large estimation errors, it can set the threshold parameter “Th” to a high value. This leads to significant energy savings, because fewer global estimates are transmitted in total: a node is now less likely to decide to broadcast its new estimate. However, such a “Th” value reduces the accuracy of the algorithm, especially in environments where the sensed values change very quickly. Conversely, selecting a low threshold “Th” leads to high accuracy but, unfortunately, also to high energy consumption.

Energy-Delay

In-network data aggregation, as we have already seen, reduces the traffic inside the sensor network, and therefore less energy is consumed than without it. This has been confirmed by simulations and experiments carried out by several research groups ([3], [6], [7], [10]). However, the cost of in-network processing, especially in tree-based approaches, is a higher delay, which is proportional to the depth of the aggregation tree. This happens mainly because the nodes at higher levels (closer to the root) have to wait for the aggregates of the large sub-trees under them before they forward their own values up towards the root. Moreover, the time needed to search the cache (matching) in order to suppress duplicates ([3], [4]) is not always negligible.

Data freshness - accuracy

The longer a sink node waits for the calculation of an aggregate, the more readings it is likely to receive (from most of the nodes of interest), and thus the more accurate the received value will be. On the other hand, waiting too long may result in stale data, especially in the case of periodic aggregation or when frequent environmental changes occur.
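The energy vs. accuracy knob discussed in this section can be made concrete with a toy experiment. The sketch below is not the algorithm of [19]; it is a single-node send-on-change analogy in which a value is reported only when it differs from the last reported value by more than Th. The sample readings and the function name `run_with_threshold` are hypothetical.

```python
def run_with_threshold(samples, th):
    """Return (transmissions, worst_error) for a send-on-change rule.

    worst_error is the largest gap between the true reading and the
    value the receiver last heard.
    """
    last_sent = samples[0]
    transmissions = 1  # the first reading is always reported
    worst_error = 0.0
    for value in samples[1:]:
        if abs(value - last_sent) > th:
            last_sent = value
            transmissions += 1
        worst_error = max(worst_error, abs(value - last_sent))
    return transmissions, worst_error


# Hypothetical slowly rising readings (e.g. temperature).
samples = [20.0, 20.2, 20.5, 21.1, 21.0, 22.4, 22.5, 23.9, 23.7, 25.0]

low = run_with_threshold(samples, 0.25)   # low Th: more messages, small error
high = run_with_threshold(samples, 2.0)   # high Th: fewer messages, larger error
print(low, high)
```

On these readings the low threshold costs twice as many transmissions as the high one, while the high threshold lets the receiver's view drift by about 1.5 units before it is corrected.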


4 Conclusions

Wireless sensor networks are a new area of research with many open issues and challenges, owing to their strict resource constraints. Although the multihop communication protocols of mobile ad hoc networks offer many solutions, only a few of them are applicable to a WSN. In this paper we focused on two goals. The first was to introduce the reader to the networking issues of a WSN and point out the necessity of data naming and data-centric routing approaches. The second was to present several data-centric techniques that reduce the traffic inside the network and thus save energy. These techniques are referred to as Data Aggregation techniques, and a considerable amount of past research has been devoted to them. Although many of these techniques look promising, there are still many challenges that need to be addressed.


REFERENCES

[1] W. Ye, J. Heidemann, D. Estrin. “An Energy-Efficient MAC Protocol for Wireless Sensor Networks”. In Proceedings of IEEE Infocom ‘02, New York, New York, June 23-27, 2002.
[2] Ilker Demirkol, Cem Ersoy, and Fatih Alagöz. “MAC Protocols for Wireless Sensor Networks: a Survey”. IEEE Communications Magazine, 2005.
[3] C. Intanagonwiwat, R. Govindan, and D. Estrin. “Directed diffusion: A scalable and robust communication paradigm for sensor networks”. In Proceedings of the Sixth Annual International Conference on Mobile Computing and Networks (MobiCOM 2000), Boston, MA, August 2000.
[4] J. Heidemann, F. Silva, C. Intanagonwiwat, R. Govindan, D. Estrin, and D. Ganesan. “Building efficient wireless sensor networks with low-level naming”. In SOSP, October 2001.
[5] P. Bonnet, J. Gehrke, and P. Seshadri. “Towards sensor database systems”. In 2nd International Conference on Mobile Data Management, Hong Kong, January 2001.
[6] Y. Yao and J. Gehrke. “The cougar approach to in-network query processing in sensor networks”. In SIGMOD Record, September 2002.
[7] S. Madden, R. Szewczyk, M. J. Franklin, and D. Culler. “Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks”. In Proc. of WMCSA, June 2002.
[8] S. Nath, P. B. Gibbons, S. Seshan, and Z. Anderson. “Synopsis diffusion for robust aggregation in sensor networks”. In SenSys, 2004.
[9] D. Kempe, A. Dobra, J. Gehrke. “Gossip-based Computation of Aggregate Information”. Proc. The 44th Annual IEEE Symp. on Foundations of Computer Science, FOCS, October 2003.
[10] S. Madden, M. Franklin, J. Hellerstein, and W. Hong. “TAG: A tiny aggregation service for ad-hoc sensor networks”. In Proc. 5th Symp. on Operating Systems Design and Implementation, December 2002.
[11] S. Madden, M. Franklin, J. Hellerstein, and W. Hong. “The design of an acquisitional query processor for sensor networks”. In Proceedings of ACM SIGMOD June 2003 (San Diego, June 9–12). ACM Press, New York, 2003, 491–502.
[12] Eugene Shih, Seong-Hwan Cho, Nathan Ickes, Rex Min, Amit Sinha, Alice Wang, and Anantha Chandrakasan. “Physical layer driven protocol and algorithm design for energy-efficient wireless sensor networks”. In The Seventh Annual International Conference on Mobile Computing and Networking 2001, July 2001, pp. 272–287.
[13] Jaap C. Haartsen and Sven Mattisson. “Bluetooth — a new low-power radio interface providing short-range connectivity”. Proc. IEEE, v. 88, n. 10, October 2000, pp. 1651–1661.
[14] J. P. Lynch, K. J. Loh. “A Summary Review of Wireless Sensors and Sensor Networks for Structural Health Monitoring”. The Shock and Vibration Digest, 2006.
[15] LAN MAN Standards Committee of the IEEE Computer Society. Wireless LAN medium access control (MAC) and physical layer (PHY) specification. IEEE, New York, NY, USA, IEEE Std 802.11-1997 edition, 1997.
[16] Mark Stemm and Randy H. Katz. “Measuring and reducing energy consumption of network interfaces in hand-held devices”. IEICE Transactions on Communications, vol. E80-B, no. 8, pp. 1125–1131, Aug. 1997.
[17] Alaa Muqattash and Marwan Krunz. “CDMA-Based MAC Protocol for Wireless Ad Hoc Networks”. MobiHoc’03, June 1–3, 2003, Annapolis, Maryland, USA.
[18] Jamal N. Al-Karaki, Ahmed E. Kamal. “Routing Techniques in Wireless Sensor Networks: A Survey”.
[19] A. Boulis, S. Ganeriwal and M. B. Srivastava. “Aggregation in Sensor Networks: An Energy-Accuracy Trade-off”. Sensor Network Protocols and Applications (SNPA '03), May 2003.
