
Research Article


Synchronous Aggregation Scheduling with Minimal Latency in Multihop SensorNet 

Liqun Shan,
Jinkuan Wang,
Yanchao Zhao
and
Yanchang Liu



ABSTRACT

This study has addressed the problem of data aggregation scheduling problem to minimize the latency in wireless sensor networks (WSNs). An efficient distributed synchronous aggregation scheduling method is proposed to structure a collisionfree schedule for data aggregation in WSNs. By using a Connected Dominating Set (CDS) as an aggregation tree we implement the synchronous aggregation scheduling. We prove that the latency of the aggregation schedule generated by our algorithm is at most 4δ+15R16 timeslots where R is the network radius and δ is the maximum node degree in the communication graph of the original network. Analysis and simulation results show the validity and superiority of the algorithm.





Received: January 07, 2011;
Accepted: April 24, 2011;
Published: June 18, 2011


INTRODUCTION
Data aggregation is a primitive operation in wireless sensor network applications,
such as emergency medical situations, environmental monitoring and battlefield
surveillance. In most applications, quite often we need to gather data from
sensors to a fixed sink node that collects a packet from every other node and
every intermediate node combines all received packets with its own packet into
a single packet of fixedsize according to some aggregation function such as
logical and/or, maximum, or minimum. This type of application is called data
aggregation (Krishnamachari et al., 2002). We focus
on generic environmental monitoring applications which collect sensing data
continuously over a long period of time and aim at providing collisionfree
and shorter packet delay.
In data aggregation, many issues need to be resolved such as energy conservation,
deployment strategies, routing in dynamic environment, latency and so on (Idris
et al., 2009; Guo et al., 2010). Previous
researches on innetwork data aggregation seldom consider the collision problem
but leave it to the MAC layer. Solving this problem in MAC layer results in
latency too high to be practical. Recently, a few researchers begin to consider
the collision problem for reducing data aggregation latency and try to construct
a feasible schedule to eliminate collisions during data aggregation (Kesselman
and Kowalski, 2005; Liu et al., 2010a,
b) (Wei et al., 2010a, b).
The main contributions of this paper are as follows. We propose an efficient algorithm that will construct a data aggregation tree and a synchronous TDMA schedule for all links in the tree such that the latency of aggregating all data to the sink node is approximately minimized. An efficient distributed method is proposed to structure a collisionfree schedule for data aggregation in WSNs. We prove that the latency of the aggregation schedule generated by our algorithm is at most 4δ+15R16 timeslots where R is the network radius and Δ is the maximum node degree in the communication graph of the original network.
The data aggregation scheduling problems have been extensively studied in the
works of Annamalai et al. (2003), Intanagonwiwat
et al. (2002), Madden et al. (2002)
and Yu et al. (2004) in recent years. The most
related ones are as follows. Krishnamachari et al.
(2002) proposed a randomized and distributed algorithm for aggregation in
a nnode sensor network with an expected latency of O (log n). In their model,
there are two assumptions. One is that each sensor node has the capability of
detecting whether a collision occurs after transmitting data. Another one is
that sensor nodes can adjust their transmission range without any limitation.
These assumptions pose some challenging issues for hardware design and the latter
assumption is almost impossible when the network scale is very large. Chen
et al. (2005) proposed an algorithm to generate a collisionfree
schedule with latency bound of (Δ1) R. They also proved that the minimum
data aggregation time problem is NPhard. Huang et al.
(2007) proposed a scheduling algorithm with the latency bound of 23R+Δ+18
timeslots where R is the network radius and Δ is maximum node degree.
However the interference model used is a simple primary interference model:
no node can send and receive simultaneously. Under Protocol Interference Model,
Yu et al. (2008) proposed a distributed scheduling
algorithm generating collisionfree schedules that have a latency bound of 24D+6Δ+16
timeslots where D is the network diameter. Xu et al.
(2009) proposed a scheduling algorithm generating collisionfree schedules
that has a latency bound of 16R+Δ14 timeslots which choose the topology
center of the UDG as the root that is not practical, because the root is common
node with limited energy and prone to be failure. Wan et
al. (2009) proposed a scheduling algorithm generating collisionfree
schedules that has a latency bound of 15R+Δ4 timeslots.
Among these algorithms mentioned above, some work on centralized aggregation scheduling that are not practical, especially when the network topology often changes in a large sensor network and other work on distributed aggregation scheduling that can need the requirement of the dynamic network properties. But data scheduling of these algorithms uses a bottomup or upbottom approach that don’t consider nodes in different dominate nodes execute the transmission at the same time and still have high latencies. Moreover, each dominant node costs at most Δ time slots that are not accurate. PROBLEM DEFINITION We consider a WSN consisting of n nodes, given a graph G = (V, E) representing the WSN and a base station (sink node) sεV, to find a data aggregation scheduling with minimum latency. Each node is equipped with an RF transceiver that can be used to send or receive data. We consider omnidirectional antennae only and a node’s transmission or reception range is roughly a disk centered at that node. For simplicity we assume that all nodes other than the sink have the same transmission range (radius) r such that both node and form a communication link whenever their Euclidean distance uv≤r. We normalize their transmission radius to one unit and use a unit disk graph (UDG).
Let X, Y⊂V and X∩Y = ø. If all the nodes in X transmit data
simultaneously in one timeslot and all data are received by some nodes at the
same time in Y with collisionfree we say data are aggregated from X to Y in
one timeslot. A data aggregation scheduling can be thought of as a sequence
of senders {X_{1}, X_{2},..., X_{l}} (in which X_{l}⊂V,
∀i) satisfying the data aggregation property. This sequence represents
that all nodes in X_{1} transmit in the first time slot, followed by
all nodes in X_{2} transmitting in the second time slot and so on. All
data will be aggregated to one single node s which is the sink node. Then a
data aggregation scheduling with latency l can be a sequence of sender
sets X_{1}, X_{2},..., X_{l} satisfying the following
conditions:
• 

• 

• 
Data are aggregated from X_{i} to
at timeslot k, for all k = 1, 2,..., l and all the data are aggregated
to the sink node s in l timeslots 
The number l is called data aggregation latency. This paper focuses on constructing a distributed aggregation scheduling algorithm to find a schedule in a distributed way such that is minimized. This problem is proved to be NPhard. This paper proposes an approximate distributed algorithm with at most latency 4δ+15R16 where R is the network radius and δ is the maximum node degree. SYNCHRONOUS AGGREGATION SCHEDULING ALGORITHM
For simplicity we assume that all communications proceed in synchronous timeslots
and each node can transmit/receive at most one packet of a fixed size in each
timeslot. We assume that the sensors are aware of their locations. The location
information can be obtained either through GPS (McNeff, 2002)
or various localization techniques (Hu and Evans, 2004;
Savvides et al., 2001; He et
al., 2003; Goldenberg et al., 2006).
In this study we construct a Connected Dominating Set (CDS) as data aggregation tree. The data aggregation scheduling is executed according to the CDS tree. The details of our algorithm are as follows.
Distributed aggregation tree construction: The CDS is constructed by
the distributed approach to generate a spanning data aggregation tree. Assume
each node knows own neighbors list that contains ID and Level. The pseudo code
of a Maximal Independent Set (MIS) is as follows.
Algorithm 1: 
Construct a max independent set (MIS) 


Fig. 1: 
Data aggregation tree 
In the algorithm 1, all black nodes form a MIS set by a
distributed approach. Next we will construct a connected dominating set (CDS).
Algorithm 2: 
Construct a CDS to form an aggregation tree level by level 

According to the algorithm 1 and 2 we
construct the aggregation tree just like Fig. 1. All the black
nodes that are dominators form an MIS. Each blue node connects two black nodes
and is lowerlevel black node’s parent node. The grey nodes are leaves
and dominates. All nodes form a CDS. For aggregation, the CDS is an aggregation
tree.
Distributed parallel aggregation scheduling: According to the constructed
aggregation tree to perform aggregation scheduling. Let B denote all black nodes
set, b_{j}∈B where 1≤j≤B. Let W denote all grey nodes,
w_{i}∈W. Set the coordinate of b_{j} is (x_{bj},
y_{bj}) and the coordinate of w_{i} is (x_{wi}, y_{wi}),
where 1≤i≤W. We construct different virtual disks centered at every
black nodes of radius one and divide every disk into four equal portions: area_{i1},
area_{i2} area_{i3} and area_{i4} along the horizontal
and vertical directions. The area_{i1} is formed by the nodes in the
range x_{bj}≤x_{wi} and y_{bj}>y_{wi}
in b_{j}’s children nodes. The area_{i2} is formed by the
nodes in the range x_{bj}>x_{wi} and y_{bj}≤y_{wi}
in b_{j}’s children nodes. The area_{i3} is formed by the
nodes in the range x_{bj}≥x_{wi} and y_{bj}>y_{wi}
in b_{j}’s children nodes. The area_{i4} is formed by the
nodes in the range x_{bj}<x_{wi} and y_{bj} ≤y_{wi}
in b_{j}’s children nodes.

Fig. 2: 
The different fields constructed by the dominant nodes 
As shown in Fig. 2, the area is how to be structured. The
communications process in area_{i1} are executed at the same time in
synchronous timeslots, where ∀j, j∈*B. Nodes of the four
fields are scheduled in clockwise order, that is to say, every node in the area_{i1}
finishes transmitting the information, next the b_{j} receives the information
of every node in the area_{i2} and so on.
We will determine the number of time slot in four equal portions in the following firstfit manner. Assume that D_{i} and D_{j} are both neighboring dominant nodes in B. D_{i, 1} denotes the set of nodes in the area_{i1} of D_{i}. The set of nodes with respect to the area_{j1} of D_{j} are denoted by D_{j, 1}. Consider ID increasing order <v_{i1}, v_{i2},..., v_{i}Di, 1> of D_{i, 1} and <v_{ji}, v_{j2},..., v_{jDj, 1} of D_{j, 1}. Initially, U = {v_{i1}} and v = ø. For x = 2 up to D_{i, 1}, for y = 1 up to D_{j, 1}, if v_{j}∉V and v_{jy} is first not adjacent to v_{ix}, add v_{jy} to V. Because the matching nodes in U and V are pairwise conflictfree, U is equal to V and they use U time slot to complete the communication. The nodes in D_{i, 1}/U and D_{j, 1}/V are conflicting and need separate time slot. According to the algorithm, we can achieve the number of time slot is equal to δ_{ij, 1} = D_{i, 1}/U+D_{j, 1}/V+U. The b_{j} separately gets the number δ_{ij, 1}, δ_{ij, 2} δ_{ij, 3} and δ_{ij, 4} of nodes in different area. All nodes in B broadcasts δ_{ij, 1}, δ_{ij, 2}, δ_{ij, 3} and δ_{ij, 4} messages in WSNs and get the max values of δ_{ij, 1}, δ_{ij, 2}, δ_{ij, 3} and δ_{ij, 4} that separately denote δ_{1}, δ_{2}, δ_{3} and δ_{4}. The nodes in the area_{ik} are assigned corresponding time slot t deferred to the conflictfree in δ_{k} and are sleep in the rest of the time. For example, the first time slot is assigned in Fig. 3. Every dominator has the same scheduling period δ_{1}+δ_{2}+δ_{3}+δ_{4}. The max value of the scheduling period is 4δ1. The dominates in all nodes of B periodically send data to b_{j} if and only if they are in the corresponding time slot and are in sleep mode in the rest of time. In δ_{1}δ_{ij, 1}, δ_{2}δ_{ij, 2}, δ_{3}δ_{ij, 3} and δ_{4}δ_{ij, 4} interval, the b_{j} node is in sleep state.
Let C be the set of connectors. The scheduling in c∪d is as the following.
Observe that every dominator,except the root of the data aggregation tree, connects
to at least one dominator in the upper level within 2hops.

Fig. 3: 
Scheduling in the first time slot 
We now determine the max number of time slots that a dominator node will use
to connect to all dominators within 2 hops. Consider any dominator u, let D_{2}(u)
be the set of dominators within 2hops of u in the original communication network
G. Based on a technique lemma implied from lemmas proved, u requires at most
12 connectors to connect D_{2}(u). We use C(u) to denote the set of
connectors used to connect all dominators in D_{2}(u). According to
Wegner Theorem, there can be at most 5 dominant nodes in a unit disc centered
at any node u in C(u). Each connector only receives the data from lowerlevel
dominant nodes, so u can receive the data from all neighboring dominators D_{2}(u)
in at most 15 time slots. All dominators within 2 hops execute parallel scheduling
by connectors.
Theorem 1: By using distributed parallel aggregation scheduling, the sink can receive all the aggregated data in at most 4δ+15R16 timeslots. Proof: Every dominator can aggregate its neighboring dominates’ data in at most 4Δ1 timeslots within corresponding timeslot by parallel manner. Then connectors ensure that every dominator’s data can be aggregated upwards the root finally. Every dominator u including the root of data aggregation tree can collect aggregated data from all dominators in D_{2}(u) within at most 15 timeslots. Since there is no dominator in level R, after at most 15(R1) timeslots, every dominator’s data can be aggregated to the root. Therefore within timeslots, all the data can be aggregated to the sink node. SIMULATION RESULTS In our simulation we randomly deploy sensors into a region of 200x200m. All sensors have the same transmission radius. We compare the performance of our algorithm SAS with the algorithm for aggregation scheduling DAS.

Fig. 4: 
Latency with different number of nodes 

Fig. 5: 
Latency with different transmission radius 
We use matlab to implement SAS and DAS algorithm. In Fig. 4, the transmission radius of each sensor is fixed to 25 m. The figure shows the number of the time slots needed to aggregate data from the leaves to the sink by running the two algorithms while the number of nodes increases. Fig. 5 compares the number of slots to aggregate data using the two algorithms when the transmission radius varies. Here the number of nodes is fixed to 1600. It can be seen from the Fig. 4 and 5 that SAS outperforms DAS algorithm with much lower latencies. From the figures, SAS's latencies are 1.52 to 3.06 times smaller than DAS's latencies and with the increment of the number of nodes and the transmission radius, the improvement of the latency will be larger. The reason is that we choose the max degree node as a dominant node which causes the nodes to uniformly distribute in different levels, thereby, reduce the latency.
Next we compare the running time of SAS and DAS in Fig. 6
when the number of sensor nodes and the transmission radius vary.

Fig. 6: 
Running time with different number of nodes 
The running time refers to the time required to generate the aggregation scheduling.
We set 1/20 of a second for each time slot when running two algorithms which
is greater than the time required to transmit one packet. Though the time complexity
is O (n) in DAS we can see that the running time is rather small. We can see
that the running time of SAS is 2 to 8 sec on average. For instance, for 2000
nodes with transmission radius of 30 m, the running time of DAS is 14.6 sec
and the running time of SAS is only 3.2 sec. The reason is that many nodes can
schedule simultaneously in SAS and the occasion where only one node can schedule
at each time maybe happens in DAS. As the number of the nodes or the transmission
radius increases, the average size of the nodes' competitor sets also increases
and thus each node has to compete with more competitors that cost more time.
CONCLUSIONS In this study we make a study of the data aggregation problem and its latency. We use the techniques of maximal independent sets and connected dominating set to construct the aggregation tree. We propose a distributed synchronous aggregation scheduling algorithm. The latency of the aggregation schedule generated by our algorithm is at most 4δ+15R16 timeslots. Analysis and simulation results show that the SAS algorithm has smaller latency and better performance by comparing with the DAS algorithm. A distributed asynchronous aggregation scheduling algorithm will be proposed which makes the algorithm suitable for networks with dynamic topology changes in our future work. ACKNOWLEDGMENTS
We would like to thank Yanchao Zhao for helpful discussions and insightful
comments. He reviewed the draft of the paper and made further modifications
that improved the quality of the study. Yanchang Liu devoted herself to designing
some graphs and other art works in this study. We also thank the anonymous reviewers
for their insightful and valuable hints.

REFERENCES 
Krishnamachari, B., D. Estrin and S. Wicker, 2002. The impact of data aggregation in wireless sensor networks. Proceedings of the 22nd International Conference on Distributed Computing Systems Workshops, July 25, 2002, Vienna, Austria, pp: 575578 CrossRef 
Idris, M.Y.I., E.M. Tamil, N.M. Noor, Z. Razak and K.W. Fong, 2009. Parking guidance system utilizing wireless sensor network and ultrasonic sensor. Inform. Technol. J., 8: 138146. CrossRef  Direct Link 
Guo, L., B. Wang, Z. Liu and W. Wang, 2010. An energy equilibrium routing algorithm based on clusterhead prediction for wireless sensor networks. Inform. Technol. J., 9: 14031408. CrossRef  Direct Link 
Kesselman, A. and D. Kowalski, 2005. Fast distributed algorithm for convergecast in ad hoc geometric radio networks. Proceedings of the 2nd Annual Conference on Wireless OnDemand Network Systems and Services, Jan. 1921 IEEE., pp: 119124
Liu, Z., B. Wang and L. Guo, 2010. A survey on connected dominating set construction algorithm for wireless sensor networks. Inform. Technol. J., 9: 10811092. CrossRef  Direct Link 
Liu, Z., B. Wang and Q. Tang, 2010. Approximation two independent sets based connected dominating set construction algorithm for wireless sensor networks. Inform. Technol. J., 9: 864876. CrossRef  Direct Link 
Wei, W., A. Gao, B. Zhou and Y. Mei, 2010. Scheduling adjustment of mac protocols on cross layer for sensornets. Inform. Technol. J., 9: 11961201. CrossRef  Direct Link 
Wei, W., B. Zhou, A. Gao and Y. Mei, 2010. A new approximation to information fields in sensor nets. Inform. Technol. J., 9: 14151420. CrossRef  Direct Link 
Annamalai, V., S. Gupta and L. Schwiebert, 2003. On treebased convergecasting in wireless sensor networks. IEEE WCNC., 3: 19421947. CrossRef  Direct Link 
Intanagonwiwat, C., D. Estrin, R. Govindan and J. Heidemann, 2002. Impact of network density on data aggregation in wireless sensor networks. Proceedings of the 22nd International Conference on Distributed Computing Systems, (DCS'02), IEEE., pp: 457457 CrossRef  Direct Link 
Madden, S., M.J. Franklin, J. Hellerstein and Wei Hong, 2002. Tag: A tiny aggregation service for adhoc sensor networks. Proceedings of the 5th Symposium on Operating Systems Design and Implementation, December 911, 2002, Boston, MA., USA., pp: 131146 CrossRef  Direct Link 
Yu, Y., B. Krishnamachari and V. Prasanna, 2004. Energylatency tradeoffs for data gathering in wireless sensor networks. Proceedings of the 23rd AnnualJoint Conference of the Computer and Communications, March 711, IEEE., pp: 112
Chen, X., X. Hu and J. Zhu, 2005. Minimum data aggregation time problem in wireless sensor networks. Proceedings of the 1st Inerternational Conference on Mobile Adhoc and Sensor NetworkMSN, (MASN`05), SpringerVerlag, pp: 133142
Huang, S.C.H., P.J. Wan, C.T. Vu, Y. Li and F. Yao, 2007. Nearly constant approximation for data aggregation scheduling in wireless sensor networks. Proceedings of the 26th International Conference on Computer Communications, May 612, IEEE., pp: 366372 CrossRef 
Yu, B. , J. Li and Y. Li , 2008. Distributed data aggregation scheduling in wireless sensor networks. Proceedings of the IEEE INFOCOM, April 1925, IEEE., pp: 21592167 CrossRef  Direct Link 
Xu, X.H., S.G. Wang, X.F. Mao, S.J. Tang and X.Y. Li, 2009. An improved approximation algorithm for data aggregation in multihop wireless sensor networks. Proceedings of the 2nd International Workshop on Foundations of Wireless adhoc and Sensor Networking and Computing, May 18, ACM., pp: 4756 CrossRef 
Wan, P.J., C.H. Huang, L. Wang, Z.Y. Wan and X. Jia, 2009. Minimumlatency aggregation scheduling in multihop wireless networks. Proceedings of the 10th Interational Symposium on Mobile AdHoc Networking and Computing, May 1821, ACM., pp: 185193
McNeff, J.G., 2002. The global positioning system. IEEE Trans. Microwave Theory Techniques, 50: 645652.
Hu, L. and D. Evans, 2004. Localization for mobile sensor networks. Proceedings of the 10th Annual International Conference on Mobile Computing and Networking, September 26October 1, 2004, Philadelphia, USA., pp: 4557 Direct Link 
Savvides, A., C.C. Han and M.B. Srivastava, 2001. Dynamic finegrained location in ad hoc networks of sensors. Proceedings of the ACMMobicom, July 2001, USA., pp: 166179
He, T., C. Huang, B.M. Blum, J.A. Stankovic and T. Abdelzaher, 2003. Rangefree localization schemes for large scale sensor networks. Proceedings of the 9th ACM International Conference on Mobile Computing and Networking, September 1419, 2003, San Diego, CA., USA., pp: 8195 Direct Link 
Goldenberg, D.K., P. Bihler, Y.R. Yang, M. Cao, J. Fang, A.S. Morse and B.D.O. Anderson, 2006. Localization in sparse networks using sweeps. Proceedings of the ACM Mobicom, December 2006, USA., pp: 110121



