Introduction
Wireless sensor networks are being used for many different applications,
such as monitoring chemical spills, detecting and assessing the extent of
environmental contamination, and monitoring the movement of soldiers and
weapons on the battlefield. However, their limited lifespan is a great
concern when they are used in remote locations or in harsh environments.
Figure 1. Radio energy model.
Many techniques have been introduced in an effort to maximize their
lifespan, most of which have the nodes in a cluster send their data to a
selected cluster head node that, in turn, reports the data to the base
station. The main design decisions of these techniques are therefore the
number of clusters and the way the cluster head node is selected.
Clustering and the use of cluster heads in wireless
sensor networks have the potential to enhance the lifespans of a group of
sensor nodes and to minimize the generation of noise in the signals
exchanged between the sensor nodes and the base station (sink) (Heinzelman
et al., 2000). In this approach, the cluster head organizes a reservation
scheme to improve communication with the sensor nodes in the cluster, and
the cluster head uses this scheme to aggregate, compress, and transmit the
cluster's sensing data to the base station. Several algorithms have been
designed to improve the lifespan of the sensors, including the energy
efficient heterogeneous clustered scheme (EEHC) (Kumar et al., 2009),
distributed energy efficient clustering (DEEC) (Qing et al., 2006), and the
low-energy adaptive clustering hierarchy (LEACH) (Heinzelman et al., 2000).
The goals of these algorithms were to determine the optimal number of
clusters for a given number of sensor nodes and to select a head in each
cluster of sensors. The low energy consumption clustering routing protocol (Kumar et
al., 2009) improved the LEACH algorithm by utilizing the k-means algorithm
that divides the sensor nodes into k clusters in the setup and
steady-state phases. A major problem of the k-means algorithm was that it
could not accommodate the inevitable situation that the number of clusters
gradually changed as the energy levels of the nodes decreased. Also, the
method did not solve the sticky issue of initialization of the k-means
process (Zhong et al., 2012). However, k-harmonic means (KHM) clustering
solved the initialization problem by providing “soft membership”, which
assumes that a data element belongs to more than one cluster; also, if a
data point is not close to any center or cluster, a “dynamic weighting
function” provides a higher weight to the data element in the next
iteration so that it becomes a candidate for all of the clusters. However,
the KHM algorithm cannot provide an optimal number of clusters.
In this paper, we have provided detailed discussions of clustering
algorithms; the combination of k-means++, k-means, and gap statistics
algorithms; the selective ways in which each is used and combined; and how,
using the combination, the optimal number of clusters is generated, which
leads to the maximum lifespan of a group of distributed wireless sensors.
Before discussing the clustering algorithms and their combination, we
review, in the next section, a popular clustering algorithm, the low-energy
adaptive clustering hierarchy (LEACH) algorithm, for extending the lifespan
of wireless sensors. Section 3 describes the selected clustering algorithms
and their combination for determining the optimal number of clusters.
Section 4 provides the simulation results and compares the results of the
combined clustering algorithm with those of the LEACH algorithm. Section 5
presents the conclusion.
Clustering algorithms
Low-energy adaptive clustering algorithm (LEACH)
The LEACH algorithm was developed to minimize the power consumption of
wireless sensor nodes by determining the optimal number of clusters,
k, in a group of distributed homogeneous wireless sensors
based on the “computation and communication energy model” (Heinzelman et
al., 2000). In order to determine the optimal number of clusters, k, first,
the algorithm considers how much energy the head of a cluster consumes using
the radio energy model depicted in Fig. 1. In the radio energy model, the
energy dissipated to transmit a message, $E_{Tx}$, is composed of two
components, i.e., $E_{Tx\text{-}elec}(q)$, the electrical energy consumed
for digital coding, modulation, and filtering of a signal, and
$E_{Tx\text{-}amp}(q,d)$, the energy required for amplification.
Then, the total energy used to transmit a q bit message over a distance d
is expressed by
$$E_{Tx}(q,d) = E_{Tx\text{-}elec}(q) + E_{Tx\text{-}amp}(q,d).$$
The amplification energy for q bits is expressed by
$E_{Tx\text{-}amp}(q,d) = q\varepsilon_{fs}d^{2}$ in the free-space path
model ($\varepsilon_{fs}$), which scales with the distance squared
($d^{2}$). When a multi-path model is considered, the amplification energy
is defined as $E_{Tx\text{-}amp}(q,d) = q\varepsilon_{mp}d^{4}$ for q bits,
scaling with the distance to the fourth power ($d^{4}$). The LEACH
algorithm proposed that the free-space (fs) model be used when the distance
between the transmitter and the receiver is less than the threshold
distance $d_{o}$; otherwise, the multipath (mp) model is used, as
summarized below:
$$E_{Tx}(q,d) = \begin{cases} E_{Tx\text{-}elec}(q) + q\varepsilon_{fs}d^{2}, & d < d_{o}, \\ E_{Tx\text{-}elec}(q) + q\varepsilon_{mp}d^{4}, & d \ge d_{o}. \end{cases}$$
The receiver's energy for receiving a q bit message is calculated by
$$E_{Rx}(q) = qE_{elec}.$$
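To make the model concrete, the transmit and receive energies above can be sketched in Python (an illustrative sketch, not the authors' implementation; the electronics energy value and the crossover distance $d_o = \sqrt{\varepsilon_{fs}/\varepsilon_{mp}}$ are assumptions drawn from commonly used first-order radio model parameters):

```python
import math

E_ELEC = 50e-9       # electronics energy per bit (J/bit); assumed value
EPS_FS = 10e-12      # free-space amplifier energy, eps_fs (J/bit/m^2)
EPS_MP = 0.0013e-12  # multipath amplifier energy, eps_mp (J/bit/m^4)
D_O = math.sqrt(EPS_FS / EPS_MP)  # assumed crossover distance d_o (~87.7 m)

def e_tx(q, d):
    """Energy to transmit a q-bit message over distance d (meters)."""
    if d < D_O:
        return q * E_ELEC + q * EPS_FS * d ** 2   # free-space model, d < d_o
    return q * E_ELEC + q * EPS_MP * d ** 4       # multipath model, d >= d_o

def e_rx(q):
    """Energy to receive a q-bit message: E_Rx(q) = q * E_elec."""
    return q * E_ELEC
```

With this choice of $d_o$, the two branches of $E_{Tx}$ coincide at $d = d_o$, so the model is continuous in the distance.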
Let us now consider energy consumption by the sensor nodes in a cluster of a multi-cluster
sensor network. Assuming that there are N wireless sensor nodes uniformly
distributed in a square region of M×M geographical units that have
k clusters, there are N/k nodes per cluster, and, in each cluster, there is one
cluster head node and (N/k)-1 non-cluster-head nodes (or “cluster member
nodes”). In a cluster, during the steady-state phase, data are transferred
from the member nodes to the cluster head as well as from the cluster head
to the sink, which is located a long distance away, so the energy of the
cluster head's battery is depleted faster than that of any of the member
nodes, because the cluster head receives data from the member nodes,
aggregates and compresses them, and transmits the compressed data to the
sink. The energy
consumption of a cluster head is calculated by
$$E_{CH} = qE_{elec}\left(\frac{N}{k}-1\right) + qE_{DA}\frac{N}{k} + q\left(E_{Tx\text{-}elec} + \varepsilon_{mp}d_{to\,BS}^{4}\right),$$
where dto BS is the distance between the cluster head and the base
station, and (EDA) is the energy dissipation per bit for data aggregation
and compression.
The energy consumption by a member node for transmitting a q bit message
to the cluster head is defined as
$$E_{Non\text{-}CH} = qE_{Tx\text{-}elec} + q\varepsilon_{fs}d_{CH}^{2},$$
where $d_{CH}$ is the distance between a member node and the cluster head.
Now, let us calculate the energy consumption in a cluster in the
aforementioned sensor network, i.e., N sensors distributed uniformly in an
$M\times M$ geographical unit square area that is divided into k clusters.
First, we can say that each cluster takes up approximately $M^{2}/k$ of the
geographical region. Second, the distribution of the sensor nodes can be
described by a density function $\rho(x,y)$ in Cartesian coordinates
(Heinzelman et al., 2000). If the cluster area is approximated by a circle,
the density can be written in polar coordinates as $\rho(r,\theta)$, where
$r$ is the radius and $\theta$ is an angle, with the cluster radius defined
by $r = M/\sqrt{\pi k}$ (so that $\pi r^{2} = M^{2}/k$). Third, the
expected squared distance between the cluster head and the member sensor
nodes in a circular cluster area is calculated by
$$E\!\left[d_{to\,CH}^{2}\right] = \rho\int_{\theta=0}^{2\pi}\int_{r=0}^{M/\sqrt{\pi k}} r^{3}\,dr\,d\theta = \rho\,\frac{M^{4}}{2\pi k^{2}}.$$
Because the nodes are distributed uniformly over a region of area
$M^{2}/k$, the density is $\rho = 1/(M^{2}/k) = k/M^{2}$, which gives
$$E\!\left[d_{to\,CH}^{2}\right] = \frac{M^{2}}{2\pi k},\qquad E_{Non\text{-}CH} = qE_{Tx\text{-}elec} + q\varepsilon_{fs}\frac{M^{2}}{2\pi k}.$$
Fourth, the total energy consumption for a cluster is the sum of that for the
cluster head and for the non-cluster head member nodes:
$$E_{total} = E_{CH} + E_{Non\text{-}CH} = q\left[E_{elec}\left(\frac{N}{k}-1\right) + E_{DA}\frac{N}{k} + 2E_{Tx\text{-}elec} + \varepsilon_{mp}d_{to\,BS}^{4} + \varepsilon_{fs}\frac{M^{2}}{2\pi k}\right].$$
Finally, the optimal number of clusters, k, can be determined by setting
the derivative of $E_{total}$ with respect to k to zero, resulting in
$$k = \sqrt{\frac{N\,\varepsilon_{fs}}{2\pi\,\varepsilon_{mp}}}\;\frac{M}{d_{to\,BS}^{2}},\qquad \varepsilon_{fs} = 10\ \mathrm{pJ\,bit^{-1}\,m^{-2}},\quad \varepsilon_{mp} = 0.0013\ \mathrm{pJ\,bit^{-1}\,m^{-4}}.$$
Based on Eq. (12), let us assume that the number of sensor nodes (N) and
the network region (M) are constant but the base station distance
($d_{to\,BS}$) increases; the optimal number of clusters (k) then
decreases, because k is inversely proportional to the squared distance.
Ultimately, when the number of clusters decreases, some clusters contain
many sensor nodes. As Haibo et al. (2010)
described, a cluster head with many sensor nodes consumes more energy than a
cluster head with a few sensor nodes, because it receives, aggregates, and
compresses more sensing information. In addition, if there is a large
distance between a cluster head and
the base station, the cluster head node consumes more energy than it would
if the distance were shorter. If the current cluster head runs out of
energy, the entire wireless sensor network is no longer operational. The
main challenge is to minimize the power consumption of the cluster head,
especially when many sensor nodes are allocated to a single cluster.
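As a quick numerical illustration of the formula for the optimal k, the expression above can be evaluated directly (a sketch using the $\varepsilon_{fs}$ and $\varepsilon_{mp}$ constants quoted in the text; the example network dimensions are assumed for illustration only):

```python
import math

def optimal_clusters(n_nodes, m, d_to_bs, eps_fs=10e-12, eps_mp=0.0013e-12):
    """k = sqrt(N * eps_fs / (2 * pi * eps_mp)) * M / d_to_BS^2."""
    return (math.sqrt(n_nodes * eps_fs / (2 * math.pi * eps_mp))
            * m / d_to_bs ** 2)

# Assumed example: 100 nodes in a 100 m x 100 m region.
k_near = optimal_clusters(100, 100, d_to_bs=100)   # roughly 3.5 clusters
k_far = optimal_clusters(100, 100, d_to_bs=200)    # fewer clusters
```

As the discussion above notes, doubling the base station distance divides k by four, since k is inversely proportional to the squared distance.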
k-means++ algorithm
The k-means++ algorithm is used to assign the initial center of the
k-means algorithm. Since the k-means algorithm randomly chooses the
initial centroid, it is not guaranteed that clustering by the k-means
algorithm is optimal. For example, if the initial random centroid is far
away from the cluster's true center, more iterations are required to
optimize the centroid, and an incorrect clustering result may be obtained
(Arthur et al., 2007; Avros et al., 2012). To remedy these problems, the
k-means++ algorithm selects the initial centers randomly from the sensor
nodes' locations, but each node's chance of being selected depends on its
squared distance from the closest center that has already been selected.
For example, the first single initial center (c1) is selected randomly;
however, the remaining centers, such as those in the range from (c2) to
(cl), are calculated based on the steps described below.
First, let us assume that the sensor nodes are represented by X=x1,…,xn and that l centers are represented as C,
where C=c1,…,cl. The distance between
each sensor node and (c1) is calculated by
$$D_{1}^{2} = \lVert x_{1}-c_{1}\rVert^{2},\quad D_{2}^{2} = \lVert x_{2}-c_{1}\rVert^{2},\quad \ldots,\quad D_{n}^{2} = \lVert x_{n}-c_{1}\rVert^{2}.$$
The probability of selecting each sensor node, i.e., its squared distance
over the sum of the squared distances, is calculated by
$$p(x_{i}) = \frac{D_{i}^{2}}{\sum_{j=1}^{n} D_{j}^{2}},\quad i = 1,\ldots,n.$$
Second, the algorithm generates a random number R. Then, the sensor node
$x_{i}$ whose value $p(x_{i})$ is closest to the random number becomes the
second center. For example, if $R \approx p(x_{4})$, the sensor node
$x_{4}$ becomes $c_{2}$; otherwise, the algorithm generates another value.
The third step is to choose the third center ($c_{3}$). The distances are
recalculated, now using the closest of the two chosen centers, as
$$D_{1}^{2} = \min\!\left(\lVert x_{1}-c_{1}\rVert^{2}, \lVert x_{1}-c_{2}\rVert^{2}\right),\quad D_{2}^{2} = \min\!\left(\lVert x_{2}-c_{1}\rVert^{2}, \lVert x_{2}-c_{2}\rVert^{2}\right),\quad \ldots$$
The selection probability of each sensor node is again calculated as
$$p(x_{i}) = \frac{D_{i}^{2}}{\sum_{j=1}^{n} D_{j}^{2}},\quad i = 1,\ldots,n.$$
Again, the algorithm generates a random number to choose one of the values
of px1,px2,…,pxn. The process of selecting the initial centers using the above
steps continues until l centers are selected.
Moreover, Arthur et al. (2007) chose the initial centers of a data set
one by one in a controlled fashion using the k-means++ algorithm. For
example, the first initial center was selected randomly in a sensing region,
but the subsequent centers depended on the value of the previous center. For
example, c2 depends on c1, and c3 depends on c2 and
c1. If we expand the illustrative example, the
k-means++ algorithm can be conveniently generalized
for any number of nodes and clusters.
The first step is to choose the first single initial center (c1)
randomly. The second step is to compute the distance between all sensor
nodes and (c1) and choose c2 by the following:
$$D_{i}^{2} = \lVert x_{i}-c_{1}\rVert^{2},\qquad p(x_{i}) = \frac{D_{i}^{2}}{\sum_{j=1}^{n} D_{j}^{2}},\quad i = 1,\ldots,n.$$
The algorithm generates a random number; the sensor node $x_{i}$ whose
value $p(x_{i})$ is closest to the random number becomes the second center,
$c_{2}$. Third, recompute the distance vector to choose the third center as
$$D_{i}^{2} = \min\!\left(\lVert x_{i}-c_{1}\rVert^{2}, \lVert x_{i}-c_{2}\rVert^{2}\right).$$
Calculate $p(x_{1}),\ldots,p(x_{n})$ as in Eq. (16) and generate a random
number to choose the third center ($c_{3}$). The difference between
Eqs. (17) and (19) is that Eq. (17) is used to calculate the distance
between the initial center ($c_{1}$) and the sensor nodes, whereas Eq. (19)
calculates the distance based on both $c_{1}$ and $c_{2}$. In general, all
remaining centers, up to $c_{l}$, are calculated using
$$D_{i}^{2} = \min_{1\le j\le l-1}\lVert x_{i}-c_{j}\rVert^{2},\qquad p(x_{i}) = \frac{D_{i}^{2}}{\sum_{j=1}^{n} D_{j}^{2}},\quad i = 1,\ldots,n.$$
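The seeding procedure above can be summarized in a short sketch (an assumed implementation of the published k-means++ rule, not the authors' code; a weighted roulette-wheel draw stands in for the "random number close to $p(x_i)$" step described in the text):

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeanspp_init(points, l, rng=None):
    """Choose l initial centers from the sensor node locations."""
    rng = rng or random.Random(0)
    centers = [points[rng.randrange(len(points))]]  # c1: uniform at random
    while len(centers) < l:
        # D_i^2: squared distance to the closest already-chosen center
        d2 = [min(sq_dist(x, c) for c in centers) for x in points]
        # draw the next center with probability D_i^2 / sum_j D_j^2
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers
```

Nodes that coincide with an already-chosen center receive zero weight, so the same node is never selected twice.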
Figure 2. k-means algorithm diagram: (a) location of the sensor nodes;
(b) initial centers; (c) new centers after multiple iterations;
(d) optimal centers.
k-means algorithm
The k-means algorithm is a method of grouping or classifying sensor nodes
into k numbers of groups/clusters (Zhong et al., 2012). This technique
selects an optimal center location of a cluster from which the sum of the
squared distances to the locations of the sensor nodes is minimized.
Figure 2 illustrates how the k-means algorithm is used to select an optimal center.
First, sensor nodes are represented as x1,x2,x3,x4 in
Fig. 2a, and let us randomly choose two centers, called
c1 and c2 (Fig. 2b). Next,
calculate the distance from each sensor node to the two centers,
$\lVert x_{1}-c_{1}\rVert^{2},\ldots,\lVert x_{4}-c_{1}\rVert^{2}$ and
$\lVert x_{1}-c_{2}\rVert^{2},\ldots,\lVert x_{4}-c_{2}\rVert^{2}$. Third,
group the sensor nodes based on their minimum distance to the centers. For
example, if $x_{1}$ and $x_{2}$ are closest to $c_{1}$, then $x_{1}$ and
$x_{2}$ will be in the same group. Similarly, if $x_{3}$ and $x_{4}$ are
closest to $c_{2}$, then $x_{3}$ and $x_{4}$ will be in the same group.
Figure 2c shows the sensor nodes grouped by their closest center. Fourth,
calculate a new center for the sensor nodes that are in the same group,
i.e., the mean of the group members: $c_{1}^{new} = \frac{1}{2}(x_{1}+x_{2})$
and $c_{2}^{new} = \frac{1}{2}(x_{3}+x_{4})$. Last, we
continue to calculate the center based on the previous equation until the new
center is the same as the previous center location. When the previous and
the new center locations are the same, the centers are optimal, as shown
in Fig. 2d.
If we expand the illustrative example, the k-means
algorithm can be generalized conveniently for any number
of nodes and clusters. In general, the locations of n
sensor nodes are represented by X, where X=x1,…,xn, and l centers are represented by
C, where C=c1,…,cl. The k-means
objective function, which minimizes the distance between sensor node xi and the cluster center cj, is defined as
$$KM(X,C) = \sum_{i=1}^{n} \min_{1\le j\le l}\lVert x_{i}-c_{j}\rVert^{2},$$
where
$$c_{j} = \frac{1}{u_{j}}\sum_{x_{i}\in u_{j}} x_{i}.$$
The cluster center cj represents the current estimation of the location
of the center of cluster j, and uj is the number of sensor nodes in
cluster j.
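The assignment and update steps defined by Eqs. (22) and (23) can be sketched as follows (a minimal Lloyd-style iteration, assumed for illustration rather than taken from the paper): each node joins its nearest center's group, and each center is then recomputed as the mean of its group until the centers stop moving.

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, centers, iters=100):
    """Iterate assignment and update steps; return the converged centers."""
    centers = list(centers)
    for _ in range(iters):
        # assignment step: each node joins its nearest center's group
        groups = [[] for _ in centers]
        for x in points:
            j = min(range(len(centers)), key=lambda j: sq_dist(x, centers[j]))
            groups[j].append(x)
        # update step: c_j = (1/u_j) * sum of the u_j member nodes
        new = [tuple(sum(v) / len(g) for v in zip(*g)) if g else centers[j]
               for j, g in enumerate(groups)]
        if new == centers:        # converged: centers no longer move
            break
        centers = new
    return centers

# Two well-separated pairs of nodes converge to their pair means:
centers = kmeans([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)])
# → [(0.0, 1.0), (10.0, 11.0)]
```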
Gap statistics
“Gap statistics” is a standard technique for determining the optimal
number of clusters for a data set (or a group of sensor nodes) by comparing
the observed weight curve to the expectation of a referenced weight curve
(Tibshirani et al., 2001).
The observed weight is the sum of the distance between all observed sensor
nodes (actual data) and the center of the cluster; the referenced weight is
the sum of the distance between all referenced sensor nodes (ideal) and the
center of the cluster (Yan, 2005; Zhang, 2001). The observed weight and the
expectation of the referenced weight can be derived mathematically as shown
below.
First, let us assume that the sensor nodes are represented by
$X = x_{1},\ldots,x_{n}$. For the sensor nodes in a cluster, the sum of the
pairwise distances between them is defined by
$$D_{k} = \sum_{i,i'} d_{ii'} = \sum_{i=1}^{n}\sum_{i'=1}^{n}\lVert x_{i}-x_{i'}\rVert^{2} = (x_{1}-x_{1})^{2} + (x_{1}-x_{2})^{2} + \ldots + (x_{n}-x_{n})^{2},$$
where $(x_{1}-x_{1})^{2} = (x_{2}-x_{2})^{2} = \ldots = (x_{n}-x_{n})^{2} = 0$. Therefore,
$$D_{k} = 2n_{k}\sum_{i=1}^{n}\lVert x_{i}-\bar{x}\rVert^{2},$$
where $\bar{x} = \frac{1}{n}(x_{1}+x_{2}+\ldots+x_{n})$, and $\bar{x}$
is the center of the cluster, n is the number of sensor nodes,
and $d_{ii'}$ is the distance between two nodes ($i$ and $i'$); $k$ is the
number of clusters ($k = 1,\ldots,g$), and $g$ is the maximum number of
clusters. The weight of a clustering with $k$ clusters is defined by
$$W_{k} = \sum_{r=1}^{k}\frac{1}{2n_{r}}D_{r},$$
where $D_{r}$ and $n_{r}$ are, respectively, the pairwise-distance sum and
the number of sensor nodes in cluster $r$.
Figure 3. Sensor nodes in a cluster.
Figure 3 also illustrates the distance between each of the sensor nodes, the
number of clusters, and the number of sensor nodes in a cluster. For
example, for $k=2$, $n_{1}=3$, and $n_{2}=3$, $D_{1}$ is the total distance
between the sensor nodes and the center of cluster 1, and $D_{2}$ is the
total distance between the sensor nodes and the center of cluster 2.
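Putting the expressions for $D_k$ and $W_k$ together, the weight of a given grouping can be computed as follows (an illustrative sketch; note that $D_r/(2n_r)$ reduces to the sum of squared distances of the cluster's nodes to their mean):

```python
def dispersion(clusters):
    """W_k for a k-clustering given as a list of per-cluster point lists."""
    w = 0.0
    for members in clusters:
        n_r = len(members)
        mean = tuple(sum(v) / n_r for v in zip(*members))   # cluster center
        ssq = sum(sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
                  for x in members)          # sum of ||x_i - xbar||^2
        d_r = 2 * n_r * ssq                  # D_r = 2 n_r * sum ||x_i - xbar||^2
        w += d_r / (2 * n_r)                 # contribution D_r / (2 n_r)
    return w
```

For a two-cluster grouping such as the one in Fig. 3, $W_2$ is therefore just the summed squared distance of every node to its own cluster center.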
Second, the algorithm generates the referenced weight by adding a small
noise into the original sensor nodes or the observed sensor nodes. The
observed weight is $W_{k}$, and the referenced weight dispersion is
$W_{kb}^{*}$, where $k$ is the number of clusters, $k = 1,\ldots,g$, and
$b$ refers to the reference data sets, $b = 1,2,\ldots,B$, where $B$ is the
maximum number of data sets. For example, when $k=3$ and $B=5$, the
algorithm generates five different sets of sensor node locations
which are distributed across three clusters.
Third, the algorithm calculates the expected value of the referenced
weight, $E_{n}^{*}\{W_{kb}^{*}\}$, where $n$ is the number of sensor nodes.
In order to analyze the difference between the observed weight and the
expected value of the referenced weight, the algorithm uses a logarithmic
scale, since it shows a clear visual differentiation between the observed
and referenced weights. Therefore, the observed weight is represented as
$\log(W_{k})$, and the expected referenced weight is represented as
$E_{n}^{*}\{\log W_{kb}^{*}\}$.
As expressed above, the main goal of the gap statistics method is to
compare the curve of the observed weight ($\log(W_{k})$) to the curve of
the expected referenced weight ($E_{n}^{*}\{\log W_{kb}^{*}\}$) and to
determine the optimal number of clusters from the maximum gap between the
two curves. As Yan (2005) and Zhang (2001) describe, the optimal number of
clusters is found where $\log(W_{k})$ falls the farthest below the expected
referenced weight dispersion curve.
Figure 4. Results of the example with three clusters: (a) sensor nodes;
(b) weight dispersion, $W_{k}$, as a function of the number of clusters k.
However, when there is a small gap between the $\log(W_{k})$ curve and the
expected referenced weight curve ($E_{n}^{*}\{\log W_{kb}^{*}\}$), the
clustering is not optimal, because the observed sensor nodes have noise
similar to that of the referenced sensor nodes. Conversely, when there is a
maximum gap between the $\log(W_{k})$ curve and the expected referenced
weight curve, the clustering is optimal. In other words, the observed
sensor nodes have very small noise at
the maximum gap compared to that of the referenced sensor nodes, which are
generated with noise. In this discussion, the term “noise” indicates that
the sensor nodes are not close to each other and that they do not form the
optimal number of clusters.
For example, Fig. 4a shows a scatter graph in which the sensor nodes are
distributed across three clusters; one cluster is well separated from the
other two clusters, which are connected. Figure 4b shows how the gap
statistics algorithm determines the optimal number of clusters for the
nodes in Fig. 4a.
As Fig. 4b shows, increasing the number of clusters decreases the weight.
The red line indicates the observed weight, $\log(W_{k})$, of the original
sensor nodes; it decreases rapidly up to cluster number 2 and then
decreases slowly from cluster numbers 3 to 10. The blue line is the
expected referenced weight, $E_{n}^{*}\{\log W_{kb}^{*}\}$. The optimal
number of clusters is determined to be three,
because, at that point, the gap between the two lines is at its maximum.
Figure 5. Combination of the three clustering algorithms.
Combination of the clustering algorithms
As summarized above, the LEACH (Heinzelman et al., 2000) algorithm uses a
computation and communication energy model to increase the lifespan of the
sensor nodes. But the method is still far from being a complete and optimal
solution to the problem. For example, the LEACH algorithm selects a fixed
number of clusters, but it ignores the fact that some of the sensor nodes in
a cluster can be reallocated to another cluster. It also ignores the fact
that the cluster head's energy will be depleted quickly when too many sensor
nodes remain in a single cluster, because more energy is required for
aggregating, compressing, and transmitting more information. With this
background of partial solutions to the problem, our intention was to attain
a complete solution by using other clustering algorithms that were developed
for other purposes. This section provides details concerning how they were
used. The operation of wireless sensor nodes is divided into three phases,
i.e., setup, advertisement, and steady state. In this research, we focused
only on the setup phase. During the setup phase, first, the sensor nodes
identify their locations and then transmit this information to
a base station. At the base station, where this combined algorithm is
located and runs, the k-means++ algorithm generates the initial center
for the sensor nodes' location. Second, the k-means algorithm chooses the
optimal centers of the clusters. Finally, the gap statistics algorithm is
used to select the optimal number of clusters for the nodes.
Figure 5 shows the steps that are used to choose the optimal number of
clusters based on the three clustering algorithms (k-means++,
k-means, and gap statistics).
In the first step, we represent the location of the sensor node. In the
second step, we initialize the cluster's center based on the k-means++
algorithm. In the third step, we choose the optimal center for the cluster
based on the k-means algorithm. In the fourth step, we use the gap
statistics algorithm to calculate the optimal number of clusters.
The first step starts with a number of sensor nodes represented by
$X = x_{1},\ldots,x_{n}$. In the second step, we calculate the initial
centers for the sensor nodes based on Eqs. (17)–(19).
Third, we calculate the optimal centers of the distributed sensor network
based on the k-means algorithm using Eqs. (22) and (23). The k number of
clusters is defined as (k=1,…,g).
Using Eqs. (24)–(25), the sum of the clusters' weights ($W_{k}$) is
calculated, and the mean of the reference weights ($W_{kb}^{*}$) is
generated, where $b$ refers to the reference data sets, $b = 1,2,\ldots,B$,
and $B$ is the maximum number of data sets. As before, the observed weight
is represented as $\log(W_{k})$ and the expected referenced weight as
$E_{n}^{*}\{\log W_{kb}^{*}\}$ on a logarithmic scale. The gap statistic
is defined by
$$\mathrm{Gap}_{n}(k) = E_{n}^{*}\{\log W_{kb}^{*}\} - \log W_{k}.$$
The optimal number of clusters is the value of $k$ that maximizes the gap
between the two curves:
$$\hat{k}_{opt} = \arg\max_{k}\,\mathrm{Gap}_{n}(k).$$
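Given the observed dispersions $W_k$ and the reference dispersions $W_{kb}^{*}$ for each candidate $k$ (computed, for instance, with the clustering and dispersion steps described earlier), the final selection rule can be sketched as follows. The numbers in the example are made up purely to illustrate the arithmetic:

```python
import math

def gap_values(w_obs, w_ref):
    """Gap_n(k) = E*{log W*_kb} - log W_k for every candidate k.

    w_obs maps k -> observed W_k; w_ref maps k -> list of B reference W*_kb.
    """
    gaps = {}
    for k, wk in w_obs.items():
        expected = sum(math.log(w) for w in w_ref[k]) / len(w_ref[k])
        gaps[k] = expected - math.log(wk)
    return gaps

def optimal_k(w_obs, w_ref):
    """The k that maximizes the gap between the two curves."""
    gaps = gap_values(w_obs, w_ref)
    return max(gaps, key=gaps.get)

# Made-up dispersions for k = 1..4: the observed curve drops sharply at k = 3,
# while the reference curve decreases gradually, so the gap peaks at k = 3.
w_obs = {1: 100.0, 2: 40.0, 3: 2.0, 4: 1.8}
w_ref = {1: [100.0] * 3, 2: [60.0] * 3, 3: [40.0] * 3, 4: [30.0] * 3}
# optimal_k(w_obs, w_ref) → 3
```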