With the rapid development of wireless communication technology and micro-electro-mechanical systems technology, wireless sensor networks (WSNs) that consist of low-cost and low-power sensor nodes have become very popular. Sensor nodes are distributed in various environments to collect important information, such as in military, medical, and environmental applications where the confidentiality and integrity of data are of great importance. However, because of their characteristics, WSNs are prone to node capture attacks. In a node capture attack, sensor nodes are captured physically, and encryption keys are extracted by the adversary [2-4]. By using keys recovered from captured nodes, the adversary can obtain sensitive and secret information from the communication among sensor nodes or even falsify messages, which can be very destructive. For example, in a patient monitoring scenario, patients' lives may be in danger if the sensor readings are falsified during transmission.
To improve the security of encryption keys, recent research has emphasized symmetric key assignment protocols that resist node capture attacks in which nodes are captured randomly [5-7]. However, the effectiveness of a key assignment protocol declines greatly when the adversary captures nodes in a chosen sequence. Prior work provides a formal characterization of node capture attacks; it studies the application of storage randomization to key establishment and further proves that the adversary can significantly reduce the resource expenditure of a node capture attack by using information leaked from the key assignment protocol. Therefore, recent node capture attack methods mainly exploit the information from the key assignment protocol [6, 7, 9, 10]. Moreover, by eavesdropping on the link establishment protocol, the adversary learns which key set is assigned to each sensor node in the network. The adversary can capture nodes randomly to attack the WSN. Alternatively, an adversary can choose which node to capture by comparing each node's metric for compromising the network.
The greedy node captured approximation using vulnerability evaluation (GNAVE) protocol is an elegant solution for approximating the minimum cost attack, in which the compromise of network traffic is mapped to the flow of current through an electric circuit. In the GNAVE algorithm, a vulnerability metric is proposed as a function of the routing and cryptographic protocols, which can reduce resource expenditure to a certain degree. However, GNAVE considers only the nodes on the specific routes and ignores the internal relationships among nodes, especially the nodes that are not on the route. The whole network topology should be taken into consideration to improve the efficiency.
Because existing node capture attack methods have shortcomings, a high efficiency node capture attack algorithm based on route minimum key set, namely GNRMK, is proposed in this paper. In GNRMK, the sensor network is mapped as a flow network to obtain its route minimum key set, which represents the vulnerability of the route. The route minimum key set can be obtained by calculating the max flow of the flow network through a labeling and adjustment procedure based on the Ford–Fulkerson method. Then, a node metric called overlapping value is calculated as a function of the route minimum key set to represent the effect of the node's capture on the whole network. After a node is captured, the network topology also changes because of the already compromised links. Besides the routing protocol and key assignment protocol, the network topology is utilized to obtain more information about the route's vulnerability and thus accelerate the attack process. The route's vulnerability differs according to the routing protocol. Compared with previously proposed schemes, our approach is more efficient in compromising the sensor network. The main contributions of this paper are summarized as follows:
The sensor network is mapped as a flow network to obtain the route minimum key set. With the max flow value of the flow network, the route minimum key sets of the sensor network are obtained to reflect the vulnerability of the route.
A node overlapping value (NOV) metric is calculated on the basis of the route minimum key sets. A node with maximum overlapping value is reasonably chosen as the node to be captured. Also, after a node is captured, the network topology is dynamically changed because of the already compromised links or paths.
The effect of our GNRMK algorithm is demonstrated in different routing environments: the single path routing protocol, the multiple independent path routing protocol, and the multiple dependent path routing protocol. Furthermore, GNRMK is compared with previously related proposed schemes in the efficiency of compromising the network.
The remainder of this paper is organized as follows. In Section 2, models and definitions for the wireless sensor network, key assignment, routing, and the adversary are presented. In Section 3, the node metric for compromising the network and the GNRMK algorithm are presented. In Section 4, the performance of GNRMK is evaluated through simulation experiments and compared with other algorithms. In Section 5, we conclude and discuss future work.
2 MODELS AND DEFINITIONS
In this section, we introduce the models and various definitions used in our work. Table 1 summarizes the main variables and their definitions.
Table 1. A summary of notations used.
N: Set of sensor nodes
E: Set of links
(i, j): Link from node i to node j
Ki, Li: Set of keys and labels of node i
Kij, Lij: Set of keys and labels shared by nodes i and j
S, D: Set of source and destination nodes
Γ: Set of route pairs constructed by S and D
ΓA: Subset of Γ, captured route pair
fπ: Fraction of traffic that traverses path π
Rsd: Route from source node s to destination node d
f, v(f): Network flow and its max flow value
KC: Set of keys of captured nodes
Kmin: Minimum key set of sensor network
Minimum key set of route i
NOV(T): Overlapping value of node T
2.1 The model
In the network model, a WSN with N sensor nodes is considered, which can be denoted by a directed graph G = (V, E) where V is the set of vertices indicating the sensor nodes and E is the set of directed edges indicating the wireless links in the graph.
In this paper, it is assumed that the WSN uses a set K of symmetric cryptographic keys and a set L of corresponding key labels. Each node i ∈ N in the network is assigned a subset Ki of K and the corresponding label subset Li of L. The keys assigned to each node are selected randomly from the key pool. The set of keys shared by nodes i and j is defined as Kij = Ki ∩ Kj, with a corresponding shared label set. Neighbor nodes can communicate with each other only when the set Kij securing the link (i, j) is not empty. Thus, the number of keys in the set Kij measures the link security strength. More keys mean more security, but this requires more calculation and resources.
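The key assignment model above can be illustrated with a minimal sketch. It assumes keys are labeled by integers and that the list of neighbor pairs is given; the function names `assign_keys`, `shared_keys`, and `secure_links` are hypothetical and do not come from the original protocol description.

```python
import random


def assign_keys(num_nodes, pool_size, keys_per_node, seed=0):
    """Assign each node i a random subset K_i of the key pool K.

    Keys are represented by integer labels 0..pool_size-1 (an assumption
    made only for this illustration).
    """
    rng = random.Random(seed)
    pool = range(pool_size)
    return {i: frozenset(rng.sample(pool, keys_per_node)) for i in range(num_nodes)}


def shared_keys(key_sets, i, j):
    """Return K_ij = K_i ∩ K_j."""
    return key_sets[i] & key_sets[j]


def secure_links(key_sets, neighbors):
    """Keep the neighbor pairs (i, j) whose shared key set K_ij is not empty.

    The returned dict maps each usable link to K_ij; len(K_ij) is the
    link security strength card(K_ij).
    """
    links = {}
    for i, j in neighbors:
        k_ij = shared_keys(key_sets, i, j)
        if k_ij:
            links[(i, j)] = k_ij
    return links
```

For example, `secure_links(assign_keys(5, 100, 10), [(0, 1), (1, 2)])` keeps only those of the two neighbor pairs that happen to share at least one key.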
In the routing model, S and D of sensor nodes set N are used to denote sensor source and destination nodes sets, respectively. The set of sensor source nodes and destination nodes pairs is denoted as Γ ⊆ S × D and is constructed on the basis of routing protocol. A message from source node s ∈ S to destination node d ∈ D will traverse one or more paths. The path a message traverses is decided by routing protocol. Every route path is composed of a set of sequential links (i, j) in which the key set Kij is not empty.
A route Rsd from source node s to destination node d may include multiple paths decided by the routing protocol. The path weight fπ is defined as the fraction of traffic in the route Rsd that traverses the path π. Three kinds of routing protocols are distinguished according to path multiplicity, that is, whether the message traverses dependent or independent paths. The first protocol yields routes consisting of a single path from source node to destination node; the second protocol yields routes consisting of multiple independent paths, where the message chooses only one of them for transmission; and the third protocol yields routes consisting of multiple dependent paths, where the message is fragmented into multiple packets, and these packets traverse different paths and are reassembled into the original message [13-16].
It is assumed that the adversary can capture any node in the network and extract the encryption keys from the memory of captured nodes in polynomial time and with constrained resources. As long as the total resource expenditure does not exceed a certain threshold, the network can be compromised. The key assignment model and network routing protocol are easily accessible to the adversary. The adversary also knows the route information (s, d) ∈ Γ and the key label set Li of each sensor node.
The adversary's goal is to use minimum resources to compromise the network. The compromised route set is defined as ΓA, which is a subset of the route set Γ of all source node and destination node pairs. The captured node set is defined as C, which is a subset of the sensor node set N. The adversary's goal is to achieve ΓA = Γ while minimizing the cardinality of C.
A flow network G = (V, E) is a directed graph in which each edge (u, v) has a nonnegative capacity c(u, v) ≥ 0. Two special vertices are distinguished in a flow network: a source s and a sink t. Here, the standard definition of network flow is given as follows. For a network flow $f = \{f_{ij}\}$, the flow satisfies the following three conditions:
$0 \le f_{ij} \le c_{ij}$, where $c_{ij}$ is the capacity of edge (i, j).
For each intermediate vertex i, $\sum_{j} f_{ij} - \sum_{j} f_{ji} = 0$, which means the flow into the vertex is equal to the flow out of the vertex.
For source vertex s and target vertex t, the flow satisfies the following balance condition:
$\sum_{j} f_{sj} - \sum_{j} f_{js} = v(f)$ and $\sum_{j} f_{tj} - \sum_{j} f_{jt} = -v(f)$, respectively. $f = \{f_{ij}\}$ is called a feasible flow, and v(f) is the flow value of this feasible flow f.
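The three feasibility conditions can be checked mechanically. The following is a minimal sketch, assuming the flow and the capacities are given as dictionaries keyed by directed edge; the function name `flow_value` is hypothetical.

```python
def flow_value(flow, capacity, s, t):
    """Return v(f) if `flow` is a feasible flow for `capacity`; raise ValueError otherwise.

    `capacity` maps each directed edge (i, j) to c_ij, and `flow` maps
    edges to f_ij (edges missing from `flow` carry zero flow).
    """
    # Condition 1: 0 <= f_ij <= c_ij on every edge, and no flow off the graph.
    for edge, f in flow.items():
        if edge not in capacity:
            raise ValueError(f"flow on unknown edge {edge}")
        if not 0 <= f <= capacity[edge]:
            raise ValueError(f"capacity constraint violated on {edge}")

    def net_out(v):
        """Flow leaving v minus flow entering v."""
        out = sum(f for (a, _), f in flow.items() if a == v)
        inn = sum(f for (_, b), f in flow.items() if b == v)
        return out - inn

    # Condition 2: conservation at every intermediate vertex.
    nodes = {v for edge in capacity for v in edge}
    for v in nodes - {s, t}:
        if net_out(v) != 0:
            raise ValueError(f"flow not conserved at {v}")

    # Condition 3: the source emits v(f) and the sink absorbs v(f).
    value = net_out(s)
    if net_out(t) != -value:
        raise ValueError("source and sink balance conditions disagree")
    return value
```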
3 GREEDY NODE CAPTURE ALGORITHM BASED ON ROUTE MINIMUM KEY SET
The adversary attacks the network by compromising the route Rsd from source node to destination node. To compromise the different routes, nodes are captured, and their keys are extracted. The captured nodes constitute the node capture set C, whose key set satisfies KC = ∪ i ∈ C Ki, where KC is the key set of C and Ki is the key set of node i.
A link is compromised if and only if the shared encryption key set securing the link is a subset of KC. A path is compromised when any link or node on the path is compromised. A route is compromised when all paths in the route are compromised.
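These three compromise conditions translate directly into set operations. The sketch below is illustrative only: it assumes a path is a list of nodes, a route is a list of paths, and `link_keys` maps each link (i, j) to its shared key set Kij; all function names are hypothetical.

```python
def link_compromised(k_ij, captured_keys):
    """A link is compromised iff its shared key set K_ij is a subset of K_C."""
    return k_ij <= captured_keys


def path_compromised(path, link_keys, captured_nodes, captured_keys):
    """A path is compromised when any link or node on it is compromised."""
    if any(node in captured_nodes for node in path):
        return True
    return any(
        link_compromised(link_keys[(u, v)], captured_keys)
        for u, v in zip(path, path[1:])
    )


def route_compromised(paths, link_keys, captured_nodes, captured_keys):
    """A route is compromised when all of its paths are compromised."""
    return all(
        path_compromised(p, link_keys, captured_nodes, captured_keys) for p in paths
    )
```

Note that, under these definitions, capturing a node that lies on none of a route's paths can still compromise the route if the extracted keys cover a link on every path.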
So, the adversary's goal is to compromise the traffic of all routes (s, d) ∈ Γ with the minimum number of nodes captured.
Node capture attack methods seek a node capture set C that yields enough keys to compromise the network. By capturing the nodes in a suitable sequence, the size of the node capture set C can be kept small. In this paper, the sensor network is mapped as a flow network, and the cardinality of each link key set is defined as the capacity of the corresponding edge in the flow network. The minimum key set is calculated according to the max flow of the equivalent flow network. Then, the route minimum key set is used to compromise the network with less resource expenditure.
A directed graph G = (V, E) can be used to represent a sensor network. For nodes i, j ∈ V, Kij is used to denote the shared key set between two neighbor nodes. The number of elements in the key set Kij is denoted by card(Kij). Then, each edge (i, j) in G is weighted as w(i, j) = card(Kij).
Now, the route from source node to destination node is denoted by a subgraph Gn = (Vn, En) of the sensor network graph. The edge weight w(i, j) is also the capacity of edge (i, j) in the network flow. We exploit a Ford–Fulkerson-based labeling method to calculate the minimum key set in the subgraph Gn. The labeling method includes a labeling procedure and an adjustment procedure. In the labeling procedure, an augmenting path is found through breadth-first search. The adjustment value is then obtained at the last searched node. In the adjustment procedure, the flow value of each edge on the augmenting path is updated by adding the adjustment value if the edge points toward the sink node or subtracting the adjustment value if the edge points toward the source node. The max flow is calculated by repeatedly finding augmenting paths. As shown in Figure 1(a), (s, a, b, t) is an augmenting path in the first labeling round. After the last round of Ford–Fulkerson, the flow of each edge can be calculated as shown in Figure 1(b).
When all the labeled nodes have been checked and no more nodes can be labeled, the labeling process comes to an end. The sum of the flows on the edges leaving the source node is the max flow value. As shown in Figure 1(b), the value of the max flow is 20. At the same time, the edges connecting the checked nodes and the unchecked nodes in the last labeling round are the edges that bound the network flow; that is, they form a minimum cut. As shown in Figure 1(c), edges (a, b), (a, c), and (s, d) bound the network flow, and their weights sum to exactly 20. It can be seen from Figure 1(c) that these edges separate the graph into two parts, which contain the source node and the destination node, respectively. This confirms that these edges are the bottleneck through which the flow must pass [18-20]. These edges are links with card(Kij) as weight. The key sets of these links constitute the route minimum key set from the source node s to the destination node t. In other words, the route minimum key set is the union of the key sets of all links in this bottleneck. In Figure 1(c), the route minimum key set is the union of Kab, Kac, and Kad. The network minimum key set Kmin is composed of the route minimum key sets of all route pairs in the network. Node c has the maximum overlapping value among all the sensor nodes. Therefore, it is captured, and the network is reconstructed as shown in Figure 1(d). Thus, one attack round comes to an end, and the process repeats until the network is compromised.
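The labeling and adjustment procedure can be sketched as a breadth-first Ford–Fulkerson (Edmonds–Karp) routine that also returns the bottleneck edges. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the example capacities are invented so that the bottleneck consists of edges (a, b), (a, c), and (s, d) with total weight 20; they are not read from Figure 1.

```python
from collections import deque


def max_flow_min_cut(capacity, s, t):
    """Return (max flow value, sorted bottleneck edges) for capacity[(u, v)] = card(K_uv)."""
    residual, adj = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def label():
        """Labeling procedure: BFS from s along edges with spare residual capacity."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    value = 0
    parent = label()
    while t in parent:
        # Adjustment value: the smallest residual capacity on the augmenting path.
        delta, v = float("inf"), t
        while parent[v] is not None:
            delta = min(delta, residual[(parent[v], v)])
            v = parent[v]
        # Adjustment procedure: push delta forward, credit it backward.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= delta
            residual[(v, u)] += delta
            v = u
        value += delta
        parent = label()

    labeled = set(parent)  # nodes still reachable from s in the last round
    cut = sorted(e for e in capacity if e[0] in labeled and e[1] not in labeled)
    return value, cut


def route_min_key_set(cut, link_keys):
    """Union of the key sets of the bottleneck links."""
    keys = set()
    for edge in cut:
        keys |= link_keys[edge]
    return keys


# Hypothetical capacities (illustration only).
example_capacity = {
    ("s", "a"): 15, ("s", "d"): 8,
    ("a", "b"): 6, ("a", "c"): 6,
    ("b", "t"): 10, ("c", "t"): 10, ("d", "t"): 10,
}
example_value, example_cut = max_flow_min_cut(example_capacity, "s", "t")
```

Because the bottleneck is taken from the nodes that remain labeled in the final round, the returned edge set is the minimum cut closest to the source.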
3.1 Node overlapping
To improve the efficiency of node capture attack, a GNRMK scheme is proposed. In GNRMK, a metric called NOV is assigned to each node. During the attack process, the node with maximal NOV is captured. NOV is calculated on the basis of route minimum key set.
The minimum key set of a sensor network is the union of the route minimum key sets, namely $K_{\min} = \bigcup_{r \in \Gamma} K_{\min}^{r}$, where $K_{\min}^{r}$ is the route minimum key set of route r. Nodes are captured in the sequence calculated by GNRMK and added to the node capture set C. The node with maximal NOV is the node whose keys overlap the minimum key set of the network the most, so capturing it compromises the network most quickly.
Let us define the overlapping value NOV of a node T as
The node fmax(T) with max cardinal number is the desired node. Then, greedy node capture algorithm based on route minimum key set is proposed on the basis of this formula. The algorithm procedure is as shown in Algorithm 1.
First, a network graph is constructed according to the sensor network and is taken as input of GNRMK algorithm. Then, the route minimum key set is calculated in route subgraph of the whole network to obtain NOV. The minimum key set is the input of the next step max node overlapping function fmax(n). For all nodes in the network except the destination nodes, which are strictly tamper-proof, the NOV is calculated, and the node with max NOV is selected as the target node compromised and is added to the node capture set C. The last step is to delete the already compromised paths because of the node capture set C, which means reconstructing the network to a new network NC.
This is not only the last step but also one of the most important steps. The algorithm traverses all the paths in the sensor network and deletes the paths that are compromised by the key set obtained from the node capture set C. In this way, a new sensor network structure is obtained that contains only the paths that are not yet compromised. The network is reconstructed every time a node is captured, so the complexity of the network decreases quickly. Thus, the number of attack rounds and the execution time can be reduced effectively.
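The capture and reconstruction loop described above can be sketched for the single path routing case. This is a simplified illustration under several assumptions, not Algorithm 1 itself: each route consists of one path, so its route minimum key set is taken to be the key set of its weakest link (the minimum cut of a single path is its lowest-capacity edge); the NOV is assumed to count the keys of a node that lie in Kmin and have not been captured yet; and all function names are hypothetical.

```python
def weakest_link_keys(path, link_keys):
    """For a single path, the bottleneck is the link with the fewest shared keys."""
    return min((link_keys[e] for e in zip(path, path[1:])), key=len)


def path_compromised(path, link_keys, captured_nodes, captured_keys):
    """A path falls when a node on it is captured or a link's keys are all known."""
    return any(
        u in captured_nodes or v in captured_nodes or link_keys[(u, v)] <= captured_keys
        for u, v in zip(path, path[1:])
    )


def greedy_capture(routes, link_keys, node_keys, protected):
    """Return the order in which nodes are captured.

    `routes` maps each (s, d) pair to its single path (a list of nodes).
    """
    remaining = dict(routes)
    captured_nodes, captured_keys = [], set()
    candidates = sorted(set(node_keys) - set(protected))
    while remaining and candidates:
        # Network minimum key set of the routes that are still intact.
        k_min = set()
        for path in remaining.values():
            k_min |= weakest_link_keys(path, link_keys)

        def nov(n):
            return len((node_keys[n] & k_min) - captured_keys)

        target = max(candidates, key=nov)
        if nov(target) == 0:
            break  # no capturable node contributes further keys
        candidates.remove(target)
        captured_nodes.append(target)
        captured_keys |= node_keys[target]
        # Reconstruct the network: drop every path that is now compromised.
        remaining = {
            r: p for r, p in remaining.items()
            if not path_compromised(p, link_keys, captured_nodes, captured_keys)
        }
    return captured_nodes
```

Dropping compromised paths after every capture shrinks the set of routes that the next round has to examine, which is the effect the reconstruction step is meant to achieve.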
Figure 2 illustrates the node capture process with GNRMK. The process is divided into three stages: the initialization stage, the node capture stage, and the network reconstruction and ending stage.
4 SIMULATION ANALYSIS
To verify the performance of GNRMK, simulation experiments have been carried out. A key predistribution protocol is used to assign keys to individual nodes. In the experiments, source nodes and intermediate nodes are assumed to be low-level sensor nodes in the WSN that can be compromised. Destination nodes or sink nodes are high-level nodes that are made tamper-proof by hardware, so these nodes cannot be compromised and are not considered in the simulation. To compare the performance of GNRMK with other node capture methods, six kinds of node capture attack methods are simulated in the experiments:
Random node captured (RNC). This is the simplest node capture method, which is used as the baseline for comparing the performance of the different strategies.
Maximum compromised key (MCK). The node with the maximum |Ki \ KC| is captured. Here, |Ki \ KC| denotes the number of elements of set Ki that are not in the set KC, and Ki and KC are the key set of node i and the compromised key set, respectively. This strategy uses information leaked from the key assignment protocol.
Maximum compromised link (MCL). The node with the key set that can be used to compromise the maximum number of links is captured. This strategy clearly uses the information leaked from key assignment protocol.
Maximum compromised traffic (MCT). The node with the maximum traffic in the route is captured. This strategy uses both the key assignment protocol and route information.
Greedy node captured approximation using vulnerability evaluation (GNAVE). The node that makes the network more vulnerable is captured. This strategy uses both the key assignment protocol and route information.
Greedy node captured based on route minimum key set (GNRMK). The node that can make the network quicker to be compromised is captured. GNRMK uses information leaked from key assignment protocol and route information. In addition, we also make use of the network topology and dynamically change the network topology.
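The three simplest baselines in this list can be written as one-step selection rules. The sketch below is illustrative: the function names are hypothetical, ties are broken by node label order (an assumption), and MCL is read as counting the links whose key sets would be fully covered after the capture.

```python
import random


def rnc_pick(candidates, rng=random):
    """RNC: choose a capturable node uniformly at random."""
    return rng.choice(sorted(candidates))


def mck_pick(candidates, node_keys, captured_keys):
    """MCK: choose the node with the most keys not yet compromised, |K_i \\ K_C|."""
    return max(sorted(candidates), key=lambda i: len(node_keys[i] - captured_keys))


def mcl_pick(candidates, node_keys, captured_keys, link_keys):
    """MCL: choose the node whose keys compromise the largest number of links."""
    def compromised_links(i):
        known = captured_keys | node_keys[i]
        return sum(1 for k_ij in link_keys.values() if k_ij <= known)
    return max(sorted(candidates), key=compromised_links)
```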
4.1 Simulation parameters
Table 2 gives the simulation parameters used in the experiments. The simulation is performed for a network N of 400 nodes with the number of keys per node ranging from 15 to 50 in increments of 5. The subsets S and D are randomly selected such that |S| = 80 and |D| = 100. Six node capture attacks are simulated for three kinds of routing protocols: single path, multiple independent path, and multiple dependent path routing. We describe the application scenario of the GNRMK algorithm in Figure 3.
Table 2. Simulation parameters.
The number of sensor nodes: 400
The number of keys in every sensor node: 15, 20, 25, 30, 35, 40, 45, 50
The number of source nodes: 80
The number of destination nodes: 100
The range of abscissa for sensor nodes
The range of ordinate for sensor nodes
Radio transmission range for sensor nodes
The key pool for sensor nodes
The number of keys in each node is an important metric in evaluating the algorithm performance, so different sensor networks with different number of keys of sensor nodes are constructed in the simulation model: 15, 20, 25, 30, 35, 40, 45, and 50 keys. To ensure the accuracy of algorithms and prevent accidental error from affecting the result, experiments are carried out five times for the sensor network with the same number of keys, and the average value is taken as the final result.
Figure 4 presents the fraction of traffic compromised as the number of captured nodes increases, with traffic routed through multiple independent paths, multiple dependent paths, and a single path. For the same number of captured nodes, the fraction achieved by RNC, MCK, and MCL is relatively low compared with the other three methods. Therefore, the RNC, MCK, and MCL methods are omitted from Figure 4. The simulation uses a network of 400 sensor nodes with each node randomly assigned 50 keys. In multiple path routing, the fraction of traffic compromised by MCT is initially the largest because the node selected by MCT is the node that most paths traverse. Later, when the number of captured nodes exceeds six, as shown in Figure 4(a) and (b), the attack slows down because MCT does not take into account the effect of the captured node on other links or paths. In single path routing, as shown in Figure 4(c), the route of each source-destination pair contains only one path, so the number of nodes that are traversed by multiple paths declines quickly. Therefore, the advantage of MCT is less obvious in single path routing. It can be seen from Figure 4(a) that the curves of GNAVE and GNRMK partly coincide when the abscissa is 2. A similar situation occurs in Figure 4(b) when the abscissa is 2 and 4. This means that the fraction of traffic compromised is the same at these points. In principle, different nodes may hold key sets that compromise the same number of routes, but this rarely happens because the probability of two nodes having the same key set is relatively low; in fact, the two methods select the same node. The simulation results confirm that, in the early stages of the attack, the node selected by GNAVE and GNRMK is the same. It can be seen from Figure 4 that GNRMK reaches a compromised traffic fraction of 100% 10% more quickly than GNAVE.
It can be concluded that multiple path routing is more resistant to node capture attacks than single path routing. As the attack proceeds, GNRMK always chooses a node that affects more of the other links than the node chosen by GNAVE, so the fraction of compromised traffic under GNRMK is higher than that under GNAVE until the attack is over. The MCT method always chooses the node that compromises the most heavily traversed paths. As the number of captured nodes increases, the disadvantage of MCT begins to show because this method does not take the whole network into consideration, and the attack slows down with a flatter curve. To sum up, GNRMK is on average 10% quicker to compromise the network than GNAVE and substantially outperforms MCT.
Figure 5 presents the number of captured nodes needed to compromise the whole network with traffic routed through multiple independent paths, multiple dependent paths, and a single path, respectively. The experiment is carried out for different numbers of keys per node, ranging from 15 to 50. The RNC, MCK, and MCL methods need a relatively large number of captured nodes to compromise the network because they either choose nodes randomly or do not use the information from routing protocols. For GNAVE and GNRMK, the number of captured nodes basically increases monotonically. Because MCT does not consider the whole network or its route information, some ineffective ("dirty") nodes may be chosen. These nodes contribute little to the compromise of the network, so, as shown in Figure 5(a) and (b), when the number of keys in each node is 20, the MCT method needs a relatively large number of nodes. GNAVE and GNRMK both need only a few nodes to compromise the network because they both take the key assignment protocol and routing protocol into consideration. However, it can be seen from Figure 5 that GNRMK needs 10% fewer nodes than GNAVE because GNRMK also considers the network topology and dynamically updates it. As the number of keys in each node increases, the number of nodes needed to compromise the network increases slowly because the relative relationship among nodes and links changes only slightly. Nevertheless, the complexity is much higher, and more time is needed to select these nodes. To sum up, GNRMK outperforms the other five methods and achieves a 10% performance improvement over GNAVE.
5 CONCLUSIONS
In this paper, a greedy node capture algorithm based on route minimum key set (GNRMK) has been proposed. The WSN is mapped as a flow network. First, the route minimum key sets of the sensor network are calculated through the max flow of the corresponding flow network. In this way, the whole network topology is utilized, in addition to the information from routing and key assignment protocols, to accelerate the compromise of the network. A node metric called NOV is calculated on the basis of the route minimum key sets to measure the damage that a node's capture causes to the whole network. The simulation results indicate that an adversary can significantly decrease the number of captured nodes needed to compromise the whole network by using GNRMK. How to further reduce the number of captured nodes remains our future work. The primary contributions of this paper are summarized as follows.
The flow network model is explored to effectively improve the performance of our algorithm.
Through the maximal flow of corresponding flow network, information from the whole network topology is exploited to facilitate the adversary attacking the network besides information from routing and key assignment protocols.
A NOV metric is proposed to represent the node's effect to the security of the sensor network. The node with maximum overlapping value is captured. By using this metric, GNRMK obtains a better performance compared with previously proposed strategies.
This work was supported in part by the National Natural Science Foundation of China under Grant No. 61173179 and the Fundamental Research Funds for the Central Universities.