In recent years, the integration of computing with physical systems under monitoring and control has drawn considerable attention. The cyber-physical system (CPS) is one of the most promising technologies in this research area because of its wide applications in areas such as transportation, power grids, and robotic systems [1-3].
With the rapid development of intelligent transportation systems, the cyber-physical surveillance system for transportation (CPSST) has become a typical and important application of CPS, in which a large number of video cameras are deployed for real-time surveillance and traffic analysis [4]. Compared with other monitoring methods, a surveillance system based on video cameras is believed to be cost effective because of its real-time operation and large field of view [5].
In such CPSSTs, the video cameras and other sensing and control devices form the physical world, while the wireless communication network connects the physical world with the computing systems, because in most cases the devices must exchange information but are not installed at the same place [6]. In a CPSST, video streaming to a centralized location is always required. To reduce infrastructure costs and enhance flexibility, radio communication technology is widely adopted for video transmission.
However, radio spectrum, one of the most important resources for wireless communications, is becoming scarce and therefore very valuable [7]. To utilize the spectrum resource efficiently, dynamic spectrum management is one of the solutions for CPSSTs, allowing optimal spectrum subband selection and co-existence with other wireless devices operating in the same spectrum bands [8]. Although dynamic spectrum management improves security by using different subbands in different time slots, there is still a probability of being attacked by malicious devices because of the inherent vulnerability of wireless communications [9]. Thus, security still merits further effort. In addition, in video surveillance systems, application layer quality of service (QoS), such as distortion, strongly affects the performance of the subsequent video processing and is itself strongly affected by the noise and interference in the spectrum subband. It is therefore another key feature to be considered when designing the system.
A lot of work has been done on CPSs and their applications in transportation systems [3-6, 10-18]. In [3], the authors introduce real-time management methods for adaptively controlling the electric loads in CPS, with the objective of balancing electric power utilization to obtain an optimal power load. The proposed scheme also considers the limitations of the practical process and of electric load control. Multicast routing is introduced in [16] for the system design of distributed controllers and sensors, and the authors apply the proposed framework to the smart grid to manage power dynamically. Taking into account the expectation of reliability in CPS and the time-varying power loads, a semantics-aware communication algorithm is proposed with the goal of achieving reliability. It introduces a feedback controller into the publishers and a predictor into the subscribers. Queuing and control theories are also adopted in the algorithm [6]. The work of [17] focuses on the databases in CPSST and proposes a mutual operation multi-instance learning structure. Besides, for transportation surveillance architecture, a method based on tracking has been introduced to cope with the problem of timely parts of lesser tracking concern. This method is based on the H.264 video compression scheme and improves the performance of quantization of spectrum coefficients [4]. For wireless surveillance CPS with low-capability video cameras in an infrastructureless distributed wireless surveillance system with a large number of nodes, an approach considering the overall optimization has been proposed in [5], which takes the device utilization upper bound into consideration to optimize the real-time operation of the system.
Even when the central processing unit workloads change dramatically at runtime because of random sensing results, the system performance can be improved by jointly considering data fusion and feedback control. For radio resource management in CPS with QoS assurance, the authors in [18] introduce cognitive radio technology into the system to gather radio resource utilization information for radio resource management in self-governing nodes.
Most existing work on CPSST focuses on the system architecture, routing protocols, coding schemes, and timeliness and reliability performance, without considering the application layer QoS and security problems jointly with dynamic spectrum utilization. In this paper, we propose a novel dynamic spectrum management scheme for CPSST that takes into account the combined optimization of application layer QoS and security. The scheme is formulated as a restless bandit problem, which is a development of the multi-armed bandit problem [19, 20]. The restless bandit model, as a type of stochastic control system, has been widely applied in many areas, including robot control, environment detection, financial investment, and cognitive radio communications, among others [21-24]. The main contributions of this paper are as follows.
We adopt QoS-centric design in the proposed scheme. Due to the importance of video distortion to transportation surveillance systems, it is considered as the application layer QoS metric to be optimized in the system. Video distortion formulation based on the channel state information (CSI) is also presented in this paper.
Interference avoidance is considered in the proposed scheme to achieve optimized application layer QoS. The dynamic utilization of spectrum subband for transmission is partly based on the CSI and allows the system to share the spectrum band with other co-existing wireless systems.
For the security issue, the proposed scheme minimizes the chance of being attacked by introducing the concepts of security level and security cost, both of which are functions of the attacking probability when choosing the subband for transmission.
The system optimization is based on the channel state estimation of the previous time slot and historic information. Considering that channel estimation in communication systems is not always up-to-date and accurate, this is more practical than assuming that the current channel estimation results are perfectly available.
The rest of the paper is organized as follows. In Section 2, the QoS-aware and security-aware dynamic spectrum management for CPSST is described, and the system model is presented. The proposed scheme is then formulated as a restless bandit problem in Section 3. We describe the operation process of the dynamic spectrum management scheme in Section 4 and provide extensive simulation results in Section 5 to demonstrate the performance improvement compared with an existing scheme. Section 6 concludes this paper.
2 System Model
The cyber-physical surveillance system can be divided into a physical component, incorporating the video cameras/video image sensors and the monitoring/control/computing center, and a cyber component, incorporating the wireless communication system. The objective of the system design discussed in this paper is to improve its QoS and security performance.
2.1 System description
In the CPSST considered in this paper, we assume that there are multiple video cameras or video image sensors equipped with wireless communication transceivers, multiple controllers, and a monitoring and computing center, all of which are connected by wireless links.
Figure 1 shows the architecture of the whole system. In this paper, the cameras or sensors with transceivers are called the terminals, and the monitoring and computing center is called the central point. The wireless communication channel is divided into subbands, some of which are selected as the transmission subbands for the communication between the terminals and the central point. The central point is capable of estimating the CSI and deciding the subband utilization. Coexisting with other wireless communication networks, CPSST selects the most suitable subbands for video transmission. The subband decision is made in the central point, and the video is encoded in the terminals with the parameters set by the central point.
We take the security problem into account by considering possible peeping or attacking by malicious devices in some of the subbands. It is assumed that the malicious devices have limited capability and cannot peep or attack all the subbands at the same time. Instead, they choose to peep or attack a number of subbands based on their observation of the subband utilization. That is, malicious devices peep or attack a subband that has been frequently used by the surveillance system with a larger probability, and vice versa. Consequently, in CPSST, the security of the system is based on the security of the subbands, which can be optimized by carefully designing the subband utilization scheme.
Besides, application layer QoS plays an important role in the performance of the whole surveillance system. For video streaming, distortion is the most important application layer QoS metric. Thus, in this paper, we adopt distortion as one of the optimization objectives for managing the subbands. The decision is based on the CSI of each subband, which allows the system to share the spectrum band with other wireless systems. The other advantage of dynamic spectrum utilization is its inherent frequency-hopping nature, which significantly increases the difficulty of peeping or attacking by malicious devices.
2.2 Physical component model
We assume that there are N_{t} terminals in the system, collecting live videos and transmitting them to the central point. Recent advanced coding schemes such as H.264 and MPEG are adopted in the terminals. A rate control algorithm is used in these coding schemes to adaptively control the video encoder bit rate and improve the error performance [25]. Intra-refreshing of macroblocks (MBs) is believed to be one of the key solutions for rate control and error correction. Previous frames of an MB may not have been received successfully because of the instability of the propagation environment. Fortunately, an intra-coded MB can be decoded without information from previous frames. Thanks to this characteristic, intra-coded MBs form an efficient error protection method; with inter-coding instead of intra-coding, errors in a previous frame may severely affect the current frame.
In CPSST, different channel subbands with different channel states provide different data rates for the video transmission. The authors in [25] discuss in depth the formulation of video distortion, considering the different and changing features of the video streams, the bit rate, the intra-refreshing rate, and the coding scheme. We adopt this rate-distortion model in this paper, assuming the video distortion to be a combination of the source distortion, which is the quantization distortion caused by the video encoder in the transmitter to reduce the data rate, and the channel distortion, which is the distortion introduced by packet loss during transmission.
The source distortion, as part of the total distortion, is given by
D_{s}(H_{s},ξ) = D_{s}(H_{s},0) + ξ(1 − η + ηξ)[D_{s}(H_{s},1) − D_{s}(H_{s},0)]    (1)
where ξ is the intra-refreshing rate, H_{s} is the source coding rate, and η is a constant based on the multimedia sequence. D_{s}(H_{s},0) and D_{s}(H_{s},1) in this equation are the time average source distortion over all inter-mode and intra-mode selections, respectively, for all frames.
where Y_{k} is the number of inter/intra frames at time slot t_{k}, 1 ≤ k ≤ K, and K is the total number of time slots.
On the basis of the rate-distortion model in [25], we can write the average channel distortion as
D_{c}(ψ,ξ) = \frac{Ω_{1}}{1 − Ω_{2} + Ω_{2}ξ} \cdot \frac{ψ}{1 − ψ} \cdot E[F_{d}(y,y − 1)]    (2)
where Ω_{1} is the energy loss ratio of the encoder filter, Ω_{2} is a constant depending on the motion randomness of the video data, ψ is the packet loss rate, and E[F_{d}(y,y − 1)] is the average value of the frame difference F_{d}(y,y − 1) over the time slots.
Then the total distortion can be formulated as D(H_{s},ψ,ξ) = D_{s}(H_{s},ξ) + D_{c}(ψ,ξ). Thus, to minimize the total distortion, the optimal ξ^{ * } is given by
ξ^{ * } = \arg\min_{ξ} D(H_{s},ψ,ξ)    (3)
Because of the time-varying nature of wireless subbands, an adaptive intra-refreshing rate ξ is adopted in this paper to minimize the total video distortion. Decreasing ξ reduces the source distortion D_{s} in (1), but a smaller intra-refreshing rate also weakens the protection against error propagation and thus increases the channel distortion D_{c} in (2). Consequently, there is a tradeoff, and the optimal intra-refreshing rate ξ^{ * } that minimizes the total distortion can be found according to (3).
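The tradeoff in (1) to (3) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it evaluates the source and channel distortion and finds ξ^{ * } by a plain grid search, which is one of several possible ways to solve (3). The function names, the grid resolution, and the example values for D_{s}(H_{s},0), D_{s}(H_{s},1), η, and ψ are illustrative assumptions; Ω_{1}, Ω_{2}, and E[F_{d}(y,y − 1)] follow the values quoted in the simulation section, and the lower bound ξ = 0.07 follows Section 2.4.

```python
def source_distortion(ds_inter, ds_intra, xi, eta):
    """Equation (1): D_s(H_s, xi) for a fixed source coding rate H_s.

    ds_inter = D_s(H_s, 0), ds_intra = D_s(H_s, 1).
    """
    return ds_inter + xi * (1.0 - eta + eta * xi) * (ds_intra - ds_inter)


def channel_distortion(psi, xi, omega1, omega2, mean_frame_diff):
    """Equation (2): average channel distortion D_c(psi, xi)."""
    return (omega1 / (1.0 - omega2 + omega2 * xi)
            * psi / (1.0 - psi) * mean_frame_diff)


def total_distortion(ds_inter, ds_intra, eta, psi, omega1, omega2,
                     mean_frame_diff, xi):
    """D(H_s, psi, xi) = D_s(H_s, xi) + D_c(psi, xi)."""
    return (source_distortion(ds_inter, ds_intra, xi, eta)
            + channel_distortion(psi, xi, omega1, omega2, mean_frame_diff))


def optimal_intra_refresh(ds_inter, ds_intra, eta, psi, omega1, omega2,
                          mean_frame_diff, xi_min=0.07, steps=1000):
    """Equation (3) solved by grid search over xi in [xi_min, 1].

    The grid search and its resolution are assumptions of this sketch.
    """
    best_xi, best_d = xi_min, float("inf")
    for j in range(steps + 1):
        xi = xi_min + (1.0 - xi_min) * j / steps
        d = total_distortion(ds_inter, ds_intra, eta, psi, omega1, omega2,
                             mean_frame_diff, xi)
        if d < best_d:
            best_xi, best_d = xi, d
    return best_xi, best_d


# Illustrative values only (not taken from the paper's experiments).
xi_star, d_star = optimal_intra_refresh(
    ds_inter=20.0, ds_intra=22.0, eta=0.5, psi=0.2,
    omega1=0.001, omega2=1.0, mean_frame_diff=100.0)
```

A closed-form or gradient-based solution could replace the grid search; the grid is used here only because it makes the tradeoff easy to inspect.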
2.3 Network and communication model
All the N_{t} terminals are assumed to be equipped with wireless transceivers with dynamic spectrum utilization capability, and the central point's wireless communication device performs spectrum sensing to obtain the CSI estimate. Orthogonal frequency division multiplexing (OFDM) is assumed to be the physical layer modulation technology used for video streaming. An OFDM signal consists of multiple orthogonal, i.e., mutually interference-free, subcarriers, so different terminals can use different subcarriers for data transmission. Owing to the flexibility of OFDM, the total available spectrum band can be divided into subbands, each of which contains one or several OFDM subcarriers [26]. We assume the number of all available subbands to be N_{b}, and the set of all available subbands to be {b_{i}}, where 1 ≤ i ≤ N_{b} and N_{b} ≥ N_{t}.
From the perspective of the central point as the receiver, the noise and interference power level on each spectrum subband is modeled as a stochastic process following the Rayleigh distribution, with the probability density function
f(x) = \frac{x}{σ^{2}} e^{−x^{2} ∕ (2σ^{2})}    (4)
and the cumulative distribution function
F(x) = 1 − e^{−x^{2} ∕ (2σ^{2})}    (5)
Then the variance and mean of this Rayleigh random variable can be expressed as (4 − π)σ^{2} ∕ 2 and σ\sqrt{π ∕ 2}, respectively. Furthermore, this spectrum subband state changing process is a stationary stochastic process, thus the distribution of a realization of the spectrum subband state over the time line is exactly the same as the distribution of the random subband state at a time point. For simplicity, we assume that the realization of the subband state at one time point is the mean of the random subband state at the next time point, i.e., when the random subband state realization is X_{k} at time point t_{k}, the probability density function and the cumulative distribution function of the spectrum subband state at time point t_{k + 1} are
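f(x_{k + 1},X_{k}) = \frac{x_{k + 1}}{(X_{k}\sqrt{2 ∕ π})^{2}} e^{−x_{k + 1}^{2} ∕ (2(X_{k}\sqrt{2 ∕ π})^{2})}    (6)
and
F(x_{k + 1},X_{k}) = 1 − e^{−x_{k + 1}^{2} ∕ (2(X_{k}\sqrt{2 ∕ π})^{2})}    (7)
respectively, where X_{k} is the realization of the subband state at time point t_{k}.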
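Equations (6) and (7) follow from setting the Rayleigh mean σ\sqrt{π ∕ 2} equal to X_{k}, which gives σ = X_{k}\sqrt{2 ∕ π}. A minimal sketch of this step is shown below; the function names are illustrative and not part of the original model.

```python
import math


def rayleigh_sigma_from_mean(mean):
    """Invert mean = sigma * sqrt(pi / 2) to get the Rayleigh parameter."""
    return mean * math.sqrt(2.0 / math.pi)


def next_state_pdf(x_next, x_k):
    """Equation (6): density of the state at t_{k+1} given realization x_k."""
    sigma = rayleigh_sigma_from_mean(x_k)
    return x_next / sigma ** 2 * math.exp(-x_next ** 2 / (2.0 * sigma ** 2))


def next_state_cdf(x_next, x_k):
    """Equation (7): distribution function of the state at t_{k+1}."""
    sigma = rayleigh_sigma_from_mean(x_k)
    return 1.0 - math.exp(-x_next ** 2 / (2.0 * sigma ** 2))


def interval_probability(lower, upper, x_k):
    """Probability that the next state falls in [lower, upper)."""
    return next_state_cdf(upper, x_k) - next_state_cdf(lower, x_k)
```

Under this model, the probability that the next state lies below its own mean X_{k} is 1 − e^{−π ∕ 4}, independent of X_{k}.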
Due to the vulnerability nature of wireless communication, we also take into account the security of the system, by introducing the security cost d_{s}(i) of subband b_{i}
d_{s}(i) = \frac{N_{USED}(i)}{N_{TOTAL}} × P_{ATTACK}(i)    (8)
where N_{TOTAL} is the total number of time slots considered, N_{USED}(i) is the number of time slots in which subband b_{i} is used during the most recent N_{TOTAL} time slots, and P_{ATTACK}(i) is the probability of subband b_{i} being attacked or peeped [9]. With this definition, 0 ≤ d_{s}(i) ≤ 1, and d_{s}(i) is the probability that subband b_{i} is used in the current time slot while it is chosen to be peeped or attacked by the malicious devices. A larger d_{s}(i) means that the probability of subband b_{i} being attacked or peeped is high, and thus the subband is less secure. As mentioned previously, the malicious devices peep or attack a number of subbands based on their observation of the subband utilization. For simplicity, we assume that
P_{ATTACK}(i) = \frac{N_{USED}(i)}{N_{TOTAL}}    (9)
Thus d_{s}(i) = P_{ATTACK}^{2}(i).
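As a small worked example of (8) and (9), the sketch below computes the attack probability and the security cost of one subband from its recent usage record. Representing the history as a list of 0/1 actions, and counting slots before the start of the record as unused, are assumptions of this sketch.

```python
def attack_probability(usage_history, n_total):
    """Equation (9): P_ATTACK(i) = N_USED(i) / N_TOTAL.

    usage_history is a list of 0/1 actions for one subband, oldest first.
    Only the most recent n_total slots are counted.
    """
    n_used = sum(usage_history[-n_total:])
    return n_used / n_total


def security_cost(usage_history, n_total):
    """Equation (8): d_s(i) = N_USED(i) / N_TOTAL * P_ATTACK(i)."""
    n_used = sum(usage_history[-n_total:])
    return (n_used / n_total) * attack_probability(usage_history, n_total)


# A subband used in 3 of the last 5 slots: P_ATTACK = 0.6 and d_s = 0.36.
history = [1, 0, 1, 1, 0]
p_attack = attack_probability(history, n_total=5)
d_s = security_cost(history, n_total=5)
```

The quadratic growth of d_{s}(i) in the usage ratio is what penalizes staying on the same subband for many consecutive slots.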
2.4 Optimization objective
The surveillance system is designed to improve the QoS and security, which is interpreted in this paper as minimizing the video streaming distortion and maximizing the security degree. We define the cost of subband b_{i} to be
where D ′ is the distortion measured when the intra-refreshing rate ξ = 0.07, because in practical video systems ξ is always larger than 0.07, and D ′ > D(ξ) for practical ξ. As defined previously, d_{s}(i) is the security cost of subband b_{i}, which is also the probability of being attacked or peeped. We use the security degree to represent the security level of the subbands, which is defined as 1 − d_{s}(i).
Besides, in Equation (10), a_{1} and a_{2} are constant coefficients, a_{1},a_{2} ≥ 0, which are introduced to adjust the weights of the two factors in our cost definition. A larger a_{1} means that we place more emphasis on video distortion in the subband cost, and vice versa. Consequently, the values of the two coefficients can be determined according to the user preference. An example of the coefficient values is given in the simulation section of this paper. Thus, when the subband is not used, its cost is always higher than when it is in use. The system optimization objective is then to minimize the total cost of all subbands along the time line.
3 Restless Bandit Formulation
In this section, the dynamic spectrum subband management problem is formulated as a restless bandit system, whose indexability feature reduces the optimization problem to simply selecting the subbands with the smallest indices.
3.1 Time slots and actions
During the system operation, the whole time period considered in the optimization is divided into time slots of equal length. We denote the time slots by t_{1}, t_{2}, …, t_{k}, …, t_{K}. The interval between two consecutive time points, t_{k} − t_{k − 1}, is a constant determined by the processing capacity of the central point. A central point with more processing power may choose a smaller t_{k} − t_{k − 1}, which improves the real-time performance of the proposed scheme at the cost of larger computational complexity.
At the beginning of each time slot, the central point decides the spectrum subband utilization policy according to the current and historical subband CSI and security state information, to optimize the QoS and security performance of the system. We denote the action on subband b_{i} at time slot t_{k} by a_{k}(i) ∈ A = {0,1}, where A is the action space. a_{k}(i) = 1 means that the subband is selected for transmission, and a_{k}(i) = 0 means that it is not selected in this time slot. In each time slot, the actions on the subbands always satisfy
\sum_{i=1}^{N_{b}} a_{k}(i) = N_{t}    (11)
3.2 System state space
We represent s_{i}^{k} as the state of subband b_{i} at time slot t_{k}, s_{i}^{k} ∈ S, where S is the set of all available subband states. s_{i}^{k} is a two-dimensional discrete random variable composed of the security state s_{SEC,i}^{k} and the subband channel state s_{CH,i}^{k}, with s_{SEC,i}^{k} ∈ S_{SEC} and s_{CH,i}^{k} ∈ S_{CH}, where S_{SEC} and S_{CH} are the sets of all available security states and subband channel states, respectively.
s_{i}^{k} = {s_{SEC,i}^{k}, s_{CH,i}^{k}}_{N_{SEC} × N_{CH}}    (12)
where N_{SEC} and N_{CH} are the sizes of S_{SEC} and S_{CH}, respectively.
3.2.1 Security state
For subband b_{i}, we use discrete security state s_{SEC,i}^{k} to describe the possibility of being peeped or attacked by malicious devices. For simplicity, we use the probability of being peeped or attacked P_{ATTACK}(i) as the security state.
s_{SEC,i}^{k} = e_{m}, if P_{ATTACK}(i) = \frac{m − 1}{N_{TOTAL}}    (13)
for all 1 ≤ m ≤ N_{TOTAL} + 1. With this definition of security state, N_{SEC} = N_{TOTAL} + 1. Thus the security state of the subband can be formulated as a first order Markov chain, and its one-step state transition probability will be discussed in Section 3.3.
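Combining (9) and (13), the security state index is simply m = N_{USED}(i) + 1. The short sketch below makes this mapping explicit; the function name is an illustrative choice.

```python
def security_state_index(n_used, n_total):
    """Return m such that the security state is e_m, per Equation (13).

    With P_ATTACK = N_USED / N_TOTAL from Equation (9), the condition
    P_ATTACK = (m - 1) / N_TOTAL gives m = N_USED + 1.
    """
    if not 0 <= n_used <= n_total:
        raise ValueError("n_used must lie between 0 and n_total")
    return n_used + 1


# With N_TOTAL = 3 there are N_SEC = N_TOTAL + 1 = 4 security states.
states = [security_state_index(n, 3) for n in range(4)]
```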
3.2.2 Subband channel state
In this paper, similar to the security state model, the channel state of spectrum subband b_{i} is modeled as a discrete-time first-order Markov chain {s_{CH,i}^{k}}, where k denotes the time instant.
We consider N_{CH} discrete channel state levels to represent the different spectrum subband states detected by the central point's channel estimation functions. We convert the continuous spectrum subband states discussed in Section 2.3 into discrete ones with thresholds ε_{CH}(m), 1 ≤ m ≤ N_{CH}, i.e.,
where x_{k}(i) is the spectrum subband noise and interference power level, which has been discussed in Section 2.3.
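One possible way to implement such a threshold-based quantizer is sketched below. The exact boundary convention is an assumption here: the sketch treats ε_{CH}(m) as the upper edge of state c_{m}, with ε_{CH}(N_{CH}) taken as infinity, and the threshold values are purely illustrative.

```python
import bisect


def channel_state(x, thresholds):
    """Map a noise-and-interference level x to a state index m (1-based).

    thresholds must be sorted in ascending order; thresholds[m - 1] is
    assumed to be the upper edge of state c_m (a convention chosen for
    this sketch). Values at or above the last threshold are clamped to
    the last state.
    """
    m = bisect.bisect_right(thresholds, x) + 1
    return min(m, len(thresholds))


# Three channel states (N_CH = 3) with illustrative thresholds.
example_thresholds = [0.05, 0.10, float("inf")]
example_states = [channel_state(x, example_thresholds)
                  for x in (0.01, 0.05, 0.07, 0.5)]
```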
3.3 State transition probabilities
The security state s_{SEC,i}^{k} is modeled as an N_{SEC}-state Markov chain, with the one-step state transition probability matrix depending on the action at t_{k}. Thus, we have the state transition probability matrix when subband b_{i} is selected in the time slot, P_{SEC}^{1} = {P_{SEC}^{1}(m,n)}, and the state transition probability matrix when subband b_{i} is not selected in the time slot, P_{SEC}^{0} = {P_{SEC}^{0}(m,n)}, 1 ≤ m,n ≤ N_{SEC}, where
We also model the discrete spectrum subband channel state as an N_{CH}-state Markov chain, with the one-step state transition probability matrix P_{CH} = {P_{CH}(m,n)}, 1 ≤ m,n ≤ N_{CH}, where
P_{CH}(m,n) = P(s_{CH,i}^{k + 1} = c_{n} | s_{CH,i}^{k} = c_{m})    (17)
According to our discussion in Section 2.3, {P_{CH}(m,n)} can be expressed as
Then the system state transition probability matrices P_{STAT}^{1} and P_{STAT}^{0} are combinations of P_{SEC}^{1}, P_{SEC}^{0}, and P_{CH}, whose elements P_{STAT}^{1}(e_{m′},e_{n′},c_{m},c_{n}) and P_{STAT}^{0}(e_{m′},e_{n′},c_{m},c_{n}) are
due to the fact that the security and channel states are independent random variables.
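Because the security and channel states are independent, the probability of a joint transition is the product of the two marginal transition probabilities, which corresponds to a Kronecker product of the two matrices. The following sketch shows this construction for one action; the matrix values are toy numbers chosen for illustration, not the transition probabilities derived in this paper.

```python
def joint_transition(p_sec, p_ch):
    """Build the joint transition matrix of two independent Markov chains.

    The joint state (security m', channel m) is flattened to the row index
    m' * N_CH + m (0-based), so that
    joint[(m', m)][(n', n)] = p_sec[m'][n'] * p_ch[m][n].
    """
    n_ch = len(p_ch)
    size = len(p_sec) * n_ch
    joint = [[0.0] * size for _ in range(size)]
    for a, sec_row in enumerate(p_sec):
        for b, sec_prob in enumerate(sec_row):
            for c, ch_row in enumerate(p_ch):
                for d, ch_prob in enumerate(ch_row):
                    joint[a * n_ch + c][b * n_ch + d] = sec_prob * ch_prob
    return joint


# Toy 2-state matrices (illustrative values only).
toy_p_sec = [[0.5, 0.5], [0.0, 1.0]]
toy_p_ch = [[0.9, 0.1], [0.2, 0.8]]
toy_joint = joint_transition(toy_p_sec, toy_p_ch)
```

Each row of the resulting matrix still sums to one, since both factors are row-stochastic.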
3.4 System reward and policy
As defined in Section 2.4, the objective of the system optimization is to minimize the subband cost C_{i} over all b_{i} along the whole time line. Thus, in the restless bandit formulation, we define the system reward R as
R = \sum_{k=1}^{K} β^{K − k} \sum_{i=1}^{N_{b}} (−C_{i}(k))    (20)
where β is the discount factor, 0 ≤ β ≤ 1. R is also called the long-term total discounted reward, which is the sum of the rewards from all time slots weighted by the discount factor β. β is introduced to reflect that the receivers are more interested in the current reward and the reward in the near past, while the reward from time slots far in the past is less important to them. Values in the range 0.8 ≤ β ≤ 1 are adopted in most cases [22-24].
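A direct evaluation of (20) can be sketched as follows. The nested-list layout of the cost record is an assumption made for illustration.

```python
def discounted_reward(costs, beta):
    """Equation (20): R = sum_k beta^(K - k) * sum_i (-C_i(k)).

    costs[k - 1][i - 1] holds C_i(k); K = len(costs) is the current slot,
    so the most recent slot has weight 1 and older slots are discounted.
    """
    total_slots = len(costs)
    reward = 0.0
    for k, slot_costs in enumerate(costs, start=1):
        reward += beta ** (total_slots - k) * -sum(slot_costs)
    return reward


# Two slots, two subbands: 0.5 * (-(1 + 2)) + 1 * (-(3 + 4)) = -8.5.
example_reward = discounted_reward([[1.0, 2.0], [3.0, 4.0]], beta=0.5)
```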
The restless bandit model has an indexability feature that can reduce the computational complexity significantly. For subband b_{i} in time slot t_{k}, the index is represented as δ_{k}(i). The optimal policy U is then a set of optimal actions, whose elements are the optimal actions for each subband in each time slot. According to the restless bandit approach, the optimization of the actions reduces to simply selecting the N_{t} subbands with the smallest indices among the N_{b} ones for transmission by the N_{t} terminals. That is, the optimal action
And the policy U maximizes the long-term total discounted reward R, i.e.,
U = {a_{k}^{ * }(i)} = \arg\max_{a_{k}(i) ∈ {0,1}} R    (22)
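The index rule described above, selecting the N_{t} subbands with the smallest indices, can be sketched in a few lines. The index values are assumed to be given; how they are computed is outside this sketch, and ties are broken in favor of the lower subband number, which is a choice made here rather than a rule stated in the paper.

```python
def select_subbands(indices, n_t):
    """Return the action vector a_k(i) for one time slot.

    indices[i] is the index delta_k(i + 1) of subband b_{i+1}. The n_t
    subbands with the smallest indices get action 1, all others action 0,
    so the actions satisfy the constraint of Equation (11).
    """
    order = sorted(range(len(indices)), key=lambda i: indices[i])
    chosen = set(order[:n_t])
    return [1 if i in chosen else 0 for i in range(len(indices))]


# Four subbands, two terminals: the subbands with indices 0.1 and 0.2 win.
example_actions = select_subbands([0.4, 0.1, 0.9, 0.2], n_t=2)
```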
3.5 Solving the restless bandit problem
The restless bandit formulation in the proposed CPSST allows N_{t} out of N_{b} spectrum subbands to be selected at time slot t_{k}. The system reward R_{i}(k) = − C_{i}(k) is obtained by each terminal, with its state changing according to the transition probability matrices P_{STAT}^{0} and P_{STAT}^{1}.
Based on the linear programming formulations of Markov decision chains, a hierarchy of increasingly stronger linear programming relaxations is provided to solve the problem. The last relaxation is exact. To reduce the computational complexity, a heuristic algorithm for the restless bandits problem has been developed, utilizing the information contained in optimal primal and dual solutions to the first-order relaxation [20].
3.6 Computational complexity
As a cyber-physical video surveillance system is considered in this paper, up-to-date decision making is necessary to guarantee the real-time performance. Consequently, low computational complexity is required in the decision making process. The optimization problem is generally considered NP-hard, but fortunately, as discussed above, the restless bandit formulation has an indexability characteristic that reduces the optimization to simply selecting the objects with the smallest indices. With the complexity reduced so dramatically, even devices with low computational capacity can solve the selection problem in a short time.
4 System Operation Process
The dynamic spectrum management scheme for CPSST can be operated either with or without the off-line computation stage.
4.1 Operation process without off-line computation stage
The operation without off-line stage calculates the indices at the beginning of each time slot, whose process is as follows.
(1) At the beginning of time slot t_{k}, the central point performs the channel estimation function for all subbands according to the channel sensing results in the previous time slot.
(2) The central point determines the channel state s_{CH,i}^{k} of each subband based on the channel estimation results according to Equation (14).
(3) The central point updates the security state s_{SEC,i}^{k} of each subband according to Equation (13).
(4) For each subband, given the system state s_{i}^{k}, the state transition probability matrices P_{STAT}^{1} and P_{STAT}^{0}, the discount factor β, and the cost of the current state C_{i}(k), the central point computes the index δ_{k}(i).
(5) The central point sorts all the indices {δ_{k}(i)} in ascending order and sets the subbands corresponding to the first N_{t} indices to be active.
(6) The central point maps the N_{t} active subbands to the N_{t} terminals and announces this mapping information via the broadcasting channel to all the terminals.
(7) The N_{t} terminals determine their optimal intra-refreshing rate ξ^{ * } by Equation (3).
(8) The N_{t} terminals transmit their video data via the subband mapped to them with their ξ^{ * }.
(9) During the time slot, the central point senses all the N_{b} subbands.
(10) Repeat the process.
Obviously, the operation process without the off-line computation stage only needs online computation and resource allocation, without any extra memory requirement or pre-configuration step. However, the computational complexity of each decision is relatively high, which may result in out-of-date decision making and impaired performance.
4.2 Operation process with off-line computation stage
The index calculation step and the intra-refreshing rate optimization step can also be performed off-line, which could dramatically reduce the online processing complexity and time consumption, and improve the efficiency of the scheme. The off-line computation stage process is as follows.
(1) According to the wireless channel fading parameter σ and N_{TOTAL}, the system state space and the state transition probability matrices under different actions are determined.
(2) For each subband b_{i} and each possible state s_{i}^{k} ∈ S, given the state transition probability matrices P_{STAT}^{1} and P_{STAT}^{0}, the system cost C_{i}(k), and the discount factor β, the finite set of indices {δ_{k}(i)} is computed off-line. These indices and the corresponding P_{STAT}^{1} or P_{STAT}^{0} and C_{i}(k) are stored in a table.
(3) All terminals optimize their intra-refreshing rate ξ based on their source coding rate H_{s} and acceptable packet loss rate ψ by Equation (3).
Compared with the process without off-line stage, in the process with off-line stage, Step 7 could be omitted and Step 4 should be revised as ‘The central point looks up the index table to find out the corresponding indices δ_{k}(i) for each subband b_{i}.’
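The off-line table and the revised Step 4 can be sketched as a dictionary keyed by the discrete system state. In this sketch, `compute_index` is a hypothetical placeholder for whatever index computation is used (it is not defined in this paper's text), and sharing one table among all subbands is an assumption that holds only if the subbands are statistically identical.

```python
from itertools import product


def build_index_table(n_sec, n_ch, compute_index):
    """Off-line stage: precompute one index per (security, channel) state.

    compute_index(m_sec, m_ch) is a hypothetical callable supplied by the
    caller; state indices are 1-based as in the state definitions.
    """
    return {
        (m_sec, m_ch): compute_index(m_sec, m_ch)
        for m_sec, m_ch in product(range(1, n_sec + 1), range(1, n_ch + 1))
    }


def lookup_indices(table, subband_states):
    """Online Step 4: read the index of each subband from the table."""
    return [table[state] for state in subband_states]


# Toy index function for illustration only: higher states cost more.
toy_table = build_index_table(4, 3, lambda m_sec, m_ch: m_sec + 0.1 * m_ch)
toy_indices = lookup_indices(toy_table, [(1, 2), (4, 3), (2, 1)])
```

The online cost of Step 4 then becomes one dictionary lookup per subband, at the price of storing N_{SEC} × N_{CH} entries.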
By calculating all indices for all possible system states at the system initialization step as the off-line computation stage, the computational complexity for each online decision-making step could be largely reduced. Thus compared with the operation process without off-line computation stage, the real-time performance is improved significantly.
5 Simulation Results and Discussions
In this section, extensive simulation results are presented to demonstrate the performance improvement of the proposed scheme.
In the simulation, we assume that the proposed CPSST uses OFDM as the physical layer modulation technology, with the modulation scheme 64QAM and the guard interval 1/16 [27, 28]. Besides, IEEE 802.11 is adopted as the physical and MAC layer standard, with slot time 10 μs, contention window size 32, AIFS time 10 μs, time to transmit a PHY header 48 μs, a MAC header 25 μs, an RTS 15 μs, a CTS 10 μs, and an ACK 10 μs.
All subbands between the terminals and the central point experience independent and identically distributed Rayleigh fading without spatial correlation. Two values of the Rayleigh parameter σ defined in Equation (4), 0.05 and 0.07, are considered in most simulations. We also set N_{t} = 5, N_{b} = 10, a_{1} = a_{2} = 0.5, β = 0.9, and N_{TOTAL} = 5 unless otherwise stated. For simplicity, we assume three channel states, i.e., N_{CH} = 3.
Besides, all the terminals are assumed to adopt the H.264 video coding scheme, with a data rate of 1024 kbps, MB size 16 × 16, and block size 4 × 4. In the distortion model, we adopt the following parameters: Ω_{1} = 0.001, Ω_{2} = 1.0, and E[F_{d}(y,y − 1)] = 100 as described in (2). Dynamic subband utilization is also assumed to be one of the capabilities of the terminals. The central point is assumed to be equipped with the classical pilot-symbol-assisted channel estimator for all subbands, as described in [29, 30].
We compare the system cost of our proposed scheme with that of the existing one, which ignores the optimization of subband management and intra-refreshing rate [6]. Because the subbands that are not selected for utilization have a constant cost, we simply ignore this part in the following simulation.
5.1 Dynamic state and actions
The action decision procedure in the central point is based on the current and historical system state information, including the channel state and the security state. For simplicity, we assume N_{TOTAL} = 3 in this subsection, thus the number of security states is N_{SEC} = N_{TOTAL} + 1 = 4.
An example of the dynamic action decision for one subband and the corresponding security state and channel state is demonstrated in Figure 2. In the first 25 time slots, i.e., 1 ≤ t_{k} ≤ 25, as shown in the figure, we can see that the proposed scheme always keeps the security state relatively low, which means the probability of being attacked or peeped is not high. Besides, in our proposed scheme, the action decision is the global optimization process over the entire system and guarantees the low system cost.
5.2 System cost comparison
According to Equation (10), the system cost is composed of video distortion and security cost. In Figures 3 and 4, the comparison of the distortion and security cost of the proposed scheme with the existing one that ignores the action and distortion optimization is presented. The distortion and security cost here refer to the summation value of all subbands under utilization. During the whole time line, 1 ≤ t_{k} ≤ 500, it can be observed that both the distortion and security costs of our proposed scheme can be reduced significantly compared with the existing one, due to the consideration of the current and historical channel state and security state, and taking video distortion as the application layer QoS metric in the system optimization.
Besides, in Figure 5, the total system cost combining distortion and security cost is presented. We can also observe from the figure that the proposed scheme significantly improves the system performance.
5.3 System cost with different terminal and subband numbers
In this subsection, we compare the system cost of our proposed scheme with the existing one, taking into account the different terminal number N_{t} and subband number N_{b}. The system cost in the simulation is the total cost of all subbands under utilization averaged over the time slots.
In Figure 6, as the terminal number N_{t} varies from 2 to 8, the proposed scheme always outperforms the existing one. Because N_{t} is also the number of subbands under utilization, and the system cost in the figure is the sum over all N_{t} subbands, the system cost increases as N_{t} increases. Moreover, a larger N_{t} implies a larger probability of a subband being utilized, so the security cost for one subband, d_{s}(i) = (N_{USED}(i) / N_{TOTAL}) × P_{ATTACK}(i), also goes up, which raises the total system cost.
Furthermore, the system cost for different numbers of available subbands N_{b} is compared in Figure 7. As N_{b} increases, the system cost is reduced because of the lower security cost d_{s}(i) = (N_{USED}(i) / N_{TOTAL}) × P_{ATTACK}(i). The figure leads to the same conclusion: our proposed scheme significantly improves the performance compared with the existing one.
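As a rough numerical illustration of the per-subband security cost d_{s}(i) = (N_{USED}(i) / N_{TOTAL}) × P_{ATTACK}(i), the following Python sketch evaluates the expression for one subband. The function name `security_cost` and the sample values are illustrative assumptions and not part of the simulation code. N_{USED}(i) is assumed here to be the number of the last N_{TOTAL} time slots in which subband i was used.

```python
def security_cost(n_used: int, n_total: int, p_attack: float) -> float:
    """Per-subband security cost d_s(i) = (N_USED(i) / N_TOTAL) * P_ATTACK(i).

    Assumption: n_used counts how many of the last n_total time slots
    used the subband, so it must lie between 0 and n_total.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_used <= n_total:
        raise ValueError("n_used must lie in [0, n_total]")
    if not 0.0 <= p_attack <= 1.0:
        raise ValueError("p_attack must be a probability in [0, 1]")
    return (n_used / n_total) * p_attack


# Hypothetical values: N_TOTAL = 3 as in Section 5.1, P_ATTACK = 0.3.
# One cost per security state N_USED = 0, 1, 2, 3.
costs = [security_cost(n_used, 3, 0.3) for n_used in range(4)]
```

Under these assumptions, the cost grows linearly with how often the subband has recently been used, which matches the trend described for larger N_{t} and smaller N_{b}.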
5.4 System cost with different cost coefficients
As in Equation (10), the coefficients a_{1} and a_{2} affect the weights of distortion and security cost in the system cost definition. In this simulation, we use different a_{1} values to compare the system cost performance while always keeping a_{1} + a_{2} = 1.
The simulation result is shown in Figure 8, from which we can see the system cost reduction of our proposed scheme, due to the optimization of dynamic spectrum management and video distortion.
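The role of the coefficients can be sketched in a few lines of Python. The sketch assumes that Equation (10) is a linear weighted sum of distortion and security cost; the function name `system_cost` and the sample values are hypothetical and chosen only for illustration.

```python
def system_cost(distortion: float, security: float, a1: float) -> float:
    """Weighted system cost, assuming the form a1 * distortion + a2 * security.

    The constraint a1 + a2 = 1 is enforced by deriving a2 from a1.
    """
    if not 0.0 <= a1 <= 1.0:
        raise ValueError("a1 must lie in [0, 1]")
    a2 = 1.0 - a1
    return a1 * distortion + a2 * security


# Hypothetical sweep over a1 with fixed distortion and security cost.
sweep = {a1: system_cost(10.0, 2.0, a1) for a1 in (0.0, 0.25, 0.5, 0.75, 1.0)}
```

With a_{1} = 0 the cost reduces to the security cost alone, and with a_{1} = 1 it reduces to the distortion alone; intermediate values trade one against the other.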
5.5 System cost with different N_{TOTAL}
N_{TOTAL} is defined as the number of time slots considered in the security state. As shown in Equations (8), (15), and (16), N_{TOTAL} affects the security cost and the transition probability matrix. However, as we can see from Figure 9, the system cost of our proposed scheme does not depend strongly on the value of N_{TOTAL}. That is, the value of N_{TOTAL} may affect the computational complexity of the system but has little effect on its performance. Our proposed scheme is also always better than the existing one.
5.6 System cost with different average frame difference
In our distortion model for the video transmission, the average value of the frame difference E[F_{d}(y,y − 1)] over the time slots is one of the key factors in calculating the channel distortion D_{c}. For a similar packet loss rate ψ, energy loss ratio of the encoder filter Ω_{1}, and constant Ω_{2} based on the motion randomness of the multimedia data, a larger E[F_{d}(y,y − 1)] may lead to larger channel distortion. Thus, in Figure 10, the system cost goes up as E[F_{d}(y,y − 1)] increases, while our proposed scheme always performs better than the existing one.
5.7 System cost comparison with confidence level
Because Monte Carlo simulation produces random results, we illustrate the variability of the simulation results by plotting the system cost with a 95% confidence level in Figure 11.
The figure shows that even the worst case of our proposed scheme yields a significantly lower system cost than the best case of the existing one.
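One common way to obtain such an interval from repeated Monte Carlo runs is the normal approximation, mean ± 1.96 × s / sqrt(n). The Python sketch below uses that approximation as an assumption; the paper does not state how the interval in Figure 11 was computed, and the function names and sample data are hypothetical.

```python
import math
import statistics


def confidence_interval_95(samples):
    """Return (lower, upper) bounds of a 95% confidence interval for the mean.

    Assumption: normal approximation, mean +/- 1.96 * s / sqrt(n),
    where s is the sample standard deviation.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean = statistics.fmean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(n)
    return mean - half_width, mean + half_width


def worst_case_beats_best_case(proposed, existing):
    """True if the upper bound of the proposed scheme's cost interval
    lies below the lower bound of the existing scheme's interval."""
    return proposed[1] < existing[0]


# Hypothetical system-cost samples from repeated simulation runs.
proposed_ci = confidence_interval_95([4.1, 3.9, 4.0, 4.2, 3.8])
existing_ci = confidence_interval_95([6.0, 6.3, 5.9, 6.1, 6.2])
separated = worst_case_beats_best_case(proposed_ci, existing_ci)
```

Comparing the upper bound of one interval with the lower bound of the other corresponds to the worst-case versus best-case comparison made for Figure 11.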
6 Conclusions
Cyber-physical surveillance systems that provide real-time video monitoring are becoming a popular application of CPS for smart transportation systems. In this paper, we have proposed a QoS-aware and security-aware dynamic spectrum management scheme for CPSST, taking into account the application layer QoS (video distortion) and the subband security state (the probability of being attacked or peeped). To minimize the system cost, we formulated the problem as a restless bandit system, whose indexability property can be used to reduce the online computational complexity for real-time systems. The restless bandit formulation makes use of the current and historical system state information and optimizes the actions in each time slot to minimize the system cost by minimizing the video distortion and the chance of being attacked or peeped in the transmission subband. Extensive simulation results demonstrated that the system cost can be significantly reduced under different circumstances compared with the existing scheme, which ignores the intra-refreshing and subband security optimization.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under grants no. 61101113, 61372089, and 61072088, Beijing Natural Science Foundation under grant no. 4132007, and the Doctorate Subject Foundation of the Ministry of Education of China under grant no. 20111103120017.
The authors would like to thank the editors and anonymous reviewers who gave valuable suggestions that have helped to improve the quality of this manuscript.