Prospect Theoretic Approach for Data Integrity in IoT Networks under Manipulation Attacks

As Internet of Things (IoT) and cyber-physical systems become ubiquitous and an integral part of our daily lives, it is important that we are able to trust the data aggregated from such systems. However, the interpretation of trustworthiness is contextual: it varies with the risk tolerance of the application concerned and with the level of uncertainty in the evidence upon which trust models act. Hence, data integrity scoring mechanisms should have provisions to adapt to varying risk attitudes and uncertainties. In this paper, we propose a Bayesian inference model and a prospect theoretic framework for data integrity scoring that quantify the trustworthiness of data collected from IoT devices in the presence of adversaries who manipulate the data. We consider an imperfect anomaly monitoring mechanism that monitors the data being sent from each device and classifies the outcome as not compromised, compromised, or cannot be inferred. These outcomes are conceptualized as a multinomial hypothesis of a Bayesian inference model with three parameters, which are then used to calculate a utility value that reflects how reliable the aggregate data is. We use a prospect theory inspired approach to quantify this data integrity score and evaluate the trustworthiness of the aggregate data from the IoT framework. Furthermore, we also model the system using the traditionally used expected utility theory and compare the results with those obtained using prospect theory. As decisions depend on how the data is fused, we propose two measurement models: one optimistic and one conservative. The proposed framework is validated using extensive simulation experiments. We show how data integrity scores vary under a variety of system factors such as attack intensity and inaccurate detection.


INTRODUCTION
The Internet of Things (IoT) is growing exponentially, both in market value and in number of devices. In situations where multiple devices or networks contend for resources, it is not uncommon for some to deviate from the mutually agreed-upon norms, either i) to illegitimately draw additional benefits or ii) to mislead a central entity (i.e., a hub) and keep it from arriving at a fair decision. Thus, there needs to be a mechanism that establishes the trustworthiness of the devices [1]. Usually, trustworthiness is assigned to individual IoT devices based on the quality of the information they share with others. However, devices may not be malicious themselves; instead, their inputs may be compromised by an adversary (e.g., through a man-in-the-middle or replay attack) before they reach the central decision maker that performs the fused decision. In such a case, focusing on a device's trustworthiness is not appropriate. Rather, what matters is the trustworthiness of the data gathered at the controlling IoT hub. It may be noted that an IoT hub may have a generic fusion rule, and the final dependability is based on both the type of fusion rule used and the integrity of the collected data [2].
Furthermore, the anomaly monitoring techniques used to gather evidence on whether an input from a device is legitimate may not have perfect information. Hence the evidence space for trust modeling is ternary: each input from a device is labeled as not compromised (positive evidence), compromised (negative evidence), or cannot determine (uncertain evidence). The question is whether trust models can handle uncertainty in an adaptive manner. The problem is further exacerbated because the adversary's behavior may not always be consistent and is dictated by the context of interaction, for example, the adversary's magnitude of attack in a particular time slot, available power, energy constraints, and the state of other network entities. In other situations, the adversary may behave honestly for a while to gain confidence and then attack later. Hence an adversary's behavior is often dynamic: it switches rapidly between modes and behaves differently in different situations, a pattern known as an On-Off attack. In such cases, regular trust management mechanisms either fail to react quickly or allow trust to improve quickly when the adversary returns to good behavior.
In this paper, we propose a Bayesian and prospect theory based framework for data integrity scoring that signifies the trustworthiness of the aggregate data gathered at the IoT hub in the presence of an adversary. The adversary can manipulate the data sent to the IoT hub from any IoT device. We assume each IoT device provides a single input in each time slot, and that every input is vulnerable to attack. We assume a generic but imperfect failure monitoring mechanism that produces varying feedback over time. We consider that the outcome of the monitoring mechanism falls into one of three categories: inputs we know have not been compromised, inputs we know have been compromised, and inputs for which neither can be inferred. Given these, we compute the trustworthiness of the data collected at the IoT hub for making a decision. In this regard, we conceptualize the outcome of monitoring the inputs over time as a multinomial hypothesis of a Bayesian inference model with three parameters. We build a Bayesian inference-based data integrity scoring model, where we assign a utility value to reflect the reliability of the collected data. Such an integrity scoring model also takes into account the risk a system can afford to tolerate. This model can also guide us on what kind of fusion rule should be used to make a robust decision. For example, a mission critical system may not operate properly even if there are only a few compromised devices, because the associated risks are too high. Therefore, we propose two models of data integrity measurement: the first is optimistic and the second is conservative. The optimistic model can be applied to systems where some tolerance for wrong decisions is allowed. However, for a mission critical system where there is almost no room for erroneous decisions, the conservative model can be used. With the probabilities of all the outcomes known, we use prospect theory for computing the utilities.
Such an approach helps us model both loss averse and risk averse systems, thus allowing us to differentiate between the way the utility is calculated for optimistic and conservative systems. Then, we compare our model with the more popular expected utility theory and show that prospect theory is more apt when data is at risk. We conduct extensive simulation experiments and show how data integrity scores vary under a variety of system factors such as attack intensity and inaccurate detection. We observe that as more inputs are compromised, the data integrity score decreases. Low data integrity scores may also be caused by a temporary or initial lack of evidence due to uncertainty.

RELATED WORK AND MOTIVATION
Historically, dependability for a particular service has been expressed through trust and reputation scores. Usually, trust in IoT and other networks is built on evidence provided by a feedback or anomaly monitoring system [3]. Such systems usually evaluate whether an input from an IoT device is satisfactory (denoted as 1) or not (denoted as 0) and express the result as a binary value [4]. However, it is known that feedback or anomaly monitoring systems cannot always express interactions from various devices in strict binary values, owing to the inherent channel uncertainties of wireless IoT networks [5]. In such cases, the feedback system produces ternary evidence, the third inference being 'undecided'.
Unlike [4], we argue that an IoT device may not always be malicious by itself; rather, its inputs may be compromised by an adversary before they reach the central IoT hub. In such a case, focusing on an IoT device's trustworthiness will not guarantee an assured decision at the IoT hub. For example, a man-in-the-middle (MITM) attack that modifies the data sent from devices does not represent a compromised IoT device.
In a time slotted system, a trust management scheme usually updates individual trust scores over time and is usually accompanied by a recovery scheme that allows a system to negate the effects of intermittent noise or errors. Popular examples of trust management schemes are the cumulative weighted moving average (CWMA) and the exponentially weighted moving average (EWMA) [6]. CWMA gives equal weight to all individual scores and is useful when the goal is to characterize long term behavior. EWMA, on the other hand, is a forgetting scheme that gives more weight to recent observations than to old ones in order to reflect recent changes in conditions. However, neither approach works well under On-Off attacks when general data integrity scoring is concerned. An On-Off attack is a stealthy distribution of attacks over the time domain and can be combined with other modification of message attacks [7]. In an On-Off attack, the modification of message attacks are launched inconsistently over time to exploit weaknesses in trust management and redemption schemes. An On-Off attack tries to masquerade attacks as temporary, unintentional noisy conditions. At other times, the adversary's inconsistent attacks over time may be due to the context of interaction. Either way, we show in Section 6 that CWMA and EWMA do not work when a collective (long term) data integrity score is required for a system. The expected correctness of data from an IoT cluster or subsystem can be computed. This, in turn, would lead to design considerations for a highly assured system. Given this, there is a lack of a concrete mathematical framework that quantifies the general trustworthiness of collective data where feedback systems do not have perfect or complete information.
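The contrast between the two averaging schemes can be illustrated with a minimal Python sketch. The per-slot scores, the burst length, and the smoothing weight `forget` are hypothetical values chosen for illustration and are not taken from this paper.

```python
def cwma(scores):
    """Cumulative moving average: every past score has equal weight."""
    out, total = [], 0.0
    for t, s in enumerate(scores, start=1):
        total += s
        out.append(total / t)
    return out


def ewma(scores, forget=0.3):
    """Exponentially weighted moving average.

    `forget` (assumed value) is the weight placed on the newest score;
    the remaining weight stays on the running average.
    """
    out, avg = [], scores[0]
    for s in scores:
        avg = forget * s + (1.0 - forget) * avg
        out.append(avg)
    return out


# Hypothetical per-slot scores: 10 honest slots, a 5-slot attack burst,
# then 10 honest slots again (a simple On-Off pattern).
scores = [1.0] * 10 + [-1.0] * 5 + [1.0] * 10
c, e = cwma(scores), ewma(scores)

print("end of burst:  CWMA=%.3f EWMA=%.3f" % (c[14], e[14]))
print("after Off run: CWMA=%.3f EWMA=%.3f" % (c[-1], e[-1]))
```

In this toy run the CWMA value is still positive at the end of the burst (it lags), while the EWMA value turns negative during the burst but climbs back close to its pre-attack level within a few honest slots.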
Apart from the need for a concrete mathematical framework that captures a posterior belief about the trustworthiness of future data sent from each device in an IoT network, there is a need for another mathematical model that aggregates the data from the different devices and makes a fair decision about the reliability of the aggregated data [8]. This model should be flexible enough to make a reasonable decision for different systems with different thresholds of tolerance for adversaries. In this paper, we model the decision making process using both prospect theory and expected utility theory. Prospect theory has been shown to be the most appropriate theory for decision making under risk in economic problems. It also models how people behave in risky situations more accurately than expected utility theory [9]. As we will discuss in this paper, the data fusion center in an IoT network behaves towards adversaries in a way similar to how humans behave in risky situations.

SYSTEM MODEL AND ASSUMPTIONS
We consider a time-slotted system comprising $N$ IoT devices, each of which provides only one input (i.e., the vote) in each time slot. The nature of the decision is generic; it could be as simple as a binary vote or it could be some complex decision metric. A centralized IoT hub fuses the votes from all components through a fusion scheme (e.g., a majority or plurality voting rule) to arrive at a global decision.
•Adversarial model: We assume that the inputs from every IoT device are exposed to an adversary whose goal is to disrupt the voting process at the central hub. The adversary has some predefined attack resources and can choose to attack different sets of inputs over time, as well as a varying number of inputs in each time slot. However, it maintains a long term average of the fraction of inputs it attacks, which we call the probability of attack and denote as $P_a$. For example, $P_a = 0.6$ means that the adversary compromises 60% of the inputs over a large period of time. Hence a single observation (over one time slot) is not sufficient for characterizing the behavior of the adversary.
•Imperfect failure monitoring: We assume that there is a failure monitoring or anomaly detection mechanism in place that infers whether the input from each device has been compromised or not. Unlike related works [4], [10], we consider that the monitoring mechanism cannot infer an anomaly with certainty. Thus, it classifies the inputs into three categories: i) compromised, ii) not compromised, and iii) undecided. All three are functions of environmental parameters that may be dynamic over time. System transients and noisy environments may also increase or decrease temporal uncertainty. Hence, the data integrity is computed over time; a larger time window of observation allows a more accurate estimation of the overall data integrity.
•Uniformly distributed prior inference: Since there is no bias (or available information) towards any of the three possible outcomes of the monitoring process, we assume that their initial probabilities are equal. Similarly, we assume that the prior probabilities of an input being compromised or not are also uniformly distributed.
•Probability of detection: We define the probability of detection as the percentage of IoT inputs that can be accurately inferred as compromised or not compromised, and denote it as $P_{detect}$. We illustrate the meaning of $P_{detect}$ using Fig. 1, which shows that an input in reality is either compromised or not compromised. If compromised, it can be inferred as 'compromised' with probability $a_1$ (correct), 'undecided' with probability $a_2$ (uncertain), or 'not compromised' with probability $a_3$ (missed detection). Similarly, if an input was not compromised, it can be inferred as 'not compromised' with probability $b_1$ (correct), 'undecided' with probability $b_2$, or 'compromised' with probability $b_3$ (false alarm). Thus, for the two real cases, detection occurs with probabilities $a_1$ and $b_1$. If an input has equal chances of being compromised and not compromised, then $P_{detect} = \frac{a_1 + b_1}{2}$. Otherwise, $a_1$ and $b_1$ have to be weighted with their corresponding probabilities. For all practical purposes, we consider $P_{detect}$ to be at least 0.5, since it is impractical to have a monitoring mechanism where the majority of feedbacks are incorrect. Similarly, $P_{uncertain} = \frac{a_2 + b_2}{2}$ denotes the probability of ignorance (expressing inherent uncertainty) about IoT inputs, and $P_{error} = \frac{a_3 + b_3}{2}$ denotes the probability of errors made by the feedback system. These probabilities are used for performance evaluation.
The above features make the problem of computing the data integrity a probabilistic one. Hence, we compute the utility value as an incremental process based on observations over time slots. If the adversary uses the same attack strategy throughout, the utility value will converge sooner. On the other hand, if the adversary changes its attack strategy (i.e., uses a dynamic attack strategy), the utility value will oscillate even for large time windows.
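To make the monitoring model concrete, the following Python sketch simulates one time slot. It is only an illustration under stated assumptions: the probability tuples `a` and `b` hold made-up values, the function names are hypothetical, and each input is attacked independently with probability `p_attack`, which is a simplification of the long term average $P_a$ described above.

```python
import random


def monitor(compromised, a, b, rng):
    """Classify one input as seen by the imperfect monitor.

    a = (a1, a2, a3): correct / undecided / missed, for compromised inputs.
    b = (b1, b2, b3): correct / undecided / false alarm, for clean inputs.
    """
    if compromised:
        labels = ("compromised", "undecided", "not compromised")
        return rng.choices(labels, weights=a)[0]
    labels = ("not compromised", "undecided", "compromised")
    return rng.choices(labels, weights=b)[0]


def rates(a, b):
    """(P_detect, P_uncertain, P_error) when both true states are equally likely."""
    return (a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2


def simulate_slot(n_devices, p_attack, a, b, rng):
    """Count the monitor's three outcomes over one slot of n_devices inputs."""
    counts = {"not compromised": 0, "compromised": 0, "undecided": 0}
    for _ in range(n_devices):
        truly_compromised = rng.random() < p_attack  # simplifying assumption
        counts[monitor(truly_compromised, a, b, rng)] += 1
    return counts


rng = random.Random(0)
a = (0.80, 0.15, 0.05)  # assumed values for illustration only
b = (0.90, 0.08, 0.02)  # assumed values for illustration only
p_detect, p_uncertain, p_error = rates(a, b)
slot_counts = simulate_slot(100, 0.6, a, b, rng)
print(p_detect, p_uncertain, p_error)
print(slot_counts)
```

The three counts returned by `simulate_slot` play the role of the per-slot observations that the integrity score is built on.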
Later, we study a special case of how the proposed theoretical model can be modified to accommodate adversaries that do not have a fixed $P_a$ and instead launch a dynamic (e.g., On-Off) attack strategy.
•On-Off attack: The On-Off attack strategy is denoted by an Off:On ratio. In the 'Off' stage the adversary does not attack. In the 'On' stage the attacker manipulates a random number of inputs in each time slot. It may be noted that ratios with equal Off and On stages do not depict true inconsistency. Higher ratios such as 2:1 and 3:1 are used in this paper. Note that a very high Off:On ratio means the adversary hardly attacks and behaves honestly most of the time, and hence may not be realistic.
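An Off:On schedule of this kind can be sketched in a few lines of Python. The stage lengths, device count, and function names below are hypothetical; the only properties taken from the text are that nothing is attacked in an Off stage and that a random number of inputs is attacked in each On slot.

```python
import random


def on_off_schedule(n_slots, off_len, on_len):
    """Return a list of booleans, True for slots that fall in an 'On' stage.

    The pattern starts with an 'Off' stage and repeats with period
    off_len + on_len.
    """
    period = off_len + on_len
    return [(t % period) >= off_len for t in range(n_slots)]


def attacked_counts(schedule, n_devices, rng):
    """Number of manipulated inputs per slot: random in On slots, zero in Off slots."""
    return [rng.randint(1, n_devices) if on else 0 for on in schedule]


rng = random.Random(1)
sched = on_off_schedule(300, off_len=100, on_len=50)  # a 2:1 Off:On ratio
counts = attacked_counts(sched, n_devices=30, rng=rng)
print(sum(sched), "On slots out of", len(sched))
```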

DATA BASED DECISION MAKING
Let the three outputs of the anomaly monitoring mechanism, viz. 'not compromised', 'compromised', and 'undecided', be denoted by $\alpha$, $\beta$ and $\mu$ respectively. Let $n_\alpha$ represent the number of device inputs that have not been compromised, $n_\beta$ the number of compromised ones, and $n_\mu$ the number for which a decision could not be reached. Of course, $n_\alpha + n_\beta + n_\mu = N$. Since the values of $n_\alpha$, $n_\beta$ and $n_\mu$ change over time, we represent these observations at time $t$ as $n_\alpha(t)$, $n_\beta(t)$ and $n_\mu(t)$.
Given that the underlying parameters of the system supplying accurate data are unknown, we use a Bayesian inference approach to incrementally update the probability estimate for the hypothesis that the data aggregate is correct with a certain probability. The system is only as reliable as its individual inputs. Therefore, we have to calculate the posterior probabilities associated with encountering each of the three feedbacks. The final data integrity score will be some function of these posterior probabilities, which are also known as belief estimates in Bayesian inference.
To begin with, a uniform belief over the three possibilities is assumed, as there is no initial information. As time progresses, we update the belief estimate based on the observed values of $\alpha$, $\beta$, and $\mu$, which increases the accuracy of the belief estimate associated with each category.
Let $D_\alpha$, $D_\beta$, and $D_\mu$ be the random variables that represent the number of times the outcomes $\alpha$, $\beta$ and $\mu$ occur. The observation data can be represented as a random observation vector $D(N) = \{D_\alpha, D_\beta, D_\mu\}$ that has a multinomial distribution, governed by the underlying 3-tuple probability parameter described by $\theta_\alpha$, $\theta_\beta$, and $\theta_\mu$; the observed counts act as the concentration hyperparameters when the belief about this parameter is updated. The commonly used notations are tabulated in Table 1.

Bayesian Inference
As mentioned earlier, there are N independently monitored components of a system whose parameters for voting behavior are unknown due to changing adversarial attack strategies and the imperfect monitoring mechanism. Given this, we calculate the Bayesian belief associated with 'not compromised'. Similarly, we will model Bayesian posterior belief for the other two cases as well viz. compromised and undecided.
We use the observation counts from the sequential observations over time to calculate the posterior Bayesian estimate of each of the parameters. Our objective is to estimate and update the probability parameters in $X(\theta)$, viz. $\theta_\alpha$, $\theta_\beta$, and $\theta_\mu$, based on the observation evidence $D(N)$ and prior information on the hypothesis parameter $\theta$ itself.
Since there is no information about $\theta$ initially, we consider the prior parameters of $\theta$ to be uniformly distributed. Subsequent observations decide how these parameters are updated. Our first step is to calculate the Bayesian estimate of $\theta$.
First, we show the case of estimating the belief that a 'not compromised' outcome occurs ($\theta_\alpha$). Since the Bayesian update used here assumes that the prior and the posterior probability have the same distribution, we formally define the probability parameters as Dirichlet distributed. This assumption is due to the well known fact that a Dirichlet distribution acts as a conjugate prior to multinomial distributions [11]. Hence the prior and the posterior preserve the same form.
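The observation data $D(N)$ can be treated as a multinomial distribution with probability parameters $\theta_\alpha$, $\theta_\beta$, and $\theta_\mu$, where the probability mass function is given by:

$$P(D(N) \mid \theta) = \frac{N!}{n_\alpha!\, n_\beta!\, n_\mu!}\; \theta_\alpha^{n_\alpha}\, \theta_\beta^{n_\beta}\, \theta_\mu^{n_\mu} \qquad (2)$$

Given this, we can use Bayes' theorem to calculate the posterior belief estimate of the event of a positive interaction, $X(\theta) = \alpha$, given the observation data $D(N)$, as:

$$P(X(\theta) = \alpha \mid D(N)) = \frac{P(X(\theta) = \alpha,\, D(N))}{P(D(N))} \qquad (3)$$

The denominator of the above equation is the marginal probability, which can be marginalized over all possible outcomes for $\theta$. Since the probabilities are continuous,

$$P(D(N)) = \int P(D(N) \mid \theta)\, f(\theta)\, d\theta \qquad (4)$$

Since there is no prior information on $\theta$ (before any observations) in Eqn. (4), we can assume it to be uniformly distributed such that $f(\theta) = 1$. We can then put Eqn. (2) in Eqn. (4) and get

$$P(D(N)) = \frac{N!}{n_\alpha!\, n_\beta!\, n_\mu!} \int \theta_\alpha^{n_\alpha}\, \theta_\beta^{n_\beta}\, \theta_\mu^{n_\mu}\, d\theta \qquad (5)$$

$$P(D(N)) = \frac{N!}{n_\alpha!\, n_\beta!\, n_\mu!} \cdot \frac{n_\alpha!\, n_\beta!\, n_\mu!}{(N+2)!} = \frac{N!}{(N+2)!} \qquad (6)$$

Assuming conditional independence between $X(\theta)$, $D(N)$ and $\theta$, we calculate the numerator of Eqn. (3), $P(X(\theta) = \alpha, D(N))$, as:

$$P(X(\theta) = \alpha,\, D(N)) = \int \theta_\alpha\, P(D(N) \mid \theta)\, f(\theta)\, d\theta = \frac{N!\,(n_\alpha + 1)}{(N+3)!} \qquad (7)$$

Thus, Eqn. (3) can be solved by dividing Eqn. (7) by Eqn. (6), which gives

$$P(X(\theta) = \alpha \mid D(N)) = \frac{n_\alpha + 1}{N + 3} \qquad (8)$$

Similarly, $P(X(\theta) = \beta \mid D(N)) = \frac{n_\beta + 1}{N + 3}$ and $P(X(\theta) = \mu \mid D(N)) = \frac{n_\mu + 1}{N + 3}$. These equations are the expressions for the posterior beliefs of 'not compromised', 'compromised', and 'undecided'. To simplify the notation, we write the belief estimates of the three categories as $R_\alpha$, $R_\beta$, $R_\mu$ respectively. Of course, it can be verified that $R_\alpha + R_\beta + R_\mu = 1$.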
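The closed-form posterior beliefs are simple to compute. The following Python sketch is a direct transcription of the three expressions $\frac{n+1}{N+3}$; the function name and the example counts are hypothetical.

```python
def posterior_beliefs(n_alpha, n_beta, n_mu):
    """Posterior beliefs (R_alpha, R_beta, R_mu) under a uniform prior.

    Each belief is (count + 1) / (N + 3), where N is the total number of
    observed inputs across the three categories.
    """
    n_total = n_alpha + n_beta + n_mu
    denom = n_total + 3
    return (n_alpha + 1) / denom, (n_beta + 1) / denom, (n_mu + 1) / denom


# With no observations, the beliefs fall back to the uniform prior.
prior = posterior_beliefs(0, 0, 0)

# Hypothetical counts: 7 'not compromised', 2 'compromised', 1 'undecided'.
r_alpha, r_beta, r_mu = posterior_beliefs(7, 2, 1)
print(prior)
print(r_alpha, r_beta, r_mu)
```

The additive constants keep every belief strictly positive, so an outcome that has not yet been observed is never assigned zero probability.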

DATA INTEGRITY UNDER UNIFORM ATTACKS
In this section, we propose two system models under different conditions. Then, we propound a data integrity measurement approach based on prospect theory.

Optimistic System Model
As discussed, there are three possible outcomes for each component of the IoT framework in each time slot: compromised, not compromised, and undecided. Each of these outcomes incurs some cost. Evidently, the compromised components incur the highest cost, denoted by $c_c$. Not compromised devices also incur some cost, denoted by $c_n$. The third cost, denoted by $c_u$, is associated with the devices that remained undecided. Based on the system requirement, we take different measures for undecided components. The general relation between these costs is:

$$c_n < c_u \le c_c$$

For optimistic systems, we consider half of the undecided components as compromised, because we assume that the adversary chooses the IoT inputs to attack uniformly, i.e., there is no reason for preferential attack on a certain IoT device's input. In this case $c_u$ is defined as follows:

$$c_u = \frac{c_n + c_c}{2}$$

Of course, when the proportion of undecided inputs is high, we may not be as confident in the integrity measurement as when there are fewer undecided inputs.

Conservative System Model
Unlike the optimistic approach, where the undecided inputs are split in an equal ratio, the conservative model treats the undecided ones as more likely to be compromised. In this case, we consider two weights, $w_1$ and $w_2$, for the not compromised and compromised costs respectively, such that $w_1 + w_2 = 1$ and $0.5 < w_2 \le 1$. Hence,

$$c_u = w_1 c_n + w_2 c_c$$

This conservative way of computing the undecided cost is more appropriate for mission-critical systems, where decisions are made mostly on the basis of the 'not compromised' inputs. We define the weight $w_2$ depending on how conservative a system is. Increasing $w_2$ reduces the chance of treating a compromised device as a not compromised one. If the system is highly mission-critical and there is no room for risk, we set $w_2 = 1$. In this case, all undecided devices are considered compromised, even though some of them may not be.
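The two treatments of the undecided cost can be expressed with a single helper, as in the Python sketch below. It reflects our reading of the text, in which $w_2$ weights the compromised cost and the optimistic model corresponds to $w_2 = 0.5$; the function name and the cost values are assumptions for illustration.

```python
def undecided_cost(c_n, c_c, w2=0.5):
    """Cost assigned to an undecided input.

    w2 is the weight on the compromised cost c_c and w1 = 1 - w2 the weight
    on the not-compromised cost c_n.
    w2 = 0.5 gives the optimistic model; 0.5 < w2 <= 1 the conservative one.
    """
    if not 0.5 <= w2 <= 1.0:
        raise ValueError("w2 must lie in [0.5, 1]")
    w1 = 1.0 - w2
    return w1 * c_n + w2 * c_c


# Hypothetical costs: c_n = 1 for a clean input, c_c = 10 for a compromised one.
optimistic = undecided_cost(1.0, 10.0)             # equal split
conservative = undecided_cost(1.0, 10.0, w2=0.8)   # leans towards compromised
strict = undecided_cost(1.0, 10.0, w2=1.0)         # undecided treated as compromised
print(optimistic, conservative, strict)
```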

Data Integrity Using Prospect Theory (PT)
Using the Bayesian posterior beliefs of the three possible outcomes (as obtained in Section IV), we want to calculate a utility value based on prospect theory (PT). This utility value is obtained as follows:

$$U = \sum_{i \in \{\alpha, \beta, \mu\}} V(\delta_i)\, W(R_i) \qquad (12)$$

where $V$ denotes the value function and $W$ denotes the weighting function. $\delta_\alpha$, $\delta_\beta$, and $\delta_\mu$ are three deviation values related to the three independent outcomes of the data extracted from each device in an IoT network. These deviations represent the difference between the profit function, denoted by $\pi$, and the reference point, denoted by $\pi_P$. Because the outcomes are independent, a separate profit function is defined for each, denoted by $\pi_\alpha$, $\pi_\beta$, and $\pi_\mu$. The deviation values and profit functions are defined in Eqn. (13). The reference point $\pi_P$ needed to calculate the deviation values is defined in Eqn. (14) by assuming that all the IoT devices are not compromised. According to Eqns. (13) and (14), the state assumed for all the nodes at the reference point determines whether the real outcomes are gains or losses. By considering all the nodes as not compromised at the reference point, any node that is not compromised will always be counted as a gain, and the other outcomes will be counted as a gain or a loss based on the cost values. However, if we had considered the nodes as compromised or undecided at the reference point, the value of $\delta$ in each state would increase such that even compromised devices yield a gain. The reason for the gain is that the not compromised devices have the lowest cost.
As for the value function, it is an asymmetrical S-shaped function, as shown in Fig. 2. It is asymmetric because of its loss aversion: the same absolute deviation has more impact as a loss than as a gain. Its value depends on the deviation of the profit values from the reference point, defined in Eqn. (13). The value function is obtained as follows:

$$V(\delta) = \begin{cases} \delta^{\gamma}, & \delta \ge 0 \\ -\lambda\, (-\delta)^{\gamma}, & \delta < 0 \end{cases}$$

The positive part of the value function represents the gain and its negative part denotes the loss in reliability of the integrated data from the IoT devices. $\lambda$ and $\gamma$ are two parameters used for controlling loss aversion and risk aversion, where $\lambda > 1$ and $0 < \gamma < 1$ [12]. By increasing $\lambda$, the IoT system becomes more loss averse and the value function consequently becomes more asymmetric, with the loss part becoming more convex. By decreasing $\gamma$, the IoT system becomes more risk averse. The effect of these parameters on the value function is shown in Fig. 2. Choosing the right values for these two parameters depends on the IoT system. As the system becomes more conservative, it becomes more loss averse.
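The shape described above can be sketched with the power-form value function that is commonly used in the prospect theory literature. This form, the function name, and the default parameter values (the widely cited estimates $\lambda = 2.25$, $\gamma = 0.88$) are assumptions for illustration and are not necessarily the settings used in this paper.

```python
def value(delta, lam=2.25, gamma=0.88):
    """Asymmetric S-shaped value function (assumed power form).

    Gains (delta >= 0) are valued as delta**gamma.
    Losses (delta < 0) are valued as -lam * (-delta)**gamma, so a loss
    weighs lam times more than a gain of the same size.
    Requires lam > 1 and 0 < gamma < 1.
    """
    if delta >= 0:
        return delta ** gamma
    return -lam * (-delta) ** gamma


for d in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(d, round(value(d), 3))
```

Raising `lam` steepens only the loss branch, while lowering `gamma` flattens both branches for large deviations.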
According to Eqn. (12), we need to define another function, called the weighting function. According to prospect theory (PT), in real life decision making people overreact to low probabilities and under-react to high probabilities [13]. Here, we have the same situation. For example, if a single device is compromised, it will have a significant impact on the reliability of the aggregated data. However, if we have 30 devices and 20 of them are compromised, one more compromised device will not make a significant difference. Furthermore, the effect of probability weights is not the same for loss and gain [14]. Clearly, it is more desirable to keep the loss to a minimum than to achieve a large gain, since even a small loss results in lost confidence in the aggregated data. Therefore, two similar weighting functions with two different parameters, $\rho$ and $\omega$, are defined for gain, denoted by $W^{+}(p)$, and for loss, denoted by $W^{-}(p)$, respectively. $\rho$ and $\omega$ are chosen so as to emphasize loss, which means choosing lower values for $\omega$ than for $\rho$. Their values also depend on the system we are dealing with. In conservative systems, the effect of the weighting function for loss is higher than in optimistic systems. Therefore, conservative systems have lower values of $\omega$ in order to penalize loss strictly. On the other hand, the value of $\rho$ for conservative systems is higher than for optimistic ones, since gain is not valued as highly in conservative systems as in optimistic systems. In other words, a gain in a conservative system does not yield as high a positive utility as in its optimistic counterpart. The effects of these two parameters are demonstrated in Fig. 3. Using these definitions, we obtain a utility value for an IoT system in each time slot. This utility value indicates the extent to which the aggregated data from all the nodes in the IoT framework is reliable.
Each IoT network will have its own parameters. Based on these parameters, a threshold is defined for the utility value of each IoT system. If the calculated utility of this system is higher than this threshold, the aggregated data from this system has acceptable integrity.
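As an illustration of how the pieces fit together, the Python sketch below combines a value function, a probability weighting function, and a threshold test. Several parts are assumptions rather than the paper's definitions: the weighting function uses the Tversky-Kahneman form $w(p) = p^{k} / (p^{k} + (1-p)^{k})^{1/k}$, the value function uses the common power form, the parameter values are illustrative, and the deviations passed in are hypothetical numbers because the profit functions are not reproduced here.

```python
def weight(p, k):
    """Inverse-S probability weighting (assumed Tversky-Kahneman form).

    For k < 1 small probabilities are overweighted and large ones
    underweighted; k = 1 returns p unchanged.
    """
    num = p ** k
    return num / (num + (1.0 - p) ** k) ** (1.0 / k)


def value(delta, lam, gamma):
    """Assumed power-form value function with loss aversion lam."""
    return delta ** gamma if delta >= 0 else -lam * (-delta) ** gamma


def pt_utility(beliefs, deviations, lam=2.25, gamma=0.88, rho=0.69, omega=0.61):
    """Sum of value * weight over the three outcomes.

    rho is used for gains and omega (< rho) for losses, which emphasizes loss.
    """
    total = 0.0
    for p, d in zip(beliefs, deviations):
        k = rho if d >= 0 else omega
        total += value(d, lam, gamma) * weight(p, k)
    return total


def acceptable(utility, threshold=0.0):
    """Integrity is acceptable when the utility exceeds the system threshold."""
    return utility > threshold


# Hypothetical beliefs (R_alpha, R_beta, R_mu) and deviations.
u_gain = pt_utility((0.7, 0.2, 0.1), (5.0, 2.0, 1.0))
u_mixed = pt_utility((0.7, 0.2, 0.1), (5.0, -8.0, -3.0))
print(u_gain, acceptable(u_gain))
print(u_mixed, acceptable(u_mixed))
```

Setting `rho = omega = 1`, `lam = 1`, and `gamma = 1` reduces `pt_utility` to a plain probability-weighted sum of the deviations, which is one way to obtain an expected-utility style baseline for comparison.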

Data Integrity Using Expected Utility Theory (EUT)
Though we argue for a prospect theoretic approach for preserving data integrity in IoT networks, there are other competing theories, and one of the most popular is expected utility theory (EUT) [15]. However, the utility derived from EUT is not as risk averse and loss averse as that obtained by prospect theory [16]. Therefore, it is expected that EUT does not significantly differentiate between optimistic systems and conservative systems.
When we compare PT with EUT, we should analyze two functions: the value function and the weighting function, which are shown in Figs. 2 and 3 respectively. In this problem, we are dealing with three possible outcomes. According to Eqn. (12), the final utility is the sum of the utilities of these three outcomes. Therefore, we should analyze the effect of these two functions on each of the three utilities separately.

Small attack magnitudes
Taking the above features into account, for small attack magnitudes it is expected that compromised and undecided devices will incur more negative utility values under PT than under EUT. In addition, these devices will have higher weights under PT than under EUT. On the other hand, not compromised devices, which have higher probabilities, will have a higher positive utility value under PT than under EUT, but their weight under PT is lower than under EUT. Considering all of these facts, for small attack magnitudes the total utility value, which is positive, is expected to be higher for EUT.

Moderate attack magnitudes
However, for moderate attack magnitudes, the weights for PT and EUT are almost the same and the only difference is in the utility values. According to Fig. 2, the positive utility for PT and EUT are not significantly different. However, the negative part is noticeably different. Therefore, the positive utility incurred by non compromised devices is almost the same for PT and EUT but the absolute value of negative part of PT which is caused by compromised and undecided devices is more than EUT. In conclusion, for moderate attack magnitudes, it is expected that the absolute utility value of PT is higher than that of EUT.

Large attack magnitudes
For large attack magnitudes, again the positive part is almost the same for PT and EUT. However, for the negative part which is caused mostly by compromised devices, the story is different. As mentioned earlier, PT is risk seeking in the loss domain. It means that for large negative values of x in value function, the absolute utility value of EUT is larger than PT. Therefore, the magnitude of negative utility value incurred by compromised devices in EUT is more than PT. In addition, the negative utility values caused by compromised devices have higher weights in EUT. Therefore, it is expected that EUT will have higher absolute value than PT for large attack magnitudes.
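The claim about large losses can be checked with a short calculation. The sketch below assumes the commonly used power-form value function $V(x) = -\lambda (-x)^{\gamma}$ for $x < 0$ and takes the EUT utility of a loss $x$ to be $x$ itself (a linear utility); both forms are assumptions made for illustration.

$$|V(x)| < |x| \iff \lambda\, |x|^{\gamma} < |x| \iff |x|^{1-\gamma} > \lambda \iff |x| > \lambda^{1/(1-\gamma)}$$

For example, with the illustrative values $\lambda = 2$ and $\gamma = 0.5$, the crossover lies at $|x| = 2^{2} = 4$: losses smaller than 4 in magnitude weigh more under PT, while larger losses weigh more under EUT. Under these assumptions the crossover moves to larger magnitudes as $\lambda$ grows or as $\gamma$ approaches 1.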

Conservative systems vs optimistic systems
The fact that expected utility theory disregards the under-reaction to larger probabilities is more noticeable for conservative systems, which have higher costs for malicious nodes. Since expected utility does not use a weighting function, large probabilities have a greater effect in EUT than in PT. Thus, the absolute utility value of EUT is larger than that of PT. This effect is strong enough that, for high attack magnitudes, the utility value using EUT for optimistic systems is lower than the utility value using PT for conservative systems.

ASYMMETRIC TRUST UPDATE FOR ON-OFF ATTACKS
In On-Off attacks, the adversaries have preferences over time periods: an adversary may choose not to attack for some time (Off) and then attack for some time with a random magnitude (On). In such a case, neither CWMA nor EWMA would reflect the true behavior of the node. The equally weighted CWMA will lag in reflecting such attacks, while EWMA will enable the system to quickly recover or redeem its reputation when it switches back to honest behavior. Under On-Off attacks, the data integrity scoring framework should not allow the system to recover its integrity score just because the adversary starts behaving well after a short burst of attack. In Fig. 4, we show an example where the adversary employs a 2:1 Off:On ratio and divides the first 300 time slots into four stages: 'Stage 1' from t = 0 to 100, 'Stage 2' from t = 101 to 150, 'Stage 3' from t = 151 to 250, and 'Stage 4' from t = 251 to 300. A final 'Stage 5', from t = 301 to 500, is a no-attack phase used to analyze the after-effects of On-Off attacks. In Stage 1, the adversary does not attack, in a bid to gain high trust from the system initially. In Stage 2, it attacks for the next 50 time slots with a random magnitude in each slot. In Stage 3, it does not attack in any of the 100 slots. In Stage 4, it again attacks in 50 slots with a random attack magnitude. In Stage 5, it again behaves cooperatively. Suppose an algorithm checks every 50 slots whether the system is compromised and uses a threshold of zero, below which the collective data should be considered 'unusable' for decision making. At the 100th, 200th, 250th, 350th, 400th, 450th and 500th slots, the system's data is deemed usable by both trust/score update schemes even though the adversary is employing a stealthy On-Off attack. We see that CWMA reacts too slowly and fails to reflect the malicious nature even at the end of Stage 2.
On the other hand, EWMA detects attacks quickly but also allows such nodes to quickly recover their reputation on 151-th and 301-th slot in the ensuing Off period. Hence there is need for a special trust update scheme that would restrict the average data integrity to improve quickly even when the adversary starts behaving cooperatively after malicious activity as well as be responsive enough to decrease the integrity score when a adversary starts acting maliciously after building a high reputation.
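The contrast between the two baseline schemes can be reproduced with a minimal sketch. The instantaneous scores below are assumptions made only for illustration (+0.8 in honest slots, a random negative value in attack slots), as is the EWMA smoothing factor `alpha`; none of these values come from the paper.

```python
import random

def on_off_scores(seed=0):
    """Hypothetical instantaneous scores for the five-stage 2:1 Off-On pattern."""
    rng = random.Random(seed)
    scores = []
    for t in range(1, 501):
        attacking = 101 <= t <= 150 or 251 <= t <= 300
        # Assumed values: +0.8 when honest, a random negative score when attacking.
        scores.append(-rng.uniform(0.2, 1.0) if attacking else 0.8)
    return scores

def cwma(scores):
    """Equal-weighted cumulative moving average of all scores seen so far."""
    out, total = [], 0.0
    for n, s in enumerate(scores, start=1):
        total += s
        out.append(total / n)
    return out

def ewma(scores, alpha=0.3):
    """Exponentially weighted moving average; alpha is an assumed smoothing factor."""
    out, avg = [], scores[0]
    for s in scores:
        avg = alpha * s + (1 - alpha) * avg
        out.append(avg)
    return out

scores = on_off_scores()
c, e = cwma(scores), ewma(scores)
# Slot 150 is the end of Stage 2; slot 200 is mid-way through Stage 3.
print("CWMA at t=150:", round(c[149], 3))
print("EWMA at t=150:", round(e[149], 3), " EWMA at t=200:", round(e[199], 3))
```

Under these assumptions, CWMA is still above the zero threshold at the end of the first attack burst, while EWMA drops below zero during the burst and is back near its honest level 50 slots later.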

Weighted Integrity Score
Let us denote the utility of data integrity obtained from Eqn. 12 as $u_{di}$. Since the range of $u_{di}$ is large, it is difficult to depict trust values and to make bounded classification decisions on whether the data is dependable. Hence, we map these scores to a bounded range via a scaling trick, which ensures that all negative $u_{di}$ values are mapped to $[-1, 0]$ and all positive $u_{di}$ values are mapped to $[0, +1]$. The weights increase monotonically with increasing data integrity, and vice versa. Therefore, we report the normalized weight $w_{di}$, a value in $[-1, 1]$, computed using Eqn. (17). This mapping, which uses trust to assign weights, helps to clearly distinguish between two classes of nodes. At any time $t$, the weighted integrity score is denoted as $w_{di}(t)$.

Asymmetric Weighted Moving Average scheme
We propose an Asymmetric Weighted Moving Average (AWMA) technique based on the socially inspired concept that bad actions are remembered far longer than good actions. Accordingly, time slots whose instantaneous integrity score $w_{di}(t)$ is lower than a threshold $\Gamma_{\text{on-off}}$ are given more weight than time slots where $w_{di}(t)$ is higher. The value of $\Gamma_{\text{on-off}}$ is dictated by a system-specific risk attitude and defines what can be termed sufficiently good behavior. The trust update depends on two quantities: the cumulative average $w^{\text{mavg}}_{di}(t-1)$ and the current trust value $w_{di}(t)$. We introduce four weighting factors $\chi_a$, $\chi_{bmax}$, $\chi_{cmin}$, and $\chi_d$ such that $0 < \chi_a < 1$, $0 \ll \chi_{bmax} < 1$, $0 < \chi_{cmin} \ll 1$, and $0 < \chi_d < 1$. The fact that $\chi_{cmin}$ is much smaller than $\chi_{bmax}$ introduces the asymmetry. With regard to On-Off attacks, there are four possible scenarios at time $t$:
Case (a): $w^{\text{mavg}}_{di}(t-1) > \Gamma_{\text{on-off}}$ and $w_{di}(t) > \Gamma_{\text{on-off}}$
Case (b): $w^{\text{mavg}}_{di}(t-1) > \Gamma_{\text{on-off}}$ and $w_{di}(t) \le \Gamma_{\text{on-off}}$
Case (c): $w^{\text{mavg}}_{di}(t-1) \le \Gamma_{\text{on-off}}$ and $w_{di}(t) > \Gamma_{\text{on-off}}$
Case (d): $w^{\text{mavg}}_{di}(t-1) \le \Gamma_{\text{on-off}}$ and $w_{di}(t) \le \Gamma_{\text{on-off}}$
In Case (a), a cumulative average higher than $\Gamma_{\text{on-off}}$ suggests that the system is maintaining sufficiently good behavior. If the current trust value is also higher than $\Gamma_{\text{on-off}}$, the good behavior is continuing. Hence, continuing good behavior is rewarded by assigning a high weighting factor $\chi_a$ to $w_{di}(t)$ and a low weight $1 - \chi_a$ to $w^{\text{mavg}}_{di}(t-1)$. We name $\chi_a$ the rewarding factor, with $0 < \chi_a < 1$. It helps a historically reliable system to improve, or at least maintain, its reputation if it also behaved cooperatively in time slot $t$. Hence, for Case (a), the cumulative trust is updated as:
$$w^{\text{mavg}}_{di}(t) = (1 - \chi_a)\, w^{\text{mavg}}_{di}(t-1) + \chi_a\, w_{di}(t)$$
In Case (b), a cumulative average higher than $\Gamma_{\text{on-off}}$ together with $w_{di}(t) \le \Gamma_{\text{on-off}}$ suggests a system that maintained sufficiently good behavior up to time $t-1$ and then initiated some anomalous behavior. Hence, all the good behavior until now needs to be forgotten and a very high weight given to the current slot's anomalous behavior. This causes the system's cumulative trust value to decrease quickly. Once this happens, Case (c) ensures that the cumulative trust cannot redeem itself quickly. Hence, $w_{di}(t)$ is weighted with a high value $\chi_{bmax}$ such that $0 \ll \chi_{bmax} < 1$, and $w^{\text{mavg}}_{di}(t-1)$ is weighted with $1 - \chi_{bmax}$. We name $\chi_{bmax}$ the punishment factor. The higher the punishment factor, the quicker the drop in reputation, and hence the more severe the system's reaction to new evidence of malicious behavior. In this case, the cumulative trust is updated as:
$$w^{\text{mavg}}_{di}(t) = (1 - \chi_{bmax})\, w^{\text{mavg}}_{di}(t-1) + \chi_{bmax}\, w_{di}(t)$$
In Case (c), a cumulative average lower than $\Gamma_{\text{on-off}}$ but a current trust value $w_{di}(t)$ higher than $\Gamma_{\text{on-off}}$ signifies a system whose current inputs are cooperative but which has a history of anomalous behavior, possibly as recent as $t-1$. Hence, even though $w_{di}(t)$ may be high, we assign it a very low weight $\chi_{cmin}$ such that $0 < \chi_{cmin} \ll 1$, and assign $1 - \chi_{cmin}$ to $w^{\text{mavg}}_{di}(t-1)$. We name $\chi_{cmin}$ the redemption factor; it controls how fast or slow a system with a malicious history can redeem its trustworthiness if it shows good behavior for a sufficiently long time. The redemption factor also makes it possible for systems that experienced noise to redeem their trust values. A low redemption factor ensures that the trust value does not increase quickly even when a system starts to behave honestly after a period of malicious behavior. In this case, the cumulative trust is updated as:
$$w^{\text{mavg}}_{di}(t) = (1 - \chi_{cmin})\, w^{\text{mavg}}_{di}(t-1) + \chi_{cmin}\, w_{di}(t)$$
In Case (d), both the cumulative average and the current trust value are below $\Gamma_{\text{on-off}}$, indicating continuing anomalous behavior. In such a case, we assign $\chi_d$, known as the retrogression factor, as the weight of the current value and $1 - \chi_d$ as the weight of the cumulative average, such that trust is updated as:
$$w^{\text{mavg}}_{di}(t) = (1 - \chi_d)\, w^{\text{mavg}}_{di}(t-1) + \chi_d\, w_{di}(t)$$
The above scheme, termed the asymmetric weighted moving average, is effective in defending against On-Off attacks, which is not possible using equally weighted or exponentially weighted moving averages.
In the simulation section, we also show that it can be effective in distinguishing malicious IoT devices from devices experiencing intermittent noise.
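The four-case update described above can be sketched in a few lines of Python. This is an illustrative reading of the scheme, not the authors' implementation: the function names are hypothetical, the default factors are the values reported later in the simulation section, ties with the threshold are assumed to fall on the "not above" side, and the fixed per-slot scores (+0.8 honest, -0.5 under attack) are assumptions standing in for the random attack magnitudes.

```python
def awma_update(prev_avg, current, gamma=0.0,
                chi_a=0.99, chi_bmax=0.999, chi_cmin=0.001, chi_d=0.001):
    """One AWMA step: choose the weight of the current score by case."""
    if prev_avg > gamma and current > gamma:
        chi = chi_a        # Case (a): continued good behavior, reward
    elif prev_avg > gamma:
        chi = chi_bmax     # Case (b): good history, bad slot, punish
    elif current > gamma:
        chi = chi_cmin     # Case (c): bad history, good slot, slow redemption
    else:
        chi = chi_d        # Case (d): continued bad behavior, retrogression
    return (1 - chi) * prev_avg + chi * current

def awma(scores, **factors):
    """Run the update over a sequence, seeding the average with the first score."""
    avg = scores[0]
    out = [avg]
    for s in scores[1:]:
        avg = awma_update(avg, s, **factors)
        out.append(avg)
    return out

# Assumed five-stage 2:1 Off-On pattern with fixed illustrative scores.
scores = [0.8] * 100 + [-0.5] * 50 + [0.8] * 100 + [-0.5] * 50 + [0.8] * 200
trace = awma(scores)
print(round(trace[99], 4), round(trace[100], 4), round(trace[-1], 4))
```

With these assumed inputs, the average collapses on the first attack slot and remains below the zero threshold through the end of Stage 5, while still creeping upward during honest periods.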

SIMULATION MODEL AND RESULTS
We simulate a generic system with 100 IoT devices. Inputs from all devices are monitored by an imperfect monitoring mechanism that produces three possible outcomes. The probabilities of detection and attack are varied to capture their effects on data integrity measurements.
Under non-opportunistic attacks, an adversary attacks and compromises different sets of inputs over time. The number of compromised inputs varies from one time slot to the next, although its long-term average, denoted by $P_a$, remains the same. Under opportunistic On-Off attacks, we study how a system can establish appropriate trustworthiness when the attacks are time dependent. We study the data integrity utility values for different values of $P_a$ and $P_{detect}$, and plot the instantaneous and moving-average data integrity scores. For calculating the utility values in all simulations, the parameters are set as follows: $\lambda = 2$, $\gamma = 0.5$, $\omega = 0.63$, and $\rho = 0.69$.
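As a rough illustration of one simulated time slot, the sketch below draws the three monitoring outcomes for 100 devices. It rests on two assumptions that this section does not spell out: each device is compromised independently with probability $P_a$, and the monitor returns "cannot be inferred" with probability $1 - P_{detect}$ and otherwise classifies correctly. The function name `monitor_slot` and the outcome labels are hypothetical.

```python
import random

def monitor_slot(n_devices=100, p_a=0.1, p_detect=0.9, rng=random):
    """Count the three monitoring outcomes for one time slot (assumed model)."""
    counts = {"not_compromised": 0, "compromised": 0, "undecided": 0}
    for _ in range(n_devices):
        compromised = rng.random() < p_a      # assumed independent attack per device
        if rng.random() >= p_detect:          # assumed: monitor cannot infer
            counts["undecided"] += 1
        elif compromised:
            counts["compromised"] += 1
        else:
            counts["not_compromised"] += 1
    return counts

# One realization with a fixed seed for repeatability.
print(monitor_slot(rng=random.Random(1)))
```

The per-slot counts fluctuate around their means, which is the source of the fluctuation in instantaneous utility values; feeding them into the utility computation defined earlier in the paper is outside the scope of this sketch.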

Optimistic and Conservative Utility Values Under Same Attack Conditions: Instantaneous and Average
In Fig. 5, we plot the instantaneous and steady-state utility values for both the optimistic and conservative models when the adversary launches attacks with $P_a = 0.1$ and the system is able to detect aggregated data with $P_{detect} = 0.9$.
The optimistic system is defined with the costs $c_c = 0.1$, $c_n = 0.01$, and $c_u = 0.055$. The costs for the conservative system are $c_c = 0.1$, $c_n = 0.01$, and $c_u = 0.09$.
We observe that the instantaneous utility values fluctuate over time owing to the particular realizations of $P_a$ in a time slot and to imperfect monitoring based on $P_{detect}$. As expected, with sufficient observations, the moving average of the optimistic utility values converges to a positive steady-state value around 0.5. The moving average of the conservative utility values converges to a lower steady-state value, which is almost zero. Therefore, under the same attack and detection conditions, the collected data from an optimistic system is usually reliable but the aggregate data from a conservative system is mostly unreliable.

Utility Value and Attack Magnitude
In Fig. 6, we plot the steady-state utility values in an optimistic system for different attack magnitudes, from $P_a = 0.1$ to $P_a = 1$, under the same monitoring mechanism, $P_{detect} = 0.9$. In Fig. 7, we illustrate the average utility values for different attack magnitudes $P_a$ under two different monitoring mechanisms, for both an optimistic system and a conservative system: $P_{detect}$ is 0.9 in Fig. 7(a) and 0.5 in Fig. 7(b). According to Fig. 6 and Fig. 7, aggregate data in an optimistic system with a low $P_a$ is reliable most of the time.
However, as the attack magnitude increases, the collected data becomes mostly unreliable. In a conservative system, on the other hand, the aggregate data is mostly unreliable even at a low attack magnitude, and it becomes more unreliable as the attack magnitude increases. Fig. 7 also reveals that with a reduction in $P_{detect}$, the effect of attack magnitude on utility values follows the same trend, with just a small decrease in utility values. The effect of attack magnitude on utility value is almost linear. Furthermore, as $P_{detect}$ decreases, the difference between the average utility values of an optimistic system and a conservative system increases, because $c_c$ is the same for both systems and the only difference is in $c_u$, which depends on the undecided devices.

Utility Value and Imperfect Monitoring
Imperfect monitoring can degrade utility values, and consequently the reliability of the data collected from an IoT system, as much as attack magnitude can. Fig. 8 shows the effect of imperfect monitoring on an optimistic system over time while the attack magnitude remains fixed at $P_a = 0.1$. We observe that as $P_{detect}$ increases, the aggregate data becomes more reliable, following a linear trend.

Prospect Theory VS Expected Utility Theory
Expected utility is not as risk averse and loss averse as prospect theory. Therefore, as shown in Fig. 9, it does not significantly differentiate between optimistic systems and conservative systems.
We analyzed the effect of expected utility theory on different attack magnitudes in Section 5.4 and predicted the possible differences between prospect theory and expected utility theory for different attack magnitudes. To validate those predictions, we simulated an optimistic system using prospect theory and expected utility theory. The effect of attack magnitudes on PT-based and EUT-based utility values is shown in Fig. 10. The simulation results match our predictions.
In Section 5.4, we also claimed that the effect of expected utility theory would be different for optimistic and conservative systems. Since expected utility does not use any weight function, large probabilities have a stronger effect on EUT than on PT. Since malicious nodes in conservative systems have a higher cost, this phenomenon is more noticeable in conservative systems. According to Fig. 11, the absolute EUT utility values for conservative systems are much higher than the absolute PT utility values for conservative systems, which aligns with our prediction. The effect is strong enough that even the absolute EUT utility values for optimistic systems are larger than the absolute PT utility values for conservative systems.
As discussed, there exist three weight functions, one associated with the probability of each outcome, which contribute to the difference between the utility values obtained by EUT and PT. According to Fig. 12, for $P_{attack} = 0.1$ the expected utility value is higher than the prospect theory utility value when $P_{detect} = 1$, since $P_{attack} = 0.1$ has a larger negative effect on PT than on EUT because of the value and weight functions. As we decrease the detection accuracy from $P_{detect} = 1$, the utility values obtained by PT and EUT start to decrease. However, this negative effect is larger in PT than in EUT until we reach $P_{detect} = 0.7$ ($P_{undecided} = 0.3$). As per Fig. 3, for small probabilities (less than 0.3), prospect theory assigns weights higher than the actual probabilities. Therefore, reducing $P_{detect}$ from 1 to 0.7, which is equivalent to increasing $P_{undecided}$ from 0 to 0.3, is expected to have a larger negative effect on the utility obtained by prospect theory than on the utility obtained by expected utility theory. According to the weight function, this trend reverses for larger probabilities (larger than 0.3), which means that utility values obtained by EUT should decrease at a higher rate. At the same time, as the detection accuracy decreases, the number of undecided devices increases, and according to Fig. 2, an increase in undecided devices has a more significant negative effect on PT than on EUT. Therefore, as shown in Fig. 12, for $P_{detect} \le 0.7$ ($P_{undecided} \ge 0.3$), these two effects cancel each other out, so that the utility values obtained by EUT and PT decrease at the same rate.
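The over-weighting of small probabilities and under-weighting of large ones can be illustrated with a short sketch. It assumes the widely used Tversky-Kahneman one-parameter weighting form; the weight functions actually used in this paper are defined in an earlier section and may differ, and the exponent 0.69 below is chosen only for illustration. Under EUT the weight is simply the probability itself.

```python
def tk_weight(p, g):
    """Tversky-Kahneman probability weighting (assumed form): p^g / (p^g + (1-p)^g)^(1/g)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    num = p ** g
    return num / (num + (1.0 - p) ** g) ** (1.0 / g)

def eut_weight(p):
    """Expected utility theory uses the raw probability as the weight."""
    return p

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(tk_weight(p, 0.69), 3), eut_weight(p))
```

In this assumed form, the weight of a small probability exceeds the probability itself and the weight of a large probability falls below it, which is the qualitative behavior the comparison between PT and EUT relies on.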

Defending against On-Off attacks: A special case
We consider an adversary launching On-Off attacks in five stages over 500 time slots. The first 300 slots are an active attack period, and slots 301 to 500 are used to study how trust recovers. For most results we consider an Off-On ratio of 2:1; later we compare the results with a 3:1 ratio. We plot the results of On-Off attacks as calculated by the IoT hub using the asymmetric weighted moving average equations discussed in Section 6. We compare the results with other popular trust update schemes and justify the suitability of asymmetric averaging with regard to On-Off attacks.

Choice of weighting factors and threshold
The weighting factors $\chi_a$, $\chi_{bmax}$, $\chi_{cmin}$, and $\chi_d$ are chosen as 0.99, 0.999, 0.001, and 0.001, respectively. These values satisfy the conditions $0 < \chi_{cmin} \ll \chi_{bmax} < 1$, $0 < \chi_a < 1$, and $0 < \chi_d < 1$. The skewed values of $\chi_{cmin}$ and $\chi_{bmax}$ provide the asymmetry: on the first occurrence of a negative behavior, negative behavior is given a very high weight and positive behavior a very low weight. The choice of $\chi_a$ and $\chi_d$ can be used to control the rate of trust redemption. If a system requires slower trust redemption, then lower values of $\chi_a$ and $\chi_d$ are necessary. We substitute these weighting factors into the four case-based equations discussed in Section 6. Since there is no fixed attack magnitude, we set the threshold to the midpoint of the trust value range $(-1, +1)$, i.e., $\Gamma_{\text{on-off}} = 0$. However, $\Gamma_{\text{on-off}}$ can be adjusted according to the requirements of the system. More conservative systems will have $\Gamma_{\text{on-off}} > 0$. Different values of $\chi_{cmin}$ and $\chi_{bmax}$ can be chosen to ensure more fairness to nodes in a network that is inherently susceptible to more bit flips due to noise.
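To see the size of the asymmetry these values produce, consider a worked example with illustrative numbers that are not taken from the paper's simulations: a cumulative average of $0.8$, followed by one anomalous slot with $w_{di}(t) = -0.5$ and then one cooperative slot with $w_{di}(t+1) = 0.9$. Each step uses the update form $w^{\text{mavg}}_{di}(t) = (1-\chi)\, w^{\text{mavg}}_{di}(t-1) + \chi\, w_{di}(t)$ with the factor of the applicable case and $\Gamma_{\text{on-off}} = 0$.

$$
\begin{aligned}
\text{Case (b):}\quad w^{\text{mavg}}_{di}(t) &= (1 - 0.999)(0.8) + 0.999(-0.5) = 0.0008 - 0.4995 = -0.4987 \\
\text{Case (c):}\quad w^{\text{mavg}}_{di}(t+1) &= (1 - 0.001)(-0.4987) + 0.001(0.9) \approx -0.4982 + 0.0009 = -0.4973
\end{aligned}
$$

A single anomalous slot moves the average by about $1.3$, whereas the following cooperative slot recovers only about $0.0014$.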

Comparison with Equal Weighted CWMA
In Fig. 13, we show how AWMA performs compared with CWMA. We observe that in Stage 1, with no attacks, both schemes preserve a high trust value. When attacks start at the 101st time slot and continue for the next 50 slots, AWMA ensures that the cumulative dependability score decreases more rapidly and stays low. On the other hand, CWMA is slow to react because the adversary behaved well in the first 100 slots. This happens because once the current value in a slot is less than 0, the proposed AWMA model forgets the previous high reputation by assigning it a very low weight, $1 - \chi_{bmax} = 0.001$, and assigns an extremely high weight, $\chi_{bmax} = 0.999$, to the current values from the 101st time slot, thus causing the cumulative trust in Stage 2 to decrease rapidly. Even at the end of Stage 5, when attacks have ceased for the last 200 slots, the dependability score given by the asymmetric average is low enough to reflect the adversary's malicious attacks, while the equal-weighted moving average fails to capture this outcome because the Off-On attack ratio is 2:1, i.e., there are more slots with no attacks. This happens because a previous cumulative trust of less than 0 (the selected $\Gamma_{\text{on-off}}$) at the end of Stage 2 is given a very high weight compared to the current honest behavior. This prevents the trust values from improving even during honest behavior.

Comparison with Exponential Weighted Moving Average
The major criticism of EWMA is that although it reacts quickly when attacks start, it forgets malicious behavior just as quickly. This is inappropriate because the system should not be allowed to redeem its trust value quickly unless it goes through a long period of honest behavior. The key difference arises in Case (c) of the On-Off defense scheme, where we assign a very low weight to honest behavior that follows a period of dishonest behavior. Hence the cumulative trust value hardly increases. In Fig. 14, we do not see much difference in Stage 1 because there are no attacks. There is also not much difference in Stage 2, as both models give more weight to new trust values. However, in Stage 3, EWMA allows the malicious device to quickly recover its trust value because it forgets old values. In contrast, our asymmetric average selectively retains old trust values that are low. This happens because a previous cumulative trust of less than 0 (the selected $\Gamma_{\text{on-off}}$) at the end of Stage 2 is given a very high weight compared to the current honest behavior, which prevents the trust values from improving even during the period of honest behavior. We see that in all subsequent stages the exponentially weighted averages oscillate between high and low values, whereas the asymmetric average preserves a low value while maintaining fairness by allowing a very slow increase of cumulative trust in Stage 5, owing to the continuous good behavior over 200 slots.

Comparison between higher and lower On-Off ratios
A 3:1 Off-On attack ratio is less aggressive than 2:1. Hence, after 500 slots, we should expect the 3:1 ratio to yield a higher dependability score. Note that a ratio as high as 1:1 is not characteristic of an On-Off attack, and an attack ratio that is too low hardly affects the system. In Fig. 15, we observe the differences in the dependability scores under 2:1 and 3:1 attack ratios.

CONCLUSIONS
In this paper, we proposed a Bayesian framework to maintain data integrity in an IoT network that is exposed to opportunistic data manipulation by adversaries. By considering an imperfect monitoring mechanism, we quantified the trustworthiness of the data being collected by an IoT hub through utility values obtained using prospect theory and expected utility theory. A comparison was drawn between these two theories considering their features and their applicability to trust measurements in an IoT network. According to the theoretical predictions and simulation experiments, prospect theory proved to be more promising for measuring trustworthiness of the aggregated data in risk averse systems. An asymmetric weighted moving average scheme is also proposed that can counter stealthy On-Off attacks. The proposed framework has been validated using extensive simulation experiments and the results bring out the efficacy of the proposed framework.