A Unified Framework for Fault Detection and Diagnosis Using Particle Filter

In this paper, a particle filter (PF) based fault detection and diagnosis framework is proposed. A system with possible faults is modeled as a group of hidden Markov models representing the system in the fault-free mode and in different failure modes, and a first-order Markov chain models the system mode transitions. A modified particle filter algorithm is developed to estimate the system states and mode. In this way, system faults are detected by estimating the system mode, and the size of the fault is diagnosed by estimating the system state. A new resampling method is also developed for running the modified PF efficiently. Two introductory examples and a case study are given in detail. The introductory examples demonstrate how to model a system with possible faults as a hidden Markov model and a Markov chain. The case study considers a numerical model with common measurement failure modes. It focuses on verifying the proposed fault detection and diagnosis algorithm and shows the behavior of the particle filter.


Introduction

Fault Detection and Diagnosis
Due to the increasing requirements on safety, reliability, and performance of control systems, a conventional feedback control design for a sophisticated system may result in unsatisfactory performance, or even instability, in the event of failures in actuators, sensors, or other system components. Thus, maintaining the performance of the system, or letting the system work in a degraded but safe condition when failures occur, is particularly important for safety-critical systems such as aircraft, submarines, nuclear power stations, chemical plants, and so on. In such systems, the consequences of a minor fault in a component or any loss of system functionality can be catastrophic. Therefore, the demand on reliability, safety, and fault tolerance is high.
Towards realizing a fault-tolerant system, the first step is to detect and diagnose faults. Hence, control engineers and system designers started to add fault detection and diagnosis (FDD) functions to the control system. There are two approaches to achieve FDD: the analytical approach and the heuristic approach. The analytical approach further consists of two methods: signal-processing-based and model-based (Isermann, 2006). The signal-processing-based method considers the time domain (statistical) and frequency domain features of the output signals. Typically, the threshold, amplitude, mathematical expectation, variance, correlation, and frequency spectrum (Fourier analysis) are inspected in the signal-processing-based method.
In addition, advanced analytical methods such as wavelet analysis, intelligent analysis (neural networks and fuzzy logic), cluster analysis, and some other methods from pattern recognition may also be helpful.
The model-based method is described in detail in Blanke et al. (2006). A system model, usually a state estimator such as a Kalman filter, is introduced into this scheme. The system outputs are then compared with the estimated outputs from the observer. FDD results can be deduced from the comparison residual. This paper follows the model-based analytical FDD idea, and it aims to extend the existing model-based methods into a more general framework by using the advantages of particle filters.
The particle filter (PF), which is derived from Bayesian estimation and Monte Carlo methods, has been studied for about two decades. The idea of a particle filter was first introduced by Gordon et al. (1993), then enriched by Kitagawa (1996), Liu and Chen (1998), and many other researchers. A good early tutorial is given by Arulampalam et al. (2002), and the field was more recently summarized by Doucet and Johansen (2009). The PF has drawn great attention since it was proposed because it is a powerful tool for solving optimal estimation problems in nonlinear non-Gaussian systems. The PF has been applied in target tracking, computer vision, digital communications, speech recognition, machine learning, and other areas; see Chen (2003) for more detail. The PF has also shown a strong capability to solve state estimation problems in nonlinear systems with non-Gaussian noise. This article intends to extend the usage of the PF to FDD problems.

An Introductory Idea - Fault Detection and Diagnosis with Filter Bank
In Gustafsson (2001), a linear system with possible faults is modeled as a time-varying linear system, whose dynamics are described by (1) and (2). In this model, $x$ is the state vector, $y$ is the measurement, $A$ is the state transition matrix, $B_u$ is the input matrix, $B_w$ is the noise input matrix, $C$ is the measurement matrix, $D_u$ is the direct input-to-measurement transition matrix, and $k$ always denotes the current time instance. $v$ and $w$ are the measurement noise and system noise vectors, respectively. These noise terms are assumed to be white and Gaussian, but there can be a DC component embedded. $\mathcal{N}(\mu, \sigma)$ is a multivariate Gaussian probability density function, where $\mu$ and $\sigma$ are the mean vector and covariance matrix, respectively.
The sign "∼" denotes that a random variable is "subject to" a probability density function (PDF). All the parameter matrices can be time-varying and conditioned on the system mode parameter $\delta$.
The parameter $\delta$ is important in this model. It is called the system mode parameter because the system switches its behavior as $\delta$ changes its value, provided that the parameter matrices depend on $\delta$ nontrivially. This system mode parameter is used to distinguish the fault-free and faulty behavior of the system in the following FDD framework.
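As a reading aid, a sketch of a switching linear model that is consistent with the symbol definitions above is given below. The placement of the time index on $\delta$ and the names of the noise parameters ($\mu_w$, $\sigma_w$, $\mu_v$, $\sigma_v$) are assumptions made for illustration only.

```latex
\begin{align}
x_{k+1} &= A_k(\delta_k)\, x_k + B_{u,k}(\delta_k)\, u_k + B_{w,k}(\delta_k)\, w_k, \tag{1}\\
y_k &= C_k(\delta_k)\, x_k + D_{u,k}(\delta_k)\, u_k + v_k, \tag{2}\\
w_k &\sim \mathcal{N}(\mu_w, \sigma_w), \qquad v_k \sim \mathcal{N}(\mu_v, \sigma_v).
\end{align}
```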
The model (1), (2) is linear Gaussian, so it satisfies most of the assumptions of a Kalman filter, except for the system mode parameter $\delta$. The Kalman filter is therefore still valid in this case if $\delta$ is assumed to be known. However, in FDD $\delta$ is the objective of the detection and diagnosis, so it is unknown. For these reasons, a Kalman filter bank based FDD scheme is given in Gustafsson (2001). In this scheme, the FDD problem is reformulated as the problem of estimating the system mode sequence $\delta_{1:k}$, that is, of obtaining the maximum a posteriori (MAP) estimate $\hat{\delta}_{1:k}$.
However, this model has its limitations:
1. The model is linear, but real systems are generally nonlinear. Moreover, as shown in the previous work by Zhao et al. (2012b), the measurement can also be nonlinear. It is therefore necessary to extend this model to the nonlinear case.
2. The noise terms are assumed to be Gaussian in the model, but this does not always fit the real world. As shown in Zhao et al. (2012a) and Zhao et al. (2014), although colored noise can be whitened by state augmentation, the measurement noise may still be Rice distributed and the driving noise may be a Gaussian mixture.
To address these limitations of the model (1), (2), a generalized system model is defined as a combination of a hidden Markov model and a Markov chain. Several modeling examples are given in Section 3. Then a modified particle filter algorithm is proposed in Section 4 to estimate the system states and system mode in the generalized model. Two case studies in different scenarios are given in Section 5 as validation of the proposed algorithm. Finally, the conclusion of this paper is given.

A Generalized Model
In this paper, a system is described by a switching-mode hidden Markov model (HMM) given by (6) and (7). Here (6) is the system equation, which defines how the states propagate depending on the input and the system mode. Moreover:
• $X$ is the state vector, a random vector in $\mathbb{R}^{N_x}$.
• $x_{k+1}$ and $x_k$ are the realizations of the corresponding state vectors.
• $u_k \in \mathbb{R}^{N_u}$ is the known input to the system at time $k$.
• $\delta_k$ is the parameter representing the system mode, where each entry $\Delta_i \in \{0, 1\}$ is a binary representation of the occurrence of a fault, such that $\delta_k \in \{0, 1\}^{N_m}$. The detailed discussion about this system mode parameter is given in the next section.
• The state transition mapping is real-valued and maps the states, input, and system mode at the current time to the PDF of the states at the next time instance, and $p(\cdot)$ is a probability measure on $\mathbb{R}^{N_x}$.
Equation (7) is the measurement equation. It defines the relation from the current states, input, and system mode to the observation, where:
• $Y_k \in \mathbb{R}^{N_y}$ is a random vector, representing the observation.
• $y_k \in \mathbb{R}^{N_y}$ is the measurement at time instance $k$.
• $h_k(\cdot) : \mathbb{R}^{N_x} \times \mathbb{R}^{N_u} \times \{0, 1\}^{N_m} \to \mathbb{R}$ is the measurement mapping, which maps the states, input, and system mode at the current time to the PDF of the measurement. This mapping corresponds to the measurement equation in the state space model, and $p(\cdot)$ is a probability measure on $\mathbb{R}^{N_y}$.
This model is a generalization of (1), (2), since it can describe a nonlinear system subject to non-Gaussian noise conditions. For example, the model (1), (2) can be written in the HMM form (6) and (7) by using the corresponding Gaussian transition and measurement densities.
The FDD problem is solved with an HMM by estimating the system mode sequence $\delta$. For instance, assume that the sequence $\delta$ has been estimated, and suppose it is found that $\delta_i$ is nontrivial for $i = k - l, \dots, k$. It can then be concluded that a fault has occurred, and that the fault happened between time instances $k - l - 1$ and $k - l$.
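The reasoning in the last paragraph can be illustrated with a short sketch. It assumes that an estimated mode sequence is already available as a list of binary tuples and that "nontrivial" means "at least one entry equal to 1"; the function name `fault_onset` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fault_onset(modes: Sequence[Tuple[int, ...]]) -> Optional[Tuple[int, int]]:
    """Bracket the onset of the fault that is active at the last time index.

    `modes[i]` is the estimated mode at time i (assumption: an all-zero
    tuple is the fault-free mode). Returns (k - l - 1, k - l), the pair of
    time indices between which the fault occurred, or None if the latest
    mode is fault-free. An onset at index 0 is reported as (-1, 0).
    """
    k = len(modes) - 1
    if k < 0 or not any(modes[k]):
        return None
    start = k
    # Walk backwards through the trailing run of nontrivial modes.
    while start > 0 and any(modes[start - 1]):
        start -= 1
    return (start - 1, start)


estimated = [(0, 0), (0, 0), (0, 1), (0, 1)]
onset = fault_onset(estimated)  # fault occurred between indices 1 and 2
```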
It should be highlighted that although this model is very general, it is not universal. For instance, while (6) and (7) are discrete, physical systems are continuous. Discretization is necessary for the physical systems to fit the model. The cost we then pay is the discretization error.
A further problem is that a system with its failure modes may be modeled in different ways within this framework. This implies that the model (6) and (7) is too general and flexible to serve as a canonical form. Hence, the model should be more specialized. More discussion regarding the canonical form will be given in the next section after some further examples and analysis.

Modeling Examples and Discussion
Modeling a system in the form (6) and (7) is not difficult, because the knowledge and methods of modeling a system in state space form can be directly inherited. Regarding nonlinearity and non-Gaussian noise, it is possible to retain the nonlinearity of the system directly rather than linearizing the system as the extended Kalman filter does, and the general model is also compatible with non-Gaussian noise through a suitable construction of the mappings. We give some examples to show how systems with their possible faults are modeled in the HMM form. These examples are motivated by the examples in Gustafsson (2001).

One-time Changing Mean Model
Consider the case of an unknown DC component embedded in white noise. Suppose that we want to test the hypothesis that the DC component has changed at some unknown time instant. We can then model the signal by (10), where $\sigma(\cdot)$ is the step function, $\theta_1$ and $\theta_2$ are the DC components embedded in the white noise before and after the change, $l$ is the change time, and $v_k$ is the white measurement noise. If all possible change instants are to be considered, the variable $l$ takes its value from the set $\{0, 1, \dots, k - 1, k\}$, where $l = k$ should be interpreted as no change.
We can rewrite (10) as (11), with the system mode parameter defined in (12), where $\mathcal{N}(\mu, \sigma)$ is the scalar Gaussian probability density function.
However, the new model (11) is not equivalent to (10), because the system mode in (11) can switch multiple times between $[1\ 0]$ and $[0\ 1]$, while it can change only once in (10). So some restriction must be imposed on the system mode parameter in (11), such that $\delta$ can only change from $[1\ 0]$ to $[0\ 1]$, but never in the opposite direction. This can be described mathematically by the Markov chain (13) (see also Figure 1), where $p_{11}$ is the probability that the value of the DC component remains the same between time steps, while $p_{12}$ is the probability that the value of the DC component changes from $\theta_1$ to $\theta_2$ between time steps. The transition probabilities 0 and 1 are set according to the assumption that the DC component can change only once in the whole sequence.
The combined model of (11) and (13) is now almost equivalent to the model (10). However, the model (11) and (13) is more precise than (10). This is because the combined model also defines the system mode transition probabilities, while in (10) the probability of the mode switching is undefined, meaning that the a priori information $p_{11}$ and $p_{12}$ cannot be used in the estimation.
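A minimal simulation sketch of the combined model is given below. It assumes a scalar signal, mode index 0 for $[1\ 0]$ and 1 for $[0\ 1]$, and a transition matrix with rows $[p_{11}, p_{12}]$ and $[0, 1]$; the function and parameter names are hypothetical.

```python
import numpy as np


def simulate_one_time_change(n, theta1, theta2, p12, noise_std, seed=0):
    """Simulate a DC level in white noise that can change at most once.

    Mode 0 corresponds to [1 0] (level theta1), mode 1 to [0 1] (level
    theta2). Mode 1 is absorbing, which encodes the one-change assumption.
    """
    rng = np.random.default_rng(seed)
    transition = np.array([[1.0 - p12, p12],   # p11, p12
                           [0.0, 1.0]])        # no way back from mode 1
    mode = 0
    modes = np.empty(n, dtype=int)
    y = np.empty(n)
    for k in range(n):
        modes[k] = mode
        mean = theta1 if mode == 0 else theta2
        y[k] = mean + noise_std * rng.standard_normal()
        # Draw the next mode from the row of the current mode.
        mode = rng.choice(2, p=transition[mode])
    return modes, y


modes, y = simulate_one_time_change(200, theta1=0.0, theta2=1.0,
                                    p12=0.02, noise_std=0.1)
```

Because the second row of the transition matrix is $[0, 1]$, the simulated mode sequence is non-decreasing, which is exactly the restriction that distinguishes the combined model from (11) alone.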

Segmentation in Changing Mean Model
We can extend the above scenario to a more general case by relaxing the assumptions of prior knowledge of the DC values and the number of changes. In this case it is convenient to take the DC level as the system state, and thereby model the signal as in (14) and (15), where $\rho(\cdot)$ is the Dirac measure, $U(\theta_{\inf}, \theta_{\sup})$ is the uniform distribution between $\theta_{\inf}$ and $\theta_{\sup}$, and $\Delta_k$ follows the definition in the previous example. $\Delta_k = [1\ 0]$ represents that there is no change at time $k$, while $\Delta_k = [0\ 1]$ represents that there is a change at time $k$.
The mode transition can be defined according to the frequency of the mode change, as in (16).
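The segmentation model can be sketched in the same way. The sketch below assumes that a change occurs independently at each step with a fixed probability `change_prob`, which is only one possible choice for the mode transition; all names are hypothetical.

```python
import numpy as np


def simulate_segmentation(n, theta_inf, theta_sup, change_prob, noise_std, seed=0):
    """Simulate a piecewise-constant DC level observed in white noise.

    The level is the system state. Without a change it is copied unchanged
    (Dirac transition); with a change it is redrawn from the uniform
    distribution U(theta_inf, theta_sup).
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(theta_inf, theta_sup)
    thetas = np.empty(n)
    changes = np.zeros(n, dtype=bool)
    y = np.empty(n)
    for k in range(n):
        if k > 0 and rng.random() < change_prob:
            theta = rng.uniform(theta_inf, theta_sup)  # mode [0 1]: change
            changes[k] = True
        thetas[k] = theta                              # mode [1 0]: no change
        y[k] = theta + noise_std * rng.standard_normal()
    return thetas, changes, y


thetas, changes, y = simulate_segmentation(300, -1.0, 1.0,
                                           change_prob=0.01, noise_std=0.1)
```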

Example 2: Decayed Input
In a control system, the output from the actuators can be less than the required value, typically due to aging of the actuators. This input reduction may happen suddenly, but it changes slowly once it has occurred. We can model this fault as in (17) and (18), where $\alpha$ is a vector whose entries are between 0 and 1, indicating the reduction rate of each channel of the system input. The dynamics of $\alpha$ are modeled in (18), where $g(\alpha_k)$ can be adjusted such that the entries in $\alpha_{k+1}$ are less than or equal to the corresponding entries in $\alpha_k$. $\Delta_k = [1\ 0]$ represents that the input channels are fault-free, while $\Delta_k = [0\ 1]$ represents that there is an input reduction.

Discussion of the General Model
We recall the general model (20), (21), and (22). As shown in the previous examples, it is necessary and beneficial to model the transition of the system mode parameter to obtain a more accurate and more informative model. On the other hand, both examples above consider single failure cases. In practice, however, we may encounter multiple failure cases, in which different faults occur at the same time. For this purpose, we can assign a state $\Delta^{f(i)} \in \{0, 1\}$ denoting whether fault "$f(i)$" occurs in the system, so that the combination of these states, a vector in $\{0, 1\}^{N_m}$, describes the failures in the whole system. Note that here we are using the capital characters $\boldsymbol{\Delta}$ and $\Delta$ for the discrete random vector and variable of the system mode, while their realizations are denoted by $\boldsymbol{\delta}$ and $\delta$.
The Markov chain is suitable for modeling the characteristics of these faults. It is not difficult to combine the Markov chain model of the mode transition with the HMM of the states. First we define the extended state vector $\xi_k$, which consists of the system state and the system mode parameter. Then the unified system model can be obtained by reformulating equations (20), (21), and (22) as (24) and (25), where $i, j \in \{1, \dots, N_m\}$. The model (24) and (25) will be used in the following derivation of the PF-based FDD algorithm.
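To make the multiple-fault mode space and the extended state concrete, a small sketch follows. The `ExtendedState` container and the function name are hypothetical and only illustrate how a continuous state and a binary mode vector can be carried together.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass
class ExtendedState:
    """Extended state: continuous system state plus discrete system mode."""
    x: Tuple[float, ...]      # realization of the system state
    delta: Tuple[int, ...]    # realization of the mode vector in {0, 1}^Nm


def enumerate_modes(n_faults: int) -> List[Tuple[int, ...]]:
    """List every combination of fault indicators in {0, 1}^n_faults."""
    return [tuple(bits) for bits in product((0, 1), repeat=n_faults)]


all_modes = enumerate_modes(3)   # 2**3 = 8 combined modes
fault_free = all_modes[0]        # (0, 0, 0): no fault present
particle = ExtendedState(x=(0.0, 0.0), delta=(0, 1, 0))
```

The number of combined modes grows as $2^{N_m}$ with the number of binary fault indicators, which is one reason why restricting the mode transitions with a Markov chain is useful.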
4 Particle Filter for Fault Detection and Diagnosis

The Algorithm of the PF for FDD
The algorithm of the PF proposed here is adapted from the sampling importance resampling (SIR) PF in Arulampalam et al. (2002). Note, however, that different Monte Carlo based Bayesian estimation algorithms can be used to solve the estimation problem for (24) and (25). The SIR PF presented here is our prototype.
Figure 2 shows the process of the PF algorithm.

Inherit from the Last Cycle
The PF works in a recursive manner. At time $k$, it inherits $p(\xi_{k-1} \mid y_{1:k-1})$, which is the posterior approximation of the distribution of the extended state vector given the observations up to the previous time instance $k - 1$. In this PF, because there are both continuous and discrete states in the system, the posterior distribution $p(\xi_{k-1} \mid y_{1:k-1})$ is expressed by (26), where $N_{s,k-1}$ is the number of particles at time instance $k - 1$, $\rho(\cdot)$ is the Dirac function, and $\rho_{i,j}$ is the Kronecker delta function. A conceptual illustration of the posterior can be seen in Figure 2.

Time Update
The time update process obtains the prior estimate of the states, $p(\xi_k \mid y_{1:k-1})$. In the PF context, this is done by drawing samples from a so-called importance density. The SIR PF uses the most convenient importance density, $p(\xi_k \mid \xi_{k-1}^{(i)})$ defined by (24), to derive the prior density.
Since there is a dependence in (24), the time update process has to be divided into two steps: the system mode time update, corresponding to Action 1 in Figure 2, and the system states time update, corresponding to Action 2. Intuitively, the system mode should be updated first, and then the system states are updated, since they are mode-dependent. That is, for each particle, we first determine its mode at the current time instance by drawing a sample from the mode transition Markov chain, and then we determine the system states by drawing a sample from the corresponding mode-dependent state transition density. The density from which the samples are drawn is then equivalent to the importance density, where the mode transition probability does not depend on the state, because $\delta_k$ is independent of $x_k$. At the end of the time update step, we obtain the new positions of the particles $\xi_k^{(i)}$.
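The two-step time update described above can be sketched as follows. This is a minimal illustration, assuming modes are stored as integer indices, the mode chain is given as a row-stochastic matrix, and the mode-dependent state transition is supplied as a callable; `time_update`, `propagate`, and `random_walk_in_fault` are hypothetical names.

```python
import numpy as np


def time_update(x, modes, transition, propagate, rng):
    """One SIR time update for particles with continuous and discrete parts.

    x          : (N, d) float array of particle states
    modes      : (N,) int array of particle mode indices
    transition : (M, M) row-stochastic mode transition matrix
    propagate  : callable (x_subset, mode, rng) -> new x_subset
    """
    n_modes = transition.shape[0]
    # Step 1: mode update by inverse-CDF sampling from each particle's row.
    cdf = np.cumsum(transition[modes], axis=1)
    u = rng.random(len(modes))
    new_modes = (u[:, None] >= cdf).sum(axis=1)
    new_modes = np.minimum(new_modes, n_modes - 1)  # guard against rounding
    # Step 2: state update, conditional on the freshly drawn mode.
    new_x = np.empty_like(x)
    for m in range(n_modes):
        idx = new_modes == m
        if idx.any():
            new_x[idx] = propagate(x[idx], m, rng)
    return new_x, new_modes


def random_walk_in_fault(x, mode, rng):
    """Illustrative dynamics: static in mode 0, random walk in mode 1."""
    step = rng.standard_normal(x.shape) if mode == 1 else 0.0
    return x + step


rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1], [0.0, 1.0]])
x, modes = time_update(np.zeros((100, 2)), np.zeros(100, dtype=int),
                       P, random_walk_in_fault, rng)
```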

Measurement Update
At this step the particle weights are updated according to the observation at the current time instance $k$. The weights are updated according to Bayes' law: if a particle's prior $\xi_k$ is supported by the observation, then the weight of the particle should increase, and vice versa.
Given the observation $y_k$ at the current time, and using the importance density $p(\xi_k \mid \xi_{k-1}^{(i)})$, the weight of each particle is updated in proportion to the likelihood $p(y_k \mid \xi_k^{(i)})$, which is defined by (25). This corresponds to Action 5 in Figure 2.
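A minimal sketch of this weight update is shown below. It assumes an isotropic Gaussian likelihood with the same standard deviation for every particle, so that the normalizing constant cancels; in the general model the likelihood also depends on the particle's mode. The function name is hypothetical.

```python
import numpy as np


def measurement_update(weights, predicted_y, y, noise_std):
    """Multiply each weight by the likelihood of y and renormalize.

    weights     : (N,) prior particle weights
    predicted_y : (N, d) measurement predicted by each particle
    y           : (d,) actual measurement
    noise_std   : standard deviation of the (assumed isotropic) noise
    """
    resid = y - predicted_y
    log_lik = -0.5 * np.sum(resid ** 2, axis=1) / noise_std ** 2
    with np.errstate(divide="ignore"):       # allow zero prior weights
        log_w = np.log(weights) + log_lik
    log_w -= log_w.max()                     # stabilize the exponentials
    w = np.exp(log_w)
    return w / w.sum()


prior = np.full(3, 1.0 / 3.0)
predicted = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
posterior = measurement_update(prior, predicted, np.array([0.0, 0.0]), 1.0)
```

Working in the log domain avoids numerical underflow when a particle is far from the observation.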

A Modified Resampling Algorithm
Degeneracy is a common problem in PFs, where after a few iterations all but one particle will have negligible weight (Arulampalam et al., 2002). Doucet et al. (2000) show that the variance of the importance weights increases over time, and thus it is impossible to avoid the degeneracy phenomenon. It is common to employ a resampling step in the PF algorithm to counteract the degeneracy. Examples are multinomial resampling in Smith and Gelfand (1992), residual resampling in Liu (1996), and systematic resampling in Carpenter et al. (1999).
However, these resampling methods are not suitable in our case. To understand the problem, consider (24) and assume that one mode has the marginal mass $\Pr(\Delta_k^{f(p)} = 1) = 0.0001$ ($p \in \{1, \dots, N_m\}$). Then this mode rarely occurs in the system. To accurately describe the distribution, it is common to use more than $O(10^1)$ particles. So, if 10 particles, which is a very low value, are used to represent the density $p(x_k \mid \Delta_k^{f(p)} = 1)$, then on average $10 / 0.0001 = 10^5$ particles in total are needed to describe the entire density $p(\xi_k \mid y_{1:k})$. Such a large number of particles is computationally expensive.
A desirable resampling algorithm should be able to generate a suitable number of particles for each system mode. Since the probability of a fault occurring in a system is generally low, the particles in the faulty modes will have light weights. With a standard resampling method, these light-weighted particles are unlikely to survive. Thus, the number of particles regenerated by the resampling algorithm should keep the computational cost moderate, while the particles in the unlikely system modes should still represent the conditional density of the states in these modes.
For these reasons, an adaptive resampling method is proposed.This may give enough samples for the modes with small marginal probability mass, and at the same time it can restrict the total numbers of particles.Essentially this modified resampling algorithm must adaptively determine the number of samples in each system mode according to their significance, and then make a compromise between the computational complexity and the estimation performance.
First, define $N_s$ as the minimum number of particles required for sufficiently representing a system mode. Then there will be at least $N_s N_m$ particles in the PF. So, if there are too many modes in the system, a large number of particles cannot be avoided.
Consider those modes with significant marginal probability mass. If a system mode has a significant marginal probability mass, the system is likely to be working in this mode. These modes are named significant modes. Define $\tilde{N}_s$ as the number of particles suitable for estimation. Then $N_s (N_m - 1) + \tilde{N}_s$ should be an acceptable sample size for computing, as explained at the end of this section. Thus, we distribute $\tilde{N}_s$ particles over the system modes according to their marginal probability mass, $N_{s,\delta^{(p)},k} = \lceil \tilde{N}_s \Pr(\Delta_k = \delta^{(p)} \mid y_{1:k}) \rceil$, where $\delta^{(p)}$ goes through all elements in $\{0, 1\}^{N_m}$, $N_{s,\delta^{(p)},k}$ is the number of particles required in mode $\delta^{(p)}$ in the resampling step, and $\lceil a \rceil$ is the smallest integer that is not less than $a$. For any $N_{s,\delta^{(p)},k} \le N_s$ we set $N_{s,\delta^{(p)},k}$ to $N_s$.
After assigning the number of samples $N_{s,\delta^{(p)},k}$, resampling can be done mode by mode with any standard method. The adaptive resampling results in a group of particles with uneven weights $\Pr(\Delta_k = \delta^{(p)} \mid y_{1:k}) / N_{s,\delta^{(p)},k}$, depending on the mode $\delta^{(p)}$ that each particle belongs to.
One can then derive a lower bound on the estimated effective sample size (Doucet et al., 2000), $N_{\mathrm{eff}} = 1 / \sum_i (w_k^{(i)})^2$, referred to as (29). By construction, $N_{s,\delta^{(p)},k} \ge \tilde{N}_s \Pr(\Delta_k = \delta^{(p)} \mid y_{1:k})$ for each $\delta^{(p)}$. Thereby, the sum of the squared weights satisfies $\sum_p \Pr(\Delta_k = \delta^{(p)} \mid y_{1:k})^2 / N_{s,\delta^{(p)},k} \le 1 / \tilde{N}_s$. Substituting this into (29) yields $N_{\mathrm{eff}} \ge \tilde{N}_s$. Thus, the proposed adaptive resampling method keeps the effective sample size larger than or equal to $\tilde{N}_s$. The least efficient case happens when one of the modes has the posterior probability 1. For instance, suppose that there are $N_m = 24$ modes in the system, $\tilde{N}_s = 1000$, and $N_s = 100$. In the worst case we still have $N_s (N_m - 1) + \tilde{N}_s = 3300$ particles in the system. However, $N_s (N_m - 1)$ of these have null weights. This gives the effective sample size $N_{\mathrm{eff}} = \tilde{N}_s = 1000$, and we waste computation on the other $N_s (N_m - 1) = 2300$ particles.
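The allocation rule and the effective sample size bound can be checked numerically with the sketch below. It assumes multinomial resampling inside each mode and skips modes that currently hold no particles, which is a simplification not discussed in the text; all function names are hypothetical.

```python
import math
import numpy as np


def allocate_particles(mode_probs, n_tilde, n_min):
    """Particles per mode: ceil(n_tilde * mass), but never fewer than n_min."""
    return [max(math.ceil(n_tilde * p), n_min) for p in mode_probs]


def effective_sample_size(mode_probs, counts):
    """N_eff = 1 / sum(w^2) when each particle in a mode has weight p / n."""
    return 1.0 / sum(p * p / n for p, n in zip(mode_probs, counts))


def adaptive_resample(x, modes, weights, n_modes, n_tilde, n_min, rng):
    """Resample mode by mode so that rare modes keep at least n_min particles."""
    new_x, new_modes, new_w = [], [], []
    for m in range(n_modes):
        idx = np.flatnonzero(modes == m)
        if idx.size == 0:
            continue  # simplification: nothing to copy from in this mode
        mass = weights[idx].sum()
        n_m = max(math.ceil(n_tilde * mass), n_min)
        if mass > 0:
            cond = weights[idx] / mass
        else:
            cond = np.full(idx.size, 1.0 / idx.size)
        pick = rng.choice(idx, size=n_m, p=cond)  # multinomial within mode
        new_x.append(x[pick])
        new_modes.append(np.full(n_m, m))
        new_w.append(np.full(n_m, mass / n_m))    # uneven weights across modes
    return np.concatenate(new_x), np.concatenate(new_modes), np.concatenate(new_w)


# Worst case from the text: 24 modes, one of them with posterior probability 1.
worst = [1.0] + [0.0] * 23
counts = allocate_particles(worst, n_tilde=1000, n_min=100)
total = sum(counts)
n_eff = effective_sample_size(worst, counts)
```

In this worst case `total` equals 3300 and `n_eff` equals 1000, in agreement with the numbers quoted above.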

Fault Detection and Diagnosis with Particle Filter
The system is considered to be working in mode $\Delta_k = \delta^{(p)}$ when the marginal probability mass $\Pr(\Delta_k = \delta^{(p)} \mid y_{1:k})$ becomes significant. This probability mass is obtained by marginalizing the distribution $p(\xi_k \mid y_{1:k})$ along $x_k$. We provide two methods to detect the failure mode. One may define a threshold $h_{\delta^{(p)}}$ for each system mode. Once the estimated marginal mass of a mode other than the fault-free mode exceeds its threshold $h_{\delta^{(p)}}$, the system is considered to suffer from the corresponding failure mode. That is, when $\Pr(\Delta_k = \delta^{(p)} \mid y_{1:k}) \ge h_{\delta^{(p)}}$, failure mode $\delta^{(p)}$ is reported to have occurred between time instants $k - 1$ and $k$.
An alternative way is to simply draw the conclusion by picking the most significant mode $\delta^{(m)}$, where $\Pr(\Delta_l = \delta^{(m)} \mid y_{1:l}) \ge \Pr(\Delta_l = \delta^{(p)} \mid y_{1:l})$, $\forall \delta^{(p)}$, $p \ne m$. Once the most significant mode $\delta^{(m)}$ is observed to be other than the fault-free mode, the fault is detected.
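Both detection rules operate on the marginal mode masses, which for a weighted particle set are simply the sums of the weights per mode. The sketch below assumes integer mode indices with index 0 as the fault-free mode; the function names and the threshold values are hypothetical.

```python
import numpy as np


def mode_marginals(modes, weights, n_modes):
    """Marginal probability mass of each mode: sum of its particle weights."""
    return np.bincount(modes, weights=weights, minlength=n_modes)


def detect_by_threshold(marginals, thresholds, fault_free=0):
    """Return every faulty mode whose marginal mass reaches its threshold."""
    return [m for m in range(len(marginals))
            if m != fault_free and marginals[m] >= thresholds[m]]


def detect_by_most_significant(marginals, fault_free=0):
    """Return the most significant mode, or None if it is the fault-free one."""
    m = int(np.argmax(marginals))
    return None if m == fault_free else m


modes = np.array([0, 0, 1, 2])
weights = np.array([0.1, 0.1, 0.7, 0.1])
marginals = mode_marginals(modes, weights, n_modes=3)
alarms = detect_by_threshold(marginals, thresholds=[1.0, 0.5, 0.5])
most_significant = detect_by_most_significant(marginals)
```

The threshold rule can report several modes at once and lets each mode have its own sensitivity, while the second rule always reports at most one mode and needs no tuning.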

Detection of Common Failure Modes in Position-like Measurement
In industrial control systems, typical failure modes in measurements are
• bias - the measurement has a constant-like bias relative to the true signal;
• drift - the measurement drifts off relative to the true signal, either by a stochastic process (Wiener process) or deterministically (ramp); and
• outliers - a sample from a measurement signal that lies abnormally far away from the other values.
For safety and reliability, sensor redundancy is commonly required in industrial applications. That is, multiple sensors are installed to measure the same system output or state. The fault detection and diagnosis in this case should focus on monitoring and identifying the conditions of the sensors.
In this simulation, it is assumed that the true state being measured is [0, 0], but it is embedded in zero mean white Gaussian noise.The above listed three failure modes will then occur, but it is assumed that they do not occur simultaneously.
It should be noted that the change detection proposed above on a zero-mean white noise sequence is not trivial. A two-dimensional zero-mean white noise sequence can be interpreted as the residual obtained by comparing homogeneous measurements from two sensors of the same two-dimensional state, for instance the position of a surface vessel. Thus, this detection and diagnosis problem can easily be extended to a wide range of applications.

System Modeling
To model the signal with the above-mentioned failure modes in the form of (24) and (25), we first construct the state vector $x$, consisting of the trivial position and the possible bias and drift terms. So, we assign $x_k = [p_k^\top\ b_k^\top\ d_k^\top]^\top$, where $p_k \equiv [0\ 0]^\top$ reflects the zero-mean assumption, $b_k \in \mathbb{R}^2$ is the bias term, and $d_k \in \mathbb{R}^2$ is the drift term. According to the different failure modes, we may define the dynamics of the system and measurement as in Table 1. Here, $p_{m,k}$ denotes the measurement, $v_k$ is the measurement noise in the fault-free condition, and $\tilde{v}_k$ is the measurement noise in the outliers failure mode. The measurement noise terms will be discussed in detail later. Correspondingly, $\Delta_0$ to $\Delta_3$ are assigned to the four system modes, $b_0$ is the bias, which is subject to the prior probability $p(b)$, and $w_{d,k}$ is the driving noise for the drift.

Table 1: System equations and measurement equations in each system mode.

The measurement noise $v_k$ in modes $\Delta_0$, $\Delta_1$, and $\Delta_2$ is assumed to be white, subject to $\mathcal{N}(0, I_{2 \times 2})$. System mode $\Delta_1$ represents the bias fault. The bias is modeled as a step sequence, where $l$ is the time when the bias fault occurs, and $b_0$ is uniformly distributed in the region $b_{in} \setminus b_{out} \subset \mathbb{R}^2$. Here, $b_{in}$ covers the region where the bias should be, and the exclusion of $b_{out}$ is used to distinguish the fault-free mode from the bias mode. System mode $\Delta_2$ represents the drift failure mode. The drift term $d_k$ is modeled as a random walk, where $l$ is the time when the drift fault occurs, and $w_{d,0}$ is uniformly distributed in the region $w_{d,in} \setminus w_{d,out} \subset \mathbb{R}^2$, similar to the bias case. The outliers can conceptually be seen as measurements with unusually large measurement noise. Hence, we can use $\tilde{v}_k$, belonging to a distribution with much larger variance than $v_k$, to model this phenomenon. To simplify the calculation in the particle filter, a simplified expression is used to assign the weights when updating the weights of the particles. This simplification loses the probability nature, since $\int_{\mathbb{R}^2} p(\tilde{v}_k)\, d\tilde{v}_k$ is infinite. However, it reflects the characteristics of outliers, where the measurement shows a significant jump, and it is easier to calculate.

Table 2: The Markov chain for the transition of the combined mode $\Delta_1 \Delta_2$.
This Markov chain gives a more detailed description of the system behavior, but in an implicit manner. The system is assumed never to run into a multiple fault case, that is, a case in which more than one fault occurs at the same time. The transition probabilities to such modes are therefore zero and are omitted from Table 2. Another restriction on the mode transition is that the system mode can transfer to a fault case only from the fault-free mode. This is motivated by the assumed system behavior, and it helps when diagnosing an ambiguous fault (e.g., a small bias can be mistaken for a drift). Designing the transition probabilities related to the bias and drift modes is tricky. Compared to the outliers, which are isolated events, detecting the bias and drift failures is close to estimating a time sequence. Hence, the transition probabilities from these modes back to the fault-free mode are small, so that the particles stick to these modes and persist in them.

Results
The simulation results in this section show the detection and diagnosis performance of the proposed algorithm for the different fault cases. In all the simulations, the PF adopts the same structure and parameters, and the number of particles is set to 1000. The mode transition Markov chain is given in Table 2. The covariance matrix of the measurement noise $v_k$ takes the value

Detection of Outliers
Figure 3 shows the detection results of outliers in the white Gaussian noise sequence. In this simulation three different sizes of outliers are triggered: [1.5 0.5] from 200 sec to 400 sec, [2 1] from 400 sec to 600 sec, and [3 1] from 600 sec to 800 sec. The sizes of the outliers are subject to Gaussian noise with covariance $0.2 \times I_{2 \times 2}$.
It can be concluded from the result that the PF effectively detects outliers of sufficiently large size. If the outliers are regarded as the signal, the signal-to-noise ratios (SNRs) for the different outliers are 1.25, 2.5, and 5, respectively. As the SNR increases, the PF's detection frequency of outliers also increases. An SNR of 2.5 is a critical value in the sense that the detection rate of the outliers is about 50%.

Drift Detection and Estimation
The drift speed is relatively low compared to the variance of the noise. As shown in Figure 4, the drift is detected when the marginal probability mass of the drift mode exceeds the marginal probability mass of the fault-free mode at about 145 sec, which is about 45 seconds after the fault is triggered. Because the drift speed is slow, the time to detection is relatively long. The detection happens when the SNR reaches 1.01, which is lower than the critical SNR of 2.5 in the outliers case. Besides the consideration of sensitivity, this benefits from the fact that the PF makes good use of the historical data to estimate the trend of the drift. Instead of examining each measurement independently of the previous measurements, the detection of drift is like estimating a ramp sequence starting at an unknown time. However, not all historical information is used in the detection, since part of the information is lost when particles transfer from the drift mode to the fault-free mode. This information leak is an inherent characteristic of this algorithm, and it cannot be avoided by increasing the number of particles (unless we unrealistically let the number of particles increase with time).
In Figure 4 the bias mode has the largest marginal probability mass from 195 sec to 265 sec, which means that the PF diagnoses the current fault as a bias instead of a drift. This misdiagnosis is due to the behavior of the fault at this stage, which can be taken as either a bias or a drift. After 265 sec, the PF has accumulated enough information to distinguish these faults. Although there is a misdiagnosis during the process, the fault detection is always successful, since the marginal probability mass of the fault-free mode is always low, and the estimate of the drift size is accurate.
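The SNR values quoted for the outliers and for the drift at detection time can be reproduced with a short calculation. The definition used below, squared fault magnitude divided by the trace of the noise covariance, is an assumption chosen because it matches the quoted numbers under an identity noise covariance; the function name is hypothetical.

```python
import numpy as np


def snr(fault_size, noise_cov):
    """Assumed SNR definition: squared fault magnitude over total noise variance."""
    s = np.asarray(fault_size, dtype=float)
    return float(s @ s / np.trace(noise_cov))


noise_cov = np.eye(2)  # assumption: N(0, I) measurement noise
outlier_snrs = [snr(size, noise_cov) for size in ([1.5, 0.5], [2, 1], [3, 1])]

# Drift of [0.03, -0.01] per second, detected about 45 s after onset.
drift_at_detection = 45 * np.array([0.03, -0.01])
drift_snr = snr(drift_at_detection, noise_cov)
```

Under these assumptions the outlier SNRs evaluate to 1.25, 2.5, and 5, and the drift SNR at detection is approximately 1.01.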

Bias Detection and Estimation
Figures 6 and 7 show the detection and estimation of a bias fault. The size of the bias is [3, −1], such that the SNR of the fault is 2.5. The bias is triggered at 100 sec, and the detection is successful, since the marginal probability mass of the bias mode is almost always the largest. The exceptions are the few points where the measurement is close to the origin because of the noise.

Conclusion
In this paper, a PF-based fault detection and diagnosis framework was proposed. In this framework, the system with possible failures is modeled as a group of hidden Markov models, representing the system in the fault-free mode and in different failure modes, and a Markov chain, representing the transition between modes. This combined HMM and MC model can be used as a canonical model for a wide range of systems, since the HMM is compatible with the state-space model in the control context, and the MC is also a generalized model of the system mode transition. With two introductory examples, we showed how to model a system in this form. A modified PF algorithm was introduced to estimate the system mode and the states of the canonical model, and at the same time solve the FDD problem. This new PF algorithm extends PFs to hybrid spaces of continuous and discrete components. A new resampling algorithm was developed along with the PF, to enhance the efficiency of the PF.
The proposed method is suitable for systems where the propagation of the distributions in fault-free and faulty conditions is known and can be properly modeled as a switching-mode hidden Markov model. If the propagation of the distributions is not accurately known, rough approximations need to be used instead. These inaccurate models will affect the fault detection and diagnosis performance of the proposed method. To enhance the robustness of this method, a CUSUM algorithm (Blanke et al., 2006) can be applied in addition. For example, we can calculate the time accumulation of the posterior probability of each mode, and compare these with each other or against predefined thresholds to detect the faults.
Finally, a case study regarding the proposed FDD scheme for a faulty system was given in detail. The case study demonstrates how to apply this PF-based FDD method, and it also verifies the method by simulations. The example is instructive and can be used as a template for developing new applications based on the same algorithm.

Figure 1: The Markov chain of mode transition.

Figure 2: One cycle of the PF, divided into 3 steps by chain lines. These steps correspond to the inherit, time update, and measurement update steps, respectively. The resampling step is not included in this figure.

Figure 4: Drift detection. The upper graph shows the original signal and the estimation of the drift by the PF. The lower graph shows the abundance of the particles in each system mode.
Figures 4 and 5 show the detection and estimation of a drift fault. The drift speed is set to [0.03 −0.01] meters per second.

Figure 5: The estimated empirical distribution of the first component of $d_k$, and the estimated mean as the curve in green.

Figure 6: Bias detection. The upper graph shows the original signal and the estimation of the bias from the PF. The lower graph shows the abundance of the particles in each system mode.

Figure 7: The estimated empirical distribution of the first component of $b_k$, and the estimated mean as the curve in green.