introduction to estimation filters
background
estimation systems
for many autonomous systems, knowledge of the system state is a prerequisite for designing any application. in reality, however, the state is often not directly obtainable. the system state is usually inferred or estimated based on the system outputs measured by certain instruments (such as sensors) and the flow of the state governed by a dynamic or motion model. some simple techniques, such as least squares estimation or batch estimation, are sufficient for solving static or offline estimation problems. for online and real-time (sequential) estimation problems, more sophisticated estimation filters are usually applied.
an estimation system is composed of a dynamic or motion model that describes the flow of the state and a measurement model that describes how the measurements are obtained. mathematically, these two models can be represented by an equation of motion and a measurement equation. for example, the equation of motion and measurement equation for a general nonlinear discrete estimation system can be written as:
$$x_{k+1} = f(x_k), \qquad y_k = h(x_k)$$
where k is the time step, $x_k$ is the system state at time step k, $f(x_k)$ is the state-dependent equation of motion, $h(x_k)$ is the state-dependent measurement equation, and $y_k$ is the output.
noise distribution
in most cases, building a perfect model that captures all dynamic phenomena is not possible. for example, including every source of friction in the motion model of an autonomous vehicle is impossible. to compensate for these unmodeled dynamics, process noise (w) is often added to the dynamic model. moreover, when measurements are taken, multiple sources of error, such as calibration errors, are inevitably included in the measurements. to account for these errors, proper measurement noise must be added to the measurement model. an estimation system that includes these random noises and errors is called a stochastic estimation system, which can be represented by:
$$x_{k+1} = f(x_k, w_k), \qquad y_k = h(x_k, v_k)$$
where $w_k$ and $v_k$ represent process noise and measurement noise, respectively.
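as a minimal sketch of a stochastic estimation system of the kind described above, the following python snippet simulates states and noisy outputs. it assumes additive zero-mean gaussian noise, which is a special case of the general form, and the scalar functions f and h are hypothetical choices made only for illustration.

```python
import math
import random

def simulate(f, h, x0, steps, q_std, r_std, seed=0):
    """simulate x_{k+1} = f(x_k) + w_k and y_k = h(x_k) + v_k.

    additive gaussian noise is an assumption of this sketch.
    """
    rng = random.Random(seed)
    x = x0
    states, outputs = [], []
    for _ in range(steps):
        states.append(x)
        outputs.append(h(x) + rng.gauss(0.0, r_std))  # measurement noise v_k
        x = f(x) + rng.gauss(0.0, q_std)              # process noise w_k
    return states, outputs

# hypothetical scalar system: slowly decaying state, nonlinear sensor
states, outputs = simulate(
    f=lambda x: 0.95 * x,
    h=lambda x: math.sqrt(1.0 + x * x),
    x0=5.0, steps=20, q_std=0.05, r_std=0.1,
)
```

setting q_std and r_std to zero recovers the deterministic system, which makes it easy to see how much of the output variation comes from noise alone.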
for most engineering applications, the process noise and measurement noise are assumed to follow zero-mean gaussian (normal) distributions, or at least to be well approximated by gaussian distributions. also, because the exact state is unknown, the state estimate is a random variable, usually assumed to follow a gaussian distribution. assuming gaussian distributions for these variables greatly simplifies the design of an estimation filter and forms the basis of the kalman filter family.
a gaussian distribution for a random variable (x) is parametrized by a mean value μ and a covariance matrix P, which is written as $x \sim N(\mu, P)$. given a gaussian distribution, the mean, which is also the most likely value of x, is defined by expectation (E) as:
$$\mu = E[x]$$
the mean is also called the first moment of x about the origin. the covariance, which describes the uncertainty of x, is defined by expectation (E) as:
$$P = E[(x - \mu)(x - \mu)^T]$$
the covariance is also called the second moment of x about its mean.
if the dimension of x is one, P is only a scalar. in this case, the value of P is usually denoted by σ² and called the variance. the square root, σ, is called the standard deviation of x. the standard deviation has important physical meaning. for example, the following figure shows the probability density function (which describes the likelihood that x takes a certain value) for a one-dimensional gaussian distribution with mean equal to μ and standard deviation equal to σ. about 68% of the data fall within the 1σ boundary of x, about 95% fall within the 2σ boundary, and about 99.7% fall within the 3σ boundary.
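the 1σ, 2σ, and 3σ percentages quoted above follow from the gaussian cumulative distribution. as a small check, the probability of falling within k standard deviations of the mean is erf(k/√2), which can be evaluated with the python standard library (the function name coverage is chosen here for illustration):

```python
import math

def coverage(k):
    """probability that a gaussian variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"{k} sigma: {coverage(k):.4f}")
# prints:
# 1 sigma: 0.6827
# 2 sigma: 0.9545
# 3 sigma: 0.9973
```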
even though the gaussian assumption is the dominant one in engineering applications, there exist systems whose state cannot be approximated by gaussian distributions. in this case, non-kalman filters (such as a particle filter) are required to accurately estimate the system state.
filter design
the goal of designing a filter is to estimate the state of a system using measurements and system dynamics. since the measurements are usually taken at discrete time steps, the filtering process is usually separated into two steps:
prediction: propagate state and covariance between discrete measurement time steps (k = 1, 2, 3, …, n) using dynamic models. this step is also called flow update.
correction: correct the state estimate and covariance at discrete time steps using measurements. this step is also called measurement update.
to distinguish the state estimate and covariance at different steps, $x_{k|k}$ and $P_{k|k}$ denote the state estimate and covariance after correction at time step k, whereas $x_{k+1|k}$ and $P_{k+1|k}$ denote the state estimate and covariance predicted from time step k to time step k+1.
prediction
in the prediction step, the state propagation is straightforward. the filter only needs to substitute the state estimate into the dynamic model and propagate it forward in time as $x_{k+1|k} = f(x_{k|k})$.
the covariance propagation is more complicated. if the estimation system is linear, then the covariance can be propagated ($P_{k|k} \rightarrow P_{k+1|k}$) exactly using a standard equation based on the system properties. for nonlinear systems, accurate covariance propagation is challenging. a major difference between filters is how they propagate the system covariance. for example:
a linear kalman filter uses a linear equation to exactly propagate the covariance.
an extended kalman filter propagates the covariance based on a linear approximation, which can produce large errors when the system is highly nonlinear.
an unscented kalman filter uses unscented transformation to sample the covariance distribution and propagate it in time.
how the state and covariance are propagated also greatly affects the computational complexity of a filter. for example:
a linear kalman filter uses a linear equation to exactly propagate the covariance, which is usually computationally efficient.
an extended kalman filter uses linear approximations, which require calculation of jacobian matrices and demand more computational resources.
an unscented kalman filter needs to sample the covariance distribution and therefore requires the propagation of multiple sample points, which is costly for high-dimensional systems.
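the difference between linearized and sampled covariance propagation can be seen on a scalar example. the sketch below is an illustration under stated assumptions, not the toolbox implementation: the nonlinear model f(x) = x² is hypothetical, and the unscented transformation uses α = 1, β = 0, and κ = 2. for a gaussian input the exact mean and variance of x² are known in closed form, so both approximations can be compared against them.

```python
import math

def linearized_propagation(mu, p):
    """ekf-style propagation: first-order expansion of f(x) = x**2 about the mean."""
    jac = 2.0 * mu  # df/dx evaluated at the mean
    return mu ** 2, jac * p * jac

def unscented_propagation(mu, p, kappa=2.0):
    """unscented transformation for a scalar state (n = 1, alpha = 1, beta = 0)."""
    n = 1
    spread = math.sqrt((n + kappa) * p)
    points = [mu, mu + spread, mu - spread]             # 2n + 1 sigma points
    weights = [kappa / (n + kappa)] + [0.5 / (n + kappa)] * 2
    images = [x ** 2 for x in points]                   # propagate through f
    mean = sum(w * y for w, y in zip(weights, images))
    var = sum(w * (y - mean) ** 2 for w, y in zip(weights, images))
    return mean, var

mu, p = 1.0, 0.5
exact = (mu ** 2 + p, 4.0 * mu ** 2 * p + 2.0 * p ** 2)  # closed form for gaussian x
linearized = linearized_propagation(mu, p)
unscented = unscented_propagation(mu, p)
```

with these values the linearized result is a mean of 1.0 and a variance of 2.0, while the exact values are 1.5 and 2.5. the unscented result matches the exact values here only because the model is quadratic and κ was chosen accordingly; in general it is also an approximation.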
correction
in the correction step, the filter uses measurements to correct the state estimate through measurement feedback. the difference between the true measurement and the predicted measurement is multiplied by a feedback gain matrix and then added to the state estimate. for example, in an extended kalman filter, the correction for the state estimate is given by:
$$x_{k+1|k+1} = x_{k+1|k} + K_k \left( y_{k+1} - h(x_{k+1|k}) \right)$$
as mentioned, $x_{k+1|k}$ is the state estimate before (a priori) correction and $x_{k+1|k+1}$ is the state estimate after (a posteriori) correction. $K_k$ is the kalman gain governed by an optimal criterion, $y_{k+1}$ is the true measurement, and $h(x_{k+1|k})$ is the predicted measurement.
in the correction step, the filter also corrects the estimate error covariance. the basic idea is to correct the probability distribution of x using the distribution information of $y_{k+1}$. the corrected distribution is called the posterior probability density of x given y. in a filter, the prediction and correction steps are processed recursively. the flowchart shows the general algorithms for kalman filters.
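for a scalar state measured directly, the gain-feedback correction and the posterior density can be shown to agree. the sketch below assumes the hypothetical measurement model y = x + v with v ~ N(0, r); the function names are illustrative. the first function applies the feedback form, and the second multiplies the prior gaussian by the measurement likelihood.

```python
def correct(mean, var, y, r):
    """scalar measurement update for y = x + v, v ~ N(0, r)."""
    gain = var / (var + r)               # feedback (kalman) gain
    new_mean = mean + gain * (y - mean)  # add the weighted measurement residual
    new_var = (1.0 - gain) * var         # corrected estimate error variance
    return new_mean, new_var

def gaussian_product(mean, var, y, r):
    """posterior from multiplying the prior N(mean, var) by the likelihood N(y; x, r)."""
    post_var = 1.0 / (1.0 / var + 1.0 / r)
    post_mean = post_var * (mean / var + y / r)
    return post_mean, post_var

feedback = correct(0.0, 4.0, 2.0, 1.0)
posterior = gaussian_product(0.0, 4.0, 2.0, 1.0)
```

both calls return a mean of 1.6 and a variance of 0.8 (up to floating-point rounding). the corrected variance is smaller than both the prior variance and the measurement variance, which is the sense in which a measurement reduces uncertainty.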
estimation filters in sensor fusion and tracking toolbox
sensor fusion and tracking toolbox™ offers multiple estimation filters you can use to estimate and track the state of a dynamic system.
kalman filter
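the classical kalman filter is the optimal filter for linear systems with gaussian process and measurement noise. a linear estimation system can be given as:
$$x_{k+1} = A_k x_k + w_k, \qquad y_k = H_k x_k + v_k$$
both the process and measurement noise are assumed to be gaussian, that is:
$$w_k \sim N(0, Q_k), \qquad v_k \sim N(0, R_k)$$
where $Q_k$ and $R_k$ are the process and measurement noise covariances. therefore, the covariance matrix can be directly propagated between measurement steps using a linear algebraic equation as:
$$P_{k+1|k} = A_k P_{k|k} A_k^T + Q_k$$
the correction equations for the measurement update are:
$$x_{k+1|k+1} = x_{k+1|k} + K_k \left( y_{k+1} - H_{k+1} x_{k+1|k} \right)$$
$$P_{k+1|k+1} = \left( I - K_k H_{k+1} \right) P_{k+1|k}$$
to calculate the kalman gain matrix ($K_k$) in each update, the filter needs to calculate the inverse of a matrix:
$$K_k = P_{k+1|k} H_{k+1}^T \left( H_{k+1} P_{k+1|k} H_{k+1}^T + R_{k+1} \right)^{-1}$$
since the inverted matrix has the dimension of the measurement vector, this calculation can require significant computation when many measurements are processed in one update.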
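the prediction and correction steps of a linear kalman filter can be sketched in plain python for a small system. this is a minimal illustration, not the toolbox implementation: the constant-velocity model, the noise covariances Q and R, the initial values, and the helper names are all assumptions made for the example. the measurement is a scalar position, so the matrix inverse in the gain reduces to a division.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kf_predict(x, p, a, q):
    x_pred = mat_mul(a, x)                                      # state propagation
    p_pred = mat_add(mat_mul(mat_mul(a, p), transpose(a)), q)   # A P A^T + Q
    return x_pred, p_pred

def kf_correct(x_pred, p_pred, y, h, r):
    """correction for a scalar measurement y with measurement row h (1 x n)."""
    s = mat_mul(mat_mul(h, p_pred), transpose(h))[0][0] + r     # innovation variance
    gain = [[row[0] / s] for row in mat_mul(p_pred, transpose(h))]
    residual = y - mat_mul(h, x_pred)[0][0]
    x = [[x_pred[i][0] + gain[i][0] * residual] for i in range(len(x_pred))]
    kh = mat_mul(gain, h)
    i_kh = [[(1.0 if i == j else 0.0) - kh[i][j] for j in range(len(kh))]
            for i in range(len(kh))]
    return x, mat_mul(i_kh, p_pred)                             # (I - K H) P

# hypothetical constant-velocity target, unit time step, position measured
A = [[1.0, 1.0], [0.0, 1.0]]
H = [[1.0, 0.0]]
Q = [[0.01, 0.0], [0.0, 0.01]]
R = 1.0
x, P = [[0.0], [0.0]], [[10.0, 0.0], [0.0, 10.0]]
for step in range(1, 21):
    x, P = kf_predict(x, P, A, Q)
    x, P = kf_correct(x, P, float(step), H, R)  # noiseless positions 1, 2, ..., 20
```

the filter starts with a wrong velocity of zero, and the measurements are noiseless positions of a target moving at unit speed, so the estimate should approach a position of 20 and a velocity of 1 by the last step.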
alpha-beta filter
the alpha-beta filter is a suboptimal filter applied to linear systems. the filter can be regarded as a simplified kalman filter. in a kalman filter, the kalman gain and covariance matrices are calculated dynamically and updated in each step. in an alpha-beta filter, however, these matrices are constant. this treatment sacrifices the optimality of a kalman filter but improves computational efficiency. for this reason, an alpha-beta filter might be preferred when computational resources are limited.
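a common textbook form of the alpha-beta filter tracks position and velocity with two fixed gains. the sketch below uses that form with hypothetical gain values and a noiseless constant-velocity measurement sequence; it is meant to show the constant-gain structure, not the toolbox implementation.

```python
def alpha_beta_filter(measurements, dt, alpha, beta, x0=0.0, v0=0.0):
    """track position and velocity with fixed gains alpha and beta."""
    x, v = x0, v0
    estimates = []
    for z in measurements:
        x_pred = x + dt * v               # predict with constant velocity
        residual = z - x_pred             # measurement residual
        x = x_pred + alpha * residual     # fixed position gain
        v = v + (beta / dt) * residual    # fixed velocity gain
        estimates.append((x, v))
    return estimates

# hypothetical target moving at 2 units per step, measured without noise
measurements = [2.0 * k for k in range(1, 51)]
estimates = alpha_beta_filter(measurements, dt=1.0, alpha=0.5, beta=0.1)
final_position, final_velocity = estimates[-1]
```

no covariance is stored or updated, which is where the computational savings come from. the price is that the gains do not adapt to changing noise levels.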
extended kalman filter
the widely used extended kalman filter (ekf) adapts the classical kalman filter (kf) to nonlinear models. it works by linearizing the nonlinear system about the state estimate and neglecting the second and higher order terms. its formulation is essentially the same as that of a linear kalman filter, except that the $A_k$ and $H_k$ matrices in the kalman filter are replaced by the jacobian matrices of $f(x_k)$ and $h(x_k)$, evaluated at the current state estimate:
$$A_k = \frac{\partial f(x_k)}{\partial x_k}, \qquad H_k = \frac{\partial h(x_k)}{\partial x_k}$$
if the true dynamics of the estimation system are close to the linearized dynamics, then using this linear approximation does not yield significant errors for a short period of time. for this reason, an ekf can produce relatively accurate state estimates for a mildly nonlinear estimation system with short update intervals. however, since an ekf neglects higher order terms, it can diverge for highly nonlinear systems (quadrotors, for example), especially with large update intervals.
compared to a kf, an ekf requires the jacobian matrices to be derived, which means the system dynamics must be differentiable, and to be evaluated at every step to linearize the system, which demands more computational resources.
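the role of the jacobians can be seen in a scalar ekf step. in the sketch below the models f and h, the noise variances, and the initial values are hypothetical, and the measurements are generated without noise so that the run is repeatable. for a scalar state the jacobian matrices reduce to ordinary derivatives.

```python
import math

def f(x):
    return x + 0.1 * math.sin(x)         # hypothetical nonlinear motion model

def f_jac(x):
    return 1.0 + 0.1 * math.cos(x)       # derivative of f

def h(x):
    return x * x                         # hypothetical nonlinear measurement model

def h_jac(x):
    return 2.0 * x                       # derivative of h

def ekf_step(x, p, y, q, r):
    a = f_jac(x)                          # linearize the motion model at the estimate
    x_pred = f(x)                         # propagate the state through f itself
    p_pred = a * p * a + q                # propagate the variance with the jacobian
    hj = h_jac(x_pred)                    # linearize the measurement model
    gain = p_pred * hj / (hj * p_pred * hj + r)
    x_new = x_pred + gain * (y - h(x_pred))
    p_new = (1.0 - gain * hj) * p_pred
    return x_new, p_new

truth, x, p = 2.0, 1.0, 1.0
for _ in range(50):
    truth = f(truth)
    y = h(truth)                          # noiseless measurement for repeatability
    x, p = ekf_step(x, p, y, q=1e-3, r=0.1)
```

the nonlinear functions are used for the state and measurement predictions, while their derivatives appear only in the variance propagation and the gain. in this mildly nonlinear example the estimate settles close to the true state.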
note that the toolbox also provides an extended kalman filter for estimation systems whose state is expressed in spherical coordinates.
unscented kalman filter
the unscented kalman filter (ukf) uses an unscented transformation (ut) to approximately propagate the covariance distribution for a nonlinear model. the ut approach samples the gaussian covariance distribution at the current time, propagates the sample points (called sigma points) using the nonlinear model, and approximates the resulting covariance distribution, assumed to be gaussian, by evaluating these propagated sigma points. the figure illustrates the difference between the actual propagation, the linearized propagation, and the ut propagation of the uncertainty covariance.
compared to the linearization approach taken by an ekf, the ut approach results in more accurate propagation of covariance and leads to more accurate state estimation, especially for highly nonlinear systems. a ukf does not require the derivation and calculation of jacobian matrices. however, a ukf requires the propagation of 2n+1 sigma points through the nonlinear model, where n is the dimension of the estimated state. this can be computationally expensive for high-dimensional systems.
cubature kalman filter
the cubature kalman filter (ckf) takes a slightly different approach than a ukf and generates 2n sample points to propagate the covariance distribution, where n is the dimension of the estimated state. this alternate sample point set often results in better statistical stability and avoids the divergence that might occur in a ukf, especially when running on a single-precision platform. note that a ckf is essentially equivalent to a ukf when the ukf parameters are set to α = 1, β = 0, and κ = 0, where α, β, and κ are the scaling parameters of the unscented transformation.
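the relationship between the two point sets can be checked numerically. the sketch below generates sigma points with the commonly used scaled unscented transformation weights and cubature points with equal weights. it assumes a diagonal covariance so that no matrix square root is needed, and the function names are illustrative rather than toolbox functions.

```python
import math

def sigma_points(mean, variances, alpha=1.0, beta=0.0, kappa=0.0):
    """2n + 1 sigma points and weights for a gaussian with diagonal covariance."""
    n = len(mean)
    lam = alpha ** 2 * (n + kappa) - n
    points = [list(mean)]                          # center point
    for i in range(n):
        offset = math.sqrt((n + lam) * variances[i])
        for sign in (1.0, -1.0):
            point = list(mean)
            point[i] += sign * offset
            points.append(point)
    mean_weights = [lam / (n + lam)] + [0.5 / (n + lam)] * (2 * n)
    cov_weights = list(mean_weights)
    cov_weights[0] += 1.0 - alpha ** 2 + beta      # extra weight on the center point
    return points, mean_weights, cov_weights

def cubature_points(mean, variances):
    """2n equally weighted cubature points for a gaussian with diagonal covariance."""
    n = len(mean)
    points = []
    for i in range(n):
        offset = math.sqrt(n * variances[i])
        for sign in (1.0, -1.0):
            point = list(mean)
            point[i] += sign * offset
            points.append(point)
    return points, [1.0 / (2 * n)] * (2 * n)

# 3-d position and velocity: n = 6
mean, variances = [0.0] * 6, [1.0, 1.0, 1.0, 0.25, 0.25, 0.25]
ukf_points, ukf_mean_w, ukf_cov_w = sigma_points(mean, variances)
ckf_points, ckf_w = cubature_points(mean, variances)
```

for n = 6 the ukf set has 13 points and the ckf set has 12. with α = 1, β = 0, and κ = 0 the center sigma point receives zero weight and the remaining 12 points and weights coincide with the cubature set.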
gaussian-sum filter
the gaussian-sum filter uses the weighted sum of multiple gaussian distributions to approximate the distribution of the estimated state. the estimated state is given by a weighted sum of gaussian states:
$$x_k = \sum_{i=1}^{N} c_k^i \, x_k^i$$
where N is the number of gaussian states maintained in the filter, and $c_k^i$ is the weight for the corresponding gaussian state, which is modified in each update based on the measurements. the multiple gaussian states follow the same dynamic model as:
$$x_{k+1}^i = f(x_k^i, w_k^i)$$
the filter is effective in estimating the states of an incompletely observable estimation system. for example, the filter can use multiple angle-parametrized extended kalman filters to estimate the system state when only range measurements are available. see tracking with range-only measurements for an example.
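the weight update and the weighted-sum estimate can be sketched for scalar states. the example assumes a likelihood-based reweighting, in which each weight is scaled by the gaussian likelihood of the measurement under its component and then normalized. this is a common choice, but the text does not specify the exact rule, and the correction of each component by its own filter is omitted. all names and values are hypothetical.

```python
import math

def gaussian_pdf(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def update_weights(weights, predicted_measurements, innovation_vars, y):
    """scale each weight by its measurement likelihood, then normalize."""
    scaled = [c * gaussian_pdf(y, m, s)
              for c, m, s in zip(weights, predicted_measurements, innovation_vars)]
    total = sum(scaled)
    return [s / total for s in scaled]

def combined_estimate(weights, states):
    """weighted sum of the component states."""
    return sum(c * x for c, x in zip(weights, states))

# three hypothetical components that each predict the measurement y = x
states = [1.0, 2.0, 3.0]
weights = update_weights([1 / 3, 1 / 3, 1 / 3], states, [0.25, 0.25, 0.25], y=2.1)
estimate = combined_estimate(weights, states)
```

the component whose predicted measurement is closest to the observed value ends up with the largest weight, so the combined estimate is pulled toward it.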
interacting multiple model filter
the interacting multiple model (imm) filter uses multiple gaussian filters to track the position of a target. in highly maneuverable systems, the system dynamics can switch between multiple models (constant velocity, constant acceleration, and constant turn, for example), so modeling the motion of a target using only one motion model is difficult. a multiple model estimation system can be described as:
$$x_{k+1} = f_i(x_k, w_k), \qquad y_k = h_i(x_k, v_k)$$
where i = 1, 2, …, M, and M is the total number of dynamic models. the imm filter resolves the target motion uncertainty by using multiple models for a maneuvering target. the filter processes all the models simultaneously and represents the overall estimate as the weighted sum of the estimates from these models, where the weights are the probabilities of each model. see tracking maneuvering targets for an example.
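the model-probability bookkeeping behind the weighted sum can be sketched as follows. the example assumes that model switching is described by a markov transition matrix and that each model's filter reports a measurement likelihood. this is the usual imm formulation, but it is not spelled out in the text, and the mixing of the per-model estimates is omitted here. all names and numbers are hypothetical.

```python
def predict_model_probabilities(probabilities, transition):
    """transition[i][j] is the probability of switching from model i to model j."""
    m = len(probabilities)
    return [sum(transition[i][j] * probabilities[i] for i in range(m))
            for j in range(m)]

def update_model_probabilities(predicted, likelihoods):
    """reweight each model by how well it explains the current measurement."""
    scaled = [p * l for p, l in zip(predicted, likelihoods)]
    total = sum(scaled)
    return [s / total for s in scaled]

def fuse(probabilities, estimates):
    """overall estimate as the probability-weighted sum of the model estimates."""
    return sum(p * x for p, x in zip(probabilities, estimates))

# two hypothetical models: constant velocity and constant turn
transition = [[0.9, 0.1], [0.1, 0.9]]
predicted = predict_model_probabilities([0.8, 0.2], transition)
probabilities = update_model_probabilities(predicted, likelihoods=[0.1, 0.9])
position = fuse(probabilities, estimates=[10.0, 12.0])
```

here the measurement fits the second model much better, so its probability rises from 0.2 to about 0.76 and the fused position moves toward that model's estimate.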
particle filter
the particle filter differs from the kalman family of filters (ekf and ukf, for example) in that it does not rely on the gaussian distribution assumption, which corresponds to a parametric description of uncertainties using a mean and a covariance. instead, the particle filter creates multiple simulations of weighted samples (particles) of a system's operation through time, and then analyzes these particles as a proxy for the unknown true distribution. a brief introduction of the particle filter algorithm is shown in the figure.
the motivation behind this approach is a law-of-large-numbers argument: as the number of particles gets large, their empirical distribution gets close to the true distribution. the main advantage of a particle filter over the various kalman filters is that it can be applied to non-gaussian distributions. also, the filter places no restriction on the system dynamics and can be used with highly nonlinear systems. another benefit is the filter's inherent ability to represent multiple hypotheses about the current state. since each particle represents a hypothesis of the state with a certain associated likelihood, a particle filter is useful in cases where the state is ambiguous.
these appealing properties come at the cost of high computational complexity. for example, a ukf requires propagating 13 sample points to estimate the 3-d position and velocity of an object, whereas a particle filter may require thousands of particles to obtain a reasonable estimate. also, the number of particles needed to achieve good estimation grows very quickly with the state dimension and can lead to particle deprivation problems in high-dimensional spaces. therefore, particle filters have mostly been applied to systems with a reasonably low number of dimensions (robots, for example).
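a bootstrap particle filter, one common variant, can be sketched for a scalar state in a few lines. the random-walk motion model, the direct position measurement, the noise levels, and the particle count below are hypothetical choices for illustration, and the sketch resamples at every step for simplicity.

```python
import math
import random

def particle_filter(measurements, n_particles=500, q_std=0.1, r_std=0.5, seed=0):
    """bootstrap particle filter for a scalar random-walk state measured directly."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimate = None
    for y in measurements:
        # prediction: push every particle through the motion model with process noise
        particles = [x + rng.gauss(0.0, q_std) for x in particles]
        # correction: weight each particle by the gaussian measurement likelihood
        weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / n_particles] * n_particles  # guard against deprivation
        else:
            weights = [w / total for w in weights]
        estimate = sum(w * x for w, x in zip(weights, particles))
        # resampling: draw a new particle set in proportion to the weights
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimate

noise = random.Random(1)
truth = 3.0
measurements = [truth + noise.gauss(0.0, 0.5) for _ in range(30)]
estimate = particle_filter(measurements)
```

even in this one-dimensional case the filter carries 500 samples per step, compared with the three sigma points a ukf would use for a scalar state, which illustrates the computational cost described above.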
how to choose a tracking filter
the following table lists all the tracking filters available in sensor fusion and tracking toolbox and how to choose them given constraints on system nonlinearity, state distribution, and computational complexity.
filter name | supports nonlinear models | gaussian state | computational complexity | comments |
alpha-beta | | | low | suboptimal filter. |
kalman | | ✓ | medium low | optimal for linear systems. |
extended kalman | ✓ | ✓ | medium | uses linearized models to propagate uncertainty covariance. |
unscented kalman | ✓ | ✓ | medium high | samples the uncertainty covariance to propagate the sample points. may become numerically unstable on a single-precision platform. |
cubature kalman | ✓ | ✓ | medium high | samples the uncertainty covariance to propagate the sample points. numerically stable. |
gaussian-sum | ✓ | ✓ (assumes a weighted sum of distributions) | high | good for partially observable cases (angle-only tracking, for example). |
interacting multiple model (imm) | ✓ (multiple models) | ✓ (assumes a weighted sum of distributions) | high | suited to maneuvering objects (which accelerate or turn, for example). |
particle | ✓ | | very high | samples the uncertainty distribution using weighted particles. |