This study considers the problem of applying the Kalman Filter (KF) to nonlinear systems. Estimation in nonlinear systems is extremely important because almost all practical systems involve nonlinearities. Accurately estimating the state of such systems is extremely important for fault detection and control applications. However, estimation in nonlinear systems is extremely difficult. The optimal (Bayesian) solution to the problem requires the propagation of the description of the full probability density function (pdf) (Kushner and Budhiraja, 2000). This solution is extremely general and incorporates aspects such as multimodality, asymmetries and discontinuities. However, because the form of the pdf is not restricted, it cannot, in general, be described using a finite number of parameters. Therefore, any practical estimator must use an approximation of some kind. Many different types of approximations have been developed; unfortunately, most are either computationally unmanageable or require special assumptions about the form of the process and observation models that cannot be satisfied in practice. For these and other reasons, the KF remains the most widely used estimation algorithm. The most common application of the KF to nonlinear systems is in the form of the Extended KF (EKF) (Martin, 2001). Exploiting the assumption that all transformations are quasi-linear, the EKF simply linearizes all nonlinear transformations and substitutes Jacobian matrices for the linear transformations in the KF equations. Although, the EKF maintains the elegant and computationally efficient recursive update form of the KF, it suffers a number of serious limitations.
Linearized transformations are only reliable if the error propagation can be well approximated by a linear function. If this condition does not hold, the linearized approximation can be extremely poor. At best, this undermines the performance of the filter. At worst, it causes its estimates to diverge altogether. However, determining the validity of this assumption is extremely difficult because it depends on the transformation, the current state estimate and the magnitude of the covariance. This problem is well documented in many applications such as the estimation of ballistic parameters of missiles (Costa, 1994; Chowdhary and Jategaonkar, 2006) and computer vision (Viéville et al., 1993).
Linearization can be applied only if the Jacobian matrix exists. However, this is not always the case. Some systems contain discontinuities in which the parameters can change abruptly, others have singularities and in others the states themselves are inherently discrete.
Calculating Jacobian matrices can be a very difficult and error-prone process.
The Jacobian equations frequently produce many pages of dense algebra that must
be converted to code. This introduces numerous opportunities for human coding
errors that may undermine the performance of the final system in a manner that
cannot be easily identified and debugged-especially given the fact that it is
difficult to know what quality of performance to expect. Regardless of whether
the obscure code associated with a linearized transformation is or is not correct,
it presents a serious problem for subsequent users who must validate it for
use in any high integrity system.
To address theses limitations, Julier and Uhlmann (1997) developed the UKF. Therefore, it was extended to parameter estimation applications by Van der Merve (2004).
Instead of linearizing these non linear functions using Jacobian matrices, the UKF uses a deterministic sampling approach to calculate the mean and covariance estimates of Gaussian random state variables with a minimal set of sample points called as sigma point (Kandepu et al., 2008) through the actual non linear functions. This approach yields more accurate results compared to ordinary functions linearization in the EKF.
The extended Kalman filter: The Kalman Filter (KF) in its standard form is a popular choice for any stochastic estimation problem. However, in most real applications of interest; the system dynamics and observation equations are non linear and hence suitable modifications to the standard kalman filter are required. The Extended Kalman Filter (EKF) provides these modifications by linearizing the non linear process and observation models around the last state estimate. In this way, the EKF gives an approximation of the optimal estimate and hence can be considered as the most common and popular approach for both non linear state estimation and parameter estimation.
Let the process be estimated and the associated observation relationship be described by the following non linear state space model:
where, xk represents the unobserved state of system uk is known exogenous input and zk is the observed measurement output. The process noise wk drives the dynamic system and the observation noise is given by vk.
The EKF involves the following recursive estimation of the mean (
)and error covariance (P) of the state estimate under a Gaussian assumption:
||Consider the last estimated state
||Linearize the non linear system dynamics xk+1 =
f(xk, uk)+wk around
||Apply the prediction cycle of the kalman filter algorithm
to the linearized system dynamics in order to calculate a prior state estimate and
a prior estimate of the error covariance matrix P(k+1|k)
||Linearize the non linear measurement dynamics yk = h(xk)+vk around
||Apply filtering or update cycle of the kalman filter algorithm
to the linearized measurement dynamics in order to calculate the posteriori
state estimate and
the posteriori estimate of error covariance matrix P(k+1|k)
Let Φ(k) and H(k) be the Jacobian matrices of the non linear process f(.) and observation h(.) models around the estimated state, denoted by:
The EKF works almost like a standard KF, except for F and H, which vary in
time based on the estimated state .
Its actual algorithm can be stated in terms of the following two distinct cycles:
Predict cycle: Predict next state, before measurements are taken:
Update cycle: Update state, after measurements are taken:
where, Q and R are covariance matrices, describing the second order properties of the state and measurement noise (i.e., Q = E(wwT) and R = E(vvT), where E(.) denotes the expected operator). K denotes the kalman gain matrix, used in the update observer.
The unscented Kalman filter: The UKF is based on the idea that it is
easier to approximate a probability distribution than to approximate an arbitrary
nonlinear transformation (Julier and Uhlmann, 1994). The actual algorithm is
based on propagating carefully selected finite set of points, called sigma points,
through the system nonlinear dynamics and then approximating the first two moments
of the distribution (mean and covariance) through a suitable method; such as
weighted sample mean and covariance calculations (Julier and Uhlmann, 1994;
Julier et al., 2000). The flaw in the EKF that results from propagating
the mean and covariance through linear approximations of the nonlinear transformation
is thus eliminated in the UKF, leading to theoretically better performance of
the UKF. Furthermore, the UKF implementation does not need the calculation of
any Jacobian or Hessian matrices, which not only results in considerable simplicity
in implementation, but also makes the UKF suitable for real-time applications
and applications involving non-differentiable functions. The accuracy of the
UKF can be compared to that of the second order EKF, the computational order
is comparable to the EKF (Julier and Uhlmann, 1994; Julier et al., 2000).
The UKF follows the same overall structure as the EKF. The UKF is fundamentally different in the way it estimates the noise distributions of the filter. As opposed to approximating the non-linear transformation of the noise distributions, the UKF approximates the transformation by applying the non-linear transformation to a number of selected points in the state space. These points are called sigma points and are calculated from the covariance of the estimation error. The approach is shown in Fig. 1. The sigma points, circled points in the Fig. 1, are selected such that they describe the distribution of the estimation error. The transformed sigma points are then used to calculate the approximation of the transformed distribution. This is known as the UT published in (Julier and Uhlmann, 1997). In the following, the equations of the UKF are presented.
As with the EKF, we present an algorithmic description of the UKF. Given the state vector at step k (we use the same state vector as in Eq. 1), we compute a collection of sigma points, stored in the columns of the Lx(2L+1) sigma point matrix χk where L is the dimension of the state vector. In our case, L = 7 so χk is a 7x15 matrix. The columns of χk are computed by:
the ith column of the matrix square root and λ is defined by:
where, α is a scaling parameter which determines the spread of the sigma points and κ is a secondary scaling parameter.
Note that we assume is
symmetric and positive definite which allows us to find the square root using
||The unscented transformation
Prediction cycle: Once χk computed, we perform the prediction step by first propagating each column of χk through time by Δt using nonlinear function as:
In present formulation, since L = 7, we perform 15 4th order Runge-Kutta integrations.
Calculate the a priori state estimate:
where, Wi(m) are weights defined by:
Calculate the a priori error covariance:
where, Qk is once again the process error covariance matrix and the weights are defined by:
Correction cycle: Transform the columns of χk through the measurement function.
Calculate the predicted observation:
Calculate the a posteriori state estimate:
where, Kk is once again Kalman gain. In the UKF formulation, Kk is defined by:
Note that as with the EKF, R is the measurement noise covariance matrix.
Calculate the a posteriori estimate of the mean state and the error covariance:
In present study, we have used the 7 state vectors, for both EKF and UKF, which defines the attitude of the satellite and also the rates at which the attitude is changing (Matthew et al., 2004). The state vector is comprised of the four-element quaternion attitude vector combined with the three-element body rates vector, with respect to the inertial frame (ω1By). Symbolically, this state vector can be represented as:
During the propagation cycle, the quaternion and angular rate components of the state vector are propagated separately. The quaternion are propagated forward in time utilizing the basic quaternion dynamic equation, specifically
is the local orbital referenced body angular rate written in the body frame.
This vector can be found using the inertial referenced body rates and the orbital-referenced orbital rate vector by the following equations:
where, T is the quaternion transformation matrix from the orbital to body coordinates:
and ω0 = [0 ω0 0] is the inertial referenced orbital angular rate written in the orbital frame.
If we can assume that the satellite has a circular orbit, then we can find ω0, with the equation for mean motion:
where, μT is the earth's gravitational parameter and a is the satellite's semi-major axis.
The body rates are propagated forward in time utilizing the dynamic equation of motion for orbiting satellites:
||diag(Ix Iy Iz) the 3x3 moment of inertia tensor
||The magnetic torque vector
||The environmental disturbance torque vector
||The thrusters control torque vector
With these definitions in mind, the basic mathematical process for state Kalman Filtering can be detailed below.
The state transformation matrix Φ is defined by:
The F11 is a (4x4) matrix and is calculated using Eq.
19, partial derivative of with
respect to each qi becomes:
Thus the matrix is computed by evaluating:
The (4x3) F12 matrix is computed using partial derivative of
with respect to ωbIand can be derived as follows. Again, it requires a substitution
of Eq. 21.
The partial derivative (3x4) matrix F21 can be found as follows:
The partial derivative (3x3) F22 matrix can be found as follows:
The process noise covariance matrix Q is used to represent the uncertainty in the linearized model of the system and how that uncertainty is correlated between the states. Based on the design of a Kalman filter and probability definitions, the process noise covariance matrix Q for step k is defined as follows (Anderson et al., 2004):
The variances σ2x
σ2x are an estimated variance of the body rates. This is not a straightforward quantity and best determined by computer simulation.
The extended Kalman filter approximates the observation matrix H by:
where, z is measurements of system state, either sun-sensor or magnetometer
This basic process has introduced several variables that merit further explanation.
The covariance matrix (P) essentially is a time-referenced estimate of the accuracy
of both the system model and the measurements. The correction cycle depends
heavily upon these accuracies in order to determine how much to trust either
the propagated state or the entered measurements. The state transformation matrix
(Φ) is an approximation of the change that the state undergoes over the
specified time interval. The process noise covariance matrix (Q) is derived
from the expected error in the filtering process. The observation matrix (H)
is a measure of how dependent the measurements are upon the state of the system.
The measurement noise covariance matrix (R) entails the expected error in the
states themselves, derived from the precision of the system model.
The dynamic equation of motion for an orbiting satellite is (Hashida, 2004):
||Inertially referenced body angular rate vector
||Moment of inertia tensor of spacecraft (MOI)
||Reaction wheel angular momentum vector
||Gravity-gradient torque vector
||External disturbance torque vector such as aerodynamic torque
and solar radiation pressure torque
||Applied torque vector by 3-axis magnetorquers
For an axially symmetric satellite with Y/Z wheels and the principal moment of inertia axes along the body axes, the off-diagonal products of inertia elements in the MOI tensor will be zero. The deployed boom along the Z-axis also increases the MOI elements Ixx and Iyy to a much larger and equal value. This value is called the transverse MOI, IT.
The complete set of dynamic equations of motion can then be written as follows:
The spacecraft attitude is parameterized by quaternion denoted q = (q1 q2 q3
q4) which has a corresponding rotation matrix T in (22). The attitude kinematics,
which couple ω and q are given by Eq. 19 and 20.
In order to examine its operability and performance, EKF and UKF were programmed in the computer language MATLAB/SIMULINK. These simulation results were all completed during sun-lit portions of orbit, when both the sun sensors and magnetometer were providing data to the filter.
In order to test the EKF before launch, a simulation of the operating environment of the satellite must first be developed. This simulation is required in order to generate realistic measurements for the EKF to utilize as an input. The simulation used to generate the input measurements for all analysis detailed herein is an Alsat-1 flight orbit code. It has been used on various satellites engineered by Surrey Satellite Technology Ltd. This orbit propagator generates simulated magnetometer and sun sensor inputs of two types. First, the orbit propagator generates these magnetometer and sun sensor inputs in the local orbital frame. These filter inputs are what is considered the orbit propagator predicted values. This input from the simulation will be necessary during actual satellite operations. The second set of input data generated by the orbit propagator is a set of measured values. These values simulate measurement input from the magnetometer and sun sensor and so this set of input is in the body frame. From this second set of inputs, the measurement inputs, are only simulating values that will come from onboard sensors during the actual operation of the satellite, this data will not be generated by the simulator during actual satellite operations.
In addition, a standard of comparison is necessary in order to determine the veracity of the EKF results. Another well-tested program developed by Surrey Satellite Technology Ltd. was utilized for this purpose. This simulation is an attitude propagator, which models the actual satellite dynamics and outputs Euler angles and Euler rates over a specified period of time. This simulation simply uses Eulers Moment Equations and the quaternion dynamic equation to propagate expected angles and rates. Due to the construction of this model, it is quite simple to simplify the model to discount all disturbances torques etc. or to precisely model gravity gradient (including boom deployment), magnetic (including commanded magnetic moment), drag, reaction wheel effects and other disturbance torques.
We consider our satellite is operating in a 98° inclined circular LEO at an altitude of 686 km. The spacecraft is estimated to have a mass of 90 kg and essentially be configured as a rectangular shape 0.65x0.65x0.6 m3 with a 6 m (to the centre of gravity) gravity gradient boom and 3 kg tip mass. The microsatellite is Nadir pointing. These requirements specify several operating constants for the EKF; a summary of operating constants is presented.
A 2006 IGRF B-field model was used to obtain the geomagnetic field values.
Note that the assumed integration time step (Δtint) is 1 S unless otherwise specified. The inertia tensor is boom deployed.
Several other practical consideration merit mention. The integration in the state propagation is accomplished by numerical integration. This cyclic process is computationally intensive. Therefore, to reduce the computational demand without sacrificing significant accuracy, an Adams 2nd order numerical integrator will be implemented rather than a more complex and accurate integrator.
Another practical consideration concerns the nature of the computer code. Since, the code will be implemented on a satellite, the code must be extremely robust. Most importantly, the ADCS computer code must not crash the satellites onboard computer at all expenses. Therefore, error checks are conducted many times throughout each cycle of the EKF. If an error is detected, the ADCS algorithm is immediately exited and a message relayed to satellite operators to allow human operators to deal with the problem rather than the onboard computer crashing as a result of the error. Finally, the nature of the assumptions underlying the development of the EKF permits a small amount of error to creep into the quaternion calculations. In order to negate this error, the quaternion must be normalized after every instance where they are calculated. This includes both the quaternion calculation in the state propagation step and in the state update step.
Finally, several other initialization parameters deserve an explanation. The covariance matrix (P) embodies an approximate error associated with attitude estimates. The first four diagonal elements of the covariance matrix represent the estimated error of the quaternion, while the next three diagonal elements give the estimated error of the angular rates. It is important to note that the covariance matrix is merely an initialization; over time, the matrix changes as the EKF converges. The process noise covariance matrix (Q) is another initialization parameter that deserves mention. The process noise covariance matrix contains information relating an estimate for the error associated with the system equations. From a strict mathematical standpoint, the process noise covariance matrix also changes with time. However, these changes can be ignored due to their small magnitude. The first four diagonal elements of the process noise covariance matrix represent the error associated with the quaternion dynamic equation, while the next three diagonal elements are the estimated error associated with Eulers moment equations. The process noise covariance matrix error estimates is 1e-5 for both quaternion and rates. Lastly, the measurement noise covariance matrix (R) contains the error expected to be associated with measurements. Note that the measurement noise covariance will change as the sun sensor is switched on and off. However, the measurement noise covariance is otherwise constant over time.
||Initial uncertainty of quaternions and angular rate (Initial
covariance matrix P)
||Process noise intensity of quaternions and angular rate (System
noise covariance matrix Q)
||Inertia tensor (Satellite configuration I)
||Measurement error variance (Measurement noise covariance matrix
The measurement and system noise covariance matrices were chosen as diagonal and adjusted through simulation to give a kalman filter with the best filtering properties smallest tracking error. In all simulations the following values were used throughout. These measurements are shown in Table 1-5.
The simulation results are represented in Fig. 2-9,
the black graph shows the true values of quaternion, the red graph represents
the EKF outputs and finally the blue one gives the UKF outputs.
||Comparison between UKF and EKF for estimated state q1
||Comparison between UKF and EKF for estimated state q2
||Comparison between UKF and EKF for estimated state q3
||Comparison between UKF and EKF for estimated state q4
||Attitude error estimated state q1
||Attitude error estimated state q2
The EKF an UKF outputs for estimating quaternion (q1, q2,
q3 and q4) are shown in Fig. 2 for q1,
Fig. 3 for q2, Fig. 4 for q3
and Fig. 5 for q4.
||Attitude error estimated state q3
||Attitude error estimated state q4
The quantities obtained from the both filters are compared with the real states
in black line as showed in the Fig. 2-5.
It can be seen that the performance of the EKF and the UKF is identical; this
is to be expected since they both assume additive white noise. However, the
UKF have shown better performances in presence of the severe nonlinearity in
The attitude error calculated between the real quantity and estimated one are
illustrated in Fig. 6 for q1, Fig.
7 for q2, Fig. 8 for q3 and Fig.
9 for q4, it is clear that the attitude converge to zero quickly.
We can see from Fig. 6-9 that the more severe
the nonlinearity of the state dynamics is the more error occurs in states estimation
obtained by EKF. The new Kalman filter UKF has also shown better performances
in presence of the severe nonlinearity, as seen in the blue graph in Fig.
6-9 for error estimation.
The UKF as a tool for state estimation has been compared to the standard method for nonlinear state estimation the EKF. The state estimation methods have been compared using the same tuning parameters to make the comparative study as credible as possible.
The UKF has been shown to be superior to the EKF in many state estimation, dual estimation and parameter estimation problems (Van der Merwe et al., 2004). It has also been used in many applications, such as state estimation for road vehicles (Julier et al., 1995), induction motors (Akin, 2003), quaternion motion (Joseph and Viola, 2003), visual contour tracking (Li and Zhang, 2002) and parameter estimation for time series modelling (Wan et al., 2000) and neural network training.
In present study, we have used the new alternative filter to space application and the results obtained show the performance of UKF over EKF. The UKF was able to converge with poor initial estimates of the parameters, while the EKF was shown to have a greater tendency to diverge due to poor initial estimates of the parameters. In general, it can be expected that the performance of the UKF is better than that of the EKF, since it propagates the state noise more accurately.
In this study, we have argued the principle difficulty for applying the Kalamn filter to non linear systems is the need to consistently predict the new state and observation of the system. We have introduced another approach of filtering, called the unscented filter, which eliminates the need explicitly derive Jacobian matrices, permitting the implementation of highly complex nonlinear system models previously considered intractable. The benefits of the algorithm were demonstrated in realistic example and we anticipate that the technique described should find widespread application in control and estimation applications.