EKF vs UKF — use case example

Jaroslaw Goslinski
9 min read · Feb 28, 2021


Previously, in The Unscented Kalman Filter, simply the best! Python code, I explained the idea of the UKF and gave a simple use case. Here I'm going to compare those results to the results of the EKF on the same use case, to show how well the UKF performs relative to the EKF. We could compare the results by looking only at the mean square error (MSE), or we can use an extra tool such as the confidence ellipse, which in the case of a two-dimensional model is a very intuitive way to assess a filter. We will use both.
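To be precise, by MSE I mean the mean of the squared state-estimation error over the whole run,

$$
\mathrm{MSE} = \frac{1}{N}\sum_{k=1}^{N} \left\lVert \hat{x}_{k|k} - x_k \right\rVert^2 ,
$$

although the code below actually accumulates and prints the un-normalized sum of squared errors, which ranks the two filters in exactly the same way.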

In the previous part, I showed the UKF with a simple multivariate nonstationary growth model (MNGM). As I promised, today I will use the same model; however, I added an offset to the state variables, just to better visualize the confidence ellipse (and to maintain the filters' stability). Below I present the slightly modified MNGM model:
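Written out, the state-transition and output equations implemented in the class below are:

$$
\begin{aligned}
x_{1,k} &= 0.5\,x_{1,k-1} - 0.1\,x_{2,k-1} + 0.7\,\frac{x_{1,k-1}}{1+x_{1,k-1}^{2}} + 2.2\cos\!\bigl(1.2\,(k-1)\bigr) + 1.5 + u_{1,k} \\
x_{2,k} &= 0.8\,x_{2,k-1} - 0.2\,x_{1,k-1} + 0.9\,\frac{x_{2,k-1}}{1+x_{2,k-1}^{2}} + 2.4\cos\!\bigl(2.2\,(k-1)\bigr) - 1.5 + u_{2,k} \\
y_{1,k} &= \frac{x_{1,k}^{2}}{18} - \frac{x_{2,k}^{2}}{24} + v_{1,k} \\
y_{2,k} &= \frac{x_{1,k}^{2}}{17} + \frac{x_{2,k}^{2}}{2} + v_{2,k}
\end{aligned}
$$

where u_k and v_k are the process and measurement noise, both drawn in the code from a standard normal distribution.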

import numpy as np


class MNGM2:
    # class used for generation of the multivariate nonstationary growth model (MNGM)
    def __init__(self, n, x_0):

        self.n = n
        self.x = np.zeros((n, 2))
        self.y = np.zeros((n, 2))

        self.xi = np.zeros(2)
        self.yi = np.zeros(2)

        self.x[0, :] = x_0[:, 0]

        self.u = np.random.normal(0., 1.0, (n, 2))
        self.v = np.random.normal(0., 1.0, (n, 2))

        self.F = np.zeros((2, 2))  # Jacobian matrix of the state model f
        self.H = np.zeros((2, 2))  # Jacobian matrix of the output model h

    def generate_data(self):
        for i in range(1, self.n):
            self.x[i, :] = self.state(i, self.x[i - 1, :]) + self.u[i, :]
            self.y[i, :] = self.output(self.x[i, :]) + self.v[i, :]

    def state(self, i, xp):
        self.xi[0] = 0.5 * xp[0] - 0.1 * xp[1] + 0.7 * (xp[0] / (1 + xp[0] ** 2)) + 2.2 * np.cos(1.2 * (i - 1)) + 1.5
        self.xi[1] = 0.8 * xp[1] - 0.2 * xp[0] + 0.9 * (xp[1] / (1 + xp[1] ** 2)) + 2.4 * np.cos(2.2 * (i - 1)) - 1.5
        return self.xi

    def output(self, xi):
        self.yi[0] = (xi[0] ** 2) / 18.0 - (xi[1] ** 2) / 24.0
        self.yi[1] = (xi[0] ** 2) / 17.0 + (xi[1] ** 2) / 2.0
        return self.yi

    def f_jacob(self, xi):
        self.F[0, 0] = 0.5 + 0.7 * (1 - xi[0] ** 2) / ((1 + xi[0] ** 2) ** 2)
        self.F[0, 1] = -0.1
        self.F[1, 0] = -0.2
        self.F[1, 1] = 0.8 + 0.9 * (1 - xi[1] ** 2) / ((1 + xi[1] ** 2) ** 2)
        return self.F

    def h_jacob(self, xi):
        self.H[0, 0] = 2.0 * xi[0] / 18.0
        self.H[0, 1] = -2.0 * xi[1] / 24.0
        self.H[1, 0] = 2.0 * xi[0] / 17.0
        self.H[1, 1] = 2.0 * xi[1] / 2.0
        return self.H

Please notice that I've added two additional methods: f_jacob(…) and h_jacob(…). These build the Jacobian matrices of the state and output functions, whose elements are constructed according to the formulas:
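Written out, the Jacobians that these two methods return are:

$$
F = \frac{\partial f}{\partial x} =
\begin{bmatrix}
0.5 + 0.7\,\dfrac{1 - x_{1}^{2}}{\left(1 + x_{1}^{2}\right)^{2}} & -0.1 \\[2ex]
-0.2 & 0.8 + 0.9\,\dfrac{1 - x_{2}^{2}}{\left(1 + x_{2}^{2}\right)^{2}}
\end{bmatrix},
\qquad
H = \frac{\partial h}{\partial x} =
\begin{bmatrix}
\dfrac{x_{1}}{9} & -\dfrac{x_{2}}{12} \\[2ex]
\dfrac{2\,x_{1}}{17} & x_{2}
\end{bmatrix}
$$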

Note that the original model representation given in the previous article uses g(…) as the output function, whereas most articles use h(…) and denote its Jacobian by H. In the code above I follow the more common h/H naming; the two functions may be treated as the same.

We have introduced f_jacob(…) and h_jacob(…) because we are going to use the EKF, which needs to be provided with the model Jacobians.
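Analytic Jacobians like these are easy to get wrong, so a quick sanity check is to compare them against a central finite-difference approximation. Here is a minimal sketch of such a check; the check_jacobians helper, the test point, and the tolerance are my own additions and are not part of the original project:

import numpy as np

from src.MNGM2 import MNGM2


def check_jacobians(model, x, step=1, eps=1e-6, tol=1e-4):
    # Compare the analytic Jacobians with a central finite-difference
    # approximation of state() and output() around the point x.
    F_num = np.zeros((2, 2))
    H_num = np.zeros((2, 2))
    for j in range(2):
        dx = np.zeros(2)
        dx[j] = eps
        # state() and output() return internal buffers, so copy their results
        fp = np.copy(model.state(step, x + dx))
        fm = np.copy(model.state(step, x - dx))
        hp = np.copy(model.output(x + dx))
        hm = np.copy(model.output(x - dx))
        F_num[:, j] = (fp - fm) / (2.0 * eps)
        H_num[:, j] = (hp - hm) / (2.0 * eps)
    assert np.allclose(model.f_jacob(x), F_num, atol=tol)
    assert np.allclose(model.h_jacob(x), H_num, atol=tol)


x_0 = np.array([[1.1], [2.1]])
check_jacobians(MNGM2(2, x_0), np.array([0.3, -0.7]))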

The EKF has a pretty straightforward form, and just to recall it, I put its equations below:

The EKF algorithm with the model equation highlighted
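In the same notation as the code below, the standard two stages read:

$$
\begin{aligned}
&\text{Prediction:} && \hat{x}_{k|k-1} = f(\hat{x}_{k-1|k-1}), \qquad P_{k|k-1} = F_k P_{k-1|k-1} F_k^{T} + Q \\
&\text{Correction:} && \tilde{y}_k = z_k - h(\hat{x}_{k|k-1}), \qquad P_{y,k} = H_k P_{k|k-1} H_k^{T} + R, \qquad K_k = P_{k|k-1} H_k^{T} P_{y,k}^{-1} \\
& && \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \tilde{y}_k, \qquad P_{k|k} = (I - K_k H_k) P_{k|k-1}
\end{aligned}
$$

where F_k is the Jacobian of f evaluated at the previous a posteriori estimate and H_k is the Jacobian of h evaluated at the a priori estimate.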

The EKF (like all Kalman filters) is a two-stage algorithm: during the prediction stage the a priori estimate of the state is computed, while in the correction stage the final estimate is found, based on the model and the measured output values. Here is a basic implementation of the Extended Kalman Filter:

import numpy as np
import scipy
import scipy.linalg # SciPy Linear Algebra Library
from src.MNGM2 import MNGM2

class EKF:

    def __init__(self, n, m):

        self.n = n  # size of the state vector
        self.m = m  # size of the output vector

        # all vectors used in the EKF process
        self.x_apriori = np.zeros((self.n,), dtype=float)
        self.x_aposteriori = np.zeros((self.n,), dtype=float)
        self.y_P = np.zeros((self.m,), dtype=float)
        self.y = np.zeros((self.m,), dtype=float)

        # covariance matrices used in the process
        self.F = np.zeros((self.n, self.n), dtype=float)
        self.H = np.zeros((self.m, self.n), dtype=float)

        self.P_apriori = np.zeros((self.n, self.n), dtype=float)
        self.P_aposteriori = np.zeros((self.n, self.n), dtype=float)

        self.P_y = np.zeros((self.m, self.m), dtype=float)
        self.K = np.zeros((self.n, self.m), dtype=float)

        self.Q = np.zeros((self.n, self.n), dtype=float)
        self.R = np.zeros((self.m, self.m), dtype=float)

        self.I = np.identity(self.n)

        self.mngm = 0


    def resetEKF(self, _Q, _R, x_0):
        # _Q - filter process noise covariance
        # _R - measurement noise covariance
        # the initial P matrices get _Q on their diagonal

        self.mngm = MNGM2(self.n, x_0)

        # init of all vectors and matrices where the first dim := m
        self.y_P = np.zeros((self.m,))
        self.y = np.zeros((self.m,))
        self.P_y = np.zeros((self.m, self.m))

        # init of all vectors and matrices where the first dim := n
        self.x_apriori = x_0[:, 0]
        self.x_aposteriori = x_0[:, 0]

        self.K = np.zeros((self.n, self.m))

        self.P_apriori = np.zeros((self.n, self.n))
        self.P_aposteriori = np.zeros((self.n, self.n))

        for i in range(0, self.n):
            self.P_apriori[i, i] = _Q
            self.P_aposteriori[i, i] = _Q

        self.setCovariances(_Q, _R)

    def setCovariances(self, _Q, _R):

        self.Q = np.zeros((self.n, self.n))
        self.R = np.zeros((self.m, self.m))

        for i in range(self.n):
            self.Q[i, i] = _Q

        for i in range(self.m):
            self.R[i, i] = _R

    def timeUpdate(self, w):

        # a priori state: propagate the previous estimate through the state function
        self.x_apriori = self.mngm.state(w, self.x_aposteriori)

        # a priori covariance matrix, with the state Jacobian evaluated
        # at the previous a posteriori estimate (the linearization point of f)
        self.F = self.mngm.f_jacob(self.x_aposteriori)
        self.P_apriori = np.matmul(self.F, np.matmul(self.P_aposteriori, np.transpose(self.F))) + self.Q


    def measurementUpdate(self, z):

        # innovation (measurement residual)
        self.y = self.mngm.output(self.x_apriori)
        self.y_P = z - self.y

        # Jacobian of the output function
        self.H = self.mngm.h_jacob(self.x_apriori)

        # output (innovation) covariance
        self.P_y = np.matmul(self.H, np.matmul(self.P_apriori, np.transpose(self.H))) + self.R

        # Kalman gain
        self.K = np.matmul(np.matmul(self.P_apriori, np.transpose(self.H)), np.linalg.inv(self.P_y))

        # a posteriori state
        self.x_aposteriori = self.x_apriori + np.matmul(self.K, self.y_P)

        # a posteriori covariance
        self.P_aposteriori = np.matmul((self.I - np.matmul(self.K, self.H)), self.P_apriori)

Now, having the model, the UKF, and the EKF, we can use all of them to estimate the model state. To do this we put all the classes in one project, create some vectors of data, and proceed with the estimation:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

# module paths assumed to mirror the src.MNGM2 layout used above
from src.MNGM2 import MNGM2
from src.EKF import EKF
from src.UKF import UKF


def estimateState():
    n = 2  # size of the state vector
    m = 2  # size of the output vector

    # initial x value
    x_0 = np.zeros((n, 1))
    x_0[0, 0] = 1.1
    x_0[1, 0] = 2.1

    # the model with the added offset
    mngm = MNGM2(500, x_0)
    mngm.generate_data()

    ukf = UKF(n, m)
    ekf = EKF(n, m)

    # generated data:
    dataX = mngm.x
    dataY = mngm.y

    size_n = dataX.shape[0]

    ukf.resetUKF(3.0, 1.0, x_0)
    ekf.resetEKF(3.0, 1.0, x_0)

    err_ukf = 0
    err_ekf = 0

    est_state_ukf = np.zeros((size_n, n))
    est_y_ukf = np.zeros((size_n, m))
    est_P_ukf = np.zeros((size_n, n * n))
    est_Py_ukf = np.zeros((size_n, m * m))

    est_state_ekf = np.zeros((size_n, n))
    est_y_ekf = np.zeros((size_n, m))
    est_P_ekf = np.zeros((size_n, n * n))
    est_Py_ekf = np.zeros((size_n, m * m))

    # estimation loop
    for i in range(size_n):

        timeUpdateInput = i
        measurementUpdateInput = dataY[i, :]

        # recursively go through the time update and the measurement correction
        ekf.timeUpdate(timeUpdateInput)
        ekf.measurementUpdate(measurementUpdateInput)

        ukf.timeUpdate(timeUpdateInput)
        ukf.measurementUpdate(measurementUpdateInput)

        err_ukf = err_ukf + np.sum((ukf.x_aposteriori - dataX[i, :]) ** 2)
        err_ekf = err_ekf + np.sum((ekf.x_aposteriori - dataX[i, :]) ** 2)

        est_state_ukf[i, :] = ukf.x_aposteriori
        est_P_ukf[i, :] = ukf.P_aposteriori.flatten()
        est_Py_ukf[i, :] = ukf.P_y.flatten()
        est_y_ukf[i, :] = ukf.y

        est_state_ekf[i, :] = ekf.x_aposteriori
        est_P_ekf[i, :] = ekf.P_aposteriori.flatten()
        est_Py_ekf[i, :] = ekf.P_y.flatten()
        est_y_ekf[i, :] = ekf.y

    print("total error ukf:", err_ukf)
    print("total error ekf:", err_ekf)

    plt.plot(dataX[:, 0], 'g', label='x_1 original')
    plt.plot(dataX[:, 1], 'b', label='x_2 original')
    plt.plot(est_state_ukf[:, 0], 'r--', label='x_1 ukf')
    plt.plot(est_state_ukf[:, 1], 'k--', label='x_2 ukf')
    plt.plot(est_state_ekf[:, 0], 'b--', label='x_1 ekf')
    plt.plot(est_state_ekf[:, 1], 'c--', label='x_2 ekf')
    plt.legend(loc='upper right')
    plt.show()

    plot_confidence_ellipse_cov_all(dataX, dataY,
                                    est_state_ekf, est_y_ekf, est_P_ekf, est_Py_ekf,
                                    est_state_ukf, est_y_ukf, est_P_ukf, est_Py_ukf)


estimateState()

By running the code above, I got the following estimations for both filters, i.e., the EKF and the UKF:

The MSE values for the EKF and the UKF are as follows:

As one can tell, the difference is visible. The UKF handles the model's nonlinearities, while the EKF sometimes does not predict the state properly. Although the MSE is sufficient proof that the UKF outperforms the EKF in this particular case (in practice you will find many cases where the difference is not so evident), I want to show you an additional way of comparing the filters. As mentioned in the beginning, stochastic filters can also be assessed through their covariances. That is a plus because we can look at the distribution of the state vector as well as of the output vector. Below you can see how we can actually plot the covariance matrices. Please note that I intentionally used a 2D model to be able to plot 2D figures; for higher-dimensional models I would recommend plotting at most three variables, chosen from the state/output vector.
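As a quick reminder of the principle used in the plotting code below: for a 2×2 covariance matrix Σ, the confidence ellipse is obtained from the eigendecomposition

$$
\Sigma = V \Lambda V^{T}, \qquad a_i = k\sqrt{\lambda_i}, \qquad \theta = \operatorname{atan2}\!\left(v_{1,y},\, v_{1,x}\right),
$$

where the semi-axes a_i are proportional to the square roots of the eigenvalues, the orientation θ comes from the first eigenvector, and k sets the confidence level. The plots below use k = 3 (the 3σ ellipse), which is where the factor of 6 in the ellipse width and height comes from (full axis length = 2·3·√λ_i).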

Here is the code for covariance plotting:

def plot_confidence_ellipse_cov_all(dataX, dataY,
                                    est_state_ekf, est_y_ekf, P_ekf, Py_ekf,
                                    est_state_ukf, est_y_ukf, P_ukf, Py_ukf):
    n = dataX.shape[0]

    sig = np.zeros((n, 4))   # columns: x0, x1, y0, y1
    sig_mean = np.zeros(4)   # mean(x0), mean(x1), mean(y0), mean(y1)

    est_mean_ekf = np.mean(est_state_ekf, axis=0)
    est_y_mean_ekf = np.mean(est_y_ekf, axis=0)
    est_mean_ukf = np.mean(est_state_ukf, axis=0)
    est_y_mean_ukf = np.mean(est_y_ukf, axis=0)

    for i in range(2):
        sig[:, i] = dataX[:, i]
        sig[:, i + 2] = dataY[:, i]

        sig_mean[i] = np.mean(sig[:, i])
        sig_mean[i + 2] = np.mean(sig[:, i + 2])

    # cov of the EKF P (averaged over the run)
    cov1 = np.zeros((2, 2))
    cov1[0, 0] = np.mean(P_ekf[:, 0])
    cov1[0, 1] = np.mean(P_ekf[:, 1])
    cov1[1, 0] = np.mean(P_ekf[:, 2])
    cov1[1, 1] = np.mean(P_ekf[:, 3])
    lambda_p_ekf, v = np.linalg.eig(cov1)
    lambda_p_ekf = np.sqrt(lambda_p_ekf)
    angle_p_ekf = np.arctan2(*v[:, 0][::-1])

    # EKF P_y cov
    cov1[0, 0] = np.mean(Py_ekf[:, 0])
    cov1[0, 1] = np.mean(Py_ekf[:, 1])
    cov1[1, 0] = np.mean(Py_ekf[:, 2])
    cov1[1, 1] = np.mean(Py_ekf[:, 3])
    lambda_py_ekf, v = np.linalg.eig(cov1)
    lambda_py_ekf = np.sqrt(lambda_py_ekf)
    angle_py_ekf = np.arctan2(*v[:, 0][::-1])

    # cov of the UKF P
    cov1 = np.zeros((2, 2))
    cov1[0, 0] = np.mean(P_ukf[:, 0])
    cov1[0, 1] = np.mean(P_ukf[:, 1])
    cov1[1, 0] = np.mean(P_ukf[:, 2])
    cov1[1, 1] = np.mean(P_ukf[:, 3])
    lambda_p_ukf, v = np.linalg.eig(cov1)
    lambda_p_ukf = np.sqrt(lambda_p_ukf)
    angle_p_ukf = np.arctan2(*v[:, 0][::-1])

    # UKF P_y cov
    cov1[0, 0] = np.mean(Py_ukf[:, 0])
    cov1[0, 1] = np.mean(Py_ukf[:, 1])
    cov1[1, 0] = np.mean(Py_ukf[:, 2])
    cov1[1, 1] = np.mean(Py_ukf[:, 3])
    lambda_py_ukf, v = np.linalg.eig(cov1)
    lambda_py_ukf = np.sqrt(lambda_py_ukf)
    angle_py_ukf = np.arctan2(*v[:, 0][::-1])

    # covariance of the generated signals
    angle = np.zeros(2)
    lambda_ = np.zeros(4)

    # first ellipse: state variables
    cov1 = np.cov(sig[:, 0], sig[:, 1])
    lambda_1, v = np.linalg.eig(cov1)
    lambda_1 = np.sqrt(lambda_1)
    lambda_[0:2] = lambda_1
    angle[0] = np.arctan2(*v[:, 0][::-1])

    # second ellipse: output variables
    cov2 = np.cov(sig[:, 2], sig[:, 3])
    lambda_2, v = np.linalg.eig(cov2)
    lambda_2 = np.sqrt(lambda_2)
    lambda_[2:4] = lambda_2
    angle[1] = np.arctan2(*v[:, 0][::-1])

    fig, axs = plt.subplots(1, 2, figsize=(6, 3))
    for i, ax in enumerate(axs):
        ax.scatter(sig[:, i * 2], sig[:, i * 2 + 1], s=0.9)
        ax.axvline(c='grey', lw=1)
        ax.axhline(c='grey', lw=1)
        if i == 0:
            ell_p_ekf = Ellipse(xy=(est_mean_ekf[0], est_mean_ekf[1]),
                                width=lambda_p_ekf[0] * 6, height=lambda_p_ekf[1] * 6,
                                angle=np.rad2deg(angle_p_ekf), edgecolor='blue', label=r'$3\sigma$')
            ax.add_artist(ell_p_ekf)
            ell_p_ekf.set_facecolor('none')
            ax.scatter(est_mean_ekf[0], est_mean_ekf[1], c='blue', s=3)

            ell_p_ukf = Ellipse(xy=(est_mean_ukf[0], est_mean_ukf[1]),
                                width=lambda_p_ukf[0] * 6, height=lambda_p_ukf[1] * 6,
                                angle=np.rad2deg(angle_p_ukf), edgecolor='green', label=r'$3\sigma$')
            ax.add_artist(ell_p_ukf)
            ell_p_ukf.set_facecolor('none')
            ax.scatter(est_mean_ukf[0], est_mean_ukf[1], c='green', s=3)

        if i == 1:
            ell_py_ekf = Ellipse(xy=(est_y_mean_ekf[0], est_y_mean_ekf[1]),
                                 width=lambda_py_ekf[0] * 6, height=lambda_py_ekf[1] * 6,
                                 angle=np.rad2deg(angle_py_ekf), edgecolor='blue', label=r'$3\sigma$')
            ax.add_artist(ell_py_ekf)
            ell_py_ekf.set_facecolor('none')
            ax.scatter(est_y_mean_ekf[0], est_y_mean_ekf[1], c='blue', s=3)

            ell_py_ukf = Ellipse(xy=(est_y_mean_ukf[0], est_y_mean_ukf[1]),
                                 width=lambda_py_ukf[0] * 6, height=lambda_py_ukf[1] * 6,
                                 angle=np.rad2deg(angle_py_ukf), edgecolor='green', label=r'$3\sigma$')
            ax.add_artist(ell_py_ukf)
            ell_py_ukf.set_facecolor('none')
            ax.scatter(est_y_mean_ukf[0], est_y_mean_ukf[1], c='green', s=3)

        # reference ellipse from the generated data
        ell = Ellipse(xy=(sig_mean[i * 2], sig_mean[i * 2 + 1]),
                      width=lambda_[i * 2] * 6, height=lambda_[i * 2 + 1] * 6,
                      angle=np.rad2deg(angle[i]), edgecolor='firebrick', label=r'$3\sigma$')
        ax.add_artist(ell)
        ell.set_facecolor('none')
        ax.scatter(sig_mean[i * 2], sig_mean[i * 2 + 1], c='red', s=3)
        if i == 0:
            ax.set_title('cov state var')
        else:
            ax.set_title('cov output var')
    plt.show()

Finally, we can plot the covariances for the model state, the EKF, and the UKF:

The figure above depicts the real problem of the EKF: it misestimates not only the mean of the state but also (or mostly) the distribution of the state, which is skewed and does not reflect the real model distribution.

Conclusions

In my career I have used the UKF and the EKF many times, mostly for autonomous robots, but also in vision systems and even econometrics. What I've learned is that the difference in performance between the UKF and the EKF is not always so pronounced, and for simple nonlinear models the UKF may be only slightly better. However, the UKF is much easier to use with complicated models, for which deriving the Jacobians can take additional work. It is also worth mentioning another problem related to Kalman filters: sometimes they are unstable.

The project can be found here: https://github.com/jaroslav87/UKF-vs-EKF.git. Please share your thoughts, and don't forget to follow me for more projects on data filtering, data science, and machine learning.

