Models for Vision
Temporal models
Temporal estimation frameworks
Let us assume we have a state \(\rvw\) that represents an evolving system. Our goal is to compute the marginal posterior distribution over \(\rvw\) at each time \(t\), i.e., to infer a sequence of world states \(\{\rvw_t\}_{t=1}^T\) from a noisy sequence of measurements \(\{\rvx_t\}_{t=1}^T\). There are two important components for estimating the states:
- Measurement model \(Pr(\rvx_t|\rvw_t)\): we assume \(\rvx_t\) is conditionally independent of \(\rvw_{1\dots t-1}\) given \(\rvw_t\).
- Temporal model \(Pr(\rvw_t|\rvw_{t-1})\): we assume \(\rvw_t\) is conditionally independent of \(\rvw_{1\dots t-2}\) given \(\rvw_{t-1}\).
Recursive Estimation
Time 1 (Posterior):
\[
Pr(\rvw_1|\rvx_1) = \frac{Pr(\rvx_1|\rvw_1)Pr(\rvw_1)}{\int Pr(\rvx_1|\rvw_1)Pr(\rvw_1)d\rvw_1}
\]
Time 2 (Posterior):
\[
Pr(\rvw_2|\rvx_1,\rvx_2) = \frac{Pr(\rvx_2|\rvw_2)Pr(\rvw_2|\rvx_1)}{\int Pr(\rvx_2|\rvw_2)Pr(\rvw_2|\rvx_1)d\rvw_2} = \frac{Pr(\rvx_2, \rvw_2|\rvx_1)}{Pr(\rvx_2|\rvx_1)}
\]
Time t (Posterior):
\[
Pr(\rvw_t|\rvx_{1\dots t}) = \frac{\overbrace{Pr(\rvx_t|\rvw_t)}^{\text{likelihood}} \,
\overbrace{Pr(\rvw_t|\rvx_{1\dots t-1})}^{\text{prior}}}{\int Pr(\rvx_t|\rvw_t)Pr(\rvw_t|\rvx_{1\dots t-1})d\rvw_t} = \frac{Pr(\rvx_t, \rvw_t|\rvx_{1\dots t-1})}{Pr(\rvx_t|\rvx_{1\dots t-1})}
\]
The prior at each timestep is obtained via the Chapman-Kolmogorov equation:
\[
Pr(\rvw_t|\rvx_{1\dots t-1}) = \int Pr(\rvw_t|\rvw_{t-1})Pr(\rvw_{t-1}|\rvx_{1\dots t-1})d\rvw_{t-1}
\]
i.e., the prior at time \(t\) is the posterior at time \(t-1\) propagated through the temporal model and marginalized over \(\rvw_{t-1}\).
Thus, state estimation alternates between two steps (a discrete-state sketch follows this list):
- Temporal evolution: \(Pr(\rvw_t|\rvx_{1\dots t-1}) = \int Pr(\rvw_t|\rvw_{t-1})Pr(\rvw_{t-1}|\rvx_{1\dots t-1})d\rvw_{t-1}\)
- Measurement update: \(Pr(\rvw_t|\rvx_{1\dots t}) = \frac{Pr(\rvx_t|\rvw_t)Pr(\rvw_t|\rvx_{1\dots t-1})}{\int Pr(\rvx_t|\rvw_t)Pr(\rvw_t|\rvx_{1\dots t-1})d\rvw_t} = \frac{Pr(\rvx_t, \rvw_t|\rvx_{1\dots t-1})}{Pr(\rvx_t|\rvx_{1\dots t-1})}\)
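To make this recursion concrete, the sketch below implements the two steps for a discrete state space, where the integrals reduce to sums. This is a minimal illustration: the transition matrix, likelihood values, and function names are assumptions made for the example, not part of the notes above.
```python
import numpy as np

def temporal_evolution(posterior_prev, transition):
    """Prior at time t: sum_j Pr(w_t | w_{t-1}=j) Pr(w_{t-1}=j | x_{1:t-1}).

    posterior_prev: (K,) posterior over the discrete states at time t-1
    transition:     (K, K) matrix with transition[i, j] = Pr(w_t=i | w_{t-1}=j)
    """
    return transition @ posterior_prev

def measurement_update(prior, likelihood):
    """Posterior at time t: likelihood * prior, renormalized.

    prior:      (K,) predictive distribution Pr(w_t | x_{1:t-1})
    likelihood: (K,) values Pr(x_t | w_t=i) for the observed x_t
    """
    unnormalized = likelihood * prior
    return unnormalized / unnormalized.sum()

# Toy run with three discrete states (all numbers are made up).
transition = np.array([[0.8, 0.1, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])
belief = np.full(3, 1.0 / 3.0)                    # initial belief over the state
for lik in [np.array([0.9, 0.05, 0.05]),          # Pr(x_t | w_t) for each observed x_t
            np.array([0.1, 0.7, 0.2])]:
    prior  = temporal_evolution(belief, transition)
    belief = measurement_update(prior, lik)
```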
Kalman Filter
The Kalman filter chooses the measurement and temporal models to be linear with additive Gaussian noise, so that the posterior distribution remains Gaussian at every timestep.
- Temporal evolution equation:
\[
\rvw_t = \boldsymbol{\Psi}\rvw_{t-1} + \boldsymbol{\mu}_p + \boldsymbol{\epsilon}_p
\]
where \(\boldsymbol{\Psi}\) is the state transition matrix (i.e., the motion model) and \(\boldsymbol{\epsilon}_p\) is additive Gaussian noise with covariance \(\boldsymbol{\Sigma}_p\). Thus, we have:
\[
Pr(\rvw_t|\rvw_{t-1}) = \mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_p + \boldsymbol{\Psi}\rvw_{t-1}, \boldsymbol{\Sigma}_p]
\]
- Measurement equation:
\[
\rvx_t = \boldsymbol{\Phi}\rvw_t + \boldsymbol{\mu}_m + \boldsymbol{\epsilon}_m
\]
where \(\boldsymbol{\Phi}\) is the measurement matrix (i.e., the observation model) and \(\boldsymbol{\epsilon}_m\) is additive Gaussian noise with covariance \(\boldsymbol{\Sigma}_m\). Thus, we have:
\[
Pr(\rvx_t|\rvw_t) = \mathrm{Norm}_{\rvx_t}[\boldsymbol{\mu}_m + \boldsymbol{\Phi}\rvw_t, \boldsymbol{\Sigma}_m]
\]
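As a quick sanity check of these two equations, the following sketch simulates a trajectory and its measurements from an assumed 1-D constant-velocity model (state = position and velocity, position-only measurements). All parameter values are arbitrary choices for the example.
```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed 1-D constant-velocity model: state w = [position, velocity].
Psi     = np.array([[1.0, 1.0],
                    [0.0, 1.0]])       # state transition matrix
mu_p    = np.zeros(2)                  # temporal offset
Sigma_p = 0.01 * np.eye(2)             # process noise covariance

Phi     = np.array([[1.0, 0.0]])       # measure the position only
mu_m    = np.zeros(1)                  # measurement offset
Sigma_m = np.array([[0.5]])            # measurement noise covariance

w = np.array([0.0, 1.0])               # initial state
states, measurements = [], []
for t in range(20):
    # w_t = Psi w_{t-1} + mu_p + eps_p
    w = Psi @ w + mu_p + rng.multivariate_normal(np.zeros(2), Sigma_p)
    # x_t = Phi w_t + mu_m + eps_m
    x = Phi @ w + mu_m + rng.multivariate_normal(np.zeros(1), Sigma_m)
    states.append(w)
    measurements.append(x)
```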
The temporal evolution (prediction) step is then:
\[
\begin{aligned}
\overbrace{Pr(\rvw_t|\rvx_{1\dots t-1})}^{\text{Prior}} &= \int Pr(\rvw_t|\rvw_{t-1})Pr(\rvw_{t-1}|\rvx_{1\dots t-1})d\rvw_{t-1}\\
&= \int\mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_p + \boldsymbol{\Psi}\rvw_{t-1}, \boldsymbol{\Sigma}_p]\mathrm{Norm}_{\rvw_{t-1}}[\boldsymbol{\mu}_{t-1}, \boldsymbol{\Sigma}_{t-1}]d\rvw_{t-1}\\
&= \mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_p + \boldsymbol{\Psi}\boldsymbol{\mu}_{t-1}, \boldsymbol{\Sigma}_p + \boldsymbol{\Psi}\boldsymbol{\Sigma}_{t-1}\boldsymbol{\Psi}^T]\\
&= \mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_{+}, \boldsymbol{\Sigma}_{+}]
\end{aligned}
\]
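This prediction step is a direct transcription of the last line above; here is a minimal sketch assuming NumPy arrays for the means and covariances (the function name is just for the example):
```python
import numpy as np

def kf_predict(mu_prev, Sigma_prev, Psi, mu_p, Sigma_p):
    """Temporal evolution: Norm[mu_plus, Sigma_plus] from the posterior at t-1."""
    mu_plus    = mu_p + Psi @ mu_prev
    Sigma_plus = Sigma_p + Psi @ Sigma_prev @ Psi.T
    return mu_plus, Sigma_plus
```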
The measurement update then incorporates \(\rvx_t\):
\[
\begin{aligned}
Pr(\rvw_t|\rvx_{1\dots t}) &= \frac{Pr(\rvx_t|\rvw_t)Pr(\rvw_t|\rvx_{1\dots t-1})}{Pr(\rvx_t|\rvx_{1\dots t-1})}\\
&= \frac{\mathrm{Norm}_{\rvx_t}[\boldsymbol{\mu}_m + \boldsymbol{\Phi}\rvw_t, \boldsymbol{\Sigma}_m]\mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_{+}, \boldsymbol{\Sigma}_{+}]}{Pr(\rvx_t|\rvx_{1\dots t-1})}\\
&= \mathrm{Norm}_{\rvw_t}[(\boldsymbol{\Phi}^T\boldsymbol{\Sigma}_m^{-1}\boldsymbol{\Phi} + \boldsymbol{\Sigma}_{+}^{-1})^{-1}(\boldsymbol{\Phi}^T\boldsymbol{\Sigma}_m^{-1}(\rvx_t-\boldsymbol{\mu}_m) + \boldsymbol{\Sigma}_{+}^{-1}\boldsymbol{\mu}_{+}), (\boldsymbol{\Phi}^T\boldsymbol{\Sigma}_m^{-1}\boldsymbol{\Phi} + \boldsymbol{\Sigma}_{+}^{-1})^{-1}]
\end{aligned}
\]
If we define the Kalman gain:
\[
\boldsymbol{K}_t = \boldsymbol{\Sigma}_{+}\boldsymbol{\Phi}^T(\boldsymbol{\Phi}\boldsymbol{\Sigma}_{+}\boldsymbol{\Phi}^T + \boldsymbol{\Sigma}_m)^{-1}
\]
Then we have:
\[
Pr(\rvw_t|\rvx_{1\dots t}) = \mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_{+} + \boldsymbol{K}_t(\rvx_t - \boldsymbol{\mu}_m - \boldsymbol{\Phi}\boldsymbol{\mu}_{+}), (\boldsymbol{I} - \boldsymbol{K}_t\boldsymbol{\Phi})\boldsymbol{\Sigma}_{+}]
\]
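The corresponding measurement update, written in the Kalman-gain form above (again a sketch with assumed NumPy inputs):
```python
import numpy as np

def kf_update(mu_plus, Sigma_plus, x, Phi, mu_m, Sigma_m):
    """Measurement update in the Kalman-gain form."""
    S = Phi @ Sigma_plus @ Phi.T + Sigma_m             # innovation covariance
    K = Sigma_plus @ Phi.T @ np.linalg.inv(S)          # Kalman gain K_t
    mu_t    = mu_plus + K @ (x - mu_m - Phi @ mu_plus)
    Sigma_t = (np.eye(len(mu_plus)) - K @ Phi) @ Sigma_plus
    return mu_t, Sigma_t
```
Alternating kf_predict and kf_update over the measurement sequence gives the full filter.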
Extended Kalman Filter
Key idea: Use non-linear temporal and measurement models:
\[
\begin{aligned}
\rvw_t &= \rvf(\rvw_{t-1}) + \boldsymbol{\epsilon}_p\\
\rvx_t &= \rvg(\rvw_t) + \boldsymbol{\epsilon}_m
\end{aligned}
\]
We linearize via a first-order Taylor expansion about the current state estimates; the key step is to compute the Jacobian matrices of the non-linear functions:
\[
\begin{aligned}
\boldsymbol{\Psi} &= \frac{\partial \rvf[\rvw_{t-1}, \boldsymbol{\epsilon}_p]}{\partial \rvw_{t-1}}|_{\boldsymbol{\mu}_{t-1}, 0}\\
\Upsilon_p &= \frac{\partial \rvf[\rvw_{t-1}, \boldsymbol{\epsilon}_p]}{\partial \boldsymbol{\epsilon}_p}|_{\boldsymbol{\mu}_{t-1}, 0}\\
\boldsymbol{\Phi} &= \frac{\partial \rvg[\rvw_t, \boldsymbol{\epsilon}_m]}{\partial \rvw_t}|_{\boldsymbol{\mu}_{+}, 0}\\
\Upsilon_m &= \frac{\partial \rvg[\rvw_t, \boldsymbol{\epsilon}_m]}{\partial \boldsymbol{\epsilon}_m}|_{\boldsymbol{\mu}_{+}, 0}
\end{aligned}
\]
Then, we have:
- State prediction: \(\boldsymbol{\mu}_{+} = \rvf[\boldsymbol{\mu}_{t-1}, 0]\)
- Covariance prediction: \(\boldsymbol{\Sigma}_{+} = \boldsymbol{\Psi}\boldsymbol{\Sigma}_{t-1}\boldsymbol{\Psi}^T + \Upsilon_p\boldsymbol{\Sigma}_p\Upsilon_p^T\)
- State update: \(\boldsymbol{\mu}_t = \boldsymbol{\mu}_{+} + \boldsymbol{K}_t(\rvx_t - \rvg[\boldsymbol{\mu}_{+}, 0])\)
- Covariance update: \(\boldsymbol{\Sigma}_t = (\boldsymbol{I} - \boldsymbol{K}_t\boldsymbol{\Phi})\boldsymbol{\Sigma}_{+}\)
The Kalman gain is
\[
\boldsymbol{K}_t = \boldsymbol{\Sigma}_{+}\boldsymbol{\Phi}^T(\boldsymbol{\Phi}\boldsymbol{\Sigma}_{+}\boldsymbol{\Phi}^T + \Upsilon_m\boldsymbol{\Sigma}_m\Upsilon_m^T)^{-1}
\]
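Putting the pieces together, here is a sketch of one EKF cycle. It assumes additive noise exactly as in the equations \(\rvw_t = \rvf(\rvw_{t-1}) + \boldsymbol{\epsilon}_p\) and \(\rvx_t = \rvg(\rvw_t) + \boldsymbol{\epsilon}_m\), so that \(\Upsilon_p = \Upsilon_m = \boldsymbol{I}\); the Jacobian functions F_jac and G_jac are assumed to be supplied by the user (e.g., derived analytically or by automatic differentiation).
```python
import numpy as np

def ekf_step(mu_prev, Sigma_prev, x, f, g, F_jac, G_jac, Sigma_p, Sigma_m):
    """One EKF predict/update cycle for additive-noise models,
    w_t = f(w_{t-1}) + eps_p and x_t = g(w_t) + eps_m,
    so that Upsilon_p = Upsilon_m = I."""
    # State and covariance prediction.
    mu_plus    = f(mu_prev)
    Psi        = F_jac(mu_prev)                        # Jacobian of f at mu_{t-1}
    Sigma_plus = Psi @ Sigma_prev @ Psi.T + Sigma_p

    # Measurement update.
    Phi = G_jac(mu_plus)                               # Jacobian of g at mu_+
    S   = Phi @ Sigma_plus @ Phi.T + Sigma_m
    K   = Sigma_plus @ Phi.T @ np.linalg.inv(S)        # Kalman gain
    mu_t    = mu_plus + K @ (x - g(mu_plus))
    Sigma_t = (np.eye(len(mu_prev)) - K @ Phi) @ Sigma_plus
    return mu_t, Sigma_t
```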
Unscented Kalman Filter
Key idea: Use the unscented transform, i.e., approximate the distribution with a small set of deterministically chosen sigma points that are propagated through the non-linear functions. The posterior distribution at time \(t-1\) is approximated as:
\[
\begin{aligned}
Pr(\rvw_{t-1}|\rvx_{1\dots t-1}) &= \mathrm{Norm}_{\rvw_{t-1}}[\boldsymbol{\mu}_{t-1}, \boldsymbol{\Sigma}_{t-1}]\\
&\approx \sum_{j=0}^{2n}a_j\delta[\rvw_{t-1} - \hat{\rvw}^{[j]}]\\
\end{aligned}
\]
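A sketch of how the \(2n+1\) sigma points \(\hat{\rvw}^{[j]}\) and weights \(a_j\) can be generated for a Gaussian is given below. The specific weighting scheme (parameterized by kappa) is one common choice and is an assumption here, since the notes above do not fix it.
```python
import numpy as np

def sigma_points(mu, Sigma, kappa=1.0):
    """Return 2n+1 sigma points and weights a_j approximating Norm[mu, Sigma]
    as a weighted sum of delta functions (one common scaling; others exist)."""
    n = len(mu)
    L = np.linalg.cholesky((n + kappa) * Sigma)   # matrix square-root factor
    points  = [mu]
    weights = [kappa / (n + kappa)]
    for j in range(n):
        points.append(mu + L[:, j])
        points.append(mu - L[:, j])
        weights += [1.0 / (2 * (n + kappa))] * 2
    return np.array(points), np.array(weights)
```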
Particle Filter
Key idea: Represent the probability distribution as a set of weighted particles. The posterior distribution at time \(t-1\) is represented as:
\[
\begin{aligned}
Pr(\rvw_{t-1}|\rvx_{1\dots t-1}) &= \sum_{j=1}^J a_j\delta[\rvw_{t-1} - \hat{\rvw}_{t-1}^{[j]}]\\
\end{aligned}
\]
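Below is a sketch of one step of a basic bootstrap particle filter: resample particles according to their weights, propagate each through the temporal model, and re-weight by the measurement likelihood. The sample_temporal and likelihood callables are assumed helpers, not defined in the notes above.
```python
import numpy as np

def particle_filter_step(particles, weights, x, sample_temporal, likelihood, rng):
    """One bootstrap (resample, propagate, re-weight) particle filter step.

    particles:       (J, n) array of particles w_hat_{t-1}^[j]
    weights:         (J,)  normalized weights a_j
    sample_temporal: draws w_t ~ Pr(w_t | w_{t-1}) for an array of particles
    likelihood:      evaluates Pr(x_t | w_t) for an array of particles
    """
    J = len(particles)
    # Resample according to the current weights.
    idx = rng.choice(J, size=J, p=weights)
    particles = particles[idx]
    # Propagate each particle through the temporal model Pr(w_t | w_{t-1}).
    particles = sample_temporal(particles, rng)
    # Re-weight by the measurement likelihood Pr(x_t | w_t) and normalize.
    weights = likelihood(x, particles)
    return particles, weights / weights.sum()
```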