Models for Vision
Temporal models
Temporal estimation frameworks
Let us assume we have a state \(\rvw\) that represents an evolving system. Our goal is to compute the marginal posterior distribution over \(\rvw\) at each time \(t\), i.e., to infer a sequence of world states \(\{\rvw_t\}_{t=1}^T\) from a noisy sequence of measurements \(\{\rvx_t\}_{t=1}^T\). There are two important components for estimating the states:
- Measurement model \(Pr(\rvx_t|\rvw_t)\): we assume \(\rvx_t\) is conditionally independent of \(\rvw_{1\dots t-1}\) given \(\rvw_t\).
- Temporal model \(Pr(\rvw_t|\rvw_{t-1})\): we assume \(\rvw_t\) is conditionally independent of \(\rvw_{1\dots t-2}\) given \(\rvw_{t-1}\).
Recursive Estimation
Time 1 (Posterior):
\[
Pr(\rvw_1|\rvx_1) = \frac{Pr(\rvx_1|\rvw_1)Pr(\rvw_1)}{\int Pr(\rvx_1|\rvw_1)Pr(\rvw_1)d\rvw_1}
\]
Time 2 (Posterior):
\[
Pr(\rvw_2|\rvx_1,\rvx_2) = \frac{Pr(\rvx_2|\rvw_2)Pr(\rvw_2|\rvx_1)}{\int Pr(\rvx_2|\rvw_2)Pr(\rvw_2|\rvx_1)d\rvw_2} = \frac{Pr(\rvx_2, \rvw_2|\rvx_1)}{Pr(\rvx_2|\rvx_1)}
\]
Time t (Posterior):
\[
Pr(\rvw_t|\rvx_{1\dots t}) = \frac{\overbrace{Pr(\rvx_t|\rvw_t)}^{\text{likelihood}} \,
\overbrace{Pr(\rvw_t|\rvx_{1\dots t-1})}^{\text{prior}}}{\int Pr(\rvx_t|\rvw_t)Pr(\rvw_t|\rvx_{1\dots t-1})d\rvw_t} = \frac{Pr(\rvx_t, \rvw_t|\rvx_{1\dots t-1})}{Pr(\rvx_t|\rvx_{1\dots t-1})}
\]
The prior at each timestep is obtained via the Chapman-Kolmogorov equation:
\[
Pr(\rvw_t|\rvx_{1\dots t-1}) = \int Pr(\rvw_t|\rvw_{t-1})Pr(\rvw_{t-1}|\rvx_{1\dots t-1})d\rvw_{t-1}
\]
i.e., the prior at time \(t\) is the posterior at time \(t-1\) propagated through the temporal model and marginalized over \(\rvw_{t-1}\).
Thus, state estimation alternates between two steps (a discrete-state sketch follows this list):
- Temporal evolution: \(Pr(\rvw_t|\rvx_{1\dots t-1}) = \int Pr(\rvw_t|\rvw_{t-1})Pr(\rvw_{t-1}|\rvx_{1\dots t-1})d\rvw_{t-1}\)
- Measurement update: \(Pr(\rvw_t|\rvx_{1\dots t}) = \frac{Pr(\rvx_t|\rvw_t)Pr(\rvw_t|\rvx_{1\dots t-1})}{\int Pr(\rvx_t|\rvw_t)Pr(\rvw_t|\rvx_{1\dots t-1})d\rvw_t} = \frac{Pr(\rvx_t, \rvw_t|\rvx_{1\dots t-1})}{Pr(\rvx_t|\rvx_{1\dots t-1})}\)
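To make this recursion concrete, the sketch below implements the two steps for a discrete state space, where the integrals reduce to sums. This is a minimal illustration: the transition matrix, likelihood values, and function names are assumptions made for the example, not part of the notes above.
```python
import numpy as np

def temporal_evolution(posterior_prev, transition):
    """Prior at time t: sum_j Pr(w_t | w_{t-1}=j) Pr(w_{t-1}=j | x_{1:t-1}).

    posterior_prev: (K,) posterior over the discrete states at time t-1
    transition:     (K, K) matrix with transition[i, j] = Pr(w_t=i | w_{t-1}=j)
    """
    return transition @ posterior_prev

def measurement_update(prior, likelihood):
    """Posterior at time t: likelihood * prior, renormalized.

    prior:      (K,) predictive distribution Pr(w_t | x_{1:t-1})
    likelihood: (K,) values Pr(x_t | w_t=i) for the observed x_t
    """
    unnormalized = likelihood * prior
    return unnormalized / unnormalized.sum()

# Toy run with three discrete states (all numbers are made up).
transition = np.array([[0.8, 0.1, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])
belief = np.full(3, 1.0 / 3.0)                    # initial belief over the state
for lik in [np.array([0.9, 0.05, 0.05]),          # Pr(x_t | w_t) for each observed x_t
            np.array([0.1, 0.7, 0.2])]:
    prior  = temporal_evolution(belief, transition)
    belief = measurement_update(prior, lik)
```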
Kalman Filter
The Kalman filter chooses the measurement and temporal models to be linear with additive Gaussian noise, so that the posterior distribution remains Gaussian at every timestep.
- Temporal evolution equation:
\[
\rvw_t = \boldsymbol{\Psi}\rvw_{t-1} + \boldsymbol{\mu}_p + \boldsymbol{\epsilon}_p
\]
where \(\boldsymbol{\Psi}\) is the state transition matrix (i.e., the motion model) and \(\boldsymbol{\epsilon}_p\) is additive Gaussian noise with covariance \(\boldsymbol{\Sigma}_p\). Thus, we have:
\[
Pr(\rvw_t|\rvw_{t-1}) = \mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_p + \boldsymbol{\Psi}\rvw_{t-1}, \boldsymbol{\Sigma}_p]
\]
- Measurement equation:
\[
\rvx_t = \boldsymbol{\Phi}\rvw_t + \boldsymbol{\mu}_m + \boldsymbol{\epsilon}_m
\]
where \(\boldsymbol{\Phi}\) is the measurement matrix (i.e., the observation model) and \(\boldsymbol{\epsilon}_m\) is additive Gaussian noise with covariance \(\boldsymbol{\Sigma}_m\). Thus, we have:
\[
Pr(\rvx_t|\rvw_t) = \mathrm{Norm}_{\rvx_t}[\boldsymbol{\mu}_m + \boldsymbol{\Phi}\rvw_t, \boldsymbol{\Sigma}_m]
\]
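As a quick sanity check of these two equations, the following sketch simulates a trajectory and its measurements from an assumed 1-D constant-velocity model (state = position and velocity, position-only measurements). All parameter values are arbitrary choices for the example.
```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed 1-D constant-velocity model: state w = [position, velocity].
Psi     = np.array([[1.0, 1.0],
                    [0.0, 1.0]])       # state transition matrix
mu_p    = np.zeros(2)                  # temporal offset
Sigma_p = 0.01 * np.eye(2)             # process noise covariance

Phi     = np.array([[1.0, 0.0]])       # measure the position only
mu_m    = np.zeros(1)                  # measurement offset
Sigma_m = np.array([[0.5]])            # measurement noise covariance

w = np.array([0.0, 1.0])               # initial state
states, measurements = [], []
for t in range(20):
    # w_t = Psi w_{t-1} + mu_p + eps_p
    w = Psi @ w + mu_p + rng.multivariate_normal(np.zeros(2), Sigma_p)
    # x_t = Phi w_t + mu_m + eps_m
    x = Phi @ w + mu_m + rng.multivariate_normal(np.zeros(1), Sigma_m)
    states.append(w)
    measurements.append(x)
```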
The temporal evolution (prediction) step is then:
\[
\begin{aligned}
\overbrace{Pr(\rvw_t|\rvx_{1\dots t-1})}^{\text{Prior}} &= \int Pr(\rvw_t|\rvw_{t-1})Pr(\rvw_{t-1}|\rvx_{1\dots t-1})d\rvw_{t-1}\\
&= \int\mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_p + \boldsymbol{\Psi}\rvw_{t-1}, \boldsymbol{\Sigma}_p]\mathrm{Norm}_{\rvw_{t-1}}[\boldsymbol{\mu}_{t-1}, \boldsymbol{\Sigma}_{t-1}]d\rvw_{t-1}\\
&= \mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_p + \boldsymbol{\Psi}\boldsymbol{\mu}_{t-1}, \boldsymbol{\Sigma}_p + \boldsymbol{\Psi}\boldsymbol{\Sigma}_{t-1}\boldsymbol{\Psi}^T]\\
&= \mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_{+}, \boldsymbol{\Sigma}_{+}]
\end{aligned}
\]
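This prediction step is a direct transcription of the last line above; here is a minimal sketch assuming NumPy arrays for the means and covariances (the function name is just for the example):
```python
import numpy as np

def kf_predict(mu_prev, Sigma_prev, Psi, mu_p, Sigma_p):
    """Temporal evolution: Norm[mu_plus, Sigma_plus] from the posterior at t-1."""
    mu_plus    = mu_p + Psi @ mu_prev
    Sigma_plus = Sigma_p + Psi @ Sigma_prev @ Psi.T
    return mu_plus, Sigma_plus
```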
The measurement update then incorporates \(\rvx_t\):
\[
\begin{aligned}
Pr(\rvw_t|\rvx_{1\dots t}) &= \frac{Pr(\rvx_t|\rvw_t)Pr(\rvw_t|\rvx_{1\dots t-1})}{Pr(\rvx_t|\rvx_{1\dots t-1})}\\
&= \frac{\mathrm{Norm}_{\rvx_t}[\boldsymbol{\mu}_m + \boldsymbol{\Phi}\rvw_t, \boldsymbol{\Sigma}_m]\mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_{+}, \boldsymbol{\Sigma}_{+}]}{Pr(\rvx_t|\rvx_{1\dots t-1})}\\
&= \mathrm{Norm}_{\rvw_t}[(\boldsymbol{\Phi}^T\boldsymbol{\Sigma}_m^{-1}\boldsymbol{\Phi} + \boldsymbol{\Sigma}_{+}^{-1})^{-1}(\boldsymbol{\Phi}^T\boldsymbol{\Sigma}_m^{-1}(\rvx_t-\boldsymbol{\mu}_m) + \boldsymbol{\Sigma}_{+}^{-1}\boldsymbol{\mu}_{+}), (\boldsymbol{\Phi}^T\boldsymbol{\Sigma}_m^{-1}\boldsymbol{\Phi} + \boldsymbol{\Sigma}_{+}^{-1})^{-1}]
\end{aligned}
\]
If we define the Kalman gain:
\[
\boldsymbol{K}_t = \boldsymbol{\Sigma}_{+}\boldsymbol{\Phi}^T(\boldsymbol{\Phi}\boldsymbol{\Sigma}_{+}\boldsymbol{\Phi}^T + \boldsymbol{\Sigma}_m)^{-1}
\]
Then we have:
\[
Pr(\rvw_t|\rvx_{1\dots t}) = \mathrm{Norm}_{\rvw_t}[\boldsymbol{\mu}_{+} + \boldsymbol{K}_t(\rvx_t - \boldsymbol{\mu}_m - \boldsymbol{\Phi}\boldsymbol{\mu}_{+}), (\boldsymbol{I} - \boldsymbol{K}_t\boldsymbol{\Phi})\boldsymbol{\Sigma}_{+}]
\]
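The corresponding measurement update, written in the Kalman-gain form above (again a sketch with assumed NumPy inputs):
```python
import numpy as np

def kf_update(mu_plus, Sigma_plus, x, Phi, mu_m, Sigma_m):
    """Measurement update in the Kalman-gain form."""
    S = Phi @ Sigma_plus @ Phi.T + Sigma_m             # innovation covariance
    K = Sigma_plus @ Phi.T @ np.linalg.inv(S)          # Kalman gain K_t
    mu_t    = mu_plus + K @ (x - mu_m - Phi @ mu_plus)
    Sigma_t = (np.eye(len(mu_plus)) - K @ Phi) @ Sigma_plus
    return mu_t, Sigma_t
```
Alternating kf_predict and kf_update over the measurement sequence gives the full filter.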
Extended Kalman Filter
Key idea: Use non-linear temporal and measurement models:
\[
\begin{aligned}
\rvw_t &= \rvf(\rvw_{t-1}) + \boldsymbol{\epsilon}_p\\
\rvx_t &= \rvg(\rvw_t) + \boldsymbol{\epsilon}_m
\end{aligned}
\]
We linearize via a first-order Taylor expansion about the current state estimates; the key step is to compute the Jacobian matrices of the non-linear functions:
\[
\begin{aligned}
\boldsymbol{\Psi} &= \frac{\partial \rvf[\rvw_{t-1}, \boldsymbol{\epsilon}_p]}{\partial \rvw_{t-1}}|_{\boldsymbol{\mu}_{t-1}, 0}\\
\Upsilon_p &= \frac{\partial \rvf[\rvw_{t-1}, \boldsymbol{\epsilon}_p]}{\partial \boldsymbol{\epsilon}_p}|_{\boldsymbol{\mu}_{t-1}, 0}\\
\boldsymbol{\Phi} &= \frac{\partial \rvg[\rvw_t, \boldsymbol{\epsilon}_m]}{\partial \rvw_t}|_{\boldsymbol{\mu}_{+}, 0}\\
\Upsilon_m &= \frac{\partial \rvg[\rvw_t, \boldsymbol{\epsilon}_m]}{\partial \boldsymbol{\epsilon}_m}|_{\boldsymbol{\mu}_{+}, 0}
\end{aligned}
\]
Then, we have:
- State prediction: \(\boldsymbol{\mu}_{+} = \rvf[\boldsymbol{\mu}_{t-1}, 0]\)
- Covariance prediction: \(\boldsymbol{\Sigma}_{+} = \boldsymbol{\Psi}\boldsymbol{\Sigma}_{t-1}\boldsymbol{\Psi}^T + \Upsilon_p\boldsymbol{\Sigma}_p\Upsilon_p^T\)
- State update: \(\boldsymbol{\mu}_t = \boldsymbol{\mu}_{+} + \boldsymbol{K}_t(\rvx_t - \rvg[\boldsymbol{\mu}_{+}, 0])\)
- Covariance update: \(\boldsymbol{\Sigma}_t = (\boldsymbol{I} - \boldsymbol{K}_t\boldsymbol{\Phi})\boldsymbol{\Sigma}_{+}\)
The Kalman gain is
\[
\boldsymbol{K}_t = \boldsymbol{\Sigma}_{+}\boldsymbol{\Phi}^T(\boldsymbol{\Phi}\boldsymbol{\Sigma}_{+}\boldsymbol{\Phi}^T + \Upsilon_m\boldsymbol{\Sigma}_m\Upsilon_m^T)^{-1}
\]
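Putting the pieces together, here is a sketch of one EKF cycle. It assumes additive noise exactly as in the equations \(\rvw_t = \rvf(\rvw_{t-1}) + \boldsymbol{\epsilon}_p\) and \(\rvx_t = \rvg(\rvw_t) + \boldsymbol{\epsilon}_m\), so that \(\Upsilon_p = \Upsilon_m = \boldsymbol{I}\); the Jacobian functions F_jac and G_jac are assumed to be supplied by the user (e.g., derived analytically or by automatic differentiation).
```python
import numpy as np

def ekf_step(mu_prev, Sigma_prev, x, f, g, F_jac, G_jac, Sigma_p, Sigma_m):
    """One EKF predict/update cycle for additive-noise models,
    w_t = f(w_{t-1}) + eps_p and x_t = g(w_t) + eps_m,
    so that Upsilon_p = Upsilon_m = I."""
    # State and covariance prediction.
    mu_plus    = f(mu_prev)
    Psi        = F_jac(mu_prev)                        # Jacobian of f at mu_{t-1}
    Sigma_plus = Psi @ Sigma_prev @ Psi.T + Sigma_p

    # Measurement update.
    Phi = G_jac(mu_plus)                               # Jacobian of g at mu_+
    S   = Phi @ Sigma_plus @ Phi.T + Sigma_m
    K   = Sigma_plus @ Phi.T @ np.linalg.inv(S)        # Kalman gain
    mu_t    = mu_plus + K @ (x - g(mu_plus))
    Sigma_t = (np.eye(len(mu_prev)) - K @ Phi) @ Sigma_plus
    return mu_t, Sigma_t
```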
Unscented Kalman Filter
Key idea: Use the unscented transform, i.e., approximate the distribution with a small set of deterministically chosen sigma points that are propagated through the non-linear functions. The posterior distribution at time \(t-1\) is approximated as:
\[
\begin{aligned}
Pr(\rvw_{t-1}|\rvx_{1\dots t-1}) &= \mathrm{Norm}_{\rvw_{t-1}}[\boldsymbol{\mu}_{t-1}, \boldsymbol{\Sigma}_{t-1}]\\
&\approx \sum_{j=0}^{2n}a_j\delta[\rvw_{t-1} - \hat{\rvw}^{[j]}]\\
\end{aligned}
\]
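A sketch of how the \(2n+1\) sigma points \(\hat{\rvw}^{[j]}\) and weights \(a_j\) can be generated for a Gaussian is given below. The specific weighting scheme (parameterized by kappa) is one common choice and is an assumption here, since the notes above do not fix it.
```python
import numpy as np

def sigma_points(mu, Sigma, kappa=1.0):
    """Return 2n+1 sigma points and weights a_j approximating Norm[mu, Sigma]
    as a weighted sum of delta functions (one common scaling; others exist)."""
    n = len(mu)
    L = np.linalg.cholesky((n + kappa) * Sigma)   # matrix square-root factor
    points  = [mu]
    weights = [kappa / (n + kappa)]
    for j in range(n):
        points.append(mu + L[:, j])
        points.append(mu - L[:, j])
        weights += [1.0 / (2 * (n + kappa))] * 2
    return np.array(points), np.array(weights)
```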
Particle Filter
Key idea: Represent the probability distribution as a set of weighted particles. The posterior distribution at time \(t-1\) is represented as:
\[
\begin{aligned}
Pr(\rvw_{t-1}|\rvx_{1\dots t-1}) &= \sum_{j=1}^J a_j\delta[\rvw_{t-1} - \hat{\rvw}_{t-1}^{[j]}]\\
\end{aligned}
\]
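Below is a sketch of one step of a basic bootstrap particle filter: resample particles according to their weights, propagate each through the temporal model, and re-weight by the measurement likelihood. The sample_temporal and likelihood callables are assumed helpers, not defined in the notes above.
```python
import numpy as np

def particle_filter_step(particles, weights, x, sample_temporal, likelihood, rng):
    """One bootstrap (resample, propagate, re-weight) particle filter step.

    particles:       (J, n) array of particles w_hat_{t-1}^[j]
    weights:         (J,)  normalized weights a_j
    sample_temporal: draws w_t ~ Pr(w_t | w_{t-1}) for an array of particles
    likelihood:      evaluates Pr(x_t | w_t) for an array of particles
    """
    J = len(particles)
    # Resample according to the current weights.
    idx = rng.choice(J, size=J, p=weights)
    particles = particles[idx]
    # Propagate each particle through the temporal model Pr(w_t | w_{t-1}).
    particles = sample_temporal(particles, rng)
    # Re-weight by the measurement likelihood Pr(x_t | w_t) and normalize.
    weights = likelihood(x, particles)
    return particles, weights / weights.sum()
```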