Orthogonal Procrustes Problem
Introduction
The orthogonal Procrustes problem is to find the orthogonal matrix that maps a given set of points closest to another given set of points. The one-to-one correspondence between points must be given.
Consider a set of \(n\) points, \(\mathcal{B}=\{\mathbf{b}_1, …, \mathbf{b}_n\}\), where \(\mathbf{b}_i \in \mathbb{R}^d\). (Here \(d=2\)). We can map it to point set \(\mathcal{A}=\{\mathbf{a}_1, …, \mathbf{a}_n\}\) with a rotation and adding small gaussian noise on each point.
The goal of orthogonal Procrustes problem is to find the orthgonal matrix \(R\):
\[
\DeclareMathOperator*{\argmin}{arg,min}
R = \argmin_{\Omega : \text{ }\Omega ^T \Omega =I} \sum_{i=1}^n || \Omega \mathbf{a}_i - \mathbf{b}_i ||^2.
\]
Note here the one-to-one correspondences between the points in \(\mathcal{A}\) with those in \(\mathcal{B}\) are implicitly specified via the indexing of the points.
Analytical solution
By stacking points in \(\mathcal{A}\) and \(\mathcal{B}\) into matrices, \(A := [\mathbf{a}_1 \cdots \mathbf{a}_n]\) and \(B := [\mathbf{b}_1 \cdots \mathbf{b}_n]\), we can rewrite the objective function:
\[
\DeclareMathOperator*{\argmin}{arg,min}
R = \argmin_{\Omega :\text{ } \Omega ^T \Omega =I} || \Omega A - B ||_F^2.
\]
where \(||\cdot ||_F\) is Frobenius norm that is defined as: \(||Y||_F ^2:=\sum_i \sum_j y_{ij}^2\). By definitioon of the Frobenius norm, we can expand the equation:
\[
\begin{align}
R &= \argmin_{\Omega : \text{ } \Omega ^T\Omega =I} \sum_i \sum_j [(\Omega A - B)_{ij} ]^2 \\\
&= \argmin_{\Omega : \text{ } \Omega ^T \Omega =I} \sum_i \sum_j [(\Omega A)_{ij} - B_{ij} ]^2 \\\
&= \argmin_{\Omega : \text{ } \Omega ^T\Omega =I} \sum_i \sum_j [(\Omega A)_{ij}^2 + B_{ij}^2 -2 (\Omega A)_{ij} B_{ij}]
\end{align}
\]
The equation can be further rewritten as:
\[
\DeclareMathOperator*{\argmin}{arg,min}
R = \argmin_{\Omega : \text{ } \Omega ^T\Omega =I} { ||\Omega A||_F^2 + ||B||_F^2 + \sum_i \sum_j -2 (\Omega A)_{ij} B_{ij}}
\]
Since \(\Omega\) is an orthogonal matrix, we have:
\[
||\Omega A||_F^2=\text{tr}[(\Omega A)^T(\Omega A)]=\text{tr}(A^T \Omega^T \Omega A) = \text{tr}(A^TA) = ||A||_F^2
\]
Thus, the first two terms do not depend on \(\Omega\), hence:
\[
\DeclareMathOperator*{\argmax}{arg,max}
\begin{align}
R &= \argmax_{\Omega : \text{ } \Omega ^T\Omega =I} \sum_i \sum_j (\Omega A)_{ij} B_{ij}\\
&=\argmax_{\Omega : \text{ } \Omega^T\Omega=I}\hspace{2pt} \text{tr}(B^T\Omega A)
\end{align}
\]
As the trace of a product of matrices is invariant to cyclic permutations of the matrix product, we have:
\[
\DeclareMathOperator*{\argmin}{arg,min}
\DeclareMathOperator*{\argmax}{arg,max}
R = \argmax_{\Omega : \text{ } \Omega^T\Omega=I} \text{tr}(\Omega A B^T)
\]
By using SVD: \(AB^T=U\Sigma V^T\), we have:
\[
\DeclareMathOperator*{\argmin}{arg,min}
\DeclareMathOperator*{\argmax}{arg,max}
\begin{align}
R &= \argmax_{\Omega : \text{ }\Omega^T\Omega=I} { \hspace{2pt} \text{tr}(\Omega U\Sigma V^T)}\\
&= \argmax_{\Omega : \text{ }\Omega^T\Omega=I} \hspace{2pt} \text{tr}(V^T \Omega U \Sigma)\\
&= \argmax_{\Omega : \text{ }\Omega^T\Omega=I} \hspace{2pt} \text{tr}(Z \Sigma) \quad\text{where}\hspace{2pt} Z = V^T \Omega U\\
&= \argmax_{\Omega : \text{ }\Omega^T\Omega=I} \hspace{2pt} \text{tr}[\sigma_1\mathbf{z}_1 \dots\sigma_d\mathbf{z}_d] \quad\text{where}\hspace{2pt}\sigma_i\hspace{2pt} \text{is the singular value of}\hspace{2pt} AB^T
\end{align}
\]
Thus, we need \(Z\) to become an identity matrix to maximize our objective function:
\[
Z=V^T\Omega U=I \rightarrow \Omega = V U^T
\]
Summary
- Compute SVD: \(AB^T=U\Sigma V^T\)
- Set \(R=VU^T\)
- Transform each point \(\mathbf{a}_i\in\mathcal{A}\) via a matrix multiplication \(R\mathbf{a}_i\)
Reference
https://simonensemble.github.io/posts/2018-10-27-orthogonal-procrustes/
https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/147853/eth-26937-01.pdf