Skip to content

Orthogonal Procrustes Problem

Introduction

Formulation

The orthogonal Procrustes problem is to find the orthogonal matrix that maps a given set of points closest to another given set of points. The one-to-one correspondence between points must be given.

formulation

Consider a set of \(n\) points, \(\mathcal{B}=\{\mathbf{b}_1, …, \mathbf{b}_n\}\), where \(\mathbf{b}_i \in \mathbb{R}^d\). (Here \(d=2\)). We can map it to point set \(\mathcal{A}=\{\mathbf{a}_1, …, \mathbf{a}_n\}\) with a rotation and adding small gaussian noise on each point.

The goal of orthogonal Procrustes problem is to find the orthgonal matrix \(R\):

\[ \DeclareMathOperator*{\argmin}{arg,min} R = \argmin_{\Omega : \text{ }\Omega ^T \Omega =I} \sum_{i=1}^n || \Omega \mathbf{a}_i - \mathbf{b}_i ||^2. \]

Note here the one-to-one correspondences between the points in \(\mathcal{A}\) with those in \(\mathcal{B}\) are implicitly specified via the indexing of the points.

Analytical solution

By stacking points in \(\mathcal{A}\) and \(\mathcal{B}\) into matrices, \(A := [\mathbf{a}_1 \cdots \mathbf{a}_n]\) and \(B := [\mathbf{b}_1 \cdots \mathbf{b}_n]\), we can rewrite the objective function:

\[ \DeclareMathOperator*{\argmin}{arg,min} R = \argmin_{\Omega :\text{ } \Omega ^T \Omega =I} || \Omega A - B ||_F^2. \]

where \(||\cdot ||_F\) is Frobenius norm that is defined as: \(||Y||_F ^2:=\sum_i \sum_j y_{ij}^2\). By definitioon of the Frobenius norm, we can expand the equation:

\[ \begin{align} R &= \argmin_{\Omega : \text{ } \Omega ^T\Omega =I} \sum_i \sum_j [(\Omega A - B)_{ij} ]^2 \\\ &= \argmin_{\Omega : \text{ } \Omega ^T \Omega =I} \sum_i \sum_j [(\Omega A)_{ij} - B_{ij} ]^2 \\\ &= \argmin_{\Omega : \text{ } \Omega ^T\Omega =I} \sum_i \sum_j [(\Omega A)_{ij}^2 + B_{ij}^2 -2 (\Omega A)_{ij} B_{ij}] \end{align} \]

The equation can be further rewritten as:

\[ \DeclareMathOperator*{\argmin}{arg,min} R = \argmin_{\Omega : \text{ } \Omega ^T\Omega =I} { ||\Omega A||_F^2 + ||B||_F^2 + \sum_i \sum_j -2 (\Omega A)_{ij} B_{ij}} \]

Since \(\Omega\) is an orthogonal matrix, we have:

\[ ||\Omega A||_F^2=\text{tr}[(\Omega A)^T(\Omega A)]=\text{tr}(A^T \Omega^T \Omega A) = \text{tr}(A^TA) = ||A||_F^2 \]

Thus, the first two terms do not depend on \(\Omega\), hence:

\[ \DeclareMathOperator*{\argmax}{arg,max} \begin{align} R &= \argmax_{\Omega : \text{ } \Omega ^T\Omega =I} \sum_i \sum_j (\Omega A)_{ij} B_{ij}\\ &=\argmax_{\Omega : \text{ } \Omega^T\Omega=I}\hspace{2pt} \text{tr}(B^T\Omega A) \end{align} \]

As the trace of a product of matrices is invariant to cyclic permutations of the matrix product, we have:

\[ \DeclareMathOperator*{\argmin}{arg,min} \DeclareMathOperator*{\argmax}{arg,max} R = \argmax_{\Omega : \text{ } \Omega^T\Omega=I} \text{tr}(\Omega A B^T) \]

By using SVD: \(AB^T=U\Sigma V^T\), we have:

\[ \DeclareMathOperator*{\argmin}{arg,min} \DeclareMathOperator*{\argmax}{arg,max} \begin{align} R &= \argmax_{\Omega : \text{ }\Omega^T\Omega=I} { \hspace{2pt} \text{tr}(\Omega U\Sigma V^T)}\\ &= \argmax_{\Omega : \text{ }\Omega^T\Omega=I} \hspace{2pt} \text{tr}(V^T \Omega U \Sigma)\\ &= \argmax_{\Omega : \text{ }\Omega^T\Omega=I} \hspace{2pt} \text{tr}(Z \Sigma) \quad\text{where}\hspace{2pt} Z = V^T \Omega U\\ &= \argmax_{\Omega : \text{ }\Omega^T\Omega=I} \hspace{2pt} \text{tr}[\sigma_1\mathbf{z}_1 \dots\sigma_d\mathbf{z}_d] \quad\text{where}\hspace{2pt}\sigma_i\hspace{2pt} \text{is the singular value of}\hspace{2pt} AB^T \end{align} \]

Thus, we need \(Z\) to become an identity matrix to maximize our objective function:

\[ Z=V^T\Omega U=I \rightarrow \Omega = V U^T \]

Summary

  1. Compute SVD: \(AB^T=U\Sigma V^T\)
  2. Set \(R=VU^T\)
  3. Transform each point \(\mathbf{a}_i\in\mathcal{A}\) via a matrix multiplication \(R\mathbf{a}_i\)

Reference

https://simonensemble.github.io/posts/2018-10-27-orthogonal-procrustes/

https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/147853/eth-26937-01.pdf