3D Gaussian Splatting

Modeling geometry with 3D Gaussians

The general form of a 3D Gaussian is

\[ f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^3 \det(\Sigma)}} \exp\left(-\frac{1}{2}(\mathbf{x} - \mathbf{\mu})^\top \Sigma^{-1} (\mathbf{x} - \mathbf{\mu})\right) \]

where \(\mathbf{x}=[a, b, c]^\top\) is the 3D coordinate, \(\mathbf{\mu}\) is the mean, and the covariance matrix \(\Sigma\) is:

\[ \Sigma = \begin{bmatrix} \sigma_a^2 & \text{cov}(a, b) & \text{cov}(a, c) \\ \text{cov}(a, b) & \sigma_b^2 & \text{cov}(b, c) \\ \text{cov}(a, c) & \text{cov}(b, c) & \sigma_c^2 \end{bmatrix} \]

The diagonal elements are the variances, which describe the spread of each variable along its own axis. The off-diagonal elements are the covariances, which describe how the variables are correlated, that is, how changes in one variable relate to changes in another.
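To make the density formula concrete, here is a minimal pure-Python sketch that evaluates \(f(\mathbf{x})\) for a given mean and covariance. The helper names (`det3`, `inv3`, `gaussian_pdf`) are hypothetical and chosen only for illustration; the 3x3 inverse is computed from cofactors, which is fine for a small example but not what a production renderer would do.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (transposed cofactors)."""
    d = det3(m)

    def cofactor(i, j):
        rows = [k for k in range(3) if k != i]
        cols = [k for k in range(3) if k != j]
        minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                 - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
        return (-1) ** (i + j) * minor

    # inverse[i][j] = cofactor(j, i) / det
    return [[cofactor(j, i) / d for j in range(3)] for i in range(3)]

def gaussian_pdf(x, mu, sigma):
    """Evaluate the 3D Gaussian density f(x) with mean mu and covariance sigma."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    sigma_inv = inv3(sigma)
    # Squared Mahalanobis distance: (x - mu)^T Sigma^{-1} (x - mu)
    maha = sum(diff[i] * sigma_inv[i][j] * diff[j]
               for i in range(3) for j in range(3))
    norm = math.sqrt((2 * math.pi) ** 3 * det3(sigma))
    return math.exp(-0.5 * maha) / norm

# Example: a Gaussian stretched along the first axis, evaluated off-center.
sigma = [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
value = gaussian_pdf([2.0, 0.0, 0.0], [0.0, 0.0, 0.0], sigma)
```

With an identity covariance the density at the mean reduces to \((2\pi)^{-3/2}\), which is a quick sanity check for the normalization term.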

3D Gaussian splatting centers each Gaussian at its point's mean \(\mu\) and drops the normalization constant:

\[ G(x) = e^{-\frac{1}{2}(x)^\top\Sigma^{-1}(x)} \]

That is, each 3D Gaussian is defined in the local coordinate frame of its point, with the origin at the point, so \(x\) here is the offset from \(\mu\).

Initialization of the Covariance Matrix

A covariance matrix has physical meaning only if it is positive semi-definite. To guarantee this by construction, the covariance matrix is factored into a diagonal scaling matrix \(S\) and a rotation matrix \(R\):

\[ \Sigma = R S S^\top R^\top \]

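This representation is related to the eigendecomposition: the columns of \(R\) are the eigenvectors of \(\Sigma\), and the squared diagonal entries of \(S\) are its eigenvalues. 3D Gaussian splatting stores a 3D vector \(s\) for the scaling matrix and a quaternion \(q\) for the rotation.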
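As a minimal sketch of this parameterization, the snippet below builds \(\Sigma\) from a scale vector \(s\) and a quaternion \(q\) in pure Python. The function names `quat_to_rotmat` and `build_covariance` and the \((w, x, y, z)\) quaternion ordering are assumptions made for illustration. The quaternion is normalized first so that \(R\) is a valid rotation for any nonzero input.

```python
import math

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) into a 3x3 rotation matrix.

    The (w, x, y, z) ordering is an assumption of this sketch.
    """
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize to unit length
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def build_covariance(s, q):
    """Return Sigma = R S S^T R^T, where S = diag(s) and R comes from q."""
    R = quat_to_rotmat(q)
    # M = R S: scale column j of R by s[j]
    M = [[R[i][j] * s[j] for j in range(3)] for i in range(3)]
    # Sigma = M M^T, which is symmetric positive semi-definite by construction
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Example: scales (1, 2, 3) rotated by 90 degrees about the z-axis.
half = math.pi / 4
sigma = build_covariance([1.0, 2.0, 3.0], [math.cos(half), 0.0, 0.0, math.sin(half)])
```

In this example the rotation swaps the first two axes, so the variances 1 and 4 trade places on the diagonal while the variance 9 along z is unchanged.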

Forward mapping pipeline

Volume rendering approaches can be divided into two families: 1) backward mapping, which shoots rays through the pixels of the image plane into the volume data, and 2) forward mapping, which maps the data onto the image plane. 3D Gaussian splatting follows the forward mapping paradigm, as illustrated below:

[Figure: forward mapping pipeline]

Projective transformation and local affine approximation

The projective transformation converts camera coordinates to ray coordinates. Given a point \(\mathbf{u}\) in camera space, its ray-space coordinates are \(\mathbf{x} = m(\mathbf{u})\):

\[ \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} = m(\mathbf{u}) = \begin{bmatrix} u_0/u_2 \\ u_1/u_2 \\ \|(u_0, u_1, u_2)^\top\| \end{bmatrix} \]

Because this transformation is not affine, a Gaussian mapped through it is in general no longer a Gaussian, so we cannot apply it to the Gaussian directly. A local affine approximation solves this problem. Given a Gaussian centered at \(\mathbf{u}_k\) in camera space and its corresponding point \(\mathbf{x}_k = m(\mathbf{u}_k)\) in ray space, we approximate \(m\) by \(m_{\mathbf{u}_k}\), the first-order Taylor expansion of \(m\) at \(\mathbf{u}_k\):

\[ m_{\mathbf{u}_k} (\mathbf{u}) = \mathbf{x}_k + J_{\mathbf{u}_k}\cdot (\mathbf{u}-\mathbf{u}_k) \]

The Jacobian \(J\) of \(m\), which is evaluated at \(\mathbf{u} = \mathbf{u}_k\) to obtain \(J_{\mathbf{u}_k}\), is

\[ J = \begin{bmatrix} \frac{\partial x_0}{\partial u_0} & \frac{\partial x_0}{\partial u_1} & \frac{\partial x_0}{\partial u_2} \\ \frac{\partial x_1}{\partial u_0} & \frac{\partial x_1}{\partial u_1} & \frac{\partial x_1}{\partial u_2} \\ \frac{\partial x_2}{\partial u_0} & \frac{\partial x_2}{\partial u_1} & \frac{\partial x_2}{\partial u_2} \end{bmatrix}= \begin{bmatrix} \frac{1}{u_2} & 0 & -\frac{u_0}{u_2^2} \\ 0 & \frac{1}{u_2} & -\frac{u_1}{u_2^2} \\ \frac{u_0}{\|\mathbf{u}\|} & \frac{u_1}{\|\mathbf{u}\|} & \frac{u_2}{\|\mathbf{u}\|} \end{bmatrix} \]
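A quick way to sanity-check the partial derivatives of \(m\) is to compare them against central finite differences. The sketch below does this in pure Python; the function names (`project`, `jacobian`, `numerical_jacobian`) and the step size `eps` are illustrative assumptions. Since \(x_2 = \|\mathbf{u}\|\), the third row of the analytic Jacobian is simply the gradient of the norm, \(u_i / \|\mathbf{u}\|\).

```python
import math

def project(u):
    """Map a camera-space point u to ray space: x = m(u)."""
    u0, u1, u2 = u
    return [u0 / u2, u1 / u2, math.sqrt(u0 * u0 + u1 * u1 + u2 * u2)]

def jacobian(u):
    """Analytic Jacobian of m at u (rows: x0, x1, x2; columns: u0, u1, u2)."""
    u0, u1, u2 = u
    norm = math.sqrt(u0 * u0 + u1 * u1 + u2 * u2)
    return [
        [1.0 / u2, 0.0, -u0 / u2 ** 2],
        [0.0, 1.0 / u2, -u1 / u2 ** 2],
        [u0 / norm, u1 / norm, u2 / norm],
    ]

def numerical_jacobian(u, eps=1e-6):
    """Central finite-difference approximation of the Jacobian of m at u."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(u), list(u)
        plus[j] += eps
        minus[j] -= eps
        x_plus, x_minus = project(plus), project(minus)
        for i in range(3):
            J[i][j] = (x_plus[i] - x_minus[i]) / (2 * eps)
    return J

# Example: compare both Jacobians at an arbitrary point in front of the camera.
u_k = [0.5, -0.3, 2.0]
max_error = max(abs(a - b)
                for row_a, row_b in zip(jacobian(u_k), numerical_jacobian(u_k))
                for a, b in zip(row_a, row_b))
```

If the analytic entries are right, `max_error` should be many orders of magnitude smaller than the entries themselves; a large discrepancy in a single entry points directly at the partial derivative that was derived incorrectly.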

Projection of the Covariance Matrix

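Given the Jacobian \(J\) of the local affine approximation of the projective transformation and the viewing transformation \(W\) (only its linear part matters here, since a translation does not change a covariance), the projected covariance matrix \(\Sigma^\prime\) in ray space is: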

\[ \Sigma^\prime = J W \Sigma W^\top J^\top \]
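The projection can be sketched in a few lines of pure Python. The helper names (`matmul`, `transpose`, `jacobian`, `project_covariance`) are hypothetical, and the sketch assumes that \(W\) is passed as a 3x3 matrix and that the Gaussian center \(\mathbf{u}_k\) is already expressed in camera space.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*A)]

def jacobian(u):
    """Jacobian of the camera-to-ray-space mapping at the camera-space point u."""
    u0, u1, u2 = u
    norm = math.sqrt(u0 * u0 + u1 * u1 + u2 * u2)
    return [
        [1.0 / u2, 0.0, -u0 / u2 ** 2],
        [0.0, 1.0 / u2, -u1 / u2 ** 2],
        [u0 / norm, u1 / norm, u2 / norm],
    ]

def project_covariance(sigma, W, u_k):
    """Return Sigma' = J W Sigma W^T J^T with J evaluated at u_k."""
    T = matmul(jacobian(u_k), W)  # combined linear map J W
    return matmul(matmul(T, sigma), transpose(T))

# Example: identity viewing transform, Gaussian centered on the optical axis at depth 2.
sigma = [[1.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 9.0]]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sigma_proj = project_covariance(sigma, W, [0.0, 0.0, 2.0])
```

In this example \(J\) reduces to \(\mathrm{diag}(1/2, 1/2, 1)\), so the two image-plane variances shrink by a factor of four while the variance along the ray is unchanged. For rasterization, only the upper-left 2x2 block of \(\Sigma^\prime\) is needed to describe the 2D footprint on the image plane.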