Two-View Geometry

Computer Vision CMP-6035B

Dr. David Greenwood

Spring 2022


  • Camera Pair
  • Coplanarity Constraint
  • Fundamental Matrix
  • Essential Matrix

Camera Pair

Two cameras capturing images of the same scene.

A stereo camera. Intel D435

  • A stereo camera.
  • Two cameras, each with a different position.
  • One camera that moves.

A camera pair is any two camera configurations from which images of the same scene have been taken.


The orientation of the camera pair can be described using independent orientations for each camera.

How many parameters are needed?


  • Calibrated cameras require 12 parameters.
  • Uncalibrated cameras require 22 parameters.

Camera Motion

Can we estimate the camera motion without knowing the scene?

Which parameters can be obtained from these images?

  • and which cannot?

Cameras Measure Direction

We can’t obtain the global translation, the global rotation, or the scale.

Two views

We can obtain:

  • 3 rotation parameters of the second camera w.r.t. the first camera.
  • 2 direction parameters of the line \(B\), connecting the two centres.
  • But, we can’t estimate the length of \(B\).
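The scale ambiguity can be checked numerically. The following is a minimal sketch, assuming NumPy and the projection model \(P = KR[\textbf{I}_{3}| - X_{O}]\); the helper `project`, the calibration values, and the scene points are illustrative assumptions. Scaling the scene and both projection centres by the same factor leaves every image coordinate unchanged, which is why the length of \(B\) cannot be recovered from the images.

```python
import numpy as np

def project(K, R, X_O, X):
    """Project N x 3 world points with P = K R [I | -X_O]; returns N x 2 image points."""
    x = (K @ R @ (X - X_O).T).T          # homogeneous image coordinates
    return x[:, :2] / x[:, 2:3]          # Euclidean image coordinates

# Illustrative (assumed) calibration, poses, and scene.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
X_O1 = np.array([0.0, 0.0, 0.0])         # first projection centre
X_O2 = np.array([1.0, 0.0, 0.0])         # second projection centre
rng = np.random.default_rng(0)
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(10, 3))

s = 3.0                                  # scale the scene and the baseline together
same_view_1 = np.allclose(project(K, R, X_O1, X), project(K, R, s * X_O1, s * X))
same_view_2 = np.allclose(project(K, R, X_O2, X), project(K, R, s * X_O2, s * X))
print(same_view_1, same_view_2)
```

Both flags should be true for any non-zero scale factor, because the factor cancels when the homogeneous coordinates are normalised.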

Calibrated Cameras

  • We need \(2 \times 6 = 12\) parameters for two calibrated cameras for their pose.
  • Without additional information we can only obtain \(12 - 7 = 5\) parameters.
  • The 7 parameters we cannot obtain are 3 rotation, 3 translation, and 1 scale.

Photogrammetric Model

Given two camera images, we can reconstruct an object up to a similarity transform.

The orientation of the photogrammetric model is called the absolute orientation.

  • To obtain the absolute orientation we need at least 3 points in 3D.

Uncalibrated Cameras

For uncalibrated cameras, we can only obtain \(22-15=7\) parameters given two images.

We need at least 5 points in 3D to obtain the absolute orientation.

Camera          image   pair   RO   AO   3D
Calibrated        6      12     5    7    3
Uncalibrated     11      22     7   15    5

  • image : parameters per image
  • pair : parameters per image pair
  • RO : relative orientation
  • AO : absolute orientation
  • 3D : minimum number of control points in 3D

By simply moving the camera in the scene we can obtain a relative orientation.

“Agarwal, Sameer, et al. Building Rome in a Day. 2011”

Rome in a day

Coplanarity Constraint

Leading to the Fundamental Matrix.

Which parameters can we compute without any knowledge of the scene?

Two cameras observe one point.

The perfect intersection of two rays.

Two rays lie on a plane.

The baseline vector.

Coplanarity can be expressed in the following way:

\[ [O^{'}X, O^{'}O^{''}, O^{''}X] = 0 \]


Aside: Scalar Triple Product

Dot product of one vector with the cross product of the other two.

\[ [A, B, C] = (A \times B) \cdot C \]

  • It is the volume of the parallelepiped formed by the three vectors.
  • \([A, B, C] = 0\) if all the vectors are in a plane.
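As a small numerical illustration of the scalar triple product, here is a sketch assuming NumPy; the function name `triple_product` and the example vectors are chosen only for illustration.

```python
import numpy as np

def triple_product(a, b, c):
    """Scalar triple product [a, b, c] = (a x b) . c."""
    return float(np.dot(np.cross(a, b), c))

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
c = np.array([0.0, 0.0, 1.0])

# Three orthogonal unit vectors span a unit cube.
volume = triple_product(a, b, c)

# A vector built from a and b lies in their plane, so the volume collapses.
in_plane = 2.0 * a - 3.0 * b
flat = triple_product(a, b, in_plane)
print(volume, flat)
```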


\[ [O^{'}X, O^{'}O^{''}, O^{''}X] = 0 \]


The directions of the vectors \(O^{'}X\) and \(O^{''}X\) can be derived from the image coordinates \(x', x''\):

\[ x' = P'X \quad \quad x'' = P''X \]

with the projection matrices:

\[ P'=K'R'[\textbf{I}_{3}| - X_{O'}] \quad \quad P''=K''R''[\textbf{I}_{3}| - X_{O''}] \]

The normalised direction of the vector \(O^{'}X\) is:

\[ {}^{n}x^{'} = (R')^{-1}(K')^{-1} x' \]

as the normalised projection:

\[ {}^{n}x^{'} = [\textbf{I}_{3}| - X_{O'}]X \]

This gives the direction from the centre of projection to the point in 3D.

Analogously, we can do the same thing for both cameras:

\[ {}^{n}x^{'} = (R')^{-1}(K')^{-1} x' \quad \quad {}^{n}x^{''} = (R'')^{-1}(K'')^{-1} x'' \]

The baseline vector \(O^{'}O^{''}\) is obtained from the coordinates of the projection centres:

\[ \textbf{b} = X_{O^{''}} - X_{O^{'}} \]

Coplanarity Constraint


\[ [O^{'}X, O^{'}O^{''}, O^{''}X] = 0 \]

can be expressed as:

\[ \begin{aligned} \begin{bmatrix}{}^{n}x^{'}, \textbf{b}, {}^{n}x^{''} \end{bmatrix} &= 0 \\ {}^{n}x^{'} \cdot (\textbf{b} \times {}^{n}x^{''}) &= 0 \\ {}^{n}x^{'T} S_{b} {}^{n}x^{''} &= 0 \end{aligned} \]
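The constraint can be tried out on a synthetic camera pair. This is a sketch under stated assumptions: NumPy, made-up calibration matrices and poses, and a hypothetical helper `rot_y`. It projects one 3D point into both cameras, recovers the two normalised directions, and evaluates the triple product with the baseline. Moving one image point away from its true position should break coplanarity.

```python
import numpy as np

def rot_y(angle):
    """Rotation matrix about the y axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Assumed calibration and poses for the two cameras.
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 750.0, 250.0], [0.0, 0.0, 1.0]])
R1, R2 = np.eye(3), rot_y(0.1)
X_O1 = np.array([0.0, 0.0, 0.0])
X_O2 = np.array([1.0, 0.0, 0.0])
X = np.array([0.3, -0.2, 5.0])             # one scene point

# Homogeneous image points x = K R (X - X_O).
x1 = K1 @ R1 @ (X - X_O1)
x2 = K2 @ R2 @ (X - X_O2)

# Normalised directions n = R^{-1} K^{-1} x, using R^{-1} = R^T.
n1 = R1.T @ np.linalg.inv(K1) @ x1
n2 = R2.T @ np.linalg.inv(K2) @ x2
b = X_O2 - X_O1                            # baseline vector

residual = float(np.dot(n1, np.cross(b, n2)))

# Shift the second image point by 20 pixels in y: the rays no longer meet.
x2_bad = x2 + np.array([0.0, 20.0 * x2[2], 0.0])
n2_bad = R2.T @ np.linalg.inv(K2) @ x2_bad
residual_bad = float(np.dot(n1, np.cross(b, n2_bad)))
print(residual, residual_bad)
```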

Skew Symmetric Matrix

How does this work?

\[ \begin{aligned} {}^{n}x^{'} \cdot (\textbf{b} \times {}^{n}x^{''}) &= 0 \\ {}^{n}x^{'T} S_{b} {}^{n}x^{''} &= 0 \end{aligned} \]

Write the cross product as a skew symmetric matrix \(S_b\):

\[ \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \times \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} - b_3 x_2 & + & b_2 x_3 \\ b_3 x_1 & - & b_1 x_3 \\ - b_2 x_1 & + & b_1 x_2 \end{bmatrix} = \underbrace{\begin{bmatrix} 0 & -b_3 & b_2 \\ b_3 & 0 & -b_1 \\ -b_2 & b_1 & 0 \end{bmatrix}}_{S_b} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \]
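The identity is easy to confirm numerically. A minimal sketch, assuming NumPy; the helper name `skew` and the test vectors are illustrative.

```python
import numpy as np

def skew(b):
    """Skew symmetric matrix S_b such that S_b @ x equals np.cross(b, x)."""
    return np.array([[0.0, -b[2], b[1]],
                     [b[2], 0.0, -b[0]],
                     [-b[1], b[0], 0.0]])

b = np.array([1.0, 2.0, 3.0])
x = np.array([-4.0, 0.5, 2.0])
S_b = skew(b)

matches_cross = np.allclose(S_b @ x, np.cross(b, x))   # same result as the cross product
is_skew = np.allclose(S_b.T, -S_b)                     # S_b^T = -S_b
print(matches_cross, is_skew)
```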

We can continue to work with the coplanarity constraint, to build the fundamental matrix.

Substituting the normalised directions \({}^{n}x^{'} = (R')^{-1}(K')^{-1} x'\) and \({}^{n}x^{''} = (R'')^{-1}(K'')^{-1} x''\) into \({}^{n}x^{'T} S_{b} {}^{n}x^{''} = 0\), we obtain:

\[ x'^{T}\underbrace{(K')^{-T}(R')^{-T}S_{b}(R'')^{-1}(K'')^{-1}}_{F}x'' = 0 \]

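\[ \begin{aligned} F &= (K')^{-T}(R')^{-T}S_{b}(R'')^{-1}(K'')^{-1} \\ &= (K')^{-T}(R') S_{b} (R'')^{T}(K'')^{-1} \end{aligned} \]

The second line uses the orthogonality of rotation matrices, \(R^{-1} = R^{T}\), so \((R')^{-T} = R'\) and \((R'')^{-1} = (R'')^{T}\).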

The matrix \(F\) is the fundamental matrix.

\[ F = (K')^{-T}(R') S_{b} (R'')^{T}(K'')^{-1} \]

  • it allows us to express the coplanarity constraint as:

\[ x'^{T} Fx'' = 0 \]

The fundamental matrix holds the parameters we can estimate to describe the relative orientation of two cameras looking at the same point.

The fundamental matrix fulfils the equation:

\[ x'^{T} Fx'' = 0 \]

for corresponding points in two images.

  • The fundamental matrix contains all the information about the relative orientation of two images from uncalibrated cameras.
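To see the definition at work, the sketch below builds \(F = (K')^{-T}(R') S_{b} (R'')^{T}(K'')^{-1}\) for a synthetic pair with two different calibration matrices and evaluates \(x'^{T} Fx''\) for projected scene points. It assumes NumPy; the helpers `skew`, `rot_y`, and `fundamental` and all numeric values are illustrative assumptions.

```python
import numpy as np

def skew(b):
    """Skew symmetric matrix S_b such that S_b @ x equals np.cross(b, x)."""
    return np.array([[0.0, -b[2], b[1]], [b[2], 0.0, -b[0]], [-b[1], b[0], 0.0]])

def rot_y(angle):
    """Rotation matrix about the y axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def fundamental(K1, R1, X_O1, K2, R2, X_O2):
    """F = K1^{-T} R1 S_b R2^T K2^{-1} with b = X_O2 - X_O1."""
    S_b = skew(X_O2 - X_O1)
    return np.linalg.inv(K1).T @ R1 @ S_b @ R2.T @ np.linalg.inv(K2)

# Assumed calibration and poses.
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 750.0, 250.0], [0.0, 0.0, 1.0]])
R1, R2 = np.eye(3), rot_y(0.1)
X_O1 = np.array([0.0, 0.0, 0.0])
X_O2 = np.array([1.0, 0.2, 0.0])
F = fundamental(K1, R1, X_O1, K2, R2, X_O2)

# Project random scene points into both images (homogeneous, last entry 1).
rng = np.random.default_rng(1)
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(20, 3))
x1 = (K1 @ R1 @ (X - X_O1).T).T
x2 = (K2 @ R2 @ (X - X_O2).T).T
x1 /= x1[:, 2:3]
x2 /= x2[:, 2:3]

# x1[n]^T F x2[n] for every correspondence n.
residuals = np.einsum("ni,ij,nj->n", x1, F, x2)
print(np.max(np.abs(residuals)))
```

The residuals should vanish up to floating point error for every correspondence, without using the 3D points in the test itself.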

NOTE: we have defined the fundamental matrix for the relative orientation from camera one to camera two.

  • In the literature, you will also find \(F\) defined for the relative orientation from camera two to camera one.

  • This transposition must be accounted for when comparing expressions.

Calibrated Cameras

Most photogrammetric systems rely on calibrated cameras.

  • Calibrated cameras simplify the orientation problem.
  • Often, both cameras have the same calibration matrix.

For calibrated cameras the coplanarity constraint can be simplified.

  • From the calibration matrices we obtain the directions as:

\[ {}^{k}x^{'} = (K')^{-1}x' \quad {}^{k}x^{''} = (K'')^{-1}x'' \]


From the fundamental matrix:

\[ \begin{aligned} x'^{T} Fx'' &= 0 \\[10pt] x'^{T}\underbrace{(K')^{-T}(R')^{-T}S_{b}(R'')^{-1}(K'')^{-1}}_{F}x'' &= 0 \\[10pt] \underbrace{x'^{T}(K')^{-T}}_{{}^{k}x^{'T}} (R')^{-T}S_{b}(R'')^{-1} \underbrace{(K'')^{-1}x''}_{{}^{k}x^{''}} &= 0 \\[10pt] {}^{k}x^{'T} \underbrace{R'S_b R^{''T}}_{E} {}^{k}x^{''} &= 0 \end{aligned} \]

From \(F\) to the essential matrix \(E\):

\[ \begin{aligned} {}^{k}x^{'T} \underbrace{R'S_b R^{''T}}_{E} {}^{k}x^{''} &= 0 \\ {}^{k}x^{'T} E {}^{k}x^{''} &= 0 \end{aligned} \]

The essential matrix is the special form of the fundamental matrix for calibrated cameras:

\[ E = R'S_b R^{''T} \]

For calibrated cameras, the coplanarity constraint is:

\[ {}^{k}x^{'T} E {}^{k}x^{''} = 0 \]

  • The essential matrix has five degrees of freedom.
  • The essential matrix is homogeneous and singular.
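These properties can be inspected on a synthetic example. The sketch below assumes NumPy and made-up poses; `skew` and `rot_y` are illustrative helpers. It builds \(E = R'S_b R^{''T}\), evaluates the constraint on calibrated directions, and looks at the singular values. Because \(R'\) and \(R''\) are orthogonal, \(E\) has the same singular values as \(S_b\): two equal to the baseline length and one equal to zero, so \(E\) is singular.

```python
import numpy as np

def skew(b):
    """Skew symmetric matrix S_b such that S_b @ x equals np.cross(b, x)."""
    return np.array([[0.0, -b[2], b[1]], [b[2], 0.0, -b[0]], [-b[1], b[0], 0.0]])

def rot_y(angle):
    """Rotation matrix about the y axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Assumed poses of the two calibrated cameras.
R1, R2 = np.eye(3), rot_y(0.3)
X_O1 = np.zeros(3)
X_O2 = np.array([2.0, 0.5, -0.3])
b = X_O2 - X_O1
E = R1 @ skew(b) @ R2.T

# Calibrated directions kx = K^{-1} x = R (X - X_O), up to scale.
rng = np.random.default_rng(3)
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(20, 3))
kx1 = (R1 @ (X - X_O1).T).T
kx2 = (R2 @ (X - X_O2).T).T

constraint_residuals = np.einsum("ni,ij,nj->n", kx1, E, kx2)
singular_values = np.linalg.svd(E, compute_uv=False)
baseline_length = np.linalg.norm(b)
print(singular_values, baseline_length)
```

Since \(E\) is homogeneous, multiplying it by any non-zero factor leaves the constraint satisfied, which matches the fact that the baseline length cannot be estimated.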

How do we obtain the values of the fundamental matrix from image correspondences?

8 Point algorithm

We know the direction vectors from the image coordinates, but the parameters of \(F\) are unknown.

\[ [x'_{n}, y'_{n}, 1] \begin{bmatrix} F_{11} & F_{12} & F_{13} \\ F_{21} & F_{22} & F_{23} \\ F_{31} & F_{32} & F_{33} \end{bmatrix} \begin{bmatrix}x''_{n} \\ y''_{n} \\ 1 \end{bmatrix} = 0 \]

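Each correspondence contributes one linear equation in the nine entries of \(F\). Stacking one row per correspondence into a matrix \(A\) gives a homogeneous system, which we solve using the SVD:

\[ A \begin{bmatrix} F_{11} \\ \vdots \\ F_{33} \end{bmatrix} = 0 \]

From 8 corresponding points, we can solve for \(F\) or \(E\) up to scale.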
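The linear solve can be sketched as follows, assuming NumPy and noise-free synthetic correspondences. The function name `eight_point`, the helpers, and all numeric values are illustrative assumptions. Image coordinates are kept close to unit magnitude to keep the system well conditioned; practical implementations commonly also normalise pixel coordinates and enforce the rank-2 property of \(F\), and both steps are omitted here.

```python
import numpy as np

def eight_point(x1, x2):
    """Estimate F up to scale from N >= 8 homogeneous correspondences (N x 3 each),
    such that x1[n] @ F @ x2[n] = 0."""
    # Row n holds the coefficients of (F11, F12, ..., F33) in x1[n]^T F x2[n] = 0.
    A = np.stack([np.kron(p, q) for p, q in zip(x1, x2)])
    # The right singular vector of the smallest singular value solves A f = 0.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)

def skew(b):
    return np.array([[0.0, -b[2], b[1]], [b[2], 0.0, -b[0]], [-b[1], b[0], 0.0]])

def rot_y(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Assumed ground-truth camera pair, used only to generate test data.
K1 = np.array([[1.2, 0.0, 0.1], [0.0, 1.2, -0.05], [0.0, 0.0, 1.0]])
K2 = np.array([[1.1, 0.0, -0.1], [0.0, 1.1, 0.05], [0.0, 0.0, 1.0]])
R1, R2 = np.eye(3), rot_y(0.2)
X_O1 = np.zeros(3)
X_O2 = np.array([1.0, 0.1, 0.0])
F_true = np.linalg.inv(K1).T @ R1 @ skew(X_O2 - X_O1) @ R2.T @ np.linalg.inv(K2)

# Eight scene points and their homogeneous image coordinates.
rng = np.random.default_rng(2)
X = rng.uniform([-3.0, -3.0, 3.0], [3.0, 3.0, 9.0], size=(8, 3))
x1 = (K1 @ R1 @ (X - X_O1).T).T
x2 = (K2 @ R2 @ (X - X_O2).T).T
x1 /= x1[:, 2:3]
x2 /= x2[:, 2:3]

F_est = eight_point(x1, x2)

# F is only defined up to scale and sign, so compare unit-norm matrices.
F_est = F_est / np.linalg.norm(F_est)
F_ref = F_true / np.linalg.norm(F_true)
error = min(np.linalg.norm(F_est - F_ref), np.linalg.norm(F_est + F_ref))
print(error)
```

The same function can be applied to calibrated directions \({}^{k}x^{'}, {}^{k}x^{''}\) to estimate \(E\) instead of \(F\).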


Implementations of these algorithms are available in many popular packages, for example OpenCV's findFundamentalMat and findEssentialMat.


  • Forsyth, Ponce; Computer Vision: A modern approach.
  • Hartley, Zisserman; Multiple View Geometry in Computer Vision.
  • H. Christopher Longuet-Higgins (1981). “A computer algorithm for reconstructing a scene from two projections”.
