On-Line Computer Graphics Notes

The Camera Transform


Overview

To understand the rendering process, you must master the procedure that specifies a camera and then constructs a transformation that projects a three-dimensional scene onto a two-dimensional screen. This procedure has three components: first, the specification of a camera model; second, the conversion of the scene's coordinates from Cartesian space to the space of the camera; and finally, the specification of a viewing transformation that projects the scene into image space.


The Camera Model

We specify our initial camera model by identifying the following parameters.

  1. A scene, consisting of polygonal elements, each represented by its vertices,
  2. A point that represents the camera position -- $\mathbf{E}$,
  3. A point that represents the ``center-of-attention'' of the camera (i.e., where the camera is looking) -- $\mathbf{A}$,
  4. A field-of-view angle, $\alpha$, representing the angle subtended at the apex of the viewing pyramid,
  5. The specification of ``near'' and ``far'' bounding planes. These planes are perpendicular to the direction-of-view vector, at distances $n$ and $f$ from the camera, respectively.


The specification of the camera position $\mathbf{E}$, the center of attention $\mathbf{A}$ and the field-of-view angle $\alpha$ forms a viewing volume in the shape of a pyramid, with the camera position $\mathbf{E}$ at the apex of the pyramid and the vector $\mathbf{A} - \mathbf{E}$ forming the axis of the pyramid. This pyramid is commonly referred to as the viewing pyramid. The near and far planes truncate the viewing pyramid, giving the region of space that contains the primary portion of the scene to be viewed. (Objects may extend outside the truncated pyramid: in many situations polygons will lie between the near plane and the camera, or beyond the far plane.) The viewing transform maps this truncated pyramid onto the image-space volume $-1 \le x, y, z \le 1$.
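The camera parameters listed above can be collected in a small record. The following Python sketch is only an illustration: the class name `Camera`, its field names, the use of degrees for the angle, and the helper `half_width_at` are assumptions made here, not part of these notes.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Camera:
    """Hypothetical container for the camera model parameters."""
    eye: Vec3        # camera position (apex of the viewing pyramid)
    center: Vec3     # center of attention
    fov_deg: float   # full angle at the apex; degrees are an assumption
    near: float      # distance from the camera to the near plane
    far: float       # distance from the camera to the far plane

    def view_direction(self) -> Vec3:
        """Unit vector along the pyramid axis, from eye toward center."""
        d = tuple(c - e for c, e in zip(self.center, self.eye))
        length = math.sqrt(sum(x * x for x in d))
        if length == 0.0:
            raise ValueError("eye and center of attention coincide")
        return tuple(x / length for x in d)

    def half_width_at(self, distance: float) -> float:
        """Half-extent of the pyramid cross-section at a distance along the axis."""
        return distance * math.tan(math.radians(self.fov_deg) / 2.0)


cam = Camera(eye=(0.0, 0.0, 5.0), center=(0.0, 0.0, 0.0),
             fov_deg=90.0, near=1.0, far=10.0)
print(cam.view_direction())
print(cam.half_width_at(cam.near))
```

With a 90 degree field of view, the cross-section at the near plane is about as wide as twice the near distance, which is a quick sanity check on the angle convention.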


The Camera Transform

Given the definition of a camera $(\mathbf{E}, \mathbf{A}, \alpha, n, f)$ (position, center of attention, field-of-view angle, and near and far distances), the camera transformation is a combination of two transforms: the first converts the coordinates of the Cartesian frame to the local coordinates of the camera's frame, and the second applies the viewing transform. These two transformations are usually multiplied together to form a single $4 \times 4$ matrix that is applied to all points of the scene.


Defining a Frame at the Camera Position

The main idea here is to define a frame at the camera position. Given such a frame $(\vec{u}, \vec{v}, \vec{w}, \mathbf{E})$, with its origin at the camera position $\mathbf{E}$, we generate a transformation that converts Cartesian-frame coordinates to coordinates in the camera's frame.

Defining a frame at the camera position is easy, and there are actually a number of ways of doing this. One of the vectors is obvious: we want
$$
\vec{w} = \frac{\mathbf{E} - \mathbf{A}}{|\mathbf{E} - \mathbf{A}|}
$$
where $\mathbf{A}$ is the center of attention (the transformed camera should be looking along the negative $w$ axis).

In order to define the other vectors that make up the frame, we must make an assumption. We assume that the vertical direction of the camera lies in the plane defined by $\vec{w}$ and the vector <0,1,0>. This is frequently the case when you are taking a picture, if you think about it, and it is actually fairly easy to arrange. See the following figure for an illustration of this process. In the figure, the dotted line is the direction of view, and should be placed on the negative z axis by the transformation.

[Figure: the frame defined at the camera position; the dotted line is the direction of view.]

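To define $\vec{u}$ and $\vec{v}$ from the unit vector $\vec{w}$ chosen above, we utilize the following steps:

  1. $\vec{u} = \dfrac{\langle 0,1,0 \rangle \times \vec{w}}{|\langle 0,1,0 \rangle \times \vec{w}|}$,
  2. $\vec{v} = \vec{w} \times \vec{u}$.

This also ensures that the vectors are all unit vectors, and that they are mutually perpendicular.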

We note that this works well, except when you wish to have the camera look in the direction <0,1,0> or <0,-1,0>. In these cases, either $\vec{w} = \langle 0,1,0 \rangle$ or $\vec{w} = \langle 0,-1,0 \rangle$, so $\langle 0,1,0 \rangle \times \vec{w} = \vec{0}$ and we cannot calculate a frame in this manner. However, we can choose another vector as the ``up direction'' to use with $\vec{w}$ to obtain $\vec{u}$.
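The construction of the frame, including the degenerate case, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the frame vectors are named `u`, `v`, `w`, the function name `camera_frame` is hypothetical, and the fallback up direction <0,0,1> and the tolerance `eps` are choices made here rather than anything prescribed by the notes.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(a[0] * a[0] + a[1] * a[1] + a[2] * a[2])


def normalize(a):
    length = norm(a)
    if length == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return (a[0] / length, a[1] / length, a[2] / length)


def camera_frame(eye, center, up=(0.0, 1.0, 0.0),
                 fallback_up=(0.0, 0.0, 1.0), eps=1e-9):
    """Return unit vectors (u, v, w) of a right-handed frame at the camera.

    w points from the center of attention back toward the camera, so the
    camera looks along the negative w axis.
    """
    w = normalize(sub(eye, center))
    side = cross(up, w)
    if norm(side) < eps:
        # Looking straight up or down: up is parallel to w, so use
        # another "up direction" (an arbitrary choice in this sketch).
        side = cross(fallback_up, w)
    u = normalize(side)
    v = cross(w, u)  # already unit length because w and u are orthonormal
    return u, v, w


u, v, w = camera_frame((0.0, 0.0, 5.0), (0.0, 0.0, 0.0))
print(u, v, w)
```

A camera on the positive z axis looking at the origin yields the Cartesian axes themselves, which makes this a convenient first test case.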


Calculating the Matrix

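To calculate the actual matrix that implements the transformation, we can write each of the vectors <1,0,0>, <0,1,0> and <0,0,1> as a linear combination of $\vec{u}$, $\vec{v}$, and $\vec{w}$ (since the vectors defining the camera frame $(\vec{u}, \vec{v}, \vec{w}, \mathbf{E})$ are linearly independent). In addition, we can write the vector $\mathbf{O} - \mathbf{E}$, from the camera position $\mathbf{E}$ to the Cartesian origin $\mathbf{O}$, as a linear combination of $\vec{u}$, $\vec{v}$ and $\vec{w}$. Thus we can calculate the values $a_{i,j}$, where
$$
\begin{aligned}
\langle 1,0,0 \rangle &= a_{1,1}\vec{u} + a_{1,2}\vec{v} + a_{1,3}\vec{w} \\
\langle 0,1,0 \rangle &= a_{2,1}\vec{u} + a_{2,2}\vec{v} + a_{2,3}\vec{w} \\
\langle 0,0,1 \rangle &= a_{3,1}\vec{u} + a_{3,2}\vec{v} + a_{3,3}\vec{w} \\
\mathbf{O} - \mathbf{E} &= a_{4,1}\vec{u} + a_{4,2}\vec{v} + a_{4,3}\vec{w}
\end{aligned}
$$
These equations can be solved by Cramer's Rule. To obtain $a_{1,1}$, $a_{1,2}$ and $a_{1,3}$, we have
$$
\begin{aligned}
a_{1,1} &= \frac{\langle 1,0,0 \rangle \cdot (\vec{v} \times \vec{w})}{\vec{u} \cdot (\vec{v} \times \vec{w})} \\
a_{1,2} &= \frac{\langle 1,0,0 \rangle \cdot (\vec{w} \times \vec{u})}{\vec{u} \cdot (\vec{v} \times \vec{w})} \\
a_{1,3} &= \frac{\langle 1,0,0 \rangle \cdot (\vec{u} \times \vec{v})}{\vec{u} \cdot (\vec{v} \times \vec{w})}
\end{aligned}
$$
To obtain $a_{2,1}$, $a_{2,2}$ and $a_{2,3}$, we have
$$
\begin{aligned}
a_{2,1} &= \frac{\langle 0,1,0 \rangle \cdot (\vec{v} \times \vec{w})}{\vec{u} \cdot (\vec{v} \times \vec{w})} \\
a_{2,2} &= \frac{\langle 0,1,0 \rangle \cdot (\vec{w} \times \vec{u})}{\vec{u} \cdot (\vec{v} \times \vec{w})} \\
a_{2,3} &= \frac{\langle 0,1,0 \rangle \cdot (\vec{u} \times \vec{v})}{\vec{u} \cdot (\vec{v} \times \vec{w})}
\end{aligned}
$$
And to obtain $a_{3,1}$, $a_{3,2}$ and $a_{3,3}$, we have
$$
\begin{aligned}
a_{3,1} &= \frac{\langle 0,0,1 \rangle \cdot (\vec{v} \times \vec{w})}{\vec{u} \cdot (\vec{v} \times \vec{w})} \\
a_{3,2} &= \frac{\langle 0,0,1 \rangle \cdot (\vec{w} \times \vec{u})}{\vec{u} \cdot (\vec{v} \times \vec{w})} \\
a_{3,3} &= \frac{\langle 0,0,1 \rangle \cdot (\vec{u} \times \vec{v})}{\vec{u} \cdot (\vec{v} \times \vec{w})}
\end{aligned}
$$
In addition, if $\vec{d} = \mathbf{O} - \mathbf{E}$, then we have
$$
\begin{aligned}
a_{4,1} &= \frac{\vec{d} \cdot (\vec{v} \times \vec{w})}{\vec{u} \cdot (\vec{v} \times \vec{w})} \\
a_{4,2} &= \frac{\vec{d} \cdot (\vec{w} \times \vec{u})}{\vec{u} \cdot (\vec{v} \times \vec{w})} \\
a_{4,3} &= \frac{\vec{d} \cdot (\vec{u} \times \vec{v})}{\vec{u} \cdot (\vec{v} \times \vec{w})}
\end{aligned}
$$
to get $a_{4,1}$, $a_{4,2}$ and $a_{4,3}$.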
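The Cramer's Rule step can be checked numerically. The sketch below solves t = a u + b v + c w for the three coefficients using scalar triple products, one common way to write Cramer's Rule for three vectors. The function name `coefficients` and the deliberately non-orthogonal example basis are assumptions chosen for illustration.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def coefficients(t, u, v, w, eps=1e-12):
    """Solve t = a*u + b*v + c*w for (a, b, c) by Cramer's Rule.

    The common denominator u . (v x w) is the determinant of the matrix
    whose rows are u, v, w; it is nonzero when they are independent.
    """
    denom = dot(u, cross(v, w))
    if abs(denom) < eps:
        raise ValueError("u, v, w are not linearly independent")
    return (dot(t, cross(v, w)) / denom,
            dot(t, cross(w, u)) / denom,
            dot(t, cross(u, v)) / denom)


# A skewed (non-orthonormal) basis, to exercise the general formula.
u = (1, 0, 0)
v = (1, 1, 0)
w = (1, 1, 1)
a, b, c = coefficients((2, 3, 4), u, v, w)
print(a, b, c)  # -1.0 -1.0 4.0, since -u - v + 4w = (2, 3, 4)
```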

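The matrix that converts the coordinates of objects in the Cartesian frame into coordinates for the camera frame $(\vec{u}, \vec{v}, \vec{w}, \mathbf{E})$ is given by
$$
M = \begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & 0 \\
a_{2,1} & a_{2,2} & a_{2,3} & 0 \\
a_{3,1} & a_{3,2} & a_{3,3} & 0 \\
a_{4,1} & a_{4,2} & a_{4,3} & 1
\end{bmatrix}
$$
Any point $\mathbf{P} = (x, y, z)$ can be written in the Cartesian frame, with origin $\mathbf{O}$, by
$$
\mathbf{P} = x \langle 1,0,0 \rangle + y \langle 0,1,0 \rangle + z \langle 0,0,1 \rangle + \mathbf{O}
$$
But by the above calculations this is equal to
$$
\mathbf{P} = \begin{bmatrix} x & y & z & 1 \end{bmatrix} M \begin{bmatrix} \vec{u} \\ \vec{v} \\ \vec{w} \\ \mathbf{E} \end{bmatrix}
$$
which implies that the coordinate
$$
\begin{bmatrix} x & y & z & 1 \end{bmatrix} M
$$
is the coordinate of the point in the camera frame.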
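As a sketch of how the pieces fit together, the following Python builds the 4x4 matrix row by row from the Cramer's Rule coefficients and applies it to a point. It assumes a row-vector convention (the point multiplies the matrix from the left) and that the Cartesian origin is (0,0,0); the names `camera_matrix` and `transform` are hypothetical.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def coefficients(t, u, v, w):
    """Coefficients of t in the basis (u, v, w), by Cramer's Rule."""
    denom = dot(u, cross(v, w))
    return (dot(t, cross(v, w)) / denom,
            dot(t, cross(w, u)) / denom,
            dot(t, cross(u, v)) / denom)


def camera_matrix(u, v, w, eye):
    """4x4 matrix (list of rows) taking Cartesian to camera-frame coordinates."""
    axes = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    rows = [coefficients(e, u, v, w) + (0.0,) for e in axes]
    d = (-eye[0], -eye[1], -eye[2])  # O - E, with O assumed to be (0, 0, 0)
    rows.append(coefficients(d, u, v, w) + (1.0,))
    return rows


def transform(point, m):
    """Multiply the row vector [x y z 1] by the matrix m."""
    row = (point[0], point[1], point[2], 1.0)
    return tuple(sum(row[i] * m[i][j] for i in range(4)) for j in range(4))


# Camera on the z axis, frame aligned with the Cartesian axes.
eye = (0.0, 0.0, 5.0)
m = camera_matrix((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), eye)
print(transform(eye, m))              # the camera maps to its own origin
print(transform((0.0, 0.0, 0.0), m))  # the point looked at lies on the -w axis
```

In this example the camera position maps to (0, 0, 0, 1) and the origin maps to (0, 0, -5, 1), consistent with the camera looking along the negative w axis.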

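We note that, by our construction, the camera frame $(\vec{u}, \vec{v}, \vec{w}, \mathbf{E})$ is an orthonormal frame (all vectors are unit vectors and are mutually perpendicular), and in this case the equations above simplify tremendously. In particular, all the denominators $\vec{u} \cdot (\vec{v} \times \vec{w}) = 1$, and we can simplify the numerators utilizing the identities
$$
\begin{aligned}
\vec{v} \times \vec{w} &= \vec{u} \\
\vec{w} \times \vec{u} &= \vec{v} \\
\vec{u} \times \vec{v} &= \vec{w}
\end{aligned}
$$
to obtain the matrix entries
$$
\begin{aligned}
a_{1,1} &= \langle 1,0,0 \rangle \cdot \vec{u}, & a_{1,2} &= \langle 1,0,0 \rangle \cdot \vec{v}, & a_{1,3} &= \langle 1,0,0 \rangle \cdot \vec{w} \\
a_{2,1} &= \langle 0,1,0 \rangle \cdot \vec{u}, & a_{2,2} &= \langle 0,1,0 \rangle \cdot \vec{v}, & a_{2,3} &= \langle 0,1,0 \rangle \cdot \vec{w} \\
a_{3,1} &= \langle 0,0,1 \rangle \cdot \vec{u}, & a_{3,2} &= \langle 0,0,1 \rangle \cdot \vec{v}, & a_{3,3} &= \langle 0,0,1 \rangle \cdot \vec{w} \\
a_{4,1} &= (\mathbf{O} - \mathbf{E}) \cdot \vec{u}, & a_{4,2} &= (\mathbf{O} - \mathbf{E}) \cdot \vec{v}, & a_{4,3} &= (\mathbf{O} - \mathbf{E}) \cdot \vec{w}
\end{aligned}
$$
where $\mathbf{O}$ is the Cartesian origin. The first few of these are extremely simple, as, for example, $a_{1,1}$ is just the first coordinate of $\vec{u}$, etc.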
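For an orthonormal frame the matrix can therefore be written down directly from the components of the frame vectors, with no division. The Python sketch below assumes the same row-vector convention and a Cartesian origin at (0,0,0); the function names and the example camera (on the positive x axis, looking at the origin) are illustrative assumptions.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def orthonormal_camera_matrix(u, v, w, eye):
    """Camera matrix for an orthonormal frame (u, v, w) with origin eye.

    Row i holds the i-th components of u, v, w; the last row holds the
    dot products of (O - E) = -eye with u, v and w.
    """
    return [
        (u[0], v[0], w[0], 0.0),
        (u[1], v[1], w[1], 0.0),
        (u[2], v[2], w[2], 0.0),
        (-dot(eye, u), -dot(eye, v), -dot(eye, w), 1.0),
    ]


def transform(point, m):
    """Multiply the row vector [x y z 1] by the matrix m."""
    row = (point[0], point[1], point[2], 1.0)
    return tuple(sum(row[i] * m[i][j] for i in range(4)) for j in range(4))


# Camera at (5, 0, 0) looking at the origin, with <0,1,0> as up:
# w = <1,0,0>, u = <0,1,0> x w = <0,0,-1>, v = w x u = <0,1,0>.
eye = (5.0, 0.0, 0.0)
u, v, w = (0.0, 0.0, -1.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)
m = orthonormal_camera_matrix(u, v, w, eye)
print(transform(eye, m))
print(transform((0.0, 0.0, 0.0), m))
```

Here the camera position again maps to the origin of the camera frame, and the point being looked at lands five units down the negative w axis.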


Summary

The camera transform is a Cartesian-frame-to-frame transform. This is combined with the viewing transform to give a transformation that converts a scene into image space.


This document maintained by Ken Joy


All contents copyright (c) 1996, 1997
Computer Science Department,
University of California, Davis
All rights reserved.



Ken Joy
Fri May 2 14:02:26 PDT 1997