01 November 2002

Eigenwhat?

by George E. Hrabovsky, President, MAST

News from MAST

After much soul-searching I have decided to make a shameless plea... MAST is running out of money! We are in serious danger of having to close down. It might surprise you to know that we do not receive any funds from the SAS at all. If you appreciate the columns that we are writing, and the columns we hope to produce (we are considering a Java programming column, and an image-processing column), then I would urge you to visit our web site and make a small contribution (all such contributions are tax-deductible).

Proving Euler's Theorem

Recall that in the last column we stated the following, "There is an truly amazing result that you get when you move a rigid body with a single fixed point. It turns out that when you take an initial position and move it to the final position, this motion can be duplicated by a single rotation of the rigid body about an axis through the fixed point. This is called Euler's Theorem."

We did not, however, prove this statement. In keeping with the accompanying Mathematics Corner on proofs, I will seek to prove this theorem. Before I can perform this proof, though, I need to introduce an important mathematical tool.

Eigenvalues and Eigenvectors

There is an interesting class of problems that have wide application throughout physics and other sciences. Given a square matrix (a matrix with the same number of rows and columns) we can characterize the matrix as a product of a set of scalars and vectors. This also works for anything that can be represented by a square matrix — like some tensors. If we have a square matrix A of n rows and columns and a set of column vectors v, then we can characterize this with a set of values λ such that,

A v = λ v .

In this case λ is called the eigenvalue of the matrix and v is called the eigenvector of the matrix. We can write this as a difference if we convert the eigenvalues into a matrix by multiplying by the Kronecker delta,

A   v - 1 λ v   =   0 (A - 1 λ) v   =   0.

This is called the secular equation (also the characteristic equation). What we have here is the product of a matrix by a column vector. If we consider only the matrix A   - 1   λ and we take the determinant of its negative we get a special expression,

Δ (λ) = det(1 λ - A)             = (-1)^n det(A - 1 λ) .

This is called the characteristic polynomial of the matrix. There is an important result related to this, every matrix is a root of its characteristic polynomial. This is called the Cayley-Hamilton theorem. It is also true that the eigenvalues are roots of the characteristic polynomial. In general, for a square matrix we can express the characteristic polynomial as,

Δ(λ) = λ^n - S _ 1 λ^(n - 1) + S _ 2 λ^(n - 2) + ... + (-1)^n S _ n,

where S _ i is the sum of principal minors of order i (see the column, "Turning one Vector into Another," a principle minor is a minor whose diagonal elements come for the diagonal elements of the matrix.) For a three-dimensional square matrix,

A    = (a     a     a  )                    11    12    13                    a      ...                  21    22    23                    a     a     a                    31    32    33

we have,

Δ(λ) = λ^3 - S _ 1 λ^2 + S _ 2 λ^1 - S _ 3 .

The first-order minors for this would just be the elements of the matrix. The second-order minors for this would be,

M _ 11 = | a     a   | = a _ 11 a _ 22 - a _ 12 a _ 21             11    12             a      ... | = a _ 22 a _ 33 - a _ 23 a _ 32 .             22    23             a     a             32    33

There is only one minor of third-order, the determinant of A.

The principal minors are those whose diagonal elements are diagonal elements of A. The principle minors of first-order are simply the diagonal elements of the matrix. The principle minors of second-order are,

P _ 1 = M _ 11 = | a     a   | = a _ 11 a _ 22 - a _ 12 a _ 21                     11    12    ... 23 a _ 32 .                     22    23                     a     a                     32    33

The principle minor of third-order is the determinant of A.

The sum of the first-order principle minors is the sum of the diagonal elements of the matrix. This sum is called the trace of the matrix,

S _ 1 = tr(A) = a _ 11 + a _ 22 + a _ 33 .

The sum of the second-order principle minors is,

S _ 2 = P _ 1 + P _ 2 + P _ 3 = a _ 11 a _ 22 - a _ 12 a _ 21 + a _ 11 a _ 33 - a _ 13 a _ 31 + a _ 22 a _ 33 - a _ 23 a _ 32 .

The sum of the third-order principle minors is,

S _ 3 = det(A) = a _ 12 a _ 23 a _ 31 - a _ 13 a _ 22 a _ 31 +  a _ 13 a _ 21 a _ 32 - a _ 11 a _ 23 a _ 32 + a _ 11 a _ 22 a _ 33 - a _ 12 a _ 21 a _ 33 .

If we insert some numbers into the matrix we can get an example, if we use the following values for the matrix

A    = (1    0    3 )                    2    0    -1                    -1   1    0

we have

S _ 1 = tr(A) = a _ 11 + a _ 22 + a _ 33 = 1 + 0 + 0 = 1 S _ 2 = P _ 1 + P _ 2 + P _ 3 = a _ 1 ... 32 - a _ 11 a _ 23 a _ 32 + a _ 11 a _ 22 a _ 33 - a _ 12 a _ 21 a _ 33    = 6 + 1 = 7.

So, the characteristic polynomial becomes,

Δ(λ) = λ^3 -    λ^2 + 4 λ^1 - 7.

When we find the roots for this polynomial, we will know the eigenvalues of the matrix. In this case we have

λ _ 1 = 1.48 λ _ 2 = -0.24 - 2.16 i λ _ 3 = -0.24 + 2.16 i

where i is the imaginary number (-1)^(1/2). If our goal was to complete our discussion of eigenvectors we could go into the complexities of that operation, but this carries us too far from our goal of proving Euler's theorem. Determination of eigenvectors will have to wait for now.

Euler's Theorem Revisited

If we think about it, we can represent the finite rotation tensor Φ as a matrix whose individual elements are,

φ _ (i j) = cos θ δ _ (i j) + (1 - cos θ) e _ i e _ j + sin θ ϵ _ (i k j) e _ k .

So we can write the matrix in three dimensions,

Φ   =   ( φ     φ     φ  ) .                      11         12         13 ...      33                 φ     φ     φ                      31         32         33

Then we can write,

FormBox[RowBox[{Φ v = λ v, ,, RowBox[{Cell[        ... bsp;           ], (1)}]}], TraditionalForm]

Since the tensor represents a rotation there are two facts that must be pointed out. The first is that the sum of the squares of the elements in the three column vectors are equal to 1, and the inner products of the column vectors are equal to 0. Whenever these two facts are true, the matrix is called orthogonal. Here are the facts written out,

(φ _ (1 i))^2 = 1 (φ _ (2 i))^2 = 1 (φ _ (3 i))^2 = 1 δ _ (i j) φ _ (k i) φ _ (l j) = 0,

Where i, j, k, l = {1, 2, 3}. This being the case there is a theorem that allows us to write,

FormBox[RowBox[{det (Φ v ) = det (v ), ,, RowBox[{Cell[     &nbs ... bsp;           ], (2)}]}], TraditionalForm]

then we can substitute (2) into (1),

det (v )   = det (Φ v ) = | λ | det(v),

So long as det ( v )   !=   0 we can rewrite this,

| λ | = 1,

Hence, at least one eigenvalue of a rotation must be 1. This is another way of stating Euler's theorem.

The secular equation for the matrix representation of the tensor is,

FormBox[RowBox[{(Φ - 1 λ) v,  , =,  , RowBox[{0., Cell[      ... bsp;           ], (3)}]}], TraditionalForm]

We can expand this into a system of equations,

(φ _ 11 - λ )   X   +   φ _ 12 Y +   φ _ 13 Z = 0,  φ _ 21 X   +   (& ... 5;) Y +   φ _ 23 Z = 0,  φ _ 31 X   +   φ _ 32 Y +   ( φ _ 33 - λ) Z = 0.

This system can only have a solution when the determinant of the coefficients vanishes,

FormBox[RowBox[{det(Φ - 1 λ),  , =,  , RowBox[{det(φ   - λ   φ        ...                                                         31                  32                  33

This is equivalent to (2), so λ = 1. This proves the theorem.

Answer to Last Column's Theory Challenge

The entire column is the tensor-based answer to last week's theory challenge.

Theory Challenge

Using (3) can you figure out why (4) is equivalent to (2)?

As an added challenge, can you verify the facts for the assumption of orthogonality of the rotation tensor?

Books That I Like

E. A. Fox (1967), Mechanics, Harper and Row. This book is an excellent treatment of the mechanics of rigid bodies, elastic solids, and fluids. It is also, unfortunately, out of print; because of this I am following its arguments reasonably closely.

Melvin Hauser (1965), A Vector Space Approach to Geometry, Prentice-Hall (reprinted in 1998 by Dover Publications). This is a very interesting book that deals with the application of algebra to geometry. This book has lots of theorems regarding orthogonal matrices.

Seymour Lipschutz, Marc Lipson, (2001), Linear Algebra, McGraw-Hill (one of the excellent Schaum's Outline series). This book is one of the best for linear algebra, it is very clear and detailed.


Converted by Mathematica  (October 28, 2002)