Clojure Linear Algebra Refresher (2) - Eigenvalues and Eigenvectors

June 6, 2017

I stumble upon eigenvectors and eigenvalues regularly when reading about machine learning topics. They seem to be important. They are not introduced in those texts, though, so you might be scratching your head wondering how the authors got the idea in the first place to use them and why. Luckily, they are introduced soon enough in linear algebra textbooks, and they are available at high-speed in Clojure via Neanderthal, that we should be able to at least try them out easily, and then build upon that knowledge fearlessly when we encounter them in the wild.

Before continuing this linear algebra series with Eigen-stuff, let me remind you that:

This text covers relevant parts of chapter 5 of the alternate edition of the textbook.

Eigenvalues and Eigenvectors

Let \(A\) be a \(m \times n\) matrix. If there are scalar \(\lambda\) and a non-zero vector \(\mathbf{x}\) such that \(A\mathbf{x} = \lambda\mathbf{x}\), we call such scalar eigenvalue, and such vector eigenvector.

There can be more than one eigenvalue for a given matrix, and there is an infinite number of eigenvectors corresponding to one eigenvalue. All eigenvectors that correspond to one a unique eigenvalue lie on the same line, but have different magnitudes.

Seems simple, and it is, but so what? It looks like a trivial thing; how come these eigenvectors and eigenvalues are so ubiquitous in linear algebra? It turns out that some useful special matrices can be computed, and some useful theorems can be build upon this simple definition. IANM (I Am Not a Mathematician), so I'll let you use your math textbook to discover that further.

Let's do some Clojure: given a matrix, how do I find eigenvalues and eigenvectors? The function is called ev! and it can be found in the uncomplicate.neanderthal.linalg namespace:

(require '[uncomplicate.neanderthal
           [core :refer [col entry nrm2 mv scal axpy copy mm dia]]
           [native :refer [dge]]
           [linalg :refer [ev! tri! trf]]])

I'm following the example 1 from page 210; there is a matrix a and we are looking for 2 eigenvalues with corresponding eigenvectors. Calling def inside a let block is not a coding style to be proud of, but here I do it because I need an easy way to produce printable outputs in org-mode and org-babel, which is used to generate this nice text from live code.

(let [a (dge 2 2 [-4 3 -6 5]) ;;note: column-oriented!
      eigenvectors (dge 2 2)
      eigenvalues (ev! a nil eigenvectors)]
  (def lambda1 (entry eigenvalues 0 0))
  (def x1 (col eigenvectors 0))
  (def lambda2 (entry eigenvalues 1 0))
  (def x2 (col eigenvectors 1)))

The first eigenvector is \(\lambda = 1\) (the order is not important):

lambda1
-1.0

An infinite number of vectors correspond to this λ value, but they are all linearly dependent: find one base vector, and you can construct any other by scaling that one. That base vector forms a subspace, or eigenspace. The base corresponding to \(\lambda = 1\) is \(r(-0.89, 0.45)\):

x1
#RealBlockVector[double, n:2, offset: 0, stride:1]
[  -0.89    0.45 ]

Mathematicians warn us to always be skeptical. Let's check that λ1 and \(\mathbf{x_1}\) are really an eigenvalue and eigenvector:

(let [a (dge 2 2 [-4 3 -6 5])]
  (axpy -1 (mv a x1) (scal lambda1 x1)))
#RealBlockVector[double, n:2, offset: 0, stride:1]
[   0.00    0.00 ]

Yes, they are.

Perhaps you wonder why I haven't simply check these two vectors for equality with =. Recall from the part 1 of this tutorial that comparing floating point numbers for equality is a tricky business. Even a small difference of 0.00000001, that can appear due to inevitable rounding errors, would break such equality check. Even those numbers, 0.89, and 0.45 are not very precise - they have much more digits, but Neanderthal rounds them to two decimals for readability. These are the actual values to the max precision available by 64 bits:

(doall (seq x1))
-0.8944271909999159 0.4472135954999579

Another benefit of using a good numerical software (such as Neanderthal) is that the eigenvectors we get are normalized:

(nrm2 x1)
1.0

We can repeat the same procedure for λ2; it is a good idea if you test them as an exercise in your trusty Clojure REPL.

lambda2
2.0

x2
#RealBlockVector[double, n:2, offset: 2, stride:1]
[   0.71   -0.71 ]

Eigenvalues are not necessarily distinct. Consider example 2 from page 214:

(let [a (dge 3 3 [5 4 2 4 5 2 2 2 2]) ;;note: column-oriented!
      eigenvectors (dge 3 3)
      eigenvalues (ev! a nil eigenvectors)]
  [(col eigenvalues 0) eigenvectors])
'(#RealBlockVector(double  n:3  offset: 0  stride:1)
(   1.00   10.00    1.00 )
 #RealGEMatrix(double  mxn:3x3  layout:column  offset:0)
   ▥       ↓       ↓       ↓       ┓
   →      -0.75    0.67   -0.03
   →       0.60    0.67   -0.42
   →       0.30    0.33    0.91
   ┗                               ┛
)

\(\lambda = 1\) appears two times. The space corresponding to it has multiplicity 2, and the dimension of its corresponding eigenspace is 2. As an exercise, you might check whether any linear combination of these two eigenvectors (column 0 and column 2 from the result matrix) is indeed an eigenvector (it should be!).

You might also wonder why those eigenvalues are in the first column of a result matrix. Eigenvalues are usually complex numbers. That matrix has \(m\times{2}\) dimensions. The first column contains the real part, and the second the imaginary part. These examples from the textbook are well-designed to be easily computed by hand, so I knew that the imaginary part was zero, and didn't bother to clutter the introductory code. In general, I guess that eigenvalues will have imaginary component more often than not (IANM).

Diagonalization of Matrices

Consider square matrices \(A\) and \(B\) of the same size, and suppose there exists an invertible matrix \(C\), such that \(B = C^{-1}AC\). Then, we say that \(B\) is similar to \(A\), and the transformation from \(A\) to \(B\) is called similarity transformation.

Let's compute an example in Clojure. We have matrices a and c, and would like to compute the similar matrix b. The first thing to ask is: how do we compute the inverse of c? Inverting a matrix is rather numerically intensive, so there exist several "shortcuts" to that.

For example, one of the approaches is to first factorize the matrix into a product of two triangular matrices \(L\) and \(U\). That is easier than doing the full compute of the inverse. Then, we can reuse these factors for different tasks that ask for the inverse in a naive approach, but where the answer can be computed using only \(L\) and \(U\). Since \(L\) and \(U\) are lower and upper triangular or trapezoidal, it is easier to compute their inverses or diagonals. Or, if we still need an inverse of the whole matrix, we can compute it from \(LU\) factors.

Relevant functions can be found in the uncomplicate.neanderthal.linalg namespace. Function trf! does the LU ("tr" is for triangular and "f" for factorization) and tri! the inverse given the previously LU-processed input (trf). Use it only when you absolutely need an inverse, and avoid it when LU (or QR etc.) factors are enough.

(let [a (dge 2 2 [7 3 -10 -4])
      c (dge 2 2 [2 1 5 3])
      inv-c (tri! (trf c))];; we can not afford to destroy c!
(mm inv-c (mm a c)))
#RealGEMatrix[double, mxn:2x2, layout:column, offset:0]
   ▥       ↓       ↓       ┓
   →       2.00    0.00
   →       0.00    1.00
   ┗                       ┛

In this case, \(B\) is diagonal, which is a nice property, since many computations are easier with special cases such as diagonal or symmetric matrices than with any random matrix.

A matrix is diagonalizable if there exists a similar matrix that is diagonal.

What has this to do with eigenvalues? It turns out that similar matrices have the same eigenvalues, and eigenvectors can tell us whether the matrix is diagonalizable, and if it is, help us discover the transformation.

Specifically, there is a theorem that states that, for a \(n\times{n}\) matrix \(A\):

  • If \(A\) has n linearly independent eigenvectors, it is diagonalizable, and vice versa.
  • The matrix \(C\), whose columns are n linearly independent eigenvectors can be used in similarity tranformation to give a diagonal matrix \(D\).
  • The diagonal elements of \(D\) will be the eigenvalues of A.

Huh, nice! So, we do not need to compute the inverse, or multiply matrices. We just do the eigen decomposition, and find \(D\), or find that \(D\) does not exist!

Demonstration in Clojure:

(let [a (dge 2 2 [-4 3 -6 5])
            evec (dge 2 2)
            eval (ev! (copy a) nil evec)
            inv-evec (tri! (trf evec))]
        (def diag (col eval 0))
        (def d (mm inv-evec (mm a evec))))

I have computed the eigenvalues and have found the diagonal of \(D\):

diag
#RealBlockVector[double, n:2, offset: 0, stride:1]
[  -1.00    2.00 ]

I have also computed \(D\) in a pedestrian way, using matrix inverse, to check (and demonstrate) whether I should believe what the textbooks tells me. Indeed, the textbook is right!

d
#RealGEMatrix[double, mxn:2x2, layout:column, offset:0]
   ▥       ↓       ↓       ┓
   →      -1.00    0.00
   →       0.00    2.00
   ┗                       ┛

Working with \(D\) is easier than working with \(A\). For example, instead of computing \(A^k\), we can use the identity \(A^k = CD^kC^{-1}\), knowing that \(D^k\) can be computed easily by just taking powers of its diagonal elements.

I'll use Neanderthal's dia function to access the diagonal of \(D\), and Fluokitten's fmap! function to apply the pow function to each of its elements (fmap! is similar to map, just more efficient, and polymorphic so it supports vectors, matrices and many more structures).

(require '[uncomplicate.fluokitten.core :refer [fmap!]])
(require '[uncomplicate.neanderthal.math :refer [pow]])
(def d9 (copy d))
(fmap! (pow 9) (dia d9))
d9
#'user2/d9#RealBlockVector[double, n:2, offset: 0, stride:3]
[  -1.00  512.00 ]
#RealGEMatrix[double, mxn:2x2, layout:column, offset:0]
   ▥       ↓       ↓       ┓
   →      -1.00    0.00
   →       0.00  512.00
   ┗                       ┛

I now leave as an exercise to the reader to compute \(CD^kC^{-1}\). (It seems I'm getting good at math writing; whenever you get bored of doing these step-by-step instructions, offload some work to the reader!)

There is a theorem that claims that symmetric matrices are diagonalizable. That's also something that you might demonstrate as an exercise. There is an even more significant property of symmetric matrices: they are orthogonaly diagonalizable, meaning that \(D = C^tAC\) (since by definition \(C^{-1} = C^t\) for orthogonal matrices). Of course, \(C^t\) is much easier to compute than \(C^{-1}\), Neanderthal's trans function will do that in \(O(1)\), that is - instantly.

Conclusions

Here is an illuminating trivia that will help you associate eigenvalues with their usefulness. Eigen is not a mathematician, the term eigen is adopted from German, where it means "own", "special", "inherent". Eigenvalues and eigenvectors are then special values and vectors of a matrix, its own inherent properties.

You can continue with the next post in the series: Matrix Transformations. In the meantime, you might be interested in checking out some Neanderthal Tutorials.

Clojure Linear Algebra Refresher (2) - Eigenvalues and Eigenvectors - June 6, 2017 - Dragan Djuric