Clojure Linear Algebra Refresher (2) - Eigenvalues and Eigenvectors
June 6, 2017
I stumble upon eigenvectors and eigenvalues regularly when reading about machine learning topics. They seem to be important. They are usually not introduced in those texts, though, so you might be scratching your head, wondering how the authors got the idea to use them in the first place, and why. Luckily, they are introduced soon enough in linear algebra textbooks, and they are available at high speed in Clojure via Neanderthal, so we should be able to try them out easily, and then build upon that knowledge fearlessly when we encounter them in the wild.
Before continuing this linear algebra series with Eigen-stuff, let me remind you that:
- These articles are not self-contained. You should follow along with a linear algebra textbook. I recommend Linear Algebra With Applications, Alternate Edition by Gareth Williams (see more in part 1 of this series).
- The intention is to connect the dots from a math textbook to Clojure code, rather than explain math theory or teach you basics of Clojure.
- Please read Clojure Linear Algebra Refresher (1) - Vector Spaces first.
This text covers relevant parts of chapter 5 of the alternate edition of the textbook.
Eigenvalues and Eigenvectors
Let \(A\) be an \(n \times n\) matrix. If there are a scalar \(\lambda\) and a non-zero vector \(\mathbf{x}\) such that \(A\mathbf{x} = \lambda\mathbf{x}\), we call such a scalar an eigenvalue, and such a vector an eigenvector.
There can be more than one eigenvalue for a given matrix, and an infinite number of eigenvectors correspond to each eigenvalue. In the simplest case, all eigenvectors that correspond to one eigenvalue lie on the same line, but have different magnitudes.
Seems simple, and it is, but so what? It looks like a trivial thing; how come these eigenvectors and eigenvalues are so ubiquitous in linear algebra? It turns out that some useful special matrices can be computed, and some useful theorems can be built upon this simple definition. IANM (I Am Not a Mathematician), so I'll let you use your math textbook to discover that further.
Let's do some Clojure: given a matrix, how do I find eigenvalues and eigenvectors? The function is called
ev!
and it can be found in the uncomplicate.neanderthal.linalg
namespace:
(require '[uncomplicate.neanderthal
           [core :refer [col entry nrm2 mv scal axpy copy mm dia]]
           [native :refer [dge]]
           [linalg :refer [ev! tri! trf]]])
I'm following example 1 from page 210: there is a matrix a, and we are looking for 2 eigenvalues with corresponding eigenvectors. Calling def inside a let block is not a coding style to be proud of, but I do it here because I need an easy way to produce printable outputs in org-mode and org-babel, which are used to generate this nice text from live code.
(let [a (dge 2 2 [-4 3 -6 5]) ;; note: column-oriented!
      eigenvectors (dge 2 2)
      eigenvalues (ev! a nil eigenvectors)]
  (def lambda1 (entry eigenvalues 0 0))
  (def x1 (col eigenvectors 0))
  (def lambda2 (entry eigenvalues 1 0))
  (def x2 (col eigenvectors 1)))
The first eigenvalue is \(\lambda_1 = -1\) (the order is not important):
lambda1
-1.0
An infinite number of vectors correspond to this λ value, but they are all linearly dependent: find one base vector, and you can construct any other by scaling it. That base vector spans a subspace, called an eigenspace. The eigenspace corresponding to \(\lambda_1 = -1\) is \(r(-0.89, 0.45)\):
x1
#RealBlockVector[double, n:2, offset: 0, stride:1] [ -0.89 0.45 ]
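Any scalar multiple of that base vector is itself an eigenvector for the same λ. Here is a quick sketch of that claim (the scaling factor 3.14 is an arbitrary choice of mine): scale \(\mathbf{x_1}\), plug the result into the defining equation, and the difference should still come out (nearly) zero.
(let [a (dge 2 2 [-4 3 -6 5])
      rx1 (scal 3.14 x1)] ;; any non-zero r would do
  ;; A*(r*x1) - lambda1*(r*x1) should be (close to) the zero vector
  (axpy -1 (mv a rx1) (scal lambda1 rx1)))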
Mathematicians warn us to always be skeptical. Let's check that λ1 and \(\mathbf{x_1}\) are really an eigenvalue and eigenvector:
(let [a (dge 2 2 [-4 3 -6 5])]
  (axpy -1 (mv a x1) (scal lambda1 x1)))
#RealBlockVector[double, n:2, offset: 0, stride:1] [ 0.00 0.00 ]
Yes, they are.
Perhaps you wonder why I haven't simply checked these two vectors for equality with =. Recall from part 1 of this tutorial that comparing floating point numbers for equality is a tricky business. Even a tiny difference of 0.00000001, which can appear due to inevitable rounding errors, would break such an equality check. Even those numbers, 0.89 and 0.45, are not very precise: they have many more digits, but Neanderthal rounds them to two decimals for readability. These are the actual values, at the maximum precision available in 64 bits:
(doall (seq x1))
(-0.8944271909999159 0.4472135954999579)
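If you do want a programmatic check instead of eyeballing zeros, compare within a small tolerance rather than with =. A minimal sketch (the 1e-10 threshold is my arbitrary pick, not anything Neanderthal prescribes):
(let [a (dge 2 2 [-4 3 -6 5])
      residual (axpy -1 (mv a x1) (scal lambda1 x1))] ;; A*x1 - lambda1*x1
  ;; true when the residual is negligibly small
  (< (nrm2 residual) 1e-10))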
Another benefit of using good numerical software (such as Neanderthal) is that the eigenvectors we get are normalized:
(nrm2 x1)
1.0
We can repeat the same procedure for λ2; it is a good idea to test it as an exercise in your trusty Clojure REPL (a quick sketch follows below, if you want to compare).
lambda2
2.0
x2
#RealBlockVector[double, n:2, offset: 2, stride:1] [ 0.71 -0.71 ]
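If you want to compare notes after doing that exercise, here is how such a check might look for λ2 and \(\mathbf{x_2}\) (the same recipe as before):
(let [a (dge 2 2 [-4 3 -6 5])]
  ;; A*x2 - lambda2*x2 should again be (close to) the zero vector
  (axpy -1 (mv a x2) (scal lambda2 x2)))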
Eigenvalues are not necessarily distinct. Consider example 2 from page 214:
(let [a (dge 3 3 [5 4 2 4 5 2 2 2 2]) ;; note: column-oriented!
      eigenvectors (dge 3 3)
      eigenvalues (ev! a nil eigenvectors)]
  [(col eigenvalues 0) eigenvectors])
'(#RealBlockVector[double, n:3, offset: 0, stride:1]
[   1.00   10.00    1.00 ]
#RealGEMatrix[double, mxn:3x3, layout:column, offset:0]
   ▥       ↓       ↓       ↓       ┓
   →      -0.75    0.67   -0.03
   →       0.60    0.67   -0.42
   →       0.30    0.33    0.91
   ┗                               ┛
)
\(\lambda = 1\) appears two times: it is an eigenvalue of multiplicity 2, and the dimension of its corresponding eigenspace is also 2. As an exercise, you might check whether any linear combination of these two eigenvectors (column 0 and column 2 of the result matrix) is indeed an eigenvector (it should be!).
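If you would like to see one concrete instance of that exercise worked out, here is a sketch that takes an arbitrary linear combination of columns 0 and 2 (the coefficients 2 and -3 are my random picks) and plugs it into the defining equation:
(let [a (dge 3 3 [5 4 2 4 5 2 2 2 2])
      evec (dge 3 3)
      evals (ev! (copy a) nil evec) ;; copy, since ev! destroys its input and we need a below
      y (axpy 2 (col evec 0) (scal -3 (col evec 2)))] ;; combination of the two eigenvectors for lambda = 1
  ;; A*y - 1*y should be the zero vector
  (axpy -1 (mv a y) y))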
You might also wonder why those eigenvalues are in the first column of a result matrix. Eigenvalues are, in general, complex numbers. That matrix has \(m\times{2}\) dimensions: the first column contains the real parts, and the second the imaginary parts. These examples from the textbook are well designed to be easily computed by hand, so I knew that the imaginary parts were zero, and didn't bother to clutter the introductory code. In general, I guess that eigenvalues will have an imaginary component more often than not (IANM).
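If you want to convince yourself that the imaginary parts really are zero in this example, just look at the second column of that eigenvalue matrix. A small sketch:
(let [a (dge 3 3 [5 4 2 4 5 2 2 2 2])
      eigenvalues (ev! a nil (dge 3 3))]
  ;; column 0 holds the real parts, column 1 the imaginary parts
  [(col eigenvalues 0) (col eigenvalues 1)])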
Diagonalization of Matrices
Consider square matrices \(A\) and \(B\) of the same size, and suppose there exists an invertible matrix \(C\), such that \(B = C^{-1}AC\). Then, we say that \(B\) is similar to \(A\), and the transformation from \(A\) to \(B\) is called a similarity transformation.
Let's compute an example in Clojure. We have matrices a
and c
, and would like to compute the similar
matrix b
. The first thing to ask is: how do we compute the inverse of c
? Inverting a matrix is
rather numerically intensive, so there exist several "shortcuts" to that.
For example, one approach is to first factorize the matrix into a product of two triangular matrices, \(L\) and \(U\). That is easier than computing the full inverse. Then, we can reuse these factors for different tasks that would naively ask for the inverse, but where the answer can be computed using only \(L\) and \(U\). Since \(L\) and \(U\) are lower and upper triangular (or trapezoidal), it is easier to compute their inverses or work with their diagonals. Or, if we still need the inverse of the whole matrix, we can compute it from the \(LU\) factors.
Relevant functions can be found in the uncomplicate.neanderthal.linalg
namespace.
Function trf! does the LU factorization ("tr" is for triangular, and "f" is for factorization), and tri! computes the inverse from the previously factorized input (here produced by the non-destructive trf). Use tri! only when you absolutely need an inverse, and avoid it when the LU (or QR, etc.) factors are enough.
(let [a (dge 2 2 [7 3 -10 -4])
      c (dge 2 2 [2 1 5 3])
      inv-c (tri! (trf c))] ;; we can not afford to destroy c!
  (mm inv-c (mm a c)))
#RealGEMatrix[double, mxn:2x2, layout:column, offset:0]
   ▥       ↓       ↓       ┓
   →       2.00    0.00
   →       0.00    1.00
   ┗                       ┛
In this case, \(B\) is diagonal, which is a nice property, since many computations are easier with special cases such as diagonal or symmetric matrices than with any random matrix.
A matrix is diagonalizable if there exists a similar matrix that is diagonal.
What has this to do with eigenvalues? It turns out that similar matrices have the same eigenvalues, and eigenvectors can tell us whether the matrix is diagonalizable, and if it is, help us discover the transformation.
Specifically, there is a theorem that states that, for an \(n\times{n}\) matrix \(A\):
- If \(A\) has \(n\) linearly independent eigenvectors, it is diagonalizable, and vice versa.
- The matrix \(C\), whose columns are the \(n\) linearly independent eigenvectors, can be used in a similarity transformation to give a diagonal matrix \(D\).
- The diagonal elements of \(D\) will be the eigenvalues of \(A\).
Huh, nice! So, we do not need to compute the inverse, or multiply matrices. We just do the eigen decomposition, and find \(D\), or find that \(D\) does not exist!
Demonstration in Clojure:
(let [a (dge 2 2 [-4 3 -6 5])
      evec (dge 2 2)
      eval (ev! (copy a) nil evec)
      inv-evec (tri! (trf evec))]
  (def diag (col eval 0))
  (def d (mm inv-evec (mm a evec))))
I have computed the eigenvalues and have found the diagonal of \(D\):
diag
#RealBlockVector[double, n:2, offset: 0, stride:1] [ -1.00 2.00 ]
I have also computed \(D\) in a pedestrian way, using the matrix inverse, to check (and demonstrate) whether I should believe what the textbook tells me. Indeed, the textbook is right!
d
#RealGEMatrix[double, mxn:2x2, layout:column, offset:0]
   ▥       ↓       ↓       ┓
   →      -1.00    0.00
   →       0.00    2.00
   ┗                       ┛
Working with \(D\) is easier than working with \(A\). For example, instead of computing \(A^k\), we can use the identity \(A^k = CD^kC^{-1}\), knowing that \(D^k\) can be computed easily by just taking powers of its diagonal elements.
I'll use Neanderthal's dia function to access the diagonal of \(D\), and Fluokitten's fmap! function to apply the pow function to each of its elements (fmap! is similar to map, just more efficient, and polymorphic, so it supports vectors, matrices, and many other structures).
(require '[uncomplicate.fluokitten.core :refer [fmap!]])
(require '[uncomplicate.neanderthal.math :refer [pow]])
(def d9 (copy d))
(fmap! (pow 9) (dia d9))
d9
#RealBlockVector[double, n:2, offset: 0, stride:3]
[  -1.00  512.00 ]
#RealGEMatrix[double, mxn:2x2, layout:column, offset:0]
   ▥       ↓       ↓       ┓
   →      -1.00    0.00
   →       0.00  512.00
   ┗                       ┛
I now leave as an exercise to the reader to compute \(CD^kC^{-1}\). (It seems I'm getting good at math writing; whenever you get bored of doing these step-by-step instructions, offload some work to the reader!)
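If you would like to compare your result with mine, here is one possible sketch. It reuses the d9 we just computed, re-derives the eigenvector matrix \(C\) with ev!, and puts \(CD^9C^{-1}\) next to \(A^9\) computed the pedestrian way (nine matrix multiplications):
(let [a (dge 2 2 [-4 3 -6 5])
      c (dge 2 2)]
  (ev! (copy a) nil c) ;; fills c with the eigenvectors
  (let [inv-c (tri! (trf c))
        a9 (reduce mm (repeat 9 a))] ;; A^9 by brute force, for comparison
    ;; both matrices should come out (approximately) equal
    [(mm c (mm d9 inv-c)) a9]))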
There is a theorem that claims that symmetric matrices are diagonalizable. That's also something that you might demonstrate as an exercise. There is an even more significant property of symmetric matrices: they are orthogonally diagonalizable, meaning that \(D = C^tAC\) (since, by definition, \(C^{-1} = C^t\) for orthogonal matrices). Of course, \(C^t\) is much easier to compute than \(C^{-1}\): Neanderthal's trans function will do that in \(O(1)\), that is, instantly.
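Here is a sketch of that, with a small symmetric matrix of my own choosing (distinct eigenvalues, to keep the eigenvectors orthogonal even through the general-purpose ev!); note that trans lives in uncomplicate.neanderthal.core, which we haven't referred above:
(require '[uncomplicate.neanderthal.core :refer [trans]])
(let [a (dge 2 2 [2 1 1 2]) ;; symmetric, with eigenvalues 1 and 3
      c (dge 2 2)]
  (ev! (copy a) nil c) ;; fills c with (orthonormal) eigenvectors
  ;; C^t * A * C should be diagonal, with the eigenvalues on the diagonal
  (mm (trans c) (mm a c)))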
Conclusions
Here is an illuminating bit of trivia that will help you associate eigenvalues with their usefulness. Eigen is not a mathematician; the term eigen is adopted from German, where it means "own", "special", "inherent". Eigenvalues and eigenvectors are thus the special values and vectors of a matrix, its own inherent properties.
You can continue with the next post in the series: Matrix Transformations. In the meantime, you might be interested in checking out some Neanderthal Tutorials.