Clojure Linear Algebra Refresher (4) - Linear Transformations
June 15, 2017
Now that we got ourselves acquainted with matrix transformations, the next obvious step is to generalize our knowledge with linear transformations. This post covers a bit more theoretical ground, so I will refer you to look things up in your linear algebra textbook more often than in the Matrix Transformations post. There will also be less code, since the computations that we'd need are similar to what I've already shown you. That's a good thing, though; it indicates that we have covered the most important things already, and just need more understanding of math theory and applications, and a finer grasp of the details of high-performance numerical computations.
Before I continue, a few reminders:
- These articles are not stand-alone. You should follow along with a linear algebra textbook. I recommend Linear Algebra With Applications, Alternate Edition by Gareth Williams (see more in part 1 of this series).
- The intention is to connect the dots from a math textbook to Clojure code, rather than explain math theory or teach you basics of Clojure.
- Please read Clojure Linear Algebra Refresher (3) - Matrix Transformations first.
- Include Neanderthal library in your project to be able to use ready-made high-performance linear algebra functions.
This text covers the second part of chapter 6 of the alternate edition of the textbook.
Linear Transformations
Recall from the Vector Spaces post that a vector space has two operations: addition and scalar multiplication. Consider these matrix transformations: \(T(\mathbf{u} + \mathbf{v}) = A(\mathbf{u} + \mathbf{v}) = A\mathbf{u}+A\mathbf{v} = T(\mathbf{u})+T(\mathbf{v})\), and \(T(c\mathbf{u}) = A(c\mathbf{u}) = cA\mathbf{u} = cT(\mathbf{u})\). From this, it's easy to understand the textbook definition:
Let \(\mathbf{u}\text{ and }\mathbf{v}\) be vectors in \(R^n\) and \(c\) be a scalar. A transformation \(T:R^n\rightarrow{R^m}\) is a linear transformation if \(T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u})+T(\mathbf{v})\) and \(T(c\mathbf{u}) = cT(\mathbf{u})\).
These properties tell us that all linear transformations preserve addition and scalar multiplication. Every matrix transformation is linear, but translations and affine transformations are not (I'll show you later that there is a workaround).
We can refer to a transformation where domain and codomain are the same (such as \(R^n\rightarrow{R^n}\)) as an operator.
In the previous post, I (and the textbook author) used ad hoc ways of arriving at matrices that describe transformations such as dilations, rotations, and reflections. Now, we will learn a formal method.
Let's start with an example, but first require the namespaces that we're going to need:
(require '[uncomplicate.neanderthal
           [native :refer [dv dge]]
           [core :refer [mv mm col axpy copy]]
           [linalg :refer [ev! sv! trf tri]]
           [math :refer [cos sin pi]]])
Example 3 from page 259 of the textbook asks us to find a matrix that describes the following transformation: \(T\left(\begin{bmatrix}x\\y\\\end{bmatrix}\right)=\begin{bmatrix}2x+y\\3y\end{bmatrix}\).
Here's what we'll do in Clojure: first we find the effect of \(T\) on the standard basis of the domain \(R^n\), and then form a matrix whose columns are those images. Easy!
(let [t! (fn [v]
           (v 0 (+ (* 2 (v 0)) (v 1)))
           (v 1 (* 3 (v 1))))
      a (dge 2 2 [1 0 0 1])]
  (t! (col a 0))
  (t! (col a 1))
  a)
#RealGEMatrix[double, mxn:2x2, layout:column, offset:0]
   ▥       ↓       ↓       ┓
   →       2.00    1.00
   →       0.00    3.00
   ┗                       ┛
Excuse the imperative code in the function t!, but we've found the matrix that represents the linear transformation. Please keep in mind that vectors and matrices are functions that can retrieve or update values at a specific index, but that these functions are usually boxed. In tight loops, you'd want to use entry and entry! from the uncomplicate.neanderthal.real namespace, or some other operation optimized for performance.
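For illustration, here's a sketch of the same transformation written with entry and entry! (my own rewrite, not from the book; the extra require is needed since that namespace wasn't referred above):

(require '[uncomplicate.neanderthal.real :refer [entry entry!]])

;; The same destructive transformation as t!, but using the
;; primitive, unboxed accessors entry and entry!.
(defn primitive-t! [v]
  (entry! v 0 (+ (* 2 (entry v 0)) (entry v 1)))
  (entry! v 1 (* 3 (entry v 1))))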
Let \(T\) be a linear transformation on \(R^n\), \(\left\{\mathbf{e_1}, \mathbf{e_2}, \dots , \mathbf{e_n}\right\}\) the standard basis (see the Vector Spaces post), and \(\mathbf{u}\) an arbitrary vector in \(R^n\). \(\mathbf{e_1}=\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix}\), \(\mathbf{e_2}=\begin{bmatrix}0\\1\\\vdots\\0\end{bmatrix}\), …, \(\mathbf{e_n}=\begin{bmatrix}0\\0\\\vdots\\1\end{bmatrix}\), and \(\mathbf{u}=\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\).
We can express \(\mathbf{u}\) as a linear combination of \({\mathbf{e_1}, \mathbf{e_2}, \dots , \mathbf{e_n}}\): \(\mathbf{u} = c_1\mathbf{e_1}+c_2\mathbf{e_2}+\dots+c_n\mathbf{e_n}\). If we substitute that into \(T(\mathbf{u})\), and apply the properties of linear transformations (look this up in the textbook, or do your own pen and paper exercise), we find that \(T(\mathbf{u}) = c_1T(\mathbf{e_1})+c_2T(\mathbf{e_2})+\dots+c_nT(\mathbf{e_n}) = A\mathbf{u}\), where \(A=[T(\mathbf{e_1})\cdots T(\mathbf{e_n})]\). \(A\) is called the standard matrix of \(T\).
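To convince ourselves that the standard matrix we found above really represents \(T\), we can apply it to an arbitrary vector, say \((2, 3)\) (my own pick, not from the book), and compare with the formula:

;; T((2,3)) should be (2*2+3, 3*3) = (7, 9)
(mv (dge 2 2 [2 0 1 3]) (dv 2 3))

Both give \((7, 9)\).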
The previous definition of linear transformations in \(R^n\) can be extended to any vector space (not only \(R^n\)).
Homogeneous coordinates
Translation and affine transformation (see the previous post) are not linear transformations and cannot be composed, yet they are very useful operations. Programmers in computer graphics understand the value of linear algebra and utilize it well. We can see an example of how the problem of composing translations and affine transformations has been overcome in that field, and keep that in mind when we develop solutions that use more than 3 dimensions.
It turns out that homogeneous coordinates can describe points in a plane in such a way that matrix multiplication can be used for translations. In homogeneous coordinates, a third component of 1 is added to each coordinate, and the transformations are defined by the following matrices:
- point \(\begin{bmatrix}x\\y\\1\end{bmatrix}\)
- rotation \(A=\begin{bmatrix}\cos\theta&-\sin\theta&0\\\sin\theta&\cos\theta&0\\0&0&1\end{bmatrix}\)
- reflection \(B=\begin{bmatrix}1&0&0\\0&-1&0\\0&0&1\end{bmatrix}\)
- dilation/contraction \(C=\begin{bmatrix}r&0&0\\0&r&0\\0&0&1\end{bmatrix}\), \(r>0\)
- translation \(E=\begin{bmatrix}1&0&h\\0&1&k\\0&0&1\end{bmatrix}\)
If we, for example, wanted to do a dilation followed by a translation and then a rotation, we would use the matrix \(AEC\): the transformations compose from right to left, so \(C\) applies first, then \(E\), then \(A\).
The textbook I recommended offers us a practical example (example 7, page 264): there is a triangle that we would like to rotate, not about the origin, but about a point \(P(h,k)\). The solution is to translate \(P\) to the origin, rotate, and then translate \(P\) back. In the example, I implement the rotation through an angle \(\pi/2\) about the point \(P(5,4)\).
(let [t1 (fn [p]
           (dge 3 3 [1 0 0
                     0 1 0
                     (- (p 0)) (- (p 1)) 1]))
      r (fn [theta]
          (let [c (cos theta)
                s (sin theta)]
            (dge 3 3 [c s 0
                      (- s) c 0
                      0 0 1])))
      t2 (fn [p]
           (dge 3 3 [1 0 0
                     0 1 0
                     (p 0) (p 1) 1]))
      t2rt1 (fn [p theta]
              (mm (t2 p) (r theta) (t1 p)))]
  (t2rt1 (dv 5 4) (/ pi 2)))
#RealGEMatrix[double, mxn:3x3, layout:column, offset:0]
   ▥       ↓       ↓       ↓       ┓
   →       0.00   -1.00    9.00
   →       1.00    0.00   -1.00
   →       0.00    0.00    1.00
   ┗                               ┛
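Now any point can be rotated about \(P\) with a single matrix-vector multiplication. As a quick sketch (the point \((1, 1)\) is my own pick, not from the book):

;; Rotate the point (1,1) through pi/2 about P(5,4), using the matrix
;; we just computed; in homogeneous coordinates the third component is 1.
(mv (dge 3 3 [0 1 0
              -1 0 0
              9 -1 1])
    (dv 1 1 1))

The result, \((8, 0, 1)\), represents the point \((8, 0)\).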
Kernel, Range, and the Rank/Nullity Theorem
Besides domain and codomain, linear transformations are associated with two other important vector spaces: kernel and range.
For a transformation \(T:U\rightarrow{V}\):
- kernel, \(ker(T)\), is the set of vectors in \(U\) that are mapped into the zero vector of \(V\).
- range is the set of vectors in \(V\) that are the images of vectors in \(U\).
This section doesn't contain anything new that I could show you in Clojure code, but it is important for grasping the theory. Therefore, I'll only mention the important stuff, and direct you to the textbook.
Example 2 (page 273) demonstrates how to find the kernel by solving a system of equations \(T(\mathbf{x}) = A\mathbf{x} = \mathbf{0}\). The first thing that we could do is try sv!, the function for solving linear systems of equations. That won't help us much; since our system is homogeneous, sv! can only give us the trivial solution \(\mathbf{x}=\mathbf{0}\). Fortunately, in this example, \(A\) is a square matrix, so we can exploit the equality \(A\mathbf{x}=\lambda\mathbf{x}\) between eigenvectors \(\mathbf{x}\) and the eigenvalue \(\lambda=0\), if such \(\lambda\) exists (since this is a textbook example, I'd be surprised if it didn't). See Clojure Linear Algebra Refresher (2) - Eigenvalues and Eigenvectors if you need to refresh your Eigen-fu.
(let [a (dge 3 3 [1 0 1
                  2 -1 1
                  3 1 4])
      eigenvectors (dge 3 3)]
  (def lambdas (ev! a nil eigenvectors))
  (def eigen eigenvectors)
  eigenvectors)
#RealGEMatrix[double, mxn:3x3, layout:column, offset:0]
   ▥       ↓       ↓       ↓       ┓
   →      -0.64   -0.96    0.71
   →      -0.13    0.19   -0.71
   →      -0.76    0.19    0.00
   ┗                               ┛
Let's see whether any of these eigenvectors is our solution.
lambdas
#RealGEMatrix[double, mxn:3x2, layout:column, offset:0]
   ▥       ↓       ↓       ┓
   →       5.00    0.00
   →      -0.00    0.00
   →      -1.00    0.00
   ┗                       ┛
The second eigenvalue is \(0\), so the second column, \((-0.96, 0.19, 0.19)\), is the normalized basis of the one-dimensional subspace of \(R^3\) that is the kernel we were looking for.
(seq (col eigen 1))
(-0.9622504486493764 0.1924500897298754 0.1924500897298751)
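If you'd like a sanity check (my own addition): a kernel vector must be mapped to the zero vector, so multiplying a fresh copy of \(A\) by it should give zero, up to rounding. Note that ev! destroyed the original matrix, so we have to construct it again:

(let [a (dge 3 3 [1 0 1
                  2 -1 1
                  3 1 4])]
  ;; Should be (numerically) the zero vector.
  (mv a (col eigen 1)))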
For a linear transformation \(T\), \(dim\, ker(T) + dim\, range(T) = dim\, domain(T)\). Also, \(dim\, range(T) = rank(A)\). The kernel of \(T\) is also called the null space. \(dim\, ker(T)\) is called nullity, and \(dim\, range(T)\) is the rank of the transformation. The previous equality is referred to as the rank/nullity theorem. In our example, the kernel is one-dimensional and \(A\) has rank 2, so \(1 + 2 = 3 = dim\, R^3\), just as the theorem demands.
Systems of Linear Equations
As we've already seen in the previous section, there is a close connection between linear transformations and systems of linear equations. System \(A\mathbf{x}=\mathbf{y}\) can be written as \(T(\mathbf{x})=\mathbf{y}\).
Note that the solution of the homogeneous system \(A\mathbf{x}=\mathbf{0}\) is the kernel of the transformation, as we've already shown.
There is an interesting relation between an element of the kernel \(\mathbf{z}\), a particular solution \(\mathbf{x_1}\) of the nonhomogeneous system \(A\mathbf{x}=\mathbf{y}\), and every other solution \(\mathbf{x}\) of that system: \(\mathbf{x}=\mathbf{z}+\mathbf{x_1}\)!
Look the details up in the textbook, while I show you example 2 (page 286) in Clojure. We can use the sv! function to solve this non-homogeneous system.
(let [a (dge 3 3 [1 0 1
                  1 -1 1
                  3 1 4])
      b (dge 3 1 [11 -2 9])]
  (sv! a b)
  (def a-solution (col b 0))
  a-solution)
#RealBlockVector[double, n:3, offset: 0, stride:1]
[  17.00   -0.00   -2.00 ]
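As a sanity check (my own addition): sv! overwrites both of its arguments, so let's rebuild the matrix and multiply it by the solution; we should get the right-hand side \((11, -2, 9)\) back:

(let [a (dge 3 3 [1 0 1
                  1 -1 1
                  3 1 4])]
  ;; Should reproduce b, that is (11, -2, 9).
  (mv a a-solution))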
Now, knowing the kernel from the previous section, and having this particular solution, we can construct all solutions to this system:
(let [z (col eigen 1)
      x1 a-solution]
  (defn y [r]
    (axpy r z x1)))

(y 1.33)
#'user2/y
#RealBlockVector[double, n:3, offset: 0, stride:1]
[  15.72    0.26   -1.74 ]
As an exercise, you can check whether those vectors are really solutions (and inform me if they're not).
Neanderthal's sv! doesn't solve only one system at a time; you can give it a bunch of systems, and it solves them all in one sweep:
(let [a (dge 3 3 [1 0 1
                  1 -1 1
                  3 1 4])
      b (dge 3 4 [11 -2 9
                  3 -2 1
                  4 5 -6
                  -1 0 1])]
  (sv! a b))
#RealGEMatrix[double, mxn:3x4, layout:column, offset:0]
   ▥       ↓       ↓       ↓       ↓       ┓
   →      17.00    9.00   49.00   -9.00
   →      -0.00   -0.00  -15.00    2.00
   →      -2.00   -2.00  -10.00    2.00
   ┗                                       ┛
I haven't killed seven in one blow, like the Brave Little Tailor, only four. But I could have, easily, even more!
One-to-One and Inverse Transformations
This section covers only the theory, so I'll just summarize the results, and direct you to the textbook for details.
A transformation is one-to-one if each element in the range is the image of exactly one element in the domain.
A transformation \(T\) is invertible if there is a transformation \(S\) such that \(S(T(\mathbf{u}))=\mathbf{u}\) and \(T(S(\mathbf{u}))=\mathbf{u}\) for every \(\mathbf{u}\).
The following statements are equivalent:
- \(T\) is invertible.
- \(T\) is nonsingular (\(\det(A)\neq 0\)).
- \(T\) is one-to-one.
- \(ker(T)=\{\mathbf{0}\}\).
- \(ran(T)=R^n\).
- \(T^{-1}\) is linear.
- \(T^{-1}\) is defined by \(A^{-1}\).
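As a small illustration of the last two statements (a sketch using the standard matrix \(\begin{bmatrix}2&1\\0&3\end{bmatrix}\) of \(T\) from the first example, not an example from this section of the book), trf and tri, which we've already required, give us the matrix of \(T^{-1}\):

;; tri computes the inverse from the LU factorization produced by trf.
;; If the matrix were singular, trf/tri would complain, so success here
;; also tells us that this T is invertible and one-to-one.
(tri (trf (dge 2 2 [2 0 1 3])))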
Coordinate Vectors
Let \(U\) be a vector space, \(B = \left\{\mathbf{u_1}, \mathbf{u_2}, \cdots, \mathbf{u_n}\right\}\) a basis, \(\mathbf{u}\) a vector in \(U\), and \(a_1, a_2, \cdots, a_n\) scalars such that \(\mathbf{u} = a_1\mathbf{u_1} + a_2\mathbf{u_2} + \cdots + a_n\mathbf{u_n}\). The column vector \(\begin{bmatrix}a_1\\a_2\\\vdots\\a_n\end{bmatrix}\) is called the coordinate vector of \(\mathbf{u}\) relative to \(B\), and \(a_1, a_2, \cdots, a_n\) are called the coordinates of \(\mathbf{u}\).
Column vectors are more convenient to work with when developing the theory. In Clojure code, no distinction is made. The same vector can be used either as a column or as a row vector.
The standard bases are the most convenient bases to work with. To express a vector in coordinates relative to another basis, we need to solve a system of linear equations. One special case is an orthonormal basis. In that case, the coordinate vector is \(\mathbf{v}_B = \begin{bmatrix}\mathbf{v}\cdot\mathbf{u_1}\\\mathbf{v}\cdot\mathbf{u_2}\\\vdots\\\mathbf{v}\cdot\mathbf{u_n}\end{bmatrix}\).
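For instance (a minimal sketch with an orthonormal basis of \(R^2\) that I made up; dot lives in uncomplicate.neanderthal.core and wasn't in our require form above):

(require '[uncomplicate.neanderthal.core :refer [dot]])

;; u1 and u2 have unit length and are mutually orthogonal, so the
;; coordinates of v relative to them are just the dot products.
(let [s (/ 1 (Math/sqrt 2))
      u1 (dv s s)
      u2 (dv (- s) s)
      v (dv 3 1)]
  (dv (dot v u1) (dot v u2)))

The result, \((2.83, -1.41)\), indeed reconstructs \((3, 1)\) as \(2.83\mathbf{u_1} - 1.41\mathbf{u_2}\).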
Change of Basis
In general, if \(B\) and \(B'\) are bases of \(U\), and \(\mathbf{u}\) is a vector in \(U\) having coordinate vectors \(\mathbf{u_B}\) and \(\mathbf{u_{B'}}\) relative to \(B\) and \(B'\), then \(\mathbf{u_{B'}} = P \mathbf{u_B}\), where \(P\) is a transition matrix from \(B\) to \(B'\). \(P = [(\mathbf{u_1})_{B'} \cdots (\mathbf{u_n})_{B'}]\).
The change from a nonstandard basis to the standard basis is easy to do - the columns of \(P\) are simply the vectors of the first basis! Here's example 4 (page 294) from the textbook:
(let [b (dge 2 2 [1 2 3 -1])
      b' (dge 2 2 [1 0 0 1])
      p (copy b)]
  (mv p (dv 3 4)))
#RealBlockVector[double, n:2, offset: 0, stride:1]
[  15.00    2.00 ]
A more interesting case is when neither basis is standard. In that case, we can use the standard basis as an intermediate basis. Let \(B\) and \(B'\) be bases, \(S\) the standard basis, and \(P\) and \(P'\) transition matrices to the standard basis. Then, transition from \(B\) to \(B'\) can be accomplished by transition matrix \(P'^{-1}P\), and from \(B'\) to \(B\) with \(P^{-1}P'\).
Here's example 5 (page 295) from the textbook worked out in Clojure code:
(let [p (dge 2 2 [1 2 3 -1])
      p' (dge 2 2 [3 1 5 2])
      p-1 (tri (trf p))
      p'-1 (tri (trf p'))]
  (mv (mm p'-1 p) (dv 2 1)))
#RealBlockVector[double, n:2, offset: 0, stride:1]
[  -5.00    4.00 ]
That's equal to the result from the book. Nice.
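And, as a round-trip check (my own addition), the transition matrix in the opposite direction, \(P^{-1}P'\), should take \((-5, 4)\) back to \((2, 1)\):

(let [p (dge 2 2 [1 2 3 -1])
      p' (dge 2 2 [3 1 5 2])
      p-1 (tri (trf p))]
  ;; Should recover the original coordinate vector (2, 1).
  (mv (mm p-1 p') (dv -5 4)))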
It's not over
What we started in the previous post on matrix transformations, we have now wrapped up with the general concept of linear transformations. In the next post, we'll take a step back and cover a bit more basic ground - the technicalities of matrices.
I hope this was interesting and useful to you. If you have any suggestions, feel free to send feedback - this can make the next guides better.