# Neanderthal 0.9.0 released - Clojure's high-performance computing story is getting into shape

March 31, 2017

Today, I have the pleasure of announcing the new release of Neanderthal, the high-performance Clojure matrix library. It brings some major improvements and paves the way for even better stuff that will follow in subsequent versions. If you are not sure why you would even need such a library in Clojure, I recommend this EuroClojure talk (slides). Let's explore the major changes (and also see some code :)

## Streamlined installation

Neanderthal is written in Clojure, so you can explore its source code, improve it, or mold it to your needs in your favorite programming language. However, that Clojure code must rely on natively optimized low-level operations when it comes to the linear algebra computations standardized as BLAS and LAPACK. Such native code is a couple of orders of magnitude faster than what is possible with Java, and it is also really hard to implement. Practically every competitive library in this area in Python, R, and even C/C++ takes the approach of calling those low-level libraries; only the Java world is reluctant. If you are still not convinced, I recommend my EuroClojure talk again.

In the earlier versions of Neanderthal, I used the ATLAS BLAS & LAPACK library for such operations. It is fantastic: fast, and with a free open-source license. The only trouble (for some Linux users and many Windows users) was that it had to be compiled and tuned for your machine. On Windows, that was a really tricky task. On Linux, it was pretty well documented and straightforward, but the problem is that many Clojure programmers have zero experience with C and its fairly messy build story. Only on OS X did it work out of the box, without installing anything, since OS X comes with its own BLAS framework.

From version 0.9.0, I switched to Intel's Math Kernel Library (MKL) instead of ATLAS. MKL does not require compiling or tuning; it is a drop-in installation. The trade-off is that MKL is not open-source, but on the other hand it is free as in free beer. On yet another hand, MKL is the fastest thing around, and it also supports many features that are missing from its open-source cousins.

## Even faster!

The switch to MKL came with another benefit: MKL is even faster than ATLAS. Don't get me wrong, ATLAS is quite fast. MKL is just a bit faster in some cases, but is considerably faster in others. In almost all benchmarks I could find on the Web, MKL was the fastest kid around, or almost as fast as the winner for the particular operation, so I guess I can say that it's the fastest offering around. This improvement came for free, so let's embrace it :)

To see what you can expect from this, see the benchmarks. I have done many more measurements, and Neanderthal is consistently fast across operations, but matrix multiplication is the most representative overall.

Of course, Neanderthal's GPU engine is still here, and it is much faster than the CPU (even with MKL) if you have really large structures.

## Factorizations and solvers

The new Neanderthal finally comes with those higher-level operations that usually fall under the LAPACK umbrella! I waited until I had stabilized the core architecture properly and ensured that both the CPU and GPU engines are straightforward and fit well with a seamless Clojure API. Since I think that Neanderthal is well-shaped in that regard now, I added the major LAPACK functionalities: factorizations, eigenvalues, eigenvectors, and solvers.

Here is a complete working example of solving linear equations:

First, I'll require the necessary namespaces.

    (require '[uncomplicate.neanderthal
               [native :refer [dge]]
               [linalg :refer [sv!]]])


Next, create the matrices that describe the system of equations that is going to be solved ($$A x = B$$):

    (def a (dge 3 3 [1 3 1
                     2 5 1
                     1 2 3]
                {:order :row}))
    (def b (dge 3 1 [-2
                     -5
                     6]
                {:order :row}))
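
Reading the values row by row, as the `{:order :row}` option suggests, the two matrices above correspond to the following system, with $$x_1$$, $$x_2$$, and $$x_3$$ as the unknowns. This is just the same data written out equation by equation, assuming that row-major reading:

$$
\begin{aligned}
x_1 + 3x_2 + x_3 &= -2 \\
2x_1 + 5x_2 + x_3 &= -5 \\
x_1 + 2x_2 + 3x_3 &= 6
\end{aligned}
$$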


Finally, call the solver.

    ;; sv! is destructive: after the call, b holds the solution
    (do
      (sv! a b)
      b)


We got our solution: $$x_1 = 1$$, $$x_2 = -2$$, and $$x_3 = 3$$!
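
A solution like this can be double-checked by substituting it back into the system. Here is a minimal sketch in plain Clojure that deliberately uses no Neanderthal calls; the names `a-rows`, `b-vals`, `x`, `dot`, and `residual` are illustrative only and are not part of any library. It uses $$x = (1, -2, 3)$$, which is what solving the three equations by hand gives:

```clojure
;; Plain-Clojure sanity check; nothing here comes from Neanderthal.
;; All names below are made up for this illustration.
(def a-rows [[1 3 1]
             [2 5 1]
             [1 2 3]])
(def b-vals [-2 -5 6])
(def x [1 -2 3])

;; Dot product of one row of A with the candidate solution.
(defn dot [u v]
  (reduce + (map * u v)))

;; Residual A x - b, one entry per equation.
(def residual
  (mapv (fn [row rhs] (- (dot row x) rhs)) a-rows b-vals))

residual
;; => [0 0 0]
```

A residual of all zeros means that every equation is satisfied exactly. With floating-point results from a real solver, you would compare against a small tolerance instead.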

Of course, if I only needed to solve the simplest linear systems like this one, I might have used pen and paper, or any Java library that has that feature. The value of Neanderthal is that it can solve huge systems in a reasonable amount of time.

To see what functions are available now, visit the API documentation, especially the section about the linalg namespace. Do not skip it; there's valuable info there, including links to the literature.

## What to expect in the next versions?

Neanderthal is pretty useful in its current state, but it'll get even more useful:

### CUDA integration

Neanderthal already supports GPU computing through OpenCL. This also helps when we need to build our own GPU kernels through ClojureCL. However, this leaves a bunch of CUDA-based libraries out of reach. I'll change that and enable Neanderthal to offer the same functionality based on CUDA, too. This is very important, since it will enable code based on Neanderthal to easily communicate with Deep Learning libraries, which are usually based on Nvidia's cuDNN.

### More specialized matrix formats

Neanderthal supports the most useful formats: general dense matrices and triangular matrices. More dense and packed formats are also going to be supported (symmetric, etc.), but the biggest addition will be the support for sparse matrices.

### Tensors

Last, but not least, I also decided to support tensors. I can't promise when, but it will probably be sooner than expected.

## Happy hacking!

If you find Neanderthal useful, do not forget to check out the tutorials and the other complementary Clojure libraries: ClojureCL, which brings GPU computing to Clojure, and Fluokitten, a Clojure monadic library. Both of them have been used as Neanderthal's building blocks, and they can be useful in your code.