
A Common Gotcha with Asynchronous GPU Computing
September 18, 2019
There is one important thing we have to keep in mind. GPU computing routines are asynchronous by default! In some common situations, it can surprise even an experienced developer (yes, humans are fallible).

Fast Tensors in Clojure - a Sneak Peek
August 26, 2019
There's a lot going on behind the scenes, since I am not only writing the books; at the same time I have to create the software to write about. For example, a slim and superfast deep learning library. I call it *Deep Diamond*, and here's a sneak peek of some basic things from my trusty REPL.

Do Judge a Programming Book by its Cover
July 15, 2019
I needed an idea for the covers of the book series that I'm writing, Interactive Programming for Artificial Intelligence. Meet Dog and Bird, two friends who bravely immerse themselves in challenging activities and make them effortless and fun!

Billions of Random Numbers in a Blink of an Eye
June 12, 2019
I'm happy to announce that the new release of Neanderthal can generate random vectors and matrices on the CPU and GPU out of the box!

Deep Learning from Scratch to GPU - 16 - Momentum
April 29, 2019
Today we are going to implement momentum, a ubiquitous learning optimization technique. What's more, we'll do it without any performance penalty. Find out how many lines of Clojure code it will take.

Deep Learning from Scratch to GPU - 15 - Weight Decay
April 23, 2019
In this article we explore a simple but useful technique for keeping weights from growing too big. Weight Decay is useful as a regularization technique that improves generalization, and can help with improving even the basic learning on the technical level.

Deep Learning from Scratch to GPU - 14 - Learning a Regression
April 15, 2019
A great moment has arrived. We are going to apply our neural networks implementation to a regression problem. The network is going to learn a known function, which enables us to see how well it learns, and why it doesn't do a great job. We are also going to get some hints for improvements. But, hey, it works!

Deep Learning from Scratch to GPU - 13 - Initializing Weights
April 10, 2019
As the iterative learning algorithm has to start somewhere, we have to decide how to initialize weights. Here we try a few techniques and weigh their strengths and weaknesses until we find one that is good enough.

Deep Learning from Scratch to GPU - 12 - A Simple Neural Network Training API
April 3, 2019
The stage has been set for wrapping up the simplest version of a complete neural network API, and its key part, the entry point for the /learning/ functionality: the training API.

Deep Learning from Scratch to GPU - 11 - A Simple Neural Network Inference API
March 28, 2019
The time is ripe for wrapping what we have built so far in a nice Neural Network API. After all, who would want to assemble networks by hand?

Deep Learning from Scratch to GPU - 10 - The Backward Pass (CUDA, OpenCL, Nvidia, AMD, Intel)
March 25, 2019
We complete the basic implementation of the backward pass of backpropagation and gradient descent.

Deep Learning from Scratch to GPU - 9 - The Activation and its Derivative
March 20, 2019
We implement the key part of the backward pass, the computation of the error of a layer. Along the way, we set up the infrastructure for the complete implementation of backpropagation.

Deep Learning from Scratch to GPU - 8 - The Forward Pass (CUDA, OpenCL, Nvidia, AMD, Intel)
March 13, 2019
We start implementing stochastic gradient descent and the backpropagation algorithm. Here we implement the forward pass of the training layer and run it on the CPU and GPU.

Deep Learning from Scratch to GPU - 7 - Learning and Backpropagation
March 6, 2019
At last, we have reached the point where we can take on the implementation of the learning algorithm. Here we look at the basics of backpropagation, the engine that makes deep learning possible.

Deep Learning from Scratch to GPU - 6 - CUDA and OpenCL
February 28, 2019
We generalize the network code and run it on the GPU: on an Nvidia GPU with CUDA, and on an AMD GPU with OpenCL. Even more, we mix both CUDA and OpenCL, just because we can.

Deep Learning from Scratch to GPU - 5 - Sharing Memory
February 21, 2019
Sharing and reusing memory buffers is inescapable if we want high performance. It is a sharp but powerful tool.

Deep Learning from Scratch to GPU - 4 - Increasing Performance with Batch Processing
February 18, 2019
We increase performance many times by computing a group of (vector) inputs as one (matrix) batch.

Deep Learning from Scratch to GPU - 3 - Fully Connected Inference Layers
February 14, 2019
It's time to formalize some structure of our layers into a layer type.

Deep Learning from Scratch to GPU - 2 - Bias and Activation Function
February 11, 2019
We continue building our network by adding the activation function and bias.

Deep Learning from Scratch to GPU - 1 - Representing Layers and Connections
February 6, 2019
Here we start our journey of building a deep learning library that runs on both CPU and GPU.

Deep Learning in Clojure from Scratch to GPU - Part 0 - Why Bother?
February 1, 2019
An introduction to a series of tutorials about Deep Learning in Clojure funded by Clojurists Together. Start with an empty clj file and build a fast neural network that runs on the GPU, using nothing but plain Clojure and Neanderthal. The series is a companion to the free online book Neural Networks and Deep Learning.

CUDA 10 in Clojure
November 21, 2018
CUDA 10 support has just landed in ClojureCUDA with the latest version 0.6.0.

Neanderthal vs ND4J - vol 5 - Why are native map and reduce up to 100x faster in Clojure?
November 14, 2018
In which we compare mapping and reduction functions in Clojure's Neanderthal and Nd4j.

Code Mesh LDN 2018 - Interactive GPU Programming with ClojureCUDA and ClojureCL
November 4, 2018
I'm presenting a talk at the Code Mesh LDN 2018 conference in a few days. Here are the slides.

Easy Probability and Clojure - 1 - The South Park Socks Drawer
October 31, 2018
Beginner-friendly Clojure solutions to the problems from the book 50 Challenging Problems in Probability. Problem 1, The Sock Drawer.

Programmer, Teach Yourself Foundations of ML and AI with these 6 Books
October 27, 2018
Here's what I recommend to programmers if they want to understand machine learning. Six fantastic books that you can work with in your spare time to build yourself a solid foundation for data analysis, deep learning and the likes.

Clojure Walk through the OpenCL in Action GPU Computing Book - Part 1
October 24, 2018
Since their inception, the ClojureCL tests have walked through the examples from the best introductory book for GPU computing, OpenCL in Action by Matthew Scarpino. Thanks to Nikola Milikic, we now have proper commentary for Chapter 4.

I'm ditching Slack and opening a place to foster a more personal interaction
October 22, 2018
I'm starting a dedicated, more personalized, discussion server about Clojure, machine learning, artificial intelligence, high performance computing, and related themes. Here's how to join.

Adopt a Neanderthal function as your own pet! Support my Clojure work on Patreon.
October 18, 2018
You can now adopt a function in the Uncomplicate project as a pet! Choose a tier on Patreon.

SmallFP & ClojureTRE: Interactive, Functional, GPU Accelerated Programming in Clojure
September 11, 2018
I'm presenting a talk at the SmallFP conference in Helsinki, Finland. Here are the slides.

Neanderthal vs ND4J - vol 4 - Fast Vector Broadcasting in Java, CPU and CUDA
July 25, 2018
In which we implement our own vector broadcasting function that matches the performance of Nd4j's optimized one and even surpasses it on the GPU.

Fast Function Currying in Clojure (Without Macros)
July 8, 2018
Fluokitten 0.8.0 just got released. I significantly improved the performance of existing currying functionality. Here's a walkthrough.

Map and Reduce Primitive Arrays Without Clojure Macros
July 5, 2018
Fluokitten 0.7.0 just got released. Now it supports Java arrays. A brief walkthrough.

Neanderthal vs ND4J - vol 3 - Clojure Beyond the Fast Native MKL Backend
June 29, 2018
Some free time fell into my lap, and I decided to continue helping the DL4J team at Skymind improve ND4J. This is a direct continuation of Vol 2.

Neanderthal vs ND4J - vol 2 - The Same Native MKL Backend, 1000 x Speedup
June 28, 2018
Mercenary work has been finished ahead of time, with a bit of time to spare. Time for free software! More Neanderthal vs ND4J comparison, then?

Neanderthal vs ND4J - vol 1 - Native performance, Java and CPU
June 22, 2018
Today, Java has a choice of a few well-maintained libraries that can use Intel's MKL for fast native operations. MKL is basically as fast as you can get today on the CPU. The reasoning goes that, since both libraries use MKL, they should squeeze the same amount of juice out of it. It's not quite like that.

Out of the box support for TensorFlow and PyTorch for Clojure has landed in Neanderthal
April 1, 2018
Out of the box support for TensorFlow and PyTorch for Clojure has landed in Neanderthal.

Interactive GPU Programming - Part 3 - CUDA Context Shenanigans
March 1, 2018
All communication with the GPU takes place in a context. A context is analogous to a CPU program. It sets the environment for the particular GPU we want to use in the computations. Simple on the surface, it is, unfortunately, incidentally complex due to early CUDA legacy. So what? We have to deal with that, so, instead of complaining, we'll see how to tame it...

Interactive GPU Programming - Part 2 - Hello OpenCL
February 7, 2018
This is really the same article as Part 1 - Hello CUDA, but with the focus on OpenCL, so I'll skip most of the narration and just show you the code.

Neanderthal and friends support CUDA 9, Java 9, and Clojure 1.9
January 17, 2018
Uncomplicate libraries have just got a nice update and are ready for the latest underlying platform releases. CUDA 9, Java 9, and Clojure 1.9.

Interactive GPU Programming - Part 1 - Hello CUDA
January 17, 2018
Today, fast number crunching means parallel programs that run on Graphics Processing Units (GPUs). Thanks to the recent highly publicized deep learning and artificial intelligence advances, everyone has heard about Nvidia's CUDA, an environment and set of C++ oriented tools that transform your GPU card into a sort of desktop supercomputer. Yes, that is interesting, and we want to get in! But we would also like an interactive environment that is easy to set up and easy to play with, without a loss of power or performance.

Clojure Numerics, Part 6 - More Linear Algebra Fun with Least Squares
December 27, 2017
This time, we look at a few important variants of the least squares problem: least squares with equality constraints, and generalized least squares. They show how we can solve problems where some norm has to be minimized while additional constraints are satisfied. A very cool and useful thing to know.

Clojure Numerics, Part 5 - Orthogonalization and Least Squares
October 17, 2017
How to solve linear systems that have many solutions, or those that have no solutions at all? That's the theme for a thick math textbook, of course, but from the programmer's point of view, we are interested in the practical matters, so I'll stick to the main point. When I have fewer equations than there are unknowns, or I have too many equations, what can I do in Clojure to make those unknowns known? Which functions do I call?

Clojure Numerics, Part 4 - Singular Value Decomposition (SVD)
October 4, 2017
Today's article is a short one. Not because Singular Value Decomposition (SVD) is not important, but because it is so ubiquitous that we'll touch it in other articles where appropriate. The goal of this article is to give an overview and point to Neanderthal's functions that work with it. When you see SVD in the literature you read, you'll know where to look for the implementation; that's the idea.

What's nice about Clojure numerical computing with new Neanderthal 0.16.0
September 18, 2017
I've spent some quality time with my Emacs sipping some CIDER, and it is a good moment to introduce release 0.16.0 of Neanderthal, the linear algebra and numerical computing library for Clojure. The time spent over the summer refactoring the foundations for 0.15.0 is paying dividends now. It has been much easier for me to add many new features and polish the old ones. And the best news is that I expect this to keep paying off in the upcoming releases.

Clojure Numerics, Part 3 - Special Linear Systems and Cholesky Factorization
September 18, 2017
In the last article we learned to solve general linear systems, assuming that the matrix of coefficients is square, dense, and unstructured. We have also seen how computing the solution is much faster and easier when we know that the matrix is triangular. These are pretty general assumptions, so we are able to solve any well-defined system. We now explore how additional knowledge about the system can be applied to make it faster. The properties that we are looking for are symmetry, definiteness, and bandedness.

Clojure Numerics, Part 2 - General Linear Systems and LU Factorization
September 7, 2017
Solving systems of linear equations is a staple food of linear algebra. It can be applied as a part of many machine learning tasks, although it is not always easy to spot the opportunity. Here, we explore how triangular systems are the foundation that we need to internalize well. We concentrate on computational details, and transformations of general systems to triangular systems. Neanderthal offers many functions to help us in this quest.

Neanderthal 0.15.0 Released - Many more specialized matrix data structures in Clojure
September 1, 2017
The new version of Neanderthal, the fast Clojure one-stop shop for linear algebra and matrix computations at top speed on Intel and AMD CPUs and on both Nvidia and AMD GPUs, has just been released to Clojars.

Setting up Neanderthal - High Performance Clojure Easy as 1,2,3
August 18, 2017
This is a flash post, written after having a sudden thought while reading a book totally unrelated to Clojure. How hard (or easy) is it to start with Neanderthal? This is only a rhetorical question, since I am not interested in _your_ answer. I'm interested in your opinion on another matter. Here it is.

Clojure Numerics, Part 1 - Use Matrices Efficiently
June 26, 2017
It's time to step back from theory and look a bit more into implementation details of all that matrix stuff. In this post, I give an overview of data structures that I use to represent matrices in Clojure, methods to manipulate them, and a few tips and tricks that you can use to make your code fast.

Clojure Linear Algebra Refresher (4) - Linear Transformations
June 15, 2017
Now that we have acquainted ourselves with matrix transformations, the next obvious step is to generalize our knowledge with linear transformations.

Clojure Linear Algebra Refresher (3) - Matrix Transformations
June 13, 2017
A Clojure programmer will immediately feel at home with linear transformations, since functions are also transformations. Linear transformations preserve the mathematical structure of a vector space. Just like functions define transformations, matrices define a kind of linear transformations: matrix transformations. I often see programmers using matrices and vectors as dumb data structures and writing their own loops, or accessing elements one by one in a haphazard fashion. Matrices are useful data structures, but using them as transformations is what really gives them power. This is something that is very well understood in computer graphics, but is often neglected in other areas.

Clojure Linear Algebra Refresher (2) - Eigenvalues and Eigenvectors
June 6, 2017
I stumble upon eigenvectors and eigenvalues regularly when reading about machine learning topics. They seem to be important. They are not introduced in those texts, though, so you might be scratching your head wondering how the authors got the idea to use them in the first place, and why. Luckily, they are introduced soon enough in linear algebra textbooks, and they are available at high speed in Clojure via Neanderthal, so we should be able to at least try them out easily, and then build upon that knowledge fearlessly when we encounter them in the wild.

Clojure Linear Algebra Refresher (1) - Vector Spaces
June 3, 2017
This is the first article in a series that is going to briefly skim through a good engineering textbook on linear algebra, making notes that help you relate that material to the Clojure code.

CUDA and cuBLAS GPU matrices in Clojure
May 20, 2017
The new release of Neanderthal is here! The highlight of 0.11.0 is the new CUDA/cuBLAS based engine. The high-performance Clojure matrix library now supports all 3 major choices that you'd want to crunch those billions of numbers with: CPU, CUDA GPU on Nvidia, and OpenCL GPU on AMD, or other accelerators. Let's see why this new stuff is important (and it really is!).

Neanderthal 0.9.0 released - Clojure's high-performance computing story is getting into shape
March 31, 2017
Today, I have the pleasure to announce the new release of Neanderthal, the high-performance Clojure matrix library, which brings some important major improvements, and paves the way for even better stuff that will follow in subsequent versions. Let's explore major stuff (and also see some code :)

Neanderthal 0.9.0 with major improvements is around the corner
March 22, 2017
The upcoming version of Neanderthal, the library for vector and matrix computations in Clojure, brings improvements to Clojure's numerical computing capabilities that are so important that I think there's no harm in talking about them in advance. If you like what you're about to read, you won't have to wait for long, since the implementation is mostly done.

Bayadera = Bayes + Clojure + GPU (slides for my Bob Konferenz 2017 talk)
February 20, 2017
I'll give a talk about Bayadera, Clojure, and Bayes (+ GPUs) in Berlin on February 24. Here are the slides!

Clojure is not afraid of the GPU (video of my EuroClojure2016 talk)
November 18, 2016
The videos for EuroClojure 2016 talks are finally available on YouTube. Here is mine.

Clojure is not afraid of the GPU (slides for my EuroClojure2016 talk)
October 20, 2016
Is there a better way to open the new blog than with a post about my upcoming talk at EuroClojure2016? I decided to put the slides online in advance, so the attendees can relax and enjoy the talk, instead of scrambling to type all the notes and memos.