Deep Learning in Clojure from Scratch to GPU - Part 0 - Why Bother?
February 1, 2019
In this series, you are going to implement your own deep learning mini library, first run it on the CPU, and then learn how to accelerate it on the GPU with (almost) no changes.
Motivation
Isn't that a lot of work, you might ask? Let me show you how to do it with a couple dozen lines of code! And I am not talking about calling a specialized deep learning framework; we'll implement everything from scratch. It will only support the classic features, but can we make it faster than the mega-frameworks? Let's see…
Implementing deep learning and neural networks from scratch is an excellent way to:
- Learn the principles behind deep learning.
- Learn how to implement other algorithms using vectors and matrices.
- Learn how to write simple, yet fast, number crunching software.
- Learn when and how to accelerate stuff on the GPU.
- Have some programming fun!
Why bother with this? There is TensorFlow by Google, there is PyTorch by Facebook, there is MxNet sponsored by Amazon, there are many other frameworks. Why not just learn these, and use the abundance of features they offer?
Well, Google, Facebook, and Amazon certainly know what they're doing. But they are solving different problems than I am. If I were competing to out-google Google, I would probably need to use a (better) TensorFlow. Most of the features TensorFlow offers may be a great fit for Google, but they are bloat for the challenges smaller companies face. What is a feature for these giants is often a shackle around a small team's ankles.
Additionally, you might have noticed that learning how to tame these beasts is not as easy as advertised. The things you learn here will help you understand how these big frameworks work and how to use them optimally, even if TensorFlow or MxNet turns out to be the right tool for you. So, I guess there's no downside :)
Here's how
I wrote a bit about books that programmers can use to learn about machine learning. It is difficult to find resources that discuss the implementation of deep learning tools and algorithms. For programmers, I think that a nice short free online book Neural Networks and Deep Learning by Michael Nielsen is a good starting choice.
That book explains how to implement fully connected neural networks, backed by only a minimal amount of math and complete Python code based on NumPy. This series of blog posts can be read as a companion to that (short!) book.
What is the added value of this series?
Obviously, one thing is that this is done in Clojure. But that in itself would not be that much.
What is unique here is that we will do a more thorough job with the implementation. We will:
- have less code
- use more techniques offered by matrix libraries to make a more serious solution
- do a better job in optimizing memory usage
- do a better job in optimizing performance
- make it run on the GPU
- implement some additional DL optimizations that Michael's book just mentions
- lay the groundwork for a nice Clojure-based DL library with tensors, convolutions, Intel's MKL-DNN, Nvidia's cuDNN, and all that.
Next steps
Before reading this series, it would be a good idea to do some preparation:
- Read the first chapter of the book. It's a nice (short!) introduction.
- Learn how to include Neanderthal in your Clojure projects (a sketch follows this list).
- Read a few short matrix tutorials, such as Vector Spaces, Matrix Transformations, and Use Matrices Efficiently.
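To make that second step concrete, here is a minimal sketch of a Leiningen project.clj that pulls in Neanderthal. The version numbers below are assumptions for illustration; check Neanderthal's documentation for the current release and for any native (MKL) requirements on your platform.

(defproject learn-dl "0.1.0-SNAPSHOT"
  ;; Neanderthal provides the fast vector and matrix operations used throughout this series.
  :dependencies [[org.clojure/clojure "1.10.0"]
                 [uncomplicate/neanderthal "0.22.0"]])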
The idea is that you have a straightforward path from the higher-level explanations in Michael's free online book, to the fast implementation details that I show.
I'll give you a few days to catch up with these resources, and then I'll try to keep your interest with small and easily digestible articles that tackle implementation issues one by one. I won't hit you with too much complexity in one big blow. If you do this preparation, gradually following this series will be a piece of cake!
Table of contents
This ToC will be updated with links to the new articles.
- 1 - Representing Layers and Connections
- 2 - Bias and Activation Function
- 3 - Fully Connected Inference Layers
- 4 - Increasing Performance with Batch Processing
- 5 - Sharing Memory
- 6 - CUDA and OpenCL
- 7 - Learning and Backpropagation
- 8 - The Forward Pass
- 9 - The Activation and Its Derivative
- 10 - The Backward Pass
- 11 - A Simple Neural Network Inference API
- 12 - A Simple Neural Network Training API
- 13 - Initializing Weights
- 14 - Learning a Regression
- 15 - Weight Decay
- 16 - Momentum
Donations
Let me sneak in a quick reminder that I'm accepting donations that I hope will support the development of these cool Clojure libraries in the years to come.
If you feel that you can afford to help, and wish to donate, I even created a special "Starbucks for two" tier at patreon.com/draganrocks. Don't worry, I won't squander the donations at Starbucks. Not because I don't like a good mocha, but because there are no Starbucks shops in my country. If you feel especially generous, you can do something even cooler: adopt a pet function.
Thank you
Clojurists Together financially supported writing this series. Big thanks to all Clojurians who contribute, and thank you for reading and discussing this series.