Clojure Runs ONNX AI Models Now - Join the AI fun!
October 26, 2025
Please share this post in your communities. Without your help, it will stay buried under tons of corporate-pushed, AI- and blog-farm-generated slop, and very few people will know that this exists.
These books fund my work! Please check them out.
Hello, Clojurians! I haven't written here in a long time. Was I tired? Is anybody reading blogs anymore? Who knows. But that was not the main reason.
I've been working on several Clojure projects sponsored by the Clojurists Together Foundation. I did a ton of things, but after all that programming I was kinda tired, and kept slacking when it came to telling people about the work that was done! That's not very smart, but you know how it goes… :) But then, if we don't tell people about the awesome software that we have, nobody is going to use it, so finally I had to stop kicking this down the road, sit down, and write the first post. It's been long overdue, so expect more posts soon!
ONNX Runtime in one line of Clojure
The most recent thing I'm currently working on started its life as Clojure ML (again, superthanks to Clojurists Together for sponsoring this). I proposed to create a human-friendly Clojure API for AI/DL/ML models, and back it with a first implementation, in this case based on ONNX Runtime. Of course, it should all be integrated into existing Clojure libraries, and follow the Clojure way of doing things as much as possible!
The idea is to take an existing, pre-trained ML model previously exported to the ONNX format from whatever technology the authors chose (which in today's world is typically Python and PyTorch), and put it into production in Clojure and the JVM. It should be seamless and in-process, without any clunky interoperability, copying, translation, etc. Of course, our Clojure numerical libraries fully support GPU computing, so it goes without saying that we want that, too! Just to be clear, we do not use nor need any Python or Python interop for this; we use ONNX Runtime's underlying C library.
Nice idea, but what parts of this well-intentioned story can we evaluate in our REPLs right now? At least some promising demo? Are we on the right track? To access that AI goodness, surely we have to do a sophisticated dance? Are the steps hard to learn? Do we need to watch out for a slippery floor? Is it accessible to mere mortals?
Here's the gist:
(onnx "data/mnist-12.onnx")
"Wait, what?", you'll say. One function? One tini, tiny, function, with one laughingly trivial argument? Is that an API? What does such trivial API do? "Now you confused me!", you'll scratch your head. It's just a stick.
I hope I've also intrigued you, so please keep reading to see it in action (this post is actually generated from a live REPL session, so the example is fully executable, not just interesting bits laid out on the table with a ton of complex boilerplate swept under a Persian rug).
Hello World, the MNIST image recognition model
For this recipe, you'll need the following ingredients: Deep Diamond tensors (one cup), a Deep Diamond network (one slice), and one Neanderthal transfer! function for moving data around for demo purposes, and that's it! Oh, yes, don't forget the new onnx function. Once we load the native namespace, the right Deep Diamond engine is set up for our system (yes, even on macOS, thanks to Clojurists Together!).
(require '[uncomplicate.neanderthal.core :refer [transfer! iamax]]
         '[uncomplicate.diamond
           [tensor :refer [tensor desc]]
           [dnn :refer [network]]
           [onnxrt :refer [onnx]]]
         '[uncomplicate.diamond.native])
The ONNX model
We evaluate the onnx function, and it loads the model.
(def mnist-onnx (onnx "../../data/mnist-12.onnx"))
#'user/mnist-onnx
Sure, that's easy, but how is that useful? Well, the result is a function. This function has just been set up with ONNX internals, so now it can create Deep Diamond network layers and fit in with the rest of the Tensor-y stuff that DD already provides.
The ONNX Runtime model revolves around an environment, a session, input and output tensors, type info, and a lot of other stuff and brittle ceremony. Sure, sometimes you need to reach these internals, and diamond-onnxrt provides a clojurized internals API even for that. However, it can sing the main song itself, and set all the right arguments in the right places for you. The onnx function even supports an options map, where you can say what you'd like, and it will take care of configuring ONNX to do the right thing, but that is a story for another article.
The rest is the usual Deep Diamond stuff, which is as simple as it gets!
The MNIST dataset specifies images of hand-written digits, in just one grayscale channel, each \(28\times28\) pixels: a challenging task for the 1989 US Post Office and the technology of that time, but hello-world level stuff for today's accelerated libraries (still, keep in mind that if you tried to code even this easy example without such libraries, you'd be surprised how slow it can be!).
We create a tensor descriptor for such input (this step can be left out, but I'm being pedantic to accommodate beginners):
(def input-desc (desc [1 1 28 28] :float :nchw))
#'user/input-desc
Next, we create a reusable abstract network blueprint, which can then create concrete networks tailored for training, or optimized for inference, that is, for classifying MNIST images. Normally, we would have to train these networks, or load the parameters from somewhere, but in this case the blueprint consists only of the onnx model, which has already been trained and already knows all the right weights, so no training is needed (nor available with ONNX Runtime yet; its main job is inference in production).
(def mnist (network input-desc [mnist-onnx]))
#'user/mnist
Note that all these things so far look and behave just like ordinary Clojure objects. You can use them even outside this specific structure. Full flexibility that I hope will spark your creativity.
We'll also need a place for the actual image that we'd like to classify. This particular network, which I downloaded from the ONNX Runtime examples, specifies exactly one image at its input, to classify one at a time. Typically, if we have many images, it's better to compute them in batches, but this is just a hello-world, after all, so we won't be too demanding.
(def input-tz (tensor input-desc))
#'user/input-tz
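As an aside, here is what the batched variant mentioned above could look like, purely as a sketch of my own: the batch-desc and batch-tz names are mine, and this particular mnist-12.onnx model expects exactly one image at its input, so a descriptor like this would only apply to a model exported with a larger (or dynamic) batch dimension.
;; Hypothetical sketch: a descriptor and tensor for a batch of 32 images.
;; This particular model accepts a single image, so a batched descriptor
;; would only work with a model exported for a larger batch dimension.
(def batch-desc (desc [32 1 28 28] :float :nchw))
(def batch-tz (tensor batch-desc))
Back to our single-image hello-world.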
A blueprint (mnist in this case) is a function that can create networks optimized for inference with concrete tensors, adequate internal tensors, and parameters. The following line is the moment when the network is actually created from the abstract descriptors contained in its blueprint, to the actual engines, operation primitives, and tensors in memory.
(def classify! (mnist input-tz))
#'user/classify!
True to the Clojure philosophy, mnist is a function, which, given the specification for the desired input, (mnist input-tz), produces classify!, which is a function, too, but for actual inference! It might sound cumbersome when written out, but the code shows its elegance. No need for complex APIs. Each thing does exactly one thing, and does it in the simplest way, by just evaluating with one or two parameters!
Now we have a function that classifies images
This is how you would typically use it.
Step one: classify! is now a typical Clojure function! Evaluate it:
(classify!)
{:shape [1 10], :data-type :float, :layout [10 1]} (-0.04485602676868439 0.007791661191731691 0.06810081750154495 0.02999374084174633 -0.1264096349477768 0.14021874964237213 -0.055284902453422546 -0.04938381537795067 0.08432205021381378 -0.05454041436314583)
The result is a ten-element tensor; each element represents a score for how likely the category at its index is to be the right one. So we should just find which element contains the highest value, and that's our category, which in the MNIST example is, very conveniently, a digit from 0 to 9 equal to that index.
However, you can see that the current values are just random small numbers. This is because we never loaded any image data to the input tensor! It just classified random noise as not very likely to be an image of any digit.
We need step zero: place the image data into the network's input somehow. This could be done in many different ways (for example, by memory-mapping the image data on disk), but we'll keep it simple and just transfer it naively from an inline Clojure sequence (this is a hello-world :).
The following sequence is copied from the actual MNIST data; I just took the first image. The map in the code below scales it from the 0-255 range down to 0-1.
(transfer! (map #(float (/ % 255)) [0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 84.0 185.0 159.0 151.0 60.0 36.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 222.0 254.0 254.0 254.0 254.0 241.0 198.0 198.0 198.0 198.0 198.0 198.0 198.0 198.0 170.0 52.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 67.0 114.0 72.0 114.0 163.0 227.0 254.0 225.0 254.0 254.0 254.0 250.0 229.0 254.0 254.0 140.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 17.0 66.0 14.0 67.0 67.0 67.0 59.0 21.0 236.0 254.0 106.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 83.0 253.0 209.0 18.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 22.0 233.0 255.0 83.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 129.0 254.0 238.0 44.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 59.0 249.0 254.0 62.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 133.0 254.0 187.0 5.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 9.0 205.0 248.0 58.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 126.0 254.0 182.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 75.0 251.0 240.0 57.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 19.0 221.0 254.0 166.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.0 203.0 254.0 219.0 35.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 38.0 254.0 254.0 77.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 31.0 224.0 254.0 115.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 133.0 254.0 254.0 52.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 61.0 242.0 254.0 254.0 52.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 121.0 254.0 254.0 219.0 40.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 121.0 254.0 207.0 18.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0]) input-tz)
{:shape [1 1 28 28], :data-type :float, :layout [784 784 28 1]} (0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0)
(classify!)
{:shape [1 10], :data-type :float, :layout [10 1]} (-1.2567189931869507 0.6275832653045654 8.642718315124512 9.428943634033203 -13.740066528320312 -6.045698642730713 -23.486745834350586 28.3399658203125 -6.7914958000183105 3.941998243331909)
Now we see some better-looking results, but are we the ones who need to look at a bunch of numbers and compare them?
No, the machine should do that. Luckily, Neanderthal has just the right function for this!
(iamax (classify!))
7
And this is the kind of answer that we can show our clients! What's on this image? Easy, it's 7!
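If you'd like, you can fold the whole dance into one small helper. This is just a sketch of my own, not part of diamond-onnxrt or any other library: classify-digit reuses the input-tz and classify! defined above (so it is stateful and not thread-safe), and it expects a sequence of 784 raw grayscale values in the 0-255 range, like the one we transferred earlier.
;; Hypothetical convenience wrapper (my own sketch, not part of any library).
;; It reuses input-tz and classify! from above, so it is stateful and not thread-safe.
(defn classify-digit [pixels]
  (transfer! (map #(float (/ % 255)) pixels) input-tz)
  (iamax (classify!)))
Evaluating it with the pixel sequence we used above would return 7, just like before.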
Can you tell me the main point of this, in one paragraph?
Yes. Clojure programmers typically write functions. Functions are things that take something at the input, compute stuff internally, and return an output, which is hopefully useful downstream. The function transforms the input into the output according to the logic that we programmers wrote in code, following some algorithm that we designed for the purpose. Now, sometimes the problem is so convoluted that we don't have the slightest idea how to write that transformation in code, but what we (or someone else) do have is lots of data, and in many such cases we can train a general machinery (neural networks, for example) to find a good enough transformation. Sometimes someone else has already done the hard part by training the network, exporting it to a standard format (ONNX), and giving it to you! Now, you can load it in Clojure and use it as a Clojure function. You don't even need to know how it works internally, but it does the thing that you need: it transforms the input tensors that you have into just the right output tensors. What you do with these outputs is up to you :)
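To make that concrete, here is the whole recipe from this post gathered in one place. It simply recomposes the forms already shown above; the path is the one from this REPL session, so adjust it for your own model.
;; The whole pipeline, recomposed from the forms shown earlier in this post.
(def mnist-onnx (onnx "../../data/mnist-12.onnx"))           ; load the exported ONNX model
(def input-desc (desc [1 1 28 28] :float :nchw))             ; describe the expected input
(def input-tz (tensor input-desc))                           ; allocate the input tensor
(def classify! ((network input-desc [mnist-onnx]) input-tz)) ; blueprint, then concrete network
;; transfer! your pixel data into input-tz, then:
(iamax (classify!))                                          ; index of the highest score = the digit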
Who is this for?
Do you need to be an AI researcher to find this useful? Absolutely not! This can appeal to any Clojure engineer.
AI researchers try to find novel AI models, or to push their model by 0.1% on an artificial benchmark. Recently, they don't necessarily even do that; some of them have found a way to chase funding at crazy valuations, and catch it. Some of them don't necessarily write code but work with mathematical models, trying to figure out a way to do some abstract thing. Or, if they are PhD students, they spend endless nights fiddling with Python and PyTorch, trying to figure out this or that task assigned by their laboratory, or they just try to catch a bit of sleep while a GPU cluster crunches some tiny step in an endless training cycle.
There's nothing wrong with that, but if you're a Clojure programmer, you probably don't have the time, opportunity, experience, or even interest to work on that stuff. But even if you don't want to (or can't) understand AI internals, you can still be very creative with the applications. There are now many, many published ML models that work; many of them are even exported to ONNX, and quite usable. You don't need to invent a new OpenAI competitor; there are many more mundane problems that can be solved by taking an already existing model and applying it in a niche context, in a domain that you know well. You don't even need to understand exactly what the model does or how it does it; you can treat it as a black-box function that transforms inputs into outputs, and that function just needs a bit more care to work than a regular Clojure four-liner that you'd normally write and be proud of.
Although (sadly) Clojure has not found its way into the big-guns AI arena, Clojure is a very capable language, and Clojure programmers are very knowledgeable people when it comes to integrating stuff into real-world applications! So, here it is; now you don't have to make compromises: you can go to Hugging Face, or some other AI-related community, find ONNX models that other people have already prepared, and join the AI fun, directly from Clojure.