A Primer on Clojure Macros

One day a teammate at a Clojure startup committed a macro to the codebase, and I was quick to reprimand them for it, since the same functionality could have been implemented with Clojure core functions. Digging into the rationale, I discovered my colleague had never learned the first and second rules of Clojure macros: don't talk about Clojure macros. Just kidding. The real rule is that you should only reach for a macro if Clojure doesn't already support something you need (or if you need to stop evaluation, but that's another story).

So what is a macro anyway? A macro is code that operates on other code, and it typically returns code as its result. But what do all the tildes, at-signs, pound signs (#), and backticks mean?

Data is beautiful, and so is code?

It's no coincidence that Clojure forms such as function calls share their syntax with Clojure's list type. Those forms are lists! Every Clojure developer has made this mistake (and, on some tired days, many times):

user> (1 2 3)
Execution error (ClassCastException) at user/eval79101 (dev.clj:82).
class java.lang.Long cannot be cast to class clojure.lang.IFn (java.lang.Long is in module java.base of loader 'bootstrap'; clojure.lang.IFn is in unnamed module of loader 'app')
user> '(1 2 3)
(1 2 3)
Clearly, not your dad's parens

Since Clojure forms are just lists, we can treat Clojure code like data, because it is data. What happens if we use that little quote ' on a function definition? Check it out:

user> '(defn hello-world [] (println "Hello, data! Or code??"))
(defn hello-world [] (println "Hello, data! Or code??"))
user> (eval '(defn hello-world [] (println "Hello, data! Or code??")))
user> (hello-world)
Hello, data! Or code??
A list evaluating a list of lists, List-ception

We get a list! Specifically, we get a list starting with two symbols, defn and hello-world, followed by an empty vector, and ending with another list containing the symbol println and a string. We can pass such a list to eval to get our function. eval evaluates its input into something useful, whether that's a map or a function. This reveals the heart of a macro: we simply return code to be evaluated and used.
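Because the quoted form is an ordinary list, the usual sequence functions work on it. Here is a minimal sketch of poking at that list and building a new definition from it; the names form, renamed, and goodbye-world are made up for illustration.

```clojure
;; A quoted form is plain data: two symbols, a vector, and a nested list.
(def form '(defn hello-world [] (println "Hello, data! Or code??")))

(list? form)   ;=> true
(first form)   ;=> defn
(second form)  ;=> hello-world
(count form)   ;=> 4

;; Build new code by editing the list: keep everything after the name,
;; but swap in a different (hypothetical) function name.
(def renamed (list* 'defn 'goodbye-world (drop 2 form)))

;; eval turns the edited list into a real function.
(eval renamed)
(goodbye-world) ; prints: Hello, data! Or code??
```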

There's a problem with the previous approach, though. We can't really manipulate code this way inside a normal program, because Clojure evaluates every form in a call before the function receives it, unless we quote each one by hand. If we want to operate on code, we have to do it before the Clojure compiler evaluates that code.
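To see the problem concretely, here is a small sketch of trying to write a conditional as a plain function. The name if-fn is hypothetical. Both branches are evaluated before if-fn ever runs, so the side effect in the unused branch still happens.

```clojure
;; A function version of if: by the time the body runs,
;; then and else have already been evaluated to values.
(defn if-fn [test then else]
  (if test then else))

(if-fn true
       :yes
       (do (println "else ran anyway") :no))
;; prints: else ran anyway
;;=> :yes
```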

All the cool compilers Read

A Lisp compiler has a stage before evaluation called reading. A program called the reader reads the text in source files and turns it into data structures (lists, symbols, vectors, and so on), which are then evaluated. Luckily for us, we can manipulate that code before evaluation with a macro! A macro runs after the reader has produced a form and before that form is evaluated, so in this sense macros are like special functions that work on the reader's output. Clojure provides us with a macro for creating macros called defmacro.
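The split between reading and evaluating can be tried out at the REPL with read-string and eval. The sketch below also defines a tiny macro, show-form (a made-up name), that simply hands back the form it was given, to show that a macro receives forms rather than values.

```clojure
;; The reader turns text into data; nothing is evaluated yet.
(def read-form (read-string "(+ 1 2)"))

(list? read-form)  ;=> true
(first read-form)  ;=> +
(eval read-form)   ;=> 3

;; A macro sits between those two steps. It receives the unevaluated
;; form and returns code; here the returned code is (quote <form>).
(defmacro show-form [x]
  (list 'quote x))

(show-form (+ 1 2))  ;=> (+ 1 2)
```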

Using defmacro, the def- macro below returns a list of symbols and lists. Evaluating that list binds a symbol to decls with the :private metadata flag set to true, so code outside this namespace can't access the symbol's value.

(defmacro def-
  "same as defn-, yielding non-public def"
  [name & decls]
  (list* `def
         (with-meta name
           (assoc (meta name) :private true))
         decls))
Keep your hands off my def
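A short usage sketch, repeating the macro so the snippet runs on its own. The var name secret is just an example.

```clojure
(defmacro def-
  "same as defn-, yielding non-public def"
  [name & decls]
  (list* `def
         (with-meta name
           (assoc (meta name) :private true))
         decls))

;; Define a private var with it (hypothetical name and value).
(def- secret 42)

secret                             ;=> 42
(:private (meta #'secret))         ;=> true

;; The expansion is an ordinary def; the :private flag rides along
;; as metadata on the symbol, so it does not show up when printed.
(macroexpand-1 '(def- secret 42))  ;=> (def secret 42)
```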

Notice the backtick in front of def in our def- macro? Backtick in Clojure is called syntax quote. It works like quote, but it also fully resolves symbols for us. In other words, it finds the symbol's namespace for us. For example, `defn yields clojure.core/defn. Special forms such as def are the exception: `def stays def, because special forms have no namespace. Without syntax quote, a symbol like defn is just a plain, unqualified symbol, and what it refers to depends on the namespace where the expanded code ends up.

user> `(defn hello-world [] (println "Hello, data! Or code??"))
(clojure.core/defn user/hello-world []
  (clojure.core/println "Hello, data! Or code??"))

So to say "hey, look here for this thing", we give directions to something that already exists by using the fully qualified symbol (a symbol with a namespace prefix). We have to fully qualify our symbols in macros because the expanded code is evaluated in the namespace where the macro is used, and the aliases and referred names there may differ from the ones where the macro was written. Normally an alias comes from something like (:require [foo.bar :as foob]) at the top of your Clojure source file. Clojure also refers all of clojure.core into every namespace automatically, much as if you had written (:require [clojure.core :refer :all]), and syntax quote accounts for that too, which is why println becomes clojure.core/println.
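A few one-liners make the difference between quote and syntax quote easier to see. This is only a sketch to try at a REPL; the alias str for clojure.string is chosen here for illustration.

```clojure
(require '[clojure.string :as str])

'println    ;=> println               (plain quote leaves the symbol alone)
`println    ;=> clojure.core/println  (syntax quote resolves it)
`str/join   ;=> clojure.string/join   (an alias is expanded to the real namespace)
`defn       ;=> clojure.core/defn
`def        ;=> def                   (special forms stay unqualified)

;; A symbol that resolves to nothing is qualified with the current
;; namespace, for example user/my-thing at a fresh REPL.
(name `my-thing)  ;=> "my-thing"
```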

Your favorite Unquote?

You'll almost always see syntax quote close to ~, or unquote, in Clojure macros. We can think of unquote as an operator that says "evaluate just this one thing" inside an otherwise quoted form. Similarly, we'll see ~@, or unquote splice, and we can think of it as an operator that pulls the items out of a list and drops them in place. This example from the Clojure documentation demonstrates syntax quote, unquote, and unquote splicing pretty well:

user=> (def x 5)
user=> (def lst '(a b c))
user=> `(fred x ~x lst ~@lst 7 8 :nine)
(user/fred user/x 5 user/lst a b c 7 8 :nine)
from https://clojure.org/reference/reader#syntax-quote
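The same three operators can be combined in a small macro. The sketch below defines unless, a hypothetical macro that is not part of clojure.core, which runs its body only when the test is falsey.

```clojure
;; ~test drops the test form in as-is; ~@body unfurls the list of
;; body forms into the do.
(defmacro unless [test & body]
  `(if ~test
     nil
     (do ~@body)))

(unless false
  (println "running the body")
  :done)
;; prints: running the body
;;=> :done

(unless true :skipped)  ;=> nil

;; if and do are special forms, so syntax quote leaves them unqualified.
(macroexpand-1 '(unless false (println "a") :done))
;;=> (if false nil (do (println "a") :done))
```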

Unquote and unquote splice are instrumental to writing macros that add our own features to Clojure, because they let us insert values into a quoted form and unfurl lists into it. Putting those together, we can get the gist of something like when-let :

(defmacro when-let
  "bindings => binding-form test
  When test is true, evaluates body with binding-form bound to the value of test"
  {:added "1.0"}
  [bindings & body]
  (assert-args
     (vector? bindings) "a vector for its binding"
     (= 2 (count bindings)) "exactly 2 forms in binding vector")
   (let [form (bindings 0) tst (bindings 1)]
    `(let [temp# ~tst]
       (when temp#
         (let [~form temp#]
           ~@body)))))
from clojure.core, under license of course

Looking at when-let, we can see this macro takes a vector of bindings and a body of forms in a variadic argument list. The first let pulls the first and second values out of bindings and names them form and tst. Inside the syntax-quoted let, tst is unquoted and bound to a temporary symbol called temp#. temp# then serves as the test for when, and if it is truthy, temp# is bound again to form, the first value (usually a simple symbol) in our original bindings. Finally, body is unquote spliced, which takes all the forms out of the list that body is and places them in the innermost let, where they run in an implicit do. The implicit do matters: without the splice we would be inserting a single list of forms, which would be evaluated as one call. If let didn't supply an implicit do, you'd have to write an explicit one.
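A few calls show that behavior from the outside. This is a usage sketch only; the bound names n, a, and b are arbitrary.

```clojure
;; Truthy test: the value is bound and every body form runs in order;
;; the value of the last form is returned.
(when-let [n (first [1 2 3])]
  (println "found" n)
  (* n 10))
;; prints: found 1
;;=> 10

;; Falsey test: the body is skipped entirely.
(when-let [n (first [])]
  (println "never printed")
  :unreachable)
;;=> nil

;; The binding form does not have to be a plain symbol.
(when-let [[a b] (seq [1 2])]
  (+ a b))
;;=> 3
```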

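We can expand our use of when-let with macroexpand-1 and macroexpand. macroexpand-1 performs only the first expansion step, while macroexpand keeps expanding until the outermost form is no longer a macro call (nested forms are left alone).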

user> (macroexpand-1 '(when-let [cool "hi"] (println cool)))
(clojure.core/let
    [temp__5804__auto__ "hi"]
  (clojure.core/when temp__5804__auto__
    (clojure.core/let [cool temp__5804__auto__] (println cool))))
user> (macroexpand '(when-let [cool "hi"] (println cool)))
(let*
    [temp__5804__auto__ "hi"]
  (clojure.core/when temp__5804__auto__
    (clojure.core/let [cool temp__5804__auto__] (println cool))))
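Since the generated temp symbol changes from session to session, a quick way to compare the two functions is to look only at the parts of the result that stay fixed. A small sketch (expand-me is a made-up name):

```clojure
(def expand-me '(when-let [cool "hi"] (println cool)))

;; One step: when-let has become a let, which is itself a macro.
(first (macroexpand-1 expand-me))        ;=> clojure.core/let

;; Repeated steps: the outer let has been expanded to the let* special form.
(first (macroexpand expand-me))          ;=> let*

;; The nested when is still unexpanded; macroexpand only works on the
;; outermost form.
(first (nth (macroexpand expand-me) 2))  ;=> clojure.core/when

;; A simpler case to read in full:
(macroexpand-1 '(when true :a :b))       ;=> (if true (do :a :b))
```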

Hey wait, what the hell is temp__5804__auto__? Remember earlier when I talked about fully qualifying symbols? If we use simple (unqualified) symbols for our own bindings, they can clash with symbols of the same name (symbol collision) wherever the macro is used. To avoid symbol collision, the hash at the end of temp# inside a syntax quote tells the reader to generate a unique symbol for us. Every temp# within the same syntax-quoted form becomes that one symbol, though, and it is generated only once, when the macro is read. If we wanted a recursive macro with a unique symbol for each recursive call, we would call the gensym function, which generates a new unique symbol on every call.
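The difference between the trailing hash and gensym can be checked directly. The macro double-it below is a hypothetical example of calling gensym by hand; treat it as a sketch rather than a recommended pattern.

```clojure
;; Within one syntax-quoted form, every x# is the same generated symbol.
(let [[a b] `(x# x#)]
  (= a b))                             ;=> true

;; Separate syntax-quoted forms get separate symbols.
(= `x# `x#)                            ;=> false

;; gensym returns a fresh symbol on every call.
(= (gensym "temp") (gensym "temp"))    ;=> false

;; Using gensym explicitly: a new symbol is created each time the
;; macro is expanded, and unquote drops it into the template.
(defmacro double-it [expr]
  (let [v (gensym "v")]
    `(let [~v ~expr]
       (+ ~v ~v))))

(double-it (do (println "evaluated once") 21))
;; prints: evaluated once
;;=> 42
```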

Too long; didn't read

Macros are a cool, powerful feature of Clojure. While they are great fun to write, they are often discouraged in codebases because code that manipulates other code adds complexity, which can make macros a nightmare to maintain and debug. Whole books have been written about Clojure macros, so the goal of this post is simply to give you the confidence and tools to read and understand them.

A quick summary:

  • Macros are code that operates on other code; they run after the read stage and before evaluation
  • Clojure forms are really just lists
  • Get a list with simple symbols by prefixing a form with quote or '
  • Get a list with fully qualified symbols by prefixing a form with a syntax quote: `(defn hello-world ...)
  • Evaluate something inside a syntax quote using unquote or ~
  • Pull out the contents of a list using unquote splicing or ~@
  • A symbol postfixed with a # inside a syntax quote is replaced by the reader with a generated unique symbol
  • If you need a unique symbol for each recursive macro expansion, use the gensym function.

And that's it, I hope you learned enough to feel confident working with macros.
