A Primer on Clojure Macros
One day my squad mate at a Clojure startup committed a macro to the codebase, and I was quick to reprimand them for it, as the same functionality could have been implemented with Clojure core functions. Digging into the rationale behind the decision, I discovered my colleague had never learned the first and second rules of Clojure macros: Don't talk about Clojure macros. Just kidding, the real rule is you should only reach for a macro if Clojure doesn't already support something you need (or if you need to stop evaluation, but that's another story).
So what is a macro anyway? A macro is code that operates on other code, and it typically returns code as its result. But what do all the tildes, at-signs, pound signs (#), and backticks mean?
Data is beautiful, and so is code?
It's no coincidence Clojure forms (function calls) share a similar syntax to Clojure's list type. Clojure forms are lists! Every Clojure developer has made this mistake (and, some tired days, many times):
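A minimal sketch of that mistake, assuming it means forgetting the quote on a list literal. The names quoted-list and outcome are hypothetical, and the exception is caught only so the snippet runs to completion:

```clojure
;; With the quote, the list is just data.
(def quoted-list '(1 2 3))

;; Without it, Clojure treats the list as a call and tries to invoke 1
;; as a function. (eval '(1 2 3)) behaves like typing (1 2 3) at the REPL.
(def outcome
  (try
    (eval '(1 2 3))
    (catch Exception e
      :not-callable)))

quoted-list ; (1 2 3)
outcome     ; :not-callable
```

At a REPL the unquoted version typically surfaces as a ClassCastException complaining that a Long cannot be cast to a function.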
Since Clojure forms are just lists, we can treat Clojure code like data, because it is data. What happens if we use that little quote ' on a function definition? Check it out:
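A sketch of what that looks like, assuming the hello-world function that appears later in this post. The name quoted-defn is hypothetical:

```clojure
;; Quoting stops evaluation, so we get the form itself back as a list.
(def quoted-defn
  '(defn hello-world [] (println "Hello, data! Or code??")))

(list? quoted-defn)   ; true
(first quoted-defn)   ; defn         (a symbol)
(second quoted-defn)  ; hello-world  (a symbol)
(nth quoted-defn 2)   ; []           (an empty vector)
(last quoted-defn)    ; (println "Hello, data! Or code??")
```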
We get a list! Specifically, we get a list starting with two symbols, defn and hello-world, followed by an empty vector, and ending with another list containing the symbol println and a string. We can pass such a list to eval to get our function. eval evaluates its input into something useful, whether that's a map or a function. This reveals the heart of a macro: it simply returns code to be evaluated and used.
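To make the eval step concrete, here is a small sketch. The names hello-form and hello-var are hypothetical. Evaluating a defn form returns the var it defines, and a var can be called like the function it holds:

```clojure
(def hello-form
  '(defn hello-world [] (println "Hello, data! Or code??")))

;; eval turns the list into a real function definition.
(def hello-var (eval hello-form))

(hello-var)            ; prints Hello, data! Or code??

;; eval works on any form, not just function definitions.
(eval '(+ 1 2))        ; 3
(eval '{:a (inc 1)})   ; {:a 2}
```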
There's a problem with the previous code, though. We can't really manipulate Clojure code like this, because unless we quote everything by hand, the contents of a form get evaluated before a function ever sees them. If we want to operate on code, we have to do it before the Clojure compiler evaluates that code.
All the cool compilers Read
A Lisp compiler has a stage before evaluation called reading. A program called the reader reads the text in source files and turns it into data structures (lists, symbols, vectors), which are then evaluated. Lucky for us, we can manipulate code between reading and evaluation with a macro! In this sense, macros are like special functions that run on what the reader produces, and Clojure provides us with defmacro for creating them.
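A minimal sketch of the difference, using two hypothetical names: show-value is an ordinary function and show-form is a macro. The function receives the result of its argument, while the macro receives the unevaluated list and returns code:

```clojure
;; A function sees its argument after evaluation.
(defn show-value [x] x)

;; A macro sees the raw form. Here it returns (quote <form>),
;; so the form comes back as data instead of being run.
(defmacro show-form [x]
  (list 'quote x))

(show-value (+ 1 2)) ; 3
(show-form (+ 1 2))  ; (+ 1 2)
```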
Using defmacro, the def- macro below returns a list of symbols and lists, so evaluation can turn it into a handy form that binds a symbol to decls with the private metadata flag set to true, so code outside this namespace can't access the symbol's value.
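A minimal sketch of what such a def- macro might look like, assuming it takes a symbol followed by decls and marks the symbol private through metadata. The name secret-answer is hypothetical:

```clojure
(defmacro def-
  "Like def, but marks the resulting var as private."
  [sym & decls]
  ;; Attach {:private true} to the symbol, then splice the rest into def.
  `(def ~(vary-meta sym assoc :private true) ~@decls))

(def- secret-answer 42)

secret-answer                      ; 42
(:private (meta #'secret-answer))  ; true
```

This mirrors how clojure.core builds defn- on top of defn.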
Notice the backtick in front of the def form in our def- macro? Backtick in Clojure is called syntax quote. It works like quote, but it also fully resolves symbols for us. In other words, it finds each symbol's namespace. For example, `defn yields clojure.core/defn. (Special forms such as def are the exception: `def stays def.) Without syntax quote, defn is just a plain, unqualified symbol, and what it resolves to depends on the namespace where the macro is used.
user> `(defn hello-world [] (println "Hello, data! Or code??"))
(clojure.core/defn
user/hello-world
[]
(clojure.core/println "Hello, data! Or code??"))
user>
So to tell Clojure, "hey, look here for this thing", we give it directions by using the fully qualified symbol (a symbol with a namespace prefix). We have to fully qualify the symbols in a macro's output because that output is evaluated in the namespace where the macro is called, and the aliases and referred names of the namespace where the macro was written may not exist there. Normally an alias comes from something like (:require [foo.bar :as foob]) at the top of your Clojure source file. The same goes for clojure.core: every namespace refers all of it by default, roughly as if by (:require [clojure.core :refer :all]), but a caller can exclude or shadow those names, so they get qualified too.
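A short sketch comparing quote and syntax quote on single symbols. The def names here are hypothetical, and the namespace on the last result depends on where the code is read:

```clojure
(def plain     'println)      ; println
(def qualified `println)      ; clojure.core/println
(def special   `def)          ; def, since special forms are left alone
(def local     `hello-world)  ; <current namespace>/hello-world

(namespace plain)      ; nil
(namespace qualified)  ; "clojure.core"
```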
Your favorite Unquote?
You'll almost always see syntax quote close to ~, or unquote, in Clojure macros. We can think of it as an operator that says "evaluate just this one thing". Similarly, we'll see ~@, or unquote splice, which we can think of as an operator that pulls the elements out of a list and places them in the surrounding form. This example from the Clojure documentation demonstrates the use of syntax quote, unquote, and unquote splicing pretty well:
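A sketch in the spirit of that example. The names fred, x, lst, and result are illustrative, and fred does not need to be defined, since it is never evaluated:

```clojure
(def x 5)
(def lst '(a b c))

;; x and lst are resolved as symbols, ~x is replaced by its value,
;; and ~@lst is replaced by the elements of the list.
(def result `(fred x ~x lst ~@lst 7 8 :nine))

result
;; in the user namespace:
;; (user/fred user/x 5 user/lst a b c 7 8 :nine)
```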
Unquote and unquote splice are instrumental to writing macros that add our own features to Clojure, as they let us drop evaluated values into a code template and unfurl lists. Putting those together, we can get the gist of something like when-let:
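A sketch along the lines of the clojure.core definition, renamed my-when-let (a hypothetical name) so it doesn't shadow the real macro, and with the argument validation left out. The binding name form is an assumption:

```clojure
(defmacro my-when-let
  "Evaluates body with the binding form bound to the value of the test,
  but only when that value is truthy."
  [bindings & body]
  (let [form (bindings 0)   ; the name to bind, e.g. cool
        tst  (bindings 1)]  ; the expression to test, e.g. "hi"
    `(let [temp# ~tst]
       (when temp#
         (let [~form temp#]
           ~@body)))))

(my-when-let [cool "hi"] (str cool "!")) ; "hi!"
(my-when-let [cool nil] :never-reached)  ; nil
```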
Looking at when-let, we can see this macro takes a vector of bindings and a body of forms in a variadic argument list. The first let takes the first and second values out of bindings, just like we'd expect from a let. It then unquotes the second value, tst, into a syntax-quoted let, binding it to a temporary symbol called temp#. temp# serves as the test for the when macro, and inside the when it gets bound again to the first value in our original bindings (a simple symbol). Finally, body gets unquote spliced, which takes all the forms out of the list that body is and places them in the closest let, where they run in its implicit do. The implicit do is important because body can hold several forms, and all of them need to be evaluated in order. If let didn't provide one, you'd have to write an explicit do.
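We can expand our use of when-let with macroexpand-1 and macroexpand. macroexpand-1 shows the first expansion step, while macroexpand keeps expanding the outermost form until it is no longer a macro call (nested macros such as when are left alone).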
user> (macroexpand-1 '(when-let [cool "hi"] (println cool)))
(clojure.core/let
[temp__5804__auto__ "hi"]
(clojure.core/when
temp__5804__auto__
(clojure.core/let [cool temp__5804__auto__] (println cool))))
user> (macroexpand '(when-let [cool "hi"] (println cool)))
(let*
[temp__5804__auto__ "hi"]
(clojure.core/when
temp__5804__auto__
(clojure.core/let [cool temp__5804__auto__] (println cool))))
user>
Hey wait, what the hell is temp__5804__auto__? Remember earlier when I talked about fully qualifying symbols? If we use simple (unqualified) symbols in a macro, they can clash with symbols of the same name wherever the macro is used (symbol collision). To avoid symbol collision, the hash at the end of temp# inside a syntax quote tells the reader to generate a unique symbol for us. The hash only generates one symbol, though: it is created once, when the macro's source is read, so every expansion reuses it. If we wanted a recursive macro with a unique symbol for each recursive call, we'd call the gensym function, which returns a new unique symbol on every call.
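A small sketch of the difference, using two hypothetical macros. auto-sym relies on the # suffix, so each expansion yields the same generated symbol. fresh-sym calls gensym during expansion, so each expansion yields a new one:

```clojure
;; Within one syntax quote, every temp# is the same symbol.
(def pair `(temp# temp#))
(= (first pair) (second pair)) ; true

;; The # symbol is fixed when the macro body is read.
(defmacro auto-sym []
  `(quote temp#))

;; gensym runs every time the macro expands.
(defmacro fresh-sym []
  `(quote ~(gensym "temp")))

(= (auto-sym) (auto-sym))   ; true
(= (fresh-sym) (fresh-sym)) ; false
```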
Too long; didn't read
Macros are a cool, powerful feature of Clojure. While they are great fun to write, they are often discouraged in codebases because code that manipulates other code is complex, which can make macros a nightmare to maintain and debug. Clojure macros have been the subject of whole books, so the goal of this post is simply to give you the confidence and tools to read and understand them.
A quick summary:
- Macros are code that operates on other code, and they run after the read stage, before evaluation
- Clojure forms are really just lists
- Get a list with simple symbols by prefixing our function call with quote or '
- Get a list with fully qualified symbols by prefixing our function or list with a syntax quote: `(defn hello-world ...)
- Evaluate something using unquote or ~ in a macro
- Pull out the contents of a list using unquote splicing or ~@
- A symbol postfixed with a # will yield a unique symbol by the syntax reader
- If you need a unique symbol for each recursive macro expansion, use the gensym function.
And that's it! I hope you learned enough to feel confident working with macros.