Jul 24, 2023

Thriving in the dynamically type-checked hellscape of Clojure

People often come to me asking, "I love the idea of Clojure, but how do you write code without types?" I struggle to answer this question. I have no idea what they're talking about half the time. The nuance of strong and weak typing, and static vs dynamic type checking, is often lost on people. I think they want their tools to tell them what to do. I don't struggle with that like others. I never have. It never occurred to me that I might just be a weirdo.

Weird life

Like a lot of things, I think I can directly attribute my cognitive model of writing code to how I learned to program. Right around the time I was getting into Linux in high school and becoming a script kiddie, I learned how to program in C. At first on Windows with something called Bloodshed Dev C++, then I quickly graduated to the terminal because of this cool thing called Linux. I used GCC and GEdit for a long time. Not because I'm some arrogant asshole (debatable). I didn't really know better. Then my hacky brain stole this book called Hacking: The Art of Exploitation. At sixteen I learned how to disassemble the C programs I had written and attach the GNU Debugger (GDB) to examine memory, graduating from:

printf("HERE DUMBASS: %s\n", buf);

This "bad" habit of GEdit + command line followed me to University where I coasted and watched people new to programming struggle writing Java. I used it during labs, competitions, assignments, etc. The first time I was actually introduced to Eclipse was when my friend used it on our Intro to Software Engineering project.

All of this was made worse when I finally dropped out of my Computer Science program to work in the industry. My first real developer job was writing Clojure, and the office preference was to use Emacs Prelude + CIDER.

Every office has its problems, but what I quickly learned was that documentation was always out of date. I had to build the muscle memory to just read the code, which was never out of date. I suspect this is why I always find myself reading the Clojure source code for interesting technical nuggets. Even long after I was a Go developer, I still read code everywhere.

Okay, so what?

The less code kept in the developer's head the better. The typical model of software development follows a loop of getting product requirements, understanding enough of the codebase to formulate a solution, and then implementing that solution. More code in the head leads to more complexity.

Blackboxing code addresses complexity by reducing something strictly to inputs and outputs. I think people coming from statically type-checked languages conflate compiler errors and type declarations with blackboxing. In reality, they've traded flexibility for guardrails. While I agree that dynamic type checking requires a little bit more cognitive load, docstrings and idioms cut that load down, and they are far more descriptive than type declarations on function arguments.

Similarly, by pushing type mismatches to compile time rather than runtime, those developers lose the flexibility of deciding what to do with a type mismatch. Of course, type mismatch faults can be mitigated by testing and assertions. Often, it doesn't matter in Clojure anyway.
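Here's a minimal sketch of what deciding at runtime can look like. The names `area` and `area-or-zero`, and the fallback value of 0, are made up for illustration. The `:pre` condition is the assertion, and the caller chooses what a failed assertion should mean.

```clojure
(defn area
  "Returns the area of a rectangle with sides `w` and `h`.
  Both arguments must be numbers."
  [w h]
  {:pre [(number? w) (number? h)]}
  (* w h))

(defn area-or-zero
  "Calls `area`, falling back to 0 when its precondition fails.
  The fallback is an arbitrary choice for this sketch."
  [w h]
  (try
    (area w h)
    (catch AssertionError _
      0)))

(area 2 3)           ; 6
(area-or-zero 2 "3") ; 0
```

Whether to fall back, log, or rethrow is a decision for the caller, which is the flexibility I'm talking about.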

While Clojure is a dynamically type-checked language, it is also strongly typed. Rather than implicitly converting values between function calls, Clojure's types implement several interfaces at once. Collections (vectors, maps, sets, and lists) implement enough shared interfaces that they work with many functions like filter, reduce, partition, group-by, etc. get, assoc, and update not only operate on maps, but can also operate on vectors and so on.
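A quick sketch of both halves of that claim, using only clojure.core. The var name `mismatch` is just for illustration.

```clojure
;; The same functions work across different collection types.
(get {:a 1} :a)            ; 1
(get [10 20 30] 1)         ; 20
(assoc [10 20 30] 1 :x)    ; [10 :x 30]
(update [1 2 3] 0 inc)     ; [2 2 3]
(filter odd? #{1 2 3})     ; a seq of 1 and 3, in no guaranteed order
(reduce + '(1 2 3))        ; 6
(group-by even? [1 2 3 4]) ; {false [1 3], true [2 4]}

;; Strong typing: a string is never silently converted to a number.
(def mismatch
  (try
    (+ 1 "1")
    (catch ClassCastException _
      :class-cast-exception)))

mismatch ; :class-cast-exception
```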

However, while developers might have a hard time shooting themselves in the foot, that doesn't mean they can't shoot their colleagues in the foot. Here are some ways to avoid that:

Idioms

Clojure idioms convey information about the type of value being passed to a function, macro, or method at a glance, much like type declarations. They are great for smaller, general functions.

Follow clojure.core's example for idiomatic names like pred and coll.
in functions:

f, g, h - function input

n - integer input usually a size

index, i - integer index

x, y - numbers

xs - sequence

m - map

k, ks - key, keys

Excerpt from the Clojure style guide Idioms section

(defn drop-last
  "Return a lazy sequence of all but the last n (default 1) items in coll"
  {:added "1.0"
   :static true}
  ([coll] (drop-last 1 coll))
  ([n coll] (map (fn [x _] x) coll (drop n coll))))

drop-last using idioms as function args from clojure.core

Idioms only capture part of the story, and are not entirely suitable for all cases. Sometimes, we want our bindings or argument names to convey more domain information. We can describe them in docstrings.
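As a rough sketch of the difference: `count-where` is a general helper where the idiomatic names say enough, while `overdue?` is a made-up domain function where the argument names and docstring do the talking. Both functions and the `:due-day` key are hypothetical.

```clojure
;; General helper: `pred` and `coll` tell the reader what to pass.
(defn count-where
  "Returns the number of items in `coll` for which `pred` returns truthy."
  [pred coll]
  (count (filter pred coll)))

;; Domain function: the names carry domain meaning, the docstring the rest.
(defn overdue?
  "Returns true if `invoice`, a map with an integer `:due-day`,
  was due before `today` (also an integer day number)."
  [invoice today]
  (< (:due-day invoice) today))

(count-where even? [1 2 3 4]) ; 2
(overdue? {:due-day 10} 12)   ; true
```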

Docstrings

Docstrings are first-class documentation for functions. Clojure tools like editors and documentation generators look at docstrings first, so it's important to write them well. Docstrings should describe what the function does, what it takes as parameters, and what it returns. Use backticks around function parameters and other important bits, like so:

(defn insert
  "Insert `bounds-obj` into the node, returning a freshly grown quadtree.
  If the node exceeds the capacity, it will split and add all objects to
  their corresponding subnodes."
  [quadtree bounds-obj]
  (let [{:keys [nodes objects bounds
                level max-levels
                max-objects]} quadtree
        all-objects (conj objects bounds-obj)]
    (if (pos? (count nodes))
      (as-> quadtree quadtree
        (assoc quadtree :objects [])
        (reduce (fn [quadtree obj]
                  (let [quadrant (get-quadrant quadtree obj)
                        nodes (:nodes quadtree)]
                    (if quadrant
                      (merge quadtree {:nodes (assoc nodes
                                                     quadrant
                                                     (insert (nth nodes quadrant)
                                                             obj))})
                      (update quadtree :objects #(conj % obj)))))
                quadtree
                all-objects))
      (if (and (> (count all-objects) max-objects) (< level max-levels))
        (let [quadtree (if (empty? nodes) (split quadtree) quadtree)]
          (insert quadtree bounds-obj))
        (merge quadtree {:objects all-objects})))))

insert from my quadtree-cljc library
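Tools can look at docstrings first because a docstring is just metadata on the var. Here's a small sketch with a made-up `clamp` function, using `doc` from clojure.repl.

```clojure
(require '[clojure.repl :refer [doc]])

(defn clamp
  "Returns `x` limited to the inclusive range from `lo` to `hi`."
  [lo hi x]
  (max lo (min hi x)))

;; Prints the name, argument list, and docstring at the REPL.
(doc clamp)

;; The same information is available as plain data.
(:doc (meta #'clamp))      ; the docstring as a string
(:arglists (meta #'clamp)) ; ([lo hi x])
```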

Comments

Similar to docstrings are Clojure's comments. These are comments like any other comments from any language, but Clojure comes with a couple more types of comments:

semicolon comments ; - typical boring comments, completely ignored
reader comments #_ - tells the reader to ignore the next form
rich comments (comment ...) - the contents must still be readable, but they are never evaluated; the form returns nil

;; This comment is ignored by the reader.
;; It's great for long-form comments,
;; and some editors allow storing and evaluating
;; REPL-like expressions here.

#_(defn hello [] (println "This form is ignored and never evaluated"))

(comment
  (def hello "This form is read but not evaluated; the comment form returns nil"))

These are some of the best places to cuss in the codebase
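A small sketch of how the last two forms behave, going by how `clojure.core/comment` is defined (it ignores its body and yields nil). The var names are only for illustration.

```clojure
;; #_ drops the next form at read time, even inside a collection.
(def kept [1 #_2 3])

;; (comment ...) never evaluates its body, so neither the println
;; nor the division by zero runs. The whole form evaluates to nil.
(def comment-result
  (comment
    (println "never printed")
    (/ 1 0)))

kept           ; [1 3]
comment-result ; nil
```

Editors can still send a single form from inside a `comment` block to the REPL, which is what makes rich comments handy as a scratchpad.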

REPL

Some IDEs allow storing and evaluating REPL-like expressions in Clojure. If you haven't heard of a REPL, it's a Read-Eval-Print-Loop. Clojure features a robust REPL compared to other programming languages as developers can spin up their application process and access the state of the application while it's running.

It's great for debugging and hardening code, and it's the closest thing to the statically type-checked process of compiling and waiting for type errors. Except the Clojure REPL doesn't have to build the entire program again, so it can be much faster to iterate with than a statically type-checked compiler.

Here's a dev script I use to bootstrap my process with the main entry point :main-opts ["dev.clj"] in my deps.edn alias. I use (CIDER) nREPL to connect to the process once it's running.

(defmulti task first)

(defmethod task :default
  [[task-name]]
  (println "Unknown task:" task-name)
  (System/exit 1))

(defmethod task nil
  [_]
  (require '[shadow.cljs.devtools.cli :as shadow])
  (require '[app.core :as app])
  ((resolve 'app/-main))
  ((resolve 'shadow/-main) "watch" "app"))

(defmethod task "repl"
  [_]
  (clojure.main/repl :init #(doto 'app.core require in-ns)))

(task *command-line-args*)

A dev.clj script I derived from play-cljc's

Once connected, I can deref the server atom or alter a root var to enable debug logs or prototype functions for my current dev task. nREPL can be very handy for debugging production systems, as it can be used with an SSH tunnel for secure access.
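Roughly, a session like that might look like the sketch below. The `server` atom, the `debug?` var, and `log-debug` are stand-ins I made up for this example. In a real app they would live in the application's own namespaces.

```clojure
;; Hypothetical application state, defined here so the sketch runs on its own.
(defonce server (atom {:port 8080 :running? true}))
(def debug? false)

(defn log-debug
  "Returns `msg` with a debug prefix when `debug?` is true, otherwise nil."
  [msg]
  (when debug?
    (str "DEBUG: " msg)))

;; Forms I might evaluate once connected to the running process:
@server                                     ; inspect the current state
(log-debug "before")                        ; nil, debug logging is off
(alter-var-root #'debug? (constantly true)) ; flip the root binding
(log-debug "after")                         ; "DEBUG: after"
```

Nothing gets restarted here. The running process picks up the new root value on the next call.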

Clojure.spec

My thoughts on spec are well known at this point. I really don't like bolting on quasi-static-but-at-run-time-type-checking. I find it's best used at system boundaries to check and coerce input, as well as on some critical code pathways. Developers should exercise caution when using spec though. Codebases with too much spec suffer from rigidity, making changes harder.
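A minimal sketch of what I mean by a boundary check, using clojure.spec.alpha. The `::config` spec and the `parse-config` function are hypothetical, and this version only checks the input, it doesn't coerce it.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::host string?)
(s/def ::port (s/and int? #(< 0 % 65536)))
(s/def ::config (s/keys :req-un [::host ::port]))

(defn parse-config
  "Returns `input` if it satisfies ::config. Otherwise throws an ex-info
  whose data is spec's explanation of the failure."
  [input]
  (if (s/valid? ::config input)
    input
    (throw (ex-info "Invalid config" (s/explain-data ::config input)))))

(parse-config {:host "localhost" :port 8080})        ; returns the map unchanged
(s/valid? ::config {:host "localhost" :port "8080"}) ; false
```

Everything past `parse-config` can then assume the shape of the config without carrying specs of its own.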

Rigidity is brought on by the false promise of spec test generators (using spec to generate function test cases). If you want to use generators, you can't just sprinkle in some Clojure spec. Spec generators are great at generating obscure test cases. They create a cycle of tweaking functions and specs so that the generators don't produce unintended nonsense and the function can handle the desired input. I find the time spent on this rarely pays off, and it makes codebases more difficult to develop in.

And now the part where I shamelessly self-promote. I have a Clojure course available for pre-order if you're into that kind of thing. You can also follow me on Twitter? X? Xitter @janetacarr, or not. ¯\_(ツ)_/¯