Poor man's dependency injection in Clojure

When writing daemons in Clojure which need configuration, you often want to give users a way to override or extend parts of the application.

Most popular daemons provide this flexibility, usually through modules or plugins. The extension mechanism varies widely though, so let's look at how a few popular daemons handle it.

nginx.org, written in C, uses a function pointer structure that needs to be included at compile time. Modules are expected to bring in their own parser extensions for the configuration file.
collectd, written in C, uses function pointer structures as well but allows dynamic loading through ld.so. Additionally, the exposed functions are expected to work with a pre-parsed structure to configure their behavior.
puppet, written in Ruby, lets additional modules reopen the puppet module to add functionality.
cassandra, written in Java, parses a YAML configuration file which names the classes to load to provide a specific functionality.

While all these approaches are valid, Cassandra’s approach most closely resembles what you’d expect a Clojure program to provide, since it runs on the JVM. That style of behavior management, pervasive in the Java community and usually configured through XML files, is called dependency injection.

Dependency injection on the JVM

The JVM brings two things which simplify creating a daemon with configurable behavior:

Interfaces let you define a contract an object must satisfy
Classpaths let you add code to a project at run-time (not build-time)

Cassandra’s YAML configuration takes advantage of these two properties to let you swap implementations for different types of authenticators, snitches or partitioners.
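The mechanism behind this kind of configuration can be sketched with plain JVM interop from Clojure. The helper below is hypothetical (instantiate-class is not part of Cassandra or of any library): it loads a class by name from the classpath, checks that it honors an interface, and calls its no-argument constructor.

```clojure
(defn instantiate-class
  "Hypothetical helper: load class-name from the classpath, verify that it
  implements iface, then call its no-argument constructor."
  [class-name iface]
  (let [klass (Class/forName class-name)]
    (when-not (.isAssignableFrom ^Class iface klass)
      (throw (ex-info (str class-name " does not implement " (.getName ^Class iface))
                      {:class class-name})))
    (-> (.getDeclaredConstructor klass (into-array Class []))
        (.newInstance (object-array 0)))))

;; java.util.ArrayList stands in for a pluggable implementation of java.util.List.
(def plugin (instantiate-class "java.util.ArrayList" java.util.List))

(instance? java.util.List plugin) ;; => true
```

Asking for an interface the class does not implement, for instance java.util.Map here, throws instead of returning a half-usable object.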

A lightweight approach in Clojure

So let’s mimic Cassandra and write a simple configuration file which allows modifying behavior.

Let’s pretend we have a daemon which listens for data through transports, and needs to store it using a storage mechanism. A good example would be a log storage daemon, listening for incoming log lines, and storing them somewhere.

For such a daemon, the following “contracts” emerge:

transports: which listen for incoming log lines
codecs: which determine how data should be de-serialized
stores: which provide a way of storing data

This gives us the following Clojure protocols:

(defprotocol Store
  (store! [this payload]))

(defprotocol Transport
  (listen! [this sink]))

(defprotocol Codec
  (decode [this payload]))

(defprotocol Service
  (start! [this]))

This gives you the ability to build an engine which has no knowledge of underlying implementation and can be very easily tested and inspected:

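;; Assumes (:require [clojure.core.async :refer [chan go-loop <!]])
(defn reactor
  [transports codec store]
  (let [ch (chan 10)]
    (reify
      Service
      (start! [this]
        ;; Decode and store messages until the channel is closed.
        (go-loop []
          (when-let [msg (<! ch)]
            (store! store (decode codec msg))
            (recur)))
        ;; Hand each transport the channel before starting it.
        (doseq [transport transports]
          (listen! transport ch)
          (start! transport))))))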

As shown above, we use reify to create an instance of an object honoring a specific protocol (or Java interface).
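Because reify closes over its surrounding locals, a test double takes only a few lines. The sketch below is an illustration, not part of the daemon: recording-store is a hypothetical name, and the Store protocol is repeated only so the snippet stands alone.

```clojure
(defprotocol Store
  (store! [this payload]))

(defn recording-store
  "Hypothetical test double: appends every payload to the atom `seen`
  instead of persisting it anywhere."
  [seen]
  (reify Store
    (store! [this payload]
      (swap! seen conj payload))))

(def seen (atom []))
(def store (recording-store seen))

(store! store {:line "hello"})
(store! store {:line "world"})

@seen                    ;; => [{:line "hello"} {:line "world"}]
(satisfies? Store store) ;; => true
```

A store like this can be handed to the engine in place of a real one, and the atom inspected afterwards.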

Here are simplistic implementations of an EDN codec, an stdout store and an stdin transport:

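;; Assumes (:require [clojure.edn :as edn])
(defn edn-codec [config]
  (reify Codec
    (decode [this payload]
      ;; edn/read-string only reads data; clojure.core/read-string may evaluate code.
      (edn/read-string payload))))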

(defn stdout-store [config]
  (reify
    Store
    (store! [this payload]
      (println "storing: " payload))))

(defn stdin-transport [config]
  (let [sink (atom nil)]
    (reify
      Transport
      (listen! [this new-sink]
        (reset! sink new-sink))
      Service
      (start! [this]
        (future
          (loop []
            (when-let [input (read-line)]
              (>!! @sink input)
              (recur))))))))

Note that each implementation gets passed a configuration variable - which will be useful.
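To illustrate why the configuration argument matters, here is a sketch of an implementation that actually reads it. Both split-codec and its :separator key are hypothetical, made up for this example; the Codec protocol is repeated so the snippet stands alone.

```clojure
(require '[clojure.string :as str])

(defprotocol Codec
  (decode [this payload]))

(defn split-codec
  "Hypothetical codec: splits each payload on the configured :separator,
  falling back to a comma when the key is absent."
  [{:keys [separator] :or {separator ","}}]
  ;; Quote the separator so characters like | are matched literally.
  (let [pattern (re-pattern (java.util.regex.Pattern/quote separator))]
    (reify Codec
      (decode [this payload]
        (str/split payload pattern)))))

(decode (split-codec {:separator "|"}) "a|b|c") ;; => ["a" "b" "c"]
(decode (split-codec {}) "a,b")                 ;; => ["a" "b"]
```

The factory receives a plain map, so whatever keys sit next to the implementation in the configuration file can be destructured directly.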

A YAML configuration

Now that we have our protocols in place let’s see if we can come up with a sensible configuration file for our mock daemon:

codec:
  use: mock-daemon.codec/edn-codec
transports:
  stdin:
    use: mock-daemon.transport.stdin/stdin-transport
store:
  use: mock-daemon.transport.stdin/stdout-store

Our config contains three keys: codec, store and transports. codec and store are maps containing at least a use key, which names a function that will yield an instance implementing the Codec or Store protocol. transports is a map of named entries, each following the same convention.

Now all that remains to be done is finding an easy way to load this configuration and produce a codec, transports and a store from it.

Clojure introspection

Parsing the above configuration from YAML, for instance with clj-yaml.core/parse-string, will yield a map. If we only look at the codec part, we would have:

{:codec {:use "mock-daemon.codec/edn-codec"}}

Our goal will be to retrieve an instance reifying Codec from the string mock-daemon.codec/edn-codec.

This can be done in two steps:

Retrieve the symbol
Call the function it names

To retrieve the symbol, this simple bit will do:

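(defn find-ns-var
  [candidate]
  (try
    (let [var-in-ns (symbol candidate)
          ;; named ns-sym to avoid shadowing clojure.core/ns
          ns-sym    (symbol (namespace var-in-ns))]
      (require ns-sym)
      (find-var var-in-ns))
    ;; any failure (missing namespace, unqualified name) yields nil
    (catch Exception _)))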

We first extract the namespace out of the namespace qualified var and require it, then get the var. Any errors will result in nil being returned.
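On Clojure 1.10 or later, the same lookup can lean on the built-in requiring-resolve. The variant below is a sketch: find-ns-var* is a hypothetical name, and clojure.string/upper-case merely stands in for a real implementation function.

```clojure
(defn find-ns-var*
  "Hypothetical variant of find-ns-var built on requiring-resolve
  (Clojure 1.10+). Returns the var, or nil when it cannot be found."
  [candidate]
  (try
    (requiring-resolve (symbol candidate))
    ;; A missing namespace or an unqualified name throws; treat both as nil.
    (catch Exception _ nil)))

((find-ns-var* "clojure.string/upper-case") "abc") ;; => "ABC"
(find-ns-var* "clojure.string/no-such-fn")         ;; => nil
(find-ns-var* "no.such.namespace/thing")           ;; => nil
(find-ns-var* "unqualified")                       ;; => nil
```

Swallowing every exception keeps the caller simple, at the cost of hiding why a lookup failed; logging the exception before returning nil may be worth it in a real daemon.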

Now that we have the function, it’s straightforward to call it with the config:

(defn instantiate
  [candidate config]
  (if-let [reifier (find-ns-var candidate)]
    (reifier config)
    (throw (ex-info (str "no such var: " candidate) {}))))

We can now tie these two functions together:

(defn get-instance
  [config]
  (let [candidate (-> config :use name symbol)
        raw-config (dissoc config :use)]
    (instantiate candidate raw-config)))

These three snippets are the only bits of introspection you’ll need and are the core of our solution.
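Nothing in these snippets checks that the configured function returns the right kind of object. One possible safeguard, sketched here with a hypothetical ensure-satisfies helper, is to verify the protocol immediately after instantiation so that a typo in the configuration fails at startup rather than on the first message.

```clojure
(defprotocol Codec
  (decode [this payload]))

(defn ensure-satisfies
  "Hypothetical guard: returns instance when it satisfies protocol,
  otherwise throws, naming the configuration entry that produced it."
  [protocol instance source]
  (if (satisfies? protocol instance)
    instance
    (throw (ex-info (str source " did not yield the expected protocol")
                    {:source source}))))

(def upcase-codec
  (reify Codec
    (decode [this payload] (.toUpperCase ^String payload))))

;; A valid instance is passed through untouched.
(decode (ensure-satisfies Codec upcase-codec "codec") "abc") ;; => "ABC"

;; A plain string does not implement Codec, so this call would throw:
;; (ensure-satisfies Codec "oops" "codec")
```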

Tying it together

We can now make use of get-instance in our configuration loading code:

(defn load-path
  [path]
  (-> (or path
          (System/getenv "CONFIGURATION_PATH")
          "/etc/default_path.yaml")
      slurp
      parse-string))

(defn get-transports
  [transports]
  (zipmap (keys transports)
          (mapv get-instance (vals transports))))

(defn init
  [path]
  (-> (load-path path)
      (update-in [:codec] get-instance)
      (update-in [:store] get-instance)
      (update-in [:transports] get-transports)))
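The lookup order in load-path (explicit argument, then environment variable, then built-in default) is easier to test when the environment is passed in as a map. The resolve-path function below is a hypothetical helper written for this illustration, not part of the code above.

```clojure
(defn resolve-path
  "Hypothetical helper mirroring the precedence used by load-path:
  explicit path first, then CONFIGURATION_PATH from env, then the default."
  [path env]
  (or path
      (get env "CONFIGURATION_PATH")
      "/etc/default_path.yaml"))

(resolve-path "/tmp/a.yaml" {"CONFIGURATION_PATH" "/tmp/b.yaml"}) ;; => "/tmp/a.yaml"
(resolve-path nil {"CONFIGURATION_PATH" "/tmp/b.yaml"})           ;; => "/tmp/b.yaml"
(resolve-path nil {})                                             ;; => "/etc/default_path.yaml"
```

In the daemon itself, (System/getenv) with no arguments returns the whole environment as a map and could be passed as env.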

Using it from your main function

Now that all the elements are there, starting the daemon only involves loading the configuration and handing the resulting protocol instances to our earlier reactor function.

(defn main
  [& [config-file]]
  (let [config     (config/init config-file)
        codec      (:codec config)
        store      (:store config)
        ;; :transports maps names to instances; the reactor iterates over the instances
        transports (vals (:transports config))
        reactor    (reactor transports codec store)]
    (start! reactor)))

By having reactor decoupled from the implementations of transports, codecs and the like, testing the meat of the daemon becomes dead simple; a reactor can be started with dummy transports, stores and codecs to validate its inner workings.

I hope this gives a good overview of simple techniques for building daemons in Clojure.