One of Alan Perlis' famous quote is
It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.
In a recent talk about reducers (available here), Rich Hickey playfully asserted that it might be even better to have 100 functions operate on anything. This article aims to show the effective way clojure offers to emulate common datastructures to use standard functions on pretend datastructures.
Let’s see how we can leverage the clojure transient interface to provide access to simple key/value stores. For the purpose of this article we’ll use everyone’s favorite outsourced heap: redis hand in hand with the carmine library.
This is what accessing redis looks like from carmine:
(require '[taoensso.carmine :as r :refer [wcar get set del]])
(def server-spec {}) ;; will connect to localhost
(wcar server-spec
(r/set "a" "b")
(r/get "a") ;; => "b"
(r/del "a"))
Ideally, we’d like to be able to treat redis like a standard transient map:
(def redis (instance->transient server-spec))
;; assuming we're starting from an empty redis instance
(pr-str redis)
;; => "{}"
(assoc! redis :a :a)
(:a redis)
;; => :a
(pr-str redis)
;; => "{:a :a}"
(assoc! redis :b {:a 0 :b 1})
;; => {:a :a, :b {:b 1, :a 0}}
(assoc! (:b redis) :c 2)
;; {:b 1, :a 0, :c 2}
(-> redis :b (dissoc! :c :b))
;; {:a 0}
(count redis)
;; => 2
(seq redis)
;; => ([:a :a] [:b {:a 0}])
(dissoc! redis :b)
(dissoc! redis :a)
;; => {}
(assoc! redis :c #{:foo :bar})
;; {:c #{:foo :bar}}
This article will walk you through how to achieve that, on top of the carmine library presented above.
Please note that while this makes for an interesting exercise with protocols, it is not recommended to use it extensively, some of the chosen strategies to make this example easier are suboptimal at best.
Basic building blocks
To achieve homomorphism, clojure relies on interfaces, they provide a uniform way to access resources implementing them.
Looking at clojure’s source code, a mere two interfaces provide the
necessary functionality for setting, retrieving and removing keys:
clojure.lang.ILookup
and clojure.lang.ITransientMap
:
/* src/jvm/clojure/lang/ILookup.java */
public interface ILookup{
Object valAt(Object key);
Object valAt(Object key, Object notFound);
}
/* src/jvm/clojure/lang/ITransientMap.java */
public interface ITransientMap extends ITransientAssociative, Counted{
ITransientMap assoc(Object key, Object val);
ITransientMap without(Object key);
IPersistentMap persistent();
}
In the above protocols:
valAt
provides lookups and is shared across transients and persistent structuresassoc
mutates a transient map to set a key to a valuewithout
mutates a transient map to remove a key
The simplest way to implement java interfaces or clojure
protocols is to make use of reify
which generates anonymous classes implementing a list of provide
protocol or interface.
With this, we can go on and write a fake transient:
(ns transient.redis
(:require [taoensso.carmine :as r :refer [wcarl]]))
(defn instance->transient
[spec]
(reify
clojure.lang.ILookup
(valAt [this k]
(wcar spec (r/get k)))
(valAt [this k default]
(or (.valAt this k) default))
clojure.lang.ITransientMap
(assoc [this k v]
(wcar spec (r/set k v))
this) ;; transients always return themselves on mutation.
(without [this k]
(wcar spec (r/del k))
this)))
This gives a nice and clean interface to redis for keys:
(def kv (instance->transient {})) ;; connect to localhost
(assoc! kv "a" "b")
(get kv "a")
(dissoc! kv "a")
Taking advantage of homoiconicity
One of the nice properties of clojure is its homoiconic nature, meaning that all standard data can be printed and read back in by the reader without loss of information. Clojure even now provides a way to add new types to the reader through [tagged literals](http://clojure.org/reader#The Reader–Tagged Literals).
A first improvement we can make to our implementation is to use
pr-str
and
read-string
as a cheap serializer/deserializer.
(ns transient.redis
(:refer-clojure :exclude [read-string])
(:require [taoensso.carmine :as r :refer [wcarl]]
[clojure.edn :refer [read-string]]))
(defn instance->transient
[spec]
(reify
clojure.lang.ILookup
(valAt [this k]
(when-let [res (wcar spec (r/get (pr-str k)))]
(read-string res)))
(valAt [this k default]
(or (.valAt this k) default))
clojure.lang.ITransientMap
(assoc [this k v]
(wcar spec (r/set (pr-str k) (pr-str v)))
this)
(without [this k]
(wcar spec (r/del (pr-str k)))
this)))
You’ll note that in the above, we’re using clojure.edn/read-string
instead of clojure.core/read-string
. This is because
clojure.core/read-string
is not safe for arbitrary inputs since it
might end up calling eval on input.
More protocols
We’ve now reached a good first step for our key value interface. We
could go a bit further though, since redis supports data-types which
resemble clojure’s. If we look at redis sets
, lists
and hashes
,
they map to clojure sets
, vectors
and maps
without to much
contorsion.
Fortunately, clojure provides transient versions of sets
, vectors
and maps
. To coerce our redis connection instance we will need to
implement a variety of interfaces described below.
Counted
This protocol should be implemented by all types and provides a way to yield the length of a collection.
public interface Counted {
int count();
}
Seqable
This protocol is implemented by any datastructure which can be coerced to a seq:
public interface ISeq {
ISeq seq();
}
When calling seq
on a map, it is expected to receive a list of
IMapEntry
structures as defined in the following interface:
public interface IMapEntry extends Map.Entry{
Object key();
Object val();
}
Map entries also happen to implement Counted
, Indexed
(see
below) and Seqable
. This is the protocol that allows you to write:
(map key {:a 0 :b 1})
Indexed
This protocol allows lookup in collections with
nth
:
public interface Indexed extends Counted{
Object nth(int i);
Object nth(int i, Object notFound);
}
ITransientVector
Much like ITransientAssociative
, ITransientVector
gives a way to
mutate on vector like structures:
public interface ITransientVector extends ITransientAssociative, Indexed{
ITransientVector assocN(int i, Object val);
ITransientVector pop();
}
ITransientSet
ITransientSet
completes the list of transient collections
public interface ITransientSet extends ITransientCollection, Counted{
public ITransientSet disjoin(Object key) ;
public boolean contains(Object key);
public Object get(Object key);
}
Redis Operations
Let’s now look at how the redis world can be mapped to the clojure world.
Redis Instance
The global redis instance can be seen as a map, just like in our
first example. If we want it to implement Seqable
and Counted
,
there is no other choice but to issue the redis command KEYS *
and
count the results for Counted
or map them to clojure values for
Seqable
.
We have already seen how to implement ILookup
and ITransientMap
above, but we’ll add a twist, when creating values, instead of
always using the SET
command, we can look at the type of value
we’re fed with
set?,
map?
and
sequential?
to create matching types in redis (set
, hash
or list
) while
still defaulting to string keys.
Likewise, when retrieving keys, we can use the redis TYPE
command
to lookup the key type and yield a transient vector, map or set when
we encounter the matching redis types.
without
does not need to change, since it works on any key type.
This gives us and updated instance->transient
:
clojure.lang.ILookup
(valAt [this k]
(let [k (pr-str k)
type (wcar spec (r/type k))]
(condp = type
"string" (read-string (wcar spec (r/get k)))
"hash" (hash->transient spec k)
"list" (list->transient spec k)
"set" (set->transient spec k)
"none" nil
(throw (ex-info "unsupported redis type" {:type type})))))
clojure.lang.ITransientMap
(assoc [this k v]
(let [k (pr-str k)]
(cond
(set? v) (doseq [member v] ;; Call SADD on all members
(wcar spec (r/sadd k (pr-str member))))
(map? v) (doseq [[subk v] v] ;; Call HSET on all entries
(wcar spec (r/hset k (pr-str subk) (pr-str v))))
(sequential? v) (doseq [e v] ;; Call LPUSH on all entries
(wcar spec (r/lpush k (pr-str e))))
;; Default to a plain SET
:else (wcar spec (r/set k (pr-str v)))))
this)
As explained above we can now also implement Counted
and
Seqable
:
clojure.lang.Counted
(count [this]
(count (wcar spec (r/keys "*"))))
clojure.lang.Seqable
(seq [this]
(let [keys (wcar spec (r/keys "*"))]
(for [k keys]
(->mapentry (read-string k) (.valAt this (read-string k))))))
Beware that calling KEYS *
is very suboptimal and should not be
done in real life scenarios.
We’re also missing the ->mapentry
function above, which can be
simply be:
(defn ->mapentry
[k v]
(reify
clojure.lang.Indexed
(nth [this i] (nth [k v] i)) ;; carry over
(nth [this i def] (nth [k v] i def)) ;; carry over to nth
clojure.lang.Seqable
(seq [this] (list k v)) ;; we know all elems
clojure.lang.Counted
(count [this] 2) ;; always two elems
clojure.lang.IMapEntry
(getKey [this] k) ;; IMapEntry extends Map.Entry
(getValue [this] v) ;;
(key [this] k)
(val [this] v)))
Redis Hashes
Redis hashes will implement the same interfaces than redis
instances: ILookup
, ITransientMap
, Counted
and Seqable
. The
logic will closely resemble our initial version. Lookups will be
done using HGET
, removals with HDEL
. To count elements or coerce
a hash to a seq, we can count on the HGETALL
command which yields
a list containing keys and values. Since the output list is
flattened we can use (partition 2)
to obtain the desired
structure:
(defn hash->transient
[spec k]
(reify
clojure.lang.ILookup
(valAt [this subk]
(when-let [res (wcar spec (r/hget k (pr-str subk)))]
(read-string res)))
(valAt [this subk default]
(or (.valAt this subk) default))
clojure.lang.ITransientMap
(assoc [this subk v]
(wcar spec (r/hset k (pr-str subk) (pr-str v)))
this)
(without [this subk]
(wcar spec (r/hdel k (pr-str subk)))
this)
clojure.lang.Counted
(count [this]
(count (partition 2 (wcar spec (r/hgetall k)))))
clojure.lang.Seqable
(seq [this]
(for [[k v] (partition 2 (wcar spec (r/hgetall k)))]
(->mapentry (read-string k)
(read-string v))))))
Redis Sets
Redis sets are very similar to hashes. Additions to the set are done
with SADD
, deletions with SREM
, we have an efficient way of
counting the set with SCARD
. SISMEMBER
will tell us if we have a
matching member in our set (it returns either 0 or 1, so we need to
coerce it to a boolean with pos?
). Last, SMEMBERS
will help in
implementing seq
:
(defn set->transient
[spec k]
(reify
clojure.lang.Counted
(count [this]
(wcar spec (r/scard k)))
clojure.lang.Seqable
(seq [this]
(map read-string (wcar spec (r/smembers k))))
clojure.lang.ITransientCollection
(conj [this v]
(wcar spec (r/sadd k (pr-str v)))
this)
clojure.lang.ITransientSet
(disjoin [this v]
(wcar spec (r/srem k (pr-str v)))
this)
(contains [this v]
(let [member (wcar spec (r/sismember k (pr-str v)))]
(pos? member)))
(get [this v]
(when (.contains this v)
v))))
Redis Lists
To finish off with our tour of structures, vectors and lists will be
mapped to redis lists. A list’s length is reported by LLEN
, we can
retrieve all members with LRANGE 0 -1
while LSET
, LPOP
and
LPUSH
will implement mutation operations:
(defn list->transient
[spec k]
(reify
clojure.lang.Counted
(count [this]
(wcar spec (r/llen k)))
clojure.lang.Seqable
(seq [this]
(map read-string (wcar spec (r/lrange k 0 -1))))
clojure.lang.ITransientCollection
(conj [this v]
(wcar spec (r/lpush k (pr-str v)))
this)
clojure.lang.ITransientVector
(assocN [this index v]
(wcar spec (r/lset k index v))
this)
(pop [this]
(wcar spec (r/lpop k))
this)))
Behaving like functions
A property of both sets and maps is to behave like functions which test for membership (on sets) and lookup keys (on maps).
This is done by yet another interface: IFn
:
public interface IFn extends Callable, Runnable{
public Object invoke() ;
public Object invoke(Object arg1) ;
public Object invoke(Object arg1, Object arg2) ;
/* ... */
}
We can add the signature to our set and map implementations:
;; hash->transient
clojure.lang.IFn
(invoke [this subk]
(.valAt this subk))
;; set->transient
clojure.lang.IFn
(invoke [this member]
(when (.contains this member)
member))
Our transients now fully behave like clojure datastructures!
The cherry on top: pretty printing
Our transient facade is getting there, but we’re still faced with a
problem: in doesn’t look good in the repl. Without going any further a
redis instance looks like this:
#<transient$instance$reify__8580 redis.transient$instance$reify__8580@3850ea4b>
.
Likewise, when dealing with sets, lists or maps.
By digging around, we find that printing is done with clojure.core/pr
which ultimately calls clojure.core/pr-on
, defined at
https://github.com/clojure/clojure/blob/eccff113e7d68411d60f7204711ab71027dc5356/src/clj/clojure/core.clj#L3532-L3544.
We can rely on print-method
’s ability to look either at the class of
an object or at its metadata to dispatch to the appropriate pretty
printer:
(defmulti print-method (fn [x writer]
(let [t (get (meta x) :type)]
(if (keyword? t) t (class x)))))
We can now write a dispatch method which expects our transient’s
metadata to contain the keys :type
, used by print-method
’s dispatch
function, prefix
and suffix
specify how to enclose the contents of
collections.
(defmethod print-method :redis
[obj ^java.io.Writer writer]
;; extract additional metadata
(let [{:keys [prefix suffix sep tuple?]} (meta obj)]
(.write writer prefix)
;; ensure we have elements to show
(when (pos? (count obj))
(loop [[item & items] (seq obj)]
;; handle map tuples differently
(if tuple?
(do
(print-method (key item) writer)
(.write writer " ")
(print-method (val item) writer))
(print-method item writer))
;; show separator when there are more elems
(when (seq items)
(.write writer (str sep " "))
(recur items))))
(.write writer suffix)))
With this in place we can add metadata to our closures:
(defn ->mapentry
[k v]
^{:type :redis :prefix "[" :suffix "]"}
...)
(defn hash->transient
[k v]
^{:type :redis :prefix "{" :suffix "}" :sep "," :tuple? true}
...)
(defn set->transient
[k v]
^{:type :redis :prefix "#{" :suffix "}"}
...)
(defn list->transient
[k v]
^{:type :redis :prefix "[" :suffix "]"}
...)
(defn instance->transient
[k v]
^{:type :redis :prefix "{" :suffix "}" :sep "," :tuple? true}
...)
Wrapping up
All the bits are now in place, and the example code shown above works as expected. I’ve posted the output here.
I didn’t publish a library on purpose since I don’t think you should use this for any serious work, but it does make for an interesting playground.
While testing this implementation I found out that contains?
cannot be
called on transient sets because of a limitation in the runtime, I
created a JIRA issue to discuss this here:
http://dev.clojure.org/jira/browse/CLJ-1581.