Bad Concurrency: 2010

While at QCon I sat through Stuart Halloway's talk on the Clojure time/concurrency model, which was very interesting. I had watched a copy of Rich Hickey's talk on the same subject some time ago. I'm not going to rehash the entire model here; if you are unfamiliar with it, it is best to head straight to the source. However, I am going to offer a few (small) criticisms. In fact, they are criticisms not of the model itself but of the way it is presented.

First off I would like to say that I think the approach the Clojure guys are taking is excellent. I am currently playing with a small prototype application that is based on similar principles. Admittedly I'm using Scala rather than Clojure, but that just shows that their model can be generalised to other languages easily.

Focus On The Model

One of the enabling features of Clojure's concurrency model is the Hash Array Mapped Trie, which allows for a path copy based structure to be used for persistent vector and dictionary type structures. What was not presented during the talk - maybe all Clojure developers know this already - is how the path copy metaphor can (and should) be extended to your entire object model.

Consider an account management service that provides a function for updating an individual account's post code. An object graph for such a service could look something like this:

After an update to the post code for a specific account - using immutable objects to represent the model - the resulting object graph would look like the following:

This closely follows the pattern displayed when updating one of the hash tries (q.v.) and retains the property that readers will always see a consistent view of the model no matter which part of the model the reader holds a reference to. If the reader needs a more up to date view of the object graph, it will have to re-enter the model through the accountRef atom. This brings me to my next point.
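To make the path copy idea concrete, here is a minimal Java sketch of the post code update over an immutable object model. The class names (Address, Account, AccountRepository, AccountService) and their fields are my own assumptions for illustration, not taken from the talk, and a shallow HashMap copy stands in for a real persistent map such as a hash trie.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// All class and field names below are hypothetical, chosen for illustration.
final class Address {
    final String street;
    final String postCode;

    Address(String street, String postCode) {
        this.street = street;
        this.postCode = postCode;
    }

    // Returns a new Address; the original is left untouched.
    Address withPostCode(String newPostCode) {
        return new Address(street, newPostCode);
    }
}

final class Account {
    final long id;
    final Address address;

    Account(long id, Address address) {
        this.id = id;
        this.address = address;
    }

    Account withAddress(Address newAddress) {
        return new Account(id, newAddress);
    }
}

final class AccountRepository {
    // Never mutated after construction.
    private final Map<Long, Account> accounts;

    private AccountRepository(Map<Long, Account> accounts) {
        this.accounts = accounts;
    }

    static AccountRepository of(Account... initial) {
        Map<Long, Account> map = new HashMap<>();
        for (Account account : initial) {
            map.put(account.id, account);
        }
        return new AccountRepository(map);
    }

    Account get(long id) {
        return accounts.get(id);
    }

    // Rebuilds the path root -> account -> address for the changed post code.
    // Untouched Account objects are shared between the old and new repository.
    // Assumption: the shallow HashMap copy duplicates every entry reference,
    // whereas a real persistent map would copy only a few trie nodes.
    AccountRepository withPostCode(long id, String newPostCode) {
        Account old = accounts.get(id);
        if (old == null) {
            return this; // unknown account: nothing to change
        }
        Map<Long, Account> copy = new HashMap<>(accounts);
        copy.put(id, old.withAddress(old.address.withPostCode(newPostCode)));
        return new AccountRepository(copy);
    }
}

final class AccountService {
    // The single entry point into the model, playing the role of accountRef.
    final AtomicReference<AccountRepository> accountRef;

    AccountService(AccountRepository initial) {
        this.accountRef = new AtomicReference<>(initial);
    }

    void updatePostCode(long id, String newPostCode) {
        accountRef.updateAndGet(repo -> repo.withPostCode(id, newPostCode));
    }
}
```

A reader that grabbed the repository before the update keeps seeing the old post code, while a reader that re-enters through accountRef sees the new one. Both views are internally consistent.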

Identity vs. Entry Point

One of the questions that I asked was whether there were any real applications built using this model. The response mentioned two, one being a web framework. However, in both cases those systems only had a single reference, i.e. a single identity. When considering the concurrency model this makes perfect sense, but from a data modelling or domain modelling perspective the concept of identity is closely tied to the notion of entities. This use of terminology suggests that you should implement a system that puts every entity behind a reference, and while this may sound appealing initially, it has two negative effects. Firstly, it clutters your domain model with an artificial construct, mixing an infrastructure concern (concurrency) with your domain logic. It is generally accepted that separation of concerns is a good thing, so heavily mixing concerns can be considered bad. Secondly, an operation that spans multiple entities is difficult to make consistent if all of the entities have individual references. For example, reading threads will be able to see the result of partially applied operations unless you apply some extensive and complex bookkeeping to ensure that references are made visible in the right order. There is also a performance cost, but I talk about that later.

Using 'Identity' feels wrong, as it adds confusion due to existing definitions and usages of the term. I think a better term is 'Entry Point' or, from Domain Driven Design, 'Aggregate Root', as this is closer to what is actually happening when the code interacts with the model. Another option would be to break the strong linkage between the concept of Identity and the use of Refs to represent them. Using the account service example above, the account repository provides a point within the domain model that code can enter and then reach other entities within that model. Maintaining the reference at the level of the repository allows operations that modify a number of entities below that aggregation point to be made visible as a single atomic action, providing simple, clean transaction semantics.
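As a rough illustration of the single entry point idea, the Java sketch below keeps one reference at the root and publishes a change that touches two entities in one step. The Ledger and LedgerEntryPoint classes and the notion of account balances are hypothetical, invented here only to give the operation something to span. A shallow HashMap copy again stands in for a persistent map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable snapshot of all balances, keyed by account id.
final class Ledger {
    // Never mutated after construction.
    private final Map<Long, Long> balances;

    private Ledger(Map<Long, Long> balances) {
        this.balances = balances;
    }

    static Ledger empty() {
        return new Ledger(new HashMap<>());
    }

    long balanceOf(long id) {
        return balances.getOrDefault(id, 0L);
    }

    long total() {
        long sum = 0;
        for (long balance : balances.values()) {
            sum += balance;
        }
        return sum;
    }

    Ledger withBalance(long id, long amount) {
        Map<Long, Long> copy = new HashMap<>(balances);
        copy.put(id, amount);
        return new Ledger(copy);
    }

    // Touches two entities but produces one new snapshot.
    Ledger transfer(long from, long to, long amount) {
        Map<Long, Long> copy = new HashMap<>(balances);
        copy.merge(from, -amount, Long::sum);
        copy.merge(to, amount, Long::sum);
        return new Ledger(copy);
    }
}

// The entry point: the only place a reference is held.
final class LedgerEntryPoint {
    private final AtomicReference<Ledger> root;

    LedgerEntryPoint(Ledger initial) {
        this.root = new AtomicReference<>(initial);
    }

    // Readers get a consistent snapshot; they never see half a transfer.
    Ledger snapshot() {
        return root.get();
    }

    // Debit and credit become visible together, in a single reference swap.
    void transfer(long from, long to, long amount) {
        root.updateAndGet(ledger -> ledger.transfer(from, to, amount));
    }
}
```

Had each account sat behind its own reference, a reader could observe the debit without the credit. With the reference held only at the root, the total seen by any reader is the same before and after the transfer.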

It's Not Free

One of the statements that irked me the most was that using Atoms, STM or Agents is free from a read perspective. It's fast, cheap, non-blocking and runs in user-space, but it is NOT free. Using Atoms as an example, the swap! function on the Atom uses an AtomicReference to compare and swap the values after a change has occurred. On the metal this is using a machine level compare and exchange operation (on Intel this is a LOCK CMPXCHG). In order to ensure visibility of the changes the CPU has to take out a memory bus lock (or cache lock on newer x86 CPUs) and flush the pipeline. Therefore, if your reading thread happens to dereference the atom (or potentially perform any other operation) at that moment, it won't be able to have its load instruction pipelined along with the write. The slowdown is small (and getting smaller on newer CPUs, e.g. Nehalem's CMPXCHG instruction is 40% faster than Core 2's) but it means the read can't be considered as cheap as a read of a normal non-volatile object reference. In addition, the reference within an AtomicReference is declared volatile. Volatile variables are accessed differently to standard variables in that the JVM generates instructions that enforce ordering, which restricts both the compiler's (Hotspot) and the CPU's ability to optimise said instructions. I have anecdotal evidence of code littered with volatile references slowing down significantly.
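The compare and swap retry loop described above can be sketched in Java as follows. SimpleAtom is a hypothetical class written for this post in the spirit of swap!; it is not Clojure's actual Atom implementation and it omits validators and watches.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical, stripped-down atom built on AtomicReference.
final class SimpleAtom<T> {
    private final AtomicReference<T> state;

    SimpleAtom(T initial) {
        this.state = new AtomicReference<>(initial);
    }

    // A read is a volatile load: cheap, but not the same as a plain field read.
    T deref() {
        return state.get();
    }

    // Applies f to the current value and retries until the compare and set wins.
    // f may run more than once, so it should be free of side effects.
    T swap(UnaryOperator<T> f) {
        for (;;) {
            T current = state.get();
            T next = f.apply(current);
            // compareAndSet checks reference identity against 'current'.
            if (state.compareAndSet(current, next)) {
                return next;
            }
        }
    }
}
```

For example, several threads calling counter.swap(n -> n + 1) on a SimpleAtom<Integer> will never lose an increment, at the price of a compare and set on every write and a volatile read on every deref.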

The other area around performance is the use of completely immutable structures to represent your domain model. Before I get flamed into oblivion, I'm not going to make a blanket statement that mutable structures are faster than immutable ones. Before making a judgement it is worth ensuring you understand the behaviour of your own program, specifically the read/write bias. If you have a very high write bias (as in a financial exchange) there is a cost to using purely immutable structures. There is a significant memory allocation and copying hit on a write, plus the system will create a lot of garbage (which may cost you in GC pauses). As operations within your application shift toward a read bias, immutable structures make a lot more sense as the data can be shared.

After 3 years of hard work (admittedly only about 1.5 for me), my place of work has finally launched its flagship product LMAX Trader. It's been really exciting to see the fruits of our labour go live and to see some of the reaction in the press. The marketing messaging talks about being the world's first multi-asset retail exchange, real time margining etc, etc.

What's even more interesting is the technology. Some challenging latency and throughput requirements have led us to take a very back-to-basics approach to the design and eschew most of the typical solutions in the enterprise software space. The back end is heavily asynchronous with a funky high-performance reliable messaging system and custom (journal-based) persistence. For retail users almost all of the data is delivered over long poll/comet to a single page GWT UI.

It's an incredibly interesting place to work (today our B.A. started modelling our client accounting system using Feynman diagrams). Now that we're live I'm hoping to blog about some of the things I've been working on.

The CTO and I are off to San Francisco next week to speak at QCon (under the Architecture Anarchists track) about some of the challenges we faced and some of the solutions we devised. It will hopefully be interesting for those interested in HPC and concurrency.

Friday, 17 December 2010

QCon Talk Available Online

Saturday, 20 November 2010

Clojure's Time/Concurrency Model - A Gentle Critique

Sunday, 14 November 2010

Talk Slides Available

Wednesday, 27 October 2010

LMAX Launch

Thursday, 24 June 2010

Long time between updates