Multiverse: Concurrency, STM & MVCC

March 19, 2009

How does Multiversion Concurrency Control work in Multiverse? At the moment it works like this: each object that lives in STM space and is attached to a transaction (directly or indirectly) is dehydrated when the transaction commits and the object is dirty. All dehydrated/hydrated STM objects have a unique id (a long), and to make searching on this id fast, the dehydrated objects are stored in a balanced binary tree. This immutable balanced tree is a stable view of reality: a snapshot. It can be shared between transactions since there is no mutable state, so readers don't block writers and writers don't block readers. If a transaction needs an instance of a previously committed object, a new hydrated object is created based on the dehydrated object and registered with the transaction, to make sure that multiple reads of the same id give the same instance. When a transaction starts, it gets a reference to a snapshot and automatically gets transaction-level read consistency (so it can't observe changes made by other transactions, only its own).
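To make this concrete, here is a minimal sketch (with hypothetical names, not the real Multiverse API) of how a transaction could hydrate objects from the immutable snapshot it received at the start, caching them so that repeated reads of the same id return the same instance:

import java.util.HashMap;
import java.util.Map;

final class SketchTransaction {

    // Immutable view of the heap taken when the transaction started.
    private final Map<Long, Dehydrated> snapshot;
    // Hydrated instances created so far, so every read of an id yields the same object.
    private final Map<Long, Object> hydrated = new HashMap<Long, Object>();

    SketchTransaction(Map<Long, Dehydrated> snapshot) {
        this.snapshot = snapshot;
    }

    Object read(long id) {
        Object instance = hydrated.get(id);
        if (instance == null) {
            Dehydrated stored = snapshot.get(id);
            if (stored == null) {
                return null; // unknown id in this snapshot
            }
            instance = stored.hydrate(this);
            hydrated.put(id, instance);
        }
        return instance;
    }

    interface Dehydrated {
        Object hydrate(SketchTransaction tx);
    }
}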

Non-destructive updates to the heap

But does Multiverse need to copy the entire tree when it does a commit? No, Multiverse uses the same technique a pure functional programming language uses to create a tree: non-destructive updates. Only the updated node in the tree, and the path leading to it, need to be replaced. So if you have a balanced tree of 1000 items and you need to do an update, at most about 10 nodes need to be replaced (2^10 ≈ 1000, so the path from the root is roughly 10 nodes deep).
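A minimal sketch of such a non-destructive update, assuming for brevity an unbalanced binary search tree keyed by id (the real tree is balanced, but the path-copying idea is the same): the insert copies only the nodes on the path from the root down to the change, everything else is shared between the old and the new tree.

final class PersistentTree {

    final long id;
    final Object value;
    final PersistentTree left;
    final PersistentTree right;

    PersistentTree(long id, Object value, PersistentTree left, PersistentTree right) {
        this.id = id;
        this.value = value;
        this.left = left;
        this.right = right;
    }

    // Returns a new tree that contains the update; the receiver is left untouched.
    static PersistentTree insert(PersistentTree node, long id, Object value) {
        if (node == null) {
            return new PersistentTree(id, value, null, null);
        } else if (id < node.id) {
            return new PersistentTree(node.id, node.value,
                    insert(node.left, id, value), node.right);
        } else if (id > node.id) {
            return new PersistentTree(node.id, node.value,
                    node.left, insert(node.right, id, value));
        } else {
            return new PersistentTree(id, value, node.left, node.right);
        }
    }
}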

Concurrent transactions

Transactions can be committed concurrently as long as there are no write/write conflicts. Read/write conflicts are not detected, so, just like Oracle, Multiverse is not completely serializable (for now).
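A minimal sketch (assuming a hypothetical per-object version number) of what the write/write conflict check at commit time could look like; detecting read/write conflicts would require validating the whole read set as well, which is exactly the part that is skipped here:

import java.util.Map;

final class ConflictCheckSketch {

    // committed:  id -> version currently in the active snapshot
    // written:    id -> version the transaction observed when it first read the object
    static boolean hasWriteWriteConflict(Map<Long, Long> committed, Map<Long, Long> written) {
        for (Map.Entry<Long, Long> write : written.entrySet()) {
            Long committedVersion = committed.get(write.getKey());
            if (committedVersion != null && !committedVersion.equals(write.getValue())) {
                return true; // someone else committed this object after we read it
            }
        }
        return false;
    }
}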

Concurrent commits

And what about concurrency? Does Multiverse rely on locking to prevent concurrent commits? No: Multiverse is almost lock-free (in the following weeks the last locks will be removed). When a transaction commits, a new snapshot is built based on the changes and the active snapshot, and the new snapshot is activated as the active snapshot. This is done within a CAS (Compare And Swap) loop, because another transaction may have finished committing (and replaced the active snapshot) earlier. The advantage of this approach is that all changes are isolated and atomic: they appear at once, or not at all. A rollback is nothing more than not creating a new snapshot.
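A minimal sketch (hypothetical types, not the real Multiverse internals) of such a commit: the new snapshot is built non-destructively from the active one and published with compareAndSet; if another transaction published first, the loop rebuilds against the fresh snapshot and tries again. The write/write conflict check from the previous sketch would run against the current snapshot inside this loop.

import java.util.concurrent.atomic.AtomicReference;

final class CommitSketch<S, C> {

    interface SnapshotFactory<Snap, Change> {
        // Non-destructive: returns a new snapshot, the active one stays untouched.
        Snap apply(Snap activeSnapshot, Change changes);
    }

    private final AtomicReference<S> active;
    private final SnapshotFactory<S, C> factory;

    CommitSketch(S initialSnapshot, SnapshotFactory<S, C> factory) {
        this.active = new AtomicReference<S>(initialSnapshot);
        this.factory = factory;
    }

    void commit(C changes) {
        while (true) {
            S current = active.get();
            S updated = factory.apply(current, changes);
            if (active.compareAndSet(current, updated)) {
                return; // the whole change set becomes visible at once
            }
            // Another commit won the race; retry against the fresh active snapshot.
        }
    }
}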

Livelocking

What about livelocking, does Multiverse suffer from livelocking? Yes, lock-free algorithms can suffer from livelocking: if a lot of transactions commit at the same time, it could happen that many of them keep running the CAS loop and do a lot of unnecessary work. Especially longer transactions could starve. The holy grail for Multiverse will be creating a wait-free commit. Another level where livelocking can happen is when there is a write conflict and a transaction needs to be retried. By setting a maximum number of retries (in the future other policies will be added) you can control the amount of livelocking.
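A minimal sketch (hypothetical API) of that retry bound: a transaction that keeps hitting write conflicts is retried a limited number of times and then given up on, which caps the amount of livelocking.

final class RetrySketch {

    interface TransactionalTask {
        boolean attempt(); // returns false when the commit fails on a write conflict
    }

    static void execute(TransactionalTask task, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (task.attempt()) {
                return; // committed successfully
            }
            // Write conflict: start a fresh transaction and try again.
        }
        throw new IllegalStateException("too many retries, giving up to limit livelocking");
    }
}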

Deadlocking

At the moment Multiverse doesn't support blocking locks and is (almost) lock-free, so a deadlock can't happen. When blocking locks are added, deadlock detection will be added as well.

retry

What about notification? At the moment Multiverse supports the retry. The implementation is quite simple: when a transaction aborts with a retry, a RetryError is thrown. The TransactionTemplate (just like a Spring HibernateTemplate) catches this RetryError and retrieves all the ids that have been read by the transaction. After that, a Latch is registered on all addresses that have been read, and the template waits till this latch is opened. Once the latch is opened, a new transaction begins. Some latch implementations also support timed waiting, which makes it possible to prevent waiting indefinitely.
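A minimal sketch (hypothetical names, not the real TransactionTemplate) of that mechanism, using a CountDownLatch: when the transaction wants to retry, a latch is registered on every address it has read, the template blocks until a writer opens it, and then a new transaction begins. The timed wait shows how waiting indefinitely can be prevented.

import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class RetryTemplateSketch {

    interface Registry {
        // A committing writer that touches one of these ids opens the latch.
        void registerLatch(Set<Long> readIds, CountDownLatch latch);
    }

    interface TransactionBody {
        // Returns the ids read so far when it wants to retry, or null when it committed.
        Set<Long> runOrRetry();
    }

    static void execute(TransactionBody body, Registry registry,
                        long timeout, TimeUnit unit) throws InterruptedException {
        while (true) {
            Set<Long> readIds = body.runOrRetry();
            if (readIds == null) {
                return; // committed
            }
            CountDownLatch latch = new CountDownLatch(1);
            registry.registerLatch(readIds, latch);
            // Timed waiting prevents blocking forever if nothing in the read set ever changes.
            if (!latch.await(timeout, unit)) {
                throw new IllegalStateException("timed out waiting for a change on the read set");
            }
        }
    }
}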

orElse

What about the orElse? The orElse is a little bit more difficult. The TransactionTemplate could catch the RetryError in a try/catch and continue in the catch block. The problem is that changes made in the try block are not rolled back (Multiverse can't roll back changes inside objects). Although this would be a very well performing solution (the RetryError is reused because creating a stack trace is very expensive, and error handling is just jumping to an instruction), it introduces some strange semantics. So a different solution needs to be found here. For the nerds: in the past I have written some Prolog compilers based on the WAM, and this is just a choicepoint.
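For illustration, a minimal sketch (hypothetical names) of the rejected try/catch approach; the flaw described above is visible directly: whatever the first branch already changed is still there when the second branch runs, because those changes are not rolled back.

final class OrElseSketch {

    // Stand-in for the reused RetryError: skipping the stack trace keeps throwing it cheap.
    static final class RetryError extends Error {
        @Override
        public Throwable fillInStackTrace() {
            return this;
        }
    }

    static void orElse(Runnable first, Runnable second) {
        try {
            first.run();            // may abort by throwing the reused RetryError
        } catch (RetryError retry) {
            second.run();           // runs on top of the first branch's un-rolled-back changes
        }
    }
}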

Optimistic locking

What about optimistic locking? Well, this implementation is optimistic by nature: write conflicts are only detected at the end, and the CAS loop is in essence an optimistic lock.

Pessimistic locking

What about pessimistic locking? Just like MVCC databases, pessimistic locking is going to be added. And just like in databases, locks are automatically released when the transaction commits. I'm also thinking about a read/write lock with an upgrade that detects if another transaction received the write lock this transaction was waiting for… so something that behaves differently from the ReadWriteLock in java.util.concurrent. For example:

Transaction t = stm.startTransaction();

Person person = (Person) t.read(handle);
PessimisticLock personLock = t.getLock(person);
personLock.acquireExclusiveNoWait();
// ... do stuff on person

Or the shorter approach:

Transaction t = stm.startTransaction();
Person person = (Person) t.readAndLockNoWait(handle, LockMode.exclusive);
// ... do stuff on person

And unlike Oracle, where an update, insert or delete automatically leads to a pessimistic lock in the database, Multiverse doesn't automatically lock objects yet. Automatic locking requires listening to every write on an STM object field, and this reduces performance. But perhaps in the future it will be optional (the instrumentation of POJOs needs to change for that).


Cutting costs: Layering good, tiers bad

March 17, 2009

From time to time I see old-school software architectures at customers where all the layers of the applications (the most common ones are ui, business logic, persistence) are placed in separate tiers (different machines). There are a lot of problems with this approach:

  1. increased latency because of all the marshalling/unmarshalling and network communication.
  2. increased development time because of all the marshalling/unmarshalling code (so DTOs, remote and local interfaces, etc).
  3. increased hardware costs: often you see as many front-end machines as back-end machines (which don't do much essential work, but spend a lot of time waiting and marshalling/unmarshalling). So instead of 2×2 machines, for example, you get more uptime with 3×1 machines and lower hardware costs.
  4. increased license costs since you need to run extra servers.
  5. increased maintenance costs since you need to maintain more machines.
  6. increased infrastructure costs since there are more machines to connect.
  7. increased troubleshooting costs: in these environments it is much harder to figure out what is going wrong because there is so much communication going on.

So instead of hosting the layers on different machines, it is often, imho, better to host those layers on the same hardware or even in the same JVM. That doesn't mean this is the only approach, but I have seen too many applications where this useless separation into tiers has led to a lot of extra cost. So it certainly is something worth considering if you need to cut costs in these financially difficult times.


Going to speak at NLJUG J-Spring 2009

March 8, 2009

I just received confirmation that my proposal to speak at J-Spring 2009 has been accepted. The subject is going to be "Transactional Memory, the next gc". I'm going to explain how Transactional Memory can simplify and complicate our lives 🙂 And I'm probably going to explain some details of Multiverse, a Java-based Software Transactional Memory implementation I'm working on.

In November 2008 I spoke at J-Fall about the JMM and it was a great experience. So I hope to see some of you there 🙂