Kickass hardware for Multiverse up and running

June 18, 2009

A few weeks ago I received the last components of my dual Xeon 5500 configuration for Multiverse and I have been working on a framework for doing microbenchmarks. This framework has produced some nice diagrams that provide a lot of insight.

One of the things I wanted to see is how well the system scales with and without contention. The graph below shows the result of a benchmark for various levels of contention for updating a counter stored in the stm:
stm_completerangeofsharedandunshared

The slowest one is one with contended shared state: all threads share the same counter The blue one is a bit better because each thread gets his own counter. The green one is the same as the blue one but instead of using instrumentation, the classes are ‘instrumented’ manually. And the last graph where each counter is placed in his own stm.

The big questions are:

  1. why is the performance difference between manual and real instrumentation so big. It can be partially explained by not having to access expensive fields like volatile (for the global stm) and threadlocals (current transaction)
  2. why does the system scale so bad when transactions don’t share state, only the stm. The only point of contention is the shared logical clock (an AtomicLong). So I would expect a more linear growth.

So a lot of hard questions to answer.

The benchmark framework is going to be integrated with the Continuous integration server so we have a continuous feedback on performance. One of the features that needs to be added is trend analysis, but since the data already is there, this should not be a difficult requirement to implement.

Update
It appears that statics were still activated on the non manual instrumented version. Statistics cause a lot of cas updates. stm_completerangeofsharedandunshared

As you can see the difference between the manual and the automatic ones has decreased. So that partially solves the first question.


Multiverse 0.2 release nearing completion

May 23, 2009

Multiverse is a Java based STM implementation and the 0.2 version is nearing completion and I hope to finish it before 1 July. The 0.2 version is going to be a very usable implementation, unlike the 0.1 release, where most configuration can be done by placing some annotations. For more information about annotation based configuration, see the following tutorial.

Goal for this release

The goal of this release it to make Multiverse really easy to use, so that users don’t get scared (there is enough time for that) and let them realise that Software Transactional Memory can lead to a complete change in the way they write concurrent systems.

Todo and future plans

To see what needs to be done, check the following page. If you want to see what can be expected in the future, check the backlog. In the next release improving performance and scalability, and support for writing efficient STM data-structures, are going to be the most important goals.

Moving to Codehaus

I hope to move Multiverse to Codehaus as soon as the proposal is accepted. Google code is nice, but Codehaus has better Java facilities (continuous integration server, Licenses for Java tooling etc).


Multiverse needs a community

May 16, 2009

Multiverse is a Java based Software Transactional Memory implementation I have worked on for the last half year and I made a lot of progress and gained a lot of new insights. But there is too much to do for me alone and I would really like have some people on the project to learn from/teach things to/have fun with. So if you are interested in joining the community, have experience with concurrency control/instrumentation or would like to know a hell of a lot more about it, please send an email to alarmnummer at gmail.com.


The Springforum has lost its pazzaz

May 16, 2009

I have been reading and posting on the Spring forum since before it became mainstream, to be specific I joined on 22 November 2004. And under the name ‘Alarmnummer’ I have placed more than 1000 posts (in most cases answers to questions).

But the quality of the forum has dropped dramatically. Perhaps it is caused by:

  1. Spring becoming yet another bloated environment
  2. closed design by Spring-team and not the community
  3. Spring getting too commercial

Thereby loosing touch with the community and especially with smart early adaptors.

Most questions I see on the forum are not about design any more, but in a lot of cases RTFM (Read the Fucking Manual) or UTFS (Use the Fucking Search) and not fun to answer at all.


Some STM Thoughts part II

May 15, 2009

In my spare time I have been working onMultiverse; to be specific on the instrumentation of POJO’s. A crude initial version is working and dropping the manual instrumentation of classes is a dream come true. But there is a lot left to do:

  1. overcoming current instrumentation limitations like dealing with statics, inner classes, private and final fields.
  2. optimisations like unmanaged reads to prevent unnecessary transaction overload. This could be done by escape analysis in the future, but I have to settle for annotations for now. Another optimisation is making final objects completely free to work with, so no extra STM caused CPU or memory usage. At the moment the STM doesn’t see the difference, so it tries to manage them and this causes the unnecessary usage.

Some thoughts about making the stm distributed:

  1. With clock based STM’s, the maximum transaction speed is limited to the the maximum speed the clock run with. Because the read and write of this shared clock needs to be done atomically over the network. If the network action alone takes 1 ms the maximum number of transactions are 1000 per second. The same can be concluded if you apply ,Amdalah’s Law if you see network communication as sequential fractions that limit the maximum speed up. I guess it is also time to have a more thorough look at the work of Leslie Lamport.
  2. The big difference is that distributed systems and non distributed systems is that the former is a lot less reliable, and this means that a STM implementation needs to deal with failure of notification. This could complicate matters like performance and the holy grail of software design: simplicity.

Buying some kickass hardware for Multiverse

April 23, 2009

This week I’m going to order my kick ass system for Multiverse, a STM implementation I’m working on. 2 Intel Xeon E5520 cpu’s (4 cores each + hyperthreading -> 16 virtual processors in total!). 12 Gig DDR3 1066 Mhz ram, a super fast Intel 32 gigabyte SSD and 2x 1.5 terabyte old school disk drives. And it is going to run Linux and perhaps even Solaris (I want to play with D-trace).

As soon as I have this up and running (and moved it somewhere so I don’t have to look at it) I can run some serious performance and scalability tests. At the moment I only have a dual core laptop for this purpose and that is not sufficient.

And if I’m lucky, I can upgrade the cpu’s in the near future to 8 cores each (resulting in 32 virtual processors!).


MVCC & TL2: concurrency

April 22, 2009

Last month I have been studying TL2 (Transactional Locking 2) and one thing occurred to me: the difference between MVCC and TL2 isn’t that big. The biggest reason to prefer MVCC above traditional lock based concurrency control systems, is that the former allows higher concurrency. With traditional lock based concurrency control systems, to provide a consistent view (so a transaction won’t observe changes made by other transactions executing in parallel) records are locked as they are encountered (encounter time locking). This leads to writers blocking readers, and readers blocking writers.

With MVCC there is no need to rely on locking to provide a consistent view; instead of relying on locks, previous versions of a record can be reconstructed for reading purposes. This means that only writers block writers, and this should have a positive impact on concurrency. For MVCC it doesn’t matter if you use encounter time locking or commit time locking (databases typically use encounter time locking as soon as a insert/delete/update occurs).

In TL2 a field/object (depending on the granularity of the Transactional Memory) doesn’t need to be locked or reconstructed, because as soon as field is read, it is localized: a transaction gets its own private copy (TL2 uses commit time locking). This serves the following consistency purposes:

  1. a read done by a transaction, will never see writes made by other transactions executing in parallel.
  2. a read done by a transaction, will always see the writes made by itself

So just as MVCC, with TL2 readers don’t block writers and a writers don’t block readers. But unlike MVCC, instead of revering back to an older version if a ‘too’ new version is found, the transaction is retried. And this in principle looks a lot like the ORA-01555 Snapshot Too Old error from Oracle.


Multiverse and the JMM

April 18, 2009

You don’t need to worry about the JMM while using Multiverse. The JMM is defined in happens before rules, and Multiverse provides a happens before relation on a transaction commit and transaction start. It probably will be based on a CAS write and read of object state. A CAS write/read can be compared to a volatile write/read and the last one is one of the before rules. So one of the things you can say goodbye to is using volatile variables.


Some STM thoughts, Part I

April 14, 2009

Last 5 months I have been working on Multiverse and I have gained a lot of new insights:

  1. Concurrency control in an stm is nothing else than saying something about the ordering of transaction execution.
    If there is no shared state, the result of executing transactions will be the same no matter the order, so they are allowed to execute concurrently. If there is shared state, the result of the transactions needs to be the same as one of the 2 possible sequential executions. So concurrency control is creating a partial ordering in the execution of transactions.
  2. Locking essentially leads to an ordering of concurrent executing transactions.
  3. A write-conflict is nothing more than an illegal interleaving of concurrent executing transactions. A write-conflicts happens when an another transaction writes to one or more shared addresses after the current transaction starts and before it commits. A write-conflict would be impossible if transactions execute sequentially.
  4. A deadlock also is nothing more than an illegal interleaving of concurrent executing transactions. Since there is shared state (the locks for example) one transaction needs to execute before the other, or after the other. In both cases a deadlock is impossible because one transaction has finished completely, and releases its locks, before the other begins.

Multiverse: Concurrency, STM & MVCC

March 19, 2009

How does Multiversion Concurrency Control work in Multiverse. At the moment it works like this: each object that lives in STM-space, and is attached to a transaction (directly or indirectly) is dehydrated when the transaction commits and is dirty. All dehydrated/hydrated stm objects have a unique id (a long) and to make fast searching on this id possible by storing the dehydrated objects in a balanced binary tree. This immutable balanced tree is a stable view of reality: a snapshot. And can be shared between transactions since there is no mutable state. So readers don’t block writers and writers don’t block readers. If a transaction needs an instance of a previously committed object, a new hydrated object is formed based on the dehydrated object and registered at the transaction to make sure that multiple reads of the same id, give the same instance. When a transaction starts, it gets a reference to a snapshot and automatically gets transaction level read consistency (so it can’t observe changes made by other transactions, only his own).

Non destructive updates to the heap

But does Multiverse need to copy the entire tree when it does a commit? No, Multiverse uses the same technique as a pure functional programming language uses to create a tree: non destructive updates. Only the updated node in the tree, and the path to this, need to be replaced. So if you have a tree of 1000 items and you need to do an update, at most 10 nodes (2^10 makes 1000) need to be replaced.

Concurrent transactions

Transaction can be committed concurrently as long as there are no write/write conflicts. Read/write conflicts are not detected, so just like Oracle Multiverse is not completely serialised (for now).

Concurrent commits

And what about concurrency? Does Multiverse rely on locking to prevent concurrent commits? No: Multiverse almost is lock free (in the following weeks the last locks will be removed). When a transaction commits, a new snapshot is build based on the changes and the active snapshot, and actives the new snapshot as active snapshot. This is done within a CAS (Compare and Swap) loop, because another transaction maybe has finished committing (and replacing the active snapshot) earlier. The advantage of this approach is that all changes are isolated and atomic: they will appear at once, or not at all. A rollback is nothing more than not creating a new snapshot.

Livelocking

What about livelocking, does Multiverse suffer from livelocking? Yes, lock free algorithms could suffer from livelocking: if a lot of transaction commit at the same time, it could happen that a lot of them keep running the CAS loop and do a lot of unnecessary work. Especially longer transactions could starve. The holy grail of Multiverse will be creating a Wait-Free commit. Another level where livelocking can happen is when there is a writeconflict and a transaction needs to be retried again. By setting a maximum number of retries (in the future other policies are added) you can control the amount of livelocking

Deadlocking

At the moment Multiverse doesn’t support blocked locking and is (almost) lock free, so a deadlock can’t happen. When blocked locking is going to be added, deadlock detection will be added.

retry

What about notification? At the moment Multiverse supports the retry. The implementation is quite simple; when a transaction aborts with a retry, a RetryError is thrown. The TransactionTemplate (just like a Spring HibernateTemplate) catches this RetryError and retrieves all the id’s that have been read by the transaction. After this is done, a Latch is registered to all addresses that have been read, and waits till this latch is opened. Once the latch is opened, a new transaction begins. Some latch implementations also support timed waiting, this makes it possible to prevent waiting indefinitely.

orElse

What about the orElse? There orElse is a little bit more difficult. The TransactionTemplate could catch the RetryError in a try/catch and continue in the catch block. The problem is that changes that have been made in the try block are not rolled back (Multiverse can’t roll back changes inside objects). Although this would be a very good performing solution (the RetryError is reused because creating a stacktrace is very expensive and error handling is just jumping to an instruction), it introduces some strange semantics. So a different solution needs to be found here. For the nerds: in the past I have written some Prolog compilers based on the WAM, and this is just a choicepoint.

Optimistic locking

What about optimistic locking? Well, this implementation by nature is optimistic; write conflicts are only detected at the end and the CAS loop in essence is an optimistic lock.

Pessimistic locking

What about pessimistic locking? Just like MVCC databases, pessimistic locking is going to be added. And just like databases lock are automatically released when the transaction commits. And I’m thinking about a read/writelock (with a upgrade that detects if another transaction received the writelock this transaction was waiting for…. so something that works unlike the ReadWriteLock from java.util.concurrent). For example:

Transaction t = stm.startTransaction();

Person person = (Person)t.read(handle);
PessimisticLock personLock = t.getLock(person);
personLock.acquireExclusiveNoWait();
..do stuff on person.

Or the shorter approach

Transaction t = stm.startTransaction();
Person person = (Person)t.readAndLockNoWait(handle, LockMode.exclusive);
..do stuff on person.

And unlike oracle, where an update,insert,delete automatically leads to a pessimistic lock in the database, multiverse doesn’t automatically lock objects yet. Automatic locking requires listening to every write on a stm object field, and this reduces performance. But perhaps in the future it will be optional (instrumentation of pojo’s needs to change).