With each release of Java, the performance of the virtual machine improves. Since the introduction of the new Java Memory Model (JSR-133), it is well defined which compiler optimizations are and are not allowed. But a lot of improvements have also been made to the performance of locking itself, to name a few:
- adaptive spinning: instead of immediately suspending the thread with an expensive call, just spin a few times and maybe the lock becomes available. Based on prior experience with that lock, the JVM can adapt the number of spin attempts
- lock coarsening: replacing multiple adjacent acquire/release pairs on the same lock by a single one, thereby reducing the number of atomic operations
- lock elision through escape analysis: if an object is confined to a single thread, its lock has no additional value (not even for the JMM), so the lock, and its atomic operations, can be removed
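To make the last two concrete, here is a minimal Java sketch (class and method names are mine) of code shapes the JIT can optimize: adjacent synchronized blocks on the same monitor are a candidate for coarsening into a single acquire/release, and a lock on an object that never escapes the method is a candidate for elision.

```java
public class LockOptimizations {

    private final Object monitor = new Object();
    private int counter;

    // Candidate for lock coarsening: the JIT may merge these adjacent
    // synchronized regions on the same monitor into a single
    // acquire/release pair, reducing the number of atomic operations.
    public int coarseningCandidate() {
        synchronized (monitor) { counter++; }
        synchronized (monitor) { counter++; }
        synchronized (monitor) { counter++; }
        return counter;
    }

    // Candidate for lock elision: 'local' never escapes this method, so
    // escape analysis can prove no other thread will ever see it, and the
    // JIT may remove the locking that StringBuffer's synchronized
    // methods would otherwise perform.
    public String elisionCandidate() {
        StringBuffer local = new StringBuffer();
        local.append("hello, ");
        local.append("world");
        return local.toString();
    }

    public static void main(String[] args) {
        LockOptimizations o = new LockOptimizations();
        System.out.println(o.coarseningCandidate()); // 3
        System.out.println(o.elisionCandidate());    // hello, world
    }
}
```

Whether these optimizations actually fire depends on the JVM version and JIT compilation; the code above only shows the shapes they target.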
But there is another optimization technique that is very useful for uncontended locks (just like lock coarsening and lock elision): biased locking. Biased locking avoids expensive atomic operations by biasing a lock toward a specific thread. As long as that thread keeps acquiring (and releasing) the lock, no atomic operations are needed; only when another thread acquires the lock is an atomic operation required.
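A minimal sketch (class name is mine) of the pattern biased locking targets: a single thread repeatedly acquiring the same uncontended monitor. After the lock is biased toward that thread, the remaining iterations can skip the compare-and-swap that a normal acquire would need; on HotSpot this behavior was controlled by the -XX:+UseBiasedLocking flag.

```java
public class BiasedLockingDemo {

    private final Object lock = new Object();
    private long counter;

    public long run(int iterations) {
        // All acquires come from the same thread. Once the lock is
        // biased toward it, subsequent acquire/release pairs avoid the
        // atomic operations an unbiased lock would require.
        for (int i = 0; i < iterations; i++) {
            synchronized (lock) {
                counter++;
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(new BiasedLockingDemo().run(1000000)); // 1000000
    }
}
```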
I was messaging with Alex Miller of Terracotta about how locking works in Terracotta. One of my worries was that the single active Terracotta server would be a contention point, because each and every lock acquire/release would need to communicate with the Terracotta server. Alex explained that Terracotta uses a special optimization technique called greedy locks: a lock can be checked out to a specific node, and as long as that node acquires/releases the lock, no communication with the Terracotta server is needed, reducing the amount of communication (and the latency) required.
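The check-out idea can be illustrated in plain Java. This is purely my own conceptual sketch, not Terracotta's implementation or API: the lock tracks which node it is checked out to, and only a change of ownership triggers the (expensive) server round trip.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Conceptual sketch only -- not Terracotta's actual implementation.
// The lock stays checked out to one node; repeated acquires by that
// node are local, and only an ownership transfer contacts the server.
public class GreedyLockSketch {

    private String checkedOutTo; // node currently holding the lock lease
    private final AtomicInteger serverRoundTrips = new AtomicInteger();

    public synchronized void acquire(String nodeId) {
        if (!nodeId.equals(checkedOutTo)) {
            contactServer();         // ownership transfer: server round trip
            checkedOutTo = nodeId;
        }
        // same node as last time: local acquire, no communication
    }

    private void contactServer() {
        serverRoundTrips.incrementAndGet(); // stand-in for a network call
    }

    public int roundTrips() {
        return serverRoundTrips.get();
    }

    public static void main(String[] args) {
        GreedyLockSketch lock = new GreedyLockSketch();
        for (int i = 0; i < 1000; i++) {
            lock.acquire("node-A");  // one round trip, then all local
        }
        lock.acquire("node-B");      // ownership moves: one more round trip
        System.out.println(lock.roundTrips()); // 2
    }
}
```

Note how this mirrors biased locking one level up: the bias is toward a node instead of a thread, and the expensive operation being avoided is a network call instead of an atomic instruction.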
It is fun to see that a similar solution can be applied at different levels.