Fixing deadlocks

I’m working on a concurrency library, and I was wondering about deadlocks in complex concurrent structures. Deadlocks are tricky because they don’t always happen, so testing for them is almost impossible. Another problem is that the JVM isn’t obligated to detect deadlocks.

Most enterprise applications need a database (or other transactional resources), and one of the tasks the database executes is concurrency control. In most cases you just need to set an isolation level and the Java code remains free of concurrency-control complexity, because:

  1. objects are isolated.
  2. objects are immutable.

(For more information, see ‘Patterns of Enterprise Application Architecture’ or ‘Concurrent Programming in Java’.) But although the developer doesn’t need to worry much about concurrency control, there is a lot going on in the database.

Deadlocks in databases

One of the things that can go wrong in a database is a deadlock. But a deadlock in a database is not as bad as a deadlock in Java, because:

  1. the database has mechanisms to detect deadlocks. When a deadlock is detected, the transaction is rolled back and the locks are released (so the deadlock is killed).
  2. a transaction is given a timeout, and it is rolled back when it takes too long. So when a deadlock occurs, the timeout eventually expires and the transaction is rolled back (and the locks are released, so the deadlock is killed).

Deadlocks in Java

Adding deadlock detection to the JVM is difficult, but working with timeouts is not. JSR-166 has an excellent new Lock structure that makes it possible to obtain a lock with a timeout (this was not possible with the classic synchronized blocks/methods). The problem with Locks is that you have to carry the remaining timeout around, which makes this solution difficult to work with. The logical consequence is that in most cases locks are used without timeouts, and this increases the chance of a deadlock.
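To make the "carrying around the remaining timeout" problem concrete, here is a minimal sketch using the standard `Lock.tryLock(long, TimeUnit)` call. The class `TimedLocking` and the method `withBothLocks` are hypothetical names made up for this illustration; they are not part of JSR-166 or of my library.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class, only meant to illustrate the bookkeeping.
class TimedLocking {

    /** Acquires both locks within one shared time budget, runs the task, then releases them. */
    static void withBothLocks(Lock first, Lock second, long timeoutNanos, Runnable task)
            throws InterruptedException, TimeoutException {
        // The caller has to remember the deadline itself...
        long deadline = System.nanoTime() + timeoutNanos;
        if (!first.tryLock(timeoutNanos, TimeUnit.NANOSECONDS)) {
            throw new TimeoutException("could not acquire the first lock in time");
        }
        try {
            // ...and recompute what is left before every further blocking call.
            long remaining = deadline - System.nanoTime();
            if (!second.tryLock(remaining, TimeUnit.NANOSECONDS)) {
                throw new TimeoutException("could not acquire the second lock in time");
            }
            try {
                task.run();
            } finally {
                second.unlock();
            }
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        Lock a = new ReentrantLock();
        Lock b = new ReentrantLock();
        withBothLocks(a, b, TimeUnit.SECONDS.toNanos(60), () -> System.out.println("both locks held"));
    }
}
```

With two locks this is still manageable, but every method that may block needs an extra timeout parameter, and every caller has to do the same subtraction.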

The improvement

So how can this situation be improved? How can you get the timeout behavior without the complexity? Last week on a Sunday morning, when I was having my first cup of coffee, I got an idea. The timeout could be stored in a ThreadLocal, so whenever a timeout is required it can be looked up there.

If you need a lock, you only have to do this:

Lock lock = ....;
try{ something 

The ThreadLocal can be initialized at the beginning (with a timeout of 60 seconds, for example). Every time some sort of locking (with a timeout) is needed, the remaining timeout can be calculated and stored in the ThreadLocal. When the timeout reaches zero, all locks will start throwing a TimeoutException. By adding ‘save’ calls like these to the library:

  1. void Repeater.saveRepeater(Runnable)
  2. E LendableReference.saveTake()
  3. void BlockingRepeater.saveExecute (Runnable)

You don’t need to worry (as much) about deadlocks.
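A minimal sketch of the ThreadLocal idea could look like the code below. Everything in it is an assumption for illustration: `ThreadTimeout` and its methods are hypothetical names, not the library's API, and instead of storing the remaining timeout it stores an absolute deadline and derives the remaining time from it, which avoids having to update the ThreadLocal after every call.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: keeps a per-thread deadline so callers need not pass timeouts around.
final class ThreadTimeout {
    // Absolute deadline in System.nanoTime() units; null means "not initialized".
    private static final ThreadLocal<Long> DEADLINE = new ThreadLocal<>();

    private ThreadTimeout() {
    }

    /** Initializes the time budget for the current thread. */
    static void start(long timeout, TimeUnit unit) {
        DEADLINE.set(System.nanoTime() + unit.toNanos(timeout));
    }

    /** Removes the budget again, e.g. before a pooled thread is reused. */
    static void clear() {
        DEADLINE.remove();
    }

    /** Time left for the current thread; zero or negative once the budget is used up. */
    static long remainingNanos() {
        Long deadline = DEADLINE.get();
        if (deadline == null) {
            throw new IllegalStateException("no timeout set for this thread");
        }
        return deadline - System.nanoTime();
    }

    /** Acquires the lock, waiting at most for what is left of the thread's budget. */
    static void lock(Lock lock) throws InterruptedException, TimeoutException {
        long remaining = remainingNanos();
        if (remaining <= 0 || !lock.tryLock(remaining, TimeUnit.NANOSECONDS)) {
            throw new TimeoutException("timeout budget of this thread is exhausted");
        }
    }

    public static void main(String[] args) throws Exception {
        Lock lock = new ReentrantLock();
        ThreadTimeout.start(60, TimeUnit.SECONDS);
        try {
            ThreadTimeout.lock(lock); // no timeout argument needed here
            try {
                System.out.println("lock held");
            } finally {
                lock.unlock();
            }
        } finally {
            ThreadTimeout.clear();
        }
    }
}
```

In this sketch, once the budget is used up the helper throws without even trying the lock, which matches the "all locks start throwing" behavior described above.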

Other thoughts

There is some stuff to think about, e.g. how can different systems that use this ThreadLocal influence each other? You don’t want a system resetting the remaining timeout to a higher value.
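One possible answer, sketched under my own assumptions, is to let nested code only ever shorten the deadline and to restore the outer value when it is done. `TimeoutScope` is a hypothetical name, not something that exists in the library.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical scope object: an inner scope can shorten the thread's deadline but never extend it.
final class TimeoutScope implements AutoCloseable {
    // Absolute deadline in System.nanoTime() units; null means "no scope open".
    private static final ThreadLocal<Long> DEADLINE = new ThreadLocal<>();

    private final Long previous; // deadline of the enclosing scope, or null

    private TimeoutScope(Long previous) {
        this.previous = previous;
    }

    /** Opens a scope whose deadline is the earlier of the requested one and the enclosing one. */
    static TimeoutScope open(long timeout, TimeUnit unit) {
        Long outer = DEADLINE.get();
        long requested = System.nanoTime() + unit.toNanos(timeout);
        // nanoTime values are compared by difference, as the System.nanoTime() docs recommend.
        if (outer == null || requested - outer < 0) {
            DEADLINE.set(requested);
        }
        return new TimeoutScope(outer);
    }

    /** Time left in the innermost open scope; zero or negative once it has expired. */
    static long remainingNanos() {
        Long deadline = DEADLINE.get();
        if (deadline == null) {
            throw new IllegalStateException("no timeout scope open on this thread");
        }
        return deadline - System.nanoTime();
    }

    /** Restores the deadline of the enclosing scope. */
    @Override
    public void close() {
        if (previous == null) {
            DEADLINE.remove();
        } else {
            DEADLINE.set(previous);
        }
    }

    public static void main(String[] args) {
        try (TimeoutScope outer = TimeoutScope.open(10, TimeUnit.SECONDS)) {
            try (TimeoutScope inner = TimeoutScope.open(60, TimeUnit.SECONDS)) {
                // Still bounded by the outer 10 seconds, not by the requested 60.
                System.out.println(remainingNanos() <= TimeUnit.SECONDS.toNanos(10));
            }
        }
    }
}
```

This only works if every system on the thread goes through the same scope object and closes its scopes in reverse order, which try-with-resources takes care of.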

