Ask your Oracle locking questions

February 26, 2007

In my stats I see that a lot of people reach my blog by searching for Oracle in combination with locking/concurrency control related terms. I have been struggling with Oracle and Multi Version Concurrency Control (MVCC) for some time, but I have a good understanding of the subject now. So if you have Oracle + locking questions, ask them and I’ll try to answer them.


Sharing Hibernate entities between threads

February 19, 2007

A few weeks ago I had a discussion with another developer about sharing entities (mapped by Hibernate) between threads. My personal opinion is that concurrency control should be localized to a few special objects (something we agreed upon), and most other objects don’t need any form of concurrency control because one of the following rules applies:

  1. they are ‘immutable’. Safely created immutable objects are thread safe, e.g. a stateless service like a FireEmployeeService.
  2. they are confined to a single thread.

Entities are not immutable (in most cases), so the first rule doesn’t apply. Luckily the second rule applies in most standard applications: entities are created and used by a single thread (the concurrency control task has shifted to the database).

But as soon as an entity is transferred from one thread to another, the second rule doesn’t apply anymore, and you have to deal with concurrency control. This even applies to entities that are completely moved to another thread and not used by the original thread anymore!

So what can go wrong?

The most obvious thing that can go wrong is that the same object is used by multiple threads (the thread that created the object and the thread that received it). If this happens you can get all kinds of race problems.

Another thing that can go wrong is session-related problems. If an entity has unloaded lazy fields, it could happen that multiple threads end up accessing the same Hibernate session.

object1@session1 transferred to thread-a
object2@session1 transferred to thread-b

If object1 and object2 both have a lazy loaded field, it could happen that thread-a and thread-b both use the same session to load those fields. This is a bad thing because Hibernate sessions are not thread safe. The safest thing you can do is evict the entity from the session before passing it to the other thread, so all references to the session are removed. It is up to the receiving thread to decide if the object needs to be reattached.
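A minimal sketch of the producer side of such a hand-off (the Order entity, its getLines() collection and the queue are made up for illustration; get, initialize and evict are the plain Hibernate Session API):

import java.util.concurrent.BlockingQueue;

import org.hibernate.Hibernate;
import org.hibernate.Session;

public class OrderProducer {

    private final Session session;              // owned by this thread only
    private final BlockingQueue<Order> handoff; // shared with the receiving thread

    public OrderProducer(Session session, BlockingQueue<Order> handoff) {
        this.session = session;
        this.handoff = handoff;
    }

    public void publish(Long orderId) throws InterruptedException {
        Order order = (Order) session.get(Order.class, orderId);
        // load everything the other thread will need while we still have the session
        Hibernate.initialize(order.getLines());
        // remove all references to this session before the hand-off
        session.evict(order);
        handoff.put(order);
    }
}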

The last thing that comes to mind are visibility problems: values written by one thread don’t have to be visible in other threads. Entities normally don’t have synchronized methods or volatile fields, so a different mechanism needs to be used to prevent visibility problems. Luckily, with the new Java memory model it is now perfectly clear which alternatives are available: a structure that possesses the safe hand-off property (like a BlockingQueue) can be used to safely pass objects with visibility problems from one thread to another. It is important to realize that not just the entities could have visibility problems; the same problems can occur within the Hibernate session. The session is not meant to be used by multiple threads, and therefore it has no reason to prevent visibility problems.
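The receiving side could look something like this (a sketch that assumes the same hypothetical Order entity and queue as above); the put()/take() pair on the BlockingQueue provides the happens-before edge, so everything the producer wrote to the entity is visible to the worker:

import java.util.concurrent.BlockingQueue;

import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class OrderWorker implements Runnable {

    private final SessionFactory sessionFactory;
    private final BlockingQueue<Order> handoff; // the queue the producer puts on

    public OrderWorker(SessionFactory sessionFactory, BlockingQueue<Order> handoff) {
        this.sessionFactory = sessionFactory;
        this.handoff = handoff;
    }

    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                // take() pairs with the producer's put(): the queue takes care of visibility
                Order order = handoff.take();
                Session session = sessionFactory.openSession(); // owned by this thread
                try {
                    session.update(order); // reattach; only needed if lazy fields must be loaded here
                    // ... do the actual work with the order ...
                } finally {
                    session.close();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt status and stop
        }
    }
}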

Conclusion

My advice is that sharing objects between threads should not be taken lightly; you really need to know what you are doing. This is even more true for Hibernate entities, because there is a lot going on behind the scenes.


Agile and upfront design

February 15, 2007

One of the things I get annoyed by is that people consider Agile to be a methodology without any form of design. Big upfront design, like the waterfall method has, is not a good thing. In most cases you don’t know exactly what needs to be built and what all the constraints are, and that is why big upfront design leads to frozen ignorance: bad technical choices, caused by a lack of understanding, make a system hard to change. Another consequence is that the system is hard to work with because you have to deal with unnecessary complexity.

But my feeling is that some Agile developers are out of balance, because they reject any form of upfront design. Doing some light upfront design (maybe a few hours in front of a whiteboard), especially for more complex systems, can be a big help. Often there are architectural patterns you can use, like MVC, Layers, Pipes and Filters, etc., and they help you get a better understanding of the system. This makes it possible to make a calculated guess. My personal experience is that a good guess often pays off. If it doesn’t pay off, you should not stick to the choice, but refactor it so it becomes the best solution for the problem (if that is possible).


Doing interruptible calls

February 14, 2007

When doing interruptible calls (calls that throw an InterruptedException), try to do the actions that modify the state of the object after this call, or make sure the object is still in a valid state before doing that call (or do some cleanup yourself). If you don’t, the object could be left in an invalid state and become unusable, because its behavior is then undefined. This could lead to problems in other threads using the same object, and that is not something you want (especially in general purpose concurrency structures used in an enterprise environment).

example:

public E take() throws InterruptedException {
    mainLock.lockInterruptibly();

    try {
        while (ref == null)
            refAvailableCondition.await();

        lendCount++;
        return ref;
    } finally {
        mainLock.unlock();
    }
}

The ‘mainLock.lockInterruptibly()’ and the ‘refAvailableCondition.await()’ are both interruptible and are executed before the state of the object is modified (lendCount++).

The example below does not leave the object in a valid state when it is interrupted, because the state is already modified before the interruptible call:

public E take() throws InterruptedException {
    mainLock.lockInterruptibly();

    try {
        // the state is modified before the interruptible await(): an interrupt
        // now leaves lendCount incremented without a reference being lent out
        lendCount++;
        while (ref == null)
            refAvailableCondition.await();

        return ref;
    } finally {
        mainLock.unlock();
    }
}

Alternative

If you can’t guarantee that the state is valid before executing the interruptible call, it is better to use an uninterruptible version of that call if one is available. The downside of uninterruptible calls is that the calling thread can’t be interrupted while waiting. In case of an application server shutdown, this could lead to shutdown/redeployment issues because the application doesn’t shut down (something I witnessed last week).
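For the take() above, an uninterruptible variant could look like this (a sketch that reuses the same hypothetical fields); Lock.lock() and Condition.awaitUninterruptibly() never throw InterruptedException, so the waiting thread cannot be interrupted:

public E takeUninterruptibly() {
    mainLock.lock();    // blocks until the lock is obtained, can't be interrupted
    try {
        while (ref == null)
            refAvailableCondition.awaitUninterruptibly();   // idem

        lendCount++;
        return ref;
    } finally {
        mainLock.unlock();
    }
}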

Although the checked nature of InterruptedException can be a serious pain, doing interruptible calls is not trivial, and being forced to deal with the exception at least forces you to think about it.


Fixing deadlocks

February 9, 2007

I’m working on a concurrency library, and I was wondering about deadlocks in complex concurrent structures. Deadlocks are tricky because they don’t always happen (so testing for deadlocks is almost impossible), and another problem is that the JVM isn’t obligated to detect them.

Most enterprise applications need a database (or other transactional resources), and one of the tasks the database executes is concurrency control. In most cases you just need to set an isolation level and the Java code remains free of concurrency control related complexity, because:

  1. objects are isolated.
  2. objects are immutable.

(For more information see ‘Patterns of Enterprise Application Architecture’, or ‘Concurrent Programming in Java’). But although the developer doesn’t need to worry a lot about concurrency control, there is a lot going on in the database.
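With plain JDBC, for example, choosing the isolation level is a single call (the connection string and credentials below are just placeholders); frameworks like Spring or Hibernate expose the same setting through configuration:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class IsolationExample {

    public static void main(String[] args) throws SQLException {
        Connection connection = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:XE", "scott", "tiger");
        try {
            // the database does the concurrency control; we only choose how strict it should be
            connection.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
            connection.setAutoCommit(false);
            // ... queries and updates ...
            connection.commit();
        } finally {
            connection.close();
        }
    }
}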

Deadlocks in databases

One of the things that can go wrong in a database is a deadlock. But a deadlock in a database is not as bad as a deadlock in Java, because:

  1. the database has mechanisms to detect deadlocks. When a deadlock is detected, one of the transactions is rolled back and its locks are released (so the deadlock is broken).
  2. a transaction is given a timeout, and it is rolled back when it takes too long. So when a deadlock occurs, a timeout eventually occurs, the transaction is rolled back and its locks are released, so the deadlock is broken.

Deadlocks in Java

Adding deadlock detection to the JVM is difficult to realize, but working with timeouts is not that difficult. JSR-166 has an excellent new Lock interface that makes it possible to obtain a lock with a timeout (this was not possible with the classic synchronized blocks/methods). The problem with Locks is that you have to carry the remaining timeout around, and this makes the solution difficult to work with; the logical consequence is that in most cases locks without timeouts are used, and this increases the chance of a deadlock.
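For example (a sketch with made-up lock names), acquiring two locks with a single overall timeout already forces you to do the deadline bookkeeping by hand:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TransferService {

    private final Lock fromLock = new ReentrantLock();
    private final Lock toLock = new ReentrantLock();

    public void transfer(long timeoutMs) throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + timeoutMs;

        if (!fromLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS))
            throw new TimeoutException("could not obtain the first lock");
        try {
            // the remaining timeout has to be recalculated for every following lock
            long remaining = deadline - System.currentTimeMillis();
            if (!toLock.tryLock(remaining, TimeUnit.MILLISECONDS))
                throw new TimeoutException("could not obtain the second lock");
            try {
                // ... do the work that needs both locks ...
            } finally {
                toLock.unlock();
            }
        } finally {
            fromLock.unlock();
        }
    }
}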

The improvement

So how can this situation be improved? How can you get the timeout behavior without the complexity? Last week on a Sunday morning, when I was having my first cup of coffee, I got an idea: the timeout could be stored in a ThreadLocal, so whenever a timeout is required, this structure can be asked for the remaining time.

If you need a lock, you only have to do this:

Lock lock = ....;
LockUtils.saveLock(lock);   // obtains the lock, using the remaining timeout stored in the threadlocal
try {
    // ... do something ...
} finally {
    lock.unlock();
}

The threadlocal can be initialized at the beginning (with a timeout of 60 seconds, for example), and every time some form of locking (with a timeout) is needed, the remaining timeout can be calculated and stored in the threadlocal. When the timeout reaches zero, all locks will start throwing a TimeoutException. By adding these ‘save’ calls to the library:

  1. void Repeater.saveRepeater(Runnable)
  2. E LendableReference.saveTake()
  3. void BlockingRepeater.saveExecute (Runnable)

You don’t need to worry (as much) about deadlocks.
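A minimal sketch of how such a helper could look; the name LockUtils.saveLock comes from the snippet above, but the deadline bookkeeping below is my own guess at the idea, not the actual library code:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Lock;

public final class LockUtils {

    // deadline (millis since epoch) for the current thread, set once at the start of a request
    private static final ThreadLocal<Long> deadline = new ThreadLocal<Long>();

    public static void initTimeout(long timeoutMs) {
        deadline.set(Long.valueOf(System.currentTimeMillis() + timeoutMs));
    }

    // obtains the lock using whatever time is left on the thread's deadline
    public static void saveLock(Lock lock) throws InterruptedException, TimeoutException {
        Long end = deadline.get();  // assumes initTimeout was called earlier on this thread
        long remaining = end.longValue() - System.currentTimeMillis();
        if (remaining <= 0 || !lock.tryLock(remaining, TimeUnit.MILLISECONDS))
            throw new TimeoutException("timeout while waiting for lock");
    }

    private LockUtils() {
    }
}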

Other thoughts

There is still some stuff to think about, e.g. how different systems that use this threadlocal can influence each other: you don’t want one system resetting the remaining timeout to a higher value.


Repeating tasks

February 2, 2007

Introduction

On a few server side projects I have worked on, I needed a special threading structure that keeps repeating the same task over and over (in most cases it blocks because it needs to wait for something like input/output). In some cases I just needed a single thread, but in most cases I needed multiple threads. I have played with the Executor to accomplish this goal:

  1. reposting a task as soon as it completes (see the sketch after this list). The problem with this solution is that it is quite tricky to increase the number of threads, because no task is available for them to execute. The reposting also increases the complexity of the task, and complexity is something I try to prevent (especially in server side environments).
  2. a modified BlockingQueue that keeps handing out the same task instead of removing it. This solution was suggested on the concurrency mailing list, but it still doesn’t feel right: the structure is not used as it was intended, and this also complicates an already complicated subject.
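The first approach looks roughly like this (a sketch, not the actual code I used): the task has to know about the executor just to keep itself running.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RepostingTask implements Runnable {

    private final ExecutorService executor;

    public RepostingTask(ExecutorService executor) {
        this.executor = executor;
    }

    public void run() {
        try {
            // ... the actual work, e.g. wait for a message and process it ...
        } finally {
            // repost, otherwise the task is executed only once
            executor.execute(this);
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(1);
        executor.execute(new RepostingTask(executor));
    }
}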

That is why I decided to create a new threading structure: the Repeater. The Repeater is a structure that is able to keep repeating the same task over and over again. The default implementation of the Repeater is the ThreadPoolRepeater: as the name states, it has a pool of threads that keep repeating the same task.

How it works

The worker threads in the Repeater try to get the current task. If no task is available, they block until one becomes available (or until they are interrupted). I have moved this ‘blocking until a reference becomes available’ behavior into a new structure: the LendableReference. If a task is placed in the LendableReference, the worker threads wake up, execute the task, finally take the task back (I’ll get back to this) and wait for a new one to become available. If the same task is still in the LendableReference, the worker threads can repeat the same task over and over again.
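Conceptually, a worker thread in the Repeater does something like this (a sketch of the idea only; the LendableReference method names take and takeBack are assumptions, not the final API):

class RepeaterWorker implements Runnable {

    private final LendableReference<Runnable> lendableRef;
    private volatile boolean shutdown;

    RepeaterWorker(LendableReference<Runnable> lendableRef) {
        this.lendableRef = lendableRef;
    }

    public void run() {
        try {
            while (!shutdown) {
                Runnable task = lendableRef.take();    // blocks until a task is available
                try {
                    task.run();                        // run the current task once
                } finally {
                    lendableRef.takeBack(task);        // return the lend, so the task can be handed out again
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();        // the worker was interrupted while waiting; stop
        }
    }
}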

How it can be used

I love concurrency control, but I prefer to keep the number of objects that are aware of concurrency as small as possible. In most cases it is possible to extract all threading behavior from the objects. With repeaters this also is quite easy to realize: I see a method as an axle and a Repeater as an engine, and I can hook the engine up to the axle from the outside (I use Spring for this task, but something else could be used as well). This approach is perfect for a production environment, because you have a very clear separation of concerns (and this makes it easy to reason about a system or to alter its behavior). For testing purposes, you can call the method yourself without worrying about multi-threading.
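Hooking the engine up to the axle could look like this (ThreadPoolRepeater is the class named above, but the constructor and the repeat method I use here are assumptions for the sake of the example):

// the 'axle': a plain object with a method, no threading inside
class MessageProcessor {
    public void processNextMessage() {
        // ... block until a message arrives and handle it ...
    }
}

// the 'engine': hooked up from the outside (here by hand, in practice via Spring)
public class Bootstrap {
    public static void main(String[] args) {
        ThreadPoolRepeater repeater = new ThreadPoolRepeater(5);   // assumed: 5 worker threads
        final MessageProcessor processor = new MessageProcessor();
        repeater.repeat(new Runnable() {
            public void run() {
                processor.processNextMessage();
            }
        });
    }
}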

Relaxed or Strict

The task in the Repeater can be changed. If it is changed, it could happen that at some moment more than one task is being executed by the Repeater (some threads are still executing the old task while others are already executing the new one). In some cases this is very undesirable behavior, and that is why it can be customized in the Repeater. In the previous section I introduced the LendableReference: a structure you can lend a value from, and before obtaining a new reference, the old one needs to be returned. I have created two different implementations of this LendableReference:

  1. StrictLendableReference: this implementation doesn’t allow a new reference to be placed before all lent references are returned. This makes it impossible for different references to be lent out at the same moment.
  2. RelaxedLendableReference: this implementation does allow a new reference to be placed before all references are returned, so different references can be lent out at the same moment. This gives the RelaxedLendableReference better concurrency characteristics, because taking and putting a reference don’t block.

So by using different LendableReference implementations in the Repeater, you can control whether different tasks can run at the same moment. This behavior is quite difficult to realize with the Executor.

Other customizations

There are various other aspects of the Repeater you want to control in a server side environment. You want to control the threads it uses; that is why you can inject a ThreadFactory. But this is not the only thing: in my previous blog post I wrote about WaitPoints (a point threads need to pass before they can continue). By creating a LendableReference that acts as a decorator for a target LendableReference, where all lends need to pass the WaitPoint, you can open and close the Repeater, and you can even throttle it (the amount of time between executions). Another thing I’m playing with is some sort of predicate that can remove the task from further execution.

The Repeater, the LendableReferences and the WaitPoints are part of the concurrency library I’m working on, and I hope to make a first release in one or two months’ time (it really is a lot of work: documentation, testing, writing code, a site, etc.).