Executing long running task from UI

December 15, 2007

A colleague asked me how to prevent the execution of a long running task on a UI thread (e.g. the thread of a Servlet container or the Swing event dispatching thread), and also how to prevent concurrent execution of that same task. So I sent him an email containing a small example, and decided to place it on my blog to help others struggling with the same issue.

The long running task

The FooService is the service with the long running method ‘foo’.

interface FooService{
        void foo();
}

There is no need to add threading logic to the FooService implementation. It can focus purely on realizing the business process by implementing the business logic. The code should not be infected with concurrency control logic, because that makes testing very hard, and also makes the code hard to understand, reuse or change. So this is one of the first potential refactorings I often see in code. I’ll post more about this in ‘Java Concurrency Top 10’.

The execution-service

The FooExecutionService is responsible for executing the FooService and for preventing concurrent execution (if a correctly configured executor instance is injected). Personally I prefer to inject the executor instead of creating one inside the FooExecutionService, because that would make it hard to test and to change/configure.

class FooExecutionService{

	private static final Logger log = Logger.getLogger(FooExecutionService.class);

	private final FooService fooService;
	private final Executor executor;

	public FooExecutionService(FooService fooService, Executor executor){
		this.fooService = fooService;
		this.executor = executor;
	}

	/**
	 * Starts executing the FooService. This call is asynchronous, so
	 * it won't block.
	 *
	 * @throws RejectedExecutionException if the execution
	 *         of foo is not accepted (concurrent/shutdown).
	 */
	public void start(){
		executor.execute(new Task());
	}

	class Task implements Runnable{
		public void run(){
			try{
				fooService.foo();
			}catch(Exception ex){
				log.error("failed to ...", ex);
			}
		}
	}
}

The FooExecutionService could be improved in different ways. It could provide information on whether a task is already executing; this could be realized by placing a dummy task in the executor and checking if that task is rejected. A different solution would be to let the Task publish some information about the status of the current execution. If the task is very long running, and you want to be able to stop it, you can shut down the executor by calling the shutdownNow method. This interrupts the worker threads, and if you periodically check the interrupt status of the executing thread while doing the long running call, you can end the execution.
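A minimal sketch of that last suggestion (the class names here are illustrative, not from the post): the task checks the interrupt status on every iteration, so shutdownNow ends it cleanly.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class CancellableTaskDemo {

    static class Worker implements Runnable {
        public void run() {
            // keep working until the interrupt flag is raised
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10); // simulate one unit of the long running call
                } catch (InterruptedException e) {
                    // restore the interrupt status so the loop condition sees it
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(new Worker());
        // shutdownNow interrupts the worker thread, ending the task
        executor.shutdownNow();
        boolean terminated = executor.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("terminated=" + terminated);
    }
}
```

Note that Thread.sleep clears the interrupt flag when it throws InterruptedException, which is why the catch block restores it.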

Some Spring configuration

The Executor is injected from the outside by some Spring configuration, i.e.:

<bean id="fooService" class="FooServiceImpl"/>

<bean id="fooExecutionService" class="FooExecutionService">
        <constructor-arg index="0" ref="fooService"/>
        <constructor-arg index="1">
                <bean class="java.util.concurrent.ThreadPoolExecutor"
                      destroy-method="shutdownNow">
                        <!-- minimum poolsize (only 1 thread) -->
                        <constructor-arg index="0" value="1"/>
                        <!-- maximum poolsize (only 1 thread) -->
                        <constructor-arg index="1" value="1"/>
                        <!-- the timeout (we don't need it) -->
                        <constructor-arg index="2" value="0"/>
                        <!-- the timeunit that belongs to the timeout argument (we don't need it) -->
                        <constructor-arg index="3">
                                <bean id="java.util.concurrent.TimeUnit.SECONDS"
                                      class="org.springframework.beans.factory.config.FieldRetrievingFactoryBean"/>
                        </constructor-arg>
                        <!-- the workqueue where unprocessed tasks get stored -->
                        <constructor-arg index="4">
                                <!-- we don't want any unprocessed work: a worker needs to be available,
                                     or the task gets rejected. -->
                                <bean class="java.util.concurrent.SynchronousQueue"/>
                        </constructor-arg>
                </bean>
        </constructor-arg>
</bean>

If there are multiple long running methods, it would be a good idea to extract the creational logic of the executor into a factory method.
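Such a factory method could look like this (a minimal sketch; the method name is my own):

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecutorFactory {

    // Creates a single-threaded executor that rejects work while its
    // one worker is busy: no queueing, just like the Spring config above.
    static ThreadPoolExecutor newRejectingSingleThreadExecutor() {
        return new ThreadPoolExecutor(
                1, 1,                               // core and max pool size: one worker
                0L, TimeUnit.SECONDS,               // keep-alive timeout: unused
                new SynchronousQueue<Runnable>());  // no unprocessed work is stored
    }

    public static void main(String[] args) {
        ThreadPoolExecutor executor = newRejectingSingleThreadExecutor();
        System.out.println("maxPoolSize=" + executor.getMaximumPoolSize());
        executor.shutdown();
    }
}
```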

The UI-controller

And the FooExecutionService can be hooked up to some controller like this:

class StartFooController extends SomeController{

	private final FooExecutionService fooExecutionService;

	StartFooController(FooExecutionService fooExecutionService){
		this.fooExecutionService = fooExecutionService;
	}

	String handleRequest(Request request, Response response){
		try{
			fooExecutionService.start();
			return "success";
		}catch(RejectedExecutionException ex){
			return "alreadyrunningorshuttingdownview";
		}
	}
}

Prevention of concurrent execution of different tasks

If you want to prevent concurrent execution of different long running methods, you could create a single execution-service for all of these methods, and share the same executor between the different tasks:
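A minimal sketch of this idea (the class name is my own invention): one executor with a single worker and a SynchronousQueue is shared by all tasks, so at most one of them runs at any moment, and any submission made while another task is running is rejected.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SharedExecutionService {

    // one worker, no queue: a task is rejected while another one is running
    private final ThreadPoolExecutor executor = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());

    /** @throws RejectedExecutionException if any other task is still running */
    public void start(Runnable task) {
        executor.execute(task);
    }

    public void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        SharedExecutionService service = new SharedExecutionService();
        final CountDownLatch done = new CountDownLatch(1);
        // the first task occupies the single worker until the latch opens
        service.start(() -> {
            try { done.await(); } catch (InterruptedException ignored) {}
        });
        try {
            service.start(() -> {}); // second task: rejected, the worker is busy
            System.out.println("accepted");
        } catch (RejectedExecutionException e) {
            System.out.println("rejected");
        }
        done.countDown();
        service.shutdown();
    }
}
```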


Lightweight Batch Processing I: Intro

November 12, 2007

If you are lucky, your application is a lot more complex than just the standard request/response webapplication. The complexity in these applications can typically be found in the business domain or in the presentation logic. Batch processing systems process large volumes of data, and this is something that always makes me happy to be a software developer, because so much interesting stuff is going on, especially in concurrency control and transaction management.

This is the first blog about lightweight batch processing, and the goal is to share my knowledge, and hopefully gain new insights from your comments. There are batch frameworks (like the newest Spring module: Spring Batch), but frameworks often introduce a lot more functionality (and complexity) than required, and they can’t always be used, for a wide range of reasons (sometimes technical, sometimes political). This set of blogs is aimed at those scenarios. The approach I use is to start from a basic example, point out the problems that can occur (and under which conditions), and eventually refactor the example.

Let's get started: below you can see a standard approach to processing a batch of employees.

EmployeeDao employeeDao;

@Transactional
void processAll(){
    List<Employee> batch = getBatch();
    for(Employee employee: batch)
        process(employee);
}

void process(Employee employee){
    ...logic
}

As you can see, the code is quite simple. There is no need to integrate the scheduling logic into the processing logic. It is much better to hook up a scheduler (like Quartz, for example) from the outside (it makes the code much easier to test, maintain and extend). This example works fine for a small number of employees, and if the processing of a single employee doesn’t take too much time. But when the number of employees increases, or the time to process a single item increases, this approach won’t scale well and can lead to all kinds of problems. One of the biggest problems (for now) is that the complete batch is executed under a single transaction. Although this transaction provides the ‘all or nothing’ (atomicity) behavior that is normally desired, the length of the transaction can lead to all kinds of problems:

  1. lock contention (and even lock escalation depending on the database) leading to decreased performance and eventually to a complete serialized access to the database. This can be problematic if the batch process is not the only user of the database.
  2. failing transactions caused by running out of undo space, or the database aborting the transaction because it runs too long.
  3. when the transaction fails, all the items need to be reprocessed, even the ones that didn’t give a problem. If the batch takes a long time to run, this behavior can be highly undesirable.
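To illustrate the earlier remark about hooking up the scheduler from the outside, here is a minimal sketch using a ScheduledExecutorService as a stand-in for Quartz; the Runnable stands in for processAll, which stays completely free of scheduling code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BatchScheduler {

    public static void main(String[] args) throws InterruptedException {
        final AtomicInteger runs = new AtomicInteger();
        // stands in for the batch processing logic (processAll)
        Runnable processAll = runs::incrementAndGet;

        // scheduling is configured from the outside, not inside processAll
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(processAll, 0, 50, TimeUnit.MILLISECONDS);

        Thread.sleep(200);
        scheduler.shutdown();
        System.out.println("ran at least once: " + (runs.get() >= 1));
    }
}
```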

In the following example the long running transaction has been replaced by multiple smaller transactions: one transaction to retrieve the batch, and one transaction for each employee that needs to be processed:

EmployeeDao employeeDao;

void processAll(){
    List<Employee> batch = getBatch();
    for(Employee employee: batch)
        process(employee);
}

@Transactional
List<Employee> getBatch(){
    return employeeDao.findItemsToProcess();
}

@Transactional
void process(Employee employee){
    ...logic
}

As you may have noticed, this example is not without problems either. One of the biggest problems is that the complete list of employees needs to be retrieved first. If the number of employees is very large, or a single employee consumes a lot of resources (memory, for example), this can lead to all kinds of problems (apart from being yet another long running transaction!). One of the possible solutions is to retrieve only the id’s:

EmployeeDao employeeDao;

void processAll(){
    List<Long> batch = getBatch();
    for(Long id: batch)
        process(id);
}

@Transactional
List<Long> getBatch(){
    return employeeDao.findItemsToProcess();
}

@Transactional
void process(long id){
    Employee employee = employeeDao.load(id);
    ...actual processing
}

A big advantage of retrieving a list of id’s instead of a list of Employees is that the transactional behavior is well defined. Detaching and reattaching objects to sessions introduces a lot more vagueness (especially if the O/R mapping tool doesn’t detach the objects entirely). Different approaches are possible: you can keep a cursor open and retrieve an employee only when it is needed, but the problem is that you still have a long running transaction. Another approach is to retrieve only the employees that can be processed in a single run, and repeat this until no more items can be found.
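That last approach can be sketched like this (EmployeeIdDao and its methods are hypothetical stand-ins for the real dao; in real code findItemsToProcess and process would each run in their own transaction):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class ChunkedBatchProcessor {

    interface EmployeeIdDao {
        /** returns at most maxResults unprocessed ids */
        List<Long> findItemsToProcess(int maxResults);
        void markProcessed(long id);
    }

    private final EmployeeIdDao dao;
    private final int chunkSize;

    ChunkedBatchProcessor(EmployeeIdDao dao, int chunkSize) {
        this.dao = dao;
        this.chunkSize = chunkSize;
    }

    void processAll() {
        List<Long> chunk;
        // repeat until no more items can be found
        while (!(chunk = dao.findItemsToProcess(chunkSize)).isEmpty()) {
            for (long id : chunk) {
                process(id);
            }
        }
    }

    void process(long id) {
        // ...actual processing, then mark the item so it is not re-fetched
        dao.markProcessed(id);
    }

    public static void main(String[] args) {
        final LinkedList<Long> pending =
                new LinkedList<Long>(Arrays.asList(1L, 2L, 3L, 4L, 5L));
        EmployeeIdDao dao = new EmployeeIdDao() {
            public List<Long> findItemsToProcess(int maxResults) {
                // copy, so the chunk survives removals from 'pending'
                return new ArrayList<Long>(
                        pending.subList(0, Math.min(maxResults, pending.size())));
            }
            public void markProcessed(long id) {
                pending.remove(Long.valueOf(id));
            }
        };
        new ChunkedBatchProcessor(dao, 2).processAll();
        System.out.println("remaining=" + pending.size());
    }
}
```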

In the next blogpost I’ll deal with multi-threading and locking.


Service layer woes

July 18, 2007

I have posted an entry on the blog of my employer:

http://blog.xebia.com/2007/07/18/service-layer-woes/


Test Refactoring: extract an assert

July 8, 2007

When writing unit tests, I often see large sections of repeated asserts and asserts that are too low level:

void testAdd(){
	List list = new LinkedList();

	String s = "foo";
	list.add(s);

	assertEquals(1,list.size());
	assertEquals(s,list.get(0));
}

In this case the asserts check that the list contains the added item. This is done by checking the size and checking the individual elements. The problem is that this logic often gets repeated all over the place.

The smell can be removed by extracting an assert that spans the conceptual distance: it states what it checks, but it doesn’t say how it does the job (extract method):

void testAdd(){
	List list = new LinkedList();

	String s = "foo";
	list.add(s);

	assertListContent(list,s);
}

void assertListContent(List list, Object... args){
	List expectedList = asList(args);
	assertEquals(expectedList,list);
}

This code can even be simplified by making the list a member variable of the TestCase:

void testAdd(){
	list = new LinkedList();

	String s = "foo";
	list.add(s);

	assertListContent(s);
}

void assertListContent(Object... args){
	List expectedList = asList(args);
	assertEquals(expectedList,list);
}

It doesn’t look like it adds much value, but I have seen enough tests that were unnecessarily difficult to understand because of too many low level asserts. And when tests are hard to understand, they are also hard to maintain. That is why I like my test methods short and clear (10-15 lines normally).