Make deployment part of continuous integration

December 26, 2006

Last year I started working with Cruise Control as continuous integration server. Continuous integration is very important for Agile projects, because it is a source of continuous feedback: without continuous feedback you can’t be agile. But I have been focusing on the unit testing aspect too much: I thought of Cruise Control as a mechanism to run all unit tests when it detects changes in the source repository. But a continuous integration server can do much more than just running your unit tests.`

It can also build the war/ear and deploy it. This provides feedback on a lot of stuff that is hard to test otherwise:

  1. is the application able to deploy on the target application server? Often there are differences between the development and production environment: different operating systems, different versions of the application server, different virtual machines, different configurations, etc. Personally I don’t mind much if there are small differences, but the closer you get to the production environment, the smaller the differences should be.
  2. I use Spring for most projects, so are there errors in the applicationcontext? The best way to make sure the applicationcontext is valid, is to deploy. This could fail for a lot of reasons: an error in the jndi-name of the datasource for example.

If you make the deployment part of the continuous integration process, and make sure the build breaks when the deployment fails, you have a very good source of feedback.

At the moment I’m using Oracle OC4J as application server. I added a deploy target in my ANT script and this task is added to the list of targets that Cruise Control executes when it detects a change. Letting the deploy target drop the war/ear in a directory where it is picked up eventually is not enough btw, the build won’t break when the deploy fails.

Advertisements

Pipelines and priorities

December 11, 2006

Abstract

A pipeline (aka ‘Pipes and Filters’) is a great architectural pattern for building a complex system composed of a sequence of simpler processes. But one of the weaknesses of a pipeline is that it doesn’t have a low response time when it is under load. In this post I describe a solution to this limitation.

Introduction

A month ago, I was working an system that needed to be very flexible, required a sequence of steps to be executed and should be highly scalable, so a pipeline was the first thing that came to mind. A pipeline is a chain of producers and consumers of messages connected by queues (aka channels, pipes). A message is put on the first queue by some producer, and a consumer will take the message from the queue, does some processing and place a message (could be the same) on the following queue.

One advantage of pipelines is that they make it very easy to loosely couple processes:

  1. a producer of messages, only needs a reference to a pipe to put messages on.
  2. a consumer of messages, only needs a reference to a pipe to take messages from.

This means, that a producer is not coupled to a consumer behind it, and a consumer not to a producer in front of it. This makes it very easy to replace components or alter their behaviour.

Another big advantage of pipelines is that it is quite easy to make them multi-threaded (so great for using all cores/cpu’s of a system):

  1. every producer and consumer can run on its own thread. You can even use multiple threads for running a single producer or consumer. Tip: try to externalize the threading from the producers and consumers, and hook it up from outside.
  2. messages are ‘touched’ by at most a single producer/consumer at any given moment. This means, that messages are used in the isolation of a single thread, and this reduces the need to make them thread safe. With the introduction of the new Java Memory Model (JSR-133) in Java 5, save hand of also is a great new property of these structures because it helps to prevent visibility problems.
  3. by making queues blocking, producers block if they try to put an element on a full queue (great for graceful degradation), and consumers block if they try to take an element from an empty queue. Blocking calls help to reduce concurrency control related complexity, because this programming model makes synchronization with other threads largely transparent.

What is the problem

The problem with pipelines, is that they have a low response time when the system is under load. The cause of this problem, is that a message is not processed immediately, but put on a queue first, and it has to wait until all proceeding messages are processed. If there are many unprocessed messages (usually the case when a system is under load), the response-time will increase. If there are multiple queues (often the case in pipelines), the response time increases even more because the wait times are accumulated.

In the system I was working on, there were 2 message producers:

  1. scheduled process: on certain times, the scheduler triggers a process. This process could place a large amount of queries that need to be processed.
  2. user: the user also can place queries that need to be processed.

The problem is, that a standard pipeline is not responsive to user queries, when there are a lot of queries created by the scheduled process. This is not acceptable, because one of the non-functional requirements is that user queries should have a very small response time.

Solution

The solution is quite simple: add priorities to the messages and make the queues aware of these priorities: higher priority messages should be passed to consumers before lower priority messages. As soon as a high priority message is placed on the queue, it will be processed before all lower priority messages in that queue. This means that the response time of higher priority messages decreases, and response time for lower priority messages increases (in our case this behaviour is acceptable).

Luckily in Java there already is such a queue implementation: PriorityBlockingQueue. Warning: one of the problems of the PriorityBlockingQueue, is that it isn’t bounded: placing a limit on the maximum size is not available out of the box.

Also make sure that all the queues are sensitive to the priorities, and if a new message is created (based on an previous message), don’t forget to pass the priority to this new message.