Use the real database when testing

When unit or integration testing repositories (DAOs in yesterday's lingo), try to use the same type of database as the production environment, instead of some lightweight alternative like HSQLDB. Unless, of course, that lightweight alternative is also what runs in production.

Using a different type of database can hide problems like:

  1. differences in SQL interpretation, types, precision and formatting.
  2. missing database constraints. This is functionality I prefer to guarantee in handwritten DDL scripts, instead of leaving it out of my control because it is generated by some OR-mapping technology. This approach also invites creating update scripts from day one, instead of them being an afterthought that causes pain when upgrading to the next release.
  3. differences in concurrency control mechanisms. What leads to a pessimistic lock in one database could lead to an optimistic locking failure in another.
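The first two points can be made concrete with a small sketch. Here Python's built-in SQLite stands in for any lightweight test database: its type affinity silently accepts a string in an INTEGER column, something a stricter production database would reject with an error. The table and values are invented for illustration.

```python
import sqlite3

# An in-memory SQLite database, the kind of lightweight store often used in tests.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

# SQLite's type affinity accepts a string in an INTEGER column without complaint;
# a production database with strict typing would raise an error here, so a test
# run only against SQLite never sees the defect.
conn.execute("INSERT INTO account (id, balance) VALUES (1, 'not-a-number')")

row = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()
print(type(row[0]), row[0])  # the string is stored as-is
```

A test suite that only ever runs against a database this forgiving will happily pass while the same INSERT fails in production.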

In short: databases differ in behavior, and these differences should not be ignored. Not using the real database can lead to problems that are only observed in the acceptance or production environment. And the longer it takes to encounter a bug, the more time it costs to solve.

I know that using the same type of database can be difficult in continuous build environments: the environment needs to support not only different databases, but also multiple versions of each. One way to solve this is to use a virtualized operating system (using e.g. VMware) per database version. This should prevent the build environment from turning into a maintenance nightmare.

2 Responses to Use the real database when testing

  1. Ryan Cooper says:

    The two options (testing with a lightweight database and testing with the real database) are not mutually exclusive. Presumably, there is a good reason to use a lightweight database for testing; otherwise no one would bother. Usually the reason is performance; running all your tests on a file system backed database can be slow. This lengthens the feedback loop, reducing the value of your tests. But if you only ever test with an in-memory db, you’ll miss the types of defects you mentioned above.

    My preferred solution to this dilemma is to use an in-memory db to run tests on every commit to your source control system (so you still have a tight – but non-comprehensive – feedback loop) and run the same tests using a “real” database on a different machine. You get the best of both worlds: fast feedback from the in-memory tests, and a safety net from the real database tests.
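Ryan's two-tier setup can be sketched as a small connection factory that picks the database from an environment variable: unset for the fast per-commit build, set to the real database's URL on the comprehensive build machine. The variable name `TEST_DB_URL` and the helper are assumptions for illustration, not part of any particular framework; SQLite stands in so the sketch stays runnable.

```python
import os
import sqlite3

def open_test_connection():
    """Return a database connection for the test run.

    TEST_DB_URL is a hypothetical environment variable: unset means the fast
    in-memory database (tight but non-comprehensive feedback loop); set, it
    points at the real database used by the slower safety-net build.
    """
    url = os.environ.get("TEST_DB_URL")
    if url is None:
        # Per-commit build: in-memory database for fast feedback.
        return sqlite3.connect(":memory:")
    # Safety-net build: in a real setup this would use the production
    # database's driver (e.g. psycopg2.connect(url)); sqlite3 stands in here.
    return sqlite3.connect(url)

# The same test code runs unchanged against either backend.
conn = open_test_connection()
conn.execute("CREATE TABLE t (id INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])
```

The point is that the test suite itself never changes; only the environment decides which database answers.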

  2. dave says:

    The easiest way around this is to use a lightweight database on the development machine before committing, and use the real database on the continuous integration build (e.g. CruiseControl).
