Breaking Oracle SERIALIZABLE

October 4, 2008

The SERIALIZABLE isolation level, is the highest isolation level for transactions. When transactions are executed under this isolation level, they should behave as they were being executed serially (one after another). However under Oracle, and all other databases that use Multi Version Concurrency Control (MVCC) like Postgresql, MySQL + InnoDb, it is possible to to break this behavior. At the moment I’m working on an STM implementation that also uses MVCC, so it suffers from the same problem, I’ll get back to this in another blogpost.

The problem

If we have 2 tables A (with column a) and table B (with column b). And two transactions, with SERIALIZABLE isolationlevel, execute the following statements concurrently:

transaction1 INSERT INTO B (SELECT COUNT(*) FROM A)
transaction2 INSERT INTO A (SELECT COUNT(*) FROM B)

What do you think the result is going to be? The SERIALIZABLE isolation level mandates a serial execution of transactions, so transaction1 is executed first, or transaction 2 is executed first. If Transaction 1 executes first, the result would be A=1, and B=0. If Transaction2 executes first, the result would be A=0 and B=1. With MVCC databases the result can also be A=0 and B=0. How is this possible?

The cause

Transactions with the SERIALIZABLE isolation level, use a snapshot of the world for the duration of the transaction (same goes for the READ_ONLY isolation level), so during the execution of the transaction it will not see changes made by other transactions (it will see its own changes of course). When the transaction wants to commit, the database checks if there are conflicts and if there are, the transaction is aborted and you get the feared ORA-08177: can’t serialize access for this transaction.

Time Transaction 1 Transaction 2
T1 start
T2 start
T3 INSERT INTO B (SELECT COUNT(*) FROM A)
T4 INSERT INTO A (SELECT COUNT(*) FROM B)
T5 commit
T6 commit

At T4, Transaction2 won’t see the record that has been inserted in B by Transaction1, so 0 is inserted in table A. When Transaction1 wants to commit at T5, it can be committed because there were no conflicting writes. When Transaction2 wants to commit at T5, it can commit because it also has no conflicting writes. So this is why the result is A=0 and B=0. In most cases it won’t be an issue, but it is interesting to know that MVCC is not perfect.

If you want to know more about this problem, I advice the excellent books of Thomas Kyte.

Advertisements

Oracle and ORA-08177

November 28, 2007

I posted a blog on the company blog about Oracle and the ORA-08177. So if want to know more about the subject I suggest checking it out.

Oracle and ORA-08177


Oracle and locking thoughts

January 28, 2007

Introduction

I was talking with a colleague about a locking issue in Oracle. We have the following constraint:

An administrator (an user with the is_admin flag set to ‘T’) is allowed to change the is_admin flag on users. To prevent problems, like locking out all administrators, an administrator is not allowed to unset his own is_admin flag. This way you will always have at least one administrator.

But if there are concurrent administrators, a naive implementation could lead to data races:

user 1 (an administrator) sets the is_admin flag to 'F' of user 2.
user 2 (an administrator) sets the is_admin flag to 'F' of user 1.

That is why some concurrency control is needed. We were discussing different ways of solving this problem:

  1. using READ_COMMITTED isolation level
  2. using READ_COMMITTED isolation level and pessimistic locking
  3. using SERIALIZED isolation level

There are more alternatives, but for argument sake I will only discuss these.

Short introduction to MVCC

We are using Oracle and it uses a special concurrency technology: Multi Version Concurrency Control (MVCC). I won’t go into the details of this technology (there is enough good material written about this subject), but I’m going to explain how it can be used as part of the solution. MVCC uses a System Change or Commit Number (SCN): every time a commit is made, this number is increased. The cool thing about MVCC is, is that it is possible to reconstruct records as they were at some point in time. This point in time, can be identified with the SCN. This means that a transaction can have a consistent view over data, whatever other transactions are doing. Oracle support 2 levels of read consistent views:

  1. statement level read consistency: when a statement is executed, the SCN is stored and used for the duration of the execution of that statement. This means that during the execution of that statement, it won’t see changes made by other transactions. That is why the READ_UNCOMMITTED isolation level isn’t supported by MVCC: it doesn’t need to, because it always can track back to a committed version of a record, and it doesn’t need to read uncommitted data. When the next statement is executed, the newest SCN will be selected. This means that during the execution of a single transaction, multiple SCN’s could be used.
  2. transaction level read consistency: when a transaction begins, the SCN is stored and used for the duration of that transaction. This means that a transaction is not able to see changes made by other transactions. This means that transaction level read consistency, prevents unrepeatable reads and phantom reads and that is why it is used to implement the SERIALIZED isolation level. The REPEATABLE_READ isolation level isn’t needed because MVCC provides a consistent view for the duration of the transaction. Other inserts/deletes (phantom reads) and updates (unrepeatable reads) are not visible with transaction level read consistency.

Using READ_COMMITTED

Let us bring this theory in practice with the READ_COMMITTED isolation level and no locking on the selected records. It is very important to realize that records that are updated, are automatically locked (isolation level doesn’t matter).

This is the initial database:

user(id=1, is_admin='T')
user(id=2, is_admin='T')
SCN = 0

Let us follow the following pseudo code steps:

Time Transaction 1 executes Transaction 2 executes
t1 start transaction trans1  
t2   start transaction trans2
t3 trans1.CN = SCN
select * from user where id = 1
(user 1 is not locked)
 
t4   trans2.CN = SCN
select * from user where id = 2
(user 2 is not locked)
t5 trans1.CN = SCN
update user set is_admin = ‘F’ where id = 2
(user 2 is locked)
 
t6   trans2.CN = SCN
update user set is_admin = ‘F’ where id = 1
(user 1 is locked)
t7 commit transaction trans1SCN++ (is now 1)
release lock user 2
 
t8   commit transaction trans2
SCN++
release lock user 1

There is no reason for the Oracle database to complain, because no conflicting locks are held, and the CN’s all match. So the transactions are allowed to commit, and this will be the database:

user(id=1, is_admin='F')
user(id=2, is_admin='F')
SCN=2

As you can see, there are no administrators anymore. Using READ_COMMITTED isolation level without locking on the selects, doesn’t solve our problem.

Using READ_COMMITTED and pessimistic locking

One solution to prevent this race problem, it to use the READ_COMMITTED isolation level in combination with pessimistic locking. Oracle support pessimistic locking with the ‘select for update’ statement.

Let is follow the same example, but now with select for update:

Time Transaction 1 executes Transaction 2 executes
t1 start transaction trans1  
t2   start transaction trans2
t3 trans1.CN = SCN
select * from user where id=1 for update
(user 1 is locked by trans1)
 
t4   trans2.CN = SCN
select * from user where id=2 for update
(user 2 is locked by trans2)
t5 trans1.CN = SCN
update user set is_admin = ‘F’ where id=2
rollback: (because trans2 already holds the lock on user 2,trans1 is rolled back)
release lock user 1
 
t6   trans2.CN = SCN
update user set is_admin = ‘F’ where id = 1
(user 1 & 2 are locked by trans2)
t7   commit trans2SCN++
release lock user 1 & 2

The result is that trans1 is rolled back, and trans2 is committed. The database will be:

user(id=1, is_admin='T')
user(id=2, is_admin='F')
SCN=1

The result is that one of the users will remain administrator. Using the select for update, and the update, provides a critical section around user 1 and 2: the check and update are now atomic and this prevent the race problem to occur. Personally I think the ‘select for update’ is a littlebit misguiding because you are not always to going to update the record(s) you select.

SERIALIZED Isolation Level

A different alternative to using pessimistic locking, is using optimistic locking. You could add optimistic locking yourself (by adding some version field), but the cool thing is that the SERIALIZED isolation level, under MVCC, also provides this functionality. It provides even more, because the SERIALIZED isolation level also prevents phantom reads. It is important to realize that preventing phantom reads (and throwing an optimistic locking failure) is not always desired behavior.

Time Transaction 1 executes Transaction 2 executes
t1 start transaction trans1
trans1.CN = SCN
 
t2   start transaction trans2trans2.CN = SCN
t3 select * from user where id = 1  
t4   select * from user where id = 2
t5 update user set is_admin = ‘F’ where id = 2  
t6   update user set is_admin = ‘F’ where id = 1
t7 assert trans1.CN = SCN
commit trans1inc SCN (SCN is now 1)
 
t7   assert trans.CN = SCN
(because 0!=1, the assert fails)
rollback

Transaction 2 now gets a ‘ORA-08177: can’t serialize access for this transaction’, and the database will be:

user(id=1, is_admin='T')
user(id=2, is_admin='F')
SCN=1

So using the SERIALIZED isolation level also prevents the race problem to occur.

Conclusion

My personal preference would be the second solution, because it locks what needs to be locked (and not more, so no unnecessary errors).