STM: Encryption and Security

June 28, 2010

Software Transactional Memory makes certain problems easier to solve since there is a additional layer between the data itself and using the data. The standard problems STM’s solve is to coordinate concurrency, provide consistency and failure atomicity. But adding more ‘enterprisy’ logic isn’t that hard.

One of the things that would be quite easy to add is security; in some cases you want to control if someone is allowed to access a certain piece of information (either for reading or for writing). Most stm’s already provide readonly vs update transactions, and once executed in readonly mode a user is now allowed to modify information (although one should be careful that a readonly transaction is not ignored because there is en enclosing update transaction). This can easily be extended to provide some form of security where certain information can only be accessed if certain conditions are met.

Another related piece of functionality is adding encryption; in some cases you don’t want the data to be available unprotected in memory (someone could access it with a debugger for example) or placed on some durable storage or being distributed. Adding some form of encryption when committing and decryption when loading also would not be that hard to realize, since the transaction already exactly knows what data is used. Personally I don’t like to work with encryption since it very often is completely in your face and gets intertwined with the design, but using some declarative annotations, would remove a lot of pain.

Advertisements

STM: Object vs Field Granularity

June 28, 2010

Stm’s for object oriented languages like Java, c#, Scala etc can either be:

  1. field granular: meaning that each field of an object is managed individually
  2. object granular: meaning that all fields of an object are managed indivisible

Both approaches have some pro’s and con’s. The advantage of field granularity, is that independent fields of transactional object won’t cause conflicts. If you have a double linkedlist for example, you want the head and tail of the list to be accessed no matter what happens on the other side. This reduces the number of conflicts you get, so improves concurrency.

But having field granularity is not without its problems:

  1. more expensive transactions: because each field needs to be tracked individually
  2. more expensive loading; with a field granularity, each fields needs to be loaded individually.
  3. more expensive conflict detection; since each fields needs to be checked individually
  4. more expensive commit: each field needs to be locked/updated individually

This means that transactions require more cpu and memory. Another problem is that a transaction that uses field granularity takes longer and this means that it could suffer from more conflicts (causing reduced concurrency) and causing more conflicts (since they acquire locks longer). And last but not least, a field granular transaction puts a lot more pressure on your memory bus since more ‘synchronization’ actions are needed.

In Multiverse (when instrumentation is used), object granularity is the default. And to make field granularity possible, just place a @FieldGranular annotation on top of your field.


Java Extreme Performance: Part I – Baseline

June 24, 2010

I’m currently working on writing a new STM engine for the Multiverse project and this time the focus is going to be on scalability and performance. The goal is that for uncontended data and 1 ref per transaction, the update performance should be a least 10 million per second and readonly transactions should be 50/75 million per second (all per core and it should scale linearly). So I’m going to write some posts about what I discovered while improving performance and this is the first one.

The first big thing is that profilers become useless (even if you use stack sampling) because the call stack often changes so frequently, that no information can be derived from it (even if you use a sampling frequency of 1 ms which apparently is the lowest you can get).

What works best for me is to throw away everything you have and start from scratch with an (incomplete) baseline. This helps to create some kind of ‘wow.. I wish I could get this performance’ and by adding more logic, you can see how expensive something is. In a lot of cases you will be flabbergasted how expensive certain constructs are; e.g. a volatile read or a polymorphic method call, or that the JIT is not able to optimize something you would expect it to optimize.

This is a very time consuming process (especially since it also depends on the platform or the jdk you are using). But it will help to gain a deeper insight and help you to write better performing code.


Groovy and STM

May 18, 2010

Yesterday evening I was playing with some initial support for Groovy in Multiverse; a software transactional memory implementation for Java. And with the help of Alex Tkachman is was easy to get up some initial support.

If you want to have a transactional reference (similar to a ref in Clojure), you can just create a Ref (or in this case a LongRef):

class Account{
   final LongRef balance = new LongRef();

   void long getBalance(){balance.get()}

   void setBalance(long newBalance){
       if(newBalance<0){ 
            throw new InsufficientFundsException();
       }
       this.balance.set(newBalance);
   }
} 

And if you want to have an atomic block, you can execute the following:

    Account from = new Account(10)
    Account to = new Account(10)

    atomic(readonly:false, trackreads:true) {
      from.balance -= 5
      to.balance += 5
    }

    println "from $from.balance"
    println "to $to.balance"

The parameters in the atomic block are not needed (Multiverse is able to infer some settings and in the future more inference will be added), but I wanted to see if it was possible. Having closure support in a language, increases language complexity, but imho makes a language a lot easier to use and less painful on the eyes.

There is more clutter that can be removed, but I really need to polish my Groovy skills. Groovy support is going to be added to Multiverse 0.6 (expected in 6/8 weeks). I haven’t checked in the new groovy support on the snapshot, so contact me if you want to play with it.


Liking Gradle

April 17, 2010

Last week I have spend a lot of time getting Maven 2 the way I wanted. In most cases Maven is doable, but in case Multiverse, I have some very specific requirements. I need to run the same tests again using different configurations (javaagent vs static instrumentation) and I also need to fabricate a very special jar that integrates all dependent libraries (using jarjar) and does some pre-compilation.

The big problem with Maven is that defining complex builds using the plug-in system is not that great. I have spend a lot of time on configuring the plugins because they were not working like expected or not working at all.

A long time ago I took a look at Gant, but ANT (even if you do it in groovy) still is ANT; meaning a lot of manual labor. So I wanted to give Gradle a try; the power of Groovy combined with a plugin system and predefined build system from Maven. And after 1 day of playing with the scripts, I have got most stuff up and running exactly like the current Maven system (even the artifacts being deployed to the local maven repository!). And I guess that during this week I can make the switch final.

One of the things I really need to get started on, is creating an elaborate set of benchmarks. Using Maven it is fighting the framework, but with the complete freedom you have in Gradle, but still all the facilities provided by plugins, it think it is going to a joy to begin with.

We’ll see what Maven 3 is going to bring, but I’m glad that there is a usable alternative available.


Multiverse: Timed blocking transactions

April 15, 2010

With traditional concurrency control you can do a timed wait on some resource (a lock for example) to come available. If the resources
doesn’t come available within a certain limit, the operation fails and this needs to be handled in the code. There are a few problems with
this approach:

  1. no convenient way to pass the total timeout to each blocking call
  2. you need to have several versions of blocking methods (with or without timeout, interruptible or not) and often there is no easy abstraction higher up. So you keep getting almost identical methods, that only differ in the type of blocking method they call
  3. no easy way to roll back changes. If you decide to abort an operation because of a timeout, it could be hard to restore the system in the original state (so no atomicity)

With software transactional memory these problems can be solved. I have just added support for timed blocking methods, so you can say something like:

@TransactionalObject
class BankAccount{
    private int balance;

    public int balance(){
      return balance;
    }

    public void setBalance(int balance){
       this.balance = balance;
    }
}

  @TransactionalMethod(timeout = 1, timeoutTimeUnit = TimeUnit.SECONDS)
  public static void tryRemove(Account a, int amount){
      a.setBalance(a.getBalance()-amount);//is rolled back in case of a failure

       if(a.getBalance()<0{
              retry();
       }       
    }

When the tryRemove is called (and no transaction is running), and the balance is not sufficient, the call blocks until a timeout occurs or the balance is increased and the money withdrawn. And if you want the call to be interruptible, you just need to add the InterruptedException:

    @TransactionalMethod(timeout = 1)
    public void tryRemoveInterruptibly(int amount)throws InterruptedException{
       balance = balance - amount;//is rolled back in case of a failure

       if(balance<0{
              retry();
       }       
    }

And the timeunit defaults to SECONDS, so no need to configure that.

In Multiverse you only need to specify multiple blocking versions of a method if that method is exposed to non transactional methods. Transactional methods calling transactional methods, will always lift on the transaction of the outermost transactional call (Multiverse provides a flattened transaction model for nested transactions for now). This also means that it is very easy to create a timed blocking method on datastructures you don’t own.


Multiverse STM: Looking for a Groovy Committer

April 10, 2010

One of the mission statements of Multiverse is to create a STM that can be used inside any JVM based language. We already have Scala integration available (although it needs some care) and since yesterday we also have someone responsible for integrating Multiverse with JRuby (welcome aboard Sai Venkat).

But it would be cool if we can add Groovy integration in the 0.5 release (in a week) so that it seamlessly integrates with Groovy. So we are looking for someone with Groovy experience, wants to experiment with STM technology and work on an Open Source project.

If you are interested, send a mail to alarmnummer at gmail dot com.