Why Double Checked Locking is flawed


I was having a discussion with fellow developers friends about how to implement Singleton and many things came up. In the discussion I soon realized that not many developers are aware of the fact that Double Checked Locking (DCL) is flawed. Many developers say that they have never seen any issues with it and hence they use it everywhere in order to overcome the Synchronization over head. So I thought this would be a good topic to discuss here at my blog and this post is going to focus on it.

Before moving forward lets first talk about what is Double Checked Locking. To understand where the double-checked locking idiom originated, you must understand the common singleton creation idiom, which is illustrated below:

import java.util.*;
class Singleton
{
  private static Singleton instance;
  private Vector v;

  private Singleton()
  {
    v = new Vector();
    v.addElement(new Object());
  }

  public static Singleton getInstance()
  {
    if (instance == null)          //1
      instance = new Singleton();  //2
    return instance;               //3
  }
}

The design for this class is suitable for single thread usage but when it comes to multi-thread usage. Since threads can get preempted we can have multiple instances of Singleton object. The solution is to protect the getInstance() method through synchronization as shown below:

  public static synchronized Singleton getInstance()
  {
    if (instance == null)          //1
      instance = new Singleton();  //2
    return instance;               //3
  }

This code works perfectly fine for multi-threaded access to getInstance() method. But if you look at the code deeply, synchronization is required only once, when the variable is initialized. Rest of the time there is no need for synchronization and hence we get the synchronization overhead for other threads since the whole method is synchronized.

In an effort to make this method more efficient (avoiding the synchronization on common, the read path) an idiom called Double Checked Locking was born. The idea is to avoid the costly synchronization for all invocations of the method except the first. The cost of synchronization differs from JVM to JVM. In the early days, the cost could be quite high. As more advanced JVMs have emerged, the cost of synchronization has decreased, but there is still a performance penalty for entering and leaving a synchronized method or block. Regardless of the advancements in JVM technology, programmers never want to waste processing time unnecessarily.

Here is how DCL is implemented:

public static Singleton getInstance()
{
  if (instance == null)
  {
    synchronized(Singleton.class) {  //1
      if (instance == null)          //2
        instance = new Singleton();  //3
    }
  }
  return instance;
}

The theory behind double-checked locking is that the second check at //2 makes it impossible for two different Singleton objects to be created and we also avoid the synchronization overhead for the most frequent read path by check at // 1.

The theory behind DCL is perfect but reality is entirely different. The problem with double-checked locking is that there is no guarantee it will work on single or multi-processor machines. The issue of the failure of double-checked locking is not due to implementation bugs in JVMs but to the current Java platform memory model. The memory model allows what is known as “out-of-order writes” and is a prime reason why this idiom fails.

What is out of order writes?

To illustrate the problem, you need to re-examine line //3 from the code above. This line of code creates a Singleton object and initializes the variable instance to refer to this object. The problem with this line of code is that the variable instance can become non-null before the body of the Singleton constructor executes.

Before explaining how this happens, accept this fact for a moment while examining how this breaks the double-checked locking idiom. Consider the following sequence of events with the code above:

  • Thread 1 enters the getInstance() method.
  • Thread 1 enters the synchronized block at //1 because instance is null.
  • Thread 1 proceeds to //3 and makes instance non-null, but before the constructor executes. Thread 1 is preempted by thread 2.
  • Thread 2 checks to see if instance is null. Because it is not, thread 2 returns the instance reference to a fully constructed, but partially initialized, Singleton object.
  • Thread 2 is preempted by thread 1. Thread 1 completes the initialization of the Singleton object by running its constructor and returns a reference to it.

These sequence of events results in a period of time where Thread 2 returned an object whose constructor had not executed. To show how this occurs, consider the following pseudo code for the line: instance = new Singleton();

mem = allocate();             //Allocate memory for Singleton object.
instance = mem;               //Note that instance is now non-null, but
                              //has not been initialized.
ctorSingleton(instance);      //Invoke constructor for Singleton passing
                              //instance.

This pseudo code is not only possible, but is exactly what happens on some JIT compilers. To prove the point I am showing a stripped down version of getInstance() method and its assembly code.

class Singleton
{
  private static Singleton instance;
  private boolean inUse;
  private int val;

  private Singleton()
  {
    inUse = true;
    val = 5;
  }
  public static Singleton getInstance()
  {
    if (instance == null)
      instance = new Singleton();
    return instance;
  }
}

Here is the assembly code for the above code:

;asm code generated for getInstance
054D20B0   mov         eax,[049388C8]      ;load instance ref
054D20B5   test        eax,eax             ;test for null
054D20B7   jne         054D20D7
054D20B9   mov         eax,14C0988h
054D20BE   call        503EF8F0            ;allocate memory
054D20C3   mov         [049388C8],eax      ;store pointer in
                                           ;instance ref. instance
                                           ;non-null and ctor
                                           ;has not run
054D20C8   mov         ecx,dword ptr [eax]
054D20CA   mov         dword ptr [ecx],1   ;inline ctor - inUse=true;
054D20D0   mov         dword ptr [ecx+4],5 ;inline ctor - val=5;
054D20D7   mov         ebx,dword ptr ds:[49388C8h]
054D20DD   jmp         054D20B0

For those who are not hands on Assembly language here is a summary of the code sequence:

  • The first two lines of assembly code at B0 and B5 load the instance reference from memory location 049388C8 into eax and test for null. This corresponds to the first line of the getInstance() method.
  • The first time this method is called, instance is null and the code proceeds to B9.
  • The code at BE allocates the memory from the heap for the Singleton object and stores a pointer to that memory in eax.
  • The next line, C3, takes the pointer in eax and stores it back into the instance reference at memory location 049388C8. As a result, instance is now non-null and refers to a valid Singleton object.
  • However, the constructor for this object has not run yet, which is precisely the situation that breaks double-checked locking.
  • Then at line C8, the instance pointer is dereferenced and stored in ecx.
  • Lines CA and D0 represent the inline constructor storing the values true and 5 into the Singleton object.
  • If this code is interrupted by another thread after executing line C3 but before completing the constructor, double-checked locking fails.

Not all JIT compilers generate the code as above. Some generate code such that instance becomes non-null only after the constructor executes. However, this does not mean you should use double-checked locking in these instances. There are other reasons it could fail. In addition, you do not always know which JVMs your code will run on, and the JIT compiler could always change to generate code that breaks this idiom.

So how do we solve the issue. Some developers come up with the following code:

        public static Singleton getInstance() {
		if (instance == null) {
		    instance = this.getInstanceSlow();
		}
		return instance;
	}

	private static synchronized Singleton getInstanceSlow() {
		if (instance == null) {
		    instance = new Singleton();
		}
		return instance;
	}

In other words, delegate the locking to another method. This is entirely equivalent to the inline version, and fails for the same reasons.

Volatile?

Another idea is to use the keyword volatile for the variables inst and instance. According to the JLS , variables declared volatile are supposed to be sequentially consistent, and therefore, not reordered. But two problems occur with trying to use volatile to fix the problem with double-checked locking:

  1. The problem here is not with sequential consistency. Code is being moved, not reordered.
  2. Many JVMs do not implement volatile correctly regarding sequential consistency anyway.

The second point is worth expanding upon. Consider the code:

class Test
{
  private volatile boolean stop = false;
  private volatile int num = 0;

  public void foo()
  {
    num = 100;    //This can happen second
    stop = true;  //This can happen first
    //...
  }

  public void bar()
  {
    if (stop)
      num += num;  //num can == 0!
  }
  //...
}

According to the JLS, because stop and num are declared volatile, they should be sequentially consistent. This means that if stop is ever true, num must have been set to 100. However, because many JVMs do not implement the sequential consistency feature of volatile, you cannot count on this behavior. Therefore, if Thread 1 called foo and Thread 2 called bar concurrently, Thread 1 might set stop to true before num is set to 100. This could lead thread 2 to see stop as true, but num still set to 0. There are additional problems with volatile and the atomicity of 64-bit variables, I wont go in to the details as it is out of scope for this post.

This is it from me for now. Stay tuned for the next post in which I will discuss various solutions available for fixing this and pros and cons of those solution.

Until then Happy Coding !

Advertisements
Why Double Checked Locking is flawed

23 thoughts on “Why Double Checked Locking is flawed

  1. Ahmed Khan says:

    Another good post. I am sure becoming a fan of your blog. I have always been using the Double-Checked-Locking idiom in my code but never realized that it could cause issues.

    Infact I have seen a case in our code base where the uninitialized fields issue came up and we were never able to track out the issue and regarded this as a random case. As a result there are many places where we ended up adding null checks thourgh out the code base to guard against any future occurances.

    Thanks to you article I know whats is going on and will fix up the code base and make code clean by removing all those un-necessary null checks.

    1. Faisal Feroz says:

      @Ahmed Thanks for the comments. You made a pretty nice point about null checking. Most of the times developers add up similar checks to guard against what they call random behaviors 🙂

  2. Andy Till says:

    Nice post. Firstly, in Java Concurrency In Practice it states you must use volatile for double checked locking so yes no one should be doing this on non-volatile fields.

    Your volatile example is also demonstrating a compound action on two variables so volatile is not appropriate here, a synchronized block should be used.

    Do you know which VMs do not implement volatile correctly? I thought this was resolved by VMs that meet the Java Memory Model spec.

    1. Faisal Feroz says:

      Volatile wasn’t correctly implemented prior to Java 5 in which JSR-133 was implemented fully. So almost all the VMs prior to Java 5 have volatility issue as the problem was with the Java Memory Model.

      You can refer to the JSR-133 for more information about the issue.

  3. How about this?

    public static Singleton getInstance() {
    if (instance == null) {
    synchronized(Singleton.class) { //1
    if (instance == null) //2
    Singleton instance_ = new Singleton(); //3
    instance = instance_;
    }
    }
    return instance;
    }

    1. @Animesh The JIT sees an opportunity for optimization here. The key line is instance=instance_

      The JIT will remove this line and generate the code simialr to this one:

      public static Singleton getInstance()
      {
        if (instance == null)
        {
          synchronized(Singleton.class) {
             if (instance == null)
                instance = new Singleton();        
          }
        }
        return instance;
      }
      

      which is similar to the original one and hence flawed for obvious reasons.

  4. Faisal Feroz says:

    @Animesh theres one more pattern that I have seen that people try which is also flawed because of JIT. Heres the code that people try:

    public static Singleton getInstance()
    {
      if (instance == null)
      {
        synchronized(Singleton.class) {      //1
          Singleton inst = instance;         //2
          if (inst == null)
          {
            synchronized(Singleton.class) {  //3
              inst = new Singleton();        //4
            }
            instance = inst;                 //5
          }
        }
      }
      return instance;
    }
    

    Again here JIT combines lines //4 and //5 and generates instance = new Singleton(); Since according to JLS its fine to move a code into a synchronized block (but the reverse is not allowed). Hence after the optimization we get out-f-order writes.

      1. Faisal Feroz says:

        @Animesh but why would you want to do that? Instead of disabling JIT get rid of double checked locking and make use of static pattern of implementing singleton.

        BTW static is just one way, there are other approaches as well – Topic of my next blog post.

  5. KitD says:

    I have seen your arguments against the Double-locking pattern mentioned elsewhere, but one solution that you don’t mention is, rather than checking a reference for non-nullity, to use instead a 32-bit flag (eg, boolean) to indicate initialisation, since updates to 32-bit values are guaranteed to be atomic. EG

    private static Singleton instance;
    private static volatile boolean init = false;

    public static Singleton getInstance()
    {
    if (! init)
    {
    synchronized(Singleton.class) {
    if (! init)
    instance = new Singleton();
    init = true;
    }
    }
    return instance;
    }

    Do you have any comments on this?

    1. KitD says:

      Aargh, formatting. Try again:

      private static Singleton instance;
      private static volatile boolean init = false;
      
      public static Singleton getInstance()
      {
          if (! init)
          {
              synchronized(Singleton.class) {
                  if (! init) {
                      instance = new Singleton();
                      init = true;
                  }
              }
          }
          return instance;
      }
      

      EDIT – I fixed the code and added the paranthesis.

      1. Faisal Feroz says:

        @KitD same issue here as well. instance can become non-null before the constructor is called (code reordering).

        Here the issue is not the null check but the instance variable becoming non-null before the object is fully initialized. Even adding the volatile init variable in place of null check won’t help.

  6. Jaroslav Sedlacek says:

    There is actually more reasons why original DCL is broken. Even when writes are in order DCL might not work. Thread 2 can still see partially constructed instance as new object is not correctly published.

  7. Julia says:

    Awesome, good work. I really admire this work. Nice to find articles worth reading. I will load the rss feed into our feedreader.Thanks and keep up the good work.

  8. Dmitriy Pichugin says:

    Singleton solution by Bill Pugh is the answer.
    Link: http://en.wikipedia.org/wiki/Singleton_pattern
    Code:
    public class Singleton
    {

    // Private constructor prevents instantiation from other classes
    private Singleton()
    {
    }

    /**
    * SingletonHolder is loaded on the first execution of Singleton.getInstance()
    * or the first access to SingletonHolder.INSTANCE, not before.
    */
    private static class SingletonHolder
    {
    public static final Singleton INSTANCE = new Singleton();
    }

    public static Singleton getInstance()
    {
    return SingletonHolder.INSTANCE;
    }

    }
    my $0.02
    cheers
    Dmitriy

  9. Anon says:

    How did you reach these results? Because according to Bill Pugh (the spec lead of jsr 133, java memory model) Double check locking idiom, used with the volatile keyword behaves perfectly well for all standard JVM’s built after jdk 1.5. And volatiles (neither on 32-bit nor on 64-bit) aren’t atomic.

  10. No matter how hard we try the JVM always lands us up in trouble. The thing is we must stop trying to outsmart JVM. Using enum is the preferred approach to implement a singleton design pattern.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s