19 Mar 2014, 19:34

Upgradable locks in Go

Share

Read-write locks are one of the basic means of coordinating access to a shared resource. As many readers can coexist as you’d like, but a writer requires exclusive access. For performance reasons, you’d like to have your write locks over as short a section as possible so you don’t unnecessarily block readers.

Sometimes, though, you have a long stretch of reads, and then a relatively short section of writes dependent on those reads. While you’re reading, you don’t care about other readers - but you do want to ensure that after your reads, you (and not some other caller) immediately acquire the write lock. This is known as “lock upgrading”.

An example of this is restructuring a file (eg, compacting a database file). You read it in, process it, write it to a temp file, and rename it to overwrite the original. While you’re reading the original and writing to a temp file, you don’t care about other readers. But after you’re done reading, you do want to adjust all the metadata so that subsequent readers are pointed at the right file handle / mmap’d buffer / whatever, and also to ensure that no writers bash the file between your reads and your writes.

Unfortunately, POSIX locks don’t support this. Java, which probably has the best shared-memory concurrency runtime available, explicitly forbids it. If you happen to be using Go, which has only the most basic POSIX-y concurrency constructs, how do you do it without holding a write lock the entire time?

The key insight is that the caller doesn’t actually need to ensure that it “upgrades the lock”, per se. It just needs to ensure that its code is run immediately the next time the write lock is acquired. It turns out this is not too difficult in a language that supports passing around functions:

import (
	"fmt"
	"runtime"
	"sync"
)

type UpgradableLock struct {
	uglMutex sync.RWMutex
	upgrades chan func()
}

func NewUpgradableLock() *UpgradableLock {
	return &UpgradableLock{sync.RWMutex{}, make(chan func(), 1)}
}

func (ugl *UpgradableLock) Lock() {
	ugl.uglMutex.Lock()
	select {
	case v, _ := <-ugl.upgrades:
		v()
	default:
	}
}

func (ugl *UpgradableLock) MaybeUpgrade(f func()) bool {
	select {
	case ugl.upgrades <- f:
		go func() {
			ugl.Lock()
			ugl.Unlock()
		}()
		return true
	default:
		return false
	}
}

// RLock, RUnlock, and Unlock just call the corresponding method on the underlying RWMutex.

The full example lives here. When we try to upgrade, we attempt to place the function with our write-lock-requiring code in a channel, that is always selected from when the lock is acquired. In case there are no writers trying to acquire that lock (and hence run our code), we fire off one with no additional side effects. In order to ensure we’re handing off to the particular write operation we’re placing on the channel, we set the capacity of the channel to 1. If we can’t place that function (ie, there is already another writer scheduled), the upgrade fails.

There are two important limitations to this approach:

  • It doesn’t support arbitrary upgrade-downgrade cycling. This is less than ideal if for instance we wanted to split our write operation into two parts, separated by a read. Currently it is a one-way street.
  • The upgrade is not guaranteed to succeed. This makes sense - if you have multiple readers wanting to upgrade, only one of them can be next. What is guaranteed is that if we have multiple upgraders, at least one succeeds. The rest can continue spinning in the interim, attempting to stay low-impact with time.Sleep() or runtime.Gosched() calls on each cycle until they all complete.

What’s nice in addition is that this approach works on any platform with some sort of passable function / callable construct, read/write locks, and a threadsafe queue (in fact, the queue here isn’t much of a queue - you could replace it with an atomic reference or a plain mutex and a pointer).