Friday, March 21, 2014

Linear Types and Program Resource Management

In the previous post "The Case for a Concurrency-oriented VM", I speculated that a VM aimed at concurrency-oriented languages, and with a focus on concurrency safety, might be a good thing to have. I also mentioned in passing that linear types might have something to offer in that regard.
Today, I will elaborate on that. In particular, I will show how linear types enable expressing resource management related constraints, and how some common patterns for concurrency-safe resource management can be expressed using linear types.
The context is that of an intermediate program representation to be used as input for a virtual machine — a representation which allows certain safety conditions above and beyond mere type safety to be enforced through locally verifiable rules (but not a representation intended to be produced by hand).
More concretely, this can be seen as part of an attempt to model a complete concurrency-oriented language (Erlang, as it happens) using linear types, aiming towards demonstrating that such a thing as an Erlang VM can be defined in terms of a linear-type based VM.

Resources and life cycles

What is the difference between information and physical matter? One key difference is that whereas physical matter can be neither created nor destroyed, information can be copied and destroyed at zero cost. The cost of software, for instance, lies in its design and development — not in manufacturing or distribution (in the original sense). And there are no software landfills; when you erase a file, the zeroes and ones don’t "go anywhere".
Even so, software deals with resources of a physical nature. Some of the values manipulated by software — numbers, for instance — are just bit patterns which can easily be copied, moved around and forgotten. Other values denote some kind of resource — be it a chunk of memory, a file descriptor, or a network connection — which can only be used for one purpose at a time, and which must be disposed of properly when the program is done using it. Resources which have life cycles and which need managing.
While many programming languages have static type checking (verification that values are only ever used according to their type), few have static checking of life cycles (verification that values are only used according to where in their life cycle they are).
That kind of checking is possible, though. It just requires a type system with linear types.

Linear types and values

What are linear types? A value of a linear type cannot be copied, and cannot be destroyed — at least not implicitly.
In other words, each value can be used only once, and must be used exactly once.
This may sound odd to most programmers. But take as an example a simple assignment statement of some imperative language:
a = b;
From the traditional point of view, one thing happens here: The bit pattern of b is copied into the location of the variable a.
From the linear point of view, however, three things actually happen: The old value of a is destroyed; the value of b is copied; and the copy is put in a. At least if a is a pre-existing variable, and b is a variable which is read later in the program.
At the level we’re speaking of here, the destruction and the copying step must be made explicit — it is not a given that these steps are permitted; the value in question may be a resource which does not support copying or destruction — or it may be the case that special steps must be taken when copying or destruction happens. (This may sound familiar to C++ programmers: C++ classes may have copy constructors and destructors which are used for such special steps.)
In other words: For linear values, there is a one-to-one relation between variables and values — there is such a thing as value ownership.
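Rust's move semantics make a convenient playground for this view, since non-Copy values behave close to linearly (no implicit copies, though implicit drops are still allowed). A sketch of the assignment example above, with the copy step made explicit — the Rust framing is my own illustration, not part of the original discussion:

```rust
fn main() {
    let mut a = String::from("old");
    assert_eq!(a, "old");
    let b = String::from("new");

    // The linear view of `a = b;`: the old value of a is destroyed (dropped),
    // the value of b is copied, and the copy is stored in a. Since b is read
    // again below, the copy step has to be explicit:
    a = b.clone();

    assert_eq!(a, "new");
    assert_eq!(b, "new"); // b is still usable only because we cloned it
    // A plain `a = b;` (a move) would be allowed only if b were never used again.
}
```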

Resource categories

Now, back to the practical problem with resource management in programs. Assume that we are to program a virtual machine which has only linear types. Can we model the usual kinds of values as linear types, if we are free to add the necessary built-in functions to support them?
First, let’s categorize value types according to how they are managed. (Developers versed in C++, where resources are also managed more or less explicitly, may notice some familiar patterns.)
  • Can the value be copied and destroyed trivially?
    If yes: value belongs to category 1; this category includes all values which are merely bit patterns of fixed size: integral and floating-point numbers, booleans, characters, enumeration values, bit masks.
  • If no (it requires cleanup): are the point(s) of cleanup in the program statically known?
    If yes: (category 2) the value is used linearly, and can be (explicitly) disposed of at the appropriate place(s).
  • If no (the points of cleanup are not statically known): is the value immutable?
    If yes: (category 3) the value can be either copied in its entirety, and the copies disposed of individually (3a); or the value can be managed through reference counting (3b), in which case a "copy" action consists of incrementing its reference count, and a "destroy" action consists of decrementing its reference count and optionally (if the count reaches zero) disposing of the value.
  • If no (the value is shared and mutable): is the value used from a single thread only (or, more generally, is it only used in scope of a specific other value which is used linearly)?
    If yes: (category 4) the value can be put in a heap which is linearly owned (i.e., only accessible from one thread at a time).
  • If no (the value is mutable and shared between threads): (category 5) the value can be protected by a lock; the lock is reference counted and its disposal causes the value to be disposed.
It is perhaps worth noting that such a program as the Erlang VM uses all of these management techniques, for different kinds of data:
  • Atoms and small integers are contained in a single machine word and are simply copied (category 1).
  • Process contexts are linear (category 2), and are disposed of at specific program points.
  • Messages between processes are copied (category 3a).
  • Large binary values are managed through reference counting (category 3b).
  • Other Erlang terms (tuples, lists, bignums, small binaries, etc.) live in a heap which belongs to a process which is a linear value (category 4).
  • Finally, certain data structures: the process registry, non-private tables, etc., which are mutable and shared among processes, are managed through the use of locks (category 5).
(It should be noted that there is another category, program literals, which resembles category 3 but is optimized to be handled like category 1 most of the time. In the Erlang VM, these require very special handling when code is unloaded. This is an advanced management form which I won’t try to cover just yet. Also, variables accessed with atomic hardware operations comprise yet another category, one which is handled much like category 5.)

Intermezzo: Type notation

In what follows, I will have to write a number of function signatures. These signatures involve four type constructors: ⊗, ⊕, ∀ and ∃.
  • A ⊗ B is a linear product type — which is like a traditional product type (aka record, struct, or tuple type, depending on language) except that each of its fields must be used (consumed) exactly once.
  • A ⊕ B is a linear sum type — which is like a normal sum type (aka tagged union, algebraic datatype, etc.) except that the value it contains (whichever of the variants it is) must be consumed exactly once.
  • ∀T.A(T) is universal quantification — aka parametric polymorphism or generics, which is used for e.g. generic container types.
  • ∃T.A(T) is existential quantification — such as can be used for subtype polymorphism or typeclasses, but in what follows will be used in a more general manner.
The points of interest here are, firstly, the linear nature of values. Secondly: existential types are usually used to link together two values — an object representation and a vtable for that object’s class — and to enforce encapsulation on the type level. But as we shall see, existential quantification can be used in a different manner; notably, one that (unlike the vtable usage) does not involve anything similar to run-time method lookups. We’ll get back to this in the treatment of heaps.
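For readers more at home in a programming language than in type-theoretic notation, here is a rough Rust mapping of the four constructors. The mapping is my own illustration (Rust's ownership is affine rather than linear, so "exactly once" weakens to "at most once"):

```rust
// A ⊕ B: one of two variants, consumed by matching on it.
enum Sum<A, B> {
    Left(A),
    Right(B),
}

// A ⊗ B: a pair whose fields are both owned (and should both be consumed).
type Product<A, B> = (A, B);

// ∀T. T → T: parametric polymorphism.
fn id<T>(x: T) -> T {
    x
}

// ∃T. (T an Iterator over u32): the caller knows some such type exists,
// but not which one - an existential hidden behind `impl Trait`.
fn counter() -> impl Iterator<Item = u32> {
    0..3
}

fn tag(s: &Sum<u32, &str>) -> &'static str {
    match s {
        Sum::Left(_) => "left",
        Sum::Right(_) => "right",
    }
}

fn main() {
    let p: Product<u32, &str> = (1, "one");
    assert_eq!(tag(&Sum::Left(p.0)), "left");
    assert_eq!(tag(&Sum::Right(p.1)), "right");
    assert_eq!(id("one"), "one");
    assert_eq!(counter().sum::<u32>(), 3);
}
```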

Resource management in a linear virtual machine

Now we should be ready to look at how each of the resource categories can be modelled in a virtual machine with linear types.
In the following, T stands for the type of the resource. Most of the cases are quite unsurprising; the interesting bit is the heap case.

Category 1: bit patterns

For this kind of value, we provide built-in functions ("axioms") for both copying and disposal; copy simply copies the bit pattern, and dispose does nothing:
copy : T → T ⊗ T
dispose : T → ()
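In Rust terms (my own sketch), these two axioms correspond to what the Copy bound grants: duplication is a bitwise copy, and disposal has nothing to clean up:

```rust
// copy : T → T ⊗ T - for plain bit patterns, just duplicate the bits.
fn copy<T: Copy>(x: T) -> (T, T) {
    (x, x)
}

// dispose : T → () - a no-op; there is no resource behind the bits.
fn dispose<T: Copy>(_x: T) {}

fn main() {
    let (a, b) = copy(42u32);
    dispose(a);
    assert_eq!(b, 42);
}
```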

Category 2: linear values

This kind of value requires no special functions. Copying is not allowed, and disposal happens explicitly.
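A sketch of such a linearly used value in Rust, with a hypothetical Connection resource (Rust is affine rather than linear: it forbids implicit copies but would still permit an implicit drop, so the "exactly once" half of the rule is only approximated):

```rust
// A hypothetical resource that is moved from use to use and consumed
// by an explicit dispose at a statically known program point.
struct Connection {
    id: u32,
}

fn open(id: u32) -> Connection {
    Connection { id }
}

// Consumes the value; afterwards the variable cannot be touched again.
fn dispose(c: Connection) -> u32 {
    c.id // real cleanup would happen here
}

fn main() {
    let c = open(1);
    let id = dispose(c);
    // dispose(c); // error: use of moved value - the "at most once" rule
    assert_eq!(id, 1);
}
```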

Category 3a: resource copying

This kind of value can be copied; each copy is disposed of individually:
copy : T → T ⊗ T
dispose : T → ()
There is of course some dynamic storage allocation in play here, too. In the function signatures shown above, the storage allocation system is an implicit argument to both operations (in malloc-style); one might imagine alternatives where it is present as an explicit argument.
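Sketched in Rust, this is ordinary deep cloning: copy duplicates the whole resource, allocation included, and each copy is then disposed of on its own:

```rust
fn main() {
    let original = vec![1, 2, 3];
    let duplicate = original.clone(); // copy : T → T ⊗ T (a deep copy)
    drop(original);                   // dispose : T → () for one copy...
    assert_eq!(duplicate, vec![1, 2, 3]); // ...while the other lives on
}
```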

Category 3b: reference counting

Reference counted values are created with an initial reference count of one. Copying and disposal functions are supplied; copy increments the reference count, and dispose decrements it and frees the resource if the count reaches zero. In the simple form of this, freeing the resource does not require any other parameters:
copy : T → T ⊗ T
dispose : T → ()
Again, the underlying storage allocation system is implied.
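Rust's Rc behaves exactly like this category (the Rust framing is mine): cloning bumps the reference count instead of duplicating the data, and dropping decrements it, freeing the resource at zero:

```rust
use std::rc::Rc;

fn main() {
    // Created with an initial reference count of one.
    let r = Rc::new(String::from("shared"));
    assert_eq!(Rc::strong_count(&r), 1);

    // copy : T → T ⊗ T - increment the count, share the data.
    let r2 = Rc::clone(&r);
    assert_eq!(Rc::strong_count(&r), 2);

    // dispose : T → () - decrement; the String survives this drop.
    drop(r2);
    assert_eq!(Rc::strong_count(&r), 1);
    // Dropping r decrements the count to zero and frees the String.
}
```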

Category 4: (deferred)

Single-threaded heaps are a more graceful strategy than lock protection — but the cruder lock-protection mechanism is easier to describe, so we’ll handle that first and return to heaps afterwards.

Category 5: lock-protection

This kind of value is created with a lock and a reference count (initially one). The raw value of type T is not initially available; instead, you get a LockedT. To get access to the value proper, you first need to acquire the lock.
copy and dispose work as for a reference-counted value; acquire and release operate the lock.
copy : LockedT → LockedT ⊗ LockedT
copy : T → T ⊗ LockedT
dispose : LockedT → ()
acquire : LockedT → T
acquire : LockedT ⊗ Timeout → T ⊕ LockedT
release : T → LockedT
(If the type T can also be managed in other ways, such that not all instances of T can be subjected to, e.g., the release operation, then the function signatures need to be a bit different.)
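Rust's Arc<Mutex<T>> mirrors these signatures fairly directly (again my own sketch): the Arc is the reference-counted LockedT handle, locking plays the role of acquire, and release happens when the guard is dropped:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Created with a lock and a reference count of one; the raw value
    // is not directly reachable - only the locked handle is.
    let locked = Arc::new(Mutex::new(0));

    // copy : LockedT → LockedT ⊗ LockedT - increment the reference count.
    let handle = Arc::clone(&locked);

    let t = thread::spawn(move || {
        // acquire : LockedT → T - take the lock to reach the value.
        let mut v = handle.lock().unwrap();
        *v += 1;
        // release : T → LockedT - the guard's drop releases the lock;
        // dispose happens when `handle` itself is dropped.
    });
    t.join().unwrap();

    assert_eq!(*locked.lock().unwrap(), 1);
}
```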

Category 4: heap allocation in linearly owned heap

This is the most complex case.
The model here is an Erlang-style heap: In Erlang, each process owns a heap, which is a contiguous block of memory divided into an "allocated" area and a "free" area. The heap’s allocation pointer points to the boundary between the two; everything below the boundary is allocated, and everything above is free.
When a new object needs to be allocated, it is first checked whether there is enough space in the "free" area to accommodate an object of the size in question. If there is room, the lowest part of the free area is used for the object, and the allocation pointer is adjusted upwards accordingly.
If there is not enough room, then a garbage collector is called — a moving garbage collector which compacts the live objects into a contiguous area at the beginning of the heap. During this process, a new heap area may be allocated; in any case, after garbage collection, the heap is once again parted in two by the allocation pointer, and there is now room in the free space for the requested object.
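The allocation-pointer scheme just described can be sketched as a minimal bump allocator (names and sizes here are illustrative; the Erlang VM's actual allocator is considerably more involved):

```rust
// A two-area heap: everything below alloc_ptr is allocated, everything
// above it is free.
struct BumpHeap {
    memory: Vec<u8>,
    alloc_ptr: usize, // boundary between "allocated" and "free"
}

impl BumpHeap {
    fn new(size: usize) -> BumpHeap {
        BumpHeap { memory: vec![0; size], alloc_ptr: 0 }
    }

    // Returns the offset of the new object, or None when the free area
    // is too small - the point at which the garbage collector would run.
    fn allocate(&mut self, size: usize) -> Option<usize> {
        if self.alloc_ptr + size > self.memory.len() {
            return None; // not enough room: GC (not modelled here)
        }
        let at = self.alloc_ptr;
        self.alloc_ptr += size; // adjust the boundary upwards
        Some(at)
    }
}

fn main() {
    let mut h = BumpHeap::new(16);
    assert_eq!(h.allocate(8), Some(0));
    assert_eq!(h.allocate(8), Some(8));
    assert_eq!(h.allocate(1), None); // would trigger a collection
}
```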
There are two things to be aware of here:
  1. At allocation time, the garbage collector must know all of the root objects in the heap, in order to preserve all reachable values;
  2. At allocation time, the location of all live pointers into the heap must be known to the garbage collector, so that it can adjust them appropriately when it moves objects.
As an example of how the first invariant can be missed, consider a built-in function which makes two heap allocations. If, after the first allocation, the function does not tuck away the resulting pointer into a location searched by the garbage collector, and the second allocation triggers a collection, then the first allocated object will be deemed garbage and collected.
As an example of what can go wrong if the second invariant is not observed, consider the case where the same built-in function does tuck away a copy of the first pointer into a location searched by the garbage collector, but then — after the second allocation — goes on to use the original pointer from a local variable which is not adjusted by the collector: the first object survives, but the pointer used by the built-in function no longer points to it (but rather to the place where it used to live).
These are the two major pitfalls to be aware of, and if we are to model such a heap, the type system should guard against them.
Enter the use of existential variables as a means to link two variables.
In this scenario, unlike the others, we will not provide access to the raw value of type T; instead, the individual fields of the value can be accessed (if the value is a record; sum-type values can be handled similarly).
First of all, we need to have access to the heap itself. It will have two parameters:
Heap[h,a]
of which one, h, denotes the identity of the heap, while the other, a, denotes the current version of the heap. These are not actual types (they will end up coming from an existential quantifier) but are of a more technical nature.
Each pointer to a heap object (of type T) is represented by a:
HeapPtr[T,a]
where a denotes the version of the heap for which this pointer is valid. Given that we don’t intend the garbage collector to support objects with finalizers, T cannot be just any type — only certain types of objects (those with no-op disposal functions) can be heap-allocated.
Access of a field (of, say, type F) happens through a built-in function:
access_field : ∀h.∀a. Heap[h,a] ⊗ HeapPtr[T,a] → Heap[h,a] ⊗ F
(Here it is not really necessary to have access to the heap itself; proof that the current version is a is enough. The heap interface can be refined to reflect this — and the proof in question need not be present at runtime, and can be optimized away.)
Disposing of a heap pointer is a no-op — the pointer can simply be forgotten about:
dispose : ∀T.∀a. HeapPtr[T,a] → ()
Allocation of a new heap pointer is interesting. Here we may end up with a different version of the heap, because the allocation may trigger garbage collection. Suppose that we are to allocate an object of type F, which requires values of types A and B to construct:
allocate : ∀h.∀a. Heap[h,a] ⊗ A ⊗ B → ∃b. Heap[h,b] ⊗ HeapPtr[F,b]
Note that after allocation, all of the old heap pointers of type HeapPtr[T,a] are useless! They can no longer be used as input to access_field (but dispose can still be used on them).
This is exactly what we desire. However, we need some way of registering root pointers for the garbage collector. To this end, we introduce a type:
RootPtr[T,h]
and functions for registering root pointers and getting up-to-date pointers back:
register_root : ∀h.∀a.∀T. Heap[h,a] ⊗ HeapPtr[T,a] → Heap[h,a] ⊗ RootPtr[T,h]
read_root : ∀h.∀a.∀T. Heap[h,a] ⊗ RootPtr[T,h] → Heap[h,a] ⊗ HeapPtr[T,a]
dispose_root : ∀h.∀T. RootPtr[T,h] → ()
Finally, we have of course construction and disposal of a heap. These functions allocate and deallocate, respectively, the memory areas and bookkeeping data structures for the heap:
create_heap : () → ∃h.∃a. Heap[h,a]
dispose : ∀h.∀a. Heap[h,a] → ()
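The version discipline can be approximated in Rust, where the borrow checker stands in for the version variables a and b: a HeapPtr borrows the current heap version, and allocate takes the heap by mutable reference, which statically forces every outstanding HeapPtr out of scope first. This is my own analogy, not the VM interface itself; Heap, HeapPtr, and RootPtr below are simplified stand-ins (whole-object access, a growable Vec instead of a real collector):

```rust
use std::marker::PhantomData;

struct Heap {
    cells: Vec<i64>,
}

// Valid only for the heap version it borrows from.
#[derive(Clone, Copy)]
struct HeapPtr<'a> {
    index: usize,
    _version: PhantomData<&'a Heap>,
}

// A registered root: carries no version, so it survives allocations.
struct RootPtr {
    index: usize,
}

impl Heap {
    fn new() -> Heap {
        Heap { cells: Vec::new() }
    }

    // Allocation may move objects (here: grow the Vec); taking &mut self
    // invalidates every live HeapPtr, much as ∃b.Heap[h,b] invalidates
    // all HeapPtr[T,a] values for the old version a.
    fn allocate(&mut self, v: i64) -> RootPtr {
        self.cells.push(v);
        RootPtr { index: self.cells.len() - 1 }
    }

    // read_root: turn a root back into a pointer for the current version.
    fn read_root<'a>(&'a self, r: &RootPtr) -> HeapPtr<'a> {
        HeapPtr { index: r.index, _version: PhantomData }
    }

    // access_field, simplified here to whole-object access.
    fn access<'a>(&'a self, p: HeapPtr<'a>) -> i64 {
        self.cells[p.index]
    }
}

fn main() {
    let mut heap = Heap::new();
    let root = heap.allocate(1);
    let p = heap.read_root(&root);
    assert_eq!(heap.access(p), 1);

    let root2 = heap.allocate(2); // ends the borrow: p is now unusable
    // heap.access(p); // would not compile - the stale pointer is caught
    let p2 = heap.read_root(&root2);
    assert_eq!(heap.access(p2), 2);
    assert_eq!(heap.access(heap.read_root(&root)), 1); // roots stay valid
}
```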
All in all, we can model heaps in a safe way with linear types. It does not matter whether the heap is managed through mark-and-sweep or copying or compacting or generational garbage collection; the scheme presented should work for any of them.
The performance characteristics should (I think) be within a small constant factor of usual GC implementations; the largest difference is the explicit registration of heap roots versus other schemes such as stack maps. On the other hand, the scheme presented needs no special tricks to be portable, and it is flexible: for instance, it supports multiple heaps, which can be used for values with very different life cycles or for messages to be sent from one thread to another.