Linear Types and Program Resource Management
In the previous post, "The Case for a Concurrency-oriented VM", I speculated that a VM aimed at concurrency-oriented languages, with a focus on concurrency safety, might be a good thing to have.
I also mentioned in passing that linear types might have something to
offer in that regard.
Today, I will elaborate on that.
In particular, I will show how linear types enable expressing resource
management related constraints, and how some common patterns for
concurrency-safe resource management can be expressed using linear
types.
The context is that of an intermediate program representation to be
used as input for a virtual machine — a representation which allows
certain safety conditions above and beyond mere type safety to be
enforced through locally verifiable rules (but not a representation
intended to be produced by hand).
More concretely, this can be seen as part of an attempt to model a
complete concurrency-oriented language (Erlang, as it happens) using
linear types, aiming towards demonstrating that such a thing as an Erlang VM
can be defined in terms of a linear-type based VM.
Resources and life cycles
What is the difference between information and physical matter?
One key difference is that whereas physical matter can be neither created
nor destroyed, information can be copied and destroyed at essentially zero cost.
The cost of software, for instance, lies in its design and development,
not in manufacturing or distribution (in the original sense).
And there are no software landfills; when you erase a file, the zeroes
and ones don’t "go anywhere".
Even so, software deals with resources of a physical nature. Some of
the values manipulated by software — numbers, for instance — are just
bit patterns which can easily be copied, moved around and forgotten.
Other values denote some kind of resource — be it a chunk of memory, a
file descriptor, or a network connection — which can only be used for
one purpose at a time, and which must be disposed of properly
when the program is done using it.
Resources which have life cycles and which need managing.
While many programming languages have static type checking — verification
that values are only ever used according to their type — few have static
checking of life cycles: verification that values are only used according
to where in their life cycle they are.
That kind of checking is possible, though.
It just requires a type system with linear types.
Linear types and values
What are linear types?
A value of a linear type cannot be copied, and cannot be destroyed — at least not implicitly.
In other words, each value can be used at most once, and must be used at least once — that is, exactly once.
This may sound odd to most programmers.
But take as an example a simple assignment statement of some
imperative language:
a = b;
From the traditional point of view, one thing happens here: The bit
pattern of b is copied into the location of the variable a.
From the linear point of view, however, three things actually happen:
The old value of a is destroyed; the value of b is copied; and the
copy is put in a.
At least if a is a pre-existing variable, and b is a variable
which is read later in the program.
At the level we’re speaking of here, the destruction and copying steps must
be made explicit — it is not a given that these steps are permitted;
the value in question may be a resource which does not support copying
or destruction — or it may be the case that special steps must be taken
when copying or destruction happens.
(This may sound familiar to C++ programmers: C++ classes may have
copy constructors and destructors which are used for such special steps.)
In other words: For linear values, there is a one-to-one relation
between variables and values — there is such a thing as value ownership.
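This ownership discipline is close to what Rust enforces at compile time: plain assignment moves a value, copying requires an explicit clone(), and destruction happens via drop when the owner goes out of scope. A rough sketch of the a = b example in those terms (the Tracked type below is invented for illustration):

```rust
// A non-Copy type: assigning it moves ownership rather than duplicating bits.
#[derive(Debug, Clone, PartialEq)]
struct Tracked(String);

// The "three things" of `a = b`, spelled out: an explicit copy of `b`,
// a move of `b` into `a`, and (implicitly, at end of scope) destruction.
fn assign(b: Tracked) -> (Tracked, Tracked) {
    let copy_of_b = b.clone(); // the "copy" step, made explicit
    let a = b;                 // ownership moves into `a`; `b` is unusable now
    // Using `b` here would be rejected by the compiler.
    (a, copy_of_b)
}

fn main() {
    let (a, copy_of_b) = assign(Tracked("payload".to_string()));
    assert_eq!(a, copy_of_b);
    // The "destroy" step runs here, as `a` and `copy_of_b` go out of scope.
}
```

Rust's types are affine rather than strictly linear (values may be dropped implicitly), but the one-to-one relation between variables and live values is the same.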
Resource categories
Now, back to the practical problem with resource management in programs.
Assume that we are to program a virtual machine which has only linear types.
Can we model the usual kinds of values as linear types, if we are free
to add the necessary built-in functions to support them?
First, let’s categorize value types according to how they are managed.
(Developers versed in C++, where resources are also managed more or
less explicitly, may notice some familiar patterns.)
-
Can the value be copied and destroyed trivially?
If yes: value belongs to category 1; this category includes all values which are merely bit patterns of fixed size: integral and floating-point numbers, booleans, characters, enumeration values, bit masks.
-
If no (it requires cleanup): are the point(s) of cleanup in the
program statically known?
If yes: (category 2) the value is used linearly, and can be (explicitly) disposed of at the appropriate place(s).
-
If no (the points of cleanup are not statically known): is the value immutable?
If yes: (category 3) the value can be either copied in its entirety, and the copies disposed of individually (3a); or the value can be managed through reference counting (3b), in which case a "copy" action consists of incrementing its reference count, and a "destroy" action consists of decrementing its reference count and optionally (if the count reaches zero) disposing of the value.
-
If no (the value is shared and mutable): is the value used from a
single thread only (or, more generally, is it only used in scope of
a specific other value which is used linearly)?
If yes: (category 4) the value can be put in a heap which is linearly owned (i.e., only accessible from one thread at a time).
-
If no (the value is mutable and shared between threads):
(category 5) the value can be protected by a lock; the lock is
reference counted and its disposal causes the value to be disposed.
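As a rough illustration, assuming Rust as the host language, most of these categories map onto familiar mechanisms: Copy types (category 1), handles consumed by an explicit disposal method (category 2), Rc reference counting (category 3b), and a reference-counted lock, Arc plus Mutex (category 5). The FileHandle type below is invented for illustration:

```rust
use std::rc::Rc;
use std::sync::{Arc, Mutex};

// Category 1: plain bit patterns. `Copy` types duplicate freely.
fn category1() -> (u32, u32) {
    let x: u32 = 7;
    let y = x; // bitwise copy; `x` remains usable
    (x, y)
}

// Category 2: a linear-style handle, disposed at a statically known point.
struct FileHandle { open: bool }
impl FileHandle {
    // Consumes the handle: after `close`, the value no longer exists.
    fn close(mut self) -> bool { self.open = false; self.open }
}

// Category 3b: reference counting; clone bumps the count, drop decrements it.
fn category3b() -> usize {
    let v = Rc::new(vec![1, 2, 3]);
    let w = Rc::clone(&v);         // "copy" = increment the reference count
    let count = Rc::strong_count(&v);
    drop(w);                       // "dispose" = decrement; frees at zero
    count
}

// Category 5: shared mutable data behind a reference-counted lock.
fn category5() -> i32 {
    let shared = Arc::new(Mutex::new(0));
    let handle = Arc::clone(&shared);
    *handle.lock().unwrap() += 41; // acquire, mutate, release on guard drop
    let n = *shared.lock().unwrap();
    n
}

fn main() {
    assert_eq!(category1(), (7, 7));
    assert!(!FileHandle { open: true }.close());
    assert_eq!(category3b(), 2);
    assert_eq!(category5(), 41);
}
```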
It is perhaps worth noting that such a program as the Erlang VM uses
all of these management techniques, for different kinds of data:
-
Atoms and small integers are contained in a single machine word and
are simply copied (category 1).
-
Process contexts are linear (category 2), and are disposed of at
specific program points.
-
Messages between processes are copied (category 3a).
-
Large binary values are managed through reference counting (category 3b).
-
Other Erlang terms (tuples, lists, bignums, small binaries, etc.) live
in a heap which belongs to a process which is a linear value (category 4).
-
Finally, certain data structures: the process registry, non-private tables,
etc., which are mutable and shared among processes, are managed
through the use of locks (category 5).
(It should be noted that there is another category, program literals,
which resembles category 3 but is optimized to be handled like category
1 most of the time.
In the Erlang VM, these require very special
handling when code is unloaded. This is an advanced management form
which I won’t try to cover just yet.
Also, variables accessed with atomic hardware operations comprise yet
another category, one which is handled much like category 5.)
Intermezzo: Type notation
In what follows, I will have to write a number of function signatures.
These signatures involve four type constructors: ⊗, ⊕, ∀ and ∃.
-
A ⊗ B is a linear product type — which is like a traditional
product type (aka record, struct, or tuple type, depending on the language)
except that each of its fields must be used (consumed) exactly once.
-
A ⊕ B is a linear sum type — which is like a normal sum type (aka tagged
union, algebraic datatype, etc.) except that the value it contains
(whichever of the variants it is) must be consumed exactly once.
-
∀T.A(T) is universal quantification — aka parametric polymorphism
or generics — which is used for e.g. generic container types.
-
∃T.A(T) is existential quantification — such as can be used for
subtype polymorphism or typeclasses, but in what follows will be
used in a more general manner.
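For readers more comfortable with code than notation, a rough Rust analogue: ⊗ corresponds to a product (a struct or tuple) whose fields are each consumed once, ⊕ to a sum (an enum) whose present variant is consumed, and generics play the role of ∀. (Rust has no direct surface syntax for ∃; this sketch covers the first three only.)

```rust
// A ⊗ B: a product; destructuring consumes both fields.
struct Pair<A, B>(A, B);

// A ⊕ B: a sum; whichever variant is present must be consumed.
enum Either<A, B> {
    Left(A),
    Right(B),
}

// Consuming a pair uses each field exactly once (∀ via the generics).
fn swap<A, B>(p: Pair<A, B>) -> Pair<B, A> {
    let Pair(a, b) = p; // both fields consumed here
    Pair(b, a)
}

// Consuming a sum handles whichever variant it holds.
fn describe(e: Either<String, u32>) -> String {
    match e {
        Either::Left(s) => s,
        Either::Right(n) => n.to_string(),
    }
}

fn main() {
    let Pair(x, y) = swap(Pair(1, "one"));
    assert_eq!((x, y), ("one", 1));
    assert_eq!(describe(Either::Right(42)), "42");
}
```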
The points of interest here are, firstly, the linear nature of the values.
Secondly, existential types are usually used to link together two
values — an object representation and a vtable for that object’s
class — and to enforce encapsulation at the type level.
But as we shall see, existential quantification can be used in a
different manner; notably, one that (unlike the vtable usage)
does not involve anything similar
to run-time method lookups. We’ll get back to this in the treatment of heaps.
Resource management in a linear virtual machine
Now we should be ready to look at how each of the resource categories
can be modelled in a virtual machine with linear types.
In the following, T stands for the type of the resource.
Most of the cases are quite unsurprising; the interesting bit is the heap case.
Category 1: bit patterns
For values of this kind, we provide built-in functions ("axioms") for
both copying and disposal; copy simply copies the bit pattern, and
dispose does nothing:
copy : T → T ⊗ T
dispose : T → ()
Category 2: linear values
This kind of value requires no special functions.
Copying is not allowed, and disposal happens explicitly.
Category 3a: resource copying
Values of this kind can be copied; each copy is disposed of individually:
copy : T → T ⊗ T
dispose : T → ()
There is of course some dynamic storage allocation in play here, too.
In the function signatures shown above, the storage allocation system
is an implicit argument to both operations (malloc-style); one
might imagine alternatives where it is present as an explicit
argument.
Category 3b: reference counting
Reference counted values are created with an initial reference count of one.
Copying and disposal functions are supplied; copy increments the
reference count, and dispose decrements it and frees the resource if
the count reaches zero. In the simple form of this, freeing the resource does
not require any other parameters:
copy : T → T ⊗ T
dispose : T → ()
Again, the underlying storage allocation system is implied.
Category 4: (deferred)
Single-thread heaps are a more graceful strategy than lock
protection — but the cruder lock-protection mechanism is easier to describe,
so we’ll handle that first and return to heaps afterwards.
Category 5: lock-protection
This kind of value is created with a lock and a reference count (initially one).
The raw value of type T is not initially available; instead, you get a LockedT.
To get access to the value proper, you first need to acquire the lock.
copy and dispose work as for a reference-counted value; acquire and release operate the lock.
copy : LockedT → LockedT ⊗ LockedT
copy : T → T ⊗ LockedT
dispose : LockedT → ()
acquire : LockedT → T
acquire : LockedT ⊗ Timeout → T ⊕ LockedT
release : T → LockedT
(If the type T can also be managed in other ways, such that not all
instances of T can be subjected to, e.g., the release operation,
then the function signatures need to be a bit different.)
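The two acquire variants can be mirrored with Rust's standard Mutex, where a failed try_lock plays the role of the LockedT branch of the T ⊕ LockedT result. This is an analogy, not the VM interface itself:

```rust
use std::sync::{Arc, Mutex, TryLockError};

// acquire : LockedT → T, blocking until the lock is free.
// The guard releases the lock again when it is dropped (release).
fn acquire_blocking(locked: &Arc<Mutex<i32>>) -> i32 {
    *locked.lock().unwrap()
}

// acquire : LockedT ⊗ Timeout → T ⊕ LockedT, approximated by try_lock:
// Some(_) plays the "T" branch, None the "still LockedT" branch.
fn acquire_try(locked: &Arc<Mutex<i32>>) -> Option<i32> {
    match locked.try_lock() {
        Ok(guard) => Some(*guard),
        Err(TryLockError::WouldBlock) => None,
        Err(TryLockError::Poisoned(_)) => None,
    }
}

fn main() {
    let locked = Arc::new(Mutex::new(5));
    assert_eq!(acquire_blocking(&locked), 5);
    assert_eq!(acquire_try(&locked), Some(5));
    let guard = locked.lock().unwrap();     // hold the lock...
    assert_eq!(acquire_try(&locked), None); // ...so try_lock cannot acquire it
    drop(guard);
}
```

Arc::clone and drop correspond to the reference-counted copy and dispose on LockedT.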
Category 4: heap allocation in linearly owned heap
This is the most complex case.
The model here is an Erlang-style heap: In Erlang, each process owns a heap,
which is a contiguous block of memory divided into an "allocated" area
and a "free" area. The heap’s allocation pointer points to the
boundary between the two; everything below the boundary is allocated,
and everything above is free.
When a new object needs to be allocated, it is first checked whether there is
enough space in the "free" area to accommodate an object of the size
in question.
If there is room, the lowest part of the free area is used for the object,
and the allocation pointer is adjusted upwards accordingly.
If there is not enough room, then a garbage collector is called — a
moving garbage collector which compacts the live objects into a contiguous
area at the beginning of the heap. During this process, a new heap area may be
allocated; in any case, after garbage collection, the heap is once again
parted in two by the allocation pointer, and there is now room in the
free space for the requested object.
There are two things to be aware of here:
-
At allocation time, the garbage collector must know all of the root
objects in the heap, in order to preserve all reachable values;
-
At allocation time, the location of all live pointers into the heap
must be known to the garbage collector, so that it can adjust them
appropriately when it moves objects.
As an example of how the first invariant can be missed, consider a built-in
function which makes two heap allocations.
If, after the first allocation, the function does not tuck away the resulting
pointer into a location searched by the garbage collector, and the
second allocation then triggers a collection, the first allocated object
will be deemed garbage and collected.
As an example of what can go wrong if the second invariant is not
observed, consider the case where the same built-in function does
tuck away a copy of the first pointer into a location searched by the
garbage collector, but then — after the second allocation — goes on
to use the original pointer from a local variable which is not
adjusted by the collector: the first object lives, but the pointer
used by the built-in function does not point to it (but rather to the
place where it used to live).
These are the two major pitfalls to be aware of, and if we are to
model such a heap, the type system should guard against them.
Enter the use of existential variables as a means of linking two values.
In this scenario, unlike the others, we will not provide access to the
raw value of type T; instead, the individual fields of the value can be
accessed (if the value is a record; sum-type values can be handled similarly).
First of all, we need to have access to the heap itself.
It will have two parameters:
Heap[h,a]
of which one, h, denotes the identity of the heap, while the other,
a, denotes the current version of the heap.
These are not actual types (they will end up coming from an
existential quantifier) but are of a more technical nature.
Each pointer to a heap object (of type T) is represented by a:
HeapPtr[T,a]
where a denotes the version of the heap for which this pointer is
valid. Given that we don’t intend the garbage collector to support
objects with finalizers, T can not be of any type — only certain
types of objects (those with no-op disposal functions) can be
heap-allocated.
Access of a field (of, say, type F) happens through a built-in function:
access_field : ∀h.∀a. Heap[h,a] ⊗ HeapPtr[T,a] → Heap[h,a] ⊗ F
(Here it is not really necessary to have access to the heap itself;
proof that the current version is a is enough. The heap interface
can be refined to reflect this — and the proof in question need not
be present at runtime, and can be optimized away.)
Disposing of a heap pointer is a no-op — the pointer can simply be
forgotten about:
dispose : ∀T.∀a. HeapPtr[T,a] → ()
Allocation of a new heap pointer is interesting. Here we may end up
with a different version of the heap, because the allocation may
trigger garbage collection. Suppose that we are to allocate an object
of type F, which requires values of types A and B to construct:
allocate : ∀h.∀a. Heap[h,a] ⊗ A ⊗ B → ∃b. Heap[h,b] ⊗ HeapPtr[F,b]
Note that after allocation, all of the old heap pointers of type
HeapPtr[T,a] are useless!
They can no longer be used as input to access_field (but dispose
can still be used on them).
This is exactly what we desire.
However, we need some way of registering root pointers for the garbage
collector.
To this end, we introduce a type:
RootPtr[T,h]
and functions for registering root pointers and getting up-to-date pointers back:
register_root : ∀h.∀a.∀T. Heap[h,a] ⊗ HeapPtr[T,a] → Heap[h,a] ⊗ RootPtr[T,h]
read_root : ∀h.∀a.∀T. Heap[h,a] ⊗ RootPtr[T,h] → Heap[h,a] ⊗ HeapPtr[T,a]
dispose_root : ∀h.∀T. RootPtr[T,h] → ()
Finally, we have of course construction and disposal of a heap.
These functions allocate, respectively deallocate, the memory areas
and bookkeeping data structures for the heap:
create_heap : () → ∃h.∃a. Heap[h,a]
dispose : ∀h.∀a. Heap[h,a] → ()
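To make the versioning discipline concrete, here is a dynamically checked sketch in Rust of what the scheme above enforces statically: pointers record the heap version they were minted against, allocation bumps the version (a collection could have moved everything), and only registered roots survive. All names and types are illustrative:

```rust
// A pointer carries the heap version it was minted against.
#[derive(Clone, Copy, PartialEq, Debug)]
struct HeapPtr { index: usize, version: u64 }

// A root survives collections; the "collector" keeps its index current.
#[derive(Clone, Copy)]
struct RootPtr { root: usize }

struct Heap {
    version: u64,
    cells: Vec<i32>,
    roots: Vec<usize>,
}

impl Heap {
    fn new() -> Heap { Heap { version: 0, cells: Vec::new(), roots: Vec::new() } }

    // allocate: may trigger collection, so the heap version changes and
    // previously issued HeapPtrs become invalid.
    fn allocate(&mut self, value: i32) -> HeapPtr {
        self.cells.push(value);
        self.version += 1;
        HeapPtr { index: self.cells.len() - 1, version: self.version }
    }

    // access_field: only valid for pointers of the current version.
    fn access(&self, p: HeapPtr) -> Option<i32> {
        if p.version == self.version { Some(self.cells[p.index]) } else { None }
    }

    // register_root / read_root: roots always yield current-version pointers.
    fn register_root(&mut self, p: HeapPtr) -> RootPtr {
        self.roots.push(p.index);
        RootPtr { root: self.roots.len() - 1 }
    }

    fn read_root(&self, r: RootPtr) -> HeapPtr {
        HeapPtr { index: self.roots[r.root], version: self.version }
    }
}

fn main() {
    let mut heap = Heap::new();
    let p = heap.allocate(10);
    let root = heap.register_root(p);
    assert_eq!(heap.access(p), Some(10));
    heap.allocate(20);                // "collection" bumps the version
    assert_eq!(heap.access(p), None); // stale pointer rejected
    let fresh = heap.read_root(root); // roots survive collection
    assert_eq!(heap.access(fresh), Some(10));
}
```

In the linear-typed VM these checks cost nothing at run time: a stale HeapPtr[T,a] simply fails to type-check against the new Heap[h,b].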
All in all, we can model heaps in a safe way with linear types.
It does not matter whether the heap is managed through mark-and-sweep,
copying, compacting, or generational garbage collection; the scheme
presented should work for any of these.
The performance characteristics should (I think) be within a small
constant factor of usual GC implementations; the largest difference is
in the explicit registration of heap roots versus other schemes such as
stack maps. On the other hand, the scheme presented needs no special
tricks to be portable, and it is flexible: for instance, it supports
multiple heaps, which can be used for values with very different life cycles
or for messages to be sent from one thread to another.