tag:blogger.com,1999:blog-28791325699173199392024-03-21T09:43:20.656+01:00Polymorphic TypistOn programming languages, methods and technologies. (Primarily.)Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.comBlogger13125tag:blogger.com,1999:blog-2879132569917319939.post-68478133163114468972014-03-21T13:47:00.000+01:002014-03-21T13:47:09.450+01:00Linear Types and Program Resource Management<div id="header">
<h1>
Linear Types and Program Resource Management</h1>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div class="paragraph" style="margin-top: 0.35em;">
In the previous post
<a href="http://polymorphictypist.blogspot.com/2013/03/the-case-for-concurrency-oriented-vm.html">"The
Case for a Concurrency-oriented VM"</a>, I speculated that a VM aimed at
concurrency-oriented languages, and with focus on concurrency safety,
might be a good thing to have.
I also mentioned in passing that linear types might have something to
offer in that regard.</div>
<div class="paragraph" style="margin-top: 0.35em;">
Today, I will elaborate on that.
In particular, I will show how linear types enable expressing resource
management related constraints, and how some common patterns for
concurrency-safe resource management can be expressed using linear
types.</div>
<div class="paragraph" style="margin-top: 0.35em;">
The context is that of an intermediate program representation to be
used as input for a virtual machine — a representation which allows
certain safety conditions above and beyond mere type safety to be
enforced through locally verifiable rules (but not a representation
intended to be produced by hand).</div>
<div class="paragraph" style="margin-top: 0.35em;">
More concretely, this can be seen as part of an attempt to model a
complete concurrency-oriented language (Erlang, as it happens) using
linear types, aiming towards demonstrating that such a thing as an Erlang VM
can be defined in terms of a linear-type based VM.</div>
</div>
</div>
<div class="sect1">
<h2 id="_resources_and_life_cycles">
Resources and life cycles</h2>
<div class="sectionbody">
<div class="paragraph" style="margin-top: 0.35em;">
What is the difference between information and physical matter?
One key difference is that whereas physical matter can not be created
nor destroyed, information can be copied and destroyed at zero cost.
The cost of software, for instance, lies in its design and development
— not in manufacturing or distribution (in the original sense).
And there are no software landfills; when you erase a file, the zeroes
and ones don’t "go anywhere".</div>
<div class="paragraph" style="margin-top: 0.35em;">
Even so, software deals with resources of a physical nature. Some of
the values manipulated by software — numbers, for instance — are just
bit patterns which can easily be copied, moved around and forgotten.
Other values denote some kind of resource — be it a chunk of memory, a
file descriptor, or a network connection — which can only be used for
one purpose at a time, and which must be disposed of properly
when the program is done using it.
Resources which have life cycles and which need managing.</div>
<div class="paragraph" style="margin-top: 0.35em;">
Many programming languages have static type checking: verification that
values are only ever used according to their type. Few, however, have
static checking of life cycles: verification that values
are only used according to where in their life cycle they are.</div>
<div class="paragraph" style="margin-top: 0.35em;">
That kind of checking is possible, though.
It just requires a type system with linear types.</div>
</div>
</div>
<div class="sect1">
<h2 id="_linear_types_and_values">
Linear types and values</h2>
<div class="sectionbody">
<div class="paragraph" style="margin-top: 0.35em;">
What are linear types?
A value of a linear type cannot be copied, and cannot be destroyed — at least not implicitly.</div>
<div class="paragraph" style="margin-top: 0.35em;">
In other words, each value can be used only once, and must be used exactly once.</div>
<div class="paragraph" style="margin-top: 0.35em;">
This may sound odd to most programmers.
But take as an example a simple assignment statement of some
imperative language:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>a = b;</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
From the traditional point of view, one thing happens here: The bit
pattern of <i>b</i> is copied into the location of the variable <i>a</i>.</div>
<div class="paragraph" style="margin-top: 0.35em;">
From the linear point of view, however, three things actually happen:
The old value of <i>a</i> is destroyed; the value of <i>b</i> is copied; and the
copy is put in <i>a</i>.
At least if <i>a</i> is a pre-existing variable, and <i>b</i> is a variable
which is read later in the program.</div>
<div class="paragraph" style="margin-top: 0.35em;">
At the level we’re speaking of here, the destruction and the copying step must
be made explicit — it is not a given that these steps are permitted;
the value in question may be a resource which does not support copying
or destruction — or it may be the case that special steps must be taken
when copying or destruction happens.
(This may sound familiar to C++ programmers: C++ classes may have
<i>copy constructors</i> and <i>destructors</i> which are used for such special steps.)</div>
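<div class="paragraph" style="margin-top: 0.35em;">
A quick sketch in Rust — whose ownership model is affine rather than
strictly linear, and which I use here purely as an illustration — shows how
move, explicit copy and explicit destruction look in practice
(<tt>String</tt> stands in for an arbitrary resource):</div>

```rust
// An explicit copy step, mirroring `copy : T → T ⊗ T`:
fn copy(v: String) -> (String, String) {
    let w = v.clone(); // the copy must be requested explicitly
    (v, w)
}

fn main() {
    let b = String::from("resource");

    // "a = b" in the linear view: the value is *moved* into `a`;
    // `b` is consumed and can no longer be used afterwards.
    let a = b;

    // Copying and destruction are separate, explicit steps:
    let (a, a2) = copy(a);
    drop(a2); // explicit destruction, analogous to a destructor call

    assert_eq!(a, "resource");
}
```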
<div class="paragraph" style="margin-top: 0.35em;">
In other words: For linear values, there is a one-to-one relation
between variables and values — there is such a thing as value ownership.</div>
</div>
</div>
<div class="sect1">
<h2 id="_resource_categories">
Resource categories</h2>
<div class="sectionbody">
<div class="paragraph" style="margin-top: 0.35em;">
Now, back to the practical problem with resource management in programs.
Assume that we are to program a virtual machine which has <i>only</i> linear types.
Can we model the usual kinds of values as linear types, if we are free
to add the necessary built-in functions to support them?</div>
<div class="paragraph" style="margin-top: 0.35em;">
First, let’s categorize value types according to how they are managed.
(Developers versed in C++, where resources are also managed more or
less explicitly, may notice some familiar patterns.)</div>
<div class="ulist">
<ul>
<li>
Can the value be copied and destroyed trivially?
<br />
If yes: value belongs to <b>category 1</b>; this category includes all
values which are merely bit patterns of fixed size: integral and
floating-point numbers, booleans, characters, enumeration values,
bit masks.
<br />
</li>
<li>
If no (it requires cleanup): are the point(s) of cleanup in the
program statically known?
<br />
If yes: (<b>category 2</b>) the value is used linearly, and can be
(explicitly) disposed of at the appropriate place(s).
<br />
</li>
<li>
If no (the points of cleanup are not statically known): is the value immutable?
<br />
If yes: (<b>category 3</b>) the value can be either copied in its entirety, and the
copies disposed of individually (<b>3a</b>); or the value can be managed through
reference counting (<b>3b</b>), in which case a "copy" action consists of
incrementing its reference count, and a "destroy" action consists of
decrementing its reference count and optionally (if the count reaches zero)
disposing of the value.
<br />
</li>
<li>
If no (the value is shared and mutable): is the value used from a
single thread only (or, more generally, is it only used in scope of
a specific other value which is used linearly)?
<br />
If yes: (<b>category 4</b>) the value can be put in a heap which is
linearly owned (i.e., only accessible from one thread at a time).
<br />
</li>
<li>
If no (the value is mutable and shared between threads):
(<b>category 5</b>) the value can be protected by a lock; the lock is
reference counted and its disposal causes the value to be disposed.
<br />
</li>
</ul>
</div>
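<div class="paragraph" style="margin-top: 0.35em;">
For a rough idea of what these categories look like in practice, here is a
loose mapping onto Rust’s standard library types (my analogy, not an exact
correspondence — Rust is affine and drops values implicitly):</div>

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};

fn main() {
    // Category 1: plain bit patterns implement Copy; duplication is implicit.
    let n: u64 = 42;
    let m = n; // copied; both remain usable
    assert_eq!(n, m);

    // Category 2: a linearly used value; moved, never copied.
    let s = String::from("linear");
    let t = s; // `s` is consumed here
    assert_eq!(t, "linear");

    // Category 3a: immutable data duplicated wholesale.
    let whole_copy = t.clone();
    assert_eq!(whole_copy, t);

    // Category 3b: reference counting; a "copy" bumps the count.
    let rc = Rc::new(vec![1, 2, 3]);
    let rc2 = Rc::clone(&rc);
    assert_eq!(Rc::strong_count(&rc), 2);
    drop(rc2); // a "destroy" decrements it again
    assert_eq!(Rc::strong_count(&rc), 1);

    // Category 4: mutable data confined to a single owner/thread.
    let heap_like = Rc::new(RefCell::new(vec![0u8]));
    heap_like.borrow_mut().push(1);
    assert_eq!(heap_like.borrow().len(), 2);

    // Category 5: mutable data shared across threads, behind a lock.
    let shared = Arc::new(Mutex::new(0i64));
    *shared.lock().unwrap() += 1;
    assert_eq!(*shared.lock().unwrap(), 1);
}
```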
<div class="paragraph" style="margin-top: 0.35em;">
It is perhaps worth noting that such a program as the Erlang VM uses
all of these management techniques, for different kinds of data:</div>
<div class="ulist">
<ul>
<li>
Atoms and small integers are contained in a single machine word and
are simply copied (category 1).
<br />
</li>
<li>
Process contexts are linear (category 2), and are disposed of at
specific program points.
<br />
</li>
<li>
Messages between processes are copied (category 3a).
<br />
</li>
<li>
Large binary values are managed through reference counting (category 3b).
<br />
</li>
<li>
Other Erlang terms (tuples, lists, bignums, small binaries, etc.) live
in a heap which belongs to a process which is a linear value (category 4).
<br />
</li>
<li>
Finally, certain data structures: the process registry, non-private tables,
etc., which are mutable and shared among processes, are managed
through the use of locks (category 5).
<br />
</li>
</ul>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
(It should be noted that there is another category, program literals,
which resemble category 3 but are optimized to be handled like category
1 most of the time.
In the Erlang VM, these require very special
handling when code is unloaded. This is an advanced management form
which I won’t try to cover just yet.
Also, variables accessed with atomic hardware operations comprise yet
another category, one which is handled much like category 5.)</div>
</div>
</div>
<div class="sect1">
<h2 id="_intermezzo_type_notation">
Intermezzo: Type notation</h2>
<div class="sectionbody">
<div class="paragraph" style="margin-top: 0.35em;">
In what follows, I will have to write a number of function signatures.
These signatures involve four type constructors: ⊗, ⊕, ∀ and ∃.</div>
<div class="ulist">
<ul>
<li>
A ⊗ B is a linear product type — which is like a traditional
product type (aka record, struct, or tuple type, depending on language)
except that each of its fields must be used (consumed) exactly once.
<br />
</li>
<li>
A ⊕ B is a linear sum type — which is like a normal sum type (aka tagged
union, algebraic datatype, etc.) except that the value it contains
(whichever of the variants it is) must be consumed exactly once.
<br />
</li>
<li>
∀T.A(T) is universal quantification — aka parametric polymorphism
or generics, which is used for e.g. generic container types.
<br />
</li>
<li>
∃T.A(T) is existential quantification — such as can be used for
subtype polymorphism or typeclasses, but in what follows will be
used in a more general manner.
<br />
</li>
</ul>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
The points of interest here are, firstly, the linear nature of values.
Secondly: existential types are usually used to link together two
values — an object representation and a vtable for that object’s
class — and to enforce encapsulation on the type level.
But as we shall see, existential quantification can be used in a
different manner; notably, one that (unlike the vtable usage)
does not involve anything similar
to run-time method lookups. We’ll get back to this in the treatment of heaps.</div>
</div>
</div>
<div class="sect1">
<h2 id="_resource_management_in_a_linear_virtual_machine">
Resource management in a linear virtual machine</h2>
<div class="sectionbody">
<div class="paragraph" style="margin-top: 0.35em;">
Now we should be ready to look at how each of the resource categories
can be modelled in a virtual machine with linear types.</div>
<div class="paragraph" style="margin-top: 0.35em;">
In the following, <tt>T</tt> stands for the type of the resource.
Most of the cases are quite unsurprising; the interesting bit is the heap case.</div>
<div class="sect2">
<h3 id="_category_1_bit_patterns">
Category 1: bit patterns</h3>
<div class="paragraph" style="margin-top: 0.35em;">
For this kind of values, we provide built-in functions ("axioms") for
both copying and disposal; <tt>copy</tt> simply copies the bit pattern, and
<tt>dispose</tt> does nothing:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>copy : T → T ⊗ T
dispose : T → ()</tt></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="_category_2_linear_values">
Category 2: linear values</h3>
<div class="paragraph" style="margin-top: 0.35em;">
This kind of value requires no special functions.
Copying is not allowed, and disposal happens explicitly.</div>
</div>
<div class="sect2">
<h3 id="_category_3a_resource_copying">
Category 3a: resource copying</h3>
<div class="paragraph" style="margin-top: 0.35em;">
This kind of values can be copied; each copy is disposed of individually:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>copy : T → T ⊗ T
dispose : T → ()</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
There is of course some dynamic storage allocation in play here, too.
In the function signatures shown above, the storage allocation system
is an implicit argument to both operations (in <tt>malloc</tt>-style); one
might imagine alternatives where it is present as an explicit
argument.</div>
</div>
<div class="sect2">
<h3 id="_category_3b_reference_counting">
Category 3b: reference counting</h3>
<div class="paragraph" style="margin-top: 0.35em;">
Reference counted values are created with an initial reference count of one.
Copying and disposal functions are supplied; <tt>copy</tt> increments the
reference count, and <tt>dispose</tt> decrements it and frees the resource if
the count reaches zero. In the simple form of this, freeing the resource does
not require any other parameters:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>copy : T → T ⊗ T
dispose : T → ()</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
Again, the underlying storage allocation system is implied.</div>
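<div class="paragraph" style="margin-top: 0.35em;">
Rust’s <tt>Rc</tt> provides essentially this pair of operations — a sketch
of mine, using <tt>Rc::strong_count</tt> to observe the count:</div>

```rust
use std::rc::Rc;

// copy : T → T ⊗ T — a "copy" just increments the reference count.
fn copy(v: Rc<String>) -> (Rc<String>, Rc<String>) {
    let w = Rc::clone(&v);
    (v, w)
}

// dispose : T → () — decrements the count; the value is freed at zero.
fn dispose(v: Rc<String>) {
    drop(v);
}

fn main() {
    let v = Rc::new(String::from("large binary")); // initial count: one
    let (v, w) = copy(v);
    assert_eq!(Rc::strong_count(&v), 2);
    dispose(w);
    assert_eq!(Rc::strong_count(&v), 1);
    dispose(v); // count reaches zero; the payload is freed here
}
```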
</div>
<div class="sect2">
<h3 id="_category_4_defered">
Category 4: (deferred)</h3>
<div class="paragraph" style="margin-top: 0.35em;">
Single-thread heaps are a more graceful strategy than lock protection — but the cruder lock-protection mechanism is easier to describe, so we’ll
handle that first and return to heaps afterwards.</div>
</div>
<div class="sect2">
<h3 id="_category_5_lock_protection">
Category 5: lock-protection</h3>
<div class="paragraph" style="margin-top: 0.35em;">
This kind of value is created with a lock and a reference count (initially one).
The raw value of type <i>T</i> is not initially available; instead, you get a <i>LockedT</i>.
To get access to the value proper, you first need to acquire the lock.
<br />
<tt>copy</tt> and <tt>dispose</tt> work as for a reference counted value.
<tt>acquire</tt> and <tt>release</tt> work the lock.</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>copy : LockedT → LockedT ⊗ LockedT
copy : T → T ⊗ LockedT
dispose : LockedT → ()
acquire : LockedT → T
acquire : LockedT ⊗ Timeout → T ⊕ LockedT
release : T → LockedT</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
(If the type <i>T</i> can also be managed in other ways, such that not all
instances of <i>T</i> can be subjected to, e.g., the <tt>release</tt> operation,
then the function signatures need to be a bit different.)</div>
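<div class="paragraph" style="margin-top: 0.35em;">
Rust’s <tt>Arc&lt;Mutex&lt;T&gt;&gt;</tt> is a close match for this
category — a sketch of mine, where <tt>release</tt> corresponds to
dropping the lock guard, and <tt>try_lock</tt> stands in (imperfectly, as
it does not block at all) for the timeout variant:</div>

```rust
use std::sync::{Arc, Mutex, MutexGuard};

type LockedT = Arc<Mutex<i64>>;

// copy : LockedT → LockedT ⊗ LockedT — bumps the reference count.
fn copy(l: LockedT) -> (LockedT, LockedT) {
    let l2 = Arc::clone(&l);
    (l, l2)
}

// acquire : LockedT → T — here "T" is a guard granting exclusive access.
fn acquire<'a>(l: &'a LockedT) -> MutexGuard<'a, i64> {
    l.lock().unwrap()
}

fn main() {
    let locked: LockedT = Arc::new(Mutex::new(0));
    let (locked, other) = copy(locked);

    {
        let mut value = acquire(&locked);
        *value += 10; // exclusive access while the guard is alive
    } // release : T → LockedT — dropping the guard unlocks

    // acquire : LockedT ⊗ Timeout → T ⊕ LockedT corresponds loosely to
    // try_lock, which yields either access or the still-locked handle:
    match other.try_lock() {
        Ok(v) => assert_eq!(*v, 10),
        Err(_) => unreachable!("no other thread holds the lock"),
    }
    drop(other);  // dispose: decrement the count
    drop(locked); // last drop frees both lock and value
}
```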
</div>
<div class="sect2">
<h3 id="_category_4_heap_allocation_in_linearly_owned_heap">
Category 4: heap allocation in linearly owned heap</h3>
<div class="paragraph" style="margin-top: 0.35em;">
This is the most complex case.</div>
<div class="paragraph" style="margin-top: 0.35em;">
The model here is an Erlang-style heap: In Erlang, each process owns a heap,
which is a contiguous block of memory divided into an "allocated" area
and a "free" area. The heap’s <i>allocation pointer</i> points to the
boundary between the two; everything below the boundary is allocated,
and everything above is free.</div>
<div class="paragraph" style="margin-top: 0.35em;">
When a new object needs to be allocated, it is first checked whether there is
enough space in the "free" area to accommodate an object of the size
in question.
If there is room, the lowest part of the free area is used for the object,
and the allocation pointer is adjusted upwards accordingly.</div>
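<div class="paragraph" style="margin-top: 0.35em;">
The allocation-pointer discipline just described can be sketched as a toy
bump allocator (an illustration of mine, not the actual BEAM code):</div>

```rust
// Everything below `alloc_ptr` is allocated; everything above is free.
struct BumpHeap {
    data: Vec<u8>,
    alloc_ptr: usize, // boundary between the allocated and free areas
}

impl BumpHeap {
    fn new(size: usize) -> Self {
        BumpHeap { data: vec![0; size], alloc_ptr: 0 }
    }

    // Returns the offset of the new object, or None if a GC is needed.
    fn allocate(&mut self, size: usize) -> Option<usize> {
        let free = self.data.len() - self.alloc_ptr;
        if size <= free {
            let offset = self.alloc_ptr;
            self.alloc_ptr += size; // adjust the boundary upwards
            Some(offset)
        } else {
            None // caller must run the garbage collector first
        }
    }
}

fn main() {
    let mut h = BumpHeap::new(16);
    assert_eq!(h.allocate(8), Some(0));
    assert_eq!(h.allocate(8), Some(8));
    assert_eq!(h.allocate(1), None); // heap full: collection needed
}
```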
<div class="paragraph" style="margin-top: 0.35em;">
If there is not enough room, then a garbage collector is called — a
moving garbage collector which compacts the live objects into a contiguous
area at the beginning of the heap. During this process, a new heap area may be
allocated; in any case, after garbage collection, the heap is once again
parted in two by the allocation pointer, and there is now room in the
free space for the requested object.</div>
<div class="paragraph" style="margin-top: 0.35em;">
There are two things to be aware of here:</div>
<div class="olist arabic">
<ol class="arabic">
<li>
At allocation time, the garbage collector must know all of the <i>root</i>
objects in the heap, in order to preserve all reachable values;
<br />
</li>
<li>
At allocation time, the location of all live pointers into the heap
must be known to the garbage collector, so that it can adjust them
appropriately when it moves objects.
<br />
</li>
</ol>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
As an example of how the first invariant can be missed, consider a built-in
function which makes <i>two</i> heap allocations.
If, after the first allocation, the function does not tuck away the resulting
pointer into a location searched by the garbage collector, and the
second allocation triggers a collection, then the first allocated object
will be deemed garbage and collected.</div>
<div class="paragraph" style="margin-top: 0.35em;">
As an example of what can go wrong if the second invariant is not
observed, consider the case where the same built-in function <i>does</i>
tuck away a copy of the first pointer into a location searched by the
garbage collector, but then — after the second allocation — goes on
to use the original pointer from a local variable which is not
adjusted by the collector: the first object lives, but the pointer
used by the built-in function does not point to it (but rather to the
place where it used to live).</div>
<div class="paragraph" style="margin-top: 0.35em;">
These are the two major pitfalls to be aware of, and if we are to
model such a heap, the type system should guard against them.</div>
<div class="paragraph" style="margin-top: 0.35em;">
Enter the use of existential variables as a means to link two variables.</div>
<div class="paragraph" style="margin-top: 0.35em;">
In this scenario, unlike the others, we will not provide access to the
raw value of type T; instead, the individual fields of the value can be
accessed (if the value is a record; sum type values can be handled similarly).</div>
<div class="paragraph" style="margin-top: 0.35em;">
First of all, we need to have access to the heap itself.
It will have two parameters:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>Heap[h,a]</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
of which one, <i>h</i>, denotes the identity of the heap, while the other,
<i>a</i>, denotes the current <i>version</i> of the heap.
These are not actual types (they will end up coming from an
existential quantifier) but are of a more technical nature.</div>
<div class="paragraph" style="margin-top: 0.35em;">
Each pointer to a heap object (of type <i>T</i>) is represented by a:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>HeapPtr[T,a]</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
where <i>a</i> denotes the version of the heap <i>for which this pointer is
valid</i>. Given that we don’t intend the garbage collector to support
objects with finalizers, <i>T</i> cannot be an arbitrary type — only certain
types of objects (those with no-op disposal functions) can be
heap-allocated.</div>
<div class="paragraph" style="margin-top: 0.35em;">
Access of a field (of, say, type <i>F</i>) happens through a built-in function:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>access_field : ∀h.∀a. Heap[h,a] ⊗ HeapPtr[T,a] → Heap[h,a] ⊗ F</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
(Here it is not really necessary to have access to the heap itself;
proof that the current version is <i>a</i> is enough. The heap interface
can be refined to reflect this — and the proof in question need not
be present at runtime, and can be optimized away.)</div>
<div class="paragraph" style="margin-top: 0.35em;">
Disposing of a heap pointer is a no-op — the pointer can simply be
forgotten about:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>dispose : ∀T.∀a. HeapPtr[T,a] → ()</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
Allocation of a new heap pointer is interesting. Here we may end up
with a different version of the heap, because the allocation may
trigger garbage collection. Suppose that we are to allocate an object
of type <i>F</i>, which requires values of types <i>A</i> and <i>B</i> to construct:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>allocate : ∀h.∀a. Heap[h,a] ⊗ _A_ ⊗ _B_ → ∃b.Heap[h,b]</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
Note that after allocation, all of the old heap pointers of type
<tt>HeapPtr[T,a]</tt> are useless!
They can no longer be used as input to <tt>access_field</tt> (but <tt>dispose</tt>
can still be used on them).</div>
<div class="paragraph" style="margin-top: 0.35em;">
This is exactly what we desire.
However, we need some way of registering root pointers for the garbage
collector.
To this end, we introduce a type:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>RootPtr[T,h]</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
and functions for registering root pointers and getting up-to-date pointers back:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>register_root : ∀h.∀a.∀T. Heap[h,a] ⊗ HeapPtr[T,a] → Heap[h,a] ⊗ RootPtr[T,h]
read_root : ∀h.∀a.∀T. Heap[h,a] ⊗ RootPtr[T,h] → Heap[h,a] ⊗ HeapPtr[T,a]
dispose_root : ∀h.∀T. RootPtr[T,h] → ()</tt></pre>
</div>
</div>
<div class="paragraph" style="margin-top: 0.35em;">
Finally, we have of course construction and disposal of a heap.
These functions allocate and deallocate, respectively, the memory areas
and bookkeeping data structures for the heap:</div>
<div class="literalblock">
<div class="content">
<pre style="margin: 0.5em 2em;"><tt>create_heap : () → ∃h.∃a. Heap[h,a]
dispose : ∀h.∀a. Heap[h,a] → ()</tt></pre>
</div>
</div>
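<div class="paragraph" style="margin-top: 0.35em;">
To make the whole discipline concrete, here is a sketch of mine of the
<tt>Heap</tt>/<tt>HeapPtr</tt>/<tt>RootPtr</tt> interplay. It is
dynamically checked: the version comparison that the proposed type system
would discharge statically is an <tt>assert</tt> here, and for simplicity
objects are not actually moved by the stand-in "collection":</div>

```rust
// Heap[h,a]: `version` plays the role of the `a` parameter.
struct Heap {
    version: u64,
    objects: Vec<i64>, // toy object store: each object is one i64
}

// HeapPtr[T,a]: valid only for the heap version it was created under.
#[derive(Clone, Copy)]
struct HeapPtr {
    version: u64,
    index: usize,
}

// RootPtr[T,h]: survives collections; re-read to obtain a fresh HeapPtr.
struct RootPtr {
    index: usize,
}

impl Heap {
    fn new() -> Self {
        Heap { version: 0, objects: Vec::new() }
    }

    // allocate: may "collect", which bumps the version and thereby
    // invalidates every outstanding HeapPtr.
    fn allocate(&mut self, value: i64, collect: bool) -> HeapPtr {
        if collect {
            self.version += 1; // stand-in for a moving collection
        }
        self.objects.push(value);
        HeapPtr { version: self.version, index: self.objects.len() - 1 }
    }

    // access_field: only valid for pointers of the current version.
    fn access(&self, p: HeapPtr) -> i64 {
        assert_eq!(p.version, self.version, "stale pointer from before a GC");
        self.objects[p.index]
    }

    fn register_root(&self, p: HeapPtr) -> RootPtr {
        assert_eq!(p.version, self.version);
        RootPtr { index: p.index }
    }

    fn read_root(&self, r: &RootPtr) -> HeapPtr {
        HeapPtr { version: self.version, index: r.index }
    }
}

fn main() {
    let mut h = Heap::new();
    let p = h.allocate(42, false);
    let root = h.register_root(p);
    let _q = h.allocate(7, true); // triggers "collection": `p` is now stale
    let fresh = h.read_root(&root); // roots survive and yield fresh pointers
    assert_eq!(h.access(fresh), 42);
}
```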
<div class="paragraph" style="margin-top: 0.35em;">
All in all, we can model heaps in a safe way with linear types.
It does not matter whether the heap is managed through mark-and-sweep
or copying or compacting or generational garbage collection; the scheme
presented should work for any of these.</div>
<div class="paragraph" style="margin-top: 0.35em;">
The performance characteristics should (I think) be within a small
constant factor of usual GC implementations; the largest difference is
in explicit registration of heap roots versus other schemes such as
stack maps. On the other hand, the scheme presented needs no special
tricks to be portable, and it is flexible: for instance, it supports
multiple heaps, which can be used for values with very different life cycles
or for messages to be sent from one thread to another.</div>
</div>
</div>
</div>
</div>
Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com1tag:blogger.com,1999:blog-2879132569917319939.post-33125356529948653492013-12-27T21:48:00.000+01:002013-12-27T21:50:51.952+01:00Contra-testing<div id="content">
<div class="paragraph">
In software development, it is common practice to have a suite of unit
tests which can be run automatically against the program under
development.
In the following, I’ll introduce (or quite possibly reinvent) a
way of improving the quality of a test suite.</div>
<div class="paragraph">
How well the tests check the program is difficult to quantify. One
common way is to use a coverage measuring tool; such a coverage
tool then monitors the test execution and records which of the
program’s code paths are exercised and which aren’t.</div>
<div class="paragraph">
However, while a good test suite yields good coverage of the program,
the reverse is not necessarily true: a test suite may give good program
coverage yet not check the program’s behaviour very well.
For instance, a test suite which runs the program with a lot of
different inputs, but never performs any checks on the output of the
program, might give good coverage without finding any bugs.
In a less pathological example, one unit test may exercise a code path,
and another test may contain the check which would expose a bug in that
code path — this, too, would exhibit good program coverage, yet not
catch the bug.</div>
<div class="paragraph">
Another danger when writing a unit (or other) test is that it doesn’t test
what it’s supposed to test.
The important assertions may be defunct for some reason or other;
sometimes this defect is caught, sometimes not.</div>
<div class="paragraph">
In this respect, the test is no different from any other program:
sometimes it doesn’t do what it is supposed to do.
Except the risk is in some respects greater for test code: if it
doesn’t work as expected, there’s a significant probability that the
result will just be a green lamp, with no-one being the wiser.</div>
<div class="paragraph">
A different, but related problem with test code is that of
maintenance. As the requirements, structure, and maintainers of a
program change, the original intent of the code may be muddled
somewhat, making it less obvious what the test is supposed to check,
and whether those checks remain relevant.</div>
<div class="paragraph">
So — who guards the guards? Who tests the tests — keeps the checkers in
check?</div>
<div class="paragraph">
For the main program, both failure to behave as intended, and
ceasing at some point to do so, are often caught by the test suite.
(That’s the idea, anyway.) For tests, on the other hand,
either kind of failure may go unnoticed.</div>
<div class="paragraph">
Part of the problem is in essence one of coverage — not of the
program, which is commonly measured, but of the test code, which seldom is.
It is, indeed, a common feature of test code, by its very nature, that certain
of its code paths are seldom if ever exercised.</div>
<div class="paragraph">
Can you test the tests, then? Can you exercise them more thoroughly?</div>
<div class="paragraph">
Test-driven development (TDD) is a partial answer to this problem:
in the TDD methodology, tests are written to first fail, reflecting
the non-compliance of the code, then the code is changed to make the
test succeed. That means that the error-path of the test is taken at
least once; definitely an improvement.</div>
<div class="paragraph">
Here, though, I want to go a step further — and aim for <i>continual</i>
testing of the tests.</div>
<div class="paragraph">
One possibility is to see unit tests and programs as adversaries — as duals of each other — keeping each other in check.
That approach appears to have some advantages.</div>
<div class="paragraph">
Following the testing principle from "The Art of Software Testing":
"A programmer should avoid attempting to test his or her own program",
I’ll discuss it as if program writer and test writer are different
people — but this doesn’t mean that it’ll work less well if the
program writer is also the tester.</div>
<div class="paragraph">
If the normal unit test writing method is an asymmetric game where
the test writer tries to second-guess what the program writer might
have missed, and the program writer is simply doing his job — we
could conceivably aim for a more symmetric situation of both parties
writing checks of the other’s work.</div>
<div class="paragraph">
In <i>that</i> kind of game, this means that the program writer
supplies more than one implementation, of which all but one are known
to contain a bug — realistically, this would be the actual program
and a set of variations on it.</div>
<div class="paragraph">
For the test writer, this means supplying a set of unit tests as usual — except that now that set can be failed, deemed inadequate!
The test writer gets a chance to lose, too.</div>
<div class="paragraph">
Supposing this is a novel idea (although probably it isn’t), let’s
name it "Contra-testing".</div>
<div class="paragraph">
(<b>Update:</b> The idea of using intentional bug introduction as part of
the testing process is indeed far from new. Both "Bebugging" and
"Mutation Testing" (two techniques dating back to the 1970s) are about
exactly that. More on that later.)</div>
<div class="paragraph">
Also, let’s have some illustrative examples.
I’ll go with a couple of classics — as far as the programs are concerned.
The test cases probably won’t be.</div>
<div class="sect1">
<h2 id="_notation">
Notation</h2>
<div class="sectionbody">
<div class="paragraph">
We’ll need some notation for specifying faulty variations of the program.
There are several possibilities; here I’ll go with a pure Java API:
from the program writer’s point of view, it consists of a
static method <tt>ContraTest.bug(String bugName)</tt> which under normal
conditions returns false, but for the contra-test specified by <tt>bugName</tt>
returns true.</div>
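<div class="paragraph">
A rough sketch of such an API — in Rust rather than Java, and with the
selection mechanism (an environment variable named <tt>CONTRA_TEST</tt>)
and the planted fault being illustrative choices of mine, not from the
post:</div>

```rust
use std::env;

// Analogue of ContraTest.bug(bugName): returns false under normal
// conditions, true only for the contra-test currently selected.
fn bug(bug_name: &str) -> bool {
    env::var("CONTRA_TEST").map(|active| active == bug_name).unwrap_or(false)
}

fn fibonacci(n: u32) -> u64 {
    if bug("off by one") {
        // A planted fault: starts the sequence one step too late.
        let (mut a, mut b) = (1u64, 1u64);
        for _ in 0..n {
            let t = a; a = b; b = t + b;
        }
        return a;
    }
    let (mut a, mut b) = (0u64, 1u64);
    for _ in 0..n {
        let t = a; a = b; b = t + b;
    }
    a
}

fn main() {
    // Under normal conditions no contra-test is enabled:
    assert_eq!(fibonacci(10), 55);
    // An adequate test suite must fail once the contra-test is selected:
    env::set_var("CONTRA_TEST", "off by one");
    assert_eq!(fibonacci(10), 89); // the planted fault is now visible
}
```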
</div>
</div>
<div class="sect1">
<h2 id="_example_1_8201_8212_8201_factorial">
Example 1 — fibonacci</h2>
<div class="sectionbody">
<div class="paragraph">
Consider the well-known Fibonacci function, as written by our friend
Petra the Program Writer (and written in Java):</div>
<div class="listingblock">
<div class="content">
<pre><tt><span style="font-weight: bold;"><span style="color: blue;">public</span></span> <span style="font-weight: bold;"><span style="color: blue;">static</span></span> <span style="color: #009900;">long</span> <span style="font-weight: bold;"><span style="color: black;">fibonacci</span></span><span style="color: #990000;">(</span><span style="color: #009900;">int</span> n<span style="color: #990000;">)</span> <span style="color: red;">{</span>
<span style="color: #009900;">long</span> a<span style="color: #990000;">=</span><span style="color: #993399;">0</span><span style="color: #990000;">,</span> b<span style="color: #990000;">=</span><span style="color: #993399;">1</span><span style="color: #990000;">;</span>
<span style="font-weight: bold;"><span style="color: blue;">for</span></span> <span style="color: #990000;">(</span><span style="color: #009900;">int</span> i<span style="color: #990000;">=</span><span style="color: #993399;">0</span><span style="color: #990000;">;</span> i<span style="color: #990000;"><</span>n<span style="color: #990000;">;</span> i<span style="color: #990000;">++)</span> <span style="color: red;">{</span>
<span style="color: #009900;">long</span> tmp<span style="color: #990000;">=</span>a<span style="color: #990000;">;</span> a<span style="color: #990000;">=</span>b<span style="color: #990000;">;</span> b<span style="color: #990000;">=</span>tmp<span style="color: #990000;">+</span>b<span style="color: #990000;">;</span>
<span style="color: red;">}</span>
<span style="font-weight: bold;"><span style="color: blue;">return</span></span> a<span style="color: #990000;">;</span>
<span style="color: red;">}</span></tt></pre>
</div>
</div>
<div class="paragraph">
What challenges might Petra set for Terry the Test Writer?
What might she suspect that he won’t think of testing?</div>
<div class="paragraph">
Well, apart from border cases and so on, one thing on her mind — one
concern which is present for her as she writes the code, and which
Terry might not think of — is that there are different algorithms to
choose from, with very different performance characteristics.</div>
<div class="paragraph">
Thus, one suitable contra-test would be this:</div>
<div class="listingblock">
<div class="content">
<!-- Generator: GNU source-highlight 3.1.5
by Lorenzo Bettini
http://www.lorenzobettini.it
http://www.gnu.org/software/src-highlite -->
<pre><tt><span style="font-weight: bold;"><span style="color: blue;">public</span></span> <span style="font-weight: bold;"><span style="color: blue;">static</span></span> <span style="color: #009900;">long</span> <span style="font-weight: bold;"><span style="color: black;">fibonacci</span></span><span style="color: #990000;">(</span><span style="color: #009900;">int</span> n<span style="color: #990000;">)</span> <span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: blue;">if</span></span> <span style="color: #990000;">(</span>ContraTest<span style="color: #990000;">.</span><span style="font-weight: bold;"><span style="color: black;">bug</span></span><span style="color: #990000;">(</span><span style="color: red;">"slow fibonacci"</span><span style="color: #990000;">))</span> <span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: blue;">return</span></span> <span style="color: #990000;">(</span>n<span style="color: #990000;">==</span><span style="color: #993399;">0</span><span style="color: #990000;">)</span> <span style="color: #990000;">?</span> <span style="color: #993399;">0</span>
<span style="color: #990000;">:</span> <span style="color: #990000;">(</span>n<span style="color: #990000;">==</span><span style="color: #993399;">1</span><span style="color: #990000;">)</span> <span style="color: #990000;">?</span> <span style="color: #993399;">1</span>
<span style="color: #990000;">:</span> <span style="font-weight: bold;"><span style="color: black;">fibonacci</span></span><span style="color: #990000;">(</span>n<span style="color: #990000;">-</span><span style="color: #993399;">1</span><span style="color: #990000;">)</span> <span style="color: #990000;">+</span> <span style="font-weight: bold;"><span style="color: black;">fibonacci</span></span><span style="color: #990000;">(</span>n<span style="color: #990000;">-</span><span style="color: #993399;">2</span><span style="color: #990000;">);</span>
<span style="color: #990000;">...</span>the real code of <span style="color: teal;">the</span> function<span style="color: #990000;">...</span></tt></pre>
</div>
</div>
<div class="paragraph">
This contra-test ensures that if Terry only tests the function with very small inputs,
that’ll be caught; so he’ll include a large-input test as well. Which again means that if,
later on, some other developer steps in thinking, "why use iteration?
Surely the straight-forward recursive version is better," that
regression will be caught.
Also, possibly, he’ll be more inclined to write tests for algorithmic
complexity in the future.</div>
<div class="paragraph">
The example may not be so realistic (there are fairly few uses of the
fibonacci function in everyday software projects), but it illustrates
the general principles well enough.</div>
</div>
</div>
<div class="sect1">
<h2 id="_example_2_8201_8212_8201_accounts">
Example 2 — accounts</h2>
<div class="sectionbody">
<div class="paragraph">
Another example: the classic "account" example illustrating race conditions.</div>
<div class="paragraph">
Here is Petra’s code, including a contra-test:</div>
<div class="listingblock">
<div class="content">
<!-- Generator: GNU source-highlight 3.1.5
by Lorenzo Bettini
http://www.lorenzobettini.it
http://www.gnu.org/software/src-highlite -->
<pre><tt><span style="font-weight: bold;"><span style="color: blue;">public</span></span> <span style="font-weight: bold;"><span style="color: blue;">class</span></span> <span style="color: teal;">Accounts</span> <span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: blue;">public</span></span> <span style="color: #009900;">boolean</span> <span style="font-weight: bold;"><span style="color: black;">transfer</span></span><span style="color: #990000;">(</span><span style="color: teal;">String</span> fromAccountID<span style="color: #990000;">,</span> <span style="color: teal;">String</span> toAccountID<span style="color: #990000;">,</span> <span style="color: #009900;">long</span> amount<span style="color: #990000;">)</span> <span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: blue;">synchronized</span></span> <span style="color: #990000;">(</span>ContraTest<span style="color: #990000;">.</span><span style="font-weight: bold;"><span style="color: black;">bug</span></span><span style="color: #990000;">(</span><span style="color: red;">"no locking in transfer"</span><span style="color: #990000;">)</span>
<span style="color: #990000;">?</span> <span style="font-weight: bold;"><span style="color: blue;">new</span></span> <span style="font-weight: bold;"><span style="color: black;">Object</span></span><span style="color: #990000;">()</span> <span style="color: #990000;">:</span> <span style="font-weight: bold;"><span style="color: blue;">this</span></span><span style="color: #990000;">)</span> <span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: blue;">if</span></span> <span style="color: #990000;">(</span><span style="font-weight: bold;"><span style="color: black;">withdraw</span></span><span style="color: #990000;">(</span>fromAccountID<span style="color: #990000;">,</span> amount<span style="color: #990000;">))</span> <span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: black;">deposit</span></span><span style="color: #990000;">(</span>toAccountID<span style="color: #990000;">,</span> amount<span style="color: #990000;">);</span>
<span style="font-weight: bold;"><span style="color: blue;">return</span></span> <span style="font-weight: bold;"><span style="color: blue;">true</span></span><span style="color: #990000;">;</span>
<span style="color: red;">}</span> <span style="font-weight: bold;"><span style="color: blue;">else</span></span> <span style="font-weight: bold;"><span style="color: blue;">return</span></span> <span style="font-weight: bold;"><span style="color: blue;">false</span></span><span style="color: #990000;">;</span>
<span style="color: red;">}</span>
<span style="color: red;">}</span>
<span style="color: #990000;">...</span>
<span style="color: red;">}</span></tt></pre>
</div>
</div>
<div class="paragraph">
This is another example of a contra-test of a non-functional but crucial
aspect of the implementation — here of whether the Accounts.transfer()
method is properly synchronized.</div>
<div class="paragraph">
I conjecture that while this is quite an important aspect, few unit test writers
would under normal circumstances include a test for it.
With contra-testing, the odds might improve.</div>
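<div class="paragraph">
Both examples lean on the hypothetical ContraTest.bug() helper. A
minimal sketch of what it might look like (the system-property
activation mechanism here is my assumption; any switch controlled by
the test runner would do):</div>

```java
// Sketch of a minimal ContraTest helper (hypothetical). A named bug is
// "on" only when its ID matches the activation switch - here a system
// property, so production runs leave every bug branch dead.
public final class ContraTest {
    private ContraTest() {}

    /** Returns true iff the named injected bug is currently activated. */
    public static boolean bug(String id) {
        return id.equals(System.getProperty("contratest.active.bug"));
    }

    public static void main(String[] args) {
        // In production no property is set, so the bug branch is dead code:
        System.out.println(bug("slow fibonacci"));
        // The contra-test run activates exactly one named bug:
        System.setProperty("contratest.active.bug", "slow fibonacci");
        System.out.println(bug("slow fibonacci"));
        System.out.println(bug("no locking in transfer"));
    }
}
```

<div class="paragraph">
The contra-test run then activates one bug at a time and expects the
normal test suite to fail.</div>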
</div>
</div>
<div class="sect1">
<h2 id="_example_3_8201_8212_8201_hash_set">
Example 3 — hash set</h2>
<div class="sectionbody">
<div class="paragraph">
The examples so far happen to have led to non-functional tests.
For a functional-test example, here’s a Set implementation Petra has written:</div>
<div class="listingblock">
<div class="content">
<!-- Generator: GNU source-highlight 3.1.5
by Lorenzo Bettini
http://www.lorenzobettini.it
http://www.gnu.org/software/src-highlite -->
<pre><tt><span style="font-weight: bold;"><span style="color: blue;">public</span></span> <span style="font-weight: bold;"><span style="color: blue;">class</span></span> <span style="color: teal;">MySet</span> <span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: blue;">private</span></span> Element<span style="color: #990000;">[]</span> hash_table<span style="color: #990000;">;</span>
<span style="font-weight: bold;"><span style="color: blue;">private</span></span> <span style="font-weight: bold;"><span style="color: blue;">static</span></span> <span style="font-weight: bold;"><span style="color: blue;">class</span></span> <span style="color: teal;">Element</span> <span style="color: red;">{</span><span style="color: #990000;">...</span><span style="color: red;">}</span>
<span style="color: #990000;">...</span>
<span style="font-weight: bold;"><span style="color: blue;">public</span></span> <span style="color: #009900;">boolean</span> <span style="font-weight: bold;"><span style="color: black;">contains</span></span><span style="color: #990000;">(</span><span style="color: teal;">Object</span> candidate<span style="color: #990000;">)</span> <span style="color: red;">{</span>
<span style="font-style: italic;"><span style="color: #9a1900;">// Compute hash:</span></span>
<span style="color: #009900;">int</span> candHash <span style="color: #990000;">=</span> candidate<span style="color: #990000;">.</span><span style="font-weight: bold;"><span style="color: black;">hashCode</span></span><span style="color: #990000;">();</span>
<span style="font-style: italic;"><span style="color: #9a1900;">// Determine the bucket to search:</span></span>
<span style="color: #009900;">int</span> bucket <span style="color: #990000;">=</span> Math<span style="color: #990000;">.</span><span style="font-weight: bold;"><span style="color: black;">abs</span></span><span style="color: #990000;">(</span>candHash<span style="color: #990000;">)</span> <span style="color: #990000;">%</span> hash_table<span style="color: #990000;">.</span>length<span style="color: #990000;">;</span>
<span style="font-style: italic;"><span style="color: #9a1900;">// Walk the bucket's chain:</span></span>
<span style="font-weight: bold;"><span style="color: blue;">for</span></span> <span style="color: #990000;">(</span><span style="color: teal;">Element</span> element <span style="color: #990000;">=</span> hash_table<span style="color: #990000;">[</span>bucket<span style="color: #990000;">];</span>
element <span style="color: #990000;">!=</span> <span style="font-weight: bold;"><span style="color: blue;">null</span></span><span style="color: #990000;">;</span>
element <span style="color: #990000;">=</span> element<span style="color: #990000;">.</span>next<span style="color: #990000;">)</span>
<span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: blue;">if</span></span> <span style="color: #990000;">(</span>element<span style="color: #990000;">.</span>hash <span style="color: #990000;">!=</span> candHash<span style="color: #990000;">)</span> <span style="font-weight: bold;"><span style="color: blue;">continue</span></span><span style="color: #990000;">;</span> <span style="font-style: italic;"><span style="color: #9a1900;">// This wasn't it.</span></span>
<span style="font-weight: bold;"><span style="color: blue;">if</span></span> <span style="color: #990000;">(</span>ContraTest<span style="color: #990000;">.</span><span style="font-weight: bold;"><span style="color: black;">bug</span></span><span style="color: #990000;">(</span><span style="color: red;">"contains: no equals check"</span><span style="color: #990000;">)</span> <span style="color: #990000;">||</span>
candidate<span style="color: #990000;">.</span><span style="font-weight: bold;"><span style="color: black;">equals</span></span><span style="color: #990000;">(</span>element<span style="color: #990000;">.</span>value<span style="color: #990000;">))</span>
<span style="color: red;">{</span>
<span style="font-weight: bold;"><span style="color: blue;">return</span></span> <span style="font-weight: bold;"><span style="color: blue;">true</span></span><span style="color: #990000;">;</span> <span style="font-style: italic;"><span style="color: #9a1900;">// Candidate is present.</span></span>
<span style="color: red;">}</span>
<span style="color: red;">}</span>
<span style="font-weight: bold;"><span style="color: blue;">return</span></span> <span style="font-weight: bold;"><span style="color: blue;">false</span></span><span style="color: #990000;">;</span> <span style="font-style: italic;"><span style="color: #9a1900;">// Candidate is absent.</span></span>
<span style="color: red;">}</span>
<span style="color: red;">}</span></tt></pre>
</div>
</div>
<div class="paragraph">
Here again, Petra recognizes — with her white-box knowledge — a mistake
a developer could easily make, and which the tests should consequently
check for: Two key objects can have the same hash code yet be different.</div>
</div>
</div>
<div class="sect1">
<h2 id="_back_from_examples">
Back from examples</h2>
<div class="sectionbody">
<div class="paragraph">
I’ve illustrated the idea with two developers; in practice, the same
person can of course write both tests and contra-tests.
In that case, of course, the question which the program writer asks herself
is not likely to be "what hasn’t the test writer thought of?"
(or why not, indeed?)
but rather: "what <i>have</i> I thought of but possibly got wrong anyway?" — much like when one developer writes normal unit tests for his or
her own code.</div>
<div class="paragraph">
Now, why might this testing approach work better than the status quo?</div>
<div class="paragraph">
One argument for contra-testing has to do with confirmation bias:
it’s like the difference between asking "prove that my program works" and
"this program has one small but significant bug. Find it."
Given that all software contains bugs, the latter question is
obviously the correct one to ask.</div>
<div class="paragraph">
Furthermore, having pairs of tests with a single variation and with
opposite expected results is generally good practice, as it heightens
the probability that you’re testing what you think you’re testing.
Testing on both sides of a boundary, and expecting different results,
tends to be a very good idea.</div>
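<div class="paragraph">
For instance (a toy sketch, not tied to any particular test
framework): one query on each side of the membership boundary,
differing in a single variation and with opposite expected results:</div>

```java
import java.util.HashSet;
import java.util.Set;

// Two queries differing in a single variation, with opposite expected
// results: together they show the test really exercises membership,
// not just the absence of exceptions.
public class PairedTests {
    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        set.add("present");
        System.out.println(set.contains("present")); // one side of the boundary
        System.out.println(set.contains("absent"));  // the other side
    }
}
```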
<div class="paragraph">
The contra-tests also give useful input to the question, "when have we
tested enough?"</div>
<div class="paragraph">
What might the consequences be for <i>which</i> unit tests are written?
The above examples give reason to believe that there might be a
greater weight on non-functional tests such as tests of
performance/algorithmic complexity and thread-safety.
Also, the trickiest parts of the code, the special cases which were
hardest to express, and the hardest-won insights of the
program developer would be more likely to get attention — because the program developer is most acutely aware of these areas of
the problem.
Which would be a very good thing, because for the same reasons, these corners
and program properties would be most likely to deteriorate later on
in the program’s life.</div>
<div class="paragraph">
Finally, especially if contra-testing is carried out with the program
writer and the test writer being separate persons, I suppose there could be an
effect on these developers’ testing mindset in general.
This suspicion is purely based on the above examples, though.</div>
</div>
</div>
<div class="sect1">
<h2 id="_not_all_new">
Not all new</h2>
<div class="sectionbody">
<div class="paragraph">
I haven’t yet had the chance to try out contra-testing in practice,
but I find the idea interesting — perhaps you do too, now.</div>
<div class="paragraph">
And as I’ve mentioned, the idea is not all new — few ideas usually are.</div>
<div class="paragraph">
As for the related preexisting techniques I mentioned earlier:
Mutation Testing and Bebugging are both about assessing the adequacy
of a given test suite, and/or estimating the number of remaining,
uncaught bugs.</div>
<div class="paragraph">
Bebugging involves manually injecting a known number of bugs, seeing
what proportion of those are caught by the testing process, and
using that number to extrapolate from the number of real caught errors
to the number of unfound errors.</div>
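<div class="paragraph">
With made-up numbers (the 10/8/40 figures below are invented for
illustration), the extrapolation goes like this: catching 8 of 10
seeded bugs suggests the process finds 80% of all bugs, so 40 caught
real bugs extrapolate to an estimated 50 in total, 10 of them still
unfound:</div>

```java
// Bebugging extrapolation with hypothetical numbers. Integer
// arithmetic keeps the estimate exact in this example.
public class BebuggingEstimate {
    public static void main(String[] args) {
        int seeded = 10;        // bugs deliberately injected
        int seededCaught = 8;   // of those, found by the testing process
        int realCaught = 40;    // real bugs found by the same process

        // realCaught / (seededCaught / seeded), done in integers:
        int estimatedTotal = realCaught * seeded / seededCaught;
        int estimatedUnfound = estimatedTotal - realCaught;

        System.out.println(estimatedTotal);
        System.out.println(estimatedUnfound);
    }
}
```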
<div class="paragraph">
Mutation testing, on the other hand, is about automatically generating
program variations (each of which may or may not be faulty) and seeing in
how many cases the test suite catches the difference. This translates
into an adequacy score for the test suite.
Thus, this method in effect works as an improved coverage test,
measuring not only whether all parts of the program are exercised, but also
whether the effects of each part of the program are in fact (directly
or indirectly) checked by the tests.</div>
<div class="paragraph">
The main drawbacks of mutation testing are that it’s a
resource-intensive process, and that many of the generated variations
may be semantically equivalent to the original program, and this
cannot be detected fully automatically.</div>
<div class="paragraph">
As far as I can make out, neither of the two techniques is meant for
continuous testing, or for being a permanent and integrated part of
the development process. Thus, contra-testing may not be entirely old news.
I guess another name for it would be "Continuous Bebugging".</div>
<div class="paragraph">
Given time, I’ll see if I can get some level of JUnit or TestNG
support for contra-testing going. (Yeah, like that’s going to happen…)
In any case: If you decide to try this testing method, please let me
know your experiences.</div>
</div>
</div>
</div>
Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com0tag:blogger.com,1999:blog-2879132569917319939.post-10582010300059540002013-03-26T12:27:00.002+01:002013-03-26T20:43:02.482+01:00The Case for a Concurrency-oriented VM<div class="paragraph"><p>In the following, I’ll describe the background for my current
long-running project - why I believe it to be a worthwhile endeavour.</p></div>
<div class="paragraph"><p>The overall line of thought goes like this: concurrency and
concurrency-oriented languages are important; we have such languages,
but they
are few and far between; there exists a non-CO ecosystem
(the one around the JVM) which has spawned
dozens of languages and allows those languages to cooperate and
prosper; which factors have been significant in creating this
ecosystem, and how can they be harnessed
to improve conditions for concurrency-oriented languages?</p></div>
<div class="paragraph"><p>We begin at concurrency and how programming languages deal with it.</p></div>
<div class="sect1">
<h2 id="_concurrency_and_languages">Concurrency and languages</h2>
<div class="sectionbody">
<div class="paragraph"><p>A fair number of the programs we use every day are doing more than one
thing at a time.
Every web server, smartphone app, GUI application,
operating system, chat server, VoIP system, game server, music
player, video player, etc. that you’re using is a concurrent
program - either in the sense that it serves multiple users
at once, or in the sense that it has multiple asynchronous event
sources it must react to in a timely manner.</p></div>
<div class="paragraph"><p>Yet most programming languages - and just about all mainstream languages -
aren’t very well geared towards making concurrent applications.
The programming paradigms one hears most about these days are
the Object-oriented and the Functional paradigm.
Ever heard of Concurrency-oriented languages?
They exist, but there aren’t very many of them - and they are not
mainstream but something you have to actively seek out.</p></div>
<div class="paragraph"><p>What do I mean by being concurrency-oriented, or "geared towards
concurrent applications"?
(I’ve
<a href="http://polymorphictypist.blogspot.dk/2011/05/on-concurrency-issues.html">touched</a>
upon this question earlier.)</p></div>
<div class="paragraph"><p>A number of languages with concurrency support are little more than
a sequential language with threads and locks added.
This is not enough to qualify as a concurrency-oriented language;
to be that, you have to take the issues of concurrency seriously.
If you take a conventional imperative language and add threads, you’d
have to subtract something as well.</p></div>
<div class="paragraph"><p>A typical example of a language-with-threads is Java.
Java is an object-oriented
language first and a concurrency-supporting language second, at best.
Like most languages-with-threads, concurrency support mainly comes
as an addition, an afterthought, with a library-ish feel.
Sure, the Java syntax contains concurrency-related things - mainly
<tt>synchronized</tt> and <tt>volatile</tt> - but <em>nothing forces you to use them</em>.
You can write a program embodying all of the concurrency pitfalls
known to man, and neither compiler nor runtime will object in the
slightest.</p></div>
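<div class="paragraph"><p>As a concrete sketch (the class name and iteration count are of course
arbitrary): the following program contains a textbook data race, and
both compiler and runtime accept it without a word.</p></div>

```java
// A textbook data race: two threads bump a shared counter that is
// neither volatile nor lock-protected. It compiles and runs without
// complaint; only the final count is nondeterministic (lost updates
// can leave it anywhere below 2*N, up to exactly 2*N).
public class RacyCounter {
    static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        final int N = 100_000;
        Runnable work = () -> { for (int i = 0; i < N; i++) counter++; };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join();  b.join();
        // The bound always holds, but counter itself varies run to run:
        System.out.println(counter <= 2 * N);
    }
}
```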
<div class="paragraph"><p>There are rules about when the concurrency primitives are used safely.
They come in the form of the
<a href="http://www.cs.umd.edu/~pugh/java/memoryModel/jsr133.pdf">Java
Memory Model and Thread Specification</a>, which states what the guarantees are
and how to avoid »Surprising behaviours«.<sup><a href="#fn_1" name="fnref_1">[1]</a></sup></p></div>
<div class="paragraph"><p>To be sure, explicitly stated guarantees are a Good Thing.
But consider a world in which the type safety of Java amounted to
a document stating how to avoid type errors.
No static checking, no run-time checking, just rules about when you’re
safe…</p></div>
<div class="paragraph"><p>Smells a lot like C, right?
Replace type errors, initialization errors, and memory mismanagement faults
with data races and deadlocks - and lack-of-strong-typing becomes
lack-of-proper-concurrency-support.
Except that whereas the former kind of bug is more or less
deterministic, the latter kind seldom is.
Which means that a) it is harder to debug, b) it cannot be guarded
against by unit tests, and c) the place where you’re likely to
discover it is in production.</p></div>
<div class="paragraph"><p>And just like we’re much better off with the enforced type safety of Java
(nobody misses the segmentation faults which are common when
developing C or C++ programs),
so we’re better off when the language provides some actual
guarantees in the face of concurrent activities.</p></div>
<div class="paragraph"><p>And languages can do that.
A language like Erlang, for example, is
<a href="http://www.guug.de/veranstaltungen/ffg2003/papers/ffg2003-armstrong.html">concurrency-oriented</a> first and functional second - and it shows.
Other examples include Oz, CML, Rust, and Clojure (Haskell
probably also qualifies these days); each solves the problem of
providing safe concurrency in its own way.<sup><a href="#fn_2" name="fnref_2">[2]</a></sup></p></div>
<div class="paragraph"><p>But these languages are each isolated efforts, and they are not mainstream.
Which means that they each must struggle on many fronts: evolving
syntactically and semantically; keeping up with hardware and OS
evolution; providing libraries for the expected functionality of the
changing times; and fighting for developer mindshare.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_the_success_of_the_jvm">The success of the JVM</h2>
<div class="sectionbody">
<div class="paragraph"><p>Which brings us to the happy ecosystem centered around the JVM.</p></div>
<div class="paragraph"><p>The Java Virtual Machine is thriving -
to the point where it might outlive the language it was originally
created for; while Java the language is evolving only slowly,
- the JVM is the center (qua
facilitator) of a wholly different level of evolution: a
<a href="http://en.wikipedia.org/wiki/List_of_JVM_languages">multitude of
languages</a> of all sorts of paradigms, domains, typing strategies and
syntactic styles are targeting the JVM.</p></div>
<div class="paragraph"><p>Though the VM was originally designed for just one language - Java -
it has proven so convenient to generate code for that many language
implementers have chosen to do just that.
(I’ve done it myself on more than one occasion.)</p></div>
<div class="paragraph"><p>Why is that?</p></div>
<div class="paragraph"><p>For one thing, the JVM is well-specified. Its specification is public and
the rules are clear and make sense.
Second, part of those rules are embodied in the <em>bytecode verifier</em> -
a validation step performed on code when it is loaded into the JVM.
Thirdly, the existence of the public specification has meant that the
Java language and the JVM implementation(s) have been decoupled (more
or less) with respect to their evolution. Java the language has been
able to evolve quite a bit without the VM spec changing much (the
addition of generics being one example), and the VM on its part has
been able to evolve considerably performance-wise.
(Besides, several competing VM implementations have come to exist.)</p></div>
<div class="paragraph"><p>And the HotSpot implementation of Java is a solid piece of software
engineering which has put to shame the "interpreted languages must be
slow" prejudice, making it a thing of the last century.</p></div>
<div class="paragraph"><p>What do these things mean for compiler writers targeting the JVM?</p></div>
<div class="paragraph"><p>Targeting a statically checked bytecode format is a blessing, because
many of the most difficult-to-debug kinds of error are revealed at
load time, instead of resulting in mysterious memory corruptions.
Bytecode verification enforces some useful invariants, reducing
debugging time.</p></div>
<div class="paragraph"><p>Targeting a mature VM yields performance and portability benefits -
"write once, run anywhere" is attractive for language implementers as
well as application developers.</p></div>
<div class="paragraph"><p>Consider <a href="http://en.wikipedia.org/wiki/List_of_JVM_languages">all
the languages</a> which run on the JVM, some of them available only on
that platform.
<br>
Without that virtual machine, or something else playing the same role,
would we have had so many new implementations of such a variety of
languages, of such a level of maturity, performance and portability?
And provided with such a wide range of libraries?</p></div>
<div class="paragraph"><p>Most probably not.</p></div>
<div class="paragraph"><p>The task of implementing a native code generator for just a single
platform, along with runtime libraries and a decently performing
general-purpose garbage collector, is not trivial.<sup><a href="#fn_3" name="fnref_3">[3]</a></sup>
The number of man-hours which could have gone to debugging
segmentation faults and similar mysteries, and to implementing just the
basic libraries for the new languages, but which have instead been
channeled to progress in other areas, is significant.
Small wonder that
Microsoft chose to copy Java and make their own common cross-language
platform (.NET).</p></div>
<div class="paragraph"><p>Finally, apart from development of compilers, the well-defined-ness of
the Java bytecode format has meant that a number of other byte-code
processing tools have come to exist: tools for analysis,
transformation and instrumentation of Java bytecode.</p></div>
<div class="paragraph"><p>In summary: The JVM has been, and is, a great facilitator; this is
especially due to its general usability, public specification,
fine performance (especially after HotSpot), and - not least -
bytecode verification following well-defined and well-understood
rules.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_the_shortcomings_of_the_jvm">The shortcomings of the JVM</h2>
<div class="sectionbody">
<div class="paragraph"><p>But the JVM design has drawbacks as well. Some are general nuisances,
some are just annoying to compiler implementers targeting the JVM,
and some cause it to be less than ideal for languages belonging to
certain paradigms.
Some are visible to Java developers, others are purely technical.</p></div>
<div class="sect2">
<h3 id="_issues_for_language_implementers">Issues for language implementers</h3>
<div class="paragraph"><p>First of all, the JVM smells of Java.
<br>
In a number of ways, the design of the virtual machine would have been
different if Java - especially the early versions of Java - had been
different.</p></div>
<div class="paragraph"><p>The JVM is quite class-centric, for instance - the unit of loading and
verification is the class. The only kind of type definition is the class.
For language implementers, this is a source of mild annoyance:
translating most kinds of simple code is simple, but the moment the
code uses a new record type, or variant type, or function object, or a
function needs to return multiple values - it translates into adding a
new separate class file to the compiler output.</p></div>
<div class="paragraph"><p>The JVM is statically typed - but has no parametrized types.
Therefore, type safety breaks slightly every time one of the common
container classes from e.g. <tt>java.util</tt> is used, so runtime checks
must be performed.
This is because Java prior to 1.5 (where generics were introduced)
was poor on this point.
<br>
Furthermore, the JVM instruction set has a type check instruction and
a type cast instruction - but no "check-and-cast" instruction.
I can only assume that this is because Java doesn’t have a language
construction such as "type case", but only the combining idiom "<tt>if (x
instanceof T) {T y = (T) x; …}</tt>". Presumably, this pattern is
recognized and simplified internally in the common JVM implementations.</p></div>
<div class="paragraph"><p>JVM methods are limited to 64KB worth of bytecode.
While this is rarely a limitation for hand-written Java code,
it is occasionally a problem for machine-generated code.
(In Erjang, for example, we hit the limit for one version of the
interpreter, and also (if memory serves) when compiling one particular
Erlang module into Java bytecode.)</p></div>
<div class="paragraph"><p>A special case of this is that large literals simply can’t be used,
because in the Java class file format, a literal value is represented
as the sequence of instructions it takes to build the value.
Try compiling a literal array of, say, 12,000 elements - at four
instructions per array slot, this exceeds the method-length limit, and
the compiler quits with the message "<tt>code too large</tt>".<sup><a href="#fn_4" name="fnref_4">[4]</a></sup>
Certain table-driven algorithms would therefore also be hit by the limit.</p></div>
<div class="paragraph"><p>The implications for language implementers are that a) one likely kind
of candidate for hitting the method-length limit is interpreter loops,
and b) compilers to JVM bytecode should know about and handle the limit — which gives rise to a tough implementation decision: just detecting
when the limit is overrun and aborting the compilation is an honest,
but abstraction-breaking solution; working around the limit in general
is presumably possible, but introduces undesired (and seldom needed)
complexity.<sup><a href="#fn_5" name="fnref_5">[5]</a></sup><sup><a href="#fn_6" name="fnref_6">[6]</a></sup></p></div>
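<div class="paragraph"><p>One workaround sketch (one option among several; a very large table
may need to be split across multiple strings or loaded from a resource
file, since a single constant-pool string is itself limited to 64KB of
modified UTF-8): pack the data into a String literal, which lands in
the constant pool rather than in bytecode instructions, and decode it
during class initialization.</p></div>

```java
// Sketch: constant data packed into a String literal (constant pool)
// instead of an array initializer (bytecode). A real table would hold
// thousands of entries; four suffice to show the decoding step.
public class BigTable {
    private static final String PACKED = "\u0001\u0002\u0003\u0004";
    static final int[] TABLE = unpack(PACKED);

    private static int[] unpack(String s) {
        int[] t = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            t[i] = s.charAt(i); // each char carries one table entry
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(TABLE.length);
        System.out.println(TABLE[2]);
    }
}
```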
</div>
<div class="sect2">
<h3 id="_issues_related_to_concurrency">Issues related to concurrency</h3>
<div class="paragraph"><p>A related problem is JVM’s bad immutability guarantees.
Specifically, there is no such thing as an immutable array.
If a part of your code is given an array, or provides public access to
one of its arrays, then it can’t assume that other code parts won’t
modify the contents of the array at a later point.
<br>
Literal arrays are no exception: a VM implementation can’t perform
constant folding of lookups into a literal array until it’s proven
that that array can’t be leaked to somewhere where it might be
mutated; a property which can only be established by inspecting the
entire scope of the variable.</p></div>
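<div class="paragraph"><p>A small sketch of the consequence: a class that hands out its internal
array loses control of its own state, and the conventional (but
costly) defence is to hand out a copy instead.</p></div>

```java
// Arrays are always mutable, so exposing one exposes your state;
// returning a clone is the usual defensive-copy idiom.
public class Table {
    private final int[] data = {1, 2, 3};

    int[] leaky()     { return data; }         // caller can mutate our state
    int[] defensive() { return data.clone(); } // caller only gets a copy

    public static void main(String[] args) {
        Table t = new Table();
        t.leaky()[0] = 99;             // reaches into t's internals
        System.out.println(t.data[0]);
        t.defensive()[1] = 99;         // mutates only the throwaway copy
        System.out.println(t.data[1]);
    }
}
```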
<div class="paragraph"><p>The JVM’s support for (shallowly) immutable variables comes in the
form of the <tt>final</tt> field modifier.
The intuition of this modifier is simple enough: a <tt>final</tt> field can
be set only once, and must be set before its first use, so it should
have the same value throughout its lifetime.</p></div>
<div class="paragraph"><p>But in reality, things are a bit more complex - the field modifier has
its warts:</p></div>
<div class="ulist"><ul>
<li>
<p>
Certain <tt>final</tt> fields may be inlined at compile time - even across
classes, which has the potential for causing surprising
inconsistencies.
</p>
</li>
<li>
<p>
Even fields declared <tt>final</tt> can be modified through reflection(!) -
which, of course, doesn’t affect those accesses which have been
inlined at compile-time (I believe you could call this a surprise
within a surprise).
</p>
</li>
<li>
<p>
There’s another case where the "final fields always have the same
value" rule is broken:
Final fields can be observed before they gain their value.
<br>
That abstraction breakage is demonstrated in the following program:
</p>
<div class="listingblock">
<div class="content">
<!-- Generator: GNU source-highlight 3.1.5
by Lorenzo Bettini
http://www.lorenzobettini.it
http://www.gnu.org/software/src-highlite -->
<pre><tt><span style="font-weight: bold"><span style="color: #0000FF">public</span></span> <span style="font-weight: bold"><span style="color: #0000FF">class</span></span> <span style="color: #008080">JavaFinal</span> <span style="color: #FF0000">{</span>
<span style="font-weight: bold"><span style="color: #0000FF">final</span></span> <span style="font-weight: bold"><span style="color: #0000FF">static</span></span> <span style="color: #008080">Object</span> x <span style="color: #990000">=</span> JavaFinalHelper<span style="color: #990000">.</span><span style="font-weight: bold"><span style="color: #000000">create</span></span><span style="color: #990000">();</span>
<span style="font-weight: bold"><span style="color: #0000FF">public</span></span> <span style="font-weight: bold"><span style="color: #0000FF">static</span></span> <span style="color: #009900">void</span> <span style="font-weight: bold"><span style="color: #000000">main</span></span><span style="color: #990000">(</span>String<span style="color: #990000">[]</span> args<span style="color: #990000">)</span> <span style="color: #FF0000">{</span>
System<span style="color: #990000">.</span>err<span style="color: #990000">.</span><span style="font-weight: bold"><span style="color: #000000">println</span></span><span style="color: #990000">(</span><span style="color: #FF0000">"Main: x = "</span><span style="color: #990000">+</span>x<span style="color: #990000">);</span>
<span style="color: #FF0000">}</span>
<span style="color: #FF0000">}</span>
<span style="font-weight: bold"><span style="color: #0000FF">class</span></span> <span style="color: #008080">JavaFinalHelper</span> <span style="color: #FF0000">{</span>
<span style="font-weight: bold"><span style="color: #0000FF">static</span></span> <span style="color: #FF0000">{</span>
System<span style="color: #990000">.</span>err<span style="color: #990000">.</span><span style="font-weight: bold"><span style="color: #000000">println</span></span><span style="color: #990000">(</span><span style="color: #FF0000">"Helper: x = "</span><span style="color: #990000">+</span>JavaFinal<span style="color: #990000">.</span>x<span style="color: #990000">);</span>
<span style="color: #FF0000">}</span>
<span style="font-weight: bold"><span style="color: #0000FF">public</span></span> <span style="font-weight: bold"><span style="color: #0000FF">static</span></span> <span style="color: #008080">Object</span> <span style="font-weight: bold"><span style="color: #000000">create</span></span><span style="color: #990000">()</span> <span style="color: #FF0000">{</span><span style="font-weight: bold"><span style="color: #0000FF">return</span></span> <span style="font-weight: bold"><span style="color: #0000FF">new</span></span> <span style="font-weight: bold"><span style="color: #000000">Object</span></span><span style="color: #990000">();</span><span style="color: #FF0000">}</span>
<span style="color: #FF0000">}</span></tt></pre>
</div>
</div>
<div class="paragraph"><p>Here, the helper class’s static initializer is run before
<tt>JavaFinal.x</tt> has been set - exposing the <tt>null</tt> value that field has
before it is initialized.</p></div>
</li>
</ul></div>
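<div class="paragraph"><p>The reflection loophole from the second point above can also be demonstrated directly (the exact behaviour may vary with JVM version and module settings; this sketch assumes a plain class in the unnamed module):</p></div>

```java
import java.lang.reflect.Field;

public class FinalReflect {
    static class Holder {
        final int value;
        // The initializer is not a compile-time constant, so reads of
        // 'value' are not inlined - which is what makes the trick visible.
        Holder(int v) { value = v; }
    }

    public static void main(String[] args) throws Exception {
        Holder h = new Holder(42);
        Field f = Holder.class.getDeclaredField("value");
        f.setAccessible(true);        // suppress the access check
        f.setInt(h, 7);               // mutate a final field(!)
        System.out.println(h.value);  // prints 7
    }
}
```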
<div class="paragraph"><p>Thus, certain should-be-invariants aren’t.</p></div>
<div class="paragraph"><p>Finally, and most importantly in this context,
there are the shortcomings in the concurrency department.</p></div>
<div class="paragraph"><p>The JVM provides no checking of the sanity of concurrent access patterns.
Java’s static type system, which is the basis of the bytecode
verification, is static in more than one sense:
It doesn’t take time into account.
If you ever get your hands on a reference to an object, then you can at any
later point call any (visible) method.</p></div>
<div class="paragraph"><p>In concurrent systems, concepts like life cycle, ownership, sharing
and thread safety become important. The JVM doesn’t provide much
support for dealing with these.
For that matter, attention to these concepts is important in general
for defining library boundaries - which assumptions the library and
the library client can make about values passed to and from the library.
Quite often, thread safety and access timing questions aren’t
addressed by Java libraries' API documentation; Javadoc makes it easy
to document the static types involved, but certain questions about the
dynamics are not often answered.</p></div>
<div class="paragraph"><p>From an Erlang point of view, such a thing as thread-local garbage
collection is also desirable. I have a hypothesis that no sanely
structured concurrent programs genuinely <em>need</em> a globally-GC’ed heap.
Whether that is true or not, for global GC you either pay the price of
occasional slow major garbage collections - which cost in latency -
or, if your garbage collector is sufficiently advanced to do its work
incrementally or concurrently with the mutating threads, pay the price
of inter-thread synchronization necessary for this to work - which
costs in throughput.</p></div>
<div class="paragraph"><p>For the purposes of this text, the verification aspects are
the central points.</p></div>
<div class="paragraph"><p>To sum up: The JVM lacks verification support for such things as
sanity of concurrent accesses and immutability/read-only access privileges.
As far as concurrency-orientation is concerned, these areas present
room for improvement.
<br>
And I am not aware of any other generally targetable VMs which fare
better in these respects.</p></div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_the_shortcomings_of_the_beam">The shortcomings of the BEAM</h2>
<div class="sectionbody">
<div class="paragraph"><p>The primary contender for the title of Concurrency-oriented VM is
probably the one for Erlang, which is called BEAM.
The flavour of concurrency-orientation provided by BEAM includes
immutable data structures, shared-nothing actor-based concurrency,
light-weight processes, and process-local GC - and a few specific
forms of destructive updates.</p></div>
<div class="paragraph"><p>In some ways, it is very much like the Java VM: It has a mature
implementation, was originally intended for just one language, comes
from industry (as opposed to academia), and started out as purely
interpreted but later grew optional native compilation.</p></div>
<div class="paragraph"><p>But in other respects, it is very different:
<br>
While the JVM has had a public specification at least since 1999, BEAM
<a href="http://erlang.org/pipermail/erlang-questions/2012-May/066872.html">hasn’t
got</a> a complete specification - public or not.<sup><a href="#fn_7" name="fnref_7">[7]</a></sup></p></div>
<div class="paragraph"><p>The most complete instruction set reference I know of is
<a href="http://www.cs-lab.org/historical_beam_instruction_set.html">this
historical paper</a>, written in 1997 but first released in 2012 (prompted,
I think, by the 2012 mailing list discussion). And that one only
documents the <em>internal</em> representation, not the (somewhat different)
instruction set used in BEAM files.</p></div>
<div class="paragraph"><p>This lack of a specification has consequences:</p></div>
<div class="ulist"><ul>
<li>
<p>
As for BEAM code producers: While languages other than Erlang have
been developed for BEAM, none produce BEAM code directly - instead,
they target an intermediate representation of the Erlang compiler
and use that compiler’s backend for producing actual BEAM code.
</p>
</li>
<li>
<p>
As for BEAM consumers: Until fairly recently, no other BEAM VM
implementations existed but the official one from Ericsson.
</p>
</li>
</ul></div>
<div class="paragraph"><p>The consumer situation has changed a bit; people have begun making
other implementations such as
<a href="http://github.com/trifork/erjang">Erjang</a>,
<a href="http://github.com/svahne/browserl/">Browserl</a> and
<a href="http://erlangonxen.org/">Ling</a>
(which run on, respectively, JVM, Javascript, and bare metal).</p></div>
<div class="paragraph"><p>But the lack of a specification - a clear definition of what is and
what isn’t allowed - means that there are differences in what the
implementations can run.</p></div>
<div class="paragraph"><p>At least two examples of this are known to me:</p></div>
<div class="olist arabic"><ol class="arabic">
<li>
<p>
Erjang assumes that exception handlers are set up in an
execution-path-independent manner. While this is presumably true for code
generated by the Erlang compiler, I can imagine BEAM code in which
this isn’t the case, and which the official BEAM implementation will
run cheerfully, while Erjang will balk at it.
</p>
</li>
<li>
<p>
There is a gotcha hidden in the external instruction set -
there are nine instructions for calling functions:
three for local calls (<tt>call</tt>, <tt>call_last</tt>, <tt>call_only</tt> - depending on
whether it’s a tail call, and whether there’s a stackframe to deallocate);
three for external calls (<tt>call_ext</tt>, <tt>call_ext_last</tt>, <tt>call_ext_only</tt>);
two for calls with dynamic module and/or function name (<tt>apply</tt>, <tt>apply_last</tt>);
and one for calling function objects (<tt>call_fun</tt>).
<br><br>
Can you guess what the trap is?
<br><br>
Erlang is a language with guaranteed proper tail recursion,
but the only instruction for calling function objects is a <em>non-tail-call</em>.
<br>
How can that work??
<br><br>
By cheating. The official BEAM rewrites the instruction set used in
BEAM files into a richer instruction set used internally.
The rewriting rules contain a rule which recognizes the instruction sequence
"<tt>call_fun; deallocate; return</tt>" and rewrite it into the internal
<em>tail-call</em> version of <tt>call_fun</tt>.
<br><br>
The Erlang compiler, obviously, is in on the joke, and generates code
accordingly.
</p>
</li>
</ol></div>
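<div class="paragraph"><p>The rewriting rule itself is just a peephole pattern match; in Java-flavoured pseudocode it could look like this (the internal opcode name is made up here - the real rewriting engine is a pattern table inside the BEAM loader):</p></div>

```java
import java.util.ArrayList;
import java.util.List;

public class Peephole {
    // Sketch of the loader's rewriting rule: the external sequence
    // "call_fun; deallocate; return" becomes one internal tail-call
    // instruction (here named "i_call_fun_last" - a hypothetical name).
    static List<String> rewrite(List<String> code) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            if (i + 2 < code.size()
                    && code.get(i).equals("call_fun")
                    && code.get(i + 1).equals("deallocate")
                    && code.get(i + 2).equals("return")) {
                out.add("i_call_fun_last"); // the tail-call version
                i += 2;                     // consume all three instructions
            } else {
                out.add(code.get(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(rewrite(List.of("move", "call_fun", "deallocate", "return")));
    }
}
```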
<div class="paragraph"><p>This <tt>call_fun</tt> gotcha was one that both Erjang and Browserl missed.
I only discovered it
<a href="http://erlang.org/pipermail/erlang-questions/2012-May/066648.html">accidentally</a>,
and fixed it in Erjang, but as far as I can tell, Browserl doesn’t
handle that corner case (yet).</p></div>
<div class="paragraph"><p>So: a clear, public specification does matter.
A specification backed by a formal checking mechanism such as a
bytecode verifier is even better.<sup><a href="#fn_8" name="fnref_8">[8]</a></sup></p></div>
<div class="paragraph"><p>From a "multiple different languages on one VM" viewpoint, there are
of course other issues as well - for instance, the BEAM is rather
Erlang-centered.</p></div>
<div class="ulist"><ul>
<li>
<p>
No in-process destructive updates are supported - even where it
would be safe.
</p>
</li>
<li>
<p>
Only the concurrency primitives needed by Erlang are present - and
the form in which they are present is so high-level that there is
very little flexibility for other languages to be very non-Erlang.
A process can’t have more than one inbox; the set of supported
transactions on tables is fixed; binaries support efficient
appending but not prepending; and so on.
</p>
</li>
</ul></div>
<div class="paragraph"><p>In summary: The BEAM is a mature, well-tuned, and actively evolving
VM for running Erlang programs, but probably lacks the flexibility and
(as things stand) certainly the well-definedness necessary for it to
be viewed as a general platform for concurrency-oriented languages.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_what_would_a_concurrency_oriented_vm_look_like">What would a concurrency-oriented VM look like?</h2>
<div class="sectionbody">
<div class="paragraph"><p>If the JVM isn’t it, and the BEAM isn’t it, what <em>would</em> a
concurrency-oriented VM then look like?</p></div>
<div class="paragraph"><p>The BEAM offers concurrency safety, in that it does not allow<sup><a href="#fn_9" name="fnref_9">[9]</a></sup>
shared mutable state without clearly defined transactions; this is a
property we want.</p></div>
<div class="paragraph"><p>The JVM guarantees the absence of certain kinds of errors, because its
rules make the necessary invariants locally verifiable. We also want
that. More specifically, the rules should guarantee against not only
type errors, but also concurrency-related errors (data races).</p></div>
<div class="paragraph"><p>The JVM also enforces these invariants by verifying code at load time
- rather than trust the compiler to generate only valid code.
We want that, too - because we intend for there to be more than one
compiler, for more than one language.</p></div>
<div class="paragraph"><p>Both the JVM and the BEAM support "native functions" - functions
implemented in C or similar - e.g. for interfacing with third-party
libraries or for performance-sensitive parts of a program.
It would be foolish to constrain the applications of a new VM by not
supporting this. However, the aim should be to encourage as much as
possible to be written in a concurrency-safe language - i.e.
while library integration would be a good reason for using native functions,
performance shouldn’t be; there should not be things which are
<em>inherently</em> slow to implement on the VM.<sup><a href="#fn_10" name="fnref_10">[10]</a></sup></p></div>
<div class="paragraph"><p>Thus, while the rule "all data structures are immutable" (as known
from Erlang) makes it easy to guarantee against data races, it is not
good enough - it either makes it impossible to implement certain things
efficiently, or encourages diving into a lower-level language to
escape the rules.
We want a bit more generality than that, also for the sake of
supporting other paradigms than Erlang’s - mutating operations should
be supported as long as they are safe and can be proven (reasonably
easily) to be so.</p></div>
<div class="paragraph"><p>For example, one step up, we have the rule "all data structures which
are ever visible to more than one process/thread must be
immutable", which is much more flexible yet still guarantees against
data races (and can be made locally verifiable).
Even higher up that ladder, shared mutable data are allowed, as long
as proper synchronization is in place.</p></div>
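<div class="paragraph"><p>The "immutable once shared" rule can be illustrated in Java terms - here enforced dynamically by the library at run time, whereas a VM-level rule would be verified statically at load time:</p></div>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FreezeBeforeShare {
    // Mutable while local, immutable once shared: the list is freely
    // mutated while only the building code can see it, then frozen
    // at the point of publication.
    static List<Integer> buildAndPublish() {
        List<Integer> local = new ArrayList<>();
        for (int i = 0; i < 3; i++) local.add(i * i); // local mutation is fine
        return Collections.unmodifiableList(local);   // frozen when shared
    }

    public static void main(String[] args) {
        List<Integer> shared = buildAndPublish();
        System.out.println(shared); // [0, 1, 4]
        try {
            shared.add(9);          // mutation after sharing is rejected
        } catch (UnsupportedOperationException e) {
            System.out.println("immutable once shared");
        }
    }
}
```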
<div class="paragraph"><p>The design I have in mind is based on the concept of <em>linear types</em>,
which appear to be a good match, in that they make it possible to
capture lifecycles of objects in the type system - which goes a long
way towards making certain concurrency-related guarantees locally
verifiable (because half of concurrency is timing).</p></div>
<div class="paragraph"><p>I plan to elaborate on linear types and their use as basis for a VM in
future posts.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_acknowledgements">Acknowledgements</h2>
<div class="sectionbody">
<div class="paragraph"><p>I would like to thank Peter Mechlenborg and Thomas In Sook Holmen
for their help with forming this text.</p></div>
</div>
</div>
</div>
<div id="footnotes">
<hr>
<p style="font-size: 80%"><a href="#fnref_1" name="fn_1">[1]</a> <span class="footnote">
This is the place where you learn that if you have one thread writing
a <tt>long</tt> or <tt>double</tt> value, and another thread reading it, the JVM
implementation is allowed to let the reading thread see half-updates -
where the two words of the read values come from different writes.</span></p>
<p style="font-size: 80%"><a href="#fnref_2" name="fn_2">[2]</a> <span class="footnote">
Go and CML do allow shared mutable memory and thus are not strictly
data-race safe; they do, however, encourage communication to be done
in other, safer ways.</span></p>
<p style="font-size: 80%"><a href="#fnref_3" name="fn_3">[3]</a> <span class="footnote">
Admittedly, <a href="http://llvm.org/">LLVM</a> has come along in the meantime,
promising to relieve some of that burden.
</span></p>
<p style="font-size: 80%"><a href="#fnref_4" name="fn_4">[4]</a> <span class="footnote">
Technically, the compiler <em>could</em> handle large literals by splitting the
initialization into two methods, or - for integers - maybe even
initialize the array based on a string constant.
But apparently the Java compiler has been taught no such trick.</span></p>
<p style="font-size: 80%"><a href="#fnref_5" name="fn_5">[5]</a> <span class="footnote">
The Scala compiler (as of version 2.9.1) handles large literals in the
same way as Java - except that it does not detect when the resulting
method is too long; the resulting class file is rejected by the JVM at
load time.</span></p>
<p style="font-size: 80%"><a href="#fnref_6" name="fn_6">[6]</a> <span class="footnote">
When developing <a href="http://erjang.org/">Erjang</a>, we ran into the
limit both when writing the (machine-generated) interpreter and when
compiling an Erlang module containing a particularly complex
function.</span></p>
<p style="font-size: 80%"><a href="#fnref_7" name="fn_7">[7]</a> <span class="footnote">
The question of the BEAM spec has come up multiple times on
the Erlang mailing lists, for instance
<a href="http://erlang.org/pipermail/erlang-questions/2003-September/009829.html">in 2003</a>,
<a href="http://erlang.2086793.n4.nabble.com/VM-amp-BEAM-Specs-td2097814.html">in 2007</a>
and
<a href="http://erlang.org/pipermail/erlang-questions/2012-May/066515.html">in 2012</a> - see erlang.org/pipermail/erlang-questions/2012-May/066628.html.</span></p>
<p style="font-size: 80%"><a href="#fnref_8" name="fn_8">[8]</a> <span class="footnote">The Erlang compiler has a
verifier in its backend, which does certain sanity checks on the
generated code - but it is not complete, nor is it
<a href="http://erlang.org/pipermail/erlang-bugs/2011-June/002492.html">
meant to be</a>.</span></p>
<p style="font-size: 80%"><a href="#fnref_9" name="fn_9">[9]</a> <span class="footnote">
Except through the use of native functions (NIFs) or similar.</span></p>
<p style="font-size: 80%"><a href="#fnref_10" name="fn_10">[10]</a> <span class="footnote">
For preference, there should be a rather small trusted kernel written
in C or similar.
The official BEAM implementation is around 140K lines of C code for
the emulator proper; I hope that less than that would be necessary.</span></p>
</div>
<div id="footer">
</div>
</div>
Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com2tag:blogger.com,1999:blog-2879132569917319939.post-44449671979952770842011-10-25T00:53:00.000+02:002011-10-25T00:53:22.120+02:00The Impossible Isolates<blockquote>…in which I continue to expound on the vast superiority of the
Erlang Way, this time with the young and innocent Dart on the
receiving end (but don't worry, Dart can handle multiple of those. The problem is that it must).</blockquote>
<p>Google recently published a "technical preview" of the Dart language.
It contains a notion of “<em>isolates</em>” — described by the
<a href="http://www.dartlang.org/docs/spec/dartLangSpec.pdf">language specification</a> as "actor-like entities" — which serve partly as
sandboxes, partly as "unit[s] of concurrency".</p>
<p>In the following, I intend to demonstrate that though the Dart
isolates are <a href="http://gotocon.com/dl/goto-aarhus-2011/slides/GiladBracha_and_LarsBak_OpeningKeynoteDartANewProgrammingLanguageForStructuredWebProgramming.pdf">presented as “inspired by Erlang
[processes]”</a>, there is much to be gained by
taking a few more pages out of Erlang's book.</p>
<h2>What's Impossible in Your Favorite Language?</h2>
<p>When programming languages are discussed, focus tends to be on the
things that are possible in a given language.
<br/>
But what's at least as important — if you're doing anything beyond
toy programs and obfuscation contests — are the things that are
impossible in the language. What <em>can't</em> happen?</p>
<p>Impossibilities are very useful. They can be a great help when
reasoning about programs — and when constructing programs which lend
themselves to reasoning.
<br/>
The impossible helps isolate relevant effect from irrelevant cause.</p>
<p>It's the same in real life: we do a lot of optimizations based on what
we know can't happen; being able to make assumptions about the world
is what lets us do, well, anything complex at all.</p>
<!--
In traffic, for instance, we know which directions to be wary about,
where new stuff may turn up which we need to react to. And we then
concentrate our attention about those directions. We don't have to
look everywhere at once; if I've established that there are no
pedestrians in a given area, I know that there's a limit to how
quickly that can change.
-->
<p>Gravity is quite reliable, for instance.
I rarely bother to strap things to my desk these days, but rely on
plain old gravity to keep them where I put them.</p>
<p>If I place something on my desk, then come back later to find that
thing lying on the floor, and wonder how that came to be — then I may
suspect a number things, but gravity outage isn't even considered a
candidate explanation.
<br/>
That's the difference between <em>unlikely</em> and <em>impossible</em>.</p>
<p>(With the different large-ish systems I've been working on for the
past few years, <em>unlikely</em> happens all the time.
And <em>implausible</em> (though not impossible) happens about once every
1–2 years (these events often involve virtualization).)</p>
<p>Narrowing down the "what may happen / what may have happened" scope is
a crucial part of program debugging, understanding and maintenance,
and hence of developing.</p>
<h2>The blessings of Erlang</h2>
<p>One of the critical things that Erlang provides, and a major reason
why I like it, is the way its semantics and building blocks prevent surprises.
<br/>
Two processes can't interfere with each others' state; they can't
affect each others' control flow except through message passing;
and if a process goes down, it disappears completely, without leaking resources.</p>
<p>The great thing about Erlang is not in itself that it allows you to do
many things simultaneously, but that at the same time it lets you
focus on one thing at a time.
It is just as much about being able to think non-concurrently, as
it's about concurrency. That's the way that complexity is handled: by
isolation, through focus on the task at hand.
<br/>
Much of the time, the result is that concerns separate nicely.</p>
<!--
Erlang may enable us to write concurrent programs, but the way it does
it is by letting us not have to think concurrently. Not all the time, at
least. And get safely away with it.
-->
<h2>The three pillars of sound state</h2>
<p>As Kresten Krab Thorup points out in his posting
“<a href="http://dartinside.com/2011/dart-an-erlangers-reflections/">Dart: An Erlanger's Reflections</a>”, the three major features of Erlang which enable such focus on one task are
<em>isolation</em>, <em>sequencing</em>, and <em>fault handling</em>.</p>
<p>These features consist of four (1+1+2) properties which can be summed up as, respectively,</p>
<ul>
<li>Other processes cannot interfere with a process's state (except through message passing)</li>
<li>A process cannot interfere with its own state (but has a single, predictable control flow)</li>
<li>A process which fails cannot continue to run after the failure (with a possibly corrupt state)</li>
<li>When a process fails, and thus ceases to run, entities dependent on that process will be notified of its demise in a timely manner.</li>
</ul>
<p>(Or, more concisely: “Others won't interfere”, “I myself won't interfere”, “Failure will not linger”, and “Failure will not go unnoticed”.)</p>
<p>Of these, I will in the following focus on the first three — the ones
which are, incidentally, stated as impossibilities of the
aforementioned kind.
<br/>
The absence of any of these properties would destroy locality in reasoning.</p>
<p>One consequence of this set of properties is related to the famous "<a href="http://en.wikipedia.org/wiki/Fail-fast">Let it crash</a>" philosophy of the Erlang tradition:
that even error handling is done only by the survivors — those processes
with a “good” (non-corrupted) state.</p>
<h2>The state of Dart (and stacklessness of its isolates)</h2>
<p>The reason I write about all of this just now is related to Google's
recently-published language “<a href="http://www.dartlang.org/">Dart</a>” (released as
an early preview; the spec is not final yet).</p>
<p>Dart introduces a notion of <em>isolates</em> — like Erlang's light-weight
processes, isolates have separate heaps.
Unlike Erlang processes, however, isolates don't have separate stacks,
but are activated on an empty stack each time, through callbacks which
process the incoming messages on a FIFO basis.</p>
<p>And as far as I can tell from the specification, nothing particular
happens to an isolate if one of its callbacks terminates abnormally.
<br/>
[Actually, I am told that as it is now, that will cause the Dart VM to crash.
I will assume that this behaviour is not intended.]</p>
<p>Indeed, isolates appear to be more <em>passive</em> than <em>active</em> objects --
as far as I can tell, their life spans are determined, not explicitly
and from within like Erlang processes' life spans, but implicitly and
from without, by reachability, like OOP objects.</p>
<p>And that difference is a significant one:
it breaks two of the three “pillars of sound state”: an isolate can surprise itself by handling an unrelated event when it perhaps shouldn't, and an event handler crashing can leave the isolate's state inconsistent without any consequences for the isolate's lifecycle.</p>
<p>Basically, what this means is that the invariants of an isolate's state must be reestablished by all of its event handlers, always, regardless of whether and how they terminate abnormally. This may just be par for the course in the eyes of many programmers, of course, but from an Erlang developer's perspective it's very much a “close but no cigar” situation.</p>
<p>The <a href="http://www.infoq.com/presentations/Death-by-Accidental-Complexity">state-space explosion</a> that commonly happens when a state machine must be able to handle every kind of events in every state is exactly one of the things that drove the development of Erlang in the first place (and led to the “selective receive” feature).</p>
<h2>Case in point: Synchronous calls</h2>
<p>An example of a common pattern where this matters is as follows:
One actor needs, during the processing of a message, to contact
another actor for information necessary to complete the processing.
This is a situation that commonly occurs in Erlang programs, and
in Erlang, the typical solution would be to make a synchronous call to
the other actor: send a request, then wait for a response before
continuing execution. (In most cases, a timeout would be set as well,
to ensure that execution will in fact continue.)</p>
<p>This send-request, wait-for-response operation pair can be put
together in a function, so that the invocation of another actor looks
just like a normal function invocation — the communication can be encapsulated; in fact, the send/receive code is rarely actually spelled out, because it is already provided by the standard library.</p>
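<p>In Java terms (an analogy only, not Erlang's or Dart's actual API), the encapsulated synchronous call can be sketched like this: the request carries its own reply channel, and the caller blocks on that channel with a timeout:</p>

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class SyncCall {
    // A request message carrying the channel on which to reply.
    record Request(int arg, BlockingQueue<Integer> replyTo) {}

    public static void main(String[] args) throws Exception {
        BlockingQueue<Request> inbox = new LinkedBlockingQueue<>();

        // The "other actor": serves requests from its inbox, one at a time.
        Thread server = new Thread(() -> {
            try {
                while (true) {
                    Request r = inbox.take();
                    r.replyTo().put(r.arg() * 2);
                }
            } catch (InterruptedException e) { /* shut down */ }
        });
        server.setDaemon(true);
        server.start();

        // The caller's side, wrapped up: send request, wait for response.
        // From here on out it looks and behaves like a plain function call.
        BlockingQueue<Integer> reply = new ArrayBlockingQueue<>(1);
        inbox.put(new Request(21, reply));
        Integer result = reply.poll(5, TimeUnit.SECONDS); // timeout, as in Erlang
        System.out.println(result);
    }
}
```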
<p>In Dart, the situation is somewhat different: to do a synchronous call
to another actor, you'd create a Promise (a one-shot message queue), and send the
send-end + the request to the actor in question. That actor would then
put the result to the Promise.
But the caller cannot just explicitly wait for the response; instead, it must register a response handler on the Promise. The handler would typically be a closure which carries whatever state is needed to complete the processing.</p>
<p>All of these operations complete right away — execution continues immediately.
This means that the remote call cannot be encapsulated and treated like a normal function call; modularity suffers.
In Erlang, it is easy to change whether, how and to which other actor to make asynchronous calls — it's a local change, completely transparent to the call site; in Dart it looks like it wouldn't be so simple a matter.</p>
<p>If you were to approach the Erlang way in Dart, you'd have to write in continuation passing style (CPS), so that the call would always happen on a nearly empty stack — so that returning from a function means returning from the event handler; this means that the current constraints — that event handling always begins and ends with an empty stack — wouldn't matter, because the real stack would be in the continuation.
<br/>
When using CPS, calling another actor could indeed be made to look just like another function call.</p>
<p>Except that it wouldn't <em>behave</em> like an ordinary function call, because other incoming messages could be processed between the call request and the call response — leading to the problems mentioned above with self-interference.</p>
<p>The funny thing about this CPS-based approach, by the way, is that this
attempt to buy back the between-events stack that is necessary for
encapsulation of calls, essentially <em>overdoes</em> it:
what you get is not one call stack, but a number of simultaneous call
stacks (disguised as response handler continuations), all ready to continue
execution as responses become ready. A multitude of threads of execution which can trip each other up and cause all sorts of nondeterminism-induced problems.</p>
<p>If the intention of the single-threaded isolates is to reduce the
complexity of developing concurrent systems, then this would not be a
particularly good outcome.
<br/>
No stack is too little; a multitude of stacks is too many. One is the
number we want, the one we can reason about.</p>
<h2>Please animate the isolates!</h2>
<p>The above describes the present situation, as I read the Dart specification (which I must admit I haven't done end-to-end; I've mainly been focusing on the isolate- and exception-related parts).</p>
<p>Since the Dart language is still in development, and Google is requesting input to the development process, these things may of course change.</p>
<p>My input to the process, then, is this:
<br/></p>
<blockquote>
<p>Give the isolates an explicit life cycle — just isolating the heap, thus ensuring sequential heap access, is a good first step, but to fully simplify matters and keep invariants local, an isolate must be in control of its own lifecycle, including which types of events it will be ready to process at any given moment.
<br/>
In short, give them liberty <em>and</em> give them death! (to paraphrase Patrick Henry rather freely).</p>
</blockquote>Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com2tag:blogger.com,1999:blog-2879132569917319939.post-41292113738443957902011-10-10T00:40:00.000+02:002011-10-10T00:40:28.117+02:00Dynamic Selective Receive — an Erlang hack<p>Erlang is a language which is dynamic in many aspects. <br />
One of the things that are resolved statically, however, is pattern matching; the decision trees used for branching in function clauses, case statements, and receive statements are constructed at compile time.</p>
<p>For instance, a set of patterns like</p>
<pre style="margin: 0em 2em; line-height: 1.2; background: #DDFFDD; border: thin solid #99AA99"><code>[X] when X==0 orelse X==100
[H|T]
[]
</code></pre>
<p>is translated by the Erlang compiler into the following decision tree:</p>
<pre style="margin: 0em 2em; line-height: 1.2; background: #DDFFDD; border: thin solid #99AA99"><code>is_nonempty_list(B)?
+-Y-> [H|T] = B
| is_nil(T)?
| +-Y-> H == 0?
| | +-Y-> case 1
| | | ^
| | | |
| | | Y
| | | |
| | +-N-> H == 100?
| | |
| | N
| | |
| | v
| +---------> case 2
|
+-N-> is_nil(B)?
+-Y-> case 3
+-N-> No match.
</code></pre>
<p>The Erlang compiler is quite good at constructing such decision trees, which is good, because it's critical to the performance of typical Erlang programs. <br />
(Actually, if the compiler was poor at that job, “typical Erlang programs” would probably often look different, because developers would be more inclined to write their own decision trees. Compilers form developers, to some extent.)</p>
<p>If you want to construct a predicate dynamically, you can do that, using function composition or an interpreter function — that is, for example, how the Eshell is implemented.
For instance, the first pattern above can be described as</p>
<pre style="margin: 0em 2em; line-height: 1.2; background: #DDFFDD; border: thin solid #99AA99"><code>{is_pair,
{'or', {is_eq, 0}, {is_eq, 100}},
is_nil}
</code></pre>
<p>which can be interpreted by a predicate evaluator like:</p>
<pre style="margin: 0em 2em; line-height: 1.2; background: #DDFFDD; border: thin solid #99AA99"><code>eval(is_nil, []) -> true;
eval({is_pair, HP, TP}, [H|T]) -> eval(HP,H) andalso eval(TP,T);
eval({is_eq, V}, V) -> true;
eval({'or', P1,P2}, V) -> eval(P1,V) orelse eval(P2,V);
eval(P, V) when is_function(P,1) -> P(V);
eval(_, _) -> false.
</code></pre>
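<p>To see the interpreter in action, here is a minimal, self-contained sketch: the <code>eval/2</code> clauses from above wrapped in a module (the module name <code>pred_demo</code> and the <code>demo/0</code> self-test are my own additions), checking the first pattern against a few inputs:</p>

```erlang
%% File "pred_demo.erl" (hypothetical): the predicate evaluator from
%% above, plus a small self-test of the first pattern.
-module(pred_demo).
-export([eval/2, demo/0]).

eval(is_nil, []) -> true;
eval({is_pair, HP, TP}, [H|T]) -> eval(HP,H) andalso eval(TP,T);
eval({is_eq, V}, V) -> true;
eval({'or', P1,P2}, V) -> eval(P1,V) orelse eval(P2,V);
eval(P, V) when is_function(P,1) -> P(V);
eval(_, _) -> false.

demo() ->
    P = {is_pair, {'or', {is_eq, 0}, {is_eq, 100}}, is_nil},
    true  = eval(P, [0]),      % matches [X] when X==0 orelse X==100
    true  = eval(P, [100]),
    false = eval(P, [1]),      % wrong head value
    false = eval(P, [1, 2]),   % tail is not nil
    ok.
```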
<p>However, you can't do a <em>selective receive</em> using such a predicate, <em>because</em> one of those statically-constructed decision trees is always used for selecting which message to extract from the inbox; and once it's taken out of the inbox it cannot be put back again. Only after a message is extracted from the inbox can more general code be applied to it.</p>
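<p>For contrast, here is what the static variant looks like — a sketch with a hypothetical module name. The pattern in the <code>receive</code> is baked into the compiled decision tree; a fun can <em>test</em> a message, but there is no way to hand it to <code>receive</code> for selection:</p>

```erlang
%% File "static_recv.erl" (hypothetical sketch).
-module(static_recv).
-export([wait_for_ping/1]).

%% The pattern {ping, X} is fixed at compile time; the compiler turns it
%% into a decision tree that is applied to each mailbox message in turn.
wait_for_ping(Timeout) ->
    receive
        {ping, X} -> {pong, X}   % only messages of this shape are extracted
    after Timeout ->
        timeout
    end.
```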
<p>When is this a problem? <br />
Well, the Erlang shell is one example; here, dynamic selective receive will have to be faked: buffering is needed, and in fact the timeout mechanism of the Erlang <code>receive</code> construct is (last I checked, which is a few years ago) emulated only incompletely. <small>I know that I filed a bug report about that issue, but I can't find it now.</small></p>
<p>Also, it's an issue that may sometimes arise when you're trying to combine two concerns in one process, and each of the concerns communicates with other processes; in such cases, it would sometimes be convenient to have parameterizable selective receive.</p>
<h2>Solutions</h2>
<p>For this problem, I've found a solution which looks promising.</p>
<p>The solution is sufficiently cheaty, however, that I should probably first mention a couple of more “in the spirit” solutions; all three solutions can be considered to be cheating, but at least I can offer different flavours of unruliness.</p>
<ol>
<li><p>Construct, dynamically, a module which implements a function which does the selective receive.
Constructing and compiling modules dynamically is relatively easy in Erlang. Calling a function in a runtime-specified module is just everyday Erlang; it's Erlang's primary kind of polymorphism.</p>
<p>You wouldn't want to do this module construction business for something that needs to be done a million times, but it's probably a good exercise.</p></li>
<li><p>Take as a parameter a predicate function which is to be used for selecting the message.
Access the process's inbox through the backdoor: <code>process_info(self(), messages)</code>, and apply the predicate to each message in turn until an acceptable one is found (or the end of the list is reached, in which case you wait a bit and retry). Now you know exactly which message you want to extract, and that's straightforward to do (the decision tree for that is static).</p>
<p>I can only recommend this approach if you don't care about performance or ugly hacks. Also, there's no really good way of handling the no-such-message-yet case, as far as I can see.</p></li>
<li><p>Take advantage of the fact that what is possible in Beam — the VM's instruction set — is a superset of what's possible in Erlang. In particular with respect to the receive construct. This is the solution that I will describe below.</p></li>
</ol>
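<p>As a concrete illustration of option 2, here is a minimal sketch (module and function names are my own invention). Note that it polls, which is exactly the lack of a good no-such-message-yet mechanism mentioned above:</p>

```erlang
%% File "peek_recv.erl" (hypothetical sketch of option 2 — not
%% recommended; see the caveats above).
-module(peek_recv).
-export([pred_receive/2]).

pred_receive(_Pred, Timeout) when Timeout =< 0 ->
    timeout;
pred_receive(Pred, Timeout) ->
    %% Peek at the inbox through the backdoor:
    {messages, Msgs} = process_info(self(), messages),
    case [M || M <- Msgs, Pred(M)] of
        [Msg|_] ->
            %% Msg is already bound, so this receive matches exactly
            %% that term — a static decision tree suffices:
            receive Msg -> {ok, Msg} end;
        [] ->
            timer:sleep(50),   % poll-and-retry: the ugly part
            pred_receive(Pred, Timeout - 50)
    end.
```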
<p>The following module is a quite general solution which uses a <a href="http://www.erlang.org/doc/apps/erts/match_spec.html">matchspec</a> as a predicate. It is written in Erlang assembly, because at this level this hack becomes possible:</p>
<!-- -->
<pre style="margin: 0em 2em; line-height: 1.2; background: #DDFFDD; border: thin solid #99AA99"><code>%% File: dyn_sel_recv.S
{module, dyn_sel_recv}. %% version = 0
{exports, [{match_spec_receive,2}]}.
{labels, 5}.
{function,match_spec_receive,2,2}.
{label,1}.
{func_info,{atom,match_spec_magic},{atom,match_spec_receive},2}.
{label,2}.
{allocate,2,2}.
{move,{x,1},{y,0}}.
{move,{x,0},{y,1}}.
{label,3}.
{loop_rec,{f,5},{x,0}}.
%% x0 = Msg, y0 = Timeout, y1 = CMS
{test_heap,2,2}.
{put_list,{x,0},nil,{x,0}}.
%% x0 = [Msg]
{move,{y,1},{x,1}}.
%% x1 = CMS
{call_ext,2,{extfunc,ets,match_spec_run,2}}.
%% x0 = [] | [Res]
{test,is_nonempty_list,{f,4},[{x,0}]}.
{get_list,{x,0},{x,1},{x,2}}.
{move,{x,1},{x,0}}.
%% x0 = Res
remove_message.
{deallocate,2}.
return.
{label,4}.
{loop_rec_end,{f,3}}.
{label,5}.
{wait_timeout,{f,3},{y,0}}.
timeout.
{move,{atom,timeout},{x,0}}.
{deallocate,2}.
return.
</code></pre>
<p>To test it, we have this module:</p>
<pre style="margin: 0em 2em; line-height: 1.2; background: #DDFFDD; border: thin solid #99AA99"><code>%% File: dyn_recv_test.erl
-module(dyn_recv_test).
-include_lib("stdlib/include/ms_transform.hrl").
-export([test/0]).
test() ->
    %% The pattern (and what it should result in):
    MS = ets:fun2ms(fun({ping, X}) -> {pong,X} end),
    %% Compile the pattern:
    CMS = ets:match_spec_compile(MS),
    %% Perform a selective receive based on the pattern:
    dyn_sel_recv:match_spec_receive(CMS, 1000).
</code></pre>
<p>A sample interaction, with comments marked with "#" and "%":</p>
<pre style="margin: 0em 2em; line-height: 1.2; background: #DDFFDD; border: thin solid #99AA99"><code># Compile the code:
$ erlc dyn_recv_test.erl dyn_sel_recv.S
# Then start an Erlang shell:
$ erl
Erlang R14B (erts-5.8.1) [source] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.8.1 (abort with ^G)
1> dyn_recv_test:test(). % No appropriate messages - times out
timeout
% Fill the inbox with test data:
2> self() ! 'before'.
before
3> self() ! {ping, 144}.
{ping,144}
4> self() ! {ping, 1234}.
{ping,1234}
5> self() ! 'after'.
'after'
6> process_info(self(), messages). % There are four messages in the inbox now
{messages,[before,{ping,144},{ping,1234},'after']}
7> dyn_recv_test:test(). % Receive first ping message
{pong,144}
8> dyn_recv_test:test(). % Receive second ping message
{pong,1234}
9> dyn_recv_test:test(). % Times out again - no more ping messages
timeout
10> process_info(self(), messages). % Now, only the non-pings are left.
{messages,[before,'after']}
</code></pre>
<h2>Discussion</h2>
<p>Why restrict ourselves to matchspecs?</p>
<p>In principle you could instead construct a receive function which accepts a general predicate function, but that would be risky — while I'm reasonably confident that the matchspec hack works (no guarantees though!), I'm fairly certain that the Erlang VM would take offense if such a predicate were to e.g. do any receiving of its own, or throw exceptions.</p>
<p>Matchspecs, on the other hand, can do just that which is possible to do in a regular pattern matching construct in Erlang — in particular, no side effects are allowed.</p>
<p>And matchspecs can be preprocessed (“compiled”) into a form which is interpreted efficiently; yet their source form can be constructed with relative ease at runtime.
Finally, matchspecs have the nice thing about them that even though they are represented as ordinary Erlang terms, and as such may be a somewhat hard to read (and write) if they are complex, there is a parse-tree transformation which makes it possible to write them just as ordinary Erlang patterns and conditions — this is demonstrated in the above.</p>
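<p>Outside of any receive, the matchspec machinery can be exercised directly, which also shows the <code>fun2ms</code> parse transform in use. A minimal sketch (the module name is my own):</p>

```erlang
%% File "ms_demo.erl" (hypothetical): running a compiled matchspec
%% over plain terms, no mailbox involved.
-module(ms_demo).
-include_lib("stdlib/include/ms_transform.hrl").
-export([run/0]).

run() ->
    %% Written as an ordinary Erlang pattern + body, thanks to fun2ms:
    MS  = ets:fun2ms(fun({ping, X}) -> {pong, X} end),
    CMS = ets:match_spec_compile(MS),
    %% match_spec_run/2 applies the spec to each element of a list,
    %% returning the transformed results of the matching ones:
    [{pong, 7}] = ets:match_spec_run([{ping, 7}], CMS),
    []          = ets:match_spec_run(['before'], CMS),
    ok.
```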
<p>Now, I haven't yet had the chance to use this hack in a real context, but here it is.
If nothing else, I suppose it demonstrates that it is quite possible to allow matchspecs in Erlang's pattern-matching constructs — at least that there does not seem to be any VM-level reason against it. I think that might be an interesting language extension; then again, there are plenty of other language extension discussions, and this suggestion would probably not get high priority.</p>
<h1>An Erlang-Java Interop Demo</h1>
<p>Erlang is designed and used for distributed systems.
As such, it is quite good at talking to itself.
And as I'll show in the following, it is — at least in some cases — reasonably good at talking to other languages as well.</p>
<h2>When would you want to do this?</h2>
<p>(You can skip this section if you know.)</p>
<p>Interoperability between languages is useful when there's value to be had in the software at both ends. You might prefer to keep some functionality in Erlang because it's already written in that language; because it has better libraries for the task; or simply because the language supports that task better. The same goes for the other language: there are C libraries around for almost anything (and plenty of C/C++ legacy code), and for GUIs you might prefer e.g. Java.</p>
<p>Or you have the same functionality implemented in both languages, and wish to do comparison tests on the two implementations.</p>
<p>Or the architecture is inherently distributed — e.g. a client-server configuration — so that there is no advantage to be had in building the two parts using the same language anyway.</p>
<p>In the following demonstration, we have a Java side with some freshly written software with interesting bugs, and an Erlang side with an interesting tool for discovering bugs and a nice-ish language for specifying tests.</p>
<h2>The demo</h2>
<p>What I intend to do:</p>
<ul>
<li>Define an interface;</li>
<li>Write a Java function;</li>
<li>Make it accessible from Erlang;</li>
<li>Test it using Erlang.</li>
</ul>
<p>Sounds like a lot of work? It needn't be.
Not for you, anyway; most of the heavy lifting has already been done.</p>
<p>(If you don't care about exposition, and are just interested in the technical end result, here is the <a href="https://gist.github.com/1147842">executable summary</a>.)</p>
<h2>An interface</h2>
<p>First, we'll need to define an interface.
For this, we use the <a href="http://en.wikipedia.org/wiki/IDL_specification_language">IDL</a> — Interface Description Language:</p>
<pre><code>// File "foo.idl"
interface Foo {
string quuxinate(in string s);
};
</code></pre>
<p>We translate this into Java using the Erlang compiler:</p>
<pre><code>erlc '+{be,java}' foo.idl
</code></pre>
<p>This results in a Java interface in <code>Foo.java</code>, as well as some stub code we'll be using.</p>
<h2>A function in Java</h2>
<p>For an implementation, let's say that <code>quuxinate()</code> is to reverse the string.
To add a twist, let's say that it breaks down under some rare, complex-ish circumstances, such as when there's a letter which occurs thrice in the input string — that'll be a bug we can find later, when we start testing it:</p>
<pre><code>// File "FooImpl.java"
public class FooImpl extends _FooImplBase /* which implements Foo */ {
public String quuxinate(String s) {
int[] stats = new int[256];
for (int i=0; i<s.length(); i++) {
char c = s.charAt(i);
if (c<255 && ++stats[c] >= 3) throw new RuntimeException("WTF");
}
return new StringBuilder(s).reverse().toString();
}
}
</code></pre>
<h2>Making the function accessible from Erlang</h2>
<p>You may have noticed that we don't implement Foo directly, but rather extend a stub class which implements it.
This means that the class already knows how to talk to Erlang — specifically, it knows how to handle requests like a <code>gen_server</code>.</p>
<p>What is left is to establish connection to an Erlang node.
There are two frameworks for this: CORBA and Erlang inter-node communication.
CORBA is, I believe, the heavy-weight option, and I haven't worked with it; in the following, we'll use Erlang inter-node communication.</p>
<p>A running Erlang program (OS-level process) is often referred to as a 'node', because Erlang is designed for having multiple such programs be connected, in what is known as a 'cluster' of nodes.</p>
<p>It's not an exclusive club, either: nodes can be, and usually are, Erlang nodes, but other kinds can enter the mix — a C or Java node, for instance.</p>
<p>A bit of code is needed to run a Java program as a node; don't worry though, this is the only lengthy bit, and it can be generalized so that it doesn't have to refer to <code>Foo</code>:</p>
<pre><code>// File "FooServer.java"
public class FooServer {
// The following is based on the program in lib/ic/examples/java-client-server/server.java
static java.lang.String snode = "javaserver";
static java.lang.String cookie = "xyz";
public static void main(String[] args) throws java.io.IOException, com.ericsson.otp.erlang.OtpAuthException {
com.ericsson.otp.erlang.OtpServer self = new com.ericsson.otp.erlang.OtpServer(snode, cookie);
System.err.print("Registering with EPMD...");
boolean res = self.publishPort();
if (!res) throw new RuntimeException("Node name was already taken.");
System.err.println("done");
do {
try {
com.ericsson.otp.erlang.OtpConnection connection = self.accept();
System.err.println("Incoming connection.");
try {
handleConnection(connection);
} catch (Exception e) {
System.err.println("Server terminated: "+e);
} finally {
connection.close();
System.err.println("Connection terminated.");
}
} catch (Exception e) {
System.err.println("Error accepting connection: "+e);
}
} while (true);
}
static void handleConnection(com.ericsson.otp.erlang.OtpConnection connection) throws Exception {
while (connection.isConnected()) {
FooImpl srv = new FooImpl();
com.ericsson.otp.erlang.OtpInputStream request= connection.receiveBuf();
try {
com.ericsson.otp.erlang.OtpOutputStream reply = srv.invoke(request);
if (reply != null) {
connection.sendBuf(srv.__getCallerPid(), reply);
}
} catch (Exception e) {
System.err.println("Server exception: "+e);
e.printStackTrace(System.err);
handleException(e, connection, null);
}
}
}
static void handleException(Exception e, com.ericsson.otp.erlang.OtpConnection connection, com.ericsson.otp.ic.Environment env) throws Exception {
// We'll improve on this later...
throw e;
}
}
</code></pre>
<p>Time to build and try out:</p>
<pre><code>ERL_ROOT=`erl -noshell -eval 'io:format("~s\n", [code:root_dir()]), init:stop().'` # Or simply where Erlang is.
IC_JAR=`ls -1 $ERL_ROOT/lib/ic-*/priv/ic.jar`
JI_JAR=`ls -1 $ERL_ROOT/lib/jinterface-*/priv/OtpErlang.jar`
CLASSPATH=".:$IC_JAR:$JI_JAR"
javac -classpath "$CLASSPATH" *.java
epmd # Start Erlang port mapper daemon if it isn't already.
java -classpath "$CLASSPATH" FooServer
</code></pre>
<p>The Java node should now be running and registered as <code>javaserver</code> — ready to accept connections with the right cookie.</p>
<p>Let's test that:</p>
<pre><code>erl -sname tester -setcookie xyz
> {ok,Host}=inet:gethostname().
> JavaServer = {dummy, list_to_atom("javaserver@"++Host)}.
> gen_server:call(JavaServer, {quuxinate, "Testing, 1-2-3"}).
</code></pre>
<p>The reply should be the reverse string:</p>
<pre><code>"3-2-1 ,gnitseT"
</code></pre>
<p>And the bug we put there is working too:</p>
<pre><code>> gen_server:call(JavaServer, {quuxinate, "Testing, 1 2 3"}).
** exception exit: {{nodedown,javaserver@flitwick},
{gen_server,call,
[{dummy,javaserver@flitwick},
{quuxinate,"Testing, 1 2 3"}]}}
in function gen_server:call/2
</code></pre>
<h2>Property testing of Java code</h2>
<p>Now then, what can we do with this setup?</p>
<p>One interesting thing that we can do is to apply one of the <a href="http://www.protest-project.eu/">property-based testing</a> tools to our Java function.
In the following, I'll be using <a href="http://www.javalimit.com/2010/05/triq-the-free-quickcheck-for-erlang.html">Triq</a>, but <a href="http://quviq.com/">Quviq QuickCheck</a> or <a href="http://proper.softlab.ntua.gr/">PropEr</a> could be substituted with only minor changes.
First, assuming that you haven't got Triq, but have <code>git</code>:</p>
<pre><code>git clone git://github.com/krestenkrab/triq.git
(cd triq && ./rebar compile)
</code></pre>
<p>Then we are ready to write our test — namely, that given any (ASCII) string, <code>quuxinate()</code> should return the reverse string:</p>
<pre><code>%% File "test.erl"
-module(test).
-include_lib("triq/include/triq.hrl").
-export([main/0]).

prop_reverse(JavaServer) -> % The property
    ?FORALL(S, ascii_string(),
            gen_server:call(JavaServer, {quuxinate, S})
            == lists:reverse(S)).

ascii_string() -> % A data generator
    list(choose(0,127)).

main() ->
    {ok,Host} = inet:gethostname(),
    JavaServer = {dummy, list_to_atom("javaserver@"++Host)},
    triq:check(prop_reverse(JavaServer), 100), % Do the magic
    init:stop(). % Shut down cleanly
</code></pre>
<p>Compile and run the test:</p>
<pre><code>erlc -I triq/include -pa triq/ebin test.erl
erl -noshell -sname tester -setcookie xyz -pa triq/ebin -run test main
</code></pre>
<p>Triq will now generate a hundred random test cases, and verify the property for each case.
After a dozen such tests, the bug is triggered:</p>
<pre><code>...........Failed with: {exit,
{{nodedown,javaserver@flitwick},
{gen_server,call,
[{dummy,javaserver@flitwick},
{quuxinate,
[79,84,75,110,3,42,73,14,1,53,76,42,126,40,118,122,
74,2,58,34,42,98]}]}},
[{gen_server,call,2},
{test,'-prop_reverse/1-fun-0-',2},
{triq,check_input,4},
{triq,check_forall,6},
{triq,check,3},
{test,main,0},
{init,start_it,1},
{init,start_em,1}]}
Failed after 12 tests with {'EXIT',
{{nodedown,javaserver@flitwick},
{gen_server,call,
[{dummy,javaserver@flitwick},
{quuxinate,
[79,84,75,110,3,42,73,14,1,53,76,42,126,40,
118,122,74,2,58,34,42,98]}]}}}
</code></pre>
<p>The 22-character string has three <code>'*'</code>s (ASCII value 42) in it, which was what triggered the bug.</p>
<p>Triq then proceeds to simplify the test case:</p>
<pre><code>Simplified:
S = [33,33,33]
</code></pre>
<p>concluding that the string <code>"!!!"</code> is a locally-minimal failing test case.</p>
<p>Quite useful, isn't it? — and the amount of non-reusable code has been quite manageable (3 lines of IDL, 14 lines of implementation, 14 lines of test code, 1 line of Foo-specific code in <code>FooServer</code>).
<br>(Speaking of which: you're free to use these snippets as you see fit; provided as-is and with no guarantees of anything, of course.)</p>
<p>I should mention at this point that property testing tools exist within the Java world as well — there exists at least one for Scala.
I didn't find it particularly satisfying to use, though I may have been unlucky; in any case, this is an alternative.</p>
<h2>Erlang, IDL and exceptions</h2>
<p>The Guide (that is, the Erlang IC (IDL compiler) User Guide) has <a href="http://www.erlang.org/doc/apps/ic/ch_java.html#id75463">this</a> to say about handling of Java exceptions:</p>
<pre><code>While exception mapping is not implemented, the stubs will generate some Java exceptions in case of operation failure. No exceptions are propagated through the communication.
</code></pre>
<p>Which means that, out of the box, these are our options for handling Java-side exceptions:</p>
<ol>
<li>Convert exceptions into values within <code>quuxinate()</code>.</li>
<li>Return no result on exceptions — in which case the Erlang-side call will time out (after, as a default, 5 seconds).</li>
<li>Close the connection — in which case the Erlang-side call will receive an error result immediately.</li>
</ol>
<p>None of these options are especially satisfying.</p>
<p>So let's look at how to do this:</p>
<p><ol start="4"><li>Report exceptions to the caller as an <code>{error, Reason}</code> reply.</li></ol></p>
<p>It's not too difficult. What we need to do (according to the <code>gen_server</code> call protocol) is send a <code>{RequestRef, Reply}</code> tuple to the caller, where <code>RequestRef</code> is a reference which was included in the request and must be included in the response as well.</p>
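<p>In Erlang terms, the caller's side of that protocol looks roughly like this — a simplified sketch of my own (the real OTP <code>gen_server:call</code> also sets up a monitor, so that a dead server is detected immediately instead of timing out):</p>

```erlang
%% Hypothetical sketch of the calling side of the gen_server call
%% protocol, to show where the request reference fits in.
-module(naive_call).
-export([call/3]).

call(ServerRef, Request, Timeout) ->
    Ref = make_ref(),
    %% The request carries the caller's PID and a unique reference...
    ServerRef ! {'$gen_call', {self(), Ref}, Request},
    receive
        %% ...and the reply must echo that same reference back:
        {Ref, Reply} -> Reply
    after Timeout ->
        exit(timeout)
    end.
```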
<p>The main difficulty is one of access: at the point of error handling, we will need access to (1) the caller's PID and (2) the request reference.
I'd like to keep <code>FooImpl</code> and the connection-managing <code>FooServer</code> separate, so we need to add an accessor:</p>
<pre><code>// File "FooImpl.java"
public class FooImpl extends _FooImplBase /* which implements Foo */ {
...
/** The request environment is exposed for error handling reasons. */
public com.ericsson.otp.ic.Environment getEnv() {
return _env;
}
}
</code></pre>
<p>That's it. We could instead have added one accessor for the caller PID and one for the request reference, but let's keep the complexity in the server class.</p>
<p>In the server class, instead of throwing an exception which causes the connection to be terminated, we will build and send an error reply:</p>
<pre><code>// File "FooServer.java"
public class FooServer {
...
// in handleConnection(), replace
// handleException(e, connection, null);
// with:
handleException(e, connection, srv.getEnv());
...
// and replace handleException() with:
static void handleException(Exception e, com.ericsson.otp.erlang.OtpConnection connection, com.ericsson.otp.ic.Environment env) throws Exception {
// Write exception reply:
com.ericsson.otp.erlang.OtpOutputStream err_reply = new com.ericsson.otp.erlang.OtpOutputStream();
err_reply.write_tuple_head(2);
err_reply.write_any(env.getSref());
err_reply.write_tuple_head(2); // Construct return value {error, ErrorText}
err_reply.write_atom("error");
err_reply.write_string(e.toString());
connection.sendBuf(env.getScaller(), err_reply);
}
}
</code></pre>
<p>There; that's all.</p>
<p>If we test it again, we get a nicer behaviour:</p>
<pre><code>> gen_server:call(JavaServer, {quuxinate, "Testing, 1 2 3"}).
{error,"java.lang.RuntimeException: WTF"}
</code></pre>
<h2>Disclaimer</h2>
<p>I ought to mention that I have only just discovered this interoperability option (two days ago, in fact, when I stumbled upon <a href="http://gupea.ub.gu.se/bitstream/2077/4579/1/ISOFM_ITU870_V07_Pablo,Alvarez,Yanez%3BAutomated%20Testing%20of%20Java%20ABNF%20Grammar%20Parsers%20with%20QuickCheck.pdf">this article</a>); it is only a few months ago that I wrote a custom socket server test program in Java, and corresponding driver code in Erlang, just to achieve what could be had far more easily using Erlang's IDL compiler.
It may therefore not be as easy to use in practice as the above makes it seem.
(For one thing, the only type that I've used is <code>string</code>.)</p>
<p>Even so, I hope to have inspired others to try this out.
<br>One way to do so is to have a look at the <a href="https://gist.github.com/1147842">executable summary</a> referred to earlier; it is the demo as a shell script.</p>
<p>Happy inter-language hacking!</p>
<hr width="29%"/>
<h2>Further technical notes (and speculation)</h2>
<p>The IDL compiler supports generating code for Java, C and Erlang.
Here is its <a href="http://www.erlang.org/doc/apps/ic/users_guide.html">documentation</a>.</p>
<p>If you're writing Erlang port programs (like an Erlang driver, but running in a separate process), but have an interface which is evolving quickly enough, or you yourself are lazy enough, that writing and maintaining the serialization and deserialization code has quite lost its attraction, then it might be tempting to look into using IDL for that interface, and have the boring bits code-generated for you.</p>
<p>The IDL compiler and the code it generates do not seem to have been designed for this, but as far as I can tell it appears to be achievable.</p>
<!-- Tags: Erlang, Java, IDL, property-based testing, QuickCheck, Triq -->
<h1>Concurrent Design as a Matter of Cause</h1>
I've written <a href="http://polymorphictypist.blogspot.com/2011/05/on-concurrency-issues.html">earlier</a> about designing for concurrency in the small.<br />
But even the best and most flawless bricks can be put together in far
more meaningless ways than meaningful ways.<br />
In the following, I'll consider concurrent design on a larger scale,
and introduce a tool which may be useful to ensure soundness in a
design.<br />
<br />
What is concurrent design about? (Performance aside — we'll focus on
soundness for the moment.) What are the basic units for reasoning about
concurrent, possibly distributed systems?<br />
Mutexes, messages, transactions — these are among the building blocks.
But the connecting mortar is <span style="font-style: italic;">causality
chains</span>. When a database client, for instance, submits data to a
database server, it can rely on the data to be persisted only if there
is a causality chain leading from the moment the client receives the
"success"-response, back to when the database sent it, and further back
to when the database server's hard drive physically wrote the last
block of the transaction.<br />
If there is no causality chain, then no timing assumptions can be made.
Which is why, in a given concurrent design, it is prudent to ensure
that the necessary causality chains are present.<br />
<h3>
Example: the "update take-over" pitfall</h3>
This is a concurrency design pitfall which I learned about a few years
ago. I'd forgotten all about its non-obviousness, until it came up
recently in a design discussion. This incident suggested to me that the
problem might be sufficiently non-trivial for there to be a lesson to
pass on. And I will do that — describe the problem, the naïve-but-wrong
solution, and some correct solutions — but do so a bit more elaborately
than I originally planned, because the focus will not be on the
solutions themselves, but rather on the process: a method to arrive at
them.<br />
<br />
<div style="float: right;">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj80TanTV1Vcf3ZtZZLS42t2o8hfhPwkxGdu_JDHfGT2ZAdkWBGOIDilpEgZC9b_654wrqKdbanmzyrjf0dBecuHk4W6yDjXNId9csrSFlYhwuXYzhcbE6iXO5uvK98rW1XP788Bz4I2Q/s1600/SnapshotAndUpdates-UML.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj80TanTV1Vcf3ZtZZLS42t2o8hfhPwkxGdu_JDHfGT2ZAdkWBGOIDilpEgZC9b_654wrqKdbanmzyrjf0dBecuHk4W6yDjXNId9csrSFlYhwuXYzhcbE6iXO5uvK98rW1XP788Bz4I2Q/s320/SnapshotAndUpdates-UML.png" width="282" /></a></div>
</div>
<span style="font-weight: bold;">The scenario:</span> A client (C) must
keep track of some object's state. It does
so by subscribing to changes (updates). However, there are two update
sources, and it is desirable to change over from one (A) to the other
(B) from some point on. That is, before the take-over, A provides all
updates; after the take-over, B provides all the updates.<br />
A common variation of the pattern is that A is providing snapshots of
the object's state through a synchronous request-response protocol,
rather than providing updates through a subscribe-publish mechanism.<br />
For simplicity, this variation is the scenario I'll be focusing on in
the following; it is depicted as UML to the right.<br />
<br />
<span style="font-weight: bold;">The naïve solution:</span> Simply set
up the subscription to B before getting the snapshot from A.<br />
<br />
<span style="font-weight: bold;">The threat:</span> An update is missed
because it "falls into the crack"
caused by the take-over — it is processed by A too late to be part of
the snapshot, but is processed by B too soon, before the subscription
is set up.<br />
The core of the issue is that when there are two independent message
paths, we cannot assume anything about their relative timing; even
though they have a common source, events in one path may overtake
events in the other path.<br />
<br />
<span style="font-weight: bold;">Analysis:</span><br />
What does it take to get this setup to work as expected?<br />
<blockquote style="font-style: italic;">
Any update must reach the
client, either through A (the snapshot) or B (the following updates).</blockquote>
Or, from a causality view:<br />
<blockquote style="font-style: italic;">
There must be a causality chain
ruling out
"update is processed by A after the snapshot is made, but is processed
by B before the subscription is set up".</blockquote>
A causality chain is a series of <span style="font-style: italic;">causality
links</span>, which are
defined as follows:<br />
<h3>
Causality rules</h3>
<ol>
<li>There is a causality link from X to Y if X happens before Y and
they both happen in the <span style="font-style: italic;">same thread</span>.</li>
<li>There is a causality link from the sending of a message to its reception.</li>
<li>Two message receptions may be causally linked if the transport
layer guarantees that one is delivered before the other.<br />
For instance, TCP guarantees that messages <span style="font-style: italic;">within the same connection</span> are
delivered in order, i.e. that if X is sent before Y (and in the same
direction), then X is delivered/received before Y.<br />
Similarly, in Erlang, message delivery order is <a href="http://www.erlang.org/faq/academic.html#id56948">guaranteed</a> for any
sender/receiver pair.<br />
</li>
</ol>
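<p>Rule 3's Erlang flavour can be demonstrated in a few lines — a minimal sketch of my own (the module name is hypothetical) in which the final assertion is guaranteed to hold, because both messages travel between the same sender/receiver pair:</p>

```erlang
%% File "ordering_demo.erl" (hypothetical illustration of rule 3).
-module(ordering_demo).
-export([run/0]).

run() ->
    Self = self(),
    spawn(fun() -> Self ! first, Self ! second end),
    A = receive M1 -> M1 end,   % take messages in arrival order
    B = receive M2 -> M2 end,
    {first, second} = {A, B},   % guaranteed: same sender, same receiver
    ok.
```

<p>Had the two messages been relayed through two different intermediate processes, the assertion could fail — which is precisely the overtaking threat in the take-over scenario above.</p>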
<hr style="height: 2px; width: 28%;" />
<blockquote>
<b>Exercise/Kata:</b> I think this problem makes a rather nice <a href="http://en.wikipedia.org/wiki/Kata_%28programming%29">kata</a> in concurrent design.<br />
If you'd like to try it yourself, do so before reading on.<br />
How many different solutions do you see?</blockquote>
<hr style="height: 2px; width: 28%;" />
<h3>
Enter the loop</h3>
Here is the trick: Causality <span style="font-style: italic;">loops</span>
are impossible. You can't enter a causality loop; they correspond to paradoxes. (Proof by paradox is an ancient technique, also known as proof by contradiction.)<br />
Thus, one way to attack the problem is to reformulate the requirement
as follows:<br />
<blockquote style="font-style: italic;">
If update is processed by A
after the snapshot is made, <span style="text-decoration: underline;">and</span>
the same update is processed by B before the subscription is set up,
then there is a causality loop.<br />
The loop in question is absent, however, if either of the two
sub-conditions is absent.</blockquote>
The situation can be represented graphically:
<br />
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-3krx_4_9udDqGQGLhQgxQMwGKkozfPv3MX2T4bdTmLf4L3Ef5WsOQl6QEZDSLIPzro-0g4CCbeOyWY2UIrYDDD51FxQf3HlisU1cPUPGqcpGk5_Jo39KEiMY4UCapupsTpJrs24UAQ/s1600/SnapshotAndUpdates-nonloop.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="219" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-3krx_4_9udDqGQGLhQgxQMwGKkozfPv3MX2T4bdTmLf4L3Ef5WsOQl6QEZDSLIPzro-0g4CCbeOyWY2UIrYDDD51FxQf3HlisU1cPUPGqcpGk5_Jo39KEiMY4UCapupsTpJrs24UAQ/s320/SnapshotAndUpdates-nonloop.png" width="320" /></a></div>
<br /></div>
This is quite a bit simpler than the UML diagram above, yet it
captures the essence of the problem.<br />
And it is useful, because it leads to a further refinement of the
problem statement: How can we introduce a loop which is absent if we
reverse either of the two arrows?<br />
We will, however, also need the context in order to translate our findings back into concrete solutions. The context looks like this:<br />
<div class="separator" style="clear: both; text-align: center;">
<img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgy5GMXEA2rXiBRHnBpz0gWNRxaqOeeAjfvVPvYAOX33lz9QvNh530AjRO8mhhO0BS1DI2yt174E8c9yAQrQg80Vyh6CBraMuiirZKGk81a9UAHAHLC5Fe4CKPi49aHRNYU51IaBpBlvw/s1600/SnapshotAndUpdates-context3.png" /></div>
<br />
<h3>
Getting solutions through loops</h3>
That question is not too hard: simply add A2→B1 and B2→A1, to get a
four-edge loop which disappears if any edge is reversed or removed.<br />
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjb6AWW5wIyU499L1GXWcijlhb8WkWG4jyO8fvvxFU8bjVK9MDfapEhMCDBhg_cS1qSHGONF1y60FBALZYUkruRPUDVs9FKoNkReyhaioPxnZF60zRW_JYzib4KSiAJluMHKbUiRiSbzw/s1600/SnapshotAndUpdates-full-loop.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="219" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjb6AWW5wIyU499L1GXWcijlhb8WkWG4jyO8fvvxFU8bjVK9MDfapEhMCDBhg_cS1qSHGONF1y60FBALZYUkruRPUDVs9FKoNkReyhaioPxnZF60zRW_JYzib4KSiAJluMHKbUiRiSbzw/s320/SnapshotAndUpdates-full-loop.png" width="320" /></a></div>
</div>
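The claim that the loop disappears when any edge is reversed can be checked mechanically. Here is a small sketch (my own, not the post's code) that encodes the graph and tests for cycles; the node names follow the figures, assuming A1 = snapshot, A2 = update processed by A, B1 = update processed by B, B2 = subscription:

```java
import java.util.*;

// Sketch: verify that adding A2->B1 and B2->A1 creates a loop exactly
// when both "bad scenario" edges A1->A2 and B1->B2 are present.
public class CausalityLoop {
    public static boolean hasCycle(Map<String, List<String>> g) {
        // DFS with colors: 0 = unvisited, 1 = on the stack, 2 = done
        Map<String, Integer> color = new HashMap<>();
        for (String n : g.keySet())
            if (color.getOrDefault(n, 0) == 0 && dfs(n, g, color)) return true;
        return false;
    }

    private static boolean dfs(String n, Map<String, List<String>> g,
                               Map<String, Integer> color) {
        color.put(n, 1);
        for (String m : g.getOrDefault(n, List.of())) {
            int c = color.getOrDefault(m, 0);
            if (c == 1) return true;                 // back edge: loop found
            if (c == 0 && dfs(m, g, color)) return true;
        }
        color.put(n, 2);
        return false;
    }

    // Build a graph from "X->Y" edge strings.
    public static Map<String, List<String>> graph(String... edges) {
        Map<String, List<String>> g = new HashMap<>();
        for (String e : edges) {
            String[] p = e.split("->");
            g.computeIfAbsent(p[0], k -> new ArrayList<>()).add(p[1]);
            g.computeIfAbsent(p[1], k -> new ArrayList<>());
        }
        return g;
    }

    public static void main(String[] args) {
        // Bad scenario plus both added edges: the four-edge loop exists.
        System.out.println(hasCycle(graph("A1->A2", "B1->B2", "A2->B1", "B2->A1")));
        // Reverse either "bad" edge and the loop disappears.
        System.out.println(hasCycle(graph("A2->A1", "B1->B2", "A2->B1", "B2->A1")));
        System.out.println(hasCycle(graph("A1->A2", "B2->B1", "A2->B1", "B2->A1")));
    }
}
```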
Translating backwards, what does this mean in terms of the original
problem?<br />
<span style="font-weight: bold;"><br />
Solution 1:</span> The client subscribes synchronously to B before
getting the snapshot from A. The event origin publishes updates
synchronously to A before it sends them to B.<br />
Now, the first part of that solution is reasonable, but the second part
is often not directly achievable — the event origin may just have a
generic publish-subscribe scheme, and be oblivious to the difference
between the nature of A and B.<br />
But there is another interpretation of the A2→B1 edge.<br />
It can be realized by linking A2 or anything "downstream" (causally later) from A2 to B1 or anything "upstream" (causally earlier) from B1.<br />
By using a direct A2→B1 link, we obtain:<br />
<br />
<span style="font-weight: bold;">Solution 2:</span> The client
subscribes synchronously to B before getting the snapshot
from A. When A has processed an update, it sends it on to B (which gets
the updates by this route, rather than directly from the event
source).
<br />
<br />
Likewise, the B2→A1 edge can be realized by a direct link. This results in another two solutions:<br />
<br />
<b>Solution 3</b>: The client does not contact A directly in order to get a snapshot. Instead, the 'subscribe' operation provided by B both subscribes and obtains a snapshot from A.<br />
<br />
<b>Solution 4</b>: Like in #2, updates are sent only to A, which sends them on to B; like in #3, the client contacts only B, which both registers a subscription and requests a snapshot for the client from A.
<br />
<h3>
Merging nodes</h3>
Revisiting Solution 2, we find that, depending on the situation, this solution may render B somewhat moot.
It may make sense to eliminate it altogether, which is yet another
interpretation: A2 occurs in the same thread as B1, and B2 occurs in
the same thread as A1.
<br />
In terms of causality graphs, this is a correct solution because
merging A1 with B2 and B1 with A2 results in a two-edge loop:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit7ngbjakG1uOVTj3TzBMPgbACwnUibC40Y655GIw1uaEn9GsJVWqCQzHWOiTUOQLeRMHszl3l1f4Ha9wpGrk97JFliCB4tL00AOCeG08hoqiPey6itkcijoOeq1ytpKDQUio5AOyJUQ/s1600/SnapshotAndUpdates-merge-loop.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit7ngbjakG1uOVTj3TzBMPgbACwnUibC40Y655GIw1uaEn9GsJVWqCQzHWOiTUOQLeRMHszl3l1f4Ha9wpGrk97JFliCB4tL00AOCeG08hoqiPey6itkcijoOeq1ytpKDQUio5AOyJUQ/s320/SnapshotAndUpdates-merge-loop.png" width="320" /></a></div>
<br />
<span style="font-weight: bold;">Solution 5:</span> The snapshot
provider also plays the role of update publisher. This combined service
provides a "get snapshot and subscribe for updates" operation.<br />
<br />
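As a sketch of what Solution 5 might look like in code (my own illustration, in Java rather than Erlang, with invented names): because one object plays both roles, a single monitor can make the snapshot and the subscription atomic with respect to incoming updates.

```java
import java.util.*;
import java.util.function.Consumer;

// Sketch of Solution 5: the snapshot provider and the update publisher
// are one object, so one lock can make "take snapshot" and "register
// subscriber" atomic with respect to updates.
public class CombinedService {
    private final List<String> state = new ArrayList<>();            // current image
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    // An update mutates the state and reaches every current subscriber.
    public synchronized void publish(String update) {
        state.add(update);
        for (Consumer<String> s : subscribers) s.accept(update);
    }

    // Atomic "get snapshot and subscribe for updates": no update can slip
    // in between the copy and the registration, so the client neither
    // misses an update nor receives one already included in its snapshot.
    public synchronized List<String> getSnapshotAndSubscribe(Consumer<String> sub) {
        List<String> snapshot = new ArrayList<>(state);
        subscribers.add(sub);
        return snapshot;
    }

    public static void main(String[] args) {
        CombinedService svc = new CombinedService();
        svc.publish("u1");
        List<String> received = new ArrayList<>();
        List<String> snapshot = svc.getSnapshotAndSubscribe(received::add);
        svc.publish("u2");
        System.out.println(snapshot);   // [u1]
        System.out.println(received);   // [u2]
    }
}
```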
In general, merging existing nodes may provide additional ways to obtain causality loops. <br />
<h3>
Solutions involving conditional edges</h3>
We now have usable solutions, but our options haven't been exhausted yet.<br />
There are other ways of causing loops in the graph — but in order to
ensure that the loop is only present when it should be, we will need to
introduce <span style="font-style: italic;">conditional edges</span> —
which are only actually there in certain circumstances.
<br />
<br />
For instance, consider the loop caused by adding the edge A2→A1. In
order to ensure that this loop is absent if the B1→B2 edge is reversed,
we will need to put a condition on the new edge saying "only if the
update at A2 has already happened at B before the subscription at B2."
This means that information from the subscription operation is required
to evaluate the condition, so we'll need a B2→A2 edge as well:
<br />
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSuiKV9beRBKvZXLn5w_yA-wJZEVQdYoRK5mkBLzOdeICnPXp8lpHhCF4qTIgkXaMi_gUiRUjHjgJ6m4LIdS_JHj7EvQCuQ3IINwQdUi8Mmlxc1Iq4aJliMLBfFnIfyt_kgGAKfGbLaw/s1600/SnapshotAndUpdates-snapshot-loop.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSuiKV9beRBKvZXLn5w_yA-wJZEVQdYoRK5mkBLzOdeICnPXp8lpHhCF4qTIgkXaMi_gUiRUjHjgJ6m4LIdS_JHj7EvQCuQ3IINwQdUi8Mmlxc1Iq4aJliMLBfFnIfyt_kgGAKfGbLaw/s320/SnapshotAndUpdates-snapshot-loop.png" width="320" /></a></div>
</div>
Also, we will
need the updates to be identifiable in some way, by
timestamp, serial number or similar; let's assume serial numbers.
<br />
<br />
Just as Solutions 1 and 2 differed in how the A2→B1 edge was
implemented — whether it was present as a link directly from A to B or
from A to the event source to B — so there are at least two solutions
corresponding to this graph, because an A2→A1 edge can be realized both directly and as A2→C1 (because C1 is upstream of A1).<br />
The A2→C1 edge involves the client:
<br />
<br />
<span style="font-weight: bold;">Solution 6:</span> Subscribing to B is
a synchronous operation, which returns the ID of the last known update
(which is the last update not published as result of the subscription).
The "get snapshot" operation provided by A returns both the snapshot
and the ID of the last update included in the snapshot. The client
first performs the subscription, then requests snapshots repeatedly,
until the snapshot includes the last update not published.
<br />
<br />
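The client side of Solution 6 might look like the following sketch (mine; the two stub methods stand in for B's subscribe and A's get-snapshot operations, and all names are invented):

```java
// Sketch of the Solution 6 client: subscribe() returns the serial number
// of the last update B has already published; getSnapshot() returns a
// snapshot from A together with the serial number of the last update it
// includes. The client polls until the snapshot has caught up.
public class SolutionSixClient {
    // Stand-ins for the real A and B services, wired for the demo:
    long lastPublishedId = 7;   // B's view at subscribe time
    long snapshotId = 5;        // A's view, lagging behind

    public record Snapshot(long lastIncludedId, String data) {}

    public long subscribe() { return lastPublishedId; }

    public Snapshot getSnapshot() {
        snapshotId++;           // in the demo, A catches up one update per call
        return new Snapshot(snapshotId, "state@" + snapshotId);
    }

    // First subscribe, then retry the snapshot until it includes every
    // update that the subscription will not deliver.
    public Snapshot snapshotAfterSubscribe() {
        long needed = subscribe();
        Snapshot s;
        do { s = getSnapshot(); } while (s.lastIncludedId() < needed);
        return s;
    }

    public static void main(String[] args) {
        SolutionSixClient client = new SolutionSixClient();
        System.out.println(client.snapshotAfterSubscribe().lastIncludedId());  // 7
    }
}
```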
In the direct variant, the A2→A1 edge is kept within A; if sending a
snapshot is expensive, this is probably better:<br />
<br />
<span style="font-weight: bold;">Solution 7:</span> Subscribing to B is
a synchronous operation, which returns the ID of the last known update.
The "get snapshot" operation takes an update ID and checks it
against the last known (by A) update. Sending the snapshot is delayed
until the update in question has been processed by A.<br />
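Solution 7's delayed snapshot maps naturally onto a monitor with wait/notify. A sketch of A's side (my own illustration, with invented names):

```java
// Sketch of Solution 7's snapshot provider: "get snapshot" takes the
// update ID returned by the subscription and blocks until A has
// processed that update, so the snapshot sent is never too old.
public class SolutionSevenProvider {
    private long processedId = 0;
    private final StringBuilder state = new StringBuilder();

    public synchronized void applyUpdate(long id, String update) {
        state.append(update);
        processedId = id;
        notifyAll();            // wake clients waiting for this update
    }

    // Delay the snapshot until update `atLeast` has been processed.
    public synchronized String getSnapshot(long atLeast) throws InterruptedException {
        while (processedId < atLeast) wait();
        return state.toString();
    }

    public static void main(String[] args) throws Exception {
        SolutionSevenProvider a = new SolutionSevenProvider();
        // The update arrives "later", from another thread:
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException e) {}
            a.applyUpdate(1, "u1");
        }).start();
        System.out.println(a.getSnapshot(1));   // blocks until u1 is included
    }
}
```

Note that the result is deterministic regardless of thread timing: if the update has already been processed, the wait loop is simply skipped.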
<hr style="height: 2px; width: 28%;" />
So much for that causality loop; moving on, we consider the loop
caused by adding a B2→B1 edge. It also needs to be conditional, with
the same condition; this time, we need to add an A1→B2 edge to have the
information required to evaluate the condition:<br />
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSqgwWpTH9_-U_ddvJNHXAGlgmyfXrZwxhS-34Mp-6zs0sWqUMgDP53ft9-x_fsFwNmW1X-760eeG_qb_rUxxoHOqYxzmggeh7F_dJ_I9_0OZQtQXKVgU5zxHnMghDY0VKY5hc2YRyeg/s1600/SnapshotAndUpdates-subscribe-loop.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhSqgwWpTH9_-U_ddvJNHXAGlgmyfXrZwxhS-34Mp-6zs0sWqUMgDP53ft9-x_fsFwNmW1X-760eeG_qb_rUxxoHOqYxzmggeh7F_dJ_I9_0OZQtQXKVgU5zxHnMghDY0VKY5hc2YRyeg/s320/SnapshotAndUpdates-subscribe-loop.png" width="320" /></a></div>
</div>
In domain
terms: "If the update is not in the snapshot when subscribing,
then reprocess the update."<br />
This seems to imply memory:<br />
<br />
<span style="font-weight: bold;">Solution 8: </span>The update provider B caches a
suitable amount of the most recent updates. The "get snapshot"
operation returns both the snapshot and the ID of the last update
included therein. The subscription operation takes an update ID, and
causes all updates since that one to be sent to the client —
whether they have already been processed before the subscription took
place (in which case the update is taken from the cache), or arrive at
B at a later point. The client first gets a snapshot synchronously,
then subscribes with the ID thus obtained.<br />
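A sketch of Solution 8's update publisher (again my own illustration; a real B would also evict old cache entries and handle IDs that have fallen out of the cache):

```java
import java.util.*;
import java.util.function.Consumer;

// Sketch of Solution 8's B side: B caches recent updates by serial
// number; subscribe(sinceId, client) first replays every cached update
// after sinceId, then registers the client for live updates.
public class SolutionEightPublisher {
    private final NavigableMap<Long, String> cache = new TreeMap<>();
    private final List<Consumer<String>> live = new ArrayList<>();

    public synchronized void onUpdate(long id, String update) {
        cache.put(id, update);
        // (A real implementation would evict sufficiently old entries here.)
        for (Consumer<String> c : live) c.accept(update);
    }

    public synchronized void subscribe(long sinceId, Consumer<String> client) {
        // Replay the updates that the client's snapshot did not include...
        for (String u : cache.tailMap(sinceId, false).values()) client.accept(u);
        // ...then deliver everything that arrives from now on.
        live.add(client);
    }

    public static void main(String[] args) {
        SolutionEightPublisher b = new SolutionEightPublisher();
        b.onUpdate(1, "u1");
        b.onUpdate(2, "u2");
        List<String> seen = new ArrayList<>();
        b.subscribe(1, seen::add);   // the snapshot already contained update 1
        b.onUpdate(3, "u3");
        System.out.println(seen);    // [u2, u3]
    }
}
```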
<h3>
There are more (of course there are)<br />
</h3>
The set of solutions found so far is obviously not complete; for
instance, none of the solutions we've considered have involved
reinventing even <a href="http://en.wikipedia.org/wiki/Greenspun%27s_Tenth_Rule">a quarter of Common
Lisp</a> (or Erlang).<br />
Further meaningful solutions exist; for instance, one that lets the
snapshot provider send updates until the real update provider has taken
over (which resembles a hybrid between #5 and #7); or one in which the
snapshot provider tries to include a given update ID (as in #7) but
times out if this takes too long, in which case the client may repeat
the snapshot operation (as in #6). Or one in which the "get snapshot" operation is replaced with a "get snapshot if update <i>u</i> is included" (also a hybrid between #6 and #7).<br />
<h3>
Conclusion<br />
</h3>
We have looked at a concurrent design problem in the view of causality
chains, and attacked it using the concept of causality loops (and with
the help of causality graphs). Using a fairly simple technique, we have
generated a number of solutions — quite different solutions, as it turns out, but
all correct and realistic, and with different trade-offs.<br />
Causality loops as a design tool appears to be a useful way both to
find correct solutions and to explore and outline the design space for
concurrent design problems.<br />
<h3>
Acknowledgement</h3>
I would like to thank Kresten Krab Thorup, who unknowingly became the
cause of this posting, and indirectly of many of the ideas in it.<br />
<h1>
On concurrency issues</h1>
Concurrency issues — race conditions and the like — are the worst
category of bugs. These are the bugs that cannot well be proven absent
by unit tests; these are the kind of bugs that hide away, biding their
time until the most inopportune moment, then rearing their ugly, non-deterministic head on your production
system when it is at its busiest. And even then, they can continue to
exist unlocated for quite a while, despite many hours being put into
tracking them down. Elusive, hardly reproducible, yet ultimately
expensive; I've seen it happen more than once.<br />
<br />
What follows is some thoughts on the basis of the typical issues.<br />
<br />
<span style="font-weight: bold;">Your basic multi-threading bug</span><br />
As any course on multi-threaded programming will tell you, when
multiple threads of execution are to run concurrently, care needs to be
taken or there will be a risk of data corruption and/or unintended
results.<br />
<br />
More specifically, the threat (a "race condition" or "data race") is present when:<br />
<ol>
<li>one thread <span style="text-decoration: underline;">modifies
the state</span> of an object</li>
<li>at the <span style="text-decoration: underline;">same time</span>
as</li>
<li><span style="text-decoration: underline;">another thread</span>
accesses the object.</li>
</ol>
That is: it takes a coincidence, a conjunction of three conditions.<br />
Let's analyze it...:<br />
<br />
<h3>
Analysis</h3>
Another way of stating the above is obtained by reversing the statement:<br />
We can avoid concurrency issues by always making sure that any object
either<br />
<ol>
<li>is never modified; or</li>
<li>is never accessed by two threads simultaneously (or stronger:
never written to while accessed otherwise); or</li>
<li>can only ever be accessed by one thread.</li>
</ol>
Such objects are known as, respectively,<br />
<ol>
<li><span style="font-style: italic;">Immutable</span> objects.</li>
<li>Objects with state protected by a <span style="font-style: italic;">mutex</span> (synchronization
lock; monitor)</li>
</ol>
and the one used less often:
<br />
<ol start="3">
<li>Single-thread objects — or even stronger: objects of <span style="font-style: italic;">linear type</span>.</li>
</ol>
The
first two options are well-known; personally, I'm increasingly
coming to consider the strong third option — objects with enforced
linear lifecycle — to be rather overlooked, language design space-wise.
<br />
<br />
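In Java terms, the first two categories might be sketched as follows (my illustration, not from the post); the third, enforced single-thread or linear objects, has no direct Java counterpart, which is rather the point:

```java
// Sketch of the first two categories of concurrency-safe objects.
public class Categories {
    // 1. Immutable: all fields final, no mutators; safe to share freely.
    public static final class Point {
        public final int x, y;
        public Point(int x, int y) { this.x = x; this.y = y; }
        public Point translated(int dx, int dy) { return new Point(x + dx, y + dy); }
    }

    // 2. Mutex-protected: every access to the state goes through one monitor.
    public static final class Counter {
        private long n = 0;
        public synchronized void increment() { n++; }
        public synchronized long get() { return n; }
    }

    // (Category 3, single-thread / linear objects, relies on the language
    // enforcing unique ownership; Java offers no such enforcement.)

    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Thread t1 = new Thread(() -> { for (int i = 0; i < 100_000; i++) c.increment(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 100_000; i++) c.increment(); });
        t1.start(); t2.start(); t1.join(); t2.join();
        System.out.println(c.get());   // 200000: no lost updates
    }
}
```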
<span style="font-weight: bold;">"Concurrency-ready" metric</span><br />
One possible metric for how well a programming language is designed to
express complex concurrent systems is, then, how easy it is to enforce
that all objects fall into one of the three categories.<br />
<br />
<h3>
Languages and concurrency</h3>
Let's apply this view to a few programming languages. The two languages
I've used most recently are Java and Erlang:<br />
<dl>
<dt><span style="font-weight: bold;">Erlang</span></dt>
<dd>In Erlang, there are the following kinds of objects: terms,
message queues, process dictionaries and other process metadata, ETS
tables,
ports (files and drivers).</dd><dd><ul>
<li>Terms are immutable values. They may contain handles of
other kinds of objects, but the handles themselves are also immutable.</li>
<li>Certain objects — private ETS tables, and to some extent
ports — are single-thread objects.</li>
<li>The rest — message queues, public ETS tables, process
metadata etc. — are mutex-protected.</li>
</ul>
Single-thread objects are enforced not to be used by threads other than the owning one.<br />
In Erlang, low-level data races only occur if there's a bug in the Erlang run-time system, or if you write your own driver and include one. <br />
<br /></dd>
<dt><span style="font-weight: bold;">Java</span></dt>
<dd>In Java, you could argue both that there are fewer kinds of
object — just one, really, the class-file defined kind — and that there
is a much wider range of object kinds. There are certainly thousands of
classes, defined by one well-defined scheme, which includes just a few
mechanisms relevant to concurrency.<br />
But the problem here is the number of classes — because it is at the
class level that it is ensured that the corresponding objects will be
thread-safe. A well-designed and -implemented class may be thread-safe,
in that it is either immutable or uses appropriate inter-thread
synchronization (and encapsulation) to ensure thread-safety. A less
well-written class may rely implicitly on details of the context in
which it is used, and really be single-thread-use only, or multi-thread
usable only in certain unstated conditions.</dd></dl>
Java is one of the few languages actually designed for portable multi-threaded programs; in particular, it has explicitly stated semantics wrt. multi-threaded execution. However, as the above comparison is one indication of, it has its shortcomings. For a language to claim good concurrency support, it should provide mechanisms for good, usable guarantees to be derived from local context (e.g. "we know that this-and-this property always holds, because these few lines here ensure that it does").<br />
I've expounded earlier on the <a href="http://polymorphictypist.blogspot.com/2009/09/programming-languages-and-local.html">importance of supporting local reasoning</a>. As it happens, Java did also then get some criticism (sorry, Java, but you're the modern main-stream language I happen to know the best...).<br />
<br />
<b>A Java exercise</b>
<br />
<blockquote>
Imagine that you have in front of you the source code of a Java class.<br />
A quick inspection reveals that all methods are declared "<span style="font-family: monospace;">synchronized</span>".<br />
What kinds of thread-safety issues might the class yet have? Try to
list at least three ways in which the class may be non-thread-safe.
</blockquote>
I'll present my list at the end of this article.<br />
<br />
<h3>
Does it matter?</h3>
But building security against these issues into a language is not
exactly trivial, you might argue. Are stricter language rules, additional compiler analysis,
and/or costly run-time support, just for preventing concurrency issues, not
overkill?
Sure, the trend is towards increasing parallelism and so on, but we
have done quite fine without such extra measures so far, haven't we?<br />
And <a href="http://www.catb.org/jargon/html/B/bondage-and-discipline-language.html">bondage-and-discipline languages</a> have been out of
fashion for a while. Dynamically typed languages are as popular as ever!<br />
<br />
The difference is this:
<br />
<br />
You can easily live and work with the relative uncertainties of dynamic typing — but then, you can unit test
and get
some confidence that the types match. If something is broken, or breaks
later, then there's a good chance it'll be caught.<br />
For many concurrency issues, unit tests are not likely to catch any
errors. Furthermore, the necessary invariants aren't local — they are
often widely dispersed in the code. To convince yourself that the code
is correct, you more or less need to keep it all in your head at once.
That, combined with missing language support for documenting and/or
enforcing vital non-local invariants, means that they will perhaps not
be communicated to whoever makes the next change in the code, who will
therefore not have the full picture necessary to keep things correct.<br />
<br />
Rather than being caught at the next full test suite run, or at least quite soon after the rubber hits the road, here's what happens to a race-condition bug:<br />
<ul>
<li>The program may appear to work most of the time.<br />
</li>
<li>The issue will tend to manifest itself at the most inopportune
moment: Not during development or testing (unless explicit and
considerable effort is taken to stress-test against such issues), but
in production, when your servers are at their busiest, or on the
desktop of your busiest client.</li>
<li>Often, what clues you have amount to little besides "There <span style="font-style: italic;">almost certainly</span> is an issue, it <span style="font-style: italic;">presumably</span> is a software bug, the
issue is <span style="font-style: italic;">probably</span> in our code
— and it occurs seldom, so it's likely to be a concurrency issue of
some kind."</li>
<li>Replicating the issue may be difficult; under the exact same
circumstances, the program may run just fine.<br />
Indeed, if the problem does manifest itself during development, it'll
appear to be gone at the next run.<br />
The issue may even be technically impossible to reproduce on some
machine architectures, because it requires multiple physical processors
of the right kinds to manifest itself (and your servers or other
production environment is less likely than the development machines to
be thus bug-resistant).</li>
<li>Eliminating a tricky concurrency bug can be a drawn-out
experience in all phases — detecting it, reproducing it, tracking down
its root cause, verifying that it has gone — all steps tend to be
markedly more difficult than for deterministic bugs.</li>
<li>With any kind of bug, tracking it down once is one thing; there
may even be a feeling of gratification once you've done it. Tracking
the same bug down twice is another matter — it is deeply frustrating to
realize that there's a reason for your feeling of déja-vu. This is one
of the reasons for regression testing: bug hunting may be stimulating
in its own way, but it'd better lead to a different bug each time.<br />
This goes doubly (well, even more than that) for the elusive
non-deterministic bugs.<br />
Sadly, because of their nature it is at best difficult, and often near
impossible, to write regression tests for these kinds of issues.<br />
</li>
</ul>
<h3>
Conclusion</h3>
Languages differ in concurrency support. Nothing new about that, of course, but I think it likely that many developers using one of the mainstream languages which have a relatively good level of support in that area may not know that there are alternatives which are significantly more concurrency-ready.<br />
In the beginning of this text, the prerequisites for a race condition were broken down and three kinds of conditions for avoiding race condition were derived; based on this I suggested a qualitative metric for a language's level of support for concurrent programming. I hope to have demonstrated that it may be a useful way of looking at both languages and concrete programs or program designs.<br />
<br />
In the absence of a programming language providing strong, local guarantees with respect to thread-safety, as a developer you need to be alert whenever there's a chance that two threads may be executing your code concurrently. The best way of doing this is probably through discipline — for instance, by clearly constructing each class so that it falls into one of the above categories: Immutable, thread-safe through synchronization, or single-thread use only — and then using them strictly according to this. That is one way of trying to restore local reasoning.<br />
<br />
Do you write programs involving multiple threads? If so, are you familiar with which consistency guarantees your platform actually provides (e.g, the <a href="http://en.wikipedia.org/wiki/Java_Memory_Model">Java Memory Model</a>)?<br />
Do you regularly stress-test your program on a system with multiple physical processors? I hope you do.<br />
Dealing with concurrent programs in general requires good global,
combinatorial-temporal reasoning abilities. Probably not your
best-developed cognitive mode... :-)<br />
The solution? Keep things as simple as possible. Find rules that work,
and follow them. Encapsulate the issues, so that you can deal with just
one question at a time. If possible, let the rules be checked
mechanically.<br />
<br />
<hr width="28%" />
<b>Answer to Java exercise</b><br />
Some ways in which a Java class may be unsafe even though all methods
are declared "<tt>synchronized</tt>":<br />
<ol>
<li>A field is public, and either</li>
<ol style="list-style-type: lower-alpha;">
<li>Non-<span style="font-family: monospace;">final</span> (and non-<span style="font-family: monospace;">volatile</span>) or</li>
<li>Referring to a non-thread-safe object.<br />
</li>
</ol>
<li>The superclass is non-thread-safe, and the current class does not
or cannot override the causes.</li>
<li>An instance method modifies the value of a <span style="font-family: monospace;">static</span> variable without proper synchronization.</li>
<li>An instance method reads the value of a <span style="font-family: monospace;">static</span> variable which is also
modified by a <tt>static</tt> method, without proper synchronization.<br />
</li>
<li>A method exposes a reference to a non-thread-safe object referred
to directly or indirectly by the current object.<br />
<ol style="list-style-type: lower-alpha;">
<li>By returning such a reference.</li>
<li>By passing such a reference to some method which stores the
reference somewhere accessible to another thread.<br />
</li>
<li>By starting a new thread and giving it such a reference.<br />
</li>
</ol>
(I.e., a non-thread-safe object becomes shared.)<br />
</li>
<li>A field (which may be private) refers to a non-thread-safe object
which may be shared because</li>
<ol style="list-style-type: lower-alpha;">
<li>It originated as (or was extracted from) a parameter to a
constructor</li>
<li>It originated as (or was extracted from) a parameter to a method</li>
<li>It was returned from a call<br />
</li>
</ol>
(I.e., a non-thread-safe object is already unexpectedly shared.)
<li>An inner class accesses an instance variable of the surrounding class, but fails to synchronize on the right <tt>this</tt>.
</li>
<li>An instance method locks on another object which may be of the
same class (this may result in a deadlock)<br />
<ol style="list-style-type: lower-alpha;">
<li>Implicitly, by calling a (synchronized) method on the other
object</li>
<li>Explicitly, using "<span style="font-family: monospace;">synchronized</span>"</li>
</ol>
(A plausible example of this is a synchronized <span style="font-family: monospace;">equals()</span> method.)<br />
</li>
<li>A constructor leaks <span style="font-family: monospace;">this</span>
to some place where it can be accessed by another thread, and the
object has at least one (<span style="font-family: monospace;">final</span>)
variable which is accessed without synchronization.<br />(This is because the special rule for <span style="font-family: monospace;">final</span> fields, which allows them to be accessed without synchronization, <a href="http://www.cs.umd.edu/%7Epugh/java/memoryModel/jsr-133-faq.html#finalRight">only applies</a> after the constructor has completed.)
</li>
<li>A method exposes a reference to a thread-safe object referred to
directly
or indirectly by the current object, and that thread-safe object allows
mutation (of itself or of a contained object) in a way that the current class is not prepared for.
</li>
</ol>
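To make item 5 concrete, here is a small illustration (mine, with invented names): every method is synchronized, yet the class is not thread-safe, because one method hands out the internal, non-thread-safe list itself.

```java
import java.util.*;

// All methods synchronized, yet unsafe: the internal ArrayList escapes,
// so other code can mutate it without ever taking the lock.
public class LeakyRegistry {
    private final List<String> names = new ArrayList<>();

    public synchronized void add(String name) { names.add(name); }

    // Unsafe: exposes the internal list (pitfall 5a in the list above).
    public synchronized List<String> getNames() { return names; }

    // Safe alternative: hand out a copy (or an unmodifiable snapshot).
    public synchronized List<String> getNamesCopy() { return new ArrayList<>(names); }

    public static void main(String[] args) {
        LeakyRegistry r = new LeakyRegistry();
        r.add("a");
        r.getNames().clear();                   // mutates internal state, no lock held
        System.out.println(r.getNamesCopy());   // []: the leak is real
        r.add("b");
        r.getNamesCopy().clear();               // only clears the copy
        System.out.println(r.getNamesCopy());   // [b]
    }
}
```

The demo mutates single-threadedly for determinism; with a second thread involved, the same leak becomes a data race.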
This list is quite likely not complete; it is not the result of a systematic analysis of the matter.<br />
<h1>
Testing distributed-store algorithms</h1>
This is a follow-up to my post on a <a href="http://polymorphictypist.blogspot.com/2011/04/multi-version-collections-in-riak.html">datastructure for storing collections in Riak</a>.<br />
While I have been planning this follow-up, Kresten Krab Thorup has actually gone ahead and implemented the algorithm (<a href="https://github.com/krestenkrab/riak_link_index">on GitHub</a>; see src/riak_column.erl) — which suits me nicely, as I haven't written a single line of implementation and am not really past the thinking stage yet :-)<br />
To wit:<br />
<br />
Shortly after writing the post, I discovered a couple of issues or finer points, both of which have to do with the fact that different rows in Riak live independent lives — they are independently versioned, and <i>no cross-row guarantees are given</i> — specifically, nothing can be concluded from the read order: if one client A updates row X, then row Y, then another client B may see the Y update and later see the old version of row X.<br />
<br />
(If Riak is set up with appropriate consistency settings, such that the majority of the replicas of an object are written synchronously, then you do indirectly have some sort of guarantee. Below a certain threshold of machine and/or network problems, that is.<br />
But even so, nothing can be concluded from the read order if there are several clusters set up with cluster-to-cluster replication: changes on different keys in one cluster arrive at the other clusters in arbitrary order.)<br />
<br />
This leads to at least the following issues:<br />
<ol><li>Firstly, when splitting a row into two, I included this step: "<span style="font-family: "Courier New",Courier,monospace;">Write an empty row back under the original auxiliary row key</span>".<br />
In fact, doing that would be wrong. Even writing some tombstone dummy value in the old row would be a bad idea; instead, one should simply <i>mark the row as obsolete</i> while <i>keeping the old data</i>. This is necessary because a client accessing the collection later on may see from the master row that the old row has been split, but not see the new rows. Or it may see the old row, but not the change in the main row. In the former case, it is necessary that at least the values in the old row be available (or the collection would suddenly have shrunk); in the latter case, it would not be apparent that the values in the old row are obsolete.<br />
<br />
</li>
<li>Secondly, after modifying an auxiliary row, the main row should be updated — even if nothing in it has changed. This may sound silly, but it is the easiest way to ensure that, in the event of a concurrent row split, the auxiliary row in question is taken into account at a subsequent merge (indeed, that the merge is triggered at all).<br />
<br />
</li>
<li>And even that is not enough if there is more than one cluster: at the time of the read repair of the main row, the update for the auxiliary row may not have arrived — and it may therefore be lost silently. It appears that some kind of versioning of the auxiliary rows is necessary; with such versioning, we can tell when repairing the main row that we're missing an update on the pre-split auxiliary row, and that the reference to it should therefore be kept in the main row, so that it can be repaired later when all information is available.</li>
</ol>Ah, the perils and challenges of incomplete information.<br />
<br />
<b>Managing concurrent subtlety</b><br />
As the saying goes,<br />
<blockquote><i>Meddle not in the affairs of concurrent systems for their ways are subtle, and are quick to anger.<br />
-- </i><i>Tolkie</i><i>(error: timeout)</i><i>n<br />
</i></blockquote>In a domain as subtle as this — with gotchas in the style of the above-mentioned issues — how can we ever convince ourselves that a scheme like the one for Riak collections is implemented robustly and correctly?<br />
<br />
As always, there are two ways: Formal verification — proving that the program is correct; and thorough testing — gaining confidence by exercising the program.<br />
Both have their advantages and disadvantages.<br />
<br />
If I were Dijkstra, I'd develop a formal proof (as he did in quite a few of his blog postings), and probably modify the algorithm appropriately along the way. I, however, am no Dijkstra, and do not believe I have the time to develop a formal proof — nor do I have any delusion of being able to write anything worth reading in the process. Luckily, going the other way, through testing, has something going for it as well:<br />
<ul><li>It is not always clear which guarantees the components we build on actually provide. That means that, regarding formal proof, the set of axioms may be uncertain.</li>
<li>The same test can be applied to different implementations of a component.</li>
<li>The same test can be applied to different versions of a component — i.e. if we modify the code, we can cheaply gain some confidence in the modified version.</li>
</ul>For both formal verification and testing, the answers you get depend on the questions you ask.<br />
When proving a property, you prove only that property; when you test a concrete call sequence with concrete values, you test with only those values.<br />
Which is of course a good reason to get familiar with <a href="http://www.protest-project.eu/publications.html">property-based testing</a> — which lies somewhere in between the two in that it tests a property on <i>a number of</i> concrete call sequences — typically a few hundred, and typically different instances each time a test is run.<br />
<br />
This can be a nice compromise between formal verification and hand-rolled test cases — provided of course that the properties and the instance generators are chosen well.<br />
As always with testing, paranoia and imagination are key. But you get more value for your paranoia when it's used to power random test case generation.<br />
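To make the idea concrete, here is a minimal sketch of a property-testing loop in Python (the real thing would be Erlang with QuickCheck or Proper; the property, generator and all names here are made up for illustration):

```python
import random

def prop_delete_removes(cmds):
    """Property (hypothetical example): after replaying a command list on a
    plain-dict model, every deleted-and-not-reinserted key is absent."""
    model, deleted = {}, set()
    for cmd in cmds:
        if cmd[0] == "insert":
            model[cmd[1]] = cmd[2]
            deleted.discard(cmd[1])
        else:                              # ("delete", key)
            model.pop(cmd[1], None)
            deleted.add(cmd[1])
    return all(k not in model for k in deleted)

def gen_cmds(rng):
    # Small key domain on purpose, so inserts and deletes collide.
    out = []
    for _ in range(rng.randrange(30)):
        k = rng.randrange(4)
        out.append(("insert", k, rng.randrange(100))
                   if rng.random() < 0.7 else ("delete", k))
    return out

def run_property(prop, gen, n_cases=300, seed=1):
    """The property-testing loop: many random cases, stopping at the first
    counterexample (real tools also shrink the counterexample)."""
    rng = random.Random(seed)
    for _ in range(n_cases):
        case = gen(rng)
        if not prop(case):
            return ("failed", case)
    return ("ok", n_cases)
```

Each run exercises a few hundred random command sequences, which is exactly the "in between" position described above.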
<br />
For the problem in question, and assuming an Erlang implementation like KKT's, property testing can be done with relative ease using a tool like <a href="http://www.quviq.com/">Quviq QuickCheck</a> or <a href="https://github.com/manopapad/proper">Proper</a>. Their support for testing state machines comes in handy; I've not tried using it before, but this seems a good occasion.<br />
<br />
<b>How to test randomly</b> <br />
Randomized testing involves abstracting over usage scenarios.<br />
How the abstraction is done determines what will be tested.<br />
Knowledge of the problem domain is the primary guide; paranoia together with perhaps knowledge of implementation details should provide additional input.<br />
What must be considered is:<br />
<ul><li><u>What</u> is tested:<br />
Which API functions to exercise?<br />
Which invariants to verify?</li>
<li><u>How</u> it is tested:<br />
Test concurrent use to check for thread safety?<br />
Test with invalid inputs?<br />
Should certain special circumstances be simulated - e.g., file I/O errors, disk-full conditions, network delays, packet loss?</li>
<li><u>With what</u> it is tested:<br />
Test with abnormal (very long/short/high/low) inputs?<br />
Key collisions?<br />
Many values for the same key?<br />
Keys which are nearly identical?</li>
</ul>This is of course where the imagination and paranoia enter the picture.<br />
You can't assume that "since we generate the values randomly over a large domain, we exercise all code paths" — exercising e.g. a dictionary with a million randomly generated keys is of little use if the keys never collide.<br />
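One way to make collisions happen is to bias the generators deliberately; a Python sketch of the idea (the key domain and all names are hypothetical, chosen small on purpose):

```python
import random

def gen_key(rng):
    """Draw element keys from a deliberately small, skewed domain, so that
    collisions and near-identical keys actually occur."""
    base = rng.choice(["k", "key", "key_long_prefix"])
    return base + str(rng.randrange(3))   # only a handful of distinct keys

def gen_command(rng):
    # A random collection operation; all operations share the key generator,
    # so inserts, deletes and lookups hit the same keys.
    op = rng.choice(["insert", "delete", "lookup"])
    if op == "insert":
        return (op, gen_key(rng), rng.randrange(100))
    return (op, gen_key(rng))
```

With at most nine distinct keys, fifty draws are guaranteed to collide — which is the point.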
<br />
<b>How to test distributed collections </b><br />
OK then, how to test an implementation of the distributed collections scheme?<br />
Using the state machine testing support of Proper, we can maintain a "model" collection alongside the subject-under-scrutiny collection (shortened to SUS in the following).<i><br />
<br />
What to test</i>:<br />
This is the easiest part: We will test the usual collection functions: insert, delete, lookup, list keys.<br />
The invariant is that the result is consistent with the same operation performed on the model collection — as defined below.<br />
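A minimal Python sketch of that comparison loop (the SUS interface, insert/delete/lookup/keys, is an assumption for illustration; a real test would use Proper's state-machine callbacks instead):

```python
def check_against_model(sus, commands):
    """Run each command on both the system under scrutiny (SUS) and a
    plain-dict model; report the first divergence, if any."""
    model = {}
    for cmd in commands:
        op, key = cmd[0], cmd[1]
        if op == "insert":
            value = cmd[2]
            sus.insert(key, value)
            model[key] = value
        elif op == "delete":
            sus.delete(key)
            model.pop(key, None)
        elif op == "lookup":
            got = sus.lookup(key)            # None stands for "not found"
            expected = model.get(key)
            if got != expected:
                return ("mismatch", cmd, got, expected)
    # Finally, the key sets must agree as well.
    if sorted(sus.keys()) != sorted(model):
        return ("mismatch", "keys", sorted(sus.keys()), sorted(model))
    return ("ok",)
```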
<u><br />
</u><i>How to test:</i><br />
We'll certainly need to test concurrent use. Let's say that there are three simultaneous users of the collection; we know that there are subtleties involving two, and there may be additional ones involving at least three.<br />
The users won't be really simultaneous, though — we will test concurrent use in a way where the concurrency is made explicit. This provides repeatability and insight into why things fail, which is indispensable for this problem domain.<br />
So, instead of having an actual underlying Riak store, we'll mock it up, in such a way that the mock provides just the guarantees we actually expect the real thing to provide.<br />
<br />
<br />
<b>Model representation</b><br />
The structure I have in mind is: the mock remembers all versions of all values put into it. This is what the entire simulated multi-cluster store — let's call it the "cloud" — has ever seen.<br />
The mock furthermore has a number of "views" of the store, corresponding to what you would see if you accessed the cloud at different points. We'll make the assumption that <i>in any of these views, for any given key, the version will only increase with time</i>. I.e. we do not expect the version of any value in any view to evolve backwards. The mock keeps track of the view-to-version mapping for all keys.<br />
<br />
One thing that can happen, then, besides operations on collections, is that value versions find their way from one view to another. Also, version merging can happen in transit — the versions being merged are not necessarily the latest value in any view, but may be any versions that have ever existed in the cloud.<br />
For simplicity, let's say that this is modelled with views also — if we allow enough views, this won't reduce the generality.<br />
<br />
The model thus consists, not just of a simple collection, but of an abstract model of the storage cloud.<br />
Besides the externally visible events — the calls to the collection API — there are internal events: a given version of a given key finds its way from one view to another. Both kinds of events go into the randomly generated test scenario specification.<br />
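A Python sketch of such a mock follows. Plain integers stand in for vector clocks, and the monotone-version invariant is enforced in propagate; all of this is illustrative, not the real Erlang mock:

```python
class MockCloud:
    """Toy mock of the multi-cluster store: it remembers every version of
    every value ever put into it, and keeps a number of 'views'.
    Invariant: within one view, the version seen for a key only increases."""

    def __init__(self, n_views=3):
        self.history = {}                               # key -> {version: value}
        self.views = [dict() for _ in range(n_views)]   # per view: key -> version

    def put(self, view, key, version, value):
        # Callers are trusted to hand out increasing versions.
        self.history.setdefault(key, {})[version] = value
        self.views[view][key] = version

    def propagate(self, src, dst, key):
        """Internal event: a version finds its way from one view to another,
        but never makes a view's version for the key go backwards."""
        v_src = self.views[src].get(key)
        v_dst = self.views[dst].get(key, -1)
        if v_src is not None and v_src > v_dst:
            self.views[dst][key] = v_src

    def get(self, view, key):
        v = self.views[view].get(key)
        return None if v is None else (v, self.history[key][v])
```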
<br />
<b>The test</b><br />
It was originally my intention to end with putting all of this together in a Proper-based Erlang unit test — it would then be a good occasion for me to get experience with the <a href="http://aszt.inf.elte.hu/%7Ecefp2009/materials/papers/Hughes.pdf">state-machine modelling support</a> (link: state machine part is from p.26).<br />
However, it appears that I have a latency-vs.-completeness tradeoff to make, so I'd better stop here, publish what I have, and hope to return to the subject soon, hopefully with some concrete code. (While this posting has been under way, naturally other topics have come up which I'd like to write about; the order in which further postings arrive here is undetermined...)Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com0tag:blogger.com,1999:blog-2879132569917319939.post-89250984437513896422011-04-12T16:38:00.009+02:002011-05-16T16:14:49.110+02:00Multi-version collections in Riak<b>The setting:</b><br />
We have a distributed, transaction-less key-value store with multi-version awareness (<a href="http://en.wikipedia.org/wiki/Vector_clock">vector clocks</a> as version numbers, basically). More specifically, we have <a href="http://wiki.basho.com/">Riak</a>.<br />
<br />
This provides us with the ability to store and retrieve blobs by key — and to do decent conflict resolution, up to a point.<br />
<br />
That is a great building block, but there are some shortcomings which mean that we need to build on it.<br />
<br />
<i>First, the stored values have (as far as Riak is concerned) no structure.</i><br />
This means that if we store rows as values, and need to update only a part of a row, we still need to fetch and store the entire row.<br />
There are both potential bandwidth and versioning issues with this: you may have to move more data than strictly needed, and the elements of the row are (at the outset) not individually versioned, which makes conflict resolution difficult.<br />
For the latter problem, a solution has already been devised in the form of <a href="http://www.javalimit.com/2011/02/the-beauty-of-vector-clocked-data.html">Vector Maps</a>.<br />
<br />
<i>Second, there is no concept of "collections".</i><br />
This leaves you with two possibilities when you need to store a collection in Riak: storing the entire collection under one Riak key, or storing it under several Riak keys.<br />
(I'll henceforth use the term "row" for a value stored under one Riak key, both to avoid confusion with the keys of the individual elements and because that's the term that has stuck in my head.)<br />
<br />
Storing all of a collection in one row has the above-mentioned drawbacks: it's a waste of resources (bandwidth, CPU, disk I/O) much of the time, and you lose many of the advantages of automatic version conflict resolution because only the version of the entire value is taken into consideration.<br />
<br />
Storing the collection in different rows also has its drawbacks, though — most significantly, you lose data locality. To fetch a collection of 100 elements, spread over 100 different keys, 100 separate disk locations may have to be accessed. (Unless perhaps you're so clever that you can trick the hashing function, that is...)<br />
<br />
<blockquote><i>A side note:</i><br />
Riak has a "link" concept which enables you to link objects associated with different keys together; sadly, it has both the locality issue (which can't be helped) and the versioning issue (which is a pity, this being Riak).<br />
For instance, consider an object which in version [] has one link, "foo"; in version [{a,1}] the foo link is deleted; and in version [{b,1}] another link, "bar", is added. Based on versions [{a,1}] and [{b,1}], with link sets Ø and {foo,bar}, respectively, you can't conclude anything about which links should be present in a merge. (Even if the version [] was accessible to Riak, it would be disregarded entirely.)<br />
So, if you want to take advantage of Riak's multiversioning support, you'd better stay clear of links. Or else use them in a disciplined manner — i.e., only adding, never changing or deleting them.</blockquote><br />
<b>The concerns </b><br />
Summing up so far, we have these concerns when storing collections:<br />
<ol><li>The ability to store an arbitrary number of elements (obviously).</li>
<li>Data locality. — This concern alone would have all elements in one row.</li>
<li>Manageable row size. — This concern alone would have one element in each row.</li>
<li>Efficient element lookup.<br />
(In the following, the elements are assumed to be associated with some kind of key.)</li>
<li>Easy conflict resolution.<br />
(E.g., when creating a new row, expect that another Riak node might also decide to create a row at the same time, based on the same decisions; this may influence the way you choose keys. Also, don't move data from one row to another too much.)</li>
</ol>(A non-concern, luckily, is hard limits e.g. on row size. The present problem has many features in common with the issue of how to structure database storage, but while DBs have to respect the size of a disk block, we have more freedom. A good thing, too — dealing with concurrency is quite enough.)<br />
<br />
The conflict between concerns (2) and (3) is obvious.<br />
One straightforward compromise between the two is "add elements to a row until it reaches a certain size (say, N elements), then create a new row for the next N elements, and so on", but this conflicts with concern (4).<br />
<br />
<b>Hash tables</b><br />
So, we need a growable, block-based collection, in which elements can be looked up efficiently; what are our options?<br />
A hash table as it is usually implemented — using hash value modulo bucket count to get the bucket number — is not so good a choice in light of (5); global redistribution of the data is a thing to be avoided.<br />
<br />
Taking inspiration from database systems, however, we might consider the hash table variants used there, secondary-storage hash tables (link?).<br />
These work by using a bitwise prefix of the hash value as the bucket indicator.<br />
<br />
The scheme I have in mind differs from both of the variants described in my book on database systems, but the basic idea is the same — after all, the main concern of touching as few blocks as possible at each access is the same.<br />
I'll sketch my scheme below.<br />
<br />
<b>Collection representation using secondary-storage hash table</b><br />
First, a bit of basics and assumptions:<br />
I'll assume that we have a Riak bucket at our disposal for collection storage, and that each collection is associated with an alphanumeric key in that bucket.<br />
<br />
Each element is versioned individually, for reasons mentioned earlier:<br />
<blockquote>VersionedElement :: {ElementKey, [{VClock, Data}]}</blockquote>(I'll be using Erlang type specifications.)<br />
<br />
After a version merge, multiple versions of the element may be present, with pairwise independent versions (i.e. none of them "happens before" any of the others); this is why there is a list of versioned values rather than a single value.<br />
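For concreteness, here is a Python sketch of how only the pairwise-independent versions are kept: a version is dropped exactly when its vector clock is dominated by another one present (the dict-based clock representation and the names are assumptions for illustration):

```python
def vclock_leq(a, b):
    """True iff vector clock a 'happens before or equals' b.
    Clocks are dicts mapping actor -> counter; missing entries count as 0."""
    return all(n <= b.get(actor, 0) for actor, n in a.items())

def prune(versions):
    """Given a list of {VClock, Data} pairs, keep only the pairwise
    independent ones: drop every pair whose clock is strictly dominated
    by another clock in the list."""
    keep = []
    for vc, data in versions:
        dominated = any(vclock_leq(vc, other) and vc != other
                        for other, _ in versions)
        if not dominated:
            keep.append((vc, data))
    return keep
```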
<br />
As mentioned, we use a prefix of the hash values for indexing the auxiliary rows.<br />
The two approaches for extensible secondary-storage hash tables which are described in my database book ("Database Systems: The Complete Book", by Garcia-Molina, Ullman and Widom; interesting reading) are:<br />
<ul><li>"<i>Extensible hash table</i>" — in which a hash table's current structure is described by the number <i>i</i> of bits of suffix used for selecting the hash bucket. Drawback: the extension of the bucket table from 2<sup><i>i</i></sup> to 2<sup><i>i</i>+1</sup> entries has to be done all at once.<br />
</li>
<li>"<a href="http://en.wikipedia.org/wiki/Linear_hashing"><i>Linear hash table</i></a>" — which employs an incremental growing approach, and in which the structure is described by (<i>i,k</i>) where <i>i</i> is as above and <i>k</i> is the current number of buckets, 2<sup>i-1</sup> &lt; <i>k</i> &le; 2<sup>i</sup>. If a suffix is below <i>k</i>, it's used as the bucket number; if it is <i>k</i> or larger, then the suffix of length <i>i</i>-1 is used (i.e., clear the MSB of the original suffix). When <i>k</i> is increased, the elements in preexisting bucket <i>j</i> are redistributed between <i>j</i> and <i>k</i> — where <i>j</i> is <i>k</i> with the MSB cleared.<br />
Quite elegant, really.</li>
</ul>In our case, the concerns and constraints are similar but different; for one thing, we can get away with using less compact and simple data structures. More specifically, instead of a couple of numbers we're going to have a list of buckets:<br />
<blockquote>RowPointer :: {BitCount :: integer(), BitSuffix :: integer()}</blockquote>This allows us to split exactly the hash buckets that need splitting, rather than having to rely on a predetermined splitting order.<br />
(This structuring turns out to be similar to the one Wikipedia calls <a href="http://en.wikipedia.org/wiki/Extendible_hashing">extendible hashing</a>, except for the representation of the bit suffix table.)<br />
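The suffix-matching and splitting rules for such RowPointers can be sketched in Python as follows (the hash function is an arbitrary stand-in; the real implementation would hash Erlang terms):

```python
import hashlib

def key_hash(key):
    # Any stable hash will do; the low 32 bits of SHA-1 are used here.
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[-4:], "big")

def matches(row_pointer, key):
    """A RowPointer (BitCount, BitSuffix) matches a key iff the low
    BitCount bits of the key's hash equal BitSuffix."""
    bit_count, bit_suffix = row_pointer
    return key_hash(key) & ((1 << bit_count) - 1) == bit_suffix

def split(row_pointer):
    """Split one pointer into the two longer-suffix pointers that
    together cover exactly the same keys."""
    bit_count, bit_suffix = row_pointer
    return [(bit_count + 1, bit_suffix),
            (bit_count + 1, bit_suffix | (1 << bit_count))]
```

Splitting a pointer thus partitions its keys by the next hash bit, which is what allows individual hash buckets to be split on demand.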
Consider a collection stored under Key. The collection's main row looks like this:<br />
<blockquote>Key → MainRow</blockquote>where<br />
<blockquote>MainRow :: {[VersionedElement], [RowPointer]}</blockquote>and VersionedElement and RowPointer are described above.<br />
<br />
For locality, we allow a limited number of elements to be stored directly in the main row. This is an optimization, of course, which ensures that for small collections, one row access is needed rather than two; it can be easily omitted but I see no harm in it.<br />
<br />
The RowPointer set must be internally consistent: For each possible bit pattern, there should be at most one RowPointer matching that pattern.<br />
<br />
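That internal-consistency rule can be checked mechanically; a Python sketch (pointer representation as (BitCount, BitSuffix) pairs, purely illustrative):

```python
def consistent(row_pointers):
    """Check that no hash value can match two RowPointers.  Two pointers
    overlap exactly when their suffixes agree on the shorter bit count."""
    for i, (c1, s1) in enumerate(row_pointers):
        for c2, s2 in row_pointers[i + 1:]:
            c = min(c1, c2)
            mask = (1 << c) - 1
            if (s1 & mask) == (s2 & mask):
                return False     # both would match the same bit pattern
    return True
```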
For auxiliary rows, we use the key format<br />
<blockquote>AuxKey = Key#BitCount,BitSuffix</blockquote>(where '#' is a character guaranteed never to be present in Key. Alternatively, use Key# as key for the main row.)<br />
The auxiliary rows contain just elements:<br />
<blockquote>AuxKey → [VersionedElement]</blockquote><br />
<b>Consistency and conflict resolution</b><br />
Let us define criteria for a reliable collection storing scheme, to guide the further design:<br />
<ul><li>Firstly, we can determine the key set for a given collection (subject to which revision versions are visible to us).</li>
<li> Secondly, given a collection and a key, we can determine the last version(s) of the associated value.</li>
</ul>Part of the first criterion is that deleted elements stay deleted even when other versions of the collection are "versionally visible" in which an older version of the element is present.<br />
Part of the second criterion is, strictly speaking, that if a version is visible in which an element was deleted, then that element is either not present at all in the key set, or the concluded "last version(s)" include a 'deleted' value — i.e., it will not be forgotten that it was once deleted.<br />
<br />
<b>Version-Merging Algorithm</b><br />
When merging two versions of the collection, we have for each row a number of versions of that row available. In Riak we can't tell across rows which versions belong together.<br />
<br />
An outline of a merge algorithm is as follows:<br />
<ul><li>First, determine the set of keys.</li>
<li>Next, for each key,</li>
<ul><li>Determine the most recent version(s).</li>
</ul></ul><br />
Determining the most recent versions of each element is a known and solved problem; the interesting part is therefore finding all of the relevant versions.<br />
<br />
First attempt:<br />
<ul><li>Take the union of the RowPointers set from all versions of the main row.</li>
<li>For each of these rows, take all element versions.</li>
</ul><br />
This doesn't take deletions into account, however.<br />
To handle deletions, I see two options:<br />
<br />
<b>1: The tombstone approach.</b> We represent deleted elements by explicit tombstone values (just as it is done in vector maps and, I believe, in Riak itself).<br />
<br />
It would be nicer to have deleted elements be removed from the system altogether. Is this possible, within the constraints we've set up?<br />
<br />
Consider a merge operation of two independent versions "D" and "P" of a row; a given element "X" is present in "P", but absent in "D". Assume further that the element modification version is <i>later than</i> the last common version ("LCV"). How do we know whether the "X" was present in "LCV", then deleted in "D" and modified in "P" (in which case we must include both options, present and deleted, in the merge result: [X<sub>P</sub>, tombstone]), or if it was created since the LCV (in which case it must be present, not deleted, in the merge result: [X<sub>P</sub>])?<br />
<div style="font-family: "Courier New",Courier,monospace;"></div><div style="font-family: "Courier New",Courier,monospace;"><br />
</div><div style="font-family: "Courier New",Courier,monospace;"> LCV<br />
/ \<br />
delete(X)/ \update(X)<br />
/ \<br />
D P<br />
(X absent) (X → X<sub>P</sub>)</div><br />
This question leads to:<br />
<br />
<b>2: The extra versioning approach.</b> For each element, include both the version of the last modification <i>and</i> the version of its creation. Furthermore, include for each row the version of the last modification to the row (these versions must be comparable to the versions of the individual elements, by the way).<br />
<br />
With this extra information, we can distinguish: if the row version of "D" is <i>later than</i> X's creation version, then X was deleted (result=[X<sub>P</sub>, tombstone]); if not, then X was created since (result=[X<sub>P</sub>]).<br />
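The decision rule can be sketched in Python (versions are abstract here; later_than stands in for the vector-clock partial order, and all names are made up for illustration):

```python
def resolve_absent(x_creation, x_value, absent_row_version, later_than):
    """Decide the merge result when element X is present on one side and
    absent on the other.  later_than(a, b) is the strict partial order
    on versions.  If the absent side's row was modified after X was
    created, X must have been deleted there: keep both candidates."""
    if later_than(absent_row_version, x_creation):
        return [x_value, "tombstone"]     # deleted on the absent side
    return [x_value]                      # created after the fork: keep it
```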
<br />
Interestingly, the two approaches (1) and (2) are not equivalent, but yield different results in certain cases.<br />
Consider for instance the following history:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;"> V0<br />
/:<br />
/ :<br />
create(Y) - A1 V1 - create(X)<br />
/|<br />
/ |<br />
delete(X) - A2 V2 - delete(X)<br />
/:<br />
/ :<br />
create(Y) - A3 V3 - create(X)<br />
/|<br />
/ |<br />
delete(X) - A4 V4 - update(X)</span><br />
<br />
and assume that the input versions available to the merge operation are V4 and one of {A2, A3}.<br />
Then using the tombstone approach, A2 and A3 contain a tombstone value for X, and we must include it in the merge result.<br />
Using the extra versioning approach, on the other hand, we can conclude that the X present in V4 was created at a point not "versionally visible" to either A2 or A3, and thus exists without question after the merge.<br />
<br />
(If merging (V4 and A1) or (V4 and A4), the two approaches yield the same result.)<br />
<br />
Anyway, I hope to have convinced you that we can resolve conflicts reliably.<br />
Having gotten that out of the way, let's turn to how the data structure is actually manipulated.<br />
<br />
<b>Collection Manipulation Algorithms</b><br />
In the insertion and deletion algorithms below, the possibility of write conflicts has been ignored. This is intentional; Riak lets you detect the write conflict, and it can then be repaired — instantly or later, as read repair — using the principles described above.<br />
Note that the organization of the data means that repairs caused by write conflicts are local, i.e. only a few rows need to be repaired, not the entire collection.<br />
<br />
<u>Lookup algorithm:</u><br />
Given a collection key, and an element key, look up the key in the collection. Answer with the versioned element or "not found".<br />
<span style="font-family: "Courier New",Courier,monospace;"> - Fetch the main row of the collection: {Elements, RowPointers}.</span><br />
<span style="font-family: "Courier New",Courier,monospace;">- Is the element among Elements?</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> - If yes: There's your answer.</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> - If no: Is there a RowPointer matching the key?</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> - If no: Conclude "not found".</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> - If yes: Fetch the auxiliary row pointed to by RowPointer: Elements2.</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> - Is the element in Elements2?</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> - If yes: There's your answer.</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> - If no: Conclude "not found".</span><br />
<br />
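For concreteness, a Python rendering of the lookup outline (rows simplified to plain dicts, versioning ignored, and fetch standing in for a Riak get; these are all simplifying assumptions):

```python
import hashlib

def key_hash(key):
    # Any stable hash will do; the low 32 bits of SHA-1 are used here.
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[-4:], "big")

def lookup(fetch, key, elem_key):
    """Follow the lookup outline: main row first, then at most one
    auxiliary row.  fetch maps a row key to its stored value, or None."""
    main = fetch(key)
    if main is None:
        return None
    elements, row_pointers = main
    if elem_key in elements:                       # found in the main row
        return elements[elem_key]
    h = key_hash(elem_key)
    for bit_count, bit_suffix in row_pointers:     # matching RowPointer?
        if h & ((1 << bit_count) - 1) == bit_suffix:
            aux = fetch("%s#%d,%d" % (key, bit_count, bit_suffix)) or {}
            return aux.get(elem_key)               # None means "not found"
    return None                                    # no pointer matches
```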
<u>Update algorithm:</u><br />
Like the lookup algorithm, with the obvious additions. Left as an exercise to the reader.<br />
<br />
<u>Insertion algorithm:</u><br />
Given a collection key, an element key and a value, insert the (key,value) pair in the collection.<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;"> - Perform an Update of the key. If the key was found, we are done; if not, continue.<br />
- Fetch the main row of the collection: {Elements, RowPointers}.<br />
- Is there room for another element in Elements, without exceeding the size threshold?<br />
- If yes: insert the versioned element in the main row.<br />
- If no: Is there a RowPointer matching the key?<br />
- If yes: fetch the row pointed to by RowPointer, and add the versioned element.<br />
- Is the size limit exceeded?<br />
- If no: Write the auxiliary row back.<br />
- If yes: Split the auxiliary row into two rows with a BitCount increased by one.<br />
- Write these rows.<br />
- Update the main row with the new row pointers.<br />
- Write an empty row back under the original auxiliary row key. (Optional?)<br />
- If no:<br />
- Create a RowPointer matching the key and having as small a BitCount as possible.<br />
- Insert an auxiliary row for that RowPointer, containing just the new element.<br />
- Update the main row, adding the RowPointer.</span><br />
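And a Python rendering of the insertion outline, under the same simplifying assumptions (single writer, plain dicts standing in for Riak rows, no versioning; MAX_ROW and the hash function are arbitrary choices for the sketch):

```python
import hashlib

MAX_ROW = 4   # size threshold; an arbitrary choice for this sketch

def key_hash(key):
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[-4:], "big")

def aux_key(key, bit_count, bit_suffix):
    return "%s#%d,%d" % (key, bit_count, bit_suffix)

def insert(store, key, elem_key, value):
    """Single-writer, in-memory rendering of the insertion outline;
    element versioning and write-conflict repair are left out."""
    elements, row_pointers = store.setdefault(key, ({}, []))
    if elem_key in elements:                   # the initial Update step
        elements[elem_key] = value
        return
    if len(elements) < MAX_ROW:                # room in the main row itself
        elements[elem_key] = value
        return
    h = key_hash(elem_key)
    for bc, bs in row_pointers:
        if h & ((1 << bc) - 1) == bs:          # found the matching pointer
            aux = store.setdefault(aux_key(key, bc, bs), {})
            aux[elem_key] = value
            if len(aux) > MAX_ROW:             # split the auxiliary row
                for bs2 in (bs, bs | (1 << bc)):
                    store[aux_key(key, bc + 1, bs2)] = {
                        k: v for k, v in aux.items()
                        if key_hash(k) & ((1 << (bc + 1)) - 1) == bs2}
                row_pointers.remove((bc, bs))  # update the main row's pointers
                row_pointers.extend([(bc + 1, bs), (bc + 1, bs | (1 << bc))])
                store[aux_key(key, bc, bs)] = {}   # the optional emptying
            return
    # No matching pointer: create one with as small a BitCount as possible.
    bc = 1
    while not all((h ^ s) & ((1 << min(bc, c)) - 1) for c, s in row_pointers):
        bc += 1
    bs = h & ((1 << bc) - 1)
    store[aux_key(key, bc, bs)] = {elem_key: value}
    row_pointers.append((bc, bs))
```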
<br />
<u>Deletion algorithm:</u><br />
Given a collection key, and an element key, delete the element with that key from the collection (if it is present).<br />
Done like Update with a 'tombstone' value — if the element is present; if it isn't, nothing should be done.<br />
<br />
<b>Conclusion</b><br />
Riak is a good building block for a distributed storage system, but for certain applications you need some appropriate abstractions on top of the raw key-value store. There are for instance good reasons to put some thought into the organization of collections of values, and ensure that conflicting versions can be dealt with appropriately.<br />
I have presented some of the issues involved and attempted to put together a usable scheme.<br />
At this point, it's all talk and no code, though — not a bad starting point, but I cannot claim much about the resulting scheme except that, given that this text ended up much longer and with many more asides than I originally envisioned, I must have put <i>some</i> thought into it.<br />
<br />
(<b>Update:</b> There is now a <a href="http://polymorphictypist.blogspot.com/2011/05/testing-distributed-store-algorithms.html">follow-up</a>. Still more thinking than code, though -- for my part, at least.)<br />
<br />
Oh, and by the way, I hope to be able to find the time to explain why I'm writing about Riak, all of a sudden.Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com0tag:blogger.com,1999:blog-2879132569917319939.post-81769302613727046162009-09-23T23:26:00.007+02:002009-09-24T15:08:28.445+02:00Program Composition and Transformation in UnixA few years ago, I got the feeling that language design could benefit from looking towards Unix and the lessons of the Unix culture.<br />
A definite source of inspiration for this was Eric S. Raymond's <a href="http://catb.org/%7Eesr/writings/taoup/">The Art of Unix Programming</a>, which among other things analyzes what the greatest strengths and most significant design decisions (or rather, design style elements) of Unix are.<br />
<br />
One of the qualities I'd most like to carry over from Unix to programming languages has to do with modularity - the ease with which one can combine small components into great things; components which in themselves are simple and easily understood.<br />
<br />
<a name='more'></a><br />
<br />
<b>Properties of the command line</b><br />
The Unix shell is a marvelous thing. It allows a reasonably skilled programmer to write many practical programs in one or a few lines, and in less than a minute; programs which in a "real" programming language would have taken considerably longer - and many of which would not have been written at all without this ease.<br />
The power and ease come mainly from Unix's <a href="http://en.wikipedia.org/wiki/Pipeline_%28Unix%29">pipeline</a>, the primary means of combining components into a greater whole. The pipeline has many interesting properties:<br />
<ul><li>Transparency/debuggability - output can be inspected after each prefix of the pipeline. This can even be done while running the full pipeline, with the help of <a href="http://unixhelp.ed.ac.uk/CGI/man-cgi?tee"><tt>tee</tt>(1)</a>.<br />
(<tt>tee</tt> even makes sense as the last part of the pipeline, where it lets you store your results and view them too.)<br />
(A rarely exploited - and perhaps not even widely known - fact is that <tt>tee</tt> may even be used to change a linear pipeline into a tree-shaped one - like this: <tt>echo foo | tee >(tr a-z A-Z) | rev</tt>)</li>
<li>A pipeline can be extended with additional processing steps.</li>
<li>Or it can be cut in half - to debug each half on its own; to reuse output from a common prefix of two or more pipelines; or to run one part of the pipeline on a different machine or at a later time.</li>
<li>The components of a Unix pipeline run concurrently - it's parallelism that comes for free, without one having to do anything for it, and without muddling the semantics.<br />
Even on single-core machines, this may be a win if I/O and processing can be performed in parallel.</li>
<li>Isolation/encapsulation - The components are loosely coupled. A malfunction in one of the components of a pipeline cannot spread to other components except through clearly defined ways (bad output; premature or infinitely delayed termination; resource depletion). In particular, one component can't corrupt the internal state of another.</li>
<li>Context independence/agnosticism - The components are not as a rule aware of each other's existence; their behaviour in the pipeline does not change if the pipeline is changed - or even if they are run by themselves, without the pipeline. (The exception is through performing the isatty() query operation on the standard input and output streams.)</li>
<li>Reusability - Many of the atoms of a typical command line, and sometimes larger parts of one, are reusable.</li>
<li>A program like <tt>grep</tt> may be one of the most successfully and frequently reused pieces of code around - it is used in new contexts thousands of times each day.<sup>1</sup> <br />
</li>
</ul>All in all, the Unix pipeline is a showcase for composability and reusability.<br />
<br />
Let's look at the mentioned strengths in the perspective of the two desirable properties I wrote of in <a href="http://polymorphictypist.blogspot.com/2009/09/programming-languages-and-local.html">my previous post</a>: guarantees and late binding.<br />
<br />
First, guarantees:<br />
<span style="font-size: x-small;">(Given a few simplifying assumptions: that none of the components depend on isatty() or on any timers, or on how much data each read/write call transfers; that no component reads files written to by other components appearing later in the pipeline, or in other ways introduces causal loops; that CPU and memory resources are not too scarce; and that we do not care about side effects by components which may be precluded by premature termination of later components. Did I miss any...?<br />
The majority of these assumptions are subsumed by this stronger version: That all components save the first are deterministic data transformations from standard input to standard output; many useful programs fulfill this criterion.)</span><br />
<ul><li>The components of a pipeline are guaranteed not to interfere with each other except through a few, well-defined ways.</li>
<li>The behaviour of components does not depend on whether their execution is performed in serial, or interleaved, or in parallel, nor, in the latter cases, on how the execution is interleaved or parallelized.</li>
<li>The behaviour of any prefix of a pipeline is guaranteed to be independent of the remainder of the pipeline.</li>
<li>Running a whole pipeline at once is guaranteed to be equivalent to running a prefix, storing the intermediate result, and later running the remainder with the intermediate result as input.</li>
<li>It is guaranteed that tee(1) can be inserted at any point in the pipeline without affecting its behaviour.</li>
<li>Running a whole pipeline on one machine is equivalent to running a prefix on one machine and the remainder of the pipeline on another machine (with input from the prefix), provided that any accesses to local resources by the remainder yield identical results in the two scenarios.<br />
(This equivalence rule is often used in the reverse direction: the first machine does not have access to a certain resource; therefore the pipeline is split up, so that it behaves *as if* the first machine did have access to the resource.)</li>
</ul><span style="font-size: x-small;">(Sorry about the abundance of lists - there will be a couple more.)</span><br />
As for late binding, the loose coupling of one pipeline component to the next is very much like late binding. I guess it does not as such count as a true late binding, because a source code (i.e. pipeline) change is after all usually necessary to change the binding (though it may not be; a pipeline component may be a Bash if-statement or depend on environment variables, <tt>$PAGER</tt> being a good example). However, the source changes in question are pretty trivial.<br />
<br />
It is because of the loose couplings that<br />
<ul><li><tt>tee</tt> can be inserted at any point, facilitating debugging;</li>
<li>a pipeline may be broken at any point, substituting storage of the intermediate result for parallel processing;</li>
<li>or broken at any point to let each half run on a separate machine;</li>
<li>processing steps can easily be added or removed.</li>
</ul><b>Black-box program transformations</b><br />
Various features of Unix provide other kinds of guarantees and late binding for a program.<br />
<ul><li>Each program has an owner; it runs with the permissions of a certain user. This implies certain guarantees about which operations the program may and may not perform; primarily which files and devices it may read from and write to, and whether it is allowed to perform operations restricted to the superuser.<br />
<br />
</li>
<li>Most programs depend on the values of environment variables. This provides late binding with respect to e.g. which language the application speaks, which character encoding it uses, and other localization information.<br />
Another use of this is for configuration of preferred service programs - external programs invoked by the application to perform certain tasks; the environment variables PAGER, EDITOR and CVS_RSH are examples of this use of late binding.<br />
<br />
</li>
<li>Certain programs, primarily servers, need for security reasons to be restricted with regard to which files they may access, but still need to perform certain superuser-only operations. The solution is to put the program in a <i>chroot jail</i> - to irreversibly change its notion of where the root of the filesystem is, in a way that guarantees that it can only ever access files within the corresponding subtree of the file system. This mechanism means that even if the program is compromised, the damage is contained.<br />
<br />
</li>
<li>For some purposes, it is useful to run a program with redefined versions of system functions. For instance, the development tool Electric Fence redefines the libc functions <tt>malloc</tt> and <tt>free</tt> with versions which make buffer overruns and similar errors more likely to be caught immediately (rather than e.g. cause crashes at arbitrary subsequent times). Unix enables such context modifications through the dynamic linker's environment variables <tt>LD_PRELOAD</tt> and <tt>LD_LIBRARY_PATH</tt> - which make the program late-bound to library implementations other than the default.<br />
<br />
</li>
<li>Finally, it is occasionally useful to follow the sequence of system calls that a program performs. <tt>LD_PRELOAD</tt> could be used for this purpose, to the extent that the program only calls system functions through a given shared library. However, this is not guaranteed. A more complete solution is provided through a tracing facility in Unix - <tt>ptrace</tt>(2), used by tools such as <tt>strace</tt>(1); this is a late-binding feature that instruments a program in order to obtain a trace.<br />
</li>
</ul><br />
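The <tt>$PAGER</tt> style of late binding mentioned above can be sketched in a few lines of shell. Note that <tt>view_file</tt> is a hypothetical function invented for illustration, not a standard command:<br />

```shell
# Late binding through an environment variable, in the style of $PAGER:
# the pager program is chosen at run time, not written into the script.
view_file() {
    "${PAGER:-less}" "$1"   # bound to a concrete pager only when called
}

tmp=$(mktemp)
echo "hello" > "$tmp"

# Rebind without touching the function's source:
PAGER=cat view_file "$tmp"   # prints: hello
rm "$tmp"
```

The binding can thus be changed per invocation, per session, or system-wide, without any change to the program itself.<br />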
<b>Conclusion</b><br />
To sum up, the Unix pipeline provides flexibility and other interesting properties in a form that has proven very convenient, intuitive and useful.<br />
Furthermore, Unix provides certain forms of encapsulation (access restrictions), as well as certain "program transformations", by relatively simple mechanisms (and through late binding). Some of these transformations effectively change the substrate on which the program runs - the implementation of low-level operations - which, in a pipeline analogy, can be likened to changing or injecting a component (for instance, adding tracing corresponds to adding a <tt>tee</tt> component).<br />
<br />
I find these features interesting from a programming language design perspective, because they are solutions to problems familiar to programmers, and because they are solutions which have proved to work rather well - and give a great deal of flexibility - through relatively simple means.<br />
<br />
<br />
<hr /><br />
<a href="http://www.blogger.com/post-create.g?blogID=2879132569917319939" name="erltermstat">1</a>: For illustration, here's one that can be used on a stream of Erlang-term-formatted property lists, to get a statistic of the values associated with a given key: <tt>program-emitting-erlang-terms | ggrep -Eo '{keyword,[^}]+}' | sort | uniq -c | sort -nk1</tt>Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com0tag:blogger.com,1999:blog-2879132569917319939.post-62553540862275250772009-09-18T01:29:00.002+02:002009-09-24T14:56:50.447+02:00Programming Languages and Local ReasoningThrough time, there have been two interesting trends in programming languages: towards <i>better guarantees</i> and <i>later binding</i>.<br />
<br />
By "guarantees", I mean facts and invariants which the language rules permit us to conclude from the code. Such as the following - you may or may not recognize them (and the language features which give rise to them):<br />
<ul><li>"The value in this variable is of this type."</li>
<li>"The value in this variable can only come from one of these expressions."</li>
<li>"The value of this object field can not be changed once the object has been constructed."</li>
<li>"This variable can only be accessed from one thread."</li>
<li>"This method can not be redefined in subclasses."</li>
<li>"This process will always be notified when this other process disappears."</li>
<li>"If this code point is reached, then this other code point will eventually be reached."</li>
</ul>The two trends, <i>better guarantees</i> and <i>later binding</i>, correspond to two desirable properties of software: the ability to reason about the behaviour of software, and flexibility.<br />
<br />
<a name='more'></a><br />
<br />
One important kind of guarantee is encapsulation.<br />
Encapsulation enables us to reason about a program's behaviour because it limits the number of places we have to look to find all the pieces of code that might affect the encapsulated entity.<br />
<br />
A rough evolution timeline might look like this:<br />
<br />
<b>Language Evolution</b> <br />
<dl><dt>Machine language / assembly code: </dt>
<dd>No encapsulation, no guarantees above what the hardware and OS provide, no late binding (unless you count self-modifying programs, which I find little reason to do; they rather count as examples of the absence of a very valuable guarantee).
</dd>
<dt>BASIC etc: </dt>
<dd>Still no encapsulation, but guarantees in the form of e.g. typed variables.
</dd>
<dt>Structured Programming: </dt>
<dd>Data encapsulation in the form of local variables ("At which program points can a given variable be accessed?"); control encapsulation in the form of less reliance on labels ("Which program points might be the predecessor of a given program point, and under which circumstances?").
</dd>
<dt>OOP: </dt>
<dd>Data encapsulation in the form of private variables. Late binding in the form of subtype polymorphism and virtual methods.
</dd></dl>Language evolution is not linear, of course. Just to keep our horizons broad and to point out that OOP is not the only and natural way forward (although it may <a href="http://www.sics.se/%7Ejoe/bluetail/vol1/v1_oo.html">dominate</a> today's language landscape), let's include some non-OOP paradigms - the functional one, and the one employed by Erlang (a concurrent, functional paradigm that in these respects yields much the same benefits as OOP does):<br />
<dl><dt>Functional programming: </dt>
<dd>Same as Structured Programming, plus: data encapsulation in the form of immutable values and variables, and late binding in the form of higher-order functions.
</dd>
<dt>Erlang: </dt>
<dd>Same as Functional Programming, plus: data encapsulation in the form of state private to a process; late binding in the form of weakly-coupled processes and callback modules. Further guarantees in the form of e.g. process link semantics.
</dd></dl>(Having worked professionally with Erlang for nearly two years has left an impression. And, come to think of it, that's a very good sign: if you work with any tool for a prolonged period of time, shouldn't it shape the way you think? Preferably in an inclusive, letting-you-see-new-connections kind of way, rather than letting-you-see-nothing-but-nails.)<br />
<br />
Late binding is very useful and associated with all kinds of sexy buzzwords like "reusability" and "hot upgrading" - but in the remainder of this article, I will focus on guarantees.<br />
<br />
<b>Language Guarantees</b><br />
Deciding which guarantees a language gives under which assumptions is a crucial part of language design. And the form of those assumptions is also important.<br />
<br />
As Dijkstra pointed out in his famous "<a href="http://www.cs.utexas.edu/%7EEWD/ewd02xx/EWD249.PDF">Notes on Structured Programming</a>" [PDF], our cogitational capacities are limited - and we must recognize this fact and shape our efforts accordingly.<br />
Regarding the tools we use to create computer programs, the following question therefore becomes important: Which kinds of reasoning are our minds strong at, respectively weak at?<br />
<br />
Now, it doesn't take an expert on cognitive psychology to guess that we're better suited for what I would call "local reasoning" - where few facts and rules, preferably "close" in some sense, are involved in coming to a conclusion - than we are at more "global" reasoning, with more numerous and/or widespread facts and rules.<br />
<br />
The most practical kind of conclusion is the kind where you, based on a sheetful or screenful of information, can say with confidence: yes, this property certainly holds.<br />
<br />
The less practical kind is where you need to go through your code base to ascertain that there are no violations of a certain property.<br />
<br />
In other words - in logic terminology - it's best if the premises of useful conclusions are existential ("does such-and-such a declaration exist here?" - answerable by pointing at a single spot) rather than universal ("does this property hold at every point in the code base?").<br />
Local reasoning, in other words. <a href="http://en.wikipedia.org/wiki/Baldur">Even gods occasionally have problems with global reasoning</a>.<br />
<br />
<b>Examples</b><br />
To take an example from Java, the keyword <tt>final</tt> in front of the definition of a class, a method or a variable signals clearly that we're allowed certain assumptions about the defined entity: it is guaranteed not to be subclassed/redefined in a subclass/have its value modified. That's a strong and useful guarantee, easily read from the code: there exists a <tt>final</tt> keyword before the declaration, therefore we can conclude such-and-such.<br />
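As a small sketch of how these guarantees read locally (the <tt>Point</tt> class is hypothetical, invented for illustration):<br />

```java
// 'final' makes local reasoning possible: each declaration alone tells
// the reader what cannot happen anywhere else in the program.
final class Point {       // guaranteed: no subclass redefines this class
    final int x;          // guaranteed: never reassigned after construction
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int manhattanFrom(Point other) {
        return Math.abs(x - other.x) + Math.abs(y - other.y);
    }
}
```

A reader holding only this screenful of code can already conclude that a <tt>Point</tt>, once constructed, never changes - no global search required.<br />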
<br />
An example of a language feature, also from Java, that is less well designed: synchronizing memory accesses between threads is done in Java with the keyword <tt>synchronized</tt>, which can appear in front of a method definition (or block statement). The presence of the keyword alone actually tells the reader of a program very little about the possible behaviours of the code block in question; most significantly, you cannot <i>locally</i> infer that while one thread is inside the synchronized block, no other thread will alter the value of any variables used in the block (except of course for local variables). It is entirely possible for other threads to e.g. execute other methods on the same object, simultaneously with the execution of the synchronized region.<br />
To get that kind of certainty, you need to ensure that <i>all</i> code points where the variable or variables in question are accessed are likewise within a synchronization block (which synchronize on the same object).<br />
<br />
In short, the <tt>synchronized</tt> language feature allows reasoning which reaches a positive conclusion only based on a "yes" answer to a "for all" question, which is less usable in practice - partly because the human mind is not overly good at working globally, enumerating all objects of a certain kind and checking that a property holds for each of them - and partly because, as code evolves, such non-local properties are more likely to break.<br />
<br />
For these reasons, it would make good sense to attach synchronization specifications to the vulnerable variables, instead of just to the pieces of code which manipulate them. More demanding on the compiler, but more developer-friendly (especially as it concerns an area - concurrency - which is tricky, and where many bugs are elusive and often found late in the development cycle).<br />
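The pitfall can be made concrete with a small hypothetical class: the keyword on <tt>increment()</tt> alone does not protect the field, because an unsynchronized method may run concurrently with it.<br />

```java
// 'synchronized' on increment() permits little local reasoning:
// the unsynchronized reset() may still modify 'count' while another
// thread is inside increment() and holds this Counter's monitor.
class Counter {
    private int count = 0;

    synchronized void increment() {  // holds this Counter's monitor
        count++;
    }

    void reset() {                   // takes no lock: may interleave
        count = 0;                   // freely with increment()
    }

    synchronized int get() {
        return count;
    }
}
```

To recover the guarantee, <tt>reset()</tt> - and every other access to <tt>count</tt> - would likewise have to be declared <tt>synchronized</tt>; that is precisely the "for all" obligation criticized above.<br />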
<b>/Examples</b><br />
<br />
In summary, one axis on which to evaluate a language feature is to which extent it provides guarantees - or takes them away (a feature like concurrency, for instance, usually invalidates certain assumptions) - and to which extent the provided guarantees can lead to useful conclusions about code, using just local reasoning.Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com0tag:blogger.com,1999:blog-2879132569917319939.post-47347073467489497972009-09-17T14:26:00.002+02:002009-09-24T14:57:23.564+02:00Introduction<b>About me:</b><br />
<br />
I have a background in theoretical computer science (Master's degree from University of Aarhus). I currently work as a software developer (developing in Erlang, Scala, and the odd Bash, Perl and C++).<br />
<br />
I've been involved - briefly - in many software projects; as bug reporter, coder, translator, or level designer. You could call me a hit-and-run contributor.<br />
The reason: I tend to have too many ideas of my own to explore to hang around any particular project for long. Perhaps not ideal, but that's how it's turned out.<br />
<br />
My main interests are programming languages and their implementations. That means primarily compilers, but also interpreters and runtime systems.<br />
<br />
<a name='more'></a><br />
<br />
It's been an interest for the last 15+ years, and I've tried designing languages of my own for almost as long.<br />
Most of that time, the ideas remained squarely inside the box - being acquainted with only the imperative paradigm doesn't help. So until recently, I have definitely been somewhat behind the curve.<br />
<br />
At some point, though, I realized that what I was doing was too much like what other people were doing elsewhere, with better resources.<br />
Consequently, anything not sufficiently avant-garde would be a waste of time...<br />
<br />
Since University, I find myself increasingly short of people with whom to discuss technical stuff and off whom to bounce ideas.<br />
A good time to start a blog, then... long overdue, actually.<br />
For outlet and for feedback. Well, outlet, at least; feedback is something you can only wish for. But writing does force you to organize your thoughts, and that's a start.<br />
<br />
<b>Disclaimers...</b><br />
<ul><li> I'm not a native English speaker.<br />
(I do, however, try to hide that fact as well as I can.)<br />
<br />
</li>
<li> I'm supposed to be a Master of Computer Science.<br />
However, given the size of the field, this does not make me an expert in all of its areas - practical or theoretical - nor indeed of every area that I might venture to write about.<br />
<br />
</li>
<li> Like so many others, I'm curious about some things (e.g., programming languages) and more rigid about other things (e.g., which OS's or editors I find myself being compatible with).<br />
There are even things I ought to be curious about which I've nevertheless given up on for now, mainly because of lack of spare time. </li>
</ul>Erik Søe Sørensenhttp://www.blogger.com/profile/04571315699824621252noreply@blogger.com0