Learning the Webassembly type system
Over the last week I have embarked on a new mission for webassemblyjs: Write a type checker^{1}. Webassemblyjs is an open source JavaScript toolchain for Webassembly to which I have been contributing quite a bit^{2} ^{3}. Toolchain here simply means that we develop a collection of packages that help JavaScript code deal with Webassembly code. For a browser to run Webassembly code it needs to parse and validate it first. Since the language is statically typed, this validation step includes type checking. The specification on validation can look quite intimidating at first, but it turns out that Webassembly is designed in a way to make validation as easy (and fast) as possible.
While the discussions and explorations of type systems can become really abstract and sometimes seem to drift off into some esoteric, mathematical (yet fascinating) metaworlds, my goal today is to understand things on a very handson level. I will not bother to understand things in their most general form, but in the context of the webassembly virtual machine.
What we expect from a type checker
So I want to write a type checker. This is a piece of code which ensures that a given program obeys the type rules defined by the language^{4}. It computes a simple function (pseudocode):
typeChecker(program) > [typeErrors]
The result is simply a list of type errors. If the list is empty, the program is welltyped. A simple example of a type error:
(module
(func (result i32)
(f32.const 0)
)
)
The function has a return type of i32
but the actual result is of type f32
.
My type checker should detect this and report a type error.
This is because Webassembly is a stack machine.
The last value pushed in a function body becomes its return value.
For a nice introduction to stack machines I suggest Sven’s article “Imagine the computer is a stack”^{5}.
While not all type errors are this simple, detecting them fortunately does not become much harder than this.
The key thing to realize is for a function to be welltyped, two things need to happen:
 Every instruction in the function body has to match with the current stack.
 At the end of the function the stack needs to match its result type.
Instruction types
In order to check these things we need a description for each instruction saying what stacks it can be applied to and what the resulting stack after the instruction looks like. Guess what, this is exactly what the type of an instruction is. In the specification they are denoted as function types^{6}:
Let’s make this more concrete by looking at types of some simple instructions:
These are all directly derived from the fact that Webassembly is a stack machine: Instructions pop their operands off the stack and push their results onto the stack^{7}.
Why do we need “type rules” instead of just types?
Now you might wonder, why we need typing rules.
Can we not just give the types for all instructions as above?
If you are not sure, pause here and try to come up with the type of get_global
.
Since it pushes a global value onto the stack,
its type should be \( [] \rightarrow [t] \) where \( t \) is the type of the global.
Therefore its type depends on some property of the module.
There are many such instructions,
think of function calls for example:
The type of call
should be the type of the function it calls.
So the question we need to answer is not “What is the type of this instruction?”, but “What is the type of this instruction in this module at this point?”. A typing rule provides the answer to this question by saying something of the following form:
"If you know x about the module, then instruction y has type z."
Since a specification needs to be precise and natural language is not, usually some form of mathematical notation is used for type rules. The Webassembly specification has both, so it is a perfect place to practice or even learn reading the formal notation. All you need to know is that premises (what you know) is above and conclusions (the type of an instruction) are below the horizontal line^{8}. \( C \) denotes the module context and the exact syntax for it is not important to get the gist.
"If the function x is of type t, then instruction call x is of the same type."
"The instruction t.const results in type t."
"The instruction t.add takes two values of type t and results in type t."
"The instruction t.eq takes two values of type t and results in an i32 value."
So these fancy typing rules turn out to be pretty simple in most cases. All of above can be found in the validation section^{9} of the specification. It also contains some more puzzling aspects like control flow, unreachable code and traps which I might cover in a future post.

https://medium.com/@svensauleau/imaginethecomputerisastack91456cde51da ↩

https://webassembly.github.io/spec/core/syntax/types.html#functiontypes ↩

The result of
eq
isi32
because that is what Webassembly uses for boolean values. ↩ 
https://webassembly.github.io/spec/core/valid/instructions.html ↩