Increase confidence in WebAssembly with formal verification

What is WebAssembly?

WebAssembly (often abbreviated as Wasm) is a low-level binary instruction format intended to be used as a build target for programs in web platforms. It is designed to be safe, efficient, compact and portable.

WebAssembly is defined as a virtual instruction set architecture (or ISA) to be executed in virtual machines such as V8 —the JavaScript engine used by Chromium-based web browsers—, or standalone runtimes like wasmtime. This means that high-level languages can implement a WebAssembly compilation target, so that you can compile functions written in your language of choice into WebAssembly bytecode. Then you can run that bytecode natively in a web browser, usually by calling it from JavaScript code. Not only does this open up the door for more language choices across the web development landscape, but WebAssembly is faster and lighter than JavaScript, since it's compiled instead of interpreted, and being a binary format means it has a much smaller footprint. Thus you can leverage the power of WebAssembly in the more computationally intensive parts of your web application.

WebAssembly programs are organized into modules, which are stored as files with the .wasm extension. These modules are primarily comprised of functions and other kinds of WebAssembly definitions that you can export so that they can be called from other languages (like JavaScript) as part of a web application.

As with programs written in any other programming language, WebAssembly programs can also go wrong. The kind of safety and sandboxing mechanisms that WebAssembly provides can prevent a lot of potentially dangerous unintended behavior, like inappropriate usage of the system's memory or malicious modules. However, that does not prevent programs from not doing what we expect them to do: for example, we can have a function that is supposed to sort an array of integers, but for some inputs the output list isn't sorted. That function can be generally considered safe, but it's buggy and this bad behavior can still propagate to the rest of the program and cause all sorts of undesired downstream behavior.

While our tolerance for this kind of bad behavior could vary depending on the kind of software we are developing, it's clear that we developers need to make an effort to ensure that our software adheres to a certain level of quality. In other words, we want to increase our confidence that the software we are developing does what we want it to do. This is something that we deeply value at Stack Builders.

How can we gain that confidence, then?

There are plenty of ways to increase the confidence in our software. Some of these are more general and abstract, and can be considered more like guidelines: for example, having a code review process, or following best practices that are standard to our programming language of choice, paradigm, or business domain.

When we are looking for more consistent, reproducible, and concrete techniques to check our programs' behavior, we also have various approaches to choose from. The most common technique would be automated testing. Another interesting technique that can yield helpful insights is program analysis (static or dynamic).

The approach we will focus on for the rest of this blog post is formal verification. Formal verification consists of using tools that let you prove that your program follows a given specification, which is a formal description of the expected behavior of the program. You can check out our blog post "Increasing confidence in your software with formal verification" for an introduction to this topic and some of the concepts that we will mention in the following sections. In that blog post, I also explain why formal verification is of particular interest as a very powerful approach for ensuring software correctness.

It's only natural that if we want to increase our confidence in WebAssembly programs, we would want to leverage some of the aforementioned techniques. Formal verification of WebAssembly programs is a relatively unexplored area. That was the topic of research for my master's thesis. As the result of those efforts, I present wasm-verify: a proof-of-concept formal verification tool for WebAssembly, with a design-by-contract approach that provides a specification language for expressing preconditions, loop invariants and postconditions. If this sounds like Greek to you, fret not, dear reader: this is best understood by looking at concrete examples.

Right; so show me those examples

To understand the examples, we first need to know how WebAssembly bytecode gets executed. Essentially, the bytecode execution model is that of a stack machine. This means that WebAssembly instructions work by pushing and popping values in a stack. While there are some other mechanisms besides the stack to store data and keep track of our program's state in WebAssembly, the stack plays the central role during code execution. We will only be using the stack and local variables (i.e. variables whose scope is limited to the current function we are executing) in the examples.

While WebAssembly is most commonly used as a target binary format by high-level languages, it also offers a low-level programming language similar to assembly language —hence the "WebAssembly" name— called WebAssembly Text Format (WAT). This textual representation of WebAssembly bytecode serves two purposes: 1. It allows WebAssembly programs to be more easily read by humans than bytecode (while not impossible, humans have a hard time thinking in terms of binary instructions and unstructured data!). 2. It allows you to write WebAssembly programs directly, without the need for a high-level language that compiles down to WebAssembly. We are going to use the WAT format (stored as files with .wat extension) to show the code of the WebAssembly modules in this section's examples.

Duplicate function

Now, let's take a look at our first example:

(module
   (export "dup" (func $dup))
   (func $dup (param $n i32) (result i32)
      local.get $n
      local.get $n
      i32.add
   )
)

This is a WebAssembly module that exports a function called dup, which receives a 32-bit integer as an argument and returns the duplicate of that number (as a 32-bit integer). To do this, the implementation pushes the value of the argument n to the top of the stack twice, and executes the add instruction that pops the two values at the top of the stack and adds them, effectively adding n to itself.

How can we write a specification to verify that the code does what we expect, namely, that the dup function duplicates the number passed as an argument? As we mentioned in the previous section, wasm-verify provides a specification language, called VerifiWASM. Here is what a VerifiWASM specification of the dup function would look like:

spec dup(n : i32) returns (x : i32) {
  ensures x == 2*n;
}

The first line in the specification is a declaration of our function, its arguments and its return value: we have a dup function that receives an argument n of type i32 and which returns a value x of type i32. The body of the specification consists of a single statement: a postcondition, which states a condition that our function dup must satisfy after it finishes executing. In this case, we are ensuring that x (the returned value) is equal to n (the argument of the function) multiplied by 2. Makes sense, right?

If we pass the dup.wasm module and the dup.verifiwasm specification to wasm-verify in order to verify that the function matches the specification, we get this output:

[✔] dup matches its specification.
──────────────────────────────────────
Verification successful! All 1 functions match their specifications.

Nice! wasm-verify was able to prove that dup always works as we expect it to. This is in contrast with conventional testing, which is limited to checking that the function behaves as expected for the specific inputs indicated in a test suite (since testing all possible inputs is in general unfeasible).

With property-based testing (PBT), you can also write similar properties, and in some PBT frameworks they are even called specifications, in the same fashion. The main difference is that PBT checks the properties for a reasonably bounded set of the possible inputs, whereas formal verification aims to prove that the property holds for any possible inputs (that are considered valid, i.e. that match the precondition). You can learn more about the theory behind how wasm-verify works in my master's thesis "Specification and verification of WebAssembly programs", as well as some of its current limitations in the Conclusions section.

Fibonacci function

Of course, the dup example is a very simple one. Let's take a look at a more interesting one: a WebAssembly module containing a fib_iter function that calculates the n-th number of the Fibonacci sequence using a loop and auxiliary variables. The code includes some comments just to make it easier to understand.

1.  (module
2.    (export "fib_iter" (func $fib_iter))
3.    (func $fib_iter (param $n i32) (result i32)
4.      (local $x i32)
5.      (local $y i32)
6.      (local $i i32)
7.      i32.const 0
8.      local.set $x ;; x := 0
9.      i32.const 1
10.     local.set $y ;; y := 1
11.     i32.const 0
12.     local.set $i ;; i := 0
13.     ;; while i < n:
14.     ;;   x, y = y, x + y
15.     ;;   i = i + 1
16.     block
17.       loop
18.         local.get $i
19.         local.get $n
20.         i32.ge_s
21.         br_if 1 ;; is i >= n? if so, exit the loop 
22.         local.get $y
23.         local.get $x
24.         local.get $y
25.         i32.add
26.         local.set $y ;; y := x + y
27.         local.set $x ;; x := y (old value)
28.         local.get $i
29.         i32.const 1
30.         i32.add
31.         local.set $i ;; i := i + 1
32.         br 0 ;; go to next loop iteration
33.       end
34.     end
35.     local.get $x
36.     return
37.   )
38. )

The fib_iter function receives a 32-bit integer argument n. The first thing it does is declare three local variables (lines 4-6): - i: this will store the loop counter, which will increase on each iteration. - x: this will store the temporary result of the i-th Fibonacci number (will change on each iteration). - y: this will store the temporary result of the (i+1)-th Fibonacci number (will change on each iteration).

Then (lines 7-12), the function pushes the constant 0 to the top of the stack, and subsequently pops the value from the top of the stack (i.e. the 0 that was just pushed) and assigns that value to the local variable x: this will be the initial value of x (it's 0, since F(0) = 0). The same thing is done for y (it's initial value is 1, since F(1) = 1) and i (it's initial value is 0 since that will be our first iteration).

Now, we enter the loop (starting after lines 16 and 17, denoted by the combination of the block and loop instructions): it first pushes the values of i (the iterator) and n (the function's argument) to the top of the stack. An instruction is executed that pops the two values from the top of the stack and checks if the second value popped (i) is greater than or equal to the first value popped (n); the result of this comparison is then pushed to the top of the stack. Depending on this result, we perform a conditional jump (the br_if instruction): if in this iteration we have a condition where i >= n, it means we have reached the n-th Fibonacci number and we exit the loop; otherwise the jump is ignored and the following instructions are executed.

The rest of the iteration consists of updating variables x and y with the next number in the Fibonacci sequence. To get this, we need to add the values of x and y. On line 22 we push the value of y to the top of the stack; we will make use of this later. Lines 23-25 perform the addition of x and y: push the values of x and y to the top of the stack, and call add which pops those two values and adds them, storing the result on top of the stack. With the set instruction we pop this value and store it in variable y: this will be the value of F(i+1) in the next iteration (which corresponds to F(i+2) in the current iteration). However, we still need to set the value of F(i) for the next iteration, which should be F(i+1) of the current iteration: this is the reason we pushed the old value of y to the top of the stack on line 22. Since after the previous operations we are left with it on top of the stack, we simply pop it and assign it to variable x (line 27). Lines 28-31 simply increase the value of the iteration counter i by one. Finally, on line 32 with the br instruction we perform an unconditional jump to the beginning of the loop, which starts the next iteration.

Phew, that was a mouthful! Well, now that we've seen the code, how can we write a specification? In other words, what do we expect from the behavior of the fib_iter function on a higher-level? It's a bit trickier than with dup:

spec fib_iter(n : i32) returns (r : i32) {
  requires n >= 0;
  local x : i32, y : i32, i : i32;
  assert before 8: 0 <= i && i <= n &&
                   x == ghost_fib(i) && y == ghost_fib(i+1);
  ensures r == ghost_fib(n);
}

ghostdef ghost_fib(n : i32) returns integer {
  if n < 1 then
    0 
  else if n < 3 then
    1
  else
    ghost_fib(n - 1) + ghost_fib(n - 2)
}

That's a lot to take in. It all boils down to one simple fact: we want to ensure that our fib_iter function always returns a value that corresponds to the n-th Fibonacci number. But that means we need some way to reference the Fibonacci numbers to compare the WebAssembly function's results.

Enumerating the Fibonacci numbers up to a given number is impractical, so our best bet is by computing them. How? Well, with a function, of course. This function will be defined in our specification, and exists completely detached from our WebAssembly function. We are going to use a conceptually simpler, recursive implementation. We are calling this function ghost_fib, and it's declared by using the ghostdef keyword. Spooky! The reason behind calling it "ghost" is because these are called ghost functions (or ghost code) in the field of verification. The "ghost" word is in reference to the fact that this code/function doesn't exist in the actual code we are trying to verify: it's merely a construct to help us define specifications and aid in the verification process.

Back to the specification of our fib_iter function, we have three key components: - A precondition: this is a condition that must hold upon calling the fib_iter function. Since Fibonacci numbers are not defined for negative numbers, to ensure correctness of the fib_iter function, we require the function to be called with an argument ≥0. - A loop invariant: this is a condition that must hold during every iteration of the loop. This is key to proving that fib_iter is correct. We know that fib_iter is correct if its return value corresponds to F(n). In terms of how fib_iter works, at each iteration we use the Fibonacci numbers from the previous iteration to calculate the Fibonacci numbers for the current iteration; this means that we depend on the numbers from the previous iteration being correct so that the numbers we are going to calculate in the current iteration are correct. The way we encode this condition in the specification for fib_iter is by stating that the iteration counter i must fall between 0 and n, and that the value of the local variables x and y must equal F(i) and F(i+1) respectively (considering ghost_fib(n) our specification's equivalent to F(n) being the purely mathematical function for the n-th Fibonacci number). To reference the local variables found in our WebAssembly code, we need to explicitly declare them (and their types) in our specification: we accomplish this through the previous line (local ...). - A postcondition: finally, this is the condition that must hold after calling the function. It specifies the actually observable behavior of the function since it's a condition on the return values of the function. With this kind of verification process, what we ultimately want to check is that the postcondition holds which means that the values the function returns are the ones it should. The rest of our specification (including precondition and loop invariants) serve as a means to being able to prove the postcondition. Now that we have specified how we expect the fib_iter function to behave, we can run our WebAssembly module through wasm-verify:

[✔] fib_iter matches its specification.
──────────────────────────────────────
Verification successful! All 1 functions match their specifications.

Success! The function behaves as described for all inputs that satisfy the precondition!

We have shown a couple of succesful verification processes so far, but how would a failing one look like? Let's say we modify the postcondition for fib_iter to be ensures r == ghost_fib(n+1);. This is the output we would get from wasm-verify:

[✘] fib_iter does not match its specification.
──────────────────────────────────────
Verification failed! Of all 1 functions, there were 1 that didn't match their specification.
These are some possible causes:
  - The specification is correct and complete: this means the function is incorrect, and should be fixed.
  - The specification is complete but incorrect: the function could be correct,
    but it didn't match the specification because the specification is wrong, and should be fixed.
  - The specification is incomplete: maybe you are calling a WebAssembly function that doesn't have a spec,
    or maybe you are using a recursive ghost function that doesn't terminate in some part in the spec.

The tool states that fib_iter doesn't match its specification, and presents the user with some ideas of what might be wrong. If we assume that the specification is correct, which is the ideal case, then that means that our function is wrong and we need to address our function's bad behavior. However, it could be the case that the function works as expected, but our specification is incorrect, in the sense that it doesn't accurately reflect our expectations. Another posssibility is that the specification is incomplete and doesn't provide enough information so that wasm-verify can determine whether the function satisfies the postcondition or not: this would be the case if we remove the precondition from fib_iter's specification, for example.

When putting formal verification tools to use, they usually operate under the assumption that the specification is our source of truth and that they are correct. But just as happens with testing, where you can have tests that fail because the test cases are wrong (while the function they are testing works properly), specifications written by humans can also be malformed. I will briefly touch on how this should —in theory— not be that much of an issue with wasm-verify compared to some other verification tools out there, if the tool kept evolving in the future, in the following Conclusions section.

Conclusions

This is a proof-of-concept to show how one can formally verify WebAssembly functions. If you want to learn more about the design, theory, and core ideas behind the implementation of wasm-verify, you can find more in my master's thesis "Specification and verification of WebAssembly programs", which drove the implementation of wasm-verify.

As a tool, wasm-verify is far from production ready. It's currently limited to a subset of WebAssembly, and doesn't support some of WebAssembly's features such as global variables, linear memory and tables. It also doesn't accurately model underflow/overflow behavior, so the correctness results you can get primarily apply to values that fall within the natural range of the numeric types.

Another thing to note is that the correctness results you can currently get from wasm-verify are of partial correctness: this means that when it says that your functions are correct, it assumes that your functions terminate (i.e. the function doesn't get stuck during execution and eventually returns some value). To have total correctness, one must also provide a proof that the function terminates, which is currently unsupported in wasm-verify. This is not as bad as it sounds, since partial correctness is still a guarantee that your functions will behave as expected if they terminate.

Some of the future work that I have mentioned so far and more can be found in the Conclusions chapter of the thesis. wasm-verify's GitHub repo's issues also outline some smaller scoped things that could be worked on to improve what's already there.

One usually doesn't write WebAssembly code by hand but rather it gets generated from code in a high-level language. When working on the development of wasm-verify and its specification language VerifiWASM, the idea was that, in the same way, you wouldn't write specifications by hand but rather they would get generated from the high-level language code. wasm-verify would then constitute a platform to help bring verification to different languages that have a WebAssembly compilation target. To accomplish this, developers of these high-level languages would have to write a plugin or tool to generate VerifiWASM specifications from code and annotations, in exchange for not having to write a verification tool like wasm-verify from the ground up for each of the high-level languages. This is why I mentioned before that we as users shouldn't be too concerned about incorrect or incomplete specifications, since the idea is that the user wouldn't need to write them by hand. The feedback that wasm-verify provides in this regard when running WebAssembly modules and specifications through it would be geared towards implementors of these tools that generate VerifiWASM specifications.

To sum up: wasm-verify serves as a starting point for formal verification of WebAssembly code, whose vision is to serve as a potential leverage for formal verification of other high-level languages. For simpler functions that don't make heavy usage of the more advanced WebAssembly features, it is already usable, by writing the VerifiWASM specifications manually as we have shown. Being fully transparent: I do not currently have the time to keep developing wasm-verify or a VerifiWASM specification generator tool, but as its current maintainer I welcome contributions of all kinds in its repository.

If you liked this blog post, make sure to check out the other blog posts that we have on the Stack Builders blog. Until next time!

Disclaimer: while formal verification of WebAssembly was the topic I researched and developed for my master's thesis, I am not a WebAssembly expert. I apologize in advance for any potential errors or missed details.

Increasing confidence in WebAssembly code with formal verification

Increasing confidence in your software with formal verification

POV - You’re a Software Engineer at Stack Builders, Spain (Part 2)

POV - You’re a Software Engineer at Stack Builders, Spain (Part 1)