EIP-615: Subroutines and Static Jumps for the EVM

Greg Colvin; Brooklyn Zelenka; Paweł Bylica; Christian Reitwiessner

Simple Summary

In the 21st century, on a blockchain circulating billions of ETH, formal specification and verification are an essential tool against loss. Yet the design of the EVM makes this unnecessarily difficult. Further, the design of the EVM makes near-linear-time compilation to machine code difficult. We propose to move forward with proposals to resolve these problems by tightening EVM security guarantees and reducing barriers to performance.

Abstract

EVM code is currently difficult to statically analyze, hobbling critical tools for preventing the many expensive bugs our blockchain has experienced. Further, none of the current implementations of the Ethereum Virtual Machine—including the compilers—are sufficiently performant to reduce the need for precompiles and otherwise meet the network's long-term demands. This proposal identifies dynamic jumps as a major reason for these issues, and proposes changes to the EVM specification to address the problem, making further efforts towards a safer and more performant the EVM possible.

We also propose to validate—in near-linear time—that EVM contracts correctly use subroutines, avoid misuse of the stack, and meet other safety conditions before placing them on the blockchain. Validated code precludes most runtime exceptions and the need to test for them. And well-behaved control flow and use of the stack makes life easier for interpreters, compilers, formal analysis, and other tools.

Motivation

Currently the EVM supports only dynamic jumps, where the address to jump to is an argument on the stack. Worse, the EVM fails to provide ordinary, alternative control flow facilities like subroutines and switches provided by Wasm and most CPUs. So dynamic jumps cannot be avoided, yet they obscure the structure of the code and thus mostly inhibit control- and data-flow analysis. This puts the quality and speed of optimized compilation fundamentally at odds. Further, since many jumps can potentially be to any jump destination in the code, the number of possible paths through the code can go up as the product of the number of jumps by the number of destinations, as does the time complexity of static analysis. Many of these cases are undecidable at deployment time, further inhibiting static and formal analyses.

However, given Ethereum's security requirements, near-linear n log n time complexity is essential. Otherwise, Contracts can be crafted or discovered with quadratic complexity to use as denial of service attack vectors against validations and optimizations.

But absent dynamic jumps code can be statically analyzed in linear time. That allows for linear time validation. It also allows for code generation and such optimizations as can be done in log n time to comprise an n log n time compiler.

And absent dynamic jumps, and with proper subroutines the EVM is a better target for code generation from other languages, including

Solidity
Vyper
LLVM IR
- front ends include C, C++, Common Lisp, D, Fortran, Haskell, Java, Javascript, Kotlin, Lua, Objective-C, Pony, Pure, Python, Ruby, Rust, Scala, Scheme, and Swift

The result is that all of the following validations and optimizations can be done at deployment time with near-linear (n log n) time complexity.

The absence of most exceptional halting states can be validated.
The maximum use of resources can be sometimes be calculated.
Bytecode can be compiled to machine code in near-linear time.
Compilation can more effectively optimize use of smaller registers.
Compilation can more effectively optimize injection of gas metering.

Specification

Dependencies

EIP-1702. Generalized Account Versioning Scheme. This proposal needs a versioning scheme to allow for its bytecode (and eventually eWasm bytecode) to be deployed with existing bytecode on the same blockchain.

Proposal

We propose to deprecate two existing instructions—JUMP and JUMPI—and propose new instructions to support their legitimate uses. In particular, it must remain possible to compile Solidity and Vyper code to EVM bytecode, with no significant loss of performance or increase in gas price.

Especially important is efficient translation to and from eWasm and to machine code. To that end we maintain a close correspondence between Wasm, x86, ARM and proposed EVM instructions.

EIP-615	Wasm	x86	ARM
JUMPTO	br	JMP	B
JUMPIF	br_if	JE	BEQ
JUMPV	br_table	JMP	TBH
JUMPSUB	call	CALL	BL
JUMPSUBV	call_indirect	CALL	BL
RETURN	return	RET	RET
GETLOCAL	local.get	POP	POP
PUTLOCAL	local.put	PUSH	PUSH
BEGINSUB	func
BEGINDATA	tables

Preliminaries

These forms

INSTRUCTION
INSTRUCTION x
INSTRUCTION x, y

name an INSTRUCTION with no, one and two arguments, respectively. An instruction is represented in the bytecode as a single-byte opcode. Any arguments are laid out as immediate data bytes following the opcode inline, interpreted as fixed length, MSB-first, two's-complement, two-byte positive integers. (Negative values are reserved for extensions.)

Branches and Subroutines

The two most important uses of JUMP and JUMPI are static jumps and return jumps. Conditional and unconditional static jumps are the mainstay of control flow. Return jumps are implemented as a dynamic jump to a return address pushed on the stack. With the combination of a static jump and a dynamic return jump you can—and Solidity does—implement subroutines. The problem is that static analysis cannot tell the one place the return jump is going, so it must analyze every possibility (a heavy analysis).

Static jumps are provided by

JUMPTO jump_target
JUMPIF jump_target
which are the same as JUMP and JUMPI except that they jump to an immediate jump_target rather than an address on the stack.

To support subroutines, BEGINSUB, JUMPSUB, and RETURNSUB are provided. Brief descriptions follow, and full semantics are given below.

BEGINSUB n_args, n_results
marks the single entry to a subroutine. n_args items are taken off of the stack at entry to, and n_results items are placed on the stack at return from the subroutine. The subroutine ends at the next BEGINSUB instruction (or BEGINDATA, below) or at the end of the bytecode.

JUMPSUB jump_target
jumps to an immediate subroutine address.

RETURNSUB
returns from the current subroutine to the instruction following the JUMPSUB that entered it.

Switches, Callbacks, and Virtual Functions

Dynamic jumps are also used for O(1) indirection: an address to jump to is selected to push on the stack and be jumped to. So we also propose two more instructions to provide for constrained indirection. We support these with vectors of JUMPDEST or BEGINSUB offsets stored inline, which can be selected with an index on the stack. That constrains validation to a specified subset of all possible destinations. The danger of quadratic blow up is avoided because it takes as much space to store the jump vectors as it does to code the worst case exploit.

Dynamic jumps to a JUMPDEST are used to implement O(1) jumptables, which are useful for dense switch statements. Wasm and most CPUs provide similar instructions.

JUMPV n, jump_targets
jumps to one of a vector of n JUMPDEST offsets via a zero-based index on the stack. The vector is stored inline at the jump_targets offset after the BEGINDATA bytecode as MSB-first, two's-complement, two-byte positive integers. If the index is greater than or equal to n - 1 the last (default) offset is used.

Dynamic jumps to a BEGINSUB are used to implement O(1) virtual functions and callbacks, which take at most two pointer dereferences on most CPUs. Wasm provides a similar instruction.

JUMPSUBV n, jump_targets
jumps to one of a vector of n BEGINSUB offsets via a zero-based index on the stack. The vector is stored inline at the jump_targets offset after the DATA bytecode, as MSB-first, two's-complement, two-byte positive integers. If the index is greater than or equal to n - 1 the last (default) offset is used.

Variable Access

These operations provide convenient access to subroutine parameters and local variables at fixed stack offsets within a subroutine. Otherwise only sixteen variables can be directly addressed.

PUTLOCAL n
Pops the stack to the local variable n.

GETLOCAL n
Pushes the local variable n onto the stack.

Local variable n is the nth stack item below the frame pointer, FP[-n], as defined below.

Data

There needs to be a way to place unreachable data into the bytecode that will be skipped over and not validated. Indirect jump vectors will not be valid code. Initialization code must create runtime code from data that might not be valid code. And unreachable data might prove useful to programs for other purposes.

BEGINDATA
specifies that all of the following bytes to the end of the bytecode are data, and not reachable code.

Structure

Valid EIP-615 EVM bytecode begins with a valid header. This is the magic number ‘\0evm’ followed by the semantic versioning number '\1\5\0'. (For Wasm the header is '\0asm\1').

Following the header is the BEGINSUB opcode for the main routine. It takes no arguments and returns no values. Other subroutines may follow the main routine, and an optional BEGINDATA opcode may mark the start of a data section.

Semantics

Jumps to and returns from subroutines are described here in terms of

The EVM data stack, (as defined in the Yellow Paper) usually just called “the stack.”
A return stack of JUMPSUB and JUMPSUBV offsets.
A frame stack of frame pointers.

We will adopt the following conventions to describe the machine state:

The program counter PC is (as usual) the byte offset of the currently executing instruction.
The stack pointer SP corresponds to the Yellow Paper's substate s of the machine state.
- SP[0] is where a new item is can be pushed on the stack.
- SP[1] is the first item on the stack, which can be popped off the stack.
- The stack grows towards lower addresses.
The frame pointer FP is set to SP + n_args at entry to the currently executing subroutine.
- The stack items between the frame pointer and the current stack pointer are called the frame.
- The current number of items in the frame, FP - SP, is the frame size.

Note: Defining the frame pointer so as to include the arguments is unconventional, but better fits our stack semantics and simplifies the remainder of the proposal.

The frame pointer and return stacks are internal to the subroutine mechanism, and not directly accessible to the program. This is necessary to prevent the program from modifying its own state in ways that could be invalid.

Execution of EVM bytecode begins with the main routine with no arguments, SP and FP set to 0, and with one value on the return stack—code_size - 1. (Executing the virtual byte of 0 after this offset causes an EVM to stop. Thus executing a RETURNSUB with no prior JUMPSUB or JUMBSUBV—that is, in the main routine—executes a STOP.)

Execution of a subroutine begins with JUMPSUB or JUMPSUBV, which

pushes PC on the return stack,
pushes FP on the frame stack
- thus suspending execution of the current subroutine,
sets FP to SP + n_args, and
sets PC to the specified BEGINSUB address
- thus beginning execution of the new subroutine.

Execution of a subroutine is suspended during and resumed after execution of nested subroutines, and ends upon encountering a RETURNSUB, which

sets FP to the top of the virtual frame stack and pops the stack,
sets SP to FP + n_results,
sets PC to top of the return stack and pops the stack, and
advances PC to the next instruction

thus resuming execution of the enclosing subroutine or main routine. A STOP or RETURN also ends the execution of a subroutine.

For example, starting from this stack,

_________________
      | locals      20 <- FP
frame |             21
______|___________  22
                       <- SP

and after pushing two arguments and branching with JUMPSUB to a BEGINSUB 2, 3

PUSH 10
PUSH 11
JUMPSUB beginsub

and initializing three local variables

PUSH 99
PUSH 98
PUSH 97

the stack looks like this

                    20
                    21
__________________  22
      | arguments   10 <- FP
frame |___________  11
      | locals      99
      |             98
______|___________  97
                       <- SP

After some amount of computation the stack could look like this

                    20
                    21
__________________  22
      | returns     44 <- FP
      |             43
frame |___________  42
      | locals      13
______|___________  14
                       <- SP

and after RETURNSUB would look like this

_________________
      | locals      20 <- FP
      |             21
frame |___________  22
      | returns     44
      |             43
______|___________  42
                       <- SP

Validity

We would like to consider EVM code valid iff no execution of the program can lead to an exceptional halting state, but we must validate code in linear time. So our validation does not consider the code’s data and computations, only its control flow and stack use. This means we will reject programs with invalid code paths, even if those paths are not reachable. Most conditions can be validated, and will not need to be checked at runtime; the exceptions are sufficient gas and sufficient stack. As such, static analysis may yield false negatives belonging to well-understood classes of code requiring runtime checks. Aside from those cases, we can validate large classes at validation time and with linear complexity.

Execution is as defined in the Yellow Paper—a sequence of changes in the EVM state. The conditions on valid code are preserved by state changes. At runtime, if execution of an instruction would violate a condition the execution is in an exceptional halting state. The Yellow Paper defines five such states.

1 Insufficient gas

2 More than 1024 stack items

3 Insufficient stack items

4 Invalid jump destination

5 Invalid instruction

We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose.

To handle the return stack we expand the conditions on stack size:

2a The size of the data stack does not exceed 1024.

2b The size of the return stack does not exceed 1024.

Given our more detailed description of the data stack we restate condition 3—stack underflow—as

3 SP must be less than or equal to FP

Since the various DUP and SWAP operations—as well as PUTLOCAL and GETLOCAL—are defined as taking items off the stack and putting them back on, this prevents them from accessing data below the frame pointer, since taking too many items off of the stack would mean that SP is less than FP.

To handle the new jump instructions and subroutine boundaries, we expand the conditions on jumps and jump destinations.

4a JUMPTO, JUMPIF, and JUMPV address only JUMPDEST instructions.

4b JUMPSUB and JUMPSUBV address only BEGINSUB instructions.

4c JUMP instructions do not address instructions outside of the subroutine they occur in.

We have two new conditions on execution to ensure consistent use of the stack by subroutines:

6 For JUMPSUB and JUMPSUBV the frame size is at least the n_args of the BEGINSUB(s) to jump to.

7 For RETURNSUB the frame size is equal to the n_results of the enclosing BEGINSUB.

Finally, we have one condition that prevents pathological uses of the stack:

8 For every instruction in the code the frame size is constant.

In practice, we must test at runtime for conditions 1 and 2—sufficient gas and sufficient stack. We don’t know how much gas there will be, we don’t know how deep a recursion may go, and analysis of stack depth even for non-recursive programs is nontrivial.

All of the remaining conditions we validate statically.

Costs & Codes

All of the instructions are O(1) with a small constant, requiring just a few machine operations each, whereas a JUMP or JUMPI typically does an O(log n) binary search of an array of JUMPDEST offsets before every jump. With the cost of JUMPI being high and the cost of JUMP being mid, we suggest the cost of JUMPV and JUMPSUBV should be mid, JUMPSUB and JUMPIF should be low, andJUMPTO and the rest should be verylow. Measurement will tell.

We suggest the following opcodes:

0xb0 JUMPTO
0xb1 JUMPIF
0xb2 JUMPV
0xb3 JUMPSUB
0xb4 JUMPSUBV
0xb5 BEGINSUB
0xb6 BEGINDATA
0xb7 RETURNSUB
0xb8 PUTLOCAL
0xb9 GETLOCAL

Backwards Compatibility

These changes would need to be implemented in phases at decent intervals:

1. If this EIP is accepted, invalid code should be deprecated. Tools should stop generating invalid code, users should stop writing it, and clients should warn about loading it.

2. A later hard fork would require clients to place only valid code on the block chain. Note that despite the fork old EVM code will still need to be supported indefinitely; older contracts will continue to run, and to create new contracts.

If desired, the period of deprecation can be extended indefinitely by continuing to accept code not versioned as new—but without validation. That is, by delaying or canceling phase 2.

Regardless, we will need a versioning scheme like EIP-1702 to allow current code and EIP-615 code to coexist on the same blockchain.

Rationale

This design was highly constrained by the existing EVM semantics, the requirement for eWasm compatibility, and the security demands of the Ethereum environment. It was also informed by the lead author's previous work implementing Java and Scheme interpreters. As such there was very little room for alternative designs.

As described above, the approach was simply to deprecate the problematic dynamic jumps, then ask what opcodes were necessary to provide for the features they supported. These needed to include those provided by eWasm, which themselves were modeled after typical hardware. The only real innovation was to move the frame pointer and the return pointer to their own stacks, so as to prevent any possibility of overwriting them. (Although Forth also uses a return stack.) This allowed for treating subroutine arguments as local variables, and facilitated the return of multiple values.

Implementation

Implementation of this proposal need not be difficult. At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only stacks for the frame pointers and return offsets and the few pushes, pops, and assignments described above. The bulk of the effort is the validator, which in most languages can almost be transcribed from the pseudocode above.

A lightly tested C++ reference implementation is available in Greg Colvin's Aleth fork. This version required circa 110 lines of new interpreter code and a well-commented, 178-line validator.

Appendix A

Validation

Validation comprises two tasks:

Check that jump destinations are correct and instructions valid.
Check that subroutines satisfy the conditions on control flow and stack use.

We sketch out these two validation functions in pseudo-C below. To simplify the presentation only the five primitives are handled (JUMPV and JUMPSUBV would just add more complexity to loop over their vectors), we assume helper functions for extracting instruction arguments from immediate data and managing the stack pointer and program counter, and some optimizations are forgone.

Validating Jumps

Validating that jumps are to valid addresses takes two sequential passes over the bytecode—one to build sets of jump destinations and subroutine entry points, another to check that addresses jumped to are in the appropriate sets.

    bytecode[code_size]   // contains EVM bytecode to validate
    is_sub[code_size]     // is there a BEGINSUB at PC?
    is_dest[code_size]    // is there a JUMPDEST at PC?
    sub_for_pc[code_size] // which BEGINSUB is PC in?

    bool validate_jumps(PC)
    {
        current_sub = PC

        // build sets of BEGINSUBs and JUMPDESTs
        for (PC = 0; instruction = bytecode[PC]; PC = advance_pc(PC))
        {
            if instruction is invalid
                return false
            if instruction is BEGINDATA
                break;
            if instruction is BEGINSUB
                is_sub[PC] = true
                current_sub = PC
                sub_for_pc[PC] = current_sub
            if instruction is JUMPDEST
                is_dest[PC] = true
            sub_for_pc[PC] = current_sub
        }

        // check that targets are in subroutine
        for (PC = 0; instruction = bytecode[PC]; PC = advance_pc(PC))
        {
            if instruction is BEGINDATA
                break;
            if instruction is BEGINSUB
                current_sub = PC
            if instruction is JUMPSUB
                if is_sub[jump_target(PC)] is false
                    return false
            if instruction is JUMPTO or JUMPIF
                if is_dest[jump_target(PC)] is false
                    return false
            if sub_for_pc[PC] is not current_sub
                return false
       }
       return true
    }

Note that code like this is already run by EVMs to check dynamic jumps, including building the jump destination set every time a contract is run, and doing a lookup in the jump destination set before every jump.

Subroutine Validation

This function can be seen as a symbolic execution of a subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertices and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached. The time complexity of this traversal is O(|E|+|V|): The sum of the number of edges and number of vertices in the graph.

The basic approach is to call validate_subroutine(i, 0, 0), for i equal to the first instruction in the EVM code through each BEGINDATA offset. validate_subroutine() traverses instructions sequentially, recursing when JUMP and JUMPI instructions are encountered. When a destination is reached that has been visited before it returns, thus breaking cycles. It returns true if the subroutine is valid, false otherwise.

    bytecode[code_size]     // contains EVM bytecode to validate
    frame_size[code_size ]  // is filled with -1

    // we validate each subroutine individually, as if at top level
    // * PC is the offset in the code to start validating at
    // * return_pc is the top PC on return stack that RETURNSUB returns to
    // * at top level FP = SP = 0 is both the frame size and the stack size
    // * as items are pushed SP get more negative, so the stack size is -SP
    validate_subroutine(PC, return_pc, SP)
    {
        // traverse code sequentially, recurse for jumps
        while true
        {
            instruction = bytecode[PC]

            // if frame size set we have been here before
            if frame_size[PC] >= 0
            {
                // check for constant frame size
                if instruction is JUMPDEST
                    if -SP != frame_size[PC]
                        return false

                // return to break cycle
                return true
            }
            frame_size[PC] = -SP

            // effect of instruction on stack
            n_removed = removed_items(instructions)
            n_added = added_items(instruction)

            // check for stack underflow
            if -SP < n_removed
                return false

            // net effect of removing and adding stack items
            SP += n_removed
            SP -= n_added

            // check for stack overflow
            if -SP > 1024
                return false

            if instruction is STOP, RETURN, or SUICIDE
                return true

            // violates single entry
            if instruction is BEGINSUB
                 return false

            // return to top or from recursion to JUMPSUB
            if instruction is RETURNSUB
                return true;;

            if instruction is JUMPSUB
            {
                // check for enough arguments
                sub_pc = jump_target(PC)
                if -SP < n_args(sub_pc)
                    return false
                return true
            }

            // reset PC to destination of jump
            if instruction is JUMPTO
            {
                PC = jump_target(PC)
                continue
            }

            // recurse to jump to code to validate
            if instruction is JUMPIF
            {
                if not validate_subroutine(jump_target(PC), return_pc, SP)
                    return false
            }

            // advance PC according to instruction
            PC = advance_pc(PC)
        }

        // check for right number of results
        if (-SP != n_results(return_pc)
            return false
        return true
    }

Appendix B

EVM Analysis

There is a large and growing ecosystem of researchers, authors, teachers, auditors, and analytic tools--providing software and services focused on the correctness and security of EVM code. A small sample is given here.

Some Tools

Some Papers

Copyright

Copyright and related rights waived via CC0.

Authors	Greg Colvin, Brooklyn Zelenka, Paweł Bylica, Christian Reitwiessner
Created	2016-12-10

EIP-615: Subroutines and Static Jumps for the EVM

Simple Summary ​

Abstract ​

Motivation ​

Specification ​

Dependencies ​

Proposal ​

Preliminaries ​

Branches and Subroutines ​

Switches, Callbacks, and Virtual Functions ​

Variable Access ​

Data ​

Structure ​

Semantics ​

Validity ​

Costs & Codes ​

Backwards Compatibility ​

Rationale ​

Implementation ​

Appendix A ​

Validation ​

Validating Jumps ​

Subroutine Validation ​

Appendix B ​

EVM Analysis ​

Some Tools ​

Some Papers ​

Copyright ​

Citation