Proposal Preview: Upgrading the Cannon Fault Proof VM to support 64-bit and multi-threading

Please note that this is a Proposal Preview and is not an actual proposal. It is meant to allow the Collective to provide feedback and ask questions before an official proposal is published.

Executive Summary

OP Labs is planning to submit a proposal to upgrade Cannon, the fault proof VM (a core part of the fault proof system), to a new version that supports the MIPS64 instruction set and can run multi-threaded programs. Both changes remove memory constraints for the fault proof program.

Motivation

OP Stack chains are scaling their throughput, led by Base, which has aggressive targets for continued gas limit increases. Larger blocks cause the fault proof program to use more memory, and the available memory is currently limited to a size that, while initially comfortable, is quickly becoming a constraint on desired gas limit increases.

If the memory limit is reached while proving a block then that block cannot be proven.

The upgraded version of Cannon solves this in two ways:

  1. Multi-threading support allows garbage collection by the Go runtime, which means much more efficient memory management. While processing the L2 blocks to be proven, the fault proof program allocates temporary memory. With garbage collection enabled, this memory can now be freed quickly, improving over the status quo in which memory continues to accumulate over the lifetime of the program. As blocks get larger, more memory accumulates.
  2. A 64-bit VM architecture has a much larger addressable memory. The current version is a 32-bit MIPS architecture which has 4 GiB of addressable memory. A 64-bit MIPS architecture can address 2^64 bytes (orders of magnitude more than is physically available in any system today).

This completely removes memory constraints for the fault proof program as a scaling limit for the foreseeable future.
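As a rough illustration of the difference in address space, the arithmetic behind the 4 GiB and 2^64 figures can be sketched in Go. The helper name addressableBytes is made up for this example and is not part of Cannon.

```go
package main

import (
	"fmt"
	"math/big"
)

// addressableBytes returns 2^bits, the number of distinct byte addresses
// reachable with a pointer of the given width.
// (Hypothetical helper for illustration only; not part of Cannon.)
func addressableBytes(bits uint) *big.Int {
	return new(big.Int).Lsh(big.NewInt(1), bits)
}

func main() {
	gib := new(big.Int).Lsh(big.NewInt(1), 30) // 1 GiB = 2^30 bytes
	for _, bits := range []uint{32, 64} {
		total := addressableBytes(bits)
		inGiB := new(big.Int).Div(total, gib)
		fmt.Printf("%d-bit: %s bytes (%s GiB)\n", bits, total.String(), inGiB.String())
	}
}
```

Running this prints 4 GiB for the 32-bit case and 17179869184 GiB (2^34 GiB) for the 64-bit case.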

Technical Details

The specification for the potential changes can be found in the Multithreaded Cannon Fault Proof Virtual Machine section of the OP Stack Specification.

Affected Components

This potential upgrade affects both the on-chain and off-chain components of Cannon which are generally run by chain operators (and any community members who may be running the infrastructure to permissionlessly challenge fault proofs). It does not impact node operators (there is no hard fork required).

  • Cannon
    • Smart Contract (MIPS.sol):
      • The on-chain version of Cannon, MIPS.sol, will be replaced with a new version: MIPS64.sol. The game types (defined in Types.sol) will remain the same (CANNON / 0 or PERMISSIONED_CANNON / 1).
    • Cannon Go VM (src)
      • The Cannon CLI has been updated to work with programs that target different versions of Cannon.
  • OP-Challenger
    • OP-Challenger, the honest actor in the fault proof system, uses the updated Cannon CLI tool to run the appropriate version of Cannon.
  • OP-Program
    • OP-Program will be compiled for the MIPS64 architecture.

VM Architecture Changes

Major changes to the VM architecture include:

  • The emulated CPU architecture is now MIPS64 instead of MIPS32:
    • Registers now hold 64-bit values.
    • Memory address space is dramatically expanded.
    • New 64-bit-specific instructions have been added for operations on 64-bit values.
  • Cannon now supports reading 8 bytes of data at a time from the pre-image oracle instead of 4.
  • Cannon now supports multi-threading:
    • Concurrency is achieved via multitasking, with pre-emption and scheduling, rather than true parallel processing.
    • The VM now tracks a set of ThreadState objects that represent the state of the CPU for each thread.
    • Thread-safe memory access is enabled by Load Linked Word (ll) and Store Conditional Word (sc) instructions that provide the low-level primitives used to implement atomic read-modify-write (RMW) operations.
    • Extended syscall support for multi-threading.
  • Unrecognized syscalls now raise exceptions, making behavior more predictable, rather than being treated as noops.
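To make the ll/sc bullet above more concrete, here is a minimal Go sketch of how a load-linked / store-conditional pair can be used to build an atomic read-modify-write. It is a toy model written for this post, not Cannon's implementation: the Memory type, its single-reservation design, the thread IDs, and the AtomicAdd helper are all assumptions made for illustration.

```go
package main

import "fmt"

// Memory is a toy memory with a single LL/SC reservation.
// (Illustrative assumption; Cannon's real state layout may differ.)
type Memory struct {
	words        map[uint64]uint32
	reserved     bool
	reservedAddr uint64
	reservedBy   int // id of the thread holding the reservation
}

func NewMemory() *Memory {
	return &Memory{words: make(map[uint64]uint32)}
}

// Load is an ordinary read.
func (m *Memory) Load(addr uint64) uint32 { return m.words[addr] }

// Store is an ordinary write; it invalidates a reservation on the same address.
func (m *Memory) Store(addr uint64, val uint32) {
	if m.reserved && m.reservedAddr == addr {
		m.reserved = false
	}
	m.words[addr] = val
}

// LoadLinked reads a word and records a reservation for the calling thread.
func (m *Memory) LoadLinked(thread int, addr uint64) uint32 {
	m.reserved, m.reservedAddr, m.reservedBy = true, addr, thread
	return m.words[addr]
}

// StoreConditional writes only if the caller's reservation is still valid.
// It reports whether the write happened and always clears the reservation.
func (m *Memory) StoreConditional(thread int, addr uint64, val uint32) bool {
	ok := m.reserved && m.reservedAddr == addr && m.reservedBy == thread
	m.reserved = false
	if ok {
		m.words[addr] = val
	}
	return ok
}

// AtomicAdd retries the ll/sc pair until the store succeeds.
func AtomicAdd(m *Memory, thread int, addr uint64, delta uint32) uint32 {
	for {
		old := m.LoadLinked(thread, addr)
		if m.StoreConditional(thread, addr, old+delta) {
			return old + delta
		}
	}
}

func main() {
	m := NewMemory()
	const counter = 0x1000

	// Thread 0 starts a read-modify-write, but thread 1 writes in between,
	// so thread 0's store-conditional fails and the value is not clobbered.
	v := m.LoadLinked(0, counter)
	m.Store(counter, 10)
	ok := m.StoreConditional(0, counter, v+1)
	fmt.Println(ok, m.Load(counter)) // false 10

	// Without interference, the retry loop succeeds on the first attempt.
	fmt.Println(AtomicAdd(m, 0, counter, 1)) // 11
}
```

The point of the sketch is that a failed store-conditional tells the thread its read is stale, so it retries instead of overwriting another thread's update.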

Security Considerations

Audit Results

The smart contract changes were audited by Spearbit and by Coinbase Protocol Security:

Risks

The potential failure modes are generally similar to the previous version, including but not limited to:

  • Incorrect Linux/MIPS emulation: bugs in the thread scheduler, incorrect emulation of MIPS64 instructions, and so on.
  • Unimplemented syscalls or opcodes needed by op-program: as with the previous version, we only aim to implement the syscalls and opcodes that op-program requires, so some remain unimplemented. The risk is that a previously untested code path uses an opcode or syscall that we haven’t implemented, and that this code path is exercised by some input condition in the future.
  • Mismatch between on-chain and off-chain execution: as with the previous version, there are two implementations that must produce identical results.
  • Livelocks in the fault proof: multi-threading introduces new failure modes such as livelocks. Based on our review of op-program and the Go runtime internals, we don’t expect to see this, but monitoring should be used to detect an execution of op-program that takes too long, which would indicate that a livelock is preventing progress.

Impact Summary

  • After the proposed upgrade, the on-chain implementation of the fault proof VM will be MIPS64.sol instead of MIPS.sol.
  • The off-chain fault proof infrastructure will use the new 64-bit multi-threaded version of Cannon.
  • New dispute game contract implementations will need to be deployed that point to the new MIPS64 VM and a new absolute prestate (compiled for MIPS64).
  • Chains can increase the gas limit to allow much larger blocks without the risk of the fault proof program (op-program) hitting limits of addressable memory in the VM when proving. (There may still be scalability limitations in other systems, of course.)
  • The VM itself is still subject to the memory constraints of the server that it’s running on. Its memory usage should be monitored, and it should be given sufficient memory.