OP Labs Audit Framework: When to get external security review and how to prepare for it

Hi all, maurelian from OP Labs here. In this post, we are sharing an internally developed framework for identifying code which we believe should be audited prior to production deployment.

Our purposes for publishing this framework are twofold:

  1. Providing an example which other contributors to the OP Stack may take into account when deciding how to get audits.
  2. Seeking feedback from the community. Since Optimism Governance is the ultimate decider of what gets shipped to production, it’s important to ensure that the efforts made by core contributors like ourselves align with community expectations.

The Framework

This framework provides guidance to teams at OP Labs who are writing security-critical code for the OP Stack and are interested in obtaining external review, including:

  • A security review or audit from a security firm.
  • A community audit competition from a platform such as Sherlock.
  • A bug bounty.

It will describe what code should, from our perspective, be audited, and how to prepare for an audit.

Philosophy Behind External Security Audits

It is important to underscore that security starts with the developers. An audit should not be seen as a means of “purchasing a security guarantee” from an external vendor.

What a security vendor provides is:

  1. Feedback and insight into your ability to write secure code.
  2. A final assurance about the specific code you have developed.
  3. A public artifact attesting to the security of the system.

What code should be audited?

Audits are expensive and time consuming, so we recommend using them in a targeted manner.

Generally, audits are better suited for some kinds of codebases, specifically those where Safety is of Existential importance.

The following rubric divides projects along Liveness vs. Safety and Reputational vs. Existential axes.

The rubric above is not exhaustive, but should give a sense of the framework OP Labs uses to determine what requires an audit.

Most OP Stack code fits into one of the following buckets:

  1. Secure via testing and real world usage:
    • This applies largely to infrastructural code, referring to components such as op-node, op-geth, op-batcher as well as alternative implementations of such components.
    • In general, audits are less valuable or necessary for infrastructural code because:
      • Experience has shown us that auditors provide less value with infrastructural code, especially relative to what is learned by getting code running on a testnet or putting code in users’ hands.
      • When issues do occur in infrastructural code, they can be addressed through ‘social consensus’, i.e., by quickly disseminating new software with a bug fix.
  2. Secure with extreme caution (including auditing):
    • This applies largely to smart contract code, in particular in cases where assets are being secured.
    • Examples include modifications to bridge code, changes to deposit derivation code, multi-sig wallet modifications, etc.
    • In general, audits are necessary for smart contract code because damage resulting from vulnerabilities is more likely to be irrecoverable.

There are likely to be situations where smart contract code does not require an audit, and where infrastructure code should receive an audit. Every case requires a fact-specific analysis about whether an audit should be required.

And while there is no “magic formula” that works in all cases, analysis according to the Existential vs. Reputational x Liveness vs. Safety framework is more important than simply differentiating between infrastructure and smart contract code.
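
As a rough illustration of the two axes, here is a minimal Python sketch. The enum names, the function name, and the returned labels are all hypothetical and are not part of the framework itself. The only rule it encodes is the one stated above, that audits are best suited to code where Safety is of Existential importance; every other combination is deliberately left to the fact-specific analysis described above.

```python
from enum import Enum


class FailureKind(Enum):
    # Usual meanings of the terms: a liveness failure means the system
    # stalls or becomes unavailable; a safety failure means the system
    # does something it must never do.
    LIVENESS = "liveness"
    SAFETY = "safety"


class Stakes(Enum):
    REPUTATIONAL = "reputational"
    EXISTENTIAL = "existential"


def audit_priority(kind: FailureKind, stakes: Stakes) -> str:
    """Return a coarse, hypothetical label for a component's worst-case failure.

    Only the Safety x Existential case is taken from the post; the
    fallback label is an assumption standing in for case-by-case review.
    """
    if kind is FailureKind.SAFETY and stakes is Stakes.EXISTENTIAL:
        return "audit strongly indicated"
    return "fact-specific analysis"


if __name__ == "__main__":
    for kind in FailureKind:
        for stakes in Stakes:
            print(f"{kind.value} x {stakes.value}: {audit_priority(kind, stakes)}")
```

A component would be classified by its worst plausible failure, not by whether it happens to be infrastructure or smart contract code.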

Preparation for external security review

Once again, secure code starts with the development teams that write it. We recommend completing the following steps for a given codebase prior to seeking external review.

  1. Enumerate the invariants and security properties of the system

    This step can be completed however the development team wishes, though teams should note that it is an important input to the next two items.

  2. Documentation for security researchers

    Teams should write security-specific docs which describe the high-level security properties of the system. A good example of this documentation is what OP Labs provided to the Sherlock auditors for the Bedrock audit.
    Documentation for security researchers should outline:

    • The behavior that the system is expected to maintain.
    • The negative situations the system should prevent.
    • Known issues which are considered low risk and are out of scope.
  3. Test coverage assessment

    Developers should evaluate coverage of the system from two angles:

    1. Code coverage
      For each testing method, a test coverage report should be generated. This report should be reviewed to identify any gaps in coverage.
      We should aim for 100% coverage of critical code paths. We do not need to be sticklers about testing trivial code paths like basic getters and such.

    2. Property coverage

      For each security property, a qualitative assessment of the test coverage should be done. It should answer the following questions:

      • Has this property been thoroughly tested?
      • Which testing methods have been used to test this property? Are there other testing methods we could apply with a reasonable amount of effort?
      • Are there any edge cases relevant to this property which are not fully tested?

    Page 13 and onwards of this Trail of Bits report may serve as a useful template for this assessment.
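
One possible way to keep the property coverage assessment reviewable is to record it as data. The sketch below is only an illustration: the `SecurityProperty` class, the `coverage_gaps` function, the `min_methods` threshold of 2, and the example property names are all assumptions made here, not something OP Labs prescribes. It lists each security property with the testing methods applied to it and any known untested edge cases, then flags the properties that deserve another look.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SecurityProperty:
    """One invariant or security property from the enumeration step."""
    name: str
    testing_methods: List[str] = field(default_factory=list)
    untested_edge_cases: List[str] = field(default_factory=list)


def coverage_gaps(
    properties: List[SecurityProperty], min_methods: int = 2
) -> Dict[str, List[str]]:
    """Map each under-tested property name to the reasons it was flagged.

    min_methods is an arbitrary threshold chosen for this sketch.
    """
    gaps: Dict[str, List[str]] = {}
    for prop in properties:
        reasons = []
        if len(prop.testing_methods) < min_methods:
            reasons.append(
                f"only {len(prop.testing_methods)} testing method(s) applied"
            )
        if prop.untested_edge_cases:
            reasons.append(
                "untested edge cases: " + ", ".join(prop.untested_edge_cases)
            )
        if reasons:
            gaps[prop.name] = reasons
    return gaps


if __name__ == "__main__":
    # Hypothetical properties, for illustration only.
    props = [
        SecurityProperty(
            name="a withdrawal can be finalized at most once",
            testing_methods=["unit", "fuzz"],
        ),
        SecurityProperty(
            name="only the owner can pause the bridge",
            testing_methods=["unit"],
            untested_edge_cases=["ownership transferred mid-transaction"],
        ),
    ]
    for name, reasons in coverage_gaps(props).items():
        print(name, "->", "; ".join(reasons))
```

The output of such a check is a prompt for the qualitative questions above, not a substitute for answering them.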


We believe that this framework provides a strong foundation for both development teams and governance in evaluating the necessity of an external review of OP Stack code.


This is very helpful. I like the 2 OP Stack buckets, but what do you prioritize if you have a time or resource constraint? Or do you have any other priority system?

Many of us delegates are non-technical, and even the technical ones don’t have the expertise or time to examine a complex codebase like the OP Stack. So, a diversity of external audits and testing reviews is key to our decisions. In the future, when there’s a protocol upgrade proposal, I’d like to see a suite of reviews before the proposal is votable.