
Commit e443b43

pauldowman and mds1 authored
FMA for Cannon updates for Go 1.23 and Kona (#279)
* Begin FMA for Cannon updates for Go 1.23 and Kona
* Add failure modes to the go 1.23 and Kona FMA
* Fix FMA formatting and text changes
* Include mprotect syscall in Cannon go 1.23 FMA
* Set FMA status to In Review
* Add suggested updates for go 1.23 and kona FMA
* Add audit info to Cannon go 1.23 and Kona changes FMA
* Updated TOC for Cannon updates for Go 1.23 and Kona FMA
* Mark the "resolve comments" to-do item as done.

  Co-authored-by: Matt Solomon <[email protected]>
* Mark Cannon Go 1.23 FMA as Final

  Co-authored-by: Matt Solomon <[email protected]>

---------

Co-authored-by: Matt Solomon <[email protected]>
1 parent 6e1e044 commit e443b43


1 file changed

+127
-0
lines changed

@@ -0,0 +1,127 @@
1+
# Cannon updates for Go 1.23 and Kona: Failure Modes and Recovery Path Analysis
2+
3+
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
4+
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
5+
6+
- [Introduction](#introduction)
7+
- [Failure Modes and Recovery Paths](#failure-modes-and-recovery-paths)
8+
- [FM1: Toggles are incorrectly deployed or implemented causing features to be incorrectly toggled off](#fm1-toggles-are-incorrectly-deployed-or-implemented-causing-features-to-be-incorrectly-toggled-off)
9+
- [FM2: Stack depth-related refactoring with new dclo/dclz instructions introduced a bug](#fm2-stack-depth-related-refactoring-with-new-dclodclz-instructions-introduced-a-bug)
10+
- [FM3: New dclo/dclz instructions are incorrectly implemented](#fm3-new-dclodclz-instructions-are-incorrectly-implemented)
11+
- [FM4: Incomplete Go 1.23 support (missing syscalls)](#fm4-incomplete-go-123-support-missing-syscalls)
12+
  - [FM5: eventfd or mprotect noop insufficient for Go 1.23 support](#fm5-eventfd-or-mprotect-noop-insufficient-for-go-123-support)
13+
- [Generic items we need to take into account:](#generic-items-we-need-to-take-into-account)
14+
- [Action Items](#action-items)
15+
- [Audit Requirements](#audit-requirements)
16+
17+
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
18+
19+
_Italics are used to indicate things that need to be replaced._
20+
21+
| | |
22+
| ------------------ | -------------------------------------------------- |
23+
| Author | Paul Dowman |
24+
| Created at | 2025-05-02 |
25+
| Initial Reviewers | Meredith Baxter |
26+
| Need Approval From | Matt Solomon |
27+
| Status | Final |
28+
29+
> [!NOTE]
30+
> 📢 Remember:
31+
>
32+
> - The single approver in the “Need Approval From” must be from the Security team.
33+
> - Maintain the “Status” property accordingly. An FMA document can have the following statuses:
34+
> - **Draft 📝:** Doc is created but not yet ready for review.
35+
> - **In Review 🔎:** Security is reviewing, and Engineering is iterating on the design. A checklist of action items will be created during this phase.
36+
> - **Implementing Actions 🛫:** Security has signed off on the content of the document, including the resulting action items. Engineering is responsible for implementing the action items, and updating the checklist.
37+
> - **Final 👍:** Security will transition the status of the document to Final once all action items are completed.
38+
39+
> [!TIP]
40+
> Guidelines for writing a good analysis, and what the reviewer will look for:
41+
>
42+
> - Show your work: Include steps and tools for each conclusion.
43+
> - Completeness of risks considered.
44+
> - Include both implementation and operational failure modes
45+
> - Provide references to support the reviewer.
46+
> - The size of the document will likely be proportional to the project's complexity.
47+
> - The ultimate goal of this document is to identify action items to improve the security of the project. The FMA review process can be accelerated by proactively identifying action items during the writing process.
48+
49+
## Introduction
50+
51+
This document covers updates to Cannon (both the Solidity and Go implementations) to support Go 1.23 and to support running Kona.
52+
53+
Below are references for this project:
54+
55+
- [Go 1.23 PR](https://github.com/ethereum-optimism/optimism/pull/14692)
56+
- [New instructions for Kona PR](https://github.com/ethereum-optimism/optimism/pull/15601)
57+
- [Add feature toggling to MIPS VM contracts PR](https://github.com/ethereum-optimism/optimism/pull/15487)
58+
59+
## Failure Modes and Recovery Paths
60+
61+
**_Use one sub-header per failure mode, so the full set of failure modes is easily scannable from the table of contents._**
62+
63+
### FM1: Toggles are incorrectly deployed or implemented causing features to be incorrectly toggled off
64+
65+
- **Description:** A [feature toggle](https://github.com/ethereum-optimism/optimism/pull/15487) was added. The contract could be deployed with the wrong version, which would leave features incorrectly toggled off.
66+
- **Risk Assessment:** low
67+
- **Mitigations:**
68+
1. The version number is checked in the constructor, and currently it's required to be 7 (the latest version) so we shouldn't be able to deploy MIPS64.sol with the wrong version.
69+
  2. This logic is fairly simple: it is just a check against the version number to enable features, so it is easy to reason about and has a low risk of being implemented incorrectly.
70+
- **Detection:** We have manually reviewed for this.
71+
- **Recovery Path(s)**: This would require a contract upgrade.
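
The two mitigations above can be sketched as follows. This is only an illustration in Go of the idea, not the code in MIPS64.sol: the names `Features`, `NewFeatures`, `featuresFor`, and `dclzDcloSinceVersion`, and the mapping of a feature to version 7, are assumptions made for the example.

```go
package main

import "fmt"

// latestVersion mirrors the rule in the text that the version must currently be 7.
const latestVersion = 7

// dclzDcloSinceVersion is an assumption for illustration only: the version
// from which the hypothetical toggle below is enabled.
const dclzDcloSinceVersion = 7

// Features is a hypothetical set of toggles derived from the version number.
type Features struct {
	SupportDclzDclo bool // hypothetical toggle name
}

// NewFeatures plays the role of the constructor check: any version other
// than the latest one is rejected, so a wrong version cannot be deployed.
func NewFeatures(version uint64) (Features, error) {
	if version != latestVersion {
		return Features{}, fmt.Errorf("unsupported version %d: want %d", version, latestVersion)
	}
	return featuresFor(version), nil
}

// featuresFor enables each feature with a plain comparison against the
// version number, which is what keeps the toggle logic easy to reason about.
func featuresFor(version uint64) Features {
	return Features{SupportDclzDclo: version >= dclzDcloSinceVersion}
}
```

In this sketch, `NewFeatures(6)` returns an error, while `NewFeatures(7)` returns a value with the toggle enabled.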
72+
73+
### FM2: Stack depth-related refactoring with new dclo/dclz instructions introduced a bug
74+
75+
- **Description:** Arguments were consolidated into a struct to avoid "stack too deep" issues. This refactoring could have introduced a bug.
76+
- **Risk Assessment:** low
77+
- **Mitigations:**
78+
  1. We have comprehensive differential testing on all VM instructions between the Go and Solidity implementations, which should catch any refactoring-related bugs. In this case only the Solidity code changed, so the unchanged Go implementation serves as a reference, which gives us confidence that the refactor did not introduce a bug.
79+
  2. This is a trivial refactoring.
80+
- **Detection:** We rely on our tests.
81+
- **Recovery Path(s)**: It would require fixing the bug and upgrading the contract.
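
A minimal sketch of the differential-testing idea, assuming a toy instruction set: `execPositional` stands in for an unchanged reference implementation and `execStruct` for the refactored one that takes its arguments as a struct. All names and opcodes here are hypothetical, and the real tests compare the Go and Solidity VMs rather than two Go functions.

```go
package main

import "fmt"

// execArgs is a hypothetical struct bundling what used to be positional arguments.
type execArgs struct {
	Opcode uint32
	Rs, Rt uint64
}

// execPositional stands in for the unchanged reference implementation.
func execPositional(opcode uint32, rs, rt uint64) uint64 {
	switch opcode {
	case 0:
		return rs + rt
	case 1:
		return rs - rt
	case 2:
		return rs & rt
	default:
		return 0
	}
}

// execStruct stands in for the refactored implementation.
func execStruct(a execArgs) uint64 {
	switch a.Opcode {
	case 0:
		return a.Rs + a.Rt
	case 1:
		return a.Rs - a.Rt
	case 2:
		return a.Rs & a.Rt
	default:
		return 0
	}
}

// diffTest runs every case through both implementations and reports the
// first case on which they disagree.
func diffTest(cases []execArgs) error {
	for _, c := range cases {
		want := execPositional(c.Opcode, c.Rs, c.Rt)
		got := execStruct(c)
		if got != want {
			return fmt.Errorf("opcode %d rs=%#x rt=%#x: got %#x, want %#x",
				c.Opcode, c.Rs, c.Rt, got, want)
		}
	}
	return nil
}
```

Because one side is left untouched, any mismatch points at the refactored side.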
82+
83+
### FM3: New dclo/dclz instructions are incorrectly implemented
84+
85+
- **Description:** There are two new instructions (dclo and dclz), and there could be a bug in their implementation. They aren't used by op-program, but would be used if we ever deployed Kona on Cannon.
86+
- **Risk Assessment:** low
87+
- **Mitigations:**
88+
  1. These instructions aren't emitted by the Go compiler, so their behavior should not affect the VM when running op-program.
89+
  2. If we ever do deploy Kona on Cannon, we will do more testing, including running it on mainnet data for weeks in VM Runner.
90+
- **Detection:** The program would crash if it used those instructions and they were incorrectly implemented.
91+
- **Recovery Path(s)**: It would require fixing the bug and upgrading the contract.
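
For reference, MIPS64 defines DCLZ as the count of leading zero bits in a 64-bit register and DCLO as the count of leading one bits, each yielding 64 when every bit matches. The sketch below expresses those semantics with Go's `math/bits` package. It is not Cannon's implementation, and the function names `dclz` and `dclo` are chosen only for this example.

```go
package main

import "math/bits"

// dclz counts the leading zero bits of a 64-bit value.
// An all-zero input yields 64.
func dclz(rs uint64) uint64 {
	return uint64(bits.LeadingZeros64(rs))
}

// dclo counts the leading one bits of a 64-bit value by inverting the
// input and counting leading zeros. An all-ones input yields 64.
func dclo(rs uint64) uint64 {
	return uint64(bits.LeadingZeros64(^rs))
}
```

The edge cases worth testing in any implementation are the all-zero and all-ones inputs, where the result is 64 rather than a bit index.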
92+
93+
### FM4: Incomplete Go 1.23 support (missing syscalls)
94+
95+
- **Description:** It's possible that the Go 1.23 compiler uses additional syscalls that we haven't noticed and they aren't implemented.
96+
- **Risk Assessment:** low
97+
- **Mitigations:**
98+
  1. We have been running `op-challenger-runner` on production data for several weeks with the new VM.
99+
  2. We used `vm-compat`, a tool that runs in CI and detects new syscalls referenced in the op-program binary.
100+
- **Detection:** We will continue to watch `op-challenger-runner` and will be alerted if any mainnet blocks fail.
101+
- **Recovery Path(s)**: It would require fixing the bug and upgrading the contract.
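
The kind of check described in the second mitigation can be pictured as a set difference between the syscalls a binary references and the syscalls the VM supports. The sketch below is an assumption about the general idea, not how `vm-compat` works internally. The function name `unsupportedSyscalls` is hypothetical, and any syscall lists passed to it are illustrative rather than Cannon's actual supported set.

```go
package main

import "sort"

// unsupportedSyscalls returns the referenced syscall names that are missing
// from the supported set, de-duplicated and sorted for stable CI output.
func unsupportedSyscalls(referenced []string, supported map[string]bool) []string {
	seen := map[string]bool{}
	var missing []string
	for _, name := range referenced {
		if !supported[name] && !seen[name] {
			seen[name] = true
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}
```

A CI job built on this idea would fail whenever the returned slice is non-empty, which surfaces a newly referenced syscall before it reaches production.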
102+
103+
### FM5: eventfd or mprotect noop insufficient for Go 1.23 support
104+
105+
- **Description:** The eventfd and mprotect syscalls were implemented as no-ops because it was determined that they won't be used by op-program, even though the binary references them.
106+
- **Risk Assessment:** medium
107+
- **Mitigations:**
108+
  1. We have been running `op-challenger-runner` on production data for several weeks with the new VM.
109+
- **Detection:** We rely on our tests.
110+
- **Recovery Path(s)**: It would require fixing the bug and upgrading the contract.
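
To make "implemented as a no-op" concrete, the sketch below shows a toy syscall dispatcher in which eventfd and mprotect report success without touching VM state. Everything here is hypothetical: the `state` type, the `handleSyscall` function, dispatch by name rather than by syscall number, and the `grow_heap` case, which exists only as a contrast with a state-changing call.

```go
package main

import "fmt"

// state is a hypothetical, heavily simplified VM state.
type state struct {
	Heap uint64
}

// handleSyscall returns the value that would be placed in the result register.
func handleSyscall(s *state, name string, size uint64) (uint64, error) {
	switch name {
	case "grow_heap":
		// Hypothetical state-changing syscall, shown for contrast.
		s.Heap += size
		return s.Heap, nil
	case "eventfd", "mprotect":
		// No-op: report success (0) and leave the VM state untouched.
		return 0, nil
	default:
		return 0, fmt.Errorf("unsupported syscall %q", name)
	}
}
```

The risk named in this failure mode is that a program actually depends on the side effects of such a call, in which case returning success alone would not be enough.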
111+
112+
### Generic items we need to take into account:
113+
114+
See [generic hardfork failure modes](./fma-generic-hardfork.md) and [generic smart contract failure modes](./fma-generic-contracts.md).
115+
Incorporate any applicable failure modes with FMA-specific mitigations and detections directly into this document.
116+
117+
- [x] Check this box to confirm that these items have been considered and updated if necessary.
118+
119+
## Action Items
120+
121+
Below is what needs to be done before launch to reduce the chances of the above failure modes occurring, and to ensure they can be detected and recovered from:
122+
123+
- [x] Resolve all comments on this document and incorporate them into the document itself (Assignee: document author)
124+
125+
## Audit Requirements
126+
127+
These changes were audited as part of [this larger Spearbit review](https://github.com/ethereum-optimism/optimism/blob/49a80f8054cf59be69624416160cad760f09c692/docs/security-reviews/2025_05-Interop-Portal-Spearbit.pdf) and [by Coinbase Protocol Security](https://github.com/ethereum-optimism/optimism/blob/49a80f8054cf59be69624416160cad760f09c692/docs/security-reviews/2025_05-Cannon-Go-Updates-Coinbase.pdf).
