diff --git a/protocol/greenlandian-holocene-archival.md b/protocol/greenlandian-holocene-archival.md new file mode 100644 index 00000000..4084d68b --- /dev/null +++ b/protocol/greenlandian-holocene-archival.md @@ -0,0 +1,158 @@ +# Project Greenlandian + +# Purpose + +Make the OP-Stack less complicated by archiving pre-Holocene blocks. + +# Summary + +Holocene brought important OP-Stack protocol simplifications. +The Holocene upgrade simplifications are currently only forward-looking: +new blocks are batch-submitted and confirmed with leaner more stable rules. +The history of pre-Holocene data still requires major derivation-pipeline complexity in the stack. + +Holocene is the "warm and stable" geological epoch we live in today. +"Greenlandian" is the first Age of the Holocene, +named after ice-layers that effectively archived the transition out of the ice age with high resolution. + +This design doc proposes a small project to archive the pre-Holocene block data, +so it no longer has to be processed by the op-node, allowing major tech-debt cleanup, +and unblocking greenfield ( ;) ) implementations (Rust!). + +No breaking protocol changes are required, this work is platforms-focused. + +# Problem Statement + Context + +The L2 needs the historical chain inputs to re-process blocks for archive-node purposes. + +L2 blocks may be retrieved from multiple different sources: +- reproduce from L1 data +- import as ready L2 block data + +L1 blobs have a limited data-availability time duration. +To challenge an invalid claim of L2 outputs on L1, the state-transition needs to be proven. +For this to be proven by anyone, the input data needs to be available, to reproduce the computation. + +Actors with an interest in the system are assumed to have access to an older already widely agreed-upon state, +but need access to the last few weeks of input data permissionlessly to not rely on the L2 sequencer, +which can otherwise make false output claims against their interest. + +Given this limited availability period, there already exists a widely-used archiving strategy today. +An archive that is verified against L1 block commitments which are agreed upon. + +Archiving the blobs and re-running derivation is increasingly less and less sustainable: +- Blobs may contain different L2s. It is more data than any single chain may need. + Limiting archival just to 1 inbox contract helps, + but this is limited to a single-chain inbox design, incompatible with shared inbox designs. +- Derivation also relies on L1 receipt data, which thus also needs to be archived. + Many L1 nodes prune older receipts. Receipt availability today is slowly disappearing, + with work such as EIP-4444 moving to archive it. +- Legacy derivation adds significant complexity to new implementations, + and is significant maintenance burden, hurting implementation progress with legacy design choices. + Blob archives are not useful to new implementations that aim to avoid the pre-Holocene derivation complexity. + +## Trust assumption + +The L2 block archival solution trust-model is equivalent with the L1 blob archival trust model, +with one new minor theoretical assumption: +**The canonical L2 chain can be established by the node, +with a delay from the actual tip of the chain as long as blobs are available for.** + +Without *any* blob archival, this is `MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS (4096 L1 beacon epochs)`, +or approximately 18 days. With some blob archival, this can be increased. + +In practice L2s have a bridge with widely accepted finality of around a week, +determining a canonical chain as user, at 2-3x the time during node bootstrapping, should be of relatively ease. + +## `op-node` cleanup + +Once we archive pre-Holocene blocks, we can remove and cleanup the op-node implementation as following: +- Remove legacy parts of the `rollup/engine` package that maintains fragile forkchoice state workarounds such as + "backup unsafe L2 blocks" and "pending safe blocks". +- Remove derivation-stage multiplexing from the `rollup/derive` package. +- Remove the legacy `ChannelBank` and `BatchQueue` logic. + +Cleanup of the `engine` package allows us to better reason about the execution-engine failure cases +and maintain a simplified forkchoice state for better extensibility. + +Cleanup of the `derive` package allows us to cut some of the most complicated derivation code, +simplifying maintenance and the fault-proof that is built on top of this. + +# Proposed Solution + +If we are deprecating pre-Holocene derivation, +we need a way to reliably archive and distribute the post-derivation work: the L2 blocks themselves. + +Archival of execution-layer blocks is already standardized on L1, +something the L2 stack should already be compatible with. + +Using the L1 Ethereum ERA file format we are able to archive the chain in a universal cross-client ethereum format. + +In op-geth, we should be able to run two geth commands: `export-history`, `import-history`. + +```bash +op-geth export-history --datadir=... --datadir.ancient=...