Description
I'm creating this issue to keep a list of tasks that will need to be done after #15294 is merged before we can consider EOF support "non-experimental".
The list is non-exhaustive so far and not ordered by importance. Please just edit in additional tasks (or comment) for anything still missing.
General/Miscellaneous
- Topological ordering of code blocks (see #15633)
- Proper errors due to exceeding EOF limits (too many containers, too long functions, too many arguments, too many immutables, too long jumps, stack overflow/underflow, unreachable code, etc.).
  - With test coverage both via Yul and via high-level Solidity code.
- `loadimmutable()` on EOF.
- `msize()` on EOF.
- Unoptimized compilation on EOF.
- EOF opcode support in gas meter.
- Proper workarounds for cases of "out of reach" rjumps (e.g. using intermittent jumps, outlining or splitting out control flow using `JUMPF`)
- Refactor parts of `Assembly::assemble` to better split into EOF/non-EOF parts (see e.g. Implementation of EOF in Solidity [DO NOT MERGE] #15294 (comment) and Implementation of EOF in Solidity [DO NOT MERGE] #15294 (comment))
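As a rough illustration of what "out of reach" means for rjumps: per EIP-4200, `RJUMP` and `RJUMPI` carry a 16-bit signed immediate that is relative to the position right after the instruction. The sketch below checks whether a jump fits that range. The function names and the byte-offset model are assumptions made for illustration and are not taken from the compiler's code base.

```python
# Minimal sketch, assuming code positions are plain byte offsets within one code section.
RJUMP_SIZE = 3  # 1 opcode byte + 2-byte signed immediate (EIP-4200)
INT16_MIN = -(1 << 15)
INT16_MAX = (1 << 15) - 1


def rjump_offset(jump_pc: int, target_pc: int) -> int:
    """Relative offset as encoded in the immediate: measured from the byte after the jump."""
    return target_pc - (jump_pc + RJUMP_SIZE)


def is_in_reach(jump_pc: int, target_pc: int) -> bool:
    """True if the target can be encoded in a single RJUMP/RJUMPI immediate."""
    return INT16_MIN <= rjump_offset(jump_pc, target_pc) <= INT16_MAX


if __name__ == "__main__":
    print(is_in_reach(0, 100))      # short forward jump
    print(is_in_reach(0, 40_000))   # too far for one relative jump
```

A jump that fails this check is one that would need one of the workarounds listed above.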
Language Level Changes
Solidity:
- Properly reject non-salted creation during Analysis (when targeting EOF)
- Ensure proper analysis treatment of various high-level constructs with EOF, including:
  - Disallow `gas` in call options (#15638).
  - Disallow `type(C).creationCode`, `type(C).runtimeCode`, `<address>.code`, `<address>.codehash`, `<address>.transfer()`, `<address>.send()`, `gasleft()` and `selfdestruct()`.
  - low level calls (disallow; reintroduce for EOF-low-level-calls - consider either another unified version that works for both targets or implementing the EOF-style version for legacy as well, if at all possible). As implemented now (#15559) low-level calls work for both targets.
  - Disallow abicoder v1 pragma on EOF.
- Builtin address calculation for salted creation (compatible with both EOF and legacy) (rejected/impossible)
- Revert on calls to uninitialized external function pointers (unless we get `EXTCODETYPE` and can keep the more general check against calls to accounts without code).
- In-language construct to check for EOF (like a global bool `isEOF`)? Option to define both legacy and EOF specific assembly blocks? (Ideally we're fine without this, but will need to check demand from library authors)
- Allow deploying custom subcontainers (like https://gist.github.com/frangio/940f21778f411154dc58ff818c712929)
  - Related to the above: consider if we need to disallow `verbatim` for targeting EOF (since we cannot perform stack height analysis) and can/need to replace with full custom subcontainers. To retain `verbatim`, it'd need to declare its max stack effects, if used with EOF.
- Decision: should `auxdataloadn()` be exposed in inline assembly?
- Decision: should `returncontract()` be exposed in inline assembly?
Yul:
- Expose a more complete set of Yul builtins for accessing the data section (only `auxdataload()` available now).
  - We have no builtin for accessing Yul data objects by name.
- Decision: should the names of non-EOF instructions and builtins remain reserved in Yul on EOF?
- Decision: should the names of EOF instructions and builtins become reserved in Yul on legacy?
EVM assembly:
- Support for remaining EOF opcodes: `DATALOAD`, `DATASIZE`, `DATACOPY`, `RETURNDATALOAD`, `RJUMPV`, `DUPN`, `SWAPN`, `EXCHANGE`.
Noteworthy Documentation Entries
- Compile a list of breaking changes, similar to what we did for IR codegen.
  - Difference in address calculation in salted creation and
  - Constructor arguments no longer affecting salted-create address.
  - `EXTCODESIZE`/`EXTCODETYPE` check and behavior of uninitialized external function pointers.
  - `gas` no longer allowed in call options.
  - Builtins no longer available in Solidity.
  - Opcodes/builtins no longer available in inline assembly and Yul.
  - Existence of implicit limits on some language constructs resulting from EOF size constraints.
  - `msg.data` in constructor no longer empty; now contains arguments.
- Document the EOF-Yul builtins.
  - Including new restrictions in how nested objects and data can be referenced.
- Document the EOF opcodes.
- Check for required documentation changes on all changing output artifacts.
- Check general introductions for legacy/EOF specific parts.
Compiler Output (and Input) Artifacts
- Settle the details on CBOR metadata. (Where? -> Likely beginning of data; Should we tweak the format?) See List of metadata improvements sourcify#1523
  - May involve documenting that e.g. `--no-cbor-metadata` induces code changes (in particular dataload offsets) - similar to prerelease vs release, due to the length of the cbor-encoded version string.
- Generate proper source mappings.
- Properly define and output assembly text and assembly json for EOF.
  - Arguments of `EXCHANGE`: as raw values or slot numbers?
  - Section and sub IDs: as numbers or identifiers? (see eof: new contract creation #15512 (comment), eof: new contract creation #15512 (comment))
  - Distinguishing immediates from stack arguments (`{}`? see eof: new contract creation #15512 (comment))
  - Code sections.
- After the above allow importing EOF assembly json.
- (Minor) print `JUMPDEST` as `NOP` for EOF assembly/opcode.
- Make sure the "immutable references" output remains accurate - and decide whether it can/should be reused for information within the data section or we should have a new artifact for that instead.
  - Also, restore validation against reading a non-existent immutable.
Optimizations
- libevmasm optimizer steps: most importantly `BlockDeduplicator`, but also `JumpdestRemover`, `Inliner`, `PeepholeOptimiser`, `CommonSubexpressionEliminator`, `ConstantOptimiser`.
- Use `RJUMPV` for dense Yul switches; investigate trying to turn sparse switches into dense switches (especially for the external dispatch)
- `CALLF RETF` -> `JUMPF` peephole optimizer rule; tail-call optimization in codegen
- Consider potentially new peephole optimizer rules (e.g. certain chains of swaps to exchange)
- Adjust Yul inlining heuristics
- Allow stack depth larger than 16 with `SWAPN`/`DUPN` during Yul->EVM code transform and remove all stack-to-memory logic (see eof: Make use of EOF's SWAPN/DUPN #15844)
- Exploit `EXCHANGE` for stack shuffling
- Consider creating a version of the libevmasm low-level inliner that can inline EOF function calls accordingly. (Very low-priority - most can be expected to be done on the Yul level - would be more relevant if we backported EOF to legacy codegen)
- Relax `RETURNDATACOPY` restrictions in the optimizer.
- Exploit `RETURNDATALOAD`
- Sub-assembly deduplication on EOF (Do not duplicate subassemblies. #13804). Cannot be done during assembling like in legacy, but should be doable in the optimizer.
- Removal of unreferenced `data` objects.
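To make the `CALLF RETF` -> `JUMPF` item more concrete, here is a minimal sketch of such a peephole rule over a toy instruction list. The `(mnemonic, immediate)` tuple representation, the `LABEL` pseudo-instruction and the function name are hypothetical and do not reflect libevmasm's actual data structures. The sketch assumes that any jump target is modelled as an explicit `LABEL` item, so a `RETF` that is itself a jump target is never fused away.

```python
from typing import List, Optional, Tuple

# Hypothetical toy representation: (mnemonic, immediate or None).
Instruction = Tuple[str, Optional[int]]


def fuse_tail_calls(code: List[Instruction]) -> List[Instruction]:
    """Replace each directly adjacent `CALLF n; RETF` pair with `JUMPF n`."""
    result: List[Instruction] = []
    i = 0
    while i < len(code):
        mnemonic, immediate = code[i]
        next_is_retf = i + 1 < len(code) and code[i + 1][0] == "RETF"
        if mnemonic == "CALLF" and next_is_retf:
            # The callee returns straight into our RETF, so jump to it instead.
            result.append(("JUMPF", immediate))
            i += 2
        else:
            result.append(code[i])
            i += 1
    return result


if __name__ == "__main__":
    before = [("PUSH0", None), ("CALLF", 2), ("RETF", None)]
    print(fuse_tail_calls(before))
```

A pair separated by a `LABEL` is left untouched, since other code may still jump to that `RETF`. A real rule would also have to keep the section's declared stack metadata consistent, which this sketch ignores.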
Testing
Note: Obviously any changes in all other points need proper testing. But beyond that generally:
- Set up fuzzing pipelines for EOF vs legacy (needs robustness against minor differences in e.g. salted creation addresses)
- Try to unify tests (e.g. if not already done, refactor tests that depend on address calculations (e.g. move into a testing contract)); try to reduce test copies between legacy and EOF in general, whenever possible.
- Run external tests on EOF.
- Gas measurements in semantic tests on EOF. Consider also adding some gas tests (disabled in #15653).
- Review correctness of properties of EOF opcodes in `SemanticInformation.cpp` and add more coverage via `functionSideEffects` tests (disabled in #15654).
- Correct codehash calculation for `EOFCREATE` in `EVMHost` (see #15635)
- Enable SMTChecker tests on EOF (disabled in #15659).
- More coverage for EOF use in inline assembly.
Breaking Changes
Note: Non-backwards-compatible changes we should make even with legacy as target in the breaking release that switches to EOF by default. (EOF-specific breaking changes for EOF as target can generally be done outside of breaking releases if restricted to only affect compilation when EOF is enabled)
- Consider renaming the previous special `datacopy`, etc., Yul builtins for legacy codegen.
New Language Features
Note: we don't need these to call EOF support stable, but notably a few language features become significantly easier to support with EOF, e.g.:
- Reference-type Immutables (resp. a data location for the data section)
- returndata as data location
- `calldata` arguments in constructors
- high-level `TXCREATE` support.
- referencing contract subcontainers from inline assembly (needed for `eofcreate()`).