DelimTokenTree-based parsing #3791

Open
powerboat9 opened this issue May 10, 2025 · 3 comments

@powerboat9
Collaborator

For a future project, it would be nice to convert the parser to run on DelimTokenTree, rather than a stream of tokens. It would make parsing simpler and help catch edge cases like #3790.

Implementation-wise, we'd probably want to do a first pass through the list of tokens, verifying that there aren't any mismatched delimiters. During that pass we could also cache the distances between delimiter token pairs, sorted by left delimiter location. For example, [ ( { a } b ) c d ] could produce the distance cache [8, 4, 1], where each entry counts the tokens strictly between a pair of delimiters. Then we could have something like a ManagedTokenSource, say, TreeSource, that gives out either tokens or delimited token trees (via new instances of TreeSource). It would only have to store an iterator into the list of tokens, an ending iterator into the list of tokens, and an iterator into the distance cache.
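The first pass described above could look roughly like the following. This is only a sketch under stated assumptions: `Tok` is a plain string standing in for a real lexed token, `build_distance_cache` and its helpers are hypothetical names, and the code uses C++17 (`std::optional`, structured bindings) for brevity. It checks delimiters with a stack and fills one cache slot per pair, ordered by the position of the opening delimiter.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lexed token; not the real gccrs token type.
using Tok = std::string;

static bool is_open (const Tok &t) { return t == "(" || t == "[" || t == "{"; }
static bool is_close (const Tok &t) { return t == ")" || t == "]" || t == "}"; }

static bool matches (const Tok &open, const Tok &close)
{
  return (open == "(" && close == ")") || (open == "[" && close == "]")
	 || (open == "{" && close == "}");
}

// Returns one entry per delimiter pair, ordered by the position of the
// opening delimiter. Each entry is the number of tokens strictly between
// the two delimiters. Returns std::nullopt on any mismatched delimiter.
std::optional<std::vector<std::size_t>>
build_distance_cache (const std::vector<Tok> &tokens)
{
  std::vector<std::size_t> cache;
  // Each stack entry: (token index of the open delimiter, its cache slot).
  std::vector<std::pair<std::size_t, std::size_t>> open_stack;

  for (std::size_t i = 0; i < tokens.size (); ++i)
    {
      if (is_open (tokens[i]))
	{
	  // Reserve the slot now so entries stay sorted by left delimiter.
	  open_stack.emplace_back (i, cache.size ());
	  cache.push_back (0);
	}
      else if (is_close (tokens[i]))
	{
	  if (open_stack.empty ()
	      || !matches (tokens[open_stack.back ().first], tokens[i]))
	    return std::nullopt;
	  auto [open_idx, slot] = open_stack.back ();
	  open_stack.pop_back ();
	  cache[slot] = i - open_idx - 1;
	}
    }

  // Anything left on the stack is an unclosed delimiter.
  if (!open_stack.empty ())
    return std::nullopt;
  return cache;
}
```

For the token list [ ( { a } b ) c d ], this returns the cache {8, 4, 1}; for an unbalanced input such as ( ] it returns an empty optional.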

Producing an initial TreeSource would be simple. Additionally, spinning off TreeSource instances for nested delimited pairs would be cheap, and the distance cache would make advancing the original TreeSource instance past the nested delimited pair cheap as well.
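A rough sketch of what such a TreeSource might look like is below. The method names (`at_tree`, `next_token`, `next_tree`) and the `Tok` string type are hypothetical, and the sketch assumes a first pass has already validated the delimiters and filled the cache. One detail the description leaves open is how the cache iterator itself skips the entries of pairs nested inside the one being consumed. This sketch assumes, purely for illustration, that each cache entry also records how many pairs are nested inside it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a lexed token; not the real gccrs token type.
using Tok = std::string;

// Assumed cache entry: the distance from the proposal, plus a nested-pair
// count (an addition made for this sketch, not part of the proposal).
struct PairInfo
{
  std::size_t distance; // tokens strictly between the delimiters
  std::size_t nested;   // delimiter pairs nested inside this pair
};

class TreeSource
{
public:
  using TokIt = std::vector<Tok>::const_iterator;
  using CacheIt = std::vector<PairInfo>::const_iterator;

  TreeSource (TokIt cur, TokIt end, CacheIt cache)
    : cur_ (cur), end_ (end), cache_ (cache)
  {}

  bool at_end () const { return cur_ == end_; }
  bool at_tree () const { return !at_end () && is_open (*cur_); }

  // Consumes a single token. Precondition: !at_end () && !at_tree ().
  const Tok &next_token () { return *cur_++; }

  // Precondition: at_tree (). Returns a source over the contents of the
  // delimited tree and moves this source past the closing delimiter.
  TreeSource next_tree ()
  {
    PairInfo info = *cache_;
    TokIt inner_begin = cur_ + 1;
    TokIt inner_end = inner_begin + info.distance;
    TreeSource child (inner_begin, inner_end, cache_ + 1);
    cur_ = inner_end + 1;       // skip the whole tree in O(1)
    cache_ += 1 + info.nested;  // skip this pair's and nested pairs' entries
    return child;
  }

private:
  static bool is_open (const Tok &t)
  {
    return t == "(" || t == "[" || t == "{";
  }

  TokIt cur_;
  TokIt end_;
  CacheIt cache_;
};

// Example consumer: renders each nested tree between angle brackets.
std::string render (TreeSource src)
{
  std::string out;
  while (!src.at_end ())
    {
      if (src.at_tree ())
	out += "<" + render (src.next_tree ()) + ">";
      else
	out += src.next_token ();
    }
  return out;
}
```

Each instance holds only three iterators, so spawning a child or skipping a nested tree is constant-time, which matches the cost argument above. The token list and the cache must outlive every TreeSource that points into them.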

@powerboat9
Collaborator Author

It looks like this would require storing a file's worth of lexed tokens in a vector, instead of consuming them in a single pass, but that doesn't seem like too much of a drawback.

@powerboat9
Collaborator Author

powerboat9 commented May 10, 2025

It could also allow us to move away from having Parser as a template, by providing some indirection away from Lex and ProcMacroInvocLexer.
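One way to picture that indirection, as a sketch only: any lexer is drained into a token vector up front, and a non-template parser then works on the vector alone. Everything here is hypothetical, including `FileLexer` and `MacroLexer` (toy stand-ins for Lex and ProcMacroInvocLexer), the `next` method, and the `drain` helper. None of it reflects the real gccrs interfaces.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lexed token; not the real gccrs token type.
using Tok = std::string;

// Toy lexer that splits source text on whitespace.
struct FileLexer
{
  std::istringstream in;
  explicit FileLexer (const std::string &text) : in (text) {}
  bool next (Tok &out) { return static_cast<bool> (in >> out); }
};

// Toy lexer that replays tokens it was handed, e.g. by a macro expansion.
struct MacroLexer
{
  std::vector<Tok> toks;
  std::size_t pos = 0;
  bool next (Tok &out)
  {
    if (pos == toks.size ())
      return false;
    out = toks[pos++];
    return true;
  }
};

// The only templated piece: pulls every token out of any lexer type.
template <typename Source> std::vector<Tok> drain (Source &src)
{
  std::vector<Tok> out;
  Tok t;
  while (src.next (t))
    out.push_back (t);
  return out;
}

// The parser no longer depends on the lexer type, only on the token list.
class Parser
{
public:
  explicit Parser (std::vector<Tok> tokens) : tokens_ (std::move (tokens)) {}
  const std::vector<Tok> &tokens () const { return tokens_; }

private:
  std::vector<Tok> tokens_;
};
```

With this shape, only the small `drain` helper is instantiated per lexer type, while the parser itself is compiled once.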

@powerboat9 powerboat9 self-assigned this May 10, 2025
@powerboat9
Collaborator Author

powerboat9 commented May 10, 2025

Not going to work on it right now though, since I'm more focused on getting nr2.0 through.
