Skip to content

Move token_stream_parse implementation to Wasm #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dtolnay opened this issue Oct 31, 2019 · 4 comments
Closed

Move token_stream_parse implementation to Wasm #18

dtolnay opened this issue Oct 31, 2019 · 4 comments
Labels
help wanted Extra attention is needed

Comments

@dtolnay
Copy link
Owner

dtolnay commented Oct 31, 2019

As noted in #2 (comment),

watt::sym::token_stream_parse is me being too lazy to implement a Rust syntax tokenizer in wasm (aka copy it from proc-macro2), but we could likely optimize that slightly by running that in wasm as well.

We'll need to import something like https://github.com/alexcrichton/proc-macro2/blob/1.0.6/src/strnom.rs and use that to implement token stream parsing rather than calling out to a host func.

It's possible we can simplify the implementation from proc-macro2 by omitting the parts that deal with parsing comments, as those won't appear in our use case.

@dtolnay dtolnay added the help wanted Extra attention is needed label Oct 31, 2019
@alexcrichton
Copy link
Collaborator

One thing I should note on this is that I was surprised that #[derive(Serialize)] called this function, but then I remembered this is part of quote!, which also lends itself well to reasoning that this may be relatively hot if it's called in quote! a lot. In that sense I think we could perhaps speed up but a mild amount, but as the linked post shows it's only 1-2ms for a massive macro.

@mystor
Copy link
Contributor

mystor commented Oct 31, 2019

Does that include time spent running watt::sym::token_stream_serialize to get the parsed data into wasm? Running parsing in wasm should eliminate a bunch of calls to these other methods, which might increase the impact a bit.

In addition, given the higher minimum rustc version for watt, it may be reasonable to use rustc_lexer as the lexer within wasm, probably based on the work in dtolnay/proc-macro2#202. I'm guessing it's been optimized a bit more than strnom.

@alexcrichton
Copy link
Collaborator

Oh right that's an excellent point! Yes the timing information of watt::sym::token_stream_parse doesn't include the timing of watt::sym::token_stream_serialize nor the time to deserialize inside of wasm itself. Those would all largely be avoided if we move this to wasm. While still likely to be modest, I think it'd be a clear win speed-wise.

mystor added a commit to mystor/watt that referenced this issue Nov 2, 2019
@mystor
Copy link
Contributor

mystor commented Nov 4, 2019

@dtolnay benchmarked an impl of this in #26 (review), and found it had equivalent (or worse) performance, even on large inputs. I'm inclined to think that we should leave the token_stream_parse implementation implemented by the host runtime.

@dtolnay dtolnay closed this as completed Nov 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants