unify grammars #1

FreddieGilbraith · 2025-03-07T20:59:59Z

Looks like great minds think alike! https://github.com/little-bonsai/tree-sitter-ink

I've been working on this as well for the last few months. It would be great if we could find a way to unify into a single grammar, that we could perhaps donate to the inkle github org? I think there are strengths to both our approaches (you have better support for identifiers in weave labels and better highlighting; I think my grammar has more in-depth handling of code sections), so it would be good to preserve work from both.

Let me know if you'd be interested in coming together on this :)

rhizoome · 2025-03-07T23:43:35Z

This is very interesting. First, I want to emphasize that any differences I point out are not meant as value judgements. My parser grew very organically from a few priorities I chose, but I could have chosen differently. I treated building the parser like coloring mandalas. I choose colors and let my choices guide me. Here are some of the choices I made:

- Think of Ink as a line-based language, similar to Python.
  - Therefore scanner.c identifies line start and line end.
  - This has the nice side effect that parse errors are always recovered after "\n".
- Try very hard to avoid conflicts, use scanner.c to resolve conflicts.
- If it is not about conflicts, try to have a minimal scanner.c.
- Have full unicode support with minimal unicode knowledge in scanner.c, so scanner.c only knows unicode whitespaces.
- Prioritize features that are important for highlighting and an auto-completion language server.
  - It knows all about knots, knits, stitches and diversions, so we can autocomplete them.
  - It has a production for "vocabulary", so we can autocomplete vocabulary like inky does.
  - Make the parsing more flexible than the ink language, so we can do lints in the language server.
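As a rough illustration of the line-based idea above, here is a minimal, self-contained C sketch. It is not taken from either project's scanner.c and does not use the Tree-sitter API; the `LineKind` names and both functions are hypothetical. It only shows how classifying each line on its own confines a surprising line to that line, which is the same property as recovering from parse errors after "\n".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line kinds; names are illustrative, not from either grammar. */
typedef enum {
    LINE_BLANK,
    LINE_KNOT,    /* "== name ==" (two or more '=') */
    LINE_STITCH,  /* "= name" */
    LINE_CHOICE,  /* "* ..." or "+ ..." */
    LINE_GATHER,  /* "- ..." */
    LINE_DIVERT,  /* "-> target" */
    LINE_TEXT
} LineKind;

/* Classify one line (without its trailing newline) by its leading marker. */
static LineKind classify_line(const char *line, size_t len) {
    size_t i = 0;
    /* Skip ASCII indentation only; a real scanner would also need Unicode whitespace. */
    while (i < len && (line[i] == ' ' || line[i] == '\t' || line[i] == '\r')) i++;
    if (i == len) return LINE_BLANK;

    const char *p = line + i;
    size_t rest = len - i;
    if (rest >= 2 && p[0] == '=' && p[1] == '=') return LINE_KNOT;
    if (p[0] == '=') return LINE_STITCH;
    if (rest >= 2 && p[0] == '-' && p[1] == '>') return LINE_DIVERT;
    if (p[0] == '*' || p[0] == '+') return LINE_CHOICE;
    if (p[0] == '-') return LINE_GATHER;
    return LINE_TEXT;
}

/* Walk src line by line and store up to max kinds in out.
   Returns the number of lines seen. Every line is handled independently,
   so nothing carries over from one line to the next. */
static size_t classify_lines(const char *src, LineKind *out, size_t max) {
    assert(src != NULL);
    size_t count = 0;
    const char *start = src;
    while (*start != '\0') {
        const char *end = strchr(start, '\n');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        if (count < max) out[count] = classify_line(start, len);
        count++;
        if (!end) break;
        start = end + 1;
    }
    return count;
}
```

In an actual Tree-sitter grammar the equivalent work would be done by external tokens emitted from scanner.c, with grammar.js deciding what may follow each line start.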

In particular, treating ink as a line-based language leads to a very different approach. It seems you needed more complexity in scanner.c, while my grammar.js is much more verbose and repetitive. Also, I haven't bothered to build the complete grammar.

It seems you found out about this project when you wanted to submit the highlighting to Helix, which must be frustrating. I am really sorry about that.

I suppose it is the same for you: once you have built a piece of software in a certain way, you start to like it, mainly because you understand what you have built, and it is difficult to understand someone else's thought process. So what can we do? Some ideas:

- Design a new parser that takes the best of both projects.
- Work on the language server together. I have only very minimal Rust code at the moment.

I see you have also worked on LSPs in Rust, so that might be interesting. Just so you know, if we can't find a way to work together and your project turns out to be better maintained than mine, I won't stand in the way of switching Helix to yours.

I’d really like to collaborate on software, though I’ve mostly either worked alone or contributed small changes to larger projects. Maybe this could be an opportunity? What do you think? And what are the design principles of your parser?

FreddieGilbraith · 2025-03-12T20:47:33Z

Thank you for your polite and thoughtful response!

> I treated building the parser like coloring mandalas. I choose colors and let my choices guide me.

What a wonderful philosophy!

My approach when building the grammar was to go through the ink tutorial in order, and add new language constructs as they appeared in the tutorial. I hoped that this would lead to a grammar that was structured to prioritise language features in order of importance, as more important features would be introduced first.

I also wanted to avoid having a complex scanner, but eventually realised it was impossible to correctly parse ink with only a context-free grammar.

Prompted by some of the points you’ve made, I’ve made some major simplifications and improvements to my grammar & scanner. I have also added you as a contributor, and would love to hear any further ideas you might have for improvements. I know there are still a few things in TheIntercept that cause parsing errors, so there’s still more work to be done.

Regarding LSPs, my main priority at the moment is to write an LSP that supports spell-checking, as my spelling is very poor, but I would also be interested in working with you on an LSP that uses Tree-sitter to parse and analyse ink projects.

Let me know what you think about all the above, and I’m looking forward to working with you :)

rhizoome · 2025-03-13T13:16:58Z

I am commenting via email. I hope that works.

I was discussing our tree-sitter-ink grammars with a friend and was reminded of what is probably the most important lesson about grammars I learned many years ago: the version for humans (BNF) should look nice, but it causes lots of problems if one tries to make the version for the parser generator look nice. It is more important to have something that is good for the compiler backend, the highlighter, or the LSP. That is why I was very fast creating the parser: I knew from previous projects that any energy put into avoiding redundancy and striving for uniformity would be wasted. When you then work on the compiler, you either introduce redundancy and non-uniform constructs into your grammar, or you again waste a lot of energy trying to convert the AST into a form more suitable for the task.

This discussion with a friend led me to the idea that a nice way to collaborate might be to meet on Discord (or similar) and discuss parsers in general, and you could also show your parser. Then we could look at the errors you have. I think with the right mindset (for example, that line endings are relevant in ink) ink is very regular. They do a few strange things, but every problem I encountered was just that I had the wrong view, and once I changed my view, it turned out to be regular. I do not parse everything, so I might be wrong, but the parts I do not parse should be the regular ones.

I think you are probably more invested in the project than I am. I wanted highlighting in Helix and knew that I could do it in about 5-10 hours. So it is the right thing to support you. And again, I am often very set in my mindset, and I know that there are hundreds of valid approaches, so nothing I just said or that we are going to discuss has to apply to your approach. But I think if we focus on specific problems, we should be able to improve things without changing the architecture.

rhizoome · 2025-04-03T21:42:50Z

@FreddieGilbraith we are getting built-in word-completion in helix: helix-editor/helix#13206, the contributor usually gets things done. :-)

FreddieGilbraith · 2025-04-07T20:36:41Z

Sorry it's taken me so long to reply. I don't have a lot of time for open source stuff at the moment as I've just had my first child. For that reason I probably don't have enough time for a discord call or similar, but would like to continue working with you when I get time :)

rhizoome · 2025-04-08T08:02:20Z

Congratulations! It is a beautiful and sometimes bumpy journey to support a new life in this world. I may be spending a lot of time on open source over the next months. So if you think: "Let's fix some errors in tree-sitter-ink to get a change from changing diapers." - just ping me and I might spontaneously find the time.
