Code highlighting for Elena #400

Open
1muhgcmg opened this issue Mar 16, 2025 · 2 comments
Labels: enhancement, syntax highlight

Comments

@1muhgcmg

VSCode support is available.

@1muhgcmg
Author

Quoting the Elena developer:

With syntax highlighting the situation is a bit more complicated. The language has no keywords, only attributes which are position dependent. So for example class can be an attribute (which must be highlighted) and can be an identifier as well.
Similar with statements, they are all user-defined.

If this is the case, how could you provide syntax highlighting on the IDE? The Elena IDE on Windows supports syntax highlighting.
It is far from perfect as well. I still need to fix it. But currently I do it based on the position in the statement. For example, take int n := 2. If a word token is followed by another word token, then the first token is an attribute and must be highlighted. So int is highlighted, n is not. The situation with numbers is simple.
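The position rule quoted above can be sketched in a few lines of Python. This is only an illustration of the rule as described (a word token followed by another word token is an attribute), not the Elena IDE's actual code; the token regex and the names `TOKEN_RE`, `WORD_RE` and `classify` are assumptions made up for the example.

```python
import re

# Hypothetical tokenizer: words, integer literals, the := operator,
# and any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|:=|\S")
WORD_RE = re.compile(r"[A-Za-z_]\w*")


def classify(statement):
    """Label each token using only its position in the statement."""
    tokens = TOKEN_RE.findall(statement)
    result = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if WORD_RE.fullmatch(tok):
            # A word followed by another word is treated as an attribute.
            if nxt is not None and WORD_RE.fullmatch(nxt):
                kind = "attribute"
            else:
                kind = "identifier"
        elif tok.isdigit():
            kind = "number"
        else:
            kind = "operator"
        result.append((tok, kind))
    return result


print(classify("int n := 2"))
print(classify("class := 2"))
```

With this rule, int in the first statement is labeled as an attribute and n as an identifier, while class in the second statement is labeled as an identifier because it is followed by an operator rather than a word.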

@SpartanJ Do you think this is a showstopper for the current syntax highlighting engine of ecode? I think this language marks a practical limitation of the engine that you can't overcome without making a completely new engine.

The usage of the single quote as delimiters on the import statement is already something impossible to overcome with the current engine according to #297.

@SpartanJ
Owner

SpartanJ commented Mar 18, 2025

@SpartanJ Do you think this is a showstopper for the current syntax highlighting engine of ecode?

No, it's not, but the heavy lifting would need to be done by the semantic highlighting provided by the LSP server, which has knowledge of the AST of the parsed document. This is how it's solved in ecode, and it works as well as any AST-based syntax highlighting. The problem for these languages is that none will provide an LSP server with semantic highlighting support, because it is a very complex piece of software that is not usually developed during the early stages of a language. Supporting just a tree-sitter grammar probably makes much more sense for these languages due to its lower complexity (always talking about syntax highlighting; LSP has tons of features besides that).

 I think this language marks a practical limitation of the engine that you can't overcome without making a completely new engine.

It's not particularly different from any other language with no explicit keywords. The way this works in ecode is: ecode does the "base" syntax highlighting with its regex-based tokenizer, then it asks the LSP server for the semantic highlighting and merges the results. The quality of the result depends more on the quality of the semantic highlighter than on the regex version. ecode currently does the same kind of highlighting as VS Code, so it's not as bad as it might sound. Regex-based syntax highlighting is far from perfect and I'm aware of it, but it's also a very simple way to do syntax highlighting: anyone can implement a highlighter, and you can get decent results in less than 30 minutes of work. This is why I'm still convinced that it's good to have it and that it's a great starting point.
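To make the "base pass plus semantic pass" idea concrete, here is a minimal sketch of one way such a merge could work. It is not ecode's actual implementation: the span format `(start, end, kind)`, the style names, and the function `merge_highlights` are all assumptions for illustration. The only point it shows is that semantic spans, when present, override whatever the regex pass produced for the same characters.

```python
def merge_highlights(length, base, semantic):
    """Merge two lists of (start, end, kind) spans over one line of text.

    Spans use half-open character ranges. Semantic spans are painted
    last, so they win wherever they overlap a base span.
    """
    styles = ["normal"] * length
    for spans in (base, semantic):
        for start, end, kind in spans:
            for i in range(start, min(end, length)):
                styles[i] = kind

    # Collapse the per-character styles back into contiguous runs.
    runs = []
    for i, kind in enumerate(styles):
        if runs and runs[-1][2] == kind:
            runs[-1] = (runs[-1][0], i + 1, kind)
        else:
            runs.append((i, i + 1, kind))
    return runs


line = "int n := 2"
# Hypothetical output of a regex pass that cannot tell attributes from names.
base = [(0, 3, "symbol"), (4, 5, "symbol"), (6, 8, "operator"), (9, 10, "number")]
# Hypothetical semantic tokens from a language server that knows the AST.
semantic = [(0, 3, "type"), (4, 5, "variable")]

for start, end, kind in merge_highlights(len(line), base, semantic):
    print(repr(line[start:end]), kind)
```

If the language server returns nothing, the result is just the base highlighting, which matches the point that the regex pass is the starting point and the semantic pass refines it.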

The usage of the single quote as delimiters on the import statement is already something impossible to overcome with the current engine according to #297.

It's actually not impossible to overcome, it's just that the way to solve it is not great... I could fix that at the expense of adding a bunch of regexes, which translates into a slower syntax highlighter. Currently I'm prioritising speed over accuracy, but I improved the tokenizer by around 35% in the latest nightly builds, so I have some room to fix some of these inaccuracies without killing the performance (I must also say that the highlighter is very fast currently, so I might be exaggerating a bit, but I want things to be instant, always). So I might fix that particular case soon. I just didn't care because it's something that I've never seen used in a codebase, and believe me, I do see a lot of C++. To clarify, the comment I made on that thread was from before ecode supported Perl regular expressions (which are needed for custom-length separators); it was not possible to solve with Lua patterns without adding an absurd number of rules.

And probably the best course of action is supporting tree-sitter based highlighting, which I'm sure I'll implement sooner rather than later. But you'll still have the same problem: all these languages won't have a tree-sitter grammar, because they are at a very early stage of development, so supporting tree-sitter won't solve much for these cases. I believe that for these particular languages we should add very simple highlighting without caring too much about accuracy, because they are evolving and things will break anyway. This happens even in more popular languages like Odin or Zig, so it's a matter of caring for and maintaining each particular language, which I cannot do all by myself.

tree-sitter is the fastest and most accurate way of doing syntax highlighting because it's AST-based, but it has a few disadvantages that are not directly related to the highlighting method. For example, you need to distribute the tree-sitter grammar of each supported language, which is basically a library per language. If you plan to support hundreds of languages, this ends up complicating the whole distribution and build system (you need to build the grammars for each language). This is mostly a problem if you're dealing with a precarious build system like in C++; in Rust you would just need to add a crate (helix does this and it's great for them).

Adding more methods for doing syntax highlighting is not something crazy complex; it can be done with no problem and it will be done. The biggest argument for the current implementation is simplicity.


2 participants