Why is there a "not greedy" comment here and what does that mean? #1831
Comments
It's non-greedy in the sense that if you have something like:

let a = br#"example a"#;
let b = br#"example b"#;

then while it is matching the ASCII_FOR_RAW bytes in the first literal, it stops at the first closing "# instead of continuing on to the last one. https://en.wikipedia.org/wiki/Regular_expression#Lazy_matching contains a little more of a description.
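To make the difference concrete, here is a minimal sketch of the idea. It is not how rustc actually lexes; `lex_raw_byte_string` and its `greedy` flag are hypothetical names invented for this illustration, and the ASCII_FOR_RAW check on the body is left out.

```rust
/// Hypothetical helper (not the real rustc lexer): if `src` starts with a raw
/// byte string literal, return the slice of `src` that holds that literal.
/// `greedy` only exists to contrast the two behaviours.
fn lex_raw_byte_string(src: &str, greedy: bool) -> Option<&str> {
    // The literal starts with `br`, then zero or more `#`, then `"`.
    let rest = src.strip_prefix("br")?;
    let hashes = rest.bytes().take_while(|&b| b == b'#').count();
    let body = rest[hashes..].strip_prefix('"')?;

    // The closing delimiter is `"` followed by the same number of `#`.
    let closer = format!("\"{}", "#".repeat(hashes));

    let end = if greedy {
        // Greedy: run on to the LAST closing delimiter in the input.
        body.rfind(&closer)?
    } else {
        // Non-greedy: stop at the FIRST closing delimiter.
        body.find(&closer)?
    };

    // `br` + hashes + opening quote + body up to the closer + the closer.
    let len = "br".len() + hashes + 1 + end + closer.len();
    Some(&src[..len])
}
```

Called on the text `br#"example a"#; let b = br#"example b"#;`, the non-greedy mode returns just `br#"example a"#`, while the greedy mode swallows everything up to the second `"#`, which would merge both literals into one token.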
Thx u very much! Why isn't this described in the table? Is this such obvious information? It doesn't even specify whether regular expression symbols are used; I didn't even think about it.
To take your questions in reverse: No, this isn't so obvious that it doesn't need documenting. The fact that the formalism being used isn't documented is a bug in the Reference (there isn't an open issue explicitly about this, but I suppose it comes under #567). The reason it isn't documented is that the Reference has evolved gradually from a Rust "manual" which described the lexical structure only in English. In 2017 a contributor was kind enough to submit a form of the current "Lexer" blocks, and the editors at the time thought that was valuable enough to include without an explicit description of how they need to be interpreted. (As I understand it, he was using Antlr4 in its "lexer grammar" mode.)
Thx u very much!
https://doc.rust-lang.org/1.87.0/reference/tokens.html#raw-byte-string-literals
This seems to mean that not all of the 0x00-0x7F range is allowed, and that the "non-greedy" comment refers to an invalid character in that range, namely the carriage return (CR), 0x0D.
That only confused me: it's already clear that the carriage return can't be used, since the ASCII_FOR_RAW description has "except IsolatedCR".
I was told the following:
"This is a standard concept for regular expressions.
Greedy matching takes the maximum possible number of characters of the string to match the mask, non-greedy - the minimum possible.
For example, for the string
axxxbxxxb
greedy /a.*b/
will capture the entire string, and non-greedy /a.*?b/
only up to the first b."
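The quoted example can be checked with a small sketch. The two functions below are hand-written stand-ins for /a.*b/ and /a.*?b/, not a regex engine; `match_greedy` and `match_lazy` are names made up for this illustration, and they ignore details such as `.` not matching a newline.

```rust
/// Stand-in for the greedy pattern /a.*b/:
/// start at the first `a`, end at the LAST `b` after it.
fn match_greedy(text: &str) -> Option<&str> {
    let start = text.find('a')?;
    let end = text[start + 1..].rfind('b')? + start + 1;
    Some(&text[start..=end])
}

/// Stand-in for the non-greedy pattern /a.*?b/:
/// start at the first `a`, end at the FIRST `b` after it.
fn match_lazy(text: &str) -> Option<&str> {
    let start = text.find('a')?;
    let end = text[start + 1..].find('b')? + start + 1;
    Some(&text[start..=end])
}
```

On `axxxbxxxb`, `match_greedy` returns the whole string and `match_lazy` returns `axxxb`, matching the description in the quote.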