Commit e28f7a9

committed
Some mostly minor changes
1 parent 9c42f45 commit e28f7a9

1 file changed: 48 additions, 13 deletions

text/0000-proc-macros.md

Lines changed: 48 additions & 13 deletions
@@ -49,8 +49,8 @@ to avoid this problem.
4949
# Detailed design
5050
[design]: #detailed-design
5151

52-
There are two kinds of procedural macro: function-like and macro-like. These two
53-
kinds exist today, and other than naming (see
52+
There are two kinds of procedural macro: function-like and attribute-like. These
53+
two kinds exist today, and other than naming (see
5454
[RFC 1561](https://github.com/rust-lang/rfcs/pull/1561)) the syntax for using
5555
these macros remains unchanged. If the macro is called `foo`, then a function-
5656
like macro is used with syntax `foo!(...)`, and an attribute-like macro with
@@ -120,8 +120,9 @@ details.
120120

121121
When a `#[cfg(macro)]` crate is `extern crate`ed, its items (even public ones)
122122
are not available to the importing crate; only macros declared in that crate.
123-
The crate is dynamically linked with the compiler at compile-time, rather
124-
than with the importing crate at runtime.
123+
There should be a lint to warn about public items which will not be visible due
124+
to `#[cfg(macro)]`. The crate is dynamically linked with the compiler at
125+
compile-time, rather than with the importing crate at runtime.
125126

126127

127128
## Writing procedural macros
@@ -163,7 +164,7 @@ sketch is available in this [blog post](http://ncameron.org/blog/libmacro/).
163164
## Tokens
164165

165166
Procedural macros will primarily operate on tokens. There are two main benefits
166-
to this principal: flexibility and future proofing. By operating on tokens, code
167+
to this principle: flexibility and future proofing. By operating on tokens, code
167168
passed to procedural macros does not need to satisfy the Rust parser, only the
168169
lexer. Stabilising an interface based on tokens means we need only commit to
169170
not changing the rules around those tokens, not the whole grammar. I.e., it
@@ -213,12 +214,20 @@ pub struct TokenTree {
213214
}
214215
215216
pub enum TokenKind {
216-
Sequence(Delimiter, Vec<TokenTree>),
217+
Sequence(Delimiter, TokenStream),
217218
218219
// The content of the comment can be found from the span.
219220
Comment(CommentKind),
220-
// The Span is the span of the string itself, without delimiters.
221-
String(Span, StringKind),
221+
222+
// Symbol is the string contents, not including delimiters. It would be nice
223+
// to avoid an allocation in the common case that the string is in the
224+
// source code. We might be able to use `&'Codemap str` or something.
225+
// `Option<usize>` is for the count of `#`s if the string is a raw string. If
226+
// the string is not raw, then it will be `None`.
227+
String(Symbol, Option<usize>, StringKind),
228+
229+
// char literal, span includes the `'` delimiters.
230+
Char(char),
222231
223232
// These tokens are treated specially since they are used for macro
224233
// expansion or delimiting items.
@@ -227,11 +236,11 @@ pub enum TokenKind {
227236
// Not actually sure if we need this or if semicolons can be treated like
228237
// other punctuation.
229238
Semicolon, // `;`
230-
Eof,
239+
Eof, // Do we need this?
231240
232241
// Word is defined by Unicode Standard Annex 31 -
233242
// [Unicode Identifier and Pattern Syntax](http://unicode.org/reports/tr31/)
234-
Word(InternedString),
243+
Word(Symbol),
235244
Punctuation(char),
236245
}
237246
@@ -253,13 +262,34 @@ pub enum CommentKind {
253262
254263
pub enum StringKind {
255264
Regular,
256-
// usize is for the count of `#`s.
257-
Raw(usize),
258265
Byte,
259-
RawByte(usize),
260266
}
267+
268+
// A Symbol is a possibly-interned string.
269+
pub struct Symbol { ... }
261270
```
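
To make the revised `String(Symbol, Option<usize>, StringKind)` shape concrete, here is a minimal sketch of how a macro author might turn such a token back into source text. The `Symbol` wrapper and the `render_string` helper are hypothetical stand-ins, not part of the RFC, and the sketch assumes the symbol holds the string contents exactly as written in the source (no escaping is applied).

```rust
// Hypothetical stand-in for the RFC's `Symbol`; here it is just an owned string.
#[derive(Debug, Clone, PartialEq)]
pub struct Symbol(pub String);

// Mirrors the reduced `StringKind` from the diff: rawness now lives in the
// `Option<usize>` field rather than in separate `Raw`/`RawByte` variants.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StringKind {
    Regular,
    Byte,
}

/// Rebuilds source text for a string token (illustrative helper, not RFC API).
/// `hashes` is `None` for a non-raw string and `Some(n)` for a raw string
/// delimited with `n` `#` characters.
pub fn render_string(contents: &Symbol, hashes: Option<usize>, kind: StringKind) -> String {
    let prefix = match kind {
        StringKind::Regular => "",
        StringKind::Byte => "b",
    };
    match hashes {
        // Non-raw: just the optional `b` prefix and the quotes.
        None => format!("{}\"{}\"", prefix, contents.0),
        // Raw: `r`, then `n` hashes on each side of the quoted contents.
        Some(n) => {
            let h = "#".repeat(n);
            format!("{}r{}\"{}\"{}", prefix, h, contents.0, h)
        }
    }
}
```

With this shape, the four string forms differ only in the two small fields, so code that does not care about rawness can match on `StringKind` alone.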
262271

272+
### Open question: `Punctuation(char)` and multi-char operators.
273+
274+
Rust has many compound operators, e.g., `<<`. It's not clear how best to deal
275+
with them. If the source code contains "`+ =`", it would be nice to distinguish
276+
this in the token stream from "`+=`". On the other hand, if we represent `<<` as
277+
a single token, then the macro may need to split them into `<`, `<` in generic
278+
position.
279+
280+
I had hoped to represent each character as a separate token. However, to make
281+
pattern matching backwards compatible, we would need to combine some tokens. In
282+
fact, if we want to be completely backwards compatible, we probably need to keep
283+
the same set of compound operators as are defined at the moment.
284+
285+
Some solutions:
286+
287+
* `Punctuation(char)` with special rules for pattern matching tokens,
288+
* `Punctuation([char])` with a facility for macros to split tokens. Tokenising
289+
could match the maximum number of punctuation characters, or use the rules for
290+
the current token set. The former would have issues with pattern matching. The
291+
latter is a bit hacky, and there would be backwards compatibility issues if we
292+
wanted to add new compound operators in the future.
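
The second option can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Tok` type and the `tokenise` and `split` functions are hypothetical, the tokeniser uses the "maximum number of punctuation characters" rule rather than the current token set, and it ignores strings, comments, and delimiters entirely.

```rust
// Hypothetical token type; `Punctuation` holds a run of characters, standing in
// for the `Punctuation([char])` idea above.
#[derive(Debug, Clone, PartialEq)]
pub enum Tok {
    Punctuation(Vec<char>),
    Word(String),
}

fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Greedy tokeniser: adjacent punctuation characters form one token, so
/// `+=` is a single token while `+ =` is two.
pub fn tokenise(src: &str) -> Vec<Tok> {
    let mut out = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if is_word_char(c) {
            let mut word = String::new();
            while let Some(&d) = chars.peek() {
                if !is_word_char(d) {
                    break;
                }
                word.push(d);
                chars.next();
            }
            out.push(Tok::Word(word));
        } else {
            let mut run = Vec::new();
            while let Some(&d) = chars.peek() {
                if d.is_whitespace() || is_word_char(d) {
                    break;
                }
                run.push(d);
                chars.next();
            }
            out.push(Tok::Punctuation(run));
        }
    }
    out
}

/// The "facility for macros to split tokens": breaks a compound punctuation
/// token into single-character tokens; other tokens are returned unchanged.
pub fn split(tok: &Tok) -> Vec<Tok> {
    match tok {
        Tok::Punctuation(cs) => cs.iter().map(|&c| Tok::Punctuation(vec![c])).collect(),
        other => vec![other.clone()],
    }
}
```

Under this sketch, `a += b` and `a + = b` produce different token sequences, and a macro that meets a compound token in generic position (for example the trailing `>>` in `Vec<Vec<u8>>`) can call `split` to recover the individual characters.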
263293

264294
## Staging
265295

@@ -314,6 +344,9 @@ are better addressed by compiler plug-ins or tools based on the compiler (the
314344
latter can be written today, the former require more work on an interface to the
315345
compiler to be practical).
316346

347+
We could use the `macro` keyword rather than the `fn` keyword to declare a
348+
macro. We would then not require a `#[macro]` attribute.
349+
317350
We could have a dedicated syntax for procedural macros, similar to the
318351
`macro_rules` syntax for macros by example. Since a procedural macro is really
319352
just a Rust function, I believe using a function is better. I have also not been
@@ -374,6 +407,8 @@ a process-separated model (if desired). However, if this is considered an
374407
essential feature of macro reform, then we might want to consider the interfaces
375408
more thoroughly with this in mind.
376409

410+
A step in this direction might be to run the macro in its own thread, but in the
411+
compiler's process.
377412

378413
### Interactions with constant evaluation
379414
