| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "Contribute to the diagnostic translation effort!" |
| 4 | +author: David Wood |
| 5 | +team: the compiler team <https://www.rust-lang.org/governance/teams/compiler> |
| 6 | +--- |
| 7 | + |
| 8 | +The Rust Diagnostics working group is leading an effort to add support for |
| 9 | +internationalization of error messages in the compiler, allowing the compiler |
| 10 | +to produce output in languages other than English. |
| 11 | + |
| 12 | +For example, consider the following diagnostic where a user has used a colon to |
| 13 | +specify a function's return type instead of an arrow: |
| 14 | + |
| 15 | +```text |
| 16 | +error: return types are denoted using `->` |
| 17 | + --> src/main.rs:1:21 |
| 18 | + | |
| 19 | +1 | fn meaning_of_life(): u32 { 42 } |
| 20 | + | ^ help: use `->` instead |
| 21 | +``` |
| 22 | + |
| 23 | +We could output that diagnostic in Chinese: |
| 24 | + |
| 25 | +<pre lang="zh-CN"> |
| 26 | +<code class="language-text">错误: 返回类型使用`->`表示 |
| 27 | + --> src/main.rs:1:21 |
| 28 | + | |
| 29 | +1 | fn meaning_of_life(): u32 { 42 } |
| 30 | + | ^ 帮助: 使用`->`来代替 |
| 31 | +</code></pre> |
| 32 | + |
| 33 | +or even in Spanish: |
| 34 | + |
| 35 | +<pre lang="es"> |
| 36 | +<code class="language-text">error: el tipo de retorno se debe indicar mediante `->` |
| 37 | + --> src/main.rs:1:21 |
| 38 | + | |
| 39 | +1 | fn meaning_of_life(): u32 { 42 } |
| 40 | + | ^ ayuda: utilice `->` en su lugar |
| 41 | +</code></pre> |
| 42 | + |
| 43 | +Translated error messages will allow non-native speakers of English to use Rust |
| 44 | +in their preferred language. |
| 45 | + |
| 46 | +## What's the current status? |
| 47 | +Implementation of diagnostic translation has started, but we're looking for |
| 48 | +help! |
| 49 | + |
| 50 | +The core infrastructure for diagnostic translation has been implemented in |
| 51 | +`rustc`; this makes it possible for Rust to emit a diagnostic with translated |
| 52 | +messages. However, every diagnostic in `rustc` has to be ported to use this new |
| 53 | +infrastructure; otherwise, it can't be translated. That's a lot of work, so |
| 54 | +the diagnostics working group has chosen to combine the translation effort with |
| 55 | +a transition to "diagnostic structs" (more on that later) and get both done at |
| 56 | +once. |
| 57 | + |
| 58 | +Once most diagnostic messages have been ported to the new infrastructure, then |
| 59 | +the diagnostics working group will start creating a workflow for translation |
| 60 | +teams to translate all of the diagnostic messages to different languages. |
| 61 | + |
| 62 | +Every pull request related to diagnostic translation is listed [in this |
| 63 | +document](https://hackmd.io/@davidtwco/rkXSbLg95). |
| 64 | + |
| 65 | +## Getting involved |
| 66 | +There's a lot of work to do on diagnostic translation, but the good news is that |
| 67 | +lots of the work can be done in parallel, and it doesn't require background in |
| 68 | +compiler development or familiarity with `rustc` to contribute! |
| 69 | + |
| 70 | +If you are interested, feel free to just get started! You can ask for help in |
| 71 | +[`#t-compiler/wg-diagnostics`] or reach out to [`@davidtwco`]. |
| 72 | + |
| 73 | +**Note:** This post isn't going to be updated as the working group iterates on |
| 74 | +and improves the workflow for diagnostic translation, so always consult the |
| 75 | +developer guide for the most recent documentation on [diagnostic |
| 76 | +structs][diag_struct] or [diagnostic translation][diag_translation]. |
| 77 | + |
| 78 | +### 1. Setting up a local development environment |
| 79 | +Before helping with the diagnostic translation effort, you'll need to get your |
| 80 | +development environment set up, so [follow the instructions on the `rustc` dev |
| 81 | +guide][getting_started]. |
| 82 | + |
| 83 | +### 2. Getting ready to port your first diagnostic |
| 84 | +Almost all diagnostics in `rustc` are implemented using the traditional |
| 85 | +`DiagnosticBuilder` APIs, which look something like this: |
| 86 | + |
| 87 | +```rust |
| 88 | +self.struct_span_err(self.prev_token.span, "return types are denoted using `->`") |
| 89 | + .span_suggestion_short( |
| 90 | + self.prev_token.span, |
| 91 | + "use `->` instead", |
| 92 | + "->".to_string(), |
| 93 | + Applicability::MachineApplicable, |
| 94 | + ) |
| 95 | + .emit(); |
| 96 | +``` |
| 97 | + |
| 98 | +`struct_span_err` creates a new diagnostic given two things - a `Span` and a |
| 99 | +message. `struct_span_err` isn't the only diagnostic function that you'll |
| 100 | +encounter in the compiler's source, but the others are all pretty similar. You |
| 101 | +can read more about `rustc`'s diagnostic infrastructure [in the `rustc` dev |
| 102 | +guide][errors_and_lints]. |
| 103 | + |
| 104 | +`Span`s just identify some location in the user's source code and you can find |
| 105 | +them used throughout the compiler for diagnostic reporting (for example, the |
| 106 | +location `main.rs:1:21` from the earlier example would have been |
| 107 | +`self.prev_token.span`). |
| 108 | + |
| 109 | +In this example, the message is just a string literal (a `&'static str`) which |
| 110 | +needs to be replaced by an identifier for the same message in whichever |
| 111 | +language was requested. |
| 112 | + |
| 113 | +There are two ways that a diagnostic will be ported to the new infrastructure: |
| 114 | + |
| 115 | +1. If it's a simple diagnostic, without any logic to decide whether or not to |
| 116 | + add suggestions or notes or helps or labels, like in the example above, |
| 117 | + then... |
| 118 | + - [...use a diagnostic derive](#using-a-diagnostic-derive). |
| 119 | +2. Otherwise... |
| 120 | + - [...manually implement `SessionDiagnostic`](#manually-implementing-sessiondiagnostic). |
| 121 | + |
| 122 | +In both cases, diagnostics are represented as types. Representing diagnostics |
| 123 | +using types is a goal of the diagnostics working group as it helps separate |
| 124 | +diagnostic logic from the main code paths. |
| 125 | + |
| 126 | +Every diagnostic type should implement `SessionDiagnostic` (either manually or |
| 127 | +automatically). The `SessionDiagnostic` trait has a method, `into_diagnostic`, |
| 128 | +which converts the diagnostic type into a `DiagnosticBuilder` to be emitted. |
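
To make the "diagnostics as types" idea concrete outside of the compiler, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in: `ToySpan`, `ToyDiagnostic` and `IntoToyDiagnostic` are not `rustc` APIs, they only mirror the shape described above (a type that holds the data, plus a trait method that turns it into something that can be emitted).

```rust
// Toy model only: these names are hypothetical stand-ins, not rustc's real
// `Span`, `Diagnostic` or `SessionDiagnostic`.

/// A byte range in the user's source, standing in for rustc's `Span`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ToySpan {
    pub lo: usize,
    pub hi: usize,
}

/// What an emitter would need: a message slug, a primary location and any
/// suggested replacements.
#[derive(Debug, PartialEq)]
pub struct ToyDiagnostic {
    pub slug: &'static str,
    pub primary_span: ToySpan,
    pub suggestions: Vec<(ToySpan, String)>,
}

/// Every diagnostic type knows how to turn itself into a `ToyDiagnostic`.
pub trait IntoToyDiagnostic {
    fn into_diagnostic(self) -> ToyDiagnostic;
}

/// The diagnostic is a plain struct holding only the data it needs.
pub struct ReturnTypeArrow {
    pub span: ToySpan,
}

impl IntoToyDiagnostic for ReturnTypeArrow {
    fn into_diagnostic(self) -> ToyDiagnostic {
        ToyDiagnostic {
            slug: "parser_return_type_arrow",
            primary_span: self.span,
            // Suggest replacing the text at `span` with `->`.
            suggestions: vec![(self.span, "->".to_string())],
        }
    }
}

/// The call site only constructs the type; it contains no message text.
pub fn report_colon_return_type(span: ToySpan) -> ToyDiagnostic {
    ReturnTypeArrow { span }.into_diagnostic()
}
```

The point of the design is visible in `report_colon_return_type`: the code path that detects the problem no longer contains any user-facing strings, only a type and a slug.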
| 129 | + |
| 130 | +#### Using a diagnostic derive... |
| 131 | +Diagnostic derives (either `SessionDiagnostic` for whole diagnostics, |
| 132 | +`SessionSubdiagnostic` for parts of diagnostics, or `DecorateLint` for lints) |
| 133 | +can be used to automatically implement a diagnostic trait. |
| 134 | + |
| 135 | +To start, create a new type in the `errors` module of the current crate (e.g. |
| 136 | +`rustc_typeck::errors` or `rustc_borrowck::errors`) named after your |
| 137 | +diagnostic. In our example, that might look like: |
| 138 | + |
| 139 | +```rust |
| 140 | +struct ReturnTypeArrow { |
| 141 | + |
| 142 | +} |
| 143 | +``` |
| 144 | + |
| 145 | +Next, we'll need to add fields with all the information we need - that's just a |
| 146 | +`Span` for us: |
| 147 | + |
| 148 | +```rust |
| 149 | +struct ReturnTypeArrow { |
| 150 | + span: Span, |
| 151 | +} |
| 152 | +``` |
| 153 | + |
| 154 | +In most cases, this will just be the `Span`s that are used by the original |
| 155 | +diagnostic emission logic and values that are interpolated into diagnostic |
| 156 | +messages. |
| 157 | + |
| 158 | +After that, we should add the derive, add our error attribute and annotate the |
| 159 | +primary `Span` (that was given to `struct_span_err`). |
| 160 | + |
| 161 | +```rust |
| 162 | +#[derive(SessionDiagnostic)] |
| 163 | +#[error(parser::return_type_arrow)] |
| 164 | +struct ReturnTypeArrow { |
| 165 | + #[primary_span] |
| 166 | + span: Span, |
| 167 | +} |
| 168 | +``` |
| 169 | + |
| 170 | +Each diagnostic should have a unique slug. By convention, these always start |
| 171 | +with the crate that the error is related to (`parser` in this example). This |
| 172 | +slug will be used to find the actual diagnostic message in our translation |
| 173 | +resources, which we'll see shortly. |
| 174 | + |
| 175 | +Finally, we need to add any labels, notes, helps or suggestions: |
| 176 | + |
| 177 | +```rust |
| 178 | +#[derive(SessionDiagnostic)] |
| 179 | +#[error(parser::return_type_arrow)] |
| 180 | +struct ReturnTypeArrow { |
| 181 | + #[primary_span] |
| 182 | + #[suggestion(applicability = "machine-applicable", code = "->")] |
| 183 | + span: Span, |
| 184 | +} |
| 185 | +``` |
| 186 | + |
| 187 | +In this example, there's just a single suggestion - to replace the `:` with |
| 188 | +a `->`. |
| 189 | + |
| 190 | +Before we're finished, we have to [add the diagnostic messages to the |
| 191 | +translation resources...](#adding-translation-resources). |
| 192 | + |
| 193 | +For more documentation on diagnostic derives, see the [diagnostic structs |
| 194 | +chapter of the `rustc` dev guide][diag_struct]. |
| 195 | + |
| 196 | +#### Manually implementing `SessionDiagnostic`... |
| 197 | +Some diagnostics are too complicated to be generated from a diagnostic type |
| 198 | +using the diagnostic derive. In these cases, `SessionDiagnostic` can be |
| 199 | +implemented manually. |
| 200 | + |
| 201 | +Using the same type as in ["Using a diagnostic |
| 202 | +derive..."](#using-a-diagnostic-derive), we can implement `SessionDiagnostic` |
| 203 | +as below: |
| 204 | + |
| 205 | +```rust |
| 206 | +use rustc_errors::{fluent, SessionDiagnostic}; |
| 207 | + |
| 208 | +struct ReturnTypeArrow { span: Span } |
| 209 | + |
| 210 | +impl SessionDiagnostic for ReturnTypeArrow { |
| 211 | + fn into_diagnostic(self, sess: &'_ rustc_session::Session) -> DiagnosticBuilder<'_> { |
| 212 | + sess.struct_span_err( |
| 213 | + self.span, |
| 214 | + fluent::parser::return_type_arrow, |
| 215 | + ) |
| 216 | + .span_suggestion_short( |
| 217 | + self.span, |
| 218 | + fluent::parser::suggestion, |
| 219 | + "->".to_string(), |
| 220 | + Applicability::MachineApplicable, |
| 221 | + ) |
| 222 | + } |
| 223 | +} |
| 224 | +``` |
| 225 | + |
| 226 | +Instead of using strings for the messages as in the original diagnostic |
| 227 | +emission logic, typed identifiers referring to translation resources are used. |
| 228 | +Now we just have to [add the diagnostic messages to the translation |
| 229 | +resources...](#adding-translation-resources). |
| 230 | + |
| 231 | +#### Examples |
| 232 | +For more examples of diagnostics ported to use the diagnostic derive or written |
| 233 | +manually, see the following pull requests: |
| 234 | + |
| 235 | +- https://github.com/rust-lang/rust/pull/98353 |
| 236 | +- https://github.com/rust-lang/rust/pull/98415 |
| 237 | +- https://github.com/rust-lang/rust/pull/97093 |
| 238 | +- https://github.com/rust-lang/rust/pull/99213 |
| 239 | + |
| 240 | +For more examples, see the pull requests labelled [`A-translation`][A-translation]. |
| 241 | + |
| 242 | +#### Adding translation resources... |
| 243 | +Every slug in a diagnostic derive or typed identifier in a manual |
| 244 | +implementation needs to correspond to a message in a translation resource. |
| 245 | + |
| 246 | +`rustc`'s translations use [Fluent][fluent], an asymmetric translation system. |
| 247 | +For each crate in the compiler which emits diagnostics, there is a |
| 248 | +corresponding Fluent resource at |
| 249 | +`compiler/rustc_error_messages/locales/en-US/$crate.ftl`. |
| 250 | + |
| 251 | +Error messages need to be added to this resource (a macro will then generate |
| 252 | +the typed identifier corresponding to the message). |
| 253 | + |
| 254 | +For our example, we should add the following Fluent to |
| 255 | +`compiler/rustc_error_messages/locales/en-US/parser.ftl`: |
| 256 | + |
| 257 | +```fluent |
| 258 | +parser_return_type_arrow = return types are denoted using `->` |
| 259 | + .suggestion = use `->` instead |
| 260 | +``` |
| 261 | + |
| 262 | +`parser_return_type_arrow` will generate a `parser::return_type_arrow` type (in |
| 263 | +`rustc_errors::fluent`) that can be used with diagnostic structs and the |
| 264 | +diagnostic builder. |
| 265 | + |
| 266 | +Subdiagnostics are "attributes" of the primary Fluent message - by convention, |
| 267 | +attribute names match the type of subdiagnostic, such as "suggestion", but |
| 268 | +this can be changed when a diagnostic has several subdiagnostics of one kind. |
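
As a rough illustration of how a slug and its attributes might resolve to text, the sketch below models a locale's messages as a map from slug to a primary value plus named attributes. It is a toy, not Fluent and not `rustc`: `ToyMessage`, `ToyBundle` and `lookup` are hypothetical names, and the "fall back to the English bundle when the requested locale has no entry" behaviour is an assumption made for the example.

```rust
use std::collections::HashMap;

/// One message: primary text plus named attributes such as "suggestion".
pub struct ToyMessage {
    pub value: &'static str,
    pub attrs: HashMap<&'static str, &'static str>,
}

/// All messages for one locale, keyed by slug (hypothetical type).
pub type ToyBundle = HashMap<&'static str, ToyMessage>;

pub fn english_bundle() -> ToyBundle {
    let mut attrs = HashMap::new();
    attrs.insert("suggestion", "use `->` instead");
    let mut bundle = HashMap::new();
    bundle.insert(
        "parser_return_type_arrow",
        ToyMessage { value: "return types are denoted using `->`", attrs },
    );
    bundle
}

pub fn spanish_bundle() -> ToyBundle {
    // Deliberately has no "suggestion" attribute, to exercise the fallback.
    let mut bundle = HashMap::new();
    bundle.insert(
        "parser_return_type_arrow",
        ToyMessage {
            value: "el tipo de retorno se debe indicar mediante `->`",
            attrs: HashMap::new(),
        },
    );
    bundle
}

/// Look up a slug (and optionally one attribute), preferring `requested`
/// and falling back to `fallback` (an assumption of this sketch).
pub fn lookup(
    requested: &ToyBundle,
    fallback: &ToyBundle,
    slug: &str,
    attr: Option<&str>,
) -> Option<&'static str> {
    let find = |bundle: &ToyBundle| -> Option<&'static str> {
        let msg = bundle.get(slug)?;
        match attr {
            None => Some(msg.value),
            Some(name) => msg.attrs.get(name).copied(),
        }
    };
    find(requested).or_else(|| find(fallback))
}
```

With these bundles, looking up the primary message in Spanish finds the translated text, while looking up the `suggestion` attribute falls through to the English wording because the toy Spanish bundle does not define it.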
| 269 | + |
| 270 | +Now that the Fluent resource contains the message, our diagnostic is ported! |
| 271 | +More complex messages with interpolation will be able to reference other fields |
| 272 | +in a diagnostic type (when implemented manually, those are provided as |
| 273 | +arguments). See the diagnostic translation documentation [in the `rustc` dev |
| 274 | +guide][diag_translation] for more examples. |
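
To give a feel for what interpolation means here, the following sketch substitutes named arguments into a message template. It is a deliberately naive string replacement, not Fluent's resolver; the `{$name}` placeholder style, the `ToyUnusedVariable` type and its message text are all hypothetical and chosen only for illustration.

```rust
/// Replace each `{$name}` placeholder in `template` with its argument.
/// Placeholders without a matching argument are left untouched.
pub fn interpolate(template: &str, args: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for &(name, value) in args {
        // `{{` and `}}` are escaped braces, so this builds "{$name}".
        out = out.replace(&format!("{{${}}}", name), value);
    }
    out
}

/// Hypothetical diagnostic type whose field feeds the message.
pub struct ToyUnusedVariable {
    pub name: String,
}

impl ToyUnusedVariable {
    /// Hypothetical message text with one placeholder.
    pub const TEMPLATE: &'static str = "unused variable: `{$name}`";

    /// The field name doubles as the argument name in the template.
    pub fn render(&self) -> String {
        interpolate(Self::TEMPLATE, &[("name", self.name.as_str())])
    }
}
```

The useful property is that the template, not the Rust code, decides where the argument appears, so a translation is free to reorder the sentence around it.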
| 275 | + |
| 276 | +### 3. Porting diagnostics |
| 277 | +Now that you've got a rough idea what to do, you need to find some diagnostics |
| 278 | +to port. There are lots of diagnostics to port, so the diagnostics working group |
| 279 | +has split the work up to avoid anyone working on the same diagnostic as |
| 280 | +someone else - but right now, there aren't many people involved, so just pick a |
| 281 | +crate and start porting it :) |
| 282 | + |
| 283 | +Please add the [`A-translation`][A-translation] label to any pull requests that |
| 284 | +you make so we can keep track of who has made a contribution! You can use |
| 285 | +`rustbot` to label your PR (if it wasn't labelled automatically by |
| 286 | +`triagebot`): |
| 287 | + |
| 288 | +```text |
| 289 | +@rustbot label +A-translation |
| 290 | +``` |
| 291 | + |
| 292 | +You can also assign a member of the diagnostics working group to review your PR |
| 293 | +by posting a comment with the following content (or including this in the PR |
| 294 | +description): |
| 295 | + |
| 296 | +```text |
| 297 | +r? rust-lang/diagnostics |
| 298 | +``` |
| 299 | + |
| 300 | +Even if you aren't sure exactly how to proceed, give it a go and you can ask |
| 301 | +for help in [`#t-compiler/wg-diagnostics`] or reach out to [`@davidtwco`]. |
| 302 | + |
| 303 | +## FAQ |
| 304 | + |
| 305 | +### Is this a feature that anyone wants? |
| 306 | +Yes! Some language communities prefer native resources and some don't (and |
| 307 | +preferences will vary within those communities too). For example, |
| 308 | +Chinese-speaking communities have a mature ecosystem of programming language |
| 309 | +resources which don't require knowing any English. |
| 310 | + |
| 311 | +### Wouldn't translating X be more worthwhile? |
| 312 | +There are many different areas within the Rust project where |
| 313 | +internationalization would be beneficial. Diagnostics aren't being prioritized |
| 314 | +over any other part of the project; it's just that there is interest within the |
| 315 | +compiler team in supporting this feature. |
| 316 | + |
| 317 | +### Couldn't compiler developer time be better spent elsewhere? |
| 318 | +Compiler implementation isn't zero-sum: work on other parts of the compiler |
| 319 | +isn't impacted by these efforts, and working on diagnostic translation doesn't |
| 320 | +prevent contributors working on anything else. |
| 321 | + |
| 322 | +### Will translations be opt-in? |
| 323 | +Translations will be opt-in; you won't need to use them if you don't want to. |
| 324 | + |
| 325 | +### How will a user select the language? |
| 326 | +Exactly how a user will choose to use translated error messages hasn't been |
| 327 | +decided yet. |
| 328 | + |
| 329 | +[getting_started]: https://rustc-dev-guide.rust-lang.org/building/how-to-build-and-run.html |
| 330 | +[`#t-compiler/wg-diagnostics`]: https://rust-lang.zulipchat.com/#narrow/stream/147480-t-compiler.2Fwg-diagnostics |
| 331 | +[`@davidtwco`]: https://github.com/davidtwco |
| 332 | +[errors_and_lints]: https://rustc-dev-guide.rust-lang.org/diagnostics.html#error-messages |
| 333 | +[diag_struct]: https://rustc-dev-guide.rust-lang.org/diagnostics/diagnostic-structs.html |
| 334 | +[diag_translation]: https://rustc-dev-guide.rust-lang.org/diagnostics/translation.html |
| 335 | +[fluent]: http://projectfluent.org/ |
| 336 | +[A-translation]: https://github.com/rust-lang/rust/issues?q=is%3Aopen+label%3AA-translation+sort%3Aupdated-desc |