Command error handling should work off of non-erased error types #17273

alice-i-cecile · 2025-01-10T03:00:04Z

I do agree that this could probably handle non-erased error types instead in the "specific-command error handler" case. I spent a bit of time investigating and got this largely working. I added a type Error associated type to HandleError and CommandWithEntity, and added an ErrorHandler<E> type alias for a generic error handler function. This works for adding error handlers to specific commands, but there is a problem: The "default error handler" fundamentally requires the general-purpose bevy Error type, making it (not directly) usable for a signature that expects a specific error type. We want queue to be "default error conversion compatible" and queue_handled to require the concrete type. I'm reasonably certain this requires a duplication of traits.

This is a big enough change that I think my preference is to defer it to a later PR.

Originally posted by @cart in #17215 (comment)

The text was updated successfully, but these errors were encountered:

cart · 2025-01-10T05:24:36Z

So in terms of next steps for error handling, I think our priority should be exposing useful backtrace info (anyhow-style). Currently the backtrace printed for our boxed error is from where the handler panicked, which is useless. My plan involved using the anyhow-style backtrace feature. The boxed error type in bevy_ecs::result::Error doesn't support that out of the gate. Switching that out for anyhow is just a one-liner (not necessarily suggesting we use anyhow, but it makes it easy to test). Supporting arbitrary error types like we currently do actually gets in the way of useful backtraces, as the backtraces are captured in the anyhow::Error::from (which is called via the ? in fallible_thing()?). If we defer the construction of the "anyhow-style error" until later, we miss our chance to capture the backtrace at the right spot.

Sadly I think this implies that commands should always return our general purpose / anyhow-style error type. The "niceness" of returning whatever error type we want isn't worth losing useful backtraces.

I think it would also be good to report system debug information when a command fails. Not quite sure yet the best way to do it, but it seems possible.

NthTensor · 2025-01-11T03:17:14Z

Totally agree with cart here. I actually started looking into exactly this direction after we shipped the basic fallible systems stuff.

If it's helpful, here's what I eventually designed:

Define a new specialized Report type, which stands in for anyhow::Error. To this type we can attach all our custom information: system origin, backtrace, source location are a given, but we could also do also stuff like error codes, hints for how to fix common bugs, or even links to the docs.
We write a handler which knows how to read Reports and print them to the console or log them to tracy. For the logging to tracy case, I opted to include a severity hint (info/warn/error) in the Report.
Finally we define a trait Advice and implement From<E> for Report where E: Error + Advice, along with an thiserror style error annotation macro. We derive Advice for all our concrete types so that they automatically get converted into reports, and we add helpers to manually convert errors-without-adivce into reports. Backtraces and so on get captured in the conversion from concrete error type to Report.

You can see some of the preliminary work here: https://github.com/NthTensor/bevy/tree/error_reporting.

I'm kind of on the fence about this design. Rust errors typically come in two flavors: concrete on the supply side (think thiserror) and dynamic on the consumer side (think anyhow). Bevy is in the odd position of being both the generator and the ultamate consumer of most errors, with user code sandwiched in the middle. This design takes advantage of that by adding a strong link between the two (in the form of the Advice trait).

As an example of just one of the things this end-to-end loop would let us do, we could adapt our current warn_once logic to errors. The error creater would count the number of times a specific failure is hit and embed that count in the result. The log would read that count and de-duplicate the log by suppressing redundant errors (web consoles do this and it's great).

On the other hand, maybe this would be way over-engineered. IDK.

cart · 2025-01-21T20:28:49Z

I like the general concept, but I think we should slowly extend the scope and capabilities as the need arises, rather than try to build the whole paradigm all at once. This allows us to take our first steps quickly, and lets us make each additional choice on its own time / let it succeed on its own merits. Some thoughts:

We should definitely have our own anyhow-style Error type that we can extend with additional metadata, but as mentioned above, lets start it as a direct and simple anyhow-ish clone. Each new capability or piece of metadata we add should be a separate discussion.
Calling that type Report is an interesting thought. I think I prefer the error-type-alias, as thats how the ecosystem approaches this. Result<X, Report> also just feels different than Result<X, Error>. But I do also see the appeal of avoiding naming conflicts and communicating that this type might have different behaviors. Especially for the initial "this is just an anyhow clone" phase, I think we should still call the type Error as that is non-breaking and the least surprising.
Our error handler API should definitely use this error type. As mentioned above, in the interest of capturing more useful backtraces we should adapt our "accept any error type" APIs to instead require the Bevy error type.
I see the appeal of the Advice trait in terms of attaching additional information, but adding it to the blanket From impl has the distinct disadvantage of blocking normal errors from being automatically converted to our error type. Given that we can make every upstream error compatible, thats a strong start. But it means that things like std errors, ecosystem crate errors, and user-defined "would otherwise be a standard Rust error type" errors require conversion. Ultimately, I worry the value of the additional metadata (that would be embedded in the concrete error type directly and not attached later) is not worth the compatibility loss there:
- Additional Advice info on the concrete error type: error code (could be in the error message), hints (could be in the error message), doc links (could be in the error message)
  - If we would benefit from structured data (ex: better in-editor presentation of error messages), we could embed a structured / parse-able (but still human readable) error message format.
- Info attached later: backtrace (captured and attached after/during conversion), system origin

So in summary, I'm not fully against the "bevy requires its own error traits and explicit conversion to them for non-bevy errors", but before going down that route I would like to explore the limits of the "bevy is compatible with all error types, attaches additional data when possible". And if we ever discover that we need additional structured data from the error types themselves, develop a parse-able format that we embed in the error message.

alice-i-cecile added this to the 0.16 milestone Jan 10, 2025

alice-i-cecile added A-ECS Entities, components, systems, and events S-Ready-For-Implementation This issue is ready for an implementation PR. Go for it! D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes labels Jan 10, 2025

BenjaminBrienen added the C-Feature A new feature, making something new possible label Jan 11, 2025

cart mentioned this issue Feb 28, 2025

Bevy's Error type should capture a useful backtrace #18092

Closed

alice-i-cecile modified the milestones: 0.16, 0.17 Mar 28, 2025

This was referenced Mar 28, 2025

Making Required Components force existence as a follow-up of #18514 #18566

Open

Add notes to fallible commands #18649

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Command error handling should work off of non-erased error types #17273

Command error handling should work off of non-erased error types #17273

alice-i-cecile commented Jan 10, 2025

cart commented Jan 10, 2025

NthTensor commented Jan 11, 2025 •

edited

Loading

cart commented Jan 21, 2025 •

edited

Loading

Command error handling should work off of non-erased error types #17273

Command error handling should work off of non-erased error types #17273

Comments

alice-i-cecile commented Jan 10, 2025

cart commented Jan 10, 2025

NthTensor commented Jan 11, 2025 • edited Loading

cart commented Jan 21, 2025 • edited Loading

NthTensor commented Jan 11, 2025 •

edited

Loading

cart commented Jan 21, 2025 •

edited

Loading