-
-
Notifications
You must be signed in to change notification settings - Fork 355
Add message with debugging info to Cancelled #3256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
0b627ba
5b2294a
c7c46a4
af964cf
69c398b
357b797
1bc89b2
7def83f
8dbdde3
ebec334
87a8e07
36cf222
4944282
f37d993
67b1370
bcf0014
ada02cb
59d4eee
680306f
2fe4fca
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
:exc:`Cancelled` strings can now display the source and reason for a cancellation. Trio-internal sources of cancellation will set this string, and :meth:`CancelScope.cancel` now has a ``reason`` string parameter that can be used to attach info to any :exc:`Cancelled` to help in debugging. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,7 +36,12 @@ | |
from ._asyncgens import AsyncGenerators | ||
from ._concat_tb import concat_tb | ||
from ._entry_queue import EntryQueue, TrioToken | ||
from ._exceptions import Cancelled, RunFinishedError, TrioInternalError | ||
from ._exceptions import ( | ||
Cancelled, | ||
CancelReasonLiteral, | ||
RunFinishedError, | ||
TrioInternalError, | ||
) | ||
from ._instrumentation import Instruments | ||
from ._ki import KIManager, enable_ki_protection | ||
from ._parking_lot import GLOBAL_PARKING_LOT_BREAKER | ||
|
@@ -305,7 +310,7 @@ def expire(self, now: float) -> bool: | |
did_something = True | ||
# This implicitly calls self.remove(), so we don't need to | ||
# decrement _active here | ||
cancel_scope.cancel() | ||
cancel_scope._cancel(CancelReason(source="deadline")) | ||
# If we've accumulated too many stale entries, then prune the heap to | ||
# keep it under control. (We only do this occasionally in a batch, to | ||
# keep the amortized cost down) | ||
|
@@ -314,6 +319,20 @@ def expire(self, now: float) -> bool: | |
return did_something | ||
|
||
|
||
@attrs.define | ||
class CancelReason: | ||
"""Attached to a :class:`CancelScope` upon cancellation with details of the source of the | ||
cancellation, which is then used to construct the string in a :exc:`Cancelled`. | ||
Users can pass a ``reason`` str to :meth:`CancelScope.cancel` to set it. | ||
|
||
Not publicly exported or documented. | ||
""" | ||
|
||
source: CancelReasonLiteral | ||
source_task: str | None = None | ||
reason: str | None = None | ||
|
||
|
||
@attrs.define(eq=False) | ||
class CancelStatus: | ||
"""Tracks the cancellation status for a contiguous extent | ||
|
@@ -468,6 +487,14 @@ def recalculate(self) -> None: | |
or current.parent_cancellation_is_visible_to_us | ||
) | ||
if new_state != current.effectively_cancelled: | ||
if ( | ||
current._scope._cancel_reason is None | ||
and current.parent_cancellation_is_visible_to_us | ||
): | ||
assert current._parent is not None | ||
current._scope._cancel_reason = ( | ||
current._parent._scope._cancel_reason | ||
) | ||
current.effectively_cancelled = new_state | ||
if new_state: | ||
for task in current._tasks: | ||
|
@@ -558,6 +585,8 @@ class CancelScope: | |
_cancel_called: bool = attrs.field(default=False, init=False) | ||
cancelled_caught: bool = attrs.field(default=False, init=False) | ||
|
||
_cancel_reason: CancelReason | None = attrs.field(default=None, init=False) | ||
|
||
# Constructor arguments: | ||
_relative_deadline: float = attrs.field( | ||
default=inf, | ||
|
@@ -594,7 +623,7 @@ def __enter__(self) -> Self: | |
self._relative_deadline = inf | ||
|
||
if current_time() >= self._deadline: | ||
self.cancel() | ||
self._cancel(CancelReason(source="deadline")) | ||
with self._might_change_registered_deadline(): | ||
self._cancel_status = CancelStatus(scope=self, parent=task._cancel_status) | ||
task._activate_cancel_status(self._cancel_status) | ||
|
@@ -883,19 +912,42 @@ def shield(self, new_value: bool) -> None: | |
self._cancel_status.recalculate() | ||
|
||
@enable_ki_protection | ||
def cancel(self) -> None: | ||
"""Cancels this scope immediately. | ||
|
||
This method is idempotent, i.e., if the scope was already | ||
cancelled then this method silently does nothing. | ||
def _cancel(self, cancel_reason: CancelReason | None) -> None: | ||
"""Internal sources of cancellation should use this instead of :meth:`cancel` | ||
in order to set a more detailed :class:`CancelReason` | ||
Helper or high-level functions can use `cancel`. | ||
""" | ||
if self._cancel_called: | ||
return | ||
|
||
if self._cancel_reason is None: | ||
self._cancel_reason = cancel_reason | ||
|
||
with self._might_change_registered_deadline(): | ||
self._cancel_called = True | ||
|
||
if self._cancel_status is not None: | ||
self._cancel_status.recalculate() | ||
|
||
@enable_ki_protection | ||
def cancel(self, reason: str | None = None) -> None: | ||
"""Cancels this scope immediately. | ||
|
||
The optional ``reason`` argument accepts a string, which will be attached to | ||
any resulting :exc:`Cancelled` exception to help you understand where that | ||
cancellation is coming from and why it happened. | ||
|
||
This method is idempotent, i.e., if the scope was already | ||
cancelled then this method silently does nothing. | ||
""" | ||
try: | ||
current_task = repr(_core.current_task()) | ||
except RuntimeError: | ||
current_task = None | ||
self._cancel( | ||
CancelReason(reason=reason, source="explicit", source_task=current_task) | ||
) | ||
|
||
@property | ||
def cancel_called(self) -> bool: | ||
"""Readonly :class:`bool`. Records whether cancellation has been | ||
|
@@ -924,7 +976,7 @@ def cancel_called(self) -> bool: | |
# but it makes the value returned by cancel_called more | ||
# closely match expectations. | ||
if not self._cancel_called and current_time() >= self._deadline: | ||
self.cancel() | ||
self._cancel(CancelReason(source="deadline")) | ||
return self._cancel_called | ||
|
||
|
||
|
@@ -1192,9 +1244,9 @@ def parent_task(self) -> Task: | |
"(`~trio.lowlevel.Task`): The Task that opened this nursery." | ||
return self._parent_task | ||
|
||
def _add_exc(self, exc: BaseException) -> None: | ||
def _add_exc(self, exc: BaseException, reason: CancelReason | None) -> None: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the only callers of this are internal, IMO it would be cleaner to have them set the reason inline. Also, to avoid multiple comments for similar things, why doesn't this unconditionally set There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
in case we get multiple sources of cancellation I don't want to override the first one. In other places it's more critical, but here I could see a scenario where:
so I'm pretty sure we need the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not relevant anymore There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I haven't looked at the code to see why this isn't relevant anymore, but I already typed up a response comment to this: I'm not entirely convinced avoiding this is a good thing: async def crasher():
await trio.sleep(2)
raise ValueError("...")
with trio.open_nursery() as nursery:
try:
nursery.start_soon(crasher)
try:
await trio.sleep(10) # app code
finally:
with trio.move_on_after(2, shield=True):
await trio.sleep(3) # cleanup code
except trio.Cancelled as c:
# what should c's cancel reason be
raise This might matter for instance in code for shutting down stuff on exceptions, moving on after 2 seconds. The cancel reason if the clean up code ran over its 2 seconds would presumably (?) be that the exceptions happened, not that the timeout happened. I think it would make more sense if the reason was instead about the timeout. (I haven't played with this PR yet so I'm not sure that's actually what will happen) Would it make sense to establish some sort of causal mechanism? I.e. a field on CancelReason that points to the old CancelReason. (I guess Cancelled could store another Cancelled? But that might be bad for cycles.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah current behavior is that the crasher becomes the reason. Storing a chain of ... although the crash cancellation should be accessible somehow in the But storing a chain of reasons would be fairly straightforward and sounds like it might have some good use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I thought I found a repro where But going back to your example: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point: what's the cancel reason visible inside the move_on_after then? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm pretty sure it was There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nice, yeah that behavior sounds nice. Returning to the earliest response you have, is that (nursery cancel -> raise a different exception in one of the tasks) the only case where things try to overwrite the cancellation reason? If so, I think it would be nicer to make nurseries not try to cancel if they are already cancelled (which would prevent the cancellation reason from being overwritten). I also see that changing the deadline can potentially overwrite. I don't see why that would try to cancel anything if things are already cancelled... I guess just code simplicity. I guess it makes sense to try to handle it in one place with a check on the cancellation reason, then. I just don't like it! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. haha. Yeah if we rewrote everything from scratch we might implement it differently. But I think there's a bunch of ways to re-cancel, including simply calling |
||
self._pending_excs.append(exc) | ||
self.cancel_scope.cancel() | ||
self.cancel_scope._cancel(reason) | ||
|
||
def _check_nursery_closed(self) -> None: | ||
if not any([self._nested_child_running, self._children, self._pending_starts]): | ||
|
@@ -1210,7 +1262,14 @@ def _child_finished( | |
) -> None: | ||
self._children.remove(task) | ||
if isinstance(outcome, Error): | ||
self._add_exc(outcome.error) | ||
self._add_exc( | ||
outcome.error, | ||
CancelReason( | ||
source="nursery", | ||
source_task=repr(task), | ||
A5rocks marked this conversation as resolved.
Show resolved
Hide resolved
|
||
reason=f"child task raised exception {outcome.error!r}", | ||
), | ||
) | ||
self._check_nursery_closed() | ||
|
||
async def _nested_child_finished( | ||
|
@@ -1220,7 +1279,14 @@ async def _nested_child_finished( | |
# Returns ExceptionGroup instance (or any exception if the nursery is in loose mode | ||
# and there is just one contained exception) if there are pending exceptions | ||
if nested_child_exc is not None: | ||
self._add_exc(nested_child_exc) | ||
self._add_exc( | ||
nested_child_exc, | ||
reason=CancelReason( | ||
source="nursery", | ||
source_task=repr(self._parent_task), | ||
reason=f"Code block inside nursery contextmanager raised exception {nested_child_exc!r}", | ||
), | ||
) | ||
self._nested_child_running = False | ||
self._check_nursery_closed() | ||
|
||
|
@@ -1231,7 +1297,13 @@ async def _nested_child_finished( | |
def aborted(raise_cancel: _core.RaiseCancelT) -> Abort: | ||
exn = capture(raise_cancel).error | ||
if not isinstance(exn, Cancelled): | ||
self._add_exc(exn) | ||
self._add_exc( | ||
exn, | ||
CancelReason( | ||
source="KeyboardInterrupt", | ||
source_task=repr(self._parent_task), | ||
), | ||
) | ||
# see test_cancel_scope_exit_doesnt_create_cyclic_garbage | ||
del exn # prevent cyclic garbage creation | ||
return Abort.FAILED | ||
|
@@ -1245,7 +1317,8 @@ def aborted(raise_cancel: _core.RaiseCancelT) -> Abort: | |
try: | ||
await cancel_shielded_checkpoint() | ||
except BaseException as exc: | ||
self._add_exc(exc) | ||
# there's no children to cancel, so don't need to supply cancel reason | ||
self._add_exc(exc, reason=None) | ||
|
||
popped = self._parent_task._child_nurseries.pop() | ||
assert popped is self | ||
|
@@ -1575,8 +1648,17 @@ def _attempt_delivery_of_any_pending_cancel(self) -> None: | |
if not self._cancel_status.effectively_cancelled: | ||
return | ||
|
||
reason = self._cancel_status._scope._cancel_reason | ||
|
||
def raise_cancel() -> NoReturn: | ||
raise Cancelled._create() | ||
if reason is None: | ||
raise Cancelled._create(source="unknown", reason="misnesting") | ||
else: | ||
raise Cancelled._create( | ||
source=reason.source, | ||
reason=reason.reason, | ||
source_task=reason.source_task, | ||
) | ||
|
||
self._attempt_abort(raise_cancel) | ||
|
||
|
@@ -2075,15 +2157,27 @@ async def init( | |
) | ||
|
||
# Main task is done; start shutting down system tasks | ||
self.system_nursery.cancel_scope.cancel() | ||
self.system_nursery.cancel_scope._cancel( | ||
CancelReason( | ||
source="shutdown", | ||
reason="main task done, shutting down system tasks", | ||
source_task=repr(self.init_task), | ||
) | ||
) | ||
|
||
# System nursery is closed; finalize remaining async generators | ||
await self.asyncgens.finalize_remaining(self) | ||
|
||
# There are no more asyncgens, which means no more user-provided | ||
# code except possibly run_sync_soon callbacks. It's finally safe | ||
# to stop the run_sync_soon task and exit run(). | ||
run_sync_soon_nursery.cancel_scope.cancel() | ||
run_sync_soon_nursery.cancel_scope._cancel( | ||
CancelReason( | ||
source="shutdown", | ||
reason="main task done, shutting down run_sync_soon callbacks", | ||
source_task=repr(self.init_task), | ||
) | ||
) | ||
|
||
################ | ||
# Outside context problems | ||
|
@@ -2926,7 +3020,18 @@ async def checkpoint() -> None: | |
if task._cancel_status.effectively_cancelled or ( | ||
task is task._runner.main_task and task._runner.ki_pending | ||
): | ||
with CancelScope(deadline=-inf): | ||
cs = CancelScope(deadline=-inf) | ||
if ( | ||
task._cancel_status._scope._cancel_reason is None | ||
and task is task._runner.main_task | ||
and task._runner.ki_pending | ||
): | ||
task._cancel_status._scope._cancel_reason = CancelReason( | ||
source="KeyboardInterrupt" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's kind of strange to me to set this here. Is there no way for the thing raising There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah it's a super ugly place.
Though the current logic doesn't make sense in that regard either, the scope has already effectively been canceled, but it's not until we checkpoint that we set the reason to I'd love to place it in With the current setup there's even a case for saying we shouldn't bother setting a reason at all, because the scope hasn't actually been canceled. And during KI we don't raise There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. my current error when trying to do it in But I'm starting to lean towards not setting the reason, though #3233 would have to revisit that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, we can just leave the status quo and fix it in #3233 unless we expect the PRs to be in separate releases |
||
) | ||
assert task._cancel_status._scope._cancel_reason is not None | ||
cs._cancel_reason = task._cancel_status._scope._cancel_reason | ||
with cs: | ||
await _core.wait_task_rescheduled(lambda _: _core.Abort.SUCCEEDED) | ||
|
||
|
||
|
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure
repr(Task)
is actually that useful? Like yes, it says what function this was started in and maybe you can locate the task through some gc machinery through theid
, but...Maybe a
weakref
would be more useful so people can access attributes?I feel like "where was I spawned" is already answered by the stack trace.(nevermind, just remembered this is the canceller not the cancellee)Nevermind, I didn't think this suggestion through. A weakref wouldn't work for the common case (a task cancelling a sibling task).
I'm not convinced a strong ref here would be bad -- a Task doesn't store the exception or the cancellation reason so there's no reference cycle I think? But a string here is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if a
Task
contains aCancelScope
and the scope gets cancelled within the same task, the scope will then have a strong ref to aCancelReason
which will then point back to theTask
. I think?the
repr(Task)
on its own is perhaps not super useful, but in case you have multiple cancellations going on at the same time that are only distinguished by the source task then you can visually distinguish them even without other sources of the task id.Though it does also contain the name of the function itself:
<Task 'trio._core._tests.test_run.test_Cancelled_str' at 0x\w*>
which could be very helpful if you have different functions spawned in a nursery.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I guess task -> parent/child nursery -> cancel scope -> cancel reason -> task is a loop, yeah. That's annoying. (Or maybe CoroutineType stores frames as a strongref? That too.)