Skip to content

Clean up yield inheritance #115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

shaseley
Copy link
Collaborator

@shaseley shaseley commented Mar 5, 2025

  1. Key the scheduling state based on the {{Scheduler}} to prevent leaking it across (potentially cross-origin) windows. This changes the event loop's continuation state to be a small wrapper around a map. The continuation state is propagated in the same way, but the scheduler state is unique to the scheduler and not shared. In practice there will only be one entry in this map (a task or microtask can only have originated from one task), but the mechanism is generic enough to support other use cases, implementations can optimize this, and the key/value mapping hopefully makes the isolation clear.

    Alternatively, we could propagate only the state for the current scheduler, but we don't always know the current scheduler, e.g. in "queue a microtask", and this model is different enough from AsyncContext and Chrome's "Task Attribution" that we'd need a separate mechanism, which is a performance concern. The main behavioral difference is how propagating is handled in the case of A --> (B microtask) --> A. With this approach, the context is preserved in the second call to A, which matches the synchronous behavior of A --> calls B --> calls A.

  2. Propagate the current scheduling state in "queue a microtask", unless coming from JavaScript, in which case the propagation is handled by the abstract job closure. Previously, the state would be inherited only if it wasn't reset by another microtask or after the postTask callback ran. This fixes the inconsistency, making directly scheduled microtasks match microtasks originating from JavaScript.


Preview | Diff

 1. Key the scheduling state based on the {{Scheduler}} to prevent
    leaking it across (potentially cross-origin) windows. This changes
    the event loop's continuation state to be a small wrapper around a
    map. The continuation state is propagated in the same way, but the
    scheduler state is unique to the scheduler and not shared. In
    practice there will only be one entry in this map (a task or
    microtask can only have originated from one task), but the mechanism
    is generic enough to support other use cases, implementations
    can optimize this, and the key/value mapping hopefully makes the
    isolation clear.

    Alternatively, we could propagate only the state for the current
    scheduler, but we don't always know the current scheduler, e.g. in
    "queue a microtask", and this model is different enough from
    AsyncContext and Chrome's "Task Attribution" that we'd need a
    separate mechanism, which is a performance concern. The main
    behavioral difference is how propagating is handled in the case of
    A --> (B microtask) --> A. With this approach, the context is
    preserved in the second call to A, which matches the synchronous
    behavior of A --> calls B --> calls A.

 2. Propagate the current scheduling state in "queue a microtask",
    unless coming from JavaScript, in which case the propagation is
    handled by the abstract job closure. Previously, the state would be
    inherited only if it wasn't reset by another microtask or after the
    postTask callback ran. This fixes the inconsistency, making directly
    scheduled microtasks match microtasks originating from JavaScript.
@shaseley shaseley force-pushed the yield-inheritance-fix branch from 7680be8 to c887dc3 Compare March 5, 2025 23:05
@shaseley shaseley marked this pull request as ready for review March 5, 2025 23:36
@shaseley shaseley linked an issue Mar 5, 2025 that may be closed by this pull request
@shaseley
Copy link
Collaborator Author

shaseley commented Mar 5, 2025

Hi @smaug---- would you mind taking a look? I'm going to start migrating stuff to HTML, which will make the monkey patching easier to reason about, but I wanted to have a starting point. The indirection is maybe a little overkill, but I think having the map makes the isolation clear, and it can probably be expanded later to support AsyncContext since the microtask propagation should be the same, but TBD.

@smaug----
Copy link

smaug---- commented Mar 12, 2025

Oops, the email got buried underneath other bugmail. I'll take a look later today.
(and I don't think we should compare anything to AsyncContext, that has so many issues)

@sefeng211

@smaug----
Copy link

Sorry, this is taking a bit longer. I don't yet quite understand in which all ways this changes the behavior and how this prevents leaking scheduling behavior to other globals.

@smaug----
Copy link

smaug---- commented Mar 17, 2025

It is the change to microtask queuing I don't understand. How does that work with MutationObservers and such?

If task has set the state, and then microtask runs for another global, don't we end up using wrong state? Maybe not, since if window.scheduler.yield() is used, it wouldn't find the state for the scheduler. But otherWindow.scheduler.yield() would work? Wouldn't that be surprising?
I definitely might be missing something here.

@shaseley
Copy link
Collaborator Author

Thanks for looking; ran out of time today, will reply tomorrow.

@shaseley
Copy link
Collaborator Author

@mmocny FYI

It is the change to microtask queuing I don't understand. How does that work with MutationObservers and such?

Some additional context: I realized that I didn't spec the "queue a microtask" propagation, but we had implemented that in Blink. The motivation for that was primarily to align with Task Attribution (soft navigation metrics) so that microtask behavior was consistent (it made sense there, especially for custom elements), but we can deviate from that. Anyway I wanted to include this bit as a better starting point to discuss.

If task has set the state, and then microtask runs for another global, don't we end up using wrong state? Maybe not, since if window.scheduler.yield() is used, it wouldn't find the state for the scheduler.

Right, the state would be set, but it wouldn't be used. MutationObservers are interesting because a single microtask is queued for all mutations, which can include mutations from multiple frames ("pending mutation observers" is per agent), so we can't attribute the microtask to a window/scheduler.

But otherWindow.scheduler.yield() would work?

I think the behavior is consistent with the other scenarios:

  1. Events
    1. postTask callback runs (windowA)
    2. event is synchronously dispatched
    3. event handler runs for windowB
    4. windowB.scheduler.yield() does not inherit, windowA.scheduler.yield() does
  2. Scripting other windows
    1. postTask callback runs (windowA)
    2. windowB.scheduler.yield() does not inherit, windowA.scheduler.yield() does

I think MutationObserver is basically case (1), but with a microtask hop.

Wouldn't that be surprising?

Maybe. I'm wary about including MutationObservers (and events / other callbacks) because they're asynchronous with respect to the original code flow -- and therefore not necessarily a continuation (vs. awaiting a promise). Whereas scenario (2) above is more obviously a continuation, so dropping the state is a bit surprising to me.

sefeng211 referenced this pull request in web-platform-tests/wpt Apr 2, 2025
In rare cases, e.g. changing focus in scheduler.postTask() task causing
blur in a cross origin frame, the scheduling state can be propagated to
cross origin microtasks and used if scheduler.yield() is called. To
avoid this, propagate the SecurityOrigin along with the scheduling state
and ignore the inherited state if calling yield() would inherit cross
origin state.

Alternatively, we might consider scoping the signals to a specific
ExecutionContext, but that is a bigger change, and it's more
complicated because developers can cross that boundary when posting
tasks/running script.

Bug: 399478940
Change-Id: Ia6c2139138d3893d7e2930e5ca886775d3214384
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6306840
Reviewed-by: Kent Tamura <[email protected]>
Reviewed-by: Nate Chapin <[email protected]>
Commit-Queue: Scott Haseley <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1426380}

Modify step 5 to read:

1. Return the <span>JobCallback Record</span> { \[[Callback]]: <var ignore=''>callable</var>,
\[[HostDefined]]: { \[[IncumbentSettings]]: <var ignore=''>incumbent settings</var>,
\[[ActiveScriptContext]]: <var ignore=''>script execution context</var>,
\[[SchedulingState]]: |state| } }.
\[[ContinuationState]]: |state| } }.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the state be a copy of what is stored in the event loop? Otherwise the state might get changed before the job is run.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense. In blink the state's properties are read-only (which would be nice to have in spec land), so copying makes sense. Heading out the door for a week, will pick this up when I get back.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Fixed here and above.

1. Let |callbackResult| be the result of [=invoking=] |callback| with « » and "`rethrow`".
If that threw an exception, then [=reject=] |result| with that. Otherwise, [=resolve=]
|result| with |callbackResult|.
1. Set |event loop|'s [=event loop/current scheduling state=] to null.
1. Set |eventLoop|'s [=event loop/current continuation state=] to null.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having a bit of hard time to understand this. This rules out the continuation state, is this correct or it should be the scheduling state for the scheduler?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The invariant should be that the continuation state is null before and after the task runs, so that it doesn't leak between tasks, so clearing it is intentional here. It's not possible to have concurrent scheduler state (only one task or microtask can run at a time), so we don't have to restore a previous value (so we could actually do away with the map and have the continuation state store a Scheduler to compare against, but the map works too).

An annotated example of how this works:

scheduler.postTask(async () => {
  // Before this callback runs, this algorithm sets event loop's continuation state
  // for `scheduler` to a "scheduling state" object with a null "abort source" and
  // fixed priority background signal for "priority source". It should have been null
  // before that.
  doStuff();

  // The `await` (or .then()) causes HostMakeJobCallback to run, where we copy
  // the state and link it to the future continuation. At this point the "invoke" step
  // returns, and the continuation state is cleared so it doesn't leak to the next task.
  await randomPromiseResolvedLater();

  // This runs in a  microtask at some point in the future (e.g. separate task).
  // Before the microtask runs, HostCallJobCallback runs, restoring the
  // continuation state.

  // yield() reads the state restored above. It also gets propagated to
  // yield continuation/microtask by the mechanism above.
  await scheduler.yield();
}, {priority: "background"});

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Scott, just make sure I understand this....I understand that clearing the whole state to avoid leaking it and HostCallJobCallback will restore it before the continuation.

I still wonder would it also work we just set state for this particular scheduler to nullptr? I am not suggesting this change, just making sure my understanding of the whole thing is correct. Or did you also imply this in your comment?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarify that the scheduling state is supposed to be per event loop
3 participants