Skip to content

Preserve insertion order in maps that are traversed #25218

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Feb 6, 2025

Consumers of an Immutable{List,Map,Set} cannot know whether it has been created from a source with unspecified iteration order, such as HashMap. Since ImmutableMap itself prescribes that it preserves insertion order, it's reasonable for consumers to expect a specified, reasonable, deterministic order. While the current implementation of HashMap in the OpenJDK still results in a deterministic order, this kind of "sequenced-washing" doesn't cause non-determinism, but it still runs the risk of having the order of elements change between JDK versions or vendors.

The locations changed in this PR have been identified by instrumenting the factory methods of the Immutable{List,Map,Set} classes to track whether they 1) have been created from an unsequenced collection and 2) are iterated over. I then manually picked out those cases in which I couldn't rule out that a change in order would be observable to users.

Related to google/guava#6903 (comment)

@fmeum fmeum requested review from a team, Wyverald and meteorcloudy as code owners February 6, 2025 17:52
@fmeum fmeum requested review from katre and removed request for a team February 6, 2025 17:52
@github-actions github-actions bot added team-Performance Issues for Performance teams team-Configurability platforms, toolchains, cquery, select(), config transitions team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. awaiting-review PR is awaiting review from an assigned reviewer labels Feb 6, 2025
Consumers of an `Immutable{List,Map,Set}` cannot know whether it has been created from a source with unspecified iteration order, such as `HashMap`. Since `ImmutableMap` itself prescribes that it preserves insertion order, it's reasonable for consumers to expect a specified, reasonable, deterministic order. While the current implementation of `HashMap` in the OpenJDK still results in a deterministic order, this kind of "sequenced-washing" doesn't cause non-determinism, but it still runs the risk of having the order of elements change between JDK versions or vendors.

The locations changed in this PR have been identified by instrumenting the factory methods of the `Immutable{List,Map,Set}` classes to track whether they 1) have been created from an unsequenced collection and 2) are iterated over. I then manually picked out those cases in which I couldn't rule out that a change in order would be observable to users.

Related to google/guava#6903 (comment)
@fmeum fmeum force-pushed the unspecified-order-fixes branch from 505a0a2 to b45d5d4 Compare February 6, 2025 17:56
Copy link
Member

@katre katre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I support this change and think this is worthwhile. However, I am worried about backsliding. What can we do to keep new uses of non-SequencedMap implementations from being added, or is this not a concern?

@fmeum
Copy link
Collaborator Author

fmeum commented Feb 6, 2025

It's a real concern and a proper fix would likely require having Immutable* implement Sequenced*, which may be far away depending on some low-level Java tooling question raised in the linked Guava issue.

@katre
Copy link
Member

katre commented Feb 6, 2025

I see, thanks. I looked at the linked Guava thread and it does look difficult.

It looks like some of the current test failures are real, so I will hold off further review until those are addressed.

@fmeum fmeum force-pushed the unspecified-order-fixes branch from 18e0f63 to 74c4ac9 Compare February 6, 2025 18:47
Copy link
Member

@Wyverald Wyverald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for doing this.

Though I wonder if sometimes what we need is an ImmutableMap that semantically does not guarantee consistent iteration ordering, and then an ImmutableSequencedMap that does. (Several places in this PR feel like they really want an immutable map but don't care about the order at all; but they have to "care" now because they're offering an ImmutableMap.)

…ildConfigurationValue.java

Co-authored-by: Xùdōng Yáng <[email protected]>
@fmeum
Copy link
Collaborator Author

fmeum commented Feb 6, 2025

Though I wonder if sometimes what we need is an ImmutableMap that semantically does not guarantee consistent iteration ordering, and then an ImmutableSequencedMap that does. (Several places in this PR feel like they really want an immutable map but don't care about the order at all; but they have to "care" now because they're offering an ImmutableMap.)

That could be an alternative for Guava. A new class would also resolve the concerns around adopting SequencedMap early. FYI @cpovirk

@fmeum fmeum requested a review from Wyverald February 6, 2025 19:52
@cpovirk
Copy link
Contributor

cpovirk commented Feb 6, 2025

While I see the logic (and the implementation could intentionally shuffle order to prevent people from depending on it!), I think it's unlikely that we'd do it.

Inside Google, we have a JDK patch that shuffles the order of plain old HashMap (and I think ConcurrentHashMap). Those aren't the only source of unordered inputs, of course, and I think the randomization still doesn't cover every single case, but it does a pretty good job if there is some test that fails as a result of a change in ordering.

That's seemed to hit a pretty good spot in the cost-benefit space. I can definitely see how the ability to actively request randomization could be useful, especially outside Google. But I figure:

  • Users have a limited amount of attention to spend on the problem. If we can get across to them that hash ordering is random and that they can fix it with types like ImmutableMap and LinkedHashMap, I think that's a good outcome, and if we drive even a small fraction of users away by saying "Or you can actively randomize it...," then that may make things worse on net.
  • At this point, I assume that we have various code that "expects" ImmutableMap in various senses—APIs that require it (though that should be uncommon), static analyzers that recognize it, tools that have special handling for it. If we steer users toward ImmutableRandomizedMap, they'd probably encounter code that hasn't been updated to support it, especially if ImmutableRandomizedMap ends up being used only a fraction as often as ImmutableMap.

@Wyverald Wyverald added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels Feb 6, 2025
@copybara-service copybara-service bot closed this in b03cb63 Feb 7, 2025
@github-actions github-actions bot removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Feb 7, 2025
@fmeum fmeum deleted the unspecified-order-fixes branch February 7, 2025 17:18
@fmeum
Copy link
Collaborator Author

fmeum commented Feb 7, 2025

@bazel-io flag

@bazel-io bazel-io added the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Feb 7, 2025
@iancha1992
Copy link
Member

@fmeum Can this go to 8.2.0? Or do we have to include it in 8.1.0?

@fmeum
Copy link
Collaborator Author

fmeum commented Feb 10, 2025

This can wait for 8.2.0

@iancha1992
Copy link
Member

@bazel-io fork 8.2.0

@bazel-io bazel-io removed the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Feb 14, 2025
iancha1992 pushed a commit to iancha1992/bazel that referenced this pull request Feb 14, 2025
Consumers of an `Immutable{List,Map,Set}` cannot know whether it has been created from a source with unspecified iteration order, such as `HashMap`. Since `ImmutableMap` itself prescribes that it preserves insertion order, it's reasonable for consumers to expect a specified, reasonable, deterministic order. While the current implementation of `HashMap` in the OpenJDK still results in a deterministic order, this kind of "sequenced-washing" doesn't cause non-determinism, but it still runs the risk of having the order of elements change between JDK versions or vendors.

The locations changed in this PR have been identified by instrumenting the factory methods of the `Immutable{List,Map,Set}` classes to track whether they 1) have been created from an unsequenced collection and 2) are iterated over. I then manually picked out those cases in which I couldn't rule out that a change in order would be observable to users.

Related to google/guava#6903 (comment)

Closes bazelbuild#25218.

PiperOrigin-RevId: 724341188
Change-Id: I2f38a4c3dad80055cb8d80853e9f211836019e5b
github-merge-queue bot pushed a commit that referenced this pull request Feb 18, 2025
Consumers of an `Immutable{List,Map,Set}` cannot know whether it has
been created from a source with unspecified iteration order, such as
`HashMap`. Since `ImmutableMap` itself prescribes that it preserves
insertion order, it's reasonable for consumers to expect a specified,
reasonable, deterministic order. While the current implementation of
`HashMap` in the OpenJDK still results in a deterministic order, this
kind of "sequenced-washing" doesn't cause non-determinism, but it still
runs the risk of having the order of elements change between JDK
versions or vendors.

The locations changed in this PR have been identified by instrumenting
the factory methods of the `Immutable{List,Map,Set}` classes to track
whether they 1) have been created from an unsequenced collection and 2)
are iterated over. I then manually picked out those cases in which I
couldn't rule out that a change in order would be observable to users.

Related to
google/guava#6903 (comment)

Closes #25218.

PiperOrigin-RevId: 724341188
Change-Id: I2f38a4c3dad80055cb8d80853e9f211836019e5b

Commit
b03cb63

Co-authored-by: Fabian Meumertzheim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Configurability platforms, toolchains, cquery, select(), config transitions team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. team-Performance Issues for Performance teams
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants