Skip to content

8354473: Incorrect results for compress/expand tests with -XX:+EnableX86ECoreOpts #24645

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

vpaprotsk
Copy link
Contributor

@vpaprotsk vpaprotsk commented Apr 15, 2025

It looks like the permv mask isnt always 'all-ones' or 'all-zeroes'. (Which is OK for real blend, but needs to be enforced via the flag for blend emulation)

Before the fix, make test TEST="jdk/incubator/vector" (on ECore machine)

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR  SKIP
>> jtreg:test/jdk/jdk/incubator/vector                  83    71    10     0     2 <<
==============================
TEST FAILURE

After the fix:

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR  SKIP
   jtreg:test/jdk/jdk/incubator/vector                  83    81     0     0     2
==============================
TEST SUCCESS

And on an AVX512 machine:

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR  SKIP
   jtreg:test/jdk/jdk/incubator/vector                  83    81     0     0     2
==============================
TEST SUCCESS

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8354473: Incorrect results for compress/expand tests with -XX:+EnableX86ECoreOpts (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24645/head:pull/24645
$ git checkout pull/24645

Update a local copy of the PR:
$ git checkout pull/24645
$ git pull https://git.openjdk.org/jdk.git pull/24645/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24645

View PR using the GUI difftool:
$ git pr show -t 24645

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24645.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 15, 2025

👋 Welcome back vpaprotski! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 15, 2025

@vpaprotsk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8354473: Incorrect results for compress/expand tests with -XX:+EnableX86ECoreOpts

Reviewed-by: jbhateja, sviswanathan, epeter

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 364 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@jatin-bhateja, @sviswa7, @eme64) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 15, 2025
@openjdk
Copy link

openjdk bot commented Apr 15, 2025

@vpaprotsk The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Apr 15, 2025

Webrevs

@eme64
Copy link
Contributor

eme64 commented Apr 23, 2025

@vpaprotsk Can you please give a little more details about what exactly went wrong here, and why your change is correct?

@jatin-bhateja You should probably review this code, since you wrote the code originally :)

@vpaprotsk
Copy link
Contributor Author

@vpaprotsk Can you please give a little more details about what exactly went wrong here, and why your change is correct?

@eme64 Thanks for looking. Point form in attempt to be concise:

  • Jatin brought this to my attention, we weren't sure whose code was at fault (i.e. I wrote the blend emulation, he wrote the compress_expand) and I got to the investigation first (i.e. see https://bugs.openjdk.org/browse/JDK-8354473)
  • The mask for vblendvps instruction; actual instruction only cares about the MSB but for emulation we must have the mask to be either FFF..FF or 000..00. In many places blend is used, this is already the case, so no need to recompute the mask. That's why the flag is provided (i.e. optimization).
  • (Without fully understanding the entirety of compress_expand), it appears to me that in this function the mask in permv must be computed explicitly. That's why the flag is changed.

@jatin-bhateja
Copy link
Member

jatin-bhateja commented May 1, 2025

@vpaprotsk Can you please give a little more details about what exactly went wrong here, and why your change is correct?

@eme64 Thanks for looking. Point form in attempt to be concise:

  • Jatin brought this to my attention, we weren't sure whose code was at fault (i.e. I wrote the blend emulation, he wrote the compress_expand) and I got to the investigation first (i.e. see https://bugs.openjdk.org/browse/JDK-8354473)
  • The mask for vblendvps instruction; actual instruction only cares about the MSB but for emulation we must have the mask to be either FFF..FF or 000..00. In many places blend is used, this is already the case, so no need to recompute the mask. That's why the flag is provided (i.e. optimization).
  • (Without fully understanding the entirety of compress_expand), it appears to me that in this function the mask in permv must be computed explicitly. That's why the flag is changed.

Hi @vpaprotsk , @eme64,

Just to fill in the missing details about compress/expand handling on AVX2, we maintain an in-memory lookup table of permutation indices corresponding to a mask value. Each row of lookup table either holds a valid permute index, which is a positive index value less than the vector lane count OR a -1 index.

Since blend emulation always expects to operate over a blend mask vector whose lanes either hold a -1 or a 0 value hence there is a need to re-compose the desired blend mask by signed extending the MSB bits to fill the entire lane. Your fix to recompute the mask looks good to me.

Best Regards,
Jatin

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 1, 2025
@eme64
Copy link
Contributor

eme64 commented May 1, 2025

@jatin-bhateja Thanks for reviewing!

@vpaprotsk I'm realdy to give the approval too, just want to run some internal testing first - please ping me again in 24h :)

@eme64
Copy link
Contributor

eme64 commented May 1, 2025

@jatin-bhateja It seems the flag -XX:+EnableX86ECoreOpts only is enabled on some very specific machines. How important / wide spread are these machines? Will they become more wide spread over time? Or is this rather rare, and not worth investing too many resources? How does their importance compare to AVX and AVX2, or machines with only SSE2 or SSE4.1? Because we put a focus on SSE/AVX in internal testing, but I'm wondering if we should also test EnableX86ECoreOpts more. How does this flag interact with AVX features? Do ECore machines always have AVX2 for example? What would be good flag combinations here?

Copy link

@sviswa7 sviswa7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@sviswa7
Copy link

sviswa7 commented May 2, 2025

@jatin-bhateja It seems the flag -XX:+EnableX86ECoreOpts only is enabled on some very specific machines. How important / wide spread are these machines? Will they become more wide spread over time? Or is this rather rare, and not worth investing too many resources? How does their importance compare to AVX and AVX2, or machines with only SSE2 or SSE4.1? Because we put a focus on SSE/AVX in internal testing, but I'm wondering if we should also test EnableX86ECoreOpts more. How does this flag interact with AVX features? Do ECore machines always have AVX2 for example? What would be good flag combinations here?

Testing with EnableX86ECoreOpts would be good, these machines have AVX2.

@vpaprotsk
Copy link
Contributor Author

@eme64 Pinging as promised about tests results.. thanks!

Re: EnableX86ECoreOpts testing.. the option currently 'protects' some fairly model-specific optimizations. Currently only test/jdk/java/lang/String/IndexOf.java calls out for it specifically, which leads to some 'false positives'.. or rather falsely blaming String.indexof.

@jatin-bhateja
Copy link
Member

@jatin-bhateja It seems the flag -XX:+EnableX86ECoreOpts only is enabled on some very specific machines. How important / wide spread are these machines? Will they become more wide spread over time? Or is this rather rare, and not worth investing too many resources? How does their importance compare to AVX and AVX2, or machines with only SSE2 or SSE4.1? Because we put a focus on SSE/AVX in internal testing, but I'm wondering if we should also test EnableX86ECoreOpts more. How does this flag interact with AVX features? Do ECore machines always have AVX2 for example? What would be good flag combinations here?

Testing with EnableX86ECoreOpts would be good, these machines have AVX2.

How important / wide spread are these machines? Will they become more wide spread over time? Or is this rather rare, and not worth investing too many resources?

E-core Xeons, Sapphire Rapids is widely deployed by all major CSPs, -XX:+EnableX86ECoreOpts enables certain micro-architectural optimization for these systems and JIT code may be different than using -XX:UseAVX=2 on regular P-core Xeons (AVX512 family) targets.

@vpaprotsk
Copy link
Contributor Author

@eme64 Let me know how those tests fared? And if/when I can integrate, thanks!

Copy link
Contributor

@eme64 eme64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me. I did run our testing, so it won't break out CI. But I think we are not yet running our testing with -XX:+EnableX86ECoreOpts, so no guarantees there ;)

@vpaprotsk
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label May 6, 2025
@openjdk
Copy link

openjdk bot commented May 6, 2025

@vpaprotsk
Your change (at version fffd783) is now ready to be sponsored by a Committer.

@sviswa7
Copy link

sviswa7 commented May 6, 2025

/sponsor

@openjdk
Copy link

openjdk bot commented May 6, 2025

Going to push as commit a6995a3.
Since your change was applied there have been 365 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 6, 2025
@openjdk openjdk bot closed this May 6, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels May 6, 2025
@openjdk
Copy link

openjdk bot commented May 6, 2025

@sviswa7 @vpaprotsk Pushed as commit a6995a3.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler [email protected] integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants