Skip to content

Fix GEMMT transforming the input array B in some complex cases #5143

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 21, 2025

Conversation

martin-frbg
Copy link
Collaborator

fixes #5111

@martin-frbg martin-frbg added this to the 0.3.30 milestone Feb 20, 2025
@@ -688,5 +688,19 @@ void CNAME(enum CBLAS_ORDER order, enum CBLAS_UPLO Uplo,

IDEBUG_END;

/* transform B back if necessary */

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-frbg Not sure if I'm interpreting the overall code correctly, but shouldn't B not be written to at all?

While GEMMT is not part of standard BLAS and only exists as an "extension", it does have a definition within MKL, where B is marked as "const". In other words, writing to B should never happen within GEMMT (ie. no in-place operations on B).

https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2025-0/cblas-gemmt.html

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the promise is only that the inputs will be unchanged on exit, IMHO what happens behind closed doors is implementation-defined. And the current implementation is inefficient and needs to be redone eventually. It was only a quick, GEMV-based hack to provide any support for GEMMT at all when the Reference implementation did not yet have it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it's better to have something rather than nothing. Looking purely at the implementation side, writing to B would still be ill-advised as it's promised as "const" by MKL. One possible case is that B is shared with another thread (say in a non-OpenBLAS function), which expects it unchanged. There may also be a possible performance degradation due to cache pollution stemming from unnecessary writes to B.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I say the entire implementation is an ugly hack and needs to be replaced - but that is another can of worms, namely #4921, and I can only do so much in a day (and sometimes not even that)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-frbg martin-frbg merged commit 20d1118 into OpenMathLib:develop Feb 21, 2025
84 of 86 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

C/ZGEMMT changes the inputs in the UNC case
2 participants