-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Check that some device global uninitialized before locking #18041
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Check that some device global uninitialized before locking #18041
Conversation
Count number of not done initialization of device globals and take a lock only if something is not initialized yet.
…nce before lockig.
@maarquitos14 Could you please review this PR? We expect some performance improvement from these changes. |
std::lock_guard<std::mutex> Lock(MDeviceGlobalInitializersMutex); | ||
for (const device &Dev : Devs) { | ||
auto Key = std::make_pair(Program, getSyclObjImpl(Dev)->getHandleRef()); | ||
MDeviceGlobalInitializers.emplace(Key, BinImage); | ||
auto [Iter, Inserted] = MDeviceGlobalInitializers.emplace(Key, BinImage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a couple of questions:
- I see
MDeviceGlobalInitializers
is anstd::map
. Do we really need the elements to be sorted? If not, we can change tostd::unordered_map
, which should provide better performance. - Are we expecting the inserted elements to be consecutive? Could we use
emplace_hint
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I see
MDeviceGlobalInitializers
is anstd::map
. Do we really need the elements to be sorted? If not, we can change tostd::unordered_map
, which should provide better performance.
Sure, done.
- Are we expecting the inserted elements to be consecutive? Could we use
emplace_hint
instead?
The key is a pair of pointers. I'm not sure what can be say about order of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key is a pair of pointers. I'm not sure what can be say about order of them.
Yep, that is my concern too. Maybe it's worth it to try and measure to see if it helps? I'll approve anyway in case you want to do that in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@maarquitos14 Should we merge it? |
I don't have merge permissions, pinging @intel/llvm-gatekeepers should help. |
Count number of not done initialization of device globals and take a lock only if something is not initialized yet.