Offloading CUDA #4166

Draft
wants to merge 48 commits into
base: master
Contributor

@Olender commented on Mar 27, 2025

Description

__all__ = ("OffloadPC",)


class OffloadPC(PCBase):
Contributor

Could AssembledPC assume this functionality by providing -assembled_mat_type aijcusparse?

Contributor

Possibly, but then you lose all flexibility w.r.t. using other matrix types.
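
For reference, the `AssembledPC` alternative raised above might look roughly like the following solver options dictionary. This is a sketch under assumptions: the Krylov method (`cg`) and inner preconditioner (`jacobi`) are placeholders chosen for illustration, the helper `to_petsc_flags` is hypothetical, and whether `aijcusparse` is actually accepted on this path is exactly what the thread is discussing.

```python
# Sketch only: a Firedrake-style solver options dictionary.
# "ksp_type" and "assembled_pc_type" are illustrative assumptions.
solver_parameters = {
    "mat_type": "matfree",
    "ksp_type": "cg",
    "pc_type": "python",
    "pc_python_type": "firedrake.AssembledPC",
    # The option suggested in the review comment:
    "assembled_mat_type": "aijcusparse",
    "assembled_pc_type": "jacobi",
}


def to_petsc_flags(params):
    """Hypothetical helper: render the options as PETSc-style command-line flags."""
    return [f"-{key} {value}" for key, value in params.items()]


print(" ".join(to_petsc_flags(solver_parameters)))
```

The trade-off mentioned in the reply is visible here: the GPU matrix type is tied to the single `assembled_mat_type` entry, so it cannot be chosen independently of the assembled operator's type.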

Contributor

We could make this one a subclass of AssembledPC; there's substantial code duplication.

Contributor

That seems like a good idea. @Olender

Contributor Author

Good idea! I implemented this in the latest commit, but I'm still testing things out to ensure everything works as expected. Let me know if you have any more suggestions.
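
The subclassing idea discussed in this thread can be illustrated with a small pure-Python sketch. The classes below are stand-ins written for illustration only: they are not Firedrake's `AssembledPC` or the `OffloadPC` from this PR, and the hook name `operator_mat_type` is hypothetical. The point is only that the subclass overrides one small hook instead of duplicating the shared setup logic.

```python
class AssembledPCSketch:
    """Stand-in for an assembled preconditioner (NOT firedrake.AssembledPC)."""

    def operator_mat_type(self):
        # Hypothetical hook: matrix type used for the assembled operator.
        return "aij"

    def initialize(self):
        # Shared setup lives in the base class only.
        self.mat_type = self.operator_mat_type()
        self.log = [f"assemble P as {self.mat_type}", "set up inner PC"]


class OffloadPCSketch(AssembledPCSketch):
    """Stand-in for a GPU-offloading variant; only the hook differs."""

    def operator_mat_type(self):
        # Assumption: a CUDA-backed PETSc matrix type such as aijcusparse.
        return "aijcusparse"


pc = OffloadPCSketch()
pc.initialize()
print(pc.log)
```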

@connorjward
Contributor

By the way, this shouldn't be named "DO NOT MERGE". Just leave it as a draft PR.

@Olender changed the title from "DO NOT MERGE: Offloading CUDA" to "Offloading CUDA" on Apr 4, 2025
. venv/bin/activate
: # Use pytest-xdist here so we can have a single collated output (not possible
: # for parallel tests)
firedrake-run-split-tests 1 1 "-n 12 $EXTRA_PYTEST_ARGS"
Contributor

I think this is failing because EXTRA_PYTEST_ARGS is not defined.

Comment on lines 65 to 71
# We set a DM and an appropriate SNESContext on the constructed PC
# so one can do e.g. multigrid or patch solves.
dm = outer_pc.getDM()
self._ctx_ref = self.new_snes_ctx(
    outer_pc, a, bcs, mat_type,
    fcp=fcp, options_prefix=options_prefix
)
Contributor

You might not need to create a new _SNESContext here; you could instead just grab it from the parent PC. I guess most of the symbolic stuff involving the form above wouldn't be required anymore, so this will look very different from AssembledPC.

# Update preconditioner with GPU matrix
self.pc.setOperators(A, P_cu)

def form(self, pc, test, trial):
Contributor

This method is already inherited from PCBase

4 participants