Offloading CUDA #4166

Olender · 2025-03-27T14:28:24Z

Description

pbrubeck · 2025-03-28T23:37:48Z

firedrake/preconditioners/offload.py

+__all__ = ("OffloadPC",)
+
+
+class OffloadPC(PCBase):


Could AssembledPC assume this functionality by providing -assembled_mat_type aijcusparse?

Possibly, but then you lose all flexibility w.r.t. using other matrix types.

We could make this one a subclass of AssembledPC, since there's substantial code duplication.

That seems like a good idea. @Olender

Good idea! I implemented this in the latest commit, but I'm still testing things out to ensure everything works as expected. Let me know if you have any more suggestions.
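For reference, the alternative raised at the top of this thread (reusing AssembledPC and only switching the matrix type) would be requested through solver options along these lines. This is a minimal sketch, not something tested in this PR: the `cg` and `jacobi` choices are placeholders picked for illustration, and whether Firedrake's assembly accepts `aijcusparse` as an assembled matrix type is exactly the open question above.

```python
# Hypothetical solver options: reuse AssembledPC and ask for a cuSPARSE matrix.
# Assumption: "aijcusparse" is accepted as the assembled matrix type; this is
# the open question in the thread, not a confirmed feature.
solver_parameters = {
    "ksp_type": "cg",                           # placeholder Krylov method
    "pc_type": "python",
    "pc_python_type": "firedrake.AssembledPC",
    "assembled_mat_type": "aijcusparse",        # option suggested in the review
    "assembled_pc_type": "jacobi",              # placeholder inner preconditioner
}

# Print the options in PETSc command-line form for inspection.
for key, value in solver_parameters.items():
    print(f"-{key} {value}")
```

Keeping the offload logic in its own class, as a subclass of AssembledPC, avoids tying the GPU path to a single matrix type, which is the flexibility concern raised above.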

connorjward · 2025-04-04T12:02:57Z

By the way, this shouldn't be named "DO NOT MERGE". Just leave it as a draft PR.

connorjward · 2025-04-09T16:13:10Z

.github/workflows/build_cuda.yml

+          . venv/bin/activate
+          : # Use pytest-xdist here so we can have a single collated output (not possible
+          : # for parallel tests)
+          firedrake-run-split-tests 1 1 "-n 12 $EXTRA_PYTEST_ARGS"


I think this is failing because EXTRA_PYTEST_ARGS is not defined.

Co-authored-by: Connor Ward <[email protected]>

pbrubeck · 2025-04-02T09:21:02Z

firedrake/preconditioners/offload.py

+            # We set a DM and an appropriate SNESContext on the constructed PC
+            # so one can do e.g. multigrid or patch solves.
+            dm = outer_pc.getDM()
+            self._ctx_ref = self.new_snes_ctx(
+                outer_pc, a, bcs, mat_type,
+                fcp=fcp, options_prefix=options_prefix
+            )


You might not need to create a new _SNESContext here; you could instead just grab it from the parent PC. I guess most of the symbolic work involving the form above would no longer be required, so this will look very different from AssembledPC.

pbrubeck · 2025-04-08T07:11:43Z

firedrake/preconditioners/offload.py

+            # Update preconditioner with GPU matrix
+            self.pc.setOperators(A, P_cu)
+
+    def form(self, pc, test, trial):


This method is already inherited from PCBase
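To illustrate the point with plain Python: a subclass that does not define `form` still exposes the base-class implementation, so the override in the diff can simply be dropped. The classes below are stand-ins written for this example, not Firedrake's real `PCBase` or the `OffloadPC` from this PR, and the method bodies are made up.

```python
class PCBase:
    """Stand-in base class (not Firedrake's real PCBase)."""

    def form(self, pc, test, trial):
        # Shared behaviour that every subclass gets without redefining it.
        return ("base form", test, trial)


class OffloadPC(PCBase):
    """Hypothetical subclass that defines only what differs from the base."""

    def update(self, pc):
        return "offload update"


offload = OffloadPC()
result = offload.form(None, "v", "u")    # resolved on PCBase via inheritance
inherited = "form" not in vars(OffloadPC)  # True: OffloadPC defines no form()
```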

picalarix and others added 13 commits March 5, 2025 13:07

work in progress

0134ed4

offload (not yet really) first try

28ab1b5

linear solver update cusparse

42d2e01

some more changes

8daaf5d

cusparse convert - not done

0995127

last changes to offload

38fd7ba

Commentary

fb9be7f

after meeting

b6c6d4c

Events

69508ab

adding simple test for debugging

c120d5b

duplicating to get around locking

bbd958f

different fix for the locked vector

f6ed362

only install for now

a46028e

pbrubeck reviewed Mar 28, 2025

Olender added 7 commits March 31, 2025 08:08

calling data to synchronize vector

57b7a97

adding first test

74e4cfb

adding kmv wave test

65bceb0

minor fix

c20a3e7

Merge remote-tracking branch 'origin/master' into olender/CUDA

383529a

minor changes

25ea718

offload now subclass of assembledpc

3350bde

Olender changed the title from "DO NOT MERGE: Offloading CUDA" to "Offloading CUDA" Apr 4, 2025

adding tests in CI

763fe6b

Olender force-pushed the olender/CUDA branch from c984f72 to 763fe6b Compare April 9, 2025 13:28

checking if run tests gets the tests with cuda marker

fc13aa2

connorjward requested changes Apr 9, 2025

Olender and others added 3 commits April 10, 2025 16:03

Update .github/workflows/build_cuda.yml

af7e6ce

Co-authored-by: Connor Ward <[email protected]>

adding env options

c988926

trying to figure out whats wrong with petsc4py now

d075453

Olender added 17 commits April 10, 2025 17:12

Merge remote-tracking branch 'origin/master' into olender/CUDA

1765f5b

wip

af69f87

updating PETSC

84f8851

wip

b00c615

wip

9e9fe08

wip

7a3c5da

adding slepc

7f69823

wip

c23f37e

Merge remote-tracking branch 'origin/master' into olender/CUDA

13498cf

wip

58db56b

wip

a380acf

wip

18f4daa

wip

319aa19

Merge remote-tracking branch 'origin/master' into olender/CUDA

80b8e2b

wip

bbf1825

wip

07a2a20

wip

97e41fa

pbrubeck reviewed Apr 23, 2025

Olender added 6 commits April 23, 2025 13:46

back to openmpi

25d356b

wip

6ff1e91

wip

86ae6a8

wip

c96c15b

wip

61f65b1

Merge remote-tracking branch 'origin/master' into olender/CUDA

64f67dc
