Fix a performance issue with Helion-emitted Flash Attention #181


Merged
8 commits into main, Jun 23, 2025

Conversation

@manman-ren (Contributor) commented Jun 16, 2025

Triton doesn't handle 3D dots well in terms of performance. When Helion emits a 3D dot whose highest (batch) dimension is 1, rewrite it as a 2D dot wrapped in reshapes:

-        qk = tl.dot(q_copy, k, input_precision='tf32')
+        q_copy_r = q_copy.reshape(_BLOCK_SIZE_1, 64)
+        k_r = k.reshape(64, _BLOCK_SIZE_2)
+        qk_r = tl.dot(q_copy_r, k_r, input_precision='tf32')
+        qk = qk_r.reshape(1, _BLOCK_SIZE_1, _BLOCK_SIZE_2)

Perf Results on H100:

CUDA_VISIBLE_DEVICES=5 python examples/attention.py 
Helion time: 0.0642ms, flex time: 0.0638ms, torch time: 0.0716ms
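
To see why the rewrite is safe, here is a minimal PyTorch sketch (the tensor names and tile sizes are illustrative, not taken from the PR): when the leading batch dimension is 1, squeezing it, doing a 2D matmul, and reshaping back gives the same result as the 3D dot, so the transformation only changes how Triton lowers the operation, not what it computes.

    import torch

    BLOCK_1, BLOCK_2, HEAD_DIM = 32, 64, 64  # illustrative tile sizes

    q = torch.randn(1, BLOCK_1, HEAD_DIM)
    k = torch.randn(1, HEAD_DIM, BLOCK_2)

    qk_3d = torch.matmul(q, k)  # 3D dot with batch dim == 1
    qk_2d = torch.matmul(
        q.reshape(BLOCK_1, HEAD_DIM), k.reshape(HEAD_DIM, BLOCK_2)
    )
    qk = qk_2d.reshape(1, BLOCK_1, BLOCK_2)  # reshape back to 3D

    assert torch.allclose(qk_3d, qk)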

@facebook-github-bot added the CLA Signed label (this label is managed by the Meta Open Source bot) on Jun 16, 2025
@manman-ren requested review from jansel and yf225 on Jun 16, 2025 at 21:39
@jansel (Contributor) left a comment


  1. Lint failures. Run ./lint.sh
  2. Test failures; you may need to run EXPECTTEST_ACCEPT=1 pytest test to update the expected outputs.
  3. Do we need to support codegen_addmm as well?
  4. Fix code duplication

Comment on lines 851 to 895
lhsSize = node.args[1].meta["val"].size()
rhsSize = node.args[2].meta["val"].size()
# check to see if it is a 3D dot whose leading (batch) dims are both 1
reduceDim = False
if len(lhsSize) == 3:
    env = CompileEnvironment.current()
    lhsDimIdx = env.get_block_id(lhsSize[0])
    rhsDimIdx = env.get_block_id(rhsSize[0])
    if lhsDimIdx is not None and rhsDimIdx is not None:
        lhsDimVal = env.block_sizes[lhsDimIdx]
        rhsDimVal = env.block_sizes[rhsDimIdx]
        if (
            lhsDimVal.from_config(ctx.cg.device_function.config) == 1
            and rhsDimVal.from_config(ctx.cg.device_function.config) == 1
        ):
            reduceDim = True

if not reduceDim:
    return expr_from_string(
        f"tl.dot(lhs, rhs, acc=acc, input_precision={tf32!r})",
        lhs=lhs,
        rhs=rhs,
        acc=acc,
    )

# create reshape, dot, then reshape back to the original 3D shape
lhs_shape_str = ctx.cg.device_function.tile_strategy.shape_str(
    [*node.args[1].meta["val"].size()[1:]]
)
rhs_shape_str = ctx.cg.device_function.tile_strategy.shape_str(
    [*node.args[2].meta["val"].size()[1:]]
)
acc_shape_str = ctx.cg.device_function.tile_strategy.shape_str(
    [*node.args[0].meta["val"].size()[1:]]
)
out_shape_str = ctx.cg.device_function.tile_strategy.shape_str(
    [*node.meta["val"].size()]
)
lhs_reshape = expr_from_string(f"tl.reshape(lhs, {lhs_shape_str})", lhs=lhs)
rhs_reshape = expr_from_string(f"tl.reshape(rhs, {rhs_shape_str})", rhs=rhs)
acc_reshape = expr_from_string(f"tl.reshape(acc, {acc_shape_str})", acc=acc)
comp = expr_from_string(
    f"tl.dot(lhs, rhs, acc=acc, input_precision={tf32!r})",
    lhs=lhs_reshape,
    rhs=rhs_reshape,
    acc=acc_reshape,
)
return expr_from_string(f"tl.reshape(out, {out_shape_str})", out=comp)
Duplicate code with the above? Please refactor into a helper function.
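
A possible shape of that refactor, as a sketch only (the helper name is invented; expr_from_string and shape_str are the Helion internals already used in the diff above, so this is not runnable on its own):

    def _reshape_dropping_batch(ctx, fx_node, expr):
        # Hypothetical helper: emit tl.reshape(expr, <shape of fx_node with
        # the leading batch dim dropped>), shared by the lhs/rhs/acc reshapes.
        shape_str = ctx.cg.device_function.tile_strategy.shape_str(
            [*fx_node.meta["val"].size()[1:]]
        )
        return expr_from_string(f"tl.reshape(expr, {shape_str})", expr=expr)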

@jansel (Contributor) left a comment


some minor nits, but otherwise lgtm

@jansel (Contributor) left a comment


lgtm, need to fix merge conflict though

@manman-ren merged commit c86b278 into main on Jun 23, 2025
6 checks passed
@facebook-github-bot

@manman-ren has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@@ -12,9 +12,9 @@
 @helion.kernel(
     config=helion.Config(
         # This config was autotuned on a 3090, it won't be fast for other architectures

is this comment still valid? maybe delete it?
