
[CIR][CUDA] Emit address space casts on lowering #1518


Open

wants to merge 2,419 commits into base `main`

Conversation

AdUhTkJm (Contributor):

This is different from the approach in OG, because we use cir.get_global to access global variables, while in OG we just directly write the name of the variable. My approach is to add another attribute to GlobalOps and emit additional casts in LoweringPrepare.

I hope the comments are clear enough about the changes, and please tell me if you find a better approach.

bcardosolopes and others added 30 commits March 17, 2025 14:26
…n to match OG"

Seems like Windows bots are now broken!

This reverts commit 9a63c50.
llvm#1180)

This removes some NYI in CIRGenModule::tryEmitBaseDestructorAsAlias and,
similar to llvm#1179, uses `assert(false)` to tell devs to add tests.

It is slightly verbose due to the difference between LLVM's and CIR's type
systems: LLVM's pointers are opaque, while CIR's pointers are typed.
So we need to handle these pointers when transforming the generated CIR.
…lvm#1186)

This PR adds a new command-line option `--target` to our tool
`cir-translate`. Its concrete behaviour also depends on the triple
and data layout in the CIR module. See the table in the code comments for
details.

The default triple is `x86_64-unknown-linux-gnu` currently.

Some tests are updated with the triple and DLTI attributes removed
(replaced by an option on the RUN line), but some tests remain
unchanged, primarily because they use `cir-opt` instead.
…nAttr (llvm#1199)

A CIR PoisonOp is needed in this context, as the alternative would be to use
VecCreateOp to prepare an arg for VecInsertElement, but VecCreateOp serves a
different purpose and [it would insert all
elements](https://github.com/llvm/clangir/blob/eacaabba76ebdbf87217fefaa77f92c45cf4509c/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp#L1679),
which is unnecessary in this context.

Here is the [intrinsic def
](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vqmovns_)
This would facilitate the implementation of the neon intrinsic `neon_vmax_v` and
`__builtin_elementwise_max`, and potentially future optimizations. CIR
BinOp supports vector types.
Floating point is already supported by FMaxOp.
This option creates and links all tools against a single libLLVM shared
library (and the corresponding CLANG_LINK_CLANG_DYLIB option also gets
turned on by default). In order for this to work, we need to use
LINK_COMPONENTS instead of LINK_LIBS for all LLVM dependencies, and
clang_target_link_libraries for all Clang dependencies, so that they get
rewritten to use the dylib. Remove llvm_update_compile_flags while I'm
here, since the build macros handle that internally. Before this change,
we'd link against certain LLVM libraries both statically and
dynamically, leading to test failures from duplicate singletons.

The way this works for MLIR is fragile right now: MLIR can create its
own dylib as well but doesn't have build system support for linking
against that dylib. We end up folding the MLIR libraries into
libclang-cpp.so (because all Clang dependencies get pulled into it), but
MLIR tools still link the MLIR libraries statically. It'll still work,
but BUILD_SHARED_LIBS is possibly a better alternative for development.
Distributions like Fedora build their LLVM packages with
LLVM_LINK_LLVM_DYLIB, so we'll want to eventually have good MLIR support
for that setup too.
…lvm#1203)

C/C++ functions returning void had an explicit !cir.void return type
while not having any returned value, which was breaking a lot of MLIR
invariants when the CIR dialect is used in a greater context, for
example with the inliner. Now, a C/C++ function returning void has no
return type and no return values, which does not break the MLIR
invariant about having the same number of return types and returned values.
This change keeps the same parsing/pretty-printed syntax as before for
compatibility.
Combined implementation with `neon_vaddlvq_u16`.
OG somehow implemented them separately, but they are no different except for
signedness and intrinsic name.
[OG's
neon_vaddlvq_s16](https://github.com/llvm/clangir/blob/2b1a638ea07ca10c5727ea835bfbe17b881175cc/clang/lib/CodeGen/CGBuiltin.cpp#L13483)
[OG's
neon_vaddlvq_u16](https://github.com/llvm/clangir/blob/2b1a638ea07ca10c5727ea835bfbe17b881175cc/clang/lib/CodeGen/CGBuiltin.cpp#L13449)
The module-level uwtable attribute controls the unwind tables for any
synthesized functions, and the function-level attribute controls them
for those functions. I'll add support for this attribute to the LLVM
dialect as well, but translate it from CIR directly for now to avoid
waiting on the MLIR addition and a subsequent rebase.
llvm#1232)

Match `CodeGenModule::SetLLVMFunctionAttributesForDefinition` so that we
can see what's missing and have a good base to build upon.
I am working on a clangir-based solution to improve C++'s safety
(https://discourse.llvm.org/t/rfc-a-clangir-based-safe-c/83245). This is
similar to the previous analysis-only approach I proposed, where we
may not care about the lowered code. And this is what I described as
layering problems in
https://discourse.llvm.org/t/rfc-a-clangir-based-safe-c/83245

This is similar to the other issue proposed in
llvm#1128. We'd better emit the
higher-level operations and lower/optimize them later.

This is also inspired by our handling of VAArg, where we use ABI
information to lower things during the passes. It gives me more
confidence that I am doing things right.
The PR should help us get rid of the NYI
`NYI UNREACHABLE executed at
clang/lib/CIR/CodeGen/CIRGenExprAgg.cpp:899`
[Relevant OG code
here](https://github.com/llvm/clangir/blob/7fb608d4d1b72c25a1739a1bd66c9024208819cb/clang/lib/CodeGen/CGExpr.cpp#L4767):
I put `HasExplicitObjectParameter` support as a missing feature, which
is a new C++23 feature.
…ge (llvm#1214)

#### The Problem
Let's take a look at the following code:
```cpp
struct A {
  ~A() {}
};

int foo() { return 42; }
void bar() {
  A a;
  int b = foo();
}
```
The call to `foo` guarded by the synthetic `tryOp` looks approximately
like the following:
```mlir
cir.try synthetic cleanup {
  %2 = cir.call exception @_Z3foov() : () -> !s32i cleanup {
    cir.call @_ZN1AD1Ev(%0) : (!cir.ptr<!ty_A>) -> () extra(#fn_attr1)   // call to destructor of 'A'
    cir.yield
  }
  cir.yield
} catch [#cir.unwind {
  cir.resume
}]
cir.store %2, %1 : !s32i, !cir.ptr<!s32i>  // CIR verification error
```
The result of the `foo` call is in the `try` region - and is not
accessible from the outside, so the code generation fails with
`operand #0 does not dominate its use` .

#### Solution
So we have several options for how to handle this properly.
1. We may introduce a new operation here, like `TryCall`, but probably a
more high-level one, e.g. introduce an `InvokeOp`.
2. Also, we may add the result to `TryOp`.
3. The quick fix implemented in this PR is a temporary `alloca`
where we store the call result right in the try region. The result
of the whole `emitCall` is then a `load` from the temp `alloca`.

So this PR is both a request for changes and an open discussion:
how should we handle this properly? So far I have chosen the third approach.
If it's OK, I will create one more PR with a similar fix for
aggregate results, or update this one.
This PR puts the for-loop body, while-loop body, and do-while-loop body in
nested scopes. Allocas in the loop body are now pushed down to the nested
scope.

Resolves llvm#1218.
There are two sets of intrinsics regarding Min and Max operations for
floating points

[Maximum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrmaximum-llvmmaximumop)
vs
[Maxnum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrmaxnum-llvmmaxnumop)

[Minimum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrminimum-llvmminimumop)
vs
[Minnum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrminnum-llvmminnumop)

[The difference is whether NaN should be propagated when one of the
inputs is
NaN](https://llvm.org/docs/LangRef.html#llvm-maximumnum-intrinsic)
Maxnum and Minnum return the number when one input is NaN and the
other is a number, but
Maximum and Minimum return NaN (propagating the NaN).

And they are resolved to different ASM such as
[FMAX](https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/FMAX--vector---Floating-point-Maximum--vector--?lang=en)
vs
[FMAXNM](https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/FMAXNM--vector---Floating-point-Maximum-Number--vector--?lang=en)

Both have use cases; we already implemented Maxnum and Minnum,
but Maximum and Minimum have use cases in the [neon intrinsic
](https://developer.arm.com/architectures/instruction-sets/intrinsics/vmax_f32
)
and [__builtin_elementwise_maximum
](https://github.com/llvm/clangir/blob/a989ecb2c55da1fe28e4072c31af025cba6c4f0f/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp#L53)
…lvm#1235)

Using an iterator to visit the `std::initializer_list` field reduced readability.
…d neon_vaddvq_f64 (llvm#1238)

[Neon intrinsic
definition](https://developer.arm.com/architectures/instruction-sets/intrinsics/vaddv_f32).
They are vector-across operations, for which LLVM doesn't currently have a
generic intrinsic. As a side note for brainstorming, it might be
worthwhile in the future for CIR to introduce vector-across operations
even though the LLVM dialect doesn't have them yet. This would help expose
optimization opportunities.
E.g. a very trivial constant fold can happen if we are adding across a
constant vector.
…#1239)

This implementation is different from OG in that we chose to use a
CIR op which eventually lowers to generic LLVM intrinsics instead of
llvm.aarch64.neon intrinsics.
But down at the ASM level, [they are identical
](https://godbolt.org/z/Gbbos9z6Y).
…vm#1242)

This patch follows
llvm#1220 (comment) by
augmenting `CIR_Type` with a new field, `tbaaName`. Specifically, it
enables TableGen support for the `-gen-cir-tbaa-name-lowering` option,
allowing for the generation of `getTBAAName` functions based on the
`tbaaName`. This enhancement enables us to replace the hardcoded TBAA
names in the `getTypeName` function with the newly generated
`getTBAAName`.
This PR adds a bitcast when we rewrite a global's type. Previously we just
set a new type and it worked.
But recently I started testing ClangIR with CSmith in order to find some
runtime bugs and ran into the following problem.

```cpp
#include <stdint.h>
#include <stdio.h>

typedef struct {
    int x : 15;
    uint8_t y;
} S;

S g = {-12, 254};

int main() {
    printf("%d\n", g.y);
    return 0;
}
```
The output for this program is ... 127, not 254!
The reason is that first the global var is created with the type of struct
`S`, then a `get_member` operation is generated with index `1`,
and only after that is the type of the global rewritten, I assume because
of the anon struct created on the right side of the initialization.
But the `get_member` operation still accesses the field at
index `1` and reads the wrong byte.
If we change the type of `y` to `int`, we will fail at the verification
stage. But in the example above it's a runtime error!

This is why I suggest adding a bitcast once we have to rewrite the
global type.
We figure it would be nice to have a common place for all our known
crashes that is tracked by git and is actively verified by lit to see
whether we can now support them. It can act as our source of truth
for known failures and also provide potential good first tasks for new
developers.

Add a simple test case of a known crash that involves copying a struct
in a catch.

Reviewers: smeenai, bcardosolopes

Reviewed By: bcardosolopes

Pull Request: llvm#1243
…#1247)

Basically that is: the return value of the `=` operator for bit-field
assignment is wrong now. For example, the following function returns `7` for a
3-bit bit-field, though it should return `-1`:
```cpp
int get_a(T *t) {
  return (t->a = 7);
}
```

This PR fixes it. The bug was actually in the lowering: the integer cast
was applied in the wrong place (in comparison with the original codegen).
AmrDeveloper and others added 3 commits March 24, 2025 11:15
Sub-issue of llvm#1192. Adds
CIR_ATanOp and support for __builtin_elementwise_atan.
Part of llvm#258 .

1. Added `AddressPointAttr`
2. Change all occurrences of `VTableAddrPointOp` into using the
attribute
3. Update tests

---------

Co-authored-by: Sirui Mu <[email protected]>
Co-authored-by: Morris Hafner <[email protected]>
Co-authored-by: Morris Hafner <[email protected]>
Co-authored-by: Sharp-Edged <[email protected]>
Co-authored-by: Amr Hesham <[email protected]>
Co-authored-by: Bruno Cardoso Lopes <[email protected]>
Co-authored-by: Letu Ren <[email protected]>
AdUhTkJm and others added 11 commits March 25, 2025 11:06
Clang relies on `llvm::Intrinsic::getOrInsertDeclaration` to handle
functions marked as `ClangBuiltin` in TableGen. That function receives a
`CodeGenModule*`, so CIR can't use it directly. We need to re-implement parts
of it.
Closes llvm#1367

---------

Co-authored-by: Sirui Mu <[email protected]>
Co-authored-by: Amr Hesham <[email protected]>
Co-authored-by: Chibuoyim (Wilson) Ogbonna <[email protected]>
Co-authored-by: Yue Huang <[email protected]>
Co-authored-by: Morris Hafner <[email protected]>
Co-authored-by: Morris Hafner <[email protected]>
Co-authored-by: Sharp-Edged <[email protected]>
Co-authored-by: Bruno Cardoso Lopes <[email protected]>
Co-authored-by: Letu Ren <[email protected]>
…#1521)

- Part of llvm#70

This is not very useful now, as both branches of interest are NYI. I
will try to implement the real features in subsequent patches.
…ity and maintainability (llvm#1525)

As noted in [this
comment](llvm#1442 (comment)),
the nested if-arms in `GlobalOpLowering` are somewhat confusing and
error-prone. This PR simplifies the logic into more straightforward
components.

Since LLVM's GlobalOp accepts two types of initializers (either an
initializer value or an initializer region), we've extracted the
decision logic into a separate function called `lowerInitializer`. This
function takes two inout arguments: `mlir::Attribute &init` (for the
attribute value) and `bool useInitializerRegion` (as the decision
indicator). All code paths then converge at a common epilogue that
handles the operation rewriting.

The previous implementation for lowering `DataMemberAttr` initializers
relied on recursion between MLIR rewrite calls, which made the control
flow somewhat opaque. The new version makes this explicit by using a
clear self-recursive pattern within `lowerInitializer`.
Fixes llvm#1371.

I'm not sure whether we could remove `convertTypeForMem` completely.
Let's fix the doc first.
This change moves all declarations of emit* functions in
CIRGenFunction.h into a common location and sorts them alphabetically.
The goal of this change is to make it easier to keep upstream and
incubator code in a consistent location, making functions easier to find
for upstreaming and minimizing conflicts in the incubator when rebasing.

I did most of this sort manually, and I've probably been inconsistent in
how I treat sorting of uppercase versus lowercase. I made no attempt to
provide a rule for ordering different declarations of functions with the
same name. We can improve on that over time if anyone feels the need.

I tried very hard not to drop comments (one of the reasons I had to do
this manually), but I may have lost a few.

This change loses the grouping of some declarations that were co-located
by common purpose, but most of the declarations lacked a coherent
ordering, so I think this is a step forward overall.
This is a rebased version of the inactive PR llvm#1380.

---------

Co-authored-by: koparasy <[email protected]>
Adds implementation for ATanOp's lowering ThroughMLIR.
@AdUhTkJm force-pushed the addrspace branch 2 times, most recently from d6f21e5 to 4888f55, on March 27, 2025 05:16
ayokunle321 and others added 2 commits March 28, 2025 11:34
Adds implementation for ACosOp's lowering ThroughMLIR.
```diff
@@ -470,7 +470,7 @@ CIRGenModule::getOrCreateStaticVarDecl(const VarDecl &D,
   Name = getStaticDeclName(*this, D);

   mlir::Type LTy = getTypes().convertTypeForMem(Ty);
-  cir::AddressSpaceAttr AS =
+  cir::AddressSpaceAttr actualAS =
```
Member:

Add a comment on what you mean by "actual"

```cpp
auto definingOp = addr.getDefiningOp();
bool hasCast = isa<cir::CastOp>(definingOp);
auto getAddrOp = mlir::cast<cir::GetGlobalOp>(
    hasCast ? definingOp->getOperand(0).getDefiningOp() : definingOp);
```
Member:

I'm missing the point here: why do you need to unwrap a cast that was wrapped by createGetGlobal? If you need to unwrap it, why do you need to generate it in the first place?

Contributor (Author):

We need both the cast and the get_global op for the examination below. We will store addr in the LocalDeclMap, with potential bitcasts. Previously a bitcast was emitted whenever the global variable type and the initializer type didn't match (for example `int*` and `int[]`), but they can also mismatch because of different address spaces over the same underlying type (say `int*` in addrspace(1) and `int*`). So I think we must emit a cast to check which case we're in.

```cpp
auto oldTy = mlir::cast<cir::PointerType>(getGlobal.getType());
auto newTy =
    cir::PointerType::get(oldTy.getPointee(), /*addrspace=*/gpuAS);
auto cast = createAddrSpaceCast(loc, getGlobal, newTy);
```
Member:

Instead of emitting all these casts, have you considered changing the address space of the global for good instead of playing with two different ones? Are there uses that actually try to retrieve two different address spaces out of the same global?

Contributor (Author):

If we change the address space of the global, then it will be placed in the wrong section. For `__device__` (AS 1) as an example, OG does:

```llvm
; Global
@var: ptr addrspace(1)
; Use of the global
... (addrspacecast ptr addrspace(1) @var to ptr)
```
bcardosolopes (Member):

Sorry for the delay here, I still have a few things I want to double-check before I consider this ready; I should put more time on it at the beginning of next week. Thanks for your patience.
