Skip to content

support new struct path tbaa in MLIR LLVM Dialect #119699

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
PikachuHyA opened this issue Dec 12, 2024 · 8 comments
Open

support new struct path tbaa in MLIR LLVM Dialect #119699

PikachuHyA opened this issue Dec 12, 2024 · 8 comments

Comments

@PikachuHyA
Copy link
Contributor

I'm currently working on implementing support for TBAA ( llvm/clangir#1076 ) in Clang IR and have noticed that the LLVM Dialect does not accommodate the new format of TBAA.

To address this, I’ve identified two potential approaches:

  1. Enhance Existing Attributes: We could extend the current attributes, namely LLVM_TBAAMemberAttr, LLVM_TBAATypeDescriptorAttr, and LLVM_TBAATagAttr, to support the new struct path TBAA format.

  2. Introduce Separate Attributes: Alternatively, we could introduce separate attributes such as LLVM_TBAAStructFieldAttr, LLVM_TBAATypeNodeAttr, and LLVM_TBAAAccessTagAttr that are specifically designed to handle the new struct path TBAA. (a draft patch is [mlir][llvm] support -new-struct-path-tbaa #119698 )

I’d like to gather feedback from the community on which design approach is preferred. Your insights and suggestions would be greatly appreciated!

cc @bcardosolopes

@github-actions github-actions bot added the mlir label Dec 12, 2024
@PikachuHyA
Copy link
Contributor Author

@llvmbot
Copy link
Member

llvmbot commented Dec 12, 2024

@llvm/issue-subscribers-mlir-llvm

Author: PikachuHy (PikachuHyA)

I'm currently working on implementing support for TBAA ( https://github.com/llvm/clangir/pull/1076 ) in Clang IR and have noticed that the LLVM Dialect does not accommodate the new format of TBAA.

To address this, I’ve identified two potential approaches:

  1. Enhance Existing Attributes: We could extend the current attributes, namely LLVM_TBAAMemberAttr, LLVM_TBAATypeDescriptorAttr, and LLVM_TBAATagAttr, to support the new struct path TBAA format.

  2. Introduce Separate Attributes: Alternatively, we could introduce separate attributes such as LLVM_TBAAStructFieldAttr, LLVM_TBAATypeNodeAttr, and LLVM_TBAAAccessTagAttr that are specifically designed to handle the new struct path TBAA. (a draft patch is [mlir][llvm] support -new-struct-path-tbaa #119698 )

I’d like to gather feedback from the community on which design approach is preferred. Your insights and suggestions would be greatly appreciated!

cc @bcardosolopes

@bcardosolopes
Copy link
Member

I would say for for (1) coming from the assumption that the old format isn't probably useful anymore in top of tree. @gysit do you have any take here?

@gysit
Copy link
Contributor

gysit commented Jan 29, 2025

Sorry I missed this.

TBAA is quite important for us and we currently still rely on the old format. Does Clang already produce the new format or is there a conversion in LLVM that would allow us to migrate to the new format? As long as there is a migration we are probably fine with moving to the new format and ideally LLVM dialect should always mirror the structure of the metadata in LLVM head.

Is there an RFC that documents the change in LLVM/Clang possibly including their timelines for the change and dropping support for the old format?

Also note that besides us, at least Flang uses TBAA actively. So we should definitely synchronize with them as well. I believe @vzakhari was working on TBAA at some point.

@nikic
Copy link
Contributor

nikic commented Jan 29, 2025

Clang does not use new struct path TBAA by default, it's behind the -new-struct-path-tbaa flag.

@gysit
Copy link
Contributor

gysit commented Jan 29, 2025

Clang does not use new struct path TBAA by default, it's behind the -new-struct-path-tbaa flag.

Ok this sounds more like we should continue to support both formats for the time being, ideally closely mirroring LLVM's metadata representation?

@vzakhari
Copy link
Contributor

@gysit thanks for the tag!

There is not much code regarding TBAA in Flang, so it should not be hard to switch to another format. Though I do not remember the actuall differences between the two formats and whether there is a critical semantics change that can make it harder for Flang.

A new set of MLIR attributes may be easier to comprehend vs making the existing ones flexible to support both formats, but I do not have a strong preference.

FWIW, there are two pending TODOs about supporting the new format that I added in https://reviews.llvm.org/D140768 (look for TODO: keyword).

@PikachuHyA
Copy link
Contributor Author

PikachuHyA commented Feb 25, 2025

The new TBAA format is supported in https://reviews.llvm.org/D41394 . Below is an example comparing the old and new formats.

Assuming we compile the following code with -O1 (the example is from clang/test/CodeGen/tbaa.cpp):

typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
typedef unsigned long long uint64_t;

typedef struct {
   uint16_t f16;
   uint32_t f32;
   uint16_t f16_2;
   uint32_t f32_2;
} StructA;

uint32_t g(uint32_t *s, StructA *A, uint64_t count) {
  *s = 1;
  A->f32 = 4;
  return *s;
}

Old Format

The old format is structured as <name>, <parent>, <offset> [, <parent>, <offset>]:

define dso_local noundef i32 @_Z1gPjP7StructAy(ptr nocapture noundef initializes((0, 4)) %s, ptr nocapture noundef writeonly initializes((4, 8)) %A, i64 noundef %count) local_unnamed_addr #0 {
entry:
  store i32 1, ptr %s, align 4, !tbaa !5
  %f32 = getelementptr inbounds nuw i8, ptr %A, i64 4
  store i32 4, ptr %f32, align 4, !tbaa !9
  %0 = load i32, ptr %s, align 4, !tbaa !5
  ret i32 %0
}

!5 = !{!6, !6, i64 0}
!6 = !{!"int", !7, i64 0}
!7 = !{!"omnipotent char", !8, i64 0}
!8 = !{!"Simple C++ TBAA"}
!9 = !{!10, !6, i64 4}
!10 = !{!"_ZTS7StructA", !11, i64 0, !6, i64 4, !11, i64 8, !6, i64 12}
!11 = !{!"short", !7, i64 0}

New Format

The new format is structured as <parent>, <size>, <name> [, <parent>, <offset>, <size>]:

define dso_local noundef i32 @_Z1gPjP7StructAy(ptr nocapture noundef initializes((0, 4)) %s, ptr nocapture noundef writeonly initializes((4, 8)) %A, i64 noundef %count) local_unnamed_addr #0 {
entry:
  store i32 1, ptr %s, align 4, !tbaa !5
  %f32 = getelementptr inbounds nuw i8, ptr %A, i64 4
  store i32 4, ptr %f32, align 4, !tbaa !9
  %0 = load i32, ptr %s, align 4, !tbaa !5
  ret i32 %0
}

!5 = !{!6, !6, i64 0, i64 4}
!6 = !{!7, i64 4, !"int"}
!7 = !{!8, i64 1, !"omnipotent char"}
!8 = !{!"Simple C++ TBAA"}
!9 = !{!10, !6, i64 4, i64 4}
!10 = !{!7, i64 16, !"_ZTS7StructA", !11, i64 0, i64 2, !6, i64 4, i64 4, !11, i64 8, i64 2, !6, i64 12, i64 4}
!11 = !{!7, i64 2, !"short"}

I have updated #119698 . Please review it.

Additionally, llvm/clangir#1403 is how I utilized these attributes in ClangIR to support the new struct path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants