Commit 3127751

[ElimAvailExtern] Add an option to allow to convert global variables in a specified address space to local
Currently, the `EliminateAvailableExternallyPass` only converts certain `available_externally` functions to local. For global variables, it only drops their initializers. This PR adds an option that lets the pass convert global variables in a specified address space to local.

The motivation for this change is to correctly support lowering of LDS variables (`__shared__` variables, in more generic terminology) when ThinLTO is enabled for AMDGPU. A `__shared__` variable is lowered by the frontend to a hidden global variable in a particular address space, which is roughly the same as a `static` local variable. To lower it properly in the backend, the compiler needs to check all its uses. Enabling ThinLTO currently breaks this when a function containing a `__shared__` variable is imported from another module: even though the global variable is imported along with its associated function, and the function is privatized by the `EliminateAvailableExternallyPass`, the global variable itself is not.

It's safe to privatize such global variables, because they're _local_ to their associated functions. If the function itself is privatized, its associated global variables should be privatized accordingly.
1 parent d64ee2c commit 3127751

2 files changed: 51 additions & 1 deletion

llvm/lib/Transforms/IPO/ElimAvailExtern.cpp

Lines changed: 30 additions & 1 deletion
@@ -35,8 +35,14 @@ static cl::opt<bool> ConvertToLocal(
     cl::desc("Convert available_externally into locals, renaming them "
              "to avoid link-time clashes."));
 
+static cl::opt<unsigned> ConvertGlobalVariableInAddrSpace(
+    "avail-extern-gv-in-addrspace-to-local", cl::Hidden,
+    cl::desc(
+        "Convert available_externally global variables into locals if they are "
+        "in specified addrspace, renaming them to avoid link-time clashes."));
+
 STATISTIC(NumRemovals, "Number of functions removed");
-STATISTIC(NumConversions, "Number of functions converted");
+STATISTIC(NumConversions, "Number of functions and globals converted");
 STATISTIC(NumVariables, "Number of global variables removed");
 
 void deleteFunction(Function &F) {
@@ -88,9 +94,32 @@ static void convertToLocalCopy(Module &M, Function &F) {
   ++NumConversions;
 }
 
+static void convertToLocalCopy(Module &M, GlobalValue &GV) {
+  assert(GV.hasAvailableExternallyLinkage());
+  std::string OrigName = GV.getName().str();
+  std::string NewName = OrigName + ".__uniq" + getUniqueModuleId(&M);
+  GV.setName(NewName);
+  GV.setLinkage(GlobalValue::InternalLinkage);
+  ++NumConversions;
+}
+
 static bool eliminateAvailableExternally(Module &M, bool Convert) {
   bool Changed = false;
 
+  // Convert global variables in the specified address space before they are
+  // changed to external linkage below.
+  if (ConvertGlobalVariableInAddrSpace.getNumOccurrences()) {
+    for (GlobalVariable &GV : M.globals()) {
+      if (!GV.hasAvailableExternallyLinkage() || GV.use_empty())
+        continue;
+
+      if (GV.getAddressSpace() == ConvertGlobalVariableInAddrSpace)
+        convertToLocalCopy(M, GV);
+
+      Changed = true;
+    }
+  }
+
   // Drop initializers of available externally global variables.
   for (GlobalVariable &GV : M.globals()) {
     if (!GV.hasAvailableExternallyLinkage())
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 5
+; RUN: opt -S -passes=elim-avail-extern -avail-extern-gv-in-addrspace-to-local=3 %s -o - | FileCheck %s
+
+@shared = internal addrspace(3) global i32 undef, align 4
+@shared.imported = available_externally hidden unnamed_addr addrspace(3) global i32 undef, align 4
+
+;.
+; CHECK: @shared = internal addrspace(3) global i32 undef, align 4
+; CHECK: @shared.imported.__uniq.[[UUID:.*]] = internal unnamed_addr addrspace(3) global i32 undef, align 4
+;.
+define void @foo(i32 %v) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: i32 [[V:%.*]]) {
+; CHECK-NEXT: store i32 [[V]], ptr addrspace(3) @shared, align 4
+; CHECK-NEXT: store i32 [[V]], ptr addrspace(3) @shared.imported.__uniq.[[UUID]], align 4
+; CHECK-NEXT: ret void
+;
+  store i32 %v, ptr addrspace(3) @shared, align 4
+  store i32 %v, ptr addrspace(3) @shared.imported, align 4
+  ret void
+}
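
For readers without an LLVM checkout, the selection rule in the change above can be sketched as a small standalone C++ model. This is not the pass itself: `GlobalVar`, `Linkage`, `convertGlobalsInAddrSpace`, and the `ModuleId` string are hypothetical stand-ins for `GlobalVariable`, LLVM's linkage kinds, the new loop, and the result of `getUniqueModuleId`. One deliberate difference is labeled in the comments: the sketch counts only globals that were actually converted, whereas the loop in the diff sets `Changed` for every used `available_externally` global it visits.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for LLVM concepts; these are not LLVM types.
enum class Linkage { External, AvailableExternally, Internal };

struct GlobalVar {
  std::string Name;
  Linkage Link;
  unsigned AddrSpace;
  unsigned NumUses;
};

// Models the renaming step: append ".__uniq" plus a module-unique suffix,
// then switch the global to internal linkage.
void convertToLocalCopy(GlobalVar &GV, const std::string &ModuleId) {
  assert(GV.Link == Linkage::AvailableExternally);
  GV.Name += ".__uniq" + ModuleId;
  GV.Link = Linkage::Internal;
}

// Models the new loop: skip globals that are not available_externally or
// that have no uses, and convert only those in the requested address space.
// Assumption: returns the number of conversions, which is stricter than the
// diff's unconditional `Changed = true`.
unsigned convertGlobalsInAddrSpace(std::vector<GlobalVar> &Globals,
                                   unsigned TargetAddrSpace,
                                   const std::string &ModuleId) {
  unsigned Converted = 0;
  for (GlobalVar &GV : Globals) {
    if (GV.Link != Linkage::AvailableExternally || GV.NumUses == 0)
      continue;
    if (GV.AddrSpace != TargetAddrSpace)
      continue;
    convertToLocalCopy(GV, ModuleId);
    ++Converted;
  }
  return Converted;
}
```

Calling `convertGlobalsInAddrSpace` with address space 3 on a list that mirrors the test (an internal `shared` and an `available_externally` `shared.imported`, both in address space 3) would rename and internalize only `shared.imported`, which is the shape the `CHECK` lines in the test expect.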
