This example demonstrates the following:
- You must set -fopenmp-version=50 for OpenMP 5.0 functionality.
- How to use the if clause in pragmas for conditional target offload.
- How to use function variants for different architectures.
The example implements the SAXPY (Single-Precision A·X Plus Y) operation using OpenMP. It demonstrates:
- Conditional offloading using the if clause in #pragma omp target.
- Architecture-specific function variants using #pragma omp declare variant.
The base saxpy function runs on the host, while architecture-specific variants (amdgcn_saxpy and nvptx_saxpy) are executed on AMD and NVIDIA GPUs, respectively.
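The relationship between the base function and its variants can be sketched roughly as follows. This is a minimal illustration, not the example's actual source: the function bodies, the printed messages, and the run_saxpy wrapper are assumptions made for the sketch; only the names saxpy, amdgcn_saxpy, and nvptx_saxpy come from the description above.

```c
#include <assert.h>
#include <stdio.h>

/* Variant intended for AMD GPUs (body is an assumption for this sketch). */
void amdgcn_saxpy(int n, float a, const float *x, float *y) {
  printf("amdgcn_saxpy called\n");
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}

/* Variant intended for NVIDIA GPUs (body is an assumption for this sketch). */
void nvptx_saxpy(int n, float a, const float *x, float *y) {
  printf("nvptx_saxpy called\n");
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}

/* Base version. A call to saxpy is redirected to a variant only when the
   call is compiled for a device whose architecture matches the selector;
   otherwise (for example on the host) this body runs. */
#pragma omp declare variant(amdgcn_saxpy) match(device = {arch(amdgcn)})
#pragma omp declare variant(nvptx_saxpy) match(device = {arch(nvptx, nvptx64)})
void saxpy(int n, float a, const float *x, float *y) {
  printf("saxpy (base) called\n");
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}

/* Hypothetical wrapper: calls saxpy from inside a target region so the
   compiler can pick the variant that matches the offload architecture. */
void run_saxpy(int n, float a, const float *x, float *y) {
  assert(n >= 0);
  #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
  saxpy(n, a, x, y);
}
```

If the code is built without OpenMP offloading, or no device is available, the pragmas have no effect on which function is chosen and the base saxpy runs on the host, so the numerical result is the same in every case.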
To build the example, run the following command:
make
Ensure that the LLVM_GPU_ARCH environment variable is set to the appropriate GPU architecture (e.g., gfx90a for AMD GPUs or sm_70 for NVIDIA GPUs).
To execute the example, run:
make run
The output depends on the target architecture and on whether the condition in the if clause of the #pragma omp target directive evaluates to true. A typical output might look like:
Calling saxpy with high threshold for device execution
saxpy: Running on host. IsHost:1
Calling saxpy with low threshold for device execution
amdgcn_saxpy: Running on amdgcn device. IsHost:0
y[0],y[N-1]: 5 640
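The switch between host and device execution in the output above comes from the if clause. One way such a threshold test could be written is sketched below; the function name saxpy_if, the threshold parameter, and the comparison against n are assumptions for illustration and may differ from the example's actual code.

```c
#include <assert.h>

/* Hypothetical helper: offloads the loop only when the problem size n
   exceeds threshold. When the condition is false, the target region
   executes on the host instead. */
void saxpy_if(int n, float a, const float *x, float *y, int threshold) {
  assert(n >= 0);
  #pragma omp target teams distribute parallel for \
      if(target: n > threshold) map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}
```

With a threshold larger than n the condition is false and the loop stays on the host; with a small threshold the region is offloaded if a device is available. The computed values of y are the same either way.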