Add HIP-RT support for rendering on AMD GPUs #473

Open: wants to merge 3 commits into base: master

24 changes: 24 additions & 0 deletions 0001-MSVC-HIP.patch
@@ -0,0 +1,24 @@
From 56933471af36147e1032fbbc7912ca0088797b78 Mon Sep 17 00:00:00 2001
Owner

Is this intended to be included?

Contributor Author @jammm, Feb 2, 2025

Hmm, IIRC it was because libdeflate wasn't compiling: BitScanReverse was not declared. Perhaps the CUDA SDK pulls in the relevant intrin.h, but the HIPCC path does not. The patch is applied here:

COMMAND cd ${CMAKE_CURRENT_SOURCE_DIR}/src/ext/libdeflate && git apply ${CMAKE_CURRENT_SOURCE_DIR}/0001-MSVC-HIP.patch > nul 2> nul & exit 0

A newer libdeflate commit seems to include intrin.h, though. Should we update the submodule to that commit (8f3c3f0000c6a09943e34908654f0489489b6047) instead? ebiggers/libdeflate@8f3c3f0#diff-cad417885447534a89122739a27dcf3c0e4b4629f37befb66117c31dddb50f0aR35

From: Aaryaman Vasishta <[email protected]>
Date: Sun, 10 Dec 2023 18:40:22 +0900
Subject: [PATCH] Fix BitScanreverse by including intrin.h for MSVC

---
common/compiler_msc.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/common/compiler_msc.h b/common/compiler_msc.h
index 18cfa12..2eafd55 100644
--- a/common/compiler_msc.h
+++ b/common/compiler_msc.h
@@ -4,6 +4,7 @@

#include <stdint.h>
#include <stdlib.h> /* for _byteswap_*() */
+#include <intrin.h>

#define LIBEXPORT __declspec(dllexport)

--
2.33.0.windows.2
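
For context on the review thread above: on MSVC-compatible toolchains, `_BitScanReverse` is a compiler intrinsic declared in `<intrin.h>`, so a translation unit that calls it without that header (directly or transitively) fails with an undeclared-identifier error. The following is a minimal sketch of the pattern, not libdeflate's code; `highestSetBit` is a hypothetical helper name, and the non-MSVC branch assumes a GCC/Clang-style `__builtin_clz`.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER)
#include <intrin.h>  // declares _BitScanReverse; the patch adds this include
#endif

// Index of the highest set bit of a nonzero 32-bit value (bit 0 is the LSB).
// Hypothetical helper for illustration only.
inline unsigned highestSetBit(std::uint32_t v) {
    assert(v != 0);  // both code paths leave the result undefined for zero
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanReverse(&index, v);
    return static_cast<unsigned>(index);
#else
    // Assumption: GCC or Clang, where __builtin_clz counts leading zero bits.
    return 31u - static_cast<unsigned>(__builtin_clz(v));
#endif
}
```

Including the header explicitly in the file that uses the intrinsic avoids relying on some other SDK header to pull it in.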

676 changes: 411 additions & 265 deletions CMakeLists.txt

Large diffs are not rendered by default.

29 changes: 29 additions & 0 deletions LICENSE.txt
@@ -1,3 +1,7 @@
All of the modifications made in commit ID 14582eebabcf1ca33323e6b6d6757a49d2306dbf are covered by the MIT license (below).
The rest would be covered by the existing Apache license.

---------------------------------------------------------

Apache License
Version 2.0, January 2004
@@ -200,3 +204,28 @@
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


---------------------------------------------------------

The MIT License (MIT)

Copyright (c) 2022-2024 Advanced Micro Devices, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
30 changes: 30 additions & 0 deletions README.md
@@ -228,3 +228,33 @@ is straightforward:
```bash
$ imgtool denoise-optix noisy.exr --outfile denoised.exr
```

Instructions to build the HIP port
--------

Linux:
* Install [ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/native-install/ubuntu.html)
* Download and extract [HIPRT](https://gpuopen.com/hiprt/)
* Note that on Linux, HIP is very sensitive to version mismatches between HIPRT and the application
* If you encounter linking errors, we recommend compiling HIPRT yourself
* `cmake -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DPBRT_HIPRT=ON -DPBRT_HIPRT_PATH=~/hiprtSdk ..`
* `make` or `make pbrt_exe`

Windows:
* Download and install [HIP SDK](https://www.amd.com/en/developer/resources/rocm-hub/hip-sdk.html)
* Make sure the HIPRT component is selected during installation
* Download and install [Strawberry Perl](https://strawberryperl.com/)
* Run `x64 Native Tools Command Prompt for VS 2022` as administrator
* `set CC=clang`
* `set CXX=clang++`
* `mkdir build` and `cd build`
* `cmake -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DPBRT_HIPRT=ON ..`
* `cmake --build . --config=Release`

Example: `./pbrt --gpu ~/pbrt-v4-scenes/killeroos/killeroo-simple.pbrt`
Add `--interactive` for the interactive mode.

GPU architecture and ROCm version
--------

The instructions above assume the `gfx1100` architecture. You can specify other architectures via the CMake `AMDGPU_TARGETS` variable. Note that pbrt should be compiled with the same ROCm version as the HIPRT binaries (e.g., ROCm 6.0 with `hiprt02004_6.0_amd_lib_linux.bc`).
4 changes: 4 additions & 0 deletions src/ext/flip/flip.cpp
@@ -50,6 +50,10 @@
#include <fstream>
#include <cassert>

+#ifndef M_PI
+#define M_PI 3.14159265358979323846f
+#endif
+
namespace flip_detail {

class histogram
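
A note on the fallback definition above: the `f` suffix makes the literal a `float`, whereas the `M_PI` provided by common C libraries is an unsuffixed `double` literal. Whether `flip.cpp` depends on double precision is not visible in this diff, so the following is only a sketch of the type difference; `kPiWithSuffix`, `kPiNoSuffix` and `suffixLosesPrecision` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical constants, used only to illustrate the literal types.
constexpr auto kPiWithSuffix = 3.14159265358979323846f;  // float, as in the fallback
constexpr auto kPiNoSuffix = 3.14159265358979323846;     // double, no suffix

static_assert(std::is_same<decltype(kPiWithSuffix), const float>::value,
              "an f-suffixed literal has type float");
static_assert(std::is_same<decltype(kPiNoSuffix), const double>::value,
              "an unsuffixed floating literal has type double");

// True when rounding the constant to float changed its value.
constexpr bool suffixLosesPrecision() {
    return static_cast<double>(kPiWithSuffix) != kPiNoSuffix;
}
```

If the surrounding code computes in `double`, dropping the suffix in the fallback would keep it consistent with the library-provided macro.
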
2 changes: 1 addition & 1 deletion src/ext/openvdb
Submodule openvdb updated 392 files
1 change: 0 additions & 1 deletion src/pbrt/base/medium.h
@@ -88,7 +88,6 @@ class Medium

std::string ToString() const;

-PBRT_CPU_GPU
bool IsEmissive() const;

PBRT_CPU_GPU
12 changes: 6 additions & 6 deletions src/pbrt/cmd/imgtool.cpp
@@ -7,9 +7,9 @@
#include <pbrt/filters.h>
#include <pbrt/options.h>
#ifdef PBRT_BUILD_GPU_RENDERER
-#ifndef __HIP_PLATFORM_AMD__
+#ifdef __NVCC__
#include <pbrt/gpu/optix/denoiser.h>
-#endif // __HIP_PLATFORM_AMD__
+#endif // __NVCC__
#include <pbrt/gpu/util.h>
#endif // PBRT_BUILD_GPU_RENDERER
#include <pbrt/util/args.h>
@@ -2220,7 +2220,7 @@ int makeequiarea(std::vector<std::string> args) {
return 0;
}

-#ifdef PBRT_BUILD_GPU_RENDERER
+#ifdef __NVCC__
int denoise_optix(std::vector<std::string> args) {
std::string inFilename, outFilename;

@@ -2335,7 +2335,7 @@ int denoise_optix(std::vector<std::string> args) {

return 0;
}
-#endif // PBRT_BUILD_GPU_RENDERER
+#endif // __NVCC__

int main(int argc, char *argv[]) {
PBRTOptions opt;
@@ -2362,10 +2362,10 @@ int main(int argc, char *argv[]) {
return convert(args);
else if (cmd == "diff")
return diff(args);
-#ifdef PBRT_BUILD_GPU_RENDERER
+#ifdef __NVCC__
else if (cmd == "denoise-optix")
return denoise_optix(args);
-#endif // PBRT_BUILD_GPU_RENDERER
+#endif // __NVCC__
else if (cmd == "error")
return error(args);
else if (cmd == "falsecolor")
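
The `imgtool.cpp` hunks above all follow one rule: the function definition and its call site in the dispatch chain are wrapped in the same preprocessor guard, so a build without the backend neither defines nor references the command. A minimal sketch of that pattern follows; `DEMO_HAVE_OPTIX`, `denoiseOptix` and `dispatch` are hypothetical stand-ins (the diff itself uses `__NVCC__`), and the handlers just return fixed codes.

```cpp
#include <cassert>
#include <string>

// DEMO_HAVE_OPTIX is a hypothetical stand-in for the guard used in the diff.
#ifdef DEMO_HAVE_OPTIX
// Only compiled when the backend is available.
int denoiseOptix() { return 0; }
#endif

// Returns the handler's exit code, or -1 for an unknown command.
int dispatch(const std::string &cmd) {
    if (cmd == "diff")
        return 0;
#ifdef DEMO_HAVE_OPTIX
    // Same guard as the definition, so the call never dangles.
    else if (cmd == "denoise-optix")
        return denoiseOptix();
#endif
    else if (cmd == "error")
        return 0;
    return -1;
}
```

Built without `DEMO_HAVE_OPTIX`, `dispatch("denoise-optix")` falls through to the unknown-command path instead of failing at link time.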
2 changes: 1 addition & 1 deletion src/pbrt/cmd/pbrt_test.cpp
@@ -2,14 +2,14 @@
// The pbrt source code is licensed under the Apache License, Version 2.0.
// SPDX: Apache-2.0

-#include <gtest/gtest.h>
#include <pbrt/pbrt.h>

#include <pbrt/options.h>
#include <pbrt/util/args.h>
#include <pbrt/util/error.h>
#include <pbrt/util/print.h>

+#include <gtest/gtest.h>
#include <string>

using namespace pbrt;
4 changes: 4 additions & 0 deletions src/pbrt/cmd/pspec_gpu.cpp
@@ -16,8 +16,12 @@
#include <pbrt/util/image.h>
#include <pbrt/util/vecmath.h>

+#if defined(__HIPCC__)
+#include <pbrt/util/hip_aliases.h>
+#else
#include <cuda.h>
#include <cuda_runtime_api.h>
+#endif

#include <vector>

111 changes: 111 additions & 0 deletions src/pbrt/gpu/common.h
@@ -0,0 +1,111 @@
#ifndef PBRT_GPU_COMMON_H
#define PBRT_GPU_COMMON_H

#include <pbrt/pbrt.h>

#include <pbrt/base/light.h>
#include <pbrt/base/material.h>
#include <pbrt/base/medium.h>
#include <pbrt/base/shape.h>
#include <pbrt/base/texture.h>
#include <pbrt/util/pstd.h>
#include <pbrt/wavefront/workitems.h>
#include <pbrt/wavefront/workqueue.h>

#if defined(__HIPCC__)
#include <hiprt/hiprt.h>
#include <hiprt/hiprt_vec.h>
#else
#include <optix.h>
#endif

namespace pbrt {

class TriangleMesh;
class BilinearPatchMesh;

struct TriangleMeshRecord {
const TriangleMesh *mesh;
Material material;
FloatTexture alphaTexture;
pstd::span<Light> areaLights;
MediumInterface *mediumInterface;
};

struct BilinearMeshRecord {
const BilinearPatchMesh *mesh;
Material material;
FloatTexture alphaTexture;
pstd::span<Light> areaLights;
MediumInterface *mediumInterface;
};

struct QuadricRecord {
Shape shape;
Material material;
FloatTexture alphaTexture;
Light areaLight;
MediumInterface *mediumInterface;
};

#if defined(__HIP_PLATFORM_AMD__)
static constexpr size_t HitgroupAlignment = 16u;

struct alignas(HitgroupAlignment) HitgroupRecord {
PBRT_CPU_GPU HitgroupRecord() {}
PBRT_CPU_GPU HitgroupRecord(const HitgroupRecord &r) {
memcpy(this, &r, sizeof(HitgroupRecord));
}
PBRT_CPU_GPU HitgroupRecord &operator=(const HitgroupRecord &r) {
if (this != &r)
memcpy(this, &r, sizeof(HitgroupRecord));
return *this;
}

union {
TriangleMeshRecord triRec;
BilinearMeshRecord blpRec;
QuadricRecord quadricRec;
};
enum { TriangleMesh, BilinearMesh, Quadric } type;
};
#endif

struct RayIntersectParameters {
#if defined(__HIPCC__)
hiprtScene traversable;
#else
OptixTraversableHandle traversable;
#endif

const RayQueue *rayQueue;

// Closest hit
RayQueue *nextRayQueue;
EscapedRayQueue *escapedRayQueue;
HitAreaLightQueue *hitAreaLightQueue;
MaterialEvalQueue *basicEvalMaterialQueue, *universalEvalMaterialQueue;
MediumSampleQueue *mediumSampleQueue;

// Shadow rays
ShadowRayQueue *shadowRayQueue;
SOA<PixelSampleState> pixelSampleState;

// Subsurface scattering...
SubsurfaceScatterQueue *subsurfaceScatterQueue;

#if defined(__HIPCC__)
// Stack buffers
hiprtGlobalStackBuffer globalStackBuffer;
hiprtGlobalStackBuffer globalInstanceStackBuffer;
// Custom function table
hiprtFuncTable funcTable;
// Hitgroup records
HitgroupRecord *hgRecords;
// Offsets for hitgroup records
uint32_t *offsets;
#endif
};
} // namespace pbrt

#endif // PBRT_GPU_COMMON_H
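
The `HitgroupRecord` in `common.h` above is a tagged union: an anonymous union of record structs plus an enum saying which member is active, with copy operations written as `memcpy` over the whole object. The sketch below reproduces that pattern on the CPU with simplified, hypothetical payload types (`MeshRec`, `QuadricRec`, `Record`, `materialOf`). It assumes every union member is trivially copyable, which is what makes a byte-wise copy valid; the real pbrt record types are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical payloads standing in for the mesh and quadric records.
struct MeshRec { const void *mesh; int materialId; };
struct QuadricRec { int shapeId; int materialId; };

static_assert(std::is_trivially_copyable<MeshRec>::value, "memcpy copy needs this");
static_assert(std::is_trivially_copyable<QuadricRec>::value, "memcpy copy needs this");

constexpr std::size_t kRecordAlignment = 16;

struct alignas(kRecordAlignment) Record {
    Record() : mesh{nullptr, 0}, type(Mesh) {}
    // Byte-wise copy of payload and tag together.
    Record(const Record &r) { std::memcpy(static_cast<void *>(this), &r, sizeof(Record)); }
    Record &operator=(const Record &r) {
        if (this != &r)
            std::memcpy(static_cast<void *>(this), &r, sizeof(Record));
        return *this;
    }

    union {
        MeshRec mesh;
        QuadricRec quadric;
    };
    enum Kind { Mesh, Quadric } type;
};

// Reads the field through whichever member the tag marks as active.
inline int materialOf(const Record &r) {
    return r.type == Record::Mesh ? r.mesh.materialId : r.quadric.materialId;
}
```

The explicit alignment keeps every element of a record array on a 16-byte boundary, and `sizeof(Record)` is rounded up to a multiple of that alignment.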
13 changes: 13 additions & 0 deletions src/pbrt/gpu/cudagl.h
@@ -34,9 +34,13 @@

#include <glad/glad.h>

+#if defined(__HIPCC__)
+#include <pbrt/util/hip_aliases.h>
+#else
#include <cuda.h>
#include <cuda_gl_interop.h>
#include <cuda_runtime.h>
+#endif

#define GL_CHECK(call) \
do { \
@@ -370,6 +374,15 @@ CUDAOutputBuffer<PIXEL_FORMAT>::CUDAOutputBuffer(int32_t width, int32_t height)
nullptr, GL_STREAM_DRAW));
GL_CHECK(glBindBuffer(GL_ARRAY_BUFFER, 0u));

+#ifdef __HIPCC__
+uint32_t num_gl_devices = 0;
+
+int glDevice;
+cudaGLGetDevices(&num_gl_devices, &glDevice, 1, cudaGLDeviceListAll);
+
+if (glDevice != current_device)
+LOG_FATAL("Multi-GPU not supported with GL interop yet");
+#endif
CUDA_CHECK(cudaGraphicsGLRegisterBuffer(&m_cuda_gfx_resource, m_pbo,
cudaGraphicsMapFlagsWriteDiscard));
