
Troubleshooting D3D12 out of memory errors with GPGMM

Bryan B edited this page Feb 14, 2023 · 9 revisions

GPGMM offers highly configurable allocators to assist in debugging OOM errors.

First, test to see if E_OUTOFMEMORY occurs when using driver-based allocation by specifying the ALLOCATOR_FLAG_ALWAYS_COMMITTED flag in CreateResourceAllocator.

If E_OUTOFMEMORY still occurs, the issue is likely over-commit by the application/workload. Enable debug logging by setting MinLogLevel to D3D12_MESSAGE_SEVERITY_MESSAGE, then check for resource allocation alignment warnings, or consider ALLOCATION_FLAG_ALLOW_SUBALLOCATE_WITHIN_RESOURCE. See https://github.com/intel/GPGMM/wiki/Profiling-allocator-GPU-memory-usage-with-GPGMM for more insights. Otherwise, proceed to the next step.

Second, test to see if E_OUTOFMEMORY occurs when pooling is disabled by specifying the ALLOCATOR_FLAG_ALWAYS_ON_DEMAND flag in CreateResourceAllocator.

If E_OUTOFMEMORY still occurs, proceed to the next step. Otherwise, the issue is likely that the application/workload is creating many resources of variable sizes that cannot benefit from pooling, so pooled heaps sit unused. The application/workload should consider calling ReleaseResourceHeaps to release these unused heaps when the working set changes significantly, or consider a different PoolAlgorithm and PreferredResourceHeapSize to limit the amount of memory retained.

Third, test to see if E_OUTOFMEMORY occurs when specifying a MemoryFragmentationLimit value of 1 in CreateResourceAllocator.

If E_OUTOFMEMORY still occurs, proceed to the next step. Otherwise, the issue is likely due to memory fragmentation: allocation sizes are larger than resource sizes due to alignment mismatches. Consider a different ALLOCATOR_ALGORITHM or ALLOCATION_FLAG_NEVER_SUBALLOCATE_MEMORY, if available.
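To get a feel for how much memory an alignment mismatch can waste, the arithmetic can be sketched as below. This is only an illustration and not part of GPGMM: `AlignTo` and `WastedFraction` are hypothetical helpers, and the 64 KiB figure is assumed here because it is D3D12's default resource placement alignment.

```cpp
#include <cstdint>

// Hypothetical helper (not a GPGMM API): rounds `size` up to the next
// multiple of `alignment`. `alignment` must be non-zero.
constexpr std::uint64_t AlignTo(std::uint64_t size, std::uint64_t alignment) {
    return ((size + alignment - 1) / alignment) * alignment;
}

// Hypothetical helper: fraction of the aligned allocation that the
// resource does not actually use (0.0 means no waste).
constexpr double WastedFraction(std::uint64_t resourceSize, std::uint64_t alignment) {
    return AlignTo(resourceSize, alignment) == 0
               ? 0.0
               : static_cast<double>(AlignTo(resourceSize, alignment) - resourceSize) /
                     static_cast<double>(AlignTo(resourceSize, alignment));
}

// Assumed alignment: 64 KiB, the D3D12 default resource placement alignment.
constexpr std::uint64_t kAssumedAlignment = 64ull * 1024ull;
```

Under these assumptions, a 4 KiB resource placed at 64 KiB alignment occupies a full 64 KiB, so most of that allocation is padding. Many such small resources can exhaust memory well before the sum of their sizes would suggest.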

Fourth, test to see if E_OUTOFMEMORY occurs when MemoryGrowthFactor is set to 1.

If E_OUTOFMEMORY still occurs, proceed to the next step. Otherwise, the issue is likely due to the default memory growth rate being too aggressive for the application/workload. The application/workload should consider creating a separate allocator for these resources or restricting growth.
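The four steps above form a simple decision tree, which could be sketched roughly as follows. This is not GPGMM code: `OomTestResults`, `Diagnosis`, and `Diagnose` are hypothetical names introduced for illustration, and each boolean is assumed to record whether E_OUTOFMEMORY was still observed after rerunning the workload with the corresponding setting.

```cpp
// Hypothetical record of the four experiments described above.
// Each field is true if E_OUTOFMEMORY still occurred in that configuration.
struct OomTestResults {
    bool oomWhenAlwaysCommitted;        // step 1: ALLOCATOR_FLAG_ALWAYS_COMMITTED
    bool oomWhenAlwaysOnDemand;         // step 2: ALLOCATOR_FLAG_ALWAYS_ON_DEMAND
    bool oomWhenFragmentationLimitOne;  // step 3: MemoryFragmentationLimit = 1
    bool oomWhenGrowthFactorOne;        // step 4: MemoryGrowthFactor = 1
};

// Hypothetical labels for the likely causes named in each step.
enum class Diagnosis {
    OverCommit,           // application/workload over-commits memory
    PoolingRetention,     // pooled heaps retained but not reused
    Fragmentation,        // alignment mismatches inflate allocation sizes
    GrowthTooAggressive,  // default memory growth rate too high
    Undetermined          // none of the four steps isolated the cause
};

// Walks the steps in order and returns the first likely cause.
Diagnosis Diagnose(const OomTestResults& r) {
    if (r.oomWhenAlwaysCommitted) return Diagnosis::OverCommit;
    if (!r.oomWhenAlwaysOnDemand) return Diagnosis::PoolingRetention;
    if (!r.oomWhenFragmentationLimitOne) return Diagnosis::Fragmentation;
    if (!r.oomWhenGrowthFactorOne) return Diagnosis::GrowthTooAggressive;
    return Diagnosis::Undetermined;
}
```

The order matters: later results are only meaningful once the earlier steps have ruled out their causes, so `Diagnose` stops at the first step that points to a cause.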
