fix: Improve CUDA version detection and error handling #1599

ved1beta · 2025-04-16T08:59:12Z

fix: Improve CUDA/HIP version detection and error handling

This commit fixes issue #1513 by improving the version detection logic
in cuda_specs.py to handle both CUDA and ROCm/HIP systems more robustly.

Key changes:

- Add proper error handling for None values in version detection
- Make version string and tuple functions return Optional types
- Add robust version string parsing with try-except blocks
- Add validation checks for compute capabilities
- Improve error handling in get_cuda_specs function
- Add proper type hints and documentation

The changes ensure that:

- NoneType errors are properly handled when version info is missing
- ROCm/HIP systems are properly supported
- Invalid version strings don't cause crashes
- All error cases return None instead of raising exceptions

Test results show successful version detection for both CUDA and ROCm/HIP
systems, resolving the original issue where torch.version.cuda.split()
would fail on ROCm systems.
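
For illustration, the None-safe parsing pattern described above can be sketched roughly as follows. This is a minimal standalone sketch, not the code in cuda_specs.py: the helper names parse_version_tuple and format_version_string are hypothetical, and the compact string format is an assumption made for the example.

```python
from typing import Optional, Tuple


def parse_version_tuple(version: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (major, minor) parsed from a version string, or None.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    # On ROCm builds torch.version.cuda is None, so guard before calling .split().
    if not version:
        return None
    try:
        # Keep only the first two components; extra suffixes are ignored.
        major, minor = version.split(".")[:2]
        return int(major), int(minor)
    except (ValueError, AttributeError):
        # Too few components, non-numeric parts, or a non-string input.
        return None


def format_version_string(version: Optional[str]) -> Optional[str]:
    """Return a compact "<major><minor>" string, or None if parsing failed."""
    parsed = parse_version_tuple(version)
    if parsed is None:
        return None
    major, minor = parsed
    return f"{major}{minor}"
```

A caller would pass whichever of torch.version.cuda or torch.version.hip is set and treat a None result as "version unknown" rather than catching an exception.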

github-actions · 2025-04-16T14:52:22Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

matthewdouglas · 2025-04-16T14:52:26Z

LGTM, thanks! We can merge after fixing the failing linter check.

fix: Improve CUDA version detection and error handling

6074e0e

matthewdouglas added CUDA Setup Cross Platform labels Apr 16, 2025

matthewdouglas added this to the v0.46.0 milestone Apr 16, 2025

ved1beta added 2 commits April 16, 2025 21:18

lint fix

28d4d5c

lint fix

1724fa1

matthewdouglas linked an issue Apr 17, 2025 that may be closed by this pull request

torch.version.cuda.split error #1513

Closed

matthewdouglas merged commit feaedbb into bitsandbytes-foundation:main Apr 17, 2025
33 checks passed
