[Bug]: Inconsistency between is_contiguous and stride API in HIPGRAPH

🐛 Describe the bug
I could not replicate this scenario with a simple test script. The bug was found in vLLM on ROCm when running meta-llama/Llama-4-Scout-17B-16E-Instruct. The behaviour only occurs in HIPGraph mode + torch.compile; in eager mode + torch.compile, the is_contiguous() API and the stride() API are consistent.
In HIPGraph mode, the tensor A can end up with the following properties:
- .shape: torch.Size([1024, 1])
- .is_contiguous(): True
- .stride(): (1, 1024)
- .is_contiguous(memory_format=torch.channels_last): False
- .is_contiguous(memory_format=torch.contiguous_format): True
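A minimal sketch of the layout checks involved (the helper name dump_layout is mine; the values in the comments are the ones observed inside the HIPGraph-captured region in vLLM, not something this helper reproduces on its own):

```python
import torch

def dump_layout(A: torch.Tensor) -> None:
    # The commented values are what was observed for the problematic tensor
    # on ROCm under HIPGraph capture + torch.compile.
    print(A.shape)                                                  # torch.Size([1024, 1])
    print(A.is_contiguous())                                        # True
    print(A.stride())                                               # (1, 1024)
    print(A.is_contiguous(memory_format=torch.channels_last))      # False
    print(A.is_contiguous(memory_format=torch.contiguous_format))  # True
```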
The expected behaviour is that .stride(1) == 1 is True whenever .is_contiguous() is True.
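As a sketch, this is the invariant I expect to hold (here A stands for the tensor above; this is exactly the check that fails in the HIPGraph case):

```python
# For a contiguous tensor of shape (1024, 1), the innermost stride should be 1.
if A.is_contiguous():
    assert A.stride(1) == 1, f"contiguous tensor has stride {A.stride()}"
```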
Calling A = A.contiguous() does not fix the issue; .stride() is still (1, 1024). The current workaround is A = A.view(-1).reshape(A.shape), as sketched below.
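Roughly what this looks like in place (a sketch, not the exact vLLM code; the printed strides are the values observed on ROCm):

```python
# .contiguous() returns self because is_contiguous() already reports True,
# so the stale (1, 1024) stride survives it.
A = A.contiguous()
print(A.stride())   # still (1, 1024)

# Flattening to 1-D and reshaping back yields a view with freshly computed
# contiguous strides.
A = A.view(-1).reshape(A.shape)
print(A.stride())   # (1, 1)
```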
On CUDA, .stride() returns (1, 1), but on ROCm it returns (1, 1024).
Versions
Collecting environment information...
PyTorch version: 2.7.0a0+git295f2ed
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.3.42133-1b9c17779
OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: 18.0.0git (https://github.com/RadeonOpenCompute/llvm-project roc-6.3.1 24491 1e0fda770a2079fbd71e4b70974d74f62fd3af10)
CMake version: version 3.31.6
Libc version: glibc-2.35
Python version: 3.12.9 (main, Feb 5 2025, 08:49:00) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.0-116-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Instinct MI300X (gfx942:sramecc+:xnack-)
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.3.42133
MIOpen runtime version: 3.3.0
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.7.0a0+git295f2ed
[pip3] torchvision==0.21.0+7af6987
[pip3] triton==3.2.0+gite5be006a