Create API for hardware activation (Nvidia) #1603

Closed
Tracked by #1568
vansangpfiev opened this issue Nov 1, 2024 · 1 comment

@vansangpfiev
Contributor

vansangpfiev commented Nov 1, 2024

v1/hardware/activate

For Nvidia cards, we can use the API below:
[image]

Records device as the device on which the active host thread executes the device code.
If the host thread has already initialized the CUDA runtime by calling non-device management
runtime functions or if there exists a CUDA driver context active on the host thread, then this
call returns cudaErrorSetOnActiveProcess.

Note that we may need to restart the server to apply this change.

Update: The API "records device as the device on which the active host thread executes the device code", so the setting is per host thread. It seems we cannot apply the active device to all threads with a single call.
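The per-thread behaviour can be pictured with a small model that does not call CUDA at all. This is only a sketch: `g_current_device`, `SetDeviceModel`, `GetDeviceModel`, and `DeviceSeenByNewThread` are hypothetical names, and a `thread_local` variable stands in for the runtime's per-thread current device (which defaults to device 0).

```cpp
#include <thread>

// Stand-in for the runtime's per-thread "current device". Not a CUDA API.
thread_local int g_current_device = 0;

// Model of "set device": only affects the calling thread's copy.
void SetDeviceModel(int device) { g_current_device = device; }

// Model of "get device": reads the calling thread's copy.
int GetDeviceModel() { return g_current_device; }

// Starts a fresh thread and reports which device that thread sees.
int DeviceSeenByNewThread() {
  int seen = -1;
  std::thread worker([&seen] { seen = GetDeviceModel(); });
  worker.join();
  return seen;
}
```

In this model, calling `SetDeviceModel(1)` on the main thread leaves a newly started worker on device 0, which is why every worker thread would have to make the call itself.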

Question: Should we use CUDA_VISIBLE_DEVICES?
Answer: This is the best approach I know of. We will support TensorRT-LLM and ONNX(?), so it reduces complexity because we don't need to change the logic.
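A minimal sketch of the environment-variable approach, assuming a POSIX system and that the activate call receives a list of GPU indices. `BuildVisibleDevices` and `ActivateGpus` are hypothetical helper names, not existing functions in this project. CUDA reads `CUDA_VISIBLE_DEVICES` when it initializes, so the variable has to be set before any engine touches the GPU, which matches the note about restarting the server.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Builds the comma-separated value for CUDA_VISIBLE_DEVICES,
// e.g. {0, 2} -> "0,2". An empty list yields an empty string.
std::string BuildVisibleDevices(const std::vector<int>& gpu_indices) {
  std::string value;
  for (int idx : gpu_indices) {
    if (idx < 0) throw std::invalid_argument("GPU index must be non-negative");
    if (!value.empty()) value += ',';
    value += std::to_string(idx);
  }
  return value;
}

// Hypothetical helper: exports the variable for this process and any child
// processes started afterwards. Uses POSIX setenv; returns true on success.
// It must run before anything in the process initializes CUDA.
bool ActivateGpus(const std::vector<int>& gpu_indices) {
  const std::string value = BuildVisibleDevices(gpu_indices);
  return ::setenv("CUDA_VISIBLE_DEVICES", value.c_str(), /*overwrite=*/1) == 0;
}
```

For example, `ActivateGpus({0, 2})` would set `CUDA_VISIBLE_DEVICES=0,2`, and engines started after that point would only see those two cards, without any engine-specific device selection code.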

For AMD, I think we have ROCR_VISIBLE_DEVICES (and HIP_VISIBLE_DEVICES). Are there any other environment variables that could be useful?
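If the same mechanism is extended to AMD later, the vendor-specific part could be as small as choosing the variable name. The sketch below is an assumption about how that might look: `GpuVendor` and `VisibilityEnvVar` are hypothetical names. The variable names themselves are real: AMD's ROCm documentation lists `ROCR_VISIBLE_DEVICES` (runtime level) and `HIP_VISIBLE_DEVICES` (HIP level).

```cpp
#include <string>

// Hypothetical vendor tag for the detected hardware.
enum class GpuVendor { kNvidia, kAmd };

// Returns the name of the device-visibility environment variable for a vendor.
// For AMD this picks ROCR_VISIBLE_DEVICES; HIP_VISIBLE_DEVICES is an
// alternative that applies only to HIP applications.
std::string VisibilityEnvVar(GpuVendor vendor) {
  switch (vendor) {
    case GpuVendor::kNvidia:
      return "CUDA_VISIBLE_DEVICES";
    case GpuVendor::kAmd:
      return "ROCR_VISIBLE_DEVICES";
  }
  return "";  // Unreachable for valid enum values.
}
```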

@github-project-automation github-project-automation bot moved this to Investigating in Menlo Nov 1, 2024
@vansangpfiev vansangpfiev self-assigned this Nov 1, 2024
@vansangpfiev vansangpfiev moved this from Investigating to In Progress in Menlo Nov 1, 2024
@vansangpfiev vansangpfiev added the category: hardware management Related to hardware & compute label Nov 1, 2024
@vansangpfiev vansangpfiev mentioned this issue Nov 5, 2024
3 tasks
@vansangpfiev vansangpfiev moved this from In Progress to In Review in Menlo Nov 12, 2024
@gabrielle-ong gabrielle-ong modified the milestones: v1.0.3, v1.0.4 Nov 12, 2024
@gabrielle-ong gabrielle-ong modified the milestones: v1.0.4, v1.0.3 Nov 20, 2024
@gabrielle-ong gabrielle-ong changed the title Create API for hardware activation Create API for hardware activation (Nvidia) Nov 21, 2024
@gabrielle-ong
Contributor

Thanks @vansangpfiev, marking as complete as a subtask of #1568
Noted the idea of using ROCR_VISIBLE_DEVICES for AMD in the future.

@gabrielle-ong gabrielle-ong moved this from Review + QA to Completed in Menlo Nov 21, 2024