No module named 'deep_gemm' #3249

Closed
3 tasks
lordk911 opened this issue Apr 14, 2025 · 0 comments
@lordk911
Contributor

System Info

flashinfer-python: 0.2.4
transformers: 4.50.3
torch: 2.6.0

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

xinference 1.4.1

The command used to start Xinference

xinference launch -n qwen2.5-instruct -u qwen2.5-72B-instruct -s 72 -f awq --gpu-idx 0 --context_length 10240 -e "http://10.9.27.41:9997" --worker-ip 10.9.27.41 --model-engine sglang

Reproduction

Launch model name: qwen2.5-instruct with kwargs: {'context_length': 10240}
Traceback (most recent call last):
  File "/data/miniconda3/envs/xinference/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 928, in model_launch
    model_uid = client.launch_model(
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 1007, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=10.9.27.41:36139, pid=3186654] No module named 'deep_gemm'
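
The error is raised on the worker (address 10.9.27.41:36139), so it may help to confirm which modules the worker's Python environment can actually see. Below is a minimal diagnostic sketch, not a fix. It assumes it is run with the same interpreter as the worker (here, the `xinference` conda environment shown in the traceback). The helper name `missing_modules` is hypothetical, and the list of modules to check is my own choice based on the error message and the selected engine.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules;
            # treat those as unavailable too.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed list: the engine package plus the module named in the error.
    print("Missing:", missing_modules(["sglang", "deep_gemm"]))
```

If `deep_gemm` is reported as missing while `sglang` is present, the installed sglang build is importing a dependency that is not installed in that environment.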

Expected behavior

Be able to use SGLang as the model engine for qwen2.5-instruct.

@XprobeBot XprobeBot added the gpu label Apr 14, 2025
@XprobeBot XprobeBot added this to the v1.x milestone Apr 14, 2025