Launch model name: qwen2.5-instruct with kwargs: {'context_length': 10240}
Traceback (most recent call last):
  File "/data/miniconda3/envs/xinference/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 928, in model_launch
    model_uid = client.launch_model(
  File "/data/miniconda3/envs/xinference/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 1007, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=10.9.27.41:36139, pid=3186654] No module named 'deep_gemm'
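The error above says the worker process could not import deep_gemm. As a quick diagnostic, the sketch below checks whether a list of modules can be located in the current Python environment. It only uses the standard library; the helper name missing_modules is made up for illustration, and the sketch assumes it is run with the same interpreter as the Xinference worker (here, the xinference conda environment). It does not tell you which package provides deep_gemm or how to install it.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be located in this environment.

    Hypothetical helper for illustration; not part of Xinference or SGLang.
    """
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names taken from the error message and the chosen engine
    print(missing_modules(["sglang", "deep_gemm"]))
```

If deep_gemm shows up in the printed list, the failure is reproducible outside Xinference and is likely an environment or dependency issue rather than a problem with the launch command itself.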
System Info
flashinfer-python: 0.2.4
transformers: 4.50.3
torch: 2.6.0
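For anyone trying to reproduce this, a version listing like the one above can be collected with the standard library. This is a minimal sketch: the helper name installed_versions is hypothetical, and the distribution names are assumed to match what pip reports in your environment (sglang in particular is an assumption, since its version is not given in this report).

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration only.
    """
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None
    return versions


if __name__ == "__main__":
    names = ["xinference", "sglang", "flashinfer-python", "transformers", "torch"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version}")
```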
Running Xinference with Docker?
Version info
xinference 1.4.1
The command used to start Xinference
xinference launch -n qwen2.5-instruct -u qwen2.5-72B-instruct -s 72 -f awq --gpu-idx 0 --context_length 10240 -e "http://10.9.27.41:9997" --worker-ip 10.9.27.41 --model-engine sglang
Reproduction
Expected behavior
qwen2.5-instruct should launch successfully with sglang as the model engine.