Add support for llama.cpp's llama-server service #792


Open · mrcedar-git opened this issue Mar 21, 2025 · 8 comments
Labels
enhancement New feature or request

Comments

@mrcedar-git

In what scenario is the requested feature needed?

When I try to use pdf2zh with the API served by llama.cpp's llama-server at http://127.0.0.1:8080/v1, it always fails with an "unsupported" message. Please add support for this service. Many thanks!

Proposed solution

No response

Additional context

No response

mrcedar-git added the enhancement (New feature or request) label on Mar 21, 2025
@awwaawwa (Collaborator)

Please connect via the OpenAI-like (openailiked) service.
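
For reference, llama-server exposes an OpenAI-compatible API under /v1, so the exact model id that the openailiked service should be given can be checked first (a minimal sanity check against the standard OpenAI-compatible route):

# List the models the server reports; the returned "id" field is the
# value OPENAILIKED_MODEL (or the openailiked:<model> suffix) must match.
curl http://127.0.0.1:8080/v1/models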

@mrcedar-git (Author)

I read through the service's advanced documentation and tested the two approaches below in a bash shell on Rocky Linux 9. Neither succeeded; the run always stalls at this point:
29%|██████████████████████▊ | 2/7 [01:06<02:46, 33.24s/it]

First attempt:

export OPENAILIKED_BASE_URL=http://127.0.0.1:8080/v1
pdf2zh example.pdf -s openailiked:Qwen2.5-7B-Instruct-Q8_0  -li en -lo zh

Second attempt:

export OPENAILIKED_BASE_URL=http://127.0.0.1:8080/v1 
export OPENAILIKED_MODEL=Qwen2.5-7B-Instruct-Q8_0
pdf2zh example.pdf -s openailiked  -li en -lo zh
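
(One difference from the later, partially working setup, noted here as an assumption: neither attempt exports an API key. llama-server started without --api-key does not validate the key, but the OpenAI-style client may still refuse an empty one, so a placeholder is worth setting:)

# Placeholder value; the server ignores it unless launched with --api-key.
export OPENAILIKED_API_KEY=anything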

The ollama API worked when I tried it earlier.
Any pointers would be much appreciated!

@hellofinch (Contributor)

Please provide some console output.

@mrcedar-git (Author) commented Mar 24, 2025

Environment:
OS: Rocky Linux 9.5
Start the service:
# systemctl start llama-qwen2.5

● llama-qwen2.5.service - llama-qwen2.5.server
     Loaded: loaded (/etc/systemd/system/llama-qwen2.5.service; disabled; preset: disabled)
     Active: active (running) since Mon 2025-03-24 12:40:36 CST; 25s ago
   Main PID: 4690 (llama-server)
      Tasks: 16 (limit: 150436)
     Memory: 8.1G
        CPU: 3.989s
     CGroup: /system.slice/llama-qwen2.5.service
             └─4690 /usr/bin/llama-server -ngl 23 -m /home/wang/llama.cpp/Models-gguf/Qwen2.5-7B-Instruct-Q8_0.gguf --port 8080

3月 24 12:40:54 192.168.0.110 llama-server[4690]: <|im_start|>user
3月 24 12:40:54 192.168.0.110 llama-server[4690]: Hello<|im_end|>
3月 24 12:40:54 192.168.0.110 llama-server[4690]: <|im_start|>assistant
3月 24 12:40:54 192.168.0.110 llama-server[4690]: Hi there<|im_end|>
3月 24 12:40:54 192.168.0.110 llama-server[4690]: <|im_start|>user
3月 24 12:40:54 192.168.0.110 llama-server[4690]: How are you?<|im_end|>
3月 24 12:40:54 192.168.0.110 llama-server[4690]: <|im_start|>assistant
3月 24 12:40:54 192.168.0.110 llama-server[4690]: '
3月 24 12:40:54 192.168.0.110 llama-server[4690]: main: server is listening on http://127.0.0.1:8080 - starting the main loop
3月 24 12:40:54 192.168.0.110 llama-server[4690]: srv update_slots: all slots are idle

Activate the virtual environment:
source .venv/bin/activate

Set the environment variables:
(pdf2zh) [wang@192 pdf2zh]$ export OPENAILIKED_BASE_URL=http://127.0.0.1:8080/v1
(pdf2zh) [wang@192 pdf2zh]$ export OPENAILIKED_MODEL=Qwen2.5-7B-Instruct-Q8_0
(pdf2zh) [wang@192 pdf2zh]$ export OPENAILIKED_API_KEY=openailiked

Contents of config.json:
{
    "translators": [
        {
            "name": "openailiked",
            "envs": {
                "OPENAILIKED_BASE_URL": "http://127.0.0.1:8080/v1",
                "OPENAILIKED_API_KEY": "openailiked",
                "OPENAILIKED_MODEL": "Qwen2.5-7B-Instruct-Q8_0"
            }
        }
    ],
    "PDF2ZH_LANG_FROM": "English",
    "PDF2ZH_LANG_TO": "Simplified Chinese"
}

Run the command:
(pdf2zh) [wang@192 pdf2zh]$ pdf2zh stable-ts-help.pdf -s openailiked -li en -lo zh

Terminal output:
not in git repo
Namespace(files=['stable-ts-help.pdf'], debug=False, pages=None, vfont='', vchar='', lang_in='en', lang_out='zh', service='openailiked', output='', thread=4, interactive=False, share=False, flask=False, celery=False, authorized=None, prompt=None, compatible=False, onnx=None, serverport=None, dir=False, config=None, babeldoc=False, skip_subset_fonts=False)
29%|████████████▊ | 2/7 [00:00<00:02, 2.30it/s]
[03/24/25 13:11:17] ERROR ERROR:pdf2zh.converter:Request timed out. (converter.py:356)
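
(A possibility worth ruling out, offered as an assumption rather than a diagnosis: the Namespace above shows thread=4, i.e. four concurrent requests, while a llama-server started without extra slots serves one request at a time, so queued requests can exceed the client timeout. Running single-threaded would test this; the flag name is inferred from the Namespace output:)

# One request at a time, so nothing queues behind the single server slot.
pdf2zh stable-ts-help.pdf -s openailiked -li en -lo zh --thread 1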

@hellofinch (Contributor)

It looks like the connection timed out. Is connectivity to llama.cpp OK?
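
For example, timing one request against the OpenAI-compatible endpoint would show whether the server answers promptly (a minimal sketch using the standard chat-completions route that llama-server implements):

# A single chat completion; a slow response here would explain
# client-side timeouts even though the server is reachable.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen2.5-7B-Instruct-Q8_0", "messages": [{"role": "user", "content": "Hello"}]}'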

@mrcedar-git (Author)

Connectivity to llama.cpp is fine: connecting with Chatbox translates normally, and I can also access it and translate through the web UI.
Today I also tried the openai service, setting these environment variables:

export OPENAI_BASE_URL=http://127.0.0.1:8080/v1
export OPENAI_MODEL=Qwen2.5-7B-Instruct-Q8_0
export OPENAI_API_KEY=None

It connected and translated, but at around page 100 (the full document is 600+ pages) the connection-timeout error appeared again.
When I switched to the -s bing translation service, the entire document translated correctly.
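
(If the server is simply falling behind pdf2zh's concurrent requests, one option, offered as an assumption rather than a confirmed fix, is to give llama-server more parallel slots and a larger total context in the systemd unit; -np/--parallel adds decoding slots and the -c/--ctx-size budget is split across them:)

# Example launch with 4 slots; model path and -ngl taken from the unit above.
/usr/bin/llama-server -ngl 23 -m /home/wang/llama.cpp/Models-gguf/Qwen2.5-7B-Instruct-Q8_0.gguf --port 8080 -np 4 -c 8192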

@hellofinch (Contributor)

Strange.
@awwaawwa, any ideas?

@awwaawwa (Collaborator)


Let's look into it after 2.0 is released.
