Commit 0f7a8a9: Support GLM4V (#336)

1 parent f86777c

25 files changed: +1923 −670 lines

.gitmodules

Lines changed: 3 additions & 0 deletions

```diff
@@ -14,3 +14,6 @@
 [submodule "third_party/abseil-cpp"]
 	path = third_party/abseil-cpp
 	url = https://github.com/abseil/abseil-cpp.git
+[submodule "third_party/stb"]
+	path = third_party/stb
+	url = https://github.com/nothings/stb.git
```
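The new entry follows Git's INI-style `.gitmodules` syntax. As a quick sanity check (a hedged sketch, not part of the commit), the file can be parsed with Python's stdlib `configparser`:

```python
import configparser

# .gitmodules content as of this commit; the last three lines are the addition
GITMODULES = """\
[submodule "third_party/abseil-cpp"]
path = third_party/abseil-cpp
url = https://github.com/abseil/abseil-cpp.git
[submodule "third_party/stb"]
path = third_party/stb
url = https://github.com/nothings/stb.git
"""

parser = configparser.ConfigParser()
parser.read_string(GITMODULES)
# map submodule path -> url
submodules = {parser[s]["path"]: parser[s]["url"] for s in parser.sections()}
print(submodules["third_party/stb"])  # -> https://github.com/nothings/stb.git
```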

CMakeLists.txt

Lines changed: 3 additions & 1 deletion

```diff
@@ -58,6 +58,8 @@ add_subdirectory(third_party/abseil-cpp)
 
 add_subdirectory(third_party/re2)
 
+include_directories(third_party/stb)
+
 if (GGML_METAL)
 add_compile_definitions(GGML_USE_METAL)
 configure_file(third_party/ggml/src/ggml-metal.metal ${CMAKE_LIBRARY_OUTPUT_DIRECTORY}/ggml-metal.metal COPYONLY)
@@ -135,7 +137,7 @@ add_custom_target(check-all
 COMMAND cmake --build build -j
 COMMAND ./build/bin/chatglm_test
 COMMAND python3 setup.py develop
-COMMAND python3 -m pytest tests/test_chatglm_cpp.py
+COMMAND python3 -m pytest --forked tests/test_chatglm_cpp.py
 WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
 )
 
```
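The `--forked` flag comes from the `pytest-forked` plugin: each test runs in its own subprocess, so a hard crash in the native extension (a plausible risk with new vision-encoding code, though the commit does not state the motivation) kills only that child while the runner survives. The isolation idea in miniature, using only the stdlib (illustrative, not the project's test code):

```python
import subprocess
import sys

# run the "test" in a child interpreter; os._exit(139) mimics the abrupt
# death (128 + SIGSEGV) that a crashing native extension would cause
proc = subprocess.run([sys.executable, "-c", "import os; os._exit(139)"])
# the parent runner survives and can report the failure instead of dying with it
print("child exit code:", proc.returncode)  # -> 139
```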

MANIFEST.in

Lines changed: 4 additions & 0 deletions

```diff
@@ -20,3 +20,7 @@ graft third_party/re2
 
 # absl
 graft third_party/abseil-cpp
+
+# stb
+include third_party/stb/stb_image.h
+include third_party/stb/stb_image_resize2.h
```

README.md

Lines changed: 42 additions & 7 deletions

````diff
@@ -6,7 +6,7 @@
 ![Python](https://img.shields.io/pypi/pyversions/chatglm-cpp)
 [![License: MIT](https://img.shields.io/badge/license-MIT-blue)](LICENSE)
 
-C++ implementation of [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B), [ChatGLM2-6B](https://github.com/THUDM/ChatGLM2-6B), [ChatGLM3](https://github.com/THUDM/ChatGLM3) and [GLM-4](https://github.com/THUDM/GLM-4) for real-time chatting on your MacBook.
+C++ implementation of [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B), [ChatGLM2-6B](https://github.com/THUDM/ChatGLM2-6B), [ChatGLM3](https://github.com/THUDM/ChatGLM3) and [GLM-4](https://github.com/THUDM/GLM-4)(V) for real-time chatting on your MacBook.
 
 ![demo](docs/demo.gif)
 
@@ -22,7 +22,7 @@ Highlights:
 Support Matrix:
 * Hardwares: x86/arm CPU, NVIDIA GPU, Apple Silicon GPU
 * Platforms: Linux, MacOS, Windows
-* Models: [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B), [ChatGLM2-6B](https://github.com/THUDM/ChatGLM2-6B), [ChatGLM3](https://github.com/THUDM/ChatGLM3), [GLM-4](https://github.com/THUDM/GLM-4), [CodeGeeX2](https://github.com/THUDM/CodeGeeX2)
+* Models: [ChatGLM-6B](https://github.com/THUDM/ChatGLM-6B), [ChatGLM2-6B](https://github.com/THUDM/ChatGLM2-6B), [ChatGLM3](https://github.com/THUDM/ChatGLM3), [GLM-4](https://github.com/THUDM/GLM-4)(V), [CodeGeeX2](https://github.com/THUDM/CodeGeeX2)
 
 ## Getting Started
 
@@ -53,9 +53,9 @@ python3 chatglm_cpp/convert.py -i THUDM/chatglm-6b -t q4_0 -o models/chatglm-ggm
 
 The original model (`-i <model_name_or_path>`) can be a Hugging Face model name or a local path to your pre-downloaded model. Currently supported models are:
 * ChatGLM-6B: `THUDM/chatglm-6b`, `THUDM/chatglm-6b-int8`, `THUDM/chatglm-6b-int4`
-* ChatGLM2-6B: `THUDM/chatglm2-6b`, `THUDM/chatglm2-6b-int4`
-* ChatGLM3-6B: `THUDM/chatglm3-6b`
-* ChatGLM4-9B: `THUDM/glm-4-9b-chat`
+* ChatGLM2-6B: `THUDM/chatglm2-6b`, `THUDM/chatglm2-6b-int4`, `THUDM/chatglm2-6b-32k`, `THUDM/chatglm2-6b-32k-int4`
+* ChatGLM3-6B: `THUDM/chatglm3-6b`, `THUDM/chatglm3-6b-32k`, `THUDM/chatglm3-6b-128k`, `THUDM/chatglm3-6b-base`
+* ChatGLM4(V)-9B: `THUDM/glm-4-9b-chat`, `THUDM/glm-4-9b-chat-1m`, `THUDM/glm-4-9b`, `THUDM/glm-4v-9b`
 * CodeGeeX2: `THUDM/codegeex2-6b`, `THUDM/codegeex2-6b-int4`
 
 You are free to try any of the below quantization types by specifying `-t <type>`:
@@ -188,6 +188,22 @@ python3 chatglm_cpp/convert.py -i THUDM/glm-4-9b-chat -t q4_0 -o models/chatglm4
 
 </details>
 
+<details open>
+<summary>ChatGLM4V-9B</summary>
+
+[![03-Confusing-Pictures](examples/03-Confusing-Pictures.jpg)](https://www.barnorama.com/wp-content/uploads/2016/12/03-Confusing-Pictures.jpg)
+
+You may use `-vt <vision_type>` to set quantization type for the vision encoder. It is recommended to run GLM4V on GPU since vision encoding runs too slow on CPU even with 4-bit quantization.
+```sh
+python3 chatglm_cpp/convert.py -i THUDM/glm-4v-9b -t q4_0 -vt q4_0 -o models/chatglm4v-ggml.bin
+./build/bin/main -m models/chatglm4v-ggml.bin --image examples/03-Confusing-Pictures.jpg -p "这张图片有什么不寻常之处" --temp 0
+# 这张图片中不寻常的是,一个男人站在一辆黄色SUV的后备箱上,正在使用一个铁板熨烫衣物。
+# 通常情况下,熨衣是在室内进行的,使用的是家用电熨斗,而不是在户外使用汽车后备箱作为工作台。
+# 此外,他似乎是在一个繁忙的城市街道上,周围有行驶的车辆和建筑物,这增加了场景的荒谬性。
+```
+
+</details>
+
 <details>
 <summary>CodeGeeX2</summary>
 
@@ -361,6 +377,15 @@ python3 cli_demo.py -m ../models/chatglm4-ggml.bin -p 你好 --temp 0.8 --top_p
 ```
 </details>
 
+<details open>
+<summary>ChatGLM4V-9B</summary>
+
+Chat mode:
+```sh
+python3 cli_demo.py -m ../models/chatglm4v-ggml.bin --image 03-Confusing-Pictures.jpg -p "这张图片有什么不寻常之处" --temp 0
+```
+</details>
+
 <details>
 <summary>CodeGeeX2</summary>
 
@@ -450,12 +475,22 @@ Use the OpenAI client to chat with your model:
 
 For stream response, check out the example client script:
 ```sh
-OPENAI_BASE_URL=http://127.0.0.1:8000/v1 python3 examples/openai_client.py --stream --prompt 你好
+python3 examples/openai_client.py --base_url http://127.0.0.1:8000/v1 --stream --prompt 你好
 ```
 
 Tool calling is also supported:
 ```sh
-OPENAI_BASE_URL=http://127.0.0.1:8000/v1 python3 examples/openai_client.py --tool_call --prompt 上海天气怎么样
+python3 examples/openai_client.py --base_url http://127.0.0.1:8000/v1 --tool_call --prompt 上海天气怎么样
+```
+
+Request GLM4V with image inputs:
+```sh
+# request with local image file
+python3 examples/openai_client.py --base_url http://127.0.0.1:8000/v1 --prompt "描述这张图片" \
+  --image examples/03-Confusing-Pictures.jpg --temp 0
+# request with image url
+python3 examples/openai_client.py --base_url http://127.0.0.1:8000/v1 --prompt "描述这张图片" \
+  --image https://www.barnorama.com/wp-content/uploads/2016/12/03-Confusing-Pictures.jpg --temp 0
 ```
 
 With this API server as backend, ChatGLM.cpp models can be seamlessly integrated into any frontend that uses OpenAI-style API, including [mckaywrigley/chatbot-ui](https://github.com/mckaywrigley/chatbot-ui), [fuergaosi233/wechat-chatgpt](https://github.com/fuergaosi233/wechat-chatgpt), [Yidadaa/ChatGPT-Next-Web](https://github.com/Yidadaa/ChatGPT-Next-Web), and more.
````
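For image requests, a client like `examples/openai_client.py` has to package the image into an OpenAI-style vision message. A hedged sketch of that request shape, with the prompt "描述这张图片" ("describe this picture") from the README above (the `image_content_part` helper is illustrative; the script's actual implementation may differ):

```python
import base64
import json

def image_content_part(image: str) -> dict:
    """Build an OpenAI-style image_url content part (illustrative helper)."""
    if image.startswith(("http://", "https://")):
        url = image  # remote images are passed through as-is
    else:
        # local files are typically inlined as a base64 data URL
        with open(image, "rb") as f:
            url = "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()
    return {"type": "image_url", "image_url": {"url": url}}

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "描述这张图片"},
                image_content_part(
                    "https://www.barnorama.com/wp-content/uploads/2016/12/03-Confusing-Pictures.jpg"
                ),
            ],
        }
    ],
    "temperature": 0,
}
print(json.dumps(payload, ensure_ascii=False, indent=2)[:120])
```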
