Commit d342869

FEAT: support SD3.5 series model (#2706)
1 parent 513af9d commit d342869

32 files changed: +884 −112 lines

Diff for: doc/source/gen_docs.py (+1)

@@ -203,6 +203,7 @@ def get_unique_id(spec):
 available_controlnet = None
 model["available_controlnet"] = available_controlnet
 model["model_ability"] = ', '.join(model.get("model_ability"))
+model["gguf_quantizations"] = ", ".join(model.get("gguf_quantizations", []))
 rendered = env.get_template('image.rst.jinja').render(model)
 output_file_path = os.path.join(output_dir, f"{model['model_name'].lower()}.rst")
 with open(output_file_path, 'w') as output_file:
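The added line joins the model spec's optional `gguf_quantizations` list into a comma-separated string before the Jinja template is rendered. A minimal sketch of the pattern (the sample `model` dicts here are hypothetical, not real Xinference specs):

```python
# Hypothetical model spec resembling what gen_docs.py passes to the template.
model = {
    "model_name": "FLUX.1-dev",
    "gguf_quantizations": ["F16", "Q4_0", "Q8_0"],
}

# dict.get with a default of [] keeps models without GGUF support working:
# joining an empty list simply yields "".
joined = ", ".join(model.get("gguf_quantizations", []))
print(joined)  # F16, Q4_0, Q8_0

no_gguf = {"model_name": "sd3-medium"}
print(repr(", ".join(no_gguf.get("gguf_quantizations", []))))  # ''
```

Using `.get(..., [])` rather than direct indexing is what lets the template change stay backward compatible with specs that predate the new field.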

Diff for: doc/source/locale/zh_CN/LC_MESSAGES/models/model_abilities/image.po (+211 −36)

@@ -8,7 +8,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: Xinference \n"
 "Report-Msgid-Bugs-To: \n"
-"POT-Creation-Date: 2024-10-30 07:49+0000\n"
+"POT-Creation-Date: 2024-12-26 18:49+0800\n"
 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
 "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
 "Language: zh_CN\n"
@@ -17,7 +17,7 @@ msgstr ""
 "MIME-Version: 1.0\n"
 "Content-Type: text/plain; charset=utf-8\n"
 "Content-Transfer-Encoding: 8bit\n"
-"Generated-By: Babel 2.16.0\n"
+"Generated-By: Babel 2.14.0\n"
 
 #: ../../source/models/model_abilities/image.rst:5
 msgid "Images"
@@ -98,26 +98,48 @@ msgid "stable-diffusion-xl-base-1.0"
 msgstr ""
 
 #: ../../source/models/model_abilities/image.rst:43
+#: ../../source/models/model_abilities/image.rst:149
 msgid "sd3-medium"
 msgstr ""
 
 #: ../../source/models/model_abilities/image.rst:44
-msgid "FLUX.1-schnell"
+#: ../../source/models/model_abilities/image.rst:151
+#: ../../source/models/model_abilities/image.rst:180
+msgid "sd3.5-medium"
 msgstr ""
 
 #: ../../source/models/model_abilities/image.rst:45
+#: ../../source/models/model_abilities/image.rst:153
+#: ../../source/models/model_abilities/image.rst:182
+msgid "sd3.5-large"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:46
+#: ../../source/models/model_abilities/image.rst:155
+msgid "sd3.5-large-turbo"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:47
+#: ../../source/models/model_abilities/image.rst:147
+#: ../../source/models/model_abilities/image.rst:178
+msgid "FLUX.1-schnell"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:48
+#: ../../source/models/model_abilities/image.rst:145
+#: ../../source/models/model_abilities/image.rst:176
 msgid "FLUX.1-dev"
 msgstr ""
 
-#: ../../source/models/model_abilities/image.rst:49
+#: ../../source/models/model_abilities/image.rst:52
 msgid "Quickstart"
 msgstr "快速入门"
 
-#: ../../source/models/model_abilities/image.rst:52
+#: ../../source/models/model_abilities/image.rst:55
 msgid "Text-to-image"
 msgstr "文生图"
 
-#: ../../source/models/model_abilities/image.rst:54
+#: ../../source/models/model_abilities/image.rst:57
 msgid ""
 "The Text-to-image API mimics OpenAI's `create images API "
 "<https://platform.openai.com/docs/api-reference/images/create>`_. We can "
@@ -127,15 +149,26 @@ msgstr ""
 "可以通过 cURL、OpenAI Client 或 Xinference 的方式尝试使用 Text-to-image "
 "API。"
 
-#: ../../source/models/model_abilities/image.rst:109
-msgid "Tips for Large Image Models including SD3-Medium, FLUX.1"
-msgstr "大型图像模型部署(sd3-medium、FLUX.1 系列)贴士"
+#: ../../source/models/model_abilities/image.rst:112
+msgid "Quantize Large Image Models e.g. SD3-Medium, FLUX.1"
+msgstr "量化大型图像模型(sd3-medium、FLUX.1 系列等)"
 
-#: ../../source/models/model_abilities/image.rst:111
+#: ../../source/models/model_abilities/image.rst:116
+msgid ""
+"From v0.16.1, Xinference by default enabled quantization for large image "
+"models like Flux.1 and SD3.5 series. So if your Xinference version is "
+"newer than v0.16.1, You barely need to do anything to run those large "
+"image models on GPUs with small memory."
+msgstr ""
+"从 v0.16.1 开始,Xinference 默认对大图像模型如 Flux.1 和 SD3.5 系列开启"
+"量化。如果你使用新于 v0.16.1 的 Xinference 版本,你不需要做什么事情来在小"
+" GPU 显存的机器上来运行这些大型图像模型。"
+
+#: ../../source/models/model_abilities/image.rst:121
 msgid "Useful extra parameters can be passed to launch including:"
 msgstr "有用的传递给加载模型的额外参数包括:"
 
-#: ../../source/models/model_abilities/image.rst:113
+#: ../../source/models/model_abilities/image.rst:123
 msgid ""
 "``--cpu_offload True``: specifying ``True`` will offload the components "
 "of the model to CPU during inference in order to save memory, while "
@@ -147,7 +180,7 @@ msgstr ""
 "CPU 上以节省内存,这会导致推理延迟略有增加。模型卸载仅会在需要执行时将"
 "模型组件移动到 GPU 上,同时保持其余组件在 CPU 上"
 
-#: ../../source/models/model_abilities/image.rst:117
+#: ../../source/models/model_abilities/image.rst:127
 msgid ""
 "``--quantize_text_encoder <text encoder layer>``: We leveraged the "
 "``bitsandbytes`` library to load and quantize the T5-XXL text encoder to "
@@ -158,7 +191,7 @@ msgstr ""
 "`` 库加载并量化 T5-XXL 文本编码器至8位精度。这使得你能够在仅轻微影响性能"
 "的情况下继续使用全部文本编码器。"
 
-#: ../../source/models/model_abilities/image.rst:120
+#: ../../source/models/model_abilities/image.rst:130
 msgid ""
 "``--text_encoder_3 None``, for sd3-medium, removing the memory-intensive "
 "4.7B parameter T5-XXL text encoder during inference can significantly "
@@ -167,53 +200,195 @@ msgstr ""
 "``--text_encoder_3 None``,对于 sd3-medium,移除在推理过程中内存密集型的"
 "47亿参数T5-XXL文本编码器可以显著降低内存需求,而仅造成性能上的轻微损失。"
 
-#: ../../source/models/model_abilities/image.rst:124
+#: ../../source/models/model_abilities/image.rst:133
+msgid "``--transformer_nf4 True``: use nf4 for transformer quantization."
+msgstr "``--transformer_nf4 True`` :使用 nf4 量化 transformer。"
+
+#: ../../source/models/model_abilities/image.rst:134
 msgid ""
-"If you are trying to run large image models liek sd3-medium or FLUX.1 "
-"series on GPU card that has less memory than 24GB, you may encounter OOM "
-"when launching or inference. Try below solutions."
+"``--quantize``: Only work for MLX on Mac, Flux.1-dev and Flux.1-schnell "
+"will switch to MLX engine on Mac, and ``quantize`` can be used to "
+"quantize the model."
 msgstr ""
-"如果你试图在显存小于24GB的GPU上运行像sd3-medium或FLUX.1系列这样的大型图像"
-"模型,你在启动或推理过程中可能会遇到显存溢出(OOM)的问题。尝试以下"
-"解决方案。"
+"``--quantize`` :只对 Mac 上的 MLX 引擎生效,Flux.1-dev 和 Flux.1-schnell"
+"会在 Mac 上使用 MLX 引擎计算,``quantize`` 可以用来量化模型。"
 
-#: ../../source/models/model_abilities/image.rst:128
-msgid "For FLUX.1 series, try to apply quantization."
-msgstr "对于 FLUX.1 系列,尝试应用量化。"
+#: ../../source/models/model_abilities/image.rst:137
+msgid ""
+"For WebUI, Just add additional parameters, e.g. add key ``cpu_offload`` "
+"and value ``True`` to enable cpu offloading."
+msgstr ""
+"对于 WebUI,只需要添加额外参数,比如,添加 key ``cpu_offload`` 以及值 ``"
+"True`` 来开启 CPU 卸载。"
 
-#: ../../source/models/model_abilities/image.rst:134
-msgid "For sd3-medium, apply quantization to ``text_encoder_3``."
-msgstr "对于 sd3-medium 模型,对 ``text_encoder_3`` 应用量化。"
+#: ../../source/models/model_abilities/image.rst:140
+msgid "Below list default options that used from v0.16.1."
+msgstr "如下列出了从 v0.16.1 开始默认使用的参数。"
+
+#: ../../source/models/model_abilities/image.rst:143
+#: ../../source/models/model_abilities/image.rst:174
+msgid "Model"
+msgstr "模型"
+
+#: ../../source/models/model_abilities/image.rst:143
+msgid "quantize_text_encoder"
+msgstr ""
 
-#: ../../source/models/model_abilities/image.rst:141
-msgid "Or removing memory-intensive T5-XXL text encoder for sd3-medium."
-msgstr "或者,移除 sd3-medium 模型中内存密集型的 T5-XXL 文本编码器。"
+#: ../../source/models/model_abilities/image.rst:143
+msgid "quantize"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:143
+msgid "transformer_nf4"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:145
+#: ../../source/models/model_abilities/image.rst:147
+msgid "text_encoder_2"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:145
+#: ../../source/models/model_abilities/image.rst:147
+#: ../../source/models/model_abilities/image.rst:153
+#: ../../source/models/model_abilities/image.rst:155
+msgid "True"
+msgstr ""
 
-#: ../../source/models/model_abilities/image.rst:148
+#: ../../source/models/model_abilities/image.rst:145
+#: ../../source/models/model_abilities/image.rst:147
+#: ../../source/models/model_abilities/image.rst:149
+#: ../../source/models/model_abilities/image.rst:151
+msgid "False"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:149
+#: ../../source/models/model_abilities/image.rst:151
+#: ../../source/models/model_abilities/image.rst:153
+#: ../../source/models/model_abilities/image.rst:155
+msgid "text_encoder_3"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:149
+#: ../../source/models/model_abilities/image.rst:151
+#: ../../source/models/model_abilities/image.rst:153
+#: ../../source/models/model_abilities/image.rst:155
+msgid "N/A"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:160
+msgid ""
+"If you want to disable some quantization, just set the corresponding "
+"option to False. e.g. for Web UI, set key ``quantize_text_encoder`` and "
+"value ``False`` and for command line, specify ``--quantize_text_encoder "
+"False`` to disable quantization for text encoder."
+msgstr ""
+"如果你想关闭某些量化,只需要设置相应的选项为 False。比如,对于 Web UI,"
+"设置 key ``quantize_text_encoder`` 和值 ``False``,或对于命令行,指定 ``"
+"--quantize_text_encoder False`` 来关闭 text encoder 的量化。"
+
+#: ../../source/models/model_abilities/image.rst:166
+msgid "GGUF file format"
+msgstr "GGUF 文件格式"
+
+#: ../../source/models/model_abilities/image.rst:168
+msgid ""
+"GGUF file format for transformer provides various quantization options. "
+"To use gguf file, you can specify additional option ``gguf_quantization``"
+" for web UI, or ``--gguf_quantization`` for command line for those image "
+"models which support internally by Xinference. Below is the mode list."
+msgstr ""
+"GGUF 文件格式为 transformer 模块提供了丰富的量化选项。要使用 GGUF 文件,"
+"你可以在 Web 界面上指定额外选项 ``gguf_quantization`` ,或者在命令行指定 "
+"``--gguf_quantization`` ,以为 Xinference 内建支持 GGUF 量化的模型开启。"
+"如下是内置支持的模型。"
+
+#: ../../source/models/model_abilities/image.rst:174
+msgid "supported gguf quantization"
+msgstr "支持 GGUF 量化格式"
+
+#: ../../source/models/model_abilities/image.rst:176
+#: ../../source/models/model_abilities/image.rst:178
+msgid "F16, Q2_K, Q3_K_S, Q4_0, Q4_1, Q4_K_S, Q5_0, Q5_1, Q5_K_S, Q6_K, Q8_0"
+msgstr ""
+
+#: ../../source/models/model_abilities/image.rst:187
+msgid ""
+"We stronly recommend to enable additional option ``cpu_offload`` with "
+"value ``True`` for WebUI, or specify ``--cpu_offload True`` for command "
+"line."
+msgstr ""
+"我们强烈推荐在 WebUI 上开启额外选项 ``cpu_offload`` 并指定为 ``True``,或"
+"对命令行,指定 ``--cpu_offload True``。"
+
+#: ../../source/models/model_abilities/image.rst:190
+msgid "Example:"
+msgstr "例如:"
+
+#: ../../source/models/model_abilities/image.rst:196
+msgid ""
+"With ``Q2_K`` quantization, you only need around 5 GiB GPU memory to run "
+"Flux.1-dev."
+msgstr ""
+"使用 ``Q2_K`` 量化,你只需要大约 5GB 的显存来运行 Flux.1-dev。"
+
+#: ../../source/models/model_abilities/image.rst:198
+msgid ""
+"For those models gguf options are not supported internally, or you want "
+"to download gguf files on you own, you can specify additional option "
+"``gguf_model_path`` for web UI or spcecify ``--gguf_model_path "
+"/path/to/model_quant.gguf`` for command line."
+msgstr ""
+"对于非内建支持 GGUF 量化的模型,或者你希望自己下载 GGUF 文件,你可以在 "
+"Web UI 指定额外选项 ``gguf_model_path`` 或者用命令行指定 ``--gguf_model_"
+"path /path/to/model_quant.gguf`` 。"
+
+#: ../../source/models/model_abilities/image.rst:204
 msgid "Image-to-image"
 msgstr "图生图"
 
-#: ../../source/models/model_abilities/image.rst:150
+#: ../../source/models/model_abilities/image.rst:206
 msgid "You can find more examples of Images API in the tutorial notebook:"
 msgstr "你可以在教程笔记本中找到更多 Images API 的示例。"
 
-#: ../../source/models/model_abilities/image.rst:154
+#: ../../source/models/model_abilities/image.rst:210
 msgid "Stable Diffusion ControlNet"
 msgstr ""
 
-#: ../../source/models/model_abilities/image.rst:157
+#: ../../source/models/model_abilities/image.rst:213
 msgid "Learn from a Stable Diffusion ControlNet example"
 msgstr "学习一个 Stable Diffusion 控制网络的示例"
 
-#: ../../source/models/model_abilities/image.rst:160
+#: ../../source/models/model_abilities/image.rst:216
 msgid "OCR"
 msgstr ""
 
-#: ../../source/models/model_abilities/image.rst:162
+#: ../../source/models/model_abilities/image.rst:218
 msgid "The OCR API accepts image bytes and returns the OCR text."
 msgstr "OCR API 接受图像字节并返回 OCR 文本。"
 
-#: ../../source/models/model_abilities/image.rst:164
+#: ../../source/models/model_abilities/image.rst:220
 msgid "We can try OCR API out either via cURL, or Xinference's python client:"
 msgstr "可以通过 cURL 或 Xinference 的 Python 客户端来尝试 OCR API。"
 
+#~ msgid ""
+#~ "If you are trying to run large "
+#~ "image models liek sd3-medium or FLUX.1"
+#~ " series on GPU card that has "
+#~ "less memory than 24GB, you may "
+#~ "encounter OOM when launching or "
+#~ "inference. Try below solutions."
+#~ msgstr ""
+#~ "如果你试图在显存小于24GB的GPU上运行像"
+#~ "sd3-medium或FLUX.1系列这样的大型图像模型"
#~ ",你在启动或推理过程中可能会遇到显存"
+#~ "溢出(OOM)的问题。尝试以下解决方案。"
+
+#~ msgid "For FLUX.1 series, try to apply quantization."
+#~ msgstr "对于 FLUX.1 系列,尝试应用量化。"
+
+#~ msgid "For sd3-medium, apply quantization to ``text_encoder_3``."
+#~ msgstr "对于 sd3-medium 模型,对 ``text_encoder_3`` 应用量化。"
+
+#~ msgid "Or removing memory-intensive T5-XXL text encoder for sd3-medium."
+#~ msgstr "或者,移除 sd3-medium 模型中内存密集型的 T5-XXL 文本编码器。"
+
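The per-model defaults described in the new strings above (quantize_text_encoder, quantize, transformer_nf4, from v0.16.1) can be summarized as a lookup table. This is an illustrative sketch only: `DEFAULT_QUANT` is not a real Xinference data structure, and `None` stands in for the table's "N/A" cells.

```python
# Illustrative summary of the defaults table added by this commit
# (hypothetical structure; None stands in for "N/A" in the docs table).
DEFAULT_QUANT = {
    "FLUX.1-dev":        {"quantize_text_encoder": "text_encoder_2", "quantize": True, "transformer_nf4": False},
    "FLUX.1-schnell":    {"quantize_text_encoder": "text_encoder_2", "quantize": True, "transformer_nf4": False},
    "sd3-medium":        {"quantize_text_encoder": "text_encoder_3", "quantize": None, "transformer_nf4": False},
    "sd3.5-medium":      {"quantize_text_encoder": "text_encoder_3", "quantize": None, "transformer_nf4": False},
    "sd3.5-large":       {"quantize_text_encoder": "text_encoder_3", "quantize": None, "transformer_nf4": True},
    "sd3.5-large-turbo": {"quantize_text_encoder": "text_encoder_3", "quantize": None, "transformer_nf4": True},
}

# Disabling one quantization means overriding its default, e.g. passing
# --quantize_text_encoder False on the command line for FLUX.1-dev.
opts = dict(DEFAULT_QUANT["FLUX.1-dev"])
opts["quantize_text_encoder"] = False
print(opts["quantize_text_encoder"])  # False
```

Note how the FLUX.1 models default to quantizing `text_encoder_2` while the SD3 family quantizes `text_encoder_3`, matching the source-reference lines in the .po entries.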

Diff for: doc/source/models/builtin/audio/cosyvoice2-0.5b.rst (+19)

@@ -0,0 +1,19 @@
+.. _models_builtin_cosyvoice2-0.5b:
+
+===============
+CosyVoice2-0.5B
+===============
+
+- **Model Name:** CosyVoice2-0.5B
+- **Model Family:** CosyVoice
+- **Abilities:** text-to-audio
+- **Multilingual:** True
+
+Specifications
+^^^^^^^^^^^^^^
+
+- **Model ID:** mrfakename/CosyVoice2-0.5B
+
+Execute the following command to launch the model::
+
+  xinference launch --model-name CosyVoice2-0.5B --model-type audio

Diff for: doc/source/models/builtin/audio/f5-tts-mlx.rst (+19)

@@ -0,0 +1,19 @@
+.. _models_builtin_f5-tts-mlx:
+
+==========
+F5-TTS-MLX
+==========
+
+- **Model Name:** F5-TTS-MLX
+- **Model Family:** F5-TTS-MLX
+- **Abilities:** text-to-audio
+- **Multilingual:** True
+
+Specifications
+^^^^^^^^^^^^^^
+
+- **Model ID:** lucasnewman/f5-tts-mlx
+
+Execute the following command to launch the model::
+
+  xinference launch --model-name F5-TTS-MLX --model-type audio
