Commit e026f39

Feat/sd35 large (#36)
* feat: add sd-3.5-large
* add sd 3.5 large turbo
1 parent 8236ed1 commit e026f39

File tree

10 files changed: +289 −0 lines changed

sd3.5-large-turbo/.bentoignore

Lines changed: 5 additions & 0 deletions

```
__pycache__/
*.py[cod]
*$py.class
.ipynb_checkpoints
venv/
```

sd3.5-large-turbo/README.md

Lines changed: 75 additions & 0 deletions

<div align="center">
<h1 align="center">Serving Stable Diffusion 3.5 Large Turbo with BentoML</h1>
</div>

[Stable Diffusion 3.5 Large Turbo](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo) is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that applies Adversarial Diffusion Distillation (ADD), offering improved image quality, typography, complex prompt understanding, and resource efficiency, with a focus on fewer inference steps.

This is a BentoML example project demonstrating how to build an image generation inference API server using the Stable Diffusion 3.5 Large Turbo model. See [here](https://github.com/bentoml/BentoML/tree/main/examples) for a full list of BentoML example projects.

## Prerequisites

- You have installed Python 3.9+ and `pip`. See the [Python downloads page](https://www.python.org/downloads/) to learn more.
- You have a basic understanding of key concepts in BentoML, such as Services. We recommend you read [Quickstart](https://docs.bentoml.com/en/1.2/get-started/quickstart.html) first.
- Accept the conditions to gain access to [Stable Diffusion 3.5 Large Turbo on Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo).
- (Optional) We recommend you create a virtual environment for dependency isolation for this project. See the [Conda documentation](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) or the [Python documentation](https://docs.python.org/3/library/venv.html) for details.
- To run the Service locally, you need an NVIDIA GPU with at least 32 GB of VRAM.

## Install dependencies

```bash
git clone https://github.com/bentoml/BentoDiffusion.git
cd BentoDiffusion/sd3.5-large-turbo
pip install -r requirements.txt
```

## Run the BentoML Service

We have defined a BentoML Service in `service.py`. Run `bentoml serve` in your project directory to start the Service.

```bash
$ bentoml serve .

2024-01-18T18:31:49+0800 [INFO] [cli] Starting production HTTP BentoServer from "service:SD35LargeTurbo" listening on http://localhost:3000 (Press CTRL+C to quit)
Loading pipeline components...: 100%
```

The server is now active at [http://localhost:3000](http://localhost:3000/). You can interact with it using the Swagger UI or in other ways.

cURL:

```bash
curl -X 'POST' \
  'http://localhost:3000/txt2img' \
  -H 'accept: image/*' \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "A cat holding a sign that says hello world",
    "num_inference_steps": 4
  }'
```

Python client:

```python
import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    result = client.txt2img(
        prompt="A cat holding a sign that says hello world",
        num_inference_steps=4
    )
```

## Deploy to BentoCloud

After the Service is ready, you can deploy the application to BentoCloud for better management and scalability. [Sign up](https://www.bentoml.com/) if you don't have a BentoCloud account.

Make sure you have [logged in to BentoCloud](https://docs.bentoml.com/en/latest/bentocloud/how-tos/manage-access-token.html), then run the following command to deploy the application.

```bash
bentoml deploy --env HF_TOKEN=<your huggingface token> .
```

Once the application is up and running on BentoCloud, you can access it via the exposed URL.

**Note**: For custom deployment in your own infrastructure, use [BentoML to generate an OCI-compliant image](https://docs.bentoml.com/en/latest/guides/containerization.html).

sd3.5-large-turbo/bentofile.yaml

Lines changed: 11 additions & 0 deletions

```yaml
service: "service:SD35LargeTurbo"
labels:
  owner: bentoml-team
  project: gallery
include:
  - "*.py"
python:
  requirements_txt: "./requirements.txt"
  lock_packages: false
envs:
  - name: HF_TOKEN
```
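Beyond the fields in `bentofile.yaml` above, BentoML build files also accept a `docker` section to customize the generated image. A hedged sketch (the values here are illustrative assumptions, not part of this commit):

```yaml
docker:
  python_version: "3.11"  # illustrative choice, not from this commit
  system_packages:
    - git
```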

sd3.5-large-turbo/requirements.txt

Lines changed: 8 additions & 0 deletions

```
accelerate==1.0.1
bentoml>=1.3.5
diffusers==0.31.0
pillow==11.0.0
protobuf==5.28.3
sentencepiece==0.2.0
torch==2.4.1
transformers==4.46.0
```
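The requirements above mix exact pins (`==`) with a lower bound for `bentoml` (`>=1.3.5`). The difference can be illustrated with the `packaging` library, which pip itself builds on; note `packaging` is not listed in `requirements.txt`, so this is an illustrative aside rather than project code:

```python
# Sketch: how pip interprets the two specifier styles used in requirements.txt.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

exact = SpecifierSet("==0.31.0")       # diffusers pin: only one version satisfies it
lower_bound = SpecifierSet(">=1.3.5")  # bentoml pin: any newer release also satisfies it

print(Version("0.31.0") in exact)        # True
print(Version("0.32.0") in exact)        # False
print(Version("1.4.0") in lower_bound)   # True
```

Exact pins make the example reproducible; the `bentoml` lower bound lets the project pick up framework fixes without an edit here.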

sd3.5-large-turbo/service.py

Lines changed: 44 additions & 0 deletions

```python
import typing as t

import bentoml
from PIL.Image import Image
from annotated_types import Le, Ge
from typing_extensions import Annotated

MODEL_ID = "stabilityai/stable-diffusion-3.5-large-turbo"

sample_prompt = "A cat holding a sign that says hello world"


@bentoml.service(
    traffic={"timeout": 300},
    workers=1,
    resources={
        "gpu": 1,
        "gpu_type": "nvidia-tesla-a100",
    },
)
class SD35LargeTurbo:
    def __init__(self) -> None:
        # Heavy imports are deferred so the module stays cheap to import.
        import torch
        from diffusers import StableDiffusion3Pipeline

        self.pipe = StableDiffusion3Pipeline.from_pretrained(
            MODEL_ID,
            torch_dtype=torch.bfloat16,
        )
        self.pipe.to(device="cuda")

    @bentoml.api
    def txt2img(
        self,
        prompt: str = sample_prompt,
        negative_prompt: t.Optional[str] = None,
        num_inference_steps: Annotated[int, Ge(1), Le(10)] = 4,
    ) -> Image:
        # The distilled Turbo model is designed to run without
        # classifier-free guidance, hence guidance_scale=0.0.
        image = self.pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale if False else 0.0,
        ).images[0]
        return image
```
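The `Annotated[int, Ge(1), Le(10)]` bound on `num_inference_steps` is enforced by BentoML's pydantic-based request validation. A minimal sketch of how such a constraint behaves, assuming pydantic v2 is available (BentoML depends on it):

```python
# Sketch: validating the same Annotated bound that txt2img declares,
# using pydantic v2 directly.
from annotated_types import Ge, Le
from typing_extensions import Annotated
from pydantic import TypeAdapter, ValidationError

Steps = Annotated[int, Ge(1), Le(10)]
adapter = TypeAdapter(Steps)

print(adapter.validate_python(4))   # 4 -- within bounds, accepted as-is
try:
    adapter.validate_python(0)      # below Ge(1)
except ValidationError as e:
    print("rejected:", e.errors()[0]["type"])  # rejected: greater_than_equal
```

Out-of-range values are thus rejected at the API boundary before the pipeline ever runs.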

sd3.5-large/.bentoignore

Lines changed: 5 additions & 0 deletions

```
__pycache__/
*.py[cod]
*$py.class
.ipynb_checkpoints
venv/
```

sd3.5-large/README.md

Lines changed: 77 additions & 0 deletions

<div align="center">
<h1 align="center">Serving Stable Diffusion 3.5 Large with BentoML</h1>
</div>

[Stable Diffusion 3.5 Large](https://huggingface.co/stabilityai/stable-diffusion-3.5-large) is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved image quality, typography, complex prompt understanding, and resource efficiency.

This is a BentoML example project demonstrating how to build an image generation inference API server using the Stable Diffusion 3.5 Large model. See [here](https://github.com/bentoml/BentoML/tree/main/examples) for a full list of BentoML example projects.

## Prerequisites

- You have installed Python 3.9+ and `pip`. See the [Python downloads page](https://www.python.org/downloads/) to learn more.
- You have a basic understanding of key concepts in BentoML, such as Services. We recommend you read [Quickstart](https://docs.bentoml.com/en/1.2/get-started/quickstart.html) first.
- Accept the conditions to gain access to [Stable Diffusion 3.5 Large on Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-3.5-large).
- (Optional) We recommend you create a virtual environment for dependency isolation for this project. See the [Conda documentation](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) or the [Python documentation](https://docs.python.org/3/library/venv.html) for details.
- To run the Service locally, you need an NVIDIA GPU with at least 20 GB of VRAM.

## Install dependencies

```bash
git clone https://github.com/bentoml/BentoDiffusion.git
cd BentoDiffusion/sd3.5-large
pip install -r requirements.txt
```

## Run the BentoML Service

We have defined a BentoML Service in `service.py`. Run `bentoml serve` in your project directory to start the Service.

```bash
$ bentoml serve .

2024-01-18T18:31:49+0800 [INFO] [cli] Starting production HTTP BentoServer from "service:SD35Large" listening on http://localhost:3000 (Press CTRL+C to quit)
Loading pipeline components...: 100%
```

The server is now active at [http://localhost:3000](http://localhost:3000/). You can interact with it using the Swagger UI or in other ways.

cURL:

```bash
curl -X 'POST' \
  'http://localhost:3000/txt2img' \
  -H 'accept: image/*' \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "A cat holding a sign that says hello world",
    "num_inference_steps": 40,
    "guidance_scale": 4.5
  }'
```

Python client:

```python
import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    result = client.txt2img(
        prompt="A cat holding a sign that says hello world",
        num_inference_steps=40,
        guidance_scale=4.5
    )
```

## Deploy to BentoCloud

After the Service is ready, you can deploy the application to BentoCloud for better management and scalability. [Sign up](https://www.bentoml.com/) if you don't have a BentoCloud account.

Make sure you have [logged in to BentoCloud](https://docs.bentoml.com/en/latest/bentocloud/how-tos/manage-access-token.html), then run the following command to deploy the application.

```bash
bentoml deploy --env HF_TOKEN=<your huggingface token> .
```

Once the application is up and running on BentoCloud, you can access it via the exposed URL.

**Note**: For custom deployment in your own infrastructure, use [BentoML to generate an OCI-compliant image](https://docs.bentoml.com/en/latest/guides/containerization.html).
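The VRAM prerequisite in the README above can be sanity-checked with back-of-envelope arithmetic. Assuming the publicly stated ~8B parameter count for the SD3.5 Large MMDiT (an assumption, not stated in this commit), the bfloat16 transformer weights alone account for most of the budget; text encoders, VAE, and activations add the rest:

```python
# Back-of-envelope VRAM for the SD3.5 Large transformer weights in bfloat16.
# ~8e9 parameters is an assumption from public model descriptions; the figure
# excludes text encoders, VAE, and activation memory.
params = 8_000_000_000
bytes_per_param = 2  # bfloat16 = 16 bits
weights_gib = params * bytes_per_param / 1024**3
print(f"{weights_gib:.1f} GiB")  # ~14.9 GiB for the transformer weights alone
```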

sd3.5-large/bentofile.yaml

Lines changed: 11 additions & 0 deletions

```yaml
service: "service:SD35Large"
labels:
  owner: bentoml-team
  project: gallery
include:
  - "*.py"
python:
  requirements_txt: "./requirements.txt"
  lock_packages: false
envs:
  - name: HF_TOKEN
```

sd3.5-large/requirements.txt

Lines changed: 8 additions & 0 deletions

```
accelerate==1.0.1
bentoml>=1.3.5
diffusers==0.31.0
pillow==11.0.0
protobuf==5.28.3
sentencepiece==0.2.0
torch==2.5.0
transformers==4.46.0
```

sd3.5-large/service.py

Lines changed: 45 additions & 0 deletions

```python
import typing as t

import bentoml
from PIL.Image import Image
from annotated_types import Le, Ge
from typing_extensions import Annotated

MODEL_ID = "stabilityai/stable-diffusion-3.5-large"

sample_prompt = "A cat holding a sign that says hello world"


@bentoml.service(
    traffic={"timeout": 300},
    workers=1,
    resources={
        "gpu": 1,
        "gpu_type": "nvidia-tesla-a100",
    },
)
class SD35Large:
    def __init__(self) -> None:
        # Heavy imports are deferred so the module stays cheap to import.
        import torch
        from diffusers import StableDiffusion3Pipeline

        self.pipe = StableDiffusion3Pipeline.from_pretrained(
            MODEL_ID,
            torch_dtype=torch.bfloat16,
        )
        self.pipe.to(device="cuda")

    @bentoml.api
    def txt2img(
        self,
        prompt: str = sample_prompt,
        negative_prompt: t.Optional[str] = None,
        num_inference_steps: Annotated[int, Ge(1), Le(50)] = 40,
        guidance_scale: float = 4.5,
    ) -> Image:
        image = self.pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
        ).images[0]
        return image
```
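Unlike the Turbo Service (which hard-codes `guidance_scale=0.0`), this Service exposes `guidance_scale` because the base model relies on classifier-free guidance. A minimal sketch of the CFG combination step, with toy scalars standing in for the conditional and unconditional noise predictions:

```python
# Classifier-free guidance: the final noise prediction is the unconditional
# prediction pushed toward the conditional one, scaled by guidance_scale.
# Toy floats stand in for tensors in this sketch.
def cfg(uncond: float, cond: float, guidance_scale: float) -> float:
    return uncond + guidance_scale * (cond - uncond)

print(cfg(0.2, 0.6, 4.5))  # 2.0 -- strong pull toward the prompt-conditioned prediction
print(cfg(0.2, 0.6, 0.0))  # 0.2 -- scale 0 reduces to the unconditional prediction
```

In practice, diffusers skips the unconditional forward pass entirely when the scale is at or below 1, which is why the distilled Turbo model runs roughly twice as fast per step on top of needing far fewer steps.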
