Commit e026f39

Feat/sd35 large (#36)
* feat: add sd-3.5-large
* add sd 3.5 large turbo
1 parent 8236ed1 commit e026f39

File tree

10 files changed: +289 −0 lines changed

sd3.5-large-turbo/.bentoignore

Lines changed: 5 additions & 0 deletions

```
__pycache__/
*.py[cod]
*$py.class
.ipynb_checkpoints
venv/
```

sd3.5-large-turbo/README.md

Lines changed: 75 additions & 0 deletions

<div align="center">
<h1 align="center">Serving Stable Diffusion 3.5 Large Turbo with BentoML</h1>
</div>

[Stable Diffusion 3.5 Large Turbo](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo) is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that applies Adversarial Diffusion Distillation (ADD), offering improved image quality, typography, complex prompt understanding, and resource efficiency, with a focus on fewer inference steps.

This is a BentoML example project demonstrating how to build an image generation inference API server using the Stable Diffusion 3.5 Large Turbo model. See [here](https://github.com/bentoml/BentoML/tree/main/examples) for a full list of BentoML example projects.

## Prerequisites

- You have installed Python 3.9+ and `pip`. See the [Python downloads page](https://www.python.org/downloads/) to learn more.
- You have a basic understanding of key concepts in BentoML, such as Services. We recommend you read [Quickstart](https://docs.bentoml.com/en/1.2/get-started/quickstart.html) first.
- Accept the conditions to gain access to [Stable Diffusion 3.5 Large Turbo on Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo).
- (Optional) We recommend you create a virtual environment for dependency isolation for this project. See the [Conda documentation](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) or the [Python documentation](https://docs.python.org/3/library/venv.html) for details.
- To run the Service locally, you need an NVIDIA GPU with at least 32 GB of VRAM.

## Install dependencies

```bash
git clone https://github.com/bentoml/BentoDiffusion.git
cd BentoDiffusion/sd3.5-large-turbo
pip install -r requirements.txt
```

## Run the BentoML Service

We have defined a BentoML Service in `service.py`. Run `bentoml serve` in your project directory to start the Service.

```bash
$ bentoml serve .

2024-01-18T18:31:49+0800 [INFO] [cli] Starting production HTTP BentoServer from "service:SD35LargeTurbo" listening on http://localhost:3000 (Press CTRL+C to quit)
Loading pipeline components...: 100%
```

The server is now active at [http://localhost:3000](http://localhost:3000/). You can interact with it using the Swagger UI or in other ways.

cURL:

```bash
curl -X 'POST' \
  'http://localhost:3000/txt2img' \
  -H 'accept: image/*' \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "A cat holding a sign that says hello world",
    "num_inference_steps": 4
  }'
```

Python client:

```python
import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    result = client.txt2img(
        prompt="A cat holding a sign that says hello world",
        num_inference_steps=4
    )
```

## Deploy to BentoCloud

After the Service is ready, you can deploy the application to BentoCloud for better management and scalability. [Sign up](https://www.bentoml.com/) if you don't have a BentoCloud account.

Make sure you have [logged in to BentoCloud](https://docs.bentoml.com/en/latest/bentocloud/how-tos/manage-access-token.html), then run the following command to deploy the application.

```bash
bentoml deploy --env HF_TOKEN=<your huggingface token> .
```

Once the application is up and running on BentoCloud, you can access it via the exposed URL.

**Note**: For custom deployment in your own infrastructure, use [BentoML to generate an OCI-compliant image](https://docs.bentoml.com/en/latest/guides/containerization.html).

sd3.5-large-turbo/bentofile.yaml

Lines changed: 11 additions & 0 deletions

```yaml
service: "service:SD35LargeTurbo"
labels:
  owner: bentoml-team
  project: gallery
include:
  - "*.py"
python:
  requirements_txt: "./requirements.txt"
  lock_packages: false
envs:
  - name: HF_TOKEN
```
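Beyond the fields in `bentofile.yaml` above, BentoML build files also accept a `docker` section to customize the generated image. A hedged sketch (the values here are illustrative assumptions, not part of this commit):

```yaml
docker:
  python_version: "3.11"  # illustrative choice, not from this commit
  system_packages:
    - git
```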

sd3.5-large-turbo/requirements.txt

Lines changed: 8 additions & 0 deletions

```
accelerate==1.0.1
bentoml>=1.3.5
diffusers==0.31.0
pillow==11.0.0
protobuf==5.28.3
sentencepiece==0.2.0
torch==2.4.1
transformers==4.46.0
```
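The requirements above mix exact pins (`==`) with a lower bound for `bentoml` (`>=1.3.5`). The difference can be illustrated with the `packaging` library, which pip itself builds on; note `packaging` is not listed in `requirements.txt`, so this is an illustrative aside rather than project code:

```python
# Sketch: how pip interprets the two specifier styles used in requirements.txt.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

exact = SpecifierSet("==0.31.0")       # diffusers pin: only one version satisfies it
lower_bound = SpecifierSet(">=1.3.5")  # bentoml pin: any newer release also satisfies it

print(Version("0.31.0") in exact)        # True
print(Version("0.32.0") in exact)        # False
print(Version("1.4.0") in lower_bound)   # True
```

Exact pins make the example reproducible; the `bentoml` lower bound lets the project pick up framework fixes without an edit here.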

sd3.5-large-turbo/service.py

Lines changed: 44 additions & 0 deletions

```python
import typing as t

import bentoml
from PIL.Image import Image
from annotated_types import Le, Ge
from typing_extensions import Annotated

MODEL_ID = "stabilityai/stable-diffusion-3.5-large-turbo"

sample_prompt = "A cat holding a sign that says hello world"


@bentoml.service(
    traffic={"timeout": 300},
    workers=1,
    resources={
        "gpu": 1,
        "gpu_type": "nvidia-tesla-a100",
    },
)
class SD35LargeTurbo:
    def __init__(self) -> None:
        # Heavy imports are deferred so the module stays cheap to import.
        import torch
        from diffusers import StableDiffusion3Pipeline

        self.pipe = StableDiffusion3Pipeline.from_pretrained(
            MODEL_ID,
            torch_dtype=torch.bfloat16,
        )
        self.pipe.to(device="cuda")

    @bentoml.api
    def txt2img(
        self,
        prompt: str = sample_prompt,
        negative_prompt: t.Optional[str] = None,
        num_inference_steps: Annotated[int, Ge(1), Le(10)] = 4,
    ) -> Image:
        # The distilled Turbo model is designed to run without
        # classifier-free guidance, hence guidance_scale=0.0.
        image = self.pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale if False else 0.0,
        ).images[0]
        return image
```
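The `Annotated[int, Ge(1), Le(10)]` bound on `num_inference_steps` is enforced by BentoML's pydantic-based request validation. A minimal sketch of how such a constraint behaves, assuming pydantic v2 is available (BentoML depends on it):

```python
# Sketch: validating the same Annotated bound that txt2img declares,
# using pydantic v2 directly.
from annotated_types import Ge, Le
from typing_extensions import Annotated
from pydantic import TypeAdapter, ValidationError

Steps = Annotated[int, Ge(1), Le(10)]
adapter = TypeAdapter(Steps)

print(adapter.validate_python(4))   # 4 -- within bounds, accepted as-is
try:
    adapter.validate_python(0)      # below Ge(1)
except ValidationError as e:
    print("rejected:", e.errors()[0]["type"])  # rejected: greater_than_equal
```

Out-of-range values are thus rejected at the API boundary before the pipeline ever runs.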

sd3.5-large/.bentoignore

Lines changed: 5 additions & 0 deletions

```
__pycache__/
*.py[cod]
*$py.class
.ipynb_checkpoints
venv/
```

sd3.5-large/README.md

Lines changed: 77 additions & 0 deletions

<div align="center">
<h1 align="center">Serving Stable Diffusion 3.5 Large with BentoML</h1>
</div>

[Stable Diffusion 3.5 Large](https://huggingface.co/stabilityai/stable-diffusion-3.5-large) is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved image quality, typography, complex prompt understanding, and resource efficiency.

This is a BentoML example project demonstrating how to build an image generation inference API server using the Stable Diffusion 3.5 Large model. See [here](https://github.com/bentoml/BentoML/tree/main/examples) for a full list of BentoML example projects.

## Prerequisites

- You have installed Python 3.9+ and `pip`. See the [Python downloads page](https://www.python.org/downloads/) to learn more.
- You have a basic understanding of key concepts in BentoML, such as Services. We recommend you read [Quickstart](https://docs.bentoml.com/en/1.2/get-started/quickstart.html) first.
- Accept the conditions to gain access to [Stable Diffusion 3.5 Large on Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-3.5-large).
- (Optional) We recommend you create a virtual environment for dependency isolation for this project. See the [Conda documentation](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) or the [Python documentation](https://docs.python.org/3/library/venv.html) for details.
- To run the Service locally, you need an NVIDIA GPU with at least 20 GB of VRAM.

## Install dependencies

```bash
git clone https://github.com/bentoml/BentoDiffusion.git
cd BentoDiffusion/sd3.5-large
pip install -r requirements.txt
```

## Run the BentoML Service

We have defined a BentoML Service in `service.py`. Run `bentoml serve` in your project directory to start the Service.

```bash
$ bentoml serve .

2024-01-18T18:31:49+0800 [INFO] [cli] Starting production HTTP BentoServer from "service:SD35Large" listening on http://localhost:3000 (Press CTRL+C to quit)
Loading pipeline components...: 100%
```

The server is now active at [http://localhost:3000](http://localhost:3000/). You can interact with it using the Swagger UI or in other ways.

cURL:

```bash
curl -X 'POST' \
  'http://localhost:3000/txt2img' \
  -H 'accept: image/*' \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "A cat holding a sign that says hello world",
    "num_inference_steps": 40,
    "guidance_scale": 4.5
  }'
```

Python client:

```python
import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    result = client.txt2img(
        prompt="A cat holding a sign that says hello world",
        num_inference_steps=40,
        guidance_scale=4.5
    )
```

## Deploy to BentoCloud

After the Service is ready, you can deploy the application to BentoCloud for better management and scalability. [Sign up](https://www.bentoml.com/) if you don't have a BentoCloud account.

Make sure you have [logged in to BentoCloud](https://docs.bentoml.com/en/latest/bentocloud/how-tos/manage-access-token.html), then run the following command to deploy the application.

```bash
bentoml deploy --env HF_TOKEN=<your huggingface token> .
```

Once the application is up and running on BentoCloud, you can access it via the exposed URL.

**Note**: For custom deployment in your own infrastructure, use [BentoML to generate an OCI-compliant image](https://docs.bentoml.com/en/latest/guides/containerization.html).
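The VRAM prerequisite in the README above can be sanity-checked with back-of-envelope arithmetic. Assuming the publicly stated ~8B parameter count for the SD3.5 Large MMDiT (an assumption, not stated in this commit), the bfloat16 transformer weights alone account for most of the budget; text encoders, VAE, and activations add the rest:

```python
# Back-of-envelope VRAM for the SD3.5 Large transformer weights in bfloat16.
# ~8e9 parameters is an assumption from public model descriptions; the figure
# excludes text encoders, VAE, and activation memory.
params = 8_000_000_000
bytes_per_param = 2  # bfloat16 = 16 bits
weights_gib = params * bytes_per_param / 1024**3
print(f"{weights_gib:.1f} GiB")  # ~14.9 GiB for the transformer weights alone
```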

sd3.5-large/bentofile.yaml

Lines changed: 11 additions & 0 deletions

```yaml
service: "service:SD35Large"
labels:
  owner: bentoml-team
  project: gallery
include:
  - "*.py"
python:
  requirements_txt: "./requirements.txt"
  lock_packages: false
envs:
  - name: HF_TOKEN
```

sd3.5-large/requirements.txt

Lines changed: 8 additions & 0 deletions

```
accelerate==1.0.1
bentoml>=1.3.5
diffusers==0.31.0
pillow==11.0.0
protobuf==5.28.3
sentencepiece==0.2.0
torch==2.5.0
transformers==4.46.0
```

sd3.5-large/service.py

Lines changed: 45 additions & 0 deletions

```python
import typing as t

import bentoml
from PIL.Image import Image
from annotated_types import Le, Ge
from typing_extensions import Annotated

MODEL_ID = "stabilityai/stable-diffusion-3.5-large"

sample_prompt = "A cat holding a sign that says hello world"


@bentoml.service(
    traffic={"timeout": 300},
    workers=1,
    resources={
        "gpu": 1,
        "gpu_type": "nvidia-tesla-a100",
    },
)
class SD35Large:
    def __init__(self) -> None:
        # Heavy imports are deferred so the module stays cheap to import.
        import torch
        from diffusers import StableDiffusion3Pipeline

        self.pipe = StableDiffusion3Pipeline.from_pretrained(
            MODEL_ID,
            torch_dtype=torch.bfloat16,
        )
        self.pipe.to(device="cuda")

    @bentoml.api
    def txt2img(
        self,
        prompt: str = sample_prompt,
        negative_prompt: t.Optional[str] = None,
        num_inference_steps: Annotated[int, Ge(1), Le(50)] = 40,
        guidance_scale: float = 4.5,
    ) -> Image:
        image = self.pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
        ).images[0]
        return image
```
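Unlike the Turbo Service (which hard-codes `guidance_scale=0.0`), this Service exposes `guidance_scale` because the base model relies on classifier-free guidance. A minimal sketch of the CFG combination step, with toy scalars standing in for the conditional and unconditional noise predictions:

```python
# Classifier-free guidance: the final noise prediction is the unconditional
# prediction pushed toward the conditional one, scaled by guidance_scale.
# Toy floats stand in for tensors in this sketch.
def cfg(uncond: float, cond: float, guidance_scale: float) -> float:
    return uncond + guidance_scale * (cond - uncond)

print(cfg(0.2, 0.6, 4.5))  # 2.0 -- strong pull toward the prompt-conditioned prediction
print(cfg(0.2, 0.6, 0.0))  # 0.2 -- scale 0 reduces to the unconditional prediction
```

In practice, diffusers skips the unconditional forward pass entirely when the scale is at or below 1, which is why the distilled Turbo model runs roughly twice as fast per step on top of needing far fewer steps.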
