Commit 6a25006

update examples

1 parent 3a64cc2 commit 6a25006

File tree

2 files changed, +113 -22 lines changed

examples/ControlNet/README.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
# ControlNet

We provide extensive ControlNet support. Taking the FLUX model as an example, we support many different ControlNet models that can be freely combined, even if their structures differ. Additionally, ControlNet models are compatible with high-resolution refinement and partition control techniques, enabling very powerful controllable image generation.

These examples are in [`flux_controlnet.py`](./flux_controlnet.py).
## Canny/Depth/Normal: Structure Control

Structural control is the most fundamental capability of ControlNet models. By using Canny to extract edge information, or by using depth maps and normal maps, we can extract the structure of an image, which can then serve as control information during image generation.

Model link: https://modelscope.cn/models/InstantX/FLUX.1-dev-Controlnet-Union-alpha

For example, if we generate an image of a cat and use a model such as InstantX/FLUX.1-dev-Controlnet-Union-alpha that supports multiple control conditions, we can enable Canny and Depth control simultaneously and transform the environment into a twilight setting.

|![image_5](https://github.com/user-attachments/assets/19d2abc4-36ae-4163-a8da-df5732d1a737)|![image_6](https://github.com/user-attachments/assets/28378271-3782-484c-bd51-3d3311dd85c6)|
|-|-|
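
Concretely, dual structure control can be wired up as in the minimal sketch below. The model paths and the `ControlNetConfigUnit` arguments (`processor_id`, `model_path`, `scale`) are illustrative assumptions rather than the exact values used here; see example_3 in [`flux_controlnet.py`](./flux_controlnet.py) for the real setup.

```python
import torch
from diffsynth import ModelManager, FluxImagePipeline, ControlNetConfigUnit

# Setup sketch: model paths and ControlNetConfigUnit arguments are assumptions,
# not the exact values used in flux_controlnet.py.
model_manager = ModelManager(torch_dtype=torch.bfloat16, device="cuda")
model_manager.load_models([
    "models/FLUX/FLUX.1-dev",                               # assumed base model path
    "models/ControlNet/FLUX.1-dev-Controlnet-Union-alpha",  # assumed ControlNet path
])

# One Union ControlNet serves both conditions; each unit has its own processor and strength.
pipe = FluxImagePipeline.from_model_manager(model_manager, controlnet_config_units=[
    ControlNetConfigUnit(processor_id="canny", model_path="models/ControlNet/FLUX.1-dev-Controlnet-Union-alpha", scale=1.0),
    ControlNetConfigUnit(processor_id="depth", model_path="models/ControlNet/FLUX.1-dev-Controlnet-Union-alpha", scale=1.0),
])

# Generate a cat, then regenerate it at twilight while keeping its structure.
cat = pipe(prompt="a photo of a cat, highly detailed", height=1024, width=1024, seed=4)
twilight_cat = pipe(
    prompt="a cat at twilight, warm dusk lighting",
    controlnet_image=cat,  # Canny edges and a depth map are extracted from this image
    height=1024, width=1024,
    seed=5,
)
twilight_cat.save("twilight_cat.jpg")
```
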
ControlNet's structural control strength is adjustable. For example, in the case below, when we move the girl from summer to winter, we can lower the control strength so that the model adapts to the content of the image and changes her into warm clothes.

|![image_7](https://github.com/user-attachments/assets/a7b8555b-bfd9-4e92-aa77-16bca81b07e3)|![image_8](https://github.com/user-attachments/assets/a1bab36b-6cce-4f29-8233-4cb824b524a8)|
|-|-|
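
As a sketch of the idea, continuing from the setup above and again treating `scale` as an assumed argument, a lower value weakens the structural constraint; see example_4 in [`flux_controlnet.py`](./flux_controlnet.py) for the actual code behind these images.

```python
# Assumed: a lower per-unit scale weakens structural control, so the prompt can
# override details such as clothing while the overall pose is preserved.
pipe = FluxImagePipeline.from_model_manager(model_manager, controlnet_config_units=[
    ControlNetConfigUnit(processor_id="canny", model_path="models/ControlNet/FLUX.1-dev-Controlnet-Union-alpha", scale=0.5),
])

# The summer prompt below is illustrative; the winter prompt and seed follow example_4.
summer_girl = pipe(prompt="a beautiful Asian girl, full body, blue dress, summer", height=1024, width=1024, seed=6)
winter_girl = pipe(
    prompt="a beautiful Asian girl, full body, red dress, winter",
    controlnet_image=summer_girl,
    height=1024, width=1024,
    seed=7,
)
winter_girl.save("image_8.jpg")
```
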
## Upscaler/Tile/Blur: High-Resolution Image Synthesis

Many ControlNet models support high-resolution refinement, for example:

Model links: https://modelscope.cn/models/jasperai/Flux.1-dev-Controlnet-Upscaler, https://modelscope.cn/models/InstantX/FLUX.1-dev-Controlnet-Union-alpha, https://modelscope.cn/models/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro

These models can turn blurry, noisy, low-quality images into clear ones. DiffSynth-Studio's native high-resolution patch (tiled) processing overcomes the resolution limitations of these models, enabling image generation at resolutions of 2048 or higher and significantly extending their capabilities. In the example below, in the high-definition image enlarged to 2048 resolution, the cat's fur is rendered in exquisite detail and the skin texture of the characters is delicate and realistic.

|![image_1](https://github.com/user-attachments/assets/9038158a-118c-4ad7-ab01-22865f6a06fc)|![image_2](https://github.com/user-attachments/assets/88583a33-cd74-4cb9-8fd4-c6e14c0ada0c)|
|-|-|

|![image_3](https://github.com/user-attachments/assets/13061ecf-bb57-448a-82c6-7e4655c9cd85)|![image_4](https://github.com/user-attachments/assets/0b7ae80f-de58-4d1d-a49c-ad17e7631bdc)|
|-|-|
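
A sketch of the tiled pass, assuming `pipe` was built with one of the high-resolution ControlNets listed above; the keyword arguments mirror example_1 in [`flux_controlnet.py`](./flux_controlnet.py).

```python
from PIL import Image

def refine_to_2048(pipe, low_quality_image: Image.Image, prompt: str, seed: int = 1) -> Image.Image:
    # tiled=True enables DiffSynth-Studio's patch-wise (tiled) processing, which lifts
    # the effective resolution limit of the ControlNet model.
    return pipe(
        prompt=prompt,
        controlnet_image=low_quality_image,  # the blurry / noisy input to refine
        height=2048, width=2048, tiled=True,
        seed=seed,
    )

# Usage sketch (file names are placeholders):
# hires = refine_to_2048(pipe, Image.open("image_1.jpg"), "a photo of a cat, highly detailed")
# hires.save("image_2.jpg")
```
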
## Inpaint: Image Restoration

The Inpaint ControlNet model can repaint specific areas of an image. For example, we can put sunglasses on a cat.

Model link: https://modelscope.cn/models/alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Beta

|![image_9](https://github.com/user-attachments/assets/babddad0-2d67-4624-b77a-c953250ebdab)|![mask_9](https://github.com/user-attachments/assets/d5bc2878-1817-457a-bdfa-200f955233d3)|![image_10](https://github.com/user-attachments/assets/e3197f2c-190b-4522-83ab-a2e0451b39f6)|
|-|-|-|
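
The mask is simply a black-and-white image in which white marks the region to repaint. A sketch following example_5 in [`flux_controlnet.py`](./flux_controlnet.py), assuming `pipe` was built with the Inpainting ControlNet and `cat_image` is the image to edit:

```python
import numpy as np
from PIL import Image

# White pixels mark the region to repaint (here: a band across the cat's eyes).
mask = np.zeros((1024, 1024, 3), dtype=np.uint8)
mask[100:350, 350:-300] = 255
mask = Image.fromarray(mask)

cat_with_sunglasses = pipe(
    prompt="a cat sitting on a chair, wearing sunglasses",
    controlnet_image=cat_image,    # the original image to edit
    controlnet_inpaint_mask=mask,  # only the masked region is repainted
    height=1024, width=1024,
    seed=9,
)
cat_with_sunglasses.save("image_10.jpg")
```
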
However, we notice that the cat's head pose has changed. If we want to preserve the original structural features, we can bring in the Canny, Depth, or Normal models. DiffSynth-Studio seamlessly supports ControlNets with different architectures, so by adding a Normal ControlNet we can keep the structure of the image unchanged during local repainting.

Model link: https://modelscope.cn/models/jasperai/Flux.1-dev-Controlnet-Surface-Normals

|![image_11](https://github.com/user-attachments/assets/c028e6fc-5125-4cba-b35a-b6211c2e6600)|![mask_11](https://github.com/user-attachments/assets/1928ee9a-7594-4c6e-9c71-5bd0b043d8f4)|![image_12](https://github.com/user-attachments/assets/97b3b9e1-f821-405e-971b-9e1c31a209aa)|
|-|-|-|
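
Mixing ControlNets of different architectures only requires listing both units when building the pipeline. A sketch with illustrative `ControlNetConfigUnit` arguments and model paths, assuming `original_image` and `mask` are prepared as in the previous sketch; see example_6 in [`flux_controlnet.py`](./flux_controlnet.py) for the real setup.

```python
# One inpainting ControlNet plus one surface-normal ControlNet: the two models have
# different architectures, and DiffSynth-Studio combines them transparently.
pipe = FluxImagePipeline.from_model_manager(model_manager, controlnet_config_units=[
    ControlNetConfigUnit(processor_id="inpaint", model_path="models/ControlNet/FLUX.1-dev-Controlnet-Inpainting-Beta", scale=1.0),
    ControlNetConfigUnit(processor_id="normal", model_path="models/ControlNet/Flux.1-dev-Controlnet-Surface-Normals", scale=1.0),
])
repainted = pipe(
    prompt="a beautiful Asian woman looking at the sky, wearing a yellow t-shirt.",
    controlnet_image=original_image,  # the image to edit
    controlnet_inpaint_mask=mask,     # white marks the region to repaint, as above
    height=1024, width=1024,
    seed=11,
)
repainted.save("image_12.jpg")
```
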
## MultiControlNet+MultiDiffusion: Fine-Grained Control

DiffSynth-Studio not only supports activating multiple ControlNets simultaneously, it also allows different regions of an image to be controlled with different prompts, and supports chunked processing of ultra-high-resolution images, enabling extremely fine-grained control. Next, we will walk through the creative process behind a beautiful image.

First, use the prompt "a beautiful Asian woman and a cat on a bed. The woman wears a dress" to generate a cat and a young woman.

![image_13](https://github.com/user-attachments/assets/8da006e4-0e68-4fa5-b407-31ef5dbe8e5a)

Then, enable the Inpaint ControlNet and the Canny ControlNet.

Model links: https://modelscope.cn/models/alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Beta, https://modelscope.cn/models/InstantX/FLUX.1-dev-Controlnet-Union-alpha

We control the image using two regions, each with its own prompt and mask (see the sketch after the table):

|Prompt: an orange cat, highly detailed|Prompt: a girl wearing a red camisole|
|-|-|
|![mask_13_1](https://github.com/user-attachments/assets/188530a0-913c-48db-a7f1-62f0384bfdc3)|![mask_13_2](https://github.com/user-attachments/assets/99c4d0d5-8cc3-47a0-8e56-ceb37db4dfdc)|
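
Conceptually, each region gets its own prompt and mask while the global prompt covers the whole canvas. The sketch below only approximates example_7 in [`flux_controlnet.py`](./flux_controlnet.py): the regional-prompt parameter names (`local_prompts`, `masks`) are assumptions, so consult that file for the exact call; the mask regions and the seed follow the example.

```python
import numpy as np
from PIL import Image

def region_mask(rows: slice, cols: slice, size: int = 1024) -> Image.Image:
    # White pixels mark the region controlled by the corresponding prompt.
    mask = np.zeros((size, size, 3), dtype=np.uint8)
    mask[rows, cols] = 255
    return Image.fromarray(mask)

cat_mask = region_mask(slice(300, -100), slice(30, 450))      # left region: the cat
girl_mask = region_mask(slice(500, -100), slice(-400, None))  # right region: the girl

# Hypothetical call: local_prompts / masks are assumed parameter names for illustration.
image = pipe(
    prompt="a beautiful Asian woman and a cat on a bed. The woman wears a dress.",
    local_prompts=["an orange cat, highly detailed", "a girl wearing a red camisole"],
    masks=[cat_mask, girl_mask],
    controlnet_image=Image.open("image_13.jpg"),  # the image generated in the first step
    height=1024, width=1024,
    seed=101,
)
image.save("image_14.jpg")
```
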
Generate!

![image_14](https://github.com/user-attachments/assets/f5b9d3dd-a690-4597-91a8-a019c6fc2523)
The background is a bit blurry, so we apply a deblurring LoRA in an image-to-image pass.

Model link: https://modelscope.cn/models/LiblibAI/FLUX.1-dev-LoRA-AntiBlur
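
The LoRA is attached through the model manager. In the sketch below, `load_lora`, `lora_alpha`, `cfg_scale`, and `num_inference_steps` match [`flux_controlnet.py`](./flux_controlnet.py), while the prompt and the image-to-image arguments (`input_image`, `denoising_strength`) are illustrative assumptions.

```python
from PIL import Image

# Attach the deblurring LoRA (path and lora_alpha as in flux_controlnet.py).
model_manager.load_lora("models/lora/FLUX-dev-lora-AntiBlur.safetensors", lora_alpha=2)

# Image-to-image refinement pass; input_image / denoising_strength are assumed names.
sharpened = pipe(
    prompt="a beautiful Asian woman wearing a red camisole and an orange cat on a bed. highly detailed, clear background.",
    input_image=Image.open("image_14.jpg"),
    denoising_strength=0.7,
    cfg_scale=2.0, num_inference_steps=50,
    height=1024, width=1024,
    seed=102,
)
sharpened.save("image_15.jpg")
```
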
![image_15](https://github.com/user-attachments/assets/32ed2667-2260-4d80-aaa9-4435d6920a2a)

The entire image is much clearer now. Next, let's use the Upscaler model to increase the resolution to 4096×4096!

Model link: https://modelscope.cn/models/jasperai/Flux.1-dev-Controlnet-Upscaler
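
A sketch of the final pass: rebuild the pipeline around the Upscaler ControlNet (the `ControlNetConfigUnit` arguments are again illustrative assumptions) and run the tiled mode at 4096×4096, as in the last step of example_7 in [`flux_controlnet.py`](./flux_controlnet.py).

```python
# Rebuild the pipeline with the Upscaler ControlNet (illustrative arguments).
pipe = FluxImagePipeline.from_model_manager(model_manager, controlnet_config_units=[
    ControlNetConfigUnit(processor_id="tile", model_path="models/ControlNet/Flux.1-dev-Controlnet-Upscaler", scale=1.0),
])
final = pipe(
    prompt="a beautiful Asian woman wearing a red camisole and an orange cat on a bed. highly detailed, delicate skin texture, clear background.",
    controlnet_image=sharpened,  # output of the deblurring pass above
    height=4096, width=4096, tiled=True,
    seed=104,
)
final.save("image_17.jpg")
```
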
![image_17](https://github.com/user-attachments/assets/1a688a12-1544-4973-8aca-aa3a23cb34c1)

Zoom in to see the details.

![image_17_cropped](https://github.com/user-attachments/assets/461a1fbc-9ffa-4da5-80fd-e1af9667c804)

Enjoy!

examples/ControlNet/flux_controlnet.py

Lines changed: 22 additions & 22 deletions
```diff
@@ -20,7 +20,7 @@ def example_1():
         height=768, width=768,
         seed=0
     )
-    image_1.save("image_1.png")
+    image_1.save("image_1.jpg")

     image_2 = pipe(
         prompt="a photo of a cat, highly detailed",
@@ -29,7 +29,7 @@ def example_1():
         height=2048, width=2048, tiled=True,
         seed=1
     )
-    image_2.save("image_2.png")
+    image_2.save("image_2.jpg")



@@ -48,7 +48,7 @@ def example_2():
         height=768, width=768,
         seed=2
     )
-    image_1.save("image_3.png")
+    image_1.save("image_3.jpg")

     image_2 = pipe(
         prompt="a beautiful Chinese girl, delicate skin texture",
@@ -57,7 +57,7 @@ def example_2():
         height=2048, width=2048, tiled=True,
         seed=3
     )
-    image_2.save("image_4.png")
+    image_2.save("image_4.jpg")


 def example_3():
@@ -80,15 +80,15 @@ def example_3():
         height=1024, width=1024,
         seed=4
     )
-    image_1.save("image_5.png")
+    image_1.save("image_5.jpg")

     image_2 = pipe(
         prompt="sunshine, a cat is running",
         controlnet_image=image_1,
         height=1024, width=1024,
         seed=5
     )
-    image_2.save("image_6.png")
+    image_2.save("image_6.jpg")


 def example_4():
@@ -111,15 +111,15 @@ def example_4():
         height=1024, width=1024,
         seed=6
     )
-    image_1.save("image_7.png")
+    image_1.save("image_7.jpg")

     image_2 = pipe(
         prompt="a beautiful Asian girl, full body, red dress, winter",
         controlnet_image=image_1,
         height=1024, width=1024,
         seed=7
     )
-    image_2.save("image_8.png")
+    image_2.save("image_8.jpg")



@@ -138,20 +138,20 @@ def example_5():
         height=1024, width=1024,
         seed=8
     )
-    image_1.save("image_9.png")
+    image_1.save("image_9.jpg")

     mask = np.zeros((1024, 1024, 3), dtype=np.uint8)
     mask[100:350, 350: -300] = 255
     mask = Image.fromarray(mask)
-    mask.save("mask_9.png")
+    mask.save("mask_9.jpg")

     image_2 = pipe(
         prompt="a cat sitting on a chair, wearing sunglasses",
         controlnet_image=image_1, controlnet_inpaint_mask=mask,
         height=1024, width=1024,
         seed=9
     )
-    image_2.save("image_10.png")
+    image_2.save("image_10.jpg")



@@ -179,20 +179,20 @@ def example_6():
         height=1024, width=1024,
         seed=10
     )
-    image_1.save("image_11.png")
+    image_1.save("image_11.jpg")

     mask = np.zeros((1024, 1024, 3), dtype=np.uint8)
     mask[-400:, 10:-40] = 255
     mask = Image.fromarray(mask)
-    mask.save("mask_11.png")
+    mask.save("mask_11.jpg")

     image_2 = pipe(
         prompt="a beautiful Asian woman looking at the sky, wearing a yellow t-shirt.",
         controlnet_image=image_1, controlnet_inpaint_mask=mask,
         height=1024, width=1024,
         seed=11
     )
-    image_2.save("image_12.png")
+    image_2.save("image_12.jpg")


 def example_7():
@@ -220,22 +220,22 @@ def example_7():
         height=1024, width=1024,
         seed=100
     )
-    image_1.save("image_13.png")
+    image_1.save("image_13.jpg")

     mask_global = np.zeros((1024, 1024, 3), dtype=np.uint8)
     mask_global = Image.fromarray(mask_global)
-    mask_global.save("mask_13_global.png")
+    mask_global.save("mask_13_global.jpg")

     mask_1 = np.zeros((1024, 1024, 3), dtype=np.uint8)
     mask_1[300:-100, 30: 450] = 255
     mask_1 = Image.fromarray(mask_1)
-    mask_1.save("mask_13_1.png")
+    mask_1.save("mask_13_1.jpg")

     mask_2 = np.zeros((1024, 1024, 3), dtype=np.uint8)
     mask_2[500:-100, -400:] = 255
     mask_2[-200:-100, -500:-400] = 255
     mask_2 = Image.fromarray(mask_2)
-    mask_2.save("mask_13_2.png")
+    mask_2.save("mask_13_2.jpg")

     image_2 = pipe(
         prompt="a beautiful Asian woman and a cat on a bed. The woman wears a dress.",
@@ -244,7 +244,7 @@ def example_7():
         height=1024, width=1024,
         seed=101
     )
-    image_2.save("image_14.png")
+    image_2.save("image_14.jpg")

     model_manager.load_lora("models/lora/FLUX-dev-lora-AntiBlur.safetensors", lora_alpha=2)
     image_3 = pipe(
@@ -255,7 +255,7 @@ def example_7():
         cfg_scale=2.0, num_inference_steps=50,
         seed=102
     )
-    image_3.save("image_15.png")
+    image_3.save("image_15.jpg")

     pipe = FluxImagePipeline.from_model_manager(model_manager, controlnet_config_units=[
         ControlNetConfigUnit(
@@ -271,7 +271,7 @@ def example_7():
         height=2048, width=2048, tiled=True,
         seed=103
     )
-    image_4.save("image_16.png")
+    image_4.save("image_16.jpg")

     image_5 = pipe(
         prompt="a beautiful Asian woman wearing a red camisole and an orange cat on a bed. highly detailed, delicate skin texture, clear background.",
@@ -280,7 +280,7 @@ def example_7():
         height=4096, width=4096, tiled=True,
         seed=104
     )
-    image_5.save("image_17.png")
+    image_5.save("image_17.jpg")
```