Running VAE Tiling with small TAESD buffer will crash #649

Open
rmatif opened this issue Apr 5, 2025 · 0 comments

I also noticed that enabling VAE tiling while using TAESD causes a crash when the TAESD buffer is small. Maybe in this case tiling should simply be ignored if the buffer is below a certain size. The run below reproduces the crash with a 256x256 image:

rmatif ➜ /workspaces/stable-diffusion.cpp (master) $ ./build/bin/sd -m models/Realistic_Vision_V6.0_NV_B1_fp16.safetensors --taesd models/taesd.safetensors -p "cute cat" -v -H 256 -W 256 --vae-tiling --cfg-scale 1 --steps 1
Option: 
    n_threads:         4
    mode:              txt2img
    model_path:        models/Realistic_Vision_V6.0_NV_B1_fp16.safetensors
    wtype:             unspecified
    clip_l_path:       
    clip_g_path:       
    t5xxl_path:        
    diffusion_model_path:   
    vae_path:          
    taesd_path:        models/taesd.safetensors
    esrgan_path:       
    controlnet_path:   
    embeddings_path:   
    stacked_id_embeddings_path:   
    input_id_images_path:   
    style ratio:       20.00
    normalize input image :  false
    output_path:       output.png
    init_img:          
    mask_img:          
    control_image:     
    clip on cpu:       false
    controlnet cpu:    false
    vae decoder on cpu:false
    diffusion flash attention:false
    strength(control): 0.90
    prompt:            cute cat
    negative_prompt:   
    min_cfg:           1.00
    cfg_scale:         1.00
    slg_scale:         0.00
    guidance:          3.50
    eta:               0.00
    clip_skip:         -1
    width:             256
    height:            256
    sample_method:     euler_a
    schedule:          default
    sample_steps:      1
    strength(img2img): 0.75
    rng:               cuda
    seed:              42
    batch_count:       1
    vae_tiling:        true
    upscale_repeats:   1
System Info: 
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
[DEBUG] stable-diffusion.cpp:188  - Using CPU backend
[INFO ] stable-diffusion.cpp:197  - loading model from 'models/Realistic_Vision_V6.0_NV_B1_fp16.safetensors'
[INFO ] model.cpp:908  - load models/Realistic_Vision_V6.0_NV_B1_fp16.safetensors using safetensors format
[DEBUG] model.cpp:979  - init from 'models/Realistic_Vision_V6.0_NV_B1_fp16.safetensors'
[INFO ] stable-diffusion.cpp:244  - Version: SD 1.x 
[INFO ] stable-diffusion.cpp:277  - Weight type:                 f16
[INFO ] stable-diffusion.cpp:278  - Conditioner weight type:     f16
[INFO ] stable-diffusion.cpp:279  - Diffusion model weight type: f16
[INFO ] stable-diffusion.cpp:280  - VAE weight type:             f16
[DEBUG] stable-diffusion.cpp:282  - ggml tensor size = 400 bytes
[DEBUG] clip.hpp:171  - vocab size: 49408
[DEBUG] clip.hpp:182  -  trigger word img already in vocab
[DEBUG] ggml_extend.hpp:1174 - clip params backend buffer size =  307.44 MB(RAM) (196 tensors)
[DEBUG] ggml_extend.hpp:1174 - unet params backend buffer size =  1640.25 MB(RAM) (686 tensors)
[DEBUG] stable-diffusion.cpp:419  - loading weights
[DEBUG] model.cpp:1727 - loading tensors from models/Realistic_Vision_V6.0_NV_B1_fp16.safetensors
  |==================================================| 1130/1130 - 500.00it/s
[INFO ] tae.hpp:214  - loading taesd from 'models/taesd.safetensors', decode_only = true
[DEBUG] ggml_extend.hpp:1174 - taesd params backend buffer size =   2.34 MB(RAM) (67 tensors)
[INFO ] model.cpp:908  - load models/taesd.safetensors using safetensors format
[DEBUG] model.cpp:979  - init from 'models/taesd.safetensors'
[DEBUG] model.cpp:1727 - loading tensors from models/taesd.safetensors
  |=========================>                        | 67/134 - 1000.00it/s
[INFO ] tae.hpp:236  - taesd model loaded
[INFO ] stable-diffusion.cpp:503  - total params memory size = 1950.03MB (VRAM 2.34MB, RAM 1947.69MB): clip 307.44MB(RAM), unet 1640.25MB(RAM), vae 2.34MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(RAM)
[INFO ] stable-diffusion.cpp:522  - loading model from 'models/Realistic_Vision_V6.0_NV_B1_fp16.safetensors' completed, taking 0.65s
[INFO ] stable-diffusion.cpp:556  - running in eps-prediction mode
[DEBUG] stable-diffusion.cpp:600  - finished loaded file
[DEBUG] stable-diffusion.cpp:1548 - txt2img 256x256
[DEBUG] stable-diffusion.cpp:1241 - prompt after extract and remove lora: "cute cat"
[INFO ] stable-diffusion.cpp:690  - Attempting to apply 0 LoRAs
[INFO ] stable-diffusion.cpp:1246 - apply_loras completed, taking 0.00s
[DEBUG] conditioner.hpp:357  - parse 'cute cat' to [['cute cat', 1], ]
[DEBUG] clip.hpp:311  - token length: 77
[DEBUG] ggml_extend.hpp:1126 - clip compute buffer size: 1.40 MB(RAM)
[DEBUG] conditioner.hpp:485  - computing condition graph completed, taking 178 ms
[INFO ] stable-diffusion.cpp:1379 - get_learned_condition completed, taking 178 ms
[INFO ] stable-diffusion.cpp:1402 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1439 - generating image: 1/1 - seed 42
[DEBUG] stable-diffusion.cpp:808  - Sample
[DEBUG] ggml_extend.hpp:1126 - unet compute buffer size: 49.43 MB(RAM)
  |==================================================| 1/1 - 2.69s/it
[INFO ] stable-diffusion.cpp:1478 - sampling completed, taking 2.69s
[INFO ] stable-diffusion.cpp:1486 - generating 1 latent images completed, taking 2.75s
[INFO ] stable-diffusion.cpp:1489 - decoding 1 latents
[DEBUG] ggml_extend.hpp:616  - tile work buffer size: 3.06 MB
[DEBUG] ggml_extend.hpp:1126 - taesd compute buffer size: 480.00 MB(RAM)
[INFO ] ggml_extend.hpp:630  - processing 1 tiles
  |==================================================| 1/1 - 0.00it/s
Segmentation fault (core dumped)
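
As a rough illustration of the workaround suggested above, the sketch below skips tiling whenever the whole latent already fits inside a single tile. It reads "buffer" as the latent being decoded, which is an interpretation. The helper name `should_use_tiling`, its parameters, and the tile size used in the usage note are assumptions for illustration, not code or values taken from stable-diffusion.cpp.

```cpp
#include <cassert>

// Hypothetical helper (not part of stable-diffusion.cpp): decide whether
// tiled decoding should be used for a latent of the given size.
//   latent_w, latent_h : latent dimensions in latent pixels
//   tile_size          : tile edge length in latent pixels (assumed value)
//   tiling_requested   : whether the user passed --vae-tiling
static bool should_use_tiling(int latent_w, int latent_h,
                              int tile_size, bool tiling_requested) {
    assert(tile_size > 0);

    if (!tiling_requested) {
        return false;
    }
    // If the whole latent fits in one tile, tiling saves no memory,
    // so fall back to the regular non-tiled decode.
    if (latent_w <= tile_size && latent_h <= tile_size) {
        return false;
    }
    return true;
}
```

With the 256x256 request from the log (a 32x32 latent, since SD 1.x latents are 1/8 of the image resolution) and an assumed tile size of 64, `should_use_tiling(32, 32, 64, true)` returns `false`, so the decoder would take the non-tiled path instead of processing a single oversized tile.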