Closed
Labels
bug: Something isn't working
Description
Describe the bug
The training loss decreases, but the U-Net's denoising gets worse when fine-tuning stable-diffusion-2-1-base on a self-built dataset.
The resulting artifacts resemble an illustration in a paper I read, so I'll call them "residual noise" for now.
Overfitting on a single batch, with these settings:
--resolution="320" \
--train_batch_size="16" \
--gradient_accumulation_steps="4" \
--gradient_checkpointing \
--learning_rate="5e-05"
Reproduction
accelerate launch --mixed_precision="fp16" train_text_to_image.py \
--pretrained_model_name_or_path="stabilityai/stable-diffusion-2-1-base" \
--train_data_dir="clean_data/train-good" \
--resolution="320" \
--train_batch_size="16" \
--gradient_accumulation_steps="4" \
--gradient_checkpointing \
--max_train_steps="100000" \
--learning_rate="5e-05" \
--max_grad_norm="1" \
--lr_scheduler="constant" \
--lr_warmup_steps="0" \
--output_dir="experiments/demo" \
--mixed_precision="fp16" \
--validation_prompts \
"Brain, AXT1PRE, Slice1" \
"Brain, AXT1PRE, Slice2, FieldStrength:2.8936, Flash, TR:264, TE:2.88, TI:300, flipAngle:70" \
"Brain, AXT1PRE, Slice8, FieldStrength:2.8936, Flash, TR:264, TE:2.88, TI:300, flipAngle:70" \
"Brain, AXT1PRE, Slice11, FieldStrength:2.8936, Flash, TR:250, TE:2.64, TI:300, flipAngle:70" \
"Brain, AXT2, Slice1" \
"Brain, AXT2, Slice1, FieldStrength:1.494, TurboSpinEcho with EchoSpacing:9.74, TR:5120, TE:107, TI:100, flipAngle:150" \
"Brain, AXT2, Slice6, FieldStrength:1.494, TurboSpinEcho with EchoSpacing:10.28, TR:5120, TE:103, TI:100, flipAngle:150" \
"Brain, AXT2, Slice10, FieldStrength:1.494, TurboSpinEcho with EchoSpacing:10.16, TR:5460, TE:102, TI:100, flipAngle:150" \
"Brain, AXT1POST, Slice1" \
"Brain, AXT1POST, Slice1, FieldStrength:2.8936, Flash, TR:250, TE:2.64, TI:300, flipAngle:70" \
"Brain, AXT1POST, Slice8, FieldStrength:2.8936, Flash, TR:264, TE:2.88, TI:300, flipAngle:70" \
"Brain, AXT1POST, Slice11, FieldStrength:2.8936, Flash, TR:264, TE:2.88, TI:300, flipAngle:70" \
"Brain, AXT1, Slice1" \
"Brain, AXT1, Slice1, FieldStrength:1.494, TurboSpinEcho with EchoSpacing:9.4, TR:419, TE:9.4, TI:100, flipAngle:140" \
"Brain, AXT1, Slice9, FieldStrength:1.494, TurboSpinEcho with EchoSpacing:9.4, TR:446, TE:9.4, TI:100, flipAngle:140" \
"Brain, AXT1, Slice11, FieldStrength:1.494, TurboSpinEcho with EchoSpacing:9.4, TR:461, TE:9.4, TI:100, flipAngle:145" \
"Brain, AXFLAIR, Slice1" \
"Brain, AXFLAIR, Slice1, FieldStrength:2.8936, TurboSpinEcho with EchoSpacing:9.02, TR:9000, TE:81, TI:2500, flipAngle:150" \
"Brain, AXFLAIR, Slice9, FieldStrength:2.8936, TurboSpinEcho with EchoSpacing:9.02, TR:9000, TE:81, TI:2500, flipAngle:150" \
"Brain, AXFLAIR, Slice11, FieldStrength:2.8936, TurboSpinEcho with EchoSpacing:9.02, TR:9000, TE:81, TI:2500, flipAngle:150" \
"yoda" \
"Brain, Slice1, FieldStrength:1.494, TurboSpinEcho with EchoSpacing:9.4, TR:419, TE:9.4, TI:100, flipAngle:140" \
"Brain, Slice9, FieldStrength:1.494, TurboSpinEcho with EchoSpacing:9.4, TR:446, TE:9.4, TI:100, flipAngle:140" \
"''" \
--validation_epochs="1" \
--checkpointing_steps="1500"
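For anyone reading the command above, here is a rough sketch of what the flags imply in terms of effective batch size and checkpoint count. It assumes a single process (the distributed field in the system info is left blank, so this is a guess) and that `--max_train_steps` and `--checkpointing_steps` count optimizer updates rather than micro-batches, which is my understanding of the example script. The helper names `effective_batch_size` and `checkpoint_steps` are made up for illustration and are not part of diffusers.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_processes=1):
    """Samples contributing to one optimizer update (num_processes=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_processes


def checkpoint_steps(max_train_steps, checkpointing_steps):
    """Optimizer steps at which a checkpoint would be saved, assuming a save
    whenever the global step is a multiple of checkpointing_steps."""
    return list(range(checkpointing_steps, max_train_steps + 1, checkpointing_steps))


# Values taken from the reproduction command.
eff = effective_batch_size(per_device_batch=16, grad_accum_steps=4)
saves = checkpoint_steps(max_train_steps=100_000, checkpointing_steps=1500)

print(eff)                    # 64
print(len(saves), saves[-1])  # 66 99000
```

So under these assumptions each update averages gradients over 64 samples, and the run would write 66 checkpoints, the last one at step 99000.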
Logs
System Info
- 🤗 Diffusers version: 0.33.0.dev0
- Platform: Windows-10-10.0.19044-SP0
- Running on Google Colab?: No
- Python version: 3.8.20
- PyTorch version (GPU?): 2.4.1+cu124 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.29.3
- Transformers version: 4.46.3
- Accelerate version: 1.0.1
- PEFT version: 0.7.0
- Bitsandbytes version: not installed
- Safetensors version: 0.5.3
- xFormers version: not installed
- Accelerator: NVIDIA RTX A5000, 24564 MiB
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?