Fine-tuning SD3 does not produce a better model (hand and face deformation)? #8748

@KaiWU5

Describe the bug

I want to fine-tune SD3 to improve its human generation quality, using a dataset of 3 million high-quality human images (which has proven useful for SDXL and other models). However, hand and face deformation has not improved much after two days of training.

I am using the train script.

What I have tried so far:

  1. Regular training on the 3 million images with batch size 2 x 24 (V100), for 2 epochs, with lr 5e-6 and the AdamW optimizer
  2. Training with the Prodigy optimizer, same settings
  3. Adding q, k RMS norm to each attention layer
  4. Training only several blocks
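
For readers unfamiliar with item 3, here is a minimal pure-Python sketch of what q, k RMS normalization does to a single attention logit. It is an illustration under assumptions, not the code used in these experiments: the function names `rms_norm` and `attention_logit`, the `eps` value, the absence of a learnable gain, and the toy vectors are all hypothetical.

```python
import math

def rms_norm(vec, eps=1e-6):
    """Scale vec so its root-mean-square is about 1 (no learnable gain; assumption)."""
    mean_sq = sum(v * v for v in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in vec]

def attention_logit(q, k, use_qk_norm=True):
    """Scaled dot-product logit for one query/key pair of one head."""
    if use_qk_norm:
        # Normalize query and key separately before the dot product.
        q, k = rms_norm(q), rms_norm(k)
    dot = sum(a * b for a, b in zip(q, k))
    return dot / math.sqrt(len(q))

# Toy query/key vectors with large magnitudes (made-up values).
q = [30.0, -40.0, 10.0, 20.0]
k = [25.0, -35.0, 5.0, 15.0]

raw_logit = attention_logit(q, k, use_qk_norm=False)
normed_logit = attention_logit(q, k, use_qk_norm=True)
print(raw_logit, normed_logit)
```

With normalization, the logit magnitude cannot exceed the square root of the head dimension, whatever the scale of the incoming activations. That stabilizes attention during training, but it does not by itself address anatomy errors.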

All of these runs give nearly the same deformed results; the hands never look like normal human hands.

Could you share more experiments on SD3 training? There seems to be no easy way to adapt SD3 for human generation.

Reproduction

As described in the bug section above.

Logs

No response

System Info

24 V100 GPUs, batch size 2 per GPU, 3 million human images with aesthetic score > 4.5
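
As a rough sanity check on the scale of these runs, the setup above works out to the following number of optimizer steps. This is a sketch under assumptions: the dataset is taken as exactly 3,000,000 images, the 2 epochs come from the description, and gradient accumulation is assumed to be 1 because the issue does not mention it.

```python
import math

num_images = 3_000_000   # "3 million" from the issue, treated as exact (assumption)
num_gpus = 24            # V100 cards
per_gpu_batch = 2        # batch size per card
epochs = 2               # from the description of the regular training run
grad_accum_steps = 1     # assumption: not stated in the issue

# Number of images consumed per optimizer step across all GPUs.
effective_batch = num_gpus * per_gpu_batch * grad_accum_steps
steps_per_epoch = math.ceil(num_images / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch)   # 48
print(steps_per_epoch)   # 62500
print(total_steps)       # 125000
```

Under those assumptions, each run performs about 125,000 updates at an effective batch size of 48.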

Who can help?

No response

Labels: bug