First, clone this repository and build the image:
```sh
git clone https://github.com/veralvx/xtts-finetune xtts-finetune \
&& cd xtts-finetune \
&& podman build -f Dockerfile -t xtts-finetune
```

Start the container:

```sh
podman run -it --rm --gpus=all -v ./dataset:/xtts/dataset -v ./run:/xtts/run xtts-finetune
```

If you want to use CPU only, omit `--gpus=all`.
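For example, a CPU-only run is the same command without the GPU flag:

```sh
podman run -it --rm -v ./dataset:/xtts/dataset -v ./run:/xtts/run xtts-finetune
```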
Before fine-tuning, your directory layout should look like this:
```
.
├── .venv
├── convert_audio.py
├── dataset
│   ├── metadata.csv
│   └── wavs
│       ├── 01.wav
│       ├── 02.wav
│       ├── reference.wav
│       └── ...
├── Dockerfile
├── main.py
├── pyproject.toml
├── requirements.txt
├── finetune.py
├── transcribe.py
├── uv.lock
└── validate_audio.py
```

Notice:

- `.wav` files under `dataset/wavs`, with one file called `reference.wav` (~5 s duration)
- `metadata.csv` under `dataset`
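XTTS-style datasets commonly use a pipe-delimited, LJSpeech-style `metadata.csv`; the exact columns this repo's transcribe step emits may differ, so treat this as a hypothetical sample:

```
01|This is the transcript of the first clip.|This is the transcript of the first clip.
02|And this is the transcript of the second clip.|And this is the transcript of the second clip.
```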
Audio files must be mono and sampled at 22050 Hz. Validate them, and convert any that do not conform, with:
```sh
uv run main.py --validate dataset/wavs
uv run main.py --convert dataset/wavs
```
Or, using ffmpeg:
```sh
ffmpeg -i input.wav -ac 1 -ar 22050 output.wav
```
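To process many clips at once, a small shell loop can apply the same conversion to the whole directory (a sketch; the `--convert` flag above covers this for you). It writes to a separate folder, since ffmpeg cannot overwrite its input in place:

```sh
# Convert every clip in dataset/wavs to mono 22050 Hz.
mkdir -p dataset/wavs_converted
for f in dataset/wavs/*.wav; do
  ffmpeg -i "$f" -ac 1 -ar 22050 "dataset/wavs_converted/$(basename "$f")"
done
```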
The `metadata.csv` file can be obtained with:
```sh
uv run main.py --transcribe ./dataset/wavs --lang en --model medium --device cuda
```

The metadata output will be written under `dataset/wavs`, and it should be moved to `dataset/metadata.csv`.
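Assuming the transcriber names its output `metadata.csv` (the actual filename may differ), the move is:

```sh
# Hypothetical source path: adjust to whatever the transcribe step produced.
mv dataset/wavs/metadata.csv dataset/metadata.csv
```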
Then start fine-tuning:

```sh
uv run main.py --lang en
```
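Pass the language of your dataset to `--lang`; for example, assuming the flag accepts other ISO 639-1 codes, Portuguese data would use:

```sh
uv run main.py --lang pt
```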