
Conversation

Symbiomatrix commented Jun 6, 2025

One-line fix: GGUFWriter will cache to a file instead of living in RAM.
It's naturally a bit slower, but I think it's unreasonable to demand enough RAM to hold a full Flux model in memory (what is it, 24 GB? I get crashes at 54% on 12 GB) for what is a one-time operation.
As mentioned in #285, this will allow running conversions even on low-resource containers (Colab, and presumably Spaces).
In terms of disk space, you need less than 100 GB free, mainly for the base, temp, and converted files; Colab supplies that much.
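
For reference, here is roughly what the change amounts to; a minimal sketch assuming the convert script builds the writer the way gguf-py's own tools do (the path, arch string, and exact constructor arguments are illustrative and vary by gguf-py version):

```python
import gguf

out_path = "flux1-dev-F16.gguf"  # illustrative output path
arch = "flux"                    # illustrative architecture string

# Before: GGUFWriter keeps every added tensor in RAM until the final
# write, which for a full Flux checkpoint means tens of GB of memory.
# writer = gguf.GGUFWriter(out_path, arch)

# After: use_temp_file=True spools tensor data to a temporary file on
# disk instead, trading some speed and temp disk space for RAM.
writer = gguf.GGUFWriter(out_path, arch, use_temp_file=True)
```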

Symbiomatrix marked this pull request as draft on June 6, 2025 at 22:37
Symbiomatrix (Author) commented:

It's acting a bit weird; I added some prints to see if it gets stuck somewhere during writing.

Symbiomatrix marked this pull request as ready for review on June 6, 2025 at 23:00
Symbiomatrix (Author) commented:

Double-checked: it seems to be working correctly, it's just that when you use a temp file the write_tensors_to_file function takes significantly longer (another 5 minutes, I reckon). Interrupting it mid-run will leave a corrupted file.

Colab probably needs an extra wait before it actually reclaims the space.
Symbiomatrix (Author) commented Jun 11, 2025

I added a model deletion flag, but Colab doesn't appear to free any disk space while the process is running. The only way I could get it to work with FP32 was to return the writer after writing the header, remove the source model, and then proceed with writing the tensors (per branch efficiency1). A semi-manual hack, but effective. HF Spaces seem to be even more limited at 50 GB, Colab at 70 GB.
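
For context, the ordering described above looks roughly like this; a hypothetical sketch of the efficiency1 approach, not the actual branch code (function name, arguments, and the state_dict handling are made up for illustration):

```python
import os
import gguf

def convert_low_disk(src_path: str, dst_path: str, arch: str, state_dict: dict):
    # Spool tensor data to a temp file rather than RAM (see the fix above).
    writer = gguf.GGUFWriter(dst_path, arch, use_temp_file=True)
    for name, tensor in state_dict.items():
        writer.add_tensor(name, tensor)  # data is copied out of the source here

    # Write the header and metadata first...
    writer.write_header_to_file()
    writer.write_kv_data_to_file()

    # ...then remove the source model so the container has enough free
    # disk for the tensor data that is about to be written out.
    os.remove(src_path)

    writer.write_tensors_to_file()
    writer.close()
```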

city96 (Owner) commented Jun 12, 2025

Sorry, I'm pretty swamped with IRL stuff lately, so I can't really review or test things at the moment, but here are some ideas:

There is a branch with an auto-convert Gradio tool here that might have some useful stuff you can reuse for the Google Colab space, including some of the logic: https://github.com/city96/ComfyUI-GGUF/blob/auto_convert/tools/tool_auto.py (plus PR #274).

Some other notes for this PR specifically:

  • I think adding use_temp_file=True would make sense behind a launch arg, with the description mentioning that it saves RAM but makes things slightly slower and, I guess, adds some wear to the SSD (also, could it crash if the user's temp directory is on the system drive and that drive is running low on space? Maybe keep the current default and add a note to the readme). Dunno if this needs a version check to make sure old versions of gguf-py don't break if that flag is present; see the sketch after this list.
  • There is this PR that I still need to look at. I want to add it to the Gradio tool on the auto_convert branch and then make it an HF Space similar to gguf-my-repo, but sadly I've had zero time to work on any of this in the past few months:
    convert : ability to lazy-load safetensors remotely without downloading to disk (ggml-org/llama.cpp#12820)
  • Deleting the input file is a definite no-go for the main repo. I don't want any kind of destructive operation that could result in people having to re-download files or, even worse, losing custom fine-tuned models.
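
A rough sketch of what the launch-arg idea from the first bullet could look like; the flag name, help text, and signature-based version check are all illustrative, not an agreed interface:

```python
import argparse
import inspect
import gguf

parser = argparse.ArgumentParser()
parser.add_argument(
    "--use-temp-file", action="store_true",
    help="Buffer tensor data in a temp file to save RAM "
         "(slightly slower and writes extra data to the temp drive).",
)
args = parser.parse_args()

writer_kwargs = {}
if args.use_temp_file:
    # Only pass the keyword if the installed gguf-py version accepts it,
    # so older versions don't break when the flag is used.
    if "use_temp_file" in inspect.signature(gguf.GGUFWriter.__init__).parameters:
        writer_kwargs["use_temp_file"] = True
    else:
        print("warning: installed gguf-py ignores use_temp_file, falling back to in-RAM writes")

writer = gguf.GGUFWriter("model.gguf", "flux", **writer_kwargs)  # illustrative path/arch
```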
