
Conversation

spawner1145 (Contributor) commented Oct 2, 2025

city96 (Owner) commented Nov 8, 2025

So I don't think we can merge this as-is, partly because it has a lot of random changes that look like AI slop to me (e.g. the logging messages with the checkmark emoji lol), but mainly because I don't think adding custom model formats for text models already supported by llama.cpp is acceptable.

We should only ever convert image models, and for the text models the conversion should be handled by mainline llama.cpp.

The tokenizer can be reconstructed from the metadata, though you need to apply a tensor operation to actually make it usable, otherwise the norms will be messed up and you'll get bad results. I tried to implement it, but couldn't get the tokenizer to be 1:1 (everything else is tested working with proper quants from huggingface) #358

If you have the time, you could try to find the issue with the above PR; I'm pretty sure it's either due to spm.trainer_spec or due to the individual piece.score values. I left a test line in that loads the quantized model but replaces sd["spiece_model"], to serve as the baseline. I can go into more detail if you have any questions.

(Also, this PR would break UMT5 support, which already uses the tokenizer reconstruction logic)

@city96 city96 closed this Nov 8, 2025