@vaclavmuller

This PR fixes a shape mismatch when loading some lumina2 / NextDiT GGUF models
(e.g. Z-Image Turbo GGUF builds).

Some GGUF conversions store x_pad_token and cap_pad_token as 1D vectors
([D]) instead of the expected 2D shape ([1, D]), which causes
load_state_dict to fail with a size mismatch.

The loader now:

  • ensures a robust fallback shape when orig_shape metadata is missing
  • reshapes lumina2 pad tokens to (1, D) when needed
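
As a rough illustration of the two points above (a minimal sketch, not the code from this PR): the helper name `resolve_shape`, its parameters, and the suffix-based key match are all hypothetical, and the fallback is assumed to be the shape the tensor is stored with.

```python
from typing import Optional, Sequence, Tuple

# Key suffixes taken from the tensor names mentioned in this PR.
# Matching by suffix is an assumption made for this sketch.
PAD_TOKEN_KEYS = ("x_pad_token", "cap_pad_token")


def resolve_shape(
    name: str,
    stored_shape: Sequence[int],
    orig_shape: Optional[Sequence[int]] = None,
) -> Tuple[int, ...]:
    """Pick the shape a tensor should have before load_state_dict sees it."""
    # Fallback (assumed): with no orig_shape metadata, use the stored shape.
    source = orig_shape if orig_shape is not None else stored_shape
    shape = tuple(int(d) for d in source)

    # lumina2 pad tokens: promote a 1D [D] vector to the expected [1, D].
    if name.endswith(PAD_TOKEN_KEYS) and len(shape) == 1:
        shape = (1, shape[0])
    return shape
```

With a PyTorch tensor, the result could be applied as `tensor = tensor.reshape(resolve_shape(name, tensor.shape))`. Tensors that are already 2D, and keys other than the two pad tokens, pass through unchanged.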

Tested with:
https://huggingface.co/leejet/Z-Image-Turbo-GGUF

Addresses #379
