
Conversation

@nastya236 (Contributor):

QQLinear layer

@nastya236 changed the title from "Qq linear" to "QQ linear" on Dec 18, 2025.

Diff context for the review comments below (excerpt):

    mode=self.mode,
    )

    def train(self):

Member:

Can you make that API consistent with Module::train?

Member:

I think if you do that we can get rid of the eval override above and just use the base class eval.

@nastya236 (Contributor, author) on Jan 2, 2026:

Of course, this is a very good point. Now the weights will be quantized on the first qq_linear.__call__() after calling Module::eval(), and likewise dequantized on the first qq_linear.__call__() after calling Module::train(). This seems to be the only way to keep it consistent with the current API without changing Module::train().
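
For illustration only (not code from this PR): a minimal sketch of that lazy approach, assuming the public mx.quantize / mx.dequantize / mx.quantized_matmul APIs; the class and attribute names are made up for the example.

    import math

    import mlx.core as mx
    import mlx.nn as nn


    class LazyQQLinear(nn.Module):
        """Sketch: quantize on the first call after eval(), dequantize on
        the first call after train()."""

        def __init__(self, input_dims, output_dims, group_size=64, bits=4):
            super().__init__()
            scale = math.sqrt(1.0 / input_dims)
            self.weight = mx.random.uniform(-scale, scale, (output_dims, input_dims))
            self.group_size = group_size
            self.bits = bits
            self._quantized = False

        def __call__(self, x):
            if not self.training and not self._quantized:
                # First forward pass after eval(): switch to the quantized representation.
                self.weight, self.scales, self.biases = mx.quantize(
                    self.weight, self.group_size, self.bits
                )
                self._quantized = True
            elif self.training and self._quantized:
                # First forward pass after train(): restore full-precision weights.
                self.weight = mx.dequantize(
                    self.weight, self.scales, self.biases, self.group_size, self.bits
                )
                self._quantized = False

            if self._quantized:
                return mx.quantized_matmul(
                    x, self.weight, self.scales, self.biases,
                    group_size=self.group_size, bits=self.bits,
                )
            return x @ self.weight.T

As the next comment points out, the drawback of this scheme is that parameters() and load_weights() only see the new representation after a forward pass has actually happened.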

Member:

I see why you had to do it this way, but I'm not crazy about how it works... I'm wondering if there is a better way to do it.

Basically the behavior that would be good to have is if we do:

qq_module.eval()
qq_module.parameters() # should give me the quantized params
qq_module.load_weights(quantized_weights) # should be able to load the quantized params

I think that should work and right now it won't.

I think we have some other options:

  1. Break away from the train/eval API and have something like QQLinear.quantize/QQLinear.dequantize which either quantizes/dequantizes the module in-place (or maybe returns a copy that is quantized/dequantized)
  2. Change the base class Module to make it easier to override train (e.g. call the submodules' train as well as setting the local module's _training); a rough sketch of this follows below.

Wdyt?
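
For illustration only (not from the PR): a toy sketch of what option (2) could look like, using a stand-in base class rather than the real mlx.nn.Module; the hook name _set_training is invented for the example.

    # Stand-in base class; the real mlx.nn.Module differs.
    class Module:
        def __init__(self):
            self._training = True

        def _child_modules(self):
            # Direct child modules found among the attributes.
            return (v for v in vars(self).values() if isinstance(v, Module))

        def _set_training(self, mode):
            # Local mode change only; subclasses override just this hook.
            self._training = mode

        def train(self, mode=True):
            # The recursion over submodules stays in the base class.
            self._set_training(mode)
            for child in self._child_modules():
                child.train(mode)
            return self

        def eval(self):
            return self.train(False)


    class QQLinear(Module):
        def __init__(self):
            super().__init__()
            self.quantized = False

        def _set_training(self, mode):
            super()._set_training(mode)
            # Quantize/dequantize eagerly on the mode switch, so parameters()
            # and load_weights() already see the right representation.
            self.quantized = not mode


    layer = QQLinear()
    layer.eval()
    assert layer.quantized        # quantized form visible immediately after eval()
    layer.train()
    assert not layer.quantized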

@nastya236 (Contributor, author):

Fully agree with the desired behavior you described. Between the options, I prefer (2). I would expect model.train() / model.eval() to recursively propagate mode changes through the tree, and adding a separate quantize() / dequantize() API would create a parallel mode system that would likely need to be kept aligned with train / eval anyway.

@awni (Member) left a comment:

Looks very nice! Just a few more cosmetic comments, then we should get it merged!

@awni (Member) left a comment:

Awesome, thanks!!

@awni merged commit 5037317 into ml-explore:main on Jan 5, 2026.
15 checks passed