@jsjung00 jsjung00 commented Jan 4, 2026

Addresses a GPU memory leak during decoding.

Problem:
The pae and ptm tensors were not moved to CPU, so they accumulated in GPU memory.
This is benign when the batch size is 1 in batch_generate(), but with a batch size greater than 1 it leads to an OOM error.

Fix:
Move the pae and ptm tensors to CPU.
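
For readers without the diff in front of them, here is a minimal sketch of the kind of change described, not the actual patch. The helper names `offload_to_cpu` and `collect_batch`, and the assumption that decoded outputs are dicts keyed by "pae" and "ptm", are hypothetical. The helper is duck-typed: it offloads any value exposing `detach()` and `cpu()`, which `torch.Tensor` does, so the sketch itself does not need to import torch.

```python
from typing import Any, Dict, Iterable, List


def offload_to_cpu(
    outputs: Dict[str, Any],
    keys: Iterable[str] = ("pae", "ptm"),  # hypothetical field names
) -> Dict[str, Any]:
    """Return a shallow copy of `outputs` with the named tensors moved to CPU.

    Values that do not look like tensors (no `detach`/`cpu`) pass through unchanged.
    """
    result = dict(outputs)
    for key in keys:
        value = result.get(key)
        if value is not None and hasattr(value, "detach") and hasattr(value, "cpu"):
            # detach() drops the autograd graph; cpu() copies the data to host memory,
            # so the stored result no longer holds a reference to GPU memory.
            result[key] = value.detach().cpu()
    return result


def collect_batch(decoded: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Accumulate per-sample outputs, offloading each one before it is stored."""
    return [offload_to_cpu(sample) for sample in decoded]
```

The point of offloading before appending is that any tensor kept in a results list stays allocated on its device for as long as the list lives, so results held across a batch should sit in host memory.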
