Record output_tokens for incomplete requests #519

sjmonson · 2025-12-18T20:33:09Z

Summary

Sets continuous_usage_stats to get token usage on incomplete requests. If usage is still unavailable, falls back to the iteration count.

Details

In v0.3.0 and earlier, the number of iterations was used as a proxy for the output token count in incomplete requests that did not return usage metrics. In v0.4.0 this behavior was removed, which led to large discrepancies in output token count, depending on the percentage of the benchmark that consisted of incomplete requests.

This PR restores the original behavior of falling back to the number of iterations. Additionally, it sets the continuous_usage_stats flag to enable usage metrics on every iteration, when available.
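The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes an OpenAI-compatible streaming response in which each chunk is a dict that may carry a `usage` object with `completion_tokens`, and that a server such as vLLM accepts `continuous_usage_stats` inside `stream_options`. The names `build_stream_options`, `resolve_output_tokens`, and `_has_text` are hypothetical, and counting one iteration per chunk that carries text is an assumption about what "iteration" means here.

```python
from typing import Any


def build_stream_options() -> dict[str, bool]:
    # Assumed request fields: ask the server to attach usage to every
    # streamed chunk instead of only the final one.
    return {"include_usage": True, "continuous_usage_stats": True}


def _has_text(choice: dict[str, Any]) -> bool:
    # Chat completions stream text in delta.content; text completions
    # stream it in choice.text. Either one counts as generated output.
    delta = choice.get("delta") or {}
    return bool(delta.get("content") or choice.get("text"))


def resolve_output_tokens(chunks: list[dict[str, Any]]) -> int:
    """Return the last reported completion_tokens, else the iteration count."""
    last_usage = None
    iterations = 0
    for chunk in chunks:
        usage = chunk.get("usage")
        if usage and usage.get("completion_tokens") is not None:
            # Usage is cumulative, so the most recent value wins.
            last_usage = usage["completion_tokens"]
        if any(_has_text(c) for c in chunk.get("choices") or []):
            iterations += 1
    # Fall back to the iteration count only when no usage was ever seen.
    return last_usage if last_usage is not None else iterations
```

With this shape, a request cut off mid-stream still reports the usage from the last chunk it received, and a server that never sends usage still yields a nonzero estimate.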

Test Plan

Run a long-generation, high-concurrency benchmark using a max-seconds constraint. For incomplete requests, check that output_tokens is greater than 0 for at least some requests.
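One way to automate the check above is a small helper along these lines. It is only a sketch: the record layout (a list of dicts with `status` and `output_tokens` keys) and the function name `summarize_incomplete` are hypothetical and would need to be mapped onto the actual benchmark report format.

```python
from typing import Any


def summarize_incomplete(records: list[dict[str, Any]]) -> tuple[int, int]:
    """Return (incomplete request count, incomplete requests with tokens)."""
    # "status" and "output_tokens" are assumed field names for illustration.
    incomplete = [r for r in records if r.get("status") == "incomplete"]
    with_tokens = [r for r in incomplete if (r.get("output_tokens") or 0) > 0]
    return len(incomplete), len(with_tokens)
```

The test plan passes if the second number is greater than 0 whenever the first is.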

Related Issues

Resolves #514 (Up to 9% decrease in output tokens per second between v0.3 and v0.4)

"I certify that all code in this PR is my own, except as noted below."

Use of AI

Includes AI-assisted code completion
Includes code generated by an AI application
Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Signed-off-by: Samuel Monson <smonson@redhat.com>

sjmonson added 3 commits December 18, 2025 14:41

Enable continuous usage metrics in completion requests

7c31d3f

Signed-off-by: Samuel Monson <smonson@redhat.com>

Fallback to iteration count if usage metrics are unavailable

aaa865d

Signed-off-by: Samuel Monson <smonson@redhat.com>

Store full traceback in error

633a669

Signed-off-by: Samuel Monson <smonson@redhat.com>
