-
Notifications
You must be signed in to change notification settings - Fork 212
feat(flowcontrol): add pool and model labels to metrics #2010
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(flowcontrol): add pool and model labels to metrics #2010
Conversation
Refactors `pkg/epp/metrics/metrics.go` to improve performance and maintainability. Changes include: - Centralize label keys and bucket definitions into package-level constants to remove duplication. - Switch from `.With(prometheus.Labels)` to ``.WithLabelValues()` in recorder functions to avoid map allocations on the hot path. - Extract magic strings for metric types into constants to prevent typos.
Updates the Flow Control layer metrics to include InferencePool, Model, and TargetModel labels, enabling better correlation between queuing behavior and backend demand. Changes include: - Update `FlowControlRequest` interface to expose observability context. - Propagate pool name, model name, and target model name through Admission Controller and Flow Controller. - Update `metrics.go` to include the new labels in the Flow Control metric definitions. - Update mocks and tests to reflect these API changes.
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
Hi @LukeAVanDrie. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
cc' @dumb0002 I did not add a second aggregate version of each metric as that would break away from the existing project convention. But, you should be able to aggregate from the existing metrics now. Let me know if this is sufficient. |
| } | ||
| go registry.Run(ctx) | ||
| admissionController = requestcontrol.NewFlowControlAdmissionController(fc) | ||
| admissionController = requestcontrol.NewFlowControlAdmissionController(fc, opts.PoolName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can also get this from the data layer via PoolGet (e.g., in requestcontrol/director.go or handlers/server.go). My understanding is that this is a static value though, so this should be sufficient and avoids a lookup.
@LukeAVanDrie, sounds good. Thanks! |
|
/ok-to-test |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g, LukeAVanDrie The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@LukeAVanDrie: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What type of PR is this?
/kind feature
/kind cleanup
What this PR does / why we need it:
This PR enhances the observability of the experimental Flow Control layer to support advanced autoscaling scenarios, specifically Scale-from-Zero and per-model scaling.
Previously, Flow Control metrics (
flow_control_queue_size,flow_control_request_queue_duration_seconds) only included flow keys (fairness_id,priority) which are orthogonal to model labels. This made it impossible for an external autoscaler (HPA) to distinguish which specific Pool or Model was experiencing pressure/saturation.Key Changes:
inference_pool,model_name, andtarget_model_namelabels to Flow Control metrics. This allows autoscalers to correlate queue metrics with specific pools or models.FlowControlRequestinterface and the Admission Controller to capture and propagate this context from the incoming request down to the metrics layer.pkg/epp/metrics/metrics.goto deduplicate label definitions, extract magic strings to constants, and optimize recorder performance (switched to.WithLabelValuesto reduce allocations on the hot path).Which issue(s) this PR fixes:
Fixes #1920
Does this PR introduce a user-facing change?: