-
Notifications
You must be signed in to change notification settings - Fork 141
fix: add wf and guard metrics #3657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add wf and guard metrics #3657
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
3 Skipped Deployments
|
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
How to use the Graphite Merge QueueAdd the label merge-queue to this PR to add it to the merge queue. You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. This stack of pull requests is managed by Graphite. Learn more about stacking. |
PR Review: Add Workflow and Guard MetricsOverviewThis PR improves observability by adding new metrics for workflow loops and guard proxy requests, while also cleaning up unused response handling code in the guard proxy service. Code Quality & Best Practices ✅Good practices observed:
Detailed Findings1. Workflow Loop Metrics (gasoline/src/builder/workflow/lupe.rs)Strengths:
Potential Issues: 🔴 Critical: Incorrect metric recording on first iteration // Line 189-192
if !loop_branch.cursor().compare_loop_branch(iteration)? {
// ...
metrics::LOOP_ITERATION_DURATION
.with_label_values(&[&ctx.name().to_string()])
.observe(iteration_dt.as_secs_f64());
}Problem: On the first iteration (iteration=0), Fix: Record the metric using metrics::LOOP_ITERATION_DURATION
.with_label_values(&[&ctx.name().to_string()])
.observe(cb_dt.as_secs_f64());🟡 Minor: Metric description clarity
2. Guard In-Flight Request Tracking (guard-core/src/proxy_service.rs)Strengths:
Issues: 🟡 Performance: Redundant metric updates // Lines 641-642 and 622
metrics::IN_FLIGHT_REQUEST_COUNT.set(self.in_flight_requests.entry_count() as i64);Problem: Calling Recommendation:
pub struct ProxyState {
// ... existing fields ...
in_flight_request_count: AtomicI64,
}
// On insert:
self.in_flight_request_count.fetch_add(1, Ordering::Relaxed);
metrics::IN_FLIGHT_REQUEST_COUNT.set(self.in_flight_request_count.load(Ordering::Relaxed));
// On release:
self.in_flight_request_count.fetch_sub(1, Ordering::Relaxed);
metrics::IN_FLIGHT_REQUEST_COUNT.set(self.in_flight_request_count.load(Ordering::Relaxed));🟢 Good comment addition (line 317): The clarification about using moka::Cache for TTL/capacity is helpful 3. Dead Code Removal (guard-core/src/proxy_service.rs)Excellent cleanup:
No issues found - this is a net positive for maintainability 4. API Error Metric Naming (api-builder/src/metrics.rs)✅ Good fix: Changed
Security Considerations ✅No security concerns identified. The changes:
Performance Considerations
Test Coverage
Recommendation: Add a test like: #[tokio::test]
async fn test_loop_metrics_recording() {
// Setup workflow with instrumented loop
// Assert metrics are recorded with expected labels
}SummaryMust Fix (Critical):
Should Fix (Recommended):
Overall Assessment: Good observability improvements with one critical bug that needs fixing before merge. The dead code removal is excellent housekeeping. |
More templates
@rivetkit/cloudflare-workers
@rivetkit/db
@rivetkit/framework-base
@rivetkit/next-js
@rivetkit/react
rivetkit
@rivetkit/sql-loader
@rivetkit/engine-runner
@rivetkit/engine-runner-protocol
commit: |
e258ce3 to
a4c59b6
Compare
5a3e37b to
16a7519
Compare
Merge activity
|
PR Review: Add wf and guard metricsOverviewThis PR adds improved metrics instrumentation for workflow loops and guard proxy requests. It also removes unused 🚨 Critical Issues1. Breaking Change - Test FailureFile: The PR removes Recommendation: Update the test to use an alternative approach (e.g., return a different route or use
|

No description provided.