Skip to content

Conversation

@MasterPtato
Copy link
Contributor

No description provided.

@vercel
Copy link

vercel bot commented Dec 18, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
rivetkit-serverless Error Error Jan 6, 2026 1:34am
3 Skipped Deployments
Project Deployment Review Updated (UTC)
rivet-cloud Ignored Ignored Jan 6, 2026 1:34am
rivet-inspector Ignored Ignored Preview Jan 6, 2026 1:34am
rivet-site Ignored Ignored Preview Jan 6, 2026 1:34am

Copy link
Contributor Author

MasterPtato commented Dec 18, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Dec 18, 2025

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3660

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3660

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3660

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3660

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3660

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3660

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3660

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3660

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3660

commit: 2bc37dc

@graphite-app
Copy link
Contributor

graphite-app bot commented Jan 6, 2026

Merge activity

  • Jan 6, 1:36 AM UTC: MasterPtato added this pull request to the Graphite merge queue.
  • Jan 6, 1:37 AM UTC: CI is running for this pull request on a draft pull request (#3753) due to your merge queue CI optimization settings.
  • Jan 6, 1:37 AM UTC: Merged by the Graphite merge queue via draft PR: #3753.

graphite-app bot pushed a commit that referenced this pull request Jan 6, 2026
@graphite-app graphite-app bot closed this Jan 6, 2026
@graphite-app graphite-app bot deleted the 12-18-chore_clean_up_metrics branch January 6, 2026 01:37
@claude
Copy link

claude bot commented Jan 6, 2026

Pull Request Review: Metrics Cleanup

Overview

This PR successfully cleans up the metrics system with improvements to metric labels and dashboard configurations. The changes focus on improving consistency and removing unused infrastructure (FoundationDB dashboards).


✅ Positive Changes

  1. Better metric label naming: Changed error_codeerror throughout the codebase

    • More concise and consistent with Prometheus naming conventions
    • Applied consistently across API and Gasoline metrics
  2. Axum macros feature: Added macros feature to axum dependency

    • This enables route macro support which can simplify route definitions
    • Good forward-looking change for developer experience
  3. Removed FoundationDB dashboards: Cleaned up ~1,845 lines of unused dashboard configs

    • Reduces maintenance burden
    • Removes confusion about supported database backends
  4. Fixed Prometheus query division: Updated cache dashboard queries in engine/docker/template/grafana-dashboards/cache.json

    • Before: sum(irate(...) / rate(...))
    • After: sum by (key) (rate(...)) / sum by (key) (rate(...))
    • This correctly divides aggregated values instead of aggregating division results
  5. Removed unnecessary service_name tracking: Cleaned up 18 lines from env/src/lib.rs


🔍 Issues & Concerns

1. Breaking Change in Metric Labels

Severity: HIGH

The change from error_code to error is a breaking change for any existing:

  • Grafana dashboards not in this repo
  • Alert rules based on these metrics
  • External monitoring/observability tools

Impact:

  • Queries like rivet_api_request_errors_total{error_code="FOO"} will stop working
  • Historical metrics will have different label names than new metrics

Recommendation:

  • Document this breaking change in commit message/PR description
  • Consider a migration guide for teams running custom dashboards
  • If there's a metrics retention policy, note when old label names will expire

2. Inconsistent Error Label Format

Severity: MEDIUM

In engine/packages/api-builder/src/middleware.rs:183-189, the error label is formatted as {group}.{code}:

let error_str: String = if status.is_success() {
    String::new()
} else if let Some(err) = &error {
    format\!("{}.{}", err.group, err.code)  // ← Creates "GROUP.CODE"
} else {
    String::new()
};

This means the error label will contain values like "auth.invalid_token" or "api.rate_limited".

Issues:

  • High cardinality: Each unique group+code combination creates a new time series
  • Makes querying less flexible - can't easily filter by group or code independently
  • The old error_code label name made more sense for this combined value

Recommendations:

  1. Keep separate labels (Preferred):

    // In metrics.rs
    &["router", "method", "path", "status", "error_group", "error_code"]
    
    // In middleware.rs
    .with_label_values(&[
        router_name,
        method_clone.as_str(),
        path_clone.as_str(),
        status.as_str(),
        error.as_ref().map_or("", |x| &x.group),
        error.as_ref().map_or("", |x| &x.code)
    ])
  2. OR use a consistent delimiter: If you must combine, document the format and consider using :: or __ instead of . to avoid confusion with Prometheus label syntax


3. Empty String Labels

Severity: MEDIUM

Success cases use empty string "" for the error label:

let error_str: String = if status.is_success() {
    String::new()  // ← Empty string for successful requests

Issues:

  • Creates a time series with error="" for all successful requests
  • Wastes storage and query performance
  • Makes queries more complex (need to filter out empty strings)

Recommendation:
Use a sentinel value like "none" or "ok" for successful requests:

let error_str = if status.is_success() {
    "none"
} else if let Some(err) = &error {
    format\!("{}.{}", err.group, err.code)
} else {
    "unknown"
};

4. Missing Error Handling Documentation

Severity: LOW

The gasoline metrics in gasoline/src/metrics.rs have an error label but there's no clear documentation about:

  • What values this label can have
  • Whether it follows the same {group}.{code} format as API metrics
  • How to interpret empty vs non-empty error labels

Recommendation:
Add inline comments documenting the error label format.


5. Potential Memory Allocation in Hot Path

Severity: LOW

In middleware.rs:183-189, creating error_str involves string allocation on every request. For the success case, consider using a static string like "none" instead of String::new() to avoid allocations.


📊 Dashboard Updates Review

The Grafana dashboard changes look good overall:

  • ✅ Consistent metric label updates (error_codeerror)
  • ✅ Fixed query aggregation issues in cache dashboard
  • ✅ Updated to newer Grafana schema version (39→41)
  • ✅ Removed unnecessary dashboard IDs that cause git conflicts

🧪 Test Coverage

Missing:

  • No tests verify the new metric label format
  • No tests ensure metrics are properly recorded with error labels

Recommendation:
Consider adding integration tests that verify the error label format.


🔒 Security Review

No security concerns identified. The changes are primarily cosmetic refactoring of metrics labels.


🚀 Performance Considerations

The changes should have minimal performance impact:


Summary & Verdict

Overall Assessment: Good cleanup PR with some important issues to address

Must Fix Before Merge:

  1. Document the breaking change from error_codeerror
  2. Reconsider the combined error label format (separate labels would be better)
  3. Use non-empty sentinel values instead of empty strings for success cases

Nice to Have:
4. Add documentation for error label formats
5. Optimize string allocations in hot path


Recommendation: Request changes to address issues #1-3 before merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants