Usage metering for embedding and reranking #2008

Yuqi-Du · 2025-04-14T19:14:20Z

This PR adds ModelUsage as part of the BatchedRerankingResponse.
The idea is to generalize ModelUsage so that it is shared by the reranking and embedding model services.

public class ModelUsage {
  public final ProviderType providerType;
  public final String provider;
  public final String model;

  // requestBytes is calculated from the HTTP body of the request sent to the provider.
  private int requestBytes = 0;

  // If Content-Length exists in the provider's HTTP response, use that;
  // otherwise, calculate the bytes from the HTTP body of the response.
  private int responseBytes = 0;

  private int promptTokens = 0;
  private int totalTokens = 0;

  // constructor and accessors omitted
}

Note that the gRPC proto definition has also changed to reflect the shared ModelUsage structure,
so this PR has corresponding EGW changes.
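
The byte-counting rules described in the ModelUsage comments can be sketched as below. This is only an illustration, not the code in this PR: the UsageBytes class and its method names are hypothetical, and it assumes bodies are UTF-8 encoded and that a missing or malformed Content-Length value falls back to measuring the body.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of this PR: illustrates the byte-counting rules only. */
final class UsageBytes {

  private UsageBytes() {}

  /** Request size, measured on the serialized HTTP body sent to the provider (UTF-8 assumed). */
  static int requestBytes(String requestBody) {
    return requestBody == null ? 0 : requestBody.getBytes(StandardCharsets.UTF_8).length;
  }

  /**
   * Response size: prefer the provider's Content-Length header value when it is present and
   * parses as a non-negative int, otherwise measure the response body (UTF-8 assumed).
   */
  static int responseBytes(String contentLengthHeader, String responseBody) {
    if (contentLengthHeader != null) {
      try {
        long parsed = Long.parseLong(contentLengthHeader.trim());
        if (parsed >= 0 && parsed <= Integer.MAX_VALUE) {
          return (int) parsed;
        }
      } catch (NumberFormatException ignored) {
        // Assumption: a malformed header is treated the same as a missing one.
      }
    }
    return responseBody == null ? 0 : responseBody.getBytes(StandardCharsets.UTF_8).length;
  }
}
```

Measuring the encoded bytes rather than String.length() matters for non-ASCII input, where one character can take more than one byte.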

Checklist

Changes manually tested
Automated Tests added/updated
Documentation added/updated
CLA Signed: DataStax CLA

amorton · 2025-06-05T23:42:19Z

Picking this work up, and combining it with #1865

src/main/java/io/stargate/sgv2/jsonapi/service/embedding/operation/EmbeddingProvider.java

src/main/resources/embedding-providers-config.yaml

src/main/proto/embedding_gateway.proto

src/main/java/io/stargate/sgv2/jsonapi/service/embedding/gateway/EmbeddingGatewayClient.java

...n/java/io/stargate/sgv2/jsonapi/service/embedding/operation/AwsBedrockEmbeddingProvider.java

src/main/java/io/stargate/sgv2/jsonapi/service/provider/ModelInputType.java

amorton

reviewed by yuqi

This reverts commit 48857a2.

init

e69667c

Yuqi-Du requested a review from a team as a code owner April 14, 2025 19:14

java doc

b50241a

amorton added 19 commits June 11, 2025 11:56

WIP - code changes, compiles,

1c1f1d8

tests not verified

WIP - basics working on laptop, checking regressions

e1db443

tmp

120993c

finished merge from main

88a1d2b

fmt

a024699

Use ServiceProviderConfig

19fa063

get the request config from ServiceConfigStore

EmbeddingGatewayClientTest fixes

7d8a6cb

EmbeddingProviderErrorMessageTest fixes

28bca03

OpenAiEmbeddingClientTest fixes

7ce2af7

fmt

501871e

RerankingProviderTest fixes

618961e

CommandResolverWithVectorizerTest fixes

9c72b30

fix for vectorize IT's that used custom provider

a213c80

fmt

396e652

Merge branch 'main' into yuqi/rerank-metering

131ad4e

fixes missed from merge

2030562

code tidy

cc9b48f

InsertOneTableIntegrationTest fixes

9b215b8

fmt

f65e364

amorton changed the title ~~Reranking model usage metering structure.~~ Usage metering for embedding and reranking Jun 17, 2025

cody tidy

1a5c956