[SDK] Fix off-by-one error in Base2ExponentialHistogramAggregation::Merge() downscaling
#3793
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue
The count invariant (
count == zero_count + sum(bucket_counts)) was being violated duringBase2ExponentialHistogramAggregation::Merge()operations when the combined bucket range of two histograms exceededmax_bucketsby exactly 1.Changes
The offending condition
if (pos_max_index > pos_min_index + max_buckets) ...fails when the combined range exceedsmax_bucketsby exactly 1.Step-by-step example with max_buckets=5 at scale 0:
2,4,8,16,32: histogram spans indices 0-44,8,16,32,64: histogram spans indices 1-5 (64 extends one position beyond the previous range)During Merge:
pos_min_index = min(0, 1) = 0pos_max_index = max(4, 5) = 5max_buckets + 1Old condition check:
pos_max_index > pos_min_index + max_buckets5 > 0 + 55 > 5 = FALSE<-- downscaling NOT triggered (BUG!)Fixed condition check:
pos_max_index >= pos_min_index + max_buckets5 >= 0 + 55 >= 5 = TRUE<-- downscaling correctly triggeredWhen downscaling wasn't triggered but the range exceeded
max_buckets,MergeBuckets()created a result that couldn't fit in the circular buffer, causing bucket counts to be silently lost.Prometheus rejects histograms with mismatching counts here: https://github.com/prometheus/prometheus/blob/main/model/histogram/histogram.go#L474
For significant contributions please make sure you have completed the following items:
[ ]CHANGELOG.mdupdated for non-trivial changes[ ] Changes in public API reviewed