Skip to content

Conversation

@InJunee
Copy link

@InJunee InJunee commented Dec 9, 2025

  • Change attachment_size from int32 to int64 in protobuf definitions (baidu_rpc_meta.proto, rpc_dump.proto, nshead_meta.proto)
  • Fix type mismatches in ProcessRpcRequest and ProcessRpcResponse:
    • Change req_size/res_size from int to size_t
    • Change body_without_attachment_size from int to size_t
    • Add overflow checks and proper type casting
  • Fix size inconsistency in PackRpcHeader and SerializeRpcHeaderAndMeta:
    • Change payload_size parameter from int to size_t
    • Add overflow check for meta_size + payload_size exceeding UINT32_MAX
  • Update format strings to use PRId64 and %zu for proper large value display

This fix allows the protocol to correctly handle attachment sizes greater than 4GB (uint32 max), preventing integer overflow and size mismatches.

Fixes #1348

What problem does this PR solve?

Issue Number: resolve

Problem Summary:

What is changed and the side effects?

Changed:

Side effects:

  • Performance effects:

  • Breaking backward compatibility:


Check List:

- Change attachment_size from int32 to int64 in protobuf definitions
  (baidu_rpc_meta.proto, rpc_dump.proto, nshead_meta.proto)
- Fix type mismatches in ProcessRpcRequest and ProcessRpcResponse:
  - Change req_size/res_size from int to size_t
  - Change body_without_attachment_size from int to size_t
  - Add overflow checks and proper type casting
- Fix size inconsistency in PackRpcHeader and SerializeRpcHeaderAndMeta:
  - Change payload_size parameter from int to size_t
  - Add overflow check for meta_size + payload_size exceeding UINT32_MAX
- Update format strings to use PRId64 and %zu for proper large value display

This fix allows the protocol to correctly handle attachment sizes greater
than 4GB (uint32 max), preventing integer overflow and size mismatches.

Fixes apache#1348
optional int32 compress_type = 3;
optional int64 correlation_id = 4;
optional int32 attachment_size = 5;
optional int64 attachment_size = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot directly change the type of an existing protobuf field. This will cause client and server of different version not compatible.
You could add a new field, like optional int64 attachment_size_long = 13.
If the attachment size not exceed int32, you can use the original attachment_size field for compatibility. If the attachment size exceed int32, you can use the attachment_size_long field.

<< " + payload_size=" << payload_size
<< " = " << total_size
<< ") exceeds uint32_t maximum (" << UINT32_MAX << ")";
// Truncate to maximum, but this will cause protocol error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know this will cause protocol error, so what's the meaning of this code?
Maybe you can use pack32(UINT32_MAX) as a flag, then pack64(total_size) after it?
Then you can update the corresponding parsing code in ParseRpcMessage.

optional int64 correlation_id = 2;
optional int64 log_id = 3;
optional int32 attachment_size = 4;
optional int64 attachment_size = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to change nshead protocol.

…ility

- Add attachment_size_long (int64) field to support sizes > INT32_MAX
- Keep attachment_size (int32) for backward compatibility
- Implement helper functions for reading/writing attachment sizes
- Extend header format: use UINT32_MAX as flag for 64-bit body_size
- Update ParseRpcMessage to support both normal (12-byte) and extended (20-byte) header formats
- Fix rpc_replay tool to handle new attachment_size fields
- Remove unused variables and redundant checks

This fix allows the protocol to correctly handle attachment sizes greater
than 4GB while maintaining full backward compatibility with existing
clients and servers.

Fixes apache#1348
@InJunee
Copy link
Author

InJunee commented Dec 9, 2025

I have made modifications to the code according to your comments. Please check it. @wwbmmm

} else {
req.serialized_data() = sample->request.movable();
}
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This else is never reached

@wwbmmm
Copy link
Contributor

wwbmmm commented Dec 10, 2025

Please fix the CI failure, there are some compilation errors.

- Change %d to %zu for request_size and response_size
- Use standard INT32_MAX instead of custom constant
- Fix response_size format string (was incorrectly using request_size)
Remove the unreachable else branch that was never executed.
The logic is already handled in the inner if-else block.
@InJunee
Copy link
Author

InJunee commented Dec 10, 2025

I have made modifications to the code. Please check it. @wwbmmm

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes attachment size handling in the brpc protocol to support sizes exceeding uint32 limits (>4GB). The changes introduce backward-compatible int64 fields in protobuf schemas and implement an extended header format for large message bodies.

Key Changes

  • Added attachment_size_long (int64) fields to RpcMeta and RpcDumpMeta with backward compatibility for existing attachment_size (int32) fields
  • Implemented extended 20-byte header format for messages with body_size > UINT32_MAX, while maintaining the standard 12-byte format for smaller messages
  • Updated type handling from int to size_t for size calculations and added proper PRId64/%zu format specifiers

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/brpc/policy/baidu_rpc_meta.proto Adds attachment_size_long field for sizes > INT32_MAX while preserving original field for backward compatibility
src/brpc/rpc_dump.proto Adds attachment_size_long field with same backward compatibility approach
src/brpc/nshead_meta.proto Changes attachment_size from int32 to int64 (breaking change)
src/brpc/policy/baidu_rpc_protocol.cpp Implements helper functions for backward-compatible attachment size handling, extended header format, type conversions, and updated format strings
tools/rpc_replay/rpc_replay.cpp Adds backward-compatible attachment size reading logic for replay functionality

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


butil::IOBuf req_buf;
int body_without_attachment_size = req_size - meta.attachment_size();
const int64_t attachment_size = GetAttachmentSize(meta);
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant call to GetAttachmentSize(meta). The attachment_size was already retrieved and validated at lines 772-780. Consider reusing the same variable to avoid duplicate calls and improve code clarity.

Suggested change
const int64_t attachment_size = GetAttachmentSize(meta);
// attachment_size already retrieved and validated earlier

Copilot uses AI. Check for mistakes.
Comment on lines 773 to 774
if (attachment_size > 0) {
if (static_cast<size_t>(attachment_size) > req_size) {
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing validation for negative attachment_size values. The GetAttachmentSize() function returns int64_t which could be negative (either from malicious input or bugs). When a negative int64_t is cast to size_t at line 774, it becomes a very large unsigned value that would correctly fail the > req_size check. However, the condition should explicitly check attachment_size < 0 before the cast to make the validation logic clearer and more robust.

Consider adding: if (attachment_size < 0 || static_cast<size_t>(attachment_size) > req_size) {

Copilot uses AI. Check for mistakes.
Comment on lines 1064 to 1065
if (attachment_size > 0) {
if (static_cast<size_t>(attachment_size) > res_size) {
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing validation for negative attachment_size values. The GetAttachmentSize() function returns int64_t which could be negative. When a negative int64_t is cast to size_t at line 1065, it becomes a very large unsigned value that would correctly fail the > res_size check. However, the condition should explicitly check attachment_size < 0 before the cast to make the validation logic clearer and more robust.

Consider adding: if (attachment_size < 0 || static_cast<size_t>(attachment_size) > res_size) {

Copilot uses AI. Check for mistakes.
Comment on lines 192 to 193
if (attachment_size > 0 &&
static_cast<size_t>(attachment_size) < sample->request.size()) {
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing validation for negative attachment_size values. When attachment_size is negative (from malicious or corrupted dump files), casting it to size_t at line 193 will produce a very large unsigned value. While this would fail the < sample->request.size() check, an explicit negative check would make the validation clearer.

Consider adding: if (attachment_size > 0 && attachment_size <= static_cast<int64_t>(sample->request.size())) { to explicitly validate the range before the cast.

Copilot uses AI. Check for mistakes.
optional int64 correlation_id = 2;
optional int64 log_id = 3;
optional int32 attachment_size = 4;
optional int64 attachment_size = 4;
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This protobuf field change from int32 to int64 is a breaking change that breaks backward compatibility. Unlike baidu_rpc_meta.proto and rpc_dump.proto which preserve the old int32 field and add a new int64 field for compatibility, this directly changes the field type. This will cause issues when old clients/servers communicate with new ones, as they expect different wire formats for field 4.

Consider adding a new field attachment_size_long (similar to the pattern in other proto files) and keeping the original attachment_size as int32 for backward compatibility.

Suggested change
optional int64 attachment_size = 4;
// For backward compatibility, keep attachment_size as int32.
optional int32 attachment_size = 4;
// Use attachment_size_long for large attachments.
optional int64 attachment_size_long = 10;

Copilot uses AI. Check for mistakes.
}

messages = BaiduProxyPBMessages::Get();
const int64_t attachment_size = GetAttachmentSize(meta);
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant call to GetAttachmentSize(meta). The attachment_size was already retrieved at line 772 and validated at lines 773-780. Consider reusing the same variable to avoid duplicate calls and improve code clarity.

Suggested change
const int64_t attachment_size = GetAttachmentSize(meta);
// Use the previously retrieved attachment_size variable here.

Copilot uses AI. Check for mistakes.
Remove duplicate calls to GetAttachmentSize(meta) at lines 819 and 889.
The attachment_size is already retrieved and validated at line 772,
so we can reuse the same variable to avoid redundant calls and improve
code clarity.
- Add explicit negative value validation for attachment_size before casting to size_t
- Remove redundant GetAttachmentSize calls in ProcessRpcRequest
- Mark GetAttachmentSizeFromDump as unused to avoid compilation warnings
- Improve validation logic in rpc_replay to validate before casting
- Revert nshead_meta.proto attachment_size to int32 as per maintainer feedback

Fixes issue apache#1348
@InJunee InJunee requested a review from wwbmmm December 11, 2025 01:30
@InJunee
Copy link
Author

InJunee commented Dec 11, 2025

I have made modifications to the code. Please check it. @wwbmmm

@wwbmmm
Copy link
Contributor

wwbmmm commented Dec 11, 2025

LGTM

Copy link
Contributor

@chenBright chenBright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some UTs for Extended format, for example, in the brpc_server_unittest.cpp file.

- add extended header parse UT in brpc_server_unittest.cpp
- switch unused marker to ALLOW_UNUSED for GetAttachmentSizeFromDump
- include needed headers for packing extended header

Fixes issue apache#1348
@InJunee
Copy link
Author

InJunee commented Dec 15, 2025

@chenBright I have made some modifications to the code. Please have a look. Please let me know where the code still needs to be modified so that the code can be merged.

Copy link
Contributor

@chenBright chenBright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@InJunee
Copy link
Author

InJunee commented Dec 18, 2025

It's a great honor to receive your approval. May I ask when my PR can be merged? This is of great significance to me. @chenBright

Comment on lines +69 to +72
// 1. Header format:
// - Normal format (12 bytes): [PRPC][body_size(32bit)][meta_size(32bit)]
// - Extended format (20 bytes): [PRPC][UINT32_MAX][meta_size(32bit)][body_size(64bit)]
// Extended format is used when body_size > UINT32_MAX
Copy link
Contributor

@chenBright chenBright Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If body_size equals UINT32_MAX, the client uses the normal format, while the server uses the extended format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Normal format (12 bytes): [PRPC][body_size(32bit)][meta_size(32bit)]
  • Extended format (16 bytes): [LRPC][body_size(64bit)][meta_size(32bit)] , L represents long

The two formats have the same structure, only the magic number and body size differ. This makes them simpler to process.

@wwbmmm What do you think of the approach?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything else that needs to be modified in the code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's nothing else that needs to be changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

附件大小限制问题,当大小大于uint32, 会有问题

4 participants