-
Notifications
You must be signed in to change notification settings - Fork 4.1k
Fix span lifecycle with smart pointers to prevent use-after-free in a… #3140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
When the first follower responds, the parent span is prematurely destroyed along with all child spans |
|
Yes. For example, in the Raft protocol, there is one leader and two followers. The leader only needs to wait for the first follower to complete before returning. |
src/bthread/task_meta.h
Outdated
| KeyTable* keytable; | ||
| void* assigned_data; | ||
| void* rpcz_parent_span; | ||
| std::weak_ptr<brpc::Span> rpcz_parent_span; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you 'd better not use brpc types in the bthread code.
In the brpc design, bthread is a basic library, and brpc is built on top of it. So bthread should no depend on brpc. This is why the original type is void*, not brpc::Span*.
My suggestion: you can keep void* rpcz_parent_span here, and in the brpc code, you can new a std::weak_ptr<brpc::Span>, then cast the pointer std::weak_ptr<brpc::Span>* to void*. This may be more complicated, but keep the brpc design clean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
@lh2debug Thank you for your contribution! This solve a very important issue. You have done a good job! Currently I have two concerns:
|
|
Thank you for your suggestion. I will consider the above optimizations. @wwbmmm |
|
maybe we can add a reference counter in span, There may be much less code modification. |
|
Thank you for your answer. This approach of implementing reference counting internally within a Span seems impractical. Because reference Counting with void* rpcz_parent_span Fails Across Bthreads: Smart pointers solve this via control blocks that survive object destruction, allowing safe lifetime detection across threads. |
|
Root Cause: The span information is a tree structure, and this relationship is maintained through the ref information. For example, ParentSpan initially has ref=1, and two ClientSpans are created, so ref becomes 3. When one ClientSpan returns, ref becomes 2; when the other ClientSpan returns, ref becomes 1. At this point, it can be destructed. Because a single call might be split into a tree, the bthreads generated by this call should follow the lifecycle principle of this call chain, which should be fine. |
|
I believe that spans are suitable for requesting related information and should not be more closely associated with threads or coroutines; they simply use thread-local storage to pass variables in their implementation. |
Let me complement, the first reference to the ParentSpan is held by tls, when the bthread exit, ref_count will -1, when the ref_count become 0 (only after all ClientSpan return and the bthread exit) , it can be destructed. Moreover, when a new bthread with BTHREAD_INHERIT_SPAN is created, the ref_count will +1; when this bthread exit, the ref_count will -1. |
a) It's more complex in principle and implementation than the smart pointer method. Imagine a Raft scenario: one server span creates two bthreads. Each bthread context initiates an RPC (i.e., creates a client span). That is, the server span is the parent of the bthread span, and the bthread span is the parent of the client span. In this case, manually adding references/releasing references to the server span, bthread span, and client span is much more complex than the smart pointer method. b) It's not flexible enough for business trace collection. For example, if the business wants to collect some trace information in a background thread outside the brpc context, the smart pointer method simply passes the |
|
If the implementation follows the |
We can add some notes in the document and code comment. I think that's OK. |
83d573d to
629d247
Compare
Under this implementation logic, the span function still only counts the time taken for the first response, right? |
Yes, in this scenario, when the server span is committed, only the first client span (i.e., only the terminated child span) will be counted; the second ongoing client span will not be counted. When the server span is destroyed, the reference counts of the first and second client spans will be canceled. Shortly afterward, the reference counts of these two client spans will be reduced to 0, and they will be garbage collected. |
55b628d to
817e972
Compare
|
@wwbmmm @yanglimingcn The code for bthread that doesn't depend on the brpc module (i.e., rpcz_parent_span remains void*) has been modified and tested. Please take a look, tkx~ |
…sync RPC callbacks (apache#3068)
67b67d0 to
241e6f3
Compare
| os, span.sent_real_us(), | ||
| &last_time, extr, ARRAY_SIZE(extr))) { | ||
| os << " Responded(" << span.response_size() << ')' << std::endl; | ||
| &last_time, extr, ARRAY_SIZE(extr), &span)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please post some of the display information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As shown below, the type of span (server span, client span, bthread span) is indicated at the beginning to facilitate troubleshooting.
Fetching new trace TRACE#B from HOST#X
Received request REQ#65674 from HOST#Y via PROTO#1 log=0 trace=TRACE#B span=SPAN#S1
[ServerSpan span=SPAN#S1] Processing the request in a new worker
[ServerSpan span=SPAN#S1] ParseStats[cut=?, queue=?, worker=?, rpc=?]
[ServerSpan span=SPAN#S1] Enter Ecom.OrderService.PlaceOrder
[ClientSpan span=SPAN#C1] Cart.AddItem item=ITEM# qty=QTY# price=PRICE# user=USER# session=SESSION#
[ClientSpan span=SPAN#C1] Cart.VerifyOperation item=ITEM# op=OP# user=USER# session=SESSION#
[ClientSpan span=SPAN#C1] Cart.ProposeUpdate item=ITEM# op=OP# user=USER# session=SESSION# attempt=SN#
[ClientSpan span=SPAN#C1] Orchestration.Manager: enqueue step=PlaceOrder first=I# last=I#
[ClientSpan span=SPAN#C1] Orchestration.Queue: pending step
Requesting Ecom.Consensus.Append@HOST#Z PROTO#1 call=CALL# trace=TRACE#B span=SPAN#C1
[ClientSpan span=SPAN#C1] Requested(REQ#65778) [1]
[ClientSpan span=SPAN#C1] Received response(RSP#) of request[1]
[ClientSpan span=SPAN#C1] Processing the response in a new worker
[ClientSpan span=SPAN#C1] Enter client callback
[ClientSpan span=SPAN#C1] Workflow.OnApply, ongoing tasks: 0
[ClientSpan span=SPAN#C1] TaskQueue status running=R# queues=Q# active=A#
[ClientSpan span=SPAN#C1] OrderPipeline.WriteOrder
[ClientSpan span=SPAN#C1] Payment.Authorize
[ClientSpan span=SPAN#C1] Inventory.Reserve
[ClientSpan span=SPAN#C1] Promotion.Apply
[ClientSpan span=SPAN#C1] Promotion.Apply_done
[ClientSpan span=SPAN#C1] Notification.Prepare
[ClientSpan span=SPAN#C1] Notification.Prepare_done
[ClientSpan span=SPAN#C1] Inventory.Reserve_done
[ClientSpan span=SPAN#C1] Payment.Authorize_done
[ClientSpan span=SPAN#C1] OrderPipeline.Sync
[ClientSpan span=SPAN#C1] OrderPipeline.WriteOrder done (~2.7ms)
[ClientSpan span=SPAN#C1] Latency stats: avg≈58us p90≈66us p99≈160us
[ClientSpan span=SPAN#C1] Protocol response before join
[ClientSpan span=SPAN#C1] Protocol response after join
[ClientSpan span=SPAN#C1] FulfillmentWorker: ongoing tasks=3
[ClientSpan span=SPAN#C1] FulfillmentWorker: global queue running=R# queues=Q# active=A#
[ClientSpan span=SPAN#C1] FulfillmentWorker: enter
[ClientSpan span=SPAN#C1] Batcher.append
[ClientSpan span=SPAN#C1] Batcher.flush queue_wait≈~6.7ms
Requesting Ecom.Consensus.Append@HOST#Y PROTO#1 call=CALL# trace=TRACE#B span=SPAN#C2
[ClientSpan span=SPAN#C2] Requested(REQ#65778) [1]
[ClientSpan span=SPAN#C2] Received response(RSP#) of request[1]
[ClientSpan span=SPAN#C2] Processing the response in a new worker
[ClientSpan span=SPAN#C2] Enter client callback
[ServerSpan span=SPAN#S1] Leave Ecom.OrderService.PlaceOrder
[ServerSpan span=SPAN#S1] Responded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This format is a little bit tedious. How about [Server SPAN#S1] Processing the request in a new worker?
| #if BRPC_SPAN_ENABLE_SHARED_PTR_API | ||
| return span_internal; | ||
| #else | ||
| return span_internal.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return Span* not hold a reference, This will cause problems, won't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the server span waits for the child span to complete before returning (as is the case in most business scenarios), returning a raw pointer won't cause any problems.
For example, in a Raft scenario, if the server span returns before the second follower completes (causing the second follower's client span to be destroyed), the second follower's RPC callback will retrieve the destroyed dangling pointer through ControllerPrivateAccessor::span.
| RpcPBMessages* messages, const Server* server, | ||
| MethodStatus* method_status, int64_t received_us) { | ||
| MethodStatus* method_status, int64_t received_us, | ||
| std::shared_ptr<Span> span) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why SendRpcResponse need span parameter, It can be obtained from the accessor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SendRpcResponse needs to hold a strong reference count; otherwise, if the backend spancontainer releases the reference, the server span will be unable to set_start_send_us and set_response_size.
src/brpc/span.h
Outdated
|
|
||
| class SpanContainer : public bvar::Collected { | ||
| public: | ||
| explicit SpanContainer(std::shared_ptr<Span> span) : _span(span) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const std::shared_ptr& span ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll change this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| delete kt; | ||
| // After deletion: tls may be set during deletion. | ||
| tls_bls.keytable = NULL; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the relationship between rpcz_parent_span and key.cpp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rpcz_parent_span is a void* protocol field (type controlled by brpc) in bthread TLS that points to weak_ptr<Span>.
key.cpp is responsible for its lifecycle triggering points (creation/destruction/end) and is decoupled from the specific implementation of brpc (span.cpp) through a global callback.
| } | ||
|
|
||
| int bthread_set_span_funcs(bthread_create_span_fn create_fn, | ||
| bthread_destroy_span_fn destroy_fn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be more reasonable to put this function in bthread.cpp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your suggestion. Personally, I think it might be more appropriate to manage it under key.cpp, because span is a brpc layer concept, not a bthread concept. bthread.cpp should focus on the bthread lifecycle. key.cpp is responsible for span's lifecycle triggering points (creation/destruction/end) and is decoupled from the specific implementation of brpc (span.cpp) through a global callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think key.cpp is responsible for span's lifecycle triggering points.
key.cpp is for the implementation of bthread_get/setspecific, it manage keytable_pool and keytable, which has no relationship to span.
What you need is to call the span functions in the bthread lifecycle (create/end/...), the better place is in the task_group.cpp, you can get/set bthread::tls_bls.rpcz_parent_span there.
And the implementation of bthread_set_span_funcs is better placed into bthread.cpp, because most bthread interface functions are there.
| if (!span) { | ||
| return NULL; | ||
| } | ||
| return new std::weak_ptr<Span>(span); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is this newly created weak_ptr deleted? This part seems a bit obscure, and it's easy for memory leaks to occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My earliest version used the weak_ptr type for rpcz_parent_span within LocalStorage because your team offered a very reasonable optimization reason: bthreads shouldn't depend on brpc. Therefore, rpcz_parent_span needed to retain its original void* type. The brpc layer provides manual new/delete management of weak_ptr for the bthread layer to register and use.
Therefore, during the bthread's lifecycle, the weak_ptr lifecycle of rpcz_parent_span needs to be manually managed using new/delete (i.e., new when the bthread is created, and delete when the bthread ends).
@chenBright What do you think of these two approaches ?
|
… lifecycle management (apache#3068)
241e6f3 to
c5a8dee
Compare
Although we used approache 1 before, I now prefer approache 2 because it offers better maintainability and extensibility after being refactored using smart pointers. |
| os, span.sent_real_us(), | ||
| &last_time, extr, ARRAY_SIZE(extr))) { | ||
| os << " Responded(" << span.response_size() << ')' << std::endl; | ||
| &last_time, extr, ARRAY_SIZE(extr), &span)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This format is a little bit tedious. How about [Server SPAN#S1] Processing the request in a new worker?
| } | ||
| return 0; | ||
| } | ||
| uint64_t Controller::span_id() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a new line
| #if BRPC_SPAN_ENABLE_SHARED_PTR_API | ||
| accessor.set_span(span); | ||
| #else | ||
| accessor.set_span(span.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can add an overloaded function set_span(std::shared_ptr<Span> span),
so that you don't need to use BRPC_SPAN_ENABLE_SHARED_PTR_API everywhere.
| if (ls.rpcz_parent_span) { | ||
| delete static_cast<std::weak_ptr<Span>*>(ls.rpcz_parent_span); | ||
| } | ||
| ls.rpcz_parent_span = new std::weak_ptr<Span>(span); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why you need to delete and new the weak_ptr, can you directly reset the weak_ptr like *ls.rpcz_parent_span = span?
| using namespace bthread; | ||
| LocalStorage ls = BAIDU_GET_VOLATILE_THREAD_LOCAL(tls_bls); | ||
| if (ls.rpcz_parent_span) { | ||
| delete static_cast<std::weak_ptr<Span>*>(ls.rpcz_parent_span); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you don't need to delete here, just ls.rpcz_parent_span->reset()
| } | ||
|
|
||
| int bthread_set_span_funcs(bthread_create_span_fn create_fn, | ||
| bthread_destroy_span_fn destroy_fn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think key.cpp is responsible for span's lifecycle triggering points.
key.cpp is for the implementation of bthread_get/setspecific, it manage keytable_pool and keytable, which has no relationship to span.
What you need is to call the span functions in the bthread lifecycle (create/end/...), the better place is in the task_group.cpp, you can get/set bthread::tls_bls.rpcz_parent_span there.
And the implementation of bthread_set_span_funcs is better placed into bthread.cpp, because most bthread interface functions are there.
What problem does this PR solve?
Issue Number: resolve #3068
Problem Summary:
Span lifecycle management defect in distributed storage system:
Problem Scenario:
Root Cause:
a. Premature deallocation: Parent span destroyed while child spans still in use
b. Dangling pointer: Response callback accesses freed span objects
Reference counting relationship
graph LR subgraph "Parent Span (WriteChunk)" PS["Parent Span<br/>ref_count = 2<br/>trace_id: 12345<br/>span_id: 67890"] end subgraph "强引用持有者 (shared_ptr)" RD["RPC Done Callback<br/>SendRpcResponse<br/>shared_ptr<Span>"] SC["SpanContainer<br/>后台collector<br/>shared_ptr<Span>"] end subgraph "弱引用持有者 (weak_ptr)" PC["Controller<br/>_span<br/>weak_ptr<Span>"] CS1["Child Span 1<br/>_local_parent<br/>weak_ptr<Span>"] CS2["Child Span 2<br/>_local_parent<br/>weak_ptr<Span>"] end subgraph "Child Spans" CS1_DETAIL["Child Span 1<br/>ref_count = 1<br/>append_entries to Follower1<br/>trace_id: 12345<br/>span_id: 67891"] CS2_DETAIL["Child Span 2<br/>ref_count = 1<br/>append_entries to Follower2<br/>trace_id: 12345<br/>span_id: 67892"] end %% Parent Span的强引用 RD -->|"+1 ref_count"| PS SC -->|"+1 ref_count"| PS %% Parent Span的弱引用 PC -.->|"不增加ref_count"| PS CS1 -.->|"不增加ref_count"| PS CS2 -.->|"不增加ref_count"| PS %% Parent Span持有Child Spans PS -->|"_client_list<br/>shared_ptr<br/>+1 ref_count"| CS1_DETAIL PS -->|"_client_list<br/>shared_ptr<br/>+1 ref_count"| CS2_DETAIL %% Child Spans的弱引用 CS1_DETAIL -.->|"_local_parent<br/>weak_ptr"| PS CS2_DETAIL -.->|"_local_parent<br/>weak_ptr"| PS %% Controller对Child Spans的弱引用 C1["Controller 1<br/>weak_ptr"] -.-> CS1_DETAIL C2["Controller 2<br/>weak_ptr"] -.-> CS2_DETAIL %% 样式 classDef parentSpan fill:#e1f5fe,stroke:#01579b,stroke-width:4px classDef childSpan fill:#f3e5f5,stroke:#4a148c,stroke-width:3px classDef strongRef fill:#e8f5e8,stroke:#2e7d32,stroke-width:3px classDef weakRef fill:#fff3e0,stroke:#e65100,stroke-width:2px class PS parentSpan class CS1_DETAIL,CS2_DETAIL childSpan class RD,SC strongRef class PC,C1,C2 weakRefWhat is changed and the side effects?
Changed:
Side effects: NO
Performance effects: NO
Breaking backward compatibility: NO
Check List: