-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Revert "sched/group: move task group into task_tcb_s to improve perfo… #17564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…rmance" This reverts commit 29e50ff. reason: Placing the main thread and the gourd in the same memory block, and allocating and freeing memory simultaneously, presents the following two problems: When the main thread creates a child thread and performs a detach operation, there is a possibility that the main thread may have exited, but the main thread's TCB (Transaction Control Block) may not have been released. This could potentially cause the main thread's TCB to be double-freed. The core contradiction in this problem lies in binding the main thread's TCB (Trust Container Registry) and the group together. When releasing the main thread's TCB, an additional check is needed to ensure the main thread was the last to leave the group. If this check and the free operation are atomically guaranteed, the logic is sound, and double freeing won't occur. However, this atomicity cannot be completely guaranteed. If other free operations cause a block, problems still arise. Signed-off-by: wangzhi16 <wangzhi16@xiaomi.com>
|
please hold this PR I have a better solution to this issue, which avoids fragmented allocation of group requests while also ensuring the release of TCBs on the main thread. |
|
@wangzhi16 please review this change: |
@anchao Hi, there may be used after free. I understand that your patch modification still binds the group and task tcb together, but the free action only appears in group leave, unlike before when it also appeared in release_tcb.
However, this might cause used after free, because group_leave is also called in the file nuttx/sched/task/task_exithook.c:454, and the tcb will still be operated on after that, even though you have already released the tcb. Moreover, this patch does not solve the first problem I mentioned.
By the way, if the main thread neither executes please check again, thanks. |
|
@wangzhi16 Thanks for your excellent analysis, I'll get back to you tonight. Regarding the resource leak issue you mentioned this morning, it seems to be a bug in the message queue's reference counter maintenance. I will resolve it tonight. |
|
@anchao do you have a new fix? The proposal anchao@68ce1b8 seems a little bit awkward using tcb and group offset operations, maybe you can create are struct type where the first member is a "struct task_group_s" and the second is a "struct tcb_s", then you do a casting and point to the right struct. Please don't take it as a critique, this is just a suggestion to make the code more readable. Should we wait for a proper fix instead of merging this PR here? |
It's wrong to allocate tcb_s and task_group_s in one kmm_malloc from design point since each has the different lile cycle, this type of optimization:
it's better to revert the optimization directly. |
|
|

This reverts commit 29e50ff.
Summary
Placing the main thread and the gourd in the same memory block, and allocating and freeing memory simultaneously, presents the following two problems:
When the main thread creates a child thread and performs a detach operation, there is a possibility that the main thread may have exited, but the main thread's TCB may not have been released.
This could potentially cause the main thread's TCB to be double-freed. The core contradiction in this problem lies in binding the main thread's TCB and the group together. When releasing the main thread's TCB, an additional check is needed to ensure the main thread was the last to leave the group. If this check and the
freeoperation are atomically guaranteed, the logic is sound, and double freeing won't occur. However, this atomicity cannot be completely guaranteed. If otherfreeoperations cause a block, problems still arise. For example:Impact
None
Testing
here are two patches that can reproduce the problem mentioned above.
master...wangzhi16:nuttx:master
apache/nuttx-apps@master...wangzhi16:nuttx-apps:dev
Since this is an intermittent issue, blocking during the free process will free up the CPU. Therefore, we use usleep to create a scenario that is guaranteed to occur.
Before revert:
Configuration that needs to be opened:
test on qemu-armv7a:
build:

tools/configure.sh qemu-armv7a:nsh; make -j16result:
The crash mentioned above was caused by the execution flow described below.
after revert:
test on qemu-armv7a:
build:

tools/configure.sh qemu-armv7a:nsh; make -j16result: