[EXPERIMENTAL]sched/hrtimer: Part 2: refine hrtimer state machine and introduce scheduler support with hrtimer #17573
@wangchdo please rebase with current mainline to fix the CI error
We may want to mark this feature with
Hi @cederom, the [EXPERIMENTAL] tag has been added, please double check.
Big thank you @wangchdo !! :-)
Allow a running/armed hrtimer to be restarted to fix hrtimer bug: apache#17567
Signed-off-by: Chengdong Wang <wangchengdong@lixiang.com>
Add hrtimer_set() to allow users to change the hrtimer callback after the hrtimer is started or cancelled
Signed-off-by: Chengdong Wang <wangchengdong@lixiang.com>
Update the hrtimer documentation to describe the hrtimer state machine,
which is introduced to handle safe cancellation and execution in SMP
environments.
Signed-off-by: Chengdong Wang <wangchengdong@lixiang.com>
Enable the timer start functions when hrtimer is enabled. This allows hrtimer to set timer expirations with nanosecond resolution.
Signed-off-by: Chengdong Wang <wangchengdong@lixiang.com>
This commit adds hrtimer support to the scheduler
tick without altering the existing scheduler behavior.
Signed-off-by: Chengdong Wang <wangchengdong@lixiang.com>
When hrtimer is enabled, the tickless scheduler should call
nxsched_hrtimer_start to start the timer, because
the tick system is supported by hrtimer.
Signed-off-by: Chengdong Wang <wangchengdong@lixiang.com>
@cederom CI passed, could you please double check?
@xiaoxiang781216 @Fix-Point I understand that your main concern with my previous implementation was about potential concurrency issues, or a possible violation of ownership invariants under SMP. This PR resolves those concerns using an approach that I believe is reliable, simple, and efficient. Below, I’ll explain the fix from a code-level perspective:
4 In
Overall, this design ensures correct behavior under SMP while keeping the logic straightforward and efficient, and it integrates cleanly with the existing hrtimer model in NuttX.
@Fix-Point @xiaoxiang781216 @GUIDINGLI As I pointed out in the comments on your PR (#17675), I don’t think the new hrtimer implementation is a better choice. From my perspective, the way it addresses concurrency issues is not efficient, and it requires updating the entire hrtimer API surface, which I believe makes it less user-friendly compared to my approach. My API design is as follows:
My implementation focuses on resolving the concurrency concerns without forcing API-wide changes, keeping the usage model simple while still ensuring correctness under SMP.
  /* Re-arm periodic timer if not canceled or re-armed concurrently */

  if (period > 0 && hrtimer->expired == expired)
Your latest attempt is almost the same as the versioning idea I tried before. Sadly, I found that it still violates the ownership invariant. The expired field cannot be used as a version because it is not monotonic (a newer timer may have the same or an earlier expired value than an older one), and monotonicity is a fundamental assumption behind the correctness of epoch-based memory reclamation. I believe it would be very easy for you to write a test case that triggers the ownership invariant violation.
In my early implementation, I added a separate monotonic version field to the hrtimer to do correct versioning (or epoch-based memory reclamation). However, that increases the memory footprint of the hrtimer, which is why I eventually gave up on the idea.
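The non-monotonicity argument above can be illustrated with a minimal sketch. Everything here is hypothetical: `struct demo_timer`, `demo_restart()` and the two `demo_unchanged_*` helpers are simplified stand-ins invented for illustration and do not reflect the actual NuttX hrtimer structure or API. The sketch only shows why an equality check on an expiry value can miss a concurrent restart, while a counter that is bumped on every restart does not (ignoring wrap-around).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified timer; NOT the NuttX hrtimer layout. */

struct demo_timer
{
  uint64_t expired;  /* Absolute expiry; may repeat or move backwards */
  uint32_t version;  /* Bumped on every restart (assumed extra field) */
};

/* Models a concurrent restart of the timer with a new expiry. */

static void demo_restart(struct demo_timer *t, uint64_t expired)
{
  t->expired = expired;
  t->version++;
}

/* "Was the timer left untouched?" judged by the expiry value only. */

static bool demo_unchanged_by_expiry(const struct demo_timer *t,
                                     uint64_t snapshot)
{
  return t->expired == snapshot;
}

/* The same question, judged by the restart counter. */

static bool demo_unchanged_by_version(const struct demo_timer *t,
                                      uint32_t snapshot)
{
  return t->version == snapshot;
}

/* Scenario: the timer is restarted with the SAME expiry value while
 * its callback is assumed to be running on another CPU.
 */

static void demo_same_expiry_scenario(void)
{
  struct demo_timer t = { 1000, 0 };
  uint64_t expired_snap = t.expired;
  uint32_t version_snap = t.version;

  demo_restart(&t, 1000);

  assert(demo_unchanged_by_expiry(&t, expired_snap));    /* Restart missed */
  assert(!demo_unchanged_by_version(&t, version_snap));  /* Restart seen */
}
```

The trade-off mentioned in the comment is visible in the structure itself: the counter costs an additional field in every timer object.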
I still stand by what I stated before:
Designing correct concurrent algorithms requires systematic consideration. As I stated in #17570, after carefully reviewing your implementation, I could not really find a good solution that preserves version information without introducing significant performance or memory overhead.
> Your latest attempt is almost the same as the versioning idea I tried before. Sadly, I found that it still violates the ownership invariant. The expired field cannot be used as a version because it is not monotonic (a newer timer may have the same or an earlier expired value than an older one), and monotonicity is a fundamental assumption behind the correctness of epoch-based memory reclamation. I believe it would be very easy for you to write a test case that triggers the ownership invariant violation.
> In my early implementation, I added a separate monotonic version field to the hrtimer to do correct versioning (or epoch-based memory reclamation). However, that increases the memory footprint of the hrtimer, which is why I eventually gave up on the idea.
Can you describe the incorrect scenario using the APIs I provided? Is it that the new timer has the same expired value? I think the only such case is when the user changes the callback concurrently with hrtimer_set but keeps the expired value the same, without calling hrtimer_start.
I have already thought about this, and I think it is a design choice:
- If you consider this a violation of the ownership invariant, it can easily be fixed by adding a check here for whether func has changed.
- If you do not consider it a violation, since the new timer will eventually be executed and the period is updated, then no check is needed here, as in the current implementation. If the user does not want this to happen, they can call hrtimer_start to update the expired value immediately after calling hrtimer_set to update the callback.
Finally, if the user changes neither the callback nor the expired value, we should consider the timer unchanged.
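The first option in the list above (also checking whether the callback changed) might look roughly like the following. This is a sketch under assumptions, not the code in this PR: `demo_cb_t`, `struct demo_hrtimer` and `demo_should_rearm()` are hypothetical names, and the callback signature (returning the next period, 0 for one-shot) is assumed for illustration only. Only the condition `period > 0 && hrtimer->expired == expired` comes from the reviewed snippet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback type: returns the next period, 0 for one-shot. */

typedef uint64_t (*demo_cb_t)(void *arg);

/* Simplified stand-in for the timer object, NOT the NuttX structure. */

struct demo_hrtimer
{
  demo_cb_t func;     /* Callback, changed by an hrtimer_set()-like call */
  uint64_t  expired;  /* Expiry, changed by an hrtimer_start()-like call */
};

/* Decide whether to re-arm after the callback returned `period`.
 * `func_snap` and `expired_snap` are assumed to have been sampled
 * before the callback was invoked.
 */

static bool demo_should_rearm(const struct demo_hrtimer *t,
                              demo_cb_t func_snap,
                              uint64_t expired_snap,
                              uint64_t period)
{
  return period > 0 &&
         t->expired == expired_snap &&  /* Check from the reviewed snippet */
         t->func == func_snap;          /* Extra check proposed above */
}

/* Two dummy callbacks used only to obtain distinct function pointers. */

static uint64_t demo_cb_a(void *arg)
{
  (void)arg;
  return 1000;
}

static uint64_t demo_cb_b(void *arg)
{
  (void)arg;
  return 2000;
}
```

With this check, a concurrent callback change alone is enough to suppress the re-arm. A restart that keeps both the callback and the expiry value identical is still treated as "unchanged", which matches the position stated in the last sentence of the comment.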
> I still stand by what I stated before:
> Designing correct concurrent algorithms requires systematic consideration. As I stated in #17570, after carefully reviewing your implementation, I could not really find a good solution that preserves version information without introducing significant performance or memory overhead.
In my opinion, your new implementation is not appropriate, for three reasons:
- Your way of protecting the timer from concurrent updates is not efficient.
- You completely refactored the existing, more user-friendly API design, which I do not think is correct.
- Your queue abstraction should be based on the existing hrtimer design. In fact, had you not provided your queue abstraction, I already planned to add mine. Also, and maybe I am wrong, NuttX is an RTOS, not an OS like Linux, yet your queue abstraction has a Linux style.
Finally, I also want to describe this systematic consideration of concurrency in what I think is an easier way to understand:
To fix a concurrency issue, you only need to identify all the data that may be misused in concurrent cases and then find a way to protect it. This is what I consider systematic thinking about concurrency issues.
Thank you @wangchdo! This change builds fine now. @Fix-Point provides some important feedback, as he seems to be working on this already. I like @wangchdo's approach of keeping things compatible and aligned with the existing API. Maybe additional checks / protections could be implemented to avoid the situations described by @Fix-Point? :-) @Fix-Point, can you please provide the exact test scenario steps to verify the problems you mentioned in @wangchdo's solution? This should confirm whether the implementation requires some additional protections. :-)







Summary
This PR introduces high-resolution timer (hrtimer) support for the scheduler as a fully independent and optional module, without affecting existing scheduler behavior.
Hrtimer is strictly isolated from the current scheduling logic:
The module does not modify any scheduler data structures or timing paths.
Hrtimer acts solely as an alternative time source. Core scheduler functions (nxsched_process_tick(), nxsched_tick_expiration(), etc.) remain unchanged and are reused as-is.
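As a rough illustration of "alternative time source, scheduler functions reused as-is", the sketch below shows a timer expiry callback that does nothing except forward to an unchanged tick handler. It is a mock under stated assumptions: `demo_process_tick()` merely stands in for an existing entry point such as nxsched_process_tick(), while `demo_tick_expiry()`, its signature, and `DEMO_NSEC_PER_TICK` are hypothetical and not taken from this PR.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed tick length of 10 ms, expressed in nanoseconds. */

#define DEMO_NSEC_PER_TICK 10000000ull

/* Counts how often the mocked scheduler tick handler was invoked. */

static unsigned int g_demo_ticks;

/* Stand-in for an existing, unmodified scheduler entry point
 * (e.g. nxsched_process_tick()); here it only counts calls.
 */

static void demo_process_tick(void)
{
  g_demo_ticks++;
}

/* Hypothetical hrtimer expiry callback. It contains no scheduling
 * logic of its own: it forwards to the existing tick handler and
 * returns the period after which it wants to fire again.
 */

static uint64_t demo_tick_expiry(void *arg)
{
  (void)arg;
  demo_process_tick();
  return DEMO_NSEC_PER_TICK;
}
```

In this arrangement, swapping the time source only changes who calls the tick handler, not what the handler does.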
Additional safeguards:
Integration benefit
This design enables incremental development and review of hrtimer while ensuring that existing NuttX scheduling behavior remains stable even if the hrtimer feature is explicitly enabled.
Development benefit
With this design, developers interested in optimizing the scheduler and those focused on optimizing hrtimer can work independently on their respective improvements.
One other key update
This PR also includes an improvement to hrtimer (also in a separate PR, #17570) that refines its state machine to fix some SMP issues found by @Fix-Point. The refined state machine is shown below, and the corresponding diagram is also added to the hrtimer documentation.
Impact
Add hrtimer support to the NuttX scheduler without altering the existing scheduler behavior.
Testing
Test 1 passed (integrated in ostest):
- test implementation
- test log on rv-virt:smp64
Test 2 passed (provided by @Fix-Point):
- test implementation
- test passed log on rv-virt:smp64
Test 3 passed (provided by @Fix-Point):
- test implementation
- test passed log on rv-virt:smp64