feat(ocap-kernel): add resource limits for remote communications #714
base: main
@cursor review
- Add connection limit (default 100 concurrent connections)
- Add message size limit (default 1MB per message)
- Add stale peer cleanup (removes data for peers disconnected >1 hour)
- Make all limits configurable via RemoteCommsOptions
- Add ResourceLimitError for limit violations
- Add comprehensive tests for all resource limits

This prevents memory exhaustion and manages system resources by:
- Rejecting new connections when limit is reached
- Rejecting messages exceeding size limit
- Periodically cleaning up stale peer data
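The three limits described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `PeerLimits` class and its method names are hypothetical, the 1MB default is assumed to mean 1024 * 1024 bytes, and the periodic timer is left out so that cleanup is an explicit call taking the current time. Only `RemoteCommsOptions`, `ResourceLimitError`, the `RESOURCE_LIMIT_ERROR` code, and the option names come from the PR text.

```typescript
// Option names are taken from the PR summary; everything else is illustrative.
type RemoteCommsOptions = {
  maxConcurrentConnections?: number;
  maxMessageSizeBytes?: number;
  stalePeerTimeoutMs?: number;
};

class ResourceLimitError extends Error {
  readonly code = 'RESOURCE_LIMIT_ERROR';

  constructor(message: string) {
    super(message);
    this.name = 'ResourceLimitError';
  }
}

// Hypothetical helper that tracks peers and enforces the configured limits.
class PeerLimits {
  private readonly maxConnections: number;
  private readonly maxMessageSize: number;
  private readonly staleTimeoutMs: number;
  private readonly connected = new Set<string>();
  private readonly disconnectedAt = new Map<string, number>();

  constructor(options: RemoteCommsOptions = {}) {
    this.maxConnections = options.maxConcurrentConnections ?? 100;
    this.maxMessageSize = options.maxMessageSizeBytes ?? 1024 * 1024; // assumed meaning of "1MB"
    this.staleTimeoutMs = options.stalePeerTimeoutMs ?? 60 * 60 * 1000; // 1 hour
  }

  // Reject a new peer once the concurrent connection limit is reached.
  connect(peerId: string): void {
    if (!this.connected.has(peerId) && this.connected.size >= this.maxConnections) {
      throw new ResourceLimitError(`connection limit of ${this.maxConnections} reached`);
    }
    this.connected.add(peerId);
    this.disconnectedAt.delete(peerId);
  }

  // Record when a peer went away so it can later be treated as stale.
  disconnect(peerId: string, now: number): void {
    if (this.connected.delete(peerId)) {
      this.disconnectedAt.set(peerId, now);
    }
  }

  // Reject a message whose encoded size exceeds the limit.
  checkMessage(message: Uint8Array): void {
    if (message.byteLength > this.maxMessageSize) {
      throw new ResourceLimitError(
        `message of ${message.byteLength} bytes exceeds limit of ${this.maxMessageSize}`,
      );
    }
  }

  // Drop bookkeeping for peers disconnected longer than the timeout.
  cleanupStalePeers(now: number): string[] {
    const removed: string[] = [];
    this.disconnectedAt.forEach((at, peerId) => {
      if (now - at > this.staleTimeoutMs) {
        this.disconnectedAt.delete(peerId);
        removed.push(peerId);
      }
    });
    return removed;
  }
}
```

In a real integration, `cleanupStalePeers` would presumably be driven by a timer at the configured cleanup interval; passing `now` explicitly keeps this sketch deterministic and easy to test.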
FUDCo
left a comment
This looks fine for what it is.
One lingering question I have is that as we get more careful about detecting error conditions (or proclaiming them, in the case of configurable resource limits), are we potentially setting ourselves up for situations where an error bubbles up to user code at some location other than the location that is actually responsible for causing it? In other words, will a message transmission error always find its way back to the actual send operation that triggered it?
More generally, could user code find itself in an unrecoverable error state? By that I don't mean a state where there's an error that you can't get rid of (things can always break unfixably, e.g., a remote host dies forever), but rather a state where you don't actually know you're stuck. It's entirely plausible to me that everything is fine, but I can't tell from reading the tests whether they give us reason to believe we are OK on this score.
@FUDCo Two parts to your question:
Error attribution: Yeah, transmission errors are always caught within the specific
Stuck without knowing: Currently
Closes #660
Note
Introduces resource enforcement in remote comms and a dedicated error type, with robust reconnection/race handling and cleanup.
- network.ts; rejects excess with ResourceLimitError; all limits configurable via RemoteCommsOptions (maxConcurrentConnections, maxMessageSizeBytes, cleanupIntervalMs, stalePeerTimeoutMs).
- ConnectionFactory.closeChannel to explicitly close/abort underlying streams and uses it for rejected/replaced channels.
- ResourceLimitError (code RESOURCE_LIMIT_ERROR) with marshal/unmarshal validation and exports.
- ocap-kernel.

Written by Cursor Bugbot for commit 71fe98b.
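To illustrate what "marshal/unmarshal validation" for the new error type might look like, here is a hedged sketch. The wire shape (`code` plus `message`) and the function names `marshalResourceLimitError` and `unmarshalResourceLimitError` are assumptions made for this example; the actual marshaling format in ocap-kernel is not shown in this PR page.

```typescript
const RESOURCE_LIMIT_ERROR = 'RESOURCE_LIMIT_ERROR';

// Assumed wire shape for this sketch; the real format may carry more fields.
type MarshaledResourceLimitError = {
  code: typeof RESOURCE_LIMIT_ERROR;
  message: string;
};

class ResourceLimitError extends Error {
  readonly code = RESOURCE_LIMIT_ERROR;

  constructor(message: string) {
    super(message);
    this.name = 'ResourceLimitError';
  }
}

// Convert the error into a plain object that survives JSON serialization.
function marshalResourceLimitError(
  error: ResourceLimitError,
): MarshaledResourceLimitError {
  return { code: RESOURCE_LIMIT_ERROR, message: error.message };
}

// Validate an untrusted value before rebuilding the error from it.
function unmarshalResourceLimitError(value: unknown): ResourceLimitError {
  if (typeof value !== 'object' || value === null) {
    throw new TypeError('marshaled error must be an object');
  }
  const record = value as Record<string, unknown>;
  if (record.code !== RESOURCE_LIMIT_ERROR) {
    throw new TypeError('unexpected error code');
  }
  if (typeof record.message !== 'string') {
    throw new TypeError('marshaled error must have a string message');
  }
  return new ResourceLimitError(record.message);
}
```

Validating on the receiving side matters here because the marshaled value may come from a remote peer, so its shape cannot be trusted.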