-
Notifications
You must be signed in to change notification settings - Fork 0
fix: address code review findings for performance and quality #14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CHANGES: - ServiceRegistry: Remove unused methods (register_factory, has, remove, reset_factories, get_registered_types) and unnecessary class variables - ServiceRegistry: Reduce from 235 lines to 120 lines (-48%) - Simplify CaptureService.get_default_service() to remove factory abstraction - Update SyncService.get_sync_service() to use simplified registry API RATIONALE: ServiceRegistry had premature optimization with unused factory registration pattern and close handlers that were never invoked. Kept the three methods that are actually used: get(), register(), and reset(). This improves clarity and maintainability while preserving all functionality. TESTING: - All 157 core service tests pass (capture, recall, sync) - Full test suite: 715/716 pass (1 pre-existing unrelated failure) - No behavior changes, functionality fully preserved
Performance improvements (PERF-001 through PERF-007): - Add show_notes_batch() for batch git notes retrieval using git cat-file --batch - Add embed_batch() for batch embedding generation - Add prewarm() method for eager model loading - Add @lru_cache for struct format caching in index.py - Implement batch hydration in recall.py using show_notes_batch() - Use batch operations in sync.py for index synchronization Quality improvements (QUAL-001, QUAL-002): - Rename _read_input to _read_input_with_fallback for clarity - Replace broad Exception catches with specific (MemoryIndexError, OSError) Security improvements (SEC-001, SEC-002): - Add input length limit in signal_detector.py for ReDoS prevention Test updates: - Update test fixtures to mock batch operations - Update test assertions for renamed functions and specific exceptions All 1806 tests passing with 89.29% coverage. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ST-001, TEST-002) New test coverage: - tests/test_hook_utils.py: 59 tests for shared hook utilities - setup_timeout/cancel_timeout (SIGALRM handling) - validate_file_path (security validation) - read_json_input (input size limits) - get_hook_logger (logger caching) - setup_logging (debug/warning configuration) - tests/test_session_analyzer.py: 48 tests for session transcript analysis - parse_transcript (JSONL and plain text formats) - analyze/analyze_content (signal detection) - has_uncaptured_content (detection) - Novelty checking integration All 1913 tests passing. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…-003, DOC-005, DOC-008) README.md: - Add comprehensive API Reference section with services, models, operations - Document all environment variables with descriptions and defaults - Add Hook Configuration section with all hook-specific variables CLAUDE.md: - Add Code Intelligence (LSP) section with hook documentation - Add Navigation & Understanding guidance - Add Verification Workflow and Pre-Edit Checklist commands/*.md: - Add consistent --help support with man-page style output - Add help_check sections for early help detection Addresses: DOC-003 (API Reference), DOC-005 (environment variables), DOC-008 (Related Commands) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request implements performance optimizations and quality improvements addressing code review findings. The changes introduce batch operations for git and embedding operations (PERF-001 through PERF-007), improve exception handling specificity (QUAL-001, QUAL-002), and add security safeguards against ReDoS attacks (SEC-001, SEC-002). A new ServiceRegistry class centralizes singleton management, replacing module-level global variables. All 1806 tests pass with 89.29% coverage maintained.
Key changes:
- Batch git operations via
git cat-file --batchreduce subprocess overhead when fetching multiple notes - Batch embedding generation processes multiple texts in one model call for better throughput
- ServiceRegistry provides centralized singleton management with clean reset mechanism for testing
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| src/git_notes_memory/git_ops.py | Adds show_notes_batch() for bulk note retrieval via git cat-file; adds path sanitization to error messages |
| src/git_notes_memory/sync.py | Updates collect_notes(), reindex(), and verify_consistency() to use batch operations; migrates to ServiceRegistry |
| src/git_notes_memory/recall.py | Implements batch hydration with grouped git operations; optimizes token estimation with generator expression |
| src/git_notes_memory/index.py | Adds struct format caching for embedding serialization to avoid repeated format parsing |
| src/git_notes_memory/embedding.py | Adds prewarm() method for eager model loading; migrates to ServiceRegistry |
| src/git_notes_memory/capture.py | Extracts validation and front matter building into separate methods; migrates to ServiceRegistry |
| src/git_notes_memory/registry.py | New centralized singleton registry with type-safe get/register/reset methods |
| src/git_notes_memory/hooks/stop_handler.py | Renames _read_input to _read_input_with_fallback for clarity |
| src/git_notes_memory/hooks/signal_detector.py | Adds 100KB input length limit to prevent ReDoS attacks |
| src/git_notes_memory/hooks/context_builder.py | Replaces bare Exception with specific MemoryIndexError and OSError catches |
| tests/test_sync.py | Updates all test mocks for batch operations (show_notes_batch, embed_batch) |
| tests/test_hook_handlers.py | Updates imports and calls for renamed _read_input_with_fallback function |
| tests/test_context_builder.py | Updates exception mocking to use specific exception types |
| tests/conftest.py | Major refactor: uses ServiceRegistry.reset() instead of direct module access; adds comprehensive fixtures for git repos, mocks, and models |
Comments suppressed due to low confidence (1)
tests/test_sync.py:800
- The test mock setup may not match the actual behavior. In the implementation (sync.py:316),
parser.parse_many(content)is called with a single content string, but the mock returns[record](one record). However, parse_many can return multiple records from a single note. In the test at line 794,mock_note_parser.parse_many.return_value = [record]is set once, but parse_many is called twice (once for commit1 and once for commit2). This means the second call will also return the same single record, which is correct for this test but doesn't reflect that parse_many is called multiple times. The test expects 2 insertions (line 800), but only mocks one record being returned each time parse_many is called, which would work only if each note contains exactly one memory. This is fine for the test, but the comment should clarify this assumption.
record = make_note_record()
mock_note_parser.parse_many.return_value = [record]
result = sync_service.reindex()
assert result == 2
assert mock_index.insert.call_count == 2
- Archive 2025-12-20 code review files to docs/code-review/2025/12/20/ - Archive 2025-12-24 code review files to docs/code-review/2025/12/24/ - Add .claude/settings.local.json to .gitignore Organizes code review history for future reference while keeping root directory clean. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changes made based on PR review comments: 1. git_ops.py: Improve path sanitization for SEC-002 - Handle Windows paths (C:\...), home-relative (~), and traversals (..) - More precise "missing" check using line.endswith pattern - Fix sha_index sync on malformed headers 2. index.py: Reduce lru_cache maxsize from 8 to 1 - Embedding dimensions are constant (384), maxsize=1 is sufficient 3. stop_handler.py: More robust empty input detection - Check for "empty input" pattern instead of just "empty" 4. signal_detector.py: Move MAX_TEXT_LENGTH to module level - Constant should not be redefined on every method call 5. registry.py: Reject kwargs on existing instances - Raise ValueError if kwargs passed when instance already exists - Prevents silently ignoring configuration parameters 6. embedding.py: Use error level for prewarm failures - Configuration issues should be visible to users Responds to Copilot review comments on PR #14. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 28 out of 33 changed files in this pull request and generated 4 comments.
- Remove redundant logging import in test_hook_utils.py - Use types_found variable in assertion (test_session_analyzer.py) - Add assertion for result variable (test_session_analyzer.py) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use UTF-8 encoding for accurate byte counting when parsing git cat-file --batch output. The previous implementation used len(str) which counts characters, not bytes, causing incorrect content boundary detection for multi-byte UTF-8 characters. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changes made based on PR review comments: 1. git_ops.py: Improve path sanitization for SEC-002 - Handle Windows paths (C:\...), home-relative (~), and traversals (..) - More precise "missing" check using line.endswith pattern - Fix sha_index sync on malformed headers 2. index.py: Reduce lru_cache maxsize from 8 to 1 - Embedding dimensions are constant (384), maxsize=1 is sufficient 3. stop_handler.py: More robust empty input detection - Check for "empty input" pattern instead of just "empty" 4. signal_detector.py: Move MAX_TEXT_LENGTH to module level - Constant should not be redefined on every method call 5. registry.py: Reject kwargs on existing instances - Raise ValueError if kwargs passed when instance already exists - Prevents silently ignoring configuration parameters 6. embedding.py: Use error level for prewarm failures - Configuration issues should be visible to users Responds to Copilot review comments on PR #14. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 32 out of 37 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (2)
tests/test_sync.py:1
- These parameters are marked as unused with
# noqa: ARG001but the comment says 'Required by signature'. If they are truly required by the signature but not used in the function body, consider if the function signature itself needs refactoring. If this is a callback or interface requirement, add a more specific comment explaining why these parameters must match a specific signature.
tests/test_sync.py:1 - The fixture is marked as unused with
# noqa: ARG002and noted as 'Required fixture'. If the fixture is truly required (e.g., for setup/teardown side effects), the comment should explain what side effect is needed. If it's not actually required, remove it from the function signature.
Summary
Test plan
make qualitypasses (format, lint, typecheck, security)🤖 Generated with Claude Code