Enhancement: Prune excluded directories early in traversal for better performance (e.g., node_modules) #26

@jml6m

Description

I've hit a performance issue with large excluded directories like node_modules.

When using --exclude "node_modules/**", the tool still recurses into the directory and skips files individually (visible with --log-level DEBUG), leading to slow scans.

Example command (from project root):
code2prompt --path . --filter ".js,.json,.ts,.html,.scss,.css" --exclude "node_modules/**" --log-level DEBUG --output prompt.md

Relevant debug logs (abridged):

[2025-10-03 16:03:20] DEBUG: Checking if should process file: node_modules\@angular\cdk\esm2022\stepper\stepper-button.mjs
DEBUG:code2prompt.utils.should_process_file:Skipping node_modules\@angular\cdk\esm2022\stepper\stepper-button.mjs: File does not meet filter criteria.
[2025-10-03 16:03:20] DEBUG: Skipping node_modules\@angular\cdk\esm2022\stepper\stepper-button.mjs: File does not meet filter criteria.
DEBUG:code2prompt.utils.should_process_file:Checking if should process file: node_modules\@angular\cdk\esm2022\stepper\stepper-module.mjs
[2025-10-03 16:03:20] DEBUG: Checking if should process file: node_modules\@angular\cdk\esm2022\stepper\stepper-module.mjs
DEBUG:code2prompt.utils.should_process_file:Skipping node_modules\@angular\cdk\esm2022\stepper\stepper-module.mjs: File does not meet filter criteria.
[2025-10-03 16:03:20] DEBUG: Skipping node_modules\@angular\cdk\esm2022\stepper\stepper-module.mjs: File does not meet filter criteria.
DEBUG:code2prompt.utils.should_process_file:Checking if should process file: node_modules\@angular\cdk\esm2022\stepper\stepper.mjs
[2025-10-03 16:03:20] DEBUG: Checking if should process file: node_modules\@angular\cdk\esm2022\stepper\stepper.mjs
DEBUG:code2prompt.utils.should_process_file:Skipping node_modules\@angular\cdk\esm2022\stepper\stepper.mjs: File does not meet filter criteria.
[2025-10-03 16:03:20] DEBUG: Skipping node_modules\@angular\cdk\esm2022\stepper\stepper.mjs: File does not meet filter criteria.
DEBUG:code2prompt.utils.should_process_file:Checking if should process file: node_modules\@angular\cdk\esm2022\stepper\stepper_public_index.mjs
[2025-10-03 16:03:20] DEBUG: Checking if should process file: node_modules\@angular\cdk\esm2022\stepper\stepper_public_index.mjs
DEBUG:code2prompt.utils.should_process_file:Skipping node_modules\@angular\cdk\esm2022\stepper\stepper_public_index.mjs: File does not meet filter criteria.
[2025-10-03 16:03:20] DEBUG: Skipping node_modules\@angular\cdk\esm2022\stepper\stepper_public_index.mjs: File does not meet filter criteria.
DEBUG:code2prompt.utils.should_process_file:Checking if should process file: node_modules\@angular\cdk\esm2022\table\can-stick.mjs
[2025-10-03 16:03:20] DEBUG: Checking if should process file: node_modules\@angular\cdk\esm2022\table\can-stick.mjs
DEBUG:code2prompt.utils.should_process_file:Skipping node_modules\@angular\cdk\esm2022\table\can-stick.mjs: File does not meet filter criteria.
[2025-10-03 16:03:20] DEBUG: Skipping node_modules\@angular\cdk\esm2022\table\can-stick.mjs: File does not meet filter criteria.
... (thousands more)

This appears to happen because the directory walker (os.walk or equivalent) enters node_modules before any exclude check runs. The glob is applied only to individual files after entry, so the subtree is never pruned.

Suggested Improvement

Before recursing into a subdirectory, check whether its full path (or path + '/*') matches any exclude pattern using fnmatch/glob. If it does, skip the subtree entirely (e.g., remove it in place from the dirnames list in os.walk, which takes effect when topdown=True).
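
A minimal sketch of this idea, assuming the traversal is built on os.walk. The names is_excluded_dir and walk_pruned are hypothetical and are not taken from the code2prompt source. The path + '/*' check is a heuristic: it works for patterns like "node_modules/**" but may over-prune for unusual patterns such as "src/?".

```python
import fnmatch
import os
from typing import Iterator, Sequence


def is_excluded_dir(rel_dir: str, patterns: Sequence[str]) -> bool:
    """Hypothetical helper: does a relative, '/'-separated directory path
    match any exclude pattern, either directly or as 'path/*'?"""
    for pattern in patterns:
        if fnmatch.fnmatch(rel_dir, pattern):
            return True
        # Heuristic: "node_modules/*" matches the pattern "node_modules/**",
        # so the whole subtree can be treated as excluded.
        if fnmatch.fnmatch(rel_dir + "/*", pattern):
            return True
    return False


def walk_pruned(root: str, exclude: Sequence[str]) -> Iterator[str]:
    """Hypothetical walker: yield file paths under root, never descending
    into directories that match an exclude pattern."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        rel_dir = os.path.relpath(dirpath, root)
        kept = []
        for name in dirnames:
            rel = name if rel_dir == "." else os.path.join(rel_dir, name)
            rel = rel.replace(os.sep, "/")  # normalize Windows separators
            if not is_excluded_dir(rel, exclude):
                kept.append(name)
        # Assign in place: os.walk only honors pruning done on this same list.
        dirnames[:] = kept
        for name in filenames:
            yield os.path.join(dirpath, name)
```

With this shape, a call such as walk_pruned(".", ["node_modules/**"]) would drop node_modules from dirnames at the project root, so none of its files reach the per-file filter check.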

Environment

  • OS: Windows 11 Home, 24H2, build 26100.6584 (Anaconda PowerShell)
  • Conda: 24.1.2
  • Python: 3.11.7
  • code2prompt version: 0.8.1
  • Project size: 506MB
