vaind commented Dec 30, 2025

DESCRIBE YOUR PR

Adds automated external link checking with lychee to catch broken links in the documentation.

What's included

  • GitHub workflow (.github/workflows/lint-external-links.yml)

    • On PRs: checks changed markdown files and fails if broken links are found (visible in the checks tab)
    • Weekly scheduled: checks all docs, creates/updates a GitHub issue with results
    • Manual trigger: run full check anytime via workflow dispatch
  • Pre-commit hook for local validation (warn-only, doesn't block commits)

  • Configuration files

    • lychee.toml - Link checker settings (timeouts, retries, accepted status codes, etc.)
    • .lycheeignore - URL patterns to ignore (examples, bot-blocking sites, TLS-incompatible sites)

Current state

The checker currently finds 54 genuinely broken external links in the docs. These need to be fixed separately; this PR only adds the tooling to detect them.

IS YOUR CHANGE URGENT?

  • Urgent deadline (GA date, etc.):
  • Other deadline:
  • None: Not urgent, can wait up to 1 week+

PRE-MERGE CHECKLIST

  • Checked Vercel preview for correctness, including links
  • PR was reviewed and approved by any necessary SMEs (subject matter experts)
  • PR was reviewed and approved by a member of the Sentry docs team

vaind and others added 9 commits December 30, 2025 14:47
Configures lychee link checker with:
- Rate limiting and retry settings
- Custom user agent to avoid bot blocking
- Cache settings to reduce load on external sites
- Ignore patterns for placeholder URLs, localhost, and sites
  that block automated checkers (Twitter, LinkedIn, etc.)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Uses lychee to validate external links in documentation.

Triggers:
- Weekly cron (Sunday 2 AM UTC): Creates/updates GitHub issue
- Manual dispatch: Optionally fails on broken links
- Pull requests: Adds non-blocking comment with report

The workflow caches results to reduce load on external sites
and does not block PRs (external link failures are often
transient or false positives).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add section explaining the relationship between internal link
checking (this script) and external link checking (lychee).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds a warn-only pre-commit hook that checks external links in
changed markdown files using lychee. The hook:
- Only runs on docs/ and develop-docs/ markdown files
- Shows warnings but doesn't block commits
- Gracefully handles missing lychee installation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add instructions for running lychee locally and document the
pre-commit hook behavior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove separate shell script and use inline bash command with
|| true to achieve warn-only behavior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace bash one-liner with TypeScript script for Windows
compatibility. Uses bun like other scripts in the repo.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use git diff to get list of changed markdown files for PRs,
making the check faster. Full scans still run on schedule and
manual dispatch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
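
The changed-file selection described in the commit above can be sketched in TypeScript, the language the repo's scripts use. This is a minimal sketch, not the workflow's actual code: the function name `selectChangedDocs`, the `.md`/`.mdx` extensions, the restriction to `docs/` and `develop-docs/` (borrowed from the pre-commit hook description), and the assumption that the input is the output of `git diff --name-only` are all illustrative.

```typescript
// Hypothetical helper: pick the markdown files to check from the output of
// `git diff --name-only` (one path per line). Roots and extensions are assumptions.
const DOC_ROOTS: string[] = ["docs/", "develop-docs/"];
const MARKDOWN_EXTENSIONS: string[] = [".md", ".mdx"];

function selectChangedDocs(diffOutput: string): string[] {
  return diffOutput
    .split(/\r?\n/) // tolerate both Unix and Windows line endings
    .map((line) => line.trim())
    .filter((path) => path.length > 0)
    .filter((path) => DOC_ROOTS.some((root) => path.startsWith(root)))
    .filter((path) => MARKDOWN_EXTENSIONS.some((ext) => path.endsWith(ext)));
}

// Example input shaped like `git diff --name-only` output.
const sampleDiff = [
  "docs/platforms/index.mdx",
  "src/components/header.tsx",
  "develop-docs/sdk/overview.md",
  "README.md",
].join("\n");

// Only the two files under the documentation roots are selected.
console.log(selectChangedDocs(sampleDiff));
```

Passing only this list to the link checker is what would keep PR runs fast, while scheduled and manual runs scan everything.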
vercel bot commented Dec 30, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Review | Updated (UTC)
develop-docs | Ready | Preview, Comment | Jan 2, 2026 10:18am
sentry-docs | Ready | Preview, Comment | Jan 2, 2026 10:18am

- Add scheme filter to only check http/https (skip root-relative links)
- Accept 403/418 status codes (bot blocking, freedesktop teapot)
- Add ignore patterns for:
  - Bot-blocking sites (npmjs, maven, medium, gitlab, epicgames)
  - Private resources (Notion, private GitHub repos, Zendesk)
  - Unstable docs (freedesktop)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Set base_url to docs.sentry.io so lychee can resolve root-relative
links, then exclude docs.sentry.io from checking (internal links
are already covered by lint-404s).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
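
The combined effect of the scheme filter, the base URL, and the docs.sentry.io exclusion can be illustrated with a small TypeScript sketch. It models the decision only and is not how lychee implements it; the function name `shouldCheckExternally` and the constants are hypothetical.

```typescript
// Sketch of the link-selection rules described above (not lychee's code).
const BASE_URL = "https://docs.sentry.io";
const EXCLUDED_HOSTS: string[] = ["docs.sentry.io"]; // internal links are covered by lint-404s

function shouldCheckExternally(link: string): boolean {
  let resolved: URL;
  try {
    // Root-relative links such as "/product/" resolve against the base URL.
    resolved = new URL(link, BASE_URL);
  } catch {
    return false; // unparseable link: nothing to request
  }
  // Scheme filter: only http and https are checked.
  if (resolved.protocol !== "http:" && resolved.protocol !== "https:") {
    return false;
  }
  // Anything that resolves to an excluded host is skipped.
  return !EXCLUDED_HOSTS.some((host) => host === resolved.hostname);
}

console.log(shouldCheckExternally("/product/issues/"));          // false (resolves to docs.sentry.io)
console.log(shouldCheckExternally("https://example.com/guide")); // true
console.log(shouldCheckExternally("mailto:team@example.com"));   // false (not http/https)
```

Resolving first and excluding second means root-relative links no longer produce errors, yet they are still left to the internal checker.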
After manually testing ERROR entries from lychee.log:

- bottlepy.org: TLS 1.3 only, incompatible with lychee's native-tls
- help.revise.dev: Cloudflare ECH required, fails even with curl
- dev.getsentry.net: Internal development URLs
- sentry-content-dashboard: Internal dashboard (401)
- godoc.org/pkg.go.dev: Rate-limited (429)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
vaind and others added 2 commits December 30, 2025 20:39
Changed from separate entries to a single pattern with the optional regex group (.+@)?,
which matches private IPs with or without credentials (e.g., token@10.0.2.2).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
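
To show how the optional group behaves, here is a minimal TypeScript sketch. The pattern below is an assumption written for illustration, not the entry from this PR's .lycheeignore; only the (.+@)? group comes from the commit message, and the private ranges listed are the usual 10.x, 172.16-31.x, and 192.168.x blocks.

```typescript
// Hypothetical pattern: a URL whose host is a private IPv4 address,
// optionally preceded by credentials such as "token@".
const PRIVATE_IP_URL = /^https?:\/\/(.+@)?(10\.|192\.168\.|172\.(1[6-9]|2\d|3[01])\.)/;

function isPrivateIpUrl(url: string): boolean {
  return PRIVATE_IP_URL.test(url);
}

console.log(isPrivateIpUrl("http://10.0.2.2:8080/"));     // true
console.log(isPrivateIpUrl("http://token@10.0.2.2/api")); // true
console.log(isPrivateIpUrl("http://172.16.0.1/"));        // true
console.log(isPrivateIpUrl("https://example.com/"));      // false
```

Because .+ is greedy and unanchored to the host part, it can also match an @ that appears later in a path or query string; a stricter group such as ([^/@]+@)? would avoid that if it ever causes false matches.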
Split into two jobs for clarity:
- check-pr: PRs only, changed files, adds comment
- check-full: Schedule/manual, all files, creates issue

Removed caching (wasn't working with per-commit keys).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
vaind and others added 2 commits December 30, 2025 21:07
- Rename .lychee.toml to lychee.toml (default config name)
- Remove --config args since lychee.toml is auto-detected
- Simplify workflow: use '.' instead of listing directories
- Split workflow into separate PR and full-scan jobs
- Update PR job to update existing comment instead of creating new ones
- Update full-scan job to update existing issue instead of creating duplicates
- Add file existence checks before reading reports
- Use appropriate GitHub labels (Bug, Team: Docs, Product Area: Docs)
- Add proper permissions scoping per job

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
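
The "update instead of duplicate" behavior for PR comments can be sketched as a pure decision function. This is an illustrative sketch only: the hidden marker string, the `IssueComment` shape, and `planCommentAction` are hypothetical, and the workflow may identify its earlier comment differently.

```typescript
// Hypothetical shapes and marker; the real workflow may differ.
interface IssueComment {
  id: number;
  body: string;
}

type CommentAction =
  | { kind: "update"; commentId: number; body: string }
  | { kind: "create"; body: string };

const REPORT_MARKER = "<!-- external-link-check-report -->";

function planCommentAction(existing: IssueComment[], report: string): CommentAction {
  // Tag every report so a later run can recognize its own comment.
  const body = `${REPORT_MARKER}\n${report}`;
  const previous = existing.find((c) => c.body.indexOf(REPORT_MARKER) !== -1);
  return previous !== undefined
    ? { kind: "update", commentId: previous.id, body }
    : { kind: "create", body };
}

const earlier: IssueComment[] = [
  { id: 1, body: "LGTM" },
  { id: 7, body: `${REPORT_MARKER}\n3 broken links` },
];
console.log(planCommentAction(earlier, "1 broken link").kind); // "update"
console.log(planCommentAction([], "1 broken link").kind);      // "create"
```

The same idea would apply to the weekly issue: look for an existing open report first and edit it rather than opening a new one.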
vaind and others added 2 commits December 30, 2025 23:56
Replace Unix-specific `which` command with `lychee --version` check
that works on Windows, macOS, and Linux. Also add cargo install option
to the help message.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
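
A portable availability check along these lines might look like the following sketch. It is not the script from this PR: the helper name `isCommandAvailable` and the warning text are assumptions, and it relies on Node's `child_process.spawnSync`, which bun also provides.

```typescript
import { spawnSync } from "node:child_process";

// Returns true when `command` can be started and exits with status 0.
// Running the tool itself avoids the Unix-only `which` lookup.
function isCommandAvailable(command: string, args: string[] = ["--version"]): boolean {
  const result = spawnSync(command, args, { stdio: "ignore" });
  // `error` is set when the process could not be spawned (e.g. ENOENT).
  return result.error === undefined && result.status === 0;
}

if (!isCommandAvailable("lychee")) {
  // Warn-only: report the problem and let the commit continue.
  console.warn("lychee not found; skipping external link check (one option: cargo install lychee).");
}
```

Because the helper never throws, a missing installation leads to a warning rather than a blocked commit, matching the hook's warn-only intent.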
vaind and others added 2 commits December 31, 2025 10:27
Weekly scheduled runs save the cache, PR checks restore it. This reduces
load on external sites and speeds up PR checks. Cache lifetime is 2 weeks
to ensure it survives between weekly runs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Skip caching 429 (rate limit) and 5xx (server error) responses so they
get retried on subsequent runs. This ensures transient failures don't
persist in the cache while stable results are still reused.

Also restore cache on scheduled runs so successful checks from the
previous week are skipped.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
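
The caching rule in this commit amounts to a simple predicate on the response status. The sketch below only models that policy; it is not lychee's implementation, and `shouldCacheStatus` is a hypothetical name.

```typescript
// Policy sketch: keep stable results in the cache, retry transient failures.
function shouldCacheStatus(status: number): boolean {
  const rateLimited = status === 429;                  // rate limit: retry next run
  const serverError = status >= 500 && status <= 599;  // 5xx: likely transient
  return !rateLimited && !serverError;
}

console.log(shouldCacheStatus(200)); // true
console.log(shouldCacheStatus(404)); // true
console.log(shouldCacheStatus(429)); // false
console.log(shouldCacheStatus(503)); // false
```

Under this rule a 404 stays cached because it is a stable answer, while a 429 or 503 is re-requested on the next run.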
2 participants