# Cursor Indexing Optimization

**Last Updated:** 2026-02-12

Plan and decisions for optimizing Cursor codebase indexing. Target: reduce the index from ~9.5K to ~6–7K files (the changes listed here are expected to land at ~5.7K–6.2K) while preserving full reference access via documentation indices and @ mentions.

## Do This Now (User Actions)

1. **Reindex Cursor** – Cursor Settings > Indexing & Docs > **Delete Index** > **Sync**. Wait for indexing to complete (new count should be ~5.7K–6.2K).

2. **Verify** – In Chat, type `@docs/content/blog/posts/ratgeber/gastronomie-mindestlohn/data/faq-questions.json` and confirm it still resolves.

**Note:** “Add Doc” in Indexing & Docs accepts **external URLs only** (e.g. `https://pytorch.org/docs`). Local files like `docs/DOCUMENTATION_MASTER_INDEX.md` are already in the codebase and indexed – no need to add them.

## Summary of Changes

| Change | Files Excluded (approx.) |
|--------|---------------------------|
| Image extensions (.webp, .svg, .jpg, etc.) in `.cursorignore` | ~1,560 |
| `docs/content/blog/posts/**/data/` in `.cursorindexingignore` | ~1,500–2,000 |
| `docs/data/**/*.json` in `.cursorindexingignore` | ~276 |
| **Total** | **~3,300–3,800** |

**Expected result:** Index from ~9,500 → ~5,700–6,200 files.
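
The expected range follows directly from the table. As a quick check, the arithmetic can be reproduced with the approximate counts above (the dictionary labels are only descriptive and do not come from any script in this repository):

```python
# Approximate exclusion counts from the summary table, as (low, high) ranges.
exclusions = {
    "image extensions (.cursorignore)": (1560, 1560),
    "blog post data dirs (.cursorindexingignore)": (1500, 2000),
    "docs/data JSON (.cursorindexingignore)": (276, 276),
}
baseline = 9500  # approximate index size before the changes

total_low = sum(low for low, _ in exclusions.values())
total_high = sum(high for _, high in exclusions.values())

# Excluding more files yields a smaller index, so the bounds swap.
index_low = baseline - total_high
index_high = baseline - total_low

print(f"excluded: {total_low}-{total_high}")
print(f"expected index: {index_low}-{index_high}")
```

Rounded to the nearest hundred, this gives the ~3,300–3,800 excluded files and the ~5,700–6,200 expected index size quoted above.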

## Files Modified/Created

- **`.cursorignore`** – Added image extensions: `*.webp`, `*.svg`, `*.jpg`, `*.jpeg`, `*.gif`, `*.ico`
- **`.cursorindexingignore`** – Per-post blog data, docs/data JSON, docs/archive, docs/audit
- **`v2/scripts/dev-helpers/analyze-cursor-index.py`** – Index analysis script (simulates all three ignore files)
- **`v2/scripts/dev-helpers/verify-index-config.py`** – Validates .cursorindexingignore pattern syntax
- **`v2/scripts/dev-helpers/export-cursor-indexed-files.py`** – Exports Cursor's actual index
- **`v2/scripts/dev-helpers/diff-index-vs-expected.py`** – Diffs the actual index against the expected one
- **`docs/development/CURSOR_INDEXING_DEBUGGING.md`** – Debugging guide
- **`docs/development/CURSOR_INDEXING_BASELINE.md`** – Baseline metrics
- **`docs/development/PRE_INDEX_CHANGE_CHECKLIST.md`** – Checklist for ignore file changes
- **`docs/development/CURSOR_INDEXING_SETUP.md`** – Updated with .cursorindexingignore, image exclusions
- **`docs/ai/SEMANTIC_SEARCH_INDEX.md`** – Added "Excluded but accessible" section
- **`.cursor/rules/indexing-optimization.mdc`** – Rule for @-mentioning excluded paths
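
To illustrate how the two ignore files divide the work, here is a minimal sketch that sorts a path into one of three buckets. It is not the logic of `analyze-cursor-index.py`, and it does not reproduce Cursor's real matcher: `pattern_to_regex`, `matches_any`, and `classify` are hypothetical helpers, and the glob translation is a simplified reading of gitignore-style patterns (no negation, no character classes). The patterns are the ones named in this document. The `public/img/hero.webp` and `docs/data/wages/2025.json` paths in the usage lines are made up.

```python
import re

# Patterns taken from this document; the matcher below is a simplification.
CURSORIGNORE = ["*.webp", "*.svg", "*.jpg", "*.jpeg", "*.gif", "*.ico"]
CURSORINDEXINGIGNORE = ["docs/content/blog/posts/**/data/", "docs/data/**/*.json"]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified gitignore-style pattern into a regex."""
    dir_only = pattern.endswith("/")
    body = pattern.rstrip("/")
    anchored = "/" in body  # a slash inside the pattern anchors it to the repo root
    parts = []
    i = 0
    while i < len(body):
        if body.startswith("**/", i):
            parts.append("(?:[^/]+/)*")  # zero or more directories
            i += 3
        elif body[i] == "*":
            parts.append("[^/]*")  # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(body[i]))
            i += 1
    prefix = "" if anchored else "(?:.*/)?"
    suffix = "/.*" if dir_only else "(?:/.*)?"
    return re.compile(prefix + "".join(parts) + suffix)


def matches_any(path: str, patterns: list) -> bool:
    return any(pattern_to_regex(p).fullmatch(path) for p in patterns)


def classify(path: str) -> str:
    if matches_any(path, CURSORIGNORE):
        return "blocked"  # not indexed and not @-mentionable
    if matches_any(path, CURSORINDEXINGIGNORE):
        return "mention-only"  # not indexed, still reachable via @
    return "indexed"


print(classify("public/img/hero.webp"))
print(classify("docs/data/wages/2025.json"))
print(classify("docs/development/CURSOR_INDEXING_SETUP.md"))
```

Checking `.cursorignore` first reflects the trade-off described in this document: a path listed there loses @-mention access even if `.cursorindexingignore` also matches it.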

## Verification

1. Run `python3 v2/scripts/dev-helpers/analyze-cursor-index.py` before and after the changes, and compare the reported counts
2. Run `python3 v2/scripts/dev-helpers/verify-index-config.py` to validate patterns
3. Cursor Settings > Indexing & Docs > Delete Index > Sync
4. Verify `@docs/content/blog/posts/ratgeber/gastronomie-mindestlohn/data/faq-questions.json` still works
5. Update CURSOR_INDEXING_BASELINE.md with the new Cursor UI count

## Troubleshooting: Index Still High After Optimization

If the Cursor index count remains high after adding ignore patterns:

1. **Delete Index + Sync** – Cursor does not auto-apply ignore changes. You must:
   - Cursor Settings > Indexing & Docs > **Delete Index**
   - Then **Sync**
   - Wait for indexing to complete

2. **Restart Cursor** – If changes don't apply, restart Cursor and run Delete Index + Sync again.

3. **Verify counts** – Run the analysis script and compare:
   ```bash
   python3 v2/scripts/dev-helpers/analyze-cursor-index.py
   python3 v2/scripts/dev-helpers/verify-index-config.py
   ```
   Compare `estimated_cursor_index` (script) with the "X files" count in Cursor UI.

4. **View included files** – Cursor Settings > Indexing & Docs > **View included files** – save the `.txt` output and compare with script output to identify what Cursor actually indexes.

5. **Check Cursor logs** – Open the Output panel (`View` > `Output`, or the **Output** tab next to **Terminal** in the bottom panel) and select **"Cursor Indexing & Retrieval"** from the dropdown. Look for errors or warnings during indexing.

6. **Enable Trace logging** – Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`) > **"Developer: Set Log Level"** > **Trace**. Re-run Delete Index + Sync and check logs for pattern application.

7. **Export and diff actual index** – Run diagnostic scripts after Sync:
   ```bash
   python3 v2/scripts/dev-helpers/export-cursor-indexed-files.py
   python3 v2/scripts/dev-helpers/diff-index-vs-expected.py
   ```
   Review `docs/development/cursor-index-diff-report.txt` for violations (files indexed that should be excluded).

8. **Clean-slate workaround** – Per a [forum report](https://forum.cursor.com/t/indexing-issue-will-not-index-using-the-cursor-ignore-file/46359), Cursor may not apply ignore rules when folders have "too many files." Try:
   - Close Cursor
   - Rename project folder (e.g. `landingpage` → `landingpage-temp`)
   - Reopen the renamed folder in Cursor
   - Delete Index + Sync
   - Rename back if needed

9. **If ignores still fail: .cursorignore fallback** – Move exclusions from `.cursorindexingignore` to `.cursorignore` to force exclusion. **Trade-off:** Those paths are no longer @-mentionable. Example: add `docs/content/blog/posts/**/data/**` to `.cursorignore` to exclude blog data from indexing. Use this only if index reduction is critical and @-mention access is not needed for those paths.
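
The comparison in steps 4 and 7 comes down to a set difference between two file lists. The sketch below shows that idea only and is not the implementation of `diff-index-vs-expected.py`. It assumes both inputs are plain text with one relative path per line, which may not match the actual export format; `normalize` and `diff_index` are hypothetical names, and the sample paths are invented. It needs Python 3.9+ for `str.removeprefix`.

```python
from typing import Iterable


def normalize(lines: Iterable[str]) -> set:
    """Strip whitespace, drop blank lines, and unify path separators."""
    paths = set()
    for line in lines:
        path = line.strip().replace("\\", "/").removeprefix("./")
        if path:
            paths.add(path)
    return paths


def diff_index(actual_lines: Iterable[str], expected_lines: Iterable[str]) -> dict:
    actual = normalize(actual_lines)
    expected = normalize(expected_lines)
    return {
        "violations": sorted(actual - expected),  # indexed but should be excluded
        "missing": sorted(expected - actual),  # expected in the index but absent
    }


# In practice the lines would come from files, e.g.
# Path("<exported list>.txt").read_text().splitlines()
actual = ["docs/README.md", "./docs/data/rates.json", "", "src/app.py"]
expected = ["docs/README.md", "src/app.py", "src/util.py"]
report = diff_index(actual, expected)
print(report["violations"])
print(report["missing"])
```

A non-empty `violations` list is the signal that an ignore pattern is not being applied. A non-empty `missing` list suggests the expected list or the export is out of date.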

### Expected vs Actual Count

| Source / condition | Expected count / status | Notes |
|--------------------|-------------------------|-------|
| Script `estimated_cursor_index` | ~5.7K–6.2K | After all three ignore files |
| Cursor UI | ~5.7K–6.2K | After Delete Index + Sync |
| Cursor UI slightly above script | Possible | Cursor may index additional files |
| Cursor UI far above script | Verify | Re-run Delete Index + Sync; check for conflicting patterns |
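
The table does not say where "slightly above" ends and "far above" begins. As a rough sketch, the check could be written with an explicit tolerance. The 10% default here is an arbitrary assumption, not a value from Cursor or from the scripts in this repository, and `compare_counts` is a hypothetical helper.

```python
def compare_counts(script_estimate: int, ui_count: int, tolerance: float = 0.10) -> str:
    """Classify the gap between the script estimate and the Cursor UI count.

    tolerance is an assumed cut-off: the fraction by which the UI count may
    exceed the estimate before the result is flagged for verification.
    """
    if script_estimate <= 0:
        raise ValueError("script_estimate must be positive")
    excess = (ui_count - script_estimate) / script_estimate
    if excess <= 0:
        return "ok"
    if excess <= tolerance:
        return "possible"  # Cursor may index additional files
    return "verify"  # re-run Delete Index + Sync; check for conflicting patterns


print(compare_counts(6000, 6200))
print(compare_counts(6000, 9500))
```

If the UI count stays near the pre-optimization ~9.5K while the script estimates ~6K, the result is `verify`, which points back to the troubleshooting steps above.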

## Known Cursor Limitations

- **Large codebases:** Cursor may not apply ignore rules when indexing folders with "too many files" ([forum](https://forum.cursor.com/t/indexing-issue-will-not-index-using-the-cursor-ignore-file/46359)). The landingpage repository has 15K+ tracked files.
- **embeddable_files.txt diff:** Use `export-cursor-indexed-files.py` and `diff-index-vs-expected.py` to compare actual vs expected index. See [CURSOR_INDEXING_DEBUGGING.md](CURSOR_INDEXING_DEBUGGING.md).

## Related

- [CURSOR_INDEXING_DEBUGGING.md](CURSOR_INDEXING_DEBUGGING.md) – Step-by-step debugging guide
- [CURSOR_INDEXING_SETUP.md](CURSOR_INDEXING_SETUP.md) – Setup and ignore configuration
- [CURSOR_INDEXING_BASELINE.md](CURSOR_INDEXING_BASELINE.md) – Baseline metrics
