# Blog FAQs: source of truth and pipeline

**Last Updated:** 2026-03-23  
**Purpose:** One place to answer “which file is authoritative?” and how FAQ research files relate to what ships in `v2/data/blog/posts/`.

## Canonical (what users and schema see)

| Location | Role |
|----------|------|
| `v2/data/blog/posts/{category}/{slug}.json` → top-level **`faqs`** | **Single source of truth for published FAQs.** HTML `question` + `answer`; rendered via `BlogFAQ.php` and FAQPage schema. |

Validators such as `validate-faq-quality.php` and `check-h2-faq-overlap.php` read **this** representation.

## Pipeline (research → generation → apply)

These live under `docs/content/blog/posts/{category}/{slug}/data/`. They are **stages in a chain**, not three parallel editorial copies of the same content.

| File | Role | Human-editable? | When to refresh |
|------|------|-------------------|-----------------|
| `faq-research.json` | SERP/PAA/research inputs for FAQ work | Yes (or re-run collection) | Start here when re-baselining FAQ topics |
| `faq-questions.json` | Generated or curated question list | Yes, before answer generation | After research changes |
| `faq-answers-optimized.json` | Generated/optimized Q+A (often an `answers` array) | Yes, before applying to post | After question changes or answer tuning |

**Flow:** `collect-faq-research-data.php` → `faq-research.json` → `generate-faq-questions.php` → `faq-questions.json` → `generate-faq-answers-optimized.php` → `faq-answers-optimized.json` → **`add-faqs-to-post.php`** → post JSON `faqs`.

Batch rebuilds that run this sequence are orchestrated by `rebuild-faqs-batch.php`, which is driven by a priority list.
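
The flow above can be sketched as a dry-run helper that only prints the commands in order and executes nothing. It assumes the three generation scripts live in `v2/scripts/blog/` and accept the same `--post` / `--category` flags as `add-faqs-to-post.php`; neither is confirmed in this document, so check each script's usage first. `print_faq_pipeline` is a hypothetical name.

```bash
# Dry-run sketch: prints the FAQ pipeline commands in order, runs nothing.
# ASSUMPTION: the generation scripts share the v2/scripts/blog/ directory and
# the --post/--category flags documented for add-faqs-to-post.php.
print_faq_pipeline() {
  local slug="$1" category="$2"
  local dir="v2/scripts/blog"
  local script
  for script in \
    collect-faq-research-data.php \
    generate-faq-questions.php \
    generate-faq-answers-optimized.php; do
    echo "php ${dir}/${script} --post=${slug} --category=${category}"
  done
  # Final stage: apply the staged answers to the post JSON.
  echo "php ${dir}/add-faqs-to-post.php --post=${slug} --category=${category} --replace"
}
```

Calling `print_faq_pipeline SLUG CATEGORY` lists four commands, one per stage, which makes it easy to review the chain before running any of it.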

## Applying FAQs to the post (merge vs replace)

**Script:** `v2/scripts/blog/add-faqs-to-post.php`

**Inputs:**

- `--post={slug}` and `--category={category}` (required).
- `--faqs=/path/to.json` (optional). File may use either a top-level `faqs` array or an `answers` array (as produced by `faq-answers-optimized.json`).
- If `--faqs` is omitted, the script loads `docs/content/blog/posts/{category}/{slug}/data/faq-answers-optimized.json` when that file exists.
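
The input rules above amount to a small precedence check, sketched here in shell. `resolve_faq_input` is a hypothetical helper, not part of the script, and the non-zero return when no file is found is an assumption; this document does not say what the real script does in that case.

```bash
# Sketch of the input-path rule: an explicit --faqs path wins; otherwise fall
# back to the staged faq-answers-optimized.json when it exists.
# ASSUMPTION: resolve_faq_input is illustrative only; returning 1 when no
# input file is available is a guess at reasonable behavior.
resolve_faq_input() {
  local slug="$1" category="$2" explicit="${3:-}"
  local default="docs/content/blog/posts/${category}/${slug}/data/faq-answers-optimized.json"
  if [ -n "$explicit" ]; then
    printf '%s\n' "$explicit"
  elif [ -f "$default" ]; then
    printf '%s\n' "$default"
  else
    return 1
  fi
}
```

Run from the repository root, `resolve_faq_input SLUG CATEGORY` prints the default path only when the staged file is present, and `resolve_faq_input SLUG CATEGORY path/to.json` always prints the explicit path.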

**Modes:**

- **Default (merge):** Loads existing FAQs from the post’s `faqs` array. If `faqs` is empty and `--replace` was **not** passed, it **falls back** to extracting FAQs from `content.html` (legacy). New items from the input file are appended when the normalized question is not already present; duplicates and strong H2 overlaps are skipped.
- **`--replace`:** When the input file yields a non-empty FAQ list, the post’s `faqs` array is **replaced entirely** by that list (still subject to optional logical-flow sort). Use this after you have a full, authoritative `faq-answers-optimized.json` (or equivalent) so you do not accumulate stale merged entries.

**Sorting:** By default, FAQs are reordered with `faq_reorder_by_logical_flow()`. Pass `--no-sort` to preserve order.
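
To make the difference between the two modes concrete, here is a minimal sketch that operates on plain question lists (one question per line) rather than post JSON. The normalization shown (lowercase, keep only letters and digits) is a stand-in, since the script's real rule is not described here. The H2-overlap check, the `content.html` fallback, and the logical-flow sort are not modeled, and keeping the existing list when `--replace` receives an empty input is an assumption.

```bash
# Sketch of merge vs --replace on question lists (one question per line).
# ASSUMPTIONS: normalize_question is a stand-in for the script's real
# normalization; H2-overlap skipping, HTML fallback, and sorting are omitted.
normalize_question() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]'
}

# apply_faqs MODE EXISTING_FILE INCOMING_FILE -> prints the resulting list
apply_faqs() {
  local mode="$1" existing="$2" incoming="$3"
  local seen="|" line key

  if [ "$mode" = "replace" ]; then
    # A non-empty input replaces the list entirely; an empty input leaves
    # the existing list alone (assumed behavior).
    if [ -s "$incoming" ]; then cat "$incoming"; else cat "$existing"; fi
    return 0
  fi

  # Merge: keep every existing question and remember its normalized form.
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue
    printf '%s\n' "$line"
    seen="${seen}$(normalize_question "$line")|"
  done < "$existing"

  # Append incoming questions whose normalized form is not already present.
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue
    key="$(normalize_question "$line")"
    case "$seen" in
      *"|${key}|"*) ;;   # duplicate: skip
      *) printf '%s\n' "$line"; seen="${seen}${key}|" ;;
    esac
  done < "$incoming"
}
```

With an existing list of "What is X?" and an incoming list of "what is x" and "Why Z?", merge mode appends only "Why Z?", while replace mode outputs the incoming list as-is. This is why repeated merges can accumulate stale entries and `--replace` cannot.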

**Recommended edit loops (pick one per change, do not fix the same issue in two places):**

1. **Full pipeline refresh:** Update or regenerate from `faq-research.json` → regenerate questions → regenerate optimized answers → run  
   `php v2/scripts/blog/add-faqs-to-post.php --post={slug} --category={category} --replace`  
   (add `--faqs=...` only when the input file is not at the default path).

2. **Post-publish tweak using pipeline file only:** Edit `faq-answers-optimized.json`, then run  
   `php v2/scripts/blog/add-faqs-to-post.php --post={slug} --category={category} --replace`  
   so the post `faqs` stay aligned with the staged file you maintain.

3. **Emergency fix without a pipeline file:** Prefer adding a small JSON file and running with `--faqs=... --replace`, or follow [blog-json-edit-prohibition.mdc](../../.cursor/rules/blog-json-edit-prohibition.mdc) and use the supported scripts. **Do not** hand-edit `faqs` in post JSON with search-and-replace.
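
Loop 3 could look roughly like the following sketch, which stages a small FAQ file and prints the apply command instead of running it. The file name and FAQ text are placeholders. The only shape taken from this document is a top-level `faqs` array whose entries carry `question` and `answer`; any further fields the script may expect are not covered here.

```bash
# Sketch of loop 3: stage a small FAQ file, then print the apply command.
# ASSUMPTIONS: file name, SLUG, CATEGORY, and the FAQ text are placeholders;
# only the top-level `faqs` array with question/answer comes from this doc.
faq_file="$(mktemp -d)/emergency-faqs.json"

cat > "$faq_file" <<'JSON'
{
  "faqs": [
    {
      "question": "Placeholder question?",
      "answer": "<p>Placeholder answer.</p>"
    }
  ]
}
JSON

# Printed rather than executed so the sketch is safe to paste; remove the
# echo (and fill in SLUG / CATEGORY) to actually apply the file.
echo "php v2/scripts/blog/add-faqs-to-post.php --post=SLUG --category=CATEGORY --faqs=${faq_file} --replace"
```

Because `--replace` swaps the whole `faqs` array, the staged file needs every FAQ the post should keep, not only the corrected one.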

## Anti-patterns

- Treating `faq-questions.json`, `faq-answers-optimized.json`, and post `faqs` as three independent “versions” and editing different FAQs in each.
- Fixing a typo in post `faqs` but leaving `faq-answers-optimized.json` unchanged (or the reverse) and expecting the next `--replace` run not to overwrite your fix.
- Relying on HTML extraction for new work. `add-faqs-to-post.php` reads from HTML only when `faqs` is empty and `--replace` was not passed; **new posts should use the `faqs` array only** (see [FAQ_IMPLEMENTATION.md](FAQ_IMPLEMENTATION.md)).

## FAQ-only sprint (existing post)

After refreshing blog improvement data, use the copy-paste command chain in [FAQ_REWORK_DECISION_TREE.md](FAQ_REWORK_DECISION_TREE.md) (section **Full FAQ refresh**) so collectors, overlap checks, `add-faqs-to-post.php`, and validators run in a consistent order.

## Commands (cheat sheet)

```bash
# Apply optimized answers to post (default path or --faqs)
php v2/scripts/blog/add-faqs-to-post.php --post=SLUG --category=CATEGORY [--faqs=path/to.json] [--replace] [--no-sort]

# FAQ quality (published JSON)
php v2/scripts/blog/validate-faq-quality.php --post=SLUG --category=CATEGORY

# H2 / FAQ overlap
php v2/scripts/blog/check-h2-faq-overlap.php --post=SLUG --category=CATEGORY

# Read-only: post faqs vs faq-answers-optimized.json (when present)
php v2/scripts/blog/audit-faq-source-drift.php --post=SLUG --category=CATEGORY
```

## Relationship to JSON edit rules

- **Body HTML:** Use `update-post-content.php` (see [BLOG_CONTENT_EDIT_WORKFLOW.md](BLOG_CONTENT_EDIT_WORKFLOW.md)).
- **FAQ array on the post:** Apply via `add-faqs-to-post.php` (and related FAQ scripts), not raw multi-file search-replace on `v2/data/blog/posts/**/*.json`. See [.cursor/rules/blog-json-edit-prohibition.mdc](../../.cursor/rules/blog-json-edit-prohibition.mdc).

## Related docs

- [FAQ_BEST_PRACTICES.md](FAQ_BEST_PRACTICES.md), [FAQ_IMPLEMENTATION.md](FAQ_IMPLEMENTATION.md), [BLOG_SCRIPTS_USAGE_GUIDE.md](BLOG_SCRIPTS_USAGE_GUIDE.md)
- [blog-content-flow.mdc](../../.cursor/rules/blog-content-flow.mdc) — FAQs must not live in `content.html`
- [CONTENT_SYSTEM_INDEX.md](CONTENT_SYSTEM_INDEX.md) — hub link
