# Marketing research stack — feature vs industry/static (parity note)

**Last Updated:** 2026-04-02

**Purpose:** Short audit of which JSON/markdown artifacts the **product feature** pipeline produced, compared with the **industry (Branchen)** and **static/site** pipelines, before the VIP parity work, plus **rollback** guidance.

## Before parity (historical)

| Artifact | Feature (`run-feature-page-research-pipeline.sh`) | Industry / static (`run-page-research-pipeline.sh`) |
|----------|---------------------------------------------------|-----------------------------------------------------|
| `data/keywords-sistrix.json` | Yes (`collect-page-keywords-sistrix.php`) | Yes |
| `data/gsc-queries.json` | Chained via `--path` in feature orchestrator | Documented only; not chained |
| `data/sistrix-keyword-serp.json` | Yes (optional `--with-sistrix-serp`; `collect-feature-page-keyword-serp.php`) | **Blocked** — SERP script required `surface=product` |
| `data/sistrix-domain-kw-serp.json` | Optional `--with-sistrix-domain-kw` | Same script supported industry/static; rarely run from thin orchestrator |
| `data/faq-research.json` | Serper when `SERPER_API_KEY` set | Same |
| `competitor-faq-analysis.json` | Optional Firecrawl scrape | **Blocked** — `scrape-competitor-faqs.py` resolved `docs_dir` only for product registry ids |
| `DATA_DRIVEN_SYNTHESIS.generated.md` | `generate-feature-page-data-synthesis.php` (SERP, domain-kw, competitor sections) | `generate-industry-data-synthesis.php` **without** SERP/domain/competitor sections |

## After parity (target)

Industry and static registry pages can use the **same VIP baseline** as features where budget allows:

- `collect-feature-page-keyword-serp.php` accepts registry rows with **`surface` ∈ {`product`, `industry`, `static`}**; all three surfaces use the shared cache and `sistrix_limits.serp_keywords_limit`.
- `bash v2/scripts/marketing-pages/run-page-research-pipeline.sh <id> --with-sistrix-serp` (and optional flags) chains GSC query export, SERP, domain-kw, synthesis, and optional competitor scrape.
- `generate-industry-data-synthesis.php` (and `generate-static-data-synthesis.php` wrapper) ingests SERP / domain-kw / competitor JSON when present.
- `scrape-competitor-faqs.py --page=<registry_id>` resolves any registry row with `docs_dir` + `competitor_urls`.
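
The two eligibility rules above (allowed `surface` values for SERP collection, and `docs_dir` + `competitor_urls` for the competitor scrape) can be sketched as simple predicates. This is a minimal illustration, not the actual script logic: the registry row is assumed to be a plain dict, and the function names `can_collect_serp` and `can_scrape_competitors` are hypothetical.

```python
# Sketch only: the real registry format is not shown in this note.
# A registry row is ASSUMED to be a dict carrying the fields named above.
ALLOWED_SURFACES = {"product", "industry", "static"}


def can_collect_serp(row: dict) -> bool:
    """After parity, SERP collection is allowed for all three surfaces."""
    return row.get("surface") in ALLOWED_SURFACES


def can_scrape_competitors(row: dict) -> bool:
    """The competitor scrape needs a docs_dir and at least one competitor URL."""
    return bool(row.get("docs_dir")) and bool(row.get("competitor_urls"))


if __name__ == "__main__":
    # Hypothetical industry row used only for illustration.
    row = {
        "id": "example-industry-page",
        "surface": "industry",
        "docs_dir": "docs/example",
        "competitor_urls": ["https://example.com/faq"],
    }
    print(can_collect_serp(row), can_scrape_competitors(row))
```

Before parity, the first check would effectively have been `surface == "product"`, which is why industry and static rows were blocked.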

**Utilization gate:** Every new or refreshed collector output must appear in synthesis and/or `KEYWORD_DECISION.md` (or be explicitly deprecated). See [VIP_MARKETING_SEO_DATA_TIERS.md](VIP_MARKETING_SEO_DATA_TIERS.md).
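
One way to check the gate mechanically is sketched below. It assumes that "appears in" means the artifact's file name is mentioned verbatim in the synthesis or `KEYWORD_DECISION.md` text, which is an assumption rather than a rule stated here; the helper name `unreferenced_artifacts` is hypothetical.

```python
# Sketch only: treats "appears in" as a literal file-name mention (an assumption).
from typing import Iterable, List


def unreferenced_artifacts(
    artifacts: Iterable[str],
    documents: Iterable[str],
    deprecated: Iterable[str] = (),
) -> List[str]:
    """Return artifact names mentioned in none of the documents and not deprecated."""
    texts = list(documents)
    skipped = set(deprecated)
    missing = []
    for name in artifacts:
        if name in skipped:
            continue  # explicitly deprecated outputs pass the gate
        if not any(name in text for text in texts):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Illustrative texts; in practice these would be read from
    # DATA_DRIVEN_SYNTHESIS.generated.md and KEYWORD_DECISION.md.
    synthesis = "Sources: sistrix-keyword-serp.json, faq-research.json"
    decision = "Primary keyword chosen from keywords-sistrix.json"
    outputs = [
        "keywords-sistrix.json",
        "sistrix-keyword-serp.json",
        "sistrix-domain-kw-serp.json",
    ]
    print(unreferenced_artifacts(outputs, [synthesis, decision]))
```

An empty result means every collector output is either used or deprecated; a non-empty result lists the outputs that still need a consumer.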

## Rollback

- **Orchestrator:** New steps are **opt-in** (`--with-*` flags). Omit flags to match pre-parity behavior (SISTRIX keywords + Serper + printed GSC instructions only).
- **Git:** Revert the orchestrator/script/doc commits. If you abandon a sprint, leave newly generated `data/sistrix-keyword-serp.json` (and related files) out of your commits.
- **Registry:** Remove pilot `serp_keywords_limit` overrides if credit usage needs to fall back to the script default (8).
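
The registry fallback can be pictured as follows. This is a sketch under assumptions: the override is taken to live under a `sistrix_limits` key on the registry row (inferred from the `sistrix_limits.serp_keywords_limit` path mentioned earlier), and `effective_serp_keywords_limit` is a hypothetical helper, not a function from the scripts.

```python
# Sketch only: the row layout is ASSUMED; the default of 8 is the script
# default stated in this note.
DEFAULT_SERP_KEYWORDS_LIMIT = 8


def effective_serp_keywords_limit(row: dict) -> int:
    """Use the row's pilot override if present, otherwise the script default."""
    limits = row.get("sistrix_limits") or {}
    override = limits.get("serp_keywords_limit")
    if override is None:
        return DEFAULT_SERP_KEYWORDS_LIMIT
    return int(override)


if __name__ == "__main__":
    pilot_row = {"id": "pilot-page", "sistrix_limits": {"serp_keywords_limit": 20}}
    rolled_back_row = {"id": "pilot-page"}  # override removed
    print(effective_serp_keywords_limit(pilot_row))
    print(effective_serp_keywords_limit(rolled_back_row))
```

Removing the override from a row is therefore enough to return that page to the default credit footprint; no script change is needed.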
