# Static / site pages — data collection and reports

**Last Updated:** 2026-04-10

Canonical inventory for SISTRIX, GSC, GA4, Serper PAA, and per-page research artifacts for **Tier A marketing surfaces**: homepage plus `/preise`, `/kunden`, `/ueber-uns`, `/partner`, `/demo-vereinbaren`, **`/veranstaltungen`**. Registry-driven: [`../marketing-pages-registry.json`](../marketing-pages-registry.json) (`surface`: `static`). Inventory: [STATIC_PAGES_INVENTORY.md](STATIC_PAGES_INVENTORY.md).

**VIP static Tier A:** **`homepage`** and **`preise`** (and other registry static ids) may use selective **`keyword.domain.seo` + `kw`** via `collect-marketing-page-domain-kw-serp.php --page=<registry_id>` (~100 credits/keyword, max 5 from `target-keywords.json`) when pricing or positioning sprints need it; always update synthesis + `KEYWORD_DECISION.md` — [VIP_MARKETING_SEO_DATA_TIERS.md](../marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md).

**Scope:** Seven `page_id`s (`homepage`, `preise`, `kunden`, `ueber-uns`, `partner`, `demo-vereinbaren`, `veranstaltungen`). Legal, checkout utilities, and thin campaign LPs are **out of scope** for this pipeline unless added in a later registry phase.

## Security: API keys

- **SISTRIX:** `SISTRIX_API_KEY` or `v2/config/sistrix-api-key.php` (gitignored). Never commit keys.
- **Google (GSC / GA4):** `v2/config/google-api-credentials.php` (same as blog/tools/product/branchen).
- **Serper:** `SERPER_API_KEY` for PAA script only.

## Credit accounting

- Shared log: `v2/data/blog/sistrix-credits-log.json`.
- Portfolio batch (`collect-static-pages-keywords-sistrix.php`) + per-page runs (`collect-page-keywords-sistrix.php`) consume credits; dry-run first when tuning `static-pages-candidate-keywords.json` or `target-keywords.json`.

## Script → output → when to refresh

| Step | Script | Output | Refresh cadence |
|------|--------|--------|-------------------|
| Global static GSC | `v2/scripts/static-pages/collect-static-pages-performance-gsc.php` | `docs/content/pages/static-pages-performance-gsc.json` | Monthly; `--days=N` (default 90) |
| Global GA4 | `v2/scripts/static-pages/collect-static-pages-performance-ga4.php` (uses [`ga4-data-api.php`](../../../../v2/helpers/ga4-data-api.php)) | `docs/content/pages/static-pages-performance-ga4.json` | Same cadence; property `275821028` |
| GSC JSON → per page | `v2/scripts/static-pages/split-static-gsc-to-registry-pages.php` | `{docs_dir}/data/performance-gsc.json` | After global GSC refresh |
| Portfolio SISTRIX | `v2/scripts/static-pages/collect-static-pages-keywords-sistrix.php` | `docs/content/pages/static-pages-keyword-sistrix.json` | Quarterly or when `static-pages-candidate-keywords.json` changes |
| Merge portfolio table | `v2/scripts/static-pages/merge-static-opportunity-data.php` | stdout → paste [STATIC_SITE_OPPORTUNITY_LIST.md](STATIC_SITE_OPPORTUNITY_LIST.md) | After SISTRIX and/or GSC refresh |
| Per-page SISTRIX | `v2/scripts/marketing-pages/collect-page-keywords-sistrix.php --page=<id>` | `{docs_dir}/data/keywords-sistrix.json` | After `data/target-keywords.json` edits |
| SISTRIX SERP top 10 (cheap `keyword.seo`) | `php v2/scripts/product-pages/collect-feature-page-keyword-serp.php --page=<id>` | `{docs_dir}/data/sistrix-keyword-serp.json` | ~1 cr/keyword; cap `sistrix_limits.serp_keywords_limit`; shared cache with product feature collector |
| SISTRIX domain SERP (VIP, optional) | `php v2/scripts/marketing-pages/collect-marketing-page-domain-kw-serp.php --page=<id>` | `{docs_dir}/data/sistrix-domain-kw-serp.json` | e.g. `preise`, `homepage`; see [VIP_MARKETING_SEO_DATA_TIERS.md](../marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md) |
| Synthesis (generated) | `php v2/scripts/marketing-pages/generate-industry-data-synthesis.php --page=<id>` or `generate-static-data-synthesis.php --page=<id>` | `{docs_dir}/DATA_DRIVEN_SYNTHESIS.generated.md` | Includes SERP/domain/competitor sections when JSON exists |
| Serper PAA | `python3 v2/scripts/marketing-pages/serper-paa-research.py --page=<id>` | `{docs_dir}/data/faq-research.json` | Needs `SERPER_API_KEY`; FAQ refresh sprints |
| GSC CSV → JSON (fallback) | `v2/scripts/product-pages/gsc-product-export.php --csv=... --marketing-page=<id>` | same per-page `performance-gsc.json` | If GSC API unavailable |
| GSC queries (single path) | `v2/scripts/tools/collect-tool-gsc-queries.php --path=/preise --output=docs/content/pages/static-pages/preise/data/gsc-queries.json` | per-page `data/gsc-queries.json` | FAQ/meta sprints; any static path via `--path` + `--output` |

**Dry-runs:** Where supported, pass `--dry-run` (portfolio SISTRIX mirrors product/branchen scripts).

### Homepage `/` — keyword, meta & Volle Transparenz FAQ (checklist)

1. Refresh global static GSC + GA4 (table above), then `split-static-gsc-to-registry-pages.php` → `docs/content/pages/homepage/data/performance-gsc.json`.
2. Query-level GSC: `collect-tool-gsc-queries.php --path=/ --output=docs/content/pages/homepage/data/gsc-queries.json` (`--days` or `--start`/`--end` optional).
3. `php v2/scripts/marketing-pages/collect-page-keywords-sistrix.php --page=homepage`; optional `php v2/scripts/product-pages/collect-feature-page-keyword-serp.php --page=homepage`; optional `python3 v2/scripts/marketing-pages/serper-paa-research.py --page=homepage` when `SERPER_API_KEY` is set; optional `bash v2/scripts/marketing-pages/run-page-research-pipeline.sh homepage --with-gsc-queries --with-sistrix-serp --with-synthesis` (add `--with-competitor-faq-scrape` when `FIRECRAWL_API_KEY` is set).
4. `php v2/scripts/marketing-pages/generate-static-data-synthesis.php --page=homepage` → review `docs/content/pages/homepage/DATA_DRIVEN_SYNTHESIS.generated.md`; update `docs/content/pages/homepage/data/target-keywords.json` + `data/KEYWORD_DECISION.md` (no orphan JSON).
5. **Live implementation:** canonical file is [`v2/start-v2.php`](../../../../v2/start-v2.php) (not `landingpage.php`). Meta + WebPage JSON-LD in `<head>` (no FAQPage in `@graph`); **root `/` FAQ copy** in [`v2/data/homepage-faq-items.php`](../../../../v2/data/homepage-faq-items.php) (`ordio_homepage_faq_items()`); **LP card-grid** FAQs stay in [`v2/data/landing-transparency-faq-items.php`](../../../../v2/data/landing-transparency-faq-items.php). Map homepage rows to [`v2/components/site-faq-section.php`](../../../../v2/components/site-faq-section.php) (accordion, section `id="faq"`, **before** `footer.php`). Post-footer [`ordio_echo_homepage_faq_jsonld_script()`](../../../../v2/helpers/faq-jsonld.php) — FAQPage `@id` `https://www.ordio.com/#faq`. LP variants use [`landing-transparency-faq-section.php`](../../../../v2/sections/partials/landing-transparency-faq-section.php) (card grid). Run `php v2/scripts/dev-helpers/audit-faq-jsonld-context.php` after FAQ JSON-LD changes.
6. **Optional VIP:** `php v2/scripts/marketing-pages/collect-marketing-page-domain-kw-serp.php --page=homepage --limit=3` — document credits in `KEYWORD_DECISION.md` ([VIP_MARKETING_SEO_DATA_TIERS.md](../marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md)).

**Reference:** [homepage-documentation.md](../homepage/homepage-documentation.md), [HOMEPAGE_INVENTORY.md](../homepage/HOMEPAGE_INVENTORY.md).

### Pricing page `/preise` — FAQ or keyword sprint (checklist)

1. Refresh global static GSC + GA4 (table above), then `split-static-gsc-to-registry-pages.php` → `preise/data/performance-gsc.json`.
2. Query-level GSC: `collect-tool-gsc-queries.php --path=/preise --output=docs/content/pages/static-pages/preise/data/gsc-queries.json` (`--days` optional).
3. `bash v2/scripts/marketing-pages/run-page-research-pipeline.sh preise --with-gsc-queries --with-sistrix-serp` (add `--with-synthesis` to refresh generated synthesis; Serper if `SERPER_API_KEY` set).
4. Update `preise/data/target-keywords.json` + `KEYWORD_DECISION.md`; keep [`preise/DATA_DRIVEN_SYNTHESIS.md`](preise/DATA_DRIVEN_SYNTHESIS.md) in sync so no orphan JSON.
5. Edit live FAQs in [`v2/data/pricing-faq.json`](../../../../v2/data/pricing-faq.json); run `php v2/scripts/dev-helpers/verify-faq-jsonld-parity.php --file=pricing-faq.json`.

### Partner page `/partner` — FAQ or keyword sprint (checklist)

1. Refresh global static GSC + GA4 (table above), then `split-static-gsc-to-registry-pages.php` → `partner/data/performance-gsc.json`.
2. Query-level GSC: `collect-tool-gsc-queries.php --path=/partner --output=docs/content/pages/static-pages/partner/data/gsc-queries.json` (`--days` or `--start`/`--end` optional).
3. `bash v2/scripts/marketing-pages/run-page-research-pipeline.sh partner --with-gsc-queries --with-sistrix-serp --with-synthesis` (add `--with-competitor-faq-scrape` when `FIRECRAWL_API_KEY` is set; Serper PAA when `SERPER_API_KEY` is set).
4. Update `partner/data/target-keywords.json` + `KEYWORD_DECISION.md`; review `partner/DATA_DRIVEN_SYNTHESIS.generated.md` so new JSON is not orphaned.
5. Edit live FAQs in [`v2/data/misc-faqs/static_partner.json`](../../../../v2/data/misc-faqs/static_partner.json); run `php v2/scripts/dev-helpers/verify-faq-jsonld-parity.php --file=misc-faqs/static_partner.json` and `php v2/scripts/dev-helpers/audit-faq-json-internal-links.php '--glob=v2/data/misc-faqs/static_partner.json'`.
6. **Optional VIP:** `php v2/scripts/marketing-pages/collect-marketing-page-domain-kw-serp.php --page=partner --limit=3` — document credits in `KEYWORD_DECISION.md` ([VIP_MARKETING_SEO_DATA_TIERS.md](../marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md)).

### Kunden page `/kunden` — FAQ or keyword sprint (checklist)

1. Refresh global static GSC + GA4 (table above), then `split-static-gsc-to-registry-pages.php` → `kunden/data/performance-gsc.json`.
2. Query-level GSC: `collect-tool-gsc-queries.php --path=/kunden --output=docs/content/pages/static-pages/kunden/data/gsc-queries.json` (`--days` or `--start`/`--end` optional).
3. `bash v2/scripts/marketing-pages/run-page-research-pipeline.sh kunden --with-gsc-queries --with-sistrix-serp --with-synthesis` (add `--with-competitor-faq-scrape` when `FIRECRAWL_API_KEY` is set; Serper PAA when `SERPER_API_KEY` is set).
4. Update `kunden/data/target-keywords.json` + `KEYWORD_DECISION.md`; review `kunden/DATA_DRIVEN_SYNTHESIS.generated.md` and keep `data/faq-research.json` / `data/competitor-faq-analysis.json` in sync (no orphan JSON).
5. Edit live FAQs in [`v2/data/misc-faqs/static_customers_new.json`](../../../../v2/data/misc-faqs/static_customers_new.json); run `php v2/scripts/dev-helpers/verify-faq-jsonld-parity.php --file=misc-faqs/static_customers_new.json` and `php v2/scripts/dev-helpers/audit-faq-json-internal-links.php '--glob=v2/data/misc-faqs/static_customers_new.json'`.
6. **Optional VIP:** `php v2/scripts/marketing-pages/collect-marketing-page-domain-kw-serp.php --page=kunden --limit=3` — document credits in `KEYWORD_DECISION.md` ([VIP_MARKETING_SEO_DATA_TIERS.md](../marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md)).

### Events & Messen `/veranstaltungen` — FAQ or keyword sprint (checklist)

1. Refresh global static GSC + GA4 (table above), then `split-static-gsc-to-registry-pages.php` → `veranstaltungen/data/performance-gsc.json`.
2. Query-level GSC: `collect-tool-gsc-queries.php --path=/veranstaltungen --output=docs/content/pages/static-pages/veranstaltungen/data/gsc-queries.json` (`--days` or `--start`/`--end` optional).
3. `bash v2/scripts/marketing-pages/run-page-research-pipeline.sh veranstaltungen --with-gsc-queries --with-sistrix-serp --with-synthesis` (add `--with-competitor-faq-scrape` when `FIRECRAWL_API_KEY` is set and competitor URLs in registry are useful; Serper PAA when `SERPER_API_KEY` is set).
4. Update `veranstaltungen/data/target-keywords.json` + [`veranstaltungen/KEYWORD_DECISION.md`](veranstaltungen/KEYWORD_DECISION.md); review `veranstaltungen/DATA_DRIVEN_SYNTHESIS.generated.md` so new JSON is not orphaned.
5. Edit live FAQs in [`v2/data/misc-faqs/static_events.json`](../../../../v2/data/misc-faqs/static_events.json); run `php v2/scripts/dev-helpers/verify-faq-jsonld-parity.php --file=misc-faqs/static_events.json` and `php v2/scripts/dev-helpers/audit-faq-json-internal-links.php '--glob=v2/data/misc-faqs/static_events.json'`.
6. **Optional VIP:** `php v2/scripts/marketing-pages/collect-marketing-page-domain-kw-serp.php --page=veranstaltungen --limit=3` — document credits in `KEYWORD_DECISION.md` ([VIP_MARKETING_SEO_DATA_TIERS.md](../marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md)).

**Page doc:** [EVENTS_PAGE_COMPONENTS.md](EVENTS_PAGE_COMPONENTS.md) (FAQ placement + validators).

### Über uns `/ueber-uns` — FAQ or keyword sprint (checklist)

1. Refresh global static GSC + GA4 (table above), then `split-static-gsc-to-registry-pages.php` → `ueber-uns/data/performance-gsc.json`.
2. Query-level GSC: `collect-tool-gsc-queries.php --path=/ueber-uns --output=docs/content/pages/static-pages/ueber-uns/data/gsc-queries.json` (`--days` or `--start`/`--end` optional).
3. `bash v2/scripts/marketing-pages/run-page-research-pipeline.sh ueber-uns --with-gsc-queries --with-sistrix-serp --with-synthesis` (add `--with-competitor-faq-scrape` when `FIRECRAWL_API_KEY` is set; Serper PAA when `SERPER_API_KEY` is set).
4. Update `ueber-uns/data/target-keywords.json` + `ueber-uns/data/KEYWORD_DECISION.md`; review `ueber-uns/DATA_DRIVEN_SYNTHESIS.generated.md` and keep `data/faq-research.json` in sync (no orphan JSON).
5. Edit live FAQs in [`v2/data/misc-faqs/static_ueber_uns.json`](../../../../v2/data/misc-faqs/static_ueber_uns.json); run `php v2/scripts/dev-helpers/verify-faq-jsonld-parity.php --file=misc-faqs/static_ueber_uns.json`, `php v2/scripts/dev-helpers/audit-faq-json-internal-links.php '--glob=v2/data/misc-faqs/static_ueber_uns.json'`, and `php v2/scripts/dev-helpers/audit-marketing-faq-ssot.php`.
6. **Optional VIP:** `php v2/scripts/marketing-pages/collect-marketing-page-domain-kw-serp.php --page=ueber-uns --limit=3` — document credits in `KEYWORD_DECISION.md` ([VIP_MARKETING_SEO_DATA_TIERS.md](../marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md)).

**Page doc:** [company-pages-documentation.md](company-pages-documentation.md) § Über Ordio (structure + FAQ SSOT).

## Homepage `/` caveat (GSC / GA4 / merge)

- **GSC:** Homepage uses **property URL equality** for `https://www.ordio.com/` (and normalized path `/`) so row keys align with split + merge. Other paths use **contains** filters with an allowlist.
- **GA4:** `pagePath` for homepage is **`/`** (EXACT); subpages use CONTAINS.
- **Merge:** `merge-static-opportunity-data.php` normalizes homepage GSC lookup keys (trailing slash / empty path → `/`) so portfolio keywords mapped to `homepage` pick up path-level clicks.

## Continuous improvement

- **Cadence:** Monthly GSC/GA refresh; quarterly SISTRIX portfolio + per-page reruns when targets change; PAA when Serper is available.
- **Loop:** Refresh global JSON → update `data/KEYWORD_DECISION.md` with evidence → editorial hypotheses for meta/FAQ/copy → ship → re-check CTR/position next cycle.
- **Cannibalization:** Document primary ownership in `KEYWORD_DECISION.md`; defer conflicting head terms to product/industry registry pages when appropriate.

## Related

- [MARKETING_RESEARCH_STACK_PARITY.md](../marketing-pages/MARKETING_RESEARCH_STACK_PARITY.md) — VIP parity + rollback  
- [STATIC_SITE_OPPORTUNITY_LIST.md](STATIC_SITE_OPPORTUNITY_LIST.md)  
- [STATIC_SITE_SEO_IMPROVEMENT_BACKLOG.md](STATIC_SITE_SEO_IMPROVEMENT_BACKLOG.md)  
- [homepage-documentation.md](../homepage/homepage-documentation.md)  
- [.cursor/rules/marketing-pages-seo-data.mdc](../../../.cursor/rules/marketing-pages-seo-data.mdc)  
- [VIP_MARKETING_SEO_DATA_TIERS.md](../marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md)
