# Blog Scripts Usage Guide

**Last Updated:** 2026-04-04

Comprehensive guide to all blog-related scripts in `v2/scripts/blog/`.

## Quick Reference

### Pipeline entrypoints (data collection)

| Script | When to use |
|--------|-------------|
| `run-new-post-pipeline.php` | After `create-new-blog-post.php`; no GA4/GSC. Fills `docs/.../data/` for outline and skyscraper. After SISTRIX, PAA+SERP+competition+intent run in parallel; `--dry-run` prints DAG. |
| `run-post-improvement-pipeline.php` | Existing post refresh: backup, GA4/GSC, derive keywords, then **same parallel SISTRIX batch** as new-post (`blog-pipeline-parallel.inc.php`). `--dry-run`, `--skip-paa`, `--allow-paa-failure`, Firecrawl search flags match new-post. |
| `validate-data-pipeline-order.php` | Before **manual** `collect-faq-research-data.php`: checks `keywords-sistrix.json`, `serp-features.json`, and PAA file(s) exist under `docs/.../data/`. |
| `run-all-data-collection.php` | Bulk or scheduled collection across many posts (see `blog-data-collection.mdc`). |

**Unified checklist (blog outputs + templates/tools):** [CONTENT_CREATION_DATA_CHECKLIST.md](../CONTENT_CREATION_DATA_CHECKLIST.md).
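
The preflight check that `validate-data-pipeline-order.php` performs can be approximated in shell for a quick manual look. This is a minimal sketch, assuming the post's data directory is passed as an argument. The function name `check_pipeline_inputs` is hypothetical, and the PAA file(s) are left out because their filenames are not listed here:

```bash
# Hypothetical helper: report which required pipeline files are missing.
# Usage: check_pipeline_inputs <path-to-post-data-dir>
check_pipeline_inputs() {
  local data_dir="$1" missing=0 file
  for file in keywords-sistrix.json serp-features.json; do
    if [ ! -f "$data_dir/$file" ]; then
      echo "missing: $file"
      missing=1
    fi
  done
  # Returns 0 when both files exist, 1 otherwise.
  return "$missing"
}
```

The function only covers the two files named above; the PHP script remains the authoritative gate.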

### New Post Creation

- `create-new-blog-post.php` - Scaffold post JSON, docs dir, `target-keywords.json` (author: Hady)
- `run-new-post-pipeline.php` - Data collection for new posts (no GA4/GSC)
- `generate-blog-featured-image.py` - AI featured image generation (Gemini 2.5 Flash Image)
- `suggest-related-posts.php` - Suggest related posts for carousel
- `add-new-post-to-related-carousels.php` - Add new post to existing posts' `related_posts`
- `validate-new-post.php` - Validate new post before publishing

**Usage:** See [BLOG_POST_IMPROVEMENT_PROCESS.md](BLOG_POST_IMPROVEMENT_PROCESS.md) "New Post Creation" section and `.cursor/rules/blog-new-post-creation.mdc`.

### Outline and skyscraper (data → outline)

- `synthesize-outline-scaffold.php` - Builds `data/outline-scaffold.generated.md` from `competitive-depth-analysis.md`, `competitor-analysis.json`, PAA sources, `search-intent.json`. Merge into `CONTENT_OUTLINE.md` manually.
- `generate-section-briefs.php` - Per-H2 briefs in `data/section-briefs.md` from outline + pipeline data. **Target words:** uses the outline target if set, otherwise **100%** of the competitive-depth recommended target (not 80%).
- `validate-content-outline-quality.php` - Outline vs competitive depth, PAA coverage hints, Evidence fields. **`--strict-evidence`** blocks when there are too few `**Evidence:**` rows.
- `validate-serp-outline-ready.php` - SERP + outline readiness gate.
- `check-outline-h2-overlap.php` - Overlapping H2 detection.

### Post-body validation (AEO / GEO / competitive)

- `compare-content-to-competitors.php` - Word count, H2/H3, tables/lists vs competitors; gap list. **`--strict`** exits 1 on WARN or below 80% of the recommended target.
- `validate-aeo-capsules.php` - Checks the length of the first `<p>` after each question-style H2 (warns by default; **`--strict`** exits 1 on warnings).
- `validate-aeo-capsule-diversity.php` - Flags overuse of the **„In Kürze:“** template label (warn by default; **`--strict`** in `blog-post-validate-strict`). See [CONTENT_DEPTH_GUIDELINES.md](CONTENT_DEPTH_GUIDELINES.md) (AEO capsules without repetitive labels).
- `validate-geo-citability.php` - Definition signal, Fazit H2, FAQ count (lexikon/ratgeber). **`--strict`** fails on warnings too.
- `audit-blog-repetitive-phrasing.py` - Scans all `v2/data/blog/posts/**/*.json` for label fatigue + soft AI-tell phrases; optional `--csv=path`, `--include-drafts`, `--fail-on high|medium`.

**Makefile:** `make blog-outline-gate POST=slug CAT=lexikon`, `make blog-post-validate POST=slug CAT=lexikon`, **`make blog-post-validate-strict`**, and **`make audit-blog-repetition`** — see repo `Makefile`. **Skyscraper / competitive ratio:** there is no separate `make blog-skyscraper-*` target—`blog-post-validate` already includes `compare-content-to-competitors` + `validate-content-completeness`; use **`blog-post-validate-strict`** for `compare-content --strict`. See [CI_BLOG_CONTENT_GUARDRAILS.md](CI_BLOG_CONTENT_GUARDRAILS.md).

### Quality Checks

- `weekly-quality-check.php` - Weekly quality audits (FAQ, links, schema, freshness)
- `audit-faq-quality.php` - FAQ quality audit
- `analyze-link-quality.php` - Link quality analysis
- `validate-link-quality.php` - Validate link quality

### Keyword workflow (SISTRIX / targets)

- `collect-post-keywords-sistrix.php` - SISTRIX metrics + ideas → `keywords-sistrix.json`. Use `--primary-only` before finalizing secondaries in `target-keywords.json`. See [KEYWORD_RESEARCH_WORKFLOW.md](KEYWORD_RESEARCH_WORKFLOW.md).
- `propose-secondary-keywords.php` - Rank secondary keywords from existing `keywords-sistrix.json` (optional `--write` to merge into `target-keywords.json`).
- `derive-target-keywords.php` - Primary/secondary from GSC; use `--from-sistrix` when `performance-gsc.json` is missing.
- `validate-keyword-decision.php` - Validates `KEYWORD_DECISION.md` (warnings; `--strict` for CI-style failure).

### Data Collection

- `collect-post-performance-ga4.php` - Collect GA4 data for posts
- `collect-post-performance-gsc.php` - Collect GSC data for posts
- `pull-sistrix-data.php` - Collect SISTRIX data
- `run-all-data-collection.php` - Run all data collection scripts
- `weekly-priority-refresh.php` - Weekly priority refresh (GA4/GSC updates)

### Content Edit

- `update-post-content.php` - Canonical script for updating blog post content; replaces `content.html`, regenerates `content.text` and `word_count`
- `fix-blog-table-styling.php` - Strip Tailwind from post tables, wrap in `table-breakout-wrapper`, restore Ordio blue header styling. `--post=` + `--category=` or `--all [--backup]`. See [BLOG_TABLE_FORMAT.md](BLOG_TABLE_FORMAT.md); `validate-new-post.php` warns on noncompliant tables.
- `apply-and-validate-post.sh` - Runs `update-post-content.php` then `make blog-post-validate` (or **`--strict`** → `blog-post-validate-strict`). Example: `./v2/scripts/blog/apply-and-validate-post.sh --post=slug --category=lexikon --html=docs/content/blog/posts/lexikon/slug/content-draft.html`
- `sync-post-content-text.php` - Regenerate `content.text` and `word_count` from existing `content.html` (when HTML was edited outside `update-post-content.php`)

**Usage:** See [BLOG_CONTENT_EDIT_WORKFLOW.md](BLOG_CONTENT_EDIT_WORKFLOW.md). Never edit post JSON directly. Use **one** tracked `content-draft.html` per post; avoid `*-expanded` / `*-improved` filenames in git ([CONTENT_DRAFT_LEGACY_INVENTORY.md](CONTENT_DRAFT_LEGACY_INVENTORY.md)).

**Makefile shortcuts (ergonomics):** [BLOG_WORKFLOW_EFFICIENCY.md](BLOG_WORKFLOW_EFFICIENCY.md) — `make blog-apply-validate` / `blog-apply-validate-strict` with `POST`, `CAT`, `HTML=` path; `make blog-post-validate-faq` when only FAQs changed.

### FAQ Management

- `add-faqs-to-post.php` - Add FAQs to blog post (updates `faqs` array only, sorts by logical flow by default)
- `audit-faq-source-drift.php` - Read-only compare post `faqs` vs `data/faq-answers-optimized.json`
- `reorder-faqs-by-logical-flow.php` - Reorder existing FAQs by logical flow (definition first, then how-to, etc.)
- `remove-faqs-from-content.php` - Remove FAQ sections from `content.html` (cleanup script)
- `generate-faq-questions.php` - Generate FAQ questions
- `generate-faq-answers-ai.php` - Generate FAQ answers using AI
- `add-faq-links.php` - Add internal links to FAQ answers
- `automate-faq-enhancement.php` - Automate FAQ enhancement
- `expand-short-faq-answers.php` - Expand short FAQ answers
- `condense-faq-answers.php` - Condense long FAQ answers
- `prioritize-faq-improvements.php` - Prioritize FAQ improvements

**FAQ pipeline vs published JSON:** What visitors and schema use is **`faqs` in** `v2/data/blog/posts/{category}/{slug}.json`. Under `docs/content/blog/posts/.../data/`, the files `faq-research.json` → `faq-questions.json` → `faq-answers-optimized.json` are **pipeline stages**—do not maintain them as three independent “sources of truth.” After substantive FAQ work, run `validate-faq-quality.php` and optionally **`audit-faq-source-drift.php`** (read-only compare of post `faqs` vs `faq-answers-optimized.json`). Full model: [FAQ_SOURCE_OF_TRUTH.md](FAQ_SOURCE_OF_TRUTH.md).

### Link Management

- `add-bidirectional-links.php` - Add bidirectional in-content links (config-driven)
- `add-links-to-json.php` - Add internal links to posts
- `add-pillar-links.php` - Add pillar page links
- `add-tools-links.php` - Add tools page links
- `remove-stop-word-links.php` - Remove stop word links
- `fix-malformed-links.php` - Fix malformed links
- `suggest-new-links.php` - Suggest new internal links
- `suggest-contextual-links.php` - Per-post tool/industry/lexikon suggestions (output: `data/suggested-contextual-links.json`); use `--all` to audit all pre-existing posts
- `generate-blog-lexikon-mapping.php` - Regenerate topic-to-lexikon mapping (run after adding new lexikon posts)
- `audit-blog-tool-links.php` - Audit tool link gaps
- `audit-blog-industry-links.php` - Audit industry link gaps
- `audit-blog-lexikon-links.php` - Audit lexikon-to-lexikon link gaps
- `add-contextual-links-from-suggestions.php` - Add suggested links in bulk (run `suggest-contextual-links.php --all` first)

### Content Analysis

- `analyze-post-seo.php` - Analyze post SEO
- `analyze-post-content.php` - Analyze post content
- `analyze-post-links.php` - Analyze post links
- `analyze-cluster-relationships.php` - Analyze cluster relationships

### Priority & Reporting

- `calculate-comprehensive-priority.php` - Calculate post priority scores
- `generate-priority-dashboard.php` - Generate priority dashboard
- `generate-review-priority-list.php` - Generate review priority list
- `generate-automated-reports.php` - Generate automated reports

### Validation

- `validate-post-dates.php` - Validate post dates
- `validate-faq-schema.php` - Validate FAQ schema
- `validate-link-quality.php` - Validate link quality
- `validate-seo-meta.php` - Validate SEO meta tags
- `validate-content-integrity.php` - Validate content integrity

## Detailed Script Documentation

### Weekly Quality Check

**File:** `weekly-quality-check.php`

**Purpose:** Automates weekly quality checks for blog posts

**Usage:**

```bash
php v2/scripts/blog/weekly-quality-check.php [--email]
```

**What it does:**

- Audits FAQ quality
- Checks link health
- Validates schema
- Checks content freshness

**Output:** Report saved to `docs/content/blog/reports/weekly-quality-check-YYYY-MM-DD.md`

**Dependencies:**

- `audit-faq-quality.php`
- `analyze-link-quality.php`
- `blog-template-helpers.php`

---

### Weekly Priority Refresh

**File:** `weekly-priority-refresh.php`

**Purpose:** Updates GA4/GSC data and recalculates priority scores

**Usage:**

```bash
php v2/scripts/blog/weekly-priority-refresh.php [--limit=N] [--dry-run]
```

**What it does:**

- Updates GA4 data
- Updates GSC data
- Recalculates priority scores
- Generates priority report

**Dependencies:**

- `collect-post-performance-ga4.php`
- `collect-post-performance-gsc.php`
- `calculate-comprehensive-priority.php`

---

### Link Quality Analysis

**File:** `analyze-link-quality.php`

**Purpose:** Analyzes anchor text quality and link relevance

**Usage:**

```bash
php v2/scripts/blog/analyze-link-quality.php [--output=report.md]
```

**What it does:**

- Extracts all links from posts
- Checks for stop words in anchor text
- Identifies too-short anchor text
- Identifies generic anchor text
- Generates quality report

**Output:** Report saved to `docs/content/blog/LINK_QUALITY_ANALYSIS.md`

**Dependencies:**

- `blog-template-helpers.php`

---

### Update Post Content

**File:** `update-post-content.php`

**Purpose:** Canonical script for updating blog post content. Never edit `content.html` directly in JSON files; use this script to avoid JSON escaping errors, Unicode corruption, and sync issues.

**Usage:**

```bash
# Replace from file
php v2/scripts/blog/update-post-content.php --post=slug --category=lexikon --html=path/to/content.html

# Replace from stdin
php v2/scripts/blog/update-post-content.php --post=slug --category=lexikon --stdin

# Dry-run (validate only, no write)
php v2/scripts/blog/update-post-content.php --post=slug --category=lexikon --html=... --dry-run

# With backup before write
php v2/scripts/blog/update-post-content.php --post=slug --category=lexikon --html=... --backup
```

**What it does:**

- Loads post JSON via `json_decode()`
- Replaces `content.html` with provided HTML
- Regenerates `content.text` from `strip_tags(html)`
- Recomputes `word_count`
- Updates `modified_date`
- Writes with `JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES`
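
A dry run followed by a backed-up write chains the documented flags safely. This is a minimal sketch that assumes the script exits non-zero when the dry run fails (not confirmed here). The wrapper name `safe_update_post` and the `PHP_BIN` override are hypothetical, introduced only for illustration:

```bash
# Hypothetical wrapper: validate with --dry-run, then write with --backup.
# PHP_BIN can be overridden (for example in tests); it defaults to `php`.
safe_update_post() {
  local slug="$1" category="$2" html="$3"
  local script="v2/scripts/blog/update-post-content.php"
  local php="${PHP_BIN:-php}"

  # Assumption: a failed dry run returns a non-zero exit code, so stop here
  # before anything is written.
  "$php" "$script" --post="$slug" --category="$category" --html="$html" --dry-run || return 1

  # Dry run passed: write the change and keep a backup.
  "$php" "$script" --post="$slug" --category="$category" --html="$html" --backup
}
```

Called as `safe_update_post slug lexikon path/to/content.html`, it runs the same two commands shown in the usage block, in order.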

**See:** [BLOG_CONTENT_EDIT_WORKFLOW.md](BLOG_CONTENT_EDIT_WORKFLOW.md)

---

### Sync Post Content Text

**File:** `sync-post-content-text.php`

**Purpose:** Regenerate `content.text` and `word_count` from existing `content.html` without replacing HTML. Use when HTML was edited outside `update-post-content.php`.

**Usage:**

```bash
# Single post
php v2/scripts/blog/sync-post-content-text.php --post=slug --category=lexikon

# All posts
php v2/scripts/blog/sync-post-content-text.php --all

# Dry-run
php v2/scripts/blog/sync-post-content-text.php --post=slug --category=lexikon --dry-run
```

**See:** [BLOG_CONTENT_EDIT_WORKFLOW.md](BLOG_CONTENT_EDIT_WORKFLOW.md)

---

### FAQ Quality Audit

**File:** `audit-faq-quality.php`

**Purpose:** Audits FAQ quality across all posts

**Usage:**

```bash
php v2/scripts/blog/audit-faq-quality.php
```

**What it checks:**

- Answer length (too short/long)
- Keyword integration
- Internal links in FAQs
- FAQ count per post

**Output:** Report saved to `docs/content/blog/FAQ_QUALITY_AUDIT.md`

---

### GA4 Data Collection

**File:** `collect-post-performance-ga4.php`

**Purpose:** Collects GA4 performance data for blog posts

**Usage:**

```bash
php v2/scripts/blog/collect-post-performance-ga4.php [--post=slug] [--category=category] [--all] [--limit=N] [--dry-run]
```

**What it collects:**

- Page views (last 90 days, last year)
- Sessions
- Bounce rate
- Average engagement time

**Dependencies:**

- Google API credentials (`v2/config/google-api-credentials.php`)
- Composer dependencies

---

### GSC Data Collection

**File:** `collect-post-performance-gsc.php`

**Purpose:** Collects Google Search Console data for blog posts

**Usage:**

```bash
php v2/scripts/blog/collect-post-performance-gsc.php [--post=slug] [--category=category] [--all] [--limit=N] [--dry-run]
```

**What it collects:**

- Clicks
- Impressions
- Average position
- CTR

**Dependencies:**

- Google API credentials (`v2/config/google-api-credentials.php`)
- Composer dependencies

---

### Add Bidirectional Links

**File:** `add-bidirectional-links.php`

**Purpose:** Adds bidirectional in-content links to target posts. Config-driven; replaces per-post scripts (`add-recruiting-links.php`, `add-cost-per-hire-links.php`, `add-mitarbeitergespraech-links.php`, etc.) for standard search/replace patterns.

**Usage:**

```bash
php v2/scripts/blog/add-bidirectional-links.php --post=slug --category=lexikon [--dry-run]
php v2/scripts/blog/add-bidirectional-links.php --post=recruiting --category=lexikon --config=/path/to/bidirectional-links.json
```

**What it does:**

- Loads config from `docs/content/blog/posts/{category}/{slug}/data/bidirectional-links.json`
- Applies content replacements via `update-post-content.php --backup`
- Applies FAQ replacements via direct JSON edit
- Skips targets where link already present

**Config:** See [BIDIRECTIONAL_LINKS_CONFIG.md](BIDIRECTIONAL_LINKS_CONFIG.md).

**Dependencies:**

- `update-post-content.php`

---

### Add Internal Links

**File:** `add-links-to-json.php`

**Purpose:** Adds internal links to blog posts

**Usage:**

```bash
php v2/scripts/blog/add-links-to-json.php [--post=slug] [--category=category] [--all] [--dry-run]
```

**What it does:**

- Detects keywords in content
- Adds contextual internal links
- Validates anchor text quality
- Preserves existing links

**Dependencies:**

- `link_utils.php`
- `blog-template-helpers.php`

---

### Generate FAQ Questions

**File:** `generate-faq-questions.php`

**Purpose:** Generates FAQ questions for blog posts

**Usage:**

```bash
php v2/scripts/blog/generate-faq-questions.php [--post=slug] [--category=category] [--all]
```

**What it does:**

- Analyzes post content
- Generates relevant FAQ questions
- Saves to post JSON file

**Dependencies:**

- OpenAI API key (`v2/config/openai-api-key.php`)
- `blog-template-helpers.php`

---

### Generate FAQ Answers (AI)

**File:** `generate-faq-answers-ai.php`

**Purpose:** Generates FAQ answers using AI

**Usage:**

```bash
php v2/scripts/blog/generate-faq-answers-ai.php [--post=slug] [--category=category] [--all]
```

**What it does:**

- Generates answers for existing FAQ questions
- Uses OpenAI API
- Validates answer quality
- Saves to post JSON file

**Dependencies:**

- OpenAI API key (`v2/config/openai-api-key.php`)
- `blog-template-helpers.php`

---

### Add FAQs to Post

**File:** `add-faqs-to-post.php`

**Purpose:** Adds FAQs to a blog post JSON file (updates the `faqs` array only and removes FAQ sections from the HTML). **Sorts FAQs by logical flow by default** (definition first, then how-to, etc.). Use `--no-sort` to preserve input order.

**Usage:**

```bash
php v2/scripts/blog/add-faqs-to-post.php --post=slug --category=category [--faqs=faqs.json] [--replace] [--no-sort]
```

**What it does:**

- Loads FAQs from `--faqs` if set; otherwise uses `docs/content/blog/posts/{category}/{slug}/data/faq-answers-optimized.json` when it exists
- **Default:** merges into existing `faqs` (skips duplicate questions / strong H2 overlap); if `faqs` is empty, may extract from HTML (legacy)
- **`--replace`:** replaces `faqs` with the loaded set (recommended after editing the optimized file so published JSON matches the pipeline artifact)
- Updates `faqs` array in JSON
- **Removes FAQ sections from HTML content** (prevents duplication)
- Updates `modified_date`

**Important:** FAQs are stored only in the `faqs` array, NOT in `content.html`. The template renders FAQs separately via the `BlogFAQ.php` component.

**Example:**

```bash
php v2/scripts/blog/add-faqs-to-post.php --post=recap-webinar-sv-pruefung --category=ratgeber --faqs=/tmp/faqs.json
```
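
After editing the optimized file, `--replace` and the read-only drift audit can be chained so the published `faqs` are checked right after they are written. This is a minimal sketch using only the documented flags. The function name `publish_faqs` and the `PHP_BIN` override are hypothetical:

```bash
# Hypothetical wrapper: replace published faqs from the pipeline file,
# then run the read-only drift audit on the same post.
publish_faqs() {
  local slug="$1" category="$2" php="${PHP_BIN:-php}"

  # Without --faqs, the script falls back to faq-answers-optimized.json.
  "$php" v2/scripts/blog/add-faqs-to-post.php --post="$slug" --category="$category" --replace || return 1

  # The audit's exit code becomes the function's exit code.
  "$php" v2/scripts/blog/audit-faq-source-drift.php --post="$slug" --category="$category"
}
```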

**Dependencies:**

- `blog-template-helpers.php`

**See:** [FAQ_SOURCE_OF_TRUTH.md](FAQ_SOURCE_OF_TRUTH.md), `docs/content/blog/FAQ_CONTENT_SEPARATION_FIX.md`

---

### Audit FAQ source drift (read-only)

**File:** `audit-faq-source-drift.php`

**Purpose:** Compares published `faqs` in the post JSON with `docs/content/blog/posts/{category}/{slug}/data/faq-answers-optimized.json` (when present). Normalizes questions the same way as merge logic; compares a hash of stripped answer text. **Does not modify files.**

**Usage:**

```bash
php v2/scripts/blog/audit-faq-source-drift.php --post=slug --category=category
```

**Exit codes:** `0` = OK or SKIP (no pipeline file); `1` = DRIFT; `2` = error.
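
These exit codes are easy to branch on in CI or wrapper scripts. A minimal sketch follows. The function name `describe_drift_status` is hypothetical and the messages are illustrative, not output of the audit script itself:

```bash
# Hypothetical helper: map the audit's documented exit codes to a message.
describe_drift_status() {
  case "$1" in
    0) echo "ok: no drift, or no pipeline file to compare" ;;
    1) echo "drift: published faqs differ from faq-answers-optimized.json" ;;
    2) echo "error: audit could not run" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

# Usage (commented out so the snippet has no side effects):
# php v2/scripts/blog/audit-faq-source-drift.php --post=slug --category=category
# describe_drift_status "$?"
```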

**See:** [FAQ_SOURCE_OF_TRUTH.md](FAQ_SOURCE_OF_TRUTH.md)

---

### Remove FAQs from Content HTML

**File:** `remove-faqs-from-content.php`

**Purpose:** Removes FAQ sections from `content.html` while preserving FAQs in `faqs` array

**Usage:**

```bash
# Dry run (test without modifying files)
php v2/scripts/blog/remove-faqs-from-content.php --dry-run

# Remove FAQs from all posts
php v2/scripts/blog/remove-faqs-from-content.php

# Remove FAQs from specific post
php v2/scripts/blog/remove-faqs-from-content.php --post=slug --category=category
```

**What it does:**

- Scans blog posts for FAQ sections in HTML
- Removes FAQ headings (`<h2>FAQ</h2>` or `<h3>FAQ</h3>`)
- Removes FAQ wrapper divs (`<div class="schema-faq wp-block-yoast-faq-block">`)
- Removes individual FAQ sections (`<div class="schema-faq-section">`)
- Preserves FAQs in `faqs` array
- Updates `modified_date`

**Use cases:**

- Cleanup after migration
- Fix posts with FAQs incorrectly in HTML
- Ensure proper separation of content and FAQs

**Example:**

```bash
# Check which posts have FAQs in HTML
php v2/scripts/blog/remove-faqs-from-content.php --dry-run

# Remove FAQs from all posts
php v2/scripts/blog/remove-faqs-from-content.php
```

**Dependencies:**

- `blog-template-helpers.php`

**See:** `docs/content/blog/FAQ_CONTENT_SEPARATION_FIX.md` for details

---

## Script Categories

### Analysis Scripts

- `analyze-post-seo.php`
- `analyze-post-content.php`
- `analyze-post-links.php`
- `analyze-link-quality.php`
- `analyze-cluster-relationships.php`
- `analyze-anchor-text.py`
- `analyze-content-context.php`
- `analyze-content-structure.py`
- `analyze-heading-hierarchy.py`
- `analyze-srcset-patterns.py`
- `analyze-sistrix-data.py`
- `analyze-seo-meta.php`

### Audit Scripts

- `audit-faq-quality.php`
- `audit-faq-inventory.php`
- `audit-linked-words.php`
- `audit-malformed-links.php`
- `audit-post-dates.php`
- `audit-seo-practices.py`
- `audit-content-changes.py`
- `audit-image-files.py`
- `audit-image-mapping.py`
- `audit-image-references.py`
- `audit-physical-images.py`
- `audit-partial-word-links.py`

### Collection Scripts

- `collect-post-performance-ga4.php`
- `collect-post-performance-gsc.php`
- `collect-post-keywords-sistrix.php`
- `collect-post-serp-data.php`
- `collect-post-serp-features.php`
- `collect-post-search-intent.php`
- `collect-post-competition-levels.php`
- `collect-faq-research-data.php`
- `collect-high-value-serp-data.php`
- `collect-missing-keywords.php`
- `collect-all-missing-keywords.php`
- `collect-competitor-keywords.php`
- `collect-domain-backlinks.php`
- `collect-domain-content-ideas.php`
- `collect-domain-level-sistrix.php`
- `collect-domain-opportunities.php`
- `pull-sistrix-data.php`
- `run-all-data-collection.php`
- `run-all-advanced-collection.php`

### Fix Scripts

- `fix-malformed-links.php`
- `fix-post-dates.php`
- `fix-problematic-links.php`
- `fix-partial-word-links.py`
- `fix-heading-hierarchy.py`
- `fix-image-attributes.py`
- `fix-missing-images.py`
- `fix-srcset-references.py`
- `fix-remaining-split-words.py`

### Generation Scripts

- `generate-faq-questions.php`
- `generate-faq-answers-ai.php`
- `generate-faqs-complete.php`
- `generate-priority-dashboard.php`
- `generate-review-priority-list.php`
- `generate-automated-reports.php`
- `generate-content-gaps-summary.php`
- `generate-improvement-plan.php`
- `generate-link-recommendations.php`
- `generate-missing-keyword-links.php`
- `generate-missing-link-types.php`
- `generate-missing-recommendations.php`
- `generate-post-documentation.php`
- `generate-review-progress-tracker.php`
- `generate-template-placeholders-report.php`
- `generate-anchor-text.py`
- `generate-audit-report.py`
- `generate-date-report.php`
- `generate-docs-inventory.py`
- `generate-scripts-inventory.py`
- `generate-internal-links-review-report.php`
- `generate-manual-review-checklist.php`

### Validation Scripts

- `validate-post-dates.php`
- `validate-faq-schema.php`
- `validate-link-quality.php`
- `validate-seo-meta.php`
- `validate-content-integrity.php`
- `validate-all-links.py`
- `validate-all-images.py`
- `validate-api-data-quality.php`
- `validate-blog-rules.py`
- `validate-clusters.py`
- `validate-cross-references.py`
- `validate-data-collection.php`
- `validate-documentation-quality.php`
- `validate-expansion-quality.php`
- `validate-heading-fixes.py`
- `validate-link-grammar.php`
- `validate-links.py`
- `validate-placeholder-completion.php`
- `validate-post-documentation.php`
- `validate-schema.php`
- `validate-sitemap.php`

## Common Patterns

### Running Scripts for All Posts

```bash
php v2/scripts/blog/script-name.php --all
```

### Running Scripts for Specific Post

```bash
php v2/scripts/blog/script-name.php --post=slug --category=ratgeber
```

### Dry Run (Test Without Changes)

```bash
php v2/scripts/blog/script-name.php --all --dry-run
```

### Limiting Results

```bash
php v2/scripts/blog/script-name.php --all --limit=10
```

## Dependencies

### Required PHP Files

- `v2/config/blog-template-helpers.php` - Blog helper functions
- `v2/config/blog-schema-generator.php` - Schema generation
- `v2/config/blog-meta-generator.php` - Meta tag generation
- `v2/scripts/blog/link_utils.php` - Link utility functions

### Required API Keys

- Google API credentials: `v2/config/google-api-credentials.php`
- OpenAI API key: `v2/config/openai-api-key.php`
- SISTRIX API key: `v2/config/sistrix-api-key.php`

### Required Composer Packages

- Google API Client (for GA4/GSC)
- OpenAI PHP SDK (for FAQ generation)

## Error Handling

All scripts should:

- Log errors to appropriate log files
- Handle missing dependencies gracefully
- Provide clear error messages
- Support `--dry-run` mode for testing

## Log Files

- GA4 collection errors: `v2/data/blog/ga4-collection-errors.log`
- GSC collection errors: `v2/data/blog/gsc-collection-errors.log`
- SISTRIX errors: `v2/data/blog/sistrix-errors.log`
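
Checking these logs after a run can be scripted. This is a minimal sketch that prints each of the three log files listed above when it exists and is non-empty. The function name `report_blog_logs` and the optional repo-root argument are assumptions:

```bash
# Hypothetical helper: print each documented log file that has content.
# Usage: report_blog_logs [repo-root]   (defaults to the current directory)
report_blog_logs() {
  local root="${1:-.}" log
  for log in ga4-collection-errors.log gsc-collection-errors.log sistrix-errors.log; do
    # -s is true only when the file exists and is larger than zero bytes.
    if [ -s "$root/v2/data/blog/$log" ]; then
      echo "has entries: v2/data/blog/$log"
    fi
  done
}
```

No output means none of the three logs has entries.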

## Best Practices

1. **Always test with `--dry-run` first**
2. **Back up data before running scripts**
3. **Check log files after running scripts**
4. **Run validation scripts after making changes**
5. **Document any manual changes made**

## Automation

### Weekly Tasks

- `weekly-quality-check.php` - Run every Monday
- `weekly-priority-refresh.php` - Run every Monday

### Monthly Tasks

- SISTRIX data collection (credit management)
- Comprehensive audits

### Cron Setup

See `setup-faq-monitoring-cron.sh` for an example cron setup.
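
The two Monday tasks can be wrapped in one function for a cron-run script to call. This is a minimal sketch that assumes the quality check should run before the priority refresh (the order is not specified here) and that the report is dated with the day of the run. `run_weekly_blog_tasks`, `PHP_BIN`, and the 06:00 schedule in the comment are illustrative assumptions, not taken from `setup-faq-monitoring-cron.sh`:

```bash
# Hypothetical wrapper for the two weekly scripts. Call it from a script that
# cron runs on Mondays (schedule fields `0 6 * * 1` mean Mondays at 06:00).
run_weekly_blog_tasks() {
  local php="${PHP_BIN:-php}"

  # Assumed order: quality check first, then priority refresh.
  "$php" v2/scripts/blog/weekly-quality-check.php || return 1
  "$php" v2/scripts/blog/weekly-priority-refresh.php || return 1

  # The quality check writes a dated report; print where to find today's.
  echo "report: docs/content/blog/reports/weekly-quality-check-$(date +%F).md"
}
```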
