# blog-monitoring Full Instructions

## Overview

Automated monitoring and quality checks for blog content. Includes weekly quality audits, priority refresh, and performance tracking.

## Weekly Quality Checks

### Script: `weekly-quality-check.php`

**Purpose:** Automates weekly quality checks for blog posts

**What it checks:**

- FAQ quality (answer length, keyword integration, internal links)
- Link health (anchor text quality, stop words, generic links)
- Schema validation (Article, FAQPage, HowTo schemas)
- Content freshness (publication dates, modification dates)

**Usage:**

```bash
php v2/scripts/blog/weekly-quality-check.php [--email]
```

**Output:** Report saved to `docs/content/blog/reports/` as `weekly-quality-check-YYYY-MM-DD.md`, where `YYYY-MM-DD` is the date of the run (for example, `weekly-quality-check-2026-01-21.md`).
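If another script or a manual check needs to find the report for a given run, the path can be rebuilt from the date. This is a minimal sketch; `report_path` is a hypothetical helper and not part of the repository:

```bash
# Hypothetical helper: build the report path for a date (default: today).
report_path() {
    # $1: run date as YYYY-MM-DD; falls back to the current date
    printf 'docs/content/blog/reports/weekly-quality-check-%s.md\n' \
        "${1:-$(date +%Y-%m-%d)}"
}
```

For example, `report_path 2026-01-21` prints `docs/content/blog/reports/weekly-quality-check-2026-01-21.md`.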

**Dependencies:**

- `audit-faq-quality.php`
- `analyze-link-quality.php`
- `blog-template-helpers.php`

**Cron Setup:**

```bash
# Weekly quality check (Monday 9 AM)
0 9 * * 1 cd /path/to/landingpage && php v2/scripts/blog/weekly-quality-check.php --email
```

## Weekly Priority Refresh

### Script: `weekly-priority-refresh.php`

**Purpose:** Updates GA4/GSC data and recalculates priority scores

**What it does:**

- Updates GA4 performance data
- Updates GSC search performance data
- Recalculates priority scores (via `calculate-comprehensive-priority.php`)
- Generates priority dashboard (via `generate-priority-dashboard.php`; output: `docs/content/blog/PRIORITY_DASHBOARD.md`)

**Usage:**

```bash
php v2/scripts/blog/weekly-priority-refresh.php [--limit=N] [--dry-run]
```

**Dependencies:**

- `collect-post-performance-ga4.php`
- `collect-post-performance-gsc.php`
- `calculate-comprehensive-priority.php`
- `generate-priority-dashboard.php`

**Cron Setup:**

```bash
# Weekly priority refresh (Monday 10 AM)
0 10 * * 1 cd /path/to/landingpage && php v2/scripts/blog/weekly-priority-refresh.php
```

## Content Backlog and Competitive Analysis

**Where to read for "what to do next" (optimize vs create):**

- **CONTENT_BACKLOG.md** – Single artifact: optimize existing (top N by priority) + create new (domain opportunities, content ideas). Regenerate with `generate-content-backlog.php` after priority refresh or domain-level collection.
- **reports/competitive-analysis-YYYY-Q.md** – Keyword opportunities, content ideas, top SEO competitors, competitive gaps. Regenerate with `generate-competitive-analysis.php` (reads domain-level-data JSON).
- **domain-level-data/** – `domain-opportunities.json`, `content-ideas.json`, `domain-competitors-seo.json`, `competitive-gaps.json`. Must be populated by collection scripts; `generate-competitive-analysis.php` and `generate-content-backlog.php` **must use these real files**, not placeholders. See [CONTENT_BACKLOG_WORKFLOW.md](docs/content/blog/CONTENT_BACKLOG_WORKFLOW.md).

**Full-site awareness:** Generators (content backlog, content-gap, competitive analysis) must consider the **full website** (tools, comparison, product, templates, download, webinar, industry), not only blog. Domain opportunities are **classified by surface** via the site-surface classifier (`v2/scripts/blog/helpers/site-surface.php`). **"Create new"** in the backlog must list only **true content gaps** (ranking URL is homepage or other); keywords already covered by a tools/comparison/product/blog page appear under **"Already covered (optimize existing page)."** See [CONTENT_BACKLOG_WORKFLOW.md](docs/content/blog/CONTENT_BACKLOG_WORKFLOW.md).

## Link Quality Analysis

### Script: `analyze-link-quality.php`

**Purpose:** Analyzes anchor text quality and link relevance

**What it checks:**

- Stop words in anchor text (für, dafür, der, die, etc.)
- Too-short anchor text (< 3 characters)
- Generic anchor text (hier, mehr, weiter, etc.)

**Usage:**

```bash
php v2/scripts/blog/analyze-link-quality.php [--output=report.md]
```

**Output:** Report saved to `docs/content/blog/LINK_QUALITY_ANALYSIS.md`

**Quality Standards:**

- ✅ No stop words in anchor text
- ✅ Anchor text length ≥ 3 characters
- ✅ Descriptive anchor text (not generic)
- ✅ Natural language anchor text
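The three checks above can be illustrated with a small shell sketch. It is not the logic of `analyze-link-quality.php`: `classify_anchor` is a hypothetical function, the word lists contain only the examples named in this section, and matching stop words per word (rather than per anchor) is an assumption.

```bash
# Hypothetical sketch: classify one anchor text.
# Prints one of: stop-word, too-short, generic, ok.
classify_anchor() {
    text=$1
    # tr lowercases ASCII letters; uppercase umlauts may stay unchanged
    # depending on the tr implementation and locale.
    lower=$(printf '%s' "$text" | tr '[:upper:]' '[:lower:]')

    # Stop words: flagged when any space-separated word matches (assumption).
    case " $lower " in
        *" für "*|*" dafür "*|*" der "*|*" die "*)
            echo "stop-word"; return 0 ;;
    esac

    # Too short: fewer than 3 characters (bytes in a non-UTF-8 locale).
    if [ "${#text}" -lt 3 ]; then
        echo "too-short"; return 0
    fi

    # Generic: the whole anchor is a filler word.
    case "$lower" in
        hier|mehr|weiter)
            echo "generic"; return 0 ;;
    esac

    echo "ok"
}
```

With these lists, `classify_anchor "Mehr"` prints `generic` and `classify_anchor "Dienstplan erstellen"` prints `ok`.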

## FAQ Quality Audit

### Script: `audit-faq-quality.php`

**Purpose:** Audits FAQ quality across all posts

**What it checks:**

- Answer length (too short: < 50 words, too long: > 200 words)
- Keyword integration (primary keywords in answers)
- Internal links in FAQs
- FAQ count per post

**Usage:**

```bash
php v2/scripts/blog/audit-faq-quality.php
```

**Output:** Report saved to `docs/content/blog/FAQ_QUALITY_AUDIT.md`

**Quality Standards:**

- ✅ Answer length: 50-200 words
- ✅ Primary keywords included in answers
- ✅ Internal links in FAQ answers (when relevant)
- ✅ 5-10 FAQs per post (educational content)
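The answer-length standard (50-200 words) can be sketched in shell as follows. `faq_length_status` is a hypothetical helper for illustration; the actual audit script may count words differently, for example after stripping HTML.

```bash
# Hypothetical sketch: classify an FAQ answer by word count.
# Prints too-short (< 50 words), too-long (> 200 words), or ok.
faq_length_status() {
    # wc -w counts whitespace-separated words; $(( )) strips any padding.
    words=$(( $(printf '%s' "$1" | wc -w) ))
    if [ "$words" -lt 50 ]; then
        echo "too-short"
    elif [ "$words" -gt 200 ]; then
        echo "too-long"
    else
        echo "ok"
    fi
}
```

Both boundaries are treated as valid here: an answer of exactly 50 or exactly 200 words is reported as `ok`.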

## Schema Validation

### Automatic Schema Generation

**Article Schema:**

- Automatically includes: keywords, articleSection, wordCount
- Generated via `blog-schema-generator.php`

**FAQPage Schema:**

- Automatically added to posts with FAQs
- Generated via `blog-schema-generator.php`

**HowTo Schema:**

- Automatically detected for posts with step-by-step content
- Detects patterns: "Schritt 1", "Schritt 2", etc.
- Requires minimum 3 steps
- Generated via `blog-schema-generator.php`
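To estimate whether a post would qualify for HowTo schema, the "Schritt N" pattern and the 3-step minimum can be approximated in shell. This is only a sketch: `has_howto_steps` is a hypothetical function, and the detection in `blog-schema-generator.php` may use different patterns. It relies on `grep -o`, which GNU and BSD grep both provide.

```bash
# Hypothetical sketch: succeed if the text contains at least
# 3 distinct "Schritt N" markers.
has_howto_steps() {
    # grep -o prints each match on its own line; sort -u removes repeats.
    count=$(( $(printf '%s\n' "$1" | grep -oE 'Schritt [0-9]+' | sort -u | wc -l) ))
    [ "$count" -ge 3 ]
}
```

Usage: `has_howto_steps "$(cat post.html)" && echo "HowTo candidate"`, where `post.html` stands for any post file.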

**Validation:**

- Schema validation integrated into weekly quality checks
- Use Google Rich Results Test for manual validation

## Content Freshness

### Automated Date Management

**Publication Dates:**

- Preserved from original content
- Never modified automatically

**Modified Dates:**

- Automatically managed via git hooks
- Updated on content changes
- Format: YYYY-MM-DD

**Content Freshness Checks:**

- Posts older than 2 years flagged
- Posts not updated in 1+ year flagged
- Integrated into weekly quality checks
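The two freshness rules can be expressed with plain `YYYY-MM-DD` strings, which avoids non-portable `date` arithmetic. The sketch below is an illustration, not the weekly check's actual code; the function names and flag labels are hypothetical, and treating "1+ year" as "on or before the same date last year" is an assumption.

```bash
# Strip dashes so a YYYY-MM-DD date compares as an integer (20260121).
date_num() { printf '%s' "$1" | tr -d '-'; }

# Shift a YYYY-MM-DD date back by N whole years (Feb 29 is not normalized).
years_before() {
    year=${1%%-*}
    rest=${1#*-}
    printf '%04d-%s' "$((year - $2))" "$rest"
}

# Hypothetical sketch: print freshness flags for one post, or "fresh".
# $1: publication date, $2: modified date, $3: reference date (today).
freshness_flags() {
    published=$1
    modified=$2
    today=$3
    flags=""
    if [ "$(date_num "$published")" -lt "$(date_num "$(years_before "$today" 2)")" ]; then
        flags="older-than-2-years"
    fi
    if [ "$(date_num "$modified")" -le "$(date_num "$(years_before "$today" 1)")" ]; then
        flags="${flags:+$flags }not-updated-in-1-year"
    fi
    echo "${flags:-fresh}"
}
```

For example, `freshness_flags 2022-01-01 2025-12-01 2026-01-21` prints `older-than-2-years`.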

## Data Collection Automation

### GA4 Data Collection

**Script:** `collect-post-performance-ga4.php`

**Frequency:** Weekly (via `weekly-priority-refresh.php`)

**What it collects:**

- Page views (last 90 days, last year)
- Sessions
- Bounce rate
- Average engagement time

**Dependencies:**

- Google API credentials
- Composer dependencies (Google API Client)

### GSC Data Collection

**Script:** `collect-post-performance-gsc.php`

**Frequency:** Weekly (via `weekly-priority-refresh.php`)

**What it collects:**

- Clicks
- Impressions
- Average position
- CTR

**Dependencies:**

- Google API credentials
- Composer dependencies (Google API Client)

### SISTRIX Data Collection

**Script:** `pull-sistrix-data.php` (and other SISTRIX collectors; see `docs/content/blog/SISTRIX_ENDPOINTS_AND_REPORTS.md`)

**Frequency:** Monthly, to conserve API credits. The domain visibility index is collected separately by `collect-domain-visibilityindex.php`, weekly or monthly (1 credit, 7-day cache).

**What it collects:**

- Keyword rankings
- Visibility scores
- Competition levels
- Domain visibility index (domain-level; see MONITORING_RUNBOOK)

**Dependencies:**

- SISTRIX API key
- Credit management (see credit log and SEO_DATA_MANAGEMENT)

**When adding or changing SISTRIX usage:** Document the endpoint, script, output file, and credit cost in `docs/content/blog/SISTRIX_ENDPOINTS_AND_REPORTS.md` and consider credit impact before adding new API calls.

## Error Handling

### Log Files

- GA4 errors: `v2/data/blog/ga4-collection-errors.log`
- GSC errors: `v2/data/blog/gsc-collection-errors.log`
- SISTRIX errors: `v2/data/blog/sistrix-errors.log`
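A quick way to check these logs after a run is to list the ones that contain anything. This is a minimal sketch; `nonempty_logs` is a hypothetical helper, and a non-empty log may also hold entries from earlier runs.

```bash
# Hypothetical helper: print each error log that exists and is non-empty.
# $1: log directory (defaults to v2/data/blog).
nonempty_logs() {
    dir=${1:-v2/data/blog}
    for name in ga4-collection-errors.log gsc-collection-errors.log sistrix-errors.log; do
        # -s is true only for files with a size greater than zero.
        if [ -s "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
}
```

Run `nonempty_logs` from the project root; no output means all three logs are empty or missing.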

### Error Alerts

- Send all error emails to `hady@ordio.com`
- Never use `david@ordio.com` for error notifications
- Log every error to the matching log file listed under Log Files

## Monitoring Best Practices

1. **Always test with `--dry-run` first** (where the script supports it, for example `weekly-priority-refresh.php`)
2. **Check log files after running scripts**
3. **Review weekly quality check reports**
4. **Monitor credit usage (SISTRIX)**
5. **Set up email alerts for critical errors**

## Monitoring Runbook

**Complete Guide:** `docs/content/blog/MONITORING_RUNBOOK.md`

The monitoring runbook provides:

- Complete cron configuration
- Troubleshooting procedures
- Alert configuration
- Manual monitoring tasks
- Quarterly analysis procedures

**Key Monitoring Scripts:**

- `weekly-quality-check.php` - Weekly quality audits
- `weekly-priority-refresh.php` - Weekly priority updates
- `automate-data-collection.php` - Automated data collection
- `audit-faq-quality.php` - FAQ quality audit
- `analyze-link-quality.php` - Link quality analysis

## Blog Traffic & SEO Audit

For a full audit, run:

1. `check-data-freshness.php --all --max-age=7`
2. `validate-data-collection.php --all --stale-days=30` and `validate-api-data-quality.php --all`
3. Regenerate dashboards and reports: `generate-data-freshness-report.php`, `monitor-collection-health.php`, `generate-priority-dashboard.php`, `generate-traffic-seo-snapshot.php`

Then review DATA_FRESHNESS_REPORT and COLLECTION_HEALTH_DASHBOARD.

- **Weekly:** run the freshness check and `weekly-priority-refresh.php`.
- **Monthly/quarterly:** run full validation and the traffic snapshot.

See `docs/content/blog/AUDIT_RUNBOOK.md`.
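The full audit sequence from this section could be wrapped in a single shell function. This is a sketch under stated assumptions: `run_audit`, `PHP_BIN`, and `BLOG_SCRIPT_DIR` are hypothetical names, the scripts are assumed to live in `v2/scripts/blog/` like the other scripts in this document, and each script is assumed to exit non-zero on failure.

```bash
# Hypothetical wrapper: run the full audit in order, stopping at the
# first failing step.
run_audit() {
    php_bin=${PHP_BIN:-php}
    dir=${BLOG_SCRIPT_DIR:-v2/scripts/blog}   # assumed script location

    # 1. Data freshness
    "$php_bin" "$dir/check-data-freshness.php" --all --max-age=7 || return 1

    # 2. Validation
    "$php_bin" "$dir/validate-data-collection.php" --all --stale-days=30 || return 1
    "$php_bin" "$dir/validate-api-data-quality.php" --all || return 1

    # 3. Regenerate dashboards and reports
    for script in generate-data-freshness-report.php monitor-collection-health.php \
                  generate-priority-dashboard.php generate-traffic-seo-snapshot.php; do
        "$php_bin" "$dir/$script" || return 1
    done
}
```

Setting `PHP_BIN=echo` before calling `run_audit` prints the commands instead of executing them, which is a cheap way to review the sequence first.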

## Related Documentation

- `docs/content/blog/MONITORING_RUNBOOK.md` - Complete monitoring guide
- `docs/content/blog/BLOG_SCRIPTS_USAGE_GUIDE.md` - Complete scripts guide
- `docs/content/blog/README.md` - Blog documentation index
- `docs/content/blog/AUDIT_RUNBOOK.md` - Blog traffic & SEO audit steps
- `docs/content/blog/DATA_FRESHNESS_REPORT.md` - Data freshness status
- `docs/content/blog/PERFORMANCE_ANALYSIS.md` - GA4/GSC data usage
- `docs/content/blog/CONSOLIDATED_NEXT_STEPS.md` - Next steps
- `.cursor/rules/blog-data-collection.mdc` - Data collection patterns
- `.cursor/rules/blog-faq-optimization.mdc` - FAQ optimization patterns
