# blog-improvement-process-content Full Instructions

## Content Creation Approach

### New From Scratch

**CRITICAL:** Treat each post as a completely new creation rather than building around existing content.

**Process:**

1. **Extract Valuable Elements** (calculators, videos, images, tables)
2. **Create Content Outline** (based on analysis and SERP research)
3. **Write New Content** (from scratch, following outline)
4. **Reintegrate Elements** (at appropriate positions)
5. **Optimize Flow** (ensure natural flow throughout)

**See:** `docs/content/blog/CONTENT_CREATION_WORKFLOW_IMPROVEMENT.md` for complete workflow.

### Element Preservation

**Before Creating New Content:**

1. **Identify Valuable Elements:**

   - Calculators (Alpine.js components)
   - Videos (YouTube embeds)
   - Images (with alt text)
   - Tables (data tables)
   - Interactive elements

2. **Document Element Details:**

   - Element type
   - Current position
   - Content context
   - Proposed new position

3. **Extract Elements:**
   - Copy HTML for calculators
   - Copy embed codes for videos
   - Note image paths and alt text
   - Copy table HTML

**Note:** Elements will be reintegrated into new content at appropriate positions.

### Content Structure Requirements

**Required Structure:**

1. **Introduction** (2-3 paragraphs)

   - Hook (first sentence)
   - Context (what is the topic)
   - Value proposition (what reader will learn)

2. **Main Sections (H2)** (3-6 sections)

   - Clear, descriptive headings
   - Logical flow
   - Comprehensive coverage

3. **FAQs** (10-15 questions)

   - In `faqs` array (NOT in content HTML)
   - From PAA questions and GSC queries

4. **Conclusion** (1-2 paragraphs)
   - Summary of key points
   - Next steps or CTA (if appropriate)

**Critical Requirements:**

- Definition must appear within first 20% of content
- Logical flow: Intro → Definition → Explanation → Examples → Advanced
- All FAQs in `faqs` array (NOT in content HTML)

## Content Depth & Word Count Standards

### Flexible Word Count Guidelines

**⚠️ IMPORTANT:** Word count is NOT a direct ranking factor. Use flexible, data-driven guidelines based on search intent, competition level, and competitor analysis.

**Decision Framework:**

1. **Analyze Search Intent:**
   - Informational: 1,800-2,500 words (competitive), 2,500-4,000 words (high competition)
   - Transactional: 1,000-1,500 words (focus on conversion)
   - Navigational: 500-1,200 words (brand-focused)

2. **Analyze Competition Level:**
   - Low competition (< 30 difficulty): 1,200-1,800 words sufficient
   - Medium competition (30-49): 1,800-2,500 words target
   - High competition (50-69): 2,500-4,000 words target
   - Very high competition (70+): 4,000+ words (skyscraper approach)

3. **Analyze Competitor Content:**
   - Average competitor word count: Match or exceed by 20-30%
   - Top 3 competitors: Analyze depth, not just length
   - Content gaps: Fill gaps comprehensively

4. **Determine Content Depth (Minimum vs Suggested):**
   - **Minimum** = validation floor (80% of the recommended competitive depth). **Suggested** = the target to aim for (100%).
   - **Minimum Depth:** 1,200-1,500 words (low competition)
   - **Competitive Depth:** 1,800-2,500 words suggested (medium competition)
   - **Comprehensive Depth:** 2,500-4,000+ words suggested (high competition)
   - **Reach for suggested;** the minimum is only a fallback. Content that merely meets the minimum tends to underperform.

**See:** `docs/content/blog/FLEXIBLE_WORD_COUNT_GUIDELINES.md` for complete decision framework.
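
The competition bands and the 80% minimum floor described above can be expressed as a small lookup. This is a minimal sketch, not the logic of any project script: the function names are hypothetical, a difficulty of exactly 50 is treated as high competition, and the upper bound of the 70+ band is left open because the guideline only says "4,000+".

```python
from typing import Optional, Tuple


def suggested_range(difficulty: int) -> Tuple[int, Optional[int]]:
    """Suggested word-count range for a keyword difficulty score.

    Band edges follow the guideline list; None means "no upper bound".
    """
    if difficulty < 30:
        return (1200, 1800)   # low competition
    if difficulty < 50:
        return (1800, 2500)   # medium competition
    if difficulty < 70:
        return (2500, 4000)   # high competition
    return (4000, None)       # very high competition (skyscraper approach)


def minimum_floor(suggested_words: int) -> int:
    """Validation floor: 80% of the suggested target (assumption: rounded down)."""
    return suggested_words * 80 // 100


if __name__ == "__main__":
    low, high = suggested_range(55)
    print(low, high, minimum_floor(low))
```

The range is only a starting point; the competitor analysis in step 3 can still move the target up or down.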

### Skyscraper Technique (2026 Adaptation)

**Reach for Suggested:** Set the outline target to 100% of the recommended competitive depth. Use 90% only when the topic is narrow.

**When to Use:**
- High competition keywords (50+ difficulty)
- Competitors have 2,500+ words
- Content gaps identified vs competitors

**Process:**

1. **Competitive Analysis First:**
   - Analyze top 10 ranking pages
   - Extract word counts, structure, topics covered
   - Identify content gaps

2. **Create Better Content:**
   - Cover all topics competitors cover
   - Cover topics competitors don't
   - Add unique value (data, insights, examples)
   - Exceed competitor depth by 20-30%

3. **Differentiation Over Duplication:**
   - Don't just make it longer
   - Make it better: unique data, expert insights, better formats

**See:** `docs/content/blog/SKYSCRAPER_TECHNIQUE_2026.md` for complete framework.

### Content Depth Guidelines

**Expand When:**
- Competitor analysis shows gaps
- Search intent requires more depth
- Performance data indicates need
- Content gaps identified

**Stop When:**
- Topic fully covered
- Diminishing returns (more text no longer adds value)
- Focus would suffer (quality over quantity: focused content beats padded content)
- Format limitations (some topics don't need 4,000 words)

**See:** `docs/content/blog/CONTENT_DEPTH_GUIDELINES.md` for complete guidelines.

## Content Quality Standards

### Human-First Content

**Requirements:**

- Natural, conversational tone (informal German "du" address)
- Varied sentence structures
- No AI content tells ("Furthermore", "Moreover")
- Specific examples and data points
- Natural transitions
- Personal insights (when appropriate)
- **Anti-fluff:** Follow [ANTI_FLUFF_CHECKLIST.md](docs/content/blog/ANTI_FLUFF_CHECKLIST.md): no filler; cut micro-fluff and redundant paragraphs

**Section briefs:** `generate-section-briefs.php` merges overlapping PAA/gaps to avoid redundant key points. Address gaps substantively.

**See:** `docs/content/blog/CONTENT_QUALITY_CHECKLIST_IMPROVEMENT.md` for complete checklist.

### AI Content Avoidance

**Avoid:**

- Overly formal language ("Furthermore", "Moreover")
- Repetitive sentence structures
- Generic phrases ("It is important to note")
- Lack of specific examples
- Uniformly polished grammar with no natural variation

**Use:**

- Varied sentence lengths
- Personal insights
- Specific examples and anecdotes
- Natural transitions
- Conversational tone

**See:** `docs/content/AI_CONTENT_AVOIDANCE_GUIDE.md` for complete guide.

### Content Flow

**Requirements:**

- Definition within first 20% of content
- Logical flow throughout
- Smooth transitions between sections
- Paragraphs 2-3 sentences max
- Clear heading hierarchy

**Validation:**

```bash
php v2/scripts/blog/validate-content-flow.php --post=slug --category=category
```

**Target:** Flow score ≥80/100

## SEO/GEO/AEO Optimization

### SEO Optimization

**Required:**

- Primary keyword in title, H1, first paragraph
- Meta tags optimized (title: 50-60 chars, description: 150-160 chars)
- Schema markup (Article + FAQPage + BreadcrumbList minimum)
- Internal links (10-15 natural, contextual links)
- Technical SEO (fast, mobile-friendly, valid HTML)

**See:** `docs/content/blog/SEO_GEO_AEO_CHECKLIST.md` for complete checklist.
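
The meta tag length limits above are easy to check mechanically. The following is a minimal sketch with a hypothetical function name; it counts characters only and does not replace the project's SEO validation script.

```python
def check_meta_lengths(title: str, description: str) -> list:
    """Return a list of problems; an empty list means both lengths are in range."""
    problems = []
    if not 50 <= len(title) <= 60:
        problems.append(f"title is {len(title)} chars, expected 50-60")
    if not 150 <= len(description) <= 160:
        problems.append(f"description is {len(description)} chars, expected 150-160")
    return problems


if __name__ == "__main__":
    for problem in check_meta_lengths("Zeiterfassung", "Kurze Beschreibung"):
        print(problem)
```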

### GEO Optimization (AI Search Engines)

**Required:**

- AI-ready content (clear, structured)
- Citation format (author, date, sources)
- Structured data (ArticleBody, Speakable)
- Comprehensive coverage

### AEO Optimization (Answer Engine Optimization)

**Required:**

- Featured snippet optimization (direct answer in first paragraph, 40-60 words)
- Question-based headings (match PAA question wording)
- PAA optimization (all questions answered)
- FAQ schema markup (10-15 FAQs, 40-80 word answers)
- E-E-A-T signals (Experience, Expertise, Authoritativeness, Trustworthiness)
- Direct answers to questions
- Structured content (lists, tables, visual breaks)

**See:** `docs/content/blog/AEO_GEO_BEST_PRACTICES_2026.md` for complete best practices.

### GEO Optimization (Generative Engine Optimization)

**Required:**

- Topic-first approach (comprehensive topic coverage)
- Structured, succinct, and citable content
- Authority and trust signals (expert insights, citations)
- Multimodal content (images, tables, videos)
- Freshness (update quarterly/annually)

**See:** `docs/content/blog/AEO_GEO_BEST_PRACTICES_2026.md` for complete best practices.

## Internal Linking Requirements

### Link Guidelines

**Requirements:**

- 10-15 internal links per post
- Natural anchor text (not "click here")
- Varied anchor text (no repetition)
- Contextual placement (within relevant content)
- Links add value to reader

**Link Types:**

- Related blog posts
- Pillar pages (CRITICAL: Must be included when post belongs to cluster)
- Tools/calculators
- Templates/downloads
- Product pages (when relevant)

**Pillar Page Integration (REQUIRED):**

**CRITICAL:** All posts that belong to a content cluster MUST link to their pillar page(s).

**Process:**

1. **Check Pillar Mapping**
   - Review `docs/content/blog/TIER1_PILLAR_MAPPING.md` to identify which pillar page(s) the post should link to
   - Pillar pages:
     - `/insights/dienstplan/` - Dienstplan pillar (Dienstplan cluster)
     - `/insights/zeiterfassung/` - Zeiterfassung pillar (Zeiterfassung cluster)

2. **Add to Internal Links**
   - Add pillar page link(s) to `internal_links` array in JSON file
   - Use natural anchor text (e.g., "digitale Zeiterfassung", "Dienstplan")
   - Set `target_type: "pillar-page"` and `priority: "critical"`
   - Add contextual link in content HTML where topic is naturally discussed

3. **Add to Related Posts**
   - Add pillar page(s) to `related_posts` array at the beginning (top 3-5 positions)
   - Use `category: "pillar"` and `relationship_type: "pillar_page"`
   - Set high `similarity_score` (0.85-0.90) to ensure visibility

4. **Validation**
   - Verify pillar links are in `internal_links` array
   - Verify pillar pages are in `related_posts` array (top positions)
   - Verify contextual links exist in content HTML
   - Run: `php v2/scripts/blog/validate-pillar-links.php --post=slug --category=category`

**Checklist:**

- [ ] Pillar page mapping checked
- [ ] Pillar link(s) added to `internal_links` array
- [ ] Contextual pillar link(s) added to content HTML
- [ ] Pillar page(s) added to `related_posts` array (top 3-5 positions)
- [ ] Validation script run and passing

**See:** `docs/content/blog/INTERNAL_LINKING_IMPROVEMENT_GUIDE.md` for complete guide.
**See:** `docs/content/blog/PILLAR_PAGE_INTEGRATION_CHECKLIST.md` for pillar-specific checklist.
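
The three pillar checks from the validation step can be sketched as a single function over the post data. This is an illustration only, not the behavior of `validate-pillar-links.php`: the keys `url` and `content` are assumptions about the post JSON, while `internal_links`, `related_posts`, `target_type`, `priority`, and `relationship_type` come from the process above.

```python
def check_pillar_links(post: dict, pillar_url: str) -> list:
    """Return the places where the pillar link is missing (empty list = complete)."""
    missing = []

    # 1. internal_links entry flagged as a critical pillar-page link.
    #    The "url" key is a hypothetical field name.
    in_links = any(
        link.get("url") == pillar_url
        and link.get("target_type") == "pillar-page"
        and link.get("priority") == "critical"
        for link in post.get("internal_links", [])
    )
    if not in_links:
        missing.append("internal_links")

    # 2. related_posts entry within the top five positions.
    in_related = any(
        rel.get("url") == pillar_url
        and rel.get("relationship_type") == "pillar_page"
        for rel in post.get("related_posts", [])[:5]
    )
    if not in_related:
        missing.append("related_posts")

    # 3. Contextual link inside the content HTML ("content" key is hypothetical).
    if f'href="{pillar_url}"' not in post.get("content", ""):
        missing.append("content")

    return missing
```

An empty result means all three placements exist; the anchor text and the naturalness of the contextual link still need a manual read.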

### Anchor Text Quality

**Requirements:**

- Natural, contextual phrases
- Varied (no repetition)
- Descriptive (reader knows what to expect)
- Flows naturally within sentence
- Not generic ("click here", "more info")

**Examples:**

✅ "Für eine präzise Berechnung nutze unseren Arbeitszeitrechner."
✅ "Wie du Dienstpläne erstellst, erklären wir dir in unserem Leitfaden."
❌ "Klicke hier für mehr Informationen."

## FAQ Requirements

### FAQ Structure

**Requirements:**

- FAQs in `faqs` array (NOT in content HTML)
- 10-15 FAQs optimal
- Questions are unique (no duplicates)
- Logical ordering (definitions first, then how-to)
- High-volume queries prioritized

**Sources (Priority Order):**

1. People Also Ask questions (from SISTRIX SERP features)
2. Top GSC queries (sorted by clicks, then impressions)
3. Related keywords (from SISTRIX)
4. Standard questions based on topic

### FAQ Answers

**Requirements:**

- Answer length: 40-80 words
- Primary keyword in 3-5 FAQs (natural)
- Direct answer first sentence
- Natural du tone
- No template language
- Contextual internal links when FAQs mention lexikon/tool/product terms (1:1 mandatory)

**See:** `.cursor/rules/blog-faq-optimization.mdc` for complete FAQ guidelines.
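
The answer-length and keyword rules above can be checked with a few lines of code. This is a minimal sketch under stated assumptions: each entry in the `faqs` array is assumed to have `question` and `answer` keys, tags are stripped with a simple regular expression rather than an HTML parser, and keyword matching is a plain case-insensitive substring test.

```python
import re


def word_count(html: str) -> int:
    """Count words after stripping HTML tags (simple regex, not a full parser)."""
    text = re.sub(r"<[^>]+>", " ", html)
    return len(text.split())


def check_faq_answers(faqs: list, primary_keyword: str) -> list:
    """Return a list of problems; an empty list means all rules are met."""
    problems = []

    # Rule: every answer is 40-80 words long.
    for i, faq in enumerate(faqs, start=1):
        n = word_count(faq["answer"])
        if not 40 <= n <= 80:
            problems.append(f"FAQ {i}: answer has {n} words, expected 40-80")

    # Rule: the primary keyword appears in 3-5 FAQs.
    keyword = primary_keyword.lower()
    hits = sum(
        1 for faq in faqs
        if keyword in (faq["question"] + " " + faq["answer"]).lower()
    )
    if not 3 <= hits <= 5:
        problems.append(f"primary keyword appears in {hits} FAQs, expected 3-5")

    return problems
```

Tone, template language, and whether the first sentence answers the question directly cannot be checked this way and remain part of the manual review.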

### FAQ Quality Review (MANDATORY)

**CRITICAL:** After FAQ expansion, conduct comprehensive quality review to ensure SEO optimization, keyword consistency, and content quality.

**1. Keyword Consistency:**

- Primary keyword must match across title, meta, content, FAQs
- No inconsistent keyword variants (e.g., "Zeiterfassung App" vs "Zeiterfassung per App")
- Primary keyword in 3-5 FAQs (natural integration)
- Related keywords integrated where relevant

**2. Quality Standards:**

- No duplicate questions (semantic similarity < 0.7)
- No repetitive answers (content similarity < 0.6)
- Top 10 GSC queries addressed
- Answers 40-80 words (optimal: 55-65 words)
- No template language ("Wichtig ist, dass...", "Für detaillierte Informationen...")
- Natural du tone (informal German)
- Clean HTML formatting
- Direct answer in first sentence

**3. Review Process:**

```bash
# Run analysis tools
php v2/scripts/blog/analyze-faqs-seo.php --post=slug --category=category
php v2/scripts/blog/check-faq-uniqueness.php --post=slug --category=category
php v2/scripts/blog/suggest-faq-improvements.php --post=slug --category=category

# Fix all identified issues
# - Remove duplicates or merge with unique angles
# - Rewrite repetitive answers with unique focus
# - Remove template language
# - Optimize keyword integration
# - Ensure natural flow and value

# Validate changes
php v2/scripts/blog/check-faq-uniqueness.php --post=slug --category=category
php v2/scripts/blog/analyze-faqs-seo.php --post=slug --category=category
```

**Success Criteria:**

- ✅ Keyword consistency achieved across all content
- ✅ No duplicate questions
- ✅ No repetitive answers
- ✅ All answers 40-80 words
- ✅ No template language
- ✅ Natural keyword integration

**See:** `docs/content/blog/FAQ_MANUAL_REVIEW_SEO_CHECKLIST.md` for complete checklist.
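
To illustrate the duplicate thresholds (0.7 for questions, 0.6 for answers), the sketch below compares every pair of texts. Note the loud assumption: it uses `difflib.SequenceMatcher` from the Python standard library, which measures character-level overlap, as a stand-in for the semantic similarity the standard calls for. How `check-faq-uniqueness.php` actually scores similarity is not documented here.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(texts: list, threshold: float) -> list:
    """Return (i, j, ratio) for every pair whose ratio is at or above threshold.

    The ratio is lexical (character overlap), not semantic.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(texts), 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((i, j, round(ratio, 2)))
    return pairs


if __name__ == "__main__":
    questions = [
        "Was ist Zeiterfassung?",
        "Was ist Zeiterfassung?",
        "Wie teuer ist ein Dienstplan?",
    ]
    print(similar_pairs(questions, 0.7))
```

Call it once with the questions and a threshold of 0.7, and once with the answers and 0.6; flagged pairs are candidates for merging or rewriting, not automatic deletions.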

## Ordio Integration

### Natural Mentions

**Requirements:**

- Ordio mentioned once per major section (natural)
- Not forced or salesy
- Contextually relevant
- Adds value to reader
- Not repetitive

**Examples:**

✅ "Mit Ordio kannst du Dienstpläne digital erstellen und verwalten."
✅ "Für eine präzise Berechnung nutze unseren Arbeitszeitrechner."
❌ "Ordio ist die beste Lösung. Ordio ist besser als andere."


# Blog Improvement Plans

**CRITICAL**: Rules for editing and regenerating improvement plans. Files are created dynamically in post documentation folders using the [IMPROVEMENT_PLAN.md template](../../docs/content/blog/posts/_templates/IMPROVEMENT_PLAN.md). The template is copied to `docs/content/blog/posts/{category}/{slug}/IMPROVEMENT_PLAN.md` for each post.

## File Structure

### Auto-Generated Sections (Overwritten on Regeneration)

**Quick Summary (From Reports):**
- Current Status (Quality Score, Priority Level, Word Count, SEO Score)
- Top Improvement Priorities (3 items auto-generated from data)

**Data Sources:**
- `related-resources.json` - Tool/template/download links
- `links-analysis.json` - Link quality scores  
- `seo-analysis.json` - SEO metrics
- `content-analysis.json` - Word count, FAQ count

### Manual Sections (Always Preserved)

Content between `<!-- BEGIN MANUAL -->` and `<!-- END MANUAL -->` markers is **always preserved** during regeneration:

- Manual Improvement Notes
- Manual Improvement Observations
- Manual Improvement Priorities (High/Medium/Low)
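
As an illustration of how marker-based preservation can work, the sketch below pulls out everything between the marker pairs so it could be written back after regeneration. It is a hypothetical example, not the implementation in `generate-post-documentation.php`.

```python
import re

# Non-greedy match so that several manual blocks in one file stay separate.
MANUAL_BLOCK = re.compile(
    r"<!-- BEGIN MANUAL -->(.*?)<!-- END MANUAL -->",
    re.DOTALL,
)


def extract_manual_blocks(markdown: str) -> list:
    """Return the text between each BEGIN/END MANUAL marker pair, in order."""
    return [block.strip() for block in MANUAL_BLOCK.findall(markdown)]


if __name__ == "__main__":
    plan = (
        "## Quick Summary\nauto-generated\n"
        "<!-- BEGIN MANUAL -->\nMy notes\n<!-- END MANUAL -->\n"
    )
    print(extract_manual_blocks(plan))
```

The sketch also shows why both markers matter: if either one is removed, the pattern no longer matches and the block is not recognised as manual content.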

## Editing Rules

### ✅ Safe to Edit

**Always Safe (Preserved):**
- Content between `<!-- BEGIN MANUAL -->` and `<!-- END MANUAL -->` markers
- Manual improvement notes, observations, and priorities

**Conditionally Safe (preserved only without placeholders):**
- Quick Summary section: preserved if it was manually edited and contains no placeholders; otherwise overwritten on regeneration

### ❌ Never Edit

**Don't Edit (Auto-Generated):**
- Placeholders like `{QUICK_IMPROVEMENT_1}` - these are auto-populated from data
- Auto-generated improvement priorities - regenerated from analysis data

**Don't Remove:**
- `<!-- BEGIN MANUAL -->` and `<!-- END MANUAL -->` markers - required for preservation

## Regeneration Process

### When to Regenerate

Regenerate when:
- Analysis data files are updated (`related-resources.json`, `seo-analysis.json`, etc.)
- Reports are regenerated (`data/reports/` folder)
- Content or SEO metrics change significantly

### How to Regenerate

**Single Post:**
```bash
php v2/scripts/blog/generate-post-documentation.php --post={slug} --category={category}
```

**All Posts:**
```bash
php v2/scripts/blog/generate-post-documentation.php --all
```

### Pre-Regeneration Checklist

Before regenerating:
- [ ] Git commit current state (backup)
- [ ] Verify data files exist and are current
- [ ] Ensure important content is in `<!-- BEGIN MANUAL -->` sections
- [ ] Test on 1-2 sample posts first

### Post-Regeneration Validation

After regenerating:
- [ ] Verify manual sections preserved
- [ ] Check placeholders are populated (not empty)
- [ ] Validate auto-generated priorities are accurate
- [ ] Run validation script if available

## Placeholder Behavior

### Auto-Populated Placeholders

These are **always** replaced with actual data:

- `{QUICK_IMPROVEMENT_1}` - First priority from data analysis
- `{QUICK_IMPROVEMENT_2}` - Second priority from data analysis
- `{QUICK_IMPROVEMENT_3}` - Third priority from data analysis
- `{SEO_SCORE}` - SEO score from `seo-analysis.json`

**Data Sources:**
- Missing tool links → "Add X tool links - Critical for conversions"
- Missing template links → "Add X template links - Valuable resources"
- Link quality score < 70 → "Improve link quality score from X/100 to 70+"
- SEO score < 70 → "Improve SEO score from X/100 to 70+"
- Word count < 1200 → "Expand content by X words to reach 1,200+"
- FAQ count < 5 → "Add X FAQs for AEO/GEO optimization"
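
The threshold rules above map naturally onto a function that turns metrics into priority messages. This is a sketch only: the real logic lives in `generateQuickImprovements()` in the generation script, and the metric key names, the rule order, and the "first three" cut-off used here are assumptions.

```python
def quick_improvements(metrics: dict, limit: int = 3) -> list:
    """Build priority messages from analysis metrics; keep the first `limit`.

    All dictionary keys are hypothetical names for values taken from the
    analysis JSON files. A missing key means the rule is skipped.
    """
    items = []

    tools = metrics.get("missing_tool_links", 0)
    if tools > 0:
        items.append(f"Add {tools} tool links - Critical for conversions")

    templates = metrics.get("missing_template_links", 0)
    if templates > 0:
        items.append(f"Add {templates} template links - Valuable resources")

    link_score = metrics.get("link_quality_score")
    if link_score is not None and link_score < 70:
        items.append(f"Improve link quality score from {link_score}/100 to 70+")

    seo_score = metrics.get("seo_score")
    if seo_score is not None and seo_score < 70:
        items.append(f"Improve SEO score from {seo_score}/100 to 70+")

    words = metrics.get("word_count")
    if words is not None and words < 1200:
        items.append(f"Expand content by {1200 - words} words to reach 1,200+")

    faqs = metrics.get("faq_count")
    if faqs is not None and faqs < 5:
        items.append(f"Add {5 - faqs} FAQs for AEO/GEO optimization")

    return items[:limit]
```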

### Manual-Only Placeholders

These are **never** auto-populated - replace manually:

- `{MANUAL_IMPROVEMENT_NOTES}` - Add your notes here
- `{MANUAL_IMPROVEMENT_OBSERVATION_1}` - Add observations
- `{MANUAL_IMPROVEMENT_HIGH_PRIORITY_1}` - Add high-priority items
- `{MANUAL_IMPROVEMENT_MEDIUM_PRIORITY_1}` - Add medium-priority items
- `{MANUAL_IMPROVEMENT_LOW_PRIORITY_1}` - Add low-priority items

## Preservation Logic

### What Gets Preserved

✅ **Always Preserved:**
- Content between `<!-- BEGIN MANUAL -->` and `<!-- END MANUAL -->` markers
- Manually edited Quick Summary (if no placeholders present)

❌ **Always Overwritten:**
- Quick Summary section (if contains placeholders)
- Auto-generated improvement priorities
- SEO scores and metrics

### Preservation Detection

The script detects manual edits by:
1. Checking if Quick Summary contains placeholders
2. If no placeholders → preserve as manual content
3. If placeholders → regenerate from data

**Best Practice:** If you manually edit Quick Summary, ensure it doesn't contain placeholders like `{QUICK_IMPROVEMENT_1}` or it will be overwritten.
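
The detection rule can be pictured as a single pattern test. The sketch below assumes that placeholders are always upper-case names in curly braces, as in the examples in this document; the function name and return values are hypothetical.

```python
import re

# Matches tokens such as {QUICK_IMPROVEMENT_1} or {SEO_SCORE}.
# Lower-case tokens such as {slug} are deliberately not matched.
PLACEHOLDER = re.compile(r"\{[A-Z][A-Z0-9_]*\}")


def quick_summary_action(quick_summary: str) -> str:
    """Return "regenerate" if placeholders remain, otherwise "preserve"."""
    return "regenerate" if PLACEHOLDER.search(quick_summary) else "preserve"


if __name__ == "__main__":
    print(quick_summary_action("- {QUICK_IMPROVEMENT_1}"))
    print(quick_summary_action("- Add two calculator links"))
```

A single leftover placeholder is enough to flip the result, which is why a partially edited Quick Summary is still overwritten.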

## Common Mistakes

### ❌ Mistake 1: Editing Auto-Generated Sections

**Wrong:**
```markdown
## Quick Summary (From Reports)

**Top Improvement Priorities:**

- My custom priority here  <!-- Will be overwritten! -->
```

**Correct:**
```markdown
<!-- BEGIN MANUAL -->

### Manual Improvement Priorities

**High Priority (Manual Assessment):**

- My custom priority here  <!-- Preserved! -->

<!-- END MANUAL -->
```

### ❌ Mistake 2: Removing Markers

**Wrong:**
```markdown
## Manual Improvement Plan & Insights (EDITABLE)

### Manual Improvement Notes

My notes here  <!-- No markers = will be overwritten! -->
```

**Correct:**
```markdown
<!-- BEGIN MANUAL -->

## Manual Improvement Plan & Insights (EDITABLE)

### Manual Improvement Notes

My notes here  <!-- Markers = preserved! -->

<!-- END MANUAL -->
```

### ❌ Mistake 3: Expecting Manual Placeholders to Auto-Populate

**Wrong:**
```markdown
{MANUAL_IMPROVEMENT_NOTES}  <!-- Expecting auto-population -->
```

**Correct:**
```markdown
**My Manual Notes:**  <!-- Replace placeholder manually -->
- First observation
- Second observation
```

## Validation

### Check for Unpopulated Placeholders

```bash
# Find files with placeholders
find docs/content/blog/posts -name "IMPROVEMENT_PLAN.md" -exec grep -l "{QUICK_IMPROVEMENT_1}" {} \;
```

### Verify Manual Sections

```bash
# List plans that contain a manual section (use grep -L instead of -l to list plans missing one)
find docs/content/blog/posts -name "IMPROVEMENT_PLAN.md" -exec grep -l "<!-- BEGIN MANUAL -->" {} \;
```

## Recovery

### If Manual Content Was Lost

1. **Restore from Git:**
   ```bash
   git checkout HEAD -- docs/content/blog/posts/{category}/{slug}/IMPROVEMENT_PLAN.md
   ```

2. **Re-add to Manual Section:**
   - Copy content to `<!-- BEGIN MANUAL -->` section
   - Ensure markers are correct
   - Test regeneration to verify preservation

### If Placeholders Not Populated

1. Check data files exist (`related-resources.json`, `seo-analysis.json`, etc.)
2. Verify script has access to data files
3. Run script with debug output
4. Check `generateQuickImprovements()` function output

## Related Documentation

- `docs/content/blog/IMPROVEMENT_PLAN_GUIDE.md` - Complete usage guide
- `docs/content/blog/IMPROVEMENT_PLAN_PREVENTION_GUIDE.md` - Prevention measures
- `docs/content/blog/posts/_templates/IMPROVEMENT_PLAN.md` - Template file
- `v2/scripts/blog/generate-post-documentation.php` - Generation script

## Key Principles

1. **Auto-Generated = Overwritten** - Quick Summary and priorities are regenerated from data
2. **Manual Sections = Preserved** - Content between markers is always preserved
3. **Placeholders = Context-Dependent** - `{QUICK_IMPROVEMENT_*}` auto-populated, `{MANUAL_IMPROVEMENT_*}` manual only
4. **Test Before Regenerating** - Always test on sample posts before regenerating all
5. **Backup Before Regenerating** - Git commit or backup before running regeneration script

## AI Agent Guidelines

When editing improvement plans:

1. **Never remove `<!-- BEGIN MANUAL -->` markers** - Required for preservation
2. **Never edit auto-generated sections** - They will be overwritten
3. **Always add manual content to manual sections** - Guaranteed preservation
4. **Verify placeholders are populated** - Check after regeneration
5. **Test preservation logic** - Run regeneration on sample post first

When regenerating improvement plans:

1. **Create backup first** - Git commit or manual backup
2. **Test on sample posts** - Verify preservation works
3. **Check data files exist** - Required for auto-population
4. **Validate after regeneration** - Check placeholders populated, manual sections preserved
5. **Document any issues** - Note problems for future reference



# Blog Post Improvement Process Rules

**Last Updated:** 2026-02-22




## Heavy Instructions Moved

**CRITICAL:** The detailed instructions, edge cases, and massive data for this rule have been moved to optimize AI context.
You MUST read the full documentation before proceeding:
`docs/ai/rules-archive/blog-improvement-process-content-full.md`



# Blog Post Improvement Process Rules

**Last Updated:** 2026-03-20



## Core Principles

1. **Manual-First Approach:** All content creation is manual, focused, and human-first
2. **Data-Driven Decisions:** Leverage all available data (GA4, GSC, SISTRIX) for informed decisions
3. **SERP Research:** Deep analysis of top ranking pages informs content strategy
4. **New From Scratch:** Treat each post as a new creation, not building around existing content
5. **Preserve Value:** Keep valuable elements (embeds, calculators, videos) but reposition as needed
6. **Natural Integration:** Ordio mentions and internal links must be natural and contextual
7. **Quality Over Speed:** Focus on perfect output, not rushed automation
8. **Comprehensive Coverage:** Ensure all aspects are covered (SEO, GEO, AEO, user value)

## Process Overview

**Complete Workflow:** See `docs/content/blog/BLOG_POST_IMPROVEMENT_PROCESS.md` for complete workflow.

**Key Phases:**

1. **Preparation & Data Collection** - Collect all available data
2. **Analysis & Research** - Analyze performance and conduct SERP research
3. **Content Strategy & Planning** - Define strategy and create outline
4. **Content Creation** - Write new content from scratch
5. **SEO/GEO/AEO Optimization** - Optimize for all search platforms
6. **Content Quality Validation** - Validate quality standards
7. **Final Review & Publication** - Final checks and publication

## Data Collection Requirements

**Preferred:** Run `run-post-improvement-pipeline.php` for Phase 1 (single entry point). It runs GA4+GSC, derive-keywords, SISTRIX, PAA, SERP, FAQ, competition-levels, search-intent, competitor analysis (top 15), Firecrawl, content-depth-report, analysis, docs, SERP skeleton, pre-content checklist. See [CONTENT_OPTIMIZATION_WORKFLOW.md](../../docs/content/blog/CONTENT_OPTIMIZATION_WORKFLOW.md).

### Re-keyword and topic refresh (triggers)

Re-run **derive** + **improvement pipeline** (or at minimum `collect-post-keywords-sistrix.php` + `derive-target-keywords.php` when GSC exists) when:

- **GSC top queries shift** materially vs. current `target-keywords.json` / primary in post JSON
- **Major competitor refresh** (new H2 coverage, featured snippet change) per SERP review
- **SISTRIX / data cadence** — align with [blog-data-collection.mdc](blog-data-collection.mdc) (monthly SISTRIX refresh for active posts; tier-1 posts quarterly full improvement pipeline per strategy docs)

Then update `KEYWORD_DECISION.md` and `CONTENT_OUTLINE.md` **Evidence** rows as needed. Canonical workflow: [KEYWORD_RESEARCH_WORKFLOW.md](../../docs/content/blog/KEYWORD_RESEARCH_WORKFLOW.md).

### Required Data Collection (Manual Alternative)

**Before Starting Improvement:**

1. **GA4 Performance Data**

   ```bash
   php v2/scripts/blog/collect-post-performance-ga4.php --post=slug --category=category
   ```

2. **GSC Search Performance Data**

   ```bash
   php v2/scripts/blog/collect-post-performance-gsc.php --post=slug --category=category
   ```

3. **SISTRIX Keyword Data**

   ```bash
   php v2/scripts/blog/collect-post-keywords-sistrix.php --post=slug --category=category
   ```

4. **SERP Features**

   ```bash
   php v2/scripts/blog/collect-post-serp-features.php --post=slug --category=category
   ```

5. **Search Intent**

   ```bash
   php v2/scripts/blog/collect-post-search-intent.php --post=slug --category=category
   ```

6. **Competition Levels**
   ```bash
   php v2/scripts/blog/collect-post-competition-levels.php --post=slug --category=category
   ```

**See:** `docs/content/blog/IMPROVEMENT_DATA_COLLECTION_GUIDE.md` for complete guide.

### Data Validation

**After Collection:**

```bash
php v2/scripts/blog/validate-data-collection.php --post=slug --category=category
```

**Checks:**

- All data files exist
- JSON files are valid
- Data freshness (< 7 days for GA4/GSC, < 30 days for SISTRIX)
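
The freshness limits in the last check can be sketched as a lookup plus a date comparison. This is an illustration with hypothetical names; how `validate-data-collection.php` reads the collection timestamp is not specified here, so the timestamp is passed in directly.

```python
from datetime import datetime, timedelta

# Maximum age per data source, taken from the checks listed above.
MAX_AGE = {
    "ga4": timedelta(days=7),
    "gsc": timedelta(days=7),
    "sistrix": timedelta(days=30),
}


def is_fresh(source: str, collected_at: datetime, now: datetime) -> bool:
    """True if the data is younger than the limit for its source."""
    return now - collected_at < MAX_AGE[source]


if __name__ == "__main__":
    now = datetime(2026, 3, 20)
    print(is_fresh("gsc", datetime(2026, 3, 15), now))
    print(is_fresh("ga4", datetime(2026, 3, 10), now))
```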

## SERP Analysis Requirements

### Manual SERP Research

**CRITICAL:** Conduct deep SERP analysis before content creation.

**Process:**

1. **Search Primary Keyword** (incognito mode)
2. **Document SERP Features** (featured snippets, PAA, knowledge panels)
3. **Analyze Top 5 Results** (content structure, depth, formats)
4. **Content Gap Analysis** (missing topics, weak topics, missing formats)
5. **Ranking Factor Analysis** (what makes top results rank)

**See:** `docs/content/blog/SERP_ANALYSIS_WORKFLOW.md` for complete workflow.

**Documentation:** Create `SERP_ANALYSIS.md` files dynamically in post documentation folders using the [SERP_ANALYSIS.md template](../../docs/content/blog/posts/_templates/SERP_ANALYSIS.md). The template is copied to `docs/content/blog/posts/{category}/{slug}/SERP_ANALYSIS.md` for each post.

### Content Gap Identification

**Required Analysis:**

- Missing topics (competitors cover but current post doesn't)
- Weak topics (covered but not as comprehensively)
- Missing formats (tables, lists, calculators, videos)
- Missing FAQs (PAA questions not answered)
- Ranking factor gaps (content depth, technical factors)




# Blog Post Improvement Process Rules

**Last Updated:** 2026-01-18



## Validation Requirements

### Pre-Publication Validation

**Required Checks:**

1. **Content Quality**

   ```bash
   php v2/scripts/blog/validate-content-flow.php --post=slug --category=category
   ```

2. **FAQ Quality**

   ```bash
   php v2/scripts/blog/validate-faq-quality.php --post=slug --category=category
   php v2/scripts/blog/validate-faq-schema.php --post=slug --category=category
   ```

3. **SEO Validation**

   ```bash
   php v2/scripts/blog/seo-validation-tier1.php --post=slug --category=category
   ```

4. **Browser Testing**
   - Load post in browser
   - Test all links
   - Test calculators/interactive elements
   - Test mobile responsiveness
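
The scripted checks (1 to 3) can be chained so that a failure stops the run early. The sketch below is a hypothetical wrapper: the script paths and flags are the ones shown above, but it assumes that each script signals failure with a non-zero exit code, which this document only states explicitly for `validate-improvement-readiness.php`.

```python
import subprocess

# Validation scripts in the order listed above.
SCRIPTS = [
    "validate-content-flow.php",
    "validate-faq-quality.php",
    "validate-faq-schema.php",
    "seo-validation-tier1.php",
]


def build_commands(slug: str, category: str) -> list:
    """Return one argv list per validation script."""
    return [
        ["php", f"v2/scripts/blog/{name}", f"--post={slug}", f"--category={category}"]
        for name in SCRIPTS
    ]


def run_all(slug: str, category: str) -> bool:
    """Run each validator in order; stop at the first non-zero exit code."""
    for cmd in build_commands(slug, category):
        if subprocess.run(cmd).returncode != 0:
            print("failed:", " ".join(cmd))
            return False
    return True
```

Browser testing stays manual and still has to follow a successful run.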

### Quality Checklists

**Required Checklists:**

- [Content Quality Checklist](CONTENT_QUALITY_CHECKLIST_IMPROVEMENT.md)
- [SEO/GEO/AEO Checklist](SEO_GEO_AEO_CHECKLIST.md)
- [Internal Linking Guide](INTERNAL_LINKING_IMPROVEMENT_GUIDE.md)

## Phase Gate

**Phase 4 (Content Creation) cannot begin until `validate-improvement-readiness.php` passes.** Run `php v2/scripts/blog/validate-improvement-readiness.php --post={slug} --category={category}`. Exit 0 = ready; exit 1 = complete Phases 1–3 first.

**When user says "do post X" or "improve post X":** Treat as full improvement process. Run pipeline, conduct SERP analysis, create CONTENT_OUTLINE.md, then rewrite content. Do not patch.

**One post at a time:** Complete all phases for a single post before starting the next. Do not begin post B while post A is still in content creation or validation. See [CONTENT_OPTIMIZATION_WORKFLOW.md](../../docs/content/blog/CONTENT_OPTIMIZATION_WORKFLOW.md).

## Workflow Execution

### When Improving a Blog Post

**Step 1: Preparation**

1. User provides post slug and category
2. Review current post content
3. Run `run-post-improvement-pipeline.php` (or collect data manually: GA4, GSC, derive-keywords, SISTRIX, etc.)
4. Review GA4/GSC data when present (bounce, engagement, top queries). See GSC_GA4_CONTENT_DECISION_GUIDE.md.
5. Validate data collection

**Step 2: Analysis**

1. Analyze current performance (GA4, GSC data)
2. Review SISTRIX keyword data
3. Run `collect-post-competitor-analysis.php` (output: competitor-analysis.json)
4. Run `analyze-competitor-content-depth.php` (output: competitive-depth-analysis.md)
5. Conduct SERP analysis (manual, browser-based; primary + 2 secondary keywords; per SERP_REVIEW_CHECKLIST.md)
6. Generate SERP skeleton (`generate-serp-analysis-skeleton.php`), then fill manual sections
7. Generate analysis documentation

**Step 3: Strategy**

1. **Competitive Analysis:**
   - Analyze competitor content depth (word counts, topics covered)
   - Identify content gaps vs competitors
   - Calculate competitor average word count
   - Set target word count (competitor average × 1.2-1.3 for competitive depth)
   - Determine if skyscraper approach needed (high competition, 2,500+ word competitors)

2. Define content strategy (based on analysis and competitive positioning)
3. Create content outline
4. Identify valuable elements to preserve
5. Plan content structure
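
The word count target from the competitive analysis is simple arithmetic. The sketch below is a minimal illustration with hypothetical function names. It assumes that "2,500+ word competitors" means an average of at least 2,500 words and that both skyscraper conditions must hold; the source does not spell out either point.

```python
from statistics import mean


def target_word_range(competitor_word_counts: list) -> tuple:
    """Competitor average x 1.2 to x 1.3, rounded to whole words."""
    average = mean(competitor_word_counts)
    return (round(average * 1.2), round(average * 1.3))


def needs_skyscraper(difficulty: int, competitor_word_counts: list) -> bool:
    """Assumption: high competition (50+) AND competitor average of 2,500+ words."""
    return difficulty >= 50 and mean(competitor_word_counts) >= 2500


if __name__ == "__main__":
    counts = [2000, 2500, 3000]
    print(target_word_range(counts))
    print(needs_skyscraper(55, counts))
```

For competitors averaging 2,500 words, this yields a target of 3,000 to 3,250 words, which then becomes the outline target.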

**Step 4: Content Creation**

1. Run `generate-section-briefs.php`; copy into CONTENT_OUTLINE
2. Extract valuable elements
3. **Section-by-section drafting:** Write each H2 to full depth from outline before next. No incremental word-count chasing.
4. Reintegrate elements at appropriate positions
5. Create FAQs (from PAA and GSC queries)
6. Run `validate-section-depth.php` and `validate-content-completeness.php` before finalizing

**Step 5: Optimization**

1. SEO optimization (meta tags, schema, keywords)
2. GEO optimization (AI-ready content, citations)
3. AEO optimization (featured snippets, PAA, E-E-A-T)
4. Internal linking (10-15 natural links)

**Step 6: Validation**

1. Content quality validation
2. SEO/GEO/AEO validation
3. FAQ validation
4. Browser testing

**Step 7: Publication**

1. Update post JSON file
2. Validate JSON file
3. Publish
4. Monitor performance

## Script Usage Guidelines

### Data Collection Scripts

**See:** `docs/content/blog/DATA_COLLECTION_SCRIPTS_INVENTORY.md` for complete inventory.

**Key Scripts:**

- `collect-post-performance-ga4.php` - GA4 metrics
- `collect-post-performance-gsc.php` - GSC search data
- `collect-post-keywords-sistrix.php` - SISTRIX keywords
- `collect-post-serp-features.php` - SERP features
- `collect-post-competitor-analysis.php` - Competitor URLs, headings, word count, FAQs (~1-3 credits)
- `collect-post-search-intent.php` - Search intent
- `collect-post-competition-levels.php` - Competition data

### Analysis Scripts

**See:** `docs/content/blog/ANALYSIS_SCRIPTS_GUIDE.md` for complete guide.

**Key Scripts:**

- `analyze-post-content.php` - Content analysis
- `analyze-post-seo.php` - SEO analysis
- `analyze-post-links.php` - Link analysis
- `analyze-competitor-content-depth.php` - Competitor depth, content gaps, word count target
- `generate-serp-analysis-skeleton.php` - Pre-fill SERP_ANALYSIS.md from competitor data
- `comprehensive-faq-analysis.php` - FAQ analysis
- `validate-content-flow.php` - Flow validation

## Quality Standards Enforcement

### Content Quality

**Must Meet:**

- Human-first content standards
- AI content avoidance requirements
- Natural flow requirements
- Value and user experience standards
- Ordio integration guidelines

### SEO/GEO/AEO Quality

**Must Meet:**

- SEO optimization requirements
- GEO optimization requirements
- AEO optimization requirements
- Technical SEO requirements
- Schema markup requirements

### Validation

**Must Pass:**

- Content flow validation (score ≥80/100)
- FAQ quality validation
- FAQ schema validation
- SEO validation
- Browser testing
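
The "Must Pass" list can be read as a simple gate: the flow score has to reach 80, and every other check has to pass. The sketch below assumes the individual results are already available as booleans. The function and check names are hypothetical and do not reflect how `validate-content-flow.php` reports its score.

```python
FLOW_SCORE_MIN = 80  # from the "score ≥80/100" requirement


def publication_gate(flow_score, checks):
    """Return (ok, failures) for a flow score and a dict of named pass/fail results."""
    failures = []
    if flow_score < FLOW_SCORE_MIN:
        failures.append(f"content flow score {flow_score} is below {FLOW_SCORE_MIN}")
    # Any check that did not pass blocks publication.
    failures.extend(name for name, passed in checks.items() if not passed)
    return (not failures, failures)


ok, failures = publication_gate(
    84,
    {"faq_quality": True, "faq_schema": True, "seo": True, "browser_testing": False},
)
print(ok, failures)  # False ['browser_testing']
```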

## Related Documentation

**Quick Start:**

- [Quick Start Guide](../docs/content/blog/QUICK_START_IMPROVEMENT.md) - Quick reference for getting started

**Process Documentation:**

- [Blog Post Improvement Process](../docs/content/blog/BLOG_POST_IMPROVEMENT_PROCESS.md) - Complete workflow
- [Improvement Data Collection Guide](../docs/content/blog/IMPROVEMENT_DATA_COLLECTION_GUIDE.md) - Data collection
- [SERP Analysis Workflow](../docs/content/blog/SERP_ANALYSIS_WORKFLOW.md) - SERP analysis
- [Content Creation Workflow 2026](../docs/content/blog/CONTENT_CREATION_WORKFLOW_2026.md) - Outline-first, section-by-section drafting

**Checklists:**

- [SEO/GEO/AEO Checklist](../docs/content/blog/SEO_GEO_AEO_CHECKLIST.md) - Optimization checklist
- [Content Quality Checklist](../docs/content/blog/CONTENT_QUALITY_CHECKLIST_IMPROVEMENT.md) - Quality standards
- [Internal Linking Guide](../docs/content/blog/INTERNAL_LINKING_IMPROVEMENT_GUIDE.md) - Linking guidelines

**Scripts:**

- [Data Collection Scripts Inventory](../docs/content/blog/DATA_COLLECTION_SCRIPTS_INVENTORY.md) - Data scripts
- [Analysis Scripts Guide](../docs/content/blog/ANALYSIS_SCRIPTS_GUIDE.md) - Analysis scripts
- [SERP Analysis Tools](../docs/content/blog/SERP_ANALYSIS_TOOLS.md) - SERP tools

## Quick Reference

### Complete Workflow Summary

1. **Prepare:** Identify post, review current state, collect all data
2. **Analyze:** Review performance, conduct SERP research, generate analysis
3. **Plan:** 
   - Competitive analysis (competitor word counts, content gaps)
   - Define strategy (based on competition and search intent)
   - Set word count target (flexible, data-driven)
   - Create outline (based on competitor analysis and content gaps)
   - Identify elements to preserve
4. **Create:** Section-by-section comprehensive drafting; write each H2 section to full depth from the outline before starting the next. Run `generate-section-briefs.php` before writing. Prohibited: incremental word-count chasing.
5. **Optimize:** SEO/GEO/AEO optimization, internal linking
6. **Validate:** Quality checks, browser testing
7. **Publish:** Update JSON, validate, publish, monitor

### Estimated Time per Post

- **Data Collection:** 10-20 minutes (with parallelization: `run-parallel-collection`, `run-parallel-analyze`)
- **SERP Analysis:** 1-2 hours
- **Content Creation:** 4-8 hours (section-by-section drafting is faster than incremental patching: fewer passes, clearer scope)
- **Optimization:** 1-2 hours
- **Validation:** 30 minutes
- **Total:** roughly 7-13 hours per post, summing the ranges above (data collection ~30-40% faster with parallelization)

### Critical Requirements

1. **Data Collection:** All data collected before starting
2. **SERP Analysis:** Manual analysis of top 10 results
3. **New From Scratch:** Treat as new creation, not building around existing
4. **Element Preservation:** Extract and reintegrate valuable elements
5. **Definition Placement:** Within first 20% of content
6. **FAQs:** In `faqs` array, NOT in content HTML
7. **Internal Links:** 10-15 natural, contextual links
8. **Quality Standards:** All checklists passed before publication
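
Requirements 5-7 lend themselves to a quick automated pre-check. The sketch below rests on several assumptions that are not part of the documented tooling: the post is a dict with a `content` HTML string and a `faqs` list, the caller supplies the phrase that opens the definition, an internal link is any anchor whose `href` starts with a single `/`, and a heading containing "FAQ" signals FAQs embedded in the content HTML. The tag stripping is deliberately crude and only suitable for estimating positions.

```python
import re


def strip_tags(html):
    """Crude tag removal; fine for position estimates, not for real HTML parsing."""
    return re.sub(r"<[^>]+>", " ", html)


def definition_in_first_20_percent(html, definition_phrase):
    """True if the phrase starts within the first 20% of the visible text."""
    text = strip_tags(html)
    pos = text.find(definition_phrase)
    return pos != -1 and pos <= 0.2 * len(text)


def count_internal_links(html):
    """Count anchors whose href starts with a single '/' (assumed internal)."""
    return len(re.findall(r'<a\s[^>]*href="/(?!/)[^"]*"', html))


def check_critical(post, definition_phrase):
    """Return a dict of named pass/fail results for requirements 5-7."""
    html = post["content"]
    links = count_internal_links(html)
    return {
        "definition_placement": definition_in_first_20_percent(html, definition_phrase),
        "faqs_in_array": bool(post.get("faqs")),
        # Assumption: a heading containing "FAQ" means FAQs leaked into the HTML.
        "no_faq_heading_in_content": not re.search(
            r"<h[1-6][^>]*>[^<]*FAQ", html, re.IGNORECASE
        ),
        "internal_links_10_to_15": 10 <= links <= 15,
    }
```

A failing entry here is a prompt for manual review rather than a verdict, since these heuristics cannot judge whether links are natural and contextual.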

## Overview

**This rule has been split into focused files for better maintainability:**

- **`blog-improvement-process-core.mdc`** - Core principles, process overview, data collection requirements, SERP analysis requirements
- **`blog-improvement-process-content.mdc`** - Content creation approach, depth standards, quality standards, SEO/GEO/AEO optimization, internal linking, FAQ requirements, Ordio integration
- **`blog-improvement-process-workflow.mdc`** - Validation requirements, workflow execution, script usage guidelines, quality standards enforcement, quick reference

Rules for the manual-focused blog post improvement process, emphasizing deep data analysis, SERP research, and human-first content creation.

**See the split files above for detailed documentation.**
