# FAQ Schema Best Practices

**Last Updated:** 2026-01-15

A guide to implementing and optimizing FAQPage schema in line with Google's structured data guidelines, covering SEO, AEO (Answer Engine Optimization), and GEO (Generative Engine Optimization).

## Overview

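FAQPage schema markup can make a page eligible for FAQ rich results in Google Search, can improve visibility in Featured Snippets, and helps AI search engines parse question-and-answer content. Since August 2023, Google has shown FAQ rich results mainly for well-known, authoritative government and health websites, so eligibility does not guarantee display. This guide covers implementation, validation, and optimization best practices.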

## Schema Structure Requirements

### Required Properties

All FAQPage schemas must include:

1. **`@context`**: Must be `"https://schema.org"`
2. **`@type`**: Must be `"FAQPage"`
3. **`mainEntity`**: Array of Question objects (at least one required)
4. **Each Question must have:**
   - `@type`: `"Question"`
   - `name`: The question text (string)
   - `acceptedAnswer`: Object with:
     - `@type`: `"Answer"`
     - `text`: The answer text (plain text in this project; Google accepts a limited set of HTML tags here, but the generator strips all markup)

### Example Valid Schema

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is Excel time tracking?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Excel time tracking is the use of Excel spreadsheets to record employees' working hours. However, Excel is not dedicated time-tracking software and can be error-prone."
      }
    }
  ]
}
```
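
The structure above maps directly onto a nested PHP array. The following is a minimal sketch of that mapping, not the project's generator: the function name `build_faq_page_schema()` and the `question`/`answer` input keys are assumptions made for illustration.

```php
<?php

/**
 * Build a FAQPage structure from question/answer pairs.
 * Hypothetical helper for illustration; not the project's generator.
 *
 * @param array<int, array{question: string, answer: string}> $faqs
 */
function build_faq_page_schema(array $faqs): array
{
    $mainEntity = [];
    foreach ($faqs as $faq) {
        $mainEntity[] = [
            '@type' => 'Question',
            'name' => $faq['question'],
            'acceptedAnswer' => [
                '@type' => 'Answer',
                'text' => $faq['answer'], // expected to be cleaned plain text
            ],
        ];
    }

    return [
        '@context' => 'https://schema.org',
        '@type' => 'FAQPage',
        'mainEntity' => $mainEntity,
    ];
}

$schema = build_faq_page_schema([
    [
        'question' => 'What is Excel time tracking?',
        'answer' => 'Recording working hours in Excel spreadsheets.',
    ],
]);

// Pretty-printed for inspection on the command line.
echo json_encode($schema, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE), PHP_EOL;
```

Keeping the structure as an array until the last moment makes it easy to validate before it is encoded.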

## Text Cleaning Best Practices

### HTML Stripping

- **All HTML tags must be removed** from answer text in schema
- Use `strip_tags()` to remove all HTML elements
- Links, formatting, and other HTML should be stripped completely

### Text Normalization

1. **Decode HTML entities**: Use `html_entity_decode()` with `ENT_QUOTES | ENT_HTML5`
2. **Replace smart quotes**: Convert smart quotes to regular quotes for consistency
3. **Normalize whitespace**: Multiple spaces/tabs/newlines → single space
4. **Trim whitespace**: Remove leading/trailing whitespace
5. **Ensure UTF-8 encoding**: Validate and convert to UTF-8 if needed
6. **Remove control characters**: Strip non-printable characters

### Implementation

The schema generator automatically performs these steps:

```php
// Step 1: Remove all HTML tags
$answer_text = strip_tags($answer_html);

// Step 2: Decode HTML entities
$answer_text = html_entity_decode($answer_text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Step 3: Replace smart quotes
$smartQuotes = ["\xE2\x80\x9E", "\xE2\x80\x9C", "\xE2\x80\x9D", "\xE2\x80\x98", "\xE2\x80\x99"];
$regularQuotes = ['"', '"', '"', "'", "'"];
$answer_text = str_replace($smartQuotes, $regularQuotes, $answer_text);

// Step 4: Normalize whitespace
$answer_text = preg_replace('/\s+/', ' ', $answer_text);

// Step 5: Trim whitespace
$answer_text = trim($answer_text);

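// Step 6: Ensure valid UTF-8
// Converting from UTF-8 to UTF-8 replaces invalid byte sequences with the
// mbstring substitute character ("?" by default).
if (!mb_check_encoding($answer_text, 'UTF-8')) {
    $answer_text = mb_convert_encoding($answer_text, 'UTF-8', 'UTF-8');
}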

// Step 7: Remove control characters
$answer_text = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', '', $answer_text);
```
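
The steps above could also be packaged as a single function with a few edge cases handled. This is a sketch, not the generator's actual code: the name `clean_faq_answer_text()` is hypothetical, and it deliberately changes the order. Invalid bytes are replaced first so the Unicode-aware patterns cannot fail, a space is added after closing block tags so adjacent paragraphs do not run together, control characters are removed before whitespace is collapsed, and the no-break space produced by `&nbsp;` is treated as whitespace.

```php
<?php

/**
 * Convert an HTML answer to normalized plain text for the schema.
 * Hypothetical helper; the name and step order are illustrative assumptions.
 */
function clean_faq_answer_text(string $answerHtml): string
{
    // Replace invalid byte sequences first so the /u patterns below cannot fail.
    $text = mb_convert_encoding($answerHtml, 'UTF-8', 'UTF-8');

    // Add a space after block-level closers so "<p>A</p><p>B</p>" does not become "AB".
    $spaced = preg_replace('~</(?:p|div|li|h[1-6])>|<br\s*/?>~i', '$0 ', $text);
    $text = strip_tags($spaced ?? $text);

    // Decode entities such as &amp; and &nbsp; into their characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Map typographic quotes to ASCII quotes.
    $text = str_replace(
        ["\u{201E}", "\u{201C}", "\u{201D}", "\u{2018}", "\u{2019}"],
        ['"', '"', '"', "'", "'"],
        $text
    );

    // Remove control characters before collapsing whitespace, so no double spaces remain.
    $text = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/u', '', $text) ?? $text;

    // Collapse whitespace, including the no-break space (U+00A0).
    $text = preg_replace('/[\s\x{00A0}]+/u', ' ', $text) ?? $text;

    return trim($text);
}
```

Whether the extra space after block tags is wanted depends on how the visible answer is rendered, so it is worth checking the result against the page text.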

## Google Requirements

### Reflective Content Requirement

**CRITICAL**: Schema content must match visible content on the page.

- Questions in schema must appear on the page (visible or expandable)
- Answers in schema must match answers displayed on the page
- Google may ignore the markup, or apply a manual action, if the schema content does not match the visible content
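
One way to check this requirement is to reduce both the schema answer and the visible answer to the same comparable form and compare the results. The sketch below assumes the visible answer is available as an HTML string; the function names `faq_comparable_text()` and `faq_answer_matches_page()` are hypothetical.

```php
<?php

/** Reduce HTML or plain text to a comparable form: no tags, decoded entities, ASCII quotes, single spaces. */
function faq_comparable_text(string $value): string
{
    $text = html_entity_decode(strip_tags($value), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $text = str_replace(
        ["\u{201E}", "\u{201C}", "\u{201D}", "\u{2018}", "\u{2019}"],
        ['"', '"', '"', "'", "'"],
        $text
    );
    // Collapse whitespace, including the no-break space (U+00A0).
    $collapsed = preg_replace('/[\s\x{00A0}]+/u', ' ', $text);

    return trim($collapsed ?? $text);
}

/** True when the schema answer carries the same text as the answer shown on the page. */
function faq_answer_matches_page(string $schemaText, string $visibleHtml): bool
{
    return faq_comparable_text($schemaText) === faq_comparable_text($visibleHtml);
}

var_dump(faq_answer_matches_page(
    'Use the export button.',
    '<p>Use the <a href="/export">export</a>  button.</p>'
)); // bool(true)
```

Links and formatting in the visible answer do not count as a mismatch here, because only the text content is compared.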

### Single FAQPage Schema

- Only **one** FAQPage schema per page
- Multiple FAQPage schemas will cause validation errors
- Check for duplicate schemas from themes/plugins
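
Duplicates can be detected by counting the FAQPage blocks in the rendered HTML. The following sketch uses a regular expression rather than a full HTML parser, and the function name `count_faqpage_schemas()` is hypothetical. It only looks at top-level JSON-LD objects, so FAQPage nodes nested in an `@graph` container or expressed as Microdata are not counted.

```php
<?php

/**
 * Count JSON-LD blocks in rendered HTML whose top-level @type is FAQPage.
 * Hypothetical helper; does not inspect @graph containers or Microdata.
 */
function count_faqpage_schemas(string $html): int
{
    $pattern = '~<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>~is';
    preg_match_all($pattern, $html, $matches);

    $count = 0;
    foreach ($matches[1] as $json) {
        $data = json_decode($json, true);
        if (!is_array($data)) {
            continue; // skip blocks that are not valid JSON
        }

        // A block may hold one object or a list of objects.
        $items = isset($data['@type']) ? [$data] : $data;
        foreach ($items as $item) {
            // @type may be a single string or a list of types.
            if (is_array($item) && in_array('FAQPage', (array) ($item['@type'] ?? []), true)) {
                $count++;
            }
        }
    }

    return $count;
}
```

A result greater than 1 suggests that a theme, plugin, or second template is emitting its own FAQPage markup.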

### Schema Placement

- JSON-LD script can be in `<head>` or `<body>`
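- Prefer markup that is present in the server-rendered HTML; Google can process JSON-LD injected by JavaScript, but that depends on the page rendering successfully
- Do not inject the schema in response to user interaction or delayed events, where crawlers may never see it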
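
For server-side output, the schema array can be encoded and wrapped in a script element in one step. This is a sketch under assumptions: `render_jsonld_script()` is a hypothetical name, and the project's own renderer may work differently. The `JSON_HEX_TAG` flag is used so that answer text containing `<` or `>` cannot close the script element early.

```php
<?php

/**
 * Render a schema array as a JSON-LD script element for server-side output.
 * Hypothetical helper; the project's own renderer may differ.
 */
function render_jsonld_script(array $schema): string
{
    // JSON_HEX_TAG encodes < and > as \u003C and \u003E inside the JSON.
    // JSON_THROW_ON_ERROR raises a JsonException instead of returning false.
    $json = json_encode(
        $schema,
        JSON_UNESCAPED_UNICODE | JSON_HEX_TAG | JSON_THROW_ON_ERROR
    );

    return '<script type="application/ld+json">' . $json . '</script>';
}

echo render_jsonld_script([
    '@context' => 'https://schema.org',
    '@type' => 'FAQPage',
    'mainEntity' => [],
]), PHP_EOL;
```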

## SEO/AEO/GEO Optimization

### SEO Optimization

1. **Natural Language Questions**: Use natural, conversational questions
2. **Keyword Integration**: Include primary keywords naturally in answers
3. **Answer Length**: Aim for roughly 40-80 words; this is a common target for Featured Snippets, not a limit published by Google
4. **Comprehensive Answers**: Provide complete, helpful answers
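
The answer-length guideline can be checked mechanically. The sketch below counts whitespace-separated words, which also works for UTF-8 text such as German umlauts; the function names and the default range of 40 to 80 words are assumptions taken from the guideline above, not fixed limits.

```php
<?php

/** Count whitespace-separated words in UTF-8 text. */
function faq_word_count(string $text): int
{
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on invalid UTF-8 input; treat that as zero words.
    return $words === false ? 0 : count($words);
}

/** True when the answer length falls inside the target range. */
function faq_answer_length_ok(string $text, int $min = 40, int $max = 80): bool
{
    $count = faq_word_count($text);

    return $count >= $min && $count <= $max;
}

var_dump(faq_word_count('Excel ist fehleranfällig')); // int(3)
```

Answers outside the range are not invalid; the check is only a prompt to review them.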

### AEO (Answer Engine Optimization)

1. **Direct Answers**: Answer the question directly in first sentence
2. **Clear Structure**: Use clear, concise language
3. **Featured Snippet Format**: Structure for snippet display
4. **Context and Examples**: Include relevant context and examples

### GEO (Generative Engine Optimization)

1. **Factual Content**: Ensure answers are factual and authoritative
2. **Relevant Details**: Include relevant details and context
3. **Natural Language**: Use natural language patterns
4. **AI Understanding**: Optimize for AI comprehension

## Validation Checklist

### Pre-Deployment Validation

- [ ] Schema validates in Google Rich Results Test
- [ ] Schema validates in Schema Markup Validator
- [ ] All required properties present
- [ ] No HTML tags in answer text
- [ ] Text properly normalized (no extra whitespace, smart quotes)
- [ ] UTF-8 encoding verified
- [ ] Only one FAQPage schema per page
- [ ] Schema matches visible content
- [ ] Question and answer text in the schema matches the page text after normalization (tags stripped, whitespace and quotes normalized)
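
The structural items in this checklist can be automated on the decoded schema array. The following is a sketch only: `find_faq_schema_problems()` is a hypothetical name, it is not the project's validation script, and it does not replace the Rich Results Test or the content-matching check.

```php
<?php

/**
 * Check a decoded FAQPage schema against the structural checklist items.
 * Hypothetical helper; it is not the project's validation script.
 *
 * @return string[] list of problems, empty when the structure looks valid
 */
function find_faq_schema_problems(array $schema): array
{
    $problems = [];

    if (($schema['@context'] ?? null) !== 'https://schema.org') {
        $problems[] = '@context must be https://schema.org';
    }
    if (($schema['@type'] ?? null) !== 'FAQPage') {
        $problems[] = '@type must be FAQPage';
    }

    $questions = $schema['mainEntity'] ?? null;
    if (!is_array($questions) || $questions === []) {
        $problems[] = 'mainEntity must contain at least one Question';
        return $problems;
    }

    foreach ($questions as $i => $question) {
        $name = $question['name'] ?? '';
        $answer = $question['acceptedAnswer']['text'] ?? '';

        if (($question['@type'] ?? null) !== 'Question' || !is_string($name) || trim($name) === '') {
            $problems[] = "Question $i: missing @type or name";
        }
        if (($question['acceptedAnswer']['@type'] ?? null) !== 'Answer'
            || !is_string($answer) || trim($answer) === '') {
            $problems[] = "Question $i: missing acceptedAnswer";
            continue;
        }
        if ($answer !== strip_tags($answer)) {
            $problems[] = "Question $i: answer contains HTML";
        }
        if (!mb_check_encoding($answer, 'UTF-8')) {
            $problems[] = "Question $i: answer is not valid UTF-8";
        } elseif (preg_match('/\s{2,}|[\x{201C}\x{201D}\x{201E}\x{2018}\x{2019}]/u', $answer) === 1) {
            $problems[] = "Question $i: answer is not normalized";
        }
    }

    return $problems;
}
```

An empty return value means only that the structure passed these checks; it says nothing about whether Google will show a rich result.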

### Validation Script

Use the validation script to check schema:

```bash
# Validate single post
php v2/scripts/blog/validate-faq-schema.php --post=slug --category=category

# Validate all posts
php v2/scripts/blog/validate-faq-schema.php --all
```

**What it checks:**

- FAQPage schema presence
- Required properties
- Schema structure
- Text normalization
- Content matching
- Duplicate schemas

## Common Issues and Solutions

### Issue: "FAQPage schema not found"

**Cause**: Schema not generated or not included in page

**Solution**:
1. Check FAQs array exists in post JSON
2. Verify schema generator is called with FAQs
3. Check rendered HTML for schema presence

### Issue: "Schema answer contains HTML"

**Cause**: HTML tags not stripped from answer text

**Solution**: Ensure `strip_tags()` is called on the answer text before it is added to the schema

### Issue: "Answer mismatch"

**Cause**: Schema answer doesn't match visible content

**Solution**:
1. Verify schema answer matches displayed answer
2. Check for HTML links in visible content (may be stripped in schema)
3. Ensure text normalization matches

### Issue: "Multiple FAQPage schemas found"

**Cause**: Duplicate schemas from theme/plugin

**Solution**: Remove the duplicate schema sources so that only one FAQPage schema remains on the page

## Testing and Monitoring

### Google Rich Results Test

Test schema with Google's tool:
https://search.google.com/test/rich-results

### Schema Markup Validator

Validate JSON-LD syntax:
https://validator.schema.org/

### Search Console Monitoring

Monitor schema errors in Google Search Console:
- Rich Results report
- Structured Data section
- URL Inspection tool

## Implementation Notes

### Schema Generator Location

File: `v2/config/blog-schema-generator.php`

Function: `generate_blog_schema('post', $data, $overrides)`

### Schema Rendering

File: `v2/pages/blog/post.php`

Schema is rendered in `<head>` section via `render_blog_schema()` function.

### Validation Script

File: `v2/scripts/blog/validate-faq-schema.php`

Validates schema generation and structure for a single post or for all posts.

## Best Practices Summary

1. ✅ Always strip HTML from answer text
2. ✅ Normalize text (whitespace, quotes, encoding)
3. ✅ Ensure schema matches visible content
4. ✅ Use only one FAQPage schema per page
5. ✅ Validate with Google Rich Results Test
6. ✅ Monitor Search Console for errors
7. ✅ Inspect the rendered page source in a browser to confirm the JSON-LD block is present before deployment
8. ✅ Follow SEO/AEO/GEO optimization guidelines

## Related Documentation

- `FAQ_CREATION_WORKFLOW_2026.md` - Complete FAQ creation workflow
- `FAQ_BEST_PRACTICES.md` - FAQ content best practices
- `FAQ_MANUAL_REVIEW_SEO_CHECKLIST.md` - Manual review checklist
- `.cursor/rules/blog-faq-optimization.mdc` - Cursor rules for FAQ optimization
