# Schema Markup Troubleshooting Guide

**Last Updated:** 2026-01-15

## Overview

This guide helps troubleshoot schema markup issues for blog posts, with a focus on SpeakableSpecification and other common schema problems.

## Common Issues

### SpeakableSpecification XPath Errors

**Symptom**: Google Schema Markup Validator shows the error "Keine Übereinstimmungen für den Ausdruck html/body/article//p[1] gefunden." (English: "No matches found for the expression html/body/article//p[1].")

**Root Cause**:
- The XPath `/html/body/article//p[1]` only matches when `<article>` is a direct child of `<body>`
- The actual HTML wraps the article in one or more divs: `<body><div><article>...</article></div></body>`
- The schema declared both `cssSelector` and `xpath`; Google's speakable guidelines ask for one or the other, not both

**Solution**:
- Use CSS selector only (remove XPath)
- Use selector: `.post-content-inner p:first-of-type`
- This matches the actual HTML structure: `<div class="post-content-inner"><p>...</p></div>`

**Fixed In**: `v2/config/blog-schema-generator.php` (lines 211-222)
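
To see why the absolute XPath fails while a class-based selector survives wrapper elements, the following sketch loads a simplified page with PHP's DOM extension. The sample markup is an assumption based on the structure described above, and the second query is an XPath stand-in for `.post-content-inner p:first-of-type`, used only because `DOMXPath` does not accept CSS selectors.

```php
<?php
// Simplified page (assumed): <article> sits inside a wrapper <div>,
// not directly under <body>.
$html = '<html><body><div><article>'
      . '<div class="post-content-inner"><p>First paragraph.</p><p>Second.</p></div>'
      . '</article></div></body></html>';

libxml_use_internal_errors(true); // libxml may warn about HTML5 tags such as <article>
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The absolute path requires <article> to be a direct child of <body>.
$absolute = $xpath->query('/html/body/article//p[1]');

// XPath stand-in for the CSS selector `.post-content-inner p:first-of-type`.
$classBased = $xpath->query(
    '//*[contains(concat(" ", normalize-space(@class), " "), " post-content-inner ")]//p[1]'
);

printf("absolute XPath matches: %d\n", $absolute->length);
printf("class-based matches:    %d\n", $classBased->length);
```

With this sample markup the absolute path finds nothing, while the class-based query finds the first paragraph regardless of how many wrappers surround the article.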

### Missing SpeakableSpecification

**Symptom**: SpeakableSpecification not present in schema

**Possible Causes**:
1. No `<p>` tags in HTML content
2. Empty HTML content
3. Content validation failed

**Solution**:
- Ensure blog post HTML content has at least one `<p>` tag
- Check `content.html` field in post JSON file
- SpeakableSpecification is only added when the content contains a first paragraph
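
The causes above can be told apart with a small check against the post JSON. This is a minimal sketch: `diagnoseSpeakable()` is a hypothetical helper, and only the `content.html` field name comes from this guide.

```php
<?php
/**
 * Report why SpeakableSpecification would be skipped for a post.
 * Hypothetical helper; not part of the project's scripts.
 */
function diagnoseSpeakable(string $postJson): string
{
    $post = json_decode($postJson, true);
    if (!is_array($post)) {
        return 'invalid post JSON';
    }

    $html = $post['content']['html'] ?? '';
    if (!is_string($html) || trim($html) === '') {
        return 'empty HTML content';
    }

    // Match <p> or <p ...>, but not <pre>, <param>, or <picture>.
    if (preg_match('/<p(?:\s[^>]*)?>/i', $html) !== 1) {
        return 'no <p> tag in HTML content';
    }

    return 'ok';
}

echo diagnoseSpeakable('{"content": {"html": "<h2>Title</h2><pre>code</pre>"}}'), "\n";
echo diagnoseSpeakable('{"content": {"html": "<p class=\"lead\">Intro</p>"}}'), "\n";
```

The first call reports a missing `<p>` tag even though the content contains `<pre>`; the second reports `ok`.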

### Invalid JSON-LD

**Symptom**: JSON-LD encoding fails or produces invalid JSON

**Possible Causes**:
1. Invalid UTF-8 characters
2. Unescaped special characters
3. Circular references

**Solution**:
- Run validation script: `php v2/scripts/blog/validate-schema-speakable.php --all`
- Check error logs for specific encoding issues
- Ensure all text fields are UTF-8 encoded
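
When the error log only says that encoding failed, it helps to know which field holds the bad bytes. The sketch below assumes PHP 7.3 or later (for `JSON_THROW_ON_ERROR`); `findInvalidUtf8()` is a hypothetical helper and the sample data is made up.

```php
<?php
/**
 * Collect the dotted paths of all string values that are not valid UTF-8.
 * Hypothetical helper, not part of the project's validation script.
 */
function findInvalidUtf8(array $data, string $path = ''): array
{
    $bad = [];
    foreach ($data as $key => $value) {
        $current = $path === '' ? (string) $key : "{$path}.{$key}";
        if (is_array($value)) {
            $bad = array_merge($bad, findInvalidUtf8($value, $current));
        } elseif (is_string($value) && preg_match('//u', $value) !== 1) {
            // With the /u modifier, preg_match() fails on malformed UTF-8.
            $bad[] = $current;
        }
    }
    return $bad;
}

$schema = [
    '@context' => 'https://schema.org',
    '@type'    => 'Article',
    'headline' => "Tarifvertr\xE4ge", // ISO-8859-1 byte, invalid as UTF-8
];

try {
    echo json_encode(
        $schema,
        JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE
    ), "\n";
} catch (JsonException $e) {
    echo 'Encoding failed: ', $e->getMessage(), "\n";
    echo 'Invalid fields: ', implode(', ', findInvalidUtf8($schema)), "\n";
}
```

Throwing on error avoids the silent `false` that `json_encode()` returns by default, which otherwise ends up in the page as an empty script block.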

### Missing Required Schema Types

**Symptom**: Article, BreadcrumbList, or other required schemas missing

**Possible Causes**:
1. Schema generation function error
2. Missing required data fields
3. Exception during generation

**Solution**:
- Check error logs for exceptions
- Verify post data has required fields (title, url, publication_date)
- Run verification script: `php v2/scripts/blog/verify-schema-types.php`

## Validation Scripts

### Validate SpeakableSpecification

```bash
# Validate all posts
php v2/scripts/blog/validate-schema-speakable.php --all

# Validate specific post
php v2/scripts/blog/validate-schema-speakable.php --post=tarifvertraege --category=lexikon
```

**Checks**:
- XPath not present (should use CSS selector only)
- CSS selector present and valid
- First paragraph exists in HTML
- JSON-LD is valid
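
For orientation, the checks on the schema itself could look roughly like the sketch below. It is not the script's actual implementation: `checkSpeakable()` is a hypothetical function that covers the XPath, CSS selector, and JSON validity checks, does not verify selector syntax, and does not look at the HTML.

```php
<?php
/**
 * Collect problems with the `speakable` property in a JSON-LD document.
 * A sketch of the checks listed above, not the project's actual script.
 */
function checkSpeakable(string $jsonLd): array
{
    $data = json_decode($jsonLd, true);
    if (!is_array($data)) {
        return ['JSON-LD is not a valid JSON object'];
    }

    $issues = [];
    // Accept both @graph documents and single-node documents.
    $nodes = is_array($data['@graph'] ?? null) ? $data['@graph'] : [$data];
    foreach ($nodes as $node) {
        $speakable = is_array($node) ? ($node['speakable'] ?? null) : null;
        if (!is_array($speakable)) {
            continue; // this node carries no SpeakableSpecification
        }
        if (isset($speakable['xpath'])) {
            $issues[] = 'xpath present; use cssSelector only';
        }
        // cssSelector may be a single string or a list of strings.
        $selectors = array_filter(
            (array) ($speakable['cssSelector'] ?? []),
            static fn ($s): bool => is_string($s) && trim($s) !== ''
        );
        if ($selectors === []) {
            $issues[] = 'cssSelector missing or empty';
        }
    }
    return $issues;
}

$jsonLd = '{"@context":"https://schema.org","@graph":[{"@type":"Article",'
        . '"speakable":{"@type":"SpeakableSpecification",'
        . '"cssSelector":[".post-content-inner p:first-of-type"],'
        . '"xpath":["/html/body/article//p[1]"]}}]}';

print_r(checkSpeakable($jsonLd));
```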

### Test All Schema Types

```bash
php v2/scripts/blog/test-schema-all-posts.php
```

**Checks**:
- JSON-LD encoding
- Schema structure
- Required schemas present
- SpeakableSpecification correct

### Verify Schema Types

```bash
php v2/scripts/blog/verify-schema-types.php
```

**Checks**:
- Article schema
- BreadcrumbList schema
- WebPage schema
- ImageObject schema
- Person schema (if author exists)
- FAQPage schema (if FAQs exist)
- SpeakableSpecification (if content exists)
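
The type check can be approximated by collecting every `@type` in the decoded `@graph` and comparing against the required list. A minimal sketch, assuming PHP 7.4 or later; `missingSchemaTypes()` is hypothetical and matches type names exactly, so a subtype such as `BlogPosting` would not count as `Article`.

```php
<?php
/**
 * List required schema types that are absent from a decoded @graph.
 * The required list mirrors this guide; the helper is hypothetical.
 */
function missingSchemaTypes(
    array $graph,
    array $required = ['Article', 'BreadcrumbList', 'WebPage', 'ImageObject']
): array {
    $present = [];
    foreach ($graph as $node) {
        if (!is_array($node)) {
            continue;
        }
        // @type may be a single string or a list of strings.
        foreach ((array) ($node['@type'] ?? []) as $type) {
            if (is_string($type)) {
                $present[$type] = true;
            }
        }
    }
    return array_values(array_filter(
        $required,
        static fn (string $type): bool => !isset($present[$type])
    ));
}

// Sample graph (made up): no ImageObject node.
$graph = [
    ['@type' => 'Article', 'headline' => 'Example'],
    ['@type' => ['WebPage', 'ItemPage']],
    ['@type' => 'BreadcrumbList'],
];

print_r(missingSchemaTypes($graph));
```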

## Best Practices

### SpeakableSpecification

1. **Use CSS Selector Only**:
   - ✅ DO: Use `cssSelector` only
   - ❌ DON'T: Mix CSS selector and XPath

2. **Selector Choice**:
   - ✅ DO: Use `.post-content-inner p:first-of-type`
   - ✅ DO: Ensure selector matches actual HTML structure
   - ❌ DON'T: Use XPath `/html/body/article//p[1]`

3. **Content Validation**:
   - ✅ DO: Check if first paragraph exists before adding SpeakableSpecification
   - ✅ DO: Skip if no paragraphs found
   - ❌ DON'T: Add SpeakableSpecification without content

### Schema Structure

1. **JSON-LD Format**:
   - ✅ DO: Use `@graph` array structure
   - ✅ DO: Include `@context: "https://schema.org"`
   - ✅ DO: Use proper `@id` references

2. **Required Schemas**:
   - ✅ DO: Include Article schema for all posts
   - ✅ DO: Include BreadcrumbList schema
   - ✅ DO: Include WebPage schema
   - ✅ DO: Include ImageObject schema

3. **Optional Schemas**:
   - ✅ DO: Include Person schema if author exists
   - ✅ DO: Include FAQPage schema if FAQs exist
   - ✅ DO: Include SpeakableSpecification if content exists
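
The structure rules above can be illustrated with a reduced `@graph` in which nodes point at each other through `@id`. This is a sketch only: every URL, fragment name, and value is a placeholder, and the project's generator may lay out its graph differently.

```php
<?php
// Placeholder URL; not taken from the project.
$url = 'https://example.com/blog/lexikon/tarifvertraege/';

$schema = [
    '@context' => 'https://schema.org',
    '@graph'   => [
        [
            '@type'              => 'WebPage',
            '@id'                => $url . '#webpage',
            'url'                => $url,
            'primaryImageOfPage' => ['@id' => $url . '#primaryimage'],
            'breadcrumb'         => ['@id' => $url . '#breadcrumb'],
        ],
        [
            '@type'            => 'Article',
            '@id'              => $url . '#article',
            'headline'         => 'Tarifverträge',
            'mainEntityOfPage' => ['@id' => $url . '#webpage'],
            'image'            => ['@id' => $url . '#primaryimage'],
            'speakable'        => [
                '@type'       => 'SpeakableSpecification',
                'cssSelector' => ['.post-content-inner p:first-of-type'],
            ],
        ],
        [
            '@type' => 'ImageObject',
            '@id'   => $url . '#primaryimage',
            'url'   => 'https://example.com/images/tarifvertraege.jpg',
        ],
        [
            '@type'           => 'BreadcrumbList',
            '@id'             => $url . '#breadcrumb',
            'itemListElement' => [
                ['@type' => 'ListItem', 'position' => 1, 'name' => 'Blog', 'item' => 'https://example.com/blog/'],
                ['@type' => 'ListItem', 'position' => 2, 'name' => 'Tarifverträge', 'item' => $url],
            ],
        ],
    ],
];

echo json_encode(
    $schema,
    JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR
), "\n";
```

Referencing nodes by `@id` keeps each entity defined once, so the Article and the WebPage can share the same image and breadcrumb without duplicating them.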

## Testing with Google Tools

### Google Rich Results Test

1. Visit: https://search.google.com/test/rich-results
2. Enter blog post URL
3. Check for errors:
   - XPath errors (should not appear)
   - Missing required fields
   - Invalid JSON-LD

### Schema.org Validator

1. Visit: https://validator.schema.org/
2. Enter blog post URL
3. Check for warnings:
   - SpeakableSpecification structure
   - Missing properties
   - Invalid property values
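
Both tools also accept a pasted code snippet, which is useful when a post is not publicly reachable yet. The sketch below pulls the JSON-LD blocks out of rendered HTML so they can be pasted in; `extractJsonLd()` is a hypothetical helper and the sample page is made up.

```php
<?php
/**
 * Return the decoded JSON-LD blocks found in an HTML document.
 * A block that is not valid JSON is returned as null.
 * Hypothetical helper, not part of the project's scripts.
 */
function extractJsonLd(string $html): array
{
    if (trim($html) === '') {
        return []; // DOMDocument::loadHTML() rejects empty input
    }

    libxml_use_internal_errors(true); // suppress warnings about HTML5 tags
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $blocks = [];
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//script[@type="application/ld+json"]') as $script) {
        $decoded = json_decode($script->textContent, true);
        $blocks[] = is_array($decoded) ? $decoded : null;
    }
    return $blocks;
}

$page = '<html><head><script type="application/ld+json">'
      . '{"@context":"https://schema.org","@type":"Article"}'
      . '</script></head><body></body></html>';

foreach (extractJsonLd($page) as $i => $block) {
    echo "Block {$i}: ",
        $block === null ? 'invalid JSON' : json_encode($block, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES),
        "\n";
}
```

Note that `loadHTML()` assumes ISO-8859-1 unless the page declares its charset, so non-ASCII text in the extracted blocks may need attention on pages without a charset declaration.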

## Debugging Steps

### Step 1: Check Schema Generation

```php
// In blog-schema-generator.php, add debug logging:
error_log("Schema generation for {$post_url}: SpeakableSpecification check");
error_log("HTML content length: " . strlen($html_content));
error_log("Has first paragraph: " . ($has_first_paragraph ? 'yes' : 'no'));
```

### Step 2: Verify HTML Structure

```bash
# Check HTML structure of a post
php v2/scripts/blog/analyze-schema-speakable.php
```

### Step 3: Test Schema Output

```bash
# Generate and display schema JSON-LD
php v2/scripts/blog/test-schema-output.php
```

### Step 4: Validate All Posts

```bash
# Run comprehensive validation
php v2/scripts/blog/validate-schema-speakable.php --all
php v2/scripts/blog/test-schema-all-posts.php
```

## Common Fixes

### Fix 1: Remove XPath from SpeakableSpecification

**Before**:
```php
$article['speakable'] = [
    '@type' => 'SpeakableSpecification',
    'cssSelector' => ['article .post-content-inner p:first-of-type'],
    'xpath' => ['/html/body/article//p[1]'] // ❌ Remove this
];
```

**After**:
```php
$article['speakable'] = [
    '@type' => 'SpeakableSpecification',
    'cssSelector' => ['.post-content-inner p:first-of-type'] // ✅ CSS only
];
```

### Fix 2: Update CSS Selector

**Before**:
```php
'cssSelector' => ['article .post-content-inner p:first-of-type']
```

**After**:
```php
'cssSelector' => ['.post-content-inner p:first-of-type'] // Matches regardless of the ancestor element
```

### Fix 3: Add Content Validation

**Before**:
```php
if (!empty($data['excerpt'])) {
    $article['speakable'] = [...];
}
```

**After**:
```php
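$html_content = $data['content']['html'] ?? '';
// Match <p> or <p ...> only; the looser /<p[^>]*>/ also matches <pre>, <param>, and <picture>
$has_first_paragraph = !empty($html_content) && preg_match('/<p(?:\s[^>]*)?>/i', $html_content) === 1;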

if ($has_first_paragraph) {
    $article['speakable'] = [...];
}
```

## Monitoring

### Regular Checks

1. **Weekly**: Run validation script on all posts
2. **Monthly**: Test sample URLs with Google Rich Results Test
3. **Quarterly**: Review schema health metrics

### Monitoring Script

```bash
# Check schema health
php v2/scripts/blog/monitor-schema-health.php
```

## Related Documentation

- [SEO Optimization Guide](SEO_OPTIMIZATION_GUIDE.md) - Complete SEO guide
- [Template Development Guide](TEMPLATE_DEVELOPMENT_GUIDE.md) - Template structure
- [Schema Generator Code](../../../../v2/config/blog-schema-generator.php) - Implementation

## Support

If issues persist:
1. Check error logs: `tail -f /path/to/error.log`
2. Run validation scripts
3. Test with Google Rich Results Test
4. Review schema generator code
