# Data Migration Guide: Comparison Pages


**Last Updated:** 2025-11-20

This guide provides step-by-step instructions for migrating competitor data from old comparison pages to the centralized `competitors_data.php` structure.

## Overview

The migration process involves:

1. Extracting data from old comparison pages
2. Validating extracted data
3. Comparing with current data
4. Updating `competitors_data.php`
5. Verifying updates

## Prerequisites

- Python 3.x installed
- PHP 7.4+ installed
- Access to comparison page files in `v2/pages/`
- Access to `v2/data/competitors_data.php`

## Step-by-Step Migration Process

### Step 1: Extract Data from Old Pages

Run the extraction script to collect all competitor data:

```bash
cd /path/to/landingpage
python3 scripts/data/extract_competitor_data.py
```

**Output:** `docs/development/testing/extracted_competitor_data.json`

**What it extracts:**

- Hero content (H1, description)
- Comparison grid data (ratings, pricing, features)
- FAQ sections (questions and answers)
- Details sections (if present)
- Schema data (excluding meta tags)
- Rating distributions
- Detailed ratings

**Excluded Pages:**

- compare_freshbooks.php (outdated layout)
- compare_e2n.php (outdated layout)
- compare_timely.php (outdated layout)
- compare_shyftplan.php (outdated layout)
- compare_planerio.php (outdated layout)
- compare_generator.php (generator tool)
- compare_template_v2.php (template)
- compare_index.php (index page)
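
The exclusion list above can be expressed as a simple filter. The following is a minimal sketch, not the extraction script's actual logic: it assumes comparison pages follow a `compare_*.php` naming pattern, and the function name `eligible_pages` is hypothetical.

```python
from pathlib import Path

# Pages this guide lists as excluded from extraction.
EXCLUDED_PAGES = {
    "compare_freshbooks.php",
    "compare_e2n.php",
    "compare_timely.php",
    "compare_shyftplan.php",
    "compare_planerio.php",
    "compare_generator.php",
    "compare_template_v2.php",
    "compare_index.php",
}


def eligible_pages(pages_dir):
    """Return sorted comparison page paths in pages_dir that are not excluded.

    Assumption: comparison pages are named compare_<slug>.php.
    """
    return sorted(
        path
        for path in Path(pages_dir).glob("compare_*.php")
        if path.name not in EXCLUDED_PAGES
    )
```

Calling `eligible_pages("v2/pages")` from the project root would then list the pages that remain in scope for extraction.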

### Step 2: Validate Extracted Data

Run the validation script to check data quality:

```bash
python3 scripts/data/validate_extracted_data.py
```

**Output:** `docs/development/testing/extraction_validation_report.md`

**What it validates:**

- Rating distribution counts sum to total reviews
- All required fields present
- Data consistency
- FAQ completeness

**Common Issues:**

- Rating distribution mismatch (counts don't sum to reviews)
- Missing required fields (rating, reviews, description)
- FAQ count is 0 (extraction may have failed)
- Description too short or missing
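
The checks listed above can be sketched for a single extracted entry as follows. This is an illustration under stated assumptions, not the validation script itself: the key names `rating_distribution` and `faq`, the function name `validate_entry`, and the 50-character minimum description length are all hypothetical.

```python
# Fields this guide names as required.
REQUIRED_FIELDS = ("rating", "reviews", "description")


def validate_entry(entry, min_description_length=50):
    """Return a list of human-readable issues for one extracted competitor."""
    issues = []

    # Required fields must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if entry.get(field) in (None, ""):
            issues.append(f"missing required field: {field}")

    # Rating distribution counts should sum to the total review count.
    distribution = entry.get("rating_distribution")  # hypothetical key name
    reviews = entry.get("reviews")
    if distribution and reviews is not None:
        total = sum(distribution.values())
        if total != reviews:
            issues.append(f"rating distribution sums to {total}, expected {reviews}")

    # An empty FAQ list usually means extraction failed.
    if not entry.get("faq"):  # hypothetical key name
        issues.append("FAQ count is 0 (extraction may have failed)")

    # The length threshold is an arbitrary illustrative value.
    description = entry.get("description") or ""
    if description and len(description) < min_description_length:
        issues.append("description too short")

    return issues
```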

### Step 3: Compare with Current Data

Run the comparison script to identify discrepancies:

```bash
php scripts/data/validate_and_compare.php
```

**Output:** `docs/audit/comparison-pages/DATA_COMPARISON_REPORT.md`

**What it compares:**

- Rating values
- Review counts
- Descriptions (similarity check)
- FAQ counts
- Pricing information
- Details sections (`has_details` flag)

**Discrepancy Types:**

- 🔴 High severity: Pricing, reviews, `has_details` flag
- 🟡 Medium severity: Descriptions, FAQ counts
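
To illustrate how the compared fields map to severities, here is a minimal sketch in Python (the real comparison script is PHP and may work differently). The key name `faq_count`, the 0.9 similarity threshold, and the `"unclassified"` fallback for fields without a listed severity are assumptions made for this example.

```python
from difflib import SequenceMatcher

# Severity groups as listed in this guide.
HIGH_SEVERITY = {"pricing", "reviews", "has_details"}
MEDIUM_SEVERITY = {"description", "faq_count"}

COMPARED_FIELDS = ("rating", "reviews", "description", "faq_count", "pricing", "has_details")


def severity(field):
    """Map a field name to its severity; unlisted fields are 'unclassified'."""
    if field in HIGH_SEVERITY:
        return "high"
    if field in MEDIUM_SEVERITY:
        return "medium"
    return "unclassified"


def compare_entries(extracted, current, similarity_threshold=0.9):
    """Return (field, severity) pairs for fields that differ between entries."""
    discrepancies = []
    for field in COMPARED_FIELDS:
        old, new = current.get(field), extracted.get(field)
        if field == "description":
            # Descriptions use a similarity check instead of strict equality.
            ratio = SequenceMatcher(None, old or "", new or "").ratio()
            differs = ratio < similarity_threshold
        else:
            differs = old != new
        if differs:
            discrepancies.append((field, severity(field)))
    return discrepancies
```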

### Step 4: Update competitors_data.php

#### Option A: Automated Update (Recommended for bulk updates)

Run the update script:

```bash
php scripts/data/update_competitors_data.php
```

**What it does:**

- Creates backup of `competitors_data.php`
- Updates competitor entries with extracted data
- Validates PHP syntax after update

**Limitations:**

- Pattern matching may fail for entries whose formatting differs (indentation, quotes)
- Entries that fail need a manual update (see Option B)

**Success Indicators:**

- Script reports "Updated {competitor}" for each entry
- No PHP syntax errors
- Backup file created
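
If you update the file by hand or with your own tooling, the backup and syntax-check steps can be reproduced with a small helper. This is a sketch only: the timestamped `.bak` naming scheme and both function names are hypothetical, and the update script's own backup naming is not documented here. It assumes the `php` binary is on your `PATH`.

```python
import shutil
import subprocess
import time
from pathlib import Path


def create_backup(path):
    """Copy path to a timestamped sibling file and return the backup path."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # Hypothetical naming scheme: <name>.<timestamp>.bak next to the original.
    backup = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, backup)
    return backup


def php_syntax_ok(path):
    """Run `php -l` on path and return True when it exits with status 0."""
    result = subprocess.run(
        ["php", "-l", str(path)],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

A typical sequence would be `create_backup("v2/data/competitors_data.php")`, then the edit, then `php_syntax_ok("v2/data/competitors_data.php")`.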

#### Option B: Manual Update

For entries where the automated update fails, update them manually:

1. **Open `v2/data/competitors_data.php`**
2. **Find the competitor entry** (search for `'slug' => '{competitor}'`)
3. **Update fields** based on extracted data:

   - `rating`: Use extracted rating value
   - `reviews`: Use extracted review count
   - `description`: Use extracted full description
   - `pricing`: Update `starting_price`, `price_unit`, `currency`
   - `has_details`: Set to `true` if Details section exists
   - `details`: Add Details sections structure if `has_details` is true
   - `faq`: Update FAQ array with extracted questions/answers

4. **Validate PHP syntax:**
   ```bash
   php -l v2/data/competitors_data.php
   ```

### Step 5: Verify Updates

Re-run the comparison script to verify the updates:

```bash
php scripts/data/validate_and_compare.php
```

**Expected Results:**

- Discrepancies reduced or eliminated
- Complete matches increased
- No new errors introduced

## Details Section Migration

### Structure

Details sections use a flexible structure:

```php
'has_details' => true,
'details' => [
    'sections' => [
        [
            'title' => 'Section Title',
            'type' => 'list', // or 'paragraph' or 'mixed'
            'items' => ['Item 1', 'Item 2'], // for list type
            'content' => 'Paragraph content...', // for paragraph type
            'items_with_descriptions' => [ // for mixed type
                [
                    'title' => 'Item Title',
                    'description' => 'Item description'
                ]
            ]
        ]
    ]
]
```

### Content Types

1. **List Type** (`'type' => 'list'`):

   - Simple list items
   - Use `items` array
   - Example: Clockin features list

2. **Paragraph Type** (`'type' => 'paragraph'`):

   - Text paragraphs
   - Use `content` string
   - Example: askDANTE description paragraphs

3. **Mixed Type** (`'type' => 'mixed'`):

   - Items with bold headers and descriptions
   - Use `items_with_descriptions` array
   - Example: awork features with descriptions

### Migration Steps

1. **Identify Details Section** in old page
2. **Extract Content** using extraction script
3. **Determine Content Type** (list, paragraph, mixed)
4. **Structure Data** according to type
5. **Update Entry** in `competitors_data.php`
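
Steps 3 and 4 above can be sketched as a helper that pairs each content type with the payload key it uses. The function names `build_section` and `build_details` are hypothetical; the keys mirror the structure shown in this guide, and the resulting dictionary still has to be written into `competitors_data.php` as a PHP array.

```python
# Payload key used by each section type, following the structure in this guide.
PAYLOAD_KEYS = {
    "list": "items",
    "paragraph": "content",
    "mixed": "items_with_descriptions",
}


def build_section(title, section_type, payload):
    """Build one Details section with the payload under the key for its type."""
    if section_type not in PAYLOAD_KEYS:
        raise ValueError(f"unknown section type: {section_type!r}")
    return {
        "title": title,
        "type": section_type,
        PAYLOAD_KEYS[section_type]: payload,
    }


def build_details(sections):
    """Wrap sections in the has_details / details structure."""
    sections = list(sections)
    return {
        "has_details": bool(sections),
        "details": {"sections": sections},
    }
```

For example, `build_section("Features", "list", ["Item 1", "Item 2"])` produces a list-type section that carries only the `items` key, which avoids mixing payload keys from different types in one section.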

## Troubleshooting

### Issue: Extraction Script Fails

**Symptoms:** Script errors or incomplete extraction

**Solutions:**

- Check file encoding (should be UTF-8)
- Verify page structure matches expected format
- Check for HTML syntax errors in source page
- Review extraction script logs
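
The encoding check can be done quickly before digging into the script itself. A minimal sketch, with a hypothetical function name `utf8_error`:

```python
def utf8_error(raw):
    """Return None if raw bytes decode as UTF-8, else a short error message."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first undecodable byte.
        return f"invalid UTF-8 at byte {exc.start}: {exc.reason}"
    return None
```

Pass it the raw file contents, for example `utf8_error(Path("v2/pages/compare_awork.php").read_bytes())` (the file name here is illustrative only).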

### Issue: Update Script Can't Find Entry

**Symptoms:** "Warning: Could not find {competitor} entry"

**Solutions:**

- Check slug matches exactly (case-sensitive)
- Verify entry exists in `competitors_data.php`
- Try manual update instead
- Check for formatting differences (indentation, quotes)
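
To check whether an entry exists despite formatting differences, a tolerant search can help. This sketch is not the update script's matching logic; `find_slug_lines` is a hypothetical helper that accepts either quote style and any spacing around `=>`, while keeping the slug comparison case-sensitive.

```python
import re

# Matches either a single or a double quote.
QUOTE = "['\"]"


def find_slug_lines(php_source, slug):
    """Return 1-based line numbers where a 'slug' => '<slug>' pair appears."""
    pattern = re.compile(
        rf"{QUOTE}slug{QUOTE}\s*=>\s*{QUOTE}" + re.escape(slug) + QUOTE
    )
    return [
        number
        for number, line in enumerate(php_source.splitlines(), start=1)
        if pattern.search(line)
    ]
```

An empty result means the slug is genuinely missing (or differs in case); a non-empty result points at the lines to inspect for formatting the update script may not expect.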

### Issue: PHP Syntax Errors After Update

**Symptoms:** `php -l` reports syntax errors

**Solutions:**

- Check for double commas (`,,`)
- Verify string escaping (single quotes, backslashes)
- Check array structure (matching brackets)
- Restore from backup and retry
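
String escaping is the most common of these causes when descriptions or FAQ answers contain apostrophes. In a PHP single-quoted string, only the backslash and the single quote need escaping. A minimal sketch of that rule, with a hypothetical helper name `php_single_quoted`:

```python
def php_single_quoted(value):
    """Render a Python string as a PHP single-quoted string literal."""
    # Escape backslashes first so the ones added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"
```

For instance, the text `Let's go` becomes the PHP literal `'Let\'s go'`.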

### Issue: Details Section Not Displaying

**Symptoms:** Details section missing on page

**Solutions:**

- Verify `has_details` is `true`
- Check `details['sections']` structure is correct
- Verify that the page component includes the Details section
- Check browser console for errors

### Issue: Rating Distribution Mismatch

**Symptoms:** Validation reports count mismatch

**Solutions:**

- Verify extraction captured all rating rows
- Check source page has complete rating distribution
- Manually verify counts sum to reviews
- Update distribution manually if needed

## Best Practices

1. **Always Create a Backup** before updating
2. **Validate After Each Update** to catch errors early
3. **Test in Browser** after migration
4. **Document Changes** in commit messages
5. **Review Discrepancies** before updating
6. **Prioritize High-Severity Issues** (pricing, reviews, `has_details` flag)
7. **Verify Details Sections** structure matches component expectations
8. **Check String Escaping** for special characters

## Validation Checklist

Before completing migration:

- [ ] All data extracted successfully
- [ ] Validation passed (no critical errors)
- [ ] Comparison report reviewed
- [ ] High-priority discrepancies fixed
- [ ] PHP syntax valid
- [ ] Details sections properly structured (if applicable)
- [ ] Visual verification completed
- [ ] Documentation updated

## Related Documentation

- `docs/guides/comparison-pages/COMPARISON_PAGES_GUIDE.md` - Comparison page creation guide
- `docs/audit/comparison-pages/MASTER_AUDIT_REPORT.md` - Comprehensive audit report
- `docs/audit/comparison-pages/DATA_COMPARISON_REPORT.md` - Latest comparison results
- `.cursor/rules/comparison-pages.mdc` - Cursor rules for comparison pages
