# Blog Migration Content Structure

**Last Updated:** 2026-01-09

Content data structure requirements for blog migration, including post metadata, category structure, relationships, and internal linking data.

## Overview

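This document specifies the JSON structures the blog migration depends on: per-post metadata, content, SEO, and media fields; category, cluster, and topic data; post relationships; and internal links. It also lists the existing source data files and the transformations to apply to them.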

## Post Metadata Structure

### Core Post Data

```json
{
  "slug": "leitfaden-zur-finanzbuchhaltung",
  "title": "Leitfaden zur Finanzbuchhaltung",
  "category": "lexikon",
  "url": "/insights/lexikon/leitfaden-zur-finanzbuchhaltung/",
  "publication_date": "2023-09-01T10:40:44+00:00",
  "modified_date": "2025-03-31T18:58:44+00:00",
  "author": "Emma",
  "word_count": 1214,
  "reading_time": 6
}
```
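
The core fields are partly redundant, so they can be cross-checked before import. The following is a minimal sketch, assuming the `/insights/{category}/{slug}/` URL pattern seen in the examples holds for every post; `expected_url` and `check_core_post` are hypothetical helper names, not part of any existing tooling.

```python
from datetime import datetime


def expected_url(category: str, slug: str) -> str:
    # Pattern taken from the example records: /insights/{category}/{slug}/
    return f"/insights/{category}/{slug}/"


def check_core_post(post: dict) -> list:
    """Return a list of problems; an empty list means the record is consistent."""
    problems = []
    if post["url"] != expected_url(post["category"], post["slug"]):
        problems.append("url does not match category and slug")
    # fromisoformat accepts the "+00:00" offset used in the examples.
    published = datetime.fromisoformat(post["publication_date"])
    modified = datetime.fromisoformat(post["modified_date"])
    if modified < published:
        problems.append("modified_date is earlier than publication_date")
    return problems


post = {
    "slug": "leitfaden-zur-finanzbuchhaltung",
    "category": "lexikon",
    "url": "/insights/lexikon/leitfaden-zur-finanzbuchhaltung/",
    "publication_date": "2023-09-01T10:40:44+00:00",
    "modified_date": "2025-03-31T18:58:44+00:00",
}
print(check_core_post(post))  # prints []
```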

### Content Data

```json
{
  "content": {
    "html": "<div>...</div>",
    "text": "Plain text content...",
    "word_count": 1214
  },
  "excerpt": "Leitfaden zur Finanzbuchhaltung: Hier erfährst du alles...",
  "h1": "Finanzbuchhaltung für Arbeitgeber: Ein Leitfaden"
}
```

### SEO Data

```json
{
  "meta": {
    "title": "Leitfaden zur Finanzbuchhaltung | Lexikon | Ordio",
    "description": "Leitfaden zur Finanzbuchhaltung: Hier erfährst du alles...",
    "keywords": ["Arbeitgeber", "Buchhaltung", "Gesetz"],
    "canonical_url": "https://www.ordio.com/insights/lexikon/leitfaden-zur-finanzbuchhaltung/"
  }
}
```

### Media Data

```json
{
  "featured_image": {
    "url": "https://www.ordio.com/wp-content/uploads/2023/09/...",
    "alt": "Leitfaden zur Finanzbuchhaltung",
    "width": 1024,
    "height": 576
  },
  "images": [
    {
      "url": "...",
      "alt": "...",
      "width": 1024,
      "height": 576,
      "type": "content"
    }
  ]
}
```

### Topic & Cluster Data

```json
{
  "topics": {
    "personalverwaltung": 3,
    "compliance": 2
  },
  "primary_cluster": "personalverwaltung",
  "secondary_clusters": [],
  "keywords": ["leitfaden", "finanzbuchhaltung", "ordio"]
}
```
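
In the example above, `primary_cluster` matches the topic with the highest score. If that is the intended rule (an assumption, the source does not state it), it could be derived as in this sketch; `pick_primary_cluster` is a hypothetical helper.

```python
from typing import Dict, Optional


def pick_primary_cluster(topics: Dict[str, int]) -> Optional[str]:
    """Pick the topic with the highest score, assuming that rule defines the primary cluster."""
    if not topics:
        return None
    # Highest score wins; ties fall back to alphabetical order so the result is deterministic.
    return min(topics, key=lambda name: (-topics[name], name))


print(pick_primary_cluster({"personalverwaltung": 3, "compliance": 2}))  # personalverwaltung
```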

## Category Structure

### Category Data

```json
{
  "categories": [
    {
      "slug": "lexikon",
      "name": "Lexikon",
      "description": "Fachbegriffe rund um Arbeitsrecht...",
      "url": "/insights/lexikon/",
      "post_count": 18,
      "meta": {
        "title": "Lexikon - Ordio",
        "description": "Fachbegriffe rund um Arbeitsrecht..."
      }
    },
    {
      "slug": "ratgeber",
      "name": "Ratgeber",
      "description": "Praxisnahe Tipps zu Dienstplänen...",
      "url": "/insights/ratgeber/",
      "post_count": 53,
      "meta": {
        "title": "Ratgeber - Ordio",
        "description": "Praxisnahe Tipps zu Dienstplänen..."
      }
    },
    {
      "slug": "inside-ordio",
      "name": "Inside Ordio",
      "description": "Erhalte spannende Einblicke...",
      "url": "/insights/inside-ordio/",
      "post_count": 6,
      "meta": {
        "title": "Inside Ordio - Ordio",
        "description": "Erhalte spannende Einblicke..."
      }
    }
  ]
}
```

## Content Relationships Data

### Post Relationships

```json
{
  "relationships": [
    {
      "source_url": "/insights/lexikon/post-a/",
      "target_url": "/insights/ratgeber/post-b/",
      "relationship_type": "definition_to_guide",
      "similarity_score": 0.686,
      "shared_topics": ["personalverwaltung", "zeiterfassung"],
      "has_existing_link": false
    }
  ]
}
```
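
One possible use of these records is to surface pairs that are similar but not yet linked. The sketch below filters on `has_existing_link` and `similarity_score`; the `min_score` threshold of 0.5 and the second sample record are illustrative assumptions, not values from the migration data.

```python
def link_candidates(relationships, min_score=0.5):
    """Return unlinked relationships at or above min_score, strongest first."""
    candidates = [
        rel for rel in relationships
        if not rel["has_existing_link"] and rel["similarity_score"] >= min_score
    ]
    return sorted(candidates, key=lambda rel: rel["similarity_score"], reverse=True)


relationships = [
    {
        "source_url": "/insights/lexikon/post-a/",
        "target_url": "/insights/ratgeber/post-b/",
        "similarity_score": 0.686,
        "has_existing_link": False,
    },
    {
        # Hypothetical record: already linked, so it is skipped.
        "source_url": "/insights/lexikon/post-a/",
        "target_url": "/insights/ratgeber/post-c/",
        "similarity_score": 0.9,
        "has_existing_link": True,
    },
]
for rel in link_candidates(relationships):
    print(rel["source_url"], "->", rel["target_url"])
```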

### Related Posts

```json
{
  "related_posts": [
    {
      "url": "/insights/lexikon/related-post-1/",
      "title": "Related Post 1",
      "similarity_score": 0.65,
      "shared_topics": ["personalverwaltung"]
    }
  ]
}
```
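
A `related_posts` list in this shape could be generated from the relationship records. The sketch below assumes relationships are read in the source-to-target direction only and that titles come from a separate lookup; `related_posts_for`, the `titles` mapping, the `limit` of 3, and the `post-c` record are all hypothetical.

```python
def related_posts_for(url, relationships, titles, limit=3):
    """Build related_posts entries for one post, highest similarity first."""
    related = [
        {
            "url": rel["target_url"],
            "title": titles.get(rel["target_url"], ""),
            "similarity_score": rel["similarity_score"],
            "shared_topics": rel["shared_topics"],
        }
        for rel in relationships
        if rel["source_url"] == url
    ]
    related.sort(key=lambda item: item["similarity_score"], reverse=True)
    return related[:limit]


sample_relationships = [
    {
        "source_url": "/insights/lexikon/post-a/",
        "target_url": "/insights/ratgeber/post-c/",
        "similarity_score": 0.4,
        "shared_topics": ["compliance"],
    },
    {
        "source_url": "/insights/lexikon/post-a/",
        "target_url": "/insights/ratgeber/post-b/",
        "similarity_score": 0.686,
        "shared_topics": ["personalverwaltung", "zeiterfassung"],
    },
]
sample_titles = {"/insights/ratgeber/post-b/": "Post B", "/insights/ratgeber/post-c/": "Post C"}
for item in related_posts_for("/insights/lexikon/post-a/", sample_relationships, sample_titles):
    print(item["url"], item["similarity_score"])
```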

## Internal Linking Data

### Link Structure

```json
{
  "internal_links": [
    {
      "source_url": "/insights/lexikon/post-a/",
      "target_url": "/insights/ratgeber/post-b/",
      "anchor_text": "Zeiterfassung richtig umsetzen",
      "link_type": "content",
      "target_type": "post"
    }
  ]
}
```

### Link Categories

- **Content Links**: Links within post content
- **Navigation Links**: Header, footer, breadcrumbs
- **Related Posts**: Related posts section
- **Category Links**: Category archive links
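
The `internal_links` records can be grouped into a per-post link map. This is a minimal sketch of one way to do it; `build_link_map` and `inbound_counts` are hypothetical helpers, and the second sample link is invented for illustration.

```python
from collections import Counter, defaultdict


def build_link_map(internal_links):
    """Map each source URL to the list of target URLs it links to."""
    outbound = defaultdict(list)
    for link in internal_links:
        outbound[link["source_url"]].append(link["target_url"])
    return dict(outbound)


def inbound_counts(internal_links):
    """Count how many links point at each target URL."""
    return Counter(link["target_url"] for link in internal_links)


links = [
    {"source_url": "/insights/lexikon/post-a/", "target_url": "/insights/ratgeber/post-b/"},
    # Hypothetical second link, added only to show grouping.
    {"source_url": "/insights/lexikon/post-c/", "target_url": "/insights/ratgeber/post-b/"},
]
print(build_link_map(links))
print(inbound_counts(links)["/insights/ratgeber/post-b/"])  # 2
```

Posts with an inbound count of zero would be candidates for new internal links.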

## Cluster Mapping Data

### Cluster Structure

```json
{
  "clusters": [
    {
      "name": "personalverwaltung",
      "posts": [
        {
          "url": "/insights/lexikon/post-a/",
          "title": "Post A",
          "primary": true,
          "secondary": []
        }
      ],
      "topics": ["personalverwaltung", "compliance"],
      "post_count": 19
    }
  ]
}
```

## Topic Taxonomy Data

### Topic Structure

```json
{
  "topics": [
    {
      "name": "personalverwaltung",
      "keywords": ["personalverwaltung", "mitarbeiterverwaltung", "personal"],
      "subtopics": ["mitarbeiterdaten", "personalakte"],
      "frequency": 146,
      "post_count": 58
    }
  ]
}
```

## Image Structure

### Image Data

```json
{
  "images": [
    {
      "url": "https://www.ordio.com/wp-content/uploads/2023/09/...",
      "alt": "Image description",
      "width": 1024,
      "height": 576,
      "type": "featured",
      "post_url": "/insights/lexikon/post-a/",
      "local_path": "/images/blog/2023/09/..."
    }
  ]
}
```

### Image Migration Requirements

- Download all images
- Optimize images and convert them to WebP
- Generate responsive sizes
- Preserve alt text
- Update URLs in content
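
The last step, updating URLs in content, can be sketched as a path mapping. This assumes every WordPress upload lives under `/wp-content/uploads/` and maps to `/images/blog/` with the year/month folders preserved, as the `url` and `local_path` examples suggest; the helper names and the `example.jpg` filename are hypothetical.

```python
from urllib.parse import urlparse

UPLOADS_PREFIX = "/wp-content/uploads/"  # WordPress upload path seen in the examples
LOCAL_PREFIX = "/images/blog/"           # local path prefix seen in the examples


def local_image_path(url: str) -> str:
    """Map a WordPress upload URL to its local path, keeping the year/month folders."""
    path = urlparse(url).path
    if not path.startswith(UPLOADS_PREFIX):
        raise ValueError(f"not a WordPress upload URL: {url}")
    return LOCAL_PREFIX + path[len(UPLOADS_PREFIX):]


def rewrite_image_urls(html: str, image_urls) -> str:
    """Replace each known image URL in the HTML with its local path."""
    for url in image_urls:
        html = html.replace(url, local_image_path(url))
    return html


# example.jpg is a made-up filename for illustration.
source = "https://www.ordio.com/wp-content/uploads/2023/09/example.jpg"
print(local_image_path(source))  # /images/blog/2023/09/example.jpg
```

If images are converted to WebP, the file extension in the local path would also need to change; that step is left out here.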

## Data File Structure

### Recommended File Organization

```text
data/
├── blog/
│   ├── posts/
│   │   ├── lexikon/
│   │   │   ├── post-slug.json
│   │   │   └── ...
│   │   ├── ratgeber/
│   │   │   └── ...
│   │   └── inside-ordio/
│   │       └── ...
│   ├── categories.json
│   ├── clusters.json
│   ├── topics.json
│   ├── relationships.json
│   └── images.json
```

### JSON File Format

**Post File** (`posts/{category}/{slug}.json`):

```json
{
  "slug": "post-slug",
  "title": "Post Title",
  "category": "lexikon",
  "url": "/insights/lexikon/post-slug/",
  "publication_date": "2023-09-01T10:40:44+00:00",
  "modified_date": "2025-03-31T18:58:44+00:00",
  "author": "Emma",
  "content": {
    "html": "...",
    "text": "...",
    "word_count": 1214
  },
  "excerpt": "...",
  "meta": {
    "title": "...",
    "description": "..."
  },
  "featured_image": {...},
  "images": [...],
  "topics": {...},
  "primary_cluster": "personalverwaltung",
  "related_posts": [...],
  "internal_links": [...]
}
```
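
The `posts/{category}/{slug}.json` layout can be produced with a small writer. This is a sketch under the assumption that `data/blog` is the root directory, as in the file tree above; `post_file_path` and `write_post` are hypothetical helpers.

```python
import json
from pathlib import Path


def post_file_path(root: Path, post: dict) -> Path:
    """Resolve posts/{category}/{slug}.json below the given root."""
    return root / "posts" / post["category"] / f"{post['slug']}.json"


def write_post(root: Path, post: dict) -> Path:
    """Write one post record as UTF-8 JSON and return the file path."""
    path = post_file_path(root, post)
    path.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps umlauts readable instead of \u escapes.
    path.write_text(json.dumps(post, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


example_post = {"slug": "post-slug", "category": "lexikon", "title": "Post Title"}
print(post_file_path(Path("data/blog"), example_post))
```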

## Migration Data Requirements

### Required Data

1. **Post Content**: HTML and text content
2. **Post Metadata**: Title, dates, author, SEO
3. **Categories**: Category structure and metadata
4. **Images**: All images with metadata
5. **Relationships**: Post relationships and related posts
6. **Internal Links**: All internal links
7. **Topics**: Topic taxonomy and mapping
8. **Clusters**: Cluster assignments

### Data Sources

- **Existing Data Files**:
  - `docs/data/blog-posts-metadata.json`
  - `docs/data/blog-posts-content.json`
  - `docs/data/blog-topics-extracted.json`
  - `docs/data/blog-cluster-mapping.json`
  - `docs/data/blog-content-relationships.json`
  - `docs/data/blog-internal-links.json`

### Data Transformation

**Required Transformations**:

1. Convert WordPress URLs to static URLs
2. Update image URLs to local paths
3. Process HTML content (clean, optimize)
4. Generate related posts lists
5. Build internal link maps
6. Create category indexes
7. Generate topic indexes
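
Two of these transformations, URL conversion (step 1) and category indexes (step 6), can be sketched as follows. The sketch assumes that "static URLs" means the site-relative paths with a trailing slash shown in the examples, and that a category index is a post count per category; both readings, and the helper names, are assumptions.

```python
from collections import Counter
from urllib.parse import urlparse

SITE_HOST = "www.ordio.com"  # host seen in the canonical_url example


def to_static_url(url: str) -> str:
    """Convert an absolute page URL on the site to a site-relative path.

    Intended for page URLs only; query strings and fragments are dropped.
    """
    parsed = urlparse(url)
    if parsed.netloc and parsed.netloc != SITE_HOST:
        return url  # external link, left untouched
    path = parsed.path or "/"
    if not path.endswith("/"):
        path += "/"
    return path


def category_index(posts):
    """Count posts per category slug."""
    return dict(Counter(post["category"] for post in posts))


print(to_static_url("https://www.ordio.com/insights/lexikon/leitfaden-zur-finanzbuchhaltung/"))
print(category_index([{"category": "lexikon"}, {"category": "ratgeber"}, {"category": "lexikon"}]))
```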

## Related Documentation

- [Content Relationships](CONTENT_RELATIONSHIPS.md) - Post relationship mapping
- [Migration Architecture](MIGRATION_ARCHITECTURE.md) - Technical architecture
- [Migration Template Requirements](MIGRATION_TEMPLATE_REQUIREMENTS.md) - Template specifications
- [Migration Strategy](MIGRATION_STRATEGY.md) - High-level migration approach
