# SISTRIX Comprehensive Integration Guide

**Last Updated:** 2026-01-15  
**Purpose:** Complete guide to SISTRIX API integration for blog content writing, including all endpoints, workflows, examples, and best practices

## Table of Contents

1. [Overview](#overview)
2. [SISTRIX API Endpoints](#sistrix-api-endpoints)
3. [Data Collection Scripts](#data-collection-scripts)
4. [Content Brief Generation](#content-brief-generation)
5. [Competitive Analysis Workflow](#competitive-analysis-workflow)
6. [Content Optimization Workflow](#content-optimization-workflow)
7. [Examples and Use Cases](#examples-and-use-cases)
8. [Best Practices](#best-practices)
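9. [Performance Optimization](#performance-optimization)
10. [Troubleshooting](#troubleshooting)
11. [Related Documentation](#related-documentation)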

## Overview

This guide documents how to use SISTRIX API data to research, brief, and optimize blog content. It covers:

- All available SISTRIX endpoints and their usage
- Data collection scripts and workflows
- Content brief generation process
- Competitive analysis workflow
- Content optimization strategies
- Real-world examples and use cases

## SISTRIX API Endpoints

### Keyword Endpoints

#### 1. `keyword.seo.metrics`

**Purpose:** Get keyword metrics (volume, difficulty, competition, historical trends)

**Credits:** 5 per keyword

**Batch Mode:** Yes (optimized: up to 50 keywords per batch, tested optimal: 30 keywords)

**Optimization Notes:**

- **Optimal Batch Size:** 30 keywords per batch (tested and verified)
- **POST Requests:** Automatically used for batches > 20 keywords to avoid URL length limits
- **GET Requests:** Used for smaller batches (≤20 keywords)
- **No Client-Side Delays:** Each batch is a single API call, so the scripts add no delay between batch requests
- **Cross-Post Batching:** Use `collect-all-keywords-cross-post.php` to batch process all unique keywords across all posts for maximum efficiency
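
The batching rules above can be sketched as a small planner. This is a minimal illustration, not the logic of the PHP scripts: the function name `plan_batches` and its parameters are hypothetical, and only the numbers (30 per batch, 50 maximum, POST above 20 keywords) come from the notes above.

```python
from typing import Iterator


def plan_batches(keywords: list[str], batch_size: int = 30,
                 post_threshold: int = 20) -> Iterator[tuple[str, list[str]]]:
    """Yield (http_method, keyword_chunk) pairs for keyword.seo.metrics."""
    if not 1 <= batch_size <= 50:
        raise ValueError("batch_size must be between 1 and 50")
    for start in range(0, len(keywords), batch_size):
        chunk = keywords[start:start + batch_size]
        # Larger chunks go via POST to stay clear of URL length limits.
        method = "POST" if len(chunk) > post_threshold else "GET"
        yield method, chunk


# 70 keywords split into chunks of 30, 30, and 10.
batches = list(plan_batches([f"kw{i}" for i in range(70)]))
for method, chunk in batches:
    print(method, len(chunk))
```

With the default settings only the final, smaller chunk falls back to GET.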

**Parameters:**

- `api_key` (required): SISTRIX API key
- `kw` (required): Keyword or JSON array of keywords
- `country` (required): ISO country code (e.g., 'de')
- `history` (optional): Include historical trends ('true'/'false')
- `format` (optional): 'json' or 'xml'
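
As a sketch of how these parameters combine into a GET request, the helper below builds a properly encoded URL. The helper name `build_url` is hypothetical; the endpoint and parameter names are the ones listed above, and sending a keyword list as a compact JSON array follows the `kw` description.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "https://api.sistrix.com"


def build_url(endpoint: str, api_key: str, kw, country: str = "de",
              **optional: str) -> str:
    """Build a GET URL; a list of keywords is sent as a compact JSON array."""
    if isinstance(kw, (list, tuple)):
        kw = json.dumps(list(kw), ensure_ascii=False, separators=(",", ":"))
    params = {"api_key": api_key, "kw": kw, "country": country,
              "format": "json", **optional}
    # quote_via=quote encodes spaces as %20 (urlencode's default would use "+").
    return f"{BASE_URL}/{endpoint}?{urlencode(params, quote_via=quote)}"


single_url = build_url("keyword.seo.metrics", "KEY",
                       "digitale zeiterfassung", history="true")
batch_url = build_url("keyword.seo.metrics", "KEY", ["kw1", "kw2"])
print(single_url)
print(batch_url)
```

Encoding the JSON array matters because brackets and quotes are not safe in a raw query string.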

**Example Request:**

```bash
# Single keyword
curl "https://api.sistrix.com/keyword.seo.metrics?api_key=KEY&kw=digitale%20zeiterfassung&country=de&history=true&format=json"

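# Batch mode (10 keywords)
# -G with --data-urlencode encodes the JSON array and avoids curl's URL globbing on [ ]
curl -G "https://api.sistrix.com/keyword.seo.metrics" \
  --data-urlencode "api_key=KEY" \
  --data-urlencode 'kw=["kw1","kw2",...]' \
  --data-urlencode "country=de" \
  --data-urlencode "format=json"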
```

**Response Structure:**

```json
{
  "answer": [
    {
      "result": [
        {
          "kw": "digitale zeiterfassung",
          "traffic": 1200,
          "competition": 45,
          "clicks": 1000,
          "cpc": 2.5,
          "desktop_distribution": 0.49,
          "mobile_distribution": 0.51,
          "history": [
            { "date": "2025-12", "traffic": 1150, "competition": 43 },
            { "date": "2026-01", "traffic": 1200, "competition": 45 }
          ]
        }
      ]
    }
  ],
  "credits": [{ "used": 5 }]
}
```
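
A minimal sketch of reading this structure follows. It assumes the response has exactly the shape shown above and that `history` is ordered oldest to newest; the function name `parse_metrics` and the derived `traffic_change` field are illustrative only.

```python
import json

SAMPLE = json.loads("""
{
  "answer": [{"result": [{
    "kw": "digitale zeiterfassung", "traffic": 1200, "competition": 45,
    "history": [{"date": "2025-12", "traffic": 1150, "competition": 43},
                {"date": "2026-01", "traffic": 1200, "competition": 45}]
  }]}],
  "credits": [{"used": 5}]
}
""")


def parse_metrics(payload: dict) -> tuple[list[dict], int]:
    """Flatten answer[].result[] into rows and sum the credits used."""
    rows = []
    for answer in payload.get("answer", []):
        for item in answer.get("result", []):
            history = item.get("history") or []
            # Trend = last minus first history point (needs at least two points).
            change = (history[-1]["traffic"] - history[0]["traffic"]
                      if len(history) >= 2 else None)
            rows.append({"kw": item["kw"],
                         "traffic": item.get("traffic"),
                         "competition": item.get("competition"),
                         "traffic_change": change})
    credits_used = sum(entry.get("used", 0) for entry in payload.get("credits", []))
    return rows, credits_used


rows, credits_used = parse_metrics(SAMPLE)
print(rows, credits_used)
```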

**Usage:**

- Collect keyword metrics for all post keywords
- Include historical trends for trend analysis
- Use batch mode to reduce API calls

#### 2. `keyword.questions`

**Purpose:** Get People Also Ask (PAA) questions for a keyword

**Credits:** 1 per question returned

**Batch Mode:** No

**Parameters:**

- `api_key` (required): SISTRIX API key
- `kw` (required): Keyword
- `country` (required): ISO country code
- `lang` (optional): ISO 639-1 language code
- `limit` (optional): Maximum results (default: 20)
- `format` (optional): 'json' or 'xml'

**Example Request:**

```bash
curl "https://api.sistrix.com/keyword.questions?api_key=KEY&kw=digitale%20zeiterfassung&country=de&lang=de&limit=15&format=json"
```

**Response Structure:**

```json
{
  "answer": [
    {
      "questions": [
        {
          "question": "Wie funktioniert digitale Zeiterfassung?",
          "amount": 120,
          "search_volume": 500,
          "lang": "de"
        }
      ]
    }
  ],
  "credits": [{ "used": 15 }]
}
```

*Note: SISTRIX may return the questions under either `answer[0].questions` (with a `search_volume` field) or `answer[0].result` (with a `traffic` field). The collection script handles both shapes. Credits are charged per request, based on `limit`.*
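
Handling both response shapes could look roughly like the sketch below. It is not the collection script's code: the function name `normalize_questions` and the output keys are hypothetical, and it assumes the question text sits under `question` in both shapes.

```python
def normalize_questions(payload: dict) -> list[dict]:
    """Return a uniform list regardless of which response shape was received."""
    normalized = []
    for answer in payload.get("answer", []):
        # Prefer the "questions" shape, fall back to "result".
        items = answer.get("questions") or answer.get("result") or []
        for item in items:
            normalized.append({
                "question": item.get("question", ""),
                # "search_volume" in one shape, "traffic" in the other.
                "volume": item.get("search_volume", item.get("traffic", 0)),
                "amount": item.get("amount", 0),
            })
    return normalized


shape_a = {"answer": [{"questions": [
    {"question": "Wie funktioniert digitale Zeiterfassung?",
     "amount": 120, "search_volume": 500}]}]}
shape_b = {"answer": [{"result": [
    {"question": "Wie funktioniert digitale Zeiterfassung?", "traffic": 500}]}]}
print(normalize_questions(shape_a))
print(normalize_questions(shape_b))
```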

**Usage:**

- Extract actual PAA questions (not just counts)
- Prioritize by traffic and frequency
- Use for FAQ generation
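
Prioritizing by traffic and frequency can be sketched as a simple sort. This assumes `amount` is the frequency signal and that volume arrives as either `search_volume` or `traffic`; the function name `prioritize_questions` is hypothetical.

```python
def prioritize_questions(questions: list[dict], top_n: int = 10) -> list[dict]:
    """Sort by volume first, then by frequency, and keep the top entries."""
    def sort_key(q: dict) -> tuple[int, int]:
        volume = q.get("search_volume", q.get("traffic", 0))
        return volume, q.get("amount", 0)

    return sorted(questions, key=sort_key, reverse=True)[:top_n]


candidates = [
    {"question": "A", "search_volume": 100, "amount": 5},
    {"question": "B", "search_volume": 500, "amount": 1},
    {"question": "C", "search_volume": 100, "amount": 9},
]
for q in prioritize_questions(candidates, top_n=2):
    print(q["question"])
```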

#### 3. `marketplace.keyword.search.ideas`

**Purpose:** Discover related keywords and semantic variations

**Credits:** 1 per idea returned

**Batch Mode:** No

**Parameters:**

- `api_key` (required): SISTRIX API key
- `kw` (required): Seed keyword
- `country` (required): ISO country code
- `mode` (optional): 'include' (semantic variations), 'same' (same words), 'exact' (exact match)
- `limit` (optional): Maximum results
- `format` (optional): 'json' or 'xml'

**Example Request:**

```bash
curl "https://api.sistrix.com/marketplace.keyword.search.ideas?api_key=KEY&kw=digitale%20zeiterfassung&country=de&mode=include&limit=15&format=json"
```

**Usage:**

- Discover semantic keyword variations
- Find related keywords for internal linking
- Build keyword clusters

#### 4. `keyword.seo`

**Purpose:** Get ranking positions for a keyword (top 10 results)

**Credits:** 1 per request

**Batch Mode:** No

**Parameters:**

- `api_key` (required): SISTRIX API key
- `kw` (required): Keyword
- `country` (required): ISO country code
- `limit` (optional): Number of results (default: 10)
- `format` (optional): 'json' or 'xml'

**Example Request:**

```bash
curl "https://api.sistrix.com/keyword.seo?api_key=KEY&kw=digitale%20zeiterfassung&country=de&limit=10&format=json"
```

**Usage:**

- Get competitor URLs ranking for keywords
- Analyze competitor content structure
- Identify content gaps

#### 5. `keyword.seo.searchintent`

**Purpose:** Classify search intent for a keyword

**Credits:** 1 per keyword

**Batch Mode:** No

**Parameters:**

- `api_key` (required): SISTRIX API key
- `kw` (required): Keyword
- `country` (required): ISO country code
- `format` (optional): 'json' or 'xml'

**Response Structure:**

```json
{
  "answer": [
    {
      "result": [
        {
          "intent_know": 85,
          "intent_do": 15,
          "intent_visit": 0,
          "intent_website": 0
        }
      ]
    }
  ]
}
```

**Usage:**

- Determine content strategy (informational vs transactional)
- Optimize content structure based on intent
- Plan conversion elements
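
One way to turn the intent percentages into a content decision is to pick the dominant field. The label mapping below is an assumption (only "informational" and "transactional" are named in this guide; "local" and "navigational" are guesses for `intent_visit` and `intent_website`), and `dominant_intent` is a hypothetical helper.

```python
INTENT_LABELS = {
    "intent_know": "informational",
    "intent_do": "transactional",
    "intent_visit": "local",          # assumed label
    "intent_website": "navigational", # assumed label
}


def dominant_intent(result: dict) -> tuple[str, int]:
    """Return the label and score of the strongest intent field."""
    scores = {field: result.get(field, 0) for field in INTENT_LABELS}
    strongest = max(scores, key=scores.get)
    return INTENT_LABELS[strongest], scores[strongest]


label, score = dominant_intent(
    {"intent_know": 85, "intent_do": 15, "intent_visit": 0, "intent_website": 0})
print(label, score)
```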

#### 6. `keyword.seo.serpfeatures`

**Purpose:** Get SERP features present for a keyword

**Credits:** 1 per keyword

**Batch Mode:** No

**Parameters:**

- `api_key` (required): SISTRIX API key
- `kw` (required): Keyword
- `country` (required): ISO country code
- `format` (optional): 'json' or 'xml'

**Response Structure:**

```json
{
  "answer": [
    {
      "result": [
        {
          "RELATED_QUESTION": 6,
          "IMAGE": 5,
          "VIDEO": 3,
          "FEATURED_SNIPPET": 1
        }
      ]
    }
  ]
}
```

**Usage:**

- Identify SERP feature opportunities
- Plan FAQ optimization
- Optimize for images/videos
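
The feature counts can be mapped to concrete follow-ups, as in this sketch. The feature keys come from the response above; the recommendation texts and the function name `serp_opportunities` are illustrative assumptions.

```python
RECOMMENDATIONS = {
    "RELATED_QUESTION": "add an FAQ section that answers PAA questions",
    "IMAGE": "add descriptive images with alt text",
    "VIDEO": "consider embedding or producing a video",
    "FEATURED_SNIPPET": "open with a concise, direct answer",
}


def serp_opportunities(features: dict) -> list[str]:
    """List recommendations for present features, most frequent first."""
    present = [(name, count) for name, count in features.items()
               if count > 0 and name in RECOMMENDATIONS]
    present.sort(key=lambda pair: pair[1], reverse=True)
    return [f"{name} ({count}): {RECOMMENDATIONS[name]}" for name, count in present]


for line in serp_opportunities(
        {"RELATED_QUESTION": 6, "IMAGE": 5, "VIDEO": 3, "FEATURED_SNIPPET": 1}):
    print(line)
```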

### Domain Endpoints

#### 7. `domain.opportunities`

**Purpose:** Get keyword opportunities for a domain

**Credits:** 1 per opportunity returned

**Batch Mode:** No (but returns multiple opportunities per call)

**Parameters:**

- `api_key` (required): SISTRIX API key
- `domain` (required): Domain name
- `country` (required): ISO country code
- `limit` (optional): Maximum results (default: 100)
- `offset` (optional): Pagination offset
- `format` (optional): 'json' or 'xml'

**Usage:**

- Identify keyword opportunities
- Map content gaps
- Plan keyword clusters
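
Because credits are charged per opportunity returned, paging with `limit` and `offset` benefits from a hard cap. The sketch below assumes a caller-supplied `fetch_page(offset, limit)` function (hypothetical) that performs the actual API call and returns a list of result rows.

```python
from typing import Callable, Iterator


def iter_opportunities(fetch_page: Callable[[int, int], list],
                       limit: int = 100,
                       max_results: int = 300) -> Iterator[dict]:
    """Page through results with limit/offset, stopping at max_results."""
    offset = 0
    while offset < max_results:
        requested = min(limit, max_results - offset)
        page = fetch_page(offset, requested)
        yield from page
        if len(page) < requested:
            break  # short page: no more data available
        offset += len(page)


# Stand-in for the API: 250 fake opportunities served from memory.
fake_data = [{"kw": f"kw{i}"} for i in range(250)]
fake_fetch = lambda offset, limit: fake_data[offset:offset + limit]
print(len(list(iter_opportunities(fake_fetch, limit=100, max_results=150))))
```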

## Data Collection Scripts

### Script Overview

The core scripts are located in `v2/scripts/blog/` and `v2/scripts/content/`:

1. **`collect-post-keywords-sistrix.php`** - Collect keyword metrics, related keywords, historical trends
2. **`collect-post-paa-questions.php`** - Extract PAA questions
3. **`collect-post-search-intent.php`** - Classify search intent
4. **`collect-post-serp-features.php`** - Get SERP features
5. **`collect-post-competitor-analysis.php`** - Analyze competitor content
6. **`generate-content-brief-from-sistrix.php`** - Generate content briefs
7. **`content-writing-assistant.php`** - Master workflow script
8. **`analyze-topical-authority.php`** - Keyword clustering and gap analysis

### Collection Workflow

**Full Workflow (Recommended):**

```bash
php v2/scripts/content/content-writing-assistant.php --post=slug --category=category --mode=full
```

**Individual Scripts:**

```bash
# 1. Keywords (includes related keywords and historical trends)
php v2/scripts/blog/collect-post-keywords-sistrix.php --post=slug --category=category

# 2. PAA Questions
php v2/scripts/blog/collect-post-paa-questions.php --post=slug --category=category

# 3. Search Intent
php v2/scripts/blog/collect-post-search-intent.php --post=slug --category=category

# 4. SERP Features
php v2/scripts/blog/collect-post-serp-features.php --post=slug --category=category

# 5. Competitor Analysis
php v2/scripts/blog/collect-post-competitor-analysis.php --post=slug --category=category --top=5
```

## Content Brief Generation

### Process

**Step 1: Collect Data**

- Run data collection scripts (or use Content Writing Assistant)
- Ensure all data files exist

**Step 2: Generate Brief**

```bash
php v2/scripts/content/generate-content-brief-from-sistrix.php --post=slug --category=category --output=content-brief.md
```

**Step 3: Review Brief**

- Check primary keyword and metrics
- Review PAA questions
- Analyze competitor insights
- Note recommended headings and word count

### Brief Structure

1. **Primary Keyword** - Metrics, competition level, strategy
2. **Secondary Keywords** - Top 5 secondary keywords
3. **Related Keywords** - Semantic variations for internal linking
4. **Search Intent** - Intent classification and content strategy
5. **Content Depth** - Target word count based on competitor analysis
6. **PAA Questions** - Specific questions to use as FAQs
7. **Competitor Analysis** - Top competitors, their structure, FAQs
8. **Recommended Headings** - From competitor analysis
9. **Actionable Recommendations** - Specific optimization actions

## Competitive Analysis Workflow

### Step 1: Collect Competitor URLs

**Script:** `collect-post-competitor-analysis.php`

**Process:**

1. Query `keyword.seo` endpoint for primary keyword
2. Get top 10 ranking URLs
3. Filter out own domain
4. Select top 5 competitors
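
Steps 3 and 4 can be sketched as follows. This is not the code of `collect-post-competitor-analysis.php`; the function name `select_competitors` is hypothetical, and treating subdomains of the own domain as "own" is an assumption.

```python
from urllib.parse import urlparse


def select_competitors(ranking_urls: list[str], own_domain: str,
                       top: int = 5) -> list[str]:
    """Drop URLs on the own domain (including subdomains), keep the first `top`."""
    own = own_domain.lower().removeprefix("www.")
    picked = []
    for url in ranking_urls:
        host = (urlparse(url).hostname or "").lower().removeprefix("www.")
        if host == own or host.endswith("." + own):
            continue
        picked.append(url)
        if len(picked) == top:
            break
    return picked


urls = [
    "https://www.example.com/a",
    "https://competitor-one.de/x",
    "https://blog.example.com/b",
    "https://competitor-two.de/y",
]
print(select_competitors(urls, "example.com"))
```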

### Step 2: Analyze Competitor Content

**Analysis Includes:**

- Word count
- Heading structure (H1-H6)
- FAQ extraction (schema and patterns)
- Content structure (sections, lists, tables, images)

### Step 3: Generate Insights

**Insights Generated:**

- Average competitor word count
- Recommended headings (from multiple competitors)
- Competitor FAQs to address
- Content depth recommendations
- Content structure suggestions
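
Two of these insights, the average word count and headings shared by several competitors, can be derived as in this sketch. The input shape (`word_count`, `headings`) and the `min_shared=2` threshold are assumptions for illustration.

```python
from collections import Counter
from statistics import mean


def summarize_competitors(pages: list[dict], min_shared: int = 2) -> dict:
    """Average the word counts and keep headings used by several competitors."""
    avg_words = round(mean(page["word_count"] for page in pages))
    counts = Counter()
    for page in pages:
        # Count each normalized heading once per competitor.
        counts.update({h.strip().lower() for h in page["headings"]})
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    recommended = [heading for heading, n in ranked if n >= min_shared]
    return {"avg_word_count": avg_words, "recommended_headings": recommended}


pages = [
    {"word_count": 1800, "headings": ["Vorteile", "Kosten"]},
    {"word_count": 2100, "headings": ["vorteile ", "FAQ"]},
    {"word_count": 2400, "headings": ["Kosten", "Vorteile"]},
]
summary = summarize_competitors(pages)
print(summary)
```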

### Step 4: Apply to Content

**Use Insights For:**

- Determining target word count
- Planning content structure
- Identifying FAQ opportunities
- Finding content gaps

## Content Optimization Workflow

### Step 1: Generate Suggestions

```bash
php v2/scripts/content/integrate-sistrix-insights.php --post=slug --category=category
```

### Step 2: Review Recommendations

**Types of Suggestions:**

1. **Quick Wins** (Low Competition) - Meta tags, internal linking, content expansion
2. **Content Expansion** (Medium Competition) - Depth, examples, FAQs
3. **Long-Term Strategy** (High Competition) - Comprehensive content, backlinks
4. **FAQ Optimization** - Specific PAA questions to add
5. **Competitor Insights** - Word count targets, recommended headings
6. **Keyword Clusters** - Related keywords for internal linking

### Step 3: Score Content Quality

**Metrics:**

- Keyword optimization (30%)
- Content depth (30%)
- FAQ coverage (20%)
- Internal linking (20%)

**Scoring:**

- 80-100: Excellent
- 60-79: Good
- <60: Needs improvement
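
The weighting and score bands above amount to a weighted average. The sketch below assumes each metric has already been scored on a 0-100 scale; how those sub-scores are computed is not specified here, and the names `WEIGHTS` and `quality_score` are illustrative.

```python
# Weights in percent, as listed above.
WEIGHTS = {
    "keyword_optimization": 30,
    "content_depth": 30,
    "faq_coverage": 20,
    "internal_linking": 20,
}


def quality_score(subscores: dict) -> tuple[int, str]:
    """Combine 0-100 sub-scores into a weighted total and a rating label."""
    total = sum(WEIGHTS[name] * subscores[name] for name in WEIGHTS) / 100
    score = round(total)
    if score >= 80:
        label = "Excellent"
    elif score >= 60:
        label = "Good"
    else:
        label = "Needs improvement"
    return score, label


result = quality_score({"keyword_optimization": 90, "content_depth": 70,
                        "faq_coverage": 50, "internal_linking": 60})
print(result)
```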

### Step 4: Implement Optimizations

**Priority Actions:**

1. Add specific PAA questions as FAQs
2. Expand content to meet word count targets
3. Add recommended headings from competitors
4. Improve internal linking with related keywords
5. Optimize meta tags

## Examples and Use Cases

### Example 1: New Blog Post Creation

**Scenario:** Creating a new blog post about "digitale zeiterfassung"

**Workflow:**

```bash
# Step 1: Full data collection and brief generation
php v2/scripts/content/content-writing-assistant.php --post=digitale-zeiterfassung --category=lexikon --mode=full

# Output:
# - keywords-sistrix.json (with related keywords and historical trends)
# - paa-questions.json (15 actual questions)
# - competitor-analysis.json (5 competitors analyzed)
# - content-brief.md (comprehensive brief)
# - optimization-suggestions.md (specific recommendations)
# - content-quality-score.md (if content exists)
```

**Content Brief Includes:**

- Primary keyword: "digitale zeiterfassung" (Volume: 1200, Competition: 45)
- Related keywords: 10 semantic variations
- PAA questions: 15 specific questions with traffic data
- Competitor analysis: Average 2,100 words, recommended headings
- Target word count: 2,500 words (about 20% above the 2,100-word competitor average)
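
The target in this example can be reproduced with a one-line calculation. Rounding to the nearest 100 words is an assumption made here to match the 2,500 figure; the function name `target_word_count` is hypothetical.

```python
def target_word_count(competitor_avg: int, uplift: float = 0.20,
                      round_to: int = 100) -> int:
    """Competitor average plus an uplift, rounded to a convenient step."""
    raw = competitor_avg * (1 + uplift)          # 2100 * 1.2 = 2520
    return int(round(raw / round_to) * round_to)


print(target_word_count(2100))
```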

**Action Items:**

- Use 10 PAA questions as FAQs
- Include recommended headings from competitors
- Target 2,500 words
- Add 10+ internal links to related keywords

### Example 2: Content Update

**Scenario:** Updating existing post with new SISTRIX insights

**Workflow:**

```bash
# Step 1: Collect new data
php v2/scripts/blog/collect-post-paa-questions.php --post=existing-post --category=lexikon
php v2/scripts/blog/collect-post-competitor-analysis.php --post=existing-post --category=lexikon

# Step 2: Generate optimization suggestions
php v2/scripts/content/integrate-sistrix-insights.php --post=existing-post --category=lexikon

# Step 3: Score current content
php v2/scripts/content/content-writing-assistant.php --post=existing-post --category=lexikon --mode=suggestions
```

**Optimization Opportunities:**

- Add 5 new PAA questions as FAQs
- Expand content from 1,200 to 2,100 words (competitor average)
- Add 3 recommended headings from competitors
- Improve internal linking (currently 5, target: 15)

### Example 3: Topical Authority Analysis

**Scenario:** Identifying content clusters and gaps

**Workflow:**

```bash
php v2/scripts/blog/analyze-topical-authority.php --all --output=topical-analysis.json
```

**Output:**

- Keyword clusters identified
- Content gaps mapped
- Pillar page opportunities
- Cluster expansion opportunities

**Use Cases:**

- Plan content strategy
- Identify pillar page topics
- Map content clusters
- Find content gaps

## Best Practices

### 1. Data Collection (Optimized)

- **Use cross-post keyword batching** (`collect-all-keywords-cross-post.php`) for maximum efficiency - batches all unique keywords across all posts
- **Optimal batch size:** 30 keywords per batch (tested and verified)
- **POST requests** automatically used for batches > 20 keywords (avoids URL length limits)
- **Parallel processing** for non-batch endpoints (PAA, rankings) using `curl_multi` (5-10 concurrent requests)
- **No rate limiting delays** for batch endpoints (single API call)
- **Adaptive delays** for individual endpoints (0.5s instead of 1s)
- **Exponential backoff** for 429 errors (2s, 4s, 8s retries)
- **Check cache first** before API calls (30-day TTL for keywords/PAA, 7 days for rankings)
- **Pre-check cache status** using `check-sistrix-cache-status.php` before collection
- **Credit pre-checking** before starting large collections
- **Resume capability** via checkpoints for interrupted collections
- **Monitor credits** continuously (weekly limit is primary constraint)
- **Use dry-run** to estimate credit usage before large collections

### 2. Content Optimization

- **Use specific recommendations** from competitor analysis (not generic)
- **Target exact word counts** based on competitor averages
- **Use actual PAA questions** (not just detection)
- **Prioritize by search volume** for PAA questions and related keywords
- **Score content quality** before and after optimization

### 3. Workflow Integration

- **Use Content Writing Assistant** for full workflow
- **Generate content briefs** before writing
- **Review optimization suggestions** before updating content
- **Track quality scores** over time
- **Update data quarterly** for top-performing posts

### 4. Credit Management (Optimized)

- **Pre-check credits** before starting collection (estimates total needed)
- **Monitor weekly usage** (weekly limit: 10,000 credits; daily limit: 2,000 credits, with some flexibility)
- **Use efficient endpoints** (batch mode up to 50 keywords, parallel processing for non-batch)
- **Cross-post batching** reduces API calls by ~90% (all unique keywords in largest batches)
- **Cache pre-checking** shows cache hit rate and estimates credits needed
- **Skip cached posts** option to only process uncached data
- **Prioritize high-value** keywords and questions
- **Skip expensive endpoints** on **scale-tier** work where possible: `keyword.domain.seo` with the `kw` parameter costs 100 credits per keyword. **VIP marketing pages** may use selective domain-keyword SERP data per [VIP_MARKETING_SEO_DATA_TIERS.md](pages/marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md)
- **History parameter optional** (default: false, saves credits unless trend analysis needed)
- **Generate credit reports** weekly to track patterns
- **Resume capability** saves checkpoints to avoid re-processing completed work

## Performance Optimization

### Batch Processing Optimizations

**Cross-Post Keyword Batching:**

- **Script:** `collect-all-keywords-cross-post.php`
- **Method:** Extracts all unique primary keywords from all posts, processes in largest possible batches (up to 50 keywords), then distributes results back
- **Efficiency:** Reduces API calls by ~90% compared to per-post processing
- **Usage:** `php v2/scripts/blog/collect-all-keywords-cross-post.php --max-batch-size=50`
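
The method described above (deduplicate, batch, distribute back) can be sketched like this. It is an illustration, not the code of `collect-all-keywords-cross-post.php`: `fetch_batch` is a hypothetical callable that takes a list of keywords and returns a dict of metrics keyed by keyword.

```python
from typing import Callable


def collect_cross_post(posts: dict[str, list[str]],
                       fetch_batch: Callable[[list[str]], dict],
                       batch_size: int = 50) -> tuple[dict, int]:
    """Fetch each unique keyword once, then map results back to every post."""
    unique = sorted({kw for keywords in posts.values() for kw in keywords})
    metrics: dict = {}
    api_calls = 0
    for start in range(0, len(unique), batch_size):
        metrics.update(fetch_batch(unique[start:start + batch_size]))
        api_calls += 1
    per_post = {slug: {kw: metrics.get(kw) for kw in keywords}
                for slug, keywords in posts.items()}
    return per_post, api_calls


# 120 posts sharing 60 distinct keywords; the fetcher is a stand-in for the API.
demo_posts = {f"post-{i}": [f"kw{i % 60}"] for i in range(120)}
fake_batch = lambda batch: {kw: {"traffic": len(kw)} for kw in batch}
per_post, api_calls = collect_cross_post(demo_posts, fake_batch)
print(api_calls)
```

In this toy data set, 120 per-post calls collapse into 2 batch calls.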

**Optimal Batch Sizes:**

- **keyword.seo.metrics:** 30 keywords per batch (tested optimal)
- **Maximum:** 50 keywords per batch (POST requests)
- **Minimum:** 1 keyword (fallback to single calls)

**POST vs GET Requests:**

- **POST:** Automatically used for batches > 20 keywords (avoids URL length limits)
- **GET:** Used for smaller batches (≤20 keywords)
- **Performance:** POST slightly faster for large batches, GET simpler for small batches

### Parallel Processing Optimizations

**PAA Questions Collection:**

- **Script:** `collect-post-paa-questions-parallel.php`
- **Method:** Uses `curl_multi` for concurrent requests
- **Concurrent Limit:** 5-10 requests (default: 5)
- **Efficiency:** ~5x faster than sequential processing
- **Usage:** `php v2/scripts/blog/collect-post-paa-questions-parallel.php --all --concurrent=5`
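
The same bounded-concurrency idea can be illustrated with a thread pool. Note this swaps in Python's `ThreadPoolExecutor` for PHP's `curl_multi`, so it shows the pattern rather than the script's implementation; `fetch_one` is a hypothetical function that performs one request per keyword.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fetch_all(keywords: list[str], fetch_one: Callable[[str], object],
              concurrent: int = 5) -> dict:
    """Run fetch_one for every keyword with at most `concurrent` in flight."""
    with ThreadPoolExecutor(max_workers=concurrent) as pool:
        # pool.map returns results in input order.
        results = list(pool.map(fetch_one, keywords))
    return dict(zip(keywords, results))


# Stand-in for a network call.
print(fetch_all(["zeiterfassung", "arbeitszeit"], str.upper, concurrent=5))
```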

**Rankings Collection:**

- **Method:** Parallel processing via `curl_multi` in competitor analysis script
- **Concurrent Limit:** 5 requests (configurable)
- **Efficiency:** ~3-5x faster than sequential

### Rate Limiting Optimizations

**Batch Endpoints:**

- **No delays** between batch requests: each batch is a single API call, so no client-side throttling is applied

**Individual Endpoints:**

- **Adaptive delays:** 0.5s between requests (reduced from 1s)
- **Chunk delays:** 0.5s between parallel chunks (not individual requests)

**Error Handling:**

- **Exponential backoff** for 429 errors: 2s, 4s, 8s retries
- **Max retries:** 3 attempts
- **Automatic retry** with increasing delays
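
The retry schedule above can be sketched as a small wrapper. It assumes the caller raises a dedicated exception on HTTP 429; the names `RateLimited` and `with_backoff` are hypothetical, and "3 attempts" is read here as three retries after the initial call.

```python
import time
from typing import Callable


class RateLimited(Exception):
    """Raised by the caller's request function on an HTTP 429 response."""


def with_backoff(call: Callable[[], object], max_retries: int = 3,
                 base_delay: float = 2.0,
                 sleep: Callable[[float], None] = time.sleep):
    """Retry `call` on RateLimited, waiting 2s, 4s, 8s between attempts."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the schedule can be checked without actually waiting.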

### Cache Optimizations

**Cache Pre-Checking:**

- **Script:** `check-sistrix-cache-status.php`
- **Purpose:** Scan all posts and report cache hit rate before collection
- **Output:** Cache hit rates, uncached posts list, credit estimates
- **Usage:** `php v2/scripts/blog/check-sistrix-cache-status.php --skip-cached`

**Cache Expiration:**

- **Keywords:** 30 days (stable data)
- **PAA Questions:** 30 days (stable data)
- **Rankings:** 7 days (more dynamic)

### Credit Management Optimizations

**Pre-Checking:**

- **Estimate credits** needed before starting collection
- **Validate** against daily/weekly limits
- **Abort** if insufficient credits available
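
A rough pre-check can be sketched from the per-endpoint credit costs listed earlier in this guide. The assumptions are loud ones: every keyword is sent to all five keyword endpoints, PAA costs up to `paa_limit` credits per keyword, and the 2,000 daily and 10,000 weekly limits are treated as hard limits. Function names are hypothetical.

```python
# Per-keyword credit costs from the endpoint sections of this guide.
CREDITS_PER_KEYWORD = {
    "keyword.seo.metrics": 5,
    "keyword.seo": 1,
    "keyword.seo.searchintent": 1,
    "keyword.seo.serpfeatures": 1,
}


def estimate_credits(n_keywords: int, paa_limit: int = 15) -> int:
    """Worst-case estimate: fixed endpoint costs plus PAA questions."""
    fixed = sum(CREDITS_PER_KEYWORD.values())
    return n_keywords * (fixed + paa_limit)


def check_budget(needed: int, used_today: int = 0, used_week: int = 0,
                 daily_limit: int = 2000, weekly_limit: int = 10000):
    """Return (ok, reason); callers should abort when ok is False."""
    if used_today + needed > daily_limit:
        return False, "daily"
    if used_week + needed > weekly_limit:
        return False, "weekly"
    return True, "ok"


needed = estimate_credits(100)
print(needed, check_budget(needed))
```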

**History Parameter:**

- **Default:** false (saves credits)
- **Use:** Only when trend analysis needed
- **Flag:** `--with-history` to include historical trends

**Resume Capability:**

- **Checkpoints:** Saved every N posts (default: 10)
- **Resume:** `--resume-from=N` to continue from checkpoint
- **Avoids:** Re-processing completed work
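
The checkpoint-and-resume behaviour can be illustrated as below. The checkpoint file name and JSON layout are hypothetical; the sketch only mirrors the described behaviour of saving every N posts and continuing from a given index.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def run_with_checkpoints(posts: list, handle: Callable[[object], None],
                         every: int = 10, resume_from: int = 0,
                         checkpoint_file: str = "checkpoint.json",
                         save: Optional[Callable[[int], None]] = None) -> None:
    """Process posts in order, recording progress every `every` posts."""
    if save is None:
        def save(done: int) -> None:
            Path(checkpoint_file).write_text(json.dumps({"resume_from": done}))
    for index in range(resume_from, len(posts)):
        handle(posts[index])
        done = index + 1
        if done % every == 0 or done == len(posts):
            save(done)  # the value to pass as resume_from after an interruption
```

Passing a custom `save` callable makes it possible to exercise the logic without touching the file system.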

### Performance Benchmarks

**Before Optimization:**

- Keywords collection: ~1-2 seconds per keyword (sequential)
- PAA collection: ~1 second per keyword (sequential)
- Total time for 100 posts: ~15-20 minutes

**After Optimization:**

- Keywords collection: ~0.05 seconds per keyword (batch of 30)
- PAA collection: ~0.2 seconds per keyword (parallel, 5 concurrent)
- Total time for 100 posts: ~3-5 minutes

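**Speed Improvement:** roughly 3-7x faster overall for 100 posts (15-20 minutes down to 3-5 minutes), depending on which ends of the ranges are compared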

## Troubleshooting

### Issue: PAA Questions Not Found

**Symptoms:** `paa-questions.json` file missing or empty

**Solutions:**

1. Run PAA collection script: `php v2/scripts/blog/collect-post-paa-questions.php --post=slug --category=category`
2. Check API credits available
3. Verify primary keyword exists in keywords-sistrix.json
4. Check API response format matches expected structure

### Issue: Competitor Analysis Fails

**Symptoms:** Competitor analysis script errors or returns empty data

**Solutions:**

1. Check API credits available
2. Verify keyword.seo endpoint is accessible
3. Check competitor URLs are accessible (not blocked)
4. Review error logs for specific issues

### Issue: Related Keywords Not Collected

**Symptoms:** `related_keywords` field missing in keywords-sistrix.json

**Solutions:**

1. Re-run keyword collection: `php v2/scripts/blog/collect-post-keywords-sistrix.php --post=slug --category=category`
2. Check marketplace.keyword.search.ideas endpoint is accessible
3. Verify API credits available
4. Check script version includes related keyword collection

### Issue: Content Brief Generation Fails

**Symptoms:** Brief generation returns errors or empty output

**Solutions:**

1. Verify all required data files exist:
   - keywords-sistrix.json
   - paa-questions.json (optional)
   - competitor-analysis.json (optional)
   - search-intent.json (optional)
2. Check file permissions
3. Review error output for specific issues
4. Test with `--dry-run` mode

### Issue: High Credit Usage

**Symptoms:** Credits exhausted quickly

**Solutions:**

1. Review credit usage report: `php v2/scripts/blog/generate-credit-usage-report.php`
2. Increase cache duration for stable data
3. Use batch mode more aggressively
4. On blog/bulk pipelines, skip expensive domain-keyword SERP unless justified; VIP marketing tier: see [VIP_MARKETING_SEO_DATA_TIERS.md](pages/marketing-pages/VIP_MARKETING_SEO_DATA_TIERS.md)
5. Prioritize high-value collections

## Related Documentation

- `docs/content/SISTRIX_BEST_PRACTICES_2026.md` - Best practices guide
- `docs/content/SISTRIX_CONTENT_INTEGRATION_GUIDE.md` - Integration guide
- `docs/content/CONTENT_CREATION_WORKFLOW_2026.md` - Content creation workflow
- `.cursor/rules/blog-data-collection.mdc` - Data collection rules
- `docs/content/SISTRIX_PAA_RESEARCH_FINDINGS.md` - PAA extraction research
