# SISTRIX PAA Extraction Research Findings

**Date:** 2026-01-15  
**Status:** Research Complete

## Summary

This document summarizes the options for extracting People Also Ask (PAA) questions, based on a review of the SISTRIX API documentation and tests of the available endpoints.

## Available Endpoints

### 1. `keyword.questions` Endpoint (RECOMMENDED)

**Endpoint:** `keyword.questions`  
**Purpose:** Returns Google queries that include your specified keyword paired with common question words (what, where, when, how, etc.)

**What it returns:**

- `question`: The full question query including the keyword paired with question words
- `amount`: Count of how many times the question appears in SISTRIX's dataset
- `traffic`: Search volume (how many users are searching this question)

**Parameters:**

- `api_key` (required): SISTRIX API key
- `kw` (required): Target keyword
- `country` (optional): ISO country code (default: account country)
- `lang` (optional): ISO 639-1 language code
- `limit` (optional): Maximum number of results
- `page` (optional): Pagination
- `format` (optional): Response format, `json` or `xml` (default: `xml`)

**Credit Usage:** 1 credit per question returned
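
Because credits are charged per question returned, the `limit` parameter effectively caps the cost of each request. A minimal sketch of the worst-case budget arithmetic, in Python for illustration; the helper name `estimate_max_credits` is hypothetical and not part of any SISTRIX tooling:

```python
def estimate_max_credits(num_keywords: int, limit_per_keyword: int) -> int:
    """Upper bound on credits, assuming 1 credit per question returned.

    Actual usage may be lower, since a keyword can return fewer
    questions than the requested limit.
    """
    if num_keywords < 0 or limit_per_keyword < 0:
        raise ValueError("counts must be non-negative")
    return num_keywords * limit_per_keyword


# Example: 50 primary keywords, up to 20 questions each
print(estimate_max_credits(50, 20))
```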

**Example Request:**

```
https://api.sistrix.com/keyword.questions?api_key=YOUR_KEY&kw=digitale%20zeiterfassung&country=de&lang=de&limit=20&format=json
```
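
The request URL above can also be assembled programmatically so that the keyword is always URL-encoded correctly. The following is a minimal Python sketch using only the standard library. It assumes the parameter names listed above. The function name `build_questions_url` is hypothetical, and the planned collector script itself is PHP.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.sistrix.com/keyword.questions"


def build_questions_url(api_key, kw, country=None, lang=None,
                        limit=None, page=None, fmt="json"):
    """Build a keyword.questions request URL (hypothetical helper).

    Optional parameters that are None are omitted from the query string.
    """
    params = {"api_key": api_key, "kw": kw}
    optional = {
        "country": country,
        "lang": lang,
        "limit": limit,
        "page": page,
        "format": fmt,
    }
    params.update({k: v for k, v in optional.items() if v is not None})
    # quote_via=quote encodes spaces as %20 rather than "+"
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


url = build_questions_url("YOUR_KEY", "digitale zeiterfassung",
                          country="de", lang="de", limit=20)
print(url)
```

With these arguments the helper reproduces the example request shown above.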

**Advantages:**

- Returns actual question queries (not just counts)
- Includes search volume data for prioritization
- Official SISTRIX API endpoint
- Reliable and well-documented

**Limitations:**

- Not identical to Google's PAA box, which is dynamic and changes as users interact with it
- Returns question-style queries from Google's query data, which may differ from the questions shown in the PAA box at any given moment
- Despite these differences, the data serves a very similar purpose for many SEO use cases

### 2. `keyword.seo.serpfeatures` Endpoint (CURRENTLY USED)

**Endpoint:** `keyword.seo.serpfeatures`  
**Purpose:** Returns SERP features present for a keyword

**What it returns:**

- Feature types (RELATED_QUESTION, IMAGE, VIDEO, etc.)
- Feature counts
- Feature positions

**Limitations:**

- Only provides counts of RELATED_QUESTION features
- Does NOT provide actual question text
- The current implementation tries to extract questions from the feature data, but the response structure does not contain any question text

## Recommendation

**Use `keyword.questions` endpoint** for extracting actual PAA-style questions. This endpoint provides:

1. Actual question queries (not just counts)
2. Search volume data for prioritization
3. Reliable, official API endpoint
4. Better data quality than trying to extract from SERP features

## Implementation Plan

1. Create `collect-post-paa-questions.php` script using `keyword.questions` endpoint
2. Collect top 10-20 questions per primary keyword
3. Store questions with search volume and priority scores
4. Integrate with existing SERP features collection
5. Use questions for FAQ generation and content optimization
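
Steps 2 and 3 of the plan can be sketched as a small selection routine. This Python example is illustrative only (the planned script is PHP) and rests on several assumptions: that each result has already been parsed into a dictionary with the `question`, `amount`, and `traffic` fields described above, that `traffic` may arrive as a number or a numeric string, and that ranking by `traffic` with `amount` as a tie-breaker is an acceptable stand-in for a priority score. The actual response layout and scoring formula are not specified in this document, and the sample rows use made-up values.

```python
def top_questions(rows, n=20):
    """Deduplicate questions and return the n highest-priority ones.

    Priority here is simply (traffic, amount) in descending order,
    which is an assumed placeholder for a real scoring formula.
    """
    seen = set()
    cleaned = []
    for row in rows:
        question = str(row.get("question", "")).strip()
        key = question.lower()
        if not question or key in seen:
            continue  # skip empty entries and case-insensitive duplicates
        seen.add(key)
        cleaned.append({
            "question": question,
            "traffic": int(row.get("traffic") or 0),
            "amount": int(row.get("amount") or 0),
        })
    cleaned.sort(key=lambda r: (r["traffic"], r["amount"]), reverse=True)
    return cleaned[:n]


# Made-up sample rows, for illustration only
sample = [
    {"question": "was ist digitale zeiterfassung", "traffic": "50", "amount": 3},
    {"question": "wie funktioniert digitale zeiterfassung", "traffic": 120, "amount": 2},
]
for item in top_questions(sample, n=10):
    print(item["traffic"], item["question"])
```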

## Alternative Solutions (If Needed)

If `keyword.questions` doesn't provide sufficient coverage:

1. **Google Search API**: Could use Google's Custom Search JSON API (the API for Programmable Search Engine)
2. **Scraping**: Could scrape Google SERPs (not recommended because of Terms of Service and reliability concerns)
3. **Third-party APIs**: Services like DataForSEO, SerpAPI, etc. (additional cost)

## Next Steps

1. Implement `collect-post-paa-questions.php` using `keyword.questions` endpoint
2. Test with sample keywords
3. Validate question quality and relevance
4. Integrate with content brief generation
