# Affiliate HubSpot Sync – Runbook

**Last Updated:** 2026-04-08

Operational runbook for the hourly affiliate HubSpot sync cron job.

## Purpose

The sync refreshes partner, lead, and deal data from HubSpot into a local cache. All partner dashboards (Dashboard, Levels, Earnings, Leads, Leaderboard) read from this cache. For each synced (active) partner, the sync also **always** pushes the current level and MRR share to HubSpot, so HubSpot stays in sync with the platform; no separate "update HubSpot level" step is needed after backfills.

## Data flow

**Sync → cache → display APIs.** The sync is the only process that calls HubSpot. Display APIs (affiliate-dashboard-data, affiliate-leads, affiliate-earnings, affiliate-levels, affiliate-leaderboard, affiliate-admin-list) read exclusively from `affiliate_hubspot_cache.json` via `loadCachedHubSpotData()`. No HubSpot API calls occur during user navigation. See [CACHE_ARCHITECTURE.md](CACHE_ARCHITECTURE.md) for schema and [ARCHITECTURE.md](ARCHITECTURE.md) for Cache-First Architecture.

## Pagination

The sync fetches **all pages** of contacts and deals for each partner. The HubSpot Search API returns up to 100 records per page; for partners with more than 100 contacts or deals, the sync follows the `after` cursor from `paging.next` so that the full data set is retrieved.
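
The cursor loop can be sketched in shell as follows. This is an illustration only, not the runner's code: `next_after`, `fetch_all_pages`, and the injected page-fetching command are hypothetical names, and the `sed` extraction is a stand-in for a proper JSON parser such as `jq`.

```bash
# Print the paging.next.after cursor from one JSON page, or nothing if absent.
# Assumption: the cursor appears as "after":"<value>" in the response body.
next_after() {
  printf '%s\n' "$1" |
    sed -n 's/.*"after"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Call the given fetch command repeatedly, passing the current cursor
# (empty on the first call), and print each page until no cursor remains.
fetch_all_pages() {
  local fetch_page="$1" after="" page
  while :; do
    page=$("$fetch_page" "$after") || return 1
    printf '%s\n' "$page"
    after=$(next_after "$page")
    [ -n "$after" ] || break
  done
}
```

The fetch command is passed in so the loop can be exercised with canned responses before pointing it at a real API call.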

## Schedule

- **Frequency:** Hourly at minute 0 (server time).
- **Production crontab:** Use `v2/cron/crontab-production.txt` for consolidated setup (sync, health check, performance).
- **Crontab example:** `0 * * * * cd /var/www/lexikon && /usr/bin/php v2/cron/sync-affiliate-hubspot.php >> /var/log/affiliate-sync.log 2>&1`

## Sync triggers (production SSOT)

| Trigger | Role | Notes |
|--------|------|--------|
| **Server cron (CLI)** | **Primary** — intended for hourly freshness | Runs `php v2/cron/sync-affiliate-hubspot.php`. No HTTP hop; failure email on exit 1. Same lock as manual sync. |
| **GitHub Action** | **Optional backup** | Workflow [`.github/workflows/hubspot-sync-cron.yml`](../../../.github/workflows/hubspot-sync-cron.yml) calls `cron-webhook-sync.php` **twice daily** (09:00 and 21:00 UTC). Use when you want redundancy if server cron is misconfigured. |
| **`cron-webhook-sync.php`** | External / CI entry | Token query param; **max one successful trigger per 5 minutes** (abuse protection). **Not suitable for hourly scheduling** via this URL without changing that limit. |
| **Admin “Sync mit HubSpot”** | Manual | `affiliate-admin-trigger-sync.php`; same runner + lock; **5-minute** cooldown between admin triggers. |
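
The five-minute limits in the table amount to a timestamp comparison. The following is a minimal sketch of such a cooldown check, not the code in `cron-webhook-sync.php` or `affiliate-admin-trigger-sync.php`; the function name, the use of epoch seconds, and allowing a trigger at exactly 300 seconds are assumptions made for illustration.

```bash
COOLDOWN_SECONDS=300  # 5 minutes, as stated in the table above

# Succeed (return 0) if a new trigger is allowed at time "$1", given that
# the last successful trigger happened at "$2" (both in epoch seconds).
# An empty "$2" means no trigger has been recorded yet.
trigger_allowed() {
  local now="$1" last="$2"
  [ -z "$last" ] && return 0
  [ $((now - last)) -ge "$COOLDOWN_SECONDS" ]
}
```

A caller would pass the current time, for example `trigger_allowed "$(date +%s)" "$last_trigger"`, where `last_trigger` is whatever timestamp the caller has stored.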

**Decision (2026-04):** Keep the GitHub workflow as a **2×/day backup** alongside production server cron. **Do not** switch hourly sync to GitHub Actions (cost, noise, and best-effort scheduling). Remove or narrow the workflow later only after server cron has been green for several weeks and ops agrees.

### Troubleshooting: “Letzte Synchronisation” only moves ~twice per day

If the Partner **Verwaltung** timestamp matches **GitHub Actions** “HubSpot Sync Cron” runs (~09:00 and ~21:00 UTC), **hourly server cron is not running or is failing** — this is **not** a bug in the workflow schedule.

1. On the server: `crontab -l` (for the user that should run the job) must contain `sync-affiliate-hubspot.php` with the correct `cd` to the live docroot (see `v2/cron/crontab-production.txt`).
2. Check `/var/log/affiliate-sync.log` (or your redirect target) for errors or repeated “Another sync already running; skipping.”
3. Run `php v2/scripts/affiliate/verify-cron-installed.php` and `php v2/scripts/affiliate/monitor-sync-health.php`.
4. The admin UI applies **warning styling** when `last_sync` is older than **2 hours** (the threshold aligns with the health monitor).

### Secrets and rotation

- **`CRON_WEBHOOK_SECRET`:** Stored in GitHub Actions secrets; must match `v2/config/cron-webhook-config.php` (env → local override → fallback). If you **disable** the GitHub workflow entirely, you can still keep the webhook for other callers (e.g. external monitors); rotate the token if it was exposed. Do not commit secrets.

### HubSpot API usage (before going faster than hourly)

Roughly **2–3 HubSpot API calls per active partner** per full sync, plus pagination for large result sets. With **50+** active partners, the runner uses a **1-second delay between partners** to reduce burst pressure (HubSpot limits: see [API usage guidelines](https://developers.hubspot.com/docs/developer-tooling/platform/usage-guidelines)). Do not schedule sub-hourly syncs without reviewing tier limits and, if needed, backoff in `v2/helpers/affiliate-sync-runner.php`.
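
As a rough worked example of the figures above, the sketch below estimates HubSpot calls per day and the pause time added to each run. The function names are hypothetical, pagination calls are ignored, and the one-second delay is assumed to apply to every gap between consecutive partners once 50 or more are active.

```bash
# Estimated API calls per day: partners * calls per partner * syncs per day.
# Pagination adds further calls for partners with large result sets.
calls_per_day() {
  local partners="$1" calls_per_partner="$2" syncs_per_day="$3"
  echo $((partners * calls_per_partner * syncs_per_day))
}

# Seconds of deliberate delay per run: one second between consecutive
# partners, applied only when 50 or more partners are active.
delay_seconds_per_run() {
  local partners="$1"
  if [ "$partners" -ge 50 ]; then
    echo $((partners - 1))
  else
    echo 0
  fi
}
```

For 60 active partners at 3 calls each, an hourly schedule gives `calls_per_day 60 3 24`, which is 4320 calls per day, while a 15-minute schedule (96 syncs per day) gives 17280. Compare such estimates against the limits of your HubSpot tier before changing the schedule.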

### Future: event-driven updates (Operations Hub / webhooks)

HubSpot **workflow webhooks** (Operations Hub tiers that include them) or **app webhook subscriptions** can notify your app when deals or contacts change, which can reduce reliance on polling for *change detection*. They **do not replace** the full sync: you still need periodic reconciliation (pagination, MRR/level recompute, registry-driven PATCHes). Treat as a separate design phase: idempotent handler, auth, queue, plus retained hourly (or daily) full sync.

## Production verification (SSH)

Run as the **same user** that owns the production cron job (often `www-data` or the deploy user):

1. **Crontab:** `crontab -l` — confirm a non-comment line contains `sync-affiliate-hubspot.php` and the `cd` path matches the live docroot (adjust `v2/cron/crontab-production.txt` paths if your tree is not `/var/www/lexikon`).
2. **Automated check:** `php v2/scripts/affiliate/verify-cron-installed.php` (exit 0 = sync line found).
3. **Logs:** `tail -n 50 /var/log/affiliate-sync.log` (or your redirect target) and search for `[Affiliate Sync]`.
4. **Cache freshness:** Locate the readable `affiliate_hubspot_cache.json` (see [DEPLOYMENT_CHECKLIST.md](DEPLOYMENT_CHECKLIST.md) / path helpers), then check `last_sync` (e.g. `jq -r '.last_sync' /path/to/affiliate_hubspot_cache.json`). For the hourly primary sync, `monitor-sync-health.php` expects an age of **less than 2 hours** for a healthy status.
5. **Permissions:** Cache directory writable for atomic writes; lock file `affiliate_hubspot_sync.lock` in the same directory; `v2/data/cron_webhook_rate_limit.txt` writable if the public webhook is used.
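
Step 1 can be approximated with a small filter. This sketch is not the logic of `verify-cron-installed.php`; `has_sync_cron_line` is a hypothetical helper that reads crontab text on standard input and succeeds only if a non-comment line mentions the sync script.

```bash
# Read crontab text on stdin; succeed if any line that is not a comment
# references sync-affiliate-hubspot.php.
has_sync_cron_line() {
  grep -v '^[[:space:]]*#' | grep -q 'sync-affiliate-hubspot\.php'
}
```

Typical use would be `crontab -l | has_sync_cron_line && echo "sync line present"`. The check does not validate the `cd` path, which still needs a manual look.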

## Staging verification

If staging mirrors production layout: repeat **Production verification** on the staging host, run one manual `php v2/cron/sync-affiliate-hubspot.php`, and confirm admin **Sync mit HubSpot** returns **409** when a second trigger fires while the first still holds the lock (overlap test).

## Manual run

From the repository root:

```bash
php v2/cron/sync-affiliate-hubspot.php
```

Or with absolute path:

```bash
php /path/to/v2/cron/sync-affiliate-hubspot.php
```

## Manual sync from admin panel

Admins can trigger a HubSpot sync from the **Verwaltung** (Admin) page without waiting for cron. The "Sync mit HubSpot" button runs the same sync logic as the cron job and uses the **same lock file**, so only one sync runs at a time (manual or cron). If a sync is already running, the button returns "Sync already running." A **rate limit** applies: one manual sync per 5 minutes per environment. With large partner lists, the sync may take 1–2 minutes to complete after the button is clicked.

## Lockfile

The script uses `flock()` on a lock file to prevent overlapping runs (cron and manual sync share this lock). The lock file is:

- **Path:** Same directory as the writable HubSpot cache file, named `affiliate_hubspot_sync.lock`.
- **Behavior:** If another instance is already running, a new run fails to acquire the lock, logs "Another sync already running; skipping.", and exits with code 0 (so cron does not treat it as a failure).
- **Cleanup:** The lock is released when the process exits (normally or on error). No manual cleanup is needed.
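
The lock behavior can be reproduced in shell with the util-linux `flock` command, which takes the same kind of advisory lock as PHP's `flock()`. This is a sketch for experimenting with the semantics, not the runner's implementation; `run_exclusive` is a hypothetical helper and file descriptor 9 is an arbitrary choice.

```bash
# Run a command while holding an exclusive, non-blocking lock on "$1".
# If the lock is already held, print the skip message and return 0,
# mirroring the behavior described above.
run_exclusive() {
  local lock_file="$1"
  shift
  (
    if ! flock -n 9; then
      echo "Another sync already running; skipping."
      exit 0  # leaves only this subshell
    fi
    "$@"
  ) 9>>"$lock_file"
}
```

Running `run_exclusive /tmp/example.lock sleep 30 &` and then `run_exclusive /tmp/example.lock echo "second run"` in the same shell should print the skip message for the second call. The lock disappears as soon as the first command exits, which is why no cleanup step is required.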

## Exit codes

| Code | Meaning                                                                      |
| ---- | ---------------------------------------------------------------------------- |
| 0    | Success, or skipped because another sync was already running (lock held).    |
| 1    | Error (e.g. failed to open lock file, cache write failure, fatal exception). |

## Health check

After a sync (or to verify that a recent sync succeeded), run:

```bash
php v2/scripts/affiliate/monitor-sync-health.php
```

**Health check cron:** Runs daily at 9:00 UTC via `v2/cron/crontab-production.txt`. Output: `/var/log/affiliate-sync-health.log`.

**Verify cron installed:** `php v2/scripts/affiliate/verify-cron-installed.php` (exits 0 if sync cron present).

Interpret the output:

- **OK (exit 0):** Cache exists, is valid, and `last_sync` is recent (e.g. within 2 hours for hourly cron).
- **Warning (exit 2):** e.g. no `last_sync` in cache, or sync age above threshold (2–6 hours).
- **Error (exit 1):** Cache missing, unreadable, invalid JSON, or sync age > 6 hours.
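
The age thresholds above map to exit codes as in the following sketch. It is not the source of `monitor-sync-health.php`; `classify_sync_age` is a hypothetical helper, and treating exactly 2 hours as OK and exactly 6 hours as a warning is an assumption about the boundaries.

```bash
# Print a status word for a sync age given in seconds and return the
# matching exit code: 0 = OK, 2 = warning, 1 = error.
classify_sync_age() {
  local age_seconds="$1"
  if [ "$age_seconds" -le $((2 * 3600)) ]; then
    echo "ok"; return 0
  elif [ "$age_seconds" -le $((6 * 3600)) ]; then
    echo "warning"; return 2
  else
    echo "error"; return 1
  fi
}
```

Note that the warning code (2) is numerically higher than the error code (1), so a wrapper that only tests for a non-zero exit cannot tell the two apart.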

## Sync failure alerting

When the sync fails (exit code 1), an email is sent to `hady@ordio.com` with error message, error code, last successful sync (if cache exists), and remediation steps. **No alert** when `sync_already_running` (another instance holds the lock; exit 0).

## Logs

- **Sync stdout/stderr (production):** `/var/log/affiliate-sync.log` when cron redirects output (see `v2/cron/crontab-production.txt`).
- **Health check output:** `/var/log/affiliate-sync-health.log`.
- **PHP error_log:** Sync messages use prefix `[Affiliate Sync]`.
- **Search:** `grep "Affiliate Sync" /var/log/php-fpm/error.log` or your server error log path.

## Remediation

### Cache missing or invalid

1. Run the sync manually: `php v2/cron/sync-affiliate-hubspot.php`.
2. Confirm the cache directory is writable (see [DEPLOYMENT_CHECKLIST.md](DEPLOYMENT_CHECKLIST.md)).
3. Re-run the health script to confirm cache exists and is valid.

### HubSpot 429 (rate limit)

1. Check HubSpot API rate limits for your tier (burst and daily).
2. Confirm the cron runs no more than once per hour; do not increase the frequency without checking limits.
3. The script already uses a 1-second delay between partners when there are 50+ active partners; keep that in place if you increase frequency.

### last_sync too old (health script warning)

1. Confirm the cron job is scheduled and running (check crontab and server logs).
2. Run the sync manually and check for errors in the log.
3. Verify HubSpot API token and network access from the server.

### No data for specific partner

**Symptom:** One partner’s dashboard shows 0 leads, 0 deals, €0 MRR after sync, but HubSpot has contacts and deals for that partner.

**Diagnosis:** Run the diagnostic script:

```bash
php v2/scripts/affiliate/diagnose-sync-frederik.php [--json] [--partner-id=AP-XXX]
```

Use `--partner-id=AP-XXX` for the affected partner (default: AP-28260283-8FF485).

For cross-API consistency (leads_count, deals_count, MRR mismatch between Admin and Dashboard), run:

```bash
php v2/scripts/affiliate/validate-affiliate-data.php [--json] [--partner-id=AP-XXX]
```

Output: `discrepancies` (cached vs calculated MRR, etc.), `duplicate_emails` (same email, different partner IDs). See [DATA_GLOSSARY.md](DATA_GLOSSARY.md) for term definitions.

**Output fields (diagnostic script):**
- `summary`: ok | partner_not_in_registry | cache_path_mismatch | hubspot_empty | hubspot_api_error | sync_not_populating_cache
- `registry`: in_registry, status, total_partners, active_partners, partner_ids
- `paths`: data_paths_match, cache_paths_match, cache_readable, cache_writable
- `hubspot`: contacts_count, deals_count, portal_id (HUBSPOT_PORTAL_ID when defined), token_verified (true/false/null), portal_error (if token verification failed), api_error (if any)
- `cache`: leads_count, deals_count, last_sync, partners_in_cache
- `diagnosis`: list of issues found
- `recommendations`: suggested fixes

**Checks:**

1. **Partner in registry:** Verify `registry.in_registry` is true and `registry.status` is `active`. Sync only processes partners in the registry with status `active`.
2. **HubSpot token:** If `hubspot.contacts_count` and `hubspot.deals_count` are 0 but HubSpot has data, the production token may point to a different portal. Check `hubspot.token_verified` (false = token invalid or wrong scope); `hubspot.portal_id` shows the configured portal, so verify that it matches HubSpot Settings > Account & Billing. Ensure the token has `crm.objects.contacts.read` and `crm.objects.deals.read`.
3. **Path mismatch:** Compare `paths.readable_cache_file` and `paths.writable_cache_file`. If they differ, sync writes to one path and dashboard reads from another. Align paths so both use the same file.
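
The checks above can be tied to the `summary` values as in this sketch. The mapping is an interpretation of this runbook rather than output of the diagnostic script, and `next_step_for_summary` is a hypothetical helper.

```bash
# Print a suggested next step for a diagnostic "summary" value.
next_step_for_summary() {
  case "$1" in
    ok)
      echo "No issue detected." ;;
    partner_not_in_registry)
      echo "Check registry.in_registry and registry.status (must be active)." ;;
    hubspot_empty|hubspot_api_error)
      echo "Check hubspot.token_verified, hubspot.portal_id, and hubspot.api_error." ;;
    cache_path_mismatch)
      echo "Align the readable and writable cache paths." ;;
    *)
      echo "Review the diagnosis and recommendations fields." ;;
  esac
}
```

With `jq` installed, and assuming `summary` is a top-level key in the `--json` output, the value could be extracted with `jq -r '.summary'` and passed to the helper.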

See [HUBSPOT_INTEGRATION_TEST_PROCEDURE.md](HUBSPOT_INTEGRATION_TEST_PROCEDURE.md) → "Sync not updating data" for full troubleshooting.

### Lock stuck (last resort)

The lock is released when the process exits; it does not persist across reboots. If you have evidence that a process died without releasing the lock (e.g. no sync for days and no other errors), you can remove the lock file manually so the next run can proceed:

- **Path:** Same directory as the writable cache file, file name: `affiliate_hubspot_sync.lock`.
- Only do this when no sync process is running (e.g. `pgrep -f sync-affiliate-hubspot` returns nothing).
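
The precaution above can be wrapped in a guard so the lock is removed only when a process check finds nothing. This is a sketch; `remove_lock_if_idle` is a hypothetical helper, and the process check is passed in as a command so it can be adapted to the host.

```bash
# Remove the lock file "$1" only if the check command given in the
# remaining arguments fails, i.e. finds no running sync process.
remove_lock_if_idle() {
  local lock_file="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "Sync process still running; lock left in place." >&2
    return 1
  fi
  rm -f -- "$lock_file"
}
```

For example: `remove_lock_if_idle /path/to/affiliate_hubspot_sync.lock pgrep -f sync-affiliate-hubspot`. Be aware that `pgrep -f` matches full command lines, so it can also match a wrapper script whose own arguments contain the pattern.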

## One-time: Add Beginner level (HubSpot + platform backfill)

After deploying the "Beginner" level (0 deals, pre–first deal) to the platform:

1. **Add Beginner to HubSpot Level property** (run once):  
   `php v2/scripts/hubspot/update-affiliate-level-property.php [--dry-run]`  
   Then run without `--dry-run` to PATCH the property. Requires HubSpot API token with write scope.

2. **Backfill platform partner JSON** (run once):  
   `php v2/scripts/affiliate/backfill-beginner-level.php [--dry-run]`  
   Then run without `--dry-run` to set level to `beginner` for pending_verification and 0-deal starter partners.

3. **Run the sync** so HubSpot records are updated from platform:  
   `php v2/cron/sync-affiliate-hubspot.php`

See [ARCHITECTURE.md](ARCHITECTURE.md) for level progression (Beginner → Starter → Partner → Pro).

If the admin panel or HubSpot still shows "Starter" for 0-deal or pending partners, run the platform backfill once in that environment: first `php v2/scripts/affiliate/backfill-beginner-level.php --dry-run`, then again without `--dry-run`.

## Updating HubSpot only (records not in platform JSON)

The **sync is driven by the platform registry (local JSON)**. It only processes partners that exist in the registry; for each of those it recalculates level and **PATCHes** that partner’s HubSpot record (level + MRR share). Partners that exist **only in HubSpot** (e.g. test records, or created in HubSpot before the platform) are never touched by sync.

To update **HubSpot-only** records (level and MRR share, without touching local JSON):

```bash
# Set specific Partner IDs in HubSpot to a level (e.g. Beginner = 0%)
php v2/scripts/hubspot/update-hubspot-affiliate-levels.php --level=beginner AP-20260203-784830 AP-20260203-0FF4B6

# Set all Affiliate Partner records in HubSpot to a level (e.g. Starter = 20%) – dry run first
php v2/scripts/hubspot/update-hubspot-affiliate-levels.php --all --level=starter --dry-run
php v2/scripts/hubspot/update-hubspot-affiliate-levels.php --all --level=starter
```

Use `--level=beginner|starter|partner|pro`. MRR share is taken from config (Beginner 0%, Starter 20%, Partner 25%, Pro 30%).
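
The level-to-share mapping quoted above can be written as a lookup. This is an illustrative sketch, not the project's config file; `mrr_share_percent` is a hypothetical helper.

```bash
# Print the MRR share (in percent) for a level; fail on unknown levels.
mrr_share_percent() {
  case "$1" in
    beginner) echo 0 ;;
    starter)  echo 20 ;;
    partner)  echo 25 ;;
    pro)      echo 30 ;;
    *) echo "unknown level: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown levels with a non-zero return keeps a typo in `--level` from silently turning into a 0% share.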

## When to update this runbook

Update this runbook when you change:

- The **schedule** (e.g. from hourly to a different frequency),
- **Lock behavior** or lock file path,
- **Exit codes** or success/failure semantics.

When you make such a change, also update:

- The script docblock in `v2/cron/sync-affiliate-hubspot.php`,
- [DEPLOYMENT_CHECKLIST.md](DEPLOYMENT_CHECKLIST.md) (Cron Job Setup),
- [HUBSPOT_SETUP_STATUS.md](HUBSPOT_SETUP_STATUS.md) (Recommended Next Steps).

## Related documentation

- [DEPLOYMENT_CHECKLIST.md](DEPLOYMENT_CHECKLIST.md) – Cron job setup and monitoring
- [ARCHITECTURE.md](ARCHITECTURE.md) – Data flow and cron role
- [HUBSPOT_INTEGRATION_SUMMARY.md](HUBSPOT_INTEGRATION_SUMMARY.md) – Sync process and production readiness
