# Cron Setup Guide for Blog Automation

**Last Updated:** 2026-02-08

## Overview

This guide provides step-by-step instructions for setting up cron jobs to automate blog monitoring, data collection, and quality checks. For full process details and domain-level collection, see [MONITORING_RUNBOOK.md](MONITORING_RUNBOOK.md) and [CONSOLIDATED_NEXT_STEPS.md](CONSOLIDATED_NEXT_STEPS.md).

## Prerequisites

- Server access with cron capability
- PHP 7.4+ installed
- All automation scripts exist and are tested
- Email configuration for alerts

## Cron Jobs Configuration

### 1. Weekly Quality Check

**Schedule:** Every Monday at 9:00 AM

**Command:**
```bash
0 9 * * 1 cd /path/to/landingpage && php v2/scripts/blog/weekly-quality-check.php --email >> /path/to/logs/weekly-quality-check.log 2>&1
```

**What it does:**
- Audits FAQ quality
- Checks link health
- Validates schema
- Checks content freshness
- Sends email report if issues found

**Log File:** `logs/weekly-quality-check.log`

### 2. Weekly Priority Refresh

**Schedule:** Every Monday at 10:00 AM (after quality check)

**Command:**
```bash
0 10 * * 1 cd /path/to/landingpage && php v2/scripts/blog/weekly-priority-refresh.php >> /path/to/logs/weekly-priority-refresh.log 2>&1
```

**What it does:**
- Updates GA4 performance data
- Updates GSC search performance data
- Recalculates priority scores
- Generates priority dashboard

**Log File:** `logs/weekly-priority-refresh.log`

### 3. Weekly Data Collection (GA4 & GSC) – optional if job 2 runs

**Schedule:** Every Monday at 11:00 AM. Add this job only if you do *not* run `weekly-priority-refresh.php` (job 2), which already runs GA4, GSC, priority calculation, and the dashboard.

**Command:**
```bash
0 11 * * 1 cd /path/to/landingpage && php v2/scripts/blog/automate-data-collection.php --weekly >> /path/to/logs/data-collection-weekly.log 2>&1
```

**What it does:**
- Collects GA4 performance metrics
- Collects GSC search performance
- Updates post JSON files with latest data

**Note:** **Recommended minimal weekly setup** is jobs 1 + 2 only (quality check 9 AM, priority refresh 10 AM). Priority refresh already runs GA4, GSC, priority calculation, and dashboard. Add job 3 only if you need a separate GA4/GSC-only run.

**Log File:** `logs/data-collection-weekly.log`

### 4. Monthly SISTRIX Data Collection

**Schedule:** First Monday of each month at 12:00 PM

**Command:**
```bash
0 12 1-7 * * [ $(date +\%u) -eq 1 ] && cd /path/to/landingpage && php v2/scripts/blog/automate-data-collection.php --monthly >> /path/to/logs/data-collection-monthly.log 2>&1
```

**Alternative (simpler, but runs on the 1st of each month regardless of weekday):**
```bash
0 12 1 * * cd /path/to/landingpage && php v2/scripts/blog/automate-data-collection.php --monthly >> /path/to/logs/data-collection-monthly.log 2>&1
```

**What it does:**
- Collects SISTRIX keyword data
- Updates competition levels
- Collects SERP features
- Manages credit usage (weekly cap of 10,000 credits; scripts stop when the remaining balance is low, e.g. < 50; the optional 9,500 reserve is documentation-only)

**Log File:** `logs/data-collection-monthly.log`

**Note:** The credit check is manual: verify the remaining credits (e.g. with `check-sistrix-credits.php`) before running the monthly collection.
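
The first-Monday guard from the command above can also live in a small wrapper script, which avoids escaping `%` inside the crontab. The following is a minimal sketch, not part of the existing scripts: `is_first_monday` is a hypothetical helper, and the final `echo` is a placeholder for the real `--monthly` command.

```bash
#!/usr/bin/env bash
# Sketch of a first-Monday guard. is_first_monday is a hypothetical helper.
# Usage: is_first_monday DAY_OF_MONTH ISO_WEEKDAY   (ISO weekday: 1 = Monday, 7 = Sunday)
is_first_monday() {
    dom=${1#0}   # strip one leading zero, e.g. "03" -> "3"
    dow=$2
    # The first Monday of a month always falls on day 1-7.
    [ "$dom" -ge 1 ] && [ "$dom" -le 7 ] && [ "$dow" -eq 1 ]
}

if is_first_monday "$(date +%d)" "$(date +%u)"; then
    # Placeholder: the real job would run the --monthly collection here.
    echo "first Monday of the month"
fi
```

With such a wrapper, the crontab entry only needs the `0 12 1-7 * *` schedule followed by the wrapper's path.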

## Complete Cron Configuration

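Add all jobs to the crontab. Job 3 (11:00 AM) is optional when job 2 runs, and the monthly entry below uses the simpler first-of-month schedule rather than the first-Monday variant: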

```bash
# Edit crontab
crontab -e

# Add the following lines (adjust paths as needed):
0 9 * * 1 cd /Users/hadyelhady/Documents/GitHub/landingpage && php v2/scripts/blog/weekly-quality-check.php --email >> /Users/hadyelhady/Documents/GitHub/landingpage/logs/weekly-quality-check.log 2>&1
0 10 * * 1 cd /Users/hadyelhady/Documents/GitHub/landingpage && php v2/scripts/blog/weekly-priority-refresh.php >> /Users/hadyelhady/Documents/GitHub/landingpage/logs/weekly-priority-refresh.log 2>&1
0 11 * * 1 cd /Users/hadyelhady/Documents/GitHub/landingpage && php v2/scripts/blog/automate-data-collection.php --weekly >> /Users/hadyelhady/Documents/GitHub/landingpage/logs/data-collection-weekly.log 2>&1
0 12 1 * * cd /Users/hadyelhady/Documents/GitHub/landingpage && php v2/scripts/blog/automate-data-collection.php --monthly >> /Users/hadyelhady/Documents/GitHub/landingpage/logs/data-collection-monthly.log 2>&1
```
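
Because every entry repeats the project path, one way to avoid copy-paste mistakes is to generate the lines from a single argument. This is a minimal sketch, assuming the logs live in `logs/` under the project root as in the entries above; `render_crontab` is a hypothetical helper, not an existing script.

```bash
#!/usr/bin/env bash
# Print the four crontab entries from this guide for one project directory.
# render_crontab is a hypothetical helper; paths containing spaces are not handled.
render_crontab() {
    dir=$1
    bin="php v2/scripts/blog"
    printf '0 9 * * 1 cd %s && %s/weekly-quality-check.php --email >> %s/logs/weekly-quality-check.log 2>&1\n' "$dir" "$bin" "$dir"
    printf '0 10 * * 1 cd %s && %s/weekly-priority-refresh.php >> %s/logs/weekly-priority-refresh.log 2>&1\n' "$dir" "$bin" "$dir"
    printf '0 11 * * 1 cd %s && %s/automate-data-collection.php --weekly >> %s/logs/data-collection-weekly.log 2>&1\n' "$dir" "$bin" "$dir"
    printf '0 12 1 * * cd %s && %s/automate-data-collection.php --monthly >> %s/logs/data-collection-monthly.log 2>&1\n' "$dir" "$bin" "$dir"
}

# Print the entries for review (placeholder path).
render_crontab /path/to/landingpage
```

Review the output before installing it. Piping it into `crontab -` would replace the entire existing crontab, so pasting the lines via `crontab -e` is the safer route.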

## Testing Cron Jobs

### Test Individual Scripts

```bash
# Test weekly quality check
cd /path/to/landingpage
php v2/scripts/blog/weekly-quality-check.php --email

# Test weekly priority refresh
php v2/scripts/blog/weekly-priority-refresh.php

# Test data collection (dry run)
php v2/scripts/blog/automate-data-collection.php --weekly --dry-run
php v2/scripts/blog/automate-data-collection.php --monthly --dry-run
```

### Verify Cron Jobs

```bash
# List all cron jobs
crontab -l

# Check cron logs (system dependent)
# macOS: Console.app → search for "cron"
# Linux: /var/log/cron (RHEL-based), /var/log/syslog (Debian/Ubuntu), or journalctl -u cron (the unit is "crond" on RHEL-based systems)
```
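
Reading `crontab -l` by eye makes it easy to miss an entry that is absent or commented out. The sketch below checks crontab text for the script names used in this guide; `missing_jobs` is a hypothetical helper and only looks at script names, not at schedules or flags.

```bash
#!/usr/bin/env bash
# Print every expected script name that no active (uncommented) crontab line mentions.
# missing_jobs is a hypothetical helper; it takes the crontab text as its first argument.
missing_jobs() {
    # Drop comment lines so a commented-out job counts as missing.
    active=$(printf '%s\n' "$1" | grep -v '^[[:space:]]*#')
    for script in weekly-quality-check.php weekly-priority-refresh.php automate-data-collection.php; do
        printf '%s\n' "$active" | grep -qF -- "$script" || echo "$script"
    done
}

# Usage against the current user's crontab:
#   missing_jobs "$(crontab -l 2>/dev/null)"
```

No output means all three scripts appear in at least one active entry.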

## Troubleshooting

### Cron Job Not Running

1. **Check cron service:**
   ```bash
   # macOS
   sudo launchctl list | grep cron
   
   # Linux (the service is named "crond" on RHEL-based systems)
   sudo systemctl status cron
   ```

2. **Check file permissions:**
   ```bash
   ls -la v2/scripts/blog/*.php
   chmod 644 v2/scripts/blog/*.php  # If needed; jobs run "php <script>", so the files must be readable but need not be executable
   ```

3. **Check PHP path:**
   ```bash
   which php
   # Use full path in cron if needed: /usr/bin/php
   ```

4. **Check log files:**
   ```bash
   tail -f logs/weekly-quality-check.log
   ```
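
A frequent reason for a PHP path problem is that cron runs jobs with a much smaller `PATH` than an interactive shell. The sketch below looks up a command under a stripped-down environment to approximate that. The `PATH` value `/usr/bin:/bin` is an assumption (the real default varies by system), and `find_in_cron_path` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Look up a command the way a job with a minimal environment would see it.
# Assumption: cron's PATH is /usr/bin:/bin; check your system's actual default.
find_in_cron_path() {
    # env -i starts from an empty environment; command -v prints the resolved path.
    env -i PATH=/usr/bin:/bin /bin/sh -c 'command -v "$1"' sh "$1"
}

# Warn when php would not be found, in which case the crontab needs its full path.
find_in_cron_path php || echo "php is not on the minimal PATH; use its full path in the crontab" >&2
```

If the lookup fails here but `which php` succeeds in a normal shell, the full path from `which php` is what belongs in the crontab entry.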

### Script Errors

1. **Check script output:**
   ```bash
   php v2/scripts/blog/weekly-quality-check.php 2>&1
   ```

2. **Check dependencies:**
   ```bash
   php -r "require_once 'v2/config/blog-template-helpers.php';"
   ```

3. **Check API credentials:**
   - Verify GA4/GSC API credentials
   - Verify SISTRIX API key
   - Check API quotas

### Email Not Sending

1. **Test email configuration:**
   ```bash
   php -r "var_dump(mail('hady@ordio.com', 'Test', 'Test message'));"  # prints bool(true) if the message was accepted for delivery
   ```

2. **Check PHP mail configuration:**
   ```bash
   php -i | grep sendmail_path
   ```

3. **Use SMTP if needed:**
   - Configure PHPMailer in scripts
   - Update email sending code

## Monitoring

### Check Log Files Regularly

```bash
# Weekly check
tail -n 50 logs/weekly-quality-check.log
tail -n 50 logs/weekly-priority-refresh.log
tail -n 50 logs/data-collection-weekly.log

# Monthly check
tail -n 50 logs/data-collection-monthly.log
```
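
Scanning the tail of each log by hand can be replaced by a quick count of suspicious lines. This is a minimal sketch: the pattern `error|fatal|exception` is an assumption about what the scripts write to their logs, and `count_error_lines` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Count lines in a log file that mention an error, a fatal error, or an exception.
# The pattern is an assumption; adjust it to the messages the scripts actually emit.
count_error_lines() {
    grep -ciE 'error|fatal|exception' "$1"
}

# Report each log from this guide that exists and contains suspicious lines.
for log in logs/weekly-quality-check.log logs/weekly-priority-refresh.log \
           logs/data-collection-weekly.log logs/data-collection-monthly.log; do
    [ -f "$log" ] || continue
    n=$(count_error_lines "$log")
    if [ "$n" -gt 0 ]; then
        echo "$log: $n suspicious line(s)"
    fi
done
```

Run it from the project root so the relative `logs/` paths resolve.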

### Set Up Log Rotation

Create `/etc/logrotate.d/blog-automation`:

```
/path/to/landingpage/logs/*.log {
    weekly
    rotate 4
    compress
    delaycompress
    missingok
    notifempty
}
```

## Security Considerations

1. **File Permissions:**
   - Scripts: 644 (readable, not executable)
   - Logs: 644 (readable)
   - Config files: 600 (readable only by owner)

2. **API Credentials:**
   - Store in config files outside web root
   - Use environment variables when possible
   - Never commit credentials to git

3. **Error Handling:**
   - All errors sent to `hady@ordio.com`
   - Never expose internal errors publicly
   - Log all errors for debugging
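
The permission modes listed above can be verified with a short check. The sketch below is one possible approach; `file_mode` and `check_mode` are hypothetical helpers, and the fallback between GNU and BSD `stat` is there because Linux and macOS use different flags.

```bash
#!/usr/bin/env bash
# Print the octal permission mode of a file (e.g. 644).
file_mode() {
    # GNU stat (Linux) uses -c; BSD stat (macOS) uses -f.
    stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# check_mode FILE EXPECTED: succeed silently when the mode matches, otherwise report it.
check_mode() {
    [ -e "$1" ] || { echo "$1: not found" >&2; return 2; }
    actual=$(file_mode "$1") || return 2
    if [ "$actual" != "$2" ]; then
        echo "$1: mode is $actual, expected $2"
        return 1
    fi
}

# Usage (run from the project root):
#   check_mode v2/scripts/blog/weekly-quality-check.php 644
#   check_mode logs/weekly-quality-check.log 644
```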

## Monthly Domain-Level Collection (Alternative to Job 4)

For a fresh content backlog and competitive analysis, run the domain-level scripts at least monthly (see [CONSOLIDATED_NEXT_STEPS.md](CONSOLIDATED_NEXT_STEPS.md) and [MONITORING_RUNBOOK.md](MONITORING_RUNBOOK.md)):

```bash
# Example: first of month (adjust path and log)
0 8 1 * * cd /path/to/landingpage && php v2/scripts/blog/collect-domain-opportunities.php --limit=100 >> /path/to/logs/domain-collection.log 2>&1
# Then: collect-domain-content-ideas.php, collect-competitor-keywords.php; then generate-content-backlog.php, generate-competitive-analysis.php
```

**Credit use:** domain-opportunities ~100 credits, content-ideas variable, competitor-keywords ~15 credits. Run `check-sistrix-credits.php` to check the remaining balance before starting.
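
Since the follow-up scripts depend on the earlier ones, a wrapper that stops at the first failure keeps a broken collection run from feeding the generators. This is a minimal sketch: `run_pipeline` is a hypothetical helper, `PHP_BIN` is an assumed override variable, and only the `--limit=100` flag comes from the example above (the other scripts are shown without flags because their options are not documented here).

```bash
#!/usr/bin/env bash
# Run PHP scripts one after another and stop at the first failure.
# Each argument is "script [flags]"; PHP_BIN (assumed name) overrides the php binary.
run_pipeline() {
    php_bin=${PHP_BIN:-php}
    for step in "$@"; do
        echo "running $step"
        # $step is left unquoted on purpose so that flags become separate arguments.
        "$php_bin" $step || { echo "failed: $step" >&2; return 1; }
    done
}

# Usage (run from the project root), in the order given in this section:
#   run_pipeline \
#     "v2/scripts/blog/collect-domain-opportunities.php --limit=100" \
#     "v2/scripts/blog/collect-domain-content-ideas.php" \
#     "v2/scripts/blog/collect-competitor-keywords.php" \
#     "v2/scripts/blog/generate-content-backlog.php" \
#     "v2/scripts/blog/generate-competitive-analysis.php"
```

A single crontab entry can then call the wrapper instead of chaining the scripts inline.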

## Related Documentation

- [MONITORING_RUNBOOK.md](MONITORING_RUNBOOK.md) - Complete monitoring guide (domain visibility, domain-level scripts, Tier 2)
- [CONSOLIDATED_NEXT_STEPS.md](CONSOLIDATED_NEXT_STEPS.md) - Next steps and automation scripts summary
- [BLOG_SCRIPTS_USAGE_GUIDE.md](BLOG_SCRIPTS_USAGE_GUIDE.md) - Script reference
