# Slack Integration Monitoring Guide

**Last Updated:** 2026-03-06

This guide covers monitoring Slack notification health, setting up alerts, and analyzing notification statistics.

## Overview

The Slack integration includes built-in monitoring capabilities:

- **Dedicated log file**: `v2/logs/affiliate-slack.log`
- **Notification statistics**: Success/failure tracking
- **Health checks**: Automated failure detection
- **Email alerts**: Automatic notifications for repeated failures
- **Admin dashboard**: Status endpoint for real-time monitoring

## Log File

### Location

```
v2/logs/affiliate-slack.log
```

### Format

Each log entry includes:
- Timestamp: `[YYYY-MM-DD HH:MM:SS]`
- Message: Description of event
- Context: JSON object with correlation ID, type, and details

**Example:**
```
[2026-03-06 14:23:45] Notification sent successfully {"correlation_id":"SLACK-20260306142345-abc123","type":"registration","attempt":1,"http_code":200,"duration_ms":234.56}
```
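
The three parts listed above can be separated with plain text tools. The following is a minimal sketch, assuming every entry follows the shape of the example (bracketed timestamp, a message that contains no `{`, then the JSON context). `parse_slack_log_line` is a hypothetical name, not part of the integration.

```bash
# Hypothetical helper: split one log entry into timestamp, message, and context.
# Assumption: the first "]" ends the timestamp and the first "{" starts the context.
parse_slack_log_line() {
    printf '%s\n' "$1" | awk '{
        ts_end = index($0, "]")       # position of the bracket closing the timestamp
        ctx_start = index($0, "{")    # position where the JSON context begins
        if (ts_end == 0 || ctx_start == 0) exit 1
        timestamp = substr($0, 2, ts_end - 2)
        message = substr($0, ts_end + 2, ctx_start - ts_end - 3)
        context = substr($0, ctx_start)
        printf "timestamp=%s\nmessage=%s\ncontext=%s\n", timestamp, message, context
    }'
}

parse_slack_log_line '[2026-03-06 14:23:45] Notification sent successfully {"correlation_id":"SLACK-20260306142345-abc123","type":"registration","attempt":1,"http_code":200,"duration_ms":234.56}'
```

The function prints one `key=value` line per part and returns a non-zero status for lines that do not match the expected shape.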

### Viewing Logs

**Real-time monitoring:**
```bash
tail -f v2/logs/affiliate-slack.log
```

**Recent entries:**
```bash
tail -n 100 v2/logs/affiliate-slack.log
```

**Search for specific events:**
```bash
# Find all failures
grep -iE "failed|error" v2/logs/affiliate-slack.log

# Find a specific correlation ID (same format as in the example entry)
grep "SLACK-20260306142345-abc123" v2/logs/affiliate-slack.log

# Find retry attempts (attempt 2 or higher; a bare "attempt" also matches first tries)
grep -E '"attempt":([2-9]|[1-9][0-9]+)' v2/logs/affiliate-slack.log
```

## Notification Statistics

### Get Statistics

Use the monitoring helper to get statistics:

```php
<?php
require_once 'v2/helpers/affiliate-slack-monitor.php';

// Get stats for last 24 hours (default)
$stats = getSlackNotificationStats(24);

// Get stats for last 48 hours
$stats = getSlackNotificationStats(48);

print_r($stats);
```

**Response structure:**
```php
[
    'total' => 150,              // Total notifications
    'success' => 145,            // Successful notifications
    'failed' => 5,               // Failed notifications
    'success_rate' => 96.67,     // Success rate percentage
    'recent_failures' => [...]   // Last 10 failures with details
]
```
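
The `success_rate` value in the sample is consistent with `success / total * 100` rounded to two decimals. The following is a small sketch of that arithmetic. The helper name is hypothetical, and the value printed for an empty period is an assumption, since this guide does not say what the PHP helper returns in that case.

```bash
# Hypothetical helper: success rate as a percentage with two decimals.
# Arguments: successful count, total count.
success_rate() {
    awk -v ok="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "0.00"; exit }   # assumption: empty period reports 0.00
        printf "%.2f\n", ok / total * 100
    }'
}

success_rate 145 150   # prints 96.67, matching the sample response
```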

### Get Summary

Get a health summary:

```php
<?php
require_once 'v2/helpers/affiliate-slack-monitor.php';

$summary = getSlackNotificationSummary(24);
print_r($summary);
```

**Response structure:**
```php
[
    'period_hours' => 24,
    'total_notifications' => 150,
    'successful' => 145,
    'failed' => 5,
    'success_rate_percent' => 96.67,
    'health_status' => 'healthy',  // 'healthy', 'degraded', or 'unhealthy'
    'recent_failures_count' => 5
]
```

**Health Status Thresholds:**
- **healthy**: Success rate ≥ 95%
- **degraded**: Success rate ≥ 80% but < 95%
- **unhealthy**: Success rate < 80%
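
The thresholds above amount to a two-step comparison. The sketch below mirrors them in shell for use in ad hoc scripts. `classify_health` is a hypothetical name, and the PHP helper may implement the check differently.

```bash
# Hypothetical helper: map a success rate (percent) to a health status
# using the documented thresholds (>= 95 healthy, >= 80 degraded, else unhealthy).
classify_health() {
    awk -v rate="$1" 'BEGIN {
        if (rate >= 95)      print "healthy"
        else if (rate >= 80) print "degraded"
        else                 print "unhealthy"
    }'
}

classify_health 96.67   # prints healthy
```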

## Health Checks

### Manual Health Check

Run health check manually:

```php
<?php
require_once 'v2/helpers/affiliate-slack-monitor.php';

// Check with default thresholds (≥ 5 failures OR success rate < 50% in the last 24h)
$alertSent = checkSlackNotificationHealth();

// Custom thresholds
$alertSent = checkSlackNotificationHealth(
    10,      // Minimum failures to trigger alert
    30.0,    // Minimum failure rate percentage
    48       // Hours to analyze
);
```

**Returns:** `true` if alert was sent, `false` otherwise

### Automated Health Checks

Set up a cron job for automated health checks:

```bash
# Option A: check every 6 hours
0 */6 * * * php /path/to/v2/scripts/affiliate/health-check-slack.php

# Option B: check daily at 9 AM (pick one schedule, not both)
0 9 * * * php /path/to/v2/scripts/affiliate/health-check-slack.php
```

**Create health check script** (`v2/scripts/affiliate/health-check-slack.php`):

```php
<?php
/**
 * Slack Notification Health Check Script
 * 
 * Run via cron to check notification health and send alerts if needed.
 */

require_once dirname(__DIR__, 2) . '/helpers/affiliate-slack-monitor.php';

$alertSent = checkSlackNotificationHealth(5, 50.0, 24);

if ($alertSent) {
    echo "Alert sent: Notification health check detected issues\n";
    exit(1);
} else {
    echo "OK: Notification health check passed\n";
    exit(0);
}
```

## Email Alerts

### Alert Conditions

Alerts are sent when either condition holds for the analyzed time period:
- **Failure count threshold**: ≥ 5 failures, OR
- **Failure rate threshold**: Failure rate > 50%, i.e. success rate < 50% (evaluated only when total > 0)
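
The two conditions combine with a logical OR. The following is a minimal sketch of that decision using the documented defaults. `should_alert` and the two variables are hypothetical names, and the real check lives in the PHP helper.

```bash
MIN_FAILURES=5        # documented default failure count threshold
MIN_SUCCESS_RATE=50   # documented default: alert when success rate is below 50%

# Hypothetical helper: exit status 0 when an alert should fire, 1 otherwise.
# Arguments: failed count, total count.
should_alert() {
    awk -v failed="$1" -v total="$2" \
        -v min_failures="$MIN_FAILURES" -v min_rate="$MIN_SUCCESS_RATE" 'BEGIN {
        if (failed >= min_failures) exit 0                                   # count condition
        if (total > 0 && (total - failed) / total * 100 < min_rate) exit 0   # rate condition
        exit 1
    }'
}

if should_alert 5 150; then echo "alert"; else echo "no alert"; fi
```

With the sample figures used in this guide (5 failures out of 150), the count condition alone triggers an alert under these defaults, even though the success rate is 96.67% and the health status is `healthy`.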

### Alert Recipient

Alerts are sent to: `hady@ordio.com`

### Alert Content

Email includes:
- Time period analyzed
- Total notifications
- Success/failure counts
- Success rate percentage
- Recent failure details (last 10)
- Diagnostic steps

### Customizing Alerts

Pass custom thresholds to `checkSlackNotificationHealth()`. The call below shows the default values:

```php
checkSlackNotificationHealth(
    5,     // $failureThreshold: minimum failures
    50.0,  // $failureRateThreshold: minimum failure rate %
    24     // $hours: analysis period
);
```

## Admin Dashboard

### Status Endpoint

Admin-only endpoint for real-time status:

```
GET /v2/api/slack-notification-status.php?hours=24
```

**Authentication:** Requires admin session cookie

**Response:**
```json
{
    "success": true,
    "summary": {
        "period_hours": 24,
        "total_notifications": 150,
        "successful": 145,
        "failed": 5,
        "success_rate_percent": 96.67,
        "health_status": "healthy",
        "recent_failures_count": 5
    },
    "statistics": {
        "total": 150,
        "successful": 145,
        "failed": 5,
        "success_rate_percent": 96.67
    },
    "recent_failures": [...],
    "configuration": {
        "enabled": true,
        "webhook_configured": true,
        "webhook_url_length": 89,
        "curl_available": true,
        "log_file_exists": true
    },
    "log_file": {
        "exists": true,
        "size_bytes": 45678,
        "last_modified": "2026-03-06T14:23:45+00:00",
        "readable": true
    },
    "period_hours": 24
}
```

### Accessing Status

**Via curl:**
```bash
curl -b cookies.txt https://www.ordio.com/v2/api/slack-notification-status.php
```

**Via browser:**
Navigate to `/v2/api/slack-notification-status.php` while logged in as admin

**Custom time period:**
```
/v2/api/slack-notification-status.php?hours=48
```
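
For scripted checks, the `health_status` field can be pulled out of the response. The sketch below uses a regular expression and assumes the field appears as shown in the sample response. `extract_health_status` is a hypothetical name, and a real JSON parser would be more robust than this pattern.

```bash
# Hypothetical helper: read the status JSON on stdin and print health_status.
# Works on compact or pretty-printed JSON; only the first match is printed.
extract_health_status() {
    sed -n 's/.*"health_status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Demo on an abbreviated sample response
printf '%s\n' '{"success": true, "summary": {"health_status": "healthy"}}' | extract_health_status
```

Against the live endpoint, the output of the curl command shown above would be piped into `extract_health_status`.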

## Monitoring Best Practices

### Daily Checks

1. Review notification statistics
2. Check for recent failures
3. Verify health status

### Weekly Reviews

1. Analyze failure patterns
2. Review error logs
3. Check success rate trends

### Monthly Analysis

1. Review overall statistics
2. Identify recurring issues
3. Update monitoring thresholds if needed

### Alert Response

When an alert is received:

1. **Immediate**: Check status endpoint for current state
2. **Diagnosis**: Run diagnostic script
3. **Investigation**: Review error logs
4. **Resolution**: Fix identified issues
5. **Verification**: Test webhook connectivity

## Integration with External Monitoring

### Prometheus Metrics (Future)

Potential metrics to expose:
- `slack_notifications_total` (counter)
- `slack_notifications_success` (counter)
- `slack_notifications_failed` (counter)
- `slack_notification_duration_ms` (histogram)

### Log Aggregation

Log file can be ingested by:
- ELK Stack (Elasticsearch, Logstash, Kibana)
- Splunk
- CloudWatch Logs
- Datadog

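**Log format:** Each entry ends with a JSON context object, so a parser only needs to split off the timestamp and message prefix before handing the rest to a JSON decoder.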
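
Some ingestion pipelines prefer one JSON object per line. The following is a hedged sketch of such a conversion. It assumes the entry shape shown in the Format section and a message without `{`, `"`, or `\` characters. `slack_log_to_ndjson` is a hypothetical name.

```bash
# Hypothetical filter: rewrite each log entry on stdin as one JSON object per line.
# Lines that do not match the expected shape are dropped.
slack_log_to_ndjson() {
    sed -n 's/^\[\([^]]*\)\] \([^{]*[^{ ]\) *\({.*\)$/{"timestamp":"\1","message":"\2","context":\3}/p'
}

printf '%s\n' '[2026-03-06 14:23:45] Notification sent successfully {"type":"registration","attempt":1}' | slack_log_to_ndjson
```

The existing context object is embedded unchanged, so its fields stay queryable under `context` in the target system.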

## Troubleshooting Monitoring

### Log File Not Created

**Issue:** `v2/logs/affiliate-slack.log` doesn't exist

**Solution:**
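1. Ensure the `v2/logs/` directory exists and is writable by the user PHP runs as
2. Check directory permissions: `chmod 755 v2/logs/` is sufficient only if that user owns the directory, since mode 755 grants write access to the owner alone
3. The first notification creates the file automatically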
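
The first two steps can be checked in one go. The following is a minimal sketch with a hypothetical helper name. The result is only meaningful when run as the same user that PHP runs as.

```bash
# Hypothetical helper: report whether a log directory exists and is writable
# by the user running this check.
check_log_dir() {
    if [ ! -d "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -w "$1" ]; then
        echo "not writable: $1"
        return 1
    fi
    echo "ok: $1"
}

# Usage (from the project root): check_log_dir v2/logs
```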

### Statistics Not Accurate

**Issue:** Statistics don't match actual notifications

**Solution:**
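1. Verify the log file is being written to
2. Check log file permissions
3. Ensure the requested time period does not reach back further than the oldest entry in the log file
4. Check whether log rotation has moved older entries out of the file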
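
To compare the analysis period with what the log actually covers, the timestamp of the oldest entry is useful. The sketch below assumes entries are appended in chronological order, so the first line is the oldest. The helper name is hypothetical.

```bash
# Hypothetical helper: print the timestamp of the first (oldest) entry in a log file.
# Prints nothing if the first line does not start with a bracketed timestamp.
oldest_log_timestamp() {
    sed -n '1s/^\[\([^]]*\)\].*/\1/p' "$1"
}

# Usage (from the project root): oldest_log_timestamp v2/logs/affiliate-slack.log
```

If the printed timestamp is more recent than the start of the requested period, the statistics can only reflect part of that period.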

### Alerts Not Sending

**Issue:** Health check runs but no email sent

**Solution:**
1. Verify that the PHP `mail()` function works on this host
2. Check email server configuration
3. Verify the recipient address `hady@ordio.com` is correct
4. Check the recipient's spam folder

## Related Documentation

- [SLACK_LOOP_UPDATES.md](SLACK_LOOP_UPDATES.md) - Integration overview
- [SLACK_TROUBLESHOOTING.md](SLACK_TROUBLESHOOTING.md) - Troubleshooting guide
- [DEPLOYMENT_CHECKLIST.md](DEPLOYMENT_CHECKLIST.md) - Deployment steps
