# Caching Strategy Documentation


**Last Updated:** 2026-03-14

## Overview

This document outlines the caching strategy implemented for the Ordio website to address uncached JavaScript and CSS file issues reported by SEO audit tools like Semrush.

## Problem Statement

Semrush and other SEO audit tools flag JavaScript and CSS files that are served without adequate cache headers. Most of the flagged files are external third-party scripts (especially HubSpot) whose origin servers do not set those headers.

## Solution Architecture

### 1. Server-Side Caching (.htaccess)

Our own assets are cached using Apache `.htaccess` configuration:

- **Static Assets (JS, CSS, images, fonts)**: 1 year cache with `must-revalidate` directive
- **Dynamic Content (PHP, HTML)**: 1 hour cache with `must-revalidate`
- **Cache-Control Headers**: Modern approach using `max-age` and `must-revalidate`
- **Expires Headers**: Fallback for older browsers
- **Cache Busting**: All CSS/JS files must include version parameters (`?v=timestamp`) for proper cache invalidation

**Location**: `.htaccess` (lines 45-111)
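
The policy above comes down to two `max-age` values. As a rough illustration of the arithmetic and the resulting header values, the following JavaScript lookup mirrors it. The function name, the extension list, and the exact header strings are assumptions for illustration, not a copy of the `.htaccess` rules.

```javascript
// Lifetimes from the policy above, in seconds (Cache-Control max-age uses seconds).
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 31,536,000
const ONE_HOUR_SECONDS = 60 * 60;            // 3,600

// Hypothetical list of extensions treated as static assets.
const STATIC_EXTENSIONS = ['js', 'css', 'png', 'jpg', 'jpeg', 'gif', 'svg', 'webp', 'woff', 'woff2'];

// Returns the Cache-Control value this policy would assign to a request path.
function cacheControlFor(path) {
  const pathname = path.split('?')[0]; // ignore ?v=... version parameters
  const dot = pathname.lastIndexOf('.');
  const ext = dot === -1 ? '' : pathname.slice(dot + 1).toLowerCase();
  const maxAge = STATIC_EXTENSIONS.includes(ext) ? ONE_YEAR_SECONDS : ONE_HOUR_SECONDS;
  return `max-age=${maxAge}, must-revalidate`;
}
```

For example, `cacheControlFor('/dist/output.min.css?v=1700000000')` yields the one-year value, while `cacheControlFor('/index.php')` yields the one-hour value.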

**Cache Busting Requirements**:
- All CSS/JS includes must use `?v=<?php echo filemtime(...); ?>` pattern
- Critical files: `/dist/output.min.css`, `/src/critical.css` must be versioned
- Validation: Run `python3 v2/scripts/dev-helpers/audit-unversioned-assets.py` before deployment

### 2. Service Worker Caching

For external resources that we cannot control (HubSpot, Google Tag Manager, etc.), we implement client-side caching using a Service Worker.

**Location**: `v2/js/service-worker.js`

**Strategy**: Stale-While-Revalidate

- Serve cached content immediately
- Update cache in background
- Cache duration: 7 days
- Fallback to stale cache if network fails
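
The strategy above can be sketched independently of the browser APIs. This is a minimal illustration, not the code in `v2/js/service-worker.js`: `store` and `fetchFn` are hypothetical stand-ins for the Cache API and `fetch`, and the 7-day expiry check is left out to keep the control flow visible.

```javascript
// Generic stale-while-revalidate over an injected store and fetch function.
// `store` needs async get(url) and put(url, entry); `fetchFn(url)` resolves to a body.
async function staleWhileRevalidate(url, store, fetchFn) {
  const cached = await store.get(url);

  // Always start a network request so the cache is refreshed for the next visit.
  const revalidation = fetchFn(url).then(async (body) => {
    await store.put(url, { body, storedAt: Date.now() });
    return body;
  });

  if (cached) {
    // Serve the cached copy immediately; a failed refresh just keeps the stale entry.
    return { body: cached.body, source: 'cache', revalidation: revalidation.catch(() => null) };
  }

  // Nothing cached yet: the caller waits for the network and sees any failure.
  const body = await revalidation;
  return { body, source: 'network', revalidation: Promise.resolve(body) };
}
```

In a real service worker the `revalidation` promise would typically be handed to `event.waitUntil()` so the browser keeps the worker alive until the background update finishes.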

**Registration**: `v2/base/head.php` (lines 108-148)

## External Resources Cached

The following external resources are cached by the service worker:

1. **HubSpot Script** (`js-eu1.hs-scripts.com/145133546.js`)

   - Main issue: 215 occurrences across site
   - Critical: Yes

2. **Google Tag Manager** (`googletagmanager.com/gtm.js`)

   - Critical: Yes

3. **jsPDF Library** (`cdnjs.cloudflare.com/ajax/libs/jspdf/2.5.1/jspdf.umd.min.js`)

   - Used for PDF generation

4. **html2canvas Library** (`cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js`)

   - Used for canvas rendering

5. **LinkedIn Insight Tag** (`snap.licdn.com/li.lms-analytics/insight.min.js`)

   - Analytics tracking

6. **TikTok Pixel** (`analytics.tiktok.com/i18n/pixel/sdk.js`)

   - Analytics tracking

7. **Facebook Pixel** (`connect.facebook.net/en_US/fbevents.js`)
   - Analytics tracking

## Service Worker Implementation Details

### Cache Name

- `ordio-external-resources-v2`
- Versioned to allow cache invalidation on updates

### Cache Duration

- 7 days (604,800,000 milliseconds)
- Resources older than 7 days are revalidated
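
The millisecond figure follows directly from the 7-day duration. A small sketch of the age check, assuming each cached entry carries a stored-at timestamp (the Cache API does not record one itself, so how it is stored is an implementation detail not shown here):

```javascript
// 7 days in milliseconds: 7 * 24 h * 60 min * 60 s * 1000 ms = 604,800,000.
const CACHE_DURATION_MS = 7 * 24 * 60 * 60 * 1000;

// True when an entry stored at `storedAtMs` is older than the cache duration.
function isExpired(storedAtMs, nowMs = Date.now()) {
  return nowMs - storedAtMs > CACHE_DURATION_MS;
}
```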

### Update Strategy

- Service worker checks for updates every hour
- New versions auto-activate
- Page reloads when new service worker is ready
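
One way to wire up this update behavior on the page side is sketched below. It is an illustration rather than the code in `v2/base/head.php`: the function name and the served script URL are assumptions, and `nav` and `win` are passed in (in a page they would be `navigator` and `window`). It also assumes the worker itself activates new versions immediately (for example via `self.skipWaiting()`), which page-side code cannot enforce.

```javascript
const UPDATE_INTERVAL_MS = 60 * 60 * 1000; // check for a new worker every hour

// Registers the worker, schedules hourly update checks, and reloads the page
// once when a new worker takes control. The default script URL is an assumption.
async function registerWithHourlyUpdates(nav, win, scriptUrl = '/v2/js/service-worker.js') {
  if (!nav || !('serviceWorker' in nav)) {
    return null; // unsupported browser: the site still works, just without this cache
  }
  const registration = await nav.serviceWorker.register(scriptUrl);

  // Ask the browser to re-fetch the worker script once per hour.
  win.setInterval(() => registration.update(), UPDATE_INTERVAL_MS);

  // `controllerchange` fires when a new worker takes control of the page.
  let reloaded = false;
  nav.serviceWorker.addEventListener('controllerchange', () => {
    if (reloaded) return; // guard against reload loops
    reloaded = true;
    win.location.reload();
  });
  return registration;
}
```

In a page this would be called as `registerWithHourlyUpdates(navigator, window)`.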

### Error Handling

- Falls back to stale cache if network fails
- Logs errors to console for debugging
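- Handles resources without CORS headers by fetching them in `no-cors` mode, which yields opaque responses (cacheable, but their status and body cannot be inspected)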
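
Because `no-cors` requests produce opaque responses, the worker cannot tell a successful download from an error page. A sketch of the decision whether to store a response, assuming the worker accepts opaque responses as a deliberate trade-off (the function name is hypothetical):

```javascript
// Decides whether a fetch result should be written to the cache.
// Works on any object shaped like a Response (`ok`, `status`, `type`).
function isCacheableResponse(response) {
  if (!response) return false;
  // Opaque responses report status 0 and ok === false, so success cannot be
  // verified. Accepting them means an error response could be cached too.
  if (response.type === 'opaque') return true;
  return response.ok === true;
}
```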

## Limitations and Workarounds

### External Script Cache Headers

**Critical Limitation**: We cannot control cache headers for external resources. Third-party providers (HubSpot, Google, etc.) control their own cache headers.

**Why This Matters**:

- SEO audit tools (like Semrush) flag resources without proper cache headers
- This affects 215+ pages where the HubSpot script is loaded
- We cannot fix this at the server level

**Workaround**:

- Service worker provides client-side caching that works regardless of origin server cache headers
- The service worker stores responses in Cache Storage, which is separate from the HTTP cache, so resources are cached even if the origin server sets no cache headers
- Service worker ensures consistent caching behavior across browsers

**What We Can Do**:

1. ✅ Cache external scripts via service worker (implemented)
2. ✅ Ensure proper loading attributes (async, defer, crossorigin)
3. ✅ Monitor cache performance
4. ❌ Cannot change HubSpot's cache headers (contact HubSpot support if needed)

**When to Contact Third-Party Providers**:

- If cache headers are critical for your use case
- If you need longer cache durations
- If you're experiencing performance issues due to missing cache headers

### Service Worker Limitations

1. **HTTPS Required**:

   - Service workers only work on HTTPS (or localhost for development)
   - HTTP sites cannot use service workers
   - **Workaround**: Ensure site is served over HTTPS

2. **Browser Support**:

   - Not supported in Internet Explorer
   - Limited support in older Safari versions
   - **Workaround**: Service worker registration is optional - site works without it, just without client-side caching

3. **CORS Restrictions**:

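   - Some resources are served without CORS headers
   - Service worker fetches these in `no-cors` mode; this does not bypass CORS, it returns an opaque response that can be cached and served but not read
   - **Workaround**: Use `crossorigin="anonymous"` in script tags where the provider sends CORS headers (`Access-Control-Allow-Origin`)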

4. **Cache Size**:

   - Browser may evict cache if storage limits are reached
   - Quota varies by browser and available disk space; treat 50-100MB per origin as a conservative planning figure, not a guaranteed limit
   - **Workaround**: Monitor cache size, remove unused resources

5. **Cache Updates**:
   - Service worker cache updates happen in background
   - Users may see slightly stale content during updates
   - **Workaround**: 7-day cache duration balances freshness and performance
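
For the cache size limitation above, browsers that implement the Storage API expose usage and quota through `navigator.storage.estimate()`. A minimal sketch of reading it; the function name is hypothetical and the storage object is passed in so the code can also run outside a browser:

```javascript
// Reports how much of the origin's storage quota is in use.
// `storage` is expected to behave like `navigator.storage` (StorageManager).
async function storageUsage(storage) {
  if (!storage || typeof storage.estimate !== 'function') {
    return null; // API unavailable in this browser
  }
  const { usage = 0, quota = 0 } = await storage.estimate();
  return { usage, quota, ratio: quota > 0 ? usage / quota : 0 };
}
```

In a page, `storageUsage(navigator.storage)` resolves to byte counts that are estimates, not exact measurements.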

### Semrush Audit Limitations

**Important**: Semrush and similar tools check HTTP cache headers, not browser cache or service worker cache.

**What This Means**:

- Even with service worker caching, Semrush may still report "uncached" resources
- This is a limitation of how audit tools work, not a real problem
- Resources ARE cached (via service worker), just not at HTTP level

**What We Can Do**:

1. ✅ Document that external scripts are cached via service worker
2. ✅ Monitor actual cache hit rates (not just HTTP headers)
3. ✅ Contact Semrush support to explain service worker caching
4. ❌ Cannot change how Semrush audits cache headers

**Recommendation**:

- Focus on actual performance metrics (cache hit rates, load times)
- Don't rely solely on Semrush cache header reports
- Use browser DevTools to verify actual caching behavior

## Testing

### Audit Scripts

Two Python scripts are available for testing:

1. **`scripts/audit-resource-caching.py`**

   - Audits cache headers for all external resources
   - Generates JSON report
   - Identifies resources without proper cache headers

2. **`scripts/test-cache-headers.py`**
   - Tests cache headers on multiple pages
   - Verifies `.htaccess` configuration is working
   - Checks for expected cache durations

### Browser Testing

1. **Chrome DevTools**:

   - Network tab: Check cache headers
   - Application tab: Verify service worker registration
   - Cache Storage: Inspect cached resources

2. **Cache Behavior**:
   - First load: Resources fetched from network
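   - Subsequent loads: Served from cache (DevTools shows the source as ServiceWorker, disk cache, or memory cache; a 304 response means the browser revalidated with the server instead)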
   - After 7 days: Cache revalidated in background

## Monitoring

### Service Worker Status

Check browser console for service worker logs:

- `[Service Worker] Registration successful`
- `[Service Worker] Serving from cache`
- `[Service Worker] Cache updated`

### Cache Hit Rates

Monitor cache performance:

- Service worker cache hits vs network requests
- Cache storage usage
- Cache eviction events
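
A hit rate can be derived from two counters. The sketch below is one hypothetical way to track them in memory; it is not part of the current service worker, and the counters reset whenever the browser stops the worker, so longer-term monitoring would need to report or persist them.

```javascript
// Minimal in-memory counter for cache hits and network fetches.
function createCacheStats() {
  let hits = 0;
  let misses = 0;
  return {
    recordHit() { hits += 1; },
    recordMiss() { misses += 1; },
    // Share of requests answered from the cache, between 0 and 1.
    hitRate() {
      const total = hits + misses;
      return total === 0 ? 0 : hits / total;
    },
  };
}
```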

## Maintenance

### Adding New External Resources

1. Add URL to `EXTERNAL_RESOURCES_TO_CACHE` in `v2/js/service-worker.js`
2. Add pattern to `EXTERNAL_PATTERNS` if needed
3. Update service worker version to invalidate old cache
4. Test cache behavior
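
The shape of these two constants is not shown in this document. Assuming `EXTERNAL_RESOURCES_TO_CACHE` is an array of exact URLs and `EXTERNAL_PATTERNS` an array of regular expressions, the matching logic could look like the sketch below. The `https://` prefixes, the example pattern, and the `shouldCache` helper are illustrative assumptions.

```javascript
// Exact URLs to cache (two entries taken from the list in this document).
const EXTERNAL_RESOURCES_TO_CACHE = [
  'https://js-eu1.hs-scripts.com/145133546.js',
  'https://snap.licdn.com/li.lms-analytics/insight.min.js',
];

// Patterns for resources whose exact URL may vary (hypothetical example).
const EXTERNAL_PATTERNS = [
  /^https:\/\/cdnjs\.cloudflare\.com\/ajax\/libs\//,
];

// True when the service worker should handle and cache this URL.
function shouldCache(url) {
  return EXTERNAL_RESOURCES_TO_CACHE.includes(url) ||
    EXTERNAL_PATTERNS.some((pattern) => pattern.test(url));
}
```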

### Updating Service Worker

1. Modify `v2/js/service-worker.js`
2. Update `CACHE_NAME` version (e.g., `ordio-external-resources-v2` → `ordio-external-resources-v3`)
3. Service worker will auto-update on next page load
4. Old cache will be cleaned up automatically
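
The automatic cleanup usually works by deleting every cache that shares the name prefix but not the current version. A sketch under that assumption; `CACHE_PREFIX` and both helper functions are hypothetical names, and `cacheStorage` stands in for the global `caches` object:

```javascript
const CACHE_PREFIX = 'ordio-external-resources-';
const CACHE_NAME = `${CACHE_PREFIX}v3`; // bumped from v2 for this example

// Picks the caches that belong to this worker but not to the current version.
function staleCacheNames(allNames, currentName = CACHE_NAME, prefix = CACHE_PREFIX) {
  return allNames.filter((name) => name.startsWith(prefix) && name !== currentName);
}

// Deletes outdated caches. `cacheStorage` is expected to behave like the
// global `caches` object (keys() and delete(name)).
async function cleanUpOldCaches(cacheStorage) {
  const names = await cacheStorage.keys();
  const outdated = staleCacheNames(names);
  await Promise.all(outdated.map((name) => cacheStorage.delete(name)));
  return outdated;
}
```

Inside the worker this would typically run from the `activate` handler, for example `event.waitUntil(cleanUpOldCaches(caches))`. Caches from other parts of the site are left alone because of the prefix check.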

### Troubleshooting

**Service worker not registering**:

- Check HTTPS is enabled
- Verify file path is correct
- Check browser console for errors

**Resources not caching**:

- Verify URL matches pattern in service worker
- Check CORS headers
- Verify service worker is active

**Cache not updating**:

- Clear browser cache
- Unregister service worker in DevTools
- Hard refresh page (Ctrl+Shift+R, or Cmd+Shift+R on macOS)

## Best Practices

1. **Version Assets**: Use query strings (`?v=timestamp`) for cache busting
   - Pattern: `?v=<?php echo filemtime(__DIR__ . '/path/to/file.css'); ?>`
   - Always use `file_exists()` check with fallback to `time()`
   - Critical files: `/dist/output.min.css`, `/src/critical.css` must be versioned
2. **Validate Before Deploy**: Run `python3 v2/scripts/dev-helpers/audit-unversioned-assets.py` to check for unversioned assets
3. **Monitor Cache**: Regularly check cache hit rates
4. **Update Regularly**: Keep service worker cache duration reasonable (7 days)
5. **Test Thoroughly**: Test on multiple browsers and devices
6. **Document Changes**: Update this document when adding new resources

## Cache Busting Implementation

### Why Cache Busting is Critical

Without version parameters, browsers may keep serving stale cached CSS/JS files after an update, causing broken styling that persists until a hard refresh (Ctrl+Shift+R, or Cmd+Shift+R on macOS).

### Implementation Pattern

**Correct Pattern**:
```php
<?php
$css_path = $_SERVER['DOCUMENT_ROOT'] . '/dist/output.min.css';
$css_version = file_exists($css_path) ? filemtime($css_path) : time();
?>
<link rel="stylesheet" href="/dist/output.min.css?v=<?php echo $css_version; ?>">
```

**Incorrect Pattern** (causes caching issues):
```php
<link rel="stylesheet" href="/dist/output.min.css">
```

### Cache Header Strategy

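**Changed from `immutable` to `must-revalidate`**:
- `immutable` tells browsers never to revalidate a URL while it is fresh, so an asset included without a version parameter stayed stale for the full cache lifetime
- `must-revalidate` requires browsers to revalidate a response with the server once it is stale, while keeping the long cache lifetime
- A changed query string (`?v=old` → `?v=new`) is a different URL, so it always forces a fresh fetch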

### Validation

Run before deployment:
```bash
python3 v2/scripts/dev-helpers/audit-unversioned-assets.py
```

This script scans `v2/base/`, `v2/pages/`, and `v2/components/` for unversioned CSS/JS includes.
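
The kind of check such an audit performs can be illustrated in a few lines. The sketch below is not the Python script itself, whose rules may differ; it assumes double-quoted `href`/`src` attributes and treats any local `.css` or `.js` reference without a `v=` query parameter as unversioned.

```javascript
// Matches href="..." / src="..." values that point at a .css or .js file,
// capturing the path and, if present, the query string.
const ASSET_ATTRIBUTE = /(?:href|src)="([^"]+?\.(?:css|js))(\?[^"]*)?"/g;

// Returns local CSS/JS paths in `source` that lack a `v=` query parameter.
function findUnversionedAssets(source) {
  const unversioned = [];
  for (const [, path, query = ''] of source.matchAll(ASSET_ATTRIBUTE)) {
    const isExternal = /^(?:https?:)?\/\//.test(path); // third-party URLs are skipped
    const isVersioned = /[?&]v=/.test(query);
    if (!isExternal && !isVersioned) unversioned.push(path);
  }
  return unversioned;
}
```

Run against a template that includes `/src/critical.css` without a version parameter, the function returns that path, while versioned includes and external CDN URLs are ignored.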

## References

- [Service Worker API - MDN](https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API)
- [Cache API - MDN](https://developer.mozilla.org/en-US/docs/Web/API/Cache)
- [HTTP Caching - MDN](https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching)
- [Stale-While-Revalidate Strategy](https://web.dev/stale-while-revalidate/)
