# Google Ads Attribution Audit Report

**Last Updated:** 2026-01-28  
**Status:** ✅ Fixes Implemented

## Problem Statement

Google Ads leads from the `/gastro` and `/schichtbetriebe` landing pages were being misattributed as "Organic Search" or "Direct Traffic" instead of "Google" (Google Ads). Approximately 10 leads were affected in recent days.

**Example Contacts:**
- Contact 1: `666278336707` - Submit on `/schichtbetriebe`, leadSource: Direct Traffic
- Contact 2: `667905998044` - Submit on `/gastro`, leadSource: Organic Search or Direct Traffic  
- Contact 3: `669640998128` - Created by Direct Traffic, leadSource: Organic Search, Demo submit on `/schichtbetriebe`

## Audit Results

### Contacts Audited

Three example contacts were audited using the HubSpot API:

1. **Contact 666278336707** (Moqbel Schurab)
   - Email: myhelium.de@gmail.com
   - Created: 2026-01-26
   - Current leadSource: Direct Traffic
   - First URL: `https://www.ordio.com/schichtbetriebe/` (no UTM parameters)
   - HubSpot Analytics Source: DIRECT_TRAFFIC
   - **Issue:** No Google Ads indicators found in contact properties or URLs

2. **Contact 667905998044** (Miguel Gomes)
   - Email: miguel-gomes97@web.de
   - Current leadSource: Unknown (needs verification)
   - **Issue:** Form submitted on `/gastro` but attribution unclear

3. **Contact 669640998128** (Malte Kroenig)
   - Email: maltekroenig08@gmail.com
   - Current leadSource: Organic Search
   - **Issue:** Contact was created with source Direct Traffic, but leadSource shows Organic Search

### Key Findings

1. **Missing Tracking Parameters in HubSpot**
   - No `gclid__c` values stored in contact properties
   - No `hsa_src__c` values stored
   - No `utm_source__c` or `utm_medium__c` values stored
   - First visit URLs (`hs_analytics_first_url`) contain no UTM or Google Ads parameters

2. **Parameter Loss During Navigation**
   - Users land on `/gastro` or `/schichtbetriebe` from Google Ads
   - Google Ads tracking parameters (`gclid`, `hsa_*`) are present in the initial URL
   - Parameters are lost before form submission (likely during internal navigation)
   - Form submission URLs don't contain Google Ads parameters

3. **Attribution Logic Issues**
   - `determineLeadSourceFromContext()` didn't prioritize `gclid` detection
   - Google Ads detection happened AFTER lead source refinement, allowing wrong values to persist
   - Frontend sent "Direct Traffic" or "Organic Search" which overrode correct Google Ads detection
   - `gclid` extraction from `pageUrl` wasn't prioritized
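
Finding 2 implies that the ad parameters have to be captured on the landing page and carried through internal navigation. The following is a minimal JavaScript sketch of that idea, not the code in `utm-tracking.js`: the helper names `extractAdParams` and `mergeAdParams`, the key list, and the "first-touch values win" rule are all assumptions made for illustration.

```javascript
// Hypothetical helpers; not the actual utm-tracking.js implementation.
// Keys treated as ad-click evidence (assumed list, based on parameters named in this report).
const AD_PARAM_KEYS = ['gclid', 'hsa_src', 'utm_source', 'utm_medium'];

// Pull the ad-related query parameters out of a full URL.
function extractAdParams(url) {
  const params = new URL(url).searchParams;
  const found = {};
  for (const key of AD_PARAM_KEYS) {
    const value = params.get(key);
    if (value) found[key] = value;
  }
  return found;
}

// Merge parameters stored at landing with those on the current page.
// Stored (first-touch) values win, so a later parameter-free page cannot erase them.
function mergeAdParams(stored, current) {
  return { ...current, ...stored };
}

const landing = extractAdParams('https://www.ordio.com/gastro/?gclid=abc123&hsa_src=g');
const formPage = extractAdParams('https://www.ordio.com/gastro/');
const merged = mergeAdParams(landing, formPage);
console.log(merged);
```

In a browser, the object returned by `extractAdParams` would be written to a cookie or `sessionStorage` on landing and read back before form submission; that storage step is left out so the sketch runs anywhere.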

## Root Causes Identified

### 1. gclid Not Prioritized in Attribution Logic

**Issue:** `determineLeadSourceFromContext()` checked for `gclid` but only returned 'Google' if `utm_source` was also 'adwords' or 'google'. If `gclid` was present on its own (without `utm_source`), the lead was not recognized as Google Ads.

**Fix:** Updated `determineLeadSourceFromContext()` to prioritize `gclid` detection: if `gclid` is present, it returns 'Google' immediately, regardless of other parameters.

**Files Changed:**
- `v2/config/utm-validation.php` - Lines 1652-1658
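
The check order described in this fix can be sketched as follows. This is a JavaScript illustration only: the production function is PHP, its real signature is not shown in this report, and the name `determineLeadSourceSketch`, the parameter names, and the 'Direct Traffic' default are assumptions.

```javascript
// Illustrative rendering of the check order; not the PHP code in utm-validation.php.
function determineLeadSourceSketch({ gclid, utmSource, utmMedium, leadSource } = {}) {
  // 1. A Google click ID alone is enough: return immediately.
  if (gclid) return 'Google';

  // 2. UTM-based detection, shown only for the adwords/ppc pair covered by the tests.
  const source = (utmSource || '').toLowerCase();
  const medium = (utmMedium || '').toLowerCase();
  if (source === 'adwords' && medium === 'ppc') return 'Google';

  // 3. No Google Ads signal: keep whatever was already determined
  //    (the 'Direct Traffic' default is an assumption).
  return leadSource || 'Direct Traffic';
}

console.log(determineLeadSourceSketch({ gclid: 'abc123' })); // Google
console.log(determineLeadSourceSketch({ leadSource: 'Organic Search' })); // Organic Search
```

Putting the `gclid` check first means no combination of missing or conflicting UTM values can mask a click ID.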

### 2. Google Ads Detection Order Issue

**Issue:** Google Ads detection happened AFTER `determineLeadSourceFromContext()` was called, allowing wrong leadSource values from frontend to persist.

**Fix:** Reordered logic so Google Ads detection happens FIRST, then `determineLeadSourceFromContext()` refines (but doesn't override correct Google Ads attribution).

**Files Changed:**
- `v2/api/lead-capture.php` - Lines 2412-2535 (Step 1)
- `v2/api/lead-capture.php` - Lines 3269-3395 (Step 2)
- `v2/api/collect-lead.php` - Lines 433-451
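
To make the reordering concrete, here is a small JavaScript sketch of a "detect first, refine second" pipeline. It is not the code in `lead-capture.php` or `collect-lead.php`; `hasGoogleAdsIndicator`, `resolveLeadSource`, and the stand-in `refine` callback are hypothetical names.

```javascript
// Hypothetical pipeline: Google Ads detection runs before refinement.
function hasGoogleAdsIndicator({ gclid, hsaSrc } = {}) {
  return Boolean(gclid) || hsaSrc === 'g';
}

function resolveLeadSource(frontendLeadSource, signals, refine) {
  // Step 1: Google Ads detection runs before anything else.
  if (hasGoogleAdsIndicator(signals)) return 'Google';
  // Step 2: refinement only runs when no Google Ads indicator exists,
  // so it can never replace a correct 'Google' attribution.
  return refine(frontendLeadSource, signals);
}

// Stand-in for the refinement step (assumed behaviour: keep the frontend value or default).
const refine = (leadSource) => leadSource || 'Direct Traffic';

console.log(resolveLeadSource('Organic Search', { gclid: 'abc123' }, refine)); // Google
console.log(resolveLeadSource('Organic Search', {}, refine)); // Organic Search
```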

### 3. Missing Override Logic for Wrong leadSource Values

**Issue:** Google Ads detection only set `leadSource = 'Google'` if it was empty. If frontend sent "Direct Traffic" or "Organic Search", it wasn't overridden.

**Fix:** Added logic to override wrong leadSource values when Google Ads indicators (`gclid` or `hsa_src='g'`) are present.

**Files Changed:**
- `v2/api/lead-capture.php` - Lines 2412-2424, 2426-2497 (Step 1)
- `v2/api/lead-capture.php` - Lines 3269-3281, 3283-3371 (Step 2)
- `v2/api/collect-lead.php` - Lines 433-451
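
The override rule and its warning log might look like the sketch below. It is written in JavaScript for illustration and is not the PHP in the endpoints; `applyGoogleAdsOverride`, the injectable `log` callback, and the decision to leave sources other than "Direct Traffic" and "Organic Search" untouched are assumptions.

```javascript
// Values the report describes as wrongly sent by the frontend for ad clicks,
// plus the empty value that the original logic already handled.
const OVERRIDABLE = new Set(['', 'Direct Traffic', 'Organic Search']);

function applyGoogleAdsOverride(leadSource, { gclid, hsaSrc } = {}, log = console.warn) {
  const isGoogleAds = Boolean(gclid) || hsaSrc === 'g';
  const current = leadSource || '';
  if (!isGoogleAds || current === 'Google') return current;
  // Assumption: any other explicit source (e.g. a referral) is left alone.
  if (!OVERRIDABLE.has(current)) return current;
  // Record the old value, the new value, and which indicator triggered the override.
  log(`leadSource override: "${current}" -> "Google"`, {
    gclidPresent: Boolean(gclid),
    hsaSrc: hsaSrc ?? null,
  });
  return 'Google';
}

const warnings = [];
const result = applyGoogleAdsOverride('Direct Traffic', { gclid: 'abc123' }, (msg) => warnings.push(msg));
console.log(result, warnings);
```

Passing the logger in as a parameter keeps the rule testable without capturing console output.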

### 4. Frontend leadSource Mismatch

**Issue:** Frontend JavaScript (`utm-tracking.js`) set `leadSource = 'Paid Search'` for Google Ads, but backend expects `leadSource = 'Google'`.

**Fix:** Updated frontend to use `leadSource = 'Google'` for Google Ads to match backend expectations.

**Files Changed:**
- `v2/js/utm-tracking.js` - Lines 621-624, 2047-2050, 2157-2160
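
Because the value is set in three places, one way to keep frontend and backend aligned is a single shared constant. The sketch below assumes a hypothetical helper `leadSourceFromQuery`; it does not reproduce `getAllUTMData()`, `correctGoogleAdsUTM()`, or `getUTMDataForAPI()`.

```javascript
// Hypothetical helper; the real logic in utm-tracking.js is not reproduced here.
// Value the backend expects for Google Ads leads (previously 'Paid Search').
const GOOGLE_ADS_LEAD_SOURCE = 'Google';

function leadSourceFromQuery(search, defaultSource) {
  const params = new URLSearchParams(search);
  const isGoogleAds = Boolean(params.get('gclid')) || params.get('hsa_src') === 'g';
  return isGoogleAds ? GOOGLE_ADS_LEAD_SOURCE : defaultSource;
}

// In a browser this would be called with window.location.search.
console.log(leadSourceFromQuery('?gclid=abc123', 'Direct Traffic')); // Google
console.log(leadSourceFromQuery('?ref=newsletter', 'Direct Traffic')); // Direct Traffic
```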

### 5. Missing hsa_src Detection in determineLeadSourceFromContext()

**Issue:** `determineLeadSourceFromContext()` didn't check for `hsa_src='g'` parameter from pageUrl.

**Fix:** Added `hsa_src='g'` detection in `determineLeadSourceFromContext()` by extracting from pageUrl.

**Files Changed:**
- `v2/config/utm-validation.php` - Lines 1660-1675
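
Extracting `hsa_src` from the page URL when it was not passed explicitly can be sketched like this. The sketch is JavaScript rather than the production PHP, and the helper names `extractParamFromPageUrl` and `isGoogleAdsByHsaSrc` are hypothetical.

```javascript
// Hypothetical helpers mirroring the described fix.
function extractParamFromPageUrl(pageUrl, name) {
  if (!pageUrl) return null;
  try {
    return new URL(pageUrl).searchParams.get(name);
  } catch {
    return null; // malformed or relative URL: treat as "no parameter"
  }
}

function isGoogleAdsByHsaSrc(hsaSrc, pageUrl) {
  // Prefer the explicitly passed value, fall back to the page URL.
  const value = hsaSrc || extractParamFromPageUrl(pageUrl, 'hsa_src');
  return value === 'g';
}

console.log(isGoogleAdsByHsaSrc(null, 'https://www.ordio.com/gastro/?hsa_src=g')); // true
console.log(isGoogleAdsByHsaSrc(null, 'https://www.ordio.com/gastro/')); // false
```

Swallowing the parse error is a deliberate choice in this sketch: a bad `pageUrl` should degrade to "no indicator" rather than fail the form submission.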

## Fixes Implemented

### Backend Fixes

1. **Enhanced `determineLeadSourceFromContext()` Function**
   - Prioritizes `gclid` detection: returns 'Google' immediately if `gclid` is present
   - Extracts `hsa_src` from pageUrl if it was not passed as a parameter
   - Treats `hsa_src='g'` as a Google Ads indicator
   - Returns 'Google' for Google Ads before checking other signals

2. **Improved Google Ads Detection in Form Endpoints**
   - `lead-capture.php` (Step 1 & Step 2): Google Ads detection happens BEFORE `determineLeadSourceFromContext()`
   - Overrides wrong leadSource values (Direct Traffic, Organic Search) when Google Ads indicators present
   - Extracts `gclid` from pageUrl if not in cookies/parameters
   - Logs warnings when overriding incorrect leadSource values

3. **Enhanced `collect-lead.php`**
   - Added Google Ads detection BEFORE `determineLeadSourceFromContext()`
   - Extracts `hsa_src` from pageUrl
   - Overrides wrong leadSource values when Google Ads indicators present

### Frontend Fixes

1. **Updated `utm-tracking.js`**
   - Changed `leadSource = 'Paid Search'` to `leadSource = 'Google'` for Google Ads
   - Ensures consistency with backend expectations
   - Updated in `getAllUTMData()`, `correctGoogleAdsUTM()`, and `getUTMDataForAPI()`

## Testing

### Unit Tests

Created `test-google-ads-attribution.php` script with 10 test cases covering:
- `gclid` present scenarios
- `hsa_src='g'` scenarios
- Override logic for wrong leadSource values
- Edge cases (no indicators, organic search)

**Results:** All 10 tests passed ✓

### Test Scenarios Verified

1. ✓ `gclid` present, no utm_source → Returns 'Google'
2. ✓ `gclid` present, utm_source=adwords → Returns 'Google'
3. ✓ `hsa_src='g'` present → Returns 'Google'
4. ✓ `gclid` present but leadSource=Direct Traffic → Overrides to 'Google'
5. ✓ `gclid` present but leadSource=Organic Search → Overrides to 'Google'
6. ✓ `utm_source=adwords`, `utm_medium=ppc` → Returns 'Google'
7. ✓ `gclid` in pageUrl but not in parameters → Extracted and returns 'Google'
8. ✓ `hsa_src='g'` in pageUrl → Returns 'Google'
9. ✓ No Google Ads indicators → Keeps original leadSource
10. ✓ Organic search → Returns 'Organic Search' (not Google Ads)

## Prevention Measures

### 1. Enhanced Logging

Added warning logs when overriding incorrect leadSource values:
- Logs old leadSource and new leadSource
- Includes context (gclid present, hsa_src present, etc.)
- Helps identify future misattribution issues

### 2. Monitoring Script

Created `monitor-google-ads-attribution.php` to detect future issues.
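
The contents of the monitoring script are not shown in this report. As a rough idea of the kind of check such a script could run, the JavaScript sketch below flags contacts that carry a Google Ads indicator but a different lead source. The property names `gclid__c` and `hsa_src__c` come from this report; the `leadSource` key, the contact shape, and the function name are assumptions.

```javascript
// Hypothetical check; not the contents of monitor-google-ads-attribution.php.
function findMisattributedContacts(contacts) {
  return contacts.filter((c) => {
    const hasIndicator = Boolean(c.gclid__c) || c.hsa_src__c === 'g';
    return hasIndicator && c.leadSource !== 'Google';
  });
}

// Made-up sample records for illustration only.
const sample = [
  { id: '1', gclid__c: 'abc123', leadSource: 'Direct Traffic' },
  { id: '2', gclid__c: 'def456', leadSource: 'Google' },
  { id: '3', leadSource: 'Organic Search' },
];
console.log(findMisattributedContacts(sample).map((c) => c.id)); // [ '1' ]
```

A check of this shape only catches contacts whose indicators were actually stored. The three audited contacts had no `gclid__c` or `hsa_src__c` values, so they would not have been flagged by it.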

### 3. Documentation Updates

- Updated `ATTRIBUTION_DEBUGGING_GUIDE.md` with Google Ads patterns
- Updated `.cursor/rules/lead-capture.mdc` with Google Ads attribution requirements

## Recommendations

### Immediate Actions

1. **Monitor Attribution** - Run monitoring script weekly to catch misattributed leads early
2. **Test Form Submissions** - Test form submissions on `/gastro` and `/schichtbetriebe` with Google Ads parameters
3. **Verify Cookie Persistence** - Ensure UTM cookies persist during internal navigation

### Long-Term Improvements

1. **Form Submission URL Tracking** - Improve form submission URL capture in HubSpot Activities API
2. **Real-time Attribution Validation** - Add validation alerts when Google Ads indicators found but leadSource is wrong
3. **Attribution Dashboard** - Create dashboard showing attribution health metrics

## Files Modified

### Backend
- `v2/config/utm-validation.php` - Enhanced `determineLeadSourceFromContext()`
- `v2/api/lead-capture.php` - Improved Google Ads detection (Step 1 & Step 2)
- `v2/api/collect-lead.php` - Added Google Ads detection

### Frontend
- `v2/js/utm-tracking.js` - Changed 'Paid Search' to 'Google' for consistency

### Scripts Created
- `v2/scripts/hubspot/google-ads-attribution-audit.php` - Contact audit script
- `v2/scripts/hubspot/analyze-google-ads-attribution.php` - Analysis script
- `v2/scripts/hubspot/test-google-ads-attribution.php` - Test script

## Next Steps

1. **Deploy Fixes** - Deploy code changes to production
2. **Monitor Results** - Run monitoring script to verify fixes work
3. **Test Form Submissions** - Test with real Google Ads parameters
4. **Retroactive Fix** - Consider fixing affected contacts if needed (requires user approval)

## Related Documentation

- `docs/development/ATTRIBUTION_DEBUGGING_GUIDE.md` - Attribution troubleshooting
- `.cursor/rules/lead-capture.mdc` - Lead capture patterns
- `v2/scripts/hubspot/google-ads-attribution-audit.php` - Audit script
- `v2/scripts/hubspot/analyze-google-ads-attribution.php` - Analysis script
