# Lead source vs UTM audit — pattern summary (90 days)

**Last Updated:** 2026-03-29  

**Source run:** `leadsource-utm-audit-full-90d.csv` / `leadsource-utm-audit-full-90d-summary.json` (generated 2026-03-29 UTC). **Example emails and IDs** below are copied from that export; after you re-run the audit, refresh this section from the new CSV if you want matching examples.  

**Scale:** 5,530 contacts scanned · **230 discrepancy rows** (23 tier **A**, 207 tier **B**).

Use this page to **pick one pattern group**, spot-check **2–3 examples** in HubSpot (**search by contact ID or email**), then decide per group: *ignore*, *bulk-update `leadsource`*, or *investigate integration/tracking*.

---

## How to use the CSV with this summary

1. Open `var/hubspot-audits/leadsource-utm-audit-full-90d.csv` in Excel or Google Sheets.
2. Filter the column **`reason_codes`** to the code shown for each group below (exact match).
3. Optionally filter **`tier`** to `A` or `B`.
4. In HubSpot, open a contact by **`contact_id`** or **email** (see example tables per group below) and compare **Lead source** with **Original source** / UTM custom properties.
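
If you would rather filter outside a spreadsheet, the steps above can be sketched in a few lines of Python. This is a minimal sketch, not part of the audit tooling: it assumes the export has header columns named `contact_id`, `tier`, and `reason_codes` (as referenced on this page) plus an `email` column, which is an assumption. The inline sample data is invented so the snippet runs without the real file.

```python
import csv
import io


def filter_rows(csv_file, reason_code, tier=None):
    """Return audit rows whose reason_codes equals reason_code exactly.

    csv_file is any open text file (or StringIO) with a header row.
    tier is optional; pass "A" or "B" to narrow the result further.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        if row["reason_codes"].strip() != reason_code:
            continue
        if tier is not None and row["tier"].strip() != tier:
            continue
        rows.append(row)
    return rows


# Invented sample standing in for the real export (column names partly assumed).
SAMPLE = """contact_id,email,tier,reason_codes
111,a@example.com,A,gclid_present
222,b@example.com,B,analytics_direct_vs_leadsource_organic
333,c@example.com,A,meta_paid_medium
"""

matches = filter_rows(io.StringIO(SAMPLE), "gclid_present", tier="A")
for row in matches:
    print(row["contact_id"], row["email"])
```

To run it against the real export, replace `io.StringIO(SAMPLE)` with `open("var/hubspot-audits/leadsource-utm-audit-full-90d.csv", newline="", encoding="utf-8")` and adjust the column names if the header differs.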

---

## Decision log (remediation 2026-03-29)

Stakeholder decisions for the **90-day audit groups** are recorded in [HUBSPOT_LEADSOURCE_ATTRIBUTION_POLICY.md](./HUBSPOT_LEADSOURCE_ATTRIBUTION_POLICY.md). **Group 5** case-by-case workbook: [GROUP5_CASE_DECISIONS.md](./GROUP5_CASE_DECISIONS.md).

| Group | Action |
|------|--------|
| 1 | Accept as-is (analytics vs UTM divergence documented) |
| 2 | CRM fix + integration trace (Meta → `leadsource` **meta**); logs: `lead_capture_step2_meta_paid`, `form_hs_meta_paid` |
| 3 | CRM fix only if **Organic Search** agreed for those rows |
| 4 | CRM fix: align to **Organic Search** when `utm_medium__c` = organic |
| 5 | Case-by-case; see GROUP5_CASE_DECISIONS (merge / workflow / SDR / bug) |
| 6 | Reporting-first; **narrow** `leadsource`/UTM PATCH only with evidence (`patch-contact-attribution-from-csv.php`) |

**HubSpot workflows (manual):** Inventory workflows that touch `leadsource` / UTMs; prefer fill-only rules — see policy doc checklist.

After CRM patches, **re-run** `php v2/scripts/hubspot/audit-leadsource-utm-discrepancies.php --days=90` and refresh the example rows in this file if counts shift materially. To draft PATCH rows from any audit, run `php v2/scripts/hubspot/build-patch-csv-from-audit.php --input=… --reason=… --output=…`, edit the draft, then apply it with `patch-leadsource-from-audit.php`.

**Paid search + weak CRM UTMs (`paid_search_utm_gap`):** Dedicated export via `--paid-utm-gap-output=…`; remediation ladder and scripts are documented in [HUBSPOT_LEADSOURCE_ATTRIBUTION_POLICY.md](./HUBSPOT_LEADSOURCE_ATTRIBUTION_POLICY.md#paid-search-utm-backfill-hybrid-evidence-ladder).

---

## Snapshot by tier

| Tier | Count | How to read it |
|------|------:|----------------|
| **A** | 23 | UTM/gclid on the record **strongly** suggests a different `leadsource` than what is stored. Good candidates for **CRM corrections** after you confirm taxonomy. |
| **B** | 207 | HubSpot **analytics** (`hs_analytics_source`) disagrees with `leadsource` or with UTM fields. Often **not a bug**—treat as **review** (tracking cookie, first session, or timing). |
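
The tier totals can be cross-checked against the per-group counts in the pattern-group table on this page. The sketch below is only an arithmetic check: the numbers are copied from this run, and the helper name `tier_totals` is hypothetical. After a re-run, replace the counts with the new ones before relying on it.

```python
from collections import Counter

# (reason_codes, tier, rows) as published in the pattern-group table of this page.
PUBLISHED_GROUPS = [
    ("analytics_direct_vs_leadsource_organic", "B", 196),
    ("meta_paid_medium", "A", 10),
    ("analytics_organic_vs_leadsource_google", "B", 8),
    ("utm_medium_organic", "A", 5),
    ("gclid_present", "A", 5),
    ("analytics_paid_search_vs_leadsource_organic", "B", 2),
    ("utm_source_google_no_gclid", "A", 1),
    ("utm_medium_referral", "A", 1),
    ("utm_medium_email", "A", 1),
    ("analytics_paid_social_vs_leadsource_not_meta", "B", 1),
]


def tier_totals(groups):
    """Sum the row counts per tier."""
    totals = Counter()
    for _reason, tier, rows in groups:
        totals[tier] += rows
    return totals


totals = tier_totals(PUBLISHED_GROUPS)
print(f"A={totals['A']} B={totals['B']} total={sum(totals.values())}")
# prints: A=23 B=207 total=230
```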

---

## Pattern groups (sorted by volume)

Each row is one **decision bucket**: the same remediation usually applies to the whole group.

| # | Count | Tier | `reason_codes` | Plain-language summary |
|---|------:|:----:|----------------|-------------------------|
| 1 | 196 | B | `analytics_direct_vs_leadsource_organic` | Form/capture says **organic** (`source__c` + `utm_medium__c` = organic-style), but HubSpot analytics says **Direct traffic** (often `hs_analytics_source_data_1` = your domain). |
| 2 | 10 | A | `meta_paid_medium` | **`source__c`** is Meta-style (**fb** / similar) and **`utm_medium__c`** is paid, but **`leadsource`** is not **meta** (e.g. Direct Traffic, Offline). |
| 3 | 8 | B | `analytics_organic_vs_leadsource_google` | Analytics says **organic search**, but **`leadsource`** is **Google** (or paid-flavoured). |
| 4 | 5 | A | `utm_medium_organic` | **`utm_medium__c` = organic** implies **Organic Search**, but **`leadsource`** is something else (e.g. Organic Social, Direct Traffic). |
| 5 | 5 | A | `gclid_present` | **`gclid__c` is set** (Google Ads click) but **`leadsource`** is not aligned with **paid Google** (e.g. referral, wrong bucket). |
| 6 | 2 | B | `analytics_paid_search_vs_leadsource_organic` | Analytics = **paid search**, form UTM = **organic**—session vs submit mismatch. |
| 7 | 1 | A | `utm_source_google_no_gclid` | **`source__c` = google** without gclid → we treat as **organic**; **`leadsource`** disagrees (e.g. referral). |
| 8 | 1 | A | `utm_medium_referral` | Medium **referral** → expect **referral**-style `leadsource`; stored value differs. |
| 9 | 1 | A | `utm_medium_email` | Medium **email** → expect **email**-style `leadsource`; stored value differs. |
| 10 | 1 | B | `analytics_paid_social_vs_leadsource_not_meta` | Analytics = **paid social**, `leadsource` not **meta**-like. |
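
To make the tier **A** rows easier to reason about, here is a rough sketch of how the plain-language summaries in the table could be read as rules. It is **not** the audit script and not `determineLeadSourceFromContext()`; the function name `expected_channel`, the rule order, the channel labels, and the exact sets of "Meta-style" sources and "paid" mediums are all assumptions made for illustration (the table only names **fb** "or similar").

```python
# Assumed value sets; the table above only says "fb / similar" and "paid".
META_SOURCES = {"fb", "facebook", "ig", "instagram", "meta"}
PAID_MEDIUMS = {"paid", "cpc", "ppc", "paid_social", "paidsocial"}


def expected_channel(source, medium, gclid):
    """Return (reason_code, expected_channel), or None if no tier A rule applies.

    source/medium/gclid stand for source__c, utm_medium__c and gclid__c.
    The rule order is an assumption: gclid is checked first, the
    google-without-gclid fallback last.
    """
    source = (source or "").strip().lower()
    medium = (medium or "").strip().lower()
    if (gclid or "").strip():
        return ("gclid_present", "paid_google")
    if source in META_SOURCES and medium in PAID_MEDIUMS:
        return ("meta_paid_medium", "meta")
    if medium == "organic":
        return ("utm_medium_organic", "organic_search")
    if medium == "referral":
        return ("utm_medium_referral", "referral")
    if medium == "email":
        return ("utm_medium_email", "email")
    if source == "google":
        return ("utm_source_google_no_gclid", "organic_search")
    return None


print(expected_channel("fb", "paid", ""))
print(expected_channel("google", "", None))
print(expected_channel("google", "cpc", "example-click-id"))
```

A row would count as a discrepancy when the stored `leadsource` does not map to the returned channel; that mapping depends on your portal's dropdown labels and is left out here.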

---

## Group 1 — Direct analytics vs organic UTM (196 rows, tier B)

**What you are seeing:**  
UTM fields on the contact often look like **organic search** (e.g. `google` + `organic`). HubSpot’s **Original source** is **Direct traffic**, with drill-down often showing a **full URL on ordio.com** (not a search engine referrer string).

**Typical `source__c` mix (this run):**

| `source__c` (approx.) | Rows |
|----------------------|-----:|
| google | 179 |
| bing | 9 |
| duckduckgo | 3 |
| direct | 2 |
| ecosia | 2 |
| yahoo | 1 |

About **19** rows use a **lead-capture temp email** (`@temp.ordio.com`). These come from two-step flows where Step 1 may not pass the same context as Step 2.
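
After a re-run, the `source__c` mix and the temp-email count for this group can be recomputed from the filtered CSV rows. The sketch below assumes each row is a dict with `source__c` and `email` keys (the CSV column names are an assumption) and uses invented sample rows; `summarize_group` is a hypothetical helper.

```python
from collections import Counter

TEMP_EMAIL_SUFFIX = "@temp.ordio.com"


def summarize_group(rows):
    """Return (source__c counts, number of rows with a lead-capture temp email)."""
    source_mix = Counter(
        (row.get("source__c") or "").strip().lower() for row in rows
    )
    temp_count = sum(
        1
        for row in rows
        if (row.get("email") or "").strip().lower().endswith(TEMP_EMAIL_SUFFIX)
    )
    return source_mix, temp_count


# Invented rows; in practice pass the Group 1 rows read with csv.DictReader.
sample_rows = [
    {"email": "lead-lc0000000000000000@temp.ordio.com", "source__c": "google"},
    {"email": "person@example.com", "source__c": "Google"},
    {"email": "other@example.com", "source__c": "bing"},
]

source_mix, temp_count = summarize_group(sample_rows)
for source, count in source_mix.most_common():
    print(source, count)
print("temp-email rows:", temp_count)
```

Values are lower-cased before counting so that `Google` and `google` land in the same bucket, which matches the "approx." grouping in the table above.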

**Likely causes (usually not a single “wrong” value):**

- **HubSpot tracking** attributes the **first session** as direct (bookmark, mobile app open, stripped referrer, ITP/privacy, or user typed URL) while your **form** still submits **organic-style UTM** from the page or cookies.
- **No contradiction for reporting** if you trust **first-touch analytics** for channel mix and **UTM properties** for **last-touch campaign** context.

**What to decide:**

| Your call | Action |
|-----------|--------|
| **Accept as-is** | No CRM change; document that `leadsource`/UTM can diverge from `hs_analytics_source` by design. |
| **Align reporting** | Use **one** system as source of truth in dashboards (e.g. analytics vs UTM fields), or sync via a **workflow** that fills a field only when it is empty. |
| **Investigate integration** | If you expect **hutk** / tracking script on every form page: audit a few URLs for **missing script**, **ad blockers**, or **forms submitted without cookie** (especially temp-email leads). |

**Examples** (from this audit export — search in HubSpot by ID or email):

| Contact ID | Email |
|------------|-------|
| 622456904912 | stoicab588@gmail.com |
| 617700772028 | lead-lc17670978026971@temp.ordio.com |
| 616264179952 | razonk@posteo.de |

---

## Group 2 — Meta paid UTM vs wrong `leadsource` (10 rows, tier A)

**What you are seeing:**  
**`source__c`** + **`utm_medium__c`** indicate **Meta paid**, but **`leadsource`** is e.g. **Direct Traffic** or **Offline**.

**Likely causes:**

- **Lead source not updated** on Step 2 or on a specific form endpoint.
- **`hs_analytics_source` OFFLINE** on some rows suggests **import, offline conversion, or missing web session** while UTM still came from the form payload—worth checking **which form/API** created the contact.

**What to decide:**

| Your call | Action |
|-----------|--------|
| **CRM fix** | After spot-check, set **`leadsource`** to your portal’s **meta / paid social** value to match UTM. |
| **Integration** | Trace **lead-capture Step 2** and **collect-lead** paths for **`leadsource`** and Meta detection (`fb` + paid medium). |

**Examples:**

| Contact ID | Email |
|------------|-------|
| 636994259153 | draganceda123@icloud.com |
| 640478269639 | martina.menzel@tuscherei.de |
| 646898845935 | petragiuliani08@gmail.com |

---

## Group 3 — Analytics organic vs `leadsource` Google (8 rows, tier B)

**What you are seeing:**  
HubSpot says **organic search** (sometimes “Unknown keywords (SSL)” in drill-down), but **`leadsource`** is **Google** or similar.

**Likely causes:**

- **Naming**: your dropdown **“Google”** may mean **paid**, while analytics **ORGANIC_SEARCH** is **SEO**—taxonomy mismatch, not necessarily wrong data.
- **SSL / keyword not provided** in drill-down is common.

**What to decide:**

| Your call | Action |
|-----------|--------|
| **Taxonomy** | Rename or split options (e.g. **Google Ads** vs **Google organic**) so reporting matches HubSpot analytics language. |
| **CRM fix** | Only if you agree **leadsource** should be **Organic Search** for these records. |

**Batch tooling (CRM fix):** If you use `build-patch-csv-from-audit.php` with `--target-override="Organic Search"`, add **`--require-empty-gclid`** so contacts with **`gclid__c` + paid-style UTMs** stay on **Google** (otherwise you recreate a **`gclid_present`** discrepancy). One audited row (`650648411363`) fell in this overlap and required a follow-up PATCH to **Google**.
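
Before signing off a draft PATCH CSV, the overlap described above can also be checked by hand. The following is a minimal sketch of such a pre-check, not the implementation behind `--require-empty-gclid`: it assumes each draft row carries a `gclid__c` value (the column name in the CSV is an assumption), and `split_by_gclid` is a hypothetical helper.

```python
def split_by_gclid(rows):
    """Split draft PATCH rows into (safe, held_back).

    Rows with a non-empty gclid__c are held back, because moving them to
    Organic Search would recreate a gclid_present discrepancy.
    """
    safe, held_back = [], []
    for row in rows:
        if (row.get("gclid__c") or "").strip():
            held_back.append(row)
        else:
            safe.append(row)
    return safe, held_back


# Invented draft rows for illustration only.
draft_rows = [
    {"contact_id": "1001", "gclid__c": "", "target": "Organic Search"},
    {"contact_id": "1002", "gclid__c": "example-click-id", "target": "Organic Search"},
    {"contact_id": "1003", "gclid__c": None, "target": "Organic Search"},
]

safe, held_back = split_by_gclid(draft_rows)
print("patch:", [row["contact_id"] for row in safe])
print("review manually:", [row["contact_id"] for row in held_back])
```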

**Examples:**

| Contact ID | Email |
|------------|-------|
| 651336909008 | pat.mill988@gmail.com |
| 664835707103 | info@sporttreff-in.de |
| 650648411363 | reservierung@walderdorffs.de (had **gclid**; **Google** kept after overlap fix) |

---

## Group 4 — `utm_medium=organic` vs wrong `leadsource` (5 rows, tier A)

**What you are seeing:**  
Explicit **organic** medium on the contact, but **`leadsource`** is e.g. **Organic Social** or **Direct Traffic**.

**Likely causes:**

- **Frontend** sent **`utm_medium=organic`** with a **social-looking** `source` in another field, or **server refinement** (`determineLeadSourceFromContext`) expects **Organic Search** but form stored a different dropdown value.

**What to decide:**

| Your call | Action |
|-----------|--------|
| **CRM fix** | If UTM is trusted, align **`leadsource`** to **Organic Search** (or your portal’s equivalent). |
| **Integration** | Compare **`source__c`** + **`utm_medium__c`** + page context for these IDs in logs. |

**Examples:**

| Contact ID | Email |
|------------|-------|
| 621887367380 | emilbahtijar26@gmail.com |
| 653374740695 | jochen.kramb@winerebels.de |
| 668964066526 | lea-marie.lippert@faceoff-events.de |

---

## Group 5 — `gclid` present vs wrong `leadsource` (5 rows, tier A)

**What you are seeing:**  
**Google Click ID** is stored; **`leadsource`** is not **paid Google** (e.g. **referral**, **Freelancesdr**).

**Likely causes:**

- **Campaign / SDR** flows that **override** `leadsource` after capture.
- **Import or workflow** that set **`leadsource`** without clearing **`gclid__c`**.

**What to decide:**

| Your call | Action |
|-----------|--------|
| **CRM fix** | If paid Google is correct for attribution, set **`leadsource`** to paid Google / **Google Ads** (per your enum). |
| **Process** | If **Freelancesdr** (or other) is **intentionally** winning for SDR leads, document that **gclid** may remain for ad diagnostics only. |

**Examples:**

| Contact ID | Email |
|------------|-------|
| 609537709256 | thomas1983mueller1983@gmail.com |
| 685457135815 | a.gerhardt@kairosred.eu |
| 630893167819 | alansido@googlemail.com |

---

## Group 6 — Paid search analytics vs organic UTM (2 rows, tier B)

**What you are seeing:**  
Analytics **paid search**, but **UTM** on the record looks **organic**.

**Likely causes:**

- **Session** came from ads; **form** submitted **organic** UTMs (stale cookies, different tab, or manual URL).

**Remediation decision (2026-03-29):** Default **no** bulk CRM rewrite of UTM fields; treat as **reporting / first-touch vs last-touch** education. If you **do** correct specific rows (e.g. paid ads proven on form URL), use [patch-attribution-narrow-template.csv](./patch-attribution-narrow-template.csv) with [`patch-contact-attribution-from-csv.php`](../../v2/scripts/hubspot/patch-contact-attribution-from-csv.php) (dry-run first). See [Decision log](#decision-log-remediation-2026-03-29).

**Examples:**

| Contact ID | Email |
|------------|-------|
| 375879344372 | lead-lc17676128472236@temp.ordio.com |
| 174142465247 | lead-lc17703763617655@temp.ordio.com |

---

## Groups 7–10 — Single-row groups, tier A/B (1 row each)

| `reason_codes` | Tier | Contact ID | Email | Quick note |
|----------------|------|------------|-------|------------|
| `utm_source_google_no_gclid` | A | 635491778789 | og@gmail.com | **`google`** without **gclid** → rules assume **organic**; **`leadsource`** was **referral**. Confirm intent. |
| `utm_medium_referral` | A | 682630252776 | lead-lc17702904899905@temp.ordio.com | Expect **referral**-style **`leadsource`**. |
| `utm_medium_email` | A | 691968790722 | hasan.akman@bedachung-ph.ch | Expect **email**-style **`leadsource`**. |
| `analytics_paid_social_vs_leadsource_not_meta` | B | 657313602779 | haring@seppbauer.at | Rare **analytics vs dropdown** mismatch for paid social. |

---

## Integration / setup checklist (when a group looks “wrong”)

Use this when you suspect **product** or **MarTech** rather than a one-off contact:

1. **Tracking script** on landing URLs: HubSpot embed loads, **`hutk`** present on form submit where expected.  
2. **Lead capture two-step**: Step 1 vs Step 2 **`leadsource`** and UTM merge behaviour ([`v2/api/lead-capture.php`](../../../v2/api/lead-capture.php)).  
3. **Meta / Google** hidden fields and **`determineLeadSourceFromContext()`** ([`v2/config/utm-validation.php`](../../../v2/config/utm-validation.php)) vs HubSpot **dropdown labels**.  
4. **Workflows** in HubSpot that **overwrite** `leadsource` on create or update.  
5. **Offline / import** sources: **`hs_analytics_source` OFFLINE** with paid UTM → often **expected** if the record was touched outside the web session.

---

## Related docs

- [HUBSPOT_LEADSOURCE_UTM_AUDIT.md](./HUBSPOT_LEADSOURCE_UTM_AUDIT.md) — how to re-run the audit and interpret tiers  
- [HUBSPOT_LEADSOURCE_ATTRIBUTION_POLICY.md](./HUBSPOT_LEADSOURCE_ATTRIBUTION_POLICY.md) — source-of-truth policy, workflow checklist, references  
- [GROUP5_CASE_DECISIONS.md](./GROUP5_CASE_DECISIONS.md) — `gclid_present` case workbook  
- [patch-leadsource-template.csv](./patch-leadsource-template.csv) — signed-off PATCH CSV template  
- [patch-attribution-narrow-template.csv](./patch-attribution-narrow-template.csv) — optional UTM + `leadsource` PATCH (Group 6 / evidence-based)  
- Raw export: `var/hubspot-audits/leadsource-utm-audit-full-90d.csv` (gitignored; regenerate locally after re-run)
