# HubSpot Tracking Validation Report - November 2025


**Last Updated:** 2025-11-20

## Executive Summary

This report documents the validation of the updated tracking setup against contacts created November 1-14, 2025. The validation simulated all form submission flows and page interactions without submitting anything to HubSpot, then compared the simulated outputs with the actual HubSpot values to identify discrepancies and confirm improvements.

**Validation Run Date:** [To be filled by analysis script]  
**Sample Period:** November 1-14, 2025  
**Total Contacts Analyzed:** [To be filled by analysis script]  
**Overall Match Rate:** [To be filled by analysis script]%

## Methodology

### Data Collection

1. **Contact Fetching:** Used the HubSpot CRM Search API (v3) to fetch contacts created between November 1 and 14, 2025
2. **Data Normalization:** Parsed and normalized contact data including:
   - Contact properties (attribution, UTMs, form data)
   - Page views with timestamps, URLs, and referrers
   - Form submissions with details
   - Analytics data (first/last URLs, referrers, timestamps)
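
As a rough illustration of the contact-fetching step, the sketch below builds a request body for the CRM Search endpoint (`POST /crm/v3/objects/contacts/search`) that filters on `createdate`. It is written in Python for brevity and is not the PHP fetch script itself. The UTC day boundaries, the page size, and the helper name `build_search_payload` are assumptions; the property names are the ones compared later in this report.

```python
from datetime import datetime, timezone
from typing import Optional


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp()) * 1000


def build_search_payload(start: datetime, end_exclusive: datetime,
                         after: Optional[str] = None) -> dict:
    """Build a search body for contacts created in [start, end_exclusive).

    Hypothetical helper: the real fetch script may use different
    boundaries, properties, or page size.
    """
    payload = {
        "filterGroups": [{
            "filters": [{
                "propertyName": "createdate",
                "operator": "BETWEEN",
                # Date filters take epoch milliseconds.
                "value": str(to_epoch_ms(start)),
                "highValue": str(to_epoch_ms(end_exclusive) - 1),
            }]
        }],
        "sorts": [{"propertyName": "createdate", "direction": "ASCENDING"}],
        "properties": [
            "leadsource", "sign_up_type__c", "content",
            "utm_source__c", "utm_medium__c",
        ],
        "limit": 100,  # assumed page size
    }
    if after is not None:
        # Paging cursor returned by the previous page of results.
        payload["after"] = after
    return payload


# Assumption: the sample period is interpreted in UTC.
sample_start = datetime(2025, 11, 1, tzinfo=timezone.utc)
sample_end = datetime(2025, 11, 15, tzinfo=timezone.utc)
first_page = build_search_payload(sample_start, sample_end)
```

Subsequent pages would pass the cursor from the previous response as `after` until no cursor is returned.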

### Simulation Process

1. **Lead Source Simulation:** Simulated lead source determination using:
   - `determineLeadSourceFromContext()` - New comprehensive function
   - `ordio_resolve_attribution()` - Existing attribution resolver
2. **Form Submission Simulation:** Simulated all form types:
   - Lead capture (Step 1 and Step 2)
   - Template downloads
   - Add-on requests
   - Webinar registrations
   - Export workdays
   - ShiftOps submissions
   - Collect lead submissions
3. **Page Flow Simulation:** Reconstructed user journeys from page views to track lead source changes throughout sessions
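
The page flow step can be pictured roughly as follows: order a contact's page views by time, classify each view, and record every point where the lead source changes. This is a minimal Python sketch, not the PHP simulation script. The record shape (`timestamp`, `url`, `referrer`), the `toy_classifier`, and the "Direct" label are hypothetical stand-ins for the real lead source logic.

```python
from typing import Callable, Dict, List, Optional

PageView = Dict[str, str]


def track_lead_source_changes(
    page_views: List[PageView],
    classify: Callable[[PageView], str],
) -> List[Dict[str, Optional[str]]]:
    """Return one entry per change of lead source along the journey."""
    # Assumption: timestamps are ISO 8601 strings in one format and
    # time zone, so they sort correctly as plain strings.
    ordered = sorted(page_views, key=lambda view: view["timestamp"])
    changes: List[Dict[str, Optional[str]]] = []
    current: Optional[str] = None
    for view in ordered:
        source = classify(view)
        if source != current:
            changes.append({
                "timestamp": view["timestamp"],
                "url": view["url"],
                "from": current,
                "to": source,
            })
            current = source
    return changes


def toy_classifier(view: PageView) -> str:
    """Hypothetical classifier used only to exercise the sketch."""
    return "Organic Search" if "google." in view["referrer"] else "Direct"


views = [
    {"timestamp": "2025-11-05T09:30:00Z", "url": "https://example.com/pricing",
     "referrer": "https://www.google.com/"},
    {"timestamp": "2025-11-03T10:00:00Z", "url": "https://example.com/",
     "referrer": ""},
]
journey = track_lead_source_changes(views, toy_classifier)
```

Passing the classifier as a parameter keeps the journey reconstruction independent of whichever attribution function is under test.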

### Comparison Process

- Compared simulated results against actual HubSpot values
- Flagged discrepancies in:
  - `leadsource`
  - `sign_up_type__c`
  - `content`
  - UTM parameters (`utm_source__c`, `utm_medium__c`, etc.)
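
A field-by-field comparison along these lines might look like the Python sketch below (the actual comparison script is PHP). Treating missing values and empty strings as equal and trimming whitespace are assumptions about how values are normalized; only the two UTM properties named above are listed, and the function names are hypothetical.

```python
from typing import Dict, List, Optional

COMPARED_FIELDS = [
    "leadsource",
    "sign_up_type__c",
    "content",
    "utm_source__c",
    "utm_medium__c",
]


def normalize(value: Optional[str]) -> str:
    """Treat None and blank strings alike (an assumed convention)."""
    return "" if value is None else str(value).strip()


def find_discrepancies(
    simulated: Dict[str, Optional[str]],
    actual: Dict[str, Optional[str]],
    fields: List[str] = COMPARED_FIELDS,
) -> List[Dict[str, str]]:
    """List every compared field whose simulated and actual values differ."""
    discrepancies = []
    for field in fields:
        sim_value = normalize(simulated.get(field))
        act_value = normalize(actual.get(field))
        if sim_value != act_value:
            discrepancies.append(
                {"field": field, "simulated": sim_value, "actual": act_value}
            )
    return discrepancies


example = find_discrepancies(
    simulated={"leadsource": "Organic Search", "utm_medium__c": ""},
    actual={"leadsource": "Paid Search", "utm_medium__c": None},
)
```

In this example only `leadsource` is flagged, because the empty and missing `utm_medium__c` values normalize to the same string.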

### Analysis Process

- Categorized discrepancies as:
  - **Expected improvements:** Cases where new logic correctly fixes old misclassifications
  - **Unexpected issues:** Cases requiring review or fixes
  - **Edge cases:** Unusual scenarios requiring attention
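
One way to picture the categorization is a small rule table, sketched here in Python rather than the PHP analysis script. The expected `leadsource` transitions are the two documented under Expected Improvements in this report; the discrepancy record shape and the rule that an empty value on either side counts as an edge case are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List

# (actual HubSpot value, simulated value) pairs the new logic is
# expected to produce, taken from this report's findings.
EXPECTED_LEADSOURCE_CHANGES = {
    ("Paid Search", "Organic Search"),
    ("Meta", "Organic Search"),
}


def categorize(discrepancy: Dict[str, str]) -> str:
    """Assign a discrepancy record to one of the three categories."""
    actual = discrepancy["actual"]
    simulated = discrepancy["simulated"]
    if (discrepancy["field"] == "leadsource"
            and (actual, simulated) in EXPECTED_LEADSOURCE_CHANGES):
        return "expected_improvement"
    if not actual or not simulated:
        # Assumed rule: a value missing on one side is an edge case.
        return "edge_case"
    return "unexpected_issue"


def summarize(discrepancies: List[Dict[str, str]]) -> Counter:
    """Count discrepancies per category."""
    return Counter(categorize(d) for d in discrepancies)


sample = [
    {"field": "leadsource", "actual": "Paid Search", "simulated": "Organic Search"},
    {"field": "leadsource", "actual": "Organic Search", "simulated": "Meta"},
    {"field": "content", "actual": "", "simulated": "Demo"},
]
summary = summarize(sample)
```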

## Key Findings

### Overall Statistics

- **Total Contacts:** [To be filled]
- **Perfect Matches:** [To be filled] ([X]%)
- **Expected Discrepancies:** [To be filled] ([X]%)
- **Unexpected Discrepancies:** [To be filled] ([X]%)

### Lead Source Accuracy

- **Match Rate:** [To be filled]%
- **Common Patterns:**
  - [Pattern 1]: [Count] occurrences
  - [Pattern 2]: [Count] occurrences

### Form Type Accuracy

- **Lead Capture:** [Match rate]%
- **Template Downloads:** [Match rate]%
- **Add-on Requests:** [Match rate]%
- **Webinar Registrations:** [Match rate]%
- **Export Workdays:** [Match rate]%
- **ShiftOps:** [Match rate]%
- **Collect Lead:** [Match rate]%

### Discrepancy Categories

#### Expected Improvements

These discrepancies represent improvements where new logic correctly fixes old misclassifications:

1. **Paid Search → Organic Search**

   - **Count:** [X]
   - **Root Cause:** Old logic classified traffic as "Paid Search" whenever `utm_medium` was `cpc` or `ppc`, without verifying `utm_source`. New logic also requires `utm_source` verification.
   - **Impact:** Positive - correctly identifies organic traffic from search engines
   - **Recommendation:** No action needed

2. **Meta → Organic Search**

   - **Count:** [X]
   - **Root Cause:** Old logic did not check the referrer domain. New logic detects search engine referrers and classifies that traffic as organic.
   - **Impact:** Positive - prevents misattribution of organic search traffic
   - **Recommendation:** No action needed

3. **Content Value Simplification**
   - **Count:** [X]
   - **Root Cause:** New logic strips redundant prefixes and suffixes (such as "Lead Capture - " and " - Template") from `content` values
   - **Impact:** Positive - cleaner, more consistent data
   - **Recommendation:** No action needed
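
The three changes above can be illustrated with a short Python sketch. It is not the production PHP logic: the list of recognized paid search sources, the search engine names, and all function names are assumptions, since this report does not spell out what `utm_source` verification or referrer detection checks exactly.

```python
from urllib.parse import urlparse

PAID_MEDIUMS = {"cpc", "ppc"}
# Assumption: hypothetical allow-list standing in for utm_source verification.
PAID_SEARCH_SOURCES = {"google", "bing"}
# Assumption: hypothetical set of search engine host labels.
SEARCH_ENGINE_LABELS = {"google", "bing", "duckduckgo"}


def old_is_paid_search(utm_medium: str, utm_source: str) -> bool:
    """Old behaviour: utm_medium alone decides."""
    return utm_medium.lower() in PAID_MEDIUMS


def new_is_paid_search(utm_medium: str, utm_source: str) -> bool:
    """New behaviour: utm_medium and a verified utm_source are both required."""
    return (utm_medium.lower() in PAID_MEDIUMS
            and utm_source.lower() in PAID_SEARCH_SOURCES)


def is_search_engine_referrer(referrer: str) -> bool:
    """Check whether any label of the referrer host names a search engine."""
    host = urlparse(referrer).hostname or ""
    return any(label in SEARCH_ENGINE_LABELS for label in host.split("."))


def simplify_content(value: str) -> str:
    """Strip the redundant prefix and suffix named in this report."""
    prefix, suffix = "Lead Capture - ", " - Template"
    if value.startswith(prefix):
        value = value[len(prefix):]
    if value.endswith(suffix):
        value = value[:-len(suffix)]
    return value
```

For example, a visit with `utm_medium=cpc` and an empty `utm_source` counts as paid search under the old rule but not under the new one, which is the pattern behind the "Paid Search → Organic Search" discrepancies.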

#### Unexpected Issues

These discrepancies require review and potential fixes:

1. **[Issue Category 1]**

   - **Count:** [X]
   - **Description:** [Description]
   - **Root Cause:** [Root cause]
   - **Impact:** [Impact assessment]
   - **Recommendation:** [Recommendation]

2. **[Issue Category 2]**
   - **Count:** [X]
   - **Description:** [Description]
   - **Root Cause:** [Root cause]
   - **Impact:** [Impact assessment]
   - **Recommendation:** [Recommendation]

### Edge Cases Identified

1. **[Edge Case 1]**
   - **Description:** [Description]
   - **Frequency:** [X] occurrences
   - **Current Behavior:** [Current behavior]
   - **Expected Behavior:** [Expected behavior]
   - **Recommendation:** [Recommendation]

## Recommendations

### High Priority

1. **[Recommendation 1]**

   - **Rationale:** [Rationale]
   - **Implementation:** [Implementation steps]

2. **[Recommendation 2]**
   - **Rationale:** [Rationale]
   - **Implementation:** [Implementation steps]

### Medium Priority

1. **[Recommendation 3]**
   - **Rationale:** [Rationale]
   - **Implementation:** [Implementation steps]

### Low Priority

1. **[Recommendation 4]**
   - **Rationale:** [Rationale]
   - **Implementation:** [Implementation steps]

## Code Improvements Needed

### Files Requiring Updates

1. **[File Path]**

   - **Issue:** [Issue description]
   - **Fix:** [Fix description]
   - **Lines:** [Line numbers]

2. **[File Path]**
   - **Issue:** [Issue description]
   - **Fix:** [Fix description]
   - **Lines:** [Line numbers]

## Test Cases for Regression Testing

See `TRACKING_TEST_CASES.md` for comprehensive test cases covering:

- All form types
- All lead source types
- Edge cases
- UTM parameter variations
- Referrer variations
- Page path heuristics

## Conclusion

[Summary of findings and overall assessment]

## Appendix

### Files Generated

- `scripts/temp/hubspot-contacts-nov-2025-*.json` - Raw contact data
- `scripts/temp/normalized-contacts-*.json` - Normalized test cases
- `scripts/temp/simulated-lead-sources-*.json` - Lead source simulation results
- `scripts/temp/simulated-form-submissions-*.json` - Form submission simulation results
- `scripts/temp/simulated-page-flows-*.json` - Page flow simulation results
- `scripts/temp/comparison-results-*.json` - Comparison results
- `scripts/temp/discrepancy-analysis-*.json` - Detailed discrepancy analysis

### Scripts Used

- `scripts/hubspot/fetch-contacts-by-date.php` - Fetches contacts from HubSpot
- `scripts/hubspot/normalize-contact-data.php` - Normalizes contact data
- `scripts/development/testing/simulate-lead-source.php` - Simulates lead source determination
- `scripts/development/testing/simulate-form-submission.php` - Simulates form submissions
- `scripts/development/testing/simulate-page-flow.php` - Simulates page flows
- `scripts/development/testing/compare-tracking-results.php` - Compares results
- `scripts/development/testing/analyze-discrepancies.php` - Analyzes discrepancies
