# ShiftOps Team Estimation - Research Findings


**Last Updated:** 2025-11-20

## Industry Benchmarks Research

### Restaurant Staffing Ratios

**General Guidelines:**

- **Fine Dining:** 1 server per 3-4 tables (typically 12-20 seats per server)
- **Casual Dining:** 1 server per 5-6 tables (typically 20-30 seats per server)
- **Fast Casual:** 1 employee per 8-10 tables (typically 30-40 seats per employee)
- **Fast Food:** 1 employee per 15-20 customers during peak hours

**Kitchen Staff:**

- **Small restaurant (<50 seats):** 2-4 kitchen staff (chef + 1-3 cooks)
- **Medium restaurant (50-100 seats):** 4-8 kitchen staff
- **Large restaurant (100+ seats):** 8-15+ kitchen staff

**Total Staff Ratios:**

- **Small restaurant:** 5-10 total staff
- **Medium restaurant:** 10-20 total staff
- **Large restaurant:** 20-40+ total staff

### Cafe Staffing Ratios

- **Small cafe (<30 seats):** 2-4 staff (baristas + cashier)
- **Medium cafe (30-60 seats):** 4-6 staff
- **Large cafe (60+ seats):** 6-10+ staff
- **Peak hours:** Typically need 1.5-2x base staffing

### Bar Staffing Ratios

- **Small bar (<50 capacity):** 2-4 staff (bartenders + servers)
- **Medium bar (50-100 capacity):** 4-6 staff
- **Large bar (100+ capacity):** 6-12+ staff
- **Peak hours:** Typically need 2-3x base staffing

### Retail Store Staffing

- **Small store (<500 sqm):** 2-4 staff
- **Medium store (500-2000 sqm):** 4-8 staff
- **Large store (2000+ sqm):** 8-20+ staff
- **Ratio:** Approximately 1 staff per 200-300 sqm
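
As a worked illustration of the area ratio, the sketch below turns a floor area into a staff range. The function name, the rounding choice (round up), and the floor of 2 staff are assumptions made for this example, not part of any existing implementation.

```python
import math

# Ratio from the guideline above: roughly 1 staff member per 200-300 sqm.
SQM_PER_STAFF_SPARSE = 300  # fewer staff per unit of floor area
SQM_PER_STAFF_DENSE = 200   # more staff per unit of floor area


def retail_staff_range(area_sqm: float, min_staff: int = 2) -> tuple[int, int]:
    """Return a (low, high) staff estimate for a store of the given floor area.

    min_staff=2 mirrors the lower end of the small-store band; using it as a
    floor is an assumption of this sketch.
    """
    if area_sqm <= 0:
        raise ValueError("area_sqm must be positive")
    low = max(min_staff, math.ceil(area_sqm / SQM_PER_STAFF_SPARSE))
    high = max(min_staff, math.ceil(area_sqm / SQM_PER_STAFF_DENSE))
    return low, high


print(retail_staff_range(1200))  # (4, 6), inside the medium-store band
```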

### Healthcare Staffing

- **Hospital:** Highly variable based on department and patient load
- **Pharmacy:** Typically 3-6 staff (pharmacist + technicians)
- **Ratio:** Not easily correlated with reviews/ratings

## Review-to-Staff Correlation Analysis

### Key Findings from Current Implementation Analysis

**Current Assumptions:**

1. Reviews correlate with customer volume
2. Customer volume correlates with team size
3. The Analyzer uses square-root scaling (diminishing returns)
4. The Cost Calculator uses linear scaling with caps
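
To make the difference between the two scaling assumptions concrete, here is a minimal sketch of each. The function names and every constant (`base`, `k`, `per_review`, `cap`) are hypothetical placeholders chosen for illustration; they are not the values used by the Analyzer or the Cost Calculator.

```python
import math


def sqrt_scaled_team(reviews: int, base: float = 2.0, k: float = 0.5) -> float:
    """Square-root scaling: each additional review adds less than the last."""
    return base + k * math.sqrt(max(reviews, 0))


def linear_capped_team(reviews: int, base: float = 2.0,
                       per_review: float = 0.02, cap: float = 20.0) -> float:
    """Linear scaling that stops growing once it reaches the cap."""
    return min(cap, base + per_review * max(reviews, 0))


for n in (100, 400, 2000):
    print(n, round(sqrt_scaled_team(n), 1), round(linear_capped_team(n), 1))
```

With these placeholder constants the two models disagree at every review count, which is one reason a single shared formula is easier to reason about.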

**Issues Identified:**

- No validation against actual staffing data
- Review count varies significantly by business type and location
- Review rate depends on customer demographics and business culture
- No consideration of review recency or velocity

### Research-Based Insights

**Review Rate Factors:**

- **Restaurants:** 5-15% of customers leave reviews (varies by type)
- **Cafes:** 3-8% of customers leave reviews
- **Bars:** 2-5% of customers leave reviews
- **Retail:** 1-3% of customers leave reviews
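
The review rates above imply very different customer volumes for the same review count. The sketch below inverts the rates to get a customer range; the dictionary keys and function name are assumptions for illustration only.

```python
# Fraction of customers who leave a review, taken from the ranges above.
REVIEW_RATE = {
    "restaurant": (0.05, 0.15),
    "cafe": (0.03, 0.08),
    "bar": (0.02, 0.05),
    "retail": (0.01, 0.03),
}


def implied_customer_range(review_count: int, business_type: str) -> tuple[int, int]:
    """Return the (fewest, most) customers consistent with a review count."""
    low_rate, high_rate = REVIEW_RATE[business_type]
    # A high review rate means fewer customers per review, and vice versa.
    fewest = round(review_count / high_rate)
    most = round(review_count / low_rate)
    return fewest, most


print(implied_customer_range(300, "restaurant"))  # (2000, 6000)
print(implied_customer_range(300, "retail"))      # (10000, 30000)
```

The same 300 reviews map to roughly five times as many customers for a retail store as for a restaurant, so review count should not be compared across business types without this adjustment.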

**Review-to-Staff Correlation:**

- Weak correlation for small businesses (<50 reviews)
- Moderate correlation for medium businesses (50-500 reviews)
- Stronger correlation for large businesses (500+ reviews)
- Diminishing returns at very high review counts (1000+)

## Estimation Methodologies

### Multi-Factor Models

**Advantages:**

- Consider multiple data points
- Typically more accurate than single-factor models
- Can handle missing data gracefully

**Current Implementation:**

- Cost Calculator uses 4 factors with weights:
  - Customer Volume (40%)
  - Operating Hours (30%)
  - Service Complexity (20%)
  - Quality/Scale (10%)

**Recommendations:**

- Add location type factor (urban/suburban/rural)
- Add seasonal variation factor
- Consider review velocity (reviews per month)
- Add seating capacity if available from Google Places

### Bayesian Approaches

**Potential Benefits:**

- Incorporate prior knowledge about industry averages
- Update estimates as more data becomes available
- Provide confidence intervals

**Implementation Complexity:**

- Requires statistical modeling
- Needs historical data for priors
- More computationally intensive

**Recommendation:** Consider for future enhancement, not immediate implementation
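
For reference if this is picked up later, one simple form such an approach could take is a conjugate normal-normal update: the prior is the industry average, and a model estimate is treated as a noisy observation. This is a sketch under that assumption; the function name and the example variances are hypothetical.

```python
import math


def update_normal_prior(prior_mean: float, prior_var: float,
                        obs: float, obs_var: float) -> tuple[float, float]:
    """Posterior mean and variance for a normal prior and one normal observation."""
    if prior_var <= 0 or obs_var <= 0:
        raise ValueError("variances must be positive")
    precision = 1.0 / prior_var + 1.0 / obs_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var


# Hypothetical numbers: industry prior of 10 staff, model estimate of 14,
# both with variance 4.
mean, var = update_normal_prior(10.0, 4.0, 14.0, 4.0)
half_width = 1.96 * math.sqrt(var)  # approximate 95% interval half-width
print(mean, var, (mean - half_width, mean + half_width))
```

With equal variances the posterior mean lands halfway between prior and observation, and the variance halves, which is where the confidence interval comes from.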

### Machine Learning Models

**Potential Benefits:**

- Can learn complex patterns
- Automatically discover important features
- Improve over time with more data

**Challenges:**

- Requires large training dataset
- Needs labeled data (actual team sizes)
- Black box - less interpretable

**Recommendation:** Not recommended for current implementation due to lack of training data

## Best Practices from Scheduling Software

### Common Factors Used:

1. **Operating Hours** - More hours = more staff needed
2. **Peak Hours** - Need additional staff during peak times
3. **Service Types** - More services = more complexity = more staff
4. **Business Size** - Seating capacity, square footage
5. **Location Type** - Urban vs suburban vs rural
6. **Seasonality** - Seasonal variations in demand
7. **Historical Data** - Past staffing patterns

### Industry Standards:

- **Full-time equivalent (FTE):** 40 hours/week per employee
- **Part-time:** Typically 20-30 hours/week
- **Peak staffing:** Typically 1.5-2x base staffing during peak hours (2-3x for bars, per the benchmarks above)
- **Coverage:** Need staff for all operating hours
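
The standards above can be combined into a rough FTE calculation: total labour hours per week divided by 40. The sketch below assumes a constant number of staff on shift outside peak hours and a single peak multiplier; both simplifications, and the function name, are assumptions of this example.

```python
FTE_HOURS_PER_WEEK = 40


def required_fte(weekly_open_hours: float, staff_on_shift: float,
                 peak_hours: float = 0.0, peak_multiplier: float = 1.5) -> float:
    """Weekly labour hours divided by a 40-hour full-time equivalent."""
    if not 0 <= peak_hours <= weekly_open_hours:
        raise ValueError("peak_hours must be between 0 and weekly_open_hours")
    base_labour = (weekly_open_hours - peak_hours) * staff_on_shift
    peak_labour = peak_hours * staff_on_shift * peak_multiplier
    return (base_labour + peak_labour) / FTE_HOURS_PER_WEEK


print(required_fte(80, 3))                                     # 6.0
print(required_fte(80, 3, peak_hours=20, peak_multiplier=2.0))  # 7.5
```

Note that FTE is not headcount: 7.5 FTE covered mostly by part-time staff at 20-30 hours/week would mean a larger team.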

## Google Places API Data Availability

### Reliable Data:

- **Review Count:** Available and reliable whenever a place has at least one review
- **Rating:** Available and reliable whenever a place has at least one review
- **Business Type:** Always available, reliable
- **Opening Hours:** Usually available, format varies
- **Price Level:** Often available (0-4 scale in the legacy API, where 0 means free)
- **Service Options:** Often available (boolean flags)

### Less Reliable Data:

- **Seating Capacity:** Rarely available
- **Square Footage:** Not available
- **Location Type:** Not directly available (can infer from address)
- **Review Velocity:** Not directly available (would need historical tracking)

### Data Quality Indicators:

- **High Quality:** Review count >100, complete opening hours, price level present
- **Medium Quality:** Review count 50-100, partial opening hours
- **Low Quality:** Review count <50, missing opening hours, no price level
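
The tiers above do not say how to classify mixed cases (for example, many reviews but partial hours). The sketch below shows one possible reading: "high" requires all three high-quality signals, and anything with fewer than 50 reviews or no hours is "low". That tie-breaking rule and all names are assumptions.

```python
from enum import Enum


class Hours(Enum):
    MISSING = 0
    PARTIAL = 1
    COMPLETE = 2


def data_quality(review_count: int, hours: Hours, has_price_level: bool) -> str:
    """Classify input data quality as 'high', 'medium', or 'low'."""
    if review_count > 100 and hours is Hours.COMPLETE and has_price_level:
        return "high"
    # Assumption: a missing price level alone does not force "low".
    if review_count >= 50 and hours is not Hours.MISSING:
        return "medium"
    return "low"


print(data_quality(250, Hours.COMPLETE, True))   # high
print(data_quality(250, Hours.PARTIAL, True))    # medium
print(data_quality(30, Hours.COMPLETE, True))    # low
```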

## Recommendations for Improved Model

### Factor Weights (Proposed):

1. **Customer Volume Proxy (35%)** - Reviews with business age consideration
2. **Operating Hours (25%)** - Weekly hours / 40 (standard FTE)
3. **Service Complexity (20%)** - Number of service types
4. **Quality/Scale (10%)** - Rating + price level
5. **Location Type (5%)** - Urban/suburban/rural (if inferable)
6. **Review Velocity (5%)** - Reviews per month (if calculable)
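
Because the last two factors are only sometimes available, the weights need a rule for missing inputs. One option, sketched below, is to renormalize over the factors that are present. It assumes each factor has already been scaled to the range 0-1; that scaling, the key names, and the renormalization rule itself are assumptions, not a description of existing code.

```python
from typing import Optional

PROPOSED_WEIGHTS = {
    "customer_volume": 0.35,
    "operating_hours": 0.25,
    "service_complexity": 0.20,
    "quality_scale": 0.10,
    "location_type": 0.05,
    "review_velocity": 0.05,
}


def combined_score(factors: dict[str, Optional[float]]) -> float:
    """Weighted average over the known factors whose value is not None."""
    available = {name: value for name, value in factors.items()
                 if name in PROPOSED_WEIGHTS and value is not None}
    if not available:
        raise ValueError("no usable factors")
    total_weight = sum(PROPOSED_WEIGHTS[name] for name in available)
    weighted = sum(PROPOSED_WEIGHTS[name] * value
                   for name, value in available.items())
    return weighted / total_weight


score = combined_score({
    "customer_volume": 1.0,
    "operating_hours": 1.0,
    "service_complexity": 0.5,
    "quality_scale": 0.5,
    "location_type": None,    # not inferable for this place
    "review_velocity": None,  # no historical tracking yet
})
print(round(score, 3))
```

Renormalizing keeps scores comparable between places with and without the optional factors, at the cost of giving the remaining factors slightly more influence.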

### Confidence Calculation (Improved):

**Factors:**

- Review count >100: +3 points
- Review count 50-100: +2 points
- Review count 25-49: +1 point
- Complete opening hours: +2 points
- Price level present: +1 point
- Service options present: +1 point
- Multiple business types: +1 point

**Confidence Levels:**

- High: 7+ points
- Medium: 4-6 points
- Low: <4 points
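
The point scheme above translates directly into code. The sketch below is one way to write it; the function and parameter names are hypothetical, and a count of exactly 50 is scored in the 50-100 band.

```python
def confidence_points(review_count: int, complete_hours: bool,
                      has_price_level: bool, has_service_options: bool,
                      business_type_count: int) -> int:
    """Sum the confidence points for one place (maximum of 8)."""
    if review_count > 100:
        points = 3
    elif review_count >= 50:
        points = 2
    elif review_count >= 25:
        points = 1
    else:
        points = 0
    points += 2 if complete_hours else 0
    points += 1 if has_price_level else 0
    points += 1 if has_service_options else 0
    points += 1 if business_type_count > 1 else 0
    return points


def confidence_level(points: int) -> str:
    """Map a point total to the High / Medium / Low levels."""
    if points >= 7:
        return "high"
    if points >= 4:
        return "medium"
    return "low"


print(confidence_level(confidence_points(500, True, True, True, 2)))    # high
print(confidence_level(confidence_points(60, True, False, False, 1)))   # medium
```

Since the maximum is 8 points, "high" is only reachable with more than 100 reviews plus nearly every other signal present.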

### Validation Rules:

1. **Minimum Team Size:** Based on industry base staffing
2. **Maximum Team Size:** Cap at a reasonable ratio (e.g., at most 1 employee per 15 reviews)
3. **Sanity Checks:** Flag estimates that seem unrealistic
4. **Industry-Specific Bounds:** Different min/max per industry
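
A minimal sketch of how these rules might fit together is shown below. The per-industry bounds are hypothetical values loosely taken from the smallest and largest figures in the benchmark sections, the "15 reviews per employee" cap is read as `review_count / 15`, and the cap is never allowed to fall below the industry minimum. All of these are assumptions for illustration.

```python
import math

# Hypothetical (min, max) bounds per industry.
INDUSTRY_BOUNDS = {
    "restaurant": (5, 40),
    "cafe": (2, 10),
    "bar": (2, 12),
    "retail": (2, 20),
}
REVIEWS_PER_EMPLOYEE_CAP = 15


def validate_estimate(raw: float, industry: str,
                      review_count: int) -> tuple[int, list[str]]:
    """Clamp a raw estimate to sane bounds and report which rules fired."""
    low, high = INDUSTRY_BOUNDS[industry]
    # Review-based cap, kept at or above the industry minimum.
    review_cap = max(low, math.floor(review_count / REVIEWS_PER_EMPLOYEE_CAP))
    upper = min(high, review_cap)
    estimate = round(raw)
    flags: list[str] = []
    if estimate < low:
        flags.append("below_industry_minimum")
        estimate = low
    elif estimate > upper:
        flags.append("above_maximum")
        estimate = upper
    return estimate, flags


print(validate_estimate(30, "cafe", 600))  # (10, ['above_maximum'])
print(validate_estimate(6, "retail", 300)) # (6, [])
```

Returning the flags alongside the clamped value lets callers log or surface unrealistic raw estimates instead of silently correcting them.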

### Fallback Strategy:

1. **Primary:** Use multi-factor model with all available data
2. **Fallback 1:** Use simplified model if hours missing
3. **Fallback 2:** Use review-only model if most data missing
4. **Fallback 3:** Use industry average if no reviews
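
The fallback order can be expressed as a single selection function so that every implementation picks the same model for the same inputs. In the sketch below, the `PlaceData` fields and the model names are hypothetical, and "most data missing" is interpreted as "no hours, no price level, and no service options".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlaceData:
    review_count: Optional[int] = None
    weekly_hours: Optional[float] = None
    price_level: Optional[int] = None
    service_option_count: Optional[int] = None


def choose_model(place: PlaceData) -> str:
    """Return the name of the estimation model to use for this place."""
    if not place.review_count:  # None and 0 both mean "no reviews"
        return "industry_average"
    if place.weekly_hours is not None:
        return "multi_factor"
    if place.price_level is not None or place.service_option_count is not None:
        return "simplified"
    return "review_only"


print(choose_model(PlaceData(200, 70.0, 2, 3)))  # multi_factor
print(choose_model(PlaceData(200, None, 2)))     # simplified
print(choose_model(PlaceData(200)))              # review_only
print(choose_model(PlaceData()))                 # industry_average
```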

## Known Limitations

1. **No Actual Staffing Data:** Cannot validate against real team sizes
2. **Review Rate Variability:** Different businesses have different review rates
3. **Missing Capacity Data:** No seating capacity or square footage
4. **No Historical Data:** Cannot track changes over time
5. **Location Inference:** Cannot reliably determine urban/suburban/rural
6. **Seasonality:** No seasonal variation consideration
7. **Multi-Location:** Cannot detect chains vs single locations

## Next Steps

1. **Implement Improved Model:** Based on research findings
2. **Add Validation:** Sanity checks and bounds
3. **Improve Confidence:** More comprehensive confidence calculation
4. **Standardize Fallbacks:** Single fallback formula across all implementations
5. **Add Logging:** Track estimation accuracy over time
6. **Collect Feedback:** User feedback on estimate accuracy
7. **Iterate:** Refine based on real-world feedback
