# Python Excel Generation Libraries Research

**Last Updated:** 2025-11-20

## Overview

This document evaluates Python libraries for Excel generation, focusing on advanced features like charts, pivot tables, and sparklines that PhpSpreadsheet cannot generate natively.

## Libraries Evaluated

1. **openpyxl** - Most comprehensive Python Excel library
2. **XlsxWriter** - Write-only, optimized for charts
3. **pandas + openpyxl** - Data manipulation + Excel generation
4. **xlwings** - Excel automation with VBA integration

## 1. openpyxl

### Overview

- **Maintainer:** Eric Gazoni, Charlie Clark
- **License:** MIT
- **GitHub:** https://github.com/openpyxl/openpyxl
- **Documentation:** https://openpyxl.readthedocs.io/
- **PyPI:** https://pypi.org/project/openpyxl/

### Strengths

- ✅ **Full Excel Support:**

  - Charts (all types: bar, line, pie, scatter, area, etc.)
  - Pivot tables
  - Sparklines
  - Tables (structured references)
  - Conditional formatting (all types including color scales, data bars, icon sets)
  - Data validation
  - Formulas (all Excel functions)
  - Named ranges
  - Sheet protection
  - Images
  - Comments

- ✅ **Read and Write:**

  - Can read existing Excel files
  - Can modify existing files
  - Can create new files

- ✅ **Advanced Features:**

  - Chart customization (colors, styles, legends, axes)
  - Pivot table configuration
  - Sparkline groups
  - Table styling
  - Advanced conditional formatting

- ✅ **Active Development:**
  - Regular updates
  - Good documentation
  - Large community

### Weaknesses

- ⚠️ Slower than XlsxWriter (read/write overhead)
- ⚠️ More memory usage than write-only libraries
- ⚠️ Python dependency (requires Python environment)

### Performance

- **Memory Usage:** Moderate
- **Speed:** Moderate (faster than PhpSpreadsheet for charts)
- **Large Files:** Good performance

### Best For

- **Charts** - Excellent support, all chart types
- **Pivot Tables** - Full support
- **Sparklines** - Native support
- **Advanced Features** - Best Python library for Excel

### Example Usage

```python
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active

# Add data
ws['A1'] = 'Month'
ws['B1'] = 'Sales'
for i in range(2, 14):
    ws[f'A{i}'] = f'Month {i-1}'
    ws[f'B{i}'] = i * 100

# Create chart
chart = BarChart()
chart.title = "Monthly Sales"
chart.x_axis.title = "Month"
chart.y_axis.title = "Sales"

data = Reference(ws, min_col=2, min_row=1, max_row=13)
cats = Reference(ws, min_col=1, min_row=2, max_row=13)
chart.add_data(data, titles_from_data=True)
chart.set_categories(cats)

ws.add_chart(chart, "D2")
wb.save("sales_chart.xlsx")
```

## 2. XlsxWriter

### Overview

- **Maintainer:** John McNamara
- **License:** BSD
- **GitHub:** https://github.com/jmcnamara/XlsxWriter
- **Documentation:** https://xlsxwriter.readthedocs.io/
- **PyPI:** https://pypi.org/project/XlsxWriter/

### Strengths

- ✅ **Write-Only (Fast):**

  - Optimized for writing
  - Lower memory usage
  - Faster than openpyxl for writing

- ✅ **Excellent Chart Support:**

  - All chart types
  - Extensive chart customization
  - Chart formatting options

- ✅ **Advanced Features:**

  - Conditional formatting (all types)
  - Data validation
  - Tables
  - Sparklines
  - Images
  - Comments

- ✅ **Performance:**
  - Fastest Python library for writing
  - Low memory usage
  - Good for large files

### Weaknesses

- ❌ Write-only (cannot read/modify existing files)
- ❌ No pivot table support
- ❌ Python dependency

### Performance

- **Memory Usage:** Low
- **Speed:** Fast (write-only)
- **Large Files:** Excellent performance

### Best For

- **Charts** - Excellent, optimized for chart generation
- **Performance** - Fastest Python option
- **Large Files** - Best performance
- **Write-Only Scenarios** - Perfect for generation

### Example Usage

```python
import xlsxwriter

workbook = xlsxwriter.Workbook('chart.xlsx')
worksheet = workbook.add_worksheet()

# Add data
data = [10, 40, 50, 20, 10, 50]
worksheet.write_column('A1', data)

# Create chart
chart = workbook.add_chart({'type': 'column'})
chart.add_series({'values': '=Sheet1!$A$1:$A$6'})
chart.set_title({'name': 'Sales Chart'})

worksheet.insert_chart('C2', chart)
workbook.close()
```

## 3. pandas + openpyxl

### Overview

- **pandas:** Data manipulation library
- **openpyxl:** Excel engine (used by pandas)
- **Combination:** Powerful data processing + Excel generation

### Strengths

- ✅ **Data Manipulation:**

  - Powerful data processing
  - Data transformation
  - Data analysis
  - Easy data export

- ✅ **Excel Integration:**

  - Uses openpyxl as engine
  - Can leverage openpyxl features
  - Easy data-to-Excel workflow

- ✅ **Best For:**
  - Data-heavy templates
  - Complex calculations
  - Data analysis templates

### Weaknesses

- ⚠️ Requires pandas knowledge
- ⚠️ More complex setup
- ⚠️ Overkill for simple templates

### Best For

- Templates requiring data processing
- Complex calculations
- Data analysis features
- When data manipulation is needed

### Example Usage

```python
import pandas as pd
from openpyxl import load_workbook
from openpyxl.chart import BarChart

# Create DataFrame
df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar'],
    'Sales': [100, 200, 150]
})

# Write to Excel
df.to_excel('sales.xlsx', index=False)

# Add chart using openpyxl
wb = load_workbook('sales.xlsx')
ws = wb.active
chart = BarChart()
# ... add chart
wb.save('sales.xlsx')
```

## 4. xlwings

### Overview

- **Maintainer:** Zoomer Analytics
- **License:** BSD
- **GitHub:** https://github.com/xlwings/xlwings
- **Documentation:** https://docs.xlwings.org/
- **PyPI:** https://pypi.org/project/xlwings/

### Strengths

- ✅ **Excel Automation:**

  - Controls Excel application
  - Can use Excel's native features
  - VBA integration
  - Can leverage Excel's chart engine

- ✅ **Advanced Features:**

  - Can use Excel's native charts
  - Can use Excel's pivot tables
  - Can execute VBA macros
  - Can interact with Excel UI

- ✅ **Best For:**
  - Complex Excel automation
  - VBA macro execution
  - Excel application control

### Weaknesses

- ❌ Requires Excel installed (Windows/Mac)
- ❌ Not suitable for server environments
- ❌ Slower (requires Excel application)
- ❌ Platform-dependent

### Best For

- **Desktop Automation** - When Excel is available
- **VBA Macros** - Execute existing macros
- **Excel Integration** - Control Excel application
- **Not Recommended** - For server-side generation

## Feature Comparison Matrix

| Feature                    | openpyxl     | XlsxWriter   | pandas+openpyxl | xlwings                |
| -------------------------- | ------------ | ------------ | --------------- | ---------------------- |
| **Charts**                 | ✅ Full      | ✅ Full      | ✅ Full         | ✅ Full (via Excel)    |
| **Pivot Tables**           | ✅ Yes       | ❌ No        | ✅ Yes          | ✅ Yes (via Excel)     |
| **Sparklines**             | ✅ Yes       | ✅ Yes       | ✅ Yes          | ✅ Yes (via Excel)     |
| **Tables**                 | ✅ Yes       | ✅ Yes       | ✅ Yes          | ✅ Yes                 |
| **Conditional Formatting** | ✅ Full      | ✅ Full      | ✅ Full         | ✅ Full                |
| **Data Validation**        | ✅ Yes       | ✅ Yes       | ✅ Yes          | ✅ Yes                 |
| **Formulas**               | ✅ Yes       | ✅ Yes       | ✅ Yes          | ✅ Yes                 |
| **Read Files**             | ✅ Yes       | ❌ No        | ✅ Yes          | ✅ Yes                 |
| **Write Files**            | ✅ Yes       | ✅ Yes       | ✅ Yes          | ✅ Yes                 |
| **VBA Macros**             | ❌ No        | ❌ No        | ❌ No           | ✅ Yes                 |
| **Performance**            | ⚠️ Moderate  | ✅ Fast      | ⚠️ Moderate     | ⚠️ Slow                |
| **Memory Usage**           | ⚠️ Moderate  | ✅ Low       | ⚠️ Moderate     | ⚠️ High                |
| **Server Compatible**      | ✅ Yes       | ✅ Yes       | ✅ Yes          | ❌ No (requires Excel) |
| **Documentation**          | ✅ Excellent | ✅ Excellent | ✅ Good         | ✅ Good                |

## Recommendations

### For Chart Generation

**Best Choice: openpyxl or XlsxWriter**

- **openpyxl:** If you need read/write capabilities
- **XlsxWriter:** If you only need write (faster)

**Integration Strategy:**

1. Generate base template with PhpSpreadsheet (formulas, validation, structure)
2. Use Python script with openpyxl/XlsxWriter to add charts
3. Save final template

### For Pivot Tables

**Best Choice: openpyxl**

- Only Python library with native pivot table support
- Full configuration options
- Can be added programmatically

**Integration Strategy:**

1. Generate base template with PhpSpreadsheet
2. Use Python script with openpyxl to add pivot tables
3. Save final template

### For Sparklines

**Best Choice: openpyxl or XlsxWriter**

- Both support sparklines
- openpyxl: More flexible
- XlsxWriter: Faster

**Integration Strategy:**

1. Generate base template with PhpSpreadsheet
2. Use Python script to add sparklines
3. Save final template

### For Maximum Quality Templates

**Hybrid Approach:**

1. **PhpSpreadsheet** - Core generation (formulas, validation, structure)
2. **openpyxl** - Charts, pivot tables, sparklines
3. **Manual Polish** - Final adjustments if needed

## Integration Workflow

### Proposed Workflow

1. **Generate Base Template (PHP):**

   ```php
   // Use PhpSpreadsheet to generate:
   // - Structure and sheets
   // - Formulas
   // - Data validation
   // - Basic formatting
   // - Conditional formatting (basic)
   ```

2. **Enhance with Python:**

   ```python
   # Use openpyxl to:
   # - Add charts
   # - Add pivot tables
   # - Add sparklines
   # - Enhance conditional formatting
   ```

3. **Final Output:**
   - Complete Excel template with all features
   - Best quality possible
   - All advanced features included

### Implementation Example

```python
# enhance_template.py
from openpyxl import load_workbook
from openpyxl.chart import BarChart, Reference

def add_charts_to_template(template_path, output_path):
    """Add charts to PhpSpreadsheet-generated template"""
    wb = load_workbook(template_path)
    ws = wb['Report']  # Assume 'Report' sheet exists

    # Create chart
    chart = BarChart()
    chart.title = "Monthly Sales"
    data = Reference(ws, min_col=2, min_row=1, max_row=13)
    cats = Reference(ws, min_col=1, min_row=2, max_row=13)
    chart.add_data(data, titles_from_data=True)
    chart.set_categories(cats)

    # Add chart to sheet
    ws.add_chart(chart, "F2")

    # Save enhanced template
    wb.save(output_path)

if __name__ == '__main__':
    add_charts_to_template('base_template.xlsx', 'enhanced_template.xlsx')
```

## Conclusion

**For our use case (best-ever templates):**

1. **Use PhpSpreadsheet** for core generation
2. **Use openpyxl** for charts, pivot tables, sparklines
3. **Create Python enhancement script** to add advanced features
4. **Document workflow** for Cursor AI agent

**Advantages:**

- Best of both worlds
- Maximum feature support
- High-quality output
- Flexible approach

**Next Steps:**

1. Create Python enhancement script
2. Test integration workflow
3. Document usage for Cursor AI
4. Create example templates with charts
