# Blog Title Keyword Highlighting Guide

**Last Updated:** 2026-02-11

## Recent Improvements (2026-01-09)

The keyword highlighting logic has been significantly improved:

- **Smart Topic Extraction**: For long phrases before separators, the system now extracts the main topic keyword (e.g., "Digitalisierung" from "Die größten Fehler bei der Digitalisierung in der Gastronomie")
- **Compound Word Detection**: Better handling of compound words and technical terms (e.g., "WhatsApp-Chaos", "8D-Report", "Management-Team")
- **Quoted Keywords**: Keywords in quotes are automatically highlighted (e.g., "Digitalisierung" in "Was heißt eigentlich „Digitalisierung" für Restaurants?")
- **Long Noun Detection**: Identifies main topic nouns by length and common noun endings (-ung, -tion, -ismus, -ität)
- **Phrase Length Limiting**: Automatically limits highlighted phrases to 1-3 words for better visual balance
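
The long-noun detection and phrase-length limit listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `blog-template-helpers.php`: the function names `char_count_sketch()`, `extract_topic_noun_sketch()` and `limit_phrase_sketch()` are hypothetical, and the "longest matching word wins" rule is an assumption.

```php
<?php
// Illustrative sketch only: all function names and the tie-breaking rule
// are assumptions, not the real implementation.

/** Count UTF-8 characters without requiring mbstring. */
function char_count_sketch(string $text): int
{
    return (int) preg_match_all('/./us', $text);
}

/** Pick the longest word that ends in a common German noun suffix. */
function extract_topic_noun_sketch(string $phrase): ?string
{
    $best = null;
    $words = preg_split('/\s+/u', trim($phrase), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    foreach ($words as $word) {
        // Strip surrounding punctuation such as quotes or commas.
        $word = (string) preg_replace('/^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/u', '', $word);
        if (preg_match('/(ung|tion|ismus|ität)$/u', $word) !== 1) {
            continue;
        }
        if ($best === null || char_count_sketch($word) > char_count_sketch($best)) {
            $best = $word;
        }
    }
    return $best;
}

/** Cap a phrase at a maximum number of words (3 in this guide). */
function limit_phrase_sketch(string $phrase, int $maxWords = 3): string
{
    $words = preg_split('/\s+/u', trim($phrase), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    return implode(' ', array_slice($words, 0, $maxWords));
}
```

With the example title from the list, `extract_topic_noun_sketch('Die größten Fehler bei der Digitalisierung in der Gastronomie')` picks "Digitalisierung", because it is the only word with one of the listed endings.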

This guide covers the automatic keyword highlighting system for blog post titles.

## Overview

The blog title keyword highlighting system automatically identifies and highlights 1-2 keywords in blog post titles using Ordio blue (#4D8EF3). This enhances visual appeal, improves brand consistency, and makes titles more engaging while maintaining readability and SEO.

## Purpose and Benefits

- **Visual Appeal**: Highlights draw attention to key terms, making titles more scannable
- **Brand Consistency**: Uses Ordio blue (#4D8EF3) consistently across all titles
- **SEO-Friendly**: Maintains semantic HTML structure (keywords remain in H1)
- **Automatic**: No manual configuration needed - works intelligently for all titles
- **Accessible**: Maintains proper contrast ratios and semantic structure

## How It Works

### Keyword Selection Algorithm

The system uses a three-tier priority system:

#### Priority 1: Brand Keywords

- **"Ordio"** is always highlighted when present in the title
- Example: "Dein Weg zu Ordio: Unser Sales-Prozess" → "**Ordio**" highlighted

#### Priority 2: Structural Keywords

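- If the title contains a colon (:), the text before the colon is highlighted
- If the title contains an en dash (–), the text before the dash is highlighted
- Long phrases before a separator are reduced to their main topic keyword (see Recent Improvements)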
- Examples:
  - "Employer Branding – Definition, Strategien & Tipps" → "**Employer Branding**" highlighted
  - "24-Stunden-Dienst: Regeln, Vorteile & Pflichten" → "**24-Stunden-Dienst**" highlighted

#### Priority 3: First Significant Words

- If no structural separator is found, the first 1-2 significant words are highlighted
- Excludes German stop words: der, die, das, und, oder, für, mit, im, etc.
- Example: "Der ultimative Leitfaden für Schichtplanung" → "**ultimative Leitfaden**" highlighted
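
The three tiers can be sketched as a single selection function. This is a simplified illustration under stated assumptions: `select_keywords_sketch()` is a hypothetical name, the stop-word list is only the subset quoted above, and the sketch returns the first tier that matches. The real `highlight_title_keywords()` can combine a brand keyword with other words, as some examples in this guide show.

```php
<?php
// Illustrative sketch only: NOT the real highlight_title_keywords().
// The function name, the short stop-word list and the first-match-wins
// rule are assumptions made for this example.

function select_keywords_sketch(string $title): array
{
    // Priority 1: the brand keyword wins whenever it is present.
    if (preg_match('/\bOrdio\b/iu', $title) === 1) {
        return ['Ordio'];
    }

    // Priority 2: text before the first colon or en dash.
    if (preg_match('/^(.+?)\s*(?::|–)\s/u', $title, $m) === 1) {
        return [trim($m[1])];
    }

    // Priority 3: first one or two consecutive non-stop words.
    $stopWords = ['der', 'die', 'das', 'und', 'oder', 'für', 'mit', 'im'];
    $picked = [];
    $words = preg_split('/\s+/u', trim($title), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    foreach ($words as $word) {
        // strtolower() only lowercases ASCII letters; enough for this list.
        if (in_array(strtolower($word), $stopWords, true)) {
            if ($picked !== []) {
                break; // a stop word ends the phrase
            }
            continue; // skip leading stop words
        }
        $picked[] = $word;
        if (count($picked) === 2) {
            break;
        }
    }
    return $picked === [] ? [] : [implode(' ', $picked)];
}
```

For the documented titles, this sketch selects "Ordio", "Employer Branding", "24-Stunden-Dienst" and "ultimative Leitfaden" respectively.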

### Technical Implementation

**Function**: `highlight_title_keywords($title)` in `v2/config/blog-template-helpers.php`

**Output**: HTML string with keywords wrapped in `<span class="title-keyword">`

**CSS Class**: `.title-keyword` styled with `var(--ordio-blue)` (#4D8EF3)
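
To make the output format concrete, the sketch below wraps one keyword in the documented `<span class="title-keyword">`. The helper `wrap_keyword_sketch()` is hypothetical, and escaping the title with `htmlspecialchars()` before wrapping is an assumption about how such a helper would typically be written, not a confirmed detail of the real function.

```php
<?php
// Illustrative sketch only: wrap_keyword_sketch() is a hypothetical helper,
// not the real highlight_title_keywords().

function wrap_keyword_sketch(string $title, string $keyword): string
{
    // Escape first so the title cannot inject markup into the H1.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $safeKeyword = htmlspecialchars($keyword, ENT_QUOTES, 'UTF-8');
    if ($safeKeyword === '') {
        return $safeTitle;
    }

    // preg_quote() neutralises regex metacharacters inside the keyword.
    $pattern = '/' . preg_quote($safeKeyword, '/') . '/u';

    // Wrap only the first occurrence (limit = 1); $0 is the matched text.
    $result = preg_replace($pattern, '<span class="title-keyword">$0</span>', $safeTitle, 1);

    // preg_replace() returns null on a regex error: fall back to the plain title.
    return $result ?? $safeTitle;
}
```

Calling it with the title "24-Stunden-Dienst: Regeln, Vorteile & Pflichten" and the keyword "24-Stunden-Dienst" wraps the keyword in the span and escapes the ampersand as `&amp;`.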

## Examples

### Titles with Dashes

- "Arbeitszeitmodelle – Überblick, Beispiele & Lösungen"

  - Highlights: **Arbeitszeitmodelle**

- "Betriebliche Altersvorsorge – Vorteile & Funktionsweise"
  - Highlights: **Betriebliche Altersvorsorge**

### Titles with Colons

- "Personalentwicklung & Weiterbildung: Team gezielt fördern"

  - Highlights: **Personalentwicklung & Weiterbildung**

- "Dein Weg zu Ordio: Unser Sales-Prozess"
  - Highlights: **Ordio** (brand keyword takes priority)

### Titles with Brand Keywords

- "Ordio erweitert Management-Team um erfahrene Führungskräfte"

  - Highlights: **Ordio** and **erweitert Management-Team**

- "Ordio"
  - Highlights: **Ordio**

### Titles without Separators

- "Der ultimative Leitfaden für Schichtplanung"

  - Highlights: **ultimative Leitfaden**

- "Tipps und Tricks"
  - Highlights: **Tipps**

## CSS Styling

The highlighting uses the `.title-keyword` class defined in `v2/css/blog-post.css`:

```css
.post-header-title .title-keyword {
  color: var(--ordio-blue);
  font-weight: inherit;
}
```

**Color**: Uses CSS variable `var(--ordio-blue)` which is `#4D8EF3`

**Font Weight**: Inherits from parent (already bold via `font-gilroybold`)

**No Additional Effects**: No underline, background, or other decoration to maintain clean design

## Usage in Components

### PostHeader Component

The highlighting is automatically applied in `v2/components/blog/PostHeader.php`:

```php
<h1 class="post-header-title ...">
    <?php echo highlight_title_keywords($title); ?>
</h1>
```

### Other Components

Currently, highlighting is only applied to:

- ✅ Single post page titles (PostHeader component)

**Not applied to** (by design):

- Post card titles (too small, may reduce readability)
- Related post titles (too small, may reduce readability)

## Reliability & Error Handling

### Production Reliability (Updated 2026-01-14)

The function has been enhanced with comprehensive error handling and production reliability improvements:

**Input Validation:**
- Accepts any input type (string, null, boolean, numeric, object)
- Safely converts non-string types to string
- Validates UTF-8 encoding
- Removes null bytes and control characters
- Limits processing for very long titles (>1000 chars)

**Error Handling:**
- Never throws exceptions - all errors are caught internally
- Returns sanitized title on any error (never fails silently)
- Logs errors with context for debugging
- Works without mbstring extension (uses fallback functions)

**mbstring Dependency:**
- Function works with or without mbstring extension
- Automatically detects mbstring availability
- Uses fallback functions when mbstring unavailable
- Fallback functions: `_mb_strlen_fallback()`, `_mb_stripos_fallback()`, etc.
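
The detection-plus-fallback pattern can be sketched as shown below. The guide names `_mb_strlen_fallback()` but does not show its body, so the implementations here are assumptions, and the `_sketch` function names are hypothetical.

```php
<?php
// Illustrative sketch only: the bodies below are assumptions, since this
// guide does not show how the real fallback functions are implemented.

/** Detect mbstring once per call site instead of assuming it is loaded. */
function has_mbstring_sketch(): bool
{
    return function_exists('mb_strlen');
}

/** Count UTF-8 characters with PCRE when mbstring is unavailable. */
function strlen_fallback_sketch(string $text): int
{
    // With the /u modifier, "." matches one code point, not one byte.
    $count = preg_match_all('/./us', $text);

    // preg_match_all() returns false on invalid UTF-8: count bytes instead.
    return $count === false ? strlen($text) : $count;
}

/** Guarded wrapper: never calls an mb_ function unconditionally. */
function safe_strlen_sketch(string $text): int
{
    return has_mbstring_sketch()
        ? mb_strlen($text, 'UTF-8')
        : strlen_fallback_sketch($text);
}
```

Both paths return 14 for "Führungskräfte", whereas a plain `strlen()` would count 16 bytes because each umlaut occupies two bytes in UTF-8.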

**Edge Cases:**
- Invalid UTF-8 sequences → converted or sanitized
- Very long titles → truncated at word boundary
- Null bytes → removed
- Special regex characters → properly escaped
- Empty/null inputs → returns empty string
- Array/object inputs → handled gracefully
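
The input handling described above could look roughly like the following sketch. `sanitize_title_input_sketch()` is a hypothetical helper. Returning an empty string for arrays and dropping non-ASCII bytes from invalid UTF-8 are assumptions chosen to keep the example short, and the real function may convert such input differently.

```php
<?php
// Illustrative sketch only: sanitize_title_input_sketch() is a hypothetical
// helper. The real validation rules are not shown in this guide.

function sanitize_title_input_sketch($input, int $maxLength = 1000): string
{
    // Arrays and objects without __toString() cannot be cast to string.
    if (is_array($input) || (is_object($input) && !method_exists($input, '__toString'))) {
        return '';
    }
    $title = (string) $input; // null -> '', true -> '1', 42 -> '42'

    // Remove null bytes and other ASCII control characters.
    $title = preg_replace('/[\x00-\x1F\x7F]/', '', $title) ?? '';

    // A /u pattern fails on invalid UTF-8. As a crude fallback (assumption),
    // drop every non-ASCII byte so the result is always valid.
    if (preg_match('//u', $title) !== 1) {
        $title = preg_replace('/[\x80-\xFF]/', '', $title) ?? '';
    }
    $title = trim($title);

    // Keep at most $maxLength characters, cutting at a word boundary.
    $limit = max(1, $maxLength);
    if (preg_match('/^.{0,' . $limit . '}(?=\s|$)/us', $title, $m) === 1 && $m[0] !== '') {
        return rtrim($m[0]);
    }

    // No whitespace to break on: fall back to a hard cut.
    preg_match('/^.{0,' . $limit . '}/us', $title, $m);
    return $m[0] ?? '';
}
```

The sketch never throws for the input types listed above, which mirrors the documented contract that callers always receive a string.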

### Troubleshooting

**Issue: Function returns empty string**
- Check error logs for input validation errors
- Verify input is a valid string or convertible type
- Check for null bytes or invalid UTF-8 sequences

**Issue: Keywords not highlighting**
- Check error logs for regex errors
- Verify title contains valid keywords
- Check if title is too long (>1000 chars, may be truncated)

**Issue: mbstring errors**
- Function should work without mbstring (uses fallbacks)
- Check error logs for fallback function issues
- Verify PHP version compatibility (PHP 7.4+)

**Issue: Production not highlighting (works locally, fails on production)**
- **Root cause:** Production may have mbstring extension disabled. Unconditional mb_ function calls cause fatal errors when mbstring is unavailable.
- **Fix applied (2026-02-11):** All mb_ calls are now guarded with `function_exists()` or `_has_mbstring()` checks. UTF-8 validation uses iconv fallback when mbstring unavailable. See [H1_HIGHLIGHTING_PRODUCTION_FIX.md](./H1_HIGHLIGHTING_PRODUCTION_FIX.md).
- **Verification:** Run `php v2/scripts/blog/test-title-highlighting.php --simulate-no-mbstring` to test fallback path locally.
- **Production check:** Run `php v2/scripts/dev-helpers/check-php-extensions.php` to verify which extensions are loaded. If mbstring is missing, either enable it (for example `apt install php-mbstring` on Debian/Ubuntu) or rely on the fallbacks, since the function works without it.

**Issue: Performance problems**
- Check for very long titles (>1000 chars)
- Monitor error logs for repeated errors
- Verify the mbstring extension is loaded (the native functions are typically faster than the fallbacks)

### Testing

Run the comprehensive test suite:

```bash
php v2/scripts/blog/test-title-highlighting.php
```

Simulate production without mbstring (tests fallback path):

```bash
php v2/scripts/blog/test-title-highlighting.php --simulate-no-mbstring
```

The test suite covers:
- Various input types (string, null, array, object, etc.)
- Edge cases (empty, very long, special chars, UTF-8 issues)
- Real blog post titles
- mbstring fallback behavior (including `--simulate-no-mbstring`)
- Regex error scenarios

## Maintenance Guidelines

### Adding New Keywords

To add new keywords that should always be highlighted:

1. Edit `v2/config/blog-template-helpers.php`
2. Find the `highlight_title_keywords()` function
3. Add to the brand keywords section:

```php
// Priority 1: Brand keywords (always highlight)
if (preg_match('/\bOrdio\b/i', $title)) {
    $highlighted_keywords[] = 'Ordio';
}
// Add new brand keywords here
if (preg_match('/\bNewKeyword\b/i', $title)) {
    $highlighted_keywords[] = 'NewKeyword';
}
```

### Modifying Stop Words

To exclude additional words from keyword selection:

1. Edit `v2/config/blog-template-helpers.php`
2. Find the `$stop_words` array in `highlight_title_keywords()`
3. Add new stop words to the array

### Testing Changes

After modifying the function:

1. Run the test script: `php v2/scripts/blog/test-title-highlighting.php`
2. Verify all tests pass
3. Test visually on multiple blog posts
4. Check responsive design (mobile, tablet, desktop)
5. Test with various PHP versions (7.4, 8.0, 8.1, 8.2, 8.3, 8.4)
6. Test with and without mbstring extension

## Future Enhancements

Potential improvements for future consideration:

1. **Manual Override**: Allow manual keyword specification in post JSON metadata
2. **A/B Testing**: Test different highlighting strategies to optimize engagement
3. **Analytics**: Track which highlighted keywords perform best
4. **Admin Interface**: Allow content editors to customize highlighting per post
5. **Multiple Colors**: Consider highlighting different types of keywords in different colors
6. **Post Card Highlighting**: Evaluate if highlighting should be applied to post cards

## Troubleshooting

### Keywords Not Highlighting

1. Check whether the title contains a structural separator (colon or en dash)
2. Verify keyword is not a stop word
3. Check if keyword is already part of a highlighted phrase
4. Review test script output for debugging

### Nested Spans

The function prevents nested spans by:

- Sorting keywords by length (longest first)
- Tracking highlighted positions
- Skipping keywords that would create nested spans
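
The three steps above can be sketched as follows. `highlight_without_nesting_sketch()` is a hypothetical function, it only considers the first occurrence of each keyword, and HTML escaping is omitted for brevity. Treat it as an illustration of the overlap check rather than the real implementation.

```php
<?php
// Illustrative sketch only: highlight_without_nesting_sketch() is hypothetical.
// It works on byte offsets, which is safe here because every match starts and
// ends on a whole UTF-8 character. HTML escaping is omitted for brevity.

function highlight_without_nesting_sketch(string $title, array $keywords): string
{
    // Longest keywords first, so "Employer Branding" beats "Branding".
    usort($keywords, fn(string $a, string $b): int => strlen($b) <=> strlen($a));

    $ranges = []; // accepted [start, end) byte ranges
    foreach ($keywords as $keyword) {
        $start = $keyword === '' ? false : strpos($title, $keyword);
        if ($start === false) {
            continue;
        }
        $end = $start + strlen($keyword);

        // Skip any keyword that overlaps an already highlighted range.
        foreach ($ranges as [$s, $e]) {
            if ($start < $e && $end > $s) {
                continue 2;
            }
        }
        $ranges[] = [$start, $end];
    }

    // Apply ranges from right to left so earlier offsets stay valid.
    usort($ranges, fn(array $a, array $b): int => $b[0] <=> $a[0]);
    foreach ($ranges as [$s, $e]) {
        $title = substr($title, 0, $s)
            . '<span class="title-keyword">' . substr($title, $s, $e - $s) . '</span>'
            . substr($title, $e);
    }
    return $title;
}
```

Given the keywords "Branding" and "Employer Branding", only the longer phrase is wrapped, because the shorter one falls inside a range that is already highlighted.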

### SEO Concerns

- Keywords remain in H1 tag (SEO-friendly)
- HTML structure is semantic
- No impact on search engine indexing
- Keywords are still readable by screen readers

## Related Documentation

- [Component API Documentation](./COMPONENT_API.md) - Complete API reference
- [Blog Template Helpers](../v2/config/blog-template-helpers.php) - Source code
- [Blog CSS](../v2/css/blog-post.css) - Styling definitions
