Brand Mention Detection¶

Brand mention detection is the core feature of LLM Answer Watcher. It uses word-boundary regex matching to accurately identify brand mentions while preventing false positives.

How It Works¶

Word-Boundary Matching¶

The system uses word-boundary regex (\b) to ensure accurate matching:

# Pattern: \bHubSpot\b
# Matches: "I use HubSpot daily"
# Doesn't match: "I use HubSpotter" (no word boundary after "HubSpot")
# Likewise, an alias "hub" would not match inside "GitHub"

This prevents common false positives:

✅ "HubSpot" matches "HubSpot" exactly
❌ "Hub" does NOT match "HubSpot"
❌ "Spot" does NOT match "HubSpot"
❌ "hub" does NOT match "GitHub"
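The behavior listed above can be reproduced with Python's standard `re` module. This is a minimal sketch rather than the tool's actual implementation; the helper name `find_mentions` is hypothetical.

```python
import re

def find_mentions(alias: str, text: str) -> list[str]:
    """Return every whole-word occurrence of alias in text (hypothetical helper)."""
    # re.escape keeps characters such as "." literal; \b anchors both ends.
    pattern = re.compile(r"\b" + re.escape(alias) + r"\b")
    return pattern.findall(text)

print(find_mentions("HubSpot", "I use HubSpot daily"))   # ['HubSpot']
print(find_mentions("HubSpot", "I use HubSpotter"))      # []
print(find_mentions("Hub", "I use HubSpot and GitHub"))  # []
```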

Case-Insensitive Matching¶

All matching is case-insensitive:

# All these match "HubSpot"
"HubSpot", "hubspot", "HUBSPOT", "HuBsPoT"
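In Python, case-insensitive matching of this kind is typically done with the `re.IGNORECASE` flag. The snippet below is an illustrative sketch, not taken from the tool's source.

```python
import re

# re.IGNORECASE lets the pattern match any capitalization of the alias.
pattern = re.compile(r"\bHubSpot\b", re.IGNORECASE)

variants = ["HubSpot", "hubspot", "HUBSPOT", "HuBsPoT"]
matched = [v for v in variants if pattern.search(f"We switched to {v} last year.")]
print(matched == variants)  # True
```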

Brand Aliases¶

Configure multiple aliases for each brand:

brands:
  mine:
    - "Warmly"
    - "Warmly.io"
    - "Warmly AI"

  competitors:
    - "HubSpot"
    - "HubSpot CRM"
    - "Instantly"
    - "Instantly.ai"
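To see how an alias list like this could drive detection, the sketch below scans a text once per alias and tags each hit as mine or competitor. The function name `detect_aliases` and the plain-dict config are assumptions for illustration; the tool's own loader and de-duplication rules are not shown here.

```python
import re

# Same shape as the YAML above, written as a dict to stay dependency-free.
brands = {
    "mine": ["Warmly", "Warmly.io", "Warmly AI"],
    "competitors": ["HubSpot", "HubSpot CRM", "Instantly", "Instantly.ai"],
}

def detect_aliases(text: str, brands: dict[str, list[str]]) -> list[tuple[str, bool]]:
    """Return (alias, is_mine) for every configured alias found in text."""
    found = []
    for group, aliases in brands.items():
        for alias in aliases:
            pattern = re.compile(r"\b" + re.escape(alias) + r"\b", re.IGNORECASE)
            if pattern.search(text):
                found.append((alias, group == "mine"))
    return found

hits = detect_aliases("Many sales teams use HubSpot CRM; others prefer warmly.io.", brands)
print(hits)
# [('Warmly', True), ('Warmly.io', True), ('HubSpot', False), ('HubSpot CRM', False)]
```

Overlapping aliases such as "HubSpot" and "HubSpot CRM" both match in this sketch, because "." and spaces count as word boundaries.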

Configuration¶

Basic Brand Configuration¶

Minimal configuration with your brand and competitors:

brands:
  mine:
    - "YourBrand"

  competitors:
    - "CompetitorA"
    - "CompetitorB"

Advanced Brand Configuration¶

Include all variations and common misspellings:

brands:
  mine:
    - "Acme Corp"
    - "Acme"
    - "AcmeCorp"
    - "Acme.io"
    - "Acme Software"

  competitors:
    # Direct competitors
    - "Competitor One"
    - "CompetitorOne"
    - "Competitor1"

    # Market leaders
    - "Industry Leader"
    - "Big Player Inc"

    # Adjacent competitors
    - "Alternative Tool"

Brand Normalization¶

Brands are normalized for storage and analysis:

"HubSpot CRM" → "hubspot-crm"
"Instantly.ai" → "instantly-ai"
"Apollo.io" → "apollo-io"

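Normalization gives each brand name a stable lowercase, hyphenated key, so mentions group consistently in storage and analysis regardless of how the name was originally formatted.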
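One rule that reproduces the three mappings shown is to lowercase the name and collapse every run of non-alphanumeric characters into a single hyphen. The function below is a sketch under that assumption; `normalize_brand` is a hypothetical name and the tool's real rule may differ in edge cases.

```python
import re

def normalize_brand(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

for raw in ["HubSpot CRM", "Instantly.ai", "Apollo.io"]:
    print(raw, "->", normalize_brand(raw))
# HubSpot CRM -> hubspot-crm
# Instantly.ai -> instantly-ai
# Apollo.io -> apollo-io
```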

Detection Methods¶

Method 1: Regex (Default)¶

Fast, free, pattern-based detection.

Advantages:

Zero cost (no API calls)
Instant results
Deterministic (the same input always produces the same result)
Works offline

Limitations:

May miss contextual mentions
Requires exact alias match
No semantic understanding

Configuration:

run_settings:
  use_llm_rank_extraction: false

Method 2: Function Calling¶

LLM-assisted detection using function calling for higher accuracy.

Advantages:

Understands context
Catches variations
Semantic understanding
Confidence scores

Limitations:

Costs money per query
Slower than regex
Requires extraction model

Configuration:

extraction_settings:
  extraction_model:
    provider: "openai"
    model_name: "gpt-4o-mini"
    env_api_key: "OPENAI_API_KEY"

  method: "function_calling"
  fallback_to_regex: true
  min_confidence: 0.7

Method 3: Hybrid¶

Combines regex and function calling for best results.

How it works:

1. Try regex first (fast, free)
2. If regex finds no mentions, fall back to function calling
3. Merge results with de-duplication
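The three steps can be sketched as follows. This is one possible reading of the flow, not the tool's implementation: `llm_extract` is a stand-in callable (no real API is called), the dict keys mirror the mention fields used elsewhere on this page, and de-duplication by lowercased brand name is an assumption.

```python
import re
from typing import Callable

Mention = dict  # keys used here: "brand", "confidence", "detection_method"

def regex_detect(text: str, aliases: list[str]) -> list[Mention]:
    """Word-boundary, case-insensitive detection for each alias."""
    mentions = []
    for alias in aliases:
        if re.search(r"\b" + re.escape(alias) + r"\b", text, re.IGNORECASE):
            mentions.append({"brand": alias, "confidence": 1.0, "detection_method": "regex"})
    return mentions

def hybrid_detect(
    text: str,
    aliases: list[str],
    llm_extract: Callable[[str], list[Mention]],
    min_confidence: float = 0.7,
) -> list[Mention]:
    # Step 1: regex first (fast, free).
    mentions = regex_detect(text, aliases)
    # Step 2: call the LLM extractor only when regex found nothing,
    # and drop results below the confidence threshold.
    if not mentions:
        mentions = [m for m in llm_extract(text) if m["confidence"] >= min_confidence]
    # Step 3: de-duplicate by lowercased brand name, keeping the first hit.
    seen, merged = set(), []
    for m in mentions:
        key = m["brand"].lower()
        if key not in seen:
            seen.add(key)
            merged.append(m)
    return merged

def fake_llm(text: str) -> list[Mention]:
    """Hypothetical extractor returning canned results for the demo."""
    return [
        {"brand": "HubSpot", "confidence": 0.9, "detection_method": "function_calling"},
        {"brand": "Salesforce", "confidence": 0.4, "detection_method": "function_calling"},
    ]

print(hybrid_detect("I recommend HubSpot.", ["HubSpot"], fake_llm))
print(hybrid_detect("Try the Boston-based CRM vendor.", ["HubSpot"], fake_llm))
```

In the first call regex succeeds and the extractor is never invoked; in the second, only the extractor result at or above `min_confidence` survives.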

Configuration:

extraction_settings:
  extraction_model:
    provider: "openai"
    model_name: "gpt-4o-mini"
    env_api_key: "OPENAI_API_KEY"

  method: "hybrid"
  fallback_to_regex: true
  min_confidence: 0.7

Detection Results¶

Mention Object¶

Each detected mention includes:

{
  "brand": "HubSpot",
  "normalized_name": "hubspot",
  "is_mine": false,
  "rank_position": 1,
  "snippet": "...I recommend HubSpot for CRM needs...",
  "confidence": 1.0,
  "detection_method": "regex"
}
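For readers consuming these objects in code, the fields map naturally onto a small dataclass. The sketch below is illustrative only: the class name `Mention`, the helper `build_snippet`, and the 20-character context window are assumptions, not the tool's actual types.

```python
import re
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class Mention:
    brand: str
    normalized_name: str
    is_mine: bool
    rank_position: Optional[int]
    snippet: str
    confidence: float
    detection_method: str

def build_snippet(text: str, start: int, end: int, window: int = 20) -> str:
    """Return the match with some surrounding context, marked with ellipses."""
    left = max(0, start - window)
    right = min(len(text), end + window)
    prefix = "..." if left > 0 else ""
    suffix = "..." if right < len(text) else ""
    return prefix + text[left:right] + suffix

text = "For most teams I recommend HubSpot for CRM needs and onboarding."
m = re.search(r"\bHubSpot\b", text, re.IGNORECASE)
mention = Mention(
    brand=m.group(0),
    normalized_name=m.group(0).lower(),
    is_mine=False,
    rank_position=1,
    snippet=build_snippet(text, m.start(), m.end()),
    confidence=1.0,  # regex hits are exact, so confidence is fixed at 1.0 here
    detection_method="regex",
)
print(asdict(mention))
```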

My Brands vs Competitors¶

Mentions are categorized:

{
  "my_mentions": [
    {
      "brand": "Warmly",
      "is_mine": true,
      "rank_position": 2
    }
  ],
  "competitor_mentions": [
    {
      "brand": "HubSpot",
      "is_mine": false,
      "rank_position": 1
    },
    {
      "brand": "Instantly",
      "is_mine": false,
      "rank_position": 3
    }
  ]
}
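A grouping like this can be derived from a flat list of mentions by splitting on `is_mine`. The function below is a hypothetical sketch; sorting each group by `rank_position` is an added assumption that happens to match the ordering in the example.

```python
def categorize(mentions: list[dict]) -> dict[str, list[dict]]:
    """Split mentions into my_mentions and competitor_mentions by the is_mine flag."""
    result: dict[str, list[dict]] = {"my_mentions": [], "competitor_mentions": []}
    for mention in mentions:
        key = "my_mentions" if mention["is_mine"] else "competitor_mentions"
        result[key].append(mention)
    # Assumption: present each group in rank order.
    for group in result.values():
        group.sort(key=lambda m: m["rank_position"])
    return result

flat = [
    {"brand": "Instantly", "is_mine": False, "rank_position": 3},
    {"brand": "Warmly", "is_mine": True, "rank_position": 2},
    {"brand": "HubSpot", "is_mine": False, "rank_position": 1},
]
grouped = categorize(flat)
print([m["brand"] for m in grouped["my_mentions"]])          # ['Warmly']
print([m["brand"] for m in grouped["competitor_mentions"]])  # ['HubSpot', 'Instantly']
```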

Common Detection Patterns¶

Pattern 1: Exact Brand Name¶

LLM Response:

"The best email warmup tools are Warmly, Instantly, and Lemwarm."

Detected:

✅ Warmly
✅ Instantly
✅ Lemwarm (if configured as an alias)

Pattern 2: Brand with TLD¶

LLM Response:

"Check out Warmly.io for email warmup."

Detected:

✅ Warmly.io

Note: Add both "Warmly" and "Warmly.io" as aliases to catch both.

Pattern 3: Brand in Context¶

LLM Response:

"Many sales teams use HubSpot CRM to manage leads."

Detected:

✅ HubSpot CRM
✅ HubSpot (if both aliases configured)

Pattern 4: Case Variations¶

LLM Response:

"HUBSPOT and hubspot are the same product."

Detected:

✅ HubSpot (both instances)

Preventing False Positives¶

Use Word Boundaries¶

❌ Bad - Substring Matching:

brands:
  mine:
    - "Hub"  # Matches "GitHub", "HubSpot", "hub"

Under plain substring matching, a short alias like this would produce false positives.

✅ Good - Full Word Matching:

brands:
  mine:
    - "HubSpot"  # Only matches "HubSpot"

Word boundaries ensure an alias matches only as a whole word, never as part of a longer one.

Avoid Overly Generic Names¶

❌ Bad:

brands:
  competitors:
    - "AI"  # Too generic
    - "The"
    - "Pro"

✅ Good:

brands:
  competitors:
    - "OpenAI"
    - "The Sales Platform"
    - "Pro CRM"

Test Your Aliases¶

# Validate configuration
llm-answer-watcher validate --config watcher.config.yaml

# Run with example intents
llm-answer-watcher run --config watcher.config.yaml

Detection Accuracy¶

Evaluation Metrics¶

LLM Answer Watcher tracks detection accuracy:

| Metric | Description | Target |
|--------|-------------|--------|
| Precision | Correct mentions / Total detected | ≥ 90% |
| Recall | Correct mentions / Expected mentions | ≥ 80% |
| F1 Score | Harmonic mean of precision and recall | ≥ 85% |
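The three metrics follow directly from the sets of detected and expected brands. The function below is a sketch of the standard definitions; the name `detection_metrics` and the sample sets are illustrative, not part of the evaluation suite.

```python
def detection_metrics(detected: set[str], expected: set[str]) -> dict[str, float]:
    """Compute precision, recall, and F1 for one test case."""
    correct = len(detected & expected)
    precision = correct / len(detected) if detected else 0.0
    recall = correct / len(expected) if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

metrics = detection_metrics(
    detected={"hubspot", "warmly", "github"},                # one false positive
    expected={"hubspot", "warmly", "instantly", "lemwarm"},  # two missed
)
print({name: round(value, 3) for name, value in metrics.items()})
# {'precision': 0.667, 'recall': 0.5, 'f1': 0.571}
```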

Run Evaluations¶

llm-answer-watcher eval --fixtures evals/testcases/fixtures.yaml

See Evaluation Framework for details.

Advanced Detection¶

Special Characters¶

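Brand names may contain special characters such as parentheses, dots, and hyphens. Write them as they appear in text; no manual escaping is needed: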

brands:
  mine:
    - "Brand (TM)"  # Automatically escaped
    - "Brand.io"
    - "Brand-Name"

The system handles escaping automatically.
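For readers building a similar matcher, note that escaping alone is not always enough: `\b` requires a word character on one side, so a plain `\b...\b` pattern fails when an alias ends in a non-word character such as ")". The sketch below shows this with Python's `re` module and one common workaround using lookarounds. How LLM Answer Watcher handles this case internally is not described here; `boundary_pattern` is a hypothetical helper.

```python
import re

def boundary_pattern(alias: str) -> re.Pattern:
    """Match alias only when it is not glued to a word character on either side."""
    return re.compile(r"(?<!\w)" + re.escape(alias) + r"(?!\w)", re.IGNORECASE)

text = "We evaluated Brand (TM) and Brand-Name last quarter."

# Naive version: \b cannot match between ")" and " " (both are non-word characters).
naive = re.compile(r"\b" + re.escape("Brand (TM)") + r"\b", re.IGNORECASE)
print(naive.search(text))                                    # None

print(boundary_pattern("Brand (TM)").search(text).group(0))  # Brand (TM)
print(boundary_pattern("Brand-Name").search(text).group(0))  # Brand-Name
```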

Multi-Word Brands¶

brands:
  competitors:
    - "Acme Corp"
    - "Big Company Inc"
    - "The Sales Platform"

Word boundaries apply at the start and end of the full phrase, so a multi-word alias matches as a single unit.

Abbreviations¶

Add both full name and abbreviation:

brands:
  competitors:
    - "Customer Relationship Management"
    - "CRM"
    - "HubSpot CRM"

Debugging Detection Issues¶

Issue: Brand Not Detected¶

Problem: Your brand appears in the response but isn't detected.

Solutions:

Check the alias spelling against the raw response:

# View raw response
cat output/2025-11-05T14-30-00Z/intent_*_raw_*.json | jq '.answer_text'

Add alias variation:

brands:
  mine:
    - "YourBrand"
    - "YourBrand.io"
    - "Your Brand"  # Add this

Check for special formatting:

"Check out **YourBrand**"  // Bold formatting
"Visit `YourBrand.io`"     // Code formatting

Issue: False Positives¶

Problem: Unrelated words are detected as brand mentions.

Solutions:

Remove overly generic aliases:

# ❌ Remove this
brands:
  mine:
    - "AI"

# ✅ Use this instead
brands:
  mine:
    - "YourBrand AI"

Check word boundaries are working:

# Test with evaluation suite
llm-answer-watcher eval --fixtures evals/testcases/fixtures.yaml

Issue: Case Sensitivity¶

Problem: A brand is displayed with unexpected capitalization.

Solution: No configuration change is needed. Matching is already case-insensitive; the display simply preserves the original case from the LLM response.

# All match the same brand
"HubSpot" → normalized to "hubspot"
"hubspot" → normalized to "hubspot"
"HUBSPOT" → normalized to "hubspot"

Best Practices¶

1. Start with Core Aliases¶

brands:
  mine:
    - "YourBrand"      # Exact name
    - "YourBrand.io"   # With TLD

2. Add Variations Incrementally¶

Run monitoring, review results, add missing aliases:

brands:
  mine:
    - "YourBrand"
    - "YourBrand.io"
    - "YourBrand AI"    # Added after reviewing results
    - "YB"              # Abbreviation if commonly used

3. Limit Competitor List¶

Track 10-20 key competitors:

brands:
  competitors:
    # Top 5 direct competitors
    - "Competitor A"
    - "Competitor B"
    # Top 3 market leaders
    - "Market Leader"

4. Monitor Detection Metrics¶

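-- Check detection rates over the last 30 days
SELECT
    brand,
    COUNT(*) AS total_mentions,
    COUNT(DISTINCT run_id) AS runs_appeared,
    -- Share of runs in which the brand appeared. Counting distinct runs keeps
    -- the rate at or below 100% when one run mentions a brand several times.
    -- Note: the denominator counts all runs, not only those in the 30-day window.
    COUNT(DISTINCT run_id) * 100.0 / (SELECT COUNT(*) FROM runs) AS appearance_rate
FROM mentions
WHERE timestamp_utc >= datetime('now', '-30 days')
GROUP BY brand
ORDER BY total_mentions DESC;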
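To try a query of this shape without touching real data, it can be run against an in-memory SQLite database. The two-table schema below is a simplified assumption (only the columns the query touches), not the tool's actual schema, and the appearance rate here is computed from distinct runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema: only the columns the query needs.
conn.executescript("""
    CREATE TABLE runs (run_id TEXT PRIMARY KEY, timestamp_utc TEXT);
    CREATE TABLE mentions (run_id TEXT, brand TEXT, timestamp_utc TEXT);
""")
conn.executemany(
    "INSERT INTO runs VALUES (?, datetime('now'))",
    [("r1",), ("r2",), ("r3",), ("r4",)],
)
conn.executemany(
    "INSERT INTO mentions VALUES (?, ?, datetime('now'))",
    [("r1", "HubSpot"), ("r1", "HubSpot"), ("r2", "HubSpot"), ("r1", "Warmly")],
)

rows = conn.execute("""
    SELECT
        brand,
        COUNT(*) AS total_mentions,
        COUNT(DISTINCT run_id) AS runs_appeared,
        COUNT(DISTINCT run_id) * 100.0 / (SELECT COUNT(*) FROM runs) AS appearance_rate
    FROM mentions
    WHERE timestamp_utc >= datetime('now', '-30 days')
    GROUP BY brand
    ORDER BY total_mentions DESC
""").fetchall()

for row in rows:
    print(row)
# ('HubSpot', 3, 2, 50.0)
# ('Warmly', 1, 1, 25.0)
```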

5. Use Evaluation Suite¶

# Test detection before deploying
llm-answer-watcher eval --fixtures evals/testcases/fixtures.yaml

# Add custom test cases for your brands
# See: evals/testcases/fixtures.yaml

Next Steps¶

Rank Extraction: Learn how ranking positions are extracted
Function Calling: Use LLM-assisted detection for higher accuracy
Evaluation Framework: Test and validate detection accuracy
Brand Configuration: Deep dive into brand configuration strategies