Provider Overview¶
LLM Answer Watcher supports 6 major LLM providers with a unified interface. Choose providers based on cost, performance, and feature requirements.
π New in v0.2.0: Browser Runners - Access ChatGPT and Perplexity via web UI automation to capture the true user experience. See Browser vs API Access below.
Supported Providers¶
| Provider | Models | Cost Range | Web Search | Best For |
|---|---|---|---|---|
| OpenAI | gpt-4o-mini, gpt-4o, more | \(0.15-\)10/1M tokens | β Yes | General use, cost-effective |
| Anthropic | Claude 3.5 Haiku, Sonnet, Opus | \(0.80-\)75/1M tokens | β No | High-quality responses |
| Mistral | mistral-large, mistral-small | \(0.30-\)2/1M tokens | β No | European alternative |
| X.AI (Grok) | grok-beta, grok-2, grok-3 | \(2-\)25/1M tokens | β No | X platform integration |
| Gemini 2.0 Flash | \(0.075-\)0.30/1M tokens | β No | Low-cost option | |
| Perplexity | Sonar, Sonar Pro | \(1-\)15/1M tokens | β Built-in | Grounded responses |
Quick Configuration¶
Single Provider¶
Multiple Providers¶
run_settings:
models:
- provider: "openai"
model_name: "gpt-4o-mini"
env_api_key: "OPENAI_API_KEY"
- provider: "anthropic"
model_name: "claude-3-5-haiku-20241022"
env_api_key: "ANTHROPIC_API_KEY"
- provider: "perplexity"
model_name: "sonar-pro"
env_api_key: "PERPLEXITY_API_KEY"
Provider Selection Guide¶
By Budget¶
Ultra Low Cost (<$0.005 per query):
- Google Gemini 2.0 Flash
- OpenAI gpt-4o-mini
Low Cost ($0.005-0.01 per query):
- Mistral mistral-small
- Anthropic Claude 3.5 Haiku
Medium Cost ($0.01-0.05 per query):
- OpenAI gpt-4o
- Anthropic Claude 3.5 Sonnet
- Perplexity Sonar Pro
High Cost (>$0.05 per query):
- Anthropic Claude 3.5 Opus
- Grok grok-3
- OpenAI gpt-4-turbo
By Feature¶
Web Search Required:
- β OpenAI (with tools configuration)
- β Perplexity (built-in)
No Web Search:
- Anthropic, Mistral, Grok, Google
Grounded Responses:
- β Perplexity (best)
- β OpenAI with web search
High Quality:
- Anthropic Claude 3.5 Sonnet/Opus
- OpenAI gpt-4o
- Perplexity Sonar Pro
Fast Response:
- OpenAI gpt-4o-mini
- Google Gemini Flash
- Mistral mistral-small
By Use Case¶
Cost-Optimized Monitoring:
High-Quality Analysis:
Multi-Provider Comparison:
models:
- provider: "openai"
model_name: "gpt-4o-mini"
- provider: "anthropic"
model_name: "claude-3-5-haiku-20241022"
- provider: "perplexity"
model_name: "sonar"
Web Search Required:
API Key Setup¶
OpenAI¶
Get key: https://platform.openai.com/api-keys
Anthropic¶
Get key: https://console.anthropic.com/
Mistral¶
Get key: https://console.mistral.ai/
X.AI (Grok)¶
Get key: https://console.x.ai/
Google Gemini¶
Get key: https://aistudio.google.com/apikey
Perplexity¶
Get key: https://www.perplexity.ai/settings/api
Provider Comparison¶
Response Quality¶
Best to Good:
- Anthropic Claude 3.5 Opus
- Anthropic Claude 3.5 Sonnet
- OpenAI gpt-4o
- Perplexity Sonar Pro
- Mistral mistral-large
- Grok grok-3
- Anthropic Claude 3.5 Haiku
- OpenAI gpt-4o-mini
- Google Gemini 2.0 Flash
- Mistral mistral-small
Cost Efficiency¶
Best value (quality per dollar):
- OpenAI gpt-4o-mini
- Google Gemini 2.0 Flash
- Anthropic Claude 3.5 Haiku
- Mistral mistral-small
- Perplexity Sonar
Speed¶
Fastest to Slowest:
- Google Gemini Flash
- OpenAI gpt-4o-mini
- Mistral models
- Perplexity Sonar
- Anthropic Haiku
- OpenAI gpt-4o
- Anthropic Sonnet
- Grok models
- Anthropic Opus
Rate Limits¶
Default rate limits (check provider docs for current limits):
| Provider | Requests/Min | Tokens/Min |
|---|---|---|
| OpenAI | 500 | 90,000 |
| Anthropic | 50 | 100,000 |
| Mistral | 5-60 | Varies |
| X.AI | 60 | 120,000 |
| 15 | 32,000 | |
| Perplexity | 20 | Varies |
Recommendation: Add delays between queries if hitting rate limits:
Provider-Specific Features¶
OpenAI¶
- β Web search via tools
- β Function calling
- β JSON mode
- β Vision support (not used)
See OpenAI Provider
Anthropic¶
- β Extended context (200K tokens)
- β Function calling
- β JSON mode
- β Thinking mode (not used)
Mistral¶
- β European data residency
- β Function calling
- β JSON mode
- β Competitive pricing
See Mistral Provider
X.AI (Grok)¶
- β X platform integration
- β OpenAI-compatible API
- β Real-time information
- β οΈ Limited model selection
See Grok Provider
Google¶
- β Very low cost
- β Fast responses
- β Long context (1M tokens)
- β οΈ Newer platform
See Google Provider
Perplexity¶
- β Built-in web search
- β Grounded responses
- β Citations included
- β Real-time information
- β οΈ Request fees (not in cost estimate)
Multi-Provider Strategies¶
Strategy 1: Cost vs Quality¶
Cheap model for volume, expensive for quality:
models:
# High volume, low cost
- provider: "openai"
model_name: "gpt-4o-mini"
# Occasional high-quality check
- provider: "anthropic"
model_name: "claude-3-5-sonnet-20241022"
enabled_for: ["critical-intent"]
Strategy 2: Provider Diversity¶
Avoid single-provider dependency:
models:
- provider: "openai"
model_name: "gpt-4o-mini"
- provider: "anthropic"
model_name: "claude-3-5-haiku-20241022"
- provider: "google"
model_name: "gemini-2.0-flash-exp"
Strategy 3: Web Search + Standard¶
models:
# Standard queries
- provider: "openai"
model_name: "gpt-4o-mini"
# Web-search enabled
- provider: "perplexity"
model_name: "sonar-pro"
Common Issues¶
API Key Errors¶
Solution:
Rate Limit Exceeded¶
Solutions:
- Add delay:
delay_between_queries: 2 - Reduce concurrent requests
- Upgrade API tier
Model Not Found¶
Solution: Use correct model name: gpt-4o-mini
See provider docs for valid models.
Authentication Failed¶
Solutions:
- Check key spelling
- Regenerate key at provider console
- Verify key has correct permissions
Browser vs API Access¶
Two Ways to Access Providers¶
Starting in v0.2.0, LLM Answer Watcher supports two access methods for supported providers:
| Access Method | Providers | How It Works | Use Cases |
|---|---|---|---|
| API Access | All 6 providers | Direct API calls with your API key | Production monitoring, cost-optimized, fast |
| Browser Access (BETA) | ChatGPT, Perplexity | Headless browser via Steel API | True user experience, screenshots, web UI testing |
Key Differences¶
API Access: - β Faster (no browser overhead) - β Accurate cost tracking - β Token usage metrics - β Programmatic control - β May differ from web UI behavior - β No visual evidence
Browser Access: - β Captures actual user experience - β Screenshots and HTML snapshots - β Tests web UI behavior - β Free tier usage (no API costs) - β Slower (10-30s overhead) - β No cost tracking yet (shows $0.00) - β Subject to UI changes
When to Use Each¶
Use API Access when: - You need fast, automated monitoring - Cost tracking is important - You're running high-volume queries - You need programmatic control
Use Browser Access when: - You want to verify web UI behavior - You need visual evidence (screenshots) - You're testing free tier experience - You want to see what actual users see
Example: Comparing Both¶
runners:
# API access for production monitoring
- runner_plugin: "api"
config:
provider: "openai"
model_name: "gpt-4o-mini"
api_key: "${OPENAI_API_KEY}"
# Browser access to verify web UI
- runner_plugin: "steel-chatgpt"
config:
steel_api_key: "${STEEL_API_KEY}"
take_screenshots: true
This configuration runs the same query through both methods, letting you compare: - Does the API response match what users see in ChatGPT? - Are citations/sources displayed differently? - Does the web UI recommend different brands?
See Browser Runners Guide for complete details.
Next Steps¶
-
OpenAI
Complete OpenAI provider guide
-
Anthropic
Claude models documentation
-
Perplexity
Grounded LLMs with web search
-
Browser Runners
Web UI automation guide