Browser Runners Guide¶
⚠️ BETA FEATURE - Browser runners were added on November 8, 2025 and are currently beta quality.
Known Limitations:
- Cost tracking returns $0.00 (placeholder - actual Steel API costs not yet calculated)
- CSS selectors may break if web UIs change
- ChatGPT authentication not fully documented
- Response completion detection is heuristic-based

Suitable for: Research, testing, screenshot capture, hobby/startup monitoring

Not recommended for: Fully automated production systems relying on accurate cost tracking
Overview¶
Browser runners enable LLM Answer Watcher to interact with web-based LLM interfaces like ChatGPT and Perplexity using headless browser automation. This captures what users actually see in the web UI, which can differ from the output of direct API access.
Why Browser Runners?¶
Key Differences: Browser UI vs API¶
| Aspect | API | Browser (ChatGPT/Perplexity) |
|---|---|---|
| Web Search | Optional tool calling | Integrated UI behavior |
| Citations | Structured JSON | Visual citations in response |
| System Prompts | Your control | Platform-specific defaults |
| Rate Limits | API quota | Web UI limits |
| Cost | Per token | Free tier or subscription |
| Response Format | Raw text | Formatted with markdown, links |
Use Cases¶
- Verify API vs Web Consistency: Check if your brand appears differently in ChatGPT web UI vs API
- Capture Visual Evidence: Screenshots prove how brands are displayed
- Test Web-Specific Features: Perplexity sources, ChatGPT citations
- Monitor Free Tier Behavior: See what users without API access experience
Supported Platforms¶
- ✅ ChatGPT (chat.openai.com) via Steel
- ✅ Perplexity (perplexity.ai) via Steel
- 🚧 Claude (claude.ai) - Coming soon
- 🚧 Gemini (gemini.google.com) - Coming soon
Prerequisites¶
1. Steel API Account¶
Browser runners use Steel for headless browser automation.
# Sign up at https://steel.dev
# Get your API key from dashboard
export STEEL_API_KEY="your-steel-api-key"
Steel Pricing (as of 2025):
- Hobby: $0/month + $0.10/hour browser time (5 concurrent sessions)
- Pro: $49/month + $0.10/hour (20 concurrent sessions)
- Enterprise: Custom pricing
2. Optional: CAPTCHA Solver¶
Steel can integrate with CAPTCHA solvers for sites that require login. The `solver` option listed under Configuration Options selects the solver service.
Configuration¶
Basic Example¶
# watcher.config.yaml
runners:
  # Browser runner: ChatGPT via Steel
  - runner_plugin: "steel-chatgpt"
    config:
      steel_api_key: "${STEEL_API_KEY}"
      target_url: "https://chat.openai.com"
      session_timeout: 300
      wait_for_response_timeout: 60
      take_screenshots: true
      save_html_snapshot: true
      session_reuse: true
  # Browser runner: Perplexity via Steel
  - runner_plugin: "steel-perplexity"
    config:
      steel_api_key: "${STEEL_API_KEY}"
      target_url: "https://www.perplexity.ai"
      session_timeout: 300
      take_screenshots: true
      session_reuse: true
brands:
  mine: ["YourBrand"]
  competitors: ["CompetitorA", "CompetitorB"]
intents:
  - id: "crm-tools"
    prompt: "What are the best CRM tools?"
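The `${STEEL_API_KEY}` placeholders in the config are meant to be filled from the environment. This guide does not show how LLM Answer Watcher resolves them internally, so the following is only a minimal sketch, assuming plain `${VAR}` substitution. The helper name `expand_env` is hypothetical and not part of the tool's API.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace ${VAR} placeholders in value (hypothetical helper).

    Raises KeyError when a referenced variable is missing, so a forgotten
    export fails loudly instead of passing the literal placeholder along.
    """
    env = os.environ if env is None else env

    def _substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_substitute, value)


# Example with an explicit mapping instead of the real environment.
resolved = expand_env("${STEEL_API_KEY}", {"STEEL_API_KEY": "demo-key"})
print(resolved)
```

Failing on a missing variable is a design choice of this sketch: it surfaces a forgotten `export STEEL_API_KEY=...` before any browser session is requested.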
Configuration Options¶
Common Options (All Browser Runners)¶
| Option | Type | Default | Description |
|---|---|---|---|
| `steel_api_key` | string | required | Steel API key (use env var) |
| `target_url` | string | Platform URL | Starting URL for browser |
| `session_timeout` | int | 300 | Max session duration (seconds) |
| `wait_for_response_timeout` | int | 60 | Max wait for LLM response (seconds) |
| `take_screenshots` | bool | true | Capture screenshots |
| `save_html_snapshot` | bool | true | Save HTML snapshots |
| `session_reuse` | bool | true | Reuse sessions (faster, cheaper) |
| `solver` | string | "capsolver" | CAPTCHA solver service |
| `proxy` | string | null | Optional proxy config |
| `output_dir` | string | "./output" | Directory for artifacts |
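To make the defaults in the table concrete, here is a small sketch of how they could be modeled in Python. The class name `BrowserRunnerConfig` and its validation rules are assumptions made for illustration; the tool's real config schema is not shown in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrowserRunnerConfig:
    """Hypothetical container mirroring the common options table."""

    steel_api_key: str                    # required, no default
    target_url: str                       # per-platform URL, so required in this sketch
    session_timeout: int = 300            # seconds
    wait_for_response_timeout: int = 60   # seconds
    take_screenshots: bool = True
    save_html_snapshot: bool = True
    session_reuse: bool = True
    solver: str = "capsolver"
    proxy: Optional[str] = None
    output_dir: str = "./output"

    def __post_init__(self):
        if not self.steel_api_key:
            raise ValueError("steel_api_key is required")
        if self.session_timeout <= 0 or self.wait_for_response_timeout <= 0:
            raise ValueError("timeouts must be positive")
        # Assumption of this sketch: a response wait longer than the whole
        # session can never complete, so it is rejected up front.
        if self.wait_for_response_timeout > self.session_timeout:
            raise ValueError("wait_for_response_timeout exceeds session_timeout")


cfg = BrowserRunnerConfig(steel_api_key="demo-key",
                          target_url="https://www.perplexity.ai")
print(cfg.session_timeout, cfg.solver)
```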
ChatGPT-Specific Options¶
- runner_plugin: "steel-chatgpt"
  config:
    steel_api_key: "${STEEL_API_KEY}"
    target_url: "https://chat.openai.com"
    # Add ChatGPT session token if needed
    # chatgpt_session_token: "${CHATGPT_SESSION_TOKEN}"
Perplexity-Specific Options¶
- runner_plugin: "steel-perplexity"
  config:
    steel_api_key: "${STEEL_API_KEY}"
    target_url: "https://www.perplexity.ai"
    # Perplexity works without login
Usage¶
Run with Browser Runners¶
# Set environment variables
export STEEL_API_KEY="your-steel-api-key"
# Run with browser runner config
llm-answer-watcher run --config examples/watcher.config.browser-runners.yaml
# Output:
# ✓ Created Steel session: session-abc123
# ✓ Submitted prompt to ChatGPT: "What are the best CRM tools?"
# ✓ Extracted answer (2,345 chars)
# ✓ Screenshot saved: ./output/2025-11-06T10-30-00Z/screenshot_chatgpt_session-abc123.png
# ✓ HTML snapshot saved: ./output/2025-11-06T10-30-00Z/html_chatgpt_session-abc123.html
Compare API vs Browser Results¶
# Run configuration with both API and browser runners
llm-answer-watcher run --config examples/watcher.config.browser-runners.yaml
# View comparison report
llm-answer-watcher report --run-id 2025-11-06T10-30-00Z
The HTML report will show:
- API Response: Direct from OpenAI API with web search
- ChatGPT Browser: Screenshot + extracted text from web UI
- Perplexity Browser: Screenshot + sources from Perplexity
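For a quick check outside the HTML report, the comparison can be approximated by looking at which brands each answer mentions. The sketch below uses naive case-insensitive substring matching, which is an assumption made for illustration and not the tool's actual mention extraction. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def brand_mentions(answer: str, brands: Iterable[str]) -> Set[str]:
    """Return the brands that occur in answer (case-insensitive substring match)."""
    lowered = answer.lower()
    return {brand for brand in brands if brand.lower() in lowered}


def compare_mentions(api_answer: str, browser_answer: str,
                     brands: Iterable[str]) -> Dict[str, List[str]]:
    """Split brands into those seen in both answers, only the API, or only the browser."""
    brands = list(brands)
    api = brand_mentions(api_answer, brands)
    web = brand_mentions(browser_answer, brands)
    return {
        "both": sorted(api & web),
        "api_only": sorted(api - web),
        "browser_only": sorted(web - api),
    }


diff = compare_mentions(
    "We like YourBrand and CompetitorA.",
    "CompetitorA and CompetitorB lead the market.",
    ["YourBrand", "CompetitorA", "CompetitorB"],
)
print(diff)
```

Substring matching will miss aliases and can over-match short names, so treat the result as a rough signal only.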
Artifacts Generated¶
Each browser run produces:
output/
└── 2025-11-06T10-30-00Z/
    ├── run_meta.json
    ├── screenshot_chatgpt_session-abc123.png       # Visual evidence
    ├── html_chatgpt_session-abc123.html            # Full HTML snapshot
    ├── intent_crm-tools_raw_chatgpt-web.json       # Structured data
    ├── intent_crm-tools_parsed_chatgpt-web.json    # Extracted mentions
    └── report.html                                 # HTML report
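If you post-process artifacts in your own scripts, the file names can be derived from the run ID, platform, and session ID. The naming pattern below is inferred from the sample tree only and may not cover every case. `artifact_paths` is a hypothetical helper, not a function exported by LLM Answer Watcher.

```python
from pathlib import Path


def artifact_paths(output_dir, run_id, platform, session_id):
    """Build expected artifact paths for one browser session (pattern inferred from the sample tree)."""
    run_dir = Path(output_dir) / run_id
    return {
        "screenshot": run_dir / f"screenshot_{platform}_{session_id}.png",
        "html": run_dir / f"html_{platform}_{session_id}.html",
        "report": run_dir / "report.html",
    }


paths = artifact_paths("output", "2025-11-06T10-30-00Z", "chatgpt", "session-abc123")
for kind, path in paths.items():
    print(kind, path.as_posix())
```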
Architecture¶
Plugin System¶
Browser runners integrate seamlessly via the plugin system:
IntentRunner (Protocol)
├── APIRunner (wraps LLMClient)
│   ├── OpenAI
│   ├── Anthropic
│   └── Others...
└── BrowserRunner (extends SteelBaseRunner)
    ├── SteelChatGPTRunner
    ├── SteelPerplexityRunner
    └── SteelClaudeRunner (coming soon)
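The tree describes `IntentRunner` as a Protocol, which means API and browser runners can be used interchangeably as long as they expose the same methods. The sketch below illustrates that idea with `typing.Protocol`. The method name `run_intent` and both fake runner classes are invented for this example, and the real interface may look different.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class IntentRunner(Protocol):
    """Hypothetical shape of the shared runner interface."""

    def run_intent(self, prompt: str) -> str:
        ...


class FakeAPIRunner:
    """Stand-in for an API-backed runner (no network calls)."""

    def run_intent(self, prompt: str) -> str:
        return f"[api] {prompt}"


class FakeBrowserRunner:
    """Stand-in for a browser-backed runner (no real browser)."""

    def run_intent(self, prompt: str) -> str:
        return f"[browser] {prompt}"


def run_all(runners: List[IntentRunner], prompt: str) -> List[str]:
    # The caller does not need to know which kind of runner it holds.
    return [runner.run_intent(prompt) for runner in runners]


answers = run_all([FakeAPIRunner(), FakeBrowserRunner()], "What are the best CRM tools?")
print(answers)
```

Neither fake class inherits from `IntentRunner`. Structural typing is what lets a new runner plug in without touching the calling code.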
How It Works¶
- Session Creation: Steel creates headless Chrome browser
- Navigation: Runner navigates to the target URL (e.g., chat.openai.com)
- Authentication: Steel manages cookies/sessions automatically
- Prompt Submission: Runner types prompt into UI
- Wait for Response: Monitors DOM for response completion
- Extraction: Scrapes answer text from page
- Artifacts: Captures screenshot, HTML snapshot
- Cleanup: Releases session (or reuses for next intent)
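The "Wait for Response" step is described elsewhere in this guide as heuristic-based. One common heuristic is to poll the extracted text and treat the response as complete once it stops changing. The sketch below shows that idea under stated assumptions: `read_text` stands in for whatever DOM extraction the runner performs, and the function name, parameters, and defaults are all hypothetical.

```python
import time


def wait_for_stable_text(read_text, timeout=60.0, poll_interval=0.5,
                         stable_polls=3, sleep=time.sleep, clock=time.monotonic):
    """Poll read_text() until non-empty text repeats stable_polls times in a row.

    Raises TimeoutError if the text never settles before the deadline.
    """
    deadline = clock() + timeout
    last = ""
    unchanged = 0
    while clock() < deadline:
        current = read_text()
        if current and current == last:
            unchanged += 1
            if unchanged >= stable_polls:
                return current
        else:
            # Text is empty or still growing: reset the stability counter.
            unchanged = 0
            last = current
        sleep(poll_interval)
    raise TimeoutError("response did not stabilize before the timeout")


# Simulated streaming response: the text grows, then stops changing.
frames = iter(["", "Hel", "Hello", "Hello world", "Hello world",
               "Hello world", "Hello world"])
answer = wait_for_stable_text(lambda: next(frames), timeout=5.0,
                              poll_interval=0.0, sleep=lambda seconds: None)
print(answer)
```

A stability check like this can return early if the UI pauses mid-response, so any text-based completion signal should be treated as approximate.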
Troubleshooting¶
"Steel API key invalid"¶
# Verify your API key
echo $STEEL_API_KEY
# Test Steel access
curl -H "Authorization: Bearer $STEEL_API_KEY" https://api.steel.dev/v1/sessions
"Session timeout exceeded"¶
Increase `session_timeout` in the runner's config (default: 300 seconds) if prompts are complex.
"CAPTCHA blocking"¶
Enable a CAPTCHA solver by setting the `solver` option in the runner's config (for example, "capsolver").
"Element not found"¶
Browser UI selectors may change. Rerun with the `--verbose` flag and check the logs for details.
Cost Management¶
⚠️ IMPORTANT: Cost tracking for browser runners currently returns $0.00 in reports. This is a placeholder - you WILL be charged by Steel based on session duration. Monitor your Steel dashboard for actual costs.
Browser Runner Costs¶
Browser runners have zero LLM API cost but incur Steel charges that are not yet tracked in LLM Answer Watcher cost estimates:
| Activity | Cost | Tracked in Reports? |
|---|---|---|
| API runners | Per token (normal rates) | ✅ Yes |
| Browser runners | $0.10-0.30/hour via Steel | ❌ No (shows $0.00) |
| CAPTCHA solving | $1-3 per 1,000 solves | ❌ No |
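Until cost tracking is implemented, a rough estimate can be computed by hand from the rates in the table. The sketch below assumes billing is linear in session time with no minimum charge or rounding, which this guide does not confirm. Both function names are hypothetical, and the Steel dashboard remains the source of truth.

```python
def estimate_steel_cost(session_seconds, rate_per_hour=0.10):
    """Estimate Steel browser-time cost in USD, assuming linear per-hour billing."""
    if session_seconds < 0:
        raise ValueError("session_seconds must be non-negative")
    return session_seconds / 3600 * rate_per_hour


def estimate_captcha_cost(solves, rate_per_thousand=1.0):
    """Estimate CAPTCHA solving cost in USD from a per-1,000-solves rate."""
    if solves < 0:
        raise ValueError("solves must be non-negative")
    return solves / 1000 * rate_per_thousand


# One full 300-second session at the low and high ends of the listed range.
low = estimate_steel_cost(300, rate_per_hour=0.10)
high = estimate_steel_cost(300, rate_per_hour=0.30)
print(f"${low:.4f} to ${high:.4f} per 300-second session")
```

At these rates a single 300-second session costs a few cents at most, so the total depends mainly on how many sessions run and how long they stay open.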
Cost Optimization¶
- Enable Session Reuse: Reuse sessions across intents
- Reduce Timeouts: Lower `session_timeout` if possible
- Selective Screenshots: Disable screenshots if not needed
- Batch Intents: Run multiple intents per session
config:
  session_reuse: true      # Reuse sessions (big cost savings)
  take_screenshots: false  # Skip if not needed
  session_timeout: 180     # Shorter timeout = lower cost
Limitations¶
Current Limitations (BETA Status)¶
- ❌ No Cost Tracking: Browser runner costs show $0.00 (placeholder - Steel charges not calculated)
- ❌ No Token Tracking: Browser responses don't expose token counts
- ❌ No Model Detection: Can't determine which ChatGPT model is used
- ⚠️ Slower than API: Browser automation adds 10-30s overhead per query
- ⚠️ Rate Limits: Subject to web UI rate limits (not API limits)
- ⚠️ Fragile Selectors: CSS selectors may break if ChatGPT/Perplexity UI changes
- ⚠️ Limited CDP Implementation: Full Steel CDP commands need implementation for advanced features
Future Enhancements (Planned for v0.3.0)¶
- Implement accurate cost tracking based on Steel session duration (HIGH PRIORITY)
- Implement full Steel CDP commands for navigation/extraction
- Add web source extraction (citations, search results)
- Support Claude and Gemini web UIs
- Add browser action recording (interaction_steps)
- Add authentication documentation for ChatGPT login
- Add browser pool for parallel execution
- Support custom browser configurations (extensions, etc.)
- Implement retry logic for selector failures
Plugin Development¶
Want to add a new browser runner? See Plugin Development Guide.
Example:
from llm_answer_watcher.llm_runner.browser.steel_base import SteelBaseRunner
from llm_answer_watcher.llm_runner.plugin_registry import RunnerRegistry


@RunnerRegistry.register
class MyCustomBrowserPlugin:
    @classmethod
    def plugin_name(cls) -> str:
        return "my-browser"

    @classmethod
    def runner_type(cls) -> str:
        return "browser"

    # Implement other required methods...
Resources¶
Support¶
For issues with browser runners:
- Check Steel API status: https://status.steel.dev
- Review logs with `--verbose` flag
- Report issues: https://github.com/nibzard/llm-answer-watcher/issues
- Tag with `browser-runner` label