A toolkit for automating interactions with web-based Large Language Models (LLMs) like Qwen, Perplexity, and more. This project leverages AppleScript to control a real Chrome browser (bypassing bot detection) and Python to extract structured responses from the resulting HTML.
Many cutting-edge LLMs are accessible only through browser-based interfaces and lack public or affordable APIs. Unlike models that expose REST APIs, they cannot easily be driven by agents, scripts, or CLI workflows.
web_llm_interactor solves this by enabling seamless interaction with web-based LLMs as if they had API endpoints. It automates browser actions using AppleScript to mimic human behavior, submits queries, waits for responses, and extracts structured data (e.g., JSON) from the page. This makes web-only LLMs fully compatible with your automation workflows.
```mermaid
graph TD
    A["User/Agent calls CLI"] --> B["AppleScript activates Chrome"]
    B --> C["AppleScript injects JavaScript"]
    C --> D["LLM processes query"]
    D --> E["AppleScript polls for response"]
    E --> F["AppleScript saves HTML"]
    F --> G["Python parses HTML, extracts JSON"]
    G --> H["CLI formats data"]
    H --> I["Structured JSON returned"]
    I --> A
    classDef userAction fill:#f8f8f8,stroke:#505050,stroke-width:1.5px
    classDef appleScript fill:#f8f8f8,stroke:#505050,stroke-width:1.5px
    classDef llmAction fill:#f8f8f8,stroke:#505050,stroke-width:1.5px
    classDef dataProcess fill:#f8f8f8,stroke:#505050,stroke-width:1.5px
    class A,I userAction
    class B,C,E,F appleScript
    class D llmAction
    class G,H dataProcess
```
- Bypass Bot Detection: Uses AppleScript to control a real Chrome browser, mimicking human interactions
- Adaptive Response Polling: Intelligently waits for responses by monitoring HTML length changes
- Structured Output: Extracts responses as JSON with customizable required fields
- Automatic Form Submission: Uses multiple strategies to send messages (form submit, button click, Enter key)
- Multiple LLM Support: Works with Qwen, Perplexity, and other browser-based LLMs
- CLI Interface: Simple command-line interface for easy integration
- Focus Management: Properly returns focus to your editor after processing
- Customizable Fields: Specify which fields must be present in extracted JSON
```bash
# Clone the repository
git clone https://github.com/grahama1970/web-llm-interactor.git
cd web-llm-interactor

# Install with UV
uv pip install -e .

# Or use the installation script
./scripts/install_cpu_uv.sh
```
```bash
# Clone the repository
git clone https://github.com/grahama1970/web-llm-interactor.git
cd web-llm-interactor

# Install in development mode
pip install -e .
```
Requirements (managed by pyproject.toml):
- pyperclip
- python-dotenv
- loguru
- typer
- beautifulsoup4
- html2text
- bleach
- json-repair
- macOS (AppleScript support is required)
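Several of these dependencies serve the HTML-to-JSON extraction step. As a rough illustration of what that step does, here is a stdlib-only sketch (a hypothetical simplification; the real extractor, `extract_json_from_html.py`, uses `beautifulsoup4` and `json-repair` to also recover malformed model output):

```python
import json
import re

def extract_json_objects(html, required_fields=("question", "answer")):
    """Hypothetical sketch: find JSON-looking blobs in saved page HTML and
    keep only dicts that contain all required fields. The packaged extractor
    additionally repairs malformed JSON via json-repair."""
    objects = []
    # Match brace-balanced candidates (handles one level of nesting).
    for match in re.finditer(r"\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}", html, re.DOTALL):
        try:
            obj = json.loads(match.group(0))
        except json.JSONDecodeError:
            continue  # not valid JSON; skip this candidate
        if isinstance(obj, dict) and all(f in obj for f in required_fields):
            objects.append(obj)
    return objects
```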
See USAGE.md for detailed usage instructions and examples.
```bash
# Basic usage with default settings (Qwen.ai)
web-llm-interactor ask "What is the capital of Georgia?"

# Specify a different LLM site
web-llm-interactor ask "What is the capital of France?" --url "https://chat.qwen.ai/"

# Specify custom output HTML path
web-llm-interactor ask "What is the tallest mountain?" --output-html "./responses/mountain.html"

# Get all JSON objects, not just the last one
web-llm-interactor ask "List the largest oceans" --all

# Customize required JSON fields
web-llm-interactor ask "Explain quantum computing" --fields "question,answer"

# Skip adding JSON format instructions
web-llm-interactor ask "What's the weather in Tokyo?" --no-json-format

# Configure polling behavior
web-llm-interactor ask "What are the three branches of government?" --poll-interval 3 --stable-polls 2 --timeout 60
```
```bash
# Basic usage
osascript src/web_llm_interactor/send_enter_save_source.applescript "What is the capital of Georgia?" "https://chat.qwen.ai/" "./output.html"

# Get all responses
osascript src/web_llm_interactor/send_enter_save_source.applescript "What is the capital of Florida?" "https://chat.qwen.ai/" "./output.html" "--all"

# Specify required fields
osascript src/web_llm_interactor/send_enter_save_source.applescript "Explain quantum computing" "https://chat.qwen.ai/" "./output.html" "--fields" "question,answer"
```
```python
import subprocess
import json

def ask_web_llm(question, url="https://chat.qwen.ai/", custom_fields=None, get_all=False):
    """Query a web-based LLM and get a structured JSON response."""
    cmd = ["web-llm-interactor", "ask", question, "--url", url]
    if get_all:
        cmd.append("--all")
    if custom_fields:
        cmd.extend(["--fields", custom_fields])
    result = subprocess.check_output(cmd, text=True)
    return json.loads(result)

# Example usage
response = ask_web_llm("What is the capital of Idaho?")
print(f"Question: {response['question']}")
print(f"Answer: {response['answer']}")

# Get response with custom fields
custom_response = ask_web_llm(
    "Explain quantum computing in simple terms",
    custom_fields="question,answer",
)
print(custom_response["answer"])
```
- Stealth: AppleScript controls a real Chrome browser, making interactions indistinguishable from a human user
- Reliability: Unlike Selenium, which is often detected via browser fingerprinting or `navigator.webdriver`, this approach works with sites that block bots
- Simplicity: No need for complex browser drivers or additional configurations
The system uses a simple but effective approach to detect when an LLM has finished responding:

1. Record the initial HTML length when the message is sent
2. Poll the page at regular intervals (configurable with `--poll-interval`)
3. When the HTML grows significantly from its initial state (>500 characters), start tracking stability
4. When the HTML length stays the same for N consecutive polls (configurable with `--stable-polls`), consider the response complete
5. If the maximum wait time is reached (configurable with `--timeout`), proceed with the current content

This approach is more efficient than fixed wait times and works across different LLM interfaces.
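The polling strategy above can be sketched as a standalone Python loop (a hypothetical illustration; in the package this logic lives in the AppleScript and is driven by the CLI flags):

```python
import time

def wait_for_response(get_html_length, poll_interval=3, stable_polls=2,
                      timeout=60, growth_threshold=500):
    """Hypothetical sketch of adaptive response polling. Returns True when the
    HTML length is stable for `stable_polls` polls after significant growth,
    False when `timeout` elapses first."""
    initial = get_html_length()  # baseline length at message send
    deadline = time.time() + timeout
    last_length, stable = None, 0
    while time.time() < deadline:
        time.sleep(poll_interval)
        length = get_html_length()
        if length - initial <= growth_threshold:
            continue  # response has not meaningfully started yet
        if length == last_length:
            stable += 1
            if stable >= stable_polls:
                return True  # unchanged for N consecutive polls: complete
        else:
            stable = 0  # still growing; reset stability counter
        last_length = length
    return False  # timed out; caller proceeds with current content
```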
```
web_llm_interactor/
├── src/
│   └── web_llm_interactor/
│       ├── __init__.py
│       ├── cli.py                              # Command-line interface
│       ├── send_enter_save_source.applescript  # Browser automation script
│       ├── extract_json_from_html.py           # HTML-to-JSON extractor
│       ├── file_utils.py                       # File handling utilities
│       └── json_utils.py                       # JSON parsing utilities
├── scripts/
│   ├── demo.sh                                 # Demo script showing usage examples
│   └── cleanup.sh                              # Script to clean up temporary files
├── README.md
└── pyproject.toml
```
- No Chrome Tab Found: Make sure Chrome is open at the correct URL (e.g., https://chat.qwen.ai/). This is a required step before running any command!
- Empty Response: Try increasing the timeout with `--timeout 60`
- JSON Extraction Failed: Ensure the LLM is responding with properly formatted JSON, or specify required fields with `--fields`
- Response Too Slow: Adjust polling parameters with `--poll-interval` and `--stable-polls`
- Command Not Found: Ensure you've installed the package with `uv pip install -e .` and are using the correct command: `web-llm-interactor ask "Your question"`
For more detailed troubleshooting, see USAGE.md.
MIT License
web_llm_interactor empowers agents and CLI workflows to harness web-only LLMs, delivering API-like functionality with minimal setup.