Interactive Mode
In Interactive mode, your LLM coding agent drives a real browser by sending JavaScript commands via REST or MCP. The grid server proxies commands to the worker container — no LLM runs on the grid itself.
When to Use Interactive Mode
- Exploratory testing — discover how an app works, find locators, test interactions
- Flow development — build and debug .jsflow files interactively
- Debugging — investigate a failing autonomous job step-by-step
- Live demos — show stakeholders real browser automation in real time
- Solo development — test your localhost app before committing, as part of your dev loop
Connecting via curl
# Start an Interactive session
SESSION=$(curl -s -X POST http://localhost:4444/api/sessions \
-H "Content-Type: application/json" \
-d '{"mode": "interactive"}' | jq -r .sessionId)
# Navigate
curl -X POST http://localhost:4444/sessions/$SESSION/proxy \
-d '{ "js": "agent.go(\"https://your-app.com/login\")" }'
# Discover elements
curl -X POST http://localhost:4444/sessions/$SESSION/proxy \
-d '{ "js": "agent.look()" }'
# Interact
curl -X POST http://localhost:4444/sessions/$SESSION/proxy \
-d '{ "js": "agent.act(\"{input}Username\", \"input\", \"admin\")" }'
curl -X POST http://localhost:4444/sessions/$SESSION/proxy \
-d '{ "js": "agent.act(\"{button}Sign In\", \"click\")" }'
Connecting via MCP
# Add Karate as an MCP server in Claude Code
claude mcp add karate http://localhost:4444/mcp
Once connected, Claude Code can use the karate_eval tool to send JS commands. The experience is natural — Claude writes JS just as it would write any code.
Testing Against Localhost
Run the grid alongside your development server. Worker containers reach your local app via host.docker.internal:
# Your dev server
npm run dev # or whatever starts your app on localhost:3000
# In another terminal — start the grid
java -jar veriquant.jar grid --port 4444
# Your LLM agent navigates to your local app
curl -X POST http://localhost:4444/sessions/$SESSION/proxy \
-d '{ "js": "agent.go(\"http://host.docker.internal:3000\")" }'
This enables a pre-checkin verification loop: the same AI agent that writes your code can immediately test it in a real browser — before you push.
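If the agent's instructions contain localhost URLs, they need translating before the worker container can reach them. A tiny helper (a sketch; host.docker.internal is the hostname documented above) does the rewrite:

```shell
#!/bin/sh
# Map host-local URLs to the address the worker container can reach.
# A sketch: handles the two common spellings of "this machine".
to_container_url() {
  printf '%s' "$1" | sed -e 's/localhost/host.docker.internal/' \
                         -e 's/127\.0\.0\.1/host.docker.internal/'
}

# to_container_url 'http://localhost:3000'
# → http://host.docker.internal:3000
```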
The API Reference Endpoint
Every session exposes a /prompt endpoint with a self-contained API reference:
curl http://localhost:4444/sessions/$SESSION/prompt
Point your LLM agent at this URL. It returns a compact reference document the LLM can read, covering all available commands — look(), act(), wait(), match(), Flow.run(), and more.
Working with look()
look() is the primary discovery tool. It returns structured JSON:
{
"elements": [
{"role": "link", "name": "Home", "locator": "{a}Home", "actions": ["click"]},
{"role": "button", "name": "Submit", "locator": "{button}Submit", "actions": ["click"]},
{"role": "textbox", "name": "Email", "locator": "{input}Email", "actions": ["input", "clear"]}
]
}
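Because look() returns JSON, the response is easy to post-process. For example, extracting just the locators with jq (a sketch; the sample JSON mirrors the shape shown above, but in practice it comes back from the proxy call):

```shell
#!/bin/sh
# Pull the locators out of a look() response with jq.
# The sample JSON is inlined here for illustration.
look_json='{"elements":[
  {"role":"link","name":"Home","locator":"{a}Home","actions":["click"]},
  {"role":"button","name":"Submit","locator":"{button}Submit","actions":["click"]}
]}'
printf '%s' "$look_json" | jq -r '.elements[].locator'
# → {a}Home
#   {button}Submit
```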
Diff mode: After the first look(), subsequent calls return only changes — {added, removed, changed, unchanged}. This reduces tokens by 70-90% when navigating within a SPA.
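A diff-mode response might look like the following sketch. The four field names come from the text above; the element payloads and the shape of unchanged (shown here as a count) are assumptions for illustration:

```json
{
  "added": [
    {"role": "link", "name": "Profile", "locator": "{a}Profile", "actions": ["click"]}
  ],
  "removed": [
    {"role": "button", "name": "Sign In", "locator": "{button}Sign In", "actions": ["click"]}
  ],
  "changed": [
    {"role": "link", "name": "Home", "locator": "{a}Home", "actions": ["click"]}
  ],
  "unchanged": 12
}
```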
Building Flows Interactively
The typical workflow:
- Use look() to discover elements
- Use act() to interact — note which locators and sequences work
- Codify the working pattern as a .jsflow file via File.write()
- Test with Flow.run('path/to/flow') — verify it executes correctly
- Iterate until the flow handles the full workflow
See Flows for details on writing and composing flow files.