Karate Agent Documentation
Karate Agent is an AI-native browser testing platform built on Karate. It provisions on-demand browser containers from a lightweight grid server and supports two modes of operation.
Two Modes
| Mode | Who drives the browser | Server-side LLM needed? | Use case |
|---|---|---|---|
| Interactive | Client-side LLM (Claude Code, Cursor, etc.) sends JS commands via REST | No (the grid only proxies commands) | Exploratory testing, debugging, live demos |
| Autonomous | Worker-side LLM drives an observe-decide-act loop inside the karate-agent container | Yes (configured on the worker) | CI/CD, scheduled tests, batch jobs |
Both modes share the same session lifecycle and API surface.
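To make the observe-decide-act loop from the table concrete, here is a minimal sketch in plain JavaScript. Everything in it is an assumption for illustration: `runLoop`, `observe`, `decide`, `act`, and the toy `page` object are hypothetical names, and the `decide` function is a hard-coded stand-in for the LLM call a worker would make. It is not the worker's actual implementation.

```javascript
// Hypothetical stand-ins: none of these names come from Karate Agent itself.
function runLoop(observe, decide, act, maxSteps) {
  const history = [];
  for (let step = 0; step < maxSteps; step++) {
    const observation = observe();        // observe: snapshot the current state
    const decision = decide(observation); // decide: an LLM call in a real worker
    if (decision.done) {
      return { status: "done", steps: history };
    }
    act(decision);                        // act: apply the chosen action
    history.push(decision);
  }
  // The step cap keeps a confused decider from looping forever.
  return { status: "step-limit", steps: history };
}

// Toy page: one text field and a submit button.
const page = { name: "", submitted: false };
const observe = () => ({ ...page });
const decide = (obs) => {
  if (obs.name === "") return { action: "input", target: "name", value: "Ada" };
  if (!obs.submitted) return { action: "click", target: "submit" };
  return { done: true };
};
const act = (d) => {
  if (d.action === "input") page[d.target] = d.value;
  if (d.action === "click") page.submitted = true;
};

const loopResult = runLoop(observe, decide, act, 10);
console.log(loopResult.status, loopResult.steps.length);
```

In interactive mode the same loop would run on the client side, with the coding agent playing the role of `decide` and the grid relaying each action.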
The Agent API
The core API is designed for LLM consumption — minimal, structured, and token-efficient:
- `look()`: Discover elements on the page. Returns structured JSON with `{role, name, locator, actions}` per element. Subsequent calls return diffs only.
- `act(locator, action, value?)`: Interact with an element. Display-text locators like `{button}Submit` use visible text, not CSS/XPath.
- `wait(condition)`: Wait for a condition (navigation, element, text change).
- `match(actual, expected)`: Karate's structural matching with schema markers.
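The shape of a `look()` / `act()` exchange can be sketched with an in-memory stand-in. This is not the Karate Agent implementation: `createFakeAgent`, the `onClick` hook, the action names `"input"` and `"click"`, the `{input}` and `{text}` roles, and the reading of "diff" as "elements not reported by an earlier `look()`" are all assumptions made for illustration. `wait()` and `match()` are left out because the text above does not describe their argument formats.

```javascript
// In-memory stand-in that mimics the documented shapes only.
function createFakeAgent(initialElements) {
  const elements = [...initialElements];
  const seen = new Set(); // locators already reported by look()

  // Build the per-element record, including a display-text locator
  // such as "{button}Submit".
  const describe = (el) => ({
    role: el.role,
    name: el.name,
    locator: `{${el.role}}${el.name}`,
    actions: el.actions,
  });

  return {
    // First call returns every element; later calls return only new ones
    // (an assumed, simplified notion of a diff).
    look() {
      const fresh = elements.map(describe).filter((d) => !seen.has(d.locator));
      fresh.forEach((d) => seen.add(d.locator));
      return fresh;
    },
    // Resolve the locator by visible text and apply the action.
    act(locator, action, value) {
      const el = elements.find((e) => describe(e).locator === locator);
      if (!el) throw new Error(`no element for ${locator}`);
      if (!el.actions.includes(action)) {
        throw new Error(`${action} not allowed on ${locator}`);
      }
      if (action === "input") el.value = value;
      if (action === "click" && el.onClick) el.onClick(elements);
    },
  };
}

const agent = createFakeAgent([
  { role: "input", name: "Email", actions: ["input"] },
  {
    role: "button",
    name: "Submit",
    actions: ["click"],
    // Clicking reveals a confirmation message on the fake page.
    onClick: (all) => all.push({ role: "text", name: "Thanks!", actions: [] }),
  },
]);

const firstLook = agent.look();
agent.act("{input}Email", "input", "ada@example.com");
agent.act("{button}Submit", "click");
const secondLook = agent.look();
console.log(firstLook.map((e) => e.locator));
console.log(secondLook.map((e) => e.locator));
```

The second `look()` reports only the newly revealed element, which is the property that keeps repeated observations cheap in tokens.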
Key Concepts
- Flows: Reusable `.js` scripts that execute at native speed via `Flow.run()`. No LLM tokens consumed.
- Interactive Mode: Explore the browser live through your existing LLM coding agent.
- Autonomous Mode: Submit jobs that run flows first, with LLM recovery for failures.
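The "flows first, LLM recovery on failure" idea can be sketched as a simple try-then-fall-back wrapper. The names `runJob`, `stableFlow`, `brokenFlow`, and `fakeRecover` are hypothetical, the flows here are ordinary functions rather than scripts launched through `Flow.run()`, and the recovery step is a stub where an LLM-driven loop would go. Treat it as an illustration of the control flow, not of how jobs are actually scheduled.

```javascript
// Hypothetical job runner: try the scripted flow, recover only if it fails.
function runJob(flow, recover) {
  try {
    // Happy path: the scripted flow runs with no LLM involvement.
    return { via: "flow", result: flow() };
  } catch (err) {
    // Reached only when the flow throws; this is where tokens would be spent.
    return { via: "recovery", result: recover(err) };
  }
}

const stableFlow = () => "logged in";
const brokenFlow = () => {
  throw new Error("locator {button}Sign in not found");
};
// Stub standing in for an LLM-driven recovery attempt.
const fakeRecover = (err) => `recovered after: ${err.message}`;

const okJob = runJob(stableFlow, fakeRecover);
const recoveredJob = runJob(brokenFlow, fakeRecover);
console.log(okJob.via, recoveredJob.via);
```

Under this arrangement a stable suite costs nothing in LLM usage, and the model is consulted only for the runs that break.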
Requirements
- Java 21+
- Docker (for browser containers)
- LLM API key (for autonomous mode only — any OpenAI-compatible, Anthropic, or Ollama endpoint)
Next Steps
- Quick Start — Get running in under 60 seconds
- Architecture — Understand the grid server, workers, and session lifecycle