Verification Belongs Inside the Dev Loop.
Not After It.

AI is making developers 3x faster. Testing just became the bottleneck. Karate Agent is the AI-native verification platform that keeps pace — from exploratory testing to CI/CD, on your infrastructure.

100% Self-Hosted • Bring Your Own LLM • Java 21 + Docker • Enterprise Ready

Screenshot: Karate Agent dashboard showing the job list with active and completed sessions
Built on Karate — the open-source test framework trusted by 8,000+ teams

The Speed Problem — Solved

LLM-driven browser automation is powerful but slow. Flows fix this.

Guidewire PolicyCenter

A 12-step Personal Auto submission, from login through quote. Each step was developed as an independent flow in Interactive mode, then composed into a single orchestrator.

Pure LLM: 8–19 minutes
Flows: ~30 seconds

Flows execute at native JavaScript speed — no LLM tokens consumed. The LLM is only invoked for recovery when a step fails.
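
The "scripted first, LLM only on failure" pattern can be sketched as follows. This is a minimal illustration, not Karate Agent's implementation: runWithRecovery, the step objects, and the recover callback are all hypothetical names, and the LLM call is replaced by a stub.

```javascript
// Hypothetical sketch: none of these names come from Karate Agent's API.
// Each step is a plain function; `recover` stands in for the LLM call.
function runWithRecovery(steps, recover, maxRecoveries = 4) {
  const log = [];
  let llmCalls = 0;
  for (const step of steps) {
    try {
      step.run(); // scripted path: no LLM involved
      log.push({ step: step.name, via: "script" });
    } catch (err) {
      // Only a failed step triggers the (stubbed) recovery loop.
      let recovered = false;
      for (let attempt = 1; attempt <= maxRecoveries && !recovered; attempt++) {
        llmCalls++;
        recovered = recover(step, err, attempt);
      }
      if (!recovered) {
        throw new Error(`Step "${step.name}" failed after ${maxRecoveries} recovery attempts`);
      }
      log.push({ step: step.name, via: "llm-recovery" });
    }
  }
  return { log, llmCalls };
}

// Stubbed usage: the second step fails and is recovered on the second attempt.
const result = runWithRecovery(
  [
    { name: "login", run: () => {} },
    { name: "create-quote", run: () => { throw new Error("button not found"); } },
  ],
  (step, err, attempt) => attempt === 2
);
console.log(result.llmCalls, result.log.map((entry) => entry.via));
```

In this sketch a run in which every step succeeds never touches the recovery callback, which is the property the paragraph above describes.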

Screenshot: completed job with session transcript and iteration details

Why Karate Agent

Self-Hosted, Air-Gap Ready

Your data never leaves your network. Session transcripts, screenshots, and recordings stay on your file system. Suitable for financial services, healthcare, and government.

Bring Your Own LLM

Run with Claude, GPT, Gemma, Llama, or any local model via Ollama. Zero cloud dependency. No vendor lock-in. Use whatever model your organization has approved.

Token-Efficient by Design

Structured JSON responses — not DOM dumps. look() diffing reduces repeated page scans by 72x. One JS code block batches multiple actions in a single HTTP request.
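
To illustrate why diffing saves tokens, here is a minimal sketch of snapshot diffing in general. It does not describe how look() works internally: the snapshot shape (a map from an element key to its visible text) and the diffSnapshots function are assumptions made for the example.

```javascript
// Illustrative only: the internals of look() are not described on this page.
// A snapshot is modelled as a map from a stable element key to its visible text.
function diffSnapshots(previous, current) {
  const added = [];
  const changed = [];
  const removed = [];
  for (const [key, text] of Object.entries(current)) {
    if (!(key in previous)) {
      added.push({ key, text });
    } else if (previous[key] !== text) {
      changed.push({ key, from: previous[key], to: text });
    }
  }
  for (const key of Object.keys(previous)) {
    if (!(key in current)) removed.push(key);
  }
  return { added, changed, removed };
}

const before = { "button:Submit": "Submit", "h1:title": "New Quote", "input:zip": "" };
const after = { "button:Submit": "Submit", "h1:title": "Quote Summary", "div:premium": "$1,204" };

// Only the differences are reported; the unchanged Submit button is omitted.
const delta = diffSnapshots(before, after);
console.log(JSON.stringify(delta, null, 2));
```

The unchanged elements never reach the model on the second scan, so the payload grows with what changed rather than with the size of the page.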

Display-Text Locators

A locator such as {button}Submit matches an element by its visible text, not by CSS or XPath. When the app is refactored and element IDs change, display-text locators keep working.
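
One way to picture a display-text locator is as a translation into a text-matching XPath. The sketch below assumes that {tag}text means "the tag element whose visible text is text" and that an empty tag matches any element; displayTextToXPath is a hypothetical helper, not Karate Agent's resolver.

```javascript
// Assumption: "{tag}text" means "the <tag> element whose visible text is text".
// The XPath translation is an illustration, not Karate Agent's implementation.
function displayTextToXPath(locator) {
  const match = /^\{([a-zA-Z0-9]*)\}(.+)$/.exec(locator);
  if (!match) throw new Error(`Not a display-text locator: ${locator}`);
  const tag = match[1] || "*"; // "{}Submit" matches any tag
  const text = match[2];
  if (text.includes("'") && text.includes('"')) {
    throw new Error("Text containing both quote styles is not handled in this sketch");
  }
  // XPath 1.0 string literals cannot escape quotes, so pick the unused style.
  const literal = text.includes("'") ? `"${text}"` : `'${text}'`;
  return `//${tag}[normalize-space(text())=${literal}]`;
}

console.log(displayTextToXPath("{button}Submit"));
// //button[normalize-space(text())='Submit']
```

Nothing in the resulting expression refers to an id or class attribute, which is why a refactor that renames them leaves the locator intact.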

MCP Server Built In

Single karate_eval tool via Model Context Protocol. Compatible with Claude Code, VS Code Copilot, and any MCP-compliant client. No sidecar, no Node.js.
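
For readers unfamiliar with MCP, a tool invocation is a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request for the karate_eval tool. The tool name comes from this page; the argument key (script) and the click() call inside the script string are assumptions for illustration, so check the tool's published schema before relying on them.

```javascript
// Builds an MCP "tools/call" request (JSON-RPC 2.0) for the karate_eval tool.
// Assumption: the tool accepts a single string argument named "script".
function buildKarateEvalCall(id, script) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "karate_eval",
      arguments: { script },
    },
  };
}

// Hypothetical script body: the click() helper is assumed, not documented here.
const request = buildKarateEvalCall(1, "click('{button}Submit')");
console.log(JSON.stringify(request, null, 2));
```

An MCP client such as Claude Code constructs and sends this envelope itself; the sketch only shows what travels over the wire.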

Session Video Recording

Opt-in H.264 recording at 8fps. Compliance teams audit exactly what the agent did. QA reviews failures without reproducing them.

Start Free, Scale Gradually

Most teams stabilize at Stage 3: 90%+ scripted (free), with the LLM handling edge cases.

1. Interactive

Developer drives the browser via curl or Claude Code. Discovers locators, explores the app.

$0 / run

2. Scripted Flows

Reusable .js flow files execute via Flow.run(). Deterministic, repeatable, no LLM.

$0 / run

3. Autonomous + Flows

Scripted flows first. LLM invoked only when a step fails (2–4 recovery iterations).

~$0.02–0.05 / job

4. Fully Autonomous

LLM drives the entire workflow. Reserve for exploratory testing or new workflows.

~$0.15–0.50 / job
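
The per-job prices above translate into a budget estimate with simple arithmetic. The sketch below uses only the ranges quoted for each stage; the workload (900 scripted jobs, 90 with recovery, 10 fully autonomous) is a made-up example, and the function names are hypothetical.

```javascript
// Per-job LLM cost ranges quoted above, in US cents to keep the arithmetic exact.
const STAGE_COST_CENTS = {
  1: [0, 0],   // Interactive
  2: [0, 0],   // Scripted Flows
  3: [2, 5],   // Autonomous + Flows
  4: [15, 50], // Fully Autonomous
};

// Sums the low and high ends of the range across a { stage: jobCount } workload.
function estimateCostCents(jobsByStage) {
  let low = 0;
  let high = 0;
  for (const [stage, jobs] of Object.entries(jobsByStage)) {
    const range = STAGE_COST_CENTS[stage];
    if (!range) throw new Error(`Unknown stage: ${stage}`);
    low += jobs * range[0];
    high += jobs * range[1];
  }
  return { low, high };
}

const formatUsd = (cents) => `$${(cents / 100).toFixed(2)}`;

// Hypothetical workload, chosen only to illustrate the calculation.
const estimate = estimateCostCents({ 2: 900, 3: 90, 4: 10 });
console.log(`${formatUsd(estimate.low)} to ${formatUsd(estimate.high)}`);
// $3.30 to $9.50
```

In this example the ten fully autonomous jobs account for $1.50 to $5.00 of the total, which shows why Stage 4 is best reserved for exploratory work.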

How It Works

From exploration to automation in three steps.

Explore Interactively

Your LLM coding agent drives a real browser via curl or MCP. Discover locators, test interactions, build understanding.

Create & Test Flows

Codify working patterns as reusable .js flow files. Execute at native speed — no LLM tokens consumed.

Run Autonomously

Submit jobs via dashboard or API. Flows run fast, LLM handles unknowns. Review reports, fix flows, repeat.


Who It's For

Solo Developer

Run the agent server locally. Test your app on localhost before you commit. Verify UI changes as part of your dev loop — the same agent that writes your code can test it. Personal automation for repetitive browser tasks.

Interactive User

Connect Claude Code, Cursor, or any LLM agent to a live browser session. Explore apps, discover locators, build and debug flows interactively. Develop automation step-by-step with instant feedback.

QA & Acceptance

Submit test jobs via dashboard or CI pipeline. Run scripted flows with AI recovery. Review reports, watch recordings, manage deviations. Shared team server with concurrent sessions.

Business Stakeholder

Watch session recordings to see exactly what was tested. Review reports on the shared dashboard — no technical setup required. Understand test coverage without needing access to code or infrastructure.

Enterprise-Grade Features

Session Isolation: Each session runs in its own Docker container with dedicated Chrome
Live Browser View: noVNC in the dashboard (watch, pause, inject, resume)
Audit Trail: Transcripts, reports, screenshots, and video per session
CI/CD Integration: REST API, so any pipeline that can curl can submit jobs
MCP Support: Works with Claude Code, VS Code Copilot, any MCP client
Shared Dashboard: One URL for QA, devs, and managers, with no per-seat installs
Enterprise SPA Support: Cursor-pointer discovery for Guidewire, Salesforce, ServiceNow
No Vendor Lock-In: Plain JS flows, standard REST API, any LLM provider
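
As an illustration of the CI/CD integration, a pipeline step could submit a job with a single HTTP POST. The sketch below only builds the request. The /api/jobs path, the payload fields (flow, recoverWithLlm), and the host are placeholders invented for the example, not the documented Karate Agent API.

```javascript
// Hypothetical request builder: endpoint path and payload fields are placeholders.
function buildJobSubmission(baseUrl, job) {
  return {
    url: new URL("/api/jobs", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(job),
    },
  };
}

const submission = buildJobSubmission("http://localhost:8080", {
  flow: "personal-auto-quote.js", // placeholder flow file name
  recoverWithLlm: true,           // placeholder option
});
console.log(submission.url, submission.init.body);

// In a pipeline step (Node 18+), the request would be sent with:
//   const response = await fetch(submission.url, submission.init);
```

The same request can be issued with curl from any CI system, so no runner-specific plugin is needed.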

Ready to Verify at the Speed of AI?

See how Karate Agent can transform your testing workflow.

Book a Demo