Features

Every feature. One jar.

Everything you need for AI-native browser testing at enterprise scale.

No premium tier, no paid add-ons, no feature gating. The eight capabilities below are in every Karate Agent deployment.

agent — live session

> agent.go("https://app.local")
> agent.look()
{ role: "form", fields: [
    {role:"input", name:"Email"},
    {role:"button", name:"Sign in"} ] }
> act('{button}Sign in', 'click')
✓ clicked · 18ms
> Flow.run('login.js')
✓ ok: true · 0 tokens

01 · Token-Efficient API

An automation API the LLM actually wants to read

The JS Agent API (look(), act(), wait()) is purpose-built for LLM consumption. Structured JSON responses with only actionable data — no DOM dumps, no HTML parsing.

✓ Far smaller responses: look() returns {role, name, locator, actions} per element, not raw HTML (about 2 KB versus about 80 KB per page scan)
✓ Incremental updates: subsequent look() calls return only what changed
✓ agent.text() renders the page as structured markdown (tables, headings, key-value pairs) for data extraction
✓ Batch operations: fill forms, click, and navigate in one HTTP request
✓ Prompt caching: high cache hit rate on the system prompt at a fraction of regular input cost
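
The incremental-update idea can be illustrated with a small sketch. This is not Karate Agent's implementation: `diffLook` is a hypothetical helper that compares two look() results, assumes each element's `locator` is unique, and reports only what was added, removed, or changed.

```javascript
// Hypothetical helper: compare two look() snapshots and keep only the changes.
// Assumes each element carries a unique `locator` string.
function diffLook(previous, current) {
  const before = new Map(previous.map((el) => [el.locator, el]));
  const after = new Map(current.map((el) => [el.locator, el]));

  const added = current.filter((el) => !before.has(el.locator));
  const removed = previous.filter((el) => !after.has(el.locator));
  const changed = current.filter((el) => {
    const old = before.get(el.locator);
    return old !== undefined && JSON.stringify(old) !== JSON.stringify(el);
  });

  return { added, removed, changed };
}

// Sample snapshots (made-up data): the "Sign in" button becomes "Sign out".
const first = [
  { role: "input", name: "Email", locator: "{input}Email", actions: ["input", "clear"] },
  { role: "button", name: "Sign in", locator: "{button}Sign in", actions: ["click"] },
];
const second = [
  { role: "input", name: "Email", locator: "{input}Email", actions: ["input", "clear"] },
  { role: "button", name: "Sign out", locator: "{button}Sign out", actions: ["click"] },
];

console.log(JSON.stringify(diffLook(first, second), null, 2));
```

Sending only the `added`, `removed`, and `changed` arrays keeps follow-up responses small when most of the page is unchanged.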

Raw HTML dump: ~80 KB per page scan

agent.look(): ~2 KB structured JSON

agent.look() — structured response

[
  { role: "input", name: "Policy Number", locator: "{input}Policy Number", actions: ["input", "clear"] },
  { role: "button", name: "Submit", locator: "{button}Submit", actions: ["click"] }
]
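
Because each element lists its own `locator` and `actions`, a caller can plan a form fill without touching the DOM. The following is a minimal sketch that assumes the response shape above; `planFill` and the step objects it returns are hypothetical and not part of the Agent API, and the policy number is made-up sample data.

```javascript
// Hypothetical planner: turn a look() response plus form values into act() steps.
function planFill(elements, values) {
  const steps = [];
  for (const [name, value] of Object.entries(values)) {
    // Only elements that advertise the "input" action are valid fill targets.
    const el = elements.find((e) => e.name === name && e.actions.includes("input"));
    if (!el) throw new Error(`No input named "${name}" in look() response`);
    steps.push({ locator: el.locator, action: "input", value });
  }
  return steps;
}

const elements = [
  { role: "input", name: "Policy Number", locator: "{input}Policy Number", actions: ["input", "clear"] },
  { role: "button", name: "Submit", locator: "{button}Submit", actions: ["click"] },
];

console.log(planFill(elements, { "Policy Number": "PN-1001" }));
```

Each planned step lines up with the three-argument form `act(locator, action, value)` shown elsewhere on this page.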

locator syntax — compared

// Karate Agent: display-text locators
act('{button}Submit', 'click')
act('{input}Email', 'input', 'user@example.com')
act('{a}Sign In', 'click')

// vs. traditional selectors
driver.findElement(By.xpath("//button[contains(text(),'Submit')]")).click()
page.locator('button:has-text("Submit")').click()

02 · Display-Text Locators

The locator that survives your next refactor

{button}Submit uses visible text instead of CSS selectors or XPath. When the app is refactored and element IDs change, display-text locators keep working.

This dramatically reduces maintenance — the #1 cost of traditional UI test suites. Karate's locator syntax is also shorter: 16 characters vs 30–42 for equivalent Playwright/XPath selectors, saving output tokens across every session.
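
To see the length difference concretely, the sketch below builds a display-text locator and measures it against the two selector strings from the comparison earlier on this page. `textLocator` is a hypothetical helper, and the totals depend on how much of the surrounding call and quoting is counted, so treat the output as indicative only.

```javascript
// Hypothetical helper: build a display-text locator in the {tag}Text form.
function textLocator(tag, text) {
  return `{${tag}}${text}`;
}

const karate = textLocator("button", "Submit");
// Selector strings copied from the comparison on this page.
const playwright = 'button:has-text("Submit")';
const xpath = "//button[contains(text(),'Submit')]";

// Print the raw character count of each selector (quotes not included).
for (const [label, selector] of Object.entries({ karate, playwright, xpath })) {
  console.log(`${label}: ${selector} (${selector.length} chars)`);
}
```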

03 · Bring Your Own LLM

Any provider. Any model. Including zero-cost self-hosted.

No vendor lock-in. Use cloud APIs or run open-weight models on your own hardware for zero API cost and full data sovereignty.

Cloud aggregator

OpenRouter

100+ models through one endpoint.

openrouter/*

Direct

Anthropic

Claude Sonnet, Haiku, Opus — native.

anthropic/*

Self-hosted · $0 API

Gemma 26B

4B active (MoE) · 16 GB VRAM · consumer GPU.

ollama/gemma3

OpenAI-compatible

Ollama / vLLM

Llama, Mistral, Qwen, any local.

ollama/* · vllm/*

Gemma 26B passes every benchmark

Gemma 26B (4B active parameters, MoE) runs on a single consumer GPU and passes all our page automation, flow integration, and vision benchmarks. For regulated industries: the entire stack — server, browser, and LLM — runs on your infrastructure with zero internet dependency. Per-job model override lets teams use smaller models for routine jobs, larger models for exploratory work.

provider routing

# Provider prefix handles routing
openrouter/anthropic/claude-sonnet-4-6
anthropic/claude-haiku-4-5
ollama/gemma3
ollama/llama3

# Single env var to switch
KARATE_AGENT_MODEL=ollama/gemma3
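
One way to picture prefix routing: split the model string at the first slash and treat everything after it as the provider's own model id. The sketch below is an assumption about how such routing could work, not Karate Agent's code; `parseModel`, `resolveModel`, and the `jobModel` argument are hypothetical names, while `KARATE_AGENT_MODEL` comes from the snippet above.

```javascript
// Hypothetical parser for "provider/model" strings such as "ollama/gemma3".
// Only the first "/" separates the provider; the rest is the model id.
function parseModel(spec) {
  const slash = spec.indexOf("/");
  if (slash <= 0 || slash === spec.length - 1) {
    throw new Error(`Expected "provider/model", got "${spec}"`);
  }
  return { provider: spec.slice(0, slash), model: spec.slice(slash + 1) };
}

// A per-job override (hypothetical `jobModel`) wins over the environment default.
function resolveModel(jobModel, env) {
  const spec = jobModel || env.KARATE_AGENT_MODEL;
  if (!spec) throw new Error("No model configured");
  return parseModel(spec);
}

const env = { KARATE_AGENT_MODEL: "ollama/gemma3" };
console.log(resolveModel(undefined, env));
console.log(resolveModel("openrouter/anthropic/claude-sonnet-4-6", env));
```

Splitting on only the first slash is what lets an aggregator entry such as `openrouter/anthropic/claude-sonnet-4-6` keep its nested model id intact.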

04 · 100% Self-Hosted

Air-gap compatible, by design

The entire platform — grid server, worker containers, dashboard — runs on your infrastructure. No data leaves your network.

✓ Session transcripts, screenshots, and recordings stay on your file system
✓ Pair with self-hosted Gemma 26B via Ollama for zero internet dependency: even the LLM runs locally
✓ Suitable for regulated industries: financial services, healthcare, government
✓ No cloud-based testing tool or AI API required

deploy — single artifact

# Single-artifact deployment
$ docker pull karatelabs/karate-agent:latest
$ java -jar karate-agent.jar dashboard --port 4444
$ open http://localhost:4444
▸ ready

# That's it. One jar. One image.
# No microservices. No databases.
# No message queues.

Runtime: Java 21
Container: Docker 24+
Artifacts: 1 jar · 1 image
Network: Air-gap OK

flows/ — version-controlled in git

flows/
├── task.md              # plain-English hints
├── orchestrator.js      // chains login → quote → submit
├── auth/
│   ├── login.js
│   └── logout.js
├── quote/
│   ├── steps.md
│   ├── new-quote.js
│   └── verify.js
└── shared/
    └── wait-for-grid.js

5 LLM iterations: ~50 s

Flow.run(): ~2 s (same flow, scripted)

05 · Flow System

Executable `.js`. Native speed. Self-healing.

Executable .js scripts run at native JavaScript speed. A login flow that takes 5 LLM iterations (~50 s) executes in about 2 seconds via Flow.run(). The flow system also supports .md task files: plain-English hints that can cut iterations in half without writing any code.

✓ Compose flows: chain login → navigate → fill form → submit
✓ Self-healing: a failed flow returns the error plus the current page state, so the LLM can recover automatically
✓ Deviation reporting: deviations are flagged in the report so teams know which flows need maintenance
✓ .md task files: plain-English hints alongside .js flows, no code required
✓ Plain JavaScript: no proprietary DSL, readable by any developer, versioned in git
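
The compose, self-heal, and deviation-report behavior can be sketched as a small runner. Everything here is an illustration under stated assumptions: `runChain` is hypothetical, `runFlow` stands in for Flow.run(), the `recover` callback stands in for the LLM, and the `{ ok, error, page }` result shape is assumed from the description above. Stubs replace the browser so the sketch runs on its own.

```javascript
// Hypothetical runner: execute flows in order. On failure, hand the error and
// page state to a recovery callback and record the deviation for the report.
async function runChain(flowNames, runFlow, recover) {
  const deviations = [];
  for (const name of flowNames) {
    const result = await runFlow(name);
    if (result.ok) continue;
    deviations.push({ flow: name, error: result.error });
    const recovered = await recover(name, result.error, result.page);
    if (!recovered) return { ok: false, failedAt: name, deviations };
  }
  return { ok: true, deviations };
}

// Stub flow runner: "quote/new-quote.js" fails, everything else succeeds.
const stubRun = async (name) =>
  name === "quote/new-quote.js"
    ? { ok: false, error: "button not found", page: [{ role: "button", name: "Create Quote" }] }
    : { ok: true };

// Stub recovery: pretend the LLM succeeds whenever it has page state to work with.
const stubRecover = async (_name, _error, page) => page.length > 0;

runChain(["auth/login.js", "quote/new-quote.js", "quote/verify.js"], stubRun, stubRecover)
  .then((report) => console.log(JSON.stringify(report, null, 2)));
```

A chain that completes with a non-empty `deviations` list still passed, but the listed flows are the ones that need maintenance.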

06 · Works With Any Coding Agent

One `curl`. Your agent bootstraps itself.

AI-first design. One curl command and your coding agent knows how to create browser sessions and drive them. No plugins, no configuration files, no sidecar processes.

✓ Self-bootstrapping: /api/prompt returns a markdown document that teaches the LLM the full API, so it creates sessions, drives browsers, and submits jobs on its own
✓ Any agent: Claude Code, Cursor, VS Code Copilot, OpenAI Codex CLI, JetBrains AI, Aider, or anything else that can run a shell command
✓ MCP also supported: native karate_eval tool via Streamable HTTP for agents that prefer MCP
✓ Per-session API reference: each live session exposes /sessions/{id}/prompt with the full Agent API tailored to that session

Any agent that can run a shell command

Claude Code · Cursor · VS Code Copilot · OpenAI Codex CLI · JetBrains AI · Aider · curl + any LLM

bootstrap — one curl

# Point the agent at the prompt endpoint
$ curl http://localhost:4444/api/prompt
# returns markdown teaching the full Agent API

# Or via MCP
$ claude mcp add --transport http karate http://localhost:4444/mcp
▸ tool: karate_eval registered

# Per-session API reference
$ curl http://localhost:4444/sessions/$ID/prompt
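
The same bootstrap can be done from a script instead of curl. This sketch assumes only the two endpoints named above (/api/prompt and /sessions/{id}/prompt) and a server on localhost:4444; `promptUrl` and `fetchPrompt` are hypothetical helpers, and the session id is a made-up placeholder. The fetch function is injectable so the code can be exercised without a running server.

```javascript
// Build the prompt URL: global (/api/prompt) or per session (/sessions/{id}/prompt).
function promptUrl(baseUrl, sessionId) {
  const path = sessionId
    ? `/sessions/${encodeURIComponent(sessionId)}/prompt`
    : "/api/prompt";
  return new URL(path, baseUrl).toString();
}

// Fetch the markdown prompt. `fetchImpl` defaults to the global fetch (Node 18+).
async function fetchPrompt(baseUrl, sessionId, fetchImpl = fetch) {
  const response = await fetchImpl(promptUrl(baseUrl, sessionId));
  if (!response.ok) throw new Error(`Prompt request failed: HTTP ${response.status}`);
  return response.text();
}

console.log(promptUrl("http://localhost:4444"));
console.log(promptUrl("http://localhost:4444", "abc123"));
```

With a server running, `fetchPrompt("http://localhost:4444")` would resolve to the markdown that the agent reads as its instructions.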

07 · Recording & Reproducibility

Every session produces a complete audit trail

H.264 video, step-by-step transcript, structured report, screenshots per step. Recordings accelerate the entire team.

Failure diagnosis

Review video + transcript instead of reproducing 12-step workflows.

Flow development

Watch successful autonomous sessions, extract patterns, codify as flows.

Onboarding

New team members watch recordings to learn app navigation and locator patterns.

Stakeholder demos

Share recordings with product owners. They see exactly what was tested.

08 · Enterprise SPA Support

The SPAs Selenium quietly gives up on

Cursor-pointer discovery catches <div onclick> targets in Guidewire, Salesforce, ServiceNow — apps where standard locator strategies fail.

When enterprise field formatters (Guidewire, SAP) clear a value that was set before their async initialization completes, the agent retries automatically. No flow workaround is needed.
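
The retry behavior amounts to a set-and-verify loop: write the value, read it back, and write again if a late-initializing formatter wiped it. The sketch below shows the general technique, not the agent's internal code; `setWithRetry`, its parameters, and the stub field are all hypothetical.

```javascript
// Hypothetical set-and-verify loop: write a value, wait briefly, read it back,
// and retry if a late-initializing formatter has cleared or changed it.
async function setWithRetry(setValue, readValue, value, maxAttempts = 3, delayMs = 50) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await setValue(value);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    if ((await readValue()) === value) return attempt;
  }
  throw new Error(`Value did not stick after ${maxAttempts} attempts`);
}

// Stub field whose formatter discards the first write, mimicking async init.
function makeFlakyField() {
  let stored = "";
  let writes = 0;
  return {
    set: async (v) => { writes += 1; stored = writes === 1 ? "" : v; },
    get: async () => stored,
  };
}

const field = makeFlakyField();
setWithRetry(field.set, field.get, "2025-01-01")
  .then((attempts) => console.log(`value kept after ${attempts} attempt(s)`));
```

Bounding the attempts matters: a field that never accepts the value should surface as an error rather than loop forever.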

Guidewire PolicyCenter · Guidewire ClaimCenter · Salesforce · ServiceNow · SAP Fiori · Workday

cursor-pointer discovery

// Guidewire uses <div onclick>: no buttons
// Standard finders return nothing
> agent.look()
[
  { role: "clickable",            // cursor:pointer
    name: "New Policy",
    locator: "{clickable}New Policy" },
  { role: "input",
    name: "Effective Date",
    formatter: "GW-date",         // auto-retry
    locator: "{input}Effective Date" }
]
> act('{clickable}New Policy', 'click')
✓ opened · 210ms