Self-Hosted · BYO-LLM | API + UI in one runtime

One AI-native runtime for
API & UI testing

Karate Agent runs the whole Karate stack in one self-hosted container — API testing, AI browser automation, mocks, business-rule verification, and a safe-to-ship verdict — all driven by your LLM. For the UI it’s a modern Selenium, Playwright & Cypress alternative; underneath, it’s far more than a browser bot.

~30s

Scripted flows

8–19 min

Pure LLM approach

72x

Fewer page scans

Request Evaluation

100% Self-Hosted · Bring Your Own LLM · Docker-Native

More than a browser agent

A runtime, not a browser host.

Karate Agent runs the entire Karate stack in one self-hosted container. The browser is one capability — plugin #1 of a capability-agnostic harness — all driven by your LLM over a single MCP / REST surface. API-only runs never even start a browser.

API Testing

REST, gRPC, GraphQL & WebSocket on the proven Karate engine — driven by the agent. Contract checks, data setup, and assertions, no browser required.

AI Browser Automation

DOM-first UI testing that survives UI change. Display-text locators instead of brittle CSS/XPath — a modern Selenium, Playwright & Cypress alternative.
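To illustrate why a display-text locator can survive a markup change, here is a minimal sketch using only Python's standard library. It is not Karate Agent's locator engine; the `TextLocator` class and `find_by_text` helper are hypothetical names invented for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class TextLocator(HTMLParser):
    """Records (tag, attrs, text) for each element that directly holds text."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements as (tag, attrs)
        self.found = []   # (tag, attrs, normalized visible text)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        # Close the nearest matching open element, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and self.stack:
            tag, attrs = self.stack[-1]
            self.found.append((tag, attrs, text))


def find_by_text(html, text):
    """Return (tag, attrs) for every element whose own text equals `text`."""
    parser = TextLocator()
    parser.feed(html)
    parser.close()
    return [(tag, attrs) for tag, attrs, seen in parser.found if seen == text]


# The same button before and after a front-end refactor (hypothetical markup).
old_page = '<div><button id="btn-17" class="x-primary">Submit claim</button></div>'
new_page = '<section><button data-qa="submit" class="btn btn--cta"> Submit\n claim </button></section>'

old_hits = find_by_text(old_page, "Submit claim")
new_hits = find_by_text(new_page, "Submit claim")
```

A CSS selector such as `#btn-17` would match only the first page, while the text lookup finds the button in both.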

Stateful Mocks

Service virtualization with real logic. The same rulebook is the mock’s brain and the test oracle — so your mocks never rot into “valid but useless.”

Business-Rule Verification

Encode rules — premium calc, eligibility, surcharge — and verify the live system row by row with Rulebooks. Disagreement is a defect, not a debate.
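As a rough illustration of row-by-row rule verification, the sketch below encodes one made-up surcharge rule and compares it with values a system returned. The rule, the field names, and the `verify_rows` helper are all assumptions for this example and do not reflect the actual Rulebook format.

```python
from decimal import Decimal

CENT = Decimal("0.01")


def surcharge_rule(row):
    """Hypothetical rule: drivers under 25 pay a 10% surcharge on the base premium."""
    base = Decimal(row["base_premium"])
    rate = Decimal("0.10") if row["driver_age"] < 25 else Decimal("0")
    return (base * (1 + rate)).quantize(CENT)


def verify_rows(rule, rows, actual_key):
    """Compare the rule's expected value with the system's value for each row."""
    defects = []
    for index, row in enumerate(rows):
        expected = rule(row)
        actual = Decimal(row[actual_key]).quantize(CENT)
        if expected != actual:
            defects.append({"row": index, "expected": expected, "actual": actual})
    return defects


# Illustrative data only: the third quote is missing its surcharge.
rows = [
    {"driver_age": 22, "base_premium": "500.00", "quoted_premium": "550.00"},
    {"driver_age": 40, "base_premium": "500.00", "quoted_premium": "500.00"},
    {"driver_age": 19, "base_premium": "800.00", "quoted_premium": "800.00"},
]

defects = verify_rows(surcharge_rule, rows, "quoted_premium")
```

Because the rule is an ordinary function, the same function could in principle also compute a mock's responses, which is the idea behind sharing one rulebook between mock and oracle.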

Coverage & Traceability

Requirement-to-test coverage, graded exercised vs. merely claimed, rolled into a confidence-to-ship verdict — computed by code, not guessed.
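One way to picture a verdict that is computed rather than guessed is the sketch below. The grading scheme, the field names (`claimed_by`, `exercised_by`), and the blocking policy are assumptions made for illustration, not the product's actual scoring model.

```python
def grade(requirement):
    """Grade a requirement by the strongest kind of coverage it has."""
    if requirement["exercised_by"]:
        return "exercised"   # at least one test actually ran against it
    if requirement["claimed_by"]:
        return "claimed"     # tests are tagged to it, but none produced evidence
    return "uncovered"


def ship_verdict(requirements, failed_tests):
    """Pure function: the same inputs always yield the same verdict."""
    grades = {r["id"]: grade(r) for r in requirements}
    blockers = []
    for r in requirements:
        if grades[r["id"]] != "exercised":
            blockers.append(f'{r["id"]}: {grades[r["id"]]}')
        elif any(test in failed_tests for test in r["exercised_by"]):
            blockers.append(f'{r["id"]}: failing')
    return {"grades": grades, "blockers": blockers, "safe_to_ship": not blockers}


# Hypothetical requirements and test IDs.
requirements = [
    {"id": "REQ-1", "claimed_by": ["t1"], "exercised_by": ["t1"]},
    {"id": "REQ-2", "claimed_by": ["t2"], "exercised_by": []},
    {"id": "REQ-3", "claimed_by": ["t3"], "exercised_by": ["t3"]},
]

verdict = ship_verdict(requirements, failed_tests={"t3"})
```

Here REQ-2 blocks the release because it is only claimed, and REQ-3 blocks it because its exercising test failed.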

MCP Server

A single karate_eval tool drives the whole runtime from Claude Code, VS Code Copilot, or Cursor.

One self-contained image — harness loop · serve console · MCP server · API engine · colocated Chrome — part of the Karate platform.

Evidence, not assertions

Other agents say “done.” Ours shows the traffic.

Karate Agent drives your UI like a user — and captures the real network traffic from each run, matched to your API contract, with zero tagging. So even when heavy customization blocks standard API testing, you still get API-level evidence — straight from the UI run.
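Matching captured traffic to a contract can be sketched as path-template matching. The example below is an assumption about how such matching might work in the simplest case; the operations, the paths, and the `match_traffic` helper are hypothetical.

```python
import re


def to_regex(template):
    """Turn a path template like /policies/{id} into a compiled regex."""
    parts = []
    for segment in template.strip("/").split("/"):
        if segment.startswith("{") and segment.endswith("}"):
            parts.append("[^/]+")           # a path parameter matches one segment
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "/?$")


def match_traffic(contract, traffic):
    """Count captured requests per contract operation and list the rest."""
    compiled = [(method, template, to_regex(template)) for method, template in contract]
    hits = {f"{method} {template}": 0 for method, template in contract}
    unmatched = []
    for method, url in traffic:
        path = url.split("?", 1)[0]          # ignore the query string
        for op_method, template, pattern in compiled:
            if method.upper() == op_method and pattern.match(path):
                hits[f"{op_method} {template}"] += 1
                break
        else:
            unmatched.append((method, url))
    return hits, unmatched


# Hypothetical contract operations and traffic captured during a UI run.
contract = [
    ("GET", "/policies/{id}"),
    ("POST", "/policies"),
    ("GET", "/claims/{id}/documents"),
]
traffic = [
    ("GET", "/policies/42?expand=true"),
    ("POST", "/policies"),
    ("GET", "/static/app.js"),
    ("get", "/policies/7"),
]

hits, unmatched = match_traffic(contract, traffic)
```

Operations with a count of zero show which parts of the contract the UI run never touched.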

An evidence-backed requirements traceability matrix (RTM), every morning

Every acceptance criterion carries a screenshot, the captured traffic, and a rule verdict — regenerated from last night’s runs. The traceability artifact no ALM can produce.

Regression in nights, not weeks

Hundreds of cases verified per night, per container, across a fleet of identical containers whose results merge without a central server. Built to collapse a two-week regression cycle into two nights.

Your testers move up the stack

On day one it explores; over time it mostly replays deterministically for ~$0. Testers become reviewers of machine-claimed evidence and curators of the business rulebooks.

Validated against a Guidewire-class application · self-hosted · BYO-LLM.

How it works

Scripted speed, AI recovery, no model in the verdict

Scripted flows run at native speed with zero tokens; the LLM is invoked only to recover or explore. The release verdict is computed by deterministic code — the AI explains it, it never decides it.
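The scripted-first, recover-on-failure pattern described above can be sketched as follows. This is a generic illustration under stated assumptions, not the product's harness loop: `LocatorNotFound`, `run_flow`, and both stub functions are hypothetical, and the stub stands in for a real LLM call.

```python
class LocatorNotFound(Exception):
    """Raised by the scripted path when a step can no longer find its target."""


def run_flow(steps, scripted, recover):
    """Run each step on the scripted path; call `recover` only on failure."""
    log = []
    for step in steps:
        try:
            scripted(step)
            log.append({"step": step, "mode": "scripted", "tokens": 0})
        except LocatorNotFound as error:
            tokens = recover(step, error)   # the only place a model is involved
            log.append({"step": step, "mode": "recovered", "tokens": tokens})
    return log


def scripted_stub(step):
    # Pretend the Submit button was renamed, so the recorded step fails.
    if step == "click Submit":
        raise LocatorNotFound(step)


def recover_stub(step, error):
    # Stand-in for an LLM-driven recovery; returns a made-up token count.
    return 1200


steps = ["open /claims/new", "fill Policy number", "click Submit"]
log = run_flow(steps, scripted_stub, recover_stub)
total_tokens = sum(entry["tokens"] for entry in log)
```

Only the one failing step spends tokens; the two steps that replay cleanly cost nothing.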

Instruct

Describe what to verify in plain English — over MCP, REST, or the console.

Run

Scripted flows execute at native speed; the LLM recovers or explores only when needed.

Capture

Every run emits screenshots, video, rule verdicts, and the network traffic matched to your contract.

Verdict

A reproducible, hash-verifiable safe-to-ship report — computed with no AI in the loop.
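A hash-verifiable report can be sketched with a canonical serialization and SHA-256. The report fields and the choice of canonical JSON here are assumptions for illustration; the actual report format and hashing scheme are not described on this page.

```python
import hashlib
import hmac
import json


def report_digest(report):
    """SHA-256 over a canonical JSON form, so key order cannot change the hash."""
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_report(report, expected_digest):
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(report_digest(report), expected_digest)


# Hypothetical report content.
report = {"build": "2024.11.3", "safe_to_ship": False, "blockers": ["REQ-2: claimed"]}
digest = report_digest(report)

# Same content with keys in a different order hashes identically.
reordered = {"blockers": ["REQ-2: claimed"], "safe_to_ship": False, "build": "2024.11.3"}

# Any change to the content changes the digest.
tampered = dict(report, safe_to_ship=True)
```

Anyone holding the report and the published digest can rerun `verify_report` and confirm the verdict was not altered after the fact.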

See the architecture in detail

Enterprise-ready

Built for scale, security, and governance

Everything inside your perimeter — no SaaS, no telemetry, no data leaving your network.

Single-tenant & self-hosted

One Docker container behind your firewall — no accounts, no database, reached over your VPN or SSH tunnel.

Role-Based Access

Granular permissions across teams and projects.

Audit Trail

Every run recorded — H.264 video, step logs, full execution history.

Air-Gap Deployment

Runs fully offline with on-premise models. No internet required.

Data Residency

Self-hosted by design — zero data egress, no telemetry.

License Server

Centralized licensing and team management for the IDE tier.

Deployment

Your data, your infrastructure, your LLM

No SaaS, no telemetry, no data exfiltration. Air-gap ready for regulated industries.

Self-Hosted

One Docker container on your servers. Nothing leaves your network.

BYO-LLM

Claude, GPT (incl. Azure OpenAI), Gemini, or local models via Ollama.

Air-Gap Ready

Run the entire stack offline with on-premise models. Full data sovereignty.

CI/CD Native

A single REST/curl call fits Jenkins, GitHub Actions, Azure DevOps, GitLab.

FAQ

Questions, answered

What enterprise teams ask before adopting Karate Agent.

Does Karate Agent do API testing, or just UI?

Both — and more. Karate Agent is not a browser bot; it’s the Karate test runtime in a container. The same self-hosted image runs API testing (REST, gRPC, GraphQL, WebSocket), stateful mocks, business-rule verification, coverage and requirements-traceability, an MCP server, and AI browser automation. The browser is one capability of a capability-agnostic harness — API-only runs never even start a browser.

Can AI replace Selenium or Playwright?

For many enterprise use cases, yes. Test suites built on Selenium or Playwright typically rely on CSS or XPath selectors, which break when the UI changes. Karate Agent uses display-text locators and LLM-powered recovery to adapt to UI changes automatically, which reduces test maintenance. Many teams use it as a Selenium or Playwright alternative, especially for complex enterprise SPAs like Guidewire, Salesforce, and ServiceNow.

How is Karate Agent different from Claude computer use?

Claude computer use is a cloud-hosted, vision-based agent that sends screenshots to a vendor and reports “done” on its own say-so. Karate Agent is self-hosted and DOM-based, which makes it 10–50x more token-efficient than screenshot-based agents, and every run leaves evidence: annotated screenshots, video, rule verdicts, and the real network traffic matched to your API contract. Your data never leaves your network, it integrates with CI/CD via a REST API, and it works with any LLM, not just one vendor's.

Which LLMs does Karate Agent support?

Karate Agent is LLM-agnostic — bring your own LLM. It works out of the box with Anthropic Claude, OpenAI GPT-4 (including Azure OpenAI), Google Gemini, and any open-source model served via Ollama (Llama, Qwen, DeepSeek, Mistral, Gemma, GLM, Kimi). You can also connect to any OpenAI-compatible endpoint, including self-hosted vLLM deployments.

Is Karate Agent cloud-based or self-hosted?

Karate Agent is 100% self-hosted. It runs as a Docker container on your infrastructure — no SaaS dependency, no data exfiltration, no telemetry. Paired with a local or Azure OpenAI model, the entire stack runs inside your perimeter. This makes it suitable for regulated industries like financial services, insurance, and healthcare that require full data sovereignty.

How much does Karate Agent cost in LLM tokens?

Far less than screenshot-based agents. Karate Agent reads the DOM directly rather than sending pixel images to the LLM, which makes it 10–50x more token-efficient. Scripted flows consume zero tokens, a look() diffing function cuts the number of page scans 72-fold, and stable explorations converge into deterministic checks that replay for ~$0 with no AI in the loop.
