We Audited Our Own Security So You Don't Have To Trust Us
By the Hivemind Team
Why We Did This
The OpenClaw incident was a wake-up call for the entire AI agent ecosystem.
When CVE-2026-1182 dropped in January — a path traversal in the skill loader that let any loaded skill read arbitrary files from the host filesystem — it wasn't just an OpenClaw problem. It was a design problem. The vulnerability existed because the architecture assumed skills could be trusted with host-level access. That assumption was baked into the foundation of the project, not bolted on as an afterthought.
We watched the fallout. Teams scrambling to audit which skills had accessed what. No logs granular enough to answer the question. Secrets stored in plaintext `.env` files that any compromised skill could have exfiltrated. The second CVE, CVE-2026-1347 — an SSRF in the web tool that could reach internal network services — made it worse. Two architectural blind spots, exploited within weeks of each other.
We don't say this to dunk on OpenClaw. We say it because it forced us to look in the mirror. Hivemind was already designed with stronger isolation primitives, but "stronger than the project that just got popped" is not a security posture. So we did the thing that most startups avoid: we audited ourselves, wrote down everything we found, and decided to publish it.
This post is that audit. The good, the bad, and the roadmap.
Our Architecture
Hivemind's security model is built on three core principles: credential isolation, workspace isolation, and explicit trust boundaries. Here's how each one works in practice.
Credential Isolation
Every secret in Hivemind — API keys, database credentials, OAuth tokens, webhook signing keys — lives in an encrypted vault backed by AES-256-GCM. Secrets are never written to disk in plaintext. They're never passed as environment variables to the host process. They're never logged.
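For illustration, here's what sealing a vault entry with AES-256-GCM can look like in Python with the `cryptography` package. This is a minimal sketch of the pattern, not Hivemind's actual vault code; the function names and storage layout are assumptions.

```python
# Illustrative sketch only -- not Hivemind's vault implementation.
# Shows the AES-256-GCM sealing pattern using the `cryptography` package.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def seal_secret(key: bytes, secret: bytes, secret_id: str) -> bytes:
    """Encrypt a secret with AES-256-GCM, binding its ID as associated data."""
    assert len(key) == 32              # AES-256 requires a 32-byte key
    nonce = os.urandom(12)             # 96-bit nonce, unique per encryption
    aad = secret_id.encode()           # tampering with the ID breaks decryption
    ciphertext = AESGCM(key).encrypt(nonce, secret, aad)
    return nonce + ciphertext          # store the nonce alongside the ciphertext

def open_secret(key: bytes, sealed: bytes, secret_id: str) -> bytes:
    """Decrypt; raises cryptography.exceptions.InvalidTag on any tampering."""
    nonce, ciphertext = sealed[:12], sealed[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, secret_id.encode())
```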
When an agent needs a credential, it requests it through a scoped accessor that checks three things:
- Identity: Which agent is requesting the secret?
- Scope: Is this agent authorized to access this specific secret?
- Context: Is this request happening within an approved execution context (e.g., during a sanctioned tool invocation)?
If any check fails, the request is denied and the denial is logged with the full request context. There's no fallback, no override flag, no `--force` option.
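As a sketch of that flow (the names, grant table, and accessor signature here are hypothetical, not Hivemind's actual API), the three checks might look like:

```python
# Hypothetical sketch of the identity/scope/context checks described above.
import logging
from dataclasses import dataclass

audit = logging.getLogger("hivemind.audit")  # logger name is illustrative

@dataclass(frozen=True)
class CredentialRequest:
    agent_id: str              # identity: which agent is asking
    secret_id: str             # scope: which secret it wants
    invocation_id: str | None  # context: the sanctioned tool invocation

def resolve_secret(req, vault, grants, active_invocations):
    def deny(reason: str):
        # Denials are logged with full request context; no fallback, no override.
        audit.warning("credential denied: %s (%s)", reason, req)
        raise PermissionError(reason)

    if req.agent_id not in grants:                    # identity check
        deny("unknown agent")
    if req.secret_id not in grants[req.agent_id]:     # scope check
        deny("secret outside agent's grant")
    if req.invocation_id not in active_invocations:   # context check
        deny("no approved execution context")
    return vault.open(req.secret_id)
```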
This is architecturally different from the `.env` file pattern that most self-hosted AI tools use. In that model, every process on the host can read every secret. A single compromised dependency in any skill gets access to your entire credential set. In Hivemind, a compromised skill gets access to exactly the secrets it was authorized to use — and nothing else.
Workspace Isolation
Every tool invocation in Hivemind runs inside an isolated container. We use gVisor-backed sandboxes that provide:
- Filesystem isolation: Each tool sees only its own workspace directory. No access to the host filesystem, no access to other tools' workspaces, no access to Hivemind's own configuration or runtime state.
- Network isolation: By default, tools have no network access. Outbound connections require explicit allowlisting per-agent, per-domain. DNS resolution is controlled.
- Process isolation: Tools cannot see or signal other processes. No `/proc` access, no shared memory segments, no Unix domain sockets outside the sandbox.
- Resource limits: CPU, memory, and execution time are capped per invocation. A runaway tool cannot starve the host or other agents.
The sandbox boots in under 200ms and tears down completely after each invocation. There's no persistent state between tool calls unless explicitly configured through Hivemind's state management API, which itself is access-controlled.
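To make the model concrete, here's a sketch of one sandboxed invocation through the Docker SDK with gVisor registered as the `runsc` runtime. The image name, limits, and mount layout are illustrative, not our production configuration:

```python
# Illustrative sketch: one sandboxed tool invocation via the Docker SDK.
# Assumes gVisor is installed and registered as the "runsc" Docker runtime.
import docker

client = docker.from_env()

def run_tool(image: str, workspace: str, cmd: list[str]) -> bytes:
    return client.containers.run(
        image,
        cmd,
        runtime="runsc",          # gVisor sandbox instead of the default runc
        network_mode="none",      # no network unless explicitly allowlisted
        read_only=True,           # immutable root filesystem
        volumes={workspace: {"bind": "/workspace", "mode": "rw"}},  # only its own dir
        mem_limit="512m",         # per-invocation memory cap
        nano_cpus=1_000_000_000,  # 1 CPU
        pids_limit=128,           # bound the process count
        remove=True,              # tear down completely after the call
    )
```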
Trust Boundaries
This is where the architecture diagram matters. In Hivemind, trust boundaries are explicit and enforced:
```
User Input (untrusted)
        v
[Message Gateway] ------ rate limiting, input validation
        v
[Agent Runtime] -------- LLM interaction, planning, tool selection
        v
[Tool Approval Layer] -- capability checks, skill signatures
        v
[Sandboxed Execution] -- gVisor container, scoped credentials
        v
[Output Sanitization] -- response filtering, PII detection
        v
User Response (sanitized)
```
Every arrow in that diagram is a trust boundary crossing. At each boundary, we validate, log, and enforce access controls. The agent runtime cannot bypass the tool approval layer. The sandboxed execution environment cannot reach back into the agent runtime. Output passes through sanitization before reaching the user.
This is defense in depth. Not "we have a firewall" defense in depth, but "every component assumes every other component might be compromised" defense in depth.
The OpenClaw Comparison
We want to be precise here, not promotional. OpenClaw and Hivemind make fundamentally different architectural choices, and those choices have security implications. Understanding them helps you make an informed decision — regardless of which project you choose.
Execution Model
OpenClaw runs tools in the same process as the agent runtime. When a skill executes, it shares memory space, filesystem access, and network capabilities with the host process. This is fast and simple, but it means a vulnerability in any skill is a vulnerability in the entire system.
Hivemind runs every tool invocation in a dedicated sandbox. The overhead is real — about 200ms per invocation for sandbox boot — but the isolation is complete. A compromised tool cannot access host resources, other agents' data, or Hivemind's own runtime state.
Secret Management
OpenClaw stores secrets in `.env` files or environment variables. Any process running on the host can read them. The recent CVEs demonstrated the practical impact: the path traversal vulnerability (CVE-2026-1182) could read `.env` files directly.
Hivemind's vault is encrypted at rest and in transit. Access is scoped per-agent, per-secret, and logged. Even if an attacker achieves code execution inside a sandbox, they can only access the specific credentials that sandbox was authorized to use.
Skill Trust
OpenClaw trusts skills implicitly. If a skill is installed, it runs with full host access. There's no static analysis, no capability declaration, no approval flow.
Hivemind requires skills to declare their capabilities upfront: which APIs they call, which file patterns they access, which network endpoints they contact. Before a skill runs for the first time, it goes through static analysis and requires explicit human approval. Signed skill packages provide cryptographic verification of authorship and integrity.
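As one way to picture the signature check (the packaging format and function names here are illustrative), Ed25519 verification of a skill archive is only a few lines:

```python
# Sketch of verifying a signed skill package with Ed25519 -- one way to
# implement the authorship/integrity check described above.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey
from cryptography.exceptions import InvalidSignature

def verify_skill_package(archive: bytes, signature: bytes, author_key: bytes) -> bool:
    """Return True only if `archive` was signed by the holder of `author_key`."""
    public_key = Ed25519PublicKey.from_public_bytes(author_key)
    try:
        public_key.verify(signature, archive)  # raises on any mismatch
        return True
    except InvalidSignature:
        return False
```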
What OpenClaw Gets Right
We should acknowledge this clearly. OpenClaw's execution model is simpler, faster, and easier to debug. The lack of sandbox overhead means lower latency for tool calls. The `.env` pattern, while insecure, is widely understood and trivially easy to set up. For local development and non-sensitive workloads, these trade-offs might be exactly right.
Security is always a spectrum, and the right point on that spectrum depends on your threat model. If you're running AI agents against public APIs on a personal machine, OpenClaw's model might be perfectly adequate. If you're running agents against production infrastructure with access to customer data, the calculus changes.
What We're Shipping
An honest audit means honest gaps. Here's what we know we need to improve, and what we're building to close those gaps.
Capability Declarations (Coming Soon)
Today, Hivemind's skill scanning is static-analysis-only. It catches common dangerous patterns — shell injection, unrestricted network calls, filesystem writes outside the workspace — but it's not a complete capability model. A sufficiently clever skill could still perform actions outside its declared scope.
We're shipping formal capability declarations. Every skill must declare a manifest listing:
- Network endpoints it will contact (with URL patterns)
- File system paths it will read or write (with glob patterns)
- External binaries it will invoke
- Credential scopes it requires
The runtime will enforce these declarations at the syscall level. A skill that declares it only needs `api.github.com` cannot contact `evil.example.com`, regardless of what the code tries to do. This moves us from "we scan for bad patterns" to "we enforce declared boundaries."
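Here's a hypothetical manifest and a host-level network check against it. The schema is a sketch; as noted above, the real enforcement will happen at the syscall level rather than in application code:

```python
# Hypothetical manifest shape and a network check against it.
from fnmatch import fnmatch
from urllib.parse import urlparse

MANIFEST = {
    "network": ["api.github.com", "*.githubusercontent.com"],  # allowed hosts
    "fs_read": ["/workspace/**"],          # glob patterns the skill may read
    "fs_write": ["/workspace/out/**"],     # ...and write
    "binaries": ["git"],                   # external binaries it may invoke
    "credential_scopes": ["github:repo:read"],
}

def network_allowed(url: str, manifest: dict) -> bool:
    host = urlparse(url).hostname or ""
    return any(fnmatch(host, pattern) for pattern in manifest["network"])

# A skill that declared only GitHub hosts cannot reach anywhere else:
assert network_allowed("https://api.github.com/repos", MANIFEST)
assert not network_allowed("https://evil.example.com/exfil", MANIFEST)
```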
Tool Approval Workflow (Coming Soon)
Currently, skill approval happens once at installation time. After that, the skill runs without further human oversight. This is insufficient for high-security environments.
We're adding a per-invocation approval mode for sensitive tools. Administrators can flag specific tools or tool categories as requiring human approval before each execution. The approval UI shows the full context: which agent is requesting the tool, what arguments it's passing, what the tool will do, and what credentials it will access.
For lower-sensitivity tools, the current install-time approval will remain the default.
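In sketch form (the flagging mechanism and approval transport below are stand-ins for the real UI), the gate itself is simple:

```python
# Illustrative sketch of a per-invocation approval gate.
SENSITIVE_TOOLS = {"shell.exec", "db.write"}   # flagged by an administrator

def invoke_tool(name, args, agent_id, request_approval, execute):
    """Gate flagged tools on a human decision before running them."""
    if name in SENSITIVE_TOOLS:
        # The approval request carries the full context the UI would show.
        if not request_approval({"agent": agent_id, "tool": name, "args": args}):
            raise PermissionError(f"approval denied for {name}")
    return execute(name, args)   # unflagged tools keep install-time approval
```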
Dual-Context Architecture (Coming Soon)
This is a deeper architectural change. Today, the agent runtime has a single execution context that handles both planning (LLM interaction, reasoning, tool selection) and execution (actually running tools). This means the planning context has theoretical access to execution-level primitives.
We're splitting these into dual contexts:
- Planning context: Can interact with the LLM, reason about tasks, select tools, and compose workflows. Cannot execute tools directly.
- Execution context: Can run approved tools in sandboxes. Cannot interact with the LLM or modify the planning state.
Communication between contexts happens through a message bus with a strict schema. The planning context sends execution requests; the execution context returns results. Neither can reach into the other's memory space.
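A minimal sketch of that schema, with illustrative types since the real message bus schema isn't published yet:

```python
# Sketch of the strict request/result schema between the two contexts.
from dataclasses import dataclass
import queue

@dataclass(frozen=True)
class ExecutionRequest:      # planning -> execution
    invocation_id: str
    tool: str
    args: dict

@dataclass(frozen=True)
class ExecutionResult:       # execution -> planning
    invocation_id: str
    ok: bool
    output: str

requests: "queue.Queue[ExecutionRequest]" = queue.Queue()
results: "queue.Queue[ExecutionResult]" = queue.Queue()

# The planning context can only enqueue requests and read results; it holds
# no handle on the sandbox. The execution context sees only the queues,
# never the planner's memory or the LLM.
```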
This eliminates an entire class of attacks where a malicious prompt could trick the agent into executing tools outside the normal approval flow.
Network Controls (Shipping in v2026.03)
Hivemind already restricts outbound network access per-agent. But the current implementation uses iptables rules at the container level, which can be coarse-grained. We're moving to a DNS-aware proxy layer that provides:
- Per-request URL filtering (not just per-domain)
- Request/response logging for audit trails
- TLS inspection for detecting data exfiltration in encrypted channels
- Bandwidth limits to prevent large-scale data theft
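Per-request URL filtering is the piece iptables can't see. As a sketch (the patterns are examples), the proxy's decision reduces to matching the full URL rather than just the host:

```python
# Sketch of per-request URL filtering (path-level, not just per-domain).
from fnmatch import fnmatch

ALLOWED_URLS = [
    "https://api.github.com/repos/*",   # repo API only
    "https://api.github.com/user",      # but not, say, /user/keys
]

def request_allowed(url: str) -> bool:
    return any(fnmatch(url, pattern) for pattern in ALLOWED_URLS)

assert request_allowed("https://api.github.com/repos/acme/app")
assert not request_allowed("https://api.github.com/user/keys")
```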
Security Dashboard (Coming Soon)
Mission Control is getting a dedicated security tab. It will show:
- Real-time credential access patterns across all agents
- Tool invocation audit trails with full argument logging
- Anomaly detection for unusual access patterns (e.g., an agent suddenly requesting credentials it's never used before)
- Sandbox escape attempts and policy violations
- A risk score per agent based on its configured capabilities and recent behavior
This gives security teams visibility into agent behavior without requiring them to parse raw logs.
What We Know Is Still Missing
Transparency means admitting what we haven't solved yet:
- Supply chain verification for skill dependencies: Skills can declare their own dependencies, but we don't yet verify the integrity of those transitive dependencies. A skill could pull in a compromised npm package and we wouldn't catch it at the skill-scanning level. This is on the roadmap.
- Formal verification of sandbox escapes: Our gVisor sandboxes are well-tested, but we haven't commissioned a formal penetration test of the sandbox boundary itself. We're budgeting for an external security firm to do this in Q2 2026.
- Multi-tenant isolation: Hivemind currently assumes a single-tenant deployment. If you're running multiple teams on the same instance, agent-to-agent isolation is enforced but not hardened to the level we'd want for true multi-tenancy. This is on the roadmap.
- Audit log tamper resistance: Audit logs are comprehensive but stored locally. An attacker with host access could modify them. We're exploring append-only log stores and remote attestation.
We'd rather tell you about these gaps than have you discover them in production.
An Invitation to Break It
We believe security claims are only as strong as the scrutiny they survive. That's why we're publishing our SECURITY.md alongside this post.
If you find a vulnerability in Hivemind, we want to know about it. Our responsible disclosure policy is straightforward:
- Report vulnerabilities to [email protected]
- We acknowledge receipt within 24 hours
- We provide an initial assessment within 72 hours
- We commit to a fix timeline based on severity (critical: 48 hours, high: 7 days, medium: 30 days)
- We credit reporters publicly (with permission) and are working on a formal bug bounty program
We're also inviting security researchers to review the sandbox implementation, the credential vault, and the trust boundary enforcement code. The relevant modules are clearly documented in the repository, and we're happy to provide architecture walkthroughs to serious auditors.
The code is open source. The architecture is documented. The gaps are listed above. We'd rather be proven wrong about something now than discover it the hard way later.
Security isn't a feature. It's a commitment. Hold us to it.