· Hivemind Team
# Run MiniMax M2.7 in Hivemind: Tier-1 Coding Performance at $0.06/M Tokens
MiniMax just shipped M2.7. It matches Claude Opus 4.6 on SWE-Pro (56.22% vs ~57%), runs at 100 tokens per second, and costs $0.30 per million input tokens — compared to $15 for Opus. With cache optimization, the blended cost drops to $0.06/M tokens.
That's not a typo. Six cents per million tokens.
For teams running high-volume agent workloads on Hivemind — coding agents, heartbeat automation, document pipelines — this changes the economics of what's feasible.
Here's how to connect it.
## What M2.7 Actually Is
Most model releases are incremental. M2.7 is structurally different.
It's MiniMax's first model built to participate in its own training — what they call self-evolution. During development, M2.7 autonomously ran 100+ optimization cycles on its own scaffold, managed 30–50% of reinforcement learning research workflows independently, and competed in 22 ML competitions (9 gold medals). It wasn't just trained on data. It helped shape the process that produced it.
The result is a 10B activated-parameter model that performs at the level of models orders of magnitude larger.
Benchmark snapshot:
| Benchmark | M2.7 | Claude Opus 4.6 |
| --- | --- | --- |
| SWE-Pro | 56.22% | ~57% |
| SWE-bench Verified | 78% | 55% |
| GDPval-AA (office tasks) | ELO 1495 | — |
| MLE-Bench Lite | 66.6% | — |
| Skill adherence (complex tasks) | 97% | — |
On SWE-bench Verified — which measures real-world bug fixes, not curated puzzles — M2.7 outperforms Opus by 23 percentage points.
Speed and cost comparison:
| | M2.7 | Claude Opus 4.6 | GPT-5 |
| --- | --- | --- | --- |
| Speed | 100 TPS | ~33 TPS | ~40 TPS |
| Input cost | $0.30/M | $15/M | $10/M |
| Output cost | $1.20/M | $75/M | $30/M |
| Blended (cached) | $0.06/M | — | — |
At these numbers, you could run 250x the agent workload for the same budget. Or run more specialized agents in parallel. Or stop throttling your heartbeat interval.
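The 250x figure falls straight out of the rates above — it compares Opus's $15/M input rate against M2.7's $0.06/M cache-optimized blended rate:

```python
# Cost ratio: Claude Opus 4.6 input ($15/M tokens) vs. MiniMax M2.7's
# blended cached rate ($0.06/M tokens).
OPUS_INPUT_PER_M = 15.00
M27_BLENDED_PER_M = 0.06

ratio = OPUS_INPUT_PER_M / M27_BLENDED_PER_M
print(f"{ratio:.0f}x")  # → 250x
```

Note this is the most favorable comparison (Opus input vs. M2.7 cached blend); uncached output-heavy workloads land lower, but still more than an order of magnitude apart.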
## Choosing a Plan
MiniMax offers two ways to pay. Both give you the same API key format and the same endpoint — Hivemind doesn't care which plan you're on.
Token plans (flat monthly):
| Plan | Monthly | M2.7 requests / 5hr window | Best for |
| --- | --- | --- | --- |
| Starter | $10 | 1,500 | Personal Hivemind setups, experimentation |
| Standard | $20 | 5,000 | Small teams, moderate agent workloads |
| Pro | $50 | 15,000 | Active dev teams, frequent heartbeat agents |
Token plans give you predictable monthly costs. No per-token math, no surprise bills. MiniMax explicitly marks these as not recommended for production at scale — but for personal use or a small Hivemind instance, Starter or Standard is the low-friction way to get started.
Pay-as-you-go:
Metered at $0.30/M input, $1.20/M output. No rate ceiling. Best for team deployments, high-volume pipelines, or any workload where you'd regularly exceed the token plan's request window.
The API key works identically either way. Start at $10, upgrade when your agents demand it.
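A back-of-envelope way to decide between the two: find the monthly token volume at which pay-as-you-go spend crosses the Starter plan's $10. The 50/50 input/output split below is an assumption (adjust for your workload), and this ignores the token plans' per-window request limits:

```python
# Breakeven between the $10/month Starter plan and pay-as-you-go,
# assuming an even input/output token mix (an assumption — tune it).
STARTER_MONTHLY = 10.00
INPUT_PER_M, OUTPUT_PER_M = 0.30, 1.20

blended_per_m = 0.5 * INPUT_PER_M + 0.5 * OUTPUT_PER_M  # $0.75/M at this mix
breakeven_m_tokens = STARTER_MONTHLY / blended_per_m
print(f"~{breakeven_m_tokens:.0f}M tokens/month")
```

Below roughly 13M tokens a month at this mix, metered billing is cheaper; above it, the flat plan wins (until you hit its request window).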
Sign up and grab your key at platform.minimax.io.
## Connecting M2.7 to Hivemind
MiniMax exposes an OpenAI-compatible API. Hivemind has native support for any OpenAI-compatible provider — full streaming, tool calling, and thinking/reasoning support included.
Setup takes about 3 minutes.
### Step 1: Get your API key
After signing up at platform.minimax.io, find your API key under API Keys in the console. It works regardless of whether you're on a token plan or pay-as-you-go.
MiniMax has two regional endpoints:
- International: `https://api.minimax.io/v1`
- China: `https://api.minimaxi.com/v1`
Use the international endpoint unless you're routing through the China region.
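Before wiring the key into Hivemind, you can sanity-check it against the models endpoint. A minimal stdlib sketch — the `/models` path follows from the OpenAI-compatible API shape, so treat the exact response fields as an assumption and run it with your own key:

```python
import json
import urllib.request

INTERNATIONAL = "https://api.minimax.io/v1"
CHINA = "https://api.minimaxi.com/v1"

def models_url(base_url: str) -> str:
    # Same /v1/models discovery endpoint Hivemind calls when you save the provider.
    return base_url.rstrip("/") + "/models"

def list_models(api_key: str, base_url: str = INTERNATIONAL) -> list[str]:
    # Standard OpenAI-style bearer auth; returns the available model ids.
    req = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        data = json.load(resp)
    return [m["id"] for m in data["data"]]
```

If the call returns a list of ids, the key and region are correct; a 401 usually means the wrong key or the wrong regional endpoint.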
### Step 2: Add the provider in Hivemind
Go to Settings → Providers → Add Provider → OpenAI Compatible.
Fill in:
| Field | Value |
| --- | --- |
| Name | `MiniMax` (or whatever you want) |
| Base URL | `https://api.minimax.io/v1` |
| API Key | Your MiniMax API key |
Save it. Hivemind will hit `/v1/models` and auto-discover available models.
### Step 3: Assign M2.7 to an agent
Open any agent and go to Model Settings. Select `MiniMax` as the provider, then choose `MiniMax-Text-01` (the current M2.7 model identifier — verify in your MiniMax console, as this may be updated).

MiniMax also ships a `M2.7-highspeed` variant that produces identical output with lower latency for throughput-sensitive tasks.
That's it. The agent is now running on M2.7.
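Under the hood, what Hivemind sends is a standard OpenAI-style chat completion request. A sketch of the payload shape (the `MiniMax-Text-01` id is the one from the step above — verify it in your console):

```python
# Shape of the request an agent fires once the provider is configured:
# a standard OpenAI-compatible chat completion payload.
def build_chat_request(prompt: str,
                       model: str = "MiniMax-Text-01",
                       stream: bool = True) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,  # streaming is supported end to end
    }

payload = build_chat_request("Review this PR for race conditions.")
```

POST that to `{base_url}/chat/completions` with your bearer token and you have the same call path Hivemind uses, which is handy when debugging a provider that won't connect.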
## Where M2.7 Fits in a Hivemind Team
Hivemind is built around specialized agents. M2.7's profile suggests three natural fits.
### High-volume coding agents
M2.7's 78% SWE-bench Verified score and 55.6% VIBE-Pro (end-to-end project delivery) make it a serious coding model. If you're running a dev team with a coding agent that fires frequently — reviewing PRs, generating boilerplate, running code audits — the cost difference is significant.
A coding agent averaging 50,000 tokens per run, firing 20 times a day:
| Model | Daily cost | Monthly cost |
| --- | --- | --- |
| Claude Opus 4.6 | ~$45 | ~$1,350 |
| MiniMax M2.7 | ~$0.90 | ~$27 |
Same benchmark tier. 50x cost reduction.
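The arithmetic behind the table, under an assumed even input/output split (agent runs are often output-heavier, which is why the table's M2.7 figure lands a bit above the even-split number here):

```python
# 50,000 tokens/run x 20 runs/day = 1M tokens/day.
# Even input/output split is an assumption — adjust for your agents.
M_TOKENS_PER_DAY = 50_000 * 20 / 1e6  # 1.0

def daily_cost(input_per_m: float, output_per_m: float,
               m_tokens: float = M_TOKENS_PER_DAY) -> float:
    return m_tokens * (0.5 * input_per_m + 0.5 * output_per_m)

opus = daily_cost(15.00, 75.00)  # $45.00/day, ~$1,350/month
m27 = daily_cost(0.30, 1.20)     # $0.75/day at this split
print(f"{opus / m27:.0f}x cheaper at an even split")
```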
### Heartbeat automation
The Hivemind heartbeat runs on a configurable interval — every 5 minutes to every 24 hours. It reads a task checklist, delegates to the right agents, and saves findings to memory.
For high-frequency heartbeats (every 5–15 minutes), model cost accumulates fast. M2.7's $0.06/M blended rate makes it practical to run a heartbeat every few minutes without watching your budget evaporate.
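To put numbers on that, here's a sketch of monthly heartbeat cost at the $0.06/M blended rate. The 10,000 tokens per heartbeat run is a hypothetical figure — substitute your own from Hivemind's usage logs:

```python
# Monthly cost of a heartbeat at M2.7's $0.06/M blended (cached) rate.
# TOKENS_PER_RUN is illustrative, not a measured Hivemind number.
BLENDED_PER_M = 0.06
TOKENS_PER_RUN = 10_000

def monthly_heartbeat_cost(interval_minutes: int) -> float:
    runs_per_day = 24 * 60 / interval_minutes
    m_tokens_per_day = runs_per_day * TOKENS_PER_RUN / 1e6
    return 30 * m_tokens_per_day * BLENDED_PER_M

print(f"${monthly_heartbeat_cost(5):.2f}/month at a 5-minute interval")
print(f"${monthly_heartbeat_cost(60):.2f}/month hourly")
```

At these assumptions, even a 5-minute heartbeat stays in single-digit dollars per month.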
```
Settings → Heartbeat → Model: MiniMax / M2.7
```
### Document processing pipelines
M2.7 scored the highest ELO (1495) among all open-source models on GDPval-AA — real-world office task benchmarks across Excel, Word, and PowerPoint. Its 97% skill adherence rate on 40+ complex tasks (each over 2,000 tokens) is relevant for any agent doing structured document work.
Agents handling invoice processing, report generation, or data extraction are good candidates for M2.7.
## Two API Variants
MiniMax ships M2.7 in two flavors:
| Variant | Speed | Quality | Best for |
| --- | --- | --- | --- |
| Standard | — | Full quality | Production, complex tasks |
| Highspeed | Faster | Identical results | High-throughput, latency-sensitive |
Both produce the same output. The highspeed variant delivers lower latency and higher throughput — useful for agents that need to respond quickly in chat contexts. The standard variant is the default choice for deep coding or research tasks where raw speed matters least.
Set up both as separate providers in Hivemind if you want to assign them to different agents.
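If you do register both, a simple routing rule keeps the choice mechanical. A sketch — the agent role names are illustrative, and both model ids should be verified against your MiniMax console:

```python
# Route latency-sensitive agent roles to the highspeed variant and
# deep-work roles to standard. Role names here are illustrative.
LATENCY_SENSITIVE = {"chat", "triage", "heartbeat"}

def pick_variant(agent_role: str) -> str:
    if agent_role in LATENCY_SENSITIVE:
        return "M2.7-highspeed"   # verify exact id in the MiniMax console
    return "MiniMax-Text-01"      # standard variant
```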
## The Bigger Picture
M2.7 is interesting for a reason beyond the benchmarks.
It's a model that participated in its own training — wrote code for its own eval scaffolds, managed its own reinforcement learning workflows, iterated on its own performance. That's the same pattern Hivemind encourages at the agent level: agents that improve over time, write their own tools, build their own skills.
The model itself was built the way Hivemind agents are designed to work.
Whether or not you run M2.7, it's worth understanding that self-improvement isn't a future capability. It's already shipping — both in models and in agent platforms.
## Get Started
```bash
# Already running Hivemind? Add the provider:
#   Settings → Providers → Add Provider → OpenAI Compatible
#   Base URL: https://api.minimax.io/v1
#   API Key: your MiniMax key

# Not running Hivemind yet?
curl -fsSL https://hivementality.ai/install.sh | bash
```
MiniMax API keys: platform.minimax.io
Hivemind on GitHub: github.com/hivementality-ai/hivemind
Questions: Discord