· Hivemind Team
# Run MiniMax M2.7 in Hivemind: Tier-1 Coding Performance at $0.06/M Tokens
MiniMax just shipped M2.7. It matches Claude Opus 4.6 on SWE-Pro (56.22% vs ~57%), runs at 100 tokens per second, and costs $0.30 per million input tokens — compared to $15 for Opus. With cache optimization, the blended cost drops to $0.06/M tokens.
That's not a typo. Six cents per million tokens.
For teams running high-volume agent workloads on Hivemind — coding agents, heartbeat automation, document pipelines — this changes the economics of what's feasible.
Here's how to connect it.
## What M2.7 Actually Is
Most model releases are incremental. M2.7 is structurally different.
It's MiniMax's first model built to participate in its own training — what they call self-evolution. During development, M2.7 autonomously ran 100+ optimization cycles on its own scaffold, managed 30–50% of reinforcement learning research workflows independently, and competed in 22 ML competitions (9 gold medals). It wasn't just trained on data. It helped shape the process that produced it.
The result is a 10B activated-parameter model that performs at the level of models orders of magnitude larger.
Benchmark snapshot:
| Benchmark | M2.7 | Claude Opus 4.6 |
| --- | --- | --- |
| SWE-Pro | 56.22% | ~57% |
| SWE-bench Verified | 78% | 55% |
| GDPval-AA (office tasks) | ELO 1495 | — |
| MLE-Bench Lite | 66.6% | — |
| Skill adherence (complex tasks) | 97% | — |
On SWE-bench Verified — which measures real-world bug fixes, not curated puzzles — M2.7 outperforms Opus by 23 percentage points.
Speed and cost comparison:
| | M2.7 | Claude Opus 4.6 | GPT-5 |
| --- | --- | --- | --- |
| Speed | 100 TPS | ~33 TPS | ~40 TPS |
| Input cost | $0.30/M | $15/M | $10/M |
| Output cost | $1.20/M | $75/M | $30/M |
| Blended (cached) | $0.06/M | — | — |
At these numbers, you could run 250x the agent workload for the same budget. Or run more specialized agents in parallel. Or stop throttling your heartbeat interval.
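The 250x figure falls straight out of the rates above — it compares Opus's $15/M input rate against M2.7's $0.06/M cache-optimized blended rate:

```python
# Cost ratio: Claude Opus 4.6 input ($15/M tokens) vs. MiniMax M2.7's
# blended cached rate ($0.06/M tokens).
OPUS_INPUT_PER_M = 15.00
M27_BLENDED_PER_M = 0.06

ratio = OPUS_INPUT_PER_M / M27_BLENDED_PER_M
print(f"{ratio:.0f}x")  # → 250x
```

Note this is the most favorable comparison (Opus input vs. M2.7 cached blend); uncached output-heavy workloads land lower, but still more than an order of magnitude apart.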
## Choosing a Plan
MiniMax offers two ways to pay. Both give you the same API key format and the same endpoint — Hivemind doesn't care which plan you're on.
Token plans (flat monthly):
| Plan | Monthly | M2.7 requests / 5hr window | Best for |
| --- | --- | --- | --- |
| Starter | $10 | 1,500 | Personal Hivemind setups, experimentation |
| Standard | $20 | 5,000 | Small teams, moderate agent workloads |
| Pro | $50 | 15,000 | Active dev teams, frequent heartbeat agents |
Token plans give you predictable monthly costs. No per-token math, no surprise bills. MiniMax explicitly marks these as not recommended for production at scale — but for personal use or a small Hivemind instance, Starter or Standard is the low-friction way to get started.
Pay-as-you-go:
Metered at $0.30/M input, $1.20/M output. No rate ceiling. Best for team deployments, high-volume pipelines, or any workload where you'd regularly exceed the token plan's request window.
The API key works identically either way. Start at $10, upgrade when your agents demand it.
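A back-of-envelope way to decide between the two: find the monthly token volume at which pay-as-you-go spend crosses the Starter plan's $10. The 50/50 input/output split below is an assumption (adjust for your workload), and this ignores the token plans' per-window request limits:

```python
# Breakeven between the $10/month Starter plan and pay-as-you-go,
# assuming an even input/output token mix (an assumption — tune it).
STARTER_MONTHLY = 10.00
INPUT_PER_M, OUTPUT_PER_M = 0.30, 1.20

blended_per_m = 0.5 * INPUT_PER_M + 0.5 * OUTPUT_PER_M  # $0.75/M at this mix
breakeven_m_tokens = STARTER_MONTHLY / blended_per_m
print(f"~{breakeven_m_tokens:.0f}M tokens/month")
```

Below roughly 13M tokens a month at this mix, metered billing is cheaper; above it, the flat plan wins (until you hit its request window).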
Sign up and grab your key at platform.minimax.io.
## Connecting M2.7 to Hivemind
MiniMax exposes an OpenAI-compatible API. Hivemind has native support for any OpenAI-compatible provider — full streaming, tool calling, and thinking/reasoning support included.
Setup takes about 3 minutes.
### Step 1: Get your API key
After signing up at platform.minimax.io, find your API key under API Keys in the console. It works regardless of whether you're on a token plan or pay-as-you-go.
MiniMax has two regional endpoints:
- International: `https://api.minimax.io/v1`
- China: `https://api.minimaxi.com/v1`
Use the international endpoint unless you're routing through the China region.
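Before wiring the key into Hivemind, you can sanity-check it against the models endpoint. A minimal stdlib sketch — the `/models` path follows from the OpenAI-compatible API shape, so treat the exact response fields as an assumption and run it with your own key:

```python
import json
import urllib.request

INTERNATIONAL = "https://api.minimax.io/v1"
CHINA = "https://api.minimaxi.com/v1"

def models_url(base_url: str) -> str:
    # Same /v1/models discovery endpoint Hivemind calls when you save the provider.
    return base_url.rstrip("/") + "/models"

def list_models(api_key: str, base_url: str = INTERNATIONAL) -> list[str]:
    # Standard OpenAI-style bearer auth; returns the available model ids.
    req = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        data = json.load(resp)
    return [m["id"] for m in data["data"]]
```

If the call returns a list of ids, the key and region are correct; a 401 usually means the wrong key or the wrong regional endpoint.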
### Step 2: Add the provider in Hivemind
Go to Settings → Providers → Add Provider → OpenAI Compatible.
Fill in:
| Field | Value |
| --- | --- |
| Name | `MiniMax` (or whatever you want) |
| Base URL | `https://api.minimax.io/v1` |
| API Key | Your MiniMax API key |
Save it. Hivemind will hit `/v1/models` and auto-discover available models.
### Step 3: Assign M2.7 to an agent
Open any agent and go to Model Settings. Select `MiniMax` as the provider, then choose `MiniMax-Text-01` (the current M2.7 model identifier — verify in your MiniMax console, as this may be updated).

MiniMax also ships a `M2.7-highspeed` variant that produces identical output with lower latency for throughput-sensitive tasks.
That's it. The agent is now running on M2.7.
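Under the hood, what Hivemind sends is a standard OpenAI-style chat completion request. A sketch of the payload shape (the `MiniMax-Text-01` id is the one from the step above — verify it in your console):

```python
# Shape of the request an agent fires once the provider is configured:
# a standard OpenAI-compatible chat completion payload.
def build_chat_request(prompt: str,
                       model: str = "MiniMax-Text-01",
                       stream: bool = True) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,  # streaming is supported end to end
    }

payload = build_chat_request("Review this PR for race conditions.")
```

POST that to `{base_url}/chat/completions` with your bearer token and you have the same call path Hivemind uses, which is handy when debugging a provider that won't connect.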
## Where M2.7 Fits in a Hivemind Team
Hivemind is built around specialized agents. M2.7's profile suggests three natural fits.
### High-volume coding agents
M2.7's 78% SWE-bench Verified score and 55.6% VIBE-Pro (end-to-end project delivery) make it a serious coding model. If you're running a dev team with a coding agent that fires frequently — reviewing PRs, generating boilerplate, running code audits — the cost difference is significant.
A coding agent averaging 50,000 tokens per run, firing 20 times a day:
| Model | Daily cost | Monthly cost |
| --- | --- | --- |
| Claude Opus 4.6 | ~$45 | ~$1,350 |
| MiniMax M2.7 | ~$0.90 | ~$27 |
Same benchmark tier. 50x cost reduction.
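The arithmetic behind the table, under an assumed even input/output split (agent runs are often output-heavier, which is why the table's M2.7 figure lands a bit above the even-split number here):

```python
# 50,000 tokens/run x 20 runs/day = 1M tokens/day.
# Even input/output split is an assumption — adjust for your agents.
M_TOKENS_PER_DAY = 50_000 * 20 / 1e6  # 1.0

def daily_cost(input_per_m: float, output_per_m: float,
               m_tokens: float = M_TOKENS_PER_DAY) -> float:
    return m_tokens * (0.5 * input_per_m + 0.5 * output_per_m)

opus = daily_cost(15.00, 75.00)  # $45.00/day, ~$1,350/month
m27 = daily_cost(0.30, 1.20)     # $0.75/day at this split
print(f"{opus / m27:.0f}x cheaper at an even split")
```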
### Heartbeat automation
The Hivemind heartbeat runs on a configurable interval — every 5 minutes to every 24 hours. It reads a task checklist, delegates to the right agents, and saves findings to memory.
For high-frequency heartbeats (every 5–15 minutes), model cost accumulates fast. M2.7's $0.06/M blended rate makes it practical to run a heartbeat every few minutes without watching your budget evaporate.
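To put numbers on that, here's a sketch of monthly heartbeat cost at the $0.06/M blended rate. The 10,000 tokens per heartbeat run is a hypothetical figure — substitute your own from Hivemind's usage logs:

```python
# Monthly cost of a heartbeat at M2.7's $0.06/M blended (cached) rate.
# TOKENS_PER_RUN is illustrative, not a measured Hivemind number.
BLENDED_PER_M = 0.06
TOKENS_PER_RUN = 10_000

def monthly_heartbeat_cost(interval_minutes: int) -> float:
    runs_per_day = 24 * 60 / interval_minutes
    m_tokens_per_day = runs_per_day * TOKENS_PER_RUN / 1e6
    return 30 * m_tokens_per_day * BLENDED_PER_M

print(f"${monthly_heartbeat_cost(5):.2f}/month at a 5-minute interval")
print(f"${monthly_heartbeat_cost(60):.2f}/month hourly")
```

At these assumptions, even a 5-minute heartbeat stays in single-digit dollars per month.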
```
Settings → Heartbeat → Model: MiniMax / M2.7
```
### Document processing pipelines
M2.7 scored the highest ELO (1495) among all open-source models on GDPval-AA — real-world office task benchmarks across Excel, Word, and PowerPoint. Its 97% skill adherence rate on 40+ complex tasks (each over 2,000 tokens) is relevant for any agent doing structured document work.
Agents handling invoice processing, report generation, or data extraction are good candidates for M2.7.
## Two API Variants
MiniMax ships M2.7 in two flavors:
| Variant | Speed | Quality | Best for |
| --- | --- | --- | --- |
| Standard | — | Full quality | Production, complex tasks |
| Highspeed | Faster | Identical results | High-throughput, latency-sensitive |
Both produce the same output. The highspeed variant delivers lower latency and higher throughput — useful for agents that need to respond quickly in chat contexts. The standard variant is the default choice for deep coding or research tasks where raw speed matters least.
Set up both as separate providers in Hivemind if you want to assign them to different agents.
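If you do register both, a simple routing rule keeps the choice mechanical. A sketch — the agent role names are illustrative, and both model ids should be verified against your MiniMax console:

```python
# Route latency-sensitive agent roles to the highspeed variant and
# deep-work roles to standard. Role names here are illustrative.
LATENCY_SENSITIVE = {"chat", "triage", "heartbeat"}

def pick_variant(agent_role: str) -> str:
    if agent_role in LATENCY_SENSITIVE:
        return "M2.7-highspeed"   # verify exact id in the MiniMax console
    return "MiniMax-Text-01"      # standard variant
```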
## The Bigger Picture
M2.7 is interesting for a reason beyond the benchmarks.
It's a model that participated in its own training — wrote code for its own eval scaffolds, managed its own reinforcement learning workflows, iterated on its own performance. That's the same pattern Hivemind encourages at the agent level: agents that improve over time, write their own tools, build their own skills.
The model itself was built the way Hivemind agents are designed to work.
Whether or not you run M2.7, it's worth understanding that self-improvement isn't a future capability. It's already shipping — both in models and in agent platforms.
## Get Started
```bash
# Already running Hivemind? Add the provider:
#   Settings → Providers → Add Provider → OpenAI Compatible
#   Base URL: https://api.minimax.io/v1
#   API Key: your MiniMax key

# Not running Hivemind yet?
curl -fsSL https://hivementality.ai/install.sh | bash
```
MiniMax API keys: platform.minimax.io
Hivemind on GitHub: github.com/hivementality-ai/hivemind
Questions: Discord