What Developers Say About Kimi K2 Thinking
Watch technical reviews and hands-on demos from AI researchers, developers, and engineers exploring Kimi K2 Thinking's capabilities

Kimi K2 Thinking Is Insane... (Major Update)
Now waiting for the 20B distilled version

Kimi K2 Thinking Is the Best Open-Source Model - First Impressions and Testing
Kimi's writing is consistently excellent. It reads remarkably human and is rarely flagged by AI detectors.

Kimi K2 in 5 Minutes
Quick correction: the hardware recommended on the Moonshot AI site for running the quantized k2-base requires at least 8 H100s, so the cost is at least 8x what I calculated here. It still lags slightly on feasibility, but the point is that the gap is changing. Apologies for the calculation error!
Performance Benchmarks
See how Kimi K2 Thinking compares with leading AI models on key reasoning, coding, and agentic benchmarks.
Performance Across Key Categories

Comprehensive performance comparison across Agentic & Competitive Coding, Tool Use, and Math & STEM benchmarks
Coding Tasks
Software engineering and competitive programming benchmarks
| Benchmark | K2 Thinking | GPT-5 (High) | Claude Sonnet 4.5 | K2 0905 | DeepSeek-V3.2 |
|---|---|---|---|---|---|
| SWE-bench Verified (w/ tools) | 71.3 | 74.9 | 77.2 | 69.2 | 67.8 |
| SWE-bench Multilingual (w/ tools) | 61.1 | 55.3* | 68.0 | 55.9 | 57.9 |
| LiveCodeBench v6 (no tools) | 83.1 | 87.0* | 64.0* | 56.1* | 74.1 |
| OJ-Bench (cpp) (no tools) | 48.7 | 56.2* | 30.4* | 25.5* | 38.2* |
| Terminal-Bench (w/ simulated tools) | 47.1 | 43.8 | 51.0 | 44.5 | 37.7 |
Reasoning Tasks
Multi-step reasoning, mathematics, and STEM problem-solving
| Benchmark | K2 Thinking | GPT-5 (High) | Claude Sonnet 4.5 | K2 0905 | DeepSeek-V3.2 | Grok-4 |
|---|---|---|---|---|---|---|
| HLE (w/ tools) | 44.9 | 41.7* | 32.0* | 21.7 | 20.3* | 41.0 |
| AIME25 (w/ python) | 99.1 | 99.6 | 100.0 | 75.2 | 58.1* | 98.8 |
| HMMT25 (w/ python) | 95.1 | 96.7 | 88.8* | 70.4 | 49.5* | 93.9 |
| GPQA (no tools) | 84.5 | 85.7 | 83.4 | 74.2 | 79.9 | 87.5 |
* indicates values from third-party reports or unofficial sources
Data source: Official Kimi K2 Thinking Model Card
Quick Start Guide
Deploy Kimi K2 Thinking on your own infrastructure with vLLM. A simple five-step setup gets you to production-grade inference.
Hardware Requirements
Minimum setup for deploying Kimi K2 Thinking:
- 8x GPUs with tensor parallelism (NVIDIA H200 recommended)
- Supports INT4 quantized weights with a 256K context length
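As a rough sanity check on why an 8-GPU node is the floor, the INT4 weights alone can be sized with back-of-the-envelope arithmetic. A minimal sketch, assuming the published 1T-parameter count and the H200's 141 GB HBM spec, and ignoring KV cache and activation memory:

```python
# Back-of-the-envelope memory check for INT4 weights on an 8x H200 node.
total_params = 1e12            # ~1 trillion parameters
bytes_per_param = 0.5          # INT4 = 4 bits = half a byte per weight
weight_gib = total_params * bytes_per_param / 1024**3

h200_hbm_gib = 141             # HBM capacity of a single NVIDIA H200
node_gib = 8 * h200_hbm_gib    # aggregate memory with 8-way tensor parallelism

print(f"weights: ~{weight_gib:.0f} GiB, node capacity: {node_gib} GiB")
```

The weights come to roughly 466 GiB against 1128 GiB of aggregate HBM, which is why the headroom matters: the remainder goes to the KV cache at the 256K context length.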
Install vLLM
Install the vLLM inference framework:
pip install vllm
Download Model
Download the model from Hugging Face:
huggingface-cli download moonshotai/Kimi-K2-Thinking --local-dir ./kimi-k2-thinking
Launch vLLM Server
Start the inference server with the essential parameters:
vllm serve moonshotai/Kimi-K2-Thinking \
--tensor-parallel-size 8 \
--tool-call-parser kimi_k2 \
--reasoning-parser kimi_k2 \
--max-num-batched-tokens 32768
Test Deployment
Verify the deployment is working:
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "moonshotai/Kimi-K2-Thinking",
"messages": [
{"role": "user", "content": "Hello, what is 1+1?"}
]
}'
For the complete deployment guide, including SGLang and KTransformers:
Official Deployment Guide
Kimi K2 Thinking Core Capabilities
Explore the powerful features that make Kimi K2 Thinking an ideal choice for complex reasoning and development workflows.
Deep Chain-of-Thought Reasoning
Multi-step reasoning trained end to end, with a native thinking mode. It maintains logical coherence across 200-300 consecutive tool calls during complex problem solving.
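The long-horizon tool calling described above can be pictured as a simple driver loop. This is a hypothetical sketch of the pattern, not Kimi's actual agent interface: `model_step` and the tool registry are stand-ins for the model's tool-call decisions.

```python
def run_agent(model_step, tools, max_calls=300):
    """Sequential tool-calling loop. `model_step` inspects the call history and
    returns either ("call", tool_name, kwargs) or ("final", answer)."""
    history = []
    for _ in range(max_calls):
        action = model_step(history)
        if action[0] == "final":
            return action[1]
        _, name, kwargs = action
        # Execute the requested tool and feed the result back into the history.
        history.append((name, kwargs, tools[name](**kwargs)))
    raise RuntimeError("exceeded tool-call budget")

# Toy stand-in for the model: call the calculator twice, then answer.
def toy_model(history):
    if len(history) < 2:
        return ("call", "add", {"a": 1, "b": 2})
    return ("final", sum(result for _, _, result in history))

answer = run_agent(toy_model, {"add": lambda a, b: a + b})  # -> 6
```

The `max_calls` budget mirrors the 200-300 sequential-call figure: the loop's only job is to keep the history coherent between steps, which is exactly what becomes hard at that horizon.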
Ultra-Long Context Understanding
An industry-leading 256K-token context window handles entire codebases, long documents, and multi-file projects while preserving full contextual understanding.
Trillion-Parameter MoE Architecture
A 1-trillion-parameter mixture-of-experts design activates 32B parameters per forward pass, delivering top-tier performance at an efficient compute cost.
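The efficiency claim follows from the activation ratio: only a small fraction of the 1T weights participates in any one forward pass. A quick calculation, treating the published figures as exact:

```python
total_params = 1_000_000_000_000   # 1T parameters across all experts
active_params = 32_000_000_000     # 32B activated per forward pass

ratio = active_params / total_params
print(f"active fraction: {ratio:.1%}")   # -> 3.2%
```

So per-token compute scales with the 32B active subset, not the full trillion, which is the whole point of the MoE routing.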
Excellent Coding and Agentic Capabilities
Scores 71.3% on SWE-bench Verified and 83.1% on LiveCodeBench v6. Strong on agentic tasks, reaching 60.2% on BrowseComp and 44.9% on Humanity's Last Exam.
Native INT4 Quantization
Quantization-aware training delivers roughly 2x faster inference at INT4 precision while preserving model quality, making it well suited for production deployment.
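To make "INT4" concrete, here is a minimal sketch of symmetric 4-bit quantization in plain Python. This is illustrative only; the model's actual quantization-aware training scheme is more involved (group-wise scales, learned during training).

```python
def quantize_int4(weights):
    """Symmetric INT4: map floats onto integers in [-8, 7] with a shared scale."""
    scale = max(abs(w) for w in weights) / 7  # 7 = largest positive INT4 value
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.9, -0.31, 0.07, -0.64]
q, s = quantize_int4(w)
restored = dequantize(q, s)  # each value lands within scale/2 of the original
```

Each weight is stored in 4 bits instead of 16, an immediate 4x shrink in weight memory and bandwidth; quantization-aware training teaches the network to tolerate the rounding error this introduces.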
Open Source and Cost-Effective
Released under a modified MIT license, with API pricing of $0.60/M input tokens ($0.15 with cache hits) and $2.50/M output tokens - 60-80% cheaper than GPT-4 and Claude.
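A small helper makes the pricing concrete. This assumes simple linear per-token billing at the rates listed above, with no tiering:

```python
def request_cost_usd(input_tokens, output_tokens, cache_hit=False):
    """Estimate one request's cost from the listed per-million-token prices."""
    input_price = 0.15 if cache_hit else 0.60   # USD per 1M input tokens
    output_price = 2.50                          # USD per 1M output tokens
    return (input_tokens / 1e6) * input_price + (output_tokens / 1e6) * output_price

# e.g. a 100K-token prompt with a 10K-token completion: about $0.085
print(request_cost_usd(100_000, 10_000))
```

Note how output tokens dominate the bill at long generations, and how cache hits cut the input side to a quarter of the normal rate.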
Community Reactions on X
Join the conversation about Kimi K2 Thinking and see what the developer community is sharing about their experience
🚀 Hello, Kimi K2 Thinking!
— Kimi.ai (@Kimi_Moonshot) November 6, 2025
The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200-300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built…
Kimi K2 Thinking is the new leading open weights model: it demonstrates particular strength in agentic contexts but is very verbose, generating the most tokens of any model in completing our Intelligence Index evals. @Kimi_Moonshot's Kimi K2 Thinking achieves a 67 in the…
— Artificial Analysis (@ArtificialAnlys) November 7, 2025
The new 1 Trillion parameter Kimi K2 Thinking model runs well on 2 M3 Ultras in its native format - no loss in quality!
— Awni Hannun (@awnihannun) November 7, 2025
The model was quantization aware trained (qat) at int4.
Here it generated ~3500 tokens at 15 toks/sec using pipeline-parallelism in mlx-lm:
If Kimi K2 Thinking was truly trained with only $4.6 million, the closed AI labs are cooked.
— Yuchen Jin (@Yuchenj_UW) November 7, 2025
Give me 1 reason why I shouldn't buy this top of the line Mac Studio, download Kimi K2 Thinking (best AI model in the world right now), and let it control the computer autonomously 24/7
— Alex Finn (@AlexFinn) November 7, 2025
A full employee working for me year round
Would anyone want this live streamed?
