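New 🚀 1T-parameter open-source model: 256K context, deep reasoning mode

Kimi K2 Thinking: Deep Reasoning AI with Extended Context

A 1-trillion-parameter MoE model designed for deep multi-step reasoning and extended context understanding. With a 256K-token context window and a native thinking mode, Kimi K2 Thinking delivers state-of-the-art performance on complex reasoning tasks while remaining cost-efficient. Fully open source under a Modified MIT License.

Reviews

What Developers Are Saying About Kimi K2 Thinking

Watch technical reviews and hands-on demos from AI researchers, developers, and technical experts exploring what Kimi K2 Thinking can do.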

Kimi K2 Thinking is CRAZY... (HUGE UPDATE)

Waiting for a 20B distilled version.

Kimi K2 Thinking Is The BEST Open Source Model - First Look & Testing

Kimi's writing is consistently excellent. It reads as human and is rarely flagged by AI detectors.

Kimi K2 explained in 5 minutes

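Quick correction: the hardware recommended on the MoonShot AI site for running k2-base is 8 H100 units for the quantized version, so the cost is at least 8x what I calculated here. Feasibility still lags a little, but the point that the gap will change still stands. Apologies for the miscalculation!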

Performance Benchmark Comparison

See how Kimi K2 Thinking performs against leading AI models on key reasoning, coding, and agentic benchmarks.

Performance Across Key Categories

Kimi K2 Thinking Benchmark Comparison - Agentic Coding, Tool Use, Math & STEM

Comprehensive performance comparison across Agentic & Competitive Coding, Tool Use, and Math & STEM benchmarks

Coding Tasks

Software engineering and competitive programming benchmarks

Benchmark	K2 Thinking	GPT-5 (High)	Claude Sonnet 4.5	K2 0905	DeepSeek-V3.2
SWE-bench Verified (w/ tools)	71.3	74.9	77.2	69.2	67.8
SWE-bench Multilingual (w/ tools)	61.1	55.3*	68.0	55.9	57.9
LiveCodeBench v6 (no tools)	83.1	87.0*	64.0*	56.1*	74.1
OJ-Bench (cpp) (no tools)	48.7	56.2*	30.4*	25.5*	38.2*
Terminal-Bench (w/ simulated tools)	47.1	43.8	51.0	44.5	37.7

Reasoning Tasks

Multi-step reasoning, mathematics, and STEM problem-solving

Benchmark	K2 Thinking	GPT-5 (High)	Claude Sonnet 4.5	K2 0905	DeepSeek-V3.2	Grok-4
HLE (w/ tools)	44.9	41.7*	32.0*	21.7	20.3*	41.0
AIME25 (w/ python)	99.1	99.6	100.0	75.2	58.1*	98.8
HMMT25 (w/ python)	95.1	96.7	88.8*	70.4	49.5*	93.9
GPQA (no tools)	84.5	85.7	83.4	74.2	79.9	87.5

* indicates values from third-party reports or unofficial sources

Data source: Official Kimi K2 Thinking Model Card

Quick Start Guide

Deploy Kimi K2 Thinking on your own infrastructure with vLLM. A simple five-step setup for production-ready inference.

Hardware Requirements

Minimum setup for deploying Kimi K2 Thinking:

• 8x GPUs with tensor parallelism (NVIDIA H200 recommended)
• Supports INT4 quantized weights with a 256K context length

Install vLLM

Install vLLM inference framework:

bash

pip install vllm

Download Model

Download the model from Hugging Face:

bash

huggingface-cli download moonshotai/Kimi-K2-Thinking --local-dir ./kimi-k2-thinking

Launch vLLM Server

Start the inference server with essential parameters:

vLLM Deployment

bash

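# --enable-auto-tool-choice is needed for --tool-call-parser to take effect
vllm serve moonshotai/Kimi-K2-Thinking \
  --trust-remote-code \
  --tensor-parallel-size 8 \
  --enable-auto-tool-choice \
  --tool-call-parser kimi_k2 \
  --reasoning-parser kimi_k2 \
  --max-num-batched-tokens 32768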

Test Deployment

Verify the deployment is working:

Test API

bash

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshotai/Kimi-K2-Thinking",
    "messages": [
      {"role": "user", "content": "Hello, what is 1+1?"}
    ]
  }'
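The same check can be made from Python. The sketch below uses only the standard library and assumes the server from the previous step is listening on localhost:8000; the helper names `build_chat_request` and `ask` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Assumption: the vLLM server started above is reachable at this address.
BASE_URL = "http://localhost:8000/v1"
MODEL = "moonshotai/Kimi-K2-Thinking"


def build_chat_request(
    prompt: str, base_url: str = BASE_URL, model: str = MODEL
) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 600.0) -> str:
    """Send the prompt and return the text of the first choice."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("Hello, what is 1+1?")` returns the reply text. The generous timeout is a guess that allows for long thinking traces; tune it to your deployment.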

For the complete deployment guide, including SGLang and KTransformers, see the Official Deployment Guide.

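Key Features of Kimi K2 Thinking

Discover the capabilities that make Kimi K2 Thinking well suited to complex reasoning and development workflows.

Deep Chain-of-Thought Reasoning

Trained end to end for multi-step reasoning with a native thinking mode. Maintains coherent logic across 200-300 sequential tool calls for complex problem solving.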
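A long tool-using run of this kind is usually driven by a client-side loop: call the model, execute any tool it requests, append the result, and repeat until the model answers or a step budget runs out. The sketch below illustrates that loop with a stand-in model function. `run_agent`, `fake_model`, `add_tool`, and the message shapes are hypothetical, loosely modeled on chat-style messages, and not taken from Moonshot's implementation.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]


def run_agent(
    model: Callable[[List[Message]], Message],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 300,  # step budget in the range quoted above
) -> str:
    """Loop until the model replies without requesting a tool."""
    messages: List[Message] = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = model(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:
            return str(reply["content"])  # final answer
        # Run the requested tool and feed its result back to the model.
        result = tools[call["name"]](call["argument"])
        messages.append({"role": "tool", "name": call["name"], "content": result})
    raise RuntimeError(f"no final answer within {max_steps} steps")


def fake_model(messages: List[Message]) -> Message:
    """Stand-in for the real model: requests one tool call, then answers."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {
            "role": "assistant",
            "content": "",
            "tool_call": {"name": "add", "argument": "1 1"},
        }
    return {
        "role": "assistant",
        "content": f"The answer is {tool_results[-1]['content']}.",
    }


def add_tool(argument: str) -> str:
    """Toy tool: sum whitespace-separated integers."""
    return str(sum(int(x) for x in argument.split()))


answer = run_agent(fake_model, {"add": add_tool}, "What is 1+1?")
```

The explicit `max_steps` budget is the design point: a real deployment would swap `fake_model` for a call to the served model while keeping the bound, so a run cannot loop indefinitely.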

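Extended Context Understanding

A 256K-token context window lets the model process entire codebases, long documents, and multi-file projects while retaining full context throughout.

1-Trillion-Parameter MoE Architecture

A 1-trillion-parameter Mixture-of-Experts design with 32B active parameters per forward pass delivers strong performance at an efficient compute cost.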
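To put the two parameter counts in proportion, the short calculation below computes the share of weights that take part in a single forward pass. It uses only the 1T and 32B figures quoted above; the function name is illustrative.

```python
TOTAL_PARAMS = 1_000_000_000_000  # 1T total parameters (figure from the text)
ACTIVE_PARAMS = 32_000_000_000    # 32B active per forward pass (figure from the text)


def active_fraction(active: int, total: int) -> float:
    """Share of the model's parameters used in one forward pass."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"{fraction:.1%} of parameters are active per forward pass")
```

Roughly 3% of the weights are exercised per pass, which is why per-token compute is far below what the total parameter count suggests, even though all weights must still be held in memory.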

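Strong Coding and Agentic Capabilities

Scores 71.3% on SWE-bench Verified and 83.1% on LiveCodeBench v6. On agentic tasks, it reaches 60.2% on BrowseComp and 44.9% on Humanity's Last Exam.

Native INT4 Quantization

Quantization-aware training gives about a 2x inference speedup at INT4 precision while preserving model quality for production deployment.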
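A back-of-the-envelope estimate shows why 4-bit weights matter for a model of this size. The sketch below makes a deliberately simplified assumption: every one of the 1T parameters is stored at the stated bit width. It ignores quantization scales, any layers kept at higher precision, the KV cache, and activations, so real checkpoints and runtime footprints will be larger.

```python
TOTAL_PARAMS = 1_000_000_000_000  # 1T parameters (figure from the text)
NUM_GPUS = 8                      # matches the 8-GPU setup in the quick start


def weight_memory_gb(num_params: int, bits_per_param: float) -> float:
    """Raw weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


int4_gb = weight_memory_gb(TOTAL_PARAMS, 4)   # idealized 4-bit footprint
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)  # 16-bit footprint for comparison
per_gpu_gb = int4_gb / NUM_GPUS               # even split across tensor-parallel GPUs

print(f"INT4: {int4_gb:.0f} GB total, {per_gpu_gb:.1f} GB per GPU")
print(f"16-bit: {bf16_gb:.0f} GB total")
```

Under these assumptions the weights drop from about 2,000 GB at 16 bits to about 500 GB at 4 bits, or 62.5 GB per GPU across eight devices, before any cache or activation memory is counted.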

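Open Source and Cost-Efficient

Released under a Modified MIT License. API pricing is $0.60 per million input tokens ($0.15 with cache hits) and $2.50 per million output tokens, roughly 60-80% cheaper than GPT-4 and Claude.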
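The per-million-token prices translate into per-request costs as follows. This is a minimal sketch using only the three prices quoted above; the function name and the example token counts are made up for illustration, and actual billing rules (for example, how thinking tokens are counted) should be checked against the provider's documentation.

```python
# Prices in USD per million tokens, as quoted above.
INPUT_PRICE = 0.60
CACHED_INPUT_PRICE = 0.15
OUTPUT_PRICE = 2.50


def request_cost(
    input_tokens: int, output_tokens: int, cached_input_tokens: int = 0
) -> float:
    """Estimated USD cost of one request; cached tokens are a subset of the input."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    total = (
        fresh_tokens * INPUT_PRICE
        + cached_input_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000


# Hypothetical request: 100K input tokens and 20K output tokens.
uncached = request_cost(100_000, 20_000)
mostly_cached = request_cost(100_000, 20_000, cached_input_tokens=80_000)
print(f"no cache: ${uncached:.3f}, 80% cached input: ${mostly_cached:.3f}")
```

For this example the request costs about $0.11 without caching and about $0.074 when 80% of the input is served from cache. Output tokens dominate once the input is mostly cached.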

Community Reactions on X

Join the conversation about Kimi K2 Thinking and see what the developer community is sharing about its experience.

🚀 Hello, Kimi K2 Thinking!
The Open-Source Thinking Agent Model is here.

🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200 – 300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window

Built… pic.twitter.com/lZCNBIgbV2
— Kimi.ai (@Kimi_Moonshot) November 6, 2025

Kimi K2 Thinking is the new leading open weights model: it demonstrates particular strength in agentic contexts but is very verbose, generating the most tokens of any model in completing our Intelligence Index evals

@Kimi_Moonshot's Kimi K2 Thinking achieves a 67 in the… pic.twitter.com/m6SvpW7iif
— Artificial Analysis (@ArtificialAnlys) November 7, 2025

The new 1 Trillion parameter Kimi K2 Thinking model runs well on 2 M3 Ultras in its native format - no loss in quality!

The model was quantization aware trained (qat) at int4.

Here it generated ~3500 tokens at 15 toks/sec using pipeline-parallelism in mlx-lm: pic.twitter.com/oH5DPi7kAg
— Awni Hannun (@awnihannun) November 7, 2025

If Kimi K2 Thinking was truly trained with only $4.6 million, the close AI labs are cooked. pic.twitter.com/LPbSL0v1U5
— Yuchen Jin (@Yuchenj_UW) November 7, 2025

Give me 1 reason why I shouldn't buy this top of the line Mac Studio, download Kimi K2 Thinking (best AI model in the world right now), and let it control the computer autonomously 24/7

A full employee working for me year round

Would anyone want to this live streamed? pic.twitter.com/6vZd7dyAoP
— Alex Finn (@AlexFinn) November 7, 2025

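Frequently Asked Questions

What is Kimi K2 Thinking, and how does it differ from standard K2?

How does thinking mode work?

Which use cases is Kimi K2 Thinking best suited for?

How do I access and use Kimi K2 Thinking?

What is the pricing structure?

How does Kimi K2 Thinking compare with reasoning models such as o1 and DeepSeek R1?

How does Kimi K2 Thinking balance reasoning depth, speed, and cost?

Can I deploy Kimi K2 Thinking locally? What are the requirements?

What are the best practices for using Kimi K2 Thinking effectively?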