Kimi-K2.5

Core Overview

Kimi-K2.5 is Moonshot AI’s most versatile model to date, featuring a native multimodal architecture that simultaneously supports visual and text input, thinking and non-thinking modes, and conversational and Agent tasks. With its 256K ultra-long context window, multimodal understanding, and Tool Calling capabilities, it sets a new benchmark in open-source visual programming and Agent clusters, aiming to empower developers to build next-generation AI applications.

Key Features

Native Multimodal Architecture: Supports mixed input of visual and text, capable of image recognition, visual programming, and other tasks, leading the way in open-source visual programming.
256K Ultra-Long Context Window: Provides a 256,000 token context window, supporting long-form reasoning and complex task processing, effectively handling large amounts of information.
Agent Clusters and Tool Calling: Supports a preview version of Agent clusters, capable of supporting up to 100 sub-agents and 1,500 tool calls, operating 4.5 times faster than single-agent configurations, significantly enhancing automation and complex task execution efficiency.
Exceptional Coding Capabilities: Performs outstandingly in benchmarks such as SWE-Bench and LiveCodeBench, leading in competitive programming, and is priced significantly lower than comparable models.
Thinking Modes: Offers both thinking and non-thinking modes, allowing the model to engage in deeper reasoning and planning when needed.

Best Use Cases

Visual Programming and Automation: From pixel-level webpage replication to expert-level office delivery, efficiently handles complex tasks, especially suitable for scenarios requiring visual understanding and programming capabilities.
Ultra-Long Text Processing and Analysis: Applicable to tasks requiring the processing of massive amounts of information, such as legal document review, research report analysis, and codebase understanding.
Agent Tasks and Multi-Agent Collaboration: Builds complex automated workflows, enabling multi-agent collaboration to complete large-scale tasks.
Professional Code Generation and Debugging: Provides powerful code generation, optimization, and debugging capabilities for developers.

Capabilities and Limitations

Capability	Detailed Description
Reasoning Ability	Extremely Strong. Excels in long-context reasoning, Agent task planning, and execution.
Creative Ability	Extremely Strong. Particularly adept at visual programming, code generation, and multimodal content understanding.
Multimodal Ability	Native multimodal, supports visual and text input, outstanding in visual programming.
Response Speed	Fast response in quick mode, efficient parallel processing achievable in Agent cluster mode.
Context Window	256,000 Tokens
Max Output	256,000 Tokens

Credits and Pricing

Model	Input (Credits/Token)	Output (Credits/Token)
Kimi-K2.5	0.59	3.00