LongCat AI – Next-Generation Multi-Modal Models
Open-source MoE LLMs by Meituan: Flash-Chat, Flash-Thinking, Video, Video-Avatar (SOTA avatar generation), Image (generation & editing), Audio-Codec, and Omni. Fast, efficient, and production-ready.
Latest Release: LongCat-Flash-Thinking-2601
Today, the Meituan LongCat team officially releases and open-sources LongCat-Flash-Thinking-2601. An upgraded version of the previously released LongCat-Flash-Thinking model, LongCat-Flash-Thinking-2601 achieves open-source SOTA performance on core evaluation benchmarks including Agentic Search, Agentic Tool Use, and TIR (Tool-Integrated Reasoning).
The model generalizes exceptionally well in tool calling, outperforming Claude on randomly constructed complex tasks that rely on tool calling and significantly reducing the cost of adapting to new tools in real-world scenarios. It is also the first fully open-source model to offer a free online trial of its "Re-thinking Mode", which activates 8 parallel reasoning paths simultaneously to ensure thorough deliberation and reliable decision-making.
The feature can now be tried for free at https://longcat.ai (Re-thinking Mode is triggered by selecting the deep-thinking function).
🧠 Revolutionary "Re-thinking" Mode
The newly upgraded "Re-thinking" mode teaches the model to "think carefully" before acting. When it encounters a high-difficulty problem, the model breaks its thinking into two phases, parallel thinking and summary synthesis (a minimal sketch of the loop follows the list below):
- Parallel thinking phase: The model simultaneously and independently explores multiple reasoning paths, similar to how humans consider different solutions when facing difficult problems, ensuring diversity of thought to avoid missing optimal solutions
- Summary synthesis phase: The multiple paths are organized, refined, and synthesized, and the refined result is fed back to form a closed iterative reasoning loop that continuously deepens the thinking process
- Reinforcement learning enhancement: An additional reinforcement learning stage specifically refines the model's summary-synthesis capability, enabling LongCat-Flash-Thinking-2601 to truly "think clearly before acting"
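The description above maps naturally onto a sample-then-synthesize loop. Below is a minimal sketch of that loop, not LongCat's actual implementation: `generate` is a hypothetical stand-in for any chat-completion call (for example, an OpenAI-compatible endpoint serving the model), and the round count and temperatures are assumed, tunable values.

```python
from concurrent.futures import ThreadPoolExecutor

N_PATHS = 8    # parallel reasoning paths, per the release notes
N_ROUNDS = 2   # closed-loop iterations (assumed, tunable)

def generate(prompt: str, temperature: float = 1.0) -> str:
    """Hypothetical stand-in for a model call; replace for real use."""
    return f"[model output @ T={temperature} for: {prompt[:40]}...]"

def rethink(question: str) -> str:
    context, answer = "", ""
    for _ in range(N_ROUNDS):
        # Phase 1: parallel thinking -- independently explore N_PATHS
        # reasoning paths at high temperature for diversity.
        prompt = f"{context}Question: {question}\nReason step by step."
        with ThreadPoolExecutor(max_workers=N_PATHS) as pool:
            paths = list(pool.map(
                lambda _: generate(prompt, temperature=1.0), range(N_PATHS)))
        # Phase 2: summary synthesis -- organize and merge the paths into
        # one refined answer at low temperature.
        joined = "\n\n".join(f"Path {i+1}:\n{p}" for i, p in enumerate(paths))
        answer = generate(
            f"Question: {question}\n\nCandidate reasoning paths:\n{joined}\n\n"
            "Synthesize these into one corrected chain of reasoning and a "
            "final answer.",
            temperature=0.2)
        # Feed the synthesis back to deepen the next round of thinking.
        context = f"Previous synthesis:\n{answer}\n\n"
    return answer
```

Sampling the paths concurrently keeps wall-clock latency close to that of a single path, which is what makes 8-way exploration practical at inference time.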
📊 Comprehensive Benchmark Performance
Comprehensive and rigorous evaluation shows that LongCat-Flash-Thinking-2601 leads in programming, mathematical reasoning, agentic tool calling, and agentic search:
- Programming capability: Achieves 82.8 on LCB (LiveCodeBench) and 47.7 on OIBench EN, placing it in the first tier of comparable models and demonstrating solid coding fundamentals
- Mathematical reasoning: Outstanding performance with Re-thinking mode enabled, achieving 100.0 (perfect score) on AIME-25 and 86.8 (current SOTA) on IMO-AnswerBench
- Agentic tool calling: Scores 88.2 on τ²-Bench and 29.3 on VitaBench, both achieving open-source SOTA, demonstrating excellent performance in multi-domain tool calling scenarios
- Agentic search: Achieves 73.1 on BrowseComp (best among all models) and 79.5 on RW Search, demonstrating open-source-leading information retrieval and scenario adaptation
🔧 Advanced Training Techniques
- Environment expansion + multi-environment reinforcement learning: Built a diverse set of high-quality training environments as "high-intensity training grounds", each integrating 60+ tools with dense dependency graphs and complex interactions
- Noise robustness training: Actively injects multiple noise types during training, simulating API call failures, error returns, and incomplete data, with curriculum learning that gradually increases both the variety and the intensity of the noise (see the sketch after this list)
- Enhanced DORA infrastructure: Extended the self-developed DORA reinforcement learning infrastructure to support stable parallel training of large-scale multi-environment agents while preserving its efficient asynchronous training characteristics
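To make the noise-robustness idea concrete, here is a hedged sketch of curriculum-scheduled noise injection around a tool call. The noise types follow the list above; the injection rates, schedule, and wrapper shape are illustrative assumptions, not LongCat's actual training recipe.

```python
import random

# Noise types named in the release notes.
NOISE_TYPES = ["api_failure", "error_return", "incomplete_data"]

def noisy_tool_call(tool, args, progress):
    """Wrap a tool call for robustness training.

    `progress` in [0, 1] is the curriculum position: both the set of
    active noise types and the injection rate grow as training advances.
    All numbers here are assumptions.
    """
    active = NOISE_TYPES[: 1 + int(progress * (len(NOISE_TYPES) - 1))]
    rate = 0.05 + 0.25 * progress             # 5% -> 30% injection rate
    if random.random() >= rate:
        return tool(**args)                   # clean call
    kind = random.choice(active)
    if kind == "api_failure":
        raise TimeoutError("simulated API timeout")
    if kind == "error_return":
        return {"error": "simulated internal error", "code": 500}
    result = tool(**args)                     # "incomplete_data": assume a
    items = list(result.items())              # dict result and drop half
    return dict(items[: max(1, len(items) // 2)])
```

The agent never knows whether a failure is real or injected, so it must learn to retry, re-plan, or fall back, which is exactly the robustness the curriculum targets.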
Previous Release: LongCat-Video-Avatar
Following the successful releases of InfiniteTalk and LongCat-Video, the LongCat team officially releases and open-sources LongCat-Video-Avatar, a SOTA-level avatar video generation model. Built on the LongCat-Video base, it achieves breakthrough improvements in three key dimensions: realistic motion, long-video stability, and identity consistency, providing developers with a more stable, efficient, and practical solution for virtual human generation.
🎭 Open-Source SOTA Realism
- Full-body synchronization: Synchronously controls lip sync, eye movements, facial expressions, and body gestures to achieve rich and full emotional expression
- Natural micro-movements: Maintains natural blinking, breathing, and posture adjustments during silent segments through Disentangled Unconditional Guidance
- Multi-mode support: Audio-Text-to-Video (AT2V), Audio-Text-Image-to-Video (ATI2V), and video continuation
🎬 Long-Sequence High-Quality Generation
- 5-minute+ stable generation: Cross-Chunk Latent Stitching enables generation of ~5,000-frame videos without quality degradation, keeping colors consistent and details sharp (see the sketch after this list)
- No quality loss: Eliminates VAE cycle-induced quality loss by performing operations directly in latent space
- Improved inference efficiency: Direct latent space operations without pixel domain decoding
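A hedged sketch of the cross-chunk idea, under the assumption that each chunk is denoised conditioned on the tail latents of its predecessor; `denoise_chunk` and `vae_decode` are hypothetical stand-ins, and the overlap length is an assumed value.

```python
import torch

OVERLAP = 4  # latent frames carried between chunks (assumed value)

def generate_long_video(n_chunks, chunk_len, denoise_chunk, vae_decode):
    """Stitch chunks in latent space; decode to pixels exactly once."""
    chunks, context = [], None
    for _ in range(n_chunks):
        # Condition each chunk on the previous chunk's tail *latents*;
        # there is no mid-generation VAE decode/encode round trip, which
        # is what removes the cumulative quality loss.
        chunk = denoise_chunk(chunk_len, context=context)  # (B, T, C, H, W)
        chunks.append(chunk)
        context = chunk[:, -OVERLAP:]
    video_latent = torch.cat(chunks, dim=1)  # stitch along the time axis
    return vae_decode(video_latent)          # single decode for the video
```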
✅ Commercial-Grade Identity Consistency
- Reference Skip Attention: Ensures a consistent character appearance throughout long sequences (see the sketch after this list)
- Motion diversity: Avoids "copy-paste" effects and rigid movements, making videos both stable and varied
- SOTA benchmark performance: Leading performance on HDTF, CelebV-HQ, EMTD, and EvalTalker datasets
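The mechanism's internals are not spelled out in this overview, but a common way to anchor identity is to let frame tokens attend to tokens derived from the reference image. The sketch below is a speculative reading along those lines; the function name, shapes, and layer placement are all assumptions.

```python
import torch
import torch.nn.functional as F

def reference_attention(q, k, v, ref_k, ref_v):
    """q/k/v: (B, T, D) tokens of the current frames;
    ref_k/ref_v: (B, R, D) tokens from the reference image (assumed)."""
    k = torch.cat([k, ref_k], dim=1)  # frames also attend to the reference,
    v = torch.cat([v, ref_v], dim=1)  # anchoring the character's appearance
    return F.scaled_dot_product_attention(q, k, v)
```

Because the reference tokens are fixed, they act as a constant appearance anchor even as the generated sequence grows, while ordinary temporal attention still provides motion diversity.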
Model Series
Image
6B parameters | Open-source SOTA on image editing (GEdit-Bench, ImgEdit-Bench) and Chinese text rendering (ChineseWord: 90.7). Covers all 8,105 standard Chinese characters.
Latest Release | Hugging Face | GitHub
Video-Avatar
SOTA-level avatar video generation. Natural micro-movements, long-video stability, and identity consistency for virtual human generation.
Released: Nov 2025 | Hugging Face | GitHub
Flash-Thinking
Enhanced reasoning with dual-path framework. Open-source SOTA on agentic tool use and search. Revolutionary Re-thinking mode with 8 parallel reasoning paths.
Latest: Flash-Thinking-2601 | Hugging Face | GitHub
Flash-Chat
Foundation dialogue model (560B params, MoE). Achieves 100+ tokens/s on H800 GPUs with ~27B active params/token.
Released: Sept 1, 2025
Key Highlights
- High-throughput inference: 100+ tokens/s on H800 GPUs
- Zero-Computation Experts: Activates only ~27B params/token from the 560B pool (see the sketch after this list)
- Extended context: Up to 128K tokens
- AI image generation: Fast, studio-quality image generation and editing with superior Chinese text rendering
- Open-source SOTA: Leading performance on Omni-Bench, WorldSense, MMLU, and more
- Production-ready: Deployed across Meituan's services
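For readers unfamiliar with zero-computation experts: the idea is that the MoE router can send a token to an identity expert that simply returns its input, so easy tokens consume fewer FLOPs. The sketch below is a simplified top-1 illustration; LongCat-Flash's actual router, expert counts, and load balancing differ.

```python
import torch
import torch.nn as nn

class MoEWithZeroExperts(nn.Module):
    """Toy MoE layer whose router may pick an identity
    ("zero-computation") expert; expert counts are illustrative."""

    def __init__(self, dim, n_real=8, n_zero=2):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(n_real)])
        self.router = nn.Linear(dim, n_real + n_zero)  # last n_zero: identity

    def forward(self, x):                       # x: (tokens, dim)
        choice = self.router(x).argmax(dim=-1)  # top-1 routing, simplified
        out = x.clone()                         # default path: identity
        for i, expert in enumerate(self.experts):
            mask = choice == i
            if mask.any():
                out[mask] = expert(x[mask])     # only real experts compute
        return out
```

In LongCat-Flash, the same principle combined with sparse top-k routing is what lets a 560B parameter pool average only ~27B active params per token.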