LongCat 2.0: How Meituan Trained a 1.6T Model Without NVIDIA GPUs or TPUs
In this video, I break down how Meituan open-sourced LongCat 2.0 (Owl Alpha on OpenRouter), a 1.6T-parameter MoE model trained without NVIDIA GPUs or Google TPUs, and why that matters for reducing reliance on NVIDIA’s hardware and CUDA software stack. I explain the core tradeoffs between parameters and compute, why simply adding more experts can hit diminishing returns, and how LongCat uses n-gram embeddings to increase corpus information more cheaply than adding experts. I cover the long-context cost problem and their modified sparse attention approach (inspired by DeepSeek), which makes the helper lighter via predictable memory access, caching across layers, and coarse-to-fine selection. I also discuss speculative decoding with a draft “picker” model, custom ASICs tuned for prefill vs. decode, and training on 50,000+ chips over 35T tokens. Finally, I demo the model generating a 3D ISS tracker on longcat.chat.
https://longcat.chat/blog/longcat-2.0/
DeepSpec Video: https://youtu.be/eFgknPFK-g0
My voice to text App: whryte.com
Website: https://engineerprompt.ai/
RAG Beyond Basics Course:
https://prompt-s-site.thinkific.com/courses/rag
Sign up for Newsletter, localGPT:
https://tally.so/r/3y9bb0
Let’s Connect:
🦾 Discord: https://discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Patreon: https://www.patreon.com/PromptEngineering
💼 Consulting: https://calendly.com/engineerprompt/consulting-call
📧 Business Contact: engineerprompt@gmail.com
Become a Member: http://tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: https://bit.ly/localGPT (use Code: PromptEngineering for 50% off).
00:00 LongCat Shockwave
00:39 Why Hardware Matters
01:20 MoE Limits and N-Grams
03:40 Cheaper Long Context
06:23 Custom Chips and Stack
08:23 Demo Results and Takeaways
#Promptengineering #AI