DeepSeek-V2 Technical Interpretation
DeepSeek has introduced a new MoE model, DeepSeek-V2, with a total parameter count of 236 billion and 21 billion active parameters. Although it is still a bit short of GPT-4 levels, it can be considered the strongest open-source MoE model available. Staying true to its open-source spirit, the accompanying technical report is also packed with …