Deepseek-V2 Technical Report Analysis

Deepseek recently released version 2 of its model, continuing the technical approach of the Deepseek-MoE (Mixture of Experts) model released in January. It uses a large number of small experts, each with relatively few parameters, and adds further optimizations for training and inference. True to its tradition, Deepseek has fully open-sourced the model (base and …