Summary of Post-Training Techniques from Llama3.1 to DeepSeek-V3

Summit Preview: On January 14, the Fourth Global Autonomous Driving Summit will be held in Beijing. The main venue will host the opening ceremony, an end-to-end autonomous driving innovation forum, and a city NOA special forum, while the sub-venues will hold technical seminars on autonomous driving vision-language models and world models. All the speakers for the summit … Read more

In-Depth Study of Qwen 2.5 Paper

Introduction: I must say, Qwen is really impressive. Its foundational capabilities seem to have firmly established it as the leader among open-source models, and it is in no way inferior to most closed-source models. Many companies’ foundation-model teams are likely already facing questions about the value of their own foundation models. Qwen’s open-source momentum is … Read more

Guide to Deploying Llama3 Locally with Ollama

As we all know, Zuckerberg’s Meta has open-sourced Llama3 in two sizes, 8B and 70B, each available as a pretrained model and an instruction-tuned model. A larger 400B-parameter version is expected to be released this summer, which may be the first open-source model at the GPT-4 level! Let’s start with a preliminary understanding of Llama3. Model Architecture … Read more