Summary of BERT-Related Models

©PaperWeekly Original · Author: Xiong Zhiwei · School: Tsinghua University · Research direction: Natural Language Processing

Since BERT was proposed in 2018, it has achieved considerable success and attracted wide attention. Building on it, researchers have proposed a range of related models that aim to improve BERT. This article attempts to summarize and organize these models. MT-DNN: MT-DNN (Multi-Task DNN) was proposed by Microsoft … Read more

Building Language Applications with Hugging Face Transformers

Hugging Face is a chatbot startup based in New York that focuses on NLP technology and has a large open-source community. In particular, its open-source library of natural language processing tools and pre-trained models, Transformers, has been downloaded over a million times and has more than 24,000 stars on GitHub. Transformers provides a large number of state-of-the-art pre-trained language model … Read more

Runway Gen-1: Create Anything You Want with AI Tools

Runway is a cloud-based AI tool that provides a variety of AI models and tools, enabling users to quickly create, deploy, and share their own AI projects. Runway offers many commonly used AI models, including models for image recognition, natural language processing, and audio processing. These models are obtained … Read more

Runway: Rapid Iteration of Foundation Models in Video Generation

Further Reading
AI Deep Research No. 19: Jasper: An Integrated AI Tool Born for Marketing
AI Deep Research No. 18: Cohere: A Strong Competitor to OpenAI Focused on B-end Track
AI Deep Research No. 17: Inspur Information: Focusing on “Cloud + AI”, the Power Leader Welcomes New Development
AI Deep Research No. 16: Zhongwang Software … Read more

AI Breakthrough: Automatically Generate Videos from Text with Runway Gen-2

Are you already amazed by AI’s ability to convert text into images? Now AI can also generate videos from text! Do you still need to edit videos? With AI, you can create a video from scratch; all you need to do is think of a script and type it out. There’s no need to learn … Read more

Worried About Prompt Leaking Privacy? This Framework Enables Secure Inference for LLaMA-7B

Machine Heart Reports · Editor: Panda

Many providers currently offer deep learning services. To use these services, users must send the information contained in their prompts to the provider, which can leak private data. On the other hand, service providers are generally unwilling to disclose the model parameters they have painstakingly … Read more

Visual Prompt Engineering: No Fine-Tuning Required

Author: Tech Beast · Editor: Jishi Platform

Jishi Guide: How can a pre-trained visual model be adapted to new downstream tasks without task-specific fine-tuning or any model modifications?

Table of Contents 1 Completing Visual Prompting … Read more

Prompt Engineering Tutorial in Chinese

How do you write high-quality prompts that let AI generate stunning text and images? This is the problem being solved by prompt engineers, who are still in high demand, with annual salaries reaching millions. In the era of large language models, numerous mind-blowing artworks and impressive copywriting are emerging, created not by traditionally defined artists or writers, … Read more

Key Details of Qwen MoE: Enhancing Model Performance Through Global Load Balancing

Today we share the latest paper from the Alibaba Cloud Tongyi Qianwen team: "Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models" (original paper link: https://arxiv.org/abs/2501.11873). The paper focuses on improving the training of Mixture-of-Experts (MoE) models by relaxing local balance to global balance through lightweight communication, significantly … Read more
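The difference between local and global balance can be illustrated with a small sketch. The excerpt does not show the paper's formula, so this assumes the commonly used auxiliary load-balancing loss, N × Σ f_i × p_i, where f_i is the fraction of routing slots assigned to expert i and p_i is its mean gate probability. The function names, the toy data, and the equal-sized micro-batches are all hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Batch = List[List[float]]  # one row of gate probabilities per token


def expert_stats(gate_probs: Batch, top_k: int) -> Tuple[List[float], List[float]]:
    """Return (f, p) for one micro-batch: routing fractions and mean gate probabilities."""
    num_tokens = len(gate_probs)
    num_experts = len(gate_probs[0])
    counts = [0] * num_experts
    prob_sums = [0.0] * num_experts
    for probs in gate_probs:
        # Top-k routing: each token is sent to its k highest-probability experts.
        chosen = sorted(range(num_experts), key=lambda i: probs[i], reverse=True)[:top_k]
        for i in chosen:
            counts[i] += 1
        for i, pr in enumerate(probs):
            prob_sums[i] += pr
    f = [c / (num_tokens * top_k) for c in counts]
    p = [s / num_tokens for s in prob_sums]
    return f, p


def balance_loss(f: List[float], p: List[float]) -> float:
    """Assumed form of the auxiliary loss: num_experts * sum_i f_i * p_i."""
    return len(f) * sum(fi * pi for fi, pi in zip(f, p))


def local_balance_loss(micro_batches: List[Batch], top_k: int) -> float:
    """Local balance: f_i is computed inside each micro-batch separately."""
    losses = [balance_loss(*expert_stats(mb, top_k)) for mb in micro_batches]
    return sum(losses) / len(losses)


def global_balance_loss(micro_batches: List[Batch], top_k: int) -> float:
    """Global balance: f_i is averaged over all micro-batches before the loss.

    In distributed training this average is the only value that would need to be
    communicated. Equal-sized micro-batches are assumed, so a plain mean suffices.
    """
    stats = [expert_stats(mb, top_k) for mb in micro_batches]
    num_experts = len(stats[0][0])
    f_global = [sum(f[i] for f, _ in stats) / len(stats) for i in range(num_experts)]
    losses = [balance_loss(f_global, p) for _, p in stats]
    return sum(losses) / len(losses)


# Toy data: each micro-batch comes from one "domain" and prefers a different expert.
batch_a = [[0.9, 0.1], [0.9, 0.1]]
batch_b = [[0.1, 0.9], [0.1, 0.9]]
print(local_balance_loss([batch_a, batch_b], top_k=1))
print(global_balance_loss([batch_a, batch_b], top_k=1))
```

On this toy data the local loss is about 1.8, because each micro-batch looks unbalanced on its own, while the global loss is about 1.0, the value for a perfectly balanced router. Under these assumptions, the global variant does not penalize experts for specializing on one domain as long as the load is balanced across the whole batch.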

Ali Qwen 2.5-1M Open Source: 320GB for 14B Tokens

Recently, Chinese large-model developers such as DeepSeek, Kimi, Baichuan Intelligence, Doubao, and Jieti Xingchen have each released new models. On the last day of the year, Alibaba Qwen couldn’t hold back any longer and open-sourced its million-token-context Qwen2.5-1M models along with the corresponding inference framework support. Open-source models: the Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M models, which extend … Read more