Comprehensive Collection of Common PyTorch Code Snippets

Author: Jack Stark@Zhihu (authorized)
Source: https://zhuanlan.zhihu.com/p/104019160
Editor: Jishi Platform

Jishi Guide: This article is a collection of common PyTorch code snippets covering five areas: basic configuration, tensor processing, model definition and operation, data processing, and model training and testing. It also provides several noteworthy tips, making the content very comprehensive. …

LlamaFactory Model Export Quantization

1. Each large-model framework has specific format requirements for its fine-tuning data. For the formats LlamaFactory supports, refer to its documentation: https://llamafactory.readthedocs.io/zh-cn/latest/getting_started/data_preparation.html
2. Convert the Ruozhiba data into the LlamaFactory data format.

    import json

    # Conversion function
    def convert_format(original_data):
        converted_data = []
        for item in original_data:
            converted_item = {
                "instruction": item["query"],
                "input": "",
                …
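
The conversion function in the excerpt is cut off before it finishes building each record. As a minimal sketch only, and not the author's code, the following assumes the target is LlamaFactory's Alpaca-style layout (`instruction`, `input`, `output`) and that each source record stores its answer under a `response` key. The `response` key and the `answer_key` parameter are hypothetical; only `query` appears in the excerpt.

```python
import json


def convert_format(original_data, answer_key="response"):
    """Map source records to Alpaca-style records.

    Assumption: each item has a "query" field (as in the excerpt) and an
    answer stored under `answer_key` (hypothetical; the excerpt is truncated
    before the answer field is shown).
    """
    converted_data = []
    for item in original_data:
        converted_data.append(
            {
                "instruction": item["query"],
                "input": "",  # left empty, as in the excerpt
                "output": item[answer_key],
            }
        )
    return converted_data


if __name__ == "__main__":
    # Tiny made-up sample, used only to show the shape of the result.
    sample = [{"query": "example question", "response": "example answer"}]
    print(json.dumps(convert_format(sample), ensure_ascii=False, indent=2))
```

If the source data stores the answer under a different field, pass that field name as `answer_key`.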

Impact of Sora on AI Infrastructure

Unicorn Think Tank: The Leading Industry Research Think Tank. Recruitment for the Unicorn Investment Research Intelligence Group.

Over 9 years of development, Unicorn Think Tank has accumulated a wealth of resources and formed a community of shared interests with top investment research resources. After nearly a year of product testing and small-scale member services for almost two …

Detailed Derivation of XGBoost Explained

– What is the basis for tree node splitting in XGBoost?
– How is the weight of tree nodes calculated?
– What improvements has XGBoost made to prevent overfitting?

Those reading this article are likely familiar with XGBoost. Indeed, XGBoost is not only a powerful tool in major data science competitions but is also widely …

Beyond Mistral: The Rise of Mianbi

Author: Zhou Yixiao
Email: [email protected]

More than seventy days after launching MiniCPM-2B, Mianbi has released four distinct models and officially announced new financing worth hundreds of millions. The round was led by Chuanghua Venture Capital and Huawei Hubble, with the Beijing Artificial Intelligence Industry Investment Fund and others participating. Zhihu continues to …

Scikit-learn: The Swiss Army Knife of Machine Learning

Honestly, every time I write machine learning code with Scikit-learn, I feel an inexplicable thrill. The library is like a helpful assistant: it wraps complex machine learning algorithms in a simple, easy-to-use interface, letting us focus on solving real problems rather than getting bogged down in the details of algorithm implementation.

Installation and Import …

Amazon SageMaker: Build, Train, and Deploy ML Models Easily

Beginner: Jing, I recently heard that many companies are using Amazon SageMaker for machine learning projects. What exactly is this tool? Is it easy for beginners like us to get started?

Jing: To address this question, let me explain in detail. Amazon SageMaker is a one-stop machine learning platform launched by Amazon. It’s like an …

In-Depth Analysis of LLAMA3 Paper

Introduction

Recently, while reviewing papers I had previously studied in depth, I found that some of my notes were still very valuable. I made some minor adjustments and am publishing them here. Llama 3 is a paper from a few months ago, but each reading still brings new insights. This article discusses key points, …

Comprehensive Collection of Common PyTorch Code Snippets

Author: Jack Stark@Zhihu
Source: https://zhuanlan.zhihu.com/p/104019160

Extreme Market Guide: This article is a collection of common code snippets in PyTorch covering five areas: basic configuration, tensor processing, model definition and operation, data processing, and model training and testing, along with several notable tips. The content is very comprehensive. Join the …

Cost-Saving Techniques in DeepSeek: Unveiling the Secrets

Tencent Technology “AI Future Guide”
Special Contributor: Hao Boyang
Editor: Zheng Kejun

There are no GPU-poor, only those who have not squeezed hard enough. The launch of DeepSeek-V3 illustrates this statement with a set of astonishing figures. While models like o1, Claude, Gemini, and Llama 3 struggle with billions in training costs, DeepSeek-V3 achieved performance on par with them …