Fraud Text Classification Detection: LLama.cpp + CPU Inference

Fraud Text Classification Detection: LLama.cpp + CPU Inference

1. Introduction Previously, after training our personalized model with Lora, the first issue we faced was: how to run the model on a regular machine? After all, the model was fine-tuned on dedicated GPUs with dozens of gigabytes of memory, and switching to a regular computer with only a CPU could lead to the awkward … Read more

Deploy Personal Code Assistant Using LLama.cpp in 3 Minutes

Deploy Personal Code Assistant Using LLama.cpp in 3 Minutes

Deploy Personal Code Assistant Using LLama.cpp in 3 Minutes Today, I will demonstrate the use of the most popular on-device LLM deployment engine, llama.cpp. The demonstration will be conducted on a MacBook Pro (M3 Pro). Project address: https://github.com/ggerganov/llama.cpp. Compilation method: https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md. The model used for testing is the Qwen2.5-Coder-3B-Instruct. Model download address: https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct. This model … Read more