Deploy a Personal Code Assistant Using llama.cpp in 3 Minutes
Today I will demonstrate llama.cpp, one of the most popular on-device LLM deployment engines. The demonstration runs on a MacBook Pro (M3 Pro).

Project address: https://github.com/ggerganov/llama.cpp
Build instructions: https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md

The model used for testing is Qwen2.5-Coder-3B-Instruct, which supports a 32K context window. Model download address: https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct. I used the Q8_0 quantized version for testing.
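For reference, here is a minimal build-and-download sketch following the linked build instructions. It assumes macOS on Apple Silicon (the Metal backend is enabled by default) and that the Q8_0 GGUF weights come from Qwen's companion GGUF repository; the repository name and filename below are assumptions, so verify them on Hugging Face before running.

```bash
# Clone and build llama.cpp (Metal is enabled by default on Apple Silicon)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

# Download the Q8_0 GGUF weights; repo and filename are assumptions --
# check the Hugging Face listing for the exact name
huggingface-cli download Qwen/Qwen2.5-Coder-3B-Instruct-GGUF \
  qwen2.5-coder-3b-instruct-q8_0.gguf --local-dir ./models
```

With the binary and weights in place, one way to serve the model is llama-server, which exposes an OpenAI-compatible HTTP API. The context size (-c 32768) matches the model's 32K window; the port and paths are arbitrary choices for this sketch.

```bash
# Start the server; -ngl 99 offloads all layers to the GPU (Metal)
./build/bin/llama-server \
  -m ./models/qwen2.5-coder-3b-instruct-q8_0.gguf \
  -c 32768 -ngl 99 --port 8080

# In another terminal: ask a coding question via the
# OpenAI-compatible chat endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Write a C function that reverses a string in place."}
        ]
      }'
```

Any editor plugin or client that speaks the OpenAI API can then be pointed at http://localhost:8080/v1 to use the model as a local code assistant.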
Personal opinion, for reference only. January 17, 2025.