llama.cpp Now Supports GPUs

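I recently reinstalled my desktop, so I have been rebuilding a lot of environments. One of them was llama.cpp, and when I visited the project page I was surprised to find these two new features: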

OpenBLAS support
cuBLAS and CLBlast support

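The cuBLAS and CLBlast support means inference can now be GPU-accelerated, so I followed the instructions and built a version to test.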

Once it was built I ran the 7B model, which looked noticeably faster. I then switched to the 13B model, and all 40 layers fit on the GPU of a 3060 (12GB version):

./main -m models/13B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512 -ngl 40

The log shows all 40 layers being offloaded to the GPU, using about 7.5GB of VRAM:

llama.cpp: loading model from models/13B/ggml-model-q4_0.bin
llama_model_load_internal: format = ggjt v2 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 90.75 KB
llama_model_load_internal: mem required = 9807.48 MB (+ 1608.00 MB per state)
llama_model_load_internal: [cublas] offloading 40 layers to GPU
llama_model_load_internal: [cublas] total VRAM used: 7562 MB
llama_init_from_file: kv self size = 400.00 MB
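To put a rough per-layer number on that log, here is a minimal Python sketch that pulls the two `[cublas]` lines out of the output and divides one by the other. The function name `parse_offload` is made up for illustration, and the per-layer figure is only an average: it assumes the reported VRAM is spread evenly across the offloaded layers, which the log itself does not confirm.

```python
import re

# The two relevant lines, copied from the log above.
LOG = """\
llama_model_load_internal: [cublas] offloading 40 layers to GPU
llama_model_load_internal: [cublas] total VRAM used: 7562 MB
"""


def parse_offload(log: str):
    """Return (layers_offloaded, vram_mb), or None if either line is missing.

    Hypothetical helper: the patterns match the log format shown in this post only.
    """
    layers = re.search(r"offloading (\d+) layers to GPU", log)
    vram = re.search(r"total VRAM used: (\d+) MB", log)
    if not layers or not vram:
        return None
    return int(layers.group(1)), int(vram.group(1))


layers, vram_mb = parse_offload(LOG)
# Average only; assumes VRAM is split evenly across the offloaded layers.
per_layer_mb = vram_mb / layers
print(f"{layers} layers, {vram_mb} MB total, ~{per_layer_mb:.0f} MB per layer")
# prints: 40 layers, 7562 MB total, ~189 MB per layer
```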

I also tried the 30B model, but only 28 layers fit (out of 60 in total); any more and the GPU ran out of memory.
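Picking a value for `-ngl` comes down to dividing the available VRAM by the cost of one layer. The sketch below shows that arithmetic in Python. `layers_that_fit` and its `reserve_mb` parameter are hypothetical names, and the only measured number is the 13B one from the log above (7562 MB for 40 layers). The per-layer cost of the 30B model is not given in this post, so it would have to be measured before applying the same estimate there.

```python
def layers_that_fit(vram_budget_mb, per_layer_mb, n_layers, reserve_mb=0.0):
    """Estimate how many layers can be offloaded within a VRAM budget.

    Hypothetical helper: assumes every layer costs the same amount of VRAM
    and that reserve_mb covers anything else living on the GPU.
    """
    if per_layer_mb <= 0:
        raise ValueError("per_layer_mb must be positive")
    usable_mb = max(vram_budget_mb - reserve_mb, 0.0)
    return min(n_layers, int(usable_mb // per_layer_mb))


# Average per-layer cost of the 13B q4_0 model, taken from the log above.
per_layer_13b_mb = 7562 / 40

# A 12GB card has room for all 40 layers of the 13B model.
print(layers_that_fit(12 * 1024, per_layer_13b_mb, 40))  # 40
```

The result is only a starting point: the actual limit also depends on whatever else is using the GPU, so in practice it still takes a little trial and error, as with the 30B run.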

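Still, being able to use the GPU at all is big progress. This version only cuts the run time by about half, and I wonder whether there is room for further tuning…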
