llama-cli inference batching
Tags: LLM inference batching
As of llama.cpp 79e0b68c178656bb0632cb8602d2940b755077f8 there is a --parallel option, but it is not clear what it does.
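For reference, a minimal invocation sketch, assuming --parallel takes a number of parallel sequences as it does for llama-server; the model path and prompt are placeholders, and whether this actually batches decoding in llama-cli is exactly the open question above:

  # Hypothetical, unverified invocation: ask for 4 parallel sequences.
  ./llama-cli \
    -m models/model.gguf \
    -p "What is inference batching?" \
    --parallel 4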
Bibliography:
github.com/ggml-org/llama.cpp/discussions/3222
www.reddit.com/r/LocalLLaMA/comments/12aj0ze/what_is_batchsize_in_llamacpp_also_known_as_n/
www.reddit.com/r/LocalLLaMA/comments/12gtanv/batch_queries/
related for server: www.reddit.com/r/LocalLLaMA/comments/1f19t2l/parallel_requests_using_llamaserver