A lightweight wrapper around llama.cpp's llama-server that simplifies installation, configuration, and lifecycle management of a local LLM inference server. It supports OpenAI-compatible REST API ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results