Zero-overhead, terminal-native local-LLM launcher.
A fast TUI and CLI with init wizard for launching local LLMs. One Rust binary that's a TUI, a CLI, a daemon, and an OpenAI-compatible proxy. llama.cpp is the direct, zero-overhead backend (vs raw llama-server), behind a pluggable backend seam so other engines can plug in.
LlamaStash gives you one binary for three jobs:
- a keyboard-driven TUI
- a scriptable CLI with stable
--jsonoutput - an on-demand daemon that supervises running models
# macOS + Linux
curl -fsSL https://llamastash.dev/install.sh | sh
# Homebrew
brew install llamastash/llamastash/llamastash
# Cargo
cargo install llamastashThen run:
llamastash initThe init wizard detects your hardware, installs the right llama-server, downloads a starter GGUF, writes a tuned config, and smoke-launches it.
- hardware-aware first-run setup
- auto-discovery of HuggingFace, Ollama, and LM Studio model caches
- a fast TUI for launching, testing, and stopping models
- an OpenAI-compatible local proxy for external tools and agents
- CLI + JSON contracts for scripts and coding agents
- llamastash/llamastash - main source repo
- llamastash/homebrew-llamastash - Homebrew tap
- llamastash/llamastash.github.io - website and install-script mirror
