On-device AI
Shrike’s AI features run entirely on-device. Inference uses an embedded
llama.cpp (via the llama-cpp-2 crate), GPU-accelerated through Metal on
Apple Silicon. No prompt and no mail ever leaves your machine: there’s no
API key to add and no server to call.
What it does today
The first AI feature is Thread → to-do (t): the model reads
a conversation and proposes a one-line action plus a due date, which pre-fills the
to-do editor for one-key confirmation. The AI boundary sits behind a clean
AiEngine trait, mirroring the mail provider, so more features can plug into the
same on-device engine over time.
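As an illustration of what such a boundary could look like, here is a minimal sketch. Shrike's real `AiEngine` trait is not shown on this page, so the method name `thread_to_todo`, the `TodoProposal` and `AiError` types, the `StubEngine`, and the `prefill_editor` helper are all assumptions made for the example.

```rust
use std::fmt;

/// Hypothetical shape of a to-do proposal; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct TodoProposal {
    /// One-line action extracted from the conversation.
    pub action: String,
    /// Due date as an ISO-8601 string, if the engine proposed one.
    pub due: Option<String>,
}

/// Hypothetical error type for the AI boundary.
#[derive(Debug)]
pub enum AiError {
    ModelNotLoaded,
    Inference(String),
}

impl fmt::Display for AiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AiError::ModelNotLoaded => write!(f, "model not loaded"),
            AiError::Inference(msg) => write!(f, "inference failed: {msg}"),
        }
    }
}

impl std::error::Error for AiError {}

/// Assumed trait shape: callers depend on this, not on a concrete model.
pub trait AiEngine: Send + Sync {
    fn thread_to_todo(&self, thread_text: &str) -> Result<TodoProposal, AiError>;
}

/// Stand-in engine with no model behind it: it proposes the first
/// non-empty line of the thread as the action and never sets a due date.
pub struct StubEngine;

impl AiEngine for StubEngine {
    fn thread_to_todo(&self, thread_text: &str) -> Result<TodoProposal, AiError> {
        let action = thread_text
            .lines()
            .map(str::trim)
            .find(|line| !line.is_empty())
            .ok_or_else(|| AiError::Inference("empty thread".into()))?;
        Ok(TodoProposal { action: action.to_string(), due: None })
    }
}

/// UI-side helper: returns a proposal to pre-fill the editor, or `None`
/// if the engine could not produce one.
pub fn prefill_editor(engine: &dyn AiEngine, thread_text: &str) -> Option<TodoProposal> {
    engine.thread_to_todo(thread_text).ok()
}
```

Because the caller only sees `&dyn AiEngine`, a stub like this can stand in for the on-device engine in tests, and further features would presumably arrive as additional trait methods.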
Choosing a model
Shrike ships a built-in model picker rather than one hardcoded model. It offers a small tiered registry of GGUF models and uses a RAM-fit heuristic to recommend one your Mac can run comfortably.
- The recommended default is a compact instruct model that’s quick on Apple Silicon and fits machines with modest memory.
- Pick a model and Shrike streams the download with a progress indicator, caches it under the app’s data directory, and reuses it thereafter.
- Your choice is saved in preferences, and you can switch models later — the engine is hot-swappable by model id.
Manage all of this from the Settings window (⌘,) under AI Models.
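The exact RAM-fit rule is not documented here, so the following is only one plausible sketch of such a heuristic. The model ids, file sizes, the 20% overhead allowance, and the "use at most half of RAM" budget are all invented for illustration and are not Shrike's actual registry or thresholds.

```rust
/// Hypothetical registry entry; ids and sizes below are made up.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelEntry {
    pub id: &'static str,
    /// Size of the GGUF file on disk, in MiB.
    pub file_size_mb: u64,
}

/// Hypothetical tiers, ordered from smallest to largest.
pub const REGISTRY: &[ModelEntry] = &[
    ModelEntry { id: "small-instruct", file_size_mb: 1_000 },
    ModelEntry { id: "medium-instruct", file_size_mb: 2_500 },
    ModelEntry { id: "large-instruct", file_size_mb: 5_000 },
];

/// Recommends the largest model whose estimated footprint (file size plus
/// an assumed 20% runtime overhead) fits in half of total RAM, leaving the
/// other half for the OS and other apps. Returns `None` if nothing fits.
pub fn recommend(total_ram_mb: u64, registry: &[ModelEntry]) -> Option<&ModelEntry> {
    let budget_mb = total_ram_mb / 2;
    registry
        .iter()
        .filter(|m| m.file_size_mb + m.file_size_mb / 5 <= budget_mb)
        .max_by_key(|m| m.file_size_mb)
}
```

With these made-up numbers, an 8 GiB machine (`recommend(8192, REGISTRY)`) gets the medium tier and a 16 GiB machine gets the large one. A real picker might instead bias toward the compact default and treat larger tiers as opt-in.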
Lazy and non-blocking
The model loads only when an AI feature is first invoked, on a background thread. Launch stays instant, and if you never use an AI feature you never pay the cost — no model is downloaded until you ask for one.
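The lazy, off-main-thread pattern described above can be sketched with the standard library alone. This is not Shrike's implementation: `LazyEngine`, `Model`, and `run` are hypothetical names, and the "load" here is a trivial stand-in for reading a GGUF file and initialising llama.cpp.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc, OnceLock};
use std::thread;

/// Stand-in for a loaded model; a real one would hold llama.cpp state.
pub struct Model {
    pub id: String,
}

/// Hypothetical engine handle that is cheap to create at launch.
pub struct LazyEngine {
    model_id: String,
    slot: Arc<OnceLock<Model>>,
    loads: Arc<AtomicUsize>,
}

impl LazyEngine {
    /// Does no I/O and loads nothing, so app launch is unaffected.
    pub fn new(model_id: &str) -> Self {
        Self {
            model_id: model_id.to_string(),
            slot: Arc::new(OnceLock::new()),
            loads: Arc::new(AtomicUsize::new(0)),
        }
    }

    pub fn is_loaded(&self) -> bool {
        self.slot.get().is_some()
    }

    /// How many times the (expensive) load actually ran.
    pub fn load_count(&self) -> usize {
        self.loads.load(Ordering::SeqCst)
    }

    /// Runs `f` against the model on a background thread, loading the model
    /// there on first use. Returns immediately with a receiver, so the
    /// calling (UI) thread is never blocked by the load itself.
    pub fn run<T, F>(&self, f: F) -> mpsc::Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce(&Model) -> T + Send + 'static,
    {
        let (tx, rx) = mpsc::channel();
        let slot = Arc::clone(&self.slot);
        let loads = Arc::clone(&self.loads);
        let id = self.model_id.clone();
        thread::spawn(move || {
            // `get_or_init` runs the closure at most once across all threads.
            let model = slot.get_or_init(|| {
                loads.fetch_add(1, Ordering::SeqCst);
                Model { id } // stand-in for the expensive load
            });
            // The receiver may have been dropped; ignore the send error.
            let _ = tx.send(f(model));
        });
        rx
    }
}
```

A caller would typically poll the receiver with `try_recv` from its event loop, or block with `recv` in a test. A second call to `run` reuses the already-loaded model rather than loading again.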
Privacy
Because everything runs locally, on-device AI is a privacy feature, not a privacy cost. The contents of your inbox are never sent to a model provider or to Shrike’s authors. See Privacy & security.