# Multi-Provider AI
Talon is not tied to any single AI provider. Connect to OpenAI, Anthropic, Google, Groq, Ollama, or any service with an OpenAI-compatible API. Mix and match models across channels, switch on the fly, and run local models entirely offline.
## Supported Providers

| Provider | Models | Notes |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3, and more | Official API |
| Anthropic | Claude Opus 4, Claude Sonnet 4, Haiku 3.5 | Direct API |
| Google Gemini | Gemini 2.0 Flash, Gemini 2.5 Pro | Via Google AI API |
| Groq | Llama 3, Mixtral, Gemma | Ultra-fast inference |
| Together AI | Llama 3, Qwen, DBRX | Open model hosting |
| Ollama | Any locally hosted model | Fully private, no API key needed |
| Any OpenAI-compatible API | Varies | Set a custom base URL |
## Model Format

Models are specified using a `provider/model` format:
```
openai/gpt-4o
anthropic/claude-opus-4-5
anthropic/claude-sonnet-4-20250514
google/gemini-2.0-flash
groq/llama-3.3-70b-versatile
ollama/llama3.2
ollama/mistral
```
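The split happens at the first `/`: everything before it names the provider, everything after is the model ID, which may itself contain dots and dashes. A minimal sketch of that parsing rule (`parse_model` is an illustrative helper, not part of Talon's API):

```python
def parse_model(spec: str) -> tuple[str, str]:
    """Split a 'provider/model' spec into (provider, model).

    Only the first slash separates the two, so model IDs that
    contain extra characters or segments stay intact.
    """
    provider, sep, model = spec.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {spec!r}")
    return provider, model

print(parse_model("anthropic/claude-sonnet-4-20250514"))
# → ('anthropic', 'claude-sonnet-4-20250514')
```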
## Configuration

Add your providers and API keys to the Talon config:
```yaml
providers:
  openai:
    api_key: sk-...
  anthropic:
    api_key: sk-ant-...
  google:
    api_key: AIza...
  groq:
    api_key: gsk_...
  ollama:
    base_url: http://localhost:11434

default_model: anthropic/claude-sonnet-4-20250514
```
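The config is plain YAML, so you can sanity-check it outside Talon before deploying. A sketch using PyYAML (a third-party package; the structure mirrors the snippet above, with keys abbreviated):

```python
import yaml  # PyYAML: pip install pyyaml

config_text = """
providers:
  openai:
    api_key: sk-...
  ollama:
    base_url: http://localhost:11434
default_model: anthropic/claude-sonnet-4-20250514
"""

config = yaml.safe_load(config_text)
# Each provider entry carries an api_key, a base_url, or both.
print(config["default_model"])      # anthropic/claude-sonnet-4-20250514
print(sorted(config["providers"]))  # ['ollama', 'openai']
```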
## Setting a Default Model

The default model is used for all channels unless overridden:
```yaml
default_model: openai/gpt-4o
```

Or change it at runtime without restarting:
```
Set default model to groq/llama-3.3-70b-versatile
```
## Per-Channel Model Override

Different channels can use different models. Point a channel at a fast, cheap model for quick tasks and a more capable model for complex work:
```yaml
channels:
  quick-tasks:
    model: openai/gpt-4o-mini
  deep-work:
    model: anthropic/claude-opus-4-5
  local-only:
    model: ollama/llama3.2
```

Or switch a channel’s model on the fly:
```
Use anthropic/claude-opus-4-5 for this channel
Switch to ollama/mistral
```
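The override rule above is a straightforward lookup: use the channel's model if one is set, otherwise fall back to the global default. A sketch of that resolution logic (a hypothetical helper, not Talon's internals):

```python
DEFAULT_MODEL = "anthropic/claude-sonnet-4-20250514"

CHANNELS = {
    "quick-tasks": {"model": "openai/gpt-4o-mini"},
    "deep-work": {"model": "anthropic/claude-opus-4-5"},
    "general": {},  # no override -> falls back to the default
}

def resolve_model(channel: str) -> str:
    """Return the channel's model override, or the global default."""
    return CHANNELS.get(channel, {}).get("model") or DEFAULT_MODEL

print(resolve_model("quick-tasks"))  # openai/gpt-4o-mini
print(resolve_model("general"))      # anthropic/claude-sonnet-4-20250514
```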
## Local Models with Ollama

Run models completely locally via Ollama. No API key, no data leaving your machine, no usage costs.
```sh
# Install a model locally
ollama pull llama3.2
ollama pull mistral
ollama pull codellama
```

Then use it in Talon:
```yaml
default_model: ollama/llama3.2
```
## Custom OpenAI-Compatible APIs

Any service that implements the OpenAI chat completions API works with Talon. Set a custom base URL to point at your own deployment, a proxy, or a third-party compatible service:
```yaml
providers:
  my-custom:
    base_url: https://my-llm-proxy.internal/v1
    api_key: my-key
```

Then use it as:
```
my-custom/my-model-name
```
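At the wire level, an OpenAI-compatible provider just means `POST {base_url}/chat/completions` with a bearer token and a JSON body; the `my-custom/` prefix selects which base URL Talon uses, while the request body carries the bare model name. A stdlib-only sketch of that request shape (the URL, key, and model name are the placeholders from the snippet above, not a real deployment):

```python
import json
import urllib.request

BASE_URL = "https://my-llm-proxy.internal/v1"  # placeholder from the config above
API_KEY = "my-key"                             # placeholder

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions request for any OpenAI-compatible API."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    # Performs a real network call; only run against a live endpoint.
    with urllib.request.urlopen(build_request("my-model-name", "Hello")) as resp:
        reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
```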
## Switching Models Without Restarting

Model changes take effect immediately — no restart required. Ask the agent to switch, update your config file, or use the config management tool. The change applies to the next message in that channel.
```
Switch to openai/gpt-4o for the rest of this conversation
Use groq/llama-3.3-70b-versatile — I need a faster response
```
## Comparing Models Side by Side

Set up separate channels with different models to compare responses on the same tasks:
```yaml
channels:
  compare-openai:
    model: openai/gpt-4o
  compare-claude:
    model: anthropic/claude-sonnet-4-20250514
  compare-local:
    model: ollama/llama3.2
```

Send the same prompt to each channel and see how models differ in speed, quality, and style.