VIVI supports a range of large language models, each suited to specific use cases. When selecting a model for your agent, consider the complexity of the tasks it needs to handle, the expected response speed, and cost. Models are selected per agent under the Model section of the agent configuration screen.
Reasoning Tiers
Many models are available in reasoning variants: High, Medium, Low, and in some cases Minimal. These tiers control how much thinking the model does before responding.

| Tier | Behavior |
|---|---|
| High | Deepest reasoning; slower but most accurate. Best for complex, multi-step problems. |
| Medium | Balanced reasoning and speed. Good for most complex tasks. |
| Low | Lightweight reasoning. Suitable for everyday problems where speed matters. |
| Minimal | Fastest and most cost-effective of the reasoning tiers. Best for simple, high-volume agents. |
| (none) | No reasoning overhead. Fastest response time for straightforward tasks. |
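
The ordering in the table above can be captured as a small lookup, which is handy when tuning an agent up or down a tier during testing. This is only an illustrative sketch: `TIERS`, `step_tier`, and the lowercase tier strings are hypothetical names, not part of any VIVI API, and the code assumes the tiers run from least to most reasoning as the table describes.

```python
# Hypothetical helper: tiers ordered from least to most reasoning,
# following the table above. "none" stands for a variant with no tier label.
TIERS = ["none", "minimal", "low", "medium", "high"]


def step_tier(current: str, available: list[str], direction: int) -> str:
    """Return the next available tier above (+1) or below (-1) `current`.

    Stays on `current` when the model offers no tier in that direction.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    idx = TIERS.index(current)
    # Walk outward from the current tier, nearest candidate first.
    candidates = TIERS[idx + 1:] if direction == 1 else TIERS[:idx][::-1]
    for tier in candidates:
        if tier in available:
            return tier
    return current


# gpt-5-mini is listed with High, Medium, Low, and Minimal variants only.
mini_tiers = ["minimal", "low", "medium", "high"]
print(step_tier("medium", mini_tiers, -1))   # low
print(step_tier("minimal", mini_tiers, -1))  # minimal (nothing lower is offered)
```

Passing the list of tiers a model actually offers matters because not every model has every tier; some lack Minimal, and others lack an untiered variant.
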
Available Models
| Model | Input (tokens) | Output (tokens) | Description |
|---|---|---|---|
| gpt-5.1 (High) | 272k | 128k | Thorough reasoning for hard problems; slower but more accurate |
| gpt-5.1 (Medium) | 272k | 128k | Balanced speed and reasoning for complex tasks |
| gpt-5.1 (Low) | 272k | 128k | Lightweight reasoning for everyday problems |
| gpt-5.1 | 272k | 128k | Fast model for common tasks; no reasoning overhead |
| gpt-5-mini (High) | 136k | 64k | Strong reasoning for complex tasks at lower cost |
| gpt-5-mini (Medium) | 136k | 64k | Balanced reasoning for everyday complex problems |
| gpt-5-mini (Low) | 136k | 64k | Light reasoning for simple to moderate tasks |
| gpt-5-mini (Minimal) | 136k | 64k | Fast, cost-effective reasoning for simple agents |
| gpt-5.2 (High) | 272k | 128k | Thorough reasoning with higher precision for demanding problems |
| gpt-5.2 (Medium) | 272k | 128k | Balanced speed and reasoning with improved accuracy |
| gpt-5.2 (Low) | 272k | 128k | Lightweight reasoning with better results for everyday problems |
| gpt-5.2 | 272k | 128k | Smarter and more accurate for common tasks; no reasoning overhead |
| gpt-5.3-codex (High) | 272k | 128k | Maximum code reasoning for the most complex agentic coding workflows |
| gpt-5.3-codex (Medium) | 272k | 128k | Balanced code reasoning for complex development tasks |
| gpt-5.3-codex (Low) | 272k | 128k | Light code reasoning for routine development and automation |
| gpt-5.3-codex | 272k | 128k | Code-optimized model for development and automation tasks; SWE-bench SOTA |
| gpt-5.4 (High) | 872k | 128k | Deepest reasoning available for the most challenging problems |
| gpt-5.4 (Medium) | 872k | 128k | Balanced speed and top-tier reasoning for complex tasks |
| gpt-5.4 (Low) | 872k | 128k | Lightweight reasoning with best-in-class results for everyday problems |
| gpt-5.4 | 872k | 128k | Most intelligent model for common tasks; no reasoning overhead |
| gpt-5.4-mini (High) | 272k | 128k | Strong reasoning at mini pricing for demanding tasks |
| gpt-5.4-mini (Medium) | 272k | 128k | Balanced reasoning at mini pricing for complex everyday work |
| gpt-5.4-mini (Low) | 272k | 128k | Lightweight reasoning at mini pricing for routine problems |
| gpt-5.4-mini | 272k | 128k | Cost-efficient model balancing intelligence and speed; no reasoning overhead |
| gpt-5.4-nano (High) | 272k | 128k | Strong reasoning at nano pricing for latency-sensitive tasks |
| gpt-5.4-nano (Medium) | 272k | 128k | Balanced reasoning at nano pricing for high-volume workloads |
| gpt-5.4-nano (Low) | 272k | 128k | Lightweight reasoning at nano pricing for simple, high-throughput tasks |
| gpt-5.4-nano | 272k | 128k | Smallest and fastest model in the 5.4 family; no reasoning overhead |
| gpt-realtime | 28,672 | 8,191 | Instant, short-context, voice-first realtime agent. 10 voice options available. |
| gpt-realtime-1.5 | 32,000 | 4,096 | Enhanced voice model with improved reasoning, transcription accuracy, and tool calling. 10 voice options available. |
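
The input and output limits in the table can be used to check whether a request is likely to fit a given model before assigning it to an agent. The sketch below is a rough illustration, not VIVI functionality: `LIMITS`, `parse_label`, and `fits` are hypothetical names, only a few rows are copied in, and it assumes "272k" means 272,000 tokens (the table does not say whether k is 1,000 or 1,024).

```python
import re

# Limits copied from a few rows of the table above as (input, output) tokens.
# Assumption: "k" means 1,000 tokens.
LIMITS = {
    "gpt-5.1": (272_000, 128_000),
    "gpt-5-mini": (136_000, 64_000),
    "gpt-5.4": (872_000, 128_000),
    "gpt-realtime": (28_672, 8_191),
    "gpt-realtime-1.5": (32_000, 4_096),
}


def parse_label(label: str) -> tuple[str, str]:
    """Split a table label such as 'gpt-5.1 (High)' into (model, tier)."""
    match = re.fullmatch(r"(\S+)(?:\s+\((\w+)\))?", label.strip())
    if match is None:
        raise ValueError(f"unrecognized model label: {label!r}")
    # Labels without a parenthesized tier are reported as "none".
    return match.group(1), (match.group(2) or "none").lower()


def fits(label: str, input_tokens: int, output_tokens: int) -> bool:
    """Check token counts against the table limits for the labelled model."""
    model, _tier = parse_label(label)  # every tier of a model shares its limits
    max_in, max_out = LIMITS[model]
    return input_tokens <= max_in and output_tokens <= max_out


print(parse_label("gpt-5.1 (High)"))             # ('gpt-5.1', 'high')
print(fits("gpt-5.4 (High)", 500_000, 10_000))   # True
print(fits("gpt-5.1 (High)", 500_000, 10_000))   # False
print(fits("gpt-realtime", 30_000, 1_000))       # False
```

The tier is parsed but ignored in the check because, in the table, every tier of a given model lists the same input and output limits.
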
Best Practices
- Use gpt-5.4 (High) or gpt-5.1 (High) for agents handling complex reasoning, legal, or technical content where accuracy is critical.
- Use gpt-5-mini or gpt-5.4-nano variants for high-volume, straightforward workflows where speed and cost matter more than depth.
- Use gpt-5.3-codex variants for agents that involve code generation, automation, or developer-facing workflows.
- Use gpt-realtime models only for voice-enabled agents. They are optimized for low-latency audio and are not suited to text-heavy or long-context tasks.
- When in doubt, start with a Medium reasoning tier and adjust based on response quality and latency in testing.
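
The guidance above can be summarized as a small decision helper. This is a minimal sketch under stated assumptions: `suggest_model` and its task categories are hypothetical, and where a bullet names several options (for example gpt-5-mini or gpt-5.4-nano), the single model returned is an arbitrary pick from that bullet. The fallback model is also an assumption, since the guidance specifies only a Medium tier, not a model.

```python
def suggest_model(task: str) -> str:
    """Map a coarse, hypothetical task category to a starting model label."""
    suggestions = {
        # Complex reasoning, legal, or technical content where accuracy is critical.
        "complex_reasoning": "gpt-5.4 (High)",
        # High-volume, straightforward workflows where speed and cost matter most.
        "high_volume": "gpt-5.4-nano",
        # Code generation, automation, or developer-facing workflows.
        "coding": "gpt-5.3-codex (Medium)",
        # Voice-enabled agents only.
        "voice": "gpt-realtime",
    }
    # When in doubt, start at a Medium tier; the model chosen here is an assumption.
    return suggestions.get(task, "gpt-5.4 (Medium)")


print(suggest_model("coding"))   # gpt-5.3-codex (Medium)
print(suggest_model("unknown"))  # gpt-5.4 (Medium)
```

Treat the returned label as a starting point only, and adjust the tier after testing response quality and latency.
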

