VIVI supports a range of large language models, each suited to specific use cases. When selecting a model for your agent, consider the complexity of the tasks it needs to handle, the expected response speed, and cost. Models are selected per agent under the Model section of the agent configuration screen.

Reasoning Tiers

Many models are available in reasoning variants — High, Medium, Low, and in some cases Minimal. These tiers control how much thinking the model does before responding.
| Tier | Behavior |
| --- | --- |
| High | Deepest reasoning; slower but most accurate. Best for complex, multi-step problems. |
| Medium | Balanced reasoning and speed. Good for most complex tasks. |
| Low | Lightweight reasoning. Suitable for everyday problems where speed matters. |
| Minimal | Fastest and most cost-effective. Best for simple, high-volume agents. |
| (none) | No reasoning overhead. Fastest response time for straightforward tasks. |
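
The tiers can be treated as an ordered scale, which makes it easy to move an agent one step up or down while testing. The following is a minimal sketch, not part of VIVI: the `Tier` enum, `label`, and `step` are hypothetical helpers, and placing "(none)" below Minimal in the ordering is an assumption based on the descriptions in the table.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Reasoning tiers ordered by how much thinking they do (assumed ordering)."""
    NONE = 0
    MINIMAL = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4


def label(base: str, tier: Tier) -> str:
    """Format a name the way the model table writes it, e.g. 'gpt-5.1 (High)'."""
    if tier is Tier.NONE:
        return base
    return f"{base} ({tier.name.capitalize()})"


def step(tier: Tier, available: list[Tier], up: bool) -> Tier:
    """Return the next available tier in the given direction.

    Stays on the current tier when it is already at the edge of the list.
    """
    ordered = sorted(available)
    i = ordered.index(tier)
    j = min(i + 1, len(ordered) - 1) if up else max(i - 1, 0)
    return ordered[j]


# Tiers the table lists for gpt-5.1: High, Medium, Low, and the plain model.
GPT_5_1_TIERS = [Tier.NONE, Tier.LOW, Tier.MEDIUM, Tier.HIGH]

print(label("gpt-5.1", step(Tier.MEDIUM, GPT_5_1_TIERS, up=True)))
```

Passing the list of tiers that a model actually offers keeps the helper from suggesting a variant that does not exist, such as a Minimal tier for a family that does not list one.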

Available Models

| Model | Input (tokens) | Output (tokens) | Description |
| --- | --- | --- | --- |
| gpt-5.1 (High) | 272k | 128k | Thorough reasoning for hard problems; slower but more accurate |
| gpt-5.1 (Medium) | 272k | 128k | Balanced speed and reasoning for complex tasks |
| gpt-5.1 (Low) | 272k | 128k | Lightweight reasoning for everyday problems |
| gpt-5.1 | 272k | 128k | Fast model for common tasks; no reasoning overhead |
| gpt-5-mini (High) | 136k | 64k | Strong reasoning for complex tasks at lower cost |
| gpt-5-mini (Medium) | 136k | 64k | Balanced reasoning for everyday complex problems |
| gpt-5-mini (Low) | 136k | 64k | Light reasoning for simple to moderate tasks |
| gpt-5-mini (Minimal) | 136k | 64k | Fast, cost-effective reasoning for simple agents |
| gpt-5.2 (High) | 272k | 128k | Thorough reasoning with higher precision for demanding problems |
| gpt-5.2 (Medium) | 272k | 128k | Balanced speed and reasoning with improved accuracy |
| gpt-5.2 (Low) | 272k | 128k | Lightweight reasoning with better results for everyday problems |
| gpt-5.2 | 272k | 128k | Smarter and more accurate for common tasks; no reasoning overhead |
| gpt-5.3-codex (High) | 272k | 128k | Maximum code reasoning for the most complex agentic coding workflows |
| gpt-5.3-codex (Medium) | 272k | 128k | Balanced code reasoning for complex development tasks |
| gpt-5.3-codex (Low) | 272k | 128k | Light code reasoning for routine development and automation |
| gpt-5.3-codex | 272k | 128k | Code-optimized model for development and automation tasks; SWE-bench SOTA |
| gpt-5.4 (High) | 872k | 128k | Deepest reasoning available for the most challenging problems |
| gpt-5.4 (Medium) | 872k | 128k | Balanced speed and top-tier reasoning for complex tasks |
| gpt-5.4 (Low) | 872k | 128k | Lightweight reasoning with best-in-class results for everyday problems |
| gpt-5.4 | 872k | 128k | Most intelligent model for common tasks; no reasoning overhead |
| gpt-5.4-mini (High) | 272k | 128k | Strong reasoning at mini pricing for demanding tasks |
| gpt-5.4-mini (Medium) | 272k | 128k | Balanced reasoning at mini pricing for complex everyday work |
| gpt-5.4-mini (Low) | 272k | 128k | Lightweight reasoning at mini pricing for routine problems |
| gpt-5.4-mini | 272k | 128k | Cost-efficient model balancing intelligence and speed; no reasoning overhead |
| gpt-5.4-nano (High) | 272k | 128k | Strong reasoning at nano pricing for latency-sensitive tasks |
| gpt-5.4-nano (Medium) | 272k | 128k | Balanced reasoning at nano pricing for high-volume workloads |
| gpt-5.4-nano (Low) | 272k | 128k | Lightweight reasoning at nano pricing for simple, high-throughput tasks |
| gpt-5.4-nano | 272k | 128k | Smallest and fastest model in the 5.4 family; no reasoning overhead |
| gpt-realtime | 28,672 | 8,191 | Instant, short-context, voice-first realtime agent. 10 voice options available. |
| gpt-realtime-1.5 | 32,000 | 4,096 | Enhanced voice model with improved reasoning, transcription accuracy, and tool calling. 10 voice options available. |
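
The input and output limits are worth checking before an agent is assigned long documents or long replies. The sketch below copies the limits from the table into a lookup and filters the models that can hold a given workload. It is illustrative only: `ModelLimits`, `fits`, and `models_that_fit` are hypothetical names, "k" is assumed to mean 1,000 tokens, and the two limits are assumed to apply independently. Token counts must come from whatever tokenizer or estimate you already use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    input_tokens: int
    output_tokens: int


# Limits copied from the table, keyed by base model name; the reasoning-tier
# variants of a family share the same limits. Assumption: "k" = 1,000 tokens.
LIMITS = {
    "gpt-5.1": ModelLimits(272_000, 128_000),
    "gpt-5-mini": ModelLimits(136_000, 64_000),
    "gpt-5.2": ModelLimits(272_000, 128_000),
    "gpt-5.3-codex": ModelLimits(272_000, 128_000),
    "gpt-5.4": ModelLimits(872_000, 128_000),
    "gpt-5.4-mini": ModelLimits(272_000, 128_000),
    "gpt-5.4-nano": ModelLimits(272_000, 128_000),
    "gpt-realtime": ModelLimits(28_672, 8_191),
    "gpt-realtime-1.5": ModelLimits(32_000, 4_096),
}


def fits(model: str, prompt_tokens: int, expected_output_tokens: int) -> bool:
    """True if both token counts are within the listed limits for the model."""
    limits = LIMITS[model]
    return (prompt_tokens <= limits.input_tokens
            and expected_output_tokens <= limits.output_tokens)


def models_that_fit(prompt_tokens: int, expected_output_tokens: int) -> list[str]:
    """Base model names whose limits cover the workload, in table order."""
    return [name for name in LIMITS
            if fits(name, prompt_tokens, expected_output_tokens)]


# A 300,000-token prompt exceeds every listed input limit except gpt-5.4's.
print(models_that_fit(300_000, 1_000))
```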

Best Practices

  • Use gpt-5.4 (High) or gpt-5.1 (High) for agents that handle complex reasoning or legal and technical content, where accuracy is critical.
  • Use gpt-5-mini or gpt-5.4-nano variants for high-volume, straightforward workflows where speed and cost matter more than depth.
  • Use gpt-5.3-codex variants for agents that involve code generation, automation, or developer-facing workflows.
  • Use gpt-realtime models only for voice-enabled agents — they are optimized for low-latency audio and are not suited for text-heavy or long-context tasks.
  • When in doubt, start with a Medium reasoning tier and adjust based on response quality and latency in testing.
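
The guidelines above can be read as a simple decision order. The following sketch encodes one possible reading of them; it is not a VIVI feature. The function name `suggest_model`, its flags, the priority of the checks, and the specific variants picked within each family (for example, gpt-5.4 as the default family) are all assumptions made for illustration.

```python
def suggest_model(
    *,
    voice: bool = False,
    code: bool = False,
    accuracy_critical: bool = False,
    high_volume: bool = False,
) -> str:
    """Suggest a starting model name from coarse agent traits.

    The order of the checks is an assumption, not something the guidelines state.
    """
    if voice:
        # Realtime models are reserved for voice-enabled agents.
        return "gpt-realtime"
    if code:
        # Codex variants for code generation; Medium follows the "when in doubt" rule.
        return "gpt-5.3-codex (Medium)"
    if accuracy_critical:
        # Complex reasoning or legal and technical content.
        return "gpt-5.4 (High)"
    if high_volume:
        # Straightforward workflows where speed and cost matter more than depth.
        return "gpt-5.4-nano"
    # Default: a Medium tier, to be adjusted after testing quality and latency.
    return "gpt-5.4 (Medium)"


print(suggest_model(code=True))
print(suggest_model(high_volume=True))
print(suggest_model())
```

Treat the returned name as a starting point only; the final choice should come from testing response quality and latency with the real agent.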