Models - VIVI

VIVI supports a range of large language models, each suited to specific use cases. When selecting a model for your agent, consider the complexity of the tasks it needs to handle, the expected response speed, and cost. Models are selected per agent under the Model section of the agent configuration screen.

Reasoning Tiers

Many models are available in reasoning variants: High, Medium, Low, and in some cases Minimal. These tiers control how much reasoning the model does before responding. A model listed without a tier runs with no reasoning overhead.

Tier	Behavior
High	Deepest reasoning; slower but most accurate. Best for complex, multi-step problems.
Medium	Balanced reasoning and speed. Good for most complex tasks.
Low	Lightweight reasoning. Suitable for everyday problems where speed matters.
Minimal	Fastest and most cost-effective reasoning tier. Best for simple, high-volume agents.
(none)	No reasoning overhead. Fastest response time for straightforward tasks.

Available Models

Model	Input (tokens)	Output (tokens)	Description
gpt-5.1 (High)	272k	128k	Thorough reasoning for hard problems; slower but more accurate
gpt-5.1 (Medium)	272k	128k	Balanced speed and reasoning for complex tasks
gpt-5.1 (Low)	272k	128k	Lightweight reasoning for everyday problems
gpt-5.1	272k	128k	Fast model for common tasks; no reasoning overhead
gpt-5-mini (High)	136k	64k	Strong reasoning for complex tasks at lower cost
gpt-5-mini (Medium)	136k	64k	Balanced reasoning for everyday complex problems
gpt-5-mini (Low)	136k	64k	Light reasoning for simple to moderate tasks
gpt-5-mini (Minimal)	136k	64k	Fast, cost-effective reasoning for simple agents
gpt-5.2 (High)	272k	128k	Thorough reasoning with higher precision for demanding problems
gpt-5.2 (Medium)	272k	128k	Balanced speed and reasoning with improved accuracy
gpt-5.2 (Low)	272k	128k	Lightweight reasoning with better results for everyday problems
gpt-5.2	272k	128k	Smarter and more accurate for common tasks; no reasoning overhead
gpt-5.3-codex (High)	272k	128k	Maximum code reasoning for the most complex agentic coding workflows
gpt-5.3-codex (Medium)	272k	128k	Balanced code reasoning for complex development tasks
gpt-5.3-codex (Low)	272k	128k	Light code reasoning for routine development and automation
gpt-5.3-codex	272k	128k	Code-optimized model for development and automation tasks; SWE-bench SOTA
gpt-5.4 (High)	872k	128k	Deepest reasoning available for the most challenging problems
gpt-5.4 (Medium)	872k	128k	Balanced speed and top-tier reasoning for complex tasks
gpt-5.4 (Low)	872k	128k	Lightweight reasoning with best-in-class results for everyday problems
gpt-5.4	872k	128k	Most intelligent model for common tasks; no reasoning overhead
gpt-5.4-mini (High)	272k	128k	Strong reasoning at mini pricing for demanding tasks
gpt-5.4-mini (Medium)	272k	128k	Balanced reasoning at mini pricing for complex everyday work
gpt-5.4-mini (Low)	272k	128k	Lightweight reasoning at mini pricing for routine problems
gpt-5.4-mini	272k	128k	Cost-efficient model balancing intelligence and speed; no reasoning overhead
gpt-5.4-nano (High)	272k	128k	Strong reasoning at nano pricing for latency-sensitive tasks
gpt-5.4-nano (Medium)	272k	128k	Balanced reasoning at nano pricing for high-volume workloads
gpt-5.4-nano (Low)	272k	128k	Lightweight reasoning at nano pricing for simple, high-throughput tasks
gpt-5.4-nano	272k	128k	Smallest and fastest model in the 5.4 family; no reasoning overhead
gpt-realtime	28,672	8,191	Instant, short-context, voice-first realtime agent. 10 voice options available.
gpt-realtime-1.5	32,000	4,096	Enhanced voice model with improved reasoning, transcription accuracy, and tool calling. 10 voice options available.
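
The limits in the table can be used to check whether a request is likely to fit a model before assigning it to an agent. The following is a minimal sketch, not part of any VIVI API: `MODEL_LIMITS`, `base_model`, and `fits` are hypothetical names, the figures are copied from the table above, and "k" is assumed to mean 1,000 tokens.

```python
# Token limits copied from the Available Models table.
# Assumption: "k" means 1,000 tokens. Names here are illustrative only.
MODEL_LIMITS = {
    "gpt-5.1": (272_000, 128_000),
    "gpt-5-mini": (136_000, 64_000),
    "gpt-5.2": (272_000, 128_000),
    "gpt-5.3-codex": (272_000, 128_000),
    "gpt-5.4": (872_000, 128_000),
    "gpt-5.4-mini": (272_000, 128_000),
    "gpt-5.4-nano": (272_000, 128_000),
    "gpt-realtime": (28_672, 8_191),
    "gpt-realtime-1.5": (32_000, 4_096),
}


def base_model(display_name: str) -> str:
    """Strip a reasoning-tier suffix such as ' (High)' from a model name."""
    return display_name.split(" (", 1)[0]


def fits(display_name: str, input_tokens: int, output_tokens: int) -> bool:
    """Return True if both token counts are within the model's listed limits."""
    max_in, max_out = MODEL_LIMITS[base_model(display_name)]
    return input_tokens <= max_in and output_tokens <= max_out
```

Because every reasoning tier of a model shares the same limits in the table, the lookup keys on the base model name only.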

Best Practices

Use gpt-5.4 (High) or gpt-5.1 (High) for agents that handle complex reasoning or legal and technical content where accuracy is critical.
Use gpt-5-mini or gpt-5.4-nano variants for high-volume, straightforward workflows where speed and cost matter more than depth.
Use gpt-5.3-codex variants for agents that involve code generation, automation, or developer-facing workflows.
Use gpt-realtime models only for voice-enabled agents. They are optimized for low-latency audio and are not suited to text-heavy or long-context tasks.
When in doubt, start with a Medium reasoning tier and adjust based on response quality and latency in testing.
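
The recommendations above can be sketched as a simple decision function. This is an illustration under stated assumptions, not a VIVI feature: `suggest_model` and its parameters are hypothetical, the order in which the conditions are checked is a choice made for this example, and where the guidance names a family rather than one model (voice, coding, high volume, default) the specific model and tier returned are assumptions.

```python
def suggest_model(
    voice: bool = False,
    coding: bool = False,
    accuracy_critical: bool = False,
    high_volume: bool = False,
) -> str:
    """Map coarse agent requirements to a starting model name.

    Hypothetical helper: the priority order and the specific picks within
    each recommended family are assumptions made for illustration.
    """
    if voice:
        # Realtime models are recommended only for voice-enabled agents.
        return "gpt-realtime"
    if coding:
        # Codex variants for code generation and automation; Medium is an
        # assumed starting tier, following the "when in doubt" advice.
        return "gpt-5.3-codex (Medium)"
    if accuracy_critical:
        # High tier where accuracy matters more than latency.
        return "gpt-5.4 (High)"
    if high_volume:
        # Nano variants for high-volume, straightforward workflows;
        # the Low tier is an assumed pick within that family.
        return "gpt-5.4-nano (Low)"
    # Default: start at a Medium tier and adjust after testing.
    # The choice of gpt-5.4 as the default family is an assumption.
    return "gpt-5.4 (Medium)"


print(suggest_model(coding=True))
print(suggest_model(high_volume=True))
```

The returned name is only a starting point; response quality and latency observed in testing should drive the final choice.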
