llm.kiwi offers a curated selection of models optimized for speed, intelligence, and cost-efficiency. By abstracting complex model names, we provide a stable interface that always leverages the best available technology.

Core Models

These are our recommended models for most users. They are automatically routed to the best underlying architecture.

Pro

The most capable model for complex reasoning and creative tasks.

Fast

Highly optimized for latency and quick interactions.

Default

The standard balance of speed and capability for general use.

Chat Models

Our chat models are designed for conversational AI, coding, and reasoning.
| ID | Provider | Tier | Description |
| --- | --- | --- | --- |
| default | llm.kiwi | Free | Auto-selects the best model for your request |
| fast | llm.kiwi | Free | Low-latency, speed-optimized |
| pro | llm.kiwi | Pro | High-capability model routing |
| gpt-4.1-nano-* | OpenAI | Pro | GPT-4.1 Nano: fast and efficient |
| gpt-5-mini | OpenAI | Pro | GPT-5 Mini: advanced reasoning |
| deepseek-v3.1 | DeepSeek | Pro | Excellent for coding |
| mistral-small-3.1-* | Mistral | Pro | Balanced performance |
| codestral-* | Mistral | Pro | Code-generation specialist |
| ministral-8b-* | Mistral | Pro | Compact and fast |
| meta-llama/* | Meta | Pro | Open-source power |
| gemini-2.5-flash-lite | Google | Free | Ultra-fast reasoning |
| gemini-search | Google | Pro | Web-grounded responses |
| glm-4.5-flash | Zhipu | Pro | Chinese/English bilingual |
| bidara | Bidara | Free | Biomimicry design assistant |
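To illustrate how the tier column above can be used programmatically, here is a small sketch of a tier lookup built directly from the table. The `tier_for` helper and the wildcard-matching behavior are assumptions for illustration, not part of the llm.kiwi API; only the IDs and tiers come from the table.

```python
from fnmatch import fnmatchcase

# Tier lookup built from the chat-model table above. Wildcard IDs
# (e.g. "gpt-4.1-nano-*") are treated as patterns matching any dated variant.
CHAT_MODEL_TIERS = {
    "default": "Free",
    "fast": "Free",
    "pro": "Pro",
    "gpt-4.1-nano-*": "Pro",
    "gpt-5-mini": "Pro",
    "deepseek-v3.1": "Pro",
    "mistral-small-3.1-*": "Pro",
    "codestral-*": "Pro",
    "ministral-8b-*": "Pro",
    "meta-llama/*": "Pro",
    "gemini-2.5-flash-lite": "Free",
    "gemini-search": "Pro",
    "glm-4.5-flash": "Pro",
    "bidara": "Free",
}

def tier_for(model_id: str):
    """Return the subscription tier for a model ID, or None if unknown."""
    for pattern, tier in CHAT_MODEL_TIERS.items():
        if fnmatchcase(model_id, pattern):
            return tier
    return None
```

A client could use this to warn Free-tier users before they request a Pro-only model, e.g. `tier_for("codestral-2501")` resolves to `"Pro"` via the `codestral-*` pattern.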

Image & Media Models

State-of-the-art models for creative and multimodal tasks.
| ID | Provider | Tier | Description |
| --- | --- | --- | --- |
| flux | Flux | Pro | High-quality image generation |
| whisper | OpenAI | Pro | Industry-standard speech-to-text |

Model Updates

We continuously evaluate and update the underlying architecture of our models. Using the static slugs (`pro`, `fast`, `default`) keeps your integration stable while automatically picking up the latest AI advancements.
> [!TIP]
> Use the `pro` model for tasks requiring high precision, and switch to `fast` for UI-critical elements where speed is paramount.
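The stable-slug pattern above can be sketched as follows. This is a minimal illustration, assuming an OpenAI-style chat request body; the payload shape is an assumption, not a confirmed llm.kiwi API detail — only the slug names come from this page.

```python
def chat_payload(prompt: str, *, latency_sensitive: bool = False) -> dict:
    """Build a chat request body using only the stable slugs, so the
    underlying architecture can change without breaking this code.

    Per the tip above: "fast" for UI-critical paths, "pro" otherwise.
    Payload shape is a hypothetical OpenAI-style format.
    """
    model = "fast" if latency_sensitive else "pro"
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
```

Because the body names `fast` or `pro` rather than a dated model ID like `gpt-4.1-nano-2025-04-14`, the same code keeps working as the routing behind those slugs is upgraded.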