Available models
We currently offer the following models, which may change or expand over time. Each is described along with model-specific parameters.
| Model Name | Type | Modalities | Context (Tokens) | License |
|---|---|---|---|---|
| gpt-oss-120b | Chat + reasoning | Text, tool-calling | 131,072 | Apache 2.0 |
| Qwen3.5-0.8B | Chat + reasoning | Text, tool-calling | 262,144 | Apache 2.0 |
| Ministral-3-14B-Instruct-2512 | Chat + vision | Text, image, tool-calling | 262,144 | Apache 2.0 |
| Mistral-Medium-3.5-128B | Chat + vision | Text, image, tool-calling | 256,000 | Apache 2.0 |
| Qwen3.5-122B-A10B-FP8 | Chat + reasoning + vision | Text, image, tool-calling | 245,760 | Apache 2.0 |
| Qwen3.6-35B-A3B-FP8 | Chat + reasoning + vision | Text, image, tool-calling | 262,144 | Apache 2.0 |
| GLM-OCR | Document OCR | PDF, DOCX, PPTX, XLSX, HTML, SVG, image to text | 131,072 | MIT |
| Qwen3-Embedding-8B | Embedding | Text to vector | 32,768 | Apache 2.0 |
| Qwen3-VL-Reranker-2B | Reranking | Text, image to score | 32,768 | Apache 2.0 |
| whisper-large-v3-turbo | Speech-to-Text | Audio to text | N/A (audio-based) | MIT |
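
When choosing a model programmatically, it can help to mirror the table above as data and filter on modality and context size. The following is a minimal sketch, not an official client: `ModelInfo`, `CATALOG`, and `find_models` are hypothetical names introduced for illustration, and only the six chat models are included, with values copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    """One row of the model table (hypothetical helper type)."""
    name: str
    modalities: frozenset
    context_tokens: int


# Values copied from the table above; chat models only.
CATALOG = [
    ModelInfo("gpt-oss-120b", frozenset({"text", "tool-calling"}), 131_072),
    ModelInfo("Qwen3.5-0.8B", frozenset({"text", "tool-calling"}), 262_144),
    ModelInfo("Ministral-3-14B-Instruct-2512",
              frozenset({"text", "image", "tool-calling"}), 262_144),
    ModelInfo("Mistral-Medium-3.5-128B",
              frozenset({"text", "image", "tool-calling"}), 256_000),
    ModelInfo("Qwen3.5-122B-A10B-FP8",
              frozenset({"text", "image", "tool-calling"}), 245_760),
    ModelInfo("Qwen3.6-35B-A3B-FP8",
              frozenset({"text", "image", "tool-calling"}), 262_144),
]


def find_models(modality: str, min_context: int = 0) -> list:
    """Return names of models that accept `modality` and offer at least
    `min_context` tokens of context, in table order."""
    return [
        m.name
        for m in CATALOG
        if modality in m.modalities and m.context_tokens >= min_context
    ]


if __name__ == "__main__":
    # Image-capable models with at least 260,000 tokens of context.
    print(find_models("image", min_context=260_000))
```

Because the offering may change over time, a hard-coded catalog like this one would need to be kept in sync with the table.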
Picking models
- For complex text-centric workloads and advanced automations when precision and vast knowledge are required, use gpt-oss-120b.
- For high-throughput, cost-sensitive tasks that don't require vision (e.g. simple Q&A, routing, classification, and batch processing), use Qwen3.5-0.8B.
- For complex reasoning, multilingual tasks, or vision workloads where a large frontier model is required, use Mistral-Medium-3.5-128B.
- For broad, scalable, cost-conscious chat and basic multimodal (text + image) workflows, use Ministral-3-14B-Instruct-2512.
- For large-scale reasoning and vision tasks where high model capacity is required, use Qwen3.5-122B-A10B-FP8.
- For workloads that require long context windows with reasoning and vision support at lower cost, use Qwen3.6-35B-A3B-FP8.
- Special-purpose applications:
  - For extracting text from PDF, DOCX, PPTX, XLSX, HTML, and image documents (including scanned invoices, contracts, and forms), use GLM-OCR.
  - For all use cases involving search, recommendation, clustering, or knowledge graph building, use Qwen3-Embedding-8B.
  - For any audio transcription or voice-command needs, use whisper-large-v3-turbo.
  - To improve RAG retrieval precision by adding it as a second-pass reranker after vector search, use Qwen3-VL-Reranker-2B.
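
The recommendations above can be captured as a simple lookup so that application code does not hard-code model names in many places. This is only a sketch: the task labels (`"complex_text"`, `"ocr"`, and so on) and the `pick_model` function are hypothetical names chosen for illustration, not part of any API described here.

```python
# Task label -> recommended model, following the guidance in the list above.
# The labels on the left are hypothetical; the model names come from the list.
RECOMMENDED = {
    "complex_text": "gpt-oss-120b",
    "high_throughput": "Qwen3.5-0.8B",
    "frontier_multimodal": "Mistral-Medium-3.5-128B",
    "general_chat": "Ministral-3-14B-Instruct-2512",
    "large_reasoning_vision": "Qwen3.5-122B-A10B-FP8",
    "long_context_low_cost": "Qwen3.6-35B-A3B-FP8",
    "ocr": "GLM-OCR",
    "embedding": "Qwen3-Embedding-8B",
    "transcription": "whisper-large-v3-turbo",
    "rerank": "Qwen3-VL-Reranker-2B",
}


def pick_model(task: str) -> str:
    """Return the recommended model name for a task label.

    Raises ValueError for unknown labels so that typos fail loudly
    instead of silently falling back to a default model.
    """
    try:
        return RECOMMENDED[task]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None


if __name__ == "__main__":
    print(pick_model("ocr"))
    print(pick_model("rerank"))
```

Keeping the mapping in one place makes it easier to update when the list of available models changes.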
For more details and additional tips, have a look at the usage examples and guides.
Please have a look at the following pages to gather more information about a specific model:
Ministral-3-14B-Instruct-2512
Detailed information about Ministral-3-14B-Instruct-2512
Qwen3-Embedding-8B
Detailed information about Qwen3-Embedding-8B
gpt-oss-120b
Detailed information about gpt-oss-120b
Whisper-Large-V3-Turbo
Detailed information about Whisper-Large-V3-Turbo
Qwen3.5-122B-A10B-FP8
Detailed information about Qwen3.5-122B-A10B-FP8
Qwen3.6-35B-A3B-FP8
Detailed information about Qwen3.6-35B-A3B-FP8
GLM-OCR
Detailed information about GLM-OCR
Qwen3.5-0.8B
Detailed information about Qwen3.5-0.8B
Qwen3-VL-Reranker-2B
Detailed information about Qwen3-VL-Reranker-2B
Mistral-Medium-3.5-128B
Detailed information about Mistral-Medium-3.5-128B