Mistral-Medium-3.5-128B

Description

Mistral-Medium-3.5-128B is a 128-billion-parameter frontier language model from Mistral AI. It handles text generation and tool calling within a 256,000-token context window and uses EAGLE speculative decoding for fast inference.

It is suitable for:

  • Text generation within a chat completion (text to text)
  • Tool-calling for agentic workflows
  • Long-context document analysis and summarisation
  • Multilingual tasks, with strong coverage of European languages

The following limitations apply:

  • Maximum context length: 256,000 tokens
  • No audio support
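
A request that exceeds the context limit can be caught before it is sent. The following is a minimal sketch, assuming that the prompt and the requested completion share the 256,000-token window and that roughly four characters correspond to one token. Both the ratio and the helper names `estimate_tokens` and `fits_in_context` are illustrative assumptions, not part of the API; an exact count would require the model's own tokenizer.

```python
MAX_CONTEXT_TOKENS = 256_000  # limit stated in this document
CHARS_PER_TOKEN = 4           # assumption: rough average, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(prompt: str, max_tokens: int) -> bool:
    """Return True if the estimated prompt plus the reserved completion fits."""
    return estimate_tokens(prompt) + max_tokens <= MAX_CONTEXT_TOKENS
```

Because the estimate is coarse, it is safer to leave a generous margin than to fill the window to the last token.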

API usage

Chat

from openai import OpenAI

client = OpenAI(
    base_url="https://llm.aihosting.mittwald.de/v1",
    api_key="sk-your-api-key-here",
)

response = client.chat.completions.create(
    model="Mistral-Medium-3.5-128B",
    messages=[{"role": "user", "content": "Explain the difference between TCP and UDP."}],
    temperature=0.7,
    top_p=0.9,
    max_tokens=1024,
)

print(response.choices[0].message.content)

Tool calling (function calling)

from openai import OpenAI

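# Without api_key, the client falls back to the OPENAI_API_KEY environment variable.
client = OpenAI(
    base_url="https://llm.aihosting.mittwald.de/v1",
    api_key="sk-your-api-key-here",
)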

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="Mistral-Medium-3.5-128B",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
    temperature=0.2,
)

if response.choices[0].message.tool_calls:
    call = response.choices[0].message.tool_calls[0]
    print(f"Function: {call.function.name}")
    print(f"Arguments: {call.function.arguments}")
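
The example above stops once the model has requested a tool. A typical next step is to run the function locally and return its result to the model. The sketch below shows one way to do that. The `get_weather` implementation, the `TOOL_REGISTRY` mapping, and the `build_tool_message` helper are hypothetical names introduced for illustration, and the weather data is a canned placeholder rather than a real lookup.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local implementation; returns canned data for illustration."""
    return json.dumps({"city": city, "forecast": "unknown"})


# Maps tool names from the `tools` list to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def build_tool_message(call_id: str, name: str, arguments: str) -> dict:
    """Run the requested tool and wrap its result as a `tool` role message."""
    try:
        kwargs = json.loads(arguments)  # arguments arrive as a JSON string
        content = TOOL_REGISTRY[name](**kwargs)
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Report the failure to the model instead of raising.
        content = json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    return {"role": "tool", "tool_call_id": call_id, "content": content}
```

To continue the conversation, append the assistant message that contains the tool calls to `messages`, then the dictionary returned by `build_tool_message(call.id, call.function.name, call.function.arguments)`, and call `client.chat.completions.create` again with the extended list.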

Recommended parameters

General chat

Parameter      Value
temperature    0.7
top_p          1.0
max_tokens     1024–8192, depending on task

Tool calling / structured output

Parameter      Value
temperature    0.0–0.3
top_p          1.0
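
The two sets of values above can be kept in one place so that every request picks them up consistently. This is a minimal sketch; the preset names `chat` and `tools` and the helper `sampling_params` are hypothetical, and 0.2 is simply one value chosen from the 0.0–0.3 range given for tool calling.

```python
# Values copied from the tables above; the preset names are hypothetical.
PRESETS = {
    "chat": {"temperature": 0.7, "top_p": 1.0},
    "tools": {"temperature": 0.2, "top_p": 1.0},  # any value in 0.0–0.3
}


def sampling_params(task: str, max_tokens: int = 1024) -> dict:
    """Return keyword arguments for `chat.completions.create` for a task type."""
    if task not in PRESETS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(PRESETS)}")
    return {**PRESETS[task], "max_tokens": max_tokens}
```

The result can be unpacked into a request, for example `client.chat.completions.create(model=..., messages=..., **sampling_params("chat", max_tokens=2048))`.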

Terms of use and licensing

The general terms of use apply. The model is provided by Mistral AI under the Apache 2.0 License, and reuse of the generated content is not subject to any additional restrictions.