Python examples

Because the API is OpenAI-compatible, you can call the models from any programming language through existing libraries that support the OpenAI API. In many cases, mittwald’s AI hosting therefore works as a drop-in replacement.

For the following examples, first install the required libraries using a Python package manager and store the API key generated in mStudio in a .env file:

pip install python-dotenv openai langchain-openai
echo 'OPENAI_API_KEY="sk-…"' > .env

Then, you can access a model using the OpenAI package:

from openai import OpenAI
from dotenv import load_dotenv

# Load OPENAI_API_KEY from the .env file into the environment
load_dotenv()

# Initialize the client with the custom host; the key is read from OPENAI_API_KEY
client = OpenAI(
    base_url="https://llm.aihosting.mittwald.de/v1"
)

# Make a simple call
response = client.chat.completions.create(
    model="Ministral-3-14B-Instruct-2512",
    temperature=0.15,
    messages=[
        {"role": "user", "content": "Moin and hello!"}
    ]
)

print(response.choices[0].message.content)

Alternatively, you can use LangChain:

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# Load .env file
load_dotenv()

# Initialize client with custom host and key from environment
chat = ChatOpenAI(
    model="Ministral-3-14B-Instruct-2512",
    base_url="https://llm.aihosting.mittwald.de/v1",
    temperature=0.15
)

# Get response
response = chat.invoke([
    HumanMessage(content="Moin and hello!")
])

print(response.content)

Vision (image + text)

Vision models on mittwald AI Hosting accept images only as Base64-encoded data URLs. Remote image URLs (for example, https:// links) are not supported and result in a server error.
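Because a wrong URL type only surfaces as a server error, it can help to check the value locally before sending a request. The following is a minimal sketch; the helper name is_base64_image_data_url is hypothetical, and the check only looks at the general shape of a data URL, not at whether the payload is a valid image:

```python
import base64
import binascii


def is_base64_image_data_url(url: str) -> bool:
    """Return True if url looks like data:image/<type>;base64,<payload>."""
    prefix, sep, payload = url.partition(",")
    if not sep or not payload:
        return False
    if not (prefix.startswith("data:image/") and prefix.endswith(";base64")):
        return False
    try:
        # validate=True rejects characters outside the Base64 alphabet
        base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

A plain https:// link fails this check, so it can be rejected before it reaches the API.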

Encoding and resizing images

Install Pillow for image processing:

pip install Pillow

Always resize before encoding. Sending large images significantly increases time to first token (TTFT). Keeping the longest edge at 1024 px or below is a safe default that preserves quality for most tasks:

import base64
import io
from PIL import Image

def encode_image(path: str, max_px: int = 1024) -> str:
    """Resize to max_px on the longest edge and return a Base64 data URL."""
    with Image.open(path) as img:
        w, h = img.size
        scale = min(1.0, max_px / max(w, h))
        if scale < 1.0:
            # Keep each edge at least 1 px, even for extreme aspect ratios
            new_size = (max(1, int(w * scale)), max(1, int(h * scale)))
            img = img.resize(new_size, Image.LANCZOS)
        # JPEG has no alpha channel or palette mode, so convert if needed
        if img.mode != "RGB":
            img = img.convert("RGB")
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=85)
    b64 = base64.b64encode(buf.getvalue()).decode()
    return f"data:image/jpeg;base64,{b64}"
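To make the resize arithmetic concrete, here is a small sketch that computes only the target dimensions, without Pillow. The function name target_size is hypothetical; it follows the same scale formula as the helper above:

```python
def target_size(w: int, h: int, max_px: int = 1024) -> tuple[int, int]:
    """Return the dimensions after fitting the longest edge into max_px."""
    # Never upscale: the scale factor is capped at 1.0
    scale = min(1.0, max_px / max(w, h))
    # int() truncates, so a result can land one pixel below the exact value
    return max(1, int(w * scale)), max(1, int(h * scale))


print(target_size(4096, 3072))  # landscape image, scale 0.25
print(target_size(800, 600))    # already small enough, left unchanged
```

The aspect ratio is preserved because both edges are multiplied by the same factor.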

Making a vision request

from openai import OpenAI

client = OpenAI(base_url="https://llm.aihosting.mittwald.de/v1")

resp = client.chat.completions.create(
    model="Ministral-3-14B-Instruct-2512",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in detail."},
            {"type": "image_url", "image_url": {"url": encode_image("photo.jpg")}},
        ]
    }],
    temperature=0.1,
    max_tokens=512,
)
print(resp.choices[0].message.content)

Choosing a vision model

| Model | Max images | Strengths |
| --- | --- | --- |
| Qwen3.5-122B-A10B-FP8 | 20+ | Best accuracy, OCR, complex scenes |
| Ministral-3-14B-Instruct-2512 | 4 | Balanced speed and quality |
| Qwen3.6-35B-A3B-FP8 | 5+ | Fast on warm requests |
| Devstral-Small-2-24B-Instruct-2512 | 4 | Code-heavy workflows with images |
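One way to work with the limits in the table is to split a larger set of images into several requests. The sketch below is only an illustration: the names MAX_IMAGES and batch_images are hypothetical, and treating "20+" and "5+" as 20 and 5 is a conservative assumption, not a documented hard limit:

```python
# Per-request image limits taken from the table above.
# "20+" and "5+" are treated as 20 and 5 (assumption: a conservative floor).
MAX_IMAGES = {
    "Qwen3.5-122B-A10B-FP8": 20,
    "Ministral-3-14B-Instruct-2512": 4,
    "Qwen3.6-35B-A3B-FP8": 5,
    "Devstral-Small-2-24B-Instruct-2512": 4,
}


def batch_images(model: str, paths: list[str]) -> list[list[str]]:
    """Split paths into groups that stay within the model's image limit."""
    limit = MAX_IMAGES[model]
    return [paths[i:i + limit] for i in range(0, len(paths), limit)]
```

Each returned group can then be sent as the image parts of one request.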

For Qwen models, disable thinking mode in vision requests to avoid unnecessary overhead:

resp = client.chat.completions.create(
    model="Qwen3.5-122B-A10B-FP8",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Extract all text from this document."},
            {"type": "image_url", "image_url": {"url": encode_image("document.jpg")}},
        ]
    }],
    temperature=0.7,
    max_tokens=1024,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(resp.choices[0].message.content)
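If the same code path serves several models, the thinking switch can be applied only where it is relevant. This is a sketch under assumptions: the helper name vision_extra_body is hypothetical, and detecting Qwen models by name prefix is a simplification based on the model names listed above:

```python
from typing import Optional


def vision_extra_body(model: str) -> Optional[dict]:
    """Return the extra_body that disables thinking for Qwen models, else None."""
    if model.startswith("Qwen"):
        return {"chat_template_kwargs": {"enable_thinking": False}}
    # Other models get no extra request body
    return None
```

The result can be passed as extra_body=vision_extra_body(model) in client.chat.completions.create, where None is the parameter's default.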

Tool-calling (function calling)

from openai import OpenAI
client = OpenAI(base_url="https://llm.aihosting.mittwald.de/v1")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"]
            }
        }
    }
]

resp = client.chat.completions.create(
    model="Devstral-Small-2-24B-Instruct-2512",
    messages=[{"role": "user", "content": "What is the weather in Berlin?"}],
    tools=tools,
    tool_choice="auto"
)

# Check if the model called a tool
if resp.choices[0].message.tool_calls:
    call = resp.choices[0].message.tool_calls[0]
    print(f"Function: {call.function.name}")
    print(f"Arguments: {call.function.arguments}")
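The example above stops once the model has requested a tool. A possible next step is to run the function locally and hand the result back as a message with the role "tool". The following sketch shows only the local part. The get_weather implementation, TOOL_REGISTRY, and run_tool_call are hypothetical names, and the message shape follows the usual OpenAI chat format, assuming the endpoint accepts it unchanged:

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical stand-in; a real version would query a weather service."""
    return f"No live data available for {city}"


# Maps tool names from the request to local Python callables
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(name: str, arguments: str, tool_call_id: str) -> dict:
    """Execute one tool call and build the 'tool' message for the next request."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        result = f"Unknown tool: {name}"
    else:
        try:
            # The model sends its arguments as a JSON string
            result = func(**json.loads(arguments))
        except (json.JSONDecodeError, TypeError) as exc:
            result = f"Invalid arguments: {exc}"
    return {"role": "tool", "tool_call_id": tool_call_id, "content": str(result)}
```

With a response like the one above, this would be called as run_tool_call(call.function.name, call.function.arguments, call.id). The assistant message and the returned tool message are then appended to messages before the next request.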

Streaming responses

from openai import OpenAI
client = OpenAI(base_url="https://llm.aihosting.mittwald.de/v1")

stream = client.chat.completions.create(
model="Devstral-Small-2-24B-Instruct-2512",
messages=[{"role": "user", "content": "Write a short poem about coding"}],
stream=True
)

for chunk in stream:
    # Some chunks carry no choices or an empty delta; skip those
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
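If the full text is needed after streaming, for logging or further processing, the chunks can be collected instead of only printed. The following sketch uses a hypothetical helper collect_stream. The fake chunks built with SimpleNamespace only imitate the attributes used here, so the example runs offline:

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(stream: Iterable) -> str:
    """Join the text deltas of a chat completion stream into one string."""
    parts = []
    for chunk in stream:
        # Skip chunks without choices or without text content
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object with the same attribute path as a real chunk."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


# Offline demo: two text chunks, one empty delta, one chunk without choices
fake_stream = [
    fake_chunk("Hello, "),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk("world"),
]
print(collect_stream(fake_stream))
```

With a real request, the stream object returned by client.chat.completions.create(..., stream=True) would be passed in place of fake_stream.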