Python examples

You can access the models from your own code using existing libraries that support the OpenAI API. In many cases, this makes mittwald’s AI hosting a drop-in replacement.

For the following examples, first install the required libraries using a Python package manager and store the API key generated in mStudio in a .env file:

pip install python-dotenv openai langchain-openai
echo 'OPENAI_API_KEY="sk-…"' > .env

Then, you can access a model using the OpenAI package:

from openai import OpenAI
from dotenv import load_dotenv

# Load .env file
load_dotenv()

# Initialize client with custom host; the key is read from OPENAI_API_KEY
client = OpenAI(
    base_url="https://llm.aihosting.mittwald.de/v1"
)

# Make a simple call
response = client.chat.completions.create(
    model="Ministral-3-14B-Instruct-2512",
    temperature=0.15,
    messages=[
        {"role": "user", "content": "Moin and hello!"}
    ]
)

print(response.choices[0].message.content)
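If the key was not loaded, the failure only shows up when the client is created or the first request is sent. As a minimal sketch, the helper below checks the `OPENAI_API_KEY` variable up front; the name `require_api_key` is hypothetical and not part of any library:

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; check that .env exists and load_dotenv() ran"
        )
    return key
```

Calling `require_api_key()` right after `load_dotenv()` turns a missing or empty `.env` entry into an explicit error message.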

Alternatively, you can use LangChain:

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# Load .env file
load_dotenv()

# Initialize client with custom host; the key is read from OPENAI_API_KEY
chat = ChatOpenAI(
    model="Ministral-3-14B-Instruct-2512",
    base_url="https://llm.aihosting.mittwald.de/v1",
    temperature=0.15
)

# Get response
response = chat.invoke([
    HumanMessage(content="Moin and hello!")
])

print(response.content)

Vision (image + text)

Vision models on mittwald AI Hosting accept images as Base64-encoded data URLs only. Sending image URLs is not supported and results in a server error.

Encoding and resizing images

Install Pillow for image processing:

pip install Pillow

Always resize before encoding. Sending large images significantly increases time to first token (TTFT). Keeping the longest edge at 1024 px or below is a safe default that preserves quality for most tasks:

import base64
import io
from PIL import Image

def encode_image(path: str, max_px: int = 1024) -> str:
    """Resize to max_px on the longest edge and return a Base64 data URL."""
    with Image.open(path) as img:
        # JPEG cannot store alpha or palette data, so normalize to RGB first
        if img.mode != "RGB":
            img = img.convert("RGB")
        w, h = img.size
        scale = min(1.0, max_px / max(w, h))
        if scale < 1.0:
            # Clamp to at least 1 px so extreme aspect ratios stay valid
            new_size = (max(1, int(w * scale)), max(1, int(h * scale)))
            img = img.resize(new_size, Image.LANCZOS)
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=85)
        b64 = base64.b64encode(buf.getvalue()).decode()
    return f"data:image/jpeg;base64,{b64}"
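To make the resizing rule concrete, the sketch below reproduces the scaling arithmetic of `encode_image` without touching any image data, and shows how Base64 encoding inflates the payload by roughly one third. The helper names `target_size` and `base64_length` are hypothetical and exist only for illustration:

```python
import math


def target_size(width: int, height: int, max_px: int = 1024) -> tuple[int, int]:
    """Return the dimensions after scaling the longest edge down to max_px."""
    scale = min(1.0, max_px / max(width, height))
    # Images that already fit are left unchanged (scale stays at 1.0)
    return max(1, int(width * scale)), max(1, int(height * scale))


def base64_length(num_bytes: int) -> int:
    """Number of characters in the padded Base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


print(target_size(2048, 1536))  # (1024, 768)
print(target_size(800, 600))    # (800, 600)
print(base64_length(300_000))   # 400000
```

A 2048 × 1536 photo is halved to 1024 × 768, while an 800 × 600 image passes through unscaled. A 300 kB JPEG becomes about 400,000 characters of Base64 text in the request body.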

Making a vision request

from openai import OpenAI

client = OpenAI(base_url="https://llm.aihosting.mittwald.de/v1")

resp = client.chat.completions.create(
    model="Ministral-3-14B-Instruct-2512",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in detail."},
            {"type": "image_url", "image_url": {"url": encode_image("photo.jpg")}},
        ]
    }],
    temperature=0.1,
    max_tokens=512,
)
print(resp.choices[0].message.content)

Choosing a vision model

Model	Max images	Strengths
`Qwen3.5-122B-A10B-FP8`	20+	Best accuracy, OCR, complex scenes
`Ministral-3-14B-Instruct-2512`	4	Balanced speed and quality
`Qwen3.6-35B-A3B-FP8`	5+	Fast on warm requests
`Devstral-Small-2-24B-Instruct-2512`	4	Code-heavy workflows with images
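When a request carries several images, it may be useful to check the count against the table before sending. The following is a minimal sketch: the `MAX_IMAGES` mapping and the `build_content` helper are hypothetical, and the entries "20+" and "5+" are treated conservatively as 20 and 5, which is an assumption rather than a documented hard limit.

```python
# Conservative per-model caps derived from the table; "20+" and "5+" are
# treated as 20 and 5 here, which is an assumption of this sketch.
MAX_IMAGES = {
    "Qwen3.5-122B-A10B-FP8": 20,
    "Ministral-3-14B-Instruct-2512": 4,
    "Qwen3.6-35B-A3B-FP8": 5,
    "Devstral-Small-2-24B-Instruct-2512": 4,
}


def build_content(model: str, prompt: str, data_urls: list[str]) -> list[dict]:
    """Build a multi-image message content list, enforcing the model's cap."""
    limit = MAX_IMAGES[model]
    if len(data_urls) > limit:
        raise ValueError(
            f"{model} accepts at most {limit} images, got {len(data_urls)}"
        )
    content = [{"type": "text", "text": prompt}]
    for url in data_urls:
        # Only Base64 data URLs are accepted, so reject plain http(s) links early
        if not url.startswith("data:image/"):
            raise ValueError("images must be Base64 data URLs")
        content.append({"type": "image_url", "image_url": {"url": url}})
    return content
```

The returned list can be passed as the `content` of a user message, with each data URL produced by `encode_image`.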

For Qwen models, disable thinking mode in vision requests to avoid unnecessary overhead:

resp = client.chat.completions.create(
    model="Qwen3.5-122B-A10B-FP8",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Extract all text from this document."},
            {"type": "image_url", "image_url": {"url": encode_image("document.jpg")}},
        ]
    }],
    temperature=0.7,
    max_tokens=1024,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(resp.choices[0].message.content)
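If the same code path serves several models, the thinking switch can be applied only where it is relevant. This is a small sketch under the assumption that every model whose name starts with `Qwen` accepts the `enable_thinking` template argument; the helper name `thinking_options` is hypothetical:

```python
from typing import Optional


def thinking_options(model: str) -> Optional[dict]:
    """Return an extra_body that disables thinking for Qwen models, else None."""
    if model.startswith("Qwen"):
        return {"chat_template_kwargs": {"enable_thinking": False}}
    # None is the client's default for extra_body, so nothing extra is sent
    return None
```

The result can be passed directly as `extra_body=thinking_options(model)` in `client.chat.completions.create(...)`.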

Tool-calling (function calling)

from openai import OpenAI
client = OpenAI(base_url="https://llm.aihosting.mittwald.de/v1")

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }
  }
]

resp = client.chat.completions.create(
    model="Devstral-Small-2-24B-Instruct-2512",
    messages=[{"role": "user", "content": "What is the weather in Berlin?"}],
    tools=tools,
    tool_choice="auto"
)

# Check if the model called a tool
if resp.choices[0].message.tool_calls:
    call = resp.choices[0].message.tool_calls[0]
    print(f"Function: {call.function.name}")
    print(f"Arguments: {call.function.arguments}")
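The example above stops once the model has requested a tool. In the usual OpenAI-style flow, the application then runs the function itself and sends the result back as a `tool` message. The sketch below shows that dispatch step offline; `get_weather`, `TOOL_REGISTRY`, and `run_tool_call` are hypothetical names, and the weather data is a placeholder:

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local implementation; a real one would query a weather service."""
    return {"city": city, "condition": "unknown"}


# Maps the tool names declared in `tools` to local Python callables
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(name: str, arguments: str, call_id: str) -> dict:
    """Execute one tool call and return the matching 'tool' role message."""
    func = TOOL_REGISTRY[name]
    kwargs = json.loads(arguments)  # arguments arrive as a JSON string
    result = func(**kwargs)
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}


msg = run_tool_call("get_weather", '{"city": "Berlin"}', "call_1")
print(msg["content"])  # {"city": "Berlin", "condition": "unknown"}
```

With a real response, you would pass `call.function.name`, `call.function.arguments`, and `call.id`, append both the assistant message and the returned `tool` message to `messages`, and call `client.chat.completions.create` again so the model can phrase the final answer.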

Streaming responses

from openai import OpenAI
client = OpenAI(base_url="https://llm.aihosting.mittwald.de/v1")

stream = client.chat.completions.create(
    model="Devstral-Small-2-24B-Instruct-2512",
    messages=[{"role": "user", "content": "Write a short poem about coding"}],
    stream=True
)

for chunk in stream:
    # Some chunks may carry no choices or an empty delta; skip those
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
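If you need the complete text after streaming, for example to store it, the deltas can be collected while iterating. The following sketch assumes chunks shaped like those in the loop above. `collect_text` and `fake_chunk` are hypothetical helpers, and the fake chunks only stand in for a real stream so the example runs offline:

```python
from types import SimpleNamespace


def collect_text(stream) -> str:
    """Concatenate the content deltas of a chat completion stream."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # skip chunks without choices
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in for a streamed chunk, used only to exercise collect_text offline."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [
    fake_chunk("Hello"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk(" world"),
]
print(collect_text(demo))  # Hello world
```

In real use, pass the object returned by `client.chat.completions.create(..., stream=True)` instead of `demo`.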
