JavaScript examples
Chat completions
import OpenAI from "openai";
// The SDK reads the API key from the OPENAI_API_KEY environment variable by default
const client = new OpenAI({ baseURL: "https://llm.aihosting.mittwald.de/v1" });
const resp = await client.chat.completions.create({
model: "Devstral-Small-2-24B-Instruct-2512",
messages: [{ role: "user", content: "Hello from JS!" }],
});
console.log(resp.choices[0].message.content);
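Note that `message.content` can be `null` when the model answers with tool calls instead of text, so a small guard is useful before reusing responses downstream. A minimal sketch — the `ChatLike` type and `messageText` helper are illustrative, not part of the SDK:

```typescript
// Shape of the fields we read from a chat completion response.
type ChatLike = { choices: Array<{ message: { content: string | null } }> };

// Return the assistant text, or an empty string when the response
// carries no text (e.g. the model answered with tool calls only).
function messageText(resp: ChatLike): string {
  return resp.choices[0]?.message?.content ?? "";
}
```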
Vision (image + text)
Vision models on mittwald AI Hosting accept images as Base64-encoded data URLs only. Sending image URLs is not supported and results in a server error.
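For reference, a data URL is just the MIME type plus the Base64-encoded image bytes. A minimal sketch of the format (the `toDataUrl` helper is illustrative):

```typescript
// Build an image data URL from raw image bytes, e.g. from fs.readFileSync().
function toDataUrl(bytes: Buffer, mime = "image/jpeg"): string {
  return `data:${mime};base64,${bytes.toString("base64")}`;
}
```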
Encoding and resizing images
Install sharp for image processing:
npm install sharp
Always resize before encoding. Sending large images significantly increases time to first token (TTFT). Keeping the longest edge at 1024 px or below is a safe default that preserves quality for most tasks:
import sharp from "sharp";
async function encodeImage(path: string, maxPx = 1024): Promise<string> {
const img = sharp(path);
const { width = 0, height = 0 } = await img.metadata();
const longest = Math.max(width, height);
const resized =
longest > maxPx
? img.resize({ width: maxPx, height: maxPx, fit: "inside" })
: img;
const buf = await resized.jpeg({ quality: 85 }).toBuffer();
return `data:image/jpeg;base64,${buf.toString("base64")}`;
}
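Base64 encoding inflates the payload by roughly a third (4 output bytes per 3 input bytes), which is part of why oversized images hurt TTFT. A quick way to sanity-check the encoded size before sending — the helper name and prefix length are illustrative, assuming a JPEG data URL:

```typescript
// Approximate on-the-wire size in bytes of a Base64 data URL built from
// `rawBytes` of image data: ceil(n / 3) * 4, plus the
// "data:image/jpeg;base64," prefix (23 characters).
function dataUrlSize(rawBytes: number, prefixLen = 23): number {
  return Math.ceil(rawBytes / 3) * 4 + prefixLen;
}
```

A 1024 px JPEG at quality 85 is typically a few hundred kilobytes of raw data, so the encoded payload usually stays well under a megabyte.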
Making a vision request
import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://llm.aihosting.mittwald.de/v1" });
const resp = await client.chat.completions.create({
model: "Ministral-3-14B-Instruct-2512",
messages: [
{
role: "user",
content: [
{ type: "text", text: "Describe this image in detail." },
{ type: "image_url", image_url: { url: await encodeImage("photo.jpg") } },
],
},
],
temperature: 0.1,
max_tokens: 512,
});
console.log(resp.choices[0].message.content);
Choosing a vision model
| Model | Max images | Strengths |
|---|---|---|
| Qwen3.5-122B-A10B-FP8 | 20+ | Best accuracy, OCR, complex scenes |
| Ministral-3-14B-Instruct-2512 | 4 | Balanced speed and quality |
| Qwen3.6-35B-A3B-FP8 | 5+ | Fast on warm requests |
| Devstral-Small-2-24B-Instruct-2512 | 4 | Code-heavy workflows with images |
For Qwen models, disable thinking mode in vision requests to avoid unnecessary overhead:
const resp = await client.chat.completions.create({
model: "Qwen3.5-122B-A10B-FP8",
messages: [
{
role: "user",
content: [
{ type: "text", text: "Extract all text from this document." },
{ type: "image_url", image_url: { url: await encodeImage("document.jpg") } },
],
},
],
temperature: 0.1,
max_tokens: 1024,
// @ts-expect-error vLLM-specific parameter not in OpenAI types
chat_template_kwargs: { enable_thinking: false },
});
console.log(resp.choices[0].message.content);
Tool calling (function calling)
import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://llm.aihosting.mittwald.de/v1" });
const tools = [
{
type: "function",
function: {
name: "get_weather",
description: "Get current weather for a city",
parameters: {
type: "object",
properties: {
city: { type: "string", description: "City name" },
},
required: ["city"],
},
},
},
];
const resp = await client.chat.completions.create({
model: "Devstral-Small-2-24B-Instruct-2512",
messages: [{ role: "user", content: "What is the weather in Munich?" }],
tools,
tool_choice: "auto",
});
const toolCall = resp.choices[0].message.tool_calls?.[0];
if (toolCall) {
console.log(`Function: ${toolCall.function.name}`);
console.log(`Arguments: ${toolCall.function.arguments}`);
}
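The model returns tool arguments as a JSON string, so the application has to parse them, run its own implementation, and (to get a final answer) send the result back as a `role: "tool"` message referencing `tool_call_id`. A minimal dispatch sketch — `toolHandlers` and the weather values are hypothetical stand-ins, not a real API:

```typescript
// Local tool implementations, keyed by the function name the model requests.
// The weather handler is a hypothetical stub returning made-up values.
const toolHandlers: Record<string, (args: Record<string, unknown>) => string> = {
  get_weather: (args) =>
    JSON.stringify({ city: args.city, temperature: 21, unit: "celsius" }),
};

// Run a tool call returned by the model; arguments arrive as a JSON string.
function runToolCall(name: string, rawArgs: string): string {
  const handler = toolHandlers[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(JSON.parse(rawArgs));
}
```

The returned string can then be appended to `messages` as `{ role: "tool", tool_call_id: toolCall.id, content: result }` alongside the assistant message that contained the tool call, and the conversation re-sent for the model's final answer.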
Streaming responses
import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://llm.aihosting.mittwald.de/v1" });
const stream = await client.chat.completions.create({
model: "Devstral-Small-2-24B-Instruct-2512",
messages: [{ role: "user", content: "Write a short poem about coding" }],
stream: true,
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || "";
process.stdout.write(content);
}
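When the full text is also needed after streaming (for logging or caching), the deltas can be accumulated while printing. A small sketch of the accumulation step, using a simplified chunk shape (`StreamChunk` and `collectDeltas` are illustrative):

```typescript
// Simplified shape of the chunks yielded by the stream.
type StreamChunk = { choices: Array<{ delta?: { content?: string | null } }> };

// Join the incremental deltas into the complete assistant message.
function collectDeltas(chunks: StreamChunk[]): string {
  return chunks.map((c) => c.choices[0]?.delta?.content ?? "").join("");
}
```

In the loop above this amounts to appending each chunk's `content` to a running string next to the `process.stdout.write` call.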