Getting started with Dedicated AI Hosting

After we set up your dedicated instance, you will receive:

API base URL - your dedicated HTTPS endpoint, e.g. https://your-company.llm.aihosting.mittwald.de
API key - a bearer token that authenticates your requests

Keep your API key confidential. Store it in an environment variable or a secrets manager; never hardcode it in source files or commit it to version control. If a key is exposed, contact us to rotate it.
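
As a minimal sketch of this advice, the following Python helpers read the key from an environment variable and fail early if it is missing. The variable name OPENAI_API_KEY matches the examples later on this page; the function names load_api_key and auth_header are hypothetical and not part of any SDK.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError if the variable is unset or empty, so a missing
    key is caught at startup rather than on the first request.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key


def auth_header(api_key: str) -> dict[str, str]:
    """Build the bearer-token header used to authenticate requests."""
    return {"Authorization": f"Bearer {api_key}"}
```

Failing at startup keeps a misconfigured deployment from sending unauthenticated requests and makes the cause obvious in the logs.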

Checking available models

user@local $ curl https://your-company.llm.aihosting.mittwald.de/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

Use one of the returned model IDs as YOUR_MODEL_ID in requests.
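
The same check can be sketched in Python using only the standard library. This assumes the endpoint returns the OpenAI-style list shape ({"object": "list", "data": [{"id": ...}]}), which this page does not spell out. The helper names build_models_request, extract_model_ids, and list_model_ids are hypothetical.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Prepare the GET /v1/models request with the bearer token attached."""
    url = base_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def extract_model_ids(payload: dict) -> list[str]:
    """Collect model IDs, assuming an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_model_ids(base_url: str, api_key: str) -> list[str]:
    """Call the endpoint and return the model IDs it reports."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_model_ids(json.load(response))
```

Calling list_model_ids("https://your-company.llm.aihosting.mittwald.de", api_key) would then return the IDs you can use as YOUR_MODEL_ID.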

Sending your first request

The same request is shown below for curl, Python, and JavaScript / TypeScript, in that order.

user@local $ curl https://your-company.llm.aihosting.mittwald.de/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL_ID",
    "messages": [
      {"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}
    ]
  }'

user@local $ pip install openai python-dotenv

# .env
OPENAI_API_KEY=YOUR_API_KEY
OPENAI_BASE_URL=https://your-company.llm.aihosting.mittwald.de/v1

from openai import OpenAI
from dotenv import load_dotenv

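load_dotenv()  # load OPENAI_API_KEY and OPENAI_BASE_URL from .env

# With no arguments, the client reads both variables from the environment.
client = OpenAI()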

response = client.chat.completions.create(
    model="YOUR_MODEL_ID",
    messages=[
        {"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}
    ]
)

print(response.choices[0].message.content)

user@local $ npm install openai dotenv

# .env
OPENAI_API_KEY=YOUR_API_KEY
OPENAI_BASE_URL=https://your-company.llm.aihosting.mittwald.de/v1

import OpenAI from "openai";
import "dotenv/config";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: process.env.OPENAI_BASE_URL,
});

const response = await client.chat.completions.create({
  model: "YOUR_MODEL_ID",
  messages: [
    { role: "user", content: "Explain retrieval-augmented generation in two sentences." }
  ],
});

console.log(response.choices[0].message.content);

Streaming responses

Add "stream": true to the request body to receive tokens as they are generated instead of waiting for the full response.

The same streaming request is shown below for curl, Python, and JavaScript / TypeScript, in that order.

user@local $ curl https://your-company.llm.aihosting.mittwald.de/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL_ID",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}
    ]
  }'

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # load OPENAI_API_KEY and OPENAI_BASE_URL from .env

client = OpenAI()

# stream=True makes create() return an iterator of chunks.
stream = client.chat.completions.create(
    model="YOUR_MODEL_ID",
    stream=True,
    messages=[{"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}],
)

for chunk in stream:
    # Skip chunks that carry no choices or no text.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

import OpenAI from "openai";
import "dotenv/config";

// Read credentials from the environment instead of hardcoding them.
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: process.env.OPENAI_BASE_URL,
});

const stream = await client.chat.completions.create({
  model: "YOUR_MODEL_ID",
  stream: true,
  messages: [{ role: "user", content: "Explain retrieval-augmented generation in two sentences." }],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
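
For clients that do not use an SDK, the raw stream can be parsed by hand. The sketch below assumes the endpoint follows the OpenAI streaming convention of server-sent events, where each event arrives as a "data: {json}" line and the stream ends with "data: [DONE]". This page does not document the wire format, so treat that as an assumption. The helper name iter_stream_text is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separators and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text


# Hand-written sample events, not captured from a real endpoint.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello, world"
```

In a real client the lines would come from the HTTP response body, read line by line as it arrives.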

Request parameters

Recommended parameter values differ between models. Start with your SDK's defaults, then tune them for your model's behavior and your use case.
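
One way to follow this advice is to send only the parameters you have deliberately tuned, so everything else stays at its default. The sketch below builds a chat completions request body that way. temperature and max_tokens are standard OpenAI chat parameters used here purely as examples; whether your model supports them is not stated on this page. The helper name build_chat_payload is hypothetical.

```python
import json
from typing import Any, Optional


def build_chat_payload(
    model: str,
    prompt: str,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Build a request body, including only parameters that were set."""
    payload: dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    optional = {"temperature": temperature, "max_tokens": max_tokens}
    # Leave out anything still at None so the default applies.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# Start with defaults, then tune one parameter at a time.
baseline = build_chat_payload("YOUR_MODEL_ID", "Summarize this ticket.")
tuned = build_chat_payload("YOUR_MODEL_ID", "Summarize this ticket.", temperature=0.2)
print(json.dumps(tuned, indent=2))
```

Comparing against None rather than truthiness means an explicit temperature of 0 is still sent.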

Drop-in replacement

Because the endpoint is OpenAI-compatible, you can use it as a drop-in replacement in frameworks that accept a custom base URL. See OpenAI API compatibility for the full list of supported endpoints and parameters, including tool calling and structured outputs.

Managing multiple API keys

If you want separate keys per app/team, usage tracking, or per-key rate limits, run LiteLLM as a self-hosted proxy.
