Getting started with Dedicated AI Hosting
After we set up your dedicated instance, you will receive:
- API base URL - your dedicated HTTPS endpoint, e.g. https://your-company.llm.aihosting.mittwald.de
- API key - a bearer token that authenticates your requests
Keep your API key confidential. Store it in an environment variable or secrets manager — never hardcode it in source files or commit it to version control. If a key is exposed, contact us to rotate it.
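As a minimal sketch of the environment-variable approach, the helper below reads the endpoint and key at startup and fails early if either is missing. The function name `load_credentials` is hypothetical; the variable names `OPENAI_BASE_URL` and `OPENAI_API_KEY` are the ones used in the SDK examples later in this guide.

```python
import os


def load_credentials(env=os.environ):
    """Return (base_url, api_key) from the environment, failing loudly if unset."""
    # Hypothetical helper: checks both variables so a missing key is caught
    # at startup instead of surfacing later as an authentication error.
    required = ("OPENAI_BASE_URL", "OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so paths can be appended consistently.
    return env["OPENAI_BASE_URL"].rstrip("/"), env["OPENAI_API_KEY"]
```

Passing the mapping as an argument keeps the helper easy to exercise without touching the real process environment.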
Checking available models
user@local $ curl https://your-company.llm.aihosting.mittwald.de/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
Use one of the returned model IDs as YOUR_MODEL_ID in requests.
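The same check can be sketched in Python using only the standard library. This assumes the response follows the OpenAI-style list shape (`{"object": "list", "data": [{"id": ...}]}`), which is an assumption based on the endpoint being OpenAI-compatible; the function names are illustrative.

```python
import json
import urllib.request


def extract_model_ids(payload):
    """Pull the model IDs out of a decoded /v1/models response."""
    # Assumes the OpenAI-style shape: {"object": "list", "data": [{"id": ...}, ...]}
    return [entry["id"] for entry in payload.get("data", [])]


def list_model_ids(base_url, api_key):
    """Fetch <base_url>/models and return the model IDs it reports."""
    # base_url is expected to end in /v1, as in the examples in this guide.
    request = urllib.request.Request(
        base_url.rstrip("/") + "/models",
        headers={"Authorization": "Bearer " + api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_model_ids(json.load(response))
```

Calling `list_model_ids(os.environ["OPENAI_BASE_URL"], os.environ["OPENAI_API_KEY"])` after `import os` would then return the IDs you can use as YOUR_MODEL_ID.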
Sending your first request
With curl:
user@local $ curl https://your-company.llm.aihosting.mittwald.de/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL_ID",
    "messages": [
      {"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}
    ]
  }'
With Python, install the SDK and python-dotenv:
user@local $ pip install openai python-dotenv
Put your credentials in a .env file:
OPENAI_API_KEY=YOUR_API_KEY
OPENAI_BASE_URL=https://your-company.llm.aihosting.mittwald.de/v1
Then send the request:
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()  # loads OPENAI_API_KEY and OPENAI_BASE_URL from .env
client = OpenAI()  # reads both variables from the environment
response = client.chat.completions.create(
    model="YOUR_MODEL_ID",
    messages=[
        {"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}
    ],
)
print(response.choices[0].message.content)
With JavaScript / TypeScript, install the SDK and dotenv:
user@local $ npm install openai dotenv
Put your credentials in a .env file:
OPENAI_API_KEY=YOUR_API_KEY
OPENAI_BASE_URL=https://your-company.llm.aihosting.mittwald.de/v1
Then send the request:
import OpenAI from "openai";
import "dotenv/config"; // loads .env into process.env

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: process.env.OPENAI_BASE_URL,
});
// Top-level await requires an ES module (for example a .mjs file).
const response = await client.chat.completions.create({
  model: "YOUR_MODEL_ID",
  messages: [
    { role: "user", content: "Explain retrieval-augmented generation in two sentences." }
  ],
});
console.log(response.choices[0].message.content);
Streaming responses
Add "stream": true to receive tokens as they are generated instead of waiting for the full response.
With curl:
user@local $ curl https://your-company.llm.aihosting.mittwald.de/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR_MODEL_ID",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}
    ]
  }'
With Python:
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()  # same .env file as in the first request
client = OpenAI()  # reads OPENAI_API_KEY and OPENAI_BASE_URL from the environment
stream = client.chat.completions.create(
    model="YOUR_MODEL_ID",
    stream=True,
    messages=[{"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}],
)
for chunk in stream:
    # Some chunks carry no choices or no text, so guard before printing.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
With JavaScript / TypeScript:
import OpenAI from "openai";
import "dotenv/config"; // loads .env into process.env

// Read the credentials from the environment instead of hardcoding them.
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: process.env.OPENAI_BASE_URL,
});
const stream = await client.chat.completions.create({
  model: "YOUR_MODEL_ID",
  stream: true,
  messages: [{ role: "user", content: "Explain retrieval-augmented generation in two sentences." }],
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
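If you consume the stream without an SDK, you need to decode the raw response yourself. The sketch below assumes the endpoint uses the OpenAI-style server-sent events framing: each event is a `data: {...}` line carrying a JSON chunk, and the stream ends with `data: [DONE]`. That framing is an assumption based on the endpoint being OpenAI-compatible, and `iter_stream_text` is a hypothetical helper name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the assumed OpenAI-style framing
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some chunks carry no choices
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text
```

Feeding it the decoded lines of a streaming HTTP response and joining the yielded fragments reconstructs the full reply.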
Request parameters
Parameter recommendations can be model-specific. Start with the defaults from your chosen SDK, then tune based on your model's behavior and your use case.
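One way to follow this advice is to send only the parameters you have deliberately chosen and leave everything else to the defaults. The helper below is a hypothetical sketch; `temperature` and `max_tokens` in the usage note are standard OpenAI chat completions parameters used purely for illustration, and whether a given value suits your model is something to verify yourself.

```python
def build_chat_request(model, prompt, **overrides):
    """Build a chat completions payload that includes only explicitly set parameters."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Overrides left as None are dropped, so the default applies for them.
    payload.update({name: value for name, value in overrides.items() if value is not None})
    return payload
```

For example, `build_chat_request("YOUR_MODEL_ID", "Hello", temperature=0.2, max_tokens=256)` produces a payload that can be passed to the SDK as `client.chat.completions.create(**payload)`.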
Drop-in replacement
Because the endpoint is OpenAI-compatible, you can use it as a drop-in replacement in frameworks that accept a custom base URL. See OpenAI API compatibility for the full list of supported endpoints and parameters, including tool calling and structured outputs.
Managing multiple API keys
If you want separate keys per app/team, usage tracking, or per-key rate limits, run LiteLLM as a self-hosted proxy.