Qwen3-Embedding-8B
Description
The Qwen3 Embedding model by Alibaba creates embeddings, which are essential for retrieval-augmented generation (RAG) systems. It does not support chat interaction. Instead, it converts documents and texts into vectors that the application stores in a vector database, enabling semantic search. Its main strength is high performance across more than 100 languages.
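To illustrate what semantic search over stored vectors means, here is a minimal in-memory sketch. It assumes cosine similarity as the ranking metric and uses toy 3-dimensional vectors in place of real embeddings. The helper name `cosine_top_k` is hypothetical, and a real application would delegate this step to its vector database.

```python
import numpy as np

def cosine_top_k(query_vec, doc_vecs, k=3):
    """Return indices and scores of the k documents most similar to the query."""
    q = np.asarray(query_vec, dtype=float)
    d = np.asarray(doc_vecs, dtype=float)
    # Cosine similarity = dot product divided by the product of the norms.
    scores = (d @ q) / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))
    order = np.argsort(-scores)[:k]
    return order, scores[order]

# Toy 3-dimensional vectors stand in for real 4,096-dimensional embeddings.
docs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
query = np.array([1.0, 0.1, 0.0])

indices, scores = cosine_top_k(query, docs, k=2)
print(indices, scores)
```

The first returned index points at the stored vector closest in direction to the query vector.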
The following limitations apply:
- Maximum context length: 32,768 tokens
- Embedding dimension: 4,096
- The dimensions parameter for dynamically projecting to a lower dimension is not supported
It is recommended to format search queries with the following template before embedding them:
Instruct: {task_description}
Query: {query}
Here, {query} should express the individual search query in one sentence, and {task_description} should describe the task.
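As an illustration, the template can be filled in with a small helper. The function name `format_query` and the task description are hypothetical examples, not part of the API. The sketch also assumes that the template applies to search queries only and that documents are embedded as-is.

```python
def format_query(task_description: str, query: str) -> str:
    """Build the instruction-prefixed query string from the template above."""
    return f"Instruct: {task_description}\nQuery: {query}"

# Illustrative task description; adapt it to your own retrieval task.
task = "Given a web search query, retrieve relevant passages that answer the query"
text = format_query(task, "What is the capital of China?")
print(text)
```

The resulting string is what would be passed as the input when requesting the embedding for a search query.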
Truncating and normalizing embeddings
If you require smaller vectors, you can truncate the embedding vector to the desired number of dimensions and normalize it afterwards. This is possible because the model was trained using Matryoshka Representation Learning. The following example shows how to truncate and normalize an embedding.
Install the required libraries and set up the token first:
pip install python-dotenv openai numpy
echo 'OPENAI_API_KEY="sk-…"' > .env
Request an embedding from the model, truncate it to 256 dimensions, and normalize it:
from openai import OpenAI
import numpy as np
from dotenv import load_dotenv

# Load OPENAI_API_KEY from the .env file
load_dotenv()

client = OpenAI(
    base_url="https://llm.aihosting.mittwald.de/v1"
)


def normalize_l2(x):
    """Scale a vector (1-D) or each row of a matrix (2-D) to unit L2 norm."""
    x = np.array(x)
    if x.ndim == 1:
        norm = np.linalg.norm(x)
        if norm == 0:
            # An all-zero vector cannot be normalized; return it unchanged
            return x
        return x / norm
    else:
        norm = np.linalg.norm(x, 2, axis=1, keepdims=True)
        return np.where(norm == 0, x, x / norm)


response = client.embeddings.create(
    model="Qwen3-Embedding-8B", input="Testing 123", encoding_format="float"
)

# Keep only the first 256 of the 4,096 dimensions, then rescale to unit length
cut_dim = response.data[0].embedding[:256]
norm_dim = normalize_l2(cut_dim)
print(norm_dim)
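The same truncate-then-normalize step can be tried offline on a batch of vectors. The sketch below uses random placeholder data instead of API responses, and the helper name `truncate_and_normalize` is hypothetical. It also shows why the normalization matters: for unit-length vectors, cosine similarity reduces to a plain dot product.

```python
import numpy as np

def truncate_and_normalize(vectors, dim):
    """Keep the first `dim` components of each row and rescale rows to unit L2 norm."""
    v = np.atleast_2d(np.asarray(vectors, dtype=float))[:, :dim]
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    # Replace zero norms with 1 so all-zero rows stay zero instead of dividing by zero.
    safe_norms = np.where(norms == 0, 1.0, norms)
    return v / safe_norms

# Random placeholder data instead of API responses: 3 vectors with 4,096 dimensions.
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 4096))
small = truncate_and_normalize(full, 256)

print(small.shape)                    # (3, 256)
print(np.linalg.norm(small, axis=1))  # each value is approximately 1.0

# For unit-length vectors, cosine similarity equals the dot product.
similarity = small @ small.T
```

Random vectors carry no semantic meaning, so the off-diagonal values in `similarity` are not informative here; with real embeddings they would rank how closely the texts are related.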
Terms of use and licensing
The general terms of use apply. The model is provided by Alibaba under the Apache 2.0 License, and reuse of the generated content is not subject to any additional restrictions.