# API Documentation
Ynnova exposes an OpenAI-compatible REST API. Any client that works with OpenAI's API works with Ynnova — just change the base URL and API key.
## Overview
The Ynnova API is fully compatible with the OpenAI Chat Completions specification. It supports streaming, system prompts, and temperature control.
All requests are served by GPU-accelerated vLLM workers connected to the network via encrypted tunnels. Responses follow the OpenAI response schema exactly.
## Authentication

Pass your API key in the `Authorization` header as a Bearer token:

```
Authorization: Bearer sk-your-api-key
```
To request an API key, contact us or email support@ynnova.eu.
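In Python, the required headers can be assembled like this (a minimal sketch; the key shown is a placeholder):

```python
def auth_headers(api_key: str) -> dict:
    """Headers required by every Ynnova API request."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = auth_headers("sk-your-api-key")
```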
## Base URL

```
https://api.ynnova.eu
```

The OpenAI-compatible routes live under `/v1` (e.g. `https://api.ynnova.eu/v1/chat/completions`). All endpoints are served over HTTPS with a valid TLS certificate.
## Chat Completions

`POST /v1/chat/completions`

Generates a model response for the given chat history. Compatible with the OpenAI `ChatCompletion` object format.
### Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | required | Model ID to use. See List Models. |
| messages | array | required | Array of message objects with role and content. |
| stream | boolean | optional | If true, returns a server-sent event stream. Default: false. |
| temperature | number | optional | Sampling temperature 0–2. Default: 1. |
| max_tokens | integer | optional | Maximum tokens to generate. |
| top_p | number | optional | Nucleus sampling probability. Default: 1. |
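When `stream` is true, the response arrives as server-sent events in the OpenAI format: each event line is prefixed with `data: ` and the stream ends with a `data: [DONE]` sentinel. A minimal parser for one event line, assuming that standard framing:

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line from a streaming response.

    Returns the chunk as a dict, or None for non-data lines
    and the [DONE] sentinel.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)

# Each chunk carries incremental text under choices[0].delta.content.
```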
## List Models

`GET /v1/models`

Returns the list of models currently available on the network.
### Currently available
| Model ID | Context | Description |
|---|---|---|
| Qwen/Qwen2.5-7B-Instruct | 32k tokens | Qwen 2.5 7B instruction-tuned. Fast, low latency. |
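The response follows the OpenAI list format, with each model as an object in a `data` array. A small helper to pull out the IDs; the live call is sketched in a comment and needs a valid key:

```python
import json
import urllib.request

BASE_URL = "https://api.ynnova.eu"

def list_model_ids(models_response: dict) -> list:
    """Extract model IDs from an OpenAI-style /v1/models response."""
    return [m["id"] for m in models_response.get("data", [])]

# Live call (requires a valid key):
# req = urllib.request.Request(
#     f"{BASE_URL}/v1/models",
#     headers={"Authorization": "Bearer sk-your-api-key"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(list_model_ids(json.load(resp)))
```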
## Health Check

Returns the current status of the gateway and the connected vLLM worker.

```json
{
  "status": "ok",
  "vllm": true,
  "model": "Qwen/Qwen2.5-7B-Instruct"
}
```
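A client can treat the gateway as ready only when both fields check out. A sketch of that test (the payload shape is taken from the example above; the health endpoint's path is not stated here, so any path you use is an assumption):

```python
def is_healthy(health: dict) -> bool:
    """True when the gateway reports ok and a vLLM worker is attached.

    `health` is the decoded JSON body of the health-check response.
    """
    return health.get("status") == "ok" and health.get("vllm") is True
```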
## cURL example

```bash
curl https://api.ynnova.eu/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
## Python example

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.ynnova.eu/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one paragraph."},
    ],
)

print(response.choices[0].message.content)
```
## JavaScript example

```javascript
const response = await fetch("https://api.ynnova.eu/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-your-api-key",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "Qwen/Qwen2.5-7B-Instruct",
    messages: [{ role: "user", content: "Hello!" }]
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);
```
## All parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | — | Model ID (required) |
| messages | array | — | Conversation history (required) |
| stream | boolean | false | SSE streaming |
| temperature | float | 1.0 | 0 = deterministic, 2 = creative |
| top_p | float | 1.0 | Nucleus sampling cutoff |
| max_tokens | integer | null | Max output tokens |
| stop | string[] | null | Stop sequences |
| frequency_penalty | float | 0 | Penalize tokens in proportion to how often they have already appeared. |
| presence_penalty | float | 0 | Penalize tokens that have appeared at all, encouraging new topics. |
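Putting the optional knobs together, a request body might be built like this (a sketch; `build_request` is an illustrative helper, not part of the API, and the parameter values are examples). Dropping unset options lets the server-side defaults from the table apply:

```python
def build_request(model: str, messages: list, **options) -> dict:
    """Build a Chat Completions request body, omitting unset options
    so the server-side defaults apply."""
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in options.items() if v is not None})
    return body

body = build_request(
    "Qwen/Qwen2.5-7B-Instruct",
    [{"role": "user", "content": "Write a haiku."}],
    temperature=0.7,
    max_tokens=64,
    stop=["\n\n"],
)
```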
## Error codes
| Status | Code | Description |
|---|---|---|
| 401 | unauthorized | Missing or invalid API key. |
| 422 | validation_error | Request body failed validation. |
| 503 | service_unavailable | No vLLM worker is currently connected. |
| 500 | internal_error | Unexpected server error. |
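A client can branch on these statuses: a 503 means no vLLM worker is attached yet and is worth retrying after a delay, while 401 and 422 are client-side problems that retrying will not fix. A sketch of that decision (the helper name is illustrative):

```python
# Statuses worth retrying: 503 means no vLLM worker is connected yet,
# and 500 may be transient. 401 and 422 require fixing the request.
RETRYABLE = {500, 503}

def should_retry(status_code: int) -> bool:
    """Whether a failed request is worth retrying with backoff."""
    return status_code in RETRYABLE
```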
Need help? Contact us or email support@ynnova.eu.