# API Documentation
Ynnova exposes an OpenAI-compatible REST API. Any client that works with OpenAI's API works with Ynnova — just change the base URL and API key.
## Overview
The Ynnova API is fully compatible with the OpenAI Chat Completions specification. It supports streaming, system prompts, and temperature control.
All requests are served by GPU-accelerated vLLM workers connected to the network via encrypted tunnels. Responses follow the OpenAI response schema exactly.
## Authentication

Pass your API key in the `Authorization` header as a Bearer token:

```
Authorization: Bearer sk-your-api-key
```
To request an API key, contact us or email support@ynnova.eu.
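In Python, the required headers can be assembled like this (a minimal sketch; the key shown is a placeholder):

```python
def auth_headers(api_key: str) -> dict:
    """Headers required by every Ynnova API request."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = auth_headers("sk-your-api-key")
```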
## Base URL

```
https://api.ynnova.eu
```

The OpenAI-compatible routes live under `/v1` (e.g. `https://api.ynnova.eu/v1/chat/completions`). All endpoints are served over HTTPS with a valid TLS certificate.
## Chat Completions

`POST /v1/chat/completions`

Generates a model response for the given chat history. Compatible with the OpenAI `ChatCompletion` object format.
### Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | required | Model ID to use. See List Models. |
| messages | array | required | Array of message objects with role and content. |
| stream | boolean | optional | If true, returns a server-sent event stream. Default: false. |
| temperature | number | optional | Sampling temperature 0–2. Default: 1. |
| max_tokens | integer | optional | Maximum tokens to generate. |
| top_p | number | optional | Nucleus sampling probability. Default: 1. |
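When `stream` is true, the response arrives as server-sent events in the OpenAI format: each event line is prefixed with `data: ` and the stream ends with a `data: [DONE]` sentinel. A minimal parser for one event line, assuming that standard framing:

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line from a streaming response.

    Returns the chunk as a dict, or None for non-data lines
    and the [DONE] sentinel.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)

# Each chunk carries incremental text under choices[0].delta.content.
```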
## List Models

`GET /v1/models`

Returns the list of models currently available on the network.
### Currently available
| Model ID | Context | Description |
|---|---|---|
| Qwen/Qwen2.5-7B-Instruct | 32k tokens | Qwen 2.5 7B instruction-tuned. Fast, low latency. |
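The response follows the OpenAI list format, with each model as an object in a `data` array. A small helper to pull out the IDs; the live call is sketched in a comment and needs a valid key:

```python
import json
import urllib.request

BASE_URL = "https://api.ynnova.eu"

def list_model_ids(models_response: dict) -> list:
    """Extract model IDs from an OpenAI-style /v1/models response."""
    return [m["id"] for m in models_response.get("data", [])]

# Live call (requires a valid key):
# req = urllib.request.Request(
#     f"{BASE_URL}/v1/models",
#     headers={"Authorization": "Bearer sk-your-api-key"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(list_model_ids(json.load(resp)))
```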
## Health Check

Returns the current status of the gateway and the connected vLLM worker.

```json
{
  "status": "ok",
  "vllm": true,
  "model": "Qwen/Qwen2.5-7B-Instruct"
}
```
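A client can treat the gateway as ready only when both fields check out. A sketch of that test (the payload shape is taken from the example above; the health endpoint's path is not stated here, so any path you use is an assumption):

```python
def is_healthy(health: dict) -> bool:
    """True when the gateway reports ok and a vLLM worker is attached.

    `health` is the decoded JSON body of the health-check response.
    """
    return health.get("status") == "ok" and health.get("vllm") is True
```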
## cURL example

```bash
curl https://api.ynnova.eu/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
## Python example

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.ynnova.eu/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one paragraph."},
    ],
)

print(response.choices[0].message.content)
```
## JavaScript example

```javascript
const response = await fetch("https://api.ynnova.eu/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-your-api-key",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "Qwen/Qwen2.5-7B-Instruct",
    messages: [{ role: "user", content: "Hello!" }]
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);
```
## All parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | — | Model ID (required) |
| messages | array | — | Conversation history (required) |
| stream | boolean | false | SSE streaming |
| temperature | float | 1.0 | 0 = deterministic, 2 = creative |
| top_p | float | 1.0 | Nucleus sampling cutoff |
| max_tokens | integer | null | Max output tokens |
| stop | string[] | null | Stop sequences |
| frequency_penalty | float | 0 | Penalize tokens in proportion to how often they have already appeared. |
| presence_penalty | float | 0 | Penalize tokens that have appeared at all, encouraging new topics. |
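Putting the optional knobs together, a request body might be built like this (a sketch; `build_request` is an illustrative helper, not part of the API, and the parameter values are examples). Dropping unset options lets the server-side defaults from the table apply:

```python
def build_request(model: str, messages: list, **options) -> dict:
    """Build a Chat Completions request body, omitting unset options
    so the server-side defaults apply."""
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in options.items() if v is not None})
    return body

body = build_request(
    "Qwen/Qwen2.5-7B-Instruct",
    [{"role": "user", "content": "Write a haiku."}],
    temperature=0.7,
    max_tokens=64,
    stop=["\n\n"],
)
```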
## Error codes
| Status | Code | Description |
|---|---|---|
| 401 | unauthorized | Missing or invalid API key. |
| 422 | validation_error | Request body failed validation. |
| 503 | service_unavailable | No vLLM worker is currently connected. |
| 500 | internal_error | Unexpected server error. |
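A client can branch on these statuses: a 503 means no vLLM worker is attached yet and is worth retrying after a delay, while 401 and 422 are client-side problems that retrying will not fix. A sketch of that decision (the helper name is illustrative):

```python
# Statuses worth retrying: 503 means no vLLM worker is connected yet,
# and 500 may be transient. 401 and 422 require fixing the request.
RETRYABLE = {500, 503}

def should_retry(status_code: int) -> bool:
    """Whether a failed request is worth retrying with backoff."""
    return status_code in RETRYABLE
```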
Need help? Contact us or email support@ynnova.eu.