API Documentation

Ynnova exposes an OpenAI-compatible REST API. Any client that works with OpenAI's API works with Ynnova — just change the base URL and API key.

Overview

The Ynnova API is fully compatible with the OpenAI Chat Completions specification. It supports streaming, system prompts, and temperature control.

All requests are served by GPU-accelerated vLLM workers connected to the network via encrypted tunnels. Responses follow the OpenAI response schema exactly.

Authentication

Pass your API key in the Authorization header as a Bearer token.

HTTP Header
Authorization: Bearer sk-your-api-key

To request an API key, contact us or email support@ynnova.eu.
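In Python, for example, the header can be assembled like this (the key shown is the same placeholder as above, not a real credential):

```python
# Build the auth headers for any Ynnova request.
# "sk-your-api-key" is a placeholder -- substitute your issued key.
API_KEY = "sk-your-api-key"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```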

Base URL

Base URL
https://api.ynnova.eu

All endpoints are served over HTTPS with a valid TLS certificate.

Chat Completions

POST /v1/chat/completions Generate a chat response

Generates a model response for the given chat history. Compatible with the OpenAI ChatCompletion object format.

Request body

Parameter    Type     Required  Description
model        string   required  Model ID to use. See List Models.
messages     array    required  Array of message objects with role and content.
stream       boolean  optional  If true, returns a server-sent event stream. Default: false.
temperature  number   optional  Sampling temperature 0–2. Default: 1.
max_tokens   integer  optional  Maximum tokens to generate.
top_p        number   optional  Nucleus sampling probability. Default: 1.
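Putting the table together: only model and messages are required, and a minimal body can be built and serialized like this (the optional parameters are shown at their documented defaults):

```python
import json

# Minimal chat-completions request body.
# Only "model" and "messages" are required.
payload = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [
        {"role": "user", "content": "Hello!"}
    ],
    # Optional parameters, shown at their documented defaults:
    "stream": False,
    "temperature": 1,
    "top_p": 1,
}

body = json.dumps(payload)
```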

List Models

GET /v1/models List available models

Returns the list of models currently available on the network.

Currently available

Model ID                  Context     Description
Qwen/Qwen2.5-7B-Instruct  32k tokens  Qwen 2.5 7B instruction-tuned. Fast, low latency.
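Since the API is OpenAI-compatible, the response presumably follows the standard OpenAI model-list shape (an object of type "list" with a "data" array of model objects). A sketch of parsing it, using an assumed sample payload — the fields beyond "id" are illustrative:

```python
import json

# Assumed /v1/models response in the standard OpenAI list shape.
# In practice you would fetch this from GET /v1/models with your
# Authorization header; here a sample payload stands in for the call.
sample = """
{
  "object": "list",
  "data": [
    {"id": "Qwen/Qwen2.5-7B-Instruct", "object": "model"}
  ]
}
"""

models = [m["id"] for m in json.loads(sample)["data"]]
```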

Health Check

GET /health Check API and worker status

Returns the current status of the gateway and connected vLLM worker.

Response 200
{
  "status": "ok",
  "vllm": true,
  "model": "Qwen/Qwen2.5-7B-Instruct"
}
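A client can gate traffic on this response — sending requests only when the gateway reports a connected worker. A sketch built on the documented shape (is_ready is a hypothetical helper, not part of any SDK):

```python
# Readiness check against the documented GET /health response shape.
# The gateway is usable when status is "ok" and a vLLM worker is attached.
def is_ready(health: dict) -> bool:
    return health.get("status") == "ok" and health.get("vllm") is True

# In practice you would fetch the JSON first, e.g.:
#   import json, urllib.request
#   health = json.load(urllib.request.urlopen("https://api.ynnova.eu/health"))
ready = is_ready({"status": "ok", "vllm": True, "model": "Qwen/Qwen2.5-7B-Instruct"})
```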

cURL example

cURL
curl https://api.ynnova.eu/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Python example

Python · openai SDK
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.ynnova.eu/v1"
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one paragraph."}
    ]
)

print(response.choices[0].message.content)
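Setting stream=True on the same call turns the response into an iterator of chunks, with each text fragment in choices[0].delta.content. A sketch — the network call is shown as a comment, and accumulate is a hypothetical helper illustrating the client-side joining:

```python
# Streaming with the openai SDK (network call, shown for reference):
#
#   stream = client.chat.completions.create(
#       model="Qwen/Qwen2.5-7B-Instruct",
#       messages=[{"role": "user", "content": "Hello!"}],
#       stream=True,
#   )
#   for chunk in stream:
#       delta = chunk.choices[0].delta.content
#       if delta:
#           print(delta, end="", flush=True)

# Client-side, the full text is just the non-empty deltas joined in order:
def accumulate(deltas):
    return "".join(d for d in deltas if d)

text = accumulate(["Hel", "lo", None, "!"])
```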

JavaScript example

JavaScript · fetch
const response = await fetch("https://api.ynnova.eu/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-your-api-key",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "Qwen/Qwen2.5-7B-Instruct",
    messages: [{ role: "user", content: "Hello!" }]
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);

All parameters

Parameter          Type      Default  Description
model              string    —        Model ID (required)
messages           array     —        Conversation history (required)
stream             boolean   false    SSE streaming
temperature        float     1.0      0 = deterministic, 2 = creative
top_p              float     1.0      Nucleus sampling cutoff
max_tokens         integer   null     Max output tokens
stop               string[]  null     Stop sequences
frequency_penalty  float     0        Penalize repeated tokens
presence_penalty   float     0        Penalize already-seen tokens

Error codes

Status  Code                 Description
401     unauthorized         Missing or invalid API key.
422     validation_error     Request body failed validation.
503     service_unavailable  No vLLM worker is currently connected.
500     internal_error       Unexpected server error.
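Of these, 503 (no worker connected) and 500 are server-side conditions that may clear on their own, while 401 and 422 indicate a problem with the request itself and will not succeed on retry. A sketch of a retry policy along those lines (should_retry is a hypothetical helper, not part of the API):

```python
# Transient gateway errors worth retrying; client errors (401, 422) are not,
# since resending the same request cannot fix a bad key or invalid body.
RETRYABLE = {500, 503}

def should_retry(status: int) -> bool:
    return status in RETRYABLE
```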

Need help? Contact us or email support@ynnova.eu.