Shared LLM API Tester
All models are served from one shared OpenAI-compatible base URL: Qwen3 today, additional local models later. This tester exercises the same customer-facing API that external callers use.
Supported endpoints
- GET /api/llm/v1/models
- POST /api/llm/v1/chat/completions
Current live model focus
The shared API is model-agnostic. Right now it is backed by the pooled Qwen3 router, and future local models can appear in the same /models list without changing the public base URL.
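Because the model list can grow over time, callers can discover ids at request time from GET /api/llm/v1/models instead of hardcoding them. A minimal parsing sketch, assuming the endpoint follows the standard OpenAI list-models schema; the sample body and the available_model_ids helper are illustrative, not part of the API:

```python
import json

# Hypothetical /api/llm/v1/models response, following the OpenAI
# list-models schema: {"object": "list", "data": [{"id": ...}, ...]}.
# The model ids below are illustrative, not guaranteed.
sample_body = json.dumps({
    "object": "list",
    "data": [
        {"id": "qwen3-4b", "object": "model"},
    ],
})

def available_model_ids(body: str) -> list[str]:
    """Extract model ids from an OpenAI-compatible models list."""
    return [m["id"] for m in json.loads(body)["data"]]

print(available_model_ids(sample_body))  # ['qwen3-4b']
```

New models then show up in the returned list automatically, with no client-side changes beyond choosing an id.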
curl sample
Server-to-server request against the shared LLM API.
curl -sS \
-X POST \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"model": "qwen3-4b",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "Explain the API in one short paragraph." }
],
"temperature": 0.4,
"max_tokens": 256
}' \
"https://kaleidovid.com/api/llm/v1/chat/completions"
OpenAI SDK sample
Drop-in Python example using a custom base URL.
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://kaleidovid.com/api/llm/v1",
)
response = client.chat.completions.create(
model="qwen3-4b",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain the API in one short paragraph."},
],
temperature=0.4,
max_tokens=256,
)
print(response.choices[0].message.content)
Live tester
Paste a raw team API key to test the customer-facing path exactly as an external caller would. If left blank, the tester falls back to your signed-in session.
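Whether the call goes through curl, the SDK, or this tester, the assistant text lives at the same place in the response JSON. A defensive extraction sketch, assuming the shared API mirrors OpenAI's response and error envelopes; extract_text is our own helper, not part of the API:

```python
import json

def extract_text(body: str) -> str:
    """Pull the assistant message out of a chat-completions response body."""
    data = json.loads(body)
    if "error" in data:  # assumed OpenAI-style error envelope
        raise RuntimeError(data["error"].get("message", "unknown error"))
    return data["choices"][0]["message"]["content"]

ok = json.dumps({"choices": [{"message": {"role": "assistant", "content": "Hi!"}}]})
print(extract_text(ok))  # Hi!
```

Raising on the error envelope keeps a failed call from being silently read as an empty completion.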
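For callers that cannot use the OpenAI SDK, the same chat request can be driven from the Python standard library. A sketch that builds (but does not send) the POST from the curl sample; calling urllib.request.urlopen(req) would actually send it, and YOUR_API_KEY is a placeholder:

```python
import json
import urllib.request

# Same payload as the curl sample above.
payload = {
    "model": "qwen3-4b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the API in one short paragraph."},
    ],
    "temperature": 0.4,
    "max_tokens": 256,
}

# Build the request without sending it; urlopen(req) would perform the call.
req = urllib.request.Request(
    "https://kaleidovid.com/api/llm/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "x-api-key": "YOUR_API_KEY",  # placeholder, not a real key
    },
    method="POST",
)
print(req.get_method(), req.full_url)
```

The stdlib path avoids any third-party dependency at the cost of hand-rolling the JSON shapes the SDK handles for you.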