Messages API¶
Generate messages using language models. Compatible with the Anthropic Messages API.
Base URL¶
`https://api.getkawai.com/v1`
Authentication¶
When authentication is enabled, include your token in the `Authorization` header, e.g. `Authorization: Bearer YOUR_API_KEY`.
Messages¶
Create messages with language models using the Anthropic Messages API format.
POST /messages¶
Create a message. Supports streaming responses with Server-Sent Events using Anthropic's event format.
Authentication: Required when auth is enabled. Token must have 'messages' endpoint access.
Headers¶
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | Bearer token for authentication |
| `Content-Type` | Yes | Must be `application/json` |
| `anthropic-version` | No | API version (optional) |
Request Body¶
Content-Type: `application/json`

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | ID of the model to use |
| `messages` | array | Yes | Array of message objects with `role` (`user`/`assistant`) and `content` |
| `max_tokens` | integer | Yes | Maximum number of tokens to generate |
| `system` | string \| array | No | System prompt as a string or an array of content blocks |
| `stream` | boolean | No | Enable streaming responses (default: `false`) |
| `tools` | array | No | List of tools the model can use |
| `temperature` | number | No | Sampling temperature (0-1) |
| `top_p` | number | No | Nucleus sampling parameter |
| `top_k` | integer | No | Top-k sampling parameter |
| `stop_sequences` | array | No | Sequences where the API will stop generating |
Response¶
Returns a message object, or streams Server-Sent Events if `stream=true`. The response includes an `anthropic-request-id` header.
Content-Type: `application/json` or `text/event-stream`
Examples¶
Basic message:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
With system prompt:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
Streaming response:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku about coding"}
    ]
  }'
```
Multi-turn conversation:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is 2+2?"},
      {"role": "assistant", "content": "2+2 equals 4."},
      {"role": "user", "content": "What about 2+3?"}
    ]
  }'
```
Vision with image URL (requires vision model):
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-vl-3b-instruct-q8_0",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {"type": "image", "source": {"type": "url", "url": "https://example.com/image.jpg"}}
        ]
      }
    ]
  }'
```
Vision with base64 image (requires vision model):
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-vl-3b-instruct-q8_0",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image"},
          {
            "type": "image",
            "source": {
              "type": "base64",
              "media_type": "image/jpeg",
              "data": "/9j/4AAQ..."
            }
          }
        ]
      }
    ]
  }'
```
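Building the base64 `source` object by hand is error-prone, so a small helper can wrap raw image bytes. A minimal sketch; the `image_block` name is hypothetical, and the block structure follows the example above:

```python
import base64

def image_block(data: bytes, media_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes in an Anthropic-style base64 image content block."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            # base64-encode the bytes and decode to a plain ASCII string for JSON
            "data": base64.b64encode(data).decode("ascii"),
        },
    }

# Example: placeholder bytes standing in for a real JPEG file's contents
block = image_block(b"\xff\xd8\xff\xe0fake-jpeg-bytes")
```

The returned dict can be placed directly in a message's `content` array alongside a `text` block.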
Tool calling:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name"
            }
          },
          "required": ["location"]
        }
      }
    ]
  }'
```
Tool result (continue conversation after tool call):
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"},
      {
        "role": "assistant",
        "content": [
          {
            "type": "tool_use",
            "id": "call_xyz789",
            "name": "get_weather",
            "input": {"location": "Paris"}
          }
        ]
      },
      {
        "role": "user",
        "content": [
          {
            "type": "tool_result",
            "tool_use_id": "call_xyz789",
            "content": "Sunny, 22°C"
          }
        ]
      }
    ],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          },
          "required": ["location"]
        }
      }
    ]
  }'
```
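The key detail in the example above is that each `tool_result` block must echo the `id` of the `tool_use` block it answers, as `tool_use_id`. A minimal sketch of building that follow-up user message; the `tool_result_message` helper is hypothetical:

```python
def tool_result_message(assistant_content: list, results: dict) -> dict:
    """Build the user message that returns tool results to the model.

    `assistant_content` is the content array from the assistant's tool-call
    turn; `results` maps tool name -> result string. The tool_use_id is
    copied from each tool_use block so the model can pair call and result.
    """
    blocks = [
        {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": results[block["name"]],
        }
        for block in assistant_content
        if block.get("type") == "tool_use"
    ]
    return {"role": "user", "content": blocks}

# The assistant's tool-call content from the example above
assistant_content = [
    {"type": "tool_use", "id": "call_xyz789", "name": "get_weather",
     "input": {"location": "Paris"}}
]
msg = tool_result_message(assistant_content, {"get_weather": "Sunny, 22°C"})
```

Append `msg` to the `messages` array and re-send the request (with the same `tools`) to let the model finish its answer.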
Response Formats¶
The Messages API returns different formats for streaming and non-streaming responses.
Non-Streaming Response¶
For non-streaming requests (stream=false or omitted), returns a complete message object.
Examples¶
```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! I'm doing well, thank you for asking. How can I help you today?"
    }
  ],
  "model": "qwen3-8b-q8_0",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 18
  }
}
```
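Because `content` is an array of blocks rather than a plain string, clients typically concatenate the `text` fields of all `text` blocks. A minimal sketch, with a hypothetical `message_text` helper applied to a response shaped like the example above:

```python
def message_text(message: dict) -> str:
    """Join the text of every text content block in a message object."""
    return "".join(
        block["text"] for block in message["content"] if block["type"] == "text"
    )

# A response object shaped like the non-streaming example
response = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
    "stop_reason": "end_turn",
}
text = message_text(response)
```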
Tool Use Response¶
When the model calls a tool, the content includes tool_use blocks with the tool call details.
Examples¶
```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "call_xyz789",
      "name": "get_weather",
      "input": {
        "location": "Paris"
      }
    }
  ],
  "model": "qwen3-8b-q8_0",
  "stop_reason": "tool_use",
  "usage": {
    "input_tokens": 50,
    "output_tokens": 25
  }
}
```
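A client can check `stop_reason` to decide whether tools need to be executed before continuing the conversation. A minimal sketch; the `pending_tool_calls` helper is hypothetical:

```python
def pending_tool_calls(message: dict) -> list:
    """Return (id, name, input) for each tool_use block; empty if no tools ran."""
    if message.get("stop_reason") != "tool_use":
        return []
    return [
        (block["id"], block["name"], block["input"])
        for block in message["content"]
        if block["type"] == "tool_use"
    ]

# A response shaped like the tool-use example above
response = {
    "content": [{"type": "tool_use", "id": "call_xyz789",
                 "name": "get_weather", "input": {"location": "Paris"}}],
    "stop_reason": "tool_use",
}
calls = pending_tool_calls(response)
```

Each tuple gives the client what it needs to run the tool and build the matching `tool_result` block.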
Streaming Events¶
For streaming requests (stream=true), the API returns Server-Sent Events with different event types following Anthropic's streaming format.
Examples¶
```text
event: message_start
data: {"type":"message_start","message":{"id":"msg_abc123","type":"message","role":"assistant","content":[],"model":"qwen3-8b-q8_0","stop_reason":null,"usage":{"input_tokens":12,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":18}}

event: message_stop
data: {"type":"message_stop"}
```
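To reassemble the full text on the client, parse each `data:` line as JSON and concatenate the `text_delta` fragments. A minimal sketch over raw SSE lines (a real client would read these from the HTTP response stream; `collect_text` is a hypothetical name):

```python
import json

def collect_text(sse_lines) -> str:
    """Accumulate text_delta fragments from SSE data lines into full text."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blanks
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "text_delta":
                parts.append(delta["text"])
    return "".join(parts)

# The delta lines from the streaming example above
stream = [
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}}',
    'event: message_stop',
    'data: {"type":"message_stop"}',
]
full_text = collect_text(stream)
```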
Streaming Tool Calls¶
When streaming tool calls, input_json_delta events provide incremental JSON for tool arguments.
Examples¶
```text
event: message_start
data: {"type":"message_start","message":{...}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"tool_use","id":"call_xyz789","name":"get_weather","input":{}}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\"location\":"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"\"Paris\"}"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use"},"usage":{"output_tokens":25}}

event: message_stop
data: {"type":"message_stop"}
```
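The `partial_json` fragments are not valid JSON on their own; the client must buffer them until `content_block_stop` and then parse the concatenation. A minimal sketch for a stream with a single tool-use block; `collect_tool_input` is a hypothetical name:

```python
import json

def collect_tool_input(sse_lines) -> dict:
    """Join input_json_delta fragments and parse the completed tool input."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "input_json_delta":
                parts.append(delta["partial_json"])
    # Only the concatenation of all fragments is valid JSON
    return json.loads("".join(parts))

# The two input_json_delta lines from the example above
stream = [
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\\"location\\":"}}',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"\\"Paris\\"}"}}',
]
tool_input = collect_tool_input(stream)
```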