Messages API¶
Generate messages using language models. Compatible with the Anthropic Messages API.
Base URL¶
`https://api.getkawai.com/v1`
Authentication¶
When authentication is enabled, include your token in the `Authorization` header, e.g. `Authorization: Bearer YOUR_API_KEY`.
Messages¶
Create messages with language models using the Anthropic Messages API format.
POST /messages¶
Create a message. Supports streaming responses with Server-Sent Events using Anthropic's event format.
Authentication: Required when auth is enabled. Token must have 'messages' endpoint access.
Headers¶
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | Bearer token for authentication |
| `Content-Type` | Yes | Must be `application/json` |
| `anthropic-version` | No | API version (optional) |
Request Body¶
Content-Type: `application/json`

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | ID of the model to use |
| `messages` | array | Yes | Array of message objects with `role` (`user`/`assistant`) and `content` |
| `max_tokens` | integer | Yes | Maximum number of tokens to generate |
| `system` | string \| array | No | System prompt as a string or an array of content blocks |
| `stream` | boolean | No | Enable streaming responses (default: `false`) |
| `tools` | array | No | List of tools the model can use |
| `temperature` | number | No | Sampling temperature (0-1) |
| `top_p` | number | No | Nucleus sampling parameter |
| `top_k` | integer | No | Top-k sampling parameter |
| `stop_sequences` | array | No | Sequences where the API will stop generating |
Response¶
Returns a message object, or streams Server-Sent Events if `stream=true`. The response includes an `anthropic-request-id` header.
Content-Type: `application/json` or `text/event-stream`
Examples¶
Basic message:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
With system prompt:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
Streaming response:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku about coding"}
    ]
  }'
```
Multi-turn conversation:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is 2+2?"},
      {"role": "assistant", "content": "2+2 equals 4."},
      {"role": "user", "content": "What about 2+3?"}
    ]
  }'
```
Vision with image URL (requires vision model):
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-vl-3b-instruct-q8_0",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {"type": "image", "source": {"type": "url", "url": "https://example.com/image.jpg"}}
        ]
      }
    ]
  }'
```
Vision with base64 image (requires vision model):
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-vl-3b-instruct-q8_0",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image"},
          {
            "type": "image",
            "source": {
              "type": "base64",
              "media_type": "image/jpeg",
              "data": "/9j/4AAQ..."
            }
          }
        ]
      }
    ]
  }'
```
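Building the base64 `source` object by hand is error-prone, so a small helper can wrap raw image bytes. A minimal sketch; the `image_block` name is hypothetical, and the block structure follows the example above:

```python
import base64

def image_block(data: bytes, media_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes in an Anthropic-style base64 image content block."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            # base64-encode the bytes and decode to a plain ASCII string for JSON
            "data": base64.b64encode(data).decode("ascii"),
        },
    }

# Example: placeholder bytes standing in for a real JPEG file's contents
block = image_block(b"\xff\xd8\xff\xe0fake-jpeg-bytes")
```

The returned dict can be placed directly in a message's `content` array alongside a `text` block.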
Tool calling:
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name"
            }
          },
          "required": ["location"]
        }
      }
    ]
  }'
```
Tool result (continue conversation after tool call):
```bash
curl -X POST https://api.getkawai.com/v1/messages \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"},
      {
        "role": "assistant",
        "content": [
          {
            "type": "tool_use",
            "id": "call_xyz789",
            "name": "get_weather",
            "input": {"location": "Paris"}
          }
        ]
      },
      {
        "role": "user",
        "content": [
          {
            "type": "tool_result",
            "tool_use_id": "call_xyz789",
            "content": "Sunny, 22°C"
          }
        ]
      }
    ],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          },
          "required": ["location"]
        }
      }
    ]
  }'
```
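The key detail in the example above is that each `tool_result` block must echo the `id` of the `tool_use` block it answers, as `tool_use_id`. A minimal sketch of building that follow-up user message; the `tool_result_message` helper is hypothetical:

```python
def tool_result_message(assistant_content: list, results: dict) -> dict:
    """Build the user message that returns tool results to the model.

    `assistant_content` is the content array from the assistant's tool-call
    turn; `results` maps tool name -> result string. The tool_use_id is
    copied from each tool_use block so the model can pair call and result.
    """
    blocks = [
        {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": results[block["name"]],
        }
        for block in assistant_content
        if block.get("type") == "tool_use"
    ]
    return {"role": "user", "content": blocks}

# The assistant's tool-call content from the example above
assistant_content = [
    {"type": "tool_use", "id": "call_xyz789", "name": "get_weather",
     "input": {"location": "Paris"}}
]
msg = tool_result_message(assistant_content, {"get_weather": "Sunny, 22°C"})
```

Append `msg` to the `messages` array and re-send the request (with the same `tools`) to let the model finish its answer.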
Response Formats¶
The Messages API returns different formats for streaming and non-streaming responses.
Non-Streaming Response¶
For non-streaming requests (stream=false or omitted), returns a complete message object.
Examples¶
```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! I'm doing well, thank you for asking. How can I help you today?"
    }
  ],
  "model": "qwen3-8b-q8_0",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 18
  }
}
```
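Because `content` is an array of blocks rather than a plain string, clients typically concatenate the `text` fields of all `text` blocks. A minimal sketch, with a hypothetical `message_text` helper applied to a response shaped like the example above:

```python
def message_text(message: dict) -> str:
    """Join the text of every text content block in a message object."""
    return "".join(
        block["text"] for block in message["content"] if block["type"] == "text"
    )

# A response object shaped like the non-streaming example
response = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
    "stop_reason": "end_turn",
}
text = message_text(response)
```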
Tool Use Response¶
When the model calls a tool, the content includes tool_use blocks with the tool call details.
Examples¶
```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "call_xyz789",
      "name": "get_weather",
      "input": {
        "location": "Paris"
      }
    }
  ],
  "model": "qwen3-8b-q8_0",
  "stop_reason": "tool_use",
  "usage": {
    "input_tokens": 50,
    "output_tokens": 25
  }
}
```
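A client can check `stop_reason` to decide whether tools need to be executed before continuing the conversation. A minimal sketch; the `pending_tool_calls` helper is hypothetical:

```python
def pending_tool_calls(message: dict) -> list:
    """Return (id, name, input) for each tool_use block; empty if no tools ran."""
    if message.get("stop_reason") != "tool_use":
        return []
    return [
        (block["id"], block["name"], block["input"])
        for block in message["content"]
        if block["type"] == "tool_use"
    ]

# A response shaped like the tool-use example above
response = {
    "content": [{"type": "tool_use", "id": "call_xyz789",
                 "name": "get_weather", "input": {"location": "Paris"}}],
    "stop_reason": "tool_use",
}
calls = pending_tool_calls(response)
```

Each tuple gives the client what it needs to run the tool and build the matching `tool_result` block.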
Streaming Events¶
For streaming requests (stream=true), the API returns Server-Sent Events with different event types following Anthropic's streaming format.
Examples¶
```text
event: message_start
data: {"type":"message_start","message":{"id":"msg_abc123","type":"message","role":"assistant","content":[],"model":"qwen3-8b-q8_0","stop_reason":null,"usage":{"input_tokens":12,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":18}}

event: message_stop
data: {"type":"message_stop"}
```
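To reassemble the full text on the client, parse each `data:` line as JSON and concatenate the `text_delta` fragments. A minimal sketch over raw SSE lines (a real client would read these from the HTTP response stream; `collect_text` is a hypothetical name):

```python
import json

def collect_text(sse_lines) -> str:
    """Accumulate text_delta fragments from SSE data lines into full text."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blanks
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "text_delta":
                parts.append(delta["text"])
    return "".join(parts)

# The delta lines from the streaming example above
stream = [
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}}',
    'event: message_stop',
    'data: {"type":"message_stop"}',
]
full_text = collect_text(stream)
```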
Streaming Tool Calls¶
When streaming tool calls, input_json_delta events provide incremental JSON for tool arguments.
Examples¶
```text
event: message_start
data: {"type":"message_start","message":{...}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"tool_use","id":"call_xyz789","name":"get_weather","input":{}}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\"location\":"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"\"Paris\"}"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use"},"usage":{"output_tokens":25}}

event: message_stop
data: {"type":"message_stop"}
```
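The `partial_json` fragments are not valid JSON on their own; the client must buffer them until `content_block_stop` and then parse the concatenation. A minimal sketch for a stream with a single tool-use block; `collect_tool_input` is a hypothetical name:

```python
import json

def collect_tool_input(sse_lines) -> dict:
    """Join input_json_delta fragments and parse the completed tool input."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "input_json_delta":
                parts.append(delta["partial_json"])
    # Only the concatenation of all fragments is valid JSON
    return json.loads("".join(parts))

# The two input_json_delta lines from the example above
stream = [
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\\"location\\":"}}',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"\\"Paris\\"}"}}',
]
tool_input = collect_tool_input(stream)
```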