## Overview

Tzafon provides an OpenAI-compatible chat completions API. You can use the standard OpenAI SDK with Tzafon's models by simply changing the base URL.

Use your Tzafon API key (`TZAFON_API_KEY`) with the OpenAI SDK; no separate authentication is required.
## Quick Start

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_tzafon_api_key",
    base_url="https://api.tzafon.ai/v1",
)

response = client.chat.completions.create(
    model="tzafon.sm-1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)

print(response.choices[0].message.content)
```
## Available Models

| Model | Description | Best For |
|---|---|---|
| `tzafon.sm-1` | Small, fast model | General tasks, quick responses |
| `tzafon.northstar.cua.sft` | Computer-use optimized | Browser/desktop automation agents |

Use `GET /v1/models` to retrieve the current list of available models.
## Endpoints

### Chat Completions

```
POST https://api.tzafon.ai/v1/chat/completions
```

Create a chat completion with conversation history.
**Request Body:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID (e.g., `tzafon.sm-1`) |
| `messages` | array | Yes | Conversation messages |
| `temperature` | number | No | Sampling temperature (0-2). Default: 1 |
| `max_tokens` | number | No | Maximum number of tokens to generate |
| `stream` | boolean | No | Enable streaming responses |
| `stop` | array | No | Stop sequences |
**Message Format:**

```json
{
  "role": "system" | "user" | "assistant",
  "content": "Message text"
}
```
### Streaming

Enable real-time token streaming for responsive UIs:

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_tzafon_api_key",
    base_url="https://api.tzafon.ai/v1",
)

stream = client.chat.completions.create(
    model="tzafon.sm-1",
    messages=[{"role": "user", "content": "Write a short poem"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
### List Models

```
GET https://api.tzafon.ai/v1/models
```

Returns the list of available models.

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_tzafon_api_key",
    base_url="https://api.tzafon.ai/v1",
)

models = client.models.list()
for model in models.data:
    print(model.id)
```
### Completions (Legacy)

```
POST https://api.tzafon.ai/v1/completions
```

For text completion without the chat format. Parameters are similar to chat completions, but the request takes a `prompt` string instead of `messages`.
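As a minimal sketch (assuming this endpoint supports the SDK's legacy `completions` interface; the prompt and parameter values here are illustrative):

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_tzafon_api_key",
    base_url="https://api.tzafon.ai/v1",
)

# Legacy completions take a single prompt string rather than a messages array
response = client.completions.create(
    model="tzafon.sm-1",
    prompt="Once upon a time",
    max_tokens=64,
    temperature=0.7,
)

print(response.choices[0].text)
```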
### Embeddings

```
POST https://api.tzafon.ai/v1/embeddings
```

Generate vector embeddings for text.
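A sketch of an embeddings request via the SDK. The model ID below is a placeholder, not a documented Tzafon model; use `GET /v1/models` to find the actual embedding model ID:

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk_your_tzafon_api_key",
    base_url="https://api.tzafon.ai/v1",
)

response = client.embeddings.create(
    model="your-embedding-model",  # placeholder: check GET /v1/models for real IDs
    input=["Tzafon provides an OpenAI-compatible API.", "Hello!"],
)

# One embedding vector per input string
vectors = [item.embedding for item in response.data]
print(len(vectors), len(vectors[0]))
```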
## Configuration Options

### Temperature Presets

| Preset | Temperature | Max Tokens | Use Case |
|---|---|---|---|
| Creative | 1.0 | 2048 | Creative writing, brainstorming |
| Balanced | 0.7 | 1024 | General tasks |
| Precise | 0.3 | 512 | Factual responses, code |
### Example with Options

```python
response = client.chat.completions.create(
    model="tzafon.sm-1",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    temperature=0.7,
    max_tokens=1024,
    stop=["<|im_end|>", "<|end_of_text|>"],
)
```
## Error Handling

The API returns standard HTTP status codes:

| Status | Description |
|---|---|
| 200 | Success |
| 400 | Bad request (invalid parameters) |
| 401 | Unauthorized (invalid API key) |
| 429 | Rate limit exceeded |
| 500+ | Server error |
```python
from openai import OpenAI, APIError, RateLimitError, AuthenticationError

client = OpenAI(
    api_key="sk_your_tzafon_api_key",
    base_url="https://api.tzafon.ai/v1",
)

try:
    response = client.chat.completions.create(
        model="tzafon.sm-1",
        messages=[{"role": "user", "content": "Hello"}],
    )
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded - retry later")
except APIError as e:
    print(f"API error: {e}")
```
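For 429 responses, a common pattern is to retry with exponential backoff. A minimal, SDK-agnostic sketch (the `with_backoff` helper and its delays are illustrative, not part of the Tzafon API; the flaky local function stands in for a rate-limited call):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, retry_on=(Exception,)):
    """Retry `call` with exponential backoff plus jitter on the given exceptions."""
    for attempt in range(max_retries):
        try:
            return call()
        except retry_on:
            if attempt == max_retries - 1:
                raise
            # Sleep base, 2*base, 4*base, ... plus random jitter before retrying
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Demo: an operation that fails twice before succeeding
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

result = with_backoff(flaky, base_delay=0.01, retry_on=(RuntimeError,))
print(result)  # ok
```

In practice you would pass a lambda wrapping `client.chat.completions.create(...)` and set `retry_on=(RateLimitError,)`.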
## Pricing

LLM API usage is billed per million tokens:

| Model | Input Tokens | Output Tokens |
|---|---|---|
| `tzafon.sm-1` | $0.20 / 1M tokens | $0.30 / 1M tokens |
| `tzafon.northstar.cua.sft` | $0.30 / 1M tokens | $1.00 / 1M tokens |
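To estimate the cost of a request, multiply the token counts by the per-million rates in the table above. A small sketch (the `usage` field on a completion response reports prompt and completion token counts in the OpenAI response format):

```python
# Per-million-token rates from the pricing table above (USD)
RATES = {
    "tzafon.sm-1": {"input": 0.20, "output": 0.30},
    "tzafon.northstar.cua.sft": {"input": 0.30, "output": 1.00},
}

def estimate_cost(model, prompt_tokens, completion_tokens):
    """Estimated USD cost of a single request."""
    rate = RATES[model]
    return (prompt_tokens * rate["input"] + completion_tokens * rate["output"]) / 1_000_000

# 10k prompt tokens + 2k completion tokens on tzafon.sm-1
cost = estimate_cost("tzafon.sm-1", 10_000, 2_000)
print(f"${cost:.6f}")  # $0.002600
```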
## Using with Computer Automation

Combine the LLM API with browser automation for AI-powered workflows:

```python
from openai import OpenAI
from tzafon import Computer

# LLM client
llm = OpenAI(
    api_key="sk_your_tzafon_api_key",
    base_url="https://api.tzafon.ai/v1",
)

# Computer client
computer_client = Computer()

with computer_client.create(kind="browser") as computer:
    computer.navigate("https://example.com")
    screenshot = computer.screenshot()

    # Use the LLM to analyze the page
    response = llm.chat.completions.create(
        model="tzafon.northstar.cua.sft",
        messages=[
            {"role": "system", "content": "You are a web automation assistant."},
            {"role": "user", "content": f"I'm on a webpage. What should I click next? Screenshot: {screenshot}"},
        ],
    )
    print(response.choices[0].message.content)
```
## Next Steps