Actions - Lightcone

Overview

Every computer instance provides core actions for browser and desktop automation. Most actions behave identically on browser and desktop instances. The exceptions are navigate, html, set_viewport, and change_proxy, which are browser-only (see the Action Reference table).

Navigation

navigate(url)

Navigate to a URL. Browser instances only.

computer.navigate("https://example.com")

Parameters:

url (string) - The URL to navigate to

Use Cases:

Opening web pages
Navigating between pages
Loading web applications

Mouse Actions

Coordinates are viewport-dependent. The same coordinates will produce different results on different screen sizes. Always:

Take a screenshot first to see the page
Measure the pixel position of elements
Use those specific coordinates for your viewport
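As a rough illustration of why coordinates do not carry over between viewports, the sketch below rescales a point measured on one screenshot to another viewport size. The helper name scale_point is hypothetical and not part of the SDK, and proportional scaling is only an assumption: it holds when the layout scales uniformly, while responsive pages often reflow, in which case a fresh screenshot is the safer route.

```python
def scale_point(x, y, measured_size, target_size):
    """Map a point measured on one viewport onto another by proportional scaling."""
    measured_w, measured_h = measured_size
    target_w, target_h = target_size
    # Scale each axis independently and round to whole pixels.
    return (round(x * target_w / measured_w), round(y * target_h / measured_h))


# A point measured at (640, 360) on a 1280x720 screenshot,
# mapped onto a 1920x1080 viewport.
point = scale_point(640, 360, (1280, 720), (1920, 1080))
print(point)
```

The resulting pair could then be passed to computer.click(*point), after confirming against a screenshot taken at the target size.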

click(x, y)

Click at specific screen coordinates.

computer.click(100, 200)

Parameters:

x (int) - X coordinate in pixels
y (int) - Y coordinate in pixels

Use Cases:

Clicking buttons
Selecting menu items
Activating UI elements

double_click(x, y) / doubleClick(x, y)

Double-click at specific coordinates.

computer.double_click(150, 250)

Parameters:

x (int) - X coordinate
y (int) - Y coordinate

Use Cases:

Opening files
Selecting text
Desktop application interactions

right_click(x, y) / rightClick(x, y)

Right-click at specific coordinates to open context menus.

computer.right_click(200, 300)

Parameters:

x (int) - X coordinate
y (int) - Y coordinate

Use Cases:

Opening context menus
Right-click actions
Desktop interactions

drag(from_x, from_y, to_x, to_y)

Drag from one position to another.

computer.drag(100, 100, 300, 300)

Parameters:

from_x (int) - Starting X coordinate
from_y (int) - Starting Y coordinate
to_x (int) - Ending X coordinate
to_y (int) - Ending Y coordinate

Use Cases:

Drag and drop
Moving windows
Selecting regions

mouse_down(x, y) / mouseDown(x, y)

Press and hold the left mouse button at specific coordinates. Use with mouse_up for fine-grained drag control.

# Using the low-level API for mouse control
client.computers.execute_action(computer.id, action={
    "type": "mouse_down",
    "x": 100,
    "y": 200
})

Parameters:

x (float) - X coordinate in pixels
y (float) - Y coordinate in pixels

Use Cases:

Fine-grained drag control
Custom drag interactions
Drawing applications

mouse_up(x, y) / mouseUp(x, y)

Release the left mouse button at specific coordinates.

client.computers.execute_action(computer.id, action={
    "type": "mouse_up",
    "x": 300,
    "y": 400
})

Parameters:

x (float) - X coordinate in pixels
y (float) - Y coordinate in pixels

Use Cases:

Complete a fine-grained drag operation
Release after mouse_down

For simple drag operations, use drag() instead. Use mouse_down/mouse_up when you need control between the press and release (e.g., moving to intermediate positions).

Keyboard Actions

type(text)

Type text at the current cursor position.

computer.type("Hello World")

Parameters:

text (string) - The text to type

Use Cases:

Filling form fields
Entering search queries
Text input

hotkey(*keys)

Send keyboard shortcut combinations.

computer.hotkey("Control", "c")    # Copy
computer.hotkey("cmd", "v")        # Paste on Mac
computer.hotkey("alt", "F4")       # Close window

Parameters:

*keys (strings) - Key names to press simultaneously

Common Shortcuts:

ctrl/cmd + c - Copy
ctrl/cmd + v - Paste
ctrl/cmd + s - Save
ctrl/cmd + f - Find
alt + tab - Switch windows
enter - Submit
escape - Cancel

Use Cases:

Keyboard shortcuts
System commands
Application controls

key_down(key) / keyDown(key)

Press and hold a keyboard key. Use with key_up for complex interactions like shift-click selection.

# Shift-click selection example
client.computers.execute_action(computer.id, action={"type": "key_down", "key": "Shift"})
computer.click(100, 200)  # First item
computer.click(100, 400)  # Last item (selects range)
client.computers.execute_action(computer.id, action={"type": "key_up", "key": "Shift"})

Parameters:

key (string) - Key name to press (e.g., “Shift”, “Control”, “Alt”)

Use Cases:

Shift-click selection
Ctrl-click multi-select
Holding modifier keys during mouse operations

key_up(key) / keyUp(key)

Release a keyboard key that was previously pressed with key_down.

client.computers.execute_action(computer.id, action={"type": "key_up", "key": "Shift"})

Parameters:

key (string) - Key name to release

Always pair key_down with key_up to avoid leaving keys stuck in a pressed state.
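One way to guarantee the pairing is a context manager that releases the key in a finally block, so the key_up is sent even if an action inside the block raises. This is a minimal sketch: hold_key and send_action are hypothetical names, and send_action stands in for whatever callable dispatches an action dict (for example a wrapper around client.computers.execute_action). Here a plain list records the actions so the example runs without a session.

```python
from contextlib import contextmanager


@contextmanager
def hold_key(send_action, key):
    """Hold `key` for the duration of the with-block, releasing it even on error."""
    send_action({"type": "key_down", "key": key})
    try:
        yield
    finally:
        # Always runs, so the key is never left pressed.
        send_action({"type": "key_up", "key": key})


# Stand-in transport: record actions instead of calling the API.
sent = []
with hold_key(sent.append, "Shift"):
    sent.append({"type": "click", "x": 100, "y": 200})  # first item
    sent.append({"type": "click", "x": 100, "y": 400})  # last item

print([action["type"] for action in sent])
```

With a live session, send_action could be something like lambda a: client.computers.execute_action(computer.id, action=a).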

Scrolling

scroll(dx, dy, x?, y?)

Scroll the viewport by delta x and delta y, optionally at a specific position.

computer.scroll(dx=0, dy=500)            # Scroll down 500px at current position
computer.scroll(dx=0, dy=-300)           # Scroll up 300px
computer.scroll(dx=100, dy=0)            # Scroll right 100px
computer.scroll(dx=0, dy=500, x=400, y=300)  # Scroll at specific coordinates

Parameters:

dx (float) - Horizontal scroll delta (positive = right, negative = left)
dy (float) - Vertical scroll delta (positive = down, negative = up)
x (float, optional) - X coordinate where the scroll action originates
y (float, optional) - Y coordinate where the scroll action originates

Use x and y when you need to scroll within a specific scrollable element (like a sidebar or modal) rather than the main page.

Use Cases:

Scrolling web pages
Loading lazy-loaded content
Viewing off-screen content
Horizontal scrolling in wide layouts
Scrolling within specific elements (use x, y to target)
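For lazy-loaded content, scrolling in several smaller steps with a pause between them tends to give the page a chance to load each section. The sketch below only computes the per-step deltas. The helper name scroll_deltas is hypothetical, and the step size is an arbitrary choice rather than a documented recommendation.

```python
def scroll_deltas(total, step):
    """Split a scroll distance into chunks of at most `step` pixels, keeping the sign."""
    if step <= 0:
        raise ValueError("step must be positive")
    sign = 1 if total >= 0 else -1
    remaining = abs(total)
    deltas = []
    while remaining > 0:
        chunk = min(step, remaining)
        deltas.append(sign * chunk)
        remaining -= chunk
    return deltas


# 1200px down in steps of at most 500px.
deltas = scroll_deltas(1200, 500)
print(deltas)
```

Each delta could then be passed as dy to computer.scroll(dx=0, dy=...), with a short computer.wait() between calls.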

Capture

screenshot(base64?)

Capture a screenshot of the current screen.

# Get screenshot as URL (default)
result = computer.screenshot()
url = result.result['screenshot_url']
print(f"Screenshot: {url}")

# Get screenshot as base64-encoded JPEG
result = computer.screenshot(base64=True)
base64_data = result.result['screenshot_url']  # Contains base64 data, not URL

Parameters:

base64 (bool, optional) - If true, returns base64-encoded JPEG data instead of a URL. Default: false

Returns:

ActionResult with screenshot_url in result dict
- When base64=false (default): Contains an HTTPS URL to the screenshot image
- When base64=true: Contains raw base64-encoded JPEG data (not a URL, despite the field name)
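Because the base64 variant returns raw encoded data in the screenshot_url field, it needs decoding before it can be written to disk. The following is a minimal sketch using only the standard library. The helper name decode_screenshot is hypothetical, and the sample payload is a stand-in (the JPEG start marker plus filler bytes), not a real screenshot.

```python
import base64

JPEG_MAGIC = b"\xff\xd8"  # every JPEG file starts with these two bytes


def decode_screenshot(data):
    """Decode base64 screenshot data and check that it looks like a JPEG."""
    raw = base64.b64decode(data)
    if not raw.startswith(JPEG_MAGIC):
        raise ValueError("decoded data does not start with the JPEG signature")
    return raw


# Stand-in payload so the example runs without a session.
sample = base64.b64encode(JPEG_MAGIC + b"\x00" * 8).decode("ascii")
image_bytes = decode_screenshot(sample)
print(len(image_bytes))
```

With a real result, the input would be result.result['screenshot_url'] from a screenshot(base64=True) call, and the returned bytes could be written to a .jpg file opened in binary mode.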

Use Cases:

Capturing evidence
Visual verification
Monitoring
Documentation
Embedding images directly (use base64=true)

html()

Get the HTML content of the current page.

result = computer.html()
html_content = computer.get_html_content(result)
print(html_content)

# With encoding detection
result = computer.html(auto_detect_encoding=True)

Parameters:

auto_detect_encoding (bool, optional) - Automatically detect character encoding

Returns:

ActionResult with html_content in result dict

Use Cases:

Web scraping
Content extraction
Page analysis
Data mining
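As an illustration of content extraction, the sketch below pulls link targets out of an HTML string with the standard library's html.parser. The names LinkCollector and extract_links are hypothetical, and the sample markup stands in for the string that computer.get_html_content(result) would return.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html_content):
    parser = LinkCollector()
    parser.feed(html_content)
    parser.close()
    return parser.links


# Stand-in markup so the example runs without a session.
sample_html = (
    '<html><body><a href="/docs">Docs</a>'
    '<a name="anchor">No href</a>'
    '<a href="https://example.com">Example</a></body></html>'
)
links = extract_links(sample_html)
print(links)
```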

debug(command)

Execute a shell command inside the session.

result = computer.debug("ls -la")
output = computer.get_debug_response(result)
print(output)

# With timeout
result = computer.debug(
    "sleep 10",
    timeout_seconds=15,
    max_output_length=10000
)

Parameters:

command (string) - Shell command to execute
timeout_seconds (int, optional) - Command timeout (default: 120)
max_output_length (int, optional) - Max output bytes (default: 65536)

Returns:

ActionResult with debug_response containing command output

Use Cases:

Running scripts inside the session environment
Debugging
File operations
System commands

The /computers/{id}/debug endpoint is deprecated for shell commands. debug() is buffered and does not stream output. On desktop sessions, use /computers/{id}/exec (streaming) or /computers/{id}/exec/sync (buffered) for command execution instead.

Shell Execution

The /exec and /exec/sync endpoints are available for desktop sessions (kind: "desktop"). These endpoints run shell commands in a real Linux environment.

exec (streaming NDJSON)

Stream stdout/stderr in real time using the HTTP endpoint. This is ideal for long-running commands or live progress updates. Desktop sessions only.

Request body:

command (string, required)
cwd (string, optional, default: /workspace)
env (object, optional)
timeout_seconds (int, optional, default: 120)

Response:

Content type: application/x-ndjson
Each line is a JSON object:
- {"type":"stdout","data":"..."}
- {"type":"stderr","data":"..."}
- {"type":"exit","code":0}
- {"type":"error","code":"...","message":"..."}

import json
import requests

url = f"https://api.tzafon.ai/computers/{computer_id}/exec"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Accept": "application/x-ndjson",
}
payload = {"command": "bash -lc 'for i in 1 2 3; do echo $i; sleep 1; done'"}

with requests.post(url, headers=headers, json=payload, stream=True) as response:
    for line in response.iter_lines(decode_unicode=True):
        if not line:
            continue
        msg = json.loads(line)
        if msg["type"] == "stdout":
            print(msg["data"], end="")
        elif msg["type"] == "stderr":
            print(msg["data"], end="")
        elif msg["type"] == "exit":
            print(f"Exit code: {msg['code']}")
            break
        elif msg["type"] == "error":
            raise RuntimeError(msg.get("message", "exec error"))

The streaming endpoint returns HTTP 200 even on malformed input. Always parse the stream and handle type: "error" lines.
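To make the stream handling easier to reuse and test, the parsing can be separated from the HTTP call. The sketch below folds an iterable of NDJSON lines into a (stdout, stderr, exit_code) tuple and raises on an error line. The function name collect_exec_stream is hypothetical, and the sample lines are hand-written to match the message shapes listed above.

```python
import json


def collect_exec_stream(lines):
    """Fold NDJSON exec messages into (stdout, stderr, exit_code)."""
    stdout, stderr, exit_code = [], [], None
    for line in lines:
        if not line.strip():
            continue  # skip keep-alive or blank lines
        msg = json.loads(line)
        kind = msg["type"]
        if kind == "stdout":
            stdout.append(msg["data"])
        elif kind == "stderr":
            stderr.append(msg["data"])
        elif kind == "exit":
            exit_code = msg["code"]
            break
        elif kind == "error":
            raise RuntimeError(f"{msg.get('code')}: {msg.get('message', 'exec error')}")
    return "".join(stdout), "".join(stderr), exit_code


# Hand-written sample stream so the example runs offline.
sample = [
    '{"type":"stdout","data":"1\\n"}',
    "",
    '{"type":"stderr","data":"warn\\n"}',
    '{"type":"stdout","data":"2\\n"}',
    '{"type":"exit","code":0}',
]
out, err, code = collect_exec_stream(sample)
print(repr(out), repr(err), code)
```

With requests, the same function could be fed response.iter_lines(decode_unicode=True) from a streaming POST. An exit_code of None means the stream ended without an exit message.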

exec/sync (buffered)

Execute a command and return buffered stdout/stderr in a single response. Desktop sessions only.

Response fields:

stdout (string)
stderr (string)
exit_code (int)

import requests

url = f"https://api.tzafon.ai/computers/{computer_id}/exec/sync"
headers = {"Authorization": f"Bearer {api_key}"}
payload = {"command": "ls -la"}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(response.json())
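Note that raise_for_status() only checks the HTTP status. Assuming a failed command is reported through exit_code in the response body rather than through an HTTP error (an assumption, not something this page confirms), a small check like the following sketch can surface command failures. The helper name require_success is hypothetical.

```python
def require_success(result):
    """Return stdout from an exec/sync response body, raising on a non-zero exit code."""
    if result["exit_code"] != 0:
        raise RuntimeError(
            f"command exited with {result['exit_code']}: {result['stderr'].strip()}"
        )
    return result["stdout"]


# Hand-written response bodies using the documented fields.
ok = require_success({"stdout": "file.txt\n", "stderr": "", "exit_code": 0})
print(ok, end="")
```

In the request example, this would be called as require_success(response.json()).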

set_viewport(width, height)

Change the browser viewport dimensions.

computer.set_viewport(1920, 1080)
computer.set_viewport(1280, 720, scale_factor=2.0)

Parameters:

width (int) - Viewport width in pixels
height (int) - Viewport height in pixels
scale_factor (float, optional) - Device pixel ratio / zoom level (default: 1.0)

Use Cases:

Testing responsive designs
Capturing full-width screenshots
Simulating different devices
Mobile viewport testing

change_proxy(proxy_url)

Change the proxy settings for the browser session. Browser instances only.

# Using the low-level API to change proxy
client.computers.execute_action(computer.id, action={
    "type": "change_proxy",
    "proxy_url": "http://user:pass@proxy-server:port"
})

Parameters:

proxy_url (string) - Proxy URL in format http://user:pass@host:port or socks5://user:pass@host:port
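Credentials containing characters such as @ or : would break the user:pass@host:port structure if inserted verbatim. The sketch below percent-encodes them with urllib.parse.quote, which is standard URL practice. Whether the service decodes percent-encoded credentials is an assumption, and build_proxy_url is a hypothetical helper name.

```python
from urllib.parse import quote


def build_proxy_url(scheme, user, password, host, port):
    """Assemble a proxy URL, percent-encoding the credentials."""
    if scheme not in ("http", "socks5"):
        raise ValueError("scheme must be 'http' or 'socks5'")
    # safe="" makes quote() encode every reserved character, including @ and :
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


proxy_url = build_proxy_url("http", "user", "p@ss:word", "proxy.example.com", 8080)
print(proxy_url)
```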

Use Cases:

Geo-targeted testing
Avoiding rate limits
Web scraping with rotating proxies
Ad verification across regions

Set the proxy immediately after creating the session, before any page navigation. See the Playwright with Proxy example for a complete workflow.

Timing

wait(seconds)

Wait for a specified duration.

computer.wait(2.0)   # Wait 2 seconds
computer.wait(0.5)   # Wait 500ms

Parameters:

seconds (float) - Duration to wait

Use Cases:

Waiting for page loads
Allowing animations to complete
Rate limiting
Synchronization delays

Implementation differs between SDKs:

Python: Executes server-side via execute_action({"type": "wait", "ms": ...}). This makes an API call.
TypeScript: Executes client-side via sleep(). This does NOT make an API call.

Both achieve the same end result (pausing execution), but Python’s wait happens on the remote session while TypeScript’s wait happens locally.
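Note the unit difference: wait() takes seconds, while the underlying wait action takes milliseconds in its ms field. A minimal sketch of that conversion, useful when building low-level or batch actions by hand (wait_action is a hypothetical helper name):

```python
def wait_action(seconds):
    """Build a low-level wait action dict from a duration in seconds."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return {"type": "wait", "ms": int(round(seconds * 1000))}


print(wait_action(2.0))
print(wait_action(0.5))
```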

Batch Actions

batch(actions)

Execute multiple actions in sequence with a single API call. The batch stops on the first error.

# Execute multiple actions in one request
result = client.computers.execute_batch(computer.id, actions=[
    {"type": "go_to_url", "url": "https://example.com"},
    {"type": "wait", "ms": 2000},
    {"type": "click", "x": 100, "y": 200},
    {"type": "type", "text": "search query"},
    {"type": "keypress", "keys": ["enter"]},
    {"type": "screenshot"}
])

Parameters:

actions (array) - Array of action objects to execute sequentially

Behavior:

Actions execute in order
Stops immediately on first error
Returns results for all executed actions

Use Cases:

Reducing latency for multi-step workflows
Atomic operation sequences
Efficient automation scripts

Use batch actions when you have a sequence of actions that don’t require intermediate inspection. This reduces network round-trips and improves performance.
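One way to apply this guidance is to treat each screenshot as an inspection point and cut a long action list into batches that each end there. The sketch below does only the splitting. The helper name split_at_screenshots is hypothetical, and choosing screenshots as the cut points is an illustrative convention rather than SDK behavior.

```python
def split_at_screenshots(actions):
    """Group actions into batches, closing a batch after each screenshot action."""
    batches, current = [], []
    for action in actions:
        current.append(action)
        if action["type"] == "screenshot":
            batches.append(current)
            current = []
    if current:
        batches.append(current)  # trailing actions with no final screenshot
    return batches


actions = [
    {"type": "go_to_url", "url": "https://example.com"},
    {"type": "wait", "ms": 2000},
    {"type": "screenshot"},
    {"type": "click", "x": 100, "y": 200},
    {"type": "type", "text": "search query"},
]
batches = split_at_screenshots(actions)
print([len(batch) for batch in batches])
```

Each batch could then be sent with client.computers.execute_batch(computer.id, actions=batch), inspecting the screenshot result before sending the next one.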

Complete Example

Here’s a workflow using multiple actions:

from tzafon import Computer

client = Computer()
with client.create(kind="browser") as computer:
    # Navigate
    computer.navigate("https://example.com")
    computer.wait(1)

    # Fill form
    computer.click(100, 200)  # Click input
    computer.type("search query")
    computer.hotkey("enter")
    computer.wait(2)

    # Scroll and capture
    computer.scroll(dx=0, dy=500)
    computer.wait(1)

    result = computer.screenshot()
    url = computer.get_screenshot_url(result)
    print(f"Screenshot: {url}")

Action Reference

Core Actions

Action	Browser	Desktop	Parameters
`navigate(url)`	✅	❌	url: string
`click(x, y)`	✅	✅	x: float, y: float
`double_click(x, y)`	✅	✅	x: float, y: float
`right_click(x, y)`	✅	✅	x: float, y: float
`drag(x1, y1, x2, y2)`	✅	✅	x1, y1, x2, y2: float
`type(text)`	✅	✅	text: string
`hotkey(*keys)`	✅	✅	*keys: string
`scroll(dx, dy, x?, y?)`	✅	✅	dx: float, dy: float, x?: float, y?: float
`screenshot()`	✅	✅	base64?: bool
`html()`	✅	❌	auto_detect_encoding?: bool
`debug(cmd)`	✅	✅	command: string, timeout?, max_output?
`set_viewport(w, h)`	✅	❌	width: int, height: int, scale_factor?: float
`change_proxy(url)`	✅	❌	proxy_url: string
`wait(seconds)`	✅	✅	seconds: float

Batch Operations

Action	Browser	Desktop	Parameters
`batch(actions)`	✅	✅	actions: array of action objects

Best Practices

Use wait() generously

Always wait after navigation, form submissions, or dynamic content loading

Verify coordinates

Test coordinate values with screenshots before automation

Handle errors

Check action results, especially for navigation and screenshots

Screenshot for verification

Capture screenshots to verify successful completion

Next Steps

Responses

Understand action results and error handling

Example

See a working example
