OpenClaw Nodes: Connecting Your AI Agent to Physical Devices
Your AI agent lives on a gateway. The gateway talks to Slack, Discord, or Telegram. But what if you want the agent to see through a camera, grab your phone's location, snap a screenshot, or run a shell command on a remote server? That's what nodes are for.
A node is a companion device — iOS, Android, macOS, or any headless Linux machine — that connects to the OpenClaw Gateway over WebSocket and exposes a command surface. Once paired, your agent can invoke those commands as naturally as any other tool call. No polling loops, no bespoke APIs. Just pairing and using.
What Is a Node?
In OpenClaw's architecture, the gateway is the always-on brain — it receives messages, runs the model, routes tool calls. A node is a peripheral device that connects to that gateway via WebSocket with role: node.
Nodes don't process messages or run models. They expose a command surface. When the agent calls a node command, the gateway forwards the request to the paired device, the device executes it, and the result comes back.
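That round trip can be sketched as a simple request/response envelope. The field names and node id below are illustrative assumptions, not OpenClaw's actual wire format:

```python
import json

# Hypothetical invoke envelope the gateway might forward to a paired node.
# Keys like "type", "node", and "payload" are assumptions for illustration.
request = {
    "type": "node.invoke",
    "node": "pixel-8",              # hypothetical node id
    "command": "camera.snap",
    "params": {"facing": "front"},
}

# The node executes the command and replies with a correlated result.
response = {
    "type": "node.result",
    "command": request["command"],
    "ok": True,
    "payload": {"imageId": "img-001"},
}

# Both sides serialize to JSON over the WebSocket connection.
wire = json.dumps(request)
assert json.loads(wire)["command"] == "camera.snap"
```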
In practice:
- Your agent can snap a photo from your phone's front or back camera
- It can get your real-time GPS location
- It can read your Android notifications and act on them
- It can run shell commands on a remote Linux server
- It can push content to a WebView canvas on any paired device
- It can record a short screen clip for debugging
The macOS menubar app also connects as a node automatically — if you're running OpenClaw on a Mac, you already have a node.
How Pairing Works
Nodes use device pairing — an explicit owner-approval step before any device can connect. No unrecognized device can join your gateway network without your approval.
Pair via Telegram (recommended for iOS/Android)
1. Message your Telegram bot: /pair
2. The bot replies with a setup code (base64 JSON containing the gateway WebSocket URL and a bootstrap token)
3. Open the OpenClaw iOS or Android app → Settings → Gateway
4. Paste the setup code and connect
5. Back in Telegram: /pair approve
Treat that setup code like a password — it's a live bootstrap token until used or expired.
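Since the setup code is just base64-encoded JSON, it's easy to see why it must be guarded: anyone who decodes it gets your gateway URL and a live token. A sketch with a made-up payload (the key names here are assumptions, not OpenClaw's real schema):

```python
import base64
import json

# Made-up payload: the article says the code contains the gateway WebSocket
# URL and a bootstrap token, but these exact key names are assumptions.
payload = {
    "gatewayUrl": "wss://gateway.example.com:18789",
    "bootstrapToken": "not-a-real-token",
}

setup_code = base64.b64encode(json.dumps(payload).encode()).decode()

# The mobile app decodes the code to learn where to connect and how to
# authenticate its pairing request.
decoded = json.loads(base64.b64decode(setup_code))
assert decoded["gatewayUrl"].startswith("wss://")
```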
Pair via CLI
```shell
# Check pending device requests
openclaw devices list

# Approve a specific request
openclaw devices approve <request-id>

# Check which nodes are online
openclaw nodes status

# Get details about a specific node
openclaw nodes describe --node <node-id>
```
Camera: Snap Photos and Record Video
iOS and Android nodes expose a full camera API:
```shell
# List available cameras
openclaw nodes camera list --node <node-id>

# Snap from both cameras
openclaw nodes camera snap --node <node-id>

# Snap from front camera only
openclaw nodes camera snap --node <node-id> --facing front

# Record a 10-second video clip
openclaw nodes camera clip --node <node-id> --duration 10s
```
Practical notes:
- Node must be foregrounded (app in foreground). Background calls return NODE_BACKGROUND_UNAVAILABLE.
- Video clips are capped at 60 seconds.
- Android will prompt for camera/microphone permissions if not granted.
I use camera snaps to verify physical setups — point the phone at a server rack, ask the agent "what does that screen say?", and get an answer. The agent calls camera snap, gets the image, runs vision analysis, and responds. No human in the loop.
Location: Real-Time GPS
```shell
# Basic location query
openclaw nodes location get --node <node-id>

# High-accuracy query with a custom timeout
openclaw nodes location get --node <node-id> \
  --accuracy precise \
  --max-age 15000 \
  --location-timeout 10000
```
Response includes latitude, longitude, accuracy in meters, and timestamp. Location is off by default and requires explicit permission.
Use cases: agents that know where you are, geofence-triggered automations, travel tracking without custom apps.
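Handling a result is straightforward. This sketch mirrors the fields listed above (latitude, longitude, accuracy, timestamp), though the exact JSON keys are an assumption, and it applies the same staleness idea as the --max-age flag:

```python
import time

def is_fresh(result, max_age_ms=15000, now_ms=None):
    """Accept a fix only if its timestamp is within max_age_ms of now."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - result["timestamp"] <= max_age_ms

# Example fix; field names are assumptions modeled on the article's list.
fix = {
    "latitude": 52.5200,
    "longitude": 13.4050,
    "accuracy": 8.0,                      # meters
    "timestamp": int(time.time() * 1000),  # epoch milliseconds
}

assert is_fresh(fix)
assert not is_fresh(fix, max_age_ms=0, now_ms=fix["timestamp"] + 1)
```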
Android: Notifications, Contacts, Calendar
Android nodes expose a rich set of personal data commands:
- device.status / device.info / device.health — battery, connectivity, device metadata
- notifications.list / notifications.actions — read and act on notifications
- photos.latest — retrieve recent photos
- contacts.search / contacts.add — query and update contacts
- calendar.events / calendar.add — read and create calendar events
- motion.activity / motion.pedometer — step counts, activity type
- sms.send — send SMS (requires telephony + permission)
Low-level invocation:
```shell
openclaw nodes invoke --node <node-id> \
  --command notifications.list \
  --params '{}'

openclaw nodes invoke --node <node-id> \
  --command device.status \
  --params '{}'

openclaw nodes invoke --node <node-id> \
  --command photos.latest \
  --params '{"limit": 3}'
```
From the agent side, these are surfaced as first-class tool calls — no raw RPC needed.
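A gateway-side tool wrapper over the raw invoke might look like this. The helper name and node id are hypothetical; only the CLI subcommands and flags come from the examples above:

```python
import json

def invoke_argv(node, command, params=None):
    """Build the argv for `openclaw nodes invoke` (sketch; nothing is executed here)."""
    return [
        "openclaw", "nodes", "invoke",
        "--node", node,
        "--command", command,
        "--params", json.dumps(params or {}),
    ]

# A tool layer would pass this argv to a process runner and parse the output.
argv = invoke_argv("pixel-8", "photos.latest", {"limit": 3})
assert argv[:3] == ["openclaw", "nodes", "invoke"]
```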
Canvas: Push Content to Any Device
Every connected node can display a Canvas — a WebView that the agent controls:
```shell
# Show a URL on the node's canvas
openclaw nodes canvas present --node <node-id> --target https://example.com

# Take a screenshot
openclaw nodes canvas snapshot --node <node-id> --format png

# Run JavaScript inside the WebView
openclaw nodes canvas eval --node <node-id> --js "document.title"

# Navigate to a new URL
openclaw nodes canvas navigate https://newpage.com --node <node-id>

# Hide the canvas
openclaw nodes canvas hide --node <node-id>
```
Screen recordings work on supporting nodes:
```shell
openclaw nodes screen record --node <node-id> --duration 10s --fps 10
```
Push a dashboard to a wall-mounted iPad, let the agent drive a browser on a remote device, or capture what's on someone's screen during debugging.
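One way to script a canvas sequence is a thin helper that assembles the CLI calls shown above. The helper and the "wall-ipad" node name are hypothetical:

```python
def canvas_argv(action, node, *extra):
    """Assemble argv for an `openclaw nodes canvas` subcommand (sketch only)."""
    return ["openclaw", "nodes", "canvas", action, "--node", node, *extra]

# Push a dashboard, grab a screenshot, then hide the canvas.
steps = [
    canvas_argv("present", "wall-ipad", "--target", "https://status.example.com"),
    canvas_argv("snapshot", "wall-ipad", "--format", "png"),
    canvas_argv("hide", "wall-ipad"),
]

assert steps[0][3] == "present"
```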
Remote Command Execution: The Node Host
This is where nodes get truly powerful for developer workflows. OpenClaw supports a headless node host — run it on any machine to expose system.run to the agent.
```shell
# On the remote machine
openclaw node run \
  --host <gateway-host> \
  --port 18789 \
  --display-name "Build Server"
```
If your gateway binds to loopback:
```shell
# Terminal A — SSH tunnel
ssh -N -L 18790:127.0.0.1:18789 user@gateway-host

# Terminal B — node host through the tunnel
export OPENCLAW_GATEWAY_TOKEN="<token>"
openclaw node run --host 127.0.0.1 --port 18790 --display-name "Build Server"
```
Approvals and the Allowlist
Remote exec is gated by an approval system:
```shell
openclaw approvals allowlist add --node "Build Server" "/usr/bin/uname"
openclaw approvals allowlist add --node "Build Server" "/usr/bin/git"
openclaw approvals allowlist add --node "Build Server" "/usr/local/bin/npm"
```
Approvals live on the node host at ~/.openclaw/exec-approvals.json. The node host controls what runs on it, not the gateway. Defense in depth.
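The gate itself can be as simple as an exact-path match. This is a sketch of the idea; OpenClaw's actual exec-approvals.json schema and matching rules are not described in this article:

```python
import json

def is_allowed(command_path, allowlist):
    """Exact-match gate: only explicitly approved binaries may run."""
    return command_path in set(allowlist)

# Mirrors the three approvals added above; the JSON shape is an assumption.
approvals_json = json.dumps(["/usr/bin/uname", "/usr/bin/git", "/usr/local/bin/npm"])
allowlist = json.loads(approvals_json)

assert is_allowed("/usr/bin/git", allowlist)
assert not is_allowed("/bin/rm", allowlist)
```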
Point Agent Exec at the Node
```shell
openclaw config set tools.exec.host node
openclaw config set tools.exec.security allowlist
openclaw config set tools.exec.node "Build Server"
```
Every exec call from the agent runs on the remote machine. The agent doesn't need to know it's remote — it just gets results back.
macOS Node: You Probably Already Have One
If you run OpenClaw on a Mac with the menubar app, your Mac is already a node. The app connects to the gateway automatically, exposing canvas controls, screen recording, and system.run.
```shell
openclaw nodes status
```
Real Workflow: Agent as Physical Operator
My home office setup:
- OpenClaw on a Mac mini (gateway + mac node)
- iPhone paired as a mobile node
- Raspberry Pi running a headless node host via SSH tunnel
The agent can:
- Snap a photo from the iPhone to check if a package arrived
- Get the iPhone's location to know if I'm home
- Run systemctl status on the Pi to check a service
- Push a status dashboard to the mac node canvas
All from a single Slack message. The agent orchestrates across all three devices, collects results, synthesizes them, and replies. No app switching, no SSH, no manual checking.
Your agent stops being limited to text. It becomes a physical operator with eyes, location awareness, and the ability to act on remote machines.
Security Notes
- Every node requires explicit approval. No drive-by pairing.
- Setup codes are one-time-use tokens. Treat them like passwords.
- Remote exec is allowlist-gated by default.
- Dangerous env vars are stripped: DYLD_, LD_, and NODE_OPTIONS are removed from exec calls.
- Clip duration is capped at 60 seconds.
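The env-stripping rule can be sketched in a few lines. The actual list OpenClaw blocks may be longer; this only reflects the three names mentioned above:

```python
# Prefixes and names to strip, per the article's security notes.
BLOCKED_PREFIXES = ("DYLD_", "LD_")
BLOCKED_NAMES = {"NODE_OPTIONS"}

def sanitize_env(env):
    """Drop loader-injection and Node-option variables before exec."""
    return {
        k: v for k, v in env.items()
        if not k.startswith(BLOCKED_PREFIXES) and k not in BLOCKED_NAMES
    }

env = {
    "PATH": "/usr/bin",
    "DYLD_INSERT_LIBRARIES": "evil.dylib",
    "LD_PRELOAD": "evil.so",
    "NODE_OPTIONS": "--inspect",
}
clean = sanitize_env(env)

assert clean == {"PATH": "/usr/bin"}
```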
Getting Started
Minimal path:
1. Install the OpenClaw app on your phone (iOS or Android)
2. From Telegram, send /pair to your bot
3. Paste the setup code into the app and connect
4. Approve from the CLI: openclaw devices list then openclaw devices approve
5. Verify: openclaw nodes status
That's it. Your agent can now reach your phone. Add camera snaps, location queries, or notification reading from there.
For a headless node host on a server, follow the SSH tunnel approach above. Takes about 10 minutes to set up and opens up a whole class of remote operations.
Originally published at openclawplaybook.ai. Get The OpenClaw Playbook — $9.99.