Phantom Router

Phantom Router is an API platform for adult AI. One account, one key, and one base URL give you a growing catalog of purpose-built endpoints. Instead of wiring up a different vendor, model, and safety stack for every capability you ship, you call a single Phantom Router endpoint and we handle the model routing, fallback, and moderation behind it.

Think of it the way OpenRouter or Together AI work for general models - but built specifically for adult AI. You integrate once; as we add endpoints, you reach them with the same key.

The endpoint catalog

Endpoint	What it does
Chat	In-character companion chat with automatic, customer-owned long-term memory.
Re-engagement	Generate a natural "opener" message for a user who has gone quiet.
Image tagging	Tag and describe a user-supplied image with a vision model.

There are also account endpoints for configuring your vector store and chat preferences. More endpoints are on the way, and the same key reaches each one as it lands.

Generation endpoints live under the /api/v1/companion/ path prefix (for example, POST /api/v1/companion/chat); account and configuration endpoints live under /api/v1/me/.
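
As a rough illustration of how the two path prefixes and a Bearer key fit together, the sketch below builds (but does not send) a request using only Python's standard library. The base URL, the key value, and the request body are placeholders assumed for this example; take the real values and the real body schema from your account and the API Reference.

```python
import json
import urllib.request

# Assumption: placeholder host, not the real Phantom Router base URL.
BASE_URL = "https://phantom-router.example"

GENERATION_PREFIX = "/api/v1/companion/"  # generation endpoints
ACCOUNT_PREFIX = "/api/v1/me/"            # account and configuration endpoints


def build_request(api_key: str, path: str, body: dict) -> urllib.request.Request:
    """Build a JSON POST request carrying a Bearer key. Nothing is sent here."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Assumption: the body below is a stand-in; the real fields are in the API Reference.
chat_request = build_request("YOUR_API_KEY", GENERATION_PREFIX + "chat", {"example": True})
print(chat_request.get_method(), chat_request.full_url)
```

Because the same key reaches every endpoint, only the path changes between calls; passing the result to urllib.request.urlopen would actually send it.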

Chat

Chat is where most integrations currently start. You describe a character with a persona, send the recent messages of a conversation, and Phantom Router returns one in-character reply - with relevant long-term memories already retrieved and woven into the prompt for you.

Memory is automatic and customer-owned:

Automatic - you never manage embeddings, summaries, or retrieval. On every turn Phantom Router embeds the conversation, retrieves the most relevant memories from your store, injects them into the system prompt, and (on a cadence) extracts new memories in the background.
Customer-owned - memories live in your own vector store (an Upstash Vector index you bring). Phantom Router holds an encrypted connection string and writes to it on your behalf; it never persists chat history on its side.
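
You never write this logic yourself, but a toy version can make the retrieve-and-inject step easier to picture. The sketch below is an in-memory stand-in, not Phantom Router's implementation and not the Upstash Vector client. The hand-written vectors, the cosine-similarity ranking, the top_k value, and the prompt layout are all assumptions made for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(store: list[dict], query_vec: list[float], top_k: int = 2) -> list[str]:
    """Return the texts of the top_k memories most similar to the query vector."""
    ranked = sorted(store, key=lambda m: cosine(m["vector"], query_vec), reverse=True)
    return [m["text"] for m in ranked[:top_k]]


def inject(persona: str, memories: list[str]) -> str:
    """Append retrieved memories to the persona to form a system prompt (hypothetical layout)."""
    lines = "\n".join(f"- {m}" for m in memories)
    return f"{persona}\n\nRelevant memories:\n{lines}"


# Hypothetical store contents; a real store would hold embeddings from an embedding model.
store = [
    {"text": "Likes hiking", "vector": [1.0, 0.0, 0.0]},
    {"text": "Has a cat named Miso", "vector": [0.0, 1.0, 0.0]},
    {"text": "Works night shifts", "vector": [0.0, 0.0, 1.0]},
]

memories = retrieve(store, query_vec=[0.9, 0.1, 0.0], top_k=2)
system_prompt = inject("You are Ava, a warm and curious companion.", memories)
print(system_prompt)
```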

Every chat turn also runs inbound and outbound moderation inline, so prohibited input is refused and unsafe output is filtered before it reaches your users.

How a chat turn works

your request ──▶ inbound moderation ──▶ embed conversation ──▶ retrieve memories (your store)
                                                                          │
                                                                          ▼
       reply ◀── outbound moderation ◀── generate ◀── assemble prompt (persona + memories + profile)
                                                                          │
                                                                          ▼
                                                  extract new memories (async, cadence-gated)
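
The stage ordering in the diagram can be sketched as a small function with stubbed stages. This is a hypothetical simplification, not the platform's code: the stage callables, the extract_every value of 5, and the turn-counter form of the cadence gate are assumptions, and the extraction step is only flagged here rather than run in the background.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TurnResult:
    reply: Optional[str]
    refused: bool
    stages: List[str] = field(default_factory=list)
    extraction_scheduled: bool = False


def run_turn(
    message: str,
    turn_index: int,
    *,
    is_prohibited: Callable[[str], bool],
    generate: Callable[[str], str],
    filter_output: Callable[[str], str],
    extract_every: int = 5,  # assumption: the real cadence is not documented here
) -> TurnResult:
    """Walk one chat turn through the stages in diagram order."""
    stages = ["inbound moderation"]
    if is_prohibited(message):
        # Prohibited input is refused before any later stage runs.
        return TurnResult(reply=None, refused=True, stages=stages)

    stages += ["embed conversation", "retrieve memories", "assemble prompt", "generate"]
    draft = generate(message)

    stages.append("outbound moderation")
    reply = filter_output(draft)

    # Cadence gate: only some turns trigger memory extraction.
    scheduled = turn_index % extract_every == 0
    if scheduled:
        stages.append("extract new memories (async)")
    return TurnResult(reply=reply, refused=False, stages=stages, extraction_scheduled=scheduled)


result = run_turn(
    "hi there",
    turn_index=5,
    is_prohibited=lambda m: False,
    generate=lambda m: "Hello again!",
    filter_output=lambda r: r,
)
print(result.stages)
```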

The Playground lets you drive this pipeline interactively - send turns with an editable persona and inspect the raw request and response for each one.

Two ways in

Surface	Use it for
The HTTP API	Production integration. A Bearer API key and a configured vector store are all you need.
The Playground	A dev console that drives the API and inspects each request and response as you chat.

Next steps

Quickstart - your first chat request in a few minutes.
Authentication - API keys and scopes.
API Reference - every client-facing endpoint.
Playground - connect, chat, and watch memory work.
