Chat
Generate a single in-character reply for a persona and a recent message buffer. Phantom retrieves relevant memories from your vector store, injects them into the prompt, runs inbound and outbound moderation, and (on a cadence) extracts new memories in the background.
POST /api/v1/companion/chat
- Auth: Bearer key with the chat:companion scope.
- Requires: a configured vector store (otherwise 400).
Construct a chat request interactively and watch the response come back in the Playground - edit the persona and per-turn options, send a turn, then copy the exact request and response JSON straight into your own code.
Request body
| Field | Type | Required | Notes |
|---|---|---|---|
| chat_id | string | ✓ | Opaque per-user conversation id. Pattern ^[A-Za-z0-9][A-Za-z0-9_.:@-]{0,127}$, max 128 chars. Reuse it across turns to accumulate memory. |
| persona | object | ✓ | The character. See below. |
| messages | array | ✓ | 1–100 turns. The last message must be role: "user". |
| model | string | - | Override the generation model (1–200 chars). |
| temperature | number | - | 0–2. Default 0.8. |
| max_tokens | integer | - | 1–32000. Defaults to the model's own limit. |
| extract_memory | boolean | - | Default true. Set false to skip background memory write-back (retrieval still runs). |
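
The chat_id pattern and the numeric ranges in the table can be checked client-side before sending, which avoids a round trip that would end in a 400. The Python sketch below is illustrative only: the helper names is_valid_chat_id and check_options are hypothetical, and the server's own validation remains authoritative.

```python
import re
from typing import List, Optional

# Pattern copied from the chat_id row of the request-body table.
CHAT_ID_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_.:@-]{0,127}$")


def is_valid_chat_id(chat_id: str) -> bool:
    """Return True if chat_id matches the documented pattern (1 to 128 chars)."""
    # fullmatch also rejects a trailing newline, which "$" alone would tolerate.
    return CHAT_ID_RE.fullmatch(chat_id) is not None


def check_options(temperature: float = 0.8,
                  max_tokens: Optional[int] = None) -> List[str]:
    """Collect problems with the optional numeric fields (hypothetical helper)."""
    problems = []
    if not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    if max_tokens is not None and not 1 <= max_tokens <= 32000:
        problems.append("max_tokens must be between 1 and 32000")
    return problems
```

For instance, is_valid_chat_id("user-42") passes, while an id that starts with a dash or runs past 128 characters does not.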
persona
Every field describes the companion character (e.g. Luna), not the user. All
fields except name are optional free text.
| Field | Type | Required | Notes |
|---|---|---|---|
| name | string | ✓ | 1–120 chars. The character's display name. |
| nickname | string | - | Up to 120 chars. What the user calls the character. |
| gender | string | - | Up to 200 chars. |
| date_of_birth | string | - | Up to 200 chars. Free text: a date, an age, or a range. |
| language | string | - | Up to 200 chars, e.g. "English, Portuguese". |
| lives_in_country | string | - | Up to 200 chars. |
| lives_in_city | string | - | Up to 200 chars. |
| comes_from_country | string | - | Up to 200 chars. |
| comes_from_city | string | - | Up to 200 chars. |
| living_situation | string | - | Up to 2000 chars. |
| daily_routine | string | - | Up to 2000 chars. |
| goals | string | - | Up to 2000 chars. |
| bio | string | - | Up to 4000 chars. Background, personality, backstory. |
| boundaries | string | - | Up to 2000 chars. Lines the character will not cross. |
| chat_style | string | - | Up to 80 chars, e.g. "playful, teasing". Slots into the generated preamble. |
| custom_instructions | string | - | Up to 8000 chars. Appended as a strong directive; does not replace the preamble. |
The persona is stateless - send it on every turn. Phantom doesn't store it.
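
Because the persona travels with every request, it can be convenient to check it once against the documented length limits before reusing it. The sketch below is an assumption-laden example, not part of the API: PERSONA_LIMITS and persona_problems are hypothetical names, the limits are copied from the persona table, and len() counts Python code points, which may not match how the server counts characters.

```python
from typing import List

# Maximum lengths copied from the persona table; only "name" is required.
PERSONA_LIMITS = {
    "name": 120, "nickname": 120, "gender": 200, "date_of_birth": 200,
    "language": 200, "lives_in_country": 200, "lives_in_city": 200,
    "comes_from_country": 200, "comes_from_city": 200,
    "living_situation": 2000, "daily_routine": 2000, "goals": 2000,
    "bio": 4000, "boundaries": 2000, "chat_style": 80,
    "custom_instructions": 8000,
}


def persona_problems(persona: dict) -> List[str]:
    """Return a list of limit violations for a persona dict (hypothetical helper)."""
    problems = []
    name = persona.get("name")
    if not isinstance(name, str) or len(name) < 1:
        problems.append("name is required (1-120 chars)")
    for field, limit in PERSONA_LIMITS.items():
        value = persona.get(field)
        # Absent optional fields and non-string values are not length-checked here.
        if isinstance(value, str) and len(value) > limit:
            problems.append(f"{field} exceeds {limit} chars")
    return problems
```

An empty result means the persona fits the documented limits; it does not guarantee the server will accept it.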
messages
Each item is { "role": "user" | "assistant", "content": "..." }, with content 1–16000 chars.
Send a rolling window of recent turns; the last one must be from the user.
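
One way to build that rolling window from a longer local history is sketched below in Python. The function name rolling_window and the default of 20 messages are assumptions chosen for illustration; only the 1–100 message range, the 1–16000 content length, and the user-last rule come from this page.

```python
from typing import Dict, List

Message = Dict[str, str]


def rolling_window(history: List[Message], max_messages: int = 20) -> List[Message]:
    """Return the most recent messages, checked against the documented rules."""
    if not 1 <= max_messages <= 100:
        raise ValueError("max_messages must be between 1 and 100")
    window = list(history[-max_messages:])
    if not window:
        raise ValueError("at least one message is required")
    for message in window:
        if message["role"] not in ("user", "assistant"):
            raise ValueError("role must be 'user' or 'assistant'")
        # len() counts code points; the server may count characters differently.
        if not 1 <= len(message["content"]) <= 16000:
            raise ValueError("content must be 1 to 16000 chars")
    if window[-1]["role"] != "user":
        raise ValueError("the last message must have role 'user'")
    return window
```

Calling it right after appending the user's newest message keeps the user-last rule satisfied without extra bookkeeping.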
Example request
```shell
curl -X POST https://api.phantomrouter.ai/api/v1/companion/chat \
  -H "Authorization: Bearer $PHANTOM_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "chat_id": "user-42",
    "persona": {
      "name": "Luna",
      "bio": "A warm, witty companion who loves astronomy.",
      "chat_style": "playful, curious"
    },
    "messages": [
      { "role": "user", "content": "Morning! Remember I had that big presentation?" }
    ],
    "temperature": 0.8
  }'
```
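
The same request can be sent from Python using only the standard library. This is a minimal sketch rather than an official client: build_chat_request and send_chat are hypothetical names, the 30-second timeout is an arbitrary choice, and urllib raises urllib.error.HTTPError for 4xx and 5xx responses, so callers would need to catch that to inspect error statuses.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
API_URL = "https://api.phantomrouter.ai/api/v1/companion/chat"


def build_chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(api_key: str, body: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_chat_request(api_key, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to log or test the exact payload without touching the network.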
Response
```json
{
  "reply": "Morning, you! Of course - the big one you were dreading. How'd it land?",
  "chat_id": "user-42",
  "moderation": { "inbound_prohibited": false, "outbound_filtered": false },
  "memory": { "retrieved_count": 3, "profile_used": true },
  "extraction_enqueued": true
}
```
| Field | Type | Notes |
|---|---|---|
| reply | string | The in-character reply. |
| chat_id | string | Echoes the request. |
| moderation.inbound_prohibited | boolean | true if the user's message was refused. |
| moderation.outbound_filtered | boolean | true if the generated reply was replaced by a safe fallback. |
| moderation.reason | string? | Present when moderation acted. |
| memory.retrieved_count | integer | How many memories were injected this turn. |
| memory.profile_used | boolean | Whether a structured profile was injected. |
| extraction_enqueued | boolean | Whether a background memory-extraction job was queued this turn. |
Moderation behavior
Moderation runs inline and returns 200, not an error:
- Inbound refusal - if the user's message is prohibited, reply is a canned refusal, moderation.inbound_prohibited is true, and nothing is generated or extracted.
- Outbound filter - if the generated reply is unsafe, it's replaced with a safe fallback and moderation.outbound_filtered is true.
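
Since moderated turns come back as 200, a client has to inspect the moderation object rather than the status code. The Python sketch below shows one possible way to branch on it; classify_turn and its return labels are hypothetical, and only the field names come from the response table.

```python
def classify_turn(response: dict) -> str:
    """Label a successful (200) chat response by what moderation did to it."""
    moderation = response.get("moderation", {})
    if moderation.get("inbound_prohibited"):
        # reply holds a canned refusal; nothing was generated or extracted.
        return "inbound_refused"
    if moderation.get("outbound_filtered"):
        # reply holds the safe fallback, not the model's original text.
        return "outbound_filtered"
    return "ok"
```

A caller might use the label to decide whether to show the reply as normal or to log moderation.reason for review.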
Memory & cadence
Retrieval runs on every turn. Extraction is cadence-gated: by default Phantom extracts new
memories every few user turns rather than on every message, and the job runs in the background.
The response reports extraction_enqueued: true on turns where a job was queued.
Set extract_memory: false to opt out of write-back for a turn while still retrieving.
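
As a small illustration of the per-turn opt-out, the Python sketch below derives a request body that skips write-back from an existing one. The helper name without_write_back is hypothetical; the only documented part is the extract_memory field itself.

```python
def without_write_back(body: dict) -> dict:
    """Return a copy of a request body that skips memory extraction this turn."""
    # The original body is left untouched so later turns keep the default (true).
    return {**body, "extract_memory": False}
```

This suits turns that should still read from memory but leave no trace, such as test messages.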
Errors
| Status | Code | When |
|---|---|---|
| 400 | INVALID_REQUEST | Body failed validation, or no vector store is configured. |
| 401 | UNAUTHORIZED | Missing or invalid key. |
| 402 | PAYMENT_REQUIRED | Insufficient credit balance. |
| 403 | FORBIDDEN | Key lacks chat:companion. |
| 503 | SERVICE_UNAVAILABLE | Transient model-provider or vector-store failure. Retry with backoff. |
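
The "retry with backoff" advice for 503 could be implemented along these lines. This Python sketch assumes a caller-supplied send function returning a (status, body) pair; call_with_backoff, the four-attempt cap, the 0.5-second base delay, and the jitter are all illustrative choices, not values specified by Phantom.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE_STATUS = 503  # the only status this page marks as transient


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send(), retrying on 503 with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        status, body = send()
        attempt += 1
        # Other statuses (400, 401, 402, 403) will not succeed on retry.
        if status != RETRYABLE_STATUS or attempt >= max_attempts:
            return status, body
        delay = base_delay * (2 ** (attempt - 1))
        sleep(delay + random.uniform(0, delay / 2))
```

Passing sleep as a parameter keeps the helper easy to test, since a fake can record delays instead of waiting.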