Chat

Generate a single in-character reply for a persona and a recent message buffer. Phantom retrieves relevant memories from your vector store, injects them into the prompt, runs inbound and outbound moderation, and (on a cadence) extracts new memories in the background.

POST /api/v1/companion/chat

Auth: Bearer key with the chat:companion scope.
Requires: a configured vector store (otherwise 400).

Build a request in the Playground

Construct a chat request interactively in the Playground and watch the response come back. Edit the persona and per-turn options, send a turn, then copy the exact request and response JSON straight into your own code.

Request body

Field	Type	Required	Notes
`chat_id`	string	✓	Opaque per-user conversation id. Pattern `^[A-Za-z0-9][A-Za-z0-9_.:@-]{0,127}$`, max 128 chars. Reuse it across turns to accumulate memory.
`persona`	object	✓	The character. See below.
`messages`	array	✓	1–100 turns. The last message must be `role: "user"`.
`model`	string	-	Override the generation model (1–200 chars).
`temperature`	number	-	`0`–`2`. Default `0.8`.
`max_tokens`	integer	-	`1`–`32000`. Defaults to the model's own limit.
`extract_memory`	boolean	-	Default `true`. Set `false` to skip background memory write-back (retrieval still runs).
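
The constraints in the table above can be checked client-side before a request is sent. The following is a minimal sketch, not Phantom's own validator: the function name `validate_chat_request` and the wording of the messages are hypothetical, and the server remains the authority on what it accepts.

```python
import re

# Pattern copied from the chat_id row above. fullmatch() anchors both ends,
# so the ^ and $ from the documented pattern are omitted here.
CHAT_ID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.:@-]{0,127}")


def validate_chat_request(body: dict) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []

    chat_id = body.get("chat_id")
    if not isinstance(chat_id, str) or not CHAT_ID_RE.fullmatch(chat_id):
        problems.append("chat_id must match the documented pattern")

    # persona is required, and name is its only required field.
    persona = body.get("persona")
    if not isinstance(persona, dict) or not persona.get("name"):
        problems.append("persona.name is required")

    messages = body.get("messages")
    if not isinstance(messages, list) or not 1 <= len(messages) <= 100:
        problems.append("messages must contain 1-100 turns")
    elif not isinstance(messages[-1], dict) or messages[-1].get("role") != "user":
        problems.append("the last message must have role 'user'")

    temperature = body.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")

    max_tokens = body.get("max_tokens")
    if max_tokens is not None and not 1 <= max_tokens <= 32000:
        problems.append("max_tokens must be between 1 and 32000")

    return problems
```

An empty list means the body passed these local checks; anything else can be surfaced to the caller before spending a request.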

`persona`

Every field describes the companion character (e.g. Luna), not the user. All fields except `name` are optional free text.

Field	Type	Required	Notes
`name`	string	✓	1–120 chars. The character's display name.
`nickname`	string	-	Up to 120 chars. What the user calls the character.
`gender`	string	-	Up to 200 chars.
`date_of_birth`	string	-	Up to 200 chars. Free text: a date, an age, or a range.
`language`	string	-	Up to 200 chars, e.g. `"English, Portuguese"`.
`lives_in_country`	string	-	Up to 200 chars.
`lives_in_city`	string	-	Up to 200 chars.
`comes_from_country`	string	-	Up to 200 chars.
`comes_from_city`	string	-	Up to 200 chars.
`living_situation`	string	-	Up to 2000 chars.
`daily_routine`	string	-	Up to 2000 chars.
`goals`	string	-	Up to 2000 chars.
`bio`	string	-	Up to 4000 chars. Background, personality, backstory.
`boundaries`	string	-	Up to 2000 chars. Lines the character will not cross.
`chat_style`	string	-	Up to 80 chars, e.g. `"playful, teasing"`. Slots into the generated preamble.
`custom_instructions`	string	-	Up to 8000 chars. Appended as a strong directive; does not replace the preamble.

The persona is stateless - send it on every turn. Phantom doesn't store it.

`messages`

Each item is `{ "role": "user" | "assistant", "content": "..." }`, with `content` 1–16000 chars. Send a rolling window of recent turns; the last one must be from the user.
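
One way to produce that rolling window from a longer local history is sketched below. The helper name `rolling_window` and the default window of 20 turns are assumptions made for illustration; only the limits (100 turns, 16000 characters, last turn from the user) come from this page.

```python
MAX_TURNS = 100       # documented upper bound on messages per request
MAX_CONTENT = 16000   # documented upper bound on characters per message


def rolling_window(history: list[dict], window: int = 20) -> list[dict]:
    """Return the most recent `window` turns, ready to send as `messages`."""
    if not history:
        raise ValueError("history is empty")
    if history[-1]["role"] != "user":
        raise ValueError("the last turn must come from the user")

    # Clamp the requested window to the documented 1-100 range.
    window = max(1, min(window, MAX_TURNS))
    recent = history[-window:]

    for turn in recent:
        if turn["role"] not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {turn['role']!r}")
        if not 1 <= len(turn["content"]) <= MAX_CONTENT:
            raise ValueError("content must be 1-16000 characters")
    return recent
```

Older turns dropped from the window are not necessarily lost to the character, since retrieval from the vector store runs on every turn.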

Example request

curl -X POST https://api.phantomrouter.ai/api/v1/companion/chat \
  -H "Authorization: Bearer $PHANTOM_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "chat_id": "user-42",
    "persona": {
      "name": "Luna",
      "bio": "A warm, witty companion who loves astronomy.",
      "chat_style": "playful, curious"
    },
    "messages": [
      { "role": "user", "content": "Morning! Remember I had that big presentation?" }
    ],
    "temperature": 0.8
  }'
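
For readers calling the endpoint from Python, the same request can be sketched with only the standard library. The function names `build_chat_request` and `send_chat` and the 30-second timeout are assumptions for this example; the URL and headers are taken from the curl command above.

```python
import json
import urllib.request

API_URL = "https://api.phantomrouter.ai/api/v1/companion/chat"


def build_chat_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for one chat turn."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(payload: dict, api_key: str, timeout: float = 30.0) -> dict:
    """Send one turn and return the decoded JSON response body.

    urlopen raises urllib.error.HTTPError for non-2xx statuses, so callers
    that want to inspect error responses should catch it.
    """
    request = build_chat_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `send_chat(payload, os.environ["PHANTOM_KEY"])` with the JSON body from the curl example would then return the response as a dictionary.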

Response

{
  "reply": "Morning, you! Of course - the big one you were dreading. How'd it land?",
  "chat_id": "user-42",
  "moderation": { "inbound_prohibited": false, "outbound_filtered": false },
  "memory": { "retrieved_count": 3, "profile_used": true },
  "extraction_enqueued": true
}

Field	Type	Notes
`reply`	string	The in-character reply.
`chat_id`	string	Echoes the request.
`moderation.inbound_prohibited`	boolean	`true` if the user's message was refused.
`moderation.outbound_filtered`	boolean	`true` if the generated reply was replaced by a safe fallback.
`moderation.reason`	string?	Present when moderation acted.
`memory.retrieved_count`	integer	How many memories were injected this turn.
`memory.profile_used`	boolean	Whether a structured profile was injected.
`extraction_enqueued`	boolean	Whether a background memory-extraction job was queued this turn.

Moderation behavior

Moderation runs inline. A moderated turn still returns 200, not an error:

Inbound refusal - if the user's message is prohibited, `reply` is a canned refusal, `moderation.inbound_prohibited` is `true`, and nothing is generated or extracted.
Outbound filter - if the generated reply is unsafe, it's replaced with a safe fallback and `moderation.outbound_filtered` is `true`.
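
Because both cases arrive as ordinary 200 responses, a client has to read the moderation flags to tell them apart. A minimal sketch follows; the `TurnOutcome` type, the `interpret_turn` helper, and the status labels are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnOutcome:
    status: str            # "ok", "inbound_refused", or "outbound_filtered"
    reply: str             # text to show: real reply, refusal, or fallback
    reason: Optional[str]  # moderation.reason, when the API includes it


def interpret_turn(response: dict) -> TurnOutcome:
    """Classify a 200 response using its moderation flags."""
    moderation = response.get("moderation", {})
    if moderation.get("inbound_prohibited"):
        status = "inbound_refused"
    elif moderation.get("outbound_filtered"):
        status = "outbound_filtered"
    else:
        status = "ok"
    return TurnOutcome(status, response["reply"], moderation.get("reason"))
```

In every case `reply` is still displayable text, so the status is mainly useful for logging or for deciding whether to store the turn in local history.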

Memory & cadence

Retrieval runs on every turn. Extraction is cadence-gated: by default Phantom extracts new memories every few user turns, not on every message. The job runs in the background, and the response reports `extraction_enqueued: true` on turns where one was queued. Set `extract_memory: false` to opt out of write-back for a turn while still retrieving.
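
The per-turn opt-out can be sketched as a small payload builder. The function name `build_payload` and its `write_memory` parameter are hypothetical; only the `extract_memory` field and its default of `true` come from this page.

```python
def build_payload(chat_id: str, persona: dict, messages: list[dict],
                  *, write_memory: bool = True) -> dict:
    """Assemble a request body, opting out of memory write-back on demand."""
    payload = {"chat_id": chat_id, "persona": persona, "messages": messages}
    if not write_memory:
        # extract_memory defaults to true server-side, so the field is only
        # sent when this turn should skip extraction. Retrieval still runs.
        payload["extract_memory"] = False
    return payload
```

A turn the user would not want remembered can be built with `write_memory=False`, while every other turn leaves the field out and follows the default cadence.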

Errors

Status	Code	When
400	`INVALID_REQUEST`	Body failed validation, or no vector store is configured.
401	`UNAUTHORIZED`	Missing or invalid key.
402	`PAYMENT_REQUIRED`	Insufficient credit balance.
403	`FORBIDDEN`	Key lacks `chat:companion`.
503	`SERVICE_UNAVAILABLE`	Transient model-provider or vector-store failure. Retry with backoff.
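
Of the statuses above, only 503 is described as transient, so it is the only one worth retrying. The sketch below shows one common approach, exponential backoff with jitter. The helper name `call_with_backoff`, the attempt count, and the delays are assumptions rather than values recommended by Phantom, and `send` stands for any caller-supplied function that performs the request and returns a `(status, body)` pair.

```python
import random
import time
from typing import Any, Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call send() until it returns a non-503 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # 400, 401, 402 and 403 will not succeed on retry; return them as-is.
        if status != 503 or attempt == max_attempts:
            return status, body
        # Exponential backoff: base_delay, 2x, 4x, ... plus up to 50% jitter.
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay / 2))

    raise RuntimeError("unreachable")  # the loop always returns
```

The `sleep` parameter exists so the waiting can be replaced in tests; production callers can leave it at the default.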
