Streaming
Egret supports real-time token streaming via Server-Sent Events (SSE), so users see responses as they're generated.
Enabling streaming
Add "stream": true to your query request:
POST /api/v1/rag/query/
Content-Type: application/json
Authorization: Api-Key ek_live_...
{
  "query": "What are the key GDPR principles?",
  "domain": "gdpr",
  "stream": true
}
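As a minimal sketch of the request above, the helper below assembles the URL, headers, and JSON body for a streaming query. The host in `BASE_URL` and the names `buildStreamRequest` and `StreamQuery` are hypothetical and not part of the Egret API; only the path, header format, and body fields come from the example request.

```typescript
// Hypothetical host; substitute the base URL of your Egret deployment.
const BASE_URL = "https://egret.example.com";

interface StreamQuery {
  query: string;
  domain: string;
}

interface StreamRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the URL and fetch options for a streaming query.
// "stream": true is always set so the server responds with SSE.
function buildStreamRequest(apiKey: string, q: StreamQuery): StreamRequest {
  return {
    url: `${BASE_URL}/api/v1/rag/query/`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Api-Key ${apiKey}`,
      },
      body: JSON.stringify({ ...q, stream: true }),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`; the response body is then the event stream described in the next section.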
SSE event format
The response is sent with Content-Type text/event-stream and carries three event types:
Token event
event: token
data: {"text": "The key principles"}
Citation event
event: citation
data: {"document": "gdpr-principles.pdf", "section": "Art. 5", "text": "..."}
Done event
event: done
data: {"credits_used": 1, "session_id": "sess_xyz789"}
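The three event shapes above can be modelled as a discriminated union. The sketch below parses one complete SSE frame (the lines between two blank lines) into such a value. The field types are inferred from the sample payloads and are an assumption, as are the names `EgretEvent` and `parseFrame`; the `type` discriminator mirrors the `event.type` field used by the typed client.

```typescript
type EgretEvent =
  | { type: "token"; text: string }
  | { type: "citation"; document: string; section: string; text: string }
  | { type: "done"; credits_used: number; session_id: string };

// Parses one SSE frame into a typed event.
// Returns null when the frame has no data or an unrecognised event name.
function parseFrame(frame: string): EgretEvent | null {
  let name = "";
  const dataLines: string[] = [];
  for (const line of frame.split(/\r?\n/)) {
    if (line.startsWith("event:")) {
      name = line.slice(6).trim();
    } else if (line.startsWith("data:")) {
      // SSE strips a single leading space after the colon.
      dataLines.push(line.slice(5).replace(/^ /, ""));
    }
  }
  if (dataLines.length === 0) return null;
  const payload = JSON.parse(dataLines.join("\n"));
  switch (name) {
    case "token":
    case "citation":
    case "done":
      // The payload is trusted to match the shape for its event name.
      return { ...payload, type: name } as EgretEvent;
    default:
      return null;
  }
}
```

Because the union is keyed on `type`, a check such as `if (event.type === "done")` lets the compiler narrow to the fields of that event.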
Client implementation
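The browser's native EventSource API only issues GET requests and cannot set custom headers, so it cannot send the POST request shown above directly. Read the response body with fetch instead, or use our typed client: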
import { createClient } from "@egret/api";
const client = createClient({ apiKey: "ek_live_..." });
for await (const event of client.streamQuery({
  query: "What are the key GDPR principles?",
  domain: "gdpr",
})) {
  if (event.type === "token") {
    process.stdout.write(event.text);
  }
}
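If you cannot use the typed client, the stream can be read by hand. Network chunks do not line up with SSE frames, so a reader has to buffer text until a blank line completes a frame. The sketch below shows one way to do that; `FrameBuffer`, `RawFrame`, and `readFrames` are hypothetical names, this is not how `@egret/api` is known to work internally, and for brevity it only handles `\n` and `\r\n` line endings.

```typescript
interface RawFrame {
  event: string;
  data: string;
}

// Accumulates decoded text and emits a frame each time a blank line completes one.
class FrameBuffer {
  private pending = "";

  push(chunk: string): RawFrame[] {
    // Normalise after appending so a "\r\n" split across chunks is still caught.
    this.pending = (this.pending + chunk).replace(/\r\n/g, "\n");
    const frames: RawFrame[] = [];
    let end: number;
    while ((end = this.pending.indexOf("\n\n")) !== -1) {
      const block = this.pending.slice(0, end);
      this.pending = this.pending.slice(end + 2);
      let event = "message"; // SSE default when no event field is present
      const data: string[] = [];
      for (const line of block.split("\n")) {
        if (line.startsWith("event:")) event = line.slice(6).trim();
        else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
      }
      if (data.length > 0) frames.push({ event, data: data.join("\n") });
    }
    return frames;
  }
}

// Reads a fetch response body and yields frames as they complete.
async function* readFrames(body: ReadableStream<Uint8Array>): AsyncGenerator<RawFrame> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  const buffer = new FrameBuffer();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters split across chunks intact.
    yield* buffer.push(decoder.decode(value, { stream: true }));
  }
}
```

A caller would pass `response.body` from a `fetch` POST to `readFrames` and run `JSON.parse(frame.data)` on each frame, stopping once a frame with `event === "done"` arrives.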
Why SSE over WebSockets?
LLM responses are unidirectional: the client sends one request and receives a stream of tokens. SSE runs over plain HTTP, so it works through proxies and CDNs and needs no protocol upgrade or dedicated connection handling on the server. Clients built on EventSource also reconnect automatically.