Tracing
Tracing is the “deep debugging” view for investigating query processing and assistant behavior.
Use tracing when:
- the sessions view is not enough to explain an answer
- you need retrieval details (what chunks were selected)
- you need timing/latency breakdowns
- you’re investigating quality regressions
What You Typically See in Tracing
Depending on configuration, a trace may include:
- user query text
- knowledge base id/name
- model/engine metadata
- retrieved chunks (content + metadata)
- scores and ranking information
- final answer text
- timestamps and latency measurements
- pipeline step markers (scrape/index/retrieve/generate, etc.)
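Concretely, a single trace might be serialized as a JSON-like record. The field names below are illustrative only, not the actual trace schema of any particular deployment:

```python
# A hypothetical trace record illustrating the fields listed above.
# All field names here are examples; your deployment's schema may differ.
trace = {
    "query": "How do I rotate API keys?",
    "kb": {"id": "kb-42", "name": "Platform Docs"},
    "model": {"engine": "example-llm", "version": "2024-01"},
    "chunks": [
        {"content": "Rotate keys under Settings > API.", "score": 0.91,
         "metadata": {"source": "settings.md"}},
    ],
    "answer": "Rotate keys from the Settings > API page.",
    "steps": [
        {"name": "retrieve", "start_ms": 0, "end_ms": 120},
        {"name": "generate", "start_ms": 120, "end_ms": 900},
    ],
}

def summarize(trace):
    """Return a one-line summary, useful when scanning many traces."""
    total_ms = trace["steps"][-1]["end_ms"] - trace["steps"][0]["start_ms"]
    return f"{trace['kb']['id']}: {len(trace['chunks'])} chunk(s), {total_ms} ms"

print(summarize(trace))  # kb-42: 1 chunk(s), 900 ms
```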
Common Investigations
“Why did it answer that?”
Check:
- The retrieved chunks: are they relevant to the question?
- The top-scoring chunks: do they contain misleading text?
- The final answer: does it quote/paraphrase retrieved content?
If retrieval is wrong:
- reindex
- improve ingestion/extraction
- reduce KB scope or adjust content structure
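One quick way to answer the "does the final answer quote/paraphrase retrieved content?" question is a rough token-overlap heuristic. This is a sketch for manual investigation, not a product feature; low overlap suggests the answer was not grounded in retrieval:

```python
import string

def _words(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {w.strip(string.punctuation) for w in text.lower().split()}

def overlap_ratio(answer: str, chunks: list[str]) -> float:
    """Fraction of answer words that appear in any retrieved chunk.

    A crude grounding heuristic: values near 0 suggest the answer
    did not come from the retrieved content.
    """
    answer_words = _words(answer)
    if not answer_words:
        return 0.0
    chunk_words: set[str] = set()
    for chunk in chunks:
        chunk_words |= _words(chunk)
    return len(answer_words & chunk_words) / len(answer_words)

chunks = ["Reindex the knowledge base after large content changes."]
print(overlap_ratio("Reindex the knowledge base.", chunks))  # 1.0
```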
“Why is it slow?”
Check:
- time spent in retrieval (vector store latency)
- time spent generating the answer (model latency)
- time spent waiting on upstream services
If the trace shows long gaps:
- verify connectivity to vector store/model provider
- check system load
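Assuming the pipeline step markers carry start/end timestamps (field names assumed, as above), idle gaps between consecutive steps can be computed directly. Large gaps usually mean waiting on an upstream service rather than doing work:

```python
def step_gaps(steps):
    """Return (previous_step, next_step, gap_ms) for each idle gap
    between consecutive pipeline steps in a trace."""
    gaps = []
    for prev, nxt in zip(steps, steps[1:]):
        gap = nxt["start_ms"] - prev["end_ms"]
        if gap > 0:
            gaps.append((prev["name"], nxt["name"], gap))
    return gaps

steps = [
    {"name": "retrieve", "start_ms": 0, "end_ms": 110},
    {"name": "rerank", "start_ms": 115, "end_ms": 140},
    {"name": "generate", "start_ms": 2140, "end_ms": 2900},
]
print(step_gaps(steps))  # [('retrieve', 'rerank', 5), ('rerank', 'generate', 2000)]
```

Here the 2000 ms gap before `generate` points at the model provider connection, not the retrieval path.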
“Why are citations missing?”
Common causes:
- KB configuration hides sources
- retrieval returned no chunks
- the chat UI is not rendering sources for the selected mode
Recommended workflow:
- confirm KB hide/show sources setting
- check the trace for retrieved chunks
- compare `/ai-chat/{kb_id}` vs the embedded test chat UI
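The workflow above can be sketched as a single diagnostic function that walks the common causes in order. The trace field and the KB setting flag are assumptions for illustration:

```python
def diagnose_missing_citations(trace: dict, kb_hides_sources: bool) -> str:
    """Check the common causes of missing citations in order
    and return the first one that matches."""
    if kb_hides_sources:
        return "KB is configured to hide sources"
    if not trace.get("chunks"):
        return "retrieval returned no chunks"
    return "chunks were retrieved; check whether the chat UI renders sources"

print(diagnose_missing_citations({"chunks": []}, kb_hides_sources=False))
# retrieval returned no chunks
```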
Filtering and Safety
Tracing can contain:
- user-entered content
- retrieved source snippets
- internal debug metadata
Operational advice:
- treat traces as sensitive data
- restrict access to tracing in production environments
- filter or redact user content before exporting or sharing a trace
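When a trace must leave a production environment (for example, attached to a bug report), a minimal redaction pass might look like this. The field names are assumed, matching the illustrative schema used earlier:

```python
# Fields assumed to hold user content or source snippets (illustrative names).
SENSITIVE_FIELDS = {"query", "answer", "chunks"}

def redact(trace: dict) -> dict:
    """Return a copy of the trace with user content and source snippets
    replaced, keeping only timing and pipeline metadata."""
    return {key: ("[REDACTED]" if key in SENSITIVE_FIELDS else value)
            for key, value in trace.items()}

trace = {"query": "secret question", "steps": [{"name": "retrieve"}]}
print(redact(trace))  # {'query': '[REDACTED]', 'steps': [{'name': 'retrieve'}]}
```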