Documentation

INFEREN

Clear guides, reference pages, and operational notes in one place.

Tracing

Tracing is the “deep debugging” view for investigating query processing and assistant behavior.

Use tracing when:

  • the sessions view is not enough to explain an answer
  • you need retrieval details (what chunks were selected)
  • you need timing/latency breakdowns
  • you’re investigating quality regressions

What You Typically See in Tracing

Depending on configuration, a trace may include:

  • user query text
  • knowledge base id/name
  • model/engine metadata
  • retrieved chunks (content + metadata)
  • scores and ranking information
  • final answer text
  • timestamps and latency measurements
  • pipeline step markers (scrape/index/retrieve/generate, etc.)
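As a mental model, a trace can be thought of as a structured record combining these fields. The sketch below is illustrative only — the field and class names (`TraceRecord`, `RetrievedChunk`, `timings_ms`) are assumptions, not the product's actual trace schema.

```python
from dataclasses import dataclass

@dataclass
class RetrievedChunk:
    content: str   # chunk text
    source: str    # where the chunk came from
    score: float   # ranking score from retrieval

@dataclass
class TraceRecord:
    query: str                    # user query text
    kb_id: str                    # knowledge base id
    model: str                    # model/engine metadata
    chunks: list[RetrievedChunk]  # retrieved chunks with scores
    answer: str                   # final answer text
    timings_ms: dict[str, int]    # pipeline step -> duration

# A hypothetical trace for one query:
trace = TraceRecord(
    query="How do I rotate API keys?",
    kb_id="kb-docs",
    model="example-model",
    chunks=[RetrievedChunk("Rotate keys under Settings > API.", "admin-guide", 0.91)],
    answer="Open Settings > API and choose Rotate.",
    timings_ms={"retrieve": 42, "generate": 800},
)
print(trace.timings_ms["retrieve"])
```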

Common Investigations

“Why did it answer that?”

Check:

  1. The retrieved chunks: are they relevant to the question?
  2. The top-scoring chunks: do they contain misleading text?
  3. The final answer: does it quote/paraphrase retrieved content?
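A quick way to triage step 1 offline is to compare each retrieved chunk against the question with a rough relevance cue, such as word overlap. This is a minimal sketch, not part of the product — high retrieval scores with near-zero overlap are a hint that the top-scoring chunks may be misleading.

```python
import re

def keyword_overlap(question: str, chunk: str) -> float:
    """Fraction of question words that appear in the chunk (rough relevance cue)."""
    q = set(re.findall(r"\w+", question.lower()))
    c = set(re.findall(r"\w+", chunk.lower()))
    return len(q & c) / max(len(q), 1)

question = "Where can I find pricing tiers?"
ranked_chunks = [  # (chunk text, retrieval score), highest score first
    ("Pricing tiers are listed on the billing page.", 0.88),
    ("Reset your password from the login screen.", 0.45),
]
for text, score in ranked_chunks:
    print(f"score={score:.2f} overlap={keyword_overlap(question, text):.2f}  {text}")
```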

If retrieval is wrong:

  • reindex the knowledge base
  • improve ingestion/extraction (scraping, chunking, cleanup)
  • reduce the KB's scope or adjust content structure

“Why is it slow?”

Check:

  • time spent in retrieval (vector store latency)
  • time spent generating the answer (model latency)
  • time spent waiting on upstream services
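If the trace exposes timestamps for pipeline step markers, the three buckets above can be computed directly. The marker names below are illustrative, not the product's actual schema; anything not covered by retrieval or generation shows up as unaccounted time (upstream waiting or internal overhead).

```python
# Hypothetical (step, ms-since-request) markers pulled from one trace.
markers = [
    ("request_received", 0),
    ("retrieve_start", 5),
    ("retrieve_end", 160),
    ("generate_start", 165),
    ("generate_end", 1420),
]

def step_durations(markers: list[tuple[str, int]]) -> dict[str, int]:
    ts = dict(markers)
    retrieval = ts["retrieve_end"] - ts["retrieve_start"]
    generation = ts["generate_end"] - ts["generate_start"]
    total = ts["generate_end"] - ts["request_received"]
    # Time not spent retrieving or generating: upstream waits, queueing, glue code.
    return {
        "retrieval_ms": retrieval,
        "generation_ms": generation,
        "unaccounted_ms": total - retrieval - generation,
    }

print(step_durations(markers))
# {'retrieval_ms': 155, 'generation_ms': 1255, 'unaccounted_ms': 10}
```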

If the trace shows long gaps:

  • verify connectivity to vector store/model provider
  • check system load
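A simple connectivity check is a timed TCP connect to the dependency. The sketch below uses only the standard library; the example hostnames in the comment are placeholders, not real endpoints.

```python
import socket
import time

def probe(host: str, port: int, timeout: float = 2.0):
    """Return TCP connect time in seconds, or None if unreachable."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None

# Hypothetical usage:
#   probe("vector-store.internal", 6333)
#   probe("model-provider.internal", 443)
```

A `None` result or a connect time near the timeout points at network trouble rather than model or retrieval latency.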

“Why are citations missing?”

Common causes:

  • KB configuration hides sources
  • retrieval returned no chunks
  • the chat UI is not rendering sources for the selected mode

Recommended workflow:

  • confirm KB hide/show sources setting
  • check the trace for retrieved chunks
  • compare /ai-chat/{kb_id} vs embedded test chat UI
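The workflow above checks the causes in order of likelihood. A sketch of that decision order, assuming a simplified trace shape (`kb.hide_sources` and `chunks` are invented field names, not the product's schema):

```python
def diagnose_missing_citations(trace: dict) -> str:
    """Return the most likely cause of missing citations, checked in order.

    Assumed trace shape: {"kb": {"hide_sources": bool}, "chunks": [...]}
    """
    if trace.get("kb", {}).get("hide_sources"):
        return "kb-config-hides-sources"
    if not trace.get("chunks"):
        return "no-chunks-retrieved"
    # Chunks exist and sources are enabled: suspect the chat UI for this mode.
    return "check-ui-rendering"

print(diagnose_missing_citations({"kb": {"hide_sources": False}, "chunks": []}))
```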

Filtering and Safety

Tracing can contain:

  • user-entered content
  • retrieved source snippets
  • internal debug metadata

Operational advice:

  • treat traces as sensitive
  • restrict access to tracing in production environments
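When a trace does need to leave a restricted environment (for example, attached to a bug report), strip the sensitive fields first. A minimal sketch — the field names in `SENSITIVE_KEYS` are assumptions about what carries user content and source snippets:

```python
SENSITIVE_KEYS = {"query", "answer", "chunks"}  # assumed field names

def redact(trace: dict) -> dict:
    """Replace user-entered content and source snippets before sharing a trace."""
    return {k: ("[REDACTED]" if k in SENSITIVE_KEYS else v)
            for k, v in trace.items()}

shared = redact({"query": "internal question", "kb_id": "kb-docs", "latency_ms": 200})
print(shared)
# {'query': '[REDACTED]', 'kb_id': 'kb-docs', 'latency_ms': 200}
```

Timing and identifier fields survive redaction, so latency investigations remain possible on the shared copy.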