Knowledge Bases

Knowledge bases (KBs) are the admin-managed content collections that power retrieval and chat.

This page explains:

what you can do from the KB list and KB detail pages
the KB lifecycle/status model
common ingestion and maintenance operations
practical troubleshooting

KB List Page (Overview)

The main KB list is designed for quick operations:

search/filter knowledge bases
see status at a glance (ready vs scraping vs failed)
open chat quickly for a KB
open configuration / maintenance actions

Common information shown per KB:

Name (display name)
Source URL (if URL-based)
Status (queued/scraping/indexing/ready/failed/paused variants)
Document count
Last updated or processing timestamps

Creating a Knowledge Base

URL ingestion (most common)

Use this when the content is reachable by URL.

Typical inputs:

source_url
optional name
optional description
optional preset_id (ingestion preset)
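
As a rough illustration of the inputs above, the sketch below assembles a creation request body. The field names come from the list; the function name, the validation rule, and the idea of omitting unset optional fields are assumptions, not the documented API.

```python
from urllib.parse import urlparse


def build_create_kb_payload(source_url, name=None, description=None, preset_id=None):
    """Build a request body for creating a URL-based KB (hypothetical shape)."""
    parsed = urlparse(source_url)
    # Reject anything that is not an absolute http(s) URL before it reaches the scraper.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"source_url must be an absolute http(s) URL: {source_url!r}")

    payload = {"source_url": source_url}
    # Optional fields are sent only when provided, so server-side defaults can apply.
    optional = {"name": name, "description": description, "preset_id": preset_id}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

Validating the URL on the client side is a design choice here: it catches typos before a KB record is created and left in a failed state.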

What happens after creation:

KB record is created immediately.
Background processing begins: scraping/extraction, then chunking/embedding (indexing).
The status updates as the pipeline progresses.
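
Because processing runs in the background, a script that creates a KB usually has to poll for the outcome. The following is a minimal sketch under stated assumptions: `fetch_status` is a hypothetical callable you supply (for example, a wrapper around whatever endpoint returns the KB record), and the interval and timeout defaults are arbitrary.

```python
import time

TERMINAL_STATUSES = {"ready", "failed"}


def wait_for_kb(fetch_status, kb_id, interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Poll until the KB reaches a terminal status; return (final_status, history)."""
    history = []
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status(kb_id)
        # Record only transitions so the history reads like the pipeline stages.
        if not history or history[-1] != status:
            history.append(status)
        if status in TERMINAL_STATUSES:
            return status, history
        if time.monotonic() >= deadline:
            raise TimeoutError(f"KB {kb_id} still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

A paused KB never reaches a terminal status on its own, so the timeout is what ends the loop in that case.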

File upload ingestion (if enabled)

Use this when you have a local file to ingest.

What to expect:

the upload endpoint receives the file
extracted text is stored as a document
indexing runs similarly to URL ingestion

KB Lifecycle & Status Model

KBs move through a pipeline. The exact implementation can vary, but the operational meaning is consistent.

Common statuses

queued: waiting to start work
scraping: fetching/extracting content
indexing: chunking and embedding content
ready: searchable/chat-ready
failed: pipeline stopped due to an error

Paused variants

Paused variants may appear when an in-flight job is intentionally stopped:

paused during scraping
paused during indexing
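
One way to model these statuses in client code is an enum plus a small resume map. This is only a sketch: the five common status strings come from the list above, but the two paused identifiers (`paused_scraping`, `paused_indexing`) are assumed names, since this page does not give the exact values.

```python
from enum import Enum


class KBStatus(Enum):
    QUEUED = "queued"
    SCRAPING = "scraping"
    INDEXING = "indexing"
    READY = "ready"
    FAILED = "failed"
    # The two values below are assumed; this page only says paused variants exist.
    PAUSED_SCRAPING = "paused_scraping"
    PAUSED_INDEXING = "paused_indexing"


# Stage a paused job returns to when it is resumed.
RESUME_TARGET = {
    KBStatus.PAUSED_SCRAPING: KBStatus.SCRAPING,
    KBStatus.PAUSED_INDEXING: KBStatus.INDEXING,
}


def is_in_flight(status):
    """True while background work is pending or running."""
    return status in (KBStatus.QUEUED, KBStatus.SCRAPING, KBStatus.INDEXING)


def is_paused(status):
    """True for either paused variant."""
    return status in RESUME_TARGET
```

Parsing a raw string with `KBStatus("ready")` raises `ValueError` for unknown values, which surfaces unexpected statuses early instead of silently ignoring them.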

Operational definition of “ready”

Treat ready as:

documents exist
index exists
chat and search can run reliably

If a KB is `ready` but answers are poor, use document preview and tracing to diagnose retrieval quality.
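
The operational definition can be expressed as a small check that reports why a KB should not yet be treated as ready. The `KBSnapshot` type and its field names are hypothetical; map them to whatever your KB record actually exposes.

```python
from dataclasses import dataclass


@dataclass
class KBSnapshot:
    status: str
    document_count: int
    index_exists: bool


def readiness_problems(kb):
    """Return the reasons a KB should not be treated as ready (empty list = ready)."""
    problems = []
    if kb.status != "ready":
        problems.append(f"status is {kb.status!r}, not 'ready'")
    if kb.document_count <= 0:
        problems.append("no documents stored")
    if not kb.index_exists:
        problems.append("index missing")
    return problems
```

Returning a list of reasons rather than a boolean makes the result directly usable in an alert or log line.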

KB Detail Page (Maintenance)

The KB detail view is the main “maintenance” screen.

Documents

Typical document operations:

list documents in the KB
search within documents
preview extracted content
delete a single document

When previewing a document, look for:

missing sections (bad extraction)
duplicated content (scrape loops)
navigation boilerplate dominating the extracted text (needs smarter extraction)
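
Some of these checks can be approximated automatically on the extracted text. The sketch below counts repeated blank-line-separated blocks, which tends to reveal both scrape loops and repeated navigation menus. The function name, the block splitting rule, and the length threshold are all assumptions; it is a heuristic, not a replacement for reading the preview.

```python
from collections import Counter


def preview_report(text, min_block_chars=20):
    """Summarize repetition in extracted text, split on blank lines."""
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    counts = Counter(blocks)
    # Blocks that occur more than once and are long enough to be meaningful.
    repeated = {b: n for b, n in counts.items() if n > 1 and len(b) >= min_block_chars}
    duplicate_blocks = sum(n - 1 for n in repeated.values())
    return {
        "blocks": len(blocks),
        "duplicate_blocks": duplicate_blocks,
        "duplicate_ratio": duplicate_blocks / len(blocks) if blocks else 0.0,
        "most_repeated": max(repeated, key=repeated.get) if repeated else None,
    }
```

A high `duplicate_ratio` with a menu-like `most_repeated` block suggests boilerplate; a high ratio with long body text suggests the same page was scraped more than once.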

Embedded Test Chat

The KB detail page often includes an embedded test chat panel for fast iteration.

Use it to:

run 2–3 quick smoke-test questions
validate retrieval before sharing the full chat link
compare behavior with/without citations (if configurable)

Configuration

Common configuration fields (names may vary by UI):

Name / description
Language
Theme color / icon
Hide sources: suppress visible citations in the chat UI
Scheduling: automatic re-scrape settings (if available)

Reindexing and Rescraping

These operations exist because source content changes over time and the index can fall out of sync with the stored documents.

Reindex only

Use reindex when:

the stored extracted content is correct
you want to rebuild chunking/embedding

Rescrape then index

Use rescrape when:

the upstream site/file changed
extraction was wrong (missing/garbled text)
you updated ingestion settings that affect extraction
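
The choice between the two operations reduces to one question: is the stored extracted text still correct? A minimal sketch of that decision, with hypothetical parameter names mirroring the conditions above:

```python
def choose_maintenance_action(source_changed, extraction_wrong, extraction_settings_changed):
    """Pick 'rescrape' or 'reindex' from the conditions listed above."""
    # Any condition that invalidates the stored text requires fetching it again.
    if source_changed or extraction_wrong or extraction_settings_changed:
        return "rescrape"
    # Stored text is still correct, so rebuilding chunks and embeddings is enough.
    return "reindex"
```

Rescrape is followed by indexing, so it also covers the reindex case at the cost of refetching the content.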

Pause / Resume

Pause/resume is useful when:

a scrape is taking too long with the wrong settings
you need to temporarily reduce system load
you want to stop and adjust parameters before continuing

Archiving vs Deletion

Delete

permanently removes the KB and its documents
use only when you do not need recovery

Troubleshooting

KB is stuck in `scraping` or `indexing`

Check:

whether there is a pause control available
tracing/pipeline logs for an error or long-running step
whether the source URL is slow or blocked

KB is `failed`

Common causes:

unreachable URL / blocked by remote host
extraction error (invalid document, unsupported format)
indexing error (vector store connectivity, embedding errors)

Recommended workflow:

Open KB detail → look for error message, document counts, and timestamps.
Try rescrape with a smaller scope (less depth, fewer links) if available.
Validate the source URL is reachable from the deployment environment.
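
For the last step, a quick reachability probe run from the deployment host can separate "host is blocking us" from "nothing answered". This sketch uses only the Python standard library; the function name, the `User-Agent` string, and the choice of a `HEAD` request are assumptions, and some servers reject `HEAD` even when a normal page fetch would succeed.

```python
import urllib.error
import urllib.request


def check_source_url(url, timeout_s=10.0, opener=urllib.request.urlopen):
    """Return (reachable, detail) for a source URL, as seen from this machine."""
    request = urllib.request.Request(url, method="HEAD", headers={"User-Agent": "kb-check/0.1"})
    try:
        with opener(request, timeout=timeout_s) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The host answered but refused the request (for example, 403 when bots are blocked).
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # DNS failure, refused connection, or timeout: nothing usable answered.
        return False, f"unreachable: {exc}"
```

The `opener` parameter exists so the probe can be exercised without network access; in normal use, leave it at its default.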

Chat answers are irrelevant

Common causes:

content was extracted incorrectly (garbage in → garbage out)
KB is too broad or poorly structured (retrieval noise)
embeddings/index not aligned with content (needs reindex)

Recommended workflow:

Preview documents to confirm extracted text quality.
Reindex the KB.
Use tracing to inspect retrieved chunks for a query.
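
When inspecting retrieved chunks from a trace, a crude lexical check can point at the ones worth reading first. The sketch below assumes you have copied the query and the retrieved chunk texts out of the trace; the function names are hypothetical. Semantic retrieval can legitimately return chunks with no shared words, so treat a flag as a prompt to look, not as a verdict.

```python
import re


def _terms(text):
    # Lowercased alphanumeric tokens of three or more characters.
    return set(re.findall(r"[a-z0-9]{3,}", text.lower()))


def flag_offtopic_chunks(query, chunks):
    """Return indexes of retrieved chunks that share no terms with the query."""
    query_terms = _terms(query)
    return [i for i, chunk in enumerate(chunks) if not (query_terms & _terms(chunk))]
```

If most flagged chunks turn out to be menus or footers, the problem is extraction rather than indexing, and a reindex alone is unlikely to help.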
