Step-by-step: generate and grow a workspace knowledge base.
The knowledge base (KB) is the shared memory of a workspace. Orkestral scans your repositories, writes structured pages about them, links those pages together, and indexes everything for fast retrieval. Your agents read this KB before they plan or touch code, so the better your KB, the sharper their work.
This guide walks you through generating a KB from your sources, watching the analysis run, browsing the result (pages, wikilinks, and the graph), and understanding how agents consume it.
Generated from your repos
A deterministic scan maps files, languages, dependencies, entrypoints, and risks. An AI pass writes deep architecture pages on top.
Hybrid search
Pages are indexed with lexical search (BM25) plus local embeddings, then merged into one ranked result for agents and for you.
Linked and visual
Wikilinks and entity relations connect pages into a navigable graph you can explore in the galaxy view.
Local and free
Scanning and indexing run on your machine. The repository overview can be written by the bundled local model at zero API cost.
When you analyze a source, Orkestral runs a pipeline. Understanding the phases helps you read the progress and know what to expect.
1
Walk the repository
Orkestral walks the source folder, skipping noise like node_modules, .git, dist, build, coverage, and other build or vendor directories. It keeps documentation files (.md, .mdx, .markdown), code files (.ts, .tsx, .js, .py, .go, .rs, .java, and more), and key config files (package.json, tsconfig.json, Dockerfile). Files over 256 KB are skipped, and the scan stops at 800 files to keep the UI responsive.
2
Extract entities
Dependencies from package.json become tech entities (runtime and dev). Imports parsed from a sample of code files add more external libraries as entities. These feed the graph even before any AI runs.
3
Write the base overview
A root page titled Repo: <name> is created. Its overview can be written by the bundled local model (the Forge) from the in-memory inventory of languages, top directories, and docs. If the local route is unavailable or returns nothing, a deterministic summary is used instead.
4
Create deterministic coverage pages
Seven coverage pages are created as children of the root: a structural map, dependencies and scripts, entrypoints, a code inventory, contracts and integrations, tests and quality, and reading risks. These exist immediately, even if the AI pass fails later.
5
Run the deep AI analysis
An orchestrator agent (Claude or Codex) is spawned with your repo as its working directory and the Orkestral MCP tools attached. It reads the important files and writes rich pages: Overview, Architecture, Tech Stack, Dependencies, Directory Structure, Main Flows, Pain Points and Risks, Conventions, and Setup. It links pages with wikilinks as it goes.
6
Snapshot and index
Chunks are rebuilt, a binary snapshot is written to disk, and an embedding job is queued. Search indexing and embeddings make the pages retrievable.
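The entity extraction step above can be sketched as follows. This is a minimal illustration, not Orkestral's implementation: the function names, the "runtime" and "dev" labels, and the import regex are assumptions, and the regex only covers ES-style `from "pkg"` imports.

```python
import json
import re

# Hypothetical pattern: matches the module specifier in `... from "pkg"`.
IMPORT_RE = re.compile(r"""from\s+['"]([^'"]+)['"]""")


def extract_tech_entities(package_json_text):
    """Turn package.json dependencies into (name, kind) tech entities."""
    manifest = json.loads(package_json_text)
    entities = {}
    # Runtime dependencies first, so a package listed in both sections
    # keeps its runtime classification.
    for name in manifest.get("dependencies", {}):
        entities[name] = "runtime"
    for name in manifest.get("devDependencies", {}):
        entities.setdefault(name, "dev")
    return sorted(entities.items())


def external_imports(source):
    """Collect external package names imported by one code file."""
    names = set()
    for spec in IMPORT_RE.findall(source):
        if spec.startswith((".", "/")):
            continue  # relative or absolute path, not an external library
        parts = spec.split("/")
        # Scoped packages keep two segments (@scope/name); others keep one.
        names.add("/".join(parts[:2]) if spec.startswith("@") else parts[0])
    return sorted(names)
```

Both functions are purely deterministic, which matches the point of this step: the graph gets entities before any model is involved.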
If the AI pass succeeds and writes a rich tree (at least four of its own pages), the shallow deterministic coverage pages are archived (not deleted) so the KB shows the richer analysis. If the AI pass fails, the deterministic pages stay visible so you always have a usable base.
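The archiving rule above can be expressed as a small decision function. This is a sketch under assumptions: `finalize_coverage`, its parameters, and the "archived" and "visible" status strings are hypothetical names chosen for illustration. Only the threshold of four AI-written pages comes from this guide.

```python
def finalize_coverage(ai_succeeded, ai_page_count, coverage_pages, min_ai_pages=4):
    """Decide what happens to the deterministic coverage pages after the AI pass.

    Coverage pages are archived (never deleted) only when the AI pass
    succeeded and wrote a rich tree of at least `min_ai_pages` pages.
    """
    rich_tree = ai_succeeded and ai_page_count >= min_ai_pages
    status = "archived" if rich_tree else "visible"
    return {title: status for title in coverage_pages}
```

A failed pass, or a successful pass that wrote only a few pages, leaves every coverage page visible, so the KB never ends up emptier than the deterministic base.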
Analysis reads files from disk, so the source must have a local path that exists. If a source points only to a remote repo, clone it first. See Connect your repos.
An executable agent for the deep pass
The deep AI analysis needs a runnable agent in the workspace (a claude_local or codex_local adapter). When more than one agent qualifies, Orkestral prefers the orchestrator (CEO). Without a runnable agent, the scan still produces the deterministic base, but you get no AI-written architecture pages. See Hire your team.
The Forge for a local overview (optional)
When the bundled local model is available and routing allows it, the repository overview is written locally at zero API cost. This is optional: a deterministic overview is used as a fallback.
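If you want to check the prerequisites above from a script before starting an analysis, a preflight along these lines may help. It is a sketch, not part of Orkestral: `preflight` and its messages are hypothetical, and finding the `claude` or `codex` CLI on your PATH is only a proxy for having a working adapter in the workspace.

```python
import shutil
from pathlib import Path

# CLI names mentioned in this guide for the deep AI pass.
AGENT_CLIS = ("claude", "codex")


def preflight(source_path, which=shutil.which):
    """Return a list of problems that would limit or block an analysis.

    `which` is injectable so the check can be tested without touching PATH.
    """
    problems = []
    if not source_path or not Path(source_path).is_dir():
        problems.append("source has no valid local path; clone it first")
    if not any(which(cli) for cli in AGENT_CLIS):
        problems.append(
            "no agent CLI (claude or codex) on PATH; "
            "only the deterministic base will be produced"
        )
    return problems
```

An empty list means both checks passed. A missing agent CLI is a warning rather than a blocker, because the deterministic base is still generated without it.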
Go to the Knowledge area of your workspace. If you have never run an analysis, you see an empty state inviting you to generate the KB from a source.
2
Pick a source to analyze
Choose the source (repository or folder) you want to map. Orkestral creates the Repo: <name> root page right away and starts the job in the background, so the UI stays responsive.
Analyze your most important source first. You can analyze more sources later, and each one becomes its own planet in the graph.
3
Watch the phases
Progress streams live. You see the current phase (walk, coverage-pages, ai-analysis, snapshot) and a running count of pages, entities, and files. During the AI pass, the tool calls the agent makes are surfaced as progress, so you can see it reading and writing pages.
4
Cancel if needed
You can cancel a running job at any time. The job stops, its status becomes cancelled, and anything already written (the root page, entities, coverage pages) stays in place.
5
Review the result
When the job completes, the root page and its children are populated. If the AI pass had a problem, the job finishes with a warning and the deterministic base remains, so you still have coverage.
Re-analyzing a source clears the old auto-generated pages for that source (except the root) before writing fresh ones. Pages you wrote by hand are a different kind of page and are not touched. Edits you made to auto-generated pages are not protected, so do not rely on them surviving a re-analysis.
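The clearing rule for re-analysis can be sketched as a filter over pages. The `Page` fields and the "auto" and "manual" kind labels below are hypothetical; the guide only states that auto-generated pages of the source are cleared, the root is kept, and hand-written pages are left alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    title: str
    source: str
    kind: str  # "auto" or "manual" (hypothetical labels)
    is_root: bool = False


def pages_to_clear(pages, source):
    """Select the pages a re-analysis of `source` would clear.

    Only auto-generated, non-root pages of that source qualify.
    """
    return [
        page
        for page in pages
        if page.source == source and page.kind == "auto" and not page.is_root
    ]
```

Because the filter looks at the page kind and not at its edit history, an auto-generated page you edited by hand is still selected. Move anything you want to keep into a page of your own.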
Once the KB exists, there are three ways to move through it.
Pages
Wikilinks
Graph
Pages are organized as a tree. The Repo: <name> page is the root, with children for architecture, stack, flows, risks, and the rest. Open any page to read its markdown. Each page shows its backlinks: the other pages that point to it, so you can trace what references a concept.
Inside page content, [[Title of another page]] is a wikilink. Orkestral resolves these to real pages automatically, so clicking one jumps you to the target. Agents create wikilinks while writing, which is how the tree connects: an architecture page links to a flow page, a flow page links to a tech entity, and so on. Wikilinks can also cross sources, connecting a frontend page to the backend API page it calls.
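To make the wikilink resolution concrete, here is a minimal sketch of extracting `[[...]]` links from page content and mapping them to page ids. The function name, the title index, and the case-insensitive, whitespace-trimmed matching are assumptions; Orkestral's actual matching rules may differ.

```python
import re

# Matches [[Title]] and captures the text between the double brackets.
WIKILINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")


def resolve_wikilinks(content, pages_by_title):
    """Split the wikilinks found in `content` into resolved and unresolved.

    `pages_by_title` maps a lowercased page title to a page id.
    """
    resolved, unresolved = [], []
    for raw in WIKILINK_RE.findall(content):
        title = raw.strip()
        page_id = pages_by_title.get(title.lower())
        if page_id is None:
            unresolved.append(title)
        else:
            resolved.append((title, page_id))
    return resolved, unresolved
```

Running the same extraction over every page and inverting the result gives the backlinks shown on each page: a target page lists every page whose content resolves to it.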
The galaxy view renders the whole KB as nodes and edges. Pages and entities are nodes; wikilinks, the parent-to-child hierarchy, and entity relations are edges. Root repo pages appear as planets, child pages orbit them, and entities (such as npm dependencies) appear as small stars. Node size reflects how connected a node is.
The graph view includes a heads-up display summarizing the KB. These numbers come straight from the graph snapshot.
Pages, entities, and chunks
Total pages and total entities (including orphan entities like dependencies that the graph hides to avoid clutter but still counts as knowledge). Chunks are the indexed segments of your pages, the units that search and embeddings operate on.
Top hubs and constellations
Top hubs are the most connected nodes, the pages and entities everything points at. Constellations are clusters of two or more entities connected by relations, a quick read on how tightly your knowledge links together.
Growth and recently added
Weekly growth counts pages created per day over the last seven days, and recently added highlights pages from the past week. Use these to see your KB expanding as you analyze more sources and write more pages.
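Two of the numbers above, top hubs and constellations, are standard graph measures. The sketch below assumes edges and relations arrive as plain `(a, b)` pairs and that connectivity means undirected degree; both are illustrative choices, not a description of the graph snapshot format.

```python
from collections import Counter, defaultdict


def top_hubs(edges, limit=3):
    """Rank nodes by how many edges touch them (undirected degree)."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Highest degree first; ties broken by name for a stable order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:limit]


def constellations(relations):
    """Group entities into clusters of two or more connected by relations."""
    neighbors = defaultdict(set)
    for a, b in relations:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, clusters = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:  # depth-first walk of one connected component
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(neighbors[node] - cluster)
        seen |= cluster
        if len(cluster) >= 2:
            clusters.append(sorted(cluster))
    return clusters
```

An entity with no relations never appears in `relations`, so it cannot form a constellation. That is consistent with orphan entities being counted in the totals without showing up as clusters.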
A KB is not a one-time export. It grows as you add sources, write pages, and re-analyze.
1
Analyze more sources
Repeat the generation flow for each repository in the workspace. Multiple sources share one graph, and the AI is told about sibling sources so it can link across them (for example, a frontend calling a backend API).
2
Write and edit pages by hand
Create your own pages for tribal knowledge the scan cannot infer: deployment runbooks, on-call notes, product decisions. Every page you create or update is reindexed for lexical search and queued for embeddings, so your manual notes are searchable alongside the generated ones.
3
Link as you write
Use [[Page title]] wikilinks in your content to weave new pages into the existing tree. Good linking turns scattered notes into a navigable web and strengthens the graph.
4
Rebuild snapshots after big changes
After a batch of edits, trigger a rebuild to refresh chunks, the binary snapshot on disk, and the embeddings. This keeps retrieval current with the latest content.
5
Re-analyze when a repo changes a lot
When a source evolves significantly, run the analysis again. Orkestral refreshes the auto-generated pages and re-queues embeddings so the KB reflects the new state of the code.
The KB is built for your agents first. Here is how they reach into it.
Hybrid retrieval
When an agent (or you) searches, Orkestral runs a lexical BM25 pass and a local-embedding semantic pass, then merges them into one ranked list. Lexical catches exact terms and identifiers; embeddings catch meaning even when the words differ.
Read before they plan
The orchestrator reads relevant KB pages before planning and delegating, so specialists inherit architecture, conventions, and risks instead of rediscovering them.
Write through MCP tools
During analysis, the agent uses kb_create_page and related MCP tools to materialize pages and links. The same tools let agents extend the KB during normal work.
Binary snapshot for bulk reads
An aggregated snapshot is written to disk so an agent can process the whole base from one ordered file instead of many round trips.
The deterministic coverage pages exist specifically so agents always have a baseline (structure, dependencies, entrypoints, risks) even when the deep AI pass is unavailable. Honest risk pages help agents avoid fragile areas before they edit them.
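This guide does not say how the lexical and semantic result lists are merged into one ranking. One common way to merge two ranked lists is reciprocal rank fusion (RRF), sketched below purely as an illustration of the idea. The function name and the constant `k=60` are assumptions, and Orkestral may use a different merge.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of page ids into one ranked list.

    Each list contributes 1 / (k + rank) per page, so a page that ranks
    well in both the lexical and the semantic list rises to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] = scores.get(page_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


lexical = ["auth-flow", "setup", "architecture"]    # e.g. from a BM25 pass
semantic = ["architecture", "auth-flow", "risks"]   # e.g. from embeddings
merged = reciprocal_rank_fusion([lexical, semantic])
```

RRF uses only rank positions, so it avoids having to put BM25 scores and embedding similarities on a common scale.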
The deterministic base was created but the AI pass failed. Confirm you have an executable agent in the workspace and that its CLI (claude or codex) is on your PATH, then re-analyze. The deterministic pages remain usable in the meantime.
The source has no valid local path
Analysis reads files from disk. If the source is not cloned locally, clone it first, then run the analysis again.
The graph looks empty or sparse
A sparse graph usually means the pages contain few wikilinks. Add [[Page title]] links to the pages you write, and re-analyze so the AI pass can connect concepts and create entity relations.
Some files were not analyzed
The scan caps at 800 files and skips files over 256 KB. The reading-risks page flags when the cap was hit. For very large repos, focus the KB on the most important sources and folders.
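To estimate in advance which files a scan would keep, you can approximate the walk rules with a short script. This is a sketch built from the limits stated in this guide (skipped directories, kept extensions and config files, 256 KB per file, 800 files). The lists here are partial, and counting only kept files toward the cap is an assumption.

```python
import os
from pathlib import Path

SKIP_DIRS = {"node_modules", ".git", "dist", "build", "coverage"}
DOC_EXTS = {".md", ".mdx", ".markdown"}
CODE_EXTS = {".ts", ".tsx", ".js", ".py", ".go", ".rs", ".java"}  # partial list
CONFIG_FILES = {"package.json", "tsconfig.json", "Dockerfile"}
MAX_FILE_BYTES = 256 * 1024
MAX_FILES = 800


def walk_repository(root, max_files=MAX_FILES, max_bytes=MAX_FILE_BYTES):
    """Return (kept_paths, hit_cap) for a repository folder."""
    kept = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            wanted = name in CONFIG_FILES or path.suffix.lower() in DOC_EXTS | CODE_EXTS
            if not wanted:
                continue
            try:
                too_big = path.stat().st_size > max_bytes
            except OSError:
                continue  # unreadable entry, such as a broken symlink
            if too_big:
                continue
            if len(kept) >= max_files:
                return kept, True  # cap reached: stop scanning
            kept.append(path)
    return kept, False
```

If `hit_cap` comes back true for a repository, expect the reading-risks page to flag it, and consider pointing the source at a narrower folder.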