A knowledge base that keeps itself maintained as materials grow.
Reference implementation: Personal-Auto-Wiki
Table of contents
- 1. Motivation
- 2. Architecture
- 3. Five core workflows
- 4. Typical workflow scenarios
- 5. Additional notes
- 6. Systematic tweaks I made to Karpathy’s proposal
- 7. Design philosophy
- 8. Toolchain
- References
1. Motivation
Knowledge workers face a large volume of information every day: articles, papers, reports, meeting notes. How to handle this stream of information is a long‑standing problem.
Approach 1: Store it
The most common approach is to use Obsidian, Notion, Bear, or Apple Notes to create a folder structure and file documents into categories. Organizing feels good in the moment. A few months later when you try to find a specific idea, you often can’t remember which folder or file it was in. Organization didn’t produce understanding — it just moved information to a different place.
A bigger challenge is that knowledge itself changes. The same concept can be phrased differently across articles. Different sources may describe the same thing in conflicting ways. A methodology in one field can be reinvented in another. Discovering these connections and contradictions requires keeping all material contents in mind at once, which exceeds human cognitive capacity.
Approach 2: RAG retrieval
Retrieval‑Augmented Generation (RAG) solves the “can’t find it” problem: store documents in a vector database, retrieve relevant passages at query time, and let the LLM synthesize an answer. But RAG starts from raw documents on every query rather than from knowledge that has already been digested. It assumes raw text is the best vessel for knowledge, which it is not: raw documents contain redundancy, bias, outdated information, and an author’s personal phrasing. What was discovered last week must be rediscovered this week, and when two articles contradict each other, RAG surfaces both passages without flagging the conflict.
A third approach
Andrej Karpathy proposed a different direction in his LLM Wiki:
Let the LLM act like a Wikipedia editorial team that continuously maintains a persistent wiki.
When new material arrives, the LLM reads, understands, extracts concepts and entities, updates existing pages, flags contradictions, and maintains cross‑references. At query time the LLM retrieves from this wiki instead of the raw documents. The key insight is: LLMs excel at understanding, synthesizing, and generating text — not at reassembling fragmented pieces on every query. Move the work of understanding into the ingestion phase; the query phase then only needs to search already‑structured knowledge, which greatly improves efficiency and accuracy.
I followed Karpathy’s idea and built my own LLM Wiki workflow, with extensive customizations. Below I explain what I built and why, from architecture and workflows to technical details and design philosophy.
2. Architecture
The system has three clearly separated layers:
PersonalAutoWiki/
├── ObsidianRaw/ # Raw sources (read‑only, human‑curated)
│ ├── 00_Inbox/ # Temporary inbox, to be categorized
│ ├── 01_Projects/ # Active projects, action‑oriented, time‑bound
│ ├── 02_Areas/ # Ongoing responsibility areas, require active maintenance
│ ├── 03_Resources/ # Reusable reference materials ← the LLM's only ingestion source
│ └── 04_Archive/ # Historical records
├── Wiki/ # Knowledge base (fully maintained by the LLM)
│ ├── Index.md # Automatically maintained index
│ ├── Log.md # Machine‑parsable operation log
│ ├── Concepts/ # Concept pages (methodologies, theoretical frameworks)
│ ├── Entities/ # Entity pages (people, organizations, projects)
│ ├── Sources/ # Source summaries
│ └── Outputs/ # Query outputs (high‑value analyses backfilled)
├── .claude/skills/ # Claude Skills definitions (workflow codified)
├── AGENTS.md # LLM "employee handbook" (complete collaboration spec)
└── CLAUDE.md # References AGENTS.md
2.1 Layer 1: Raw sources
I organize my materials using the PARA method (Projects, Areas, Resources, Archive). PARA divides information into four categories plus an Inbox as a temporary entry point.
These five categories have different natures:
| Directory | Nature | Example | Suitable for Wiki? |
|---|---|---|---|
| Inbox | Temporary, to be categorized | Newly saved web pages, rough notes | ❌ Not stable |
| Projects | Action‑oriented, timebound | Notes for “write annual report” | ❌ Actionable, not knowledge |
| Areas | Ongoing responsibilities | Notes about “team management” | ❌ Personal, not reusable |
| Resources | Reusable reference material | Papers, frameworks, industry reports | ✅ Target source |
| Archive | Completed or inactive | Historical project records | ❌ Historical, low priority |
Key: the LLM ingests only from the Resources directory.
This constraint is my first modification to Karpathy’s scheme: the LLM should ingest only from Resources, a restriction Karpathy didn’t explicitly state.
Projects and Areas are action oriented and require human execution; they are not content the LLM should automatically convert into wiki entries. The Inbox must be curated and moved into Resources before ingestion. Archive is historical and lower priority. Without this constraint, project notes or transient ideas could be fed to the LLM and pollute the Wiki. The Wiki’s goal is to build a reusable knowledge graph. “Resources” are defined as materials likely to be useful in the future — this aligns with the Wiki’s purpose.
2.2 Layer 2: Wiki (the knowledge layer)
This is the system core. The Wiki consists of four page types, each with a clear responsibility and naming convention:
2.2.1 Sources
For every ingested raw document, a summary page is created under Wiki/Sources/. A Sources page contains:
- The original document’s title, author, and source
- A concise summary of the core arguments
- Extracted key concepts and entities
- Cross‑references to other Sources
Sources are the entry points to the knowledge layer. Raw documents are compressed into structured summaries; redundancy and noise are filtered out. Sources pages include tracking metadata so the Wiki can detect changes to the original files. Each Sources page’s frontmatter includes three tracking fields:
---
title: Resilience
created: 2026-04-08
updated: 2026-04-08
type: source
tags: [systems theory, ecology, resilience studies]
source_path: Academic/Research Notes/Theory/Concepts/Resilience.md
source_hash: a1b2c3d4
source_mtime: 2026-04-08T17:30:00Z
---
| Field | Purpose |
|---|---|
source_path | Locates the original file; detects moves or deletions |
source_hash | First 8 characters of SHA256 of the file contents; detects content changes |
source_mtime | File modification time for quick screening (avoids hashing every time) |
These three fields let the Wiki automatically detect changes in raw sources. When an original file is modified, the system compares hashes and flags re‑ingestion if necessary.
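The three tracking fields are cheap to compute. A minimal sketch in Python; the helper name and return shape are illustrative, not part of the system:

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

def tracking_fields(path: Path) -> dict:
    """Compute source_path / source_hash / source_mtime for one Resources file."""
    data = path.read_bytes()
    mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    return {
        "source_path": str(path),
        # First 8 hex characters of SHA256, as stored in the frontmatter
        "source_hash": hashlib.sha256(data).hexdigest()[:8],
        "source_mtime": mtime.isoformat(timespec="seconds").replace("+00:00", "Z"),
    }
```

Comparing a fresh `source_hash` against the one recorded in a Sources page’s frontmatter is what flags a file for re‑ingestion.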
2.2.2 Concepts
Wiki/Concepts/ stores abstract concepts and methodologies, e.g. “resilience”, “percolation theory”, “prospect theory”. Characteristic features of concept pages:
- Definitions are synthesized from multiple Sources
- When sources disagree on a definition, contradictions are explicitly flagged
- Definitions evolve as new Sources are ingested
- Each concept page lists all Sources that mention it
Concepts are the abstract nodes of the knowledge graph and point to ideas. Single‑source definitions are often partial — synthesizing multiple sources gets closer to a robust definition.
2.2.3 Entities
Wiki/Entities/ stores concrete entities: people, organizations, projects, tools. Unlike Concepts, Entities refer to specific objects.
An entity page contains:
- Basic information about the entity
- Links to related Concepts (which concepts the entity is associated with)
- Links to Sources that mention the entity
- Relationships to other Entities (collaboration, hierarchy, competition)
- A timeline if applicable
Entities are the concrete nodes of the knowledge graph; they ground abstract concepts in real contexts.
2.2.4 Outputs
Wiki/Outputs/ stores the byproducts of queries and analyses. When a query produces a cross‑source synthesis, comparison table, or insight, the result can be backfilled as an Output page.
Backfill criteria — not every query should be saved. Save outputs when they are:
- Overviews: cross‑source syntheses
- Comparisons: structured comparison tables of concepts or methods
- Analyses: discoveries of hidden links or insights
- Conclusions: reasoning results worth preserving
Simple lookups (“What are the three dimensions of framework X?”) are not worth saving — that answer already exists in Sources and Concepts. Outputs are the synthesis layer: they are new knowledge produced by reasoning and are worth persisting.
2.2.5 Index.md and Log.md
Index.md is a directory of all pages, organized by type, with a one‑line summary per page. Its role is to control scale: when a Wiki grows to dozens or hundreds of pages, an index is necessary for navigation.
Log.md is an operation log in a machine‑parseable format:
## [2026-04-08] ingest | resilience assessment
- **source**: ObsidianRaw/03_Resources/resilience-framework.md
- **impact**:
- **Sources**: framework-summary.md
- **Concepts**: resilience.md
- **Entities**: some-institute.md
- **Index.md**, **Log.md**
- **notes**: added a systems performance definition of resilience
The format is designed so Unix tools can query it directly:
grep "^## \\[" Log.md | tail -5 # last 5 records
grep "| ingest" Log.md # all ingest operations
grep "| contradiction" Log.md # all contradiction flags
grep "resilience" Log.md # all operations related to "resilience"
The operation log should be as queryable and auditable as a Git log.
2.3 Layer 3: Schema and Skills (instructions for the LLM)
2.3.1 AGENTS.md / CLAUDE.md
AGENTS.md and CLAUDE.md are complete operational manuals. They cover:
- Directory structure and responsibilities
- Detailed steps for the four core workflows
- File formats, naming conventions, and citation rules
- Strategies for impact analysis
- Methods for detecting and resolving contradictions
- Search tool invocation order and priorities
- Response style requirements
The LLM reads this file at the start of a session so it knows its role, rules, and step definitions. Instructions embedded in a conversational prompt get truncated or overridden; the handbook remains stable and keeps behavior consistent.
2.3.2 Skills
AGENTS.md specifies rules; .claude/skills/ codifies them as executable modules. Each Skill is a self‑contained unit that includes trigger conditions, execution steps, output formats, and boundary handling:
| Skill | Function | Trigger scenario |
|---|---|---|
| `syncing-wiki` | Full sync | Detects and processes all changes |
| `ingesting-resources` | Resource ingest | Processes new files in Resources |
| `querying-wiki` | Knowledge query | User questions / knowledge retrieval |
| `checking-wiki-health` | Health checks | Detects wiki integrity issues |
| `detecting-resources-sync` | Sync detection | Only detects changed files |
Each Skill consists of a `SKILL.md` (triggers, steps, output format, edge cases) and a `reference/` directory (page templates, frontmatter formats, query patterns, checklists). Claude Code automatically recognizes the Skills directory. Modifying a Skill file changes LLM behavior without editing AGENTS.md.
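As an illustration, a `SKILL.md` for the detection Skill might look like the sketch below. The `name` and `description` frontmatter fields are what Claude Code reads to decide when to trigger a Skill; the section layout underneath is my own and illustrative:

```markdown
---
name: detecting-resources-sync
description: Scan ObsidianRaw/03_Resources and compare against Wiki/Sources tracking fields; report new, changed, and deleted files without processing them.
---

## Steps
1. Scan Resources and record relative path, content hash, mtime
2. Read tracking fields from all Wiki/Sources frontmatter
3. Three-way comparison (new / changed / deleted / synced)
4. Output the sync status report

## Edge cases
- Renames appear as delete + new; ask the user before acting
```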
3. Five core workflows
3.1 Workflow 0: Full sync (Syncing Wiki)
Full sync is the recommended entry point for routine maintenance. A single command performs detection through processing.
When I say "sync and process", the LLM runs a six‑stage workflow:
Stage 1: Detect sync state
Scan Resources and Wiki/Sources, perform a three‑way comparison
↓
Stage 2: Confirm scope
Show the list of files to process, mark files to skip, ask user whether to continue
↓
Stage 3: Execute processing
Call ingesting‑resources in batches (>10 files split into batches of 5)
↓
Stage 4: Verify results
Check Sources pages are created, frontmatter completeness, Index.md updated, qmd index updated
↓
Stage 5: Record log
Write structured operation records into Wiki/Log.md
↓
Stage 6: Output report
Produce processing statistics and verification results
Design considerations:
- Batch processing: if file count > 10, split into batches of 5 to avoid losing context
- Fault tolerance: record and continue on per‑file errors instead of blocking
- Mandatory logging: stage 5 must write to Log.md
- Composition: `syncing-wiki` internally calls `detecting-resources-sync` (stage 1) and `ingesting-resources` (stage 3)
syncing-wiki turns the previously manual chain of steps into an atomic operation. The user only confirms the scope; the rest is automatic.
3.2 Workflow 1: Detecting Resources Sync
Detecting sync answers a simple question: how does the Wiki know when raw sources have changed?
Without detection, ingestion only runs when a human explicitly asks the LLM to process a file. Raw sources are frequently edited — corrections, additions, restructuring. Without automatic detection, the Wiki will become out of sync.
The detection workflow:
Step 1: Scan ObsidianRaw/03_Resources/
Record relative path, content hash (SHA256 first 8 chars), and mtime for each file
↓
Step 2: Read frontmatter of all Wiki/Sources pages
Extract tracking fields (source_path, source_hash, source_mtime)
↓
Step 3: Three‑way comparison
├─ New: present in Resources, missing in Sources → needs ingestion
├─ Changed: same path, different hash → needs re‑ingestion
├─ Deleted: present in Sources, missing from Resources → needs handling
└─ Synced: path and hash match → no action
↓
Step 4: Generate sync status report
Sample output:
## Resources sync status [2026-04-08 10:30]
### New files (2)
| Path | Modified time | Size |
| --------------------------- | ----------------- | ----- |
| `Academic/NewTopic.md` | 2026-04-08 09:00 | 2.5KB |
### Changed (1)
| Path | Old hash | New hash | Modified time |
| ---------------------------------- | -------- | -------- | ----------------- |
| `Academic/Theory/Resilience.md` | a1b2c3d4 | e5f6g7h8 | 2026-04-08 08:30 |
### Synced (43)
43 files are in sync.
Design notes:
- Use `source_mtime` as a quick filter to avoid hashing everything every run
- The first 8 characters of the SHA256 are sufficient for change detection while keeping frontmatter compact
- File renames are detected as “delete + new” and require user confirmation
Sync detection is the system’s change sensor. Without it the Wiki is a static snapshot.
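The three‑way comparison in Step 3 reduces to a handful of set operations. A minimal sketch, assuming each side is a mapping from relative path to content hash (the function name is illustrative):

```python
def three_way_diff(scan: dict[str, str], tracked: dict[str, str]) -> dict[str, list[str]]:
    """Compare a fresh Resources scan against the hashes tracked in Sources frontmatter.

    scan:    relative path -> current content hash (from the filesystem)
    tracked: relative path -> source_hash recorded in Wiki/Sources pages
    """
    return {
        # Present on disk, never ingested
        "new": [p for p in scan if p not in tracked],
        # Same path, hash differs -> needs re-ingestion
        "changed": [p for p in scan if p in tracked and scan[p] != tracked[p]],
        # Tracked in the Wiki but gone from Resources
        "deleted": [p for p in tracked if p not in scan],
        # Path and hash both match -> no action
        "synced": [p for p in scan if p in tracked and scan[p] == tracked[p]],
    }
```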
3.3 Workflow 2: Ingesting Resources
When detection finds new or changed files, the ingestion pipeline runs. “Process this article” means the LLM executes a multi‑step understanding → association → update process:
Step 1: Parse the original file structure: headings, sections, lists, tables, code blocks
↓
Step 2: Create a summary page under Wiki/Sources/
- 1–3 sentence core summary
- Key points extracted following the original structure
- Extract key concepts and related entities
- Write tracking fields into frontmatter
↓
Step 3: Extract Concepts
├─ If found in Concepts/ → update: add new references, augment definition
├─ If not found → create new concept page
└─ If new definition conflicts with existing definitions → flag contradiction
↓
Step 4: Extract Entities
├─ If found in Entities/ → update: add relations, events, attributes
├─ If not found → create new entity page
└─ Establish relations to other entities and concepts
↓
Step 5: Impact analysis
├─ Which Sources need cross‑references updated?
├─ Which Concepts need revision?
├─ Which Entities need relationship updates?
└─ Which contradictions must be flagged?
↓
Step 6: Update Wiki/Index.md
↓
Step 7: Record operation in Wiki/Log.md
↓
Step 8: Run `qmd update && qmd embed` to refresh search indexes
The crucial part is Step 5 — association impact analysis. A new paper rarely affects a single page. For example, a paper on “infrastructure resilience” might:
- Introduce a resilience model → update `Concepts/resilience.md`
- Mention a research institute → update or create `Entities/some-institute.md`
- Cite another ingested paper → create cross‑references between Sources
- Provide a different definition of “resilience” → flag a contradiction on the concept page
Without association analysis, ingestion degenerates into isolated summaries. Concepts and Entities wouldn’t be updated and the knowledge graph wouldn’t grow. LLMs excel at this step because they can consider the whole Wiki context while human readers often cannot remember all relevant connections.
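Step 7’s record format is regular enough to generate mechanically. A sketch of a log‑entry builder matching the `## [date] type | title` convention used throughout this document (the function name is illustrative):

```python
def log_entry(date: str, op: str, title: str, source: str,
              impact: dict[str, list[str]], notes: str) -> str:
    """Render one machine-parseable Wiki/Log.md record."""
    lines = [
        f"## [{date}] {op} | {title}",
        f"- **source**: {source}",
        "- **impact**:",
    ]
    # One indented line per affected page type, e.g. Sources / Concepts / Entities
    for section, pages in impact.items():
        lines.append(f"  - **{section}**: {', '.join(pages)}")
    lines.append(f"- **notes**: {notes}")
    return "\n".join(lines)
```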
3.4 Workflow 3: Querying the Wiki
When a user asks a question, the LLM performs:
Step 1: Use qmd to search relevant Wiki pages
├─ lexical match (lex): exact keyword matches
└─ vector semantic match (vec): capture semantically related expressions
↓
Step 2: Read retrieval results and synthesize an answer
↓
Step 3: Cite sources with [[wiki-links]]
↓
Step 4: Assess whether the result should be backfilled into Outputs/
The fundamental difference from RAG is the retrieval target. RAG retrieves raw documents; Wiki queries retrieve already‑digested knowledge. Retrieval results are already refined: contradictions are annotated, concept definitions are cross‑document syntheses, and entity relations are established.
Backfilling decisions are human‑mediated. The LLM will suggest saving high‑value outputs to Wiki/Outputs/, but a human decides whether to persist them. Backfilled Outputs should be high quality.
reference/query_patterns.md defines five common query patterns and best practices. Each pattern prescribes lex/vec query construction, intent disambiguation, and limits.
Example query patterns:
// Concept definition
mcp__qmd__query({
searches: [
{ type: "lex", query: "concept-name definition" },
{ type: "vec", query: "what is concept-name; meaning and explanation of concept-name" },
],
intent: "concept-definition",
limit: 3,
});
// Concept comparison
mcp__qmd__query({
searches: [
{ type: "lex", query: "conceptA conceptB comparison differences" },
{ type: "vec", query: "differences and relationships between conceptA and conceptB" },
],
intent: "concept-comparison",
limit: 5,
});
3.5 Workflow 4: Checking Wiki Health
As the Wiki grows, pages multiply and links proliferate. Common issues: outdated pages, unflagged contradictions between sources, orphan pages, concepts referenced frequently without dedicated concept pages. Health checks scan the Wiki and produce a report:
| Check item | What it detects | Suggested fix | Priority |
|---|---|---|---|
| Unsynced files | New or changed files in Resources | Run ingestion | P0 |
| Missing concepts | Links to concepts whose page is missing | Create missing pages | P1 |
| Contradiction flags | Presence of contradiction markers | Human review required | P1 |
| Unindexed pages | Pages not listed in Index.md | Update Index.md | P2 |
| Isolated pages | Pages with no inbound links | Evaluate and link or remove | P3 |
Health check report (structured):
# Wiki Health Check Report
Date: 2026-04-08
## Statistics
| Type | Count |
| --------- | ----- |
| Concept pages | 54 |
| Entity pages | 34 |
| Source pages | 45 |
| Output pages | 0 |
| **Total** | **133** |
## Issues
### Missing concepts (2)
- [[Concepts/adaptive-governance]] — referenced by [[Sources/resilience-framework]] but page missing
- [[Concepts/socio-ecology]] — referenced by [[Entities/some-institute]] but page missing
### Contradiction flags (1)
- [[Concepts/resilience]] — dimension definition differs from [[Sources/another-framework]]
## Suggested actions
1. Ingest sources to create Concepts/adaptive-governance and Concepts/socio-ecology
2. Human review of the contradiction flagged for resilience
Once a Wiki reaches tens of pages, purely manual maintenance of all links and consistency is impractical. The LLM should scan regularly and surface structural issues; humans only handle exceptions.
4. Typical workflow scenarios
The five workflows can be combined into practical usage patterns.
Scenario 1: First time setup
Run syncing-wiki to initialize:
Sync and process
The system will detect changes → confirm scope → ingest updates → verify → log → report.
Scenario 2: Routine maintenance
Run syncing-wiki regularly to detect and handle changes, and run health checks weekly:
Sync and process
Health check
Scenario 3: Detect only, do not process
Detect new files
It outputs the sync status and asks whether to continue processing.
Scenario 4: Knowledge query
What are the developmental stages of resilience theory?
The system retrieves and answers; if the result seems valuable for long‑term use it will suggest saving to Outputs/.
5. Additional notes
5.1 Local search: qmd
Search relies on qmd, a local Markdown search engine.
Hybrid retrieval: lexical + vector
mcp__qmd__query({
searches: [
{ type: "lex", query: "critical infrastructure" }, // exact match
{ type: "vec", query: "how to assess infrastructure resilience" }, // semantic match
],
intent: "academic research methodology", // disambiguation
limit: 5,
});
When the Wiki contains different phrasings, pure lexical search misses results; pure vector search may return semantically related but imprecise results. Hybrid search balances precision and recall. The intent parameter helps disambiguate user intent when queries contain polysemous terms.
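How the two result lists are combined is not specified here; qmd’s actual fusion strategy may differ. Reciprocal rank fusion (RRF) is one common way a lexical and a vector ranking could be merged, shown purely as an illustration:

```python
def rrf_merge(lex: list[str], vec: list[str], k: int = 60) -> list[str]:
    """Fuse two ranked lists of doc ids with reciprocal rank fusion.

    A document ranked near the top of either list accumulates a large
    1/(k + rank) contribution; appearing in both lists compounds the score.
    """
    scores: dict[str, float] = {}
    for ranking in (lex, vec):
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

With `k = 60` (a conventional default), a document found by both branches outranks one found by only a single branch at the same position.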
Reasons to choose qmd: all data stays local (no external API), it can be exposed to LLMs via Model Context Protocol (MCP) as a tool, and it supports path/docid/glob retrieval modes.
5.2 Contradiction flags
When newly ingested sources conflict with existing content, the LLM does not decide; it explicitly flags the contradiction on the page:
> [!warning] Contradiction
> This page conflicts with [[Sources/another-framework]]:
>
> - This page: resilience consists of three dimensions (engineering, ecological, social)
> - Other source: resilience consists of five dimensions (adds economic and governance)
> - Likely reason for discrepancy: different application domains (natural hazards vs urban infrastructure)
Design principles:
- The LLM is not the final domain expert; automatic adjudication risks error. Explicit flags keep contradictions visible and defer final judgment to humans.
- A contradiction is rarely a matter of right vs wrong; it often reflects scope, domain, or framing differences. Preserving both sides is more valuable than picking one.
- Good contradiction notes attempt to analyze causes: differing definitions, domains, or timeframes. That analysis itself is valuable.
- Use Obsidian’s standard callout format `> [!warning]` for visual prominence
5.3 Operation log
Log.md adopts software engineering log best practices:
## [2026-04-08] ingest | infrastructure resilience framework
- **source**: ObsidianRaw/03_Resources/resilience-framework.md
- **impact**:
- **Sources**: framework-summary.md
- **Concepts**: resilience.md
- **Entities**: some-institute.md
- **Index.md**, **Log.md**
- **notes**: added three‑dimension definition of resilience
Design goals:
- Machine‑parseable: header lines start with `## [date] type | title` for regex matching
- Traceable: each record includes source and impact for auditability
- Countable: `grep | wc -l` can tally operation types
- Human‑readable: the format is easy to read without special tools
The operation log is the Wiki’s Git history. Without it, the Wiki becomes a black box.
5.4 Page templates
Each Skill’s reference/ directory includes page templates to ensure consistent generation:
- Sources template: summary → core content → key concepts → related entities → references
- Concepts template: definition → core characteristics → related concepts → sources
- Entities template: introduction → info table → related concepts → sources
- Outputs template: question → answer → sources → related concepts
Templates increase predictability. When all pages follow the same structure, humans know where to find information. LLMs have explicit format expectations, and automation scripts can reliably parse pages.
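As an illustration, a Concepts template following the order above might look like this sketch; the field names and section wording are mine, not the system’s actual template:

```markdown
---
title: <concept name>
created: <date>
updated: <date>
type: concept
tags: []
---

# <Concept name>

## Definition
Synthesized from all Sources listed below; disagreements carry a > [!warning] callout.

## Core characteristics
- ...

## Related concepts
- [[Concepts/...]]

## Sources
- [[Sources/...]]
```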
6. Systematic tweaks I made to Karpathy’s proposal
Karpathy’s gist proposed the core framework: a Raw Sources → Wiki → Schema layering with Ingest / Query / Lint workflows. My engineering customizations refine that framework:
| Dimension | Karpathy original | My customization | Why |
|---|---|---|---|
| Ingestion range | Unspecified | Only PARA’s Resources directory | Prevent noise pollution |
| Contradiction handling | Mentioned, not detailed | Explicit flags + logs + difference analysis | Make contradictions visible |
| Operation logs | Simple timeline | Machine‑parseable format, grepable | Observability |
| Search tool | Unspecified | qmd hybrid search (lex + vec + intent) | Precision + recall |
| Backfilling outputs | Mentioned | Clear criteria + human decision | Ensure Output quality |
| Health checks | Lint | Five checks + priority + report | Systematic maintenance |
| Collaboration rules | Brief description | Full AGENTS.md | Reduce LLM unpredictability |
| Impact analysis | Unspecified | Explicit association analysis step | Ensure ingestion is global |
| Page types | Sources/Entities/Concepts | + Outputs (query backfill) | Preserve high‑value analyses |
| Sync detection | None | New detection workflow with triple tracking | Keep Wiki synchronized |
| Full sync | None | syncing-wiki six‑stage atomic operation | One‑click maintenance |
| Workflow modularity | Document only | Skills modularization (SKILL.md + ref) | From procedure to executable |
Karpathy’s gist is a design doc; my AGENTS.md + Skills are runbooks and program modules. Each step has operational detail, error handling, and quality standards. Changing a Skill file changes LLM behavior without rewriting the core spec.
7. Design philosophy
7.1 LLMs are the Wiki’s “bookkeepers”
The most tedious parts of maintaining knowledge are bookkeeping tasks:
- After reading a new paper, updating related concept definitions
- Noticing two papers say different things and flagging contradictions
- Turning a good Q&A into a reusable asset
- Regularly checking for outdated pages
- Tracking changes to original sources
These tasks consume time but require little human judgment. LLMs can search all related pages, compare definitions, and flag contradictions. Humans should spend their time curating sources, asking good questions, resolving contradictions, and deciding what to persist.
7.2 Knowledge evolves
Traditional note systems assume knowledge is “write once, store forever.” In reality knowledge evolves:
- New concepts change old definitions
- New evidence invalidates previous conclusions
- Cross‑disciplinary exchange produces new understanding
- The same concept may mean different things in different contexts
The Wiki acknowledges evolution. Contradiction flags allow multiple viewpoints to coexist. Association analysis lets knowledge update as new information arrives. Sync detection tracks original source changes. Health checks surface issues.
7.3 Constraints create value
AGENTS.md and Skills enforce constraints: only process Resources, require unified page formats, use [[wiki-links]], contradictions cannot be auto‑resolved but must be flagged, backfills require standards, and tracking fields must be present.
Constraints guide LLM behavior to the right work. Without ingestion limits, the LLM would process garbage. Without format constraints, pages become inconsistent and harder to parse. Without backfill standards, Outputs would be low value. Without tracking fields, the Wiki can’t detect source changes.
Trade flexibility for consistency, quality, and maintainability.
7.4 Clear human‑AI division of labor
| Task | Human | LLM |
|---|---|---|
| Curation | ✅ Decide what to ingest | ❌ |
| Reading | ❌ | ✅ Extract, summarize, associate |
| Global analysis | ❌ | ✅ Systematic scans and updates |
| Change detection | ❌ | ✅ Three‑way comparison and reports |
| Contradiction resolution | ✅ Final judgment | ❌ Flag only |
| Querying | ✅ Ask good questions | ✅ Retrieve, synthesize, cite |
| Quality control | ✅ Decide whether to backfill | ✅ Suggest backfills |
| Routine scans | ❌ | ✅ Run checks and report |
| Workflow design | ✅ Write AGENTS.md and Skills | ❌ |
7.5 Workflows should be composable and editable
Skills transform linear procedures into modular tools:
- `syncing-wiki` is the recommended daily entry point and composes detection and ingestion
- `detecting-resources-sync` can run standalone or be embedded in other flows
- `ingesting-resources` can be run manually or triggered by `syncing-wiki`
- `querying-wiki` results can be answered directly or backfilled as Outputs
- `checking-wiki-health` can validate after ingestion or run on a schedule
Humans mix and match tools as needed. Each Skill encodes steps, fault tolerance, and outputs so users don’t recompose logic every time.
8. Toolchain
| Purpose | Tool | Notes |
|---|---|---|
| Raw source mgmt | Obsidian | Organized with PARA |
| LLM editor | Claude Code / Qwen Code | Reads AGENTS.md and Skills to run flows |
| Local search | qmd | Hybrid search (lex + vec) |
| Wiki browsing | Obsidian | Native support for [[wiki-links]] and callouts |
| Web clipping | Obsidian Web Clipper | Save web pages into Inbox |
| Skills framework | Claude Code Skills | Workflow modules |
References
- Karpathy: LLM Wiki — original inspiration: three‑layer architecture and workflow design
- PARA Method — knowledge organization framework used for raw sources
- Obsidian — local Markdown knowledge tool used for storage and browsing
- qmd — local Markdown search engine