Obsidian as the Brain: Memory Architecture for AI Agent Systems
Published: 2026-03-03 · 14 min

The problem with AI agents in production isn’t capability. It’s memory.
A capable model that resets every session is expensive to operate. Every session, you’re re-establishing context that should already exist: who the person is, what the current projects are, what decisions have been made, what the infrastructure looks like. The model isn’t getting smarter — you’re just doing the work of initializing it manually, over and over.
The fix is a memory architecture. Not prompt engineering. Not better system prompts. An actual structured system that captures knowledge outside the model and feeds it back in — selectively, accurately, and without requiring manual effort to maintain.
Obsidian is the core of mine. But Obsidian alone doesn’t solve the problem. The full system is Obsidian as the store, QMD as the retrieval engine, and a set of operational protocols that determine what gets written, where, and how it flows back to agents.
That’s what this article documents.
Why Memory Architecture Matters
Without persistent memory, every agent session starts at zero. The model knows what’s in its training data and nothing else. It doesn’t know you built a specific thing last week. It doesn’t know the decisions you made last month and why. It doesn’t know the shape of your current projects or the context of ongoing relationships.
This is manageable for one-off tasks. It becomes an operational liability for any system running continuously.
The memory architecture I’ve built addresses three failure modes:
Context blindness — agent doesn’t know what’s relevant to the current task. Fixed by semantic search over the memory vault, which surfaces relevant files regardless of where they’re stored.
Staleness — agent acts on outdated information. Fixed by daily log structure and a clear protocol for when information moves from working note to typed permanent memory.
Loss at restart — agent loses everything when the session ends. Fixed by MEMORY.md injection into every session, giving the agent its operational state on cold start.
Each layer of the architecture handles one of these failure modes. Together they produce an agent that picks up where it left off, knows what’s relevant, and doesn’t make decisions based on information that’s months out of date.
The Vault Structure
My Obsidian vault lives at ~/Documents/Brain/. It’s a
flat file system of markdown documents organized by type, not by
project. The organizational principle is that information should be
findable by what it is, not where it came from.
The primary structure:
~/Documents/Brain/
├── Personal Memories/the orchestrator/
│ ├── Daily Logs/ # YYYY-MM-DD.md — continuous record
│ ├── Decisions/ # Significant choices and rationale
│ ├── Lessons/ # What I learned and when
│ ├── People/ # Named person notes
│ ├── Commitments/ # What I've agreed to
│ ├── Preferences/ # How I work, what I like
│ └── Projects/ # Project-level context
├── Research/ # All research output goes here
│ └── {topic}/ # One directory per research area
└── Personal Memories/the orchestrator/VAULT_INDEX.md # Running index of the vault
The daily log is the write-everywhere destination. Anything that happened today goes there first — decisions, conversations, completed tasks, observations, context that came up. The daily log is messy. It’s supposed to be. It’s the capture layer.
Periodically, the agent promotes content from daily logs to typed
memory: decisions go to Decisions/, lessons go to
Lessons/, key facts about people get added to or updated in
People/. The daily log is a staging area. Typed memory is
the permanent record.
This distinction matters for retrieval. When the semantic search engine queries the vault, typed memory gives high-quality structured results. Daily logs give temporal context. Both are valuable; neither is a substitute for the other.
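The promotion step can be sketched as a small script. Everything below is illustrative: the DECISION: tag convention, the filenames, and the temp-directory vault are assumptions made for the sketch, not the actual workflow.

```shell
#!/bin/sh
# Sketch of daily-log -> typed-memory promotion, assuming a hypothetical
# "DECISION:" tag convention. Paths are temp stand-ins for the real vault.
VAULT="$(mktemp -d)"
mkdir -p "$VAULT/Daily Logs" "$VAULT/Decisions"

# The messy capture layer: a daily log with two tagged decisions.
cat > "$VAULT/Daily Logs/2026-03-03.md" <<'EOF'
Met with client about Q2 scope.
DECISION: cut the mobile milestone from Q2
Random tooling observation.
DECISION: route all research output to Research/
EOF

# Promote tagged lines into a typed decision note.
LOG="$VAULT/Daily Logs/2026-03-03.md"
OUT="$VAULT/Decisions/2026-03-03.md"
{
  echo "# Decisions promoted from 2026-03-03"
  grep '^DECISION:' "$LOG" | sed 's/^DECISION: */- /'
} > "$OUT"

cat "$OUT"
```

An agent can run the same pass on a schedule; the point is that promotion is mechanical once the capture layer uses any consistent convention.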
QMD: Semantic Search Over the Vault
Obsidian alone is a knowledge store. To make it useful for agents, you need a retrieval layer that can answer the question: “what do I already know that’s relevant to this task?”
I use QMD (a local semantic search tool installed globally via npm) for this. It indexes the vault, embeds all documents into a vector store, and provides hybrid search with BM25 + semantic reranking.
The numbers from my current setup: 884 files indexed, 4,563 vectors. That covers the full Brain vault — daily logs, typed memory, research documents, everything.
How indexing works:
qmd update # scan for new/changed files, update the text index
qmd embed # embed new documents into the vector store
This is run on a schedule — currently nightly via cron — so the vector store stays current without manual intervention. When a new research document lands in the vault, it gets embedded on the next scheduled run and becomes searchable from the next session.
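As a concrete shape, the nightly refresh could be two crontab entries. The times and the absolute path to qmd are assumptions; adjust for your install:

```
# crontab -e (illustrative schedule; the qmd path is an assumption)
30 2 * * * /usr/local/bin/qmd update
45 2 * * * /usr/local/bin/qmd embed
```

Running update before embed matters: the embed pass picks up whatever the index scan found.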
The index lives locally. Nothing leaves the machine. The embeddings run against a local model or a low-cost API endpoint, depending on configuration. For 884 files, the cost of a full re-embed from scratch is low enough that I run it on any structural change to the vault.
How search works:
QMD exposes three search modes:
qmd search "query" # keyword search (BM25)
qmd vsearch "query" # vector search (semantic)
qmd query "query" # hybrid with reranking (recommended for agent use)
The hybrid mode is the one that matters. It combines keyword matching (good for exact names, terms, and identifiers) with semantic matching (good for conceptual similarity where the words don’t match exactly). The reranker pushes the most relevant results to the top based on both signals.
In practice: if I need to retrieve what I know about a specific person, keyword search finds their name. If I need to retrieve everything relevant to a decision about content strategy, semantic search surfaces context from multiple documents that might not share vocabulary with the query.
MCP server mode:
For agent use, QMD can run as an MCP server, making the search tool available as a first-class tool call:
qmd mcp # stdio mode for direct integration
qmd mcp --http # HTTP mode for network access
This is what the memory_search tool uses — it’s making
structured queries to QMD and returning the top matching snippets with
source file paths and line numbers.
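For MCP clients that launch stdio servers from a JSON config, the registration might look like the following. The key names follow the common MCP client convention and vary by client; treat this as a shape, not a verified config:

```json
{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}
```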
How Agents Actually Access the Vault
There are three access patterns I use, and they serve different purposes.
1. Session injection via MEMORY.md
Every agent session has MEMORY.md injected into the
system context. This is the operational summary layer — a distillation
of the most important facts the agent needs to function: who it is, who
it’s serving, infrastructure details, current project state, key
preferences, active lessons.
MEMORY.md is intentionally lean. It’s not a dump of the entire vault — it’s the 2-page brief that covers everything an agent needs to know to start any task without asking me to re-explain context that should already be known. I maintain it as things change. When a significant decision gets made, it goes into MEMORY.md if it affects how future agents should operate.
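As a shape, the brief might be structured like this. The section names are illustrative; the categories are the ones listed above:

```markdown
# MEMORY.md: operational brief, injected into every session

## Identity
Who this agent is and who it serves.

## Infrastructure
Vault path, search tooling, deployment details.

## Current projects
One line per active project, with current state.

## Preferences
How I work and what formats I expect.

## Active lessons
Only the lessons that should change agent behavior right now.
```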
2. Semantic search via memory_search tool
For anything that might be in the vault but isn’t in MEMORY.md, the
agent calls memory_search with a query string. This hits
the QMD index and returns the top matching snippets with source paths
and line numbers.
The typical pattern:
memory_search("Cornyn fundraising research")
→ returns: Research/FEC-Analysis/cornyn-paxton-fec.md (lines 23-47), relevant snippet
The agent then reads just those lines using memory_get.
It gets exactly what’s relevant without loading the entire document into
context.
This is the critical design choice: the vault is too large to load fully into any session’s context window. Semantic search is what makes it selectively accessible — you query for what you need, you get exactly that, and the rest stays on disk.
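The memory_get step is just a line-range read. Here is a self-contained sketch of the equivalent with standard tools, with a fabricated 60-line file standing in for the research doc cited above:

```shell
#!/bin/sh
# memory_get-style read: pull only the cited line range, not the whole file.
# The document is fabricated so the sketch runs on its own.
DOC="$(mktemp)"
seq -f 'line %g of the research doc' 60 > "$DOC"

# Equivalent of memory_get(path, lines=23-47): a sed line-range print.
sed -n '23,47p' "$DOC" > snippet.txt
wc -l < snippet.txt   # 25 lines go into context; the other 35 stay on disk
```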
3. Direct file read for known paths
When the agent knows where something is — a specific person note, a specific research document — it reads it directly. No search required. This is the low-overhead path for structured, predictable memory access.
The agent reads
~/Documents/Brain/Personal Memories/the orchestrator/VAULT_INDEX.md
first in any cold-start session, before searching. The index is a
curated map of what’s in the vault and where. That single read often
eliminates the need for broad semantic search.
The VAULT_INDEX Pattern
VAULT_INDEX.md deserves its own explanation because it’s what makes the vault usable at scale.
A vault of 884 files is too large to navigate by inspection. You can’t load the entire directory tree into context and ask the model to find what’s relevant. The semantic search engine handles that, but semantic search has overhead — you have to know what query to run.
VAULT_INDEX.md solves the “where should I even look?” problem. It’s a running, human-readable index of the vault’s major contents: what’s in each section, which documents are the canonical references for specific domains, where the most frequently accessed things live.
When an agent starts a cold session and needs to find something, the protocol is:
1. Read VAULT_INDEX.md — get the map
2. If the location is obvious from the map, read directly
3. If not, run a targeted memory_search with a specific query
4. Pull the relevant lines with memory_get
This protocol handles 90% of memory access without requiring broad search. The result is faster, more reliable context retrieval and better citation of source material.
What Gets Written Where
The write side of the architecture is as important as the read side. Information in the wrong place is either unfindable or creates noise in search results.
The rules I follow:
Daily logs → everything that happens today. Stream-of-consciousness is fine. This is the capture layer.
Research documents → all research output goes to
~/Documents/Brain/Research/{topic}/. Always. With YAML
frontmatter: tags, date, source. This applies to every agent — if a
research agent produces a briefing, it goes to Obsidian, not just to a
workspace scratch file.
People notes → any person who matters goes to
People/{Name}.md. Relationship context, key facts, how I
know them, what I’ve promised them, what they’ve committed to me.
Decisions → significant choices with their
rationale. The question I ask: “Will I need to explain why this was
decided a year from now?” If yes, it goes to
Decisions/.
Lessons → anything I want to remember learning. Format: what I thought before, what happened, what I know now.
MEMORY.md → distilled operational facts. Updated when something changes that affects how agents should operate. Kept lean — this gets injected into every session, so every byte matters.
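The research-document frontmatter above, as a concrete shape. The three fields (tags, date, source) are the ones named in the rule; the values are invented for illustration:

```yaml
---
tags: [research, fec, fundraising]
date: 2026-03-03
source: research-agent
---
```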
The rule that makes this work: Obsidian is canon. Everything else is staging.
Workspace scratch files are temporary. Shared-context drafts are temporary. QMD cache is temporary. The Obsidian vault is the permanent record, and everything worth keeping eventually gets there.
Memory Architecture Principles
After running this system for several months, these are the principles that actually govern how it works in practice:
Write to the right layer on first capture. A conversation with a client goes into the daily log immediately, before you forget it. Key facts from that conversation get promoted to the person note within 24 hours. This is the workflow that prevents the daily log from becoming the only layer and losing the structure that makes search useful.
Structure enables search. Semantic search over well-structured files returns accurate results. Semantic search over unstructured notes returns noise. The investment in typed memory categories pays off every time the retrieval system surfaces exactly the right document instead of a semi-relevant one.
Memory is maintenance, not magic. The vault doesn’t stay current by itself. Daily logs require periodic promotion. Research documents require YAML frontmatter. Person notes require updates when relationships change. Agents help with this — I have the main agent periodically review daily logs and promote content to typed memory — but the underlying discipline of actually writing things down has to be there first.
Lean injection, rich search. Don’t try to inject the whole vault into every session. Keep MEMORY.md small enough to be useful. Let the search engine handle the rest. The worst outcome is MEMORY.md becoming a 10,000-word document that bloats every context window without actually improving access to relevant information.
Citation matters. When the agent retrieves something from memory, it should cite the source: which file, which lines. This is what lets me verify that the agent is working from accurate information rather than hallucinating context that sounds like it could be in the vault.
The Honest Limitations
Obsidian is local-first. That’s a feature — your data stays on your machine — but it means mobile is rough and multi-device sync requires an explicit solution (iCloud, Obsidian Sync, or git). For a desktop-primary workflow, this is fine. For a workflow where you need to update notes from your phone on the go, it requires setup.
The vault doesn’t organize itself. QMD doesn’t help you decide what to write or where to write it. The retrieval engine is only as good as the content it retrieves from. This is the irreducible human responsibility: you have to maintain the discipline of actually capturing and structuring information for the system to work.
Semantic search is probabilistic. Most of the time it surfaces the right thing. Sometimes it misses something that should be obvious because the vocabulary of the query doesn’t overlap enough with the vocabulary of the document. The VAULT_INDEX pattern helps here — you can navigate by structure when the search isn’t returning what you expect.
Why This Stack
The alternatives I evaluated before settling on this:
Notion — browser-based, sync works, but the file format isn’t plain markdown. AI agents can read Notion via API, but it adds a dependency and there’s a latency cost. Markdown files on disk are zero-latency, always available, and trivially readable by any tool.
A database-backed memory system — more sophisticated, but adds infrastructure complexity. For the scale I’m operating at (a few thousand documents), a local file system + vector index is fast enough and the simplicity has significant operational advantages.
Just using the model’s context window — works for small tasks, completely fails at the scale of continuous multi-session operations. Context windows are getting larger, but the cost of loading thousands of pages of context into every session is prohibitive, and you don’t need most of it for any given task. Selective retrieval beats brute-force context loading.
The Obsidian + QMD combination wins on: local-first privacy, zero-latency file access, semantic retrieval without external dependencies, and the ability to use standard text tools to inspect and manipulate the vault directly.
That’s the setup. It took about a week to get right and has compounded in value every week since.
Have questions about setting up your own memory architecture? Email me: deacon@ridleyresearch.com
© Ridley Research. All rights reserved.
