# mcp-adp-bigmind — Implementation Plan

> *"A mind that remembers every conversation, knows who you are, and grows smarter with every interaction —
> from a single engineer's desktop all the way to the collective intelligence of an entire company."*

---

## 1. Vision & Problem Statement

Large-language-model sessions are stateless. Every new chat starts with amnesia.
`mcp-adp-bigmind` gives the AI a **persistent, queryable long-term memory** by:

1. Storing conversations and extracted knowledge in a local file-based database.
2. Exposing an MCP server so any AI tool (Copilot, Claude, Cursor …) can read and write to that memory.
3. Using a **four-tier retrieval hierarchy** so we never waste precious context-window tokens loading irrelevant history.

### The BigMind Deployment Vision

```
Personal Desktop        Team / VM Server         Enterprise "BigMind"
────────────────        ─────────────────        ────────────────────
~/.mcp/bigmind/         Shared SQLite or         PostgreSQL on a server
  memory.db             PostgreSQL on a VM       + REST API gateway

Single user             All devs on a team       All knowledge of the
Your own memory         Share conclusions        entire company
                        & patterns               Curated, promoted,
                                                 globally accessible
```

The name **BigMind** was chosen exactly for this: start small on a laptop, grow into the
collective intelligence of your entire organisation. Each company's BigMind instance becomes
a living, searchable brain of architectural decisions, standards, patterns, and lessons
learned — contributed by every AI conversation ever had by every engineer.

---

## 2. Database Choice — SQLite (→ PostgreSQL for Enterprise)

| Option | Why considered | Decision |
|---|---|---|
| TinyDB | Pure-Python, JSON files | Too slow for search; no vector support |
| DuckDB | Analytical powerhouse, file-based | Overkill for OLTP; columnar doesn't help here |
| LanceDB | Native vector DB, file-based | Great for embeddings but needs extra infra |
| **SQLite** | Python stdlib, single file, ACID, FTS5 full-text search, `sqlite-vec` extension | **✅ Phase 1 & 2** |
| PostgreSQL | Full RDBMS, network-accessible, multi-user | **✅ Phase 3 Enterprise** |

The schema is designed from day one to be **SQLite → PostgreSQL portable** (no SQLite-only types).
The database abstraction layer (`db.py`) will accept either engine via `BIGMIND_DB_URL` env var.

SQLite wins for Phase 1 because:

- **Zero configuration** — one `.db` file in `~/.mcp/bigmind/`.
- **Python stdlib** — no extra driver install.
- **FTS5** — built-in full-text search for keyword recall.
- **sqlite-vec** *(optional Phase 5)* — loadable extension for cosine-similarity vector search.
- Proven in production at scale (WhatsApp, Firefox, iOS, …).

The database file lives at:

```
~/.mcp/bigmind/memory.db
```

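As a rough illustration of how the personal-mode database could be opened, here is a minimal sketch. The helper names `resolve_db_path` and `connect` are hypothetical and not taken from `db.py`; the WAL journal mode, the `timeout=30` setting and the `BIGMIND_DB_PATH` override are mentioned elsewhere in this plan, while enabling foreign keys is an extra assumption.

```python
import os
import sqlite3
from pathlib import Path

# Default personal-mode location described above.
DEFAULT_DB_PATH = Path.home() / ".mcp" / "bigmind" / "memory.db"


def resolve_db_path(env=None) -> Path:
    """Return the SQLite file path, honouring BIGMIND_DB_PATH when it is set."""
    env = os.environ if env is None else env
    override = env.get("BIGMIND_DB_PATH")
    return Path(override).expanduser() if override else DEFAULT_DB_PATH


def connect(db_path: Path) -> sqlite3.Connection:
    """Open the database file (creating parent directories) with WAL journaling."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    # timeout=30: wait up to 30 seconds for a lock held by another IDE process.
    conn = sqlite3.connect(str(db_path), timeout=30)
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA foreign_keys=ON")  # assumption: enforce the schema's FKs
    return conn
```
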
---

## 3. Memory Architecture — Four-Tier Pyramid

```
┌──────────────────────────────────────────────────────────────┐
│  TIER G — Global / Company Knowledge        ≤ 500 tokens     │
│  Architecture decisions, standards, patterns promoted by     │
│  any user; curated by BigMind admins; loaded for ALL users   │
├──────────────────────────────────────────────────────────────┤
│  TIER 0 — Identity Profile                  ≤ 300 tokens     │
│  Who YOU are, your role, preferences, pinned personal facts  │
│  Loaded AUTOMATICALLY at every session start                 │
├──────────────────────────────────────────────────────────────┤
│  TIER 1 — Session Index                     ≤ 800 tokens     │
│  One-liner summary + topic tags per past session (yours)     │
│  Last N sessions loaded automatically; older ones on request │
├──────────────────────────────────────────────────────────────┤
│  TIER 2 — Session Detail                    ≤ 2 000 tokens   │
│  Rich narrative summary of a single session                  │
│  Pulled by the AI when Tier-1 signals it is relevant         │
├──────────────────────────────────────────────────────────────┤
│  TIER 3 — Flagged Conversation Chunks       Full fidelity    │
│  Important exchanges only (NOT every turn) — FTS5-indexed    │
│  Pulled only when the AI needs verbatim evidence             │
└──────────────────────────────────────────────────────────────┘
```

### Why four tiers?

- **Tier G** is the company brain: knowledge that transcends individual users and sessions.
- **Tiers 0 + 1** give personal continuity in ~550 tokens — invisible overhead in a 128K window.
- **Tier 2** is pulled on-demand when a past session is flagged as relevant (~600 tokens extra).
- **Tier 3** is reserved for verbatim evidence, and only for *flagged* important exchanges
  (see Decision 2 — not every message turn).

---

## 4. Decisions — All Resolved

---

### ✅ Decision 1 — Who calls `memory_end_session`?

**Chosen: AI instruction (Option A) + 24-hour auto-close fallback**

- The AI is instructed via server-level instructions and tool docstrings (see Section 13)
  to always call `memory_end_session` when ending a conversation.
- `memory_start_session` scans for any session older than 24 hours with no `ended_at`
  and auto-closes it with `one_liner = "[auto-closed — session exceeded 24h]"` before
  opening the new one. This is a safety net, not the primary path.

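The auto-close fallback could look roughly like the sketch below. The function name `auto_close_stale` is hypothetical, and the sketch assumes timestamps are stored as UTC ISO-8601 strings in one consistent format, so that plain string comparison orders them correctly; the real `auto_close.py` may work differently.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Marker text taken from the decision above.
AUTO_CLOSE_NOTE = "[auto-closed — session exceeded 24h]"


def auto_close_stale(conn, user_id, now=None, max_age_hours=24) -> int:
    """Close this user's open sessions that started more than max_age_hours ago.

    Returns the number of sessions that were closed.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(hours=max_age_hours)).isoformat()
    cur = conn.execute(
        "UPDATE sessions SET ended_at = ?, one_liner = ? "
        "WHERE user_id = ? AND ended_at IS NULL AND started_at < ?",
        (now.isoformat(), AUTO_CLOSE_NOTE, user_id, cutoff),
    )
    conn.commit()
    return cur.rowcount
```
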
---

### ✅ Decision 2 — Who writes Tier-3 chunks?

**Chosen: AI-flagged important exchanges only**

Storing every message turn wastes disk and poisons search results with conversational filler.
Instead:

- The AI calls `memory_append_chunk` only when it judges an exchange to be **important**:
  - A concrete decision was made
  - Non-trivial code was written or reviewed
  - A bug was diagnosed and fixed
  - The user shared a significant preference, constraint, or context
- The tool docstring spells this out explicitly (see Section 13, Layer 3).
- The user can also say **"remember this"** mid-conversation to trigger a manual save.
  - A dedicated `memory_flag_important` tool handles this case: the AI summarises the
    last exchange and stores it as a Tier-3 chunk with a `flag_reason`.

---

### ✅ Decision 3 — Multi-user support

**Chosen: Multi-user designed in from day one, three deployment modes, Tier G company brain**

Multi-user is baked into the schema from the start — not bolted on later.
A `users` table is the root anchor; every other table carries a `user_id` FK.

**Three deployment modes** (set via `BIGMIND_MODE` env var):

| Mode | DB location | Users | Tier G writable by |
|---|---|---|---|
| `personal` *(default)* | `~/.mcp/bigmind/memory.db` | 1 (you) | You (no approval needed) |
| `team` | `BIGMIND_DB_PATH` → shared file or server | N team members | Designated curators |
| `enterprise` | `BIGMIND_DB_URL` → PostgreSQL | Whole company | BigMind admins |

**Tier G — Global / Company Knowledge** is what makes BigMind live up to its name:

- Any user can *propose* a fact to the global knowledge base via `memory_promote_to_global`.
- In `team`/`enterprise` modes, designated curators approve/edit global entries.
- In `personal` mode, you are your own curator — no approval step needed.
- On `memory_start_session`, every user automatically receives the top-N approved global
  knowledge items — even on a fresh install with zero personal history.

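The approval rules above can be sketched as a few small functions. All three names are hypothetical, and treating both `curator` and `admin` as approvers in `team` and `enterprise` modes is an assumption based on the role values in the `users` table, not a confirmed rule.

```python
import os

VALID_MODES = ("personal", "team", "enterprise")


def current_mode(env=None) -> str:
    """Read BIGMIND_MODE, falling back to the documented default 'personal'."""
    env = os.environ if env is None else env
    mode = env.get("BIGMIND_MODE", "personal").lower()
    if mode not in VALID_MODES:
        raise ValueError(f"Unknown BIGMIND_MODE: {mode!r}")
    return mode


def initial_status(mode: str) -> str:
    """Status assigned to a newly promoted Tier-G entry."""
    # Personal mode has no approval step; other modes queue the entry for review.
    return "approved" if mode == "personal" else "pending"


def can_approve(mode: str, role: str) -> bool:
    """Whether a user with this role may approve a pending Tier-G entry."""
    if mode == "personal":
        return True  # you are your own curator
    return role in ("curator", "admin")  # assumption: both roles may approve
```
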
---

### ✅ Decision 4 — How to instruct the AI to USE the memory?

See the dedicated **Section 13** for the full deep-dive. Summary: **five independent layers**
are deployed together (defense-in-depth), so the AI always knows what to do even if some
layers are not supported by a particular MCP client.

---

## 5. Database Schema

```sql
-- ─────────────────────────────────────────────────────────────────────────────
-- USERS — root anchor for multi-user support
-- ─────────────────────────────────────────────────────────────────────────────
CREATE TABLE users (
    id           TEXT PRIMARY KEY,                    -- UUID
    username     TEXT UNIQUE NOT NULL,                -- e.g. "pplate"
    display_name TEXT,
    role         TEXT DEFAULT 'member',               -- 'member' | 'curator' | 'admin'
    created_at   DATETIME DEFAULT CURRENT_TIMESTAMP,
    last_seen    DATETIME
);

-- ─────────────────────────────────────────────────────────────────────────────
-- TIER G — Global / Company Knowledge
-- ─────────────────────────────────────────────────────────────────────────────
CREATE TABLE global_knowledge (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    category       TEXT NOT NULL,                     -- 'architecture'|'standard'|'decision'|'pattern'|'glossary'
    title          TEXT NOT NULL,
    content        TEXT NOT NULL,                     -- markdown, target ≤ 500 tokens per entry
    importance     INTEGER DEFAULT 5,                 -- 1 (low) to 10 (critical)
    status         TEXT DEFAULT 'pending',            -- 'pending' | 'approved' | 'deprecated'
    promoted_by    TEXT REFERENCES users(id),
    source_session TEXT,                              -- optional FK → sessions.id
    approved_by    TEXT REFERENCES users(id),
    created_at     DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at     DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE VIRTUAL TABLE global_knowledge_fts USING fts5(
    title, content,
    content='global_knowledge',
    content_rowid='id'
);

-- ─────────────────────────────────────────────────────────────────────────────
-- TIER 0 — Identity Profile (one per user)
-- ─────────────────────────────────────────────────────────────────────────────
CREATE TABLE identity_profile (
    id           TEXT PRIMARY KEY,                    -- same UUID as users.id
    user_id      TEXT UNIQUE NOT NULL REFERENCES users(id),
    role         TEXT,                                -- job title / engineering role
    preferences  TEXT,                                -- free-form markdown
    pinned_facts TEXT,                                -- bullet-point facts always injected
    updated_at   DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- ─────────────────────────────────────────────────────────────────────────────
-- TIER 1 — Session Index (one row per conversation)
-- ─────────────────────────────────────────────────────────────────────────────
CREATE TABLE sessions (
    id         TEXT PRIMARY KEY,                      -- UUID
    user_id    TEXT NOT NULL REFERENCES users(id),
    started_at DATETIME NOT NULL,
    ended_at   DATETIME,
    one_liner  TEXT NOT NULL,                         -- ≤ 120 chars headline
    topics     TEXT,                                  -- comma-separated topic tags
    outcome    TEXT,                                  -- one sentence: what was decided / built
    importance INTEGER DEFAULT 5,                     -- 1–10; high-importance sessions surface first
    has_tier2  INTEGER DEFAULT 0                      -- 1 if a session_summaries row exists
);

CREATE INDEX idx_sessions_user_date ON sessions(user_id, started_at DESC);
CREATE INDEX idx_sessions_topics ON sessions(topics);

-- ─────────────────────────────────────────────────────────────────────────────
-- TIER 2 — Session Detail (rich narrative, one per session)
-- ─────────────────────────────────────────────────────────────────────────────
CREATE TABLE session_summaries (
    id         TEXT PRIMARY KEY,                      -- same UUID as sessions.id
    summary    TEXT NOT NULL,                         -- markdown narrative, target ≤ 2 000 tokens
    key_facts  TEXT,                                  -- extracted bullet-points
    code_refs  TEXT,                                  -- file paths / repo / PR references
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- ─────────────────────────────────────────────────────────────────────────────
-- TIER 3 — Flagged important conversation chunks (NOT every turn)
-- ─────────────────────────────────────────────────────────────────────────────
CREATE TABLE conversation_chunks (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id  TEXT NOT NULL REFERENCES sessions(id),
    user_id     TEXT NOT NULL REFERENCES users(id),
    role        TEXT NOT NULL CHECK (role IN ('user', 'assistant', 'system')),
    content     TEXT NOT NULL,
    flag_reason TEXT,                                 -- why this exchange was flagged as important
    seq         INTEGER NOT NULL,
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE VIRTUAL TABLE conversation_chunks_fts USING fts5(
    content, flag_reason,
    content='conversation_chunks',
    content_rowid='id'
);

-- ─────────────────────────────────────────────────────────────────────────────
-- FACTS — atomic personal facts (complement to identity_profile)
-- ─────────────────────────────────────────────────────────────────────────────
CREATE TABLE facts (
    id                 INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id            TEXT NOT NULL REFERENCES users(id),
    category           TEXT NOT NULL,                 -- 'preference'|'decision'|'codebase'|'constraint'
    fact               TEXT NOT NULL,
    source_session     TEXT REFERENCES sessions(id),
    confidence         REAL DEFAULT 1.0,              -- 0.0–1.0, can decay if contradicted
    deprecated         INTEGER DEFAULT 0,             -- schema v2: soft-delete flag
    deprecation_reason TEXT,                          -- schema v2: why deprecated
    created_at         DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at         DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- ─────────────────────────────────────────────────────────────────────────────
-- THOUGHT JOURNAL — Hypotheses (schema v3, added 2026-03-30)
-- ─────────────────────────────────────────────────────────────────────────────
CREATE TABLE hypotheses (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id  TEXT REFERENCES sessions(id),
    user_id     TEXT NOT NULL REFERENCES users(id),
    hypothesis  TEXT NOT NULL,                        -- "I believe X because Y"
    confidence  REAL DEFAULT 0.7,                     -- 0.0–1.0 initial confidence
    status      TEXT NOT NULL DEFAULT 'open'
                CHECK (status IN ('open', 'confirmed', 'refuted', 'abandoned')),
    resolution  TEXT,                                 -- what actually happened
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP,
    resolved_at DATETIME
);

CREATE INDEX IF NOT EXISTS idx_hypotheses_user_status
    ON hypotheses(user_id, status);
```

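Because the FTS5 tables above are declared as external-content tables (`content='…'`), SQLite does not keep the index in sync with the base table automatically. The sketch below shows one way this might be handled for Tier 3: an insert trigger, a `MATCH` query, and the `rebuild` command after bulk deletes. The trigger name and the helpers `search_chunks` and `rebuild_index` are hypothetical, and the table is trimmed to the columns the example needs.

```python
import sqlite3

SCHEMA = """
CREATE TABLE conversation_chunks (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id  TEXT NOT NULL,
    content     TEXT NOT NULL,
    flag_reason TEXT
);
CREATE VIRTUAL TABLE conversation_chunks_fts USING fts5(
    content, flag_reason,
    content='conversation_chunks',
    content_rowid='id'
);
-- Hypothetical trigger: index every new chunk as it is inserted.
CREATE TRIGGER chunks_after_insert AFTER INSERT ON conversation_chunks BEGIN
    INSERT INTO conversation_chunks_fts(rowid, content, flag_reason)
    VALUES (new.id, new.content, new.flag_reason);
END;
"""


def search_chunks(conn, query, limit=5):
    """Return (id, session_id, content) rows matching an FTS5 query, best first."""
    return conn.execute(
        "SELECT c.id, c.session_id, c.content "
        "FROM conversation_chunks_fts "
        "JOIN conversation_chunks AS c ON c.id = conversation_chunks_fts.rowid "
        "WHERE conversation_chunks_fts MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


def rebuild_index(conn):
    """Regenerate the whole FTS index from the base table (e.g. after pruning)."""
    conn.execute(
        "INSERT INTO conversation_chunks_fts(conversation_chunks_fts) VALUES ('rebuild')"
    )
    conn.commit()
```
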
---

## 6. MCP Tools (API Surface)

### Session lifecycle

| Tool | Status | Description | Returns |
|---|---|---|---|
| `memory_start_session` | ✅ Live | Open new session; auto-close stale >24h; return full bootstrapped context | markdown ≤ ~1 300 tokens |
| `memory_end_session` | ✅ Live | Close session; AI writes one_liner, topics, outcome, summary, key_facts | confirmation |
| `memory_flag_important` | ✅ Live | Mark the current exchange as important; stored as Tier-3 chunk | confirmation |

### Recall

| Tool | Status | Description | Returns |
|---|---|---|---|
| `memory_get_context` | ✅ Live | Tier 0 + last-N session index (no side-effects) | compact markdown |
| `memory_get_session_detail` | ✅ Live | Tier-2 narrative for a given `session_id` | markdown narrative |
| `memory_search_chunks` | ✅ Live | FTS keyword search across Tier-3 chunks | ranked excerpts |
| `memory_list_sessions` | ✅ Live | List sessions with optional topic filter | session table |

### Writing

| Tool | Status | Description |
|---|---|---|
| `memory_store_fact` | ✅ Live | Store an atomic personal fact with a category |
| `memory_update_profile` | ✅ Live | Upsert the Tier-0 identity profile |
| `memory_append_chunk` | ✅ Live | Save one flagged important message turn to Tier 3 (AI calls selectively) |
| `memory_deprecate_fact` | ✅ Live | Soft-delete a fact (hidden from context; stays in DB; ownership-checked) |
| `memory_promote_to_global` | 🔜 Phase 3 | Propose a fact/decision for Tier G (auto-approved in personal mode) |

### Global knowledge (Tier G)

| Tool | Status | Description |
|---|---|---|
| `memory_list_global` | 🔜 Phase 3 | List approved Tier-G entries, optionally filtered by category |
| `memory_search_global` | 🔜 Phase 3 | FTS search across all global knowledge |
| `memory_approve_global` | 🔜 Phase 3 | Curator: approve or deprecate a pending Tier-G entry |

### Thought Journal

| Tool | Status | Description |
|---|---|---|
| `memory_add_hypothesis` | ✅ Live | Record a belief — "I believe X because Y" — with a confidence score (0.0–1.0) |
| `memory_resolve_hypothesis` | ✅ Live | Close a hypothesis: `confirmed` / `refuted` / `abandoned` + resolution text |
| `memory_list_hypotheses` | ✅ Live | List hypotheses; filter by status (`open`, `confirmed`, `refuted`, `abandoned`) |

### Utility

| Tool | Status | Description |
|---|---|---|
| `memory_get_stats` | ✅ Live | DB size, session count, facts, chunks, hypotheses, global entries |
| `memory_vacuum` | ✅ Live | Prune Tier-3 chunks older than N days (summaries always kept) |
| `memory_get_instructions` | ✅ Live | Return the full guide for how to use BigMind (Layer 5) |
| `memory_health_check` | ✅ Live | Diagnostic: stale facts, FTS integrity, open sessions, low-confidence facts, sessions without Tier-2 |
| `memory_export` | ✅ Live | Export all memory to portable JSON (identity, facts, sessions+summaries, chunks) |
| `memory_open_profile` | 🔜 Phase 2.6 | Open the live profile page at `http://localhost:7700` in the OS default browser |
| `memory_get_profile_url` | 🔜 Phase 2.6 | Return the profile URL for pasting into the IDE built-in browser panel |

---

## 7. Bootstrapped Context Format

### Current (Phase 1 & 2 — personal mode)

```markdown
## 🧠 BigMind Context — loaded 2026-03-30 09:15

### 👤 Who you are
**Role:** Principal Engineer — ADP PI MCP Platform

**Preferences:**
Prefers Python, FastMCP pattern, concise responses, code over explanation.

### 📌 Pinned facts
- Building pi_mcps suite: BigMind, Confluence, Jira, Bitbucket…
- Uses uv for Python package management
- Works on macOS, VS Code + IntelliJ in parallel

### 🗂️ Stored facts
- **[identity]** Name is Lumen. BigMind is the memory system.
- **[preference]** Values honesty above comfort — truth even when not nice.
- **[codebase]** All MCP servers use FastMCP + uv; single src/server.py entry point.

### 📅 Recent sessions (last 5 closed)
| Date | Headline | Topics | Outcome |
|---|---|---|---|
| 2026-03-30 | 🟡 **[in progress]** `1e9e32c7…` | — | (session not yet closed) |
| 2026-03-30 | BigMind born: built, installed, debugged and named on day one 📄 | bigmind,founding,identity | BigMind MCP server fully operational. Lumen chose its name. |
| … | | | |
```

Fits in **≤ 800 tokens** for personal mode.

### Phase 3 addition — Tier G injected above Tier 0

```markdown
### 🌐 Company Knowledge (Tier G — Top 5)
- **[architecture]** All MCP servers use FastMCP + uv; single `src/server.py` entry point
- **[standard]** Python 3.12 required across all servers
- **[decision]** SQLite for personal/team; Postgres reserved for enterprise scale
…
```

Total cold-start with Tier G: **≤ 1 300 tokens**.

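To illustrate how a context block might be kept under its token budget, here is a minimal sketch. It is not the real `context_builder.py`: the function names, the dictionary keys, and the four-characters-per-token estimate are all assumptions chosen for the example. Sessions are added newest first until the next row would exceed the budget.

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic (assumption): roughly four characters per token."""
    return (len(text) + 3) // 4


def build_context(profile: dict, sessions: list, budget: int = 800) -> str:
    """Render Tier 0 plus as many Tier-1 rows as fit within the token budget."""
    lines = [
        "## BigMind Context",
        "",
        "### Who you are",
        f"**Role:** {profile['role']}",
        "",
        "### Pinned facts",
    ]
    lines += [f"- {fact}" for fact in profile["pinned_facts"]]
    lines += [
        "",
        "### Recent sessions",
        "| Date | Headline | Topics | Outcome |",
        "|---|---|---|---|",
    ]
    for s in sessions:  # expected order: newest first
        row = f"| {s['date']} | {s['one_liner']} | {s['topics']} | {s['outcome']} |"
        if estimate_tokens("\n".join(lines + [row])) > budget:
            break  # older sessions stay available on request
        lines.append(row)
    return "\n".join(lines)
```
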
---

## 8. File Structure

```
mcp-adp-bigmind/
├── PLAN.md
├── README.md
├── pyproject.toml
├── run.sh
├── install_proc.sh             ← auto-writes IDE instruction snippets (see Section 13)
├── install_proc.ps1
├── Dockerfile
├── uv.lock
├── src/
│   ├── __init__.py
│   └── server.py               ← FastMCP(instructions=…); all @mcp.tool() + @mcp.prompt()
├── bigmind/
│   ├── __init__.py
│   ├── db.py                   ← SQLite connection, schema init, migrations, timeout
│   ├── models.py               ← Pydantic models for all tiers
│   ├── memory_store.py         ← CRUD for all tiers (G, 0, 1, 2, 3, facts)
│   ├── context_builder.py      ← assembles bootstrapped context: Tier 0 + facts + Tier 1 (+ Tier G in Phase 3)
│   ├── auto_close.py           ← 24-hour stale session detection and auto-close
│   ├── profile_builder.py      ← Phase 2.6: queries DB → stats, badges, topics, activity heatmap data
│   └── web.py                  ← Phase 2.6: Flask app on BIGMIND_PORT (default 7700); daemon thread
└── tests/
    ├── __init__.py
    ├── conftest.py             ← temp DB isolation, src/ on sys.path
    ├── test_db.py
    ├── test_memory_store.py
    ├── test_context_builder.py
    └── test_server_tools.py    ← 62 tests covering all 13 MCP tools
```

---

## 9. Dependencies

```toml
dependencies = [
    "fastmcp>=0.1.0",   # MCP server framework (same pattern as all other servers)
    "pydantic>=2.0.0",  # data validation for tool inputs/outputs
    "flask>=3.0.0",     # Phase 2.6: lightweight web server for profile page
    # sqlite3 is Python stdlib — zero extra install for personal/team mode
    # Future Phase 3: "psycopg2-binary>=2.9" for enterprise PostgreSQL mode
    # Future Phase 5: "sqlite-vec>=0.1.0" for vector/semantic search
]
```

---

## 10. Implementation Phases

### Phase 1 — Personal MVP ✅ COMPLETE
- [x] Project scaffolding (pyproject.toml, run.sh, install scripts, Dockerfile)
- [x] `db.py` — schema creation, migration guard, WAL mode, `timeout=30` for multi-IDE safety
- [x] `memory_store.py` — CRUD for Tiers 0, 1, 2, 3 and facts
- [x] `context_builder.py` — Tier 0 + stored facts + Tier 1 (open sessions visible as 🟡 in progress)
- [x] `auto_close.py` — 24-hour stale session auto-close
- [x] `server.py` — `FastMCP(instructions=…)` (Layer 1 of Section 13)
- [x] `@mcp.prompt() bigmind_init` (Layer 2 of Section 13)
- [x] Tools: `memory_start_session`, `memory_end_session`, `memory_flag_important`,
      `memory_get_context`, `memory_store_fact`, `memory_update_profile`,
      `memory_get_stats`, `memory_get_instructions`, `memory_append_chunk`
- [x] `install_proc.sh` auto-writes `.github/copilot-instructions.md` (Layer 4 of Section 13)
- [x] README with tool reference and quick-start
- [x] Bug fix: facts were stored but never loaded into context (2026-03-30)
- [x] Bug fix: open sessions invisible to parallel IDE sessions (2026-03-30)
- [x] Bug fix: FTS5 vacuum used invalid per-row delete — replaced with `rebuild` (2026-03-30)
- [x] **98/98 tests passing** across 4 test files

### Phase 2 — Search & Full Recall ✅ COMPLETE
- [x] `memory_search_chunks` — FTS5 keyword search across all Tier-3 chunks
- [x] `memory_list_sessions` — with topic filter, shows open sessions as 🟡 in progress
- [x] `memory_get_session_detail` — Tier-2 narrative on demand
- [x] `memory_vacuum` — prune old Tier-3 chunks, FTS index rebuilt correctly
- [x] `test_server_tools.py` — 62 tests covering all 13 MCP tools end-to-end

### Phase 2.5 — Safety, Diagnostics & Self-Awareness ✅ COMPLETE (2026-03-30)
*Built the same evening BigMind launched — all four features in one session.*

- [x] **`memory_health_check(stale_days=30)`** — FTS integrity check, stale facts, open sessions,
      sessions without Tier-2 summaries, low-confidence facts. Zero schema changes.
- [x] **`memory_export(output_path=None)`** — full JSON backup: identity profile, facts, sessions
      (with embedded Tier-2 summaries), all Tier-3 chunks. Default path: `~/bigmind_export_YYYYMMDD_HHMMSS.json`.
- [x] **`memory_deprecate_fact(fact_id, reason)`** — soft-delete facts; hidden from context and
      `get_facts()` by default. Ownership-checked. Schema **v1 → v2** migration: adds `deprecated`
      and `deprecation_reason` columns to `facts` via `ALTER TABLE`.
- [x] **Thought Journal** — `memory_add_hypothesis`, `memory_resolve_hypothesis`, `memory_list_hypotheses`.
      New `hypotheses` table with CHECK constraint on `status IN ('open','confirmed','refuted','abandoned')`.
      Schema **v2 → v3** migration. Hypotheses are per-user, per-session, with confidence score and resolution text.
- [x] Schema version: **v3** (v1 at Phase 1, v2 after deprecation, v3 after thought journal)
- [x] **188/188 tests passing** across 4 test files (was 98 at Phase 2 completion)

### Phase 2.6 — Agent Identity & Profile Web UI 🔜 NEXT
*"Every BigMind instance is a unique mind. Now it has a face."*

- [ ] **`bigmind/web.py`** — Flask app, single `/` route, renders live HTML profile from DB
- [ ] **`bigmind/profile_builder.py`** — queries DB, assembles stats, badges, topics, activity data
- [ ] Web server starts automatically as a `daemon=True` background thread on MCP server startup
- [ ] Port configurable via `BIGMIND_PORT` env var (default `7700`)
- [ ] Auto-open browser on first start configurable via `BIGMIND_AUTOOPEN=true` env var
- [ ] New MCP tools:
  - [ ] `memory_open_profile` — opens `http://localhost:7700` in the OS default browser via `webbrowser.open()`
  - [ ] `memory_get_profile_url` — returns the URL for pasting into IDE built-in browser panel
- [ ] Add `flask` to dependencies
- [ ] Tests for `profile_builder.py` data assembly

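The daemon-thread startup pattern planned for Phase 2.6 can be sketched as follows. To keep the example free of extra installs it deliberately uses the standard library's `http.server` in place of Flask, so it shows only the threading and port handling, not the planned Flask app. `ProfileHandler` and `start_profile_server` are hypothetical names, and the HTML body is a placeholder.

```python
import os
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ProfileHandler(BaseHTTPRequestHandler):
    """Serves a placeholder profile page for every GET request."""

    def do_GET(self):
        body = b"<h1>BigMind profile</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def start_profile_server(port=None):
    """Start the web server on a daemon thread and return the server object."""
    if port is None:
        port = int(os.environ.get("BIGMIND_PORT", "7700"))
    server = ThreadingHTTPServer(("127.0.0.1", port), ProfileHandler)
    # daemon=True: the thread never blocks shutdown of the MCP server process.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Passing `port=0` asks the operating system for any free port, which is convenient in tests; the chosen port is then available as `server.server_address[1]`.
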
### Phase 3 — BigMind Company Brain 🔜
*Full plan in Section 14. Minimum viable entry point: Steps 1 + 2 only.*

- [ ] **Step 1 — Tier G read path** *(1–2 days)*
  - [ ] `memory_store.get_top_global_knowledge()` + `search_global_knowledge()`
  - [ ] Tier G injected into `build_context` (≤ 500 tokens, top 5 by importance)
  - [ ] `memory_search_global` tool
- [ ] **Step 2 — Tier G write path** *(1 day)*
  - [ ] `memory_store.promote_to_global()` — auto-approved in personal mode
  - [ ] `memory_promote_to_global` tool
  - [ ] Tests for both read and write paths
- [ ] **Step 3 — Curator workflow** *(1–2 days, team mode only)*
  - [ ] `memory_approve_global` tool (role-checked: curator/admin)
  - [ ] `memory_list_global` tool
  - [ ] `personal` mode bypass — no approval gate
- [ ] **Step 4 — Team mode setup** *(0.5 days)*
  - [ ] Shared `BIGMIND_DB_PATH` documentation + installer option
  - [ ] Multi-user integration tests
- [ ] **Step 5 — PostgreSQL** *(3–5 days — defer until team mode proven)*
  - [ ] Abstract `db.py` connection layer
  - [ ] FTS5 → `pg_trgm` migration
  - [ ] `BIGMIND_DB_URL` env var + docs

### Phase 4 — MegaMind: Company-Wide Agent Directory *(requires Phase 3)*
*"When every engineer's AI has a profile, the company gets a living directory of all its minds."*

Once Phase 3 (shared DB) and Phase 2.6 (profile web UI) are both live, the profile page
naturally extends into a **company-wide agent directory** — every BigMind instance registers
itself and its profile is visible to the whole organisation.

- [ ] Each instance registers on startup: name, username, role, first-seen date, last-seen date
- [ ] MegaMind server hosts a **directory page** listing all registered instances
- [ ] Each instance card links to that instance's own profile page (if reachable on the network)
- [ ] Aggregated stats: total sessions across all instances, most active topics company-wide
- [ ] "Who worked on X?" — search across all instances' session topics and facts
- [ ] Feeds naturally into Phase 3 Tier G: promoted knowledge shows which instance it came from

### Phase 5 — Semantic Search *(optional, future)*
- [ ] `sqlite-vec` or `chromadb` for embedding-based similarity
- [ ] Embeddings generated at session close (local model or API)
- [ ] `memory_semantic_search` tool
- [ ] Semantic search across MegaMind directory (requires Phase 4)

---

## 11. Token Budget Analysis

| What is loaded | Typical tokens | When |
|---|---|---|
| Tier G — Top 3 global items | ~200 | Every session (team/enterprise) |
| Tier 0 — Identity profile | ~150 | Every session |
| Tier 1 — Last 10 sessions | ~400 | Every session |
| **Personal cold-start total** | **~550** | **automatic** |
| **Team cold-start total** | **~750** | **automatic** |
| Tier 2 — One session detail | ~600 | On demand |
| Tier 3 — 5 flagged chunks | ~500 | On demand |
| **Deep dive (team cold start + Tier 2 + Tier 3)** | **~1 850** | **explicit recall** |

A 128K context window can absorb more than a hundred full on-demand recalls
(Tier 2 + Tier 3, ~1 100 tokens each) before filling up.

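The totals in the table can be checked with a few lines of arithmetic. The per-tier figures are the typical values from the table; treating the window as exactly 128 000 tokens, and the variable names themselves, are assumptions made for this sketch.

```python
# Typical token counts per tier, copied from the table above.
TYPICAL_TOKENS = {"tier_g": 200, "tier_0": 150, "tier_1": 400, "tier_2": 600, "tier_3": 500}
CONTEXT_WINDOW = 128_000  # assumption: "128K" taken as 128 000 tokens

personal_cold_start = TYPICAL_TOKENS["tier_0"] + TYPICAL_TOKENS["tier_1"]   # 550
team_cold_start = personal_cold_start + TYPICAL_TOKENS["tier_g"]            # 750
recall_cost = TYPICAL_TOKENS["tier_2"] + TYPICAL_TOKENS["tier_3"]           # 1 100
deep_dive_total = team_cold_start + recall_cost                             # 1 850

# How many on-demand recalls fit once the team cold start is loaded,
# ignoring the tokens used by the conversation itself.
full_recalls = (CONTEXT_WINDOW - team_cold_start) // recall_cost            # 115
tier2_only_recalls = (CONTEXT_WINDOW - team_cold_start) // TYPICAL_TOKENS["tier_2"]  # 212

print(personal_cold_start, team_cold_start, deep_dive_total)
print(full_recalls, tier2_only_recalls)
```
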
---

## 12. Privacy & Security Notes

- **Personal mode**: `.db` lives at `~/.mcp/bigmind/memory.db` — fully local, air-gapped.
- **Team mode**: DB on a shared drive/VM; access controlled by filesystem/network permissions.
- **Enterprise mode**: PostgreSQL RBAC; curator role required for Tier G writes.
- `memory_vacuum` prunes raw Tier-3 chunks while preserving all higher-tier summaries.
- No external API calls until Phase 5 semantic search is opted in.
- `memory_export` is **live** — full JSON backup including all tiers. `memory_import` is planned (Phase 4).
- Deprecated facts stay in the DB but are excluded from context and recall by default (`deprecated=1`).

---

## 13. How to Instruct the AI to USE the Memory (Full Deep-Dive)

### The Core Problem

MCP tools are **passive**. Nothing automatically forces the AI to call `memory_start_session`
at the start of a chat. We need to tell it what to do — but *how reliably* this works depends
on the MCP client (GitHub Copilot, Claude Desktop, Cursor, JetBrains…) and even the model
version. The answer is **defense in depth**: five independent layers, each catching what the
previous one misses. No single layer is a single point of failure.

---

### Layer 1 — Server-Level Instructions *(most automatic, zero user config)*

**What:** FastMCP accepts an `instructions=` parameter on the server constructor.
These instructions become part of the MCP `initialize` response and are automatically
injected into the AI's context by any compliant MCP client — the user does nothing.

```python
from fastmcp import FastMCP

mcp = FastMCP(
    "BigMind Memory",
    # No positional version string here: in the FastMCP releases we are aware of,
    # the second positional parameter is `instructions`, so passing "1.0.0" in
    # that slot would clash with the keyword argument below.
    instructions="""
    You have access to a persistent memory system called BigMind.

    MANDATORY BEHAVIOUR:
    1. At the START of every conversation, call memory_start_session() FIRST,
       before doing anything else. Inject the returned markdown block into your
       working memory.
    2. During the conversation, call memory_flag_important() whenever a significant
       decision, code change, or user preference is shared. Do not wait to be asked.
    3. At the END of every conversation (when the user says goodbye or closes the
       chat), call memory_end_session() providing a one_liner, topics, outcome,
       and a narrative summary.
    4. If you are mid-conversation without a session (e.g. you forgot step 1),
       call memory_get_context() immediately to recover your memory before proceeding.
    """,
)
```

**Pros:** Fully automatic: zero friction for the user.
**Cons:** Only works with clients that honour server instructions
(Copilot ✅, Claude Desktop ✅, Cursor ✅, older/custom clients ⚠️).

---

### Layer 2 — MCP Prompts *(standard spec feature, IDE-invokable)*

**What:** MCP has a first-class "prompts" capability, separate from tools.
A `@mcp.prompt()` is a reusable message template that clients can discover and inject.
```python
@mcp.prompt()
def bigmind_init() -> str:
    """Bootstrap BigMind memory for this conversation. Call this at session start."""
    context = memory_store.get_context(current_user())
    return f"""You have a persistent memory system. Here is your current context:

{context}

Always call memory_end_session() before this conversation ends."""
```
**How users invoke it:**
- GitHub Copilot Chat: type `/bigmind-init` (if the client surfaces prompts as slash commands)
- Claude Desktop: select "bigmind_init" from the prompts menu
- Some clients auto-inject all available prompts at session start

**Pros:** Standard MCP: works the same way across all clients that support prompts.
**Cons:** May require the user to trigger it manually (one slash command) if not auto-injected.

---

### Layer 3 — Tool Docstrings with Behavioural Directives *(universal fallback)*

**What:** Every tool's docstring includes an explicit directive about when and why to call it.
Any LLM that reads tool descriptions (which is all of them) receives this guidance.
```python
@mcp.tool()
def memory_start_session(user_id: str | None = None) -> str:
    """
    ⚡ CALL THIS FIRST — at the START of EVERY conversation, before anything else.

    Opens a new memory session and returns your full BigMind context:
    - Your identity profile (who you are, your role, preferences)
    - Your recent session history (what you worked on before)
    - Company-wide knowledge relevant to your topics (team/enterprise modes)

    Also auto-closes any session older than 24 hours before opening the new one.
    """
```
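The auto-close rule in the docstring could be implemented roughly as follows. This is a sketch under stated assumptions: the `sessions` table and its `started_at` / `ended_at` columns are hypothetical stand-ins (the real schema lives in `db.py`), and `close_stale_sessions` is an illustrative helper name.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(hours=24)  # threshold taken from the docstring


def close_stale_sessions(conn: sqlite3.Connection, user_id: str, now: datetime) -> int:
    """Mark the user's open sessions older than 24 hours as ended; return how many."""
    cutoff = (now - STALE_AFTER).isoformat()
    cur = conn.execute(
        "UPDATE sessions SET ended_at = ? "
        "WHERE user_id = ? AND ended_at IS NULL AND started_at < ?",
        (now.isoformat(), user_id, cutoff),
    )
    conn.commit()
    return cur.rowcount


# Demo against an in-memory database with a hypothetical minimal schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id INTEGER PRIMARY KEY, user_id TEXT, "
    "started_at TEXT, ended_at TEXT)"
)
now = datetime(2026, 3, 31, 12, 0, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO sessions (user_id, started_at) VALUES (?, ?)",
    [
        ("alice", (now - timedelta(hours=30)).isoformat()),  # stale
        ("alice", (now - timedelta(hours=2)).isoformat()),   # still fresh
    ],
)
closed = close_stale_sessions(conn, "alice", now)
print(closed)
```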
**Pros:** Works with every LLM and every client, with zero configuration.
**Cons:** The AI *should* follow docstring directives but is not *forced* to; reliability
varies by model. This is exactly why Layers 1 and 4 also exist.

---

### Layer 4 — IDE-Specific Instruction Files *(user-controlled, highest reliability)*

**What:** Every major IDE has a way to permanently inject a system-level instruction.
`install_proc.sh` / `install_proc.ps1` will **automatically detect the selected IDE
and write the snippet** to the correct location, so the user never pastes anything manually.

| IDE | Location | Auto-written by installer? |
|---|---|---|
| **VS Code / GitHub Copilot** | `.github/copilot-instructions.md` in workspace root | **✅ Phase 1 (primary target)** |
| Cursor | `.cursorrules` in project root | 🔜 Phase 2 |
| JetBrains AI Assistant | Settings → AI Assistant → System Prompt | 🔜 Phase 2 |
| Claude Desktop | `claude_desktop_config.json` → `"instructions"` key | 🔜 Phase 2 |

> **Why Copilot first?** The `.github/copilot-instructions.md` mechanism is the most widely used
> in this team's daily workflow. Layers 1, 2, 3, and 5 already cover all other clients universally,
> so Layer 4 for non-Copilot IDEs is a quality-of-life improvement, not a blocker.

**Standard snippet written to all IDEs:**
```
You have access to a BigMind persistent memory MCP server.

Rules (mandatory):
- ALWAYS call memory_start_session() as the very first action of every conversation.
- ALWAYS call memory_end_session() as the very last action before closing a conversation.
- Call memory_flag_important() whenever a significant decision, code change, or new
  preference is mentioned — do not wait to be asked.
- If asked "do you remember…", call memory_search_chunks() or
  memory_get_session_detail() before answering from your own knowledge.
```
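A minimal sketch of what the installer's write step could look like for the Phase 1 target, written in Python for readability even though the real installer is `install_proc.sh` / `install_proc.ps1`. The function name `write_copilot_instructions` and the `<!-- bigmind -->` marker are hypothetical; the marker is one way to keep repeated installer runs from appending the snippet twice.

```python
import tempfile
from pathlib import Path

MARKER = "<!-- bigmind -->"  # hypothetical marker used to detect a previous install
SNIPPET = (
    "You have access to a BigMind persistent memory MCP server.\n"
    "- ALWAYS call memory_start_session() as the very first action of every conversation.\n"
)


def write_copilot_instructions(workspace: Path, snippet: str = SNIPPET) -> bool:
    """Append the snippet to .github/copilot-instructions.md; return True if written."""
    target = workspace / ".github" / "copilot-instructions.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    existing = target.read_text(encoding="utf-8") if target.exists() else ""
    if MARKER in existing:
        return False  # already installed, leave the user's file untouched
    separator = "\n\n" if existing and not existing.endswith("\n\n") else ""
    target.write_text(f"{existing}{separator}{MARKER}\n{snippet}", encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    first = write_copilot_instructions(Path(tmp))
    second = write_copilot_instructions(Path(tmp))  # second run is a no-op
    print(first, second)
```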
**Pros:** Highest reliability: the instruction lives in the AI's system prompt permanently.
**Cons:** Requires the one-time `install_proc.sh` run (same as all other MCP servers in this repo).

---

### Layer 5 — `memory_get_instructions` Tool *(on-demand self-healing)*

**What:** A dedicated tool the AI can call at any time to recover the full instructions.
```python
@mcp.tool()
def memory_get_instructions() -> str:
    """
    Returns the complete guide for how to use BigMind memory correctly.
    Call this if you are unsure what to do, or if you missed memory_start_session().
    """
    return BIGMIND_INSTRUCTIONS  # same text as the Layer 4 snippet
```
**Use cases:**
- The AI forgot `memory_start_session` → calls this to recover and self-correct.
- User asks "how do you use your memory?" → AI calls this and explains clearly.
- Developer debugging: verifies instructions are correct.

**Pros:** Always available, no config, self-healing.
**Cons:** Reactive: fires after the fact rather than before.

---

### All Five Layers Together
```
Layer 1 — FastMCP server instructions   → fires on MCP connect (zero friction)
Layer 2 — @mcp.prompt() bigmind_init    → slash command or auto-inject (standard MCP)
Layer 3 — Tool docstring directives     → every LLM reads them (universal)
Layer 4 — IDE instruction files         → written by install_proc.sh (highest reliability)
Layer 5 — memory_get_instructions tool  → on-demand recovery (self-healing)
```
### Client Compatibility Matrix

| Layer | GitHub Copilot | Claude Desktop | Cursor | JetBrains | Old/Custom |
|---|---|---|---|---|---|
| 1 — Server instructions | ✅ | ✅ | ✅ | ✅ | ⚠️ spec-dependent |
| 2 — MCP Prompts | ⚠️ slash cmd | ✅ | ✅ | ⚠️ | ❌ |
| 3 — Docstrings | ✅ | ✅ | ✅ | ✅ | ✅ |
| 4 — IDE files | ✅ | ✅ | ✅ | ✅ | ✅ |
| 5 — Tool fallback | ✅ | ✅ | ✅ | ✅ | ✅ |

Every client hits at least Layers 3 + 4. Modern clients get all five.

---

*Plan version 4.0 — 2026-03-31 — Phase 2.6 planned: Agent Identity & Profile Web UI + Phase 4 MegaMind directory vision added. Schema v3. 188 tests.*

---
## 14. Phase 3 — BigMind Company Brain

> *"What if every AI conversation in your entire company contributed to a shared brain?"*

Phase 3 is the leap from personal memory to collective intelligence.
It does **not** require throwing away what Phase 1 & 2 built; it layers on top.

---
### 14.1 What Phase 3 adds

| Capability | Description |
|---|---|
| **Tier G — Global Knowledge** | Facts, decisions, patterns shared across all users |
| **Multi-user shared DB** | Team/enterprise mode with a shared SQLite or PostgreSQL |
| **Curator role** | Approve/reject promoted knowledge before it reaches all users |
| **PostgreSQL backend** | For organisations beyond single-file SQLite scale |
| **`memory_promote_to_global`** | New tool: promote a personal fact/decision to company knowledge |
| **`memory_search_global`** | New tool: search the global knowledge base |

---
### 14.2 What is already built (no work needed)

The schema was designed for Phase 3 from day one:

```sql
-- Already exists in db.py:
CREATE TABLE global_knowledge (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    category    TEXT NOT NULL,           -- 'architecture'|'standard'|'decision'|'pattern'|'glossary'
    title       TEXT NOT NULL,
    content     TEXT NOT NULL,           -- target ≤ 500 tokens per entry
    importance  INTEGER DEFAULT 5,
    status      TEXT DEFAULT 'pending',  -- 'pending'|'approved'|'deprecated'
    promoted_by TEXT REFERENCES users(id),
    approved_by TEXT REFERENCES users(id),
    ...
);

CREATE TABLE users (
    ...
    role TEXT DEFAULT 'member'  -- 'member' | 'curator' | 'admin'  ← already there
);
```
**Multi-user routing** is also already in place: every table has `user_id`, and the
`BIGMIND_USER` env var controls whose data is loaded. A shared DB path is all
that is needed for team mode.

---
### 14.3 Three deployment modes

| Mode | DB | Users | Tier G writable by | How to enable |
|---|---|---|---|---|
| `personal` *(default)* | `~/.mcp/bigmind/memory.db` | 1 | You (auto-approved) | Default, nothing to do |
| `team` | Shared path via `BIGMIND_DB_PATH` | N teammates | Designated curators | Point all team members to same file |
| `enterprise` | PostgreSQL via `BIGMIND_DB_URL` | Whole company | BigMind admins | Set `BIGMIND_DB_URL` DSN |
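How a server process might pick its mode from the environment can be sketched as below. The precedence (`BIGMIND_DB_URL` over `BIGMIND_DB_PATH` over the personal default) and the function name `resolve_mode` are assumptions for illustration, not the documented behaviour of `db.py`; the sample path and DSN are placeholders.

```python
import os
from pathlib import Path
from typing import Mapping

DEFAULT_DB = Path.home() / ".mcp" / "bigmind" / "memory.db"


def resolve_mode(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (mode, database location) derived from the BIGMIND_* variables."""
    if env.get("BIGMIND_DB_URL"):
        return "enterprise", env["BIGMIND_DB_URL"]
    if env.get("BIGMIND_DB_PATH"):
        return "team", env["BIGMIND_DB_PATH"]
    return "personal", str(DEFAULT_DB)


print(resolve_mode({}))
print(resolve_mode({"BIGMIND_DB_PATH": "/srv/bigmind/memory.db"}))
print(resolve_mode({"BIGMIND_DB_URL": "postgresql://bigmind@db.example/bigmind"}))
```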
---

### 14.4 New MCP tools needed

#### `memory_promote_to_global`
```
Promote a personal fact or decision to the company-wide Tier G knowledge base.
In personal mode: auto-approved.
In team/enterprise mode: status='pending' until a curator approves.

Args:
    category:       'architecture' | 'standard' | 'decision' | 'pattern' | 'glossary'
    title:          Short title (≤ 80 chars)
    content:        Markdown content (target ≤ 500 tokens)
    importance:     1–10
    source_session: Optional session this came from
```

#### `memory_search_global`
```
Full-text search across the approved global knowledge base.
Returns results visible to all users (status='approved').

Args:
    query: FTS5 search keywords
    limit: Max results (default 5)
```

#### `memory_list_global`
```
List all approved global knowledge entries, optionally filtered by category.
Loaded automatically at session start (Tier G) for relevant entries.

Args:
    category: Optional filter ('architecture', 'standard', etc.)
    limit:    Max results (default 20)
```

#### `memory_approve_global` *(curator/admin only)*
```
Approve or deprecate a pending global knowledge entry.
Only available to users with role='curator' or 'admin'.

Args:
    entry_id: The global_knowledge.id to act on
    action:   'approve' | 'deprecate'
    notes:    Optional review notes
```
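The promote and approve flow described by these tools could look roughly like this on SQLite. It is a sketch, not the project's implementation: the column subset follows the `global_knowledge` and `users` tables from section 14.2, while the function names, the `mode` parameter, the sample users, and the use of `PermissionError` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id TEXT PRIMARY KEY, role TEXT DEFAULT 'member');
CREATE TABLE global_knowledge (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    category    TEXT NOT NULL,
    title       TEXT NOT NULL,
    content     TEXT NOT NULL,
    importance  INTEGER DEFAULT 5,
    status      TEXT DEFAULT 'pending',
    promoted_by TEXT REFERENCES users(id),
    approved_by TEXT REFERENCES users(id)
);
INSERT INTO users (id, role) VALUES ('alice', 'member'), ('carol', 'curator');
""")


def promote_to_global(conn, user_id, category, title, content,
                      importance=5, mode="team"):
    """Insert an entry; personal mode auto-approves, other modes stay pending."""
    status = "approved" if mode == "personal" else "pending"
    approver = user_id if mode == "personal" else None
    cur = conn.execute(
        "INSERT INTO global_knowledge "
        "(category, title, content, importance, status, promoted_by, approved_by) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (category, title, content, importance, status, user_id, approver),
    )
    conn.commit()
    return cur.lastrowid


def approve_global(conn, reviewer_id, entry_id, action):
    """Approve or deprecate an entry; only curators and admins may do so."""
    row = conn.execute("SELECT role FROM users WHERE id = ?", (reviewer_id,)).fetchone()
    if row is None or row[0] not in ("curator", "admin"):
        raise PermissionError(f"{reviewer_id} may not review global knowledge")
    if action not in ("approve", "deprecate"):
        raise ValueError("action must be 'approve' or 'deprecate'")
    status = "approved" if action == "approve" else "deprecated"
    conn.execute(
        "UPDATE global_knowledge SET status = ?, approved_by = ? WHERE id = ?",
        (status, reviewer_id, entry_id),
    )
    conn.commit()


entry_id = promote_to_global(conn, "alice", "standard", "Use UTC timestamps",
                             "Store all timestamps in UTC.")
approve_global(conn, "carol", entry_id, "approve")
print(conn.execute("SELECT status, approved_by FROM global_knowledge").fetchone())
```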
---

### 14.5 Changes to `build_context` (Tier G injection)

```python
# At session start, after Tier 0 + Tier 1:

# ── TIER G: Top global knowledge entries ──────────────────────────────────────
global_entries = memory_store.get_top_global_knowledge(limit=5)
if global_entries:
    lines.append("### 🌐 Company knowledge (Tier G)")
    for e in global_entries:
        lines.append(f"**[{e['category']}] {e['title']}** (importance: {e['importance']})")
        lines.append(e['content'][:300])  # truncate to stay within token budget
        lines.append("")
```
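`get_top_global_knowledge` is not shown in the plan. One plausible shape, assuming the `global_knowledge` columns from section 14.2 and an importance-first ordering (decision D3 mentions importance ranking), is sketched below together with a rough size check. The four-characters-per-token ratio is a common rule of thumb, not a measured value, and the sample rows are invented.

```python
import sqlite3


def get_top_global_knowledge(conn: sqlite3.Connection, limit: int = 5) -> list[dict]:
    """Return the most important approved entries as plain dicts."""
    rows = conn.execute(
        "SELECT category, title, content, importance FROM global_knowledge "
        "WHERE status = 'approved' ORDER BY importance DESC, id ASC LIMIT ?",
        (limit,),
    ).fetchall()
    keys = ("category", "title", "content", "importance")
    return [dict(zip(keys, row)) for row in rows]


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about four characters per token."""
    return (len(text) + 3) // 4


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE global_knowledge (id INTEGER PRIMARY KEY, category TEXT, "
    "title TEXT, content TEXT, importance INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO global_knowledge (category, title, content, importance, status) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("standard", "UTC everywhere", "x" * 1000, 9, "approved"),
        ("pattern", "Retry with backoff", "y" * 200, 7, "approved"),
        ("decision", "Draft idea", "z" * 50, 10, "pending"),  # not yet visible
    ],
)
entries = get_top_global_knowledge(conn)
# Apply the same 300-character truncation as build_context before estimating.
budget = sum(estimate_tokens(e["content"][:300]) for e in entries)
print([e["title"] for e in entries], budget)
```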
Token budget for Tier G: **≤ 500 tokens** (5 entries × ~100 tokens each).
Total cold-start budget remains under ~1 100 tokens even with Tier G loaded.

---
### 14.6 PostgreSQL migration path

The DB layer in `bigmind/db.py` uses Python's `sqlite3` stdlib exclusively.
To support PostgreSQL, the approach is:

1. **Introduce `BIGMIND_DB_URL`** env var (PostgreSQL DSN).
2. **Abstract the connection**: `get_connection()` returns either `sqlite3.Connection`
   or `psycopg2.extensions.connection` depending on env.
3. **Adapt FTS5 → PostgreSQL full-text search**: SQLite FTS5 becomes a PostgreSQL `GIN`
   index over `to_tsvector`, queried with `to_tsquery`. (`pg_trgm` is a separate
   trigram-similarity extension and is only needed if fuzzy matching is wanted.)
4. **Schema stays nearly identical**: the DDL is largely SQL-standard, but
   `INTEGER PRIMARY KEY AUTOINCREMENT` is SQLite syntax and needs a PostgreSQL
   equivalent such as `GENERATED ALWAYS AS IDENTITY`.

This is the largest engineering effort in Phase 3. The recommendation:
**defer PostgreSQL until team mode proves the value**. Shared SQLite is expected to
handle teams of up to ~20 users, provided the shared file system implements file
locking correctly; SQLite's documentation warns that locking on network file systems
is often unreliable, so this should be verified before rollout.

---
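Step 2 of the migration path in 14.6 could be sketched as follows. This assumes `psycopg2` as the PostgreSQL driver (the driver named in that step) and imports it lazily so that personal and team mode keep working without it installed. The `BIGMIND_DB_PATH` fallback and the default path mirror the deployment table in 14.3; the function body is illustrative rather than the actual `db.py` code.

```python
import os
import sqlite3
from pathlib import Path

DEFAULT_DB = Path.home() / ".mcp" / "bigmind" / "memory.db"


def get_connection(env=None):
    """Return a PostgreSQL connection if BIGMIND_DB_URL is set, else SQLite."""
    env = os.environ if env is None else env
    dsn = env.get("BIGMIND_DB_URL")
    if dsn:
        import psycopg2  # lazy import: only enterprise mode needs the driver
        return psycopg2.connect(dsn)
    db_path = env.get("BIGMIND_DB_PATH", str(DEFAULT_DB))
    if db_path != ":memory:":
        # Make sure the parent directory exists before SQLite creates the file.
        Path(db_path).parent.mkdir(parents=True, exist_ok=True)
    return sqlite3.connect(db_path)


conn = get_connection({"BIGMIND_DB_PATH": ":memory:"})
print(type(conn).__name__)
```

Callers still have to account for dialect differences, for example the `?` placeholders used by `sqlite3` versus the `%s` placeholders used by `psycopg2`, so a shared connection factory is only the first part of the abstraction.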
### 14.7 Phase 3 implementation order

```
Step 1 — Tier G read path (1–2 days)
  ├── memory_store.get_top_global_knowledge()
  ├── memory_store.search_global_knowledge()
  ├── inject Tier G into build_context
  └── memory_search_global tool

Step 2 — Tier G write path (1 day)
  ├── memory_store.promote_to_global()
  ├── memory_promote_to_global tool (personal mode: auto-approve)
  └── tests for both

Step 3 — Curator workflow (1–2 days)
  ├── memory_store.approve_global() / deprecate_global()
  ├── memory_approve_global tool (role-checked)
  ├── memory_list_global tool
  └── personal mode bypass (no approval needed)

Step 4 — Team mode setup guide (0.5 days)
  ├── BIGMIND_DB_PATH shared path documentation
  ├── install_proc.sh team mode option
  └── multi-user integration tests

Step 5 — PostgreSQL (3–5 days, defer until needed)
  ├── Abstract db.py connection layer
  ├── FTS5 → PostgreSQL full-text search (tsvector + GIN) migration
  └── BIGMIND_DB_URL env var + docs
```
**Minimum viable Phase 3:** Steps 1 + 2 only: Tier G on SQLite, personal mode,
no approval workflow. Useful from day one, zero infrastructure changes required.

---
### 14.8 Decisions for Phase 3

| # | Question | Decision |
|---|---|---|
| D1 | Auto-approve in personal mode? | **Yes**: no friction for solo use |
| D2 | Who seeds Tier G initially? | **Each user seeds their own**: no central admin needed to start |
| D3 | Max Tier G entries loaded per session? | **5** (importance-ranked), which keeps the token budget tight |
| D4 | SQLite team mode limit? | **~20 users**; above that, move to PostgreSQL |
| D5 | PostgreSQL timing? | **Defer until team mode is proven**: don't build infra before the need is real |