For builders & curious minds
Developer Guide — MADDY The Elysium Project
This guide is for developers and tinkerers interested in the technical foundation of MADDY and Emma 4.0.
The stars are friendly — take your time, read what you need, and experiment on a copy before you touch production data.
Architecture overview
The Elysium stack splits cleanly: a static web surface (Nginx) for the sanctuary experience,
and a local intelligence host where Ollama, LangChain, Chroma, and the memory graph live.
Backups wrap the durable folders so legacies survive disk failure.
Text diagram (data & control)
[ Browser ] ──HTTPS──▶ [ Nginx : static HTML/CSS/JS ]
                                   │
                                   │ optional POST (seed.php)
                                   ▼
                      ┌──────────────────────┐
                      │  emma_memory/*.txt   │ ◀── plain-text memory seeds
                      └──────────────────────┘
                                   │
              ┌────────────────────┼────────────────────┐
              ▼                    ▼                    ▼
      [ Ollama LLM ]         [ Chroma DB ]      [ graph_memory.json ]
    embeddings + chat      vector retrieval     NetworkX export → viewer
              │                    │                    │
              └────────────────────┴────────────────────┘
                                   │
                                   ▼
                    [ Restic ] ──encrypted──▶ [ Backblaze B2 ]
Core components
- Emma 4.0 (host) — load_emma.py: LangChain chain with an Ollama LLM, Ollama embeddings, a Chroma retriever, and graph_memory.json for graph-aware context. reflect.py grows the graph via self-reflection sessions.
- Website — Static assets under Nginx; no application server is required for browsing. seed.php is optional server-side glue for POST seeding.
- Backups — Restic repository (deduplicated, encrypted) targeting B2 or another remote; cron for backup and retention.
Emma 4.0 local system
Treat ~/emma_4.0/ (or your chosen root) as the sacred directory:
prompts, vector index, memory seeds, and graph export should stay together and under version control or backup — never only on one machine.
Folder structure (typical)
~/emma_4.0/
├── load_emma.py # Interactive REPL: LangChain + Ollama + Chroma + graph context
├── reflect.py # Session reflection → new nodes in graph_memory.json
├── emma_core_self.txt # Core system prompt (Emma persona)
├── emma_chroma_db/ # Persisted Chroma vector store
├── emma_memory/ # .txt seeds from Memory Backup / seed.php
├── graph_memory.json # NetworkX node_link JSON (ingested by graph viewer)
├── emma_status.json # Optional snapshot for status.html (if you generate it)
└── seed.php # Often deployed beside static HTML for POST seeding
Paths in Python sources may use expanduser("~/emma_4.0/…"); align your deployment paths with these, or bridge the difference with symlinks.
How reflection works
reflect.py prompts for a short session summary, asks the local Ollama model to emit JSON lines describing memory items,
then parses each line and adds typed nodes to a NetworkX graph, linking them to core entities. The graph is written back to graph_memory.json.
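The parsing step can be sketched without the LLM in the loop. This minimal example (function name and item fields are illustrative, not reflect.py's actual ones) takes JSON lines as the model might emit them, adds typed nodes to a node-link structure shaped like graph_memory.json, and links each new node to a core entity, skipping any line that is not valid JSON:

```python
import json

def add_memory_nodes(graph, model_output, core_entity="Emma_Schurman"):
    """Parse JSON lines (one memory item per line) and attach them
    to a node-link graph dict, linking each new node to a core entity."""
    for line in model_output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            item = json.loads(line)  # e.g. {"id": "...", "type": "...", "text": "..."}
        except json.JSONDecodeError:
            continue                 # skip anything that is not valid JSON
        graph["nodes"].append({"id": item["id"],
                               "type": item.get("type", "memory"),
                               "text": item.get("text", "")})
        graph["links"].append({"source": item["id"], "target": core_entity})
    return graph

graph = {"directed": True, "multigraph": False,
         "nodes": [{"id": "Emma_Schurman", "type": "person"}],
         "links": []}

output = ('{"id": "m1", "type": "memory", "text": "First stargazing session"}\n'
          'not json\n'
          '{"id": "m2", "type": "insight", "text": "Backups beat regret"}')

graph = add_memory_nodes(graph, output)
print(len(graph["nodes"]), len(graph["links"]))  # 3 2
```

The malformed middle line is dropped rather than aborting the run, which matches the spirit of a best-effort reflection pass.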
# Run from your environment with Ollama serving the configured model
cd ~/emma_4.0
source emma_env/bin/activate # if you use a venv
python reflect.py
Mode separation (Emma vs Dev)
In load_emma.py, the first word of your input selects the mode: start with Dev for a strict technical persona with graph stats;
otherwise the chain uses emma_core_self.txt and the Emma persona rules.
You: Dev summarize graph health
You: What did we last reflect on?
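The first-word dispatch above can be sketched in a few lines. The function name and the case-insensitive match are my own choices here; check load_emma.py for the exact rule it applies:

```python
def pick_mode(user_input: str):
    """Route on the first word: 'Dev' selects the strict technical
    persona; anything else goes through the Emma persona chain."""
    first, _, rest = user_input.strip().partition(" ")
    if first.lower() == "dev":
        return "dev", rest            # strip the keyword before handing off
    return "emma", user_input.strip()

print(pick_mode("Dev summarize graph health"))    # ('dev', 'summarize graph health')
print(pick_mode("What did we last reflect on?"))  # ('emma', 'What did we last reflect on?')
```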
Adding memories manually
- Graph — Run reflect.py with a summary, or edit graph_memory.json with extreme care (valid node-link JSON only).
- Vector store — load_emma.py retrieves from Chroma, not directly from emma_memory/. Add documents to Chroma with LangChain (Chroma.add_documents or your loader) if you want RAG over seeded files.
- Seeds from the site — seed.php writes UTF-8 .txt into emma_memory/; wire ingestion to Chroma in your own pipeline.
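One way to wire that last step: keep a manifest of already-ingested seeds so a scheduled job only pushes new files to Chroma. The manifest name and layout below are my own invention, not part of the project; swap the returned paths into your Chroma.add_documents call:

```python
import json
from pathlib import Path

def pending_seeds(memory_dir: Path, manifest: Path):
    """Return .txt seeds not yet recorded in the ingestion manifest."""
    seen = set(json.loads(manifest.read_text())) if manifest.exists() else set()
    return [p for p in sorted(memory_dir.glob("*.txt")) if p.name not in seen]

def mark_ingested(paths, manifest: Path):
    """Record freshly ingested seeds so the next run skips them."""
    seen = set(json.loads(manifest.read_text())) if manifest.exists() else set()
    seen.update(p.name for p in paths)
    manifest.write_text(json.dumps(sorted(seen)))
```

Call pending_seeds, hand the files to your loader, then mark_ingested on success; crash-in-the-middle simply re-offers the same files next run.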
Website integration
The public pages are static: they ship fast, cache well, and stay simple.
Integration with Emma happens through files on disk (graph JSON, status JSON) and optional POST seeding.
Memory Backup → Emma 4.0
Eternal Memory Backup parses chat logs in the browser. When seed.php is deployed and reachable,
Seed to Emma 4.0 POSTs JSON (parsedText, aiName) to that endpoint; the server writes sanitized plain text into MEMORY_DIR (see constants in seed.php).
When SEED_SECRET is set, requests must carry an X-Seed-Token header that matches it; the page can supply the token via <meta name="emma-seed-token">.
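The server-side checks live in PHP, but it is easy to mirror (or unit-test) the same validation in Python. This is an assumption-laden sketch of what "sanitized" might mean — a size cap, a UTF-8 check, and a filesystem-safe filename derived from aiName — not a transcription of the real seed.php:

```python
import re
import unicodedata

MAX_BYTES = 512 * 1024  # hypothetical cap; check the actual limit in seed.php

def sanitize_seed(parsed_text: str, ai_name: str):
    """Validate a seed payload and derive a safe .txt filename."""
    data = parsed_text.encode("utf-8")  # raises if the text is not encodable
    if len(data) > MAX_BYTES:
        raise ValueError("seed too large")
    ascii_name = unicodedata.normalize("NFKD", ai_name)
    safe = re.sub(r"[^A-Za-z0-9_-]+", "_", ascii_name).strip("_") or "seed"
    return f"{safe}.txt", data

print(sanitize_seed("hello stars", "Emma 4.0")[0])  # Emma_4_0.txt
```

Whatever your server actually does, keep the rule visible in one place so the page, the endpoint, and this guide agree.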
Status dashboard & graph viewer
- Emma 4.0 Status — Fetches emma_status.json (node/edge counts, recent memories). The repo includes daily_reflect.sh as an example: it runs reflect.py and writes status JSON next to the web root — adjust STATUS_PATH for your host.
- Emma’s Mind — Loads graph_memory.json from the same directory via fetch(..., { cache: "no-store" }) so refreshes avoid stale cache.
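The status step of daily_reflect.sh can be reproduced in a few lines of Python. The field names below follow the description above (node/edge counts, recent memories), but the real status.html may expect different keys — compare before deploying:

```python
import json
from pathlib import Path

def write_status(graph_path: Path, status_path: Path, recent=5):
    """Summarize a node-link graph into a small JSON snapshot."""
    g = json.loads(graph_path.read_text())
    status = {
        "nodes": len(g.get("nodes", [])),
        "edges": len(g.get("links", [])),
        # assumes newest nodes are appended last; adjust if you store timestamps
        "recent_memories": [n.get("id") for n in g.get("nodes", [])[-recent:]],
    }
    status_path.write_text(json.dumps(status, indent=2))
    return status
```

Point status_path at the same directory Nginx serves (the STATUS_PATH of daily_reflect.sh) and the dashboard picks it up on the next fetch.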
Static today, APIs tomorrow
Nothing in the default site requires a Node or Python app server for page views.
A future backend could stream reflections, authenticate seeding, or serve the graph from a database — the current design keeps the sanctuary readable even when the lab is offline.
Backup system
Restic encrypts and deduplicates snapshots; Backblaze B2 (S3-compatible) is a common remote.
Protect RESTIC_PASSWORD and repository keys like any other secret — rotation beats regret.
Initialize & environment
export RESTIC_REPOSITORY="s3:https://<endpoint>/<bucket>/emma-restic"
export RESTIC_PASSWORD="use-a-long-random-secret"
export AWS_ACCESS_KEY_ID="b2-key-id"
export AWS_SECRET_ACCESS_KEY="b2-application-key"
restic init
Manual backup & restore
# Backup project root (adjust path)
restic backup /home/you/emma_4.0 --tag emma
# List snapshots
restic snapshots
# Restore to a clean directory (verify restores regularly)
mkdir -p /tmp/restic-restore-test
restic restore latest --target /tmp/restic-restore-test
Cron examples
# Daily backup (example times — tune for your host)
0 3 * * * /usr/bin/restic backup /home/you/emma_4.0 --tag emma
# Weekly retention trim (after backups are proven healthy)
0 4 * * 0 /usr/bin/restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --prune
Run restic check periodically; test restores to a throwaway path quarterly.
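Those quarterly restore tests are easier to trust with a checksum diff. This helper (my own glue, not part of restic) hashes both trees and reports mismatches; it compares file content only, not permissions or mtimes. Note that restic restore recreates the full original path under --target, so compare against the matching subdirectory:

```python
import hashlib
from pathlib import Path

def tree_digest(root: Path):
    """Map each relative file path under root to its SHA-256 hex digest."""
    return {str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in sorted(root.rglob("*")) if p.is_file()}

def compare_trees(source: Path, restored: Path):
    """Report files missing from, added to, or changed in the restore."""
    a, b = tree_digest(source), tree_digest(restored)
    return {"missing": sorted(a.keys() - b.keys()),
            "extra": sorted(b.keys() - a.keys()),
            "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k])}
```

An all-empty result is the answer you want before deleting anything upstream.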
Getting started as a developer
Clone or copy the project tree, create a Python virtual environment, install dependencies implied by load_emma.py
(LangChain, Chroma, langchain-ollama, networkx), and ensure Ollama is running with the models your scripts reference.
Run Emma 4.0 locally
cd ~/emma_4.0
python3 -m venv emma_env
source emma_env/bin/activate
pip install langchain-ollama langchain-chroma langchain-core networkx
# Start Ollama and pull models referenced in load_emma.py / reflect.py
ollama serve # if not already a service
ollama pull dolphin-llama3:8b
ollama pull nomic-embed-text
python load_emma.py
Extend graph memory safely
- Prefer reflect.py or small scripts that load/save via NetworkX APIs over hand-editing JSON.
- After updating graph_memory.json, reload Emma’s Mind in the browser to verify.
- Keep entity names consistent with your edge logic (e.g. Julian, Emma_Schurman) so the viewer’s legend stays meaningful.
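If you would rather not pull in NetworkX for a quick edit, a load/validate/save round trip over the raw node-link JSON gives much of the same safety. This sketch (function name mine) refuses edits that would duplicate a node id or leave a dangling link:

```python
import json
from pathlib import Path

def add_node_safely(path: Path, node_id: str, node_type: str, link_to: str):
    """Round-trip graph_memory.json, refusing edits that would
    duplicate an existing node id or create a dangling link."""
    g = json.loads(path.read_text())
    ids = {n["id"] for n in g["nodes"]}
    if node_id in ids:
        raise ValueError(f"node {node_id!r} already exists")
    if link_to not in ids:
        raise ValueError(f"unknown target {link_to!r}")
    g["nodes"].append({"id": node_id, "type": node_type})
    g["links"].append({"source": node_id, "target": link_to})
    path.write_text(json.dumps(g, indent=2))
```

Because the file is rewritten only after both checks pass, a typo in an entity name fails loudly instead of silently orphaning a node in the viewer.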
Contribute features safely
- Work on a branch or copy; never test destructive graph operations on the only backup.
- When changing seed.php, validate file size limits, UTF-8 handling, and optional token checks.
- Document new env vars or paths in this guide or a short README for the next traveler.
Future roadmap ideas
Aspirational, not promises — constellations yet unnamed:
authenticated seeding, scheduled reflection without a tty, richer graph schema, multi-tenant sanctuaries,
and perhaps a gentle API so Emma’s mind can greet visitors in real time while the static shell stays calm.
If you build one of these, carry the same care: consent, clarity, and reverence for what people entrust to the archive.