Your agents remember everything. Across every conversation.
Context, decisions, and learnings persist automatically — no manual re-injection needed.
| System | LoCoMo Score | vs Claude-Mem |
|---|---|---|
| SimpleMem | 48 | +64% |
| Claude-Mem | 29.3 | baseline |
SimpleMem-Cross extends SimpleMem with persistent cross-conversation memory. The original SimpleMem code is preserved byte-identical; all new functionality lives in the cross/ package, which uses composition rather than modification.
- Full session management with a start → record → stop → end lifecycle. Every event is tracked, timestamped, and persisted.
- Token-budgeted context from previous sessions is injected automatically at session start. No manual prompt engineering.
- Record messages, tool uses, and file changes with 3-tier automatic redaction for secrets and sensitive data.
- Heuristic extraction of decisions, discoveries, and learnings from conversations. Your agent learns from experience.
- Every memory entry links back to source evidence. Know exactly where each piece of context originated.
- Automatic decay, merge, and prune of old memories. Quality over quantity, maintained automatically.
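The start → record → stop → end lifecycle can be pictured as a small state machine. The sketch below is illustrative only: `SessionState` and `SketchSession` are hypothetical names chosen for this example, not classes from the cross/ package, and the rule that `end()` must follow `stop()` is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class SessionState(Enum):
    ACTIVE = "active"        # after start: events may be recorded
    FINALIZED = "finalized"  # after stop: memory has been extracted
    ENDED = "ended"          # after end: session is closed


@dataclass
class SketchSession:
    """Hypothetical stand-in for a tracked session; not the real implementation."""
    session_id: str
    state: SessionState = SessionState.ACTIVE
    events: list = field(default_factory=list)

    def record(self, kind: str, content: str) -> None:
        # Events are only accepted while the session is active.
        if self.state is not SessionState.ACTIVE:
            raise RuntimeError(f"cannot record in state {self.state.value}")
        timestamp = datetime.now(timezone.utc).isoformat()
        self.events.append({"kind": kind, "content": content, "at": timestamp})

    def stop(self) -> int:
        # Finalize and report how many events were captured.
        if self.state is not SessionState.ACTIVE:
            raise RuntimeError("stop() requires an active session")
        self.state = SessionState.FINALIZED
        return len(self.events)

    def end(self) -> None:
        # In this sketch, ending is allowed only after finalization.
        if self.state is not SessionState.FINALIZED:
            raise RuntimeError("end() requires a finalized session")
        self.state = SessionState.ENDED
```

Rejecting out-of-order calls early keeps a persisted timeline consistent, which is one reason to model the lifecycle explicitly rather than as free-form flags.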
import asyncio

from cross.orchestrator import create_orchestrator


async def main():
    # 🔧 Create the orchestrator for your project
    orch = create_orchestrator(project="my-project")

    # 🚀 Start a new session; context from previous sessions is injected automatically
    result = await orch.start_session(
        content_session_id="session-001",
        user_prompt="Continue building the REST API authentication",
    )
    memory_session_id = result["memory_session_id"]
    print(result["context"])  # 📚 Relevant context from previous sessions

    # 📝 Record events during the session
    await orch.record_message(memory_session_id, "User asked about JWT auth")
    await orch.record_tool_use(
        memory_session_id,
        tool_name="read_file",
        tool_input="auth/jwt.py",
        tool_output="class JWTHandler: ...",
    )
    await orch.record_message(memory_session_id, "Implemented token refresh logic", role="assistant")

    # ✅ Finalize: extracts observations, generates summary, stores memory entries
    report = await orch.stop_session(memory_session_id)
    print(f"Stored {report.entries_stored} memory entries, {report.observations_count} observations")

    # 🧹 Cleanup
    await orch.end_session(memory_session_id)
    orch.close()


asyncio.run(main())
SimpleMem-Cross uses the same dependencies as SimpleMem, plus the standard-library sqlite3 module:
pip install -r requirements.txt
Note: No additional packages required. LanceDB and Pydantic are already in the SimpleMem dependency tree.
┌─────────────────────────────────────────────────────────────────┐
│        Agent Frameworks (Claude Code / Cursor / custom)         │
└─────────────────────────────┬───────────────────────────────────┘
                              │
          ┌───────────────────┴───────────────────┐
          ▼                                       ▼
┌─────────────────────┐                 ┌─────────────────────────┐
│  Hook/Lifecycle     │                 │  HTTP/MCP API           │
│  Adapter            │                 │  (FastAPI)              │
│  ─────────────────  │                 │  ─────────────────────  │
│  SessionStart       │                 │  POST /sessions/start   │
│  UserMessage        │                 │  POST /sessions/{id}/*  │
│  ToolUse            │                 │  POST /search           │
│  Stop / End         │                 │  GET  /stats            │
└─────────┬───────────┘                 └─────────┬───────────────┘
          │                                       │
          └───────────────────┬───────────────────┘
                              ▼
          ┌───────────────────────────────────────┐
          │         CrossMemOrchestrator          │
          │         ════════════════════          │
          │  • Facade for all memory operations   │
          │  • Multi-tenant isolation             │
          │  • Async-first design                 │
          └───────────────────┬───────────────────┘
                              │
       ┌──────────────────────┼──────────────────────┐
       ▼                      ▼                      ▼
┌─────────────┐        ┌─────────────┐      ┌─────────────────┐
│ Session     │        │ Context     │      │ Consolidation   │
│ Manager     │        │ Injector    │      │ Worker          │
│ ─────────── │        │ ─────────── │      │ ─────────────── │
│ SQLite DB   │        │ Token-      │      │ Decay / Merge   │
│ • sessions  │        │ budgeted    │      │ Prune old       │
│ • events    │        │ context     │      │ entries         │
│ • summaries │        │ bundle      │      │                 │
└──────┬──────┘        └──────┬──────┘      └────────┬────────┘
       │                      │                      │
       └──────────────────────┼──────────────────────┘
                              ▼
          ┌───────────────────────────────────────┐
          │      Cross-Session Vector Store       │
          │               (LanceDB)               │
          │      ══════════════════════════       │
          │  • Semantic search (1024-d vectors)   │
          │  • Keyword matching (BM25-style)      │
          │  • Structured metadata filtering      │
          │  • Provenance fields for tracing      │
          └───────────────────┬───────────────────┘
                              │
                              ▼
          ┌───────────────────────────────────────┐
          │   Reuses SimpleMem 3-Stage Pipeline   │
          │    (Composition, not modification)    │
          │   ─────────────────────────────────   │
          │   MemoryBuilder → HybridRetriever →   │
          │            AnswerGenerator            │
          └───────────────────────────────────────┘
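The vector store in the diagram combines semantic search, keyword matching, and metadata filtering, but how the three are blended is not described here. One common approach, sketched below under that caveat, is to filter on metadata first and then rank by a weighted mix of cosine similarity and keyword overlap. The function names, the `alpha` weight, and the toy 3-d vectors are all assumptions made for illustration; the keyword score is a crude stand-in, not BM25.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of query terms that appear in the text (crude BM25 stand-in)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_search(query, query_vec, entries, project, top_k=3, alpha=0.7):
    """Filter by metadata, then rank by a weighted mix of both scores."""
    scored = []
    for entry in entries:
        if entry["project"] != project:  # structured metadata filter
            continue
        score = (alpha * cosine(query_vec, entry["vector"])
                 + (1 - alpha) * keyword_overlap(query, entry["text"]))
        scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]
```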
| Principle | Implementation |
|---|---|
| Composition over modification | Original SimpleMem is wrapped, never edited |
| SQLite for session timeline | Sessions, events, observations, summaries |
| LanceDB for vectors | Cross-session memory entries with provenance |
| Hook-based lifecycle | SessionStart → UserMessage/ToolUse → Stop → End |
| Progressive disclosure | Token-budgeted context injection at session start |
| Provenance tracking | Every vector links back to its source evidence |
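Token-budgeted injection can be thought of as greedy packing: take the most relevant memories first and stop adding once the budget is spent. The sketch below assumes a rough estimate of four characters per token and uses hypothetical names (`estimate_tokens`, `build_context`, the `score` and `source_session` fields); the actual ContextBundle builder may rank and count tokens differently.

```python
def estimate_tokens(text):
    """Rough estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def build_context(entries, max_tokens=2000):
    """Greedily pack the highest-scoring entries into a token budget."""
    lines, used = [], 0
    for entry in sorted(entries, key=lambda e: e["score"], reverse=True):
        # Each line carries its source session so the context stays traceable.
        line = f"- {entry['text']} (from {entry['source_session']})"
        cost = estimate_tokens(line)
        if used + cost > max_tokens:
            continue  # skip entries that do not fit; smaller ones still might
        lines.append(line)
        used += cost
    return "\n".join(lines), used
```

Skipping an oversized entry instead of stopping outright lets shorter, lower-ranked memories fill the remaining budget.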
| Module | Lines | Description |
|---|---|---|
| types.py | 227 | 📋 Pydantic models: enums, records, ContextBundle, FinalizationReport |
| storage_sqlite.py | 805 | 🗄️ SQLite backend: 6 tables (including sessions, events, observations, summaries) |
| storage_lancedb.py | 542 | 🔍 LanceDB vector store: semantic/keyword/structured search |
| hooks.py | 401 | 🪝 Abstract SessionHooks with 5 async lifecycle methods |
| collectors.py | 413 | 📝 RedactionFilter (3-tier regex), thread-safe EventCollector |
| session_manager.py | 755 | 🔄 Full lifecycle orchestration: start/record/finalize/end |
| context_injector.py | 385 | 💉 Token-budgeted ContextBundle builder and renderer |
| orchestrator.py | 530 | 🎭 Top-level facade CrossMemOrchestrator and factory |
| api_http.py | 556 | 🌐 FastAPI router: 8 REST endpoints with Pydantic models |
| api_mcp.py | 620 | 🔌 MCPToolRegistry: 8 MCP tool definitions with JSON Schema |
| consolidation.py | 390 | 🧹 ConsolidationWorker: decay/merge/prune pipeline |
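The module table describes RedactionFilter as a 3-tier regex filter without listing the tiers. As a guess at what a tiered filter could look like, the sketch below applies three hypothetical tiers in order: key/value secrets, bearer tokens, and email addresses. The patterns, the placeholder strings, and the `redact` helper are illustrative assumptions, not the patterns shipped in collectors.py.

```python
import re

# Hypothetical tiers, ordered from most to least specific.
REDACTION_TIERS = [
    # Tier 1: key/value secrets such as "password: hunter2".
    (re.compile(r"(?i)\b(password|secret|api[_-]?key|token)\b(\s*[:=]\s*)\S+"),
     r"\1\2[REDACTED]"),
    # Tier 2: bearer tokens in authorization headers.
    (re.compile(r"(?i)\bbearer\s+[a-z0-9._~+/-]+=*"), "Bearer [REDACTED]"),
    # Tier 3: email addresses.
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[REDACTED_EMAIL]"),
]


def redact(text):
    """Apply every tier in order and return the scrubbed text."""
    for pattern, replacement in REDACTION_TIERS:
        text = pattern.sub(replacement, text)
    return text
```

Running redaction before an event is persisted means secrets never reach SQLite or the vector store, which is simpler than scrubbing stored data afterwards.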
| Method | Parameters | Returns | Description |
|---|---|---|---|
| start_session | content_session_id, user_prompt? | dict | Start session with context injection |
| record_message | memory_session_id, content, role? | None | Record a chat message event |
| record_tool_use | memory_session_id, tool_name, tool_input, tool_output | None | Record a tool invocation |
| stop_session | memory_session_id | FinalizationReport | Finalize: extract observations, generate summary |
| end_session | memory_session_id | None | Mark session completed, cleanup |
| search | query, top_k? | list[CrossMemoryEntry] | Semantic search across all sessions |
| get_context_for_prompt | user_prompt? | str | Build and render context for system prompt |
| get_stats | none | dict | Storage statistics |
| close | none | None | Close SQLite connection |
| Method | Path | Description |
|---|---|---|
| POST | /cross/sessions/start | 🚀 Start a new cross-session |
| POST | /cross/sessions/{id}/message | 💬 Record a message event |
| POST | /cross/sessions/{id}/tool-use | 🔧 Record a tool-use event |
| POST | /cross/sessions/{id}/stop | ✅ Finalize session memory |
| POST | /cross/sessions/{id}/end | 🏁 End and cleanup session |
| POST | /cross/search | 🔍 Search cross-session memories |
| GET | /cross/stats | 📊 Get memory system statistics |
| GET | /cross/health | 💚 Health check with uptime |
from cross.api_http import create_app
app = create_app(project="my-project")
# Run with uvicorn
# uvicorn cross.api_http:app --host 0.0.0.0 --port 8000
Or mount on an existing FastAPI app:
from fastapi import FastAPI

from cross.api_http import create_cross_router
from cross.orchestrator import create_orchestrator

app = FastAPI()  # or your existing application instance
orch = create_orchestrator(project="my-project")
router = create_cross_router(orch)
app.include_router(router, prefix="/cross")
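Once the server is running, any HTTP client can call the endpoints. The request body schemas are not listed in this section, so the sketch below assumes the JSON fields mirror the orchestrator parameter names, and it assumes the server listens on localhost port 8000 as in the uvicorn command. It only builds the requests with the standard library; sending them is left commented out.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed: matches the uvicorn command


def build_request(path, payload=None):
    """Build a JSON request: POST when a payload is given, GET otherwise."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST" if payload is not None else "GET",
    )


# Field names below are assumptions, not a documented schema.
start = build_request("/cross/sessions/start", {
    "content_session_id": "session-001",
    "user_prompt": "Continue building the REST API authentication",
})
health = build_request("/cross/health")

# To actually send a request against a running server:
# with urllib.request.urlopen(start) as response:
#     print(json.load(response))
```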
| Tool Name | Description |
|---|---|
| cross_session_start | 🚀 Start a new cross-session memory session |
| cross_session_message | 💬 Record a user/assistant message |
| cross_session_tool_use | 🔧 Record a tool invocation |
| cross_session_stop | ✅ Finalize and persist session memory |
| cross_session_end | 🏁 End session and cleanup |
| cross_session_search | 🔍 Search across all session memories |
| cross_session_context | 📚 Get context bundle for system prompt |
| cross_session_stats | 📊 Get memory system statistics |
import asyncio

from cross.api_mcp import create_mcp_tools
from cross.orchestrator import create_orchestrator

orch = create_orchestrator(project="my-project")
tools = create_mcp_tools(orch)

# Get tool definitions for MCP server registration
definitions = tools.get_tool_definitions()


async def main():
    # Dispatch a tool call (await must run inside a coroutine)
    return await tools.call_tool("cross_session_start", {
        "tenant_id": "default",
        "content_session_id": "ses-1",
        "project": "my-project",
        "user_prompt": "Help me debug the auth module",
    })


result = asyncio.run(main())
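A registry that pairs each tool definition with an async handler is one plausible shape behind `get_tool_definitions` and `call_tool`. The sketch below is a guess at that pattern, not the MCPToolRegistry source: `SketchToolRegistry`, `register`, and `start_handler` are hypothetical, and the required-argument check is a deliberately minimal stand-in for full JSON Schema validation.

```python
import asyncio


class SketchToolRegistry:
    """Hypothetical registry; the real MCPToolRegistry may differ."""

    def __init__(self):
        self._tools = {}

    def register(self, name, description, schema, handler):
        # Keep the definition and its handler together under one name.
        self._tools[name] = {
            "definition": {"name": name, "description": description,
                           "inputSchema": schema},
            "handler": handler,
        }

    def get_tool_definitions(self):
        return [tool["definition"] for tool in self._tools.values()]

    async def call_tool(self, name, arguments):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        tool = self._tools[name]
        # Check required arguments before dispatching to the handler.
        required = tool["definition"]["inputSchema"].get("required", [])
        missing = [key for key in required if key not in arguments]
        if missing:
            raise ValueError(f"missing arguments: {missing}")
        return await tool["handler"](**arguments)


async def start_handler(content_session_id, user_prompt=""):
    # Stand-in handler; a real one would call the orchestrator.
    return {"memory_session_id": f"mem-{content_session_id}"}


registry = SketchToolRegistry()
registry.register(
    "cross_session_start",
    "Start a new cross-session memory session",
    {"type": "object", "required": ["content_session_id"]},
    start_handler,
)
outcome = asyncio.run(
    registry.call_tool("cross_session_start", {"content_session_id": "ses-1"})
)
print(outcome)
```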
| Setting | Default | Description |
|---|---|---|
| SQLite DB | ~/.simplemem-cross/cross_memory.db | Session metadata, events, observations |
| LanceDB | ~/.simplemem-cross/lancedb_cross | Vector storage for memory entries |
| Max context tokens | 2000 | Token budget for context injection |
from cross.orchestrator import create_orchestrator

orch = create_orchestrator(
    project="my-project",
    tenant_id="team-alpha",
    db_path="/custom/path/memory.db",
    lancedb_path="/custom/path/lancedb",
    max_context_tokens=3000,
)
Pass tenant_id to isolate memory across tenants. Each tenant's memories are stored and retrieved independently.
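Tenant isolation of this kind usually comes down to scoping every read and write by the tenant key. The sketch below shows the idea with an in-memory SQLite database; the `memories` table, its columns, and the `memories_for` helper are hypothetical and do not reflect the actual SimpleMem-Cross schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories ("
    "tenant_id TEXT NOT NULL, project TEXT NOT NULL, text TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO memories VALUES (?, ?, ?)",
    [
        ("team-alpha", "my-project", "Chose JWT for auth"),
        ("team-beta", "my-project", "Chose session cookies for auth"),
    ],
)


def memories_for(tenant_id, project):
    """Return only the rows owned by one tenant; the filter is never optional."""
    rows = conn.execute(
        "SELECT text FROM memories WHERE tenant_id = ? AND project = ?",
        (tenant_id, project),
    )
    return [text for (text,) in rows]
```

Making `tenant_id` a mandatory argument of every query helper, rather than an optional filter, is a simple way to avoid accidental cross-tenant reads.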
The consolidation worker maintains memory quality over time:
from cross.consolidation import ConsolidationWorker, ConsolidationPolicy

policy = ConsolidationPolicy(
    max_age_days=90,                  # ⏰ Decay entries older than 90 days
    decay_factor=0.9,                 # 📉 Multiply importance by 0.9 per period
    merge_similarity_threshold=0.95,  # 🔗 Merge near-duplicates
    min_importance=0.05,              # 🗑️ Prune below this threshold
)

# sqlite_storage and vector_store are the SQLite and LanceDB storage objects,
# assumed to have been created already
worker = ConsolidationWorker(sqlite_storage, vector_store, policy)
result = worker.run(tenant_id="default")
print(f"📉 Decayed: {result.decayed_count}")
print(f"🔗 Merged: {result.merged_count}")
print(f"🗑️ Pruned: {result.pruned_count}")
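To make the decay, merge, and prune steps concrete, here is a small self-contained sketch that mirrors the four policy settings. It is an interpretation, not the ConsolidationWorker algorithm: it assumes one decay multiplication per full `max_age_days` period, treats a merge as keeping the more important of two near-duplicates, and uses a hypothetical `Entry` dataclass with toy 2-d vectors.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    text: str
    vector: list
    importance: float
    age_days: int


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(entries, max_age_days=90, decay_factor=0.9,
                merge_similarity_threshold=0.95, min_importance=0.05):
    """Return (kept_entries, counts) after decay, merge, and prune."""
    counts = {"decayed": 0, "merged": 0, "pruned": 0}

    # Decay: one multiplication per full period of max_age_days (assumed).
    for entry in entries:
        periods = entry.age_days // max_age_days
        if periods > 0:
            entry.importance *= decay_factor ** periods
            counts["decayed"] += 1

    # Merge: visit entries from most to least important and drop any
    # entry that is a near-duplicate of one already kept.
    kept = []
    for entry in sorted(entries, key=lambda e: e.importance, reverse=True):
        if any(cosine(entry.vector, k.vector) >= merge_similarity_threshold
               for k in kept):
            counts["merged"] += 1
        else:
            kept.append(entry)

    # Prune: discard whatever has fallen below the importance floor.
    survivors = [e for e in kept if e.importance >= min_importance]
    counts["pruned"] = len(kept) - len(survivors)
    return survivors, counts
```

Running decay before merge means a stale duplicate loses to a fresher one, and pruning last ensures the floor is applied to the decayed values.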
# 🧪 Run all cross-session tests
pytest cross/tests/ -v
# 📋 Run specific test module
pytest cross/tests/test_types.py -v
pytest cross/tests/test_storage.py -v
pytest cross/tests/test_e2e.py -v
Note: Tests use real SQLite (temp databases) and mock LanceDB. No external services, API keys, or GPU required.
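A test in this style might look like the sketch below: a real SQLite file in a temporary directory plus a `MagicMock` standing in for the vector store. The `events` table, the test name, and the `add_entries` method are hypothetical and chosen only to show the pattern; they are not taken from cross/tests/.

```python
import sqlite3
import tempfile
from pathlib import Path
from unittest.mock import MagicMock


def test_event_roundtrip_with_temp_db_and_mock_store():
    # A mock stands in for the vector store, so LanceDB is never touched.
    vector_store = MagicMock()
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(Path(tmp) / "cross_memory.db")
        try:
            conn.execute("CREATE TABLE events (session_id TEXT, content TEXT)")
            conn.execute(
                "INSERT INTO events VALUES (?, ?)", ("session-001", "hello")
            )
            conn.commit()
            rows = conn.execute("SELECT content FROM events").fetchall()
        finally:
            conn.close()  # close before the temporary directory is removed

    # Hand the stored rows to the mocked store and verify the call.
    vector_store.add_entries([{"text": content} for (content,) in rows])
    assert rows == [("hello",)]
    vector_store.add_entries.assert_called_once_with([{"text": "hello"}])
```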
| Constraint | Reason |
|---|---|
| ✅ Original SimpleMem is byte-identical | Published research paper; never modified |
| ✅ All code in English | No Chinese in code, comments, docstrings, or strings |
| ✅ Python-only | Matches SimpleMem's tech stack |
| ✅ Composition pattern | SimpleMem is wrapped via duck typing, never subclassed |