Memory Management Kit for Agents: Remember Me, Refine Me.
If you find it useful, please give us a ⭐ Star.
ReMe is a modular memory management kit that equips AI agents with unified memory capabilities: extracting, reusing, and sharing memories across users, tasks, and agents. Agent memory can be viewed as:
Agent Memory = Long-Term Memory + Short-Term Memory = (Personal + Task + Tool) Memory + (Working Memory)
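This decomposition can be sketched as a simple container (illustrative only; the class and field names are ours, not part of the ReMe API):

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Illustrative view of: long-term (personal + task + tool) plus short-term (working)."""
    personal: list = field(default_factory=list)  # long-term: user preferences and facts
    task: list = field(default_factory=list)      # long-term: reusable procedural knowledge
    tool: list = field(default_factory=list)      # long-term: tool-usage guidelines
    working: list = field(default_factory=list)   # short-term: live conversation context

    @property
    def long_term(self) -> list:
        return self.personal + self.task + self.tool


mem = AgentMemory(task=["plan projects in milestones"], working=["current chat"])
```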
Type /horse to trigger the Year of the Horse Easter egg: fireworks, a galloping horse animation, and a random blessing such as 马上有钱 ("wealth is on its way") or 马到成功 ("swift success").
Embed ReMe directly with `from reme_ai import ReMeApp`, no HTTP/MCP service required.
ReMe provides a modular memory management kit with pluggable components that can be integrated into any agent framework. The system consists of:
- Task Memory: procedural knowledge reused across agents. Learn more in the task memory guide.
- Personal Memory: contextualized memory for specific users. Learn more in the personal memory guide.
- Tool Memory: data-driven tool selection and usage optimization. Learn more in the tool memory guide.
- Working Memory: short-term contextual memory for long-running agents via message offload and reload; grep (grep_working_memory) and read (read_working_memory) offloaded content on demand.
📖 Concept & API: see the working memory guide.

Install from PyPI:

pip install reme-ai

Or install from source:
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install .
ReMe requires LLM and embedding model configurations. Copy example.env to .env and fill in:

FLOW_LLM_API_KEY=sk-xxxx
FLOW_LLM_BASE_URL=https://xxxx/v1
FLOW_EMBEDDING_API_KEY=sk-xxxx
FLOW_EMBEDDING_BASE_URL=https://xxxx/v1
Start the HTTP service:

reme \
  backend=http \
  http.port=8002 \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
Or run ReMe as an MCP server (stdio transport):

reme \
  backend=mcp \
  mcp.transport=stdio \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
import requests

# Experience Summarizer: Learn from execution trajectories
response = requests.post("http://localhost:8002/summary_task_memory", json={
    "workspace_id": "task_workspace",
    "trajectories": [
        {"messages": [{"role": "user", "content": "Help me create a project plan"}], "score": 1.0}
    ]
})

# Retriever: Get relevant memories
response = requests.post("http://localhost:8002/retrieve_task_memory", json={
    "workspace_id": "task_workspace",
    "query": "How to efficiently manage project progress?",
    "top_k": 1
})
import asyncio

from reme_ai import ReMeApp


async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Experience Summarizer: Learn from execution trajectories
        result = await app.async_execute(
            name="summary_task_memory",
            workspace_id="task_workspace",
            trajectories=[
                {
                    "messages": [
                        {"role": "user", "content": "Help me create a project plan"}
                    ],
                    "score": 1.0
                }
            ]
        )
        print(result)

        # Retriever: Get relevant memories
        result = await app.async_execute(
            name="retrieve_task_memory",
            workspace_id="task_workspace",
            query="How to efficiently manage project progress?",
            top_k=1
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
# Experience Summarizer: Learn from execution trajectories
curl -X POST http://localhost:8002/summary_task_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "trajectories": [
      {"messages": [{"role": "user", "content": "Help me create a project plan"}], "score": 1.0}
    ]
  }'

# Retriever: Get relevant memories
curl -X POST http://localhost:8002/retrieve_task_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "query": "How to efficiently manage project progress?",
    "top_k": 1
  }'
import requests

# Memory Integration: Learn from user interactions
response = requests.post("http://localhost:8002/summary_personal_memory", json={
    "workspace_id": "task_workspace",
    "trajectories": [
        {
            "messages": [
                {"role": "user", "content": "I like to drink coffee while working in the morning"},
                {"role": "assistant",
                 "content": "I understand, you prefer to start your workday with coffee to stay energized"}
            ]
        }
    ]
})

# Memory Retrieval: Get personal memory fragments
response = requests.post("http://localhost:8002/retrieve_personal_memory", json={
    "workspace_id": "task_workspace",
    "query": "What are the user's work habits?",
    "top_k": 5
})
import asyncio

from reme_ai import ReMeApp


async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Memory Integration: Learn from user interactions
        result = await app.async_execute(
            name="summary_personal_memory",
            workspace_id="task_workspace",
            trajectories=[
                {
                    "messages": [
                        {"role": "user", "content": "I like to drink coffee while working in the morning"},
                        {"role": "assistant",
                         "content": "I understand, you prefer to start your workday with coffee to stay energized"}
                    ]
                }
            ]
        )
        print(result)

        # Memory Retrieval: Get personal memory fragments
        result = await app.async_execute(
            name="retrieve_personal_memory",
            workspace_id="task_workspace",
            query="What are the user's work habits?",
            top_k=5
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
# Memory Integration: Learn from user interactions
curl -X POST http://localhost:8002/summary_personal_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "trajectories": [
      {"messages": [
        {"role": "user", "content": "I like to drink coffee while working in the morning"},
        {"role": "assistant", "content": "I understand, you prefer to start your workday with coffee to stay energized"}
      ]}
    ]
  }'

# Memory Retrieval: Get personal memory fragments
curl -X POST http://localhost:8002/retrieve_personal_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "query": "What are the user'\''s work habits?",
    "top_k": 5
  }'
import requests

# Record tool execution results
response = requests.post("http://localhost:8002/add_tool_call_result", json={
    "workspace_id": "tool_workspace",
    "tool_call_results": [
        {
            "create_time": "2025-10-21 10:30:00",
            "tool_name": "web_search",
            "input": {"query": "Python asyncio tutorial", "max_results": 10},
            "output": "Found 10 relevant results...",
            "token_cost": 150,
            "success": True,
            "time_cost": 2.3
        }
    ]
})

# Generate usage guidelines from history
response = requests.post("http://localhost:8002/summary_tool_memory", json={
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
})

# Retrieve tool guidelines before use
response = requests.post("http://localhost:8002/retrieve_tool_memory", json={
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
})
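Before posting tool call records, it can help to sanity-check them client-side. A minimal sketch, assuming the fields shown in the example above are the required ones (ReMe's actual schema may differ):

```python
REQUIRED_FIELDS = {"create_time", "tool_name", "input", "output", "success"}


def missing_fields(record: dict) -> list:
    """Return sorted names of required fields absent from a tool_call_results entry."""
    return sorted(REQUIRED_FIELDS - record.keys())


record = {
    "create_time": "2025-10-21 10:30:00",
    "tool_name": "web_search",
    "input": {"query": "Python asyncio tutorial", "max_results": 10},
    "output": "Found 10 relevant results...",
    "token_cost": 150,
    "success": True,
    "time_cost": 2.3,
}
print(missing_fields(record))  # []
```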
import asyncio

from reme_ai import ReMeApp


async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Record tool execution results
        result = await app.async_execute(
            name="add_tool_call_result",
            workspace_id="tool_workspace",
            tool_call_results=[
                {
                    "create_time": "2025-10-21 10:30:00",
                    "tool_name": "web_search",
                    "input": {"query": "Python asyncio tutorial", "max_results": 10},
                    "output": "Found 10 relevant results...",
                    "token_cost": 150,
                    "success": True,
                    "time_cost": 2.3
                }
            ]
        )
        print(result)

        # Generate usage guidelines from history
        result = await app.async_execute(
            name="summary_tool_memory",
            workspace_id="tool_workspace",
            tool_names="web_search"
        )
        print(result)

        # Retrieve tool guidelines before use
        result = await app.async_execute(
            name="retrieve_tool_memory",
            workspace_id="tool_workspace",
            tool_names="web_search"
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
# Record tool execution results
curl -X POST http://localhost:8002/add_tool_call_result \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_call_results": [
      {
        "create_time": "2025-10-21 10:30:00",
        "tool_name": "web_search",
        "input": {"query": "Python asyncio tutorial", "max_results": 10},
        "output": "Found 10 relevant results...",
        "token_cost": 150,
        "success": true,
        "time_cost": 2.3
      }
    ]
  }'

# Generate usage guidelines from history
curl -X POST http://localhost:8002/summary_tool_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
  }'

# Retrieve tool guidelines before use
curl -X POST http://localhost:8002/retrieve_tool_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
  }'
import requests

# Summarize and compact working memory for a long-running conversation
response = requests.post("http://localhost:8002/summary_working_memory", json={
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
        },
        {
            "role": "user",
            "content": "Search the README content of the reme project"
        },
        {
            "role": "assistant",
            "content": "",
            "tool_calls": [
                {
                    "index": 0,
                    "id": "call_6596dafa2a6a46f7a217da",
                    "function": {
                        "arguments": "{\"query\": \"readme\"}",
                        "name": "web_search"
                    },
                    "type": "function"
                }
            ]
        },
        {
            "role": "tool",
            "content": "ultra-large context, over 50,000 tokens..."
        },
        {
            "role": "user",
            "content": "According to the README, what is the effect of task memory on AppWorld? Give specific numbers."
        }
    ],
    "working_summary_mode": "auto",
    "compact_ratio_threshold": 0.75,
    "max_total_tokens": 20000,
    "max_tool_message_tokens": 2000,
    "group_token_threshold": 4000,
    "keep_recent_count": 2,
    "store_dir": "test_working_memory",
    "chat_id": "demo_chat_id"
})
import asyncio

from reme_ai import ReMeApp


async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Summarize and compact working memory for a long-running conversation
        result = await app.async_execute(
            name="summary_working_memory",
            messages=[
                {
                    "role": "system",
                    "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
                },
                {
                    "role": "user",
                    "content": "Search the README content of the reme project"
                },
                {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "index": 0,
                            "id": "call_6596dafa2a6a46f7a217da",
                            "function": {
                                "arguments": "{\"query\": \"readme\"}",
                                "name": "web_search"
                            },
                            "type": "function"
                        }
                    ]
                },
                {
                    "role": "tool",
                    "content": "ultra-large context, over 50,000 tokens..."
                },
                {
                    "role": "user",
                    "content": "According to the README, what is the effect of task memory on AppWorld? Give specific numbers."
                }
            ],
            working_summary_mode="auto",
            compact_ratio_threshold=0.75,
            max_total_tokens=20000,
            max_tool_message_tokens=2000,
            group_token_threshold=4000,
            keep_recent_count=2,
            store_dir="test_working_memory",
            chat_id="demo_chat_id",
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
curl -X POST http://localhost:8002/summary_working_memory \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
      },
      {
        "role": "user",
        "content": "Search the README content of the reme project"
      },
      {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "index": 0,
            "id": "call_6596dafa2a6a46f7a217da",
            "function": {
              "arguments": "{\"query\": \"readme\"}",
              "name": "web_search"
            },
            "type": "function"
          }
        ]
      },
      {
        "role": "tool",
        "content": "ultra-large context, over 50,000 tokens..."
      },
      {
        "role": "user",
        "content": "According to the README, what is the effect of task memory on AppWorld? Give specific numbers."
      }
    ],
    "working_summary_mode": "auto",
    "compact_ratio_threshold": 0.75,
    "max_total_tokens": 20000,
    "max_tool_message_tokens": 2000,
    "group_token_threshold": 4000,
    "keep_recent_count": 2,
    "store_dir": "test_working_memory",
    "chat_id": "demo_chat_id"
  }'
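A rough reading of how compact_ratio_threshold and max_total_tokens interact in the requests above: compaction triggers once the context reaches the threshold fraction of the token budget. This is our interpretation of the parameters, not ReMe's actual implementation:

```python
def should_compact(total_tokens: int, max_total_tokens: int = 20000,
                   compact_ratio_threshold: float = 0.75) -> bool:
    """Assumed semantics: compact once usage reaches threshold * budget."""
    return total_tokens >= compact_ratio_threshold * max_total_tokens


print(should_compact(16000))  # True: 16000 >= 0.75 * 20000
print(should_compact(14000))  # False
```

With the defaults shown in the examples, a 50,000-token tool message would blow well past the 20,000-token budget, which is exactly the situation summary_working_memory is designed to handle.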
ReMe provides a memory library with pre-extracted, production-ready memories that agents can load and use immediately:
| Memory Pack | Domain | Size | Description |
|---|---|---|---|
| appworld.jsonl | Task Execution | ~100 memories | Complex task planning patterns, multi-step workflows, and error recovery strategies |
| bfcl_v3.jsonl | Tool Usage | ~150 memories | Function calling patterns, parameter optimization, and tool selection strategies |
import requests

# Load pre-built memories
response = requests.post("http://localhost:8002/vector_store", json={
    "workspace_id": "appworld",
    "action": "load",
    "path": "./docs/library/"
})

# Query relevant memories
response = requests.post("http://localhost:8002/retrieve_task_memory", json={
    "workspace_id": "appworld",
    "query": "How to navigate to settings and update user profile?",
    "top_k": 1
})
import asyncio

from reme_ai import ReMeApp


async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Load pre-built memories
        result = await app.async_execute(
            name="vector_store",
            workspace_id="appworld",
            action="load",
            path="./docs/library/"
        )
        print(result)

        # Query relevant memories
        result = await app.async_execute(
            name="retrieve_task_memory",
            workspace_id="appworld",
            query="How to navigate to settings and update user profile?",
            top_k=1
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
We tested ReMe on Appworld using Qwen3-8B (non-thinking mode):
| Method | Avg@4 | Pass@4 |
|---|---|---|
| without ReMe | 0.1497 | 0.3285 |
| with ReMe | 0.1706 (+2.09%) | 0.3631 (+3.46%) |
Pass@K measures the probability that at least one of the K generated samples completes the task (score = 1). The experiment uses an internal AppWorld environment, so results may differ slightly from the public benchmark.
You can find more details on reproducing the experiment in quickstart.md.
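Given that definition, both metrics can be computed from per-task binary sample scores. A minimal sketch (the function and variable names are ours; the actual evaluation harness is described in quickstart.md):

```python
def avg_and_pass_at_k(sample_scores: list[list[int]]) -> tuple[float, float]:
    """sample_scores: one list of K binary scores (1 = success) per task.
    Returns (Avg@K, Pass@K) averaged over all tasks."""
    n = len(sample_scores)
    avg = sum(sum(s) / len(s) for s in sample_scores) / n
    pass_at_k = sum(1 for s in sample_scores if max(s) == 1) / n
    return avg, pass_at_k


# Two tasks, K=4: one solved once out of four samples, one never solved.
print(avg_and_pass_at_k([[1, 0, 0, 0], [0, 0, 0, 0]]))  # (0.125, 0.5)
```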
We tested ReMe on BFCL-V3 multi-turn-base (randomly split into 50 train / 150 val) using Qwen3-8B (thinking mode):
| Method | Avg@4 | Pass@4 |
|---|---|---|
| without ReMe | 0.4033 | 0.5955 |
| with ReMe | 0.4450 (+4.17%) | 0.6577 (+6.22%) |
(Comparison animations: agent runs without ReMe vs. with ReMe.)
We tested on 100 random FrozenLake maps using Qwen3-8B:
| Method | pass rate |
|---|---|
| without ReMe | 0.66 |
| with ReMe | 0.72 (+6.0%) |
You can find more details on reproducing the experiment in quickstart.md.
We evaluated Tool Memory on a controlled benchmark with three mock search tools, using Qwen3-30B-Instruct:
| Scenario | Avg Score | Improvement |
|---|---|---|
| Train (No Memory) | 0.650 | - |
| Test (No Memory) | 0.672 | Baseline |
| Test (With Memory) | 0.772 | +14.88% |
Key findings and reproduction details are available in tool_bench.md, with the benchmark implementation at run_reme_tool_bench.py.
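The +14.88% figure in the table is the relative gain of the with-memory test score over the no-memory test baseline:

```python
baseline = 0.672       # Test (No Memory)
with_memory = 0.772    # Test (With Memory)

improvement = (with_memory - baseline) / baseline * 100
print(f"+{improvement:.2f}%")  # +14.88%
```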
We believe the best memory systems come from collective wisdom. Contributions are welcome 👉 see the contribution guide.
@software{AgentscopeReMe2025,
  title  = {AgentscopeReMe: Memory Management Kit for Agents},
  author = {Li Yu and Jiaji Deng and Zouying Cao and Weikang Zhou and Tiancheng Qin and Qingxu Fu and Sen Huang and Xianzhe Xu and Zhaoyang Liu and Boyin Liu},
  url    = {https://reme.agentscope.io},
  year   = {2025}
}

@misc{AgentscopeReMe2025Paper,
  title         = {Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution},
  author        = {Zouying Cao and Jiaji Deng and Li Yu and Weikang Zhou and Zhaoyang Liu and Bolin Ding and Hai Zhao},
  year          = {2025},
  eprint        = {2512.10696},
  archivePrefix = {arXiv},
  primaryClass  = {cs.AI},
  url           = {https://arxiv.org/abs/2512.10696}
}
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.