Memory
Bedrock agents have persistent memory that maintains context across runs. The memory system uses hierarchical summarization to keep long conversations within LLM token limits.
How Memory Works
Raw Messages
↓ (every 5 min)
5-Minute Summaries
↓ (every hour)
Hourly Summaries
↓ (every day)
Daily Summaries
↓ (every week)
Weekly Summaries
↓ (every month)
Monthly Summaries
As conversations grow, older details are compressed into higher-level summaries while recent context stays detailed.
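To make the rollup levels concrete, here is a minimal sketch of how a timestamp could map to a bucket at each level. It assumes calendar-aligned UTC buckets and Monday-start weeks; the function name `bucket_start` and the level labels are hypothetical, and Bedrock's actual boundaries may differ.

```python
from datetime import datetime, timedelta, timezone


def bucket_start(ts: datetime, level: str) -> datetime:
    """Floor a timezone-aware timestamp to the start of its summary bucket.

    Assumption: buckets are aligned to UTC calendar boundaries.
    """
    ts = ts.astimezone(timezone.utc)
    if level == "5min":
        return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)
    if level == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if level == "day":
        return day
    if level == "week":
        # Assumption: weeks start on Monday (weekday() == 0).
        return day - timedelta(days=day.weekday())
    if level == "month":
        return day.replace(day=1)
    raise ValueError(f"unknown level: {level}")


ts = datetime(2024, 1, 17, 10, 33, 12, tzinfo=timezone.utc)
for level in ("5min", "hour", "day", "week", "month"):
    print(level, bucket_start(ts, level).isoformat())
```

Each message belongs to exactly one bucket per level, so a higher-level summary can be built from the lower-level summaries whose bucket starts fall inside it.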
Memory Context Structure
When an agent runs, it receives memory as two parts:
Historical Context (Cached)
Stable summaries from the monthly level down to the daily level. This content rarely changes, so it can be served from Claude's prompt cache.
Recent Context (Dynamic)
Recent activity from hourly → 5-minute summaries plus raw messages. This changes frequently.
┌─────────────────────────────────────┐
│ System Prompt │ ← Cached
├─────────────────────────────────────┤
│ Historical Context │ ← Cached
│ (Monthly/Weekly/Daily summaries) │
├─────────────────────────────────────┤
│ Recent Context │ ← Not cached
│ (Hourly/5-min summaries + msgs) │
├─────────────────────────────────────┤
│ Current Time + Wakeup │ ← Not cached
└─────────────────────────────────────┘
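The layout above can be sketched as a list of system blocks in the style of Anthropic's Messages API, where a `cache_control` marker on a block makes the prompt prefix up to and including that block cacheable. This is only an illustration: `build_system_blocks` is a hypothetical helper, and how Bedrock actually splits content between the system prompt and messages is not documented here.

```python
def build_system_blocks(system_prompt, historical_context, recent_context, current_time):
    """Order prompt parts from most stable to most volatile.

    Assumption: a single cache breakpoint is placed after the historical
    context, so the system prompt and historical context share one cached
    prefix while everything after it is sent uncached.
    """
    return [
        {"type": "text", "text": system_prompt},
        {
            "type": "text",
            "text": historical_context,
            # Marks the end of the cacheable prefix.
            "cache_control": {"type": "ephemeral"},
        },
        {"type": "text", "text": recent_context},
        {"type": "text", "text": f"Current time: {current_time}"},
    ]


blocks = build_system_blocks(
    "You are a scheduling assistant.",
    "=== MONTHLY SUMMARY (December 2023) ===\nUser onboarded...",
    "=== HOURLY SUMMARY ===\nUser asked about meetings...",
    "2024-01-15T10:30:00Z",
)
```

Putting stable content first matters because a cached prefix is only reusable while every byte before the breakpoint is unchanged.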
Logging Messages
Log messages to agent memory:
curl -X POST https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/log-messages/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hi, can you help me schedule a meeting?"},
      {"role": "assistant", "content": "Of course! When would you like to schedule it?"}
    ],
    "timestamp": "2024-01-15T10:30:00Z"
  }'
This is useful for:
Injecting external conversations (SMS, email) into memory
Seeding agents with initial context
Syncing state from external systems
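For injecting messages from a script rather than the shell, the same request can be built with Python's standard library. This sketch mirrors the curl example; `build_log_messages_request` and `BASE_URL` are illustrative names, not part of an official client.

```python
import json
import urllib.request

BASE_URL = "https://api.bedrock.orinlabs.org"


def build_log_messages_request(agent_id, api_key, messages, timestamp=None):
    """Build (but do not send) a log-messages request for one agent."""
    payload = {"messages": messages}
    if timestamp is not None:
        # Omitting the timestamp lets the server use the current time.
        payload["timestamp"] = timestamp
    return urllib.request.Request(
        url=f"{BASE_URL}/api/memory/agents/{agent_id}/log-messages/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_log_messages_request(
    "AGENT_ID",
    "YOUR_API_KEY",
    [{"role": "user", "content": "Hi, can you help me schedule a meeting?"}],
    timestamp="2024-01-15T10:30:00Z",
)
```

Passing `request` to `urllib.request.urlopen` sends it. Keeping construction separate from sending makes the payload easy to inspect before anything is written to memory.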
Getting Memory Context
Retrieve the formatted memory context (what the agent sees):
curl -X GET https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/memory-context/ \
  -H "Authorization: Bearer YOUR_API_KEY"
Response:
{
  "context": "=== MONTHLY SUMMARY (December 2023) ===\nUser onboarded and set up task tracking...\n\n=== WEEKLY SUMMARY (Jan 8-14) ===\nDiscussed project deadlines...\n\n=== DAILY SUMMARY (Jan 15) ===\nUser asked about meeting scheduling..."
}
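When debugging, it can help to split the `context` string back into its sections. The sketch below assumes each section begins with a `=== TITLE ===` header line, which is inferred only from the sample response above; `split_context_sections` is a hypothetical helper.

```python
import re

# Assumption: section headers look like "=== TITLE ===" on their own line.
SECTION_HEADER = re.compile(r"^=== (.+?) ===[ \t]*$", re.MULTILINE)


def split_context_sections(context: str) -> list[tuple[str, str]]:
    """Return (title, body) pairs in the order they appear in the context."""
    matches = list(SECTION_HEADER.finditer(context))
    sections = []
    for i, match in enumerate(matches):
        # A section body runs until the next header or the end of the string.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(context)
        sections.append((match.group(1), context[match.end():end].strip()))
    return sections


sample = (
    "=== MONTHLY SUMMARY (December 2023) ===\n"
    "User onboarded and set up task tracking...\n\n"
    "=== DAILY SUMMARY (Jan 15) ===\n"
    "User asked about meeting scheduling..."
)
for title, body in split_context_sections(sample):
    print(title, "->", body)
```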
Getting Recent Messages
Retrieve raw recent messages (before summarization):
curl -X GET https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/messages/ \
  -H "Authorization: Bearer YOUR_API_KEY"
Summarization Model
Memory summarization uses a separate, smaller model (default: claude-haiku-4-5) for efficiency. Configure per-agent:
curl -X PATCH https://api.bedrock.orinlabs.org/api/cloud/agents/AGENT_ID/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "summarizer_model": "claude-haiku-4-5"
  }'
Resetting Memory
Clear all memory for an agent:
curl -X POST https://api.bedrock.orinlabs.org/api/cloud/agents/AGENT_ID/reset-memory/ \
  -H "Authorization: Bearer YOUR_API_KEY"
This permanently deletes all messages and summaries. The agent will start fresh with no memory of past interactions.
Tool Calls in Memory
When agents make tool calls, the calls and their results are automatically logged to memory:
[
  {
    "type": "function_call",
    "name": "list_tasks",
    "arguments": "{\"limit\": 10}"
  },
  {
    "type": "function_call_output",
    "output": "Found 3 tasks:\n1. Review proposal (due tomorrow)\n2. Call client..."
  }
]
This means agents remember what tools they’ve used and what they learned.
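To inspect logged tool activity, entries of this shape can be rendered as a readable transcript. This is a minimal sketch that relies only on the fields shown in the example above; `render_tool_log` is a hypothetical helper, and real entries may carry additional fields.

```python
import json


def render_tool_log(entries):
    """Turn function_call / function_call_output entries into text lines."""
    lines = []
    for entry in entries:
        if entry["type"] == "function_call":
            # "arguments" is a JSON-encoded string, so it needs a second parse.
            args = json.loads(entry["arguments"])
            arg_text = ", ".join(f"{key}={value!r}" for key, value in args.items())
            lines.append(f"call {entry['name']}({arg_text})")
        elif entry["type"] == "function_call_output":
            lines.append(f"result: {entry['output']}")
    return lines


log = [
    {"type": "function_call", "name": "list_tasks", "arguments": "{\"limit\": 10}"},
    {"type": "function_call_output", "output": "Found 3 tasks"},
]
for line in render_tool_log(log):
    print(line)
```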
Memory Timestamp Handling
Messages can include explicit timestamps:
{
  "messages": [ ... ],
  "timestamp": "2024-01-15T10:30:00Z"
}
If omitted, the current time is used. Timestamps affect which summary bucket messages fall into.
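Because the timestamp decides which bucket a message lands in, it is worth normalizing times from external systems before logging them. The sketch below converts a timezone-aware datetime to the UTC `Z` form used in the examples; `to_log_timestamp` is a hypothetical helper, and whether the API also accepts other offsets is not covered here.

```python
from datetime import datetime, timedelta, timezone


def to_log_timestamp(ts=None):
    """Format a datetime as an ISO 8601 UTC string such as 2024-01-15T10:30:00Z."""
    if ts is None:
        # Mirrors the documented default: use the current time.
        ts = datetime.now(timezone.utc)
    if ts.tzinfo is None:
        # A naive datetime is ambiguous and could land in the wrong bucket.
        raise ValueError("naive datetime: attach a timezone before logging")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# An SMS received at 05:30 in a UTC-5 zone.
eastern = timezone(timedelta(hours=-5))
print(to_log_timestamp(datetime(2024, 1, 15, 5, 30, tzinfo=eastern)))
```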
Prompt Caching Benefits
The hierarchical memory structure enables prompt caching with Claude:
| Content Type | Cache Status | Why |
|---|---|---|
| Tool descriptions | Cached | Rarely changes |
| System prompt | Cached | Static |
| Historical memory | Cached | Changes slowly |
| Recent memory | Not cached | Changes every run |
This can reduce costs by 50-80% for long-running agents.
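The size of the saving presumably depends on how much of each prompt is served from cache. As a rough back-of-the-envelope sketch, the function below estimates relative input-token cost for a given cached fraction. The default `cache_read_multiplier` of 0.1 is an assumption for illustration, and the estimate ignores cache-write overhead and output tokens; check current model pricing before relying on it.

```python
def relative_input_cost(cached_fraction: float, cache_read_multiplier: float = 0.1) -> float:
    """Input-token cost relative to sending the whole prompt uncached (1.0).

    cached_fraction: share of prompt tokens read from cache (0.0 to 1.0).
    cache_read_multiplier: assumed price of a cached token relative to a
    normal input token.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    uncached_fraction = 1.0 - cached_fraction
    return cached_fraction * cache_read_multiplier + uncached_fraction


for fraction in (0.5, 0.8, 0.9):
    saving = 1.0 - relative_input_cost(fraction)
    print(f"{fraction:.0%} cached -> about {saving:.0%} lower input cost")
```

Under these assumptions, a prompt that is 80% cached costs about 0.28 of the uncached price, a reduction of roughly 72% on input tokens.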
Best Practices
Let It Summarize: Don’t manually manage memory; let the summarization system work.
Use Timestamps: Include accurate timestamps when logging external messages.
Reset Sparingly: Only reset memory when truly needed, because accumulated context is valuable.
Check Context: Use the memory-context endpoint to debug what agents see.