
Memory

Bedrock agents have persistent memory that maintains context across runs. The memory system uses hierarchical summarization to keep long conversations within LLM token limits.

How Memory Works

Raw Messages
    ↓ (every 5 min)
5-Minute Summaries
    ↓ (every hour)
Hourly Summaries
    ↓ (every day)
Daily Summaries
    ↓ (every week)
Weekly Summaries
    ↓ (every month)
Monthly Summaries
As conversations grow, older details are compressed into higher-level summaries while recent context stays detailed.
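
One way to picture the rollup is that every message timestamp belongs to exactly one bucket at each level. The sketch below is illustrative only and is not Bedrock's implementation: the function name `bucket_keys` and the bucket boundaries (UTC clock-aligned 5-minute windows, ISO weeks, calendar months) are assumptions.

```python
from datetime import datetime, timezone


def bucket_keys(ts: datetime) -> dict[str, str]:
    """Return an illustrative bucket label per summary level for a timestamp.

    Assumption: buckets align to UTC clock boundaries; the real service
    may draw its boundaries differently.
    """
    ts = ts.astimezone(timezone.utc)
    # Floor the minute to the start of its 5-minute window.
    five_min = ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)
    iso_year, iso_week, _ = ts.isocalendar()
    return {
        "5min": five_min.strftime("%Y-%m-%dT%H:%M"),
        "hour": ts.strftime("%Y-%m-%dT%H"),
        "day": ts.strftime("%Y-%m-%d"),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": ts.strftime("%Y-%m"),
    }


keys = bucket_keys(datetime(2024, 1, 15, 10, 33, 20, tzinfo=timezone.utc))
print(keys)
```

Two messages that share a key at some level would be folded into the same summary at that level under these assumptions.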

Memory Context Structure

When an agent runs, it receives memory as two parts:

Historical Context (Cached)

Stable summaries from the monthly level down to the daily level. This content rarely changes, so it is cached by Claude for efficiency.

Recent Context (Dynamic)

Recent activity: hourly and 5-minute summaries plus raw messages. This content changes frequently, so it is not cached.
┌─────────────────────────────────────┐
│         System Prompt               │ ← Cached
├─────────────────────────────────────┤
│     Historical Context              │ ← Cached
│  (Monthly/Weekly/Daily summaries)   │
├─────────────────────────────────────┤
│      Recent Context                 │ ← Not cached
│  (Hourly/5-min summaries + msgs)    │
├─────────────────────────────────────┤
│      Current Time + Wakeup          │ ← Not cached
└─────────────────────────────────────┘
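
The ordering in the diagram matters because Claude's prompt caching matches on an exact prompt prefix: stable content has to come first, and anything volatile has to come after it. The following is a minimal sketch of that idea, not the actual Bedrock prompt builder. The `Segment` class, the function names, and the sample strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    text: str
    cached: bool


def build_prompt(system: str, historical: str, recent: str, wakeup: str) -> list[Segment]:
    """Order segments from most stable to most volatile, as in the diagram."""
    return [
        Segment("system_prompt", system, cached=True),
        Segment("historical_context", historical, cached=True),
        Segment("recent_context", recent, cached=False),
        Segment("current_time_wakeup", wakeup, cached=False),
    ]


def cached_prefix(segments: list[Segment]) -> str:
    """Concatenate the leading run of cached segments.

    A prefix cache only helps if this string is identical between runs.
    """
    parts = []
    for seg in segments:
        if not seg.cached:
            break
        parts.append(seg.text)
    return "\n".join(parts)


run1 = build_prompt("You are a scheduler.", "Monthly: onboarding.", "User: hi", "10:30")
run2 = build_prompt("You are a scheduler.", "Monthly: onboarding.", "User: thanks", "10:35")
print(cached_prefix(run1) == cached_prefix(run2))  # the stable prefix is unchanged
```

If the current time were placed above the historical context instead, the prefix would differ on every run and nothing after it could be reused.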

Logging Messages

Log messages to agent memory:
curl -X POST https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/log-messages/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hi, can you help me schedule a meeting?"},
      {"role": "assistant", "content": "Of course! When would you like to schedule it?"}
    ],
    "timestamp": "2024-01-15T10:30:00Z"
  }'
This is useful for:
  • Injecting external conversations (SMS, email) into memory
  • Seeding agents with initial context
  • Syncing state from external systems
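
For the use cases above, the same request can be built from a script. This is a sketch using only the Python standard library. The helper name `build_log_request` is hypothetical, and the URL, headers, and body fields simply mirror the curl example. The request is built but not sent, so the payload can be inspected first.

```python
import json
import urllib.request

BASE_URL = "https://api.bedrock.orinlabs.org"


def build_log_request(agent_id, api_key, messages, timestamp=None):
    """Build (but do not send) the POST request for the log-messages endpoint."""
    payload = {"messages": messages}
    if timestamp is not None:
        # Only include the field when the caller knows when the messages occurred.
        payload["timestamp"] = timestamp
    return urllib.request.Request(
        url=f"{BASE_URL}/api/memory/agents/{agent_id}/log-messages/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_log_request(
    "AGENT_ID",
    "YOUR_API_KEY",
    [{"role": "user", "content": "Running 10 minutes late"}],
    timestamp="2024-01-15T10:30:00Z",
)
# To send it: urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
```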

Getting Memory Context

Retrieve the formatted memory context (what the agent sees):
curl -X GET https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/memory-context/ \
  -H "Authorization: Bearer YOUR_API_KEY"
Response:
{
  "context": "=== MONTHLY SUMMARY (December 2023) ===\nUser onboarded and set up task tracking...\n\n=== WEEKLY SUMMARY (Jan 8-14) ===\nDiscussed project deadlines...\n\n=== DAILY SUMMARY (Jan 15) ===\nUser asked about meeting scheduling..."
}
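
When debugging, it can help to split the `context` string into its summary sections. The sketch below assumes every section starts with a `=== TITLE ===` header line, which is inferred only from the sample response above. The function name `split_context` is hypothetical.

```python
import re

# Assumption: section headers look like "=== TITLE ===" on their own line.
HEADER = re.compile(r"^=== (.+?) ===$", re.MULTILINE)


def split_context(context: str) -> list[tuple[str, str]]:
    """Split a context string into (header, body) pairs."""
    matches = list(HEADER.finditer(context))
    sections = []
    for i, m in enumerate(matches):
        # A section body runs until the next header, or to the end of the string.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(context)
        sections.append((m.group(1), context[m.end():end].strip()))
    return sections


sample = (
    "=== MONTHLY SUMMARY (December 2023) ===\nUser onboarded and set up task tracking...\n\n"
    "=== WEEKLY SUMMARY (Jan 8-14) ===\nDiscussed project deadlines...\n\n"
    "=== DAILY SUMMARY (Jan 15) ===\nUser asked about meeting scheduling..."
)
for header, body in split_context(sample):
    print(header, "->", body)
```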

Getting Recent Messages

Retrieve raw recent messages (before summarization):
curl -X GET https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/messages/ \
  -H "Authorization: Bearer YOUR_API_KEY"

Summarization Model

Memory summarization uses a separate, smaller model (default: claude-haiku-4-5) for efficiency. Configure per-agent:
curl -X PATCH https://api.bedrock.orinlabs.org/api/cloud/agents/AGENT_ID/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "summarizer_model": "claude-haiku-4-5"
  }'

Resetting Memory

Clear all memory for an agent:
curl -X POST https://api.bedrock.orinlabs.org/api/cloud/agents/AGENT_ID/reset-memory/ \
  -H "Authorization: Bearer YOUR_API_KEY"
This permanently deletes all messages and summaries. The agent will start fresh with no memory of past interactions.

Memory and Tool Calls

When agents make tool calls, the calls and results are automatically logged to memory:
[
  {
    "type": "function_call",
    "name": "list_tasks",
    "arguments": "{\"limit\": 10}"
  },
  {
    "type": "function_call_output",
    "output": "Found 3 tasks:\n1. Review proposal (due tomorrow)\n2. Call client..."
  }
]
This means agents remember which tools they have used and what they learned from the results.
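
Note that `arguments` in the entries above is a JSON-encoded string, not a nested object, so it needs a second decoding step. The sketch below renders such entries as readable lines. The helper name `describe` is hypothetical, and only the two entry types shown above are handled.

```python
import json

log = [
    {"type": "function_call", "name": "list_tasks", "arguments": "{\"limit\": 10}"},
    {
        "type": "function_call_output",
        "output": "Found 3 tasks:\n1. Review proposal (due tomorrow)\n2. Call client...",
    },
]


def describe(entry: dict) -> str:
    """Render one logged tool entry as a single readable line."""
    if entry["type"] == "function_call":
        args = json.loads(entry["arguments"])  # arguments is a JSON-encoded string
        rendered = ", ".join(f"{k}={v!r}" for k, v in args.items())
        return f"call {entry['name']}({rendered})"
    if entry["type"] == "function_call_output":
        first_line = entry["output"].splitlines()[0]
        return f"result: {first_line}"
    return f"unknown entry type: {entry['type']}"


for entry in log:
    print(describe(entry))
```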

Memory Timestamp Handling

A log-messages request can include an explicit timestamp for the messages it carries:
{
  "messages": [...],
  "timestamp": "2024-01-15T10:30:00Z"
}
If the timestamp is omitted, the current time is used. The timestamp determines which summary bucket the messages fall into.
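
External systems often report local times, so it is worth normalizing to the UTC form used in the example before logging. A minimal sketch, assuming the API expects the same `YYYY-MM-DDTHH:MM:SSZ` shape shown above (other formats may also be accepted, but that is not documented here). The helper name `to_utc_timestamp` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc_timestamp(dt: datetime) -> str:
    """Format an aware datetime as a UTC ISO 8601 string ending in 'Z'."""
    if dt.tzinfo is None:
        # A naive datetime is ambiguous and could land in the wrong bucket.
        raise ValueError("naive datetime: attach a timezone before logging")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# An SMS received at 05:30 in a UTC-5 timezone on 15 January 2024.
utc_minus_5 = timezone(timedelta(hours=-5))
received = datetime(2024, 1, 15, 5, 30, tzinfo=utc_minus_5)
print(to_utc_timestamp(received))  # 2024-01-15T10:30:00Z
```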

Prompt Caching Benefits

The hierarchical memory structure enables prompt caching with Claude:
Content Type        Cache Status   Why
Tool descriptions   Cached         Rarely changes
System prompt       Cached         Static
Historical memory   Cached         Changes slowly
Recent memory       Not cached     Changes every run
This can reduce costs by 50-80% for long-running agents.
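
As a rough back-of-the-envelope illustration of how savings of that size can arise, the sketch below compares input-token cost with and without a cache hit. Everything here is an assumption for illustration: the token counts, the `cache_read_multiplier` of 0.1 (cached tokens billed at a tenth of the normal input price), and the decision to ignore cache-write surcharges and output tokens. It is not Bedrock's pricing.

```python
def input_cost_saving(cached_tokens: int, uncached_tokens: int, cache_read_multiplier: float) -> float:
    """Fraction of input-token cost saved on a cache hit versus no caching."""
    total = cached_tokens + uncached_tokens
    # Cached tokens are billed at a discount; uncached tokens at full price.
    with_cache = cached_tokens * cache_read_multiplier + uncached_tokens
    return 1 - with_cache / total


# Hypothetical run: 8,000 tokens of tools, system prompt, and history; 2,000 recent.
saving = input_cost_saving(cached_tokens=8_000, uncached_tokens=2_000, cache_read_multiplier=0.1)
print(f"{saving:.0%}")  # 72%
```

The saving grows with the share of the prompt that sits in the stable prefix, which is why long-running agents with large histories benefit most.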

Best Practices

Let It Summarize

Don’t manually manage memory - let the summarization system work.

Use Timestamps

Include accurate timestamps when logging external messages.

Reset Sparingly

Reset memory only when truly needed: a reset permanently deletes all messages and summaries, and that context is valuable.

Check Context

Use the memory-context endpoint to inspect exactly what an agent sees when debugging.