Memory

Bedrock agents have persistent memory that maintains context across runs. The memory system uses hierarchical summarization to keep long conversations within LLM token limits.

How Memory Works

Raw Messages
    ↓ (every 5 min)
5-Minute Summaries
    ↓ (every hour)
Hourly Summaries
    ↓ (every day)
Daily Summaries
    ↓ (every week)
Weekly Summaries
    ↓ (every month)
Monthly Summaries

As conversations grow, older details are compressed into higher-level summaries while recent context stays detailed.
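To make the hierarchy concrete, here is a minimal sketch of how a message timestamp could map to one bucket per summary level. The key formats, UTC alignment, and ISO-week boundaries are assumptions made for illustration; the source does not specify how Bedrock delimits its buckets.

```python
from datetime import datetime, timezone

def bucket_keys(ts: datetime) -> dict:
    """Return a hypothetical bucket key for each summary level."""
    ts = ts.astimezone(timezone.utc)  # assumption: buckets align to UTC
    iso_year, iso_week, _ = ts.isocalendar()
    five_min_start = ts.minute - ts.minute % 5  # round down to a 5-minute boundary
    return {
        "5min": ts.strftime("%Y-%m-%dT%H:") + f"{five_min_start:02d}",
        "hour": ts.strftime("%Y-%m-%dT%H"),
        "day": ts.strftime("%Y-%m-%d"),
        "week": f"{iso_year}-W{iso_week:02d}",  # assumption: ISO weeks
        "month": ts.strftime("%Y-%m"),
    }

keys = bucket_keys(datetime(2024, 1, 15, 10, 33, tzinfo=timezone.utc))
print(keys)
```

Each level's key is a coarser view of the same timestamp, which is why a message written at 10:33 can later be folded into its hourly, daily, weekly, and monthly summaries in turn.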

Memory Context Structure

When an agent runs, it receives memory as two parts:

Historical Context (Cached)

Stable summaries from the monthly level down to the daily level. This content rarely changes, so it is cached with Claude's prompt caching for efficiency.

Recent Context (Dynamic)

Recent activity: hourly and 5-minute summaries plus raw messages. This content changes frequently, so it is not cached.

┌─────────────────────────────────────┐
│         System Prompt               │ ← Cached
├─────────────────────────────────────┤
│     Historical Context              │ ← Cached
│  (Monthly/Weekly/Daily summaries)   │
├─────────────────────────────────────┤
│      Recent Context                 │ ← Not cached
│  (Hourly/5-min summaries + msgs)    │
├─────────────────────────────────────┤
│      Current Time + Wakeup          │ ← Not cached
└─────────────────────────────────────┘
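The layout in the diagram can be sketched as an ordered list of segments, with the cached ones forming a stable prefix. The `Segment` type and both function names are hypothetical, introduced only to illustrate the ordering; they are not part of the Bedrock API.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    name: str
    text: str
    cached: bool

def assemble_context(system_prompt, historical, recent, wakeup):
    """Order segments so that all cached content comes first."""
    return [
        Segment("system_prompt", system_prompt, cached=True),
        Segment("historical_context", historical, cached=True),
        Segment("recent_context", recent, cached=False),
        Segment("current_time_and_wakeup", wakeup, cached=False),
    ]

def cached_prefix_length(segments):
    """Count leading cached segments, stopping at the first dynamic one."""
    count = 0
    for seg in segments:
        if not seg.cached:
            break
        count += 1
    return count

segments = assemble_context(
    "You are a scheduling assistant.",
    "=== DAILY SUMMARY (Jan 15) ===\nUser asked about meeting scheduling...",
    "user: Hi, can you help me schedule a meeting?",
    "Current time: 2024-01-15T10:30:00Z",
)
prefix = cached_prefix_length(segments)
print(prefix)
```

Placing the frequently changing parts last matters because a prompt cache matches on a prefix: any change early in the prompt would invalidate everything after it.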

Logging Messages

Log messages to agent memory:

curl -X POST https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/log-messages/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hi, can you help me schedule a meeting?"},
      {"role": "assistant", "content": "Of course! When would you like to schedule it?"}
    ],
    "timestamp": "2024-01-15T10:30:00Z"
  }'

This is useful for:

Injecting external conversations (SMS, email) into memory
Seeding agents with initial context
Syncing state from external systems
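For callers scripting these uses, the same request can be built with Python's standard library. This is a sketch that mirrors the curl example above; `build_log_request` is a hypothetical helper name, and the request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.bedrock.orinlabs.org"

def build_log_request(agent_id, api_key, messages, timestamp=None):
    """Build (but do not send) a POST to the log-messages endpoint."""
    payload = {"messages": messages}
    if timestamp is not None:
        payload["timestamp"] = timestamp  # omit to let the server use the current time
    return urllib.request.Request(
        url=f"{BASE_URL}/api/memory/agents/{agent_id}/log-messages/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_log_request(
    "AGENT_ID",
    "YOUR_API_KEY",
    [
        {"role": "user", "content": "Hi, can you help me schedule a meeting?"},
        {"role": "assistant", "content": "Of course! When would you like to schedule it?"},
    ],
    timestamp="2024-01-15T10:30:00Z",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the payload easy to inspect before it reaches an agent's memory.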

Getting Memory Context

Retrieve the formatted memory context (what the agent sees):

curl -X GET https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/memory-context/ \
  -H "Authorization: Bearer YOUR_API_KEY"

Response:

{
  "context": "=== MONTHLY SUMMARY (December 2023) ===\nUser onboarded and set up task tracking...\n\n=== WEEKLY SUMMARY (Jan 8-14) ===\nDiscussed project deadlines...\n\n=== DAILY SUMMARY (Jan 15) ===\nUser asked about meeting scheduling..."
}
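When debugging, it can help to split the returned string into its individual summaries. The sketch below assumes every section starts with a `=== TITLE ===` header line, as in the sample response; the source does not document this format as stable, so treat the parser as illustrative.

```python
import re

# Assumption: each section begins with a line of the form "=== TITLE ===".
SECTION_HEADER = re.compile(r"^=== (.+?) ===$", re.MULTILINE)

def split_sections(context: str) -> list:
    """Split a memory-context string into (header, body) pairs."""
    parts = SECTION_HEADER.split(context)
    # parts = [text before first header, header1, body1, header2, body2, ...]
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]

context = (
    "=== MONTHLY SUMMARY (December 2023) ===\n"
    "User onboarded and set up task tracking...\n\n"
    "=== WEEKLY SUMMARY (Jan 8-14) ===\n"
    "Discussed project deadlines...\n\n"
    "=== DAILY SUMMARY (Jan 15) ===\n"
    "User asked about meeting scheduling..."
)
sections = split_sections(context)
for header, body in sections:
    print(header, "->", body)
```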

Getting Recent Messages

Retrieve raw recent messages (before summarization):

curl -X GET https://api.bedrock.orinlabs.org/api/memory/agents/AGENT_ID/messages/ \
  -H "Authorization: Bearer YOUR_API_KEY"

Summarization Model

Memory summarization uses a separate, smaller model (default: claude-haiku-4-5) for efficiency. Configure it per agent:

curl -X PATCH https://api.bedrock.orinlabs.org/api/cloud/agents/AGENT_ID/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "summarizer_model": "claude-haiku-4-5"
  }'

Resetting Memory

Clear all memory for an agent:

curl -X POST https://api.bedrock.orinlabs.org/api/cloud/agents/AGENT_ID/reset-memory/ \
  -H "Authorization: Bearer YOUR_API_KEY"

This permanently deletes all messages and summaries. The agent will start fresh with no memory of past interactions.

Memory and Tool Calls

When agents make tool calls, the calls and results are automatically logged to memory:

[
  {
    "type": "function_call",
    "name": "list_tasks",
    "arguments": "{\"limit\": 10}"
  },
  {
    "type": "function_call_output",
    "output": "Found 3 tasks:\n1. Review proposal (due tomorrow)\n2. Call client..."
  }
]
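A log in this shape can be read back by pairing each call with its result. The sketch below assumes that a `function_call_output` entry directly follows the `function_call` it answers, as in the sample; the sample shows no identifier linking the two, so adjacency is the only signal used. `pair_tool_calls` is a hypothetical helper.

```python
import json

def pair_tool_calls(items):
    """Pair each function_call with the function_call_output that follows it."""
    pairs = []
    pending = None
    for item in items:
        if item["type"] == "function_call":
            pending = item
        elif item["type"] == "function_call_output" and pending is not None:
            pairs.append({
                "name": pending["name"],
                # "arguments" is a JSON-encoded string in the sample, so decode it
                "arguments": json.loads(pending["arguments"]),
                "output": item["output"],
            })
            pending = None
    return pairs

log = [
    {"type": "function_call", "name": "list_tasks", "arguments": "{\"limit\": 10}"},
    {
        "type": "function_call_output",
        "output": "Found 3 tasks:\n1. Review proposal (due tomorrow)\n2. Call client...",
    },
]
pairs = pair_tool_calls(log)
print(pairs[0]["name"], pairs[0]["arguments"])
```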

This means agents remember which tools they have used and what they learned from the results.

Memory Timestamp Handling

Messages can include explicit timestamps:

{
  "messages": [...],
  "timestamp": "2024-01-15T10:30:00Z"
}

If the timestamp is omitted, the current time is used. Timestamps determine which summary bucket a message falls into.
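When logging messages from an external system, the original event time usually needs converting to the UTC form used in the examples. A minimal sketch, assuming the API expects ISO 8601 with a trailing `Z` as in the sample payload; `to_utc_timestamp` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

def to_utc_timestamp(dt=None):
    """Format a datetime like the sample value "2024-01-15T10:30:00Z"."""
    if dt is None:
        dt = datetime.now(timezone.utc)
    if dt.tzinfo is None:
        # A naive datetime is ambiguous; refuse to guess its timezone.
        raise ValueError("naive datetime: attach a timezone before logging")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# An SMS received at 05:30 in a UTC-5 timezone
sms_time = datetime(2024, 1, 15, 5, 30, tzinfo=timezone(timedelta(hours=-5)))
stamp = to_utc_timestamp(sms_time)
print(stamp)
```

Rejecting naive datetimes is a deliberate choice here: a silently misread timezone would place a message in the wrong summary bucket.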

Prompt Caching Benefits

The hierarchical memory structure enables prompt caching with Claude:

Content Type	Cache Status	Why
Tool descriptions	Cached	Rarely changes
System prompt	Cached	Static
Historical memory	Cached	Changes slowly
Recent memory	Not cached	Changes every run

This can reduce costs by 50-80% for long-running agents.
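The arithmetic behind a figure in that range can be sketched as follows. The token counts are hypothetical, and the cache-read multiplier of 0.1 (cached input tokens billed at a tenth of the normal input price) is an assumption to check against current pricing. The estimate covers input tokens only and ignores cache-write surcharges and output tokens.

```python
def relative_input_cost(cached_tokens, uncached_tokens, cache_read_multiplier=0.1):
    """Input cost with caching, as a fraction of the cost without caching."""
    total = cached_tokens + uncached_tokens
    billed = cached_tokens * cache_read_multiplier + uncached_tokens
    return billed / total

# Hypothetical run: 8,000 tokens of cached prefix, 2,000 tokens of recent context
ratio = relative_input_cost(cached_tokens=8000, uncached_tokens=2000)
savings = 1 - ratio
print(f"input cost is {ratio:.0%} of uncached, saving {savings:.0%}")
```

With these numbers the run pays for about 28% of the uncached input cost, a saving of about 72%. The saving grows as the cached prefix takes up a larger share of the prompt.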

Best Practices

Let It Summarize

Don’t manage memory manually; let the summarization system do its work.

Use Timestamps

Include accurate timestamps when logging external messages.

Reset Sparingly

Reset memory only when truly needed: a reset is permanent, and accumulated context is valuable.

Check Context

Use the memory-context endpoint to debug what agents see.
