Memory Types¶
The Redis Agent Memory Server provides two distinct types of memory storage, each optimized for different use cases and access patterns: Working Memory and Long-Term Memory.
Overview¶
Feature | Working Memory | Long-Term Memory |
---|---|---|
Scope | Session-scoped | Cross-session, persistent |
Lifespan | TTL-based (1 hour default) | Permanent until manually deleted |
Storage | Redis key-value with JSON | Redis with vector indexing |
Search | Simple text matching | Semantic vector search |
Capacity | Limited by window size | Unlimited (with compaction) |
Use Case | Active conversation state | Knowledge base, user preferences |
Indexing | None | Vector embeddings + metadata |
Deduplication | None | Hash-based and semantic |
Working Memory¶
Working memory is session-scoped, ephemeral storage designed for active conversation state and temporary data. It's the "scratch pad" where an AI agent keeps track of the current conversation context.
Characteristics¶
- Session Scoped: Each session has its own isolated working memory
- TTL-Based: Automatically expires (default: 1 hour)
- Window Management: Automatically summarizes when message count exceeds limits
- Mixed Content: Stores both conversation messages and structured memory records
- No Indexing: Simple JSON storage in Redis
- Promotion: Structured memories can be promoted to long-term storage
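The TTL and window behavior listed above can be pictured with a small in-process model. This is only a sketch: `SessionMemory`, `append_message`, `summarize`, and `is_expired` are hypothetical names, the placeholder summarizer stands in for an LLM call, and refreshing the TTL on every write is an assumption rather than documented server behavior.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SessionMemory:
    """Hypothetical in-process stand-in for one session's working memory."""
    messages: list = field(default_factory=list)
    context: str = ""        # summary of messages that were trimmed away
    expires_at: float = 0.0  # absolute expiry time, in seconds


def summarize(messages):
    # Placeholder only: a real server would ask an LLM for a summary.
    return " | ".join(m["content"] for m in messages)


def append_message(mem, message, window_size=50, ttl_seconds=3600, now=None):
    """Add a message, refresh the TTL (assumed), and fold overflow into context."""
    now = time.time() if now is None else now
    mem.messages.append(message)
    mem.expires_at = now + ttl_seconds
    overflow = len(mem.messages) - window_size
    if overflow > 0:
        old, mem.messages = mem.messages[:overflow], mem.messages[overflow:]
        summary = summarize(old)
        mem.context = f"{mem.context} | {summary}" if mem.context else summary
    return mem


def is_expired(mem, now=None):
    """True once the session's TTL has elapsed."""
    now = time.time() if now is None else now
    return now >= mem.expires_at
```

Passing `now` explicitly keeps the sketch deterministic, which makes the expiry and trimming logic easy to check without waiting on a real clock.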
Data Structure¶
Working memory contains:
- Messages: Conversation history (role/content pairs)
- Memories: Structured memory records awaiting promotion
- Context: Summary of past conversation when truncated
- Data: Arbitrary JSON key-value storage
- Metadata: User ID, timestamps, TTL settings
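As a rough picture of how these sections might sit together in one JSON document, the sketch below builds a plain dictionary and round-trips it through `json`. The metadata field names `user_id` and `ttl_seconds`, the `draft_itinerary` key, and the `missing_sections` helper are illustrative assumptions, not the server's schema.

```python
import json

# Sections named in the list above; metadata is handled separately.
REQUIRED_SECTIONS = ("messages", "memories", "context", "data")

working_memory = {
    "session_id": "chat_123",
    "messages": [
        {"role": "user", "content": "I love Italian food, especially carbonara"},
    ],
    "memories": [
        {"text": "User enjoys Italian food", "memory_type": "semantic"},
    ],
    "context": "",                           # filled once older messages are summarized
    "data": {"draft_itinerary": ["Paris"]},  # arbitrary session-only JSON (example key)
    "user_id": "user_123",                   # metadata; field name is assumed
    "ttl_seconds": 3600,                     # metadata; field name is assumed
}


def missing_sections(payload):
    """Return the names of any expected sections absent from the payload."""
    return [name for name in REQUIRED_SECTIONS if name not in payload]


encoded = json.dumps(working_memory)
decoded = json.loads(encoded)
```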
When to Use Working Memory¶
- Active Conversation State
- Temporary Structured Data
- Session-Specific Settings
- Promoting Memories to Long-Term Storage
# Memories in working memory are automatically promoted to long-term storage
working_memory = WorkingMemory(
    session_id="chat_123",
    memories=[
        MemoryRecord(
            text="User is planning a trip to Paris next month",
            id="trip_planning_paris",
            memory_type="episodic",
            topics=["travel", "planning"],
            entities=["Paris"]
        )
    ]
)
# This memory will become permanent in long-term storage
🔑 Key Distinction:
- Use the data field for temporary facts that stay only in the session
- Use the memories field for permanent facts that should be promoted to long-term storage
- Anything in the memories field will automatically become persistent and searchable across all future sessions
API Endpoints¶
# Get working memory for a session
GET /v1/working-memory/{session_id}?namespace=demo&model_name=gpt-4o
# Set working memory (replaces existing)
PUT /v1/working-memory/{session_id}
# Delete working memory
DELETE /v1/working-memory/{session_id}?namespace=demo
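For callers that assemble these URLs by hand, a small helper along the following lines may help keep path and query encoding consistent. The helper name `working_memory_url` and the `localhost:8000` host are assumptions made for illustration; only the path and query parameter names come from the endpoints above.

```python
from urllib.parse import quote, urlencode, urlunsplit


def working_memory_url(host, session_id, **query):
    """Build a /v1/working-memory/{session_id} URL with optional query params."""
    path = f"/v1/working-memory/{quote(session_id, safe='')}"
    return urlunsplit(("http", host, path, urlencode(query), ""))


# Host and port are placeholders for wherever the server is running.
url = working_memory_url(
    "localhost:8000", "chat_123", namespace="demo", model_name="gpt-4o"
)
print(url)
```

Percent-encoding the session ID guards against IDs that contain slashes or spaces, which would otherwise change the path.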
Automatic Promotion¶
When structured memories in working memory are stored, they are automatically promoted to long-term storage in the background:
- Memories with persisted_at=null are identified
- Server assigns unique IDs and timestamps
- Memories are indexed in long-term storage with vector embeddings
- Working memory is updated with persisted_at timestamps
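The promotion steps above can be sketched as a single pass over the session's memory records. This is an illustration under stated assumptions, not the server's implementation: `promote_pending` is a hypothetical function, records are plain dictionaries, a dictionary stands in for the long-term index, and the embedding step is only noted in a comment.

```python
import uuid
from datetime import datetime, timezone


def promote_pending(working_memories, long_term_index):
    """Copy not-yet-persisted records into a long-term store and stamp them."""
    promoted = []
    for record in working_memories:
        if record.get("persisted_at") is not None:
            continue  # already promoted on an earlier pass
        # Keep a client-supplied ID if present, otherwise assign one.
        record.setdefault("id", str(uuid.uuid4()))
        record["persisted_at"] = datetime.now(timezone.utc).isoformat()
        # A real server would also compute and index a vector embedding here.
        long_term_index[record["id"]] = dict(record)
        promoted.append(record["id"])
    return promoted
```

Because the pass skips anything that already carries a `persisted_at` value, running it twice over the same working memory does not promote a record a second time.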
Three Ways to Create Long-Term Memories¶
Long-term memories are typically created by LLMs (either yours or the memory server's) based on conversations. There are three pathways:
1. 🤖 Automatic Extraction from Conversations¶
The server automatically extracts memories from conversation messages using an LLM in the background:
# Server analyzes messages and creates memories automatically
working_memory = WorkingMemory(
    session_id="chat_123",
    messages=[
        {"role": "user", "content": "I love Italian food, especially carbonara"},
        {"role": "assistant", "content": "Great! I'll remember your preference for Italian cuisine."}
    ]
    # Server will extract: "User enjoys Italian food, particularly carbonara pasta"
)
2. ⚡ LLM-Identified Memories via Working Memory (Performance Optimization)¶
Your LLM can pre-identify memories and add them to working memory for batch storage:
# LLM identifies important facts and adds to memories field
working_memory = WorkingMemory(
    session_id="chat_123",
    memories=[
        MemoryRecord(
            text="User prefers morning meetings and dislikes calls after 4 PM",
            memory_type="semantic",
            topics=["preferences", "scheduling"],
            entities=["morning meetings", "4 PM"]
        )
    ]
    # Automatically promoted to long-term storage when saving working memory
)
3. 🎯 Direct Long-Term Memory Creation¶
Create memories directly via API or LLM tool calls:
# Direct API call or LLM using create_long_term_memory tool
await client.create_long_term_memories([
    {
        "text": "User works as a software engineer at TechCorp",
        "memory_type": "semantic",
        "topics": ["career", "work"],
        "entities": ["software engineer", "TechCorp"]
    }
])
💡 LLM-Driven Design: The system is designed for LLMs to make memory decisions. Your LLM can use memory tools to search existing memories, decide what's important to remember, and choose the most efficient storage method.
Long-Term Memory¶
Long-term memory is persistent, cross-session storage designed for knowledge that should be retained and searchable across all interactions. It's the "knowledge base" where important facts, preferences, and experiences are stored.
Characteristics¶
- Cross-Session: Accessible from any session
- Persistent: Survives server restarts and session expiration
- Vector Indexed: Semantic search with OpenAI embeddings
- Deduplication: Automatic hash-based and semantic deduplication
- Rich Metadata: Topics, entities, timestamps, memory types
- Compaction: Automatic cleanup and merging of duplicates
Memory Types¶
Long-term memory supports three types of memories:
- Semantic: Facts, preferences, general knowledge
- Episodic: Events with temporal context
- Message: Conversation records (auto-generated)
When to Use Long-Term Memory¶
- User Preferences and Profile
- Important Facts and Knowledge
- Cross-Session Context
API Endpoints¶
# Create long-term memories
POST /v1/long-term-memory/
# Search long-term memories
POST /v1/long-term-memory/search
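A search call against the second endpoint could be prepared with the standard library as sketched here. The helper `build_search_request` and the base URL are hypothetical, and the body uses the text/filters/limit fields that appear in this page's examples. Any authentication headers the server may require are not covered.

```python
import json
from urllib.request import Request


def build_search_request(base_url, text, filters=None, limit=None):
    """Prepare (but do not send) a POST to /v1/long-term-memory/search."""
    body = {"text": text}
    if filters:
        body["filters"] = filters
    if limit is not None:
        body["limit"] = limit
    return Request(
        url=f"{base_url.rstrip('/')}/v1/long-term-memory/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Base URL is a placeholder for wherever the server is running.
request = build_search_request(
    "http://localhost:8000",
    "user preferences",
    filters={"user_id": {"eq": "user_123"}},
    limit=5,
)
```

The prepared request can then be passed to `urllib.request.urlopen` when a server is actually reachable.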
Search Capabilities¶
Long-term memory supports the following search features:
Semantic Vector Search¶
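To illustrate the general idea behind vector search, the toy sketch below ranks stored texts by cosine similarity to a query vector. The three-dimensional vectors are hand-written stand-ins for real embeddings, and `cosine_similarity` and `top_k` are hypothetical helpers. The server's actual indexing and scoring are not shown on this page.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=2):
    """Return the k stored texts whose vectors are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "embeddings": real ones would come from an embedding model.
indexed = [
    ("likes pasta", [1.0, 0.0, 0.0]),
    ("works at TechCorp", [0.0, 1.0, 0.0]),
    ("enjoys Italian food", [0.9, 0.1, 0.0]),
]
nearest = top_k([1.0, 0.0, 0.0], indexed, k=2)
```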
Advanced Filtering¶
{
    "text": "user preferences",
    "filters": {
        "user_id": {"eq": "user_123"},
        "memory_type": {"eq": "semantic"},
        "topics": {"any": ["preferences", "settings"]},
        "created_at": {"gte": "2024-01-01T00:00:00Z"}
    }
}
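One way to read the filter clauses above is as a conjunction of per-field tests. The sketch below evaluates the three operators used in the example (eq, any, gte) against a plain dictionary. `matches` is a hypothetical helper, and comparing timestamps as strings assumes they all share the same ISO 8601 UTC format.

```python
def matches(record, filters):
    """Return True if a record satisfies every filter clause."""
    for field_name, clause in filters.items():
        value = record.get(field_name)
        for op, expected in clause.items():
            if op == "eq":
                ok = value == expected
            elif op == "any":
                # At least one listed value must be present on the record.
                ok = bool(set(value or []) & set(expected))
            elif op == "gte":
                # String comparison works only for same-format ISO 8601 UTC values.
                ok = value is not None and value >= expected
            else:
                raise ValueError(f"unsupported operator: {op}")
            if not ok:
                return False
    return True


example_filters = {
    "user_id": {"eq": "user_123"},
    "memory_type": {"eq": "semantic"},
    "topics": {"any": ["preferences", "settings"]},
    "created_at": {"gte": "2024-01-01T00:00:00Z"},
}
```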
Hybrid Search¶
{
    "text": "travel plans",
    "filters": {
        "namespace": {"eq": "personal"},
        "event_date": {"gte": "2024-03-01T00:00:00Z"}
    },
    "include_working_memory": true,
    "include_long_term_memory": true
}
Memory Lifecycle¶
1. Creation in Working Memory¶
# Client creates structured memory
memory = MemoryRecord(
    text="User likes Italian food",
    id="client_generated_id",
    memory_type="semantic"
)
# Add to working memory
working_memory = WorkingMemory(
    session_id="current_session",
    memories=[memory]
)
2. Automatic Promotion¶
# Server promotes to long-term storage (background)
# - Assigns persisted_at timestamp
# - Generates vector embeddings
# - Indexes for search
# - Updates working memory with timestamps
3. Deduplication and Compaction¶
# Server automatically:
# - Identifies hash-based duplicates
# - Finds semantically similar memories
# - Merges related memories using LLM
# - Removes obsolete duplicates
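The hash-based part of this step can be illustrated with a content digest over normalized text. This sketch makes several assumptions: `content_hash` and `drop_exact_duplicates` are hypothetical helpers, the choice of fields that feed the hash (owner plus whitespace- and case-normalized text) is a guess, and semantic deduplication and LLM merging are not covered.

```python
import hashlib


def content_hash(record):
    """Hash the owner plus normalized text so identical facts collide."""
    normalized = " ".join(record["text"].lower().split())
    key = f"{record.get('user_id', '')}|{normalized}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def drop_exact_duplicates(records):
    """Keep the first record for each distinct content hash."""
    seen, unique = set(), []
    for record in records:
        digest = content_hash(record)
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique
```

Including the owner in the hash keeps the same sentence stored for two different users from being treated as a duplicate.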
4. Retrieval and Search¶
# Client searches across all memory
results = await search_memories(
    text="food preferences",
    filters={"user_id": {"eq": "user_123"}}
)
Memory Prompt Integration¶
The memory system integrates with AI prompts through the /v1/memory/prompt endpoint:
# Get memory-enriched prompt
response = await memory_prompt({
    "query": "Help me plan dinner",
    "session": {
        "session_id": "current_chat",
        "model_name": "gpt-4o",
        "context_window_max": 4000
    },
    "long_term_search": {
        "text": "food preferences dietary restrictions",
        "filters": {"user_id": {"eq": "user_123"}},
        "limit": 5
    }
})
# Returns ready-to-use messages with:
# - Conversation context from working memory
# - Relevant memories from long-term storage
# - User's query as final message
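To make the shape of that result concrete, the sketch below assembles a message list from the three ingredients named above. It is an illustration only: `build_prompt_messages` is a hypothetical function, and placing the summary and memories in a single system message is an assumption. The endpoint's real response format may differ.

```python
def build_prompt_messages(query, context_summary, recent_messages, memories):
    """Combine session context, long-term memories, and the query into messages."""
    system_parts = []
    if context_summary:
        system_parts.append(f"Conversation summary: {context_summary}")
    if memories:
        bullet_list = "\n".join(f"- {text}" for text in memories)
        system_parts.append(f"Relevant memories:\n{bullet_list}")

    messages = []
    if system_parts:
        messages.append({"role": "system", "content": "\n\n".join(system_parts)})
    messages.extend(recent_messages)                     # working-memory messages
    messages.append({"role": "user", "content": query})  # query always comes last
    return messages
```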
Best Practices¶
Working Memory¶
- Keep conversation state and temporary data
- Use for session-specific configuration
- Store structured memories that might become long-term
- Let automatic promotion handle persistence
Long-Term Memory¶
- Store user preferences and lasting facts
- Include rich metadata (topics, entities, timestamps)
- Use meaningful IDs for easier retrieval
- Leverage semantic search for discovery
Memory Design¶
- Use semantic memory for timeless facts
- Use episodic memory for time-bound events
- Include relevant topics and entities for better search
- Design memory text for LLM consumption
Search Strategy¶
- Start with semantic search for discovery
- Add filters for precision
- Use unified search for comprehensive results
- Consider both working and long-term contexts
Memory Extraction¶
By default, the system automatically extracts structured memories from conversations as they flow from working memory to long-term storage. This extraction process can be customized using different memory strategies.
Memory Strategies
The system supports multiple extraction strategies (discrete facts, summaries, preferences, custom prompts) that determine how conversations are processed into memories. See Memory Strategies for complete documentation and examples.
Configuration¶
Memory behavior can be configured through environment variables:
# Working memory settings
WINDOW_SIZE=50 # Message window before summarization
LONG_TERM_MEMORY=true # Enable long-term memory features
# Long-term memory settings
ENABLE_DISCRETE_MEMORY_EXTRACTION=true # Extract memories from messages
GENERATION_MODEL=gpt-4o-mini # Model for summarization/extraction
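For tooling that needs to read the same variables, a loader along these lines is one option. `load_memory_settings` and `_as_bool` are hypothetical helpers, and the fallback values simply mirror the example above. They are not confirmed server defaults.

```python
import os


def _as_bool(raw):
    """Interpret common truthy spellings of an environment value."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_memory_settings(env=None):
    """Read the memory-related variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return {
        "window_size": int(env.get("WINDOW_SIZE", "50")),
        "long_term_memory": _as_bool(env.get("LONG_TERM_MEMORY", "true")),
        "enable_discrete_memory_extraction": _as_bool(
            env.get("ENABLE_DISCRETE_MEMORY_EXTRACTION", "true")
        ),
        "generation_model": env.get("GENERATION_MODEL", "gpt-4o-mini"),
    }
```

Accepting an explicit mapping makes the loader easy to exercise without touching the real process environment.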
For complete configuration options, see the Configuration Guide.