Letta is the platform for building stateful agents: open AI with advanced memory that can learn and self-improve over time.
- Developer Documentation: Learn how to create agents using Python or TypeScript
- Python API Reference: Complete Python SDK documentation
- Agent Development Environment (ADE): A no-code UI for building stateful agents
- Letta Cloud: The fastest way to try Letta
Install the Letta Python SDK:
pip install letta-client

In the example below, we'll create a stateful agent with two memory blocks. We'll initialize the human memory block with incorrect information, and correct the agent in our first message, which will trigger the agent to update its own memory with a tool call.
To run the examples, you'll need a LETTA_API_KEY from Letta Cloud, or you can run your own self-hosted server (see our guide).
from letta_client import Letta
client = Letta(token="LETTA_API_KEY")  # replace with your actual API key
# client = Letta(base_url="http://localhost:8283") # if self-hosting
agent_state = client.agents.create(
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {
            "label": "human",
            "value": "The human's name is Chad. They like vibe coding."
        },
        {
            "label": "persona",
            "value": "My name is Sam, a helpful assistant."
        }
    ],
    tools=["web_search", "run_code"]
)
print(agent_state.id)
# agent-d9be...0846
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[
        {
            "role": "user",
            "content": "Hey, nice to meet you, my name is Brad."
        }
    ]
)

# the agent will think, then edit its memory using a tool
for message in response.messages:
    print(message)
# The content of this memory block will be something like
# "The human's name is Brad. They like vibe coding."
# Fetch this block's content with:
human_block = client.agents.blocks.retrieve(agent_id=agent_state.id, block_label="human")
print(human_block.value)

Letta is built on the MemGPT research paper, which introduced the concept of the "LLM Operating System" for memory management:
- Memory Hierarchy: Agents have self-editing memory split between in-context and out-of-context memory
- Memory Blocks: In-context memory is composed of persistent editable blocks
- Agentic Context Engineering: Agents control their context window using tools to edit, delete, or search memory
- Perpetual Self-Improving Agents: Every agent has a perpetual (infinite) message history
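To make these ideas concrete, here is a minimal conceptual sketch (hypothetical names, not Letta's actual implementation): in-context memory is a set of labeled blocks compiled into the prompt, which the agent edits through tool calls like a made-up memory_replace.

# Conceptual sketch only: Block, compile_context, and memory_replace are
# illustrative, not part of the letta-client SDK.
from dataclasses import dataclass
from typing import List

@dataclass
class Block:
    label: str
    value: str

def compile_context(blocks: List[Block]) -> str:
    # Render each block into the system prompt (the in-context memory).
    return "\n".join(f"<{b.label}>\n{b.value}\n</{b.label}>" for b in blocks)

def memory_replace(blocks: List[Block], label: str, old: str, new: str) -> None:
    # A self-editing tool: the agent rewrites part of its own context window.
    for b in blocks:
        if b.label == label:
            b.value = b.value.replace(old, new)

blocks = [Block("human", "The human's name is Chad.")]
memory_replace(blocks, "human", "Chad", "Brad")
print(compile_context(blocks))  # the next prompt reflects the edit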
Connect to a local Letta server instead of the cloud:
from letta_client import Letta
client = Letta(base_url="http://localhost:8283")

Run Letta locally with Docker:
docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OPENAI_API_KEY="your_key" \
  letta/letta:latest

See the self-hosting guide for more options.
Memory Management (full guide)
Memory blocks are persistent, editable sections of an agent's context window:
# Create agent with memory blocks
agent = client.agents.create(
    memory_blocks=[
        {"label": "persona", "value": "I'm a helpful assistant."},
        {"label": "human", "value": "User preferences and info."}
    ]
)

# Modify blocks manually
client.agents.blocks.modify(
    agent_id=agent.id,
    block_label="human",
    value="Updated user information"
)

# Retrieve a block
block = client.agents.blocks.retrieve(agent_id=agent.id, block_label="human")
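You can also enumerate an agent's blocks; a small sketch, assuming the SDK exposes a list method alongside retrieve and modify:

# List every block attached to the agent (assumed: client.agents.blocks.list)
for block in client.agents.blocks.list(agent_id=agent.id):
    print(block.label, block.value)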
Multi-agent Shared Memory (full guide)
Memory blocks can be attached to multiple agents, and every attached agent has an up-to-date view of the block's contents: if one agent modifies it, the others see the change immediately.
Here is how to attach a single memory block to multiple agents:
# Create shared block
shared_block = client.blocks.create(
    label="organization",
    value="Shared team context"
)

# Attach to multiple agents
agent1 = client.agents.create(
    memory_blocks=[{"label": "persona", "value": "I am a supervisor"}],
    block_ids=[shared_block.id]
)
agent2 = client.agents.create(
    memory_blocks=[{"label": "persona", "value": "I am a worker"}],
    block_ids=[shared_block.id]
)
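To see the sharing in action, update the block through the top-level blocks API; a sketch, assuming client.blocks.modify accepts a block_id (mirroring the agent-scoped modify shown earlier):

# Both agents see this change immediately (assumed: client.blocks.modify)
client.blocks.modify(block_id=shared_block.id, value="Updated shared team context")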
Sleep-time Agents (full guide)
Background agents that share memory with your primary agent:
agent = client.agents.create(
    model="openai/gpt-4o-mini",
    enable_sleeptime=True  # creates a sleep-time agent
)

Agent File Import/Export (full guide)
Save and share agents with the .af file format:
# Import agent
with open('/path/to/agent.af', 'rb') as f:
    agent = client.agents.import_agent_serialized(file=f)

# Export agent
schema = client.agents.export_agent_serialized(agent_id=agent.id)

MCP Tools (full guide)
Connect to Model Context Protocol servers:
# Add tool from MCP server
tool = client.tools.add_mcp_tool(
    server_name="weather-server",
    tool_name="get_weather"
)

# Create agent with MCP tool
agent = client.agents.create(
    model="openai/gpt-4o-mini",
    tool_ids=[tool.id]
)
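Once attached, the tool is invoked like any other: send a message that calls for it, exactly as in the quickstart (the weather question below is just an illustrative prompt):

response = client.agents.messages.create(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}]
)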
Filesystem (full guide)
Give agents access to files:
# Get an available embedding config
embedding_configs = client.models.list_embedding_models()

# Create folder and upload file
folder = client.folders.create(
    name="my_folder",
    embedding_config=embedding_configs[0]
)
with open("file.txt", "rb") as f:
    client.folders.files.upload(file=f, folder_id=folder.id)

# Attach to agent
client.agents.folders.attach(agent_id=agent.id, folder_id=folder.id)

Long-running Agents (full guide)
Background execution with resumable streaming:
stream = client.agents.messages.create_stream(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Analyze this dataset"}],
    background=True
)

run_id = None
last_seq_id = None
for chunk in stream:
    run_id = chunk.run_id
    last_seq_id = chunk.seq_id

# Resume if disconnected
for chunk in client.runs.stream(run_id=run_id, starting_after=last_seq_id):
    print(chunk)
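Before resuming, you can also check whether the background run is still in progress; a sketch, assuming the runs API exposes a retrieve method:

# Inspect the background run (assumed: client.runs.retrieve)
run = client.runs.retrieve(run_id=run_id)
print(run.status)  # e.g. "running" or "completed"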
Streaming (full guide)
Stream responses in real-time:
stream = client.agents.messages.create_stream(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Hello!"}]
)
for chunk in stream:
    print(chunk)
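By default this streams step-by-step agent messages; Letta also supports token-level streaming. A sketch, assuming a stream_tokens parameter (check the full guide for the exact flag):

# Stream token-by-token instead of step-by-step (assumed: stream_tokens)
stream = client.agents.messages.create_stream(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Hello!"}],
    stream_tokens=True
)
for chunk in stream:
    print(chunk)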
Message Types (full guide)
Agent responses contain different message types. Handle them with the message_type discriminator:
messages = client.agents.messages.list(agent_id=agent.id)
for message in messages:
    if message.message_type == "user_message":
        print(f"User: {message.content}")
    elif message.message_type == "assistant_message":
        print(f"Agent: {message.content}")
    elif message.message_type == "reasoning_message":
        print(f"Reasoning: {message.reasoning}")
    elif message.message_type == "tool_call_message":
        print(f"Tool: {message.tool_call.name}")
    elif message.message_type == "tool_return_message":
        print(f"Result: {message.tool_return}")

Full type hints and async support:
from letta_client import Letta
from letta_client.types import CreateAgentRequest

# Sync client
client = Letta(token="LETTA_API_KEY")

# Async client
from letta_client import AsyncLetta

async_client = AsyncLetta(token="LETTA_API_KEY")
agent = await async_client.agents.create(  # await must run inside an async function
    model="openai/gpt-4o-mini",
    memory_blocks=[...]
)

from letta_client.core.api_error import ApiError
try:
    client.agents.messages.create(agent_id=agent_id, messages=[...])
except ApiError as e:
    print(e.status_code)
    print(e.message)
    print(e.body)

# Override the retry count per request
response = client.agents.create(
    {...},
    request_options={"max_retries": 3}  # Default: 2
)

# Override the timeout per request
response = client.agents.create(
    {...},
    request_options={"timeout_in_seconds": 30}  # Default: 60
)

# Send additional headers per request
response = client.agents.create(
    {...},
    request_options={
        "additional_headers": {
            "X-Custom-Header": "value"
        }
    }
)

# Access the raw response (headers plus parsed data)
response = client.agents.with_raw_response.create({...})
print(response.headers["X-My-Header"])
print(response.data)  # access the underlying object

# Customize the underlying httpx client
import httpx
from letta_client import Letta
client = Letta(
    httpx_client=httpx.Client(
        proxies="http://my.test.proxy.example.com",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    )
)

Works with:
- Python 3.8+
- async/await
- Type checkers (mypy, pyright)
Letta is an open source project built by over a hundred contributors. There are many ways to get involved in the Letta OSS project!
- Join the Discord: Chat with the Letta devs and other AI developers.
- Chat on our forum: If you're not into Discord, check out our developer forum.
- Follow our socials: Twitter/X, LinkedIn, YouTube
This SDK is generated programmatically. For SDK changes, please open an issue.
README contributions are always welcome!
MIT
Legal notices: By using Letta and related Letta services (such as the Letta endpoint or hosted service), you are agreeing to our privacy policy and terms of service.