This repository hosts the backend for the Capital Markets - Crypto Assistant ReAct Agent Chatbot Service. The service provides insights and recommendations for cryptocurrency portfolio management using a LangGraph ReAct agent, leveraging MongoDB Atlas for data storage, vector search, and agent state management.
- LangGraph ReAct agent: An agent built with LangGraph (a graph-based agent orchestration framework) that implements the ReAct ("Reason and Act") pattern: the agent reasons step by step and takes actions, such as calling tools, between reasoning steps. In practice, a LangGraph ReAct agent:
  - Receives a user query.
  - Reasons about it (e.g., “I need to look up today’s weather”).
  - Takes an action (e.g., “call WeatherAPI tool”).
  - Observes the result (e.g., “it’s raining”).
  - Continues reasoning and acting until it reaches a final answer.

(The example above is a simplified version of the ReAct pattern; the diagram below illustrates it.)
Note: The agent uses memory to store past observations and actions, which is valuable for more complex tasks. For our use case, we implement the ReAct pattern to build a Crypto Assistant Agent that analyzes cryptocurrency market data, news articles, and social media sentiment, providing insights and recommendations to users about their crypto portfolios.
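The reason, act, observe cycle described above can be sketched in plain Python. This is a minimal illustration, not the LangGraph implementation: `scripted_policy` is a hard-coded stand-in for the LLM, and `weather_tool`, `react_loop`, and `max_steps` are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
Decision = Tuple[str, str]  # ("act", tool_name) or ("finish", final_answer)


def weather_tool(city: str) -> str:
    """Hypothetical tool; a real one would call an external weather API."""
    return f"it is raining in {city}"


def scripted_policy(query: str, observations: List[str]) -> Decision:
    """Stand-in for the LLM's reasoning step (hard-coded for illustration)."""
    if not observations:
        # Reason: nothing is known yet, so look up the weather.
        return ("act", "weather_tool")
    # Reason: the latest observation is enough to answer.
    return ("finish", f"Yes, {observations[-1]}.")


def react_loop(
    query: str,
    policy: Callable[[str, List[str]], Decision],
    tools: Dict[str, Tool],
    tool_input: str,
    max_steps: int = 5,
) -> str:
    """Alternate reasoning and acting until the policy returns a final answer."""
    observations: List[str] = []
    for _ in range(max_steps):
        kind, value = policy(query, observations)
        if kind == "finish":
            return value
        # Act by calling the chosen tool, then record the observation.
        observations.append(tools[value](tool_input))
    raise RuntimeError("No final answer within max_steps")


answer = react_loop(
    "Do I need an umbrella in Lisbon?",
    scripted_policy,
    {"weather_tool": weather_tool},
    tool_input="Lisbon",
)
print(answer)
```

In the actual service the policy role is played by the LLM, and the loop and tool dispatch are handled by LangGraph's prebuilt ReAct agent.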
- Long-term memory: The agent stores and recalls information from previous interactions, building a comprehensive understanding of the user's needs and preferences. We use checkpointers to store state at every step across different interactions, implemented with the MongoDB checkpointer component of LangGraph.
  - Two main collections support the checkpointer:
    - `crypto_checkpoints_aio`
    - `crypto_checkpoint_writes_aio`
- Agent Profile: The agent is configured with a profile that defines its role, instructions, and rules. This profile is stored in the `agent_profiles` collection in MongoDB and can be easily modified to change the agent's behavior.
- Embedding Model optimized for finance retrieval: We use the voyage-finance-2 embedding model, which is optimized for finance retrieval tasks. This model understands the nuances of financial language, including cryptocurrency terminology, and generates embeddings for crypto news articles, technical analysis reports, social media sentiment, and other cryptocurrency-related documents.
- Vector Search: Semantic understanding of cryptocurrency market data, news articles, and social media sentiment is achieved through MongoDB Atlas Vector Search. This allows the agent to find semantically similar results across crypto analysis, news, and social media reports. The vector search is performed on three collections:
  - `reports_crypto_analysis`: Stores cryptocurrency analysis reports with technical indicators, trends, and momentum analysis, with a vector field named `report_embedding` containing vector representations generated by the voyage-finance-2 model.
  - `reports_crypto_news`: Stores cryptocurrency news reports with sentiment analysis and market impact assessments, with a vector field named `report_embedding` containing vector representations generated by the voyage-finance-2 model.
  - `reports_crypto_sm`: Stores cryptocurrency social media sentiment reports from Reddit, Twitter, and other platforms, with a vector field named `report_embedding` containing vector representations generated by the voyage-finance-2 model.
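As a rough illustration of how a query against one of these collections might look, the sketch below builds an aggregation pipeline around Atlas's `$vectorSearch` stage. The function name, the default `limit` and `num_candidates` values, and the three-element placeholder vector are assumptions for this example; a real query vector would come from the voyage-finance-2 model, and the repository's own tools may build their queries differently.

```python
from typing import Any, Dict, List, Sequence


def build_vector_search_pipeline(
    query_vector: Sequence[float],
    index_name: str,
    path: str = "report_embedding",
    limit: int = 3,
    num_candidates: int = 50,
) -> List[Dict[str, Any]]:
    """Return an aggregation pipeline for an Atlas Vector Search query."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be >= limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Expose the similarity score on each returned document.
        {"$set": {"score": {"$meta": "vectorSearchScore"}}},
        # Drop the large embedding array from the results.
        {"$project": {path: 0}},
    ]


# Placeholder vector for illustration only.
pipeline = build_vector_search_pipeline(
    [0.1, 0.2, 0.3],
    index_name="reports_crypto_analysis_report_embedding_index",
)
print(pipeline[0]["$vectorSearch"]["index"])
```

With pymongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on the matching reports collection.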
The service is built on a modular foundation using:
- MongoDB Atlas: For data storage, including agent profiles, crypto reports, and portfolio information
- MongoDB Atlas Vector Search: For semantic search capabilities across crypto data
- MongoDB checkpointer: For agent state management
- VoyageAI: For generating finance-specific embeddings optimized for cryptocurrency content
- AWS Bedrock/Anthropic: For LLM inference in agent reasoning steps
- LangGraph: For utilizing the prebuilt ReAct agent
- FastAPI: For reliable, documented API endpoints
The Crypto Assistant Agent serves as an intelligent financial advisor that analyzes user queries, fetches relevant financial data, and presents actionable insights. Let's explore how this system works from query to response.
- User Query Processing: When a user asks a cryptocurrency question (e.g., "Based on current crypto market sentiment and social media trends, what overall portfolio reallocation would you suggest?"), the ReAct agent examines the query to determine what information is needed.
- Reasoning Step: The agent engages in a thought process, weighing which tools would be most appropriate. This reasoning is guided by the agent's profile, which defines its role, capabilities, and decision-making rules.
- Tool Selection: Based on its reasoning, the agent selects the most appropriate specialized tool:
  - For cryptocurrency portfolio allocation questions → `get_portfolio_allocation_tool`
  - For technical analysis and crypto trends → `crypto_analysis_reports_vector_search_tool`
  - For news sentiment about crypto assets → `crypto_news_reports_vector_search_tool`
  - For social media sentiment analysis → `crypto_social_media_reports_vector_search_tool`
  - For year-to-date crypto portfolio returns → `get_portfolio_ytd_return_tool`
  - For general cryptocurrency information → `tavily_search_tool`
- Tool Execution and Observation: The selected tool retrieves data from MongoDB collections or external APIs. The agent observes the tool's output, integrating it into its understanding.
- Follow-up Reasoning: If the information is incomplete, the agent may reason through additional tool calls to gather complementary data points.
- Response Synthesis: The agent combines all gathered information into a coherent, actionable response that directly addresses the user's query.
Throughout this process, the agent maintains a conversation state in MongoDB, enabling it to reference previous interactions and provide contextually relevant responses over time.
The Crypto Assistant Agent uses MongoDB as a long-term memory store to maintain context across conversation sessions:
- Memory Storage: Two key collections, `crypto_checkpoints_aio` and `crypto_checkpoint_writes_aio`, store the complete state of agent interactions, including reasoning steps, tool calls, and observations.
- Memory Structure: Each conversation is organized by a unique `thread_id` that includes a timestamp (format: `thread_YYYYMMDD_HHMMSS`). This allows for organized memory retrieval and management.
- Automated Memory Cleanup: To prevent memory buildup, a scheduled job runs daily at 04:00 UTC through the `CheckpointerMemoryJobs` system:
  - Cleanup Process: The job identifies `thread_id`s that don't contain today's date and removes them from both memory collections. This ensures the system maintains only recent conversation history while preventing database bloat.
  - Memory Retention Strategy: By design, the system retains the current day's conversations, allowing users to continue discussions throughout their workday while automatically clearing older interactions that are less likely to be relevant.
This memory management approach balances the benefits of persistent conversation context with the need for database efficiency and regular maintenance.
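The `thread_id` format and the cleanup rule can be sketched as two small helpers. This is an assumption-laden illustration rather than the `CheckpointerMemoryJobs` code: `make_thread_id` and `stale_thread_ids` are hypothetical names, and the sketch only selects the IDs to remove without touching a database.

```python
from datetime import datetime, timezone
from typing import Iterable, List


def make_thread_id(now: datetime) -> str:
    """Build a thread ID in the thread_YYYYMMDD_HHMMSS format."""
    return now.strftime("thread_%Y%m%d_%H%M%S")


def stale_thread_ids(thread_ids: Iterable[str], today: datetime) -> List[str]:
    """Return the thread IDs that do not contain today's date (YYYYMMDD)."""
    marker = today.strftime("%Y%m%d")
    return [tid for tid in thread_ids if marker not in tid]


now = datetime(2025, 1, 15, 9, 30, 0, tzinfo=timezone.utc)
current = make_thread_id(now)
to_delete = stale_thread_ids(["thread_20250114_235959", current], today=now)
print(current, to_delete)
```

A real job would then remove the matching documents from both checkpointer collections, for example with a pymongo `delete_many({"thread_id": {"$in": to_delete}})` call on each; that step is left out here because it needs a live database.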
The Crypto Assistant ReAct Agent leverages specialized tools to access and analyze cryptocurrency data from various sources. These tools are designed specifically for crypto portfolio management and maintain clear boundaries between portfolio-specific analysis and general cryptocurrency information.
- Portfolio Allocation Tool (`get_portfolio_allocation_tool`)
  - Purpose: Shows the precise distribution of cryptocurrency investments across different digital assets.
  - Data Source: `crypto_portfolio_allocation` collection containing the current portfolio breakdown.
  - Key Features:
    - Returns crypto symbols, asset descriptions, and allocation percentages.
    - Identifies asset types (Cryptocurrency vs Stablecoin).
    - Provides clear visibility into current cryptocurrency investment distribution.
    - Should be called FIRST for any portfolio-related question.
- Crypto Analysis Reports Vector Search Tool (`crypto_analysis_reports_vector_search_tool`)
  - Purpose: Retrieves relevant technical analysis and market insights specifically for cryptocurrency assets in the current portfolio.
  - Data Source: `reports_crypto_analysis` collection with vector embeddings generated by the voyage-finance-2 model.
  - Key Features:
    - Provides crypto trends and momentum analysis for portfolio assets.
    - Returns technical indicators (RSI, moving averages, etc.) for cryptocurrencies.
    - Combines vector similarity with recency scoring when ranking results.
    - Offers overall crypto portfolio diagnosis and asset-specific insights.
    - Falls back to the most recent report if no semantic matches are found.
- Crypto News Reports Vector Search Tool (`crypto_news_reports_vector_search_tool`)
  - Purpose: Provides recent news summaries and sentiment analysis specifically for cryptocurrency assets in the current portfolio.
  - Data Source: `reports_crypto_news` collection with vector embeddings.
  - Key Features:
    - Returns crypto news summaries with sentiment categorization (positive/neutral/negative).
    - Includes overall news impact diagnosis for the crypto portfolio.
    - Prioritizes the most relevant and recent news for portfolio crypto assets.
    - Focuses on media coverage impact on cryptocurrency holdings.
- Crypto Social Media Reports Vector Search Tool (`crypto_social_media_reports_vector_search_tool`)
  - Purpose: Analyzes social media sentiment from Reddit, Twitter, and other platforms for cryptocurrency assets in the portfolio.
  - Data Source: `reports_crypto_sm` collection with vector embeddings.
  - Key Features:
    - Returns social media sentiment analysis for crypto portfolio assets.
    - Provides community perception and discussions about portfolio holdings.
    - Captures Reddit, Twitter, and other social platform sentiment.
    - Essential for crypto markets where social sentiment significantly drives prices.
    - Should be used together with news reports for complete sentiment analysis.
- Portfolio YTD Return Tool (`get_portfolio_ytd_return_tool`)
  - Purpose: Offers quantitative measurement of cryptocurrency portfolio performance since the beginning of the current year.
  - Data Source: `portfolio_performance` collection tracking historical crypto returns.
  - Key Features:
    - Calculates the difference between start-of-year and current cumulative returns.
    - Provides contextual information, including starting and ending dates.
    - Returns a formatted percentage representation of crypto portfolio performance.
- Tavily Search Tool (`tavily_search_tool`)
  - Purpose: Supplements portfolio-specific tools with broader cryptocurrency data and news.
  - Data Source: Web search results via the Tavily API.
  - Key Features:
    - Returns up to 3 relevant search results with citations.
    - Used for questions about cryptocurrencies not in the portfolio or general crypto concepts.
    - Provides real-time information beyond the scope of the portfolio data.
    - Covers broader cryptocurrency market trends and developments.
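The calculation behind the Portfolio YTD Return Tool, the difference between the current and start-of-year cumulative returns, can be sketched as follows. The record layout (`date` and `cumulative_return_pct` fields) and the function name are assumptions made for this example; the actual schema of the `portfolio_performance` collection may differ.

```python
from datetime import date
from typing import Any, Dict, List


def ytd_return(records: List[Dict[str, Any]], today: date) -> Dict[str, str]:
    """Difference between the latest and the first cumulative return this year."""
    this_year = sorted(
        (r for r in records if r["date"].year == today.year and r["date"] <= today),
        key=lambda r: r["date"],
    )
    if not this_year:
        raise ValueError("No performance records for the current year")
    start, end = this_year[0], this_year[-1]
    diff = end["cumulative_return_pct"] - start["cumulative_return_pct"]
    return {
        "start_date": start["date"].isoformat(),
        "end_date": end["date"].isoformat(),
        "ytd_return": f"{diff:.2f}%",
    }


# Hypothetical sample records; the 2024 entry is ignored by the filter.
sample = [
    {"date": date(2024, 12, 31), "cumulative_return_pct": 1.0},
    {"date": date(2025, 1, 2), "cumulative_return_pct": 2.0},
    {"date": date(2025, 6, 30), "cumulative_return_pct": 9.5},
]
result = ytd_return(sample, today=date(2025, 6, 30))
print(result)
```

The sketch subtracts cumulative returns directly, as the tool description states; it does not compound returns.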
- Document Model: MongoDB's document model stores data in JSON-like BSON format, which aligns perfectly with the agent state representation in LangGraph workflows. This enables near-seamless transitions between application objects and database storage with minimal serialization overhead.
- Agent Profiles Storage: MongoDB efficiently stores and retrieves different agent profiles with distinct roles, instructions, and rules. The document model accommodates these nested, semi-structured configurations without requiring rigid schemas or complex joins.
- MongoDB Checkpointer for Long-Term Memory: MongoDB Checkpointer is utilized to achieve long-term memory by storing the agent's state at every step of its interactions. This ensures that the agent can recall past observations and actions, enabling it to build a comprehensive understanding of user needs over time. The checkpointer integrates seamlessly with LangGraph workflows, providing a robust mechanism for persistence.
- Vector Search for Semantic Understanding: MongoDB Atlas Vector Search enables the agent to perform semantic searches across cryptocurrency data, including technical analysis, news reports, and social media sentiment. This capability ensures that the agent retrieves the most relevant and contextually accurate information for crypto-related user queries.
- Schema Flexibility for Evolving Workflows: MongoDB's schema flexibility allows the agent's workflows to evolve without requiring disruptive schema migrations. This adaptability is crucial for rapidly iterating on AI-driven solutions in the dynamic and fast-moving cryptocurrency markets.
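To make the agent profile point concrete, the sketch below shows what a profile document might look like and how it could be turned into a system prompt. Only the `role`, `instructions`, and `rules` fields come from the description above; the `agent_id` field, the sample values, and the `render_system_prompt` helper are hypothetical.

```python
from typing import Any, Dict

# Hypothetical profile document, shaped like a MongoDB document.
profile: Dict[str, Any] = {
    "agent_id": "CRYPTO_AGENT",  # assumed identifier field
    "role": "Crypto Assistant",
    "instructions": "Answer questions about the user's crypto portfolio.",
    "rules": [
        "Call the portfolio allocation tool first for portfolio questions.",
        "Cite the tool that produced each figure.",
    ],
}


def render_system_prompt(doc: Dict[str, Any]) -> str:
    """Flatten a profile document into a plain-text system prompt."""
    rules = "\n".join(f"- {rule}" for rule in doc.get("rules", []))
    return (
        f"Role: {doc['role']}\n"
        f"Instructions: {doc['instructions']}\n"
        f"Rules:\n{rules}"
    )


prompt = render_system_prompt(profile)
print(prompt)
```

Because the profile is an ordinary document, adding a field or another rule changes the agent's behavior without a schema migration.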
- Easy: MongoDB's document model naturally fits with object-oriented programming, utilizing BSON documents that closely resemble JSON. This design simplifies the management of complex data structures such as user accounts, allowing developers to build features like account creation, retrieval, and updates with greater ease.
- Fast: Following the principle of "Data that is accessed together should be stored together," MongoDB enhances query performance. This approach ensures that related data, such as user and account information, can be quickly retrieved, optimizing the speed of operations such as account look-ups or status checks, which is crucial in services demanding real-time access to operational data.
- Flexible: MongoDB's schema flexibility allows account models to evolve with changing business requirements. This adaptability lets financial services update account structures or add features without expensive and disruptive schema migrations, thus avoiding costly downtime often associated with structural changes.
- Versatile: The document model in MongoDB effectively handles a wide variety of data types, such as strings, numbers, booleans, arrays, objects, and even vectors. This versatility empowers applications to manage diverse account-related data, facilitating comprehensive solutions that integrate user, account, and transactional data seamlessly.
- MongoDB Atlas for database storage, vector search, and agent state management.
- LangGraph for orchestrating multi-step agent workflows and implementing the ReAct pattern.
- LangChain MongoDB and LangGraph Checkpoint MongoDB for seamless MongoDB integration and agent memory persistence.
- VoyageAI for domain-specific financial embeddings.
- AWS Bedrock/Anthropic for LLM inference in agent reasoning steps.
- FastAPI for RESTful API endpoints and documentation.
- Poetry for dependency management and packaging.
- Uvicorn for ASGI server implementation.
- Docker for containerization.
- pymongo for MongoDB connectivity and operations.
- langchain-mongodb for MongoDB integration with LangChain and LangGraph.
- langgraph-checkpoint-mongodb for agent state persistence in MongoDB.
- langgraph for building agent workflow graphs.
- langchain-tavily for integrating Tavily search capabilities.
- boto3 and botocore for AWS Bedrock API integration.
- langchain-aws for AWS-related LangChain integrations.
- voyageai for generating finance-specific embeddings.
- scheduler for cleaning checkpointer collections (memories) on a daily basis.
- python-dotenv for environment variable management.
- pytz for timezone handling in scheduled reports.
- fastapi for API development.
- uvicorn for running the ASGI server.
- voyage-finance-2 for generating embeddings for crypto analysis, news, and social media reports.
Before you begin, ensure you have met the following requirements:
- MongoDB Atlas account - Register Here
- Python 3.10 or higher
- Poetry (install via Poetry's official documentation)
- Log in to MongoDB Atlas and create a database named `agentic_capital_markets`. Ensure the name is reflected in the environment variables.
- Create the following collections if they do not already exist:
  - `agent_profiles` (for storing agent profiles)
  - `reports_crypto_analysis` (for storing cryptocurrency technical analysis reports)
  - `reports_crypto_news` (for storing cryptocurrency news reports with sentiment)
  - `reports_crypto_sm` (for storing cryptocurrency social media sentiment reports)
  - `portfolio_performance` (for storing portfolio performance data)
  - `crypto_portfolio_allocation` (for storing cryptocurrency portfolio allocation data)
  - `risk_profiles` (for storing risk profile data)
  - `crypto_checkpoints_aio` (for storing agent state checkpoints)
  - `crypto_checkpoint_writes_aio` (for storing agent state checkpoint writes)
- Create a vector search index for each of the `reports_crypto_analysis`, `reports_crypto_news`, and `reports_crypto_sm` collections. You can do this using the MongoDB Atlas UI or by running the Python script `vector_search_index_creator.py`, located in the `backend/agent/db/` directory. Make sure to parametrize the script accordingly.
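For orientation, an Atlas Vector Search index definition for these collections might look like the sketch below. It is not the content of `vector_search_index_creator.py`: the helper name is hypothetical, `cosine` similarity is an assumption, and the 1024 dimensions should be checked against the voyage-finance-2 documentation. The index names are the ones used in the environment variables of this README.

```python
from typing import Any, Dict

ALLOWED_SIMILARITIES = {"cosine", "euclidean", "dotProduct"}

# Collection name -> index name, as listed in the .env example.
INDEX_NAMES: Dict[str, str] = {
    "reports_crypto_analysis": "reports_crypto_analysis_report_embedding_index",
    "reports_crypto_news": "reports_crypto_news_report_embedding_index",
    "reports_crypto_sm": "reports_crypto_sm_report_embedding_index",
}


def vector_index_definition(
    path: str = "report_embedding",
    num_dimensions: int = 1024,  # assumed embedding size; verify for your model
    similarity: str = "cosine",  # assumed similarity function
) -> Dict[str, Any]:
    """Build the `definition` body of an Atlas Vector Search index."""
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"Unsupported similarity: {similarity}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


definition = vector_index_definition()
for collection_name, index_name in INDEX_NAMES.items():
    print(collection_name, index_name, definition)
```

The same JSON body can be pasted into the Atlas UI's JSON editor, or wrapped in pymongo's `SearchIndexModel` with `type="vectorSearch"` and passed to `collection.create_search_index`.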
Follow MongoDB's guide to create a user with readWrite access to the `agentic_capital_markets` database.

Important: Create a `.env` file in the `/backend` directory with the following content:
MONGODB_URI="your_mongodb_uri"
DATABASE_NAME="agentic_capital_markets"
APP_NAME="ist.demo.capital_markets.react.agent.crypto"
VOYAGE_API_KEY="your_voyage_api_key"
TAVILY_API_KEY="your_tavily_api_key"
AWS_REGION="us-east-1"
CHAT_COMPLETIONS_MODEL_ID="anthropic.claude-3-haiku-20240307-v1:0"
EMBEDDINGS_MODEL_ID="voyage-finance-2"
AGENT_PROFILES_COLLECTION="agent_profiles"
CHECKPOINTS_AIO_COLLECTION="crypto_checkpoints_aio"
CHECKPOINTS_WRITES_AIO_COLLECTION="crypto_checkpoint_writes_aio"
REPORTS_COLLECTION_CRYPTO_ANALYSIS="reports_crypto_analysis"
REPORT_CRYPTO_ANALYSIS_VECTOR_INDEX_NAME="reports_crypto_analysis_report_embedding_index"
REPORTS_COLLECTION_CRYPTO_NEWS="reports_crypto_news"
REPORT_CRYPTO_NEWS_VECTOR_INDEX_NAME="reports_crypto_news_report_embedding_index"
REPORTS_COLLECTION_CRYPTO_SM="reports_crypto_sm"
REPORT_CRYPTO_SM_VECTOR_INDEX_NAME="reports_crypto_sm_report_embedding_index"
REPORT_VECTOR_FIELD="report_embedding"
PORTFOLIO_PERFORMANCE_COLLECTION="portfolio_performance"
CRYPTO_PORTFOLIO_ALLOCATION_COLLECTION="crypto_portfolio_allocation"
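
A small check like the one below can catch a missing variable before the service starts. It is a sketch, not the repository's configuration code: the choice of which keys count as required and the `load_settings` name are assumptions. It reads from any mapping, so it works with `os.environ` once the `.env` file has been loaded (for example by python-dotenv).

```python
import os
from typing import Dict, Mapping

# Assumed minimal set of required variables for this illustration.
REQUIRED_KEYS = ("MONGODB_URI", "DATABASE_NAME", "VOYAGE_API_KEY", "TAVILY_API_KEY")


def load_settings(env: Mapping[str, str]) -> Dict[str, str]:
    """Validate required variables and apply a default for the vector field."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    settings = {key: env[key] for key in REQUIRED_KEYS}
    settings["REPORT_VECTOR_FIELD"] = env.get("REPORT_VECTOR_FIELD", "report_embedding")
    return settings


# Demonstration with an in-memory mapping instead of the real environment.
example_env = {
    "MONGODB_URI": "your_mongodb_uri",
    "DATABASE_NAME": "agentic_capital_markets",
    "VOYAGE_API_KEY": "your_voyage_api_key",
    "TAVILY_API_KEY": "your_tavily_api_key",
}
settings = load_settings(example_env)
print(settings["DATABASE_NAME"])

# In the service itself, one would call load_settings(os.environ).
```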
- Open a terminal in the project root directory.
- Run the following commands:
  make poetry_start
  make poetry_install
- Verify that the `.venv` folder has been generated within the `/backend` directory.
To start the backend service, run:
poetry run uvicorn main:app --host 0.0.0.0 --port 8007
The default port is `8007`; modify the `--port` flag if needed.
Run the following command in the root directory:
make build
To remove the container and image:
make clean
You can access the API documentation by visiting the following URL:
http://localhost:<PORT_NUMBER>/docs
E.g. http://localhost:8007/docs
Note: Make sure to replace `<PORT_NUMBER>` with the port number you are using and ensure the backend is running.
Important: Check that you've created a `.env` file that contains the required environment variables.
This project is for educational and demonstration purposes.