Comprehensive Financial Services Platform with Dual-Backend Microservices Architecture
In today's rapidly evolving financial landscape, detecting fraudulent transactions quickly and accurately while maintaining robust AML/KYC compliance is crucial. Financial institutions of all sizes struggle to balance customer experience with comprehensive fraud protection and regulatory compliance.
ThreatSight 360 addresses these challenges with real-time risk assessment, AI-powered pattern recognition, intelligent entity resolution, and comprehensive compliance operations.
By the end of this guide, you'll have a fraud detection and AML/KYC compliance system up and running, capable of:
- Real-time Fraud Detection: Multi-factor risk assessment with AI-powered pattern recognition
- Intelligent Entity Resolution: AI-powered fuzzy matching and duplicate detection for AML/KYC compliance
- LLM-Powered Classification: AWS Bedrock Claude-3 Sonnet for automated entity risk assessment
- Automated Case Investigation: AI-generated investigation reports and case documentation
- Network Analysis: Relationship mapping and graph analytics for compliance investigations
- Vector-based Pattern Recognition: Advanced similarity matching using MongoDB Atlas Vector Search
- Dynamic Risk Model Management: Configurable risk models with real-time updates
We will walk you through the process of configuring and using MongoDB Atlas as your backend with AWS Bedrock for AI-powered risk assessment and entity resolution in your Next.js and FastAPI application.
ThreatSight 360 uses a dual-backend microservices architecture:
- Risk Model Management: Multi-factor risk evaluation with risk models configurable in real-time with MongoDB Atlas Change Streams.
- Transaction Screening: Real-time transaction screening using MongoDB Atlas Vector Search
- Entity Management: Comprehensive individual and organization entity management, with a Customer 360 view enabled by the MongoDB document model
- Intelligent Entity Resolution: MongoDB Atlas Search fuzzy matching and duplicate detection
- LLM Classification Service: AWS Bedrock Claude-3 Sonnet for entity risk assessment
- Investigation Service: Automated case investigation and report generation
- Network Analysis: Relationship and transaction graph traversal analytics using MongoDB $graphLookup
- Atlas Search Integration: Advanced search capabilities with faceted filtering and autocomplete
- Transaction Simulator: Interactive fraud scenario testing
- Risk Model Management: Dynamic risk model configuration interface and MongoDB Change Streams for live risk model synchronization
- Entity Management Dashboard: Advanced entity 360 with relationship and transaction network visualization
- Intelligent Entity Resolution Workflow: A multi-step entity onboarding workflow combining MongoDB full-text, vector, and hybrid search with $rankFusion; network traversal using $graphLookup; and risk classification and case generation using LLMs
ThreatSight 360 employs a microservices architecture with four key components working in tandem:
The fraud detection flow demonstrates:
- Transaction Simulator: Generates simulated transactions for testing fraud scenarios
- Fraud Detection Engine: Processes transactions in real-time with customer profile lookup, rule application, vector search, and embedding generation via AWS Bedrock
- Risk Model Engine: Manages dynamic risk models with versioning and real-time updates via MongoDB Change Streams
- MongoDB Collections: Stores Customer 360 profiles, transactions, and risk models
The AML/entity resolution flow showcases:
- Entity Management: Handles entity onboarding and exploration
- Entity Resolution Engine: Performs embedding generation via AWS Bedrock, finds similar/duplicate entities using Vector Search or Hybrid Search, and analyzes transaction and relationship networks
- Advanced Search Capabilities: Leverages MongoDB's Graph Traversal for relationship analysis and AI Risk Classification via Claude Sonnet 4
- MongoDB Collections: Manages entities, transactions, and relationships data
Let's get started!
Before you begin working with this project, ensure that you have the following prerequisites set up in your development environment:
- Python 3.10+: Both backend services are built with Python. You can download it from the official website.
- Node.js 18+: The frontend requires Node.js 18 or higher, which includes npm for package management. You can download it from the official Node.js website.
- Poetry: Both backend services use Poetry for dependency management. Install it by following the instructions on the Poetry website.
- MongoDB Atlas Account: This project uses MongoDB Atlas for data storage, Atlas Search, and vector search capabilities. If you don't have an account, you can sign up for free at MongoDB Atlas. Once you have an account, follow these steps to set up an M10 tier cluster:
  - Log in to your MongoDB Atlas account.
  - Create a new project or use an existing one, and then click "create a new database".
  - Choose the M10 tier option.
  - Configure the cluster settings according to your preferences and then click "finish and close" on the bottom right.
  - Finally, add your IP to the network access list so you can access your cluster remotely.
- AWS Account with Bedrock Access: You'll need an AWS account with access to the Bedrock service for AI foundation models used in both fraud detection and entity resolution. Visit the AWS Console to set up an account and request access to Bedrock.
- Docker (Optional): For containerized deployment, Docker is required. Install it from the Docker website.
The fastest way to get ThreatSight 360 up and running:
# Clone the repository
git clone <repository-url>
cd fsi-aml-fraud-detection
# Install Poetry (if not already installed)
curl -sSL https://install.python-poetry.org | python3 -
# Setup all components
# Backend (Fraud Detection)
cd backend && poetry install && cd ..
# AML Backend
cd aml-backend && poetry install && cd ..
# Frontend
cd frontend && npm install && cd ..
# Start all services in development mode (in separate terminals)
# Terminal 1: Fraud Detection Backend
cd backend
poetry run uvicorn main:app --reload --port 8000
# Terminal 2: AML/KYC Backend
cd aml-backend
poetry run uvicorn main:app --reload --port 8001
# Terminal 3: Frontend
cd frontend
npm run dev
This will start:
- Fraud Detection Backend at http://localhost:8000
- AML/KYC Backend at http://localhost:8001
- Frontend Application at http://localhost:3000
For detailed configuration, continue with the sections below.
Once the MongoDB Atlas cluster is set up, locate your newly created cluster, click the "Connect" button and select the "Connect your application" section. Copy the provided connection string. It should look something like this:
mongodb+srv://<username>:<password>@cluster-name.xxxxx.mongodb.net/
Note
You will need the connection string to set up your environment variables later (MONGODB_URI).
- Log in to your AWS Management Console.
- Navigate to the Bedrock service or search for "Bedrock" in the AWS search bar.
- Follow the prompts to request access to the Bedrock service if you haven't already.
- Once access is granted, create an IAM user with programmatic access and appropriate permissions for Bedrock.
- Save the AWS Access Key ID and Secret Access Key for later use in your environment variables.
Important
Keep your AWS credentials secure and never commit them to version control.
Now it's time to clone the ThreatSight 360 source code from GitHub to your local machine:
- Open your terminal or command prompt.
- Navigate to your preferred directory where you want to store the project using the cd command. For example:
  cd /path/to/your/desired/directory
- Once you're in the desired directory, use the git clone command to clone the repository:
  git clone <repository-url>
- After running the git clone command, a new directory with the repository's name will be created in your chosen directory. To navigate into the cloned repository, use the cd command:
  cd fsi-aml-fraud-detection
ThreatSight 360 leverages MongoDB Atlas Vector Search for advanced fraud pattern recognition and entity similarity matching. Follow these steps to enable it:
- Navigate to your MongoDB Atlas dashboard and select your cluster.
- Click on the "Search" tab located in the top navigation menu.
- Click "Create Search Index".
- Choose the JSON editor and click "Next".
- Name your index "transaction_vector_index".
- Select your database and the "transactions" collection.
- For the index definition, paste the following JSON:
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "vector_embedding": {
        "type": "knnVector",
        "dimensions": 1536,
        "similarity": "cosine"
      }
    }
  }
}
- Create another Atlas Search index named "entity_resolution_search".
- Select the "entities" collection.
- Use the following comprehensive index definition for entity resolution:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "name": {
        "type": "document",
        "fields": {
          "full": [
            {
              "type": "autocomplete",
              "analyzer": "lucene.standard",
              "tokenization": "edgeGram",
              "minGrams": 2,
              "maxGrams": 15
            },
            {
              "type": "string"
            }
          ],
          "aliases": {
            "type": "string"
          }
        }
      },
      "entityType": {
        "type": "stringFacet"
      },
      "nationality": {
        "type": "stringFacet"
      },
      "residency": {
        "type": "stringFacet"
      },
      "jurisdictionOfIncorporation": {
        "type": "stringFacet"
      },
      "riskAssessment": {
        "type": "document",
        "fields": {
          "overall": {
            "type": "document",
            "fields": {
              "level": {
                "type": "stringFacet"
              },
              "score": {
                "type": "numberFacet"
              }
            }
          }
        }
      },
      "customerInfo": {
        "type": "document",
        "fields": {
          "businessType": {
            "type": "stringFacet"
          }
        }
      }
    }
  }
}
For enhanced entity text matching, create an Atlas Search index named "entity_text_search_index":
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "name": {
        "type": "document",
        "fields": {
          "full": { "type": "string" },
          "aliases": { "type": "string" }
        }
      },
      "addresses": {
        "type": "document",
        "fields": {
          "full": { "type": "string" }
        }
      },
      "entityType": { "type": "string" },
      "identifiers": {
        "type": "document",
        "fields": {
          "value": { "type": "string" }
        }
      }
    }
  }
}
For semantic entity matching, create a vector search index named "entity_vector_search_index":
{
  "type": "vectorSearch",
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}
Note
The index names must match exactly for the application to work properly.
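If you prefer to script index creation rather than click through the Atlas UI, recent pymongo versions expose create_search_index. The sketch below shows the idea for the two vector indexes defined above; it assumes pymongo 4.6+ and is not the project's official setup path.

```python
# Sketch: create the vector indexes from code instead of the Atlas UI.
# Assumption: pymongo >= 4.6 and a valid Atlas connection; the index
# definitions are copied verbatim from this guide.

TRANSACTION_VECTOR_INDEX = {
    "mappings": {
        "dynamic": True,
        "fields": {
            "vector_embedding": {
                "type": "knnVector",
                "dimensions": 1536,
                "similarity": "cosine",
            }
        }
    }
}

ENTITY_VECTOR_INDEX = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",
            "numDimensions": 1536,
            "similarity": "cosine",
        }
    ]
}

def create_indexes(db):
    """Create both indexes; `db` is a pymongo Database object."""
    from pymongo.operations import SearchIndexModel

    db["transactions"].create_search_index(
        SearchIndexModel(definition=TRANSACTION_VECTOR_INDEX,
                         name="transaction_vector_index")
    )
    # Vector Search indexes are created with type="vectorSearch".
    db["entities"].create_search_index(
        SearchIndexModel(definition=ENTITY_VECTOR_INDEX,
                         name="entity_vector_search_index",
                         type="vectorSearch")
    )
```

Index builds are asynchronous; wait for the index status to become ACTIVE in the Atlas UI (or via list_search_indexes) before querying.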
For real-time updates in your application, you'll need to enable change streams in MongoDB Atlas:
- Navigate to your MongoDB Atlas dashboard and select your cluster.
- Go to "Database Access" in the left sidebar.
- Ensure that your database user has the "readWrite" and "dbAdmin" roles for the database you'll be using.
- For production environments, consider creating a dedicated user with specific privileges for change streams.
Important
Change streams require a replica set, which is automatically provided by MongoDB Atlas, even in the free tier.
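Conceptually, consuming risk-model updates through a change stream looks like the sketch below. The collection name risk_models and the callback shape are assumptions for illustration; the actual backend wiring may differ.

```python
# Sketch: react to live risk-model writes via a MongoDB change stream.
# Assumption: a "risk_models" collection as described in this guide;
# `collection` is a standard pymongo Collection.

# Only react to writes that change a model document.
RISK_MODEL_CHANGE_PIPELINE = [
    {"$match": {"operationType": {"$in": ["insert", "update", "replace"]}}}
]

def watch_risk_models(collection, on_change):
    """Block and call on_change(document) for every model write.

    full_document="updateLookup" makes update events carry the full
    post-image of the document, not just the changed fields.
    """
    with collection.watch(RISK_MODEL_CHANGE_PIPELINE,
                          full_document="updateLookup") as stream:
        for change in stream:
            on_change(change.get("fullDocument"))

# Usage (requires a live Atlas connection):
# watch_risk_models(client["fsi-threatsight360"]["risk_models"], print)
```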
Navigate to the backend directory and create environment configuration:
cd backend
Create a .env file with the following configuration settings:
# MongoDB Connection
MONGODB_URI=mongodb+srv://<username>:<password>@cluster-name.xxxxx.mongodb.net/
DB_NAME=fsi-threatsight360
# AWS Bedrock Credentials
AWS_ACCESS_KEY_ID=your_aws_access_key_here
AWS_SECRET_ACCESS_KEY=your_aws_secret_key_here
AWS_REGION=us-east-1
# Server Configuration
HOST=0.0.0.0
PORT=8000
# Frontend URL for CORS
FRONTEND_URL=http://localhost:3000
# Risk Assessment Thresholds
AMOUNT_THRESHOLD_MULTIPLIER=2.5
MAX_LOCATION_DISTANCE_KM=100
VELOCITY_TIME_WINDOW_MINUTES=10
VELOCITY_THRESHOLD=5
SIMILARITY_THRESHOLD=0.8
# Risk Weights
WEIGHT_AMOUNT=0.25
WEIGHT_LOCATION=0.25
WEIGHT_DEVICE=0.20
WEIGHT_VELOCITY=0.15
WEIGHT_PATTERN=0.15
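Note that the five WEIGHT_* values above sum to 1.0, which suggests a weighted-sum composite score. The sketch below illustrates how such a score could be combined; the factor names mirror the env vars, but the backend's actual scoring logic may be more involved.

```python
# Sketch: combine per-factor risk scores (each in [0, 1]) into one
# composite score using the WEIGHT_* values from the .env file above.
# Illustrative only — the real engine may weigh factors differently.

WEIGHTS = {
    "amount": 0.25,
    "location": 0.25,
    "device": 0.20,
    "velocity": 0.15,
    "pattern": 0.15,
}

def composite_risk_score(factor_scores: dict) -> float:
    """Weighted sum of factor scores; missing factors count as 0."""
    return sum(WEIGHTS[name] * factor_scores.get(name, 0.0)
               for name in WEIGHTS)

# Example: unusual amount and a brand-new device, everything else normal.
score = composite_risk_score(
    {"amount": 0.8, "location": 0.2, "device": 1.0, "pattern": 0.4}
)
print(round(score, 2))  # 0.51
```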
Install dependencies and start the server:
# Install dependencies
cd backend
poetry install
# Start development server
poetry run uvicorn main:app --host 0.0.0.0 --port 8000 --reload
Note
For detailed backend configuration and API documentation, see backend/README.md
Navigate to the aml-backend directory and create environment configuration:
cd aml-backend
Create a .env file with the following configuration settings:
# MongoDB Connection
MONGODB_URI=mongodb+srv://<username>:<password>@cluster-name.xxxxx.mongodb.net/
DB_NAME=fsi-threatsight360
# AWS Bedrock Credentials (for AI features)
AWS_ACCESS_KEY_ID=your_aws_access_key_here
AWS_SECRET_ACCESS_KEY=your_aws_secret_key_here
AWS_REGION=us-east-1
# Server Configuration
HOST=0.0.0.0
PORT=8001
# Frontend URL for CORS
FRONTEND_URL=http://localhost:3000
# Atlas Search Configuration
ATLAS_SEARCH_INDEX=entity_resolution_search
ATLAS_TEXT_SEARCH_INDEX=entity_text_search_index
ENTITY_VECTOR_INDEX=entity_vector_search_index
# Performance Tuning
ATLAS_SEARCH_TIMEOUT=30000
MAX_SEARCH_RESULTS=1000
VECTOR_SEARCH_LIMIT=100
CONNECTION_POOL_SIZE=50
Install dependencies and start the server:
# Install dependencies
cd aml-backend
poetry install
# Start development server
poetry run uvicorn main:app --host 0.0.0.0 --port 8001 --reload
Note
For detailed AML backend configuration and API documentation, see aml-backend/README.md
Note
Never commit your .env files to version control. Make sure they're included in your .gitignore file.
Navigate to the frontend directory of your project:
cd frontend
Create a .env.local file with the following content:
# API URLs for dual-backend architecture
NEXT_PUBLIC_FRAUD_API_URL=http://localhost:8000
NEXT_PUBLIC_AML_API_URL=http://localhost:8001
# Legacy compatibility (points to fraud backend)
NEXT_PUBLIC_API_URL=http://localhost:8000
Note
The .env.local file will be ignored by Git automatically.
Install the frontend dependencies and start the development server:
# Install dependencies
cd frontend
npm install
# Start development server
npm run dev
Your frontend application should now be running at http://localhost:3000.
To populate your database with initial data for testing and demonstration, we provide comprehensive Jupyter notebooks for synthetic data generation that showcase MongoDB's document model capabilities and advanced features.
The Transaction Synthetic Data Generation notebook demonstrates how MongoDB's document model enables sophisticated fraud detection through dynamic behavioral profiling:
What it generates:
- 50 Customer Profiles with rich behavioral patterns including:
  - Personal and account information
  - Device fingerprints and usual locations (GeoJSON)
  - Transaction behavior patterns (average amounts, merchant categories, usual times)
  - Risk profiles with scoring and flags
- 6 Months of Synthetic Transactions (26,000+ transactions) with:
  - Realistic mix of normal (60%), suspicious (25%), and fraudulent (15%) transactions
  - Location data as GeoJSON for geospatial queries
  - Device information for device fingerprinting
  - Risk assessments with scores and flags
- 5 Fraud Patterns with AWS Bedrock embeddings:
  - Account Takeover
  - Card Testing
  - Transaction Laundering
  - Geographic Anomaly
  - Purchase Anomaly
- MongoDB Indexes including:
  - Standard indexes for query performance
  - Geospatial indexes for location-based fraud detection
  - Atlas Search indexes for text search
  - Vector search indexes for pattern matching
Key Features Demonstrated:
- Rich document model for complex customer profiles
- Nested documents for behavioral patterns
- Geospatial data for location-based fraud detection
- Vector embeddings for similarity-based pattern matching
- AWS Bedrock integration for AI-powered embeddings
# Run the transaction data generation notebook
jupyter notebook "docs/ThreatSight360 - Transaction Synthetic Data Generation.ipynb"
# Or use the backend seeding script
cd backend
poetry run python scripts/seed_data.py
The Entity Resolution Synthetic Data Generation notebook creates comprehensive entity data for AML/KYC compliance testing:
What it generates:
- Individual and Organization Entities with:
- Complete identity information with aliases
- Multiple addresses and identifiers
- Risk assessment scores and levels
- Customer information and watchlist flags
- Entity Relationships for network analysis:
- Corporate relationships (director_of, owner_of, subsidiary_of)
- Household relationships (family members, same address)
- Transactional relationships (frequent_transactor, beneficiary)
- Suspicious patterns with confidence scores
- Embeddings and Search Data:
- AWS Bedrock embeddings for semantic search
- Text search optimized fields
- Network graph data for relationship traversal
Key Features Demonstrated:
- Complex entity resolution scenarios with duplicates
- Multi-hop relationship networks for graph analysis
- Vector embeddings for semantic entity matching
- Risk propagation through relationship networks
# Run the entity data generation notebook
jupyter notebook "docs/ThreatSight 360 - Entity Resolution Synthetic Data Generation.ipynb"
Before running the notebooks, ensure you have:
- Jupyter Notebook or JupyterLab installed:
  pip install notebook  # or pip install jupyterlab
- Required Python packages:
  pip install pymongo pandas faker numpy scikit-learn python-dotenv geojson boto3
- MongoDB Atlas connection string and AWS Bedrock credentials configured in the notebooks
Important
The notebooks use AWS Bedrock to generate embeddings. If Bedrock is not available, the notebooks will fall back to random embeddings for demonstration purposes.
Note
The synthetic data generation creates realistic patterns that closely mirror production scenarios, making it ideal for testing, demonstrations, and training purposes.
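The Bedrock-with-fallback behavior can be sketched as below. The Titan model ID and response shape are assumptions based on common Bedrock usage; check the notebooks for the exact calls they make.

```python
import json
import random

EMBEDDING_DIM = 1536  # matches the vector index definitions in this guide

def get_embedding(text: str) -> list[float]:
    """Return a Bedrock embedding, falling back to a random vector.

    Mirrors the notebooks' documented behavior: if Bedrock (or boto3)
    is unavailable, a random embedding is used for demonstration.
    Model ID and response shape are assumptions — verify against the
    notebooks.
    """
    try:
        import boto3  # may not be installed or configured
        client = boto3.client("bedrock-runtime", region_name="us-east-1")
        response = client.invoke_model(
            modelId="amazon.titan-embed-text-v1",
            body=json.dumps({"inputText": text}),
        )
        return json.loads(response["body"].read())["embedding"]
    except Exception:
        # Fallback: random vector of the right dimensionality.
        return [random.uniform(-1, 1) for _ in range(EMBEDDING_DIM)]

vec = get_embedding("Acme Holdings Ltd, 1 Main Street")
print(len(vec))  # 1536 (via the fallback path if Bedrock is unreachable)
```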
The Transaction Simulator allows you to test and visualize how the fraud detection system responds to different scenarios:
- Navigate to http://localhost:3000/transaction-simulator.
- Select a customer from the dropdown menu.
- Choose a predefined fraud scenario or configure your own:
  - Normal Transaction
  - Unusual Amount
  - Unusual Location
  - New Device
  - Multiple Red Flags
- Customize transaction details if needed:
  - Transaction type (purchase, withdrawal, transfer, deposit)
  - Payment method
  - Amount
  - Merchant category
  - Location information
  - Device information
- Click "Evaluate Transaction" to analyze the risk profile.
- Review the comprehensive risk assessment, including:
  - Traditional Risk Assessment: Rules-based evaluation with fraud pattern detection
  - Advanced Vector Search: AI-powered similarity matching against historical transactions
  - Intelligent Vector Search Analysis: Context-aware risk score calculation featuring:
    - High-Risk Focus: Detailed mathematical breakdowns only shown for concerning patterns
    - Smart Transparency: Step-by-step calculations when high-risk matches are detected
    - Clean Interface: Simplified display for normal/low-risk transactions
    - Algorithm Explanations: Educational content for fraud analysts when needed
  - Context-aware Filtering: Smart prioritization of relevant similar transactions
  - Multi-factor Risk Scoring: Comprehensive risk evaluation with detailed explanations
Note
The simulator is a powerful tool for understanding how the system works and for demonstrating the capabilities to stakeholders.
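Under the hood, the similarity matching against the transaction_vector_index defined earlier can be expressed as an aggregation pipeline. A hedged sketch using the legacy knnBeta operator (which pairs with the knnVector mapping above; the projected field names are assumptions, and newer clusters may use $vectorSearch instead):

```python
# Sketch: find the k historical transactions most similar to a new one.
# `query_vector` would come from the Bedrock embedding of the incoming
# transaction. Pipeline shape is an assumption based on the knnVector
# index defined earlier; projected field names are illustrative.

def similar_transactions_pipeline(query_vector, k=5):
    return [
        {
            "$search": {
                "index": "transaction_vector_index",
                "knnBeta": {
                    "vector": query_vector,
                    "path": "vector_embedding",
                    "k": k,
                },
            }
        },
        # Surface the relevance score alongside a few useful fields.
        {
            "$project": {
                "amount": 1,
                "merchant_category": 1,
                "risk_assessment": 1,
                "score": {"$meta": "searchScore"},
            }
        },
    ]

# Usage (requires a live Atlas connection and populated data):
# results = db["transactions"].aggregate(similar_transactions_pipeline(vec))
```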
The Entity Management interface provides comprehensive AML/KYC capabilities:
- Navigate to http://localhost:3000/entities.
- Key capabilities include:
  - Advanced Search: Multi-strategy search with Atlas Search, autocomplete, and faceted filtering
  - Entity Resolution: AI-powered fuzzy matching and duplicate detection with vector search during onboarding
  - Network Visualization: Interactive relationship graphs using Cytoscape.js
- Search and filter entities using:
  - Name-based fuzzy search with autocomplete
  - Entity type filters (Individual, Organization)
  - Risk level filters (Low, Medium, High, Critical)
  - Geographic filters (Country, City, Nationality, Residency)
  - Business type and jurisdiction filters
- Click on any entity to view:
  - Detailed entity information and identifiers
  - Risk assessment details and watchlist matches
  - Relationship + transaction network visualization
  - Similar entities and potential duplicates
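The relationship network views are built on MongoDB's $graphLookup stage. A sketch of a depth-2 traversal is shown below; the relationships collection and its source/target field names are assumptions, so adapt them to the actual schema.

```python
# Sketch: depth-2 relationship traversal with $graphLookup, as used for
# the network visualization. Collection and edge field names
# ("relationships", "source", "target") are assumptions.

def relationship_network_pipeline(entity_id, max_depth=2):
    return [
        {"$match": {"entityId": entity_id}},
        {
            "$graphLookup": {
                "from": "relationships",
                "startWith": "$entityId",
                "connectFromField": "target",  # follow edge targets...
                "connectToField": "source",    # ...to matching sources
                "as": "network",
                # $graphLookup's maxDepth counts recursions: 0 means one
                # hop, so a two-hop ("depth-2") traversal uses maxDepth 1.
                "maxDepth": max_depth - 1,
                "depthField": "hop",
            }
        },
    ]

# Usage (requires a live Atlas connection):
# db["entities"].aggregate(relationship_network_pipeline("E-12345"))
```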
The Enhanced Entity Resolution feature provides a comprehensive 5-step workflow for intelligent entity onboarding, duplicate detection, and risk assessment:
- Navigate to http://localhost:3000/entity-resolution/enhanced.
- Step 0 - Entity Input: Enter new entity information using the simplified onboarding form:
  - Entity Type (Individual or Organization)
  - Full Name
  - Address
- Step 1 - Parallel Search: The system performs AI-powered search using three methods simultaneously:
  - Atlas Search: Text-based fuzzy matching on names and addresses
  - Vector Search: Semantic similarity analysis using AWS Bedrock AI embeddings
  - Hybrid Search: MongoDB $rankFusion combining both approaches with contribution analysis
- Step 2 - Network Analysis: Comprehensive network risk assessment for the top 3 hybrid search matches:
  - Relationship Networks: Graph analysis with depth-2 traversal
  - Transaction Networks: Transaction pattern analysis with depth-1 traversal
- Step 3 - AI Classification: LLM-powered entity classification using AWS Bedrock Claude-3 Sonnet:
  - Comprehensive Analysis: Evaluates entity data, search results, and network analysis
  - Risk Assessment: Generates risk scores, confidence levels, and recommended actions
  - AML/KYC Compliance: Identifies compliance flags and concerns
  - Network Positioning: Analyzes the entity's position within relationship networks
- Step 4 - Case Investigation: Automated case document creation for compliance workflows:
  - Case Document Generation: Creates a MongoDB case document
  - LLM Investigation Summary: Professional investigation narrative using AI
  - Workflow Consolidation: Combines all previous steps into a comprehensive case file
  - Report Generation: Exports the case file as a PDF case report
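The hybrid search in Step 1 can be sketched as a $rankFusion aggregation (MongoDB 8.1+) over the text and vector indexes defined earlier. The pipeline names, weights, and field paths below are illustrative assumptions, not the application's exact query:

```python
# Sketch: hybrid entity search with $rankFusion, combining Atlas text
# search and Vector Search over the indexes defined in this guide.
# Weights, limits, and field paths are assumptions.

def hybrid_entity_search_pipeline(name_query, query_vector, limit=10):
    return [
        {
            "$rankFusion": {
                "input": {
                    "pipelines": {
                        "text": [
                            {
                                "$search": {
                                    "index": "entity_text_search_index",
                                    "text": {
                                        "query": name_query,
                                        "path": "name.full",
                                        "fuzzy": {"maxEdits": 2},
                                    },
                                }
                            },
                            {"$limit": 20},
                        ],
                        "vector": [
                            {
                                "$vectorSearch": {
                                    "index": "entity_vector_search_index",
                                    "path": "embedding",
                                    "queryVector": query_vector,
                                    "numCandidates": 100,
                                    "limit": 20,
                                }
                            }
                        ],
                    }
                },
                # Weight both rankings equally; tune per use case.
                "combination": {"weights": {"text": 0.5, "vector": 0.5}},
                "scoreDetails": True,  # per-pipeline contribution analysis
            }
        },
        {"$limit": limit},
    ]

# Usage (requires MongoDB 8.1+ and a live Atlas connection):
# db["entities"].aggregate(hybrid_entity_search_pipeline("Acme", vec))
```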
The Risk Model Management interface allows administrators to configure and deploy different risk assessment models:
- Navigate to http://localhost:3000/risk-models.
- View and select from available risk models in the system.
- Key capabilities include:
  - Dynamic Risk Factor Management: Add or modify risk factors without system changes
  - Real-Time Updates: See changes instantly using MongoDB Change Streams
  - Version Control: Create and manage multiple versions of risk models
  - Model Activation: Easily switch between different models
  - Performance Metrics: Track effectiveness with false positive/negative rates
  - Custom Thresholds: Configure flag and block thresholds for each model
  - Model Reset Functionality: Reset models to a clean state by removing version 2 models and restoring default configurations
- To create a new risk model:
  - Click "Create New Model"
  - Configure basic information (name, description)
  - Add risk factors with appropriate weights and thresholds
  - Set overall model thresholds
  - Save and optionally activate the model
- To reset models to the default state:
  - Click "Reset Models" (located on the far right of the action buttons)
  - This will delete all version 2 models, set default-risk-model to active, and set behavioral-risk-model to inactive
  - Useful for returning to a clean baseline during testing or demos
Important
All changes are reflected in real-time across all connected sessions thanks to MongoDB Change Streams.
For containerized deployment in production environments:
- Ensure Docker and Docker Compose are installed on your system.
- Configure environment variables for production in your .env files.
- Build and run the containers:
  # Build all images
  docker-compose build
  # Start all services
  docker-compose up -d
  # Or build and start in one command
  docker-compose up --build -d
- This will run containers for:
  - Frontend (port 3000)
  - Fraud Detection Backend (port 8000)
  - AML/KYC Backend (port 8001)
- Access the application at http://localhost:3000.
Note
The Docker configuration uses production settings by default. Check the docker-compose.yml file and individual Dockerfiles for details.
Check out the additional resources below:
- MongoDB for Financial Services
- MongoDB Atlas Search Documentation
- MongoDB Atlas Vector Search
- MongoDB Change Streams
- MongoDB $graphLookup
- MongoDB $rankFusion
- MongoDB LeafyGreen UI
- Building Real-time Fraud Detection Systems
- Financial Services Solutions
- Vector Search for Fraud Detection
- Document Model: Rich, nested structures for customer and entity profiles
- Atlas Search: Full-text search with fuzzy matching and autocomplete
- Vector Search: AI-powered similarity matching for fraud patterns and entity resolution
- $graphLookup: Relationship network traversal for compliance investigations
- $rankFusion: Hybrid search combining text and vector search
- Change Streams: Real-time updates for risk models
- Geospatial Queries: Location-based fraud detection
This project is licensed under the MIT License - see the LICENSE file for details.