Skip to content

aws-samples/sample-dat409-hybrid-search-aurora-mcp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

92 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

DAT409 - Hybrid Search with Aurora PostgreSQL for MCP Retrieval

AWS Python PostgreSQL Bedrock License

⚠️ Important Notice: For demonstration and educational purposes only. Not intended for production use.

πŸš€ Quick Start

Workshop Duration: 60 minutes | Lab 1: 25 min | Lab 2: 20 min

Build hybrid search with Aurora PostgreSQL, pgvector, and MCP for natural language database queries.

πŸ“ Repository Structure

β”œβ”€β”€ lab1-hybrid-search/
β”‚   β”œβ”€β”€ notebook/
β”‚   β”‚   └── dat409-hybrid-search-notebook.ipynb  # Main lab notebook
β”‚   β”œβ”€β”€ data/
β”‚   β”‚   └── amazon-products.csv                  # 21,704 products
β”‚   └── requirements.txt                         # Lab 1 dependencies
β”œβ”€β”€ lab2-mcp-agent/
β”‚   β”œβ”€β”€ streamlit_app.py                         # Streamlit demo app
β”‚   β”œβ”€β”€ test_personas.sh                         # RLS testing script
β”‚   β”œβ”€β”€ mcp_config.json                          # MCP configuration
β”‚   └── requirements.txt                         # Lab 2 dependencies
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ bootstrap-code-editor.sh                 # Infrastructure setup
β”‚   β”œβ”€β”€ setup-database.sh                        # Database & data loading
β”‚   └── setup/                                   # Helper scripts
β”œβ”€β”€ cfn/                                         # CloudFormation templates
β”œβ”€β”€ env_example                                  # Environment template
└── README.md                                    # This file

🎯 Labs

Lab 1: Hybrid Search (25 min)

Combine vector similarity (pgvector + HNSW), full-text search, and fuzzy matching (pg_trgm) with 21,704 products.

cd /workshop/lab1-hybrid-search/notebook
# Open dat409-hybrid-search-notebook.ipynb

Lab 2: MCP & RLS (20 min)

Natural language queries with Model Context Protocol and Row-Level Security for multi-tenant access.

cd /workshop/lab2-mcp-agent
./test_personas.sh  # Test customer, support, product manager personas
streamlit run streamlit_app.py  # Optional demo

πŸ› οΈ Prerequisites

  • AWS Account with Aurora PostgreSQL 17.5 + Amazon Bedrock (Cohere Embed English v3)
  • Python 3.13 + Jupyter Notebook

🚦 Getting Started

Participants: Access Code Editor via CloudFront URL β†’ Open Lab 1 notebook in /workshop/lab1-hybrid-search/notebook/

Instructors: See DEPLOYMENT_SEQUENCE.md for setup.

πŸ”§ Technical Stack

  • Database: Aurora PostgreSQL 17.5 with pgvector extension
  • Embeddings: Cohere Embed English v3 (1024 dimensions)
  • Search: HNSW vector index + GIN full-text + pg_trgm fuzzy
  • MCP: Model Context Protocol for database access
  • AI Agent: Strands Agent with Claude Sonnet 4 + MCP tools
  • RLS: Row-Level Security for persona-based access
  • Python: 3.13 with pandas, psycopg, boto3, streamlit

πŸ€– MCP Agent Architecture

The workshop demonstrates natural language database queries using a Strands Agent with MCP tools:

User Query β†’ Strands Agent (Claude Sonnet 4) β†’ MCP Client β†’ Aurora PostgreSQL (Data API)

Key Components:

  • Strands Agent: AI agent framework with tool-calling capabilities
  • MCP Client: Provides standardized database access tools via awslabs.postgres-mcp-server
  • Claude Sonnet 4: Interprets queries and decides which MCP tools to call
  • Aurora Data API: Serverless database access using cluster ARN + secret ARN

Why This Pattern?

  • Agent uses admin access via Data API for intelligent cross-schema queries
  • Application-level authorization handles security (typical production pattern for AI agents)
  • MCP provides standardized, reusable database tools
  • Enables natural language β†’ SQL translation with context awareness

πŸ“š Resources

⭐ Like This Workshop?

If you find this workshop helpful, please consider:

  • ⭐ Star this repository to show your support!
  • 🍴 Fork it to customize for your own use cases
  • πŸ› Report issues to help us improve
  • πŸ’‘ Submit pull requests with enhancements or fixes
  • πŸ“’ Share it with your colleagues and community

Your feedback and contributions help make this workshop better for everyone!

🀝 Contributing

Contributions welcome! Documentation, bug fixes, features, tests, or feedback. See CONTRIBUTING.md.

πŸ“„ License & Security

MIT-0 License. See LICENSE | Security: CONTRIBUTING.md


Β© 2025 Shayon Sanyal, Principal Solutions Architect, AWS

About

DAT409 - Hybrid Search with Aurora PostgreSQL for MCP Retrieval

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •