A Flask-based REST API service for searching and synchronizing data between MongoDB and Elasticsearch. This service provides search capabilities with token-based authentication and role-based access control (RBAC).
- Search Operations: Full-text search and structured query support via Elasticsearch
- Data Synchronization: Sync MongoDB collections to Elasticsearch with admin-only access
- RBAC Security: Role-based access control with admin and user roles
- Token Authentication: Secure token-based authentication with breadcrumb tracing
- Observability: Comprehensive logging and monitoring support
- server.py - Main Flask application entry point
- routes/ - API endpoint handlers
- search_routes.py - Search operations (/api/search)
- sync_routes.py - Synchronization operations (/api/sync)
- services/ - Business logic layer
- search_services.py - Search operations and query processing
- sync_services.py - MongoDB to Elasticsearch synchronization
- utils/ - Utility modules
- elastic_utils.py - Elasticsearch client operations
- mongo_utils.py - MongoDB client operations
- Token-based Authentication: All requests require valid authentication tokens
- Role-based Access Control:
  - admin role: Full access to all operations, including sync
  - user role: Read-only access to search operations
- Breadcrumb Tracing: All operations include request tracing for observability
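The access rules above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the `ROLE_PERMISSIONS` table, the `is_allowed` helper, and the token shape (a dict with a `roles` list) are all assumptions made for the example.

```python
# Hypothetical sketch of the admin/user rules described above.
# Names and the token shape are assumptions, not the service's real API.

ROLE_PERMISSIONS = {
    "admin": {"search", "sync"},  # full access, including sync
    "user": {"search"},           # read-only access to search
}


def is_allowed(token: dict, operation: str) -> bool:
    """Return True if any role on the token grants the operation."""
    roles = token.get("roles", [])
    return any(operation in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

With this table, a token carrying only the `user` role passes the check for `"search"` but not for `"sync"`, and a token with no roles is denied everything.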
- Python 3.12+
- pipenv
- MongoDB instance
- Elasticsearch instance
# Clone the repository
git clone <repository-url>
cd stage0_search_api
# Install dependencies
pipenv install --dev
# Set up environment variables
export MONGO_CONNECTION_STRING="mongodb://localhost:27017/?replicaSet=rs0"
export ELASTIC_CLIENT_OPTIONS='{"hosts":"http://localhost:9200"}'
export LOGGING_LEVEL=INFO
# Run locally
pipenv run local
# Open http://localhost:8083/
# Run the service locally
pipenv run local
# Run with debug logging
pipenv run debug
# Run unit tests with coverage
pipenv run test
# Run black box tests
pipenv run stepci
# Build Docker container
pipenv run build
# Start containerized service stack
pipenv run service
# Start just the database
pipenv run database
# Stop all containers
pipenv run down
# Simple text search (use 'search' parameter, not 'q')
curl -X GET "http://localhost:8083/api/search/?search=test%20query"
# Structured Elasticsearch query
curl -X GET "http://localhost:8083/api/search/?query=%7B%22match%22%3A%7B%22title%22%3A%22test%22%7D%7D"
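The percent-encoded query strings in the two search examples can be produced programmatically. The sketch below uses only the Python standard library; the helper names `text_search_url` and `structured_search_url` are hypothetical, and the base URL is taken from the curl examples.

```python
import json
from urllib.parse import quote

BASE_URL = "http://localhost:8083/api/search/"  # from the curl examples


def text_search_url(text: str) -> str:
    """Build a simple text search URL using the 'search' parameter."""
    return f"{BASE_URL}?search={quote(text, safe='')}"


def structured_search_url(query: dict) -> str:
    """Build a structured query URL; the dict is sent as compact JSON."""
    compact = json.dumps(query, separators=(",", ":"))
    return f"{BASE_URL}?query={quote(compact, safe='')}"
```

Calling `structured_search_url({"match": {"title": "test"}})` yields the same URL as the structured curl example, which avoids hand-encoding the JSON.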
# Sync all collections
curl -X POST "http://localhost:8083/api/sync/"
# Sync a specific collection (uses singular collection names)
curl -X POST "http://localhost:8083/api/sync/bot/"
curl -X POST "http://localhost:8083/api/sync/conversation/"
curl -X POST "http://localhost:8083/api/sync/workshop/"
# Get sync history (limited to 10 entries)
curl -X GET "http://localhost:8083/api/sync/?limit=10"
# Get current sync period
curl -X GET "http://localhost:8083/api/sync/periodicity"
# Set sync period (in seconds)
curl -X PUT "http://localhost:8083/api/sync/" \
-H "Content-Type: application/json" \
-d '{"period_seconds": 300}'
The API synchronizes the following MongoDB collections to Elasticsearch:
- bot - Bot configurations and personalities
- chain - Workflow chains
- conversation - Conversation data and chat history
- execution - Execution tracking data
- exercise - Design thinking exercises
- runbook - Operational runbooks
- template - Template definitions
- user - User data
- workshop - Workshop configurations
tests/
├── routes/ # Route endpoint tests
├── services/ # Service layer tests
├── stepci/ # Black box API tests
└── test_data/ # Test fixtures and data
# Run all tests with coverage
pipenv run test
# Run specific test file
python -m pytest tests/routes/test_search_routes.py -v
# Run black box tests
pipenv run stepci
The service uses the stage0_py_utils configuration system. Key configuration items:
- MONGO_CONNECTION_STRING - MongoDB connection string
- ELASTIC_CLIENT_OPTIONS - Elasticsearch client configuration
- ELASTIC_SEARCH_INDEX - Default search index name
- ELASTIC_SYNC_INDEX - Sync history index name
- SEARCH_API_PORT - API server port (default: 8083)
- LOGGING_LEVEL - Logging verbosity
- MONGO_COLLECTION_NAMES - List of collections to sync (managed by stage0_py_utils)
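In the service itself, stage0_py_utils handles configuration, and its API is not shown here. The sketch below only illustrates how a few of these variables could be read with the standard library. The `SearchConfig` class and `load_config` function are hypothetical, and every fallback value except the documented port 8083 is an assumption.

```python
import json
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchConfig:
    mongo_connection_string: str
    elastic_client_options: dict
    search_api_port: int
    logging_level: str


def load_config(env=None) -> SearchConfig:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return SearchConfig(
        # Fallbacks are illustrative assumptions, except the port default.
        mongo_connection_string=env.get(
            "MONGO_CONNECTION_STRING", "mongodb://localhost:27017/"
        ),
        # ELASTIC_CLIENT_OPTIONS holds a JSON object, so it must be parsed.
        elastic_client_options=json.loads(
            env.get("ELASTIC_CLIENT_OPTIONS", '{"hosts": "http://localhost:9200"}')
        ),
        search_api_port=int(env.get("SEARCH_API_PORT", "8083")),
        logging_level=env.get("LOGGING_LEVEL", "INFO"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the process environment.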
- ✅ Fixed MongoDB to Elasticsearch synchronization - All collections now sync successfully
- ✅ Updated to use stage0_py_utils v0.2.9 - Uses centralized configuration
- ✅ Fixed ObjectId and datetime serialization - Documents now index properly
- ✅ Updated collection names - Now uses singular names (bot, conversation, etc.)
- ✅ Removed unnecessary packaging files - Cleaner project structure
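To illustrate the ObjectId and datetime serialization issue mentioned above: MongoDB documents contain values that `json.dumps` cannot encode by default. One common approach, shown here as a sketch and not necessarily what the service does, is a `default` hook that converts those values to strings. `FakeObjectId` is a stand-in for `bson.ObjectId` so the example runs without pymongo; `to_jsonable` and `serialize_document` are hypothetical names.

```python
import json
from datetime import datetime


class FakeObjectId:
    """Stand-in for bson.ObjectId so the example runs without pymongo."""

    def __init__(self, hex_value: str):
        self._hex = hex_value

    def __str__(self) -> str:
        return self._hex


def to_jsonable(value):
    """Fallback used by json.dumps for types it cannot encode natively."""
    if isinstance(value, datetime):
        return value.isoformat()
    # ObjectId-like values are rendered through str().
    return str(value)


def serialize_document(doc: dict) -> str:
    """Serialize a MongoDB-style document to a JSON string."""
    return json.dumps(doc, default=to_jsonable)
```

A real `bson.ObjectId` also renders as its 24-character hex string under `str()`, so the same hook would apply to it.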
All endpoints return consistent error responses:
- 500 Internal Server Error: Returned for all exceptions, with an empty JSON response body ({})
- Detailed logging: All errors are logged internally with breadcrumb tracing
- Security: Authentication and authorization errors are logged but not exposed
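The error-handling behavior described above can be approximated with a small wrapper. This is a sketch under assumptions: the `safe_endpoint` decorator and the logger name are hypothetical, breadcrumb tracing is omitted, and the service's real handlers may be structured differently. Flask accepts a `(dict, status)` tuple as a view return value, so a wrapper of this shape could sit around a route function.

```python
import functools
import logging

logger = logging.getLogger("search_api_example")


def safe_endpoint(handler):
    """Wrap a handler so any exception becomes an empty body with status 500."""

    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except Exception:
            # Full details go to the log; the caller only sees {} and 500.
            logger.exception("Unhandled error in %s", handler.__name__)
            return {}, 500

    return wrapper


@safe_endpoint
def failing_handler():
    raise ValueError("boom")


@safe_endpoint
def ok_handler():
    return {"status": "ok"}, 200
```

Here `failing_handler()` returns an empty dict with status 500 while the traceback is written to the log, and `ok_handler()` passes its normal response through unchanged.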
This project follows the Stage0 development standards and implements API standards for consistency across the platform.
This project is licensed under the MIT License - see the LICENSE file for details.