feat(SQO): Introduce Slack notification pipeline and test script for failed/successful runs of the SQO #2

Merged
merged 11 commits into from
Jun 9, 2025
38 changes: 14 additions & 24 deletions Dockerfile
@@ -16,52 +16,42 @@ ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONPATH=/app \
TZ=UTC

# Install system dependencies
# Install minimal system dependencies
RUN apt-get update && apt-get install -y \
gcc \
tini \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

# Create a non-root user to run the application
RUN groupadd -g 1000 oracle && \
useradd -u 1000 -g oracle -s /bin/bash -m oracle

# Create necessary directories for persistent data with proper permissions
RUN mkdir -p /app/data/output /app/logs && \
chown -R oracle:oracle /app && \
chmod -R 750 /app

# Copy requirements file separately to leverage Docker caching
# Copy requirements file first to leverage Docker caching
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Create necessary directories for persistent data
RUN mkdir -p /app/data/output /app/logs

# Copy the application code
COPY --chown=oracle:oracle src/ ./src/
COPY --chown=oracle:oracle scripts/ ./scripts/
COPY --chown=oracle:oracle contracts/ ./contracts/
COPY src/ ./src/
COPY scripts/ ./scripts/
COPY contracts/ ./contracts/

# Copy marker files for project root detection
COPY --chown=oracle:oracle .gitignore ./
COPY --chown=oracle:oracle pyproject.toml ./
COPY .gitignore ./
COPY pyproject.toml ./

# Copy the scheduler to the root directory
COPY --chown=oracle:oracle src/models/scheduler.py ./
COPY src/models/scheduler.py ./

# Create healthcheck file
RUN touch /app/healthcheck && chown oracle:oracle /app/healthcheck

# Switch to non-root user
USER oracle
RUN touch /app/healthcheck

# Use Tini as entrypoint
# Use Tini as entrypoint for proper signal handling
ENTRYPOINT ["/usr/bin/tini", "--"]

# Add healthcheck to verify the service is running
HEALTHCHECK --interval=5m --timeout=30s --start-period=1m --retries=3 \
CMD python -c "import os, time; assert os.path.exists('/app/healthcheck') and time.time() - os.path.getmtime('/app/healthcheck') < 3600, 'Healthcheck failed: file missing or too old'" || exit 1
CMD python -c "import os, time; assert os.path.exists('/app/healthcheck') and time.time() - os.path.getmtime('/app/healthcheck') < 3600, 'Healthcheck failed'" || exit 1

# Run the scheduler
CMD ["python", "scheduler.py"]
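
The `HEALTHCHECK` above passes only while `/app/healthcheck` exists and was modified within the last 3600 seconds, so some part of the service has to refresh the file's modification time. A minimal sketch of both sides of that contract follows, assuming the scheduler does the refreshing; `touch_healthcheck` and `is_healthy` are hypothetical names and are not taken from the repository.

```python
import os
import tempfile
import time
from pathlib import Path


def touch_healthcheck(path: str) -> None:
    """Create the file if it is missing and set its mtime to now."""
    Path(path).touch()


def is_healthy(path: str, max_age_seconds: float = 3600) -> bool:
    """Mirror the Dockerfile check: the file exists and is recent enough."""
    return os.path.exists(path) and time.time() - os.path.getmtime(path) < max_age_seconds


if __name__ == "__main__":
    # Use a temporary location so the demo does not need /app to exist.
    demo_path = os.path.join(tempfile.mkdtemp(), "healthcheck")
    print(is_healthy(demo_path))  # file does not exist yet
    touch_healthcheck(demo_path)
    print(is_healthy(demo_path))  # file was just touched
```

If nothing refreshes the file, the container would be reported unhealthy once the file created at build time is more than an hour old.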
45 changes: 33 additions & 12 deletions README.md
@@ -1,24 +1,50 @@
# Service Quality Oracle


## Overview

This repository implements a Docker container service for the Service Quality Oracle. The oracle consumes data from BigQuery, applies a defined threshold algorithm to determine each indexer's eligibility for issuance rewards, and posts the eligibility data on-chain.

### Key Features

The oracle runs with the following functionality:
- **BigQuery Integration**: Fetches indexer performance data from Google BigQuery
- **Eligibility Processing**: Applies threshold algorithm to determine issuance rewards eligibility based on service quality
- **Blockchain Integration**: Posts issuance eligibility updates to the ServiceQualityOracle contract
- **Slack Notifications**: Sends success/failure notifications for monitoring
- **Docker Deployment**: Runs as a container with health checks
- **Scheduled Execution**: Runs daily at 10:00 UTC
- **RPC Failover**: Automatic failover between multiple RPC providers for reliability

### Monitoring & Notifications

The oracle includes built-in Slack notifications for operational monitoring:

- **Success Notifications**: Sent when oracle runs complete successfully, including transaction details
- **Failure Notifications**: Sent when errors occur, with detailed error information for debugging
- **Simple & Reliable**: Direct notifications from the oracle process itself

For production deployments, container orchestration (Kubernetes) should handle:
- Container health monitoring and restarts
- Resource management and scaling
- Infrastructure-level alerts and monitoring
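
As a rough illustration of what a direct notification involves: a Slack incoming webhook accepts an HTTP POST whose JSON body contains a `text` field. The sketch below is not the code in `src/utils/slack_notifier.py`, which is not shown here; `build_failure_payload` and `post_to_slack` are hypothetical names, and the message wording is an assumption.

```python
import json
import urllib.request


def build_failure_payload(error_message: str, stage: str) -> dict:
    """Build a plain-text Slack message describing a failed run."""
    return {"text": f"Oracle run failed during {stage}: {error_message}"}


def post_to_slack(webhook_url: str, payload: dict, timeout: float = 10.0) -> bool:
    """POST the payload as JSON; return True on HTTP 200, False on any error."""
    try:
        request = urllib.request.Request(
            webhook_url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers network and HTTP errors; ValueError covers malformed URLs.
        return False
```

Returning a boolean instead of raising keeps a notification failure from masking the original oracle error, which matches how the test script treats the notifier's return values.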

### Testing Notifications

Test notification functionality:
```bash
# Set webhook URL
export SLACK_WEBHOOK_URL="your_webhook_url"

# Run notification tests
./scripts/test_slack_notifications.py
```

## Configuration

## Eligibility Criteria

Please refer to the [ELIGIBILITY_CRITERIA.md](./ELIGIBILITY_CRITERIA.md) file to view the latest criteria for issuance. We are also posting upcoming criteria in that document.


## Data Flow

The application follows this data flow:
@@ -92,26 +118,21 @@ bandit -r src/

[License information to be determined.]


## TODO List (only outstanding TODOs)

### 1. Monitoring features
- [ ] Slack monitoring integration
- [ ] Add notification logic for failed runs so we are aware in a slack channel
- [ ] Initially we can notify for successful runs too

### 2. Production Readiness
### 1. Production Readiness
- [ ] Check error recovery mechanisms to see if they could be improved (RPC failover, retry logic)
- [ ] Verify health check endpoints or processes (Docker healthcheck)

### 3. Testing
### 2. Testing
- [ ] Create unit tests for all components
- [ ] Create integration tests for the entire pipeline
- [ ] Security review of code and dependencies

### 4. Documentation
### 3. Documentation
- [ ] Documentation of all major components
- [ ] Document operational procedures

## 5. Last check
### 4. Optimization
- [ ] Optimize dependencies and container setup
- [ ] Ensure unused files, functions & dependencies are removed from codebase
1 change: 1 addition & 0 deletions config.toml.example
@@ -43,3 +43,4 @@ ETHERSCAN_API_KEY = "$ETHERSCAN_API_KEY"
ARBITRUM_API_KEY = "$ARBITRUM_API_KEY"
STUDIO_API_KEY = "$STUDIO_API_KEY"
STUDIO_DEPLOY_KEY = "$STUDIO_DEPLOY_KEY"
SLACK_WEBHOOK_URL = "$SLACK_WEBHOOK_URL"
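
The new entry follows the file's existing `"$NAME"` placeholder convention, so the value has to be resolved from the environment at load time. How the project's config loader does this is not shown here; the following is only a sketch of one way to do it with the standard library, and `resolve_setting` is a hypothetical helper.

```python
from string import Template
from typing import Mapping


def resolve_setting(raw_value: str, env: Mapping[str, str]) -> str:
    """Replace $NAME or ${NAME} placeholders with values from env.

    Raises RuntimeError if a referenced variable is missing, so a
    misconfigured deployment fails early instead of posting to "$NAME".
    """
    try:
        return Template(raw_value).substitute(env)
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing.args[0]} is not set") from None


if __name__ == "__main__":
    import os

    # Typical call site: pass the real process environment.
    try:
        print(resolve_setting("$SLACK_WEBHOOK_URL", os.environ))
    except RuntimeError as error:
        print(error)
```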
3 changes: 2 additions & 1 deletion docker-compose.yml
@@ -1,5 +1,5 @@
services:
issuance-oracle:
service-quality-oracle:
build: .
container_name: service-quality-oracle
image: service-quality-oracle:latest
@@ -30,6 +30,7 @@ services:
- ARBITRUM_API_KEY=${ARBITRUM_API_KEY}
- STUDIO_API_KEY=${STUDIO_API_KEY}
- STUDIO_DEPLOY_KEY=${STUDIO_DEPLOY_KEY}
- SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL}

# Block processes from gaining higher privileges/capabilities
security_opt:
156 changes: 156 additions & 0 deletions scripts/test_slack_notifications.py
@@ -0,0 +1,156 @@
#!/usr/bin/env python3
"""
Test script for Slack notifications.
This script tests the Slack notification functionality without running the full oracle.
"""

import logging
import os
import sys

# Add project root to path
project_root = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
sys.path.insert(0, project_root)

logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s")
logger = logging.getLogger("slack-test")

from src.utils.slack_notifier import SlackNotifier, create_slack_notifier


def test_info_notification(notifier: SlackNotifier) -> bool:
    """
    Test sending an informational notification to Slack.

    Args:
        notifier: Configured SlackNotifier instance

    Returns:
        True if test passes, False otherwise
    """
    # Send test info notification with sample message
    logger.info("Testing info notification...")
    success = notifier.send_info_notification("Test info notification", "Test Notification")
    logger.info(f"Info notification: {'PASSED' if success else 'FAILED'}")
    return success


def test_success_notification(notifier: SlackNotifier) -> bool:
    """
    Test sending a success notification to Slack.

    Args:
        notifier: Configured SlackNotifier instance

    Returns:
        True if test passes, False otherwise
    """
    # Send test success notification with sample indexer data and transaction links
    logger.info("Testing success notification...")

    test_indexers = [
        "0x1234567890abcdef1234567890abcdef12345678",
        "0xabcdef1234567890abcdef1234567890abcdef12",
        "0x9876543210fedcba9876543210fedcba98765432",
    ]

    test_transaction_links = [
        "https://sepolia.arbiscan.io/tx/0x1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef12",
        "https://sepolia.arbiscan.io/tx/0xabcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890ab",
    ]

    success = notifier.send_success_notification(
        eligible_indexers=test_indexers,
        total_processed=len(test_indexers),
        execution_time=1,
        transaction_links=test_transaction_links,
        batch_count=2,
    )

    logger.info(f"Success notification: {'PASSED' if success else 'FAILED'}")
    return success


def test_failure_notification(notifier: SlackNotifier) -> bool:
    """
    Test sending a failure notification to Slack.

    Args:
        notifier: Configured SlackNotifier instance

    Returns:
        True if test passes, False otherwise
    """
    # Send test failure notification with sample error and partial transaction info
    logger.info("Testing failure notification...")

    partial_transactions = [
        "https://sepolia.arbiscan.io/tx/0x1111111111111111111111111111111111111111111111111111111111111111",
    ]

    success = notifier.send_failure_notification(
        error_message="Test error message for verification",
        stage="Test Blockchain Submission",
        execution_time=1,
        partial_transaction_links=partial_transactions,
        indexers_processed=150,
    )

    logger.info(f"Failure notification: {'PASSED' if success else 'FAILED'}")
    return success


def run_all_tests() -> bool:
    """
    Run all tests and return True if all tests pass, False otherwise.

    Returns:
        True if all tests pass, False otherwise
    """
    # Get the Slack webhook URL from the environment variable
    webhook_url = os.environ.get("SLACK_WEBHOOK_URL")
    if not webhook_url:
        logger.error("SLACK_WEBHOOK_URL environment variable not set")
        return False

    # Create a Slack notifier instance using the webhook URL
    notifier = create_slack_notifier(webhook_url)
    if not notifier:
        logger.error("Failed to create Slack notifier")
        return False

    # Define the list of tests to run
    tests = [
        test_info_notification,
        test_success_notification,
        test_failure_notification,
    ]

    # Run each test and return False if any test fails
    for test in tests:
        if not test(notifier):
            return False

    # If all tests pass, return True
    return True


def main():
    """
    Main function to orchestrate Slack notification testing.
    """
    # Display test header information
    logger.info("Service Quality Oracle - Slack Notification Test")

    if run_all_tests():
        logger.info("All tests completed successfully!")
        logger.info("Check Slack channel to verify notifications were received.")
        sys.exit(0)

    else:
        logger.error("Some tests failed!")
        sys.exit(1)


if __name__ == "__main__":
    main()
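
The script above posts to whatever `SLACK_WEBHOOK_URL` points at, so every run sends real messages. For an offline dry run, one option is a local stub that accepts the same kind of JSON POST. This is a sketch only: `StubWebhookHandler`, `start_stub_server`, and `post_json` are hypothetical names, and whether `SlackNotifier` accepts a plain `http://127.0.0.1` URL is an assumption that has not been checked against its implementation.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # JSON payloads captured by the stub server


class StubWebhookHandler(BaseHTTPRequestHandler):
    """Accept any POST, record its JSON body, and answer 200 like a webhook."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_stub_server() -> HTTPServer:
    """Start the stub on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), StubWebhookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_json(url: str, payload: dict) -> int:
    """POST a JSON payload and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.status
```

With the stub running, exporting `SLACK_WEBHOOK_URL` as `http://127.0.0.1:<port>/` (using the port from `server.server_address[1]`) would let the payloads be inspected in `received` instead of in a Slack channel.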