Commit d50ab59

feat(SQO): Introduce Slack notification pipeline and test script for failed/successful runs of the SQO (#2)
* Create slack notifier utility module
* Integrate new slack notifier module with codebase
* Create test_slack_notifications.py
* Ruff
* Update docker-compose.yml
* Update test_slack_notifications.py
* Update slack_notifier.py
* Update Dockerfile
* Update README.md
* Update issuance_eligibility_oracle_core.py
* Ruff
1 parent 8131872 commit d50ab59

9 files changed: 650 additions and 60 deletions

Dockerfile

Lines changed: 14 additions & 24 deletions
@@ -16,52 +16,42 @@ ENV PYTHONDONTWRITEBYTECODE=1 \
     PYTHONPATH=/app \
     TZ=UTC
 
-# Install system dependencies
+# Install minimal system dependencies
 RUN apt-get update && apt-get install -y \
-    gcc \
     tini \
     && apt-get clean \
     && rm -rf /var/lib/apt/lists/*
 
-# Create a non-root user to run the application
-RUN groupadd -g 1000 oracle && \
-    useradd -u 1000 -g oracle -s /bin/bash -m oracle
-
-# Create necessary directories for persistent data with proper permissions
-RUN mkdir -p /app/data/output /app/logs && \
-    chown -R oracle:oracle /app && \
-    chmod -R 750 /app
-
-# Copy requirements file separately to leverage Docker caching
+# Copy requirements file first to leverage Docker caching
 COPY requirements.txt .
 
 # Install Python dependencies
 RUN pip install --no-cache-dir -r requirements.txt
 
+# Create necessary directories for persistent data
+RUN mkdir -p /app/data/output /app/logs
+
 # Copy the application code
-COPY --chown=oracle:oracle src/ ./src/
-COPY --chown=oracle:oracle scripts/ ./scripts/
-COPY --chown=oracle:oracle contracts/ ./contracts/
+COPY src/ ./src/
+COPY scripts/ ./scripts/
+COPY contracts/ ./contracts/
 
 # Copy marker files for project root detection
-COPY --chown=oracle:oracle .gitignore ./
-COPY --chown=oracle:oracle pyproject.toml ./
+COPY .gitignore ./
+COPY pyproject.toml ./
 
 # Copy the scheduler to the root directory
-COPY --chown=oracle:oracle src/models/scheduler.py ./
+COPY src/models/scheduler.py ./
 
 # Create healthcheck file
-RUN touch /app/healthcheck && chown oracle:oracle /app/healthcheck
-
-# Switch to non-root user
-USER oracle
+RUN touch /app/healthcheck
 
-# Use Tini as entrypoint
+# Use Tini as entrypoint for proper signal handling
 ENTRYPOINT ["/usr/bin/tini", "--"]
 
 # Add healthcheck to verify the service is running
 HEALTHCHECK --interval=5m --timeout=30s --start-period=1m --retries=3 \
-    CMD python -c "import os, time; assert os.path.exists('/app/healthcheck') and time.time() - os.path.getmtime('/app/healthcheck') < 3600, 'Healthcheck failed: file missing or too old'" || exit 1
+    CMD python -c "import os, time; assert os.path.exists('/app/healthcheck') and time.time() - os.path.getmtime('/app/healthcheck') < 3600, 'Healthcheck failed'" || exit 1
 
 # Run the scheduler
 CMD ["python", "scheduler.py"]
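
Review note: the HEALTHCHECK in this Dockerfile passes only while `/app/healthcheck` exists and was modified within the last 3600 seconds, so the running process has to refresh the file at least once an hour. The diff does not show how the scheduler does that. The following is a minimal sketch of the pattern, not the project's code; `touch_healthcheck` and `is_healthy` are hypothetical names chosen for illustration.

```python
import tempfile
import time
from pathlib import Path

# Same freshness threshold as the HEALTHCHECK command in the Dockerfile.
MAX_AGE_SECONDS = 3600


def touch_healthcheck(path: Path) -> None:
    """Create the file if needed and set its modification time to now."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch(exist_ok=True)


def is_healthy(path: Path, max_age: float = MAX_AGE_SECONDS) -> bool:
    """Mirror the Dockerfile check: the file exists and is fresh enough."""
    return path.exists() and time.time() - path.stat().st_mtime < max_age


if __name__ == "__main__":
    # Demonstrate against a temporary file instead of /app/healthcheck.
    demo = Path(tempfile.mkdtemp()) / "healthcheck"
    print("before touch:", is_healthy(demo))
    touch_healthcheck(demo)
    print("after touch:", is_healthy(demo))
```

Under this assumption, a scheduler loop would call `touch_healthcheck` on every iteration, which keeps the container reported as healthy between the daily oracle runs.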

README.md

Lines changed: 33 additions & 12 deletions
@@ -1,24 +1,50 @@
 # Service Quality Oracle
 
-
 ## Overview
 
 This repository implements a Docker container service for the Service Quality Oracle. The oracle consumes data from BigQuery, processes it to determine indexer issuance rewards eligibility, based on a defined threshold algorithm, and posts issuance eligibility data on-chain.
 
+### Key Features
+
 The oracle runs with the following functionality:
 - **BigQuery Integration**: Fetches indexer performance data from Google BigQuery
 - **Eligibility Processing**: Applies threshold algorithm to determine issuance rewards eligibility based on service quality
 - **Blockchain Integration**: Posts issuance eligibility updates to the ServiceQualityOracle contract
+- **Slack Notifications**: Sends success/failure notifications for monitoring
 - **Docker Deployment**: Containerized and running with health checks
 - **Scheduled Execution**: Runs daily at 10:00 UTC
 - **RPC Failover**: Automatic failover between multiple RPC providers for reliability
 
+### Monitoring & Notifications
+
+The oracle includes built-in Slack notifications for operational monitoring:
+
+- **Success Notifications**: Sent when oracle runs complete successfully, including transaction details
+- **Failure Notifications**: Sent when errors occur, with detailed error information for debugging
+- **Simple & Reliable**: Direct notifications from the oracle process itself
+
+For production deployments, container orchestration (Kubernetes) should handle:
+- Container health monitoring and restarts
+- Resource management and scaling
+- Infrastructure-level alerts and monitoring
+
+### Testing Notifications
+
+Test notification functionality:
+```bash
+# Set webhook URL
+export SLACK_WEBHOOK_URL="your_webhook_url"
+
+# Run notification tests
+./scripts/test_slack_notifications.py
+```
+
+## Configuration
 
 ## Eligibility Criteria
 
 Please refer to the [ELIGIBILITY_CRITERIA.md](./ELIGIBILITY_CRITERIA.md) file to view the latest criteria for issuance. We are also posting upcoming criteria in that document.
 
-
 ## Data Flow
 
 The application follows this data flow:
@@ -92,26 +118,21 @@ bandit -r src/
 
 [License information to be determined.]
 
-
 ## TODO List (only outstanding TODOs)
 
-### 1. Monitoring features
-- [ ] Slack monitoring integration
-- [ ] Add notification logic for failed runs so we are aware in a slack channel
-- [ ] Initially we can notify for successful runs too
-
-### 2. Production Readiness
+### 1. Production Readiness
 - [ ] Check error recovery mechanisms to see if they could be improved (RPC failover, retry logic)
 - [ ] Verify health check endpoints or processes (Docker healthcheck)
 
-### 3. Testing
+### 2. Testing
 - [ ] Create unit tests for all components
 - [ ] Create integration tests for the entire pipeline
 - [ ] Security review of code and dependencies
 
-### 4. Documentation
+### 3. Documentation
 - [ ] Documentation of all major components
 - [ ] Document operational procedures
 
-## 5. Last check
+### 4. Optimization
+- [ ] Optimize dependencies and container setup
 - [ ] Ensure unused files, functions & dependencies are removed from codebase

config.toml.example

Lines changed: 1 addition & 0 deletions
@@ -43,3 +43,4 @@ ETHERSCAN_API_KEY = "$ETHERSCAN_API_KEY"
 ARBITRUM_API_KEY = "$ARBITRUM_API_KEY"
 STUDIO_API_KEY = "$STUDIO_API_KEY"
 STUDIO_DEPLOY_KEY = "$STUDIO_DEPLOY_KEY"
+SLACK_WEBHOOK_URL = "$SLACK_WEBHOOK_URL"
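
Review note: the values in `config.toml.example` are `$NAME` placeholders, so something must replace them with real environment values at load time. The loader is not part of this diff. As a rough sketch of one way to do it, assuming the TOML file has already been parsed into nested dicts and lists, the hypothetical helper `resolve_placeholders` below substitutes placeholders with the standard library's `string.Template` and fails loudly when a variable is missing.

```python
from string import Template
from typing import Any, Mapping


def resolve_placeholders(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace $NAME placeholders in strings with values from env.

    Raises KeyError if a referenced variable is absent from env. Strings with a
    stray '$' that is not a valid placeholder raise ValueError, so a literal
    dollar sign has to be written as '$$'.
    """
    if isinstance(value, dict):
        return {key: resolve_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(item, env) for item in value]
    if isinstance(value, str):
        return Template(value).substitute(env)
    # Numbers, booleans and other non-string values pass through unchanged.
    return value
```

Calling it as `resolve_placeholders(parsed_config, os.environ)` would resolve `SLACK_WEBHOOK_URL` from the environment that `docker-compose.yml` passes into the container. Failing on a missing variable is a design choice in this sketch; the project may instead treat the webhook as optional.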

docker-compose.yml

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,5 @@
 services:
-  issuance-oracle:
+  service-quality-oracle:
     build: .
     container_name: service-quality-oracle
     image: service-quality-oracle:latest
@@ -30,6 +30,7 @@ services:
       - ARBITRUM_API_KEY=${ARBITRUM_API_KEY}
       - STUDIO_API_KEY=${STUDIO_API_KEY}
       - STUDIO_DEPLOY_KEY=${STUDIO_DEPLOY_KEY}
+      - SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL}
 
     # Block processes from gaining higher privileges/capabilities
     security_opt:

scripts/test_slack_notifications.py

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
+#!/usr/bin/env python3
+"""
+Test script for Slack notifications.
+This script tests the Slack notification functionality without running the full oracle.
+"""
+
+import logging
+import os
+import sys
+
+# Add project root to path
+project_root = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
+sys.path.insert(0, project_root)
+
+logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s")
+logger = logging.getLogger("slack-test")
+
+from src.utils.slack_notifier import SlackNotifier, create_slack_notifier
+
+
+def test_info_notification(notifier: SlackNotifier) -> bool:
+    """
+    Test sending an informational notification to Slack.
+
+    Args:
+        notifier: Configured SlackNotifier instance
+
+    Returns:
+        True if test passes, False otherwise
+    """
+    # Send test info notification with sample message
+    logger.info("Testing info notification...")
+    success = notifier.send_info_notification("Test info notification", "Test Notification")
+    logger.info(f"Info notification: {'PASSED' if success else 'FAILED'}")
+    return success
+
+
+def test_success_notification(notifier: SlackNotifier) -> bool:
+    """
+    Test sending a success notification to Slack.
+
+    Args:
+        notifier: Configured SlackNotifier instance
+
+    Returns:
+        True if test passes, False otherwise
+    """
+    # Send test success notification with sample indexer data and transaction links
+    logger.info("Testing success notification...")
+
+    test_indexers = [
+        "0x1234567890abcdef1234567890abcdef12345678",
+        "0xabcdef1234567890abcdef1234567890abcdef12",
+        "0x9876543210fedcba9876543210fedcba98765432",
+    ]
+
+    test_transaction_links = [
+        "https://sepolia.arbiscan.io/tx/0x1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef12",
+        "https://sepolia.arbiscan.io/tx/0xabcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890ab",
+    ]
+
+    success = notifier.send_success_notification(
+        eligible_indexers=test_indexers,
+        total_processed=len(test_indexers),
+        execution_time=1,
+        transaction_links=test_transaction_links,
+        batch_count=2,
+    )
+
+    logger.info(f"Success notification: {'PASSED' if success else 'FAILED'}")
+    return success
+
+
+def test_failure_notification(notifier: SlackNotifier) -> bool:
+    """
+    Test sending a failure notification to Slack.
+
+    Args:
+        notifier: Configured SlackNotifier instance
+
+    Returns:
+        True if test passes, False otherwise
+    """
+    # Send test failure notification with sample error and partial transaction info
+    logger.info("Testing failure notification...")
+
+    partial_transactions = [
+        "https://sepolia.arbiscan.io/tx/0x1111111111111111111111111111111111111111111111111111111111111111",
+    ]
+
+    success = notifier.send_failure_notification(
+        error_message="Test error message for verification",
+        stage="Test Blockchain Submission",
+        execution_time=1,
+        partial_transaction_links=partial_transactions,
+        indexers_processed=150,
+    )
+
+    logger.info(f"Failure notification: {'PASSED' if success else 'FAILED'}")
+    return success
+
+
+def run_all_tests() -> bool:
+    """
+    Run all tests and return True if all tests pass, False otherwise.
+
+    Returns:
+        True if all tests pass, False otherwise
+    """
+    # Get the Slack webhook URL from the environment variable
+    webhook_url = os.environ.get("SLACK_WEBHOOK_URL")
+    if not webhook_url:
+        logger.error("SLACK_WEBHOOK_URL environment variable not set")
+        return False
+
+    # Create a Slack notifier instance using the webhook URL
+    notifier = create_slack_notifier(webhook_url)
+    if not notifier:
+        logger.error("Failed to create Slack notifier")
+        return False
+
+    # Define the list of tests to run
+    tests = [
+        test_info_notification,
+        test_success_notification,
+        test_failure_notification,
+    ]
+
+    # Run each test and return False if any test fails
+    for test in tests:
+        if not test(notifier):
+            return False
+
+    # If all tests pass, return True
+    return True
+
+
+def main():
+    """
+    Main function to orchestrate Slack notification testing.
+    """
+    # Display test header information
+    logger.info("Service Quality Oracle - Slack Notification Test")
+
+    if run_all_tests():
+        logger.info("All tests completed successfully!")
+        logger.info("Check Slack channel to verify notifications were received.")
+        sys.exit(0)
+
+    else:
+        logger.error("Some tests failed!")
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
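
Review note: this script exercises `src/utils/slack_notifier.py`, which is not shown in this part of the diff. To make clear what the tests rely on, here is a minimal, standard-library sketch of a webhook notifier. It is not the project's implementation. `SketchNotifier`, `create_sketch_notifier`, `post_json`, and the `(message, title)` argument order are assumptions made for illustration. The only external fact it depends on is that a Slack incoming webhook accepts an HTTP POST with a JSON body containing a `text` field.

```python
import json
import urllib.request
from typing import Callable, Optional

# A sender takes (url, payload) and returns True when delivery succeeded.
Sender = Callable[[str, dict], bool]


def post_json(url: str, payload: dict, timeout: float = 10.0) -> bool:
    """POST a JSON payload and report whether the server answered HTTP 200."""
    try:
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and timeouts; ValueError covers
        # malformed URLs. A notification failure should not crash the caller.
        return False


class SketchNotifier:
    """Hypothetical stand-in for SlackNotifier with an injectable sender."""

    def __init__(self, webhook_url: str, sender: Sender = post_json) -> None:
        self._webhook_url = webhook_url
        self._sender = sender

    def send_info_notification(self, message: str, title: str) -> bool:
        # Slack renders *text* as bold in its mrkdwn format.
        return self._sender(self._webhook_url, {"text": f"*{title}*\n{message}"})


def create_sketch_notifier(
    webhook_url: Optional[str], sender: Sender = post_json
) -> Optional[SketchNotifier]:
    """Return a notifier, or None when no usable webhook URL is configured."""
    if not webhook_url or not webhook_url.strip():
        return None
    return SketchNotifier(webhook_url.strip(), sender)
```

Injecting the sender lets the notifier be checked offline with a fake that records payloads, which would complement the script above, since it posts to a real Slack channel.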
