WeaveBot - Intelligent Event Assistant 🤖

An intelligent Telegram bot that extracts event information from web pages using Playwright for browser automation and OpenAI GPT-4o for intelligent data extraction.

🚀 Key Features

🌐 Universal Web Scraping: Handles JavaScript-heavy sites (Lu.ma, Meetup, etc.) with Playwright
🧠 AI-Powered Extraction: Uses GPT-4o for intelligent event/update data extraction
📊 Airtable Integration: Automatically saves events and updates to organized tables
⚡ Fast Processing: ~5-10 second response times
🛡️ Robust Error Handling: Graceful failures with helpful user feedback
📈 Weekly Summaries: Generate newsletter-style event and update summaries

🏗️ Architecture

User Input (URL) → Playwright (Render Page) → OpenAI (Extract Data) → Airtable (Save) → User Feedback

Why This Approach?

Playwright: Handles modern JavaScript-heavy event platforms
OpenAI GPT-4o: Intelligent, context-aware data extraction
Direct Integration: No third-party scraping services, full control
Cost Effective: Only OpenAI API costs (~$20-50/month typical usage)

📋 Commands

/start - Welcome message and usage guide
/weeklyweave - Generate weekly summary of events and updates

💬 Message Formats

Event Extraction

event: https://lu.ma/event-link
event: https://meetup.com/group/events/123456
event: https://eventbrite.com/e/event-name-123456

Update Processing

update: https://techcrunch.com/article-link
update: Just wanted to share that our meetup went great!

🌐 Supported Websites

✅ Excellent Support

Lu.ma events - Full dynamic content support
Meetup.com - Comprehensive event details
News sites - TechCrunch, Wired, etc.
Simple event pages - Static HTML sites
Blog posts - Personal and corporate blogs

⚠️ Limited Support

Eventbrite - May be blocked due to anti-bot measures
Facebook Events - Requires authentication
LinkedIn Events - Anti-scraping protection

🔧 Environment Variables

Required

TELEGRAM_BOT_TOKEN=your_telegram_bot_token
OPENAI_API_KEY=your_openai_api_key
AIRTABLE_API_KEY=your_airtable_api_key
AIRTABLE_BASE_ID=your_airtable_base_id
AIRTABLE_TABLE_NAME=Events

Optional

AIRTABLE_TABLE_ID=optional_events_table_id
AIRTABLE_VIEW_ID=optional_events_view_id
AIRTABLE_UPDATES_TABLE_NAME=Updates
AIRTABLE_UPDATES_TABLE_ID=optional_updates_table_id
AIRTABLE_UPDATES_VIEW_ID=optional_updates_view_id

🚀 Deployment

Option 1: Render (Recommended)

Fork this repository
Connect to Render
Set environment variables
Deploy as Worker service

Option 2: Docker

# Build image
docker build -t weavebot .

# Run container
docker run -d \
  --name weavebot \
  -e TELEGRAM_BOT_TOKEN=your_token \
  -e OPENAI_API_KEY=your_key \
  -e AIRTABLE_API_KEY=your_key \
  -e AIRTABLE_BASE_ID=your_base_id \
  -e AIRTABLE_TABLE_NAME=Events \
  weavebot

Option 3: Local Development

# Install dependencies
pip install -r requirements.txt

# Install Playwright browsers
playwright install chromium

# Set environment variables in .env file
cp .env.example .env
# Edit .env with your keys

# Run the bot
python bot.py

📊 Data Structure

Events Table (Airtable)

Event Title (Text)
Description (Long Text)
Start Datetime (Date/Time)
End Datetime (Date/Time)
Location (Text)
Link (URL)

Updates Table (Airtable)

Content (Long Text)
Received At (Date/Time - auto-generated)

🏃‍♂️ Performance

Cold Start: ~5-10 seconds
Warm Processing: ~3-5 seconds
Memory Usage: ~150-200MB
Browser Overhead: Minimal (headless Chromium)

🔄 Migration from ScrapeGraphAI

This version removes ScrapeGraphAI in favor of a cleaner architecture:

Before (Issues)

Complex setup with multiple dependencies
ScrapeGraphAI reliability issues
Credit-based pricing confusion
Performance overhead

After (Benefits)

Direct Playwright + OpenAI integration
Predictable OpenAI-only costs
Better error handling and logging
Faster processing times

🧪 Testing

WeaveBot includes a comprehensive test suite with 22 tests covering all functionality:

Quick Testing

# Run all tests
python3 run_tests.py all

# Run only unit tests (fast)
python3 run_tests.py unit

# Run with coverage report
python3 run_tests.py coverage

Test Coverage

✅ Date validation and formatting
✅ OpenAI data extraction with mocking
✅ Playwright browser automation
✅ Airtable integration and data mapping
✅ Newsletter generation and formatting
✅ End-to-end workflow testing
✅ Comprehensive error handling

CI/CD

GitHub Actions: Automated testing on push/PR
Multiple Python versions: 3.9, 3.10, 3.11
Code quality: Linting with flake8, black, isort
Coverage reporting: Integrated with Codecov

See Testing Guide for detailed documentation.

🛠️ Development

Project Structure

WeaveBot/
├── bot.py              # Main bot logic
├── test_bot.py         # Comprehensive test suite
├── run_tests.py        # Test runner script
├── pytest.ini         # Test configuration
├── requirements.txt    # Python dependencies
├── Dockerfile         # Container configuration
├── render.yaml        # Render deployment config
├── docs/              # Documentation
│   ├── testing.md     # Testing guide
│   └── python-revert-analysis.md
└── README.md          # This file

Key Components

Event Processing: scrape_event_data() + extract_event_data_with_openai()
Update Processing: scrape_update_data() + extract_update_data_with_openai()
Browser Automation: get_html_with_playwright()
Data Storage: save_event_to_airtable() + save_update_to_airtable()

🐛 Troubleshooting

Common Issues

Bot not responding

Check Telegram bot token
Verify internet connectivity
Check logs for error messages

Scraping failures

Some sites block automated access
Try different event platforms (Lu.ma, Meetup)
Check if URL is accessible manually

Airtable errors

Verify API key and base ID
Check table names match exactly
Ensure required fields exist in tables

Logging

The bot provides detailed logging for debugging:

# View logs in production
docker logs weavebot

# Local development
python bot.py  # Logs print to console

📈 Usage Analytics

Track your bot usage:

Successful Events: Check Airtable Events table
Updates Processed: Check Airtable Updates table
Error Rates: Monitor application logs
Response Times: Built-in timing logs

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Test thoroughly
Submit a pull request

📄 License

MIT License - see LICENSE file for details

🙋‍♂️ Support

For issues or questions:

Check the troubleshooting section
Review application logs
Open a GitHub issue with details

Built with ❤️ using Python, Playwright, and OpenAI GPT-4o

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
.github/workflows		.github/workflows
docs		docs
new-ts		new-ts
.dockerignore		.dockerignore
.gitignore		.gitignore
AI-SDK-TS-MIGRATION.md		AI-SDK-TS-MIGRATION.md
CLAUDE.md		CLAUDE.md
DEPLOYMENT.md		DEPLOYMENT.md
Dockerfile		Dockerfile
IMPLEMENTATION-SUMMARY.md		IMPLEMENTATION-SUMMARY.md
README.md		README.md
bot.py		bot.py
get_user_id.py		get_user_id.py
pytest.ini		pytest.ini
render.yaml		render.yaml
requirements.txt		requirements.txt
run_tests.py		run_tests.py
test_bot.py		test_bot.py
test_event_extraction.py		test_event_extraction.py

Woven-Web/WeaveBot

Folders and files

Latest commit

History

Repository files navigation