Syncing data from Pinata / creating a Farcaster hub on NCSA infra
This project collects Farcaster data from the Pinata Hub API (https://hub.pinata.cloud/v1) without requiring an API key. The data collection process includes:
- FIDs (Farcaster IDs)
- Casts (posts)
- Reactions (likes and recasts)
- Verifications (Ethereum address verifications)
- Links (follows)
- User data (profile information)
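As a rough illustration of how one of these message types might be pulled from the hub, the sketch below pages through an endpoint until no further page token is returned. The endpoint name `castsByFid`, the `pageSize`/`pageToken` query parameters, and the `messages`/`nextPageToken` response fields are assumptions based on the public Farcaster Hub HTTP API; they are not taken from this project's collector, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional

HUB_URL = "https://hub.pinata.cloud/v1"  # base URL from this README


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (no API key header is sent)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def iter_messages(
    endpoint: str,
    params: dict,
    fetch: Callable[[str], dict] = http_get_json,
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield every message from a paginated hub endpoint.

    Assumes each response has a "messages" list and a "nextPageToken"
    string that is empty or absent on the last page.
    """
    page_token: Optional[str] = None
    while True:
        query = dict(params, pageSize=page_size)
        if page_token:
            query["pageToken"] = page_token
        url = f"{HUB_URL}/{endpoint}?{urllib.parse.urlencode(query)}"
        page = fetch(url)
        yield from page.get("messages", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break
```

A call such as `for cast in iter_messages("castsByFid", {"fid": 2}): ...` would then walk all pages for one FID. Passing `fetch` as a parameter keeps the pagination logic testable without network access.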
The data is stored in a PostgreSQL database with the following tables:
- fids: Stores Farcaster IDs
- casts: Stores user posts
- reactions: Stores likes and recasts
- verifications: Stores Ethereum address verifications
- links: Stores follow relationships
- user_data: Stores user profile information
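The table list above can be made concrete with a minimal schema sketch. The column names `text`, `timestamp`, `type`, `value`, `target_fid`, `address`, and `target_hash` come from the example queries in this README; the `hash` column, every column type, and every key are assumptions, so the project's real schema may differ. `SCHEMA` and `create_tables` are hypothetical names.

```python
import sqlite3

# Columns used by the README's example queries are kept as-is.
# The hash column, all types, and all keys are assumptions.
SCHEMA = {
    "fids": "CREATE TABLE IF NOT EXISTS fids (fid BIGINT PRIMARY KEY)",
    "casts": (
        "CREATE TABLE IF NOT EXISTS casts ("
        "hash TEXT PRIMARY KEY, fid BIGINT NOT NULL, "
        "text TEXT, timestamp BIGINT)"
    ),
    "reactions": (
        "CREATE TABLE IF NOT EXISTS reactions ("
        "fid BIGINT NOT NULL, type TEXT NOT NULL, target_hash TEXT NOT NULL, "
        "PRIMARY KEY (fid, type, target_hash))"
    ),
    "verifications": (
        "CREATE TABLE IF NOT EXISTS verifications ("
        "fid BIGINT NOT NULL, address TEXT NOT NULL, "
        "PRIMARY KEY (fid, address))"
    ),
    "links": (
        "CREATE TABLE IF NOT EXISTS links ("
        "fid BIGINT NOT NULL, target_fid BIGINT NOT NULL, type TEXT NOT NULL, "
        "PRIMARY KEY (fid, target_fid, type))"
    ),
    "user_data": (
        "CREATE TABLE IF NOT EXISTS user_data ("
        "fid BIGINT NOT NULL, type TEXT NOT NULL, value TEXT, "
        "PRIMARY KEY (fid, type))"
    ),
}


def create_tables(conn) -> None:
    """Create all six tables on any DB-API connection."""
    cur = conn.cursor()
    for ddl in SCHEMA.values():
        cur.execute(ddl)
    cur.close()
    conn.commit()


if __name__ == "__main__":
    # Dry run against in-memory SQLite, standing in for PostgreSQL.
    demo = sqlite3.connect(":memory:")
    create_tables(demo)
    print(sorted(SCHEMA))
```

The DDL sticks to generic SQL types so the same statements can be dry-run on SQLite before being applied to PostgreSQL.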
To set up and run the project:
- Set up a PostgreSQL database
- Configure the following environment variables in a .env file:
    DB_NAME=your_database_name
    DB_USER=your_database_user
    DB_PASSWORD=your_database_password
    DB_HOST=your_database_host
    DB_PORT=your_database_port
- Run the data collector:
    python farcaster_data_collector.py
- Query the collected data:
    python query_farcaster_data.py
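The DB_* variables from the .env file could be turned into connection settings along the following lines. This is a sketch, not the project's actual loader: it assumes the .env file has already been loaded into the process environment (for example with python-dotenv), and it assumes a driver such as psycopg2 whose `connect()` accepts the keyword names used here. `load_db_config` is a hypothetical helper.

```python
import os

REQUIRED_VARS = ("DB_NAME", "DB_USER", "DB_PASSWORD", "DB_HOST", "DB_PORT")


def load_db_config(env=None) -> dict:
    """Build connection keyword arguments from the DB_* variables.

    Raises RuntimeError naming every variable that is missing or empty,
    so a misconfigured .env file fails early with a clear message.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "dbname": env["DB_NAME"],
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
        "host": env["DB_HOST"],
        "port": int(env["DB_PORT"]),  # stored as text in .env
    }
```

With psycopg2 installed, the result could be passed on as `psycopg2.connect(**load_db_config())`.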
You can also query the database directly using the PostgreSQL command-line tool psql. Here are some example queries:
- Connect to the database:
    psql -h <DB_HOST> -p <DB_PORT> -U <DB_USER> -d <DB_NAME>
- Example queries:
    -- Get total number of FIDs
    SELECT COUNT(*) FROM fids;

    -- Get latest 5 casts
    SELECT text, timestamp FROM casts ORDER BY timestamp DESC LIMIT 5;

    -- Get user's profile data
    SELECT type, value FROM user_data WHERE fid = <FID>;

    -- Get user's followers
    SELECT fid FROM links WHERE target_fid = <FID> AND type = 'follow';

    -- Get user's verifications
    SELECT address FROM verifications WHERE fid = <FID>;

    -- Get user's reactions
    SELECT type, target_hash FROM reactions WHERE fid = <FID>;
- To exit psql:
    \q
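The same queries can be issued from Python with the FID passed as a bound parameter rather than pasted into the SQL text. The sketch below assumes a DB-API cursor from a driver that uses `%s` placeholders, as psycopg2 does; the function names are hypothetical and are not taken from query_farcaster_data.py.

```python
def get_followers(cursor, fid: int) -> list:
    """Return the FIDs that follow the given user."""
    cursor.execute(
        "SELECT fid FROM links WHERE target_fid = %s AND type = 'follow'",
        (fid,),  # bound by the driver, never formatted into the string
    )
    return [row[0] for row in cursor.fetchall()]


def get_latest_casts(cursor, limit: int = 5) -> list:
    """Return (text, timestamp) rows for the most recent casts."""
    cursor.execute(
        "SELECT text, timestamp FROM casts ORDER BY timestamp DESC LIMIT %s",
        (limit,),
    )
    return cursor.fetchall()
```

Binding parameters this way lets the driver handle quoting, which avoids SQL injection if the FID ever comes from user input.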
To run the project with Docker Compose:
- Build and start the containers:
    docker-compose up -d
- View logs:
    docker-compose logs -f
- Stop the containers:
    docker-compose down
The project uses GitHub Actions for CI/CD. To deploy to production:
- Set up the following secrets in your GitHub repository:
  - DOCKERHUB_USERNAME: Your DockerHub username
  - DOCKERHUB_TOKEN: Your DockerHub access token
  - SERVER_HOST: Your server's hostname or IP
  - SERVER_USERNAME: SSH username for the server
  - SERVER_SSH_KEY: SSH private key for server access
  - DB_NAME: Database name
  - DB_USER: Database user
  - DB_PASSWORD: Database password
- Push to the main branch to trigger deployment:
    git push origin main
The GitHub Actions workflow will:
- Run tests
- Build and push the Docker image
- Deploy to your server using SSH
- Set up the environment and start the containers
Before the first deployment, prepare the server:
- Install Docker and Docker Compose on your server
- Create the deployment directory:
    mkdir -p /opt/farcaster-collector
- Copy the docker-compose.yml file to the server
- The GitHub Actions workflow will handle the rest of the setup