rmichelena/x_to_raindrop

X Bookmarks to Raindrop.io Converter

📋 Project Overview

This project converts Twitter/X bookmarks exported as JSON into a CSV format compatible with Raindrop.io, using OpenAI's GPT-3.5-turbo for intelligent title generation and folder/tag reclassification.

🎯 Goal

Transform a JSON file containing Twitter bookmarks (with manually added folder and tag data) into a clean, organized CSV that can be imported into Raindrop.io with:

  • Intelligent title generation (concise one-liners)
  • Smart folder and tag reclassification
  • Proper nested folder structure under "X/"
  • Clean tag taxonomy (removing low-usage tags)

📁 Project Structure

X Bookmarks to Raindrop/
├── README.md                           # This file
├── twitter_bookmarks.json              # Original export (Twitter Bookmark Exporter)
├── twitter_bookmarks_tagged_full.json  # Manually enhanced with folders/tags (1,778 bookmarks)
├── APIKEY.txt                          # OpenAI API key (user-provided)
├── folders_list.txt                    # Extracted folder list for AI prompts
├── tags_list.txt                       # Extracted tag list for AI prompts
├── raindrop_format.csv                 # Final output (all bookmarks)
├── raindrop_cleaned.csv                # Cleaned output (tags with ≥5 uses)
├── setup_openai.py                     # OpenAI API key setup utility
├── analyze_folders_tags.py             # Extract folders/tags from JSON
├── reclassify_with_openai.py           # Early OpenAI processing script
├── create_raindrop_format.py           # Main conversion script (final format)
├── analyze_tags.py                     # Analyze tag usage in CSV
└── clean_tags.py                       # Remove low-usage tags

🔧 Dependencies

pip install openai

Environment Variables

  • OPENAI_API_KEY: Your OpenAI API key (set via setup_openai.py or manually)
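
Since the key may live either in the environment or in APIKEY.txt, a lookup along the following lines could cover both cases. This is only a sketch: `load_api_key` is a hypothetical helper, and the lookup order and the one-key-per-file layout are assumptions rather than details taken from the repository's scripts.

```python
import os
from pathlib import Path


def load_api_key(key_file="APIKEY.txt", env=None):
    """Return the OpenAI API key from the environment, else from key_file.

    Hypothetical helper: the lookup order (environment first, file second)
    is an assumption, as is the file containing nothing but the key.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if key:
        return key
    path = Path(key_file)
    if path.is_file():
        key = path.read_text(encoding="utf-8").strip()
        if key:
            return key
    raise RuntimeError("No API key found: set OPENAI_API_KEY or create APIKEY.txt")
```

Keeping APIKEY.txt out of version control (for example via .gitignore) avoids publishing the key by accident.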

📊 Complete Data Flow

twitter_bookmarks.json (original export from Twitter Bookmark Exporter)
    ↓
[MANUAL STEP: User added "folder" and "tags" fields to each bookmark]
    ↓
twitter_bookmarks_tagged_full.json (manually enhanced with folders/tags)
    ↓
analyze_folders_tags.py → folders_list.txt, tags_list.txt
    ↓
reclassify_with_openai.py → raindrop_reclassified.csv (early attempt)
    ↓
[Multiple iterations and format fixes]
    ↓
create_raindrop_format.py → raindrop_format.csv (final format)
    ↓
clean_tags.py → raindrop_cleaned.csv (recommended for import)

🚀 Quick Start

1. Setup OpenAI API Key

python3 setup_openai.py

Or manually set: export OPENAI_API_KEY="your-key-here"

2. Prepare Enhanced JSON (Manual Step)

  • Start with twitter_bookmarks.json (original export)
  • Manually add "folder" and "tags" fields to each bookmark
  • Save as twitter_bookmarks_tagged_full.json
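
Before spending API calls, it may help to confirm that the manual step is complete. The sketch below reports bookmarks that still lack a "folder" or "tags" field. It assumes the file's top level is a JSON array of bookmark objects; `find_untagged` and `check_file` are hypothetical names, not part of the repository.

```python
import json


def find_untagged(bookmarks):
    """Return the indexes of bookmarks missing a "folder" or "tags" field."""
    return [
        i for i, b in enumerate(bookmarks)
        if "folder" not in b or "tags" not in b
    ]


def check_file(path="twitter_bookmarks_tagged_full.json"):
    # Assumption: the file's top level is a JSON array of bookmark objects.
    with open(path, encoding="utf-8") as f:
        bookmarks = json.load(f)
    missing = find_untagged(bookmarks)
    print(f"{len(bookmarks)} bookmarks, {len(missing)} missing folder/tags")
    return missing
```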

3. Extract Folder/Tag Lists

python3 analyze_folders_tags.py

4. Generate Raindrop CSV

python3 create_raindrop_format.py

5. Clean Tags (Recommended)

python3 clean_tags.py

6. Import to Raindrop.io

Upload raindrop_cleaned.csv to Raindrop.io

📋 Scripts Documentation

Core Scripts

create_raindrop_format.py 🎯

Purpose: Main conversion script that processes all bookmarks with OpenAI

  • Generates concise titles (max 100 chars) using GPT-3.5-turbo
  • Reclassifies folders and tags based on content analysis
  • Outputs CSV in exact Raindrop.io export format
  • Processes all 1,778 bookmarks with rate limiting

Key Features:

  • Single OpenAI API call per bookmark (cost-optimized)
  • Text cleaning for CSV compatibility
  • ISO 8601 timestamp format
  • Nested folders under "X/"
  • Adds "twitter" tag to all entries
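
The last two features (nesting under "X/" and the mandatory "twitter" tag) can be sketched as two small normalizers. The function names are hypothetical and the exact behavior of create_raindrop_format.py may differ; this only illustrates the rule described above.

```python
def nest_folder(folder, root="X"):
    """Place a folder name under the root collection, e.g. "ai" -> "X/ai"."""
    folder = folder.strip().strip("/")
    if not folder:
        return root
    if folder == root or folder.startswith(root + "/"):
        return folder  # already nested, leave untouched
    return f"{root}/{folder}"


def with_twitter_tag(tags):
    """Return the tags with "twitter" included exactly once, order preserved."""
    seen = []
    for tag in [*tags, "twitter"]:
        tag = tag.strip()
        if tag and tag not in seen:
            seen.append(tag)
    return seen
```

Making both functions idempotent means a row can be re-processed without ending up as "X/X/ai" or carrying a duplicate "twitter" tag.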

clean_tags.py 🧹

Purpose: Removes tags with fewer than 5 uses to create a cleaner taxonomy

  • Reduces unique tags from 1,247 to 203 (84% reduction)
  • Maintains meaningful tags only
  • Provides detailed analysis of tag usage
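
The cleanup rule amounts to counting every tag across all rows and keeping only those at or above the threshold. A minimal sketch, assuming each row stores its tags as one comma-separated string (as in the CSV); `split_tags` and `drop_rare_tags` are hypothetical names:

```python
from collections import Counter


def split_tags(cell):
    """Split a comma-separated tag cell into stripped, non-empty tags."""
    return [t.strip() for t in cell.split(",") if t.strip()]


def drop_rare_tags(tag_cells, min_uses=5):
    """Remove tags used fewer than min_uses times across all rows."""
    counts = Counter(t for cell in tag_cells for t in split_tags(cell))
    keep = {t for t, n in counts.items() if n >= min_uses}
    return [
        ", ".join(t for t in split_tags(cell) if t in keep)
        for cell in tag_cells
    ]
```

Counting has to finish over the whole file before any row is rewritten, which is why the function makes two passes.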

reclassify_with_openai.py 🔄

Purpose: Early version of OpenAI processing script

  • Predecessor to create_raindrop_format.py
  • Combined title generation and reclassification
  • Used separate API calls (less cost-efficient)
  • Generated intermediate CSV outputs

Utility Scripts

setup_openai.py 🔑

Purpose: Secure OpenAI API key setup

  • Prompts for API key securely (no echo)
  • Sets environment variable
  • Validates key format
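
A no-echo prompt with a basic format check might look like the sketch below. The "sk-" prefix test is an assumption about what "validates key format" means here, and both function names are hypothetical. Note that a Python process can only set the variable for itself and its child processes, not for the shell that launched it.

```python
import getpass
import os


def looks_like_openai_key(key):
    # Assumption: keys start with "sk-" and contain no whitespace; the
    # real script's format check may be stricter or looser.
    return (
        key.startswith("sk-")
        and len(key) > 20
        and not any(c.isspace() for c in key)
    )


def prompt_for_key():
    key = getpass.getpass("OpenAI API key: ").strip()  # input is not echoed
    if not looks_like_openai_key(key):
        raise ValueError("That does not look like an OpenAI API key")
    # Visible to this process and its children only, not the parent shell.
    os.environ["OPENAI_API_KEY"] = key
    return key
```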

analyze_folders_tags.py 📊

Purpose: Extract unique folders and tags from source JSON

  • Creates folders_list.txt and tags_list.txt
  • Used as reference lists for OpenAI reclassification
  • Provides usage statistics
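
The extraction step can be sketched as a single pass that tallies folders and tags. Whether "tags" is stored as a list or as a comma-separated string is not stated above, so this hypothetical version accepts both; the function names are likewise illustrative.

```python
from collections import Counter


def collect_folders_and_tags(bookmarks):
    """Count how often each folder and each tag appears in the bookmarks."""
    folders, tags = Counter(), Counter()
    for b in bookmarks:
        folder = (b.get("folder") or "").strip()
        if folder:
            folders[folder] += 1
        raw = b.get("tags") or []
        if isinstance(raw, str):  # assumption: may be comma-separated text
            raw = raw.split(",")
        for tag in raw:
            tag = tag.strip()
            if tag:
                tags[tag] += 1
    return folders, tags


def write_list(counter, path):
    # One name per line, most used first, ties broken alphabetically.
    with open(path, "w", encoding="utf-8") as f:
        for name, _ in sorted(counter.items(), key=lambda kv: (-kv[1], kv[0])):
            f.write(name + "\n")
```

Calling `write_list(folders, "folders_list.txt")` and `write_list(tags, "tags_list.txt")` would then produce the two reference files.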

analyze_tags.py 📈

Purpose: Analyze tag distribution in generated CSV

  • Counts total vs unique tags
  • Shows most popular tags
  • Helps understand Raindrop.io import statistics
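
The distinction between total uses and unique tags is easy to reproduce. The sketch below reads CSV text and reports both figures plus the most common tags; it assumes a "tags" column holding comma-separated values, and `tag_stats` is a hypothetical name.

```python
import csv
import io
from collections import Counter


def tag_stats(csv_text, top=5):
    """Return (total_uses, unique_tags, most_common) for the "tags" column."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        for tag in (row.get("tags") or "").split(","):
            tag = tag.strip()
            if tag:
                counts[tag] += 1
    return sum(counts.values()), len(counts), counts.most_common(top)
```

For a file on disk, pass `open(path, encoding="utf-8", newline="").read()` as `csv_text`.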

🔍 Key Features

OpenAI Integration

  • Model: GPT-3.5-turbo
  • Single API Call: Combines title generation + classification
  • Rate Limiting: 0.2s delay between calls
  • Cost Optimization: ~$5-10 for 1,778 bookmarks
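
The rate-limiting pattern is a fixed pause between consecutive calls. In the sketch below, `classify` stands in for whatever function performs the single OpenAI request per bookmark; both it and `process_with_delay` are hypothetical names, and the `sleep` parameter exists only so the loop can be exercised without real waiting.

```python
import time


def process_with_delay(bookmarks, classify, delay=0.2, sleep=time.sleep):
    """Call classify(bookmark) once per bookmark, pausing between calls."""
    results = []
    for i, bookmark in enumerate(bookmarks):
        if i:
            sleep(delay)  # pause between consecutive calls, not before the first
        results.append(classify(bookmark))
    return results
```

At 0.2 s per gap, 1,778 bookmarks spend about 355 s (roughly six minutes) waiting, independent of how long the API calls themselves take.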

Raindrop.io Compatibility

  • Exact Format Match: Based on actual Raindrop export
  • Required Fields: id, title, note, excerpt, url, folder, tags, created, cover, highlights, favorite
  • Folder Structure: All nested under "X/" (e.g., "X/ai", "X/devtools")
  • Tag Format: Comma-separated, includes "twitter" tag
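
Writing rows with a fixed column order is straightforward with the standard csv module. This sketch uses the field names listed above; `rows_to_csv` is a hypothetical helper, and missing fields are written as empty strings on the assumption that Raindrop.io accepts blanks there.

```python
import csv
import io

FIELDS = ["id", "title", "note", "excerpt", "url", "folder", "tags",
          "created", "cover", "highlights", "favorite"]


def rows_to_csv(rows):
    """Serialize row dicts to CSV text with the column order listed above."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)  # the writer quotes commas, quotes, and newlines
    return buf.getvalue()
```

When writing straight to a file instead, open it with `newline=""` and `encoding="utf-8"` so the csv module controls line endings.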

Data Processing

  • Text Cleaning: Removes problematic newlines and quotes
  • Timestamp Conversion: ISO 8601 format (2025-07-19T18:34:39.000Z)
  • Tag Optimization: Removes tags with <5 uses
  • Content Preservation: Full text in excerpt field
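
The cleaning and timestamp steps can be sketched as two small functions. Replacing double quotes with single quotes is one possible reading of "removes problematic quotes" and is an assumption, as are the function names; the timestamp function expects a timezone-aware datetime because the export's original date format is not specified here.

```python
import re
from datetime import datetime, timezone


def clean_text(text):
    """Collapse newlines and whitespace runs; swap double quotes for single."""
    text = text.replace('"', "'")
    return re.sub(r"\s+", " ", text).strip()


def to_raindrop_timestamp(dt):
    """Format an aware datetime as UTC ISO 8601 with milliseconds and Z."""
    dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"
```

For example, `to_raindrop_timestamp(datetime(2025, 7, 19, 18, 34, 39, tzinfo=timezone.utc))` yields the `2025-07-19T18:34:39.000Z` form shown above.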

📊 Results

Final Statistics

  • Bookmarks: 1,778
  • Folders: 17 (nested under "X/")
  • Tags (before cleaning): 1,247 unique, 2,669 total uses
  • Tags (after cleaning): 203 unique, 7,393 total uses
  • Average tags per bookmark: 4.2

Top Tags (After Cleaning)

  1. twitter: 1,778 uses
  2. devtools: 840 uses
  3. ai: 588 uses
  4. opensource: 328 uses
  5. GPT: 207 uses

🔧 OpenAI Prompt Strategy

Each bookmark is processed with a prompt that:

  • Analyzes full text content and URL
  • References predefined folder and tag lists
  • Suggests most appropriate folder from existing options
  • Selects relevant tags from existing list + new important ones
  • Generates concise, descriptive titles
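
One way to structure such a prompt, and to guard against replies that ignore it, is sketched below. The prompt wording, the JSON reply shape, and both function names are assumptions made for illustration; the actual prompt used by the scripts is not reproduced here.

```python
import json


def build_prompt(text, url, folders, tags):
    """Assemble a single prompt covering title, folder, and tags."""
    return (
        "Classify this bookmark.\n"
        f"URL: {url}\n"
        f"Text: {text}\n"
        f"Choose one folder from: {', '.join(folders)}\n"
        f"Prefer tags from: {', '.join(tags)} (add new ones only if important)\n"
        'Reply with JSON: {"title": "...", "folder": "...", "tags": ["..."]}\n'
        "The title must be a single line of at most 100 characters."
    )


def parse_reply(reply, folders, fallback_folder):
    """Validate the model's JSON reply; keep the manual folder if it strays."""
    data = json.loads(reply)
    title = " ".join(str(data.get("title", "")).split())[:100]
    folder = data.get("folder")
    if folder not in folders:
        folder = fallback_folder  # model picked something outside the list
    tags = [str(t).strip() for t in data.get("tags", []) if str(t).strip()]
    return {"title": title, "folder": folder, "tags": tags}
```

Falling back to the manually assigned folder keeps the folder count stable even when the model invents a new category.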

🚨 Troubleshooting

Common Issues

Raindrop.io Import Problems

  • Cause: Malformed CSV (unescaped newlines in fields, wrong headers)
  • Solution: Import raindrop_cleaned.csv, which follows the exact export format

OpenAI API Issues

  • Rate Limits: Script includes 0.2s delays
  • Invalid Key: Use setup_openai.py or check APIKEY.txt
  • Cost Control: Test with small subset first

Tag Count Confusion

  • Raindrop.io reports total tag uses, not unique tags
  • Use analyze_tags.py to understand the breakdown

File Issues

  • Large Files: JSON is ~8MB, CSV is ~2MB
  • Encoding: All files use UTF-8
  • Line Endings: Handled by Python CSV writer

💡 Lessons Learned

  1. Single API Call: Combining title + classification saves ~50% on API costs
  2. Exact Format Matching: Raindrop.io is strict about CSV format
  3. Tag Cleanup: Essential for usable taxonomy (1,247 → 203 tags)
  4. Text Cleaning: Critical for CSV compatibility
  5. Rate Limiting: Prevents API throttling

🔄 Process Evolution

  1. Initial: Started with twitter_bookmarks.json (no folders/tags)
  2. Manual Enhancement: User added folder and tags fields → twitter_bookmarks_tagged_full.json
  3. V1: json_to_raindrop_csv.py - Basic conversion with timestamp fixes
  4. V2: Added OpenAI title generation
  5. V3: reclassify_with_openai.py - Added folder/tag reclassification (separate API calls)
  6. V4: Optimized to single API call per bookmark
  7. V5: Multiple CSV format attempts to fix Raindrop.io compatibility
  8. V6: create_raindrop_format.py - Final format matching Raindrop export structure
  9. Final: clean_tags.py - Tag cleanup for better taxonomy

📝 Manual Steps Required

  1. Export bookmarks using Twitter Bookmark Exporter → twitter_bookmarks.json
  2. Manually add "folder" and "tags" fields to each bookmark → twitter_bookmarks_tagged_full.json
  3. Obtain OpenAI API key
  4. Place key in APIKEY.txt or set environment variable
  5. Run scripts in sequence (analyze → generate → clean)
  6. Import final CSV to Raindrop.io

🎯 Future Improvements

  • Batch API calls for better efficiency
  • Support for other bookmark sources
  • Custom tag taxonomy rules
  • Automated import via Raindrop.io API
  • Progress bars for long operations

📞 Notes for Future Self/Agents

  • The user prefers the python3 command over python
  • OpenAI API key was provided in APIKEY.txt due to terminal paste issues
  • Tag cleaning with min 5 uses was crucial for usability
  • Raindrop.io is very strict about CSV format - use exact export structure
  • User values cost optimization (single API call approach)
  • All folders should be nested under "X/"
  • Always add "twitter" tag to all entries

Last Updated: January 2025
Total Processing Time: ~15 minutes for 1,778 bookmarks
Estimated API Cost: $5-10 USD
