A Python script that extracts bookmarks from Chrome/Firefox HTML exports, specifically focusing on links saved in the "2024" folder. The script processes the bookmarks and exports them to a CSV file with useful metadata.
- Extracts bookmarks from Chrome/Firefox HTML exports
- Focuses on links within the "2024" folder
- Calculates relative ages (e.g., "2 months ago")
- Extracts main domain names from URLs
- Sorts by date (newest first)
- Exports to CSV format
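
The relative-age and domain features could be implemented roughly as follows. This is a minimal sketch, not the script's actual code: the helper names `relative_age` and `main_domain` are hypothetical, months are approximated as 30 days and years as 365, and the "main domain" is taken to be the hostname with a leading `www.` removed.

```python
from datetime import datetime
from urllib.parse import urlparse


def relative_age(added, now):
    """Return a coarse age string such as '2 months ago'.

    Assumption: 30-day months and 365-day years are close enough for display.
    """
    days = (now - added).days
    if days < 1:
        return "today"
    # Try the largest unit first and fall through to smaller ones.
    for unit, size in (("year", 365), ("month", 30), ("week", 7), ("day", 1)):
        if days >= size:
            count = days // size
            plural = "" if count == 1 else "s"
            return f"{count} {unit}{plural} ago"


def main_domain(url):
    """Return the lowercased hostname without a leading 'www.' (assumed rule)."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


print(relative_age(datetime(2024, 3, 1), datetime(2024, 5, 5)))  # 2 months ago
print(main_domain("https://www.example.com/article"))            # example.com
```

Passing `now` explicitly keeps the helper easy to test; in a real run it would typically be `datetime.now()`.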
- Python 3.x
- pandas
- beautifulsoup4
- Export your bookmarks from Chrome/Firefox to an HTML file
- Name the file bookmarks_12_3_24.html (or update the filename in the script)
- Run: python parse_bookmarks.py
The script generates a bookmarks_export.csv file with the following columns:
- title: The bookmark title
- link: Full URL
- site: Main domain name
- date: Date added (YYYY-MM-DD)
- age: Relative time since adding (e.g., "2 months ago")
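
As an illustration of the column layout above, the export step might look like the sketch below. It assumes the rows are already available as dictionaries. The sample rows, the `COLUMNS` constant, and the `to_csv_text` helper are hypothetical. Only the column names and the newest-first ordering come from this README.

```python
import pandas as pd

# Column order taken from the list above.
COLUMNS = ["title", "link", "site", "date", "age"]

# Made-up sample rows, for illustration only.
rows = [
    {"title": "Older Post", "link": "https://example.org/post",
     "site": "example.org", "date": "2024-01-15", "age": "3 months ago"},
    {"title": "Example Article", "link": "https://example.com/article",
     "site": "example.com", "date": "2024-03-01", "age": "2 months ago"},
]


def to_csv_text(rows):
    """Build the CSV text with the newest bookmarks first."""
    frame = pd.DataFrame(rows, columns=COLUMNS)
    # YYYY-MM-DD strings sort chronologically, so a plain sort is enough.
    frame = frame.sort_values("date", ascending=False)
    # With no path argument, to_csv returns the CSV as a string.
    return frame.to_csv(index=False)


print(to_csv_text(rows))
```

To write the file instead, the same frame could be passed a path, for example `frame.to_csv("bookmarks_export.csv", index=False)`.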
- Parses the HTML bookmarks file using BeautifulSoup
- Locates the "2024" folder
- Extracts bookmark metadata including dates and URLs
- Processes timestamps into human-readable formats
- Exports the data to CSV
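
The parsing steps above can be sketched as follows. This is an assumption-laden outline rather than the script's real code: it relies on the usual Netscape bookmark export layout, where a folder is an `H3` heading followed by a `DL` list and each link carries an `ADD_DATE` attribute in Unix seconds. The `SAMPLE` markup and the `extract_folder` helper are hypothetical.

```python
from datetime import datetime, timezone

from bs4 import BeautifulSoup

# Tiny stand-in for a real export file (hypothetical content).
SAMPLE = """<!DOCTYPE NETSCAPE-Bookmark-file-1>
<DL><p>
  <DT><H3 ADD_DATE="1704067200">2024</H3>
  <DL><p>
    <DT><A HREF="https://example.com/article" ADD_DATE="1709251200">Example Article</A>
  </DL><p>
  <DT><A HREF="https://example.org/" ADD_DATE="1600000000">Outside the folder</A>
</DL><p>
"""


def extract_folder(html, folder_name="2024"):
    """Return title, link, and date for every bookmark inside the named folder."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h3", string=folder_name)
    if heading is None:
        return []
    # The folder's contents live in the first <dl> that follows its heading.
    listing = heading.find_next("dl")
    if listing is None:
        return []
    rows = []
    for anchor in listing.find_all("a"):
        stamp = anchor.get("add_date")  # html.parser lowercases attribute names
        date = ""
        if stamp:
            added = datetime.fromtimestamp(int(stamp), tz=timezone.utc)
            date = added.strftime("%Y-%m-%d")
        rows.append({
            "title": anchor.get_text(strip=True),
            "link": anchor.get("href"),
            "date": date,
        })
    return rows


print(extract_folder(SAMPLE))
```

Because `find_all` searches the whole subtree, this sketch also picks up links in any subfolders of the target folder.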
title,link,site,date,age
"Example Article",https://example.com/article,example.com,2024-03-01,2 months ago
- This is a simple copy-paste of the table bodies on the HN Favorites page.
- It is very clunky and meant as a one-time job rather than a proper solution.
- Since the link takes a URL parameter for the user ID and another for the page, I could make it easier to work with, but I am happy with it for now.
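
If the copy-paste step were ever automated, the page links could be generated along these lines. This is only a sketch: the base URL and the parameter names `id` (user) and `p` (page) are assumptions about how the HN Favorites link is structured, and `favorites_url` is a hypothetical helper. Nothing here fetches any pages.

```python
from urllib.parse import urlencode

# Assumed base URL for the favorites listing.
BASE_URL = "https://news.ycombinator.com/favorites"


def favorites_url(user_id, page=1):
    """Build the favorites link for one user and page (assumed parameter names)."""
    return f"{BASE_URL}?{urlencode({'id': user_id, 'p': page})}"


# Links for the first three pages of a made-up user.
for page in range(1, 4):
    print(favorites_url("example_user", page))
```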
MIT