Releases: serejekee/thebat_parser
1.0
📦 Changelog: https://github.com/serejekee/thebat_parser/commits/1.0
[v1.0.0] - Initial Release
✨ New Features
- Parses `.eml` files from the `data/` directory.
- Extracts key email fields: date, sender, recipient, subject, body, and attachments.
- Converts HTML bodies to plain text using `BeautifulSoup`.
- Cleans and formats email content for readability.
- Generates a structured `.docx` file with all parsed emails in table format using `python-docx`.
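The field extraction described above can be sketched with the standard `email` library alone. This is a minimal illustration, not the project's actual code: the function names `parse_eml` and `parse_directory` and the dictionary keys are assumptions made for the example.

```python
from email import policy
from email.parser import BytesParser
from pathlib import Path


def parse_eml(raw: bytes) -> dict:
    """Extract the basic fields from one raw .eml message (hypothetical helper)."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)

    # Prefer the plain-text part; fall back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    # Collect the file names of any attachments that declare one.
    attachments = [
        part.get_filename()
        for part in msg.iter_attachments()
        if part.get_filename()
    ]

    return {
        "date": str(msg["Date"] or ""),
        "sender": str(msg["From"] or ""),
        "recipient": str(msg["To"] or ""),
        "subject": str(msg["Subject"] or ""),
        "body": body.strip(),
        "attachments": attachments,
    }


def parse_directory(directory: str = "data") -> list:
    """Parse every .eml file in a directory, in file-name order."""
    return [parse_eml(p.read_bytes()) for p in sorted(Path(directory).glob("*.eml"))]
```

Calling `parse_directory("data")` would return one dictionary per message. If only an HTML part exists, `body` still contains markup at this stage, so it would need a separate conversion step.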
🛠️ Technologies Used
- Python 3.8+
- `beautifulsoup4` for HTML parsing
- `python-docx` for document creation
- Standard `email` library for parsing `.eml` messages
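As a rough illustration of the HTML-parsing role listed above, the following sketch converts an HTML body to plain text with `beautifulsoup4`. The helper name `html_to_text` and the cleanup rules (dropping scripts and styles, removing blank lines) are assumptions; the release does not describe exactly how the script cleans content.

```python
from bs4 import BeautifulSoup


def html_to_text(html: str) -> str:
    """Convert an HTML fragment to plain text (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")

    # Remove elements whose content is never meant to be read.
    for tag in soup(["script", "style"]):
        tag.decompose()

    # Put each text node on its own line, then drop empty lines.
    text = soup.get_text(separator="\n")
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

Using a newline separator keeps block-level elements apart, at the cost of also splitting text around inline tags such as `<b>`; a different separator may suit some messages better.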
📁 Output
`emails.docx` containing a summary of all emails with:
- Date/Time
- Sender
- Recipient
- Subject + Message Body
- Attachment Names
⚠️ Notes
- Ensure `.eml` files are placed in the `data/` folder before running.
- `pandas` is included in `requirements.txt` but not currently used in the script.