Financial Control is a Python project for analyzing personal finances. It extracts, processes, and visualizes financial data from PDF files such as salary receipts and credit card statements. It is aimed at anyone who wants to track expenses, follow spending trends, and make informed financial decisions.
- PDF Data Extraction: Automatically extracts relevant data (e.g., base salary, expenses) from multiple PDFs in a folder.
- Expense Categorization: Categorizes transactions based on descriptions using a customizable dictionary.
- Interactive Visualizations: Generates insights through dynamic charts and temporal trend analysis.
- Custom Date Parsing: Parses the month and year from each filename so that records can be grouped accurately by period.
- Export to CSV: Outputs processed data into CSV format for further analysis.
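The date-parsing feature above could look roughly like the following sketch. The README does not state the filename convention, so the pattern here (a four-digit year followed by a two-digit month, as in visa_2024-03.pdf) and the function name parse_period are assumptions for illustration, not the project's actual code.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical naming convention: a four-digit year and a two-digit month
# somewhere in the filename, e.g. "visa_2024-03.pdf" or "salary_202312.pdf".
_YEAR_MONTH = re.compile(r"(?<!\d)(20\d{2})[-_]?(0[1-9]|1[0-2])(?!\d)")


def parse_period(filename: str) -> Optional[date]:
    """Return the first day of the month encoded in the filename, or None."""
    match = _YEAR_MONTH.search(filename)
    if match is None:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    return date(year, month, 1)
```

Returning None for filenames that do not match lets the caller decide whether to skip the file or fail loudly, rather than silently grouping it under a wrong month.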
- Python 3.8 or higher.
- Required libraries:
  - pdfplumber
  - pandas
  - matplotlib
  - seaborn
  - plotly
Install them via:
pip install pdfplumber pandas matplotlib seaborn plotly
Ensure your folder structure looks like this:
Financial Control/
├── raw_pdfs/ # Folder containing all PDF files
├── processed_data/ # Folder for saving the processed CSV files
└── scripts/ # Python scripts for running the project
- Clone the repository:
git clone https://github.com/mlascam/financial-control.git
- Navigate to the project folder:
cd financial-control
- Run the Python scripts from the scripts/ directory.
Run the extraction script to pull the base salary and other financial metrics from the PDFs in the specified folder:
python extract_financial_data.py
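The contents of extract_financial_data.py are not shown here, so the following is only a minimal sketch of how such an extraction step might work with pdfplumber. The line layout (a description followed by an amount written as 1.234,56) and the helper names parse_line and extract_rows are assumptions; real statements will likely need a different pattern.

```python
import re
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Assumed layout: free-text description, then an amount at the end of the
# line using "." for thousands and "," for decimals (e.g. "1.234,56").
_LINE = re.compile(r"^(?P<desc>.+?)\s+(?P<amount>-?\d{1,3}(?:\.\d{3})*,\d{2})$")


def parse_line(line: str) -> Optional[Tuple[str, float]]:
    """Split one text line into (description, amount), or return None."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    amount = float(match.group("amount").replace(".", "").replace(",", "."))
    return match.group("desc"), amount


def extract_rows(pdf_dir: str) -> List[Dict[str, object]]:
    """Collect one row per matching line from every PDF in pdf_dir."""
    import pdfplumber  # imported lazily so parse_line works without it

    rows: List[Dict[str, object]] = []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        with pdfplumber.open(pdf_path) as pdf:
            for page_number, page in enumerate(pdf.pages, start=1):
                text = page.extract_text() or ""  # guard against empty pages
                for line in text.splitlines():
                    parsed = parse_line(line)
                    if parsed is not None:
                        desc, amount = parsed
                        rows.append({
                            "PDF File": pdf_path.name,
                            "Page": page_number,
                            "Description": desc,
                            "Amount": amount,
                        })
    return rows
```

Keeping the line parser separate from the PDF reading makes the pattern easy to test against a few sample lines before running it over a whole folder, for example with extract_rows("raw_pdfs").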
Edit the categorization dictionary in the script to match your specific transaction descriptions. Example:
category_dict = {
"PEDIDOSYA": "Food Delivery",
"COTO": "Supermarket",
"MERCADOLIBRE": "Online Shopping"
}
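One way such a dictionary could be applied is a case-insensitive substring match, sketched below. The function name categorize and the "Uncategorized" fallback are hypothetical; the project's script may match descriptions differently.

```python
from typing import Mapping

category_dict = {
    "PEDIDOSYA": "Food Delivery",
    "COTO": "Supermarket",
    "MERCADOLIBRE": "Online Shopping",
}


def categorize(description: str, categories: Mapping[str, str],
               default: str = "Uncategorized") -> str:
    """Return the category of the first keyword found in the description."""
    upper = description.upper()
    for keyword, category in categories.items():
        if keyword in upper:
            return category
    return default  # assumed fallback label for unmatched descriptions
```

Because this is plain substring matching, a short keyword such as COTO also matches longer words that contain it, so more specific keywords should be listed first or made longer.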
Leverage the visualization scripts to analyze trends:
- Spending over time
- Spending by category
Example:
python visualize_financial_data.py
Processed data is saved in processed_data/combined_data.csv with the following columns:
- PDF File: Name of the source PDF
- Page: Page number in the PDF
- Description: Extracted transaction description
- Amount: Transaction amount
- Category: Assigned category (if applicable)
- Date: Parsed date based on the filename
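As a rough illustration of working with these columns, the sketch below sums the Amount column by category and by date using only the standard library. The sample rows are made up, the ISO date format is an assumption about how the Date column is written, and totals_by is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def totals_by(rows: Iterable[Mapping[str, str]], column: str) -> Dict[str, float]:
    """Sum the Amount column, grouped by the values of another column."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[column]] += float(row["Amount"])
    return dict(totals)


# Made-up sample rows standing in for processed_data/combined_data.csv.
sample = io.StringIO(
    "PDF File,Page,Description,Amount,Category,Date\n"
    "visa_2024-03.pdf,1,PEDIDOSYA,1200.50,Food Delivery,2024-03-01\n"
    "visa_2024-03.pdf,1,COTO,3000.00,Supermarket,2024-03-01\n"
    "visa_2024-04.pdf,2,PEDIDOSYA,500.00,Food Delivery,2024-04-01\n"
)
rows = list(csv.DictReader(sample))
by_category = totals_by(rows, "Category")
by_month = totals_by(rows, "Date")
```

To run this against the real output, replace the sample with open("processed_data/combined_data.csv", newline="") inside a with block.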
Generates dynamic charts to explore:
- Spending trends (monthly/yearly)
- Category-wise expenses
- Custom filters for deeper analysis
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch:
git checkout -b feature/your-feature-name
- Commit your changes:
git commit -m "Add your message here"
- Push the branch:
git push origin feature/your-feature-name
- Open a Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.
- GitHub: mlascam
- LinkedIn: Maximiliano Lasca
Start taking control of your finances today with Financial Control!