A Flask-based web application for data preprocessing and visualization, designed to help users analyze Excel data through column selection, renaming, chart generation, and automated Google Slides presentation creation.
- Excel File Upload: Support for `.xls` and `.xlsx` files with drag-and-drop interface
- Column Management: Select specific columns and rename them for better analysis
- Data Visualization: Generate interactive bar charts and pie charts from your data
- Chart Toggle: Switch between bar and pie chart views for each column
- Google Slides Integration: Automatically generate presentation slides with your charts
- Download Options: Download individual charts or all charts at once
- Modern UI: Responsive design with smooth animations and intuitive navigation
- Session Management: Maintains user state throughout the workflow
- Error Handling: Comprehensive error handling with custom 404 page
AI-pca-pase01/
├── app.py                      # Main Flask application
├── requirements.txt            # Python dependencies
├── README.md                   # Project documentation
├── LICENSE                     # MIT License
├── .gitignore                  # Git ignore rules
├── .gitattributes              # Git attributes
├── templates/                  # HTML templates
│   ├── upload.html             # File upload page
│   ├── columns.html            # Column selection page
│   ├── rename_columns.html     # Column renaming page
│   ├── results.html            # Data processing results
│   ├── graphs.html             # Chart visualization page
│   └── error_404.html          # Custom error page
├── static/                     # Static assets
│   ├── css/                    # Stylesheets
│   │   ├── upload.css          # Upload page styles
│   │   ├── columns.css         # Column selection styles
│   │   ├── graphs.css          # Charts page styles
│   │   ├── results.css         # Results page styles
│   │   └── error_404.css       # Error page styles
│   └── images/
│       └── 404_error.svg       # Error page illustration
├── analysis/                   # Generated content
│   ├── charts/                 # Chart images (auto-generated)
│   └── results/                # Analysis results (auto-generated)
├── uploads/                    # Uploaded files directory (auto-created)
└── credentials.json            # Google API credentials (not included)
- Python 3.7 or higher
- pip package manager
- Google Cloud Console account (for Slides integration)
- Clone the repository: `git clone https://github.com/yasithS/AI-pca-pase01.git`, then `cd AI-pca-pase01`
- Create a virtual environment (recommended): `python -m venv venv`, then `source venv/bin/activate` (on Windows: `venv\Scripts\activate`)
- Install dependencies: `pip install -r requirements.txt`
- Set up Google API credentials (for Slides integration)
  - Go to Google Cloud Console
  - Create a new project or select an existing one
  - Enable the Google Slides API and Google Drive API
  - Create credentials (OAuth 2.0 Client ID)
  - Download the credentials file and save it as `credentials.json` in the project root
- Run the application: `python app.py`
- Access the application: open your web browser and navigate to http://localhost:5000
- Flask 2.3.3 - Web framework
- pandas 2.0.3 - Data manipulation and analysis
- numpy 1.24.3 - Numerical computing
- openpyxl 3.1.2 - Excel file reading/writing
- matplotlib 3.7.2 - Data visualization
- seaborn 0.12.2 - Statistical data visualization
- google-api-python-client - Google APIs integration
- google-auth-httplib2 - Google authentication
- google-auth-oauthlib - OAuth 2.0 for Google APIs
- python-pptx 0.6.21 - PowerPoint file handling
- scikit-learn 1.3.0 - Machine learning library
- Werkzeug 2.3.7 - WSGI utilities
- Navigate to the upload page
- Drag and drop or click to select an `.xls` or `.xlsx` file
- Click "Upload and Continue" to proceed
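The extension check behind the upload step can be sketched as follows. This is a minimal illustration, not the code in `app.py`; the names `ALLOWED_EXTENSIONS` and `allowed_file` are assumptions chosen for the example.

```python
from pathlib import Path

# Assumption: only the two extensions named in this README are accepted.
ALLOWED_EXTENSIONS = {".xls", ".xlsx"}


def allowed_file(filename: str) -> bool:
    """Return True if the filename ends in an accepted Excel extension."""
    # Only the final suffix counts, so "data.xlsx.exe" is rejected.
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


print(allowed_file("survey.xlsx"))  # True
print(allowed_file("notes.csv"))    # False
```

In a Flask upload handler, a check like this would typically be paired with `werkzeug.utils.secure_filename` before the file is written to disk.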
- Review all available columns from your dataset
- Check the boxes for columns you want to analyze
- Use "Select All" to quickly select all columns
- Click "Next: Rename Columns"
- Modify column names as needed for better readability
- Leave fields unchanged to keep original names
- Click "Process & Continue"
- Review the processing summary
- Check data shape and column mappings
- Preview the first 5 rows of processed data
- Continue to chart generation
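The renaming rule described above (blank or unchanged fields keep their original names) could be expressed roughly like this. The function name `build_rename_map` and the sample column names are hypothetical; the actual form handling in `app.py` may differ.

```python
def build_rename_map(selected, new_names):
    """Map each selected column to its new name.

    Columns whose field is blank or identical to the original are left
    out of the mapping, so they keep their original names.
    """
    mapping = {}
    for column in selected:
        candidate = (new_names.get(column) or "").strip()
        if candidate and candidate != column:
            mapping[column] = candidate
    return mapping


# Hypothetical form submission: one rename, one unchanged, one blank.
selected = ["cust_age", "region", "q1_score"]
form_values = {"cust_age": "Age", "region": "region", "q1_score": ""}

rename_map = build_rename_map(selected, form_values)
print(rename_map)  # {'cust_age': 'Age'}
```

With pandas, a mapping of this shape can be applied as `df[selected].rename(columns=rename_map)`, after which `.shape` and `.head(5)` give the data shape and five-row preview shown on the results page.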
- View automatically generated charts for each column
- Toggle between bar charts and pie charts using the toggle button
- Download individual charts or all charts at once
- Charts are automatically saved to the `analysis/charts/` directory
- Click "Generate the slides" to authorize Google integration
- Complete OAuth authorization in your browser
- The application automatically creates a presentation with your selected charts
- Access your presentation through the provided Google Slides link
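As a rough sketch of how placeholder-based slide generation can work, the function below builds `replaceAllShapesWithImage` requests for the Google Slides API `batchUpdate` method. The placeholder format `{{chart_N}}`, the function name, and the example URLs are assumptions made for illustration; the template used by this project may use different placeholders.

```python
def build_image_requests(chart_urls):
    """Build Slides API requests that swap placeholder shapes for chart images."""
    requests = []
    for index, url in enumerate(chart_urls, start=1):
        requests.append({
            "replaceAllShapesWithImage": {
                "imageUrl": url,  # must be reachable by the Slides API
                "imageReplaceMethod": "CENTER_INSIDE",
                "containsText": {
                    # Hypothetical placeholder text, e.g. {{chart_1}}
                    "text": f"{{{{chart_{index}}}}}",
                    "matchCase": True,
                },
            }
        })
    return requests


# Example URLs are placeholders, not real chart locations.
requests = build_image_requests([
    "https://example.com/chart_age.png",
    "https://example.com/chart_region.png",
])
print(requests[0]["replaceAllShapesWithImage"]["containsText"]["text"])  # {{chart_1}}
```

With an authorized `google-api-python-client` service object, a list like this would be sent as `service.presentations().batchUpdate(presentationId=PRESENTATION_ID, body={"requests": requests}).execute()`.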
- Bar Charts: Default visualization showing value counts for each column
- Pie Charts: Alternative visualization showing data distribution as percentages
- Interactive Toggle: Switch between chart types with a single click
- High Quality: All charts generated at high resolution for presentations
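To make the difference between the two chart types concrete, here is a small standard-library sketch of the numbers behind each view: raw value counts for the bar chart and percentage shares for the pie chart. The helper names and sample data are illustrative assumptions; the application itself presumably computes these with pandas.

```python
from collections import Counter


def value_counts(values):
    """Count each distinct non-missing value, most frequent first."""
    counts = Counter(v for v in values if v is not None)
    return counts.most_common()


def percentages(counts):
    """Convert (label, count) pairs into (label, percent) pairs."""
    total = sum(n for _, n in counts)
    return [(label, round(100 * n / total, 1)) for label, n in counts]


answers = ["Yes", "No", "Yes", "Maybe", "Yes", None, "No"]
counts = value_counts(answers)
print(counts)               # [('Yes', 3), ('No', 2), ('Maybe', 1)]
print(percentages(counts))  # [('Yes', 50.0), ('No', 33.3), ('Maybe', 16.7)]
```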
For production deployment, set the following environment variables:
export SECRET_KEY='your-secure-secret-key-here'
export FLASK_ENV='production'
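One way an application might pick up `SECRET_KEY` from the environment is sketched below. The helper name `load_secret_key` and the random fallback are assumptions for illustration and may not match what `app.py` does.

```python
import os
import secrets


def load_secret_key(environ=os.environ):
    """Return SECRET_KEY from the environment, or a random key for local use."""
    key = environ.get("SECRET_KEY")
    if key:
        return key
    # Fallback for development only: sessions are invalidated on every restart.
    return secrets.token_hex(32)


print(load_secret_key({"SECRET_KEY": "example-value"}))  # example-value
```

In a Flask app the result would be assigned with `app.secret_key = load_secret_key()`. In production the variable should always be set so that sessions survive restarts.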
The application uses a predefined Google Slides template. Update the `PRESENTATION_ID` in `app.py`:
`PRESENTATION_ID = "your-google-slides-template-id"`
Default upload directory is `uploads/`. Modify in `app.py` if needed:
`UPLOAD_FOLDER = 'your-custom-upload-path'`
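The project structure marks `uploads/` as auto-created and the `analysis/` subdirectories as auto-generated. A minimal sketch of creating them at startup might look like this; the function name `ensure_directories` is hypothetical, and the demo runs in a temporary directory so nothing is written to the project itself.

```python
import tempfile
from pathlib import Path


def ensure_directories(base_dir, upload_folder="uploads"):
    """Create the upload and analysis directories if they do not exist yet."""
    base = Path(base_dir)
    targets = [
        base / upload_folder,
        base / "analysis" / "charts",
        base / "analysis" / "results",
    ]
    for path in targets:
        # exist_ok=True makes the call safe to repeat on every startup.
        path.mkdir(parents=True, exist_ok=True)
    return targets


with tempfile.TemporaryDirectory() as tmp:
    created = ensure_directories(tmp, upload_folder="your-custom-upload-path")
    print(all(p.is_dir() for p in created))  # True
```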
- Responsive Design: Works seamlessly on desktop and mobile devices
- Drag & Drop: Intuitive file upload with visual feedback
- Progress Indicators: Real-time feedback during file processing
- Smooth Animations: Professional transitions and hover effects
- Robust Excel Parsing: Handles various Excel formats and structures
- Column Validation: Ensures data integrity throughout the process
- Session Management: Secure handling of user data and preferences
- OAuth 2.0 Security: Secure authentication with Google services
- Automatic Uploads: Charts uploaded to Google Drive with proper permissions
- Template System: Uses placeholder-based slide generation
- Session Security: Encrypted session management
- File Validation: Strict file type checking for uploads
- Google OAuth: Secure authentication for API access
- Error Handling: Graceful handling of errors without exposing sensitive data
- PCA Analysis: Principal Component Analysis implementation (planned)
- Advanced Visualizations: More chart types and interactive plots
- Data Export: Export processed data in multiple formats
- Batch Processing: Handle multiple files simultaneously
- User Authentication: Multi-user support with individual accounts
- Cloud Storage: Integration with multiple cloud storage providers
Upload Errors
- Ensure the file is in `.xls` or `.xlsx` format
- Check file size (large files may time out)
- Verify the file is not corrupted or password-protected
Google Slides Integration
- Ensure `credentials.json` is properly configured
- Check that the Google Slides API and Drive API are enabled
- Verify the OAuth consent screen is configured
Chart Generation
- Ensure the matplotlib backend is properly configured
- Check that the `analysis/charts/` directory has write permissions
- Verify the data contains valid values for visualization
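For the chart generation checks, the sketch below shows one common server-side setup: selecting matplotlib's non-interactive `Agg` backend before importing `pyplot`, and confirming the output directory is writable before saving. The function name `save_bar_chart`, the figure size, and the DPI are assumptions for illustration; the demo writes to a temporary directory rather than `analysis/charts/`.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; select it before importing pyplot
import matplotlib.pyplot as plt


def save_bar_chart(labels, counts, out_dir, name="chart.png"):
    """Render a bar chart to out_dir and return the path of the saved image."""
    os.makedirs(out_dir, exist_ok=True)
    if not os.access(out_dir, os.W_OK):
        raise PermissionError(f"Cannot write to {out_dir}")
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.bar(labels, counts)
    ax.set_ylabel("Count")
    path = os.path.join(out_dir, name)
    fig.savefig(path, dpi=200, bbox_inches="tight")
    plt.close(fig)  # release the figure so memory does not grow per request
    return path


out = save_bar_chart(["Yes", "No", "Maybe"], [3, 2, 1], tempfile.mkdtemp())
print(os.path.exists(out))  # True
```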
This project is licensed under the MIT License. See the LICENSE file for full details.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
For questions, issues, or suggestions:
- Create an issue on GitHub
- Check the troubleshooting section above
- Review the configuration options
Note: This tool is currently focused on data visualization and Google Slides integration. The PCA analysis feature mentioned in the project name is planned for future releases.
# Clone and setup
git clone https://github.com/yasithS/AI-pca-pase01.git
cd AI-pca-pase01
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
# Add Google credentials (credentials.json)
# Run the application
python app.py
# Open http://localhost:5000 in your browser
Ready to transform your Excel data into beautiful visualizations!