This web application is designed to scrape application data from Google Play and the App Store based on user-defined criteria. The scraped data is transformed for analysis, saved to CSV, and uploaded to Google Cloud Storage. A custom dashboard created in Google Looker then visualizes this data.
This project is part of the thesis "Analýza trhu s mobilními aplikacemi s využitím Competitive Intelligence a jeho nástrojů" (Analysis of the Mobile Application Market Using Competitive Intelligence and Its Tools), Prague University of Economics and Business.
These instructions will get a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on deploying the project on a live system.
- Data Scraping: Dynamically scrapes app data from Google Play and the App Store using the `google-play-scraper` and `app-store-scraper` libraries.
- User Input: Allows users to define scraping criteria through a web interface.
- Data Transformation: Cleans, categorizes, and enhances the data using Python with Pandas, TextBlob, and iso639 libraries.
- Data Storage: Saves the transformed data to CSV files and uploads them to Google Cloud Storage.
- Visualization: A custom Google Looker dashboard fetches the data for visualization and analysis.
- User Interaction: The user inputs their search criteria via the frontend, which is developed with HTML, CSS, and Vanilla JavaScript.
- Scraping: The backend, powered by Node.js, scrapes the app data based on the user's input.
- Data Transformation: The scraped data is saved as a CSV file and then transformed using Python. This includes cleaning, categorization, and calculation of new fields relevant for analysis.
- Storage and Visualization: The final CSV file is uploaded to Google Cloud Storage. A custom dashboard in Google Looker then accesses this data via a connector for visualization.
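
The transformation step described above (cleaning, categorization, and derived fields) can be sketched roughly as follows. This is only an illustration under stated assumptions: the project itself uses Pandas, TextBlob, and iso639, while this sketch swaps in Python's standard `csv` module so it runs without extra dependencies. The column names (`title`, `score`, `installs`, `price`), the derived fields, and the rating thresholds are all hypothetical and are not taken from the project's actual schema.

```python
import csv
import io

# Hypothetical sample of scraped data; the real CSV schema may differ.
RAW_CSV = """title,score,installs,price
Example App,4.6,"1,000,000+",0
Another App,,500+,2.99
"""

# Hypothetical output columns for the transformed file.
FIELDNAMES = ["title", "score", "installs_min", "is_free", "rating_bucket"]


def parse_installs(text):
    """Turn a string such as '1,000,000+' into an integer lower bound."""
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else 0


def rating_bucket(score):
    """Categorize a numeric rating; the thresholds are illustrative only."""
    if score is None:
        return "unrated"
    if score >= 4.0:
        return "high"
    if score >= 3.0:
        return "medium"
    return "low"


def transform(rows):
    """Clean each scraped row and add derived fields."""
    for row in rows:
        score = float(row["score"]) if row["score"] else None
        price = float(row["price"] or 0)
        yield {
            "title": row["title"].strip(),
            "score": score,
            "installs_min": parse_installs(row["installs"]),
            "is_free": price == 0,
            "rating_bucket": rating_bucket(score),
        }


def transform_csv(raw_text):
    """Read raw CSV text, transform it, and return rows plus CSV output."""
    reader = csv.DictReader(io.StringIO(raw_text))
    rows = list(transform(reader))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return rows, out.getvalue()


rows, cleaned_csv = transform_csv(RAW_CSV)
print(cleaned_csv)
```

In the real pipeline the cleaned output would be written to a file and then uploaded to Google Cloud Storage; that upload step is left out here because it depends on the service account and bucket configuration.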
- Node.js
- Python 3.x
- Access to Google Cloud Storage and Google Looker with a service account
- Bc. Hoang Nam Dao (robogentlenam)
- PhDr. Jan Černý, Ph.D. for leading the thesis