Home

Welcome to the Steam-Sales-Analysis wiki!

Main Highlights

Automated and scheduled the Steam sales data pipeline with Prefect, covering data ingestion, validation, and transformation, with fault-tolerance mechanisms such as flagging faulty IDs, logging each fetcher's last run time, and soft restarts from the point of failure.
Developed and maintained steamstore_etl, a Python-based CLI tool, and implemented a CI/CD pipeline using GitHub Actions to automate testing and publish the package on PyPI, streamlining distribution and deployment.
Created an interactive Tableau dashboard showing KPIs along with regional and demographic analyses of Steam sales, giving insight into user behavior and sales trends.
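
The fault-tolerance idea mentioned above (flag faulty IDs, record the last run time, restart softly from the point of failure) can be sketched roughly as follows. This is a minimal illustration and not the project's actual implementation: the function name `run_with_checkpoint`, the JSON state file, and its `done` / `faulty` / `last_run` keys are all assumptions made for the example, and the real pipeline may well keep this state in the database instead.

```python
import json
import time
from pathlib import Path


def run_with_checkpoint(app_ids, fetch, state_path):
    """Process IDs in order, skipping ones already handled in earlier runs.

    Hypothetical helper: `fetch` is any callable that takes one ID and
    raises on failure. The state file layout is an assumption.
    """
    path = Path(state_path)
    state = {"done": [], "faulty": [], "last_run": None}
    if path.exists():
        # Soft restart: pick up the state left behind by a previous run.
        state = json.loads(path.read_text())

    done = set(state["done"])
    faulty = set(state["faulty"])

    for app_id in app_ids:
        if app_id in done or app_id in faulty:
            continue  # already processed or already flagged
        try:
            fetch(app_id)
            done.add(app_id)
        except Exception:
            # Flag the ID instead of aborting the whole run.
            faulty.add(app_id)
        # Persist after every ID so an interruption loses at most one item.
        state = {
            "done": sorted(done),
            "faulty": sorted(faulty),
            "last_run": time.time(),
        }
        path.write_text(json.dumps(state))

    return state
```

Calling the function a second time with the same state file only processes IDs that are neither done nor flagged, which is what makes the restart "soft".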

Other pointers

Developed a data pipeline that automates the retrieval, processing, validation, and ingestion of Steam game sales data using the SteamSpy and Steam APIs, improving data accuracy and availability.
Engineered a Python package, 'steam-sales', and a CLI tool, 'steamstore', to facilitate seamless ETL (Extract, Transform, Load) operations, enhancing the efficiency and scalability of the data processing workflow.
Designed and implemented a MySQL database hosted on Aiven Cloud to store and manage large-scale Steam game metadata and sales data, ensuring reliable data storage and easy access for analysis.
Optimized the ETL pipeline by incorporating environment-specific configurations through .env files, enabling smooth deployment across different operating systems (Windows, Ubuntu) and enhancing the project's portability.
Utilized GitHub Actions to create a CI/CD pipeline that automatically tests the Python distribution across multiple operating systems, ensuring consistent performance and reducing manual testing efforts.
Addressed API rate limits and error handling challenges by implementing retries and user-agent modifications, improving the robustness and reliability of data fetching operations from the Steam API.
Documented the entire project setup and usage instructions in a comprehensive README file, facilitating easy onboarding for new users and contributors.
Leveraged Python libraries like requests, pandas, and sqlalchemy to streamline data fetching, transformation, and storage processes, contributing to a more maintainable and efficient codebase.
Ensured code quality and project maintainability by adhering to best practices in software engineering, including version control, modular design, and thorough testing, leading to a stable and extensible project.
Enhanced the fault tolerance of the data pipeline by implementing mechanisms to flag faulty IDs, log the last run time for fetchers, and enable soft restarts from the point of failure, ensuring minimal data loss and continuous operation even in the event of errors or interruptions.
Developed an interactive Tableau dashboard showcasing key performance indicators (KPIs) and regional and demographic analyses of Steam sales, giving deeper insight into the gaming user base and supporting data-driven decision-making.
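
The retry and user-agent handling mentioned in the list above could look something like the sketch below. It is written under several assumptions: the project uses `requests`, while this example uses only the standard library's `urllib` so that it runs without extra dependencies; the `USER_AGENTS` values, the function name `fetch_with_retries`, the exponential backoff schedule, and the choice to rotate the User-Agent on each attempt are illustrative and not taken from the project.

```python
import time
import urllib.error
import urllib.request

# Hypothetical User-Agent strings, rotated between attempts.
USER_AGENTS = [
    "steam-etl-example/1.0",
    "steam-etl-example/1.0 (retry)",
]


def _default_send(url, headers):
    """Send one GET request with the given headers and return the body."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def fetch_with_retries(url, max_attempts=4, base_delay=1.0,
                       send=None, sleep=time.sleep):
    """Retry a request with exponential backoff and a rotating User-Agent.

    `send` and `sleep` are injectable so the logic can be exercised
    without network access or real waiting.
    """
    send = send or _default_send
    last_error = None
    for attempt in range(max_attempts):
        headers = {"User-Agent": USER_AGENTS[attempt % len(USER_AGENTS)]}
        try:
            return send(url, headers)
        except (urllib.error.URLError, TimeoutError) as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                # Wait 1x, 2x, 4x, ... the base delay before retrying.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(
        f"giving up on {url} after {max_attempts} attempts"
    ) from last_error
```

Backing off between attempts gives a rate-limited API time to recover, and capping the number of attempts keeps a permanently failing ID from stalling the run.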
