
👋 Hi, I'm Zubair Fayyaz

Data Engineer | BI Engineer | Data Analyst

LinkedIn Email GitHub


πŸ” About Me

🎯 I’m an innovative Data & Software Engineer with 3+ years of hands-on experience building robust data infrastructure, AI-powered automation tools, and business intelligence dashboards.

🌐 Currently working at CBA as a Data Pipeline | BI Engineer
πŸ” Previously worked at Intellicode.tech and Nexthon
☁️ Specialize in Azure Data Factory, Apache Superset, Python ETL, and OpenAI integrations
🛠 Built AI-driven ERP systems, OCR invoice processing, voice-to-form automation, and code generators


🧰 Tech Stack

  • Languages: Python, Java, SQL
  • Databases: MySQL, PostgreSQL, MongoDB
  • Data Tools: Apache Airflow, Superset, Metabase, Power BI
  • Cloud & Infra: Azure Data Factory, Synapse Analytics, Docker
  • AI & ML: OCR (EasyOCR, Tesseract), OpenAI GPT, Whisper, Transformers
  • Other: Flask, JasperReports, Git

🚀 Highlighted Projects

📊 ERP Reporting System

  • Built full Kimball-modeled data warehouse in MySQL.
  • Designed CDC/SCD pipelines and Superset dashboards.
  • Created curated reporting zones for business insights.
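
As a rough illustration of the SCD side of such a pipeline, here is a minimal in-memory sketch of a Type 2 slowly changing dimension update. The README does not say which SCD type the project used, so Type 2 is an assumption, and the row layout (`id`, `attrs`, `valid_from`, `valid_to`, `is_current`) and the function name `apply_scd2` are hypothetical, not taken from the project code.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Apply incoming records to a dimension table using SCD Type 2 rules.

    dimension: list of dict rows (hypothetical layout, see keys below).
    incoming:  list of dicts with "id" and "attrs".
    today:     date used to close old versions and open new ones.
    """
    # Index the currently active version of each business key.
    current = {row["id"]: row for row in dimension if row["is_current"]}

    for record in incoming:
        existing = current.get(record["id"])
        if existing is not None and existing["attrs"] == record["attrs"]:
            continue  # attributes unchanged, keep the current version

        if existing is not None:
            # Close the old version instead of overwriting it.
            existing["valid_to"] = today
            existing["is_current"] = False

        new_row = {
            "id": record["id"],
            "attrs": dict(record["attrs"]),
            "valid_from": today,
            "valid_to": None,
            "is_current": True,
        }
        dimension.append(new_row)
        current[record["id"]] = new_row

    return dimension
```

In a warehouse, the same logic would typically be expressed as SQL against the dimension table; the list-of-dicts version only shows the versioning rule itself.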

🧾 Invoice OCR & Bounding Box Detection

  • Used OpenCV, EasyOCR, and Tesseract for extracting structured data from invoices.
  • Developed image preprocessing for enhanced OCR accuracy.

πŸ—£οΈ Voice-to-Form Automation

  • Converted voice into structured ERP fields using OpenAI Whisper + regex.
  • Built AI workflows for entity recognition and field mapping.
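
The regex step of this workflow might look something like the sketch below, which maps an already transcribed sentence to form fields. The field names and patterns are hypothetical (the actual ERP fields are not listed here), and the transcript is assumed to come from a speech-to-text step such as Whisper, which is not shown.

```python
import re

# Hypothetical field patterns, for illustration only.
FIELD_PATTERNS = {
    "customer": re.compile(r"customer (?:name )?is ([A-Za-z ]+?)(?:,|\.|$)", re.IGNORECASE),
    "quantity": re.compile(r"quantity (?:is |of )?(\d+)", re.IGNORECASE),
    "amount": re.compile(r"amount (?:is |of )?(\d+(?:\.\d+)?)", re.IGNORECASE),
}


def extract_fields(transcript: str) -> dict:
    """Return the form fields found in a transcript, skipping any that are missing."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(transcript)
        if match:
            fields[name] = match.group(1).strip()
    return fields


if __name__ == "__main__":
    text = "Customer name is Acme Traders, quantity is 12, amount is 450.50."
    print(extract_fields(text))
```

Fields that the patterns cannot find are simply left out of the result, so a later step can prompt for them or fall back to a model-based entity extractor.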

🤖 ERP Code Generator (GPT-4)

  • Auto-generated backend APIs, database schemas, and UI forms using GPT-4 API.
  • Reduced development time from days to minutes.

🏦 Loan Default Risk Prediction

  • Built ML pipelines for financial risk analysis.
  • Trained logistic regression and random forest models for credit scoring.

📈 Certifications

  • ✅ DataCamp SQL Associate (2024)
  • ✅ Google Foundations: Data, Data, Everywhere
  • ✅ HackerRank: SQL (Advanced)
  • ✅ 365DataScience: Advanced SQL for Data Engineering

📬 Let’s Connect

I love collaborating on data-driven tools, ETL pipelines, AI integrations, and making smart systems smarter.
Drop me a line at zubairbinfayyaz@gmail.com or check out my repos for hands-on project code.

📬 Get in Touch

Feel free to reach out for collaborations, consultations, or just to connect!


"Transforming data into actionable insights to drive business success."
