:atom:
Looking for jobs
  • Singapore Institute of Technology (SIT)
  • Singapore
  • LinkedIn in/shafiq-g

amirulshafiq98/README.md

📊 Amirul Shafiq - Data Analytics Portfolio

Hi, I'm Amirul — a data enthusiast with a Bachelor's in Food Technology from the Singapore Institute of Technology (convocation October 2025). Over the past four years, I've immersed myself in analysing diverse datasets to assess the viability of novel food products developed during my undergraduate studies. This experience honed my skills in data analysis and storytelling to effectively engage stakeholders.

My pivotal moment came during an internship at Chiang Mai University, Thailand, where I assisted local beekeepers in determining optimal honey storage conditions. Despite collecting extensive data over 12 weeks, initial results were inconclusive. This challenge sparked my deep interest in data analysis, driving me to uncover actionable insights before returning to Singapore.

Beyond data, I enjoy exploring history, playing football (soccer), and continuously learning new skills — be it technical projects or enhancing communication abilities. I thrive in collaborative environments, believing that the best ideas emerge through teamwork.

resume linkedin


🛠 Skills

Microsoft SQL Server, MySQL, Power BI, Tableau, Python, Microsoft Office, Google Workspace

📘 Table of Contents

📌 Highlighted Projects

SQL

Olist E-commerce Data Exploration

Olist E-commerce Schema

Repository: Olist Sales

Objective: Prepare data for visualisation to uncover sales insights of sellers and customers on the platform by product category, month, and year

Description: Utilised a Kaggle dataset from Olist, a Brazilian e-commerce company. Translated product categories from Portuguese using an Excel translation file. Performed data loading, cleaning, preprocessing, and exploratory data analysis (EDA)

Skills: Data joining, Common Table Expressions (CTEs), VLOOKUP, Entity Relationship Diagram (ERD)

Tools: SQL Server Management Studio (SSMS), Excel

Outcome: Generated four CSV files for Power BI with necessary primary keys for filtering visuals
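A minimal sketch of the CTE-plus-join pattern described above, run against an in-memory SQLite database so it is self-contained. The table names, column names, and rows are hypothetical and far simpler than the real Olist schema, and the project itself used SQL Server, where `YEAR()` and `MONTH()` would replace SQLite's `strftime`.

```python
import sqlite3

# Hypothetical, heavily simplified tables; not the actual Olist schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_items (
    order_id TEXT, category_pt TEXT, price REAL, purchased_at TEXT
);
CREATE TABLE category_translation (
    category_pt TEXT PRIMARY KEY, category_en TEXT
);
INSERT INTO order_items VALUES
    ('o1', 'beleza_saude', 50.0, '2017-01-15'),
    ('o2', 'beleza_saude', 30.0, '2017-01-20'),
    ('o3', 'informatica_acessorios', 120.0, '2017-02-03'),
    ('o4', 'beleza_saude', 20.0, '2018-01-09');
INSERT INTO category_translation VALUES
    ('beleza_saude', 'health_beauty'),
    ('informatica_acessorios', 'computers_accessories');
""")

# The CTE aggregates sales per category, year, and month;
# the outer query joins in the English category name.
query = """
WITH monthly_sales AS (
    SELECT category_pt,
           strftime('%Y', purchased_at) AS year,
           strftime('%m', purchased_at) AS month,
           SUM(price) AS total_sales
    FROM order_items
    GROUP BY category_pt, year, month
)
SELECT t.category_en, m.year, m.month, m.total_sales
FROM monthly_sales AS m
JOIN category_translation AS t ON t.category_pt = m.category_pt
ORDER BY m.year, m.month, t.category_en;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Keeping the translated category name alongside the year and month gives each exported row the fields a BI tool needs for filtering.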

Cafe Promotions Data Cleaning

Cafe Schema

Repository: Cafe Promos

Objective: Prepare data for visualisation to determine which specific promotion had the most participations on a given day, along with a breakdown of completions

Description: Analysed a Maven Analytics dataset containing eight different promotions (Buy-One-Get-One & Discount). Promotions ran concurrently, each expiring after 5, 7, or 10 days. Each customer received one promotion every 7 days, with Day 24 as an exception. Conducted data loading, cleaning, preprocessing, and EDA

Skills: Data cleaning, CTEs, ERD, indexing, calculating 7-day moving averages

Tools: MySQL

Outcome: Generated six CSV files for Tableau visualisations, including stacked bar charts, line charts, tree maps, and tables
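One way a 7-day moving average can be computed is with a window function. The sketch below uses an in-memory SQLite database (window functions need SQLite 3.25 or later) purely to stay self-contained; the project used MySQL, and the same `OVER (... ROWS BETWEEN ...)` clause is also accepted by MySQL 8.0. The table name, column names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_participation (day INTEGER PRIMARY KEY, participations INTEGER)"
)
# Hypothetical data: day d has 10 * d participations.
conn.executemany(
    "INSERT INTO daily_participation VALUES (?, ?)",
    [(d, 10 * d) for d in range(1, 11)],
)

# Average over the current day and the six days before it.
# For the first six days the window is shorter, so the average
# covers only the days available so far.
query = """
SELECT day,
       participations,
       AVG(participations) OVER (
           ORDER BY day
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS moving_avg_7d
FROM daily_participation
ORDER BY day;
"""
moving_avg = {day: avg for day, _, avg in conn.execute(query)}
print(moving_avg)
```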

Mochi Sensory Database Setup

Flowchart

Repository: Mochi Ice cream

Objective: Build a full-stack sensory analytics pipeline to identify the most preferred Mochi formulation based on consumer feedback.

Description: The raw, wide-format Excel sensory panel data was first ingested into PostgreSQL via a Python script. It was then cleaned and transformed into a normalized, analysis-ready structure using dbt within PostgreSQL, preparing it for subsequent analytical processes.

Skills: Data Modeling, SQL Transformation (dbt, PostgreSQL), Python Scripting (Data Ingestion), Data Cleaning, Data Normalisation.

Tools: PostgreSQL, dbt, SQLAlchemy, Python

Outcome: A clean, normalized, and analysis-ready dataset hosted in PostgreSQL, serving as the foundational input for advanced sensory analytics.
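To illustrate what moving from a wide-format sheet to a normalised structure means, here is a small plain-Python sketch. It is not the project's dbt model: the column naming scheme (`<formulation>_<attribute>`), the field names, and the scores are all assumptions made for the example.

```python
# Hypothetical wide-format panel data: one row per panellist,
# one column per formulation/attribute pair.
wide_rows = [
    {"panellist_id": 1, "F1_sweetness": 7, "F1_texture": 6, "F2_sweetness": 5, "F2_texture": 8},
    {"panellist_id": 2, "F1_sweetness": 6, "F1_texture": 7, "F2_sweetness": 4, "F2_texture": 9},
]


def wide_to_long(rows, id_column="panellist_id"):
    """Turn each score column into its own row (one observation per row)."""
    long_rows = []
    for row in rows:
        for column, score in row.items():
            if column == id_column:
                continue
            # Assumed naming scheme: "<formulation>_<attribute>".
            formulation, attribute = column.split("_", 1)
            long_rows.append(
                {
                    id_column: row[id_column],
                    "formulation": formulation,
                    "attribute": attribute,
                    "score": score,
                }
            )
    return long_rows


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

In the long layout, adding a new formulation or attribute adds rows rather than columns, which is what makes grouping and aggregating in SQL straightforward.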


Visualisation (Tableau and Power BI)

Cafe Promotions Daily Insights

Overall

Repository: Cafe Insights

Objective: Create a dashboard showcasing daily participations for each promotion code and the percentage of claims before expiration.

Description: Developed a dashboard featuring line charts for total sales, stacked bar charts for redemptions, tree maps for customer join dates, horizontal stacked bar charts for average age groups, and tables displaying promotion details over 30 days.

Skills: Filtering, tooltips, conditional formatting, max-min markers on line charts

Tools: Tableau Public

Outcome: Delivered an interactive dashboard presenting promotion insights based on age group, customer join date, total sales, and redemptions.


Superstore Data

Updated BI

Repository: Superstore Sales

Objective: Identify consumer trends and sales insights across different segments and product categories

Description: Used a sample Superstore dataset to build a Power BI report featuring sales KPIs, segment-wise profit/loss analysis, and demographic visuals (e.g., waffle charts and customer location maps). Final dashboard highlights which segments to target based on profit margins and sales contribution

Skills: Native waffle chart visualisation, KPI cards, bar graphs, area charts

Tools: Power BI

Outcome: Delivered a compelling dashboard story useful for internal sales and marketing teams


Olist E-commerce Sales Data

Olist

Repository: Olist Sales

Objective: Visualise seller and customer behavior across Brazil from 2016 to 2018.

Description: Built Power BI dashboards showing delivery times, order statuses, customer satisfaction, and top-selling product categories by state and year. The data was prepared in SQL Server, cleaned, and joined before visualization

Skills: DAX formulas, calculated columns, KPI cards, filters, slicers, star rating system

Tools: Power BI, SQL Server

Outcome: Created an interactive dashboard with dynamic filters by year, seller location, product category, and delivery time


Python

Principal Component Analysis (PCA) of Boba Pearls

PCA Biplot

Repository: PCA of Boba Pearls

Objective: Reduce dimensionality of textural data from machine analysis and identify trials closest to a store-bought sample

Description: Conducted PCA on force-based variables like hardness, cohesiveness, and springiness from tapioca pearl formulations. Also estimated the correlation factors of hardness, cohesiveness and chewiness for the different formulations to aid in more focussed trials during sensory evaluation

Skills: PCA, matplotlib, seaborn, NumPy, scikit-learn, distance metrics

Outcome: Identified optimal trial combinations and visualised product similarity for formulation development
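The idea of projecting texture measurements with PCA and then ranking trials by distance to a reference sample can be sketched as follows. The readings are invented, and PCA is done with NumPy's SVD to keep the example dependency-light; the project lists scikit-learn, which may be what the repository actually uses.

```python
import numpy as np

# Hypothetical texture readings; columns are hardness, cohesiveness, springiness.
labels = ["store_bought", "trial_A", "trial_B", "trial_C"]
X = np.array([
    [10.0, 0.50, 0.80],
    [10.5, 0.52, 0.79],
    [15.0, 0.30, 0.60],
    [6.0, 0.70, 0.95],
])

# Standardise each variable so hardness does not dominate through its scale.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD: the rows of Vt are the principal axes.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
scores = Z @ Vt[:2].T                 # coordinates on the first two components
explained = S**2 / np.sum(S**2)       # share of variance per component

# Euclidean distance from each trial to the store-bought reference in PC space.
reference = scores[0]
distances = np.linalg.norm(scores[1:] - reference, axis=1)
closest = labels[1 + int(np.argmin(distances))]

print("explained variance ratio:", explained.round(3))
print("closest trial:", closest)
```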


Honey Analysis for Optimal Storage

Bacillus

Repository: Honey Storage Recommendation

Objective: Identify optimal honey storage conditions by analysing microbial changes over 6 weeks

Description: Ran descriptive statistics, two-way ANOVA, and Tukey HSD tests to analyse microbial growth in honey stored at different temperatures (4°C, 25°C, 37°C) and under different conditions (aerobic vs. anaerobic). Final recommendation was tailored for bee farmers in Northern Thailand

Skills: Statsmodels (ANOVA), Seaborn, matplotlib, pandas

Outcome: Provided a scientifically-backed recommendation on honey storage to reduce spoilage


University Allocation Based on WEF Report (2025)

Histogram

Repository: University Allocation Optimisation

Objective: Allocate students to majors using an optimisation model with realistic demand curves and constraints

Description: Designed an Integer Linear Programming (ILP) model using PuLP, factoring in costs, major demand, and budget constraints. Simulated multiple demand curve options (exponential, sigmoid, power law) to model student preferences

Skills: PuLP, matplotlib, numpy, pandas, ILP, smoothing

Outcome: Produced visually realistic student allocation plans for use by education planners
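The three demand curve families named above could look something like the sketch below. The functional forms, parameter values, and the notion of a preference "rank" (0 for the most in-demand major) are assumptions for illustration; the repository may parameterise them differently. The resulting shares are the kind of weights that could feed an ILP objective or constraint.

```python
import math


def exponential_demand(rank, rate=0.5):
    """Demand decays by a constant factor with each step down the ranking."""
    return math.exp(-rate * rank)


def sigmoid_demand(rank, midpoint=3.0, steepness=1.0):
    """Demand stays high for top ranks, then drops off around the midpoint."""
    return 1.0 / (1.0 + math.exp(steepness * (rank - midpoint)))


def power_law_demand(rank, exponent=1.5):
    """Heavy-tailed demand: a few majors attract most of the interest."""
    return (rank + 1) ** -exponent


def normalise(weights):
    """Scale weights so they sum to 1 and can be read as demand shares."""
    total = sum(weights)
    return [w / total for w in weights]


curves = {
    "exponential": exponential_demand,
    "sigmoid": sigmoid_demand,
    "power_law": power_law_demand,
}
ranks = range(5)  # five hypothetical majors, ranked by preference
shares = {name: normalise([curve(r) for r in ranks]) for name, curve in curves.items()}

for name, values in shares.items():
    print(name, [round(v, 3) for v in values])
```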


Mochi Ice-Cream Sensory Analysis

Boxplot

Repository: Mochi Ice cream

Objective: Apply advanced analytics to identify key sensory drivers and optimal Mochi formulations for upscaling.

Description: In a Jupyter Notebook, Python was used to connect to the analysis-ready data in PostgreSQL. Principal Component Analysis (PCA) reduced the data's dimensionality to uncover key sensory profiles, while K-Means clustering grouped similar Mochi formulations by their sensory attributes. Finally, boxplots compared acceptability scores across clusters to identify the highest-scoring cluster and, from it, the best formulation for upscaling. Each step was accompanied by plots to guide decision-making.

Skills: Principal Component Analysis (PCA), K-Means Clustering, Data Visualization, Statistical Analysis, Python (Pandas, Scikit-learn, Plotly), Database Connectivity

Tools: Python, Jupyter Notebook, Scikit-learn, Pandas, Plotly.

Outcome: Generated a comprehensive analysis, including detailed visualizations and cluster assignments, to recommend the optimal Mochi formulation for upscaling based on the sensory data.


🎓 Education

Bachelor (Tech) of Food Technology with Honours
Singapore Institute of Technology (SIT)

Expected Graduation: October 2025
Relevant Modules: Applied Data Science, Design of Experiments (DOE), Sensory Evaluation, Market Research, Python Programming

📜 Certification

Microsoft Excel Professional.pdf

Tableau Intelligence Analyst.pdf

Google Business Intelligence.pdf

Google Cloud Analytics.pdf

Google Advanced Data Analytics.pdf

Pinned

  1. Uni_Allocation (Public)

    After reading the WEF report on jobs that can be affected by AI, I decided to simulate this demand with university courses to better allocate students to meet this demand.

    Python

  2. HR_Attrition (Public)

    Jupyter Notebook

  3. mochi_icecream (Public)

    A project I did for FYP where we looked at 5 different formulations to determine which one was the best to move to upscale operations

    Jupyter Notebook

  4. olist-sales (Public)

    Olist is a Brazilian e-commerce site that sells all sorts of things. I got this dataset from Kaggle where I did EDA using SQL before visualising in Power BI

  5. honey (Public)

    Honey project I did back in my internship at Chiang Mai University to predict the best conditions for storage after 6 weeks

    Python

  6. PCA_2d (Public)

    This is a Python script I made for my capstone project in my final year where I wanted to determine the interactions between hydrocolloids to find out which ones were able to retain the textural pr…

    Python