This is a mini data project made in an attempt to complete Pelajar Data Challenge #1. The objective is to extract interesting insights that can be presented to the C-level executives of Indiemart. The requirements are detailed in this post.
Examples of interesting questions (by @BukanYahya):
- What is the lowest-priced product this month?
- Is Indomie at Indomaret cheaper than at Alfamart?
- Can we create a statistic about this month's price fluctuations?
Here are some important questions that I will present to the higher-ups, besides the product descriptions:
- Which source store has the most diverse product offerings?
- Which source store provides the most affordable products?
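
As a rough sketch, the two questions above could be answered with aggregate queries like the ones below. The schema used here (an `items` table with `id`, `name`, `source` and a `prices` table with `items_id`, `price`), the store labels, and the sample rows are all assumptions for illustration; the actual column names in indiemart.db may differ.

```python
import sqlite3

# Hypothetical, simplified schema and toy rows; the real indiemart.db may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, source TEXT);
CREATE TABLE prices (items_id INTEGER, price REAL);
INSERT INTO items VALUES
    (1, 'Indomie Soto 70 g', 'store_a'),
    (2, 'Teh Botol 350 ml', 'store_a'),
    (3, 'Indomie Soto 70 g', 'store_b');
INSERT INTO prices VALUES (1, 3100), (2, 5500), (3, 2900);
""")

# Question 1: number of distinct product names per source store.
diversity = conn.execute("""
    SELECT source, COUNT(DISTINCT name) AS n_products
    FROM items
    GROUP BY source
    ORDER BY n_products DESC, source
""").fetchall()

# Question 2: average price per source, restricted to products that every
# source sells, so the comparison is like-for-like.
affordability = conn.execute("""
    SELECT i.source, AVG(p.price) AS avg_price
    FROM items AS i
    JOIN prices AS p ON p.items_id = i.id
    WHERE i.name IN (
        SELECT name FROM items
        GROUP BY name
        HAVING COUNT(DISTINCT source) = (SELECT COUNT(DISTINCT source) FROM items)
    )
    GROUP BY i.source
    ORDER BY avg_price, i.source
""").fetchall()

print(diversity)
print(affordability)
```

Restricting the second query to products that every store carries is one possible design choice; comparing raw averages over all products would mix different assortments and could mislead.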
- Database system: SQLite
- Script: Python
- Dashboard: Streamlit
- I explored the database using DBeaver. It turns out that the database contains three populated tables (`items`, `prices`, and `discount`) and some empty tables (mostly about order data).
- I looked into the `items` table and found that different sources use different styles for `category`, so I decided to normalize the entries. I used Copilot to help me build the value map.
- In the `items` table, I also found that some products that are actually the same have different names, e.g. Indomie Mi Instan Soto Mie 70 g and Indomie Mi Instan Soto Mie 70G. To compare products across different `source` values, I also made a value map to clean this data.
- I made a small data pipeline using a Python script to load the data into my dashboard.
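
The cleaning steps above can be sketched roughly as follows. This is not the project's actual pipeline: the `CATEGORY_MAP` entries, the helper names, the store labels, and the category values are made up for illustration, and a small regex rule stands in for the hand-written name map. Only the Indomie name pair comes from the example above.

```python
import re

# Hypothetical value map: source-specific category labels -> one normalized label.
CATEGORY_MAP = {
    "Makanan Instan": "instant food",
    "makanan-instan": "instant food",
    "Minuman": "beverages",
}


def normalize_category(raw: str) -> str:
    """Look up the normalized category; fall back to the trimmed, lowercased input."""
    return CATEGORY_MAP.get(raw, raw.strip().lower())


def normalize_name(raw: str) -> str:
    """Unify spacing and unit casing, e.g. '70 g' and '70G' both become '70g'."""
    name = re.sub(r"\s+", " ", raw.strip())
    # Join a number with its unit and lowercase the unit.
    return re.sub(
        r"(\d+)\s*(g|kg|ml|l)\b",
        lambda m: m.group(1) + m.group(2).lower(),
        name,
        flags=re.IGNORECASE,
    )


# Toy rows standing in for records read from the items table.
rows = [
    {"name": "Indomie Mi Instan Soto Mie 70 g", "category": "Makanan Instan", "source": "store_a"},
    {"name": "Indomie Mi Instan Soto Mie 70G", "category": "makanan-instan", "source": "store_b"},
]

cleaned = [
    {**row, "name": normalize_name(row["name"]), "category": normalize_category(row["category"])}
    for row in rows
]

for row in cleaned:
    print(row)
```

After this step both rows share the same name and category, so they can be matched across sources when comparing prices.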
- Clone this project
- Download data from https://194.233.94.36/indiemart.db
- Install all requirements (in requirements.txt)
- From a terminal, run `python -m streamlit run .\streamlit-app.py`
- Follow the link provided
- 27 Apr 2024
- Eventually, I didn't continue and failed to participate in the contest :(
- I added a minor update that allows looking into the data details