Research Title: "Statistical Analysis and Population Growth Prediction of Countries using Linear Regression Method"
- Talent Scouting Academy scholarship by Digitalent Scholarship
- Student at Universitas Prima Indonesia
This repository contains the details and steps for the "Oracle Project", conducted in March 2023 as part of the Oracle Cloud Infrastructure Specialization program. The research focuses on statistical analysis and population growth prediction of countries using the Linear Regression method, with Oracle Machine Learning and Oracle Cloud Infrastructure used for data analysis and prediction. For the full report and documentation, see: Click Here
- Create an Autonomous Database in Oracle Cloud Infrastructure.
- Configure database details, including compartments, database type, storage capacity, and administrative credentials.
- Download client credentials (Wallet) for secure database connections.
- Open Database Actions and use the "Load Data" feature to prepare the dataset.
- Import the World Population Data 1960-2020 dataset to the Autonomous Database.
- Create user credentials to connect to the dataset within Oracle Machine Learning.
- Create a new Notebook in Oracle Machine Learning.
- Execute SQL commands for data analysis, including data type checks, exploratory data analysis (EDA), and identifying countries with the highest population growth in 2020.
- Establish a database connection to interact with data in Python.
- Visualize the dataset with graphs.
- Prepare the data for regression analysis.
- Build a Linear Regression model using Scikit-learn.
- Evaluate the model using R-squared, Mean Absolute Error (MAE), and Mean Squared Error (MSE).
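The modeling and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the data is synthetic (a single hypothetical country with steady yearly growth plus noise), and the variable names are assumptions. Only the Scikit-learn calls (`LinearRegression`, `r2_score`, `mean_absolute_error`, `mean_squared_error`) correspond to the tools named in the steps.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Synthetic placeholder data, NOT the World Population Data 1960-2020 dataset:
# one hypothetical country growing by a fixed amount per year, plus noise.
rng = np.random.default_rng(seed=0)
years = np.arange(1960, 2021)
population = (
    50_000_000
    + 1_200_000 * (years - 1960)
    + rng.normal(0, 500_000, years.size)
)

# Scikit-learn expects a 2-D feature matrix of shape (n_samples, n_features).
X = years.reshape(-1, 1)
y = population

# Fit population as a linear function of the year.
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)

# The three evaluation metrics listed in the steps, computed in-sample.
metrics = {
    "r2": r2_score(y, y_pred),
    "mae": mean_absolute_error(y, y_pred),
    "mse": mean_squared_error(y, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:,.4f}")

# Extrapolate a few years past the end of the data.
future_years = np.arange(2021, 2026).reshape(-1, 1)
forecast = model.predict(future_years)
```

With the real dataset, `years` and `population` would presumably come from a query against the Autonomous Database, one country at a time. Note that metrics computed on the same rows used for fitting tend to look better than the model's performance on unseen years.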
Based on the evaluation metrics (R-squared, MAE, and MSE), the Linear Regression model appears able to predict population growth in the coming years. However, its predictions are extrapolations from historical data, and real-world factors may cause actual growth to differ. Future research should explore additional variables and validate the model on independent data to improve its predictive capability. This project contributes to understanding and predicting population growth patterns using a linear regression approach, offering insights for decision-making in fields related to a country's population.
We would like to thank Oracle Cloud Infrastructure for providing the platform and tools necessary for this research project, and the Ministry of Communication and Information Technology of Indonesia for the scholarship program.
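One simple way to validate on independent data is a chronological hold-out: fit on earlier years and measure the error on later years the model never saw. The sketch below uses only the Python standard library and a synthetic series whose growth slows over time. The data, the 2010 cutoff, and the helper names `fit_line` and `mae` are all illustrative assumptions, not part of the project.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic placeholder series (NOT real data): growth that slows over time.
years = list(range(1960, 2021))
population = [50_000_000 + 1_500_000 * t - 8_000 * t * t for t in range(len(years))]

# Split by time: train on years up to the cutoff, test on the years after it.
cutoff = 2010
train = [(x, y) for x, y in zip(years, population) if x <= cutoff]
test = [(x, y) for x, y in zip(years, population) if x > cutoff]

slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])


def predict(year):
    return slope * year + intercept


train_mae = mae([y for _, y in train], [predict(x) for x, _ in train])
test_mae = mae([y for _, y in test], [predict(x) for x, _ in test])
print(f"train MAE: {train_mae:,.0f}")
print(f"test MAE:  {test_mae:,.0f}")
```

Because the synthetic growth is not truly linear, the error on the held-out years is larger than the error on the training years. A gap like this is what in-sample metrics alone cannot reveal.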