Arif-F-Aytkn/R-Data-Analysis

R Data Analysis Projects

This repository contains various data analysis projects implemented using the R programming language. Each project aims to explore a specific problem or relationship using different datasets and statistical/visualization techniques. The goal is to showcase my abilities in data processing, analysis, and visualization in R.

Project Contents

Below is a brief description of each project found in this repository:

1. Analyzing the Relationship Between CPI and HDI (Analyzing-the-Relationship-Between--CPI--and---HDI-.pdf)

  • Description: This project visualizes the relationship between the Corruption Perceptions Index (CPI) and the Human Development Index (HDI). It investigates whether countries' perceived corruption levels are associated with their human development levels.
  • Techniques Used: Data loading, scatter plot with ggplot2 and smooth line, data point labeling with ggrepel.
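
The techniques listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data frame `scores`, its column names, and the country labels are hypothetical, the values are simulated rather than real CPI or HDI figures, and the linear smoother is an assumption, since the project only mentions a "smooth line".

```r
library(ggplot2)
library(ggrepel)

# Placeholder data: simulated numbers, NOT real CPI or HDI values.
set.seed(1)
cpi <- round(runif(12, min = 20, max = 90))
scores <- data.frame(
  country = paste("Country", LETTERS[1:12]),  # hypothetical labels
  cpi = cpi,
  hdi = round(0.40 + 0.005 * cpi + rnorm(12, sd = 0.03), 3)
)

cpi_hdi_plot <- ggplot(scores, aes(x = cpi, y = hdi)) +
  geom_point() +
  # Trend line; method = "lm" is an assumption made for this sketch
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # ggrepel moves labels apart so they do not overlap each other
  geom_text_repel(aes(label = country)) +
  labs(x = "Corruption Perceptions Index", y = "Human Development Index")

print(cpi_hdi_plot)
```

With the real dataset, only the `scores` data frame would need to be replaced by the loaded file.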

2. ANOVA Examples (ANOVA_Example.pdf)

  • Description: This study demonstrates applications of Analysis of Variance (ANOVA). One-way and two-way ANOVA tests are used to examine how diamond prices relate to factors such as cut quality and color, using the diamonds dataset from ggplot2. A Tukey HSD post-hoc test is then applied to identify which specific groups differ.
  • Techniques Used: Box plots, One-Way ANOVA, Two-Way ANOVA, Post-hoc tests (Tukey HSD).
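
A minimal sketch of this workflow, under a few assumptions: the two-way model is written as additive (`cut + color`, no interaction), the box plot uses base graphics, and the ordered factors are converted to plain factors. The project itself may have made different choices.

```r
library(ggplot2)  # only needed here for the diamonds dataset

d <- as.data.frame(diamonds)
# cut and color are ordered factors in diamonds; convert them to plain
# factors for the group comparisons below
d$cut <- factor(d$cut, ordered = FALSE)
d$color <- factor(d$color, ordered = FALSE)

# Visual check of price by group before testing
boxplot(price ~ cut, data = d, main = "Price by cut")

# One-way ANOVA: does mean price differ across cut levels?
one_way <- aov(price ~ cut, data = d)
print(summary(one_way))

# Two-way ANOVA with additive effects (assumption: no interaction term)
two_way <- aov(price ~ cut + color, data = d)
print(summary(two_way))

# Tukey HSD: pairwise differences between cut levels
tukey_cut <- TukeyHSD(one_way, which = "cut")
print(tukey_cut)
```

Replacing `cut + color` with `cut * color` would add the interaction term if that is of interest.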

3. Descriptive Data Analysis with mtcars (Descriptive_Data_Analysis_with_mtcars.pdf)

  • Description: This project performs comprehensive descriptive statistics and exploratory data analysis on the mtcars dataset. It explores relationships between variables, summarizes statistics by groups, and examines basic distribution characteristics.
  • Techniques Used: Data manipulation with dplyr, visualizing multivariate relationships with GGally (ggpairs), group-wise descriptive statistics with the psych package, Kruskal-Wallis test.
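
As a rough illustration of the group-wise summary and the Kruskal-Wallis test, the sketch below groups `mpg` by `cyl`. The choice of variables is an assumption made for the example, and the `GGally` and `psych` steps are left out to keep it short.

```r
library(dplyr)

# Group-wise descriptive statistics of fuel economy by cylinder count
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    n = n(),
    mean_mpg = mean(mpg),
    sd_mpg = sd(mpg),
    median_mpg = median(mpg)
  )
print(by_cyl)

# Kruskal-Wallis rank-sum test: do mpg distributions differ across cyl groups?
kw <- kruskal.test(mpg ~ factor(cyl), data = mtcars)
print(kw)
```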

4. Statistical Distributions and Central Limit Theorem (Distributions.pdf)

  • Description: This project introduces R's d, p, q, and r function families (density, cumulative probability, quantile, and random generation) for fundamental statistical distributions (Uniform, Binomial, Normal) and visually demonstrates the Central Limit Theorem (CLT) through simulation.
  • Techniques Used: Statistical distribution functions, simulation, histogram and density curve plotting with ggplot2.
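
The sketch below shows one way such a demonstration might look. The sample size (30), the number of replications (2000), and the choice of the Uniform(0, 1) distribution for the simulation are assumptions made for this example.

```r
library(ggplot2)

set.seed(42)

# The four function families, shown on a few distributions
dnorm(0)                          # density of N(0, 1) at 0
pnorm(1.96)                       # P(Z <= 1.96)
qnorm(0.975)                      # 97.5% quantile of N(0, 1)
rnorm(3)                          # three random draws from N(0, 1)
dbinom(3, size = 10, prob = 0.5)  # P(X = 3) for Binomial(10, 0.5)
punif(0.3)                        # P(U <= 0.3) for Uniform(0, 1)

# CLT simulation: means of 30 uniform draws, repeated 2000 times
n <- 30
sample_means <- replicate(2000, mean(runif(n)))
clt_df <- data.frame(sample_mean = sample_means)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the sample mean is
# approximately normal with sd = sqrt(1 / (12 * n))
clt_plot <- ggplot(clt_df, aes(x = sample_mean)) +
  geom_histogram(aes(y = after_stat(density)), bins = 40) +
  stat_function(fun = dnorm, args = list(mean = 0.5, sd = sqrt(1 / (12 * n))))

print(clt_plot)
```

The histogram of sample means should sit close to the overlaid normal curve even though the individual draws are uniform.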

5. Evaluation of Variables with Hypothesis Tests (Evaluation-of-Variables-with-Hypothesis-Tests.pdf)

  • Description: This project illustrates how different types of hypothesis tests are applied in R. Common tests such as the one-sample t-test, paired (dependent) and independent two-sample t-tests, the proportion test, and the chi-square test are explained with examples.
  • Techniques Used: One-sample t-test, Two-sample t-tests, prop.test (proportion test), chisq.test (chi-square test).
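
For orientation, the tests listed above can be called as in the sketch below. The built-in `mtcars` and `sleep` datasets and the null values (`mu = 20`, `p = 0.5`) are assumptions chosen for the example and are not necessarily what the project uses.

```r
# One-sample t-test: is mean mpg different from 20?
one_sample <- t.test(mtcars$mpg, mu = 20)

# Independent two-sample t-test (Welch by default): mpg by transmission type
independent <- t.test(mpg ~ am, data = mtcars)

# Paired (dependent) t-test: the same 10 subjects under two drugs
paired <- t.test(
  sleep$extra[sleep$group == 1],
  sleep$extra[sleep$group == 2],
  paired = TRUE
)

# Proportion test: is the share of manual cars different from 0.5?
prop <- prop.test(x = sum(mtcars$am == 1), n = nrow(mtcars), p = 0.5)

# Chi-square test of independence: transmission type vs engine shape
chi <- chisq.test(table(mtcars$am, mtcars$vs))

print(one_sample)
print(independent)
print(paired)
print(prop)
print(chi)
```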

6. Heatmap Example for Electricity Prices (Heatmap-Example-for-Electricity-Prices.pdf)

  • Description: Using electricity price time series data, this project creates a heatmap to visualize the intensity of price changes on a yearly, weekly, and daily basis. This helps in identifying patterns and seasonal trends within time series data.
  • Techniques Used: Time series data processing, line plots with ggplot2, heatmap creation with geom_tile, viridis color palette, facet_wrap for year-wise separation.
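
A calendar-style heatmap of this kind could be sketched as below. The prices are simulated placeholders, not real electricity prices, and the layout (week on the x-axis, weekday on the y-axis, one panel per year) is an assumption about how the project arranges the tiles. The sketch uses `scale_fill_viridis_c()` from ggplot2 rather than the separate viridis package.

```r
library(ggplot2)

# Simulated daily prices: placeholder values, NOT real market data
set.seed(7)
dates <- seq(as.Date("2021-01-01"), as.Date("2022-12-31"), by = "day")
day_of_year <- as.numeric(format(dates, "%j"))
prices <- data.frame(
  date = dates,
  price = 50 + 10 * sin(2 * pi * day_of_year / 365) +
    rnorm(length(dates), sd = 5)
)

# Calendar fields used to position each tile
prices$year <- format(prices$date, "%Y")
prices$week <- as.integer(format(prices$date, "%W"))  # weeks start on Monday
prices$weekday <- factor(
  format(prices$date, "%u"),                          # 1 = Monday ... 7 = Sunday
  levels = as.character(7:1),
  labels = c("Sun", "Sat", "Fri", "Thu", "Wed", "Tue", "Mon")
)

heatmap_plot <- ggplot(prices, aes(x = week, y = weekday, fill = price)) +
  geom_tile() +
  scale_fill_viridis_c() +
  facet_wrap(~ year, ncol = 1) +
  labs(x = "Week of year", y = NULL, fill = "Price")

print(heatmap_plot)
```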

7. Measures of Dispersion (Measures-of-Dispersion.pdf)

  • Description: This study examines the basic descriptive features and measures of dispersion of datasets, using summary tables produced with the stargazer package, together with the correlation and covariance between variables. Additionally, skewness and kurtosis values are calculated and visualized.
  • Techniques Used: Summary statistics tables with stargazer, correlation, covariance, partial correlation, skewness and kurtosis calculations and visualizations.
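
The quantities named above can be computed in base R as in the following sketch. The `mtcars` variables are an assumption for the example, `skewness()` and `kurtosis()` are small helper functions defined here using population moments (the project may rely on a package instead), and the stargazer table is omitted.

```r
x <- mtcars$mpg

# Common measures of dispersion
dispersion <- c(
  variance = var(x),
  sd = sd(x),
  iqr = IQR(x),
  range = diff(range(x)),
  cv = sd(x) / mean(x)  # coefficient of variation
)
print(dispersion)

# Covariance and correlation between two variables
cov_mpg_wt <- cov(mtcars$mpg, mtcars$wt)
cor_mpg_wt <- cor(mtcars$mpg, mtcars$wt)

# Partial correlation of mpg and wt, controlling for hp:
# correlate the residuals after regressing each variable on hp
res_mpg <- resid(lm(mpg ~ hp, data = mtcars))
res_wt <- resid(lm(wt ~ hp, data = mtcars))
partial_mpg_wt <- cor(res_mpg, res_wt)

# Moment-based skewness and kurtosis (helpers defined for this sketch)
skewness <- function(v) {
  m <- mean(v)
  mean((v - m)^3) / mean((v - m)^2)^1.5
}
kurtosis <- function(v) {
  m <- mean(v)
  mean((v - m)^4) / mean((v - m)^2)^2
}

print(c(skewness = skewness(x), kurtosis = kurtosis(x)))
```

A normal distribution has a kurtosis of 3 under this definition, so values are often reported as excess kurtosis by subtracting 3.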

8. Port Freight Traffic Analysis (Port_Freight_Traffic.pdf)

  • Description: This project analyzes global port freight traffic data and visualizes it on a geographical map. The busiest ports are identified, and their locations and traffic volumes are displayed on the map.
  • Techniques Used: Loading external data, data manipulation with dplyr, geographical visualization with ggplot2 and map_data.
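
A minimal sketch of the mapping step is shown below. The `ports` data frame is entirely hypothetical: the port names, coordinates, and volumes are arbitrary placeholders, not real freight figures, and picking the top three ports is an assumption. Note that `map_data("world")` needs the maps package to be installed.

```r
library(dplyr)
library(ggplot2)

# Hypothetical placeholder data: arbitrary positions and volumes
ports <- data.frame(
  port = paste("Port", LETTERS[1:6]),
  lon = c(-60, -20, 10, 50, 100, 140),
  lat = c(-30, 10, 40, 20, 0, 35),
  volume = c(12, 45, 30, 8, 51, 27)
)

# Keep the busiest ports by traffic volume
busiest <- ports %>%
  slice_max(volume, n = 3)

# World outline as a data frame of polygon vertices
world <- map_data("world")

port_map <- ggplot() +
  geom_polygon(
    data = world,
    aes(x = long, y = lat, group = group),
    fill = "grey85", colour = "white"
  ) +
  # Point size encodes traffic volume
  geom_point(
    data = busiest,
    aes(x = lon, y = lat, size = volume),
    colour = "steelblue"
  ) +
  coord_quickmap() +
  labs(x = NULL, y = NULL, size = "Volume")

print(port_map)
```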

Usage

Each project is written as an R Markdown (.Rmd) file; the PDF files listed above are the compiled outputs of these .Rmd files. Use the .Rmd files to inspect or modify the source code, or to apply it to your own data.

  1. Clone the Repository:
    git clone https://github.com/Arif-F-Aytkn/R-Data-Analysis.git
  2. Open in RStudio: Open the .Rmd file you are interested in using RStudio.
  3. Install Required Packages: The packages used in each project are specified at the beginning of its .Rmd file. Install any that are missing with install.packages("package_name").
  4. Run the Code: You can knit the .Rmd file or run the code blocks individually to see the outputs and analyses.
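
Steps 3 and 4 can also be done from the R console, roughly as sketched below. The package list is collected from the project descriptions above plus `rmarkdown` for knitting, and may be incomplete; `missing_pkgs()` is a helper defined only for this sketch, and `"Project.Rmd"` is a placeholder file name.

```r
# Packages named in the project descriptions, plus rmarkdown for knitting
required <- c(
  "ggplot2", "ggrepel", "dplyr", "GGally",
  "psych", "viridis", "stargazer", "rmarkdown"
)

# Return the subset of packages that is not installed yet
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_pkgs(required)
print(to_install)

# Install only in an interactive session, so scripted runs never download
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}

# To knit a project from the console (placeholder file name):
# rmarkdown::render("Project.Rmd")
```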

Development Notes

These projects were created to enhance my R data analysis skills. In the future, I aim to deepen the analyses, make assumption checks more explicit, and enrich the interpretation of the outputs.


Arif Furkan AYTKN
