R-Final-Project

When you finish a section, cross it out in the README by wrapping it in double tildes.

It's done like this:

~~ "Insert Text" ~~
.... Just remove the space between the text and the tildes on both sides so that it renders as strikethrough

Data Analytics Final Project. We have 4 weeks: on average, 1 phase per week

Phase 1: Proposal: Dataset and Topic Selection

One Page Paper with the following:

  1. Dataset and topic: Craigslist cars and trucks data. Choose a title, a detailed description, and the specification of the chosen dataset

  2. Prepare and clean the dataset so that you can answer the questions with it

  3. Questions and Goals:

    • At least 5 questions

        + Question Ideas (at least 5):
      
             - During which time of the day is a vehicle most likely to be posted on Craigslist, by state?
      
             - What is the distribution of manufacturers for each region on Craigslist?
      
             - What is the relationship of mileage to price for each car model?
         
             - What is the relationship between title and price?
             
             - What is the most popular model by state? (ml)
             
             - //What is the average age of vehicle? (ml)
             
             - //What is the average age of vehicle by state? (ml)
             
             - What is the average asking price? (ml)
             
             - What is the average mileage? (ml)
             	
        + Questions From New Car Dataset:
        
             - What is the distribution of manufacturers for each region?
             
             - What is the most popular model by region? (First 2 questions can be combined)
             
             - What is the average age of a vehicle (by region?)
            		   
             - What is the average asking price? 
             
             - What is the relationship between odometer and price (fill in with whether it's exchangeable)?
           
             - What is the relationship between engine type and engine capacity?
             
             - What is the relationship between Year Produced and Price?
             
             - Manufacturer Origin Distribution
             
             - Distribution of Engine type for each region and by car type?
            
             - Distribution of colors to body type
      
    • What are you going to do to meet your goals and answer the questions?

    • What visualizations may help you find results?
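
As a rough illustration of how a few of the question ideas above could be approached in base R, here is a minimal sketch. The `cars` data frame, its column names (`region`, `manufacturer`, `price`, `odometer`), and all of its values are made up for the example; the real dataset's columns may be named differently.

```r
# Toy stand-in for the cleaned listings; column names and values are hypothetical.
cars <- data.frame(
  region       = c("north", "north", "south", "south", "south"),
  manufacturer = c("ford", "toyota", "ford", "ford", "honda"),
  price        = c(10000, 14000, 8000, 9000, 12000),
  odometer     = c(90000, 60000, 120000, 110000, 70000)
)

# Distribution of manufacturers for each region (counts per region/manufacturer).
manufacturer_by_region <- table(cars$region, cars$manufacturer)

# Average asking price, overall and by region.
avg_price <- mean(cars$price)
avg_price_by_region <- aggregate(price ~ region, data = cars, FUN = mean)

# Relationship between odometer and price: correlation plus a simple linear fit.
odo_price_cor <- cor(cars$odometer, cars$price)
odo_price_fit <- lm(price ~ odometer, data = cars)

print(manufacturer_by_region)
print(avg_price_by_region)
print(odo_price_cor)
print(coef(odo_price_fit))
```

The same summaries could feed the visualizations, for example a bar chart of the manufacturer counts or a scatter plot of odometer against price.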

Phase 2: Analysis and Modeling:

In this phase you should have your final dataset and try to answer your questions from it. Start analyzing the data by producing plenty of plots and using ML methods to find initial results. Based on those initial results and findings, every team has to convince me that they are able to finish their project.

Have zip file containing:

  1. Clean version of the data set used

  2. A single page highlighting the different approaches you tried, with a description of your initial results and outcomes, what you have done, and what you still have to do.

  3. The code and the plots produced in this phase.

Phase 3: Final Results: Use R Markdown

Project report: 1000-1500 lines in R Markdown: https://www.youtube.com/watch?v=DNS7i2m4sB0

Discuss in detail your project steps for getting the final outcomes

A detailed description of the dataset used and where you got it

The process followed to clean up the data

The different solutions you applied to draw the final results

The machine learning patterns used and your outcomes

Full discussion about these topics:

  • Abstract: Provide one paragraph giving a short description about your project and your findings.

  • Introduction: This section should be divided into sub-sections.

    • In this section you should write detailed information (an overview) about the problem you are working on.

    • Why you chose this problem (motivations and related work)?

    • What you are going to do to solve the problems (your goals).

  • Initial goals:

    • What is your goal?

    • What are the questions you are trying to answer about your dataset? (At least 5)

  • Dataset:

    • The source of your dataset

    • Description of your dataset structure

    • How did you clean up your dataset?

    • Anything else related to the dataset used.

  • Data Analysis and Modeling:

    • What visualization methods did you use to analyze your data from different points of view?

    • What are the machine learning techniques you used and why (justify your decisions)?

    • How did you reach these conclusions?

  • Final outcomes and Analysis:

    • What are the results you obtained about the data?

    • Make a comparison between results obtained using different ML methods.

      • What are the answers to your proposed questions about the data?

      • What are your justifications for your answers?
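
One possible way to structure the comparison between methods is to score each one on the same held-out rows. The sketch below uses base R only and simulated data (the variables `odometer`, `year`, and `price`, and the numbers generating them, are assumptions made for illustration, not the project's dataset). It compares a naive mean-price baseline with a linear regression using RMSE; other models could be added as further rows of the same table.

```r
set.seed(42)  # reproducible toy data

# Simulated listings; the real project would use the cleaned dataset instead.
n <- 200
odometer <- runif(n, 10000, 200000)
year <- sample(2000:2020, n, replace = TRUE)
price <- 10000 + 500 * (year - 2000) - 0.03 * odometer + rnorm(n, sd = 800)
cars <- data.frame(price, odometer, year)

# Hold out 25% of the rows for testing.
test_idx <- sample(n, size = n / 4)
train <- cars[-test_idx, ]
test  <- cars[test_idx, ]

# Root mean squared error between actual and predicted prices.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Method 1: baseline that always predicts the mean training price.
baseline_pred <- rep(mean(train$price), nrow(test))

# Method 2: linear regression on odometer and year.
fit <- lm(price ~ odometer + year, data = train)
lm_pred <- predict(fit, newdata = test)

# One row per method, scored on the same test rows.
results <- data.frame(
  method = c("mean baseline", "linear regression"),
  rmse   = c(rmse(test$price, baseline_pred), rmse(test$price, lm_pred))
)
print(results)
```

Scoring every method on the same test rows keeps the comparison fair, and the baseline row gives a reference point for judging whether a model is learning anything useful.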

Phase 4: Presentation.

10 minutes, enforced (Google Slides? Alternate speaking every other slide?)

Submission:

Submit zip file with the following in it:

Dataset

Code of project (R code)

Project report
