
Weird Statistics Questions

Hai Liang
hailiang@cuhk.edu.hk
The Chinese University of Hong Kong
Answers to some "weird" statistics questions with R code.

I. Descriptive Statistics

1. How do we represent and summarize a variable? Why do we calculate the mean and standard deviation of a variable? What if the variable is skewed or discrete?
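
A minimal R sketch of why the mean/SD pair can mislead for skewed data (simulated values, not from the repo's answers):

```r
# For right-skewed data, the mean is pulled toward the long tail
set.seed(1)
x <- rexp(1000)                      # a right-skewed variable
c(mean = mean(x), sd = sd(x))        # sensitive to extreme values
c(median = median(x), IQR = IQR(x))  # robust alternatives
table(cut(x, breaks = 5))            # a coarse summary for binned/discrete data
```

For skewed variables the median and IQR, or simply a histogram, are often more informative than mean ± SD.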

2. We use correlation coefficients (why plural?) to quantify the (linear) relationship between two continuous variables; how can we show relationships involving categorical/rank variables?
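
A sketch of the options in base R (simulated data): Pearson for linear association, rank-based coefficients for ordinal data, and a chi-squared test for two categorical variables:

```r
set.seed(1)
x <- rnorm(200); y <- 0.5 * x + rnorm(200)
cor(x, y, method = "pearson")    # linear association, continuous variables
cor(x, y, method = "spearman")   # rank-based; also suits ordinal variables
cor(x, y, method = "kendall")    # another rank-based option
# Two categorical variables: test independence in a contingency table
a <- sample(c("low", "high"), 200, replace = TRUE)
b <- sample(c("no", "yes"), 200, replace = TRUE)
chisq.test(table(a, b))
```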

II. Inferential Statistics (Tests)

3. Under what conditions can we use a sample to infer the population? What is a sampling distribution? Is it observable? Why does a large sample size work?
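
The sampling distribution is normally unobservable, but it can be simulated. A sketch in R with an artificial "population":

```r
set.seed(1)
pop <- rexp(1e5)                                 # a skewed "population"
means <- replicate(2000, mean(sample(pop, 50)))  # many repeated samples of n = 50
hist(means)   # roughly normal despite the skewed population (central limit theorem)
sd(means)     # close to sd(pop) / sqrt(50): larger n means a tighter distribution
```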

4. What are the relationships between standard errors, sampling errors, confidence intervals, and confidence levels?
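
A quick R illustration (simulated data) of how the standard error builds a 95% confidence interval; the hand-built interval matches `t.test()`:

```r
set.seed(1)
x <- rnorm(100, mean = 5, sd = 2)
se <- sd(x) / sqrt(length(x))                       # standard error of the mean
ci <- mean(x) + c(-1, 1) * qt(0.975, df = 99) * se  # 95% confidence interval
ci
all.equal(ci, as.numeric(t.test(x)$conf.int))       # TRUE: the same interval
```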

5. What are the differences among the t-test, z-test, F-test, χ²-test, etc.?
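
A compact sketch contrasting three of these tests on simulated data (base R has no built-in z-test; it applies when the population variance is known or n is large):

```r
set.seed(1)
y1 <- rnorm(30); y2 <- rnorm(30, mean = 1)
t.test(y1, y2)                       # t-test: compare two means
g <- factor(rep(c("a", "b", "c"), each = 30))
y <- rnorm(90) + as.numeric(g)
anova(aov(y ~ g))                    # F-test: compare three or more group means
chisq.test(table(g, y > median(y)))  # chi-squared: two categorical variables
```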

6. What are distribution-free tests (non-parametric tests)? We test the difference between means; can we test the difference between medians/variances? Can we test the difference between two categorical variables?
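
Some distribution-free counterparts available in base R, sketched on simulated skewed data (the median comparison here is a Mood-style median test):

```r
set.seed(1)
x <- rexp(40); y <- rexp(40, rate = 0.7)
wilcox.test(x, y)   # Mann-Whitney: rank-based location comparison
ansari.test(x, y)   # nonparametric test of equal scale (spread)
# Median test: is each group equally likely to fall above the pooled median?
m <- median(c(x, y))
fisher.test(table(rep(c("x", "y"), each = 40), c(x, y) > m))
```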

III. Regression (OLS)

7. Can regression models (why plural?) replace all the above-mentioned tests in section II?

8. What are degrees of freedom? Can a regression model be fitted with R² = 1? What is overfitting? Why should we avoid overfitting?
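
A tiny R demonstration: with as many parameters as data points, the residual degrees of freedom hit zero and R² is forced to 1 even for pure noise:

```r
set.seed(1)
x <- rnorm(5); y <- rnorm(5)   # pure noise, no real relationship
fit <- lm(y ~ poly(x, 4))      # 5 points, 5 coefficients
fit$df.residual                # 0 degrees of freedom left
summary(fit)$r.squared         # 1: a perfect in-sample fit of noise
```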

9. How do we interpret regression coefficients (direction, magnitude/strength, significance, form)? How do we interpret coefficients when predictors are categorical? How do we compare regression coefficients?
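
A sketch with R's built-in `mtcars` data: a categorical predictor enters as dummy contrasts against a reference level:

```r
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
coef(fit)
# wt: change in mpg per 1000 lb of weight, holding cylinders fixed
# factor(cyl)6, factor(cyl)8: mean mpg difference vs the 4-cylinder reference
summary(fit)  # signs give direction, estimates magnitude, p-values significance
```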

10. Why is the I.I.D. assumption essential?

11. What if residuals are normally distributed, but the dependent variable is not? Is this possible?

12. Why is the effect size important? Is it true that larger coefficients indicate larger effect sizes? How do we measure the unique effect of an independent variable on the dependent variable without any confounding effects of other independent variables (or can stepwise regression models solve the problem)?
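
One common (imperfect) approach, sketched on `mtcars`: standardized coefficients put predictors on a common SD scale, since raw coefficients depend on units:

```r
raw <- lm(mpg ~ wt + hp, data = mtcars)
std <- lm(scale(mpg) ~ scale(wt) + scale(hp), data = mtcars)
coef(raw)  # per-unit slopes; units differ, so magnitudes aren't comparable
coef(std)  # standardized betas: SD change in mpg per SD change in a predictor
```

A larger raw coefficient need not mean a larger effect; it may just reflect the predictor's units.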

13. Is it possible to estimate the main effects from a regression model with interaction terms?
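
A related R sketch on `mtcars`: with an interaction `wt * hp`, the `wt` coefficient is its effect when `hp = 0`; centering `hp` shifts it to the effect at the mean `hp`:

```r
fit0 <- lm(mpg ~ wt * hp, data = mtcars)
fitc <- lm(mpg ~ wt * I(hp - mean(hp)), data = mtcars)
coef(fit0)[["wt"]]  # effect of wt when hp = 0 (an extrapolation)
coef(fitc)[["wt"]]  # effect of wt at the average hp
```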

IV. Regression (GLM)

14. How do we deal with non-normally distributed dependent variables? How do we interpret the coefficients?
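
A minimal GLM sketch on `mtcars` (binary outcome `am`): logistic coefficients live on the log-odds scale, so exponentiating gives odds ratios:

```r
fit <- glm(am ~ wt, data = mtcars, family = binomial)
coef(fit)       # change in the log-odds of a manual transmission per 1000 lb
exp(coef(fit))  # odds ratio: multiplicative change in the odds
```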

15. Can we model the variance (instead of the mean) of the dependent variable? E.g., salaries vary less among older people than among younger people, even when average salaries are equal.

16. What is wrong with using Z-statistics (and associated p-values) of the coefficient of a multiplicative term to test for a statistical interaction in nonlinear models with categorical dependent variables (e.g., logistic regression)?

V. Regression (Causal Inference)

17. Everyone knows that correlation is not causation, but when can regression coefficients be interpreted as causal effects? Is including the lagged explanatory variable a solution?

18. Is a random experiment always better than other methods to identify causality? Why or why not?

19. Is it always better to control for more variables rather than fewer? Should we remove non-significant variables from the regression?

20. Should we control for mediators to estimate the treatment effect?
