Announcements

Exercises

Let’s look at data relating student test scores in various subjects to factors such as parent background and test preparation. The data are from Kaggle. We’ll use the following variables in the analysis:

scores <- read.csv("data/student_performance.csv")

Exercise 1

Visualize the relationship between math score and parent education. Describe any patterns or relationships you see.

Exercise 2

Write the ANOVA hypotheses to test the relationship between parent education and math test score.

Exercise 3

Check the assumptions of ANOVA by (1) normality and (2) homoscedastic variance. Assuming the observations are independent, should we move forward with the ANOVA testing?

Exercise 4

Regardless of what you concluded in Exercise 3, perform the hypothesis test using the ANOVA model and a significance level of \(\alpha = 0.05\). State the value of the test statistic, as well the distribution it follows, assuming \(H_{0}\) true.

Exercise 5

Should you proceed with any more testing? Explain your decision.

Exercise 6

If we were to proceed with testing and conducted all pairwise tests, we would conduct 15 t-tests. What would be the Bonferroni-correct significance level we would use for each test?

Exercise 7

Pick one pairwise hypothesis and perform a t-test using the significance level from Exercise 6. Hint: use the t_test() function, filtering for the two levels of parent education that you are testing between.