Announcements

Your project Github repositories have been created. Reminder that your project proposal is due next Friday, July 9 at 11:59pm. Use the file project-proposal.Rmd to write your proposal.

Exercises

paintings <- read_csv("data/paris_paintings.csv", 
                            na = c("n/a", "", "NA"))

We will be looking at the Paris Paintings data set in today’s application exercise. We’ll primarily focus on the variables:

Exercise 1

Create a scatterplot to visualize the relationship between Height and Width. Color the points based on the whether the painting has any landscape elements. Note: Be sure to make landsAll a factor, so R treats it as categorical variable.

Exercise 2

Based on your scatterplot, does the relationship between Height and Width differ between paintings with landscape elements and those without? Briefly explain.

Exercise 3

Fit a main effects model using width and whether the painting has landscape elements to predict the height.

Exercise 4

Using your model from the previous exercise,

  • Interpret the intercept.
  • Interpret the coefficient (slope) of Width_in.
  • Interpret the coefficient (slope) of landsAll.

Exercise 5

Write the equation of the model for

  • paintings with no landscape elements.
  • paintings with landscape elements.

How do the slopes compare between the two models? How do the intercepts compare?

Exericse 6

Now let’s consider an interaction term. Fit a linear model using width, whether the painting has landscape elements, and the interaction between the two variables to predict the height.

Exercise 7

Write the equation of the model for

  • paintings with no landscape elements.
  • paintings with landscape elements.

How do the slopes compare between the two models? How do the intercepts compare?

Exercise 8

Now let’s see how well each model fits the data. Use the glance function to calculate \(R^2\) and Adj. \(R^2\) for the model fit in Ex. 3 and the one fit in Ex. 6.

Exercise 9

Based on your answer to the previous exercise, which model is a better fit for the data? Briefly explain.

Exercise 10

Interpret the \(R^2\) for the model chosen in the previous exercise.