Go to the ae-05-[GITHUB USERNAME] repo, clone it, and start a new project in RStudio.
Run the following code to configure Git. Fill in your GitHub username and the email address associated with your GitHub account.
library(usethis)
use_git_config(user.name = "your github username", user.email = "your email")
Note: In each of these exercises you will need to set eval=TRUE in the code chunk header when you’re ready to knit and run the code for that exercise.
library(tidyverse)
library(scales)
fisheries <- read_csv("data/fisheries.csv")
continents <- read_csv("data/continents.csv")
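The code below fills in the steps between joining the data sets and creating the updated visualizations: it keeps rows with a total above 100000, joins the continents data, sets the continent by hand for three countries, and computes the share of aquaculture in the total.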
fisheries <- fisheries %>%
  filter(total > 100000) %>%
  left_join(continents) %>%
  mutate(
    continent = case_when(
      country == "Democratic Republic of the Congo" ~ "Africa",
      country == "Hong Kong" ~ "Asia",
      country == "Myanmar" ~ "Asia",
      TRUE ~ continent
    ),
    aquaculture_perc = aquaculture / total
  )
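As a minimal sketch of why a case_when() step like the one above can be needed, the toy example below uses made-up data (the catch and lookup data frames and their values are hypothetical, not the course files). A left_join() keeps every row of the left table and leaves NA where the lookup has no match. A case_when() can then fill in those rows by hand.

```r
library(dplyr)

# Hypothetical toy data standing in for the fisheries and continents files
catch <- tibble(
  country = c("Norway", "Hong Kong", "Chile"),
  total = c(300, 200, 400)
)
lookup <- tibble(
  country = c("Norway", "Chile"),
  continent = c("Europe", "Americas")
)

# left_join() keeps all rows of catch; Hong Kong has no match, so continent is NA
unmatched <- catch %>%
  left_join(lookup, by = "country")

# case_when() overrides the missing value and keeps every other continent as is
joined <- unmatched %>%
  mutate(
    continent = case_when(
      country == "Hong Kong" ~ "Asia",
      TRUE ~ continent
    )
  )
```

Naming the key with by = "country" is a choice made for this sketch. Without it, left_join() joins on all columns the two tables share and prints a message saying which ones it used.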
Calculate the mean aquaculture percentage (we’ll call it mean_ap for short) for continents in the fisheries data using the summarise() function in dplyr. Note that the function for calculating the mean is mean() in R.
fisheries %>%           # start with the fisheries data frame
  ___ %>%               # group by continent
  ___(mean_ap = ___)    # calculate mean aquaculture
Now expand your calculations to also calculate the minimum and maximum aquaculture percentage for continents in the fisheries data. Note that the functions for calculating minimum and maximum in R are min() and max() respectively.
fisheries %>%   # start with the fisheries data frame
  # and the rest of the code goes here
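If the grouped-summary pattern is unfamiliar, here is a small sketch on unrelated toy data (the scores data frame and its column names are hypothetical). It shows that a single summarise() call can compute several summary columns at once, one row per group. It is meant as an illustration of the pattern, not as the solution to the exercises.

```r
library(dplyr)

# Hypothetical toy data: points scored by two teams
scores <- tibble(
  team = c("a", "a", "b", "b"),
  points = c(1, 3, 10, 20)
)

# One row per team, with one column per summary statistic
team_summary <- scores %>%
  group_by(team) %>%
  summarise(
    min_points = min(points),
    mean_points = mean(points),
    max_points = max(points)
  )
```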
Create a new data frame called fisheries_summary that calculates minimum, mean, and maximum aquaculture percentage for each continent in the fisheries data.
fisheries_summary <- fisheries %>%
  # you can reuse code from Exercise 2 here
Take the fisheries_summary data frame and order the results in descending order of mean aquaculture percentage.
fisheries_summary %>%   # start with the fisheries_summary data frame
  ___                   # order in descending order of mean_ap
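For reference, one common way to sort rows in dplyr is arrange(), wrapping a column in desc() for descending order. The sketch below uses hypothetical toy data (team_means and its columns are made up for illustration) and is not the only possible approach.

```r
library(dplyr)

# Hypothetical toy summary table
team_means <- tibble(
  team = c("a", "b", "c"),
  mean_points = c(2, 15, 7)
)

# Sort rows from the largest mean_points to the smallest
ordered <- team_means %>%
  arrange(desc(mean_points))
```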
The code below creates the graph you originally saw in the lecture slides. Change the theme to change the look of the graph. Choose one of the complete themes found in the ggplot2 reference page.
ggplot(fisheries_summary,
       aes(y = fct_reorder(continent, mean_ap), x = mean_ap)) +
  geom_col() +
  scale_x_continuous(labels = label_percent(accuracy = 1)) +
  labs(
    x = "",
    y = "",
    title = "Average share of aquaculture by continent",
    subtitle = "out of total fisheries harvest, 2016",
    caption = "Source: bit.ly/2VrawTt"
  ) +
  theme_minimal() # change the theme!
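Complete themes are added to a plot as one more layer, so swapping a theme means replacing a single line. The sketch below shows this on hypothetical toy data (the toy data frame is made up for illustration). theme_bw() and theme_classic() are two of the complete themes that ship with ggplot2.

```r
library(ggplot2)

# Hypothetical toy data for a small bar chart
toy <- data.frame(
  group = c("a", "b", "c"),
  share = c(0.2, 0.5, 0.3)
)

# Build the plot once without a theme
base_plot <- ggplot(toy, aes(y = group, x = share)) +
  geom_col()

# The same plot with two different complete themes
plot_bw <- base_plot + theme_bw()
plot_classic <- base_plot + theme_classic()
```

Printing plot_bw and plot_classic side by side makes the differences in background, grid lines, and axis styling easy to compare.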
This exercise was modified from “Fisheries” in Data Science in a Box.