library(tidyverse)
brodhead_center <- read_csv("data/brodhead_center.csv")
Wrangle Data with {dplyr}
{dplyr} verbs help you wrangle, clean, and normalize your data.
dplyr function | use for |
---|---|
select() | subset columns |
filter() | subset rows |
arrange() | sort rows by column variable values |
mutate() | create new variables or modify existing ones |
group_by() | use with summarize() for subtotals |
summarize() | generate column totals, subtotals, etc. |
count() | a specialized summarize() function |
Examples
First we need to load the {dplyr} package for wrangling and the {readr} package for importing CSV data. In our case, we’ll do that by loading the tidyverse, which loads {dplyr}, {readr}, and several other helpful packages. Then we need to load our data.
select()
brodhead_center |>
  select(name, type)
filter()
brodhead_center |>
  filter(menuType == "dessert")
arrange()
brodhead_center |>
  arrange(cost)
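arrange() sorts in ascending order by default. A minimal sketch of a descending sort and a two-column sort follows. It uses a small made-up tibble called `menu` (hypothetical names and values, not the Brodhead Center data) so it can run on its own:

```r
library(dplyr)

# Hypothetical toy data standing in for brodhead_center
menu <- tibble(
  name     = c("Cafe A", "Cafe A", "Cafe B", "Cafe B"),
  menuType = c("entree", "dessert", "entree", "side"),
  cost     = c(12, 5, 9, 3)
)

# Most expensive items first: wrap the column in desc()
by_cost_desc <- menu |>
  arrange(desc(cost))

# Sort by restaurant name, then by cost within each restaurant
by_name_cost <- menu |>
  arrange(name, cost)
```

Additional columns passed to arrange() act as tie-breakers, in the order given.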
mutate()
brodhead_center |>
  mutate(ratings_high = rating * 2)
We can also mutate data by groups or categories:
brodhead_center |>
  mutate(avg_item_rating_rest = mean(rating, na.rm = TRUE),
         .by = name,
         .after = name)
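To see what a grouped mutate() returns, here is a small sketch on made-up data (the `menu` tibble and its values are hypothetical). Unlike summarise(), it keeps every row and repeats the group mean on each row of the group. The `.by` argument assumes dplyr 1.1.0 or later:

```r
library(dplyr)

# Hypothetical toy data standing in for brodhead_center
menu <- tibble(
  name   = c("Cafe A", "Cafe A", "Cafe B"),
  rating = c(4, 2, 5)
)

with_avg <- menu |>
  mutate(avg_item_rating_rest = mean(rating, na.rm = TRUE),
         .by = name,      # compute the mean within each restaurant
         .after = name)   # place the new column right after `name`

# All three rows are kept; both Cafe A rows carry the same group mean
with_avg
```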
count()
Count values in a group | |
---|---|
menuType | n |
entree | 24 |
appetizer | 23 |
dessert | 7 |
side | 5 |
brodhead_center |>
  count(menuType)
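The table of verbs describes count() as a specialized summarize(). A rough sketch of that equivalence, again on a made-up `menu` tibble (hypothetical values), plus the `sort = TRUE` argument for ordering groups from most to least frequent:

```r
library(dplyr)

# Hypothetical toy data standing in for brodhead_center
menu <- tibble(
  menuType = c("entree", "side", "entree", "dessert", "entree", "side")
)

# count() tallies the rows in each group
counted <- menu |>
  count(menuType)

# Roughly the same result, written out with group_by() and summarise()
manual <- menu |>
  group_by(menuType) |>
  summarise(n = n())

# sort = TRUE orders the groups by descending n
sorted <- menu |>
  count(menuType, sort = TRUE)
```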
group_by() & summarise()
Summarise column |
---|
Sum_of_cost |
412 |
brodhead_center |>
  group_by(name) |>
  summarise(min_cost = min(cost), mean_cost = mean(cost), max_cost = max(cost))
Or summarize by groups without group_by(), using the .by argument:
brodhead_center |>
  summarise(min_cost = min(cost), .by = name)
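The two approaches give the same summary values but can differ in row order. A minimal sketch on made-up data (the `menu` tibble is hypothetical), assuming dplyr 1.1.0 or later for `.by`:

```r
library(dplyr)

# Hypothetical toy data; note that "Cafe B" appears first
menu <- tibble(
  name = c("Cafe B", "Cafe A", "Cafe B", "Cafe A"),
  cost = c(9, 12, 3, 5)
)

# group_by() + summarise(): groups come back sorted by the grouping column
with_group_by <- menu |>
  group_by(name) |>
  summarise(min_cost = min(cost))

# .by: groups come back in order of first appearance in the data
with_by <- menu |>
  summarise(min_cost = min(cost), .by = name)
```

Here `with_group_by` lists Cafe A before Cafe B, while `with_by` keeps Cafe B first. Add arrange() afterwards if a particular order matters.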