library(tidyverse)
brodhead_center <- read_csv("data/brodhead_center.csv")

Wrangle Data with {dplyr}
{dplyr} verbs help you wrangle, clean, and normalize your data.
| dplyr function | use for |
|---|---|
| select() | subset columns |
| filter() | subset rows |
| arrange() | sort rows by column variable values |
| mutate() | create new variables or modify existing ones |
| group_by() | use with summarize() for subtotals |
| summarize() | generate column totals, subtotals, etc. |
| count() | a specialized summarize() function |
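To make the last row of the table concrete, here is a minimal sketch of how count() relates to summarize(). The `menu` tibble and its values are hypothetical and exist only for illustration; only the `menuType` column name is borrowed from this tutorial.

```r
library(dplyr)

# Hypothetical toy data (invented values); menuType mirrors the tutorial's column name
menu <- tibble(
  item     = c("fries", "cake", "soup", "pie", "burger"),
  menuType = c("side", "dessert", "appetizer", "dessert", "entree")
)

# count() tallies the number of rows in each group
by_count <- menu |>
  count(menuType)

# The same tally written out with group_by() and summarize()
by_summarize <- menu |>
  group_by(menuType) |>
  summarize(n = n())

# Compare the two results
all.equal(by_count, by_summarize)
```

Both pipelines should give one row per menuType with its row count in `n`, which is why count() is usually the shorter choice for simple tallies.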
Examples
First we need to load the {dplyr} package for wrangling and the {readr} package for importing CSV data. In our case, we’ll do that by loading the tidyverse, which loads {dplyr}, {readr}, and several other helpful packages. Then we need to load our data.
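If the CSV file is not at hand, a small stand-in tibble is enough to try the verbs. The sketch below is an assumption, not the real data: the object name `toy_menu` and every value are invented, and only the column names (name, type, menuType, cost, rating) are taken from the examples in this tutorial.

```r
library(tidyverse)

# Hypothetical stand-in for data/brodhead_center.csv.
# Column names come from this tutorial's examples; all values are invented.
toy_menu <- tribble(
  ~name,          ~type,     ~menuType, ~cost, ~rating,
  "Restaurant A", "pizza",   "entree",     12,     4.5,
  "Restaurant A", "gelato",  "dessert",     6,     4.0,
  "Restaurant B", "fries",   "side",        4,     3.5,
  "Restaurant B", "brownie", "dessert",     5,     5.0
)

# Quick look at the column names and types
glimpse(toy_menu)
```

To run the examples unchanged against this toy data, assign the tibble to `brodhead_center` instead of `toy_menu`.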
select()
brodhead_center |>
  select(name, type)

filter()
brodhead_center |>
  filter(menuType == "dessert")

arrange()
brodhead_center |>
  arrange(cost)

mutate()
brodhead_center |>
  mutate(ratings_high = rating * 2)

We can also mutate data by groups or categories:
brodhead_center |>
  mutate(avg_item_rating_rest = mean(rating, na.rm = TRUE),
         .by = name,
         .after = name)

count()
Count values in a group

| menuType | n |
|---|---|
| entree | 24 |
| appetizer | 23 |
| dessert | 7 |
| side | 5 |
brodhead_center |>
  count(menuType)

group_by() & summarise()
Summarise column

| Sum_of_cost |
|---|
| 412 |
brodhead_center |>
  group_by(name) |>
  summarise(min_cost = min(cost), mean_cost = mean(cost), max_cost = max(cost))

or
Summarize by groups, without group_by()
brodhead_center |>
  summarise(min_cost = min(cost), .by = name)
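The group_by() form and the `.by` form compute the same values, but they are not identical in every detail. A minimal sketch of one difference, row order, assuming dplyr 1.1.0 or later (where `.by` was introduced): group_by() sorts the result by the group keys, while `.by` keeps groups in order of first appearance. The `toy_menu` tibble and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data; names deliberately appear out of alphabetical order
toy_menu <- tibble(
  name = c("Zest", "Zest", "Alto", "Alto"),
  cost = c(12, 6, 4, 5)
)

# Persistent grouping: the result is sorted by the group keys
with_group_by <- toy_menu |>
  group_by(name) |>
  summarise(min_cost = min(cost))

# Per-operation grouping: groups stay in order of first appearance
with_by <- toy_menu |>
  summarise(min_cost = min(cost), .by = name)

# The values agree once both results are sorted the same way
all.equal(with_group_by, arrange(with_by, name))
```

If row order matters downstream, add an explicit arrange() after a `.by` summary rather than relying on either default.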