Tidyverse R Cheat Sheet



Subsetting using the tidyverse

You can also subset tibbles using tidyverse functions from the dplyr package. dplyr verbs are inspired by SQL vocabulary and are designed to be more intuitive than subsetting with [.

  • The tidyverse includes dplyr, tidyr, and ggplot2, which are among the most popular R packages. Other useful packages, such as readxl, are installed along with the tidyverse package but are not attached by library(tidyverse), so you have to load them explicitly. In current releases, forcats and stringr are core packages and are attached automatically.

The first argument of the main dplyr functions is a tibble (or data.frame).
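As a rough sketch of what this data-first convention allows, the verbs can be nested or chained, because each one takes a tibble as its first argument and returns a tibble. The tibble trees_tbl and its values are made up for illustration, and the %>% pipe (re-exported by dplyr) is not mentioned in the text above; it is shown only as one common way to exploit the convention.

```r
library(dplyr)

# Hypothetical example data; names and values are assumptions for illustration.
trees_tbl <- tibble(
  Girth  = c(8.3, 10.5, 12.9, 17.3, 20.6),
  Height = c(70, 72, 85, 81, 87),
  Volume = c(10.3, 16.4, 33.8, 55.4, 77.0)
)

# Nested calls: the result of filter() becomes the first argument of select().
nested <- select(filter(trees_tbl, Height > 80), Height, Volume)

# The same steps written with the pipe, which passes the left-hand side
# as the first argument of the function on the right.
piped <- trees_tbl %>%
  filter(Height > 80) %>%
  select(Height, Volume)
```

Both forms should give the same tibble; which one to use is largely a matter of readability.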

Filtering rows with filter()

filter() allows us to subset observations (rows) based on their values. The first argument is the name of the data frame. The second and subsequent arguments are the expressions that filter the data frame.

If you’re using R to do data analysis inside a company, most of the data you need probably already lives in a database (it’s just a matter of figuring out which one!). However, you will learn how to load data into a local database in order to demonstrate dplyr’s database tools. dbplyr is part of the tidyverse.

dplyr executes the filtering operation by generating a logical vector and returns a new tibble containing the rows that match the filtering conditions. You can therefore use any of the logical operators we learnt when subsetting with [.
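A minimal sketch of filter(), assuming a small made-up tibble called trees_tbl (the name and values are hypothetical). It shows a single condition, several conditions separated by commas (which are combined with AND), and the | operator for OR.

```r
library(dplyr)

# Hypothetical example data; names and values are assumptions for illustration.
trees_tbl <- tibble(
  Girth  = c(8.3, 10.5, 12.9, 17.3, 20.6),
  Height = c(70, 72, 85, 81, 87),
  Volume = c(10.3, 16.4, 33.8, 55.4, 77.0)
)

# Keep rows where Height is greater than 80.
tall <- filter(trees_tbl, Height > 80)

# Multiple expressions are combined with AND.
tall_and_wide <- filter(trees_tbl, Height > 80, Girth > 15)

# Logical operators such as | (OR) work as they do inside [.
either <- filter(trees_tbl, Height > 85 | Volume < 15)

# Roughly equivalent base R subsetting with a logical vector.
tall_base <- trees_tbl[trees_tbl$Height > 80, ]
```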

Slicing rows with slice()

slice() is similar to subsetting with element indices: we provide the positions of the rows we want to select.
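For illustration, a short sketch of slice() on a made-up tibble (trees_tbl and its values are hypothetical). It selects rows by position and, as with [, drops rows when given negative indices.

```r
library(dplyr)

# Hypothetical example data; names and values are assumptions for illustration.
trees_tbl <- tibble(
  Girth  = c(8.3, 10.5, 12.9, 17.3, 20.6),
  Height = c(70, 72, 85, 81, 87),
  Volume = c(10.3, 16.4, 33.8, 55.4, 77.0)
)

# Select the first two rows by position.
first_two <- slice(trees_tbl, 1:2)

# Select arbitrary rows with a vector of indices.
picked <- slice(trees_tbl, c(1, 4))

# Negative indices drop rows instead of keeping them.
dropped <- slice(trees_tbl, -1)
```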

Selecting columns with select()

select() allows us to subset columns in tibbles using operations based on the names of the variables.

In dplyr we use unquoted column names (i.e. Volume rather than 'Volume').
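A small sketch of select() with unquoted column names, again on a made-up tibble (trees_tbl and its values are hypothetical; only the column name Volume comes from the text above).

```r
library(dplyr)

# Hypothetical example data; names and values are assumptions for illustration.
trees_tbl <- tibble(
  Girth  = c(8.3, 10.5, 12.9, 17.3, 20.6),
  Height = c(70, 72, 85, 81, 87),
  Volume = c(10.3, 16.4, 33.8, 55.4, 77.0)
)

# Select a single column; the result is still a tibble, not a vector.
vol <- select(trees_tbl, Volume)

# Select several columns, in the order given.
two_cols <- select(trees_tbl, Volume, Height)

# A minus sign drops a column and keeps the rest.
no_girth <- select(trees_tbl, -Girth)
```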

Behind the scenes, select() matches any variable arguments to column names, creating a vector of column indices. This vector is then used to subset the tibble. As such, we can create ranges of variables using their names and the : operator.
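As a sketch of the range idea, the following selects a run of adjacent columns by name and then by position. The tibble trees_tbl and its values are made up for illustration.

```r
library(dplyr)

# Hypothetical example data; names and values are assumptions for illustration.
trees_tbl <- tibble(
  Girth  = c(8.3, 10.5, 12.9, 17.3, 20.6),
  Height = c(70, 72, 85, 81, 87),
  Volume = c(10.3, 16.4, 33.8, 55.4, 77.0)
)

# A range of columns given by name: everything from Girth to Height.
by_name <- select(trees_tbl, Girth:Height)

# Because names are matched to column indices, the positions work too.
by_position <- select(trees_tbl, 1:2)
```

Name-based ranges depend on column order, so they are most useful when the layout of the tibble is known and stable.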

There are also a number of helper functions to make selections easier. For example, we can use one_of() to provide a character vector of column names to select.
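A brief sketch of selection helpers, using a made-up tibble (trees_tbl, its values, and the vector wanted are hypothetical). Note that in recent tidyselect and dplyr releases one_of() is superseded by all_of() and any_of(); one_of() is shown because the text mentions it, and all_of() as the currently recommended alternative.

```r
library(dplyr)

# Hypothetical example data; names and values are assumptions for illustration.
trees_tbl <- tibble(
  Girth  = c(8.3, 10.5, 12.9, 17.3, 20.6),
  Height = c(70, 72, 85, 81, 87),
  Volume = c(10.3, 16.4, 33.8, 55.4, 77.0)
)

# Column names stored as strings in a character vector.
wanted <- c("Height", "Volume")

# one_of() selects the columns named in the character vector.
with_one_of <- select(trees_tbl, one_of(wanted))

# all_of() does the same but raises an error if a name is missing.
with_all_of <- select(trees_tbl, all_of(wanted))

# Other helpers match on patterns in the column names.
h_cols <- select(trees_tbl, starts_with("H"))
```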