Dataset References
Learn more about some important datasets that are useful for building data visualizations in ggplot2.
Reference Datasets for ggplot2 visualizations
The datasets package is one of several packages maintained by the R Core Team and included with the base R installation. Calling the data() function without any arguments lists the datasets available in all currently attached packages.
data()
Let’s look at the available built-in datasets in the ggplot2 package using the code below:
data(package="ggplot2")
Note: We can replace ggplot2 in the above command with any other installed package (for example, MASS) to list the datasets available in that package.
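The listing returned by data() can also be captured and handled like any other R object, which helps when searching through many datasets. The following is a minimal sketch, assuming the value returned by data() exposes a results matrix with Item and Title columns; list_datasets is a hypothetical helper name introduced only for this illustration.

```r
# Return the datasets shipped with a package as a data frame.
# list_datasets is a hypothetical helper, not part of base R or ggplot2.
list_datasets <- function(pkg) {
  # data() returns an object whose $results element is a character matrix
  # with the columns Package, LibPath, Item, and Title.
  info <- data(package = pkg)$results
  data.frame(
    name  = info[, "Item"],
    title = info[, "Title"],
    stringsAsFactors = FALSE
  )
}

# List the datasets from the base "datasets" package and preview them.
base_sets <- list_datasets("datasets")
head(base_sets, n = 10)
```

Passing "ggplot2" instead of "datasets" would list the datasets bundled with ggplot2, provided that package is installed.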
Together, the base R installation and ggplot2 offer several useful built-in datasets. Let’s familiarize ourselves with some of them. We’ll load each dataset and print its first ten rows to get an idea of the variables it contains.
The mpg dataset
This is one of the popular datasets used in the data science community. The mpg dataset is a built-in dataset from the ggplot2 package. It consists of a subset of the fuel economy data provided by the EPA.
This dataset contains fuel economy data for popular car models from 1999 to 2008.
Note: We can browse and download this dataset from the official website of the US Department of Energy.
head(mpg, n=10)
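Beyond printing the first rows, a few base R functions give a quick overview of the variables in a dataset. The sketch below assumes that ggplot2 is installed and that cty and hwy are the city and highway mileage columns of mpg; mpg_dims is a name introduced only for this example.

```r
library(ggplot2)  # attaches the package that provides the mpg dataset

mpg_dims <- dim(mpg)  # number of rows and columns
mpg_dims

names(mpg)  # variable names
str(mpg)    # type and first few values of each variable

# Summary statistics for city and highway miles per gallon
summary(mpg[, c("cty", "hwy")])
```

The same calls work for any other data frame, so they can be reused with the remaining datasets covered in this lesson.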
The mtcars dataset
The mtcars (Motor Trend Car Road Tests) dataset is another commonly used dataset for data science projects. This dataset provides the fuel consumption data collected for automobiles and ten attributes of automotive ...