Before running this notebook, select “Session > Restart R and Clear Output” in the menu above to start a new R session. This will clear any old data sets and give us a blank slate to start with.
After starting a new session, run the following code chunk to load the libraries and data that we will be working with today.
I have set the option message=FALSE to avoid cluttering the solutions with all the output from this code.
In the introduction to the grammar of graphics, we saw that visualizations can be built out of graphics layers. Each layer, in turn, is described by a data set, a geometry, and a series of aes (aesthetic) mappings between variables and features of the layer. The point geometry required x and y aesthetics; the text and text repel layers also required a label aesthetic.
In addition to the required aesthetics, each geometry type also has a number of optional aesthetics that we can use to add information to the plot. For example, most geoms have a color aesthetic. The syntax for describing this is exactly the same as with the required aesthetics: we write the name of the aesthetic, an equals sign, and then the name of the associated variable. Let’s see what happens when we add a color aesthetic to our scatterplot by relating the variable food_group to the aes color:
food %>%
  ggplot() +
    geom_point(aes(x = calories, y = total_fat, color = food_group))
Notice that R has done a lot of work for us. It determined all of the food groups in the data set, assigned each to a color, built a legend, and modified the points on the plot so that the colors align with the food groups. Can you now tell what types of food have a large number of calories and fat? Which kinds of food have the lowest calories and fat? What is the biggest difference between fruits and vegetables from the plot?
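To see the color assignment described above more concretely, here is a minimal sketch that inspects what ggplot2 computed for each point. The food data are not reproduced here, so toy_food is a small made-up stand-in (its columns mirror the names used in this notebook, but the values are illustrative assumptions). The ggplot2 helper layer_data() returns the data that a layer actually draws.

```r
library(ggplot2)

# Made-up stand-in for the `food` data; the values are illustrative only.
toy_food <- data.frame(
  item       = c("Apple", "Banana", "Carrot", "Cheddar", "Salmon", "Beef"),
  food_group = c("fruit", "fruit", "vegetable", "dairy", "fish", "meat"),
  calories   = c(52, 89, 41, 403, 208, 250),
  total_fat  = c(0.2, 0.3, 0.2, 33, 13, 15)
)

# ggplot(toy_food) is equivalent to piping the data into ggplot().
p_color <- ggplot(toy_food) +
  geom_point(aes(x = calories, y = total_fat, color = food_group))

# One row per point, including the colour ggplot2 chose for it.
color_layer <- layer_data(p_color, 1)

# Count how many points received each colour: one colour per food group.
table(color_layer$colour)
```

Typing p_color on its own line in the notebook draws the plot together with its legend.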
Similarly, we can modify the size of the points according to a variable in the data set by setting the size aesthetic. Here, we will make points larger or smaller based on the saturated fat in each food item:
food %>%
  ggplot() +
    geom_point(aes(x = calories, y = total_fat, size = sat_fat))
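As a rough illustration of what the size mapping does, the sketch below builds the same kind of plot on a small made-up data frame (toy_food and its values are assumptions for illustration, not the course data) and then looks at the point sizes ggplot2 computed. The sat_fat values are not used as sizes directly; they are rescaled onto a range of point sizes.

```r
library(ggplot2)

# Made-up stand-in for the `food` data; the values are illustrative only.
toy_food <- data.frame(
  item      = c("Apple", "Carrot", "Cheddar", "Salmon", "Beef"),
  calories  = c(52, 41, 403, 208, 250),
  total_fat = c(0.2, 0.2, 33, 13, 15),
  sat_fat   = c(0.03, 0.04, 21, 3.1, 6)
)

p_size <- ggplot(toy_food) +
  geom_point(aes(x = calories, y = total_fat, size = sat_fat))

# The `size` column holds the drawn point sizes, not the raw sat_fat values.
size_layer <- layer_data(p_size, 1)
size_layer[, c("x", "y", "size")]
```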
Both size and color can also be specified for the text, text repel, and line geometries. There are a few other aesthetics that will be useful, and that we will introduce as needed.
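For instance, a text layer accepts the same optional aesthetics. This is a minimal sketch on made-up data (toy_food and its values are assumptions for illustration); it uses geom_text from ggplot2, and a text repel layer from the ggrepel package should accept the same mappings.

```r
library(ggplot2)

# Made-up stand-in for the `food` data; the values are illustrative only.
toy_food <- data.frame(
  item       = c("Apple", "Banana", "Carrot", "Cheddar", "Beef"),
  food_group = c("fruit", "fruit", "vegetable", "dairy", "meat"),
  calories   = c(52, 89, 41, 403, 250),
  total_fat  = c(0.2, 0.3, 0.2, 33, 15)
)

# The text geometry needs x, y, and label; color is optional.
p_text <- ggplot(toy_food) +
  geom_text(aes(x = calories, y = total_fat, label = item, color = food_group))

# Each label keeps its text and receives the colour of its food group.
text_layer <- layer_data(p_text, 1)
text_layer[, c("label", "colour")]
```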
In the previous section we changed the default aes values for the color and size of points by associating them with a variable in the data set. In the plots from the last notebook, where we did not specify color or size, R chose default values for these: the color “black” and the size 1.5. What if we want to change the defaults to a different fixed value? This can be done relatively easily, but take care with the details because this is a common source of confusing errors for users new to the grammar of graphics.
To change an aes to a fixed value, we specify the changed value inside the geom_ function, but after the aes() function. Here, for example, is how we change the size of all the points to 4 (well above the default of 1.5):
food %>%
  ggplot() +
    geom_point(aes(x = calories, y = total_fat), size = 4)
We can do the same with colors, but notice that we need to put the color name inside of quotes:
food %>%
  ggplot() +
    geom_point(aes(x = calories, y = total_fat), color = "pink")
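The placement of the fixed value matters, and getting it wrong is probably the most common version of the confusing errors mentioned above. The sketch below contrasts the two placements on a small made-up data frame (toy_food and its values are assumptions for illustration). When the quoted color sits inside aes(), ggplot2 treats it as a variable with a single level, assigns it a color from the default palette, and adds a legend, so the points do not come out pink.

```r
library(ggplot2)

# Made-up stand-in for the `food` data; the values are illustrative only.
toy_food <- data.frame(
  calories  = c(52, 41, 403, 208, 250),
  total_fat = c(0.2, 0.2, 33, 13, 15)
)

# Fixed values: placed inside geom_point() but after aes().
p_fixed <- ggplot(toy_food) +
  geom_point(aes(x = calories, y = total_fat), color = "pink", size = 4)

# Common mistake: the quoted value is placed inside aes().
p_mapped <- ggplot(toy_food) +
  geom_point(aes(x = calories, y = total_fat, color = "pink"))

fixed_layer  <- layer_data(p_fixed, 1)
mapped_layer <- layer_data(p_mapped, 1)

unique(fixed_layer$colour)   # the fixed value we asked for
unique(mapped_layer$colour)  # a palette color chosen by ggplot2 instead
```

A quick rule of thumb: variable names go inside aes(), fixed values go outside it.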