Aggregating Data by Group


The goal of this lab is to get some focused practice summarizing subsets of data separately, using group_by() together with summarize().


For convenience, the link to the dplyr reference sheet is here

The Data

We’ll continue using the babynames dataset. We’ll start by exploring the question from the last lab that I asked you to “think about” but not actually answer:

For a chosen name, find the year in which that name was most evenly split between male and female babies: that is, the year when the sex distribution was closest to 50/50.
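
To make the question concrete, here is a minimal sketch of one possible approach on a tiny made-up table. The counts are hypothetical (not real babynames values), and the helper names toy, prop_female, and distance_from_even are my own; only the column names year, sex, name, and n mirror the babynames dataset. The idea is to group by year, compute the female share within each year, and keep the year whose share is closest to 0.5.

```r
library(dplyr)

# Hypothetical toy counts (NOT real babynames values), using the same
# column names as babynames: year, sex, name, n.
toy <- tibble::tibble(
  year = c(1900, 1900, 1950, 1950, 2000, 2000),
  sex  = c("F", "M", "F", "M", "F", "M"),
  name = "Jessie",
  n    = c(80, 20, 45, 55, 70, 30)
)

closest_to_even <- toy %>%
  group_by(year) %>%
  # One row per year: the share of that year's babies recorded as female
  summarize(
    prop_female = sum(n[sex == "F"]) / sum(n),
    .groups = "drop"
  ) %>%
  # How far each year's split is from a perfect 50/50
  mutate(distance_from_even = abs(prop_female - 0.5)) %>%
  # Keep the year with the smallest distance
  slice_min(distance_from_even, n = 1)

closest_to_even
```

With the real data you would first filter babynames down to the chosen name. This is only one way to set up the computation, and a year in which the name was recorded for just one sex would need extra care.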

Preliminaries (loading packages and data, and setting the default color palette):
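
The original setup chunk is not shown here, so the following is only a sketch of what such preliminaries might look like. It assumes the tidyverse and babynames packages are installed. The palette (the Okabe-Ito colorblind-friendly colors) and the use of ggplot2's ggplot2.discrete.colour and ggplot2.discrete.fill options are my own choices, not necessarily what the lab used.

```r
library(tidyverse)   # attaches dplyr and ggplot2, among others
library(babynames)   # provides the babynames data frame

# Assumed palette: the Okabe-Ito colors (an illustrative choice only)
okabe_ito <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442",
               "#0072B2", "#D55E00", "#CC79A7", "#000000")

# Ask ggplot2 to use this palette by default for discrete color and fill scales
options(
  ggplot2.discrete.colour = okabe_ito,
  ggplot2.discrete.fill   = okabe_ito
)

# Quick look at the columns we will be grouping and summarizing
glimpse(babynames)
```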

Uncle Jess(i)e vs… Great Aunt Jessie?

80s Heart-throb Uncle Jesse, Born During Jessie's Most Male Era (Source: Bustle)

