These data come from the USDA and give the number of bee colonies by county around the country.

First, import the data, which I got from the site.

Tidy the data

I had to do a bit of tidying:

  • remove the commas in numbers (e.g., 21,892 needed to be 21892 to be treated as numeric)
  • remove the period after St. in St. Johns and St. Lucie to match the map dataset
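Here is a minimal sketch of those two fixes on a few toy values. The vectors `raw_value` and `raw_county` are made up for illustration and are not the real CSV columns.

```r
# Toy values standing in for the raw CSV columns (not the real data)
raw_value  <- c("21,892", "350", "1,004")
raw_county <- c("ST. JOHNS", "ST. LUCIE", "ALACHUA")

# Strip the thousands separators, then convert to numeric
clean_value <- as.numeric(gsub(",", "", raw_value))

# Drop literal periods; fixed = TRUE treats "." as a plain character,
# which does the same job as escaping it in a regex ("\\.")
clean_county <- gsub(".", "", raw_county, fixed = TRUE)

print(clean_value)
print(clean_county)
```

Either way of matching the period should work; `fixed = TRUE` just avoids the double backslash.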

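The expand/grid bit builds every State, County, and Year combination, so county-years with no record show up as NA after the join. I feel it is a little clunky, but I couldn’t figure out how to do it in one pipe.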


library(tidyverse)  # read_csv(), dplyr verbs, expand(), str_to_upper(), ggplot2

#Data available here:
bees <- read_csv("Bee Colony Census Data by County.csv", na = "NA") %>%
      filter(State == "FLORIDA") %>%
      # strip thousands separators so Value can be treated as numeric
      mutate(Value = as.numeric(gsub(",", "", Value))) %>%
      # some counties have periods in their names (e.g., St. Johns)
      mutate(County = gsub("\\.", "", County)) %>%
      select(Year, State, County, Value)

grid <- bees %>%
      expand(State, County, Year) 

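# name the join keys explicitly instead of relying on the default natural join
bees2 <- full_join(grid, bees, by = c("State", "County", "Year"))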
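The expand-then-join step above can probably be done in one pipe with `tidyr::complete()`, which is documented as a wrapper around `expand()` and `full_join()`. This is only a sketch on a toy table; `bees_toy` and its values are made up, and I have not run it against the real CSV.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the tidied bee data; the values are made up
bees_toy <- tibble(
  Year   = c(2002, 2007, 2002),
  State  = "FLORIDA",
  County = c("ALACHUA", "ALACHUA", "ST JOHNS"),
  Value  = c(100, 120, 80)
)

# complete() adds a row for every State/County/Year combination
# and leaves Value as NA where no record exists
bees2_toy <- bees_toy %>%
  complete(State, County, Year)

print(bees2_toy)
```

With the real data, the same call would go at the end of the `bees` pipe, after `select()`.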

Get the county outlines and connect to the bee data

Now I get the county map data, convert the state and county names to upper case to match my bee data, and join it to the bee data so everything is in one data frame.


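# map_data() comes from ggplot2 and needs the maps package to be installed
m.usa <- map_data("county") %>%
      mutate(State = str_to_upper(region, locale = "en"), 
             County = str_to_upper(subregion, locale = "en")) %>%
      # name the join keys explicitly
      full_join(bees2, by = c("State", "County")) %>%
      select(long, lat, group, order, State, County, Year, Value) %>%
      filter(State == "FLORIDA") 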
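Since the join depends on the county names matching exactly, it may be worth checking for names that appear in only one table. The sketch below uses `anti_join()` on two toy tables; `map_names`, `bee_names`, and the DE SOTO / DESOTO mismatch are hypothetical and only there to show what the check returns.

```r
library(dplyr)

# Toy stand-ins; with the real data these would be the map and bee tables
map_names <- tibble(State = "FLORIDA",
                    County = c("ALACHUA", "ST JOHNS", "DE SOTO"))
bee_names <- tibble(State = "FLORIDA",
                    County = c("ALACHUA", "ST JOHNS", "DESOTO"))

# Counties in the bee data that have no polygon in the map
unmatched_bees <- anti_join(bee_names, map_names, by = c("State", "County"))

# Counties in the map that have no bee data
unmatched_map <- anti_join(map_names, bee_names, by = c("State", "County"))

print(unmatched_bees)
print(unmatched_map)
```

If both results are empty, every county lined up; otherwise the leftover names show which spellings still need tidying.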

Graph it

Finally, graph it!

ggplot(m.usa, aes(x = long,
                  y = lat,
                  group = group,
                  fill = Value)) +
      # linewidth replaces size for outline widths in ggplot2 3.4.0 and later
      geom_polygon(color = "grey50", linewidth = 0.2) +
      coord_fixed(1.3) +
      scale_fill_continuous(low = "white", high = "red", na.value = "grey80") +
      facet_wrap(~Year) +
      labs(title = "Number of bee colonies in Florida counties", 
             subtitle = "2002-2012", x = "", y = "") +
      theme(axis.ticks = element_blank(), 
            axis.text = element_blank())