This data comes from the USDA and gives the number of bee colonies in counties around the country.
The first step is to import the data, which I downloaded from data.world.
Tidy the Data
I had to do a bit of tidying:
- remove the commas in numbers (e.g., 21,892 needed to be 21892 to be treated as numeric)
- remove the period after St. in St. Johns and St. Lucie in order to match the map dataset
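I also build a grid of every State/County/Year combination and join the bee data onto it, so that a county with no record for a given year still shows up (with an NA value). I feel the expand/grid bit is a little clunky, but I couldn’t figure out how to do it in one pipe.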
library(tidyverse)
library(readr)

#Data available here: https://data.world/siyeh/us-bee-stats-by-state/workspace/file?filename=Bee+Colony+Census+Data+by+County.csv
bees <- read_csv("Bee Colony Census Data by County.csv", na = "NA") %>%
  filter(State == "FLORIDA") %>%
  mutate(Value = as.numeric(gsub(",", "", Value))) %>%
  #some counties have [.]s in their name (i.e. St. Johns, etc.)
  mutate(County = gsub("\\.","", County)) %>%
  select(Year, State, County, Value)

grid <- bees %>%
  expand(State, County, Year)

bees2 <- full_join(grid, bees)
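As far as I can tell, the expand-then-join step above can be collapsed into a single pipe with `tidyr::complete()`, which adds the missing combinations and fills the remaining columns with NA. Here is a minimal sketch on a made-up table; `toy`, `toy_full`, and the county names "A" and "B" are placeholders for illustration, not part of the bee data.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the bee data: county "B" has no 2007 row
toy <- tibble(
  State  = "FLORIDA",
  County = c("A", "A", "B"),
  Year   = c(2002, 2007, 2002),
  Value  = c(10, 12, 5)
)

# complete() adds every missing State/County/Year combination
# and sets Value to NA for the rows it creates
toy_full <- toy %>%
  complete(State, County, Year)

print(toy_full)
```

Applied to the real data, this would presumably mean ending the `bees` pipe with `complete(State, County, Year)` instead of building `grid` separately.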
Get the county outlines and connect to the bee data
Now I get the county map data, convert its state and county names to upper case to match my bee data, and join it to the bee data so everything is in one data frame.
library(ggplot2)
library(maps)

m.usa <- map_data("county") %>%
  mutate(State = str_to_upper(region, locale = "en"),
         County = str_to_upper(subregion, locale = "en")) %>%
  full_join(bees2) %>%
  select(long, lat, group, order, State, County, Year, Value) %>%
  filter(State == "FLORIDA")
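Because the join above matches on county names, any spelling difference between the two sources silently leaves a county without a value. One way to check for that is an `anti_join()`. The sketch below uses two made-up name lists (`map_names` and `bee_names` are hypothetical, and the "DE SOTO"/"DESOTO" mismatch is invented for illustration, not something I verified in either dataset).

```r
library(dplyr)

# Hypothetical county names standing in for the map and bee tables
map_names <- tibble(County = c("ST JOHNS", "DE SOTO", "ALACHUA"))
bee_names <- tibble(County = c("ST JOHNS", "DESOTO", "ALACHUA"))

# Counties that appear in the bee data but have no match in the map data
unmatched <- anti_join(bee_names, map_names, by = "County")

print(unmatched)
```

With the real tables, the same check would compare the distinct County values in `bees2` against the upper-cased `subregion` values from the map data.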
Finally, graph it!
ggplot(m.usa, aes(x = long, y = lat, group = group, fill = Value)) +
  geom_polygon(color = "grey50", size = 0.2) +
  coord_fixed(1.3) +
  scale_fill_continuous(low = "white", high = "red", na.value = "grey80") +
  facet_wrap(~Year) +
  labs(title = "Number of bee colonies in Florida counties",
       subtitle = "2002-2012",
       x = "", y = "") +
  theme(axis.ticks = element_blank(),
        axis.text = element_blank())