Create maps in R in 10 (fairly) easy steps

Sure you can use this technique for election maps — but it’s also handy for sales figures, mobile data coverage and many other types of data.

Trust me: You will save yourself a lot of time if you run a few R commands to see whether the nhgeo@data$NAME vector of county names is the same as the nhdata$County vector of county names.

Do they have the same structure?

str(nhgeo@data$NAME)
Factor w/ 1921 levels "Abbeville","Acadia",..: 1470 684 416 1653 138 282 1131 1657 334 791
str(nhdata$County)
chr [1:11] "Belknap" "Carroll" "Cheshire" "Coos" "Grafton"

Whoops, problem number one: The geospatial file lists counties as R factors, while they're plain character text in the data. Change the factors to character strings with:

nhgeo@data$NAME <- as.character(nhgeo@data$NAME)
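
To see why the conversion matters, here is a small self-contained sketch using a made-up factor (the county names are sample values, not the real shapefile data). A factor stores integer codes plus a table of levels, so identical() will not treat it as equal to a character vector even when the labels match.

```r
# Hypothetical stand-in for nhgeo@data$NAME: a factor of county names
county_factor <- factor(c("Coos", "Belknap", "Carroll"))

# Under the hood, a factor is a set of integer codes indexing its sorted levels
codes <- as.integer(county_factor)

# A factor and a character vector are different types, even with equal labels
same_before <- identical(county_factor, c("Coos", "Belknap", "Carroll"))

# as.character() swaps each code for its label
county_chr <- as.character(county_factor)
same_after <- identical(county_chr, c("Coos", "Belknap", "Carroll"))

print(codes)
print(same_before)
print(same_after)
```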

Next, it is helpful to sort both data sets by county name and then compare.

nhgeo <- nhgeo[order(nhgeo@data$NAME),]
nhdata <- nhdata[order(nhdata$County),]

Are the two county columns identical now? They should be; let's check:

identical(nhgeo@data$NAME, nhdata$County)
[1] TRUE
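
If identical() comes back FALSE instead, setdiff() is a quick way to find the names that don't match. The sketch below uses two small made-up vectors (geo_names and data_names are hypothetical, with a deliberate mismatch) rather than the real files.

```r
# Hypothetical county names from a shapefile and from a results spreadsheet
geo_names  <- c("Coos", "Belknap", "Carroll")
data_names <- c("Belknap", "Carroll", "Coos County")  # deliberate mismatch

# Sort both, then compare
names_match <- identical(sort(geo_names), sort(data_names))

# Which names appear on one side only?
only_in_geo  <- setdiff(geo_names, data_names)
only_in_data <- setdiff(data_names, geo_names)

print(names_match)
print(only_in_geo)
print(only_in_data)
```

Names that show up in either setdiff() result are the ones to clean up before joining.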

Now we can join the two files. The sp package's merge function is a common choice for this type of task, but I like tmap's append_data() because its syntax is intuitive and it allows the two join columns to have different names.

nhmap <- append_data(nhgeo, nhdata, key.shp = "NAME", key.data="County")
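
For comparison, base R's merge() can also join on differently named columns through its by.x and by.y arguments. The sketch below works on two small made-up data frames (the column names other than NAME and County, and all the numbers, are invented for illustration). It is a plain data-frame join, not a spatial one: base merge() re-sorts rows by the key, so it should not be applied directly to the @data slot of a spatial object.

```r
# Hypothetical attribute table, standing in for the geospatial file's data
geo_attrs <- data.frame(
  NAME = c("Belknap", "Carroll", "Coos"),
  AreaCode = c(1, 3, 7),
  stringsAsFactors = FALSE
)

# Hypothetical results table with a differently named key column
results <- data.frame(
  County = c("Coos", "Belknap", "Carroll"),
  MarginVotes = c(120, 450, 310),
  stringsAsFactors = FALSE
)

# by.x names the key in the first table, by.y the key in the second
joined <- merge(geo_attrs, results, by.x = "NAME", by.y = "County")
print(joined)
```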

You can see the new data structure with:

str(nhmap@data)

Step 5: Create a static map

The hard part is done: finding data, getting it into the right format and merging it with geospatial data. Now, creating a simple static map of Sanders' margins by county in number of votes is as easy as:

qtm(nhmap, "SandersMarginVotes")

and mapping margins by percentage:

qtm(nhmap, "SandersMarginPctgPoints")


We can see that the areas that gave Sanders his highest percentage margins are not always the ones that gave him the largest advantage in number of votes.

For more control over the map's colors, borders and such, use the tm_shape() function, which uses a ggplot2-like syntax to set fill, border and other attributes:

tm_shape(nhmap) +
tm_fill("SandersMarginVotes", title="Sanders Margin, Total Votes", palette = "PRGn") +
tm_borders(alpha=.5) +
tm_text("County", size=0.8)

The first line above sets the geodata file to be mapped, while tm_fill() sets the data column to use for mapping color values. The "PRGn" palette argument is a ColorBrewer palette of purples and greens -- if you're not familiar with ColorBrewer, you can see the various palettes available at colorbrewer2.org. Don't like the ColorBrewer choices? You can use built-in R palettes or set your own color HEX values manually instead of using a named ColorBrewer option.
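
As a sketch of the set-your-own-colors route, base R's colorRampPalette() can interpolate between any HEX values you pick. The three anchor colors below are arbitrary choices for illustration, and passing the resulting vector to tmap through the same palette argument is an assumption to check against the tmap documentation.

```r
# Three arbitrary anchor colors: purple, near-white, green
anchors <- c("#762A83", "#F7F7F7", "#1B7837")

# colorRampPalette() returns a function that generates n interpolated colors
make_palette <- colorRampPalette(anchors)
my_colors <- make_palette(5)

print(my_colors)
```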

There are also a few built-in tmap themes, such as tm_style_classic:

tm_shape(nhmap) +
  tm_fill("SandersMarginVotes", title="Sanders Margin, Total Votes", palette = "PRGn") +
  tm_borders(alpha=.5) +
  tm_text("County", size=0.8) +
  tm_style_classic()

You can save static maps created by tmap by using the save_tmap() function:

nhstaticmap <- tm_shape(nhmap) +
  tm_fill("SandersMarginVotes", title="Sanders Margin, Total Votes", palette = "PRGn") +
  tm_borders(alpha=.5) +
  tm_text("County", size=0.8)
save_tmap(nhstaticmap, filename="nhdemprimary.jpg")

The filename extension can be .jpg, .svg, .pdf, .png or one of several others; tmap will then produce the appropriate file type, defaulting to the size of your current plotting window. There are also arguments for width, height, dpi and more; run ?save_tmap for more info.

If you'd like to learn more about available tmap options, package creator Martijn Tennekes posted a PDF presentation on creating maps with tmap as well as the "tmap in a nutshell" vignette.

Step 6: Create palette and pop-ups for interactive map

The next map we'll create will let users click to see underlying data as well as switch between maps, thanks to RStudio's Leaflet package, which provides an R front end to the open-source JavaScript Leaflet mapping library.

For a Leaflet map, there are two extra things we'll want to create in addition to the data we already have: A color palette and pop-up window contents.

For the palette, we specify the data range we're mapping and what kind of color palette we want -- both the particular colors and the type of color scale. There are four built-in types:

  • colorNumeric is for a continuous range of colors from low to high, so you might go from a very pale blue all the way to a deep dark blue, with many gradations in between.
  • colorBin maps a set of numerical data to a set of discrete bins, defined either by exact breaks or by a specific number of bins -- things like "low," "medium" and "high".
  • colorQuantile maps numerical data into groups where each group (quantile) has the same number of records -- often used for income levels, such as bottom 20%, next-lowest 20% and so on.
  • colorFactor is for non-numerical categories where no numerical value makes sense, such as countries in Europe that are part of the Eurozone and those that aren't.
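
The difference between binning and quantiles is easier to see with numbers. This base-R sketch uses cut() and quantile() on a made-up vector; it illustrates the two grouping ideas only and is not how Leaflet's colorBin() or colorQuantile() are implemented.

```r
# Made-up, deliberately skewed values
values <- c(1, 2, 3, 4, 50, 60, 70, 1000)

# Binning: fixed breaks, so group sizes depend on where the data falls
bins <- cut(values,
            breaks = c(0, 10, 100, 1000),
            labels = c("low", "medium", "high"))
bin_counts <- table(bins)

# Quantiles: breaks are chosen from the data so each group is the same size
quartile_breaks <- quantile(values, probs = seq(0, 1, 0.25))
quartiles <- cut(values, breaks = quartile_breaks,
                 include.lowest = TRUE, labels = FALSE)
quartile_counts <- table(quartiles)

print(bin_counts)
print(quartile_counts)
```

With skewed data like this, the fixed bins end up with uneven counts, while the quartile groups each hold two values.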

Create a Leaflet palette with this syntax:

mypalette <- colorFunction(palette = "colors I want", domain = mydataframe$dataColumnToMap)

where colorFunction is one of the four scale types above, such as colorNumeric() or colorFactor(), and "colors I want" is a vector of colors or the name of a palette.

Just to change things up a bit, I'll map where Hillary Clinton was strongest, the inverse of the Sanders maps. To map Clinton's vote percentage, we could use this palette:

clintonPalette <- colorNumeric(palette = "Blues", domain=nhmap$ClintonPct)

where "Blues" is a range of blues from ColorBrewer and domain is the data range of the color scale. This can be the same as the data we're actually plotting but doesn't have to be. colorNumeric means we want a continuous range of colors, not specific categories.
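
To make the role of domain concrete, here is a rough base-R imitation of a continuous palette. numeric_palette() is a hypothetical helper written for this illustration (it is not part of Leaflet), the two HEX colors are an assumed light-to-dark blue pair, and the percentages are made up.

```r
# Hypothetical helper: map numbers in `domain` onto a continuous color ramp
numeric_palette <- function(colors, domain) {
  ramp <- colorRamp(colors)          # function from [0, 1] to an RGB matrix
  lo <- min(domain)
  hi <- max(domain)
  function(x) {
    scaled <- (x - lo) / (hi - lo)   # rescale the data onto [0, 1]
    rgb(ramp(scaled), maxColorValue = 255)
  }
}

pct <- c(0.35, 0.42, 0.48)           # made-up vote shares
blues <- c("#F7FBFF", "#08306B")

# Domain taken from the data: the full color range is spread over 35%-48%
pal_data <- numeric_palette(blues, domain = pct)

# Fixed 0-100% domain: the same values land in a narrow slice of the ramp
pal_fixed <- numeric_palette(blues, domain = c(0, 1))

print(pal_data(pct))
print(pal_fixed(pct))
```

A data-based domain maximizes contrast within one map, while a fixed domain keeps colors comparable across several maps.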

We'll also want to add a pop-up window -- what good is an interactive map without being able to click or tap and see underlying data?

Aside: For the pop-up window text display, we'll want to turn the decimal numbers for votes such as 0.7865 into percentages like 78.7%. We could do it by writing a short formula, but the scales package has a percent() function to make this easier. Install (if you need to) and load the scales package:

install.packages("scales")
library("scales")

Content for a pop-up window is just a simple combination of HTML and R variables, such as:

nhpopup <- paste0("County: ", nhmap@data$County,
                  "<br />Sanders ", percent(nhmap@data$SandersPct), " - Clinton ", percent(nhmap@data$ClintonPct))

(If you're not familiar with paste0(), it's a concatenation function that joins literal text with the text stored in variables.)
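
For reference, the "short formula" route mentioned in the aside might look like the sketch below, which uses base R's sprintf() in place of the scales package. fmt_pct() is a hypothetical helper, the vote shares are invented, and the HTML tags are one possible layout, not a requirement.

```r
# Made-up sample data standing in for nhmap@data
county  <- c("Belknap", "Carroll")
sanders <- c(0.6012, 0.5634)
clinton <- c(0.3821, 0.4209)

# Hypothetical helper: turn a proportion into a one-decimal percentage string
fmt_pct <- function(x) sprintf("%.1f%%", 100 * x)

# paste0() is vectorized, so this builds one pop-up string per county
popup <- paste0("<b>County: </b>", county,
                "<br />Sanders ", fmt_pct(sanders),
                " - Clinton ", fmt_pct(clinton))

print(popup)
```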

Step 7: Generate an interactive map

Now, the map code:

leaflet(nhmap) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(stroke=FALSE, 
              smoothFactor = 0.2,
              fillOpacity = .8, 
              popup=nhpopup,
              color= ~clintonPalette(nhmap@data$ClintonPct)
)

Basic interactive map created in R and RStudio's Leaflet package. Click on an area to see the underlying data.

Let's go over the code. leaflet(nhmap) creates a leaflet map object and sets nhmap as the data source. addProviderTiles("CartoDB.Positron") sets the background map tiles to CartoDB's attractive Positron design. There's a list of free background tiles and what they look like on GitHub if you'd like to choose something else.

The addPolygons() function does the rest -- putting the county shapes on the map and coloring them accordingly. stroke=FALSE says no border around the counties, smoothFactor controls how much the county outlines are simplified, fillOpacity sets the opacity of the colors, popup sets the contents of the popup window and color sets the palette and what data should be mapped to the color. (The tilde before the palette call makes it a one-sided formula, which Leaflet evaluates against the map's data.)
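
For readers curious about the tilde, here is a minimal base-R sketch of how a one-sided formula holds an expression unevaluated until it is evaluated against a data frame. It shows the general mechanism only and is not Leaflet's internal code; the data frame and its values are made up.

```r
# Made-up data frame standing in for a map's attribute table
df <- data.frame(ClintonPct = c(0.25, 0.5))

# The tilde captures the expression without running it
f <- ~ ClintonPct * 100
formula_class <- class(f)

# For a one-sided formula, element 2 is the captured expression;
# evaluate it with the data frame's columns in scope
result <- eval(f[[2]], envir = df, enclos = environment(f))

print(formula_class)
print(result)
```

Without the tilde, R would try to evaluate ClintonPct * 100 immediately and fail, because no object called ClintonPct exists outside the data frame.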

The Leaflet package has a number of other features we haven't used yet, including adding legends and the ability to turn layers on and off. Both will be very useful when mapping a race with three or more candidates, such as the current Republican primary.
