1 Introduction

Let say that you receive a raw data.frame with coordinate points per row, and I want to create a Thematic map with it, how can I make it using {ggplot2}? Or that you use a package to get spatial data but in a object class like SpatVector or SpatialPolygonsDataFrame, how can I use them with {ggplot2}?

Figure 1. (A) Map from a raw data.frame with coordinate points. (B) Dot map with the same data converted to a sf object.

We need sf class objects to create ggplot maps! So, today we are going to learn how to convert foreign spatial objects to sf, either polygon or point data, to keep making ggplot2 maps!

2 Learning objectives

  1. Convert foreign objects to sf class using the st_as_sf() function from the {sf} package.

  2. Convert foreign polygon data in SpatVector class to sf.

  3. Convert foreign point data in data.frame class to sf.

3 Prerequisites

This lesson requires the following packages:

if(!require('pacman')) install.packages('pacman')

pacman::p_load(malariaAtlas,
               ggplot2,
               cholera,
               geodata,
               dplyr,
               here,
               sf)

pacman::p_load_gh("afrimapr/afrilearndata",
                  "wmgeolab/rgeoboundaries")

4 Mapping country borders with {geodata}

An advantage of the {geodata} package is that it downloads country borders data with a multiple column output, providing the names of all the levels above the one requested.

For example, here is an example with the first three administrative levels in Bolivia:

bolivia_level3 <- gadm(country = "Bolivia", level = 3, path = tempdir())
bolivia_level3 %>% 
  
  as_tibble() %>% 
  select(starts_with("NAME_")) #👈👈👈👈👈👈👈👈👈👈👈👈👈
## # A tibble: 344 × 3
##    NAME_1 NAME_2                 NAME_3                      
##    <chr>  <chr>                  <chr>                       
##  1 Beni   Cercado                San Javier                  
##  2 Beni   Cercado                Trinidad                    
##  3 Beni   General José Ballivián Puerto Menor de Rurrenabaque
##  4 Beni   General José Ballivián Reyes                       
##  5 Beni   General José Ballivián San Borja                   
##  6 Beni   General José Ballivián Santa Rosa                  
##  7 Beni   Iténez                 Baures                      
##  8 Beni   Iténez                 Huacaraje                   
##  9 Beni   Iténez                 Magdalena                   
## 10 Beni   Mamoré                 Puerto Siles                
## # ℹ 334 more rows

This is useful for situations where you want to filter() only a specific subset of sub-divisions (e.g. level 3) within a region (level 1) or department (level 2).

bolivia_level3 %>% 
  
  as_tibble() %>% 
  select(starts_with("NAME_")) %>% 
  
  filter(NAME_1 == "Santa Cruz") #👈👈👈👈👈👈👈👈👈👈👈👈👈👈
## # A tibble: 56 × 3
##    NAME_1     NAME_2         NAME_3                 
##    <chr>      <chr>          <chr>                  
##  1 Santa Cruz Andrés Ibáñez  Cotoca                 
##  2 Santa Cruz Andrés Ibáñez  El Torno               
##  3 Santa Cruz Andrés Ibáñez  La Guardia             
##  4 Santa Cruz Andrés Ibáñez  Porongo                
##  5 Santa Cruz Andrés Ibáñez  Santa Cruz de la Sierra
##  6 Santa Cruz Angel Sandoval San Matías             
##  7 Santa Cruz Chiquitos      Pailón                 
##  8 Santa Cruz Chiquitos      Roboré                 
##  9 Santa Cruz Chiquitos      San José               
## 10 Santa Cruz Cordillera     Boyuibe                
## # ℹ 46 more rows

However, I can not create ggplot maps with it because it is a SpatVector class object:

gadm(country = "Bolivia", level = 3, path = tempdir())
👉 class       : SpatVector 👈
   geometry    : polygons 
   dimensions  : 344, 16  (geometries, attributes)
   extent      : -69.64525, -57.45443, -22.90657, -9.670923
   coord. ref. : +proj=longlat +datum=WGS84 +no_defs 

Convert foreign Polygon geometries to sf

The best way to solve this issue is to use the st_as_sf() function from the {sf} package

# spatvector
bolivia_level3 %>% class()
## [1] "SpatVector"
## attr(,"package")
## [1] "terra"
# sf
bolivia_level3 %>% st_as_sf() %>% class() #👈👈👈👈👈👈👈👈
## [1] "sf"         "data.frame"

This solution is useful for object classes like SpatVector and SpatialPolygonsDataFrame.

Convert the country borders of Germany from SpatVector to a sf class object using the st_as_sf() function:

gadm(country = "Germany", level = 2, path = tempdir()) %>% 
  ...........

Now with sf objects, we can create a ggplot like this:

ggplot() +
  geom_sf(data = geoboundaries(country = "Bolivia")) +
  geom_sf(data = bolivia_level3 %>% 
            
            st_as_sf() %>%                 #👈👈👈👈
            
            filter(NAME_1 == "Santa Cruz"),
          mapping = aes(fill = NAME_1))

Vector data

Why is the class object called SpatVector?

  • “SpatVector” stands for “Spatial Vector”, and “Vector” stands for “Vector data”.

  • Vector data is the formal name for the Geometry types like Point, Lines, and Polygons.

  • It requires a Coordinate Reference System (CRS) to relate the spatial elements of the data with the surface of Earth.

  • It is also the most common format of Spatial data used in GIS. Which is commonly stored in Shapefiles.

Do not get confused by the vector class object, which is an R class just like data.frame and matrix.

Above is an example of converting foreign Polygon data to sf class. Now let’s see briefly see how to convert foreign Point data.

5 Disease information with {malariaAtlas}

The {malariaAtlas} package download, visualize and manipulate global malaria data hosted by the Malaria Atlas Project.

The malariaAtlas package enables users to download data like:

Parasite Rate surveys

The getPR() function downloads all the publicly available PR points for a country (or countries) and returns it as a data.frame.

zimbabwe_malaria_pr <- getPR(country = "Zimbabwe", species = "BOTH")

The species argument is a string specifying the Plasmodium species and can be Pf (Plasmodium falciparum), Pv (Plasmodium vivax) or BOTH.

The output is a data.frame object that contains longitude and latitude as variables:

class(zimbabwe_malaria_pr)
## [1] "pr.points"  "data.frame"
zimbabwe_malaria_pr %>% 
  as_tibble()
## # A tibble: 567 × 28
##    dhs_id site_id site_name    latitude longitude rural_urban country country_id
##    <lgl>    <int> <chr>           <dbl>     <dbl> <chr>       <chr>   <chr>     
##  1 NA       16142 Chilonga        -18.4      27.2 UNKNOWN     Zimbab… ZWE       
##  2 NA       17018 Ume - Kahobo    -17.8      28.8 UNKNOWN     Zimbab… ZWE       
##  3 NA       18443 Bandanyenje     -18.9      32.0 UNKNOWN     Zimbab… ZWE       
##  4 NA          68 Piringani       -17.0      30.1 UNKNOWN     Zimbab… ZWE       
##  5 NA       19390 Zidulini        -18.7      28.7 UNKNOWN     Zimbab… ZWE       
##  6 NA       17564 Lutumba         -22.1      30.2 UNKNOWN     Zimbab… ZWE       
##  7 NA       14287 Seula School    -21.6      28.4 UNKNOWN     Zimbab… ZWE       
##  8 NA        2943 Ndlovu          -18.9      27.8 UNKNOWN     Zimbab… ZWE       
##  9 NA       11029 Bambaninga      -20.4      31.8 UNKNOWN     Zimbab… ZWE       
## 10 NA       20058 Pumula Miss…    -19.6      27.1 UNKNOWN     Zimbab… ZWE       
## # ℹ 557 more rows
## # ℹ 20 more variables: continent_id <chr>, month_start <int>, year_start <int>,
## #   month_end <int>, year_end <int>, lower_age <int>, upper_age <int>,
## #   examined <int>, positive <int>, pr <dbl>, species <chr>, method <chr>,
## #   rdt_type <chr>, pcr_type <lgl>, malaria_metrics_available <chr>,
## #   location_available <chr>, permissions_info <chr>, citation1 <chr>,
## #   citation2 <chr>, citation3 <chr>

Download the publicly available Parasite Ratio (PR) points for Plasmodium falciparum in Sierra Leone using the {malariaAtlas} package.

q3 <- ________(________ = "Sierra Leone",species = "Pf")
q3

autoplot() can be used to quickly and easily visualize the downloaded PR survey points.

autoplot(zimbabwe_malaria_pr)

However, if you would like to make a custom plot for your particular research needs, with complete control of the output aesthetics, you would probably prefer {ggplot2} for the work!

Convert foreign Point geometries to sf

To convert a data.frame to an sf object, we can use the st_as_sf() from the {sf} package.

In case of Point patterns data (like for a Dot map):

  • the coords argument need to specify the names of the variables holding the Coordinates, and
  • the crs argument specify the Coordinate Reference System (CRS) to be assigned (here WGS84, which is the CRS code #4326).
zimbabwe_malaria_pr_sf <- 
  # data frame
  zimbabwe_malaria_pr %>% 
  # convert to sf
  sf::st_as_sf(coords = c("longitude","latitude"),  #👈👈👈👈👈👈
               crs = 4326)

Inside the coords argument,you need to pay attention to two things:

  • You need to use "quotation_marks" for each column name,
  • First write the longitude, and then the latitude.

Mission accomplished! This is the output of the conversion:

zimbabwe_malaria_pr_sf %>% 
  dplyr::select(site_name,pr)
## Simple feature collection with 567 features and 2 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 25.8063 ymin: -22.2333 xmax: 32.9708 ymax: -16.0662
## Geodetic CRS:  WGS 84
## First 10 features:
##         site_name     pr                 geometry
## 9        Chilonga 0.0000    POINT (27.1667 -18.4)
## 11   Ume - Kahobo 0.0750 POINT (28.8209 -17.8091)
## 17    Bandanyenje 0.0000   POINT (31.95 -18.8667)
## 19      Piringani 0.0000 POINT (30.1333 -16.9667)
## 21       Zidulini 0.0080 POINT (28.6667 -18.7333)
## 24        Lutumba 0.0000  POINT (30.1757 -22.122)
## 26   Seula School 0.0000 POINT (28.4033 -21.6176)
## 30         Ndlovu 0.0800   POINT (27.75 -18.8667)
## 34     Bambaninga 0.0080 POINT (31.7562 -20.4146)
## 38 Pumula Mission 0.0078   POINT (27.116 -19.617)

Now we have converted our data.frame into an sf object and can directly plot this object with {ggplot2} code:

ggplot(data = zimbabwe_malaria_pr_sf) +
  geom_sf(aes(size = pr))

Or add one previous layer with the administrative boundaries of Zimbabwe:

zimbabwe_boundaries_adm1 <- geoboundaries(country = "Zimbabwe",
                                         adm_lvl = 1)

In the same ggplot:

ggplot() +
  # boundaries data
  geom_sf(data = zimbabwe_boundaries_adm1,
          fill = "white") +
  # disease data
  geom_sf(data = zimbabwe_malaria_pr_sf,
          aes(size = pr),
          color = "red",
          alpha=0.5)

sf::st_as_sf() does not allow missing values in coordinates!

So you can use {dplyr} verbs like filter() to drop problematic rows from variable x using this notation:filter(!is.na(x)).

With the publicly available Parasite Ratio (PR) points of Plasmodium falciparum in Sierra Leone stored in q3:

Convert its data.frame output to a sf object using the st_as_sf() function, with the code 4326 to set a WGS84 CRS. (More on all these codes coming soon, don’t be intimidated !)

q4 <- 
  q3 %>% 
  filter(!is.na(longitude)) %>% 
  sf::________(coords = c("________","________"),
               crs = 4326)
q4

6 Wrap up

In this lesson, we have learned how to convert foreign Polygon and Point data to a sf class object from SpatVector and data.frame objects, respectively. We also learned about Vector data and how Point data needs a Coordinate Reference Systems (CRS) to obtain an sf object.

Figure 2. Concept map

Contributors

The following team members contributed to this lesson:

References

Some material in this lesson was adapted from the following sources:

This work is licensed under the Creative Commons Attribution Share Alike license. Creative Commons License