So far, all spatial data visualized contain locations measured in angular units (longitude/latitude). But, what if data came in a different coordinate system measured in linear units? Like the planar coordinates of this summary figure!
In this lesson we are going to learn how to set up our data with a UTM Coordinate System, measured in meters instead of longitude/latitude, and how to transform spatial objects from UTM to longitude/latitude, and vice versa!
Configure data with UTM coordinate system using
the st_as_sf()
function.
Change the CRS projection of a sf
object using the st_transform()
function.
This lesson requires the following packages:
if(!require('pacman')) install.packages('pacman')
::p_load(malariaAtlas,
pacman
colorspace,
ggplot2,
cholera,
spData,
dplyr,
here,
rio,
sf)
::p_load_gh("afrimapr/afrilearndata",
pacman"wmgeolab/rgeoboundaries")
This lesson requires familiarity with {dplyr}
: if you
need to brush up, have a look at our introductory course on data
wrangling.
Let’s say that you receive a data frame with coordinates, but
without a CRS projection. If you want to make a
ggplot
map with it, you know that you can use
st_as_sf()
from the {sf}
package.
As we learned in a previous lesson, for point data we need to specify the coordinates and the CRS:
%>%
fatalities st_as_sf(coords = c("x","y"),
crs = 4326) %>%
ggplot() +
geom_sf(alpha = 0.3)
However, what if you receive coordinates different to longitude and latitude?
To exemplify this scenario, we are going to use malaria prevalence in The Gambia. We use data of malaria prevalence in children obtained at 65 villages in The Gambia.
First, we read the gambia_summarized.rds
file using its
file path with the {readr}
package.
<-
gambia_point_summary read_rds(here("data/gambia_summarized.rds"))
gambia_point_summary
## # A tibble: 65 × 5
## x y total positive prev
## <dbl> <dbl> <int> <dbl> <dbl>
## 1 349631. 1458055 33 17 0.515
## 2 358543. 1460112 63 19 0.302
## 3 360308. 1460026 17 7 0.412
## 4 363796. 1496919 24 8 0.333
## 5 366400. 1460248 26 10 0.385
## 6 366688. 1463002 18 7 0.389
## 7 368420. 1460569 38 24 0.632
## 8 370400. 1460301 56 7 0.125
## 9 370628. 1499589 57 6 0.105
## 10 374403. 1501392 26 8 0.308
## # ℹ 55 more rows
It is a data frame with 65 observations and 5 variables:
x
: x coordinate of the village
(UTM),y
: y coordinate of the village
(UTM),total
: total number of tests performed,positive
: number of positive tests, andprev
: malaria prevalence in each village.UTM stants for Universal Transverse Mercator, another coordinate system.
Now we can plot the malaria prevalence. However, in order to use
ggplot2
and geom_sf()
, we need to transform
the data.frame
to an sf
object.
To use the st_as_sf()
function from the
{sf}
package, we need to specify a CRS projection. But in
this case, the units of the x
and
y
variables are not in Geographic
coordinates (longitude/latitude). Instead, these data coordinates are in
UTM format (Easting/Northing), also called
Projected coordinates.
CRS coordinate systems:
Geographic (or unprojected) reference systems use longitude and latitude for referencing a location on the Earth’s three-dimensional ellipsoid surface.
Projected coordinate reference systems use easting and northing Cartesian coordinates for referencing a location on a two-dimensional representation of the Earth.
In R, it looks like this:
All Projected CRSs are based on a Geographic CRS and rely on map projections to convert the three-dimensional surface of the Earth into Easting and Northing (x and y) values for a two-dimensional Earth representation.
Which of the following options of Coordinate Reference System (CRS) types:
"geographic_crs"
"projected_crs"
…corresponds to each of these datasets, given the magnitude of the
values in their x
and y
columns:
the parana
dataset?
<- import("https://github.com/cran/geoR/raw/master/data/parana.rda")
parana as_tibble(parana$coords)
the fatalities
dataset?
::p_load(cholera)
pacmanas_tibble(fatalities)
st_as_sf()
First, we need to set UTM coordinates. For this, we
specify the projection of The Gambia, that is, UTM zone
28 ("+proj=utm +zone=28"
) in the
st_as_sf()
function of the {sf}
package.
<- gambia_point_summary %>%
gambia_projected # first, specify the projection of gambia
# UTM zone 28
st_as_sf(coords = c("x", "y"),
crs = "+proj=utm +zone=28")
gambia_projected
## Simple feature collection with 65 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 349631.3 ymin: 1458055 xmax: 622086.1 ymax: 1510610
## Projected CRS: +proj=utm +zone=28
## # A tibble: 65 × 4
## total positive prev geometry
## * <int> <dbl> <dbl> <POINT [m]>
## 1 33 17 0.515 (349631.3 1458055)
## 2 63 19 0.302 (358543.1 1460112)
## 3 17 7 0.412 (360308.1 1460026)
## 4 24 8 0.333 (363795.7 1496919)
## 5 26 10 0.385 (366400.5 1460248)
## 6 18 7 0.389 (366687.5 1463002)
## 7 38 24 0.632 (368420.5 1460569)
## 8 56 7 0.125 (370399.5 1460301)
## 9 57 6 0.105 (370627.7 1499589)
## 10 26 8 0.308 (374402.6 1501392)
## # ℹ 55 more rows
Confirm the presence of the:
CRS: +proj=utm +zone=28
) inside the
header of the new sf
object, andgeometry
column in
meters (<POINT [m]>
).In the UTM system, a position on the Earth is given by the:
parana_data
contains the average rainfall over different
years for the period May-June (dry-season). It was collected at 143
recording stations throughout Parana State, Brasil.
Set UTM
coordinate system to the
parana_data
. Parana State is located in the UTM zone number
22
.
<- as_tibble(parana$coords) %>%
parana_data mutate(Rainfall = parana$data)
<- parana_data %>%
q3 st_as_sf(coords = c("east", "north"),
crs = "+proj=______ +zone=______")
q3
st_transform()
We can transform the UTM projected coordinates to
geographic coordinates (longitude/latitude
with datum WGS84) using st_transform()
where we set CRS to "+proj=longlat +datum=WGS84"
.
<- gambia_projected %>%
gambia_geographic # second, transform
# projected coordinates to
# geographic coordinates
st_transform(crs = "+proj=longlat +datum=WGS84")
gambia_geographic
## Simple feature collection with 65 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -16.38755 ymin: 13.18541 xmax: -13.87273 ymax: 13.66438
## Geodetic CRS: +proj=longlat +datum=WGS84
## # A tibble: 65 × 4
## total positive prev geometry
## * <int> <dbl> <dbl> <POINT [°]>
## 1 33 17 0.515 (-16.38755 13.18541)
## 2 63 19 0.302 (-16.30543 13.20444)
## 3 17 7 0.412 (-16.28914 13.20374)
## 4 24 8 0.333 (-16.25869 13.53742)
## 5 26 10 0.385 (-16.23294 13.20603)
## 6 18 7 0.389 (-16.23041 13.23094)
## 7 38 24 0.632 (-16.21431 13.20902)
## 8 56 7 0.125 (-16.19604 13.20668)
## 9 57 6 0.105 (-16.19569 13.56187)
## 10 26 8 0.308 (-16.16088 13.57834)
## # ℹ 55 more rows
Confirm the update of the:
CRS: +proj=longlat +datum=WGS84
inside the
header, andgeometry
column to
degrees (<POINT [°]>
).A PROJ string includes the following information:
+proj=
: the projection of the data
(e.g. utm
, longlat
, or laea
)+zone=
: the zone of the data, specific to the UTM
projection (e.g. 28
)+datum=
: the datum use (e.g. WGS84
)+units=
: the units for the coordinates of the data
(e.g. m
)With the UTM coordinate system data stored in q3
:
Transform its Projected CRS to a Geographic CRS using the
longitude/latitude
(longlat
) projection with
datum WGS84
.
<- q3 %>%
q4 st_transform(crs = "+proj=______ +datum=______")
q4
To reproduce John
Snow’s map shown in previous lessons, we needed to set and
transform different CRS in the same coding pipeline, using
st_set_crs()
and st_transform()
.
::CholeraDeaths %>%
mdsr# british national grid
st_set_crs(27700) %>%
# to wgs84
st_transform(4326)
Projected CRS are a choice made by a public mapping agency. So, with local data sources, work with the CRS in which the data was provided, to ensure compatibility, even if the official CRS is not the most accurate.
Now that you have set up the right CRS projection to your data, you can overlap these points with other Vector data objects:
<- geoboundaries(country = "Gambia", adm_lvl = 2) gambia_adm_2
ggplot() +
geom_sf(data = gambia_adm_2) +
geom_sf(data = gambia_geographic, mapping = aes(color = prev)) +
::scale_color_continuous_sequential(palette="Reds 3") colorspace
Which CRS to use?
“There exist no all-purpose projections, all involve distortion when far from the center of the specified frame” (Bivand, Pebesma, and Gómez-Rubio 2013).
In this lesson, we learned how to transform the CRS between different coordinate systems (projected and geographic).
In the next lesson, we are going to use all our previous learning to built one single thematic map by layers, and enrich them with text and labels referring to specific places or regions, and important map elements like scale bars and a north arrow!
The following team members contributed to this lesson:
Some material in this lesson was adapted from the following sources:
Moreno, M., Basille, M. Drawing beautiful maps programmatically with R, sf and ggplot2 — Part 1: Basics. (2018). Retrieved 10 May 2022, from https://r-spatial.org/r/2018/10/25/ggplot2-sf.html
Data carpentry. Introduction to Geospatial Concepts: Coordinate Reference Systems. (2021). Retrieved 15 May 2022, from https://datacarpentry.org/organization-geospatial/03-crs/index.html
Moraga, Paula. Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny. Chapter 9: Spatial modeling of geostatistical data. Malaria in The Gambia. (2019). Retrieved 10 May 2022, from https://www.paulamoraga.com/book-geospatial/sec-geostatisticaldataexamplespatial.html
Carrasco-Escobar, G., Barja, A., Quispe, J. [Visualization and Analysis of Spatial Data in Public Health]. (2021). Retrieved 15 May 2022, from https://www.reconlearn.org/post/spatial-analysis-1-spanish.html
This work is licensed under the Creative Commons Attribution Share Alike license.