Creating a distance cartogram using OSM data
Matthieu Viry
2025-02-03
Source:vignettes/with-osm-data.Rmd
with-osm-data.Rmd
Introduction
We are looking to build a unipolar distance cartogram (also called a centered time cartogram) using only data from OpenStreetMap (OSM): it is going to use both geographic layers describing the territory and travel time data from OSM.
This type of cartogram visualizes travelling times from a reference starting location to other destinations in a selected area. In the resulting map, the geographic distance are replaced by time distance and the underlying map layers are deformed accordingly.
This highlights the use of several packages developed at RIATE, including:
-
mapsf
to create maps withsf
objects, -
osrm
for calculating distance-time matrices from OSM data, -
maposm
to easily retrieve OSM data (this is a wrapper aroundosmdata
), -
distanamo
for creating distance cartograms.
Installing and importing packages
mapsf
and osrm
are available both on CRAN
and on riatelab’s
r-universe while distanamo
and maposm
are
only available from riatelab’s
r-universe.
As a result, we propose to install all the packages from r-universe:
repos <- c("https://riatelab.r-universe.dev", "https://cloud.r-project.org")
install.packages("mapsf", repos = repos)
install.packages("osrm", repos = repos)
install.packages("distanamo", repos = repos)
install.packages("maposm", repos = repos)
Once installed, we can import the necessary packages:
Fetching OSM data
We are now retrieving OSM data for the study area of our choice: here we are interested in the accessibility (by bike, but we will see this later) from the market hall of Chambéry.
Let’s get the coordinates of the address of Chambéry’s market hall and then fetch the data for a radius of 3000 meters around it:
bb <- osmdata::getbb("10, Place Georges Clemenceau, 73100 Chambéry, France")
lon <- mean(bb[1, ])
lat <- mean(bb[2, ])
res <- get_city(c(lon, lat), r = 3000)
#> Getting buildings: 16.926 sec elapsed
#> Getting green areas: 1.723 sec elapsed
#> Getting water bodies: 2.761 sec elapsed
#> Getting roads: 1.247 sec elapsed
#> Getting streets: 2.111 sec elapsed
#> Getting railways: 0.35 sec elapsed
We can see that the resulting object is a list of sf layers:
print(names(res))
#> [1] "circle" "green" "water" "railway" "road" "street" "building"
#> [8] "urban"
Depending on the radius defined in the maposm::get_city
function the result list contains either individual buildings (if r
<= 5000) or urban patches (if r > 5000). As a result we need to
filter the resulting list to remove any possible NULL
entry:
Defining a reference point
We need to define a reference point from which the travel times will be computed. Since we used the address of Chambéry’s market hall as a center point to fetch the data, we can use the center of the circle that includes all the data:
start <- st_centroid(res$circle)
Sampling points
We then need a set of destinations to compute the travel times to. To do so, we sample roughly 200 points on the road network:
Making a map of the result
mf_map(res$circle, col = "#f2efe9", border = NA, add = FALSE)
mf_map(res$green, col = "#c8facc", border = "#c8facc", lwd = .5, add = TRUE)
mf_map(res$water, col = "#aad3df", border = "#aad3df", lwd = .5, add = TRUE)
mf_map(res$railway, col = "grey50", lty = 2, lwd = .2, add = TRUE)
mf_map(res$road, col = "white", border = "white", lwd = .5, add = TRUE)
mf_map(res$street, col = "white", border = "white", lwd = .5, add = TRUE)
mf_map(res$building, col = "#d9d0c9", border = "#c6bab1", lwd = .5, add = TRUE)
mf_map(res$circle, col = NA, border = "#c6bab1", lwd = 4, add = TRUE)
mf_map(start, pch = 23, col = "red", cex = 2, add = T)
mf_map(points, add = T, col = "blue")
Computing travel times between the reference point and the sampled points
To do so, we can use the osrmTable
function from the
osrm
package:
mat <- osrmTable(
dst = start,
src = points,
osrm.profile = "bike"
)
points$durations <- mat$durations
Alternative to OSRM for computing travel times
Alternatively, if having a Valhalla instance under the hand and if the data preparation step included elevation data, you can also use it to compute travel times and benefit from taking the elevation into account:
library(valhallr)
s <- as.data.frame(st_coordinates(st_transform(start, "EPSG:4326")))
names(s) <- c("lon", "lat")
d <- as.data.frame(st_coordinates(st_transform(points, "EPSG:4326")))
names(d) <- c("lon", "lat")
durs <- sources_to_targets(
s,
d,
costing = "bicycle",
hostname = Sys.getenv("VH_HOSTNAME"),
port = Sys.getenv("VH_PORT")
)
points$durations <- durs$time / 60
Note that we use environment variables to store the hostname and port
of the Valhalla instance because this is our private instance. You need
to replace Sys.getenv('VH_HOSTNAME')
and
Sys.getenv('VH_PORT')
by the actual values you want to
use.
Computing the positions of image points
We can now compute the positions of the image points using the
dc_move_from_reference_point
function. It takes as input
the reference point, the other points, and the column name of the travel
times in the other_points
object.
This function will move the points by computing the reference speed (the average distance / time) and then moving the points further or closer to the reference point depending on the travel time to reach them.
pos_result <- dc_move_from_reference_point(
reference_point = start,
other_points = points,
duration_col_name = "durations"
)
We can call the summary
function to obtain a summary of
the resulting object:
summary(pos_result)
#> Summary of the unipolar displacement result:
#> Min displacement: 0.4697282 [m]
#> Mean displacement: 485.8294 [m]
#> Max displacement: 1978.529 [m]
We can also plot the resulting object to visualize the positions of the image points and how they have been moved:
plot(pos_result)
Creating the concentric circles
Since the cartogram will replace the geographic distance by time
distance we can create concentric circles to represent the time distance
from the reference point. Internally, this is possible because the
reference speed has been stored in the resulting object of the
dc_move_from_reference_point
function.
The steps
parameter is a list of the time distances we
want to represent in the cartogram, in the same unit as the travel times
(here, minutes).
circles <- dc_concentric_circles(
pos_result,
steps = list(5, 10, 15, 20)
)
circles
#> Simple feature collection with 4 features and 1 field
#> Geometry type: LINESTRING
#> Dimension: XY
#> Bounding box: xmin: 652762.2 ymin: 5726235 xmax: 662957.4 ymax: 5736430
#> Projected CRS: WGS 84 / Pseudo-Mercator
#> step geometry
#> 1 5 LINESTRING (659134.2 573133...
#> 2 10 LINESTRING (660408.6 573133...
#> 3 15 LINESTRING (661683 5731333,...
#> 4 20 LINESTRING (662957.4 573133...
Creating the distance cartogram
We first need to compute the bounding box of all the layers we want to deform using the interpolation grid to ensure that the interpolation grid covers the whole area of interest.
bbox <- dc_combine_bbox(res)
We can then create the interpolation grid:
The precision
parameters controls the size of the grid
cells (higher is more precise, for example 0.5 generally gives a coarse
result, 2 a satisfactory result and 4 a particularly fine result). A
precision of 2 is usually a good default value and this is what we
choose to use here.
igrid <- dc_create(
source_points = pos_result$source_points,
image_points = pos_result$image_points,
precision = 2,
bbox = bbox
)
Various information can be obtained from the interpolation grid:
summary(igrid)
#> Summary of the interpolation grid:
#> Number of cells: 900
#> Precision: 211.693 (α = 2)
#> Deformation strength: 1.2912
#> Mean absolute error: 64.47526
#> RMSE (interp - image): 81.72795
#> RMSE x (interp - image): 51.80675
#> RMSE y (interp - image): 63.21011
#> RMSE (interp - source): 598.5697
#> RMSE x (interp - source): 469.7113
#> RMSE y (interp - source): 371.0216
#> R squared: 0.9985654
We can also plot the interpolation grid to visualize the source grid, the interpolated grid, the distance between the image points and the interpolated points, and the strength of the deformation.
plot(igrid)
Deforming the background layers
Since we have multiple layers to deform, we can use the
dc_interpolate_parallel
function to deform all the layers
at once:
deformed <- dc_interpolate_parallel(igrid, res)
Mapping the result
mf_map(deformed$circle, col = "#f2efe9", border = NA, add = FALSE)
mf_map(deformed$green, col = "#c8facc", border = "#c8facc", lwd = .5, add = TRUE)
mf_map(deformed$water, col = "#aad3df", border = "#aad3df", lwd = .5, add = TRUE)
mf_map(deformed$railway, col = "grey50", lty = 2, lwd = .2, add = TRUE)
mf_map(deformed$road, col = "white", border = "white", lwd = .5, add = TRUE)
mf_map(deformed$street, col = "white", border = "white", lwd = .5, add = TRUE)
mf_map(deformed$building, col = "#d9d0c9", border = "#c6bab1", lwd = .5, add = TRUE)
mf_map(deformed$circle, col = NA, border = "#c6bab1", lwd = 4, add = TRUE)
mf_map(igrid$interpolated_grid, col = NA, border = "#940000", lwd = .1, add = TRUE)
mf_map(start, pch = 16, col = "red", cex = 2, add = TRUE)
mf_map(sf::st_intersection(circles, sf::st_union(sf::st_buffer(igrid$interpolated_grid, 0))), col = "red", lty = 2, lwd = 2, add = TRUE)
for (i in 1:4) {
mapsf:::shadowtext(
sf::st_bbox(circles$geometry[i])$xmax,
start$geometry[[1]][2],
paste(i * 5, "min"),
col = "purple",
bg = "white",
r = 0.15
)
}
If needed, we can add the various points (or a sample of them) and their corresponding travel times to see that the deformation has been correctly applied:
mvd_pts <- igrid$interpolated_points
mvd_pts$durations <- c(0, points$durations)
mf_map(deformed$circle, col = "#f2efe9", border = NA, add = FALSE)
mf_map(deformed$green, col = "#c8facc", border = "#c8facc", lwd = .5, add = TRUE)
mf_map(deformed$water, col = "#aad3df", border = "#aad3df", lwd = .5, add = TRUE)
mf_map(deformed$railway, col = "grey50", lty = 2, lwd = .2, add = TRUE)
mf_map(deformed$road, col = "white", border = "white", lwd = .5, add = TRUE)
mf_map(deformed$street, col = "white", border = "white", lwd = .5, add = TRUE)
mf_map(deformed$building, col = "#d9d0c9", border = "#c6bab1", lwd = .5, add = TRUE)
mf_map(deformed$circle, col = NA, border = "#c6bab1", lwd = 4, add = TRUE)
mf_map(igrid$interpolated_grid, col = NA, border = "#940000", lwd = .1, add = TRUE)
mf_map(start, pch = 16, col = "red", cex = 2, add = TRUE)
mf_map(sf::st_intersection(circles, sf::st_union(sf::st_buffer(igrid$interpolated_grid, 0))), col = "red", lty = 2, lwd = 2, add = TRUE)
for (i in 1:4) {
mapsf:::shadowtext(
sf::st_bbox(circles$geometry[i])$xmax,
start$geometry[[1]][2],
paste(i * 5, "min"),
col = "purple",
bg = "white",
r = 0.15
)
}
mf_map(mvd_pts, add = TRUE, col = "blue")
mf_label(mvd_pts, var = "durations", col = "black", halo = TRUE, overlap = TRUE)
Conclusion
Distance cartograms are both a way of analyzing space and a way of representing it. In the case of unipolar cartograms, as proposed here, reading is generally fairly straightforward, making it easy to highlight portions of the territory that are difficult (or at least more difficult than the average) to reach for example.