📊 Using R to Access CKAN Data

R can be used to interact with CKAN’s API via the ckanr package.
This allows you to query datasets, preview resources, and download files directly into your R environment or onto your hard drive.

⚠️ Note: Only certain file types are supported for direct download (CSV, XLS, XLSX, XML, HTML, JSON, SHP, GeoJSON, TXT).
Compressed .zip packages cannot be downloaded with R. See the Downloading Data help page for details.


📦 Setup

Load the libraries and point ckanr at the CanWIN server:

library(tidyverse)
library(ckanr)

# Set CKAN server to CanWIN
ckanr_setup("https://canwin-datahub.ad.umanitoba.ca/data")

# Verify setup
get_default_url()   # Prints the URL being queried
servers()           # Lists available CKAN servers
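
Before running any queries, it can help to confirm that the server responds. The following is a minimal sketch, assuming the ping() function in your installed version of ckanr accepts a url argument; the wrapper name check_ckan() is hypothetical and not part of ckanr.

```r
# Hypothetical helper: TRUE if the CKAN server answers, FALSE otherwise.
# Any error (no network, package not installed) is treated as "not reachable".
check_ckan <- function(url = "https://canwin-datahub.ad.umanitoba.ca/data") {
  tryCatch(
    isTRUE(ckanr::ping(url = url)),
    error = function(e) FALSE
  )
}
```

Calling check_ckan() after setup should return TRUE when the CanWIN server is reachable from your machine.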

🔍 Viewing Data Categories

In ckanr, CKAN concepts map as follows:

  • Themes → groups
  • Datasets → packages
  • Keywords → tags

group_list(as = "table")    # Lists themes
package_list(as = "table")  # Lists datasets
tag_list(as = "table")      # Lists keywords
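
Listing every dataset can be slow to scan, so a keyword search is often more practical. The sketch below assumes package_search() from ckanr with its q and rows arguments, and assumes the result contains count and results elements when as = "table" is used. The helper name search_datasets() is hypothetical.

```r
# Hypothetical helper: free-text search over datasets (packages).
# Relies on the default server configured earlier with ckanr_setup().
search_datasets <- function(query, n = 5) {
  hits <- ckanr::package_search(q = query, rows = n, as = "table")
  message("Total matches: ", hits$count)
  hits$results   # one row per matching dataset (assumed structure)
}
```

For example, search_datasets("water quality") would print the number of matches and return up to five of them; the query string here is only illustrative.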

📥 Importing a Dataset into RStudio

Most CanWIN resources are CSV files. You’ll need the resource ID (found in the metadata section of a dataset page).

Example using dplyr:

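# Resource ID from CKAN site
data_id <- "c07482a5-c8e2-403c-9eaa-94153fc3659c"

# Open a connection to the CanWIN datastore (defines the `ckan` object used below)
ckan <- src_ckan("https://canwin-datahub.ad.umanitoba.ca/data")

# Load dataset into R environment
dataset_original <- dplyr::tbl(src = ckan$con, from = data_id) %>%
  as_tibble()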

# Filter and select specific parameters
tbl(src = ckan$con, from = data_id) %>%
  select(project_name, station_id) %>%    # Select columns
  filter(station_id == "GL_LWH_M") %>%    # Filter rows
  as_tibble()
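
If you only want a quick look at a resource before loading all of it, a row-limited request may be enough. This is a sketch under assumptions: it uses ds_search() from ckanr with resource_id and limit, assumes the resource is available through the CKAN datastore, and assumes the result has a records element when as = "table" is used. The helper name preview_resource() is hypothetical.

```r
# Hypothetical helper: fetch only the first n rows of a datastore resource.
preview_resource <- function(resource_id, n = 5) {
  out <- ckanr::ds_search(resource_id = resource_id, limit = n, as = "table")
  out$records   # assumed to hold the returned rows
}
```

With the resource ID used above, preview_resource("c07482a5-c8e2-403c-9eaa-94153fc3659c") would request five rows rather than the full table.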

💾 Downloading a Dataset to Disk

You can also save a dataset directly to your hard drive.

Note: Set your working directory first to the location where you want the file saved. This is usually done at the beginning of your script.

Example using the 2016 Lake Waterhen Ecotriplet dataset:

# Store resource information
res <- resource_show(
  id = "c07482a5-c8e2-403c-9eaa-94153fc3659c",
  as = "table"
)

# Preview first rows of data
head(ckan_fetch(res$url))

# Set working directory
wd <- "D:/R/Ckan"   # Replace with your own path
setwd(wd)
getwd()             # Prints the current working directory

# Download dataset to disk
ckan_fetch(res$url, store = "disk", path = "file_name.csv")

This saves the dataset as file_name.csv in your specified directory.
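
Because only some file types can be downloaded directly, it may be worth checking a resource's format before fetching it. The sketch below assumes the object returned by resource_show() carries a format field, as CKAN resources normally do. The helper names is_supported_format() and fetch_if_supported() are hypothetical.

```r
# File types this page lists as supported for direct download
supported_formats <- c("CSV", "XLS", "XLSX", "XML", "HTML", "JSON",
                       "SHP", "GEOJSON", "TXT")

# Hypothetical helper: TRUE for each format string found in the supported list
is_supported_format <- function(fmt) {
  toupper(trimws(fmt)) %in% supported_formats
}

# Hypothetical helper: download a resource to disk only if its format is supported
fetch_if_supported <- function(resource_id, path) {
  res <- ckanr::resource_show(id = resource_id, as = "table")
  # isTRUE() also covers a missing or empty format field
  if (!isTRUE(is_supported_format(res$format))) {
    stop("Unsupported format for direct download: ", res$format)
  }
  ckanr::ckan_fetch(res$url, store = "disk", path = path)
}
```

The format check is case-insensitive, so "csv" and "GeoJSON" pass while "ZIP" does not.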

🔑 Authentication

Some CKAN endpoints require an API key.

You can include your key when setting up the connection:

ckanr_setup(
  url = "https://canwin-datahub.ad.umanitoba.ca/data",
  key = "YOUR-API-KEY"
)
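
To keep the key out of your scripts, one option is to read it from an environment variable. This is only a sketch: the variable name CANWIN_API_KEY and the helper get_ckan_key() are assumptions made for illustration, not something ckanr or CanWIN defines.

```r
# Hypothetical helper: read the API key from an environment variable.
# CANWIN_API_KEY is an assumed name; set it yourself, e.g. in your .Renviron file.
get_ckan_key <- function(var = "CANWIN_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set")
  }
  key
}
```

You could then pass key = get_ckan_key() to ckanr_setup() instead of typing the key into the script.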