FIXME: This page is outdated. Please help complete the wiki updates.

Using the API to pull/push data from/to CKAN

Who is this guide for

What this guide teaches

Things to know beforehand

The CKAN API has several different versions

The CKAN APIs are versioned. If you make a request to an API URL without a version number, CKAN will choose the latest version of the API:

http://data.opendevelopmentmekong.net/api/action/package_list

Alternatively, you can specify the desired API version number in the URL that you request:

http://data.opendevelopmentmekong.net/api/3/action/package_list

Version 3 is currently the only version of the Action API.

We recommend that you specify the API version number in your requests. The latest version of the API may change when a site is upgraded to a new version of CKAN, and may differ between sites running different versions of CKAN, so the result of a request that doesn’t specify the version number cannot be relied on. Pinning the version ensures that your API client works across sites running different versions of CKAN, and keeps working on the same sites when they upgrade.
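As a minimal sketch of this recommendation in Python 3, the version number can be pinned in one place when building URLs. The action_url helper is a hypothetical name used for illustration and is not part of CKAN:

```python
def action_url(site, action, version=3):
    """Return a CKAN action URL with the API version pinned.

    Hypothetical helper: `site` is the base URL of the CKAN site and
    `action` is the name of the API function, e.g. "package_list".
    """
    return "{}/api/{}/action/{}".format(site.rstrip("/"), version, action)


print(action_url("http://data.opendevelopmentmekong.net", "package_list"))
```

Keeping the version in a single default argument means a later change only has to be made once.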

Authentication and API keys

Some API functions require authorization. The API uses the same authorization functions and configuration as the web interface, so if a user is authorized to do something in the web interface they’ll be authorized to do it via the API as well.

When calling an API function that requires authorization, you must authenticate yourself by providing your API key with your HTTP request. To find your API key, log in to the CKAN site using its web interface and visit your user profile page. If you are not registered yet, please follow this GUIDE.

To provide your API key in an HTTP request, include it in either an Authorization or X-CKAN-API-Key header. (The name of the HTTP header can be configured with the apikey_header_name option in your CKAN configuration file.)

For example, to ask whether or not you’re currently following the user acorbi on data.opendevelopmentmekong.net using HTTPie, run this command:

http http://data.opendevelopmentmekong.net/api/3/action/am_following_user id=acorbi Authorization:XXX

(Replacing XXX with your API key.)

Or, to get the list of activities from your user dashboard on data.opendevelopmentmekong.net, run this Python code:

import json
import urllib2

request = urllib2.Request('http://data.opendevelopmentmekong.net/api/3/action/dashboard_activity_list')
request.add_header('Authorization', 'XXX')
response_dict = json.loads(urllib2.urlopen(request, '{}').read())
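The urllib2 module only exists in Python 2. For readers on Python 3, a rough equivalent might look like the sketch below. It assumes the default Authorization header name and sends the parameters as a JSON body. The authenticated_request helper is a hypothetical name, and the request is built but not sent:

```python
import json
import urllib.request


def authenticated_request(url, api_key, params=None):
    """Build (but do not send) a POST request that carries the API key."""
    body = json.dumps(params or {}).encode("utf-8")
    request = urllib.request.Request(url, data=body)
    request.add_header("Authorization", api_key)
    request.add_header("Content-Type", "application/json")
    return request


request = authenticated_request(
    "http://data.opendevelopmentmekong.net/api/3/action/dashboard_activity_list",
    "XXX",  # replace XXX with your API key
)
# To actually send it:
# response_dict = json.loads(urllib.request.urlopen(request).read())
```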

More details in the official documentation

This guide has been built with excerpts from the official CKAN documentation. For more detailed information, consult the official CKAN documentation at http://docs.ckan.org/en/latest/api/index.html

Special ODM fields

The different dataset types stored in CKAN (datasets and library records) contain a series of special fields, some of them mandatory.

These fields tag the datasets with specific metadata that the staff use to fulfill different use cases. Take them into account when retrieving dataset information or creating new datasets through the API:

Open Development Network Metadata template

Defined on https://github.com/OpenDevelopmentMekong/ckanext-odm_theme/blob/master/ckanext/odm_theme/lib/odm_theme_helper.py#L30

# (field id, field label, mandatory)
metadata_fields = [
('odm_format','Format',True),
('odm_language','Language',True),
('odm_date_created','Date Created',True),
('odm_date_uploaded','Date Uploaded',True),
('odm_date_modified','Date Modified',True),
('odm_temporal_range','Temporal Range',True),
('odm_spatial_range','Spatial Range',False),
('odm_accuracy','Accuracy',False),
('odm_logical_consistency','Logical Consistency',False),
('odm_completeness','Completeness',False),
('odm_process','Process(es)',True),
('odm_source','Source(s)',True),
('odm_contact','Contact',True),
('odm_contact_email','Contact Email',True),
('odm_access_and_use_constraints','Access and Use Constraints',False),
('odm_url','Download URL',False),
('odm_metadata_reference_information','Metadata Reference Information',False),
('odm_attributes','Attributes',False)
]
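Before creating a dataset through the API, it can help to check that the mandatory fields are filled in. The Python 3 sketch below copies the ids marked True in the template above. It assumes the fields appear as top-level keys of the dataset dictionary, which this page does not confirm. The helper name and the draft dataset are hypothetical:

```python
# Ids of the fields marked as mandatory (True) in metadata_fields above.
MANDATORY_ODM_FIELDS = [
    "odm_format", "odm_language", "odm_date_created", "odm_date_uploaded",
    "odm_date_modified", "odm_temporal_range", "odm_process", "odm_source",
    "odm_contact", "odm_contact_email",
]


def missing_mandatory_fields(dataset_dict):
    """Return the mandatory field ids that are absent or empty.

    Assumption: ODM fields are top-level keys of the dataset dictionary.
    """
    return [f for f in MANDATORY_ODM_FIELDS if not dataset_dict.get(f)]


# Hypothetical, incomplete dataset used only for illustration.
draft = {"name": "my_dataset_name", "odm_format": "PDF", "odm_language": "en"}
print(missing_mandatory_fields(draft))
```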

Library publications metadata template

Defined on https://github.com/OpenDevelopmentMekong/ckanext-odm_theme/blob/master/ckanext/odm_theme/lib/odm_theme_helper.py#L53

# (field id, field label, mandatory)
library_fields = [
('marc21_020','ISBN',False),
('marc21_022','ISSN',False),
('marc21_084','Classification',False),
('marc21_100','Author',False),
('marc21_110','Corporate Author',False),
('marc21_245','Title',False),
('marc21_246','Varying Form of Title',False),
('marc21_250','Edition',False),
('marc21_260a','Publication Place',False),
('marc21_260b','Publication Name',False),
('marc21_260c','Publication Date',False),
('marc21_300','Pagination',False),
('marc21_500','General Note',False),
('marc21_520','Summary',False),
('marc21_650','Subject',False),
('marc21_651','Subject (Geographic Name)',False),
('marc21_653','Keyword',False),
('marc21_700','Co-Author',False),
('marc21_710','Co-Author (Corporate)',False),
('marc21_850','Institution',False),
('marc21_852','Location',False)
]

ODC fields (fields included in imports of old ODC content, e.g. laws)

Defined on https://github.com/OpenDevelopmentMekong/ckanext-odm_theme/blob/master/ckanext/odm_theme/lib/odm_theme_helper.py#L20

# (field id, field label, mandatory)
odc_fields = [
('file_name_kh','File (Khmer)',False),
('file_name_en','File (English)',False),
('adopted_date','Adopted Date',False),
('number_en','Number (English)',False),
('number_kh','Number (Khmer)',False),
('published_date','Publication date',False),
('published_under','Published under',False)
]

Making an API request

To call the CKAN API, post a JSON dictionary in an HTTP POST request to one of CKAN’s API URLs. The parameters for the API function should be given in the JSON dictionary. CKAN will also return its response in a JSON dictionary.

One way to post a JSON dictionary to a URL is using the command-line HTTP client HTTPie, which sends a POST request with a JSON body whenever data fields such as id=maps-group are given. For example, to get a list of the names of the groups on data.opendevelopmentmekong.net, install HTTPie and then call the group_list API function by running this command in a terminal:

http http://data.opendevelopmentmekong.net/api/3/action/group_list id=maps-group

The response from CKAN will look like this:

{
    "help": "...",
    "result": [
        "data-explorer",
        "department-of-ricky",
        "geo-examples",
        "geothermal-data",
        "reykjavik",
        "skeenawild-conservation-trust"
    ],
    "success": true
}
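Every response uses this same envelope, so client code can check "success" once and hand back "result". The Python 3 sketch below does that. The CKANAPIError class and the unwrap function are hypothetical names. It assumes that failed calls carry their details under an "error" key, as described in the official CKAN documentation:

```python
import json


class CKANAPIError(Exception):
    """Raised when CKAN reports "success": false (hypothetical helper)."""


def unwrap(response_text):
    """Return the "result" of a CKAN response, or raise if the call failed."""
    response_dict = json.loads(response_text)
    if not response_dict.get("success"):
        raise CKANAPIError(response_dict.get("error"))
    return response_dict["result"]


# Shortened version of the response shown above.
sample = '{"help": "...", "result": ["data-explorer", "geo-examples"], "success": true}'
print(unwrap(sample))
```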

A similar HTTP request can be made using Python’s standard urllib2 module (Python 2). This example queries demo.ckan.org:

#!/usr/bin/env python
import urllib2
import urllib
import json
import pprint

# Use the json module to dump a dictionary to a string for posting.
data_string = urllib.quote(json.dumps({'id': 'data-explorer'}))

# Make the HTTP request.
response = urllib2.urlopen('http://demo.ckan.org/api/3/action/group_list',
        data_string)
assert response.code == 200

# Use the json module to load CKAN's response into a dictionary.
response_dict = json.loads(response.read())

# Check the contents of the response.
assert response_dict['success'] is True
result = response_dict['result']
pprint.pprint(result)

Searching for datasets using CKAN's API

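To list the names of all the datasets on the site, use the package_list API method, as in the script below. To search datasets by a particular query, CKAN provides the package_search method, which takes the query in its q parameter.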

#!/usr/bin/env python
import urllib2
import urllib
import json
import pprint

# We'll use the package_list function to list the names of all datasets.
request = urllib2.Request(
    'http://data.opendevelopmentmekong.net/api/action/package_list')

# Make the HTTP request.
response = urllib2.urlopen(request)
assert response.code == 200

# Use the json module to load CKAN's response into a dictionary.
response_dict = json.loads(response.read())
assert response_dict['success'] is True
pprint.pprint(response_dict['result'])
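For an actual query, package_search can be called with a GET request whose parameters go in the URL. The Python 3 sketch below only builds such a URL. The search_url helper and the example query are hypothetical; q and rows are standard package_search parameters:

```python
from urllib.parse import urlencode


def search_url(site, query, rows=10):
    """Build a package_search URL for a free-text query (hypothetical helper)."""
    params = urlencode({"q": query, "rows": rows})
    return "{}/api/3/action/package_search?{}".format(site.rstrip("/"), params)


url = search_url("http://data.opendevelopmentmekong.net", "forestry law", rows=5)
print(url)
```

In the response, the matching datasets are listed under result["results"] and the total number of matches under result["count"].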

Retrieving a particular dataset using CKAN's API

To retrieve the information contained in a particular dataset, use the package_show API method. Note that you will need the id or name of the dataset you want to pull the information from.

#!/usr/bin/env python
import urllib2
import urllib
import json
import pprint

# Put the id (or name) of the dataset we want to retrieve into a dict.
dataset_dict = {
    'id': 'cambodia-law-on-forestry-2002'
}

# Use the json module to dump the dictionary to a string for posting.
data_string = urllib.quote(json.dumps(dataset_dict))

# We'll use the package_show function to retrieve the dataset.
request = urllib2.Request(
    'http://data.opendevelopmentmekong.net/api/action/package_show')

# Make the HTTP request.
response = urllib2.urlopen(request, data_string)
assert response.code == 200

# Use the json module to load CKAN's response into a dictionary.
response_dict = json.loads(response.read())
assert response_dict['success'] is True
pprint.pprint(response_dict['result'])
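Once the dataset has been retrieved, its files are described in the "resources" list of the result. The Python 3 sketch below collects their URLs. The resource_urls helper and the hand-written package dictionary are hypothetical stand-ins for response_dict['result'], not real ODM data:

```python
def resource_urls(package):
    """Return the URLs of a dataset's resources, skipping empty ones."""
    return [r["url"] for r in package.get("resources", []) if r.get("url")]


# Hand-written stand-in for response_dict['result']; not real ODM data.
package = {
    "name": "example-dataset",
    "resources": [
        {"name": "English version", "url": "http://example.org/law_en.pdf"},
        {"name": "No file yet", "url": ""},
    ],
}
print(resource_urls(package))
```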

Creating datasets with CKAN's API

You can add datasets using CKAN’s web interface, but when importing many datasets it’s usually more efficient to automate the process in some way. In this example, we’ll show you how to use the CKAN API to write a Python script to import datasets into CKAN.

#!/usr/bin/env python
import urllib2
import urllib
import json
import pprint

# Put the details of the dataset we're going to create into a dict.
dataset_dict = {
    'name': 'my_dataset_name',
    'notes': 'A long description of my dataset',
}

# Use the json module to dump the dictionary to a string for posting.
data_string = urllib.quote(json.dumps(dataset_dict))

# We'll use the package_create function to create a new dataset.
request = urllib2.Request(
    'http://www.my_ckan_site.com/api/action/package_create')

# Creating a dataset requires an authorization header.
# Replace *** with your API key, from your user account on the CKAN site
# that you're creating the dataset on.
request.add_header('Authorization', '***')

# Make the HTTP request.
response = urllib2.urlopen(request, data_string)
assert response.code == 200

# Use the json module to load CKAN's response into a dictionary.
response_dict = json.loads(response.read())
assert response_dict['success'] is True

# package_create returns the created package as its result.
created_package = response_dict['result']
pprint.pprint(created_package)
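To import many datasets, the same call can be repeated in a loop. The Python 3 sketch below builds one package_create request per dataset without sending anything. The build_create_requests helper and the two example datasets are hypothetical. On the ODM site the mandatory fields described earlier would presumably also need to be supplied:

```python
import json
import urllib.request


def build_create_requests(site, api_key, datasets):
    """Build one package_create POST request per dataset dictionary."""
    url = site.rstrip("/") + "/api/3/action/package_create"
    built = []
    for dataset in datasets:
        built.append(urllib.request.Request(
            url,
            data=json.dumps(dataset).encode("utf-8"),
            headers={"Authorization": api_key,
                     "Content-Type": "application/json"},
        ))
    return built


# Hypothetical datasets used only for illustration.
datasets = [
    {"name": "my_dataset_one", "notes": "First description"},
    {"name": "my_dataset_two", "notes": "Second description"},
]
# Replace *** with your API key.
pending = build_create_requests("http://www.my_ckan_site.com", "***", datasets)
# To send them:
# for request in pending:
#     urllib.request.urlopen(request)
```

Building the requests separately from sending them makes it easy to inspect or log what will be created before touching the site.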