FIXME **This page is outdated. Please help complete the wiki updates.**

====== Using the API to pull/push data from/to CKAN ======

===== Who is this guide for =====

  * Developers who want to access data through CKAN's REST API
  * Partners who want to integrate ODM data into their platforms

===== What this guide teaches =====

  * How to register in the system and obtain an API key for actions that require authentication
  * How to call the API to search for datasets
  * How to call the API to retrieve data from a particular dataset
  * How to call the API to create datasets programmatically

===== Things to know beforehand =====

==== The CKAN API has several different versions ====

The CKAN APIs are versioned. If you make a request to an API URL without a version number, CKAN will choose the latest version of the API:

  http://data.opendevelopmentmekong.net/api/action/package_list

Alternatively, you can specify the desired API version number in the URL that you request:

  http://data.opendevelopmentmekong.net/api/3/action/package_list

Version 3 is currently the only version of the Action API.

We recommend that you specify the API version number in your requests, because this ensures that your API client will work across different sites running different versions of CKAN (and will keep working on the same sites when those sites upgrade to new versions of CKAN). Because the latest version of the API may change when a site is upgraded to a new version of CKAN, or may differ between sites running different versions of CKAN, the result of an API request that doesn’t specify the API version number cannot be relied on.

==== Authentication and API keys ====

Some API functions require authorization. The API uses the same authorization functions and configuration as the web interface, so if a user is authorized to do something in the web interface they’ll be authorized to do it via the API as well.
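When a call is refused for lack of authorization, a client needs to notice it rather than treat the response as data. The following is a minimal Python 3 sketch, assuming the error shape shown in the official CKAN documentation (a ''success'' flag set to false and an ''error'' dictionary with ''%%__type%%'' and ''message'' keys). The function name ''authorization_error'' and the sample response body are hypothetical and only serve as an illustration.

```python
import json


def authorization_error(response_text):
    """Return the error message if the response body reports an
    authorization failure, otherwise None.

    Assumes the body is a CKAN-style JSON dictionary; the exact keys are
    taken from the official documentation and may differ on a given site.
    """
    response_dict = json.loads(response_text)
    if response_dict.get('success'):
        return None
    error = response_dict.get('error', {})
    if error.get('__type') == 'Authorization Error':
        return error.get('message', '')
    return None


# Illustrative response body (not captured from a real site).
sample = ('{"help": "...", "success": false, '
          '"error": {"message": "Access denied", '
          '"__type": "Authorization Error"}}')

print(authorization_error(sample))
```

Checking the ''success'' flag first keeps the helper usable for any action: a successful response simply yields None.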
When calling an API function that requires authorization, you must authenticate yourself by providing your API key with your HTTP request. To find your API key, log in to the CKAN site using its web interface and visit your user profile page. If you are not registered yet, please follow [[http://wiki.opendevelopmentmekong.net/content_generation#registering_a_new_user|this guide]].

To provide your API key in an HTTP request, include it in either an Authorization or X-CKAN-API-Key header. (The name of the HTTP header can be configured with the apikey_header_name option in your CKAN configuration file.)

For example, to ask whether or not you’re currently following the user acorbi on data.opendevelopmentmekong.net using HTTPie, run this command:

  http http://data.opendevelopmentmekong.net/api/3/action/am_following_user id=acorbi Authorization:XXX

(Replace XXX with your API key.)

Or, to get the list of activities from your user dashboard on data.opendevelopmentmekong.net, run this Python code:

  import json
  import urllib2
  # Build the request and attach the API key (replace XXX with your key).
  request = urllib2.Request('http://data.opendevelopmentmekong.net/api/3/action/dashboard_activity_list')
  request.add_header('Authorization', 'XXX')
  # Post an empty JSON dictionary and decode CKAN's JSON response.
  response_dict = json.loads(urllib2.urlopen(request, '{}').read())

==== More details in the official documentation ====

This guide has been built with excerpts from the official CKAN documentation. For more detailed information, consult the official CKAN documentation at http://docs.ckan.org/en/latest/api/index.html

==== Special ODM fields ====

The different dataset types stored in CKAN (datasets and library records) contain a series of special fields, some of them mandatory.
These fields allow tagging the datasets with specific metadata used by the staff to fulfill different use cases, and should be considered when retrieving dataset information or creating new datasets using the API:

[[metadata|Open Development Network Metadata template]]

Defined in https://github.com/OpenDevelopmentMekong/ckanext-odm_theme/blob/master/ckanext/odm_theme/lib/odm_theme_helper.py#L30

  # (field id, field label, mandatory)
  metadata_fields = [
    ('odm_format','Format',True),
    ('odm_language','Language',True),
    ('odm_date_created','Date Created',True),
    ('odm_date_uploaded','Date Uploaded',True),
    ('odm_date_modified','Date Modified',True),
    ('odm_temporal_range','Temporal Range',True),
    ('odm_spatial_range','Spatial Range',False),
    ('odm_accuracy','Accuracy',False),
    ('odm_logical_consistency','Logical Consistency',False),
    ('odm_completeness','Completeness',False),
    ('odm_process','Process(es)',True),
    ('odm_source','Source(s)',True),
    ('odm_contact','Contact',True),
    ('odm_contact_email','Contact Email',True),
    ('odm_access_and_use_constraints','Access and Use Constraints',False),
    ('odm_url','Download URL',False),
    ('odm_metadata_reference_information','Metadata Reference Information',False),
    ('odm_attributes','Attributes',False)
  ]

[[moving_the_library_from_ngl_to_ckan|Library publications metadata template]]

Defined in https://github.com/OpenDevelopmentMekong/ckanext-odm_theme/blob/master/ckanext/odm_theme/lib/odm_theme_helper.py#L53

  # (field id, field label, mandatory)
  library_fields = [
    ('marc21_020','ISBN',False),
    ('marc21_022','ISSN',False),
    ('marc21_084','Classification',False),
    ('marc21_100','Author',False),
    ('marc21_110','Corporate Author',False),
    ('marc21_245','Title',False),
    ('marc21_246','Varying Form of Title',False),
    ('marc21_250','Edition',False),
    ('marc21_260a','Publication Place',False),
    ('marc21_260b','Publication Name',False),
    ('marc21_260c','Publication Date',False),
    ('marc21_300','Pagination',False),
    ('marc21_500','General Note',False),
    ('marc21_520','Summary',False),
    ('marc21_650','Subject',False),
    ('marc21_651','Subject (Geographic Name)',False),
    ('marc21_653','Keyword',False),
    ('marc21_700','Co-Author',False),
    ('marc21_710','Co-Author (Corporate)',False),
    ('marc21_850','Institution',False),
    ('marc21_852','Location',False)
  ]

ODC fields (fields included in imports of old ODC content, e.g. laws)

Defined in https://github.com/OpenDevelopmentMekong/ckanext-odm_theme/blob/master/ckanext/odm_theme/lib/odm_theme_helper.py#L20

  # (field id, field label, mandatory)
  odc_fields = [
    ('file_name_kh','File (Khmer)',False),
    ('file_name_en','File (English)',False),
    ('adopted_date','Adopted Date',False),
    ('number_en','Number (English)',False),
    ('number_kh','Number (Khmer)',False),
    ('published_date','Publication date',False),
    ('published_under','Published under',False)
  ]

===== Making an API request =====

To call the CKAN API, post a JSON dictionary in an HTTP POST request to one of CKAN’s API URLs. The parameters for the API function should be given in the JSON dictionary. CKAN will also return its response in a JSON dictionary.

One way to post a JSON dictionary to a URL is using the command-line HTTP client HTTPie. For example, to get a list of the names of the groups on data.opendevelopmentmekong.net, install HTTPie and then call the group_list API function by running this command in a terminal:

  http http://data.opendevelopmentmekong.net/api/3/action/group_list id=maps-group

The response from CKAN will look similar to this:

  {
    "help": "...",
    "result": [
      "data-explorer",
      "department-of-ricky",
      "geo-examples",
      "geothermal-data",
      "reykjavik",
      "skeenawild-conservation-trust"
    ],
    "success": true
  }

The same kind of request can be made against demo.ckan.org using Python’s standard urllib2 module, with this Python code:

  #!/usr/bin/env python
  import urllib2
  import urllib
  import json
  import pprint
  # Use the json module to dump a dictionary to a string for posting.
  data_string = urllib.quote(json.dumps({'id': 'data-explorer'}))
  # Make the HTTP request.
  response = urllib2.urlopen('http://demo.ckan.org/api/3/action/group_list', data_string)
  assert response.code == 200
  # Use the json module to load CKAN's response into a dictionary.
  response_dict = json.loads(response.read())
  # Check the contents of the response.
  assert response_dict['success'] is True
  result = response_dict['result']
  pprint.pprint(result)

===== Searching for datasets using CKAN's API =====

To list the names of all datasets on the site, the [[http://docs.ckan.org/en/latest/api/index.html#ckan.logic.action.get.package_list|package_list]] API method can be used. To search for datasets matching a particular query, CKAN also provides the package_search API method, which accepts the query in its q parameter.

  #!/usr/bin/env python
  import urllib2
  import json
  import pprint
  # We'll use the package_list function to list the names of the datasets.
  request = urllib2.Request(
      'http://data.opendevelopmentmekong.net/api/3/action/package_list')
  # Make the HTTP request.
  response = urllib2.urlopen(request)
  assert response.code == 200
  # Use the json module to load CKAN's response into a dictionary.
  response_dict = json.loads(response.read())
  assert response_dict['success'] is True
  pprint.pprint(response_dict['result'])

===== Retrieving a particular dataset using CKAN's API =====

To retrieve the information contained in a particular dataset, the [[http://docs.ckan.org/en/latest/api/index.html#ckan.logic.action.get.package_show|package_show]] API method can be used. Note that you will need the id or name of the dataset you want to pull the information from.

  #!/usr/bin/env python
  import urllib2
  import urllib
  import json
  import pprint
  # Put the id (or name) of the dataset we want to retrieve into a dict.
  dataset_dict = {
      'id': 'cambodia-law-on-forestry-2002'
  }
  # Use the json module to dump the dictionary to a string for posting.
  data_string = urllib.quote(json.dumps(dataset_dict))
  # We'll use the package_show function to retrieve the dataset.
  request = urllib2.Request(
      'http://data.opendevelopmentmekong.net/api/3/action/package_show')
  # Make the HTTP request.
  response = urllib2.urlopen(request, data_string)
  assert response.code == 200
  # Use the json module to load CKAN's response into a dictionary.
  response_dict = json.loads(response.read())
  assert response_dict['success'] is True
  pprint.pprint(response_dict['result'])

===== Creating datasets with CKAN's API =====

You can add datasets using CKAN’s web interface, but when importing many datasets it’s usually more efficient to automate the process. This example shows how to use the CKAN API from a Python script to import datasets into CKAN.

  #!/usr/bin/env python
  import urllib2
  import urllib
  import json
  import pprint
  # Put the details of the dataset we're going to create into a dict.
  dataset_dict = {
      'name': 'my_dataset_name',
      'notes': 'A long description of my dataset',
  }
  # Use the json module to dump the dictionary to a string for posting.
  data_string = urllib.quote(json.dumps(dataset_dict))
  # We'll use the package_create function to create a new dataset.
  request = urllib2.Request(
      'http://www.my_ckan_site.com/api/3/action/package_create')
  # Creating a dataset requires an authorization header.
  # Replace *** with your API key, from your user account on the CKAN site
  # that you're creating the dataset on.
  request.add_header('Authorization', '***')
  # Make the HTTP request.
  response = urllib2.urlopen(request, data_string)
  assert response.code == 200
  # Use the json module to load CKAN's response into a dictionary.
  response_dict = json.loads(response.read())
  assert response_dict['success'] is True
  # package_create returns the created package as its result.
  created_package = response_dict['result']
  pprint.pprint(created_package)
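Since several of the special ODM fields are marked as mandatory, it can help to check a dataset dictionary before sending it to package_create. The following is a minimal sketch in Python 3 (using urllib.request rather than the urllib2 module of the examples above). It rests on two assumptions that are not confirmed by this guide: that the ODM fields are passed as top-level keys of the dataset dictionary, and that the site accepts a JSON body sent with a Content-Type of application/json. The helper names ''missing_mandatory_fields'' and ''build_create_request'' are hypothetical. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request

# Field ids flagged as mandatory (True) in the metadata_fields list of the
# "Special ODM fields" section.
MANDATORY_ODM_FIELDS = [
    'odm_format', 'odm_language', 'odm_date_created', 'odm_date_uploaded',
    'odm_date_modified', 'odm_temporal_range', 'odm_process', 'odm_source',
    'odm_contact', 'odm_contact_email',
]


def missing_mandatory_fields(dataset_dict):
    """Return the mandatory ODM field ids that are absent or empty."""
    return [f for f in MANDATORY_ODM_FIELDS if not dataset_dict.get(f)]


def build_create_request(site_url, api_key, dataset_dict):
    """Build (but do not send) a version-pinned package_create request.

    Assumption: ODM fields live as top-level keys of dataset_dict.
    Raises ValueError if a mandatory ODM field is missing.
    """
    missing = missing_mandatory_fields(dataset_dict)
    if missing:
        raise ValueError('Missing mandatory ODM fields: ' + ', '.join(missing))
    # Pin the API version in the URL, as recommended earlier in this guide.
    url = site_url.rstrip('/') + '/api/3/action/package_create'
    body = json.dumps(dataset_dict).encode('utf-8')
    request = urllib.request.Request(url, data=body)
    request.add_header('Authorization', api_key)
    request.add_header('Content-Type', 'application/json')
    return request
```

To send the request, pass the returned object to urllib.request.urlopen and decode the JSON response as in the examples above. Failing early on missing fields avoids a round trip to the server for datasets that would be incomplete anyway.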