For the migration planning, please have a look at this link: https://docs.google.com/document/d/1ibDJNgf_HNvSOgAMCxRJcWqpmA1vZ0I716W7Gx2RefE/edit
This is a list of items on the server. We can go through it to make sure that all the components are accounted for and that we have a migration plan for each of them.
Component | Location | Migration Plan? | Target Component |
---|---|---|---|
http://www.opendevelopmentcambodia.net/company-profiles/hydropower-sub-stations/ | /home/devnet/public_html/references | Wordpress front-end | |
http://geoserver.opendevelopmentcambodia.net:8181/geoserver/web/ | /home/devnet/public_html/geoserver_data | yes | CKAN |
http://library.opendevelopmentcambodia.net:8080/newgenlibctxt/ | /usr/NGL3/apache-tomcat-6.0.32/webapps/newgenlibctxt/CatalogueRecords | yes | CKAN |
Store all files for downloading | /home/devnet/public_html/download | Wordpress front-end | |
http://www.opendevelopmentcambodia.net/maps/downloads/ | /home/devnet/public_html/download/maps | Wordpress front-end | |
http://www.opendevelopmentcambodia.net/laws-regulations/ | /home/devnet/public_html/download/law | Wordpress front-end | |
Wordpress | /home/devnet/public_html/wp-content | Wordpress front-end | |
Scripts for automated migration of contents are available in the odm-internal repository. After cloning the repo from within the CKAN instance, you will find a series of utility scripts in the scripts folder.
cd odm-internal/odm-migration/CKAN/import-scripts/scripts
The scripts include a test suite. To run the tests, execute the following from the odm-internal/odm-migration/CKAN/import-scripts/ folder:
. ~/.virtualenvs/ckan/bin/activate
nosetests --ckan --with-pylons=test.ini tests
Follow the ckan documentation for creating a Sysadmin user: http://docs.ckan.org/en/latest/maintaining/getting-started.html#create-admin-user
. ~/.virtualenvs/ckan/bin/activate
paster --plugin=ckan sysadmin add <USERNAME> -c /etc/ckan/default/production.ini
Once the user is created, go to http://CKAN_URL/user/login and enter the credentials you have just specified. After logging in, the profile page of the new sysadmin user appears. Take note of the API Key shown at the bottom of the page; it is needed in the next step.
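As a quick check that the API Key works, a request against CKAN's action API can be built as sketched below. This is only an illustration, written for Python 3 with the standard library; the helper names `build_action_request` and `call_action` are hypothetical and are not part of the import scripts. CKAN expects the key in the `Authorization` header.

```python
import json
import urllib.request


def build_action_request(ckan_url, action, api_key, payload=None):
    """Build (but do not send) a POST request to CKAN's action API."""
    url = "{}/api/3/action/{}".format(ckan_url.rstrip("/"), action)
    data = json.dumps(payload or {}).encode("utf-8")
    headers = {"Authorization": api_key, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def call_action(request):
    """Send the request and return the 'result' field of CKAN's answer."""
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    if not body.get("success"):
        raise RuntimeError("CKAN reported a failure: {}".format(body.get("error")))
    return body["result"]


# Placeholder values for illustration only; nothing is sent here.
example_request = build_action_request(
    "http://localhost:5000/", "user_show", "my-api-key", {"id": "admin"}
)
```

Passing `example_request` to `call_action` against a running instance should return the user's details if the key is valid, and raise an HTTP error otherwise.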
Prior to running the scripts, the odm_theme_config.sample.py file needs to be renamed to odm_theme_config.py:
cd config
mv odm_theme_config.sample.py odm_theme_config.py
In addition, the variables contained in this file need to be initialized:
{'ontology':'*','organization':'cambodia-organization','groups':[{'name':'maps-group'},{'name':'cambodia-group'}]}
{'ontology':'*','organization':'odm-library','groups':[{'name':'library-group'},{'name':'cambodia-group'}]}
ODC_MAP=[{'ontology':'ODC/laws','organization':'cambodia-organization','groups':[{'name':'laws-group'},{'name':'cambodia-group'}],'field_prefixes':[{'field':'file_name_kh','prefix':'http://cambodia.opendevelopmentmekong.net/wp-content/blogs.dir/2/download/law/'},{'field':'file_name_en','prefix':'http://cambodia.opendevelopmentmekong.net/wp-content/blogs.dir/2/download/law/'}]}]
{'group':'laws-group','limit':500,'field_filter':{'odm_contact':'ODM Importer','odm_contact_email':'info@opendevmekong.net'}}
{'type':'library_record','organization':'odm-library','state':'active','limit':500,'field_filter':{'odm_contact':'OD Mekong Importer','odm_contact_email':'info@opendevmekong.net'}}
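The mapping entries above share a common shape, so it can help to sanity-check them before running an import. The sketch below is not taken from the scripts: `validate_map_entry` and `apply_field_prefixes` are hypothetical helpers, and the assumption that each `prefix` is simply prepended to the value of its `field` is ours.

```python
REQUIRED_KEYS = ("ontology", "organization", "groups")


def validate_map_entry(entry):
    """Return a list of problems found in one mapping entry (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in entry:
            problems.append("missing key: {}".format(key))
    for group in entry.get("groups", []):
        if "name" not in group:
            problems.append("group without a name: {!r}".format(group))
    for item in entry.get("field_prefixes", []):
        if "field" not in item or "prefix" not in item:
            problems.append("incomplete field prefix: {!r}".format(item))
    return problems


def apply_field_prefixes(record, entry):
    """Return a copy of record with each configured prefix prepended.

    Assumption: the prefix turns a bare file name into a full URL.
    """
    result = dict(record)
    for item in entry.get("field_prefixes", []):
        field = item["field"]
        if result.get(field):
            result[field] = item["prefix"] + result[field]
    return result


# A shortened version of the ODC_MAP entry shown above.
LAW_PREFIX = "http://cambodia.opendevelopmentmekong.net/wp-content/blogs.dir/2/download/law/"
entry = {
    "ontology": "ODC/laws",
    "organization": "cambodia-organization",
    "groups": [{"name": "laws-group"}, {"name": "cambodia-group"}],
    "field_prefixes": [{"field": "file_name_en", "prefix": LAW_PREFIX}],
}
record = apply_field_prefixes({"file_name_en": "forestry-law.pdf"}, entry)
```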
There is a known issue in the ckanext-googleanalytics extension (which we use for tracking use of the CKAN instance) that causes the scripts to fail after a certain number of requests. Therefore, please make sure that the plugin is not included in the ckan.plugins parameter in development.ini or production.ini before running the scripts.
As part of its workflow, the ckanext-issues extension automatically makes newly created datasets private, which leaves them unpublished. This is not the expected behaviour for the import scripts. To avoid it, set the variable ckanext.issues.review_system to False before running the scripts. After all the import scripts have run successfully, set the variable back to True.
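Both settings can be checked before an import with a few lines of Python. The following is a minimal sketch under some assumptions: the settings live in the `[app:main]` section of the ini file, the Google Analytics plugin is listed as `googleanalytics`, and a missing `ckanext.issues.review_system` line is treated as disabled. The function name `check_import_preconditions` is hypothetical.

```python
import configparser

TRUE_VALUES = {"true", "1", "yes", "on"}


def check_import_preconditions(ini_text):
    """Return a list of settings that should be changed before importing."""
    # interpolation=None: CKAN ini files contain %(here)s style values.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    settings = parser["app:main"]

    problems = []
    plugins = settings.get("ckan.plugins", fallback="").split()
    if "googleanalytics" in plugins:
        problems.append("remove googleanalytics from ckan.plugins")
    review = settings.get("ckanext.issues.review_system", fallback="false")
    if review.strip().lower() in TRUE_VALUES:
        problems.append("set ckanext.issues.review_system to False")
    return problems


SAMPLE_INI = """
[app:main]
ckan.plugins = stats text_view googleanalytics
ckanext.issues.review_system = True
"""

for problem in check_import_preconditions(SAMPLE_INI):
    print(problem)
```

To check a real file, read it first, for example with `open("/etc/ckan/default/production.ini").read()`, and pass the text to the function.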
In the scripts folder, there is a script that needs to be run first. It is called insert_initial_odm_data.py and creates a basic set of Users, Organizations and Groups that are required by the import scripts below. To run it:
cd scripts
python insert_initial_odm_data.py
Please be sure to edit the script and change CKAN_ADMIN_API_KEY to your user's API Key (which you noted before), otherwise the script will fail.
A Python script has been developed to extract the list of layers hosted on GeoServer and to pull, for each layer, its metadata, its GeoJSON representation (if available) and the links to OpenLayers and other visualisation formats (stored in GeoServer), so that they can be stored or updated on CKAN. Here is the pseudocode of the script:
To avoid issues caused by GeoServer being moved to another location and thereby changing its IP address, it should always be addressed by its current domain name: http://geoserver.opendevelopmentcambodia.net
Run import_from_geoserver.py which can be found under odm-scripting/ckan-scripts. This script downloads and initializes the map layers from GeoServer.
python import_from_geoserver.py
Please be sure to edit the script and change <CKAN_URL_AND_PORT>, <CKAN_ADMIN_API_KEY> and <GEOSERVER_BASIC_AUTH> to CKAN's URL and port, your CKAN user's API Key and GeoServer's Authorization header (Basic Auth) respectively, otherwise the script will fail.
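If you are unsure how to produce the value for <GEOSERVER_BASIC_AUTH>, the sketch below shows how a standard HTTP Basic Auth header is built from a user name and password. The function name and the credentials are placeholders, and whether the script expects the value with or without the leading "Basic " should be checked in the script itself.

```python
import base64


def basic_auth_header(username, password):
    """Return the value of an HTTP 'Authorization' header for Basic Auth."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return "Basic " + token


# Placeholder credentials, for illustration only.
header_value = basic_auth_header("geoserver_user", "secret")
print(header_value)
```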
Currently, NewGenLib (NGL) is used to maintain a collection of library publications on the old ODC website. This system not only lets users browse a catalogue of books and articles but also check the availability of certain publications in ODC's physical library. The existing records, along with their metadata, need to be imported into the new Data Hub module. However, checking the availability of publications in the physical library is not supported by CKAN and would need to be developed separately. For the moment, this will not be supported.
The records stored in NGL should be imported into the Data Hub programmatically. A script has been developed to automate this process. The following workflow has been conceived:
Run import_from_ngl.py, which can be found under odm-internal/odm-migration. This script imports the library publication records from NewGenLib into CKAN.
python import_from_ngl.py
Please be sure to edit the script and change <CKAN_URL_AND_PORT>, <CKAN_ADMIN_API_KEY> and <NGL_URL> to CKAN's URL and PORT, your CKAN's user API Key and URL of the NGL instance respectively, otherwise the script will fail.
To replicate the effect of the wpckan WordPress plugin for all the content previously created on opendevelopmentcambodia.net, a script has been written. It pulls XML files with exports of each relevant category on the WordPress site and archives them into CKAN, assigning the created or modified datasets to specific Organizations and/or Groups. See the ODC_MAP configuration above.
Run import_odc_contents.py which can be found under odm-internal/odm-migration.
python import_odc_contents.py
Please be sure to edit the script and change <CKAN_URL_AND_PORT>, <CKAN_ADMIN_API_KEY> and <ODC_MAP> to CKAN's URL and port, your CKAN user's API Key and the ODC_MAP mapping respectively, otherwise the script will fail.
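To give an idea of what such an import involves, the sketch below turns the items of a WordPress XML export into dictionaries shaped like the input of CKAN's `package_create` action. It is not the code of import_odc_contents.py: the sample XML, the helper names and the choice of fields are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

NAMESPACES = {"content": "http://purl.org/rss/1.0/modules/content/"}

SAMPLE_EXPORT = """
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example category export</title>
    <item>
      <title>Law on Forestry</title>
      <link>http://example.org/laws/forestry</link>
      <content:encoded><![CDATA[<p>Summary of the law.</p>]]></content:encoded>
    </item>
  </channel>
</rss>
"""


def slugify(title):
    """Build a CKAN-compatible dataset name (lowercase, hyphens, max 100 chars)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")[:100]


def items_to_datasets(xml_text, map_entry):
    """Convert each <item> of the export into a dataset dictionary."""
    root = ET.fromstring(xml_text.strip())
    datasets = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        datasets.append({
            "name": slugify(title),
            "title": title,
            "url": item.findtext("link", default=""),
            "notes": item.findtext("content:encoded", default="", namespaces=NAMESPACES),
            "owner_org": map_entry["organization"],
            "groups": map_entry["groups"],
        })
    return datasets


map_entry = {
    "organization": "cambodia-organization",
    "groups": [{"name": "laws-group"}, {"name": "cambodia-group"}],
}
datasets = items_to_datasets(SAMPLE_EXPORT, map_entry)
```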
Information across the platform is structured according to a taxonomy, which allows contents to be categorized by topic. This structure also has to be maintained in the Data Hub. The taxonomy is available in the odm-localization repository, along with its translations into several languages. To import the taxonomy elements into ODM's CKAN instance, a script has been written that reads the structure of the taxonomy from the odm-localization repository and imports it into CKAN as Tag Vocabularies.
Run import_taxonomy_tag_dictionaries.py, which can be found under odm-scripting/ckan-scripts. This script downloads the taxonomy structure and initializes it in CKAN.
python import_taxonomy_tag_dictionaries.py
Please be sure to edit the script and change <CKAN_URL_AND_PORT> and <CKAN_ADMIN_API_KEY> to CKAN's URL and port and your user's API Key (which you noted before) respectively, otherwise the script will fail.
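For reference, CKAN's `vocabulary_create` action takes a vocabulary `name` and a list of `tags`, each a dictionary with a `name`. The sketch below flattens a nested taxonomy into such a payload. The nested `name`/`children` layout of the input and the vocabulary name `taxonomy` are assumptions for illustration and may differ from the files in odm-localization.

```python
def collect_tag_names(node):
    """Return every 'name' found in a nested taxonomy node, depth first."""
    names = []
    if "name" in node:
        names.append(node["name"])
    for child in node.get("children", []):
        names.extend(collect_tag_names(child))
    return names


def vocabulary_payload(vocabulary_name, taxonomy):
    """Build the data dictionary expected by CKAN's vocabulary_create action."""
    tags = [{"name": name} for name in collect_tag_names(taxonomy)]
    return {"name": vocabulary_name, "tags": tags}


# Hypothetical excerpt of a taxonomy, for illustration only.
sample_taxonomy = {
    "children": [
        {"name": "Agriculture and fishing", "children": [{"name": "Fishing"}]},
        {"name": "Energy"},
    ]
}
payload = vocabulary_payload("taxonomy", sample_taxonomy)
```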
The taxonomy is available in the odm-localization repository, along with its translations into several languages. To import the translated taxonomy elements into ODM's CKAN instance, a script has been written that reads the translated taxonomy elements and imports them into CKAN as Term translations.
Run import_taxonomy_term_translations.py, which can be found under odm-scripting/ckan-scripts. This script downloads the taxonomy term translations and imports them into CKAN.
python import_taxonomy_term_translations.py
Please be sure to edit the script and change <CKAN_URL_AND_PORT> and <CKAN_ADMIN_API_KEY> to CKAN's URL and port and your user's API Key (which you noted before) respectively, otherwise the script will fail.
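Term translations can be sent to CKAN in one call with the `term_translation_update_many` action, whose `data` parameter is a list of dictionaries with the keys `term`, `term_translation` and `lang_code`. The sketch below builds that payload from a dictionary of translations per language. The input layout and the sample terms are assumptions and do not necessarily match the files in odm-localization.

```python
def translation_payload(translations_by_lang):
    """Build the data dictionary for CKAN's term_translation_update_many."""
    data = []
    for lang_code in sorted(translations_by_lang):
        terms = translations_by_lang[lang_code]
        for term in sorted(terms):
            translated = terms[term]
            if not translated:
                # Skip terms that have no translation yet.
                continue
            data.append({
                "term": term,
                "term_translation": translated,
                "lang_code": lang_code,
            })
    return {"data": data}


# Hypothetical translations, for illustration only.
sample_translations = {
    "vi": {"Energy": "Năng lượng", "Fishing": ""},
    "km": {"Energy": "ថាមពល"},
}
payload = translation_payload(sample_translations)
```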
Sometimes it is necessary to remove datasets from a certain group (e.g. laws or maps). For that, the delete_datasets_in_group.py script can be used. The script is configured by specifying the following details in the DELETE_MAP variable in the config file:
To remove only datasets imported by the import scripts, use the field_filter parameter with this value:
'field_filter':{'odm_contact':'ODM Importer','odm_contact_email':'info@opendevmekong.net'}
Run delete_datasets_in_group.py, which can be found under odm-internal/odm-migration. This script gathers the list of datasets and removes them in bulk.
python delete_datasets_in_group.py
Please be sure to edit the script and change <CKAN_URL_AND_PORT> and <CKAN_ADMIN_API_KEY> to CKAN's URL and port and your user's API Key (which you noted before) respectively, otherwise the script will fail.
When this script is run, the filtered datasets get the state 'deleted' but remain in the database. To delete these datasets permanently, log in with sysadmin credentials and point your browser to: http://data.opendevelopmentmekong.net/ckan-admin/trash. Alternatively, the instructions found under http://wiki.opendevelopmentmekong.net/code_snippets#purge_all_datasets_marked_as_deleted_on_ckan can be run to purge the deleted datasets.
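The effect of field_filter and limit can be illustrated with a short sketch. It assumes that the filtered fields (such as odm_contact) appear as top-level keys of each dataset dictionary, which may not hold for every CKAN setup, and the helper names are hypothetical. Each resulting dictionary has the shape expected by CKAN's `package_delete` action.

```python
def matches_field_filter(dataset, field_filter):
    """True if the dataset carries every field/value pair of the filter."""
    return all(dataset.get(field) == value for field, value in field_filter.items())


def select_datasets(datasets, delete_entry):
    """Apply the field_filter and limit of one DELETE_MAP entry."""
    field_filter = delete_entry.get("field_filter", {})
    limit = delete_entry.get("limit", len(datasets))
    selected = [d for d in datasets if matches_field_filter(d, field_filter)]
    return selected[:limit]


def deletion_payloads(datasets):
    """Build one package_delete data dictionary per selected dataset."""
    return [{"id": dataset["id"]} for dataset in datasets]


delete_entry = {
    "group": "laws-group",
    "limit": 500,
    "field_filter": {
        "odm_contact": "ODM Importer",
        "odm_contact_email": "info@opendevmekong.net",
    },
}
# Hypothetical datasets, as they might be returned for the group.
group_datasets = [
    {"id": "a1", "odm_contact": "ODM Importer", "odm_contact_email": "info@opendevmekong.net"},
    {"id": "b2", "odm_contact": "Someone else", "odm_contact_email": "other@example.org"},
]
payloads = deletion_payloads(select_datasets(group_datasets, delete_entry))
```

Only the dataset created by the importer is selected here; the manually created one is left untouched.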