====== Metadata ======

==== Who is this guide for ====

  * OD staff (editors and data officers)
  * Contributors willing to add content to the OD Database

==== What this guide teaches ====

  * What is metadata, and why is it important?
  * Guidelines for creating accurate metadata
  * Open Development metadata templates

===== What is metadata, and why is it important? =====

Metadata is data that describes data. Imagine you have a present for someone. A label with a written note attached can tell the recipient what they are going to get without tearing the wrapping paper and opening the box. A carefully wrapped present without a label might add excitement for the recipient, but data without metadata is not usable. Data is simply numbers and figures. Data doesn’t mean anything without a description, its metadata.

For CKAN purposes, data is published in units called “datasets”. A dataset is a parcel of data: for example, it could be the crime statistics for a region, the spending figures for a government department, temperature readings from various weather stations, or a reference document.

A dataset contains two things:

  * Metadata that describes the data. For example, the title and publisher, the date, what formats it is available in, what license it is released under, etc.
  * A number of “resources”, which hold the data itself. CKAN does not mind what format the data is in. A resource can be a CSV or Excel spreadsheet, XML file, PDF document, image file, linked data in RDF format, etc. CKAN can store the resource internally, or store it simply as a link, with the resource itself located elsewhere on the web.

A dataset can contain any number of resources. For example, different resources might contain the data for different years, or they might contain the same data in different formats.
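As a rough illustration of the structure described above, a dataset can be pictured as a metadata record plus a list of resources. The sketch below is a minimal model in Python, not CKAN's actual schema: the class names, field names, and example values are assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One file or link that holds the data itself (hypothetical model)."""
    name: str
    format: str  # e.g. "CSV", "PDF", "XML"
    url: str     # where the file is stored or linked


@dataclass
class Dataset:
    """Metadata describing the data, plus any number of resources."""
    title: str
    publisher: str
    license: str
    resources: List[Resource] = field(default_factory=list)

    def formats(self) -> List[str]:
        """Return the distinct formats the data is available in, sorted."""
        return sorted({r.format for r in self.resources})


# Illustrative values only; they do not describe a real dataset.
example = Dataset(
    title="Temperature readings",
    publisher="Example Weather Office",
    license="CC-BY-4.0",
    resources=[
        Resource("Readings 2019", "CSV", "https://example.org/readings-2019.csv"),
        Resource("Readings 2020", "CSV", "https://example.org/readings-2020.csv"),
        Resource("Readings 2020", "XML", "https://example.org/readings-2020.xml"),
    ],
)

print(example.formats())
```

Here the three resources cover different years and different formats of the same data, while the title, publisher, and license are recorded once at the dataset level.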
Example: [[https://opendevelopmentmekong.net/dataset/?id=hydro-basins-level-6-greater-mekong-subregion-laos-myanmar-thailand-vietnam-cambodia&search_query=P3M9aHlkcm9iYXNpbiZ0eXBlPWRhdGFzZXQmcGFnZT0w|Hydrobasins level 6 dataset on OD Mekong]]

{{ :hydro_basin_level_6_mekong.png?nolink&600 |}}

Metadata provides important context about an informational asset’s source and manner of creation, as well as the applications or environments in which the asset is relevant. Metadata also serves the following purposes:

  * certifies the authenticity and degree of completeness of the content;
  * establishes and documents the context of the content;
  * identifies and exploits the structural relationships within and between information objects;
  * provides a range of intellectual access points for an increasingly diverse range of users.

===== Guidelines for creating accurate metadata =====

Information is often imperfect, whether it is produced by members of the Open Development Network or by others. Details may be missing, badly defined, or even completely wrong. Sometimes it is possible to improve the quality of the information by contacting its source. But even then, problems may remain.

**How can we create accurate and useful metadata when the information it describes might be flawed?**

We aim to produce, to the best of our ability, accurate metadata by describing the extent of our knowledge about the asset/resource. Good metadata should clearly state what is known about the resource and what is not known or is problematic. Metadata changes when the asset itself, or knowledge about its condition, changes.

If information is missing or inconsistent, describe the known inconsistencies or gaps instead of disregarding the resource. Mention any steps being taken to address these issues, along with an expected timeline.
**[THIS INSTRUCTION IS UNDER REVIEW]**

===== Open Development Platform metadata templates =====

Each type of data has specific terms that relate to it. The Datahub currently stores and administers four types of datasets, each of which requires its own metadata template:

  * Dataset (both spatial and non-spatial)
  * Library records
  * Law records
  * Agreement records (for contracts)

These metadata templates were developed by adapting and enhancing the standard CKAN metadata template. Each template contains metadata fields common to all dataset types on CKAN and a set of fields that apply only to that dataset type. For example, metadata about a research report (Library records) will have information about the author(s) and publisher(s); metadata for laws and policies (Law records) will instead have information about the drafting agency, issuing agency, promulgation date, etc.

The templates below outline the information that should be included and offer instructions for each metadata field.

==== Metadata for dataset (both spatial and non-spatial) ====

[[public:geospatial_metadata|public:geospatial_metadata]]

==== Metadata template for library publications ====

[[public:library_metadata|public:library_metadata]]

==== Metadata template for law and policy documents ====

[[public:laws_metadata|public:laws_metadata]]

==== Metadata for agreement documents (contracts) ====

[[public:agreement_metadata|public:agreement_metadata]]

Other metadata fields exposed by the CKAN API:

^ Label ^ Field Name (API) ^ Definition ^ Guidelines ^ Example ^
| Type* | type | Dataset type | dataset or library_record | dataset |
| Resources* | resources | Array with information about resources | ... | ... |
| Tags* | tags | Array with information about tags/topics | ... | ... |
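To show how the ''type'', ''resources'', and ''tags'' fields might be read from an API response, the sketch below parses a small hand-written JSON sample shaped like the response of the standard CKAN Action API call ''package_show''. The sample values and the helper name ''summarize'' are assumptions for illustration; the Open Development instance may expose additional or differently structured fields.

```python
import json

# Hand-written sample shaped like a CKAN `package_show` response.
# All values are made up for illustration; this is not real OD data.
SAMPLE_RESPONSE = """
{
  "success": true,
  "result": {
    "name": "example-dataset",
    "type": "dataset",
    "resources": [
      {"name": "Example table", "format": "CSV",
       "url": "https://example.org/table.csv"}
    ],
    "tags": [{"name": "example-topic"}]
  }
}
"""


def summarize(response_text: str) -> dict:
    """Extract the type, resource formats, and tag names from a response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    record = payload["result"]
    return {
        "type": record.get("type"),
        "resource_formats": [r.get("format") for r in record.get("resources", [])],
        "tags": [t.get("name") for t in record.get("tags", [])],
    }


summary = summarize(SAMPLE_RESPONSE)
print(summary)
```

The sketch deliberately works on a local string rather than calling a live server, so it can be run without network access or credentials.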