This shows you the differences between two versions of the page.
| playground:playground [2020/01/16 11:27] admin Changed to reflect new GA setup (split WP and CKAN) | playground:playground [2020/06/23 15:05] (current) | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== PlayGround ====== | ====== PlayGround ====== | ||
| - | |||
| - | ====This is the replacement page for analytics ==== | ||
| - | ====== Site analytics report ====== | ||
| - | |||
| - | ===== Who is this guide for ===== | ||
| - | |||
| - | * Monitoring and evaluation officer who produces data for reporting | ||
| - | * Editor and data officer who assist with reporting  | ||
| - | * Program development officer who uses site analytics data for presentation and fundraising purposes  | ||
| - | |||
| - | ===== What this guide teaches ===== | ||
| - | |||
| - | * Introduction to site analytics indicators and workflow | ||
| - | * How to identify, gather, manipulate (if needed) and analyze site analytics data | ||
| - | |||
| - | **Note:** The step-by-step guide on how to work with site analytics data has been simplified for those who have basic data skills. If you are an advanced data wrangler or a developer, use your magic. | ||
| - | |||
| - | ===== Things to know beforehand ===== | ||
| - | |||
| - | Questions to consider when analyzing site data: | ||
| - | |||
| - | * How many users and where do they come from? | ||
| - | * What do they do when they arrive on the site? | ||
| - | * Do they specifically come to the site or are we showing up as a search result from their research online? What are the terms they are searching for? | ||
| - | * How long do they stay? | ||
| - | * Which areas of the platform are they most interested in? | ||
| - | * What do they read or download off the site? | ||
| - | * How do they navigate through the platform and between the countries if at all? | ||
| - | * How do they use interactive tools (i.e. map explorer and profiles)?  | ||
| - | |||
| - | <WRAP center round tip 90%> | ||
| - | ODI has developed two surveys to collect user feedback and testimonials.  | ||
| - | * **Pop-up survey**, to be implemented on the site, to gather data for user profiling and user need assessment. | ||
| - | * **User feedback and testimonial survey**, to be sent to a selected number of active users, to gather feedback for improving the site and to collect testimonies. | ||
| - | </WRAP> | ||
| - | |||
| - | ==== Site analytics workflow ==== | ||
| - | |||
| - | - **Identify the purpose** of the site analytics report | ||
| - | - **Determine the audience** – Is it a donor report, an internal assessment, a PR product? | ||
| - | - **Identify temporal range** – Weekly, monthly, quarterly, annually, mid-term, specific date (e.g. October 1, 2016 – September 30, 2017) | ||
| - | - **Identify indicators and data sources** – A number of useful indicators are included below in the Glossary of terms; however, depending on the purpose of the analytics report, additional indicators may be added. Google Analytics is an endless source of parameters, comparisons, insights and combinations to discover how pages on OD platform perform. Having a set of specific questions helps identify the most useful and relevant parameters. Some donors may have a number of specific indicators they would like to see (e.g. gender, access from mobile devices, access from a certain location).  Make sure to check the reporting requirements with the donors.  | ||
| - | - **Determine if comparison is needed** – Do you need to benchmark against a previous reporting period or another OD country instance (for internal evaluation)? | ||
| - | - **Download the data** – Never edit the original file. Make a copy of the file and conduct data manipulation and analysis on the copy. It’s advisable to use Google Sheets for your calculations, as it enables collaborative work. | ||
| - | - **Manipulate and analyze the data** – Data manipulation is the process of changing data to make it easier to read or better organized. You may need to filter, sort, categorize and perform basic calculations. | ||
| - | - **Visualize the data** – Identify the right type of presentation for the data you want to visualize. Provide a short and compelling caption for each graph or data table. | ||
| - | - **Draft a narrative for the report** informed by the site analytics insights | ||
| - | - **Share the draft report with colleagues** and invite comments / feedback  | ||
| - | - With relevant feedback, **finalize the report** | ||
| - | |||
| - | <WRAP center round tip 90%> | ||
| - | If you download raw data for analysis, make sure to NEVER edit the original data. Properly name the files and save them in one folder. Create a centralized Google Sheet and for each dataset create a tab where you can import (or copy and paste) the raw data over for analysis. | ||
| - | </WRAP> | ||
| - | |||
| - | <WRAP center round download 90%> | ||
| - | A folder containing **raw datasets** and a **centralized Google Sheet** containing data which will be used in the following section are stored on this [[https://drive.google.com/drive/folders/1bEBall3BwICaXhEMEKC3uwBeun94kVw9|Google Drive folder]] | ||
| - | </WRAP> | ||
| - | |||
| - | ==== Data sources and their limitations ==== | ||
| - | |||
| - | We have been using Google Analytics and the ‘Links to Your Site’ report, also a Google tool, to discern user behaviours and usage trends. Each offers specific insights and has its own set of limitations: | ||
| - | |||
| - | [[https://analytics.google.com|Google Analytics]] data is used to discern usage trends and average user behaviour on OD platform. Usage trends are demonstrated by number of users and sessions, disaggregated by new and returning users. Average user behaviour is discerned by how users access the site, average time spent on the site, how likely they are to exit the site, and most visited pages during the reporting period.  | ||
| - | |||
| - | There are limitations associated with this data as Google Analytics only offers the information disaggregated by certain parameters. These limitations are described in the **Working with Google Analytics** section. | ||
| - | |||
| - | {{ :partners:google_analytics_interface.png?nolink&700 |}} | ||
| - | |||
| - | [[https://www.google.com/webmasters/tools/external-links?utm_source=support.google.com/webmasters/&utm_medium=referral&utm_campaign=6155685&pli=1|‘Links to Your Site’]] (LTYS) reports websites that link to the OD platforms. Also a Google tool, it deploys crawlers to discover hyperlinks directed to OD pages and data from external domains. With data manipulation, you can discern the most hyperlinked topics, pages and content types, and identify who hyperlinks to the OD platform from external domains. | ||
| - | |||
| - | {{ :partners:ltys_interface.png?nolink&700 |}} | ||
| - | |||
| - | The LTYS data, however, also has its limitations. For instance, the data is collected at a single point in time and thus is not dated and cannot be filtered for the reporting period. However, the data offers richer information than Google Analytics on which pages have been hyperlinked to by external domains, indicating their popularity and amplifying the reach of the OD platforms to users of these domains. The data can be manipulated (categorized and grouped) and analyzed to reveal a broad group of users from government, media and civil society. | ||
| - | |||
| - | ==== Glossary of terms ==== | ||
| - | |||
| - | === Google Analytics === | ||
| - | |||
| - | * **Users**: Google Analytics used to offer “Users” as the number of unique visitors to a site. The number used to represent exactly how many individual people were there. However, the Users metric in standard Google Analytics views currently doesn’t represent individual people. Rather, this number is based on a cookie that is set in the user’s browser. That means if a user accesses the website from a different browser or device, they might be counted as multiple users. | ||
| - | |||
| - | * **Sessions**: A session is a group of interactions by one user with the site that take place within a given time frame. One unique visitor may initiate multiple sessions in a day. Sessions are typically refreshed after 30 minutes of inactivity.  | ||
| - | |||
| - | * **Unique pageview**: The number of visits to any given page. If a page was viewed multiple times during one visit, it is only counted once. | ||
| - | |||
| - | * **Bounce rate**: A //bounce// is a single-page visit to the site that does not continue to another page. If users enter our site on a certain page (the landing page or entrance page), and then leave the site without visiting another page, they have technically bounced. Google records limited statistics for these users; they are represented as having viewed 1 page and spent 0 minutes on the site. The bounce rate tells us what percentage of our visitors exit without visiting another page. | ||
| - | |||
| - | * **Non-bounce sessions**: Sessions where users view more than one page in a session.  | ||
| - | |||
| - | * **Average session duration**: It tells us how long, on average, visitors spend on our site. Note: this metric is likely an underestimate, since Google counts the visit duration of bounced visitors as 0. | ||
| - | |||
| - | === Links to Your Site (LTYS) === | ||
| - | |||
| - | * **Source domains**: Number of unique domains that contain / expose a link to a page on the OD platform. | ||
| - | |||
| - | * **Links**: Total number of hyperlinks made to a page on our site. This figure counts all published/exposed hyperlinks on an external domain. | ||
| - | |||
| - | * **Linked pages**: This reflects the accumulated total number of unique domains that hyperlink to a page from the OD platform.  | ||
| - | |||
| - | ===== Working with Google Analytics ===== | ||
| - | |||
| - | ==== Things to know beforehand ==== | ||
| - | |||
| - | * Admins of the site, as well as those working on donor reporting, should already have access permission to Google Analytics. To request access, contact the ODI administrator. | ||
| - | * Go to: https://analytics.google.com and sign in | ||
| - | * The stats are divided into WordPress and CKAN | ||
| - | * Select your country instance under **Analytics Account** > OD instance > WordPress or CKAN. | ||
| - | |||
| - | **Note**: For CKAN data you have the option **Unfiltered** data (contains spam) or **Master**. Always choose **Master**. | ||
| - | {{ :playground:ga_-_master.png?400 |}} | ||
| - | |||
| - | <WRAP center round info 90%> | ||
| - | ODM data from October 1, 2016 to September 30, 2017 will be used for the following step-by-step demonstration.  | ||
| - | </WRAP> | ||
| - | |||
| - | ==== User acquisitions ==== | ||
| - | |||
| - | User acquisition data tells the source of traffic and the medium through which users came to the OD platform for the reporting period.  | ||
| - | There are three types of source:  | ||
| - | * Direct (through entering the OD platform URL into the web browser) | ||
| - | * Organic (through search engines such as Google, Bing, Yahoo, etc., which are called medium) | ||
| - | * Referral (through a link published on an external website, inclusive of social media platforms, link contained in an e-mail or newsletter) | ||
| - | |||
| - | Since the OD platform hosts six websites and each links to the others, we want to demonstrate how much traffic to one of the OD sites is directed from other OD sites. For example, how much traffic to ODM is directed from OD country instances. This **traffic** is measured by **number of sessions**. | ||
| - | |||
| - | Referral data is also useful if we want to measure how much traffic to the site has been directed from a partner organization's site. For example, how many sessions to ODC have been directed from Global Forest Watch or a Cambodian government website. | ||
| - | |||
| - | === Basic user acquisition data === | ||
| - | |||
| - | Basic referral data is readily available on Google Analytics. It is accessible through this path: Acquisition > Overview (you should see this page: https://analytics.google.com/analytics/web/#report/trafficsources-overview/) | ||
| - | |||
| - | {{ :partners:traffic.png?nolink&700 |}} | ||
| - | |||
| - | From this **acquisition overview** page, you can find macro data on traffic source and number of sessions associated with each source. If this level of data is all you need, download the data by clicking on Export. You may save the file as CSV, Excel, Google Sheet, or PDF.  | ||
| - | |||
| - | {{ :partners:download_ga.png?nolink&700 |}} | ||
| - | |||
| - | Assuming you save the file as CSV, Excel, or Google Sheet, you will see: | ||
| - | |||
| - | {{ :partners:traffic_data.png?nolink&700 |}} | ||
| - | |||
| - | The same process can be done to access broad data on traffic to OD platform from social media and social network sites. | ||
| - | |||
| - | {{ :partners:traffic_social.png?nolink&700 |}} | ||
| - | |||
| - | === User acquisition data disaggregated by source and medium === | ||
| - | |||
| - | Disaggregated data helps us answer the following questions: | ||
| - | - What are some of the most popular search engines our users used? | ||
| - | - How much **traffic** to OD platform is directed from other OD sites? | ||
| - | - How much **traffic** to OD platform is directed from data partner websites?  | ||
| - | - How much **traffic** to OD platform is directed from government / media / academic websites? | ||
| - | |||
| - | Below is a step-by-step guide on how to download and analyze traffic data to show direct traffic, organic search, and traffic via referrals from OD platform and social media. Using the method below, you may also analyze how much traffic to OD platform comes from government, media, academic, NGOs, etc.  | ||
| - | |||
| - | **Step 1: Download raw data** | ||
| - | |||
| - | * Go to **Acquisition** > **All traffic** > **Source/Medium**.  | ||
| - | {{ :partners:raw_aquisition_ga_1.png?nolink&800 |}} | ||
| - | |||
| - | * Do not click export yet. Scroll all the way down to the data table.  | ||
| - | {{ :partners:aquisition_raw_one_page.png?nolink&800 |}} | ||
| - | |||
| - | <WRAP center round info 60%> | ||
| - | Only the visible data is downloaded. In this case, only 10 rows of data would be downloaded if you clicked Export. | ||
| - | Change the number of visible rows to a number that is more than the total number of rows. In this example, choose 250 (there are 246 rows in total). | ||
| - | </WRAP> | ||
| - | |||
| - | {{ :partners:aguisition_data_raw_all.png?nolink&800 |}} | ||
| - | |||
| - | * Scroll back up and click **Export** and save as Excel or Google Sheet. | ||
| - | |||
| - | {{ :partners:aquisition_data_raw_excel.png?nolink&800 |}} | ||
| - | |||
| - | **Step 2: Working with raw data** | ||
| - | |||
| - | * Copy the data to a Google Sheet. See [[https://docs.google.com/spreadsheets/d/1I_Hsqmq_lLzrqOp2qfpxPnEO6h1CMKOPptCq2uazUTw/edit#gid=0|this sheet on Google Drive]]. | ||
| - | |||
| - | * Take note of the **totals** for data verification later. Note that the total number of sessions from all traffic is 35045. | ||
| - | |||
| - | * Delete the totals and unrelated data. In this example, these are the highlighted rows and columns. | ||
| - | |||
| - | {{ :partners:acquisition_data_gsheet.png?nolink&800 |}} | ||
| - | |||
| - | * Add two new columns to the right of **Source / Medium** (column A). Columns B and C should now be blank. | ||
| - | |||
| - | * Copy **Source / Medium** column and paste it into column B. | ||
| - | Select column B > go to Data > Split text to columns > Split by "/" sign. | ||
| - | You should have the following: | ||
| - | |||
| - | {{ :partners:aquisition_split_text_to_data.png?nolink&800 |}} | ||
| - | |||
| - | This is data from ODM site. Thus the medium for "direct" is opendevelopmentmekong.net | ||
| - | |||
| - | **Step 3: Determine data you want to identify** | ||
| - | |||
| - | We want to be able to identify: | ||
| - | 1) traffic from major search engines (e.g. Google, Bing, Yahoo),  | ||
| - | 2) traffic from other OD platforms (e.g. ODC, ODMm, ODL, ODT, ODV), and | ||
| - | 3) traffic from social media (e.g. Facebook, Twitter).  | ||
| - | |||
| - | **Step 4: Transform / manipulate data** | ||
| - | |||
| - | We need to transform the Source data to the above grouping. | ||
| - | |||
| - | * Add a new column next to the **Source** column. Call it **Source (analyzed)** to differentiate.  | ||
| - | |||
| - | * Use the filter function to view data by **Medium**.  | ||
| - | |||
| - | * Assign appropriate categories.  | ||
| - | |||
| - | <WRAP center round tip 90%> | ||
| - | If you want to analyze how much traffic to OD platform comes from government, media, academic, NGOs, etc., you will need to transform data in the **Source** column by associating .edu with Academia, .gov with Government, .org with NGOs and so on. For media organizations, you will need to perform a text search to find matches with news website URLs. This manual data transformation might produce some inconsistencies. Make sure you double-check the work and get a colleague to help reproduce the data using your method. | ||
| - | </WRAP> | ||
| - | |||
| - | |||
| - | {{ :partners:source_analyze_organic.png?nolink&800 |}} | ||
| - | |||
| - | <WRAP center round important 80%> | ||
| - | ODC and ODMm use opendev[country].net as well as the default URL country.opendevelopmentmekong.net. Make sure to count traffic from both as traffic directed from the OD platform. There is also traffic directed from the PP site and ODM Wiki to PROD. For reporting purposes, this traffic doesn't need to be identified. | ||
| - | </WRAP> | ||
| - | |||
| - | {{ :partners:aquisition_data_od.png?nolink&800 |}} | ||
| - | |||
| - | * Now that the categorization is complete, use the Pivot table function to count the number of sessions for each medium. Make sure the grand total is the same as the number provided in the raw data (in this example 35045). | ||
| - | |||
| - | {{ :partners:aquisition_pivot.png?nolink&800 |}} | ||
| - | |||
| - | **Step 5: Analyze / visualize data** | ||
| - | |||
| - | Below is an example of how this data can be presented.  | ||
| - | |||
| - | {{ :partners:user_aquisition_report.png?nolink&400 |}} | ||
| - | |||
| - | <WRAP center round tip 90%> | ||
| - | If your organization reports detailed user acquisition data on a regular basis (monthly or quarterly), you may combine the monthly / quarterly reports rather than downloading and transforming data of a longer temporal range. | ||
| - | </WRAP> | ||
| - | |||
| - | ==== Users and sessions ==== | ||
| - | |||
| - | === Things to know beforehand === | ||
| - | |||
| - | **Temporal range**: For donor reporting, it's useful to break down the data by quarter. In the example below, we will gather data from October 1, 2016 to September 30, 2017, which covers four quarters: | ||
| - | * Quarter 4, 2016: October 1 to December 31, 2016 | ||
| - | * Quarter 1, 2017: January 1 to March 31, 2017 | ||
| - | * Quarter 2, 2017: April 1 to June 30, 2017 | ||
| - | * Quarter 3, 2017: July 1 to September 30, 2017 | ||
| - | |||
| - | <WRAP center round tip 90%> | ||
| - | Each data point downloaded for a specific temporal range is dated with that whole range. If you set the temporal range to October 1, 2016 to September 30, 2017, the data cannot be separated into weeks, months, or quarters afterwards. It's good practice to define the specific temporal range segments first and download the data for each segment. | ||
| - | </WRAP> | ||
| - | |||
| - | === Basic user and session data === | ||
| - | |||
| - | **Users**: Gather data to show how many users, disaggregated by **returning** and **new** users, have visited the OD platform over the reporting period. | ||
| - | |||
| - | **Sessions**: A session is a group of interactions by one user with the site that take place within a given time frame. One unique visitor may initiate multiple sessions in a day. Sessions are typically refreshed after 30 minutes of inactivity.  | ||
| - | * **Average session duration** data shows the average time returning and new users spent on the Platform | ||
| - | * **Page / Session** data shows the average number of pages on the Platform viewed per session by returning and new users | ||
| - | * **Bounce rate** shows how likely returning and new users are, on average, to exit the Platform after viewing only one page. | ||
| - | |||
| - | <WRAP center round info 90%> | ||
| - | Basic Google Analytics report disaggregates user data by returning and new user segments (see below). However, user segments will need to be specified for Sessions data. | ||
| - | </WRAP> | ||
| - | |||
| - | |||
| - | {{ :partners:users_sessions_lumsum.png?nolink&700 |}} | ||
| - | |||
| - | |||
| - | **Step 1: Get the data** | ||
| - | |||
| - | * User and sessions data are accessible via Audience > Overview. Make sure to set the temporal range parameter appropriately. The following example is for user and session data in the fourth quarter of 2016 (October 1 to December 31, 2016) | ||
| - | |||
| - | {{ :partners:user_session_temporal_range.png?nolink&700 |}} | ||
| - | |||
| - | * Click the Add segment sign and add **New users** and **Returning users** as segments. Uncheck **All users**. | ||
| - | |||
| - | {{ :partners:segment_setting.png?nolink&700 |}} | ||
| - | |||
| - | * Copy and paste the following data (in the red box) to a new tab on your centralized Google Sheet. Name it appropriately. Note that you may also export the raw data, but it would be more time-consuming to organize and calculate it later. Since there are only a few data points and we are not going to perform any data transformation / manipulation, it's not necessary to work with raw data. | ||
| - | |||
| - | {{ :partners:user_data_to_capture.png?nolink&700 |}} | ||
| - | |||
| - | * Repeat this process for other temporal ranges as needed. In this example, extract data for the three other quarters. You should have the following: | ||
| - | |||
| - | {{ :partners:session_data_all.png?nolink&700 |}} | ||
| - | |||
| - | **Step 2: Visualize and present the data** | ||
| - | |||
| - | See the following for examples: | ||
| - | |||
| - | {{ :partners:user_viz.png?nolink&500 |}} | ||
| - | |||
| - | {{ :partners:user_sessions_tables.png?nolink&600 |}} | ||
| - | |||
| - | <WRAP center round info 90%> | ||
| - | **Bounce rate**: Although the bounce rates seem relatively high for OD platform, it is possible that many return users are targeting specific pages for updates (e.g. daily news updates), spend their time reading these and leave, which constitutes a bounce even though the users found what they were looking for. Therefore, it is possible that the bounce rates listed by Google Analytics above are skewed and not a true representation of return user behavior. | ||
| - | </WRAP> | ||
| - | |||
| - | === Sessions data disaggregated by bounce and non-bounce behavior === | ||
| - | |||
| - | <WRAP center round info 90%> | ||
| - | Non-bounce sessions are sessions where users (both returning and new users) view more than one page in a session. | ||
| - | </WRAP> | ||
| - | |||
| - | It is important to note that Google reports disaggregated statistics for new and return users as a whole. Users cannot be disaggregated as bounce or non-bounce users. However, data that describes **user behavior in a session** such as **Average session duration** and **Page / Session** can be disaggregated by **bounce** and **non-bounce**. | ||
| - | |||
| - | Statistics that disaggregate **user behavior** by bounce and non-bounce sessions show that in non-bounce sessions users (both returning and new users) spend even more time on the platform: approximately 5-7 minutes on average, as opposed to 1-3 minutes. | ||
| - | |||
| - | These insights can be added to a report if deemed appropriate. To gather the data: | ||
| - | |||
| - | * Go to Audience > Overview. Make sure to set the temporal range parameter appropriately. The example below is for Q3, 2017. | ||
| - | |||
| - | * Click the Add segment sign and add **Bounce sessions** and **Non-bounce sessions** as segments. Uncheck **All users**. | ||
| - | |||
| - | * Copy and paste the following data (in the red box) to a new tab on your centralized Google Sheet. Name it appropriately.  | ||
| - | |||
| - | {{ :partners:non-bounce_sessions.png?nolink&600 |}} | ||
| - | |||
| - | * Repeat this process for other temporal ranges as needed. In this example, extract data for the three other quarters. You should have the following: | ||
| - | |||
| - | {{ :partners:screen_shot_2017-11-14_at_10.39.51_pm.png?nolink&600 |}} | ||
| - | |||
| - | ==== Most visited pages ==== | ||
| - | |||
| - | **Step 1: Get the data** | ||
| - | |||
| - | * Go to Behavior > Site content > All pages. Make sure to set the temporal range parameter appropriately. The following example covers October 1, 2016 to September 30, 2017. | ||
| - | |||
| - | {{ :partners:site_content_overview.png?nolink&700 |}} | ||
| - | |||
| - | * Do not click export yet. Scroll all the way down to the data table. There are thousands of pages with Pageviews data. Choose to view the top 25 most viewed pages. | ||
| - | |||
| - | {{ :partners:site_content_most_pageviews.png?nolink&700 |}} | ||
| - | |||
| - | * Scroll back up and click **Export** and save as Excel or Google Sheet. You should have: | ||
| - | {{ :partners:most_viewed_excel.png?nolink&600 |}} | ||
| - | |||
| - | |||
| - | **Step 2: Working with raw data** | ||
| - | |||
| - | * Add the raw data to a centralized folder | ||
| - | |||
| - | * Copy the data to a Google Sheet. See [[https://docs.google.com/spreadsheets/d/1I_Hsqmq_lLzrqOp2qfpxPnEO6h1CMKOPptCq2uazUTw/edit#gid=0|this sheet on Google Drive]]. | ||
| - | |||
| - | **Step 3: Determine data you want to identify** | ||
| - | |||
| - | We want to be able to classify each applicable page as one of: Topic page, Maps, Data, Tags, News, or Profiles | ||
| - | |||
| - | **Step 4: Transform / manipulate data** | ||
| - | |||
| - | * Create a column before the "Page" column. Mark applicable pages. Homepage and 'about' pages cannot be classified under an OD content type (above) and thus are not relevant for reporting purposes. Filter the column to show only applicable pages. | ||
| - | |||
| - | {{ :partners:applicable_pages.png?nolink&600 |}} | ||
| - | |||
| - | * Create two additional columns after the "Page" column. Assign the relevant content type to each page in the "Content type" column and add a note for each page. | ||
| - | |||
| - | {{ :partners:transformed_data.png?nolink&700 |}} | ||
| - | |||
| - | **Step 5: Visualize / present the data** | ||
| - | |||
| - | Below are two examples of how this data can be presented. The graph below counts total Pageviews for each content type. The table provides a list of these most viewed pages, each hyperlinked with the relevant URL. | ||
| - | |||
| - | {{ :partners:odm_most_viewed_pages.png?nolink&500 |}} | ||
| - | |||
| - | {{ :partners:most_viewed_pages_table.png?nolink&600 |}} | ||
| - | |||
| - | **Note**: Although you might not use all the data downloaded, it's better to have more data on hand. You might be able to use it to help you produce the narrative section of the report. For example, it might be interesting for your team or the donor to know how much time on average users spent on one of your most popular pages over the reporting period. | ||
| - | |||
| - | <WRAP center round tip 90%> | ||
| - | If your organization reports most visited pages, grouped by OD content types, on a regular basis (monthly or quarterly), you may combine the monthly / quarterly report rather than downloading and transforming data of a longer temporal range.  | ||
| - | </WRAP> | ||
| - | ===== Working with 'Links To Your Site' ===== | ||
| - | |||
| - | ==== Things to know beforehand ==== | ||
| - | |||
| - | * Google’s Links to Your Site tool crawls the internet and determines hyperlinks to the OD Platform from **external domains**. Not all links to OD platform may be listed. This is normal. See [[https://support.google.com/webmasters/answer/55281?hl=en|here]]  | ||
| - | |||
| - | * The LTYS report offers information on which external domains are linking to OD Platform, what pages on OD platform they are linking to, and how much they are exposing the links to OD Platform (measured by how many of their pages contain a link to OD Platform). | ||
| - | |||
| - | * Information on how an external domain is linking to the OD platform is readily available. However, in order to analyze how a user group (academia, government, the media, NGOs or a set of interest groups identified by your organization) is linking to the OD Platform, raw data needs to be downloaded, transformed / manipulated and analyzed. | ||
| - | |||
| - | * Unlike Google Analytics data, LTYS data for referrals is not dated. The figures represent the entire lifespan of the project where a hyperlink is detectable and still live as of the date the data is downloaded. | ||
| - | |||
| - | * LTYS does offer dated data for the most recent link it crawled. However, the data will need to be heavily manipulated to merge with the undated data on links to OD pages. | ||
| - | |||
| - | {{ :partners:ltys_lattest_links.png?nolink&700 |}} | ||
| - | |||
| - | <WRAP center round tip 90%> | ||
| - | **What are 'external' domains?** OD Platform hosts 6 websites, each of which might have more than one URL (e.g. both cambodia.opendevelopmentmekong.net and opendevelopmentcambodia.net take users to the ODC platform). The 6 front-ends, which technically are regarded by the Google crawler as separate domains, are interconnected and link to one another (referrals). To demonstrate how others (non-OD platform) have linked to content on an OD instance, we must take out links from within the OD family. | ||
| - | </WRAP> | ||
| - |  | ||
| - | === Access LTYS report === | ||
| - | |||
| - | * Admins of the site, as well as those working on donor reporting, should already have access permission to the LTYS report. To request access, contact the ODI administrator. | ||
| - | * Go to: https://www.google.com/webmasters/tools/external-links?hl=en&authuser=0 and sign in | ||
| - | * Select your country instance under Choose a verified property > OD instance | ||
| - | |||
| - | {{ :partners:login_ltys.png?nolink&600 |}} | ||
| - | |||
| - | You should see: | ||
| - | |||
| - | {{ :partners:ltys_homepage.png?nolink&600 |}} | ||
| - |  | ||
| - | ==== Referrals: Google Analytics vs. Links To Your Site data ==== | ||
| - | |||
| - | * LTYS technically reports 'referrals', which is also an indicator reported on Google Analytics. However, Google Analytics reports the amount of traffic (measured by number of sessions) directed to OD Platform from one external site (domain); it doesn't offer information on which OD pages have been hyperlinked on that site. | ||
| - | |||
| - | **For example**:  | ||
| - | |||
| - | By going to Google Analytics > Acquisition > Referrals, you can see that for the reporting period (October 1, 2016 - September 30, 2017): | ||
| - | * The Land Portal has linked to OD Mekong. For the reporting period, via the links to OD Mekong on the Land Portal website, **a number of users** have visited OD Mekong, 23 of whom were new users. Combined, they had 95 sessions. On average they viewed 1.8 pages per session and spent 1 minute and 40 seconds on the site. | ||
| - | * The Mekong Eye has also linked to OD Mekong. For the same reporting period, via the links to OD Mekong on the Mekong Eye website, **a number of users** have visited OD Mekong, 43 of whom were new users. Combined, they had 92 sessions. On average they engaged more with OD Mekong: they viewed 2.2 pages per session and spent 2 minutes and 17 seconds on the site. | ||
| - | {{ :partners:ga_referrals.png?nolink&700 |}} | ||
| - | |||
| - | By going to the LTYS report for opendevelopmentmekong.net > clicking "More ..." under **Who links the most** > searching for the **Land Portal** and the **Mekong Eye**, you would see: | ||
| - | |||
| - | * The Land Portal has linked to 4 ODM pages, and it exposes a link to these 4 pages on 57 of its own pages. | ||
| - | |||
| - | {{ :partners:odm_pages_linked_on_landportal.png?nolink&700 |}} | ||
| - | |||
| - | * The Mekong Eye has linked to 2 ODM pages, and it exposes a link to these 2 pages on 7,322 of its own pages. The ODM homepage alone is linked from 7,306 pages on the Mekong Eye. | ||
| - | |||
| - | {{ :partners:odm_pages_linked_on_mekong_eye.png?nolink&700 |}} | ||
| - | |||
| - | * These are some of the pages on the Mekong Eye that contain a link to the ODM homepage. | ||
| - | |||
| - | {{ :partners:mekong_eye_links_exposed.png?nolink&700 |}} | ||
| - | |||
| - | |||
| - | **Note**: Depending on the analysis you need, you may have to consolidate data for an external domain from the OD Datahub in order to demonstrate how a data partner is linking to the site. For example, the Land Portal is a data partner and has linked more to the OD Datahub than to the OD Mekong site. It has linked to 69 datasets on OD Datahub and has exposed these links on 169 of its web pages. | ||
| - | |||
| - | {{ :partners:dataset_ckan_linked_on_landportal.png?nolink&700 |}} | ||
| - | |||
| - | ==== Main institutional user groups hyperlinking to OD Platform ==== | ||
| - | |||
| - | To show how one OD platform might benchmark against another, the following demonstration will analyze and compare data from OD Mekong, ODC, ODMm, and OD Datahub as an example. Those working on an OD country instance may download only data for their respective site. If you want to access data from another OD instance, please contact the administrator of that country site. | ||
| - | |||
| - | <WRAP center round download 90%> | ||
| - | The raw data will be stored [[https://drive.google.com/drive/folders/1DTfKQQ1GcNuKrFZKnQqiehTuRaMhkOdI|here]] and the analysis will be conducted on this centralized [[https://docs.google.com/spreadsheets/d/1I_Hsqmq_lLzrqOp2qfpxPnEO6h1CMKOPptCq2uazUTw/edit#gid=0|Google Sheet]] | ||
| - | </WRAP> | ||
| - | |||
| - | **Step 1: Download the data** | ||
| - | |||
| - | * Go to Links to Your Site > **Who links the most** > **"More"** > **Download this table** | ||
| - | |||
| - | {{ :partners:download_ltys_data.png?nolink&700 |}} | ||
| - | |||
| - | |||
| - | <WRAP center round info 90%> | ||
| - | Links to Your Site data was downloaded on November 15, 2017 for analysis for this guide. | ||
| - | </WRAP> | ||
| - | |||
| - | * Give the file a descriptive name and add it to your centralized folder for raw data. | ||
| - | | ||
| - | * Copy and paste the data into a descriptively named tab on your working Google Sheet. If you are working with data from other OD sites, add a "Linking to:" column and mark each row with the shorthand for the relevant OD platform. | ||
| - | |||
| - | {{ :partners:ltys_combined.png?nolink&700 |}} | ||
| - | |||
| - | **Step 3: Determine institutional user groups and domain extension** | ||
| - | |||
| - | The domains data can be classified into: | ||
| - | * OD domains | ||
| - | * Non-OD domains (which are true external domains) | ||
| - | |||
| - | For external domains, we want to identify the following institutional user groups: | ||
| - | * Government | ||
| - | * Academia | ||
| - | * Civil society | ||
| - | * Media | ||
| - | * Private sector / business | ||
| - | |||
| - | Domain extension can be identified and classified. The following assumptions are made for this analysis: | ||
| - | * .org, .net, .info, and relevant .de extensions = Civil society organization (Double check the domains if needed. A number of German institutions have linked to OD Platforms) | ||
| - | * .edu and .ac = academia | ||
| - | * .gov, .go and relevant .s.de extensions = government  | ||
| - | * .com = business | ||
| - | * News domains are identified by searching for an exact match to the known URLs of media houses OR by filtering the text for the words "news", "tribune", "times", "post", and actual newsroom URLs (e.g. coconuts.co is an online media publisher). Under this assumption, the number of media websites linking to OD platform might be underreported. | ||
| - | |||
| - | <WRAP center round important 90%> | ||
| - | Some CSOs may have a .com domain (e.g. sahrika.com). Some media organizations / newsrooms may have a .org or .net domain. Academia might have a .net domain (e.g. researchgate.net). Thus, with this transformation method, the number of CSOs or media websites linking to OD platform might be skewed. Try your best to identify these cases and document your assumptions. A domain should be assigned to only one user group. Since LTYS data offers only a sample of links, this analysis should be taken for what it is: insight into which pages have been hyperlinked by external domains, indicating their popularity and amplifying the reach of the OD platform to users of these domains. The data generally reveals a broad group of users from government, media, and civil society. | ||
| - | </WRAP> | ||
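The classification assumptions in this step can also be sketched in code. The following is a minimal Python sketch, not part of the original workflow: the function name `classify_domain`, the keyword list, and the set of known media domains are illustrative assumptions. It follows the extension rules listed above and returns "Other" for anything that needs a manual check (such as .de domains).

```python
# Minimal sketch of the Step 3 assumptions. All names here are illustrative.
MEDIA_KEYWORDS = ("news", "tribune", "times", "post")
KNOWN_MEDIA = {"coconuts.co"}  # extend with known media house domains


def classify_domain(domain: str) -> str:
    """Assign exactly one institutional user group to a domain."""
    d = domain.lower().strip()
    # Media: exact match to a known newsroom or a keyword in the domain text.
    if d in KNOWN_MEDIA or any(word in d for word in MEDIA_KEYWORDS):
        return "Media"
    labels = d.split(".")
    extension_labels = labels[1:]  # everything after the first label
    # Government: .gov and .go, including forms such as .gov.kh or .go.th.
    if "gov" in extension_labels or "go" in extension_labels:
        return "Government"
    # Academia: .edu and .ac, including forms such as .ac.uk.
    if "edu" in extension_labels or "ac" in extension_labels:
        return "Academia"
    if labels[-1] in ("org", "net", "info"):
        return "Civil society"
    if labels[-1] == "com":
        return "Business"
    # Anything else (for example .de) needs a manual check.
    return "Other"
```

Keyword matching is deliberately crude, so the exceptions described in the note above (CSOs on .com, academia on .net) still need to be reviewed by hand and documented.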
| - | |||
| - | **Step 4: Transform the data** | ||
| - | |||
| - | * Copy the data to a new sheet for analysis and give the new sheet a descriptive name. | ||
| - | | ||
| - | * Evaluate whether a domain is internal (OD instances) or external. You can use the Filter function > Filter by condition > Text contains > type in **opendev** > press Enter | ||
| - | |||
| - | {{ :partners:filter_opendev.png?nolink&700 |}} | ||
| - | |||
| - | * In the **Internal / External** column, mark these rows with "Internal".  | ||
| - | |||
| - | {{ :partners:internal_domains.png?nolink&700 |}} | ||
| - | |||
| - | * Double-check. Filter "Internal / External" column. Select (Blank) and mark the rest with "External". You should have the following. | ||
| - | |||
| - | {{ :partners:ltys_filtered.png?nolink&700 |}} | ||
| - | |||
| - | * Filter the "Internal / External" column for "External". To identify the relevant domain extensions noted in Step 3, you may either 1) use the same **Filter by condition** function and mark the extensions in a new column called "Domain extension" OR 2) use the **Split text to columns** function. | ||
| - | |||
| - | * In another column called "User group", label the data with the previously defined institutional user grouping in Step 3. | ||
| - | |||
| - | Note: Unrelated entities may have a **.net** extension. They shouldn't be classified as civil society; add them to the "Other" category. **.com** domains are not very useful for this analysis and will also be classified as "Other". | ||
| - | {{ :partners:ltys_links_analyzed.png?nolink&700 |}} | ||
| - | |||
| - | * Using the Pivot table function, summarize the data you need for analysis. Note that the figures in the Pivot table below represent number of external domains in each institutional user group.  | ||
| - | |||
| - | {{ :partners:ltys_pivot.png?nolink&700 |}} | ||
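For readers who prefer a script to the spreadsheet, the internal/external marking and the pivot count above can be sketched in Python. This is only an illustration under stated assumptions: the sample rows are hypothetical placeholders (not real LTYS data), and the column names mirror the ones used in this guide.

```python
from collections import Counter

# Hypothetical rows standing in for the combined LTYS sheet (placeholder domains).
rows = [
    {"Domain": "opendevelopmentcambodia.net", "Linking to": "ODM", "User group": ""},
    {"Domain": "example-ministry.gov.kh", "Linking to": "ODM", "User group": "Government"},
    {"Domain": "example-ngo.org", "Linking to": "ODM", "User group": "Civil society"},
    {"Domain": "example-ngo.org", "Linking to": "ODC", "User group": "Civil society"},
    {"Domain": "example-university.edu", "Linking to": "ODC", "User group": "Academia"},
]


def mark_internal_external(rows, marker="opendev"):
    """Mimic 'Filter by condition > Text contains > opendev'."""
    for row in rows:
        is_internal = marker in row["Domain"].lower()
        row["Internal / External"] = "Internal" if is_internal else "External"
    return rows


def count_external_domains(rows):
    """Count external domains per (OD platform, user group), like the pivot table."""
    counts = Counter()
    for row in rows:
        if row["Internal / External"] == "External":
            counts[(row["Linking to"], row["User group"])] += 1
    return counts


pivot = count_external_domains(mark_internal_external(rows))
for (platform, group), n in sorted(pivot.items()):
    print(platform, group, n)
```

As in the pivot table, each figure is a number of external domains, not a number of links.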
| - | |||
| - | **Step 5: Visualize and present the data** | ||
| - | |||
| - | You may present the data as a data table or in a graphic presentation.  | ||
| - | |||
| - | === Government websites hyperlinking to OD Platform === | ||
| - | |||
| - | We are often asked if government agencies have used data offered on OD Platform. LTYS data analyzed above sheds light on which government institutions have found our content useful enough to link it to their website.  | ||
| - | |||
| - | * Filter the **User group** column for "Government". You should get the following: | ||
| - | |||
| - | {{ :partners:gov_linked_to_od.png?nolink&700 |}} | ||
| - | |||
| - | * Identify the government institution names and organize the data for presentation if needed. | ||
| - | |||
| - | {{ :partners:ltys_govt_linked.png?nolink&700 |}} | ||
| - | |||
| - | {{ :partners:ltys_govt_linked_2.png?nolink&700 |}} | ||
| - | |||
| - | * With the domain information provided, you can go to the LTYS website and find out which OD pages each institution has linked to and how the OD pages are displayed on its website. | ||
| - | | ||
| - | **For example**: The Ministry of Commerce (MoC) of Cambodia has linked to three pages on ODC. | ||
| - | | ||
| - | The three pages are tag pages, which display all content tagged with "fdi" (Foreign Direct Investment), "construction-industry", and "rubber-export" respectively. ODC uses these keywords to tag relevant news articles curated on the site. This indicates that staff at the MoC have been using the ODC website to browse news and conduct research. Clicking on the "fdi" tag, we can see that the MoC has referenced this tag page in several of its reports. | ||
| - | |||
| - | {{ :partners:odc_on_moc.png?nolink&700 |}} | ||
| - | {{ :partners:moc_links_tagpages.png?nolink&700 |}} | ||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
| - | ==== Most hyperlinked pages ==== | ||
| - | |||
| - | The LTYS report also offers data on the most linked pages for each OD Platform. The data is accessible via **LTYS** report > **Your most linked content** > **"More"** | ||
| - | |||
| - | {{ :partners:most_linked_pages_homepage.png?nolink&700 |}} | ||
| - | |||
| - | **Source domains** is an important indicator. It tells you how many websites have hyperlinked to a certain OD page. | ||
| - | |||
| - | <WRAP center round important 90%> | ||
| - | Since OD Platforms, each with a different URL, are regarded by Google's crawlers as external websites, LTYS data also includes linkages from other OD instances. To truly present linkages from 'external' domains, the data needs to be adjusted. In the following example, ODMm has hyperlinked to the ODM Land page. Thus, the number of source domains linking to the ODM Land page needs to be reduced by 1 and the number of links needs to be reduced by 4. | ||
| - | |||
| - | {{ :partners:odm_land_linked_odmm.png?nolink&700 |}} | ||
| - | |||
| - | </WRAP> | ||
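The adjustment described above is simple subtraction, sketched below in Python. The function name `adjust_for_internal` is illustrative, and the unadjusted totals in the usage example (59 links from 12 source domains) are made-up placeholders; only the ODMm figures (1 domain, 4 links) come from the example above.

```python
def adjust_for_internal(links, source_domains, internal_links):
    """Remove linkages that come from other OD instances.

    internal_links maps each OD instance that links to the page
    to the number of links it contributes.
    """
    adjusted_links = links - sum(internal_links.values())
    adjusted_domains = source_domains - len(internal_links)
    return adjusted_links, adjusted_domains


# Hypothetical unadjusted totals; ODMm contributes 4 links from 1 domain.
print(adjust_for_internal(59, 12, {"ODMm": 4}))
```

Each OD instance found on LTYS for a page adds one entry to `internal_links`, so the same function covers pages linked by more than one OD instance.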
| - | |||
| - | **Step 1: Download the data** | ||
| - | |||
| - | * Go to **LTYS** report > **Your most linked content** > **"More"** > **Download this table** | ||
| - | |||
| - | * Add the raw data to a centralized folder | ||
| - | |||
| - | * Copy the data to a new and properly named sheet on Google Sheet. See [[https://docs.google.com/spreadsheets/d/1I_Hsqmq_lLzrqOp2qfpxPnEO6h1CMKOPptCq2uazUTw/edit#gid=0|this sheet on Google Drive]]. | ||
| - | |||
| - | {{ :partners:most_linked_pages_odm_ga.png?nolink&600 |}} | ||
| - | |||
| - | Note that the page URLs already contain information about the OD content type. For example, /topic/ = Topic page, /updates/ = Site updates, etc. Editors can easily verify these markers against the custom post types on WordPress. | ||
| - | |||
| - | **Step 2: Transform data** | ||
| - | |||
| - | * We want to use the "Split text to columns" function to separate out the page URL. Before doing that, move the "**Links**" and "**Source domains**" columns and place them before the "**Your pages**" column. | ||
| - | | ||
| - | * Copy the content of the "**Your pages**" column and paste it into column D. | ||
| - | |||
| - | {{ :partners:copy_your_pages.png?nolink&700 |}} | ||
| - | |||
| - | * Select column D, go to Data > Split text to columns > Split by custom separator "/" | ||
| - | |||
| - | {{ :partners:split_most_linked_pages.png?nolink&700 |}} | ||
| - | |||
| - | * Clean up the data by combining similar markers (for example news and news-source) and separating out irrelevant pages such as the homepage, partnership page, terms of use, etc. by marking them with "internal" in a new column. Statistics for these pages might be useful to know, but we do not need them for donor reporting. | ||
| - | |||
| - | {{ :partners:linked_pages_consolidated.png?nolink&700 |}} | ||
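The "Split by custom separator" step can also be sketched in Python, as a rough equivalent for readers working outside Google Sheets. The function name `content_type` and the sample URLs are illustrative assumptions; the sketch assumes each value in "Your pages" is either a full URL with a scheme or a path starting with "/".

```python
from urllib.parse import urlparse


def content_type(page_url: str) -> str:
    """Return the first path segment of a page URL as its content-type marker."""
    path = urlparse(page_url).path
    segments = [segment for segment in path.split("/") if segment]
    # A URL with no path segments is the homepage.
    return segments[0] if segments else "homepage"


# Illustrative URLs following the /topic/ and /updates/ markers mentioned above.
for url in (
    "https://opendevelopmentmekong.net/topic/land/",
    "https://opendevelopmentmekong.net/",
    "/updates/some-post/",
):
    print(url, "->", content_type(url))
```

Similar markers (for example news and news-source) still need to be combined by hand or with a small lookup table, as in the clean-up step.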
| - | |||
| - | <WRAP center round tip 90%> | ||
| - | **Why you shouldn't use unadjusted data**:  | ||
| - | From the data above we can see which content types have been hyperlinked the most by external domains.  | ||
| - | {{ :partners:ltys_content_type_viz.png?nolink&600 |}} | ||
| - | |||
| - | We can also see which topic pages are the most linked.  | ||
| - | |||
| - | {{ :partners:most_linkied_topic_pages.png?nolink&400 |}} | ||
| - | |||
| - | However, the number of source domains hyperlinking to each page may be overreported, since the figures might include hyperlinks from other OD instances. This problem is of greater concern to ODC, since the site has been operational longer. | ||
| - | </WRAP> | ||
| - | |||
| - | **Step 3: Adjust the data** | ||
| - | |||
| - | Since we need to look up each page one by one on LTYS to find out whether other OD instances have linked to that specific page, it's best to clearly identify a small set of pages to look up. | ||
| - | |||
| - | {{ :partners:adjusted_data.png?nolink&700 |}} | ||
| - | |||
| - | Using the same method with ODC data, remove figures for linkages from ODM and ODMm. | ||
| - | |||
| - | Before adjustment: | ||
| - | {{ :partners:odc_before_adjustment.png?nolink&700 |}} | ||
| - | |||
| - | After adjustment: | ||
| - | {{ :partners:odc_after_adjustment.png?nolink&700 |}} | ||
| - | |||
| - | **Step 4: Visualize and present the data** | ||
| - | | ||
| - | By filtering the LTYS data further, we found that the most linked content type for ODM was the topic page, and that the Land page was the topic most linked by external domains: 11 external domains hyperlinked to the Land page, at least 5 times each on average. | ||
| - | |||
| - | {{ :partners:odm_linked_by_content_type.png?nolink&400 |}} | ||
| - | |||
| - | {{ :partners:domains_linked_to_odm.png?nolink&700 |}} | ||
| - | |||
| - | For ODC, the most linked content type was the profile page. The most linked profiles continue to be the Economic Land Concession, Mining, and Natural Protected Areas datasets, which have been updated periodically throughout the year. This highlights the demand for detailed national-level datasets and our platform's unique ability to offer them. | ||
| - | |||
| - | {{ :partners:linked_by_content_type.png?nolink&400 |}} | ||
| - | |||
| - | {{ :partners:external_linking_to_odc.png?nolink&700 |}} | ||
| - | |||
| - | |||
| - | ===== Additional indicators ===== | ||
| - | |||
| - | In addition to Google Analytics and Links to Your Site, relevant figures from CKAN should also be included. | ||
| - | CKAN offers statistics on **Total number of datasets** and **Top rated datasets**, which can be pulled directly from the [[https://data.opendevelopmentmekong.net/stats|CKAN Stat]] page without additional coding. | ||
| - | |||
| - | In addition to these two indicators, figures on **Most viewed datasets** and **Most downloaded datasets**, disaggregated by **Topics** and **OD Country** should also be included.  | ||
| - | |||
| - | |||
| - | As part of the already completed milestone 2.3.0 improvements to the layout of the [[https://drive.google.com/open?id=1pmT-rrj0uOOIqlHpSiA-1CGTGzXFC7xrwmTNrF6h-D8|dataset detail page]], we have implemented a mechanism which tracks the following **Events** on Google Analytics: | ||
| - | | ||
| - | * When the dataset detail page is loaded (example: https://opendevelopmentmekong.net/dataset/?id=an-overview-of-large-scale-investments-in-the-mekong), an Event **dataset_view** is sent to Google Analytics containing the id of the dataset in question. | ||
| - | |||
| - | * When the user clicks on the **download** button for a dataset's resource, an Event **resource_download** is sent to Google Analytics containing the id of the dataset and the id of the resource. | ||
| - | | ||
| - | * When the user clicks on a related dataset listed on the sidebar, an Event **related_dataset_link** is sent to Google Analytics containing the id of the dataset the user is being linked to. | ||
| - | |||
| - | |||