Text Analytics of United Nations Publications

Analyze thousands of pages to discover changing topics, challenges, and policy recommendations in more than seven decades of United Nations research.

Updated by UN-OICT Analytics on October 30, 2016

Project repository

Project Objective

This project seeks ways to analyze the body of analytical work published by the United Nations Agencies, Funds and Programmes. Such analysis helps identify the most and least covered topics over time; the key issues, challenges and recommendations; how the analytical work relates to the organization’s goals; and clusters of topics and trends.

Project deliverables

This project will result in a web application that enables search and visualization of the body of documents provided (the corpus). The web application will allow the user to view the topics of the corpus, generated by various methods (unconstrained, MDGs, SDGs); a timeline of the topics, both across the corpus and by publication; how the documents relate to each other; and the main entities mentioned (e.g. people, places, organizations). Additional dimensions of analysis may be identified during the work and are welcome.
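To make the idea of a topic timeline concrete, here is a minimal sketch that counts topic keyword hits per publication year. It is only an illustration under stated assumptions: the keyword lists, the `topic_timeline` function and the sample documents are hypothetical, and a real implementation would more likely derive topics from a topic model or from curated MDG/SDG vocabularies.

```python
from collections import Counter, defaultdict

# Hypothetical keyword lists, for illustration only. A real system would
# derive topics from a model or from curated MDG/SDG vocabularies.
TOPIC_KEYWORDS = {
    "poverty": {"poverty", "poor", "income"},
    "climate": {"climate", "emissions", "warming"},
}


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    return [w.strip(".,;:()\"'").lower() for w in text.split()]


def topic_timeline(documents):
    """Count keyword hits per topic and year.

    documents: iterable of (year, text) pairs.
    Returns a dict mapping year -> Counter(topic -> number of hits).
    """
    timeline = defaultdict(Counter)
    for year, text in documents:
        tokens = tokenize(text)
        for topic, keywords in TOPIC_KEYWORDS.items():
            hits = sum(1 for t in tokens if t in keywords)
            if hits:
                timeline[year][topic] += hits
    return dict(timeline)


# Made-up sample documents, not taken from any UN publication.
docs = [
    (1995, "Rural poverty remains high and income growth is slow."),
    (2015, "Climate change and rising emissions threaten the poor."),
]
timeline = topic_timeline(docs)
for year in sorted(timeline):
    print(year, dict(timeline[year]))
```

The per-year counts produced this way could feed a line or stacked-area chart in the web application; normalizing by document length or by the number of publications per year would be a sensible next step.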

The tool will also add metadata to individual sentences in each publication, tagging whether a sentence refers to one or more Sustainable Development Goals and whether it is a “policy recommendation”, describes “barriers or challenges”, or indicates “causality” (that is, mentions a cause-effect relation).
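The sentence-level tagging described above could start from a simple rule-based baseline such as the sketch below. The cue phrases, the `tag_sentence` function and the sample text are assumptions made for illustration; the project does not prescribe a method, and a trained classifier would likely be needed for acceptable accuracy.

```python
import re

# Hypothetical cue phrases for each label; these are illustrative guesses,
# not a vocabulary defined by the project.
CUES = {
    "policy recommendation": r"\b(should|must|recommend(s|ed)?|needs? to|ought to)\b",
    "barriers or challenges": r"\b(barriers?|challenges?|obstacles?|constraints?|lack of)\b",
    "causality": r"\b(because|due to|leads? to|results? in|caused by|as a result of)\b",
}
PATTERNS = {label: re.compile(p, re.IGNORECASE) for label, p in CUES.items()}

# Explicit mentions such as "SDG 13" or "Goal 5".
SDG_PATTERN = re.compile(r"\bSDG\s*(\d{1,2})\b|\bGoal\s+(\d{1,2})\b", re.IGNORECASE)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_sentence(sentence):
    """Return the sentence with its matched labels and referenced SDG numbers."""
    labels = [label for label, pat in PATTERNS.items() if pat.search(sentence)]
    numbers = {int(a or b) for a, b in SDG_PATTERN.findall(sentence)}
    sdgs = sorted(n for n in numbers if 1 <= n <= 17)  # there are 17 SDGs
    return {"sentence": sentence, "labels": labels, "sdgs": sdgs}


# Made-up sample text, not taken from any UN publication.
text = (
    "Governments should invest in coastal protection to advance SDG 13. "
    "Lack of data is a major challenge. "
    "Sea level rise leads to coastal erosion."
)
tagged = [tag_sentence(s) for s in split_sentences(text)]
for item in tagged:
    print(item["labels"], item["sdgs"], "|", item["sentence"])
```

A sentence can receive several labels at once, which matches the requirement that tags are not mutually exclusive. Sentences that relate to a goal without naming it would need a keyword or model-based SDG matcher, which this sketch does not attempt.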

Data sources

The United Nations publications are made available in a variety of formats and languages. Some examples of publications can be seen here:

Publications by the Department of Economic and Social Affairs

Publications by the United Nations University

Publications by the Economic and Social Commission for Asia and the Pacific

This project will focus on English-language publications, with the option of expanding into other languages. An initial task will be to identify and prepare the publications of interest in a corpus for analysis.

Deliverable A

A target deliverable of this project consists of an analysis of the publications of the Department of Economic and Social Affairs according to main themes, links to MDGs and SDGs, clustering, and trends over time. It will also include a tool to visualize these results.

Deliverable B

Another deliverable consists of carrying out the analysis for publications relating to the issues of Small Island Developing States (SIDS). This deliverable will support preparations for the upcoming “Symposium on Implementing the 2030 Agenda for Sustainable Development and the SAMOA Pathway in Small Islands Developing States - SIDS”, which will be held in February 2017 in Nassau.

The list of SIDS can be found here:
UN Office of the High Representative for the Least Developed Countries, Landlocked Developing Countries and Small Island Developing States
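One possible way to select SIDS-related publications from the corpus is to match country names and generic terms in the text, as in the sketch below. The country list here is a small illustrative subset, not the official list linked above, and the function names and sample documents are hypothetical.

```python
import re

# Illustrative subset only; the full list should come from the official
# source referenced above.
SIDS_SAMPLE = ["Bahamas", "Fiji", "Jamaica", "Maldives", "Samoa"]
GENERIC_TERMS = ["Small Island Developing States", "SIDS"]

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in SIDS_SAMPLE + GENERIC_TERMS) + r")\b",
    re.IGNORECASE,
)


def sids_mentions(text):
    """Return the sorted, lowercased set of SIDS-related terms found in text."""
    return sorted({m.group(1).lower() for m in _PATTERN.finditer(text)})


def select_sids_documents(documents):
    """documents: iterable of (title, text). Keep titles with any mention."""
    return [title for title, text in documents if sids_mentions(text)]


# Made-up sample documents, not taken from any UN publication.
docs = [
    ("Report A", "Tourism in Fiji and Samoa is recovering."),
    ("Report B", "Trade corridors for landlocked countries."),
]
selected = select_sids_documents(docs)
print(selected)
```

Name matching of this kind is only a first filter: it will miss publications that discuss SIDS issues without naming a country, and it will include documents that mention a country only in passing.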

Reference projects

Unite Ideas #LinksSDGs

Project Team

The project is seeking a data analytics and visualization partner.

This team is advised by staff members from the United Nations Department of Economic and Social Affairs who provide background as well as detailed information on the expected work and objectives.