Science & Technology Search Engine for Sustainable Development

Project Objective

This project aims to use machine-based classification of text to build a searchable repository of papers, websites, training materials, and other useful knowledge for sustainable development.

This search engine will include content from across the Internet published by credible organizations in academia, government, and the private sector.

Unlike the leading search engines on the market, which seek to aggregate as much content as possible, this one aims to provide a machine-curated result set drawn only from reputable sources.
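
As a rough illustration of what "machine-curated" could mean in practice, the sketch below keeps only documents that come from an allowlisted source and that score above a threshold on a simple topic vocabulary. It is a minimal sketch under stated assumptions, not the project's actual method: the domain names, the term list, the threshold, and the function names (is_trusted, topic_score, curate) are all hypothetical, and simple keyword overlap stands in for whatever trained text classifier the project adopts.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of reputable source domains (illustrative only;
# the reserved ".example" TLD is used so no real site is implied).
TRUSTED_DOMAINS = {"university.example", "agency.example"}

# Hypothetical topic vocabulary; a real system would likely use a trained classifier.
TOPIC_TERMS = {"sustainable", "renewable", "climate", "sanitation", "irrigation"}


def is_trusted(url: str) -> bool:
    """Return True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


def topic_score(text: str) -> float:
    """Fraction of the topic terms that occur in the text (case-insensitive)."""
    words = {w.strip(".,;:!?()\"'").lower() for w in text.split()}
    return len(words & TOPIC_TERMS) / len(TOPIC_TERMS)


def curate(docs, min_score=0.2):
    """Keep trusted-source documents at or above the threshold, best score first."""
    kept = [
        d for d in docs
        if is_trusted(d["url"]) and topic_score(d["text"]) >= min_score
    ]
    return sorted(kept, key=lambda d: topic_score(d["text"]), reverse=True)


if __name__ == "__main__":
    sample = [
        {"url": "https://www.university.example/paper",
         "text": "Renewable energy and climate adaptation."},
        {"url": "https://spam.test/page",
         "text": "Renewable climate sustainable"},
        {"url": "https://agency.example/news",
         "text": "Quarterly budget report."},
    ]
    for doc in curate(sample):
        print(doc["url"])
```

The source check runs before the topic check so that off-list content is never scored, which mirrors the stated goal of curating from reputable sources rather than aggregating everything.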

Useful Resources

Common Crawl, an open dataset of web crawl data that can be accessed and analyzed by anyone.
Common Search, a nonprofit search engine for the Web.
Spyglass, a simple search results front-end for Apache Solr using EmberJS.

Project Team

Data Acquisition

Ms. Sara Crouse - Common Crawl Foundation
Director

Sebastian Nagel
Crawl Engineer & Data Scientist

Data Analytics

W. “RP” Raghupathi, PhD - Fordham University
Professor of Information Systems
Director, MS in Business Analytics Program
Director, Center for Digital Transformation
Gabelli School of Business

Prof. Yilu Zhou, PhD - Fordham University
Associate Professor of Information Systems
Gabelli School of Business

Front-End Web Development

Roberto González Ibáñez, PhD
Assistant Professor
Departamento de Ingeniería Informática
Universidad de Santiago de Chile

Carolina Vásquez
Computer Engineering Student (Ingeniería Civil Informática)