The executive director of an international nonprofit organization needs to map all the hospitals, clinics, and first aid facilities in Ecuador, Colombia, and Venezuela that are within five miles of elementary schools. An entrepreneur is researching the market for laundromats in low-income areas in mid-sized U.S. cities. A State Department diplomatic security agent wants to map the police stations and military bases that are within a 20-minute drive of U.S. embassies and consulates in 17 Middle Eastern countries. All three face the same challenge: finding a comprehensive, reliable, up-to-date database of geo-located points of interest (POI) that they can import into their geographic information system (GIS) and analyze. For each project they can find hundreds of relevant databases on the Internet, but few of these sources, if any, will meet all of their requirements.
The volume of geospatial data created and analyzed by individuals and non-governmental organizations (NGOs), commonly known as crowd-sourced data, has exploded in the last couple of years, dwarfing the amount of information in government databases and proprietary commercial systems. Massive as they are, however, these troves of data are often not sufficiently reliable or accurate for uses such as those in the examples above. Moreover, it is challenging to make sense of disparate socio-cultural data collected by a multitude of systems, for different purposes, and at different levels of specificity. Enabling users to rapidly gain an understanding of their regions of interest requires fusing these non-standard data sources into consistent datasets.