Big Data Solutions in Forced Migration: Innovations in Analytics to Promote Humane, Sustainable Responses to Forced Migration

World Bank, May 2018


This report highlights the potential for big data analytics to inform responses to forced migration, by supplying accurate real-time data on forced migration flows and the needs of displaced people. Big data also offer the potential to understand the causes of forced displacement and to predict forced migration flows. The report defines “Big Data” as “high-volume, high-velocity and high-variety datasets that can be analyzed to identify and understand previously unknown patterns, trends and associations.” Big data are usually produced in the course of another activity (e.g. making a cell phone call, using geo-located applications such as Twitter or email), and frequently draw on social media and the Internet. Crowdsourcing and crowdseeding can rapidly provide critical information to inform decision-making (e.g. mapping crisis hotspots, determining the cause and magnitude of violence, and identifying affected groups and their needs), by sourcing data directly from displaced and host populations. Remote sensing technologies, such as satellite imagery and unmanned aerial vehicles (UAVs), can supply data on migration flows and drivers such as environmental change or conflict. Weather data layered onto geospatial data can detect problems that may lead to forced migration, such as pocket droughts, especially in areas already affected by violent extremist activity. The report profiles several experimental approaches, as follows:

  • Monitoring hate speech: PeaceTech Lab undertook social media monitoring of hate speech terms to better understand the links between hate speech, fake news and conflict in South Sudan. A similar approach in Myanmar combined human analysis and machine learning to construct data visualizations of real-time trends in sentiments alongside instances of hate speech online. Such approaches can be used to: (a) monitor trends in instability; (b) prevent atrocities and forced migration (i.e. employed as an early warning system); (c) understand refugee integration problems; and (d) demonstrate whether refugee integration interventions have had an impact.
  • Monitoring risk indicators for future conflict: PeaceTech Lab developed the Open Situation Room Exchange (OSRx) to aggregate and visualize two large datasets of geospatial data on global protests, violent events and conflict.
  • Predicting movement of people from weather data: aWhere’s global agronomic modeling system constructs daily agro-meteorological datasets for geographical grids of around nine kilometers in size. This method can detect early indications of pocket droughts, which could lead to competition over scarce resources and forced migration, particularly in areas affected by violent extremism. Remote sensing data can also be used to predict population movements. For example, researchers at the University of Colorado Boulder have shown that Mexican rainfall estimates and other climate data from satellite imagery were predictive of domestic and international migration, particularly in regions where agriculture is the largest sector of the local economy.
  • Predicting outbreaks of mass atrocities: Harvard and NASA developed a set of algorithms to assess the risk of future atrocities using data from past atrocities and the Global Data on Events, Location and Tone (GDELT) platform. If real-time data were available, the algorithms could provide early warning of the risk of mass atrocities and potential forced migration.
  • Mapping Syria’s conflict: Since 2014 the Carter Center’s Syria Conflict Mapping Project and Palantir have analyzed open-source data to profile the Syrian conflict. By mining social media posts, they identified attributes for over 5,600 armed groups and documented evidence of mass atrocities.
  • Crowdsourcing for situational analysis: The open-source Ushahidi crowdsourcing platform was developed to map post-election violence in Kenya in 2008, using information submitted via the Internet and cell phones. It has since been used for monitoring natural disasters, elections, and high-crime areas. In 2012, the platform was deployed in Syria (Syria Tracker) to complement an open-source web and social media tracking platform that mines thousands of online sources for evidence of human rights violations.
  • Profiling needs through satellite imagery: In Nigeria, the Global Facility for Disaster Reduction and Recovery (GFDRR) advised on the use of satellite data to inform assessments of post-disaster needs. Detailed images taken from satellites and UAVs are also being used to track the growth or contraction of refugee camps, e.g. in Jordan and Syria.
  • Predicting food insecurity: USAID’s Famine Early Warning Systems Network (FEWS NET), which provides early indicators and analysis of food insecurity to inform humanitarian assistance, has been used to assess forced migration in response to food insecurity and conflict in South Sudan. The project was based on analysis of DigitalGlobe’s satellite imagery and crowdsourced data of dwellings and cattle tagged by 25,000 volunteers using the Tomnod crowdsourcing platform. Combined with existing data on food production, FEWS NET was able to more accurately project food insecurity.
  • Migration tracking from population-based models: Researchers have collaborated with Flowminder Foundation and its open-source demographic analytics resource WorldPop to aggregate census microdata and spatial data to model migration patterns in Sub-Saharan Africa. These gravity-type spatial interaction models have been shown to be effective in explaining and predicting migration. Flowminder researchers have also pioneered the use of anonymous mobile network data to monitor population displacement following the 2010 earthquake in Haiti and 2015 earthquake in Nepal, using data showing people’s movements as they travelled between individual cell phone transmitter towers adjusted for normal population movement patterns.
  • Geotagging internet data to follow movements: In 2013, researchers at Queens College CUNY, the Qatar Computing Research Institute and Stanford University used geo-tagged login information for Yahoo! users to track their international mobility. Researchers at Georgetown University’s Institute for the Study of International Migration have also mined the Internet to create a vast database that seeks to pre-emptively identify threats likely to lead to population dispersion.
  • Analyzing and visualizing information from big data sources: The World Bank’s Research Insight Tool (RIT) aggregates big data from news articles, social media and various sectoral sources. It analyzes text and curates data by tagging topics and key information, enabling full text searches.
  • Analyzing social media for responses to migration: In 2017, UN Global Pulse and UNHCR published analysis of how aspects of the European refugee crisis were conveyed on Twitter, in particular host communities’ sentiment towards migrants.
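
As a rough illustration of the gravity-type spatial interaction models mentioned in the list above, the sketch below assumes the textbook form in which the expected flow between two places grows with their populations and falls with the distance between them. The function name, the parameter values and the sample figures are hypothetical; they are not taken from the report or from the WorldPop models, whose parameters would be estimated from census and spatial data.

```python
def gravity_flow(pop_origin, pop_dest, distance_km,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Expected migration flow under a basic gravity model.

    flow = k * pop_origin**alpha * pop_dest**beta / distance_km**gamma

    All default parameter values are illustrative assumptions; in practice
    they would be fitted to observed migration data.
    """
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return k * (pop_origin ** alpha) * (pop_dest ** beta) / (distance_km ** gamma)


# Hypothetical example: the same pair of populations at two distances.
near = gravity_flow(100_000, 50_000, distance_km=50)
far = gravity_flow(100_000, 50_000, distance_km=200)

# With gamma = 2, a fourfold increase in distance cuts the expected flow
# by a factor of sixteen.
print(near / far)
```

The distance exponent controls how quickly expected flows decay with distance, which is why such models tend to predict that most movement occurs between nearby, populous locations.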

Challenges in the use of big data include: (a) individual privacy concerns and limited access to proprietary data; (b) a lack of technological resources to collect data in less developed contexts, including smartphones, sensors, and access to cloud-based processing; and (c) biases introduced when data are not representative, e.g. people at the lower end of the income distribution can be missing from data collected through cell phones.

