Mobile Phone Data for Children on the Move: Challenges and Opportunities

Vedran Sekara, Elisa Omodei, Laura Healy, Jan Beise, Claus Hansen, Danzhen You, Saskia Blume and Manuel Garcia-Herranz

In Guide to Mobile Data Analytics in Refugee Scenarios: The “Data for Refugees Challenge” Study


Reliable, timely and accessible data are essential for understanding how migration and forced displacement affect children, and for informing policies and programs to meet their needs. This chapter discusses opportunities for using mobile phone data to address gaps in the data on displaced and migrant children. The authors identify three key challenges—data access, data and algorithmic bias, and operationalization of research—which need to be addressed if mobile phone data is to be successfully used in humanitarian contexts.

Key points:

  • Although mobile phone data mainly represents adult populations (since children are less likely to own a mobile phone), it can nevertheless be combined with other data sources (e.g. surveys) to understand youth mobility.
  • Mobile phone data has been used to estimate population displacements after natural disasters, understand collective behavior during emergencies, predict the geographic spread and timing of an epidemic, and estimate short-term mobility (e.g. temporary and circular migration). However, since SIM cards are linked to national providers, human mobility calculated from phone records can only be used to estimate movements within countries. To study international migration patterns, alternative sources of data have been used, such as geo-tagged tweets and Facebook data. Twitter data has also been used to estimate the relationship between short-term mobility and long-term migration.
  • In principle, mobile phone data, coupled with tools from network science, machine learning algorithms, and artificial intelligence techniques, has the potential to be used for mapping socioeconomic vulnerabilities, tracking epidemics in real time, and establishing causal relationships between factors such as climate change and migration.
  • Access to data is one of the key challenges faced by organizations that wish to incorporate data-driven methods into operations. Mobile phone data is difficult to obtain from telephone operators because of privacy concerns and a lack of standards for data anonymization and aggregation. Four privacy-conscientious models have been proposed that balance privacy concerns against the usefulness of the data: (1) limited release of a restricted data sample to a small group of trusted researchers; (2) remote access to anonymized data in a virtual environment controlled by the mobile phone operator, which ensures better security but requires operators to invest in infrastructure and technical expertise; (3) a question-and-answer model, where data stays on the operator's premises but researchers interact with it by submitting code (questions) to a system that validates the code, runs it, and returns results through an application interface, which also requires substantial investment in infrastructure and systems; and (4) aggregated data, which involves sharing indicators that are harder to link back to individuals but requires proper aggregation standards.
  • A further challenge is data representativeness and bias. Much of the research conducted so far has been undertaken in data-rich populations in high-income countries. Consequently, findings and methodologies might not generalize to vulnerable populations (children, low-income individuals), who tend to be the least represented in datasets that rely on technology usage (because they are less likely to own a phone and, if they do, they have lower usage rates). To address this problem, datasets should be built by carefully selecting demographically representative samples of mobile phone users (using demographic information provided by users when subscribing, or inferred from phone usage patterns), and the time window used to compute mobility should be wide enough to reduce bias. Mobile network coverage is another potential source of bias that needs to be considered.
  • Data challenge initiatives, in which private sector companies share a curated dataset with the research community, can provide insights into human behavior patterns and offer an opportunity to test mathematical and computational models. Once models have been finalized, their operationalization requires data streams that may be aggregated but must be updated in near real time. Models running on real-time data should also learn in real time, using techniques such as data assimilation. The authors call for: the creation of pipelines to allow joint research to be conducted with a strong focus on the most vulnerable; data explorations and models packaged into open-source modules that can be reused and adapted to different contexts; and implementations that integrate easily with the systems already in place.
  • There is a general disconnect between the scientific communities that work with ‘Big Data’ and the humanitarian and development sector. However, there are a number of areas where technology is already being used to protect forcibly displaced children, and where further inroads could be made. For example: (1) mobile phone data and self-reported data have been used to monitor drivers of migration, which could be expanded to better understand causal relationships, tipping points, and monitoring strategies; (2) phone data can provide detailed population maps that can be used to identify populations with poor access to services; (3) phone data has been used to analyze the relationship between mobility patterns and social ties; (4) mobile phone data has been applied to categorize social networks, identify communities and understand urban environments in terms of social dynamics and segregation; (5) a growing body of research within computational social science has been devoted to analyzing complex societal issues, such as polarization, community integration, gender and ethnic stereotypes, as well as fake news; and (6) research has demonstrated how network analysis can be applied to design more efficient interventions to reduce conflict in schools, and mobile phone data can be used to study individual communication capacities, behavioral adaptation, and detection of unusual behaviors.
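
To make the aggregated-data model (4) from the key points more concrete, the following is a minimal sketch in Python, not the method used by the authors or by any operator. It assumes a hypothetical input of (user, origin, destination) records, counts distinct users per origin-destination pair, and suppresses pairs with fewer users than a threshold. The function name aggregate_flows, the record layout, and the threshold min_count=3 are all assumptions chosen for illustration; a real aggregation standard would set these.

```python
def aggregate_flows(trips, min_count=3):
    """Count distinct users per (origin, destination) pair.

    Pairs observed for fewer than `min_count` distinct users are dropped,
    so that rare movements are harder to link back to individuals.
    The threshold is an illustrative assumption, not a standard.
    """
    users_per_flow = {}
    for user_id, origin, destination in trips:
        users_per_flow.setdefault((origin, destination), set()).add(user_id)
    return {
        flow: len(users)
        for flow, users in users_per_flow.items()
        if len(users) >= min_count
    }


# Hypothetical records: (user identifier, origin region, destination region).
trips = [
    ("u1", "A", "B"),
    ("u2", "A", "B"),
    ("u3", "A", "B"),
    ("u1", "A", "B"),  # repeat trip by the same user, counted once
    ("u4", "B", "C"),  # only one user, so this flow is suppressed
]

flows = aggregate_flows(trips, min_count=3)
print(flows)
```

Only the aggregated counts would leave the operator's premises in such a scheme; the per-user records stay internal. Small-count suppression alone does not guarantee anonymity, which is one reason the chapter stresses the need for proper aggregation standards.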