Segregation and Sentiment: Estimating Refugee Segregation and Its Effects Using Digital Trace Data

Neal Marquez, Kiran Garimella, Ott Toomet, Ingmar G. Weber, Emilio Zagheni

Max Planck Institute for Demographic Research (MPIDR) Working Paper WP 2019-021, October 2019


This paper analyzes Call Detail Record (CDR) data to assess how communication and segregation between Turkish natives and Syrian refugees differ over time and space. The authors: (a) use CDR data to create metrics of geographic activity space and residential dissimilarity, as measures of segregation; (b) calculate spatial-temporal measures of the probability of refugees contacting Turkish citizens by phone and text, as a measure of group isolation; and (c) use Twitter posts that mention refugees to examine the relationship between the sentiment of tweets (revealing positive or negative attitudes towards refugees) and changes in segregation over space and time.
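
As an illustration of measure (a), a dissimilarity metric can be sketched as follows. This is a minimal sketch that assumes the standard Duncan index of dissimilarity, D = 0.5 * sum_i |a_i/A - b_i/B|; the summary does not state which formula the authors use. The function name `dissimilarity_index` and the example counts are hypothetical.

```python
def dissimilarity_index(group_a, group_b):
    """Duncan index of dissimilarity between two groups across areal units.

    group_a[i] and group_b[i] are counts of each group observed in unit i
    (e.g. a district). Returns a value in [0, 1]: 0 means both groups are
    distributed identically across units, 1 means complete separation.
    Assumption: this standard formula stands in for the paper's metric.
    """
    if len(group_a) != len(group_b):
        raise ValueError("both groups must cover the same areal units")
    total_a = sum(group_a)
    total_b = sum(group_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each group needs at least one observation")
    # Half the summed absolute difference between the groups' unit shares.
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# Hypothetical counts for three districts (not data from the paper).
refugees = [10, 30, 60]
natives = [200, 100, 100]
d = dissimilarity_index(refugees, natives)
```

The same function applies to either measure in (a): counts by place of residence give residential dissimilarity, while counts by where activity is observed give activity space dissimilarity.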

Key findings:

  • Metrics of activity space, i.e. the movements of refugees and Turkish citizens as indicated by CDR data, vary across major metropolitan areas. Among these areas, Ankara had the highest activity space dissimilarity and Istanbul the lowest, though district-level variance was twice as high in Ankara. There were significant differences over time at both the district and province levels.
  • Residential dissimilarity was strongly correlated with activity space dissimilarity. Activity space dissimilarity was frequently lower than residential dissimilarity.
  • Twitter sentiment was found to change significantly over time but not across locations. The data did not provide any significant evidence that higher dissimilarity or an urban setting produced unfavorable sentiment towards refugees.
  • There is a significant positive relationship between the measure of social segregation, i.e. the probability of calls from refugees to Turkish citizens, and the sentiment expressed in tweets about refugees. As weekly Twitter sentiment scores increased, i.e. revealed more positive attitudes towards refugees, the probability of refugees contacting non-refugees rose. The probability of cross-group connections was larger in urban areas than in non-urban areas, and higher when dissimilarity was higher.