What social networks know about non-members

Contact details reveal private preferences

Social networks divide society into members and non-members. Relationships between non-members whose e-mail addresses have been communicated to the network by members (red lines) can be predicted with high probability from the mutually confirmed friendships between members (black lines) and their connections to non-members (green lines). © Ágnes Horvát

What can social networks on the internet know about people who have no user profile themselves but are friends of members? Researchers have now examined this question in more detail. They found that targeted analysis can reveal which of these non-members are friends with each other. Knowledge of a person's circle of friends, in turn, allows conclusions about hobbies, political views, and even something as private as sexual orientation.

For some years, scientists have been asking which conclusions a computer can draw from data that users enter directly or indirectly. In a social network, for example, information such as sexual orientation or political leaning can be calculated with great precision even if the member did not specify it. It is sufficient for enough of the user's friends to have disclosed the corresponding information about themselves. "Once confirmed friendships are known, predicting certain unknown properties is no longer a challenge for machine-aided data analysis," says Fred Hamprecht, co-founder of the Interdisciplinary Center for Scientific Computing (IWR) at the University of Heidelberg.
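The idea that friends' disclosed information gives away a user's undisclosed information can be illustrated with a very simple rule: take the most common value among the friends. The following is only a minimal sketch under that assumption; the function name, the sample people and the "party A"/"party B" labels are hypothetical, and the analyses mentioned above may rely on considerably more refined methods.

```python
from collections import Counter

def infer_attribute(user, friends, disclosed):
    """Guess an undisclosed attribute from the most common value among friends.

    friends:   dict mapping a user to a list of confirmed friends
    disclosed: dict mapping a user to the value they made public
    Returns None when no friend has disclosed anything.
    """
    values = [disclosed[f] for f in friends.get(user, ()) if f in disclosed]
    if not values:
        return None
    # most_common(1) returns a list with the single most frequent (value, count) pair
    return Counter(values).most_common(1)[0][0]

# Hypothetical data: "dana" has not stated a preference, three of her friends have.
friends = {"dana": ["ali", "ben", "cleo", "eve"]}
disclosed = {"ali": "party A", "ben": "party A", "cleo": "party B"}
guess = infer_attribute("dana", friends, disclosed)
```

The more friends have disclosed a value, the more stable such a guess becomes, which is why the user's own silence offers little protection.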

However, investigations of this kind have so far been restricted to users of social networks, i.e. to people who have a user profile there and have thus consented to the respective data protection conditions. "Non-members, however, have given no such consent. For this reason, we examined their susceptibility to the automatic generation of so-called shadow profiles," explains Katharina Zweig, who until recently worked at the IWR.

Networks also collect contact information from non-members

A social network on the Internet can obtain information about non-members, for instance through its function for finding acquaintances. New members of Facebook, for example, are asked to provide the network with their complete e-mail contacts when registering, including contacts who are not themselves members of Facebook. "This very basic knowledge of who knows whom within a social network can be linked to information about whom users know outside the network. From this link, in turn, a substantial part of the network of acquaintances between non-members can be derived," explains Ágnes Horvát, who conducts research at the IWR.
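How uploaded address books can hint at links between non-members can be sketched with one simple structural signal: the number of members who list both non-members as contacts. This is an illustrative assumption, not the feature set used by the researchers; the function name and the sample identifiers (m1, x, and so on) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def shared_member_counts(address_books, members):
    """For each pair of non-members, count the members who list both as contacts."""
    counts = defaultdict(int)
    for owner, contacts in address_books.items():
        if owner not in members:
            continue  # only members can upload an address book
        # Contacts outside the network, de-duplicated and in a stable order
        outsiders = sorted({c for c in contacts if c not in members})
        for a, b in combinations(outsiders, 2):
            counts[(a, b)] += 1
    return dict(counts)

# Hypothetical uploads: m1 and m2 both know the non-members x and y.
members = {"m1", "m2", "m3"}
address_books = {
    "m1": ["m2", "x", "y"],
    "m2": ["x", "y", "z"],
    "m3": ["z"],
}
counts = shared_member_counts(address_books, members)
```

In this toy data the pair (x, y) is listed together by two members, so it would rank as the most plausible acquaintance among the non-members.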

For their calculations, the Heidelberg scientists used a standard machine learning method based on the analysis of structural features of networks. Since the data needed for this study are not freely available, the researchers used a test set of real data and simulated the division into members and non-members with the widest possible range of methods. Commercially available computers needed only a few days to calculate which non-members are likely to be friends with each other.
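Simulating the division into members and non-members on a fully known friendship graph could look roughly like the sketch below. It assumes the simplest possible scheme, in which every person joins independently with probability `member_share` and every member uploads an address book with probability `upload_prob`; both parameter names and the function itself are hypothetical, and the study used a wider range of selection methods than this one.

```python
import random

def split_network(edges, member_share, upload_prob, seed=0):
    """Split a known friendship graph into what the network sees and what it must guess."""
    rng = random.Random(seed)
    people = sorted({p for edge in edges for p in edge})
    members = {p for p in people if rng.random() < member_share}
    uploaders = {m for m in sorted(members) if rng.random() < upload_prob}
    visible, hidden = set(), set()
    for a, b in edges:
        edge = tuple(sorted((a, b)))
        if a in members and b in members:
            visible.add(edge)   # confirmed friendship between two members
        elif a in uploaders or b in uploaders:
            visible.add(edge)   # member-to-non-member contact from an address book
        elif a not in members and b not in members:
            hidden.add(edge)    # link between non-members: the prediction target
        # contacts of members who did not upload stay unobserved
    return members, visible, hidden

edges = [("a", "b"), ("b", "c"), ("c", "d")]
everyone_in = split_network(edges, member_share=1.0, upload_prob=1.0)
nobody_in = split_network(edges, member_share=0.0, upload_prob=1.0)
```

A predictor is then trained and scored only on the `hidden` edges, which the simulated network never observes directly.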

40 percent predicted correctly

The Heidelberg scientists were surprised that all simulation approaches produced the same qualitative results. Under realistic assumptions about what percentage of a population belongs to a social network and how likely members are to upload their e-mail address books, the calculations correctly predicted about 40 percent of the acquaintances between non-members. Anyone who knows the friendships can often also draw conclusions about preferences, lifestyle and the like.
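One way to read a figure like "about 40 percent correct" is as the share of predicted pairs that turn out to be real acquaintances. The sketch below computes that share for made-up data; treating the figure as this kind of ratio is an assumption, since the article does not state exactly how the researchers measured it, and the function name and sample pairs are hypothetical.

```python
def share_correct(predicted, actual):
    """Fraction of predicted pairs that are real acquaintances."""
    # frozenset makes ("x", "y") and ("y", "x") count as the same pair
    predicted = {frozenset(p) for p in predicted}
    actual = {frozenset(p) for p in actual}
    if not predicted:
        return 0.0
    return len(predicted & actual) / len(predicted)

# Made-up example: two of the five predicted pairs are real.
predicted = [("x", "y"), ("x", "z"), ("y", "z"), ("w", "x"), ("w", "y")]
actual = [("y", "x"), ("z", "y"), ("v", "w")]
score = share_correct(predicted, actual)
```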

"Our investigation has highlighted the potential social networks have for deriving information about non-members. The results are also astonishing because they are based on pure contact data," emphasizes Hamprecht. However, many social networks and service providers hold much more information about their users, such as age, income, education, or place of residence. With such data, a suitable technical infrastructure and additional structural features from network analysis, the predictive accuracy could likely be increased significantly, according to the scientists.

"All in all, our project shows that as a society we have to reach an agreement on the extent to which information may be used when the persons concerned have not released it," says Zweig.

(PLoS ONE, 2012; doi: 10.1371/journal.pone.0034740)

(Ruprecht-Karls-University Heidelberg, 04.05.2012 - NPO)