During the last few years a growing amount of content produced by Internet users has become publicly available online. These data come from a variety of sources, including popular social web services like Facebook and Twitter, consumer services like Amazon, and weblogs.
As the growing literature on the topic shows, this socio-technological innovation opens up huge research opportunities. At the same time, it raises new challenges for social scientists. In this paper we focus on two of the main challenges to the growth of so-called computational social science: interdisciplinarity and ethics. While the searchability and persistence of this information make it ideal for sociological research, a quantitative approach remains challenging because of the size and complexity of the data.
Collecting, storing and analyzing these data often requires technical skills beyond the traditional curricula of social scientists, so these projects in fact require collaboration with computer scientists. Nevertheless, developing a common interdisciplinary project is often difficult because of the researchers' different backgrounds.
At the same time, the availability of this content poses a challenge concerning privacy and research ethics. Because of the amount of data, and because the real identity of the author is often hidden behind a nickname, it is often impossible to ask the subjects involved to consent to the use of their data. On the other hand, especially in the first wave of Web 2.0, this information was publicly shared by the users, intentionally or not. While dis-embedding the identity of the user from the content analyzed is the technique often used to bypass this issue, an even more important privacy-related challenge for computational social science is emerging. Due to the wide adoption of social network sites such as Facebook or Google+, where users may decide to share their content with their group of friends only, the amount of public data will change and decrease in the future. We discuss this issue by enumerating a number of possible future scenarios.