COMM6320: Digital Research
PhD course, School of Journalism and Communication, CUHK, 2022
COMM6320 (2024-2025 Term 2)
Teacher: Prof. LIANG Hai (NAH310)
NAH209: Wed 10:30 AM - 1:15 PM
Textbooks:
- Salganik, M. J. (2017). Bit by bit: Social research in the digital age.
- Munzert, S., Rubba, C., Meißner, P., & Nyhuis, D. (2015). Automated data collection with R.
- Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences.
- Borgatti, S. P., Everett, M. G., & Johnson, J. C. (2018). Analyzing social networks.
- Kolaczyk, E. D. & Csardi, G. (2014). Statistical analysis of network data with R.
Reference Books:
- Kabacoff, R. I. (2011). R in action: Data analysis and graphics with R.
- Wickham, H. (2010). ggplot2: Elegant Graphics for Data Analysis (Use R!).
- Kumar, A. & Paul, A. (2016). Mastering text mining with R.
- Silge, J. & Robinson, D. (2017). Text mining with R: A Tidy approach.
- Hadley Wickham
Tools:
Course Materials:
- Syllabus
- Reading List
- Prerequisites: R Programming + Statistics
- Schedule with Slides
- 11/01 Week 1: Introduction
- 18/01 Week 2: Design I
- 01/02 Week 3: Design II
- 08/02 Week 4: Data Collection I (API)
- 15/02 Week 5: Data Collection II (Scraping) + Coding Crawling
- 22/02 Week 6: Text Mining Basics
- 01/03 Week 7: Vector Space Model
- 08/03 Week 8: Topic Modeling
- 15/03 Week 9: Coding Text
- 22/03 Week 10: Social Network Analysis Basics
- 29/03 Week 11: Ego Network Analysis
- 12/04 Week 12: ERGM (Social Selection)
- 19/04 Week 13: Coding SNA + dataset
- 19/04 Week 14: Network Inference + Bilibili
Previous Syllabus (2017-2018/2018-2019):
- Overview Slides
- Digital research = digital data + computational methods
- Reading:
- Lazer, D., Pentland, A. S., Adamic, L., Aral, S., Barabasi, A. L., Brewer, D., … & Jebara, T. (2009). Life in the network: the coming age of computational social science. Science, 323(5915), 721-723.
- Golder, S. A. & Macy, M. W. (2014). Digital footprints: Opportunities and challenges for online social research. Annual Review of Sociology, 40(1), 129-152.
- Ruths, D. & Pfeffer, J. (2014). Social media for large studies of behavior. Science, 346(6213), 1063-1064.
- Lazer, D. & Radford, J. (2017). Data ex machina: Introduction to big data. Annual Review of Sociology, 43, 19-39.
- Kabacoff (2011) & Wickham (2010).
- Research Design Slides
- Big data, experiment, survey, crowdsourcing, ethics (Salganik, 2017)
- Experiment:
- Salganik, Dodds, & Watts (2006). Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311, 854-856.
- Bond, R. M., Fariss, C. J., Jones, J. J., Kramer, A. D., Marlow, C., Settle, J. E., & Fowler, J. H. (2012). A 61-million-person experiment in social influence and political mobilization. Nature, 489(7415), 295-298.
- Muchnik, L., Aral, S., & Taylor, S. J. (2013). Social influence bias: A randomized experiment. Science, 341(6146), 647-651.
- Kramer, A. D., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111(24), 8788-8790.
- King, G., Pan, J., & Roberts, M. E. (2014). Reverse-engineering censorship in China: Randomized experimentation and participant observation. Science, 345(6199), 1251722.
- Munger, K. (2016). Tweetment effects on the tweeted: Experimentally reducing racist harassment. Political Behavior, 39(3), 1-21.
- Survey:
- Berinsky, A. J., Huber, G. A., & Lenz, G. S. (2012). Evaluating online labor markets for experimental research: Amazon.com’s Mechanical Turk. Political Analysis, 20(3), 351-368.
- Wang, W., Rothschild, D., Goel, S., & Gelman, A. (2015). Forecasting elections with non-representative polls. International Journal of Forecasting, 31(3), 980-991.
- Mellon, J. & Prosser, C. (2017). Twitter and Facebook are not representative of the general population: Political attitudes and demographics of British social media users. Research and Politics, 1-9.
- Web Data Collection Slides
- API, screen scraping, special techniques (e.g., Selenium, apps)
- Munzert et al. (2015)
- Liang, H., & Fu, K. W. (2015). Testing propositions derived from Twitter studies: Generalization and replication in computational social science. PLoS ONE, 10(8), e0134270.
- Liang, H., & Zhu, J. J. H. (2017). Big data, collection of (social media, harvesting). In Jörg Matthes (Ed.), The International Encyclopedia of Communication Research Methods. Wiley Press.
- Text Mining Slides
- Basic preprocessing, vector space model, supervised and unsupervised learning (Kumar & Paul, 2016; Silge & Robinson, 2017)
- Basics & Counting:
- Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267-297.
- Lansdall-Welfare, T., Sudhahar, S., Thompson, J., Lewis, J., Team, F. N., & Cristianini, N. (2017). Content analysis of 150 years of British periodicals. Proceedings of the National Academy of Sciences, 114(4), E457-E465.
- Benoit, K. et al. Getting Started with quanteda.
- Bail, C. A. (2016). Combining natural language processing and network analysis to examine how advocacy organizations stimulate conversation on social media. Proceedings of the National Academy of Sciences, 113(42), 11823-11828.
- Liang, H., & Fu, K. W. (2017). Information similarity, overload, and redundancy: Unsubscribing information sources on Twitter. Journal of Computer-Mediated Communication, 22(1), 1–17.
- Liang, H., & Fu, K. W. (2019). Network redundancy and information diffusion: The impacts of information redundancy, similarity, and tie strength. Communication Research, 46(2), 250-272.
- Supervised Learning:
- Beauchamp, N. (2017). Predicting and interpolating state‐level polls using Twitter textual data. American Journal of Political Science, 61(2), 490-503.
- Theocharis, Y., Barberá, P., Fazekas, Z., Popa, S. A., & Parnet, O. (2016). A bad workman blames his tweets: The consequences of citizens’ uncivil Twitter use when interacting with party candidates. Journal of Communication, 66(6), 1007–1031.
- Unsupervised Learning:
- Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77-84.
- Roberts, M. E., Stewart, B. M., Tingley, D., Lucas, C., Leder‐Luis, J., Gadarian, S. K., … & Rand, D. G. (2014). Structural Topic Models for Open‐Ended Survey Responses. American Journal of Political Science, 58(4), 1064-1082.
- Lucas, C., Nielsen, R. A., Roberts, M. E., Stewart, B. M., Storer, A., & Tingley, D. (2015). Computer-assisted text analysis for comparative politics. Political Analysis, 23(2), 254-277.
- Social Network Analysis Slides
- Basics, network formation, network influence, information diffusion (Kolaczyk & Csardi, 2014)
- Basics:
- Himelboim, I. (2017). Social Network Analysis (Social Media). The International Encyclopedia of Communication Research Methods. 1–15.
- Social Selection & Influence:
- McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415-444.
- Desmarais, B. A., & Cranmer, S. J. (2017). Statistical inference in political networks research. The Oxford Handbook of Political Networks, 203.
- Wimmer, A., & Lewis, K. (2010). Beyond and below racial homophily: ERG models of a friendship network documented on Facebook. American Journal of Sociology, 116(2), 583-642.
- Aral, S., Muchnik, L., & Sundararajan, A. (2009). Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proceedings of the National Academy of Sciences, 106(51), 21544-21549.
- Lewis, K., Gonzalez, M., & Kaufman, J. (2012). Social selection and peer influence in an online social network. Proceedings of the National Academy of Sciences, 109(1), 68-72.
- Liang, H. (2014). The organizational principles of online political discussion: A relational event stream model for analysis of web forum deliberation. Human Communication Research, 40(4), 483-507.
- Liang, H. (2014). Coevolution of political discussion and common ground in web discussion forum. Social Science Computer Review, 32(2), 155-169.
- Information Diffusion:
- Del Vicario, M., Bessi, A., Zollo, F., Petroni, F., Scala, A., Caldarelli, G., … & Quattrociocchi, W. (2016). The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3), 554-559.
- Ugander, J., Backstrom, L., Marlow, C., & Kleinberg, J. (2012). Structural diversity in social contagion. Proceedings of the National Academy of Sciences, 109(16), 5962-5966.
- Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146-1151.
- Lehmann, S. & Ahn, Y. Y. (2017). Spreading dynamics in Social Systems.
- Liang, H. (2018). Broadcast versus viral spreading: The structure of diffusion cascades and selective sharing on social media. Journal of Communication, 68(3), 525–546.