COMM6320: Digital Research
PhD course, School of Journalism and Communication, CUHK, 2024
COMM6320 (2024-2025 Term 2)
Teacher: Prof. LIANG Hai (NAH310)
NAH209: Wed. 10:30 AM - 1:15 PM
Textbooks:
Research Design (Weeks 1-3):
- Salganik, M. J. (2017). Bit by bit: Social research in the digital age.
Web Data Collection (Weeks 4-5):
- Munzert, S., Rubba, C., Meißner, P., & Nyhuis, D. (2015). Automated data collection with R.
Text as Data (Weeks 6-9):
- Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data.
- Rothman, D. (2024). Transformers for Natural Language Processing (3rd ed.).
Network Analysis (Weeks 10-13):
- Borgatti, S. P., Everett, M. G., & Johnson, J. C. (2018). Analyzing social networks.
- Kolaczyk, E. D. & Csardi, G. (2014). Statistical analysis of network data with R.
Image as Data (Week 14):
- Williams, N. W., Casas, A., & Wilkerson, J. D. (2020). Images as data for social science research.
Reference Books:
- Angrist, J. D., & Pischke, J. S. (2009). Mostly harmless econometrics.
- Kabacoff, R. I. (2011). R in action: Data analysis and graphics with R.
- Kumar, A. & Paul, A. (2016). Mastering text mining with R.
- Silge, J. & Robinson, D. (2017). Text mining with R: A Tidy approach.
- Wickham, H. (2010). ggplot2: Elegant Graphics for Data Analysis (Use R!).
- Hadley Wickham
Course Materials:
- Syllabus
- Reading List by Week
- Prerequisites: R Programming + Statistics
- Schedule with Slides
- 08/01 Week 1: Introduction
- CSS
- Self-learning: R Programming + Videos
- 15/01 Week 2: Design I
- Big data, online survey
- Self-learning: Wired statistics I-III
- 22/01 Week 3: Design II
- Digital field experiment, mass collaboration
- Self-learning: Wired statistics IV-V
- 05/02 Week 4: Data Collection I (API)
- API
- Lab: RCurl, httr, rvest
- 12/02 Week 5: Data Collection II (Scraping) + Coding Crawling
- Web scraping
- Lab: RCurl, httr, rvest
- 19/02 Week 6: Text Mining I
- BoW, Vector-space models
- Lab: quanteda
- 26/02 Week 7: Text Mining II
- Clustering, topic modeling, machine learning
- Lab: quanteda.textmodels, stm
- 05/03 Week 8: Text Mining III
- Word embedding, deep learning
- Lab: word2vec, h2o, keras
- 12/03 Week 9: Text Mining IV + Coding Texts
- Transformers, LLMs
- Lab: text, grafzahl, rollama, llamafile
- 19/03 Week 10: Social Network Analysis I
- Nodes, edges, centralities
- Lab: sna, igraph
- 26/03 Week 11: Social Network Analysis II
- Network cohesion, communities
- Lab: sna, igraph
- 02/04 Week 12: Social Network Analysis III
- Ego network analysis
- Lab: egor
- 09/04 Week 13: Social Network Analysis IV + Coding Networks
- Network formation models
- Lab: sna, igraph
- 16/04 Week 14: Multimedia Analysis
- Computer vision, audio/image analysis
- Lab: googleCloudVisionR, keras