Abstract

Generative artificial intelligence (AI) has made rapid gains in performance across many data modalities, including text, images, audio, and video. This allows social scientists to annotate variables of interest from unstructured media. Although these methods are improving rapidly, they are far from perfect, and, as we show, ignoring even the small amounts of error in high-accuracy systems can lead to substantial bias and invalid confidence intervals in downstream analysis. We review how design-based supervised learning (DSL) guarantees asymptotic unbiasedness and proper confidence interval coverage by making use of a small number of expert annotations. Although DSL was originally developed for use with large language models in text, we present a series of applications in image analysis: an investigation of visual predictors of the perceived level of violence in protest images, an analysis of the images shared in the Black Lives Matter movement on Twitter, and a study of U.S. outlets' reporting of immigrant caravans. These applications are representative of the type of analysis performed in visual social science today, and our analyses illustrate how DSL helps attain statistical guarantees while using automated methods to reduce human labor.
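
The bias-correction idea can be illustrated with a minimal sketch for the simplest estimand, a population proportion. This is not the authors' implementation. It assumes that items are sent to experts at random with a known probability, and it applies the correction directly to the raw AI labels rather than to a fitted prediction model. The function names `dsl_mean` and `simulate`, the error rates, and the sample sizes are all hypothetical choices made for illustration.

```python
import random


def dsl_mean(predictions, expert_labels, sampling_prob):
    """Bias-corrected mean built from design-based pseudo-outcomes.

    predictions   : AI label (0/1) for every item.
    expert_labels : expert label (0/1), or None if the item was not
                    sampled for expert coding.
    sampling_prob : known probability that an item is sent to experts
                    (assumed constant here for simplicity).
    """
    pseudo_outcomes = []
    for y_hat, y in zip(predictions, expert_labels):
        # The correction term is nonzero only for expert-coded items and is
        # reweighted by the inverse of the known sampling probability.
        correction = 0.0 if y is None else (y_hat - y) / sampling_prob
        pseudo_outcomes.append(y_hat - correction)
    return sum(pseudo_outcomes) / len(pseudo_outcomes)


def simulate(n_items=50000, sampling_prob=0.1, seed=0):
    """Compare the naive and corrected estimates on synthetic data."""
    rng = random.Random(seed)
    # Hypothetical ground truth: about 30% of items are positive.
    truth = [1 if rng.random() < 0.30 else 0 for _ in range(n_items)]
    # Hypothetical classifier: misses 20% of positives, 5% false positives.
    predictions = []
    for y in truth:
        if y == 1:
            predictions.append(0 if rng.random() < 0.20 else 1)
        else:
            predictions.append(1 if rng.random() < 0.05 else 0)
    # Experts code a random subset chosen with the known probability.
    expert_labels = [y if rng.random() < sampling_prob else None for y in truth]

    true_mean = sum(truth) / n_items
    naive_mean = sum(predictions) / n_items
    corrected_mean = dsl_mean(predictions, expert_labels, sampling_prob)
    return true_mean, naive_mean, corrected_mean


if __name__ == "__main__":
    true_mean, naive_mean, corrected_mean = simulate()
    print(f"true proportion:      {true_mean:.4f}")
    print(f"naive AI-only mean:   {naive_mean:.4f}")
    print(f"corrected (DSL-like): {corrected_mean:.4f}")
```

In this sketch the naive estimate inherits the classifier's systematic error, while the corrected estimate is centered on the truth because the expert-coded subset is a known-probability random sample. The price is extra variance, which shrinks as more items receive expert labels.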

Keywords

AI, machine learning, computer vision, causal inference, measurement
