


Social reality is changing and is increasingly digitally networked, which calls for new research methods capable of analysing large bodies of data, including textual data. This development poses a challenge for sociology, whose primary ambition is to describe and explain social reality. Because traditional sociological research methods focus on relatively small datasets, the discipline's existential challenge today is to embrace new methods and techniques that yield valuable insights into large volumes of data at speed. One such emerging area of investigation is the application of Natural Language Processing and machine learning to text mining, which allows for swift analyses of vast bodies of textual content. The paper’s main aim is to probe whether one such novel approach, namely topic modelling based on the Latent Dirichlet Allocation (LDA) algorithm, can find meaningful applications within sociology, and whether adopting it helps sociology perform its tasks better. To outline the context in which LDA is applicable in the social sciences and humanities, abstracts of articles on topic modelling published in journals indexed in Elsevier’s Scopus database were analysed. This study, based on 1,149 abstracts, not only showed the diversity of topics addressed by researchers but also helped to answer the question of whether sociology that uses topic modelling is “good” sociology, in the sense that it opens up topic areas and data that would otherwise go unexplored.


Author Biographies

MARIUSZ BARANOWSKI, Adam Mickiewicz University in Poznań

Mariusz Baranowski is an assistant professor of sociology at Adam Mickiewicz University, Poznań, Poland.

PIOTR CICHOCKI, Adam Mickiewicz University in Poznań

Piotr Cichocki is an assistant professor of sociology at Adam Mickiewicz University, Poznań, Poland.

