The Most Cited Scientific Information Sources in Wikipedia Articles Across Various Languages


sources of information
scientific sources
data science
data exploration

How to Cite

Lewoniewski, W. (2024). The Most Cited Scientific Information Sources in Wikipedia Articles Across Various Languages. Biblioteka, 27(36), 269–294.



Ensuring the accuracy of information in Wikipedia articles requires reliable sources that encyclopedia readers can assess. However, judging source reliability can be subjective and can vary with the language and subject matter of the article. Consequently, each language version of Wikipedia may have its own guidelines for judging the trustworthiness of sources. Some Wikipedia citations lead to scientific resources, which are usually deemed more reliable than ordinary websites because they undergo rigorous peer review and are issued by established academic publishers. This implies that the data presented in these scientific sources have been examined by specialists in the relevant area, which gives a higher level of precision and trustworthiness. In this study, 332,424,439 references from 61,218,277 Wikipedia articles across 309 language versions were analyzed to identify citations to scientific publications. Additionally, OpenAlex was used to obtain unified metadata for the important information sources cited in multilingual Wikipedia.
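
The abstract does not say how citations to scientific publications were recognized among the references. One common approach, shown here only as a minimal sketch and not as the method used in the study, is to look for DOI strings in the reference text and count how often each one appears. The regular expression, the function names and the sample references below are all illustrative assumptions; the DOIs are made up.

```python
import re
from collections import Counter

# Assumed DOI shape: "10.", a 4-9 digit registrant prefix, "/", then a suffix.
# The suffix stops at whitespace and at common wikitext/HTML delimiters.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>|}\]]+')


def extract_dois(reference_text):
    """Return the lower-cased DOIs found in one reference string."""
    dois = []
    for match in DOI_PATTERN.findall(reference_text):
        # Drop trailing punctuation that belongs to the sentence, not the DOI.
        dois.append(match.rstrip(".,;)").lower())
    return dois


def count_cited_dois(references):
    """Count, for each DOI, the number of references that mention it."""
    counts = Counter()
    for ref in references:
        # set() so a DOI repeated inside one reference is counted once.
        counts.update(set(extract_dois(ref)))
    return counts


# Hypothetical reference strings, not taken from any Wikipedia article.
sample_references = [
    "{{cite journal |title=Example A |doi=10.1234/example.001 |year=2020}}",
    "Example B, https://doi.org/10.1234/EXAMPLE.001.",
    "{{cite web |url=https://example.org/news |title=No DOI here}}",
    "{{cite journal |title=Example C |doi=10.5678/sample-2}}",
]

counts = count_cited_dois(sample_references)
for doi, n in counts.most_common():
    print(doi, n)
```

DOIs are lower-cased before counting because they are case-insensitive, so the same publication cited in different language versions collapses to one key. Identifiers normalized this way could then be matched against a catalog such as OpenAlex; that lookup is not shown here.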

