The Most Cited Scientific Information Sources in Wikipedia Articles Across Various Languages

Keywords

Wikipedia
references
sources of information
wiki
scientific sources
data science
data exploration
OpenAlex

How to Cite

Lewoniewski, W. (2024). The Most Cited Scientific Information Sources in Wikipedia Articles Across Various Languages. Biblioteka, 27(36), 269–294. https://doi.org/10.14746/b.2023.27.12


Abstract

Ensuring the accuracy of information in Wikipedia articles requires reliable sources that readers of the encyclopedia can assess. However, judging source reliability can be subjective and can vary with the language and subject matter of the article. Consequently, each language version of Wikipedia may have its own guidelines for judging the trustworthiness of sources. Some Wikipedia citations lead to scientific resources, which are usually deemed more reliable than websites because of their rigorous peer-review procedures and publication by established academic publishers. This implies that the data presented in these scientific sources have been examined by specialists in the relevant area, which provides a higher level of precision and trustworthiness. In this study, 332,424,439 references from 61,218,277 Wikipedia articles across 309 language versions were analyzed to identify citations to scientific publications. Additionally, OpenAlex was used to obtain unified metadata for the important information sources of multilingual Wikipedia.
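
The abstract does not describe how scientific citations were identified, so the following is only a minimal sketch of one common approach: scanning reference text for DOIs and counting them. The regular expression, the function names, and the sample data are assumptions made for illustration and are not taken from the study. The last helper builds a lookup URL for the public OpenAlex API without sending a request.

```python
import re
from collections import Counter

# Assumed heuristic: "10.", a 4-9 digit registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')


def extract_dois(reference_text):
    """Return lower-cased DOIs found in one reference string."""
    dois = []
    for match in DOI_PATTERN.findall(reference_text):
        # Drop trailing punctuation that usually belongs to the sentence.
        # This is a simplification: a few real DOIs do end in ")".
        dois.append(match.rstrip(".,;)").lower())
    return dois


def count_cited_dois(references):
    """Count how many references mention each DOI."""
    counts = Counter()
    for ref in references:
        # Use a set so a DOI repeated inside one reference counts once.
        counts.update(set(extract_dois(ref)))
    return counts


def openalex_work_url(doi):
    """Build an OpenAlex API URL for a work identified by its DOI."""
    return f"https://api.openalex.org/works/https://doi.org/{doi}"


if __name__ == "__main__":
    # Hypothetical reference strings, as they might appear in two articles.
    sample = [
        "Retrieved from https://doi.org/10.14746/b.2023.27.12.",
        "DOI: 10.14746/B.2023.27.12",
        "A news website without any identifier",
    ]
    for doi, n in count_cited_dois(sample).most_common():
        print(n, doi, openalex_work_url(doi))
```

Lower-casing the DOI before counting lets citations that differ only in letter case be merged, which matters when the same publication is cited in several language versions.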

