A cross-linguistic database of phonetic transcription systems

Cormac Anderson
Tiago Tresoldi
Thiago Chacon
Anne-Maria Fehn
Mary Walworth
Robert Forkel
Johann-Mattis List


Contrary to what non-practitioners might expect, the systems of phonetic notation used by linguists are highly idiosyncratic. Not only do linguistic subfields disagree on the specific symbols they use to denote speech sounds, but considerable variation can also be found within large databases of sound inventories. Inspired by recent efforts to link cross-linguistic data across different resources with the help of reference catalogues (Glottolog, Concepticon), we present initial efforts to link different phonetic notation systems to a catalogue of speech sounds. This is achieved with a database and an accompanying software framework that uses a limited but easily extendable set of non-binary feature values, which allows different transcription systems to be registered quickly and conveniently while also linking to additional datasets with restricted inventories. Linking transcription systems enables us to translate conveniently between them, while linking sounds to databases gives users quick access to various kinds of metadata, including feature values, statistics on phoneme inventories, and information on prosody and sound classes. To demonstrate the feasibility of this enterprise, we provide an initial version of our cross-linguistic database of phonetic transcription systems (CLTS), which currently registers five transcription systems and links to fifteen datasets, together with a web application that lets users test the automatic translation across transcription systems.
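The idea of translating between transcription systems through a shared set of feature values can be illustrated with a minimal sketch. It is a toy model only: the table names `SYSTEM_A` and `SYSTEM_B`, the helper functions, the feature labels, and the three sounds per system are assumptions made for this illustration (the symbols loosely follow IPA and X-SAMPA conventions), and none of it is taken from the CLTS data or its software interface.

```python
from typing import Dict, FrozenSet, List

# A sound is identified by its bundle of feature values (toy labels, assumed here).
Features = FrozenSet[str]


def sound(*features: str) -> Features:
    """Build an order-independent feature bundle."""
    return frozenset(features)


# Hypothetical transcription system A: grapheme -> feature bundle.
SYSTEM_A: Dict[str, Features] = {
    "ʃ": sound("voiceless", "post-alveolar", "fricative", "consonant"),
    "ŋ": sound("voiced", "velar", "nasal", "consonant"),
    "a": sound("unrounded", "open", "front", "vowel"),
}

# Hypothetical transcription system B: same sounds, different graphemes.
SYSTEM_B: Dict[str, Features] = {
    "S": sound("voiceless", "post-alveolar", "fricative", "consonant"),
    "N": sound("voiced", "velar", "nasal", "consonant"),
    "a": sound("unrounded", "open", "front", "vowel"),
}


def translate(
    segments: List[str],
    source: Dict[str, Features],
    target: Dict[str, Features],
    missing: str = "?",
) -> List[str]:
    """Translate segments from one system to another via shared feature bundles.

    Segments unknown to the source system, or without a counterpart in the
    target system, are replaced by the `missing` marker.
    """
    # Invert the target table so that feature bundles point back to graphemes.
    by_features = {features: grapheme for grapheme, features in target.items()}
    return [by_features.get(source.get(segment), missing) for segment in segments]


print(translate(["ʃ", "a", "ŋ"], SYSTEM_A, SYSTEM_B))  # ['S', 'a', 'N']
```

Because both tables point to the same feature bundles, registering a further system in this sketch only requires one more grapheme-to-features table; no pairwise conversion tables between systems are needed.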


How to Cite
Anderson, C., Tresoldi, T., Chacon, T., Fehn, A.-M., Walworth, M., Forkel, R., & List, J.-M. (2018). A cross-linguistic database of phonetic transcription systems. Yearbook of the Poznań Linguistic Meeting, 4(1), 21-53. https://doi.org/10.2478/yplm-2018-0002

