Establishing Sentence Boundary – man vs machine
PDF (Polish)

How to Cite

Lipnicki, M. (2016). Establishing Sentence Boundary – man vs machine. Investigationes Linguisticae, 34, 11–20. https://doi.org/10.14746/il.2016.34.3

Abstract

The paper presents the results of an experiment testing how accurately humans and a computer program can reconstruct sentence boundaries in text, with a possible application in presenting the output of automatic speech recognition systems. The results are compared with an assessment of the acceptability of the resulting sentences.
