Introduction to statistical methods in machine translation

Marcin Junczys-Dowmunt

Abstract

This article gives a concise introduction to the basic mathematical concepts of statistical translation models as introduced by Brown et al. (1993) in their groundbreaking work The Mathematics of Statistical Machine Translation: Parameter Estimation. We concentrate on a simplified description of the first two translation models, known as IBM Model 1 and IBM Model 2. A major aim of this work is to serve as tutorial material for students of computational linguistics, mathematics, or computer science; we therefore supplement the original formulas of Brown et al. (1993) with numerous comments, additional examples, and step-by-step explanations. For both models, the calculations for a small parallel corpus are worked through in detail.
 

How to cite
Junczys-Dowmunt, M. (2008). Wprowadzenie do metod statystycznych w tłumaczeniu automatycznym [Introduction to statistical methods in machine translation]. Investigationes Linguisticae, 16, 44-66. https://doi.org/10.14746/il.2008.16.5
Section
Articles

References

  1. Al-Onaizan, Y., J. Curin, M. Jahr, K. Knight, J. Lafferty, I. Melamed, F. Och, D. Purdy, N. Smith and D. Yarowsky (1999). Statistical machine translation. Tech. rep., JHU workshop.
  2. Brown, P. F., V. J. Della Pietra, S. A. Della Pietra and R. L. Mercer (1993). The mathematics of statistical machine translation: parameter estimation. Comput. Linguist., 19(2):263-311.
  3. Dempster, A. P., N. M. Laird and D. B. Rubin (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1-38.
  4. Jurafsky, D. and J. H. Martin (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition (International Edition). Prentice Hall.
  5. Kay, M. and M. Röscheisen (1993). Text-translation alignment. Comput. Linguist., 19(1):121-142.
  6. Knight, K. (1999). A Statistical MT Tutorial Workbook. Unpublished.
  7. Manning, C. D. and H. Schütze (1999). Foundations of Statistical Natural Language Processing. The MIT Press.
  8. Melamed, I. D. (2000). Models of translational equivalence among words. Comput. Linguist., 26(2):221-249.
  9. Och, F. J. and H. Ney (2003). A systematic comparison of various statistical alignment models. Comput. Linguist., 29(1):19-51.
  10. Somers, H. (2001). Bilingual parallel corpora and Language Engineering.