Course Contents & Slides (Last update: 2003/06/16)

Double-Click the Underlined Document Numbers to Download the Slides

Copyright Statement

All documents are copyrighted by their respective authors.
The course contents in the slides here are intended only for non-profit
and educational purposes. Distribution for anything other than personal and
educational use may violate the law and incur legal liability; please do not do so.


Natural Language Processing (#215021)

Course Slides

00. NLP Architecture, Machine Translation and Applications (3.2 M)
    - Table of Contents
    - Machine Translation as a Generic NLP System
    - NLP Applications (and Related Problems)
    - Basic Linguistics
01. Words and Tokenization (2.6 M)
    - English Morphological Analysis
    - Chinese Word Segmentation (a First Case Study of Statistical NLP Modeling)
02. Basic NLP Theories (6.6 M)
    - Basic Probability and Statistics
    - Basic Decision Theory
    - Language Models for Regular Languages
    - N-gram Model (+Smoothing)
    - Hidden Markov Model
    - Parameter Estimation
    - MLE
    - Data Sparseness, Training vs. Testing and Overfitting
    - Smoothing (Add-One, Add-Lambda, Witten-Bell)
    - Backoff Smoothing
    - Good-Turing, Modified GT, Katz Backoff Smoothing
    - EM, Viterbi, Maximum Entropy, Co-Training (not included)
03. Word Class and POS Tagging (264 K) (See also HMM)
04. Sentence Structure and Syntactic Parsing (4.2 M)
    - CFG, CYK, PCFG, P-CYK
    - Unsupervised Training: Trainable Grammar, Inside-Outside Algorithm
05. Alignment and Statistical MT (4.8 M)
06. Lexicon Acquisition (1.8 M)

Textbooks & References

[1] Foundations of Statistical Natural Language Processing, by Christopher D. Manning and Hinrich Schütze, MIT Press, 1999.
[2] Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, by Daniel Jurafsky & James H. Martin, Prentice Hall, 2000.
    - Errata: (Local copy)
    - Errata: (Local copy) - Probabilistic CYK Algorithm
    - Figures of the book: Figs: (Local copy), Other Figs: (Local copy)
[3] Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, by XueDong Huang, Alex Acero and Hsiao-Wen Hon, Prentice Hall PTR, Upper Saddle River, NJ 07458, USA, 2001.
    (Ch. 1, Sec. 2.3-2.5, Ch. 3, Ch. 4, Sec. 5.8, Ch. 8, Ch. 11, Ch. 12, Ch. 13, Ch. 14 and Sec. 17.3-17.5 will be particularly interesting for Statistical NLP.)
[4] Advanced NLP Issues ... (beyond algorithmic/application points of view ...)
    [1] Modeling Problems
        [1.1] Features
        [1.2] Dependency
    [2] Estimation Problems
        [2.1] Performance Metrics
        [2.2] Discrimination
        [2.3] Robustness
        [2.4] Adaptive Training
        [2.5] Supervised vs. Unsupervised Training - Why and When

Lectures/Tutorial Courses/Invited Talks

[in Publication List]

Conference Papers

[11] Chang, J.-S., Y.-F. Luo and K.-Y. Su, "GPSM: A Generalized Probabilistic Semantic Model for Ambiguity Resolution," Proceedings of ACL-92, pp. 177--184, 30th Annual Meeting of the Association for Computational Linguistics, University of Delaware, Newark, DE, USA, 28 June--2 July, 1992.
[13] Tung-Hui Chiang, Jing-Shin Chang, Ming-Yu Lin and Keh-Yih Su, "Statistical Models for Word Segmentation and Unknown Word Resolution," Proceedings of ROCLING-V, ROC Computational Linguistics Conference V, pp. 123--146, National Taiwan University, Taipei, Taiwan, ROC, Sep. 18--20, 1992. (PDF version)

Journals and Books

[6] Tung-Hui Chiang, Jing-Shin Chang, Ming-Yu Lin and Keh-Yih Su, "Statistical Word Segmentation," in C.-R. Huang, K.-J. Chen and Benjamin K. T'sou (eds.): Journal of Chinese Linguistics, Monograph Series Number 9, Readings in Chinese Natural Language Processing, pp. 147--173, University of California, Berkeley, 1996.
[7] K.-Y. Su, Tung-Hui Chiang, and Jing-Shin Chang, "An Overview of Corpus-Based Statistics-Oriented (CBSO) Techniques for Natural Language Processing," International Journal of Computational Linguistics & Chinese Language Processing (CLCLP), vol. 1, no. 1, pp. 101--157, August 1996.
[8] Yu-Ling Una Hsu, Jing-Shin Chang, and Keh-Yih Su, "Computational Tools and Resources for Linguistics Studies," International Journal of Computational Linguistics & Chinese Language Processing (CLCLP), vol. 2, no. 1, pp. 1--39, 1997.


Su, K.-Y., T.-H. Chiang and J.-S. Chang, "Introduction to Corpus-based Statistics-oriented (CBSO) Techniques," Pre-Conference Workshop on Corpus-based NLP, ROC Computational Linguistics Conference VII, National Tsing-Hua Univ., Taiwan, ROC, Aug. 1994.
    - Part I: Introduction (PDF/4) (PS/4) (PDF) (PS)
    - Part II: Basic Concepts (PDF/4) (PS/4) (PDF) (PS)
    - Part III: Techniques (PDF/4) (PS/4) (PDF) (PS)
    - Errata: Corrections to Parts I-III (TXT)
