Summer Tutorial on Statistical Natural Language
Processing
By Dr. Keh-Yih Su and Prof. Jing-Shin Chang
---
The Natural Language Computing Group at Microsoft
Research Asia (MSRA) is pleased to announce that we
have the honor of inviting a world-renowned scholar in
Natural Language Processing, Dr. Keh-Yih Su, and his
colleague, Prof. Jing-Shin Chang, to MSRA to hold a
summer tutorial on Statistical Natural Language
Processing (SNLP).
We cordially invite everyone interested in SNLP to join
us in this tutorial for the valuable opportunity to
explore fundamental and advanced topics in the field.
Expenses:
1. The tutorial is free of charge, but attendees must cover their own travel and living expenses.
2. Microsoft Research Asia will provide the lecture materials, drinks, and lunch.
For more details about the tutorial and for
registration information, please click Registration.
Note:
Please note that attendance is limited. If the number
of registrants exceeds the capacity of the lecture
room, priority will be given to students and teachers
from universities.
Short Bios of Dr.
Keh-Yih Su and Prof. Jing-Shin Chang
Dr. Su
received his Ph.D. degree from the University of
Washington, Seattle, USA in 1984. After graduation, he
became a professor at the National Tsing-Hua University
in Taiwan. In 1985, he started an English-to-Chinese
Machine Translation project (which was later named
BehaviorTran) and became its director. In 1988, this
project was transferred to the Behavior Design
Corporation (BDC), which was founded in the
Science-Based Industrial Park (SBIP), Hsinchu, Taiwan,
to support the long-term R&D work of the
BehaviorTran Machine Translation System. Since then, the
company has been providing translation services to many
well-known international corporations. In 1998, Dr. Su
left the university to join the Behavior Design Corp. He
is now the general manager of the company.
Dr. Su
has been a leading figure in the field of Natural
Language Processing since 1986. In 1991, at the
Machine Translation Summit III conference, he proposed
and presented the Corpus-Based Statistics-Oriented
(CBSO) approach, which adopts a two-way training
mechanism and avoids the problems induced by purely
statistical approaches (e.g., IBM 1990). He has
published over 100 technical papers, has served on the
editorial boards of several international journals, and
has been an invited speaker at numerous MT-related
international conferences.
Professor
Jing-Shin Chang received his BS degree from the
Department of Electrical Engineering of National
Tsing-Hua University (NTHU), Hsinchu, Taiwan, in 1984.
In 1986, he joined the Machine Translation Research
Group of NTHU, which had been led since 1985 by
Professor Keh-Yih Su of the EE Department. He became
the project leader of the MT Research Group in 1987.
During this period, he was the principal designer of a
new-generation MT parser and participated in most of
the major research and development work on the ArchTran
(now known as BehaviorTran) Machine Translation System.
In 1988, he began studying for his MS and PhD degrees
at National Chiao-Tung University (NCTU), Hsinchu,
Taiwan, and NTHU, respectively, while maintaining a
close cooperative relationship with the Behavior Design
Corporation in all aspects of MT R&D work. He
received his MS degree from NCTU in 1990 and his PhD
from NTHU in 1997. From 1997 to 2000, he was a senior
researcher at the Behavior Design Corporation, working
on the next-generation CBSO (Corpus-Based
Statistics-Oriented) MT system. In 2000, he became an
Assistant Professor at the Department of Computer
Science and Information Engineering, National Chi-Nan
University (NCNU), Puli, Nantou, Taiwan, where he
continues to work on various challenging MT research
topics.
The Two-Day Program:

Aug. 17: Introduction to Statistical Natural Language Processing (Mainly Covering Supervised Learning)

Part I: Introduction (1)
  Problems and Characteristics of Natural Language Processing
Part II: Introduction (2)
  What, When, and Why for the Statistical Approach
Part III: Basic Concepts and Background
  Feature Space, Probability, Estimators, Stochastic Processes, Data Set Classification, and Performance Measures
Part IV: Typical Applications
  Word Segmentation, Tagging, Parse Tree Selection, Bilingual Corpus Alignment
Part V: Techniques for Improving Performance
  Smoothing, Class-Based Models, Adaptive Learning, Tips for Checking
Part VI: Advanced Topics
  Support Vector Machines, Maximum Entropy Models
Appendix: Related Techniques
  Parameter Estimation, Fractional Factorial Experiment Design, Decision Trees
Aug. 18: Unsupervised Learning for Natural Language Processing

Part I: Introduction
  What and When for Unsupervised Learning; Why It Is Getting More Popular
Part II: Basic Concepts and Background (Using EM as an Example)
  Incomplete Data Space, Learnability
Part III: Typical Unsupervised Learning Algorithms: Viterbi & EM
  Procedures, Characteristics
Part IV: Potential Traps & Sources of Problems
  Various Mismatches, Model Deficiencies, Local Maxima, and Over-fitting
Part V: Suggested Strategies for Better Performance
  Lessons Learned from Past Experience; Recommended Procedures for Unsupervised Learning
Part VI: Co-Training
  Basic Principle; Example: Chinese Compound Noun Extraction
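Day 2 uses EM as its running example of learning from incomplete data. As a concrete illustration (not from the tutorial materials), here is a minimal sketch of EM for a mixture of two biased coins, where the identity of the coin behind each flip session is the hidden variable; the data and starting biases are invented for illustration.

```python
def em_two_coins(flips, iters=50):
    """EM for a mixture of two biased coins.
    Each (heads, tails) pair in `flips` comes from one coin chosen at
    random; which coin produced it is the hidden (incomplete) variable."""
    pA, pB = 0.6, 0.5  # asymmetric initial guesses (a symmetric start would stall)
    for _ in range(iters):
        # E-step: posterior responsibility of coin A for each session.
        exp_A_h = exp_A_t = exp_B_h = exp_B_t = 0.0
        for h, t in flips:
            like_A = pA ** h * (1 - pA) ** t
            like_B = pB ** h * (1 - pB) ** t
            w = like_A / (like_A + like_B)
            exp_A_h += w * h
            exp_A_t += w * t
            exp_B_h += (1 - w) * h
            exp_B_t += (1 - w) * t
        # M-step: re-estimate each coin's bias from expected counts.
        pA = exp_A_h / (exp_A_h + exp_A_t)
        pB = exp_B_h / (exp_B_h + exp_B_t)
    return pA, pB

# Toy data (invented): 10-flip sessions from a heads-heavy coin and a fair-ish coin.
flips = [(9, 1), (8, 2), (4, 6), (5, 5), (9, 1)]
pA, pB = em_two_coins(flips)
print(round(pA, 2), round(pB, 2))
```

The same structure (posterior weights in the E-step, weighted re-estimation in the M-step) underlies the Viterbi and EM procedures of Part III, and the toy example also exposes the Part IV hazards: a poor initialization can leave EM at a local maximum.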