Title: Corpus-Based Methods in Chinese Morphology

Speaker: Dr. Richard Sproat
AT&T Labs -- Research
180 Park Ave, Room B207
Florham Park, NJ 07932, USA
+1-973-360-8490
rws@research.att.com

Summary: The course will cover selected topics in corpus-based linguistics and statistical natural language processing. I will show how some of these techniques can be applied to problems specific to Chinese. Topic areas will include statistical analysis of morphological productivity in modern Chinese; morphological phenomena such as abbreviation (suoxie); questions of morphological structure in earlier forms of Chinese; and more practical NLP problems such as Chinese word segmentation and named entity recognition.

Tutorial Outline: 3-hour tutorial, assuming a 15-minute break in the middle. This is tentative and will likely change as I develop materials...

0:00 - 0:30 Issues in Chinese morphology
0:30 - 1:30 Measures of association and measures of productivity
1:30 - 1:45 Break
1:45 - 2:00 Productivity in Chinese morphology: Root compounds
2:00 - 2:30 Chinese word segmentation
2:30 - 2:45 Theories of Suoxie
2:45 - 3:00 Measures of association and concepts of ancient Chinese morphology

Bio: Richard Sproat has worked in many areas of speech and computational linguistics, including computational morphology, corpus-based methods, text-to-speech synthesis, the computational analysis of writing systems, and text-to-scene conversion, as well as a number of areas of theoretical linguistics. He has also done research in Chinese linguistics, both theoretical work on Chinese morphology and more practical work on Chinese word segmentation and text-to-speech synthesis.