CALL FOR PAPERS
(Deadline Extended: Oct 31, 2003)

"Recent Advances in Statistical Language Modeling - Beyond N-grams"

Special issue of ACM Transactions on Asian Language Information Processing (TALIP)

Guest Editors:
Jianfeng Gao, Microsoft Research Asia, Beijing, China
Chin-Yew Lin, ISI, University of Southern California, USA

Website: http://www.isi.edu/~cyl/TALIP

Theme:
Statistical language modeling (SLM) aims to estimate the probability distribution of various linguistic units, such as words, sentences, and documents, for use in many natural language applications. Over the last two decades, many attempts have been made to improve the state of the art. In this issue, we solicit papers showing recent advances in SLM in both theory and applications.

Theory:
It is ironic that the most popular language model (n-grams) uses very little language knowledge. In recent years, many attempts have been made to "put language back into language model". But little improvement has been achieved so far in realistic applications, due to two major obstacles: (1) the number of parameters in knowledge-based models is usually too large to estimate; (2) the construction and use of these models require a large annotated training corpus and a decoder that assigns linguistic structure, which are not always available. We are seeking ideas that enhance our understanding of these core problems in SLM. We encourage submissions that describe principles, concepts, or models on which work in SLM could be based.

Application:
SLM has been successfully applied in many applications, such as speech recognition, Asian language input, information retrieval, and machine translation. We welcome submissions that demonstrate significant improvement in performance using knowledge-based models, present novel applications of SLM in new areas such as paraphrasing, question answering, and text summarization, or show how SLM techniques can be used in novel ways to improve system performance.
Areas of interest include, but are not limited to:

- Theory of statistical language modeling (SLM), including
  o Formal models (N-gram model, HMM, maximum entropy model, structural language model, word/class model, grammar model, etc.)
  o Parameter estimation (model smoothing/combination/adaptation)
  o Evaluation
  o Resources (tagged training data) for SLM
- Applications of SLM, including
  o Paraphrasing
  o Question answering
  o Text summarization
  o Speech recognition
  o Asian language input
  o Information retrieval
  o Named entity recognition
  o Text generation
  o Machine translation
- Other statistical natural language processing methods beyond the scope of SLM, e.g. statistical parsing, machine learning for NLP, etc.

The tentative plan is to publish this special issue as volume 3, issue 1, January 2004.

Instructions for Submission

Papers should follow the style guidelines for the ACM Transactions on Asian Language Information Processing (http://www.cintec.cuhk.edu.hk/~talip/web/). Papers should be sent to the guest editors by the submission date listed below. Submissions should be made either:

- Electronically to jfgao@microsoft.com. The "Subject:" line should be: TALIP Special Issue Submission. The following formats are acceptable:
  - PostScript
  - Adobe PDF
  If you cannot produce an electronic version in either of these formats, or if the editor informs you of a problem with your electronic submission, then please follow the instructions for hardcopy submission.

- Or, as three hardcopies to:

  Jianfeng Gao
  Microsoft Research Asia
  5F, Beijing Sigma Center
  No. 49, Zhichun Road, Haidian District
  Beijing, 100080, P.R.C

  or

  Chin-Yew Lin
  USC/Information Sciences Institute
  4676 Admiralty Way
  Marina del Rey, CA 90292
  USA

Important Dates

Call for Papers: April 1, 2003
Submission of Papers: October 31, 2003 (extended!)
Notification of Acceptance: January 15, 2004
Final Version Due: March 15, 2004
Special Issue Date: June, 2004