Workshop on Example-Based Machine Translation: Last Call for Papers

Hosted by MT-SUMMIT VIII
Santiago de Compostela, Spain, September 18-22, 2001

http://www.eamt.org/summitVIII/index.html
http://www.compapp.dcu.ie/~away/EBMT.html

Co-chairs: Michael Carl, IAI, Saarbrücken; Andy Way, Computer Applications, Dublin City University

In recent years, corpora of multilingual translated texts have become widely available for a number of languages. Notwithstanding the seminal paper by Nagao (1984), it is primarily since the early 1990s that such bilingual texts have been exploited in the area of Machine Translation (MT). The two main paradigmatic approaches which have been proposed are Statistics-based Machine Translation (SBMT) and Example-Based Machine Translation (EBMT).

A related variant of EBMT that we ignore here, despite its wide use in the localisation area, is that of Translation Memories (TM). In a TM, no new translations are created from the examples already in the system database: rather, the closest translation matches are proposed to the user for post-editing into the correct translation.

While translation memory systems are used in restricted domains, SBMT systems require training on huge, good quality bilingual corpora. As a consequence, TMs can hardly be applied as a general purpose solution to MT, and SBMT as yet cannot produce complex translations to the desired quality, even if such translations are given to the system in the training phase.

EBMT seeks to exploit a number of knowledge resources, such as linguistics and statistics, and symbolic and numerical techniques, and to integrate them into one framework. In this way, rule-based morphological, syntactic and/or semantic information is combined with knowledge extracted from bilingual texts which is then re-used in the translation process.
However, it is unclear how one might combine the different knowledge resources and techniques in an optimal way. The central question for EBMT is therefore: what can be learned from a bilingual corpus, and what needs to be provided manually? Furthermore, we remain uncertain as to how far the EBMT methodology can be pushed with respect to translation quality and/or translation purpose. Finally, it is unclear what such an approach implies for the size and quality of the reference translations, the (computational) complexity of the system, and its sizeability and transportability.

Given this background, we propose to organize a workshop in order to shed some light on these open questions, among others. We are seeking contributions which go beyond the purely statistical and/or rule-based approaches to MT. We welcome visionary and technical descriptions, reports of empirical research, as well as feasibility studies and system demonstrations.

We would welcome contributions on any of the following topics and sub-headings:

a. (semi-)automatic preparation of existing bi/multilingual corpora for EBMT
   a. extraction of bi/multilingual texts from the web
   b. preparation of treebanks for EBMT
   c. bi/multilingual alignment/bracketing/parsing
   d. inference of bi/multilingual grammar and transfer rules
b. description of `pure' EBMT systems
   a. knowledge resources used
   b. representation of numeric and symbolic knowledge
c. descriptions of `hybrid' systems integrating EBMT with rule-based or other methodologies
d. evaluation of EBMT results and/or comparison with other MT systems
e. considerations on domain-(in)dependence of EBMT systems
f. computational and/or system complexity of EBMT systems

Submissions

Submitted papers must describe original, previously unpublished work. Submissions must not exceed 12 pages. Contributions should be submitted to Michael Carl. Conference stylesheets are now available. Deadlines appear below.
There may also be poster sessions, subject to demand. We also strongly encourage system demonstrations, either in conjunction with paper presentations or as stand-alone demos during the lunch and coffee intervals. Please make it clear in your submission whether you plan to demonstrate your system, either as part of a paper presentation or as a stand-alone demo.

Publication

There will be a common publication format for all workshops, in line with the main conference proceedings. Please follow the guidelines for the main conference. In addition, we anticipate approaching relevant publishers to see whether there is interest in publishing the proceedings more widely.

Important Dates

a. January 2001: First call for papers/demos
b. 15.4.2001: Deadline for receipt of papers
c. 31.5.2001: Notification of acceptance
d. 15.7.2001: Final paper due
e. 18.9.2001: Workshop takes place

Attendance Fee

Details of registration procedures, including registration fees, have just been announced. The attendance fee for our workshop is Euro 50.

Organizing Committee

a. Sivaji Bandyopadhyay, India
b. Ralf Brown, USA
c. Michael Carl, Germany
d. Ilyas Cicekli, Turkey
e. Brona Collins, Belgium
f. Oliver Streiter, Taiwan
g. Stephan Vogel, Germany
h. Andy Way, Ireland