______________________________________________________________

CALL FOR PAPERS

ACL-2003 Workshop on Multiword Expressions:
Analysis, Acquisition and Treatment

12 July 2003, Sapporo, Japan
______________________________________________________________

WEBSITES

Workshop website: http://www.cl.cam.ac.uk/users/alk23/mwe/mwe.html
ACL website: http://www.ec-inc.co.jp/ACL2003/

WORKSHOP DESCRIPTION

Multiword expressions (MWEs) include a large range of linguistic phenomena, such as phrasal verbs (e.g. "add up"), nominal compounds (e.g. "telephone box"), and institutionalized phrases (e.g. "salt and pepper"), and they can be syntactically and/or semantically idiosyncratic in nature. MWEs are used frequently in everyday language, usually to express precisely those ideas and concepts that cannot be compressed into a single word.

A considerable amount of research has been devoted to this subject, both in terms of theory and practice, but despite increasing interest in idiomaticity within linguistic research, there is still a gap between the needs of NLP and the descriptive tradition of linguistics.

Owing to the lack of adequate resources to identify and treat MWEs properly, they pose a real challenge for NLP. Most real-world applications tend to ignore MWEs or address them simply by listing them. However, it is clear that successful applications will need to be able to identify and treat them appropriately. This particularly applies to the many applications which require some degree of semantic processing (e.g. machine translation, question-answering, summarisation, generation).

In recent years there has been a growing awareness in the NLP community of the problems that MWEs pose and the need for their robust handling. A considerable amount of research has been conducted in this area, some within large research projects dedicated to MWEs (e.g. the Multiword Expression Project). There is also a growing interest in MWEs in projects focused on tasks such as parsing (e.g.
Robust Accurate Statistical Parsing (RASP)) and word sense disambiguation (e.g. MEANING - Developing Multilingual Web-scale Language Technologies), which are required by real-world applications.

Previous workshops on MWEs have focused on certain MWE types, notably collocations, terminology and named entities. There are, however, further subtypes of MWEs which are highly relevant for NLP tasks but which have not to date received specific attention. One example is lexicalised (non- or semi-compositional) MWEs, which raise specific issues for applications that require semantic interpretation.

TARGET AUDIENCE

This workshop is intended to bring together NLP researchers working on all areas of MWEs. The objective is to summarise what has been achieved in the area, to establish common themes between different approaches, and to discuss future trends, with particular emphasis on addressing the problems that different MWE (sub)types pose for real-world NLP applications.

AREAS OF INTEREST

Papers are invited on, but not limited to, the following topics:

* Theoretical research on MWEs
* MWE taxonomies, classifications and databases
* Corpus-based analysis of MWEs
* Cross-lingual analysis of MWE types, use, and behaviour
* Methods for identification and extraction of MWEs (machine learning, statistical, example- or rule-based, or hybrid)
* Evaluation of MWE extraction methods
* Integration of MWE data into grammars and NLP applications (e.g. machine translation and generation)
* Problems MWEs (or MWE types) pose for NLP applications and solutions proposed

Papers can cover one or more of these areas.

SUBMISSION INFORMATION

Papers should be submitted electronically in Postscript or PDF format to: mwe@cslab.kecl.ntt.co.jp.

Submissions should conform to the two-column format of ACL proceedings and should not exceed eight (8) pages, including references. We strongly recommend the use of ACL-2003 style files, also available from the ACL-2003 website.
The subject line of the submission email should be "ACL2003 WORKSHOP PAPER SUBMISSION".

As reviewing will be blind, the body of the paper should not include the names or affiliations of the authors. The following identification information should be sent in a separate email with the subject line "ACL2003 WORKSHOP ID PAGE":

Title:           title of paper
Authors:         list of all authors
Keywords:        up to five topic keywords
Contact author:  email address of author of record (for correspondence)
Abstract:        abstract of paper (not more than 5 lines)

Notification of receipt will be emailed to the contact author.

IMPORTANT DATES

Submission deadline:      05 April 2003
Acceptance notification:  03 May 2003
Final version deadline:   24 May 2003
Workshop date:            12 July 2003

WORKSHOP CHAIRS

Francis Bond
  NTT Communication Science Laboratories, Japan
  (bond@cslab.kecl.ntt.co.jp)

Anna Korhonen
  University of Cambridge, UK
  (Anna.Korhonen@cl.cam.ac.uk)

Diana McCarthy
  University of Sussex, UK
  (dianam@cogs.susx.ac.uk)

Aline Villavicencio
  University of Cambridge, UK
  (Aline.Villavicencio@cl.cam.ac.uk)

PROGRAM COMMITTEE

Anne Abeillé (Université Paris 7, France)
Timothy Baldwin (Stanford University, USA)
Ted Briscoe (University of Cambridge, UK)
Nicoletta Calzolari (Istituto di Linguistica Computazionale, Italy)
Ido Dagan (Lingomotors, Israel)
Christiane Fellbaum (Princeton University, USA)
Chuck Fillmore (UC Berkeley, USA)
Nancy Ide (Vassar College, USA)
Kyo Kageura (National Institute of Informatics, Japan)
Brigitte Krenn (Austrian Research Institute for Artificial Intelligence, Austria)
Maria Lapata (University of Edinburgh, UK)
Simonetta Montemagni (Istituto di Linguistica Computazionale, Italy)
Kentaro Ogura (NTT Cyber Space Laboratories, Japan)
Darren Pearce (University of Sussex, UK)
Ivan Sag (Stanford University, USA)
Tom Wasow (Stanford University, USA)
Annie Zaenen (Xerox PARC, USA)

REGISTRATION

Workshop registration information will be posted at a later date. The registration fee will include attendance at the workshop and a copy of the workshop proceedings.