FIRST CALL FOR PAPERS

ACL04 WORKSHOP ON QUESTION ANSWERING IN RESTRICTED DOMAINS

Barcelona, Spain, 25-26 July 2004
Submission deadline: 15 March 2004

http://www.clt.mq.edu.au/Events/Conferences/acl04qa/

Much of the current research in question answering systems is driven by programs such as AQUAINT and evaluation exercises such as TREC, NTCIR and CLEF, all of which focus on open-domain question answering. The availability of large volumes of data (e.g. documents extracted from the World Wide Web) has prompted the development of systems that focus on shallow text processing.

But there are many document sets in restricted domains that are potentially valuable as a source for question answering systems. For example, the documentation pages of Unix and Linux systems would make an ideal corpus for QA systems targeted at users who want to know how to use these operating systems. There is a wealth of information in other technical documentation such as software manuals, car maintenance manuals, and encyclopedias of specific areas such as medicine. Users interested in these specific areas would benefit from QA systems targeted at their areas of interest.

Restricted domains typically have limited data available, and therefore conventional techniques based on data redundancy simply cannot be applied effectively. The scarcity of available data seems to call for a more targeted, NLP-intensive approach to QA. The use of additional corpora such as the WWW raises a number of interesting questions. For instance, will these corpora help or obstruct the proper functioning of an NLP-intensive approach to QA? And how do we find good pockets of information that are appropriate to the chosen domains?

On the other hand, restricted domains (e.g. law, medicine) have specific stylistic conventions. Often these domains use terminology that is not stored in conventional lexica.
Consequently, NLP approaches devised for open-domain systems may under-perform on these specific domains, thus raising the question of how portable these systems can be.

In this workshop we aim to answer some of the following questions:

* Are open-domain question answering techniques appropriate for QA in restricted domains?
* Can we use generic large corpora and/or the WWW? How can we identify specific pockets of information in these generic corpora?
* How can we use specific sources such as the CIA factbook, acronym lists, e-commerce sites (e.g. eBay), and specialized glossaries and encyclopedias? How can we discover new specific sources?
* What types of question-answering techniques are best for what types of restricted domains?
* Is it easy/possible/worthwhile to develop domain-independent QA systems for restricted domains? What would be the cost of porting a QA system to a specific domain?
* Are restricted domains more suitable than open domains to drive research in NLP?
* Is evaluation of restricted-domain QA systems different from that of open-domain QA systems?

We welcome papers that address any of the above questions or that focus on any of the following topics:

* Comparison between open-domain and restricted-domain QA
* Characterisation of the types of restricted domains and the technology required for QA on those domains
* Methodologies and/or tools for restricted-domain QA
* Description of specific restricted-domain QA systems
* Development of modules (e.g. document preselection, NE extraction, terminology extraction) for use in restricted-domain QA systems
* Portability of QA systems between different restricted domains
* Evaluation of restricted-domain QA systems

SUBMISSION PROCEDURE

Authors should submit full papers of at most 8 pages, including references and figures, following the main conference ACL style format (http://www.acl2004.org/aclstyles/style.html). The review will not be blind.
Submissions must be in PS or PDF format and they should be sent to diego@ics.mq.edu.au

PROGRAM COMMITTEE

Organizers:
-----------

Diego Mollá           Macquarie University, Australia
José Luis Vicedo      Alicante University, Spain

Committee:
----------

In alphabetical order by first name:

Anselmo Peñas         UNED, Spain
Antonio Ferrández     Alicante University, Spain
Bernardo Magnini      ITC-Irst, Italy
Bonnie Webber         University of Edinburgh, UK
Donna Harman          NIST, USA
Ellen Voorhees        NIST, USA
Fabio Rinaldi         University of Zurich, Switzerland
Felisa Verdejo        UNED, Spain
Graeme Hirst          University of Toronto, Canada
Horacio Rodríguez     Universitat de Catalunya, Spain
Ingrid Zukerman       Monash University, Australia
Jimmy Lin             MIT, USA
Johan Bos             University of Edinburgh, UK
Juergen Franke        DaimlerChrysler AG, Germany
Julio Gonzalo         UNED, Spain
Lynette Hirschman     MITRE, USA
Maarten de Rijke      University of Amsterdam, The Netherlands
Manuel Palomar        Alicante University, Spain
Mark Maybury          MITRE, USA
Michael Hess          University of Zurich, Switzerland
Pierre Zweigenbaum    DIAM, France
Richard Sutcliffe     University of Limerick, Ireland
Rolf Schwitter        Macquarie University, Australia
Sanda Harabagiu       University of Texas, USA

IMPORTANT DATES

* 15 March 04        Paper submission
* 15 April 04        Notification of acceptance
* 15 May 04          Camera ready version
* 25 or 26 July 04   Workshop (final date not yet determined)

CONTACT DETAILS

Diego Mollá
Centre for Language Technology
Division of Information and Communication Sciences
Macquarie University
New South Wales 2109
Australia

Tel. +61 2 9850 9531
Fax +61 2 9850 9551
diego@ics.mq.edu.au