CALL FOR PAPERS

SPECIAL ISSUE of COMPUTATIONAL LINGUISTICS

Web as Corpus

Guest editors

    Adam Kilgarriff, ITRI, University of Brighton and Oxford University Press
    Gregory Grefenstette, Clairvoyance Corporation

The Web is an immense, multilingual, freely available corpus. As with other large new corpora, computational linguists have been stimulated by its presence. Web research includes many of the most talked about papers of recent ACL and other meetings (eg Resnik, ACL '99; Brill, "Does the web change everything?", ACL SIGNLL '01).

In comparison with most corpora studied to date, the web is heterogeneous and noisy. Methods for handling the noise, and extracting and exploiting subcorpora meeting particular criteria, are being developed by a widening population ranging from students who realise that it is an obvious place to obtain their corpus for free, to companies who seek to use HLT techniques on datasets other than the ones HLT researchers usually use.

NLP can both give to, and take from, the web (distinction due to Dragomir Radev). It can give to the web technologies such as summarisation, MT and question-answering. But the giving side of the equation looks only at short-to-medium term goals. For the longer term, for 'giving' as well as for other purposes, a deeper understanding of the linguistic nature of the web and its potential for CL/NLP is required. For that, we must take the web itself, in whatever limited way, as an object of study, and uncover what it has to tell us about the nature of language.

The Special Issue will focus on how we can use the web, rather than how we can help web users.

The issues which we will expect Special Issue papers to cover include:

    Lexical data derived from the Web
    Classifying Web language; the range of text types on the Web
    Mapping Web documents onto existing ontologies; implications for ontologies
    Clustering in an open corpus
    The multilingual Web as a resource for translation
    CL/HLT engagement with the Semantic Web

SCHEDULE

    Papers due: 30 April 2002

SUBMISSION PROCEDURE

Initial submissions should be sent to:

    1. Guest Editors
       adam.kilgarriff@itri.brighton.ac.uk, grefen@clairvoyancecorp.com

    2. Publishing Editor
       Julia Hirschberg (julia@research.att.com)

For initial submissions only, authors should send electronic copies (postscript, pdf, rtf, or doc) to both the Guest Editors and the Publishing Editor. Please indicate that the submission is for the Special Issue of Computational Linguistics: Web as Corpus.

Questions about submissions should be directed to the two Guest Editors, rather than the Journal or Publishing Editors.