

                           CALL FOR PAPERS

                            ACL Workshop:

                          COMPARING CORPORA

                             October 2000

            Hong Kong University of Science and Technology


THEME
=====

Anyone who has worked with corpora will be all too aware of
differences between them.  Depending on the differences, it may, or
may not, be reasonable to expect results based on one corpus to also
be valid for another.  It may, or may not, be appropriate for a
grammar, or parser, based on one to perform well on another.  It may,
or may not, be straightforward to port an application from a domain of
the first text type to a domain of the second.  Currently,
characterisations of corpora are mostly textual and at different
levels of generality.  A corpus is described as ``Wall Street
Journal'' or ``transcripts of business meetings'' or ``foreign
learners' essays (intermediate grade)''.  It would be desirable to be
able to place a new corpus in relation to existing ones, and to be
able to quantify similarities and differences.

Allied to corpus-similarity is corpus-homogeneity. An understanding of
homogeneity is a prerequisite to a measure of the similarity -- it makes
little sense to compare a corpus sampled across many genres, like the
Brown, with a corpus of weather forecasts, without first accounting
for the one being broad, the other narrow.

Given the centrality of corpora to contemporary language engineering,
it is remarkable how little research there has been to date on the
question.  Biber's work, coming from sociolinguistics, has made a
considerable impact, with various researchers in computational
linguistics taking forward the model (Biber 1989, 1995).  Studies in
text classification, genre and sublanguage are also salient, but it is
rarely evident how well the technologies developed in these fields are
suited to measuring corpus similarity or homogeneity.

The workshop will welcome contributions concerned with measuring and
comparing corpora using quantitative methods, from any field.


Where and when
==============

The workshop will last half a day and will be on either 7th or 8th
Oct, the main ACL conference being 3rd-6th Oct.  The venue will be
the same as for the main conference.

Submissions
===========

Submissions are limited to original, unpublished work. Papers may
not exceed 3200 words (exclusive of title page and references).
They must be received by July 8, 2000, in hard copy (4 copies),
PostScript, or RTF format.  Electronic delivery is to

compcorp@itri.brighton.ac.uk

and hard copies are to be mailed to 

Compcorp submission
ITRI
University of Brighton
Lewes Road
Brighton BN2 4GJ
United Kingdom


Important dates
===============

  July 8, 2000              Submission (of full-length paper)
  August 17, 2000           Acceptance notice
  September 5, 2000         Camera-ready paper due
  October 7 or 8, 2000      Workshop date

        
Co-ordinators
=============
        
Adam Kilgarriff - University of Brighton, UK
Tony Berber Sardinha - Catholic University of Sao Paulo, Brazil

Programme committee
===================

Douglas Biber           Northern Arizona University   
Jeremy Clear            University of Birmingham
Ted Dunning             MusicMatch Software, Inc.            
Tomaz Erjavec           Jozef Stefan Institute, Slovenia   
Pascale Fung            University of Science and Technology, Hong Kong   
Sylviane Granger (tbc)  Universite Catholique de Louvain  
Greg Grefenstette (tbc) XRCE, Grenoble     
Benoit Habert           LIMSI, France          
Przemek Kaszubski (tbc) Adam Mickiewicz University, Poland   
Adam Kilgarriff         University of Brighton   
David Lee               University of Lancaster   
Oliver Mason            University of Birmingham   
Doug Oard               University of Maryland   
Tony Rose               Canon Research           
Tony Berber Sardinha    Catholic University of Sao Paulo, Brazil  
George Tambouratzis     ILSP, Athens                
Christopher Tribble     King's College, London University  

Website
=======

http://www.itri.bton.ac.uk/events/compcorp
