Treebanks and Linguistic Theories 2002
20th and 21st September 2002, Sozopol, Bulgaria

Workshop motivation and aims

Treebanks are a language resource that provides annotations of natural languages at various levels of structure: at the word level, the phrase level, the sentence level, and sometimes also at the level of function-argument structure. Treebanks have become crucially important for the development of data-driven approaches to natural language processing, human language technologies, grammar extraction and linguistic research in general.

A number of ongoing projects are compiling representative treebanks for languages that still lack them (Spanish, Bulgarian, Portuguese, Turkish), and others are compiling treebanks for specific purposes for languages that already have them (English).

The practice of building syntactically processed corpora has shown that the more detailed the description of the data, the more theory-dependent it becomes (the Prague Dependency Treebank and other dependency-based treebanks such as the Italian treebank (TUT) or the Turkish treebank (METU); the Verbmobil HPSG Treebanks, the Polish HPSG Treebank, the Bulgarian HPSG-based Treebank, etc.). Therefore, the development of treebanks and the development of formal linguistic theories need to be more tightly connected in order to ensure the necessary information flow between them.

The workshop aims to be a forum for researchers and advanced students working in one or both of these areas. It will be held in conjunction with the summer school "Empirical Linguistics and Natural Language Processing" (Flagman hotel, Sozopol, Bulgaria).

Topics of interest

Papers should address the following topics:
- design principles and annotation schemes for treebanks;
- applications of treebanks in acquiring linguistic knowledge and NLP;
- the role of linguistic theories in treebank development;
- treebanks as a base for linguistic research;
- evaluation of treebanks;
- tools for creation and management of treebanks;
- standards for treebanks.

Two round-table discussions will be organized on the following topics:
- the relationship between the syntactic properties of a given language and the choice of linguistic theory for annotation purposes;
- the utility of treebanks for linguistic theorizing.

Important dates

Deadline for workshop abstract submission: 12th April 2002
Notification of acceptance: 20th May 2002
Final version of paper for workshop proceedings: 24th June 2002

Submissions

Papers should describe existing research connected to the topics of the workshop. Each presentation at the workshop will be 25 minutes long (20 minutes for presentation and 5 minutes for questions and discussion).

Each submission should include: title; author(s); affiliation(s); and the contact author's e-mail address, postal address, telephone and fax numbers.

Extended abstracts (maximum 1500 words, plain-text format or Postscript) should be sent to:
Name: Kiril Simov
Email: kivs@bgcict.acad.bg

Those who wish to attend without offering a paper are asked to briefly motivate their interest.

The final version of an accepted paper should not be longer than 4,000 words or 10 A4 pages. Instructions for formatting and presentation of the final version will be sent to authors upon notification of acceptance.

Program Committee

Erhard Hinrichs, Germany (co-chair)
Tilman Berger, Germany
Marek Swidzinski, Poland
Adam Przepiórkowski, Poland
Kiril Simov, Bulgaria (co-chair)
Vladimir Petkevic, Czech Republic
Anatolij N. Baranov, Russia
Sandra Kuebler, Germany
Kemal Oflazer, Turkey
Michael Barlow, USA
Tomaz Erjavec, Slovenia
Robert Engels, Norway
Andreas Wagner, Germany
Frank Richter, Germany
Manfred Sailer, Germany
Walter Daelemans, Belgium
Karel Oliva, Austria

Invited Speakers

Frantisek Cermak, Charles University Prague, Czech Republic
Hans Uszkoreit, DFKI, Saarbruecken, Germany (to be confirmed)

Workshop registration

The registration fee for the workshop is 150 Euro. The fee covers the following services: a copy of the workshop proceedings, coffee breaks and refreshments.

Participation in the workshop is limited by the capacity of the venue. Requests for participation will be processed on a first-come, first-served basis.

Local organisation

Kiril Simov (kivs@bgcict.acad.bg)
Petya Osenova (petyaosenova@hotmail.com)
Milena Slavcheva (milena@lml.bas.bg)

BulTreeBank Project
Linguistic Modelling Laboratory, CLPP, Bulgarian Academy of Sciences
Acad. G.Bonchev St. 25A
1113 Sofia, Bulgaria
Web: http://www.bultreebank.org/