WORKSHOP ON EXAMPLE-BASED MACHINE TRANSLATION

Workshop on Example-Based Machine Translation: Last Call for Papers

Hosted by MT-SUMMIT VIII
Santiago de Compostela, Spain,
September 18-22, 2001
http://www.eamt.org/summitVIII/index.html
http://www.compapp.dcu.ie/~away/EBMT.html
Co-chairs: Michael Carl, IAI, Saarbrücken; Andy Way, Computer
Applications, Dublin City University

In recent years, corpora of multilingual translated texts have become
widely available for a number of languages. Notwithstanding the seminal
paper by Nagao (1984), it is primarily since the early 1990s that such
bilingual texts have been exploited in the area of Machine Translation
(MT).

The two main paradigmatic approaches which have been proposed are
Statistics-based Machine Translation (SBMT) and Example-Based Machine
Translation (EBMT). A related variant of EBMT that we ignore here,
despite its wide use in the localisation area, is that of
Translation Memories (TM). In a TM, no new translations are created from
the examples already in the system database: rather, the closest
translation matches are proposed to the user for post-editing into the
correct translation.

While translation memory systems are used in restricted domains, SBMT
systems require training on huge, high-quality bilingual corpora. As a
consequence, TMs can hardly be applied as a general purpose solution to
MT and SBMT as yet cannot produce complex translations to the desired
quality, even if such translations are given to the system in the
training phase. EBMT seeks to exploit a number of knowledge resources,
such as linguistics and statistics, and symbolic and numerical
techniques, and to integrate them into one framework. In this
way, rule-based morphological, syntactic and/or semantic information is
combined with knowledge extracted from bilingual texts which is then
re-used in the translation process.

However, it is unclear how one might combine the different knowledge
resources and techniques in an optimal way. In EBMT, therefore, the
question is asked: what can be learned from a bilingual corpus and what
needs to be manually provided? Furthermore, we remain uncertain as to
how far the EBMT methodology can be pushed with respect to translation
quality and/or translation purpose. Finally, one wonders what the
implications and consequences are for size and quality of the reference
translations, (computational) complexity of the system, scalability and
transportability, if such an approach is taken.

Given this background, we propose to organize a workshop in order to
shed some light on these open questions, among others. We are seeking
contributions which go beyond the purely statistical and/or rule-based
approaches to MT. We welcome visionary and technical descriptions,
reports of empirical research as well as feasibility studies and system
demonstrations. We would welcome contributions on any of the following
topics and sub-headings:

  a.. (semi-)automatic preparation of existing bi/multilingual corpora
      for EBMT
    a.. extraction of bi/multilingual texts from the web
    b.. preparation of treebanks for EBMT
    c.. bi/multilingual alignment/bracketing/parsing
    d.. inference of bi/multilingual grammar and transfer rules
  b.. description of `pure' EBMT systems
    a.. knowledge resources used
    b.. representation of numeric and symbolic knowledge
  c.. descriptions of `hybrid' systems integrating EBMT with rule-based
      or other methodologies
  d.. evaluation of EBMT results and/or comparison with other MT systems
  e.. considerations on domain-(in)dependence of EBMT systems
  f.. computational and/or system complexity of EBMT systems

Submissions
Submitted papers must describe original, previously unpublished work.
Submissions must not exceed 12 pages. Contributions should be submitted
to Michael Carl. Conference stylesheets are now available. Deadlines
appear below.

There may also be poster sessions, subject to demand. We also strongly
encourage system demonstrations, either in conjunction with contentful
paper presentations or as stand-alone demos during the lunch and coffee
intervals. Please make it clear in your submissions whether you plan to
demonstrate your system, either as part of a paper presentation, or as a
stand-alone demo.


Publication
There will be a common publication format for all workshops in line with
the main conference proceedings. Please follow the guidelines for the
main conference. In addition, we anticipate approaching relevant
publishers to see whether there is interest in publishing the
proceedings more widely.


Important Dates

  a.. January 2001 First call for papers/demos
  b.. 15.4.2001 Deadline for receipt of papers
  c.. 31.5.2001 Notification of acceptance
  d.. 15.7.2001 Final Paper due
  e.. 18.9.2001 Workshop takes place


Attendance Fee
Details of registration procedures, including registration fees, have
just been announced. The attendance fee for our workshop is Euro 50.


Organizing Committee

  a.. Sivaji Bandyopadhyay, India
  b.. Ralf Brown, USA
  c.. Michael Carl, Germany
  d.. Ilyas Cicekli, Turkey
  e.. Brona Collins, Belgium
  f.. Oliver Streiter, Taiwan
  g.. Stephan Vogel, Germany
  h.. Andy Way, Ireland