CALL FOR PAPERS

4th International SALTMIL (ISCA SIG) LREC workshop on

First Steps for Language Documentation of Minority Languages:
Computational Linguistic Tools for Morphology, Lexicon and Corpus Compilation

24 May 2004, Lisbon, Portugal
http://193.2.100.60/SALTMIL/

Motivation and Aims

The minority or lesser used languages of the world are under increasing pressure from the major languages (especially English), and many of them lack full political recognition. Some minority languages have been well researched linguistically, but most have not, and the vast majority do not yet possess basic speech and language resources (such as text and speech corpora) which are sufficient to permit research or commercial development of products.

If this situation were to continue, the minority languages would fall a long way behind the major languages, as regards the availability of commercial speech and language products. This in turn will accelerate the decline of those languages that are already struggling to survive. To break this vicious circle, it is important to encourage the development of basic language resources as a first step.

The workshop is intended to continue the series of SALTMIL (ISCA SIG) LREC workshops:

1) "Language Resources for European Minority Languages" (LREC1998) Granada, Spain.
2) "Developing Language Resources for Minority Languages: Re-usability and Strategic Priorities" (LREC2000) Athens, Greece.
3) "Portability Issues in Human Language Technologies" (LREC2002) Las Palmas de Gran Canaria, Spain.

The proposed workshop aims to share information on tools and best practice, so that isolated researchers will not need to start from scratch. An important aspect will be the forming of personal contacts, which can minimise duplication of effort. Information on sources of funding for minority languages will also be presented, and there will be discussion on the strategic priorities that need to be addressed in this area.
There will be a balance between presentations of existing language resources, and more general presentations designed to give background information needed by all researchers present. One potential means of ameliorating this imbalance in technology resources is through encouraging research in the portability of human language technology for multilingual application.

Topics of Interest

The workshop will focus on the following topics and languages:

* Existing projects in the field, with the opportunity to share useful information
* Presentations of existing speech and text databases for minority languages, with particular emphasis on software tools that have been found useful in their development
* Linguistic corpora
* Automatic Speech Recognition
    * Acoustic modelling
    * Dictionary development
    * Language modelling
* Natural Language Processing:
    * Computational lexicography
    * Morphology
    * Syntax
    * Machine Translation
* Information retrieval

Agenda

The first session of the workshop will consist of invited talks focusing on current methodologies for language documentation and computational linguistic tools which are available for minority languages. Each invited speaker will be asked to comment on the following:

* how current research relates to minority languages, perhaps indicating how they would approach their work within this context
* which methodologies and tools they find most useful
* which of those methodologies are defined as portable for different languages
* how these tools could extend the use of the language
* how this basis could be used in further work on HLT

The second session will be an oral session focusing on programmes and initiatives for supporting minority language documentation. The main aim of this session is to provide a forum for fostering new contacts among researchers working in this area.

Invited speakers

* Dafydd Gibbon, Univ. Bielefeld. "First steps in corpus compilation"
* Xabier Artola, Ixa group, Univ. of the Basque Country. "First steps in lexicon resources"
* Bojan Petek, University of Ljubljana, Slovenia. "Experiences defining a Network of Excellence on Portability of Human Language Technologies"
* Kenneth R. Beesley, Xerox (to be confirmed). "First steps in morphology"

Workshop Organizing and Program Committee

Bojan Petek, University of Ljubljana, Slovenia
Julie Berndsen, University College Dublin, Ireland
Oliver Streiter, EURAC; European Academy, Bolzano/Bozen, Italy
Atelach Alemu, Addis Ababa University, Ethiopia
Kepa Sarasola, University of the Basque Country, Donostia

Submission

Papers are invited that describe research and development in the area of Human Language Technology portability. All contributed papers will be presented in poster format.

Each submission should include: title; author(s); affiliation(s); and contact author's e-mail address, postal address, telephone and fax numbers. Abstracts (maximum 500 words, plain-text format) should be sent via email to:

    Julie Berndsen
    Julie.Berndsen@ucd.ie

All contributions (including invited papers) will be printed in the workshop proceedings (CD). They will also be published on the SALTMIL website.

Submissions of papers for poster presentations should follow the same style as regular LREC papers and should not be longer than 6000 words. The final details will be published as soon as they become available.

We allow simultaneous paper submission to the workshop and the LREC main conference. If a paper is accepted by both the conference and the workshop, the paper will be presented at the conference, rather than at the workshop. In that case, the author(s) should notify the workshop chair.
Important Dates:

Deadline for workshop abstract submission: 11th February 2004
Notification of acceptance: 25th February 2004
Final version of the paper for the workshop proceedings: 1st April 2004
Workshop: 24 May 2004, morning

Workshop Registration Fees

The registration fees for the workshop are:

If you are not attending LREC: 85 Euro
If you are attending LREC: 50 Euro

These fees will include a coffee break and the Proceedings of the Workshop. Registration will be handled by the LREC Secretariat.