Call for Papers

Workshop on Reading Comprehension Tests as Evaluation for Computer-Based Language Understanding Systems

Thursday, May 4th, 2000, Seattle, Washington, USA
(post-conference workshop in conjunction with ANLP-NAACL2000)

Reading Comprehension tests, such as the one below, are designed to help evaluate a reader's understanding of a text passage.

How Maple Syrup is Made

Maple syrup comes from sugar maple trees. At one time, maple syrup was used to make sugar. This is why the tree is called a "sugar" maple tree. Sugar maple trees make sap. Farmers collect the sap. The best time to collect sap is in February and March. The nights must be cold and the days warm. The farmer drills a few small holes in each tree. He puts a spout in each hole. Then he hangs a bucket on the end of each spout. The bucket has a cover to keep rain and snow out. The sap drips into the bucket. About 10 gallons of sap come from each hole.

1. Who collects maple sap? (Farmers)
2. What does the farmer hang from a spout? (A bucket)
3. When is sap collected? (February and March)
4. Where does the maple sap come from? (Sugar maple trees)
5. Why is the bucket covered? (to keep rain and snow out)

Such tests exist in many languages, have human performance benchmarks associated with them, and come in a variety of types (short-answer, multiple choice) and levels of difficulty. In addition, they are generally written to make each story and set of questions self-contained, in order to require as little outside knowledge as possible to answer the questions.

The focus of the proposed workshop will be to explore the following questions:

- Can such exams be used to evaluate computer-based language understanding effectively and efficiently?
- Would they provide an impetus and test bed for interesting and useful research?
- Are they too hard for current technology?
- Or are they too easy, such that simple hacks can score high, although there is clearly no understanding involved?

The most direct method of exploring these questions is to choose a set of tests and build a system that takes these tests. Some preliminary results indicate that such tests are tractable but not trivial, and that linguistic processing is helpful (Hirschman et al., ACL-99). A test set, evaluation routines, prototype system, and documentation are available upon request to light@mitre.org.

We hope that a number of submissions will present results based on actual reading comprehension systems. In addition, we encourage submissions that report on other kinds of tests or similar tests in other languages, or that address our list of questions by other means. Submissions that describe work in progress with preliminary empirical results are also encouraged.

Invited speaker: Karen Kukich (Educational Testing Service)
"NLP Tools for Analyzing TOEFL Reading Comprehension Passages and Items"

Format for Submission

Authors are asked to submit previously unpublished papers only; a workshop proceedings will be published. Our target submission length is 2000 words, but both shorter and longer submissions will also be considered. Electronic submission of PostScript will be accepted. Hard copy submissions should include 4 copies of the paper. Since the papers will be reviewed anonymously, please do not place the author name on the paper. Instead, include a separate title page with title, abstract, author, and e-mail address. Unless requested otherwise, notification of acceptance will be sent electronically to the first author. Parallel submission is unproblematic; however, if your paper is accepted to this workshop and you decide to present it here, we will ask you to withdraw it from any other events.

Important Dates

Deadline for submission: February 11th, 2000
Notification of authors: March 1st, 2000
Final versions due: March 10th, 2000

Address for Submission and Further Information

Marc Light
The MITRE Corporation
202 Burlington Rd.
M/S K329
Bedford, MA 01730 USA
Phone: 1-781-271-5579
light@mitre.org

(The mailing list, read-comp@linus.mitre.org, has been set up to discuss reading comprehension tests as evaluation for computer-based language understanding systems. It is open subscription and unmoderated. To subscribe, send email to majordomo@linus.mitre.org with 'subscribe read-comp' in the body.)

Program Committee:

Eric Brill
Eugene Charniak
Mary Harper
Marc Light (chair)
Ellen Riloff
Ellen Voorhees