Conference Proceeding

Abstract or Description

Although knowledge-based MT systems have the potential to achieve high translation accuracy, each successful application system requires a large amount of hand-coded knowledge (lexicons, grammars, mapping rules, etc.). Systems like KBMT-89 and its descendants have demonstrated how knowledge-based translation can produce good results in technical domains with tractable domain semantics. Nevertheless, the cost of developing large-scale applications with tens of thousands of domain concepts precludes a purely hand-crafted approach. The current challenge for the "next generation" of knowledge-based MT systems is to utilize on-line textual resources and corpus analysis software in order to automate the most laborious aspects of the knowledge acquisition process. This partial automation can in turn maximize the productivity of human knowledge engineers and help to make large-scale applications of knowledge-based MT an economic reality. In this paper we discuss the corpus-based knowledge acquisition methodology used in KANT, a knowledge-based translation system for multi-lingual document production. This methodology can be generalized beyond the KANT interlingua approach for use with any system that requires similar kinds of knowledge.