Overview

Over the last two decades, statistical machine translation (SMT) has made substantial progress, moving from word-based to phrase-based and syntax-based models. Recently, however, this progress has reached a stage where gains in translation quality are slowing, even with sophisticated syntactic-forest-based translation models. At the same time, crucial meaning errors, such as incorrect translations of word senses and semantic roles, remain pervasive in SMT-generated translation hypotheses. These errors sometimes cause the meaning of a target translation to drift significantly from the original meaning of the source sentence. Given this dilemma, one might ask: Has SMT reached the maturity stage of its lifespan? Or is it time to find a new direction for SMT in order to catalyze the next breakthroughs?

Semantics-driven SMT may be one such direction. Semantics at different levels may enable SMT to generate translations that are not only grammatical but also meaning-preserving. Lexical semantics provides useful information for sense and semantic role disambiguation during translation. Compositional semantics allows SMT to generate target phrase and sentence translations by means of semantic composition. Discourse semantics captures inter-sentence dependencies for document-level machine translation. Large-scale semantic knowledge bases such as WordNet, YAGO and BabelNet can provide external semantic knowledge for SMT. Semantics-driven SMT allows us to shift gradually from syntax to semantics and offers insights into how meaning is correctly conveyed during translation.

The goals of this workshop are to identify key challenges in exploring semantics in SMT, to discuss, both theoretically and practically, how semantics can help SMT and how SMT can benefit from rapid developments in semantic technologies, and to find new opportunities emerging from the combination of semantics and SMT. Our key interest is to provide insights into semantics-driven SMT. Specifically, the motivations of this workshop are:

  • To bring researchers in the SMT and semantics communities together and to cultivate new ideas for cutting-edge models and algorithms for semantic SMT.
  • To examine theoretically, and from a broad perspective, what semantics can provide for SMT and how SMT can benefit from semantics.
  • To explore new research horizons for semantics-driven SMT in practice.