# Machine Translation @ the National Centre for Language Technology

## Dublin City University, Ireland

Title: TransBooster: boosting the performance of existing MT by complex sentence reduction

Duration: October 1st 2003 - September 30th 2006

Funded by: Enterprise Ireland

People: Bart Mellebeek, Karolina Owczarzak, Josef van Genabith, Andy Way

Description: Machine Translation (MT) systems tend to underperform when faced with long, linguistically complex sentences. Rule-based systems often settle for broad but shallow linguistic coverage rather than deep, fine-grained analysis, since hand-crafting rules based on detailed linguistic analyses is time-consuming, error-prone and expensive. Most data-driven systems lack the syntactic knowledge needed to deal effectively with non-local grammatical phenomena. As a result, both rule-based and data-driven MT systems handle short, simple sentences better than linguistically complex ones.

This thesis proposes a new, modular approach that helps MT systems improve their output quality by reducing the complexity of the input. Instead of trying to reinvent the wheel by proposing yet another approach to MT, we build on the strengths of existing MT paradigms while trying to remedy their shortcomings as much as possible. We do this by developing TransBooster, a wrapper technology that reduces the complexity of the MT input with a recursive decomposition algorithm, producing simple input chunks that are spoon-fed to a baseline MT system. TransBooster is not an MT system itself: it does not perform automatic translation, but operates on top of an existing MT system, guiding it through the input and helping the baseline system improve the quality of its own translations through automatic complexity reduction.

In this dissertation, we outline the motivation behind TransBooster, explain its development in depth and investigate its impact on the three most important paradigms in the field: Rule-based, Example-based and Statistical MT. In addition, we use the TransBooster architecture as a promising alternative to current Multi-Engine MT techniques. We evaluate TransBooster on the language pair English→Spanish with a combination of automatic and manual evaluation metrics, providing a rigorous analysis of the potential and shortcomings of our approach.
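The wrapper idea in the description (decompose the input recursively, pass the simple chunks to a baseline system, recombine the results) can be illustrated with a toy sketch. This is not the TransBooster algorithm: comma-based splitting merely stands in for a linguistically informed decomposition, and the names `decompose`, `boost`, `baseline` and `max_words` are hypothetical, introduced only for illustration.

```python
from typing import Callable, List

# Assumption: a baseline MT system is modelled as any function str -> str.
Translator = Callable[[str], str]

SEPARATOR = ", "  # toy split point; a real system would use syntactic analysis


def decompose(sentence: str, max_words: int) -> List[str]:
    """Recursively split a sentence until each chunk has at most max_words words."""
    if len(sentence.split()) <= max_words:
        return [sentence]
    idx = sentence.find(SEPARATOR)
    left, right = sentence[:idx], sentence[idx + len(SEPARATOR):]
    if idx <= 0 or not right.strip():
        return [sentence]  # no usable split point: leave the chunk as it is
    return decompose(left, max_words) + decompose(right, max_words)


def boost(sentence: str, baseline: Translator, max_words: int = 6) -> str:
    """Translate each simple chunk with the baseline system and recombine."""
    chunks = decompose(sentence, max_words)
    return SEPARATOR.join(baseline(chunk) for chunk in chunks)
```

For example, `boost(text, baseline=some_mt_function)` would call the baseline once per chunk rather than once on the whole sentence. The wrapper itself never translates anything, which mirrors the statement that TransBooster operates on top of an existing MT system.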
Last update: Sep 19 2007