
abhaga's library 25 articles

 
 

Syntax-based Alignment of Multiple Translations: Extracting Paraphrases and Generating New Sentences

In HLT-NAACL 2003: Main Proceedings (2003)
posted to fst paraphrasing by abhaga on 2007-05-27 01:59:27 **

Abstract

We describe a syntax-based algorithm that automatically builds Finite State Automata (word lattices) from semantically equivalent translation sets. These FSAs are good representations of paraphrases. They can be used to extract lexical and syntactic paraphrase pairs and to generate new, unseen sentences that express the same meaning as the sentences in the input sets. Our FSAs can also predict the correctness of alternative semantic renderings, which may be used to evaluate the quality of... ...

 

Contextual Bitext-Derived Paraphrases in Automatic MT Evaluation

posted to mt-eval paraphrasing by abhaga on 2007-05-27 01:55:09 ***

Abstract

In this paper we present a novel method for deriving paraphrases during automatic MT evaluation using only the source and reference texts, which are necessary for the evaluation, and word and phrase alignment software. Using target language paraphrases produced through word and phrase alignment a number of alternative reference sentences are constructed automatically for each candidate translation. ...

 

A Study of Translation Error Rate with Targeted Human Annotation

No. LAMP-TR-126, CS-TR-4755, UMIACS-TR-2005-58. (J 2005)
posted to mt-eval by abhaga on 2007-01-20 03:05:56 ***

Abstract

We define a new, intuitive measure for evaluating machine translation output that avoids the knowledge intensiveness of more meaning-based approaches, and the labor-intensiveness of human judgments. Translation Error Rate (TER) measures the amount of editing that a human would have to perform to change a system output so it exactly matches a reference translation. We also compute a human-targeted TER (or HTER), where the minimum TER of the translation is computed against a human "targeted reference" that preserves the meaning (provided ...
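
As a rough illustration of the edit-rate idea described in this abstract, the Python sketch below divides a word-level edit distance by the reference length. It is a simplification under stated assumptions: only insertions, deletions, and substitutions are counted, whereas the published TER also treats a shift of a word sequence as a single edit, and only one reference is used. The function name `simple_ter` is hypothetical and does not come from the paper.

```python
def simple_ter(hypothesis: str, reference: str) -> float:
    """Word-level edit distance divided by reference length.

    Simplified sketch: no block shifts, single reference only.
    """
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the number of edits needed to turn the hypothesis
    # words processed so far into the first j reference words.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # turning i hypothesis words into nothing costs i deletions
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(
                prev[j] + 1,         # delete the hypothesis word
                curr[j - 1] + 1,     # insert the reference word
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1] / len(ref)
```

A score of 0.0 means the output already matches the reference; because shifts are not modelled here, reordered output is penalised more heavily than it would be under the full metric.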

 

Re-evaluating the Role of BLEU in Machine Translation Research

posted to general mt-eval by abhaga on 2006-10-19 04:46:25 ** along with 2 people glpayson student_t

Abstract

We argue that the machine translation community is overly reliant on the Bleu machine translation evaluation metric. We show that an improved Bleu score is neither necessary nor sufficient for achieving an actual improvement in translation quality, and give two significant counterexamples to Bleu's correlation with human judgments of quality. This offers new potential for research which was previously deemed unpromising by an inability to improve upon Bleu scores. ...

 

CDER: Efficient MT Evaluation Using Block Movements

posted to block_movement cder mt-eval by abhaga on 2006-10-19 04:45:58 *** along with 1 person student_t

Abstract

Most state-of-the-art evaluation measures for machine translation assign high costs to movements of word blocks. In many cases though such movements still result in correct or almost correct sentences. In this paper, we will present a new evaluation measure which explicitly models block reordering as an edit operation. ...

 

The Significance of Recall in Automatic Metrics

posted to mt-eval recall by abhaga on 2006-10-19 04:31:22 read

Abstract

Recent research has shown that a balanced harmonic mean (F1 measure) of unigram precision and recall outperforms the widely used BLEU and NIST metrics for Machine Translation evaluation in terms of correlation with human judgments of translation quality. We show that significantly better correlations can be achieved by placing more weight on recall than on precision. While this may seem unexpected, since BLEU and NIST focus on n-gram precision and disregard recall, our experiments show... ...
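
To make the precision/recall weighting in this abstract concrete, here is a minimal Python sketch of a weighted harmonic mean over unigram matches. The parameter `alpha`, the function name `weighted_f_mean`, and the use of clipped whitespace-token counts for matching are all assumptions made for illustration; the paper's exact matching scheme and weights are not given in the truncated abstract. With `alpha = 0.5` the formula reduces to the balanced F1 measure, and values above 0.5 place more weight on recall.

```python
from collections import Counter


def weighted_f_mean(hypothesis: str, reference: str, alpha: float = 0.5) -> float:
    """Weighted harmonic mean of unigram precision and recall.

    1/F = alpha/recall + (1 - alpha)/precision, so alpha > 0.5 favours recall.
    """
    hyp = Counter(hypothesis.split())
    ref = Counter(reference.split())

    # Clipped unigram matches: each word counts at most as often
    # as it appears in both the hypothesis and the reference.
    matches = sum((hyp & ref).values())
    if matches == 0:
        return 0.0

    precision = matches / sum(hyp.values())
    recall = matches / sum(ref.values())
    return precision * recall / (alpha * precision + (1 - alpha) * recall)
```

For a short hypothesis such as "the cat" against the reference "the cat sat on the mat", precision is perfect but recall is low, so raising `alpha` lowers the score.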

 

Evaluation of Machine Translation and its Evaluation

In Machine Translation Summit IX (September 2003)
posted to mt-eval by abhaga on 2006-10-19 04:29:56 *** along with 4 people JeremyKahn mskoleva patjov student_t

Abstract

Evaluation of MT evaluation measures is limited by inconsistent human judgment data. Nonetheless, machine translation can be evaluated using the well-known measures precision, recall, and the F-measure. The F-measure has significantly higher correlation with human judgments than recently proposed alternatives. More importantly, the standard measures have an intuitive graphical interpretation, which can facilitate insight into how MT systems might be improved. The relevant software is publicly... ...

 

BLANC: Learning Evaluation Metrics for MT

In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (October 2005), pp. 740-747
posted to blanc mt-eval by abhaga on 2006-10-19 04:27:20 read
 

METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments

In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization (June 2005), pp. 65-72
posted to meteor mt-eval by abhaga on 2006-10-19 04:25:08 read along with 2 people glpayson minhle
 

On Some Pitfalls in Automatic Evaluation and Significance Testing for MT

In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization (June 2005), pp. 57-64
posted to confidence_interval mt-eval by abhaga on 2006-10-19 04:23:59 **
 

Modelling legitimate translation variation for automatic evaluation of MT quality

In LREC 2004
posted to mt-eval by abhaga on 2006-10-19 04:04:10 **
 

Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics

In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04), Main Volume (July 2004), pp. 605-612
posted to mt-eval rouge by abhaga on 2006-10-19 03:59:39 ***
 

Paraphrasing for Automatic Evaluation

In Proceedings of the Human Language Technology Conference of the NAACL, Main Conference (June 2006), pp. 455-462
posted to mt-eval paraphrasing by abhaga on 2006-10-19 03:57:53 *** along with 1 person student_t
 

Measuring Confidence Intervals for the Machine Translation Evaluation Metrics

In International Conference on Theoretical and Methodological Issues in Machine Translation (TMI 2004), Baltimore, MD USA, October 4-6, 2004
posted to confidence_interval mt-eval by abhaga on 2006-10-19 03:54:53 ***
 

Extending the BLEU MT Evaluation Method with Frequency Weightings

In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04), Main Volume (July 2004), pp. 621-628
posted to bleu mt-eval by abhaga on 2006-10-19 03:44:13 **
 

Extending MT evaluation tools with translation complexity metrics

In Proceedings of Coling 2004 (Aug 2004), pp. 106-112
posted to metric_normalization mt-eval translation_complexity by abhaga on 2006-10-19 03:34:56 ***
 

ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation

In Proceedings of Coling 2004 (Aug 2004), pp. 501-507
posted to mt-eval orange by abhaga on 2006-10-19 03:31:50 *** along with 1 person student_t
 

Stochastic Iterative Alignment for Machine Translation Evaluation

In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions (July 2006), pp. 539-546
posted to alignment mt-eval by abhaga on 2006-10-19 03:26:20 *** along with 1 person student_t
 

Automatic Evaluation of Machine Translation Quality: XRCE Multilingual Team

posted to general mt-eval by abhaga on 2006-10-19 02:48:43 **
 

A Paraphrase-Based Approach to Machine Translation Evaluation

posted to mt-eval paraphrasing by abhaga on 2006-10-19 02:23:59 ***

Note (first note only)

Shows a comparison with METEOR results.

 

Sentence-Level MT evaluation without reference translations: beyond language modeling

In European Association for Machine Translation
posted to eamt2005 mt-eval sentence_level by abhaga on 2006-10-19 02:07:23 ***
 

Syntactic Features for Evaluation of Machine Translation

posted to mt-eval by abhaga on 2006-10-19 02:00:27 ** along with 1 person student_t

Abstract

Automatic evaluation of machine translation, based on computing n-gram similarity between system output and human reference translations, has revolutionized the development of MT systems. We explore the use of syntactic information, including constituent labels and head-modifier dependencies, in computing similarity between output and reference. Our results show that adding syntactic information to the evaluation metric improves both sentence-level and corpus-level correlation with... ...

 

A learning approach to improving sentence-level MT evaluation

(2004)
posted to mt-eval sentence_level by abhaga on 2006-10-19 01:58:59 ** along with 1 person patjov

Abstract

The problem of evaluating machine translation (MT) systems is more challenging than it may first appear, as diverse translations can often be considered equally correct. The task is even more difficult when practical circumstances require that evaluation be done automatically over short texts, for instance, during incremental system development and error analysis. ...

 

Experimental Comparison of MT Evaluation Methods: RED vs. BLEU

posted to mt-eval by abhaga on 2006-10-19 01:56:40 **

Abstract

This paper experimentally compares two automatic evaluators, RED and BLEU, to determine how close the evaluation results of each automatic evaluator are to average evaluation results by human evaluators, following the ATR standard of MT evaluation. This paper gives several cautionary remarks intended to prevent MT developers from drawing misleading conclusions when using the automatic evaluators. In addition, this paper reports a way of using the automatic evaluators so that their results... ...

 

Bleu: a method for automatic evaluation of machine translation

(2001)
posted to bleu mt-eval by abhaga on 2006-10-19 01:52:09 read along with 5 people and 5 groups barliant ealdent glpayson mote student_t ASR ISI LanguageAndBrain NLP SLA

Abstract

Human evaluations of machine translation are extensive but expensive. Human evaluations can take months to finish and involve human labor that can not be reused. ...

Note: You may cite this page as: http://www.citeulike.org/user/abhaga
