answer_equivalence

  • Description:

The Answer Equivalence Dataset contains human ratings of predictions made by several models on the SQuAD dataset. Each rating establishes whether the predicted answer is 'equivalent' to the gold answer, taking into account both the question and the context.

More specifically, by 'equivalent' we mean that the predicted answer contains at least the same information as the gold answer and does not add superfluous information. The dataset contains annotations for:

* predictions from BiDAF on SQuAD dev
* predictions from XLNet on SQuAD dev
* predictions from Luke on SQuAD dev
* predictions from Albert on SQuAD training, dev and test examples

Split Examples
'ae_dev' 4,446
'ae_test' 9,724
'dev_bidaf' 7,522
'dev_luke' 4,590
'dev_xlnet' 7,932
'train' 9,090
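
A minimal sketch of loading one of the splits listed above with TensorFlow Datasets. It assumes the `tensorflow_datasets` package is installed and that the dataset is registered under the name `answer_equivalence` (the title of this page). `SPLIT_SIZES` and `load_split` are illustrative names introduced here, not part of the library.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {
    "ae_dev": 4446,
    "ae_test": 9724,
    "dev_bidaf": 7522,
    "dev_luke": 4590,
    "dev_xlnet": 7932,
    "train": 9090,
}


def load_split(name):
    """Load one split as a tf.data.Dataset (hypothetical helper)."""
    if name not in SPLIT_SIZES:
        raise ValueError(
            f"unknown split {name!r}; expected one of {sorted(SPLIT_SIZES)}"
        )
    # Imported lazily so the size table can be used without TensorFlow.
    import tensorflow_datasets as tfds

    # Assumption: the dataset name matches this page's title.
    return tfds.load("answer_equivalence", split=name)
```

Calling `load_split("ae_dev")` would download and prepare the data on first use; an unknown split name fails early with a `ValueError` instead of triggering a download.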
  • Feature structure:
FeaturesDict({
    'candidate': Text(shape=(), dtype=string),
    'context': Text(shape=(), dtype=string),
    'gold_index': int32,
    'qid': Text(shape=(), dtype=string),
    'question': Text(shape=(), dtype=string),
    'question_1': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'question_2': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'question_3': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'question_4': ClassLabel(shape=(), dtype=int64, num_classes=3),
    'reference': Text(shape=(), dtype=string),
    'score': float32,
})
  • Feature documentation:
Feature       Class         Shape  Dtype    Description
              FeaturesDict
candidate     Text                 string
context       Text                 string
gold_index    Tensor               int32
qid           Text                 string
question      Text                 string
question_1    ClassLabel           int64
question_2    ClassLabel           int64
question_3    ClassLabel           int64
question_4    ClassLabel           int64
reference     Text                 string
score         Tensor               float32
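
The feature types above can be mapped onto plain Python values as in the following sketch. It assumes examples arrive as NumPy-style dictionaries (for instance from `tfds.as_numpy`), where `Text` features are UTF-8 byte strings. The helper `to_python` and the `sample` record are hypothetical; the sample holds placeholder values, not real dataset rows.

```python
TEXT_FIELDS = ("candidate", "context", "qid", "question", "reference")
RATING_FIELDS = ("question_1", "question_2", "question_3", "question_4")


def to_python(example):
    """Convert one NumPy-style example dict into plain Python values."""
    record = {}
    for key in TEXT_FIELDS:
        value = example[key]
        # Text features are expected as bytes; fall back to str() otherwise.
        record[key] = value.decode("utf-8") if isinstance(value, bytes) else str(value)
    for key in RATING_FIELDS:
        # ClassLabel features are integer class ids in the range 0..2.
        record[key] = int(example[key])
    record["gold_index"] = int(example["gold_index"])
    record["score"] = float(example["score"])
    return record


# Placeholder record with the documented keys (values are made up).
sample = {
    "candidate": b"placeholder candidate",
    "context": b"placeholder context",
    "gold_index": 0,
    "qid": b"placeholder-id",
    "question": b"placeholder question",
    "question_1": 0,
    "question_2": 1,
    "question_3": 2,
    "question_4": 0,
    "reference": b"placeholder reference",
    "score": 0.5,
}

record = to_python(sample)
```

Keeping the field names in two tuples makes it easy to see which keys are decoded as text and which are treated as class ids.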
  • Citation:
@article{bulian-etal-2022-tomayto,
      title={Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation},
      author={Jannis Bulian and Christian Buck and Wojciech Gajewski and Benjamin Boerschinger and Tal Schuster},
      year={2022},
      eprint={2202.07654},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}