- Description:
Eraser Multi RC is a dataset of queries over multi-line passages, along with answers and a rationale. Each example in this dataset has the following five parts:
1. A multi-line passage
2. A query about the passage
3. An answer to the query
4. A classification as to whether the answer is right or wrong
5. An explanation justifying the classification
Additional Documentation: Explore on Papers With Code
Homepage: https://cogcomp.seas.upenn.edu/multirc/
Source code: tfds.text.EraserMultiRc
Versions:
0.1.1 (default): No release notes.
Download size: 1.59 MiB
Dataset size: 62.59 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'test' | 4,848 |
'train' | 24,029 |
'validation' | 3,214 |
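As a quick check on how the examples are distributed, the split sizes listed above can be turned into proportions. This is a minimal sketch using only the counts from the table; the names `SPLIT_SIZES` and `split_fractions` are hypothetical and not part of any library.

```python
# Example counts copied from the Splits table.
SPLIT_SIZES = {"test": 4848, "train": 24029, "validation": 3214}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_examples = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

for name, share in sorted(fractions.items()):
    print(f"{name}: {SPLIT_SIZES[name]} examples ({share:.1%})")
```

The three splits add up to 32,091 examples, with roughly three quarters of them in 'train'.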
- Feature structure:
FeaturesDict({
'evidences': Sequence(Text(shape=(), dtype=string)),
'label': ClassLabel(shape=(), dtype=int64, num_classes=2),
'passage': Text(shape=(), dtype=string),
'query_and_answer': Text(shape=(), dtype=string),
})
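To make the feature structure concrete, the following is a minimal sketch of loading one split and converting an example to plain Python types. It assumes the `tensorflow_datasets` package is installed and that the dataset is registered under the name `eraser_multi_rc`; the helpers `load_split` and `decode_example` are hypothetical, and the sample record is invented for illustration rather than taken from the dataset.

```python
def load_split(split="train"):
    """Load one split as an iterable of NumPy-backed dicts (downloads on first use)."""
    import tensorflow_datasets as tfds  # imported lazily; assumed to be installed

    ds = tfds.load("eraser_multi_rc", split=split)  # dataset name is an assumption
    return tfds.as_numpy(ds)


def _to_str(value):
    """Decode a bytes value to str; convert anything else with str()."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def decode_example(example):
    """Convert one raw example dict into plain Python types."""
    return {
        "passage": _to_str(example["passage"]),
        "query_and_answer": _to_str(example["query_and_answer"]),
        "label": int(example["label"]),
        "evidences": [_to_str(e) for e in example["evidences"]],
    }


# Invented record shaped like the FeaturesDict above (not real dataset content).
raw_sample = {
    "passage": b"Sam went to the market. He bought apples.",
    "query_and_answer": b"What did Sam buy? Apples",
    "label": 1,
    "evidences": [b"He bought apples."],
}
decoded = decode_example(raw_sample)
print(decoded["label"], decoded["evidences"])
```

The decoding step is kept separate from loading so it can be tried on a hand-built record without downloading anything.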
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
evidences | Sequence(Text) | (None,) | string | |
label | ClassLabel | | int64 | |
passage | Text | | string | |
query_and_answer | Text | | string | |
Supervised keys (See as_supervised doc): None
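Because the supervised keys are None, the dataset defines no (input, label) pairing of its own, so any such pairing has to be built by hand. The following is a minimal sketch, assuming the passage and the query-plus-answer text serve as the input and `label` as the target. That choice and the function name `to_supervised` are illustrative assumptions, not something the dataset prescribes, and the sample record is invented.

```python
def to_supervised(example):
    """Map a feature dict to an (inputs, label) tuple."""
    inputs = {
        "passage": example["passage"],
        "query_and_answer": example["query_and_answer"],
    }
    return inputs, example["label"]


# Invented record with the dataset's four features (not real dataset content).
pair_sample = {
    "passage": "Sam went to the market. He bought apples.",
    "query_and_answer": "What did Sam buy? Apples",
    "label": 1,
    "evidences": ["He bought apples."],
}
inputs, label = to_supervised(pair_sample)
print(sorted(inputs), label)
```

The function only indexes into the dictionary, so it could plausibly also be applied per example through a `tf.data.Dataset` `map` call. The `evidences` field is left out of the inputs here on purpose, since it holds the rationale rather than model input.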
Figure (tfds.show_examples): Not supported.
- Citation:
@unpublished{eraser2019,
title = {ERASER: A Benchmark to Evaluate Rationalized NLP Models},
author = {Jay DeYoung and Sarthak Jain and Nazneen Fatema Rajani and Eric Lehman and Caiming Xiong and Richard Socher and Byron C. Wallace}
}
@inproceedings{MultiRC2018,
author = {Daniel Khashabi and Snigdha Chaturvedi and Michael Roth and Shyam Upadhyay and Dan Roth},
title = {Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences},
booktitle = {NAACL},
year = {2018}
}