- Description:
The Government Report dataset consists of reports written by government research agencies, including the Congressional Research Service and the U.S. Government Accountability Office.
Additional Documentation: Explore on Papers With Code
Homepage: https://gov-report-data.github.io/
Source code:
tfds.summarization.gov_report.GovReport
Versions:
1.0.0 (default): Initial release.
Download size:
320.59 MiB
Auto-cached (documentation): No
Figure (tfds.show_examples): Not supported.
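The six configs listed on this page follow a `<source>_<format>` naming pattern. As a minimal sketch, assuming the standard `tfds.load` entry point and the usual `name:version` syntax, the names could be built and loaded as follows. The helper names `config_name` and `load_split` are illustrative and not part of TFDS.

```python
from typing import Optional

SOURCES = ("crs", "gao")                  # agency that wrote the report
FORMATS = ("whitespace", "html", "json")  # how the report structure is encoded


def config_name(source: str, fmt: str, version: Optional[str] = None) -> str:
    """Build a TFDS name such as 'gov_report/crs_whitespace:1.0.0'."""
    if source not in SOURCES or fmt not in FORMATS:
        raise ValueError(f"unknown config: {source}_{fmt}")
    name = f"gov_report/{source}_{fmt}"
    return f"{name}:{version}" if version else name


def load_split(source: str, fmt: str, split: str = "train"):
    """Load one split of one config (hypothetical helper).

    TFDS is imported lazily because the first call downloads the data.
    """
    import tensorflow_datasets as tfds  # requires the tensorflow-datasets package
    return tfds.load(config_name(source, fmt), split=split)


print(config_name("crs", "whitespace"))
print(config_name("gao", "json", version="1.0.0"))
```

Calling `load_split("crs", "whitespace")` would then fetch the default config; it is left uncalled here because it triggers the download.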
Citation:
@inproceedings{
anonymous2022efficiently,
title={Efficiently Modeling Long Sequences with Structured State Spaces},
author={Anonymous},
booktitle={Submitted to The Tenth International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=uYLFoz1vlAC},
note={under review}
}
gov_report/crs_whitespace (default config)
Config description: CRS report with summary. Structures are flattened and joined by whitespace. This is the format used by the original paper.
Dataset size:
349.76 MiB
Splits:
Split | Examples |
---|---|
'test' | 362 |
'train' | 6,514 |
'validation' | 362 |
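For a quick sense of how the CRS examples are divided, the split counts in the table above can be turned into proportions. This is plain arithmetic on the listed numbers; the function name `split_fractions` is illustrative.

```python
# Example counts copied from the splits table for the CRS configs.
CRS_SPLITS = {"train": 6514, "validation": 362, "test": 362}


def split_fractions(splits: dict) -> dict:
    """Return each split's share of the total number of examples."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


total = sum(CRS_SPLITS.values())
fractions = split_fractions(CRS_SPLITS)

print(f"total examples: {total}")
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The counts work out to roughly a 90/5/5 division between train, validation, and test.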
- Feature structure:
FeaturesDict({
'id': Text(shape=(), dtype=string),
'released_date': Text(shape=(), dtype=string),
'reports': Text(shape=(), dtype=string),
'summary': Text(shape=(), dtype=string),
'title': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
id | Text | | string | |
released_date | Text | | string | |
reports | Text | | string | |
summary | Text | | string | |
title | Text | | string | |
Supervised keys (see the as_supervised doc): ('reports', 'summary')
Examples (tfds.as_dataframe):
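The example table is not reproduced here. A minimal sketch of how one could be generated, assuming `tfds.load(..., with_info=True)` and `tfds.as_dataframe(ds, ds_info)`, is shown below. The helpers `truncate_fields` and `preview_dataframe` are hypothetical; the truncation step is there only because reports are long.

```python
def truncate_fields(example: dict, width: int = 80) -> dict:
    """Shorten long text fields so one example fits on a screen."""
    shortened = {}
    for key, value in example.items():
        if isinstance(value, bytes):
            # Text features are typically bytes when iterated as NumPy values.
            value = value.decode("utf-8", errors="replace")
        if isinstance(value, str) and len(value) > width:
            value = value[:width] + "..."
        shortened[key] = value
    return shortened


def preview_dataframe(name: str = "gov_report/crs_whitespace", n: int = 3):
    """Return a pandas DataFrame holding the first n training examples."""
    import tensorflow_datasets as tfds  # lazy import: loading downloads the data
    ds, info = tfds.load(name, split="train", with_info=True)
    return tfds.as_dataframe(ds.take(n), info)


row = truncate_fields({"id": b"R00001", "reports": "x" * 200}, width=20)
print(row)
```

`preview_dataframe()` is left uncalled because it requires the dataset to be downloaded first.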
gov_report/gao_whitespace
Config description: GAO report with highlight. Structures are flattened and joined by whitespace. This is the format used by the original paper.
Dataset size:
690.24 MiB
Splits:
Split | Examples |
---|---|
'test' | 611 |
'train' | 11,005 |
'validation' | 612 |
- Feature structure:
FeaturesDict({
'fastfact': Text(shape=(), dtype=string),
'highlight': Text(shape=(), dtype=string),
'id': Text(shape=(), dtype=string),
'published_date': Text(shape=(), dtype=string),
'released_date': Text(shape=(), dtype=string),
'report': Text(shape=(), dtype=string),
'title': Text(shape=(), dtype=string),
'url': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
fastfact | Text | | string | |
highlight | Text | | string | |
id | Text | | string | |
published_date | Text | | string | |
released_date | Text | | string | |
report | Text | | string | |
title | Text | | string | |
url | Text | | string | |
Supervised keys (see the as_supervised doc): ('report', 'highlight')
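The supervised keys differ between the two report sources: CRS configs pair `reports` with `summary`, while GAO configs pair `report` with `highlight`. Passing `as_supervised=True` to `tfds.load` yields such pairs directly; the sketch below mirrors that pairing for plain dictionaries, with `to_supervised` as an illustrative helper name.

```python
# (input key, target key) per report source, as listed under "Supervised keys".
SUPERVISED_KEYS = {
    "crs": ("reports", "summary"),
    "gao": ("report", "highlight"),
}


def to_supervised(example: dict, config: str):
    """Return (document, target) for an example of a config such as 'gao_html'."""
    # Accept both 'gov_report/gao_html' and 'gao_html'.
    source = config.split("/")[-1].split("_")[0]
    input_key, target_key = SUPERVISED_KEYS[source]
    return example[input_key], example[target_key]


crs_example = {"id": "1", "reports": "full CRS text", "summary": "short summary"}
gao_example = {"id": "2", "report": "full GAO text", "highlight": "key points"}

print(to_supervised(crs_example, "gov_report/crs_whitespace"))
print(to_supervised(gao_example, "gao_json"))
```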
Examples (tfds.as_dataframe):
gov_report/crs_html
Config description: CRS report with summary. Structures are flattened and joined by newlines, with HTML tags added. Tags are only added for section_title, in a format like <h2>xxx<h2>.
Dataset size:
351.25 MiB
Splits:
Split | Examples |
---|---|
'test' | 362 |
'train' | 6,514 |
'validation' | 362 |
- Feature structure:
FeaturesDict({
'id': Text(shape=(), dtype=string),
'released_date': Text(shape=(), dtype=string),
'reports': Text(shape=(), dtype=string),
'summary': Text(shape=(), dtype=string),
'title': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
id | Text | | string | |
released_date | Text | | string | |
reports | Text | | string | |
summary | Text | | string | |
title | Text | | string | |
Supervised keys (see the as_supervised doc): ('reports', 'summary')
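In the HTML configs, section titles are the only tagged content, so they can be recovered with a small pattern match. The sketch below is hedged: the config description shows the tag as `<h2>xxx<h2>`, and it is an assumption here that headings may appear at other levels or with a conventional closing tag, so the pattern accepts both forms. `section_titles` is an illustrative name.

```python
import re

# Matches "<h2>Title<h2>" (as written in the config description) as well as
# "<h2>Title</h2>"; which form the data actually uses is an assumption.
HEADING = re.compile(r"<h(\d)>(.*?)</?h\1>")


def section_titles(report: str):
    """Return (level, title) pairs for every heading found in a report string."""
    return [(int(level), title.strip()) for level, title in HEADING.findall(report)]


# Made-up sample text, not taken from the dataset.
sample = (
    "<h1>Introduction<h1>\n"
    "First paragraph.\n"
    "<h2>Background</h2>\n"
    "Second paragraph."
)
titles = section_titles(sample)
print(titles)
```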
Examples (tfds.as_dataframe):
gov_report/gao_html
Config description: GAO report with highlight. Structures are flattened and joined by newlines, with HTML tags added. Tags are only added for section_title, in a format like <h2>xxx<h2>.
Dataset size:
692.72 MiB
Splits:
Split | Examples |
---|---|
'test' | 611 |
'train' | 11,005 |
'validation' | 612 |
- Feature structure:
FeaturesDict({
'fastfact': Text(shape=(), dtype=string),
'highlight': Text(shape=(), dtype=string),
'id': Text(shape=(), dtype=string),
'published_date': Text(shape=(), dtype=string),
'released_date': Text(shape=(), dtype=string),
'report': Text(shape=(), dtype=string),
'title': Text(shape=(), dtype=string),
'url': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
fastfact | Text | | string | |
highlight | Text | | string | |
id | Text | | string | |
published_date | Text | | string | |
released_date | Text | | string | |
report | Text | | string | |
title | Text | | string | |
url | Text | | string | |
Supervised keys (see the as_supervised doc): ('report', 'highlight')
Examples (tfds.as_dataframe):
gov_report/crs_json
Config description: CRS report with summary. Structures are represented as raw JSON.
Dataset size:
361.92 MiB
Splits:
Split | Examples |
---|---|
'test' | 362 |
'train' | 6,514 |
'validation' | 362 |
- Feature structure:
FeaturesDict({
'id': Text(shape=(), dtype=string),
'released_date': Text(shape=(), dtype=string),
'reports': Text(shape=(), dtype=string),
'summary': Text(shape=(), dtype=string),
'title': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
id | Text | | string | |
released_date | Text | | string | |
reports | Text | | string | |
summary | Text | | string | |
title | Text | | string | |
Supervised keys (see the as_supervised doc): ('reports', 'summary')
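With the JSON configs, the report field holds a serialized structure rather than flat text, so it has to be parsed before use. The sketch below assumes a nested layout in which each section carries `section_title`, `paragraphs`, and `subsections` keys. Only `section_title` is mentioned on this page; the other two keys, the sample data, and the helper name `flatten` are assumptions for illustration.

```python
import json


def flatten(section: dict, depth: int = 1):
    """Yield (depth, title, text) for a section and all nested subsections."""
    title = section.get("section_title", "")
    text = " ".join(section.get("paragraphs", []))
    yield depth, title, text
    for child in section.get("subsections", []):
        yield from flatten(child, depth + 1)


# Made-up stand-in for the string stored in the 'reports' feature.
raw = json.dumps({
    "section_title": "Overview",
    "paragraphs": ["A.", "B."],
    "subsections": [
        {"section_title": "Details", "paragraphs": ["C."], "subsections": []},
    ],
})

rows = list(flatten(json.loads(raw)))
for depth, title, text in rows:
    print("  " * (depth - 1) + f"{title}: {text}")
```

If the real top-level value is a list of sections rather than a single object, the same function could be applied to each element.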
Examples (tfds.as_dataframe):
gov_report/gao_json
Config description: GAO report with highlight. Structures are represented as raw JSON.
Dataset size:
712.82 MiB
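The dataset sizes listed for the six configs make it easy to see how much each structured format adds over the whitespace baseline. The sketch below is arithmetic on the figures from this page; `overhead` is an illustrative helper name.

```python
# Dataset sizes in MiB, copied from the "Dataset size" entries of each config.
SIZES_MIB = {
    "crs": {"whitespace": 349.76, "html": 351.25, "json": 361.92},
    "gao": {"whitespace": 690.24, "html": 692.72, "json": 712.82},
}


def overhead(source: str, fmt: str) -> float:
    """Relative size increase of a format over the whitespace config."""
    base = SIZES_MIB[source]["whitespace"]
    return SIZES_MIB[source][fmt] / base - 1.0


for source in SIZES_MIB:
    for fmt in ("html", "json"):
        print(f"{source}_{fmt}: +{overhead(source, fmt):.2%}")
```

By these numbers the HTML configs add well under 1% and the JSON configs add a little over 3%.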
Splits:
Split | Examples |
---|---|
'test' | 611 |
'train' | 11,005 |
'validation' | 612 |
- Feature structure:
FeaturesDict({
'fastfact': Text(shape=(), dtype=string),
'highlight': Text(shape=(), dtype=string),
'id': Text(shape=(), dtype=string),
'published_date': Text(shape=(), dtype=string),
'released_date': Text(shape=(), dtype=string),
'report': Text(shape=(), dtype=string),
'title': Text(shape=(), dtype=string),
'url': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
fastfact | Text | | string | |
highlight | Text | | string | |
id | Text | | string | |
published_date | Text | | string | |
released_date | Text | | string | |
report | Text | | string | |
title | Text | | string | |
url | Text | | string | |
Supervised keys (see the as_supervised doc): ('report', 'highlight')
Examples (tfds.as_dataframe):