RecordInput asynchronously reads and randomly yields TFRecords.
tf.contrib.framework.RecordInput(
file_pattern, batch_size=1, buffer_size=1, parallelism=1, shift_ratio=0, seed=0,
name=None, batches=None, compression_type=None
)
A RecordInput Op will continuously read a batch of records asynchronously into a buffer of some fixed capacity. It can also asynchronously yield random records from this buffer.
It will not start yielding until at least buffer_size / 2 elements have been placed into the buffer so that sufficient randomization can take place.
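The warm-up rule can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not the Op's actual implementation: the class `HalfFullShuffleBuffer` and its methods are hypothetical names, and the "stay warmed up once the threshold has been reached" behavior is an assumption made for illustration.

```python
import random


class HalfFullShuffleBuffer:
    """Toy model of a shuffle buffer with a half-full warm-up threshold."""

    def __init__(self, buffer_size, seed=0):
        self.buffer_size = buffer_size
        self.records = []
        self.rng = random.Random(seed)
        # Assumption: once the threshold is reached, yielding stays enabled.
        self.warmed_up = False

    def put(self, record):
        """Adds one record; refuses to grow past buffer_size."""
        if len(self.records) >= self.buffer_size:
            raise OverflowError("buffer is full")
        self.records.append(record)
        # Yielding may start once at least buffer_size / 2 records are held.
        if len(self.records) * 2 >= self.buffer_size:
            self.warmed_up = True

    def can_yield(self):
        return self.warmed_up and bool(self.records)

    def take(self):
        """Removes and returns one uniformly chosen record."""
        if not self.can_yield():
            raise RuntimeError("buffer has not reached buffer_size / 2 yet")
        index = self.rng.randrange(len(self.records))
        # Swap the chosen record to the end so removal is O(1).
        self.records[index], self.records[-1] = self.records[-1], self.records[index]
        return self.records.pop()
```

With buffer_size=8, the sketch refuses to yield after three records and allows it after the fourth.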
The order the files are read will be shifted each epoch by shift_amount so that the data is presented in a different order every epoch.
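One way to picture the per-epoch shift is as a rotation of the file list. The sketch below is an illustration under stated assumptions, not the Op's implementation: the function `epoch_file_order` is hypothetical, the shift amount is assumed to be shift_ratio multiplied by the number of files (truncated to an integer), and any additional shuffling of files is ignored.

```python
def epoch_file_order(filenames, shift_ratio, epoch):
    """Returns the file order for one epoch under a rotate-by-ratio model."""
    num_files = len(filenames)
    if num_files == 0:
        return []
    # Assumed: the shift per epoch is shift_ratio * num_files, truncated.
    shift_amount = int(shift_ratio * num_files)
    # The start file moves forward by shift_amount for every elapsed epoch.
    start = (shift_amount * epoch) % num_files
    return filenames[start:] + filenames[:start]


files = ["a.tfrecord", "b.tfrecord", "c.tfrecord", "d.tfrecord"]
for epoch in range(3):
    print(epoch, epoch_file_order(files, shift_ratio=0.25, epoch=epoch))
```

Under this model a shift_ratio of 0 leaves the file order unchanged from epoch to epoch.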
Args | |
---|---|
file_pattern | File path to the dataset, possibly containing wildcards. All matching files will be iterated over each epoch. |
batch_size | How many records to return at a time. |
buffer_size | The maximum number of records the buffer will contain. |
parallelism | How many reader threads to use for reading from files. |
shift_ratio | What fraction of the total number of files to move the start file forward by each epoch. |
seed | The random number seed used by the generator that randomizes records. |
name | Optional name for the operation. |
batches | None by default, creating a single batch op. Otherwise specifies how many batches to create, which are returned as a list when get_yield_op() is called. An example use case is to split processing between devices on one computer. |
compression_type | The type of compression for the file. Currently ZLIB and GZIP are supported. Defaults to none. |
Raises | |
---|---|
ValueError | If one of the arguments is invalid. |
Methods
get_yield_op
get_yield_op()
Adds a node that yields a group of records every time it is executed.
If the RecordInput batches parameter is not None, it yields a list of record batches with the specified batch_size.
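A minimal usage sketch follows, assuming TensorFlow 1.x in graph mode (tf.contrib is not available in TensorFlow 2.x). The helper names `build_record_yield_op` and `read_one_batch`, the argument values, and the file pattern in the usage note are illustrative assumptions; only the constructor arguments and get_yield_op() come from the signature documented here.

```python
def build_record_yield_op(file_pattern, batch_size=32, buffer_size=1024,
                          batches=None):
    """Builds a RecordInput and returns the result of get_yield_op()."""
    # Imported lazily because tf.contrib exists only in TensorFlow 1.x.
    import tensorflow as tf

    record_input = tf.contrib.framework.RecordInput(
        file_pattern=file_pattern,
        batch_size=batch_size,
        buffer_size=buffer_size,
        parallelism=4,      # illustrative value
        shift_ratio=0.1,    # illustrative value
        seed=301,           # illustrative value
        batches=batches,
    )
    # With batches=None this is a single op; otherwise it is a list.
    return record_input.get_yield_op()


def read_one_batch(file_pattern):
    """Runs the yield op once and returns one batch of serialized records."""
    import tensorflow as tf

    yield_op = build_record_yield_op(file_pattern)
    with tf.Session() as sess:
        return sess.run(yield_op)
```

Calling `read_one_batch("/path/to/data/train-*.tfrecord")` with a pattern that matches real TFRecord files (the path shown is a placeholder) would return one batch of serialized records, which still need to be parsed before use.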