View source on GitHub |
Vocabulary information for warm-starting.
tf.estimator.VocabInfo(
new_vocab,
new_vocab_size,
num_oov_buckets,
old_vocab,
old_vocab_size=-1,
backup_initializer=None,
axis=0
)
See tf.estimator.WarmStartSettings
for examples of using
VocabInfo to warm-start.
Args: new_vocab: [Required] A path to the new vocabulary file (used with the model to be trained). new_vocab_size: [Required] An integer indicating how many entries of the new vocabulary will used in training. num_oov_buckets: [Required] An integer indicating how many OOV buckets are associated with the vocabulary. old_vocab: [Required] A path to the old vocabulary file (used with the checkpoint to be warm-started from). old_vocab_size: [Optional] An integer indicating how many entries of the old vocabulary were used in the creation of the checkpoint. If not provided, the entire old vocabulary will be used. backup_initializer: [Optional] A variable initializer used for variables corresponding to new vocabulary entries and OOV. If not provided, these entries will be zero-initialized. axis: [Optional] Denotes what axis the vocabulary corresponds to. The default, 0, corresponds to the most common use case (embeddings or linear weights for binary classification / regression). An axis of 1 could be used for warm-starting output layers with class vocabularies.
Returns:
A VocabInfo
which represents the vocabulary information for warm-starting.
Raises:
ValueError: axis
is neither 0 or 1.
Example Usage:
embeddings_vocab_info = tf.VocabInfo(
new_vocab='embeddings_vocab',
new_vocab_size=100,
num_oov_buckets=1,
old_vocab='pretrained_embeddings_vocab',
old_vocab_size=10000,
backup_initializer=tf.compat.v1.truncated_normal_initializer(
mean=0.0, stddev=(1 / math.sqrt(embedding_dim))),
axis=0)
softmax_output_layer_kernel_vocab_info = tf.VocabInfo(
new_vocab='class_vocab',
new_vocab_size=5,
num_oov_buckets=0, # No OOV for classes.
old_vocab='old_class_vocab',
old_vocab_size=8,
backup_initializer=tf.compat.v1.glorot_uniform_initializer(),
axis=1)
softmax_output_layer_bias_vocab_info = tf.VocabInfo(
new_vocab='class_vocab',
new_vocab_size=5,
num_oov_buckets=0, # No OOV for classes.
old_vocab='old_class_vocab',
old_vocab_size=8,
backup_initializer=tf.compat.v1.zeros_initializer(),
axis=0)
#Currently, only axis=0 and axis=1 are supported.
```
<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2"><h2 class="add-link">Attributes</h2></th></tr>
<tr>
<td>
`new_vocab`<a id="new_vocab"></a>
</td>
<td>
A `namedtuple` alias for field number 0
</td>
</tr><tr>
<td>
`new_vocab_size`<a id="new_vocab_size"></a>
</td>
<td>
A `namedtuple` alias for field number 1
</td>
</tr><tr>
<td>
`num_oov_buckets`<a id="num_oov_buckets"></a>
</td>
<td>
A `namedtuple` alias for field number 2
</td>
</tr><tr>
<td>
`old_vocab`<a id="old_vocab"></a>
</td>
<td>
A `namedtuple` alias for field number 3
</td>
</tr><tr>
<td>
`old_vocab_size`<a id="old_vocab_size"></a>
</td>
<td>
A `namedtuple` alias for field number 4
</td>
</tr><tr>
<td>
`backup_initializer`<a id="backup_initializer"></a>
</td>
<td>
A `namedtuple` alias for field number 5
</td>
</tr><tr>
<td>
`axis`<a id="axis"></a>
</td>
<td>
A `namedtuple` alias for field number 6
</td>
</tr>
</table>