Returned with every call to policy.action() and policy.distribution().
tf_agents.trajectories.PolicyStep(
action=(), state=(), info=()
)
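A minimal sketch of where a PolicyStep comes from: build a policy from hand-written specs and call policy.action(). The spec shapes and dtypes below are illustrative assumptions, not part of this page.

import tensorflow as tf
from tf_agents.policies import random_tf_policy
from tf_agents.specs import tensor_spec
from tf_agents.trajectories import time_step as ts

# Assumed specs: a 2-float observation and a binary action.
observation_spec = tensor_spec.TensorSpec((2,), tf.float32)
action_spec = tensor_spec.BoundedTensorSpec((), tf.int32, minimum=0, maximum=1)
time_step_spec = ts.time_step_spec(observation_spec)

policy = random_tf_policy.RandomTFPolicy(time_step_spec, action_spec)
time_step = ts.restart(tf.zeros((1, 2), tf.float32), batch_size=1)

step = policy.action(time_step)
# step is a PolicyStep; RandomTFPolicy is stateless and emits no info,
# so step.state == () and step.info == ().
print(step.action)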
Attributes

action
    An action tensor or action distribution for TFPolicy, or a numpy array for PyPolicy.

state
    During inference, holds the state of the policy to be fed back into the next call to policy.action() or policy.distribution(), e.g. an RNN state. During training, holds the state that is input to policy.action() or policy.distribution(). For stateless policies, this is an empty tuple.

info
    Auxiliary information emitted by the policy, e.g. log probabilities of the actions. For policies without info, this is an empty tuple.
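Since PolicyStep is a namedtuple, the fields can also be built and read directly. The state and info values below (an RNN-style state, a log-probability entry) are illustrative assumptions:

import tensorflow as tf
from tf_agents.trajectories import policy_step

step = policy_step.PolicyStep(
    action=tf.constant([2]),                         # action tensor
    state=(tf.zeros((1, 32)), tf.zeros((1, 32))),    # e.g. an RNN state (assumed shape)
    info={'log_probability': tf.constant([-0.69])},  # auxiliary info (assumed structure)
)
# During inference, step.state would be fed back into the next call to
# policy.action(); stateless policies carry state=(), info-less policies info=().
print(step.action.numpy())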
Methods
replace
replace(
**kwargs
) -> 'PolicyStep'
Exposes the underlying namedtuple._replace.
Usage:
new_policy_step = policy_step.replace(action=())
This returns a new policy step with an empty action.
Args

**kwargs
    key/value pairs of fields in the policy step.

Returns

A new PolicyStep.
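A short sketch of replace() on a concrete instance: it returns a copy with the given fields swapped and leaves the original untouched.

from tf_agents.trajectories import policy_step

step = policy_step.PolicyStep(action=1, state=(), info=())
cleared = step.replace(action=())
print(cleared)  # PolicyStep(action=(), state=(), info=())
print(step)     # unchanged: PolicyStep(action=1, state=(), info=())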