Constructs an action mask from multiple sources.
tf_agents.bandits.policies.constraints.construct_mask_from_multiple_sources(
    observation: tf_agents.typing.types.NestedTensor,
    observation_and_action_constraint_splitter: tf_agents.typing.types.Splitter,
    constraints: Iterable[tf_agents.bandits.policies.constraints.BaseConstraint],
    max_num_actions: int
) -> Optional[types.Tensor]
The sources include:
-- The action mask encoded in the observation,
-- the `num_actions` feature restricting the number of actions per sample,
-- the feasibility mask implied by constraints.
The resulting mask disables all actions that are masked out in any of the three sources.
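
To illustrate the combination rule described above, here is a minimal plain-Python sketch that uses nested lists in place of tensors. It is not the library's implementation: the helper name `combine_masks_sketch` and its parameters are hypothetical, and it assumes that a `num_actions` value of `k` means the first `k` actions of a sample are the valid ones.

```python
from typing import List, Optional

# A mask has shape [batch_size, max_num_actions]; entries are 0 or 1.
Mask = List[List[int]]


def combine_masks_sketch(
    observation_mask: Optional[Mask],
    num_actions: Optional[List[int]],
    constraint_mask: Optional[Mask],
    batch_size: int,
    max_num_actions: int,
) -> Mask:
    """Hypothetical helper: an action stays enabled only if every
    available source allows it (element-wise logical AND)."""
    # Start with every action enabled.
    combined = [[1] * max_num_actions for _ in range(batch_size)]
    for b in range(batch_size):
        for a in range(max_num_actions):
            # Source 1: mask carried inside the observation.
            if observation_mask is not None and not observation_mask[b][a]:
                combined[b][a] = 0
            # Source 2: per-sample action count.
            # Assumption: the first num_actions[b] actions are the valid ones.
            if num_actions is not None and a >= num_actions[b]:
                combined[b][a] = 0
            # Source 3: feasibility mask implied by the constraints.
            if constraint_mask is not None and not constraint_mask[b][a]:
                combined[b][a] = 0
    return combined


example = combine_masks_sketch(
    observation_mask=[[1, 1, 0, 1], [1, 1, 1, 1]],
    num_actions=[4, 2],
    constraint_mask=[[1, 0, 1, 1], [1, 1, 1, 0]],
    batch_size=2,
    max_num_actions=4,
)
print(example)  # [[1, 0, 0, 1], [1, 1, 0, 0]]
```

In the first sample, action 2 is disabled by the observation mask and action 1 by the constraints; in the second, `num_actions = 2` disables actions 2 and 3 regardless of the other sources.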
| Returns |
|---|
| An action mask in the form of a `[batch_size, max_num_actions]` 0-1 tensor. |