A float tensor of shape [outer_dim1, ..., outer_dimK, action_dim1,
..., action_dimJ].
actions
An int tensor of shape [outer_dim1, ..., outer_dimK] if
multi_dim_actions=False, or [outer_dim1, ..., outer_dimK, J] if
multi_dim_actions=True. In the multidimensional case, each entry of
actions, indexed over the outer dimensions, is a vector [actions_1, ...,
actions_J] where each element actions_j is an action in the range [0,
num_actions_j). In the single-dimensional case, each such entry is a
scalar.
multi_dim_actions
Whether the actions are multidimensional.
Returns
A tensor of shape [outer_dim1, ..., outer_dimK] containing the q_values for the given actions.
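The indexing semantics described above can be illustrated with a minimal pure-Python sketch on nested lists. This is not the library implementation: the function name index_with_actions_sketch is hypothetical, and it assumes that tensors are represented as nested Python lists and that actions are plain ints.

```python
def index_with_actions_sketch(q_values, actions, multi_dim_actions=False):
    """Illustrative only: picks one q-value per outer index from nested lists.

    Assumption: q_values and actions are nested Python lists standing in for
    tensors, and every action is a plain int.
    """
    if multi_dim_actions:
        # A leaf of `actions` is a flat vector [actions_1, ..., actions_J].
        is_leaf = isinstance(actions, list) and all(
            isinstance(a, int) for a in actions
        )
        indices = actions if is_leaf else None
    else:
        # A leaf of `actions` is a single scalar action.
        is_leaf = isinstance(actions, int)
        indices = [actions] if is_leaf else None

    if is_leaf:
        # Walk down the action dimensions of q_values, one index per dimension.
        value = q_values
        for i in indices:
            value = value[i]
        return value

    # Otherwise recurse over the next outer dimension.
    return [
        index_with_actions_sketch(q, a, multi_dim_actions)
        for q, a in zip(q_values, actions)
    ]


# Single-dimensional case: q_values has shape [2, 3], actions has shape [2].
single = index_with_actions_sketch(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [2, 0]
)
print(single)  # [3.0, 4.0]

# Multidimensional case (J = 2): q_values has shape [2, 2, 3],
# actions has shape [2, 2].
multi = index_with_actions_sketch(
    [[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
     [[6.0, 7.0, 8.0], [9.0, 10.0, 11.0]]],
    [[1, 2], [0, 1]],
    multi_dim_actions=True,
)
print(multi)  # [5.0, 7.0]
```

In both cases the result has only the outer dimensions: every action dimension of q_values is consumed by one action index.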