tf_agents.bandits.multi_objective.multi_objective_scalarizer.HyperVolumeScalarizer

Implements the hypervolume scalarization.

Inherits From: Scalarizer

tf_agents.bandits.multi_objective.multi_objective_scalarizer.HyperVolumeScalarizer(
    direction: Sequence[tf_agents.bandits.multi_objective.multi_objective_scalarizer.ScalarFloat],
    transform_params: Sequence[tf_agents.bandits.multi_objective.multi_objective_scalarizer.HyperVolumeScalarizer.PARAMS],
    multi_objective_transform: Optional[Callable[[tf.Tensor, Sequence[ScalarFloat], Sequence[ScalarFloat]],
        tf.Tensor]] = None
)

Given a vector of (at least two) objectives M, a unit-length vector V with non-negative coordinates, a slope vector A, and an offset vector B, all having the same dimension, the hypervolume scalarization of M is defined as:

min_{i: V_i > 0} max(A_i * M_i + B_i, 0) / V_i.

See https://arxiv.org/abs/2006.04655 for more details. It is recommended to choose A_i and B_i so that the transformed objectives A_i * M_i + B_i are non-negative.
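The definition above can be sketched for a single objective vector as follows. This is a minimal NumPy illustration of the formula, not the library's implementation; the function name `hypervolume_scalarize` and the sample values are assumptions made for this example.

```python
import numpy as np


def hypervolume_scalarize(objectives, direction, slopes, offsets):
    """Hypothetical helper: min over {i: V_i > 0} of max(A_i * M_i + B_i, 0) / V_i."""
    m = np.asarray(objectives, dtype=float)
    a = np.asarray(slopes, dtype=float)
    b = np.asarray(offsets, dtype=float)
    v = np.asarray(direction, dtype=float)
    v = v / np.linalg.norm(v)  # V must have unit length

    transformed = np.maximum(a * m + b, 0.0)  # clip transformed objectives at zero
    positive = v > 0                          # only coordinates with V_i > 0 take part
    return float(np.min(transformed[positive] / v[positive]))


# Both coordinates take part in the minimum.
balanced = hypervolume_scalarize([3.0, 4.0], [3.0, 4.0], [1.0, 1.0], [0.0, 0.0])

# The second coordinate of the direction is zero, so the second objective is ignored.
first_only = hypervolume_scalarize([2.0, -7.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0])
```

In the first call the normalized direction is (0.6, 0.8), so both ratios are close to 5. In the second call only the first objective contributes, and the result is 2.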

Args
`direction`	A `Sequence` representing a directional vector, which will be normalized to have unit length. Coordinates of the normalized direction whose absolute values are less than `HyperVolumeScalarizer.ALMOST_ZERO` will be considered zeros.
`transform_params`	A `Sequence` of namedtuples `HyperVolumeScalarizer.PARAMS`, each containing a slope and an offset for transforming an objective to be non-negative.
`multi_objective_transform`	An optional `Callable` that takes a `tf.Tensor` of multiple objective values, a `Sequence` of slopes, and a `Sequence` of offsets, and returns a `tf.Tensor` of transformed objectives. If unset, it defaults to the standard transform `multi_objectives * slopes + offsets`.
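To illustrate the shape of a `multi_objective_transform` callable, here is a minimal sketch that uses NumPy arrays as a stand-in for `tf.Tensor`. Both function names are hypothetical: `default_transform` mirrors the standard transform described above, and `softplus_transform` is an assumed alternative that keeps the output positive while preserving the ordering of each objective when its slope is positive.

```python
import numpy as np


def default_transform(multi_objectives, slopes, offsets):
    """Standard affine transform: objectives * slopes + offsets, column by column."""
    m = np.asarray(multi_objectives, dtype=float)
    return m * np.asarray(slopes, dtype=float) + np.asarray(offsets, dtype=float)


def softplus_transform(multi_objectives, slopes, offsets):
    """Hypothetical alternative: softplus of the affine output, which is always positive."""
    affine = default_transform(multi_objectives, slopes, offsets)
    return np.logaddexp(0.0, affine)  # numerically stable log(1 + exp(x))


objectives = np.array([[1.0, -2.0], [0.5, 4.0]])  # shape [batch_size, num_objectives]
affine_out = default_transform(objectives, [2.0, 1.0], [0.0, 3.0])
positive_out = softplus_transform(objectives, [2.0, 1.0], [0.0, 3.0])
```

A real callable passed to the constructor would operate on `tf.Tensor` inputs and return a `tf.Tensor`; the same signature applies.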

Raises
`ValueError`	if `any([x < 0 for x in direction])`.
`ValueError`	if the 2-norm of `direction` is less than `HyperVolumeScalarizer.ALMOST_ZERO`.
`ValueError`	if `len(transform_params) != len(self._direction)`.
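The checks on `direction` listed above can be sketched as follows. This is an illustration of the documented behavior under the assumption that validation happens before normalization; `normalize_direction` is a hypothetical helper, not part of the library.

```python
import numpy as np

ALMOST_ZERO = 1e-16  # value of HyperVolumeScalarizer.ALMOST_ZERO given on this page


def normalize_direction(direction):
    """Hypothetical helper: validate a direction and return it with unit length."""
    d = np.asarray(direction, dtype=float)
    if np.any(d < 0):
        raise ValueError("direction must have non-negative coordinates")
    norm = np.linalg.norm(d)
    if norm < ALMOST_ZERO:
        raise ValueError("the 2-norm of direction is too close to zero")
    unit = d / norm
    # Coordinates that are almost zero after normalization are treated as zero.
    unit[np.abs(unit) < ALMOST_ZERO] = 0.0
    return unit


unit_direction = normalize_direction([3.0, 4.0])  # approximately [0.6, 0.8]
```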

Child Classes

class PARAMS

Methods

`set_parameters`

set_parameters(
    direction: tf.Tensor, transform_params: Dict[str, tf.Tensor]
)

Set the scalarization parameters for the HyperVolumeScalarizer.

Args
`direction`	A `tf.Tensor` representing a directional vector, which will be normalized to have unit length. Coordinates of the normalized direction whose absolute values are less than `HyperVolumeScalarizer.ALMOST_ZERO` will be considered zeros. It must be rank-2 and shaped as [batch_size, self._num_of_objectives], where `batch_size` should match the batch size of the multi objectives passed to the scalarizer call.
`transform_params`	A dictionary mapping `self.SLOPE_KEY` and/or `self.OFFSET_KEY` to `tf.Tensor`, representing the slope and the offset parameters for transforming an objective to be non-negative. These tensors must satisfy the same rank and shape requirements as `direction`.

Raises
`ValueError`	if any input scalarization parameter tensor is not rank-2, or has a last dimension size that does not match `self._num_of_objectives`.
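The rank and shape requirements for `set_parameters` can be sketched with NumPy arrays standing in for `tf.Tensor`. The helper `check_parameters` is hypothetical, and the key strings are the `SLOPE_KEY` and `OFFSET_KEY` values given on this page.

```python
import numpy as np

SLOPE_KEY = "slope"    # HyperVolumeScalarizer.SLOPE_KEY
OFFSET_KEY = "offset"  # HyperVolumeScalarizer.OFFSET_KEY


def check_parameters(direction, transform_params, num_of_objectives):
    """Hypothetical helper: enforce rank-2 inputs whose last dimension matches."""
    checked = {}
    for name, value in {"direction": direction, **transform_params}.items():
        arr = np.asarray(value, dtype=float)
        if arr.ndim != 2:
            raise ValueError(f"{name} must be rank-2, got rank {arr.ndim}")
        if arr.shape[-1] != num_of_objectives:
            raise ValueError(
                f"{name} has last dimension {arr.shape[-1]}, "
                f"expected {num_of_objectives}")
        checked[name] = arr
    return checked


# One row of parameters per example in a batch of two, with three objectives.
checked = check_parameters(
    direction=[[1.0, 0.0, 1.0], [0.0, 2.0, 1.0]],
    transform_params={
        SLOPE_KEY: [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]],
        OFFSET_KEY: [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
    },
    num_of_objectives=3,
)
```

Because the parameters are batched, each example in the batch can be scalarized with its own direction, slope, and offset.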

`__call__`

__call__(
    multi_objectives: tf.Tensor
) -> tf.Tensor

Returns a single reward by scalarizing multiple objectives.

Args
`multi_objectives`	A `Tensor` of shape [batch_size, number_of_objectives], where each column represents an objective.

Returns
A `Tensor` of shape [batch_size] representing scalarized rewards.

Raises
`ValueError`	if `multi_objectives.shape.rank != 2`.
`ValueError`	if `multi_objectives.shape.dims[1] != self._num_of_objectives`.
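The input and output shapes of `__call__` can be illustrated with a batched NumPy sketch. It assumes one direction, slope vector, and offset vector shared by the whole batch; `scalarize_batch` is a hypothetical function, not the library's code.

```python
import numpy as np


def scalarize_batch(multi_objectives, direction, slopes, offsets):
    """Hypothetical helper: map [batch_size, num_objectives] to [batch_size]."""
    m = np.asarray(multi_objectives, dtype=float)
    v = np.asarray(direction, dtype=float)
    if m.ndim != 2:
        raise ValueError("multi_objectives must be rank-2")
    if m.shape[1] != v.shape[0]:
        raise ValueError("number of objectives does not match the direction")

    v = v / np.linalg.norm(v)
    affine = m * np.asarray(slopes, dtype=float) + np.asarray(offsets, dtype=float)
    transformed = np.maximum(affine, 0.0)

    # Coordinates with V_i == 0 get an infinite ratio so they never win the min.
    safe_v = np.where(v > 0, v, 1.0)
    ratios = np.where(v > 0, transformed / safe_v, np.inf)
    return ratios.min(axis=1)


rewards = scalarize_batch(
    [[3.0, 4.0], [6.0, 16.0]],  # two examples, two objectives each
    direction=[3.0, 4.0],
    slopes=[1.0, 1.0],
    offsets=[0.0, 0.0],
)
```

Each row of the input yields one scalar reward, so `rewards` has shape `(2,)` here.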

Class Variables
ALMOST_ZERO	`1e-16`
DIRECTION_KEY	`'direction'`
OFFSET_KEY	`'offset'`
SLOPE_KEY	`'slope'`
