Network Outputting Expected Value and Variance of Rewards.
Classes
class HeteroscedasticQNetwork
: Network Outputting Expected Value and Variance of Rewards.
class QBanditNetworkResult
: QBanditNetworkResult(q_value_logits, log_variance)
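
As a rough illustration of the result type listed above, the sketch below assumes that `QBanditNetworkResult` behaves like a named tuple with the two documented fields. The stand-in definition, the helper `to_variance`, and the sample numbers are hypothetical and use plain Python lists; in the library itself the fields would hold tensors.

```python
import collections
import math

# Hypothetical stand-in mirroring the documented signature
# QBanditNetworkResult(q_value_logits, log_variance). Not the library's class.
QBanditNetworkResult = collections.namedtuple(
    "QBanditNetworkResult", ("q_value_logits", "log_variance"))


def to_variance(result):
    """Exponentiate each log-variance to recover a strictly positive variance."""
    return [math.exp(lv) for lv in result.log_variance]


# Illustrative values for a two-action problem (made up for this sketch).
result = QBanditNetworkResult(q_value_logits=[0.5, 1.2],
                              log_variance=[0.0, -1.0])
variances = to_variance(result)
```

Predicting the log of the variance rather than the variance itself keeps the recovered variance positive for any real-valued output, which is why the second field is named `log_variance`.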
Other Members | |
---|---|
absolute_import | Instance of `__future__._Feature` |
division | Instance of `__future__._Feature` |
print_function | Instance of `__future__._Feature` |