An agent that uses and trains a greedy reward prediction policy.
Classes
`class GreedyRewardPredictionAgent`: A neural reward network based bandit agent.
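To make the idea of "using and training a greedy reward prediction policy" concrete, here is a minimal sketch in plain Python. It does not use this module's API: `LinearRewardModel`, `greedy_action`, and the toy reward function are hypothetical names introduced only for illustration, and a per-action linear model stands in for the neural reward network. The sketch assumes the reward model predicts one reward per action, the policy picks the action with the highest prediction, and training regresses the prediction for the taken action toward the observed reward.

```python
class LinearRewardModel:
    """Hypothetical stand-in for a reward network: one linear predictor per action."""

    def __init__(self, num_actions, context_dim, learning_rate=0.5):
        self.weights = [[0.0] * context_dim for _ in range(num_actions)]
        self.learning_rate = learning_rate

    def predict(self, context):
        # Predicted reward for every action given the context.
        return [sum(w * x for w, x in zip(row, context)) for row in self.weights]

    def train(self, context, action, reward):
        # One gradient step on the squared error of the taken action only;
        # rewards of the other actions are unobserved in a bandit setting.
        error = self.predict(context)[action] - reward
        row = self.weights[action]
        for i, x in enumerate(context):
            row[i] -= self.learning_rate * error * x
        return error ** 2


def greedy_action(model, context):
    # Greedy policy: choose the action with the highest predicted reward.
    predictions = model.predict(context)
    return max(range(len(predictions)), key=predictions.__getitem__)


def true_reward(context, action):
    # Toy environment (assumption): action i pays 1.0 when feature i is active.
    return context[action]


contexts = [[1.0, 0.0], [0.0, 1.0]]
model = LinearRewardModel(num_actions=2, context_dim=2)

# Train on every (context, action) pair so that each action's reward is observed.
for _ in range(20):
    for context in contexts:
        for action in range(2):
            model.train(context, action, true_reward(context, action))

print(greedy_action(model, [1.0, 0.0]))  # prints 0
print(greedy_action(model, [0.0, 1.0]))  # prints 1
```

Note that the training loop above visits every action deliberately. A purely greedy policy does no exploration of its own, so in practice the quality of the learned predictions depends on how the training data was collected.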
Other Members | |
---|---|
absolute_import | Instance of `__future__._Feature` |
division | Instance of `__future__._Feature` |
print_function | Instance of `__future__._Feature` |