Optimizing Large-Scale Systems with Reinforcement Learning
Sayak Ray Chowdhury
Paperback

Reinforcement learning (RL) is concerned with learning, by trial and error, to take actions that maximize rewards in environments that can evolve in response to those actions. A Markov decision process (MDP) [6] is a popular framework for modeling decision making in RL environments. In an MDP, starting from an initial observed state, an agent repeatedly (a) takes an action, (b) receives a reward, and (c) observes the next state of the MDP. The traditional objective in RL is a search goal: find a policy (a rule to select an action for each state) with high total reward using as few interactions with the ...
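The interaction loop described in the blurb (take an action, receive a reward, observe the next state) can be illustrated with a minimal Python sketch. Everything here is a hypothetical illustration and not taken from the book: the two-state toy MDP, the `TRANSITIONS` table, and the `step` and `rollout` helpers are all assumptions. Transitions are deterministic for simplicity, whereas an MDP in general allows random transitions and rewards.

```python
from typing import Callable, Dict, Tuple

State = int
Action = int

# Hypothetical toy MDP (an assumption for illustration):
# TRANSITIONS[(state, action)] = (next_state, reward)
TRANSITIONS: Dict[Tuple[State, Action], Tuple[State, float]] = {
    (0, 0): (0, 0.0),
    (0, 1): (1, 1.0),
    (1, 0): (0, 0.0),
    (1, 1): (1, 2.0),
}


def step(state: State, action: Action) -> Tuple[State, float]:
    """Return the next state and reward for taking `action` in `state`."""
    return TRANSITIONS[(state, action)]


def rollout(policy: Callable[[State], Action],
            initial_state: State,
            horizon: int) -> float:
    """Run the agent-environment loop for `horizon` steps; return total reward."""
    state = initial_state
    total_reward = 0.0
    for _ in range(horizon):
        action = policy(state)                    # (a) take an action
        next_state, reward = step(state, action)  # (b) receive a reward
        total_reward += reward
        state = next_state                        # (c) observe the next state
    return total_reward


def always_one(state: State) -> Action:
    """A policy is a rule that selects an action for each state."""
    return 1


def always_zero(state: State) -> Action:
    return 0


if __name__ == "__main__":
    print(rollout(always_one, initial_state=0, horizon=3))   # 5.0
    print(rollout(always_zero, initial_state=0, horizon=3))  # 0.0
```

In this toy setting, comparing the total reward of the two policies is a crude stand-in for the search goal the blurb mentions: identifying a policy with high total reward.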