Conservation of Utility in Markov Decision Processes

by Scott Worley, 2018-04-19

The standard Markov Decision Process (MDP) formalism is loose: it allows one to specify funhouse worlds of no economic or practical interest. By analogy, a robot or learning algorithm built to work well in all possible physics, including physics without conservation of energy, is necessarily less effective in our physics. I think it would be interesting to narrow our attention to MDPs with conservation of utility.

