programmingbee.net
Part 5.3 Model-Free Prediction: Temporal-Difference Learning, section 1.
This part introduces our first algorithms of a totally different class. Temporal-Difference (TD) learning, just like the Monte-Carlo method, learns directly from experience gathered by interacting with an environment. TD is model-free, d…