programmingbee.net
Part 5.4 Model-Free Prediction: Temporal-Difference Learning, section 2 TD(λ).
In section 1 of TD learning we saw a new class of algorithms that can learn online, after every step. In other words, TD can learn before, and without, the final outcome by using bootstrapping &#821…
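As a quick recap of what "learning after every step by bootstrapping" means, here is a minimal sketch of a single tabular TD(0) update. The function name `td0_update`, the dictionary-based value table, and the two-state example are assumptions made for illustration and are not taken from section 1.

```python
def td0_update(V, state, reward, next_state, alpha=0.1, gamma=1.0, done=False):
    """Apply one TD(0) update to the value table V in place; return the TD error."""
    # Bootstrapped target: observed reward plus the current estimate of the
    # next state's value. A terminal next state contributes no future value.
    bootstrap = 0.0 if done else V[next_state]
    td_target = reward + gamma * bootstrap
    td_error = td_target - V[state]
    # Move the estimate a fraction alpha of the way toward the target.
    V[state] += alpha * td_error
    return td_error


# Hypothetical two-state example: the update is applied after a single step,
# before the episode has finished.
V = {"A": 0.0, "B": 1.0}
delta = td0_update(V, "A", reward=0.5, next_state="B", alpha=0.1, gamma=0.9)
print(V["A"], delta)
```

The target here uses the existing estimate `V[next_state]` in place of the actual return, which is why no final outcome is needed before updating.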