Differential TD

Cross-source consensus on Differential TD from 1 source and 5 claims.

Uses

Differential TD replaces the discounted TD error with an error that subtracts the current average-reward estimate. — On the Divergence of Differential Temporal Difference Learning without Local Clocks
Differential TD was introduced for off-policy average-reward policy evaluation. — On the Divergence of Differential Temporal Difference Learning without Local Clocks
Global-clock differential TD fails to converge for some positive values of eta. — On the Divergence of Differential Temporal Difference Learning without Local Clocks
Local-clock differential TD is stable for every positive eta under the cited Wan et al. conditions. — On the Divergence of Differential Temporal Difference Learning without Local Clocks
Differential TD is designed for the average-reward setting and incorporates a centering effect similar to reference-state centering. — On the Divergence of Differential Temporal Difference Learning without Local Clocks
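
The claims above can be made concrete with a minimal tabular sketch. It is an illustration under stated assumptions, not the cited paper's implementation: the function names, the two-state toy environment, and the 1/n step-size schedule are hypothetical. The sketch is on-policy, so the importance-sampling ratio used in the off-policy setting is omitted. It assumes that eta scales the step size of the average-reward estimate relative to the value step size, that a "local clock" means the step size is driven by the visit count of the current state, and that a "global clock" means it is driven by the total step counter.

```python
def differential_td_step(values, avg_reward, s, r, s_next, alpha, eta):
    """Apply one differential TD update in place; return (delta, new avg_reward).

    The TD error subtracts the current average-reward estimate instead of
    discounting the next state's value.
    """
    delta = r - avg_reward + values[s_next] - values[s]
    values[s] += alpha * delta
    # Assumption: the average-reward estimate moves with step size eta * alpha.
    avg_reward += eta * alpha * delta
    return delta, avg_reward


def run_differential_td(env_step, num_states, steps, eta=1.0, local_clock=True):
    """Run tabular differential TD from state 0 and return (values, avg_reward).

    env_step(s) must return (reward, next_state).
    """
    values = [0.0] * num_states
    avg_reward = 0.0
    visits = [0] * num_states
    s = 0
    for t in range(1, steps + 1):
        r, s_next = env_step(s)
        visits[s] += 1
        # Local clock: count visits to the current state.
        # Global clock: count all steps taken so far.
        clock = visits[s] if local_clock else t
        alpha = 1.0 / clock  # hypothetical step-size schedule
        _, avg_reward = differential_td_step(
            values, avg_reward, s, r, s_next, alpha, eta
        )
        s = s_next
    return values, avg_reward


def two_state_cycle(s):
    """Hypothetical toy chain: 0 -> 1 pays 0, 1 -> 0 pays 2 (average reward 1)."""
    return (0.0, 1) if s == 0 else (2.0, 0)


values, avg_reward = run_differential_td(two_state_cycle, num_states=2, steps=100)
```

On this toy cycle the average-reward estimate settles at 1 under either clock with eta = 1, so the example shows only the mechanics of the update. It does not reproduce the divergence behaviour reported for the global-clock variant.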