Offline Reinforcement Learning
Cross-source consensus on Offline Reinforcement Learning, drawn from 1 source and 4 claims.
How it works
Highlighted claims
- The decision problem is formulated as a discounted infinite-horizon Markov decision process with 90-day decision intervals. — Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions
- Functional FQE estimates policy value using kernel ridge regression in an RKHS with a tensor-product state-action kernel. — Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions
- Functional FQI updates a B-spline functional linear policy through penalised maximisation of the empirical Q-function average. — Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions
- Avoiding per-sample greedy maximisation is presented as a way to reduce computational intensity, Q-function overestimation, and non-smooth policies in functional-action settings. — Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions
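
The fitted Q evaluation step described in the claims above can be sketched roughly as follows. This is a minimal illustration, not the source's implementation. It makes several assumptions: each functional action is a curve sampled on a common grid, both kernels are Gaussian, and the bandwidths, ridge penalty, and iteration count are arbitrary. The names `rbf_gram`, `product_kernel`, and `fitted_q_evaluation` are hypothetical. Each transition `(S, A, R, S_next)` is taken to span one decision interval (90 days in the source's formulation).

```python
import numpy as np


def rbf_gram(X, Y, bandwidth):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def product_kernel(S1, A1, S2, A2, bw_s=1.0, bw_a=1.0):
    """Tensor-product kernel k((s,a),(s',a')) = k_S(s,s') * k_A(a,a').

    Assumption: actions are curves on a shared grid, so dividing by
    sqrt(grid size) makes the squared distance a mean over grid points,
    a crude stand-in for an L2 distance between functions.
    """
    scale = np.sqrt(A1.shape[1])
    return rbf_gram(S1, S2, bw_s) * rbf_gram(A1 / scale, A2 / scale, bw_a)


def fitted_q_evaluation(S, A, R, S_next, policy, gamma=0.9, ridge=1e-2, n_iter=50):
    """Estimate Q for a fixed policy by repeated kernel ridge regression.

    Q(s, a) is represented as sum_i alpha_i * k((s, a), (S_i, A_i)).
    Each iteration regresses the target R + gamma * Q(S_next, policy(S_next))
    on the observed (S, A) pairs. The fixed-point loop is not guaranteed to
    converge when the ridge penalty is small relative to gamma.
    """
    n = len(R)
    K = product_kernel(S, A, S, A)
    A_next = policy(S_next)  # action curves the evaluated policy would choose
    K_next = product_kernel(S_next, A_next, S, A)
    system = K + n * ridge * np.eye(n)
    alpha = np.zeros(n)
    for _ in range(n_iter):
        targets = R + gamma * (K_next @ alpha)
        alpha = np.linalg.solve(system, targets)
    return alpha


def q_values(alpha, S_train, A_train, S, A):
    """Evaluate the fitted Q-function at new state-action pairs."""
    return product_kernel(S, A, S_train, A_train) @ alpha
```

Averaging `q_values` over initial states paired with the policy's own actions gives a plug-in estimate of the policy value. Under the same assumptions, a policy-improvement step in the spirit of the third claim would adjust the policy's coefficients to raise that average subject to a smoothness penalty, rather than maximising Q separately for every sample.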