Clock-State Q-Learning
Cross-source consensus on Clock-State Q-Learning from 1 source and 7 claims.
Highlighted claims
- The clock-state Q-learning agent uses only elapsed blank time as its internal state, discarding all other odor history. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- The clock-state Q-learning agent substantially outperforms the optimized cast-and-surge heuristic in all four sparsity environments. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- A scalar clock state resets to zero upon odor detection and increments by one on each blank step, with the Q-matrix encoding a fixed deterministic sequence of moves in response to plume loss. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- Success rates for the clock-state agent are at or above 90% in all environments, approaching 100% in denser plume conditions. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- The best single-Q agent across an ensemble of 20 training runs approaches the performance of the quasi-optimal Bayesian POMDP agent despite using far simpler memory. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- The hypothesis driving this study is that plume recovery is the dominant challenge in turbulent olfactory search and that elapsed blank time captures the most relevant information for solving it. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- High policy variability across 20 training runs reflects a broad manifold of near-equivalent local optima rather than training instability. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
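The clock-state mechanism described in the claims above can be sketched in a few lines of Python. This is a minimal illustration, not the source's implementation: the action names, the cap `MAX_CLOCK` on the clock value, the reward signal, and the learning parameters `alpha` and `gamma` are all assumptions introduced here, and the update shown is the standard tabular Q-learning rule rather than a confirmed detail of the paper.

```python
import numpy as np

# Hypothetical values: the source does not specify the action set or a clock cap.
MAX_CLOCK = 8
ACTIONS = ["upwind", "downwind", "crosswind_left", "crosswind_right"]


def next_clock(clock: int, detected: bool, max_clock: int = MAX_CLOCK) -> int:
    """Reset the clock to zero on odor detection; otherwise add one blank step.

    The cap at max_clock is an assumption that keeps the Q-table finite.
    """
    if detected:
        return 0
    return min(clock + 1, max_clock)


def q_update(Q: np.ndarray, s: int, a: int, reward: float, s_next: int,
             alpha: float = 0.1, gamma: float = 0.99) -> None:
    """Standard tabular Q-learning update, applied in place.

    Rows of Q are clock states, columns are actions.
    """
    td_target = reward + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])


def recovery_sequence(Q: np.ndarray, n_steps: int) -> list:
    """Greedy moves after plume loss, assuming no further detections.

    Because the clock only counts up while the agent sees blanks, a fixed Q
    yields one fixed, deterministic sequence of moves.
    """
    clock, moves = 0, []
    for _ in range(n_steps):
        clock = next_clock(clock, detected=False, max_clock=Q.shape[0] - 1)
        moves.append(ACTIONS[int(np.argmax(Q[clock]))])
    return moves
```

Reading the greedy action off each row of `Q` in order of increasing clock value gives the recovery trajectory the agent follows after losing the plume, which is why a single scalar state is enough to encode a full search pattern.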