Two-Q Architecture
Cross-source consensus on Two-Q Architecture from 1 source and 5 claims.
Highlighted claims
- The two-Q agent selects between two Q-matrices based on whether the duration of the most recently completed blank interval exceeds a threshold, encoding a coarse local-density classifier. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- Q-minus specializes in preventing overshooting by initiating downwind return much sooner and exhibiting a more prominent initial surge. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- Q-plus specializes in escaping sparse rear regions by performing far more upwind search before returning downwind. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- The heuristic 2Qh variant, assembled without retraining by combining a dense-trained and a sparse-trained single-Q agent, matches or outperforms the best single-Q agent in most conditions. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
- The performance benefit of the two-Q architecture arises from the functional complementarity of the two programs, not from jointly optimized training. — Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
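The selection rule and the retraining-free 2Qh assembly described in the claims above can be illustrated with a minimal sketch. It is not the source's implementation. The names `TwoQAgent`, `assemble_heuristic`, and `blank_threshold`, the layout of each Q-matrix as rows of clock states by columns of actions, and the greedy action choice are all hypothetical. The sketch also assumes that a blank longer than the threshold routes to Q-plus and a shorter one to Q-minus, and that the dense-trained agent plays the Q-minus role while the sparse-trained agent plays the Q-plus role; the claims suggest this mapping but do not state it outright.

```python
class TwoQAgent:
    """Hypothetical sketch of a two-Q agent (not the source's code).

    Each Q-matrix is a list of rows; row index = clock state
    (assumed: time steps since the last odor detection),
    column index = action. Both layouts are assumptions.
    """

    def __init__(self, q_minus, q_plus, blank_threshold):
        if len(q_minus) != len(q_plus):
            raise ValueError("both Q-matrices must cover the same clock states")
        self.q_minus = q_minus
        self.q_plus = q_plus
        self.blank_threshold = blank_threshold

    def select_q(self, last_blank_duration):
        # Coarse local-density classifier: compare the most recently
        # completed blank interval against the threshold.
        # Assumed mapping: long blank -> sparse region -> Q-plus.
        if last_blank_duration > self.blank_threshold:
            return self.q_plus
        return self.q_minus

    def act(self, clock_state, last_blank_duration):
        q = self.select_q(last_blank_duration)
        # Clamp clock states beyond the table to the last row.
        row = q[min(clock_state, len(q) - 1)]
        # Greedy action: index of the largest Q-value in the row.
        return max(range(len(row)), key=row.__getitem__)


def assemble_heuristic(dense_trained_q, sparse_trained_q, blank_threshold):
    """2Qh-style assembly: reuse two single-Q tables without retraining.

    Assumed role assignment: dense-trained -> Q-minus,
    sparse-trained -> Q-plus.
    """
    return TwoQAgent(dense_trained_q, sparse_trained_q, blank_threshold)


# Toy tables with 2 clock states and 2 actions (values are made up).
dense = [[1.0, 0.0], [1.0, 0.0]]   # always prefers action 0
sparse = [[0.0, 1.0], [0.0, 1.0]]  # always prefers action 1
agent = assemble_heuristic(dense, sparse, blank_threshold=5)
```

With these toy tables, a last blank of 3 steps selects the dense-trained table and a last blank of 8 steps selects the sparse-trained one. Because the two tables are only switched between and never updated jointly, the sketch mirrors the claim that the benefit comes from the complementarity of the two programs, not from joint training.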