Submodular Tree Search
Cross-source consensus on Submodular Tree Search, drawn from 1 source and 5 claims.
Highlighted claims
- The paper formulates tree informativeness as a weighted combination of Coverage, Novelty, and Contrast. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
- Entropy for UUCB excludes copied tool-output spans and only considers tokens that produce the next tool call or final answer. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
- The paper states that the components of F are monotone submodular and that greedy expansion has a classical approximation guarantee. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
- UUCB combines backed-up value, exploration, entropy contrast, cost penalty, and depth penalty. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
- A validation study found diminishing returns in UUCB marginal gain, consistent with submodularity. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
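The claims above can be sketched in code. This is a minimal, hypothetical illustration, not the paper's implementation: the Coverage/Novelty/Contrast definitions, the UUCB weights, and all function names below are placeholder assumptions. It shows the two pieces the claims describe: a monotone submodular objective F maximized by classical greedy expansion under a fixed budget (which carries the (1 − 1/e) approximation guarantee), and a UUCB-style node score combining backed-up value, an exploration bonus, an entropy term, a cost penalty, and a depth penalty.

```python
import math

def coverage(selected, items_of):
    # Coverage: count of distinct items reached by the selected rollouts.
    # Set cover is the textbook monotone submodular function.
    covered = set()
    for s in selected:
        covered |= items_of[s]
    return len(covered)

def informativeness(selected, items_of, novelty, contrast, w=(1.0, 0.5, 0.5)):
    # F(S) = w_c * Coverage(S) + w_n * Novelty(S) + w_k * Contrast(S)
    # Novelty and Contrast are modeled here as modular (additive) terms,
    # which keeps the weighted sum monotone submodular. Weights are guesses.
    wc, wn, wk = w
    return (wc * coverage(selected, items_of)
            + wn * sum(novelty[s] for s in selected)
            + wk * sum(contrast[s] for s in selected))

def greedy_expand(candidates, budget, items_of, novelty, contrast):
    # Classical greedy: at each step expand the candidate with the largest
    # marginal gain F(S + c) - F(S). For monotone submodular F this achieves
    # the (1 - 1/e) approximation to the budget-constrained optimum.
    selected = []
    for _ in range(budget):
        base = informativeness(selected, items_of, novelty, contrast)
        best, best_gain = None, 0.0
        for c in candidates:
            if c in selected:
                continue
            gain = informativeness(selected + [c], items_of, novelty, contrast) - base
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no candidate adds positive value; stop early
            break
        selected.append(best)
    return selected

def uucb_score(q_value, visits, parent_visits, entropy, cost, depth,
               c_explore=1.0, w_ent=0.5, w_cost=0.1, w_depth=0.05):
    # Illustrative UUCB-style score: backed-up value + exploration bonus
    # + entropy contrast - cost penalty - depth penalty. Per the claims, the
    # entropy here would be computed only over tokens producing the next tool
    # call or final answer, excluding copied tool-output spans.
    explore = c_explore * math.sqrt(math.log(parent_visits + 1) / (visits + 1))
    return q_value + explore + w_ent * entropy - w_cost * cost - w_depth * depth
```

As a quick check of the diminishing-returns property the validation study reports: with overlapping coverage sets, the marginal gain of adding a node to a larger selection never exceeds its gain on a smaller one.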