Submodular Tree Search
Cross-source consensus on Submodular Tree Search, drawn from 1 source and 5 claims.
Highlighted claims
- The paper formulates tree informativeness as a weighted combination of Coverage, Novelty, and Contrast. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
- Entropy for UUCB excludes copied tool-output spans and only considers tokens that produce the next tool call or final answer. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
- The paper states that the components of F are monotone submodular and that greedy expansion has a classical approximation guarantee. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
- UUCB combines backed-up value, exploration, entropy contrast, cost penalty, and depth penalty. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
- A validation study found diminishing returns in UUCB marginal gain, consistent with submodularity. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
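The claims above can be sketched in code. This is a minimal, hypothetical illustration, not the paper's implementation: the Coverage/Novelty/Contrast definitions, the UUCB weights, and all function names below are placeholder assumptions. It shows the two pieces the claims describe: a monotone submodular objective F maximized by classical greedy expansion under a fixed budget (which carries the (1 − 1/e) approximation guarantee), and a UUCB-style node score combining backed-up value, an exploration bonus, an entropy term, a cost penalty, and a depth penalty.

```python
import math

def coverage(selected, items_of):
    # Coverage: count of distinct items reached by the selected rollouts.
    # Set cover is the textbook monotone submodular function.
    covered = set()
    for s in selected:
        covered |= items_of[s]
    return len(covered)

def informativeness(selected, items_of, novelty, contrast, w=(1.0, 0.5, 0.5)):
    # F(S) = w_c * Coverage(S) + w_n * Novelty(S) + w_k * Contrast(S)
    # Novelty and Contrast are modeled here as modular (additive) terms,
    # which keeps the weighted sum monotone submodular. Weights are guesses.
    wc, wn, wk = w
    return (wc * coverage(selected, items_of)
            + wn * sum(novelty[s] for s in selected)
            + wk * sum(contrast[s] for s in selected))

def greedy_expand(candidates, budget, items_of, novelty, contrast):
    # Classical greedy: at each step expand the candidate with the largest
    # marginal gain F(S + c) - F(S). For monotone submodular F this achieves
    # the (1 - 1/e) approximation to the budget-constrained optimum.
    selected = []
    for _ in range(budget):
        base = informativeness(selected, items_of, novelty, contrast)
        best, best_gain = None, 0.0
        for c in candidates:
            if c in selected:
                continue
            gain = informativeness(selected + [c], items_of, novelty, contrast) - base
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no candidate adds positive value; stop early
            break
        selected.append(best)
    return selected

def uucb_score(q_value, visits, parent_visits, entropy, cost, depth,
               c_explore=1.0, w_ent=0.5, w_cost=0.1, w_depth=0.05):
    # Illustrative UUCB-style score: backed-up value + exploration bonus
    # + entropy contrast - cost penalty - depth penalty. Per the claims, the
    # entropy here would be computed only over tokens producing the next tool
    # call or final answer, excluding copied tool-output spans.
    explore = c_explore * math.sqrt(math.log(parent_visits + 1) / (visits + 1))
    return q_value + explore + w_ent * entropy - w_cost * cost - w_depth * depth
```

As a quick check of the diminishing-returns property the validation study reports: with overlapping coverage sets, the marginal gain of adding a node to a larger selection never exceeds its gain on a smaller one.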