InfoTree

Cross-source consensus on InfoTree from 1 source and 5 claims.

How it works

InfoTree is presented as a training-time, budget-aware tree-search framework. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
InfoTree initializes from a root prompt, samples initial trajectories, computes entropy statistics, and expands frontier nodes using UUCB under a leaf budget. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
The main InfoTree configuration uses a 16-leaf training budget per prompt and a 32-leaf validation budget. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
InfoTree improved over flat GRPO across nine benchmarks by 2.5 to 11.2 points. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
InfoTree can be combined with DPS and prefix sharing for better results than InfoTree alone. — Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning
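
The expansion loop described above (start from a root prompt, sample initial trajectories, compute entropy statistics, then expand frontier nodes until the leaf budget is used up) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the paper's implementation: the claims here do not give the UUCB formula, so `placeholder_score` is a hypothetical stand-in (an entropy term plus a UCB-style visit bonus), entropies are random numbers rather than policy statistics, and the names `BranchPoint`, `build_tree`, and `n_initial` are invented for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class BranchPoint:
    """A frontier node: a point on an existing trajectory we may branch from."""
    name: str
    entropy: float   # toy stand-in for the entropy statistic at this node
    visits: int = 0  # how many times we have already branched from here


def placeholder_score(node: BranchPoint, step: int, c: float = 0.5) -> float:
    """Hypothetical stand-in for UUCB (NOT the paper's formula):
    an entropy term plus a UCB-style bonus that shrinks with repeated visits."""
    return node.entropy + c * math.sqrt(math.log(step + 1) / (node.visits + 1))


def build_tree(root_prompt: str, leaf_budget: int = 16,
               n_initial: int = 4, seed: int = 0) -> list[str]:
    """Return the names of all leaves (completed rollouts) in the toy tree."""
    rng = random.Random(seed)
    n_initial = min(n_initial, leaf_budget)

    # Sample the initial trajectories from the root prompt; each one
    # contributes one leaf and one candidate branch point.
    frontier = [BranchPoint(f"{root_prompt}/init{i}", rng.random())
                for i in range(n_initial)]
    leaves = [bp.name for bp in frontier]

    # Expand the highest-scoring frontier node until the leaf budget is spent.
    step = 0
    while len(leaves) < leaf_budget and frontier:
        step += 1
        best = max(frontier, key=lambda bp: placeholder_score(bp, step))
        child = BranchPoint(f"{best.name}/b{best.visits}", rng.random())
        best.visits += 1
        frontier.append(child)     # the new node can itself be branched from
        leaves.append(child.name)  # each expansion adds exactly one leaf
    return leaves
```

Calling `build_tree("prompt", leaf_budget=16)` mirrors the 16-leaf training budget, and `leaf_budget=32` mirrors the validation budget; in both cases the loop stops as soon as the number of leaves reaches the budget.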