DAPO
Cross-source consensus on DAPO from 1 source and 4 claims.
Highlighted claims
- DAPO removes zero-variance groups after rollouts are complete. — Selective Rollout: Mid-Trajectory Termination for Multi-Sample Agent RL
- DAPO saves training cost but does not save rollout cost. — Selective Rollout: Mid-Trajectory Termination for Multi-Sample Agent RL
- Selective rollout differs from DAPO because it uses information revealed during an agent rollout. — Selective Rollout: Mid-Trajectory Termination for Multi-Sample Agent RL
- The article recommends combining selective rollout with DAPO in production. — Selective Rollout: Mid-Trajectory Termination for Multi-Sample Agent RL
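
The first two claims can be illustrated with a minimal sketch. It assumes each prompt has a group of rollouts with one scalar reward each, and that a "zero-variance group" is one where every reward is identical. The function name `filter_zero_variance_groups` and the dictionary layout are hypothetical, chosen for illustration; this is not code from DAPO or from the cited article.

```python
def filter_zero_variance_groups(groups):
    """Keep only prompt groups whose rewards are not all identical.

    `groups` maps a prompt id to the list of scalar rewards from its
    completed rollouts (hypothetical layout, for illustration only).
    """
    kept = {}
    for prompt_id, rewards in groups.items():
        # All-equal rewards give zero variance, so the group carries no
        # relative signal between its samples and is dropped.
        if rewards and max(rewards) != min(rewards):
            kept[prompt_id] = rewards
    return kept


# Rewards are only known once every rollout has finished.
groups = {
    "p1": [1.0, 1.0, 1.0, 1.0],  # all succeed: zero variance
    "p2": [0.0, 1.0, 0.0, 1.0],  # mixed outcomes: kept
    "p3": [0.0, 0.0, 0.0, 0.0],  # all fail: zero variance
}
kept = filter_zero_variance_groups(groups)

# Rollout cost was paid for every sample; only the kept ones are trained on.
rollouts_generated = sum(len(r) for r in groups.values())
rollouts_trained = sum(len(r) for r in kept.values())
```

In this toy input all 12 rollouts are generated but only 4 reach training, which matches the claim that filtering after rollouts reduces training cost without reducing rollout cost.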