Compute Efficiency
Cross-source consensus on Compute Efficiency from 1 source and 5 claims.
Benefits
Highlighted claims
- With the backward selector only, Qwen2.5-Math-1.5B reaches better performance in 2.40 hours versus 3.97 hours for the baseline, a 1.65× wall-clock speedup. — Learning-Zone Energy: Online Data Selection for Efficient RL Post-Training
- Adding the forward pruner to the backward selector increases the end-to-end wall-clock speedup to 1.80× for the 1.5B model and 1.91× for the 7B model, at a modest accuracy trade-off. — Learning-Zone Energy: Online Data Selection for Efficient RL Post-Training
- The LZE framework is optimizer-agnostic, operating solely on group-level pass-rate statistics and requiring no modification to the underlying GRPO or DAPO optimizer. — Learning-Zone Energy: Online Data Selection for Efficient RL Post-Training
- The 36% FLOPs reduction and 30–40% wall-clock savings are achieved without accuracy loss and often with accuracy improvements, indicating uniform prompt allocation is a substantial inefficiency. — Learning-Zone Energy: Online Data Selection for Efficient RL Post-Training
- LZE restricts backpropagation to 40% of prompt groups, yielding a 36% theoretical reduction in FLOPs per token, consistent across all configurations. — Learning-Zone Energy: Online Data Selection for Efficient RL Post-Training
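
The figures in the claims above can be related with a little arithmetic, and the idea of selecting prompt groups from pass rates alone can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation. The function names are hypothetical, and the scoring rule `p * (1 - p)` is an assumed stand-in for the learning-zone energy score, which the claims here do not define. Only the 40% keep fraction and the wall-clock hours come from the claims.

```python
def speedup(baseline_hours: float, method_hours: float) -> float:
    """Wall-clock speedup: baseline time divided by method time."""
    return baseline_hours / method_hours


def time_saved_fraction(speedup_factor: float) -> float:
    """Fraction of wall-clock time saved for a given speedup factor."""
    return 1.0 - 1.0 / speedup_factor


def select_groups(pass_rates: list[float], keep_fraction: float = 0.4) -> list[int]:
    """Return indices of the prompt groups kept for backpropagation.

    ASSUMPTION: groups are ranked by p * (1 - p), which peaks at a pass
    rate of 0.5. This is a placeholder score, not the rule from the paper.
    """
    scores = [p * (1.0 - p) for p in pass_rates]
    # Keep at least one group so the update step is never empty.
    k = max(1, round(keep_fraction * len(pass_rates)))
    ranked = sorted(range(len(pass_rates)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    s = speedup(3.97, 2.40)
    print(f"speedup: {s:.2f}x, time saved: {time_saved_fraction(s):.1%}")
    # Hypothetical pass rates for five prompt groups.
    print(select_groups([0.0, 0.5, 1.0, 0.25, 0.9]))
```

Under these assumptions, the 2.40-hour versus 3.97-hour comparison works out to the quoted 1.65× speedup, which corresponds to roughly 39% less wall-clock time. Because the selection reads only per-group pass rates, a filter of this shape can sit in front of the optimizer step without touching the optimizer itself, which is the sense in which the claims describe the framework as optimizer-agnostic.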