Dynamic Gradient Gating

Summary of Dynamic Gradient Gating (DGG), drawn from 6 claims in 1 source.

1 source · 6 claims

How it works

DGG monitors lm_head gradient energy and its step-wise increment. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
DGG activates gating only after the sliding window is populated and the current reuse index is greater than one. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
When DGG detects an excessive Z-score, it zeros gradients and ends reuse before Adam updates the optimizer state. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
The recommended DGG defaults are maximum reuse K equal to 4 and threshold tau in the range 0.1 to 0.5. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
DGG achieved rollout and wall-clock speedups while matching the single-use baseline's converged performance. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
DGG often also improves final performance modestly beyond the single-use baseline. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
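
The gating rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `grad_energy`, `GradientGate`, `should_stop`, and `train_on_batch` are hypothetical, the window size of 8 is arbitrary, and the exact Z-score definition (here a one-sided Z-score of the energy increment against the sliding-window mean and standard deviation) is assumed because the claims do not spell it out. Energy is taken to be the sum of squared gradient values. In a real training loop the energy would be computed from the lm_head gradients after the backward pass, and a stop would zero the gradients and skip the optimizer step so Adam's state is left untouched.

```python
from collections import deque
from statistics import mean, pstdev


def grad_energy(grad_values):
    """Assumed energy definition: sum of squared gradient entries."""
    return sum(v * v for v in grad_values)


class GradientGate:
    """Hypothetical gate over step-wise increments of gradient energy."""

    def __init__(self, window=8, tau=0.3, eps=1e-8):
        self.increments = deque(maxlen=window)  # sliding window of past increments
        self.tau = tau                          # Z-score threshold (assumed one-sided)
        self.eps = eps                          # guards against zero variance
        self.prev_energy = None

    def should_stop(self, energy, reuse_index):
        # The very first observation has no increment yet.
        if self.prev_energy is None:
            self.prev_energy = energy
            return False
        increment = energy - self.prev_energy
        self.prev_energy = energy
        # Gate only once the window is populated and this is a reuse step.
        window_full = len(self.increments) == self.increments.maxlen
        if window_full and reuse_index > 1:
            mu = mean(self.increments)
            sigma = pstdev(self.increments)
            z = (increment - mu) / (sigma + self.eps)
            if z > self.tau:
                return True  # outlier increments are not added to the window
        self.increments.append(increment)
        return False


def train_on_batch(energies, gate, max_reuse=4):
    """Simulate reuse of one batch; return how many optimizer steps were taken.

    `energies[i]` stands in for the lm_head gradient energy measured after
    the backward pass of reuse step i + 1.
    """
    steps = 0
    for reuse_index in range(1, max_reuse + 1):
        if gate.should_stop(energies[reuse_index - 1], reuse_index):
            # Real loop: zero the gradients here and skip optimizer.step().
            break
        steps += 1  # real loop: optimizer.step()
    return steps
```

With `max_reuse=4` matching the recommended K, a batch whose energy changes smoothly is reused the full four times, while a batch whose energy jumps on a reuse step is cut short before that update is applied.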