Disproportionate Weight Divergence
Cross-source consensus on Disproportionate Weight Divergence (DWD), drawn from 1 source and 4 claims.
Highlighted claims
- DWD appears as an abrupt isolated surge in lm_head relative weight change under naive reuse. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
- Single-use training does not produce the lm_head surge, indicating sample reuse is the direct cause. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
- The source characterizes DWD as structurally localized to the lm_head. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
- The identification of DWD is framed as a principled way to understand sample-reuse instability in RLVR. — When to Stop Reusing: Dynamic Gradient Gating for Sample-Efficient RLVR
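
The claims above refer to a per-module "relative weight change" without defining it. The following is a minimal sketch of one plausible way to monitor it, assuming the metric is the norm of the weight update divided by the norm of the weights before the update. That definition, the helper names `relative_weight_change` and `per_module_change`, and the toy module names and values are all illustrative assumptions, not taken from the source.

```python
import math


def relative_weight_change(before, after):
    """Return ||after - before|| / ||before|| for two flat lists of floats.

    Assumed definition of "relative weight change"; the source may use another.
    """
    diff_sq = sum((a - b) ** 2 for a, b in zip(after, before))
    base_sq = sum(b ** 2 for b in before)
    if base_sq == 0.0:
        raise ValueError("cannot normalize by an all-zero weight vector")
    return math.sqrt(diff_sq) / math.sqrt(base_sq)


def per_module_change(params_before, params_after):
    """Map each module name to its relative weight change."""
    return {
        name: relative_weight_change(params_before[name], params_after[name])
        for name in params_before
    }


# Toy snapshots (hypothetical values): lm_head moves far more than the MLP.
before = {"layers.0.mlp": [3.0, 4.0], "lm_head": [3.0, 4.0]}
after = {"layers.0.mlp": [3.0, 4.1], "lm_head": [6.0, 8.0]}

changes = per_module_change(before, after)
largest = max(changes, key=changes.get)  # module with the largest relative change
```

In a real training loop the flat lists would be replaced by flattened parameter tensors captured before and after a reuse step. An lm_head value that jumps well above every other module would match the isolated surge the claims describe.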