Inverse Preconditioning
Cross-source consensus on Inverse Preconditioning from 1 source and 5 claims.
Highlighted claims
- The practical LLQR relaxation replaces exact layer updates with a preconditioned gradient using a block-diagonal inverse preconditioner. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- LLQR learns the inverse preconditioner under the LQR objective instead of inverting a pre-structured curvature matrix. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- The overhead of LLQR is mainly in periodic refitting of the inverse preconditioner rather than in applying it. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- LLQR periodically refits the learned inverse preconditioner and passes the preconditioned gradient to SGDM or AdamW. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- The relaxed LLQR update can differ from exact LQR because the learned inverse preconditioner is constrained to the directions spanned by the current gradients. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
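To make the claims above concrete, here is a minimal PyTorch sketch of a learned block-diagonal inverse preconditioner in their spirit. It is not the paper's LLQR fit: as a stand-in for learning under the LQR objective, the per-layer block below simply whitens the gradient inside the subspace spanned by the last few gradients and acts as the identity elsewhere, which also illustrates why the learned operator is limited to gradient-spanned directions and why the cost concentrates in the periodic refit rather than in the per-step application. The class name `SubspacePreconditioner` and all hyperparameters are illustrative, not from the source.

```python
import torch


class SubspacePreconditioner:
    """One block of a block-diagonal inverse preconditioner for a single layer.

    Hypothetical stand-in for the LLQR fit: instead of the paper's LQR
    objective, it whitens the gradient inside the subspace spanned by the
    last few gradients and acts as the identity everywhere else.  Refitting
    (a thin QR plus simple statistics) is the expensive part; applying the
    block is just two thin matrix-vector products.
    """

    def __init__(self, history: int = 8, eps: float = 1e-8):
        self.history = history    # how many recent gradients span the subspace
        self.eps = eps
        self.grads = []           # buffer of recent flattened gradients
        self.basis = None         # (d, k) orthonormal basis of that subspace
        self.scale = None         # (k,) per-direction rescaling factors

    def observe(self, grad: torch.Tensor) -> None:
        """Record a flattened copy of the current gradient for the next refit."""
        self.grads.append(grad.detach().reshape(-1).clone())
        self.grads = self.grads[-self.history:]

    def refit(self) -> None:
        """Periodically relearn the block from the stored gradients."""
        if len(self.grads) < 2:
            return
        G = torch.stack(self.grads, dim=1)            # (d, m) gradient matrix
        Q, _ = torch.linalg.qr(G)                     # orthonormal basis, (d, k)
        coeffs = Q.T @ G                              # gradients in that basis
        second_moment = coeffs.pow(2).mean(dim=1)     # energy per direction
        scale = 1.0 / torch.sqrt(second_moment + self.eps)
        self.basis = Q
        self.scale = scale / scale.mean()             # keep the average scale at 1

    def apply(self, grad: torch.Tensor) -> torch.Tensor:
        """Cheap per-step application: rescale inside the subspace only."""
        if self.basis is None:
            return grad
        g = grad.reshape(-1)
        coeffs = self.basis.T @ g
        g = g + self.basis @ ((self.scale - 1.0) * coeffs)
        return g.reshape(grad.shape)
```

A toy integration then mirrors the remaining claims: one block per parameter tensor gives a block-diagonal operator overall, the preconditioned gradient is written back into `param.grad`, and a stock SGDM or AdamW step consumes it unchanged. The model, data, and the `refit_every` interval below are placeholders.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
blocks = {p: SubspacePreconditioner(history=8) for p in model.parameters()}
loss_fn = nn.CrossEntropyLoss()
refit_every = 50

for step in range(200):
    x = torch.randn(16, 32)                  # stand-in minibatch
    y = torch.randint(0, 10, (16,))
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    for p in model.parameters():
        if p.grad is None:
            continue
        blocks[p].observe(p.grad)
        if step % refit_every == 0:
            blocks[p].refit()                # periodic refit: the main extra cost
        p.grad = blocks[p].apply(p.grad)     # cheap per-step application

    optimizer.step()                         # AdamW sees only the rewritten gradient
```

Keeping the QR factorization off the per-step path and paying for it only every `refit_every` steps is what mirrors the claim that LLQR's overhead lies mainly in refitting the inverse preconditioner rather than in applying it.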