Computational Overhead
Cross-source consensus on Computational Overhead from 1 source and 5 claims.
Highlighted claims
- LLQR adds memory and compute on top of the state and work of a standard first-order optimizer. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- The estimated ImageNet overhead was about 1.02x and the measured multiplier was about 1.03x (a timing sketch follows this list). — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Chunking lowers peak memory by splitting the preconditioner-update minibatch into smaller chunks and accumulating their contributions to the same relaxed objective (see the chunking sketch below). — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Diagonal blocks are light, while Kronecker-factored and E-KFAC structures increase both memory and per-update cost (a state-size comparison follows below). — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Update frequency is the dominant source of computational cost; the recommended settings are 1-4 preconditioner updates per epoch with 25-50 inner steps each (see the amortization sketch below). — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
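
A multiplier like the 1.02x/1.03x figures above can be checked empirically by timing optimizer steps. A minimal sketch, assuming `candidate_step` and `baseline_step` are zero-argument callables you supply (e.g. one LLQR step versus one Adam step); the names are placeholders, not APIs from the paper:

```python
import time
import statistics

def step_time_multiplier(candidate_step, baseline_step, trials=50, warmup=5):
    """Median wall-clock time of candidate_step relative to baseline_step."""
    def median_time(step):
        for _ in range(warmup):  # discard warm-up / caching effects
            step()
        samples = []
        for _ in range(trials):
            t0 = time.perf_counter()
            step()
            samples.append(time.perf_counter() - t0)
        return statistics.median(samples)

    return median_time(candidate_step) / median_time(baseline_step)

# e.g. step_time_multiplier(llqr_step, adam_step) -> ~1.03 per the paper
```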
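
The chunking claim is standard gradient-accumulation logic applied to the preconditioner update. A minimal PyTorch-style sketch, where `relaxed_loss_fn` is a hypothetical stand-in for the paper's relaxed objective (assumed to return a mean-reduced scalar loss over its chunk):

```python
import torch

def chunked_preconditioner_loss(relaxed_loss_fn, params, inputs, targets,
                                chunk_size):
    """Backprop the relaxed objective one chunk at a time.

    Peak activation memory is bounded by one chunk; because each chunk's
    loss is weighted by its share of the minibatch and gradients add up
    in .grad, the accumulated result matches a single full-batch pass.
    """
    n = inputs.shape[0]
    total = 0.0
    for start in range(0, n, chunk_size):
        x = inputs[start:start + chunk_size]
        y = targets[start:start + chunk_size]
        loss = relaxed_loss_fn(params, x, y) * (x.shape[0] / n)
        loss.backward()  # contributions accumulate across chunks
        total += loss.item()
    return total
```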
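
To see why diagonal blocks are light while Kronecker-factored ones are not, count state entries per layer. A back-of-the-envelope sketch for an m x n weight matrix, assuming standard KFAC/E-KFAC factor shapes; the paper's exact state layout may differ:

```python
def preconditioner_entries(m: int, n: int) -> dict:
    """Approximate per-layer state entries for an m x n weight matrix."""
    return {
        "diagonal": m * n,               # one scalar per parameter (Adam-like)
        "kronecker": m * m + n * n,      # two KFAC factors; factoring or
                                         # inverting them costs O(m^3 + n^3)
        "ekfac": m * m + n * n + m * n,  # eigenbases plus per-parameter scales
    }

# For a square 4096 x 4096 layer:
#   diagonal ~16.8M entries, kronecker ~33.6M, ekfac ~50.3M
print(preconditioner_entries(4096, 4096))
```

For roughly square layers, the Kronecker factors alone double the state of a diagonal preconditioner, and the factor decompositions add update cost on top.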
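
The recommended settings above can be turned into a rough per-step multiplier. A sketch, where `inner_step_cost` (the cost of one inner LQR step relative to one ordinary training step) is an assumed parameter, not a figure from the paper:

```python
def amortized_multiplier(steps_per_epoch: int, updates_per_epoch: int,
                         inner_steps: int, inner_step_cost: float) -> float:
    """Per-step cost multiplier from periodic preconditioner refits.

    Each refit runs inner_steps inner iterations; that extra work is
    spread over all ordinary training steps in the epoch.
    """
    extra = updates_per_epoch * inner_steps * inner_step_cost
    return 1.0 + extra / steps_per_epoch

# 5000 steps/epoch, 2 refits of 40 inner steps, each inner step assumed
# to cost half a training step -> ~1.008x before per-step preconditioning
print(amortized_multiplier(5000, 2, 40, 0.5))
```

Whatever the assumed inner-step cost, the multiplier scales linearly with refits per epoch and inner steps per refit, which is why update frequency dominates the overhead.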