Computational Overhead
Cross-source consensus on Computational Overhead from 1 source and 5 claims.
Highlighted claims
- LLQR adds memory and compute on top of the state and work of a standard first-order optimizer. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- The estimated ImageNet overhead was about 1.02x and the measured multiplier was about 1.03x (a timing sketch follows this list). — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Chunking lowers peak memory by splitting the preconditioner-update minibatch into smaller chunks and accumulating their contributions to the same relaxed objective (see the chunking sketch below). — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Diagonal blocks are light, while Kronecker-factored and E-KFAC structures increase both memory and per-update cost (a state-size comparison follows below). — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Update frequency is the dominant source of computational cost; the recommended settings are 1-4 preconditioner updates per epoch with 25-50 inner steps each (see the amortization sketch below). — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
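
A multiplier like the 1.02x/1.03x figures above can be checked empirically by timing optimizer steps. A minimal sketch, assuming `candidate_step` and `baseline_step` are zero-argument callables you supply (e.g. one LLQR step versus one Adam step); the names are placeholders, not APIs from the paper:

```python
import time
import statistics

def step_time_multiplier(candidate_step, baseline_step, trials=50, warmup=5):
    """Median wall-clock time of candidate_step relative to baseline_step."""
    def median_time(step):
        for _ in range(warmup):  # discard warm-up / caching effects
            step()
        samples = []
        for _ in range(trials):
            t0 = time.perf_counter()
            step()
            samples.append(time.perf_counter() - t0)
        return statistics.median(samples)

    return median_time(candidate_step) / median_time(baseline_step)

# e.g. step_time_multiplier(llqr_step, adam_step) -> ~1.03 per the paper
```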
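
The chunking claim is standard gradient-accumulation logic applied to the preconditioner update. A minimal PyTorch-style sketch, where `relaxed_loss_fn` is a hypothetical stand-in for the paper's relaxed objective (assumed to return a mean-reduced scalar loss over its chunk):

```python
import torch

def chunked_preconditioner_loss(relaxed_loss_fn, params, inputs, targets,
                                chunk_size):
    """Backprop the relaxed objective one chunk at a time.

    Peak activation memory is bounded by one chunk; because each chunk's
    loss is weighted by its share of the minibatch and gradients add up
    in .grad, the accumulated result matches a single full-batch pass.
    """
    n = inputs.shape[0]
    total = 0.0
    for start in range(0, n, chunk_size):
        x = inputs[start:start + chunk_size]
        y = targets[start:start + chunk_size]
        loss = relaxed_loss_fn(params, x, y) * (x.shape[0] / n)
        loss.backward()  # contributions accumulate across chunks
        total += loss.item()
    return total
```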
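
To see why diagonal blocks are light while Kronecker-factored ones are not, count state entries per layer. A back-of-the-envelope sketch for an m x n weight matrix, assuming standard KFAC/E-KFAC factor shapes; the paper's exact state layout may differ:

```python
def preconditioner_entries(m: int, n: int) -> dict:
    """Approximate per-layer state entries for an m x n weight matrix."""
    return {
        "diagonal": m * n,               # one scalar per parameter (Adam-like)
        "kronecker": m * m + n * n,      # two KFAC factors; factoring or
                                         # inverting them costs O(m^3 + n^3)
        "ekfac": m * m + n * n + m * n,  # eigenbases plus per-parameter scales
    }

# For a square 4096 x 4096 layer:
#   diagonal ~16.8M entries, kronecker ~33.6M, ekfac ~50.3M
print(preconditioner_entries(4096, 4096))
```

For roughly square layers, the Kronecker factors alone double the state of a diagonal preconditioner, and the factor decompositions add update cost on top.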
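
The recommended settings above can be turned into a rough per-step multiplier. A sketch, where `inner_step_cost` (the cost of one inner LQR step relative to one ordinary training step) is an assumed parameter, not a figure from the paper:

```python
def amortized_multiplier(steps_per_epoch: int, updates_per_epoch: int,
                         inner_steps: int, inner_step_cost: float) -> float:
    """Per-step cost multiplier from periodic preconditioner refits.

    Each refit runs inner_steps inner iterations; that extra work is
    spread over all ordinary training steps in the epoch.
    """
    extra = updates_per_epoch * inner_steps * inner_step_cost
    return 1.0 + extra / steps_per_epoch

# 5000 steps/epoch, 2 refits of 40 inner steps, each inner step assumed
# to cost half a training step -> ~1.008x before per-step preconditioning
print(amortized_multiplier(5000, 2, 40, 0.5))
```

Whatever the assumed inner-step cost, the multiplier scales linearly with refits per epoch and inner steps per refit, which is why update frequency dominates the overhead.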