Geometry-Aware Optimization
Cross-source consensus on Geometry-Aware Optimization from 1 source and 5 claims.
Highlighted claims
- Dense principled curvature matrices are difficult to use directly because they couple parameters across layers through the chain rule. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Standard gradient descent is steepest descent under the Euclidean norm, while Newton, Gauss-Newton, and natural-gradient methods use curvature or divergence-induced metrics. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Scalable preconditioners often make computation tractable by imposing block-diagonal or factored structure early. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- LLQR provides a common layerwise optimal-control objective within which optimizers interpretable as steepest descent under different norms can be compared. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
- Geometry-aware methods can affect both convergence speed and the implicit bias of training trajectories. — Layerwise LQR for Geometry-Aware Optimization of Deep Networks
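The steepest-descent-under-a-norm view in the claims above can be made concrete on a toy quadratic: a metric M turns the update into `w ← w − lr · M⁻¹ ∇f(w)`, with M = I recovering plain gradient descent and M equal to the curvature recovering Newton's method. A minimal sketch (the matrix `A`, the learning rates, and all variable names are illustrative choices, not taken from the paper):

```python
import numpy as np

# Toy quadratic loss f(w) = 0.5 * w^T A w with ill-conditioned curvature A.
# Steepest descent under a metric M takes the step  w <- w - lr * M^{-1} g.
#   M = I        -> plain gradient descent (Euclidean norm)
#   M = A        -> Newton's method (curvature-induced norm)
#   M = diag(A)  -> a cheap diagonal preconditioner between the two extremes
A = np.array([[100.0, 5.0],
              [5.0,   1.0]])       # positive definite, condition number ~ 100
w0 = np.array([1.0, 1.0])

def step(w, M, lr=1.0):
    g = A @ w                       # gradient of the quadratic at w
    return w - lr * np.linalg.solve(M, g)

w_gd     = step(w0, np.eye(2), lr=0.009)  # lr limited by the largest curvature
w_newton = step(w0, A)                    # one full step lands at the optimum

print(w_gd)      # small progress, dominated by the stiff direction
print(w_newton)  # ~ [0, 0]
```

The same update rule with different M also illustrates the implicit-bias claim: even when all these methods converge, they trace different trajectories through parameter space.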
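The block-diagonal structure mentioned above can likewise be sketched: a dense curvature matrix couples parameters across layers through its off-diagonal blocks, while a block-diagonal approximation solves one small system per layer. This is an illustrative sketch of the general idea, not the paper's LLQR construction; the function name, sizes, and damping value are assumptions:

```python
import numpy as np

# A dense curvature matrix couples parameters across layers via its
# off-diagonal blocks. A block-diagonal (layerwise) approximation discards
# that coupling so each layer's step is a small independent solve.
rng = np.random.default_rng(0)

def block_diag_precondition(grads, full_H, sizes, damping=1e-3):
    """Per layer, solve (H_block + damping*I) @ step = grad."""
    steps, i = [], 0
    for n, g in zip(sizes, grads):
        Hb = full_H[i:i + n, i:i + n] + damping * np.eye(n)  # within-layer block
        steps.append(np.linalg.solve(Hb, g))
        i += n
    return steps

sizes = [3, 2]                       # two "layers" with 3 and 2 parameters
J = rng.standard_normal((8, 5))
H = J.T @ J                          # Gauss-Newton-style curvature; its
                                     # off-diagonal blocks couple the layers
g = rng.standard_normal(5)
grads = [g[:3], g[3:]]

steps = block_diag_precondition(grads, H, sizes)
# Cost: two small solves (3x3 and 2x2) instead of one dense 5x5 system.
```

For deep networks the payoff is that per-layer blocks stay small even when the total parameter count is large, which is why scalable preconditioners impose this structure early.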