Convergence-Based Early Exit
Cross-source consensus on Convergence-Based Early Exit, drawn from 1 source and 5 claims.
Highlighted claims
- The evaluated inference procedure exits after a minimum layer when the current pooled representation is sufficiently similar to an earlier pooled representation. — LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference
- Convergence-based early exit terminates inference when intermediate representations are sufficiently stable. — LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference
- Early exit is viable when layer-to-layer representation changes diminish consistently. — LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference
- The paper recommends an inference threshold of 0.95 for balancing quality and latency. — LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference
- Batch inference limits realized savings because total batch compute depends on the latest-exiting sample. — LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference
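The claims above describe a simple control loop: run layers in order, mean-pool the hidden states, and stop once the pooled representation after a layer is sufficiently similar to the one before it, provided a minimum depth has been reached. A minimal sketch of that loop is below; the layer callables, pooling choice, and similarity metric (cosine) are illustrative assumptions, while the 0.95 threshold and minimum-layer gate come from the source claims.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two 1-D vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def early_exit_forward(layers, x, min_layer=4, threshold=0.95):
    """Run `layers` sequentially over `x` (seq_len, hidden), exiting once
    the mean-pooled hidden state stops changing between consecutive layers.

    `layers` is a list of callables standing in for transformer blocks
    (a hypothetical interface, not the paper's implementation).
    Returns the final hidden states and the number of layers executed.
    """
    prev_pooled = None
    for i, layer in enumerate(layers):
        x = layer(x)
        pooled = x.mean(axis=0)  # mean pooling over the token axis
        # Only consider exiting after the minimum layer, and only once a
        # previous pooled representation exists to compare against.
        if prev_pooled is not None and i + 1 >= min_layer:
            if cosine_similarity(pooled, prev_pooled) >= threshold:
                return x, i + 1  # representation has converged: exit early
        prev_pooled = pooled
    return x, len(layers)  # no convergence: full-depth forward pass
```

Note that in batched inference this loop would run until the *slowest-converging* sample in the batch exits, which is why, as the last claim states, realized compute savings shrink with batch size.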