Retrofitting
Cross-source consensus on Retrofitting, drawn from 1 source and 3 claims.
Highlighted claims
- Applying the mixture objective post hoc to a pretrained Llama-3.2-1B checkpoint failed in the reported experiment. — N-vium: Mixture-of-Exits Transformer for Accelerated Exact Generation
- In the retrofit experiment, early exit probabilities collapsed toward zero, making the model effectively dense. — N-vium: Mixture-of-Exits Transformer for Accelerated Exact Generation
- The article attributes retrofit failure to dense pretrained backbones lacking useful intermediate representations for early exits. — N-vium: Mixture-of-Exits Transformer for Accelerated Exact Generation
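The collapse described in the claims above can be made concrete with a small diagnostic. This is a minimal sketch, not the article's method: it assumes the model exposes one mixture weight per exit (the last entry being the final layer), and the function names, the tolerance, and the exit layer indices are all hypothetical choices made for illustration.

```python
# Hypothetical diagnostic: how much probability mass do the early exits hold?
# Assumption: exit_weights holds one non-negative mixture weight per exit,
# ordered by depth, with the last entry belonging to the final layer.

def _normalize(exit_weights):
    """Rescale the weights so they sum to 1."""
    total = sum(exit_weights)
    if total <= 0:
        raise ValueError("exit weights must have positive total mass")
    return [w / total for w in exit_weights]


def early_exit_mass(exit_weights):
    """Total probability assigned to every exit before the final layer."""
    return sum(_normalize(exit_weights)[:-1])


def expected_depth(exit_weights, exit_layers):
    """Average layer index at which computation stops under these weights."""
    if len(exit_weights) != len(exit_layers):
        raise ValueError("need one layer index per exit weight")
    probs = _normalize(exit_weights)
    return sum(p * layer for p, layer in zip(probs, exit_layers))


def is_effectively_dense(exit_weights, tol=1e-3):
    """True when early exits hold less than tol of the mass (tol is arbitrary)."""
    return early_exit_mass(exit_weights) < tol


# Illustrative values only; these are not measurements from the article.
exit_layers = [4, 8, 16]            # hypothetical exit positions
collapsed = [1e-5, 2e-5, 0.99997]   # early exits almost never chosen
healthy = [0.3, 0.3, 0.4]           # early exits carry real mass

print(is_effectively_dense(collapsed), is_effectively_dense(healthy))
```

With weights like `collapsed`, nearly every token runs the full depth, which is the sense in which a retrofitted model would behave as if it were dense and gain no speedup from its exits.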