Empirical Performance
Cross-source consensus on Empirical Performance from 1 source and 4 claims.
Highlighted claims
- On the modern backbone, three-phase structure without RoPE underperformed RoPE alone, while three-phase plus RoPE improved over RoPE-only. — Three-Phase Transformer
- At 123M on WikiText-103, fixed-horn 3PT reduced perplexity and bits per byte versus the matched RoPE-only baseline. — Three-Phase Transformer
- In long-horizon runs at 5.5M parameters, the best phase-aligned and PhRMS variant reached 13.30% lower perplexity than RoPE-only. — Three-Phase Transformer
- At matched quality, 3PT converged in fewer steps and less wall-clock time despite higher per-step cost. — Three-Phase Transformer
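
The claims above rest on three quantities: perplexity, bits per byte, and wall-clock time to reach a given quality. The sketch below shows how these are commonly derived from a mean cross-entropy loss and a step count. It is not taken from the source. The function names are hypothetical, the loss is assumed to be a mean negative log-likelihood in nats per token, and the numbers in the usage section are placeholders rather than reported results.

```python
import math


def perplexity(mean_nll_nats: float) -> float:
    """Perplexity from mean negative log-likelihood (nats per token)."""
    return math.exp(mean_nll_nats)


def bits_per_byte(mean_nll_nats: float, num_tokens: int, num_bytes: int) -> float:
    """Convert a per-token loss in nats to bits per byte of raw text."""
    total_bits = mean_nll_nats * num_tokens / math.log(2)
    return total_bits / num_bytes


def relative_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` is lower than `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


def wall_clock_seconds(steps: int, seconds_per_step: float) -> float:
    """Total training time to reach a target, given a per-step cost."""
    return steps * seconds_per_step


if __name__ == "__main__":
    # Placeholder values for illustration only (not from the source).
    baseline_ppl = perplexity(3.00)
    variant_ppl = perplexity(2.90)
    print(f"relative perplexity reduction: "
          f"{relative_reduction(baseline_ppl, variant_ppl):.2f}%")

    # A model with a higher per-step cost can still finish sooner
    # if it needs sufficiently fewer steps to reach the same quality.
    baseline_time = wall_clock_seconds(steps=10_000, seconds_per_step=0.10)
    variant_time = wall_clock_seconds(steps=7_000, seconds_per_step=0.12)
    print(f"baseline: {baseline_time:.0f}s, variant: {variant_time:.0f}s")
```

Under this reading, a "13.30% perplexity" improvement would correspond to `relative_reduction(baseline_ppl, variant_ppl)`. The matched-quality claim compares `wall_clock_seconds` at the step counts where both models reach the same loss.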