PhaseRotationLayer
Cross-source consensus on PhaseRotationLayer from 1 source and 4 claims.
Highlighted claims
- PhaseRotationLayer is inserted between attention and the FFN without a residual connection. — Three-Phase Transformer
- The rotation layer uses learnable angle vectors shared across phases, with phase-specific offsets. — Three-Phase Transformer
- The non-residual orthogonal rotation preserves norm, invertibility, and gradient singular values. — Three-Phase Transformer
- Residualizing the rotation worsened performance, supporting the non-residual design choice. — Three-Phase Transformer
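The claims above can be illustrated with a minimal sketch. It is not the source's implementation, only one plausible reading under stated assumptions: the rotation is taken to act on consecutive coordinate pairs (a block-diagonal 2x2 rotation), the three phase offsets are taken to be 0, 2π/3 and 4π/3, and the names rotate_pairs, inverse_rotate_pairs and phase_offsets are hypothetical. It shows a shared angle vector with a per-phase offset, and checks that the non-residual rotation preserves the norm and can be undone exactly.

```python
import math


def rotate_pairs(x, theta, offset=0.0):
    """Rotate each pair (x[2i], x[2i+1]) by theta[i] + offset.

    theta plays the role of the angle vector shared across phases;
    offset is the phase-specific shift (an assumption of this sketch).
    The output replaces x outright: there is no residual `x + ...` term.
    """
    if len(x) != 2 * len(theta):
        raise ValueError("x must have exactly two entries per angle")
    out = []
    for i, angle in enumerate(theta):
        c = math.cos(angle + offset)
        s = math.sin(angle + offset)
        a, b = x[2 * i], x[2 * i + 1]
        out.extend([c * a - s * b, s * a + c * b])
    return out


def inverse_rotate_pairs(y, theta, offset=0.0):
    """Undo rotate_pairs by rotating through the negated angles."""
    return rotate_pairs(y, [-t for t in theta], -offset)


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(t * t for t in v))


# Hypothetical values chosen only for illustration.
theta = [0.3, -1.1]                                    # shared angles
phase_offsets = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]  # assumed offsets
x = [1.0, 2.0, -0.5, 0.25]

for off in phase_offsets:
    y = rotate_pairs(x, theta, off)
    x_back = inverse_rotate_pairs(y, theta, off)
    # Orthogonal map: norm is unchanged and the input is recoverable.
    assert math.isclose(norm(y), norm(x), rel_tol=1e-12)
    assert all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(x, x_back))
```

Because each 2x2 block is a rotation matrix, the full map is orthogonal, so every singular value of its Jacobian is 1. That is the property behind the norm, invertibility, and gradient claims listed above. A residual variant of the form x + R(x) would generally lose it.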