Three-Phase Transformer
Cross-source consensus on Three-Phase Transformer from 1 sources and 4 claims.
1 sources · 4 claims
How it works
Comparisons
Highlighted claims
- 3PT partitions the residual stream into equal phases, with three phases as the canonical setting. — Three-Phase Transformer
- Three-Phase Transformer applies a structured residual-stream geometry to a decoder-only Transformer while retaining standard components such as RoPE, GQA, RMSNorm-style normalization, and SwiGLU. — Three-Phase Transformer
- The article interprets 3PT as a residual-stream structural prior rather than a replacement for attention or positional encoding. — Three-Phase Transformer
- 3PT is not presented as a new positional encoding or embedding trick. — Three-Phase Transformer