Head-Property Capacity
Cross-source consensus on Head-Property Capacity from 1 source and 4 claims.
Highlighted claims
- Accuracy collapsed when simultaneously evolving properties exceeded the number of attention heads. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
- Blockwise evaluation does not remove the need for enough heads to separate concurrent property-specific routing channels. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
- A single head was sufficient when only one evolving property was tracked. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
- The article interprets insufficient heads as interference among routing signals for multiple evolving properties. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
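
The capacity condition in the claims above can be illustrated with a small counting sketch. This is not the source paper's method or code: the round-robin assignment, the function names, and the property names ("location", "owner", "state") are hypothetical, chosen only to show why the number of concurrently evolving properties must not exceed the number of heads if each property is to have its own routing channel.

```python
from collections import defaultdict


def has_capacity(num_properties, num_heads):
    """True when every evolving property can get its own head."""
    return num_properties <= num_heads


def assign_properties_to_heads(properties, num_heads):
    """Assign properties to heads round-robin (a hypothetical scheme)."""
    if num_heads < 1:
        raise ValueError("num_heads must be at least 1")
    heads = defaultdict(list)
    for i, prop in enumerate(properties):
        heads[i % num_heads].append(prop)
    return dict(heads)


def shared_heads(assignment):
    """Return heads carrying more than one property.

    Under the article's interpretation, these are the heads where
    routing signals for different properties could interfere.
    """
    return {h: props for h, props in assignment.items() if len(props) > 1}


if __name__ == "__main__":
    props = ["location", "owner", "state"]  # hypothetical property names
    for n in (1, 2, 3):
        assignment = assign_properties_to_heads(props, n)
        print(n, has_capacity(len(props), n), shared_heads(assignment))
```

With three properties, any configuration of fewer than three heads forces at least one head to carry two or more properties. With a single property and a single head, no head is shared, which matches the claim that one head sufficed in that case.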