Block-ChaCAL
Cross-source consensus on Block-ChaCAL from 1 source and 6 claims.
Highlighted claims
- Block-ChaCAL partitions the sequence into contiguous blocks and decomposes attention into block-diagonal and off-block residual components. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
- Block-ChaCAL preserves exact within-block masked attention semantics by applying the resolvent exactly to each causal diagonal tile. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
- Block-ChaCAL handles off-block interactions through a down-sampled block-level system and then lifts the result back. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
- Balancing the local and residual costs yields approximately O(n^(4/3) d) sequence complexity when hidden width is independent of sequence length. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
- On BOXES, large-block Block-ChaCAL retained dense ChaCAL-level exact match while reducing evaluation time. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
- At about 97% exact match, Block-ChaCAL reduced runtime by roughly 2.4 times relative to a 5-layer dense Transformer. — Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
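
The block-diagonal versus off-block split described in the first two claims can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names (`causal_attention`, `split_blocks`, `tilewise_resolvent`), the damping factor `gamma`, and the resolvent form (I - gamma * D)^-1 are all assumptions chosen for illustration. The sketch only shows that a causal attention matrix splits exactly into diagonal tiles plus a residual, and that a resolvent of the block-diagonal part can be computed one tile at a time. The down-sampled block-level system used for the residual is not shown, because the claims above do not specify it.

```python
import numpy as np

def causal_attention(n, rng):
    """Random causal softmax attention: rows sum to 1, zeros above the diagonal."""
    scores = rng.normal(size=(n, n))
    causal = np.tril(np.ones((n, n), dtype=bool))
    scores = np.where(causal, scores, -np.inf)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def split_blocks(attn, block):
    """Split attn into a block-diagonal part and an off-block residual."""
    n = attn.shape[0]
    block_id = np.arange(n) // block
    same_block = block_id[:, None] == block_id[None, :]
    diag_part = np.where(same_block, attn, 0.0)
    return diag_part, attn - diag_part

def tilewise_resolvent(diag_part, block, gamma):
    """Assumed resolvent (I - gamma * D)^-1, computed one diagonal tile at a time."""
    n = diag_part.shape[0]
    out = np.zeros_like(diag_part)
    for start in range(0, n, block):
        stop = min(start + block, n)
        tile = diag_part[start:stop, start:stop]
        identity = np.eye(stop - start)
        out[start:stop, start:stop] = np.linalg.inv(identity - gamma * tile)
    return out

rng = np.random.default_rng(0)
n, block, gamma = 12, 4, 0.5  # illustrative sizes; gamma < 1 keeps I - gamma * D invertible

attn = causal_attention(n, rng)
diag_part, residual = split_blocks(attn, block)

# Tile-by-tile result versus inverting the full block-diagonal matrix at once.
local = tilewise_resolvent(diag_part, block, gamma)
full = np.linalg.inv(np.eye(n) - gamma * diag_part)
```

In this sketch `diag_part + residual` reproduces `attn`, and `local` matches `full`, which is why working tile by tile loses nothing within a block. Each tile is only `block` by `block`, so the local work grows with the block size rather than with the full sequence length.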