Human Instructions
Cross-source consensus on Human Instructions from 1 source and 5 claims.
Highlighted claims
- The Human Instructions evaluation set targets natural speech phenomena missing from standard benchmarks. — Building Interactive Real-Time Agents with Asynchronous I/O and Speculative Tool Calling
- The final Human Instructions set contains 177 valid sessions after human review. — Building Interactive Real-Time Agents with Asynchronous I/O and Speculative Tool Calling
- On Human Instructions, AsyncIO was both less accurate and slower than the synchronous SFT baseline. — Building Interactive Real-Time Agents with Asynchronous I/O and Speculative Tool Calling
- Naturalistic speech transcripts lowered accuracy and increased AsyncIO latency, driven by repeated-action behavior. — Building Interactive Real-Time Agents with Asynchronous I/O and Speculative Tool Calling
- The Human Instructions results suggest that synthetic streaming training data does not fully cover spontaneous spoken interaction. — Building Interactive Real-Time Agents with Asynchronous I/O and Speculative Tool Calling