Model Architecture
Cross-source consensus on Model Architecture from 1 source and 5 claims.
Highlighted claims
- The default audio backbone was Whisper Small with mean temporal pooling. — Voice Biomarkers for Depression and Anxiety
- The core architecture used frozen pretrained backbones with trainable LoRA adaptation modules. — Voice Biomarkers for Depression and Anxiety
- Training used randomly selected 30-second speech segments because Whisper Small accepts a fixed 30-second input window. — Voice Biomarkers for Depression and Anxiety
- Whisper Small was the strongest audio backbone in the early architecture and dataset setting. — Voice Biomarkers for Depression and Anxiety
- Broad ASR pretraining across many speakers supported downstream mental health classification. — Voice Biomarkers for Depression and Anxiety
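
The claims above name three mechanisms: random 30-second segment selection, mean temporal pooling, and a frozen backbone with trainable LoRA modules. The sketch below illustrates each one in plain Python on toy data. It is not the source's implementation. The function names (`random_segment`, `mean_pool`, `lora_linear`), the zero-padding of short recordings, and the matrix shapes are assumptions made for illustration. The LoRA update follows the standard formulation y = Wx + (alpha / r) · B(Ax), in which W stays frozen and only A and B are trained.

```python
import random
from typing import List, Sequence

SAMPLE_RATE = 16_000   # Whisper models operate on 16 kHz audio
SEGMENT_SECONDS = 30   # segment length stated in the claims

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def random_segment(samples: Vector, rng: random.Random,
                   sample_rate: int = SAMPLE_RATE,
                   seconds: int = SEGMENT_SECONDS) -> List[float]:
    """Return a randomly positioned fixed-length crop of the waveform.

    Recordings shorter than the window are zero-padded (an assumption;
    the claims do not say how short recordings were handled).
    """
    length = sample_rate * seconds
    if len(samples) <= length:
        return list(samples) + [0.0] * (length - len(samples))
    start = rng.randrange(len(samples) - length + 1)
    return list(samples[start:start + length])


def mean_pool(frames: Matrix) -> List[float]:
    """Average frame-level features over time into one fixed-size vector."""
    n_frames = len(frames)
    return [sum(column) / n_frames for column in zip(*frames)]


def matvec(m: Matrix, v: Vector) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_linear(x: Vector, W: Matrix, A: Matrix, B: Matrix,
                alpha: float) -> List[float]:
    """Linear layer with a LoRA update: y = W x + (alpha / r) * B (A x).

    W (d_out x d_in) is the frozen pretrained weight.
    A (r x d_in) and B (d_out x r) are the trainable low-rank factors.
    """
    rank = len(A)
    frozen = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [f + scale * u for f, u in zip(frozen, update)]
```

In a real pipeline the frames passed to `mean_pool` would be the backbone's encoder outputs for one segment, and the pooled vector would feed a classification head. With B initialized to zeros, as is common for LoRA, `lora_linear` reproduces the frozen layer exactly at the start of training.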