Tracking Human Health Through Speech and Language

Neurological disorders or traumatic brain injury may disturb an individual’s speech and language abilities well before such changes are perceptually detectable. For example, Parkinson’s disease can cause changes such as an altered speaking rate, reduced intonation, and imprecise articulation; Alzheimer’s disease can cause changes such as longer pauses during speech, a reduced vocabulary, and reduced language complexity. The goal of this project is to develop signal processing and machine learning technology that detects subtle speech and language changes, and to use these algorithms in devices for early detection, real-time symptom tracking, and intervention monitoring. This work is funded by several projects from the NIH and the NSF.
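As a rough illustration of one feature mentioned above (pause length during speech), the sketch below measures pauses with a simple frame-energy threshold. It is a minimal example under stated assumptions, not the method used in the publications listed here: the function names, the 20 ms frame size, the RMS threshold of 0.02, and the 200 ms minimum pause length are all hypothetical choices made for illustration.

```python
import math

def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames and return each frame's RMS value."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [
        math.sqrt(sum(x * x for x in samples[s:s + frame_len]) / frame_len)
        for s in starts
    ]

def pause_durations(samples, sample_rate, frame_ms=20, threshold=0.02, min_pause_ms=200):
    """Return the durations (in seconds) of low-energy runs lasting at least min_pause_ms.

    All default parameter values are illustrative assumptions, not clinical settings.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len < 1:
        raise ValueError("sample_rate is too low for the chosen frame_ms")
    pauses, run = [], 0
    # The trailing infinity is a sentinel that closes a pause ending at the last frame.
    for value in frame_rms(samples, frame_len) + [float("inf")]:
        if value < threshold:
            run += 1
        else:
            if run * frame_ms >= min_pause_ms:
                pauses.append(run * frame_ms / 1000)
            run = 0
    return pauses

# Synthetic signal: 0.5 s tone, 0.3 s silence, 0.5 s tone (1 kHz sample rate).
sample_rate = 1000
tone = [0.5 * math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(500)]
signal = tone + [0.0] * 300 + tone
print(pause_durations(signal, sample_rate))  # [0.3] for this synthetic signal
```

Real recordings would need noise-robust voice activity detection; a fixed energy threshold is used here only to keep the example short.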

Relevant Publications:
  • Stegmann, G., Hahn, S., Bhandari, S., Kawabata, K., Shefner, J., Duncan, C.J., Liss, J., Berisha, V. and Mueller, K., 2022. Automated semantic relevance as an indicator of cognitive decline: Out‐of‐sample validation on a large‐scale longitudinal dataset. Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring, 14(1).
  • Stegmann, G.M., Hahn, S., Duncan, C.J., Rutkove, S.B., Liss, J., Shefner, J.M. and Berisha, V., 2021. Estimation of forced vital capacity using speech acoustics in patients with ALS. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration, 22(sup1).
  • Stegmann, G., Hahn, S., Liss, J., Shefner, J., Rutkove, S., Shelton, K., Duncan, C.J. and Berisha, V., 2020. Early detection and tracking of bulbar changes in ALS via frequent and remote speech analysis. npj Digital Medicine, 3(1), pp.1-5.
  • Stegmann, G., Hahn, S., Liss, J., Shefner, J., Rutkove, S.B., Kawabata, K., Bhandari, S., Shelton, K., Duncan, C.J. and Berisha, V., 2020. Repeatability of Commonly Used Speech and Language Features for Clinical Applications. Digital Biomarkers, 4(3), pp.109-122.
  • Mathad, V., Scherer, N., Chapman, K., Liss, J. and Berisha, V., 2021. A Deep Learning Algorithm for Objective Assessment of Hypernasality in Children with Cleft Palate. IEEE Transactions on Biomedical Engineering. Jan 2021.
  • Schwedt, T., Peplinski, J. and Berisha, V., 2019. Altered speech during migraine attacks: A prospective, longitudinal study of episodic migraine without aura. Cephalalgia.
  • Rutkove, S., Qi, K., Shelton, K., Liss, J., Berisha, V. and Shefner, J., 2019. ALS longitudinal studies with frequent data collection at home: study design and baseline data. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration.
  • Berisha, V., Wang, S., LaCross, A., Liss, J. and Garcia-Filion, P., 2017. Longitudinal changes in linguistic complexity among professional football players. Brain and Language, 169, pp.57-63.

Improving Robustness and Efficiency in Machine Learning: Theory and Algorithms

A fundamental problem in machine learning (ML) is developing models that generalize to real-world conditions after training. In practice, this problem is typically addressed by training high-capacity models on massive datasets. We are interested in the scenario where labeled data is available but costly to obtain, as is the case in most clinical applications. This project focuses on how to exploit the structure of data (labeled and unlabeled) and active learning to efficiently develop robust models that generalize. This work is funded by several projects from the ONR and the NSF.
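To make the label-cost argument concrete, here is a minimal sketch of active learning on a toy problem: estimating a one-dimensional decision threshold from an unlabeled pool. It assumes noise-free labels that are 0 below the threshold and 1 at or above it, and it is not one of the algorithms from the publications below. The names `active_threshold_search`, `oracle`, and `budget` are hypothetical.

```python
def active_threshold_search(pool, oracle, budget):
    """Bracket a 1-D decision threshold using as few label queries as possible.

    pool:   unlabeled points (numbers)
    oracle: function returning the label (0 or 1) of a point; each call is one costly label
    budget: maximum number of labels we are willing to pay for

    Assumes labels are noise-free, 0 below the threshold and 1 at or above it.
    """
    points = sorted(pool)
    lo, hi = -1, len(points)  # points[lo] is known 0, points[hi] is known 1 (or a sentinel)
    queries = 0
    while hi - lo > 1 and queries < budget:
        mid = (lo + hi) // 2  # query the point about which we are most uncertain
        if oracle(points[mid]) == 1:
            hi = mid
        else:
            lo = mid
        queries += 1
    lower = points[lo] if lo >= 0 else None          # largest point known to be labeled 0
    upper = points[hi] if hi < len(points) else None  # smallest point known to be labeled 1
    return lower, upper, queries

# Toy pool of 1000 unlabeled points with an illustrative threshold at 0.3705.
pool = [i / 1000 for i in range(1000)]
true_threshold = 0.3705
lower, upper, used = active_threshold_search(
    pool, lambda x: int(x >= true_threshold), budget=20
)
print(lower, upper, used)
```

In this idealized setting the boundary is bracketed with roughly log2(1000), about 10, labels, whereas labeling the whole pool would cost 1000. Real data with label noise or higher dimensions needs more careful query strategies.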

Relevant Publications:
  • Li, W., Dasarathy, G., Ramamurthy, K.N. and Berisha, V., 2022. A label efficient two-sample test. Proceedings of UAI.
  • Li, W., Dasarathy, G., Ramamurthy, K.N. and Berisha, V., 2020. Finding the Homology of Decision Boundaries with Active Learning. Proceedings of NeurIPS.
  • Li, W., Dasarathy, G. and Berisha, V., 2020. Regularization via Structural Label Smoothing. Proceedings of AISTATS.
  • Wisler, A., Berisha, V., Spanias, A. and Hero, A.O., 2018. Direct estimation of density functionals using a polynomial basis. IEEE Transactions on Signal Processing, 66(3), pp.558-572.
  • Berisha, V., Wisler, A., Hero, A.O. and Spanias, A., 2016. Empirically estimable classification bounds based on a nonparametric divergence measure. IEEE Transactions on Signal Processing, 64(3), pp.580-591.
  • Berisha, V. and Hero, A.O., 2015. Empirical non-parametric estimation of the Fisher Information. IEEE Signal Processing Letters, 22(7), pp.988-992.

Speech/Audio Compression Based on Loudness Criteria

Based on several psychoacoustic principles, a number of different computational auditory models have been developed over the years to mimic aspects of the human auditory system. Embedding these models within existing audio compression algorithms (e.g., MP3) has led to significant increases in coding efficiency. In this project, we consider an alternative psychoacoustic model based on loudness for inclusion in speech/audio codecs. Loudness is a subjective phenomenon that represents the magnitude of perceived intensity; that is, it is a measure of the magnitude of neural activity that corresponds to the hearing sensation. When embedded in an existing compression algorithm, the proposed system improves the quality of narrowband speech while operating at a lower bitrate. Compared with other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bitrate.
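To illustrate why loudness is treated as a perceptual quantity rather than a physical one, the sketch below converts between loudness level (phons) and loudness (sones) using the standard textbook relation in which 40 phons corresponds to 1 sone and each 10-phon increase doubles the perceived loudness. This is only the conventional scale conversion, not the auditory model or loudness estimator developed in this project, and the function names are hypothetical.

```python
import math

def phon_to_sone(phon):
    """Convert loudness level in phons to loudness in sones.

    Uses the textbook relation sone = 2 ** ((phon - 40) / 10), which is
    applied here only for levels of 40 phons and above.
    """
    if phon < 40:
        raise ValueError("this sketch only handles levels of 40 phons and above")
    return 2 ** ((phon - 40) / 10)

def sone_to_phon(sone):
    """Inverse of phon_to_sone for loudness values of 1 sone and above."""
    if sone < 1:
        raise ValueError("this sketch only handles loudness of 1 sone and above")
    return 40 + 10 * math.log2(sone)

print(phon_to_sone(40))  # 1.0
print(phon_to_sone(50))  # 2.0 (twice as loud as 40 phons)
print(sone_to_phon(4))   # 60.0
```

The doubling rule shows that equal steps in level do not correspond to equal steps in perceived loudness, which is the kind of effect a loudness-based codec criterion can account for.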

Relevant Publications:
  • Krishnamoorthi, H., Spanias, A. and Berisha, V., 2009. A frequency/detector pruning approach for loudness estimation. IEEE Signal Processing Letters, 16(11), pp.997-1000.
  • Berisha, V. and Spanias, A., 2007. Wideband speech recovery using psychoacoustic criteria. EURASIP Journal on Audio, Speech, and Music Processing, 2007, pp.1-18.