Associate Professor

ASU Fulton Entrepreneurial Professor
Electrical, Computer, and Energy Engineering and
College of Health Solutions
Arizona State University
(480) 727-6455
visar ((at)) asu ((dot)) edu


Tracking neurological health through speech and language

Neurological disorders or traumatic brain injury may disturb an individual’s speech and language abilities well before such changes are perceptually detectable. For example, Parkinson’s disease can result in changes in speaking rate, reduced intonation, and imprecise articulation; Alzheimer’s disease can result in longer pauses during speech, reduced vocabulary, and reduced language complexity. The goal of this project is to develop signal processing and machine learning technology that detects subtle speech and language changes, and to use these algorithms in devices for early detection, real-time symptom tracking, and intervention monitoring. This work is funded by several projects from the NIH and the NSF.

Relevant Publications:
  • Stegmann, G.M., Hahn, S., Liss, J., Shefner, J., Rutkove, S., Shelton, K., Duncan, C.J. and Berisha, V., 2020. Early detection and tracking of bulbar changes in ALS via frequent and remote speech analysis. npj Digital Medicine, 3(1), pp.1-5.
  • Stegmann, G.M., Hahn, S., Liss, J., Shefner, J., Rutkove, S.B., Kawabata, K., Bhandari, S., Shelton, K., Duncan, C.J. and Berisha, V., 2020. Repeatability of Commonly Used Speech and Language Features for Clinical Applications. Digital Biomarkers, 4(3), pp.109-122.
  • Mathad, V., Scherer, N., Chapman, K., Liss, J. and Berisha, V., 2021. A Deep Learning Algorithm for Objective Assessment of Hypernasality in Children with Cleft Palate. IEEE Transactions on Biomedical Engineering. Jan 2021.
  • Schwedt, T., Peplinski, J., Berisha, V. (2019). Altered speech during migraine attacks: A prospective, longitudinal study of episodic migraine without aura. Cephalalgia.
  • Rutkove, S., Qi, K., Shelton, K., Liss, J., Berisha, V., Shefner, J. (2019). ALS longitudinal studies with frequent data collection at home: study design and baseline data. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration.
  • Berisha, V., Wang, S., LaCross, A., Liss, J., Garcia-Filion, P. (2017). Longitudinal changes in linguistic complexity among professional football players. Brain and Language, 169, 57-63.

Improving robustness and efficiency in machine learning: theory and algorithms

A fundamental problem in machine learning (ML) is developing models that generalize to real-world conditions after training. In practice, this problem is typically addressed by training high-capacity models on massive datasets. We are interested in the scenario where labeled data is available but costly, as is the case in most clinical applications. This project focuses on how we can exploit the structure of data (labeled and unlabeled) and active learning to efficiently develop robust models that generalize. This work is funded by several projects from the ONR and the NSF.

Relevant Publications:
  • Li, W., Dasarathy, G., Ramamurthy, K.N. and Berisha, V., 2020. Finding the Homology of Decision Boundaries with Active Learning. Proceedings of NeurIPS.
  • Li, W., Dasarathy, G., & Berisha, V. (2020). Regularization via Structural Label Smoothing. Proceedings of AISTATS 2020.
  • Wisler, A., Berisha, V., Spanias, A., & Hero, A. O. (2018). Direct estimation of density functionals using a polynomial basis. IEEE Transactions on Signal Processing, 66(3), 558-572.
  • Berisha, V., Wisler, A., Hero, A. O., & Spanias, A. (2016). Empirically estimable classification bounds based on a nonparametric divergence measure. IEEE Transactions on Signal Processing, 64(3), 580-591.
  • Berisha, V., & Hero, A. O. (2015). Empirical non-parametric estimation of the Fisher Information. IEEE Signal Processing Letters, 22(7), 988-992.

Efficient Models for Computing Loudness

Reliably estimating loudness requires elaborate models with high computational complexity, which are often unsuitable for real-time applications. In this project, we developed efficient algorithms for estimating loudness and implemented them on mobile devices. In particular, we proposed a number of fast algorithms for estimating excitation patterns, specific loudness patterns, and total loudness. These algorithms greatly reduce the computational cost of the existing standard (ANSI S3.4-2005) for estimating loudness while leaving the fidelity of the estimates largely unaffected.
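As a rough illustration of the trade-off described above, the sketch below integrates a specific loudness pattern over an auditory-scale axis twice: once on a dense grid of detector locations and once on a pruned subset. It is not the algorithm from the publication listed below. The Gaussian pattern, the grid range, the grid spacing, and the pruning factor are all assumptions chosen only to show that total loudness can change very little when fewer detectors are evaluated.

```python
import numpy as np


def total_loudness(specific_loudness, positions):
    """Integrate specific loudness over detector positions (trapezoidal rule)."""
    widths = np.diff(positions)
    heights = 0.5 * (specific_loudness[1:] + specific_loudness[:-1])
    return float(np.sum(widths * heights))


def toy_specific_loudness(z):
    """Hypothetical smooth specific loudness pattern, for illustration only."""
    return np.exp(-0.5 * ((z - 15.0) / 4.0) ** 2)


# Assumed dense grid of detector locations on an auditory (ERB-number) scale.
dense = np.arange(1.8, 38.9 + 1e-9, 0.1)
# Assumed pruning rule: keep every fifth detector.
pruned = dense[::5]

n_dense = total_loudness(toy_specific_loudness(dense), dense)
n_pruned = total_loudness(toy_specific_loudness(pruned), pruned)
rel_err = abs(n_dense - n_pruned) / n_dense

print(f"detectors: {len(dense)} dense, {len(pruned)} pruned")
print(f"relative difference in total loudness: {rel_err:.2e}")
```

For a smooth pattern like this one, the pruned estimate stays close to the dense one while evaluating roughly a fifth of the detectors. Real excitation patterns can have sharper peaks, so a fixed, uniform pruning rule like this one may not be adequate in practice.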

Relevant Publication:
  • H. Krishnamoorthi, V. Berisha and A. Spanias, "A Frequency/Detector Pruning Approach for Loudness," IEEE Signal Processing Letters, June 2009.

Speech/Audio Compression Based on Loudness Criteria

Over the years, a number of computational auditory models based on psychoacoustic principles have been developed to mimic aspects of the human auditory system. Embedding these models within existing audio compression algorithms (e.g., MP3) has led to significant increases in coding efficiency. In this project, we consider an alternative psychoacoustic model based on loudness for inclusion in speech/audio codecs. Loudness is a subjective phenomenon that represents the magnitude of perceived intensity; that is, it is a measure of the magnitude of neural activity that corresponds to hearing sensations. When the proposed model is embedded in an existing compression algorithm, results show that it improves the quality of narrowband speech while operating at a lower bitrate. Compared to other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bitrate.

Relevant Publication:
  • V. Berisha and A. Spanias, "Wideband Speech Recovery Using Psychoacoustic Criteria," EURASIP Journal on Audio, Speech, and Music Processing, 2007.