ASU Fulton Entrepreneurial Professor
Electrical, Computer, and Energy Engineering and
College of Health Solutions
Arizona State University
(480) 727-6455
visar ((at)) asu ((dot)) edu
Neurological disorders or traumatic brain injury may disturb an individual's speech and language abilities well before such changes are perceptually detectable. For example, Parkinson's disease can cause speaking-rate changes, reduced intonation, and imprecise articulation; Alzheimer's disease can cause longer pauses during speech, a reduced vocabulary, and reduced language complexity. The goal of this project is to develop signal processing and machine learning technology that detects these subtle speech and language changes, and to deploy the algorithms in devices for early detection, real-time symptom tracking, and intervention monitoring. This work is funded by several projects from the NIH and the NSF.
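As a much-simplified illustration of the kinds of timing features involved, the sketch below computes a pause ratio and a crude speaking-rate proxy from an energy-based voice activity decision. The threshold and frame sizes are illustrative assumptions, not the project's actual pipeline.

```python
# Minimal sketch: two simple speech-timing features from an energy gate.
# The 10%-of-peak threshold and the frame/hop sizes are assumptions.
import numpy as np
import librosa

def timing_features(wav_path, sr=16000, frame_ms=25, hop_ms=10):
    y, sr = librosa.load(wav_path, sr=sr)
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    rms = librosa.feature.rms(y=y, frame_length=frame, hop_length=hop)[0]
    # Frames below 10% of the 95th-percentile RMS are treated as pauses.
    voiced = rms > 0.1 * np.percentile(rms, 95)
    pause_ratio = 1.0 - voiced.mean()
    # Speaking-rate proxy: voiced-segment onsets per second of audio.
    onsets = int(np.sum(np.diff(voiced.astype(int)) == 1))
    duration = len(y) / sr
    return {"pause_ratio": pause_ratio, "onsets_per_sec": onsets / duration}
```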
Relevant Publications:

A fundamental problem in machine learning (ML) is developing models that generalize to real-world conditions after training. In practice, this problem is typically addressed by training high-capacity models on massive datasets. We are interested in the scenario where labeled data is available but costly, as is the case in most clinical applications. This project focuses on how to exploit the structure of the data (labeled and unlabeled) and active learning to efficiently develop robust models that generalize. This work is funded by several projects from the ONR and the NSF.
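To make the label-efficient setting concrete, the sketch below shows a standard pool-based active learning loop with entropy-based uncertainty sampling. The logistic-regression learner, batch size, and seeding are assumptions for the example, not the methods in our publications.

```python
# Illustrative pool-based active learning: query the points the current
# model is most uncertain about, then retrain on the enlarged labeled set.
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning(X_pool, y_pool, n_init=20, n_rounds=10, batch=10, seed=0):
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X_pool), size=n_init, replace=False))
    for _ in range(n_rounds):
        model = LogisticRegression(max_iter=1000)
        model.fit(X_pool[labeled], y_pool[labeled])
        probs = model.predict_proba(X_pool)
        entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
        entropy[labeled] = -np.inf             # never re-query labeled points
        query = np.argsort(entropy)[-batch:]   # most uncertain unlabeled points
        labeled.extend(query.tolist())         # "ask the oracle" for labels
    return model, labeled
```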
Relevant Publications:

Reliably estimating loudness requires elaborate models with high computational complexity, often unsuitable for real-time applications. In this project, we developed efficient algorithms for estimating loudness and implemented them on mobile devices. In particular, we propose a number of fast algorithms for estimating excitation patterns, specific loudness patterns, and total loudness. The computational efficiency of the existing standard (ANSI S3.4-2005) for estimating loudness is greatly improved while the fidelity of the estimates is largely unaffected.
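For orientation, the sketch below shows the final stage of such a pipeline: a compressed-power specific-loudness transform of the general form used in the ANSI S3.4 (Moore-Glasberg) model, integrated over ERB number to give total loudness in sones. The excitation pattern is taken as given, and the constants are illustrative rather than the standard's exact frequency-dependent values.

```python
# Sketch: specific loudness N'(b) = C[(G*E(b) + A)**alpha - A**alpha],
# summed over ERB number b to approximate total loudness. Constants are
# illustrative; the standard uses frequency-dependent A, G, and alpha.
import numpy as np

def total_loudness(excitation, erb_step=0.1, C=0.047, alpha=0.2, A=4.72, G=1.0):
    """excitation: linear-power excitation E(b), sampled every erb_step ERBs."""
    specific = C * ((G * excitation + A) ** alpha - A ** alpha)  # sones/ERB
    return np.sum(specific) * erb_step  # integrate over ERB number -> sones
```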
Relevant Publication:

Based on several psychoacoustic principles, a number of computational auditory models have been developed over the years to mimic aspects of the human auditory system. Embedding these models within existing audio compression algorithms (e.g., MP3) has led to significant increases in coding efficiency. In this project, we consider an alternative psychoacoustic model based on loudness for inclusion in speech/audio codecs. Loudness is a subjective phenomenon that represents the magnitude of perceived intensity, i.e., a measure of the magnitude of the neural activity that corresponds to hearing sensation. When embedded in an existing compression algorithm, the proposed system improves the quality of narrowband speech while operating at a lower bit rate. Compared to other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bit rate.
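As a toy illustration of how a loudness model can drive a coder, the sketch below distributes a fixed bit budget across subbands in proportion to their specific loudness, so perceptually prominent bands are quantized more finely. This shows the general idea only; the codec integration in our work differs in detail.

```python
# Hypothetical loudness-weighted bit allocation for subband coding.
import numpy as np

def allocate_bits(specific_loudness, total_bits=256):
    """Distribute total_bits across bands in proportion to specific loudness."""
    w = np.maximum(np.asarray(specific_loudness, dtype=float), 0.0)
    w = w / w.sum() if w.sum() > 0 else np.full(len(w), 1.0 / len(w))
    bits = np.floor(w * total_bits).astype(int)
    leftover = total_bits - int(bits.sum())
    # Hand any leftover bits (from flooring) to the loudest bands first.
    if leftover > 0:
        for i in np.argsort(w)[::-1][:leftover]:
            bits[i] += 1
    return bits
```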
Relevant Publication: