I am thrilled to start a two-year NSF-funded postdoctoral research fellowship with Drs. Georgia Zellou, Zhou Yu, and Katharine Graf Estes to explore human interaction with voice-AI.
This project explores how adults and children adapt their speech when talking to voice-activated digital assistants (e.g., Amazon’s Alexa), compared to adult human interlocutors. This line of work provides a way to test competing theoretical predictions about the extent to which speech-register adjustments are driven by functional motives (e.g., intelligibility) and social factors (e.g., gender). For instance, this research explores whether the same functional motivations that apply when correcting comprehension errors with human interlocutors also apply in device-directed speech, such as by manipulating the phonological nature of errors to carefully control the level of intelligibility-related pressure in communication. At the same time, the project examines how social factors, such as interlocutor type, speaker age, or device gender, may shape speech adaptation strategies. The project additionally involves important methodological innovations in programming and running experiments directly through a digital device platform. Overall, this work aims to fill a gap in our knowledge of the acoustic-phonetic adjustments humans make when talking to voice-AI devices, and can ultimately reveal the underlying mechanisms of speech production across different speakers (e.g., by age, gender, or device experience), contributing to basic science research.
This past June I had the opportunity to attend the inaugural Como Summer School on Music, Language, and Cognition (MLCS) held in Como, Italy. As someone working at the intersection of language and music perception, a growing field of research, I found this an excellent opportunity to work closely with 32 other graduate and postdoctoral researchers from a range of international institutions, as well as to hear 11 presentations by experts in these fields, including: Ani Patel (Tufts University: neuroscience of language/music), Ian Cross (Cambridge: musicology, musicality), Jessica Grahn (Western University: neuroscience of rhythm, therapeutic effects of music on Parkinson’s patients), Tecumseh Fitch (University of Vienna: biological foundations of music & language), Tom Fritz (Max Planck Institute: cross-cultural comparisons of music, biological effects of music on cognition and reward systems), and Alice Mado Proverbio (University of Milano-Bicocca: neuroscience of language/music).
In addition to faculty presentations and engaging group discussions, all students gave a short presentation of their own research. I was amazed at the range of subfields in which graduate students and postdocs were exploring language, music, or a combination of both.
For example, Sinead Rocha, a doctoral student at UCL, found that infants’ production of spontaneous rhythmic patterns may be related to the height and walking rate of their parents (based on their time in utero and being carried).
Looking at the intersection of language and music, Giuliana Genovese, a graduate student at the University of Milano-Bicocca, found that infants learned new phonemic contrasts better through song than through speech.
Overall, the experience was quite enriching — particularly in the close collaboration with graduate students and faculty from a diverse range of disciplines — and I look forward to seeing them again at future programs/conferences.