This year’s Picnic Day is going 100% digital in light of the COVID-19 pandemic. Fortunately, a small team of talented RAs (Patty Sandoval, Julian Rambob, Mia Gong, and Marlene Andrade) helped me create Virtual Booth videos!
Gunrock: A Social Bot for Complex and Engaging Long Conversations. Dian Yu, Michelle Cohn, Yi Mang Yang, Chun Yen Chen, Weiming Wen, Jiaping Zhang, Mingyang Zhou, Kevin Jesse, Austin Chau, Antara Bhowmick, Shreenath Iyer, Giritheja Sreenivasulu, Sam Davidson, Ashwin Bhandare, and Zhou Yu. [pdf]
You can watch our system demonstration (2-minute video):
While at the KTH Royal Institute of Technology (Stockholm, Sweden) this September, Michelle Cohn met up with Dr. Jonas Beskow, co-founder of Furhat Robotics, and Ph.D. student Patrik Jonell. Together with Georgia Zellou, they are conducting a study testing the role of embodiment and gender in humans' voice-AI interaction across three platforms: Amazon Echo, Nao, and Furhat.
Cohn, M., Chen, C., & Yu, Z. (2019). A large-scale user study of an Alexa Prize chatbot: Effect of TTS dynamism on perceived quality of social dialog. (In press). 2019 Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL.
I am thrilled to serve as PI on a two-year NSF-funded postdoctoral research fellowship, working with Drs. Georgia Zellou, Zhou Yu, and Katharine Graf Estes to explore human-voice AI interaction. (Click here to see the official NSF posting.)
We explore ways in which adults and children adapt their speech when talking to voice-activated digital assistants (e.g., Amazon’s Alexa), compared to adult human interlocutors.
This line of work provides a way to test competing theoretical predictions about the extent to which speech-register adjustments are driven by functional motives (e.g., intelligibility) and by social factors (e.g., gender).
For instance, this research tests whether the functional motivations at play when correcting a human listener's comprehension errors also apply in device-directed speech (DS): by manipulating the phonological nature of the errors, we can carefully control the intelligibility-related pressures on the communication.
At the same time, this project explores how social factors, such as interlocutor type, speaker age, or device gender, may shape speech adaptation strategies. The project additionally involves important methodological innovations in programming and running experiments directly through a digital device platform (see the sketch below).
Overall, this project aims to fill a gap in our knowledge of the acoustic-phonetic adjustments humans make when talking to voice-AI devices, and can ultimately reveal underlying mechanisms of speech production across speakers (e.g., by age, gender, and device experience), contributing to basic science research.
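To give a flavor of what running an experiment directly through a device platform can look like, here is a minimal sketch of a custom Alexa skill that steps a participant through a word-repetition task. It assumes the Alexa Skills Kit SDK for Python (ask-sdk-core); the WordResponseIntent name, the STIMULI list, and the trial logic are hypothetical illustrations, not our actual experiment code.

```python
# Minimal sketch: an Alexa skill that runs a word-repetition task.
# Assumes ask-sdk-core; intent name and stimuli are hypothetical.
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type, is_intent_name

STIMULI = ["bead", "beat", "bad", "bat"]  # hypothetical target words


class LaunchHandler(AbstractRequestHandler):
    """Starts the session and prompts the first trial."""

    def can_handle(self, handler_input):
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input):
        # Track trial number in session attributes across turns.
        attrs = handler_input.attributes_manager.session_attributes
        attrs["trial"] = 0
        speech = f"Welcome to the study. Please say the word: {STIMULI[0]}"
        return handler_input.response_builder.speak(speech).ask(speech).response


class ResponseHandler(AbstractRequestHandler):
    """Advances to the next trial after each participant response."""

    def can_handle(self, handler_input):
        # Hypothetical custom intent; in a real skill, the recognized
        # word would arrive as a slot value on this intent.
        return is_intent_name("WordResponseIntent")(handler_input)

    def handle(self, handler_input):
        attrs = handler_input.attributes_manager.session_attributes
        trial = attrs.get("trial", 0) + 1
        attrs["trial"] = trial
        if trial < len(STIMULI):
            speech = f"Next word: {STIMULI[trial]}"
            return handler_input.response_builder.speak(speech).ask(speech).response
        return (
            handler_input.response_builder
            .speak("That completes the session. Thank you!")
            .set_should_end_session(True)
            .response
        )


sb = SkillBuilder()
sb.add_request_handler(LaunchHandler())
sb.add_request_handler(ResponseHandler())
lambda_handler = sb.lambda_handler()  # entry point when deployed on AWS Lambda
```

The appeal of this setup is that the stimuli are delivered by the device's own TTS voice and responses are captured in participants' homes, rather than in a lab with an experimenter reading prompts.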
We are excited that several papers have been accepted for the Interspeech 2019 meeting in Graz, Austria!
Papers on human-voice AI interaction
Cohn, M., & Zellou, G. (2019). Expressiveness influences human vocal alignment toward voice-AI. (In press). 2019 Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH.
Snyder, C., Cohn, M., & Zellou, G. (2019). Individual variation in cognitive processing style predicts differences in phonetic imitation of device and human voices. (In press). 2019 Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH.
Ferenc Segedin, C., Cohn, M., & Zellou, G. (2019). Perceptual adaptation to device and human voices: Learning and generalization of a phonetic shift across real and voice-AI talkers. (In press). 2019 Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH.
Paper on musical training & speech perception
Cohn, M., Zellou, G., & Barreda, S. (2019). The role of musical experience in the perceptual weighting of acoustic cues for the obstruent coda voicing contrast in American English. (In press). 2019 Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH.