Lee and Schwarz (L&S) suggest that separation is the grounded procedure underlying cleansing effects in different psychological domains. Here, we interpret L&S's account from a hierarchical view of cognition that considers the influence of physical properties and sensorimotor constraints on mental representations. This approach allows theoretical integration and generalization of L&S's account to the domain of formal quantitative reasoning.
Peripersonal space is the space surrounding our body, where multisensory integration of stimuli and action execution take place. The size of peripersonal space is flexible and subject to change by various personal and situational factors. The dynamic representation of our peripersonal space modulates our spatial behaviors towards other individuals. During the COVID-19 pandemic, this spatial behavior was modified by two further factors: social distancing and wearing a face mask. Evidence from offline and online studies on the impact of a face mask on pro-social behavior is mixed. In an attempt to clarify the role of face masks as pro-social or anti-social signals, 235 observers participated in the present online study. They viewed pictures of two models standing at three different distances from each other (50, 90 and 150 cm), who were either wearing a face mask or not and were either interacting by initiating a handshake or just standing still. The observers’ task was to classify the model by gender. Our results show that observers react fastest, and therefore show the least avoidance, for the shortest distances (50 and 90 cm) but only when models wear a face mask and do not interact. Thus, our results document both pro- and anti-social consequences of face masks as a result of the complex interplay between social distancing and interactive behavior. Practical implications of these findings are discussed.
The Human Takes It All
(2020)
Background: The increasing involvement of social robots in human lives raises the question as to how humans perceive social robots. Little is known about human perception of synthesized voices.
Aim: To investigate which synthesized voice parameters predict the speaker's eeriness and voice likability; to determine if individual listener characteristics (e.g., personality, attitude toward robots, age) influence synthesized voice evaluations; and to explore which paralinguistic features subjectively distinguish humans from robots/artificial agents.
Methods: 95 adults (62 females) listened to randomly presented audio-clips of three categories: synthesized (Watson, IBM), humanoid (robot Sophia, Hanson Robotics), and human voices (five clips/category). Voices were rated on intelligibility, prosody, trustworthiness, confidence, enthusiasm, pleasantness, human-likeness, likability, and naturalness. Speakers were rated on appeal, credibility, human-likeness, and eeriness. Participants' personality traits, attitudes to robots, and demographics were obtained.
Results: The human voice and human speaker characteristics received reliably higher scores on all dimensions except for eeriness. Synthesized voice ratings were positively related to participants' agreeableness and neuroticism. Females rated synthesized voices more positively on most dimensions. Surprisingly, interest in social robots and attitudes toward robots played almost no role in voice evaluation. Contrary to the expectations of an uncanny valley, when the ratings of human-likeness for both the voice and the speaker characteristics were higher, they seemed less eerie to the participants. Moreover, when the speaker's voice was more humanlike, it was more liked by the participants. This latter point was only applicable to one of the synthesized voices. Finally, pleasantness and trustworthiness of the synthesized voice predicted the likability of the speaker's voice. Qualitative content analysis identified intonation, sound, emotion, and imageability/embodiment as diagnostic features.
Discussion: Humans clearly prefer human voices, but manipulating diagnostic speech features might increase acceptance of synthesized voices and thereby support human-robot interaction. There is limited evidence that human-likeness of a voice is negatively linked to the perceived eeriness of the speaker.