Title

Cantonese Audio-Visual Emotional Speech (CAVES) dataset

Description

This database consists of audio-visual recordings of spoken Cantonese expressions of emotion produced by 10 native speakers of Cantonese.

5 speakers are female and their folders are labeled from fm1 to fm5; 5 speakers are male and their folders are labeled from m1 to m5.

Each folder consists of 21 zip files, i.e., 7 emotions x 3 presentation modes (audio only AO, visual only VO, audio visual AV).
Each zip file contains one file for each of the 50 Cantonese sentences, produced in one emotion type (angry, disgust, fear, happy, neutral, sad, surprise) and in one presentation mode (AO, VO, AV). Note: the AV files are in MTS (AVCHD) format; see https://docs.fileformat.com/video/avchd/

Speaker FM5 (folder fm5) is an exception to the above: only 25 Cantonese sentences were recorded for Sad.
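
The layout described above can be sketched as a short Python check of how many archives and recordings to expect. This is only an illustration: the folder labels, emotion names and mode codes are taken from the description, but the names of the zip files themselves are not stated here, so the sketch works with (speaker, emotion, mode) combinations rather than file names. It also assumes that the FM5 shortfall for Sad applies to all three presentation modes, which the description does not state explicitly.

```python
from itertools import product

# Labels taken from the description above.
SPEAKERS = [f"fm{i}" for i in range(1, 6)] + [f"m{i}" for i in range(1, 6)]
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
MODES = ["AO", "VO", "AV"]


def expected_sentences(speaker: str, emotion: str) -> int:
    """Number of sentence files expected in one zip archive."""
    # Assumption: fm5 has 25 Sad sentences in every presentation mode.
    if speaker == "fm5" and emotion == "sad":
        return 25
    return 50


def expected_counts() -> dict:
    """Map each (speaker, emotion, mode) archive to its expected file count."""
    return {
        (s, e, m): expected_sentences(s, e)
        for s, e, m in product(SPEAKERS, EMOTIONS, MODES)
    }


counts = expected_counts()
print(len(counts))           # number of zip archives across all speakers
print(sum(counts.values()))  # total recordings under the assumption above
```

Under these assumptions the check gives 210 archives (10 speakers x 21 zip files) and 10,425 recordings in total. Comparing these figures with the contents of a download is one way to spot an incomplete transfer.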

To get an idea of the material, we provide 6 files in AV format as a sample.
The sample consists of sentence 1 spoken in the 6 emotions by Speaker FM1.

The data from the perception study (validation experiment) are in the file CAVES_data_final.csv

Chong, C.S., Davis, C. & Kim, J. (2023). A Cantonese Audio-Visual Emotional Speech (CAVES) dataset. https://doi.org/10.3758/s13428-023-02270-7

The MARCS Institute

Creators

Kim, Jeesun j.kim@westernsydney.edu.au ORCID 0000-0003-2651-1020
Chong, Chee Seng chongcheeseng138@gmail.com
Davis, Chris chris.davis@westernsydney.edu.au ORCID 0000-0002-6387-4181

Fields of Research

520207 - Social and affective neuroscience
520403 - Learning, motivation and emotion
520406 - Sensory processes, perception and performance

Socio-Economic Objective

280121 - Expanding knowledge in psychology

Keywords

Cantonese dataset; Auditory and visual expressions; Emotional speech; Dataset evaluation