Title	Cantonese Audio-Visual Emotional Speech (CAVES) dataset
Description	This database consists of audio-visual recordings of Cantonese spoken expressions of emotion produced by 10 native speakers of Cantonese. Five speakers are female, with folders labeled fm1 to fm5; five are male, with folders labeled m1 to m5. Each folder contains 21 zip files, i.e., 7 emotions x 3 presentation modes (audio-only AO, visual-only VO, audio-visual AV). Each zip file contains one file for each of the 50 Cantonese sentences produced in one emotion type (angry, disgust, fear, happy, neutral, sad, surprise) and in one modality (AO, VO, AV). Note: the AV files are in MTS format (https://docs.fileformat.com/video/avchd/). FM5 is an exception to the above; only 25 Cantonese sentences were recorded for Sad. To give an idea of the material, we provide 6 files in AV format as a sample. The sample consists of sentence 1 spoken in the 6 emotions by Speaker FM1. The data from the perception study (validation experiment) are in the file CAVES_data_final.csv.
Related Publication	Chong, C.S., Davis, C. & Kim, J. (2023). A Cantonese Audio-Visual Emotional Speech (CAVES) dataset. https://doi.org/10.3758/s13428-023-02270-7
Related Websites	The MARCS Institute
Creators	Kim, Jeesun (j.kim@westernsydney.edu.au, ORCID 0000-0003-2651-1020); Chong, Chee Seng (chongcheeseng138@gmail.com); Davis, Chris (chris.davis@westernsydney.edu.au, ORCID 0000-0002-6387-4181)
Fields of Research	520207 - Social and affective neuroscience; 520403 - Learning, motivation and emotion; 520406 - Sensory processes, perception and performance
Socio-Economic Objective	280121 - Expanding knowledge in psychology
Keywords	Cantonese dataset; Auditory and visual expressions; Emotional speech; Dataset evaluation
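
The folder layout given in the Description can be enumerated as a quick sanity check on a download. The following is a minimal sketch, not part of the dataset: the function names are hypothetical, the zip file names themselves are not specified in this record and are therefore not modeled, and it assumes the FM5 exception (25 sentences for Sad) applies to all three modalities.

```python
from itertools import product

# Labels taken from the Description: folders fm1..fm5 and m1..m5.
SPEAKERS = [f"fm{i}" for i in range(1, 6)] + [f"m{i}" for i in range(1, 6)]
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
MODES = ["AO", "VO", "AV"]  # audio-only, visual-only, audio-visual
SENTENCES_PER_ZIP = 50


def expected_sentence_count(speaker: str, emotion: str) -> int:
    """Number of sentence files expected in one zip (hypothetical helper)."""
    # Exception stated in the Description: FM5 recorded only 25 sentences
    # for Sad. ASSUMPTION: this holds for AO, VO, and AV alike.
    if speaker == "fm5" and emotion == "sad":
        return 25
    return SENTENCES_PER_ZIP


def expected_layout() -> dict:
    """Map each (speaker, emotion, mode) zip to its expected file count."""
    return {
        (speaker, emotion, mode): expected_sentence_count(speaker, emotion)
        for speaker, emotion, mode in product(SPEAKERS, EMOTIONS, MODES)
    }


layout = expected_layout()
zips_per_speaker = len(EMOTIONS) * len(MODES)
total_zips = len(layout)
total_files = sum(layout.values())

print(f"zip files per speaker folder: {zips_per_speaker}")
print(f"zip files overall: {total_zips}")
print(f"sentence files overall: {total_files}")
```

Comparing these counts against the extracted folders is one way to spot an incomplete download; the totals depend on the assumption noted above about FM5.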