MISP-Meeting contains 125 hours of audio and video data in total. The dataset is divided into 119 hours for training (Train), 3 hours for development (Dev), and 3 hours for evaluation (Eval); the evaluation set is used for challenge scoring and ranking. The training, development, and evaluation sets contain 72, 9, and 9 sessions, respectively. The subsets share no speakers and no recording rooms. Each session consists of a discussion involving 4-8 participants. Session duration varies by subset: each training session lasts 2 hours, while development and evaluation sessions are 20 minutes long. Consequently, a training session encompasses multiple topic transitions. The training, development, and evaluation sets include 233, 15, and 15 participants, respectively, with balanced gender representation. All participants' professions or areas of study (for those who are students) are related to the meeting topics. This real-world relevance not only enhances the authenticity of the setting but also helps to minimize extended silent periods during the discussions. The ratio of speech segments containing overlap to all speech segments is 57.30% in the training set.
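The overlap ratio quoted above can be illustrated with a small sketch. It rests on assumptions that the page does not state: segments are given as hypothetical `(speaker_id, start_sec, end_sec)` tuples, a single speaker's segments do not overlap each other, and the ratio is taken as the time with two or more active speakers divided by the time with at least one active speaker. The corpus authors may compute the figure differently.

```python
from typing import List, Tuple

# Hypothetical segment layout: (speaker_id, start_sec, end_sec).
Segment = Tuple[str, float, float]


def overlap_ratio(segments: List[Segment]) -> float:
    """Fraction of speech time during which two or more speakers are active."""
    # Turn each segment into two boundary events: +1 at its start, -1 at its end.
    events = []
    for _speaker, start, end in segments:
        if end > start:  # ignore empty or inverted segments
            events.append((start, 1))
            events.append((end, -1))
    # Sort by time; at equal times, ends (-1) are processed before starts (+1).
    events.sort(key=lambda e: (e[0], e[1]))

    speech_time = 0.0   # time with at least one active speaker
    overlap_time = 0.0  # time with at least two active speakers
    active = 0
    prev_time = None
    for time, delta in events:
        if prev_time is not None and time > prev_time:
            span = time - prev_time
            if active >= 1:
                speech_time += span
            if active >= 2:
                overlap_time += span
        active += delta
        prev_time = time
    return overlap_time / speech_time if speech_time > 0 else 0.0


if __name__ == "__main__":
    demo = [("spk1", 0.0, 10.0), ("spk2", 5.0, 15.0)]
    print(f"{overlap_ratio(demo):.2%}")
```

In the demo, speech spans 15 seconds and both speakers are active for 5 of them, so the ratio is one third.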
One advantage of the MISP-Meeting corpus over other meeting corpora is the diversity of its meeting rooms. As shown in Table 2, the 23 meeting rooms are categorized into four size groups (tiny, small, medium, and large) and range from 8.79 to 117.6 square meters. Each subset includes meeting rooms of all sizes, offering a broad spectrum of acoustic properties and layouts. The meeting rooms feature various wall materials, including cement and glass, and are equipped with furnishings such as sofas, TVs, blackboards, fans, air conditioners, and plants. Detailed parameters of each meeting venue will be released with the training data, providing a comprehensive resource for acoustic analysis.
Dataset | Train | Dev | Eval | Total
---|---|---|---|---
Duration (h) | 119 | 3 | 3 | 125
Session | 72 | 9 | 9 | 90
Room | 15 | | | 23
Participant | 233 | 15 | 15 | 263
-Male | 115 | | | 130
-Female | 118 | | | 133
Overlap Ratio | 57.30% | | |

Tab.1. Details of MISP-Meeting corpus
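As a quick sanity check, the per-subset figures stated in the text can be summed and compared with the totals in Table 1. The sketch below uses only numbers given on this page; the dictionary layout and function name are illustrative choices, not part of any released tooling.

```python
# Per-subset figures as stated in the text and Table 1.
SPLITS = {
    "Train": {"hours": 119, "sessions": 72, "participants": 233},
    "Dev": {"hours": 3, "sessions": 9, "participants": 15},
    "Eval": {"hours": 3, "sessions": 9, "participants": 15},
}

# Totals as stated in Table 1.
TOTALS = {"hours": 125, "sessions": 90, "participants": 263}


def summed(splits: dict) -> dict:
    """Sum every statistic over all subsets."""
    result: dict = {}
    for stats in splits.values():
        for key, value in stats.items():
            result[key] = result.get(key, 0) + value
    return result


if __name__ == "__main__":
    sums = summed(SPLITS)
    for key, expected in TOTALS.items():
        status = "ok" if sums[key] == expected else "MISMATCH"
        print(f"{key}: {sums[key]} vs {expected} ({status})")
    # Gender counts: Train (115 + 118) and Total (130 + 133) match the participant counts.
    print(115 + 118 == 233, 130 + 133 == 263)
```

All three sums agree with the stated totals, and the male and female counts add up to the participant counts for both the training set and the full corpus.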
Size | Tiny | Small | Medium | Large | Total
---|---|---|---|---|---
Area (m²) | 0-15 | 15-35 | 35-60 | 60-∞ | 0-∞
Train | 5 | 5 | 3 | 2 | 15
Dev | | | | |
Eval | | | | |

Tab.2. Statistics of meeting rooms
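The size groups in Table 2 amount to a simple binning of floor area. A minimal sketch follows, assuming half-open bins (a room of exactly 15 m² counts as small, and so on); the table does not say which group a boundary value belongs to, and the function name is hypothetical.

```python
def room_size_category(area_m2: float) -> str:
    """Map a floor area in square meters to a Table 2 size group.

    Boundary handling (lower bound inclusive, upper bound exclusive)
    is an assumption; the source table lists only the ranges.
    """
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    if area_m2 < 15:
        return "Tiny"
    if area_m2 < 35:
        return "Small"
    if area_m2 < 60:
        return "Medium"
    return "Large"


if __name__ == "__main__":
    # Smallest and largest room areas mentioned in the text.
    for area in (8.79, 117.6):
        print(area, room_size_category(area))
```

Applied to the extremes quoted in the text, the 8.79 m² room falls in the tiny group and the 117.6 m² room in the large group.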
This dataset is available under the specified license. Before using the corpus, please navigate to the Registration page to sign up. After registering, you will receive a download password.
download link