January 23rd, 2024: The ChatCLR challenge has been accepted as an ICME 2024 Grand Challenge (GC). Please refer to https://2024.ieeeicme.org/grand-challenge-proposals for more details on the ICME 2024 GC.
January 25th, 2024: The download link for the training and development sets has been sent to registered participants via email.
In daily conversations, people rely on auditory cues, such as the speaker's voice, and visual cues, such as lip movements, to comprehend spoken content. In poor acoustic conditions the audio may be drowned out by noise, making the content difficult to recover. Lipreading infers what is being said from lip movements. This emerging and challenging field lies at the intersection of computer vision and natural language processing and plays a key role in applications across many domains. However, lipreading research has concentrated primarily on English, and Chinese lipreading needs more attention. The added complexity of Chinese lipreading stems from the large number of Chinese characters and the complex mapping between these characters and the corresponding lip movements. In addition, the scarcity of lipreading challenges and the lack of large-scale Chinese lipreading datasets further constrain research in this field. Existing Chinese lipreading datasets mainly feature professional announcers or carefully prepared topics, which are not representative of practical applications. Our challenge will release videos recorded in a real home chat scenario, with 2-6 speakers talking in a relaxed and unscripted fashion.
Our challenge addresses two tasks: wake word lipreading and target speaker lipreading. The wake word lipreading task focuses on activating smart home devices during conversational interactions. The target speaker lipreading task requires participants to fine-tune their networks to recognize continuous, colloquial spoken conversation. Researchers from both academia and industry are warmly welcome to work on our two tasks, to promote research on speech processing with video information and help it cross the practical threshold of realistic applications in challenging scenarios. For full details of the challenge, please refer to the website. Don't hesitate to contact us at mispchallenge@gmail.com with any queries concerning the challenge.