January 23rd, 2024: The ChatCLR challenge has been accepted as an ICME 2024 Grand Challenge (GC). Please refer to the ICME 2024 website for more details of the ICME 2024 GC.

January 25th, 2024: The download link for the training and development sets has been sent to registered participants via email.

In daily conversations, people rely on both auditory cues, such as the speaker's voice, and visual cues, such as lip movements, to understand spoken content. In poor acoustic conditions, the audio may be drowned out by noise, making the content difficult to recover. Lipreading infers spoken content from lip movements. This emerging and challenging field lies at the intersection of computer vision and natural language processing and plays a key role in applications across many domains. However, lipreading research has concentrated primarily on English, and Chinese lipreading needs more attention. Chinese lipreading is more complex because of the large number of Chinese characters and the complex mapping between these characters and the corresponding lip movements. The scarcity of lipreading challenges and the lack of large-scale Chinese lipreading datasets further constrain research in this field. Existing Chinese lipreading datasets mainly feature professional announcers or carefully prepared topics, which limits their applicability in practice. Our challenge will release videos recorded in a real home chat scenario, with 2-6 speakers talking in a relaxed and unscripted fashion.

Our challenge addresses wake word lipreading and target speaker lipreading. The wake word lipreading task focuses on activating smart home devices during conversational interactions. Target speaker lipreading requires participants to fine-tune their networks to recognize continuous, colloquial conversational speech. Researchers from both academia and industry are warmly welcome to work on our two tasks, promoting research on speech processing with video information so that it can cross the practical threshold of realistic applications in challenging scenarios. For full details of the challenge, please refer to the website. Don't hesitate to contact us with any queries concerning the challenge.


  • Task 1: Wake Word Lipreading
  • Task 2: Target Speaker Lipreading

Planned Schedule (AOE Time):

  • Start the challenge and release the training and development set: January 25th, 2024
  • Baseline system release: February 5th, 2024
  • Evaluation set release and leaderboard update for evaluation set: March 13th, 2024
  • Leaderboard freeze for evaluation set: March 27th, 2024
  • Deadline for submitting invited papers: April 7th, 2024
  • Notification of paper acceptance: April 12th, 2024
  • Deadline for camera-ready submission of accepted papers: Same as the ICME 2024 camera-ready deadline


Jun Du

University of Science and Technology of China

Chin-Hui LEE

Georgia Institute of Technology

Sabato Marco Siniscalchi

Kore University of Enna