Publication . Conference object . Preprint . Contribution for newspaper or weekly magazine . Article . 2016

The MGB-2 challenge: Arabic multi-dialect broadcast media recognition

Ahmed Ali; Peter Bell; James Glass; Yacine Messaoui; Hamdy Mubarak; Steve Renals; Yifan Zhang
Open Access
Abstract
This paper describes the Arabic Multi-Genre Broadcast (MGB-2) Challenge for SLT-2016. Unlike last year's English MGB Challenge, which focused on recognition of diverse TV genres, this year's challenge emphasises handling dialect diversity in Arabic speech. The audio data comes from 19 distinct programmes broadcast on the Aljazeera Arabic TV channel between March 2005 and December 2015. Programmes are split into three groups: conversations, interviews, and reports. A total of 1,200 hours have been released with lightly supervised transcriptions for acoustic modelling. For language modelling, we made available over 110M words crawled from the Aljazeera Arabic website, Aljazeera.net, covering a 10-year period (2000-2011). Two lexicons have been provided, one phoneme-based and one grapheme-based. Finally, two tasks were proposed for this year's challenge: standard speech transcription and word alignment. This paper describes the task data and evaluation process used in the MGB challenge, and summarises the results obtained.
Subjects by Vocabulary

Microsoft Academic Graph classification: Arabic, Grapheme, Process (engineering), Task (project management), Computer science, Speech transcription, Metadata, Transcription (linguistics), Emphasis (typography), Natural language processing, Artificial intelligence

Subjects

Computer Science - Computation and Language, Computation and Language (cs.CL), FOS: Computer and information sciences


Funded by
EC | SUMMA
Project: SUMMA (Scalable Understanding of Multilingual Media)
  • Funder: European Commission (EC)
  • Project Code: 688139
  • Funding stream: H2020 | RIA
Validated by funder
Related to Research communities
Digital Humanities and Cultural Heritage