publication . Contribution for newspaper or weekly magazine . Other literature type . Conference object . Preprint . 2016

Automatic Dialect Detection in Arabic Broadcast Speech

Ali, Ahmed; Dehak, Najim; Cardinal, Patrick; Khurana, Sameer; Yella, Sree Harsha; Glass, James; Bell, Peter; Renals, Steve
Open Access English
  • Published: 08 Sep 2016
  • Country: United Kingdom
Abstract
In this paper, we investigate different approaches for dialect identification in Arabic broadcast speech. These methods are based on phonetic and lexical features obtained from a speech recognition system, and bottleneck features using the i-vector framework. We studied both generative and discriminative classifiers, and we combined these features using a multi-class Support Vector Machine (SVM). We validated our results on an Arabic/English language identification task, with an accuracy of 100%. We also evaluated these features in a binary classifier to discriminate between Modern Standard Arabic (MSA) and Dialectal Arabic, with an accuracy of 100%. We fur...
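The abstract does not say how the feature streams are combined or how the multi-class SVM is trained, so the following is only a minimal sketch under stated assumptions: each stream is L2-normalized and concatenated (feature-level fusion), and the multi-class decision is built from one-vs-rest linear SVMs trained by subgradient descent on the hinge loss. The function names, the labels `class_a` to `class_c`, and all numbers in the toy data are hypothetical placeholders, not values or methods taken from the paper.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(*streams):
    """Feature-level fusion (an assumption): normalize each stream, then concatenate."""
    fused = []
    for vec in streams:
        fused.extend(l2_normalize(vec))
    return fused

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_one_vs_rest(X, y, labels, epochs=200, eta=0.1, lam=0.001):
    """Train one linear SVM per label by subgradient descent on the
    L2-regularized hinge loss, visiting samples in a fixed order."""
    models = {}
    for label in labels:
        w, b = [0.0] * len(X[0]), 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                target = 1.0 if yi == label else -1.0
                violated = target * (dot(w, xi) + b) < 1.0
                w = [(1.0 - eta * lam) * wj for wj in w]  # weight decay step
                if violated:  # hinge loss is active for this sample
                    w = [wj + eta * target * xj for wj, xj in zip(w, xi)]
                    b += eta * target
        models[label] = (w, b)
    return models

def predict(models, x):
    """Return the label whose binary SVM gives the highest score."""
    return max(models, key=lambda label: dot(models[label][0], x) + models[label][1])

# Hypothetical toy data: (label, lexical-style vector, bottleneck-style vector).
toy = [
    ("class_a", [0.9, 0.1, 0.0], [0.8, 0.1, 0.1]),
    ("class_a", [1.0, 0.0, 0.1], [0.9, 0.0, 0.2]),
    ("class_b", [0.1, 0.9, 0.0], [0.0, 1.0, 0.1]),
    ("class_b", [0.0, 0.8, 0.1], [0.2, 0.9, 0.0]),
    ("class_c", [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]),
    ("class_c", [0.1, 0.0, 0.8], [0.1, 0.2, 1.0]),
]
labels = ["class_a", "class_b", "class_c"]
X = [fuse(lexical, bottleneck) for _, lexical, bottleneck in toy]
y = [label for label, _, _ in toy]

models = train_one_vs_rest(X, y, labels)
predictions = [predict(models, x) for x in X]
print(predictions)
```

Normalizing each stream before concatenation keeps a high-magnitude stream from dominating the linear decision function; other fusion schemes, such as combining per-stream classifier scores, would fit the same description in the abstract.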
Subjects
ACM Computing Classification System: Computing Methodologies, Pattern Recognition
free text keywords: Computer Science - Computation and Language, Broadcasting, Arabic, Natural language processing, Artificial intelligence, Computer science
Related Organizations
Funded by
EC| SUMMA
Project
SUMMA
Scalable Understanding of Multilingual Media
  • Funder: European Commission (EC)
  • Project Code: 688139
  • Funding stream: H2020 | RIA
Validated by funder
Communities
Digital Humanities and Cultural Heritage