ShefCE: A Cantonese-English bilingual speech corpus

<p>ShefCE is a Cantonese-English bilingual parallel speech corpus recorded by L2 English learners in Hong Kong. Thirty-one undergraduate and postgraduate students in Hong Kong, aged 20-30, were recruited to record the 25-hour corpus (12 hours in Cantonese and 13 hours in English). Details can be found in [1].</p><p>The corpus is available free of charge for academic research, teaching and non-commercial use. To use the data, a <a href="http://mini.dcs.shef.ac.uk/wp-content/uploads/ShefCE_DataRequestForm.pdf">data request form</a> must be signed and submitted to the University of Sheffield. The details and the data request form are available at <a href="http://mini.dcs.shef.ac.uk/resources/shefce">http://mini.dcs.shef.ac.uk/resources/shefce</a>. Please cite [1] when using the data.</p><p>[1] Raymond W. M. Ng, Alvin C. M. Kwan, Tan Lee and Thomas Hain, "ShefCE: A Cantonese-English Bilingual Speech Corpus for Pronunciation Assessment", in <i>Proc. The 42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</i>, 2017.</p>