A pointer to potential data

Cotterell, R., & Callison-Burch, C. (2014). A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic. In LREC (pp. 241-245).


It's written in Arabic script, but at least it's labelled Maghrebi Arabic data. So you'd "only" have to perform a transliteration, or several, between that corpus and your data.
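
To give an idea of what that transliteration step might look like, here is a minimal sketch in Python. It assumes your data is in Latin script and uses a Buckwalter-style character table as a stand-in. The function name `transliterate` and the choice of scheme are mine, not something from the corpus or the paper, and real dialectal text (digits used as letters, inconsistent vowels) would almost certainly need a different or additional mapping.

```python
# Illustrative only: a Buckwalter-style Arabic-to-Latin character table.
# The scheme is an assumption; swap in whatever matches your own data.
ARABIC_TO_LATIN = {
    "\u0621": "'",  # hamza
    "\u0627": "A",  # alif
    "\u0628": "b",  # ba
    "\u0629": "p",  # ta marbuta
    "\u062a": "t",  # ta
    "\u062b": "v",  # tha
    "\u062c": "j",  # jim
    "\u062d": "H",  # Ha
    "\u062e": "x",  # kha
    "\u062f": "d",  # dal
    "\u0630": "*",  # dhal
    "\u0631": "r",  # ra
    "\u0632": "z",  # zay
    "\u0633": "s",  # sin
    "\u0634": "$",  # shin
    "\u0635": "S",  # Sad
    "\u0636": "D",  # Dad
    "\u0637": "T",  # Ta
    "\u0638": "Z",  # Za
    "\u0639": "E",  # ayn
    "\u063a": "g",  # ghayn
    "\u0641": "f",  # fa
    "\u0642": "q",  # qaf
    "\u0643": "k",  # kaf
    "\u0644": "l",  # lam
    "\u0645": "m",  # mim
    "\u0646": "n",  # nun
    "\u0647": "h",  # ha
    "\u0648": "w",  # waw
    "\u0649": "Y",  # alif maqsura
    "\u064a": "y",  # ya
}

_TABLE = str.maketrans(ARABIC_TO_LATIN)


def transliterate(text: str) -> str:
    """Map each Arabic letter in the table to its Latin counterpart.

    Characters that are not in the table (spaces, digits, punctuation,
    diacritics) are passed through unchanged.
    """
    return text.translate(_TABLE)


if __name__ == "__main__":
    # kaf, ta, alif, ba
    print(transliterate("\u0643\u062a\u0627\u0628"))
```

A plain character table like this only gets you one direction and one convention. Going from the corpus to informally romanized text would likely take a second, lossier mapping on top of it.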
