
A pointer to some potentially useful data:

Cotterell, R., & Callison-Burch, C. (2014). A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic. In LREC (pp. 241-245).

http://www.lrec-conf.org/proceedings/lrec2014/pdf/641_Paper....

It's written in Arabic script, but at least it's labelled data for Maghrebi Arabic. So you'd "only" have to perform one or more transliterations between that corpus and your data.
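
For what it's worth, a character-level transliteration could start out as small as the sketch below. The mapping table is my own made-up, deliberately tiny example (it is not taken from the paper), and real Maghrebi text would need many more entries, plus some handling for letters that have several common Latin spellings.

```python
# Hypothetical, deliberately tiny Arabic-to-Latin table (an assumption for
# illustration only); a usable table would be much larger and dialect-specific.
ARABIC_TO_LATIN = {
    "ا": "a",
    "ب": "b",
    "ت": "t",
    "د": "d",
    "ر": "r",
    "س": "s",
    "ك": "k",
    "ل": "l",
    "م": "m",
    "ن": "n",
    "و": "w",
    "ي": "y",
}


def transliterate(text, table=ARABIC_TO_LATIN):
    """Replace each mapped character; leave unmapped characters unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(transliterate("كتاب"))
```

Characters missing from the table pass through untouched, which makes it easy to spot what still needs a mapping.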



