„MADAR (Multi-Arabic Dialect Applications and Resources) is a three-year joint project among the NLP Group at Carnegie Mellon University in Qatar (CMU‑Q), the Computational Approaches to Modeling Language (CAMEL) Lab at New York University Abu Dhabi (NYUAD), and Columbia University. The project also involves collaborators from the University of Bahrain (UoB).
The project aims to improve dialectal Arabic processing by:
- developing resources for Arabic dialect modeling, including a 25-city multi-dialect lexicon and a 25-city multi-dialect parallel corpus;
- developing machine translation systems between dialects, between dialects and English, and between dialects and Standard Arabic; and
- developing dialect identification systems that work at varying levels of granularity.
To date, the MADAR project is the largest effort, in both scale and depth, in natural language processing of Arabic dialects.“