Arabic script

The Arabic script is the writing system used for Arabic and several other languages of Asia and Africa, such as Persian, Urdu, Azerbaijani, Pashto, Central Kurdish, Luri, dialects of Mandinka, and others. Until the 16th century, it was also used to write some texts in Spanish. By number of countries using it, it is the second most widely used writing system in the world; by number of users, it is the third, after the Latin script and Chinese characters.

Today, Afghanistan, Iran, India, Pakistan and China are the main non-Arabic-speaking states that use the Arabic alphabet to write one or more official national languages, including Azerbaijani, Baluchi, Brahui, Persian, Pashto, Central Kurdish, Urdu, Sindhi, Kashmiri, Punjabi and Uyghur.

The Arabic script is written from right to left in a cursive style. In most cases the letters transcribe consonants, or consonants and a few vowels, so most Arabic alphabets are abjads.

The script was first used to write texts in Arabic, most notably the Qurʼān, the holy book of Islam. With the spread of Islam, it came to be used for languages of many language families, which led to the addition of new letters and other symbols. Some versions, such as those for Kurdish, Uyghur, and old Bosnian, are abugidas or true alphabets. The script is also the basis for the tradition of Arabic calligraphy.

Arabic characters in Unicode

As of Unicode 10.0, the following ranges encode Arabic characters:

  • Arabic (0600–06FF)
  • Arabic Supplement (0750–077F)
  • Arabic Extended-A (08A0–08FF)
  • Arabic Presentation Forms-A (FB50–FDFF)
  • Arabic Presentation Forms-B (FE70–FEFF)
  • Arabic Mathematical Alphabetic Symbols (1EE00–1EEFF)
  • Rumi Numeral Symbols (10E60–10E7F)
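
The ranges listed above can be turned into a simple lookup. The following is a minimal sketch, assuming only the block boundaries given in the list; the names `ARABIC_BLOCKS` and `arabic_block` are hypothetical and chosen for illustration.

```python
# Block boundaries copied from the list above (as of Unicode 10.0).
# ARABIC_BLOCKS and arabic_block are illustrative names, not a standard API.
ARABIC_BLOCKS = [
    ("Arabic", 0x0600, 0x06FF),
    ("Arabic Supplement", 0x0750, 0x077F),
    ("Arabic Extended-A", 0x08A0, 0x08FF),
    ("Arabic Presentation Forms-A", 0xFB50, 0xFDFF),
    ("Arabic Presentation Forms-B", 0xFE70, 0xFEFF),
    ("Rumi Numeral Symbols", 0x10E60, 0x10E7F),
    ("Arabic Mathematical Alphabetic Symbols", 0x1EE00, 0x1EEFF),
]


def arabic_block(ch):
    """Return the name of the listed block containing ch, or None."""
    cp = ord(ch)
    for name, start, end in ARABIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


if __name__ == "__main__":
    # Print the block of a few sample characters.
    for ch in ["\u0627", "\u0750", "\uFE8D", "A"]:
        print(f"U+{ord(ch):04X}", arabic_block(ch))
```

A character outside every listed range, such as a Latin letter, yields `None`. Later Unicode versions may add further Arabic blocks that this table does not cover.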

The basic Arabic range encodes the standard letters, the most common diacritics, and the Arabic-Indic digits, but not contextual forms; U+0621–U+0652 are based directly on ISO 8859-6. The Arabic Supplement range encodes letter variants mostly used for writing African (non-Arabic) languages. The Arabic Extended-A range encodes additional Qur’anic annotations and letter variants used for various non-Arabic languages. The Arabic Presentation Forms-A range encodes contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The Arabic Presentation Forms-B range encodes spacing forms of Arabic diacritics and further contextual letter forms. The presentation forms exist only for compatibility with older standards and are not needed for encoding text today. The Arabic Mathematical Alphabetic Symbols block encodes characters used in Arabic mathematical expressions.
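
Because the presentation forms are compatibility characters, text that contains them can usually be mapped back to the basic letters. The sketch below uses Python's standard `unicodedata.normalize` with the NFKC form for this purpose; the function name `to_base_letters` is hypothetical. NFKC also rewrites other compatibility characters (and turns spacing diacritic forms into a space plus a combining mark), so this is an illustration under that assumption, not a recommended pipeline for every text.

```python
import unicodedata


def to_base_letters(text):
    """Map compatibility characters, including Arabic presentation forms,
    to their base letters using NFKC normalization.

    Note: NFKC affects all compatibility characters, not only Arabic ones.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # U+FE8D ARABIC LETTER ALEF ISOLATED FORM maps to U+0627 ARABIC LETTER ALEF.
    result = to_base_letters("\uFE8D")
    print([f"U+{ord(c):04X}" for c in result])
```

Text already written with letters from the basic Arabic range passes through unchanged, since those letters have no compatibility decomposition.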

See the official Unicode Consortium code charts.