
Arabic numerals (Draft)

Najib Tounsi edited this page Nov 20, 2016 · 14 revisions

Topics to talk about

Different families
Origin
Writing/Reading digits
Issues related to families (fonts, keyboards ...)
What else...

There are essentially two families of numerals used with the Arabic script. The first is known as Western digits, also called Arabic digits (Unicode range U+0030-U+0039); the second is Arabic-Indic digits (Unicode range U+0660-U+0669). The latter has a variant called Persian/Urdu digits, also known as Extended Arabic-Indic digits (Unicode range U+06F0-U+06F9), in which digits 4, 5 and 6 have different glyphs. The following table summarizes these families.

(TODO: maybe add a bigger table with Unicode values/names for each character.)
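As a starting point for such a table, here is a minimal sketch that lists the code point and Unicode name of every digit in the three ranges given above. The names `FAMILIES` and `digit_table` are made up for this illustration; only the standard `unicodedata` module is used.

```python
import unicodedata

# First code point of each digit family, taken from the ranges in the text.
FAMILIES = {
    "Western": 0x0030,
    "Arabic-Indic": 0x0660,
    "Extended Arabic-Indic": 0x06F0,
}

def digit_table():
    """Return one row per digit value: (value, [(char, code point, name), ...])."""
    rows = []
    for value in range(10):
        cells = []
        for start in FAMILIES.values():
            ch = chr(start + value)
            cells.append((ch, f"U+{ord(ch):04X}", unicodedata.name(ch)))
        rows.append((value, cells))
    return rows

# Print one line per digit value, one column per family.
for value, cells in digit_table():
    print(value, " | ".join(f"{ch} {cp} {name}" for ch, cp, name in cells))
```

The output is plain text, so it would still have to be turned into the wiki's table markup by hand.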

Arabic Numerals

TODO: add a few words here on the history of these three families and why they differ, although they share the same Indian origin.

The first line above lists the digits mostly used in the western Arabic regions (Maghreb), while the second lists the digits used in most Middle East countries. Persian (and Urdu) mostly use the third category, even though some fonts may not distinguish the glyphs that differ.

An important fact to note here is the bidirectional category of these numbers.

  • Western digits (U+0030..U+0039) are of category "EN - European number",
  • Arabic-Indic digits (U+0660..U+0669) are "AN - Arabic number",
  • Extended Arabic-Indic digits (U+06F0..U+06F9) are classified "EN - European number", unlike their Arabic-Indic counterparts above.

The difference in bidi category between Arabic-Indic digits and Extended (Eastern) Arabic-Indic digits is due to the difference in bidi behavior desired in Arabic vs. Persian.

As a consequence, the string "a 5 b" (taking a and b to be Latin letters) will display in an RTL context as:

  • "a ۵ b", in the case of Extended Arabic-Indic digits
  • "b ٥ a", in the case of Arabic-Indic digits

which may seem surprising, given that the digits in question can have the same shape.
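The bidi classes listed above can be checked against the Unicode database shipped with Python. This is only a sketch: `SAMPLES` and `bidi_classes` are names chosen for the illustration, and the script reports the class of each character without running the bidi algorithm itself.

```python
import unicodedata

# The digit five from each of the three families discussed in the text.
SAMPLES = {
    "Western digit five": "\u0035",
    "Arabic-Indic digit five": "\u0665",
    "Extended Arabic-Indic digit five": "\u06F5",
}

def bidi_classes(samples):
    """Map each label to the bidi class of its character (e.g. 'EN', 'AN')."""
    return {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}

for label, bidi_class in bidi_classes(SAMPLES).items():
    print(f"{label}: {bidi_class}")
```

The two digits that can look alike in many fonts, U+0665 and U+06F5, come back with different classes, which is what drives the different ordering described above.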

Numerals do not always appear alone; they can come with other characters such as currency symbols, fraction signs, and decimal and/or thousands separators (math expressions are excluded here). Moreover, numerals can be separated by, or mixed with, spaces or other signs (e.g. phone numbers such as +12 34 56 78 89, car licence plates such as 123 د‎ 4, etc.).

Particular attention is needed here. First, numbers have weak directionality with regard to the bidi algorithm; second, the placement of the accompanying signs and symbols may depend on the region, generally Middle East vs. Maghreb. This is not to mention punctuation signs.

  • Either the % percent sign (U+0025) or the ٪ Arabic percent sign (U+066A) may be used with the Arabic script. The sign may be placed to the left or to the right of the number (١٢% or %١٢), with or without a space (12 % or % 12) @@ images to put here @@

  • The decimal and thousands separators may be either the dot . or the comma , (1,234.45 or 1.234,45) @@images here @@. The thousands separator may also be a single space (1 234,45), etc.

  • A fraction such as one half could be written 1/2, 1\2 or 2\1
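The separator variants above can be sketched in a few lines. Everything here is an assumption made for illustration: `format_number` and `TO_ARABIC_INDIC` are hypothetical names, and the choice of separators is left to the caller rather than derived from any locale data.

```python
# Translation table from Western digits to Arabic-Indic digits (U+0660..U+0669).
TO_ARABIC_INDIC = {ord(str(d)): chr(0x0660 + d) for d in range(10)}

def format_number(value, decimal_sep=".", thousands_sep=",", places=2):
    """Format value with the given decimal and thousands separators."""
    # Python's default grouping gives e.g. "1,234.45"; swap in the requested signs.
    text = f"{value:,.{places}f}"
    integer_part, _, fraction_part = text.partition(".")
    integer_part = integer_part.replace(",", thousands_sep)
    if fraction_part:
        return integer_part + decimal_sep + fraction_part
    return integer_part

print(format_number(1234.45))                                      # dot as decimal sign
print(format_number(1234.45, decimal_sep=",", thousands_sep="."))  # comma as decimal sign
print(format_number(1234.45, decimal_sep=",", thousands_sep=" "))  # space as thousands sign
# Same idea with Arabic-Indic digits and a trailing percent sign.
print(format_number(12, places=0).translate(TO_ARABIC_INDIC) + "%")
```

The sketch only builds the character sequence; where the percent sign or separators end up on screen still depends on the bidi context in which the string is displayed.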

Issues :

  • How can one tell whether a sign (space, comma ...) is a separator between numbers or a sign within a number? Is 12 34 56 78 90 a phone number or a sequence of separate numbers? The groups may be reordered in an RTL context. A tip is to use a syntax like 12.34.56.78.90 or 12-34-56-78-90 for phone numbers.

  • A string like the licence plate above, 123 د‎ 4, would require a tag or a control character, but this is not always desirable.

  • etc.
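The phone number tip can be related to the bidi classes of the separators themselves: the space is a neutral whitespace character (WS), whereas the hyphen-minus is a European separator (ES) and the full stop a common separator (CS), which the bidi algorithm can treat as part of a number when it sits alone between European digits. The sketch below, with the hypothetical helper `format_phone`, groups a digit string with a chosen separator and prints that separator's class; it does not perform any reordering itself.

```python
import unicodedata

def format_phone(digits, sep="-", group=2):
    """Join groups of `group` digits with `sep`, e.g. '1234567890' -> '12-34-56-78-90'."""
    chunks = [digits[i:i + group] for i in range(0, len(digits), group)]
    return sep.join(chunks)

# Compare the three separators mentioned in the text.
for sep in (" ", "-", "."):
    print(repr(sep), unicodedata.bidirectional(sep), format_phone("1234567890", sep))
```

This covers Western digits only; whether the same tip holds for Arabic-Indic digits would need to be checked separately.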

Note: Math expressions are not covered here. See @@ elsewhere @@

Other topics to talk about: keyboard layouts with respect to regions, and which digits are used by default in different OSes/applications

...

REMOVED: Arabic numbers are written left to right, and are read in Arabic in different ways

  • almost from right to left

    23 three and twenty
    423 three and twenty and four hundreds
    3423 three and twenty and four hundreds and three thousands
    23423 three and twenty and four hundreds and (three and twenty) thousands

  • from left to right, but from right to left for the two first digits

    23 three and twenty
    423 four hundreds and three and twenty
    23423 (three and twenty) thousands and four hundreds and three and twenty

Although this is about how different languages read their texts, numbers may also be written out in full in words (cf. money and payment documents, checks, etc.), so this is worth mentioning.
