Persian vs Arabic Orthographies

Persian and Arabic may both use the Arabic script, but their written forms are quite different from each other. In this post I’m going to try and talk about the big differences so that people can both learn to distinguish them from each other and learn some cool facts.

The New Letters

Arabic is kind of weird in that it doesn’t have the sounds “p” or “g”, meaning its alphabet naturally doesn’t have any letters corresponding to those sounds. Persian, however, has both, so the letters پ pe and گ gâf were created to represent p and g respectively. There are also 2 other new letters, ژ zhe and چ che, representing the sounds “zh” (like the “si” in “vision”) and “ch”.
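
If you want to poke at these four letters on a computer, here's a small Python sketch. The code points are the standard Unicode ones, but the names `PERSIAN_ADDED_LETTERS` and `looks_persian` are just things I made up for illustration. The check is also only a rough heuristic: other languages written in the Arabic script (Urdu and Kurdish, for example) use these letters too, and plenty of Persian sentences happen not to contain any of them.

```python
import unicodedata

# The four letters Persian added to the Arabic alphabet, written as
# escapes so the code points are unambiguous (letter shown in the comment).
PERSIAN_ADDED_LETTERS = {
    "\u067e": "p",   # پ pe
    "\u0686": "ch",  # چ che
    "\u0698": "zh",  # ژ zhe
    "\u06af": "g",   # گ gâf
}


def looks_persian(text: str) -> bool:
    """Rough heuristic: True if the text contains any of the four added letters."""
    return any(ch in PERSIAN_ADDED_LETTERS for ch in text)


if __name__ == "__main__":
    # Print each letter's code point, official Unicode name, and sound.
    for letter, sound in PERSIAN_ADDED_LETTERS.items():
        print(f"U+{ord(letter):04X} {unicodedata.name(letter)} -> {sound}")
```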

Different Pronunciation

For all its lack of sounds as common as “p” and “g”, Arabic has a lot of pretty weird sounds, including the “th”s in “thick” and “this” (which you may think are perfectly normal because of English but are actually quite rare worldwide) and a set of weird throaty “emphatic consonants”. Naturally these weird sounds have their own letters: the two “th”s are written as ث and ذ and there are lots of emphatic letters which I don’t feel like going over now. But Persian has neither the “th”s nor emphatics. The logical solution would be to get rid of these letters entirely, but no, Persian decided to keep writing them in Arabic loanwords and just pronounce them as their closest Persian counterparts. Thus ث and ذ are pronounced as “s” and “z”, and emphatics are pronounced as non-emphatic: س and ص are both “s”, ز ض ظ are all “z”, ت ط are both “t”, and ه ح are both “h”. Also, the infamous ع ‘ayn which any Arabic learner will complain to you about is simply pronounced as a glottal stop in Persian. One more thing to note: the letter و, named “waw” and pronounced as “w” in Arabic, is now “vâv” and pronounced as “v”.
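
To see just how many letters end up sharing a sound, here's the paragraph above written out as a lookup table. This is only a sketch: the names `PERSIAN_SOUND` and `letters_for` are my own, and the table covers just the letters mentioned here, not the whole alphabet.

```python
# Persian pronunciation of the letters discussed above.
# Keys are escapes so the code points are unambiguous; the comment shows the letter.
PERSIAN_SOUND = {
    "\u062b": "s",  # ث (Arabic "th" as in "thick")
    "\u0633": "s",  # س
    "\u0635": "s",  # ص (emphatic in Arabic)
    "\u0630": "z",  # ذ (Arabic "th" as in "this")
    "\u0632": "z",  # ز
    "\u0636": "z",  # ض (emphatic in Arabic)
    "\u0638": "z",  # ظ (emphatic in Arabic)
    "\u062a": "t",  # ت
    "\u0637": "t",  # ط (emphatic in Arabic)
    "\u0647": "h",  # ه
    "\u062d": "h",  # ح
    "\u0639": "'",  # ع 'ayn, a glottal stop in Persian
    "\u0648": "v",  # و vâv
}


def letters_for(sound: str) -> list[str]:
    """Return every letter in the table that Persian pronounces as `sound`."""
    return sorted(letter for letter, s in PERSIAN_SOUND.items() if s == sound)


if __name__ == "__main__":
    for sound in ("s", "z", "t", "h"):
        print(sound, "is written with", len(letters_for(sound)), "different letters")
```

So "z" alone has four possible spellings, which is why spelling Arabic loanwords in Persian takes some memorization.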

Differing Letter Forms

Arabic has grammatical gender, and with that there is the very common suffix -a to mark feminine gender, written with a form of the letter tā’ called tā’ marbūṭa “tied tā’”, which looks like ة (the letter ه hā’ “h” with 2 dots). Persian does not have grammatical gender and thus has no need for tā’ marbūṭa. Arabic loanwords which have tā’ marbūṭa are borrowed either with a final -ه e (اسطوره osture vs أسطورة usṭūra “myth”) or with -at, written with ت (دولت dowlat vs دولة dawla “state”).

There are 2 word-final forms of letters that look very similar to each other in Arabic: ي, final yā’ “y”, and ى, actually a form of ا alif called alif maqṣūra which is pronounced as long ā. Persian, however, doesn’t actually dot its final yā’ (or rather “ye”), making the two identical. The thing is, alif maqṣūra is VERY rare in Persian, only really commonly occurring in some proper names such as عیسی ‘isâ “Jesus” or مرتضی mortezâ “Morteza”.

Arabic’s letter for k, ك kāf, looks kind of like the letter ل lām “l” with a doodad inside of it in the isolated and final forms, but looks like this: كـ elsewhere. In Persian, the isolated and final forms are ک ـک, which match the shape used elsewhere and give it a much more consistent aesthetic across the board. The letter for g, گ gâf, also naturally follows this convention.
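
These yā’ and kāf differences also exist at the Unicode level: the Arabic and Persian versions are separate code points, so text typed on an Arabic keyboard can look almost right in Persian but fail to match in a search. Here's a minimal sketch of the usual fix. The function name `to_persian_forms` is made up, and I'm assuming you only want to touch kāf and yā’; alif maqṣūra (U+0649) is deliberately left alone.

```python
# Map the Arabic code points to the ones Persian text normally uses.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u0643": "\u06a9",  # ك ARABIC LETTER KAF -> ک ARABIC LETTER KEHEH
    "\u064a": "\u06cc",  # ي ARABIC LETTER YEH -> ی ARABIC LETTER FARSI YEH
})


def to_persian_forms(text: str) -> str:
    """Replace Arabic kāf and yā' with their Persian counterparts."""
    return text.translate(ARABIC_TO_PERSIAN)


if __name__ == "__main__":
    arabic_typed = "\u0643\u062a\u0627\u0628"  # "ketâb" typed with an Arabic kāf
    fixed = to_persian_forms(arabic_typed)
    # Show the code points before and after the replacement.
    print([f"U+{ord(c):04X}" for c in arabic_typed])
    print([f"U+{ord(c):04X}" for c in fixed])
```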

So Arabic has this thing called hamza that represents the glottal stop (a pause, like the sound in “uh-oh” represented by the hyphen). It can go on top of the letters yā’ and wāw ی و and give you ئ ؤ, representing a glottal stop preceded or followed by the vowel sounds “i” and “u” (سئل su’ila “he was asked”, سؤال su’āl “question”), or it can go either on top of OR below alif ا. The only letter with a hamza that can occur at the beginning of a word is alif, which gives it the burden of representing all 3 short vowels. A hamza on top means an “a” or “u” (أول ’awwal “first”, أسطورة ’usṭūra “myth”) and a hamza on the bottom means it’s an “i” (إسلام ’islām “Islam”). Hamza can also come at the end of a word not attached to anything, such as سوداء sawdā’ “black (feminine)”.

So I spent all that time explaining how hamza works in Arabic to deliver this shocking news: the hamza is actually not very common in Persian. The only real place you see it is in the middle of words on ئ and ؤ: otherwise it’s either optional or actually discouraged by the Persian Language Academy.


Now, the vowels are where the most drastic differences come in. Note I’ll mainly be talking about Modern Iranian Persian, which is an important detail because the vowels can vary pretty heavily across dialects.

Arabic has six vowels: a i u ā ī ū, with the ones with the line on top simply being longer versions of the first 3. Iranian Persian has… well, also 6 vowels, but they’re a e o â i u (a being the “a” in “cat”). In Arabic, due to how the vowel system works, there’s a pretty clean division of how vowels are written: short vowels are optionally indicated through diacritics, long vowels are indicated through consonant placeholders. As you can see, Persian doesn’t really have short and long vowels in the same way Arabic does, but we’re going to shoehorn the vowels into these now-arbitrary categories to make things simpler to understand.

Short vowels: a e o 
Long vowels: â i u 

The short vowels are indicated with diacritics:

اَ اِ اُ

The long vowels, meanwhile, are indicated through ا (glottal stop), ی “y”, and و “v”. The two diphthongs, ey and ow, are indicated through ی and و too. So this matches up pretty cleanly with the Arabic system, actually: in Arabic, those diacritics represent “a”, “i”, and “u”. This makes reading Arabic loanwords in Persian quite easy, because you can just read the short vowels as “a e o” and the long vowels as “â i u”. For example:

Arabic حُروف ḥurūf “letters”
Persian حُروف horuf “letters”
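
The vowel correspondence is regular enough to write down as a tiny Python sketch that works on romanized text. The function name `persianize_vowels` is my own invention, and it is deliberately naive: it only swaps the six vowels, leaving consonants (like the dotted ḥ) and diphthongs untouched.

```python
import unicodedata

# Arabic vowel -> Iranian Persian reading, per the correspondence above.
# Escapes are used for the accented letters so each key is one code point.
ARABIC_TO_PERSIAN_VOWELS = str.maketrans({
    "a": "a",
    "i": "e",
    "u": "o",
    "\u0101": "\u00e2",  # ā -> â
    "\u012b": "i",       # ī -> i
    "\u016b": "u",       # ū -> u
})


def persianize_vowels(translit: str) -> str:
    """Re-read the vowels of a romanized Arabic word the Persian way."""
    # NFC makes sure a letter like ū is one code point, not "u" + combining macron.
    normalized = unicodedata.normalize("NFC", translit)
    return normalized.translate(ARABIC_TO_PERSIAN_VOWELS)


if __name__ == "__main__":
    print(persianize_vowels("\u1e25ur\u016bf"))  # ḥurūf -> ḥoruf
```

Because `translate` looks at each character once, the short "u" becoming "o" and the long "ū" becoming "u" don't interfere with each other.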

Persian writes word-initial vowels by just throwing the vowel diacritics on top of ا alef, very similar to Arabic and its stuff with hamza:

اَسب asb “horse”
اِمروز emruz “today”
اُتاق otâq “room”

The vowels â i u are simply represented by آ (alef with a tilde-like diacritic), ای (alef + ye), and او (alef + vâv) respectively, which is quite close to what Arabic does with ā ī ū (but Arabic is cool and adds hamzas).

Word-final vowels are where things get a bit different though. In Arabic, short vowels are just indicated with diacritics at the end of words and the long vowels… let’s just say Arabic has a bit of a complex relationship with word-final long vowels. In Persian, though, all vowels must be indicated word-finally somehow. And here’s how it happens:

1. The most common short vowel at the end of a word is “e”, indicated by ه. Next up is “o”, indicated by و, and finally the very rare “a”, indicated also by ه.

2. Long vowels are indicated with ا، ی، و just like they are in the middle of words. 

Like I said though, I’m talking about Iranian Persian. Afghan Persian actually has 2 more vowels: ē ō, longer versions of “e” and “o”. These are also indicated with ی and و. In Iranian Persian these two vowels have merged with i and u, resulting in the words شیر shēr “lion” and شیر shir “milk” both being pronounced “shir”. 


This last bit, on calligraphy, is mainly for fun, but what the hell. A lot of Arabic calligraphy gradually drifted towards a style called naskh, which is also how Arabic is displayed in basically every modern computer font.

Iran, however, developed a distinctive style called nastaliq. Besides being used very commonly for Persian poetry, this is also the standard way of writing Urdu! For example, here’s an Urdu newspaper. 

Well, that’s about all I have to say! I may have forgotten some stuff, but to me this seems like a pretty comprehensive list as I read over it. I hope you learned some stuff!