No topic has had a greater impact on the definition of the Urdu language than words.
The modules in this unit introduce readers to the five linguistic traditions whose confluence forms the massive reservoir of words that is the Urdu language. The following lessons are intended as basic introductions to the topic. For detailed analyses, readers might consult some of the works listed in the bibliography.
Philologists and linguists have historically categorized Urdu vocabulary using terms such as “Indic” and “Perso-Arabic.” They have also been more specific, designating words as deriving from Sanskrit, Persian, Arabic, and other languages. As we will see, this categorization can be very useful. But it has its limitations. The recognition that a word is etymologically derived from Arabic tells us nothing about the context in which it was borrowed into Urdu. Many Arabic words entered Urdu through Persian, but we wouldn’t know that just by reading an Urdu dictionary where a word’s etymology is marked as “Arabic.” Likewise, the recognition of a Sanskritic etymology often tells us very little about the history of the word. Many Sanskrit words were borrowed into Urdu through Prakrit. Some may be borrowings from other modern Indic languages.
One way that philologists and historical linguists have drawn etymological distinctions among words is by their sounds and spellings. For example, خ and غ are considered Perso-Arabic (and thus not Indic) sounds. Likewise, because ذ and ط are considered to mark sounds in Arabic not used in Urdu (or Persian), they are imagined only to appear in Arabic loanwords.
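The letter-based convention just described can be sketched in a few lines of Python. This is only an illustration of the conventional narrative, not a reliable classifier: the function name and the two letter sets are assumptions made for the example, and the sets contain only the four letters named above.

```python
# Letters the conventional narrative treats as etymological markers.
# Assumption: only the four letters named in the text are listed here.
PERSO_ARABIC_MARKERS = {"خ", "غ"}
ARABIC_ONLY_MARKERS = {"ذ", "ط"}


def conventional_guess(word: str) -> str:
    """Return the origin the letter-based convention would assign to a word."""
    letters = set(word)
    # Letters imagined to appear only in Arabic loanwords take priority.
    if letters & ARABIC_ONLY_MARKERS:
        return "Arabic"
    # Letters considered Perso-Arabic (and thus not Indic).
    if letters & PERSO_ARABIC_MARKERS:
        return "Perso-Arabic"
    # No marker letter found: the convention has nothing to say.
    return "undetermined"


print(conventional_guess("پٹاخا"))
print(conventional_guess("چابی"))
```

A check of this kind looks only at spelling. It labels پٹاخا [paṭāḳhā] as Perso-Arabic simply because the word contains خ, so its output is at best a first guess to be tested against a word's actual history.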
Urdu presents serious problems for this narrative by way of substantial exceptions. In theoretical terms, it casts doubt on the notion that a sound is foreign to a language or language family by providing examples of speakers of that language or within that language family who have used the sound for centuries. To return to our examples, the letters خ and غ have appeared in Indic vocabulary in Urdu for centuries in words like پٹاخا [paṭāḳhā] (firecracker)
That said, it is useful for the purpose of vocabulary acquisition and word studies to make broad etymological distinctions. For that reason, the following modules divide the sources of Urdu vocabulary into separate sections according to etymological origin. In each section, we will learn a basic strategy for recognizing the origin of a word. We will also learn three word-building patterns that we can use to help build and refine our vocabulary.
Because Urdu uses a modified form of the Arabic script, most Arabic words in Urdu are spelled in their original form. The preservation of Arabic spellings in Urdu allows us to recognize Arabic loanwords in Urdu. It also allows us to recognize relationships among Arabic words. In turn, this helps us not only to remember how to spell Arabic words, but also to make educated guesses about the meaning of unfamiliar words that we encounter when reading. Likewise, knowing how words relate to each other in Arabic can offer clues to the correct spelling of unfamiliar vocabulary. Suppose you know the word ظلم [zulm]
The history of interaction between Urdu and Persian in South Asia is as old as the history of the Urdu language. The authors of many of the earliest extant Urdu texts also wrote in Persian, and all of them lived in a cultural milieu in which Persian was the language of administration, governance, and learning. Many of the earliest extant works in what we now recognize as Urdu were written in imitation of Perso-Arabic literary genres. Some were written in imitation of Persian masterpieces. Others drew heavily on Persian (as opposed to Arabic) aesthetic sensibilities. For example, the Urdu ġhazal (lyric) and maṡnavī (long-form rhyming couplet) poetry written in the Deccan from the fifteenth century more closely resembles Persian masterpieces in these genres than those in Arabic. The same is generally true of Urdu poetry today. And because Persian remained the language of governance until the mid-nineteenth century, a language of education under colonialism, and to varying degrees remains a language of education, culture, and scholarship among many literate Urdu users, it has continued to shape and inform the contours of the Urdu language and its literature down to the present day. You can even find Persian lines and phrases in the lyrics of Bollywood item numbers such as the title song from the film Om Shanti Om.
The identification of Persian vocabulary presents problems similar to those we have discussed in the case of Arabic. One problem is the difference between linguistic identity and the history of linguistic interactions and borrowing. As mentioned, many of the Arabic words in use in Urdu today were borrowed not from Arabic, but through Persian. In some cases, Urdu follows the Persian spelling, form, and meaning of a word rather than the Arabic ones. For example, the word تماشا [tamāshā]
A third dimension of Persian’s relationship with Urdu worth keeping in mind is that Persian is a very close relative of Indic languages such as Sanskrit. Often, Urdu will have both a Persian word and its Indic cousins. For example, the Persian word شنا [shinā]
That said, it can be useful for the purposes of vocabulary building to know how to recognize Persian words as such. Here’s an overview.
One way to recognize that a word is Persian in origin without knowing anything about Persian is with reference to the letters in the word. Urdu words that contain both letters not historically identified as Indic and letters not found in classical Arabic are often Persian. For example, Indic languages stereotypically lack the sounds represented by خ [ḳh], ذ [z], ز [z], ژ [zh], غ [ġh], and ق [q]
Likewise, you would be correct to assume that گذارش [gużārish]
The following three modules introduce some of the most common Persian word building patterns in Urdu.
Largely owing to the history of European trade, British colonialism, and postcolonial globalization in South Asia, English and other European languages have left a profound and indelible impression on Urdu. As of 2022, it is impossible to watch an Urdu television serial or Bollywood film, listen to Urdu popular music, or have a conversation with Urdu speakers anywhere in South Asia or the diaspora without encountering European, and particularly English, vocabulary.
It just so happens that some of the most common Urdu words are borrowed from European languages. The words چابی [chābī]
English words are so ubiquitous and thoroughly assimilated in Urdu that it is difficult to know where to begin discussing them. To be sure, there are Indianisms of English usage, but it often feels when speaking with certain classes of Urdu speakers that nearly the whole of the English language is fair game and one can simply drop any English noun, adjective, or adverb into an Urdu sentence without disrupting the flow of the conversation. One shudders to think what it must be like for non-English-speaking learners of Urdu. They have to learn not only all of Urdu’s Arabic, Persian, Sanskritic, and Indic vocabulary, but its English vocabulary, too.
Somewhat ironically, English-knowing students of Urdu often find English words in the Urdu script the most difficult to read. An intermediate-level student of mine recently struggled to read the word انسائکلوپیڈیا [insāiklopīḍiyā]
The Urdu forms of English words may involve approximations of English consonants, changes to internal vowel and consonant patterns, or the addition of sounds to avoid things like consonant clusters. The following modules contain strategies for recognizing English words as such. The modules are also useful for learning how to spell Urdu words correctly even if you have not read them before. For example, knowing that English ts and ds are almost always rendered as unaspirated retroflex consonants and that Urdu often prefixes ا [i] to avoid combinations and clusters of s and sh in the initial syllable of words, you would be correct to assume that the English word “standard” is spelled and pronounced اسٹینڈرڈ [isṭainḍarḍ]
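The two consonant rules in the “standard” example can be sketched in Python as follows. This is a rough illustration that works on romanized forms, not on the Urdu script. The function name is hypothetical. Vowel changes are not modeled, so the output for “standard” keeps the English vowels and differs from the attested [isṭainḍarḍ]. The English digraph “th” is not handled.

```python
VOWELS = set("aeiou")


def apply_loanword_rules(word: str) -> str:
    """Apply two consonant rules to a romanized English word (vowels untouched)."""
    # Rule 1: English t and d become unaspirated retroflex ṭ and ḍ.
    out = word.lower().replace("t", "ṭ").replace("d", "ḍ")
    # Rule 2: prefix i- when the word begins with s or sh followed by a consonant.
    rest = out[2:] if out.startswith("sh") else out[1:]
    if out.startswith("s") and rest and rest[0] not in VOWELS:
        out = "i" + out
    return out


print(apply_loanword_rules("standard"))  # isṭanḍarḍ
print(apply_loanword_rules("table"))     # ṭable
```

The sketch keeps the two rules separate on purpose, so that each can be checked against real Urdu spellings on its own.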
Discussions of Urdu vocabulary have rarely involved discussions of Sanskritic vocabulary. As the divide between Hindi and Urdu has widened in the past two centuries, notions of exclusivity have risen to dominance. The result is that Sanskritic registers, especially high-register vocabulary, have become almost exclusively associated with Hindi, and Perso-Arabic registers likewise with Urdu.
But the fact remains that Sanskritic registers were central to the development of Urdu literature in its earliest centuries and remain part of everyday Urdu speech today. Commonplace Sanskritic words such as [darshan] (vision, visitation [esp. of a deity]) and phrases like [kāyā-palaṭ] (change, about-face) show the persistent relevance of Sanskrit and Sanskritic registers to Urdu.
Before we begin our analysis of Sanskritic registers in Urdu, it is important to keep in mind that just as the relationships among Arabic, Persian, and Indic languages are complex, so is the relationship between Sanskrit and so-called vernacular Indic languages like Urdu. We will see that much of the Indic vocabulary in Urdu can be related to Sanskrit. These relationships cover a fairly wide range, from direct loanwords to distant relatives with shared histories. It has become somewhat commonplace to speak of vernacular registers (a term which, itself, may imply a Sanskrit-centric, classicist perspective) as derived from Sanskrit vocabulary. In fact, the relationship between vernacular Indic words and Sanskrit ones is much more complicated and involves long histories of standardization, interaction, borrowing, circulation in the Prakrits, literary influence, and interactions among the vernaculars. Further, the conventional focus on the relationship between Indic vernacular vocabulary and Sanskrit has meant that words from other linguistic families, particularly Persian, that likewise stand in important historical relationships to Sanskritic words are often omitted from discussions of Urdu vocabulary. With that in mind, let us proceed.
Philologists conventionally distinguish two kinds of words when speaking of Sanskritic registers in vernacular Indic languages like Urdu. Tatsama (U: تتسم [tatsam]
The following three modules introduce students to two tatsama word building patterns and one pattern describing the relationship between tadbhava and tatsama words.
Further Reading:
Barker, M.A.R. A Course in Urdu. 2 vols. Montreal: McGill University, 1967.
Bruce, Gregory Maxwell. Urdu Vocabulary. Edinburgh: Edinburgh University Press, 2021.
Shackle, Christopher and Rupert Snell. Hindi and Urdu Since 1800: A Common Reader. New Delhi: Heritage Publishers, 1990.