Numbers to Indian (Hindi) Words Conversion In Unicode


Introduction


I recently posted an article on converting numbers to words, and then realized that the same logic can be used for other languages. Here I start with the Hindi (Indian) language, using Unicode so that the converted numbers appear in the native script. The benefit of Unicode is that you don't have to install a font, and the native script can be displayed in web pages as well.
The Hindi conversion relies on three of the arrays below: HundredHindiDigitArray, HigherDigitHindiNumberArray and SouthAsianCodeArray. Similar arrays can be developed for other languages, provided you understand how the numbering logic works for that language. Indonesian and Chinese are coming soon!

Private HundredHindiDigitArray() As String = _
    {"", "एक", "दो", "तीन", "चार", "पाँच", "छह", "सात", "आठ", "नौ", "दस", _
    "ग्यारह", "बारह", "तेरह", "चौदह", "पन्द्रह", "सोलह", "सत्रह", "अठारह", "उन्नीस", "बीस", _
    "इक्कीस", "बाईस", "तेईस", "चौबीस", "पच्चीस", "छब्बीस", "सत्ताईस", "अट्ठाईस", "उनतीस", "तीस", _
    "इकतीस", "बत्तीस", "तैंतीस", "चौंतीस", "पैंतीस", "छत्तीस", "सैंतीस", "अड़तीस", "उनतालीस", "चालीस", _
    "इकतालीस", "बयालीस", "तैंतालीस", "चौवालीस", "पैंतालीस", "छियालीस", "सैंतालीस", "अड़तालीस", "उनचास", "पचास", _
    "इक्यावन", "बावन", "तिरेपन", "चौवन", "पचपन", "छप्पन", "सत्तावन", "अट्ठावन", "उनसठ", "साठ", _
    "इकसठ", "बासठ", "तिरेसठ", "चौंसठ", "पैंसठ", "छियासठ", "सड़सठ", "अड़सठ", "उनहत्तर", "सत्तर", _
    "इकहत्तर", "बहत्तर", "तिहत्तर", "चौहत्तर", "पचहत्तर", "छिहत्तर", "सतहत्तर", "अठहत्तर", "उनासी", "अस्सी", _
    "इक्यासी", "बयासी", "तिरासी", "चौरासी", "पचासी", "छियासी", "सत्तासी", "अट्ठासी", "नवासी", "नब्बे", _
    "इक्यानबे", "बानबे", "तिरानबे", "चौरानबे", "पंचानबे", "छियानबे", "सत्तानबे", "अट्ठानबे", "निन्यानबे"}

  Private HigherDigitHindiNumberArray() As String = {"", "", "सौ", "हजार", "लाख", "करोड़", "अरब", "खरब", "नील"}
  Private HigherDigitSouthAsianStringArray() As String = {"", "", "Hundred", "Thousand", "Lakh", "Karod", _
                                                         "Arab", "Kharab", "Neel"}

  Private SouthAsianCodeArray() As String = {"1", "22", "3", "4", "42", "5", "52", "6", "62", "7", "72", _
                                             "8", "82", "9", "92"}
  Private EnglishCodeArray() As String = {"1", "22", "3"}

  Private SingleDigitStringArray() As String = {"", "One", "Two", "Three", "Four", "Five", "Six", "Seven", _
                                                "Eight", "Nine", "Ten"}
  Private DoubleDigitsStringArray() As String = {"", "Ten", "Twenty", "Thirty", "Forty", "Fifty", "Sixty", _
                                                 "Seventy", "Eighty", "Ninety"}
  Private TenthDigitStringArray() As String = {"Ten", "Eleven", "Twelve", "Thirteen", "Fourteen", _
                                              "Fifteen", "Sixteen", "Seventeen", "Eighteen", "Nineteen"} 

Background


The Hindi numbering system works like the other South Asian numbering systems. The last three digits from the right are read one way (a hundreds digit followed by a two-digit number). The higher-order digits are then read in pairs: each pair is read like a two-digit number, followed by the word for its order (thousand, lakh, and so on) as a suffix.

Example:
12,12,112 = Twelve lakh twelve thousand one hundred twelve
12,00,000 = Twelve lakh
   12,000 = twelve thousand
      112 = one hundred twelve

12,12,112 = बारह लाख बारह हजार एक सौ बारह
12,00,000 = बारह लाख
   12,000 = बारह हजार
      112 = एक सौ बारह
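
The grouping rule in the example above can be sketched as a small helper. This is only an illustration under my own assumptions: SplitIndianGroups and the module name are hypothetical and are not part of the downloadable code, which works digit by digit instead.

```vb
Imports System
Imports System.Collections.Generic

Module IndianGroupingSketch
    ' Hypothetical helper: splits a string of digits into Indian-style groups,
    ' i.e. the last three digits first, then two digits at a time.
    Function SplitIndianGroups(ByVal digits As String) As List(Of String)
        Dim groups As New List(Of String)

        ' The last three digits (hundreds, tens, units) form one group.
        Dim cut As Integer = Math.Max(digits.Length - 3, 0)
        groups.Insert(0, digits.Substring(cut))

        ' Everything to the left is taken two digits at a time.
        While cut > 0
            Dim start As Integer = Math.Max(cut - 2, 0)
            groups.Insert(0, digits.Substring(start, cut - start))
            cut = start
        End While

        Return groups
    End Function

    Sub Main()
        ' Prints 12,12,112
        Console.WriteLine(String.Join(",", SplitIndianGroups("1212112").ToArray()))
    End Sub
End Module
```

Each two-digit group can then be looked up directly in a 0 to 99 word table, which is why HundredHindiDigitArray lists every number up to ninety-nine.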

Code Flow

The entire process is basically array and string manipulation. The primary goal is to find the correct index for each digit and its position, and then pull the corresponding word out of the arrays shown above.

Below is the main function that converts a given number to Hindi words. Zero is an exceptional case, so we have to be careful at every step when working with the digit zero. The very first thing we do is convert the given number to a string and then to an array by calling NumberToArray, for example, 1234 to "1234" and then to {1, 2, 3, 4}.

Now the fun begins. We first find out which place each digit falls in (units, tens, hundreds, and so on) by using SouthAsianCodeArray. The logic behind this array is very simple and is explained later in the article. Once we know the place of the digit, we split the work into three cases: the units place, a tens place, and every other place. When working with these digits, we take advantage of both a backward index (the i variable, which counts the position from the right) and a forward index (the j variable, which indexes the digit array from the left).
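
NumberToArray itself is not listed in this article. A minimal sketch of what such a helper could look like is given here, assuming the input string contains only the digits 0 to 9; the body is my guess at an equivalent and may differ from the version in the download.

```vb
Imports System

Module NumberToArraySketch
    ' Hypothetical stand-in for NumberToArray: turns "1234" into {1, 2, 3, 4}.
    ' Assumes every character of the input is a digit.
    Function NumberToArray(ByVal number As String) As Integer()
        Dim digits(number.Length - 1) As Integer

        For index As Integer = 0 To number.Length - 1
            ' Char.GetNumericValue returns a Double, hence the CInt.
            digits(index) = CInt(Char.GetNumericValue(number(index)))
        Next

        Return digits
    End Function
End Module
```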

Private Function HindiStyle() As String
    Dim amountString As String = Amount.ToString

    If Amount = 0 Then Return "शून्य" 'Unique and exceptional case
    If amountString.Length > 15 Then Return "That's too long..."

    Dim amountArray() As Integer = NumberToArray(amountString)

    Dim j As Integer = 0
    Dim digit As Integer = 0
    Dim result As String = ""
    Dim separator As String = ""
    Dim higherDigitHindiString As String = ""
    Dim codeIndex As String = ""


    For i As Integer = amountArray.Length To 1 Step -1
      j = amountArray.Length - i
      digit = amountArray(j)

      codeIndex = SouthAsianCodeArray(i - 1)
      higherDigitHindiString = HigherDigitHindiNumberArray(CInt(codeIndex.Substring(0, 1)) - 1)


      If codeIndex = "1" Then 'Units place: a single digit [0, 9]; 0 adds an empty word
        result = result & separator & HundredHindiDigitArray(digit)

      ElseIf codeIndex.Length = 2 AndAlso digit <> 0 Then 'Tens place of some order; skip if digit is 0
        'Read this digit together with the next one as a two-digit number
        Dim suffixDigit As Integer = amountArray(j + 1)
        Dim wholeTenthPlaceDigit As Integer = digit * 10 + suffixDigit

        result = result & separator & HundredHindiDigitArray(wholeTenthPlaceDigit) & " " & _
                                       higherDigitHindiString
        i -= 1 'The next digit has already been consumed, so skip it

      ElseIf digit <> 0 Then  'Single digit of a higher order (hundred, thousand, lakh, ...); skip if digit is 0
        result = result & separator & HundredHindiDigitArray(digit) & " " & higherDigitHindiString
      End If

      separator = " "
    Next

    Return RemoveSpaces(result)
End Function 

Remove extra spaces:

During the process an extra space or two can end up between the words, so for the cleanup I use a regular expression in the RemoveSpaces function:
Private Function RemoveSpaces(ByVal word As String) As String
    Dim regEx As New System.Text.RegularExpressions.Regex("  ")
    Return regEx.Replace(word, " ").Trim
End Function 

Number formatting (or grouping):

There is another public function, FormatNumber, which simply calls the private FormatNumberPerLanguage function in the Converter class. FormatNumberPerLanguage groups the digits according to the supplied culture name, which is "hi-IN" in this case. It is a simple use of the CultureInfo class.
Private Function FormatNumberPerLanguage(ByVal cultureInfoName As String) As String
    Dim ci As New System.Globalization.CultureInfo(cultureInfoName)
    ci.NumberFormat.NumberDecimalDigits = 0
    Return Me.Amount.ToString("N", ci)
End Function
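
The grouping itself is controlled by the NumberGroupSizes property of NumberFormatInfo. The sketch below sets the sizes by hand on a copy of the invariant culture's format, purely to show the mechanism without depending on the locale data installed on the machine; the module name and the sample amount are my own, and the Converter class simply relies on whatever the "hi-IN" culture provides.

```vb
Imports System
Imports System.Globalization

Module GroupingSketch
    Sub Main()
        ' Work on a writable copy of the invariant number format.
        Dim nfi As NumberFormatInfo = _
            CType(CultureInfo.InvariantCulture.NumberFormat.Clone(), NumberFormatInfo)

        ' First group (from the right) has 3 digits; every later group has 2.
        nfi.NumberGroupSizes = New Integer() {3, 2}
        nfi.NumberDecimalDigits = 0

        Dim amount As Long = 1212112
        ' Prints 12,12,112
        Console.WriteLine(amount.ToString("N", nfi))
    End Sub
End Module
```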

Points of Interest  

The arrays that help to find the place where each digit falls are quite important and interesting. For example, in the number 1234567 (12,34,567), counting from the right, 1 is at the 7th position. The 7th item of SouthAsianCodeArray is "52", which tells us two things:

a) the code has two characters, so the digit is in the tens position (of some order) and is read together with the digit that follows it
b) the order is lakh, because the first character of "52" is 5 and the item at index 4 (5 - 1 = 4) in HigherDigitSouthAsianStringArray or HigherDigitHindiNumberArray is Lakh or लाख

This is how I determine the higher-order word that follows each group!
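
The lookup can be tried in isolation with a small sketch. The two arrays are copied from the listing at the top of the article; DescribePlace and the module name are hypothetical and exist only for this illustration. Positions are counted from the right starting at 1, the same way the i variable is used in HindiStyle.

```vb
Imports System

Module PlaceLookupSketch
    Private SouthAsianCodeArray() As String = {"1", "22", "3", "4", "42", "5", "52", "6", "62", _
                                               "7", "72", "8", "82", "9", "92"}
    Private HigherDigitSouthAsianStringArray() As String = {"", "", "Hundred", "Thousand", "Lakh", _
                                                            "Karod", "Arab", "Kharab", "Neel"}

    ' Hypothetical helper: describes the place of the digit at the given
    ' 1-based position counted from the right.
    Function DescribePlace(ByVal positionFromRight As Integer) As String
        Dim code As String = SouthAsianCodeArray(positionFromRight - 1)

        ' The first character of the code selects the order word.
        Dim order As String = HigherDigitSouthAsianStringArray(CInt(code.Substring(0, 1)) - 1)

        ' A two-character code marks the tens digit of that order.
        Dim kind As String = If(code.Length = 2, "tens", "units")
        Return (kind & " " & order).Trim()
    End Function

    Sub Main()
        Console.WriteLine(DescribePlace(7)) ' Prints: tens Lakh
        Console.WriteLine(DescribePlace(4)) ' Prints: units Thousand
        Console.WriteLine(DescribePlace(2)) ' Prints: tens
    End Sub
End Module
```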


Download NumbersToIndianWords.zip - 14.58 KB  
