Numbers to Indian (Hindi) Words Conversion In Unicode





Introduction


I recently posted an article on converting numbers to words, and then realized that the same logic can be applied to other languages. So here I am starting with Hindi (Indian), using Unicode to produce the converted numbers in the native script. The benefit of Unicode is that you don't have to install a font to display the native language, and it works in web pages as well.
The entire Hindi conversion relies on three of the arrays below: HundredHindiDigitArray, HigherDigitHindiNumberArray and SouthAsianCodeArray. Similar arrays can be developed for other languages, provided you understand how the numbering logic works for that language. Indonesian and Chinese are coming soon!

Private HundredHindiDigitArray() = _
    {"", "एक", "दो", "तीन", "चार", "पाँच", "छह", "सात", "आठ", "नौ", "दस", _
    "ग्यारह", "बारह", "तेरह", "चौदह", "पन्द्रह", "सोलह", "सत्रह", "अठारह", "उन्नीस", "बीस", _
    "इक्कीस", "बाईस", "तेईस", "चौबीस", "पच्चीस", "छब्बीस", "सत्ताईस", "अट्ठाईस", "उनतीस", "तीस", _
    "इकतीस", "बत्तीस", "तैंतीस", "चौंतीस", "पैंतीस", "छत्तीस", "सैंतीस", "अड़तीस", "उनतालीस", "चालीस", _
    "इकतालीस", "बयालीस", "तैंतालीस", "चौवालीस", "पैंतालीस", "छियालीस", "सैंतालीस", "अड़तालीस", "उनचास", "पचास", _
    "इक्यावन", "बावन", "तिरेपन", "चौवन", "पचपन", "छप्पन", "सत्तावन", "अट्ठावन", "उनसठ", "साठ", _
    "इकसठ", "बासठ", "तिरेसठ", "चौंसठ", "पैंसठ", "छियासठ", "सड़सठ", "अड़सठ", "उनहत्तर", "सत्तर", _
    "इकहत्तर", "बहत्तर", "तिहत्तर", "चौहत्तर", "पचहत्तर", "छिहत्तर", "सतहत्तर", "अठहत्तर", "उनासी", "अस्सी", _
    "इक्यासी", "बयासी", "तिरासी", "चौरासी", "पचासी", "छियासी", "सत्तासी", "अट्ठासी", "नवासी", "नब्बे", _
    "इक्यानबे", "बानबे", "तिरानबे", "चौरानबे", "पंचानबे", "छियानबे", "सत्तानबे", "अट्ठानबे", "निन्यानबे"}

  Private HigherDigitHindiNumberArray() = {"", "", "सौ", "हजार", "लाख", "करोड़", "अरब", "खरब", "नील"}
  Private HigherDigitSouthAsianStringArray() As String = {"", "", "Hundred", "Thousand", "Lakh", "Karod", _
                                                         "Arab", "Kharab", "Neel"}

  Private SouthAsianCodeArray() As String = {"1", "22", "3", "4", "42", "5", "52", "6", "62", "7", "72", _
                                             "8", "82", "9", "92"}
  Private EnglishCodeArray() As String = {"1", "22", "3"}

  Private SingleDigitStringArray() As String = {"", "One", "Two", "Three", "Four", "Five", "Six", "Seven", _
                                                "Eight", "Nine", "Ten"}
  Private DoubleDigitsStringArray() As String = {"", "Ten", "Twenty", "Thirty", "Forty", "Fifty", "Sixty", _
                                                 "Seventy", "Eighty", "Ninety"}
  Private TenthDigitStringArray() As String = {"Ten", "Eleven", "Twelve", "Thirteen", "Fourteen", _
                                              "Fifteen", "Sixteen", "Seventeen", "Eighteen", "Nineteen"} 

Background


The Hindi numbering system is very similar to other South Asian numbering systems. The last three digits from the right are read one way (units, tens, hundreds). After that, the higher-order digits are read in pairs, the same way as the tens place, followed by the word for that order (thousand, lakh, and so on).

Example:
12,12,112 = Twelve lakh twelve thousand one hundred twelve
12,00,000 = Twelve lakh
   12,000 = twelve thousand
      112 = one hundred twelve

12,12,112 = बारह लाख बारह हजार एक सौ बारह
12,00,000 = बारह लाख
   12,000 = बारह हजार
      112 = एक सौ बारह

Code Flow

The entire process is basically array and string manipulation. The primary goal is to find the correct index for each digit and its position, and then pull the corresponding word out of the arrays shown above.

Below is the main function that converts a given number to Hindi words. Zero is an exceptional case, so we have to be careful at every step when working with the digit zero. The very first thing we do is convert the given number to a string and then to an array by calling NumberToArray; for example, 1234 becomes "1234" and then {1, 2, 3, 4}.
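
NumberToArray itself is not listed in this article. The following is a minimal sketch of what such a helper might look like, assuming the input string contains only the characters 0 to 9; the version in the download may differ.

```vbnet
Imports System

Module NumberToArraySketch
    ' Hypothetical stand-in for the NumberToArray helper used by HindiStyle.
    ' Assumes the string contains only the characters 0-9.
    Public Function NumberToArray(ByVal number As String) As Integer()
        Dim digits(number.Length - 1) As Integer
        For index As Integer = 0 To number.Length - 1
            ' Convert each character to its numeric value.
            digits(index) = Integer.Parse(number.Substring(index, 1))
        Next
        Return digits
    End Function

    Sub Main()
        ' Prints: 1 2 3 4
        For Each digit As Integer In NumberToArray("1234")
            Console.Write(digit.ToString() & " ")
        Next
        Console.WriteLine()
    End Sub
End Module
```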

Now the fun begins. We first find out which place the given digit falls in (units, tens, hundreds, and so on) by using SouthAsianCodeArray. The logic behind this array is simple and is explained later in the article. Once we know the place of the digit, we split the work into three cases: the digit is in the units place, in a tens place, or in any other place. When working with these digits, we use both a backward index (the i variable, the position counted from the right) and a forward index (the j variable, the position in the array).

Private Function HindiStyle() As String
    Dim amountString As String = Amount.ToString

    If Amount = 0 Then Return "शून्य" 'Unique and exceptional case
    If amountString.Length > 15 Then Return "That's too long..."

    Dim amountArray() As Integer = NumberToArray(amountString)

    Dim j As Integer = 0
    Dim digit As Integer = 0
    Dim result As String = ""
    Dim separator As String = ""
    Dim higherDigitHindiString As String = ""
    Dim codeIndex As String = ""


    For i As Integer = amountArray.Length To 1 Step -1
      j = amountArray.Length - i
      digit = amountArray(j)

      codeIndex = SouthAsianCodeArray(i - 1)
      higherDigitHindiString = HigherDigitHindiNumberArray(CInt(codeIndex.Substring(0, 1)) - 1)


      If codeIndex = "1" Then 'Number [1, 9]
        result = result & separator & HundredHindiDigitArray(digit)

      ElseIf codeIndex.Length = 2 AndAlso digit <> 0 Then 'Number in tens place; skip if digit is 0
        Dim suffixDigit As Integer = amountArray(j + 1)
        Dim wholeTenthPlaceDigit As Integer = digit * 10 + suffixDigit

        result = result & separator & HundredHindiDigitArray(wholeTenthPlaceDigit) & " " & _
                                       higherDigitHindiString
        i -= 1

      ElseIf digit <> 0 Then  'Standard Number like 100, 1000, 1000000 and skip if digit is 0
        result = result & separator & HundredHindiDigitArray(digit) & " " & higherDigitHindiString
      End If

      separator = " "
    Next

    Return RemoveSpaces(result)
End Function 

Remove extra spaces:

During the process, an extra space or two can get attached between the words, so for cleanup I call the RemoveSpaces function, which uses a regular expression to collapse any run of spaces into a single space:
Private Function RemoveSpaces(ByVal word As String) As String
    ' Collapse any run of two or more spaces into a single space
    Dim regEx As New System.Text.RegularExpressions.Regex(" {2,}")
    Return regEx.Replace(word, " ").Trim
End Function 

Number formatting (or grouping):

There is another public function, FormatNumber, which simply calls the private FormatNumberPerLanguage in the Converter class. FormatNumberPerLanguage formats the digit groups based on the provided culture name, which is "hi-IN" in this case. It is a simple use of the CultureInfo class.
Private Function FormatNumberPerLanguage(ByVal culterInfoName As String) As String
    Dim ci As New System.Globalization.CultureInfo(culterInfoName)
    ci.NumberFormat.NumberDecimalDigits = 0 'No decimal places
    Return Me.Amount.ToString("N", ci)
End Function
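
To see the kind of grouping this is meant to produce without depending on the culture data installed on a machine, here is a small sketch that sets the group sizes explicitly. It assumes that the "hi-IN" culture groups digits as three, then two; FormatIndianGrouping is a hypothetical name and is not part of the Converter class.

```vbnet
Imports System
Imports System.Globalization

Module IndianFormatSketch
    ' Hypothetical helper: formats a number with Indian-style digit grouping
    ' by setting the group sizes by hand instead of loading "hi-IN".
    Public Function FormatIndianGrouping(ByVal amount As Long) As String
        Dim numberFormat As NumberFormatInfo = _
            CType(CultureInfo.InvariantCulture.NumberFormat.Clone(), NumberFormatInfo)
        ' First group (from the right) has 3 digits, all later groups have 2.
        numberFormat.NumberGroupSizes = New Integer() {3, 2}
        numberFormat.NumberDecimalDigits = 0
        Return amount.ToString("N", numberFormat)
    End Function

    Sub Main()
        ' Prints 12,12,112
        Console.WriteLine(FormatIndianGrouping(1212112))
    End Sub
End Module
```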

Points of Interest  

The arrays that help find the place a digit falls in are quite important and interesting. For example, in the number 1234567, counting from the right, 1 is at the 7th position. The 7th item in SouthAsianCodeArray is "52", which tells us two things:

a) the digit is in a tens position (of some order)
b) the higher order is lakh, because the first character of "52" is 5, and the item at index 4 (5 - 1 = 4) in HigherDigitSouthAsianStringArray or HigherDigitHindiNumberArray is Lakh or लाख

This is how I determine the higher-order word that follows each group!
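
The lookup described above can be tried out in isolation. The sketch below copies the two arrays from the article; DescribePlace is a hypothetical helper written only for this illustration, and the "units"/"tens" labels are my own wording.

```vbnet
Imports System

Module PlaceCodeSketch
    ' Arrays copied from the article.
    Private ReadOnly SouthAsianCodeArray() As String = _
        {"1", "22", "3", "4", "42", "5", "52", "6", "62", "7", "72", "8", "82", "9", "92"}
    Private ReadOnly HigherDigitSouthAsianStringArray() As String = _
        {"", "", "Hundred", "Thousand", "Lakh", "Karod", "Arab", "Kharab", "Neel"}

    ' Hypothetical helper: describes the place of the digit at the given
    ' 1-based position counted from the right.
    Public Function DescribePlace(ByVal positionFromRight As Integer) As String
        Dim code As String = SouthAsianCodeArray(positionFromRight - 1)
        ' The first character of the code selects the higher-order word.
        Dim word As String = HigherDigitSouthAsianStringArray(CInt(code.Substring(0, 1)) - 1)
        ' A two-character code marks a tens place.
        Dim place As String = If(code.Length = 2, "tens", "units")
        If word = "" Then Return place
        Return place & " of " & word
    End Function

    Sub Main()
        For position As Integer = 1 To 7
            Console.WriteLine(position.ToString() & ": " & DescribePlace(position))
        Next
    End Sub
End Module
```

For position 7 this prints "tens of Lakh", and for position 6 it prints "units of Lakh".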


Download NumbersToIndianWords.zip - 14.58 KB  
