What should the strategy be for training the Malayalam alphabet?

I would like to start with an example of two vowels from the Malayalam alphabet. Figs. 1 and 2 show the two vowels “oh” and “ohoh”.
Fig. 1: Vowel “oh”
Fig. 2: Vowel “ohoh”

Training the vowel “oh” is very easy. Training the vowel “ohoh” is not: in the character image, this vowel is vertically separated by one or more lines of white pixels, which leads to poor recognition accuracy. To avoid this problem, the idea is to train “ohoh” as a combination of two different symbols. The symbols can be anything, such as letters of the English alphabet. For convenience, let's assume that the first part is trained as “a” and the second part as “b”. Fig. 3 shows this mapping.
Fig. 3: Mapping for training vowel “ohoh”

Now the question is: how do we represent “ohoh” when we show the output? The solution is quite simple: use a post-processor. The post-processor has multiple steps; in the first stage, it converts the symbols back to Malayalam characters.
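
The first post-processing stage can be sketched as a simple string substitution. This is a minimal illustration in Python, not the actual implementation: the function name `postprocess`, the table `SYMBOL_TO_MALAYALAM`, and the target code point U+0D13 are assumptions made for the example. Only the “a” + “b” placeholder pair comes from the mapping described above.

```python
# Hypothetical lookup table: placeholder symbols produced by the recognizer
# mapped to the Malayalam character they stand for.
# Assumption: "ohoh" corresponds to U+0D13 (MALAYALAM LETTER OO); replace this
# with whichever character your training data actually represents.
SYMBOL_TO_MALAYALAM = {
    "ab": "\u0d13",
}


def postprocess(raw_text, mapping=SYMBOL_TO_MALAYALAM):
    """Replace each placeholder sequence in raw_text with its Malayalam character."""
    # Substitute longer sequences first so that a short key never
    # consumes part of a longer one.
    for symbols in sorted(mapping, key=len, reverse=True):
        raw_text = raw_text.replace(symbols, mapping[symbols])
    return raw_text
```

Note that a plain substitution like this assumes the placeholder symbols never occur as genuine characters in the recognized text; if they can, a placeholder that cannot appear in real output would be a safer choice.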