Applied Linguistics and Translation Studies - Publications
Browsing Applied Linguistics and Translation Studies - Publications by Author "Krishnamurthy, Parameswari"
-
Development of Telugu-Tamil transfer-based machine translation system: An improvization using divergence index
Krishnamurthy, Parameswari (2019-07-01)
Building an automatic, high-quality, robust machine translation (MT) system is a fascinating yet arduous task, as one of the major difficulties lies in cross-linguistic differences, or divergences, between languages at various levels. The existence of translation divergence precludes straightforward mapping in the MT system. An increase in the number of divergences also increases the complexity, especially in linguistically motivated transfer-based MT systems. This paper discusses the development of a Telugu-Tamil transfer-based MT system and how a divergence index (DI) is built to quantify the number of parametric variations between languages in order to improve the success rate of MT. The DI helps indicate where effort should be focused for a given language pair to attain better and faster results. In addition, strategies for handling different types of divergence in a transfer-based approach to MT are discussed. The paper also covers the evaluation method and how applying the DI improves the MT system.
-
Parameswari_faith_nagaraju@Dravidian-CodeMixFIRE: A machine-learning approach using n-grams in sentiment analysis for code-mixed texts: A case study in Tamil and Malayalam
Krishnamurthy, Parameswari; Varghese, Faith; Vuppala, Nagaraju (2020-01-01)
Sentiment analysis is a fast-growing research area that aims to uncover the underlying meaning of a text by categorizing it into different levels. This paper attempts to decode the deeply entangled code-mixed Malayalam and Tamil datasets and classify their underlying meaning into five levels. Along with the corpus creation, [1] propose a five-level classification for Malayalam and Tamil code-mixed datasets. In this paper, we follow the five-level annotated datasets and address the classification problem by using unigram and bigram features with a Multinomial Naive Bayes model. Our model achieves an F1-score of 0.55 for Tamil and 0.48 for Malayalam.
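The abstract above names the technique (unigram and bigram features fed to a Multinomial Naive Bayes classifier) but not its implementation. The following is a minimal sketch of that general technique in plain Python, not the authors' code. The whitespace tokenizer, the add-one smoothing value, the two labels, and the four toy training strings are all assumptions made up for illustration; the actual task uses five labels and the shared-task datasets.

```python
import math
from collections import Counter, defaultdict


def ngram_features(text, n_values=(1, 2)):
    """Return all unigrams, then all bigrams, of a whitespace-tokenized text."""
    tokens = text.lower().split()
    feats = []
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NgramMultinomialNB:
    """Multinomial Naive Bayes over n-gram counts with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value, not from the paper)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngram_features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        # Total n-gram count per class, used as the likelihood denominator.
        self.totals = {
            c: sum(self.feature_counts[c].values()) for c in self.class_counts
        }
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for c in sorted(self.class_counts):
            # Log prior plus summed log likelihoods of the observed n-grams.
            score = math.log(self.class_counts[c] / n_docs)
            for feat in ngram_features(text):
                if feat not in self.vocab:
                    continue  # ignore n-grams never seen in training
                count = self.feature_counts[c][feat]
                score += math.log(
                    (count + self.alpha)
                    / (self.totals[c] + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Hypothetical toy data, invented for this sketch only.
train_texts = [
    "padam super good",
    "semma movie good",
    "padam mokka bad",
    "waste movie bad",
]
train_labels = ["Positive", "Positive", "Negative", "Negative"]

model = NgramMultinomialNB().fit(train_texts, train_labels)
print(model.predict("super good movie"))
```

Working in log space avoids numeric underflow when many n-gram probabilities are multiplied, and the smoothing term keeps an n-gram that is unseen for one class from forcing that class's probability to zero.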