Sanskrit Studies - Publications

Designing a constraint based parser for Sanskrit

(2010-12-01) Kulkarni, Amba; Pokar, Sheetal; Shukl, Devanand

Verbal understanding (śābdabodha) of any utterance requires the knowledge of how words in that utterance are related to each other. Such knowledge is usually available in the form of cognition of grammatical relations. Generative grammars describe how a language codes these relations. Thus the knowledge of what information various grammatical relations convey is available from the generation point of view and not the analysis point of view. In order to develop a parser based on any grammar, one should then know precisely the semantic content of the grammatical relations expressed in a language string, the clues for extracting these relations, and finally whether these relations are expressed explicitly or implicitly. Based on the design principles that emerge from this knowledge, we model the parser as finding a directed tree, given a graph with nodes representing the words and edges representing the possible relations between them. Further, we also use the Mīmāṃsā constraints of ākāṅkṣā (expectancy) to rule out non-solutions and sannidhi (proximity) to prioritize the solutions. We have implemented a parser based on these principles, and its performance was found to be satisfactory, giving us the confidence to extend its functionality to handle complex sentences. © 2010 Springer-Verlag Berlin Heidelberg.
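
The tree-finding formulation in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function names `best_parses` and `_reaches_root`, the edge format, and the relation labels are all hypothetical. Expectancy is approximated here by requiring every non-root word to attach to some head, and proximity by the total distance between linked word positions.

```python
from itertools import product

def _reaches_root(node, choice):
    """Follow head links from `node`; return False if a cycle is found."""
    seen = set()
    while choice[node] is not None:
        if node in seen:
            return False
        seen.add(node)
        node = choice[node][0]
    return True

def best_parses(n_words, candidates):
    """Enumerate directed trees over word positions 0..n_words-1.

    candidates: (head, dependent, label) triples (hypothetical input format).
    Returns (cost, edges) pairs sorted by total head-dependent distance.
    """
    # For each word, list its possible incoming edges; None marks "root".
    incoming = {d: [None] for d in range(n_words)}
    for head, dep, label in candidates:
        incoming[dep].append((head, label))

    parses = []
    for choice in product(*(incoming[d] for d in range(n_words))):
        # Assumed stand-in for expectancy: exactly one word is left unattached.
        if sum(c is None for c in choice) != 1:
            continue
        # A tree has no cycles: every word must reach the root.
        if not all(_reaches_root(d, choice) for d in range(n_words)):
            continue
        edges = [(c[0], d, c[1]) for d, c in enumerate(choice) if c is not None]
        # Assumed stand-in for proximity: shorter links are preferred.
        cost = sum(abs(head - dep) for head, dep, _ in edges)
        parses.append((cost, edges))
    parses.sort(key=lambda p: p[0])
    return parses

# Three words; word 0 has two competing heads (labels are placeholders).
toy_edges = [(2, 0, "rel_a"), (2, 1, "rel_b"), (1, 0, "rel_c")]
toy_parses = best_parses(3, toy_edges)
```

Exhaustive enumeration is used only to keep the sketch short; it grows exponentially with sentence length and says nothing about how the published parser searches the graph.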

A Deterministic Dependency Parser with Dynamic Programming for Sanskrit

(2013-01-01) Kulkarni, Amba

We describe a deterministic dependency parser for Sanskrit. The parse is developed following a depth-first traversal of a graph whose nodes represent morphological analyses of the words in a sentence. During the traversal, relations at each node are checked for local compatibility, and finally, for each full path, the relations on the path are checked for global compatibility. Stacking of intermediate results guarantees dynamic programming. We also describe an interface that displays multiple parses compactly and lets users select the desired parse from among the possible solutions with a maximum of n − 1 choices for a sentence with n words.
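
The traversal described in this abstract can be sketched roughly as follows. This is an assumption-laden illustration, not the published parser: `dfs_parses`, the two callback parameters, and the string tags standing in for morphological analyses are all hypothetical, and the explicit stack only mimics the idea of keeping intermediate results.

```python
def dfs_parses(analyses, locally_ok, globally_ok):
    """Depth-first search over one candidate analysis per word.

    analyses:    list (one entry per word) of lists of candidate analyses.
    locally_ok:  hypothetical callback (partial_path, candidate) -> bool.
    globally_ok: hypothetical callback (full_path) -> bool.
    """
    results = []
    stack = [()]  # each entry is a partial path of analyses chosen so far
    while stack:
        path = stack.pop()
        if len(path) == len(analyses):
            # Full path: apply the global compatibility check.
            if globally_ok(path):
                results.append(path)
            continue
        # Push in reverse so candidates are explored in their listed order.
        for cand in reversed(analyses[len(path)]):
            # Prune early when the candidate clashes with the partial path.
            if locally_ok(path, cand):
                stack.append(path + (cand,))
    return results

# Toy checks (assumptions): no nominal tag may repeat, and exactly one verb.
def no_repeat(path, cand):
    return cand == "verb" or cand not in path

def one_verb(path):
    return path.count("verb") == 1

toy_result = dfs_parses([["nom", "acc"], ["acc"], ["verb"]], no_repeat, one_verb)
```

In this toy input the first word is ambiguous between two tags, and the local check discards the reading that would repeat a tag before the full path is ever built.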

A distributed platform for Sanskrit processing

(2012-12-01) Goyal, Pawan; Huet, Gérard; Kulkarni, Amba; Scharf, Peter; Bunker, Ralph

Sanskrit, the classical language of India, presents specific challenges for computational linguistics: exact phonetic transcription in writing that obscures word boundaries, rich morphology and an enormous corpus, among others. Recent international cooperation has developed innovative solutions to these problems and significant resources for linguistic research. Solutions include efficient segmenting and tagging algorithms and dependency parsers based on constraint programming. The integration of lexical resources, text archives and linguistic software is achieved by distributed interoperable Web services. Resources include a morphological tagger and tagged corpus. © 2012 The COLING.

Semantic processing of compounds in Indian languages

(2012-12-01) Kulkarni, Amba; Paul, Soma; Kulkarni, Malhar; Kumar, Anil; Surtani, Nitesh

Compounds occur very frequently in Indian languages. There are no strict orthographic conventions for compounds in modern Indian languages. In this paper, the Sanskrit compounding system is examined thoroughly, and the insight gained from Sanskrit grammar is applied to the analysis of compounds in Hindi and Marathi. It is interesting to note that compounding in Hindi deviates from that in Sanskrit in two respects. First, the data analysed for Hindi does not contain any instance of a Bahuvrihi (exocentric) compound. Second, the Hindi data presents many cases where compounds require a verb as well as a vibhakti (a case marker) for their paraphrasing. Compounds requiring a verb for paraphrasing are termed madhyama-pada-lopi in Sanskrit, and they are found to be rare in Sanskrit. © 2012 The COLING.

Constituency parsing of complex noun sequences in Hindi

(2014-01-01) Batra, Arpita; Paul, Soma; Kulkarni, Amba

A complex noun sequence is one in which a head noun is recursively modified by one or more bare nouns and/or genitives. Constituency analysis of a complex noun sequence is a prerequisite for finding the dependency relation (semantic relation) between components of the sequence. Identification of dependency relations is useful for various applications such as question answering, information extraction, textual entailment, and paraphrasing. In Hindi, syntactic agreement rules can handle to a large extent the parsing of recursive genitives (Sharma, 2012) [12]. This paper implements frequency-based, corpus-driven approaches for parsing recursive genitive structures that syntactic rules cannot handle, as well as recursive compound nouns and combinations of genitive and compound noun sequences. Using syntactic rules and dependency global algorithm, an accuracy of 92.85% is obtained. © 2014 Springer-Verlag Berlin Heidelberg.
