Mathematical and computational aspects of lexicalized grammars

1990 1990

Other formats: Order a copy

Abstract (summary)

Most current linguistic theories give lexical accounts of several phenomena that used to be considered purely syntactic. The information put in the lexicon is thereby increased both in amount and complexity. We explore the view that syntactic rules are not separated from lexical items. In this approach, each elementary structure is associated with a lexical item called the anchor. These structures specify extended domains of locality (as compared to context-free grammars) over which constraints can be stated. The 'grammar' consists of a lexicon where each lexical item is associated with a finite number of structures for which that item is the anchor. There are 'rules' which tell us how these structures are composed. A grammar of this form will be said to be lexicalized.

The process of lexicalization of context-free grammars (CFGs) constrained by linguistic requirements forces us to use operations for combining structures that make the formalism fall in the class of mildly context sensitive languages. We show that substitution, the combining operation corresponding to CFGs, does not allow one to lexicalize CFGs but the combination of substitution and adjunction does. We show how tree-adjoining grammar (TAG) is derived from the lexicalization process of CFGs. Then we show that TAGs are closed under lexicalization and we illustrate the main structures found in a lexicalized TAG for English. The properties of TAGs permit us to encapsulate diverse syntactic phenomena in a very natural way. TAG's extended domain of locality and its factoring of recursion from local dependencies enable us to localize many syntactic dependencies (such as filler-gap) as well as semantic dependencies (such as predicate-arguments).

We investigate the processing of lexicalized TAGs. We first present two general practical parsers that follow Earley-style parsing. They are practical parsers for TAGs because, as for CFGs, the average behavior of Earley-type parsers is superior to its worst case complexity. They are both left to right bottom-up parsers that use top-down predictions but they differ in the way the top down prediction is used.

Then we explain the building of a set of deterministic bottom-up left to right parsers which analyze a subset of tree-adjoining languages. The LR parsing strategy for CFGs is extended to TAG by using a machine, called Bottom-up Embedded Push Down Automaton (BEPDA), that recognizes in a bottom-up fashion the set of tree-adjoining languages (and exactly this set).

Finally we show how lexicalized grammars suggest a natural two-step parsing strategy. We consider lexicalized TAGs as an instance of lexicalized grammar and we examine the effect of the two-step parsing strategy on main types of parsing algorithms.

Indexing (details)

Computer science;
0984: Computer science
0290: Linguistics
Identifier / keyword
Applied sciences; Language, literature and linguistics; mathematical aspects
Mathematical and computational aspects of lexicalized grammars
Schabes, Yves
Number of pages
Publication year
Degree date
School code
DAI-B 51/08, Dissertation Abstracts International
Place of publication
Ann Arbor
Country of publication
United States
Joshi, Aravind K.
University of Pennsylvania
University location
United States -- Pennsylvania
Source type
Dissertations & Theses
Document type
Dissertation/thesis number
ProQuest document ID
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Document URL
Access the complete full text

You can get the full text of this document if it is part of your institution's ProQuest subscription.

Try one of the following:

  • Connect to ProQuest through your library network and search for the document from there.
  • Request the document from your library.
  • Go to the ProQuest login page and enter a ProQuest or My Research username / password.