|
pdf
|
@phdthesis{Prost2008,
author = {Jean-Philippe Prost},
year = {2008},
title = {Modelling Syntactic Gradience with Loose Constraint-based Parsing},
type = {Cotutelle Ph.D. Thesis},
school = {Macquarie University, Sydney, Australia, and
Université de Provence, Aix-en-Provence, France}
}
|
|
The grammaticality of a sentence has conventionally been treated in a binary
way: either a sentence is grammatical or not. A growing body of work,
however, focuses on studying intermediate levels of acceptability, sometimes
referred to as gradience. To date, the bulk of this work has concerned
itself with the exploration of human assessments of syntactic gradience.
This dissertation explores the possibility of building a robust computational
model that accords with these human judgements.
We suggest that the concepts of Intersective Gradience and Subsective Gradience
introduced by Aarts for modelling graded judgements be extended to cover
deviant language. Under such a new model, the problem then raised by
gradience is to classify an utterance as a member of a specific category
according to its syntactic characteristics. More specifically, we extend
Intersective Gradience (IG) so that it is concerned with choosing the most
suitable syntactic structure for an utterance among a set of candidates,
while Subsective Gradience (SG) is extended to be concerned with calculating
to what extent the chosen syntactic structure is typical of the category
at stake. IG is addressed by relying on a criterion of optimality, while SG
is addressed by rating an utterance according to its grammatical
acceptability. As for the required syntactic characteristics, which serve as
features for classifying an utterance, our investigation of different
frameworks for representing the syntax of natural language shows that they
can easily be represented in Model-Theoretic Syntax; we choose to use
Property Grammars (PG), which offer a means of modelling the characterisation
of an utterance. We present here a fully automated solution for modelling
syntactic gradience, which characterises any well-formed or ill-formed input
sentence, generates an optimal parse for it, then rates the utterance
according to its grammatical acceptability.
Through the development of such a new model of gradience, the main contribution of
this work is three-fold.
First, we specify a model-theoretic logical framework for PG, which bridges the gap
observed in the existing formalisation regarding the constraint satisfaction
and constraint relaxation mechanisms, and how they relate to the projection
of a category during the parsing process. This new framework introduces the
notion of loose satisfaction, along with a formulation in first-order logic,
which enables reasoning about the characterisation of an utterance.
Second, we present our implementation of Loose Satisfaction Chart Parsing (LSCP), a
dynamic programming approach based on the above mechanisms, which is proven
to always find the full parse of optimal merit. Although it shows a high
theoretical worst-case time complexity, it performs sufficiently well with the
help of heuristics to let us experiment with our model of gradience.
And third, after postulating that human acceptability judgements can be
predicted by factors derivable from LSCP, we present a numeric model for
rating an utterance according to its syntactic gradience. We measure a good
correlation with human judgements of grammatical acceptability. Moreover,
the model turns out to outperform an existing one discussed in the
literature, which was tested on manually generated parses.
|