Sunday 14 June 2009

Behavioral Profiling and polysemy

In their paper entitled In defense of corpus-based methods: A behavioral profile analysis of polysemous 'get' in English (presented at the 24th North West Linguistics Conference, 3-4 May 2008), Andrea L. Berez and Stefan Th. Gries make a general case for the use of corpus data. Their paper is a response to Raukko's (1999, 2003) proposal to set aside corpus-based investigations in favour of experimentally motivated studies. Berez and Gries conclude that:

[A] rejection of corpus-based investigations of polysemy is premature: our BP approach to get not only avoids the pitfalls Raukko mistakenly claims to be inherent in corpus research, it also provides results that are surprisingly similar to his own questionnaire-based results, and Divjak and Gries (to appear) show how predictions following from a BP study are strongly supported in two different psycholinguistic experiments. (p.165)

Before conducting a case study of polysemous get, whose results are compared in the second part of the paper with those presented in Raukko's An "intersubjective" method for cognitive semantic research on polysemy: the case of 'get' (1999), the authors briefly state the advantages of corpus data:

- (...) the richness of and diversity of naturally-occurring data often forces the researcher to take a broader range of facts into consideration;
- the corpus output from a particular search expression together constitute an objective database of a kind that made-up sentences or judgements often do not. More pointedly, made-up sentences or introspective judgements involve potentially non-objective (i) data gathering, (ii) classification, (iii) interpretive process on the part of the researcher. Corpus data, on the other hand, at least allow for an objective and replicable data-gathering process; given replicable retrieval operations, the nature, scope and the ideas underlying the classification of examples can be made very explicit (...) (p.159)

Methodologically, Berez and Gries make their case by taking polysemy as their domain of investigation and by applying the Behavioral Profiling (BP) method (described here):

Given the recency of this method, the number of studies that investigate highly polysemous items is still limited. We therefore apply this method to the verb to get to illustrate that not only does it not suffer from the problems of the intersubjective approach, but it also allows for a more bottom-up/data-driven analysis of the semantics of lexical elements to determine how many senses of a word to assume and what their similarities and differences are. (p.157)
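
The post does not spell the method out, so here is a minimal sketch of the idea as I understand it: each corpus hit is annotated by hand for a set of features, and each sense is then summarised as a vector of relative frequencies (its behavioral profile), which can be compared across senses. Everything in the example is an assumption made for illustration: the sense labels, the two annotated features, the toy concordance lines and the choice of Euclidean distance are hypothetical and are not taken from Berez and Gries' data or analysis.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical hand-annotated concordance lines for "get":
# (sense label, {feature: level}). None of this comes from the paper.
ANNOTATED = [
    ("obtain", {"transitivity": "transitive", "object": "concrete"}),
    ("obtain", {"transitivity": "transitive", "object": "concrete"}),
    ("obtain", {"transitivity": "transitive", "object": "abstract"}),
    ("become", {"transitivity": "copular", "object": "none"}),
    ("become", {"transitivity": "copular", "object": "none"}),
    ("move", {"transitivity": "intransitive", "object": "none"}),
    ("move", {"transitivity": "intransitive", "object": "none"}),
]


def behavioral_profiles(lines):
    """Return (features, profiles): one relative-frequency vector per sense."""
    counts = defaultdict(Counter)  # sense -> count of each (feature, level)
    totals = Counter()             # sense -> number of annotated lines
    for sense, tags in lines:
        totals[sense] += 1
        for feature, level in tags.items():
            counts[sense][(feature, level)] += 1
    # A fixed, sorted column order so that vectors are comparable.
    features = sorted({f for c in counts.values() for f in c})
    profiles = {
        sense: [counts[sense][f] / totals[sense] for f in features]
        for sense in counts
    }
    return features, profiles


def euclidean(a, b):
    """Distance between two profiles (an assumed, illustrative measure)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


features, profiles = behavioral_profiles(ANNOTATED)
for sense, vector in profiles.items():
    print(sense, [round(v, 2) for v in vector])
print(euclidean(profiles["become"], profiles["move"]))
print(euclidean(profiles["obtain"], profiles["become"]))
```

In this toy data the 'become' and 'move' profiles end up closer to each other than either is to 'obtain', which is the kind of similarity structure that a clustering step over the profiles would then pick up.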

Overall, the results of Berez and Gries' study and of Raukko's study are very similar. However, Berez and Gries' BP approach allows for a finer-grained investigation:

we show that some of our results are incredibly close to Raukko's, but also provide an illustration of how the BPs can combine syntactic and semantic information in a multifactorial way that is hard to come by using the kinds of production experiments Raukko discusses. (p.159)

With regard to my project, which is broadly concerned with a corpus-driven investigation of polysemous lexical items, Berez and Gries' paper provides a useful methodological illustration of how to exploit corpus data for the retrieval of semantic information.
