(1)

Natural Language Engineering Techniques

Lecture 6

Theories of discourse: Centering

Lecturer: Dan Cristea

Labs: Diana Trandabăț, Mihaela Onofrei, Daniela Gîfu, Ionuț Pistol

(2)

The discourse layer

[Figure: processing pipeline from text to result. Stages: initial processing, sub-syntactic processing, syntactic processing, semantic processing, discourse processing, pragmatic processing.]

(3)

The semantic layer

[Figure: the processing pipeline from text to result (initial, sub-syntactic, syntactic, semantic, discourse and pragmatic processing), annotated with related tasks: cohesion & coherence, summarisation, temporal processing, question-answering, discourse structure.]

(4)

What is discourse?

Longman:

1. a serious speech or piece of writing on a particular subject: Professor Grant delivered a long discourse on aspects of moral theology.

2. serious conversation between people: You can’t expect meaningful discourse when you two disagree so violently.

3. the language used in particular kinds of speech or writing: scientific discourse.

As you can see, definition 1 leaves dialogue aside. Dialogue is usually considered closely related to discourse, and many techniques devised for discourse can be applied to dialogue as well.

This presentation does not cover dialogue.

(6)

Text versus discourse

Syntactically, a discourse is more than a single sentence.

[Example: an excerpt from García Márquez]

(7)

Text versus discourse

A text is not a discourse!

But it becomes a discourse the very moment it is read or heard by a human... or a machine.

(8)

Time and discourse

Discourse has a dynamic nature.

Time axes: real time, discourse interpretation time, story time.

[Figure: events 1 and 2 placed on the three time axes]

(9)

Cohesion and coherence

A text manifests cohesion when its parts closely correlate.

A text is coherent when it makes sense, with respect to an accepted setting, real or virtual.

Setting: a dynamic system of conventions.

(10)

Cohesion and coherence

Cohesion: usually enforced by anaphoric links, repetitions, etc. (see Halliday and Hasan, 1978)

Coherence: it is rather easy to decide that a text is coherent, and very risky to claim the contrary.

Recently, a friend of mine challenged me, claiming that I could not give him a senseless sentence. So I uttered the famous Chomskyan sentence “Colorless green ideas sleep furiously.” and challenged him to find a sense in it.

And he did: he explained to me that this sentence simply says that one night some ideas (colorless, as all ideas are) came, during an agitated sleep, to the mind of a politician, a member of the green party…

So the example argues for the necessity of a setting (or a context) in which a discourse is given a meaning. Often the key to interpreting a discourse lies in finding this setting. This is why a novel like William Faulkner's The Sound and the Fury is obscure and difficult to read for some people, while for others it is a delight.

(11)

Interpretation of discourse

[Figure: discourse interpretation combines the text with the setting. The setting comprises knowledge about the language, knowledge about the situation, knowledge about the author, knowledge about the world (real + virtual), and intertextuality.]

The setting explains why the same sentence can have different meanings for different people: each one decodes it in a different setting (context).

(12)

What do we expect from a theory of discourse?

• To tell us how the discourse is structured

• To make explicit, using this structure, some discourse phenomena: at least cohesion and coherence

• To explain interruptions and flashbacks

• To explain how the structure can be built from the raw text

• To serve easily as the basis of implementations that deal with discourse content

(13)

Centering - a theory of local discourse coherence

• Joshi, A.K. and Weinstein, S., 1981: “Control of Inference: Role of Some Aspects of Discourse-Structure Centering”

• Grosz, B., Joshi, A.K. and Weinstein, S., 1986: “Towards a computational theory of discourse interpretation”

• Brennan, S.E., Friedman, M.W. and Pollard, C.J., 1987: “A Centering approach to pronouns”

• Grosz, B., Joshi, A.K. and Weinstein, S., 1995: “Centering: A framework for modeling the local coherence of discourse”

• Strube, M. and Hahn, U., 1996: “Functional Centering”

• Walker, M.A., Joshi, A.K. and Prince, E.F. (eds.), 1997: “Centering in Discourse”

• Kameyama, M., 1997: “Intrasentential Centering: A Case Study”

• Poesio, M., Stevenson, R., Di Eugenio, B. and Hitzeman, J., 2004: “Centering: A Parametric Theory and Its Instantiations”

(14)

CT: goals of the theory

• explains why certain texts are more difficult to process than others

• explains why we use the pronouns the way we use them

• anchors a practical approach for anaphora resolution

(15)

A smooth discourse

from (Walker, Joshi and Prince, 1997)

a. Jeff₁ helped Dick₂ wash the car.

b. He₁ washed the windows as Dick₂ waxed the car.

c. He₁ soaped a pane.

He in c. is Jeff, because soaping can only be related to the washing event.

The discourse starts by talking about Jeff in (a) and continues about him in (b); in (c) the semantic constraint dictates that it is still the same Jeff. Processing runs smoothly.

(16)

A less smooth discourse

a. Jeff₁ helped Dick₂ wash the car.

b. He₁ washed the windows as Dick₂ waxed the car.

c. He₂ buffed the hood.

He in c. is Dick, because buffing can only be related to the waxing event.

CT claims that this discourse is harder to process than the previous one. A semantic inference dictates that in (c) He refers to Dick rather than to Jeff. Attention switches here from Jeff, who was the center of attention in the first two sentences, to Dick, who is talked about in (c). Switching the center of attention is harder to process than keeping it.

(17)

CT: focus helps to disambiguate pronominal anaphora

1. Susan₁ is a fine friend.

2. She₁ gives people the most wonderful presents.

3. She₁ just gave Betsy₂ a wonderful bottle of wine.

4. She₁ told her₂ it was quite rare.

5. She₁ knows a lot about wine.

from (Grosz, Joshi and Weinstein, 1995)

There is no problem finding who she is in 2 and 3, because no referent other than Susan has been introduced by the discourse. Utterance 3 introduces Betsy, also a female character. Neither syntactic nor semantic criteria can decide in 4 who she and her refer to. Still, we easily link she to Susan and her to Betsy, because the focus of the preceding utterance was Susan and it is normal to assume that the focus is preserved. A similar criterion works in 5: more people are inclined to take she here to be Susan than Betsy.

(18)

CT explains the normality or oddness of some utterances

a. Terry really goofs sometimes.

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

c. He wanted Tony to join on a sailing expedition.

d. He called him at 6 A.M.

from (Grosz, Joshi and Weinstein, 1995)

“CT investigates the interactions that can be established between the choice of referring expressions, the attentional state, the inferences needed to determine the interpretations of an utterance in a discourse segment, and coherence. Pronouns and definite descriptions are not equivalent with respect to their effect on coherence.” (translated quote from GJW95)

(19)

This text has a defect…

a. Terry really goofs sometimes.

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

c. He wanted Tony to join on a sailing expedition.

d. He called him at 6 A.M.

e. He was sick and furious at being woken up so early.

The focus has been kept constant on Terry up to unit (d). Only when sick appears does the reader realize that he in (e) is no longer Terry but Tony, because Terry could not have been sick.

For a moment the reader was misled, and an inference was needed to repair the confusion.

(21)

This text repairs the preceding one

a. Terry really goofs sometimes.

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

c. He wanted Tony to join on a sailing expedition.

d. He called him at 6 A.M.

e. Tony was sick and furious at being woken up so early.

A more natural sequence would have used Tony directly in (e).

(22)

... And we can go on like this:

a. Terry really goofs sometimes.

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

c. He wanted Tony to join on a sailing expedition.

d. He called him at 6 A.M.

e. Tony was sick and furious at being woken up so early.

f. He told Terry to get lost and hung up.

(23)

But here we have a problem again...

a. Terry really goofs sometimes.

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

c. He wanted Tony to join on a sailing expedition.

d. He called him at 6 A.M.

e. Tony was sick and furious at being woken up so early.

f. He told Terry to get lost and hung up.

g. Of course, he hadn’t intended to upset Tony.

… but the reader is misled again in unit (g). The focus had shifted from Terry to Tony in units (e) and (f), and the expectation was that the same character would remain in the center of attention.

(25)

Finally, we are done!

a. Terry really goofs sometimes.

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

c. He wanted Tony to join on a sailing expedition.

d. He called him at 6 A.M.

e. Tony was sick and furious at being woken up so early.

f. He told Terry to get lost and hung up.

g. Of course, Terry hadn’t intended to upset Tony.

In this version the confusion is finally eliminated.

The conjecture of CT is that the form of expression in a discourse directly influences the resources required to decipher it. It is known that identifying the referents of noun phrases in a discourse involves a certain inferential process. CT states that the form in which these noun phrases are expressed can place a greater or a smaller inferential load on the reader.

(26)

CT: the main lines

• Applies to just one segment of discourse

– refers to Grosz & Sidner's Attentional State Theory

• Sees the segment as made up of adjacent utterances (sentences)

• Discourse entities are called centers

(27)

Centers

John gave Mary a flower.

person1: type = person, name = John, gender = masc

person2: type = person, name = Mary, gender = fem

flower1: type = flower, number = sg

The realisation relation links each expression (John, Mary, a flower) to its center.
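The centers on this slide can be written down as small feature structures. The sketch below is only an illustration: the `Center` class, the `realises` dictionary and the `feature` helper are hypothetical names introduced here, not part of the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Center:
    """A discourse entity (center) described by a small feature structure."""
    ident: str
    features: tuple  # (name, value) pairs; a tuple keeps the center hashable


# Centers evoked by "John gave Mary a flower."
person1 = Center("person1", (("type", "person"), ("name", "John"), ("gender", "masc")))
person2 = Center("person2", (("type", "person"), ("name", "Mary"), ("gender", "fem")))
flower1 = Center("flower1", (("type", "flower"), ("number", "sg")))

# The realisation relation, stored here as a mapping expression -> center.
realises = {"John": person1, "Mary": person2, "a flower": flower1}


def feature(center, name):
    """Return the value of one feature of a center, or None if it is absent."""
    return dict(center.features).get(name)
```

For instance, `feature(realises["Mary"], "gender")` looks up the gender of the center realised by Mary.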

(28)

What is a center?

John went to Mary's house.

He met her down the street.

her realises person1: type = person, name = Mary, gender = fem

He found it down the street.

it realises house1: type = house, number = sg

(29)

For each utterance, compute:

• a list of forward-looking centers:

Cf(uᵢ) = <e₁, e₂, ..., eₖ>

ranking: subject > direct-object > indirect-object > others

• a backward-looking center:

Cb(uᵢ) = highest-ranked element of Cf(uᵢ₋₁) that is realised in uᵢ

• a preferred center:

Cp(uᵢ) = e₁
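The three definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: utterances are given as already-resolved (entity, role) pairs, and the function names and role labels are hypothetical choices made for the example.

```python
# Role ranking taken from the slide: subject > direct-object > indirect-object > others.
ROLE_RANK = {"subject": 0, "direct-object": 1, "indirect-object": 2, "other": 3}


def forward_centers(mentions):
    """Cf(u): the entities of an utterance, ordered by grammatical-role rank.

    `mentions` is a list of (entity, role) pairs in surface order. An entity
    mentioned several times keeps its best-ranked role.
    """
    best = {}
    for entity, role in mentions:
        rank = ROLE_RANK.get(role, ROLE_RANK["other"])
        if entity not in best or rank < best[entity]:
            best[entity] = rank
    # sorted() is stable, so entities with equal rank keep surface order.
    return sorted(best, key=lambda e: best[e])


def backward_center(cf_prev, cf_cur):
    """Cb(u_i): highest-ranked element of Cf(u_{i-1}) realised in u_i, else None."""
    for entity in cf_prev:
        if entity in cf_cur:
            return entity
    return None


def preferred_center(cf_cur):
    """Cp(u_i): the first (highest-ranked) element of Cf(u_i), else None."""
    return cf_cur[0] if cf_cur else None


# Units (c) and (d) of the Terry/Tony discourse, with pronouns already resolved.
cf_c = forward_centers([("Terry", "subject"), ("Tony", "direct-object"), ("expedition", "other")])
cf_d = forward_centers([("Terry", "subject"), ("Tony", "direct-object")])
cb_d = backward_center(cf_c, cf_d)
cp_d = preferred_center(cf_d)
```

Here `cb_d` and `cp_d` both come out as Terry, which is the situation the later slides label CONTINUING.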

(30)

Rule 1: pronoun realisation

If some element of Cf(uᵢ₋₁) is realised as a pronoun in uᵢ, then so is Cb(uᵢ)

– it captures the intuition that pronominalisation is one way to indicate discourse salience

– if there are multiple pronouns in a sentence realising discourse entities from the previous utterance, then Cb must be one of them

– if there is just one pronoun, then the pronoun must be the Cb
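Rule 1 can be checked mechanically once the entities and their pronominal realisations are known. The sketch below assumes that reference resolution has already been done; `rule1_holds` and its arguments are hypothetical names used only for illustration.

```python
def rule1_holds(cf_prev, realised, pronouns):
    """Check Rule 1 for one utterance u_i.

    cf_prev  -- Cf(u_{i-1}) as a ranked list of entities
    realised -- set of entities realised in u_i
    pronouns -- set of entities realised as pronouns in u_i
    """
    # Cb(u_i): highest-ranked element of Cf(u_{i-1}) realised in u_i.
    cb = next((e for e in cf_prev if e in realised), None)
    # Does any element of Cf(u_{i-1}) appear as a pronoun in u_i?
    if not any(e in pronouns for e in cf_prev):
        return True  # Rule 1 imposes no constraint in this case
    return cb in pronouns


cf_c = ["Terry", "Tony", "expedition"]  # Cf of "He wanted Tony to join on a sailing expedition."

# "He called him at 6 A.M."  (both entities pronominalised)
ok_both = rule1_holds(cf_c, {"Terry", "Tony"}, {"Terry", "Tony"})
# "He called Tony at 6 A.M." (only the Cb pronominalised)
ok_cb_only = rule1_holds(cf_c, {"Terry", "Tony"}, {"Terry"})
# "Terry called him at 6 A.M." (Tony pronominalised, the Cb Terry is not)
ok_violation = rule1_holds(cf_c, {"Terry", "Tony"}, {"Tony"})
```

Only the third variant returns `False`: a non-Cb entity is pronominalised while the Cb itself is realised by a proper name.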

(31)

Rule 1 observed

a. Terry really goofs sometimes.

Cf = ([Terry])

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

Cf = (he=his=[Terry], [the sailboat]) Cb = [Terry]

c. He wanted Tony to join on a sailing expedition.

Cf = (he=[Terry], [Tony], [the expedition]) Cb = [Terry]

d. He called him at 6 A.M.

Cf = (he=[Terry], him=[Tony]) Cb = [Terry]

(32)

Rule 1 still observed

a. Terry really goofs sometimes.

Cf = ([Terry])

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

Cf = (he=his=[Terry], [the sailboat]) Cb = [Terry]

c. He wanted Tony to join on a sailing expedition.

Cf = (he=[Terry], [Tony], [the expedition]) Cb = [Terry]

d. He called Tony at 6 A.M.

Cf = (he=[Terry], [Tony]) Cb = [Terry]

(33)

Rule 1 violated

a. Terry really goofs sometimes.

Cf = ([Terry])

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

Cf = (he=his=[Terry], [the sailboat]) Cb = [Terry]

c. He wanted Tony to join on a sailing expedition.

Cf = (he=[Terry], [Tony], [the expedition]) Cb = [Terry]

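d. Terry called him at 6 A.M.

Cf = ([Terry], him=[Tony]) Cb = [Terry]

Tony, an element of Cf(c), is realised as a pronoun in (d), but Cb(d) = [Terry] is not, so Rule 1 does not hold.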

(34)

Rule 2: transitions

                   Cb(u) = Cb(u-1)    Cb(u) ≠ Cb(u-1)

Cb(u) = Cp(u)      CONTINUING         SMOOTH SHIFT

Cb(u) ≠ Cp(u)      RETAINING          ABRUPT SHIFT

Preference order: CON > RET > SSH > ASH

Following (Grosz, Joshi and Weinstein, 1995) and (Brennan, Friedman and Pollard, 1987)
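The transition table can be sketched as a small function. One assumption is made loudly here: when the previous utterance has no Cb (start of the segment), the function treats the case as Cb(u) = Cb(u-1), a common convention that this slide does not state. The names `transition` and `PREFERENCE` are hypothetical.

```python
# Preference order from the slide: CON > RET > SSH > ASH.
PREFERENCE = ["CONTINUING", "RETAINING", "SMOOTH SHIFT", "ABRUPT SHIFT"]


def transition(cb_prev, cb_cur, cp_cur):
    """Classify the transition into utterance u from Cb(u-1), Cb(u) and Cp(u)."""
    # Assumption: an undefined Cb(u-1) counts as "same Cb".
    same_cb = cb_prev is None or cb_cur == cb_prev
    if cb_cur == cp_cur:
        return "CONTINUING" if same_cb else "SMOOTH SHIFT"
    return "RETAINING" if same_cb else "ABRUPT SHIFT"


# (Cb, Cp) pairs for units (d) to (g) of the final Terry/Tony discourse.
units = [("Terry", "Terry"), ("Tony", "Tony"), ("Tony", "Tony"), ("Tony", "Terry")]
labels = [
    transition(prev[0], cur[0], cur[1])
    for prev, cur in zip(units, units[1:])
]
```

`labels` then holds the transitions into (e), (f) and (g); comparing two labels by their index in `PREFERENCE` gives the ordering used by Rule 2.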

(35)

Rule 2: a point of discontinuity

a. Terry really goofs sometimes.

Cf = ([Terry])

b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.

Cf = (he=his=[Terry], [the sailboat]) Cb = [Terry] (CONTINUING)

c. He wanted Tony to join on a sailing expedition.

Cf = (he=[Terry], [Tony], [the expedition]) Cb = [Terry] (CONTINUING)

d. He called Tony at 6 A.M.

Cf = (he=[Terry], [Tony]) Cb = [Terry] (CONTINUING)

e. Tony was sick and furious at being woken up so early.

Cf = ([Tony]) Cb = [Tony] (SMOOTH SHIFT)

(36)

Rule 2: further analysis

...

d. He called Tony at 6 A.M.

Cf = (he=[Terry], [Tony]) Cb = [Terry] (CONTINUING)

e. Tony was sick and furious at being woken up so early.

Cf = ([Tony]) Cb = [Tony] (SMOOTH SHIFT)

f. He told Terry to get lost and hung up.

Cf = (he=[Tony], [Terry]) Cb = [Tony] (CONTINUING)

g. Of course, Terry hadn’t intended to upset Tony.

Cf = ([Terry], [Tony]) Cb = [Tony] (RETAINING)

(37)

Centering hints on pronominal anaphora

a. I haven’t seen Jeff for several days.

Cf = (I=[I], [Jeff]) Cb = [I]

b. Carl thinks he’s studying for his exams.

Cf = ([Carl], he=[Jeff], [Jeff's exams]) Cb = [Jeff]

c. I think he(?) went to the Cape with Linda.

from (Grosz, Joshi & Weinstein, 1983)

(38)

Centering explains why we understand he in unit c as Jeff

b. Carl thinks he’s studying for his exams.

Cf = ([Carl], he=[Jeff], [Jeff's exams]) Cb = [Jeff]

c. I think he(?) went to the Cape with Linda.

Reading he = Jeff: Cf = (I=[I], he=[Jeff], [the Cape], [Linda]) Cb = [Jeff] (RETAINING)

Reading he = Carl: Cf = (I=[I], he=[Carl], [the Cape], [Linda]) Cb = [Carl] (ABRUPT SHIFT)

RETAINING is preferred over ABRUPT SHIFT, so he is understood as Jeff.
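The comparison of the two readings can be sketched as a search over candidate referents: each candidate yields a Cf list, hence a Cb and a transition, and the candidate with the most preferred transition wins. This is a simplified illustration in the spirit of transition-based pronoun resolution, not a full algorithm; `resolve` and the other names are hypothetical, and agreement or semantic filters are left out.

```python
PREFERENCE = ["CONTINUING", "RETAINING", "SMOOTH SHIFT", "ABRUPT SHIFT"]


def backward_center(cf_prev, cf_cur):
    """Highest-ranked element of the previous Cf that is realised now, else None."""
    return next((e for e in cf_prev if e in cf_cur), None)


def transition(cb_prev, cb_cur, cp_cur):
    """Transition type; an undefined previous Cb is treated as unchanged (assumption)."""
    same_cb = cb_prev is None or cb_cur == cb_prev
    if cb_cur == cp_cur:
        return "CONTINUING" if same_cb else "SMOOTH SHIFT"
    return "RETAINING" if same_cb else "ABRUPT SHIFT"


def resolve(cf_prev, cb_prev, candidates, build_cf):
    """Pick the candidate referent whose reading gives the most preferred transition.

    build_cf(candidate) must return the ranked Cf list of the current utterance
    under the assumption that the pronoun refers to `candidate`.
    """
    scored = []
    for cand in candidates:
        cf_cur = build_cf(cand)
        cb_cur = backward_center(cf_prev, cf_cur)
        label = transition(cb_prev, cb_cur, cf_cur[0])
        scored.append((PREFERENCE.index(label), cand, label))
    _, best, label = min(scored)
    return best, label


# b. "Carl thinks he's studying for his exams."
cf_b = ["Carl", "Jeff", "Jeff's exams"]
cb_b = "Jeff"

# c. "I think he went to the Cape with Linda."
best, label = resolve(cf_b, cb_b, ["Jeff", "Carl"],
                      lambda he: ["I", he, "the Cape", "Linda"])
```

With these inputs the Jeff reading is selected because its transition ranks higher than the one produced by the Carl reading.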

(40)

Attentional state theory (AST)

(Barbara Grosz & Candace Sidner, 1987)

• Models the linguistic structure of the discourse

• Gives an account of intentions and how they are combined

• Explains the shift of attention during discourse interpretation

• Explains interruptions and flashbacks

• Highlights a dynamic domain of referentiality

• Has 3 components

(41)

AST: 1st component

• a linguistic structure:

– several sentences are aggregated into the same segment

– segments display a recursive structure

(42)

What does locality mean in the CT view?

• CT accepts the theoretical framework of AST, which explains discourse at the global level. CT was developed to explain what happens inside a segment. This means that it can explain why certain transitions between the units of a segment are easier to process than others. But, as we know from AST, dominance relations can be found inside a segment, which means that the segment contains other subsegments.

The recursive definition of the segment, however, raises a question about how to interpret the locality restriction under which CT was defined. Suppose we cross the boundary between the last unit of segment B and the first unit of segment C, where B and C stand in a satisfaction-precedence relation and are both dominated by a larger segment A. On the one hand we cross a segment boundary, so CT cannot be applied; on the other hand we remain inside segment A, so CT should be applicable…

(43)

Centering: other problems?

• still a local theory (applies inside a segment)

• ranking of Cf elements

– on what criteria (surface order, syntactic role, functional (Strube & Hahn, 1996))? → language dependent

• null pronouns

– Italian, Japanese, Romanian

• clitics: doubling references

– Romanian

• intrasentential centering (Kameyama, 1997)

• Centering as a parametric theory (Poesio et al., 2004)
