MATE Deliverable D1.1

Supported Coding Schemes

 

PROZODIAG (LUND)

 

The scheme was developed in the project "Prosodic Segmentation and Structuring of Dialogue", supported by the Swedish national "Language Technology Programme" and involving Lund University and KTH.

Information may be found at http://galaxy.ling.lu.se/projects/ProZodiag/ and in the papers: [Bruce et al. 95a,b, 96, 97a,b,c,d].

Coding book:

Information not available.

Applications:

The scheme has been applied to the "Waxholm Application Database" [Bertenstam et al. 95] (http://www.speech.kth.se/waxholm/waxholm2.html), which consists of man-machine dialogues in a boat traffic information service, amounting to 198 scenarios and 1900 dialogue turns.

Evaluation of scheme:

Information not available.

Purpose and underlying approach:

"The object of study is the prosody of dialogue in Swedish in a language technology framework. The primary goal of the project is to increase our understanding of how prosodic aspects of speech are exploited interactively in dialogue - the genuine environment for prosody - and on the basis of this increased knowledge to be able to create a more powerful prosody model." [Bruce et al. 97b]

The adopted methodology is the following:

analysis of discourse/dialogue structure (independent of prosody)

prosodic analysis: 1) auditory analysis (prosodic transcription); 2) acoustic-phonetic analysis (f0 curve, waveform)

speech synthesis (model-based resynthesis, text-to-speech)

Dialogues are annotated with textual and prosodic information aligned with three contours: the original f0 contour; the 'fine-tuned' contour (computed from the f0 register, i.e. the baseline, and from the range, timing and slope of the f0 movements); and the synthetic contour (based on the Lund model of Swedish intonation [Bruce 77]).

List of phenomena annotated:

The model represents prosodic phrasing and intonation in Swedish.

A set of "discrete elements" is intended for a phonological (auditory) transcription of phrasing and intonation contours. A set of "gradational elements" gives a more precise acoustic description of intonation, from which a "fine-tuned contour", a close copy of the original f0 curve, can be computed.

 

Discrete Elements:

Tonal labels, aligned with the nucleus of the accented syllable:

accented I            HL*
accented II           H*L
focussed I            (H)L*H
focussed II           H*LH
focussed compound     H*L...L*H
juncture - initial    %L
juncture - initial    %H
juncture - final      L%
juncture - final      H%
juncture - final      LH%
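The tonal inventory above is small enough to hold in a lookup table. The following is a minimal sketch, assuming one wants to check transcribed labels against the inventory; the dictionary layout and the function name `describe_label` are illustrative and not part of the scheme.

```python
# Label inventory copied from the list above. The data layout and the
# function name are illustrative assumptions, not part of the scheme.
TONAL_LABELS = {
    "HL*": ("accented", "I"),
    "H*L": ("accented", "II"),
    "(H)L*H": ("focussed", "I"),
    "H*LH": ("focussed", "II"),
    "H*L...L*H": ("focussed", "compound"),
    "%L": ("juncture", "initial"),
    "%H": ("juncture", "initial"),
    "L%": ("juncture", "final"),
    "H%": ("juncture", "final"),
    "LH%": ("juncture", "final"),
}


def describe_label(label: str) -> str:
    """Return a readable description of a tonal label, or raise ValueError."""
    try:
        category, subtype = TONAL_LABELS[label]
    except KeyError:
        raise ValueError(f"unknown tonal label: {label!r}") from None
    return f"{category} {subtype}"
```

For instance, `describe_label("H*L")` yields `"accented II"`, while a label outside the inventory raises `ValueError`.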

Grouping:

||    major boundary
|     minor boundary
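As an illustration of how the two boundary symbols nest, the sketch below groups a token sequence into major phrases, each made up of minor phrases. It assumes the boundary symbols appear inline among the word tokens, which is a simplification (the scheme keeps boundaries on their own tier); the function name `split_phrases` and the placeholder tokens are hypothetical.

```python
def split_phrases(tokens):
    """Group tokens into major phrases, each a list of minor phrases.

    Assumption: "|" closes a minor phrase and "||" closes both the current
    minor phrase and the current major phrase.
    """
    major_phrases = []
    minors = []    # minor phrases collected for the current major phrase
    current = []   # tokens collected for the current minor phrase
    for token in tokens:
        if token in ("|", "||"):
            if current:
                minors.append(current)
                current = []
            if token == "||" and minors:
                major_phrases.append(minors)
                minors = []
        else:
            current.append(token)
    # Flush whatever remains after the last boundary symbol.
    if current:
        minors.append(current)
    if minors:
        major_phrases.append(minors)
    return major_phrases


# Placeholder tokens w1..w4 stand in for transcribed words.
phrases = split_phrases(["w1", "w2", "|", "w3", "||", "w4"])
```

Here `phrases` contains two major phrases: the first with the minor phrases `["w1", "w2"]` and `["w3"]`, the second with the single minor phrase `["w4"]`.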

Gradational elements:

For each phrase:

Register: f0 baseline (f0 level of unaccented portions). Values: low / mid / high

Range: height of pitch movement (starting from baseline). Values: low / mid / high / flat / decreased

For each f0 movement: slope, timing

Non-intonational phenomena: duration, voice source characteristics, reduction
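The gradational elements can be pictured as a small record per phrase plus one record per f0 movement. The sketch below is only one possible representation: the class and field names are hypothetical, and the numeric types chosen for slope and timing are assumptions, since the scheme description does not specify how they are encoded.

```python
from dataclasses import dataclass, field
from typing import List

# Value sets taken from the description above.
REGISTER_VALUES = {"low", "mid", "high"}
RANGE_VALUES = {"low", "mid", "high", "flat", "decreased"}


@dataclass
class F0Movement:
    slope: float   # assumed numeric; the unit is not specified in the source
    timing: float  # assumed numeric; the reference point is not specified


@dataclass
class PhraseSettings:
    register: str                      # f0 baseline level of the phrase
    f0_range: str                      # height of pitch movement
    movements: List[F0Movement] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a list of validation messages (empty if all is well)."""
        found = []
        if self.register not in REGISTER_VALUES:
            found.append(f"invalid register: {self.register!r}")
        if self.f0_range not in RANGE_VALUES:
            found.append(f"invalid range: {self.f0_range!r}")
        return found
```

A phrase such as `PhraseSettings("mid", "decreased")` passes validation, whereas a register value outside low/mid/high is reported by `problems()`.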

Examples:

Information not available.

Markup language:

Symbolic labels are organized in separate tiers (a tonal tier and a boundary tier) and synchronized with the discourse labelling (orthographic tier, discourse referent tier, textual segmentation tier).
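A tiered, time-synchronized labelling of this kind can be sketched as a mapping from tier name to a list of time-stamped labels. The representation below is an assumption made for illustration (point labels with a time in seconds); it does not reproduce the actual ESPS/Waves+ label file format, and the names `Label` and `labels_between` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Label:
    time: float  # assumed: label position in seconds from the start of the signal
    text: str


def labels_between(tiers: Dict[str, List[Label]],
                   start: float, end: float) -> Dict[str, List[str]]:
    """Collect, per tier, the label texts whose time lies in [start, end)."""
    return {
        name: [lab.text for lab in labels if start <= lab.time < end]
        for name, labels in tiers.items()
    }


# Hypothetical data: times and the placeholder words w1, w2 are invented.
example_tiers = {
    "orthographic": [Label(0.10, "w1"), Label(0.55, "w2")],
    "tonal": [Label(0.20, "H*L"), Label(0.70, "L%")],
    "boundary": [Label(0.80, "||")],
}
first_half_second = labels_between(example_tiers, 0.0, 0.5)
```

Because every tier shares the same time axis, a single time window selects the word, its tonal label and any boundary that fall together.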

Annotation tools:

Labels and signal information are synchronized in ESPS/Waves+.