MATE Deliverable D1.1

Supported Coding Schemes

 

KIM (Kiel Intonation Model)
(Developed at IPDS, Institut für Phonetik und digitale Sprachverarbeitung, Kiel University.)
 

Coding book:

K. J. Kohler, M. Pätzold, A. P. Simpson, "From Scenario to Segment: the Controlled Elicitation, Transcription, Segmentation and Labelling of Spontaneous Speech", AIPUK 29, 1995, Kiel University [Kohler et al. 95]

Applications:

(see [Kohler 96])

Kiel Corpus of Read Speech (598 sentences, 2 stories, recorded words: 31,374)

Kiel Corpus of Spontaneous Speech (82 dialogues, 25,603 words): 1/3 has been prosodically labelled

Evaluations of scheme:

Information about formal evaluations not available.

The scheme has been judged to reach a "high level of consistency within and across segmenters" [Kohler et al. 95].

Purpose and underlying approach:

The Kiel intonation model KIM ([Kohler 95], [Kohler 97]) is intended to represent intonation in general, at a phonetic and phonological level, with special regard to German. The model is oriented both to prosodic research and to text-to-speech implementation.

List of phenomena annotated:

The model is organized into the following domains: prosodic phrase boundaries, speech rate, f0 downstep, lexical stress, sentence stress, intonation contours (types of peaks and valleys), and synchronization of pitch events with syllables.

For each domain in the KIM model, the relevant categories are represented by symbolic labels. The level of representation is phonological/phonetic. Prosodic labels (prefixed with &) are inserted directly into the phonetic annotation tier and describe quite precisely the morphology and alignment of intonation contours, although not quantitatively.

Prosodic labelling presupposes phonetic segmentation. At least word boundaries and accented vowels should be aligned with the signal. Prosodic labels are associated with word boundaries (in this case they are prefixed with #) or with vowels inside the word (in this case the prefix is $).

Prosodic phrasing:

PG1         when coinciding with syntactic clause boundaries
PG2         otherwise (phrase)
PG1< PG1>   for parenthesis
PG2/        for truncations, false starts
PG          for technical breaks ?

Speech rate (marked after the phrase boundary, if a rate change is perceived with respect to preceding phrase):

  

RP +   (rate plus) rate increase
RM     (rate minus) rate decrease

Downstep is the default and is not marked, nor is reset at phrase boundaries. Exceptions:

-   before PG if reset is absent
+   before the accent digit, if reset is within the phrase
|   before the accent digit, in case of upstep

Lexical stress: integrated in the segmental notation of vowels (' for primary, '' for secondary)

Sentence accent (attribute of the word; placed before the word if it refers to the lexically stressed vowel, otherwise placed before the accented vowel):

0   unaccented
1   partially accented
2   accented
3   reinforced

Intonation contours (marked between accents or at the end of a prosodic phrase):

.    falling contour, peak or hat (with three levels: steep, slight, level)
,    low, narrow rising contour, valley
?    high, wide rising contour, valley
.,   fall-rise (low rise)
.?   fall-rise (high rise)
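
Because two of the contour symbols are made up of two characters, a reader of the label stream has to try the longer symbols first. The following is a minimal sketch of such a lookup, not part of the KIM tools: the function name match_contour is hypothetical, and it is an assumption that a contour symbol appears at the start of the text being inspected. The three levels of the falling contour are not covered, since the symbols for them are not given here.

```python
# Contour symbols and descriptions as listed above.
CONTOURS = {
    ".": "falling contour, peak or hat",
    ",": "low, narrow rising contour, valley",
    "?": "high, wide rising contour, valley",
    ".,": "fall-rise (low rise)",
    ".?": "fall-rise (high rise)",
}


def match_contour(text):
    """Return (symbol, description) for the longest contour symbol
    found at the start of text, or None if there is none.

    Hypothetical helper: longest-match-first is an assumption made so
    that '.,' is not read as '.' followed by ','.
    """
    for symbol in sorted(CONTOURS, key=len, reverse=True):
        if text.startswith(symbol):
            return symbol, CONTOURS[symbol]
    return None
```

With this ordering, match_contour(".,") yields the fall-rise entry, while match_contour(".") yields the plain falling contour.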

Peak and valley alignment (marked after the accent digit):

Peaks:

^   for centre of accented syllable nucleus
)   before the nucleus
(   late in the nucleus or after

Valleys:

]   early
[   non-early
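
The accent digit, the step marker that may precede it and the alignment symbol that may follow it can be decoded together. The sketch below illustrates this under stated assumptions: it supposes that the three parts are written contiguously as one string (for example "2^"), which the description above suggests but does not spell out, and the name decode_accent is hypothetical.

```python
import re

# Categories taken from the lists above.
ACCENT_LEVELS = {
    "0": "unaccented",
    "1": "partially accented",
    "2": "accented",
    "3": "reinforced",
}
STEP_MARKERS = {
    "+": "reset within the phrase",
    "|": "upstep",
}
ALIGNMENTS = {
    "^": ("peak", "centre of accented syllable nucleus"),
    ")": ("peak", "before the nucleus"),
    "(": ("peak", "late in the nucleus or after"),
    "]": ("valley", "early"),
    "[": ("valley", "non-early"),
}

# Assumed layout: optional step marker, accent digit, optional alignment.
_ACCENT_RE = re.compile(r"^([+|])?([0-3])([\^)(\]\[])?$")


def decode_accent(fragment):
    """Decode a fragment such as '2^' or '|3(' into its categories.

    Hypothetical helper; raises ValueError if the fragment does not
    follow the assumed layout.
    """
    match = _ACCENT_RE.match(fragment)
    if match is None:
        raise ValueError(f"not an accent fragment: {fragment!r}")
    step, digit, align = match.groups()
    kind, position = ALIGNMENTS.get(align, (None, None))
    return {
        # Downstep is the unmarked default.
        "step": STEP_MARKERS.get(step, "downstep (default)"),
        "accent": ACCENT_LEVELS[digit],
        "contour_kind": kind,
        "alignment": position,
    }
```

For instance, decode_accent("2^") describes an accented word with a peak at the centre of the accented syllable nucleus and default downstep.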

Examples:

Information not available.

Markup language:

Prosodic labels are ASCII labels inserted directly into the phonetic annotation tier. This tier is represented twice: first as a symbolic stream, then as a list of labels with their time alignment in the signal (segmentation).

All prosodic labels are prefixed with &. Those corresponding to word boundaries are also prefixed with #; those associated with vowels inside the word are also prefixed with $.
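
The prefix convention makes it possible to pick prosodic labels out of the annotation tier and to tell what they attach to. The following is a minimal sketch, not the IPDS consistency checker: the function name classify_label and the example tokens are hypothetical, and since the relative order of the prefix characters is not stated here, the sketch accepts either order.

```python
def classify_label(token):
    """Return (anchor, body) for a prosodic label, or None otherwise.

    Hypothetical helper. A token counts as prosodic only if its run of
    leading prefix characters contains '&'. The order of '&' relative
    to '#' or '$' is an assumption left open, so both are accepted.
    """
    # Collect the leading run of prefix characters.
    end = 0
    while end < len(token) and token[end] in "#$&":
        end += 1
    prefix, body = token[:end], token[end:]

    if "&" not in prefix:
        return None  # not a prosodic label
    if "#" in prefix:
        anchor = "word boundary"
    elif "$" in prefix:
        anchor = "vowel inside word"
    else:
        anchor = "unspecified"
    return anchor, body
```

Under these assumptions, a token such as "#&PG1" is classified as a word-boundary label with body "PG1", and a token without & is ignored.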

Annotation tools:

Phonetic segmentation is performed manually with a waveform editor ([Carlson et al. 85]).

Prosodic labels are inserted manually and aligned with segment boundaries using Xassp, an environment developed at IPDS that displays the waveform, the f0 curve and the labels.

An automatic tool checks the formal consistency of labelling.