MATE Deliverable D1.1

Supported Coding Schemes




Coding scheme adopted in the project "Spoken Language and Social Activity" at Göteborg University.

Coding book:

"Transcription Standard", J. Nivre, Semantics and Spoken Language, Department of Linguistics, Göteborg University [Nivre ?]



Swedish Spoken Language Corpus (interviews, shop, meeting, phone, task-oriented dialogues)

Number of activities: 227

Number of words: 967,141

Evaluations of scheme:

Information not available.

Purpose and underlying approach:

The aim of the project is to investigate spoken language in different social activities.

The theoretical framework is activity-based communication analysis.

Prosody is of indirect relevance, as a correlate of discourse structure.

List of phenomena annotated:

Prosodic markers may be inserted in the orthographic or phonetic transcription, representing global rhythmic or intonational features of speech or local events such as stress, lengthening and pauses.


Stress: emphatic or contrastive stress is marked with uppercase letters.

Lengthening: marked with the diacritic ":".

Pauses: short (/), long (//), very long (///).

Properties of speech:

<high pitch>, <low pitch>, <quick>, <slow>, <loud>, <quiet>, etc.
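
The local markers listed above (stress, lengthening, pauses) lend themselves to mechanical recognition. The following Python sketch is an illustration only and is not part of the scheme or of its tools. It assumes that markers and words are separated by whitespace, that a token counts as stressed when all of its letters are uppercase, and that ":" may occur anywhere inside a lengthened word; the function name `annotate` and the sample utterance are hypothetical.

```python
# Pause markers as listed in the scheme: / short, // long, /// very long.
PAUSE_LABELS = {"/": "short", "//": "long", "///": "very long"}


def annotate(utterance):
    """Return (token, features) pairs for one transcribed utterance.

    Assumption: tokens are whitespace-separated; this is a simplification
    made for the example, not a rule taken from the coding book.
    """
    result = []
    for token in utterance.split():
        if token in PAUSE_LABELS:
            result.append((token, {"pause": PAUSE_LABELS[token]}))
            continue
        features = {}
        letters = [c for c in token if c.isalpha()]
        # Uppercase letters mark emphatic or contrastive stress.
        if letters and all(c.isupper() for c in letters):
            features["stress"] = True
        # The diacritic ":" marks lengthening; it is removed from the word.
        if ":" in token:
            features["lengthened"] = True
        result.append((token.replace(":", ""), features))
    return result


# Hypothetical utterance, invented for this example.
for token, features in annotate("ja: / det är BRA ///"):
    print(token, features)
```

Keeping the features in a dictionary per token leaves room for further markers without changing the return type.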


Information not available.

Markup language:

A transcription is divided into a header section and a body.

The body section consists of:

lines beginning with "$", which contain the transcribed utterances, with possible prosodic diacritics;

information lines beginning with "@", which contain comments and properties of speech in angle brackets.
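
The two line types of the body can be told apart by their first character. The Python sketch below is a minimal illustration under stated assumptions and is not the TransTool implementation: it assumes one record per line, skips lines that begin with neither "$" nor "@", and does not handle the header section. The function name `parse_body`, the record layout and the sample lines are hypothetical.

```python
import re

# Properties of speech are written in angle brackets, e.g. <quick>.
PROPERTY_RE = re.compile(r"<([^<>]+)>")


def parse_body(lines):
    """Split body lines into utterance records and information records."""
    records = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("$"):
            # Utterance line: keep the text, including prosodic diacritics.
            records.append({"kind": "utterance", "text": line[1:].strip()})
        elif line.startswith("@"):
            # Information line: a comment and/or properties of speech.
            text = line[1:].strip()
            records.append({
                "kind": "info",
                "text": text,
                "properties": PROPERTY_RE.findall(text),
            })
        # Assumption: any other line (e.g. a blank line) is ignored.
    return records


# Hypothetical body lines, invented for this example.
sample = ["$ ja: / det är BRA", "@ <high pitch>, <quick>", ""]
for record in parse_body(sample):
    print(record)
```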

Annotation tools:

TransTool is a computer tool for transcribing spoken language, developed in accordance with the transcription standards used within the research program Semantics and Spoken Language at the Department of Linguistics, Göteborg University. TransTool is implemented in Tcl/Tk (Tool Command Language/Toolkit), a scripting language designed to be easy to embed into other applications, combined with a window system toolkit for building graphical user interfaces.