MATE Deliverable D1.1

Supported Coding Schemes



SAMPA (SAM Phonetic Alphabet) is a multilingual, computer-readable phonetic transcription system developed within ESPRIT project 2589, SAM (Multilingual Speech Input/Output Assessment, Methodology and Standardization). The final SAMPA standard is presented in Wells et al. (1992).

Information about SAMPA is available at:


Coding book:

Information not available.


SAMPA has been applied not only by the SAM partners collaborating on EUROM 1, but also in other speech research projects (e.g. BABEL, Onomastica) and by Oxford University Press.


Information not available.

Purpose and underlying approach:

SAMPA aims to provide ASCII encodings for the IPA symbols required for European languages. It also includes a number of symbols for prosodic transcription, which are designed to avoid dependency on any particular prosodic model. SAMPA is mainly intended to support signal-oriented labelling and provides a basis for cross-language comparisons.
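As a rough illustration of what an ASCII encoding of IPA symbols looks like in practice, the following minimal sketch converts a SAMPA string to IPA. The table is a small subset of well-known SAMPA correspondences chosen for the example, not the full standard, and the names `SAMPA_TO_IPA` and `sampa_to_ipa` are hypothetical. It assumes every symbol is a single character, which holds for this subset but not for every language-specific SAMPA set.

```python
# Illustrative subset of SAMPA-to-IPA correspondences (assumption: this is
# only a sample for demonstration, not the complete SAMPA inventory).
SAMPA_TO_IPA = {
    "S": "ʃ",   # voiceless postalveolar fricative
    "Z": "ʒ",   # voiced postalveolar fricative
    "T": "θ",   # voiceless dental fricative
    "D": "ð",   # voiced dental fricative
    "N": "ŋ",   # velar nasal
    "@": "ə",   # schwa
    "{": "æ",   # near-open front unrounded vowel
    "I": "ɪ",
    "U": "ʊ",
    "V": "ʌ",
    ":": "ː",   # length mark
    '"': "ˈ",   # primary stress
    "%": "ˌ",   # secondary stress
}


def sampa_to_ipa(text: str) -> str:
    """Convert a SAMPA string to IPA one character at a time.

    Characters missing from the table (e.g. plain "p", "t", "k") are
    passed through unchanged, because SAMPA keeps them identical to IPA.
    """
    return "".join(SAMPA_TO_IPA.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(sampa_to_ipa("TIN"))      # English "thing"
    print(sampa_to_ipa('@"baUt'))   # English "about"
```

With this table, `sampa_to_ipa("TIN")` yields `θɪŋ`. A fuller converter would need longest-match tokenization for multi-character symbols.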

List of phenomena annotated:

Prosodic boundaries:


syllable boundary


morpheme boundary


word boundary


tone group/intonation phrase boundary


phonological phrase/rhythm group boundary


sentence boundary

Prosodic phenomena

  • Stress


primary stress and accent I in Norwegian and Swedish


secondary stress


accent II in Norwegian and Swedish

Phonetic cues of prosody

  • Duration


length mark

  • Intonation

SAMPA includes a set of symbols for transcribing intonation contours:


rising tone


falling tone



  • Pauses


silent pause


Information not available.

Markup language:

Diacritics inserted in the phonetic transcription.
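Because the prosodic marks are written inline with the segments, a tool that consumes such a transcription typically has to separate the two layers. The sketch below shows one way this might be done. It assumes only three marks from the published SAMPA conventions (`"` for primary stress, `%` for secondary stress, `:` for length), since the symbol column of the list above is not preserved here; the names `PROSODIC_MARKS` and `split_diacritics` are hypothetical.

```python
# Assumed subset of SAMPA prosodic diacritics (not taken from the list above,
# whose symbols are missing): primary stress, secondary stress, length mark.
PROSODIC_MARKS = {
    '"': "primary stress",
    "%": "secondary stress",
    ":": "length mark",
}


def split_diacritics(transcription: str):
    """Separate inline prosodic marks from segmental symbols.

    Returns (segments, marks). `marks` is a list of (index, label) pairs,
    where `index` is the number of segments already read when the mark
    was encountered. Stress marks therefore point at the segment that
    follows them, and the length mark points just past the segment it
    lengthens.
    """
    segments = []
    marks = []
    for ch in transcription:
        if ch in PROSODIC_MARKS:
            marks.append((len(segments), PROSODIC_MARKS[ch]))
        else:
            segments.append(ch)
    return "".join(segments), marks


if __name__ == "__main__":
    print(split_diacritics('"si:'))
```

For the input `"si:` this returns the segment string `si` together with a primary stress mark at index 0 and a length mark at index 2.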

Annotation tools:

Information not available.