Speech and Spoken Language Corpora

CARINA – A CORPUS OF ALIGNED GERMAN READ SPEECH INCLUDING ANNOTATIONS

Read more about CARINA – A CORPUS OF ALIGNED GERMAN READ SPEECH INCLUDING ANNOTATIONS
Log in to post comments

This paper presents the semi-automatically created Corpus of Aligned Read Speech Including Annotations (CARInA), a speech corpus based on the German Spoken Wikipedia Corpus (GSWC). CARInA tokenizes, consolidates and organizes the vast, but rather unstructured material contained in GSWC. The contents are grouped by annotation completeness, and extended by canonic, morphosyntactic and prosodic annotations. The annotations are provided in BPF and TextGrid format.

CARInA_Poster.pdf

Poster CARInA (317)

Categories:: Speech Processing

32 Views