Just as pop music has evolved from what was originally a youth-culture phenomenon into an integral part of modern culture, its lyrics have become omnipresent in everyday language. We are surrounded by pop lyrics, e.g. on the car radio, on online streaming services, as ambient music in department stores and restaurants, or in TV shows. Given this high communicative impact, the empirical exploration of pop lyrics remains a substantial desideratum in corpus linguistics.
The Corpus of Song Lyrics ("Songkorpus") addresses this desideratum and contains sustainably usable song texts with multilayer annotation, featuring phenomena of both written and spoken discourse. It is intended for linguistic research as well as for related disciplines, such as media, cultural, and literary studies, the social sciences, or musicology, that have a scientific interest in the language of contemporary German rock and pop music.
The corpus is still in its early stages; for detailed information, please refer to the general statistics. So far, it comprises:
Udo Lindenberg Archive (1972–present)
Konstantin Wecker Archive (1973–present)
Stoppok Archive (1982–present)
Ulla Meinecke Archive (1977–present)
Hannes Wader Archive (1969–present)
Fettes Brot Archive (1994–present)
All of these archives contain song lyrics annotated in TEI P5 XML, with lemmatization and part-of-speech annotation (extended STTS). Named entities, neologisms, and constituent structures are annotated throughout; rhyme types are annotated in some cases.
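To give a rough idea of how token-level annotation of this kind can be read, here is a minimal Python sketch. It assumes that tokens are encoded as TEI `<w>` elements carrying `lemma` and `pos` attributes, which is one common TEI P5 convention; the corpus's actual schema may differ. The sample line is invented for illustration and is not taken from the corpus, and the function name `read_tokens` is hypothetical.

```python
import xml.etree.ElementTree as ET

# TEI P5 documents use this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample, NOT taken from the corpus. Assumption: tokens are <w>
# elements with "lemma" and "pos" attributes; the real schema may differ.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <lg type="stanza">
      <l>
        <w lemma="ich" pos="PPER">Ich</w>
        <w lemma="singen" pos="VVFIN">singe</w>
        <w lemma="ein" pos="ART">ein</w>
        <w lemma="Lied" pos="NN">Lied</w>
      </l>
    </lg>
  </body></text>
</TEI>"""


def read_tokens(tei_xml):
    """Return (word form, lemma, POS tag) triples for every <w> element."""
    root = ET.fromstring(tei_xml)
    return [
        (w.text, w.get("lemma"), w.get("pos"))
        for w in root.iterfind(".//tei:w", TEI_NS)
    ]


tokens = read_tokens(SAMPLE)
for form, lemma, pos in tokens:
    print(form, lemma, pos)
```

Each triple pairs the surface form with its lemma and its STTS tag, which is the kind of information needed for frequency counts or part-of-speech queries over the lyrics.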
In addition, the corpus features several thematic archives:
FRG Single Charts: the 1800 most successful German-language songs from the Top 100 since 1970, according to Chartsurfer
GDR Single Charts: ca. 500 East German songs from 1970–1990, based on the GDR chart lists
HipHop Songs: 1000 German rap lyrics, covering more than two decades