About the GCND

The Parsed Corpus of Southern Dutch Dialects (GCND) is the first corpus of spoken Dutch dialects. The project aims to make accessible a unique collection of dialect recordings from 768 places in Belgium, the north of France and the south of the Netherlands. The speakers are generally non-mobile, rural and unschooled, and were born around 1900. All recordings in this collection were made by dialectologists from Ghent University, 740 of them between 1963 and 1976. For the GCND, the Ghent collection is being complemented with new recordings (28 recordings from Brussels, Flemish Brabant and Limburg) and recordings from the Meertens Institute (73 recordings from the south of the Netherlands).

For the GCND, the recordings are transcribed using a newly developed two-tier protocol, an urgent task in times of rapidly progressing dialect loss. They are then linguistically annotated using existing software tools: each word is tagged for its word class (‘pos-tags’), and the syntactic functions of word groups and their underlying relationships are analysed (‘parsing’).

 

Compared to other data collections on Dutch dialects, the GCND will be unique in being based exclusively on spontaneous speech. The dialect recordings represent a historical stage of the language (in the case of French Flemish, even the last witness of a now all but extinct language variety) and will now finally be searchable for word forms and syntactic patterns. The GCND will therefore (i) make it possible to track language change through time and space, (ii) enable a new perspective on the functional strength of dialect features in real life and (iii) facilitate the serendipitous discovery of previously unnoticed structures. Audio, transcriptions and annotations will be made available online, together with query tools. The GCND will as such form an unparalleled corpus of dialect data.

 

Funding

2020-2024: FWO medium-size research infrastructure grant I.0.101.20N

2018-2020: FWO small research grant 1.5.310.18N to A. Breitbarth (pilot project)

2018-2021: FWO postdoctoral mandate junior 1.2.P79.19N to M. Farasyn (French Flemish recordings)

2021-2024: FWO postdoctoral mandate senior 1.2.P79.22N to M. Farasyn (French Flemish recordings)

2019-2021: Subsidies from the provinces of Zeeland, West Flanders and East Flanders (pilot project)

 

Using the GCND

The GCND is still under construction. It will be hosted by the Dutch Language Institute (INT). Details about how to consult the corpus will follow soon.