For the GCND, the recordings are transcribed using a newly developed two-tier protocol, a task made urgent by rapidly progressing dialect loss. They are then linguistically annotated using existing software tools: each word is tagged with its word class (‘POS tags’), and the syntactic functions of word groups and the relationships between them are analysed (‘parsing’).
Compared to other data collections on Dutch dialects, the GCND will be unique in being based exclusively on spontaneous speech. The dialect recordings represent a historical stage of the language (in the case of French Flemish, even the last witness of a now all but extinct language variety) and will now finally be searchable for word forms and syntactic patterns. The GCND will therefore (i) make it possible to track language change through time and space, (ii) offer a new perspective on the functional strength of dialect features in real-life usage and (iii) facilitate the serendipitous discovery of previously unnoticed structures. Audio, transcriptions and annotations will be made available online, together with query tools. As such, the GCND will form an unparalleled corpus of dialect data.
Funding
2020-2024: FWO medium-size research infrastructure grant I.0.101.20N
2018-2020: FWO small research grant 1.5.310.18N to A. Breitbarth (pilot project)
2018-2021: FWO postdoctoral mandate junior 1.2.P79.19N to M. Farasyn (French Flemish recordings)
2021-2024: FWO postdoctoral mandate senior 1.2.P79.22N to M. Farasyn (French Flemish recordings)
2019-2021: Subsidies from the provinces of Zeeland, West Flanders and East Flanders (pilot project)
Using the GCND
The GCND is still under construction. It will be hosted by the Dutch Language Institute (INT). Details about how to consult the corpus will follow soon.