18 June 2019: Simon Krek (head of the Europe-wide ELEXIS project): ELEXIS project (2018-2022) – European Lexicographic Infrastructure, IDS Mannheim


18 June 2019: Dr. Lydia-Mai Ho-Dac (Université de Toulouse 2): The WikiDisc corpus: In the backstage of Wikipedia, IDS Mannheim


2-5 July 2019: International Institute for Ethnomethodology and Conversation Analysis Conference 2019 (IIEMCA 2019), conference theme: "Practices",...


10-12 March 2020: Deutsch in Europa, Rosengarten, Mannheim



We cordially invite all interested staff and guests of the Leibniz-Institut für Deutsche Sprache.

Dr. Lydia-Mai Ho-Dac (Université de Toulouse 2)

will give a talk on

The WikiDisc corpus: In the backstage of Wikipedia

Tuesday, 18 June 2019, 10:00 a.m., IDS lecture hall (Vortragssaal)


Wikipedia is a popular and extremely useful resource for studies in both linguistics and natural language processing. This presentation introduces a language resource based on the discussion pages of the French Wikipedia: the WikiDisc corpus. The corpus includes 439,638 talk pages, each of which serves as a kind of discussion forum attached to an article, where contributors may discuss, interact, and sometimes negotiate, thereby collaboratively improving the article. In total, the corpus comprises more than 210 million words, structured in more than 3 million posts and more than 1 million threads (thematic sections). This talk will describe the construction and composition of the WikiDisc corpus, which is publicly available at https://www.ortolang.fr/market/corpora/wikidisc