French Wikipedia Talk Pages: Profiling and Conflict Detection


1 CLLE-ERSS - Cognition, Langues, Langage, Ergonomie
2 University of Turku
3 BCL - Bases, Corpus, Langage (UMR 7320 - UNS - CNRS), équipe Logométrie et corpus politiques, médiatiques et littéraires

Abstract: Wikipedia is a popular and extremely useful resource for studies in both linguistics and natural language processing (Yano and Kang, 2008; Ferschke et al., 2013). This paper introduces a new language resource based on the French Wikipedia online discussion pages, the WikiTalk corpus. The publicly available corpus includes 160M words and 3M posts structured into 1M thematic sections and has been syntactically parsed with the Talismane toolkit (Urieli, 2013). In this paper, we present the first results of experiments aiming at classifying and profiling the talk pages and threads in order to determine criteria for selecting discussions with conflicts.

Keywords: French Wikipedia talk pages; conflict detection; data-driven approaches

Authors: Lydia-Mai Ho-Dac, Veronika Laippala, Céline Poudat, Ludovic Tanguy


