Digital Linguistics

Contact:
    <korpuslinguistik@ids-...>
 
Head:
    Dr. Marc Kupietz <kupietz@ids-...>
 
Research staff:
    Cyril Belica <belica@ids-...>
    Dr. Harald Lüngen <luengen@ids-...>
    Rainer Perkuhn <perkuhn@ids-...>
 
Collaborations:
    see here
 
Former IDS staff involved in corpus building:
    see here
 
Student assistants:

  • Caroline Iliadi

Corpora of Written Language


Collaborations of the Project

External Collaborations

Within the framework of the EU project CLARIN and the project CLARIN-D, funded by both the BMBF (Federal Ministry of Education and Research) and the MWK-BW (Ministry of Science, Research and Arts of Baden-Württemberg), the project “Development and Maintenance of Corpora of Contemporary Written Language” collaborates with the following partners to create a research infrastructure (Forschungsinfrastruktur, FI) for linguistics:

The main areas of this collaboration are:

  • creating explicit research infrastructure centres and securing their sustainability
  • canonisation and standardisation of formats and interfaces
  • best-practice guidelines for handling resources that are subject to third-party rights, as well as corresponding licensing models
  • facilitating persistent identifiability (and citability) of electronic resources; a corresponding ISO standardisation process has been initiated together with the MPI Nijmegen and the Department of Linguistics at the University of Tübingen

Collaboration within the IDS

  • Eric Seubert is closely involved in the development of the TEI-based text model of the IDS. He also develops programs for XML conversion and for quality assurance of corpora, and he supervises the student assistants who correct and tag source texts.
  • Peter Harders analyses the data formats of acquired raw data and develops programs for their conversion.
  • The project cooperates with the project Research and Teaching Corpus (FOLK) in the Archive for Spoken German (Department of Pragmatics), in particular on clarifying legal and ethical questions concerning the collection, processing and provision of linguistic data.
  • The project also advises the project Historical Text Corpus (Department of Lexics).
  • Doris al-Wadi, a staff member of the project for many years, now advises it on issues concerning the IDS text model, especially the corpus text bibliography.