Slovenska raziskovalna infrastruktura za jezikovne vire in tehnologije
Common Language Resources and Technology Infrastructure, Slovenia


CLASSLA K Centre workshops

The following workshops, organised by CLASSLA, are presented below:

Workshop on regional markedness in text

On 6 and 7 November 2021, an online workshop dedicated to regional markedness in text took place, organised by the ReLDI centre, University of Zurich, and CLASSLA.

The programme of the two-day event included a keynote talk on computational dialectology by Yves Scherrer from the University of Helsinki, Darja Fišer’s presentation of student research at the JTDH (Language Technologies and Digital Humanities) conference, and two interactive sessions. The first, Interactive workshop on regional variation in text, was led by Sara Košutar, Larissa Schmidt, and Leyla Feiner. The second, Regional variation in gender marking: a hands-on tutorial on extracting data from corpora, was led by Mirjana Starović and Tanja Samardžić.

The materials for the workshop on regional variation in gender marking are available here. They provide a gentle introduction to the process of analysing corpora, containing information on:

  • which South Slavic corpora are available on the CLARIN.SI repository, and how to find comparable corpora
  • how to explore corpora through the noSketchEngine and KonText concordancers
  • how to query the corpora using the CQL (Corpus Query Language) syntax
  • how to analyse gender marking in each South Slavic corpus by counting the occurrences of feminine and masculine nouns describing occupations (e.g. the feminine and masculine nouns for the word “director”)
  • how to use the morphosyntactic descriptions (MSDs) to analyse the distribution of verbs with feminine and masculine suffixes (e.g. “mislila” vs. “mislil” for “she/he thought”)
  • and finally, how the results can be interpreted to analyse gender bias in society.
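
The querying and counting steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not taken from the workshop materials: the helper names (cql_token, per_million, feminine_share), the attribute name "tag" for MSDs, and the example lemmas and MSD pattern are all assumptions, so check the documentation of the corpus you query for the actual attribute names and tagset. The hit counts themselves would be read off the concordancer after running each query.

```python
def cql_token(**attrs: str) -> str:
    """Build a single-token CQL expression, e.g. [lemma="direktor"].

    Attribute names (lemma, word, tag, ...) are assumptions; they differ
    between corpora and concordancer installations.
    """
    if not attrs:
        return "[]"  # in CQL, [] matches any token
    conditions = [f'{name}="{value}"' for name, value in attrs.items()]
    return "[" + " & ".join(conditions) + "]"


def per_million(hits: int, corpus_size: int) -> float:
    """Normalise a raw hit count to hits per million tokens,
    so that corpora of different sizes can be compared."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits * 1_000_000 / corpus_size


def feminine_share(feminine_hits: int, masculine_hits: int) -> float:
    """Share of feminine forms among all feminine and masculine hits."""
    total = feminine_hits + masculine_hits
    if total == 0:
        raise ValueError("no hits for either form")
    return feminine_hits / total


# Hypothetical example queries to paste into a concordancer:
masculine_noun_query = cql_token(lemma="direktor")
feminine_noun_query = cql_token(lemma="direktorica")
# "Vmp.*" is a placeholder MSD pattern; the real pattern depends on the tagset.
feminine_verb_query = cql_token(word="mislila", tag="Vmp.*")
```

Once the hit counts for a feminine and a masculine query are known, feminine_share gives a single figure per corpus that can be compared across the South Slavic corpora, while per_million makes the raw counts comparable between corpora of different sizes.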

Some excerpts of the document are presented below:

First CLASSLA K Centre workshop

The first CLASSLA K Centre workshop was supposed to be held from 6 to 8 May 2020 in Ljubljana, but due to the COVID-19 crisis, the face-to-face workshop had to be postponed. However, the process of selecting participants brought together a very nice group of people, so we decided to host an online Zoom session on 6 May, the day we should all have met in Ljubljana.

The session took two hours. In the first hour, all the participants briefly introduced themselves. The second hour was devoted to a short discussion on the future steps for the workshop and the knowledge centre, kick-started with the results of the survey taken by the participants before the online session. The ReLDI centre for linguistic data was also presented, as well as the current CLARIN ERIC funding opportunities.

This discussion revealed the following priorities: (1) connecting web services with concordancers is a much sought-after feature, as it would let researchers easily process and publish their raw textual data, (2) the knowledge centre might need a form for reporting use cases on its resources (a draft of such a form has been made available here), and (3) the participants are very interested in holding group discussions on specific topics, which will be organised in the weeks to come.

The whole online session seemed to be a very pleasant experience for the 42 participants and we have the Zoom photos of the participants below to prove that!

We are still looking forward to the face-to-face workshop, which we hope will take place next year.