Slovenska raziskovalna infrastruktura za jezikovne vire in tehnologije
Common Language Resources and Technology Infrastructure, Slovenia

Services

In addition to the repository and online concordancers, CLARIN.SI also provides its users with the following services.

Automated text annotation

CLARIN.SI offers two on-line services for automatic linguistic annotation. The newer service is the CLASSLA Annotation Tool, which uses the CLASSLA-Stanza pipeline for text annotation. It has models for Bulgarian, Croatian, Macedonian, Serbian and Slovenian texts, as well as models for non-standard (colloquial) Croatian, Serbian and Slovenian. It can annotate l

The older service is the ReLDIanno text annotation service, which supports processing of Slovenian, Croatian and Serbian. There are two main ways of using ReLDIanno: (1) through the web application and (2) through the Python library. To read more, visit the CLASSLA K-centre webpage.
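For readers who prefer to run the CLASSLA-Stanza pipeline locally rather than through the web tool, the following is a minimal sketch, not a description of how the on-line service itself is implemented. It assumes the open-source `classla` Python package is installed and that it accepts a `type` argument of "standard" or "nonstandard". The two-letter language codes and the helper names `select_model` and `annotate` are illustrative assumptions; the language coverage is taken from the description above.

```python
# Language coverage as listed above; the two-letter codes are assumed (ISO 639-1).
STANDARD_LANGS = {"bg", "hr", "mk", "sr", "sl"}
NONSTANDARD_LANGS = {"hr", "sr", "sl"}


def select_model(lang, nonstandard=False):
    """Return a (language, model type) pair, or raise if the combination is not covered."""
    if lang not in STANDARD_LANGS:
        raise ValueError(f"no model listed for language {lang!r}")
    if nonstandard and lang not in NONSTANDARD_LANGS:
        raise ValueError(f"no non-standard model listed for language {lang!r}")
    return lang, "nonstandard" if nonstandard else "standard"


def annotate(text, lang, nonstandard=False):
    """Annotate text with a local CLASSLA-Stanza pipeline and return CoNLL-U output.

    Hypothetical helper: requires the `classla` package and downloads models
    on first use, so it needs network access.
    """
    import classla  # imported lazily so select_model works without the package

    lang, model_type = select_model(lang, nonstandard)
    classla.download(lang, type=model_type)
    nlp = classla.Pipeline(lang, type=model_type)
    return nlp(text).to_conll()
```

A call such as `annotate("Danes je lep dan.", "sl")` would then return the annotated sentence as a CoNLL-U string, provided the package and models are available.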

Manual text annotation

CLARIN.SI hosts WebAnno, a tool for on-line manual linguistic annotation of corpora. To read more about WebAnno, have a look at the home page of the project. If you would like an account on WebAnno@CLARIN.SI, please send an e-mail to info@clarin.si explaining who you are and why you need access.

Storage and cooperative development

CLARIN.SI has a virtual organisation on GitHub called CLARINSI, which hosts a number of projects related to language resources and technologies, such as PoS and NER taggers, word normalisers, standards and conversions between linguistic formalisms.

CLARIN.SI also hosts a GitLab server that offers a platform for developers of language technology tools and resources. The main advantages compared to GitHub.com are that projects can be made private free of charge and that not all of our code is stored by companies in the U.S. If you would like an account on GitLab@CLARIN.SI, please send an e-mail to info@clarin.si explaining who you are and why you need access.

Text simplification and analysis

CLARIN.SI, through its partner CJVT UL, hosts the SENTA service for simplification and analysis of Slovene texts. When developing the tool, special attention was paid to accessibility for users with special needs. If you want to try the SENTA tool and find out more about it, visit the service’s website.

Summarizing corpus data

CLARIN.SI, through its partner CJVT UL, hosts Korpusnik, a service for summarizing corpus data, which displays statistical and textual data from five corpora of the Slovenian language in a user-friendly way. When developing the tool, special attention was paid to accessibility for users with special needs. You can read more about Korpusnik on the tool’s website.

Knowledge transfer

CLARIN.SI supports the recording and archiving of the JOTA lectures, organised by the Slovene Society for Language Technologies, on the VideoLectures portal. Note that the recordings of some other CLARIN events are also available at VideoLectures.